Lauren Greenspan

An OV-Coherent Toy Model of Attention Head Superposition

Background: This project was inspired by Anthropic’s post on attention head superposition, which constructed a toy model trained to learn a circuit to identify skip-trigrams that are OV-incoherent (attending from multiple destination tokens to a single source token) as a way to ensure that superposition would occur. Since the OV...

Aug 29, 2023

Renormalization Roadmap

Renormalization Redux: QFT Techniques for AI Interpretability

Is AI Physical?

Building Big Science from the Bottom-Up: A Fractal Approach to AI Safety

An OV-Coherent Toy Model of Attention Head Superposition