AI ALIGNMENT FORUM
Superposition
• Applied to Superposition is not "just" neuron polysemanticity by Lawrence Chan 21d ago
• Applied to Scaling Laws and Superposition by Pavan Katta 1mo ago
• Applied to Sparse autoencoders find composed features in small toy models by Evan Anders 2mo ago
• Applied to Some costs of superposition by Linda Linsefors 2mo ago
• Applied to From Conceptual Spaces to Quantum Concepts: Formalising and Learning Structured Conceptual Models by Roman Leventov 3mo ago
• Applied to AI alignment as a translation problem by Roman Leventov 3mo ago
• Applied to Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small by Joseph Isaac Bloom 3mo ago
• Applied to Toward A Mathematical Framework for Computation in Superposition by Nina Rimsky 4mo ago
• Applied to Sparse MLP Distillation by slavachalnev 5mo ago
• Applied to Towards Monosemanticity: Decomposing Language Models With Dictionary Learning by duck_master 5mo ago
• Applied to Some open-source dictionaries and dictionary learning infrastructure by duck_master 5mo ago
• Applied to Comparing Anthropic's Dictionary Learning to Ours by duck_master 5mo ago
• Applied to Intro to Superposition & Sparse Autoencoders (Colab exercises) by duck_master 5mo ago
• Applied to Superposition and Dropout by duck_master 5mo ago
• Applied to [Interim research report] Taking features out of superposition with sparse autoencoders by duck_master 5mo ago
• Applied to 200 COP in MI: Exploring Polysemanticity and Superposition by duck_master 5mo ago
• Applied to Taking features out of superposition with sparse autoencoders more quickly with informed initialization by duck_master 5mo ago
• Applied to Expanding the Scope of Superposition by duck_master 5mo ago