AI ALIGNMENT FORUM
Superposition
• Applied to: Superposition is not "just" neuron polysemanticity (by Lawrence Chan, 3mo ago)
• Applied to: Scaling Laws and Superposition (by Pavan Katta, 4mo ago)
• Applied to: Sparse autoencoders find composed features in small toy models (by Evan Anders, 4mo ago)
• Applied to: Some costs of superposition (by Linda Linsefors, 5mo ago)
• Applied to: From Conceptual Spaces to Quantum Concepts: Formalising and Learning Structured Conceptual Models (by Roman Leventov, 6mo ago)
• Applied to: AI alignment as a translation problem (by Roman Leventov, 6mo ago)
• Applied to: Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small (by Joseph Isaac Bloom, 6mo ago)
• Applied to: Toward A Mathematical Framework for Computation in Superposition (by Nina Panickssery, 6mo ago)
• Applied to: Sparse MLP Distillation (by slavachalnev, 7mo ago)
• Applied to: Towards Monosemanticity: Decomposing Language Models With Dictionary Learning (by duck_master, 8mo ago)
• Applied to: Some open-source dictionaries and dictionary learning infrastructure (by duck_master, 8mo ago)
• Applied to: Comparing Anthropic's Dictionary Learning to Ours (by duck_master, 8mo ago)
• Applied to: Intro to Superposition & Sparse Autoencoders (Colab exercises) (by duck_master, 8mo ago)
• Applied to: Superposition and Dropout (by duck_master, 8mo ago)
• Applied to: [Interim research report] Taking features out of superposition with sparse autoencoders (by duck_master, 8mo ago)
• Applied to: 200 COP in MI: Exploring Polysemanticity and Superposition (by duck_master, 8mo ago)
• Applied to: Taking features out of superposition with sparse autoencoders more quickly with informed initialization (by duck_master, 8mo ago)