AI Alignment Forum: Tags

Superposition

• Applied to Superposition is not "just" neuron polysemanticity by Lawrence Chan 3mo ago
• Applied to Scaling Laws and Superposition by Pavan Katta 4mo ago
• Applied to Sparse autoencoders find composed features in small toy models by Evan Anders 4mo ago
• Applied to Some costs of superposition by Linda Linsefors 5mo ago
• Applied to From Conceptual Spaces to Quantum Concepts: Formalising and Learning Structured Conceptual Models by Roman Leventov 6mo ago
• Applied to AI alignment as a translation problem by Roman Leventov 6mo ago
• Applied to Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small by Joseph Isaac Bloom 6mo ago
• Applied to Toward A Mathematical Framework for Computation in Superposition by Nina Panickssery 6mo ago
• Applied to Sparse MLP Distillation by slavachalnev 7mo ago
• Applied to Towards Monosemanticity: Decomposing Language Models With Dictionary Learning by duck_master 8mo ago
• Applied to Some open-source dictionaries and dictionary learning infrastructure by duck_master 8mo ago
• Applied to Comparing Anthropic's Dictionary Learning to Ours by duck_master 8mo ago
• Applied to Intro to Superposition & Sparse Autoencoders (Colab exercises) by duck_master 8mo ago
• Applied to Superposition and Dropout by duck_master 8mo ago
• Applied to [Interim research report] Taking features out of superposition with sparse autoencoders by duck_master 8mo ago
• Applied to 200 COP in MI: Exploring Polysemanticity and Superposition by duck_master 8mo ago
• Applied to Taking features out of superposition with sparse autoencoders more quickly with informed initialization by duck_master 8mo ago