AI ALIGNMENT FORUM

Sid Black

Posts

The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable (64 karma, 4mo, 10 comments)
Conjecture Second Hiring Round (33 karma, 4mo, 0 comments)
Conjecture: a retrospective after 8 months of work (63 karma, 4mo, 5 comments)
Current themes in mechanistic interpretability research (38 karma, 4mo, 2 comments)
Interpreting Neural Networks through the Polytope Lens (41 karma, 6mo, 10 comments)
Conjecture: Internal Infohazard Policy (50 karma, 8mo, 2 comments)
