AI ALIGNMENT FORUM
Sid Black
Posts
69 | The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable | 2y | 11
33 | Conjecture Second Hiring Round | 2y | 0
65 | Conjecture: a retrospective after 8 months of work | 2y | 5
38 | Current themes in mechanistic interpretability research | 2y | 2
47 | Interpreting Neural Networks through the Polytope Lens | 2y | 11
53 | Conjecture: Internal Infohazard Policy | 2y | 2