AI ALIGNMENT FORUM
GDM Mech Interp Progress Updates
36
[Summary] Progress Update #1 from the GDM Mech Interp Team
Neel Nanda, Arthur Conmy, lewis smith, Senthooran Rajamanoharan, Tom Lieberum, János Kramár, Vikrant Varma
1y
0
40
[Full Post] Progress Update #1 from the GDM Mech Interp Team
Neel Nanda, Arthur Conmy, lewis smith, Senthooran Rajamanoharan, Tom Lieberum, János Kramár, Vikrant Varma
1y
3
26
The GDM AGI Safety+Alignment Team is Hiring for Applied Interpretability Research
Arthur Conmy, Neel Nanda
5mo
0
58
Negative Results for SAEs On Downstream Tasks and Deprioritising SAE Research (GDM Mech Interp Team Progress Update #2)
lewis smith, Senthooran Rajamanoharan, Arthur Conmy, CallumMcDougall, Tom Lieberum, János Kramár, Rohin Shah, Neel Nanda
3mo
6