AI ALIGNMENT FORUM
Archive Recommendations
143
AGI Ruin: A List of Lethalities
Eliezer Yudkowsky
2y
144
216
Where I agree and disagree with Eliezer
Paul Christiano
2y
59
142
SolidGoldMagikarp (plus, prompt generation)
Jessica Rumbelow, mwatkins
1y
16
63
The Waluigi Effect (mega-post)
Cleo Nardo
1y
25
129
Simulators
janus
2y
89
153
Let’s think about slowing down AI
KatjaGrace
2y
3
100
What 2026 looks like
Daniel Kokotajlo
3y
28
116
Steering GPT-2-XL by adding an activation vector
Alex Turner, Monte MacDiarmid, David Udell, lisathiergart, Ulisse Mini
1y
63
97
chinchilla's wild implications
nostalgebraist
2y
13
95
(My understanding of) What Everyone in Technical Alignment is Doing and Why
Thomas Larsen, elifland
2y
21