AI ALIGNMENT FORUM
Archive Recommendations
145
AGI Ruin: A List of Lethalities
Eliezer Yudkowsky
3y
144
218
Where I agree and disagree with Eliezer
Paul Christiano
2y
59
142
SolidGoldMagikarp (plus, prompt generation)
Jessica Rumbelow, mwatkins
2y
16
62
The Waluigi Effect (mega-post)
Cleo Nardo
2y
25
130
Simulators
janus
2y
89
154
Let’s think about slowing down AI
KatjaGrace
2y
3
111
What 2026 looks like
Daniel Kokotajlo
3y
29
121
Steering GPT-2-XL by adding an activation vector
Alex Turner, Monte MacDiarmid, David Udell, lisathiergart, Ulisse Mini
2y
63
97
chinchilla's wild implications
nostalgebraist
2y
13
103
What failure looks like
Paul Christiano
6y
28