Recommendations — AI Alignment Forum
147
AGI Ruin: A List of Lethalities
Eliezer Yudkowsky
4y
144
218
Where I agree and disagree with Eliezer
paulfchristiano
4y
59
134
SolidGoldMagikarp (plus, prompt generation)
Jessica Rumbelow, mwatkins
3y
17
142
Simulators
janus
3y
90
191
AI 2027: What Superintelligence Looks Like
Daniel Kokotajlo, Thomas Larsen, elifland, Scott Alexander, Jonas V, romeo
8mo
2
61
The Waluigi Effect (mega-post)
Cleo Nardo
3y
26
120
What 2026 looks like
Daniel Kokotajlo
4y
33
152
Let’s think about slowing down AI
KatjaGrace
3y
3
193
Alignment Faking in Large Language Models
ryan_greenblatt, evhub, Carson Denison, Benjamin Wright, Fabien Roger, Monte M, Sam Marks, Johannes Treutlein, Sam Bowman, Buck
1y
26
Review
121
Steering GPT-2-XL by adding an activation vector
TurnTrout, Monte M, David Udell, lisathiergart, Ulisse Mini
3y
63