AI ALIGNMENT FORUM
Top Questions
69 | Why is o1 so deceptive? | Abram Demski, Sahil | 2mo | 14
40 | why assume AGIs will optimize for fixed goals? | nostalgebraist, Rob Bensinger | 3y | 3
18 | We might be dropping the ball on Autonomous Replication and Adaptation. | Charbel-Raphael Segerie, Épiphanie Gédéon, Richard Ngo | 6mo | 18
27 | What convincing warning shot could help prevent extinction from AI? | Charbel-Raphael Segerie, Diego Dorn | 8mo | 0
40 | Forecasting Thread: AI Timelines | Amanda Ngo, Daniel Kokotajlo, Ben Pace, datscilly | 4y | 33
Recent Activity
16 | Are You More Real If You're Really Forgetful? | Thane Ruthenis, Charlie Steiner | 14d | 4
6 | Why not tool AI? | smithee, Ben Pace | 6y | 2
69 | Why is o1 so deceptive? | Abram Demski, Sahil | 2mo | 14
40 | why assume AGIs will optimize for fixed goals? | nostalgebraist, Rob Bensinger | 3y | 3
7 | Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception? | David Scott Krueger | 3mo | 5
22 | What progress have we made on automated auditing? | Lawrence Chan | 5mo | 0
18 | We might be dropping the ball on Autonomous Replication and Adaptation. | Charbel-Raphael Segerie, Épiphanie Gédéon, Richard Ngo | 6mo | 18
27 | What convincing warning shot could help prevent extinction from AI? | Charbel-Raphael Segerie, Diego Dorn | 8mo | 0
7 | Is CIRL a promising agenda? | Chris_Leong | 2y | 0
7 | What evidence is there of LLM's containing world models? | Chris_Leong | 1y | 0
40 | Forecasting Thread: AI Timelines | Amanda Ngo, Daniel Kokotajlo, Ben Pace, datscilly | 4y | 33
33 | Why The Focus on Expected Utility Maximisers? | Cinera Verinia, Scott Garrabrant | 2y | 1