AI ALIGNMENT FORUM
Top Questions
68
Why is o1 so deceptive?
Abram Demski, Sahil
8d
14
18
We might be dropping the ball on Autonomous Replication and Adaptation.
Charbel-Raphael Segerie, Épiphanie Gédéon, Richard Ngo
4mo
18
27
What convincing warning shot could help prevent extinction from AI?
Charbel-Raphael Segerie, Diego Dorn
6mo
0
40
why assume AGIs will optimize for fixed goals?
nostalgebraist, Rob Bensinger
2y
3
40
Forecasting Thread: AI Timelines
Amanda Ngo, Daniel Kokotajlo, Ben Pace, datscilly
4y
33
Recent Activity
68
Why is o1 so deceptive?
Abram Demski, Sahil
8d
14
6
Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception?
David Scott Krueger
1mo
5
22
What progress have we made on automated auditing?
Lawrence Chan
3mo
0
18
We might be dropping the ball on Autonomous Replication and Adaptation.
Charbel-Raphael Segerie, Épiphanie Gédéon, Richard Ngo
4mo
18
27
What convincing warning shot could help prevent extinction from AI?
Charbel-Raphael Segerie, Diego Dorn
6mo
0
7
Is CIRL a promising agenda?
Chris_Leong
2y
0
40
why assume AGIs will optimize for fixed goals?
nostalgebraist, Rob Bensinger
2y
3
7
What evidence is there of LLM's containing world models?
Chris_Leong
1y
0
40
Forecasting Thread: AI Timelines
Amanda Ngo, Daniel Kokotajlo, Ben Pace, datscilly
4y
33
33
Why The Focus on Expected Utility Maximisers?
Cinera Verinia, Scott Garrabrant
2y
1
1
Can we isolate neurons that recognize features vs. those which have some other role?
Joshua Clancy
1y
0
0
Training a RL Model with Continuous State & Action Space in a Real-World Scenario
Alexander Ries
1y
0