AI ALIGNMENT FORUM
All Questions
Top Questions
Score · Title · Authors · Age · Answers

17 · Are You More Real If You're Really Forgetful? · Thane Ruthenis, Charlie Steiner · 1y · 4 answers
52 · Have LLMs Generated Novel Insights? · abramdemski, Cole Wyeth, Kaj_Sotala · 9mo · 19 answers
44 · why assume AGIs will optimize for fixed goals? · nostalgebraist, Rob Bensinger · 3y · 3 answers
27 · What convincing warning shot could help prevent extinction from AI? · Charbel-Raphaël, cozyfractal, peterbarnett · 2y · 2 answers
7 · Egan's Theorem? · johnswentworth · 5y · 7 answers
40 · Seriously, what goes wrong with "reward the agent when it makes you smile"? · TurnTrout, johnswentworth · 3y · 13 answers
14 · Is weak-to-strong generalization an alignment technique? · cloud · 10mo · 1 answer
9 · What is the most impressive game LLMs can play well? · Cole Wyeth · 11mo · 8 answers
4 · How counterfactual are logical counterfactuals? · Donald Hobson · 1y · 9 answers
6 · Why not tool AI? · smithee, Ben Pace · 7y · 2 answers
69 · Why is o1 so deceptive? · abramdemski, Sahil · 1y · 14 answers
7 · Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception? · David Scott Krueger (formerly: capybaralet) · 1y · 5 answers