AI ALIGNMENT FORUM

Recent Activity

17 points · Are You More Real If You're Really Forgetful? · Thane Ruthenis, Charlie Steiner · 11mo · 4 comments
52 points · Have LLMs Generated Novel Insights? · abramdemski, Cole Wyeth, Kaj_Sotala · 8mo · 19 comments
44 points · why assume AGIs will optimize for fixed goals? · nostalgebraist, Rob Bensinger · 3y · 3 comments
27 points · What convincing warning shot could help prevent extinction from AI? · Charbel-Raphaël, cozyfractal, peterbarnett · 2y · 2 comments
7 points · Egan's Theorem? · johnswentworth · 5y · 7 comments
40 points · Seriously, what goes wrong with "reward the agent when it makes you smile"? · TurnTrout, johnswentworth · 3y · 13 comments
14 points · Is weak-to-strong generalization an alignment technique? · cloud · 9mo · 1 comment
9 points · What is the most impressive game LLMs can play well? · Cole Wyeth · 9mo · 8 comments
4 points · How counterfactual are logical counterfactuals? · Donald Hobson · 10mo · 9 comments
6 points · Why not tool AI? · smithee, Ben Pace · 7y · 2 comments
69 points · Why is o1 so deceptive? · abramdemski, Sahil · 1y · 14 comments
7 points · Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception? · David Scott Krueger (formerly: capybaralet) · 1y · 5 comments

Top Questions

17 points · Are You More Real If You're Really Forgetful? · Thane Ruthenis, Charlie Steiner · 11mo · 4 comments
52 points · Have LLMs Generated Novel Insights? · abramdemski, Cole Wyeth, Kaj_Sotala · 8mo · 19 comments
44 points · why assume AGIs will optimize for fixed goals? · nostalgebraist, Rob Bensinger · 3y · 3 comments
27 points · What convincing warning shot could help prevent extinction from AI? · Charbel-Raphaël, cozyfractal, peterbarnett · 2y · 2 comments
40 points · Seriously, what goes wrong with "reward the agent when it makes you smile"? · TurnTrout, johnswentworth · 3y · 13 comments