AI ALIGNMENT FORUM

Top Questions

42 · why assume AGIs will optimize for fixed goals? · nostalgebraist, Rob Bensinger · 3y · 3 comments
27 · What convincing warning shot could help prevent extinction from AI? · Charbel-Raphael Segerie, Diego Dorn, Peter Barnett · 1y · 2 comments
51 · Have LLMs Generated Novel Insights? · Abram Demski, Cole Wyeth, Kaj Sotala · 3mo · 19 comments
40 · Seriously, what goes wrong with "reward the agent when it makes you smile"? · Alex Turner, johnswentworth · 3y · 13 comments
68 · Why is o1 so deceptive? · Abram Demski, Sahil · 8mo · 14 comments

Recent Activity

42 · why assume AGIs will optimize for fixed goals? · nostalgebraist, Rob Bensinger · 3y · 3 comments
27 · What convincing warning shot could help prevent extinction from AI? · Charbel-Raphael Segerie, Diego Dorn, Peter Barnett · 1y · 2 comments
7 · Egan's Theorem? · johnswentworth · 5y · 7 comments
51 · Have LLMs Generated Novel Insights? · Abram Demski, Cole Wyeth, Kaj Sotala · 3mo · 19 comments
40 · Seriously, what goes wrong with "reward the agent when it makes you smile"? · Alex Turner, johnswentworth · 3y · 13 comments
14 · Is weak-to-strong generalization an alignment technique? · cloud · 3mo · 1 comment
9 · What is the most impressive game LLMs can play well? · Cole Wyeth · 4mo · 8 comments
4 · How counterfactual are logical counterfactuals? · Donald Hobson · 5mo · 9 comments
16 · Are You More Real If You're Really Forgetful? · Thane Ruthenis, Charlie Steiner · 6mo · 4 comments
6 · Why not tool AI? · smithee, Ben Pace · 6y · 2 comments
68 · Why is o1 so deceptive? · Abram Demski, Sahil · 8mo · 14 comments
7 · Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception? · David Scott Krueger · 8mo · 5 comments