Alignment Hot Take Advent Calendar
Take 1: We're not going to reverse-engineer the AI. · Charlie Steiner · 3y · 15 karma · 1 comment
Take 2: Building tools to help build FAI is a legitimate strategy, but it's dual-use. · Charlie Steiner · 3y · 12 karma · 0 comments
Take 3: No indescribable heavenworlds. · Charlie Steiner · 3y · 11 karma · 0 comments
Take 4: One problem with natural abstractions is there's too many of them. · Charlie Steiner · 3y · 16 karma · 1 comment
Take 5: Another problem for natural abstractions is laziness. · Charlie Steiner · 3y · 13 karma · 1 comment
Take 6: CAIS is actually Orwellian. · Charlie Steiner · 3y · 5 karma · 2 comments
Take 7: You should talk about "the human's utility function" less. · Charlie Steiner · 3y · 24 karma · 1 comment
Take 8: Queer the inner/outer alignment dichotomy. · Charlie Steiner · 3y · 14 karma · 0 comments
Take 9: No, RLHF/IDA/debate doesn't solve outer alignment. · Charlie Steiner · 3y · 23 karma · 5 comments
Take 10: Fine-tuning with RLHF is aesthetically unsatisfying. · Charlie Steiner · 3y · 16 karma · 1 comment
Take 11: "Aligning language models" should be weirder. · Charlie Steiner · 3y · 16 karma · 0 comments
Take 12: RLHF's use is evidence that orgs will jam RL at real-world problems. · Charlie Steiner · 3y · 11 karma · 1 comment
Take 13: RLHF bad, conditioning good. · Charlie Steiner · 3y · 24 karma · 0 comments
Take 14: Corrigibility isn't that great. · Charlie Steiner · 3y · 10 karma · 3 comments