User Profile

Ω8638187

Recent Posts

Corrigibility

712d 6 min read
0

Benign model-free RL

38d 7 min read
0

Clarifying "AI Alignment"

1524d 3 min read
41

Prosaic AI alignment

1019d 7 min read
0

Humans Consulting HCH

414d 1 min read
0

Approval-directed bootstrapping

314d 1 min read
0

Approval-directed agents: details

416d 7 min read
0

The Steering Problem

91mo 7 min read
1

An unaligned benchmark

822d 9 min read
0

Approval-directed agents: overview

317d 8 min read
1

Recent Comments