Curated Sequences

Embedded Agency
AGI safety from first principles
Iterated Amplification
Value Learning
Risks from Learned Optimization
Cartesian Frames

Community Sequences

Anthropic Decision Theory
Reviews for the Alignment Forum
Predictions & Self-awareness
Pointing at Normativity
Counterfactual Planning
AI Alignment Unwrapped
AI Timelines
Takeoff and Takeover in the Past and Future
Deconfusing Goal-Directedness
Factored Cognition
Infra-Bayesianism
Toying With Goal-Directedness