Curated Sequences

Embedded Agency
AGI safety from first principles
Iterated Amplification
Value Learning
Risks from Learned Optimization
Cartesian Frames

Community Sequences

Epistemic Cookbook for Alignment
AI Safety Subprojects
Modeling Transformative AI Risk (MTAIR)
Practical Guide to Anthropics
The Causes of Power-seeking and Instrumental Convergence
Finite Factored Sets
Anthropic Decision Theory
Reviews for the Alignment Forum
Predictions & Self-awareness
Pointing at Normativity
Counterfactual Planning
AI Alignment Unwrapped