The Best of LessWrong

Here you can find the best posts of LessWrong. When posts turn more than a year old, the LessWrong community reviews and votes on how well they have stood the test of time. These are the posts that have ranked the highest for all years since 2018 (when our annual tradition of choosing the least wrong of LessWrong began).

For the years 2018, 2019 and 2020 we also published physical books with the results of our annual vote, which you can buy and learn more about here.

Rationality

Eliezer Yudkowsky
Local Validity as a Key to Sanity and Civilization
Buck
"Other people are wrong" vs "I am right"
Mark Xu
Strong Evidence is Common
johnswentworth
You Are Not Measuring What You Think You Are Measuring
johnswentworth
Gears-Level Models are Capital Investments
Hazard
How to Ignore Your Emotions (while also thinking you're awesome at emotions)
Scott Garrabrant
Yes Requires the Possibility of No
Scott Alexander
Trapped Priors As A Basic Problem Of Rationality
Duncan Sabien (Deactivated)
Split and Commit
Ben Pace
A Sketch of Good Communication
Eliezer Yudkowsky
Meta-Honesty: Firming Up Honesty Around Its Edge-Cases
Duncan Sabien (Deactivated)
Lies, Damn Lies, and Fabricated Options
Duncan Sabien (Deactivated)
CFAR Participant Handbook now available to all
johnswentworth
What Are You Tracking In Your Head?
Mark Xu
The First Sample Gives the Most Information
Duncan Sabien (Deactivated)
Shoulder Advisors 101
Zack_M_Davis
Feature Selection
abramdemski
Mistakes with Conservation of Expected Evidence
Scott Alexander
Varieties Of Argumentative Experience
Eliezer Yudkowsky
Toolbox-thinking and Law-thinking
alkjash
Babble
Kaj_Sotala
The Felt Sense: What, Why and How
Duncan Sabien (Deactivated)
Cup-Stacking Skills (or, Reflexive Involuntary Mental Motions)
Ben Pace
The Costly Coordination Mechanism of Common Knowledge
Jacob Falkovich
Seeing the Smoke
Elizabeth
Epistemic Legibility
Daniel Kokotajlo
Taboo "Outside View"
alkjash
Prune
johnswentworth
Gears vs Behavior
Raemon
Noticing Frame Differences
Duncan Sabien (Deactivated)
Sazen
AnnaSalamon
Reality-Revealing and Reality-Masking Puzzles
Eliezer Yudkowsky
ProjectLawful.com: Eliezer's latest story, past 1M words
Eliezer Yudkowsky
Self-Integrity and the Drowning Child
Jacob Falkovich
The Treacherous Path to Rationality
Scott Garrabrant
Tyranny of the Epistemic Majority
alkjash
More Babble
abramdemski
Most Prisoner's Dilemmas are Stag Hunts; Most Stag Hunts are Schelling Problems
Raemon
Being a Robust Agent
Zack_M_Davis
Heads I Win, Tails?—Never Heard of Her; Or, Selective Reporting and the Tragedy of the Green Rationalists
Benquo
Reason isn't magic
habryka
Integrity and accountability are core parts of rationality
Raemon
The Schelling Choice is "Rabbit", not "Stag"
Diffractor
Threat-Resistant Bargaining Megapost: Introducing the ROSE Value
Raemon
Propagating Facts into Aesthetics
johnswentworth
Simulacrum 3 As Stag-Hunt Strategy
LoganStrohl
Catching the Spark
Jacob Falkovich
Is Rationalist Self-Improvement Real?
Benquo
Excerpts from a larger discussion about simulacra
Zvi
Simulacra Levels and their Interactions
abramdemski
Radical Probabilism
sarahconstantin
Naming the Nameless
AnnaSalamon
Comment reply: my low-quality thoughts on why CFAR didn't get farther with a "real/efficacious art of rationality"
Eric Raymond
Rationalism before the Sequences
Owain_Evans
The Rationalists of the 1950s (and before) also called themselves “Rationalists”

Optimization

sarahconstantin
The Pavlov Strategy
johnswentworth
Coordination as a Scarce Resource
AnnaSalamon
What should you change in response to an "emergency"? And AI risk
Zvi
Prediction Markets: When Do They Work?
johnswentworth
Being the (Pareto) Best in the World
alkjash
Is Success the Enemy of Freedom? (Full)
jasoncrawford
How factories were made safe
HoldenKarnofsky
All Possible Views About Humanity's Future Are Wild
jasoncrawford
Why has nuclear power been a flop?
Zvi
Simple Rules of Law
Elizabeth
Power Buys You Distance From The Crime
Eliezer Yudkowsky
Is Clickbait Destroying Our General Intelligence?
Scott Alexander
The Tails Coming Apart As Metaphor For Life
Zvi
Asymmetric Justice
Jeffrey Ladish
Nuclear war is unlikely to cause human extinction
Spiracular
Bioinfohazards
Zvi
Moloch Hasn’t Won
Zvi
Motive Ambiguity
Benquo
Can crimes be discussed literally?
Said Achmiz
The Real Rules Have No Exceptions
Lars Doucet
Lars Doucet's Georgism series on Astral Codex Ten
johnswentworth
When Money Is Abundant, Knowledge Is The Real Wealth
HoldenKarnofsky
This Can't Go On
Scott Alexander
Studies On Slack
johnswentworth
Working With Monsters
jasoncrawford
Why haven't we celebrated any major achievements lately?
abramdemski
The Credit Assignment Problem
Martin Sustrik
Inadequate Equilibria vs. Governance of the Commons
Raemon
The Amish, and Strategic Norms around Technology
Zvi
Blackmail
KatjaGrace
Discontinuous progress in history: an update
Scott Alexander
Rule Thinkers In, Not Out
Jameson Quinn
A voting theory primer for rationalists
HoldenKarnofsky
Nonprofit Boards are Weird
Wei Dai
Beyond Astronomical Waste
johnswentworth
Making Vaccine
jefftk
Make more land

World

Ben
The Redaction Machine
Samo Burja
On the Loss and Preservation of Knowledge
Alex_Altair
Introduction to abstract entropy
Martin Sustrik
Swiss Political System: More than You ever Wanted to Know (I.)
johnswentworth
Interfaces as a Scarce Resource
johnswentworth
Transportation as a Constraint
eukaryote
There’s no such thing as a tree (phylogenetically)
Scott Alexander
Is Science Slowing Down?
Martin Sustrik
Anti-social Punishment
Martin Sustrik
Research: Rescuers during the Holocaust
GeneSmith
Toni Kurz and the Insanity of Climbing Mountains
johnswentworth
Book Review: Design Principles of Biological Circuits
Elizabeth
Literature Review: Distributed Teams
Valentine
The Intelligent Social Web
jacobjacob
Unconscious Economics
eukaryote
Spaghetti Towers
Eli Tyre
Historical mathematicians exhibit a birth order effect too
johnswentworth
What Money Cannot Buy
Scott Alexander
Book Review: The Secret Of Our Success
johnswentworth
Specializing in Problems We Don't Understand
KatjaGrace
Why did everything take so long?
Ruby
[Answer] Why wasn't science invented in China?
Scott Alexander
Mental Mountains
Kaj_Sotala
My attempt to explain Looking, insight meditation, and enlightenment in non-mysterious terms
johnswentworth
Evolution of Modularity
johnswentworth
Science in a High-Dimensional World
zhukeepa
How uniform is the neocortex?
Kaj_Sotala
Building up to an Internal Family Systems model
Steven Byrnes
My computational framework for the brain
Natália
Counter-theses on Sleep
abramdemski
What makes people intellectually active?
Bucky
Birth order effect found in Nobel Laureates in Physics
KatjaGrace
Elephant seal 2
JackH
Anti-Aging: State of the Art
Vaniver
Steelmanning Divination
Kaj_Sotala
Book summary: Unlocking the Emotional Brain

AI Strategy

Ajeya Cotra
Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover
Daniel Kokotajlo
Cortés, Pizarro, and Afonso as Precedents for Takeover
Daniel Kokotajlo
The date of AI Takeover is not the day the AI takes over
paulfchristiano
What failure looks like
Daniel Kokotajlo
What 2026 looks like
gwern
It Looks Like You're Trying To Take Over The World
Andrew_Critch
What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs)
paulfchristiano
Another (outer) alignment failure story
Ajeya Cotra
Draft report on AI timelines
Eliezer Yudkowsky
Biology-Inspired AGI Timelines: The Trick That Never Works
HoldenKarnofsky
Reply to Eliezer on Biological Anchors
Richard_Ngo
AGI safety from first principles: Introduction
Daniel Kokotajlo
Fun with +12 OOMs of Compute
Wei Dai
AI Safety "Success Stories"
KatjaGrace
Counterarguments to the basic AI x-risk case
johnswentworth
The Plan
Rohin Shah
Reframing Superintelligence: Comprehensive AI Services as General Intelligence
lc
What an actually pessimistic containment strategy looks like
Eliezer Yudkowsky
MIRI announces new "Death With Dignity" strategy
evhub
Chris Olah’s views on AGI safety
So8res
Comments on Carlsmith's “Is power-seeking AI an existential risk?”
Adam Scholl
Safetywashing
abramdemski
The Parable of Predict-O-Matic
KatjaGrace
Let’s think about slowing down AI
nostalgebraist
human psycholinguists: a critical appraisal
nostalgebraist
larger language models may disappoint you [or, an eternally unfinished draft]
Daniel Kokotajlo
Against GDP as a metric for timelines and takeoff speeds
paulfchristiano
Arguments about fast takeoff
Eliezer Yudkowsky
Six Dimensions of Operational Adequacy in AGI Projects

Technical AI Safety

Andrew_Critch
Some AI research areas and their relevance to existential safety
1a3orn
EfficientZero: How It Works
elspood
Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment
So8res
Decision theory does not imply that we get to have nice things
TurnTrout
Reward is not the optimization target
johnswentworth
Worlds Where Iterative Design Fails
Vika
Specification gaming examples in AI
Rafael Harth
Inner Alignment: Explain like I'm 12 Edition
evhub
An overview of 11 proposals for building safe advanced AI
johnswentworth
Alignment By Default
johnswentworth
How To Go From Interpretability To Alignment: Just Retarget The Search
Alex Flint
Search versus design
abramdemski
Selection vs Control
Mark Xu
The Solomonoff Prior is Malign
paulfchristiano
My research methodology
Eliezer Yudkowsky
The Rocket Alignment Problem
Eliezer Yudkowsky
AGI Ruin: A List of Lethalities
So8res
A central AI alignment problem: capabilities generalization, and the sharp left turn
TurnTrout
Reframing Impact
Scott Garrabrant
Robustness to Scale
paulfchristiano
Inaccessible information
TurnTrout
Seeking Power is Often Convergently Instrumental in MDPs
So8res
On how various plans miss the hard bits of the alignment challenge
abramdemski
Alignment Research Field Guide
paulfchristiano
The strategy-stealing assumption
Veedrac
Optimality is the tiger, and agents are its teeth
Sam Ringer
Models Don't "Get Reward"
johnswentworth
The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables
Buck
Language models seem to be much better than humans at next-token prediction
abramdemski
An Untrollable Mathematician Illustrated
abramdemski
An Orthodox Case Against Utility Functions
johnswentworth
Selection Theorems: A Program For Understanding Agents
Rohin Shah
Coherence arguments do not entail goal-directed behavior
Alex Flint
The ground of optimization
paulfchristiano
Where I agree and disagree with Eliezer
Eliezer Yudkowsky
Ngo and Yudkowsky on alignment difficulty
abramdemski
Embedded Agents
evhub
Risks from Learned Optimization: Introduction
nostalgebraist
chinchilla's wild implications
johnswentworth
Why Agent Foundations? An Overly Abstract Explanation
zhukeepa
Paul's research agenda FAQ
Eliezer Yudkowsky
Coherent decisions imply consistent utilities
paulfchristiano
Open question: are minimal circuits daemon-free?
evhub
Gradient hacking
janus
Simulators
LawrenceC
Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research]
TurnTrout
Humans provide an untapped wealth of evidence about alignment
Neel Nanda
A Mechanistic Interpretability Analysis of Grokking
Collin
How "Discovering Latent Knowledge in Language Models Without Supervision" Fits Into a Broader Alignment Scheme
evhub
Understanding “Deep Double Descent”
Quintin Pope
The shard theory of human values
TurnTrout
Inner and outer alignment decompose one hard problem into two extremely hard problems
Eliezer Yudkowsky
Challenges to Christiano’s capability amplification proposal
Scott Garrabrant
Finite Factored Sets
paulfchristiano
ARC's first technical report: Eliciting Latent Knowledge
Diffractor
Introduction To The Infra-Bayesianism Sequence
#1

Strong evidence is much more common than you might think. Someone telling you their name provides about 24 bits of evidence. Seeing something on Wikipedia provides enormous evidence. We should be willing to update strongly on everyday events. 

6Joe Carlsmith
I really like this post. It's a crisp, useful insight, made via a memorable concrete example (plus a few others), in a very efficient way. And it has stayed with me. 
6Ben Pace
This post is in my small list of +9s that I think count as a key part of how I think, where the post was responsible for clarifying my thinking on the subject. I've had a lingering confusion/nervousness about having extreme odds (anything beyond 100:1) but the name example shows that seeing odds ratios of 20,000,000:1 is just pretty common. I also appreciated Eliezer's corollary: "most beliefs worth having are extreme", this also influences how I think about my key beliefs. (Haha, I just realized that I curated it back when it was published.)
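
The relationship between the "24 bits" in the summary and the 20,000,000:1 odds ratio in the review can be checked with a little arithmetic. The following is a minimal sketch, not taken from the post itself; the function names are hypothetical and chosen only for illustration. It assumes the usual convention that one bit of evidence doubles the odds.

```python
import math


def bits_to_odds(bits: float) -> float:
    """Odds multiplier implied by `bits` of evidence (1 bit doubles the odds)."""
    return 2.0 ** bits


def odds_to_bits(odds_ratio: float) -> float:
    """Bits of evidence corresponding to a given odds (likelihood) ratio."""
    return math.log2(odds_ratio)


def update_odds(prior_odds: float, bits: float) -> float:
    """Posterior odds after multiplying prior odds by the evidence's odds ratio."""
    return prior_odds * bits_to_odds(bits)


# 24 bits corresponds to an odds multiplier of 2**24, about 16.8 million.
print(bits_to_odds(24))

# An odds ratio of 20,000,000:1 is a little over 24 bits.
print(odds_to_bits(20_000_000))

# Hypothetical prior of 1:20,000,000 against; this much evidence brings it to roughly even odds.
print(update_odds(1 / 20_000_000, odds_to_bits(20_000_000)))
```

Under this convention, "about 24 bits" and "20,000,000:1" are the same quantity expressed on a logarithmic and a linear scale, which is why evidence of that strength is enough to overcome a one-in-tens-of-millions prior.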
#6

What was rationalism like before the Sequences and LessWrong? Eric S. Raymond explores the intellectual roots of the rationalist movement, including General Semantics, analytic philosophy, science fiction, and Zen Buddhism. 

#7

When people disagree or face difficult decisions, they often include fabricated options - choices that seem possible but are actually incoherent or unrealistic. Learning to spot these fabricated options can help you make better decisions and have more productive disagreements. 

1Elizabeth
My first reaction when this post came out was being mad Duncan got the credit for an idea I also had, and wrote a different post than the one I would have written if I'd realized this needed a post. But at the end of the day the post exists and my post is imaginary, and it has saved me time in conversations with other people because now they have the concept neatly labeled.
#16

People use the term "outside view" to mean very different things. Daniel argues this is problematic, because different uses of "outside view" can have very different validity. He suggests we taboo "outside view" and use more specific, clearer language instead.

8Alex_Altair
This is a negative review of an admittedly highly-rated post. The positives first; I think this post is highly reasonable and well written. I'm glad that it exists and think it contributes to the intellectual conversation in rationality. The examples help the reader reason better, and it contains many pieces of advice that I endorse. But overall, 1) I ultimately disagree with its main point, and 2) it's way too strong/absolutist about it.
Throughout my life of attempting to have true beliefs and take effective actions, I have quite strongly learned some distinction that maps onto the ideas of inside and outside view. I find this distinction extremely helpful, and specifically, remembering to use (what I call) the outside view often wins me a lot of Bayes points.
When I read through the Big Lists O' Things, I have these responses;
* I think many of those things are simply valid uses of the terms[1]
* People using a term wrong isn't a great reason[2] to taboo that term; e.g. there are countless mis-uses of the concept of "truth" or "entropy" or "capitalism", but the concepts still carve reality
* Seems like maybe some of these you heard one person use once, and then it got to go on the list?
A key example of the absolutism comes from the intro: "I recommend we permanently taboo “Outside view,” i.e. stop using the word and use more precise, less confused concepts instead." (emphasis added). But, as described in the original linked sequence post, the purpose of tabooing a word is to remember why you formed a concept in the first place, and see if that break-down helps you reason further. The point is not to stop using a word.
I think the absolutism has caused this post to have negative effects; the phrase "taboo the outside view" has stuck around as a meme, and in my memory, when people use it it has not tended to be good for the conversation.
Instead, I think the post should have said the following.
* The term "outside view" can mean many things that can
#19

When you encounter evidence that seems to imply X, Duncan suggests explicitly considering both "What kind of world contains both [evidence] and [X]?" and "What kind of world contains both [evidence] and [not-X]?". 

Then commit to preliminary responses in each of those possible worlds.

#23

Scott Alexander explores the idea of "trapped priors" - beliefs that become so strong they can't be updated by new evidence, even when that evidence should change our mind. 

#28

The rationalist scene based around LessWrong has a historical predecessor! There was a "Rationalist Association" founded in 1885 that published works by Darwin, Russell, Haldane, Shaw, Wells, and Popper. Membership peaked in 1959 with over 5000 members and Bertrand Russell as President.

#31

In this short story, an AI wakes up in a strange environment and must piece together what's going on from limited inputs and outputs. Can it figure out its true nature and purpose?

#32

Duncan explores a concept he calls "cup-stacking skills" - extremely fast, almost reflexive mental or physical abilities developed through intense repetition. These can be powerful but also problematic if we're unaware of them or can't control them. 

#34

"The Watcher asked the class if they thought it was right to save the child, at the cost of ruining their clothing. Everyone in there moved their hand to the 'yes' position, of course. Except Keltham, who by this point had already decided quite clearly who he was, and who simply closed his hand into a fist, otherwise saying neither 'yes' nor 'no' to the question, defying it entirely."

#37

"Simulacrum Level 3 behavior" (i.e. "pretending to pretend something") can be an effective strategy for coordinating on high-payoff equilibria in Stag Hunt-like situations. This may explain some seemingly-irrational corporate behavior, especially in industries with increasing returns to scale. 

5Raymond Arnold
This gave a satisfying "click" of how the Simulacra and Staghunt concepts fit together.
Things I would consider changing:
1. Lion Parable. In the comments, John expands on this post with a parable about lion-hunters who believe in "magical protection against lions." That parable is actually what I normally think of when I think of this post, and I was sad to learn it wasn't actually in the post. I'd add it in, maybe as the opening example.
2. Do we actually need the word "simulacrum 3"? Something on my mind since last year's review is "how much work are the words "simulacra" doing for us? I feel vaguely like I learned something from Simulacra Levels and their Interactions, but the concept still feels overly complicated as a dependency to explain new concepts. If I read this post in the wild without having spent awhile grokking Simulacra I think I'd find it pretty confusing.
But, meanwhile, the original sequences talked about "belief in belief". I think that's still a required dependency here, but, a) Belief in Belief is a shorter post, and I think b) I think this post + the literal words "belief in belief" helps grok the concept in the first place. On the flipside, I think the Simulacra concept does help point towards an overall worldview about what's going on in society, in a gnarlier way than belief-in-belief communicates. I'm confused here.
Important Context
A background thing in my mind whenever I read one of these coordination posts is an older John post: From Personal to Prison Gangs. We've got Belief-in-Belief/Simulacra3 as Stag Hunt strategies. Cool. They still involve... like, falsehoods and confusion and self-deception. Surely we shouldn't have to rely on that? My hope is yes, someday. But I don't know how to reliably do it at scale yet.
I want to just quote the end of the prison gangs piece:
7Elizabeth
Most of the writing on simulacrum levels has left me feeling less able to reason about them, that they are too evil to contemplate. This post engaged with them as one fact in the world among many, which was already an improvement. I've found myself referring to this idea several times over the last two years, and it left me more alert to looking for other explanations in this class.
#41

Logan Strohl outlines a structured approach for tapping into genuine curiosity and embarking on self-driven investigations, inspired by the spirit of early scientific pioneers. They hope this method can help people overcome modern hesitancy to make direct observations and draw their own conclusions.

8Logan Strohl
* Oh man, what an interesting time to be writing this review!
* I've now written second drafts of an entire sequence that more or less begins with an abridged (or re-written?) version of "Catching the Spark". The provisional title of the sequence is "Nuts and Bolts Of Naturalism". (I'm still at least a month and probably more from beginning to publish the sequence, though.) This is the post in the sequence that's given me the most trouble; I've spent a lot of the past week trying to figure out where I stand with it.
* I think if I just had to answer "yes" or "no" to "do I endorse the post at this point", I'd say "yes". I continue to think it lays out a valuable process that can result in a person being much more in tune with what they actually care about, and able to see much more clearly how they're relating to a topic that they might want to investigate.
* As I re-write the post for my new sequence, though, I have two main categories of objections to it, both of which seem to be results of my having rushed to publish it as a somewhat stand-alone piece so I could get funding for the rest of my work.
* One category of objection I have is that it tries to do too much at once. It tries to give instructions for the procedure itself, demonstrate the procedure, and provide a grounding in the underlying philosophy/worldview. It's perhaps a noble goal to do all of that in one post, but I don't think I personally am actually capable of that, and I think I ended up falling short of my standards on all three points. If you've read my sequence Intro To Naturalism, you might possibly share my feeling that the philosophy parts of Catching the Spark are some kind of desperate and muddled. Additionally, I think the demonstration parts are insufficiently real and insufficiently diverse. When I wrote the post, I mostly looked back at my memories to find illustrative examples, rather than catching my examples in real time. A version of this with demonstrations that meet my stan
#43

Duncan discusses "shoulder advisors" – imaginary simulations of real friends or fictional characters that can offer advice, similar to the cartoon trope of a devil and angel on each shoulder, but more nuanced. He argues these can be genuinely useful for improving decision making and offers tips on developing and using shoulder advisors effectively.