The Best of LessWrong

Here you can find the best posts of LessWrong. Once a post is more than a year old, the LessWrong community reviews it and votes on how well it has stood the test of time. These are the posts that have ranked highest across all the years since 2018 (when our annual tradition of choosing the least wrong of LessWrong began).

For the years 2018, 2019 and 2020 we also published physical books with the results of our annual vote, which you can buy and learn more about here.

Rationality

Eliezer Yudkowsky
Local Validity as a Key to Sanity and Civilization
Buck
"Other people are wrong" vs "I am right"
Mark Xu
Strong Evidence is Common
johnswentworth
You Are Not Measuring What You Think You Are Measuring
johnswentworth
Gears-Level Models are Capital Investments
Hazard
How to Ignore Your Emotions (while also thinking you're awesome at emotions)
Scott Garrabrant
Yes Requires the Possibility of No
Scott Alexander
Trapped Priors As A Basic Problem Of Rationality
Duncan Sabien (Deactivated)
Split and Commit
Ben Pace
A Sketch of Good Communication
Eliezer Yudkowsky
Meta-Honesty: Firming Up Honesty Around Its Edge-Cases
Duncan Sabien (Deactivated)
Lies, Damn Lies, and Fabricated Options
Duncan Sabien (Deactivated)
CFAR Participant Handbook now available to all
johnswentworth
What Are You Tracking In Your Head?
Mark Xu
The First Sample Gives the Most Information
Duncan Sabien (Deactivated)
Shoulder Advisors 101
Zack_M_Davis
Feature Selection
abramdemski
Mistakes with Conservation of Expected Evidence
Scott Alexander
Varieties Of Argumentative Experience
Eliezer Yudkowsky
Toolbox-thinking and Law-thinking
alkjash
Babble
Kaj_Sotala
The Felt Sense: What, Why and How
Duncan Sabien (Deactivated)
Cup-Stacking Skills (or, Reflexive Involuntary Mental Motions)
Ben Pace
The Costly Coordination Mechanism of Common Knowledge
Jacob Falkovich
Seeing the Smoke
Elizabeth
Epistemic Legibility
Daniel Kokotajlo
Taboo "Outside View"
alkjash
Prune
johnswentworth
Gears vs Behavior
Raemon
Noticing Frame Differences
Duncan Sabien (Deactivated)
Sazen
AnnaSalamon
Reality-Revealing and Reality-Masking Puzzles
Eliezer Yudkowsky
ProjectLawful.com: Eliezer's latest story, past 1M words
Eliezer Yudkowsky
Self-Integrity and the Drowning Child
Jacob Falkovich
The Treacherous Path to Rationality
Scott Garrabrant
Tyranny of the Epistemic Majority
alkjash
More Babble
abramdemski
Most Prisoner's Dilemmas are Stag Hunts; Most Stag Hunts are Schelling Problems
Raemon
Being a Robust Agent
Zack_M_Davis
Heads I Win, Tails?—Never Heard of Her; Or, Selective Reporting and the Tragedy of the Green Rationalists
Benquo
Reason isn't magic
habryka
Integrity and accountability are core parts of rationality
Raemon
The Schelling Choice is "Rabbit", not "Stag"
Diffractor
Threat-Resistant Bargaining Megapost: Introducing the ROSE Value
Raemon
Propagating Facts into Aesthetics
johnswentworth
Simulacrum 3 As Stag-Hunt Strategy
LoganStrohl
Catching the Spark
Jacob Falkovich
Is Rationalist Self-Improvement Real?
Benquo
Excerpts from a larger discussion about simulacra
Zvi
Simulacra Levels and their Interactions
abramdemski
Radical Probabilism
sarahconstantin
Naming the Nameless
AnnaSalamon
Comment reply: my low-quality thoughts on why CFAR didn't get farther with a "real/efficacious art of rationality"
Eric Raymond
Rationalism before the Sequences
Owain_Evans
The Rationalists of the 1950s (and before) also called themselves “Rationalists”

Optimization

sarahconstantin
The Pavlov Strategy
johnswentworth
Coordination as a Scarce Resource
AnnaSalamon
What should you change in response to an "emergency"? And AI risk
Zvi
Prediction Markets: When Do They Work?
johnswentworth
Being the (Pareto) Best in the World
alkjash
Is Success the Enemy of Freedom? (Full)
jasoncrawford
How factories were made safe
HoldenKarnofsky
All Possible Views About Humanity's Future Are Wild
jasoncrawford
Why has nuclear power been a flop?
Zvi
Simple Rules of Law
Elizabeth
Power Buys You Distance From The Crime
Eliezer Yudkowsky
Is Clickbait Destroying Our General Intelligence?
Scott Alexander
The Tails Coming Apart As Metaphor For Life
Zvi
Asymmetric Justice
Jeffrey Ladish
Nuclear war is unlikely to cause human extinction
Spiracular
Bioinfohazards
Zvi
Moloch Hasn’t Won
Zvi
Motive Ambiguity
Benquo
Can crimes be discussed literally?
Said Achmiz
The Real Rules Have No Exceptions
Lars Doucet
Lars Doucet's Georgism series on Astral Codex Ten
johnswentworth
When Money Is Abundant, Knowledge Is The Real Wealth
HoldenKarnofsky
This Can't Go On
Scott Alexander
Studies On Slack
johnswentworth
Working With Monsters
jasoncrawford
Why haven't we celebrated any major achievements lately?
abramdemski
The Credit Assignment Problem
Martin Sustrik
Inadequate Equilibria vs. Governance of the Commons
Raemon
The Amish, and Strategic Norms around Technology
Zvi
Blackmail
KatjaGrace
Discontinuous progress in history: an update
Scott Alexander
Rule Thinkers In, Not Out
Jameson Quinn
A voting theory primer for rationalists
HoldenKarnofsky
Nonprofit Boards are Weird
Wei Dai
Beyond Astronomical Waste
johnswentworth
Making Vaccine
jefftk
Make more land

World

Ben
The Redaction Machine
Samo Burja
On the Loss and Preservation of Knowledge
Alex_Altair
Introduction to abstract entropy
Martin Sustrik
Swiss Political System: More than You ever Wanted to Know (I.)
johnswentworth
Interfaces as a Scarce Resource
johnswentworth
Transportation as a Constraint
eukaryote
There’s no such thing as a tree (phylogenetically)
Scott Alexander
Is Science Slowing Down?
Martin Sustrik
Anti-social Punishment
Martin Sustrik
Research: Rescuers during the Holocaust
GeneSmith
Toni Kurz and the Insanity of Climbing Mountains
johnswentworth
Book Review: Design Principles of Biological Circuits
Elizabeth
Literature Review: Distributed Teams
Valentine
The Intelligent Social Web
jacobjacob
Unconscious Economics
eukaryote
Spaghetti Towers
Eli Tyre
Historical mathematicians exhibit a birth order effect too
johnswentworth
What Money Cannot Buy
Scott Alexander
Book Review: The Secret Of Our Success
johnswentworth
Specializing in Problems We Don't Understand
KatjaGrace
Why did everything take so long?
Ruby
[Answer] Why wasn't science invented in China?
Scott Alexander
Mental Mountains
Kaj_Sotala
My attempt to explain Looking, insight meditation, and enlightenment in non-mysterious terms
johnswentworth
Evolution of Modularity
johnswentworth
Science in a High-Dimensional World
zhukeepa
How uniform is the neocortex?
Kaj_Sotala
Building up to an Internal Family Systems model
Steven Byrnes
My computational framework for the brain
Natália
Counter-theses on Sleep
abramdemski
What makes people intellectually active?
Bucky
Birth order effect found in Nobel Laureates in Physics
KatjaGrace
Elephant seal 2
JackH
Anti-Aging: State of the Art
Vaniver
Steelmanning Divination
Kaj_Sotala
Book summary: Unlocking the Emotional Brain

AI Strategy

Ajeya Cotra
Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover
Daniel Kokotajlo
Cortés, Pizarro, and Afonso as Precedents for Takeover
Daniel Kokotajlo
The date of AI Takeover is not the day the AI takes over
paulfchristiano
What failure looks like
Daniel Kokotajlo
What 2026 looks like
gwern
It Looks Like You're Trying To Take Over The World
Andrew_Critch
What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs)
paulfchristiano
Another (outer) alignment failure story
Ajeya Cotra
Draft report on AI timelines
Eliezer Yudkowsky
Biology-Inspired AGI Timelines: The Trick That Never Works
HoldenKarnofsky
Reply to Eliezer on Biological Anchors
Richard_Ngo
AGI safety from first principles: Introduction
Daniel Kokotajlo
Fun with +12 OOMs of Compute
Wei Dai
AI Safety "Success Stories"
KatjaGrace
Counterarguments to the basic AI x-risk case
johnswentworth
The Plan
Rohin Shah
Reframing Superintelligence: Comprehensive AI Services as General Intelligence
lc
What an actually pessimistic containment strategy looks like
Eliezer Yudkowsky
MIRI announces new "Death With Dignity" strategy
evhub
Chris Olah’s views on AGI safety
So8res
Comments on Carlsmith's “Is power-seeking AI an existential risk?”
Adam Scholl
Safetywashing
abramdemski
The Parable of Predict-O-Matic
KatjaGrace
Let’s think about slowing down AI
nostalgebraist
human psycholinguists: a critical appraisal
nostalgebraist
larger language models may disappoint you [or, an eternally unfinished draft]
Daniel Kokotajlo
Against GDP as a metric for timelines and takeoff speeds
paulfchristiano
Arguments about fast takeoff
Eliezer Yudkowsky
Six Dimensions of Operational Adequacy in AGI Projects

Technical AI Safety

Andrew_Critch
Some AI research areas and their relevance to existential safety
1a3orn
EfficientZero: How It Works
elspood
Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment
So8res
Decision theory does not imply that we get to have nice things
TurnTrout
Reward is not the optimization target
johnswentworth
Worlds Where Iterative Design Fails
Vika
Specification gaming examples in AI
Rafael Harth
Inner Alignment: Explain like I'm 12 Edition
evhub
An overview of 11 proposals for building safe advanced AI
johnswentworth
Alignment By Default
johnswentworth
How To Go From Interpretability To Alignment: Just Retarget The Search
Alex Flint
Search versus design
abramdemski
Selection vs Control
Mark Xu
The Solomonoff Prior is Malign
paulfchristiano
My research methodology
Eliezer Yudkowsky
The Rocket Alignment Problem
Eliezer Yudkowsky
AGI Ruin: A List of Lethalities
So8res
A central AI alignment problem: capabilities generalization, and the sharp left turn
TurnTrout
Reframing Impact
Scott Garrabrant
Robustness to Scale
paulfchristiano
Inaccessible information
TurnTrout
Seeking Power is Often Convergently Instrumental in MDPs
So8res
On how various plans miss the hard bits of the alignment challenge
abramdemski
Alignment Research Field Guide
paulfchristiano
The strategy-stealing assumption
Veedrac
Optimality is the tiger, and agents are its teeth
Sam Ringer
Models Don't "Get Reward"
johnswentworth
The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables
Buck
Language models seem to be much better than humans at next-token prediction
abramdemski
An Untrollable Mathematician Illustrated
abramdemski
An Orthodox Case Against Utility Functions
johnswentworth
Selection Theorems: A Program For Understanding Agents
Rohin Shah
Coherence arguments do not entail goal-directed behavior
Alex Flint
The ground of optimization
paulfchristiano
Where I agree and disagree with Eliezer
Eliezer Yudkowsky
Ngo and Yudkowsky on alignment difficulty
abramdemski
Embedded Agents
evhub
Risks from Learned Optimization: Introduction
nostalgebraist
chinchilla's wild implications
johnswentworth
Why Agent Foundations? An Overly Abstract Explanation
zhukeepa
Paul's research agenda FAQ
Eliezer Yudkowsky
Coherent decisions imply consistent utilities
paulfchristiano
Open question: are minimal circuits daemon-free?
evhub
Gradient hacking
janus
Simulators
LawrenceC
Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research]
TurnTrout
Humans provide an untapped wealth of evidence about alignment
Neel Nanda
A Mechanistic Interpretability Analysis of Grokking
Collin
How "Discovering Latent Knowledge in Language Models Without Supervision" Fits Into a Broader Alignment Scheme
evhub
Understanding “Deep Double Descent”
Quintin Pope
The shard theory of human values
TurnTrout
Inner and outer alignment decompose one hard problem into two extremely hard problems
Eliezer Yudkowsky
Challenges to Christiano’s capability amplification proposal
Scott Garrabrant
Finite Factored Sets
paulfchristiano
ARC's first technical report: Eliciting Latent Knowledge
Diffractor
Introduction To The Infra-Bayesianism Sequence
#1

How does it work to optimize for realistic goals in physical environments of which you yourself are a part? E.g. humans and robots in the real world, as opposed to humans and AIs playing video games in virtual worlds where the player is not part of the environment. The authors claim we don't actually have a good theoretical understanding of this, and explore four specific ways in which we don't understand this process.

4orthonormal
Insofar as the AI Alignment Forum is part of the Best-of-2018 Review, this post deserves to be included. It's the friendliest explanation of MIRI's research agenda (as of 2018) that currently exists.
#2

This post is a not-so-secret analogy for the AI alignment problem. Via a fictional dialogue, Eliezer explores and counters common questions about the Rocket Alignment Problem as approached by the Mathematics of Intentional Rocketry Institute.

MIRI researchers will tell you they're worried that "right now, nobody can tell you how to point your rocket’s nose such that it goes to the moon, nor indeed any prespecified celestial destination."

15Isnasene
[Disclaimer: I'm reading this post for the first time now, as of 1/11/2020. I also already have a broad understanding of the importance of AI safety. While I am skeptical about MIRI's approach to things, I am also a fan of MIRI. Where this puts me relative to the target demographic of this post, I cannot say.] Overall Summary I think this post is pretty good. It's a solid and well-written introduction to some of the intuitions behind AI alignment and the fundamental research that MIRI does. At the same time, the use of analogy made the post more difficult for me to parse and hid some important considerations about AI alignment from view. Though it may be good (but not optimal) for introducing some people to the problem of AI alignment and a subset of MIRI's work, it did not raise or lower my opinion of MIRI as someone who already understood AGI safety to be important. To be clear, I do not consider any of these weaknesses serious because I believe them to be partially irrelevant to the audience of people who don't appreciate the importance of AI-Safety. Still, they are relevant to the audience of people who give AI-Safety the appropriate scrutiny but remain skeptical of MIRI. And I think this latter audience is important enough to assign this article a "pretty good" instead of a "great". I hope a future post directly explores the merit of MIRI's work on the context AI alignment without use of analogy. Below is an overview of my likes and dislikes in this post. I will go into more detail about them in the next section, "Evaluating Analogies." Things I liked: * It's a solid introduction to AI-alignment, covering a broad range of topics including: * Why we shouldn't expect aligned AGI by default * How modern conversation about AGI behavior is problematically underspecified * Why fundamental deconfusion research is necessary for solving AI-alignment * It directly explains the value/motivation of particular pieces of MIRI work via analogy -- which
#3

Eliezer describes the connection between understanding what a locally valid proof step is in mathematics, knowing that there are bad arguments for true conclusions, and recognizing that, for civilization to hold together, people need to apply rules impartially even if it feels like it costs them in a particular instance. He fears that our society is losing appreciation for these points.

7Ben Pace
I think about this post a lot, and sometimes in conjunction with my own post on common knowledge. As well as it being a referent for when I think about fairness, it also ties in with how I think about LessWrong, Arbital and communal online endeavours for truth. The key line is: You can think of Wikipedia as being a set of communally editable web pages where the content of the page is constrained to be that which we can easily gain common knowledge of its truth. Wikipedia's information is only that which comes from verifiable sources, which is how they solve this problem - all the editors don't have to get in a room and talk forever if there's a simple standard of truth. (I mean, they still do, but it would blow up to an impossible level if the standard were laxer than this.) I understand a key part of the vision for Arbital was that, instead of the common standard being verifiable facts, it was instead to build a site around verifiable steps of inference, or alternatively phrased, local validity. This would allow us to walk through argument space together without knowing whether the conclusions were true or false yet. I think about this a lot, in terms of what steps a community can make together. I maybe will write a post on it more some day. I'm really grateful that Eliezer wrote this post.
3Zack M. Davis
It strikes me as pedagogically unfortunate that sections i. and ii. (on arguments and proof-steps being locally valid) are part of the same essay as sections iii.–vi. (on what this has to do with the function of Law in Society). Had this been written in the Sequences-era, one would imagine this being (at least) two separate posts, and it would be nice to have a reference link for just the concept of argumentative local validity (which is obviously correct and important to have a name for, even if some of the speculations about Law in sections iii.–vi. turned out to be wrong).
#4

Will AGI progress gradually or rapidly? I think the disagreement is mostly about what happens before we build powerful AGI. 

I think weaker AI systems will already have radically transformed the world. This is strategically relevant because I'm imagining AGI strategies playing out in a world where everything is already going crazy, while other people are imagining AGI strategies playing out in a world that looks kind of like 2018 except that someone is about to get a decisive strategic advantage.

0scarcegreengrass
Epistemics: Yes, it is sound. Not because of claims (they seem more like opinions to me), but because it is appropriately charitable to those that disagree with Paul, and tries hard to open up avenues of mutual understanding. Valuable: Yes. It provides new third paradigms that bring clarity to people with different views. Very creative, good suggestions. Should it be in the Best list?: No. It is from the middle of a conversation, and would be difficult to understand if you haven't read a lot about the 'Foom debate'. Improved: The same concepts rewritten for a less-familiar audience would be valuable. Or at least with links to some of the background (definitions of AGI, detailed examples of what fast takeoff might look like and arguments for its plausibility). Followup: More posts thoughtfully describing positions for and against, etc. Presumably these exist, but I personally have not read much of this discussion in the 2018-2019 era.
#5

A coordination problem is when everyone is taking some action A, and we’d rather all be taking action B, but it’s bad if we don’t all move to B at the same time. Common knowledge is the name for the epistemic state we’re collectively in, when we know we can all start choosing action B - and trust everyone else to do the same.
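
To pin down the structure being described, here is a minimal sketch of it as a two-player coordination game. The payoff numbers are illustrative assumptions, not taken from the post; they only encode "all-A is okay, all-B is better, mismatches are worst."

```python
# Toy coordination game: everyone currently plays A; all-B would be better;
# switching alone is worse than not switching at all.
payoff = {
    ("A", "A"): 1,  # status quo: acceptable for everyone
    ("B", "B"): 3,  # better, but only if we all switch together
    ("A", "B"): 0,  # mismatched moves are the worst outcome
    ("B", "A"): 0,
}

def best_response(others_action: str) -> str:
    """What should I play, given what I expect everyone else to play?"""
    return max(["A", "B"], key=lambda mine: payoff[(mine, others_action)])

print(best_response("A"))  # 'A': without trust that others switch, nobody moves
print(best_response("B"))  # 'B': common knowledge that others will switch makes B stable
```

The point of common knowledge is precisely to move everyone from the first branch to the second all at once.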

#6

Why do some societies exhibit more antisocial punishment than others? Martin explores both some literature on the subject, and his own experience living in a country where "punishment of cooperators" was fairly common.

5Martin Sustrik
Author here. In hindsight, I still feel that the phenomenon is an interesting and potentially important topic to look into. I am not aware of any attempt to replicate or dive deeper though. As for my attempt to explain the psychology underlying the phenomenon, I am not entirely happy with it. It's based only on introspection and lacks sound game-theoretic backing. By the way, there's one interesting explanation I've read somewhere in the meantime (unfortunately, I don't remember the source): Cooperation may incur different costs on different participants. If you are well-off, putting $100 into a common pool is not a terribly important matter. If others fail to cooperate, all you can lose is $100. If you are just barely getting along, putting $100 into a common pool may threaten you in a serious way. Therefore, the rich will be more likely to cooperate than the poor. Now, if the thing is framed in moral terms (those cooperating are "good", those not cooperating are "bad"), the whole thing may feel like a scam providing the rich a way to buy moral superiority. As a poor person you may thus resort to anti-social punishment as a way to punish the scam.
#7

The "tails coming apart" is a phenomenon where two variables can be highly correlated overall, but at extreme values they diverge. Scott Alexander explores how this applies to complex concepts like happiness and morality, where our intuitions work well for common situations but break down in extreme scenarios. 

3SebastianG
“The Tails Coming Apart as a Metaphor for Life” should be retitled “The Tails Coming Apart as a Metaphor for Earth since 1800.” Scott does three things, 1) he notices that happiness research is framing dependent, 2) he notices that happiness is a human level term, but not specific at the extremes, 3) he considers how this relates to deep seated divergences in moral intuitions becoming ever more apparent in our world. He hints at why moral divergence occurs with his examples. His extreme case of hedonic utilitarianism, converting the entire mass of the universe into nervous tissue experiencing raw euphoria, represents a ludicrous extension of the realm of the possible: wireheading, methadone, subverting factory farming. Each of these is dependent upon technology and modern economies, and presents real ethical questions. None of these were live issues for people hundreds of years ago. The tails of their rival moralities didn’t come apart – or at least not very often or in fundamental ways. Back then Jesuits and Confucians could meet in China and agree on something like the “nature of the prudent man.” But in the words of Lonergan that version of the prudent man, Prudent Man 1.0, is obsolete: “We do not trust the prudent man’s memory but keep files and records and develop systems of information retrieval. We do not trust the prudent man’s ingenuity but call in efficiency experts or set problems for operations research. We do not trust the prudent man’s judgment but employ computers to forecast demand,” and he goes on. For from the moment VisiCalc primed the world for a future of data aggregation, Prudent Man 1.0 has been hiding in the bathroom bewildered by modern business efficiency and moon landings. Let’s take Scott’s analogy of the Bay Area Transit system entirely literally, and ask the mathematical question: when do parallel lines come apart or converge? Recall Euclid’s Fifth Postulate, the one saying that parallel lines will never intersect. For almost a couple
#8

How do human beings produce knowledge? When we describe rational thought processes, we tend to think of them as essentially deterministic, deliberate, and algorithmic. After some self-examination, however, Alkjash came to think that his process is closer to babbling many random strings and later filtering by a heuristic.

4Raymond Arnold
I just re-read this sequence. Babble has definitely made its way into my core vocabulary. I think of "improving both the Babble and Prune of LessWrong" as being central to my current goals, and I think this post was counterfactually relevant for that. Originally I had planned to vote weakly in favor of this post, but am currently positioning it more at the upper-mid-range of my votes. I think it's somewhat unfortunate that the Review focused only on posts, as opposed to sequences as a whole. I just re-read this sequence, and I think the posts More Babble, Prune, and Circumambulation have more substance/insight/gears/hooks than this one. (I didn't get as much out of Write). But, this one was sort of "the schelling post to nominate" if you were going to nominate one of them. The piece as a whole succeeds very much both as Art and as pedagogy.
#9

In this post, Alkjash explores the concept of Babble and Prune as a model for thought generation. Babble refers to generating many possibilities with a weak heuristic, while Prune involves using a stronger heuristic to filter and select the best options. He discusses how this model relates to creativity, problem-solving, and various aspects of human cognition and culture. 
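
The model is essentially generate-and-filter. Here is a minimal sketch of that structure with deliberately toy stand-ins for the weak and strong heuristics (everything below is an illustrative assumption, not taken from alkjash's posts):

```python
# Babble: cheaply generate many candidates with a weak heuristic.
# Prune: keep only the candidates that pass a stricter filter.
import random

random.seed(42)
LETTERS = "abcdefghijklmnopqrstuvwxyz"

def babble(n: int) -> list[str]:
    """Weak heuristic: produce lots of random five-letter strings."""
    return ["".join(random.choice(LETTERS) for _ in range(5)) for _ in range(n)]

def prune(candidates: list[str]) -> list[str]:
    """Stronger heuristic: keep only strings with at least two vowels."""
    return [c for c in candidates if sum(ch in "aeiou" for ch in c) >= 2]

ideas = babble(1000)
kept = prune(ideas)
print(f"{len(ideas)} babbled -> {len(kept)} survive pruning")
```

In these terms, creativity problems become questions about whether the generator is too weak or the filter too strong.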

#10

Babble is our ability to generate ideas. Prune is our ability to filter those ideas. For many people, Prune is too strong, so they don't generate enough ideas. This post explores how to relax Prune to let more ideas through.

#11

orthonormal reflects that different people experience different social fears. He guesses that the strongest fear for a person (an "alarm" in their head) is usually broken. So the most selfless people end up that way because of an uncalibrated fear that they're being too selfish, the loudest because of an uncalibrated fear of not being heard, and so on.

4orthonormal
The LW team is encouraging authors to review their own posts, so: In retrospect, I think this post set out to do a small thing, and did it well. This isn't a grand concept or a vast inferential distance, it's just a reframe that I think is valuable for many people to try for themselves. I still bring up this concept quite a lot when I'm trying to help people through their psychological troubles, and when I'm reflecting on my own. I don't know whether the post belongs in the Best of 2018, but I'm proud of it.
#12

Social reality and culture work a lot like improv comedy. We often don't know "who we are" or what's going on socially, but everyone unconsciously tries to establish expectations of one another. Understanding this dynamic can give you more freedom to change your role in social interactions. 

12Jacob Falkovich
In my opinion, the biggest shift in the study of rationality since the Sequences were published was a change in focus from "bad math" biases (anchoring, availability, base rate neglect etc.) to socially-driven biases. And with good reason: while a crash course in Bayes' Law can alleviate many of the issues with intuitive math, group politics are a deep and inextricable part of everything our brains do. There has been a lot of great writing describing the issue, like Scott’s essays on ingroups and outgroups and Robin Hanson’s theory of signaling. There are excellent posts summarizing the problem of socially-driven bias on a high level, like Kevin Simler’s post on crony beliefs. But The Intelligent Social Web offers something that all of the above don’t: a lens that looks into the very heart of social reality, makes you feel its power on an immediate and intuitive level, and gives you the tools to actually manipulate and change your reaction to it. Valentine’s structure of treating this as a “fake framework” is invaluable in this context. A high-level rigorous description of social reality doesn’t really empower you to do anything about it. But seeing social interactions as an improv scene, while not literally true, offers actionable insight. The specific examples in the post hit very close to home for me, like the example of one’s family tugging a person back into their old role. I noticed that I quite often lose my temper around my parents, something that happens basically never around my wife or friends. I realized that much of it is caused by a role conflict with my father about who gets to be the “authority” on living well. I further recognized that my temper is triggered by “should” statements, even innocuous ones like “you should have the Cabernet with this dish” over dinner. Seeing these interactions through the lens of both of us negotiating and claiming our roles allowed me to control how I feel and react rather than being driven by an anger that I don’t
11Michael Smith
I don't know if I'll ever get to a full editing of this. I'll jot notes here of how I would edit it as I reread this. * I'd ax the whole opening section. * That was me trying to (a) brute force motivation for the reader and (b) navigate some social tension I was feeling around what it means to be able to make a claim here. In particular I was annoyed with Oli and wanted to sidestep discussion of the lemons problem. My focus was actually on making something in culture salient by offering a fake framework. The thing speaks for itself once you look at it. After that point I don't care what anyone calls it. * This would, alas, leave out the emphasis that it's a fake framework. But I've changed my attitude about how much hand-holding to do for stuff like that. Part of the reason I put that in the beginning was to show the LW audience that I was taking it as fake, so as to sidestep arguments about how justified everything is or isn't. At this point I don't care anymore. People can project whatever they want on me because, uh, I can't really stop them anyway. So I'm not going to fret about it. * I had also intended the opening to have a kind of conversational tone, as part of a Sequence that I never finished (on "ontology-cracking"). I probably never will finish it at this point. So no point in making this stand-alone essay pretend to be part of an ongoing conversation. * A minor nitpick: I open the meat of the idea by telling some facts about improv theater. I suspect it'd be more engaging if I had written it as a story illustrating the experience. "Bob walked onto the stage, his heart pounding. 'God, what do I say?'" Etc. The whole thing would have felt less abstract if I had done that. But it clearly communicated well for this audience, so that's not a big concern. * One other reviewer mentioned how the strong examples end up obfuscating my overall point. That was actually a writing strategy: I didn't want the point stated early on and elucidated throug
#13

Prediction markets are a potential way to harness wisdom of crowds and incentivize truth-seeking. But they're tricky to set up correctly. Zvi Mowshowitz, who has extensive experience with prediction markets and sports betting, explains the key factors that make prediction markets succeed or fail.

#14

Rohin Shah argues that many common arguments for AI risk (about the perils of powerful expected utility maximizers) are actually arguments about goal-directed behavior or explicit reward maximization, which are not actually implied by coherence arguments. An AI system could be an expected utility maximizer without being goal-directed or an explicit reward maximizer.
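
A toy sketch of why the constraint is vacuous on its own (an illustrative construction, not from Rohin's post): for any fixed policy, we can define a utility function that assigns 1 to exactly the actions the policy takes, so the policy trivially "maximizes expected utility" without being goal-directed.

```python
# Any policy can be framed as an expected utility maximizer for SOME utility.
def twitch_policy(state: int) -> str:
    """A deliberately non-goal-directed policy: alternate actions forever."""
    return "left" if state % 2 == 0 else "right"

def rationalizing_utility(state: int, action: str) -> float:
    """Utility that rewards exactly whatever twitch_policy does."""
    return 1.0 if action == twitch_policy(state) else 0.0

# twitch_policy maximizes this utility in every state, yet it pursues no
# outcome in the world and looks nothing like a goal-directed agent.
for s in range(4):
    assert max(["left", "right"], key=lambda a: rationalizing_utility(s, a)) == twitch_policy(s)
print("twitch_policy is an 'expected utility maximizer' for rationalizing_utility")
```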

21DanielFilan
I think that strictly speaking this post (or at least the main thrust) is true, and proven in the first section. The title is arguably less true: I think of 'coherence arguments' as including things like 'it's not possible for you to agree to give me a limitless number of dollars in return for nothing', which does imply some degree of 'goal-direction'. I think the post is important, because it constrains the types of valid arguments that can be given for 'freaking out about goal-directedness', for lack of a better term. In my mind, it provokes various follow-up questions: 1. What arguments would imply 'goal-directed' behaviour? 2. With what probability will a random utility maximiser be 'goal-directed'? 3. How often should I think of a system as a utility maximiser in resources, perhaps with a slowly-changing utility function? 4. How 'goal-directed' are humans likely to make systems, given that we are making them in order to accomplish certain tasks that don't look like random utility functions? 5. Is there some kind of 'basin of goal-directedness' that systems fall in if they're even a little goal-directed, causing them to behave poorly? Off the top of my head, I'm not familiar with compelling responses from the 'freak out about goal-directedness' camp on points 1 through 5, even though as a member of that camp I think that such responses exist. Responses from outside this camp include Rohin's post 'Will humans build goal-directed agents?'. Another response is Brangus' comment post, although I find its theory of goal-directedness uncompelling. I think that it's notable that Brangus' post was released soon after this was announced as a contender for Best of LW 2018. I think that if this post were added to the Best of LW 2018 Collection, the 'freak out' camp might produce more of these responses and move the dialogue forward. As such, I think it should be added, both because of the clear argumentation and because of the response it is likely to provoke.
17Vanessa Kosoy
In this essay, Rohin sets out to debunk what ey perceive as a prevalent but erroneous idea in the AI alignment community, namely: "VNM and similar theorems imply goal-directed behavior". This is placed in the context of Rohin's thesis that solving AI alignment is best achieved by designing AI which is not goal-directed. The main argument is: "coherence arguments" imply expected utility maximization, but expected utility maximization does not imply goal-directed behavior. Instead, it is a vacuous constraint, since any agent policy can be regarded as maximizing the expectation of some utility function. I have mixed feelings about this essay. On the one hand, the core argument that VNM and similar theorems do not imply goal-directed behavior is true. To the extent that some people believed the opposite, correcting this mistake is important. On the other hand, (i) I don't think the claim Rohin is debunking is the claim Eliezer had in mind in those sources Rohin cites (ii) I don't think that the conclusions Rohin draws or at least implies are the right conclusions. The actual claim that Eliezer was making (or at least my interpretation of it) is, coherence arguments imply that if we assume an agent is goal-directed then it must be an expected utility maximizer, and therefore EU maximization is the correct mathematical model to apply to such agents. Why do we care about goal-directed agents in the first place? The reason is, on the one hand goal-directed agents are the main source of AI risk, and on the other hand, goal-directed agents are also the most straightforward approach to solving AI risk. Indeed, if we could design powerful agents with the goals we want, these agents would protect us from unaligned AIs and solve all other problems as well (or at least solve them better than we can solve them ourselves). Conversely, if we want to protect ourselves from unaligned AIs, we need to generate very sophisticated long-term plans of action in the physical world, possibl
#15

Scott reviews a paper by Bloom, Jones, Reenen & Webb which argues that scientific progress is slowing down, as measured by outputs per researcher. Scott argues that this is actually the expected result - constant progress in response to exponentially increasing inputs should be our null hypothesis, based on historical trends.

0Scott Alexander
I still endorse most of this post, but https://docs.google.com/document/d/1cEBsj18Y4NnVx5Qdu43cKEHMaVBODTTyfHBa8GIRSec/edit has clarified many of these issues for me and helped quantify the ways that science is, indeed, slowing down.
#16

You want your proposal for an AI to be robust to changes in its level of capabilities. It should be robust to the AI's capabilities scaling up, and also scaling down, and also the subcomponents of the AI scaling relative to each other. 

We might need to build AGIs that aren't robust to scale, but if so we should at least realize that we are doing that.

#17

Democratic processes are important loci of power. It's useful to understand the dynamics of the voting methods used in real-world elections. My own ideas of ethics and of fun theory are deeply informed by my decades of interest in voting theory.

7Ryan Blough
I think this post should be included in the best posts of 2018 collection. It does an excellent job of balancing several desirable qualities: it is very well written, being both clear and entertaining; it is informative and thorough; it is in the style of argument which is preferred on LessWrong, by which I mean makes use of both theory and intuition in the explanation. This post adds to the greater conversation by displaying rationality of the kind we are pursuing directed at a big societal problem. A specific example of what I mean that distinguishes this post from an overview that any motivated poster might write is the inclusion of Warren Smith's results; Smith is a mathematician from an unrelated field who has no published work on the subject. But he had work anyway, and it was good work which the author himself expanded on, and now we get to benefit from it through this post. This puts me very much in mind of the fact that this community was primarily founded by an autodidact who was deeply influenced by a physicist writing about probability theory. A word on one of our sacred taboos: in the beginning it was written that Politics is the Mindkiller, and so it was for years and years. I expect this is our most consistently and universally enforced taboo. Yet here we have a high-quality and very well received post about politics, and of the ~70 comments only one appears to have been mindkilled. This post has great value on the strength of being an example of how to address troubling territory successfully. I expect most readers didn't even consider that this was political territory. Even though it is a theory primer, it manages to be practical and actionable. Observe how the very method of scoring posts for the review, quadratic voting, is one that is discussed in the post. Practical implications for the management of the community weigh heavily in my consideration of what should be considered important conversation within the community. Carrying on from that
#18

Eliezer explores a dichotomy between "thinking in toolboxes" and "thinking in laws". 
Toolbox thinkers are oriented around a "big bag of tools that you adapt to your circumstances." Law thinkers are oriented around universal laws, which might or might not be useful tools, but which help us model the world and scope out problem-spaces. There seems to be confusion when toolbox and law thinkers talk to each other.

#19

Often you can compare your own Fermi estimates with those of other people, and that’s sort of cool, but what’s way more interesting is when they share what variables and models they used to get to the estimate. This lets you actually update your model in a deeper way.

0Jameson Quinn
This is the second time I've seen this. Now it seems obvious. I remember liking it the first time, but also remember it being obvious. That second part of the memory is probably false. I think it's likely that this explained the idea so well that I now think it's obvious. In other words: very well done.
#20

Alex Zhu spent quite a while understanding Paul's Iterated Amplification and Distillation agenda. He's written an in-depth FAQ, covering key concepts like amplification, distillation, corrigibility, and how the approach aims to create safe and capable AI assistants.

#21

A hand-drawn presentation on the idea of an 'Untrollable Mathematician' - a mathematical agent that can't be manipulated into believing false things. 

#22

Scott Alexander reviews and expands on Paul Graham's "hierarchy of disagreement" to create a broader and more detailed taxonomy of argument types, from the most productive to the least. He discusses the difficulty and importance of avoiding lower levels of argument, and the value of seeking "high-level generators of disagreement" even when they don't lead to agreement. 

0Scott Alexander
I still generally endorse this post, though I agree with everyone else's caveats that many arguments aren't like this. The biggest change is that I feel like I have a slightly better understanding of "high-level generators of disagreement" now, as differences in priors, contexts, and categorizations - see my post "Mental Mountains" for more.
#23

A collection of examples of AI systems "gaming" their specifications - finding ways to achieve their stated objectives that don't actually solve the intended problem. These illustrate the challenge of properly specifying goals for AI systems.

12Victoria Krakovna
I've been pleasantly surprised by how much this resource has caught on in terms of people using it and referring to it (definitely more than I expected when I made it). There were 30 examples on the list when it was posted in April 2018, and 20 new examples have been contributed through the form since then. I think the list has several properties that contributed to wide adoption: it's fun, standardized, up-to-date, comprehensive, and collaborative. Some of the appeal is that it's fun to read about AI cheating at tasks in unexpected ways (I've seen a lot of people post on Twitter about their favorite examples from the list). The standardized spreadsheet format seems easier to refer to as well. I think the crowdsourcing aspect is also helpful - this helps keep it current and comprehensive, and people can feel some ownership of the list since they can personally contribute to it. My overall takeaway from this is that safety outreach tools are more likely to be impactful if they are fun and easy for people to engage with. This list had a surprising amount of impact relative to how little work it took me to put it together and maintain it. The hard work of finding and summarizing the examples was done by the people putting together the lists that the master list draws on (Gwern, Lehman, Olsson, Irpan, and others), as well as the people who submit examples through the form. What I do is put them together in a common format and clarify and/or shorten some of the summaries. I also curate the examples to determine whether they fit the definition of specification gaming (as opposed to simply a surprising behavior or solution). Overall, I've probably spent around 10 hours so far on creating and maintaining the list, which is not very much. This makes me wonder if there is other low hanging fruit in the safety resources space that we haven't picked yet. I have been using it both as an outreach and research tool. On the outreach side, the resource has been helpful for making the arg
#24

There are problems with the obvious-seeming "wizard's code of honesty" aka "never say things that are false". Sometimes, even exceptionally honest people lie (such as when hiding fugitives from an unjust regime). If "never lie" is unworkable as an absolute rule, what code of conduct should highly honest people aspire to? 

10Ben Pace
Here are my thoughts. 1. Being honest is hard, and there are many difficult and surprising edge-cases, including things like context failures, negotiating with powerful institutions, politicised narratives, and compute limitations. 2. On top of the rule of trying very hard to be honest, Eliezer's post offers an additional general rule for navigating the edge cases. The rule is that when you’re having a general conversation all about the sorts of situations you would and wouldn’t lie, you must be absolutely honest. You can explicitly not answer certain questions if it seems necessary, but you must never lie. 3. I think this rule is a good extension of the general principle of honesty, and appreciate Eliezer's theoretical arguments for why this rule is necessary. 4. Eliezer’s post introduces some new terminology for discussions of honesty - in particular, the term 'meta-honesty' as the rule instead of 'honesty'. 5. If the term 'meta-honesty' is common knowledge but the implementation details aren't, and if people try to use it, then they will perceive a large number of norm violations that are actually linguistic confusions. Linguistic confusions are not strongly negative in most fields, merely a nuisance, but in discussions of norm-violation (e.g. a court of law) they have grave consequences, and you shouldn't try to build communal norms on such shaky foundations. 6. I and many other people this post was directed at, find it requires multiple readings to understand, so I think that if everyone reads this post, it will not be remotely sufficient for making the implementation details common knowledge, even if the term can become that. 7. In general, I think that everyone should make sure it is acceptable, when asking "Can we operate under the norms of meta-honesty?" for the other person to reply "I'd like to taboo the term 'meta-honesty', because I'm not sure we'll be talking about the same thing if we use that term." 8. This is a valuable bedrock for thinking
0Zack M. Davis
Reply: "Firming Up Not-Lying Around Its Edge-Cases Is Less Broadly Useful Than One Might Initially Think"
#25

Frustrated by claims that "enlightenment" and similar meditative/introspective practices can't be explained and that you only understand if you experience them, Kaj set out to write his own detailed gears-level, non-mysterious, non-"woo" explanation of how meditation, etc., work in the same way you might explain the operation of an internal combustion engine.

11DanielFilan
As far as I can tell, this post successfully communicates a cluster of claims relating to "Looking, insight meditation, and enlightenment". It's written in a quite readable style that uses a minimum of metaphorical language or Buddhist jargon. That being said, likely due to its focus as exposition and not persuasion, it contains and relies on several claims that are not supported in the text, such as: * Many forms of meditation successfully train cognitive defusion. * Meditation trains the ability to have true insights into the mental causes of mental processes. * "Usually, most of us are - on some implicit level - operating off a belief that we need to experience pleasant feelings and need to avoid experiencing unpleasant feelings." * Flinching away from thoughts of painful experiences is what causes suffering, not the thoughts of painful experiences themselves, nor the actual painful experiences. * Impermanence, unsatisfactoriness, and no-self are fundamental aspects of existence that "deep parts of our minds" are wrong about. I think that all of these are worth doubting without further evidence, and I think that some of them are in fact wrong. If this post were coupled with others that substantiated the models that it explains, I think that that would be worthy of inclusion in a 'Best of LW 2018' collection. However, my tentative guess is that Buddhist psychology is not an important enough set of claims that a clear explanation of it deserves to be signal-boosted in such a collection. That being said, I could see myself being wrong about that.
0Kaj Sotala
I still broadly agree with everything that I said in this post. I do feel that it is a little imprecise, in that I now have much more detailed and gears-y models for many of its claims. However, elaborating on those would require an entirely new post (one which I am currently working on) with a sequence's worth of prerequisites. So if I were to edit this post, I would probably mostly leave it as it is, but include a pointer to the new post once it's finished. In terms of this post being included in a book, it is worth noting that the post situates itself in the context of Valentine's Kensho post, which has not been nominated for the review and thus wouldn't be included in the book. So if this post were to be included, I should probably edit this so as to not require reading Kensho.
#26

Some people claim that aesthetics don't mean anything, and are resistant to the idea that they could.  After all, aesthetic preferences are very individual. 

Sarah argues that the skeptics have a point, but they're too epistemically conservative. Colors don't have intrinsic meanings, but they do have shared connotations within a culture. There's obviously some signal being carried through aesthetic choices.

6Zvi
This post kills me. Lots of great stuff, and I think this strongly makes the cut. Sarah has great insights into what is going on, then turns away from them right when following through would be most valuable. The post is explaining why she and an entire culture are being defrauded by aesthetics. That is, it is used to justify all sorts of things, including high prices and what is cool, based on things that have no underlying value. How it contains lots of hostile subliminal messages that are driving her crazy. It's very clear. And then she... doesn't see the fnords. So close!
#27

A book review examining Elinor Ostrom's "Governance of the Commons", in light of Eliezer Yudkowsky's "Inadequate Equilibria." Are successful local institutions for governing common pool resources possible without government intervention? Under what circumstances can such institutions emerge spontaneously to solve coordination problems?

6Martin Sustrik
Author here. I still believe this article is an important addition to the discussion of inadequate equilibria. While Scott Alexander's Moloch post and Eliezer Yudkowsky's book are great for introduction and discussion of the topic, both of them fail, in my opinion, to convey the sheer complexity of the problem as it occurs in the real world. That, I think, results in readers thinking about the issue in simple Malthusian or naive game-theoretic terms and eventually despairing about the inescapability of suboptimal Nash equilibria. What I try to present is a world that is much more complex but also much less hopeless. Everything is an intricate mess of games played on different levels and interacting in complex and unpredictable ways. What, at first glance, looks like a simple tragedy-of-the-commons problem is in fact a complex dynamic system with many inputs and many intertwined interests. To solve it, one may just have to step back a bit and consider other forces and mechanisms at play. One idea that is expressed in the article and that I often come back to is (my wording, but the idea is very much implicitly present in Ostrom's book): Another one that still feels important in hindsight is the attaching of a price tag to a coordination failure ("this can be solved for $1M"), which turns the semi-mystical work of Moloch into a boring old infrastructure project, very much like building a dam. This may have implications for Effective Altruism. Solving a coordination failure may often be the most efficient way to spend money in a specific area.
7Vanessa Kosoy
This essay provides some fascinating case studies and insights about coordination problems and their solutions, from a book by Elinor Ostrom. Coordination problems are a major theme in LessWrongian thinking (for good reasons) and the essay is a valuable addition to the discussion. I especially liked the 8 features of sustainable governance systems (although I wish we got a little more explanation for "nested enterprises"). However, I think that the dichotomy between "absolutism (bad)" and "organically grown institutions (good)" that the essay creates needs more nuance or more explanation. What is the difference between "organic" and "inorganic" institutions? All institutions "grew" somehow. The relevant questions are e.g. how democratic is the institution, whether the scope of the institution is the right scope for this problem, whether the stakeholders have skin in the game (feature 3) et cetera. The 8 features address some of that, but I wish it was more explicit. Also, it's notable that all examples focus on relatively small scale problems. While it makes perfect sense to start by studying small problems before trying to understand the big problems, it does make me wonder whether going to higher scales brings in qualitatively new issues and difficulties. Paying officials with parcels at the tail end works for water conflicts, but what is the analogous approach to global warming or multinational arms races?
#28

You can learn to spot when something is hijacking your motivation, and take back control of your goals.

9Michael Smith
I thought I'd add a few quick notes as the author. As I reread this, a few things jump out for me: * I enjoy its writing style. Its clarity is probably part of why it was nominated. * I'd now say this post is making a couple of distinct claims: * External forces can shape what we want to do. (I.e., there are lotuses.) * It's possible to notice this in real time. (I.e., you can notice the taste of lotuses.) * It's good to do so. Otherwise we find our wanting aligned with others' goals regardless of how they relate to our own. * If you notice this, you'll find yourself wanting to spit out lotuses that you can tell pull you away from your goals. * I still basically agree with the content. * I think the emotional undertone is a little confused, says the version of me about 19 months later. That last point is probably the most interesting to meta-reviewers, so I'll say a little about that here. The basic emotional backdrop I brought in writing this was something like, "Look out, you could get hijacked! Better watch out!" And then luckily there's this thing you can be aware of, to defend yourself against one more form of psychic/emotional attack. Right? I think this is kind of nuts. It's a popular form of nuts, but it's still nuts. Looking at the Duolingo example I gave, it doesn't address the question of why those achievements counted as a lotus structure for me. There are tons of things others find have lotus nature that I don't (e.g., gambling). And vice versa: my mother (who's an avid Duolingo user) couldn't care less about those achievements. So what gives? I have a guess, but I think that's outside the purview of the purpose of these reviews. I'll just note that "We're in a worldwide memetic war zone where everyone is out to get us by hijacking our minds!" is (a) not the hypothesis to default to and (b) if true is itself a questionable meme that seems engineered to stimulate fight-or-flight type reactions that do, indeed, hijack clarity
2Scott Alexander
I was surprised that this post ever seemed surprising, which either means it wasn't revolutionary, or was *very* revolutionary. Since it has 229 karma, seems like it was the latter. I feel like the same post today would have been written with more explicit references to reinforcement learning, reward, addiction, and dopamine. The overall thesis seems to be that you can get a felt sense for these things, which would be surprising - isn't it the same kind of reward-seeking all the way down, including on things that are genuinely valuable? Not sure how to model this.
#29

You've probably heard about the "tit-for-tat" strategy in the iterated prisoner's dilemma. But have you heard of the Pavlov strategy? This simple strategy performs surprisingly well in certain conditions. Why don't we talk about the Pavlov strategy as much as the Tit-for-Tat strategy?
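
For concreteness, here is a minimal sketch of the two strategies in an iterated prisoner's dilemma. The payoff values are the standard textbook ones (T=5, R=3, P=1, S=0) and the implementation details are assumptions for illustration, not taken from the post.

```python
# Tit-for-Tat vs. Pavlov (win-stay, lose-shift) in an iterated prisoner's dilemma.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not their_history else their_history[-1]

def pavlov(my_history, their_history):
    """Win-stay, lose-shift: repeat my last move after a good payoff (3 or 5),
    otherwise switch to the other move."""
    if not my_history:
        return "C"
    my_last, their_last = my_history[-1], their_history[-1]
    won = PAYOFF[(my_last, their_last)][0] >= 3
    return my_last if won else ("D" if my_last == "C" else "C")

def play(strategy_a, strategy_b, rounds=10):
    hist_a, hist_b = [], []
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        hist_a.append(a)
        hist_b.append(b)
    return hist_a, hist_b

print(play(pavlov, tit_for_tat))  # both cooperate throughout
```

One notable difference: after a single accidental defection, two Tit-for-Tat players get stuck echoing retaliation back and forth, while two Pavlov players re-coordinate on cooperation within a couple of rounds.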

#30

By default, humans are a kludgy bundle of impulses. But we have the ability to reflect upon our decision making, and the implications thereof, and derive better overall policies. You might want to become a more robust, coherent agent – in particular if you're operating in an unfamiliar domain, where common wisdom can't guide you.

0Raymond Arnold
Author here. I still endorse the post and have continued to find it pretty central to how I think about myself and nearby ecosystems. I just submitted some major edits to the post. Changes include: 1. Name change ("Robust, Coherent Agent") After much hemming and hawing and arguing, I changed the name from "Being a Robust Agent" to "Being a Robust, Coherent Agent." I'm not sure if this was the right call. It was hard to pin down exactly one "quality" that the post was aiming at. Coherence was the single word that pointed towards "what sort of agent to become." But I think "robustness" still points most clearly towards why you'd want to change. I added some clarifying remarks about that. In individual sentences I tend to refer to either "Robust Agents" or "Coherent agents" depending on what that sentence was talking about. Other options include "Reflective Agent" or "Deliberate Agent." (I think once you deliberate on what sort of agent you want to be, you often become more coherent and robust, although not necessarily.) Edit: Undid the name change; seemed like it was just a worse title. 2. Spelling out what exactly the strategy entails Originally the post was vaguely gesturing at an idea. It seemed good to try to pin that idea down more clearly. This does mean that, by getting "more specific" it might also be more "wrong." I've run the new draft by a few people and I'm fairly happy with the new breakdown: * Deliberate Agency * Gears Level Understanding of Yourself * Coherence and Consistency * Game Theoretic Soundness But, if people think that's carving the concept at the wrong joints, let me know. 3. "Why is this important?" Zvi's review noted that the post didn't really argue why becoming a robust agent was so important. Originally, I viewed the post as simply illustrating an idea rather than arguing for it, and... maybe that was fine. I think it would have been fine to "why" that for a followup post. But I reflected a bit on why it seemed importan
4Zvi
As you would expect from someone who was one of the inspirations for the post, I strongly approve of the insight/advice contained herein. I also agree with the previous review that there is not a known better write-up of this concept. I like that this gets the thing out there compactly. Where I am disappointed is that this does not feel like it gets across the motivation behind it or why it is so important - I neither read this and think 'yes, that explains why I care about this so much,' nor 'I expect this would move the needle much on people's robustness as agents going forward if they read this.' So I guess the takeaway for me looking back is: good first attempt, and I wouldn't mind including it in the final list, but someone needs to try again? It is worth noting that Jacob made *exactly* the adjustments that I would hope would result from this post if it worked as intended, so perhaps it is better than I give it credit for? Would be curious if anyone else had similar things to report.
#31

Here’s a pattern I’d like to be able to talk about. It might be known under a certain name somewhere, but if it is, I don’t know it. I call it a Spaghetti Tower. It shows up in large complex systems that are built haphazardly.

8Georgia Ray
A brief authorial take - I think this post has aged well, although as with Caring Less (https://www.lesswrong.com/posts/dPLSxceMtnQN2mCxL/caring-less), this was an abstract piece and I didn't make any particular claims here.

I'm so glad that A) this was popular, and B) I wasn't making up a new word for a concept that most people already know by a different name, which I think will send you to at least the first layer of Discourse Hell on its own. I've met at least one person in the community who said they knew and thought about this post a lot, well before they'd met me, which was cool.

I think this website doesn't recognize the value of bad hand-drawn graphics for communicating abstract concepts (except for Garrabrant and assorted other AI safety people, whose posts are too technical for me to read but whom I support wholly). I'm guessing that the graphics helped this piece, or at least got more people to look at it.

I do wish I'd included more examples of spaghetti towers, but I knew that before posting it, and this was an instance of "getting something out is better than making it perfect." I've planned on doing followups in the same sort of abstract style as this piece, like methods I've run into for getting around spaghetti towers (modularization, swailing, documentation). I hopefully will do that some day. If anyone wants to help brainstorm examples, hit me up and I may or may not get back to you.
#32

What if our universe's resources are just a drop in the bucket compared to what's out there? We might be able to influence or escape to much larger universes that are simulating us or can otherwise be controlled by us. This could be a source of vastly more potential value than just using the resources in our own universe. 

#33

People who helped Jews during WWII are intriguing. They appear to be some kind of moral supermen: they had almost nothing to gain and everything to lose. How did they differ from the general population? Can we do anything to get more such people today?

4Jameson Quinn
I think we should encourage posts which are well-delimited and research-based; "here's a question I had, and how I answered it in a finite amount of time" rather than "here's something I've been thinking about for a long time, and here's where I've gotten with it." Also, this is an engaging topic and it's well written. I feel the "final thoughts" section could be tightened up/shortened, as to me it's not the heart of the piece.
#34

Can the smallest boolean circuit that solves a problem be a "daemon" (a consequentialist system with its own goals)? Paul Christiano suspects not, but isn't sure. He thinks this question, while not necessarily directly important, may yield useful insights for AI alignment.

#35

A tradition of knowledge is a body of knowledge that has been consecutively and successfully worked on by multiple generations of scholars or practitioners. This post explores the difference between living traditions (with all the necessary pieces to preserve and build knowledge), and dead traditions (where crucial context has been lost).

#36

It might be that some elements of human intelligence (at least at the civilizational level) are culturally/memetically transmitted. All well and good in theory. Except that the social hypercompetition between people and the intense selection pressure on ideas online might be eroding our world's intelligence. Eliezer wonders if he's only who he is because he grew up reading old science fiction from before the current era's memes.

3Raymond Arnold
This is a first-pass review that's just sort of organizing my thinking about this post. This post makes a few different types of claims:

* Hyperselected memes may be worse (generally) than weakly selected ones.
* Hyperselected memes may specifically be damaging our intelligence / social memetic software.
* People today are worse at negotiating complex conflicts from different filter bubbles.
* There's a particular set of memes (well represented in 1950s sci-fi) that was particularly important, and which are not as common nowadays.

It has a question which is listed although not focused on too explicitly on its own terms:

* What do you do if you want to have good ideas? (i.e. "drop out of college? read 1950s sci-fi in your formative years?")

It prompts me to separately consider the questions:

* What actually is the internet doing to us? It's surely doing something.
* What sorts of cultures are valuable? What sorts of cultures can be stably maintained? What sorts of cultures cause good intellectual development?

Re: the specific claim of "hypercompetition is destroying things", I think the situation is complicated by the "precambrian explosion" of stuff going on right now. Pop music is defeating classical music in relative terms, but, like, in absolute terms there's still a lot more classical music now than in 1400 [citation needed?]. I'd guess this is also true for tribal FB comments vs letter-to-the-editor-type writings.

* [claim by me] Absolute amounts of thoughtful discourse are probably still increasing.

My guess is that "listens carefully to arguments" has just always been rare, and that people have generally been dismissive of the outgroup, and now that's just more prominent. I'd also guess that there's more 1950s-style sci-fi today than in 1950. But it might not be, say, driving national projects that required a critical mass of it. (And it might or might not be appearing on bestseller lists?) If so, the question is less "are things being destroyed…"
#37

What causes some people to develop extensive frameworks of ideas rather than remain primarily consumers of ideas? There is something incomplete about my model of people doing this vs not doing this. I expect more people to have more ideas than they do.

A question post, which received many thoughtful answers.

#38

One of the biggest intuitive mysteries to me is how humanity took so long to do anything. Humans have been 'behaviorally modern' for about 50 thousand years, and yet apparently didn't invent, for instance, rope until 28 thousand years ago. Why did everything take so long?

#39

Eliezer Yudkowsky offers detailed critiques of Paul Christiano's AI alignment proposal, arguing that it faces major technical challenges and may not work without already having an aligned superintelligence. Christiano acknowledges the difficulties but believes they are solvable.

#40

In the 2012 LessWrong survey, it turned out LessWrongers were 22 percentage points more likely than chance would predict to be first-born children. Later, a MIRI researcher wondered off-handedly if great mathematicians (who plausibly share some important features with LessWrongers) also exhibit this same trend towards being firstborn.

The short answer: Yes, they do, as near as I can tell, but not as strongly as LessWrongers.

0Bucky
I was going to write a longer review, but I realised that Ben's curation notice actually explains the strengths of this post very well, so you should read that!

In terms of including this in the 2018 Review, I think it depends on what the review is for. If the review is primarily for the purpose of building common knowledge within the community, then including this post maybe isn't worth it, as it is already fairly well known, having been linked from SSC. On the other hand, if the review process is at least partly for, as Raemon put it, "I want LessWrong to encourage extremely high quality intellectual labor," then this post feels like an extremely strong candidate.

(Personal footnote: This post was essentially what converted me from a LessWrong lurker to a regular commenter/contributor - I think it was mainly just being impressed with how thorough it was, and thinking that's the kind of community I'd like to get involved with.)
#42

Analyzing Nobel Laureates in Physics, there's a statistically significant birth order effect: they're 10 percentage points more likely to be firstborn than chance would predict. This effect is smaller than that seen in the rationalist community (22 points) or among historical mathematicians (16.7 points), but still interesting.
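To make the kind of comparison described above concrete, here is a minimal sketch of the underlying significance test. All the numbers below are hypothetical placeholders, not the post's data: the expected firstborn rate has to be computed from each sample's family sizes (a laureate from a two-child family is firstborn by chance half the time, from a three-child family a third of the time, and so on).

```python
from math import comb

# Hypothetical numbers for illustration only -- not the post's actual data.
n = 100                 # assumed number of laureates with known birth order
expected_rate = 0.30    # assumed chance rate implied by their family sizes
observed = 40           # assumed firstborn count (a 10-point excess)

def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): how often chance alone gives at least k firstborns."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

print(f"Observed firstborn rate: {observed / n:.2f} (expected {expected_rate:.2f})")
print(f"One-sided p-value: {binomial_upper_tail(observed, n, expected_rate):.4f}")
```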

0Bucky
This is a review of my own post.

The first thing to say is that for the 2018 Review, Eli's mathematicians post should take precedence, because it was he who took up the challenge in the first place and inspired my post. I hope to find time to write a review of his post. If people were interested (and Eli was ok with it) I would be happy to write a short summary of my findings to add as a footnote to Eli's post if it were chosen for the review.

***

This was my first post on LessWrong, and looking back at it I think it still holds up fairly well. There are a couple of things I would change if I were doing it again:

* Put less time into the sons-vs-daughters question. I think this section could have two thirds of it chopped out without losing much.
* Unnamed's comment is really important in pointing out a mistake I was making in my final paragraph.
* I might have tried to analyse whether it is a firstborn thing vs an earlyborn thing. In the SSC data it is strongly a firstborn thing, and if I combined Eli's and my datasets I might be able to confirm whether this is also the case in our datasets. I'm not sure this would provide a decisive answer, as our sample size is much smaller even when combining the sets.