Most (but not all) of my writings are published here on Less Wrong. You can find links to everything I write by visiting my blog. You can subscribe to my posts via RSS.

[Book Review] "The Alignment Problem" by Brian Christian

Much of the dialogue about AI Safety I encounter in off-the-record conversations seems to me like it's not grounded in reality. I repeatedly hear (what I feel to be) a set of shaky arguments that both shut down conversation and are difficult to validate empirically.

The shaky argument is as follows:

  1. Machine learning is rapidly growing more powerful. If trends continue it will soon eclipse human performance.
  2. Machine learning equals artificial intelligence equals world optimizer.
  3. World optimizers can easily turn the universe into paperclips by accident.
  4. Therefore we need to halt machine learning advancement until the abstract philosophical + mathematical puzzle of AI alignment is solved.

I am not saying this line of reasoning is what AI researchers believe or that it's mainstream (among the rationality/alignment communities)―or even that it's wrong. The argument annoys me for the same reason a popular-yet-incoherent political platform annoys me; I have encountered badly-argued versions of the idea too many times.

I agree with #1, though I quibble that "absolute power" should be distinguished from "sample efficiency", and that it matters how we'll get to superintelligence. (I am bearish on applying the scaling hypothesis to existing architectures.) I agree with #3 in theory, but theory is often very different from practice. I disagree with #2 because it relies on the tautological equivalence of two definitions: I can imagine superintelligent machines that aren't world optimizers. Without #2 the argument falls apart. It might be easy to build a superintelligence but hard to build a world optimizer.

I approached The Alignment Problem with the (incorrect) prior that it would be more of the same vague, abstract arguments untethered from technical reality. Instead, the book was dominated by ideas that have passed practical empirical tests.