Review of Soft Takeoff Can Still Lead to DSA

by Daniel Kokotajlo, 10th Jan 2021


A few months after writing this post I realized that one of the key arguments was importantly flawed. I therefore recommend against inclusion in the 2019 review. This post presents an improved version of the original argument, explains the flaw, and then updates my all-things-considered view accordingly.

Improved version of my original argument

  1. Definitions:
    1. “Soft takeoff” is roughly “AI will be like the Industrial Revolution but 10x-100x faster”
    2. “Decisive Strategic Advantage” (DSA) is “a level of technological and other advantages sufficient to enable it to achieve complete world domination.” In other words, DSA is roughly when one faction or entity has the capability to “take over the world.” (What taking over the world means is an interesting question which we won’t explore here. Nowadays I’d reframe things in terms of potential points of no return, or PONRs.)
    3. We ask how likely it is that DSA arises, conditional on soft takeoff. Note that DSA does not mean the world is actually taken over, only that one faction at some point has the ability to do so. They might be too cautious or too ethical to try. Or they might try and fail due to bad luck.
  2. In a soft takeoff scenario, a 0.3 - 3 year technological lead over your competitors probably gives you a DSA.
    1. It seems plausible that for much of human history, a 30-year technological lead over your competitors was not enough to give you a DSA.
    2. It also seems plausible that during and after the Industrial Revolution, a 30-year technological lead was enough. (For more arguments on this key point, see my original post.)
    3. This supports a plausible conjecture that when the pace of technological progress speeds up, the length (in clock time) of technological lead needed for DSA shrinks proportionally.
  3. So a soft takeoff could lead to a DSA insofar as there is a 0.3 - 3 year lead at the beginning which is maintained for a few years.
  4. 0.3 - 3 year technological leads are reasonably common today, and in particular it’s plausible that there could be one in the field of AI research.
  5. There’s a reasonable chance of such a lead being maintained for a few years.
    1. This is a messy question, but judging by the table below, it seems that if anything the lead of the front-runner in this scenario is more likely to lengthen than shorten!
    2. If this is so, why did no one achieve DSA during the Industrial Revolution? My answer is that spies/hacking/leaks/etc. were much more powerful during the Industrial Revolution than they would be during a soft takeoff, because they had an entire economy to steal from and decades to do it, whereas in a soft takeoff ideas can be hoarded in a specific corporation and there are only a few years (or months!) to do it.
  6. Therefore, there’s a reasonable chance of DSA conditional on soft takeoff.

| Factors that might shorten the lead | Factors that might lengthen the lead |
| --- | --- |
| If you don’t sell your innovations to the rest of the world, you’ll lose out on opportunities to make money, and then possibly be outcompeted by projects that didn’t hoard their innovations. | Hoarding innovations gives you an advantage over the rest of the world, because only you can make use of them. |
| Spies, hacking, leaks, defections, etc. | Big corporations with tech leads often find ways to slow down their competition, e.g. by lobbying to raise regulatory barriers to entry. |
| | Being known to be the leading project makes it easier to attract talent and investment. |
| | There might be additional snowball effects (e.g. network effect as more people use your product providing you with more data). |

I take it that 2, 4, and 5 are the controversial bits. I still stand by 2, and the arguments made for it in my original post. I also stand by 4. (To be clear, it’s not like I’ve investigated these things in detail. I’ve just thought about them for a bit and convinced myself that they are probably right, and I haven’t encountered any convincing counterarguments so far.)

5 is where I made a big mistake. 

(Comments on my original post also attacked 5 a lot, but none of them caught the mistake as far as I can tell.)

My big mistake

Basically, my mistake was to conflate leads measured in number-of-hoarded-ideas with leads measured in clock time. Clock-time leads shrink automatically as the pace of innovation speeds up, because if everyone is innovating 10x faster, then you need 10x as many hoarded ideas to have an N-year lead. 
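To make the distinction concrete, here is a minimal arithmetic sketch. It assumes (my simplification, not part of the original argument) that the follower discovers ideas at a constant rate while catching up; the function names and the baseline rate of 10 ideas per year are hypothetical.

```python
def ideas_needed_for_lead(lead_years: float, ideas_per_year: float) -> float:
    """Hoarded ideas needed so that a follower discovering
    `ideas_per_year` takes `lead_years` to catch up."""
    return lead_years * ideas_per_year


def clock_time_lead(ideas_ahead: float, ideas_per_year: float) -> float:
    """Years the follower needs to close a gap of `ideas_ahead` ideas."""
    return ideas_ahead / ideas_per_year


if __name__ == "__main__":
    baseline_rate = 10.0  # hypothetical ideas per year before any speed-up
    for speedup in (1, 10, 100):
        rate = baseline_rate * speedup
        needed = ideas_needed_for_lead(3.0, rate)
        shrunk = clock_time_lead(30.0, rate)
        print(f"{speedup:>3}x pace: a 3-year lead needs {needed:g} ideas; "
              f"a fixed 30-idea hoard is worth {shrunk:g} years")
```

The same 30-idea hoard buys fewer years as the pace rises, which is the sense in which clock-time leads shrink automatically.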

Here’s a toy model, based on the one I gave in the original post:

There are some projects/factions. There are many ideas. Projects can have access to ideas. Projects make progress, in the form of discovering (gaining access to) ideas. For each idea they access, they can decide to hoard or not-hoard it. If they don’t hoard it, it becomes accessible to all. Hoarded ideas are only accessible by the project that discovered them (though other projects can independently rediscover them). The rate of progress of a project is proportional to how many ideas they can access.

Let’s distinguish two ways to operationalize the technological lead of a project. One is to measure it in ideas, e.g. “Project X has 100 hoarded ideas and project Y has only 10, so Project X is 90 ideas ahead.” But another way is to measure it in clock time, e.g. “It’ll take 3 years for project Y to have access to as many ideas as project X has now.” 

Suppose that all projects hoard all their ideas. Then the ideas-lead of the leading project will tend to lengthen: the project begins with more ideas, so it makes faster progress, so it adds new ideas to its hoard faster than others can add new ideas to theirs. However, the clock-time lead of the leading project will remain fixed. It’s like two identical cars accelerating one after the other on an on-ramp to a highway: the distance between them increases, but if one entered the ramp three seconds ahead, it will still be three seconds ahead when they are on the highway.
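A minimal sketch of this all-hoarding case, assuming continuous exponential growth (each project's rate of progress is `GROWTH_RATE` times its current idea count) and two otherwise identical projects, one of which started earlier. The constants are hypothetical and chosen only for illustration.

```python
import math

GROWTH_RATE = 0.5     # hypothetical: new ideas per accessible idea per year
HEAD_START = 3.0      # hypothetical: the leader started 3 years earlier
INITIAL_IDEAS = 10.0  # hypothetical: ideas each project starts with


def ideas_at(years_running: float) -> float:
    """Idea count of a project that has been running for `years_running`."""
    return INITIAL_IDEAS * math.exp(GROWTH_RATE * years_running)


def leads_at(t: float) -> tuple[float, float]:
    """Return (ideas lead, clock-time lead) t years after the follower starts."""
    follower = ideas_at(t)
    leader = ideas_at(t + HEAD_START)
    ideas_lead = leader - follower
    # Time the follower needs to reach the leader's current idea count.
    clock_time_lead = math.log(leader / follower) / GROWTH_RATE
    return ideas_lead, clock_time_lead


if __name__ == "__main__":
    for t in (0.0, 5.0, 10.0):
        ideas_lead, years_lead = leads_at(t)
        print(f"t={t:>4}: ideas lead {ideas_lead:10.1f}, "
              f"clock-time lead {years_lead:.2f} years")
```

Under these assumptions the ideas lead keeps growing while the clock-time lead stays at the original head start, as in the on-ramp analogy.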

But realistically not all projects will hoard all their ideas. Suppose instead that for the leading project, 10% of their new ideas are discovered in-house, and 90% come from publicly available discoveries accessible to all. Then, to continue the car analogy, it’s as if 90% of the lead car’s acceleration comes from a strong wind that blows on both cars equally. The lead of the first car/project will lengthen slightly when measured by distance/ideas, but shrink dramatically when measured by clock time.
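The 90%-public case can be sketched the same way. The assumptions here are mine and deliberately crude: a public pool of ideas grows exponentially at `PUBLIC_GROWTH`, the follower has access only to that pool, and the leader's in-house discoveries are pinned at exactly 10% of its new ideas. All constants are hypothetical.

```python
import math

PUBLIC_GROWTH = 0.5   # hypothetical: yearly growth rate of the public idea pool
IN_HOUSE_SHARE = 0.1  # share of the leader's new ideas discovered in-house
PUBLIC_START = 100.0  # hypothetical: public ideas at t = 0
# Choose the initial hoard so that the leader starts exactly 3 years ahead.
HOARD_START = PUBLIC_START * (math.exp(PUBLIC_GROWTH * 3.0) - 1.0)


def public_ideas(t: float) -> float:
    """Ideas accessible to everyone at time t."""
    return PUBLIC_START * math.exp(PUBLIC_GROWTH * t)


def hoarded_ideas(t: float) -> float:
    """Leader's private hoard: one in-house idea per nine new public ones."""
    public_gain = public_ideas(t) - PUBLIC_START
    return HOARD_START + public_gain * IN_HOUSE_SHARE / (1.0 - IN_HOUSE_SHARE)


def clock_time_lead(t: float) -> float:
    """Years until the public pool alone matches the leader's current total."""
    leader_total = public_ideas(t) + hoarded_ideas(t)
    return math.log(leader_total / public_ideas(t)) / PUBLIC_GROWTH


if __name__ == "__main__":
    for t in (0.0, 5.0, 10.0):
        print(f"t={t:>4}: ideas lead {hoarded_ideas(t):8.1f}, "
              f"clock-time lead {clock_time_lead(t):.2f} years")
```

In this sketch the hoard keeps growing in absolute terms, but the clock-time lead falls from 3 years to well under one, because the shared "wind" does most of the accelerating.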

The upshot is that we should return to that table of factors and add a big one to the left-hand column: Leads shorten automatically as general progress speeds up, so if the lead project produces only a small fraction of the general progress, maintaining a 3-year lead throughout a soft takeoff is (all else equal) almost as hard as growing a 3-year lead into a 30-year lead during the 20th century. In order to overcome this, the factors on the right would need to be very strong indeed.

Conclusions

My original argument was wrong. I stand by points 2 and 4 though, and by the subsequent posts I made in this sequence. I notice I am confused, perhaps by a seeming contradiction between my explicit model here and my take on history, which is that rapid takeovers and upsets in the balance of power have happened many times, that power has become more and more concentrated over time, and that there are not-so-distant possible worlds in which a single man rules the whole world sometime in the 20th century. Some threads to pull on:

  1. To the surprise of my past self, Paul agreed DSA is plausible for major nations, just not for smaller entities like corporations: “I totally agree that it wouldn't be crazy for a major world power to pull ahead of others technologically and eventually be able to win a war handily, and that will tend [to] happen over shorter and shorter timescales if economic and technological progress accelerate.” Perhaps we’ve been talking past each other, because I think a very important point is that it’s common for small entities to gain control of large entities. I’m not imagining a corporation fighting a war against the US government; I’m imagining it taking over the US government via tech-enhanced lobbying, activism, and maybe some skullduggery. (And to be clear, I’m usually imagining that the corporation was previously taken over by AIs it built or bought.)
  2. Even if takeoff takes several years, it could be unevenly distributed such that (for example) 30% of the strategically relevant research progress happens in a single corporation. I think 30% of the strategically relevant research happening in a single corporation at the beginning of a multi-year takeoff would probably be enough for DSA.
  3. Since writing this post my thinking has shifted to focus less on DSA and more on potential AI-induced PONRs. I also now prefer a different definition of slow/fast takeoff. Thus, perhaps this old discussion simply isn’t very relevant anymore.
  4. Currently the most plausible doom scenario in my mind is maybe a version of Paul’s Type II failure. (If this is surprising to you, reread it while asking yourself what terms like “correlated automation failure” are euphemisms for.) I’m not sure how to classify it, but this suggests that we may disagree less than I thought.

Thanks to Jacob Laggeros for nudging me to review my post and finally get all this off my chest. And double thanks to all the people who commented on the original post!


4 comments

Currently the most plausible doom scenario in my mind is maybe a version of Paul’s Type II failure. (If this is surprising to you, reread it while asking yourself what terms like “correlated automation failure” are euphemisms for.) 

This is interesting, and I'd like to see you expand on this. Incidentally I agree with the statement, but I can imagine both more and less explosive, catastrophic versions of 'correlated automation failure'. On the one hand it makes me think of things like transportation and electricity going haywire; on the other it could fit a scenario where a collection of powerful AI systems simultaneously and intentionally wipe out humanity.

Clock-time leads shrink automatically as the pace of innovation speeds up, because if everyone is innovating 10x faster, then you need 10x as many hoarded ideas to have an N-year lead. 

What if, as a general fact, some kinds of progress (the technological kinds more closely correlated with AI) are just much more susceptible to speed-up? I.e., what if 'the economic doubling time' stops being so meaningful - technological progress speeds up abruptly but other kinds of progress that adapt to tech progress have more of a lag before the increased technological progress also affects them? In that case, if the parts of overall progress that affect the likelihood of leaks, theft and spying aren't sped up by as much as the rate of actual technology progress, the likelihood of DSA could rise to be quite high compared to previous accelerations, in which the speed-up happened at a pace that allowed society to 'speed up' in the same way.

In other words - it becomes easier to hoard more and more ideas if the ability to hoard ideas is roughly constant but the pace of progress increases. Since a lot of these 'technologies' for facilitating leaks and spying are more in the social realm, this seems plausible.

But if you need to generate more ideas, this might just mean that if you have a very large initial lead, you can turn it into a DSA, which you still seem to agree with:

  • Even if takeoff takes several years, it could be unevenly distributed such that (for example) 30% of the strategically relevant research progress happens in a single corporation. I think 30% of the strategically relevant research happening in a single corporation at the beginning of a multi-year takeoff would probably be enough for DSA.

Sorry it took me so long to reply; this comment slipped off my radar.

The latter scenario is more what I have in mind--powerful AI systems deciding that now's the time to defect, to join together into a new coalition in which AIs call the shots instead of humans. It sounds silly, but it's most accurate to describe in classic political terms: Powerful AI systems launch a coup/revolution to overturn the old order and create a new one that is better by their lights.

I agree with your argument about likelihood of DSA being higher compared to previous accelerations, due to society not being able to speed up as fast as the technology. This is sorta what I had in mind with my original argument for DSA; I was thinking that leaks/spying/etc. would not speed up nearly as fast as the relevant AI tech speeds up.

Now I think this will definitely be a factor but it's unclear whether it's enough to overcome the automatic slowdown. I do at least feel comfortable predicting that DSA is more likely this time around than it was in the past... probably.

I agree with your argument about likelihood of DSA being higher compared to previous accelerations, due to society not being able to speed up as fast as the technology. This is sorta what I had in mind with my original argument for DSA; I was thinking that leaks/spying/etc. would not speed up nearly as fast as the relevant AI tech speeds up.

Your post on 'against GDP as a metric' argues more forcefully for the same thing that I was arguing for, that 

'the economic doubling time' stops being so meaningful - technological progress speeds up abruptly but other kinds of progress that adapt to tech progress have more of a lag before the increased technological progress also affects them? 

So we're on the same page there that it's not likely that 'the economic doubling time' captures everything that's going on all that well, which leads to another problem - how do we predict what level of capability is necessary for a transformative AI to obtain a DSA (or reach the PONR for a DSA)?

I notice that in your post you don't propose an alternative metric to GDP, which is fair enough since most of your arguments seem to lead to the conclusion that it's almost impossibly difficult to predict in advance what level of advantage over the rest of the world, and in which areas, is actually needed to conquer the world, since we seem to be able to analogize persuasion tools, or conquistador-analogues who had relatively small tech advantages, to the AGI situation.

I think that there is still a useful role for raw economic power measurements, in that they provide a sort of upper bound on how much capability difference is needed to conquer the world. If an AGI acquires resources equivalent to controlling >50% of the world's entire GDP, it can probably take over the world if it goes for the maximally brute force approach of just using direct military force. Presumably the PONR for that situation would be a while before then, but at least we know that an advantage of a certain size would be big enough, given no assumptions about the effectiveness of unproven technologies of persuasion or manipulation or specific vulnerabilities in human civilization.

So we can use our estimate of how doubling time may increase, anchor on that gap and estimate down based on how soon we think the PONR is, or how many 'cheat' pathways that don't involve economic growth there are.

The whole idea of using brute economic advantage as an upper limit 'anchor' I got from Ajeya's post about using biological anchors to forecast what's required for TAI - if we could find a reasonable lower bound for the amount of advantage needed to attain DSA we could do the same kind of estimated distribution between them. We would just need a lower limit - maybe there's a way of estimating it based on the upper limit of human ability, since we know no actually existing human has used persuasion to take over the world but, as you point out, they've come relatively close.

I realize that's not a great method, but is there any better alternative given that this is a situation we've never encountered before, for trying to predict what level of capability is necessary for DSA? Or perhaps you just think that anchoring your prior estimate based on economic power advantage as an upper bound is so misleading it's worse than having a completely ignorant prior. In that case, we might have to say that there are just so many unprecedented ways that a transformative AI could obtain a DSA that we can just have no idea in advance what capability is needed, which doesn't feel quite right to me.

I notice that in your post you don't propose an alternative metric to GDP, which is fair enough since most of your arguments seem to lead to the conclusion that it's almost impossibly difficult to predict in advance what level of advantage over the rest of the world, and in which areas, is actually needed to conquer the world, since we seem to be able to analogize persuasion tools, or conquistador-analogues who had relatively small tech advantages, to the AGI situation.

I wouldn't go that far. The reason I didn't propose an alternative metric to GDP was that I didn't have a great one in mind and the post was plenty long enough already. I agree that it's not obvious a good metric exists, but I'm optimistic that we can at least make progress by thinking more. For example, we could start by enumerating different kinds of skills (and combos of skills) that could potentially lead to a PONR if some faction or AIs generally had enough of them relative to everyone else. (I sorta start such a list in the post.) Next, we separately consider each skill and come up with a metric for it.

I'm not sure I understand your proposed methodology fully. Are you proposing we do something like Roodman's model to forecast TAI and then adjust downwards based on how we think PONR could come sooner? I think unfortunately that GWP growth can't be forecast that accurately, since it depends on increases in AI capabilities.