Forecasting Thread: AI Timelines

Daniel Kokotajlo; Ben Pace

12 Answers sorted by
top scoring

Aug 23, 2020

130

A week ago I recorded a prediction on AI timelines after reading a Vox article on GPT-3. In general I'm much more spread out in time than the LessWrong community. Also, I weigh outside view considerations more heavily than detailed inside view information. For example, a main consideration of my prediction is the heuristic "With 50% probability, things will last twice as long as they already have," with a starting time of 1956, the year of the Dartmouth College summer AI conference.
If AGI will definitely happen eventually, then the heuristic gives [21.3, 64, 192] years at the [25th, 50th, 75th] percentiles for the time until AGI occurs. AGI may never happen, but the chance of that is small enough that adjusting for it here will not make a big difference (I put ~10% on AGI not happening for 500 years or more, but the heuristic's distribution already matches that quite well).
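
The percentiles above can be reproduced in a few lines of Python. This is a minimal sketch, assuming the heuristic means P(remaining time ≤ t) = t / (elapsed + t), with 64 years elapsed between 1956 and 2020. The function names are illustrative and not part of the original prediction.

```python
def remaining_years_quantile(elapsed_years, q):
    """Remaining duration t such that P(remaining <= t) = q.

    Assumes the doubling heuristic: P(remaining <= t) = t / (elapsed + t).
    """
    if not 0 <= q < 1:
        raise ValueError("q must be in [0, 1)")
    return elapsed_years * q / (1 - q)


def prob_remaining_exceeds(elapsed_years, t):
    """P(remaining > t) under the same assumption."""
    return elapsed_years / (elapsed_years + t)


ELAPSED = 2020 - 1956  # years since the Dartmouth conference, as of this post

for q in (0.25, 0.50, 0.75):
    years = remaining_years_quantile(ELAPSED, q)
    print(f"{int(q * 100)}th percentile: {years:.1f} years")

print(f"P(no AGI for 500+ years): {prob_remaining_exceeds(ELAPSED, 500):.3f}")
```

Under this assumption the heuristic by itself puts roughly 11% on AGI taking more than 500 years, which is in line with the ~10% figure mentioned above.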

A more inside view consideration is: what happens if the current machine learning paradigm scales to AGI? Given that assumption, a 50% confidence interval might be [2028, 2045] (since the current burst of machine learning research began in 2012-2013), which is more in line with the LessWrong predictions and the Metaculus community prediction. Taking the super outside view consideration and the outside view-ish consideration together, I get the prediction I made a week ago.

I adapted my prediction to the timeline of this post [1], and compared it with some other commenters' predictions [2].

[-]Ben Pace5y*50

For example, a main consideration of my prediction is the heuristic "With 50% probability, things will last twice as long as they already have," with a starting time of 1956, the year of the Dartmouth College summer AI conference.

A counter hypothesis I’ve heard (not original to me) is: With 50% probability, we will be half-way through the AI researcher-years required to get AGI.

I think this suggests much shorter timelines, as most researchers have been doing research in the last ~10 years.

It's not clear to me what reference class makes sense here though. Like, I feel like 50% doesn’t make any sense. It implies that for all outstanding AI problems we’re fifty percent there. We’re 50% of the way to a rat brain, to a human emulation, to a vastly superintelligent AGI, etc. It’s not a clearly natural category for a field to be “done”, and it’s not clear which thing counts as “done” in this particular field.

[-]gwern5y120

I was looking at the NIPS growth numbers last June and I made a joke:

AI researcher anthropics: 'researchers [should] tend to think AI is ~20 years away because given exponential growth of researchers & careers of ~30 years, the final generation of researchers will make up a majority of all researchers, hence, by SSA+Outside View, one must assume 20 years.'

(Of course, I'm making a rather carbon-chauvinistic assumption here that it's only human researchers/researcher-years which matter.)
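
The researcher-years counter-hypothesis discussed in this sub-thread can be made concrete with a rough sketch. It assumes the number of AI researchers has grown exponentially since 1956 with a fixed doubling time; the 10-year doubling time used below is a hypothetical value chosen only to mirror the "last ~10 years" remark, not a measured figure, and the function names are illustrative.

```python
import math


def years_until_cumulative_doubles(elapsed_years, doubling_time):
    """Years until the total researcher-years spent so far have doubled.

    Assumes headcount grows as 2 ** (t / doubling_time) from the field's start,
    so cumulative researcher-years by time T are proportional to exp(r*T) - 1.
    """
    r = math.log(2) / doubling_time
    # Solve exp(r * (T + x)) - 1 = 2 * (exp(r * T) - 1) for x.
    return math.log(2 * math.exp(r * elapsed_years) - 1) / r - elapsed_years


def share_of_work_in_last(years, elapsed_years, doubling_time):
    """Fraction of all researcher-years so far that fall in the last `years`."""
    r = math.log(2) / doubling_time
    total = math.exp(r * elapsed_years) - 1
    earlier = math.exp(r * (elapsed_years - years)) - 1
    return 1 - earlier / total


ELAPSED = 2020 - 1956
DOUBLING_TIME = 10  # hypothetical; not an estimate from the thread

print(share_of_work_in_last(10, ELAPSED, DOUBLING_TIME))
print(years_until_cumulative_doubles(ELAPSED, DOUBLING_TIME))
```

With these assumed inputs, about half of all researcher-years fall in the last decade, and being "half-way through" would put the median roughly one doubling time away rather than 64 years away.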

Ethan Perez

Aug 22, 2020*

150

Here is my Elicit Snapshot.

I'll follow the definition of AGI given in this Metaculus challenge, which roughly amounts to a single model that can "see, talk, act, and reason." My predicted distribution is a weighted sum of two component distributions described below:

Prosaic AGI (25% probability). Timeline: 2024-2037 (Median: 2029): We develop AGI by scaling and combining existing techniques. The most probable path I can foresee loosely involves 3 stages: (1) developing a language model with human-level language ability, then (2) giving it visual capabilities (e.g., talking about pictures and videos, solving SAT math problems with figures), and then (3) giving it capabilities to act intelligently in the world (e.g., trading stocks or navigating webpages). Below are my timelines for the above stages:
1. Human-level Language Model: 1.5-4.5 years (Median: 2.5 years). We can predictably improve our language models by increasing model size (parameter count), which we can do in the following two ways:
  1. Scaling Language Model Size by 1000x relative to GPT3. 1000x is pretty feasible, but as I understand it, we'll hit difficult hardware/communication bandwidth constraints beyond 1000x.
  2. Increasing Effective Parameter Count by 100x using modeling tricks (Mixture of Experts, Sparse Transformers, etc.)
2. +Visual Capabilities: 2-6 extra years (Median: 4 years). We'll need good representation learning techniques for learning from visual input (which I think we mostly have). We'll also need to combine vision and language models, but there are many existing techniques for combining vision and language models to try here, and they generally work pretty well. A main potential bottleneck time-wise is that the language+vision components will likely need to be pretrained together, which slows the iteration time and reduces the number of research groups that can contribute (especially for learning from video, which is expensive). For reference, Language+Image pretrained models like ViLBERT came out 10 months after BERT did.
3. +Action Capabilities: 0-6 extra years (Median: 2 years). GPT3-style zero-shot or few-shot instruction following is the most feasible/promising approach to me here; this approach could work as soon as we have a strong, pretrained vision+language model. Alternatively, we could use that model within a larger system, e.g. a policy trained with reinforcement learning, but this approach could take a while to get to work.
Breakthrough AGI (75% probability). Timeline: Uniform probability over the next century: We need several fundamental breakthroughs to achieve AGI. Breakthroughs are hard to predict, so I'll assume a uniform distribution over the year in which we hit upon the necessary breakthroughs for any year <2100, with 15% total probability mass after 2100 (a rough estimate). I'm estimating 15% roughly based on a 5% probability that we won't find the right insights by 2100, a 5% probability that we have the right insights but not enough compute by 2100, and a 5% probability to account for planning fallacy, unknown unknowns, and the fact that a number of top AI researchers believe that we are very far from AGI.

My probability for Prosaic AGI is based on an estimated probability of each of the 3 stages of development working (described above):

P(Prosaic AGI) = P(Stage 1) x P(Stage 2) x P(Stage 3) = 3/4 x 2/3 x 1/2 = 1/4
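
As a quick consistency check, the stage estimates above can be combined in Python. This is a sketch under two assumptions that are not stated in the answer: the clock starts in late August 2020 (`START_YEAR`), and the low, median, and high stage durations are simply added. Adding medians is only an approximation to the median of a sum, but it does reproduce the 2024-2037 range and 2029 median quoted for Prosaic AGI.

```python
from fractions import Fraction

# Stage estimates from the answer: (low, median, high) in years, then P(stage works).
STAGES = {
    "human-level language model": (1.5, 2.5, 4.5, Fraction(3, 4)),
    "+visual capabilities": (2.0, 4.0, 6.0, Fraction(2, 3)),
    "+action capabilities": (0.0, 2.0, 6.0, Fraction(1, 2)),
}

START_YEAR = 2020.64  # assumption: counting from the posting date, late August 2020


def prosaic_probability(stages):
    """Product of the per-stage success probabilities."""
    p = Fraction(1)
    for *_, p_stage in stages.values():
        p *= p_stage
    return p


def prosaic_timeline(stages, start=START_YEAR):
    """Calendar years obtained by summing the low, median, and high durations."""
    sums = [sum(stage[i] for stage in stages.values()) for i in range(3)]
    return tuple(int(start + years) for years in sums)


print(prosaic_probability(STAGES))  # 1/4
print(prosaic_timeline(STAGES))     # (2024, 2029, 2037)
```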

------------------

Updates/Clarification after some feedback from Adam Gleave:

- Updated from 5% -> 15% probability that AGI won't happen by 2100 (see reasoning above). I've updated my Elicit snapshot appropriately.
- There are other concrete paths to AGI, but I consider these fairly low probability to work first (<5%) and experimental enough that it's hard to predict when they will work. For example, I can't think of a good way to predict when we'll get AGI from training agents in a simulated, multi-agent environment (e.g., in the style of OpenAI's Emergent Tool Use paper). Thus, I think it's reasonable to group such other paths to AGI into the "Breakthrough AGI" category and model these paths with a uniform distribution.
- I think you can do better than a uniform distribution for the "Breakthrough AGI" category, by incorporating the following information:
  - Breakthroughs will be less frequent as time goes on, as the low-hanging fruit/insights are picked first. Adam suggested an exponential decay over time / Laplacian prior, which sounds reasonable.
  - Growth of AI research community: Estimate the size of the AI research community at various points in time, and estimate the pace of research progress given that community size. It seems reasonable to assume that the pace of progress will increase logarithmically in the size of the research community, but I can also see arguments for why we'd benefit more or less from a larger community (or even have slower progress).
  - Growth of funding/compute for AI research: As AI becomes increasingly monetizable, there will be more incentives for companies and governments to support AI research, e.g., in terms of growing industry labs, offering grants to academic labs to support researchers, and funding compute resources - each of these will speed up AI development.
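
To illustrate how much the exponential-decay suggestion could shift things, here is a rough sketch comparing a uniform density over 2020-2100 with an exponentially decaying one truncated to the same window. The 40-year half-life is a purely hypothetical value picked for illustration; nothing in the discussion above pins it down, and the function names are illustrative.

```python
import math


def uniform_quantile(q, horizon):
    """Years from now at quantile q of a uniform density on [0, horizon]."""
    return q * horizon


def truncated_exponential_quantile(q, horizon, half_life):
    """Years from now at quantile q of an exponentially decaying density on [0, horizon].

    The density is proportional to exp(-rate * t), renormalized so that all of
    its mass lies inside the window.
    """
    rate = math.log(2) / half_life
    mass_inside = 1 - math.exp(-rate * horizon)
    return -math.log(1 - q * mass_inside) / rate


HORIZON = 2100 - 2020
HALF_LIFE = 40  # hypothetical decay half-life in years

print(2020 + uniform_quantile(0.5, HORIZON))
print(2020 + truncated_exponential_quantile(0.5, HORIZON, HALF_LIFE))
```

With these assumed numbers, the median of the pre-2100 breakthrough mass moves from 2060 under the uniform assumption to roughly 2047 under the decaying one.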

Daniel Kokotajlo

Aug 23, 2020*

160

Here is my snapshot. My reasoning is basically similar to Ethan Perez's; it's just that I think that if transformative AI is achievable in the next five orders of magnitude of compute improvement (e.g. prosaic AGI?), it will likely be achieved in the next five years or so. I also am slightly more confident that it is, and slightly less confident that TAI will ever be achieved.

I am aware that my timelines are shorter than most... Either I'm wrong and I'll look foolish, or I'm right and we're doomed. Sucks to be me.
[Edited the snapshot slightly on 8/23/2020]
[Edited to add the following powerpoint slide that gets a bit more at my reasoning]

[-]Daniel Kokotajlo4y40

It's been a year, what do my timelines look like now?

My median has shifted to the left a bit; it's now 2030. However, I think I have somewhat less probability in the 2020-2025 range, because I've become more aware of the difficulties in scaling up compute. You can't just spend more money. You have to do lots of software engineering, and for 4+ OOMs you literally need to build more chip fabs to produce more chips. (Also because 2020 has passed without TAI/AGI/etc., so obviously I won't put as much mass there...)

So if I were to draw a distribution it would look pretty similar, just a bit more extreme of a spike and the tip of the spike might be a bit to the right.

3Michaël Trazzi4y

I have probably missed many considerations you have mentioned elsewhere, but in terms of software engineering, how do you think the "software production rate" for scaling up large models evolved from 2020 to late 2021? I don't see why we couldn't get 4 OOM between 2020 and 2025. If we just take the example of large LMs, we went from essentially 1-10 publicly known models in 2020 to 10-100 in 2021 (cf. China, Korea, Microsoft, DM, etc.), and I expect the number of private models to be even higher, so it makes sense to me that we could have 4 OOM more SWE in that area by 2025. Now, for the chip fabs, I feel like one update from 2020 to 2022 has been NVIDIA & Apple making unexpected hardware advances (A100, M1) and Nvidia stock growing massively, so I would be more optimistic about "build more fabs" than in 2020. Though I'm not an expert in hardware at all, and those two advances I mentioned were maybe not that useful for scaling.

3Daniel Kokotajlo4y

If I understand you correctly, you are asking something like: How many programmer-hours of effort and/or how much money was being spent specifically on scaling up large models in 2020? What about in 2025? Is the latter plausibly 4 OOMs more than the former? (You need some sort of arbitrary cutoff for what counts as large. Let's say GPT-3 sized or bigger.) Yeah maybe, I don't know! I wish I did. It's totally plausible to me that it could be +4 OOMs in this metric by 2025. It's certainly been growing fast, and prior to GPT-3 there may not have been much of it at all.

1Michaël Trazzi4y

Yes, something like: given (programmer-hours-into-scaling(July 2020) - programmer-hours-into-scaling(Jan 2022)), and how much progress there has been on hardware for such training (I don't know the right metric for this, but probably something to do with FLOP and parallelization), the extrapolation to 2025 (either linear or exponential) would give the 4 OOM you mentioned.

[-]Daniel Kokotajlo2y20

Blast from the past!

I'm biased but I'm thinking this "33% by 2026" forecast is looking pretty good.

Sammy Martin

Aug 23, 2020

Here's my answer. I'm pretty uncertain compared to some of the others!

AI Forecast

First, I'm assuming that by AGI we mean an agent-like entity that can do the things associated with general intelligence, including things like planning towards a goal and carrying that out. If we end up in a CAIS-like world where there is some AI service or other that can do most economically useful tasks, but nothing with very broad competence, I count that as never developing AGI.

I've been impressed with GPT-3, and could imagine it or something like it scaling to produce near-human level responses to language prompts in a few years, especially with RL-based extensions.

But, following the list (below) of missing capabilities by Stuart Russell, I still think things like long-term planning would elude GPT-N, so it wouldn't be agentive general intelligence. Even though you might get those behaviours with trivial extensions of GPT-N, I don't think it's very likely.

That's why I think AGI before 2025 is very unlikely (not enough time for anything except scaling up of existing methods). This is also because I tend to expect progress to be continuous, though potentially quite fast, and going from current AI to AGI in less than 5 years requires a very sharp discontinuity.

AGI before 2035 or so happens if systems quite a lot like current deep learning can do the job, but which aren't just trivial extensions of them - this seems reasonable to me on the inside view - e.g. it takes us less than 15 years to take GPT-N and add layers on top of it that handle things like planning and discovering new actions. This is probably my 'inside view' answer.

I put a lot of weight on a tail peaking around 2050 because of how quickly we've advanced up this 'list of breakthroughs needed for general intelligence':

There is this list of remaining capabilities needed for AGI in an older post I wrote, with the capabilities of 'GPT-6' as I see them underlined:

Stuart Russell’s List

- human-like language comprehension
- cumulative learning
- discovering new action sets
- managing its own mental activity

For reference, I’ve included two capabilities we already have that I imagine being on a similar list in 1960:

- perception and object recognition
- efficient search over known facts

So we'd have discovering new action sets, and managing mental activity - effectively, the things that facilitate long-range complex planning, remaining.

So (very oversimplified) if around the 1980s we had efficient search algorithms, by 2015 we had image recognition (basic perception) and by 2025 we have language comprehension courtesy of GPT-8, that leaves cumulative learning (which could be obtained by advanced RL?), then discovering new action sets and managing mental activity (no idea). It feels a bit odd that we'd breeze past all the remaining milestones in one decade after it took ~6 to get to where we are now. Say progress has sped up to be twice as fast, then it's 3 more decades to go. Add to this the economic evidence from things like Modelling the Human Trajectory, which suggests a roughly similar time period of around 2050.

Finally, I think it's unlikely but not impossible that we never build AGI and instead go for tool AI or CAIS, most likely because we've misunderstood the incentives such that it isn't actually economical or agentive behaviour doesn't arise easily. Then there's the small (few percent) chance of catastrophic or existential disaster which wrecks our ability to invent things. This is the one I'm most unsure about - I put 15% for both but it may well be higher.

[-]Daniel Kokotajlo5y40

This is also because I tend to expect progress to be continuous, though potentially quite fast, and going from current AI to AGI in less than 5 years requires a very sharp discontinuity.

I object! I think your argument from extrapolating when milestones have been crossed is good, but it's just one argument among many. There are other trends which, if extrapolated, get to AGI in less than five years. For example if you extrapolate the AI-compute trend and the GPT-scaling trends you get something like "GPT-5 will appear 3 years from now and be 3 orders of mag... (read more)

1Sammy Martin5y

The 'progress will be continuous' argument, to apply to our near future, does depend on my other assumptions - mainly that the breakthroughs on that list are separable, so agentive behaviour and long-term planning won't drop out of a larger GPT by themselves and can't be considered part of just 'improving up language model accuracy'. We currently have partial progress on human-level language comprehension, a bit on cumulative learning, but near zero on managing mental activity for long term planning, so if we were to suddenly reach human level on long-term planning in the next 5 years, that would probably involve a discontinuity, which I don't think is very likely for the reasons given here. If language models scale to near-human performance but the other milestones don't fall in the process, and my initial claim is right, that gives us very transformative AI but not AGI. I think that the situation would look something like this: If GPT-N reaches par-human: So there would be 2 (maybe 3?) breakthroughs remaining. It seems like you think just scaling up a GPT will also resolve those other milestones, rather than just giving us human-like language comprehension. Whereas if I'm right and also those curves do extrapolate, what we would get at the end would be an excellent text generator, but it wouldn't be an agent, wouldn't be capable of long-term planning and couldn't be accurately described as having a utility function over the states of the external world, and I don't see any reason why trivial extensions of GPT would be able to do that either since those seem like problems that are just as hard as human-like language comprehension. GPT seems like it's also making some progress on cumulative learning, though it might need some RL-based help with that, but none at all on managing mental activity for longterm planning or discovering new action sets. As an additional argument, admittedly from authority - Stuart Russell also clearly sees human-like language comprehe

[-]Ben Pace5y10

(I can't see your distribution in your image.)

stuhlmueller

Aug 22, 2020

110

My quick take:

Ben Pace

Aug 22, 2020*

100

Here's my quick forecast, to get things going. Probably if anyone asks me questions about it I'll realise I'm embarrassed by it and change it.

Link.

It has three buckets:

10%: We get to AGI with the current paradigm relatively quickly without major bumps.

60%: We get to it eventually sometime in the next ~50 years.

30%: We manage to move into a stable state where nobody can unilaterally build an AGI, then we focus on alignment for as long as it takes before we build it.

2nd attempt

Adele Lopez is right that 30% is super optimistic. Also I accidentally put a bunch within '2080-2100', instead of 'after 2100'. And also I thought about it more. Here's my new one.

Link.

It has four buckets:

20% Current work leads directly into AI in the next 15 years.

55% There are some major bottlenecks, new insights needed, and some engineering projects comparable in size to the Manhattan Project. This is 2035 to 2070.

10% This is to fill out 2070 to 2100.

15% We manage to move to a stable state, or alternatively civilizational collapse / non-AI x-risk stops AI research. This is beyond 2100.
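
For readers who want to turn the four buckets above into rough percentiles, here is a minimal sketch. It assumes probability is spread uniformly within each bucket and that the first bucket runs from 2020 to 2035; both are simplifications, since the linked snapshot presumably has more shape than this. The names are illustrative.

```python
# (start_year, end_year, probability) for the three bounded buckets.
BUCKETS = [
    (2020, 2035, 0.20),  # current work leads directly to AI
    (2035, 2070, 0.55),  # major bottlenecks and new insights needed
    (2070, 2100, 0.10),  # fills out the rest of the century
]
P_AFTER_2100 = 0.15      # stable state or collapse; open-ended


def cdf(year):
    """P(AGI by `year`), assuming uniform mass inside each bucket."""
    total = 0.0
    for start, end, p in BUCKETS:
        if year >= end:
            total += p
        elif year > start:
            total += p * (year - start) / (end - start)
    return total


def quantile(q):
    """Year at which the CDF reaches q, or None if that falls after 2100."""
    cumulative = 0.0
    for start, end, p in BUCKETS:
        if q <= cumulative + p:
            return start + (end - start) * (q - cumulative) / p
        cumulative += p
    return None


print(cdf(2050))
print(quantile(0.5))
```

Under these assumptions the implied median lands around 2054, and anything above the 85th percentile falls in the open-ended "beyond 2100" bucket.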

Mark Xu

Aug 22, 2020

My rough take: https://elicit.ought.org/builder/oTN0tXrHQ

3 buckets, similar to Ben Pace's

5% chance that current techniques just get us all the way there, e.g. something like GPT-6 is basically AGI
10% chance AGI doesn't happen this century, e.g. humanity sort of starts taking this seriously and decides we ought to hold off + the problem being technically difficult enough that small groups can't really make AGI themselves
50% chance that something like current techniques and some number of new insights gets us to AGI.

If I thought about this for 5 additional hours, I can imagine assigning the following ranges to the scenarios:

[1, 25]
[1, 30]
[20, 80]

Adele Lopez

Aug 22, 2020

110

Roughly my feelings: https://elicit.ought.org/builder/trBX3uNCd

Reasoning: I think lots of people have updated too much on GPT-3, and that the current ML paradigms are still missing key insights into general intelligence. But I also think enough research is going into the field that it won't take too long to reach those insights.

steven0461

Aug 24, 2020

Here's my prediction:

To the extent that it differs from others' predictions, probably the most important factor is that I think even if AGI is hard, there are a number of ways in which human civilization could become capable of doing almost arbitrarily hard things, like through human intelligence enhancement or sufficiently transformative narrow AI. I think that means the question is less about how hard AGI is and more about general futurism than most people think. It's moderately hard for me to imagine how business as usual could go on for the rest of the century, but who knows.

Rohin Shah

Aug 30, 2020*

My snapshot: https://elicit.ought.org/builder/xPoVZh7Xq

Idk what we mean by "AGI", so I'm predicting when transformative AI will be developed instead. This is still a pretty fuzzy target: at what point do we say it's "transformative"? Does it have to be fully deployed and we already see the huge economic impact? Or is it just the point at which the model training is complete? I'm erring more on the side of "when the model training is complete", but also there may be lots of models contributing to TAI, in which case it's not clear which particular model we mean. Nonetheless, this feels a lot more concrete and specific than AGI.

Methodology: use a quantitative model, and then slightly change the prediction to account for important unmodeled factors. I expect to write about this model in a future newsletter.

[-]Rohin Shah3y120

Some updates:

This should really be thought of as "when we see the transformative economic impact", I don't like the "when model training is complete" framing (for basically the reason mentioned above, that there may be lots of models).
I've updated towards shorter timelines; my median is roughly 2045 with a similar shape of the distribution as above.
One argument for shorter timelines than that in bio anchors is "bio anchors doesn't take into account how non-transformative AI would accelerate AI progress".
Another relevant argument is "the huge difference between training time compute and inference time compute suggests that we'll find ways to get use out of lots of inferences with dumb models rather than a few inferences with smart models; this means we don't need models as smart as the human brain, thus lessening the needed compute at training time".
I also feel more strongly about short horizon models probably being sufficient (whereas previously I mostly had a mixture between short and medium horizon models).
Conversely, reflecting on regulation and robustness made me think I was underweighting those concerns, and lengthened my timelines.

2Rohin Shah3y

Interestingly, I apparently had a median around 2040 back in 2019, so my median is still later than it used to be prior to reading the bio anchors report.

Matthew Barnett

Sep 03, 2020*

If AGI is taken to mean the first year in which there is radical economic, technological, or scientific progress, then these are my AGI timelines.

My percentiles

5th: 2029-09-09
25th: 2049-01-17
50th: 2079-01-24
75th: above 2100-01-01
95th: above 2100-01-01

I have a somewhat lower probability for near-term AGI than many people here do. I model my biggest disagreement as being about how much work is required to move from high-cost impressive demos to real economic performance. I also have an intuition that it is really hard to automate everything, and that progress will be bottlenecked by the tasks that are essential but very hard to automate.

[-]Matthew Barnett3y10

Some updates:

I now have an operationalization of AGI I feel happy about, and I think it's roughly just as difficult as creating transformative AI (though perhaps still slightly easier).
I have less probability now on very long timelines (>80 years). Previously I had 39% credence on AGI arriving after 2100, but I now only have about 25% credence.
I also have a bit more credence on short timelines, mostly because I think the potential for massive investment is real, and it doesn't seem implausible that we could spend >1% of our GDP on AI development at s

... (read more)

ChosunOne

Aug 26, 2020

Here is a link to my forecast

And here are the rough justifications for this distribution:

I don't have much else to add beyond what others have posted, though it's in part influenced by an AIRCS event I attended in the past. Though I do remember being laughed at for suggesting GPT-2 represented a very big advance toward AGI.

I've also never really understood the resistance to why current models of AI are incapable of AGI. Sure, we don't have AGI with current models, but how do we know it isn't a question of scale? Our brains are quite efficient, but the total energy consumption is comparable to that of a light bulb. I find it very hard to believe that a server farm in an Amazon, Microsoft, or Google Datacenter would be incapable of running the final AGI algorithm. And for all the talk of the complexity in the brain, each neuron is agonizingly slow (200-300Hz).

That's also to say nothing of the fact that the vast majority of brain matter is devoted to sensory processing. Advances in autonomous vehicles are already proving that isn't an insurmountable challenge.

Current AI models are performing very well at pattern recognition. Isn't that most of what our brains do anyway?

To me, self-attending recurrent transformer networks with some improvements to memory (attention context) access and recall look very similar to our own brains. What am I missing?

[-]TurnTrout5y10

I've also never really understood the resistance to why current models of AI are incapable of AGI. Sure, we don't have AGI with current models, but how do we know it isn't a question of scale? Our brains are quite efficient, but the total energy consumption is comparable to that of a light bulb. I find it very hard to believe that a server farm in an Amazon, Microsoft, or Google Datacenter would be incapable of running the final AGI algorithm. And for all the talk of the complexity in the brain, each neuron is agonizingly slow (200-

... (read more)


[-]Raemon5y70

Curated.

I think this was a quite interesting experiment in LW Post format. Getting to see everyone's probability-distributions in visual graph format felt very different from looking at a bunch of numbers in a list, or seeing them averaged together. I especially liked some of the weirder shapes of some people's curves.

This is a bit of an offbeat curation, but I think it's good to periodically highlight experimental formats like this.

[-]Rohin Shah5y30

Planned summary for the Alignment Newsletter:

This post collects forecasts of timelines until human-level AGI, and (at the time of this writing) has twelve such forecasts.

[-]Matthew Barnett5y20

It's unclear to me what "human-level AGI" is, and it's also unclear to me why the prediction is about the moment an AGI is turned on somewhere. From my perspective, the important thing about artificial intelligence is that it will accelerate technological, economic, and scientific progress. So, the more important thing to predict is something like, "When will real economic growth rates reach at least 30% worldwide?"

It's worth comparing the vagueness in this question with the specificity in this one on Metaculus. From the virtues of rationality,

The tenth virtue is precision. One comes and says: The quantity is between 1 and 100. Another says: the quantity is between 40 and 50. If the quantity is 42 they are both correct, but the second prediction was more useful and exposed itself to a stricter test. What is true of one apple may not be true of another apple; thus more can be said about a single apple than about all the apples in the world. The narrowest statements slice deepest, the cutting edge of the blade.

[-]jungofthewon5y20

I generally agree with this but think the alternative goal of "make forecasting easier" is just as good, might actually make aggregate forecasts more accurate in the long run, and may require things that seemingly undermine the virtue of precision.

More concretely, if an underdefined question makes it easier for people to share whatever beliefs they already have, then facilitates rich conversation among those people, that's better than if a highly specific question prevents people from making a prediction at all. At least as much, if not more, of the value of making public, visual predictions like this comes from the ensuing conversation and feedback than from the precision of the forecasts themselves.

Additionally, a lot of assumptions get made at the time the question is defined more precisely, which could prematurely limit the space of conversation or ideas. There are good reasons why different people define AGI the way they do, or the moment of "AGI arrival" the way they do, that might not come up if the question askers had taken a point of view.

[-]Ben Pace5y20

Comment here if you have technical issues with the Elicit tool, with putting images in your comments, or with anything else.

[-]Teerth Aloke5y10

Apparently, GPT-3 has greatly influenced the forecasts. I wonder if this is true?

[-]Daniel Kokotajlo5y30

It significantly influenced mine, though the majority of that influence wasn't the evidence it provided but rather the motivation it gave me to think more carefully and deeply about timelines.
