Some people seem to think my timelines have shifted a bunch when they've actually only changed moderately.
Relative to my views at the start of 2025, my median (50th percentile) for AIs fully automating AI R&D was pushed back by around 2 years—from something like Jan 2032 to Jan 2034. My 25th percentile has shifted similarly (though perhaps more importantly) from maybe July 2028 to July 2030. Obviously, my numbers aren't fully precise and vary some over time. (E.g., I'm not sure I would have quoted these exact numbers for this exact milestone at the start of the year; these numbers for the start of the year are partially reverse engineered from this comment.)
Fully automating AI R&D is a pretty high milestone; my current numbers for something like "AIs accelerate AI R&D as much as what would happen if employees ran 10x faster (e.g. by ~fully automating research engineering and some other tasks)" are probably 50th percentile Jan 2032 and 25th percentile Jan 2029.[1]
I'm partially posting this so there is a record of my views; I think it's somewhat interesting to observe this over time. (That said, I don't want to anchor myself, which does seem like a serious downside. I should slide around a bunch and be somewhat incoherent if I'm updating as much as I should: my past views are always going to be somewhat obviously confused from the perspective of my current self.)
While I'm giving these numbers, note that I think Precise AGI timelines don't matter that much.
See this comment for the numbers I would have given for this milestone at the start of the year. ↩︎
I've updated towards somewhat longer timelines again over the last 5 months. Maybe my 50th percentile for this milestone is now Jan 2032.
Mostly some AI company employees with shorter timelines than me. I also think that "why I don't agree with X" is a good prompt for expressing some deeper aspect of my models/views, and it makes a reasonably engaging hook for a blog post.
I might write some posts responding to arguments for longer timelines that I disagree with, if I feel like I have something interesting to say.
While I do spend some time discussing AGI timelines (and I've written some posts about it recently), I don't think moderate quantitative differences in AGI timelines matter that much for deciding what to do.[1] For instance, having a 15-year median rather than a 6-year median doesn't make that big of a difference. That said, I do think that moderate differences in the chance of very short timelines (i.e., less than 3 years) matter more: going from a 20% chance to a 50% chance of full AI R&D automation within 3 years should potentially make a substantial difference to strategy.[2]
Additionally, my guess is that the most productive way to engage with discussion around timelines is mostly to not care much about resolving disagreements, but when there appears to be a large chance that timelines are very short (e.g., >25% in <2 years), it's worthwhile to try hard to argue for this.[3] I think takeoff speeds are much more important to argue about when making the case for AI risk.
I do think that having somewhat precise views is helpful for some people in doing relatively precise prioritization among those already working on safety, but this seems pretty niche.
Given that I don't think timelines are that important, why have I been writing about this topic? This is due to a mixture of factors: I find it relatively quick and easy to write about timelines; my commentary is relevant to the probability of very short timelines (which, as discussed above, I do think is important); a bunch of people seem interested in timelines regardless; and I do think timelines matter some.
Consider reflecting on whether you're overly fixated on details of timelines.
I've seen Richard Ngo make this point before, though I couldn't find where he did this. More generally, this isn't a very original point; I just think it's worth making given that I've been talking about timelines recently. ↩︎
I also think that the chance that very powerful AI happens under this presidential administration is action-relevant for policy. ↩︎
You could have views such that you expect to never be >25% confident in <2-year timelines until it's basically too late. For instance, maybe you expect very fast takeoff driven by a single large algorithmic advance. Under this view, I think arguing about the details of timelines looks even less good and you should mostly make the case for risk independently of this, perhaps arguing "it seems like AI could emerge quickly and unexpectedly, so we need to act now". ↩︎
I'm more sympathetic to "companies will spend this much on some high quality RL envs" than "the typical RL env will be very expensive", but I think some disagreement remains.
Thanks, I wasn't aware of this post. (I think it overstates the level of spending we'll see on the average RL env within a year by maybe 10x or more, but I agree directionally.)
This seems pretty reasonable (strong upvoted).
These aren't disagreements with the post, but here are some of my thoughts (most of which Sam already knows, but I'm sharing for third parties):
My largest concern is that the auditing test beds will end up being pretty disanalogous to the cases of misalignment I find most concerning, and thus hill climbing on these test beds isn't that useful. (You discuss the need for high quality environments in the appendix. I agree.) In practice, my understanding is that constructing good auditing test beds has been a huge pain, especially because current AI systems have trouble "keeping their shit together" (and generally are unreliable at concealing secret aims), which makes behavioral red-teaming very powerful. I don't expect behavioral red-teaming to work for the AI systems I'm most worried about.[1] To be clear, the issues I discuss in this paragraph apply to testing alignment auditing methods in general.
Another concern is that current AI systems (and future AIs which aren't much more capable) are pretty different from humans, which could limit how well the conclusions transfer, though this doesn't mean that hill climbing wouldn't have substantial transfer. E.g., maybe some tool helps AI agents uncover issues with less need for cognitive effort, agency, or noticing things, and this makes a huge difference to the bottom line due to AI capability limitations, but humans would only be marginally helped by the tool because they could just apply more effort.
There is a decent chance that very powerful AIs still have trouble keeping their shit together, but such AIs are correspondingly much easier to alignment audit (at least for hidden intentions), so I also think these worlds are less leveraged to work on. ↩︎
Probably it would be more accurate to say "doesn't seem to help much while it helps a lot for OpenAI models".