A proposed method for forecasting transformative AI

Nice post, and nice argument! I think I agree that this is a worthy alternative to the Bio Anchors brain-and-genome-size-comparison stuff. I even tentatively agree that it's better overall, though I'd want to think about it more. (When I go on about how great Bio Anchors is, it's not because I'm in love with the brain size comparison, though I used to like it more than I do now. It's because I'm in love with the "useful core" of it: the breakdown into computing price-performance, willingness to spend, algorithmic progress, and compute requirements, which it seems you are also using.)

This type of reasoning personally convinced me that a reasonable hard upper bound for training TAI was about 10^40 FLOP, with something between 10^30 to 10^35 FLOP as my central estimate for the training requirements, using 2022 algorithms.
...
If we're also given estimates for growth in computing price-performance, willingness to spend, and algorithmic progress, then it is possible to provide a distribution over dates when we expect TAI to arrive.
...
Alternatively, you can incorporate this approach into Tom Davidson's takeoff model to build a more theoretically grounded timelines model but I have not done that yet.

Tom's model just uses a training requirements variable; it doesn't appeal to any of the fancy Bio Anchors stuff that your method is a viable alternative to. Insofar as you are still using what I consider the "useful core" of the Bio Anchors model, I think your bottom-line numbers for compute requirements can be plugged straight into Tom's model.

So I just went and plugged in the values of 10^30 and 10^35 FLOP for the training requirements variable at takeoffspeeds.com. Playing around with it a bit (modifying their preset scenarios), it looks like this gives you somewhere between 2029 and 2044.

Though if you also do what I recommend and increase software returns from 1.25 to 2.5, to be more consistent with the data we have so far about algorithmic progress, the top end of the range comes down considerably: now you get 2027 to 2033.
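To make the "useful core" decomposition concrete, here is a minimal sketch of how a training requirement turns into a date once you assume growth rates for the other three inputs. This is not Tom's model or the post's method, just constant exponential growth. The function name, the starting compute, and every growth rate below are hypothetical placeholders, not estimates from the post or from takeoffspeeds.com.

```python
import math


def years_until_affordable(requirement_flop, largest_run_flop,
                           price_perf_growth, spending_growth, algo_growth):
    """Years until effective training compute reaches requirement_flop.

    Each growth argument is a yearly multiplier (1.3 means +30% per year).
    Assumes constant exponential growth, which is a strong simplification.
    Algorithmic progress is treated as multiplying effective compute.
    """
    if largest_run_flop >= requirement_flop:
        return 0.0
    yearly_multiplier = price_perf_growth * spending_growth * algo_growth
    if yearly_multiplier <= 1.0:
        return math.inf  # effective compute never catches up
    gap = requirement_flop / largest_run_flop
    return math.log(gap) / math.log(yearly_multiplier)


if __name__ == "__main__":
    # All inputs here are made-up placeholders for illustration only.
    for requirement in (1e30, 1e35):
        years = years_until_affordable(
            requirement_flop=requirement,
            largest_run_flop=1e24,   # hypothetical starting point
            price_perf_growth=1.3,   # hypothetical
            spending_growth=1.5,     # hypothetical
            algo_growth=1.6,         # hypothetical
        )
        print(f"requirement {requirement:.0e} FLOP: about {years:.1f} years")
```

Putting distributions over each input instead of point values, and sampling, would give a distribution over dates rather than a single number. A fuller model would also let these growth rates change over time and feed back into each other.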

Daniel Kokotajlo 3y 88

Wei Dai 2y 52

I'm confused about how heterogeneity in data quality interacts with scaling. Surely training a LM on scientific papers would give different results from training it on web spam, but data quality is not an input to the scaling law... This makes me wonder whether your proposed forecasting method might have some kind of blind spot in this regard, for example failing to take into account that AI labs have probably already fed all the scientific papers they can into their training processes. If future LMs train on additional data that have little to do with science, could that keep reducing overall cross-entropy loss (as scientific papers become a smaller fraction of the overall corpus) but fail to increase scientific ability?
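A toy calculation of the worry above, under the assumption that the reported cross-entropy is a token-weighted average of per-domain losses. The domain split, the fractions, and the loss values are all invented for illustration; nothing here is measured data.

```python
def aggregate_loss(mix):
    """Token-weighted average loss over a data mix.

    mix: list of (fraction_of_tokens, per_token_loss) pairs.
    Fractions must sum to 1.
    """
    total_fraction = sum(fraction for fraction, _ in mix)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("token fractions must sum to 1")
    return sum(fraction * loss for fraction, loss in mix)


# Hypothetical numbers: (fraction, loss) for science text, then other text.
SCIENCE_LOSS = 2.0  # held fixed: no improvement on scientific text
before = [(0.5, SCIENCE_LOSS), (0.5, 3.0)]
after = [(0.2, SCIENCE_LOSS), (0.8, 2.2)]  # more non-science data, modeled better

if __name__ == "__main__":
    print("aggregate loss before:", aggregate_loss(before))
    print("aggregate loss after: ", aggregate_loss(after))
```

In this made-up example the aggregate loss falls even though the loss on the science slice does not move at all, which is the kind of blind spot being asked about. Whether real training runs behave this way is an empirical question the sketch does not answer.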
