Someone who is interested in learning and doing good.
My main blog: https://matthewbarnett.substack.com/
Thanks for the useful comment.
You might say "okay, sure, at some level of scaling GPTs learn enough general reasoning that they can manage a corporation, but there's no reason to believe it's near".
Right. This is essentially the same way we might reply to Claude Shannon if he said that some level of brute-force search would solve the problem of natural language translation.
One of the major points of the bio anchors framework is to give a reasonable answer to the question of "at what level of scaling might this work", so I don't think you can argue that current forecasts are ignoring (2).
Figuring out how to make a model manage a corporation involves a lot more than scaling a model until it has the requisite general intelligence to do it in principle if its motivation were aligned.
I think it will be hard to figure out how to actually make models do stuff we want. Insofar as this is simply a restatement of the alignment problem, I think this assumption will be fairly uncontroversial around here. Yet, it's also a reason to assume that we won't simply obtain transformative models the moment they become theoretically attainable.
It might seem unfair that I'm using safety and control as an input in our model for timelines, if we're using the model to reason about the optimal time to intervene. But I think on an individual level it makes sense to just try to forecast what will actually happen.
These arguments prove too much; you could apply them to pretty much any technology (e.g. self-driving cars, 3D printing, reusable rockets, smart phones, VR headsets...).
I suppose my argument has an implicit, "current forecasts are not taking these arguments into account." If people actually were taking my arguments into account, and still concluding that we should have short timelines, then this would make sense. But, I made these arguments because I haven't seen people talk about these considerations much. For example, I deliberately avoided the argument that according to the outside view, timelines might be expected to be long, since that's an argument I've already seen many people make, and therefore we can expect a lot of people to take it into account when they make forecasts.
I agree that the things you say push in the direction of longer timelines, but there are other arguments one could make that push in the direction of shorter timelines.
Sure. I think my post is akin to someone arguing for a scientific theory. I'm just contributing some evidence in favor of the theory, not conducting a full analysis for and against it. Others can point to evidence against it, and overall we'll just have to sum over all these considerations to arrive at our answer.
In addition to the reasons you mentioned, there's also empirical evidence that technological revolutions generally precede the productivity growth that they eventually cause. In fact, economic growth may even slow down as people pay costs to adopt new technologies. Philippe Aghion and Peter Howitt summarize the state of the research in chapter 9 of The Economics of Growth,
Although each [General Purpose Technology (GPT)] raises output and productivity in the long run, it can also cause cyclical fluctuations while the economy adjusts to it. As David (1990) and Lipsey and Bekar (1995) have argued, GPTs like the steam engine, the electric dynamo, the laser, and the computer require costly restructuring and adjustment to take place, and there is no reason to expect this process to proceed smoothly over time. Thus, contrary to the predictions of real-business-cycle theory, the initial effect of a “positive technology shock” may not be to raise output, productivity, and employment but to reduce them.
If AGI is taken to mean the first year in which there is radical economic, technological, or scientific progress, then these are my AGI timelines.
I have a somewhat lower probability for near-term AGI than many people here do. I model my biggest disagreement as being about how much work is required to move from high-cost impressive demos to real economic performance. I also have an intuition that it is really hard to automate everything, and that progress will be bottlenecked by the tasks that are essential but very hard to automate.
It's unclear to me what "human-level AGI" is, and it's also unclear to me why the prediction is about the moment an AGI is turned on somewhere. From my perspective, the important thing about artificial intelligence is that it will accelerate technological, economic, and scientific progress. So, the more important thing to predict is something like, "When will real economic growth rates reach at least 30% worldwide?"
It's worth comparing the vagueness in this question with the specificity in this one on Metaculus. From the Twelve Virtues of Rationality,
The tenth virtue is precision. One comes and says: The quantity is between 1 and 100. Another says: the quantity is between 40 and 50. If the quantity is 42 they are both correct, but the second prediction was more useful and exposed itself to a stricter test. What is true of one apple may not be true of another apple; thus more can be said about a single apple than about all the apples in the world. The narrowest statements slice deepest, the cutting edge of the blade.
To me the most obvious risk (which I don't ATM think of as very likely for the next few iterations, or possibly ever, since the training is myopic/SL) would be that GPT-N in fact is computing (e.g. among other things) a superintelligent mesa-optimization process that understands the situation it is in and is agent-y.
Do you have any idea what the mesa objective might be? I agree that this is a worrisome risk, but I was more interested in the type of answer that specified, "Here's a plausible mesa objective given the incentives." Mesa optimization is a more general risk that isn't specific to the narrow training scheme used by GPT-N.
Second, the major disagreement is between those who think progress will be discontinuous and sudden (such as Eliezer Yudkowsky, MIRI) and those who think progress will be very fast by normal historical standards but continuous (Paul Christiano, Robin Hanson).
I'm not actually convinced this is a fair summary of the disagreement. As I explained in my post about different AI takeoffs, I had the impression that the primary disagreement between the two groups was over locality rather than the amount of time takeoff lasts. Though of course, I may be misinterpreting people.
I tend to think that the pandemic shares more properties with fast takeoff than it does with slow takeoff. Under fast takeoff, a very powerful system will spring into existence after a long period of AI being otherwise irrelevant, in a similar way to how the virus was dormant until early this year. The defining feature of slow takeoff, by contrast, is a gradual increase in abilities from AI systems all across the world.
In particular, I object to this portion of your post,
The "moving goalposts" effect, where new advances in AI are dismissed as not real AI, could continue indefinitely as increasingly advanced AI systems are deployed. I would expect the "no fire alarm" hypothesis to hold in the slow takeoff scenario - there may not be a consensus on the importance of general AI until it arrives, so risks from advanced AI would continue to be seen as "overblown" until it is too late to address them.
I'm not convinced that these parallels to COVID-19 are very informative. Compared to this pandemic, I expect the direct effects of AI to be very obvious to observers, in a similar way that the direct effects of cars are obvious to people who go outside. Under a slow takeoff, AI will already be performing a lot of important economic labor before the world "goes crazy" in the important senses. Compare to the pandemic, in which
Oops yes. That's the weaker claim, which I agree with. The stronger claim is that because we can't understand something "all at once", mechanistic transparency is too hard and so we shouldn't take Daniel's approach. But the way we understand laptops is also in a mechanistic sense. No one argues that because laptops are too hard to understand all at once, we shouldn't try to understand them mechanistically.
This seems to be assuming that we have to be able to take any complex trained AGI-as-a-neural-net and determine whether or not it is dangerous. Under that assumption, I agree that the problem is itself very hard, and mechanistic transparency is not uniquely bad relative to other possibilities.
I didn't assume that. I objected to the specific example of a laptop as an instance of mechanistic transparency being too hard. Laptops are normally understood well because understanding can be broken into components and built up from abstractions. But our understanding of each component and abstraction is pretty mechanistic -- and this understanding is useful.
Furthermore, because laptops did not fall out of the sky one day, but were instead built up slowly over successive years of research and development, they seem like a great example of how Daniel's mechanistic transparency approach does not rely on us having to understand arbitrary systems. Just as we built up an understanding of laptops, presumably we could do the same with neural networks. This was my interpretation of why he is using Zoom In as an example.
All of the other stories for preventing catastrophe that I mentioned in the grandparent are tackling a hopefully easier problem than "detect whether an arbitrary neural net is dangerous".
Indeed, but I don't think this was the crux of my objection.
I'd be shocked if there was anyone to whom it was mechanistically transparent how a laptop loads a website, down to the gates in the laptop.
Could you clarify why this is an important counterpoint? It seems obviously useful to understand mechanistic details of a laptop in order to debug it. You seem to be arguing the [ETA: weaker] claim that nobody understands an entire laptop "all at once", as in, they can hold all the details in their head simultaneously. But such an understanding is almost never possible for any complex system, and yet we still try to approach it. So this objection could show that mechanistic transparency is hard in the limit, but it doesn't show that mechanistic transparency is uniquely bad in any sense. Perhaps you disagree?