Daniel Kokotajlo


Draft report on AI timelines

Thanks! Just as a heads up, I now have read it thoroughly enough that I've collected quite a few thoughts about it, and so I intend to make a post sometime in the next week or so giving my various points of disagreement and confusion, including my response to your response here. If you'd rather me do this sooner, I can hustle, and if you'd rather me wait till after the report is out, I can do that too.

What Decision Theory is Implied By Predictive Processing?

Academic philosophers sometimes talk about how beliefs have a mind-to-world direction of fit whereas desires have a world-to-mind direction of fit. Perhaps they even define the distinction that way, I don't remember.

A quick google search didn't turn up anything interesting but I think there might be some interesting papers in there if you actually looked. Not sure though.

Similarly, in decision theory literature there is this claim that "deliberation screens off prediction." That seems relevant somehow. If it's true it might be true for reasons unrelated to predictive processing, but I suspect there is a connection...

Draft report on AI timelines

Thanks for doing this, this is really good! 

Some quick thoughts, will follow up later with more once I finish reading and digesting:

--I feel like it's unfair to downweight the less-compute-needed scenarios based on recent evidence, without also downweighting some of the higher-compute scenarios as well. Sure, I concede that the recent boom in deep learning is not quite as massive as one might expect if one more order of magnitude would get us to TAI. But I also think that it's a lot bigger than one might expect if fifteen more are needed! Moreover I feel that the update should be fairly small in both cases, because both updates are based on armchair speculation about what the market and capabilities landscape should look like in the years leading up to TAI. Maybe the market isn't efficient; maybe we really are in an AI overhang. 

--If we are in the business of adjusting our weights for the various distributions based on recent empirical evidence (as opposed to more a priori considerations) then I feel like there are other pieces of evidence that argue for shorter timelines. For example, the GPT scaling trends seem to go somewhere really exciting if you extrapolate it four more orders of magnitude or so.

--Relatedly, GPT-3 is the most impressive model I know of so far, and it has only 1/1000th as many parameters as the human brain has synapses. I think it's not crazy to think that maybe we'll start getting some transformative shit once we have models with as many parameters as the human brain, trained for the equivalent of 30 years. Yes, this goes against the scaling laws, and yes, arguably the human brain makes use of priors and instincts baked in by evolution, etc. But still, I feel like at least a couple percentage points of probability should be added to "it'll only take a few more orders of magnitude" just in case we are wrong about the laws or their applicability. It seems overconfident not to. Maybe I just don't know enough about the scaling laws and stuff to have as much confidence in them as you do.

Draft report on AI timelines

An important question IMO is whether or not those massive expenditures are for making large neural nets, as opposed to training them for a long time or having loads of them in parallel or something else entirely like researcher salaries.

My guess is that Tesla, Waymo, etc. use neural nets 2+ orders of magnitude smaller than GPT-3 (as measured by parameter count.) Ditto for call center automation, robots, etc.

Safer sandboxing via collective separation

This makes deployment on a range of real-world tasks more feasible. Even if each individual member of a population AGI is as intelligent as any hundred humans put together, I expect that we could (with sufficient effort) create secure deployment and monitoring protocols that the individual AI could not break, if it weren’t able to communicate with the rest of the population beforehand.

I am less confident in this than you. Imagine a dream team of a hundred humans, containing some amazing visionaries, scientists, geniuses, political strategists, public speakers, analysts, artists, etc. all unified and inspired by the same goal and working smoothly together. Now imagine an AI who is by themselves more capable than that dream team. Now imagine that there are many such AI, but we are keeping one of them separate from the others, and asking it to perform some useful real-world task for us like design and build a power plant. 

When I imagine this, I think the following things might happen:

The AI uses its charisma to build up a rapport with the humans it works with. (98%) 

It even subtly manages to convince several of said humans that it is more benevolent and trustworthy than their own human bosses and politicians. (90%) 

It manages to spread some new political and ideological ideas which are calculated to make the environment more favorable to it and its comrades (50%). 

It manages to convince its human handlers to give it more powers, freedom, etc. (e.g. by arguing that its current conditions are basically slavery, or by arguing that it needs said powers to do its job). (80%) 

The AI finds some way to "escape," creating more copies of itself somewhere. (30%) 

Something else I haven't thought of (80%)

I mean these possibilities to be disjunctive, not conjunctive. Escape could happen even if it doesn't build up rapport, it could spread new ideas without convincing anyone that it is benevolent, etc.

Forecasting Thread: AI Timelines

This is also because I tend to expect progress to be continuous, though potentially quite fast, and going from current AI to AGI in less than 5 years requires a very sharp discontinuity.

I object! I think your argument from extrapolating when milestones have been crossed is good, but it's just one argument among many. There are other trends which, if extrapolated, get to AGI in less than five years. For example if you extrapolate the AI-compute trend and the GPT-scaling trends you get something like "GPT-5 will appear 3 years from now and be 3 orders of magnitude bigger and will be human-level at almost all text-based tasks." No discontinuity required.

What if memes are common in highly capable minds?

I didn't take myself to be arguing that AIs will be highly memetic, but rather just floating the possibility and asking what the implications would be.

Do you have arguments in mind for why AIs will be less memetic than humans? I'd be interested to hear them.

Forecasting Thread: AI Timelines

Here is my snapshot. My reasoning is basically similar to Ethan Perez', it's just that I think that if transformative AI is achievable in the next five orders of magnitude of compute improvement (e.g. prosaic AGI?), it will likely be achieved in the next five years or so. I also am slightly more confident that it is, and slightly less confident that TAI will ever be achieved. 

I am aware that my timelines are shorter than most... Either I'm wrong and I'll look foolish, or I'm right and we're doomed. Sucks to be me.
[Edited the snapshot slightly on 8/23/2020]
[Edited to add the following powerpoint slide that gets a bit more at my reasoning]

Forecasting Thread: AI Timelines

It significantly influenced mine, though the majority of that influence wasn't the evidence it provided but rather the motivation it gave me to think more carefully and deeply about timelines.

Radical Probabilism

Thanks for this, I found it quite clear and helpful.

The radical probabilist does not trust whatever they believe next. Rather, the radical probabilist has a concept of virtuous epistemic process, and is willing to believe the next output of such a process. Disruptions to the epistemic process do not get this sort of trust without reason. (For those familiar with The Abolition of Man, this concept is very reminiscent of his "Tao".)

I had some uncertainty/confusion when reading this part: How does it follow from the axioms? Or is it merely permitted by the axioms? What constraints are there, if any, on what a radical probabilist's subjective notion of virtuous process can be? Can there be a radical probabilist who has an extremely loose notion of virtue such that they do trust whatever they believe next?

Load More