All of Jsevillamol's Comments + Replies

 is not a transpose! It is the timestep . We are raising  to the -th power.


Our current best guess is that this includes costs other than the amortized compute of the final training run.

If no extra information surfaces we will add a note clarifying this and/or adjust our estimate.

1Edouard Harris10mo
Gotcha, that makes sense!

Thanks Neel!

The difference between tf16 and FP32 comes to a x15 factor IIRC. However, ML developers seem to prioritise characteristics other than cost-effectiveness when choosing GPUs, such as raw performance and interconnect, so you can't just multiply the top price performance we showcase by this factor and expect it to match the cost performance of the largest ML runs today.

More soon-ish.

This site claims that the string SolidGoldMagikarp was the username of a moderator involved somehow with Twitch Plays Pokémon.

2Jaime Sevilla1y
Here is a 2012 meme about SolidGoldMagikarp

Marius Hobbhahn has estimated the number of parameters here. His final estimate is 3.5e6 parameters.

Anson Ho has estimated the training compute (his reasoning at the end of this answer). His final estimate is 7.8e22 FLOPs.

Below I made a visualization of the parameters vs training compute of n=108 important ML systems, so you can see how DeepMind's system (labelled GOAT in the graph) compares to other systems.

[Final calculation]
(8 TPUs)(4.20e14 FLOP/s)(0.1 utilisation rate)(32 agents)(7.3e6 s/agent) = 7.8e22 FLOPs
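
The product above can be checked with a few lines of Python. This is only a sketch that re-multiplies the figures quoted in the calculation; the variable names are my own labels for the five factors, not names from the original estimate.

```python
# Hardware-based training compute estimate, re-using the figures quoted above.
# Variable names are illustrative labels for the five factors in the product.
n_tpus = 8                     # number of TPUs
peak_flop_per_s = 4.20e14      # peak FLOP/s per TPU
utilisation = 0.1              # assumed utilisation rate
n_agents = 32                  # number of agents trained
seconds_per_agent = 7.3e6      # training time per agent, in seconds

training_flop = (
    n_tpus * peak_flop_per_s * utilisation * n_agents * seconds_per_agent
)
print(f"{training_flop:.1e} FLOP")
```

The result is roughly 7.8e22 FLOP, matching the final estimate. Since the estimate is a plain product, it scales linearly with each factor: doubling the assumed utilisation rate to 0.2, for example, doubles the total.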



... (read more)
3Daniel Kokotajlo2y
Thanks so much! So, for comparison, fruit flies have more synapses than these XLAND/GOAT agents have parameters!

Following up on this: we have updated appendix F of our paper with an analysis of different choices of the threshold that separates large-scale and regular-scale systems. Results are similar independently of the threshold choice.

There's also a lot of research that didn't make your analysis, including work explicitly geared towards smaller models. What exclusion criteria did you use? I feel like if I was to perform the same analysis with a slightly different sample of papers I could come to wildly divergent conclusions.

It is not feasible to do an exhaustive analysis of all milestone models. We necessarily are missing some important ones, either because we are not aware of them, because they did not provide enough information to deduce the training compute or because we haven't gott... (read more)

Great questions! I think it is reasonable to be suspicious of the large-scale distinction.

I do stand by it - I think the companies discontinuously increased their training budgets around 2016 for some flagship models.[1] If you mix these models with the regular trend, you might believe that the trend was doubling very fast up until 2017 and then slowed down. It is not an entirely unreasonable interpretation, but it does a worse job of explaining the discontinuous jumps around 2016. Appendix E discusses this in-depth.

The way we selected the large-scale models is half ... (read more)

Thank you Alex! You make some great points.

It seems like you probably could have gotten certainty about compute for at least a handful of the models studied in question

We thought so too - but in practice it has been surprisingly hard. Profilers are surprisingly buggy. Our colleague Marius looked into this in more depth here.

Maybe we are just going about it the wrong way. If someone here figures out how to directly measure compute in e.g. a PyTorch or TF model, it would be a huge boon to us.

I think two more contemporary techniques are worth considering

... (read more)

ASSUMPTION 3: The algorithm is human-legible, but nobody knows how it works yet.


Can you clarify what you mean by this assumption? And how is your argument dependent on it?

Is the point that the "secret sauce" algorithm is something that humans can plausibly come up with by thinking hard about it? As opposed maybe to an evolution-designed nightmare that humans cannot plausibly design except by brute forcing it?

3Steve Byrnes2y
Yes, what you said. The opposite of "a human-legible learning algorithm" is "a nightmarishly-complicated Rube-Goldberg-machine learning algorithm". If the latter is what we need, we could still presumably get AGI, but it would involve some automated search through a big space of many possible nightmarishly-complicated Rube-Goldberg-machine learning algorithms to find one that works. That would be a different AGI development story, and thus a different blog post. Instead of "humans figure out the learning algorithm" as an exogenous input to the path-to-AGI, which is how I treated it, it would instead be an output of that automated search process. And there would be much more weight on the possibility that the resulting learning algorithm would be wildly different than the human brain's, and hence more uncertainty in its computational requirements.

I could only skim and the details went over my head, but it seems you intend to do experiments with Bayesian Networks and human operators.

I recently developed and released an open source explainability framework for Bayes nets - dropping it here in the unlikely case it might be useful.

I don't fully understand how the embeddings are done.

Can you spell out one of the examples? 

It would be helpful for me to see how the semes map to the actual matrix.

Relevant related work: NNs are surprisingly modular

On the topic of pruning neural networks, see the lottery ticket hypothesis

I believe Richard linked to Clusterability in Neural Networks, which has superseded Pruned Neural Networks are Surprisingly Modular. The same authors also recently published Detecting Modularity in Deep Neural Networks.

How might we quantify size in our definitions above?

Random idea: a K-complexity-inspired measure of size for a context / property / pattern.

Least number of squares you need to turn on, starting from an empty board, so that the grid eventually evolves into the context.

It doesn't work for infinite contexts though.
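
For small finite contexts, the measure above can be brute-forced. The sketch below rests on several assumptions of mine: the grid follows Conway's Game of Life rules, "evolves into the context" means the set of live cells exactly equals the target at some generation, seeds are restricted to a small search window, and the search is cut off after `max_steps` generations. The names `step`, `min_seed_size`, `box` and `max_steps` are hypothetical.

```python
from itertools import combinations, product


def step(live):
    """Advance one Game of Life generation on an unbounded board.

    `live` is a set of (row, col) cells that are currently on.
    """
    counts = {}
    for r, c in live:
        for dr, dc in product((-1, 0, 1), repeat=2):
            if (dr, dc) != (0, 0):
                cell = (r + dr, c + dc)
                counts[cell] = counts.get(cell, 0) + 1
    # Birth on exactly 3 neighbours, survival on 2 or 3.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}


def min_seed_size(target, box, max_steps=10):
    """Smallest number of cells in `box` to turn on so that the board
    equals `target` within `max_steps` generations (None if not found)."""
    target = frozenset(target)
    cells = sorted(box)
    for size in range(len(cells) + 1):          # try small seeds first
        for seed in combinations(cells, size):
            live = set(seed)
            for _ in range(max_steps + 1):      # generation 0 counts too
                if live == target:
                    return size
                live = step(live)
    return None


# Example target: the 2x2 "block" still life.
block = {(0, 0), (0, 1), (1, 0), (1, 1)}
box = set(product(range(-1, 2), repeat=2))      # 3x3 search window
print(min_seed_size(block, box))
```

For the block this prints 3: one or two live cells die out immediately, while an L-shaped triple inside the block grows into it in one generation. The search is exponential in the window size, so it only illustrates the definition on tiny patterns.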

My user experience

When I first load the page, I am greeted by an empty space.


From here I didn't know what to look for, since I didn't remember what kind of things were in the database.

I tried clicking on table to see what content is there.

Ok, too much information, hard to navigate.

I remember that one of my manuscripts made it to the database, so I look up my surname


That was easy! (and it loaded very fast)

The interface is very neat too. I want to see more papers, so I click on one of the tags.

I get what I wanted.

Now I want to find a list of all... (read more)

One more question: for the BigGAN which model do your calculations refer to?

Could it be the 256x256 deep version?

Do you mind sharing your guesstimate on number of parameters?

Also, do you by any chance have guesstimates on number of parameters / compute of other systems?

2Daniel Kokotajlo3y
I did, sorry -- I guesstimated FLOP/step and then figured parameters is probably a bit less than 1 OOM less than that. But since this is recurrent maybe it's even less? IDK. My guesstimate is shitty and I'd love to see someone do a better one!

Very tangential to the discussion so feel free to ignore, but given that you have put some thought before into prize structures, I am curious about the reasoning for why you would award a different prize for something done in the past versus something done in the future.

Thank you! The shapes mean the same as the color (ie domain) - they were meant to make the graph clearer. Ideally both shape and color would be reflected in the legend. But whenever I tried adding shapes to the legend, a new legend was created instead, which was more confusing.

If somebody reading this knows how to make the code produce a correct legend I'd be very keen on hearing it!

EDIT: Now fixed

re: importance of oversight

I do not think we really disagree on this point. I also believe that looking at the state of the computer is not as important as having an understanding of how the program is going to operate and how to shape its incentives. 

Maybe this could be better emphasized, but the way I think about this article is that it shows that even the strongest case for looking at the intersection of quantum computing and AI alignment does not look very promising.


re: How quantum computing will affect ML

I basically agree that the most plaus... (read more)

Suggestion 1: Utility != reward by Vladimir Mikulik. This post attempts to distill the core ideas of mesa alignment. This kind of distillation increases the surface area of AI Alignment, which is one of the key bottlenecks of the area (that is, getting people familiarized with the field, motivated to work on it and with a handle on some open questions to work on). I would like an in-depth review because it might help us learn how to do it better!

Suggestion 2: me and my coauthor Pablo Moreno would be interested in feedback in our post about quantum computing... (read more)

I think this helped me understand you a lot better - thank you.

Let me try paraphrasing this:

> Humans are our best example of a sort-of-general intelligence. And humans have a lazy, satisficing, 'small-scale' kind of reasoning that is mostly only well suited for activities close to their 'training regime'. Hence AGIs may also be the same - and in particular if AGIs are trained with Reinforcement Learning and heavily rewarded for following human intentions this may be a likely outcome.

Is that pointing in the direction you intended?

Let me try to paraphrase this: 

In the first paragraph you are saying that "seeking influence" is not something that a system will learn to do if that was not a possible strategy in the training regime. (but couldn't it appear as an emergent property? Certainly humans were not trained to launch rockets - but they nevertheless did?)

In the second paragraph you are saying that common sense sometimes allows you to modify the goals you were given (but for this to apply to AI systems, wouldn't they need to have common sense in the first place, which kind of ass... (read more)

3Richard Ngo3y
Hey, thanks for the questions! It's a very confusing topic so I definitely don't have a fully coherent picture of it myself. But my best attempt at a coherent overall point: No, I'm saying that giving an agent a goal, in the context of modern machine learning, involves reinforcement in the training regime. It's not clear to me exactly what goals will result from this, but we can't just assume that we can "give an AI the final goal of evaluating the Riemann hypothesis" in a way that's devoid of all context. It may be the case that it's very hard to train AIs without common sense of some kind, potentially because a) that's just the default for how minds work, they don't by default extrapolate to crazy edge cases. And b) common sense is very useful in general. For example, if you train AIs on obeying human instructions, then they will only do well in the training environment if they have a common-sense understanding of what humans mean. No, it's more that the goal itself is only defined in a small-scale setting, because the agent doesn't think in ways which naturally extrapolate small-scale goals to large scales. Perhaps it's useful to think about having the goal of getting a coffee. And suppose there is some unusual action you can take to increase the chances that you get the coffee by 1%. For example, you could order ten coffees instead of one coffee, to make sure at least one of them arrives. There are at least two reasons you might not take this unusual action. In some cases it goes against your values - for example, if you want to save money. But even if that's not true, you might just not think about what you're doing as "ensure that I have coffee with maximum probability", but rather just "get a coffee". This goal is not high-stakes enough for you to actually extrapolate beyond the standard context. And then some people are just like that with all their goals - so why couldn't an AI be too?

I notice I am surprised you write

However, the link from instrumentally convergent goals to dangerous influence-seeking is only applicable to agents which have final goals large-scale enough to benefit from these instrumental goals

and do not address the "Riemann disaster" or "Paperclip maximizer" examples [1]

  • Riemann hypothesis catastrophe. An AI, given the final goal of evaluating the Riemann hypothesis, pursues this goal by transforming the Solar System into “computronium” (physical resources arranged in a way that is optimized for computation)— including the
... (read more)
4Richard Ngo3y
Yes, because it skips over the most important part: what it means to "give an AI a goal". For example, perhaps we give the AI positive reward every time it solves a maths problem, but it never has a chance to seize more resources during training - all it's able to do is think about them. Have we "given it" the goal of solving maths problems by any means possible, or the goal of solving maths problems by thinking about them? The former I'd call large-scale, the latter I wouldn't. I think I'll concede that "large-scale" is maybe a bad word for the concept I'm trying to point to, because it's not just a property of the goal, it's a property of how the agent thinks about the goal too. But the idea I want to put forward is something like: if I have the goal of putting a cup on a table, there's a lot of implicit context around which table I'm thinking about, which cup I'm thinking about, and what types of actions I'm thinking about. If for some reason I need to solve world peace in order to put the cup on the table, I won't adopt solving world peace as an instrumental goal, I'll just shrug and say "never mind then, I've hit a crazy edge case". I don't think that's because I have safe values. Rather, this is just how thinking works - concepts are contextual, and it's clear when the context has dramatically shifted. So I guess I'm kind of thinking of large-scale goals as goals that have a mental "ignore context" tag attached. And these are certainly possible, some humans have them. But it's also possible to have exactly the same goal, but only defined within "reasonable" boundaries - and given the techniques we'll be using to train AGIs, I'm pretty uncertain which one will happen by default. Seems like, when we're talking about tasks like "manage this specific factory" or "solve this specific maths problem", the latter is more natural.

I have been thinking about this research direction for ~4 days.

No interesting results, though it was a good exercise to calibrate how much I enjoy researching this type of stuff.

In case somebody else wants to dive into it, here are some thoughts I had and resources I used:


  • The definition of depth given in the post seems rather unnatural to me. This is because I expected it would be easy to relate the depth of two agents to the rank of the world of a Kripke chain where the fixed points representing their behavior will stabilize. Looking at Zachar
... (read more)