Also, it's a lot easier to fake by writing 10 letters with 10 different predictions and then burning the ones that don't come true.
I agree with all of this, I think. This is why I said commitment races happen between consequentialists (I defined that term more narrowly than you do; the sophisticated reasoning you do here is nonconsequentialist by my definition). I agree that agents worthy of the label "rational" will probably handle these cases gracefully and safely.
However, I'm not yet supremely confident that the AGIs we end up building will handle these cases gracefully and safely. I would love to become more confident & am looking for ways to make it more likely.
If toda... (read more)
OK, cool. I think I was confused.
You would behave the exact same way as GPT-3, were you to be put in this same challenging situation. In fact I think you'd do worse; GPT-3 managed to get quite a few words actually reversed whereas I expect you'd just output gibberish. (Remember, you only have about 1 second to think before outputting each token. You have to just read the text and immediately start typing.)
Thanks Lanrian and Gwern! Alas that my quick-and-dirty method is insufficient.
You may be interested in this image. I would be grateful for critiques; maybe I'm thinking about it wrong?
You calculated things for the neural network brain size anchor; now here's the performance scaling trend calculation (I think):
I took these graphs from the Chinchilla paper, made them transparent, superimposed them on one another, and then made a copy on the right to extend the line. I also drew some other lines to extend them. Eyeballing this graph, it looks like whatever performance we could achieve with 10^27 FLOPs under the Kaplan scaling laws, we can now achieve with 10^25 FLOPs. (!!!) This is a big deal if true. Am I reasoning incorrectly here?... (read more)
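As a rough sanity check on the eyeball estimate, here's a numeric sketch using the Chinchilla paper's fitted loss L(N, D) = E + A/N^α + B/D^β with their published constants. The Kaplan-style allocation rule (N ∝ C^0.73, calibrated to GPT-3's parameter/token split) is my own simplification, so treat the output as ballpark only:

```python
# Rough sanity check: how much compute does Chinchilla-style allocation save
# over Kaplan-style allocation, per the Chinchilla paper's fitted loss?
# L(N, D) = E + A/N^alpha + B/D^beta  (constants from Hoffmann et al. 2022)
E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

def loss(N, D):
    return E + A / N**alpha + B / D**beta

def kaplan_alloc(C):
    # Kaplan-style: N grows ~ C^0.73, calibrated to GPT-3 (my assumption)
    N = 175e9 * (C / 3.14e23)**0.73
    return N, C / (6 * N)            # using C ~= 6 * N * D

def chinchilla_alloc(C):
    # Chinchilla-style: ~20 tokens per parameter, so C ~= 120 * N^2
    N = (C / 120)**0.5
    return N, 20 * N

# Loss that Kaplan-style training reaches at 10^27 FLOPs
target = loss(*kaplan_alloc(1e27))

# Bisect (in log space) for the Chinchilla-style compute matching that loss
lo, hi = 1e23, 1e27
for _ in range(100):
    mid = (lo * hi)**0.5
    if loss(*chinchilla_alloc(mid)) > target:
        lo = mid
    else:
        hi = mid
print(f"Equivalent Chinchilla compute: {hi:.2e} FLOPs "
      f"({1e27 / hi:.0f}x less than 1e27)")
```

On this crude sketch the savings come out to roughly one to two OOMs at that scale, in the same ballpark as the eyeball estimate; and because the two allocation rules are power laws with different exponents, the gap widens as compute grows.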
Cool. Yep, that makes sense. I'd love to see those numbers if you calculate them!
So then... If before we looked at the Kaplan scaling and thought e.g. 50% chance that +6 OOMs would be enough... now we correct for the updated scaling laws and think 50% chance that, what, +4 OOMs would be enough? How big do you think the adjustment would be? (Maybe I can work it out by looking at some of those IsoX graphs in the paper?)
Depends on how you were getting to that +N OOMs number.
If you were looking at my post, or otherwise using the scaling laws to extrapolate how fast AI was improving on benchmarks (or subjective impressiveness), then the chinchilla laws means you should get there sooner. I haven't run the numbers on how much sooner.
If you were looking at Ajeya's neural network anchor (i.e. the one using the Kaplan scaling-laws, not the human-lifetime or evolution anchors), then you should now expect that AGI comes later. That model anchors the number of parameters in AGI to ... (read more)
The difference between Chinchilla and Gopher was small but noticeable. Since the Kaplan and DM optimal scaling trajectories are like two lines with different slopes, should we perhaps expect the difference to get larger at greater scales?
Thanks for doing this!
Since PaLM was basically a continuation of the scaling strategy derived from the Kaplan paper, it's not surprising that it basically continues the trend. (right?)
I'd be very interested to see how Chinchilla compares, since it claims to be using a superior scaling strategy.
That's my read. It continues the Kaplan scaling. The Kaplan scaling isn't wrong (everything really does scale that way if you train that way), it's just suboptimal. PaLM is not a surprise, neither in the compute cost nor in having capability-spikes (at least, if you've been paying attention and not handwaving them away).
The surprise here is perhaps showing how bad GB/DM communications are, that DM may have let GB piss away millions of dollars of TPU time. As one Googler put it, 'we find out about this stuff the same way you do - from Twitter'.
OK, good to know. I look forward to seeing the performance trends updated with the new scaling paradigm/law.
(In terms of the neural network model, this means lowering our estimate for how many parameters will be needed.)
Overall I guess this should shorten timelines, because the effect you explain here is counteracted by the other first-order effect of "oh geez it looks like our earlier scaling projections were inefficient; for any performance level, we now know how to reach that level for less compute cost than the earlier projections said." What do you think?
It ought to shorten actual timelines, for the reason you say. (Except insofar as data sourcing could actually become a practical problem.)
However, it lengthens the Bio Anchors timeline, because the parameter count in Bio Anchors is fixed. (It's the parameter count of a model that uses about as much inference compute as the brain.)
This is a weird thing about Bio Anchors -- it asks when models will cross a threshold for the compute required to run them, so efficiency improvements of various kinds will lengthen its timeline. It's always wait... (read more)
Thanks for writing this, I'm excited to see more work on this subject!
One minor musing: I think the problem is a bit more dire than the framing "who to align to" suggests. Humans are biased, including us, including me. A system which replicates those biases and tells us/me what we would have concluded if we investigated in our usual biased way... is "aligned" in some sense, but in a very important sense is unaligned.* To use Ajeya's metaphor, it's a sycophant, not a saint. Rather than assisting us to find the truth, it'll assist us in becoming more unreas... (read more)
Thanks so much! So, for comparison, fruit flies have more synapses than these XLAND/GOAT agents have parameters! https://en.wikipedia.org/wiki/List_of_animals_by_number_of_neurons
Here is a heavily condensed summary of the takeoff speeds thread of the conversation, incorporating earlier points made by Hanson, Grace, etc. https://objection.lol/objection/3262835
(kudos to Ben Goldhaber for pointing me to it)
I wrote my undergrad thesis on this problem and tentatively concluded it's unsolvable. If you read it and think you have a solution that might satisfy me, I'd love to hear it! Maybe Chalmers (linked by Jacob) solves it, idk.
(For the next few years or so at least. Specialization won't allow us to permanently have a faster trend, I think, but maybe for the next ten years...)
I think the point you make is a good one and should increase your credence in AGI/TAI/etc. by 2030. (Because if you take Ajeya's distribution and add more uncertainty to it, the left tail gets thicker as well as the right). I'd love to see someone expand on Ajeya's spreadsheet to include uncertainty in the flops/$ and 2020flops/flop trends.
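The left-tail point can be illustrated with a toy calculation (illustrative numbers of my own, not Ajeya's actual distribution): hold the median years-to-TAI fixed and widen a lognormal's spread, and the probability of arrival within a short horizon goes up:

```python
from math import log, sqrt, erf

def lognormal_cdf(x, median, sigma):
    """P(X <= x) for a lognormal with the given median and log-space sigma."""
    z = (log(x) - log(median)) / sigma
    return 0.5 * (1 + erf(z / sqrt(2)))

# Toy numbers: median 30 years to TAI; ask P(arrival within 9 years)
for sigma in (0.5, 1.0, 1.5):
    p = lognormal_cdf(9, median=30, sigma=sigma)
    print(f"sigma={sigma}: P(TAI within 9 years) = {p:.1%}")
```

Widening sigma while holding the median fixed raises that probability every time: thickening the distribution fattens the left tail along with the right.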
Re FLOPS-per-dollar trends: My impression is that what we care about is the price of compute specifically for large AGI training runs, which is different from the price of compute in general. (which is what the dataset i... (read more)
Minor update: See e.g. this US government website definitions:
Misinformation is false, but not created or shared with the intention of causing harm.
Disinformation is deliberately created to mislead, harm, or manipulate a person, social group, organization, or country.
Malinformation is based on fact, but used out of context to mislead, harm, or manipulate.
(got this example from Zvi's covid post today)
Also, the recent events with GoFundMe and GiveSendGo are an instance of the trend I predicted with separate tech stacks being developed. (GoFundMe froze and/o... (read more)
Came across this today on r/mlscaling and thought I'd put it here since it's relevant: https://arxiv.org/abs/2201.11903#google
This paper explores the ability of language models to generate a coherent chain of thought—a series of short sentences that mimic the reasoning process a person might have when responding to a question. Experiments show that inducing a chain of thought via prompting can enable sufficiently large language models to better perform reasoning tasks that otherwise have flat scaling curves.
Nothing to apologize for, it was reasonably clear, I'm just trying to learn more about what you believe and why. This has been helpful, thanks!
I totally agree that in fast takeoff scenarios we are less likely to spot those things until it's too late. I guess I agree that truthful LM work is less likely to scale gracefully to AGI in fast takeoff scenarios... so I guess I agree with your overall point... I just notice I feel a bit confused and muddled about it, is all. I can imagine plausible slow-takeoff scenarios in which truthful LM work doesn't scale grac... (read more)
Thanks, these clarifications are very helpful.
FWIW I think Paul-slow takeoff is pretty unlikely for reasons to be found in this thread and this post. On the other hand, as someone who thinks fast takeoff (in various senses) is more likely than not, I don't yet see why that makes Truthful LM work significantly less useful. (By contrast, I totally see why Truthful LM work is significantly less useful if AGI/TAI/etc. comes from stuff that doesn't resemble modern deep learning.)
"Catch misalignment early..." This makes it sound like misalignment is something tha... (read more)
Thanks for this!
I think that working on truthful LMs has a comparative advantage in worlds where:
--We have around 10-40 years until transformative AI
--Transformative AI is built using techniques that resemble modern deep learning
--There is a slow takeoff
--Alignment does not require vastly more theoretical insight (but may require some)
--Our current picture of the risks posed by transformative AI is incomplete
Can you elaborate on what you mean by slow takeoff here?
Also, what do you mean by the current picture of the risks being incomplete? What would it ev... (read more)
If I understand you correctly, you are asking something like: How many programmer-hours of effort and/or how much money was being spent specifically on scaling up large models in 2020? What about in 2025? Is the latter plausibly 4 OOMs more than the former? (You need some sort of arbitrary cutoff for what counts as large. Let's say GPT-3 sized or bigger.)
Yeah maybe, I don't know! I wish I did. It's totally plausible to me that it could be +4 OOMs in this metric by 2025. It's certainly been growing fast, and prior to GPT-3 there may not have been much of it at all.
Thank you for doing this research! There's a lot that I love about this piece, besides the obvious thing which is that it is seriously investigating a very important and neglected topic. For example, I love the definitions of various things, I love the little vignettes of ways the near future could go, and I love the core argument about why these possibilities are both plausible and really bad.
I am pretty much on the same page as you, but I have a few minor disagreements (mostly about emphasis). TL;DR is that you focus on more extreme, exotic possibilities... (read more)
Thanks, nice answers!
I agree it would be good to extend the bio anchors framework to include more explicit modelling of data requirements and the like instead of just having it be implicit in the existing variables. I'm generally a fan of making more detailed, realistic models and this seems reasonably high on the priority list of extensions to make. I'd also want to extend the model to include multiple different kinds of transformative task and dangerous task, and perhaps to include interaction effects between them (e.g. once we get R&D acceleration t... (read more)
Awesome post! I'll have more to say later, but for now, check out this experiment I ran with GPT-3:
Inspired by this bit of the post:
If unfavorable regulation is threatened, companies use their widespread companion bots to sway public opinion, making people feel sympathetic for their AI companion who ‘is afraid of getting modified or shut down’ by some regulation.
I decided to ask GPT-3 in chatbot mode what it thought about regulation like this. I did 5 trials; tl;dr is that GPT-3 supported regulation twice and opposed it twice and got confused once.
What l... (read more)
I occasionally hear people make this point but it really seems wrong to me, so I'd like to hear more! Here are the reasons it seems wrong to me:
1. Data generally seems to be the sort of thing you can get more of by throwing money at. It's not universally true but it's true in most cases, and it only needs to be true for at least one transformative or dangerous task. Moreover, investment in AI is increasing; when a tech company is spending $10,000,000,000 on compute for a single AI training run, they can spend 10% as much money to hire 2,000 trained profess... (read more)
Yes, good questions, but I think there are convincing answers. Here's a shot:
1. Some kinds of data can be created this way, like parallel corpora for translation or video annotated with text. But I think it's selection bias that it seems like most cases are like this. Most of the cases we're familiar with seem like this because this is what's easy to do! But transformative tasks are hard, and creating data that really contains latent in it the general structures necessary for task performance, that is also hard. I'm not saying research can't solve it, but ... (read more)
Fair. If you don't share my intuition that people in 1950 should have had more than 90% credence that computers would be militarily useful, or that people at the dawn of steam engines should have predicted that automobiles would be useful (conditional on them being buildable) then that part of my argument has no force on you.
Maybe instead of picking examples from the past, I should pick an example of a future technology that everyone agrees is 90%+ likely to be super useful if developed, even though Joe's skeptical arguments can still be made.
Thanks! Ah, I shouldn't have put the word "portable" in there then. I meant to be talking about computers in general, not computers-on-missiles-as-opposed-to-ground-installations.
Also, the whole setup of picking something which we already know to be widespread (cars) and then applying Joe's arguments to it seems like it shouldn't tell us much. If Joe were saying 1% yes, 99% no for incentives to build APS systems, then the existence of counterexamples like cars which have similar "no" arguments would be compelling. But he's saying 80% yes, 20% no, and so
Musings on ways in which the analogy is deep, or at least seems deep to me right now: (It's entirely possible my pattern-recognizer is overclocking and misfiring here... this is fun to think about though)
Automobiles move the cargo to the target. To do this, they move themselves. Agents improve the world according to some evaluation function. To do this, they improve their own power/capability/position.
You often don't know what's inside an automobile until the thing reaches its destination and gets unloaded. Similarly, you often don't know what's inside an ... (read more)
I guess I would say: Ajeya's framework/model can incorporate this objection; this isn't a "get rid of the whole framework" objection but rather a "tweak the model in the following way" objection.
Like, I agree that it would be bad if everyone who used Ajeya's model had to put 100% of their probability mass into the six bio anchors she chose. That's super misleading/biasing/ignores loads of other possible ways AGI might happen. But I don't think of this as a necessary part of Ajeya's model; when I use it, I throw out the six bio anchors and just directly input my probability distribution over OOMs of compute. My distribution is informed by the bio anchors, of course, but that's not the only thing that informs it.
Thanks for this comment (and the other comment below also).
I think we don't really disagree that much here. I may have just poorly communicated, slash maybe I'm objecting to the way Yudkowsky said things because I read it as implying things I disagree with.
I don't think this is what Yudkowsky is saying at all in the post. Actually, I think he is saying the exact opposite: that 2.5 years estimate is too fast as an estimate that is supposed to always work. If I understand correctly, his point is that you have significantly less than that most of the time, ex
Strongly disagree with this, to the extent that I think this is probably the least cruxy topic discussed in this post, and thus the comment is as wrong as is physically possible.
Hahaha ok, interesting! If you are right I'll take some pride in having achieved that distinction. ;)
I interpreted Yudkowsky as claiming that Ajeya's model had enough free parameters that it could be made to predict a wide range of things, and that what was actually driving the 30-year prediction was a bunch of implicit biases rather than reality. Platt's Law is evidence for this c... (read more)
(I am the author)
I still like & stand by this post. I refer back to it constantly. It does two things:
1. Argue that an AI-induced point of no return could come significantly before, or significantly after, world GDP growth accelerates--and indeed will probably come before!
2. Argue that we shouldn't define timelines and takeoff speeds in terms of economic growth. So, against "is there a 4 year doubling before a 1 year doubling?" and against "When will we have TAI = AI capable of doubling the economy in 4 years if deployed?"
I think both things are pretty impo... (read more)
I still like & endorse this post. When I wrote it, I hadn't read more than the wiki articles on the subject. But then afterwards I went and read 3 books (written by historians) about it, and I think the original post held up very well to all this new info. In particular, the main critique the post got -- that disease was more important than I made it sound, in a way that undermined my conclusion -- seems to have been pretty wrong. (See e.g. this comment thread, these follow-up posts)
So, why does it matter? What contribution did this po... (read more)
Adding to what Paul said: jacob_cannell points to this comment which claims that in Mind Children Moravec predicted human-level AGI in 2028.
Moravec, "Mind Children", page 68: "Human equivalence in 40 years". There he is actually talking about human-level intelligent machines arriving by 2028 - not just the hardware you would theoretically require to build one if you had the ten million dollars to spend on it.
I just went and skimmed Mind Children. He's predicting human-equivalent computational power on a personal computer in 40 years. He seems to say tha... (read more)
We all saw the GPT performance scaling graphs in the papers, and we all stared at them and imagined extending the trend for another five OOMs or so... but then Lanrian went and actually did it! Answered the question we had all been asking! And rigorously dealt with some technical complications along the way.
I've since referred to this post a bunch of times. It's my go-to reference when discussing performance scaling trends.
Ajeya's timelines report is the best thing that's ever been written about AI timelines imo. Whenever people ask me for my views on timelines, I go through the following mini-flowchart:
1. Have you read Ajeya's report?
--If yes, launch into a conversation about the distribution over 2020's training compute and explain why I think the distribution should be substantially to the left, why I worry it might shift leftward faster than she projects, and why I think we should use it to forecast AI-PONR instead of TAI.
--If no, launch into a conversation about Ajey... (read more)
Ok, thanks! I defer to your judgment on this, you clearly know way more than me. Oh well, there goes one of my hopes for the price of compute reaching a floor.
This post is the best overview of the field so far that I know of. I appreciate how it frames things in terms of outer/inner alignment and training/performance competitiveness--it's very useful to have a framework with which to evaluate proposals and this is a pretty good framework I think.
Since it was written, this post has been my go-to reference both for getting other people up to speed on what the current AI alignment strategies look like (even though this post isn't exhaustive). Also, I've referred back to it myself several times. I learned a lot from... (read more)
What do you think about energy costs? Last I thought about this it seemed plausible to me that in ten years or so the atoms making up supercomputers will be <10% of the cost of training giant models, most of the cost being paying for the electricity and upkeep and rent.
Thanks! Wow I missed/forgot that 30% figure, my bad. I disagree with you much less than I thought! (I'm more like 70% instead of 30%). [ETA: Update: I'm going with the intuitive definition of takeoff speeds here, not the "doubling in 4 years before 1 year?" one. For my thoughts on how to define takeoff speeds, see here. If GWP doubling times is the definition we go with then I'm more like 85% fast takeoff I think, for reasons mentioned by Rob Bensinger below.]
1. Everyone agrees that if we have less than 10 years left before the end, it's probably not going to look like the multi-year, gradual, distributed takeoff Paul prophecies, and instead will look crazier, faster, more discontinuous, more Yudkowskian... right? In other words, everyone agrees <10-year timelines and Paul-slow takeoff are in tension with each other.*
2. Assuming we agree on 1, I'd be interested to hear whether people think we should resolve this tension by having low credence in <10 year timelines, or not having low credence in Yudkowskia... (read more)
I still expect things to be significantly more gradual than Eliezer, in the 10 year world I think it will be very fast but we still have much tighter bounds on how fast (maybe median is more like a year and very likely 2+ months). But yes, the timeline will be much shorter than my default expectation, and then you also won't have time for big broad impacts.
I don't think you should have super low credence in fast takeoff. I gave 30% in the article that started this off, and I'm still somewhere in that ballpark.
Perhaps you think this implies a "low credence" in <10 year timelines. But I don't really think the arguments about timelines are "solid" to the tune of 20%+ probability in 10 years.
Like Gwern said, the goal is to win. If EfficientZero gets superhuman data-efficiency by "cheating," well it still got superhuman data-efficiency...
I think a relevant comparison here would be total cost. EfficientZero took 7 hours on 4 GPUs to master each particular Atari game. Equipment probably cost around $10,000 last I checked. How long is the lifespan of a GPU? Two years? OK, so that's something like 20,000 hours of the 4 GPUs' time for $10K, so $0.50 for one hour, so $3.50 for the training run? Eh, maybe it's a bit more than that due to energy costs or s... (read more)
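The back-of-envelope above, spelled out (same rough inputs as in the text, amortizing only hardware and ignoring energy/upkeep):

```python
# Back-of-envelope for the marginal cost of one EfficientZero training run.
# All inputs are rough guesses from the comment above.
equipment_cost = 10_000        # 4-GPU rig, USD
lifespan_hours = 20_000        # ~2 years of continuous use, rounded up
train_hours = 7                # per Atari game

cost_per_hour = equipment_cost / lifespan_hours   # amortized rig-hour
run_cost = train_hours * cost_per_hour
print(f"${cost_per_hour:.2f}/hour -> ${run_cost:.2f} per training run")
# -> $0.50/hour -> $3.50 per training run
```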
These are three separate things:
(a) What is the meaning of "2020-FLOPS-equivalent that TAI needs?"
(b) Can you build TAI with 2020 algorithms without some truly astronomical amount of FLOPs?
(c) Why should we believe the "neural anchor?"
(a) is answered roughly in my linked post and in much more detail and rigor in Ajeya's doc.
(b) depends on what you mean by truly astronomical; I think it would probably be doable for 10^35, Ajeya thinks 50% chance.
For (c), I actually don't think we should put that much weight on the "neural anchor," and I don't think Ajeya's... (read more)
What is the meaning of "2020-FLOPS-equivalent that TAI needs"? Plausibly you can't build TAI with 2020 algorithms without some truly astronomical amount of FLOPs.
I think 10^35 would probably be enough. This post gives some intuition as to why, and also goes into more detail about what 2020-flops-equivalent-that-TAI-needs means. If you want even more detail + rigor, see Ajeya's report. If you think it's very unlikely that 10^35 would be enough, I'd love to hear more about why -- what are the blockers? Why would OmegaStar, SkunkWorks, etc. described in the p... (read more)
OK, here are my guesses, without seeing anyone else's answers. I think I'm probably wrong, which is why I'm asking this question:
1.a. Underdetermined? It depends on what we mean by the outer objective, and what we mean when we assume it has no inner alignment problems? See e.g. this discussion. That said, yeah it totally seems possible. If the part that predicts reward gets good at generalizing, it should be able to reason/infer/guess that hacking the reward function would yield tons of reward. And then that's what the agent would do.
1.b. Yes? Even though ... (read more)