Matthew Barnett

Someone who is interested in learning and doing good.

My Substack: https://matthewbarnett.substack.com/

Comments

Matthew Barnett's Shortform

I have mixed feelings and some rambly personal thoughts about the bet Tamay Besiroglu and I proposed a few days ago. 

The first thing I'd like to say is that we intended it as a bet, and only a bet, and yet some people seem to be treating it as if we had made an argument. Personally, I am uncomfortable with the suggestion that our post was "misleading" because we did not present an affirmative case for our views.

I agree that LessWrong culture benefits from arguments as well as bets, but it seems a bit weird to demand that every bet come with an argument attached. A norm that all bets must come with arguments would substantially dampen the incentive to make bets, because then, each time, people would have to spend what would likely be many hours painstakingly outlining their views on the subject.

That said, I do want to reply to people who say that our post was misleading on other grounds. Some said that we should have made different bets, or offered different odds. In response, I can only say that coming up with good concrete bets about AI timelines is actually really damn hard, and so if you wish to come up with alternatives, you can be my guest. I tried my best, at least.

More people said that our bet was misleading because it would seem that we (Tamay and I) ourselves implicitly believe in short timelines, since our bets amounted to the claim that AGI has a substantial chance of arriving in 4-8 years. However, I do not think this is true.

The type of AGI that we should be worried about is one that is capable of fundamentally transforming the world. To generalize a bit: fast takeoff folks believe that we will only need a minimal seed AI capable of rewriting its own source code and recursively self-improving into superintelligence. Slow takeoff folks believe that we will need something capable of automating a wide range of labor.

Given the fast takeoff view, it is totally understandable to think that our bets imply a short timeline. However (and I'm only speaking for myself here), I don't believe in a fast takeoff. I think there's a huge gap between AI doing well on a handful of benchmarks and AI fundamentally reshaping the economy. At the very least, AI has been doing well on a ton of benchmarks since 2012. Each time AI excels on one benchmark, a new one is usually invented that's a bit tougher, and hopefully gets us a little closer to measuring what we actually mean by general intelligence.

In the near future, I hope to write a much longer and more nuanced post expanding on my thoughts on this subject, hopefully making it clear that I do care a lot about making real epistemic progress here. I'm not just trying to signal that I'm a calm and arrogant long-timelines guy who raises his nose at the panicky short-timelines people, though I understand how my recent post could have given that impression.

A comment on Ajeya Cotra's draft report on AI timelines

Thanks for the thoughtful reply. Here's my counter-reply.

You frame my response as indicating "disagreements". But my tweet said "I broadly agree" with you, and merely pointed out ways that I thought your statements were misleading. I do just straight up disagree with you about two specific non-central claims you made, which I'll get to later. But I'd caution against interpreting me as disagreeing with you by any degree greater than what is literally implied by what I wrote.

Before I get to the specific disagreements, I'll just bicker about some points you made in response to me. I think this sort of quibbling could last forever and it would serve little purpose to continue past this point, so I release you from any obligation you might think you have to reply to these points. However, you might still enjoy reading my response here, just to understand my perspective in a long-form non-Twitter format.

Note: I continued to edit my response after I clicked "submit", after realizing a few errors of mine. Apologies if you read an erroneous version.

My quibbles with what you wrote

You said,

  • Barnett's critique doesn't propose an alternative trajectory of hardware progress he thinks is more likely, or spell out what that would mean for the overall forecasts, besides saying that the doubling time has been closer to 3.5 years recently.
  • The Bio Anchors report includes a conservative analysis that assumes a 3.5 year doubling time with (I think more importantly) a cap on overall hardware efficiency that is only 4 orders of magnitude higher than today's, as well as a number of other assumptions that are more conservative than the main Bio Anchors report's; and all of this still produces a "weighted average" best guess of a 50% probability of transformative AI by 2100, with only one of the "anchors" (the "evolution anchor," which I see as a particularly conservative soft upper bound) estimating a lower probability.

The fact that the median for the conservative analysis is right at 2100 (which is indeed part of the 21st century) means that when you said, "You can run the bio anchors analysis in a lot of different ways, but they all point to transformative AI this century", you were technically correct, by the slimmest of margins.

I had the sense that many people might interpret your statement as indicating a higher degree of confidence; that is, maybe something like "even the conservative analysis produces a median prediction well before 2100." 

Maybe no one misinterpreted you like that! 

It's very reasonable for you to think that no one would have misinterpreted you. But that incorrect interpretation is, at least, the one I remember having at the time I read the sentence.

You also wrote,

This is simply an opinion, and I hope to gain more clarity over time as more effort is put into this question, but I'll give one part of the intuition: I think that conditional on hardware efficiency improvements coming in on the low side, there will be more effort put into increasing efficiency via software and/or via hybrid approaches (e.g., specialized hardware for the specific tasks at hand; optimizing researcher-time and AI development for finding more efficient ways to use compute). So reacting to Bio Anchors by saying "I think the hardware projections are too aggressive; I'm going to tweak them and leave everything else in place" doesn't seem like the right approach.

I intend to produce fuller thoughts on this point in the coming months. In short: I agree that we shouldn't tweak the hardware projections and leave everything else in place. On the other hand, it seems wrong to me to expect algorithmic efficiency to improve faster as hardware progress slows. While it's true there would be more pressure to innovate, there would also be less hardware progress available for testing innovations, which is arguably one of the main bottlenecks to software innovation.

You then said,

I was referring to https://www.metaculus.com/questions/5121/when-will-the-first-artificial-general-intelligence-system-be-devised-tested-and-publicly-known-of-stronger-operationalization/ and https://www.metaculus.com/questions/3479/when-will-the-first-artificial-general-intelligence-system-be-devised-tested-and-publicly-known-of/ , which seem more germane than the link Barnett gives in the tweet above.

I am confused why you think my operationalization for timing transformative AI seems less relevant than a generic question about timing AGI (note that I am the author of one of the questions you linked).

My operationalization of transformative AI is the standard operationalization used in Open Philanthropy reports, such as Tom Davidson's report here, in which he wrote,

This report evaluates the likelihood of ‘explosive growth’, meaning > 30% annual growth of gross world product (GWP), occurring by 2100.

Davidson himself refers to Ajeya Cotra, writing,

In her draft report, my colleague Ajeya Cotra uses TAI to mean ‘AI which drives Gross World Product (GWP) to grow at ~20-30% per year’ – roughly ten times faster than it is growing currently.
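(As a quick arithmetic sanity check on the "roughly ten times faster" comparison, here's a minimal sketch; the ~3% figure for current GWP growth is my own round number, not anything taken from the report.)

```python
# Minimal sketch: doubling times implied by ~3% GWP growth (my rough figure for
# the current rate) versus the ~30% "explosive growth" threshold quoted above.
import math

def doubling_time(annual_growth_rate):
    """Years for GWP to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth_rate)

print(round(doubling_time(0.03), 1))  # ~23.4 years at roughly current growth
print(round(doubling_time(0.30), 1))  # ~2.6 years at the explosive-growth threshold
```

The gap between those two doubling times is the sense in which this operationalization picks out a genuinely transformed economy.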

I agree with what you write here,

There are many ways transformative AI might not be reflected in economic growth figures, e.g. if economic growth figures don't include digital economies; if misaligned AI derails civilization; or if growth is deliberately held back, perhaps with AI help, in order to buy more time for improving things like AI alignment.

However, it's not clear to me that the questions you linked are free of drawbacks of equal or greater severity. To clarify, I merely said that "Metaculus has no consensus position on transformative AI", and I think that statement is borne out by the link I gave.

Actual disagreements between us

Now I get to the real disagreements we have/had.

I replied to your statement "Specific arguments for “later than 2100,” including outside-view arguments, seem reasonably close to nonexistent" by pointing to my own analysis, which produced three non-outside view arguments for longer timelines.

You defended your statement as follows,

I'm going to stand by my statement here - these look to be simply ceteris paribus reasons that AI development might take longer than otherwise. I'm not seeing a model or forecast integrating these with other considerations and concluding that our median expectation should be after 2100. (To be clear, I might still stand by my statement if such a model or forecast is added - my statement was meant as an abbreviated argument, and in that sort of context I think it's reasonable to say "reasonably close to nonexistent" when I mean something like "There aren't arguments of this form that have gotten a lot of attention/discussion/stress-testing and seem reasonably strong to me or, I claim, a reasonable disinterested evaluator.")

I have a few things to say here,

  1. "these look to be simply ceteris paribus reasons that AI development might take longer than otherwise" does not back up your actual claim, which was that specific arguments seem reasonably close to nonexistent. It's not clear to me how you're using "ceteris paribus" in that sentence, but ceteris paribus is not the same as non-specific which was what I responded to.
  2. I don't think I need to build an explicit probabilistic model in order to gesture at a point. It seems reasonably clear to me that someone could build a model using the arguments I gave, which would straightforwardly put more probability mass on dates past 2100 (even if the median is still <= 2100). But you're right that, since this model has yet to be built, it's uncertain how much of an effect these considerations will have on eventual AI timelines.

In response to my claim that "[Robin Hanson's] most recent public statements have indicated that he thinks AI is over a century away", you said,

I think the confusion here is whether ems count as transformative AI.

No, that's not the confusion, but I can see why you'd think that's the confusion. I made a mistake by linking the AI Impacts interview with Robin Hanson, which admittedly did not support my claim.

In fact, someone replied to the very tweet you criticize with the same objection as the one you gave. They said,

In his "when Robots rule the Earth" book he seems to think said robots will be there "sometime in the next century or so".

I replied,

Yes, he seems to have longer timelines now.

And Robin Hanson liked my tweet, which, as far as I can tell, is a strong endorsement of my correctness in this debate.

Conversation on technology forecasting and gradualism

Transistors: Wikipedia claims "the MOSFET was also initially slower and less reliable than the BJT", and further discussion seems to suggest that its benefits were captured with further work and effort (e.g. it was a twentieth the size of a BJT by the 1990s, decades after invention). This sounds like it wasn't a discontinuity to me.

I am also skeptical that the MOSFET produced a discontinuity. Plausibly, what we care about is the number of computations we can do per dollar. Nordhaus (2007) provides data showing that the rate of progress on this metric was practically unchanged at the time the MOSFET was invented, in 1959.

Late 2021 MIRI Conversations: AMA / Discussion

I think you sufficiently addressed my confusion, so you don't need to reply to this comment, but I still had a few responses to what you said.

What does this mean? On my understanding, singularities don't proceed at fixed rates?

No, I agree. But growth is generally measured over an interval, and in the original comment I proposed the interval of one year during the peak rate of economic growth. To allay your concern that a 25% growth rate indicates we didn't experience a singularity: I meant halving the growth rate during the peak economic growth year in our future, regardless of how fast that rate turns out to be.

I agree that in practice there will be some maximum rate of GDP growth, because there are fundamental physical limits (and more tight in-practice limits that we don't know), but it seems like they'll be way higher than 25% per year.

The 25% figure was totally arbitrary. I didn't mean it as any sort of prediction. I agree that an extrapolation from biological growth implies that we can and should see >1000% growth rates eventually, though it seems plausible that we would coordinate to avoid that.

If you actually mean halving the peak rate of GDP growth during the singularity, and a singularity actually happens, then I think it doesn't affect my actions at all; all of the relevant stuff happened well before we get to the peak rate.

That's reasonable. A separate question might be whether the growth rate over the entire period from now until the peak will be cut in half.

Let's say for purposes of argument I think 10% chance of extinction, and 90% chance of "moderate costs but nothing terrible". Which of the following am I supposed to have updated to?

I think the way you're bucketing this into "costs if we go extinct" and "costs if we don't go extinct" is reasonable. But one could also think that the disvalue of extinction is more continuous with disvalue in non-extinction scenarios, which makes things a bit more tricky. I hope that makes sense.

Late 2021 MIRI Conversations: AMA / Discussion

Thanks for your response. :)

I'm confused by the question. If the peak rate of GWP growth ever is 25%, it seems like the singularity didn't happen?

I'm a little confused by your confusion. Let's say you currently think the singularity will proceed at a rate of R. The spirit of what I'm asking is: what would you change if you learned that it will instead proceed at a rate of R/2? (Maybe picking specific numbers for the peak rate of growth just made things more confusing.) For me at least, I'd probably expect a lot more oversight, since people would have more time to adjust to what's happening in the world around them.

No effect. Averting 50% of an existential catastrophe is still really good.

I'm also a little confused about this. My exact phrasing was, "You learn that the cost of misalignment is half as much as you thought, in the sense that slightly misaligned AI software imposes costs that are half as much (ethically, or economically), compared to what you used to think." I assume you don't think that slightly misaligned software will, by default, cause extinction, especially if it's acting alone and is economically or geographically isolated.

We could perhaps view this through an analogy. War is really bad: so bad that maybe it would even cause our extinction (if, say, we have some really terrible nuclear winter). But by default, I don't expect war to cause humanity to go extinct. And so, if someone asked me about a scenario in which the costs of war are only half as much as I thought, it would probably update me away from thinking we need to take actions to prevent war. The magnitude of that update might not be large, but understanding exactly how much we'd update and change our strategy in light of this information is the type of thing I'm asking for.

Late 2021 MIRI Conversations: AMA / Discussion

This question is not directed at anyone in particular, but I'd want to hear some alignment researchers answer it. As a rough guess, how much would it affect your research (in the sense of changing your priorities, altering your strategy for impact, or changing your method of attack on the problem) if you made any of the following epistemic updates?

(Feel free to disambiguate anything here that's ambiguous or poorly worded.)

  1. You update to think that AI takeoff will happen twice as slowly as your current best estimate, e.g. instead of the peak rate of yearly GWP growth being 50% during the singularity, you learn it's only going to be 25%. Alternatively, update to think that AI takeoff will happen twice as quickly, e.g. the peak rate of GWP growth will be 100% rather than 50%.
  2. You learn that transformative AI will arrive in half the time you currently think it will take, from your current median, e.g. in 15 years rather than 30. Alternatively, you learn that transformative AI will arrive in twice the time you currently think it will take.
  3. You learn that power differentials will be twice as imbalanced during AI takeoff compared to your current median. That is, if we could measure the relative levels of "power" of agents in the world, you learn that the Gini coefficient of this power distribution will be twice as unequal as in your current median scenario, in the sense that world dynamics will look more unipolar than multipolar; local, rather than global. Alternatively, you learn the opposite.
  4. You learn that the cost of misalignment is half as much as you thought, in the sense that slightly misaligned AI software imposes costs that are half as much (ethically, or economically), compared to what you used to think. Alternatively, you learn the opposite.
  5. You learn that the economic cost of aligning an AI to your satisfaction is half as much as you thought, for a given AI, e.g. it will take a team of 20 full-time workers writing test-cases, as opposed to 40 full-time workers of equivalent pay. Alternatively, you learn the opposite.
  6. You learn that the requisite amount of "intelligence" needed to discover a dangerous x-risk-inducing technology is half as much as you thought, e.g. someone with an IQ of 300, as opposed to 600 (please interpret charitably), could by themselves figure out how to deploy full-scale molecular nanotechnology, of the type required to surreptitiously inject botulinum toxin into the bloodstreams of everyone, making us all fall over and die. Alternatively, you learn the opposite.
  7. You learn that either government or society in general will be twice as risk-averse, in the sense of reacting twice as strongly to potential AI dangers, compared to what you currently think. Alternatively, you learn the opposite.
  8. You learn that the AI paradigm used during the initial stages of the singularity—when the first AGIs are being created—will be twice as dissimilar from the current AI paradigm, compared to what you currently think. Alternatively, you learn the opposite.

Reply to Eliezer on Biological Anchors

Unless I’m mistaken, the Bio Anchors framework explicitly assumes that we will continue to get algorithmic improvements, and even tries to estimate and extrapolate the trend in algorithmic efficiency. It could of course be that progress in reality will turn out a lot faster than the median trendline in the model, but I think that’s reflected by the explicit uncertainty over the parameters in the model.

In other words, Holden’s point about this framework being a testbed for thinking about timelines remains unscathed if there is merely more ordinary algorithmic progress than expected.
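(For concreteness, here is a minimal sketch of the kind of extrapolation I have in mind; the specific numbers are illustrative placeholders of mine, not the report's actual parameters.)

```python
# Illustrative only: if algorithmic progress halves the compute required for a
# given capability every `halving_time` years, the effective requirement falls
# exponentially. The 1e30 FLOP figure and 2.5-year halving time are placeholders.
def required_compute(initial_flop, years_elapsed, halving_time):
    """Compute requirement after accounting for algorithmic efficiency gains."""
    return initial_flop * 0.5 ** (years_elapsed / halving_time)

print(required_compute(1e30, 10, 2.5))  # ~6.25e28 FLOP after a decade
```

The framework's explicit uncertainty over parameters like that halving time is where faster-than-expected ordinary progress would get absorbed.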

Reply to Eliezer on Biological Anchors

While Matthew says "Overall I find the law to be pretty much empirically validated, at least by the standards I'd expect from a half in jest Law of Prediction," I don't agree: I don't think an actual trendline on the chart would be particularly close to the Platt's Law line. I think it would, instead, predict that Bio Anchors should point to longer timelines than 30 years out.

I ran OLS regression on the data, and this was the result. Platt's law is in blue.

I agree this trendline doesn't look great for Platt's law, and that it backs up your observation by predicting that Bio Anchors should point more than 30 years out.

However, OLS is notoriously sensitive to outliers. If, rather than using some more robust regression algorithm, we instead super arbitrarily eliminate all predictions after 2100, then we get this, which doesn't look absolutely horrible for the law. Note that the median forecast is 25 years out.
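(To illustrate the outlier-sensitivity point, here's a minimal sketch with made-up placeholder forecasts, not the actual dataset behind the charts above: a robust estimator such as Theil-Sen barely moves in response to a single extreme prediction, while OLS gets dragged toward it.)

```python
# Minimal sketch with placeholder data (not the real forecast dataset):
# compare OLS against a robust Theil-Sen fit when one extreme forecast is present.
import numpy as np
from sklearn.linear_model import LinearRegression, TheilSenRegressor

# Hypothetical (publication_year, predicted_years_until_AGI) pairs.
# Platt's law would predict roughly 30 in the second column for every entry.
pub_year = np.array([1960, 1972, 1985, 1993, 2000, 2006, 2012, 2016, 2020])
years_out = np.array([20, 30, 35, 25, 40, 30, 28, 500, 25])  # one extreme outlier

X = pub_year.reshape(-1, 1)
ols = LinearRegression().fit(X, years_out)
robust = TheilSenRegressor(random_state=0).fit(X, years_out)

print("OLS slope:      ", ols.coef_[0])    # dragged upward by the outlier
print("Theil-Sen slope:", robust.coef_[0])  # much closer to flat
```

That's the intuition behind trying either a robust fit or the (admittedly arbitrary) trimming of post-2100 predictions.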

Considerations on interaction between AI and expected value of the future

I interpreted steven0461 to be saying that many apparent "value disagreements" between humans turn out, upon reflection, to be disagreements about facts rather than values. It's the classic conflict-theory vs. mistake-theory issue: people are interpreted as having different values because they favor different strategies, even if everyone shares the same values.

Biology-Inspired AGI Timelines: The Trick That Never Works

Personally, I had mixed feelings about the dialogue. I like the writing style, and I think Eliezer is a great writer with a lot of good opinions and arguments, which made it enjoyable.

But at the same time, it felt like he was taking down a strawman. Maybe you'd label that part of "conflict aversion", but I tend to react negatively to takedowns of straw-people who happen to agree with me.

To give an unfair and exaggerated comparison, it would be a bit like reading a takedown of a straw-rationalist in which the straw-rationalist occasionally insists on things like "we should not be emotional" or "we should always use Bayes' Theorem in every problem we encounter." It should hopefully be easy to see why a rationalist might react negatively to reading that sort of dialogue.
