Individual humans do make out much better when they get to select between products from competing companies rather than monopolies, benefitting from companies going out of their way to demonstrate when their products are verifiably better than rivals'. Humans get treated better by sociopathic powerful politicians and parties when those politicians face the threat of election rivals (e.g. no famines). Small states get treated better when multiple superpowers compete for their allegiance. Competitive science with occasional refutations of false claims produces much more truth for science consumers than intellectual monopolies. Multiple sources with secret information are more reliable than one.

It's just routine for weaker, less sophisticated parties to do better in both assessment of choices and realized outcomes when multiple better-informed or more powerful parties compete for their approval vs just one monopoly/cartel.
Also, a flaw in your analogy is that schemes that use AIs as checks and balances on each other don't mean more AIs. The choice is not between monster A and monsters A plus B, but between two copies of monster A (or a double-size monster A), and a split of one A and one B, where we hold something of value that we can use to help throw the contest to either A or B (or successors further evolved to win such contests). In the latter case there's no more total monster capacity, but there's greater hope of our influence being worthwhile and selecting the more helpful winner (which we can iterate some number of times).
I think this claim is true, on account of gray goo and lots of other things, and I suspect Eliezer does too, and I’m pretty sure other people disagree with this claim.
If you have robust alignment, or AIs that are rapidly bootstrapping their level of alignment fast enough to outpace the danger of increased capabilities, aligned AGI could get through its intelligence explosion to get radically superior technology and capabilities. It could also get a hard start on superexponential replication in space, so that no follower could ever catch up, and enough tech and military hardware to neutralize any attacks on it (and block attacks on humans via nukes, bioweapons, robots, nanotech, etc). That wouldn't work if there are things like vacuum collapse available to attackers, but we don't have much reason to expect that from current science, and the leading aligned AGI would find out first.

That could be done without any violation of the territory of other sovereign states. The legality of grabbing space resources is questionable in light of the Outer Space Treaty, but commercial exploitation of asteroids is in the Overton window. The superhuman AGI would also be in a good position to persuade and trade with any other AGI developers.
Again I think this claim is true, and I suspect Eliezer does too. In fact, my guess is that there are already single GPU chips with enough FLOP/s to run human-level, human-speed, AGI, or at least in that ballpark.
An A100 may have humanlike FLOP/s but has only 80 GB of memory, probably orders of magnitude less memory per operation than brains. Stringing together a bunch of them makes it possible to split up human-size models and run them faster/in parallel on big batches using the extra operations.
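As a rough illustration of the memory point, here is a minimal sketch of how many 80 GB devices are needed just to hold a model's weights. The parameter counts and the 2-bytes-per-parameter figure are hypothetical placeholders, not estimates of brain-equivalent model size; only the 80 GB capacity comes from the comment above.

```python
import math

A100_MEMORY_BYTES = 80e9  # 80 GB per device, the figure quoted above


def gpus_needed(n_params: float, bytes_per_param: float = 2.0) -> int:
    """Minimum number of 80 GB devices needed to hold the weights alone.

    bytes_per_param=2.0 assumes 16-bit weights (an assumption for this sketch);
    activations, caches, and optimizer state are ignored.
    """
    return math.ceil(n_params * bytes_per_param / A100_MEMORY_BYTES)


if __name__ == "__main__":
    # Hypothetical model sizes, chosen only to show the scaling.
    for n_params in (1e11, 1e12, 1e14):
        print(f"{n_params:.0e} parameters: {gpus_needed(n_params)} GPUs")
```

The count grows linearly with model size, which is why memory, rather than raw FLOP/s, can be the binding constraint on a single chip.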
Some more points about this action:
Agreed, and versions of them exist in human governments trying to maintain control (where non-coordination of revolts is central). A lot of the differences are about exploiting new capabilities like copying and digital neuroscience or changing reward hookups.

In ye olde times of the early 2010s people (such as I) would formulate questions about what kind of institutional setups you'd use to get answers out of untrusted AIs (asking them separately to point out vulnerabilities in your security arrangement, having multiple AIs face fake opportunities to whistleblow on bad behavior, randomized richer human evaluations to incentivize behavior on a larger scale).
"Overall these estimates imply a timeline of [372 years](https://aiimpacts.org/surveys-on-fractional-progress-towards-hlai/)."

That was only for Hanson's convenience sample; other surveys using the method gave much shorter timelines, as discussed in the post.
But new algorithms also don't work well on old hardware. That's evidence in favor of Paul's view that much software work is adapting to exploit new hardware scales.
A perfectly correlated time series of compute and labor would not let us say which had the larger marginal contribution, but we have resources to get at that, which I was referring to with 'plausible decompositions.' This includes experiments with old and new software and hardware, like the chess ones Paul recently commissioned, and studies by AI Impacts, OpenAI, and Neil Thompson. There are AI scaling experiments, and observations of the results of shocks like the end of Dennard scaling, the availability of GPGPU computing, and Besiroglu's data on the relative predictive power of compute and labor in individual papers and subfields.

In different ways those tend to put hardware as driving more log improvement than software (with both contributing), particularly if we consider software innovations downstream of hardware changes.
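To make "log improvement" concrete, here is a minimal growth-accounting sketch. It assumes a Cobb-Douglas form in which log progress is a weighted sum of log input growth. The elasticities and growth factors are hypothetical placeholders, not figures from any of the studies mentioned. As noted above, those weights cannot be read off a perfectly correlated time series and would have to come from experiments or shocks.

```python
import math


def log_contributions(growth_factors, elasticities):
    """Per-input contribution to log10 progress under an assumed Cobb-Douglas form.

    contribution_i = elasticity_i * log10(growth_factor_i)
    """
    return {
        name: elasticities[name] * math.log10(factor)
        for name, factor in growth_factors.items()
    }


def shares(contributions):
    """Fraction of total log progress attributed to each input."""
    total = sum(contributions.values())
    return {name: value / total for name, value in contributions.items()}


if __name__ == "__main__":
    # Hypothetical inputs: compute grows by six orders of magnitude, labor by two.
    growth = {"compute": 1e6, "labor": 1e2}
    # Hypothetical equal elasticities, chosen only for illustration.
    weights = {"compute": 0.5, "labor": 0.5}
    print(shares(log_contributions(growth, weights)))
```

Even with equal weights, the input that grew by more orders of magnitude accounts for most of the log progress in this toy setup. Changing the assumed elasticities changes the split, which is why the empirical decompositions matter.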
Progress in AI has largely been a function of increasing compute, human software research efforts, and serial time/steps. Throwing more compute at researchers has improved performance both directly and indirectly (e.g. by enabling more experiments, refining evaluation functions in chess, training neural networks, or making algorithms that work best with large compute more attractive).

Historically compute has grown by many orders of magnitude, while human labor applied to AI and supporting software has grown by only a few. And on plausible decompositions of progress (allowing for adjustment of software to current hardware and vice versa), hardware growth accounts for more of the progress over time than human labor input growth.

So if you're going to use an AI production function for tech forecasting based on inputs (which do relatively OK by the standards of tech forecasting), it's best to use all of compute, labor, and time, but it makes sense for compute to have pride of place and take in more modeling effort and attention, since it's the biggest source of change (particularly when including software gains downstream of hardware technology and expenditures).

Thinking about hardware has a lot of helpful implications for constraining timelines:
These are all useful things highlighted by Ajeya's model, and by earlier work like Moravec's. In particular, I think Moravec's forecasting methods are looking pretty good, given the difficulty of the problem. He and Kurzweil (like the computing industry generally) were surprised by the death of Dennard scaling and general price-performance of computing growth slowing, and we're definitely years behind his forecasts in AI capability, but we are seeing a very compute-intensive AI boom in the right region of compute space. Moravec also did anticipate it would take a lot more compute than one lifetime run to get to AGI. He suggested human-level AGI would be in the vicinity of human-like compute quantities being cheap and available for R&D. This old discussion is flawed, but makes me feel the dialogue is straw-manning Moravec to some extent.

Ajeya's model puts most of the modeling work on hardware, but it is intentionally expressive enough to let you represent a lot of different views about software research progress; you just have to contribute more of that yourself when adjusting weights on the different scenarios, or effective software contribution year by year. You can even represent a breakdown of the expectation that software and hardware significantly trade off over time, and very specific accounts of the AI software landscape and development paths. Regardless, modeling the most importantly changing input to AGI is useful, and I think this dialogue misleads with respect to that by equivocating between hardware not being the only contributing factor and not being an extremely important to dominant driver of progress.
Mainly such complete (and irreversible!) delegation to such incompetent systems being necessary or executed. If AI is so powerful that the nuclear weapons are launched on hair-trigger without direction from human leadership I expect it to not be awful at forecasting that risk.

You could tell a story where bargaining problems lead to mutual destruction, but the outcome shouldn't be very surprising on average, i.e. the AI should be telling you about it happening with calibrated forecasts.
The US and China might well wreck the world by knowingly taking gargantuan risks even if both had aligned AI advisors, although I think they likely wouldn't.

But what I'm saying is really hard to do is to make the scenarios in the OP (with competition among individual corporate boards and the like) occur without extreme failure of 1-to-1 alignment (for both companies and governments). Competitive pressures are the main reason why AI systems with inadequate 1-to-1 alignment would be given long enough leashes to bring catastrophe. I would cosign Vanessa and Paul's comments about these scenarios being hard to fit with the idea that technical 1-to-1 alignment work is much less impactful than cooperative RL or the like.
In more detail, I assign a ≥10% chance to a scenario where two or more cultures each progressively diminish the degree of control they exercise over their tech, and the safety of the economic activities of that tech to human existence, until an involuntary human extinction event. (By comparison, I assign at most around a ~3% chance of a unipolar "world takeover" event, i.e., I'd sell at 3%.)
If this means that a 'robot rebellion' would include software produced by more than one company or country, I think that that is a substantial possibility, as well as the alternative, since competitive dynamics in a world with a few giant countries and a few giant AI companies (and only a couple leading chip firms) can mean that the way safety tradeoffs work is by one party introducing rogue AI systems that outcompete by not paying an alignment tax (and intrinsically embodying in themselves astronomically valuable and expensive IP), or cascading alignment failure in software traceable to a leading company/consortium or country/alliance. But either way reasonably effective 1-to-1 alignment methods (of the 'trying to help you and not lie to you and murder you with human-level abilities' variety) seem to eliminate a supermajority of the risk.

[I am separately skeptical that technical work on multi-agent RL is particularly helpful, since it can be done by 1-to-1 aligned systems when they are smart, and the more important coordination problems seem to be earlier between humans in the development phase.]