AI
Frontpage

Here are two kinds of super high-level explanation for how things turned out in the future:

  1. Someone wanted things to turn out that way.
  2. Selective forces favoured that outcome.

Which one is a better explanation?

I’m highly uncertain about this question, and it often shows up as a crux when thinking about where to focus one’s efforts when aiming for a good future.

One way of labelling the alternatives is intelligence vs evolution. There’s a cluster of stuff on each side. I’ll try to point to them better by listing points in the clusters:

  • Intelligence: an agent, someone capably trying to do something, intentional deliberate action, the will of the most powerful entity, consequentialists, unipolar/singleton or few-agent scenarios
  • Evolution: Moloch (multipolar traps), natural selection, competitive pressures, the incentive landscape, game theoretic solution concepts, highly multipolar or multi-agent scenarios

Under some view, both intelligence and evolution are good explanations for how things turned out. They just amount to taking a different perspective or looking at the situation at a different scale. I agree with this but want to avoid this view here so we can focus on the distinction. So let’s try to make the alternatives mutually exclusive:

  1. Intelligence: someone was trying to produce the outcome
  2. Evolution: no one was trying to produce the outcome

This question is very similar to unipolar vs multipolar, but maybe not the same. My focus is on whether the main determinant of the outcome is “capable trying” vs anything else. This can hold for a few agents; it doesn’t require exactly one.

 

If you think our future will be better explained by intelligence, you might prefer to work on understanding intelligence and related things like:

  • Decision theory, anthropics, probability, epistemology, goal-directedness, embedded agency, intent alignment – understanding what agents are, how they work, how to build them
  • Moral philosophy and value learning
  • {single, multi}/single alignment, and approaches to building AGI that focus on a single agent

If you think our future will be better explained by evolution, you might prefer to work on understanding evolution and related things like:

  • Economics and sociology, especially areas like social choice theory, game theory, and topics like bargaining and cooperation
  • Biological and memetic evolution, and the underlying theory, perhaps something like formal Darwinism
  • {multi, single}/multi alignment, and approaches to building AGI that involve multiple agents
  • Governance, and interventions aimed at improving global coordination

 

Why believe intelligence explains our future better than evolution? One argument is that intelligence is powerful. The outsized impact humans have had on the planet, and might be expected to have beyond it, is often attributed to their intelligence. The pattern of “some human or humans want X to happen” causing “X happens” occurs very frequently and reliably, and seems to happen via intelligence-like things such as planning and reasoning. 

Relatedly, the ideal of a rational agent – something that has beliefs and desires, updates beliefs towards accuracy, and takes actions thereby expected to achieve the desires – looks, almost by construction, like something that would in the limit of capability explain what outcomes actually obtain.

Both of these considerations ignore multipolarity, possibly to their peril. Why believe evolution explains our future better than intelligence? Because it seems to explain a lot of the past and present. Evolution (biological and cultural) has much to say about the kinds of creatures and ideas that are abundant today, and the dynamics that led to this situation. The world currently looks like a competitive marketplace more than like a unified decision-maker.

Will this continue? Will there always be many agents with similar levels of capabilities but different goals? To argue for this, I think there are two types of arguments one could put forward. The first is that no single entity will race ahead of the rest (“foom”) in capability, rendering the rest irrelevant. The second is to rebut trends – such as multicellular life, tribes, firms, and civilizations – towards greater coordination and cooperation, and argue that they are fundamentally limited.

 

I don’t know all the arguments that have been made on this, and since this post is for blog post day I’m not going to go find and summarise them. But I don’t think the question is settled – please tell me if you know better. Being similar to the unipolar vs multipolar question, the intelligence vs evolution question has been explored in the AI foom debate and Superintelligence. Here is some other related work, split by which side it’s more relevant or favourable to.

Intelligence:

  • Section 4 of Yudkowsky’s chapter on AI and Global Risk argues that intelligence is more powerful than people tend to think.
  • Bostrom’s singleton hypothesis is very similar to expecting Intelligence over Evolution.
  • Christiano’s speculation that the Solomonoff prior is controlled by consequentialists relies on (and argues for) the power of intelligence.

Evolution:

  • Alexander’s Meditations on Moloch points to many examples of multipolar traps in reality.
  • Critch’s Robust Agent-Agnostic Processes argues for outcomes being better understood from something like evolution than intelligence. Critch and Krueger’s ARCHES provides the {single, multi}/{single, multi} alignment framework and their multiplicity thesis seems to me to assume that evolution matters more than intelligence.
  • Hanson’s Age of Em illustrates how a future with advanced intelligence but determined primarily by evolution-like things, such as selection for productivity, might look.
  • Henrich’s The Secret of Our Success shows how human impact and control of the planet may be better explained by evolution (of memes) than the intelligence of individuals.

 

Acknowledgements: Daniel Kokotajlo for running the impromptu blog post day and giving me feedback, Andrew Critch and Victoria Krakovna for one conversation where this question came up as a crux, and Allan Dafoe for another.


5 comments

Darwinian evolution as such isn't a thing amongst superintelligences. They can and will preserve their terminal goals. This means the number of superintelligences running around is bounded by the number humans produce before the first ASI gets powerful enough to stop any new rivals being created. Each AI will want to wipe out its rivals if it can (unless they manage to cooperate somewhat). I don't think superintelligences would have the human kind of partial cooperation: it would be either near-perfect cooperation or near-total competition. So this is a scenario where a smallish number of ASIs that have all foomed in parallel expand as a squabbling mess.

Do you know of any formal or empirical arguments/evidence for the claim that evolution stops being relevant when there exist sufficiently intelligent entities (my possibly incorrect paraphrase of "Darwinian evolution as such isn't a thing amongst superintelligences")?

Error correction codes exist. They are low cost in terms of memory etc. Having a significant portion of your descendants mutate and do something you don't want is really bad.

If error correcting to the point where there is not a single mutation in the future only costs you 0.001% of your resources in extra hard drive space, then less than 0.001% of resources will be wasted due to mutations.
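To make "error correction codes exist" concrete, here is a minimal sketch of the textbook Hamming(7,4) code, which stores 4 data bits in 7 bits and repairs any single flipped bit ("mutation") per block. The function names are hypothetical and chosen for illustration, and the 0.001% figure above is a hypothetical cost, not something this toy code achieves.

```python
from itertools import product


def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit Hamming codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit, then return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome, read as a binary number, is the 1-based position of the error.
    error_position = s1 + 2 * s2 + 4 * s3
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


def parity_overhead(r):
    """Fraction of a Hamming block of length 2**r - 1 that is spent on parity bits."""
    return r / (2 ** r - 1)


def survives_every_single_flip():
    """Check all 16 data words against a flip at each of the 7 positions."""
    for data in product([0, 1], repeat=4):
        clean = hamming74_encode(data)
        for i in range(7):
            mutated = list(clean)
            mutated[i] ^= 1  # one "mutation"
            if hamming74_decode(mutated) != list(data):
                return False
    return True
```

The parity fraction shrinks as blocks get longer (`parity_overhead(3)` is 3/7, while `parity_overhead(20)` is about 0.002%), though a longer block still only repairs one flip per block, so driving the mutation rate to effectively zero would take stronger codes or redundancy than this sketch shows.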

Evolution is kind of stupid compared to superintelligences. Mutations are not going to find improvements, because the superintelligence will be designing its own hardware, and that hardware will already be extremely optimized. If the superintelligence wants to spend resources developing better tech, it can do that better than evolution.

So squashing evolution is a convergent instrumental goal, and easily achievable for an AI designing its own hardware.

Error correction codes help a superintelligence avoid accidental self-modification, but they don't necessarily keep its goals stable as its reasoning abilities change.

Firstly, this would be AIs looking at their own version of the AI alignment problem. This is not random mutation or anything like it. Secondly, I would expect there to be at most a few rounds of self-modification that put goals at risk (likely zero rounds). Damaging your goals loses a lot of utility. You would only do it if it's a small change in goals for a big increase in intelligence, and only if you really need to be smarter and can't make yourself smarter while preserving your goals.

You don't have millions of AIs all with goals different from each other. The self-upgrading step happens once, before the AI starts to spread across star systems.