Matthew Barnett

Just someone who wants to learn about the world. I think about AI risk sometimes, but I still have a lot to learn.

I also change my views often. Anything I wrote that's more than 10 days old should be treated as potentially outdated.

Matthew Barnett's Comments

An Analytic Perspective on AI Alignment
weaker claim?

Oops yes. That's the weaker claim, which I agree with. The stronger claim is that because we can't understand something "all at once" then mechanistic transparency is too hard and so we shouldn't take Daniel's approach. But the way we understand laptops is also in a mechanistic sense. No one argues that because laptops are too hard to understand all at once, then we shouldn't try to understand them mechanistically.

This seems to be assuming that we have to be able to take any complex trained AGI-as-a-neural-net and determine whether or not it is dangerous. Under that assumption, I agree that the problem is itself very hard, and mechanistic transparency is not uniquely bad relative to other possibilities.

I didn't assume that. I objected to the specific example of a laptop as an instance of mechanistic transparency being too hard. Laptops are normally understood well because understanding can be broken into components and built up from abstractions. But our understanding of each component and abstraction is pretty mechanistic -- and this understanding is useful.

Furthermore, because laptops did not fall out of the sky one day, but were instead slowly built over successive years of research and development, they seem like a great example of how Daniel's mechanistic transparency approach does not rely on us having to understand arbitrary systems. Just as we built up an understanding of laptops, presumably we could do the same with neural networks. This was my interpretation of why he is using Zoom In as an example.

All of the other stories for preventing catastrophe that I mentioned in the grandparent are tackling a hopefully easier problem than "detect whether an arbitrary neural net is dangerous".

Indeed, but I don't think this was the crux of my objection.

An Analytic Perspective on AI Alignment
I'd be shocked if there was anyone to whom it was mechanistically transparent how a laptop loads a website, down to the gates in the laptop.

Could you clarify why this is an important counterpoint? It seems obviously useful to understand mechanistic details of a laptop in order to debug it. You seem to be arguing the [ETA: weaker] claim that nobody understands an entire laptop "all at once", as in, they can understand all the details in their head simultaneously. But such an understanding is almost never possible for any complex system, and yet we still try to approach it. So this objection could show that mechanistic transparency is hard in the limit, but it doesn't show that mechanistic transparency is uniquely bad in any sense. Perhaps you disagree?

Cortés, Pizarro, and Afonso as Precedents for Takeover

For my part, I think you summarized my position fairly well. However, after thinking about this argument for another few days, I have more points to add.

  • Disease seems especially likely to cause coordination failures since it's an internal threat rather than an external threat (external threats, unlike internal ones, tend to unite empires). We can compare the effects of the smallpox epidemic in the Aztec and Inca empires alongside other historical diseases during wartime, such as the Plague of Athens, which arguably is what caused Athens to lose the Peloponnesian War.
  • Along these same lines, the Aztec/Inca didn't have any germ theory of disease, and therefore didn't understand what was going on. They may have thought that the gods were punishing them for some reason, and therefore they probably spent a lot of time blaming random groups for the catastrophe. We can contrast these circumstances with, e.g., the Paraguayan War, which killed up to 90% of the male population, but people probably had a much better idea what was going on and who was to blame, so I expect that the surviving population had an easier time coordinating.
  • A large chunk of the remaining population likely had some sort of disability. Think of what would happen if you got measles and smallpox in the same two-year window: even if you survived, it probably wouldn't look good. This means that the pure death rate is an underestimate of the impact of a disease. The Aztecs, for whom "only" 40 percent died of disease, were still greatly affected:
It killed many of its victims outright, particularly infants and young children. Many other adults were incapacitated by the disease – because they were either sick themselves, caring for sick relatives and neighbors, or simply lost the will to resist the Spaniards as they saw disease ravage those around them. Finally, people could no longer tend to their crops, leading to widespread famine, further weakening the immune systems of survivors of the epidemic. [...] a third of those afflicted with the disease typically develop blindness.
Cortés, Pizarro, and Afonso as Precedents for Takeover
Later, other Europeans would come along with other advantages, and they would conquer India, Persia, Vietnam, etc., evidence that while disease was a contributing factor (I certainly am not denying it helped!) it wasn't so important a factor as to render my conclusion invalid (my conclusion, again, is that a moderate technological and strategic advantage can enable a small group to take over a large region.)

Europeans conquered places such as India, but that was centuries later, after they had a large technological advantage, and they didn't come with just a few warships either: they came with vast armadas. I don't see why that supports the point that a small group can take over a large region.

Cortés, Pizarro, and Afonso as Precedents for Takeover
I really don't think the disease thing is important enough to undermine my conclusion. For the two reasons I gave: One, Afonso didn't benefit from disease

This makes sense, but I think the case of Afonso is sufficiently different from the others that it's a bit of a stretch to use it to imply much about AI takeovers. I think if you want to make a more general point about how AI can be militarily successful, then a better point of evidence is a broad survey of historical military campaigns. Of course, it's still a historically interesting case to consider!

two, the 90% argument: Suppose there was no disease but instead the Aztecs and Incas were 90% smaller in population and also in the middle of civil war. Same result would have happened, and it still would have proved my point.

Yeah but why are we assuming that they are still in the civil war? Call me out if I'm wrong here, but your thesis now seems to be: if some civilization is in complete disarray, then a well coordinated group of slightly more advanced people/AI can take control of the civilization.

This would be a reasonable thesis, but it doesn't shed too much light on AI takeovers. The important part lies in the "if some civilization is in complete disarray" conditional, and I think it's far from obvious that AI will emerge in such a world, unless some other more important causal factor already occurred that gave rise to the massive disarray in the first place. But even in that case, don't you think we should focus on that thing that caused the disarray instead?

Cortés, Pizarro, and Afonso as Precedents for Takeover
I agree that it would be good to think about how AI might create devastating pandemics. I suspect it wouldn't be that hard to do, for an AI that is generally smarter than us. However, I think my original point still stands.

It's worth clarifying exactly what "original point" stands because I'm currently unsure.

I don't get why you think a small technologically primitive tribe could take over the world if they were immune to disease. Seems very implausible to me.

Sorry, I meant to say, "Were immune to diseases that were currently killing everyone else." If everyone is dying around you, then your level of technology doesn't really matter that much. You just wait for your enemy to die and then settle the land after they are gone. This is arguably what Europeans did in America. My point is that by focusing on technology, you are missing the main reason for the successful conquest.

But I don't want to do this yet, because it seems to me that even with disease factored in, "most" of the "credit" for Cortes and Pizarro's success goes to the factors I mentioned.
After all, suppose the disease reduced the on-paper strength of the Americans by 90%. They were still several orders of magnitude stronger than Cortes and Pizarro. So it's still surprising that Cortes/Pizarro won... until we factor in the technological and strategic advantages I mentioned.

I feel like you don't actually have a civilization if 90% of your people died. I think it's more fair to say that when 90% of your people die, your civilization basically stops existing rather than just being weakened. For example, I can totally imagine an Incan voyage to Spain conquering Madrid if 90% of the Spanish died. Their chain of command would be in complete shambles. It wouldn't just be like some clean 90% reduction in GDP with everything else held constant.

But the civilizations wouldn't have been destroyed without the Spaniards. (I might be wrong about this, but... hadn't the disease mostly swept through Inca territory by the time Pizarro arrived? So clearly their civilization had survived.)
I think I am somewhat close to being convinced by your criticism, at least when phrased in the way you just did: "your thesis is trivial!" But I'm not yet convinced, because of my argument about the 90% reduction. (I keep making the same argument basically in response to all your points; it is the crux for me I think.)

Look, if 90% of a country dies of a disease, and then the surviving 10% become engulfed in a civil war, and then some military group who is immune to the disease comes in and takes the capital city during this all, don't you think it's very misleading to conclude "A small group of people with a slight military advantage can take over a large civilization" without heavily emphasizing the whole 90% of people dying of a disease part? This is the heart of my critique.

Cortés, Pizarro, and Afonso as Precedents for Takeover

Here's what I'll be putting in the Alignment Newsletter about this piece. Let me know if you spot inaccuracies or lingering disagreement regarding the opinion section.


This post lists three historical examples of how small human groups conquered large parts of the world, and shows how they are arguably precedents for AI takeover scenarios. The first two historical examples are the conquests of American civilizations by Hernán Cortés and Francisco Pizarro in the early 16th century. The third example is the Portuguese capture of key Indian Ocean trading ports, which happened at roughly the same time as the other conquests. Daniel argues that technological and strategic advantages were the likely causes of these European victories. However, since the European technological advantage was small in this period, we might expect that an AI coalition could similarly take over a large portion of the world, even without a large technological advantage.


In a comment, I dispute the claimed reasons for why Europeans conquered American civilizations. I think that a large body of historical literature supports the conclusion that American civilizations fell primarily because of their exposure to diseases which they lacked immunity to, rather than because of European military power. I also think that this helps explain why Portugal was "only" able to capture Indian Ocean trading ports during this time period, rather than whole civilizations. I think the primary insight here should instead be that pandemics can kill large groups of humans, and therefore it would be worth exploring the possibility that AI systems use pandemics as a mechanism to kill large numbers of biological humans.
Cortés, Pizarro, and Afonso as Precedents for Takeover

[ETA: Another way of framing my disagreement is that if you are trying to argue that small groups can take over the world, it seems almost completely irrelevant to focus on relative strategic or technological advantages in light of these historical examples. For instance, it could have theoretically been that some small technologically primitive tribe took over the world if they had some sort of immunity to disease. This would seem to imply that the relative strategic advantages of Europeans vs. Americans were not that important. Instead we should focus on the ways AIs could create, e.g., artificial pandemics, and we could use the smallpox epidemic in America as an example of how devastating pandemics can be.]

First response: Disease wasn't a part of Afonso's success. It helped the Europeans take over the Americas but did not help them take over Africa or Asia or the middle east; this suggests to me that it may have been a contributing factor but was not the primary explanation.

That makes sense. I'm much less familiar with Afonso de Albuquerque, though my understanding is that he didn't really conquer civilizations, mostly just trading ports. I think it's safe to say that successful military campaigns are common in history, and therefore I don't find his success very unique or indicative of a future AI takeover.

Second response: Even if we decide that Cortes and Pizarro wouldn't have been able to succeed without the disease, my overall conclusion still stands.

Well, it depends. If your conclusion is that "small groups with relatively small military or strategic advantages can still take over large areas of the world" then I completely agree. If your conclusion is that "small military or strategic advantages are by themselves often sufficient for small groups to take over large areas of the world" then I disagree. I worry your post gave the impression that the second conclusion was true.

Then the modified conclusion in light of your claim about disease would be "In times of chaos and disruption, a force with a small tech and cunning/experience advantage can take over a region 1,000 times its size." This modified conclusion is, as far as I'm concerned, still almost as powerful and interesting as the original conclusion.

A big part of my critique here is that you need to focus way more on identifying the true causal factors that led to these historical successes, because otherwise you can't use them to argue why AI is going to be anything like them.

Since disease is, in my opinion, the primary causal factor at play here, I think we should instead explore the potential for AI to engineer pandemics that kill everyone -- but that seems way different than what you were arguing.

I don't think making the thesis "in times of chaos and destruction, groups can conquer other groups" really makes the argument say much. The thing that destroyed the Incas and Aztecs was disease, not European military power, so maybe that's the lesson we should learn? Saying that merely "times of chaos" destroyed the Incas and Aztecs is tautological and not interesting.

For example, it's true that the disease may have sparked the Incan civil war -- but civil wars happen pretty often anyway, historically. And when civil wars aren't happening, ordinary wars often are.

Yes, but this Incan civil war was particularly extreme and unusual, and from the source I listed, it seems that between 60% and 90% of Incans had died. So again, determining the underlying causal factors is key to this sort of analysis.

Nitpick: The war was Cortez + allies vs. Tenochtitlan + allies. The vast majority of people on both sides were Americans. So the smallpox wreaked havoc on all sides. (Maybe I should have said "both sides" instead of "all sides")

Yeah that makes sense, but it's important to note that neither the Aztec nor the Cortez-allied Americans survived in great numbers. It was only the Spanish that were prosperous afterwards, and that's really important!

Nitpick: If it turns out that getting sick from various diseases was what kept the Europeans out of Africa for so long, that actually supports my overall argument. (Because, imagine instead that Europeans had no problem with disease in Africa, but simply were unable to conquer much of it due to ordinary military/political reasons. Then their tech+cunning/experience advantage would have failed to be enough in that case, which makes their successes in America seem more like a fluke than a pattern explained by tech+cunning/experience. In other words, if disease wasn't a factor in Africa, that would be evidence against my claims.)

I'm not sure if I understand this point well, but I think I agree. However, the quinine drug treatment for malaria was a technological advantage brought by the industrial revolution, and wasn't just some innate advantage that the Europeans eventually got.
