They say that the flap of a butterfly’s wings can change the course of a hurricane. Sadly, this does not give us an effective strategy for controlling hurricane trajectories - if a flap of my butterfly’s wings can change the hurricane’s course, then so can the flap of any other butterfly’s wings. Unless I know exactly how all of the world’s butterflies are flapping, I do not know how my own butterfly’s wings must flap to achieve the course I want.

On the other hand, even though I do not know how the vast majority of the world’s butterflies are flapping, it is far easier to estimate a *distribution* of possible hurricane-paths in my state of ignorance than it would be to compute the exact hurricane path in a state of full knowledge.

We usually think of computing with uncertainty as harder than computing without it - Bayesian reasoning requires tracking all possible worlds, which is exponentially more difficult than just tracking one. Yet in practice, uncertainty often makes large systems simpler to reason about. Why? Because noise wipes out long-range correlations. Unless I know how the wings of *all* the world’s butterflies are flapping, the flapping of any particular butterfly gives me no information at all about the path of a hurricane. So in practice, weather forecasters do not need to worry about tracking butterflies.

Let’s look at a toy model, to build some intuition for this idea.

I have a long list of integers, each chosen uniformly at random between 1 and 10, and I want to know whether their sum is even or odd. If I know all of the numbers, then I can perform a relatively complicated calculation: I can add them all up, divide by 2, and find the remainder. But if there is even just *one* number in the list whose value I do not know, then all the rest of the numbers tell me *absolutely nothing* about whether the sum is even or odd: the unknown number is equally likely to be even or odd, so the sum is too. Even just a little bit of noise wipes out all of the signal completely.

On the other hand, if I don’t know the values of one or more of the numbers, then my “model” for this system is much simpler: whether the sum is even or odd is independent of all the numbers I know. If I’m reasoning about the sum, I can just forget about all those numbers entirely. In this sense, noise makes the system much simpler to reason about; there’s no need for all that addition and division, and we don’t even need to remember the numbers.
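The toy model is small enough to check exactly. Here is a minimal sketch, assuming each unknown number is a uniformly random integer from 1 to 10; the function name `prob_sum_even` and its parameters are made up for illustration.

```python
from fractions import Fraction

def prob_sum_even(known, n_unknown=0, lo=1, hi=10):
    """Exact P(sum is even), given the known numbers and a count of unknown ones.

    Assumption: each unknown number is an independent uniform integer in [lo, hi].
    """
    values = range(lo, hi + 1)
    # Probability that a single unknown number is even.
    p_one_even = Fraction(sum(v % 2 == 0 for v in values), len(values))

    # Probability that the sum of the unknown numbers is even.
    # With zero unknowns that sum is 0, which is even.
    p_even = Fraction(1)
    for _ in range(n_unknown):
        # The running sum stays even if (even + even) or (odd + odd).
        p_even = p_even * p_one_even + (1 - p_even) * (1 - p_one_even)

    # The known numbers either keep or flip the parity of the unknown part.
    known_is_even = sum(known) % 2 == 0
    return p_even if known_is_even else 1 - p_even


print(prob_sum_even([3, 7, 2]))               # all numbers known
print(prob_sum_even([3, 7, 2], n_unknown=1))  # one number hidden
print(prob_sum_even([3, 7, 3], n_unknown=1))  # different known numbers, one hidden
```

With every number known, the answer is certain, but it takes the full sum to get there. With even one number hidden, the result is one half whatever the known numbers are, so the known list never needs to be looked at.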

A rough general insight from this model: if changing any input of a system changes the output, then a complete lack of information about *any* input implies a complete lack of information about the output - no matter how much information we have about all the other inputs. When a system is very sensitive to all of its inputs, just a little bit of noise makes all of our information about the inputs irrelevant - which makes the system much simpler to model.

On the other hand, what if a system is *not* very sensitive to all of its inputs? Well, then we can build simplified approximate models of the system anyway, just by using rough estimates for whatever inputs it isn’t very sensitive to. There’s a kind of duality here: if the system isn’t very sensitive, we can model it well as only depending on some coarse estimates of the inputs; if it is very sensitive, we can model it as independent of most inputs as long as just a few are unknown.
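For contrast with the parity example, here is a rough sketch of an insensitive system. The choice of the *average* of the list as the output is my own illustrative assumption, as is the helper name `mean_bounds`; the point is only that a hidden input barely moves the answer, so a coarse guess for it is good enough.

```python
from fractions import Fraction

def mean_bounds(known, n_unknown, lo=1, hi=10):
    """Smallest and largest possible mean of the full list.

    Assumption: each of the n_unknown hidden numbers lies somewhere in [lo, hi].
    """
    n = len(known) + n_unknown
    total = sum(known)
    lowest = Fraction(total + n_unknown * lo, n)   # every hidden number at its minimum
    highest = Fraction(total + n_unknown * hi, n)  # every hidden number at its maximum
    return lowest, highest


known = [5] * 99
low, high = mean_bounds(known, n_unknown=1)
print(float(low), float(high), float(high - low))
```

With 99 known numbers and one hidden, the mean is pinned down to an interval of width 9/100, and the interval shrinks as the list grows. The parity of the sum, by comparison, is left completely undetermined by that same single hidden number.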