Distributed Decisions — AI Alignment Forum


Consider two prototypical “agents”: a human, and a company.

The human is relatively centralized and monolithic. As a rough approximation, every 100 ms or so observations flow into the brain from the eyes, ears, etc. This raw input data updates the brain’s world-model, and then decisions flow out, e.g. muscle movements. This is exactly the sort of “state-update model” which Against Time In Agent Models criticized: observations update one central internal state at each timestep, and all decisions are made based on that central state. It’s not even all that accurate a model for a human, but let’s set that aside for now and contrast it to a more obviously decentralized example.

In a company, knowledge and decisions are distributed. A cashier sees and hears customers in the store, and interacts with them in order to sell things. Meanwhile, a marketing editor tweaks some ad copy. Each mostly makes decisions based on their local information; most of that local information is not propagated to other decision-makers. Observations don’t update a single centralized state which informs all decisions. Instead, different decisions have different input information from different sources.

In Optimization at a Distance, I suggested a mental picture of agents kinda like this:

Note that this particular drawing doesn’t have any inputs to the optimizer (i.e. observations) for simplicity, but it’s easy to add inputs. The optimizer need not be strictly causally upstream of the target; it could have interactions back-and-forth with the target.

It’s like a phased array: there’s lots of little actions distributed over space/time, all controlled in such a way that their influence can add up coherently and propagate over a long distance to optimize some far-away target. Optimization at a Distance mainly emphasized the “height” of this picture, i.e. the distance between optimizer and target. This post is instead about the “width”: not only are the actions far from the optimization target, the actions themselves are also distributed in spacetime and potentially far apart from each other.

Contrast: Bayesian Updates

Suppose I want to watch my favorite movie, 10 Things I Hate About You, in the evening. To make this happen, I do some optimization - I steer myself-in-the-evening and my-immediate-environment-in-the-evening into the relatively small set of states in which I’m watching the movie. Via the argument in Utility Maximization = Description Length Minimization, we should expect that I approximately-act-as-though I’m a Bayesian reasoner maximizing some expected utility over myself-in-the-evening and my-immediate-environment-in-the-evening. (Note that it’s a utility function over myself-in-the-evening and my-immediate-environment-in-the-evening, not just any old random utility function; something like e.g. a rock would not be well-described by such a utility function.)

While arranging my evening, I may perform some Bayesian updates. Maybe I learn that the movie is not available on Netflix, so I ask a friend if they have a copy, then check Amazon when they don’t. This process is reasonably well-characterized as me having a centralized model of the places I might find the movie, and then Bayes-updating that model each time I learn another place where I can/can’t find it. (If I had checked Netflix, then asked my friend, then checked Netflix again because I forgot whether it was on Netflix, that would not be well-modeled as Bayesian updates.)

By contrast, imagine that some friends and I are arranging to watch 10 Things I Hate About You in the evening. I check to see if the movie is on Netflix, and at the same time my friend checks their parents’ pile of DVDs. My friend doesn’t find it in their parents’ DVD pile, and doesn’t know I already checked Netflix, so they also check Netflix. My friends and I, as a system, are not well-modeled as Bayesian updates to a single central knowledge-state; otherwise we wouldn’t check Netflix twice. And yet, it’s not obviously suboptimal (like me forgetting whether the movie is on Netflix would be). If there’s a lag in communication between us, it may just be faster and easier for us to both check Netflix independently, and then both check other sources independently if the movie isn’t there. We’re acting independently to optimize the same goal; our actions are chosen “locally” on the basis of whatever information is available, not necessarily based on a single unified knowledge-state.
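The point about communication lag can be made concrete with a little arithmetic. The sketch below is only an illustration: the function name and every number (check time, lag, when my friend finishes with the DVD pile) are made-up assumptions, not anything from the scenario itself. It compares how soon my friend knows the Netflix result if they wait for my message versus if they simply check again.

```python
def time_friend_knows_netflix(check_minutes, lag_minutes, friend_free_at):
    """Minutes (from t=0) until my friend knows whether the movie is on Netflix.

    Assumptions (all hypothetical): I start checking Netflix at t=0, my message
    takes `lag_minutes` to reach my friend, and my friend finishes searching
    the DVD pile at `friend_free_at`.
    """
    # Option A: my friend waits for my message to arrive.
    message_arrives = check_minutes + lag_minutes
    wait_for_message = max(friend_free_at, message_arrives)
    # Option B: my friend redundantly checks Netflix as soon as they are free.
    check_again = friend_free_at + check_minutes
    return {"wait_for_message": wait_for_message, "check_again": check_again}


# Slow communication: the duplicated check is the faster option.
slow = time_friend_knows_netflix(check_minutes=1, lag_minutes=5, friend_free_at=2)
# Fast communication: waiting for the message is the faster option.
fast = time_friend_knows_netflix(check_minutes=1, lag_minutes=0.5, friend_free_at=2)
print(slow, fast)
```

Under these assumed numbers, the "wasteful" second check wins whenever the lag exceeds the time it takes to redo the check, which is the sense in which the duplicated work is not obviously suboptimal.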

So, we don’t really have “Bayesian updates” in the usual sense. And yet… we’re still steering the world into a relatively narrow set of states, the argument in Utility Maximization = Description Length Minimization still applies just fine, and that argument is still an essentially Bayesian argument. It’s still using a Bayesian distribution - i.e. a distribution which is ultimately part of a model, not necessarily a fundamental feature of the territory. It’s still about maximizing expected utility under that distribution. My friends and I, as a system, are still well modeled as a “Bayesian agent” in some sense. Just… not a monolithic Bayesian agent. We’re a distributed Bayesian agent, one in which different parts have different information.

Conditioning

Conditional probabilities do still enter the picture, just not as updates to a centralized world-state.

In the movie example, when I’m searching for the movie in various places, how do I steer the world into the state of us-watching-the-movie-in-the-evening? How do I maximize $E [u (X)]$, jointly with my friends? Well, I act on the information I have, plus my priors about e.g. what information my friends will have and how they will act. If I have information $Y$ (e.g. I know that the movie isn’t on Netflix, and know nothing else relevant other than priors) when making a particular decision, then I act to maximize $E [u (X) | Y]$.

Why that particular mathematical form? Well, our shared optimization objective $E [u (X)]$ is a sum over worlds $(X, Y, \dots)$ :

$E [u (X)] = \sum_{X, Y, \dots} P [X, Y, \dots] u (X)$

If I know that e.g. the movie is not on Netflix, then I know my current action won’t impact any of the worlds where the movie is on Netflix. So I can ignore those worlds while making the current decision, and just sum over all the worlds in which the movie is not on Netflix. My new sum is $\sum_{X, \dots} P [X, Y, \dots] u (X)$, with $Y$ held fixed at its observed value rather than summed over, which becomes $E [u (X) | Y]$ after normalizing the probabilities. (Normalizing doesn’t change the optimal action, so we can do that “for free”.) By ignoring all the worlds I’m not in (based on the input information to the current decision), and taking the expectation over the rest, I’m effectively maximizing expected utility conditional on the information I have when making the decision.
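Spelled out, the normalization step looks roughly like this. The lowercase $y$ for the observed value of $Y$ is my own notation here, and I am assuming the remaining variables $\dots$ have been summed out:

$\sum_{X, \dots} P [X, Y = y, \dots] u (X) = P [Y = y] \sum_{X} P [X | Y = y] u (X) = P [Y = y] E [u (X) | Y = y]$

Since $Y$ is an input to the current decision, $P [Y = y]$ is a positive constant which does not depend on the current action, so dividing by it leaves the argmax unchanged.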

More generally: each action is chosen to maximize expected utility conditional on whatever information is available as an input to that action (including priors about how the other actions will be taken). That’s the defining feature of a distributed Bayesian agent.
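Here is a minimal sketch of that claim for a single decision, with everything else (including how my friends act) folded into the distribution. The joint distribution, the action names, and the utility numbers are all hypothetical, chosen only for illustration. The check is that picking, for each possible input $Y$, the action which maximizes the unnormalized sum over worlds consistent with that input gives the same policy as brute-force maximizing the unconditional $E [u (X)]$ over all policies.

```python
from itertools import product

# Hypothetical joint distribution over worlds (on_netflix, friend_has_dvd).
P = {
    (True, True): 0.2,
    (True, False): 0.3,
    (False, True): 0.1,
    (False, False): 0.4,
}
ACTIONS = ["stream", "borrow", "buy"]
OBSERVATIONS = (True, False)


def utility(world, action):
    """Made-up utilities: streaming and borrowing only work in some worlds."""
    on_netflix, friend_has_dvd = world
    if action == "stream":
        return 10 if on_netflix else 0
    if action == "borrow":
        return 8 if friend_has_dvd else 0
    return 5  # buying always works, but costs money


def observe(world):
    """The input information Y for this decision: only whether it's on Netflix."""
    return world[0]


def best_action_given(y):
    """Maximize the unnormalized sum over worlds consistent with Y = y."""
    def score(action):
        return sum(p * utility(w, action) for w, p in P.items() if observe(w) == y)
    return max(ACTIONS, key=score)


def expected_utility(policy):
    """Unconditional E[u] when the action is chosen as policy[Y]."""
    return sum(p * utility(w, policy[observe(w)]) for w, p in P.items())


# Policy built from local, conditional maximization.
local_policy = {y: best_action_given(y) for y in OBSERVATIONS}

# Policy found by brute force over every map from observation to action.
global_policy = max(
    (dict(zip(OBSERVATIONS, acts)) for acts in product(ACTIONS, repeat=2)),
    key=expected_utility,
)

print(local_policy, global_policy, expected_utility(local_policy))
```

In a genuinely distributed setting there would be one such policy per decision-maker, each with its own `observe`; this sketch only shows the single-decision building block.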

This post (and the more dense version here) spells out the mathematical argument in a bit more detail, starting from coherence rather than utility-maximization-as-description-length-minimization.

(Side note: some decision theory scenarios attempt to mess with the “current action won’t impact any of the other worlds” part, by making actions in one world impact other worlds. Something FDT-like would fix that, but that’s out of scope for the current post.)

Resources

The "Measuring Stick of Utility" Problem talks about how grounding the idea of “resources” in non-agenty concepts is a major barrier to using coherence theorems to e.g. identify agents in a given system. If we have distributed decisions, optimization at a distance, or both, and we expect that information at a distance is mediated by relatively low-dimensional summaries (i.e. the Telephone Theorem), then there’s an intuitively-natural way to recognize “resources” for purposes of coherence arguments.

Let’s go back to the example of a company, in which individual employees make many low-level decisions in parallel. The information relevant to each decision is mostly local - e.g. a cashier at a retail store in upstate New York does not need to know the details of station 13 on the company’s assembly line in Shenzhen. But there is some relevant information - for instance, if an extra 10 cents per item are spent at station 13 on the assembly line in Shenzhen, then the cashier needs to end up charging another ~10 cents per item to customers. Or, if the assembly line shuts down for a day and 10000 fewer items are produced, then the cashiers at all of the company’s stores need to end up selling 10000 fewer items.

So we have this picture where lots of different decisions are made mostly-locally, but with some relatively small summary information passed around between local decision makers. That summary consists mainly of a sum of “resources” gained/lost across each decision. In our example, the resources would be dollars spent/gained, and items created/sold.
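As a toy version of this bookkeeping, the sketch below treats each local decision as reporting only a resource delta (cents and items), and the coupling between decisions as the sum of those deltas. The `Resources` type, the helper functions, and all prices and quantities are hypothetical, invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Low-dimensional summary of one local decision (hypothetical units)."""
    cents: int = 0
    items: int = 0

    def __add__(self, other):
        return Resources(self.cents + other.cents, self.items + other.items)


def total(deltas):
    """The only thing passed between localities: the sum of resource deltas."""
    result = Resources()
    for delta in deltas:
        result = result + delta
    return result


def factory(n_items, cost_cents_per_item):
    """Producing items spends money and creates inventory."""
    return Resources(cents=-n_items * cost_cents_per_item, items=n_items)


def store(n_sold, price_cents):
    """Selling items gains money and uses up inventory."""
    return Resources(cents=n_sold * price_cents, items=-n_sold)


baseline = total([factory(10_000, 300), store(6_000, 500), store(4_000, 500)])

# Station 13 spends an extra 10 cents per item; the stores charge 10 cents more.
adjusted = total([factory(10_000, 310), store(6_000, 510), store(4_000, 510)])

print(baseline, adjusted)
```

The cashier never needs the details of station 13, only the change in the summed totals; in this made-up example the extra 10 cents of cost is exactly offset by the extra 10 cents of price.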

The key here is that we have lots of local decisions, with relatively low-dimensional coupling between them. The summary-information through which the decisions couple is, roughly speaking, the “resources”. (In practice, there will probably also be lots of extra summary-information between localities which isn’t controllable via the actions, and therefore needn’t be treated as a resource - e.g. all the facts about concrete one could learn from the store’s walls which would carry over to the concrete in the factory’s walls.)

Alternatively, rather than starting from distributed decisions, we could start from optimization at a distance. Because the optimization target is “far away” from the actions, only some relatively-low-dimensional summary of the actions impacts the target. Again, the components of that summary are, roughly speaking, the “resources”.

This picture fits in nicely with coherence theorems. The theorems talk about how a local decision maker needs to act in order to achieve Pareto-optimal resource use, while still achieving local goals. For instance, the company’s marketing department should act-as-though it has a utility function over ads, otherwise it could run the same ads while spending Pareto-fewer resources.

This picture also fits in nicely with natural abstractions. We have a large system with lots of parts “far away” from each other. The Telephone Theorem then says that they will indeed interact only via some relatively low-dimensional summary. In a decision framing, it says that only a relatively low-dimensional summary of the far-away decisions will be relevant to the local decision. Furthermore, we can in-principle derive that low-dimensional summary from the low-level physics of the world.

But this is still just an intuitive story. To make it rigorous, the Measuring Stick of Utility post argued that we need our resources to have two main properties:

More resource is always better
Resources are additive across decisions

Additivity across decisions, in particular, is the more restrictive condition mathematically. In order to identify natural abstraction summaries as “resources” for coherence purposes, those summaries need to be additive across all the local decisions.

… which is the main claim argued in Maxent and Abstractions. Summaries of information relevant at a distance can indeed be represented as sums over local variables/decisions.