Review

Mentioned in

[Request for Distillation] Coherence of Distributed Decisions With Different Inputs Implies Conditioning

5johnswentworth

4jacobjacob

2johnswentworth

5jacobjacob

2johnswentworth

4jacobjacob

New Comment

6 comments, sorted by Click to highlight new comments since: Today at 8:51 PM

Short answer: about one full day.

Longer answer: normally something like this would sit in my notebook for a while, only informing my own thinking. It would get written up as a post mainly if it were adjacent to something which came up in conversation (either on LW or in person). I would have the idea in my head from the conversation, already be thinking about how best to explain it, chew on it overnight, and then if I'm itching to produce something in the morning I'd bang out the post in about 3-4 hours.

Alternative paths: I might need this idea as background for something else I'm writing up, or I might just be in a post-writing mood and not have anything more ready-to-go. In either of those cases, I'd be starting more from scratch, and it would take about a full day.

to be paid out to anyone you think did a fine job of distilling the thing

Needing to judge submissions is the main reason I didn't offer a bounty myself. Read the distillation, and see if you yourself understand it. If "Coherence of Distributed Decisions With Different Inputs Implies Conditioning" makes sense as a description of the idea, then you've probably understood it.

If you don't understand it after reading an attempted distillation, then it wasn't distilled well enough.

An update on this: sadly I underestimated how busy I would be after posting this bounty. I spent 2h reading this and Thomas post the other day, but didn't not manage to get into the headspace of evaluating the bounty (i.e. making my own interpretation of John's post, and then deciding whether Thomas' distillation captured that). So I will not be evaluating this. (Still happy to pay if someone else I trust claim Thomas' distillation was sufficient.) My apologies to John and Thomas about that.

There’s been a lot of response to theCall For Distillers, so I’m experimenting with a new post format. This post is relatively short and contains only a simple mathematical argument, with none of the examples, motivation, more examples, or context which would normally make such a post readable. My hope is that someone else will write a more understandable version.Jacob isofferinga $500 bounty on a distillation.Goal: following the usual coherence argument setup, show that if multiple decisions are each made with different input information available, then each decision maximizes expected utility given its input information.

We’ll start with the usual coherence argument setup: a system makes a bunch of choices, aiming to be pareto-optimal across a bunch of goals (e.g. amounts of various resources) u1…um. Pareto optimality implies that, at the pareto-optimum, there exists some vector of positive reals P1…Pm such that the choices maximize ∑iPiui. Note that P can be freely multiplied by a constant, so without loss of generality we could either take P to sum to 1 (in which case we might think of P as probabilities) or take P1 to be 1 where u1 is amount of money (in which case P is a marginal price vector).

When the goals are all “the same goal” across different “worlds” X, and we normalize P[X] to sum to 1, P[X] is a probability distribution over worlds in the usual Bayesian sense. The system then maximizes (over its actions A) ∑XP[X]u(A,X)=EX[u(A,X)], i.e. it’s an “expected utility maximizer”.

That’s the usual setup in a nutshell. Now, let’s say that the system makes multiple decisions A=A1…An in a distributed fashion. Each decision is made with only limited information: Ai receives fi(X) as input (and nothing else). The system then chooses the functions Ai(fi(X)) to maximize EX[u(A,X)].

Consider the maximization problem for just Ai(f∗i), i.e. the optimal action for choice i given input f∗i. Expanded out, the objective is EX[u(A,X)]=∑Xu(A1(f1(X)),…,Ai(fi(X)),…An(fn(X)),X).

Note that the only terms in that sum which actually depend on Ai(f∗i) are those for which fi(X)=f∗i. So, for purposes of choosing Ai(f∗i) specifically, we can reduce the objective to

∑X:fi(X)=f∗iu(A,X)

… which is equal to P[fi(X)=f∗i]E[u(A,X)|fi(X)=f∗i]. The P[fi(X)=f∗i] multiplier is always positive and does not depend on Ai, so we can drop it without changing the optimal Ai. Thus, action Ai(f∗i) maximizes the conditional expected value E[u(A,X)|fi(X)=f∗i].

Returning to the optimization problem for all of the actions simultaneously: any optimum for all actions must also be an optimum for each action individually (otherwise we could change one action to get a better result), so each action Ai(f∗i) must maximize E[u(A,X)|fi(X)=f∗i].

A few notes on this:

FDT, in which case the action function would appear in more than one place.uniquelyspecify the system’s behavior. The classic example is that P[X] might not be unique. Once we have distributed decisions there may also be “local optima” such that each individual action is optimal but the actions are not jointly optimal; that’s another form of non-uniqueness.