Dutch-Booking CDT

abramdemski

13th Jan 2019

[This post is now superseded by a much better version of the argument.]

In a previous post, I speculated that you might be able to Dutch-Book CDT agents if their counterfactual expectations differed from the conditional expectations of EDT. The answer turns out to be yes.

I'm going to make this a short note rather than being very rigorous about the set of decision problems for which this works.

(This is an edited version of an email, and benefits from correspondence with Caspar Oesterheld, Gerard Roth, and Alex Appel. In particular, Caspar Oesterheld is working on similar ideas. My views on how to interpret the situation have changed since I originally wrote these words, but I'll save that for a future post.)

Suppose a CDT agent has causal expectations which differ from its evidential expectations, in a specific decision.

We can modify the decision by allowing an agent to bet on outcomes in the same act. Because the bet is made simultaneously with the decision, the CDT agent uses causal expected value, and will bet accordingly.

Then, immediately after (before any new observations come in), we offer a new bet about the outcome. The agent will now bet based on its evidential expectations, since the causal intervention has already been made.

For example, take a CDT agent in Death in Damascus. A CDT agent will take each action with 50% probability, and its causal expectations give it a 50% chance of escaping death. We can expand the set of possible actions from (stay, run) to (stay, run, stay and make side bet, run and make side bet). The side bet costs 1 util and pays out 3 utils if the agent doesn't die. Then, immediately after the agent takes its action but before anything else happens, we offer another deal: the agent receives 0.5 util now, in exchange for paying 3 utils if it doesn't die. We offer this second bet regardless of whether the agent agrees to the first bet.

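The CDT agent will happily make the first bet, since the expected utility is calculated along with the intervention: a 50% causal chance of the 3 util payout, at a cost of 1 util. Then, it will happily sell the bet back, because after taking its action, it sees no chance of the 3 util payout: evidentially, Death has predicted the action it just took.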

The CDT agent makes the initial bet even though it knows it will later reverse the transaction at a cost to itself, because we offer the second transaction whether the agent agrees to the first or not. So, from the perspective of the initial decision, taking the bet is still +.5 expected utils. If it could stop itself from later taking the reverse bet, that would be even better, but we suppose that it can't.
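The arithmetic of this example can be checked with a minimal sketch. The payoffs are the ones given above; the function names are hypothetical, and the two survival probabilities are assumptions standing in for the agent's causal expectation at decision time (50%) and its evidential expectation once the action is known (0%).

```python
from fractions import Fraction

# Payoffs taken from the example in the post.
BET_COST = Fraction(1)          # price of the side bet
BET_PAYOUT = Fraction(3)        # paid out if the agent does not die
BUYBACK_PRICE = Fraction(1, 2)  # received for agreeing to the reverse bet

# Assumed beliefs (illustrative): causal expectation while deciding,
# evidential expectation immediately after acting.
P_SURVIVE_CAUSAL = Fraction(1, 2)
P_SURVIVE_EVIDENTIAL = Fraction(0)


def value_of_buying(p_survive):
    """Expected value of the side bet under a given survival probability."""
    return -BET_COST + p_survive * BET_PAYOUT


def value_of_selling(p_survive):
    """Expected value of the reverse bet under a given survival probability."""
    return BUYBACK_PRICE - p_survive * BET_PAYOUT


def net_payoff(survives):
    """Realized total from taking both bets, in a world where the agent survives or not."""
    payout = BET_PAYOUT if survives else Fraction(0)
    return (-BET_COST + payout) + (BUYBACK_PRICE - payout)


if __name__ == "__main__":
    print("buy, judged causally:", value_of_buying(P_SURVIVE_CAUSAL))
    print("sell, judged evidentially:", value_of_selling(P_SURVIVE_EVIDENTIAL))
    print("net if the agent survives:", net_payoff(True))
    print("net if the agent dies:", net_payoff(False))
```

Each bet looks favorable at the moment it is offered, yet the payouts cancel and the agent is left with the price difference as a loss in either world.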

I conclude from this that CDT should equal EDT (hence, causality must account for logical correlations, IE include logical causality). By "CDT" I really mean any approach at all to counterfactual reasoning; counterfactual expectations should equal evidential expectations.

As with most of my CDT=EDT arguments, this only provides an argument that the expectations should be equal for actions taken with nonzero probability. In fact, the amount lost to Dutch Book will be proportional to the probability of the action in question. So, differing counterfactual and evidential expectations are smoothly more and more tenable as actions become less and less probable. Actions with very low probability will imply negligible monetary loss. Still, in terms of classical Dutch-Book-ability, CDT is Dutch-Bookable.
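As a rough illustration of the proportionality claim, here is a sketch under one stated assumption: the bets only cost or pay anything in worlds where the action in question is actually taken. The function name and the sample numbers are hypothetical, not from the post.

```python
from fractions import Fraction


def expected_loss(p_action, loss_if_taken):
    """Expected Dutch-book loss, assuming the bets are void unless the action is taken."""
    return p_action * loss_if_taken


if __name__ == "__main__":
    # Hold the loss-if-taken fixed and shrink the probability of the action.
    loss_if_taken = Fraction(1, 2)
    for p in (Fraction(1, 2), Fraction(1, 100), Fraction(1, 10**6)):
        print(p, "->", expected_loss(p, loss_if_taken))
```

With the per-action loss held fixed, the expected loss shrinks linearly with the probability of the action and vanishes only at probability zero.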

Both CDT and EDT have dynamic inconsistencies, but only CDT may be Dutch-booked in this way. I'm not sure how persuasive this should be as an argument -- how special a status should Dutch-book arguments have?

ETA: The formalization of this is now a question.

6 comments

Caspar Oesterheld (7y)

> Caspar Oesterheld is working on similar ideas.

For anyone who's interested, Abram here refers to my work with Vincent Conitzer which we write about here.

ETA: This work has now been published in The Philosophical Quarterly.

Diffractor (7y)

(lightly edited restatement of email comment)

Let's see what happens when we adapt this to the canonical instance of "no, really, counterfactuals aren't conditionals and should have different probabilities": the cosmic ray problem. The agent has a choice between two paths and slightly prefers the left path, but its conditional on taking the right path is a tiny slice of probability mass that's mostly composed of stuff like "I took the suboptimal action because I got hit by a cosmic ray".

There will be 0 utility for taking left path, -10 utility for taking the right path, and -1000 utility for a cosmic ray hit. The CDT counterfactual says 0 utility for taking left path, -10 utility for taking the right path, while the conditional says 0 utility for left path, -1010 utility for right path (because conditional on taking the right path, you were hit by a cosmic ray).

In order to get the dutch book to go through, we need to get the agent to take the right path, to exploit P(cosmic ray) changing between the decision time and afterwards. So the initial bet could be something like -1 utility now, +12 utility upon taking the right path and not being hit by a cosmic ray. But now since the optimal action is "take the right path along with the bet", the problem setup has been changed, and we can't conclude that the agent's conditional on taking the right path places high probability on getting hit by a cosmic ray (because now the right path is the optimal action), so we can't money-pump with the "+0.5 utility, -12 utility upon taking a cosmic ray hit" bet.

So this seems to Dutch-book Death-in-Damascus, not CDT ≠ EDT cases in general.

Vivek Hebbar (2y)

If the agent follows EDT, it seems like you are giving it epistemically unsound credences. In particular, the premise is that it's very confident it will go left, and the consequence is that it in fact goes right. This was the world model's fault, not EDT's fault. (It is notable though that EDT introduces this loopiness into the world model's job.)

abramdemski (7y)

(lightly edited version of my original email reply to above comment; note that Diffractor was originally replying to a version of the Dutch-book which didn't yet call out the fact that it required an assumption of nonzero probability on actions.)

I agree that this Dutch-book argument won't touch probability zero actions, but my thinking is that it really should apply in general to actions whose probability is bounded away from zero (in some fairly broad setting). I'm happy to require an epsilon-exploration assumption to get the conclusion.

Your thought experiment raises the issue of how to ensure in general that adding bets to a decision problem doesn't change the decisions made. One thought I had was to make the bets always smaller than the difference in utilities. Perhaps smaller Dutch-books are in some sense less concerning, but as long as they don't vanish to infinitesimal, seems legit. A bet that's desirable at one scale is desirable at another. But scaling down bets may not suffice in general. Perhaps a bet-balancing scheme to ensure that nothing changes the comparative desirability of actions as the decision is made?

For your cosmic ray problem, what about:

You didn't specify the probability of a cosmic ray. I suppose it should have probability higher than the probability of exploration. Let's say 1/million for cosmic ray, 1/billion for exploration.

Before the agent makes the decision, it can be given the option to lose .01 util if it goes right, in exchange for +.02 utils if it goes right & cosmic ray. This will be accepted (by either a CDT agent or EDT agent), because it is worth approximately +.01 util conditioned on going right, since cosmic ray is almost certain in that case.

Then, while making the decision, cosmic ray conditioned on going right looks very unlikely in terms of CDT's causal expectations. We give the agent the option of getting .001 util if it goes right, if it also agrees to lose .02 conditioned on going right & cosmic ray.

CDT agrees to both bets, and so loses money upon going right.

Ah, that's not a very good money pump. I want it to lose money no matter what. Let's try again:

Before decision: option to lose 1 millionth of a util in exchange for 2 utils if right&ray.

During decision: option to gain .1 millionth util in exchange for -2 util if right&ray.

That should do it. CDT loses .9 millionth of a util, with nothing gained. And the trick is almost the same as my dutch book for death in damascus. I think this should generalize well.

The amounts of money lost in the Dutch Book get very small, but that's fine.
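The second attempt can be checked with a short sketch that enumerates every combination of path and cosmic ray and totals the two bets. The stakes are the ones stated above; the function and constant names are hypothetical. The sketch only verifies the net payoff; whether the agent accepts each bet rests on the expectations described in the comment and is not modeled here.

```python
from fractions import Fraction
from itertools import product

MILLIONTH = Fraction(1, 10**6)

FIRST_BET_COST = MILLIONTH       # paid before the decision
SECOND_BET_FEE = MILLIONTH / 10  # received during the decision
STAKE = Fraction(2)              # changes hands only if right & ray


def net_payoff(goes_right, ray):
    """Total from both bets in a world with the given path and ray outcome."""
    stake = STAKE if (goes_right and ray) else Fraction(0)
    first = -FIRST_BET_COST + stake   # bet bought before the decision
    second = SECOND_BET_FEE - stake   # reverse bet accepted during the decision
    return first + second


if __name__ == "__main__":
    for goes_right, ray in product((False, True), repeat=2):
        print(f"right={goes_right}, ray={ray}: {net_payoff(goes_right, ray)}")
```

The stake cancels in every world, so the agent is down 0.9 millionths of a util whatever happens.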

ESRogs (7y)

> I conclude from this that CDT should equal EDT (hence, causality must account for logical correlations, IE include logical causality). By "CDT" I really mean any approach at all to counterfactual reasoning; counterfactual expectations should equal evidential expectations.

> As with most of my CDT=EDT arguments, this only provides an argument that the expectations should be equal for actions taken with nonzero probability. In fact, the amount lost to Dutch Book will be proportional to the probability of the action in question. So, differing counterfactual and evidential expectations are smoothly more and more tenable as actions become less and less probable.

I'm having a little trouble following the terminology here (despite the disclaimer).

One particular thing that confuses me -- you say, "the expectations should be equal for actions taken with nonzero probability" and also "differing counterfactual and evidential expectations are smoothly more and more tenable as actions become less and less probable", but I'm having trouble understanding how they could both be true. How does, "they're equal for nonzero probability" match with "they move further and further apart the closer the probability gets to zero"? (Or are those incorrect paraphrases?)

It seems to me that if you have two functions that are equal whenever the input (the probability of an action) is nonzero, then they can't also get closer and closer together as the input increases from zero -- they're already equal as soon as the input does not equal zero! I assume that I have misunderstood something, but I'm not sure which part.

abramdemski (7y)

"The expectations should be equal for actions with nonzero probability" -- this means a CDT agent should have equal causal expectations for any action taken with nonzero probability, and EDT agents should similarly have equal evidential expectations. Actually, I should revise my statement to be more careful: in the case of epsilon-exploring agents, the condition is >epsilon rather than >0. In any case, my statement there isn't about evidential and causal expectations being equal to each other, but rather about one of them being constant across (sufficiently probable) actions.

"differing counterfactual and evidential expectations are smoothly more and more tenable as actions become less and less probable" -- this means that the amount we can take from a CDT agent through a Dutch Book, for an action which is given a different causal expectation than evidential expectation, smoothly reduces as the probability of an action goes to zero. In that statement, I was assuming you hold the difference between evidential and causal expectations constant as you reduce the probability of the action. Otherwise it's not necessarily true.
