Extracting Money from Causal Decision Theorists

by Caspar Oesterheld, 28th Jan 2021


This is a linkpost for https://doi.org/10.1093/pq/pqaa086

My paper with my Ph.D. advisor Vince Conitzer, titled "Extracting Money from Causal Decision Theorists", has been formally published (Open Access) in The Philosophical Quarterly. Many of you have probably seen either earlier drafts of this paper or similar arguments that others have independently given on this forum (e.g., Stuart Armstrong posted about an almost identical scenario; Abram Demski's post on Dutch-Booking CDT also has some similar ideas) and elsewhere (e.g., Spencer (forthcoming) and Ahmed (unpublished) both make arguments that resemble some points from our paper).

Our paper focuses on the following simple scenario, which can be used to, you guessed it, extract money from causal decision theorists:

Adversarial Offer: Two boxes, B1 and B2, are on offer. A (risk-neutral) buyer may purchase one or none of the boxes but not both. Each of the two boxes costs $1. Yesterday, the seller put $3 in each box that she predicted the buyer not to acquire. Both the seller and the buyer believe the seller's prediction to be accurate with probability 0.75.

At least one of the two boxes contains money. Therefore, the average box contains at least $1.50 in (unconditional) expectation. In particular, at least one of the two boxes must contain at least $1.50 in expectation. Since CDT doesn't condition on choosing box Bi when assigning an expected utility to choosing box Bi, for at least one of the two boxes the CDT expected utility of buying it is at least $1.50 − $1 = $0.50. Thus, CDT agents buy one of the boxes, to the seller's delight.
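The argument can be checked numerically. The following is a minimal sketch, not code from the paper: the function names are made up for illustration, and it assumes the numbers used in the comments below (price $1, prize $3, accuracy 0.75). It models the buyer's causal expected utilities as depending only on unconditional credences about what the seller predicted.

```python
from fractions import Fraction

PRICE = Fraction(1)        # cost of each box, in dollars
PRIZE = Fraction(3)        # amount put in each box predicted not to be bought
ACCURACY = Fraction(3, 4)  # believed accuracy of the seller's prediction


def cdt_expected_utilities(p_pred_b1, p_pred_b2):
    """CDT expected utility of each action, given the buyer's unconditional
    credences that the seller predicted B1 or B2 (the remainder is 'none')."""
    assert p_pred_b1 >= 0 and p_pred_b2 >= 0 and p_pred_b1 + p_pred_b2 <= 1
    return {
        "B1": PRIZE * (1 - p_pred_b1) - PRICE,  # B1 is full unless B1 was predicted
        "B2": PRIZE * (1 - p_pred_b2) - PRICE,  # B2 is full unless B2 was predicted
        "none": Fraction(0),
    }


def conditional_expected_payoff():
    """Expected payoff of buying a box when conditioning on the purchase:
    the bought box is full only if the seller mispredicted."""
    return PRIZE * (1 - ACCURACY) - PRICE


# Sweep a grid of credences and record the CDT value of the better box.
steps = 20
best_cases = []
for i in range(steps + 1):
    for j in range(steps + 1 - i):
        eu = cdt_expected_utilities(Fraction(i, steps), Fraction(j, steps))
        best_cases.append(max(eu["B1"], eu["B2"]))

print(min(best_cases))                # 1/2
print(conditional_expected_payoff())  # -1/4
```

Whatever credences the buyer holds, the better box looks worth at least $0.50 by CDT's lights, while the expected payoff conditional on actually buying is −$0.25.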

Most people on this forum are probably already convinced that (orthodox, two-boxing) CDT should be rejected. But I think the Adversarial Offer is one of the more convincing "counterexamples" to CDT. So perhaps the scenario is worth posing to your pro-CDT friends, and the paper worth sending to your pro-academic peer review, pro-CDT friends. (Relating their responses to me would be greatly appreciated – I am quite curious what different causal decision theorists think about this scenario.)



Comments:

I like the following example:

  • Someone offers to play rock-paper-scissors with me.
  • If I win I get $6. If I lose, I get $5.
  • Unfortunately, I've learned from experience that this person beats me at rock-paper-scissors 40% of the time, and I only beat them 30% of the time, so I lose $0.20 in expectation by playing.
  • My decision is set up as allowing 4 options: rock, paper, scissors, or "don't play."
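A quick sketch of the arithmetic in this example. It makes two assumptions that are not stated in the list: losing costs $5 (as the reply below suggests) and ties pay nothing. The helper name `causal_expected_utilities` is made up for illustration.

```python
from fractions import Fraction

WIN, LOSS = Fraction(6), Fraction(-5)             # assumed: a loss costs $5, a tie pays $0
P_WIN, P_LOSE = Fraction(3, 10), Fraction(4, 10)  # track record from the example

# Expected payoff of playing, judged by the track record.
track_record_ev = P_WIN * WIN + P_LOSE * LOSS

BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def causal_expected_utilities(opponent_credence):
    """Expected payoff of each move under fixed credences about the opponent's move."""
    eu = {}
    for move, beaten in BEATS.items():
        loses_to = next(m for m, b in BEATS.items() if b == move)
        eu[move] = WIN * opponent_credence[beaten] + LOSS * opponent_credence[loses_to]
    return eu


uniform = {m: Fraction(1, 3) for m in BEATS}
eu = causal_expected_utilities(uniform)
print(track_record_ev)   # -1/5
print(sum(eu.values()))  # 1
```

The three causal expected utilities always sum to $6 − $5 = $1, so for any credences about the opponent's move at least one move looks worth at least $1/3, even though the track record says playing loses $0.20 on average.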

This seems like a nice relatable example to me: it's not uncommon for someone to offer to bet on a rock-paper-scissors game, or to offer slightly favorable odds, and it's not uncommon for them to have a slight edge.

Are there features of the boxes case that don't apply in this case, or is it basically equivalent?

>If I win I get $6. If I lose, I get $5.

I assume you meant to write: "If I lose, I lose $5."

Yes, these are basically equivalent. (I even mention rock-paper-scissors bots in a footnote.)

I've skimmed over the beginning of your paper, and I think there might be several problems with it.

  1. I don't see where it is explicitly stated, but I think the information "seller's prediction is accurate with probability 0.75" is supposed to be common knowledge. Is it even possible for a non-trivial probabilistic prediction to be common knowledge? I don't mean in some real-life situation, but in the sense of this condition not being a logical contradiction. I am not a specialist on this subject, but it looks like a logical contradiction to me. And you can prove absolutely anything if your premise contains a contradiction.
  2. A minor nitpick compared to the previous one, but you don't specify what you mean by "prediction is accurate with probability 0.75". What kinds of mistakes does the seller make? For example, if the buyer is going to buy B1, then with probability 0.75 the prediction will be "B1". What about the remaining 0.25? Will it be 0.125 for "none" and 0.125 for "B2"? Will it be 0.25 for "none" and 0 for "B2"? (And does the buyer know about that? What about the seller knowing about the buyer knowing...)

    When you write "$1 − P(money in Bi | buyer chooses Bi) · $3 = $1 − 0.25 · $3 = $0.25", you assume that P(money in Bi | buyer chooses Bi) = 0.25. That is, if the buyer chooses the first box, the seller can't possibly think that the buyer will choose none of the boxes, and the same goes for the case of the buyer choosing the second box. You can easily fix it by writing "$1 − P(money in Bi | buyer chooses Bi) · $3 >= $1 − 0.25 · $3 = $0.25" instead. It is possible that you make some other implicit assumptions about the mistakes that the seller can make, so you might want to check it.
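One way to probe the question in item 2 is to parameterize how the seller's 0.25 of errors is split between predicting the other box and predicting "none". The sketch below is only an illustration under one reading of the scenario, namely that a box is filled whenever the seller predicted anything other than that box; the function name and the parameter `p_error_other_box` are made up here.

```python
from fractions import Fraction

PRICE, PRIZE, ACCURACY = Fraction(1), Fraction(3), Fraction(3, 4)


def p_money_in_chosen_box(p_error_other_box):
    """P(money in B1 | buyer buys B1) when the seller's errors are split between
    predicting 'B2' (p_error_other_box) and predicting 'none' (the remainder)."""
    p_error_none = (1 - ACCURACY) - p_error_other_box
    assert p_error_other_box >= 0 and p_error_none >= 0
    # Assumed reading: B1 holds money whenever the prediction was not 'B1'.
    return p_error_other_box + p_error_none


for q in (Fraction(0), Fraction(1, 8), Fraction(1, 4)):
    p = p_money_in_chosen_box(q)
    seller_profit = PRICE - p * PRIZE  # the quantity in the quoted formula
    print(q, p, seller_profit)
```

Under that reading, the probability comes out as 1/4 and the seller's expected profit as $0.25 for each of the three splits tried; a different reading of how boxes are filled after a "none" prediction could change this.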