In this post, I hope to persuade you of what I consider to be an important principle when dealing with decision theory and counterfactuals.
Joe Carlsmith describes the Yankees vs. Red Sox Problem as below:
In this case, the Yankees win 90% of games, and you face a choice between the following bets:
| | Yankees win | Red Sox win |
| --- | --- | --- |
| You bet on Yankees | 1 | -2 |
| You bet on Red Sox | -1 | 2 |
Or, if we think of the outcomes here as “you win your bet” and “you lose your bet” instead, we get:
| | You win your bet | You lose your bet |
| --- | --- | --- |
| You bet on Yankees | 1 | -2 |
| You bet on Red Sox | 2 | -1 |
Before you choose your bet, an Oracle tells you whether you’re going to win your next bet. The issue is that once you condition on winning or losing (regardless of which), you should always bet on the Red Sox. So, the thought goes, EDT always bets on the Red Sox, and loses money 90% of the time. Betting on the Yankees every time does much better.
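To make the arithmetic in the quoted setup concrete, here is a minimal Python sketch. The payoffs and the 90% figure come from the problem statement; the function and variable names are my own illustrative choices, and the sketch assumes the winner of the game is independent of which bet you place.

```python
# Payoffs from the problem statement, indexed as PAYOFF[bet][winner].
PAYOFF = {
    "Yankees": {"Yankees": 1, "Red Sox": -2},
    "Red Sox": {"Yankees": -1, "Red Sox": 2},
}
# The Yankees win 90% of games.
P_WINNER = {"Yankees": 0.9, "Red Sox": 0.1}


def unconditional_ev(bet):
    """Expected payoff of a bet with no information from the Oracle."""
    return sum(P_WINNER[w] * PAYOFF[bet][w] for w in P_WINNER)


def payoff_given_result(bet, you_win):
    """Payoff of a bet once you condition on winning (or losing) it."""
    other = "Red Sox" if bet == "Yankees" else "Yankees"
    # If you win your bet, your team won; otherwise the other team did.
    winner = bet if you_win else other
    return PAYOFF[bet][winner]


for bet in PAYOFF:
    print(
        bet,
        "| unconditional EV:", round(unconditional_ev(bet), 2),
        "| if you win:", payoff_given_result(bet, True),
        "| if you lose:", payoff_given_result(bet, False),
    )
```

Without the Oracle, betting on the Yankees has the higher expected payoff, yet conditional on either "you win" or "you lose" the Red Sox row is better. That gap is exactly what generates the apparent problem.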
The mistake here is assuming that the Oracle's prediction applies in counterfactuals (worlds that don't occur) in addition to the factual (the world that does).
If the Oracle knows both:
a) The Yankees will win
b) You will bet on the Yankees, if you are told that you will win your next bet
Then the Oracle knows that it can safely predict you winning your bet, without any possibility of this prediction being wrong.
Notice that the Oracle doesn't need to know anything about the counterfactual where you bet Red Sox, except that the Yankees will win (and maybe not even that).
After all, Condition b) only applies when you are told that you will win your next bet. If you would have bet on Red Sox instead after being told that you were going to win, then the Oracle wouldn't have promised that you were going to choose correctly.
In fact, the Oracle mightn't have been able to publicly make a consistent prediction at all, as learning of its prediction might change your actions. This would be the case if all of the following three conditions held at once:
a) Yankees were going to win
b) If you were told that you'd win your next bet, you'd bet Red Sox
c) If you were told that you'd lose your next bet, you'd bet Yankees
The only way the Oracle could avoid being mistaken would be to make no prediction at all. This example clearly demonstrates how an Oracle's predictions can be limited by your betting tendencies.
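The two cases above can be checked mechanically. The following is a rough sketch under my own assumptions: your betting tendencies are modelled as a hypothetical `policy` dictionary mapping the announced prediction to the team you would then bet on, and the actual winner is treated as fixed. In the first case, how you would bet after being told "lose" is not specified above, so the value used for it is an arbitrary assumption.

```python
def consistent_predictions(winner, policy):
    """Return the announcements ("win" / "lose") that come true once announced.

    winner: the team that actually wins the game.
    policy: hypothetical mapping from the announced prediction to the team
            you would bet on after hearing it.
    """
    consistent = []
    for announced in ("win", "lose"):
        bet = policy[announced]
        you_actually_win = (bet == winner)
        # The announcement is safe only if it matches what then happens.
        if you_actually_win == (announced == "win"):
            consistent.append(announced)
    return consistent


# First case: told "win", you bet Yankees. The "lose" entry is an
# assumption, since the text only specifies the "win" branch.
cooperative = {"win": "Yankees", "lose": "Yankees"}
print(consistent_predictions("Yankees", cooperative))  # ['win']

# Second case: told "win" you bet Red Sox, told "lose" you bet Yankees.
contrarian = {"win": "Red Sox", "lose": "Yankees"}
print(consistent_predictions("Yankees", contrarian))  # []
```

The check for "win" never looks at what you would have done after hearing "lose", and vice versa: each announcement only has to hold in the world where it is actually delivered.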
To be clear, if the Oracle tells you that you are going to win, you can't interpret this as applying completely unconditionally. Instead, you have to allow that the Oracle's prediction may be contingent on how you bet (in many problems the Oracle actually knows how you will bet and will use this in its prediction).
The Oracle's prediction only has to apply to the world that is. It doesn't have to apply to worlds that are not.
Why not go with Joe's argument?
Joe argues that the Oracle's prediction renders your decision-making unstable. While this is "fishy", it's not clear to me that this is a mistake. After all, maybe the Oracle knows how many times you'll switch back and forth between the two teams before making your final decision? Maybe this doesn't answer Joe's objection, but plausibly it does.
Are there wider implications?
If it is conceded that the Oracle's prediction can vary in counterfactuals, then this would undermine the argument for 2-boxing in Newcomb's Problem, which relies on the Oracle's prediction being constant across all counterfactuals. I suppose someone could argue that this problem only demonstrates that the prediction can vary in counterfactuals when the prediction is publicly shared. But even if I haven't specifically shown that non-public predictions can vary across counterfactuals, I've still successfully undermined the notion that the past is fixed across counterfactuals.
This result is damaging to both EDT and CDT, both of which take the Oracle's prediction to apply across all counterfactuals.
I also suspect that anyone who finds this line of argument persuasive will end up being more persuaded by my explanation for why 1-boxing doesn't necessarily imply backwards causation (short answer: because counterfactuals are a construction, and constructing at least part of a counterfactual backwards is different from asserting that the internal structure of that counterfactual involves backwards causation). However, I can't yet explain in any clear fashion how the two are related.
Vladimir Nesov suggested that the principle should be "The Oracle's prediction only has to apply to the world where the prediction is delivered". My point was that Oracle predictions made in the factual don't apply to counterfactuals, but I prefer his way of framing things as it is more general.