Obstacle to modal optimality when you're being modalized

Summary: The optimality result for modal UDT on provably extensional problems doesn't carry over even to very straightforward problems where the agent's action is "predicted" (in the sense that a modalized $A$ is used) rather than just "run" (in the sense that a nonmodalized $A$ is used). In fact, there's a pair of such problems on which no modal decision theory can be simultaneously optimal.

Prerequisites: "Evil" decision problems in provability logic, An optimality result for modal UDT, Improving the modal UDT optimality result

Benja's modal UDT optimality results (especially the fact that $\neg □^{n} ⊥$ is the only axiom schema you'll ever need) are really nifty, but they correspond to exactly the games we might think of as "truly one-player". The agent $A ()$ depends on the universe $U ()$ only by proving things about it (all instances of $U ()$ within $A ()$ are modalized, by the definition of a modal agent), while the universe depends on the agent only by running it (all instances of $A ()$ within $U ()$ are nonmodalized, by the definition of provably extensional).

So in particular, if we want to consider multi-agent problems (or other non-extensional problems), the modal UDT algorithm continues to be perfectly well-defined, but the optimality result no longer has a valid proof. But could something like that result continue to hold?

OK, fine, you've read the summary, so you know that the answer is "no" in the following sense: I can show you two such universes, such that any modal agent fails in at least one case to achieve the best outcome that any modal agent can on that problem.

Let the possible actions in both universes be denoted by $a$ and $\neg a$ , and the possible outcomes be denoted by $g$ and $\neg g$ . Let's say that $g$ is good and $\neg g$ is bad.

Universe 1 is defined by $(U_{1} () = g) \leftrightarrow □ (A () = a)$ (and similarly, $(U_{1} () = \neg g) \leftrightarrow \neg □ (A () = a)$ ).

Universe 2 is defined by $(U_{2} () = g) \leftrightarrow □ (A () = \neg a)$ (and similarly, $(U_{2} () = \neg g) \leftrightarrow \neg □ (A () = \neg a)$ ).

I claim that no modal decision theory (as defined in the "evil" decision problem post) can get the outcome $g$ in both universes, but of course for each universe there exists a modal decision theory that gets the outcome $g$ . (The constant strategies are modal decision theories.) Let $\to T$ be a modal decision theory.

Lemma: $□ ⊥ \to (\to T (U_{1}) = a \leftrightarrow \to T (U_{2}) = a)$ .

Given the lemma, we see that $(\to T (U_{1}) = a \land \neg \to T (U_{2}) = a) \to \neg □ ⊥$ , so $□ (\to T (U_{1}) = a \land \neg \to T (U_{2}) = a) \to □ \neg □ ⊥$ , and as in Gödel's second incompleteness theorem, $□ \neg □ ⊥ \to □ ⊥$ . Thus $\neg □ ⊥ \to \neg (□ \to T (U_{1}) = a \land □ \neg \to T (U_{2}) = a)$ , and by definition of the universes, $\neg □ ⊥ \to \neg (U_{1} (\to T) = g \land U_{2} (\to T) = g)$ . Since we're evaluating fixed points in the standard model of the natural numbers, this completes the proof.

Now why should the lemma hold? Because $\to T$ is modalized, and so $□ ⊥$ implies that all subexpressions $□ A$ of $\to T$ evaluate to true, and this uniquely determines the value of $\to T$ given $□ ⊥$ , regardless of its input.

Thus we can't expect much in the way of optimality in non-extensional problems. (This in addition to several other obstacles to optimality in the special case of modal combat, to which this example also belongs.)

P.S. Thanks to Marcello Herreshoff for working this example out with me, and Benja Fallenstein for looking it over (though Benja wishes to register their disapproval for my notation, in which I've used nonmodal expressions like $U_{1} () = g$ and $T (U_{2}) = a$ to stand in for the fixed points of the theory).

P.P.S. I'm not happy calling this "modal UDT", because here we're showing that it doesn't succeed at doing what our philosophical intuitions about updateless decision theory ought to do. But the name is getting increasingly entrenched for both of these contexts...

AI ALIGNMENT FORUM
AF

AI ALIGNMENT FORUM
AF

3

Obstacle to modal optimality when you're being modalized

3