Proofs Section 2.2 (Isomorphism to Expectations)

Previous proof post is here.

Theorem 2: Isomorphism Theorem: For (causal, pseudocausal, acausal, surcausal) $Θ^{s t}$ or $Θ^{ω}$ which fulfill finitary or infinitary analogues of all the defining conditions, $↑ (Θ^{s t})$ and $↓ (Θ^{ω})$ are (causal, pseudocausal, acausal, surcausal) hypotheses. Also, $↑$ and $\to^{s t}$ define an isomorphism between $Θ$ and $Θ^{s t}$ , and $↓$ and $\to^{ω}$ define an isomorphism between $Θ$ and $Θ^{ω}$ .

Proof sketch: The reason this proof is so horrendously long is that we've got almost a dozen conditions to verify, and some of them are quite nontrivial to show and will require sub-proof-sketches of their own! Our first order of business is verifying all the conditions for a full belief function for $↑ (Θ^{s t})$ . Then, we have to do it all over again for $↓ (Θ^{ω})$ . That comprises the bulk of the proof. Then, we have to show that taking a full belief function $Θ$ and restricting it to the infinite/finite levels fulfills the infinite/finite analogues of all the defining conditions for a belief function on policies or policy-stubs, which isn't quite as bad. Once we're done with all the legwork showing we can derive all the conditions from each other, showing the actual isomorphism is pretty immediate from the Consistency condition of a belief function.

Part 1: Let's consider $↑ (Θ^{s t})$ . This is defined as: $↑ (Θ^{s t}) (π_{p a}) := ⋂_{π_{s t} \leq π_{p a}} (p r_{*}^{π_{p a}, π_{s t}})^{- 1} (Θ^{s t} (π_{s t}))$

We'll show that all 9+2 defining conditions for a belief function are fulfilled for $↑ (Θ^{s t})$ . The analogues of the 9+2 conditions for a $Θ^{s t}$ are:

1: Stub Nirvana-free Nonemptiness: $\forall π_{s t} : Θ^{s t} (π_{s t}) \cap N F \neq \emptyset$

2: Stub Closure: $\forall π_{s t} : Θ^{s t} (π_{s t}) = \overline{Θ^{s t} (π_{s t})}$

3: Stub Convexity. $\forall π_{s t} : Θ^{s t} (π_{s t}) = c . h (Θ^{s t} (π_{s t}))$

4: Stub Nirvana-free Upper-Completion.

$\forall π_{s t} : Θ^{s t} (π_{s t}) \cap N F = ((Θ^{s t} (π_{s t}) \cap N F) + M^{s a} (F^{N F} (π_{s t}))) \cap M^{a} (F (π_{s t}))$

5: Stub Restricted Minimals:

$\exists λ^{⊙}, b^{⊙} \forall π_{s t} : (λ μ, b) \in (Θ^{s t} (π_{s t}))^{min} \to λ + b \leq λ^{⊙} + b^{⊙}$

6: Stub Normalization: ${inf}_{π_{s t}} E_{Θ^{s t} (π_{s t})} (0) = 0$ and ${sup}_{π_{s t}} E_{Θ^{s t} (π_{s t})} (1) = 1$

7: Weak Consistency: $\forall π_{s t}^{l o}, π_{s t}^{h i} \geq π_{s t}^{l o} : p r_{*}^{π_{s t}^{h i}, π_{s t}^{l o}} (Θ^{s t} (π_{s t}^{h i})) \subseteq Θ^{s t} (π_{s t}^{l o})$

8: Stub Extreme Point Condition: for all $M, π_{s t}^{l o}$ :

$M \in (Θ^{s t} (π_{s t}^{l o}))^{x m i n} \cap N F \to$

$\exists π \geq π_{s t}^{l o} \forall π_{s t}^{h i} : (π > π_{s t}^{h i} \geq π_{s t}^{l o} \to (\exists M^{'} \in Θ^{s t} (π_{s t}^{h i}) \cap N F : p r_{*}^{π_{s t}^{h i}, π_{s t}^{l o}} (M^{'}) = M))$

9: Stub Uniform Continuity: The function $π_{s t} \mapsto (p r_{*}^{\infty, π_{s t}})^{- 1} (Θ^{s t} (π_{s t}) \cap N F \cap {\leq ⊙})$ is uniformly continuous.

C: Stub Causality: $\forall π_{s t}, M \in Θ^{s t} (π_{s t}) \exists o f \forall π_{s t}^{'} : o f (π_{s t}) = M \land o f (π_{s t}^{'}) \in Θ^{s t} (π_{s t}^{'})$

where the outcome function $o f$ is defined over all stubs.

P: Stub Pseudocausality:

$\forall π_{s t}, π_{s t}^{'} : ((M \in Θ^{s t} (π_{s t}) \land supp (M) \subseteq F^{N F} (π_{s t}^{'})) \to M \in Θ^{s t} (π_{s t}^{'}))$

Let's begin showing the conditions. But first, note that since we have weak consistency, we can invoke Lemma 6 to reexpress $↑ (Θ^{s t}) (π_{p a})$ as $⋂_{n} (p r_{*}^{π_{p a}, π_{p a}^{n}})^{- 1} (Θ^{s t} (π_{p a}^{n}))$ , where $π_{p a}^{n}$ is the n'th member of the fundamental sequence of $π_{p a}$ .

Also note that, for all stubs, $↑ (Θ^{s t}) (π_{s t}) = Θ^{s t} (π_{s t})$ . We'll be casually invoking this all over the place and won't mention it further.

Proof: By Lemma 6 with weak consistency, $↑ (Θ^{s t}) (π_{s t}) = ⋂_{n \geq m} (p r_{*}^{π_{s t}, π_{s t}^{n}})^{- 1} (Θ^{s t} (π_{s t}^{n}))$ . Now, m can be anything we like, as long as it's finite. Set m to be larger than the maximum timestep that the stub is defined for. Then $π_{s t}^{n} = π_{s t}$ no matter what n is (since it's at least m) and projection from a stub to itself is identity, so the preimage is exactly our original set $Θ^{s t} (π_{s t})$ .
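As a toy illustration of why the fundamental sequence of a stub stabilizes, here is a minimal Python sketch. It assumes an encoding that is not from the post: a partial policy is a dict from observation histories (tuples) to actions, and the n'th member of the fundamental sequence is the restriction to histories of length less than n. The names `truncate` and `depth` are hypothetical.

```python
def truncate(policy, n):
    """n'th member of the (assumed) fundamental sequence: keep histories shorter than n."""
    return {h: a for h, a in policy.items() if len(h) < n}


def depth(stub):
    """One more than the longest history the stub is defined on (0 for the empty stub)."""
    return max((len(h) + 1 for h in stub), default=0)


# A stub defined on the empty history and on two length-1 histories.
stub = {(): "a", ("x",): "b", ("y",): "a"}
m = depth(stub)

# Every member of the sequence at or past m is the stub itself.
stable = all(truncate(stub, n) == stub for n in range(m, m + 5))

# An earlier member is strictly smaller.
shorter = truncate(stub, 1)
```

Under this encoding, once n is at least `m` the truncation does nothing, which is the step "$π_{s t}^{n} = π_{s t}$" used in the proof.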

We'll also be using another quick result. For all stubs $π_{s t}^{h i} \geq π_{s t}^{l o}$ , given stub causality,

$p r_{*}^{π_{s t}^{h i}, π_{s t}^{l o}} (Θ^{s t} (π_{s t}^{h i})) = Θ^{s t} (π_{s t}^{l o})$

Proof: Fix an arbitrary point $M \in Θ^{s t} (π_{s t}^{l o})$ . By causality, we get an outcome function which includes $M$ , giving us that there's something in $Θ^{s t} (π_{s t}^{h i})$ that projects down onto $M$ . Use weak consistency to get the other subset direction.

Condition 1: Nirvana-free Nonemptiness.

Invoking Stub Nirvana-free Nonemptiness, $\forall π_{s t} : Θ^{s t} (π_{s t}) \cap N F \neq \emptyset$ so we get nirvana-free nonemptiness for $↑ (Θ^{s t}) (π_{s t})$ .

Now, assume $π_{p a}$ is not a stub. By stub-bounded-minimals (condition 5 above), there is some $λ^{⊙} + b^{⊙}$ bound on the set of minimal points, regardless of stub. Let $(Θ^{s t} (π_{p a}^{n}))^{c l i p}$ be $Θ^{s t} (π_{p a}^{n}) \cap {\leq ⊙} \cap N F$ .

This contains all the minimal nirvana-free points for $π_{p a}^{n}$ . This set is nonempty because we have stub nirvana-free nonemptiness, so a nirvana-free point $M$ exists. We have stub-closure and stub minimal-boundedness, so we can step down to a minimal nirvana-free point below $M$ , and it obeys the $λ^{⊙} + b^{⊙}$ bound.

Further, by weak consistency and projection preserving $λ$ and $b$ and nirvana-freeness,

$p r_{*}^{π_{p a}^{n + 1}, π_{p a}^{n}} ((Θ^{s t} (π_{p a}^{n + 1}))^{c l i p}) \subseteq (Θ^{s t} (π_{p a}^{n}))^{c l i p}$

Invoking Lemma 9, the intersection of preimages of these is nonempty. It's also nirvana-free, because if there's nirvana somewhere, it occurs after finite time, so projecting down to some sufficiently large finite stage preserves the presence of Nirvana, but then we'd have a nirvana-containing point in a nirvana-free set $(Θ^{s t} (π_{p a}^{n}))^{c l i p}$ , which is impossible. This is also a subset of the typical intersection of preimages used to define $↑ (Θ^{s t}) (π_{p a})$ . Pick an arbitrary point in said intersection of preimages of clipped subsets.

Bam, we found a nirvana-free point in $↑ (Θ^{s t}) (π_{p a})$ and we're done.

Time for conditions 2 and 3, Closure and Convexity. These are easy.

$↑ (Θ^{s t}) (π_{p a}) = ⋂_{n} (p r_{*}^{π_{p a}, π_{p a}^{n}})^{- 1} (Θ^{s t} (π_{p a}^{n}))$

The preimage of a closed set (stub-closure) is a closed set, and the intersection of closed sets is closed, so we have closure.

Also, $p r_{*}^{π_{p a}, π_{p a}^{n}}$ is linear, so the preimage of a convex set (stub-convexity) is convex, and we intersect a bunch of convex sets so it's convex as well.

Condition 4: Nirvana-free upper completion.

Let $M \in↑ (Θ^{s t}) (π_{p a}) \cap N F$ . Let's check whether $M + M^{*}$ (assuming that's an a-measure and $M^{*}$ is nirvana-free) also lies in the set. A sufficient condition on this given how we defined things is that for all $π_{p a}^{n}$ , $p r_{*}^{π_{p a}, π_{p a}^{n}} (M + M^{*}) \in Θ^{s t} (π_{p a}^{n})$ , as that would certify that $M + M^{*}$ is in all the preimages.

$p r_{*}^{π_{p a}, π_{p a}^{n}}$ is linear, so $p r_{*}^{π_{p a}, π_{p a}^{n}} (M + M^{*}) = p r_{*}^{π_{p a}, π_{p a}^{n}} (M) + p r_{*}^{π_{p a}, π_{p a}^{n}} (M^{*})$

The first component is in $Θ^{s t} (π_{p a}^{n})$ , obviously. And then, by stub nirvana-free-upper-completion, we have a nirvana-free a-measure plus a nirvana-free sa-measure (projection preserves nirvana-freeness), making a nirvana-free a-measure (projection preserves a-measures), so $p r_{*}^{π_{p a}, π_{p a}^{n}} (M) + p r_{*}^{π_{p a}, π_{p a}^{n}} (M^{*})$ is in $Θ^{s t} (π_{p a}^{n}) \cap N F$ , and we're done.
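The two facts about projection used here (linearity, and preservation of $λ$ and $b$ ) can be checked on a finite toy model. The sketch below is an illustration under assumptions not made in the post: histories are tuples, an a-measure is a pair (dict from histories to mass, $b$ ), and projection to level n is the pushforward along truncation to the first n steps. The names `project`, `add` and `total_mass` are hypothetical, and for simplicity both summands are nonnegative, whereas $M^{*}$ in the argument may be a signed sa-measure.

```python
from fractions import Fraction as F


def project(a_measure, n):
    """Push (measure over histories, b) forward along truncation to length n."""
    measure, b = a_measure
    out = {}
    for history, mass in measure.items():
        prefix = history[:n]
        out[prefix] = out.get(prefix, 0) + mass
    return out, b


def add(x, y):
    """Componentwise sum of two (measure, b) pairs."""
    (mx, bx), (my, by) = x, y
    keys = set(mx) | set(my)
    return {k: mx.get(k, 0) + my.get(k, 0) for k in keys}, bx + by


def total_mass(a_measure):
    """The lambda value: total mass of the measure component."""
    return sum(a_measure[0].values())


M = ({("x", "x"): F(1, 2), ("x", "y"): F(1, 4)}, F(1, 8))
M_star = ({("x", "y"): F(1, 4), ("y", "x"): F(1, 2)}, F(1, 4))

# Linearity: projecting the sum equals summing the projections.
lhs = project(add(M, M_star), 1)
rhs = add(project(M, 1), project(M_star, 1))
```

Exact fractions are used so that the equality check is not disturbed by floating-point rounding.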

Condition 5: Bounded-Minimals

So, there is a critical $λ^{⊙} + b^{⊙}$ value by restricted-minimals for $Θ^{s t}$

Fix a $π_{p a}$ , and assume that there is a minimal point $M$ in $↑ (Θ^{s t}) (π_{p a})$ with a $λ + b$ value that exceeds the bound. Project $M$ down into each $Θ^{s t} (π_{p a}^{n})$ . Projection preserves $λ$ and $b$ , so each of these projected points $M_{n}$ exceeds the bound too, and therefore lies above some minimal point $M_{n}^{min}$ that obeys it.

Now, invoke Lemma 7 to construct a $M_{n}^{'} \in M^{a} (F (π_{p a}))$ (or the nirvana-free variant) that lies below $M$ , and projects down to $M_{n}^{min}$ . Repeat this for all n. All these $M_{n}^{'}$ points are a-measures and have the standard $λ^{⊙} + b^{⊙}$ bound so they all lie in a compact set and we can extract a convergent subsequence, that converges to $M^{'}$ , which still obeys the $λ^{⊙} + b^{⊙}$ bound.

$M^{'}$ is below $M$ because $({M} - M^{s a} (F (π_{p a}))) \cap M^{a} (F (π_{p a}))$ (or the nirvana-free variant) is a closed set. Further, by Lemma 10, $M^{'}$ is in the defining sequence of intersections for $↑ (Θ^{s t}) (π_{p a})$ . This witnesses that $M$ isn't minimal, because we found a point below it that actually obeys the bounds. Thus, we can conclude minimal-point-boundedness for $↑ (Θ^{s t})$ .

Condition 6: Normalization. We'll have to go out of order here, this can't be shown at our current stage. We're going to have to address Hausdorff continuity first, then consistency, and solve normalization at the very end. Let's put that off until later, and just get extreme points.

Condition 8: Extreme point condition:

The argument for this one isn't super-complicated, but the definitions are, so let's recap what condition we have and what condition we're trying to get.

Condition we have: for all $M, π_{s t}^{l o}$ :

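$M \in (Θ^{s t} (π_{s t}^{l o}))^{x m i n} \cap N F \to$

$\exists π \geq π_{s t}^{l o} \forall π_{s t}^{h i} : (π > π_{s t}^{h i} \geq π_{s t}^{l o} \to (\exists M^{'} \in Θ^{s t} (π_{s t}^{h i}) \cap N F : p r_{*}^{π_{s t}^{h i}, π_{s t}^{l o}} (M^{'}) = M))$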

Condition we want: for all $M, π_{s t}$ ,

$M \in (Θ^{s t} (π_{s t}))^{x m i n} \cap N F \to \exists π > π_{s t}, M^{'} : M^{'} \in↑ (Θ^{s t}) (π) \cap N F \land p r_{*}^{π, π_{s t}} (M^{'}) = M$

Ok, so $M \in (Θ^{s t} (π_{s t}))^{x m i n} \cap N F$ . By the stub extreme point condition, there's a $π \geq π_{s t}$ , where, for all $π_{s t}^{h i}$ that fulfill $π > π_{s t}^{h i} \geq π_{s t}$ , there's a $M^{'} \in Θ^{s t} (π_{s t}^{h i}) \cap N F$ , where $p r_{*}^{π_{s t}^{h i}, π_{s t}} (M^{'}) = M$ .

Lock in the $π$ we have. We must somehow go from this to a $M^{'} \in↑ (Θ^{s t}) (π) \cap N F$ that projects down to our point of interest. To begin with, let $π^{n}$ be the n'th member of the fundamental sequence for $π$ . Past a certain point m, these start being greater than $π_{s t}$ . The $M^{'} \in Θ^{s t} (π^{n}) \cap N F$ which projects down to $M$ that we get by the stub-extreme-point condition will be called $M_{n}^{^{'} l o}$ . Pick some random-ass point in $(p r_{*}^{π, π^{n}})^{- 1} (M_{n}^{^{'} l o})$ and call it $M_{n}^{^{'} h i}$ .

The $M_{n}^{^{'} h i}$ all share the $λ$ and $b$ values of $M$ , because they project down to $M$ . So they lie in a compact set, and we get a limit point of them, $M^{'}$ , and invoking Lemma 10, it's also in $↑ (Θ^{s t}) (π)$ . It also must be nirvana-free, because it's a limit of points that are nirvana-free for increasingly late times. It also projects down to $M$ because the sequence $M_{n}^{^{'} h i}$ was wandering around in the preimage of $M$ , which is closed.

Condition 9: Hausdorff Continuity:

Ok, this one is going to be fairly complicated. Remember, our original form is:

"The function $π_{s t} \mapsto (p r_{*}^{\infty, π_{s t}})^{- 1} (Θ^{s t} (π_{s t}) \cap N F \cap {\leq ⊙})$ is uniformly continuous"

And the form we want is:

"The function $π_{p a} \mapsto (p r_{*}^{\infty, π_{p a}})^{- 1} (↑ (Θ^{s t}) (π_{p a}) \cap N F \cap {\leq ⊙})$ is uniformly continuous"

Uniform continuity means that if we want an $ϵ$ Hausdorff-distance between two preimages, there's a $δ$ distance between partial policies that suffices to produce that. To that end, fix our $ϵ$ . We'll show that the $δ$ we get from uniform continuity on stubs suffices to tell us how close two partial policies must be.

So, we have an $ϵ$ . For uniform continuity, we need to find a $δ$ where, regardless of which two partial policies $π_{p a}$ and $π_{p a}^{'}$ we select, as long as they're $δ$ or less apart, the sets $(p r_{*}^{\infty, π_{p a}})^{- 1} (↑ (Θ^{s t}) (π_{p a}) \cap N F \cap {\leq ⊙})$ (and likewise for $π_{p a}^{'}$ ) are only $ϵ$ apart. So, every point in the first preimage must have a point in the second preimage only $ϵ$ distance away, and vice-versa. However, we can swap $π_{p a}$ and $π_{p a}^{'}$ (our argument will be order-agnostic) to establish the other direction, so all we really need to do is to show that every point in the preimage associated with $π_{p a}$ is within $ϵ$ of a point in the preimage associated with $π_{p a}^{'}$ .
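For readers who want the "every point has a nearby point, and vice-versa" requirement spelled out, here is a minimal Python sketch of Hausdorff distance on finite point sets in the plane. It only illustrates the standard definition; the sets in the proof are sets of a-measures over infinite histories, and the helper name `hausdorff` and the sample sets are made up for this example.

```python
import math


def hausdorff(A, B):
    """Hausdorff distance between two finite, nonempty sets of points in the plane."""
    def one_sided(X, Y):
        # How far the worst-placed point of X is from its nearest neighbour in Y.
        return max(min(math.dist(x, y) for y in Y) for x in X)

    return max(one_sided(A, B), one_sided(B, A))


A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]

# Every point of A also lies in B, so this one-sided distance is zero...
a_to_b = max(min(math.dist(x, y) for y in B) for x in A)

# ...but B has a point far from A, so the two-sided distance is not.
full = hausdorff(A, B)
```

The example shows why both directions matter in general. In the proof the second direction comes for free only because the argument is symmetric in $π_{p a}$ and $π_{p a}^{'}$ .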

First, $m := {log}_{γ} (δ)$ is the time at which two partial policies $δ$ apart may start differing. Conversely, any two partial policies which only disagree at-or-after m are $δ$ apart or less. Let $π_{s t}^{*}$ be the policy stub defined as follows: take the inf of $π_{p a}$ and $π_{p a}^{'}$ (the partial policy which is everything they agree on, which is going to perfectly mimic both of them up till time m), and clip things off at time m to make a stub. This is only $δ$ apart from $π_{p a}$ and $π_{p a}^{'}$ , because it perfectly mimics both of them up till time m, and then becomes undefined (so there's a difference at time m). Both $π_{p a}$ and $π_{p a}^{'}$ are $\geq π_{s t}^{*}$ .
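To make the bookkeeping with $δ$ and m concrete, here is a small Python sketch. It rests on assumptions chosen for illustration rather than taken from the post: a partial policy is a dict from observation histories to actions, time is history length, and the distance between two partial policies is $γ^{t}$ where t is the first time they disagree. The names `first_disagreement_time`, `distance` and `common_stub` are hypothetical.

```python
def first_disagreement_time(p, q):
    """Smallest history length where p and q differ or only one is defined (None if equal)."""
    times = [len(h) for h in set(p) | set(q) if p.get(h) != q.get(h)]
    return min(times) if times else None


def distance(p, q, gamma):
    """Assumed metric: gamma ** (first time of disagreement), 0 for identical policies."""
    t = first_disagreement_time(p, q)
    return 0.0 if t is None else gamma ** t


def common_stub(p, q, m):
    """Everything p and q agree on (their inf), clipped to histories shorter than m."""
    return {h: a for h, a in p.items() if len(h) < m and q.get(h) == a}


gamma, m = 0.5, 2
delta = gamma ** m  # chosen so that m = log_gamma(delta)

# Two partial policies that agree before time 2 and differ at time 2.
p = {(): "a", ("x",): "a", ("y",): "b", ("x", "x"): "a"}
q = {(): "a", ("x",): "a", ("y",): "b", ("x", "x"): "b"}

star = common_stub(p, q, m)
below_both = all(p[h] == a and q[h] == a for h, a in star.items())
close_to_both = distance(star, p, gamma) <= delta and distance(star, q, gamma) <= delta
```

In this toy case `star` is a stub lying below both policies and within `delta` of each, which mirrors the role $π_{s t}^{*}$ plays in the argument.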

Let $M$ be some totally arbitrary point in $(p r_{*}^{\infty, π_{p a}})^{- 1} (↑ (Θ^{s t}) (π_{p a}) \cap N F \cap {\leq ⊙})$ . $M$ is also in $(p r_{*}^{\infty, π_{s t}^{*}})^{- 1} (Θ^{s t} (π_{s t}^{*}) \cap N F \cap {\leq ⊙})$ , because $π_{p a} \geq π_{s t}^{*}$ , so $M$ projects down to some point in $Θ^{s t} (π_{s t}^{*})$ that's nirvana-free and obeys the same bound.

Let $π_{p a}^{^{'} n}$ , where $n \geq m$ , be the n'th stub in the fundamental sequence for $π_{p a}^{'}$ . These form a chain starting at $π_{s t}^{*}$ and ascending up to $π_{p a}^{'}$ , and are all $δ$ distance from $π_{s t}^{*}$ .

Anyways, in $M^{a} (\infty)$ , we can make a closed ball $B_{\leq ϵ}$ of size $ϵ$ around $M$ . This restricts $λ$ and $b$ to a small range of values, so we can use the usual arguments to conclude that $B_{\leq ϵ}$ is compact.

Further, because $π_{s t}^{*}$ is $δ$ or less away from $π_{p a}^{^{'} n}$ , the two sets $(p r_{*}^{\infty, π_{s t}^{*}})^{- 1} (Θ^{s t} (π_{s t}^{*}) \cap N F \cap {\leq ⊙})$ and $(p r_{*}^{\infty, π_{p a}^{^{'} n}})^{- 1} (Θ^{s t} (π_{p a}^{^{'} n}) \cap N F \cap {\leq ⊙})$ are within $ϵ$ of each other, so there's some point of the latter set that lies within our closed $ϵ$ -ball.

Consider the set $⋂_{n \geq m} ((p r_{*}^{\infty, π_{p a}^{^{'} n}})^{- 1} (Θ^{s t} (π_{p a}^{^{'} n}) \cap N F \cap {\leq ⊙}) \cap B_{\leq ϵ})$

The inner intersection is an intersection of a closed set and a compact set, so it's compact. Thus, this is an intersection of an infinite family of nonempty compact sets. To check the finite intersection property, just observe that the preimages of the sets $↑ (Θ^{s t}) (π_{p a}^{^{'} n}) \cap N F \cap {\leq ⊙}$ get smaller and smaller as n increases due to weak-consistency, but each one always has a point in the $ϵ$ -ball, so any finite intersection is nonempty.

Pick some arbitrary point $M^{'}$ from the intersection. It's $\leq ϵ$ away from $M$ since it's in the $ϵ$ -ball. However, we still have to show that $M^{'}$ is in $(p r_{*}^{\infty, π_{p a}^{'}})^{- 1} (↑ (Θ^{s t}) (π_{p a}^{'}) \cap N F \cap {\leq ⊙})$ to get Hausdorff-continuity to go through.

To begin with, since $M^{'}$ lies in our big intersection, we can project it down to any $Θ^{s t} (π_{p a}^{^{'} n}) \cap N F \cap {\leq ⊙}$ . Projecting it down to stage n makes $M_{n}^{'}$ . Let $M_{\infty}^{'}$ be the point in $↑ (Θ^{s t}) (π_{p a}^{'}) \cap N F \cap {\leq ⊙}$ defined by: $⋂_{n \geq m} (p r_{*}^{π_{p a}^{'}, π_{p a}^{^{'} n}})^{- 1} (M_{n}^{'})$

Well, we still have to show that this set is nonempty, contains only one point, and that it's in $↑ (Θ^{s t}) (π_{p a}^{'})$ , and is nirvana-free, to sensibly identify it with a single point.

Nonemptiness is easy, just invoke Lemma 9. It lies in the usual intersections that define $↑ (Θ^{s t}) (π_{p a}^{'})$ , so we're good there. If it had nirvana, it'd manifest at some finite point, but all finite projections are nirvana-free, so it's nirvana-free. If it had more than one point in it, they differ at some finite stage, so we can project to a finite $π_{p a}^{^{'} n}$ to get two different points, but they both project to $M_{n}^{'}$ , so this is impossible. Thus, $M_{\infty}^{'}$ is a legit point in the appropriate set. If the projection of $M^{'}$ didn't equal $M_{\infty}^{'}$ , then we'd get two different points, which differ at some finite stage, so we could project down to separate them, but they both project to $M_{n}^{'}$ for all n so this is impossible.

So, as a recap, we started with an arbitrary point $M$ in $(p r_{*}^{\infty, π_{p a}})^{- 1} (↑ (Θ^{s t}) (π_{p a}) \cap N F \cap {\leq ⊙})$ , and got another point $M^{'}$ that's only $ϵ$ or less away and lies in $(p r_{*}^{\infty, π_{p a}^{'}})^{- 1} (↑ (Θ^{s t}) (π_{p a}^{'}) \cap N F \cap {\leq ⊙})$ . This argument also works if we flip $π_{p a}$ and $π_{p a}^{'}$ , so the two preimages are only $ϵ$ or less apart in Hausdorff-distance.

So, given some $ϵ$ , there's some $δ$ where any two partial policies which are only $δ$ apart have preimages only $ϵ$ apart from each other in Hausdorff-distance. And thus, we have uniform continuity for the function mapping $π_{p a}$ to the set of a-measures over infinite histories which project down to $↑ (Θ^{s t}) (π_{p a}) \cap N F \cap {\leq ⊙}$ . Hausdorff-continuity is done.

Condition 7: Consistency.

Ok, we have two relevant things to check here. The first, very easy one, is that

$↑ (Θ^{s t}) (π_{p a}) = ⋂_{π_{s t} \leq π_{p a}} (p r_{*}^{π_{p a}, π_{s t}})^{- 1} (↑ (Θ^{s t}) (π_{s t}))$

From earlier, we know that $↑ (Θ^{s t}) (π_{s t}) = Θ^{s t} (π_{s t})$ , and from how $↑ (Θ^{s t}) (π_{p a})$ is defined, this is a tautology.

The other, much more difficult direction, is that

$↑ (Θ^{s t}) (π_{p a}) = \overline{c . h} (⋃_{π \geq π_{p a}} p r_{*}^{π, π_{p a}} (↑ (Θ^{s t}) (π)))$

We'll split this into four stages. First, we'll show one subset direction holds in full generality. Second, we'll get the reverse subset direction for causal/surcausal. Third, we'll show it for policy stubs for pseudocausal/acausal, and finally we'll use that to show it for all partial policies for pseudocausal/acausal.

First, the easy direction. $↑ (Θ^{s t}) (π_{p a}) \supseteq \overline{c . h} (⋃_{π \geq π_{p a}} p r_{*}^{π, π_{p a}} (↑ (Θ^{s t}) (π)))$

If we pick an arbitrary $M \in↑ (Θ^{s t}) (π)$ , it projects down to $Θ^{s t} (π_{s t})$ for all stubs $π_{s t}$ below $π$ . Since $π_{p a} \leq π$ , it projects down to all stubs beneath $π_{p a}$ . Since projections commute, $M$ projected down into $M^{a} (F (π_{p a}))$ makes a point that lies in the preimage of all the $Θ^{s t} (π_{s t})$ where $π_{s t} \leq π_{p a}$ , so it projects down into $↑ (Θ^{s t}) (π_{p a})$ .

This holds for all points in $↑ (Θ^{s t}) (π)$ , so $p r_{*}^{π, π_{p a}} (↑ (Θ^{s t}) (π)) \subseteq↑ (Θ^{s t}) (π_{p a})$ . This works for all $π \geq π_{p a}$ , so it holds for the union, and then due to closure and convexity which we've already shown, we get that the closed convex hull of the projections lies in $↑ (Θ^{s t}) (π_{p a})$ too, establishing one subset direction in full generality.

Now, for phase 2, deriving $↑ (Θ^{s t}) (π_{p a}) \subseteq \overline{c . h} (⋃_{π \geq π_{p a}} p r_{*}^{π, π_{p a}} (↑ (Θ^{s t}) (π)))$ in the causal/surcausal case.

First, observe that if $π \geq π_{p a}$ , then $π^{n} \geq π_{p a}^{n}$ . Fix some $M \in↑ (Θ^{s t}) (π_{p a})$ and arbitrary $π \geq π_{p a}$ . We'll establish the existence of a $M^{'} \in↑ (Θ^{s t}) (π)$ that projects down to $M$ .

To begin with, $M$ projects down to $M_{n}$ in $Θ^{s t} (π_{p a}^{n})$ . Lock in a value for n, and consider the sequence that starts off $M_{0}, M_{1}, \ldots, M_{n}$ , and then, by causality for stubs and $π^{n} \geq π_{p a}^{n}$ , you can find something in $Θ^{s t} (π^{n})$ that projects down onto $M_{n}$ , and something in $Θ^{s t} (π^{n + 1})$ that projects down onto that, and complete your sequence that way, making a sequence of points that all project down onto each other that climb up to $π$ . By Lemma 10, we get a $M_{n}^{'} \in↑ (Θ^{s t}) (π)$ . You can unlock n now. All these $M_{n}^{'}$ have the same $λ$ and $b$ value because projection preserves them, so we can isolate a convergent subsequence converging to some $M^{'} \in↑ (Θ^{s t}) (π)$ .

Assume $p r_{*}^{π, π_{p a}} (M^{'}) \neq M$ . Then we've got two different points. They differ at some finite stage, so there's some n where we can project down onto $M^{a} (F (π_{p a}^{n}))$ to witness the difference, but from our construction process for $M^{'}$ , both $M^{'}$ and $M$ project down to $M_{n}$ , and we get a contradiction.

So, since $p r_{*}^{π, π_{p a}} (↑ (Θ^{s t}) (π)) =↑ (Θ^{s t}) (π_{p a})$ , this establishes the other direction, showing equality, and thus consistency, for causal/surcausal hypotheses.

For part 3, we'll solve the reverse direction for pseudocausal/acausal hypotheses in the case of stubs, getting $↑ (Θ^{s t}) (π_{s t}) \subseteq \overline{c . h} (⋃_{π \geq π_{s t}} p r_{*}^{π, π_{s t}} (↑ (Θ^{s t}) (π)))$

Since we're working in the Nirvana-free case and are working with stubs, we can wield Lemma 3

$c . h ((Θ^{s t} (π_{s t}))^{min}) = c . h ((Θ^{s t} (π_{s t}))^{xmin})$

So, if we could just show that the union of the projections includes all the extreme minimal points, then when we take convex hull, we'd get the convex hull of the extreme minimal points, which by Lemma 3, would also nab all the minimal points as well. By Lemmas 11 and 12, our resulting convex hull of a union of projections from above would be upper-complete. It would also get all the minimal points, so it'd nab the entire $Θ^{s t} (π_{s t})$ within it and this would show the other set inclusion direction for pseudocausal/acausal stubs. Also, we've shown enough to invoke Lemma 20 to conclude that said convex hull is closed. Having fleshed out that argument, all we need is that all extreme minimal points are captured by the union of the projections.

By our previously proved extreme minimal point condition, for every extreme minimal point $M$ in $Θ^{s t} (π_{s t})$ , there's some $π > π_{s t}$ and $M^{'}$ in $↑ (Θ^{s t}) (π)$ that projects down to $M$ , which shows that all extreme points are included, and we're good.

For part 4, we'll show that in the nirvana-free pseudocausal/acausal setting, we have

$↑ (Θ^{s t}) (π_{p a}) \subseteq \overline{c . h} (⋃_{π \geq π_{p a}} p r_{*}^{π, π_{p a}} (↑ (Θ^{s t}) (π)))$

Fix some arbitrary $M \in↑ (Θ^{s t}) (π_{p a})$ . Our task is to express it as a limit of some sequence of points that are mixtures of stuff projected from above.

For this, we can run through the same exact proof path that was used in the part of the Lemma 21 proof about how the nirvana-free part of $Θ (π_{p a})$ is a subset of closed convex hull of the projections of the nirvana-free parts of $Θ (π)$ , $π \geq π_{p a}$ . Check back to it. Since we're working in the Nirvana-free case, we can apply it very straightforwardly. The stuff used in that proof path is the ability to project down and land in $↑ (Θ^{s t}) (π_{p a}^{n})$ (we have that by how we defined $↑$ ), Hausdorff-continuity (which we have), and stubs being the convex hull, not the closed convex hull, of projections of stuff above them (which we mentioned recently in part 3 of our consistency proof).

Thus, consistency is shown.

Condition 6: Normalization.

Ok, now that we have all this, we can tackle our one remaining condition, normalization. Then move on to the two optional conditions, causality and pseudocausality.

A key thing to remember is, in this setting, when you're doing $E_{H} (f)$ , it's actually $E_{H \cap N F} (f)$ , because if Murphy picked a thing with nirvana in it, you'd get infinite value, which is not ok, so Murphy always picks a nirvana-free point.

Let's show the first part, that ${inf}_{π} E_{↑ (Θ^{s t}) (π)} (0) = 0$ . This unpacks as: ${inf}_{π} {min}_{(λ μ, b) \in↑ (Θ^{s t}) (π) \cap N F} b = 0$ , and we have that ${inf}_{π_{s t}} {min}_{(λ μ, b) \in Θ^{s t} (π_{s t}) \cap N F} b = 0$ .

Projection preserves $b$ values, so we can take some nirvana-free point with a $b$ value of nearly 0, and project it down to $Θ^{s t} (π_{\emptyset})$ (belief function of the empty policy-stub) where there's no nirvana possible because no events happen.

So, we've got $b$ values of nearly zero in there. Do we have a point with a $b$ value of exactly zero? Yes. It's closed, and has bounded minimals, so we can go "all positive functionals are minimized by a minimal point", to get a point with a $b$ value of exactly zero. Then, we can invoke Lemma 21 (we showed consistency, extreme points, and all else required to invoke it) to decompose our point into the projections of nirvana-free stuff from above, all of which must have a $b$ value of 0. So, there's a nirvana-free point in some policy with 0 $b$ value.

Now for the other direction. Let's show that ${sup}_{π} E_{↑ (Θ^{s t}) (π)} (1) = 1$ .

This unpacks as: ${sup}_{π} {min}_{(λ μ, b) \in↑ (Θ^{s t}) (π) \cap N F} (λ + b) = 1$
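The two "unpackings" of normalization (that $E (0)$ is the smallest $b$ and $E (1)$ is the smallest $λ + b$ ) can be checked on a finite toy set. The sketch below assumes a finite set of nirvana-free points, each written as a triple ( $λ$ , probability distribution $μ$ , $b$ ), with the expectation taken as the minimum of $λ μ (f) + b$ . The name `expectation` and the sample points are made up for illustration.

```python
from fractions import Fraction as F


def expectation(points, f):
    """Worst-case value of f over a finite set of a-measures (lam, mu, b)."""
    return min(
        lam * sum(prob * f(x) for x, prob in mu.items()) + b
        for lam, mu, b in points
    )


# Hypothetical nirvana-free points: (lam, probability distribution mu, b).
H = [
    (F(1), {"h1": F(1, 2), "h2": F(1, 2)}, F(0)),
    (F(1, 2), {"h1": F(1)}, F(1, 4)),
]

e0 = expectation(H, lambda x: 0)  # reduces to the smallest b
e1 = expectation(H, lambda x: 1)  # reduces to the smallest lam + b
```

Because $μ$ is a probability distribution, $μ (1) = 1$ and $μ (0) = 0$ , which is all the unpacking uses.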

We'll show this by disproving that the sup is <1, and disproving that the sup is >1.

First, assume that, regardless of $π$ , ${min}_{(λ μ, b) \in↑ (Θ^{s t}) (π) \cap N F} (λ + b) \leq 1 - ϵ$ . Then, regardless of $π_{s t}$ we can pick some totally arbitrary $π > π_{s t}$ , and there's a nirvana-free point with a $λ + b$ value of $1 - ϵ$ or less. By consistency, we can project it down into $Θ^{s t} (π_{s t})$ , to get a nirvana-free point with a $λ + b$ value of $1 - ϵ$ or less. Thus, regardless of the stub we pick, there's nirvana-free points where Murphy can force a value of $1 - ϵ$ or less, which contradicts ${sup}_{π_{s t}} {min}_{(λ μ, b) \in Θ^{s t} (π_{s t}) \cap N F} (λ + b) = 1$

What if it's above 1? Assume there's some $π$ where ${min}_{(λ μ, b) \in↑ (Θ^{s t}) (π) \cap N F} (λ + b) \geq 1 + ϵ$

From uniform-continuity-Hausdorff, pick some $δ$ to get a $\frac{ϵ}{2}$ Hausdorff-distance or lower (for stuff obeying the $λ^{⊙} + b^{⊙}$ bound, which all minimal points of $↑ (Θ^{s t}) (π) \cap N F$ do for all $π$ ). This $δ$ specifies some extremely large n; consider $π^{n}$ . Now, consider the set of every policy $π^{'}$ above $π^{n}$ . All of these are $δ$ or less away from $π$ . Also, remember that the particular sort of preimage-to-infinity that we used for Hausdorff-continuity slices away all the nirvana.

So, Murphy, acting on $π$ , can only force a value of $1 + ϵ$ or higher. Now, there can be no nirvana-free point in $↑ (Θ^{s t}) (π^{'})$ with $λ + b < 1 + \frac{ϵ}{2}$ . The reason for this is that, since $π^{'}$ is $δ$ or less away from $π$ , there's a nirvana-free point in $↑ (Θ^{s t}) (π)$ that's $\frac{ϵ}{2}$ away, and thus has $λ + b < 1 + ϵ$ , which is impossible.

Ok, so all the nirvana-free points in $↑ (Θ^{s t}) (π^{'})$ where $π^{'} > π^{n}$ have $λ + b \geq 1 + \frac{ϵ}{2}$ .

Now, since we have Lemma 21, we can go "hm, $Θ^{s t} (π^{n})$ equals the convex hull of the projections of $↑ (Θ^{s t}) (π^{'}) \cap N F$ . Thus, any minimal point with $λ + b \leq 1$ is a finite mix of nirvana-free stuff from above, one of which must have $λ + b \leq 1$ . But we get a contradiction with the fact that there's no nirvana-free point from $π^{'}$ above $π^{n}$ with a $λ + b$ value that low, they're all $\geq 1 + \frac{ϵ}{2}$ "

So, since we've disproved both cases, ${sup}_{π} E_{↑ (Θ^{s t}) (π)} (1) = 1$ . And we're done with normalization! On to causality and pseudocausality.

Condition C: Causality.

An "outcome function" $o f$ for $↑ (Θ^{s t})$ is a function that maps each $π_{p a}$ to a point in $↑ (Θ^{s t}) (π_{p a})$ , s.t. for all $π_{p a}^{h i} \geq π_{p a}^{l o}$ : $p r_{*}^{π_{p a}^{h i}, π_{p a}^{l o}} (o f (π_{p a}^{h i})) = o f (π_{p a}^{l o})$ .

Causality is, if you have a $M \in↑ (Θ^{s t}) (π_{p a})$ , you can always find an outcome function $o f$ where $o f (π_{p a}) = M$ . Sadly, all we have is causality over stubs. We'll be using the usual identification between $↑ (Θ^{s t}) (π_{s t})$ and $Θ^{s t} (π_{s t})$ .

Anyways, fix a $π_{p a}$ and a point $M$ in $↑ (Θ^{s t}) (π_{p a})$ . Project $M$ down to get a sequence $M_{n} \in Θ^{s t} (π_{p a}^{n})$ . By causality for stubs, we can find an $o f_{n}$ where $o f_{n} (π_{p a}^{n}) = M_{n}$ and, for all $π_{s t}$ , $o f_{n} (π_{s t}) \in Θ^{s t} (π_{s t})$ . Observe that there are countably many stubs, and no matter the n, all the $λ$ and $b$ values are the same because projection preserves those. We can view $o f_{n}$ as a sequence in

$\prod_{π_{s t}} (Θ^{s t} (π_{s t}) \cap {(λ^{'} μ, b^{'}) | λ^{'} = λ \land b^{'} = b})$

By stub closure, and a $λ$ and $b$ bound, this is a product of compact sets, and thus compact by Tychonoff (no axiom of choice needed, it's just a countable product of compact metric spaces) so we can get a limiting $o f^{s t}$ (because it's only defined over stubs).

An outcome function for stubs fixes an outcome function for all partial policies, by

$o f (π_{p a}) = ⋂_{n} (p r_{*}^{π_{p a}, π_{p a}^{n}})^{- 1} (o f^{s t} (π_{p a}^{n}))$

We've got several things to show now. We need to show that $o f^{s t}$ is an outcome function, that $o f$ is well-defined, that $o f (π_{p a}) = M$ , and that it's actually an outcome function.

For showing that $o f^{s t}$ is an outcome function, observe that projection is continuous, and, letting n index our convergent subsequence of interest, regardless of stub $π_{s t}$ , ${lim}_{n \to \infty} o f_{n} (π_{s t}) = o f^{s t} (π_{s t})$ . With this,

$p r_{*}^{π_{s t}^{h i}, π_{s t}^{l o}} (o f^{s t} (π_{s t}^{h i})) = p r_{*}^{π_{s t}^{h i}, π_{s t}^{l o}} ({lim}_{n \to \infty} o f_{n} (π_{s t}^{h i})) = {lim}_{n \to \infty} p r_{*}^{π_{s t}^{h i}, π_{s t}^{l o}} (o f_{n} (π_{s t}^{h i}))$

$= {lim}_{n \to \infty} o f_{n} (π_{s t}^{l o}) = o f^{s t} (π_{s t}^{l o})$

Now, let's show that $o f$ is well-defined. Since $o f^{s t}$ is an outcome function, all the points project down onto each other, so we can invoke Lemma 9 to show that the preimage is nonempty. If the preimage had multiple points, we could project down to some finite stage to observe their difference, but nope, they always project to the same point. So it does pick out a single well-defined point, and it lies in $↑ (Θ^{s t}) (π_{p a})$ by being a subset of the defining sequence of intersection of preimages.

Does $o f (π_{p a}) = M$ ? Well, $M$ projected down to all the $M_{n}$ . If $n \geq m$ , then $o f_{n} (π_{p a}^{m}) = p r_{*}^{π_{p a}^{n}, π_{p a}^{m}} (o f_{n} (π_{p a}^{n})) = p r_{*}^{π_{p a}^{n}, π_{p a}^{m}} (M_{n}) = M_{m}$ . So, the limit specification $o f^{s t}$ has $o f^{s t} (π_{p a}^{n}) = M_{n}$ for all n. The only thing that projects down to make all the $M_{n}$ is $M$ itself, so $o f (π_{p a}) = M$ .

Last thing to check: Is $o f$ an outcome function over partial policies? Well, if $π_{p a}^{h i} \geq π_{p a}^{l o}$ , then for all n, $π_{p a}^{h i, n} \geq π_{p a}^{l o, n}$ . Assume $p r_{*}^{π_{p a}^{h i}, π_{p a}^{l o}} (o f (π_{p a}^{h i})) \neq o f (π_{p a}^{l o})$ . Then, in that case, we can project down to some $π_{p a}^{l o, n}$ and they'll still be unequal. However, since projections commute, it doesn't matter whether you project down to $π_{p a}^{l o}$ and then to $π_{p a}^{l o, n}$ , or whether you project down to $π_{p a}^{h i, n}$ (making $o f^{s t} (π_{p a}^{h i, n})$ ), and then project down to $π_{p a}^{l o, n}$ (making $o f^{s t} (π_{p a}^{l o, n})$ ). Wait, hang on, this is the exact point that $o f (π_{p a}^{l o})$ projects down to, contradiction. Therefore it's an outcome function.

And we're done, we took an arbitrary $π_{p a}$ and $M \in↑ (Θ^{s t}) (π_{p a})$ , and got an outcome function $o f$ with $o f (π_{p a}) = M$ , showing causality.

Condition P: Pseudocausality: If $(m, b) \in↑ (Θ^{s t}) (π_{p a})$ , and $m$ 's support is on $M^{a} (F^{N F} (π_{p a}^{'}))$ , then $(m, b) \in↑ (Θ^{s t}) (π_{p a}^{'})$ .

But all we have is, if $(m, b) \in Θ^{s t} (π_{s t})$ and $m$ 's support is on $M^{a} (F (π_{s t}^{'}))$ , then $(m, b) \in Θ^{s t} (π_{s t}^{'})$ .

There's a subtlety here. Our exact formulation of pseudocausality we want is the condition $supp (m) \subseteq F (π_{p a}^{'})$ , so if the measure is 0, then support is the empty set, which is trivially a subset of everything, then pseudocausality transfers it to all partial policies.

Ok, so let's assume that $M \in↑ (Θ^{s t}) (π_{p a})$ , and the measure part $m$ has its support being a subset of $M^{a} (F^{N F} (π_{p a}^{'}))$ but yet is not in $↑ (Θ^{s t}) (π_{p a}^{'})$ . Then, since this is an intersection of preimages from below, there should be some finite level $π_{p a}^{^{'} n}$ that you can project $M$ down to (it's present in $M^{a} (F^{N F} (π_{p a}^{'}))$ , just maybe not in $↑ (Θ^{s t}) (π_{p a}^{'})$ ) where the projection of $M$ (call it $M_{n}^{'}$ ) lies outside $Θ^{s t} (π_{p a}^{^{'} n})$ (lying outside the intersection of preimages)

This is basically "take $m$ , chop it off at height n". However, since $M \in↑ (Θ^{s t}) (π_{p a})$ , you can project it down to $Θ^{s t} (π_{p a}^{n})$ , which does the exact same thing of chopping $m$ off at height n, getting you $M_{n}^{'}$ exactly. We can invoke stub-pseudocausality (because with full measure, the history will land in $F (π_{p a}^{'})$ , then with full measure, the truncated history will land in $F (π_{p a}^{^{'} n})$ as the latter is prefixes of the former, or maybe the full measure is 0 in which case pseudocausality transfer still works) to conclude that $M_{n}^{'}$ actually lies inside $Θ^{s t} (π_{p a}^{^{'} n})$ , getting a contradiction. This establishes pseudocausality in full generality.

Ok, so we have one direction. $↑ (Θ^{s t})$ is a hypothesis, if $Θ^{s t}$ fulfills analogues of the hypothesis conditions for the finitary stub case. Our proof of everything doesn't distinguish between causal and surcausal, and the arguments work for all types of hypotheses, whether causal, surcausal, pseudocausal, or acausal. Ok, we're 1/4 of the way through. Now we do the same thing, but for building everything from infinitary hypotheses.

PART 2: Infinitary hypotheses. We now consider $↓ (Θ^{ω})$ , defined as $↓ (Θ^{ω}) (π_{p a}) = \overline{c . h} (⋃_{π \geq π_{p a}} p r_{*}^{π, π_{p a}} (Θ^{ω} (π)))$

We'll show that if $Θ^{ω}$ fulfills its 8+2 defining conditions, then the 9+2 defining conditions for a hypothesis hold for $↓ (Θ^{ω})$ . The 8+2 conditions for a $Θ^{ω}$ are:

1: Infinitary Nirvana-Free Nonemptiness: $\forall π : Θ^{ω} (π) \cap N F \neq \emptyset$

2: Infinitary Closure: $\forall π : Θ^{ω} (π) = \overline{Θ^{ω} (π)}$

3: Infinitary Convexity. $\forall π : Θ^{ω} (π) = c . h (Θ^{ω} (π))$

4: Infinitary Nirvana-free Upper-Completeness

$\forall π : Θ^{ω} (π) \cap N F = ((Θ^{ω} (π) \cap N F) + M^{s a} (F^{N F} (π))) \cap M^{a} (F (π))$

5: Infinitary Bounded Minimals: $\exists λ^{⊙}, b^{⊙} \forall π : (λ μ, b) \in {(Θ^{ω} (π))}^{min} \to λ + b \leq λ^{⊙} + b^{⊙}$

6: Normalization: ${inf}_{π} E_{Θ^{ω} (π)} (0) = 0$ and ${sup}_{π} E_{Θ^{ω} (π)} (1) = 1$

7: Nirvana-free consistency.

$\forall π_{s t} : c . h (⋃_{π > π_{s t}} p r_{*}^{π, π_{s t}} (Θ^{ω} (π) \cap N F)) = \overline{c . h} (⋃_{π > π_{s t}} p r_{*}^{π, π_{s t}} (Θ^{ω} (π))) \cap N F$

8: Infinitary Uniform Hausdorff Continuity:

The function $π \mapsto Θ^{ω} (π) \cap N F \cap {\leq ⊙}$ is uniformly continuous.

C: Infinitary Causality: Regardless of $π$ and $M \in Θ^{ω} (π)$ , there's an outcome function $o f$ over full policies s.t. $o f (π) = M$ , and for all $π^{'}$ and $π^{''}$ , $p r_{*}^{π^{'}, inf (π^{'}, π^{''})} (o f (π^{'})) = p r_{*}^{π^{''}, inf (π^{'}, π^{''})} (o f (π^{''}))$

P: Infinitary Pseudocausality: $\forall π, π^{'} : (M \in Θ^{ω} (π), supp (M) \subseteq F^{N F} (π^{'})) \to M \in Θ^{ω} (π^{'})$

Let's begin showing the conditions.

Condition 1: Nirvana-free Nonemptiness.

This one is trivial. Pick some $π \geq π_{p a}$ . There's a nirvana-free point. Project it down. You get a nirvana-free point and you're done.

Conditions 2 and 3, Closure and convexity. We explicitly took the closed convex hull when defining everything, these are tautological.

Condition 4: Nirvana-free upper completion.

For the pseudo/acausal case, it's doable by Lemmas 10, 11, and 12. The projection of an upper-complete set (by infinitary nirvana-free upper-completion) is upper-complete, so the union of projections is upper-complete, and then the convex hull is upper-complete, and then the closure is upper-complete and we're done.

We'll have to loop back to the causal case of Nirvana-free Upper Completion later, because we need Lemma 21 to make it go through and that requires consistency and the extreme point condition to make it work.

Condition 5: Bounded Minimals.

We can break down into three phases. First is showing that all points in the projection set have something under them that respects the $λ^{⊙} + b^{⊙}$ bound. Second is showing that all points in the convex hull of the union of projection sets have something under them that respects the $λ^{⊙} + b^{⊙}$ bound. Third is showing that all points in the closure have something under them that respects the usual bound. The reason we have to phrase it this way is that we don't necessarily know that our sets of interest are closed until the end, so we can't find a minimal point, just a bounded one that is lower, but that suffices to show that a "minimal point" that violates the restricted minimal condition isn't actually minimal.

For part 1, let $M \in p r_{*}^{π, π_{p a}} (Θ^{ω} (π))$ . Then, $M$ is the projection of some point $M^{'} \in Θ^{ω} (π)$ . By infinitary bounded-minimals, we can find a minimal point $M^{min} \in Θ^{ω} (π)$ below $M^{'}$ that obeys the $λ^{⊙} + b^{⊙}$ bound, so $M^{min} + M^{*} = M^{'}$ . Projecting down is linear, so we get $p r_{*}^{π, π_{p a}} (M^{min}) + p r_{*}^{π, π_{p a}} (M^{*}) = M$ , and $p r_{*}^{π, π_{p a}} (M^{min})$ is below $M$ and fulfills the $λ^{⊙} + b^{⊙}$ bound.

For part 2, let $M \in c . h (⋃_{π \geq π_{p a}} p r_{*}^{π, π_{p a}} (Θ^{ω} (π)))$ . We can rewrite $M$ as $E_{ζ} M_{i}$ , and then, by part 1, decompose the $M_{i}$ into $M_{i}^{l o}$ (not actually minimal, just a point obeying the $λ^{⊙} + b^{⊙}$ bound) and $M_{i}^{*}$ . Then we can decompose $M$ further into $E_{ζ} M_{i}^{l o} + E_{ζ} M_{i}^{*}$ . The former is an a-measure (mix of a-measures) and obeys the $λ^{⊙} + b^{⊙}$ bound since all its components do, and it's in the relevant convex hull, witnessing that $M$ has a point below it in the convex hull that obeys the bounds.
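To spell out why the mixed lower point still obeys the bound, here is the decomposition written out. This is only a sketch: $λ_{i}$ and $b_{i}$ for the values of the $M_{i}^{l o}$ , and $λ (\cdot)$ , $b (\cdot)$ for "the $λ$ or $b$ value of", are notation introduced here, and it assumes that the $λ$ and $b$ values of a mixture are the mixtures of the component values.

$M = E_{ζ} M_{i} = E_{ζ} (M_{i}^{l o} + M_{i}^{*}) = E_{ζ} M_{i}^{l o} + E_{ζ} M_{i}^{*}$

$λ (E_{ζ} M_{i}^{l o}) + b (E_{ζ} M_{i}^{l o}) = E_{ζ} (λ_{i} + b_{i}) \leq E_{ζ} (λ^{⊙} + b^{⊙}) = λ^{⊙} + b^{⊙}$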

For part 3, let $M$ be in the closure of the convex hull. There's some sequence $M_{n}$ in the convex hull that limits to $M$ . Below each $M_{n}$ we can find a $M_{n}^{l o}$ (again, not actually minimal) that obeys the $λ^{⊙} + b^{⊙}$ bound. Invoke Lemma 16 to get a point below $M$ that respects the bounds, and we're done.

Condition 6: Normalization.

We literally have the exact phrasing of normalization we need already, this is a tautology.

Condition 7: Consistency.

Ok, one direction is trivial because $↓ (Θ^{ω}) (π) = Θ^{ω} (π)$ , so we can just use the definition of $↓$ . $↓ (Θ^{ω}) (π_{p a}) = \overline{c . h} (⋃_{π \geq π_{p a}} p r_{*}^{π, π_{p a}} (↓ (Θ^{ω}) (π)))$

The other direction, that everything equals the intersection of preimages of stuff below it, is trickier. One subset direction isn't too bad, the one that

$↓ (Θ^{ω}) (π_{p a}) \subseteq ⋂_{π_{s t} \leq π_{p a}} (p r_{*}^{π_{p a}, π_{s t}})^{- 1} (↓ (Θ^{ω}) (π_{s t}))$

If we take a $M \in↓ (Θ^{ω}) (π_{p a})$ that wasn't added in the final closure step, it's expressible as $E_{ζ} M_{i}$ , and all the $M_{i}$ come from points $M_{i}^{\infty}$ in $Θ^{ω} (π_{i})$ where $π_{i} \geq π_{p a}$ . Projecting the $M_{i}^{\infty}$ down to a $π_{s t} \leq π_{p a}$ instead makes $M_{i}^{l o}$ , which mix together in the same way to make $M^{l o} \in↓ (Θ^{ω}) (π_{s t})$ . Because projections are linear and commute, $M^{l o}$ is the projection of $M$ . So, any point in $↓ (Θ^{ω}) (π_{p a})$ (without the closure step) projects down to lie in $↓ (Θ^{ω}) (π_{s t})$ for any $π_{s t} \leq π_{p a}$ .
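The "linear and commute" step can be written as a single chain. A sketch, with $π_{i}$ , $M_{i}^{\infty}$ and $ζ$ as in the paragraph above:

$p r_{*}^{π_{p a}, π_{s t}} (M) = p r_{*}^{π_{p a}, π_{s t}} (E_{ζ} p r_{*}^{π_{i}, π_{p a}} (M_{i}^{\infty})) = E_{ζ} p r_{*}^{π_{p a}, π_{s t}} (p r_{*}^{π_{i}, π_{p a}} (M_{i}^{\infty})) = E_{ζ} p r_{*}^{π_{i}, π_{s t}} (M_{i}^{\infty}) = E_{ζ} M_{i}^{l o}$

The second equality is linearity of projection, the third is that projections commute, and the final mix is exactly the point in $↓ (Θ^{ω}) (π_{s t})$ built from the $M_{i}^{l o}$ .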

Then, for the closure step, we just fix a sequence $M_{n}$ limiting to $M$ . The $M_{n}$ can project down to whichever $↓ (Θ^{ω}) (π_{s t})$ you wish, and by continuity of projection, the $M$ comes along for the ride as a limit point. However, $↓ (Θ^{ω}) (π_{s t})$ is closed, so $M$ projects down to land in that set as well. Bam, any old $M \in↓ (Θ^{ω}) (π_{p a})$ projects down to land in any $↓ (Θ^{ω}) (π_{s t})$ set you wish with $π_{s t} \leq π_{p a}$ , certifying that $↓ (Θ^{ω}) (π_{p a})$ lies in the intersection of preimages of stubs below.

Now, we just have to establish $↓ (Θ^{ω}) (π_{p a}) \supseteq ⋂_{π_{s t} \leq π_{p a}} (p r_{*}^{π_{p a}, π_{s t}})^{- 1} (↓ (Θ^{ω}) (π_{s t}))$ which splits into two cases. The causal/surcausal case, and the pseudocausal/acausal case where you don't have to worry about nirvana.

For the nirvana-free case... We can use the same proof strategy as the last part of Lemma 21, where we were showing the result for partial policies. It may be a bit nonobvious why it works. We do need to swap things around a bit, and will mention important changes without fleshing out the fiddly details, which are already given in the last part of Lemma 21.

Start with a $M$ in the intersection of preimages of stubs below. To show it's in $↓ (Θ^{ω}) (π_{p a})$ , we need a sequence limiting to it, where each member of the sequence is a mix of finitely many points projected down from policies above $π_{p a}$ . The end part of Lemma 21 gives how to construct such a sequence. The fact that we're working in a nirvana-free setting means you can ignore all fiddly details about points being nirvana-free and preimages of only the nirvana-free parts, because everything fulfills that. The key steps in that proof path are:

1: being able to project down $M$ to make a sequence $M_{n} \in↓ (Θ^{ω}) (π_{p a}^{n})$ . We trivially have this by $M$ being defined as "in the intersection of preimages of stubs below it".

2: Having uniform Hausdorff-continuity for the policies. This is our condition 8 we're assuming, so we're good there.

3: The ability to shatter our $M_{n}$ into finitely many $M_{i, n}$ which are the projections of various $M_{i, n}^{\infty}$ points from above. This is the key difference. The proof of Lemma 21 had to set up that fact beforehand. However, in our case, we have the Nirvana-free consistency condition, which says

$\forall π_{s t} : c . h (⋃_{π > π_{s t}} p r_{*}^{π, π_{s t}} (Θ^{ω} (π) \cap N F)) = \overline{c . h} (⋃_{π > π_{s t}} p r_{*}^{π, π_{s t}} (Θ^{ω} (π))) \cap N F$

But, since we're working in the nirvana-free setting, this turns into:

$\forall π_{s t} : c . h (⋃_{π > π_{s t}} p r_{*}^{π, π_{s t}} (Θ^{ω} (π))) = \overline{c . h} (⋃_{π > π_{s t}} p r_{*}^{π, π_{s t}} (Θ^{ω} (π)))$

And that right-hand term... is just the definition of $↓ (Θ^{ω}) (π_{s t})$ ! So, swapping that out, and specializing our arbitrary stub to $π_{p a}^{n}$ , we have:

$c . h (⋃_{π > π_{p a}^{n}} p r_{*}^{π, π_{p a}^{n}} (Θ^{ω} (π))) =↓ (Θ^{ω}) (π_{p a}^{n})$

So, since our $M_{n}$ lie in $↓ (Θ^{ω}) (π_{p a}^{n})$ , they can be written as a finite mix of nirvana-free things from above projected down, and the Lemma 21 argument goes through.

Now, for the nirvana cases where we can assume infinitary causality. We'll do this by showing a little sublemma, that, if $π \geq π_{p a}$ , then $p r_{*}^{π, π_{p a}} (Θ^{ω} (π)) =↓ (Θ^{ω}) (π_{p a})$

First, we'll show that if $π, π^{'} \geq π_{p a}$ , then $p r_{*}^{π, π_{p a}} (Θ^{ω} (π)) = p r_{*}^{π^{'}, π_{p a}} (Θ^{ω} (π^{'}))$

Fix an arbitrary $M$ in the projection of $Θ^{ω} (π)$ , we can get a preimage point $M^{h i} \in Θ^{ω} (π)$ . Then, by infinitary causality, we can make a point $M^{^{'} h i} \in Θ^{ω} (π^{'})$ that projects down to $M$ . Just make an outcome function $o f$ where $o f (π) = M^{h i}$ , feed in $π^{'}$ , that gets you your point $M^{^{'} h i}$ , the two agree when you project them down to $inf (π, π^{'})$ , and $π_{p a}$ is further down than that and projections commute so they both hit the same point $M$ if you project down. Flipping $π$ and $π^{'}$ shows our equality.

Alright, so now $↓ (Θ^{ω}) (π_{p a})$ can be written as $\overline{c . h} (p r_{*}^{π, π_{p a}} (Θ^{ω} (π)))$ where $π$ is arbitrary above $π_{p a}$ .

Projection is linear, so the projection of a convex set is convex. To get the closure points, just take a sequence $M_{n}$ in the projection limiting to some $M$ . Take preimage points $M_{n}^{\infty} \in Θ^{ω} (π)$ . There's a bound on the $λ$ and $b$ values of this sequence because projections preserve $λ$ and $b$ and our sequence $M_{n}$ converges, so we can apply the Compactness Lemma and get a convergent subsequence limiting to a point $M^{\infty}$ , which must be in $Θ^{ω} (π)$ because closure. Projection is continuous, so $M^{\infty}$ projects down to $M$ . And we have $p r_{*}^{π, π_{p a}} (Θ^{ω} (π)) =↓ (Θ^{ω}) (π_{p a})$ proved! Wow, that was a sublemma of case 2 of part 3 of the proof of condition 7 in part 2 of the proof of the Isomorphism theorem, we're really in the weeds at this point.

Moving on, how can we use this to show $↓ (Θ^{ω}) (π_{p a}) \supseteq ⋂_{π_{s t} \leq π_{p a}} (p r_{*}^{π_{p a}, π_{s t}})^{- 1} (↓ (Θ^{ω}) (π_{s t}))$ for the causal case, which is the last bit we need to show consistency?

Well, fix a $M$ in the intersection of preimages, and an arbitrary $π \geq π_{p a}$ . $M$ projects down to make some $M_{n} \in↓ (Θ^{ω}) (π_{p a}^{n})$ . Since $π \geq π_{p a} \geq π_{p a}^{n}$ , and we have our sublemma, there's a point $M_{n}^{\infty}$ in $Θ^{ω} (π)$ that projects down to some $M_{n}^{'} \in↓ (Θ^{ω}) (π_{p a})$ , and further down to $M_{n}$ .

This sequence $M_{n}^{\infty}$ all has the same $λ$ and $b$ value since projection preserves those, so by the Compactness Lemma and closure, there's a convergent subsequence and limit point $M^{\infty} \in Θ^{ω} (π)$ . Does $M^{\infty}$ project down onto $M$ (witnessing that $M \in↓ (Θ^{ω}) (π_{p a})$ )?

Well, let's say it didn't and projecting down gets you a distinct point. Then there's some n where projecting down further to $π_{p a}^{n}$ would keep the points distinct, since they have to differ at some finite time. But... after time n, our sequence $M_{n}^{\infty}$ is roaming around entirely in the preimage of $M_{n}$ , so the limit point is in there too, and it projects down to $M_{n}$ and we have a contradiction. Therefore, $M^{\infty} \in Θ^{ω} (π)$ projects down onto $M$ , witnessing that $M \in↓ (Θ^{ω}) (π_{p a})$ , and $M$ was arbitrary in the intersection of preimages.

So, we have $↓ (Θ^{ω}) (π_{p a}) \supseteq ⋂_{π_{s t} \leq π_{p a}} (p r_{*}^{π_{p a}, π_{s t}})^{- 1} (↓ (Θ^{ω}) (π_{s t}))$ for the nirvana-containing causal/surcausal case, which is the last piece we needed to show consistency.

Condition 8: Extreme point condition.

The thing we want is that an extreme nirvana-free minimal point $M^{e x}$ in $↓ (Θ^{ω}) (π_{s t}) \cap N F$ is the projection of a nirvana-free point from a policy above it. By the nirvana-free consistency property, $M^{e x}$ lies in $c . h (⋃_{π \geq π_{s t}} p r_{*}^{π, π_{s t}} (Θ^{ω} (π) \cap N F))$

$M^{e x}$ is extreme, so it lies in $⋃_{π \geq π_{s t}} p r_{*}^{π, π_{s t}} (Θ^{ω} (π) \cap N F)$

So, there's some point in some $Θ^{ω} (π) \cap N F$ that projects down to $M^{e x}$ and we're done.
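For readers who want the "extreme, so it lies in the union" step unpacked, here's a sketch. It reads "extreme" as "extreme point of $↓ (Θ^{ω}) (π_{s t}) \cap N F$ ", which is an assumption about the intended meaning, and the shorthand $U$ is introduced here only for illustration.

$U := ⋃_{π \geq π_{s t}} p r_{*}^{π, π_{s t}} (Θ^{ω} (π) \cap N F) \subseteq↓ (Θ^{ω}) (π_{s t}) \cap N F$

$(M^{e x} = E_{ζ} M_{i} \wedge \forall i : M_{i} \in U \wedge ζ_{i} > 0) \Rightarrow \forall i : M_{i} = M^{e x} \Rightarrow M^{e x} \in U$

The first line holds because projections of points from above land in $↓ (Θ^{ω}) (π_{s t})$ and projection preserves nirvana-freeness. The second is the definition of an extreme point, applied to a mix of points from that set.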

Condition 9: Hausdorff Continuity:

Ok, this one is going to be complicated. We'll work with the Lemma 15 version of Hausdorff-continuity, where a $δ$ difference between two policies means that if you start off in one preimage, you've gotta travel $ϵ (1 + λ)$ distance or less to get to the preimage associated with the other policy, and vice-versa.

We split into two parts. Part 1 is showing that if $π_{p a} \geq π_{s t}$ , and the distance between $π_{p a}$ and $π_{s t}$ is $δ$ or less, the distance between their respective preimages is low. Part 2 exploits part 1 to show that if $π_{p a}$ and $π_{p a}^{'}$ are $δ$ apart, then the distance between their preimages is low, which will be pretty easy once we have part 1.

Our Hausdorff-continuity condition links $δ$ and $ϵ$ . So, when we fix an $ϵ$ and are like "how close do the policies have to be to guarantee the preimages are $ϵ$ apart", pick the $δ$ that gets you $\frac{ϵ}{3}$ distance w.r.t our original Hausdorff-continuity condition, and also have $δ < \frac{ϵ}{3}$ .

Our first part uses $M^{l o}$ and $M^{^{'} l o}$ for points in $↓ (Θ^{ω}) (π_{s t}) \cap N F$ , $M^{^{'} m i d}$ for a point in $↓ (Θ^{ω}) (π_{p a}) \cap N F$ , $M^{h i}$ and $M^{^{'} h i}$ for points in $M^{a} (\infty)$ (that are expressible as a finite mix of points from $Θ^{ω} (π)$ for varying $π$ ), and $M, M^{'}$ for two more general points in $M^{a} (\infty)$ .

So, one half of showing the two preimages are close to each other is trivial. Everything in $↓ (Θ^{ω}) (π_{p a}) \cap N F$ projects down into $↓ (Θ^{ω}) (π_{s t}) \cap N F$ by consistency, and projection preserving nirvana-freeness, so the preimage associated with $π_{s t}$ is a superset of the preimage associated with $π_{p a}$ , so there's distance 0 from a point in the $π_{p a}$ preimage to a point in the $π_{s t}$ preimage.

The other half is trickier. Pick an arbitrary point $M$ in $(p r_{*}^{\infty, π_{s t}})^{- 1} (↓ (Θ^{ω}) (π_{s t}) \cap N F)$ , and let $λ$ be the $λ$ value of this point. $M$ projects down to some point $M^{l o}$ in $↓ (Θ^{ω}) (π_{s t}) \cap N F$ . From nirvana-free consistency, $M^{l o} \in c . h (⋃_{π \geq π_{s t}} p r_{*}^{π, π_{s t}} (Θ^{ω} (π) \cap N F))$ , so $M^{l o}$ can be produced by (keeping in mind that it doesn't matter whether we mix before or after projecting) taking finitely many points in varying $Θ^{ω} (π) \cap N F$ sets, mixing them to make a point $M^{h i}$ , and then projecting down.

Important note: $M^{h i}$ is not necessarily equal to, or even close to, $M$ .

Because $π_{s t}$ is within $δ$ of $π_{p a}$ , every policy above $π_{s t}$ has a corresponding policy within $δ$ that lies above $π_{p a}$ . Thus, we can perturb the component points (indexed by i) that mix to make $M^{h i}$ by $\frac{ϵ}{3} (1 + λ_{i})$ (infinitary Hausdorff-uniform-continuity Lemma 15 variant, $δ$ was assumed to be small enough for that to be the case), and mix them, to get a $M^{^{'} h i}$ in $c . h (⋃_{π \geq π_{p a}} (Θ^{ω} (π) \cap N F))$

$M^{^{'} h i}$ projects down to $↓ (Θ^{ω}) (π_{p a}) \cap N F$ to make a $M^{^{'} m i d}$ , and further projects down to $↓ (Θ^{ω}) (π_{s t}) \cap N F$ to make a $M^{^{'} l o}$ . Because $M^{^{'} h i}$ is within $\frac{ϵ}{3} (1 + λ)$ of $M^{h i}$ , projecting down (and projecting being nonexpansive) means that $M^{^{'} l o}$ is within $\frac{ϵ}{3} (1 + λ)$ of $M^{l o}$ .
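The last claim combines nonexpansiveness with the per-component perturbation bound. Here's a sketch, under the assumption that the distance on a-measures is convex under mixing (true for any norm-induced metric), and writing $M_{i}^{h i}$ , $M_{i}^{^{'} h i}$ for the component points before and after perturbation (names introduced here):

$d (M^{l o}, M^{^{'} l o}) \leq d (M^{h i}, M^{^{'} h i}) \leq E_{ζ} d (M_{i}^{h i}, M_{i}^{^{'} h i}) \leq E_{ζ} \frac{ϵ}{3} (1 + λ_{i}) = \frac{ϵ}{3} (1 + E_{ζ} λ_{i}) = \frac{ϵ}{3} (1 + λ)$

The final equality uses that $E_{ζ} λ_{i}$ is the $λ$ value of $M^{h i}$ , which equals that of $M^{l o}$ and hence of $M$ , since projections preserve $λ$ .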

Now, we can take $M^{^{'} l o}$ and fill in all the missing measure data to get a $M^{'}$ that projects down onto $M^{^{'} m i d}$ (certifying that it's in the preimage of $↓ (Θ^{ω}) (π_{p a}) \cap N F$ ) as follows. Our most important constraint is that, when extending $m^{^{'} l o}$ , it should perfectly mimic $m^{^{'} m i d}$ so it can project down onto it. Our second constraint is that, if $m^{l o}$ doesn't specify what comes after a finite history and it doesn't conflict with the first constraint, it should exactly mimic the conditional probabilities of $m$ . Also, our $δ$ fixes a first time n (i.e., ${log}_{γ} (δ)$ ) at which $π_{p a}$ is defined where $π_{s t}$ isn't, so all conflicts of the second constraint with the first constraint must happen after then. This does the following: We can slice the histories assigned measures by $M^{'}$ into three parts.

Part 1 is prefixes of histories in $F (π_{s t})$ . There's only $\frac{ϵ}{3} (1 + λ)$ difference in these between $M$ and $M^{'}$ (after all, projecting down to $π_{s t}$ leaves these unchanged, and $M$ / $M^{'}$ project down to $M^{l o}$ and $M^{^{'} l o}$ which are only $\frac{ϵ}{3} (1 + λ)$ apart).

Part 2 is histories which have as a prefix something in $F (π_{s t})$ less than length n. In that case, we're mimicking the conditional probabilities of $m$ .

Part 3 is histories which have as a prefix something in $F (π_{s t})$ of length n or higher. Because this is the threshold where $π_{p a}$ and $π_{s t}$ start differing, we've got to obey the $m^{^{'} m i d}$ probabilities. But this only occurs after time n.

Let's analyze the difference between $M^{'}$ and $M$ , shall we? Our two relevant results are Vanessa's folk result that two distributions that differ by an amount will differ by the same amount if we extend them with the same conditional probabilities, and the result from the proof of Lemma 15 that arbitrarily reshuffling the measure/amount of dirt after time n takes $γ^{n} λ^{'}$ effort, where $λ^{'}$ is the $λ$ value of the measure you're reshuffling.

So, we start off with a $\frac{ϵ}{3} (1 + λ)$ distance (includes the $b$ term) between $M^{l o}$ and $M^{^{'} l o}$ . Then, extending up further to fill in everything up till time n, $m$ and $m^{'}$ mimic the conditional probabilities of each other. Still a $\frac{ϵ}{3} (1 + λ)$ distance between them at this stage. Finally, after time n, $M^{'}$ may go its own arbitrary way because it's gotta be compliant with $M^{^{'} m i d}$ , and to reshuffle this around, it takes $γ^{n} λ^{'}$ effort. So, the net distance between $M$ (arbitrary point in the preimage of $↓ (Θ^{ω}) (π_{s t})$ ) and $M^{'}$ (specially crafted point in the preimage of $↓ (Θ^{ω}) (π_{p a})$ ) is below $\frac{ϵ}{3} (1 + λ) + γ^{n} λ^{'}$ .

Wait, n (time of first difference) was ${log}_{γ} (δ)$ since $π_{s t}$ and $π_{p a}$ are only $δ$ apart, and $λ^{'}$ can be at most $λ + \frac{ϵ}{3} (1 + λ)$ because $λ$ values are preserved by projections, and $M^{l o}$ and $M^{^{'} l o}$ are only $\frac{ϵ}{3} (1 + λ)$ distance apart, so no more than that amount of dirt is the difference between the two. Finally, we assumed $δ < \frac{ϵ}{3}$ . So, we get:

$d (M, M^{'}) \leq \frac{ϵ}{3} (1 + λ) + γ^{n} λ^{'} = \frac{ϵ}{3} (1 + λ) + γ^{{log}_{γ} (δ)} (λ + \frac{ϵ}{3} (1 + λ)) < \frac{ϵ}{3} (1 + λ) + δ (2 λ + 2)$

$< \frac{ϵ}{3} (1 + λ) + \frac{ϵ}{3} (2 λ + 2) = ϵ (1 + λ)$
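Each inequality in that chain uses one fact, so here it is unpacked step by step. This sketch additionally assumes $ϵ \leq 3$ (harmless, since we only care about small $ϵ$ ), which is what makes the second line go through; the third line uses $δ < \frac{ϵ}{3}$ and $2 λ + 2 > 0$ .

$γ^{n} = γ^{{log}_{γ} (δ)} = δ$

$λ^{'} \leq λ + \frac{ϵ}{3} (1 + λ) \leq λ + (1 + λ) < 2 λ + 2$

$δ (2 λ + 2) < \frac{ϵ}{3} (2 λ + 2)$

$\frac{ϵ}{3} (1 + λ) + \frac{ϵ}{3} (2 λ + 2) = \frac{ϵ}{3} (3 + 3 λ) = ϵ (1 + λ)$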

And we have our appropriate distance bound between preimages! Now to use this in part 2, which should go a lot faster.

Time for part 2, to get full generality. Pick two partial policies $π_{p a}$ and $π_{p a}^{'}$ and assume the distance between them is $δ$ . Then, the stub $π_{s t}$ given by " $inf (π_{p a}, π_{p a}^{'})$ but cut it off so it's undefined after time n (where n is ${log}_{γ} (δ)$ )" is within distance $δ$ of both $π_{p a}$ and $π_{p a}^{'}$ . Further, $π_{s t} \leq π_{p a}, π_{p a}^{'}$ . Then, take some point in the preimage of $π_{p a}$ . It's also in the preimage of $π_{s t}$ . Because $π_{s t}$ is at a distance of $δ$ from $π_{p a}^{'}$ , we only have to go $ϵ (1 + λ)$ distance to get a point in the preimage of $π_{p a}^{'}$ , and then reverse $π_{p a}$ and $π_{p a}^{'}$ and we're done!
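In symbols, the part 2 argument is the following sketch, where $P r e (π_{p a})$ is shorthand introduced here for the preimage $(p r_{*}^{\infty, π_{p a}})^{- 1} (↓ (Θ^{ω}) (π_{p a}) \cap N F)$ , and $λ$ is the $λ$ value of $M$ :

$M \in P r e (π_{p a}) \subseteq P r e (π_{s t})$

$(π_{s t} \leq π_{p a}^{'} \wedge d (π_{s t}, π_{p a}^{'}) \leq δ) \Rightarrow \exists M^{'} \in P r e (π_{p a}^{'}) : d (M, M^{'}) \leq ϵ (1 + λ)$

The inclusion is the easy half of part 1, and the existence claim is the hard half of part 1 applied to the pair $π_{s t}$ , $π_{p a}^{'}$ .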

By Lemma 15, this establishes uniform continuity for the function mapping partial policies to the preimage of their nirvana-free part in the space of all nirvana-free measures over infinite histories.

Condition 4: Nirvana-free upper completeness (causal case)

Now that we've nabbed every nice condition other than this one, we can invoke Lemma 21 (we only require upper completion on the infinite levels, which we have) to get that the nirvana-free part is the (closed) convex hull of the projections of nirvana-free stuff from above. Then, just appeal to Lemmas 11, 12, and 13, that the closed convex hull of projections of nirvana-free upper-complete sets is nirvana-free upper-complete.

Condition C: Causality.

We showed part of this all the way back in our consistency argument. For causal/surcausal, $p r_{*}^{π, π_{p a}} (Θ^{ω} (π)) =↓ (Θ^{ω}) (π_{p a})$ regardless of which $π \geq π_{p a}$ we picked. We'll be using this.

Pick some arbitrary $π_{p a}$ and $M \in↓ (Θ^{ω}) (π_{p a})$ . $M$ has a preimage point $M^{π} \in Θ^{ω} (π)$ where $π \geq π_{p a}$ . We get an outcome function $o f^{ω}$ mapping policies to points in their associated sets s.t. $o f^{ω} (π) = M^{π}$ . Extend this $o f^{ω}$ to all partial policies by defining $o f (π_{p a}^{'}) := p r_{*}^{π^{'}, π_{p a}^{'}} (o f^{ω} (π^{'}))$ , where $π^{'}$ is any policy above $π_{p a}^{'}$ .

Ok, we need to show that: This actually singles out a unique point and isn't an invalid definition, said point is in $↓ (Θ^{ω}) (π_{p a}^{'})$ , that $o f (π_{p a}) = M$ , and that it's an outcome function.

Assuming this is actually well-defined, $o f (π_{p a}^{'})$ is in $↓ (Θ^{ω}) (π_{p a}^{'})$ trivially because it's a projection of a point from above. Also, $o f (π_{p a}) = p r_{*}^{π, π_{p a}} (o f^{ω} (π)) = p r_{*}^{π, π_{p a}} (M^{π}) = M$ , which cleans up that part. Now for showing that it's an outcome function.

$p r_{*}^{π_{p a}^{h i}, π_{p a}^{l o}} (o f (π_{p a}^{h i})) = p r_{*}^{π_{p a}^{h i}, π_{p a}^{l o}} (p r_{*}^{π, π_{p a}^{h i}} (o f^{ω} (π))) = p r_{*}^{π, π_{p a}^{l o}} (o f^{ω} (π)) = o f (π_{p a}^{l o})$

So, we got everything assuming the extension is well-defined, let's show that. Pick any two $π, π^{'}$ above any $π_{p a}$ . We'll show that they project to the same point.

$p r_{*}^{π, π_{p a}} (o f^{ω} (π)) = p r_{*}^{inf (π, π^{'}), π_{p a}} (p r_{*}^{π, inf (π, π^{'})} (o f^{ω} (π))) = p r_{*}^{inf (π, π^{'}), π_{p a}} (p r_{*}^{π^{'}, inf (π, π^{'})} (o f^{ω} (π^{'})))$

$= p r_{*}^{π^{'}, π_{p a}} (o f^{ω} (π^{'}))$

And we're done with causality! Now for pseudocausality.

Condition P: Pseudocausality.

We'll do this in two steps. One is showing that for stubs, points which meet the appropriate conditions are also present in all the requisite other stubs. Step 2 is generalizing this to all points in $↓ (Θ^{ω}) (π_{p a})$ .

Let's say you have some $M \in↓ (Θ^{ω}) (π_{s t})$ . By Nirvana-free consistency, $M \in c . h (⋃_{π \geq π_{s t}} p r_{*}^{π, π_{s t}} (Θ^{ω} (π)))$ so we can shatter it into finitely many $M_{i}$ that are projections of stuff from above, $M_{i}^{\infty}$ . The support of the measure component of $M$ is a subset of $F^{N F} (π_{s t}) \cap F^{N F} (π_{s t}^{'})$ , so the same must apply to all the $M_{i}$ .

Now, what we can do is make a $π_{i}^{'}$ that mimics the behavior of $π_{i}$ for all prefixes and extensions of strings in $F^{N F} (π_{s t}) \cap F^{N F} (π_{s t}^{'})$ , but otherwise mimics $π_{s t}^{'}$ , and extends if needed in some random-ass way, and is above $π_{s t}^{'}$ .

The reason we can do this is because, if there's a contradiction in this construction, it would be from $π_{s t}^{'}$ and $π_{i}$ behaving differently on some prefix or extension of a string in $F^{N F} (π_{s t}) \cap F^{N F} (π_{s t}^{'})$ . But, $π_{s t}^{'}$ can't specify what to do for any nodes in $F^{N F} (π_{s t}^{'})$ or later (because $F^{N F} (π_{s t}^{'})$ is basically a coat of leaf observation nodes around the extent of $π_{s t}^{'}$ ), and if $π_{s t}^{'}$ and $π_{i}$ differ on a strict prefix of something in $F^{N F} (π_{s t}) \cap F^{N F} (π_{s t}^{'})$ , then that means that $π_{s t}^{'}$ and $π_{s t}$ branch different ways so there's no node in both $F^{N F} (π_{s t})$ and $F^{N F} (π_{s t}^{'})$ after the branch point, so again we get a contradiction.

Anyways, we've crafted our finitely many $π_{i}^{'}$ which lie above $π_{s t}^{'}$ , and mimic $π_{i}$ going forward. Our $M_{i}^{\infty}$ is an extension of $M_{i}$ whose measure component is only supported on $F^{N F} (π_{s t}) \cap F^{N F} (π_{s t}^{'})$ . Also, before and past that, $π_{i}^{'}$ mimics $π_{i}$ perfectly, so we can transfer $M_{i}^{\infty}$ to $Θ^{ω} (π_{i}^{'})$ by infinitary pseudocausality. Do this for all the i. Then, projecting all those down to $π_{s t}^{'}$ , we get that all the $M_{i}$ lie in $↓ (Θ^{ω}) (π_{s t}^{'})$ , and mixing them together, we get that $M$ itself lies in $↓ (Θ^{ω}) (π_{s t}^{'})$ .

Now for part 2, where we show it for partial policies in general. Let $M$ be arbitrary in $↓ (Θ^{ω}) (π_{p a})$ . Project $M$ down to all the $π_{p a}^{n}$ to make a sequence of $M_{n}$ . Since the support of the measure component of $M$ is a subset of $F (π_{p a}) \cap F (π_{p a}^{'})$ (pseudocausality assumption) the support of the measure component of $M_{n}$ is a subset of $F (π_{p a}^{n}) \cap F (π_{p a}^{^{'} n})$ , so by pseudocausality for stubs which we've shown, $M_{n}$ is also present in $↓ (Θ^{ω}) (π_{p a}^{^{'} n})$ . Then, take the preimage in $π_{p a}^{'}$ of all those $M_{n}$ points. By consistency and Lemma 9 and the usual argument about "there can only be one preimage point for a series of points", we get that $M$ itself (the only thing that could project down on $M_{n}$ for all n) lies in $↓ (Θ^{ω}) (π_{p a}^{'})$ and we're done with pseudocausality.

Alright, that's most of the proof out of the way, all that's left is showing that the full belief function conditions imply the finitary and infinitary versions, respectively, and getting isomorphism. Let's begin.

Let's check whether $\to^{s t} (Θ)$ makes a stub-hypothesis, and whether $\to^{ω} (Θ)$ makes an infinitary-hypothesis, if $Θ$ is a hypothesis/fulfills all the conditions. $\to^{s t}$ is just "restrict $Θ$ to only reporting sets for stubs", and $\to^{ω}$ is just "restrict $Θ$ to only reporting sets for full policies".

The variants of nonemptiness, closure, convexity, nirvana-free upper-completion, bounded minimals, Hausdorff-continuity, and pseudocausality for the finite and infinite case are trivially implied by the corresponding condition for hypotheses, leaving the four moderately nontrivial cases of the analogues of normalization, consistency, the extreme point condition, and causality.

Extreme point condition: The infinitary case doesn't have an analogue of the extreme point condition. So that leaves the finitary case. What we can do is take a nirvana-free extreme minimal point $M^{e x}$ in some $Θ (π_{s t})$ , apply the general extreme point condition to get a nirvana-free $M \in Θ (π)$ for some suitable $π$ that projects down to $M^{e x}$ , and, clipping away the infinite parts by $\to^{s t}$ , the projections of $M$ fill the role of the points in $Θ (π_{s t}^{'})$ all below some policy that project down to $M^{e x}$ .

Causality. The finite case is that we can take a point associated with some stub, and craft an outcome function for stubs that matches up with our point. This is trivially implied by the general case of causality, where you can take any partial policy and point and get an outcome function that matches up with it. The infinite case is that we can take a point $M$ in $Θ (π)$ , and get points for all the other $Θ (π^{'})$ that project down appropriately. For this, again, we just take an outcome function for $M$ and clip it off to the infinite levels.

Consistency: The finite case of weak consistency is pretty easy. We get

$p r_{*}^{π_{s t}^{h i}, π_{s t}^{l o}} (\to^{s t} (Θ) (π_{s t}^{h i})) = p r_{*}^{π_{s t}^{h i}, π_{s t}^{l o}} (Θ (π_{s t}^{h i})) \subseteq Θ (π_{s t}^{l o}) = \to^{s t} (Θ) (π_{s t}^{l o})$

Where the subset came from full consistency because everything is the closed convex hull of projections from above, so projecting down gets you a subset. For the Nirvana-free consistency condition for the infinite case, it's a simple consequence of Lemma 21.

Normalization: ${inf}_{π_{s t}} E_{Θ (π_{s t})} (0) = 0$ and ${sup}_{π_{s t}} E_{Θ (π_{s t})} (1) = 1$

To begin, the normalization condition for infinitary hypotheses and general hypotheses is the exact same, so we can ignore that and work on the stub hypothesis case. The inf one is pretty easy. From general normalization, at the infinite level, there are $π$ and nirvana-free points in $Θ (π)$ with a $b$ value at-or-near zero, and you can just project them down to any stub you want.

The sup one is a bit trickier. It's obviously not above 1, because no matter what policy $π$ you pick, you've got a nirvana-free point with $λ + b \leq 1$ in $Θ (π)$ , which you can project down to whichever stub you're looking at, to certify that the expectation of 1 is 1 or less. Showing that it isn't below 1 is a bit harder.

Let's say there's some $π$ where ${min}_{(λ μ, b) \in Θ (π) \cap N F} (λ + b) = 1$ (or arbitrarily close to 1, doesn't really matter, although we'll show later that there is indeed a maximizing policy where Murphy can only force a value of 1).

From Hausdorff-continuity, pick some $δ$ to get an $ϵ$ Hausdorff-distance or lower. This $δ$ specifies some extremely large n, so consider the stub $π^{n}$ . Now, consider the set of every policy $π^{'}$ above $π^{n}$ . All of these are $δ$ or less away from $π$ .

By Hausdorff-continuity, there can't be a nirvana-free point in any $Θ (π^{'})$ with $λ + b < 1 - ϵ$ , because we could do an $ϵ$ perturbation to get a point in $Θ (π) \cap N F$ with $λ + b < 1$ , because small changes in $M$ induce small changes in $λ$ and $b$ . Or, we can add a little bit of wiggle room if the minimizing value of $λ + b$ in $π$ is slightly less than 1.

However, any nirvana-free point in $Θ (π^{n})$ must originate as a mix of finitely many points from $Θ (π_{i}^{'}) \cap N F$ (varying $π_{i}^{'}$ as long as it's above $π^{n}$ ) that have been projected down. This is because, by our earlier proof of nirvana-free consistency from consistency in general, $Θ (π^{n}) \cap N F = c . h (⋃_{π^{'} \geq π^{n}} p r_{*}^{π^{'}, π^{n}} (Θ (π^{'}) \cap N F))$

All of these projected points have $λ + b \geq 1 - ϵ$ , so the mix point has $λ + b \geq 1 - ϵ$ , so Murphy can only force a value of $1 - ϵ$ or higher. And we can make $δ$ as small as we wish to get a stub $π^{n}$ below $π$ (n extremely large) where $ϵ$ is as small as we wish, so the sup of the $λ + b$ values Murphy can force over all stubs can't be below 1. So it must be 1.

Isomorphism! Let's go! As a quick recap,

$↓ (Θ^{ω}) (π_{p a}) := \overline{c . h} (⋃_{π \geq π_{p a}} p r_{*}^{π, π_{p a}} (Θ^{ω} (π)))$

$↑ (Θ^{s t}) (π_{p a}) := ⋂_{π_{s t} \leq π_{p a}} (p r_{*}^{π_{p a}, π_{s t}})^{- 1} (Θ^{s t} (π_{s t}))$

And $\to^{ω}$ / $\to^{s t}$ is just "clip down your hypothesis to full policies/stubs".

So, two parts of this are trivially easy. From earlier in the proof (the start of the first section for the stub one, and an obvious corollary of definitions for the full policy one), we established that $↑ (Θ^{s t}) (π_{s t}) = Θ^{s t} (π_{s t})$ and $↓ (Θ^{ω}) (π) = Θ^{ω} (π)$ . Using this, $\to^{ω} (↓ (Θ^{ω})) (π) =↓ (Θ^{ω}) (π) = Θ^{ω} (π)$ and $\to^{s t} (↑ (Θ^{s t})) (π_{s t}) =↑ (Θ^{s t}) (π_{s t}) = Θ^{s t} (π_{s t})$

So, $\to^{ω} (↓ (Θ^{ω})) = Θ^{ω}$ and $\to^{s t} (↑ (Θ^{s t})) = Θ^{s t}$

Let's get fancier and show the other two.

$↓ (\to^{ω} (Θ)) (π_{p a}) = \overline{c . h} (⋃_{π \geq π_{p a}} p r_{*}^{π, π_{p a}} (\to^{ω} (Θ) (π))) = \overline{c . h} (⋃_{π \geq π_{p a}} p r_{*}^{π, π_{p a}} (Θ (π))) = Θ (π_{p a})$

The first two equalities are unpacking definitions, the third is consistency for $Θ$ .

$↑ (\to^{s t} (Θ)) (π_{p a}) = ⋂_{π_{s t} \leq π_{p a}} (p r_{*}^{π_{p a}, π_{s t}})^{- 1} (\to^{s t} (Θ) (π_{s t})) = ⋂_{π_{s t} \leq π_{p a}} (p r_{*}^{π_{p a}, π_{s t}})^{- 1} (Θ (π_{s t})) = Θ (π_{p a})$

Again, first two equalities are unpacking definitions, the third is consistency for $Θ$ . So, $↓ (\to^{ω} (Θ)) = Θ$ and $↑ (\to^{s t} (Θ)) = Θ$

Putting it together, $\to^{ω}$ and $↓$ make an isomorphism between $Θ^{ω}$ and $Θ$ , and $\to^{s t}$ and $↑$ make an isomorphism between $Θ^{s t}$ and $Θ$ . We're finally done!

Proposition 1: If $Θ$ fulfills the causality condition, nonemptiness, closure, and convexity, then $S^{Θ}$ is a nonempty, closed, convex set of a-environments or a-survironments. $Θ^{S^{Θ}} = Θ$ . Also, $S \subseteq S^{Θ^{S}}$ .

Ok, what $S^{Θ}$ is, is the set of a-environments $(λ e, b)$ where, regardless of $π_{p a}$ , $(λ (π_{p a} \cdot e), b)$ lies in $Θ (π_{p a})$ . For nonemptiness, pick some arbitrary point in one of your $Θ (π_{p a})$ , use causality to get an outcome function, and then you fill in the conditional probabilities for an action-observation sequence with your outcome function points. This never produces a contradiction anywhere because if there was a contradiction, you'd be able to project two specified points down and have them disagree somewhere, which is impossible because we have an outcome function.

For closure, if you take a limit of a-environments, this makes a limiting sequence in all the $Θ (π_{p a})$ , which are all closed, so the limit point environment has all its induced distributions lying in the usual $Θ (π_{p a})$ , and is in $S^{Θ}$ .

For convexity, if you take a mix of a-environments, this makes the same mix in all the $Θ (π_{p a})$ , which are all convex, so the mixed environment has all its induced distributions lying in the usual $Θ (π_{p a})$ , and is in $S^{Θ}$ .

For equality, if $M \in Θ^{S^{Θ}} (π_{p a})$ , then it originated from some a-environment made from an outcome function for $Θ$ , which... just gets your original point so $M \in Θ (π_{p a})$ . In the other direction, if $M \in Θ (π_{p a})$ , by causality, we can project down and extend the specification and make an a-environment that acts like $M$ on $π_{p a}$ , and then going back gets you $M \in Θ^{S^{Θ}} (π_{p a})$ .

For the final claim, if $(λ e, b) \in S$ , then it induces an outcome function and you can go back from that to $(λ e, b) \in S^{Θ^{S}}$ , so $S \subseteq S^{Θ^{S}}$ .

Theorem 3.1: Pseudocausal Translation: For all pseudocausal $Θ^{s t}$ hypotheses defined only on policy stubs, $\to^{c} (Θ^{s t})$ is a causal hypothesis only defined on policy stubs. $\to^{N F} (\to^{c} (Θ^{s t})) = Θ^{s t}$ . For all causal $Θ^{s t}$ hypotheses defined only on policy stubs, $\to^{N F} (Θ^{s t})$ is a pseudocausal hypothesis only defined on policy stubs.

Theorem 3.2: Acausal Translation: For all acausal $Θ^{s t}$ hypotheses defined only on policy stubs, $\to^{s c} (Θ^{s t})$ is a surcausal hypothesis only defined on policy stubs. $\to^{N F} (\to^{s c} (Θ^{s t})) = Θ^{s t}$ . For all surcausal $Θ^{s t}$ hypotheses defined only on policy stubs, $\to^{N F} (Θ^{s t})$ is an acausal hypothesis only defined on policy stubs.

Both these theorems have highly similar proofs, so let's group them together. First, we'll need to set up how $\to^{c}$ and $\to^{s c}$ work, and then knock out two lemmas we'll need before we can proceed to the main result. $\to^{c}$ is defined by $\to^{c} (Θ^{s t}) (π_{s t}) = \overline{c . h} (⋃_{π_{s t}^{'} \leq π_{s t}} I_{*}^{π_{s t}^{'}, π_{s t}} (Θ^{s t} (π_{s t}^{'})))$ , the closed convex hull of the injections up from all stubs below $π_{s t}$ . $\to^{s c}$ is defined identically, just with $I_{* s}$ instead of $I_{*}$ , and closed convex hull permitting us to mix with $0^{+}$ probability.

$I^{π_{s t}^{'}, π_{s t}}$ where $π_{s t}^{'} \leq π_{s t}$ (this is like the inverse of projection, it's going up instead of down) is a function $F (π_{s t}^{'}) \to F (π_{s t})$ defined by: If $h \in F (π_{s t}^{'}) \cap F (π_{s t})$ , then $I^{π_{s t}^{'}, π_{s t}} (h) = h$ . If $h \in F (π_{s t}^{'})$ and isn't in $F (π_{s t})$ , then $I^{π_{s t}^{'}, π_{s t}} (h) = h π_{s t} (h) N$
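To make the history-level map concrete, here is a minimal Python sketch on a toy pair of stubs. Everything about the encoding is an assumption of the sketch, not something from the post: histories are tuples alternating actions and observations, `"N"` stands for Nirvana, and the two stubs (`F_LO`, `PI_HI`, `F_HI`) are hypothetical examples with their leaf sets written out by hand.

```python
# Toy encoding (an assumption of this sketch): a history is a tuple that
# alternates actions and observations, and "N" stands for Nirvana.
NIRVANA = "N"

# Hypothetical lower stub: play "a" once, then stop. These are its leaves F(pi_lo).
F_LO = {("a", 0), ("a", 1), ("a", NIRVANA)}

# Hypothetical higher stub: play "a", and after observing 0 also play "b".
PI_HI = {(): "a", ("a", 0): "b"}
F_HI = {
    ("a", NIRVANA),
    ("a", 1),
    ("a", 0, "b", 0),
    ("a", 0, "b", 1),
    ("a", 0, "b", NIRVANA),
}


def inject_history(h):
    """Keep h if it is still a leaf of the higher stub, else cap it with Nirvana."""
    if h in F_HI:
        return h
    # h is not a leaf above, so the higher stub keeps acting at h:
    # append its action there, followed by Nirvana.
    return h + (PI_HI[h], NIRVANA)


# Image of every leaf of the lower stub.
images = {h: inject_history(h) for h in F_LO}
```

In this toy case `("a", 1)` and `("a", "N")` are left alone, while `("a", 0)` gets capped off as `("a", 0, "b", "N")`, and no two leaves collide.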

$I_{*}^{π_{s t}^{'}, π_{s t}}$ from $M^{a} (F (π_{s t}^{'})) \to M^{a} (F (π_{s t}))$ is just pushing $(m, b)$ through the mapping $I^{π_{s t}^{'}, π_{s t}}$ . You keep the $b$ term the same, and push the measure terms up. $I_{* s}^{π_{s t}^{'}, π_{s t}}$ is defined identically on the measure part, except that it has the rule that all nirvana events in $F (π_{s t})$ and not in $F (π_{s t}^{'})$ with 0 measure get $0^{+}$ measure instead.

Intuitively, what $I_{*}$ and $I_{* s}$ are doing is capping off whatever they need to (in order to extend appropriately) with Nirvana. $I_{*}$ is capping off positive-probability histories with guaranteed Nirvana immediately afterwards, whereas $I_{* s}$ is more paranoid and caps off every 0-probability Nirvana history that got added with "it is possible that Nirvana occurs here".

Let's go over some properties that $I_{*}$ and $I_{* s}$ fulfill. $I_{*}$ is an injective continuous map $M^{a} (F (π_{s t}^{'})) \to M^{a} (F (π_{s t}))$ , and $I_{* s}$ is an injective continuous map ${S M}^{a} (F (π_{s t}^{'})) \to {S M}^{a} (F (π_{s t}))$ . $I_{*}$ and $I_{* s}$ are undone by projecting back down, $p r_{*}^{π_{s t}, π_{s t}^{'}} (I_{*}^{π_{s t}^{'}, π_{s t}} (M)) = M$ . Both $I_{*}$ and $I_{* s}$ are linear, the latter in the stronger sense that it's linear when you mix stuff with $0^{+}$ probability: it doesn't matter whether you mix before or after injecting up. Further, injections up commute, $I_{*}^{π_{s t}^{'}, π_{s t}} (I_{*}^{π_{s t}^{''}, π_{s t}^{'}} (M)) = I_{*}^{π_{s t}^{''}, π_{s t}} (M)$ , and the same for $I_{* s}$ .
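As a sanity check on these properties, here is a small Python sketch for the ordinary (non-sur) injection $I_{*}$ only. The encoding is an assumption of the sketch: an a-measure is a pair `(m, b)` with `m` a dict from histories to mass, histories are tuples, `"N"` is Nirvana, and the two stubs are hypothetical toy examples. It checks that projecting back down undoes the injection, that $λ$ and $b$ are preserved, and that mixing before or after injecting gives the same point.

```python
from fractions import Fraction as Fr

NIRVANA = "N"  # toy marker for the Nirvana observation (an assumption)

# Hypothetical stubs: pi_lo plays "a" once; pi_hi also plays "b" after seeing 0.
F_LO = {("a", 0), ("a", 1), ("a", NIRVANA)}
PI_HI = {(): "a", ("a", 0): "b"}
F_HI = {("a", NIRVANA), ("a", 1),
        ("a", 0, "b", 0), ("a", 0, "b", 1), ("a", 0, "b", NIRVANA)}


def inject_history(h):
    """Keep h if it is still a leaf of pi_hi, else cap it with Nirvana."""
    return h if h in F_HI else h + (PI_HI[h], NIRVANA)


def project_history(h):
    """Map a leaf of pi_hi to its unique prefix that is a leaf of pi_lo."""
    return next(h[:k] for k in range(len(h) + 1) if h[:k] in F_LO)


def pushforward(point, f):
    """Push the measure part of (m, b) through f, leaving b untouched."""
    m, b = point
    out = {}
    for h, mass in m.items():
        out[f(h)] = out.get(f(h), 0) + mass
    return out, b


def mix(p, point1, point2):
    """The mixture p * point1 + (1 - p) * point2 of two a-measures."""
    (m1, b1), (m2, b2) = point1, point2
    keys = set(m1) | set(m2)
    m = {h: p * m1.get(h, 0) + (1 - p) * m2.get(h, 0) for h in keys}
    return m, p * b1 + (1 - p) * b2


M1 = ({("a", 0): Fr(1, 2), ("a", 1): Fr(1, 4)}, Fr(1, 4))
M2 = ({("a", 1): Fr(1, 3), ("a", NIRVANA): Fr(1, 3)}, Fr(0))

up = pushforward(M1, inject_history)          # I_* applied to M1
back_down = pushforward(up, project_history)  # pr_* applied to the result
mixed_then_up = pushforward(mix(Fr(1, 3), M1, M2), inject_history)
up_then_mixed = mix(Fr(1, 3), up, pushforward(M2, inject_history))
```

Here `back_down` agrees with `M1`, and `mixed_then_up` agrees with `up_then_mixed`. The $0^{+}$ bookkeeping of $I_{* s}$ is deliberately left out of this toy model.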

In order to make progress, we want to get two important lemmas. The first one, Lemma 22, is that slicing away the nirvana from this thing recovers the original pseudocausal hypothesis. The second one I call the "Diamond Lemma", and it says that injecting up and projecting down is the same as projecting down and then injecting up, and if you sketch it out, it looks like a diamond.

Lemma 22: $(\to^{c} (Θ^{s t}) (π_{s t})) \cap N F = Θ^{s t} (π_{s t})$ , and the same holds for $\to^{s c}$ .

Proof sketch: One direction is trivial, the other direction that $\to^{c}$ doesn't add any new nirvana-free points is trickier. Working in the pseudocausal-to-causal setting, we can take some $M$ that's nirvana-free in the closed convex hull, and get a sequence $M_{n}$ limiting to it where each $M_{n}$ is in the convex hull. Now, indexing stubs below $π_{s t}$ by i, the $M_{n}$ can all be viewed as a mix of $M_{i, n}$ points projected up from below. The problem is, the mix varies as n does. What we can do is separate into "good" i where we can get a suitable limit point and limit probability, and "bad" i that we have to treat as a special chunk, and reexpress $M$ as a sum of a probabilistic mix of "good" $M_{i}$ injected up, and an additional "bad" chunk. We can show that the "good" $M_{i}$ can all be transferred up to $Θ^{s t} (π_{s t})$ itself by pseudocausality and mixed in there, and the "bad" chunk is a nirvana-free a-measure. So, $M$ is the sum of a point in $Θ^{s t} (π_{s t})$ , plus a nirvana-free a-measure, so $M$ lies in $Θ^{s t} (π_{s t})$ by nirvana-free upper completion.

Working in the acausal-to-surcausal setting, we take our $M$ in the closed convex hull and a sequence $M_{n}$ , but injection up in this setting is much more effective at adding Nirvana, and the surmetric is much more sensitive than the usual metric for noticing the presence or absence of Nirvana. So, only an initial segment of $M_{n}$ is "contaminated" with Nirvana since the limit point is Nirvana-free, and we can clip that part off, and the "uncontaminated tail" can only have come from $Θ^{s t} (π_{s t})$ itself because injection up is very aggressive with adding Nirvana, so we get it from just closure on $Θ^{s t} (π_{s t})$ .

Proof: For $(\to^{c} (Θ^{s t}) (π_{s t})) \cap N F \supseteq Θ^{s t} (π_{s t})$ , just observe that the identity injection $I_{*}^{π_{s t}, π_{s t}}$ leaves $Θ^{s t} (π_{s t})$ completely unchanged and adds no nirvana, so any point in $Θ^{s t} (π_{s t})$ also lies in the closed convex hull of the injections up, and is nirvana-free because the original point that we mapped through identity was nirvana-free. This works with the surcausal case too.

Now for the considerably more difficult reverse direction, for the pseudocausal-to-causal case first.

If $M \in \to^{c} (Θ^{s t}) (π_{s t}) \cap N F$ , then unpacking that, $M$ is nirvana-free, and lies in the closed convex hull of the injections up. So, we can fix a sequence $M_{n}$ in the convex hull of injections up that limits to $M$ . Index the stubs below $π_{s t}$ by i, there's only finitely many of them.

The $M_{n}$ can be written as $\sum_{i} ζ_{i, n} I_{*}^{π_{s t}^{i}, π_{s t}} (M_{i, n})$ where $M_{i, n} \in Θ^{s t} (π_{s t}^{i})$ . $ζ_{i, n}$ may be 0. This is because, if there's multiple points in the injection of a particular stub that are mixed, you can mix them before injecting up to get a single $M_{i, n}$ that's injected up, because injections are linear and we're injecting a convex set.

Blessed by the gift of finitely many i to worry about, use repeated picking of subsequences to get a subsequence of n where:

For all i, $ζ_{i, n}$ converges. Call the limiting values $ζ_{i}$ . Now split the i into good i where $ζ_{i} > 0$ , and bad i where $ζ_{i} = 0$ . The $ζ_{i}$ will sum up to 1.

For all good i, $ζ_{i, n} > 0$ always. The fact that they all limit to above 0 helps you out because you only have to trim off an initial segment.

For all good i, $M_{i, n}$ converges, call the limit point $M_{i}$ . This is because injection up preserves $λ$ and $b$ , and $ζ_{i, n}$ is bounded above 0, so the $λ + b$ value of the $M_{i, n}$ is upper-bounded by $\frac{λ^{'} + b^{'}}{{min}_{n} ζ_{i, n}}$ , which is a finite nonnegative number divided by a finite positive number, and we can apply the Compactness Lemma to establish that a convergent subsequence exists. In this case, $λ^{'} + b^{'}$ is the bound as a whole for the sequence $M_{n}$ , which converges so it must have a bound of that form, and not the bound on minimal points.

Finally, $\sum_{bad i} (ζ_{i, n} I_{*}^{π_{s t}^{i}, π_{s t}} (M_{i, n}))$ converges. This is doable because the sequence $M_{n}$ has bounded $λ$ and $b$ because it converges to something, so the partial sum of bad i has the same bound, so we can invoke the Compactness Lemma to get our convergent subsequences.

Putting all this together (we kept selecting from compact sets so that is what let us build a subsequence with all these great properties at once) we have a decomposition of $M$ itself into:

$\sum_{good i} (ζ_{i} I_{*}^{π_{s t}^{i}, π_{s t}} (M_{i})) + {lim}_{n \to \infty} (\sum_{bad i} (ζ_{i, n} I_{*}^{π_{s t}^{i}, π_{s t}} (M_{i, n})))$

Now, since $\sum_{good i} ζ_{i} = 1$ (all the bad i had their probability components limit to 0), that first sum part looks like an actual mixture of points injected up! Since $M$ is nirvana-free, both parts must be nirvana-free, and the sums are also a-measures.
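A tiny numeric sketch of why the "bad" chunk can't simply be thrown away: a mixing weight can limit to 0 while the point it multiplies grows, so the product converges to something nonzero. The concrete points `A` and `B`, the labels `"x"` and `"y"` standing in for histories, and the L1-style `dist` are all assumptions made for illustration, not objects from the proof.

```python
from fractions import Fraction as Fr


def scale(c, point):
    """Multiply an a-measure (m, b) by a nonnegative constant c."""
    m, b = point
    return {h: c * v for h, v in m.items()}, c * b


def add(p1, p2):
    """Add two a-measures componentwise."""
    (m1, b1), (m2, b2) = p1, p2
    keys = set(m1) | set(m2)
    return {h: m1.get(h, 0) + m2.get(h, 0) for h in keys}, b1 + b2


def dist(p1, p2):
    """A toy L1-style distance between two a-measures."""
    (m1, b1), (m2, b2) = p1, p2
    keys = set(m1) | set(m2)
    return sum(abs(m1.get(h, 0) - m2.get(h, 0)) for h in keys) + abs(b1 - b2)


A = ({"x": Fr(1)}, Fr(0))  # the point used for the "good" index
B = ({"y": Fr(1)}, Fr(0))  # direction along which the "bad" points blow up


def M_n(n):
    """Mix with weights (1 - 1/n, 1/n), where the bad point is n * B."""
    zeta_good, zeta_bad = 1 - Fr(1, n), Fr(1, n)
    good_part = scale(zeta_good, A)
    bad_part = scale(zeta_bad, scale(n, B))  # always equals B
    return add(good_part, bad_part)


limit = add(A, B)  # good limit point (weight 1) plus the surviving bad chunk
gaps = [dist(M_n(n), limit) for n in (10, 100, 1000)]
```

The bad weight $1/n$ limits to 0, yet the bad chunk is exactly `B` for every n, so the sequence limits to `A + B` rather than `A`. That leftover is the extra nirvana-free a-measure that gets absorbed by upper completion.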

First, by closure of all our original $Θ^{s t} (π_{s t}^{i})$ , all the $M_{i}$ components (where i is good) do lie in $Θ^{s t} (π_{s t}^{i})$ . And when we inject the $M_{i}$ up, since the mix of them is nirvana-free, this means that each individual $M_{i}$ must be nirvana-free after injection.

Now, what injection does, is it caps Nirvana on everything that is in $F (π_{s t}^{i})$ and not in $F (π_{s t})$ that has positive probability. So, if $M_{i}$ is nirvana-free after injection, this must mean that its measure component is only supported on $F (π_{s t})$ . Via pseudocausality, this means that $M_{i}$ lies in $Θ^{s t} (π_{s t})$ itself! Also, $I_{*}^{π_{s t}^{i}, π_{s t}} (M_{i}) = M_{i}$ .

So, our sum over good i components (by convexity), is actually a probabilistic mixture of stuff in $Θ^{s t} (π_{s t})$ itself! Abbreviating $\sum_{good i} ζ_{i} M_{i}$ as $M^{'}$ , which lies in $Θ^{s t} (π_{s t})$ by convexity, and rewriting the sum, we can reexpress $M$ as:

$M^{'} + {lim}_{n \to \infty} (\sum_{bad i} (ζ_{i, n} I_{*}^{π_{s t}^{i}, π_{s t}} (M_{i, n})))$

This is a nirvana-free a-measure in $Θ^{s t} (π_{s t})$ , plus a nirvana-free a-measure, so, by nirvana-free upper-completion, $M$ lies in $Θ^{s t} (π_{s t})$ and we're done. Now, let's hit up the surcausal case.

Assume $S M \in \to^{s c} (Θ^{s t}) (π_{s t}) \cap N F$ . $S M$ is nirvana-free, and lies in the closed convex hull of the injections up. So, we can fix a sequence $S M_{n}$ in the convex hull of injections up that limits to $S M$ . Index the stubs below $π_{s t}$ by i, there's only finitely many of them, reserve i=0 for $π_{s t}$ itself.

The $S M_{n}$ can be written as $\sum_{i} ζ_{i, n} I_{* s}^{π_{s t}^{i}, π_{s t}} (M_{i, n})$ where $M_{i, n} \in Θ^{s t} (π_{s t}^{i})$ . $ζ_{i, n}$ may be 0 or $0^{+}$ . This is because, if there's multiple points in the injection of a particular stub, you can mix them before injecting up to get your single point, one for each $π_{s t}^{i}$ , because injections are linear and we're injecting a convex set.

Note that $S M$ is nirvana-free, and there's only finitely many spots where nirvana could be since we're working in a stub, so past a certain point all the $S M_{n}$ will be nirvana-free due to the surmetric we're using. Let's clip off that initial segment that's contaminated with Nirvana. Now, we can get something very interesting. If $π_{s t}^{i} < π_{s t}$ , then injecting up anything at all is going to stick nirvana (maybe with $0^{+}$ measure) somewhere. Having $ζ_{i, n}$ be $0^{+}$ doesn't help you, because mixing with a nirvana-containing thing with $0^{+}$ probability means the mixture contains the nirvana-spots of that thing you mixed in. So, past a certain point, all the $S M_{n}$ can only be written as $M_{0, n}$ (the identity injection, anything else either has exactly 0 probability so it gets clipped out of the sum, or it has Nirvana somewhere and can't be present).

Therefore, in the tail, the sequence of $S M_{n}$ limiting to $S M$ is the same as $M_{0, n} \in Θ^{s t} (π_{s t})$ limiting to some $M_{0} \in Θ^{s t} (π_{s t})$ , so $S M \in Θ^{s t} (π_{s t})$ and it's actually an a-measure, not an a-surmeasure. This establishes Lemma 22 for the sur-case.

Lemma 23/Diamond Lemma: For any $π_{s t}, π_{s t}^{'}$ , any $π_{s t}^{h i} \geq π_{s t}, π_{s t}^{'}$ , and any $M \in M^{a} (F (π_{s t}))$ , we have $p r_{*}^{π_{s t}^{h i}, π_{s t}^{'}} (I_{*}^{π_{s t}, π_{s t}^{h i}} (M)) = I_{*}^{inf (π_{s t}, π_{s t}^{'}), π_{s t}^{'}} (p r_{*}^{π_{s t}, inf (π_{s t}, π_{s t}^{'})} (M))$ (and same for $I_{* s}$ and the sur variants).

It's called the Diamond Lemma because if you sketch out the injections as going diagonally up and the projections as going diagonally down, the commutative diagram looks like a diamond.

To begin with, note that there's an upper bound on $π_{s t}$ and $π_{s t}^{'}$ . For every finite history in $F (π_{s t})$ , there's an extension of that history in $F (π_{s t}^{h i})$ , which has a prefix in $F (π_{s t}^{'})$ , and vice-versa. This establishes that for all the finitely many histories in $F (π_{s t})$ , either a prefix of that history lies in $F (π_{s t}^{'})$ , or an extension of that history lies in $F (π_{s t}^{'})$ , and vice-versa for $F (π_{s t}^{'})$ .

Now, we can split into three possible cases and show that up-then-down equals down-then-up in terms of what measure is assigned to a history in $F (π_{s t}^{'})$ by mapping $M$ through the injections and projections, which shows the diamond lemma in full generality.

In the first case, our history $h$ in $F (π_{s t}^{'})$ is also in $F (π_{s t})$ (the equality case). In this case, $h$ also lies in $F (inf (π_{s t}, π_{s t}^{'}))$ . Projecting down to inf does nothing to the measure on $h$ , and embedding up also does nothing to the measure on $h$ . Embedding up to $π_{s t}^{h i}$ also does nothing to the measure on $h$ , and projecting down doesn't affect it either.

In the second case, our history $h$ in $F (π_{s t}^{'})$ isn't in $F (π_{s t})$ , but there are strict extensions that lie in $F (π_{s t})$ (this requires $h$ to be nirvana-free). $h$ is still assigned a measure by $M$ , though, being a prefix of stuff with measure. In this case, $h$ also lies in $F (inf (π_{s t}, π_{s t}^{'}))$ . The same analysis from our first case works, $h$ doesn't have its measure disrupted.

In the third case, our history $h$ in $F (π_{s t}^{'})$ isn't in $F (π_{s t})$ , but a strict prefix $h^{'}$ lies in $F (π_{s t})$ . We can distinguish three subcases. In the first subcase, $h$ is of the form $h^{'} a N$ . In the second subcase, $h$ still ends with Nirvana, but it isn't immediately after $h^{'}$ happens, some stuff happens in the meantime first. In the third subcase, $h$ doesn't end with Nirvana. Also, $h^{'}$ lies in $F (inf (π_{s t}, π_{s t}^{'}))$ .

For the first subcase where $h$ is of the form $h^{'} a N$ , injecting up means $h^{'} a N$ now has the measure originally associated with $h^{'}$ and nirvana is marked as "possible" there (if we're using the sur-injection). Projecting down leaves this alone. Projecting down leaves the measure on $h^{'}$ alone, and injecting up means $h^{'} a N$ now has the measure originally associated with $h^{'}$ and nirvana is marked as "possible" there (if we're using the sur-injection). In both paths, $h^{'} a N$ ends up with the measure that $h^{'}$ started with, and nirvana marked as "possible" in the sur-case.

For the second subcase where $h$ is of the form " $h^{'}$ , then some stuff happens, then Nirvana occurs", then in the causal case, the injection up would assign $h$ 0 measure (all the measure of $h^{'}$ got channeled into $h^{'} a N$ instead of $h$ ), and then projecting down, it stays the same. Similarly, projecting down means $h^{'}$ has some measure, then it's all channeled into $h^{'} a N$ on the injection up, so $h$ itself gets 0 measure. For the surcausal case, the injection up assigns $h$ $0^{+}$ measure (by the same argument and sur-injections tagging every freshly-added nirvana outcome with $0^{+}$ measure), and projecting down, it remains with $0^{+}$ measure. Projecting down leaves $h^{'}$ alone and then injecting up tags $h$ with $0^{+}$ measure.

For the third subcase where $h$ is an extension of $h^{'}$ that doesn't add any Nirvana, we can run through the same argument as the second subcase to conclude that we get 0 measure for both the causal case and the surcausal one.

Thus, no matter whether we inject up and project down, or project down and inject up, the measure assigned to $h \in F (π_{s t}^{'})$ by the measure component of the a-(sur)measure will agree.

An important thing to note with this is that we can use any stub above $π_{s t}$ and $π_{s t}^{'}$ for the injection up, but we must use $inf (π_{s t}, π_{s t}^{'})$ for the projection down.
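The Diamond Lemma can be checked by brute force on a toy example. The Python sketch below only covers the ordinary injection $I_{*}$ (no $0^{+}$ tagging), and the whole encoding is an assumption of the sketch: a stub is a dict from observation-ending histories to actions, there are two ordinary observations plus `"N"` for Nirvana, and the stubs `pi`, `pi_prime`, `pi_hi` and the points `M`, `M_nirvana` are hypothetical.

```python
from fractions import Fraction as Fr

NIRVANA = "N"
OBS = (0, 1)  # ordinary observations in this toy model (an assumption)


def leaves(stub, prefix=()):
    """F(stub): histories where the stub stops acting, or that end in Nirvana."""
    if prefix not in stub:
        return [prefix]
    action = stub[prefix]
    out = [prefix + (action, NIRVANA)]
    for o in OBS:
        out += leaves(stub, prefix + (action, o))
    return out


def push(point, f):
    """Push the measure part of (m, b) through f, dropping zero-mass entries."""
    m, b = point
    out = {}
    for h, mass in m.items():
        if mass:
            key = f(h)
            out[key] = out.get(key, 0) + mass
    return out, b


def project(point, lo):
    """pr_*: send each history to its unique prefix among the leaves of lo."""
    f_lo = set(leaves(lo))
    return push(point, lambda h: next(h[:k] for k in range(len(h) + 1) if h[:k] in f_lo))


def inject(point, hi):
    """I_*: keep leaves of hi, cap everything else with hi's action then Nirvana."""
    f_hi = set(leaves(hi))
    return push(point, lambda h: h if h in f_hi else h + (hi[h], NIRVANA))


def inf(s1, s2):
    """Greatest stub below both: the part where the two stubs agree."""
    return {h: a for h, a in s1.items() if s2.get(h) == a}


pi = {(): "a", ("a", 0): "b"}
pi_prime = {(): "a", ("a", 1): "c"}
pi_hi = {**pi, **pi_prime}  # a common upper bound

M = ({("a", 1): Fr(1, 2), ("a", 0, "b", 0): Fr(1, 4), ("a", 0, "b", 1): Fr(1, 4)}, Fr(0))
M_nirvana = ({("a", NIRVANA): Fr(1, 3), ("a", 0, "b", NIRVANA): Fr(2, 3)}, Fr(1))

up_then_down = [project(inject(P, pi_hi), pi_prime) for P in (M, M_nirvana)]
down_then_up = [inject(project(P, inf(pi, pi_prime)), pi_prime) for P in (M, M_nirvana)]
```

Both routes land on the same a-measure over $F (π_{s t}^{'})$ for each test point. For `M`, half the mass sits on `("a", 0)` and half gets capped as `("a", 1, "c", "N")`.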

Now, we can finally embark on the proof of the two translation theorems! There's enough similarities between the proofs that we can just do one big proof and remark on any differences we come across. The things we must show are that slicing off the Nirvana from a causal/surcausal hypothesis makes a pseudocausal/acausal hypothesis, and that adding in those injections up can turn a pseudocausal/acausal hypothesis to a causal/surcausal one, and that going nirvana-free to nirvana-containing back to nirvana-free is identity.

Proof sketch: While at first this may look like the proof will be almost as long as the Isomorphism theorem because we're verifying a list of 9 conditions twice over, it'll actually be considerably shorter. The only nontrivial part of the first part where we check that slicing off the Nirvana makes a pseudocausal/acausal hypothesis is deriving pseudocausality from causality, and even that is fairly easy.

Going from pseudocausal/acausal to causal/surcausal is trickier, though thankfully most conditions are trivially true, there's only three notable ones. There's the bound on minimal points, which is done by taking a sequence $M_{n}$ limiting to a $M$ that violates the bound, using the definition of the causal translation to get a point below each $M_{n}$ which obeys the $λ^{⊙} + b^{⊙}$ bound (fairly nontrivial), and appealing to Lemma 16 to construct a limit point below $M$ that obeys the $λ^{⊙} + b^{⊙}$ bound. Showing weak consistency (projecting down makes a subset) requires the Diamond Lemma to write the projection of your point of interest as a mix of injections up from below, and the last tricky one is causality, which requires first showing that injecting up $\to^{c} (Θ^{s t}) (π_{s t})$ to a higher stub won't add any new points, and then coming up with a clever way of building our outcome function, and using the Diamond Lemma to show that it is indeed an outcome function.

Finally, nirvana-free to causal to nirvana-free is instant by Lemma 22.

Proof: Referring back to the conditions for a hypothesis on policy stubs, we'll show that they're fulfilled when you slice away the Nirvana, and that pseudocausality can be derived from causality if we're just dealing with a-measures and not a-surmeasures.

Stub Nirvana-free Nonemptiness was a property already possessed by the causal hypothesis, so it's preserved when we clip away the Nirvana. Stub closure and convexity also hold because we're intersecting with the closed convex set (of nirvana-free a-measures). Nirvana-free upper completion also holds. Bounded minimals holds because a minimal point in the Nirvana-free part must also be a minimal point in the original set, because adding anything to a nirvana-containing a-measure makes a nirvana-containing a-measure, so there can be no nirvana below our minimal point in the nirvana-free part, so it's minimal in general. Normalization holds because the expectation values only depend on the nirvana-free part. Nirvana-free stuff projects down to nirvana-free stuff, getting stub-consistency. The stub extreme point condition carries over due to the preexisting intersection with nirvana-free used to define it, and the same applies to uniform continuity. This wraps up surcausal-to-acausal, and just leaves deriving pseudocausality from causality for causal-to-pseudocausal.

Let's say we have a nirvana-free $M \in Θ^{s t} (π_{s t})$ , where the measure part of $M$ is supported over $F (π_{s t}^{'})$ , and we want to show that it's also present in $Θ^{s t} (π_{s t}^{'})$ . Then $M$ is present in $Θ^{s t} (inf (π_{s t}, π_{s t}^{'}))$ , because it's supported entirely over histories that both the different policies produce, so it's supported over histories that the intersection of the policies produces, just project it down to the inf. Now, by causality, we can find something in $Θ^{s t} (π_{s t}^{'})$ that projects down onto $M$ , which must be $M$ itself because the measure part of $M$ is supported entirely on histories in $F (π_{s t}^{'})$ . This gets pseudocausality.

Now, let's show that $\to^{c} (Θ^{s t})$ and $\to^{s c} (Θ^{s t})$ fulfill the defining conditions for finitary causal.

1: Stub Nirvana-Free Nonemptiness: This one is trivial, because $Θ^{s t} (π_{s t})$ is present as a subset via the identity injection $I_{*}^{π_{s t}, π_{s t}}$ , and is nirvana-free.

2,3: Stub Closure/Convexity: We took a closed convex hull, these are tautological.

4: Stub Nirvana-Free Upper-Completeness: Just apply Lemma 22 to get that the nirvana-free part of $\to^{c} (Θ^{s t}) (π_{s t})$ (and same with surcausal) is just the original $Θ^{s t} (π_{s t})$ , which is nirvana-free and upper-complete by stub nirvana-free upper-completeness, so we're good there.

Condition 5: Stub Bounded Minimals:

By stub bounded minimals on the $Θ^{s t}$ we have a $λ^{⊙} + b^{⊙}$ bound on $λ + b$ for minimal points in $Θ^{s t} (π_{s t})$ for all stubs. Pick a $M \in \to^{c} (Θ^{s t}) (π_{s t})$ (or the surcausal analogue) with $λ + b > λ^{⊙} + b^{⊙}$ . There's a sequence of points $M_{n}$ limiting to $M$ that lie in the convex hull. All these $M_{n}$ (or $S M_{n}$ ) can be written as $\sum_{i} ζ_{i, n} I_{*}^{π_{s t}^{i}, π_{s t}} (M_{i, n})$ where $M_{i, n} \in Θ^{s t} (π_{s t}^{i})$ . Decompose $M_{i, n}$ into a minimal point and something else, getting $M_{i, n}^{min} + M_{i, n}^{*}$ .

Then, do a further rewrite as $M_{i, n}^{min} + (m_{i, n}^{* -}, - m_{i, n}^{* -} (1)) + (m_{i, n}^{* +}, b_{i, n}^{*} + m_{i, n}^{* -} (1))$

Note that the $λ + b$ value of the sum of the first two terms is bounded above by $λ^{⊙} + b^{⊙}$ , because $M_{i, n}^{min}$ obeys that bound, and for the second term, it deletes exactly as much from the $λ$ term as it adds to the $b$ term. Also, since $M_{i, n}$ is an a-measure, adding in just the negative component to $M_{i, n}^{min}$ doesn't make it go negative anywhere, so the sum of the first two terms is an a-measure, and by nirvana-free upper completeness, it lies in $Θ^{s t} (π_{s t}^{i})$ . The third component of the sum is also an a-measure.
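Here is the three-way split worked through on made-up numbers, as a rough Python sketch. The labels `"h1"`, `"h2"` and the particular `M_min` and `M_star` are assumptions chosen for illustration; `M_star` plays the role of $M_{i, n}^{*}$ and has a signed measure part, with $m^{* -}$ taken as the (nonpositive) negative part.

```python
from fractions import Fraction as Fr


def total(m):
    """Total mass m(1) of a (possibly signed) measure."""
    return sum(m.values(), Fr(0))


def add(p1, p2):
    """Add two (measure, b) pairs componentwise."""
    (m1, b1), (m2, b2) = p1, p2
    keys = set(m1) | set(m2)
    return {h: m1.get(h, 0) + m2.get(h, 0) for h in keys}, b1 + b2


def lam_plus_b(point):
    """The lambda + b value of a point (lambda is the total mass)."""
    m, b = point
    return total(m) + b


# Hypothetical minimal point, and the "something else" added on top of it.
M_min = ({"h1": Fr(1, 2), "h2": Fr(1, 2)}, Fr(0))
M_star = ({"h1": Fr(-1, 4), "h2": Fr(1, 2)}, Fr(1, 4))

m_star, b_star = M_star
m_neg = {h: min(v, Fr(0)) for h, v in m_star.items()}  # negative part, values <= 0
m_pos = {h: max(v, Fr(0)) for h, v in m_star.items()}  # positive part, values >= 0

second = (m_neg, -total(m_neg))         # removes mass, adds the same amount to b
third = (m_pos, b_star + total(m_neg))  # what is left over

first_two = add(M_min, second)
recombined = add(first_two, third)
original = add(M_min, M_star)
```

In this example `first_two` has the same $λ + b$ as `M_min`, is nonnegative everywhere, and adding `third` back recovers the original point.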

By linearity of $I_{*}$ or $I_{* s}$ , injecting up the first two terms and the last term, and adding them afterwards, is the same as injecting up the bulk of them (we can only inject up a-measures). Let's abbreviate $M_{i, n}^{min} + (m_{i, n}^{* -}, - m_{i, n}^{* -} (1))$ as $M_{i, n}^{◊}$ and abbreviate $(m_{i, n}^{* +}, b_{i, n}^{*} + m_{i, n}^{* -} (1))$ as $M_{i, n}^{♡}$ . Now, we can rewrite $M_{n}$ as:

$M_{n} = \sum_{i} ζ_{i, n} I_{*}^{π_{s t}^{i}, π_{s t}} (M_{i, n}) = \sum_{i} (ζ_{i, n} I_{*}^{π_{s t}^{i}, π_{s t}} (M_{i, n}^{◊})) + \sum_{i} (ζ_{i, n} I_{*}^{π_{s t}^{i}, π_{s t}} (M_{i, n}^{♡}))$

The first component is in $\to^{c} (Θ^{s t}) (π_{s t})$ and obeys the $λ^{⊙} + b^{⊙}$ bound, because injection up preserves $λ$ and $b$ , each $M_{i, n}^{◊}$ lies below the $λ^{⊙} + b^{⊙}$ bound, and mixing stuff below the bound produces a point below the bound. Abbreviate the first component as $M_{n}^{◊}$ .

So, $M_{n}$ isn't minimal, it lies above $M_{n}^{◊}$ . Because the $M_{n}^{◊}$ have a $λ^{⊙} + b^{⊙}$ bound, and there's only finitely many places where nirvana could be, we can extract a convergent subsequence, limiting to some $M^{◊}$ which obeys the $λ^{⊙} + b^{⊙}$ bound, and by Lemma 16, $M^{◊}$ lies below $M$ .

Therefore, $M$ isn't minimal, and it was an arbitrary point that violated the $λ^{⊙} + b^{⊙}$ bound, so all minimal points in any $\to^{c} (Θ^{s t}) (π_{s t})$ obey the same $λ^{⊙} + b^{⊙}$ bound, and we get bounded minimals.

6: Stub Normalization. By Lemma 22, we didn't introduce any new nirvana-free points, so stub normalization of $Θ^{s t}$ carries over.

Condition 7: Weak Consistency.

This is "projecting down makes a subset". All the following arguments work with sur-stuff. Fix some $M \in \to^{c} (Θ^{s t}) (π_{s t}^{h i})$ . It's a limit of points $M_{n}$ in the convex hull. We can decompose the $M_{n}$ as $\sum_{i} ζ_{i, n} I_{*}^{π_{s t}^{i}, π_{s t}^{h i}} (M_{i, n})$ . Then,

$p r_{*}^{π_{s t}^{h i}, π_{s t}^{l o}} (M_{n}) = p r_{*}^{π_{s t}^{h i}, π_{s t}^{l o}} (\sum_{i} ζ_{i, n} I_{*}^{π_{s t}^{i}, π_{s t}^{h i}} (M_{i, n})) = \sum_{i} ζ_{i, n} p r_{*}^{π_{s t}^{h i}, π_{s t}^{l o}} (I_{*}^{π_{s t}^{i}, π_{s t}^{h i}} (M_{i, n}))$

Which, by the Diamond Lemma, can be rewritten as: $\sum_{i} ζ_{i, n} I_{*}^{inf (π_{s t}^{l o}, π_{s t}^{i}), π_{s t}^{l o}} (p r_{*}^{π_{s t}^{i}, inf (π_{s t}^{l o}, π_{s t}^{i})} (M_{i, n}))$

The projections of the $M_{i, n} \in Θ^{s t} (π_{s t}^{i})$ lie in $Θ^{s t} (inf (π_{s t}^{l o}, π_{s t}^{i}))$ by weak consistency for $Θ^{s t}$ . So, actually, the projection of $M_{n}$ down to $π_{s t}^{l o}$ can be written as a mix of injections up from stubs below $π_{s t}^{l o}$ , so the projection of $M_{n}$ lies in $\to^{c} (Θ^{s t}) (π_{s t}^{l o})$ . Then, just use continuity of projection, and closure, to get $M$ itself projecting down into $\to^{c} (Θ^{s t}) (π_{s t}^{l o})$ , so we're good on weak consistency.

8: Stub Extreme Point Condition: By Lemma 22, we didn't introduce any new nirvana-free points, so any nirvana-free extreme minimal point was present (and nirvana-free extreme minimal) in $Θ^{s t}$ already, so the extreme point condition carries over from there.

9: Stub Hausdorff Continuity: By Lemma 22, we didn't introduce any new nirvana-free points, and the preimages for Hausdorff continuity are of the nirvana-free parts, so this is completely unaffected and carries over.

Condition C: Causality: As a warmup to this result, we'll show that if $π_{s t} \leq π_{s t}^{h i}$ , then

$I_{*}^{π_{s t}, π_{s t}^{h i}} (\to^{c} (Θ^{s t}) (π_{s t})) \subseteq \to^{c} (Θ^{s t}) (π_{s t}^{h i})$

Pick a $M \in \to^{c} (Θ^{s t}) (π_{s t})$ . It's a limit of $M_{n}$ which are finite mixtures of injections of stuff from below, and $M_{n}$ can be written as $\sum_{i} ζ_{i, n} I_{*}^{π_{s t}^{i}, π_{s t}} (M_{i, n})$ . Then,

$I_{*}^{π_{s t}, π_{s t}^{h i}} (M_{n}) = I_{*}^{π_{s t}, π_{s t}^{h i}} (\sum_{i} ζ_{i, n} I_{*}^{π_{s t}^{i}, π_{s t}} (M_{i, n})) = \sum_{i} ζ_{i, n} I_{*}^{π_{s t}, π_{s t}^{h i}} (I_{*}^{π_{s t}^{i}, π_{s t}} (M_{i, n}))$

And then, by commutativity of injections, the injection of $M_{n}$ rewrites as $\sum_{i} ζ_{i, n} I_{*}^{π_{s t}^{i}, π_{s t}^{h i}} (M_{i, n})$ . All the $π_{s t}^{i}$ are below $π_{s t}$ so they're also under $π_{s t}^{h i}$ , witnessing that the injection of $M_{n}$ lies in $\to^{c} (Θ^{s t}) (π_{s t}^{h i})$ . Then, just appeal to closure and continuity of $I_{*}$ or $I_{* s}$ to get that $M$ injects up into $\to^{c} (Θ^{s t}) (π_{s t}^{h i})$ . Again, all this stuff works for the sur-situation as well.

With this out of the way, fix some $M \in \to^{c} (Θ^{s t}) (π_{s t})$ . Let's try to make an outcome function from this, shall we? Let's do $o f (π_{s t}^{'}) := I_{*}^{inf (π_{s t}, π_{s t}^{'}), π_{s t}^{'}} (p r_{*}^{π_{s t}, inf (π_{s t}, π_{s t}^{'})} (M))$

Yup, that does indeed specify one point for everything. It obviously spits out $M$ when you feed $π_{s t}$ in because both the injection and projection turn into identity. Further, by weak-consistency, the projection of $M$ down lies in $\to^{c} (Θ^{s t}) (inf (π_{s t}, π_{s t}^{'}))$ , and by our freshly-proved result, injecting up lands you in $\to^{c} (Θ^{s t}) (π_{s t}^{'})$ .

So, all that's left is showing that it's an outcome function! That is, for any two $π_{s t}^{h i}$ and $π_{s t}^{l o}$ where $π_{s t}^{h i} \geq π_{s t}^{l o}$ , we need $p r_{*}^{π_{s t}^{h i}, π_{s t}^{l o}} (o f (π_{s t}^{h i})) = o f (π_{s t}^{l o})$ . Let's begin.

$p r_{*}^{π_{s t}^{h i}, π_{s t}^{l o}} (o f (π_{s t}^{h i})) = p r_{*}^{π_{s t}^{h i}, π_{s t}^{l o}} (I_{*}^{inf (π_{s t}, π_{s t}^{h i}), π_{s t}^{h i}} (p r_{*}^{π_{s t}, inf (π_{s t}, π_{s t}^{h i})} (M)))$

And then, by the Diamond Lemma, this equals

$I_{*}^{inf (π_{s t}^{l o}, inf (π_{s t}, π_{s t}^{h i})), π_{s t}^{l o}} (p r_{*}^{inf (π_{s t}, π_{s t}^{h i}), inf (π_{s t}^{l o}, inf (π_{s t}, π_{s t}^{h i}))} (p r_{*}^{π_{s t}, inf (π_{s t}, π_{s t}^{h i})} (M)))$

And then, $inf (π_{s t}^{l o}, inf (π_{s t}, π_{s t}^{h i})) = inf (π_{s t}^{l o}, π_{s t}^{h i}, π_{s t}) = inf (π_{s t}, π_{s t}^{l o})$ because $π_{s t}^{l o} \leq π_{s t}^{h i}$ . Rewriting a bit, and grouping the two projections together because they commute, we have:

$I_{*}^{inf (π_{s t}, π_{s t}^{l o}), π_{s t}^{l o}} (p r_{*}^{π_{s t}, inf (π_{s t}, π_{s t}^{l o})} (M)) = o f (π_{s t}^{l o})$

And we're finally done with everything, we showed causality.

Lemma 24: Given a $Θ$ (maybe just defined on stubs or full policies) that fulfills all hypothesis conditions except normalization, if it's normalizable, then all belief-function conditions are preserved (works in the sur-case too)

Nirvana-free nonemptiness, closure, convexity, nirvana-free upper-completion, and bounded-minimals are all obviously preserved by scale-and-shift/Proposition 7 in section 1 of proofs. For consistency, due to projections preserving $λ$ and $b$ , the scale-and-shift in the stubs (or full policies) is perfectly reflected in whichever partial policy you're evaluating, so consistency holds too. For the extreme point condition, any nirvana-free minimal extreme point post-renormalization is also nirvana-free minimal extreme pre-renormalization, so we can undo the renormalization, get a point in the nirvana-free component of a full policy that projects down accordingly, and scale-shift that point to get something that projects down to the scale-shifted extreme point. For Hausdorff-continuity, the scaling just scales the distance between two sets by the scale term, so Hausdorff-continuity carries over. Pseudocausality is preserved by normalization (un-normalize, transfer over to the partial policy of interest, then normalize back again), and so is causality (unnormalize the point and outcome function, complete it, normalize your batch of points back again).

Proposition 2: Given a nirvana-free $Θ^{? ω}$ , the minimal constraints we must check of $Θ^{? ω}$ to turn it into an acausal hypothesis are: Nonemptiness, Restricted Minimals, Hausdorff-Continuity, and non-failure of renormalization. Every other constraint to make a hypothesis can be freely introduced.

Ok, we have our $Θ^{? ω}$ , and we want to produce an acausal hypothesis. The obvious way to do it is: $Θ^{ω} (π) = ((\overline{c . h} (Θ^{? ω} (π)))^{u c})^{R}$ . We'll use $Θ^{ω, \neg R} (π)$ to refer to the set before renormalization.

Proof sketch: We basically just run through the infinitary hypothesis conditions, and show that they're fulfilled by $Θ^{ω, \neg R}$ , and then appeal to Lemma 24 that we didn't destroy our hard work when we normalize. As for the hypothesis conditions themselves, they're all pretty simple to show except for bounded-minimals and Hausdorff-continuity, which is where the bulk of the work is.

1: Nirvana-free nonemptiness. Trivial, because all the $Θ^{? ω} (π)$ are nonempty.

2: Closure: Appeal to Lemma 2, not in this document, but of section 1 in basic inframeasure theory, that the upper completion of a closed set is closed. Then we just intersect with the cone of a-measures (closed) to get our set of interest, so it's closed.

3: Convexity: If you have $M$ and $M^{'}$ which decompose into $M^{l o} + M^{*}$ and $M^{^{'} l o} + M^{^{'} *}$ ( $M$ and $M^{'}$ lie in the upper completion and $M^{l o}$ / $M^{^{'} l o}$ lie in the closed convex hull), then

$p M + (1 - p) M^{'} = p (M^{l o} + M^{*}) + (1 - p) (M^{^{'} l o} + M^{^{'} *})$

$= (p M^{l o} + (1 - p) M^{^{'} l o}) + (p M^{*} + (1 - p) M^{^{'} *})$

The first component lies in the closed convex hull because it's a mix of two points from the closed convex hull, and the second component is an sa-measure (it's a mix of two sa-measures), so, by upper completion, $p M + (1 - p) M^{'}$ lies in our $Θ^{ω, \neg R} (π)$ .

4: Nirvana-free Upper-completeness: Trivial, we took the upper completion.

Condition 5: Bounded Minimals:

This can be shown by demonstrating that, if $λ^{⊙} + b^{⊙}$ is our bound for $Θ^{? ω}$ (ie, every point in $Θ^{? ω} (π)$ , regardless of $π$ , either respects the bound or lies strictly above a point in $Θ^{? ω} (π)$ that respects the bound), then every point in $Θ^{ω, \neg R} (π)$ lies above a point that obeys the $λ^{⊙} + b^{⊙}$ bound.

Take a point $M \in \overline{c . h} (Θ^{? ω} (π))$ . We don't have to worry about points in the upper completion that weren't part of the original closed convex hull, because they're above something in the closed convex hull, so we just have to show that everything in the closed convex hull lies above something that respects the bounds.

$M$ can be written as a limit of points $M_{n} \in c . h (Θ^{? ω} (π))$ , which split into a mixture of finitely many $M_{i, n} \in Θ^{? ω} (π)$ . We can then split the $M_{i, n}$ into $M_{i, n}^{l o} + M_{i, n}^{*}$ , where $M_{i, n}^{l o}$ respects the appropriate bounds (everything in $Θ^{? ω} (π)$ either obeys the bounds or is above a point which obeys the bounds). Now, we can rewrite $M_{n}$ as: $\sum_{i} ζ_{i, n} (M_{i, n}^{l o}) + \sum_{i} ζ_{i, n} (M_{i, n}^{*})$

That first sum term is a mixture of stuff that respects the $λ^{⊙} + b^{⊙}$ bound, so it respects the same property and lies below $M_{n}$ by the addition of the second term making $M_{n}$ . All these lower points lie in the closed convex hull, and obey the $λ^{⊙} + b^{⊙}$ bound, so there's a convergent subsequence that limits to some limit point that's also in the closed convex hull, respects the $λ^{⊙} + b^{⊙}$ bound, and by Lemma 16, is below $M$ .

So, any $M \in Θ^{ω, \neg R} (π)$ , regardless of $π$ , which violates the $λ^{⊙} + b^{⊙}$ bound on minimal points, has a point lower than it which does respect the bound, showing Minimal Boundedness.

We normalize at the end, and need uniform Hausdorff Continuity to show nirvana-free consistency, so let's skip to that one, which is hard.

Condition 8: Uniform Hausdorff Continuity:

We'll be working with the Lemma 15 variant of Hausdorff-continuity, that given any $ϵ$ , there's a $δ$ where two policies $π, π^{'}$ being $δ$ apart or less guarantees that if $M$ is in $Θ^{? ω} (π)$ , then there's a point $M^{'}$ in $Θ^{? ω} (π^{'})$ that's only $ϵ (1 + λ)$ away, where $λ$ is the $λ$ value associated with $M$ , and establish that variant for $Θ^{ω, \neg R}$ .

Fix an $ϵ$ . How close do two policies have to be to guarantee that for any $M \in Θ^{ω, \neg R} (π)$ , there's a point $M^{'}$ in $Θ^{ω, \neg R} (π^{'})$ within $ϵ (1 + λ)$ ? Well, for our original Hausdorff-continuity condition, pick a $δ$ that forces a $\frac{ϵ}{2 (1 + λ^{⊙} + b^{⊙})}$ "distance", and $δ < \frac{ϵ}{2 (1 + λ^{⊙} + b^{⊙})}$ .

Since we've got closure and bounded-minimals, write $M$ as $M^{l o} + M^{*}$ where $M^{l o}$ respects the $λ^{⊙} + b^{⊙}$ bound, and it lies in the closed convex hull and is a limit of $M_{n}^{l o}$ points, which decompose into a mixture of finitely many $M_{i, n}^{l o}$ points.

Now, each of these $M_{i, n}^{l o}$ points in $Θ^{? ω} (π)$ , by Hausdorff-continuity of $Θ^{? ω}$ , have a $M_{i, n}^{^{'} l o}$ point in $Θ^{? ω} (π^{'})$ , that's only $\frac{ϵ}{2 (1 + λ^{⊙} + b^{⊙})} (1 + λ_{i, n})$ away, by $π^{'}$ and $π$ being $δ$ or less apart.

We can mix the $M_{i, n}^{^{'} l o}$ in the same way as usual to make a $M_{n}^{^{'} l o}$ that's only $\frac{ϵ}{2 (1 + λ^{⊙} + b^{⊙})} (1 + λ_{n})$ or less away from $M_{n}^{l o}$

Because the $M_{n}^{l o}$ sequence converges, there's some bound on the $λ_{n}$ and $b_{n}$ values, and the (at most) $\frac{ϵ}{2 (1 + λ^{⊙} + b^{⊙})} (1 + {max}_{n} (λ_{n}))$ change to make $M_{n}^{^{'} l o}$ still keeps the $λ$ and $b$ values of our new sequence bounded, so by the Compactness Lemma, the $M_{n}^{^{'} l o}$ sequence has a convergent subsequence, with a limit point $M^{^{'} l o}$ , that lies in $Θ^{ω, \neg R} (π^{'})$ by closure. Also, for all n,

$d (M^{l o}, M^{^{'} l o}) \leq d (M^{l o}, M_{n}^{l o}) + d (M_{n}^{l o}, M_{n}^{^{'} l o}) + d (M_{n}^{^{'} l o}, M^{^{'} l o})$

The two distances on either side limit to 0, and the middle distance limits to $\frac{ϵ}{2 (1 + λ^{⊙} + b^{⊙})} (1 + λ^{⊙} + b^{⊙})$ or less, because eventually the $λ$ value of $M_{n}^{l o}$ gets really close to the $λ$ value of $M^{l o}$ , which is subject to the constraint that it can't be bigger than $λ^{⊙} + b^{⊙}$ due to $M^{l o}$ being picked to have its $λ + b$ value below $λ^{⊙} + b^{⊙}$ , so $d (M^{l o}, M^{^{'} l o}) \leq \frac{ϵ}{2}$

Ok, so those two things are pretty close to each other. But what we really want is to find a point in $Θ^{ω, \neg R} (π^{'})$ that's close to $M$ , ie, $M^{l o} + M^{*}$ . We can invoke the proof path from direction 2 of Lemma 15 (we have enough tools to do it, most notably upper completion) to craft a $M^{'} \in Θ^{ω, \neg R} (π^{'})$ where $d (M, M^{'}) \leq \frac{ϵ}{2} + δ (\frac{ϵ}{2} + λ)$

Further, $δ < \frac{ϵ}{2 (1 + λ^{⊙} + b^{⊙})} \leq \frac{ϵ}{2}$ . So, we get $d (M, M^{'}) \leq ϵ (1 + λ)$ and we're done.

7: Nirvana-free Consistency: We're working in a nirvana-free setting, so we can simplify things. Our formulation that we're going to show is, regardless of stub $π_{s t}$ , $c . h (⋃_{π > π_{s t}} p r_{*}^{π, π_{s t}} (Θ^{ω} (π)))$ is closed. Just invoke Lemma 20 and we have it and that's the last one we needed besides renormalization. Now all we have to do is to show that every property is preserved when we do the necessary rescaling. Invoke Lemma 24.

Proposition 3: Given some arbitrary $Θ^{? ω}$ which can be turned into an acausal hypothesis, turning it into $Θ^{ω}$ has $E_{Θ^{ω} (π)} (f) = α (E_{Θ^{? ω} (π)} (f) - β)$ for all $π$ and $f$ .

The steps to make your full $Θ^{ω}$ are convex hull, closure, upper completion, and renormalization. For convex hull, because $f$ induces a positive functional, which is linear, convex hull doesn't affect the worst-case value (mixing a-measures mixes the score you get w.r.t the function), closure just swaps inf out for min, and upper completion doesn't add any new minimal points so it preserves the same minimal values for everything. Let's use $Θ^{ω, \neg R}$ for the upper completion of the closed convex hull (no renormalization) so, unpacking definitions, for all $π, f$ :

$E_{Θ^{? ω} (π)} (f) = {inf}_{(m, b) \in Θ^{? ω} (π)} (m (f) + b) = {inf}_{(m, b) \in c . h (Θ^{? ω} (π))} (m (f) + b)$

$= {inf}_{(m, b) \in \overline{c . h} (Θ^{? ω} (π))} (m (f) + b) = {inf}_{(m, b) \in (\overline{c . h} (Θ^{? ω} (π)))^{u c}} (m (f) + b) = E_{Θ^{ω, \neg R} (π)} (f)$
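The chain of equalities above can be sanity-checked on a toy model. The following is only a sketch under assumptions that are not part of the original construction: the outcome space has two points, a set of a-measures is a finite Python list of `(m, b)` pairs, and `expectation`, `mix`, and `add` are hypothetical helper names. It adds one point from the convex hull and one point from the upper completion and confirms that the worst-case value does not move.

```python
def expectation(points, f):
    """Worst-case value: the minimum of m(f) + b over the a-measures (m, b)."""
    return min(sum(mx * fx for mx, fx in zip(m, f)) + b for m, b in points)


def mix(p, point_a, point_b):
    """Convex combination p * point_a + (1 - p) * point_b of two a-measures."""
    (ma, ba), (mb, bb) = point_a, point_b
    m = [p * x + (1 - p) * y for x, y in zip(ma, mb)]
    return m, p * ba + (1 - p) * bb


def add(point, extra):
    """Add an sa-measure `extra` to an a-measure `point` (a step of upper completion)."""
    (m, b), (ms, bs) = point, extra
    return [x + y for x, y in zip(m, ms)], b + bs


# Toy stand-in (an assumption) for one set Theta^{?omega}(pi): two a-measures.
raw = [([0.5, 0.5], 0.2), ([1.0, 0.0], 0.1)]

# One new point from the convex hull and one from the upper completion.
# The added sa-measure ([-0.1, 0.2], 0.1) has b plus its negative mass equal to 0.
hull_point = mix(0.5, raw[0], raw[1])
upper_point = add(raw[1], ([-0.1, 0.2], 0.1))
enlarged = raw + [hull_point, upper_point]

f = [0.3, 0.9]  # a function with values in [0, 1]
before = expectation(raw, f)
after = expectation(enlarged, f)
```

Here `before` and `after` agree, which is the finite shadow of "convex hull and upper completion don't add any new minimal values".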

Continuing onwards, let's use $β$ for our shift constant and $α$ for our rescaling constant.

$E_{Θ^{ω} (π)} (f) = {inf}_{(m, b) \in Θ^{ω} (π)} (m (f) + b) = {inf}_{(m, b) \in Θ^{ω, \neg R} (π)} (α m (f) + α (b - β))$

$= α (({inf}_{(m, b) \in Θ^{ω, \neg R} (π)} (m (f) + b)) - β) = α (E_{Θ^{ω, \neg R} (π)} (f) - β)$

So, regardless of your $π$ and $f$ , $E_{Θ^{ω} (π)} (f) = α (E_{Θ^{? ω} (π)} (f) - β)$

and we're done. In particular, since this scale and shift is completely uniform across everything, it keeps the set of optimal policies unchanged.
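To make the "uniform scale and shift keeps the optimal policies unchanged" remark concrete, here is a minimal sketch on a toy model. Everything specific in it is an assumption for illustration: two policies named `pi_a` and `pi_b`, a two-point outcome space, finite lists of `(m, b)` pairs, and hand-picked constants `alpha` and `beta`. The map $(m, b) \mapsto (α m, α (b - β))$ is the one used in the derivation above.

```python
import math


def expectation(points, f):
    """Worst-case value: the minimum of m(f) + b over the a-measures (m, b)."""
    return min(sum(mx * fx for mx, fx in zip(m, f)) + b for m, b in points)


def scale_shift(points, alpha, beta):
    """Send each (m, b) to (alpha * m, alpha * (b - beta))."""
    return [([alpha * x for x in m], alpha * (b - beta)) for m, b in points]


# Toy belief function (an assumption): one finite set of a-measures per policy.
theta = {
    "pi_a": [([0.5, 0.5], 0.2), ([1.0, 0.0], 0.1)],
    "pi_b": [([0.2, 0.6], 0.3), ([0.0, 1.0], 0.4)],
}
alpha, beta = 2.0, 0.1  # hypothetical rescaling and shift constants
f = [0.3, 0.9]

old = {pi: expectation(pts, f) for pi, pts in theta.items()}
new = {pi: expectation(scale_shift(pts, alpha, beta), f) for pi, pts in theta.items()}

# The formula E_new = alpha * (E_old - beta), checked policy by policy.
formula_holds = all(math.isclose(new[pi], alpha * (old[pi] - beta)) for pi in theta)

# The best policy before and after the scale-and-shift.
best_old = max(old, key=old.get)
best_new = max(new, key=new.get)
```

Because `alpha` is positive and the same for every policy, the ordering of policies by worst-case value is untouched, so `best_old` and `best_new` coincide.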

Proposition 4: For all hypotheses $Θ$ and $Θ^{'}$ , $(\forall π, f : E_{Θ (π)} (f) = E_{Θ^{'} (π)} (f)) \leftrightarrow (\to^{N F} (Θ) = \to^{N F} (Θ^{'}))$

We can use that Murphy never picks something with Nirvana in it, and $\to^{N F} (Θ) (π) = Θ (π) \cap N F$ to rewrite our desired property as:

$(\forall π, f : E_{\to^{N F} (Θ) (π)} (f) = E_{\to^{N F} (Θ^{'}) (π)} (f)) \leftrightarrow (\to^{N F} (Θ) = \to^{N F} (Θ^{'}))$

One direction of this is pretty easy: if the belief functions are identical when you slice off the nirvana, then regardless of $π$ and $f$ , Murphy forces the same value. The other direction of this can be done by Theorem 3 from Section 1. Fixing a $π$ , the property $\forall f : E_{\to^{N F} (Θ) (π)} (f) = E_{\to^{N F} (Θ^{'}) (π)} (f)$ implies $\to^{N F} (Θ) (π) = \to^{N F} (Θ^{'}) (π)$ , and this holds for all $π$ , so we get $\to^{N F} (Θ) = \to^{N F} (Θ^{'})$ .

Proposition 5: For all hypotheses $Θ$ , and all continuous functions $g$ from policies to functions $f \in C ((A \times O)^{ω}, [0, 1])$ , then ${argmax}_{π} E_{Θ (π)} (g (π))$ exists and is closed.

Proof sketch: We'll prove this in four phases, where $π_{n}$ is some arbitrary sequence of policies limiting to the policy $π$ .

Our first phase will be establishing that ${lim}_{n \to \infty} | E_{Θ (π_{n})} (g (π_{n})) - E_{Θ (π_{n})} (g (π)) | = 0$

Our second phase will be establishing that ${limsup}_{n \to \infty} E_{Θ (π_{n})} (g (π)) \leq E_{Θ (π)} (g (π))$

Our third phase will be establishing that ${liminf}_{n \to \infty} E_{Θ (π_{n})} (g (π)) \geq E_{Θ (π)} (g (π))$

Putting phase 2 and 3 together, ${lim}_{n \to \infty} E_{Θ (π_{n})} (g (π)) = E_{Θ (π)} (g (π))$ and then, in conjunction with phase 1, ${lim}_{n \to \infty} E_{Θ (π_{n})} (g (π_{n})) = E_{Θ (π)} (g (π))$ This establishes that the function $π \mapsto E_{Θ (π)} (g (π))$ is continuous. Phase 4 is then deriving our desired result from the continuity of that function.

Begin phase 1. To begin with, because $g$ is a function from a compact metric space to a metric space, by the Heine-Cantor theorem, it's uniformly continuous. So, there's some $δ$ difference between policies that guarantees an $ϵ$ difference between the functions produced, and our distance metric on functions in this case is ${sup}_{x} | f (x) - f^{'} (x) |$ , the distance metric associated with uniform convergence.

By Proposition 3 (Section 1), every positive functional (and, by Proposition 1 (Section 1), continuous functions induce positive functionals) is minimized within the set of minimal points, so we can fix an $(m, b)$ and $(m^{'}, b^{'})$ within $(Θ (π_{n}))^{min}$ (specifically, the nirvana-free component) which minimize the positive functionals associated with $g (π_{n})$ and $g (π)$ , respectively. Being able to get an actual minimizing point follows from minimal-boundedness, so the closure of the set of minimal points (due to having bounds) is compact, and a continuous function from a compact set to $[0, 1]$ has an actual minimizing point, so we can pick such a point and then step down to a minimal point if needed

Note that, by the way we picked these, $m (g (π_{n})) + b = E_{Θ (π_{n})} (g (π_{n}))$ and also $m^{'} (g (π)) + b^{'} = E_{Θ (π_{n})} (g (π))$

First, we'll bound the following two terms. $| (m (g (π_{n})) + b) - (m (g (π)) + b) |$ and $| (m^{'} (g (π)) + b^{'}) - (m^{'} (g (π_{n})) + b^{'}) |$ The same arguments work for both, so we'll just show one of them.

$| (m (g (π_{n})) + b) - (m (g (π)) + b) | = | m (g (π_{n}) - g (π)) | \leq m (| g (π_{n}) - g (π) |) \leq λ^{⊙} ϵ_{n}$

The argument for this is that the first $\leq$ is because $m$ is a measure (never negative), so an upper-bound on the absolute value of the expectation is given by the expectation of the absolute value of the distance between the two functions. For the second $\leq$ , if n is large enough to make $π_{n}$ and $π$ be only $δ_{n}$ apart, then $g (π_{n})$ and $g (π)$ are only $ϵ_{n}$ apart, so that absolute value is upper bounded by $ϵ_{n}$ , getting us an upper bound of $m (ϵ_{n})$ . Then, because the total amount of measure for minimal points is upper bounded by some $λ^{⊙}$ regardless of which policy we picked (minimal-point boundedness for $Θ$ ), we can finally impose an upper bound of $ϵ_{n} λ^{⊙}$ on the distance. The same sort of argument works for the second term, and gets us the exact same upper bound.

Further, $m (g (π_{n})) + b \leq m^{'} (g (π_{n})) + b^{'}$ and $m^{'} (g (π)) + b^{'} \leq m (g (π)) + b$ . This is because $(m, b)$ is specialized to minimize $g (π_{n})$ and $(m^{'}, b^{'})$ is specialized to minimize $g (π)$ . Therefore, in one direction:

$(m^{'} (g (π)) + b^{'}) - (m (g (π)) + b) \leq 0$ so $(m^{'} (g (π)) + b^{'}) - (m (g (π_{n})) + b) \leq ϵ_{n} λ^{⊙}$

In the other direction,

$(m (g (π_{n})) + b) - (m^{'} (g (π_{n})) + b^{'}) \leq 0$ so $(m (g (π_{n})) + b) - (m^{'} (g (π)) + b^{'}) \leq ϵ_{n} λ^{⊙}$

Thus, putting the two parts together,

$| E_{Θ (π_{n})} (g (π_{n})) - E_{Θ (π_{n})} (g (π)) | = | (m (g (π_{n})) + b) - (m^{'} (g (π)) + b^{'}) | \leq ϵ_{n} λ^{⊙}$

We can make n go to infinity, which makes $δ_{n}$ (distance between policies) go to 0, which makes $ϵ_{n}$ (distance between functions) go to 0, and $λ^{⊙}$ is a constant, so we get that the distance between the two expectations limits to 0 and we're done with the first phase.
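The phase 1 bound $| E (g (π_{n})) - E (g (π)) | \leq ϵ_{n} λ^{⊙}$ can be watched numerically. This is a sketch under stated assumptions, not the general argument: a single fixed finite list of a-measures stands in for $Θ (π_{n})$ , `mass_bound` (the largest total measure in that list) stands in for $λ^{⊙}$ , and the sequence `g_n` converging uniformly to `g_limit` is made up for the example.

```python
def expectation(points, f):
    """Worst-case value: the minimum of m(f) + b over the a-measures (m, b)."""
    return min(sum(mx * fx for mx, fx in zip(m, f)) + b for m, b in points)


def sup_distance(f, g):
    """Uniform-convergence distance sup_x |f(x) - g(x)| on a finite outcome space."""
    return max(abs(a - b) for a, b in zip(f, g))


# Toy set of a-measures (an assumption) on a two-point outcome space.
points = [([0.5, 0.5], 0.2), ([1.0, 0.0], 0.1), ([0.0, 1.5], 0.0)]
mass_bound = max(sum(m) for m, _ in points)  # plays the role of lambda^odot

g_limit = [0.3, 0.9]
gaps, bounds = [], []
for n in range(1, 51):
    # A hypothetical sequence of functions with values in [0, 1] tending to g_limit.
    g_n = [0.3 + 0.5 / n, 0.9 - 0.5 / n]
    gaps.append(abs(expectation(points, g_n) - expectation(points, g_limit)))
    bounds.append(mass_bound * sup_distance(g_n, g_limit))
```

Each entry of `gaps` sits below the matching entry of `bounds`, and `bounds` shrinks to 0 as `n` grows, mirroring how $ϵ_{n} λ^{⊙}$ goes to 0.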

Time for phase 2, showing that ${limsup}_{n \to \infty} E_{Θ (π_{n})} (g (π)) \leq E_{Θ (π)} (g (π))$

Fix some $(m, b)$ in $Θ (π) \cap N F$ that minimizes the positive functional associated with $g (π)$ . By Hausdorff-Continuity, we can find a sequence of points $(m_{n}, b_{n}) \in Θ (π_{n}) \cap N F$ that limit to $(m, b)$ . By continuity of $g (π)$ , this means that $m_{n} (g (π)) + b_{n}$ limits to $m (g (π)) + b$ , which is $E_{Θ (π)} (g (π))$ . However, $m_{n} (g (π)) + b_{n} \geq E_{Θ (π_{n})} (g (π))$

Thus, ${limsup}_{n \to \infty} E_{Θ (π_{n})} (g (π)) \leq E_{Θ (π)} (g (π))$

Now for phase 3, showing that ${liminf}_{n \to \infty} E_{Θ (π_{n})} (g (π)) \geq E_{Θ (π)} (g (π))$

Assume it's false, and we'll get a proof by contradiction. That is,

${liminf}_{n \to \infty} E_{Θ (π_{n})} (g (π)) < E_{Θ (π)} (g (π))$

Then, we can get some subsequence where the expectations converge to the liminf. For each n in that subsequence, fix a $(m_{n}, b_{n}) \in (Θ (π_{n}))^{min} \cap N F$ that minimizes the positive functional associated with $g (π)$ within $Θ (π_{n})$ . Ie, $m_{n} (g (π)) + b_{n} = E_{Θ (π_{n})} (g (π))$

By bounded minimals, there's some $λ^{⊙} + b^{⊙}$ bound on all of these, so we can isolate another convergent subsequence (the expectation values still limit to the liminf), where the $(m_{n}, b_{n})$ limit to some $(m, b)$ . For the following arguments, we'll use n to denote numbers from our original sequence (ranges over all natural numbers) and j to denote numbers from our convergent subsequence of interest (where the expectations converge to liminf and our sequence of minimizing points converges to a limit point)

First, this $(m, b)$ limit point lies in $Θ (π)$ , because it's arbitrarily close to points that are arbitrarily close to $Θ (π)$ (Hausdorff-continuity), so the distance to that set shrinks to 0, and $Θ (π)$ is closed so said point limits to be in it. Now, we can go

${liminf}_{n \to \infty} E_{Θ (π_{n})} (g (π)) = {lim}_{j \to \infty} (m_{j} (g (π)) + b_{j}) = m (g (π)) + b$

By $m_{j} (g (π)) + b_{j} = E_{Θ (π_{j})} (g (π))$ , and the j's making a subsequence where we attain the liminf value in the limit, and then the second equality is a convergent sequence of a-measures having their expectation value limit to the expectation value of the limit point.

But then we get something impossible. $(m, b) \in Θ (π)$ , and yet somehow (by our original assumption that the liminf undershot the expectation value of $g (π)$ in $Θ (π)$ ),

$m (g (π)) + b = {liminf}_{n \to \infty} E_{Θ (π_{n})} (g (π)) < E_{Θ (π)} (g (π)) = {min}_{(m^{'}, b^{'}) \in Θ (π) \cap N F} (m^{'} (g (π)) + b^{'})$

Which cannot be. This shows that the liminf is $\geq E_{Θ (π)} (g (π))$ .

Now for phase 4. Again, from the proof sketch, phases 1, 2, and 3 establish that $π \mapsto E_{Θ (π)} (g (π))$ is a continuous function $Π \to R^{\geq 0}$ Let's abbreviate this function as $χ$ . Since we're mapping $Π$ (which is compact) through a continuous function, the image $χ (Π)$ is compact. Thus, it has a maximum value, which is attained by some policy. Take that maximum value (it's a single point so it's closed), take the preimage (which is a nonempty closed set of policies), and that's your ${argmax}_{π} E_{Θ (π)} (g (π))$ set. And $E_{Θ (π)} (g (π))$ unpacks as ${min}_{(m, b) \in Θ (π) \cap N F} (m (g (π)) + b)$ Thus showing our result.

A quick corollary of it is, if $g$ just returns the constant 1 function, you can find a policy $π$ where ${min}_{(λ μ, b) \in Θ (π)} (λ + b)$ is 1, by normalization, so we can use $max$ in normalization instead of $sup$ .

Lemma 25: In the nirvana-free setting, with all the $B_{i} \subseteq F^{N F} (π_{p a})$ being nonempty and upper complete, then $E_{ζ} B_{i}$ is upper-complete.

The proof of this is nearly identical to the proof of Lemma 12. Except in this case, our $M_{i}$ aren't finitely many points selected from a nonconvex $B$ , they're countably many points selected from the various $B_{i}$ . Apart from that difference, the proof path works as it usually does.

Lemma 26: A belief function $Θ$ fulfilling all conditions except normalization, which is renormalized by subtracting ${inf}_{π} E_{Θ (π)} (0)$ and scaling by $({sup}_{π} E_{Θ (π)} (1) - {inf}_{π} E_{Θ (π)} (0))^{- 1}$ , only has the renormalization fail if, for all $π$ , $Θ (π) \cap N F$ has a single minimal point of the form $(0, b)$ , with the same $b$ for all $π$ . (works in the sur-case too)

First, fixing an arbitrary $π^{'}$ , then, if there's a divide-by-zero when scaling,

$E_{Θ (π^{'})} (1) \leq {max}_{π} E_{Θ (π)} (1) = {min}_{π} E_{Θ (π)} (0) \leq E_{Θ (π^{'})} (0)$

However, $E_{Θ (π^{'})} (1) \geq E_{Θ (π^{'})} (0)$ always, so the two terms are equal.

Now, just invoke Proposition 6 from Section 1, which says that if $E_{Θ (π^{'}) \cap N F} (1) = E_{Θ (π^{'}) \cap N F} (0)$ , then there's a single minimal point of the form $(0, b)$ , and the $b$ is $E_{Θ (π^{'}) \cap N F} (0)$ , which is the same for all policies $π^{'}$ . The converse is, if all $Θ (π)$ are of this form, then renormalization fails.

Let's define "nontriviality" for a belief function $Θ$ . A $Θ$ is nontrivial if there exists some $π$ where $E_{Θ (π)} (1) \neq E_{Θ (π)} (0)$

In other words, there's some policy you can feed in where the minimal points of $Θ (π)$ aren't just a single $(0, b)$ point. This is a very weak condition. Also, for the upcoming proposition 6, mixing just doesn't interact well with nirvana-free consistency, so we have to do it just for pseudocausal and acausal hypotheses.
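Lemma 26 and the nontriviality condition can be illustrated with a small check. The sketch below makes several assumptions that are not in the original: finitely many policies (so `min` and `max` replace inf and sup), a two-point outcome space, finite lists of `(m, b)` pairs, and the hypothetical helper names `renormalization_constants` and `is_nontrivial`. The constants follow the recipe in the statement of Lemma 26: shift by the smallest $E (0)$ and scale by the inverse of the largest $E (1)$ minus that shift.

```python
def expectation(points, f):
    """Worst-case value: the minimum of m(f) + b over the a-measures (m, b)."""
    return min(sum(mx * fx for mx, fx in zip(m, f)) + b for m, b in points)


def renormalization_constants(theta, n_outcomes, tol=1e-12):
    """Return (alpha, beta), or None when scaling would divide by zero."""
    zero, one = [0.0] * n_outcomes, [1.0] * n_outcomes
    beta = min(expectation(pts, zero) for pts in theta.values())
    top = max(expectation(pts, one) for pts in theta.values())
    if top - beta <= tol:
        return None
    return 1.0 / (top - beta), beta


def is_nontrivial(theta, n_outcomes, tol=1e-12):
    """True when some policy has E(1) different from E(0)."""
    zero, one = [0.0] * n_outcomes, [1.0] * n_outcomes
    return any(
        expectation(pts, one) - expectation(pts, zero) > tol
        for pts in theta.values()
    )


# Every policy has the single point (0, b) with the same b: renormalization fails.
trivial = {"pi_a": [([0.0, 0.0], 0.3)], "pi_b": [([0.0, 0.0], 0.3)]}

# A toy belief function where Murphy's choice depends on the function.
nontrivial = {
    "pi_a": [([0.5, 0.5], 0.2), ([1.0, 0.0], 0.1)],
    "pi_b": [([0.2, 0.6], 0.3), ([0.0, 1.0], 0.4)],
}

failed = renormalization_constants(trivial, 2)
constants = renormalization_constants(nontrivial, 2)
```

In this toy setting `failed` is `None` exactly because the trivial example is not nontrivial, while `constants` is a usable `(alpha, beta)` pair.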

Proposition 6: For pseudocausal and acausal hypotheses $Θ_{i}$ where $\sum_{i} ζ_{i} λ_{i}^{⊙} < \infty$ and there exists a nontrivial $Θ_{i}$ , then mixing them and renormalizing produces a pseudocausal or acausal hypothesis.

Mixing is defined on the infinite levels by $(E_{ζ} Θ_{i}) (π) = E_{ζ} (Θ_{i} (π))$ , and is then extended down to the finite levels by the usual process of projecting down and taking the closed convex hull. Then, we can renormalize if we wish. We'll distinguish these by "raw" or "renormalized" mix. Thus, the only conditions we need to check are the infinite conditions. For everything else, we can just go "the infinite conditions work, so we can extend to a full belief function" by the Isomorphism theorem.

If you want what mixing does at finite levels, it's $\overline{c . h} (⋃_{π > π_{s t}} (E_{ζ} (p r_{*}^{π, π_{s t}} (Θ_{i} (π)))))$ . So it isn't "mix all the finite levels", it's "mix all the projections individually and then take convex hull".

Proof sketch: Neglecting normalization (because Lemma 24 shows we can just renormalize and all nice conditions are preserved), we just need to verify all the relevant infinitary conditions, and then we can extend to lower levels by isomorphism, and get our result. We also need to show that nontriviality implies that the renormalization doesn't fail, but that's easy. As for the conditions, our lemmas let us get most of them with little trouble. Bounded-minimals from just the $λ^{⊙}$ bound is slightly more difficult and relies on showing that $(0, 1)$ is in all the $Θ_{i} (π)$ sets regardless of i and $π$ by normalization to eliminate the $b$ term, and Hausdorff-continuity is also fairly nontrivial (we have to split our mix into three pieces and bound each one individually via a different argument) and relies on the same $(0, 1)$ is in all the $Θ_{i} (π)$ result. For causality, we'll knock it out with Tychonoff in a slightly more complicated way than usual so we just use a countable product and don't have to invoke the full Axiom of Choice.

We'll take a detour and show that $(0, 1) \in E_{ζ} Θ_{i} (π)$ for all $π$ , we need this in a few places. First, pick an arbitrary i and $π$ and look at $Θ_{i} (π)$ . Find a minimal point $(m, b)$ that minimizes $m (1) + b$ . Now, consider $(m, b) + (- m, m (1))$ . This is $(0, m (1) + b)$ . However...

$m (1) + b = {min}_{(m^{'}, b^{'}) \in Θ_{i} (π) \cap N F} (m^{'} (1) + b^{'}) = E_{Θ_{i} (π)} (1) \leq {max}_{π} E_{Θ_{i} (π)} (1) = 1$

(by $(m, b)$ minimizing $m (1) + b$ , and normalization, respectively)

So, by nirvana-free upper-completion for $Θ_{i}$ , we get the point $(0, 1) \in Θ_{i} (π)$ for all i and $π$ . Then, mixing these gets that $(0, 1)$ lies in $(E_{ζ} Θ_{i}) (π)$ .
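The detour above is just two upper-completion steps, which can be traced on a toy example. This is a sketch under assumptions: a two-point outcome space, a made-up normalized set `theta_i_pi`, hypothetical helpers `is_sa_measure` and `add`, and the sa-measure condition recalled from basic inframeasure theory (the $b$ term plus the total mass of the negative part of the measure is nonnegative).

```python
import math


def is_sa_measure(m, b, tol=1e-12):
    """Assumed definition: b plus the (negative) mass of the negative part of m is >= 0."""
    return b + sum(min(x, 0.0) for x in m) >= -tol


def add(point, extra):
    """Add an sa-measure `extra` to the point `point`, componentwise."""
    (m, b), (ms, bs) = point, extra
    return [x + y for x, y in zip(m, ms)], b + bs


# Toy stand-in (an assumption) for a normalized Theta_i(pi); its E(1) is at most 1.
theta_i_pi = [([0.5, 0.2], 0.1), ([0.3, 0.3], 0.3)]

# Pick the point minimizing m(1) + b.
m, b = min(theta_i_pi, key=lambda p: sum(p[0]) + p[1])
total = sum(m) + b

step1 = ([-x for x in m], sum(m))      # (-m, m(1)): moves (m, b) to (0, m(1) + b)
step2 = ([0.0] * len(m), 1.0 - total)  # (0, 1 - (m(1) + b)): raises b up to 1

landed = add(add((m, b), step1), step2)
```

Both `step1` and `step2` pass `is_sa_measure`, and `landed` is the point $(0, 1)$ up to floating-point rounding.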

1: Infinitary Nirvana-Free Nonemptiness, it's easy, all our $(E_{ζ} Θ_{i}) (π)$ contain the a-measure $(0, 1)$ , which is nirvana free.

2,3: Closure, convexity: Closure follows from the proof of closure in Proposition 11 of section 1, and we mixed convex sets so the mixture is convex.

4: Nirvana-free Upper-Completion: Just invoke Lemma 25.

5: Bounded minimals: Since $(0, 1) \in (E_{ζ} Θ_{i}) (π)$ for all $π$ , any a-measure with $b \geq 1$ isn't minimal (add whatever you want to $(0, 1)$ ), so we have a bound on the $b$ values of minimals. What about a bound on the $λ$ values?

Well, just take a $M$ in $(E_{ζ} Θ_{i}) (π)$ with $b < 1$ , and split it into a mixture of $M_{i}$ points from the $Θ_{i} (π)$ . Then, by bounded minimality for the $Θ_{i}$ , we can take each $M_{i}$ and find a minimal point below it that fulfills the $λ_{i}^{⊙}$ bound on the $λ$ values. Mixing those minimals produces a point $M^{l o}$ that's below $M$ , and has a $λ$ value below $E_{ζ} λ_{i}^{⊙}$ , which, by assumption, is finite. So, every minimal point in any $(E_{ζ} Θ_{i}) (π)$ has a $λ + b$ value below $E_{ζ} λ_{i}^{⊙} + 1$ and we have bounded-minimals.

Nirvana-free consistency is something we'll have to loop back to after Hausdorff-continuity.

Condition 8: Hausdorff-continuity:

Pick an arbitrary $M \in (E_{ζ} Θ_{i}) (π)$ . It shatters into $M_{i} \in Θ_{i} (π)$ . We'll be showing the Lemma 15 variant, which is that for all $ϵ$ , there's a $δ$ where if $d (π^{'}, π) < δ$ , then there's a point in $(E_{ζ} Θ_{i}) (π^{'})$ that's $ϵ (1 + λ)$ away.

First, we'll shuffle around what our $M_{i}$ are supposed to be, we need a certain decomposition to make it work. Reindex your probability distribution so the highest-probability thing is assigned to $i = 0$ . All the $M_{i}$ can be decomposed as a $M_{i}^{min} + M_{i}^{*}$ . Now, let our new $M_{i}$ for $i > 0$ be defined as: $M_{i}^{min} + (m_{i}^{* -}, - m_{i}^{* -} (1))$ ,

and our $M_{i}$ for $i = 0$ is defined as: $M_{0}^{min} + (m_{0}^{* -}, - m_{0}^{* -} (1)) + \sum_{i} \frac{ζ_{i}}{ζ_{0}} (m_{i}^{* +}, b_{i}^{*} + m_{i}^{* -} (1))$

These new $M_{i}$ still lie in $Θ_{i} (π) \cap N F$ (nirvana-free upper-completion), and they're all a-measures (the negative part isn't enough to cancel out the positive measure on $m_{i}^{min}$ otherwise our old $M_{i}$ wouldn't be an a-measure). Further, mixing them together still makes $M$ , and if $i > 0$ , then $λ_{i} \leq λ_{i}^{⊙}$ (because we start off at a minimal point with $λ_{i} \leq λ_{i}^{⊙}$ , and then add something to it that saps some measure from it).

Fixing some $ϵ$ ... well, $E_{ζ} λ_{i}^{⊙} < \infty$ , so find a j where $\sum_{i > j} ζ_{i} (λ_{i}^{⊙} + 1) < \frac{ϵ}{3}$ . For $i \leq j$ ... well, using the Lemma 15 variant of Hausdorff-continuity, note that fixing a $δ$ gets you a different $ϵ_{i}$ value for Hausdorff-continuity of each $Θ_{i}$ . We only have to worry about the $i \leq j$ , though, and there's finitely many. So, pick a $δ$ where the induced $ϵ_{0}$ is below $\frac{ϵ}{3}$ , and for all $0 < i \leq j$ , $ϵ_{i} < \frac{ϵ}{3 (E_{ζ} λ_{i}^{⊙} + 1)}$ .

So... we take our $M_{i}$ in $Θ_{i} (π)$ , and go to nearby $M_{i}^{'}$ in $Θ_{i} (π^{'})$ . We should break down exactly how this is done. For $M_{0}$ , the $λ$ value relative to $M$ is at most $\frac{λ}{ζ_{0}}$ (in the degenerate case where it contributes all the measure to the mixture $M$ ), so, by Hausdorff-continuity for $Θ_{0}$ , the gap between $M_{0}$ and $M_{0}^{'}$ is at most $\frac{ϵ}{3} (1 + \frac{λ}{ζ_{0}})$ because we picked $δ$ low enough to get that scale factor on the front.

For $M_{i}$ where $0 < i \leq j$ , the gap between $M_{i}$ and $M_{i}^{'}$ is at most $\frac{ϵ}{3 (E_{ζ} λ_{i}^{⊙} + 1)} (1 + λ_{i}^{⊙})$ Because we picked $δ$ low enough to guarantee that scale factor on the front, and $M_{i}$ is made by adding a minimal point and an sa-measure where the measure component is entirely negative, so $λ_{i}^{⊙}$ is a bound on the $λ$ value for $M_{i}$ .

And finally, for $i > j$ , we can specially craft a $M_{i}^{'}$ where the gap between $M_{i}$ and $M_{i}^{'}$ is at most $λ_{i}^{⊙} + 1$ . This is because $M_{i}$ has measure below $λ_{i}^{⊙}$ due to being a minimal point that lost some measure. So, we can expend $λ_{i}^{⊙}$ effort to completely reshuffle it however we wish, and then add $(0, 1)$ to our reshuffled a-measure to make an a-measure that lies above $(0, 1)$ , which must be in $Θ_{i} (π^{'})$ , so our reshuffled a-measure plus $(0, 1)$ lies in $Θ_{i} (π^{'})$ by nirvana-free upper-completion, and we only had to spend $λ_{i}^{⊙} + 1$ effort to travel to it (first term is the reshuffling, second term is adding 1 to the $b$ term)

Now, let's analyze the distance between $M$ and the point $E_{ζ} M_{i}^{'}$ which lies in $(E_{ζ} Θ_{i}) (π^{'})$ . $d (M, E_{ζ} M_{i}^{'})$ equals...

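$d (ζ_{0} M_{0} + \sum_{0 < i \leq j} (ζ_{i} M_{i}) + \sum_{i > j} (ζ_{i} M_{i}), ζ_{0} M_{0}^{'} + \sum_{0 < i \leq j} (ζ_{i} M_{i}^{'}) + \sum_{i > j} (ζ_{i} M_{i}^{'}))$

$\leq d (ζ_{0} M_{0}, ζ_{0} M_{0}^{'}) + \sum_{0 < i \leq j} d (ζ_{i} M_{i}, ζ_{i} M_{i}^{'}) + \sum_{i > j} d (ζ_{i} M_{i}, ζ_{i} M_{i}^{'})$

$= ζ_{0} d (M_{0}, M_{0}^{'}) + \sum_{0 < i \leq j} (ζ_{i} d (M_{i}, M_{i}^{'})) + \sum_{i > j} (ζ_{i} d (M_{i}, M_{i}^{'}))$

$< ζ_{0} \frac{ϵ}{3} (1 + \frac{λ}{ζ_{0}}) + \sum_{0 < i \leq j} ζ_{i} \frac{ϵ}{3 (E_{ζ} λ_{i}^{⊙} + 1)} (1 + λ_{i}^{⊙}) + \sum_{i > j} ζ_{i} (λ_{i}^{⊙} + 1)$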

$< \frac{ϵ}{3} (ζ_{0} + λ) + \frac{ϵ}{3} \frac{1}{E_{ζ} λ_{i}^{⊙} + 1} \sum_{0 < i \leq j} (ζ_{i} (1 + λ_{i}^{⊙})) + \frac{ϵ}{3}$

$< \frac{ϵ}{3} (ζ_{0} + 1 + λ + \frac{1}{E_{ζ} λ_{i}^{⊙} + 1} \sum_{i} ζ_{i} (λ_{i}^{⊙} + 1)) < \frac{ϵ}{3} (1 + 1 + λ + \frac{E_{ζ} (λ_{i}^{⊙} + 1)}{E_{ζ} λ_{i}^{⊙} + 1})$

$= \frac{ϵ}{3} (3 + λ) \leq \frac{ϵ}{3} (3 + 3 λ) = ϵ (1 + λ)$

And bam, we've got Hausdorff-continuity!
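As a sanity check on the arithmetic, here is a minimal sketch that plugs made-up numbers into the three pieces of the bound (the $M_{0}$ term, the $0 < i \leq j$ terms, and the $i > j$ tail). The weights `zeta`, the values `lam_circ` (standing in for $λ_{i}^{⊙}$ ), the finite truncation of the mixture, and the helper names are all illustrative assumptions, not part of the proof. The only property assumed of $j$ is that the tail $\sum_{i > j} ζ_{i} (λ_{i}^{⊙} + 1)$ is below $\frac{ϵ}{3}$ .

```python
from fractions import Fraction as F

def smallest_cutoff(zeta, lam_circ, eps):
    """Smallest j whose tail sum_{i>j} zeta_i * (lam_circ_i + 1) is below eps/3."""
    for j in range(len(zeta)):
        tail = sum(z * (l + 1) for z, l in zip(zeta[j + 1:], lam_circ[j + 1:]))
        if tail < eps / 3:
            return j
    raise ValueError("no cutoff found (eps must be positive)")

def bound_terms(zeta, lam_circ, lam, eps, j):
    """Return the (head, middle, tail) pieces of the bound on d(M, E_zeta M_i')."""
    # E_zeta lambda_i^circ: the zeta-weighted mean of the lam_circ values.
    mean_lam = sum(z * l for z, l in zip(zeta, lam_circ))
    # i = 0: gap at most (eps/3) * (1 + lam/zeta_0), weighted by zeta_0.
    head = zeta[0] * (eps / 3) * (1 + lam / zeta[0])
    # 0 < i <= j: gap at most eps / (3 * (mean_lam + 1)) * (1 + lam_circ_i).
    middle = sum(
        zeta[i] * eps / (3 * (mean_lam + 1)) * (1 + lam_circ[i])
        for i in range(1, j + 1)
    )
    # i > j: gap at most lam_circ_i + 1 (reshuffle, then add (0, 1)).
    tail = sum(zeta[i] * (lam_circ[i] + 1) for i in range(j + 1, len(zeta)))
    return head, middle, tail

# Hypothetical inputs: a five-component mixture, eps = 1, lam = 2.
zeta = [F(1, 2), F(1, 4), F(1, 8), F(1, 16), F(1, 16)]
lam_circ = [F(1), F(2), F(1), F(3), F(2)]
eps, lam = F(1), F(2)

j = smallest_cutoff(zeta, lam_circ, eps)
head, middle, tail = bound_terms(zeta, lam_circ, lam, eps, j)
print(head + middle + tail < eps * (1 + lam))  # True
```

Exact fractions are used so the comparison is not blurred by floating-point rounding.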

9: Nirvana-free consistency: Invoke Lemma 20, notice that we're in the nirvana-free setting, we're done.

Pseudocausality. Fix some $M \in (E_{ζ} Θ_{i}) (π)$ , whose support is a subset of $F^{N F} (π^{'})$ . $M$ shatters into $M_{i} \in Θ_{i} (π)$ . All of them have their support being a subset of $F^{N F} (π^{'})$ (otherwise there'd be measure-mass outside of there), so all the $M_{i}$ transfer over to $Θ_{i} (π^{'})$ by pseudocausality for the $Θ_{i}$ , and then we can mix them back together to get $M \in (E_{ζ} Θ_{i}) (π^{'})$ .

And we're done, mixing works just fine, as long as we can show that the renormalization preserves everything and the renormalization doesn't fail. By Lemma 26, renormalization fails iff, for all $π$ , $E_{(E_{ζ} Θ_{i}) (π)} (1) = E_{(E_{ζ} Θ_{i}) (π)} (0)$ .

We can then go $E_{(E_{ζ} Θ_{i}) (π)} (1) = E_{E_{ζ} (Θ_{i} (π))} (1) = E_{ζ} (E_{Θ_{i} (π)} (1))$ and similar for 0, (by the definition of the mix of belief functions and Proposition 10 in Section 1) and then we can use that $E_{(E_{ζ} Θ_{i}) (π)} (1) - E_{(E_{ζ} Θ_{i}) (π)} (0) = 0$ to go

$E_{ζ} (E_{Θ_{i} (π)} (1)) - E_{ζ} (E_{Θ_{i} (π)} (0)) = 0$

$\sum_{i} ζ_{i} (E_{Θ_{i} (π)} (1) - E_{Θ_{i} (π)} (0)) = 0$

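Each summand is nonnegative (expectations are monotone, so $E_{Θ_{i} (π)} (1) \geq E_{Θ_{i} (π)} (0)$ ), so every summand with $ζ_{i} > 0$ must vanish. So, then, for all $π$ and all such $i$ , $E_{Θ_{i} (π)} (1) = E_{Θ_{i} (π)} (0)$ . However, this is incompatible with the existence of a nontrivial $Θ_{i}$ , because nontriviality just says that there's a $π$ where $E_{Θ_{i} (π)} (1) \neq E_{Θ_{i} (π)} (0)$ .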

So, nontriviality of some $Θ_{i}$ means that the renormalization of your mix can be done. It's a very weak condition, just saying "there's some possibility of starting with a hypothesis ( $Θ_{i}$ ) which has a policy $π$ where Murphy actually has to pay attention to what function you're maximizing".

Now, we just need to show that renormalization preserves all nice conditions. Just invoke isomorphism to complete to a full pseudo/acausal belief function lacking normalization, apply Lemma 24 and renormalize everything, and we're done.

Proposition 7: For pseudocausal and acausal hypotheses, $E_{(E_{ζ} Θ_{i}) (π_{p a})} (f) = E_{ζ} (E_{Θ_{i} (π_{p a})} (f))$

Proof: $E_{(E_{ζ} Θ_{i}) (π_{p a})} (f) = E_{E_{ζ} (Θ_{i} (π_{p a}))} (f) = E_{ζ} (E_{Θ_{i} (π_{p a})} (f))$ by Proposition 10 of Section 1.
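To see the identity in action, here is a minimal sketch on a toy example. It assumes a finite outcome space, represents an a-measure as a pair `(m, b)` with `m` a tuple of masses, takes $E_{B} (f)$ to be the minimum of $m (f) + b$ over the set, and uses small finite sets in place of the real (closed, convex, upper-complete) ones. The function names and the numbers are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

def expectation(B, f):
    """E_B(f): the minimum over (m, b) in B of m(f) + b."""
    return min(sum(mk * fk for mk, fk in zip(m, f)) + b for m, b in B)

def mix(zeta, sets):
    """All points sum_i zeta_i * M_i, with one M_i drawn from each sets[i]."""
    mixed = []
    for choice in product(*sets):
        n = len(choice[0][0])
        m = tuple(sum(z * M[0][k] for z, M in zip(zeta, choice)) for k in range(n))
        b = sum(z * M[1] for z, M in zip(zeta, choice))
        mixed.append((m, b))
    return mixed

# Two toy sets of a-measures over a two-point outcome space.
B0 = [((F(1, 2), F(1, 2)), F(0)), ((F(1), F(0)), F(1, 4))]
B1 = [((F(0), F(1)), F(0)), ((F(1, 3), F(1, 3)), F(1, 2))]
zeta = [F(1, 3), F(2, 3)]
f = (F(1, 4), F(3, 4))

lhs = expectation(mix(zeta, [B0, B1]), f)
rhs = sum(z * expectation(B, f) for z, B in zip(zeta, [B0, B1]))
print(lhs == rhs)  # True
```

The equality holds because $m (f) + b$ is linear in $(m, b)$ , so minimizing over the mixed set splits into minimizing over each component separately.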

Proposition 8: For pseudocausal and acausal hypotheses,

$p r_{*}^{π_{p a}^{h i}, π_{p a}^{l o}} ((E_{ζ} Θ_{i}) (π_{p a}^{h i})) = E_{ζ} (p r_{*}^{π_{p a}^{h i}, π_{p a}^{l o}} (Θ_{i} (π_{p a}^{h i})))$

This is an easy one. If $M \in p r_{*}^{π_{p a}^{h i}, π_{p a}^{l o}} ((E_{ζ} Θ_{i}) (π_{p a}^{h i}))$ , then there's a preimage point $M^{'} \in (E_{ζ} Θ_{i}) (π_{p a}^{h i})$ , which then decomposes into $M_{i}^{'} \in Θ_{i} (π_{p a}^{h i})$ . These $M_{i}^{'}$ project down to $M_{i}$ , which then mix to make $M$ (project then mix equals mix then project because of linearity), witnessing that $M \in E_{ζ} (p r_{*}^{π_{p a}^{h i}, π_{p a}^{l o}} (Θ_{i} (π_{p a}^{h i})))$ .

Now for the reverse direction. If $M \in E_{ζ} (p r_{*}^{π_{p a}^{h i}, π_{p a}^{l o}} (Θ_{i} (π_{p a}^{h i})))$ , then it shatters into $M_{i} \in p r_{*}^{π_{p a}^{h i}, π_{p a}^{l o}} (Θ_{i} (π_{p a}^{h i}))$ , which have preimage points $M_{i}^{'} \in Θ_{i} (π_{p a}^{h i})$ . The $M_{i}^{'}$ mix to make a $M^{'} \in (E_{ζ} Θ_{i}) (π_{p a}^{h i})$ , which projects down to $M$ , by project-mix equaling mix-project, witnessing that $M \in p r_{*}^{π_{p a}^{h i}, π_{p a}^{l o}} ((E_{ζ} Θ_{i}) (π_{p a}^{h i}))$ .
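Both directions lean on "project then mix equals mix then project". Here is a minimal sketch of that linearity fact on individual points. It assumes a toy setting where outcomes are pairs `(x, y)`, the projection pushes the measure forward onto `x` and leaves the $b$ term alone, and an a-measure is a `(dict, b)` pair. These representations and names are illustrative assumptions, not the definitions used in the proof.

```python
from fractions import Fraction as F

def project(M):
    """Push an a-measure on pairs (x, y) forward onto x, keeping b fixed."""
    m, b = M
    coarse = {}
    for (x, _y), mass in m.items():
        coarse[x] = coarse.get(x, 0) + mass
    return coarse, b

def mix_points(zeta, points):
    """The single a-measure sum_i zeta_i * M_i."""
    keys = set().union(*(m.keys() for m, _b in points))
    m = {k: sum(z * p[0].get(k, 0) for z, p in zip(zeta, points)) for k in keys}
    b = sum(z * p[1] for z, p in zip(zeta, points))
    return m, b

# Two hypothetical a-measures over outcomes in {"a", "b"} x {0, 1}.
M0 = ({("a", 0): F(1, 2), ("a", 1): F(1, 4), ("b", 0): F(1, 4)}, F(0))
M1 = ({("a", 0): F(1, 3), ("b", 1): F(2, 3)}, F(1, 2))
zeta = [F(1, 4), F(3, 4)]

mix_then_project = project(mix_points(zeta, [M0, M1]))
project_then_mix = mix_points(zeta, [project(M0), project(M1)])
print(mix_then_project == project_then_mix)  # True
```

Since the two orders agree point by point, any decomposition of $M^{'}$ into the $M_{i}^{'}$ upstairs gives a decomposition of its projection downstairs, and vice versa.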

Next proof post!
