3

19 min read

3

Theorem 4: Pseudocausal IPOMDP's: If an infra-POMDP has a transition kernel  which fulfills the niceness conditions, and there's a starting infradistribution , unrolling it into an infrakernel via  has the infrakernel  being pseudocausal and fulfilling all belief function conditions except for normalization.

This takes a whole lot of work to show. The proof breaks down into two phases. The first is showing that the  fulfill the niceness conditions (which splits into four subphases of showing 1-Lipschitzness, lower-semicontinuity, compact-shared CAS, and moving constants up/mapping 1 to 1). This is the easy part to do. The second phase is much longer. It splits into five pieces, where we show our list of conditions for .  Our list of conditions to show is: 

1:The  have a uniform upper bound on their Lipschitz constants.

2:  is lower-semicontinuous for all 

3:  is supported entirely on histories compatible with .

4: All the  agree on the value of 1 (or infinity).

5:  is pseudocausal.

Parts 1 and 4 can be done with little trouble, but parts 2, 3, and 5 split into additional pieces. To show pseudocausality, ie, condition 5, we have to start with our proof goal and keep rewriting it into an easier form, until we get something we can knock out with an induction proof, which splits into a base case and an induction step. To show sensible supports, ie, part 3, we have to split it into three parts, there's one induction proof for the finite , showing sensible supports for , and then finally extending this to the result we actually want, which is fairly easy.

And then there's lower-semicontinuity, part 2, which is an absolute nightmare. This splits into three parts. One is explicitly constructing a sequence of compact sets to use in the Almost-Monotone Lemma for any policy whatsoever, and then there's trying to show that  is lower-semimicontinuous which requires a massive amount of messing around with integrals and limits, and then it's pretty easy to show lower-semicontinuity for  from there.

T4.1 Our first order of business is showing the five niceness conditions for all the  infrakernels, so we can take the infinite semidirect product and show that  inherits the five niceness conditions (from the proof of Theorem 1).

Remember that  is defined as

Obviously it produces inframeasures, since  does. 

T4.1.1 For lower-semicontinuity in the input, we have

and then, as  limits to , we have  forever after some finite , because  is a finite space. So, it turns into

and then, by lower-semicontinuity for K (one of the niceness conditions)

and then we pack this back up as

and lower-semicontinuity has been shown.

T4.1.2 For 1-Lipschitzness, observe that  is 1-Lipschitz because  is 1-Lipschitz, and this is the semidirect product of a 1-Lipschitz (all probability distributions are 1-Lipschitz, and this is a dirac-delta distribution) infradistribution with a 1-Lipschitz infrakernel.

T4.1.3 For compact-shared compact-almost-support, observe that  is a continuous function, so any compact subset of  (input to ) maps to a compact subset of  (corresponding inputs for ). Then, we can apply compact-shared compact-almost-support for  (one of the niceness conditions) to get a compact almost-support on , and take the product of that subset with all of  itself, to get a shared-compact almost-support for .

T4.1.4 For mapping constants to above constants, we have

By increasing constants for  (one of the niceness conditions). This same proof works with equality when , because of the "map 1 to 1" condition on the  type signature. So all the  have the niceness properties, and we can form the infinite semidirect product at all, and the infinite semidirect product inherits these niceness conditions.

T4.2 Now we can get working on showing the belief function conditions.

T4.2.1 For showing a bounded Lipschitz constant, we have

and then since  is an infradistribution, it must have a finite Lipschitz constant, . So we then get

and then, since all the  are 1-Lipschitz (infinite semidirect product has niceness conditions), we get

and we're done, we showed a Lipschitz bound on the  that is uniform in .

T4.2.2 Now for lower-semicontinuity for , which is the extremely hard one.

The way this proof will work is that we'll first show a variant of the Almost-Monotone Lemma that works for all the  infrakernels simultaneously. Then, using that, we'll show that

 is lower-semicontinuous. And then, it's pretty easy to wrap up and get that

 is lower-semicontinuous, from that.

T4.2.2.1 Anyways, our first goal is a variant of the Almost-Monotone Lemma. The particular variant we're searching for, specialized to our particular case, is:



We let our  be an arbitrary compact subset of states, and  be arbitrary.

Now, let's assume there's a sequence of compact sets C where, regardless of n or  is a shared -almost-support for all the  inframeasures (where ).

Then our proof goal becomes



The proof of the Almost-Monotone Lemma goes through perfectly fine, the key part where our choice of compact sets matters is in the ability to make the assumption that  is an -almost-support for all the  with , as our  is. And we assumed our sequence of compact sets fulfilled that property.

So, where we're at is that our variant of the Almost-Monotone-Lemma works, as long as our assumption works out, that there's a sequence of compact sets  where, regardless of n or  or  is a -almost-support for .

In order to get our variant of the AML to go through, we really need to prove that there is such a sequence, for any  and . For this, we'll be looking at a function

What this does is that  is defined to return a compact subset of  which is an -almost-support for all the  inframeasures, when . There are always compact subsets like this for any input because of shared almost-compact-support for the infrakernel  (one of the assumed niceness conditions), so there does indeed exist a function  with these properties.

Then, we recursively define the  as:

and

All these sets are compact, by induction. For the first one, since  always returns compact sets, we're taking a finite union of compact sets (is compact), and taking the product with a finite set (is compact). For induction up, since the product of compact sets is compact,  is compact, and then projections of compact sets are compact, so then  returns a compact set, and again, we take a finite union of compact sets and then the product with a compact set to make another compact set.

Now, let's get started on showing that C^{\top}_{1:n+1}[C^{S},\eps] is an \eps\left(1-\frac{1}{2^{n+1}}\right)-almost-support for K^{\pi}_{:n}(s_0), for arbitrary \pi and s_0\in C^{S}. This proof will proceed by induction.

For the base-case, we want that  is a -almost-support for  with . Well, assume  are equal on , and .

Now,  is a -almost-support for  because , so we can apply Lemma 2 from LBIT, and 1-Lipschitzness of  to split this up as

That second term can be upper bounded by , so we get an upper bound of

and then we observe that since  on , and that set is defined as  , we have that they agree on 

So that term turns into 0, the two functions are equal on our set of interest, so our net upper bound is

Showing that  is a -almost support for all the , where  is arbitrary and s_0\in C^{S}.

Now for induction. We want to show that  is a -almost-support for  with , assuming that  is a -almost-support for  with .

To begin, assume  are equal on  and . Then we can go:


and then unpack to get


and unpack the semidirect product and substitute the dirac-delta value in to get


We apply a Lemma 2 from LBIT decomposition, with  as an -almost support for , which works by induction assumption. Since  is also 1-Lipschitz (the niceness conditions we already know), the Lemma 2 upper bound breaks down as




By 1-Lipschitzness of , we get an upper-bound of



And then this is upper-boundable by , so we get


Now, for that top term, we know that  comes from . We apply another Lemma 2 decomposition on  this time, with the compact set of interest being 

which is a -almost-support for . Pairing this with 1-Lipschitzness of , we have an upper bound of




That second term is upper-bounded by  no matter what, so we have



Pull out the constant and combine, to get



Now,  were assumed to agree on , which factors as

and then expands as

which all our  we're feeding in lie within. So those functions are identical for those inputs, and our upper-bound just reduces to

Which witnesses that  is a -almost-support for any  (arbitrary ) as long as . So our induction step goes off without a hitch, and we now know that the sequence is suitable to get our modification of the Almost-Monotone-Lemma to work out. We have the result 



In particular, since any defining sequence for  has an actual limit, this can be trivially reshuffled to attain the result:



And remember, the sequence of compact sets only depends on  and , not on the policy. This works for any policy. Now that we have this result in the bag, we can move on to phase 2.

T4.2.2.2 For phase 2, we'll show that  is lower-semicontinuous, and dear lord this is going to be hard and take a lot of integrals and limits to show. We start with

Which unpacks as

If this liminf is infinity, we're just done, infinity beats everything. So we'll be assuming that it settles down to a finite value. We can unpack the  as a challenge of picking points in , so we get

And, since we can pick amongst minimal points, we can rewrite this as

In the  case, the set of minimal points is precompact. In the  case, it may not be precompact, but the only way it fails to be precompact is having b values running off to infinity. Said  function can't be unbounded below, because it is above , and infinite semidirect products map constants to greater than their usual value, and  fulfills this condition. Thus, if the b value of a selected minimal point is far higher than the finite liminf value plus , the value said minimal point assigns to the function of interest is far higher than the true minimal value. So, when minimizing, we can restrict to a compact subset of  to attain the min, call it . With this, we get

And then we unpack this as an integral.

Now, we can pass to two subsequences. First, we can pass to a subsequence of the m that actually limits to the liminf value. On this subsequence, the policies still limit to . Also, for each member of this subsequence, we can find an true minimizing  pair for it since we're minimizing the lower-semicontinuous function 

over a compact set. Said function is lower-semicontinuous because

is lower-semicontinuous, it's one of the niceness properties that the infinite infrakernel inherits.

Since we're selecting a-measures from the compact set , we can find a subsequence of our subsequence where the a-measures converge. Now we'll be using i as our index, and using it to index the  pairs. The limit a-measure is . So, we have a rewrite as

Further, all numbers in this limit are finite, because we're operating in the case where the liminf is finite, and i indexes a subsequence which actually limits to the liminf. For said limit,  limits to , so we can pull that out of the limit, to get

At this point, since the  limit to , the set of the measure components  is compact in the space of measures on . In particular, this implies that there is some compact set  where all the  only have  or less measure outside said set (a necessary condition for precompactness of a set of measures)

Now, we split the measures  into the component of measure on , and the component outside of it.


Because  is a compact subset of , we can use it as a seed set for our  sequence that attains almost-monotonicity. To economize on space for the lines to fit, we'll write 

as just

We unpack the defining limit of the infinite semidirect product with our relevant sequence of compact sets to get


We'll now be splitting up the second integral as two more integrals. Fix an arbitrary .




We're really in the weeds here, let's see if we can make any dent whatsoever. What we'll be showing is that, regardless of ,


It is exceptionally important to note here that we can't just throw our variant of the Almost-Monotone Lemma at this, because we don't have a promise that .

An easy way to show this is to show it for all . So let . Then, our proof goal is that


First, observe that the functions

and

can only differ on the same input by at most . Since the finite stages are all 1-Lipschitz or less, we have



And then, by monotonicity of the infrakernel stages for functions which cut off at a finite point (it was from Theorem 1), we have


And we're done, we have shown that, regardless of , we have


Now, applying this fact to our giant list of integrals, we can take




and apply our lower-bound to get



Also, since we picked  so  only has  or less measure outside of it, regardless of i, we can impose a lower bound of:


Now, to break down the integral on the top. Since the behavior of the interior of that integral only depends on , we can apply our good old Almost-Monotone-Lemma variation, and lower-bound the limit by the  finite stage, minus  (independently of the choice of ) yielding


Now, we pull that constant piece out of the integral, and since both functions in the integral are now identical, we can stitch the measure components back together to get:


Now,  is a fragment of , which came from a minimal point in  or a limit of such, so the maximal amount of measure present would be . This lets us get a lower bound of


Doing a little bit of regrouping, we have


Recapping what we've done so far, we showed that 


For all . Regardless of , this liminf value in i is finite because it's upper-bounded by the liminf in m, which is assumed to be finite (we disposed of the infinite case already). Because the sequence  and the compact sets  were fixed before we picked , we can take a limit of the  without worrying about it affecting any of the stuff on the inside of the integral, to get


This liminf in i of the integral is finite for all n. At this point, we can notice something interesting. For fixed n, when we take the limit as i approaches infinity, eventually  will become  and never go back, because of convergence of the policies, and the fact that the infrakernel has a cutoff at time n, so the behavior of the policies before that time will stabilize on a partial policy. So, we can rewrite our lower bound as:


Again, for all n, this liminf in i quantity is finite. Now, the function

is lower-semicontinuous in  (as all the finite stages of the infinite infrakernel have that niceness property). And  is lower-semicontinuous when  is. Also, by now, the only thing i is controlling is the measure term. Therefore, that liminf of the integral (in i) is greater than what you'd get if you just passed to . So, we get


Again, for all n, the integral is finite. Now, since the functions

converge pointwise in n to the function

(any sequence of compact sets works to define the infinite semidirect product) and for all n, the integrals are finite (this is why we kept restating it over and over!), we can apply Fatou's Lemma to move the liminf into the integral, to get


And since we have pointwise limiting, this is


and then, packing it up,

And then, pack it up as

and one last step. Since  lies in , we can get

which reexpresses as

Our net result is

But because  was arbitrary and  and  are finite, we have our desired net result of

And lower-semicontinuity is shown for 

T4.2.2.3 Now for our phase 3, getting lower-semicontinuity of . Let  limit to . We have

and then this unpacks as

Reexpressing the projection, we have:

And then, we can apply lower-semicontinuity of  which we've shown in phase 2, to get

Which then packs up as

and then

and we have lower-semicontinuity for ! Finally!

T4.2.3 Time to return to easier fare. We need that  is supported entirely on histories compatible with . To do this, we proceed in three steps.

First, we do an induction proof that, for any given  and  is supported on the subset of  where the actions and observations so far are compatible with .

Second, we use our first result to show that for any two continuous bounded utility functions  which only depend on the action/observation sequence, and agree on all histories compatible with 

Finally, we use this result to show that if  agree on all histories compatible with , showing that  is supported on histories compatible with .

T4.2.3.1 Let's start with the induction proof that  has the subset of  where the history of actions and observations is compatible with  as a support. The way to do that is to pick two functions that are identical on said subset, and show they have equal expectation values. For the base case of the induction proof, fix an  and  which are identical on  (as that's the subset of  which is compatible with ), and observe:

and then by how  is defined, we have

Which, by unpacking the semidirect product and subsituting the dirac-delta value in, we have

and then since  when the initial action equals , we have

and then this just wraps back up by our sequence of rewrite steps in reverse as

Now that we have the base case out of the way, we'll attempt to show the induction step. Let  and  be identical on the subset of  which is compatible with , and we'll show they have equal expectation value. Our job is to unpack

this unpacks, by how sequences of semidirect products are defined, as


Now, since, by induction assumption,  is supported on the subset of  where the history is compatible with , we just need to show that the inner functions are identical on that subset to get that the difference is 0, so our proof goal is now