We're going to need to go out of order here a bit and do Proposition 31 before doing Propositions 29 and 30. Please have patience.
Proposition 31: All five characterizations of the pullback define the same infradistribution.
So, the two requirements of being able to do a pullback are that: has as a support, and is a proper function (preimage of a compact set is compact). And our five characterizations are:
1: For sets,
ie, take the preimage of w.r.t. the linear operator that induces from .
2: For concave functionals, is the concave monotone hull of the partial function .
3: For concave functionals, is the monotone hull of the partial function , ie:
4: For concave functionals,
take the inf of all infradistributions that could have made your infradistribution of interest via pushforward.
5: For sets, take the closed convex hull of all infradistribution sets which project down to be equivalent to .
The proof path proof path will be first establishing that 3 makes an infradistribution, then showing that 3 and 2 are equivalent, then showing that 2 and 4 are equivalent, then showing that 4 and 5 are equivalent. And now that 2, 3, 4, and 5 are in an equivalence class, we wrap up by showing that 2 implies 1 and 1 implies 5.
First up: Establishing that 3 makes an infradistribution, which is where the bulk of the work is.
Starting with normalization:
And clearly, this supremum is attained by , it can be no higher, by normalization, so we have and thus one part of normalization. For the other half of normalization,
Now, in order for to be the case, all we need is that be 0 on . However, since is assumed to be a support for , all functions like this must have and we have the other part of normalization.
Monotonicity: This is immediate because if , then the supremum for has more options and can attain a higher value.
And we can pick some and which very nearly attain the relevant values, as close as you wish, to go:
So, with that, we can go:
And we're done with concavity.
For Lipschitzness, the proof will work as follows. We'll take two functions and , and show that in a way that generalizes to symmetrically show that to get our result that
where is the Lipschitz constant of .
So, let's show
What we can do is, because
then we can consider this value to be approximately obtained by some specific function . Then,
The first two inequalities are from monotonicity. The third inequality is because , where is our special function from a bit earlier, because . So, it isn't necessarily the supremum and doesn't attain maximum value. Then, there's just Lipschitzness of , and being approximately . And we're done with Lipschitzness, just swap and and use earlier arguments.
Before proceeding further, note the following information: Polish spaces are first-countable and Hausdorff so they're compactly generated, and Wikipedia says any proper map to a compactly generated space is closed (maps closed sets to closed sets). So, is a closed set.
Now just leaves compact almost-support. Fix an , and take a compact -almost-support of , . Intersect with if you haven't already, it still makes a compact set and it keeps being an -almost-support. We will show that , which is compact because we assumed our map was proper, is an -almost-support of .
This proof will proceed by showing that, if agree on , then . Symmetrical arguments work by swapping and , showing that the two expectation values are only apart.
As usual, we have
Where very nearly attains the supremum, and we know that .
Let the function (a set-valued function) be defined as:
If , then
If , then
These sets are always concave, and also always nonempty, because, in the second case, the only way it could be nonempty is if
Now, g being a proper map, the preimage of the single point is a compact set, so we can actually minimize our function, pick out our particular that maps to . Then, we know:
The second inequality is because by assumption. Shuffling this around a bit, we have:
where our is minimizing, which is incompatible with nonemptiness.
So, we've got convexity and nonemptiness. Let's shower lower-hemicontinuity.
As a warmup for that, we'll need to show that the function
So, our proof goal is: if limits to , then
And we know that all our and itself, lie in .
At this point, here's what we do. Isolate the subsequence of points we're using to attain the . This sequence of points (when unioned with itself) is compact in , so its preimage in is compact. And you can minimize functions over compact sets. So, for each , pick a point that minimizes . Our sequence of points is roaming around in a compact set, so we can pick a convergent subsequence. At this point, let's start indexing with . Said limit point must have by continuity for , and by continuity for (all the get mapped to , so the limit point must go to the limit point ) Putting it all together,
And so we have that, if limits to , then
And is lower-semicontinuous over .
At this point, we'll try to show that the set-valued function is lower-hemicontinuous over . We'll use z to denote numbers in .
The criterion for lower-hemicontinuity of is: If limits to , and , then there's a subsequence and points where
To show this, we'll divide into three cases. In our first case, only finitely many of the are in . Let's pass to a subsequence where we remove all the that lie in , and where converges to a value. Let
This was because all these sequences have their corresponding values converge, whether by continuity or assumption. So, this takes care of lower-hemicontinuity in one case.
Now for our second case, where only finitely may of the aren't in . Pass to a subsequence where they are all removed, and where converges to a value. Accordingly, all our lie in . Let . Then
Our third case is the trickiest, where there are infinitely many in , and infinitely many outside that set, and accordingly, Let's pass to a subsequence where we remove all the that lie in and where converges to a value. We would be able to apply the proof from the first case if we knew that
Obviously, we have , but we still need to show that
The second one can be addressed trivially, so now we just have to show that
Pick some minimizing , where , and remember that this lies in , so , the compact set where . Now we can just go:
And we're done. That's the last piece we need to know that is lower-hemicontinuous.
Now we will invoke the Michael selection theorem, which gets us a continuous selection (bounded because and are) where, for all , .
The conditions to invoke the Michael selection theorem are that is lower-hemicontinuous (check), that is paracompact, and that is a Banach space (check). Fortunately, any metrizable space is paracompact, and Polish spaces are metrizable, so we can invoke the theorem and get our continuous .
Note two important things about this continuous we just constructed. First, it perfectly matches over . Second, for all in (remember, that's the space it's defined over)
Because of that worst-case bound thing being one of the upper bounds on . Also, over , it remains within of , again, because of the imposed bounds.
We can use the Tietze extension theorem to extend to all of while staying within of .
At this point, we can return to where we started out in this section, and go:
The first equality was the definition of the preimage, the approximate equality was because is very very near attaining the true supremum, the is because and agree on which is an -almost-support for , the next inequality after that is because we just showered that , and then packing up the definition of preimage.
Perfectly symmetric arguments work when you swap and , yielding our net result that
And were selected to be arbitrary continuous functions that agree on , which is compact, so we have a compact -almost-support, so has compact almost-support and is an infradistribution.
Ok, so we're done, we've defined a pullback, it's the monotone hull of the function under precomposition with , and this was definition 3. Now it's time to start showing equivalences.
First, the equality of definitions 2 and 3. The concave monotone hull of the partially defined pullback must be than the monotone hull of the partially defined pullback, because we're imposing an additional condition, there are less functions that fit the bill. But, we've showed that the monotone hull is concave, so it's actually an equality.
Now for the equality of of 2 and 4, that the concave monotone hull of the pullback is the same as the inf of all infradistributions that produce when pushfoward is applied.
We know that the concave monotone hull is indeed an infradistribution, and
So, since its projection forward matches perfectly, we have
For actual equality, we note that, if , then
And since is the concave monotone hull of the partial function , and the concave monotone hull is the least concave monotone function compatible with said partial function, and any which pushes forward to map to must be concave and monotone and match this partial function, all the we're inf-ing together lie above , so we have actual equality and
So, at this point, definitions 2, 3, and 4 of the concave hull are equal.
Now for equality of definitions 4 and 5. Taking the inf of infradistributions on the concave functional level is the same as unioning together all the infradistribution sets, and taking the closed concave hull (because that doesn't affect the expectation values), so "inf of infradistributions that pushfoward to make ", on the concave functional side of LF-duality, is the same as "closed concave hull and upper completion of infradistributions which pushforward to make " on the set side of LF-duality.
Now that we know definitions 2-5 of the pullback are all equal, and using for the expectation functionals corresponding to definitions 3, 1, and 5 of the pullback, we'll show , so , ie, the first definition of the pullback, must be equal to the rest.
First, is the functional corresponding to the set:
Where in this case is the function of type signature given by keeping the term the same and pushing forward the signed measure component via the function . Note that said pushfoward can actually only make points in , the measures can only be supported over that set.
We'll show that
is monotone and that
To begin with, for monotonicity, if , then
This is easy because our set is entirely of a-measures. So the corresponding is monotone.
Now, at this point, we can go:
The next line is because (the continuous pushfoward of a measure, evaluated via ) is the same as (old measure evaluating )
The next line is a bit tricky, because pushforward seems like it may only net some of the points in H. However, any measure supported over has a measure over that, when pushfoward is applied, produces it. Just take each point , craft a Markov Kernel from to that maps all to a probability distribution supported over , take the semidirect product of your measure over and your kernel, and project to to get such a measure. So, said pushfoward surjects onto , and we can get:
So, now that we've shown that
is monotone and that
We know that the corresponding functional has the property that it's monotone and . Thus, it must lie above the monontone hull of , which is just (third definition of the pullback) and we have: .
Now for the other direction, showing that . The fifth definition of the pullback was the closed convex hull (and upper completion) of all the infradistributions which project to . So, if we can show that any infradistribution set which projects into must have its minimal points lie in , we'll have that all infradistribution which project to are a subset of , and thus is higher in the ordering on functionals, .
This is trivial by just being a preimage, so any set which pushforwards into the set of interest must be a subset of the preimage.
And we have our desired result that any infradistribution set which projects into must be a subset of (all minimal points are present in it), and higher in the information ordering, so the inf of all of them, ie, the functional , must lie above the corresponding functional .
So, we have and so equals all of them, and the restricted preimage of H is another formulation of the pullback, and we're done.
Proposition 29: The pullback of an infradistribution is an infradistribution, and it preserves all properties indicated in the table for this section.
So, we know from the proof of Proposition 31 that the pullback of an infradistribution is an infradistribution. Now, to show preservation of properties, we'll work with the formulation of the pullback as:
For homogenity, 1-Lipschitzness, cohomogenity, C-additivity, and crispness, they are all characterized by all minimal points having certain properties of their and values. All minimal points of must pushforward to minimal points of (otherwise the term of your original minimal point could be strictly decreased and it'd still pushforward into ), which have the property of interest, and because pushforward preserves the and values of a-measures, we know that all minimal points of have the same property on their and , so all these properties are preserved under pullback.
Sharpness takes a slightly more detailed argument. Let's say that is sharp, ie, its minimal points consist entirely of the probability distributions on . We know from crispness preservation that has its minimal points consist entirely of probability distributions. If there was a probability distribution that wasn't supported on (preimage of , which, by the assumed properness of , is compact), that was present as a minimal point, then it would pushfoward to make a minimal point in that was a probability distribution that had some measure outside of , which is impossible. Further, any probability distribution that is supported on would pushfoward to make a probability distribution supported on , which would be a minimal point of , so said point must be minimal in . Thus, the minimal points of consist entirely of probability distributions supported on , so is sharp, and all nice properties are therefore preserved.
Proposition 30: Pullback then pushforward is identity, , and pushforward then pullback is below identity,
The first part is easy to show,
Yielding pullback then pushforward being identity. For the reverse direction, we use Proposition 31, in particular the part about the pullback being the least infradistribution that pushes forward to produce the target. And we have , therefore
Proposition 32: The free product of two infradistributions exists and is an infradistribution iff:
1: There are points and where
2: There are points and where , and , and .
So, to show this, what we'll be doing is using our set-form characterization of the pullback as a preimage and the supremum as an intersection. We already know that, from proposition 14, a necessary condition for the intersection of infradistributions being an infradistribution is normalization, ie, on the set level, the presence of a point with and a point with . The presence of a point with in the intersection of the preimages would pushforward to make a point in and , respectively, with equal values, fulfilling condition 1.
Also, the presence of a point with in the intersection of the preimages would pushforward to make a minimal point in and respectively, with equal values and values, fufilling condition 2.
Thus, if the free product exists and is an infradistribution, the two properties are fulfilled. That just leaves the reverse direction of establishing that the free product exists and is an infradistribution if both properties are fulfilled.
We'll be handling the set conditions on an infradistribution. For nonemptiness and normalization, observe that if condition 1 is fulfilled, you can make the a-measure which projects down to and , so it's present in the intersection of the preimages of and under the projection mappings. This witnesses nonemptiness and one half of normalization.
For the second half of normalization, any point with would project down to make a point outside of and , so it can't exist. Further, if condition 2 is fulfilled, you can make the a-measure , with , and it projects down to and in and respectively, witnessing that it's present in the free product, and witnessing that the other half of normalization works out.
For closure and convexity and upper-completeness, the intersection of two closed convex upper-complete sets (the pullback of closed convex upper-complete sets along a continuous linear mapping that preserves the term) is closed-convex upper-complete, so that works out.
That leaves the compact-projection property, which is the only one where we may have issues. Remember, the necessary-and-sufficient conditions for compactness of a set of measures are: bounded value, and, for all , there's a compact subset of where all the measure components of the points have all but measure on said set.
Any point in the intersection of preimages with the measure component having a value above the minimum of the two and values (maximum amount of measure present in points in and respectively) would project down to not be in or , respectively, so those are ruled out, and we've got a bound on the amount of measure present.
As for an -almost-support for all the measure components in , what you do is find a compact -almost-support for and a compact -almost-support for , and consider
We claim this is a compact (product of compact sets) -almost-support for all measure components in .
This is because any measure component in must project down to be a measure component of and a measure component of . So, if there was more than measure outside
then it would project down to either or and have more than measure outside of the two compact sets, which is impossible by the projection landing in (or ) and the two compact sets being -almost-supports for and respectively.
For any we can make a compact subset of that accounts for all but of the measure present of a measure in , so it's got the compact-projection property, which is the last thing we need to certify it's an infradistribution.
Proposition 33: Having (homogenity or 1-Lipschitzness) present on both of the component infradistributions is a sufficient condition to guarantee the existence of the free product.
From Proposition 32, all we need to do is to verify properties 1 and 2, ie
1: There are points and where
2: There are points and where , and , and .
Given only that (without loss of generality) is 1-Lipschitz or homogenous, and same for , we'll show properties 1 and 2.
The trick to this is showing that 1-Lipschitzness and homogenity both imply that an infradistribution contains some point with and , and if both infradistributions fulfill that, then conditions 1 and 2 are fulfilled for free, as can be verified with inspection.
So, let's show that 1-Lipschitzness and homogenity imply there's a point with and . For 1-Lipschitzness, , and since there must be a point with by normalization, and all points have by normalization, said point must have . As for homogenity, all minimal points have , and normalization says there's a point with , getting that there's a point with . So, we're done.
Proposition 34: and , with equality when and are C-additive.
And same for . For equality in the C-additive case, we need to show that every point in has some point in which surjects onto it. Remember, C-additivity is "all minimal points have ", which, by our notion of upper-completion, is equivalent to all measure components in the set just being ordinary probability distributions.
The existence of minimal points with means that, given any point , you can find a point with the same value in , call it , and then because it projects down into both sets, showing that surjects onto when projected down and so must have equal expectation value. Same for .
Proposition 35: Free product is commutative and associative.
This is because supremum is commutative (order doesn't matter) and associative (parentheses don't matter), because on the set level it's just intersection, which is order-and-grouping agnostic.
Proposition 36: The free product of countably many infradistributions exists and is an infradistribution iff:
1: There are points where,
2: There are points where
We recap the proof of Proposition 32. The presence of a point with or in a hypothetical infinite free product must project down into all the , and cause the fulfillment of conditions 1 and 2.
For the set conditions on an infradistribution, for nonemptiness and normalization, observe that if condition 1 is fulfilled, you can make the a-measure which is present in the intersection of all the a-measure pullbacks. This witnesses nonemptiness and one half of normalization.
For the second half of normalization, any point with would project down to make a point outside of all the so it can't exist. Further, if condition 2 is fulfilled, you can make the a-measure , with , and it is present in the intersection of all the a-measure pullbacks, witnessing that the other half of normalization works out.
For closure and convexity and upper-completeness, the intersection of closed convex upper-complete sets (the pullback of closed convex upper-complete sets along a continuous linear mapping that preserves the term) is closed-convex upper-complete, so that works out.
That leaves the compact-projection property, which is the only one where we may have issues. For Lipschitzness, we remember that any point in the intersection of preimages with the measure component having a value above has some that it can't project down into, so we get a contradiction, and we've got a bound on the amount of measure present.
As for an -almost-support for all the measure components, you biject your with the natural numbers, and find a compact set which is a -almost-support for and take the product of them. It's compact by Tychonoff, and because each speck of measure of the measure component of a point in the infinite free product that doesn't land in this set projects down to be outside of one of the sets, the total amount of measure outside this set is upper-bounded by:
So it's a compact -almost-support for the measure components of all a-measures in the infinite free product, and this can work for any , so the infinite free product fulfills the compact almost-support property and has compact projection, which is the last condition we needed.
Proposition 37: , with equality when all the are C-additive.
For equality in the C-additive case, we need to show that every point in has some point in which surjects onto it. Remember, C-additivity is "all minimal points have ", which, by our notion of upper-completion, is equivalent to all measure components in the set just being ordinary probability distributions.
The existence of minimal points with means that, given any point , you can find a point with the same value in (regardless of ), call it , and then because it projects down into all sets, showing that surjects onto when projected down and so must have equal expectation value.
Proposition 38: Ultradistributions are isomorphic to convex, monotone, normalized, Lipschitz, CAS, functions .
We'll be going over the proof of Theorem 1 again, but with some stuff swapped around.
Our first order of business is establishing the isomorphism. Our first direction is to and back is exactly. By lower completion, and reproved analogues of Proposition 2 and Theorem 1 from "Basic inframeasure theory", which an interested party can reprove if they want to see it, we can characterize as
And then, our can further be reexpressed as
and, by the definition of the convex conjugate, and the space of finite signed measures being the dual space of , and being a functional applied to an element, this is...
So, our original set is identical to the convex-conjugate set, when we go from to back to a set of a-measures.
Proof Phase 2: In the reverse direction for isomorphism, assume that fulfills the conditions (we'll really only need continuity and concavity)We want to show that
Given an , we have a natural candidate for maximizing the , just set it equal to .
So then we get
And this is just... , and, because is continuous over , and convex, . From that, we get
and we're done with isomorphism.
So, in our first direction, we're going to derive the conditions on the functional from the condition on the set, so we can assume nonemptiness, closure, convexity, upper completion, projected-compactness, and normalization, and derive monotonicity, concavity, normalization, Lipschitzness, and compact almost-support (CAS) from that.
For monotonicity, remember that all points in the infradistribution set are a-measures, so if , then
We could do that because all the measure components are actual measures.
And we're done with that. For normalization,
So we have normalization.
For Lipschitzness, we first observe that compact-projection (the minimal points, when projected up to their measure components, make a set with compact closure) enforces that there's an upper bound on the value of a minimal point , because otherwise you could pick a sequence with unbounded , and it'd have no convergent subsequence of measures, which contradicts precompactness of the minimal points projected up to their measure components.
Then, we observe that points in correspond perfectly to hyperplanes that lie below the graph of , and a maximal point is "you shift your hyperplane up as much as you can until you can't shift it up any more without starting to cut into the function ". Further, for every function , you can make a hyperplane tangent to the function at that point by the Hahn-Banach theorem, which must correspond to a maximal point.
Putting it together, the epigraph of is exactly the region above all its tangent hyperplanes. And we know all the tangent hyperplanes correspond to maximal points, and their Lipschitz constants correspond to the value of the minimal points. Which are bounded. So, Compact-Projection in H implies h is Lipschitz.
Finally, we'll want compact almost-support. A set of measures is compact iff the amount of measure is upper-bounded, and, for all , there is a compact set where all the measures have measure outside of .
So, given that the set of measures corresponding to is compact by the compact-projection property, we want to show that the functional has compact almost-support. To do this, we'll observe that if is the sup of a bunch of functions, and all functions think two different points are only a little ways apart in value, then h must think they're only a little distance apart in value. Keeping that in mind, we have:
And then, we can think of a minimal point as corresponding to a hyperplane and , and is the sup of all of them, so to bound the distance between these two values, we just need to assess the maximum size of the gap between those values over all minimal points/tangent hyperplanes. Thus, we can get:
And then, we can do some canceling and get:
And then, because was selected to be 0 on , which makes up all but of the measure for all measures present in , we can upper-bound by , so we have that
And so, is a compact -almost-support for h, and this argument works for all , so is CAS, and that's the last condition we need. Thus, if is an infradistribution (set form), the expectation functional is an infradistribution (expectation form).
Now for the other direction, where we assume monotonicity, concavity, normalization, Lipschitzness, and CAS on an infradistribution (expectation form) and show that the induced form fulfills nonemptiness, convexity, closure, upper completion, projection-compactness, normalization, and being a set of a-measures.
Remember, our specification of the corresponding set was:
Where is the convex conjugate of .
First, being a nonempty set of a-measures with 0 or negative value. Because there's an isomorphism linking points of the set and hyperplanes below the graph of , we just need to establish that no hyperplanes below the graph of can slope down in the direction of a nonnegative function (as this certifies that the measure component must be an actual measure), and no hyperplanes below the graph of can assign 0 a value above 0 (as this corresponds to the term, and can be immediately shown by normalization).
What we do is go "assume there's a where the linear functional corresponding to isn't a measure, ie, there's some negative function where ". Well, because of monotonicity for (one of the assumed properties), we have .... And, because all affine functionals are made by taking a linear functional and displacing it, ..., increases at a linear rate, so eventually the hyperplane and cross over, but was assumed to be below always, so we have a contradiction.
Therefore, all hyperplanes below must have their linear functional component corresponding to an actual measure. And we get nonemptiness from the concavity of , so we can pick any function and use the Hahn-Banach theorem to make a tangent hyperplane to that touches at that point, certifying nonemptiness.
By the way, the convex conjugate,