Theorem 2: Isomorphism Theorem:For (causal, pseudocausal, acausal, surcausal) Θst or Θω which fulfill finitary or infinitary analogues of all the defining conditions, ↑(Θst) and ↓(Θω) are (causal, pseudocausal, acausal, surcausal) hypotheses. Also, ↑ and →st define an isomorphism between Θ and Θst, and ↓ and →ω define an isomorphism between Θ and Θω.
Proof sketch: The reason this proof is so horrendously long is that we've got almost a dozen conditions to verify, and some of them are quite nontrivial to show and will require sub-proof-sketches of their own! Our first order of business is verifying all the conditions for a full belief function for ↑(Θst). Then, we have to do it all over again for ↓(Θω). That comprises the bulk of the proof. Then, we have to show that taking a full belief function Θ and restricting it to the infinite/finite levels fulfills the infinite/finite analogues of all the defining conditions for a belief function on policies or policy-stubs, which isn't quite as bad. Once we're done with all the legwork showing we can derive all the conditions from each other, showing the actual isomorphism is pretty immediate from the Consistency condition of a belief function.
Part 1:Let's consider ↑(Θπst). This is defined as: ↑(Θst)(πpa):=⋂πst≤πpa(prπpa,πst∗)−1(Θst(πst))
We'll show that all 9+2 defining conditions for a belief function are fulfilled for ↑(Θst). The analogue of the 9+2 conditions for a Θst is:
Let's begin showing the conditions. But first, note that since we have weak consistency, we can invoke Lemma 6 to reexpress ↑(Θst)(πpa) as ⋂n(prπpa,πnpa∗)−1(Θst(πnpa)) Where πnpa is the n'th member of the fundamental sequence of πpa.
Also note that, for all stubs, ↑(Θst(πst))=Θst(πst). We'll be casually invoking this all over the place and won't mention it further.
Proof: By Lemma 6 with weak consistency, ↑(Θst(πst))=⋂n≥m(prπst,πnst∗)−1(Θst(πnst)) Now, m can be anything we like, as long as it's finite. Set m to be larger than the maximum timestep that the stub is defined for. Then πnst=πst no matter what n is (since it's above m) and projection from a stub to itself is identity, so the preimage is exactly our original set Θst(πst).
We'll also be using another quick result. For all stubs πhist≥πlost, given stub causality,
prπhist,πlost∗(Θst(πhist))=Θst(πlost)
Proof: Fix an arbitrary point M∈Θst(πlost). By causality, we get an outcome function which includes M, giving us that there's something in Θst(πhist) that projects down onto M. Use weak consistency to get the other subset direction.
Condition 1: Nirvana-free Nonemptiness.
Invoking Stub Nirvana-free Nonemptiness, ∀πst:Θst(πst)∩NF≠∅ so we get nirvana-free nonemptiness for ↑(Θst)(πst).
Now, assume πpa is not a stub. By stub-bounded-minimals, there is some λ⊙+b⊙ bound on the set of minimal points, regardless of stub. Let (Θst(πnpa))clip be Θst(πnpa)∩{≤⊙}∩NF
This contains all the minimal nirvana-free points for πnpa. This set is nonempty because we have stub nirvana-free nonemptiness, so a nirvana-free point M exists. We have stub-closure and stub minimal-boundedness, so we can step down to a minimal nirvana-free point below M, and it obeys the λ⊙+b⊙ bound.
Further, by weak consistency and projection preserving λ and b and nirvana-freeness,
prπn+1pa,πnpa∗((Θst(πn+1pa))clip)⊆(Θst(πnpa))clip
Invoking Lemma 9, the intersection of preimages of these is nonempty. It's also nirvana-free, because if there's nirvana somewhere, it occurs after finite time, so projecting down to some sufficiently large finite stage preserves the presence of Nirvana, but then we'd have a nirvana-containing point in a nirvana-free set (Θst(πnpa))clip, which is impossible. This is also a subset of the typical intersection of preimages used to define ↑(Θst)(πpa). Pick an arbitrary point in said intersection of preimages of clipped subsets.
Bam, we found a nirvana-free point in ↑(Θst)(πpa) and we're done.
Time for conditions 2 and 3, Closure and Convexity. These are easy.
↑(Θst)(πpa)=⋂n(prπpa,πnpa∗)−1(Θst(πnpa))
The preimage of a closed set (stub-closure) is a closed set, and the intersection of closed sets is closed, so we have closure.
Also, prπpa,πnst∗ is linear, so the preimage of a convex set (stub-convexity) is convex, and we intersect a bunch of convex sets so it's convex as well.
Condition 4: Nirvana-free upper completion.
Let M∈↑(Θst)(πpa)∩NF. Let's check whether M+M∗ (assuming that's an a-measure and M∗ is nirvana-free) also lies in the set. A sufficient condition on this given how we defined things is that for all πnpa, prπpa,πnpa∗(M+M∗)∈Θst(πnpa), as that would certify that M+M∗ is in all the preimages.
prπpa,πnpa∗ is linear, so prπpa,πnpa∗(M+M∗)=prπpa,πnpa∗(M)+prπpa,πnpa∗(M∗)
The first component is in Θst(πst), obviously. And then, by stub nirvana-free-upper-completion, we have a nirvana-free a-measure plus a nirvana-free sa-measure (projection preserves nirvana-freeness), making a nirvana-free a-measure (projection preserves a-measures), so prπpa,πnpa∗(M)+prπpa,πnpa∗(M∗) is in Θst(πnpa)∩NF, and we're done.
Condition 5: Bounded-Minimals
So, there is a critical λ⊙+b⊙ value by restricted-minimals for Θst
Fix a πpa, and assume that there is a minimal point M in ↑(Θst)(πpa) with a λ+b value that exceeds the bound. Project M down into each Θst(πnpa). Projection preserves λ and b so each of these projected points Mn lie above some Mminn.
Now, invoke Lemma 7 to construct a M′n∈Ma(F(πpa)) (or the nirvana-free variant) that lies below M, and projects down to Mminn. Repeat this for all n. All these M′n points are a-measures and have the standard λ⊙+b⊙ bound so they all lie in a compact set and we can extract a convergent subsequence, that converges to M′, which still obeys the λ⊙+b⊙ bound.
M′ is below M because ({M}−Msa(F(πpa)))∩Ma(F(πpa)) (or the nirvana-free variant) is a closed set. Further, by Lemma 10, M′ is in the defining sequence of intersections for ↑(Θst)(πpa). This witnesses that M isn't minimal, because we found a point below it that actually obeys the bounds. Thus, we can conclude minimal-point-boundedness for ↑(Θst).
Condition 6: Normalization. We'll have to go out of order here, this can't be shown at our current stage. We're going to have to address Hausdorff continuity first, then consistency, and solve normalization at the very end. Let's put that off until later, and just get extreme points.
Condition 8: Extreme point condition:
The argument for this one isn't super-complicated, but the definitions are, so let's recap what condition we have and what condition we're trying to get.
Ok, so M∈(Θst(πst))xmin∩NF By the stub extreme point condition, there's a π≥πst, where, for all πhist that fulfill π>πhist≥πst, there's a M′∈Θst(πhist)∩NF, where prπhist,πst∗(M′)=M.
Lock in the π we have. We must somehow go from this to a M′∈Θ(π)∩NF that projects down to our point of interest. To begin with, let πn be the n'th member of the fundamental sequence for π. Past a certain point m, these start being greater than πst. The M′∈Θst(πn)∩NF which projects down to M that we get by the stub-extreme-point condition will be called M′lon. Pick some random-ass point in (prπ,πnst∗)−1(M′lon) and call it M′hin.
M′hin all obey the λ and b values of M, because it projects down to M. We get a limit point of them, M′, and invoking Lemma 10, it's also in ↑(Θst)(π). It also must be nirvana-free, because it's a limit of points that are nirvana-free for increasingly late times. It also projects down to M because the sequence M′hin was wandering around in the preimage of M, which is closed.
Condition 9: Hausdorff Continuity:
Ok, this one is going to be fairly complicated. Remember, our original form is:
"The function πst↦(pr∞,πst∗)−1(Θst(πst)∩NF∩{≤⊙}) is uniformly continuous"
And the form we want is:
"The function πpa↦(pr∞,πpa∗)−1(↑(Θst)(πpa)∩NF∩{≤⊙}) is uniformly continuous"
Uniform continuity means that if we want an ϵ Hausdorff-distance between two preimages, there's a δ distance between partial policies that suffices to produce that. To that end, fix our ϵ. We'll show that the δ we get from uniform continuity on stubs suffices to tell us how close two partial policies must be.
So, we have an ϵ. For uniform continuity, we need to find a δ where, regardless of which two partial policies πpa and π′pa we select, as long as they're δ or less apart, the sets (pr∞,πpa∗)−1(↑(Θst)(πpa)∩NF∩{≤⊙}) (and likewise for π′pa) are only ϵ apart. So, every point in the first preimage must have a point in the second preimage only ϵ distance away, and vice-versa. However, we can swap πpa and π′pa (our argument will be order-agnostic) to establish the other direction, so all we really need to do is to show that every point in the preimage associated with πpa is within ϵ of a point in the preimage associated with π′pa.
First, m:=logγ(δ) is the time at which two partial policies δ apart may start differing. Conversely, any two partial policies which only disagree at-or-after m are δ apart or less. Let π∗st be the policy stub defined as follows: take the inf of πpa and π′pa (the partial policy which is everything they agree on, which is going to perfectly mimic both of them up till time m), and clip things off at time m to make a stub. This is only δ apart from πpa and π′pa, because it perfectly mimics both of them up till time m, and then becomes undefined (so there's a difference at time m) Both πpa and π′pa are ≥π∗st.
Let M be some totally arbitrary point in (pr∞,πpa∗)−1(Θst(πpa)∩NF∩{≤⊙}). M is also in (pr∞,π∗st∗)−1(Θst(π∗st)∩NF∩{≤⊙}), because M projects down to some point in Θst(π∗st) that's nirvana-free.
Let π′npa, where n≥m, be the n'th stub in the fundamental sequence for π′pa. These form a chain starting at π∗st and ascending up to π′pa, and are all δ distance from π∗st.
Anyways, in Ma(∞), we can make a closed ball B≤ϵ of size ϵ around M. This restricts λ and b to a small range of values, so we can use the usual arguments to conclude that B≤ϵ is compact.
Further, because π∗st is δ or less away from π′npa, the two sets (pr∞,π∗st∗)−1(Θst(π∗st)∩NF∩{≤⊙}) and (pr∞,π′npa∗)−1(Θst(π′npa)∩NF∩{≤⊙}) are within ϵ of each other, so there's some point of the latter set that lies within our closed ϵ-ball.
Consider the set ⋂n≥m((pr∞,π′npa∗)−1(Θst(π′npa)∩NF∩{≤⊙})∩B≤ϵ)
the inner intersection is an intersection of closed and compact sets, so it's compact. Thus, this is an intersection of an infinite family of nonempty compact sets. To check the finite intersection property, just observe that since preimages of the sets ↑(Θst)(π′npa)∩NF∩{≤⊙} get smaller and smaller as n increases due to weak-consistency but always exist.
Pick some arbitrary point M′ from the intersection. it's ≤ϵ away from M since it's in the ϵ-ball. However, we still have to show that M′ is in (pr∞,π′pa∗)−1(↑(Θst)(π′pa)∩NF∩{≤⊙}) to get Hausdorff-continuity to go through.
To begin with, since M′ lies in our big intersection, we can project it down to any Θst(π′npa)∩NF∩{≤⊙}. Projecting it down to stage n makes M′n. Let M′∞ be the point in ↑(Θst)(π′pa)∩NF∩{≤⊙} defined by: ⋂n≥m(prπ′pa,πnst∗)−1(M′n)
Well, we still have to show that this set is nonempty, contains only one point, and that it's in ↑(Θst)(π′pa), and is nirvana-free, to sensibly identify it with a single point.
Nonemptiness is easy, just invoke Lemma 9. It lies in the usual intersections that define ↑(Θst)(π′pa), so we're good there. If it had nirvana, it'd manifest at some finite point, but all finite projections are nirvana-free, so it's nirvana-free. If it had more than one point in it, they differ at some finite stage, so we can project to a finite π′npa to get two different points, but they both project to M′n, so this is impossible. Thus, M′∞ is a legit point in the appropriate set. If the projection of M′ didn't equal M′∞, then we'd get two different points, which differ at some finite stage, so we could project down to separate them, but they both project to M′n for all n so this is impossible.
So, as a recap, we started with an arbitrary point M in (pr∞,πpa∗)−1(↑(Θst)(πpa))∩{≤⊙}, and got another point M′ that's only ϵ or less away and lies in (pr∞,π′pa∗)−1(↑(Θst)(π′pa))∩{≤⊙} This argument also works if we flip πpa and π′pa, so the two preimages are only ϵ or less apart in Hausdorff-distance.
So, given some ϵ, there's some δ where any two partial policies which are only δ apart have preimages only ϵ apart from each other in Hausdorff-distance. And thus, we have uniform continuity for the function mapping πpa to the set of a-measures over infinite histories which project down to ↑(Θst)(πpa)∩NF∩{≤⊙} Hausdorff-continuity is done.
Condition 7: Consistency.
Ok, we have two relevant things to check here. The first, very easy one, is that
↑(Θst)(πpa)=⋂πst≤πpa(prπpa,πst∗)−1(↑(Θst)(πst))
From earlier, we know that ↑(Θst)(πst)=Θst(πst), and from how ↑(Θst)(πpa) is defined, this is a tautology.
We'll split this into four stages. First, we'll show one subset direction holds in full generality. Second, we'll get the reverse subset direction for causal/surcausal. Third, we'll show it for policy stubs for pseudocausal/acausal, and finally we'll use that to show it for all partial policies for pseudocausal/acausal.
First, the easy direction. ↑(Θst)(πpa)⊇¯¯¯¯¯¯¯¯c.h(⋃π≥πpaprπ,πpa∗(↑(Θst)(π)))
If we pick an arbitrary M∈↑(Θst)(π), it projects down to Θst(πst) for all stubs πst below π. Since πpa≤π, it projects down to all stubs beneath πpa. Since projections commute, M projected down into Ma(F(πpa)) makes a point that lies in the preimage of all the Θst(πst) where πst≤πpa, so it projects down into ↑(Θst)(πpa).
This holds for all points in ↑(Θst)(π), so prπ,πpa∗(↑(Θst)(π))⊆↑(Θst)(πpa). This works for all π≥πpa, so it holds for the union, and then due to closure and convexity which we've already shown, we get that the closed convex hull of the projections lies in ↑(Θst)(πpa) too, establishing one subset direction in full generality.
Now, for phase 2, deriving ↑(Θst)(πpa)⊆¯¯¯¯¯¯¯¯c.h(⋃π≥πpaprπ,πpa∗(↑(Θst)(π))) in the causal/surcausal case.
First, observe that if π≥πpa, then πn≥πnpa. Fix some M∈↑(Θst)(πpa) and arbitrary π≥πpa. We'll establish the existence of a M′∈↑(Θst)(π) that projects down to M.
To begin with, M projects down to Mn in Θst(πnpa). Lock in a value for n, and consider the sequence that starts off M0,M1...Mn, and then, by causality for stubs and πn≥πnpa, you can find something in Θst(πn) that projects down onto Mn, and something in Θst(πn+1) that projects down onto that, and complete your sequence that way, making a sequence of points that all project down onto each other that climb up to π. By Lemma 10, we get a M′n∈↑(Θst)(π). You can unlock n now. All these M′n have the same λ and b value because projection preserves them, so we can isolate a convergent subsequence converging to some M′∈↑(Θst)(π).
Assume prπ,πpa∗(M′)≠M. Then we've got two different points. They differ at some finite stage, so there's some n where can project down onto Ma(F(πnpa)) to witness the difference, but from our construction process for M′, both M′ and M project down to Mn, and we get a contradiction.
So, since prπ,πpa∗(↑(Θst)(π))=↑(Θst(πpa)), this establishes the other direction, showing equality, and thus consistency, for causal/surcausal hypotheses.
For part 3, we'll solve the reverse direction for pseudocausal/acausal hypotheses in the case of stubs, getting ↑(Θst)(πst)⊆¯¯¯¯¯¯¯¯c.h(⋃π≥πpaprπ,πst∗(↑(Θst)(π)))
Since we're working in the Nirvana-free case and are working with stubs, we can wield Lemma 3
c.h((Θst(πst))min)=c.h((Θst(πst))xmin)
So, if we could just show that the union of the projections includes all the extreme minimal points, then when we take convex hull, we'd get the convex hull of the extreme minimal points, which by Lemma 3, would also nab all the minimal points as well. By Lemmas 11 and 12, our resulting convex hull of a union of projections from above would be upper-complete. It would also get all the minimal points, so it'd nabs the entire Θst(πst) within it and this would show the other set inclusion direction for pseudocausal/acausal stubs. Also, we've shown enough to invoke Lemma 20 to conclude that said convex hull is closed. Having fleshed out that argument, all we need that all extreme minimal points are captured by the union of the projections.
By our previously proved extreme minimal point condition, for every extreme minimal point M in Θst(πst), there's some π>πst and M′ in ↑(Θst)(π) that projects down to M, which shows that all extreme points are included, and we're good.
For part 4, we'll show that in the nirvana-free pseudocausal/acausal setting, we have
Fix some arbitrary M∈↑(Θst)(πpa). Our task is to express it as a limit of some sequence of points that are mixtures of stuff projected from above.
For this, we can run through the same exact proof path that was used in the part of the Lemma 21 proof about how the nirvana-free part of Θ(πpa) is a subset of closed convex hull of the projections of the nirvana-free parts of Θ(π), π≥πpa. Check back to it. Since we're working in the Nirvana-free case, we can apply it very straightforwardly. The stuff used in that proof path is the ability to project down and land in ↑(Θst)(πnpa) (we have that by how we defined ↑), Hausdorff-continuity (which we have), and stubs being the convex hull, not the closed convex hull, of projections of stuff above them (which we mentioned recently in part 3 of our consistency proof).
Thus, consistency is shown.
Condition 6: Normalization.
Ok, now that we have all this, we can tackle our one remaining condition, normalization. Then move on to the two optional conditions, causality and pseudocausality.
A key thing to remember is, in this setting, when you're doing EH(f), it's actually EH∩NF(f), because if Murphy picked a thing with nirvana in it, you'd get infinite value, which is not ok, so Murphy always picks a nirvana-free point.
Let's show the first part, that infπE↑(Θst)(π)(0)=0. This unpacks as:infπmin(λμ,b)∈↑(Θst)(π)∩NFb=0 and we have that infπstmin(λμ,b)∈Θst(πst)∩NFb=0
Projections preserves b values, so we can take some nirvana-free point with a b value of nearly 0, and project it down to Θst(π∅) (belief function of the empty policy-stub) where there's no nirvana possible because no events happen.
So, we've got b values of nearly zero in there. Do we have a point with a b value of exactly zero? Yes. it's closed, and has bounded minimals, so we can go "all positive functionals are minimized by a minimal point", to get a point with a b value of exactly zero. Then, we can invoke Lemma 21 (we showed consistency, extreme points, and all else required to invoke it) to decompose our point into the projections of nirvana-free stuff from above, all of which must have a b value of 0. So, there's a nirvana-free point in some policy with 0 b value.
Now for the other direction. Let's show that supπE↑(Θst)(π)(1)=1.
This unpacks as: supπmin(λμ,b)∈↑(Θst)(π)∩NF(λ+b)=1
We'll show this by disproving that the sup is <1, and disproving that the sup is >1.
First, assume that, regardless of π, min(λμ,b)∈↑(Θst)(π)∩NF(λ+b)≤1−ϵ Then, regardless of πst we can pick some totally arbitrary π>πst, and there's a nirvana-free point with a λ+b value of 1−ϵ or less. By consistency, we can project it down into Θ(πst), to get a nirvana-free point with a λ+b value of 1−ϵ or less. Thus, regardless of the stub we pick, there's nirvana-free points where Murphy can force a value of 1−ϵ or less, which contradicts supπstmin(λμ,b)∈Θst(πst)∩NF(λ+b)=1
What if it's above 1? Assume there's some π where min(λμ,b)∈↑(Θst)(π)∩NF(λ+b)≥1+ϵ
From uniform-continuity-Hausdorff, pick some δ to get a ϵ2 Hausdorff-distance or lower (for stuff obeying the λ⊙+b⊙ bound, which all minimal points of ↑(Θst)(π)∩NF do for all π). This δ specifies some extremely large n, consider πn. Now, consider the set of every policy π′ above πn. All of these are δ or less away from π. Also, remember that the particular sort of preimage-to-infinity that we used for Hausdorff-continuity slices away all the nirvana.
So, Murphy, acting on π, can only force a value of 1+ϵ or higher. Now, there can be no nirvana-free point in ↑(Θst)(π′) with λ+b<1+ϵ2. The reason for this is that, since π′ is δ or less away from π, there's a nirvana-free point in ↑(Θst)(π) that's ϵ2 away, and thus has λ+b<1+ϵ, which is impossible.
Ok, so all the nirvana-free points in ↑(Θst)(π′) where π′>πnst have λ+b≥1+ϵ2.
Now, since we have Lemma 21, we can go "hm, Θst(πn) equals the convex hull of the projections of ↑(Θst)(π′)∩NF. Thus, any minimal point with λ+b≤1 is a finite mix of nirvana-free stuff from above, one of which must have λ+b≤1. But we get a contradiction with the fact that there's no nirvana-free point from π′ above πn with a λ+b value that low, they're all ≥1+ϵ2"
So, since we've disproved both cases, supπE↑(Θst)(π)(1)=1. And we're done with normalization! On to causality and pseudocausality.
Condition C: Causality.
An "outcome function" of for ↑(Θst)(πpa) is a function that maps a πpa to a point in ↑(Θst)(πpa), s.t. for all πhipa,πlopa:prπhipa,πlopa∗(of(πhipa))=of(πlopa).
Causality is, if you have a M∈↑(Θst)(πpa), you can always find an outcome function of where of(πpa)=M. Sadly, all we have is causality over stubs. We'll be using the usual identification between ↑(Θst)(πst) and Θst(πst).
Anyways, fix a πpa and a point M in ↑(Θst)(πpa). Project M down to get a sequence Mn∈Θst(πnpa). By causality for stubs, we can find an ofn where, for all πst, ofn(πnpa)=Mn. Observe that there are countably many stubs, and no matter the n, all the λ and b values are the same because projection preserves those. We can view ofn as a sequence in
∏πst(Θst(πst)∩{(λ′μ,b′)|λ′=λ∧b′=b})
By stub closure, and a λ and b bound, this is a product of compact sets, and thus compact by Tychonoff (no axiom of choice needed, its just a countable product of compact metric spaces) so we can get a limiting ofst (because it's only defined over stubs).
An outcome function for stubs fixes an outcome function for all partial policies, by
of(πpa)=⋂n(prπpa,πnpa∗)−1(ofst(πnpa))
We've got several things to show now. We need to show that ofst is an outcome function, that of is well-defined, that of(πpa)=M, and that it's actually an outcome function.
For showing that ofst is an outcome function, observe that projection is continuous, and, letting n index our convergent subsequence of interest, regardless of stub πst, limn→∞ofn(πst)=ofst(πst). With this,
Now, let's show that of is well-defined. Since ofst is an outcome function, all the points project down onto each other, so we can invoke Lemma 9 to show that the preimage is nonempty. If the preimage had multiple points, we could project down to some finite stage to observe their difference, but nope, they always project to the same point. So it does pick out a single well-defined point, and it lies in ↑(Θst)(πpa) by being a subset of the defining sequence of intersection of preimages.
Does of(πpa)=M? Well, M projected down to all the Mn. If n≥m, then ofn(πmpa)=prπnpa,πmpa∗(ofn(πnpa))=prπnpa,πmpa∗(Mn)=Mm So, the limit specification ofst has ofst(πnpa)=Mn for all n. The only thing that projects down to make all the Mn is M itself, so of(πpa)=M.
Last thing to check: Is of an outcome function over partial policies? Well, if πhipa≥πlopa, then for all n, πhi,npa≥πlo,npa. Assume prπhipa,πlopa∗(of(πhipa))≠of(πlopa). Then, in that case, we can project down to some πlo,npa and they'll still be unequal. However, since projections commute, it doesn't matter whether you project down to πlopa and then to πlo,npa, or whether you project down to πhi,npa (making ofst(πhi,npa)), and then project down to πlo,npa (making ofst(πlo,npa)). Wait, hang on, this is the exact point that of(πlopa) projects down to, contradiction. Therefore it's an outcome function.
And we're done, we took an arbitrary πpa and M∈↑(Θst)(πpa), and got an outcome function of with of(πpa)=M, showing causality.
Condition P: Pseudocausality: If (m,b)∈↑(Θst)(πpa), and m's support is on Ma(FNF(π′pa)), then (m,b)∈↑(Θst)(π′pa).
But all we have is, if (m,b)∈Θst(πst) and m's support is on Ma(F(π′st)), then (m,b)∈Θst(π′st).
There's a subtlety here. Our exact formulation of pseudocausality we want is the condition supp(m)⊆F(π′pa), so if the measure is 0, then support is the empty set, which is trivially a subset of everything, then pseudocausality transfers it to all partial policies.
Ok, so let's assume that M∈↑(Θst)(πpa), and the measure part m has its support being a subset of Ma(FNF(π′pa)) but yet is not in ↑(Θst)(π′pa). Then, since this is an intersection of preimages from below, there should be some finite level π′npa that you can project M down to (it's present in Ma(FNF(π′pa)), just maybe not in ↑(Θst)(π′pa)) where the projection of M (call it M′n) lies outside Θst(π′npa) (lying outside the intersection of preimages)
This is basically "take m, chop it off at height n". However, since M∈↑(Θst)(πpa), you can project it down to Θst(πnpa). Which does the exact same thing of chopping m off at height n, getting you M′n exactly. We can invoke stub-pseudocausality (because with full measure, the history will land in F(π′pa), then with full measure, the truncated history will land in F(π′npa) as the latter is prefixes of the former, or maybe the full measure is 0 in which case pseudocausality transfer still works) to conclude that M′ actually lies inside Θst(π′npa), getting a contradiction. This establishes pseudocausality in full generality.
Ok, so we have one direction. ↑(Θst) is a hypothesis, if Θst fulfills analogues of the hypothesis conditions for the finitary stub case. Our proof of everything doesn't distinguish between causal and surcausal, and the arguments work for all types of hypotheses, whether causal, surcausal, pseudocausal, or acausal. Ok, we're 1/4 of the way through. Now we do the same thing, but for building everything from infinitary hypotheses.
PART 2: Infinitary hypotheses. We now consider ↓(Θω), defined as↓(Θω)(πpa)=¯¯¯¯¯¯¯¯c.h(⋃π≥πpaprπ,πpa∗(Θω(π))
We'll show that with 8+2 defining conditions, the 9+2 defining conditions for a hypothesis hold for ↓(Θω). The the 8+2 conditions for a Θω are:
The function π↦Θω(π)∩NF∩{≤⊙} is uniformly continuous.
C: Infinitary Causality: Regardless of π and M∈Θω(π), there's an outcome function of over full policies s.t. of(π)=M, and for all π′ and π′′, prπ′,inf(π′,π′′)∗(of(π′))=prπ′′,inf(π′,π′′)∗(of(π′′))
This one is trivial. Pick some π≥πpa. There's a nirvana-free point. Project it down. You get a nirvana-free point and you're done.
Conditions 2 and 3, Closure and convexity. We explicitly took the closed convex hull when defining everything, these are tautological.
Condition 4: Nirvana-free upper completion.
For the pseudo/acausal case, it's doable by Lemmas 10, 11, and 12. The projection of an upper-complete set (by infinitary nirvana-free upper-completion) is upper-complete, so the union of projections is upper-complete, and then the convex hull is upper-complete, and then the closure is upper-complete and we're done.
We'll have to loop back to the causal case of Nirvana-free Upper Completion later, because we need Lemma 21 to make it go through and that requires consistency and the extreme point condition to make it work.
Condition 5: Bounded Minimals.
We can break down into three phases. First is showing that all points in the projection set have something under them that respects the λ⊙+b⊙ bound. Second is showing that all points in the convex hull of the union of projection sets have something under them that respects the λ⊙+b⊙ bound. Third is showing that all points in the closure have something under them that respects the usual bound. The reason we have to phrase it this way is that we don't necessarily know that our sets of interest are closed until the end, so we can't find a minimal point, just a bounded one that is lower, but that suffices to show that a "minimal point" that violates the restricted minimal condition isn't actually minimal.
For part 1, let M∈prπ,πpa∗(Θω(π)). Then, M is the projection of some point M′∈Θω(π). By infinitary bounded-minimals, we can find a minimal point Mmin∈Θω(π) below M′ that obeys the λ⊙+b⊙ bound, so Mmin+M∗=M′. Projecting down is linear, so we get prπ,πpa∗(Mmin)+prπ,πpa∗(M∗)=M, and prπ,πpa∗(Mmin) is below M and fulfills the λ⊙+b⊙ bound.
For part 2, let M∈c.h(⋃π≥πpaprπ,πpa∗(Θω(π))) We can rewrite M as EζMi, and then, by part 1, decompose the Mi into Mloi (not actually minimal, just a point obeying the λ⊙+b⊙ bound) and M∗i. Then we can decompose M further into EζMloi+EζM∗i. The former is an a-measure (mix of a-measures) and obeys the λ⊙+b⊙ bound since all its components do, and it's in the relevant convex hull, witnessing that M has a point below it in the convex hull that obeys the bounds.
For part 3, let M be in the closure of the convex hull. There's some sequence Mn in the convex hull that limits to M. Below each Mn we can find a Mlon (again, not actually minimal) that obeys the λ⊙+b⊙ bound. Invoke Lemma 16 to get a point below M that respects the bounds, and we're done.
Condition 6: Normalization.
We literally have the exact phrasing of normalization we need already, this is a tautology.
Condition 7: Consistency.
Ok, one direction is trivial because ↓(Θω)(π)=Θω(π), so we can just use the definition of ↓. ↓(Θω)(πpa)=¯¯¯¯¯¯¯¯c.h(⋃π≥πpaprπ,πpa∗(↓(Θω)(π)))
The other direction, that everything equals the intersection of preimages of stuff below it, is trickier. One subset direction isn't too bad, the one that
↓(Θω)(πpa)⊆⋂πst≤πpa(prπpa,πst∗)−1(↓(Θω)(πst))
If we take a M∈↓(Θω)(πpa) that wasn't added in the final closure step, it's expressible as EζMi, and all the Mi come from points M∞i in Θω(πi) where πi≥πpa. Projecting the M∞i down to a πst≤πpa instead makes Mloi, which mix together in the same way to make Mlo∈↓(Θω)(πst). Because projections are linear and commute, Mlo is the projection of M. So, any point in ↓(Θω)(πpa) (without the closure step) projects down to lie in ↓(Θω)(πst) for any πst≤πpa.
Then, for the closure step, we just fix a sequence Mn limiting to M. The Mn can project down to whichever ↓(Θω)(πst) you wish, and by continuity of projection, the M comes along for the ride as a limit point. However, ↓(Θω)(πst) is closed, so M projects down to land in that set as well. Bam, any old M∈↓(Θω)(πpa) projects down to land in any ↓(Θω)(πst) set you wish with πst≤πpa, certifying that ↓(Θω)(πpa) lies in the intersection of preimages of stubs below.
Now, we just have to establish ↓(Θω)(πpa)⊇⋂πst≤πpa(prπpa,πst∗)−1(↓(Θω)(πst)) which splits into two cases. The causal/surcausal case, and the pseudocausal/acausal case where you don't have to worry about nirvana.
For the nirvana-free case... We can use the same proof strategy as the last part of Lemma 21, where we were showing the result for partial policies. It may be a bit nonobvious why it works. We do need to swap things around a bit, and will mention important changes without fleshing out the fiddly details, which are already given in the last part of Lemma 21.
Start with a M in the intersection of preimages of stubs below. To show it's in ↓(Θω)(πpa), we need a sequence limiting to it, where each member of the sequence is a mix of finitely many points projected down from policies above πpa. The end part of Lemma 21 gives how to construct such a sequence. The fact that we're working in a nirvana-free setting means you can ignore all fiddly details about points being nirvana-free and preimages of only the nirvana-free parts, because everything fulfills that. The key steps in that proof path are:
1: being able to project down M to make a sequence Mn∈↓(Θω)(πnpa). We trivially have this by M being defined as "in the intersection of preimages of stubs below it".
2: Having uniform Hausdorff-continuity for the policies. This is our condition 8 we're assuming, so we're good there.
3: The ability to shatter our Mn into finitely many Mi,n which are the projections of various M∞i,n points from above. This is the key difference. The proof of Lemma 21 had to set up that fact beforehand. However, in our case, we have the Nirvana-free consistency condition, which says
And that right-hand term... is just the definition of ↓(Θω)(πst)! So, swapping that out, and specializing our arbitrary stub to πnpa, we have:
c.h(⋃π>πnpaprπ,πnpa∗(Θω(π)))=↓(Θω)(πnpa)
So, since our Mn lie in ↓(Θω)(πnpa), they can be written as a finite mix of nirvana-free things from above projected down, and the Lemma 21 argument goes through.
Now, for the nirvana cases where we can assume infinitary causality. We'll do this by showing a little sublemma, that, if π≥πpa, then prπ,πpa∗(Θω(π))=↓(Θω)(πpa)
First, we'll show that if π,π′≥πpa, then prπ,πpa∗(Θω(π))=prπ′,πpa∗(Θω(π′))
Fix an arbitrary M in the projection of Θω(π), we can get a preimage point Mhi∈Θω(π). Then, by infinitary causality, we can make a point M′hi∈Θω(π′) that projects down to M. Just make an outcome function of where of(π)=Mhi, feed in π′, that gets you your point M′hi, the two agree when you project them down to inf(π,π′), and πpa is further down than that and projections commute so they both hit the same point M if you project down. Flipping π and π′ shows our equality.
Alright, so now ↓(Θω)(πpa) can be written as ¯¯¯¯¯¯¯¯c.h(prπ,πpa∗(Θω(π))) where π is arbitrary above πpa.
Projection is linear, so the projection of a convex set is convex. To get the closure points, just take a sequence Mn in the projection limiting to some M. Take preimage points M∞n∈Θω(π). There's a bound on the λ and b values of this sequence because projections preserve λ and b and our sequence Mn converges, so we can apply the Compactness Lemma and get a convergent subsequence limiting to a point M∞, which must be in Θω(π) because closure. Projection is continuous, so M∞ projects down to M. And we have prπ,πpa∗Θω(π)=↓(Θω)(πpa) proved! Wow, that was a sublemma of case 2 of part 3 of the proof of condition 7 in part 2 of the proof of the Isomorphism theorem, we're really in the weeds at this point.
Moving on, how can we use this to show ↓(Θω)(πpa)⊇⋂πst≤πpa(prπpa,πst∗)−1(↓(Θω)(πst)) for the causal case, which is the last bit we need to show consistency?
Well, fix a M in the intersection of preimages, and an arbitrary π≥πpa. M projects down to make some Mn∈↓(Θω)(πnpa). Since π≥πpa≥πnpa, and we have our sublemma, there's a point M∞n in Θω(π) that projects down to some M′n∈↓(Θω)(πpa), and further down to Mn.
This sequence M∞n all has the same λ and b value since projection preserves those, so by the Compactness Lemma and closure, there's a convergent subsequence and limit point M∞∈Θω(π). Does M∞ project down onto M? (witnessing that M∈↓(Θω)(πpa))?
Well, let's say it didn't and projecting down gets you a distinct point. Then there's some n where projecting down further to πnpa would keep the points distinct, since they have to differ at some finite time. But... after time n, our sequence M∞n is roaming around entirely in the preimage of Mn, so the limit point is in there too, and it projects down to Mn and we have a contradiction. Therefore, M∞∈Θω(π) projects down onto M, witnessing that M∈↓(Θω)(πpa), and M was arbitrary in the intersection of preimages.
So, we have ↓(Θω)(πpa)⊇⋂πst≤πpa(prπpa,πst∗)−1(↓(Θω)(πst)) for the nirvana-containing causal/surcausal case, which is the last piece we needed to show consistency.
Condition 8: Extreme point condition.
The thing we want is that an extreme nirvana-free minimal point Mex in ↓(BFω)(πst)∩NF is the projection of a nirvana-free point from a policy above it. By the nirvana-free consistency property, Mex lies in c.h(⋃π≥πstprπ,πst∗(Θω(π)∩NF))
Mex is extreme, so it lies in ⋃π≥πstprπ,πst∗(Θω(π)∩NF)
So, there's some point in some Θω(π)∩NF that projects down to Mex and we're done.
Condition 9: Hausdorff Continuity:
Ok, this one is going to be complicated. We'll work with the Lemma 15 version of Hausdorff-continuity, where a δ difference between two policies means that if you start off in one preimage, you've gotta travel ϵ(1+λ) distance or less to get to the preimage associated with the other policy, and vice-versa.
We split into two parts. Part 1 is showing that if πpa≥πst, and the distance between πpa and πst is δ or less, the distance between their respective preimages is low. Part 2 is showing that if πpa and π′pa are δ apart, then we can exploit part 1 to get that the distance between the preimages is low, and will be pretty easy after we get part 1.
Our Hausdorff-continuity condition links δ and ϵ. So, when we fix an ϵ and are like "how close do the policies have to be to guarantee the preimages are ϵ apart", pick the δ that gets you ϵ3 distance w.r.t our original Hausdorff-continuity condition, and also have δ<ϵ3.
Our first part uses Mlo and M′lo for points in ↓(Θω)(πst)∩NF, M′mid for a point in ↓(Θω)(πpa)∩NF, Mhi and M′hi for points in Ma(∞) (that are expressible as a finite mix of points from Θω(π) for varying π), and M,M′ for two more general points in Ma(∞).
So, one half of showing the two preimages are close to each other is trivial. Everything in ↓(Θω)(πpa)∩NF projects down into ↓(Θω)(πst)∩NF by consistency, and projection preserving nirvana-freeness, so the preimage associated with πst is a superset of the preimage associated with πpa, so there's distance 0 from a point in the πpa preimage to a point in the πst preimage.
The other half is trickier. Pick an arbitrary point M in (pr∞,πst∗)−1(↓(Θω)(πst)∩NF) and λ is the λ value of this point. M projects down to some point Mlo in ↓(Θω)(πst)∩NF. From nirvana-free consistency, M∈c.h(⋃π≥πstprπ,πst∗(Θω(π)∩NF)) ,Mlo can then be produced by (keeping in mind that it doesn't matter whether we mix before or after projecting) finitely many points in varying Θω(π)∩NF sets that are mixed to make a point Mhi, and then projected down.
Important note: Mhi is not necessarily equal to, or even close to, M.
Because πst is within δ of πpa, every policy above πst has a corresponding policy within δ that lies above πpa. Thus, we can perturb the component points (indexed by i) that mix to make Mhi by ϵ3(1+λi) (infinitary Hausdorff-uniform-continuity Lemma 15 variant, δ was assumed to be small enough for that to be the case), and mix them, to get a M′hi in c.h(⋃π≥πpa(Θω(π)∩NF))
M′hi projects down to ↓(Θω)(πpa)∩NF to make a M′mid, and further projects down to ↓(Θω)(πst)∩NF to make a M′lo. Because M′hi is within ϵ3(1+λ) of Mhi, projecting down (and projecting being nonexpansive) means that M′lo is within ϵ3(1+λ) of Mlo.
Now, we can take M′lo and fill in all the missing measure data to get a M′ that projects down onto M′mid (certifying that it's in the preimage of ↓(Θω)(πpa)∩NF) as follows. Our most important constraint is that, when extending m′lo, it should perfectly mimic m′mid so it can project down onto it. Our second constraint is that, if mlo doesn't specify what comes after a finite history and it doesn't conflict with the first constraint, it should exactly mimic the conditional probabilities of m. Also, our δ fixes a first time n (ie, logγ(δ)) at which πpa is defined where πst isn't, so all conflicts of the second constraint with the first constraint must happen after then. This does the following: We can slice the histories assigned measures by M′ into three parts.
Part 1 is prefixes of histories in F(πst). There's only ϵ(1+λ) difference in these between M and M′ (after all, projecting down to πst leaves these unchanged, and M/M′ project down to Mlo and M′lo which are only ϵ3(1+λ) apart).
Part 2 is histories which have as a prefix something in F(πst) less than length n. In that case, we're mimicking the conditional probabilities of m.
Part 3 is histories which have as a prefix something in F(πst) of length n or higher. Because this is the threshold where πpa and πst start differing, we've got to obey the m′mid probabilities. But this only occurs after time n.
Let's analyze the difference between M′ and M, shall we? Our two relevant results are Vanessa's folk result that two distributions that differ by an amount will differ by the same amount if we extend them with the same conditional probabilities, and the result from the proof of Lemma 15 that arbitrarily reshuffling the measure/amount of dirt after time n takes γnλ′effort, where λ′ is the λ value of the measure you're reshuffling.
So, we start off with a ϵ3(1+λ) distance (includes the b term) between Mlo and M′lo. Then, extending up further to fill in everything up till time n, m and m′ mimic the conditional probabilities of each other. Still a ϵ3(1+λ) distance between them at this stage. Finally, after time n, M′ may go its own arbitrary way because it's gotta be compliant with M′mid, and to reshuffle this around, it takes γnλ′ effort. So, the net distance between M (arbitrary point in the preimage of ↓(Θω)(πst), and M′ (specially crafted point in the preimage of ↓(Θω)(πpa) is below ϵ3(1+λ)+γnλ′.
Wait, n (time of first difference) was logγ(δ) since πst and πpa are only δ apart, and λ′ can be at most λ+ϵ3(1+λ) because λ values are preserved by projections, and Mlo and M′lo are only ϵ3(1+λ) distance apart, so no more than that amount of dirt is the difference between the two. Finally, we assumed δ<ϵ3. So, we get:
And we have our appropriate distance bound between preimages! Now to use this in part 2, which should go a lot faster.
Time for part 2, to get full generality. Pick two partial policies πpa and π′pa and assume the distance between them is δ. Then, the stub πst given by "inf(πpa,π′pa) but cut it off so it's undefined after time n (where n is logγ(δ))" is within distance δ of both πpa and π′pa. Further, πst≤πpa,π′pa. Then, take some point in the preimage of πpa. It's also in the preimage of πst. Because πst is at a distance of δ from π′pa, we only have to go ϵ(1+λ) distance to get a point in the preimage of π′pa, and then reverse πpa and π′pa and we're done!
By Lemma 15, this establishes uniform continuity for the function mapping partial policies to the preimage of their nirvana-free part in the space of all nirvana-free measures over infinite histories.
Now that we've nabbed every nice condition other than this one, we can invoke Lemma 21 (we only require upper completion on the infinite levels, which we have) to get that the nirvana-free part is the (closed) convex hull of the projections of nirvana-free stuff from above. Then, just appeal to lemmas 11, 12, and 13, that the closed convex hull of projections of nirvana-free upper-complete sets is nirvana-free upper-complete.
Condition C: Causality.
We showed part of this all the way back in our consistency argument. For causal/surcausal, prπ,πpa∗(Θω(π))=↓(Θω)(πpa) regardless of which π≥πpa we picked. We'll be using this.
Pick some arbitrary πpa and M∈↓(Θω)(πpa). M has a preimage point Mπ∈Θω(π) where π≥πpa. We get an outcome function ofω mapping policies to points in their associated sets s.t. ofω(π)=Mπ. Extend this ofω to all points by defining of(π′pa):=prπ′,π′pa∗(ofω(π′))
Ok, we need to show that: This actually singles out a unique point and isn't an invalid definition, said point is in ↓(Θω)(π′pa), that of(πpa)=M, and that it's an outcome function.
Assuming this is actually well-defined, of(π′pa) is in ↓(Θω)(π′pa) trivially because it's a projection of a point from above. Also, of(πpa)=prπ,πpa∗(ofω(π))=prπ,πpa∗(Mπ)=M which clean ups that part. Now for showing that it's an outcome function.
So, we got everything assuming the extension is well-defined, let's show that. Pick any two π,π′ above any πpa. We'll show that they project to the same point.
And we're done with causality! Now for pseudocausality.
Condition P: Pseudocausality.
We'll do this in two steps. One is showing that for stubs, points which meet the appropriate conditions are also present in all the requisite other stubs. Step 2 is generalizing this to all points in ↓(Θω)(πpa).
Let's say you have some M∈↓(Θω)(πst). By Nirvana-free consistency, M∈c.h(⋃π≥πstprπ,πst∗(Θω(π))) so we can shatter it into finitely many Mi that are projections of stuff from above, M∞i. The support of the measure component of M is a subset of FNF(πst)∩FNF(π′st), so the same must apply to all the Mi.
Now, what we can do is make a π′i that mimics the behavior of πi for all prefixes and extensions of strings in FNF(πst)∩FNF(π′st), but otherwise mimics π′st, and extends if needed in some random-ass way, and is above π′st.
The reason we can do this is because, if there's a contradiction in this construction, it would be from π′st and πi behaving differently on some prefix or extension of a string in FNF(πst)∩FNF(π′st). But, π′st can't specify what to do for any nodes in FNF(π′st) or later (because FNF(π′st) is basically a coat of leaf observation nodes around the extent of π′st), and if π′st and πi differ on a strict prefix of something in FNF(πst)∩FNF(π′st), then that means that π′st and πst branch different ways so there's no node in both FNF(πst) and FNF(π′st) after the branch point, so again we get a contradiction.
Anyways, we've crafted our finitely many π′i which lie above π′st, and mimic πi going forward. Our M∞i is an extension of Mi whose measure component is only supported on FNF(πst)∩FNF(π′st). Also, before and past that, π′i mimics πi perfectly, so we can transfer M∞i to Θω(π′i) by infinitary pseudocausality. Do this for all the i. Then, projecting all those down to π′st, we get that all the Mi lie in ↓(Θω)(π′st), and mixing them together, we get that M itself lies in ↓(Θω)(π′st).
Now for part 2, where we show it for partial policies in general. Let M be arbitrary in ↓(Θω)(πpa). Project M down to all the πnpa to make a sequence of Mn. Since the support of the measure component of M is a subset of F(πpa)∩F(π′pa) (pseudocausality assumption) the support of the measure component of Mn is a subset of F(πnpa)∩F(π′npa), so by pseudocausality for stubs which we've shown, Mn is also present in ↓(Θω)(π′npa). Then, take the preimage in π′pa of all those Mn points. By consistency and Lemma 9 and the usual argument about "there can only be one preimage point for a series of points", we get that M itself (the only thing that could project down on Mn for all n) lies in ↓(Θω)(π′pa) and we're done with pseudocausality.
Alright, that's most of the proof out of the way, all that's left is showing that the full belief function conditions imply the finitary and infinitary versions, respectively, and getting isomorphism. Let's begin.
Let's check whether →st(Θ) makes a stub-hypothesis, and whether →ω(Θ) makes an infinitary-hypothesis, if Θ is a hypothesis/fulfills all the conditions. →st is just "restrict Θ to only reporting sets for stubs", and →ω is just "restrict Θ to only reporting sets for full policies"
The variants of nonemptiness, closure, convexity, nirvana-free upper-completion, bounded minimals, hausdorff-continuity, and pseudocausality for the finite and infinite case are trivially implied by the corresponding condition for hypotheses, leaving the four moderately nontrivial cases of the analogues of normalization, consistency, the extreme point condition, and causality.
Extreme point condition: The infinitary case doesn't have an analogue of the extreme point condition. So that leaves the finitary case. What we can do is take a nirvana-free extreme minimal point Mex in some Θ(πst), apply the general extreme point condition to get a nirvana-free M∈Θ(π) for some suitable π that projects down to Mex, and, clipping away the infinite parts by →st, the projections of M fill the role of the points in Θ(π′st) all below some policy that project down to Mex.
Causality. The finite case is that we can take a point associated with some stub, and craft an outcome function for stubs that matches up with our point. This is trivially implied by the general case of causality, where you can take any partial policy and point and get an outcome function that matches up with it. The infinite case is that we can take a point M in Θ(π), and get points for all the other Θ(π′) that project down appropriately. For this, again, we just take an outcome function for M and clip it off to the infinite levels.
Consistency: The finite case of weak consistency is pretty easy. We get
Where the subset came from full consistency because everything is the closed convex hull of projections from above, so projecting down gets you a subset. For the Nirvana-free consistency condition for the infinite case, it's a simple consequence of Lemma 21.
Normalization: infπstEΘ(πst)(0)=0 and supπstEΘ(πst)(1)=1
To begin, the normalization condition for infinitary hypotheses and general hypotheses is the exact same, so we can ignore that and work on the stub hypothesis case. The inf one is pretty easy. From general normalization, at the infinite level, there are π and nirvana-free points in Θ(π) with a b value at-or-near zero, and you can just project them down to any stub you want.
The sup one is a bit trickier. It's obviously not above 1, because no matter what policy π you pick, you've got a nirvana-free point with λ+b≤1 in Θ(π), which you can project down to whichever stub you're looking at, to certify that the expectation of 1 is 1 or less. Showing that it isn't below 1 is a bit harder.
Let's say there's some π where min(λμ,b)∈Θ(π)∩NF(λ+b)=1 (or arbitrarily close to 1, doesn't really matter, although we'll show later that there is indeed a maximizing policy where Murphy can only force a value of 1)
From Hausdorff-continuity, pick some δ to get an ϵ Hausdorff-distance or lower. This δ specifies some extremely large n, consider πn. Now, consider the set of every policy π′ above πn. All of these are δ or less away from π.
By Hausdorff-continuity, there can't be a nirvana-free point in any Θ(π′) with λ+b<1−ϵ, because we could do an ϵ perturbation to get a point in Θ(π)∩NF with λ+b<1, because small changes in M induce small changes in λ and b. Or, we can add a little bit of wiggle room if the minimizing value of λ+b in π is slightly less than 1
However, any nirvana-free point in Θ(πn) must originate as a mix of finitely many points from Θ(π′i)∩NF (varying π′i as long as it's above πn) that have been projected down. This is because, by our earlier proof of nirvana-free consistency from consistency in general, Θ(πn)∩NF=c.h(⋃π′≥πnprπ′,πn∗(Θ(π′)∩NF))
All of these projected points have λ+b≥1−ϵ, so the mix point has λ+b≥1−ϵ, so Murphy can only force a value of 1−ϵ or higher. And we can make δ as small as we wish to get a stub πn below π (n extremely large) where ϵ is as small as we wish, so the sup of the λ+b values Murphy can force over all stubs can't be below 1. So it must be 1.
Isomorphism! Let's go! As a quick recap,
↓(Θω)(πpa):=¯¯¯¯¯¯¯¯c.h(⋃π≥πpaprπ,πpa∗(Θω(π)))
↑(Θst)(πpa):=⋂πst≤πpa(prπpa,πst∗)−1(Θst(πst))
And →ω/→st is just "clip down your hypothesis to full policies/stubs".
So, two parts of this are trivially easy. From earlier in the proof (the start of the first section for the stub one, and an obvious corollary of definitions for the full policy one), we established that ↑(Θst)(πst)=Θst(πst) and ↓(Θω)(π)=Θω(π). Using this, →ω(↓(Θω))(π)=↓(Θω)(π)=Θω(π) and→st(↑(Θst))(πst)=↑(Θst)(πst)=Θst(πst)
Again, first two equalities are unpacking definitions, the third is consistency for Θ. So, ↓(→ω(Θ))=Θ and ↑(→st(Θ))=Θ
Putting it together, →ω and ↓ make an isomorphism between Θω and Θ, and →st and ↑ make an isomorphism between Θst and Θ. We're finally done!
Proposition 1:If Θ fulfills the causality condition, nonemptiness, closure, and convexity, then SΘ is a nonempty, closed, convex set of a-environments or a-survironments. ΘSΘ=Θ. Also, S⊆SΘS.
Ok, what SΘ is, is the set of a-environments (λe,b) where, regardless of πpa, (λ(πpa⋅e),b) lies in Θ(πpa). For nonemptiness, pick some arbitrary point in one of your Θ(πpa), use causality to get an outcome function, and then you fill in the conditional probabilities for an action-observation sequence with your outcome function points. This never produces a contradiction anywhere because if there was a contradiction, you'd be able to project two specified points down and have them disagree somewhere, which is impossible because we have an outcome function.
For closure, if you take a limit of a-environments, this makes a limiting sequence in all the SΘ, which are all closed, so the limit point environment has all its induced distributions lying in the usual Θ(πpa), and is in SΘ
For convexity, if you take a mix of a-environments, this makes the same mix in all the SΘ which are all convex, so the mixed environment has all its induced distributions lying in the usual Θ(πpa), and is in SΘ.
For equality, if M∈ΘSΘ(πpa), then it originated from some a-environment made from an outcome function for Θ, which... just gets your original point so M∈Θ(πpa). In the other direction, if M∈Θ(πpa), by causality, we can project down and extend the specification and make an a-environment that acts like M on πpa, and then going back gets you M∈ΘSΘ(πpa).
In the other direction, if (λe,b)∈S, then it induces an outcome function and you can go back from that to (λe,b)∈SΘS, so S⊆SΘS
Theorem 3.1: Pseudocausal Translation:For all pseudocausal Θst hypotheses defined only on policy stubs, →c(Θst) is a causal hypothesis only defined on policy stubs. →NF(→c(Θst))=Θst. For all causal Θst hypotheses defined only on policy stubs, →NF(Θst) is a pseudocausal hypothesis only defined on policy stubs.
Theorem 3.2: Acausal Translation:For all acausal Θst hypotheses defined only on policy stubs, →sc(Θst) is a surcausal hypothesis only defined on policy stubs. →NF(→sc(Θst))=Θst. For all surcausal Θst hypotheses defined only on policy stubs, →NF(Θst) is an acausal hypothesis only defined on policy stubs.
Both these theorems have highly similar proofs, so let's group them together. First, we'll need to set up how →c and →sc work, and then knock out two lemmas we'll need before we can proceed to the main result. →c is defined by →c(Θst)(πst)=¯¯¯¯¯¯¯¯c.h(⋂π′st≤πst(Iπ′st,πst∗(Θst(π′st))))→sc is defined identically, just with I∗s instead of I∗, and closed convex hull permitting us to mix with 0+ probability.
Iπ′st,πst where π′st≤πst (this is like the inverse of projection, it's going up instead of down) is a function F(π′st)→F(πst) defined by: If h∈F(π′st)∩F(πst), then Iπ′st,πst(h)=h. If h∈F(π′st) and isn't in F(πst), then Iπ′st,πst(h)=hπst(h)N
Iπ′st,πst∗ from Ma(F(π′st))→Ma(F(πst)) is just pushing (m,b) through the mapping Iπ′st,πst. You keep the b term the same, and push the measure terms up. Iπ′st,πst∗s is defined identically on the measure part, except that it has the rule that all nirvana events in F(πst) and not in F(π′st) with 0 measure get 0+ measure instead.
Intuitively, what I∗ and I∗s are doing, is capping off whatever they need to (in order to extend appropriately) with Nirvana. I∗ is capping off positive-probability histories with guaranteed Nirvana immediately afterwards, where I∗s is more paranoid and caps off every 0-probability Nirvana history that got added with "it is possible that Nirvana occurs here".
Let's go over some properties that I∗ and I∗s fulfill. I∗ is an injective continuous map Ma(F(π′st))→Ma(F(πst)), and I∗s is an injective continuous map SMa(F(π′st))→SMa(F(πst)). I∗ and I∗s are undone by projecting back down, prπst,π′st∗(Iπ′st,πst∗(M))=M. Both I∗ and Is∗ are linear, the latter in the stronger sense that it's linear when you mix stuff with 0+ probability, it doesn't matter whether you mix before or after injecting up. Further, injections up commute, Iπ′st,πst∗(Iπ′′st,π′st∗(M))=Iπ′′st,πst∗(M), and the same for I∗s.
In order to make progress, we want to get two important lemmas. The first one, Lemma 22, is that slicing away the nirvana from this thing recovers the original pseudocausal hypothesis. The second one I call the "Diamond Lemma", and it says that injecting up and projecting down is the same as projecting down and then injecting up, and if you sketch it out, it looks like a diamond.
Lemma 22:(→c(Θst)(πst))∩NF=Θst(πst), and the same holds for →sc.
Proof sketch: One direction is trivial, the other direction that →c doesn't add any new nirvana-free points is trickier. Working in the pseudocausal-to-causal setting, we can take some M that's nirvana-free in the closed convex hull, and get a sequence Mn limiting to it where each Mn is in the convex hull. Now, indexing stubs below πst by i, the Mn can all be viewed as a mix of Mi,n points projected up from below. The problem is, the mix varies as n does. What we can do is separate into "good" i where we can get a suitable limit point and limit probability, and "bad" i that we have to treat as a special chunk, and reexpress M as a sum of a probabilistic mix of "good" Mi injected up, and an additional "bad" chunk. We can show that the "good" Mi can all be transferred up to Θst(πst) itself by pseudocausality and mixed in there, and the "bad" chunk is a nirvana-free a-measure. So, M is the sum of a point in Θst(πst), plus a nirvana-free a-measure, so M lies in Θst(πst) by nirvana-free upper completion.
Working in the surcausal-to-acausal setting, we take our M in the closed convex hull and a sequence Mn, but injection up in this setting is much more effective at adding Nirvana, and the surmetric is much more sensitive than the usual metric for noticing the presence or absence of Nirvana. So, only an initial segment of Mn is "contaminated" with Nirvana since the limit point is Nirvana-free, and we can clip that part off, and the "uncontaminated tail" can only have come from Θst(πst) itself because injection up is very aggressive with adding Nirvana, so we get it from just closure on Θst(πst).
Proof: For (→c(Θst)(πst))∩NF⊇Θst(πst), just observe that the identity injection Iπst,πst∗ leaves Θst(πst) completely unchanged and adds no nirvana, so any point in Θst(πst) also lies in the closed convex hull of the injections up, and is nirvana-free because the original point that we mapped through identity was nirvana-free. This works with the surcausal case too.
Now for the considerably more difficult reverse direction, for the pseudocausal-to-causal case first.
If M∈→c(Θst)(πst)∩NF, then unpacking that, M is nirvana-free, and lies in the closed convex hull of the injections up. So, we can fix a sequence Mn in the convex hull of injections up that limits to M. Index the stubs below πst by i, there's only finitely many of them.
The Mn can be written as ∑iζi,nIπist,πst∗(Mi,n) where Mi,n∈Θst(πist). ζi,n may be 0. This is because, if there's multiple points in the injection of a particular stub that are mixed, you can mix them before injecting up to get a single Mi,n that's injected up, because injections are linear and we're injecting a convex set.
Blessed by the gift of finitely many i to worry about, use repeated picking of subsequences to get a subsequence of n where:
For all i, ζi,n converges. Call the limiting values ζi. Now split the i into good i where ζi>0, and bad i where ζi=0. The ζi will sum up to 1.
For all good i, ζi,n>0 always. The fact that they all limit to above 0 helps you out because you only have to trim off an initial segment.
For all good i, Mi,n converges, call the limit point Mi. This is because injection up preserves λ and b, and ζi,n is bounded above 0, so the λ+b value of the Mi,n is upper-bounded by λ′+b′minnζi,n, which is a finite nonnegative number divided by a finite positive number, and we can apply the Compactness Lemma to establish that a convergent subsequence exists. In this case, λ′+b′ is the bound as a whole for the sequence Mn, which converges so it must have a bound of that form, and not the bound on minimal points.
Finally, ∑bad i(ζi,nIπist,πst∗(Mi,n)) converges. This is doable because the sequence Mn has bounded λ and b because it converges to something, so the partial sum of bad i has the same bound, so we can invoke the Compactness Lemma to get our convergent subsequences.
Putting all this together (we kept selecting from compact sets so that is what let us build a subsequence with all these great properties at once) we have a decomposition of M itself into:
Now, since ∑good iζi=1 (all the bad i had their probability components limit to 0), that first sum part looks like an actual mixture of points injected up! Since M is nirvana-free, both parts must be nirvana-free, and the sums are also a-measures.
First, by closure of all our original Θst(πist), all the Mi components (where i is good) do lie in Θst(πist). And when we inject the Mi up, since the mix of them is nirvana-free, this means that each individual Mi must be nirvana-free after injection.
Now, what injection does, is it caps Nirvana on everything that is in F(πist) and not in F(πst) that has positive probability. So, if Mi is nirvana-free after injection, this must mean that its measure component is only supported on F(πst). Via pseudocausality, this means that Mi lies in Θst(πst) itself! Also, Iπist,πst∗(Mi)=Mi.
So, our sum over good i components (by convexity), is actually a probabilistic mixture of stuff in Θst(πst) itself! Abbreviating ∑good iζiMi as M′, which lies in Θst(πst) by convexity, and rewriting the sum, we can reepress M as:
M′+limn→∞(∑bad i(ζi,nIπist,πst∗(Mi,n)))
This is a nirvana-free a-measure in Θst(πst), plus a nirvana-free a-measure, so, by nirvana-free upper-completion, M lies in Θst(πst) and we're done. Now, let's hit up the surcausal case.
Assume SM∈→NF(Θst)(πst)∩NF. SM is nirvana-free, and lies in the closed convex hull of the injections up. So, we can fix a sequence SMn in the convex hull of injections up that limits to SM. Index the stubs below πst by i, there's only finitely many of them, reserve i=0 for πst itself.
The SMn can be written as ∑iζi,nIπist,πst∗s(Mi,n) where Mi,n∈Θst(πist). ζi,n may be 0 or 0+. This is because, if there's multiple points in the injection of a particular stub, you can mix them before injecting up to get your single point, one for each πist, because injections are linear and we're injecting a convex set.
Note that SM is nirvana-free, and there's only finitely many spots where nirvana could be since we're working in a stub, so past a certain point all the SMn will be nirvana-free due to the surmetric we're using. Let's clip off that initial segment that's contaminated with Nirvana. Now, we can get something very interesting. If πist<πst, then injecting up anything at all is going to stick nirvana (maybe with 0+ measure) somewhere. Having ζi,n be 0+ doesn't help you, because mixing with a nirvana-containing thing with 0+ probability means the mixture contains the nirvana-spots of that thing you mixed in. So, past a certain point, all the SMn can only be written as M0,n (the identity injection, anything else either has exactly 0 probability so it gets clipped out of the sum, or it has Nirvana somewhere and can't be present).
Therefore, in the tail, the sequence of SMn limiting to SM is the same as M0,n∈Θst(πst) limiting to some M0∈Θst(πst), so SM∈Θst(πst) and it's actually an a-measure, not an a-surmeasure. This establishes Lemma 22 for the sur-case.
Lemma 23/Diamond Lemma:For any πst,π′st, and any πhist≥πst,π′st, and any M∈Ma(F(πst)), then: prπhist,π′st∗(Iπst,πhist∗(M))=Iinf(πst,π′st),πst∗(prπst,inf(πst,π′st)∗(M)) (and same for I∗s and the sur variants)
it's called the Diamond Lemma because if you sketch out the injections as going diagonally up and the projections as going diagonally down, the commutative diagram looks like a diamond.
To begin with, we can go "hm, there's an upper bound on πst and π′st. For every finite history in F(πst), there's an extension of that history in F(πhist), which has a prefix in F(π′st), and vice-versa. This establishes that for all the finitely many histories in F(πst), either a prefix of that history lies in F(π′st), or an extension of that history lies in F(π′st), and vice-versa for F(π′st)
Now, we can split into three possible cases and show that up-then-down equals down-then-up in terms of what measure is assigned to a history in F(π′st) by mapping M through the injections and projections, which shows the diamond lemma in full generality.
In the first case, our history h in F(π′st) is also in F(πst) (the equality case)In this case, h also lies in F(inf(πst,π′st)). Projecting down to inf does nothing to the measure on h, and embedding up also does nothing to the measure on h. Embedding up to πhist also does nothing to the measure on h, and projecting down doesn't affect it either.
In the second case, our history h in F(π′st) isn't in F(πst), but there are strict extensions that lie in F(πst) (this requires h to be nirvana-free). h is still assigned a measure by M, though, being a prefix of stuff with measure. In this case, h also lies in F(inf(πst,π′st)). The same analysis from our first case works, h doesn't have its measure disrupted.
In the third case, our history h in F(π′st) isn't in F(πst), but a strict prefix h′ lies in F(πst). We can distinguish three subcases. In the first subcase, h is of the form h′aN. In the second subcase, h still ends with Nirvana, but it isn't immediately after h′ happens, some stuff happens in the meantime first. In the third subcase, h doesn't end with Nirvana. Also, h′ lies in F(inf(πst,π′st)).
For the first subcase where h is of the form h′aN, injecting up means h′aN now has the measure originally associated with h′ and nirvana is marked as "possible" there (if we're using the sur-injection). Projecting down leaves this alone. Projecting down leaves the measure on h′ alone, and injecting up means h′aN now has the measure originally associated with h′ and nirvana is marked as "possible" there (if we're using the sur-injection). In both paths, h′aN ends up with the measure that h′ started with, and nirvana marked as "possible" in the sur-case.
For the second subcase where h is of the form "h′, then some stuff happens, then Nirvana occurs", then in the causal case, the injection up would assign h 0 measure (all the measure of h′ got channeled into h′aN instead of h), and then projecting down, it stays the same. Similarly, projecting down means h′ has some measure, then it's all channeled into h′aN on the injection up, so h itself gets 0 measure. For the surcausal case, the injection up assigns h0+ measure (by the same argument and sur-injections tagging every freshly-added nirvana outcome with 0+ measure). and projecting down, it remains with 0+ measure. Projecting down leaves h′ alone and then injecting up tags h with 0+ measure.
For the third subcase where h is an extension of h′ that doesn't add any Nirvana, we can run through the same argument as the second subcase to conclude that we get 0 measure for both the causal case and the surcausal one.
Thus, no matter whether we inject up and project down, or project down and inject up, the measure assigned to h∈F(π′st) by the measure component of the p-(sur)measure will agree.
An important thing to note with this is that we can use any stub above πst and π′st for the injection up, but we must use inf(πst,π′st) for the projection down.
Now, we can finally embark on the proof of the two translation theorems! There's enough similarities between the proofs that we can just do one big proof and remark on any differences we come across. The things we must show are that slicing off the Nirvana from a causal/surcausal hypothesis makes a pseudocausal/acausal hypothesis, and that adding in those injections up can turn a pseudocausal/acausal hypothesis to a causal/surcausal one, and that going nirvana-free to nirvana-containing back to nirvana-free is identity.
Proof sketch: While at first this may look like the proof will be almost as long as the Isomorphism theorem because we're verifying a list of 9 conditions twice over, it'll actually be considerably shorter. The only nontrivial part of the first part where we check that slicing off the Nirvana makes a pseudocausal/acausal hypothesis is deriving pseudocausality from causality, and even that is fairly easy.
Going from pseudocausal/acausal to causal/surcausal is trickier, though thankfully most conditions are trivially true, there's only three notable ones. There's the bound on minimal points, which is done by taking a sequence Mn limiting to a M that violates the bound, using the definition of the causal translation to get a point below each Mn which obeys the λ⊙+b⊙ bound (fairly nontrivial), and appealing to Lemma 16 to construct a limit point below M that obeys the λ⊙+b⊙ bound. Showing weak consistency (projecting down makes a subset) requires the Diamond Lemma to write the projection of your point of interest as a mix of injections up from below, and the last tricky one is causality. Which requires first showing that injecting up →c(Θst(πst)) to a higher stub won't add any new points, and then coming up with a clever way of building our outcome function, and using the Diamond Lemma to show that it indeed an outcome function.
Finally, nirvana-free to causal to nirvana-free is instant by Lemma 22.
Proof: Referring back to the conditions for a hypothesis on policy stubs, we'll show that they're fulfilled when you slice away the Nirvana, and that pseudocausality can be derived from causality if we're just dealing with a-measures and not a-surmeasures.
Stub Nirvana-free Nonemptiness was a property already possessed by the causal hypothesis, so it's preserved when we clip away the Nirvana. Stub closure and convexity also hold because we're intersecting with the closed convex set (of nirvana-free a-measures). Nirvana-free upper completion also holds. Bounded minimals holds because a minimal point in the Nirvana-free part must also be a minimal point in the original set, because adding anything to a nirvana-containing a-measure makes a nirvana-containing a-measure, so there can be no nirvana below our minimal point in the nirvana-free part, so it's minimal in general. Normalization holds because the expectation values only depend on the nirvana-free part. Nirvana-free stuff projects down to nirvana-free stuff, getting stub-consistency. The stub extreme point condition carries over due to the preexisting intersection with nirvana-free used to define it, and the same applies to uniform continuity. This wraps up surcausal-to-acausal, and just leaves deriving pseudocausality from causality for causal-to-pseudocausal.
Let's say we have a nirvana-free M∈Θst(πst), where the measure part of M is supported over F(π′st), and we want to show that it's also present in Θst(π′st). Then M is present in Θst(inf(πst,π′st)), because it's supported entirely over histories that both the different policies produce, so it's supported over histories that the intersection of the policies produces, just project it down to the inf. Now, by causality, we can find something in Θst(π′st) that projects down onto M, which must be M itself because the measure part of M is supported entirely on histories in F(π′st). This gets pseudocausality.
Now, let's show that →c(Θst) and →sc(Θst) fulfill the defining conditions for finitary causal.
1: Stub Nirvana-Free Nonemptiness: This one is trivial, because Θst(πst) is present as a subset via the identity injection Iπst,πst∗, and is nirvana-free.
2,3: Stub Closure/Convexity: We took a closed convex hull, these are tautological.
4: Stub Nirvana-Free Upper-Completeness: Just apply Lemma 22 to get that the nirvana-free part of →c(Θst)(πst) (and same with surcausal) is just the original Θst(πst), which is nirvana-free and upper-complete by stub nirvana-free upper-completeness, so we're good there.
Condition 5: Stub Bounded Minimals:
By stub bounded minimals on the Θst we have a λ⊙+b⊙ bound on λ+b for minimal points in Θst(πst) for all stubs. Pick a M∈→c(Θst)(πst) (or the surcausal analogue) with λ+b>λ⊙+b⊙. There's a sequence of points Mn limiting to M that lie in the convex hull. All these Mn (or SMn) can be written as ∑iζi,nIπist,πst∗(Mi,n) where Mi,n∈Θst(πist). Decompose Mi,n into a minimal point and something else, getting Mmini,n+M∗i,n.
Then, do a further rewrite as Mmini,n+(m∗−i,n,−m∗−i,n(1))+(m∗+i,n,b∗i,n+m∗−i,n(1))
Note that the λ+b value of the sum of the first two terms is bounded above by λ⊙+b⊙, because Mmini,n obeys that bound, and for the second term, it deletes exactly as much from the λ term as it adds to the b term. Also, since Mi,n is an a-measure, adding in just the negative component to Mmini,n doesn't make it go negative anywhere, so the sum of the first two terms is an a-measure, and by nirvana-free upper completeness, it lies in Θst(πist). The third component of the sum is also an a-measure.
By linearity of I∗ or I∗s, injecting up the first two terms and the last term, and adding them afterwards, is the same as injecting up the bulk of them (we can only inject up a-measures). Let's abbreviate Mmini,n+(m∗−i,n,−m∗−i,n(1)) as M◊i,n and abbreviate (m∗+i,n,b∗i,n+m∗−i,n(1)) as M♡i,n Now, we can rewrite Mn as:
The first component is in →c(Θst)(πst) and has the λ⊙+b⊙ bound (injection up preserves λ and b) and M◊i,n lies below the λ⊙+b⊙ bound, and mixing stuff below the bound produces a point below the bound. Abbreviate the first component as M◊n.
So, Mn isn't minimal, it lies above M◊n. Because the M◊n have a λ⊙+b⊙ bound, and there's only finitely many places where nirvana could be, we can extract a convergent subsequence, limiting to some M◊ which obeys the λ⊙+b⊙ bound, and by Lemma 16, M◊ lies below M.
Therefore, M isn't minimal, and it was an arbitrary point that violated the λ⊙+b⊙ bound, so all minimal points in any →c(Θst)(πst) obey the same λ⊙+b⊙ bound, and we get bounded minimals.
6: Stub Normalization. By Lemma 22, we didn't introduce any new nirvana-free points, so stub normalization of Θst carries over.
Condition 7: Weak Consistency.
This is "projecting down makes a subset". All the following arguments work with sur-stuff. Fix some M∈→c(Θst)(πhist). It's a limit of points Mn in the convex hull. We can decompose the Mn as ∑iζi,nIπist,πhist∗(Mi,n). Then,
Which, by the Diamond Lemma, can be rewritten as: ∑iζi,nIinf(πlost,πist),πlost∗(prπist,inf(πlost,πist)∗(Mi,n))
The projections of the Mi,n∈Θst(πist) lie in Θst(inf(πlost,πist)) by weak consistency for Θst. So, actually, the projection of Mn down to πlost can be written as a mix of injections up from stubs below πlost, so the projection of Mn lies in →c(Θst)(πst). Then, just use continuity of projection, and closure, to get M itself projecting down into →c(Θst)(πlost), so we're good on weak consistency.
8: Stub Extreme Point Condition: By Lemma 22, we didn't introduce any new nirvana-free points, so any nirvana-free extreme minimal point was present (and nirvana-free extreme minimal) in Θst already, so the extreme point condition carries over from there.
9: Stub Hausdorff Continuity: By Lemma 22, we didn't introduce any new nirvana-free points, and the preimages for Hausdorff continuity are of the nirvana-free parts, so this is completely unaffected and carries over.
Condition C: Causality: As a warmup to this result, we'll show that if πst≤πhist, then
Iπst,πhist∗(→c(Θst)(πst))⊆→c(Θst)(πhist)
Pick a M∈→c(Θst)(πst). It's a limit of Mn which are finite mixtures of injections of stuff from below, and Mn can be written as ∑iζi,nIπist,πst∗(Mi,n) Then,
And then, by commutativity of injections, the injection of Mn rewrites as ∑iζi,nIπist,πhist∗(Mi,n) All the πist are below πst so they're also under πhist, witnessing that the injection of Mn lies in →c(Θπst(πhist)). Then, just appeal to closure and continuity of I∗ or I∗s to get that M injects up into →c(Θπst(πhist)) Again, all this stuff works for the sur-situation as well.
With this out of the way, fix some M∈→cΘπst(πst). Let's try to make an outcome function from this, shall we? Let's do of(π′st):=Iinf(πst,π′st),π′st∗(prπst,inf(πst,π′st)∗(M))
Yup, that does indeed specify one point for everything. It obviously spits out M when you feed πst in because both the injection and projection turn into identity. Further, by weak-consistency, the projection of M down lies in →c(Θπst)(inf(πst,π′st)), and by our freshly-proved result, injecting up lands you in →c(Θst)(π′st).
So, all that's left is showing that it's an outcome function! That, for any two πhist and πlost where πhist≥πlost, that prπhist,πlost∗(of(πhist))=of(πlost) Let's begin.
And then, inf(πlost,inf(πst,πhist))=inf(πlost,πhist,πst)=inf(πst,πlost) because πlost≤πhist. Rewriting a bit, and grouping the two projections together because they commute, we have:
And we're finally done with everything, we showed causality.
Lemma 24:Given a Θ (maybe just defined on stubs or full policies) that fulfills all hypothesis conditions except normalization, if it's normalizable, then all belief-function conditions are preserved (works in the sur-case too)
Nirvana-free nonemptiness, closure, convexity, nirvana-free upper-completion, and bounded-minimals are all obviously preserved by scale-and-shift/Proposition 7 in section 1 of proofs. For consistency, due to projections preserving λ and b, the scale-and-shift in the stubs (or full policies) is perfectly reflected in whichever partial policy you're evaluating, so consistency holds too. For the extreme point condition, any nirvana-free minimal extreme point post-renormalization is also nirvana-free minimal extreme pre-renormalization, so we can undo the renormalization, get a point in the nirvana-free component of a full policy that projects down accordingly, and scale-shift that point to get something that projects down to the scale-shifted extreme point. For Hausdorff-continuity, the scaling just scales the distance between two sets by the scale term, so Hausdorff-continuity carries over. Pseudocausality is preserved by normalization (un-normalize, transfer over to the partial policy of interest, then normalize back again), and so is causality (unnormalize the point and outcome function, complete it, normalize your batch of points back again).
Proposition 2:Given a nirvana-free Θ?ω, the minimal constraints we must check of Θ?ω to turn it into an acausal hypothesis are: Nonemptiness, Restricted Minimals, Hausdorff-Continuity, and non-failure of renormalization. Every other constraint to make a hypothesis can be freely introduced.
Ok, we have our Θ?ω, and we want to produce an acausal hypothesis. The obvious way to do it is: Θω(π)=(¯¯¯¯¯¯¯¯c.h(Θ?ω(π))uc)R We'll use Θω,¬R(π) to refer to the set before renormalization.
Proof sketch: We basically just run through the infinitary hypothesis conditions, and show that they're fulfilled by Θω,¬R, and then appeal to Lemma 24 that we didn't destroy our hard work when we normalize. As for the hypothesis conditions themselves, they're all pretty simple to show except for bounded-minimals and Hausdorff-continuity, which is where the bulk of the work is.
1: Nirvana-free nonemptiness. Trivial, because all the Θ?ω(π) are nonempty.
2: Closure: Appeal to Lemma 2, not in this document, but of section 1 in basic inframeasure theory, that the upper completion of a closed set is closed. Then we just intersect with the cone of a-measures (closed) to get our set of interest, so it's closed.
3: Convexity: If you have M and M′ which decompose into Mlo+M∗ and M′lo+M′∗ (M and M′ lie in the upper completion and Mlo/M′lo lie in the closed convex hull), then
pM+(1−p)M′=p(Mlo+M∗)+(1−p)(M′lo+M′∗)
=(pMlo+(1−p)M′lo)+(pM∗+(1−p)M′∗)
The first component lies in the closed convex hull because it's a mix of two points from the closed convex hull, the second component is an sa-measure, and by our upper completion, then pM+(1−p)M′ lies in our Θω,¬R(π)
4: Nirvana-free Upper-completeness: Trivial, we took the upper completion.
Condition 5: Bounded Minimals:
This can be shown by demonstrating that, if λ⊙+b⊙ is our bound for Θ?ω (ie, every point in Θ?ω(π), regardless of π, either respects the bound or lies strictly above a point in Θ?ω(π) that respects the bound), then every point in Θω,¬R(π) lies above a point that obeys the λ⊙+b⊙ bound.
Take a point M∈¯¯¯¯¯¯¯¯c.h(Θ?ω(π)). We don't have to worry about points in the upper completion that weren't part of the original closed convex hull, because they're above something in the closed convex hull, so we just have to show that everything in the closed convex hull lies above something that respects the bounds.
M can be written as a limit of points Mn∈c.h(Θ?ω(π)), which split into a mixture of finitely many Mi,n∈Θ?ω(π). We can then split the Mi,n into Mloi,n+M∗i,n, where Mloi,n respects the appropriate bounds (everything in Θ?ω(π) either obeys the bounds or is above a point which obeys the bounds). Now, we can rewrite Mn as: ∑iζi,n(Mloi,n)+∑iζi,n(M∗i,n)
That first sum term is a mixture of stuff that respects the λ⊙+b⊙ bound, so it respects the same property and lies below Mn by the addition of the second term making Mn. All these lower points lie in the closed convex hull, and obey the λ⊙+b⊙ bound, so there's a convergent subsequence that limits to some limit point that's also in the closed convex hull, respects the λ⊙+b⊙ bound, and by Lemma 16, is below M.
So, any M∈Θω,¬R(π), regardless of π, which violates the λ⊙+b⊙ bound on minimal points, has a point lower than it which does respect the bound, showing Minimal Boundedness.
We normalize at the end, and need uniform Hausdorff Continuity to show nirvana-free consistency, so let's skip to that one, which is hard.
Condition 8: Uniform Hausdorff Continuity:
We'll be working with the Lemma 15 variant of Hausdorff-continuity, that given any ϵ, there's a δ where two policies π,π′ being δ apart or less guarantees that if M is in Θ?ω(π), then there's a point M′ in Θ?ω(π′) that's only ϵ(1+λ) away, where λ is the λ value associated with M, and establish that variant for Θω,¬R.
Fix an ϵ. How close do two policies have to be to guarantee that for any M∈Θω,¬R(π), there's a point M′ in Θω,¬R(π′) within ϵ(1+λ)? Well, for our original Hausdorff-continuity condition, pick a δ that forces a ϵ2(1+λ⊙+b⊙) "distance", and δ<ϵ2(1+λ⊙+b⊙).
Since we've got closure and bounded-minimals, write M as Mlo+M∗ where Mlo respects the λ⊙+b⊙ bound, and it lies in the closed convex hull and is a limit of Mlon points, which decompose into a mixture of finitely many Mloi,n points.
Now, each of these Mloi,n points in Θ?ω(π), by Hausdorff-continuity of Θ?ω, have a M′loi,n point in Θ?ω(π′), that's only ϵ2(1+λ⊙+b⊙)(1+λi,n) away, by π′ and π being δ or less apart.
We can mix the M′loi,n in the same way as usual to make a M′lon that's only ϵ2(1+λ⊙+b⊙)(1+λn) or less away from Mlon
Because the Mlon sequence converges, there's some bound on the λn and bn values, and the (at most) ϵ2(1+λ⊙+b⊙)(1+maxn(λn)) change to make M′lon still keeps the λ and b values of our new sequence bounded, so by the Compactness Lemma, the M′lon sequence has a convergent subsequence, with a limit point M′lo, that lies in Θω,¬R(π′) by closure. Also, for all n,
The two distances on either side limit to 0, and the middle distance limits to ϵ2(1+λ⊙+b⊙)(1+λ⊙+b⊙) or less, because eventually the λ value of Mlon gets really close to the λ value of Mlo, which is subject to the constraint that it can't be bigger than λ⊙+b⊙ due to Mlo being picked to have its λ+b value below λ⊙+b⊙, so d(Mlo,M′lo)≤ϵ2
Ok, so those two things are pretty close to each other. But what we really want is to find a point in Θω,¬R(π′) that's close to M, ie, Mlo+M∗. We can invoke the proof path from direction 2 of Lemma 15 (we have enough tools to do it, most notably upper completion) to craft a M′∈Θω,¬R(π′) where d(M,M′)≤ϵ2+δ(ϵ2+λ)
Further, δ<ϵ2(1+λ⊙+b⊙)≤ϵ2. So, we get d(M,M′)≤ϵ(1+λ) and we're done.
7: Nirvana-free Consistency: We're working in a nirvana-free setting, so we can simplify things. Our formulation that we're going to show is, regardless of stub πst, c.h(⋃π>πstprπ,πst∗(Θω(π)))is closed. Just invoke Lemma 20 and we have it and that's the last one we needed besides renormalization. Now all we have to do is to show that every property is preserved when we do the necessary rescaling. Invoke Lemma 24.
Proposition 3:Given some arbitarary Θ?ω which can be turned into an acausal hypothesis, turning it into Θω has EΘω(π)(f)=α(EΘ?ω(π)(f)−β) for all π and f.
The steps to make your full Θω are convex hull, closure, upper completion, and renormalization. For convex hull, because f induces a positive functional, which is linear, convex hull doesn't affect the worst-case value (mixing a-measures mixes the score you get w.r.t the function), closure just swaps inf out for min, and upper completion doesn't add any new minimal points so it preserves the same minimal values for everything. Let's use Θω,¬R for the upper completion of the closed convex hull (no renormalization) so, unpacking definitions, for all π,f:
one direction of this is pretty easy, if the belief functions are identical when you slice off the nirvana, then regardless of π and f, Murphy forces the same value. The other direction of this can be done by Theorem 3 from Section 1. Fixing a π, the property ∀f:E→NF(Θ)(π)(f)=E→NF(Θ′)(π)(f) implies →NF(Θ)(π)=→NF(Θ′)(π), but this holds for all π, so we get →NF(Θ)=→NF(Θ′)
Proposition 5:For all hypotheses Θ, and all continuous functions g from policies to functions f∈C((A×O)ω,[0,1]), then argmaxπEΘ(π)(g(π)) exists and is closed.
Proof sketch: We'll prove this in four phases, where πn is some arbitrary sequence of policies limiting to the policy π.
Our first phase will be establishing that limn→∞|EΘ(πn)(g(πn))−EΘ(πn)(g(π))|=0
Our second phase will be establishing that limsupn→∞EΘ(πn)(g(π))≤EΘ(π)(g(π))
Our third phase will be establishing that liminfn→∞EΘ(πn)(g(π))≥EΘ(π)(g(π))
Putting phase 2 and 3 together, limn→∞EΘ(πn)(g(π))=EΘ(π)(g(π)) and then, in conjunction with phase 1, limn→∞EΘ(πn)(g(πn))=EΘ(π)(g(π)) This establishes that the function π↦EΘ(π)(g(π)) is continuous. Phase 4 is then deriving our desired result from the continuity of that function.
Begin phase 1. To begin with, because g is a function from a compact metric space to a metric space, by the Heine-Cantor theorem, it's uniformly continuous. So, there's some δ difference between policies that guarantees an ϵ difference between the functions produced, and our distance metric on functions in this case is supx|f(x)−f′(x)|, the distance metric associated with uniform convergence.
By Proposition 3 (Section 1), every positive functional (and, by Proposition 1 (Section 1), continuous functions induce positive functionals) is minimized within the set of minimal points, so we can fix an (m,b) and (m′,b′) within (Θ(πn))min (specifically, the nirvana-free component) which minimize the positive functionals associated with g(πn) and g(π), respectively. Being able to get an actual minimizing point follows from minimal-boundedness, so the closure of the set of minimal points (due to having bounds) is compact, and a continuous function from a compact set to [0,1] has an actual minimizing point, so we can pick such a point and then step down to a minimal point if needed
Note that, by the way we picked these, m(g(πn))+b=EΘ(πn)(g(πn)) and also m′(g(π))+b′=EΘ(πn)(g(π))
First, we'll bound the following two terms. |(m(g(πn))+b)−(m(g(π))+b)| and |(m′(g(π))+b′)−(m′(g(πn))+b′)| The same arguments work for both, so we'll just show one of them.
The argument for this is that the first ≤ is because m is a measure (never negative), so an upper-bound on the absolute value of the expectation is given by the expectation of the absolute value of the distance between the two functions. For the second ≤, if n is large enough to make πn and π be only δn apart, then g(πn) and g(π) are only ϵn apart, so that absolute value is upper bounded by ϵn, getting us an upper bound of m(ϵn) Then, because the total amount of measure for minimal points is upper bounded by some λ⊙ regardless of which policy we picked (minimal-point boundedness for Θ), we can finally impose an upper bound of ϵnλ⊙ on the distance. The sort of argument works for the second thing, and gets us the exact same upper bound.
Further, m(g(πn))+b≤m′(g(πn))+b′ and m′(g(π))+b′≤m(g(π))+b. This is because (m,b) is specialized to minimize g(πn) and (m′,b′) is specialized to minimize g(π). Therefore, in one direction:
(m′(g(π))+b′)−(m(g(π))+b)≤0 so (m′(g(π))+b′)−(m(g(πn))+b)≤ϵnλ⊙
In the other direction,
(m(g(πn))+b)−(m′(g(πn))+b′)≤0 so (m(g(πn))+b)−(m′(g(π))+b′)≤ϵnλ⊙
We can make n go to infinity, which makes δn (distance between policies) go to 0, which makes ϵn (distance between functions) go to 0, and λ⊙ is a constant, so we get that the distance between the two expectations limits to 0 and we're done with the first phase.
Time for phase 2, showing that limsupn→∞EΘ(πn)(g(π))≤EΘ(π)(g(π))
Fix some (m,b) in Θ(π)∩NF that minimizes the positive functional associated with g(π). By Hausdorff-Continuity, we can find a sequence of points (mn,bn)∈Θ(πn)∩NF that limit to (m,b). By continuity of g(π), this means that mn(g(π))+bn limits to m(g(π))+b, which is EΘ(π)(g(π)). However, mn(g(π))+bn≥EΘ(πn)(g(π))
Thus, limsupn→∞EΘ(πn)(g(π))≤EΘ(π)(g(π))
Now for phase 3, showing that liminfn→∞EΘ(πn)(g(π))≥EΘ(π)(g(π))
Assume it's false, we'll get a proof-by contradiction. That is,
liminfn→∞EΘ(πn)(g(π))<EΘ(π)(g(π))
Then, we can get some subsequence where the expectations converge to the liminf. For each n in that subsequence, fix a (mn,bn)∈(Θ(πn))min∩NF that minimizes the positive functional associated with g(π) within Θ(πn). Ie, mn(g(π))+bn=EΘ(πn)(g(π))
By bounded minimals, there's some λ⊙+b⊙ bound on all of these, so we can isolate another convergent subsequence (the expectation values still limit to the liminf), where the (mn,bn) limit to some (m,b). For the following arguments, we'll use n to denote numbers from our original sequence (ranges over all natural numbers) and j to denote numbers from our convergent subsequence of interest (where the expectations converge to liminf and our sequence of minimizing points converges to a limit point)
First, this (m,b) limit point lies in Θ(π), because it's arbitrarily close to points that are arbitrarily close to Θ(π)(Hausdorff-continuity), so the distance to that set shrinks to 0, and Θ(π) is closed so said point limits to be in it. Now, we can go
By mj(g(π))+bj=EΘ(πj)(g(π)), and the j's making a subsequence where we attain the liminf value in the limit, and then the second equality is a convergent sequence of a-measures having their expectation value limit to the expectation value of the limit point.
But then we get something impossible. (m,b)∈Θ(π), and yet somehow (by our original assumption that the liminf undershot the expectation value of g(π) in Θ(π)),
Which cannot be. This shows that the liminf is ≥EΘ(π)(g(π)).
Now for phase 4. Again, from the proof sketch, phases 1, 2, and 3 establish that π↦EΘ(π)(g(π)) is a continuous function Π→R≥0 Let's abbreviate this function as χ. Since we're mapping Π (which is compact) through a continuous function, the image χ(Π) is compact. Thus, it has a maximum value, which is attained by some policy. Take that maximum value (it's a single point so it's closed), take the preimage (which is a nonempty closed set of policies), and that's your argmaxπEΘ(π)(g(π)) set. And EΘ(π)(g(π)) unpacks as min(m,b)∈Θ(π)∩NF(m(g(π))+b)Thus showing our result.
A quick corollary of it is, if g just returns the constant 1 function, you can find a policy π where min(λμ,b)∈Θ(π)(λ+b) is 1, by normalization, so we can use max in normalization instead of sup.
Lemma 25:In the nirvana-free setting, with all the Bi⊆FNF(πpa) being nonempty and upper complete, then EζBi is upper-complete.
The proof of this is nearly identical to the proof of Lemma 12. Except in this case, our Mi aren't finitely many points selected from a nonconvex B, they're countably many points selected from the various Bi. Apart from that difference, the proof path works as it usually does.
Lemma 26:A belief function Θ fulfilling all conditions except normalization, which is renormalized by subtracting infπEΘ(π)(0) and scaling by (supπEΘ(π)(1)−infπEΘ(π)(0))−1, only has the renormalization fail if, for all π, Θ(π)∩NF has a single minimal point of the form (0,b), with the same b for all π. (works in the sur-case too)
First, fixing an arbitrary π′, then, if there's a divide-by-zero when scaling,
EΘ(π′)(1)≤maxπEΘ(π)(1)=minπEΘ(π)(0)≤EΘ(π′)(0)
However, EΘ(π′)(1)≥EΘ(π′)(0) always, so the two terms are equal.
Now, just invoke Proposition 6 from Section 1, which says that if EΘ(π′)∩NF(1)=EΘ(π′)∩NF(0), then there's a single minimal point of the form (0,b), and the b is EΘ(π′)∩NF(0), which is the same for all policies π′. The converse is, if all Θ(π) are of this form, then renormalization fails.
Let's define "nontriviality" for a belief function Θ. A Θ is nontrivial if there exists some π where EΘ(π)(1)≠EΘ(π)(0)
In other words, there's some policy you can feed in where the minimal points of Θ(π) aren't just a single (0,b) point. This is a very weak condition. Also, for the upcoming proposition 6, mixing just doesn't interact well with nirvana-free consistency, so we have to do it just for pseudocausal and acausal hypotheses.
Proposition 6:For pseudocausal and acausal hypotheses Θi where ∑iζiλ⊙i<∞ and there exists a nontrivial Θi, then mixing them and renormalizing produces a pseudocausal or acausal hypothesis.
Mixing is defined on the infinite levels by (EζΘi)(π)=Eζ(Θi(π)), and is then extended down to the finite levels by the usual process of projecting down and taking the closed convex hull. Then, we can renormalize if we wish. We'll distinguish these by "raw" or "renormalized" mix. Thus, the only conditions we need to check are the infinite conditions. For everything else, we can just go "the infinite conditions work, so we can extend to a full belief function" by the Isomorphism theorem.
If you want what mixing does at finite levels, it's ¯¯¯¯¯¯¯¯c.h(⋃π>πst(Eζ(prπ,πst∗(Θi(π))))) So it isn't "mix all the finite levels", it's "mix all the projections individually and then take convex hull"
Proof sketch: Neglecting normalization (because Lemma 24 shows we can just renormalize and all nice conditions are preserved), we just need to verify all the relevant infinitary conditions, and then we can extend to lower levels by isomorphism, and get our result. We also need to show that nontriviality implies that the renormalization doesn't fail, but that's easy. As for the conditions, our lemmas let us get most of them with little trouble. Bounded-minimals from just the λ⊙ bound is slightly more difficult and relies on showing that (0,1) is in all the Θi(π) sets regardless of i and π by normalization to eliminate the b term, and Hausdorff-continuity is also fairly nontrivial (we have to split our mix into three pieces and bound each one individually via a different argument) and relies on the same (0,1) is in all the Θi(π) result. For causality, we'll knock it out with Tychonoff in a slightly more complicated way than usual so we just use a countable product and don't have to invoke the full Axiom of Choice.
We'll take a detour and show that (0,1)∈EζΘi(π) for all π, we need this in a few places. First, pick an arbitrary i and π and look at Θi(π). Find a minimal point (m,b) that minimizes m(1)+b. Now, consider (m,b)+(−m,m(1)). This is (0,m(1)+b). However...
(by (m,b) minimizing m(1)+b, and normalization, respectively)
So, by nirvana-free upper-completion for Θi, we get the point (0,1)∈Θi(π) for all i and π. Then, mixing these gets that (0,1) lies in (EζΘi)(π).
1: Infinitary Nirvana-Free Nonemptiness, it's easy, all our (EζΘi)(π) contain the a-measure (0,1), which is nirvana free.
2,3: Closure, convexity: Closure follows from the proof of closure in Proposition 11 of section 1, and we mixed convex sets so the mixture is convex.
4: Nirvana-free Upper-Completion: Just invoke Lemma 25.
5: Bounded minimals: Since (0,1)∈(EζΘi)(π) for all i, any a-measure with b≥1 isn't minimal (add whatever you want to (0,1)), so we have a bound on the b values of minimals. What about a bound on the λ values?
Well, just take a M in (EζΘi)(π) with b<1, and split it into a mixture of Mi points from the Θi(π). Then, by bounded minimality for the Θi, we can take each Mi and find a minimal point below it that fulfills the λ⊙i bound on the λ values. Mixing those minimals produces a point Mlo that's below M, and has a λ value below Eζλ⊙i, which, by assumption, is finite. So, every minimal point in any (EζΘi)(π) has a λ+b value below Eζλ⊙i+1 and we have bounded-minimals.
Nirvana-free consistency is something we'll have to loop back to after Hausdorff-continuity.
Condition 8: Hausdorff-continuity:
Pick an arbitrary M∈(EζΘi)(π). It shatters into Mi∈Θi(π). We'll be showing the Lemma 15 variant, which is that for all ϵ, there's a δ where if d(π′,π)<δ, then there's a point in (EζΘi)(π′) that's ϵ(1+λ) away.
First, we'll shuffle around what our Mi are supposed to be, we need a certain decomposition to make it work. Reindex your probability distribution so the highest-probability thing is assigned to i=0. All the Mi can be decomposed as a Mmini+M∗i. Now, let our new Mi for i>0 be defined as: Mmini+(m∗−i,−m∗−i(1)),
and our Mi for i=0 is defined as: Mmin0+(m∗−0,−m∗−0(1))+∑iζiζ0(m∗+i,b∗i+M∗−i(1))
These new Mi still lie in Θi(π)∩NF (nirvana-free upper-completion), and they're all a-measures (the negative part isn't enough to cancel out the positive measure on mmini otherwise our old Mi wouldn't be an a-measure). Further, mixing them together still makes M, and if i>0, then λi≤λ⊙i (because we start off at a minimal point with λi≤λ⊙i, and then add something to it that saps some measure from it).
Fixing some ϵ... well, Eζλ⊙i<∞, so find a j where ∑i>jζi(λ⊙i+1)<ϵ3. For i≤j... well, using the Lemma 15 variant of Hausdorff-continuity, note that fixing a δ gets you a different ϵi value for Hausdorff-continuity of each Θi. We only have to worry about the i≤j, though, and there's finitely many. So, pick a δ where the induced ϵ0 is below ϵ3, and for all 0<i≤j, ϵi<ϵEζλ⊙i+1.
So... we take our Mi in Θi(π), and go to nearby M′i in Θi(π′). We should break down exactly how this is done. For M0, the λ value relative to M is at most λζ0 (in the degenerate case where it contributes all the measure to the mixture M), so, by Hausdorff-continuity for Θ0, the gap between M0 and M′0 is at most ϵ3(1+λζ0) because we picked δ low enough to get that scale factor on the front.
For Mi where 0<i≤j, the gap between Mi and M′i is at most ϵ3(Eζλ⊙i+1)(1+λ⊙i) Because we picked δ low enough to guarantee that scale factor on the front, and Mi is made by adding a minimal point and an sa-measure where the measure component is entirely negative, so λ⊙i is a bound on the λ value for Mi.
And finally, for i>j, we can specially craft a M′i where the gap between Mi and M′i is at most λ⊙i+1. This is because Mi has measure below λ⊙i due to being a minimal point that lost some measure. So, we can expend λ⊙i effort to completely reshuffle it however we wish, and then add (0,1) to our reshuffled a-measure to make an a-measure that lies above (0,1), which must be in Θi(π′), so our reshuffled a-measure plus (0,1) lies in Θi(π′) by nirvana-free upper-completion, and we only had to spend λ⊙i+1 effort to travel to it (first term is the reshuffling, second term is adding 1 to the b term)
Now, let's analyze the distance between M and the point EζM′i which lies in (EζΘi)(π′). d(M,EζM′i) equals...
9: Nirvana-free consistency: Invoke Lemma 20, notice that we're in the nirvana-free setting, we're done.
Pseudocausality. Fix some M∈(EζΘi)(π), whose support is a subset of FNF(π′). M shatters into Mi∈Θi(π). All of them have their support being a subset of FNF(π′) (otherwise there'd be measure-mass outside of there), so all the Mi transfer over to Θi(π′) by pseudocausality for the Θi, and then we can mix them back together to get M∈(EζΘi)(π′).
And we're done, mixing works just fine, as long as we can show that the renormalization preserves everything and the renormalization doesn't fail. Renormalization fails iff (By Lemma 26), for all π, then E(EζΘi)(π)(1)=E(EζΘi)(π)(0)
We can then go E(EζΘi)(π)(1)=EEζ(Θi(π))(1)=Eζ(EΘi(π)(1)) and similar for 0, (by the definition of the mix of belief functions and Proposition 10 in Section 1) and then we can use that E(EζΘi)(π)(1)−E(EζΘi)(π)(0)=0 to go
Eζ(EΘi(π)(1))−Eζ(EΘi(π)(0))=0
∑iζi(EΘi(π)(1)−EΘi(π)(0))=0
So, then, for all π and i, EΘi(π)(1)=EΘi(π)(0) However, this is incompatible with the existence of a nontrivial Θi, because nontriviality just says that there's a π where EΘi(π)(1)≠EΘi(π)(0).
So, nontriviality for some Θi(π) means that the renormalization of your mix can be done. It's a very weak condition, just saying "there's some possibility of starting with a hypothesis (Θi) which has a policy π, where murphy actually has to pay attention to what function you're maximizing.
Now, we just need to show that renormalization preserves all nice conditions. Just invoke isomorphism to complete to a full psuedo/acausal belief function lacking normalization, apply Lemma 24 and renormalize everything, and we're done.
Proposition 7:For pseudocausal and acausal hypotheses,E(EζΘi)(πpa)(f)=Eζ(EΘi(πpa)(f))
Proof: E(EζΘi)(πpa)(f)=EEζ(Θi(πpa))(f)=Eζ(EΘi(πpa)(f)) by Proposition 10 of Section 1.
Proposition 8:For pseudocausal and acausal hypotheses,
This is an easy one. If M∈prπhipa,πlopa∗((EζΘi)(πhipa)), then there's a preimage point M′∈(EζΘi)(πhipa), which then decomposes into M′i∈Θi(πhipa). These M′i project down to Mi, which then mix to make M (project then mix equals mix then project because of linearity) witnessing that M∈Eζ(prπhipa,πlopa∗(Θi(πhipa)))
Now for the reverse direction. If M∈Eζ(prπhipa,πlopa∗(Θi(πhipa))), then it shatters into Mi∈prπhipa,πlopa∗(Θi(πhipa)), which have preimage points M′i∈Θi(πhipa). The M′i mix to make a M′∈(EζΘi)(πhipa), which projects down to M, by project-mix equaling mix-project, witnessing that M∈prπhipa,πlopa∗((EζΘi)(πhipa)).
Previous proof post is here.
Theorem 2: Isomorphism Theorem: For (causal, pseudocausal, acausal, surcausal) Θst or Θω which fulfill finitary or infinitary analogues of all the defining conditions, ↑(Θst) and ↓(Θω) are (causal, pseudocausal, acausal, surcausal) hypotheses. Also, ↑ and →st define an isomorphism between Θ and Θst, and ↓ and →ω define an isomorphism between Θ and Θω.
Proof sketch: The reason this proof is so horrendously long is that we've got almost a dozen conditions to verify, and some of them are quite nontrivial to show and will require sub-proof-sketches of their own! Our first order of business is verifying all the conditions for a full belief function for ↑(Θst). Then, we have to do it all over again for ↓(Θω). That comprises the bulk of the proof. Then, we have to show that taking a full belief function Θ and restricting it to the infinite/finite levels fulfills the infinite/finite analogues of all the defining conditions for a belief function on policies or policy-stubs, which isn't quite as bad. Once we're done with all the legwork showing we can derive all the conditions from each other, showing the actual isomorphism is pretty immediate from the Consistency condition of a belief function.
Part 1:Let's consider ↑(Θπst). This is defined as: ↑(Θst)(πpa):=⋂πst≤πpa(prπpa,πst∗)−1(Θst(πst))
We'll show that all 9+2 defining conditions for a belief function are fulfilled for ↑(Θst). The analogue of the 9+2 conditions for a Θst is:
1: Stub Nirvana-free Nonemptiness: ∀πst:Θst(πst)∩NF≠∅
2: Stub Closure: ∀πst:Θst(πst)=¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯Θst(πst)
3: Stub Convexity. ∀πst:Θst(πst)=c.h(Θst(πst))
4: Stub Nirvana-free Upper-Completion.
∀πst:Θst(πst)∩NF=((Θst(πst)∩NF)+Msa(FNF(πst)))∩Ma(F(πst))
5: Stub Restricted Minimals:
∃λ⊙,b⊙∀πst:(λμ,b)∈(Θst(πst))min→λ+b≤λ⊙+b⊙
6: Stub Normalization: infπstEΘst(πst)(0)=0 and supπstEΘst(πst)(1)=1
7: Weak Consistency: ∀πlost,πhist≥πlost:prπhist,πlost∗(Θst(πhist))⊆Θst(πlost)
8: Stub Extreme Point Condition: for all M,πlost:
M∈(Θst(πlost))xmin∩NF→
∃π≥πlost∀πhist:(π>πhist≥πlost→(∃M′∈Θst(πhist)∩NF:prπhist,πlost∗(M′)=M))
9: Stub Uniform Continuity: The function πst↦(pr∞,πst∗)−1(Θst(πst)∩NF∩{≤⊙}) is uniformly continuous.
C: Stub Causality: ∀πst,M∈Θst(πst)∃of∀π′st:of(πst)=M∧of(π′st)∈Θst(π′st)
where the outcome function of is defined over all stubs.
P: Stub Pseudocausality:
∀πst,π′st:((M∈Θst(πst)∧supp(M)⊆FNF(π′st))→M∈Θst(π′st))
Let's begin showing the conditions. But first, note that since we have weak consistency, we can invoke Lemma 6 to reexpress ↑(Θst)(πpa) as ⋂n(prπpa,πnpa∗)−1(Θst(πnpa)) Where πnpa is the n'th member of the fundamental sequence of πpa.
Also note that, for all stubs, ↑(Θst(πst))=Θst(πst). We'll be casually invoking this all over the place and won't mention it further.
Proof: By Lemma 6 with weak consistency, ↑(Θst(πst))=⋂n≥m(prπst,πnst∗)−1(Θst(πnst)) Now, m can be anything we like, as long as it's finite. Set m to be larger than the maximum timestep that the stub is defined for. Then πnst=πst no matter what n is (since it's above m) and projection from a stub to itself is identity, so the preimage is exactly our original set Θst(πst).
We'll also be using another quick result. For all stubs πhist≥πlost, given stub causality,
prπhist,πlost∗(Θst(πhist))=Θst(πlost)
Proof: Fix an arbitrary point M∈Θst(πlost). By causality, we get an outcome function which includes M, giving us that there's something in Θst(πhist) that projects down onto M. Use weak consistency to get the other subset direction.
Condition 1: Nirvana-free Nonemptiness.
Invoking Stub Nirvana-free Nonemptiness, ∀πst:Θst(πst)∩NF≠∅ so we get nirvana-free nonemptiness for ↑(Θst)(πst).
Now, assume πpa is not a stub. By stub-bounded-minimals, there is some λ⊙+b⊙ bound on the set of minimal points, regardless of stub. Let (Θst(πnpa))clip be Θst(πnpa)∩{≤⊙}∩NF
This contains all the minimal nirvana-free points for πnpa. This set is nonempty because we have stub nirvana-free nonemptiness, so a nirvana-free point M exists. We have stub-closure and stub minimal-boundedness, so we can step down to a minimal nirvana-free point below M, and it obeys the λ⊙+b⊙ bound.
Further, by weak consistency and projection preserving λ and b and nirvana-freeness,
prπn+1pa,πnpa∗((Θst(πn+1pa))clip)⊆(Θst(πnpa))clip
Invoking Lemma 9, the intersection of preimages of these is nonempty. It's also nirvana-free, because if there's nirvana somewhere, it occurs after finite time, so projecting down to some sufficiently large finite stage preserves the presence of Nirvana, but then we'd have a nirvana-containing point in a nirvana-free set (Θst(πnpa))clip, which is impossible. This is also a subset of the typical intersection of preimages used to define ↑(Θst)(πpa). Pick an arbitrary point in said intersection of preimages of clipped subsets.
Bam, we found a nirvana-free point in ↑(Θst)(πpa) and we're done.
Time for conditions 2 and 3, Closure and Convexity. These are easy.
↑(Θst)(πpa)=⋂n(prπpa,πnpa∗)−1(Θst(πnpa))
The preimage of a closed set (stub-closure) is a closed set, and the intersection of closed sets is closed, so we have closure.
Also, prπpa,πnst∗ is linear, so the preimage of a convex set (stub-convexity) is convex, and we intersect a bunch of convex sets so it's convex as well.
Condition 4: Nirvana-free upper completion.
Let M∈↑(Θst)(πpa)∩NF. Let's check whether M+M∗ (assuming that's an a-measure and M∗ is nirvana-free) also lies in the set. A sufficient condition on this given how we defined things is that for all πnpa, prπpa,πnpa∗(M+M∗)∈Θst(πnpa), as that would certify that M+M∗ is in all the preimages.
prπpa,πnpa∗ is linear, so prπpa,πnpa∗(M+M∗)=prπpa,πnpa∗(M)+prπpa,πnpa∗(M∗)
The first component is in Θst(πst), obviously. And then, by stub nirvana-free-upper-completion, we have a nirvana-free a-measure plus a nirvana-free sa-measure (projection preserves nirvana-freeness), making a nirvana-free a-measure (projection preserves a-measures), so prπpa,πnpa∗(M)+prπpa,πnpa∗(M∗) is in Θst(πnpa)∩NF, and we're done.
Condition 5: Bounded-Minimals
So, there is a critical λ⊙+b⊙ value by restricted-minimals for Θst
Fix a πpa, and assume that there is a minimal point M in ↑(Θst)(πpa) with a λ+b value that exceeds the bound. Project M down into each Θst(πnpa). Projection preserves λ and b so each of these projected points Mn lie above some Mminn.
Now, invoke Lemma 7 to construct a M′n∈Ma(F(πpa)) (or the nirvana-free variant) that lies below M, and projects down to Mminn. Repeat this for all n. All these M′n points are a-measures and have the standard λ⊙+b⊙ bound so they all lie in a compact set and we can extract a convergent subsequence, that converges to M′, which still obeys the λ⊙+b⊙ bound.
M′ is below M because ({M}−Msa(F(πpa)))∩Ma(F(πpa)) (or the nirvana-free variant) is a closed set. Further, by Lemma 10, M′ is in the defining sequence of intersections for ↑(Θst)(πpa). This witnesses that M isn't minimal, because we found a point below it that actually obeys the bounds. Thus, we can conclude minimal-point-boundedness for ↑(Θst).
Condition 6: Normalization. We'll have to go out of order here, this can't be shown at our current stage. We're going to have to address Hausdorff continuity first, then consistency, and solve normalization at the very end. Let's put that off until later, and just get extreme points.
Condition 8: Extreme point condition:
The argument for this one isn't super-complicated, but the definitions are, so let's recap what condition we have and what condition we're trying to get.
Condition we have: for all M,πlost:
M∈(Θst(πlost))xmin∩NF→
∃π≥πlost∀πhist:(π>πhist≥πlost→(∃M′∈Θst(πhist)∩NF:prπhist,πlost∗(M′)=M))
Condition we want: for all M,πst,
M∈(Θst(πst))xmin∩NF→∃π>πst,M′:M′∈↑(Θst)(π)∩NF∧prπ,πst∗(M′)=M
Ok, so M∈(Θst(πst))xmin∩NF By the stub extreme point condition, there's a π≥πst, where, for all πhist that fulfill π>πhist≥πst, there's a M′∈Θst(πhist)∩NF, where prπhist,πst∗(M′)=M.
Lock in the π we have. We must somehow go from this to a M′∈Θ(π)∩NF that projects down to our point of interest. To begin with, let πn be the n'th member of the fundamental sequence for π. Past a certain point m, these start being greater than πst. The M′∈Θst(πn)∩NF which projects down to M that we get by the stub-extreme-point condition will be called M′lon. Pick some random-ass point in (prπ,πnst∗)−1(M′lon) and call it M′hin.
M′hin all obey the λ and b values of M, because it projects down to M. We get a limit point of them, M′, and invoking Lemma 10, it's also in ↑(Θst)(π). It also must be nirvana-free, because it's a limit of points that are nirvana-free for increasingly late times. It also projects down to M because the sequence M′hin was wandering around in the preimage of M, which is closed.
Condition 9: Hausdorff Continuity:
Ok, this one is going to be fairly complicated. Remember, our original form is:
"The function πst↦(pr∞,πst∗)−1(Θst(πst)∩NF∩{≤⊙}) is uniformly continuous"
And the form we want is:
"The function πpa↦(pr∞,πpa∗)−1(↑(Θst)(πpa)∩NF∩{≤⊙}) is uniformly continuous"
Uniform continuity means that if we want an ϵ Hausdorff-distance between two preimages, there's a δ distance between partial policies that suffices to produce that. To that end, fix our ϵ. We'll show that the δ we get from uniform continuity on stubs suffices to tell us how close two partial policies must be.
So, we have an ϵ. For uniform continuity, we need to find a δ where, regardless of which two partial policies πpa and π′pa we select, as long as they're δ or less apart, the sets (pr∞,πpa∗)−1(↑(Θst)(πpa)∩NF∩{≤⊙}) (and likewise for π′pa) are only ϵ apart. So, every point in the first preimage must have a point in the second preimage only ϵ distance away, and vice-versa. However, we can swap πpa and π′pa (our argument will be order-agnostic) to establish the other direction, so all we really need to do is to show that every point in the preimage associated with πpa is within ϵ of a point in the preimage associated with π′pa.
First, m:=logγ(δ) is the time at which two partial policies δ apart may start differing. Conversely, any two partial policies which only disagree at-or-after m are δ apart or less. Let π∗st be the policy stub defined as follows: take the inf of πpa and π′pa (the partial policy which is everything they agree on, which is going to perfectly mimic both of them up till time m), and clip things off at time m to make a stub. This is only δ apart from πpa and π′pa, because it perfectly mimics both of them up till time m, and then becomes undefined (so there's a difference at time m) Both πpa and π′pa are ≥π∗st.
Let M be some totally arbitrary point in (pr∞,πpa∗)−1(Θst(πpa)∩NF∩{≤⊙}). M is also in (pr∞,π∗st∗)−1(Θst(π∗st)∩NF∩{≤⊙}), because M projects down to some point in Θst(π∗st) that's nirvana-free.
Let π′npa, where n≥m, be the n'th stub in the fundamental sequence for π′pa. These form a chain starting at π∗st and ascending up to π′pa, and are all δ distance from π∗st.
Anyways, in Ma(∞), we can make a closed ball B≤ϵ of size ϵ around M. This restricts λ and b to a small range of values, so we can use the usual arguments to conclude that B≤ϵ is compact.
Further, because π∗st is δ or less away from π′npa, the two sets (pr∞,π∗st∗)−1(Θst(π∗st)∩NF∩{≤⊙}) and (pr∞,π′npa∗)−1(Θst(π′npa)∩NF∩{≤⊙}) are within ϵ of each other, so there's some point of the latter set that lies within our closed ϵ-ball.
Consider the set ⋂n≥m((pr∞,π′npa∗)−1(Θst(π′npa)∩NF∩{≤⊙})∩B≤ϵ)
the inner intersection is an intersection of closed and compact sets, so it's compact. Thus, this is an intersection of an infinite family of nonempty compact sets. To check the finite intersection property, just observe that since preimages of the sets ↑(Θst)(π′npa)∩NF∩{≤⊙} get smaller and smaller as n increases due to weak-consistency but always exist.
Pick some arbitrary point M′ from the intersection. it's ≤ϵ away from M since it's in the ϵ-ball. However, we still have to show that M′ is in (pr∞,π′pa∗)−1(↑(Θst)(π′pa)∩NF∩{≤⊙}) to get Hausdorff-continuity to go through.
To begin with, since M′ lies in our big intersection, we can project it down to any Θst(π′npa)∩NF∩{≤⊙}. Projecting it down to stage n makes M′n. Let M′∞ be the point in ↑(Θst)(π′pa)∩NF∩{≤⊙} defined by: ⋂n≥m(prπ′pa,πnst∗)−1(M′n)
Well, we still have to show that this set is nonempty, contains only one point, and that it's in ↑(Θst)(π′pa), and is nirvana-free, to sensibly identify it with a single point.
Nonemptiness is easy, just invoke Lemma 9. It lies in the usual intersections that define ↑(Θst)(π′pa), so we're good there. If it had nirvana, it'd manifest at some finite point, but all finite projections are nirvana-free, so it's nirvana-free. If it had more than one point in it, they differ at some finite stage, so we can project to a finite π′npa to get two different points, but they both project to M′n, so this is impossible. Thus, M′∞ is a legit point in the appropriate set. If the projection of M′ didn't equal M′∞, then we'd get two different points, which differ at some finite stage, so we could project down to separate them, but they both project to M′n for all n so this is impossible.
So, as a recap, we started with an arbitrary point M in (pr∞,πpa∗)−1(↑(Θst)(πpa))∩{≤⊙}, and got another point M′ that's only ϵ or less away and lies in (pr∞,π′pa∗)−1(↑(Θst)(π′pa))∩{≤⊙} This argument also works if we flip πpa and π′pa, so the two preimages are only ϵ or less apart in Hausdorff-distance.
So, given some ϵ, there's some δ where any two partial policies which are only δ apart have preimages only ϵ apart from each other in Hausdorff-distance. And thus, we have uniform continuity for the function mapping πpa to the set of a-measures over infinite histories which project down to ↑(Θst)(πpa)∩NF∩{≤⊙} Hausdorff-continuity is done.
Condition 7: Consistency.
Ok, we have two relevant things to check here. The first, very easy one, is that
↑(Θst)(πpa)=⋂πst≤πpa(prπpa,πst∗)−1(↑(Θst)(πst))
From earlier, we know that ↑(Θst)(πst)=Θst(πst), and from how ↑(Θst)(πpa) is defined, this is a tautology.
The other, much more difficult direction, is that
↑(Θst)(πpa)=¯¯¯¯¯¯¯¯c.h(⋃π≥πpaprπ,πpa∗(↑(Θst)(π)))
We'll split this into four stages. First, we'll show one subset direction holds in full generality. Second, we'll get the reverse subset direction for causal/surcausal. Third, we'll show it for policy stubs for pseudocausal/acausal, and finally we'll use that to show it for all partial policies for pseudocausal/acausal.
First, the easy direction. ↑(Θst)(πpa)⊇¯¯¯¯¯¯¯¯c.h(⋃π≥πpaprπ,πpa∗(↑(Θst)(π)))
If we pick an arbitrary M∈↑(Θst)(π), it projects down to Θst(πst) for all stubs πst below π. Since πpa≤π, it projects down to all stubs beneath πpa. Since projections commute, M projected down into Ma(F(πpa)) makes a point that lies in the preimage of all the Θst(πst) where πst≤πpa, so it projects down into ↑(Θst)(πpa).
This holds for all points in ↑(Θst)(π), so prπ,πpa∗(↑(Θst)(π))⊆↑(Θst)(πpa). This works for all π≥πpa, so it holds for the union, and then due to closure and convexity which we've already shown, we get that the closed convex hull of the projections lies in ↑(Θst)(πpa) too, establishing one subset direction in full generality.
Now, for phase 2, deriving ↑(Θst)(πpa)⊆¯¯¯¯¯¯¯¯c.h(⋃π≥πpaprπ,πpa∗(↑(Θst)(π))) in the causal/surcausal case.
First, observe that if π≥πpa, then πn≥πnpa. Fix some M∈↑(Θst)(πpa) and arbitrary π≥πpa. We'll establish the existence of a M′∈↑(Θst)(π) that projects down to M.
To begin with, M projects down to Mn in Θst(πnpa). Lock in a value for n, and consider the sequence that starts off M0,M1...Mn, and then, by causality for stubs and πn≥πnpa, you can find something in Θst(πn) that projects down onto Mn, and something in Θst(πn+1) that projects down onto that, and complete your sequence that way, making a sequence of points that all project down onto each other that climb up to π. By Lemma 10, we get a M′n∈↑(Θst)(π). You can unlock n now. All these M′n have the same λ and b value because projection preserves them, so we can isolate a convergent subsequence converging to some M′∈↑(Θst)(π).
Assume prπ,πpa∗(M′)≠M. Then we've got two different points. They differ at some finite stage, so there's some n where can project down onto Ma(F(πnpa)) to witness the difference, but from our construction process for M′, both M′ and M project down to Mn, and we get a contradiction.
So, since prπ,πpa∗(↑(Θst)(π))=↑(Θst(πpa)), this establishes the other direction, showing equality, and thus consistency, for causal/surcausal hypotheses.
For part 3, we'll solve the reverse direction for pseudocausal/acausal hypotheses in the case of stubs, getting ↑(Θst)(πst)⊆¯¯¯¯¯¯¯¯c.h(⋃π≥πpaprπ,πst∗(↑(Θst)(π)))
Since we're working in the Nirvana-free case and are working with stubs, we can wield Lemma 3
c.h((Θst(πst))min)=c.h((Θst(πst))xmin)
So, if we could just show that the union of the projections includes all the extreme minimal points, then when we take convex hull, we'd get the convex hull of the extreme minimal points, which by Lemma 3, would also nab all the minimal points as well. By Lemmas 11 and 12, our resulting convex hull of a union of projections from above would be upper-complete. It would also get all the minimal points, so it'd nabs the entire Θst(πst) within it and this would show the other set inclusion direction for pseudocausal/acausal stubs. Also, we've shown enough to invoke Lemma 20 to conclude that said convex hull is closed. Having fleshed out that argument, all we need that all extreme minimal points are captured by the union of the projections.
By our previously proved extreme minimal point condition, for every extreme minimal point M in Θst(πst), there's some π>πst and M′ in ↑(Θst)(π) that projects down to M, which shows that all extreme points are included, and we're good.
For part 4, we'll show that in the nirvana-free pseudocausal/acausal setting, we have
↑(Θst)(πpa)⊆¯¯¯¯¯¯¯¯c.h(⋃π≥πpaprπ,πpa∗(↑(Θst)(π)))
Fix some arbitrary M∈↑(Θst)(πpa). Our task is to express it as a limit of some sequence of points that are mixtures of stuff projected from above.
For this, we can run through the same exact proof path that was used in the part of the Lemma 21 proof about how the nirvana-free part of Θ(πpa) is a subset of closed convex hull of the projections of the nirvana-free parts of Θ(π), π≥πpa. Check back to it. Since we're working in the Nirvana-free case, we can apply it very straightforwardly. The stuff used in that proof path is the ability to project down and land in ↑(Θst)(πnpa) (we have that by how we defined ↑), Hausdorff-continuity (which we have), and stubs being the convex hull, not the closed convex hull, of projections of stuff above them (which we mentioned recently in part 3 of our consistency proof).
Thus, consistency is shown.
Condition 6: Normalization.
Ok, now that we have all this, we can tackle our one remaining condition, normalization. Then move on to the two optional conditions, causality and pseudocausality.
A key thing to remember is, in this setting, when you're doing EH(f), it's actually EH∩NF(f), because if Murphy picked a thing with nirvana in it, you'd get infinite value, which is not ok, so Murphy always picks a nirvana-free point.
Let's show the first part, that infπE↑(Θst)(π)(0)=0. This unpacks as:infπmin(λμ,b)∈↑(Θst)(π)∩NFb=0 and we have that infπstmin(λμ,b)∈Θst(πst)∩NFb=0
Projections preserves b values, so we can take some nirvana-free point with a b value of nearly 0, and project it down to Θst(π∅) (belief function of the empty policy-stub) where there's no nirvana possible because no events happen.
So, we've got b values of nearly zero in there. Do we have a point with a b value of exactly zero? Yes. it's closed, and has bounded minimals, so we can go "all positive functionals are minimized by a minimal point", to get a point with a b value of exactly zero. Then, we can invoke Lemma 21 (we showed consistency, extreme points, and all else required to invoke it) to decompose our point into the projections of nirvana-free stuff from above, all of which must have a b value of 0. So, there's a nirvana-free point in some policy with 0 b value.
Now for the other direction. Let's show that supπE↑(Θst)(π)(1)=1.
This unpacks as: supπmin(λμ,b)∈↑(Θst)(π)∩NF(λ+b)=1
We'll show this by disproving that the sup is <1, and disproving that the sup is >1.
First, assume that, regardless of π, min(λμ,b)∈↑(Θst)(π)∩NF(λ+b)≤1−ϵ Then, regardless of πst we can pick some totally arbitrary π>πst, and there's a nirvana-free point with a λ+b value of 1−ϵ or less. By consistency, we can project it down into Θ(πst), to get a nirvana-free point with a λ+b value of 1−ϵ or less. Thus, regardless of the stub we pick, there's nirvana-free points where Murphy can force a value of 1−ϵ or less, which contradicts supπstmin(λμ,b)∈Θst(πst)∩NF(λ+b)=1
What if it's above 1? Assume there's some π where min(λμ,b)∈↑(Θst)(π)∩NF(λ+b)≥1+ϵ
From uniform-continuity-Hausdorff, pick some δ to get a ϵ2 Hausdorff-distance or lower (for stuff obeying the λ⊙+b⊙ bound, which all minimal points of ↑(Θst)(π)∩NF do for all π). This δ specifies some extremely large n, consider πn. Now, consider the set of every policy π′ above πn. All of these are δ or less away from π. Also, remember that the particular sort of preimage-to-infinity that we used for Hausdorff-continuity slices away all the nirvana.
So, Murphy, acting on π, can only force a value of 1+ϵ or higher. Now, there can be no nirvana-free point in ↑(Θst)(π′) with λ+b<1+ϵ2. The reason for this is that, since π′ is δ or less away from π, there's a nirvana-free point in ↑(Θst)(π) that's ϵ2 away, and thus has λ+b<1+ϵ, which is impossible.
Ok, so all the nirvana-free points in ↑(Θst)(π′) where π′>πnst have λ+b≥1+ϵ2.
Now, since we have Lemma 21, we can go "hm, Θst(πn) equals the convex hull of the projections of ↑(Θst)(π′)∩NF. Thus, any minimal point with λ+b≤1 is a finite mix of nirvana-free stuff from above, one of which must have λ+b≤1. But we get a contradiction with the fact that there's no nirvana-free point from π′ above πn with a λ+b value that low, they're all ≥1+ϵ2"
So, since we've disproved both cases, supπE↑(Θst)(π)(1)=1. And we're done with normalization! On to causality and pseudocausality.
Condition C: Causality.
An "outcome function" of for ↑(Θst)(πpa) is a function that maps a πpa to a point in ↑(Θst)(πpa), s.t. for all πhipa,πlopa:prπhipa,πlopa∗(of(πhipa))=of(πlopa).
Causality is, if you have a M∈↑(Θst)(πpa), you can always find an outcome function of where of(πpa)=M. Sadly, all we have is causality over stubs. We'll be using the usual identification between ↑(Θst)(πst) and Θst(πst).
Anyways, fix a πpa and a point M in ↑(Θst)(πpa). Project M down to get a sequence Mn∈Θst(πnpa). By causality for stubs, we can find an ofn where, for all πst, ofn(πnpa)=Mn. Observe that there are countably many stubs, and no matter the n, all the λ and b values are the same because projection preserves those. We can view ofn as a sequence in
∏πst(Θst(πst)∩{(λ′μ,b′)|λ′=λ∧b′=b})
By stub closure, and a λ and b bound, this is a product of compact sets, and thus compact by Tychonoff (no axiom of choice needed, its just a countable product of compact metric spaces) so we can get a limiting ofst (because it's only defined over stubs).
An outcome function for stubs fixes an outcome function for all partial policies, by
of(πpa)=⋂n(prπpa,πnpa∗)−1(ofst(πnpa))
We've got several things to show now. We need to show that ofst is an outcome function, that of is well-defined, that of(πpa)=M, and that it's actually an outcome function.
For showing that ofst is an outcome function, observe that projection is continuous, and, letting n index our convergent subsequence of interest, regardless of stub πst, limn→∞ofn(πst)=ofst(πst). With this,
prπhist,πlost∗(ofst(πhist))=prπhist,πlost∗(limn→∞ofn(πhist))=limn→∞prπhist,πlost∗(ofn(πhist))
=limn→∞ofn(πlost)=ofst(πlost)
Now, let's show that of is well-defined. Since ofst is an outcome function, all the points project down onto each other, so we can invoke Lemma 9 to show that the preimage is nonempty. If the preimage had multiple points, we could project down to some finite stage to observe their difference, but nope, they always project to the same point. So it does pick out a single well-defined point, and it lies in ↑(Θst)(πpa) by being a subset of the defining sequence of intersection of preimages.
Does of(πpa)=M? Well, M projected down to all the Mn. If n≥m, then ofn(πmpa)=prπnpa,πmpa∗(ofn(πnpa))=prπnpa,πmpa∗(Mn)=Mm So, the limit specification ofst has ofst(πnpa)=Mn for all n. The only thing that projects down to make all the Mn is M itself, so of(πpa)=M.
Last thing to check: Is of an outcome function over partial policies? Well, if πhipa≥πlopa, then for all n, πhi,npa≥πlo,npa. Assume prπhipa,πlopa∗(of(πhipa))≠of(πlopa). Then, in that case, we can project down to some πlo,npa and they'll still be unequal. However, since projections commute, it doesn't matter whether you project down to πlopa and then to πlo,npa, or whether you project down to πhi,npa (making ofst(πhi,npa)), and then project down to πlo,npa (making ofst(πlo,npa)). Wait, hang on, this is the exact point that of(πlopa) projects down to, contradiction. Therefore it's an outcome function.
And we're done, we took an arbitrary πpa and M∈↑(Θst)(πpa), and got an outcome function of with of(πpa)=M, showing causality.
Condition P: Pseudocausality: If (m,b)∈↑(Θst)(πpa), and m's support is on Ma(FNF(π′pa)), then (m,b)∈↑(Θst)(π′pa).
But all we have is, if (m,b)∈Θst(πst) and m's support is on Ma(F(π′st)), then (m,b)∈Θst(π′st).
There's a subtlety here. Our exact formulation of pseudocausality we want is the condition supp(m)⊆F(π′pa), so if the measure is 0, then support is the empty set, which is trivially a subset of everything, then pseudocausality transfers it to all partial policies.
Ok, so let's assume that M∈↑(Θst)(πpa), and the measure part m has its support being a subset of Ma(FNF(π′pa)) but yet is not in ↑(Θst)(π′pa). Then, since this is an intersection of preimages from below, there should be some finite level π′npa that you can project M down to (it's present in Ma(FNF(π′pa)), just maybe not in ↑(Θst)(π′pa)) where the projection of M (call it M′n) lies outside Θst(π′npa) (lying outside the intersection of preimages)
This is basically "take m, chop it off at height n". However, since M∈↑(Θst)(πpa), you can project it down to Θst(πnpa). Which does the exact same thing of chopping m off at height n, getting you M′n exactly. We can invoke stub-pseudocausality (because with full measure, the history will land in F(π′pa), then with full measure, the truncated history will land in F(π′npa) as the latter is prefixes of the former, or maybe the full measure is 0 in which case pseudocausality transfer still works) to conclude that M′ actually lies inside Θst(π′npa), getting a contradiction. This establishes pseudocausality in full generality.
Ok, so we have one direction. ↑(Θst) is a hypothesis, if Θst fulfills analogues of the hypothesis conditions for the finitary stub case. Our proof of everything doesn't distinguish between causal and surcausal, and the arguments work for all types of hypotheses, whether causal, surcausal, pseudocausal, or acausal. Ok, we're 1/4 of the way through. Now we do the same thing, but for building everything from infinitary hypotheses.
PART 2: Infinitary hypotheses. We now consider ↓(Θω), defined as↓(Θω)(πpa)=¯¯¯¯¯¯¯¯c.h(⋃π≥πpaprπ,πpa∗(Θω(π))
We'll show that with 8+2 defining conditions, the 9+2 defining conditions for a hypothesis hold for ↓(Θω). The the 8+2 conditions for a Θω are:
1: Infinitary Nirvana-Free Nonemptiness: ∀π:Θω(π)∩NF≠∅
2: Infinitary Closure: ∀π:Θω(π)=¯¯¯¯¯¯¯¯¯¯¯¯¯¯Θω(π)
3: Infinitary Convexity. ∀π:Θω(π)=c.h(Θω(π))
4: Infinitary Nirvana-free Upper-Completeness
∀π:Θω(π)∩NF=((Θω(π)∩NF)+Msa(FNF(π)))∩Ma(F(π))
5: Infinitary Bounded Minimals: ∃λ⊙,b⊙∀π:(λμ,b)∈(Θω(π))min→λ+b≤λ⊙+b⊙
6: Normalization: infπEΘst(π)(0)=0 and supπEΘst(π)(1)=1
7: Nirvana-free consistency.
∀πst:c.h(⋃π>πstprπ,πst∗(Θω(π)∩NF))=¯¯¯¯¯¯¯¯c.h(⋃π>πstprπ,πst∗(Θω(π)))∩NF
8: Infinitary Uniform Hausdorff Continuity:
The function π↦Θω(π)∩NF∩{≤⊙} is uniformly continuous.
C: Infinitary Causality: Regardless of π and M∈Θω(π), there's an outcome function of over full policies s.t. of(π)=M, and for all π′ and π′′, prπ′,inf(π′,π′′)∗(of(π′))=prπ′′,inf(π′,π′′)∗(of(π′′))
P: Infinitary Pseudocausality: ∀π,π′:(M∈Θω(π),supp(M)⊆FNF(π′)→πst∈Θω(π′)
Let's begin showing the conditions.
Condition 1: Nirvana-free Nonemptiness.
This one is trivial. Pick some π≥πpa. There's a nirvana-free point. Project it down. You get a nirvana-free point and you're done.
Conditions 2 and 3, Closure and convexity. We explicitly took the closed convex hull when defining everything, these are tautological.
Condition 4: Nirvana-free upper completion.
For the pseudo/acausal case, it's doable by Lemmas 10, 11, and 12. The projection of an upper-complete set (by infinitary nirvana-free upper-completion) is upper-complete, so the union of projections is upper-complete, and then the convex hull is upper-complete, and then the closure is upper-complete and we're done.
We'll have to loop back to the causal case of Nirvana-free Upper Completion later, because we need Lemma 21 to make it go through and that requires consistency and the extreme point condition to make it work.
Condition 5: Bounded Minimals.
We can break down into three phases. First is showing that all points in the projection set have something under them that respects the λ⊙+b⊙ bound. Second is showing that all points in the convex hull of the union of projection sets have something under them that respects the λ⊙+b⊙ bound. Third is showing that all points in the closure have something under them that respects the usual bound. The reason we have to phrase it this way is that we don't necessarily know that our sets of interest are closed until the end, so we can't find a minimal point, just a bounded one that is lower, but that suffices to show that a "minimal point" that violates the restricted minimal condition isn't actually minimal.
For part 1, let M∈prπ,πpa∗(Θω(π)). Then, M is the projection of some point M′∈Θω(π). By infinitary bounded-minimals, we can find a minimal point Mmin∈Θω(π) below M′ that obeys the λ⊙+b⊙ bound, so Mmin+M∗=M′. Projecting down is linear, so we get prπ,πpa∗(Mmin)+prπ,πpa∗(M∗)=M, and prπ,πpa∗(Mmin) is below M and fulfills the λ⊙+b⊙ bound.
For part 2, let M∈c.h(⋃π≥πpaprπ,πpa∗(Θω(π))) We can rewrite M as EζMi, and then, by part 1, decompose the Mi into Mloi (not actually minimal, just a point obeying the λ⊙+b⊙ bound) and M∗i. Then we can decompose M further into EζMloi+EζM∗i. The former is an a-measure (mix of a-measures) and obeys the λ⊙+b⊙ bound since all its components do, and it's in the relevant convex hull, witnessing that M has a point below it in the convex hull that obeys the bounds.
For part 3, let M be in the closure of the convex hull. There's some sequence Mn in the convex hull that limits to M. Below each Mn we can find a Mlon (again, not actually minimal) that obeys the λ⊙+b⊙ bound. Invoke Lemma 16 to get a point below M that respects the bounds, and we're done.
Condition 6: Normalization.
We literally have the exact phrasing of normalization we need already, this is a tautology.
Condition 7: Consistency.
Ok, one direction is trivial because ↓(Θω)(π)=Θω(π), so we can just use the definition of ↓. ↓(Θω)(πpa)=¯¯¯¯¯¯¯¯c.h(⋃π≥πpaprπ,πpa∗(↓(Θω)(π)))
The other direction, that everything equals the intersection of preimages of stuff below it, is trickier. One subset direction isn't too bad, the one that
↓(Θω)(πpa)⊆⋂πst≤πpa(prπpa,πst∗)−1(↓(Θω)(πst))
If we take a M∈↓(Θω)(πpa) that wasn't added in the final closure step, it's expressible as EζMi, and all the Mi come from points M∞i in Θω(πi) where πi≥πpa. Projecting the M∞i down to a πst≤πpa instead makes Mloi, which mix together in the same way to make Mlo∈↓(Θω)(πst). Because projections are linear and commute, Mlo is the projection of M. So, any point in ↓(Θω)(πpa) (without the closure step) projects down to lie in ↓(Θω)(πst) for any πst≤πpa.
Then, for the closure step, we just fix a sequence Mn limiting to M. The Mn can project down to whichever ↓(Θω)(πst) you wish, and by continuity of projection, the M comes along for the ride as a limit point. However, ↓(Θω)(πst) is closed, so M projects down to land in that set as well. Bam, any old M∈↓(Θω)(πpa) projects down to land in any ↓(Θω)(πst) set you wish with πst≤πpa, certifying that ↓(Θω)(πpa) lies in the intersection of preimages of stubs below.
Now, we just have to establish ↓(Θω)(πpa)⊇⋂πst≤πpa(prπpa,πst∗)−1(↓(Θω)(πst)) which splits into two cases. The causal/surcausal case, and the pseudocausal/acausal case where you don't have to worry about nirvana.
For the nirvana-free case... We can use the same proof strategy as the last part of Lemma 21, where we were showing the result for partial policies. It may be a bit nonobvious why it works. We do need to swap things around a bit, and will mention important changes without fleshing out the fiddly details, which are already given in the last part of Lemma 21.
Start with a M in the intersection of preimages of stubs below. To show it's in ↓(Θω)(πpa), we need a sequence limiting to it, where each member of the sequence is a mix of finitely many points projected down from policies above πpa. The end part of Lemma 21 gives how to construct such a sequence. The fact that we're working in a nirvana-free setting means you can ignore all fiddly details about points being nirvana-free and preimages of only the nirvana-free parts, because everything fulfills that. The key steps in that proof path are:
1: being able to project down M to make a sequence Mn∈↓(Θω)(πnpa). We trivially have this by M being defined as "in the intersection of preimages of stubs below it".
2: Having uniform Hausdorff-continuity for the policies. This is our condition 8 we're assuming, so we're good there.
3: The ability to shatter our Mn into finitely many Mi,n which are the projections of various M∞i,n points from above. This is the key difference. The proof of Lemma 21 had to set up that fact beforehand. However, in our case, we have the Nirvana-free consistency condition, which says
∀πst:c.h(⋃π>πstprπ,πst∗(Θω(π)∩NF))=¯¯¯¯¯¯¯¯c.h(⋃π>πstprπ,πst∗(Θω(π)))∩NF
But, since we're working in the nirvana-free setting, this turns into:
∀πst:c.h(⋃π>πstprπ,πst∗(Θω(π)))=¯¯¯¯¯¯¯¯c.h(⋃π>πstprπ,πst∗(Θω(π)))
And that right-hand term... is just the definition of ↓(Θω)(πst)! So, swapping that out, and specializing our arbitrary stub to πnpa, we have:
c.h(⋃π>πnpaprπ,πnpa∗(Θω(π)))=↓(Θω)(πnpa)
So, since our Mn lie in ↓(Θω)(πnpa), they can be written as a finite mix of nirvana-free things from above projected down, and the Lemma 21 argument goes through.
Now, for the nirvana cases where we can assume infinitary causality. We'll do this by showing a little sublemma, that, if π≥πpa, then prπ,πpa∗(Θω(π))=↓(Θω)(πpa)
First, we'll show that if π,π′≥πpa, then prπ,πpa∗(Θω(π))=prπ′,πpa∗(Θω(π′))
Fix an arbitrary M in the projection of Θω(π), we can get a preimage point Mhi∈Θω(π). Then, by infinitary causality, we can make a point M′hi∈Θω(π′) that projects down to M. Just make an outcome function of where of(π)=Mhi, feed in π′, that gets you your point M′hi, the two agree when you project them down to inf(π,π′), and πpa is further down than that and projections commute so they both hit the same point M if you project down. Flipping π and π′ shows our equality.
Alright, so now ↓(Θω)(πpa) can be written as ¯¯¯¯¯¯¯¯c.h(prπ,πpa∗(Θω(π))) where π is arbitrary above πpa.
Projection is linear, so the projection of a convex set is convex. To get the closure points, just take a sequence Mn in the projection limiting to some M. Take preimage points M∞n∈Θω(π). There's a bound on the λ and b values of this sequence because projections preserve λ and b and our sequence Mn converges, so we can apply the Compactness Lemma and get a convergent subsequence limiting to a point M∞, which must be in Θω(π) because closure. Projection is continuous, so M∞ projects down to M. And we have prπ,πpa∗Θω(π)=↓(Θω)(πpa) proved! Wow, that was a sublemma of case 2 of part 3 of the proof of condition 7 in part 2 of the proof of the Isomorphism theorem, we're really in the weeds at this point.
Moving on, how can we use this to show ↓(Θω)(πpa)⊇⋂πst≤πpa(prπpa,πst∗)−1(↓(Θω)(πst)) for the causal case, which is the last bit we need to show consistency?
Well, fix a M in the intersection of preimages, and an arbitrary π≥πpa. M projects down to make some Mn∈↓(Θω)(πnpa). Since π≥πpa≥πnpa, and we have our sublemma, there's a point M∞n in Θω(π) that projects down to some M′n∈↓(Θω)(πpa), and further down to Mn.
This sequence M∞n all has the same λ and b value since projection preserves those, so by the Compactness Lemma and closure, there's a convergent subsequence and limit point M∞∈Θω(π). Does M∞ project down onto M? (witnessing that M∈↓(Θω)(πpa))?
Well, let's say it didn't and projecting down gets you a distinct point. Then there's some n where projecting down further to πnpa would keep the points distinct, since they have to differ at some finite time. But... after time n, our sequence M∞n is roaming around entirely in the preimage of Mn, so the limit point is in there too, and it projects down to Mn and we have a contradiction. Therefore, M∞∈Θω(π) projects down onto M, witnessing that M∈↓(Θω)(πpa), and M was arbitrary in the intersection of preimages.
So, we have ↓(Θω)(πpa)⊇⋂πst≤πpa(prπpa,πst∗)−1(↓(Θω)(πst)) for the nirvana-containing causal/surcausal case, which is the last piece we needed to show consistency.
Condition 8: Extreme point condition.
The thing we want is that an extreme nirvana-free minimal point Mex in ↓(BFω)(πst)∩NF is the projection of a nirvana-free point from a policy above it. By the nirvana-free consistency property, Mex lies in c.h(⋃π≥πstprπ,πst∗(Θω(π)∩NF))
Mex is extreme, so it lies in ⋃π≥πstprπ,πst∗(Θω(π)∩NF)
So, there's some point in some Θω(π)∩NF that projects down to Mex and we're done.
Condition 9: Hausdorff Continuity:
Ok, this one is going to be complicated. We'll work with the Lemma 15 version of Hausdorff-continuity, where a δ difference between two policies means that if you start off in one preimage, you've gotta travel ϵ(1+λ) distance or less to get to the preimage associated with the other policy, and vice-versa.
We split into two parts. Part 1 is showing that if πpa≥πst, and the distance between πpa and πst is δ or less, the distance between their respective preimages is low. Part 2 is showing that if πpa and π′pa are δ apart, then we can exploit part 1 to get that the distance between the preimages is low, and will be pretty easy after we get part 1.
Our Hausdorff-continuity condition links δ and ϵ. So, when we fix an ϵ and are like "how close do the policies have to be to guarantee the preimages are ϵ apart", pick the δ that gets you ϵ3 distance w.r.t our original Hausdorff-continuity condition, and also have δ<ϵ3.
Our first part uses Mlo and M′lo for points in ↓(Θω)(πst)∩NF, M′mid for a point in ↓(Θω)(πpa)∩NF, Mhi and M′hi for points in Ma(∞) (that are expressible as a finite mix of points from Θω(π) for varying π), and M,M′ for two more general points in Ma(∞).
So, one half of showing the two preimages are close to each other is trivial. Everything in ↓(Θω)(πpa)∩NF projects down into ↓(Θω)(πst)∩NF by consistency, and projection preserving nirvana-freeness, so the preimage associated with πst is a superset of the preimage associated with πpa, so there's distance 0 from a point in the πpa preimage to a point in the πst preimage.
The other half is trickier. Pick an arbitrary point M in (pr∞,πst∗)−1(↓(Θω)(πst)∩NF) and λ is the λ value of this point. M projects down to some point Mlo in ↓(Θω)(πst)∩NF. From nirvana-free consistency, M∈c.h(⋃π≥πstprπ,πst∗(Θω(π)∩NF)) ,Mlo can then be produced by (keeping in mind that it doesn't matter whether we mix before or after projecting) finitely many points in varying Θω(π)∩NF sets that are mixed to make a point Mhi, and then projected down.
Important note: Mhi is not necessarily equal to, or even close to, M.
Because πst is within δ of πpa, every policy above πst has a corresponding policy within δ that lies above πpa. Thus, we can perturb the component points (indexed by i) that mix to make Mhi by ϵ3(1+λi) (infinitary Hausdorff-uniform-continuity Lemma 15 variant, δ was assumed to be small enough for that to be the case), and mix them, to get a M′hi in c.h(⋃π≥πpa(Θω(π)∩NF))
M′hi projects down to ↓(Θω)(πpa)∩NF to make a M′mid, and further projects down to ↓(Θω)(πst)∩NF to make a M′lo. Because M′hi is within ϵ3(1+λ) of Mhi, projecting down (and projecting being nonexpansive) means that M′lo is within ϵ3(1+λ) of Mlo.
Now, we can take M′lo and fill in all the missing measure data to get a M′ that projects down onto M′mid (certifying that it's in the preimage of ↓(Θω)(πpa)∩NF) as follows. Our most important constraint is that, when extending m′lo, it should perfectly mimic m′mid so it can project down onto it. Our second constraint is that, if mlo doesn't specify what comes after a finite history and it doesn't conflict with the first constraint, it should exactly mimic the conditional probabilities of m. Also, our δ fixes a first time n (ie, logγ(δ)) at which πpa is defined where πst isn't, so all conflicts of the second constraint with the first constraint must happen after then. This does the following: We can slice the histories assigned measures by M′ into three parts.
Part 1 is prefixes of histories in F(πst). There's only ϵ(1+λ) difference in these between M and M′ (after all, projecting down to πst leaves these unchanged, and M/M′ project down to Mlo and M′lo which are only ϵ3(1+λ) apart).
Part 2 is histories which have as a prefix something in F(πst) less than length n. In that case, we're mimicking the conditional probabilities of m.
Part 3 is histories which have as a prefix something in F(πst) of length n or higher. Because this is the threshold where πpa and πst start differing, we've got to obey the m′mid probabilities. But this only occurs after time n.
Let's analyze the difference between M′ and M, shall we? Our two relevant results are Vanessa's folk result that two distributions that differ by an amount will differ by the same amount if we extend them with the same conditional probabilities, and the result from the proof of Lemma 15 that arbitrarily reshuffling the measure/amount of dirt after time n takes γnλ′effort, where λ′ is the λ value of the measure you're reshuffling.
So, we start off with a ϵ3(1+λ) distance (includes the b term) between Mlo and M′lo. Then, extending up further to fill in everything up till time n, m and m′ mimic the conditional probabilities of each other. Still a ϵ3(1+λ) distance between them at this stage. Finally, after time n, M′ may go its own arbitrary way because it's gotta be compliant with M′mid, and to reshuffle this around, it takes γnλ′ effort. So, the net distance between M (arbitrary point in the preimage of ↓(Θω)(πst), and M′ (specially crafted point in the preimage of ↓(Θω)(πpa) is below ϵ3(1+λ)+γnλ′.
Wait, n (time of first difference) was logγ(δ) since πst and πpa are only δ apart, and λ′ can be at most λ+ϵ3(1+λ) because λ values are preserved by projections, and Mlo and M′lo are only ϵ3(1+λ) distance apart, so no more than that amount of dirt is the difference between the two. Finally, we assumed δ<ϵ3. So, we get:
d(M,M′)≤ϵ3(1+λ)+γnλ′=ϵ3(1+λ)+γlogγ(δ)(λ+ϵ3(1+λ))<ϵ3(1+λ)+δ(2λ+2)
<ϵ3(1+λ)+ϵ3(2λ+2)=ϵ(1+λ)
And we have our appropriate distance bound between preimages! Now to use this in part 2, which should go a lot faster.
Time for part 2, to get full generality. Pick two partial policies πpa and π′pa and assume the distance between them is δ. Then, the stub πst given by "inf(πpa,π′pa) but cut it off so it's undefined after time n (where n is logγ(δ))" is within distance δ of both πpa and π′pa. Further, πst≤πpa,π′pa. Then, take some point in the preimage of πpa. It's also in the preimage of πst. Because πst is at a distance of δ from π′pa, we only have to go ϵ(1+λ) distance to get a point in the preimage of π′pa, and then reverse πpa and π′pa and we're done!
By Lemma 15, this establishes uniform continuity for the function mapping partial policies to the preimage of their nirvana-free part in the space of all nirvana-free measures over infinite histories.
Condition 4: Nirvana-free upper completeness (causal case)
Now that we've nabbed every nice condition other than this one, we can invoke Lemma 21 (we only require upper completion on the infinite levels, which we have) to get that the nirvana-free part is the (closed) convex hull of the projections of nirvana-free stuff from above. Then, just appeal to lemmas 11, 12, and 13, that the closed convex hull of projections of nirvana-free upper-complete sets is nirvana-free upper-complete.
Condition C: Causality.
We showed part of this all the way back in our consistency argument. For causal/surcausal, prπ,πpa∗(Θω(π))=↓(Θω)(πpa) regardless of which π≥πpa we picked. We'll be using this.
Pick some arbitrary πpa and M∈↓(Θω)(πpa). M has a preimage point Mπ∈Θω(π) where π≥πpa. We get an outcome function ofω mapping policies to points in their associated sets s.t. ofω(π)=Mπ. Extend this ofω to all points by defining of(π′pa):=prπ′,π′pa∗(ofω(π′))
Ok, we need to show that: This actually singles out a unique point and isn't an invalid definition, said point is in ↓(Θω)(π′pa), that of(πpa)=M, and that it's an outcome function.
Assuming this is actually well-defined, of(π′pa) is in ↓(Θω)(π′pa) trivially because it's a projection of a point from above. Also, of(πpa)=prπ,πpa∗(ofω(π))=prπ,πpa∗(Mπ)=M which clean ups that part. Now for showing that it's an outcome function.
prπhipa,πlopa∗(of(πhipa))=prπhipa,πlopa∗(prπ,πhipa∗(ofω(π)))=prπ,πlopa∗(ofω(π))=of(πlopa)
So, we got everything assuming the extension is well-defined, let's show that. Pick any two π,π′ above any πpa. We'll show that they project to the same point.
prπ,πpa∗(ofω(π))=prinf(π,π′),πpa∗(prπ,inf(π,π′)∗(ofω(π)))=prinf(π,π′),πpa∗(prπ′,inf(π,π′)∗(ofω(π′)))
=prπ′,πpa∗(ofω(π′))
And we're done with causality! Now for pseudocausality.
Condition P: Pseudocausality.
We'll do this in two steps. One is showing that for stubs, points which meet the appropriate conditions are also present in all the requisite other stubs. Step 2 is generalizing this to all points in ↓(Θω)(πpa).
Let's say you have some M∈↓(Θω)(πst). By Nirvana-free consistency, M∈c.h(⋃π≥πstprπ,πst∗(Θω(π))) so we can shatter it into finitely many Mi that are projections of stuff from above, M∞i. The support of the measure component of M is a subset of FNF(πst)∩FNF(π′st), so the same must apply to all the Mi.
Now, what we can do is make a π′i that mimics the behavior of πi for all prefixes and extensions of strings in FNF(πst)∩FNF(π′st), but otherwise mimics π′st, and extends if needed in some random-ass way, and is above π′st.
The reason we can do this is because, if there's a contradiction in this construction, it would be from π′st and πi behaving differently on some prefix or extension of a string in FNF(πst)∩FNF(π′st). But, π′st can't specify what to do for any nodes in FNF(π′st) or later (because FNF(π′st) is basically a coat of leaf observation nodes around the extent of π′st), and if π′st and πi differ on a strict prefix of something in FNF(πst)∩FNF(π′st), then that means that π′st and πst branch different ways so there's no node in both FNF(πst) and FNF(π′st) after the branch point, so again we get a contradiction.
Anyways, we've crafted our finitely many π′i which lie above π′st, and mimic πi going forward. Our M∞i is an extension of Mi whose measure component is only supported on FNF(πst)∩FNF(π′st). Also, before and past that, π′i mimics πi perfectly, so we can transfer M∞i to Θω(π′i) by infinitary pseudocausality. Do this for all the i. Then, projecting all those down to π′st, we get that all the Mi lie in ↓(Θω)(π′st), and mixing them together, we get that M itself lies in ↓(Θω)(π′st).
Now for part 2, where we show it for partial policies in general. Let M be arbitrary in ↓(Θω)(πpa). Project M down to all the πnpa to make a sequence of Mn. Since the support of the measure component of M is a subset of F(πpa)∩F(π′pa) (pseudocausality assumption) the support of the measure component of Mn is a subset of F(πnpa)∩F(π′npa), so by pseudocausality for stubs which we've shown, Mn is also present in ↓(Θω)(π′npa). Then, take the preimage in π′pa of all those Mn points. By consistency and Lemma 9 and the usual argument about "there can only be one preimage point for a series of points", we get that M itself (the only thing that could project down on Mn for all n) lies in ↓(Θω)(π′pa) and we're done with pseudocausality.
Alright, that's most of the proof out of the way, all that's left is showing that the full belief function conditions imply the finitary and infinitary versions, respectively, and getting isomorphism. Let's begin.
Let's check whether →st(Θ) makes a stub-hypothesis, and whether →ω(Θ) makes an infinitary-hypothesis, if Θ is a hypothesis/fulfills all the conditions. →st is just "restrict Θ to only reporting sets for stubs", and →ω is just "restrict Θ to only reporting sets for full policies"
The variants of nonemptiness, closure, convexity, nirvana-free upper-completion, bounded minimals, hausdorff-continuity, and pseudocausality for the finite and infinite case are trivially implied by the corresponding condition for hypotheses, leaving the four moderately nontrivial cases of the analogues of normalization, consistency, the extreme point condition, and causality.
Extreme point condition: The infinitary case doesn't have an analogue of the extreme point condition. So that leaves the finitary case. What we can do is take a nirvana-free extreme minimal point Mex in some Θ(πst), apply the general extreme point condition to get a nirvana-free M∈Θ(π) for some suitable π that projects down to Mex, and, clipping away the infinite parts by →st, the projections of M fill the role of the points in Θ(π′st) all below some policy that project down to Mex.
Causality. The finite case is that we can take a point associated with some stub, and craft an outcome function for stubs that matches up with our point. This is trivially implied by the general case of causality, where you can take any partial policy and point and get an outcome function that matches up with it. The infinite case is that we can take a point M in Θ(π), and get points for all the other Θ(π′) that project down appropriately. For this, again, we just take an outcome function for M and clip it off to the infinite levels.
Consistency: The finite case of weak consistency is pretty easy. We get
prπhist,πlost∗(→st(Θ)(πhist))=prπhist,πlost∗(Θ(πhist))⊆Θ(πlost)=(→st)(Θ)(πlost)
Where the subset came from full consistency because everything is the closed convex hull of projections from above, so projecting down gets you a subset. For the Nirvana-free consistency condition for the infinite case, it's a simple consequence of Lemma 21.
Normalization: infπstEΘ(πst)(0)=0 and supπstEΘ(πst)(1)=1
To begin, the normalization condition for infinitary hypotheses and general hypotheses is the exact same, so we can ignore that and work on the stub hypothesis case. The inf one is pretty easy. From general normalization, at the infinite level, there are π and nirvana-free points in Θ(π) with a b value at-or-near zero, and you can just project them down to any stub you want.
The sup one is a bit trickier. It's obviously not above 1, because no matter what policy π you pick, you've got a nirvana-free point with λ+b≤1 in Θ(π), which you can project down to whichever stub you're looking at, to certify that the expectation of 1 is 1 or less. Showing that it isn't below 1 is a bit harder.
Let's say there's some π where min(λμ,b)∈Θ(π)∩NF(λ+b)=1 (or arbitrarily close to 1, doesn't really matter, although we'll show later that there is indeed a maximizing policy where Murphy can only force a value of 1)
From Hausdorff-continuity, pick some δ to get an ϵ Hausdorff-distance or lower. This δ specifies some extremely large n, consider πn. Now, consider the set of every policy π′ above πn. All of these are δ or less away from π.
By Hausdorff-continuity, there can't be a nirvana-free point in any Θ(π′) with λ+b<1−ϵ, because we could do an ϵ perturbation to get a point in Θ(π)∩NF with λ+b<1, because small changes in M induce small changes in λ and b. Or, we can add a little bit of wiggle room if the minimizing value of λ+b in π is slightly less than 1
However, any nirvana-free point in Θ(πn) must originate as a mix of finitely many points from Θ(π′i)∩NF (varying π′i as long as it's above πn) that have been projected down. This is because, by our earlier proof of nirvana-free consistency from consistency in general, Θ(πn)∩NF=c.h(⋃π′≥πnprπ′,πn∗(Θ(π′)∩NF))
All of these projected points have λ+b≥1−ϵ, so the mix point has λ+b≥1−ϵ, so Murphy can only force a value of 1−ϵ or higher. And we can make δ as small as we wish to get a stub πn below π (n extremely large) where ϵ is as small as we wish, so the sup of the λ+b values Murphy can force over all stubs can't be below 1. So it must be 1.
Isomorphism! Let's go! As a quick recap,
↓(Θω)(πpa):=¯¯¯¯¯¯¯¯c.h(⋃π≥πpaprπ,πpa∗(Θω(π)))
↑(Θst)(πpa):=⋂πst≤πpa(prπpa,πst∗)−1(Θst(πst))
And →ω/→st is just "clip down your hypothesis to full policies/stubs".
So, two parts of this are trivially easy. From earlier in the proof (the start of the first section for the stub one, and an obvious corollary of definitions for the full policy one), we established that ↑(Θst)(πst)=Θst(πst) and ↓(Θω)(π)=Θω(π). Using this, →ω(↓(Θω))(π)=↓(Θω)(π)=Θω(π) and→st(↑(Θst))(πst)=↑(Θst)(πst)=Θst(πst)
So, →ω(↓(Θω))=Θω and →st(↑(Θst))=Θst
Let's get fancier and show the other two.
↓(→ω(Θ))(πpa)=¯¯¯¯¯¯¯¯c.h(⋃π≥πpaprπ,πpa∗(→ω(Θ)(π)))=¯¯¯¯¯¯¯¯c.h(⋃π≥πpaprπ,πpa∗(Θ(π)))=Θ(πpa)
The first two equalities are unpacking definitions, the third is consistency for Θ.
↑(→st(Θ))(πpa)=⋂πst≤πpa(prπpa,πst∗)−1(→st(Θ)(πst))=⋂πst≤πpa(prπpa,πst∗)−1(Θ(πst))=Θ(πpa)
Again, first two equalities are unpacking definitions, the third is consistency for Θ. So, ↓(→ω(Θ))=Θ and ↑(→st(Θ))=Θ
Putting it together, →ω and ↓ make an isomorphism between Θω and Θ, and →st and ↑ make an isomorphism between Θst and Θ. We're finally done!
Proposition 1: If Θ fulfills the causality condition, nonemptiness, closure, and convexity, then SΘ is a nonempty, closed, convex set of a-environments or a-survironments. ΘSΘ=Θ. Also, S⊆SΘS.
Ok, what SΘ is, is the set of a-environments (λe,b) where, regardless of πpa, (λ(πpa⋅e),b) lies in Θ(πpa). For nonemptiness, pick some arbitrary point in one of your Θ(πpa), use causality to get an outcome function, and then you fill in the conditional probabilities for an action-observation sequence with your outcome function points. This never produces a contradiction anywhere because if there was a contradiction, you'd be able to project two specified points down and have them disagree somewhere, which is impossible because we have an outcome function.
For closure, if you take a limit of a-environments, this makes a limiting sequence in all the SΘ, which are all closed, so the limit point environment has all its induced distributions lying in the usual Θ(πpa), and is in SΘ
For convexity, if you take a mix of a-environments, this makes the same mix in all the SΘ which are all convex, so the mixed environment has all its induced distributions lying in the usual Θ(πpa), and is in SΘ.
For equality, if M∈ΘSΘ(πpa), then it originated from some a-environment made from an outcome function for Θ, which... just gets your original point so M∈Θ(πpa). In the other direction, if M∈Θ(πpa), by causality, we can project down and extend the specification and make an a-environment that acts like M on πpa, and then going back gets you M∈ΘSΘ(πpa).
In the other direction, if (λe,b)∈S, then it induces an outcome function and you can go back from that to (λe,b)∈SΘS, so S⊆SΘS
Theorem 3.1: Pseudocausal Translation: For all pseudocausal Θst hypotheses defined only on policy stubs, →c(Θst) is a causal hypothesis only defined on policy stubs. →NF(→c(Θst))=Θst. For all causal Θst hypotheses defined only on policy stubs, →NF(Θst) is a pseudocausal hypothesis only defined on policy stubs.
Theorem 3.2: Acausal Translation: For all acausal Θst hypotheses defined only on policy stubs, →sc(Θst) is a surcausal hypothesis only defined on policy stubs. →NF(→sc(Θst))=Θst. For all surcausal Θst hypotheses defined only on policy stubs, →NF(Θst) is an acausal hypothesis only defined on policy stubs.
Both these theorems have highly similar proofs, so let's group them together. First, we'll need to set up how →c and →sc work, and then knock out two lemmas we'll need before we can proceed to the main result. →c is defined by →c(Θst)(πst)=¯¯¯¯¯¯¯¯c.h(⋂π′st≤πst(Iπ′st,πst∗(Θst(π′st))))→sc is defined identically, just with I∗s instead of I∗, and closed convex hull permitting us to mix with 0+ probability.
Iπ′st,πst where π′st≤πst (this is like the inverse of projection, it's going up instead of down) is a function F(π′st)→F(πst) defined by: If h∈F(π′st)∩F(πst), then Iπ′st,πst(h)=h. If h∈F(π′st) and isn't in F(πst), then Iπ′st,πst(h)=hπst(h)N
Iπ′st,πst∗ from Ma(F(π′st))→Ma(F(πst)) is just pushing (m,b) through the mapping Iπ′st,πst. You keep the b term the same, and push the measure terms up. Iπ′st,πst∗s is defined identically on the measure part, except that it has the rule that all nirvana events in F(πst) and not in F(π′st) with 0 measure get 0+ measure instead.
Intuitively, what I∗ and I∗s are doing, is capping off whatever they need to (in order to extend appropriately) with Nirvana. I∗ is capping off positive-probability histories with guaranteed Nirvana immediately afterwards, where I∗s is more paranoid and caps off every 0-probability Nirvana history that got added with "it is possible that Nirvana occurs here".
Let's go over some properties that I∗ and I∗s fulfill. I∗ is an injective continuous map Ma(F(π′st))→Ma(F(πst)), and I∗s is an injective continuous map SMa(F(π′st))→SMa(F(πst)). I∗ and I∗s are undone by projecting back down, prπst,π′st∗(Iπ′st,πst∗(M))=M. Both I∗ and Is∗ are linear, the latter in the stronger sense that it's linear when you mix stuff with 0+ probability, it doesn't matter whether you mix before or after injecting up. Further, injections up commute, Iπ′st,πst∗(Iπ′′st,π′st∗(M))=Iπ′′st,πst∗(M), and the same for I∗s.
In order to make progress, we want to get two important lemmas. The first one, Lemma 22, is that slicing away the nirvana from this thing recovers the original pseudocausal hypothesis. The second one I call the "Diamond Lemma", and it says that injecting up and projecting down is the same as projecting down and then injecting up, and if you sketch it out, it looks like a diamond.
Lemma 22: (→c(Θst)(πst))∩NF=Θst(πst), and the same holds for →sc.
Proof sketch: One direction is trivial, the other direction that →c doesn't add any new nirvana-free points is trickier. Working in the pseudocausal-to-causal setting, we can take some M that's nirvana-free in the closed convex hull, and get a sequence Mn limiting to it where each Mn is in the convex hull. Now, indexing stubs below πst by i, the Mn can all be viewed as a mix of Mi,n points projected up from below. The problem is, the mix varies as n does. What we can do is separate into "good" i where we can get a suitable limit point and limit probability, and "bad" i that we have to treat as a special chunk, and reexpress M as a sum of a probabilistic mix of "good" Mi injected up, and an additional "bad" chunk. We can show that the "good" Mi can all be transferred up to Θst(πst) itself by pseudocausality and mixed in there, and the "bad" chunk is a nirvana-free a-measure. So, M is the sum of a point in Θst(πst), plus a nirvana-free a-measure, so M lies in Θst(πst) by nirvana-free upper completion.
Working in the surcausal-to-acausal setting, we take our M in the closed convex hull and a sequence Mn, but injection up in this setting is much more effective at adding Nirvana, and the surmetric is much more sensitive than the usual metric for noticing the presence or absence of Nirvana. So, only an initial segment of Mn is "contaminated" with Nirvana since the limit point is Nirvana-free, and we can clip that part off, and the "uncontaminated tail" can only have come from Θst(πst) itself because injection up is very aggressive with adding Nirvana, so we get it from just closure on Θst(πst).
Proof: For (→c(Θst)(πst))∩NF⊇Θst(πst), just observe that the identity injection Iπst,πst∗ leaves Θst(πst) completely unchanged and adds no nirvana, so any point in Θst(πst) also lies in the closed convex hull of the injections up, and is nirvana-free because the original point that we mapped through identity was nirvana-free. This works with the surcausal case too.
Now for the considerably more difficult reverse direction, for the pseudocausal-to-causal case first.
If M∈→c(Θst)(πst)∩NF, then unpacking that, M is nirvana-free, and lies in the closed convex hull of the injections up. So, we can fix a sequence Mn in the convex hull of injections up that limits to M. Index the stubs below πst by i, there's only finitely many of them.
The Mn can be written as ∑iζi,nIπist,πst∗(Mi,n) where Mi,n∈Θst(πist). ζi,n may be 0. This is because, if there's multiple points in the injection of a particular stub that are mixed, you can mix them before injecting up to get a single Mi,n that's injected up, because injections are linear and we're injecting a convex set.
Blessed by the gift of finitely many i to worry about, use repeated picking of subsequences to get a subsequence of n where:
For all i, ζi,n converges. Call the limiting values ζi. Now split the i into good i where ζi>0, and bad i where ζi=0. The ζi will sum up to 1.
For all good i, ζi,n>0 always. The fact that they all limit to above 0 helps you out because you only have to trim off an initial segment.
For all good i, Mi,n converges, call the limit point Mi. This is because injection up preserves λ and b, and ζi,n is bounded above 0, so the λ+b value of the Mi,n is upper-bounded by λ′+b′minnζi,n, which is a finite nonnegative number divided by a finite positive number, and we can apply the Compactness Lemma to establish that a convergent subsequence exists. In this case, λ′+b′ is the bound as a whole for the sequence Mn, which converges so it must have a bound of that form, and not the bound on minimal points.
Finally, ∑bad i(ζi,nIπist,πst∗(Mi,n)) converges. This is doable because the sequence Mn has bounded λ and b because it converges to something, so the partial sum of bad i has the same bound, so we can invoke the Compactness Lemma to get our convergent subsequences.
Putting all this together (we kept selecting from compact sets so that is what let us build a subsequence with all these great properties at once) we have a decomposition of M itself into:
∑good i(ζiIπist,πst∗(Mi))+limn→∞(∑bad i(ζi,nIπist,πst∗(Mi,n)))
Now, since ∑good iζi=1 (all the bad i had their probability components limit to 0), that first sum part looks like an actual mixture of points injected up! Since M is nirvana-free, both parts must be nirvana-free, and the sums are also a-measures.
First, by closure of all our original Θst(πist), all the Mi components (where i is good) do lie in Θst(πist). And when we inject the Mi up, since the mix of them is nirvana-free, this means that each individual Mi must be nirvana-free after injection.
Now, what injection does, is it caps Nirvana on everything that is in F(πist) and not in F(πst) that has positive probability. So, if Mi is nirvana-free after injection, this must mean that its measure component is only supported on F(πst). Via pseudocausality, this means that Mi lies in Θst(πst) itself! Also, Iπist,πst∗(Mi)=Mi.
So, our sum over good i components (by convexity), is actually a probabilistic mixture of stuff in Θst(πst) itself! Abbreviating ∑good iζiMi as M′, which lies in Θst(πst) by convexity, and rewriting the sum, we can reepress M as:
M′+limn→∞(∑bad i(ζi,nIπist,πst∗(Mi,n)))
This is a nirvana-free a-measure in Θst(πst), plus a nirvana-free a-measure, so, by nirvana-free upper-completion, M lies in Θst(πst) and we're done. Now, let's hit up the surcausal case.
Assume SM∈→NF(Θst)(πst)∩NF. SM is nirvana-free, and lies in the closed convex hull of the injections up. So, we can fix a sequence SMn in the convex hull of injections up that limits to SM. Index the stubs below πst by i, there's only finitely many of them, reserve i=0 for πst itself.
The SMn can be written as ∑iζi,nIπist,πst∗s(Mi,n) where Mi,n∈Θst(πist). ζi,n may be 0 or 0+. This is because, if there's multiple points in the injection of a particular stub, you can mix them before injecting up to get your single point, one for each πist, because injections are linear and we're injecting a convex set.
Note that SM is nirvana-free, and there's only finitely many spots where nirvana could be since we're working in a stub, so past a certain point all the SMn will be nirvana-free due to the surmetric we're using. Let's clip off that initial segment that's contaminated with Nirvana. Now, we can get something very interesting. If πist<πst, then injecting up anything at all is going to stick nirvana (maybe with 0+ measure) somewhere. Having ζi,n be 0+ doesn't help you, because mixing with a nirvana-containing thing with 0+ probability means the mixture contains the nirvana-spots of that thing you mixed in. So, past a certain point, all the SMn can only be written as M0,n (the identity injection, anything else either has exactly 0 probability so it gets clipped out of the sum, or it has Nirvana somewhere and can't be present).
Therefore, in the tail, the sequence of SMn limiting to SM is the same as M0,n∈Θst(πst) limiting to some M0∈Θst(πst), so SM∈Θst(πst) and it's actually an a-measure, not an a-surmeasure. This establishes Lemma 22 for the sur-case.
Lemma 23/Diamond Lemma: For any πst,π′st, and any πhist≥πst,π′st, and any M∈Ma(F(πst)), then: prπhist,π′st∗(Iπst,πhist∗(M))=Iinf(πst,π′st),πst∗(prπst,inf(πst,π′st)∗(M)) (and same for I∗s and the sur variants)
it's called the Diamond Lemma because if you sketch out the injections as going diagonally up and the projections as going diagonally down, the commutative diagram looks like a diamond.
To begin with, we can go "hm, there's an upper bound on πst and π′st. For every finite history in F(πst), there's an extension of that history in F(πhist), which has a prefix in F(π′st), and vice-versa. This establishes that for all the finitely many histories in F(πst), either a prefix of that history lies in F(π′st), or an extension of that history lies in F(π′st), and vice-versa for F(π′st)
Now, we can split into three possible cases and show that up-then-down equals down-then-up in terms of what measure is assigned to a history in F(π′st) by mapping M through the injections and projections, which shows the diamond lemma in full generality.
In the first case, our history h in F(π′st) is also in F(πst) (the equality case)In this case, h also lies in F(inf(πst,π′st)). Projecting down to inf does nothing to the measure on h, and embedding up also does nothing to the measure on h. Embedding up to πhist also does nothing to the measure on h, and projecting down doesn't affect it either.
In the second case, our history h in F(π′st) isn't in F(πst), but there are strict extensions that lie in F(πst) (this requires h to be nirvana-free). h is still assigned a measure by M, though, being a prefix of stuff with measure. In this case, h also lies in F(inf(πst,π′st)). The same analysis from our first case works, h doesn't have its measure disrupted.
In the third case, our history h in F(π′st) isn't in F(πst), but a strict prefix h′ lies in F(πst). We can distinguish three subcases. In the first subcase, h is of the form h′aN. In the second subcase, h still ends with Nirvana, but it isn't immediately after h′ happens, some stuff happens in the meantime first. In the third subcase, h doesn't end with Nirvana. Also, h′ lies in F(inf(πst,π′st)).
For the first subcase where h is of the form h′aN, injecting up means h′aN now has the measure originally associated with h′ and nirvana is marked as "possible" there (if we're using the sur-injection). Projecting down leaves this alone. Projecting down leaves the measure on h′ alone, and injecting up means h′aN now has the measure originally associated with h′ and nirvana is marked as "possible" there (if we're using the sur-injection). In both paths, h′aN ends up with the measure that h′ started with, and nirvana marked as "possible" in the sur-case.
For the second subcase where h is of the form "h′, then some stuff happens, then Nirvana occurs", then in the causal case, the injection up would assign h 0 measure (all the measure of h′ got channeled into h′aN instead of h), and then projecting down, it stays the same. Similarly, projecting down means h′ has some measure, then it's all channeled into h′aN on the injection up, so h itself gets 0 measure. For the surcausal case, the injection up assigns h 0+ measure (by the same argument and sur-injections tagging every freshly-added nirvana outcome with 0+ measure). and projecting down, it remains with 0+ measure. Projecting down leaves h′ alone and then injecting up tags h with 0+ measure.
For the third subcase where h is an extension of h′ that doesn't add any Nirvana, we can run through the same argument as the second subcase to conclude that we get 0 measure for both the causal case and the surcausal one.
Thus, no matter whether we inject up and project down, or project down and inject up, the measure assigned to h∈F(π′st) by the measure component of the p-(sur)measure will agree.
An important thing to note with this is that we can use any stub above πst and π′st for the injection up, but we must use inf(πst,π′st) for the projection down.
Now, we can finally embark on the proof of the two translation theorems! There's enough similarities between the proofs that we can just do one big proof and remark on any differences we come across. The things we must show are that slicing off the Nirvana from a causal/surcausal hypothesis makes a pseudocausal/acausal hypothesis, and that adding in those injections up can turn a pseudocausal/acausal hypothesis to a causal/surcausal one, and that going nirvana-free to nirvana-containing back to nirvana-free is identity.
Proof sketch: While at first this may look like the proof will be almost as long as the Isomorphism theorem because we're verifying a list of 9 conditions twice over, it'll actually be considerably shorter. The only nontrivial part of the first part where we check that slicing off the Nirvana makes a pseudocausal/acausal hypothesis is deriving pseudocausality from causality, and even that is fairly easy.
Going from pseudocausal/acausal to causal/surcausal is trickier, though thankfully most conditions are trivially true, there's only three notable ones. There's the bound on minimal points, which is done by taking a sequence Mn limiting to a M that violates the bound, using the definition of the causal translation to get a point below each Mn which obeys the λ⊙+b⊙ bound (fairly nontrivial), and appealing to Lemma 16 to construct a limit point below M that obeys the λ⊙+b⊙ bound. Showing weak consistency (projecting down makes a subset) requires the Diamond Lemma to write the projection of your point of interest as a mix of injections up from below, and the last tricky one is causality. Which requires first showing that injecting up →c(Θst(πst)) to a higher stub won't add any new points, and then coming up with a clever way of building our outcome function, and using the Diamond Lemma to show that it indeed an outcome function.
Finally, nirvana-free to causal to nirvana-free is instant by Lemma 22.
Proof: Referring back to the conditions for a hypothesis on policy stubs, we'll show that they're fulfilled when you slice away the Nirvana, and that pseudocausality can be derived from causality if we're just dealing with a-measures and not a-surmeasures.
Stub Nirvana-free Nonemptiness was a property already possessed by the causal hypothesis, so it's preserved when we clip away the Nirvana. Stub closure and convexity also hold because we're intersecting with the closed convex set (of nirvana-free a-measures). Nirvana-free upper completion also holds. Bounded minimals holds because a minimal point in the Nirvana-free part must also be a minimal point in the original set, because adding anything to a nirvana-containing a-measure makes a nirvana-containing a-measure, so there can be no nirvana below our minimal point in the nirvana-free part, so it's minimal in general. Normalization holds because the expectation values only depend on the nirvana-free part. Nirvana-free stuff projects down to nirvana-free stuff, getting stub-consistency. The stub extreme point condition carries over due to the preexisting intersection with nirvana-free used to define it, and the same applies to uniform continuity. This wraps up surcausal-to-acausal, and just leaves deriving pseudocausality from causality for causal-to-pseudocausal.
Let's say we have a nirvana-free M∈Θst(πst), where the measure part of M is supported over F(π′st), and we want to show that it's also present in Θst(π′st). Then M is present in Θst(inf(πst,π′st)), because it's supported entirely over histories that both the different policies produce, so it's supported over histories that the intersection of the policies produces, just project it down to the inf. Now, by causality, we can find something in Θst(π′st) that projects down onto M, which must be M itself because the measure part of M is supported entirely on histories in F(π′st). This gets pseudocausality.
Now, let's show that →c(Θst) and →sc(Θst) fulfill the defining conditions for finitary causal.
1: Stub Nirvana-Free Nonemptiness: This one is trivial, because Θst(πst) is present as a subset via the identity injection Iπst,πst∗, and is nirvana-free.
2,3: Stub Closure/Convexity: We took a closed convex hull, these are tautological.
4: Stub Nirvana-Free Upper-Completeness: Just apply Lemma 22 to get that the nirvana-free part of →c(Θst)(πst) (and same with surcausal) is just the original Θst(πst), which is nirvana-free and upper-complete by stub nirvana-free upper-completeness, so we're good there.
Condition 5: Stub Bounded Minimals:
By stub bounded minimals on the Θst we have a λ⊙+b⊙ bound on λ+b for minimal points in Θst(πst) for all stubs. Pick a M∈→c(Θst)(πst) (or the surcausal analogue) with λ+b>λ⊙+b⊙. There's a sequence of points Mn limiting to M that lie in the convex hull. All these Mn (or SMn) can be written as ∑iζi,nIπist,πst∗(Mi,n) where Mi,n∈Θst(πist). Decompose Mi,n into a minimal point and something else, getting Mmini,n+M∗i,n.
Then, do a further rewrite as Mmini,n+(m∗−i,n,−m∗−i,n(1))+(m∗+i,n,b∗i,n+m∗−i,n(1))
Note that the λ+b value of the sum of the first two terms is bounded above by λ⊙+b⊙, because Mmini,n obeys that bound, and for the second term, it deletes exactly as much from the λ term as it adds to the b term. Also, since Mi,n is an a-measure, adding in just the negative component to Mmini,n doesn't make it go negative anywhere, so the sum of the first two terms is an a-measure, and by nirvana-free upper completeness, it lies in Θst(πist). The third component of the sum is also an a-measure.
By linearity of I∗ or I∗s, injecting up the first two terms and the last term, and adding them afterwards, is the same as injecting up the bulk of them (we can only inject up a-measures). Let's abbreviate Mmini,n+(m∗−i,n,−m∗−i,n(1)) as M◊i,n and abbreviate (m∗+i,n,b∗i,n+m∗−i,n(1)) as M♡i,n Now, we can rewrite Mn as:
Mn=∑iζi,nIπist,πst∗(Mi,n)=∑i(ζi,nIπist,πst∗(M◊i,n))+∑i(ζi,nIπist,πst∗(M♡i,n))
The first component is in →c(Θst)(πst) and has the λ⊙+b⊙ bound (injection up preserves λ and b) and M◊i,n lies below the λ⊙+b⊙ bound, and mixing stuff below the bound produces a point below the bound. Abbreviate the first component as M◊n.
So, Mn isn't minimal, it lies above M◊n. Because the M◊n have a λ⊙+b⊙ bound, and there's only finitely many places where nirvana could be, we can extract a convergent subsequence, limiting to some M◊ which obeys the λ⊙+b⊙ bound, and by Lemma 16, M◊ lies below M.
Therefore, M isn't minimal, and it was an arbitrary point that violated the λ⊙+b⊙ bound, so all minimal points in any →c(Θst)(πst) obey the same λ⊙+b⊙ bound, and we get bounded minimals.
6: Stub Normalization. By Lemma 22, we didn't introduce any new nirvana-free points, so stub normalization of Θst carries over.
Condition 7: Weak Consistency.
This is "projecting down makes a subset". All the following arguments work with sur-stuff. Fix some M∈→c(Θst)(πhist). It's a limit of points Mn in the convex hull. We can decompose the Mn as ∑iζi,nIπist,πhist∗(Mi,n). Then,
prπhist,πlost∗(Mn)=prπhist,πlost∗(∑iζi,nIπist,πhist∗(Mi,n))=∑iζi,nprπhist,πlost∗(Iπist,πhist∗(Mi,n))
Which, by the Diamond Lemma, can be rewritten as: ∑iζi,nIinf(πlost,πist),πlost∗(prπist,inf(πlost,πist)∗(Mi,n))
The projections of the Mi,n∈Θst(πist) lie in Θst(inf(πlost,πist)) by weak consistency for Θst. So, actually, the projection of Mn down to πlost can be written as a mix of injections up from stubs below πlost, so the projection of Mn lies in →c(Θst)(πst). Then, just use continuity of projection, and closure, to get M itself projecting down into →c(Θst)(πlost), so we're good on weak consistency.
8: Stub Extreme Point Condition: By Lemma 22, we didn't introduce any new nirvana-free points, so any nirvana-free extreme minimal point was present (and nirvana-free extreme minimal) in Θst already, so the extreme point condition carries over from there.
9: Stub Hausdorff Continuity: By Lemma 22, we didn't introduce any new nirvana-free points, and the preimages for Hausdorff continuity are of the nirvana-free parts, so this is completely unaffected and carries over.
Condition C: Causality: As a warmup to this result, we'll show that if πst≤πhist, then
Iπst,πhist∗(→c(Θst)(πst))⊆→c(Θst)(πhist)
Pick a M∈→c(Θst)(πst). It's a limit of Mn which are finite mixtures of injections of stuff from below, and Mn can be written as ∑iζi,nIπist,πst∗(Mi,n) Then,
Iπst,πhist∗(Mn)=Iπst,πhist∗(∑iζi,nIπist,πst∗(Mi,n))=∑iζi,nIπst,πhist∗(Iπist,πst∗(Mi,n))
And then, by commutativity of injections, the injection of Mn rewrites as ∑iζi,nIπist,πhist∗(Mi,n) All the πist are below πst so they're also under πhist, witnessing that the injection of Mn lies in →c(Θπst(πhist)). Then, just appeal to closure and continuity of I∗ or I∗s to get that M injects up into →c(Θπst(πhist)) Again, all this stuff works for the sur-situation as well.
With this out of the way, fix some M∈→cΘπst(πst). Let's try to make an outcome function from this, shall we? Let's do of(π′st):=Iinf(πst,π′st),π′st∗(prπst,inf(πst,π′st)∗(M))
Yup, that does indeed specify one point for everything. It obviously spits out M when you feed πst in because both the injection and projection turn into identity. Further, by weak-consistency, the projection of M down lies in →c(Θπst)(inf(πst,π′st)), and by our freshly-proved result, injecting up lands you in →c(Θst)(π′st).
So, all that's left is showing that it's an outcome function! That, for any two πhist and πlost where πhist≥πlost, that prπhist,πlost∗(of(πhist))=of(πlost) Let's begin.
prπhist,πlost∗(of(πhist))=prπhist,πlost∗(Iinf(πst,πhist),πhist∗(prπst,inf(πst,πhist)∗(M)))
And then, by the Diamond Lemma, this equals
Iinf(πlost,inf(πst,πhist)),πlost∗(prinf(πst,πhist),inf(πlost,inf(πst,πhist))∗(prπst,inf(πst,πhist)∗(M)))
And then, inf(πlost,inf(πst,πhist))=inf(πlost,πhist,πst)=inf(πst,πlost) because πlost≤πhist. Rewriting a bit, and grouping the two projections together because they commute, we have:
Iinf(πst,πlost),πlost∗(prπst,inf(πst,πlost)∗(M))=of(πlost)
And we're finally done with everything, we showed causality.
Lemma 24: Given a Θ (maybe just defined on stubs or full policies) that fulfills all hypothesis conditions except normalization, if it's normalizable, then all belief-function conditions are preserved (works in the sur-case too)
Nirvana-free nonemptiness, closure, convexity, nirvana-free upper-completion, and bounded-minimals are all obviously preserved by scale-and-shift/Proposition 7 in section 1 of proofs. For consistency, due to projections preserving λ and b, the scale-and-shift in the stubs (or full policies) is perfectly reflected in whichever partial policy you're evaluating, so consistency holds too. For the extreme point condition, any nirvana-free minimal extreme point post-renormalization is also nirvana-free minimal extreme pre-renormalization, so we can undo the renormalization, get a point in the nirvana-free component of a full policy that projects down accordingly, and scale-shift that point to get something that projects down to the scale-shifted extreme point. For Hausdorff-continuity, the scaling just scales the distance between two sets by the scale term, so Hausdorff-continuity carries over. Pseudocausality is preserved by normalization (un-normalize, transfer over to the partial policy of interest, then normalize back again), and so is causality (unnormalize the point and outcome function, complete it, normalize your batch of points back again).
Proposition 2: Given a nirvana-free Θ?ω, the minimal constraints we must check of Θ?ω to turn it into an acausal hypothesis are: Nonemptiness, Restricted Minimals, Hausdorff-Continuity, and non-failure of renormalization. Every other constraint to make a hypothesis can be freely introduced.
Ok, we have our Θ?ω, and we want to produce an acausal hypothesis. The obvious way to do it is: Θω(π)=(¯¯¯¯¯¯¯¯c.h(Θ?ω(π))uc)R We'll use Θω,¬R(π) to refer to the set before renormalization.
Proof sketch: We basically just run through the infinitary hypothesis conditions, and show that they're fulfilled by Θω,¬R, and then appeal to Lemma 24 that we didn't destroy our hard work when we normalize. As for the hypothesis conditions themselves, they're all pretty simple to show except for bounded-minimals and Hausdorff-continuity, which is where the bulk of the work is.
1: Nirvana-free nonemptiness. Trivial, because all the Θ?ω(π) are nonempty.
2: Closure: Appeal to Lemma 2, not in this document, but of section 1 in basic inframeasure theory, that the upper completion of a closed set is closed. Then we just intersect with the cone of a-measures (closed) to get our set of interest, so it's closed.
3: Convexity: If you have M and M′ which decompose into Mlo+M∗ and M′lo+M′∗ (M and M′ lie in the upper completion and Mlo/M′lo lie in the closed convex hull), then
pM+(1−p)M′=p(Mlo+M∗)+(1−p)(M′lo+M′∗)
=(pMlo+(1−p)M′lo)+(pM∗+(1−p)M′∗)
The first component lies in the closed convex hull because it's a mix of two points from the closed convex hull, the second component is an sa-measure, and by our upper completion, then pM+(1−p)M′ lies in our Θω,¬R(π)
4: Nirvana-free Upper-completeness: Trivial, we took the upper completion.
Condition 5: Bounded Minimals:
This can be shown by demonstrating that, if λ⊙+b⊙ is our bound for Θ?ω (ie, every point in Θ?ω(π), regardless of π, either respects the bound or lies strictly above a point in Θ?ω(π) that respects the bound), then every point in Θω,¬R(π) lies above a point that obeys the λ⊙+b⊙ bound.
Take a point M∈¯¯¯¯¯¯¯¯c.h(Θ?ω(π)). We don't have to worry about points in the upper completion that weren't part of the original closed convex hull, because they're above something in the closed convex hull, so we just have to show that everything in the closed convex hull lies above something that respects the bounds.
M can be written as a limit of points Mn∈c.h(Θ?ω(π)), which split into a mixture of finitely many Mi,n∈Θ?ω(π). We can then split the Mi,n into Mloi,n+M∗i,n, where Mloi,n respects the appropriate bounds (everything in Θ?ω(π) either obeys the bounds or is above a point which obeys the bounds). Now, we can rewrite Mn as: ∑iζi,n(Mloi,n)+∑iζi,n(M∗i,n)
That first sum term is a mixture of stuff that respects the λ⊙+b⊙ bound, so it respects the same property and lies below Mn by the addition of the second term making Mn. All these lower points lie in the closed convex hull, and obey the λ⊙+b⊙ bound, so there's a convergent subsequence that limits to some limit point that's also in the closed convex hull, respects the λ⊙+b⊙ bound, and by Lemma 16, is below M.
So, any M∈Θω,¬R(π), regardless of π, which violates the λ⊙+b⊙ bound on minimal points, has a point lower than it which does respect the bound, showing Minimal Boundedness.
We normalize at the end, and need uniform Hausdorff Continuity to show nirvana-free consistency, so let's skip to that one, which is hard.
Condition 8: Uniform Hausdorff Continuity:
We'll be working with the Lemma 15 variant of Hausdorff-continuity, that given any ϵ, there's a δ where two policies π,π′ being δ apart or less guarantees that if M is in Θ?ω(π), then there's a point M′ in Θ?ω(π′) that's only ϵ(1+λ) away, where λ is the λ value associated with M, and establish that variant for Θω,¬R.
Fix an ϵ. How close do two policies have to be to guarantee that for any M∈Θω,¬R(π), there's a point M′ in Θω,¬R(π′) within ϵ(1+λ)? Well, for our original Hausdorff-continuity condition, pick a δ that forces a ϵ2(1+λ⊙+b⊙) "distance", and δ<ϵ2(1+λ⊙+b⊙).
Since we've got closure and bounded-minimals, write M as Mlo+M∗ where Mlo respects the λ⊙+b⊙ bound, and it lies in the closed convex hull and is a limit of Mlon points, which decompose into a mixture of finitely many Mloi,n points.
Now, each of these Mloi,n points in Θ?ω(π), by Hausdorff-continuity of Θ?ω, have a M′loi,n point in Θ?ω(π′), that's only ϵ2(1+λ⊙+b⊙)(1+λi,n) away, by π′ and π being δ or less apart.
We can mix the M′loi,n in the same way as usual to make a M′lon that's only ϵ2(1+λ⊙+b⊙)(1+λn) or less away from Mlon
Because the Mlon sequence converges, there's some bound on the λn and bn values, and the (at most) ϵ2(1+λ⊙+b⊙)(1+maxn(λn)) change to make M′lon still keeps the λ and b values of our new sequence bounded, so by the Compactness Lemma, the M′lon sequence has a convergent subsequence, with a limit point M′lo, that lies in Θω,¬R(π′) by closure. Also, for all n,
d(Mlo,M′lo)≤d(Mlo,Mlon)+d(Mlon,M′lon)+d(M′lon,M′lo)
The two distances on either side limit to 0, and the middle distance limits to ϵ2(1+λ⊙+b⊙)(1+λ⊙+b⊙) or less, because eventually the λ value of Mlon gets really close to the λ value of Mlo, which is subject to the constraint that it can't be bigger than λ⊙+b⊙ due to Mlo being picked to have its λ+b value below λ⊙+b⊙, so d(Mlo,M′lo)≤ϵ2
Ok, so those two things are pretty close to each other. But what we really want is to find a point in Θω,¬R(π′) that's close to M, ie, Mlo+M∗. We can invoke the proof path from direction 2 of Lemma 15 (we have enough tools to do it, most notably upper completion) to craft a M′∈Θω,¬R(π′) where d(M,M′)≤ϵ2+δ(ϵ2+λ)
Further, δ<ϵ2(1+λ⊙+b⊙)≤ϵ2. So, we get d(M,M′)≤ϵ(1+λ) and we're done.
7: Nirvana-free Consistency: We're working in a nirvana-free setting, so we can simplify things. Our formulation that we're going to show is, regardless of stub πst, c.h(⋃π>πstprπ,πst∗(Θω(π)))is closed. Just invoke Lemma 20 and we have it and that's the last one we needed besides renormalization. Now all we have to do is to show that every property is preserved when we do the necessary rescaling. Invoke Lemma 24.
Proposition 3: Given some arbitarary Θ?ω which can be turned into an acausal hypothesis, turning it into Θω has EΘω(π)(f)=α(EΘ?ω(π)(f)−β) for all π and f.
The steps to make your full Θω are convex hull, closure, upper completion, and renormalization. For convex hull, because f induces a positive functional, which is linear, convex hull doesn't affect the worst-case value (mixing a-measures mixes the score you get w.r.t the function), closure just swaps inf out for min, and upper completion doesn't add any new minimal points so it preserves the same minimal values for everything. Let's use Θω,¬R for the upper completion of the closed convex hull (no renormalization) so, unpacking definitions, for all π,f:
EΘ?ω(π)(f)=inf(m,b)∈Θ?ω(π)(m(f)+b)=inf(m,b)∈c.h(Θ?ω(π))(m(f)+b)
=inf(m,b)∈¯¯¯¯¯¯c.h(Θ?ω(π))(m(f)+b)=inf(m,b)∈(¯¯¯¯¯¯c.h(Θ?ω(π)))uc(m(f)+b)=EΘω,¬R(π)(f)
Continuing onwards, let's use β for our shift constant and α for our rescaling constant.
EΘω(π)(f)=inf(m,b)∈Θω(π)(m(f)+b)=inf(m,b)∈Θω,¬R(π)(αm(f)+α(b−β))
=α((inf(m,b)∈Θω,¬R(π)(m(f)+b))−β)=α(EΘω,¬R(π)(f)−β)
So, regardless of your π and f, EΘω(π)(f)=α(EΘ?ω(π)(f)−β)
and we're done. In particular, since this scale and shift is completely uniform across everything, it keeps the set of optimal policies unchanged.
Proposition 4: For all hypotheses Θ and Θ′,(∀π,f:EΘ(π)(f)=EΘ′(π)(f))↔(→NF(Θ)=→NF(Θ′))
We can use that Murphy never picks something with Nirvana in it, and →NF(Θ)(π)=Θ(π)∩NF to rewrite our desired property as:
(∀π,f:E→NF(Θ)(π)(f)=E→NF(Θ′)(π)(f))↔(→NF(Θ)=→NF(Θ′))
one direction of this is pretty easy, if the belief functions are identical when you slice off the nirvana, then regardless of π and f, Murphy forces the same value. The other direction of this can be done by Theorem 3 from Section 1. Fixing a π, the property ∀f:E→NF(Θ)(π)(f)=E→NF(Θ′)(π)(f) implies →NF(Θ)(π)=→NF(Θ′)(π), but this holds for all π, so we get →NF(Θ)=→NF(Θ′)
Proposition 5: For all hypotheses Θ, and all continuous functions g from policies to functions f∈C((A×O)ω,[0,1]), then argmaxπEΘ(π)(g(π)) exists and is closed.
Proof sketch: We'll prove this in four phases, where πn is some arbitrary sequence of policies limiting to the policy π.
Our first phase will be establishing that limn→∞|EΘ(πn)(g(πn))−EΘ(πn)(g(π))|=0
Our second phase will be establishing that limsupn→∞EΘ(πn)(g(π))≤EΘ(π)(g(π))
Our third phase will be establishing that liminfn→∞EΘ(πn)(g(π))≥EΘ(π)(g(π))
Putting phase 2 and 3 together, limn→∞EΘ(πn)(g(π))=EΘ(π)(g(π)) and then, in conjunction with phase 1, limn→∞EΘ(πn)(g(πn))=EΘ(π)(g(π)) This establishes that the function π↦EΘ(π)(g(π)) is continuous. Phase 4 is then deriving our desired result from the continuity of that function.
Begin phase 1. To begin with, because g is a function from a compact metric space to a metric space, by the Heine-Cantor theorem, it's uniformly continuous. So, there's some δ difference between policies that guarantees an ϵ difference between the functions produced, and our distance metric on functions in this case is supx|f(x)−f′(x)|, the distance metric associated with uniform convergence.
By Proposition 3 (Section 1), every positive functional (and, by Proposition 1 (Section 1), continuous functions induce positive functionals) is minimized within the set of minimal points, so we can fix an (m,b) and (m′,b′) within (Θ(πn))min (specifically, the nirvana-free component) which minimize the positive functionals associated with g(πn) and g(π), respectively. Being able to get an actual minimizing point follows from minimal-boundedness, so the closure of the set of minimal points (due to having bounds) is compact, and a continuous function from a compact set to [0,1] has an actual minimizing point, so we can pick such a point and then step down to a minimal point if needed
Note that, by the way we picked these, m(g(πn))+b=EΘ(πn)(g(πn)) and also m′(g(π))+b′=EΘ(πn)(g(π))
First, we'll bound the following two terms. |(m(g(πn))+b)−(m(g(π))+b)| and |(m′(g(π))+b′)−(m′(g(πn))+b′)| The same arguments work for both, so we'll just show one of them.
|(m(g(πn))+b)−(m(g(π))+b)|=|m(g(πn)−g(π))|≤m(|g(πn)−g(π)|)≤λ⊙ϵn
The argument for this is that the first ≤ is because m is a measure (never negative), so an upper-bound on the absolute value of the expectation is given by the expectation of the absolute value of the distance between the two functions. For the second ≤, if n is large enough to make πn and π be only δn apart, then g(πn) and g(π) are only ϵn apart, so that absolute value is upper bounded by ϵn, getting us an upper bound of m(ϵn) Then, because the total amount of measure for minimal points is upper bounded by some λ⊙ regardless of which policy we picked (minimal-point boundedness for Θ), we can finally impose an upper bound of ϵnλ⊙ on the distance. The sort of argument works for the second thing, and gets us the exact same upper bound.
Further, m(g(πn))+b≤m′(g(πn))+b′ and m′(g(π))+b′≤m(g(π))+b. This is because (m,b) is specialized to minimize g(πn) and (m′,b′) is specialized to minimize g(π). Therefore, in one direction:
(m′(g(π))+b′)−(m(g(π))+b)≤0 so (m′(g(π))+b′)−(m(g(πn))+b)≤ϵnλ⊙
In the other direction,
(m(g(πn))+b)−(m′(g(πn))+b′)≤0 so (m(g(πn))+b)−(m′(g(π))+b′)≤ϵnλ⊙
Thus, putting the two parts together,
|EΘ(πn)(g(πn))−EΘ(πn)(g(π))|=|(m(g(πn))+b)−(m′(g(π))+b′)|≤ϵnλ⊙
We can make n go to infinity, which makes δn (distance between policies) go to 0, which makes ϵn (distance between functions) go to 0, and λ⊙ is a constant, so we get that the distance between the two expectations limits to 0 and we're done with the first phase.
Time for phase 2, showing that limsupn→∞EΘ(πn)(g(π))≤EΘ(π)(g(π))
Fix some (m,b) in Θ(π)∩NF that minimizes the positive functional associated with g(π). By Hausdorff-Continuity, we can find a sequence of points (mn,bn)∈Θ(πn)∩NF that limit to (m,b). By continuity of g(π), this means that mn(g(π))+bn limits to m(g(π))+b, which is EΘ(π)(g(π)). However, mn(g(π))+bn≥EΘ(πn)(g(π))
Thus, limsupn→∞EΘ(πn)(g(π))≤EΘ(π)(g(π))
Now for phase 3, showing that liminfn→∞EΘ(πn)(g(π))≥EΘ(π)(g(π))
Assume it's false, we'll get a proof-by contradiction. That is,
liminfn→∞EΘ(πn)(g(π))<EΘ(π)(g(π))
Then, we can get some subsequence where the expectations converge to the liminf. For each n in that subsequence, fix a (mn,bn)∈(Θ(πn))min∩NF that minimizes the positive functional associated with g(π) within Θ(πn). Ie, mn(g(π))+bn=EΘ(πn)(g(π))
By bounded minimals, there's some λ⊙+b⊙ bound on all of these, so we can isolate another convergent subsequence (the expectation values still limit to the liminf), where the (mn,bn) limit to some (m,b). For the following arguments, we'll use n to denote numbers from our original sequence (ranges over all natural numbers) and j to denote numbers from our convergent subsequence of interest (where the expectations converge to liminf and our sequence of minimizing points converges to a limit point)
First, this (m,b) limit point lies in Θ(π), because it's arbitrarily close to points that are arbitrarily close to Θ(π)(Hausdorff-continuity), so the distance to that set shrinks to 0, and Θ(π) is closed so said point limits to be in it. Now, we can go
liminfn→∞EΘ(πn)(g(π))=limj→∞(mj(g(π))+bj)=m(g(π))+b
By mj(g(π))+bj=EΘ(πj)(g(π)), and the j's making a subsequence where we attain the liminf value in the limit, and then the second equality is a convergent sequence of a-measures having their expectation value limit to the expectation value of the limit point.
But then we get something impossible. (m,b)∈Θ(π), and yet somehow (by our original assumption that the liminf undershot the expectation value of g(π) in Θ(π)),
m(g(π))+b=liminfn→∞EΘ(πn)(g(π))<EΘ(π)(g(π))=min(m′,b′)∈Θ(π)∩NF(m′(g(π))+b′)
Which cannot be. This shows that the liminf is ≥EΘ(π)(g(π)).
Now for phase 4. Again, from the proof sketch, phases 1, 2, and 3 establish that π↦EΘ(π)(g(π)) is a continuous function Π→R≥0 Let's abbreviate this function as χ. Since we're mapping Π (which is compact) through a continuous function, the image χ(Π) is compact. Thus, it has a maximum value, which is attained by some policy. Take that maximum value (it's a single point so it's closed), take the preimage (which is a nonempty closed set of policies), and that's your argmaxπEΘ(π)(g(π)) set. And EΘ(π)(g(π)) unpacks as min(m,b)∈Θ(π)∩NF(m(g(π))+b)Thus showing our result.
A quick corollary of it is, if g just returns the constant 1 function, you can find a policy π where min(λμ,b)∈Θ(π)(λ+b) is 1, by normalization, so we can use max in normalization instead of sup.
Lemma 25: In the nirvana-free setting, with all the Bi⊆FNF(πpa) being nonempty and upper complete, then EζBi is upper-complete.
The proof of this is nearly identical to the proof of Lemma 12. Except in this case, our Mi aren't finitely many points selected from a nonconvex B, they're countably many points selected from the various Bi. Apart from that difference, the proof path works as it usually does.
Lemma 26: A belief function Θ fulfilling all conditions except normalization, which is renormalized by subtracting infπEΘ(π)(0) and scaling by (supπEΘ(π)(1)−infπEΘ(π)(0))−1, only has the renormalization fail if, for all π, Θ(π)∩NF has a single minimal point of the form (0,b), with the same b for all π. (works in the sur-case too)
First, fixing an arbitrary π′, then, if there's a divide-by-zero when scaling,
EΘ(π′)(1)≤maxπEΘ(π)(1)=minπEΘ(π)(0)≤EΘ(π′)(0)
However, EΘ(π′)(1)≥EΘ(π′)(0) always, so the two terms are equal.
Now, just invoke Proposition 6 from Section 1, which says that if EΘ(π′)∩NF(1)=EΘ(π′)∩NF(0), then there's a single minimal point of the form (0,b), and the b is EΘ(π′)∩NF(0), which is the same for all policies π′. The converse is, if all Θ(π) are of this form, then renormalization fails.
Let's define "nontriviality" for a belief function Θ. A Θ is nontrivial if there exists some π where EΘ(π)(1)≠EΘ(π)(0)
In other words, there's some policy you can feed in where the minimal points of Θ(π) aren't just a single (0,b) point. This is a very weak condition. Also, for the upcoming proposition 6, mixing just doesn't interact well with nirvana-free consistency, so we have to do it just for pseudocausal and acausal hypotheses.
Proposition 6: For pseudocausal and acausal hypotheses Θi where ∑iζiλ⊙i<∞ and there exists a nontrivial Θi, then mixing them and renormalizing produces a pseudocausal or acausal hypothesis.
Mixing is defined on the infinite levels by (EζΘi)(π)=Eζ(Θi(π)), and is then extended down to the finite levels by the usual process of projecting down and taking the closed convex hull. Then, we can renormalize if we wish. We'll distinguish these by "raw" or "renormalized" mix. Thus, the only conditions we need to check are the infinite conditions. For everything else, we can just go "the infinite conditions work, so we can extend to a full belief function" by the Isomorphism theorem.
If you want what mixing does at finite levels, it's ¯¯¯¯¯¯¯¯c.h(⋃π>πst(Eζ(prπ,πst∗(Θi(π))))) So it isn't "mix all the finite levels", it's "mix all the projections individually and then take convex hull"
Proof sketch: Neglecting normalization (because Lemma 24 shows we can just renormalize and all nice conditions are preserved), we just need to verify all the relevant infinitary conditions, and then we can extend to lower levels by isomorphism, and get our result. We also need to show that nontriviality implies that the renormalization doesn't fail, but that's easy. As for the conditions, our lemmas let us get most of them with little trouble. Bounded-minimals from just the λ⊙ bound is slightly more difficult and relies on showing that (0,1) is in all the Θi(π) sets regardless of i and π by normalization to eliminate the b term, and Hausdorff-continuity is also fairly nontrivial (we have to split our mix into three pieces and bound each one individually via a different argument) and relies on the same (0,1) is in all the Θi(π) result. For causality, we'll knock it out with Tychonoff in a slightly more complicated way than usual so we just use a countable product and don't have to invoke the full Axiom of Choice.
We'll take a detour and show that (0,1)∈EζΘi(π) for all π, we need this in a few places. First, pick an arbitrary i and π and look at Θi(π). Find a minimal point (m,b) that minimizes m(1)+b. Now, consider (m,b)+(−m,m(1)). This is (0,m(1)+b). However...
m(1)+b=min(m′,b′)∈Θi(π)∩NF(m(1)+b)=EΘi(π)(1)≤maxπEΘi(π)(1)=1
(by (m,b) minimizing m(1)+b, and normalization, respectively)
So, by nirvana-free upper-completion for Θi, we get the point (0,1)∈Θi(π) for all i and π. Then, mixing these gets that (0,1) lies in (EζΘi)(π).
1: Infinitary Nirvana-Free Nonemptiness, it's easy, all our (EζΘi)(π) contain the a-measure (0,1), which is nirvana free.
2,3: Closure, convexity: Closure follows from the proof of closure in Proposition 11 of section 1, and we mixed convex sets so the mixture is convex.
4: Nirvana-free Upper-Completion: Just invoke Lemma 25.
5: Bounded minimals: Since (0,1)∈(EζΘi)(π) for all i, any a-measure with b≥1 isn't minimal (add whatever you want to (0,1)), so we have a bound on the b values of minimals. What about a bound on the λ values?
Well, just take a M in (EζΘi)(π) with b<1, and split it into a mixture of Mi points from the Θi(π). Then, by bounded minimality for the Θi, we can take each Mi and find a minimal point below it that fulfills the λ⊙i bound on the λ values. Mixing those minimals produces a point Mlo that's below M, and has a λ value below Eζλ⊙i, which, by assumption, is finite. So, every minimal point in any (EζΘi)(π) has a λ+b value below Eζλ⊙i+1 and we have bounded-minimals.
Nirvana-free consistency is something we'll have to loop back to after Hausdorff-continuity.
Condition 8: Hausdorff-continuity:
Pick an arbitrary M∈(EζΘi)(π). It shatters into Mi∈Θi(π). We'll be showing the Lemma 15 variant, which is that for all ϵ, there's a δ where if d(π′,π)<δ, then there's a point in (EζΘi)(π′) that's ϵ(1+λ) away.
First, we'll shuffle around what our Mi are supposed to be, we need a certain decomposition to make it work. Reindex your probability distribution so the highest-probability thing is assigned to i=0. All the Mi can be decomposed as a Mmini+M∗i. Now, let our new Mi for i>0 be defined as: Mmini+(m∗−i,−m∗−i(1)),
and our Mi for i=0 is defined as: Mmin0+(m∗−0,−m∗−0(1))+∑iζiζ0(m∗+i,b∗i+M∗−i(1))
These new Mi still lie in Θi(π)∩NF (nirvana-free upper-completion), and they're all a-measures (the negative part isn't enough to cancel out the positive measure on mmini otherwise our old Mi wouldn't be an a-measure). Further, mixing them together still makes M, and if i>0, then λi≤λ⊙i (because we start off at a minimal point with λi≤λ⊙i, and then add something to it that saps some measure from it).
Fixing some ϵ... well, Eζλ⊙i<∞, so find a j where ∑i>jζi(λ⊙i+1)<ϵ3. For i≤j... well, using the Lemma 15 variant of Hausdorff-continuity, note that fixing a δ gets you a different ϵi value for Hausdorff-continuity of each Θi. We only have to worry about the i≤j, though, and there's finitely many. So, pick a δ where the induced ϵ0 is below ϵ3, and for all 0<i≤j, ϵi<ϵEζλ⊙i+1.
So... we take our Mi in Θi(π), and go to nearby M′i in Θi(π′). We should break down exactly how this is done. For M0, the λ value relative to M is at most λζ0 (in the degenerate case where it contributes all the measure to the mixture M), so, by Hausdorff-continuity for Θ0, the gap between M0 and M′0 is at most ϵ3(1+λζ0) because we picked δ low enough to get that scale factor on the front.
For Mi where 0<i≤j, the gap between Mi and M′i is at most ϵ3(Eζλ⊙i+1)(1+λ⊙i) Because we picked δ low enough to guarantee that scale factor on the front, and Mi is made by adding a minimal point and an sa-measure where the measure component is entirely negative, so λ⊙i is a bound on the λ value for Mi.
And finally, for i>j, we can specially craft a M′i where the gap between Mi and M′i is at most λ⊙i+1. This is because Mi has measure below λ⊙i due to being a minimal point that lost some measure. So, we can expend λ⊙i effort to completely reshuffle it however we wish, and then add (0,1) to our reshuffled a-measure to make an a-measure that lies above (0,1), which must be in Θi(π′), so our reshuffled a-measure plus (0,1) lies in Θi(π′) by nirvana-free upper-completion, and we only had to spend λ⊙i+1 effort to travel to it (first term is the reshuffling, second term is adding 1 to the b term)
Now, let's analyze the distance between M and the point EζM′i which lies in (EζΘi)(π′). d(M,EζM′i) equals...
d(ζ0M0+∑0<i≤j(ζiMi)+∑i>j(ζiMi),ζ0M′0+∑0<i≤j(ζiM′i)+∑i>j(ζiM′i))
≤d(ζ0M0,ζ0M′0)+∑0<i≤jd(ζiMi,ζiM′i)+∑i>jd(ζiMi,ζiM′i)
=ζ0d(M0,M′0)+∑0<i≤j(ζid(Mi,M′i))+∑i>j(ζid(Mi,M′i))
<ζ0ϵ3(1+λζ0)+∑0<i≤j(ζiϵ3(Eζλ⊙i+1)(1+λ⊙i))+∑i>jζi(λ⊙i+1)
<ϵ3(ζ0+λ)+ϵ31Eζλ⊙i+1∑0<i≤j(ζi(1+λ⊙i))+ϵ3
<ϵ3(ζ0+1+λ+1Eζλ⊙i+1∑iζi(λ⊙+1))<ϵ3(1+1+λ+Eζ(λ⊙i+1)Eζλ⊙i+1)
=ϵ3(3+λ)≤ϵ3(3+3λ)=ϵ(1+λ)
And bam, we've got Hausdorff-continuity!
9: Nirvana-free consistency: Invoke Lemma 20, notice that we're in the nirvana-free setting, we're done.
Pseudocausality. Fix some M∈(EζΘi)(π), whose support is a subset of FNF(π′). M shatters into Mi∈Θi(π). All of them have their support being a subset of FNF(π′) (otherwise there'd be measure-mass outside of there), so all the Mi transfer over to Θi(π′) by pseudocausality for the Θi, and then we can mix them back together to get M∈(EζΘi)(π′).
And we're done, mixing works just fine, as long as we can show that the renormalization preserves everything and the renormalization doesn't fail. Renormalization fails iff (By Lemma 26), for all π, then E(EζΘi)(π)(1)=E(EζΘi)(π)(0)
We can then go E(EζΘi)(π)(1)=EEζ(Θi(π))(1)=Eζ(EΘi(π)(1)) and similar for 0, (by the definition of the mix of belief functions and Proposition 10 in Section 1) and then we can use that E(EζΘi)(π)(1)−E(EζΘi)(π)(0)=0 to go
Eζ(EΘi(π)(1))−Eζ(EΘi(π)(0))=0
∑iζi(EΘi(π)(1)−EΘi(π)(0))=0
So, then, for all π and i, EΘi(π)(1)=EΘi(π)(0) However, this is incompatible with the existence of a nontrivial Θi, because nontriviality just says that there's a π where EΘi(π)(1)≠EΘi(π)(0).
So, nontriviality for some Θi(π) means that the renormalization of your mix can be done. It's a very weak condition, just saying "there's some possibility of starting with a hypothesis (Θi) which has a policy π, where murphy actually has to pay attention to what function you're maximizing.
Now, we just need to show that renormalization preserves all nice conditions. Just invoke isomorphism to complete to a full psuedo/acausal belief function lacking normalization, apply Lemma 24 and renormalize everything, and we're done.
Proposition 7: For pseudocausal and acausal hypotheses, E(EζΘi)(πpa)(f)=Eζ(EΘi(πpa)(f))
Proof: E(EζΘi)(πpa)(f)=EEζ(Θi(πpa))(f)=Eζ(EΘi(πpa)(f)) by Proposition 10 of Section 1.
Proposition 8: For pseudocausal and acausal hypotheses,
prπhipa,πlopa∗((EζΘi)(πpa))=Eζ(prπhipa,πlopa∗(Θi(π′pa)))
This is an easy one. If M∈prπhipa,πlopa∗((EζΘi)(πhipa)), then there's a preimage point M′∈(EζΘi)(πhipa), which then decomposes into M′i∈Θi(πhipa). These M′i project down to Mi, which then mix to make M (project then mix equals mix then project because of linearity) witnessing that M∈Eζ(prπhipa,πlopa∗(Θi(πhipa)))
Now for the reverse direction. If M∈Eζ(prπhipa,πlopa∗(Θi(πhipa))), then it shatters into Mi∈prπhipa,πlopa∗(Θi(πhipa)), which have preimage points M′i∈Θi(πhipa). The M′i mix to make a M′∈(EζΘi)(πhipa), which projects down to M, by project-mix equaling mix-project, witnessing that M∈prπhipa,πlopa∗((EζΘi)(πhipa)).
Next proof post!