Proposition 5: If Bmin⊆Ma(X), then the condition "there is a λ⊙ where, ∀(λμ,b)∈Bmin:λ≤λ⊙" is equivalent to "there is a compact C s.t. Bmin⊆C"
Proof sketch: One direction is immediate from the Compactness Lemma. For showing that just a bound on the λ values suffices to be contained in a compact set, instead of a bound on the λ and b values to invoke the Compactness Lemma, we use a proof by contradiction where we can get a bound on the b values of the minimal points from just a bound on the λ values.
Proof: In one direction, assume there's a compact C s.t. Bmin⊆C, and yet there's no upper-bounding λ⊙ on the λ values. This is impossible by the Compactness Lemma, since $(\lambda\mu)^+(1)=\lambda\mu^+(1)=\lambda\mu(1)=\lambda$.
In the other direction, assume there's a λ⊙ bound on λ for the minimal points. Fix some arbitrary (λμ,b)∈Bmin for the rest of the proof. Now, we will show that all minimal points (λ′μ′,b′)∈Bmin have λ′≤λ⊙, and b′≤λ⊙+b, letting us invoke the Compactness Lemma to get that everything is in a suitable compact set C. The first bound is obvious. Since λ′ came from a minimal point, it must have λ⊙ as an upper bound.
For the other one, by contradiction, let's assume that there's a minimal point (λ′μ′,b′) where b′>λ⊙+b. Then, we can write (λ′μ′,b′) as: (λμ,b)+(−λμ,λ⊙)+(λ′μ′,b′−λ⊙−b)
The first component, (λμ,b) is our fixed minimal point of interest. The second component is an sa-measure, because λ⊙−λ≥0, due to the λ⊙ upper bound on the λ value of minimal points. The third component is also a nonzero sa-measure, because λ′ is nonnegative (it came from a minimal point), and by assumption, b′>λ⊙+b. Hang on, we wrote a minimal point (λ′μ′,b′) as another minimal point (λμ,b), plus two sa-measures (one of which is nonzero), so (λ′μ′,b′) can't be minimal, and we have a contradiction.
Therefore, all (λ′μ′,b′)∈Bmin have b′≤λ⊙+b. Now that we have bounds on λ and b for minimal points, we can invoke the Compactness Lemma to conclude that everything is in a compact set.
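As a sanity check on the three-way decomposition used in the contradiction above, here is a minimal Python sketch on a finite X, where a signed measure is just a list of point masses. The helper names (`neg_mass`, `is_sa_measure`, `decompose`) and the specific numbers are illustrative assumptions, not part of the original argument.

```python
from fractions import Fraction as F

def neg_mass(m):
    """m^-(1): total mass of the negative part of m (a number <= 0)."""
    return sum(x for x in m if x < 0)

def is_sa_measure(m, b):
    """(m, b) is an sa-measure when b + m^-(1) >= 0."""
    return b + neg_mass(m) >= 0

def decompose(fixed, other, lam_bound):
    """Return the two extra summands in
    other = fixed + (-m, lam_bound) + (m', b' - lam_bound - b)."""
    (m, b), (m2, b2) = fixed, other
    second = ([-x for x in m], lam_bound)
    third = (list(m2), b2 - lam_bound - b)
    return second, third

# Hypothetical data on a two-point space X.
lam_bound = F(1)                      # assumed bound on lambda for minimal points
fixed = ([F(1, 2), F(1, 4)], F(1))    # lambda = 3/4, b = 1
other = ([F(1, 4), F(1, 4)], F(3))    # b' = 3 exceeds lam_bound + b = 2

second, third = decompose(fixed, other, lam_bound)
total_m = [a + s + t for a, s, t in zip(fixed[0], second[0], third[0])]
total_b = fixed[1] + second[1] + third[1]
```

Both extra summands pass `is_sa_measure`, and the three pieces add back up to `other`, which is exactly why a point with such a large b′ cannot be minimal.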
Proposition 6: EB(0)=EB(1) only occurs when there's only one minimal point of the form (0,b).
Proof: Unpacking the expectations, and in light of Proposition 3,
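$E_B(1)=\inf_{(\lambda\mu,b)\in B^{min}}(\lambda\mu(1)+b)=\inf_{(\lambda\mu,b)\in B^{min}}(\lambda+b)$ and $E_B(0)=\inf_{(\lambda\mu,b)\in B^{min}}(\lambda\mu(0)+b)=\inf_{(\lambda\mu,b)\in B^{min}}b$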
So, take a minimal a-measure (λμ,b) that minimizes λ+b. One must exist because we have λ and b bounds, so by the Compactness Lemma, we can restrict our attention to an actual compact set, and continuous functions from a compact set to R have a minimum, so there's an actual minimizing minimal point.
λ must be 0, because otherwise EB(1)=λ+b>b≥EB(0) which contradicts EB(1)=EB(0). Further, since b=λ+b=EB(1)=EB(0), said b must be the lowest b possible amongst minimal points.
So, we have a minimal point of the form (0,b) where b is the lowest possible b amongst the minimal points. Any other distinct minimal point must be of the form (λ′μ′,b′), where b′≥b. This other minimal point can be written as (0,b)+(λ′μ′,b′−b), where the latter component is a nonzero sa-measure (nonzero because the two points are distinct), so it's not minimal after all. Thus, there's only one minimal a-measure and it's of the form (0,b).
Proposition 7: Renormalizing a bounded inframeasure produces a bounded infradistribution, if renormalization doesn't fail.
Proof sketch: Our first order of business is showing that our renormalization process doesn't map anything outside the cone of sa-measures. A variant of this argument establishes that the preimage of a minimal point in BR must be a minimal point in B, which quickly establishes positive-minimals and bounded-minimals for BR. Then, we verify the other conditions of a bounded infradistribution. Nonemptiness, closure, and convexity are very easy, upper-closure is shown by adding appropriately-scaled sa-measures such that, after renormalization, they hit whatever sa-measure you want. Then, finally, we just have to verify that our renormalization procedure is the right one to use, that it makes EBR(1)=1 and EBR(0)=0.
Proof: First up, we need to show that after renormalization, nothing gets mapped outside the cone of sa-measures. Observe that the renormalization process is injective. If two points are distinct, after a scale-and-shift, they'll still be distinct.
Let B be our original set and BR be our renormalized set. Take a point in BR, given by (m,b). Undoing the renormalization, we get (EB(1)−EB(0))(m,b)+(0,EB(0))∈B.
By decomposition into a minimal point and something else via Theorem 2, we get that
(EB(1)−EB(0))(m,b)+(0,EB(0))=(mmin,bmin)+(m∗,b∗)
where (mmin,bmin)∈Bmin. Renormalizing back, we get that
$$(m,b)=\frac{1}{E_B(1)-E_B(0)}\Big((m^{min},b^{min}-E_B(0))+(m^*,b^*)\Big)$$
bmin≥EB(0), because EB(0) is the lowest b value amongst the minimal points. So, the first component is an a-measure, the second component is an sa-measure, so adding them is an sa-measure, and then we scale by a positive constant, so (m,b) is an sa-measure as well.
This general line of argument also establishes positive-minimals and bounded-minimals, as we'll now show. If the (m∗,b∗) isn't 0, then we just wrote (m,b) as
$$\frac{1}{E_B(1)-E_B(0)}(m^{min},b^{min}-E_B(0))+\frac{1}{E_B(1)-E_B(0)}(m^*,b^*)$$
And the first component lies in BR, but the latter component is nonzero, witnessing that (m,b) isn't minimal. So, if (m,b) is minimal in BR, then (m∗,b∗)=0, so it must be the image of a single minimal point (mmin,bmin)∈Bmin by injectivity. Ie, the preimage of a minimal point in BR is a minimal point in B.
Scale-and-shift maps a-measures to a-measures, showing positive-minimals, and the positive scale constant $(E_B(1)-E_B(0))^{-1}$ just rescales the λ⊙ upper bound on the λ values of the minimal points in B, showing bounded-minimals.
For the remaining conditions, nonemptiness, closure, and convexity are trivial. We're taking a nonempty closed convex set and doing a scale-and-shift so it's nonempty closed convex.
Time for upper-completeness. Letting B be our original set and BR be our renormalized set, take a point MR+M∗ in (BR)uc. By injectivity, MR has a single preimage point M∈B. Undoing the renormalization by multiplying by EB(1)−EB(0) (our addition of EB(0) is paired with MR to undo the renormalization on that one), consider M+(EB(1)−EB(0))M∗. This lies in B by upper-completeness, and renormalizing it back produces MR+M∗, which is therefore in BR, so BR is upper-complete.
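That just leaves showing that after renormalizing, we're normalized. Renormalization sends (m,b) to $\frac{1}{E_B(1)-E_B(0)}(m,b-E_B(0))$, and the scale constant is positive, so for any f,
$$E_{B^R}(f)=\inf_{(m,b)\in B}\frac{m(f)+b-E_B(0)}{E_B(1)-E_B(0)}=\frac{E_B(f)-E_B(0)}{E_B(1)-E_B(0)}$$
Plugging in f=1 and f=0 gives EBR(1)=1 and EBR(0)=0.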
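The scale-and-shift can be tried out on a toy example. The following Python sketch assumes a finite X and a set presented by finitely many a-measures, so that expectations are a plain minimum over the listed points; the function names and the sample points are hypothetical.

```python
from fractions import Fraction as F

def expectation(points, f):
    """E_B(f) = min over (m, b) of m(f) + b, for a finite list of points."""
    return min(sum(mi * fi for mi, fi in zip(m, f)) + b for m, b in points)

def renormalize(points, n):
    """Map (m, b) to (m, b - E_B(0)) / (E_B(1) - E_B(0)) for every point."""
    one, zero = [F(1)] * n, [F(0)] * n
    e1, e0 = expectation(points, one), expectation(points, zero)
    if e1 == e0:
        raise ValueError("renormalization fails: E_B(1) == E_B(0)")
    scale = 1 / (e1 - e0)
    return [([scale * x for x in m], scale * (b - e0)) for m, b in points]

# Hypothetical a-measures on a two-point space.
points = [([F(1, 2), F(1, 2)], F(1, 4)),
          ([F(1), F(1, 2)], F(1, 8))]
renormalized = renormalize(points, 2)
e_one = expectation(renormalized, [F(1), F(1)])
e_zero = expectation(renormalized, [F(0), F(0)])
```

Here `e_one` and `e_zero` come out as 1 and 0, matching the normalization claim; the `ValueError` branch is the "renormalization fails" case of the proposition.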
Lemma 6: g∗ is a continuous linear operator M±(X)⊕R→M±(Y)⊕R.
Proof sketch: First show linearity, then continuity, for the operator that just maps a signed measure through g, using some equation-crunching and characterizations of continuity. Then, since g∗ is just the pair of that and the identity function, it's trivial to show that it's linear and continuous.
We'll use g′∗ to refer to the function M±(X)→M±(Y) defined by (g′∗(m))(Z)=m(g−1(Z)), where Z is a measurable subset of Y and g∈C(X,Y). Ie, this specifies what the measure g′∗(m) is in terms of telling you what value it assigns to all measurable subsets of Y.
We'll use g∗ to refer to the function M±(X)⊕R→M±(Y)⊕R given by g∗(m,b)=(g′∗(m),b).
Our first order of business is establishing the linearity of g′∗. Observe that, for all measurable Z⊆Y, and a,a′ being real numbers, and m,m′ being signed measures over X,
$$(g'_*(am+a'm'))(Z)=(am+a'm')(g^{-1}(Z))=a\,m(g^{-1}(Z))+a'm'(g^{-1}(Z))=a(g'_*(m))(Z)+a'(g'_*(m'))(Z)$$
So, g′∗(am+a′m′)=ag′∗(m)+a′g′∗(m′) and we have linearity of g′∗.
Now for continuity of g′∗. Let mn limit to m. The sequence g′∗(mn) converging to g′∗(m) in our metric on M±(Y) is equivalent to: ∀f∈C(Y):limn→∞g′∗(mn)(f)=g′∗(m)(f)
So, if g′∗(mn) fails to converge to g′∗(m), then there is some continuous function f∈C(Y) that witnesses the failure of convergence. But, because g is a continuous function X→Y, then f∘g∈C(X), and also mn(f∘g)=g′∗(mn)(f), so:
limn→∞g′∗(mn)(f)=limn→∞mn(f∘g)=m(f∘g)=g′∗(m)(f)
The key step in the middle is that mn limits to m, so mn(f∘g) limits to m(f∘g), by our characterization of continuity. Thus, we get a contradiction, our f that witnesses the failure of convergence actually does converge. Therefore, g′∗(mn) limits to g′∗(m) if mn limits to m, so g′∗ is continuous.
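On finite spaces the pushforward is just "add up the mass of each preimage", which makes both linearity and the change-of-variables identity mn(f∘g)=g′∗(mn)(f) easy to check by hand. The Python sketch below assumes X and Y are finite index sets and g is given as a list; all names and numbers are illustrative.

```python
def pushforward(m, g, n_y):
    """Push a signed measure m on X = {0..len(m)-1} through g: X -> Y.
    g[x] is the image of x; each y receives the m-mass of g^{-1}(y)."""
    out = [0] * n_y
    for x, mass in enumerate(m):
        out[g[x]] += mass
    return out

def integrate(m, f):
    """m(f) for a measure and a function on the same finite space."""
    return sum(mass * value for mass, value in zip(m, f))

def compose(f, g):
    """f o g as a function on X, for f on Y and g: X -> Y."""
    return [f[y] for y in g]

g = [0, 0, 1]                  # hypothetical map from a 3-point X to a 2-point Y
m, m2 = [2, -1, 3], [0, 4, -2]
a, a2 = 3, -2
f = [5, 7]

# Linearity: pushing a linear combination vs combining the pushforwards.
lhs = pushforward([a * x + a2 * y for x, y in zip(m, m2)], g, 2)
rhs = [a * x + a2 * y
       for x, y in zip(pushforward(m, g, 2), pushforward(m2, g, 2))]

# Change of variables: m(f o g) = (g'_* m)(f).
pulled_back = integrate(m, compose(f, g))
pushed_forward = integrate(pushforward(m, g, 2), f)
```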
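To finish up, continuity for g∗ comes from the product of two continuous functions being continuous (g′∗ which we showed already, and idR because duh), and linearity comes from:
$$g_*(a(m,b)+a'(m',b'))=(g'_*(am+a'm'),ab+a'b')=(ag'_*(m)+a'g'_*(m'),ab+a'b')=ag_*(m,b)+a'g_*(m',b')$$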
Proposition 9: g∗(H) is a (bounded) inframeasure if H is, and it doesn't require upper completion if g is surjective.
Proof sketch: Nonemptiness is obvious, and showing that it maps sa-measures to sa-measures is also pretty easy. Closure takes a rather long argument that the image of any closed subset of sa-measures over X, through g∗, is closed, which is fairly tedious. We may or may not invoke upper completion afterwards, but if we do, we can just appeal to the lemma that the upper completion of a closed set is closed. Convexity is immediate from linearity of g∗.
For upper completion, we can just go "we took the upper completion" if g isn't surjective, but we also need to show that we don't need to take the upper completion if g is surjective, which requires crafting a measurable inverse function to g via the Kuratowski-Ryll-Nardzewski selection theorem, in order to craft suitable preimage points.
Then we can use LF-Duality to characterize the induced h function, along with Proposition 8, which lets us get positive-minimals, bounded-minimals, and normalization fairly easily, wrapping up the proof.
Proof: Nonemptiness is obvious. For showing that it takes sa-measures to sa-measures, take an (m,b)∈H, and map it through to get (g′∗(m),b)∈g∗(H). (m,b) is an sa-measure, so b+m−(1)≥0. Now, we can use Lemma 5 to get:
$$b+(g'_*(m))^-(1)\geq b+m^-(1)\geq 0$$
So the b term is indeed big enough that the image of (m,b) is an sa-measure.
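The intuition is that pushing a signed measure forward can only cancel negative mass against positive mass, never create more of it. A small randomized Python check on finite spaces illustrates this; the helper names, sizes, and the random sampling scheme are assumptions made for the sketch.

```python
import random
from fractions import Fraction as F

def neg_mass(m):
    """m^-(1): total mass of the negative part of m (a number <= 0)."""
    return sum(x for x in m if x < 0)

def pushforward(m, g, n_y):
    """Each y receives the total m-mass of its preimage under g."""
    out = [F(0)] * n_y
    for x, mass in enumerate(m):
        out[g[x]] += mass
    return out

def check_sa_preserved(trials=200, n_x=5, n_y=3, seed=0):
    """Sample sa-measures (m, b) and maps g, and test that
    b + (g'_* m)^-(1) >= b + m^-(1) >= 0 every time."""
    rng = random.Random(seed)
    for _ in range(trials):
        m = [F(rng.randint(-8, 8), 4) for _ in range(n_x)]
        g = [rng.randrange(n_y) for _ in range(n_x)]
        b = -neg_mass(m) + F(rng.randint(0, 4), 4)  # makes (m, b) an sa-measure
        pushed = pushforward(m, g, n_y)
        if not (b + neg_mass(pushed) >= b + neg_mass(m) >= 0):
            return False
    return True
```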
For closure, fix a sequence of (mn,bn)∈g∗(H) limiting to some (m,b), with preimage points (m′n,b′n)∈H. Due to convergence of (mn,bn) there must be some b◯ bound on the bn. g∗ preserves those values, so b◯ is an upper bound on the b′n. Since the (m′n,b′n) are sa-measures, −b◯ is a lower bound on the $m'^-_n(1)$ values. Since mn converges to m, mn(1) converges to m(1), so there's a λ◯ upper bound on the mn(1) values. Further, g′∗ preserves total mass, so
$$m'^+_n(1)=m'_n(1)-m'^-_n(1)=m_n(1)-m'^-_n(1)\leq\lambda^{\bigcirc}+b^{\bigcirc}$$
So, for all n, $m'^+_n(1)\leq\lambda^{\bigcirc}+b^{\bigcirc}$, so we have an upper bound on the b′n and $m'^+_n(1)$ values. Now we can invoke the Compactness Lemma to conclude that there's a convergent subsequence of the (m′n,b′n), with a limit point (m′,b′), which must be in H since H is closed. By continuity of g∗ from Lemma 6, g∗(m′,b′) must equal (m,b), witnessing that (m,b)∈g∗(H). So, g∗(H) is closed. Now, if we take upper completion afterwards, we can just invoke Lemma 2 to conclude that the upper completion of a closed set of sa-measures is closed.
Also, g∗ is linear from Lemma 6, so it maps convex sets to convex sets getting convexity.
Now for upper completion. Upper completion is immediate if g isn't surjective, because we had to take the upper completion there. Showing we don't need upper completion if g is surjective is trickier. We must show that g∗ is a surjection from Msa(X) to Msa(Y).
First, we'll show that g(U), where U is an open subset of X, is a measurable subset of Y. In metrizable spaces (of which X is one), every open set is an Fσ set, ie, it can be written as a countable union of closed sets. Because our space is compact, all those closed sets are compact. And the continuous image of a compact set is a compact set, ie closed. Therefore, g(U) is a countable union of closed sets, ie, measurable.
X is a Polish space (all compact metric spaces are Polish), it has the Borel σ-algebra, and we'll use the function g−1. Note that g−1(y) is closed and nonempty for all y∈Y due to g being a continuous surjection. Further, the set {y:g−1(y)∩U≠∅} equals g(U) for all open sets U. In one direction, if the point y is in the first set, then there's some point x∈U where g(x)=y. In the other direction, if a point y is in g(U), then there's some point x∈U where g(x)=y so g−1(y)∩U is nonempty.
Thus, g−1 is weakly measurable, because for all open sets U of X, {y:g−1(y)∩U≠∅}=g(U) and g(U) is measurable. Now, by the Kuratowski-Ryll-Nardzewski Measurable Selection Theorem, we get a measurable function g◊ from Y to X where g◊(y)∈g−1(y) so g(g◊(y))=y, and g◊ is an injection.
So, we can push any sa-measure of interest (m∗,b∗) through g◊∗ (which preserves the amount of negative measure due to being an injection), to get an sa-measure that, when pushed through g∗, recovers (m∗,b∗) exactly. Thus, if g∗(m,b)∈g∗(H), and you want to show g∗(m,b)+(m∗,b∗)∈g∗(H), just consider
$$g_*((m,b)+g^\diamond_*(m^*,b^*))=g_*(m,b)+g_*(g^\diamond_*(m^*,b^*))=g_*(m,b)+(m^*,b^*)$$
So, since (m,b)+g◊∗(m∗,b∗)∈H due to upper-completeness of H, we get g∗(m,b)+(m∗,b∗)∈g∗(H). And we have shown upper-completeness of g∗(H) if g is a surjection.
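On finite spaces no selection theorem is needed: a right inverse g◊ can be built by picking one preimage point per y. The Python sketch below, with hypothetical names and data, shows the two facts the argument relies on, namely that lifting through g◊ and pushing back through g recovers the measure, and that the lift keeps the same amount of negative mass.

```python
def right_inverse(g, n_y):
    """Pick one preimage point for each y; requires g to be surjective."""
    choice = [None] * n_y
    for x, y in enumerate(g):
        if choice[y] is None:
            choice[y] = x
    if any(c is None for c in choice):
        raise ValueError("g is not surjective")
    return choice

def pushforward(m, g, n_target):
    """Each target point receives the total m-mass of its preimage under g."""
    out = [0] * n_target
    for x, mass in enumerate(m):
        out[g[x]] += mass
    return out

def neg_mass(m):
    """Total mass of the negative part of m (a number <= 0)."""
    return sum(x for x in m if x < 0)

g = [0, 0, 1, 2, 2]              # hypothetical surjection, 5-point X onto 3-point Y
g_diamond = right_inverse(g, 3)  # a section: g[g_diamond[y]] == y
m_star = [-1, 2, 3]              # signed measure on Y

lifted = pushforward(m_star, g_diamond, 5)   # measure on X
recovered = pushforward(lifted, g, 3)        # back on Y
```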
We should specify something about using LF-Duality here. If you look back through the proof of Theorem 5 carefully, the only conditions you really need for isomorphism are, on the set side, g∗(H) being closed, convex, and upper complete (we have these properties, and they let us use Proposition 2 to rewrite g∗(H) appropriately for the subsequent arguments), and, on the functional side, f↦Eg∗(H)(f) being concave (free), −∞ if range(f)⊈[0,1] (by the proof of Theorem 4, this comes from upper completeness), and continuous over f∈C(Y,[0,1]) (showable via Proposition 8, which gives Eg∗(H)(f)=EH(f∘g), with the latter being continuous since H is an infradistribution).
It's a bit of a pain to run through this argument over and over again, so we just need to remember that if you can show closure, convexity, upper completeness, and the expectations to be continuous, that's enough to invoke LF-Duality and clean up the minimal point conditions. We did that, so we can invoke LF-Duality now.
Time for normalization. From Proposition 8, the g∗(h) function we get from f↦Eg∗(H)(f) is uniquely characterized as: g∗(h)(f)=h(f∘g). So,
Eg∗(H)(1)=g∗(h)(1)=h(1∘g)=h(1)=EH(1)=1
Eg∗(H)(0)=g∗(h)(0)=h(0∘g)=h(0)=EH(0)=0
and normalization is taken care of.
For bounded-minimals/weak-bounded-minimals, since g∗(H) is the LF-dual of g∗(h), we can appeal to Theorem 5 and just check whether g∗(h) is Lipschitz/uniformly continuous. If d(f,f′)<δ, then d(f∘g,f′∘g)<δ according to the sup metric on C(Y,[0,1]) and C(X,[0,1]), respectively, which (depending on whether we're dealing with Lipschitzness or uniform continuity), implies that |h(f∘g)−h(f′∘g)|<λ⊙δ, or ϵ for uniform continuity. So, we get: |g∗(h)(f)−g∗(h)(f′)|=|h(f∘g)−h(f′∘g)|<λ⊙δ (or ϵ for uniform continuity), thus establishing that f and f′ being sufficiently close means that g∗(h) doesn't change much, which, by Theorem 5, implies bounded-minimals/weak-bounded-minimals in g∗(H).
For positive-minimals it's another Theorem 5 argument. If f′≥f, then f′∘g≥f∘g, so: g∗(h)(f′)−g∗(h)(f)=h(f′∘g)−h(f∘g)≥0 And we have monotonicity for g∗(h), which, by Theorem 5, translates into positive-minimals on g∗(H).
Lemma 7: If M∈(EζHi)min, then for all decompositions of M into Mi, Mi∈(Hi)min
This is easy. Decompose M into EζMi. To derive a contradiction, assume there exists a nonminimal Mi that decomposes into $M_i^{min}+M_i^*$ where $M_i^*\neq 0$. Then,
$$M=E_\zeta M_i=E_\zeta(M_i^{min}+M_i^*)=E_\zeta(M_i^{min})+E_\zeta(M_i^*)$$
Thus, we have decomposed our minimal point into another point which is also present in EζHi, and a nonzero sa-measure (nonzero because there's a nonzero M∗i), so our original "minimal point" is nonminimal. Therefore, all decompositions of a minimal point in the mixture set must have every component part being minimal as well.
Proposition 11: A mixture of infradistributions is an infradistribution. If it's a mixture of bounded infradistributions with Lipschitz constants on their associated h functions of λ⊙i, and ∑iζiλ⊙i<∞, then the mixture is a bounded infradistribution.
Proof sketch: Nonemptiness, convexity, upper completion, and normalization are pretty easy to show. Closure is a nightmare.
The proof sketch of Closure is: Take a sequence (mn,bn) limiting to (m,b). Since each approximating point is a mixture of points from the Hi, we can shatter each of these (mn,bn)∈EζHi into countably many (mi,n,bi,n)∈Hi. This defines a sequence in each Hi (not necessarily convergent). Then, we take some bounds on the (mn,bn) and manage to translate them into (rather weak) i-dependent bounds on the (mi,n,bi,n) sequence. This lets us invoke the Compactness Lemma and view everything as wandering around in a compact set, regardless of Hi. Then, we take the product of these compact sets to view everything as a single sequence in the product of compact sets, which is compact by Tychonoff's theorem. This is only a countable product of compact metric spaces, so we don't need full axiom of choice. Anyways, we isolate a convergent subsequence in there, which makes a convergent subsequence in each of the Hi. And then, we can ask "what happens when we mix the limit points in the Hi according to ζ?" Well, what we can do is just take a partial sum of the mixture of limit points, like the i from 0 to 1 zillion. We can establish that (m,b) gets arbitrarily close to the upper completion of a partial sum of the mixture of limit points, so (m,b) lies above all the partial sums of our limit points. We show that the partial sums don't have multiple limits, then, we just do one more invocation of Lemma 3 to conclude that the mixture of limit points lies below (m,b). Finally, we appeal to upper completion to conclude that (m,b) is in our mixed set of interest. Whew!
Once those first 4 are out of the way, we can then invoke Theorem 5 to translate to the h view, and mop up the remaining minimal-point conditions.
First, nonemptiness. By Theorem 5, we can go "hm, the hi are monotone on C(X,[0,1]), and −∞ everywhere else, and hi(1)=1, so the affine functional ϕ:ϕ(f)=1 lies above the graph of hi". This translates to the point (0,1) being present in all the Hi. Then, we can just go: Eζ(0,1)=(0,1), so we have a point in our EζHi set.
For normalization, appeal to Proposition 10 and normalization for all the Hi. EEζHi(1)=Eζ(EHi(1))=Eζ(1)=1 and EEζHi(0)=Eζ(EHi(0))=Eζ(0)=0.
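For a finite mixture of finitely generated sets, the "expectation of the mixture is the mixture of the expectations" step can be checked directly by enumerating every way of picking one point from each Hi. The Python sketch below is only a toy instance (two sets, two-point X, hypothetical numbers); the actual statement covers countable mixtures.

```python
from fractions import Fraction as F
from itertools import product

def expectation(points, f):
    """E_H(f) = min over (m, b) of m(f) + b, for a finite list of points."""
    return min(sum(mi * fi for mi, fi in zip(m, f)) + b for m, b in points)

def mix_points(sets, zeta):
    """All points sum_i zeta_i * M_i with M_i ranging over the i-th set."""
    mixed = []
    for combo in product(*sets):
        dim = len(combo[0][0])
        m = [sum(z * pt[0][k] for z, pt in zip(zeta, combo)) for k in range(dim)]
        b = sum(z * pt[1] for z, pt in zip(zeta, combo))
        mixed.append((m, b))
    return mixed

# Two hypothetical normalized sets on a two-point space.
H1 = [([F(1), F(0)], F(0)), ([F(0), F(1)], F(0))]
H2 = [([F(1, 2), F(1, 2)], F(0)), ([F(1, 4), F(1, 4)], F(1, 2))]
zeta = [F(1, 3), F(2, 3)]
mixed = mix_points([H1, H2], zeta)

f = [F(1, 2), F(1)]
direct = expectation(mixed, f)
mixture_of_expectations = sum(z * expectation(H, f)
                              for z, H in zip(zeta, [H1, H2]))
```

The minimum separates across the components because each Hi contributes its own term to the sum, which is the finite version of the identity used for normalization above.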
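Convexity is another easy one. Take M,M′∈EζHi. They shatter into Mi,M′i∈Hi. Then, for any p∈[0,1], we can just go:
$$pM+(1-p)M'=pE_\zeta M_i+(1-p)E_\zeta M'_i=E_\zeta(pM_i+(1-p)M'_i)$$
and pMi+(1−p)M′i∈Hi by convexity of the Hi, so pM+(1−p)M′∈EζHi.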
That leaves the nightmare of closure. Fix a sequence Mn∈Eζ(Hi) limiting to M. You can think of the Mn as (mn,bn). We can shatter the Mn into Mi,n∈Hi, where Mi,n can be thought of as (mi,n,bi,n).
Now, since Mn converge to something, there must be an upper bound on the bn and mn(1) terms of the sequence, call those b◯ and λ◯. Now, for all n and all i′, b◯≥bn=∑iζibi,n≥ζi′bi′,n (the bi,n are nonnegative because the Mi,n are sa-measures), so, for all n and i, $b_{i,n}\leq\frac{b^{\bigcirc}}{\zeta_i}$.
Also, for all n and i′, λ◯+b◯≥mn(1)+bn=∑i(ζi(mi,n(1)+bi,n))≥ζi′(mi′,n(1)+bi′,n) and reshuffling, we get $\frac{\lambda^{\bigcirc}+b^{\bigcirc}}{\zeta_{i'}}\geq m_{i',n}(1)+b_{i',n}$, which then makes $\frac{\lambda^{\bigcirc}+b^{\bigcirc}}{\zeta_{i'}}\geq m^+_{i',n}(1)+(m^-_{i',n}(1)+b_{i',n})$. Further, due to (mi′,n,bi′,n) being an sa-measure, $b_{i',n}+m^-_{i',n}(1)\geq 0$, so for all n and i, $m^+_{i,n}(1)\leq\frac{\lambda^{\bigcirc}+b^{\bigcirc}}{\zeta_i}$.
Ok, so taking stock of what we've shown so far, it's that for all i, the sequence Mi,n is roaming about within $H_i\cap\{(m,b)\,|\,b\leq\frac{b^{\bigcirc}}{\zeta_i},\,m^+(1)\leq\frac{\lambda^{\bigcirc}+b^{\bigcirc}}{\zeta_i}\}$. And, by the Compactness Lemma, this set is compact, since it's got bounds (weak bounds, but bounds nonetheless). Defining
$$\overline{M_n}\in\prod_i\Big(H_i\cap\Big\{(m,b)\,\Big|\,b\leq\frac{b^{\bigcirc}}{\zeta_i},\,m^+(1)\leq\frac{\lambda^{\bigcirc}+b^{\bigcirc}}{\zeta_i}\Big\}\Big)$$
where $\overline{M_n}(i):=M_{i,n}$, we can view everything as one single sequence $\overline{M_n}$ wandering around in the product of compact sets. By Tychonoff's theorem (we've only got a countable product of compact metric spaces, so we don't need full axiom of choice, dependent choice suffices), we can fix a convergent subsequence of this, and the projections of this subsequence to every Hi converge.
Ok, so we've got a subsequence of n where, regardless of i, Mi,n converge to some Mi∈Hi (by closure of Hi). How does that help us? We don't even know if mixing these limit points converges to something or runs off to infinity. Well... fix any j you like, we'll just look at the partial sum of the first j components. Also fix any ϵ you please. On our subsequence of interest, the Mn converge to M, and in all i, the Mi,n converge to Mi. So, let n be large enough (and in our subsequence) that d(Mn,M)<ϵ, and ∀i≤j:d(Mi,n,Mi)<ϵ, we can always find such an n.
Now, ∑i≤jζiMi+∑i>jζiMi,n is a well-defined point (because it's a finite sum of points plus a convergent sequence as witnessed by the well-definedness of Mn which breaks down as ∑iζiMi,n). It also lies in the upper completion of the single point ∑i≤jζiMi. We'll show that this point is close to M. Since we're working in a space with a norm,
$$d\Big(M,\sum_{i\leq j}\zeta_iM_i+\sum_{i>j}\zeta_iM_{i,n}\Big)\leq d(M,M_n)+\Big\|\sum_{i\leq j}\zeta_i(M_{i,n}-M_i)\Big\|\leq d(M,M_n)+\sum_{i\leq j}\zeta_i\,d(M_{i,n},M_i)<\epsilon+\sum_{i\leq j}\zeta_i\epsilon\leq 2\epsilon$$
So, M is less than 2ϵ away from the upper completion of the point ∑i≤jζiMi, which is a closed set (Minkowski sum of a closed and compact set is closed). ϵ can be shrunk to 0 with increasing n, so M has distance 0 from the upper completion of said partial sum, and thus lies above the partial sum!
Abbreviating ∑i≤jζiMi as Mj, we get that all the Mj lie in {M}−Msa(X), and are all sa-measures. Thus, if the sequence Mj converges to a unique point, then said limit point is ∑iζiMi, and all the Mi∈Hi, so ∑iζiMi would lie in EζHi. Further, by Lemma 3, ∑iζiMi∈{M}−Msa(X), since that set is compact, so M lies above ∑iζiMi, and would lie in EζHi by upper-completeness.
So, all that's left to wrap up our closure argument is showing that the sequence Mj has a single limit point. Since it's wandering around in ({M}−Msa(X))∩Msa(X) which is compact by Lemma 3, there are convergent subsequences. All we have to show now is that all convergent subsequences must have the same limit point.
Assume this is false, and there's two distinct limit points of the sequence Mj, call them M∞ and M′∞. Because it's impossible for two points to both be above another (in the minimal-point/adding-points sense), without both points being identical, either M∞∉{M′∞}−Msa(X), or vice-versa. Without loss of generality, assume M∞∉{M′∞}−Msa(X). Since the latter is a closed set, M∞ must be ϵ away for some ϵ>0. Fix some j from the subsequence that M∞ is a limit point of, where d(Mj,M∞)<ϵ/2. There must be some strictly greater j′ from the subsequence that M′∞ is a limit point of.
Mj′=∑i≤j′ζiMi=∑i≤jζiMi+∑j<i≤j′ζiMi=Mj+∑j<i≤j′ζiMi
Further, the ζi are nonzero. Also, no Mi can be the 0 point, because Mi∈Hi, and if Mi=(0,0), then EHi(1)=0, which is impossible by normalization. So, Mj lies strictly below Mj′. Also, Mj′ lies below M′∞, because for all the j∗>j′,
Mj∗=Mj′+∑j′<i≤j∗ζiMi
so Mj∗∈{Mj′}+Msa(X) for all j∗>j′. The sequence that limits to M′∞ is roaming around in this set, which is closed because the sum of a compact set (a single point) and a closed set is closed. So, M′∞ lies above Mj′ which lies above Mj. Thus, Mj∈{M′∞}−Msa(X). However, Mj is ϵ/2 or less distance from M∞, which must be ϵ distance from {M′∞}−Msa(X), and we have a contradiction.
Ok, so the sequence of partial sums Mj has a single limit point, which is ∑iζiMi, and all the Mi∈Hi, so ∑iζiMi∈EζHi, and by Lemma 3, ∑iζiMi∈{M}−Msa(X), since that set is compact, so M lies above ∑iζiMi, and lies in EζHi by upper-completeness. We're done!
For minimals, by our argument about what it takes to invoke LF-Duality in Proposition 9, we only need convexity, closure, and upper completion (which we have), and that the h induced by EζHi is continuous. By Proposition 10, EEζHi(f)=Eζ(EHi(f))=Eζ(hi(f))=(Eζhi)(f). We might as well go for uniform continuity since all the Hi are infradistributions, and so fulfill weak-bounded-minimals, so their hi are uniformly continuous. Then, this continuity lets you invoke LF-Duality, and transfer uniform continuity for the h induced by EζHi to weak-bounded-minimals for EζHi
For uniform continuity/weak-bounded-minimals, given an arbitrary ϵ, we can pick a finite j where $\sum_{i>j}\zeta_i<\frac{\epsilon}{2}$, and a δ>0 where, for all hi with i≤j, d(f,f′)<δ implies $|h_i(f)-h_i(f')|<\frac{\epsilon}{2}$. Monotonicity and normalization for the hi ensures that, no matter what, hi(f)∈[0,1], so regardless of the f,f′, |hi(f)−hi(f′)|≤1. Then, we can go: Ok, if d(f,f′)<δ, then
$$|E_\zeta(h_i(f))-E_\zeta(h_i(f'))|\leq E_\zeta|h_i(f)-h_i(f')|$$
$$=\sum_{i\leq j}\zeta_i|h_i(f)-h_i(f')|+\sum_{i>j}\zeta_i|h_i(f)-h_i(f')|$$
$$<\sum_{i\leq j}\zeta_i\frac{\epsilon}{2}+\sum_{i>j}\zeta_i<\sum_i\zeta_i\frac{\epsilon}{2}+\frac{\epsilon}{2}=\frac{\epsilon}{2}+\frac{\epsilon}{2}=\epsilon$$
And by our earlier argument, we invoke LF-Duality and pick up weak-bounded-minimals.
For positive-minimals, we can just observe that, if f′≥f, then
(Eζhi)(f′)=Eζ(hi(f′))≥Eζ(hi(f))=(Eζhi)(f)
By monotonicity for the hi because Hi had positive-minimals. Going back to EζHi, since its associated h is monotone, it must have positive-minimals as well.
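For bounded minimals assuming the Lipschitz constants aren't too big, fix some ϵ. We know that ∑iζiλ⊙i<∞, where λ⊙i is the Lipschitz constant of hi. So, if d(f,f′)<ϵ, then:
$$|E_\zeta(h_i(f))-E_\zeta(h_i(f'))|\leq\sum_i\zeta_i|h_i(f)-h_i(f')|\leq\sum_i\zeta_i\lambda^\odot_i\,d(f,f')\leq\Big(\sum_i\zeta_i\lambda^\odot_i\Big)\epsilon$$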
So, ∑iζiλ⊙i is a finite constant, and is an upper bound on the Lipschitz constant of the mixture of the hi, so the h corresponding to EζHi has a Lipschitz constant, which, by Theorem 5, translates to bounded-minimals. And we're done.
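Proposition 12: g∗(Eζ(Hi))=Eζ(g∗(Hi))
Let's use Theorem 5 to translate this into the concave functional setting. We want to show that g∗(Eζhi)=Eζ(g∗(hi)). Now, given any function f∈C(Y,[0,1]),
$$(g_*(E_\zeta h_i))(f)=(E_\zeta h_i)(f\circ g)=E_\zeta(h_i(f\circ g))=E_\zeta(g_*(h_i)(f))=(E_\zeta(g_*(h_i)))(f)$$
so the two functionals are equal.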
Now for continuity. mn⋅L limits to m⋅L if, for all $f\in C(\overline{\text{supp}(L)})$, (mn⋅L)(f) limits to (m⋅L)(f). Observe that (m⋅L)(f)=m(f★L0), and f★L0 is continuous.
So we have continuity in the second vector component as well, and we're done.
Lemma 9: (ugL(H))min⊆ugL(Hmin)
As a recap, the raw update function ugL is: (m,b)↦(m⋅L,b+m(0★Lg))
Take a point (m,b)∈(ugL(H))min. Now there must be a preimage point (m′,b′)∈H that, when we apply ugL, produces (m,b). Because (m′,b′) is in an infradistribution, we can decompose it into a minimal point and something else, (m′,b′)=(mmin,bmin)+(m∗,b∗). Then,
(m,b)=ugL(m′,b′)=ugL((mmin,bmin)+(m∗,b∗))=ugL(mmin,bmin)+ugL(m∗,b∗)
This was done by using linearity of ugL via Lemma 8.
Note that, since we have written (m,b) as a sum of a different point also in ugL(H) and an sa-measure, but (m,b) is minimal in ugL(H), the sa-measure must be 0, so (m,b)=ugL(mmin,bmin)∈ugL(Hmin), and we're done.
Proposition 13: When updating a bounded infradistribution over Msa(X), if the renormalization doesn't fail, you get a bounded infradistribution over the set Msa(L). (for infradistributions in general, you may have to take the closure)
Proof sketch: It doesn't matter whether you take upper-completion before or after renormalization, so we can appeal to Proposition 7: Renormalizing a bounded inframeasure produces a bounded infradistribution (if the renormalization doesn't fail).
So, we just have to show nonemptiness, convexity, upper-completion (trivial), positive-minimals/bounded minimals (by Lemma 9, the preimage of a minimal point contains a minimal point, so we can transfer over the properties from the minimal point in the preimage), and closure. The set of minimal points in H is contained in a compact set, so we can take a sequence in (ugL(H))uc, split into a component in ugL(H) and something else, take preimage points, get minimals below all of them, isolate a convergent subsequence, map the limit point back through, and show that the limit point lands under your point of interest. That establishes all conditions for a bounded inframeasure, so then we just have to check that our renormalization is the right one to do.
Proof: Nonemptiness is trivial, ugL isn't a partial function. Upper-completion is also trivial, because we explicitly took the upper completion. For convexity, observe that ugL is a linear operator by Lemma 8, so it maps convex sets to convex sets, and the Minkowski sum of two convex sets is convex. ugL maps sa-measures to sa-measures, because
b+m(0★Lg)+(m⋅L)−(1)=b+m(0★Lg)+(m−⋅L)(1)
=b+m(0★Lg)+m−(1★L0)=b+m+(0★Lg)+m−(0★Lg)+m−(1★L0)
≥b+m−(1★Lg)≥b+m−(1)≥0
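The chain of inequalities above can be spot-checked numerically. The Python sketch below works on a finite X and assumes the series' convention f★Lg = L·f + (1−L)·g pointwise, so that 0★Lg = (1−L)·g; the function names and the random sampling are illustrative choices.

```python
import random
from fractions import Fraction as F

def neg_mass(m):
    """m^-(1): total mass of the negative part of m (a number <= 0)."""
    return sum(x for x in m if x < 0)

def raw_update(m, b, L, g):
    """(m, b) -> (m.L, b + m(0 star_L g)), where 0 star_L g = (1 - L) * g."""
    new_m = [mass * l for mass, l in zip(m, L)]
    new_b = b + sum(mass * (1 - l) * gv for mass, l, gv in zip(m, L, g))
    return new_m, new_b

def check_update_keeps_sa(trials=200, n=4, seed=0):
    """Sample sa-measures and [0,1]-valued L, g; test the updated pair
    still satisfies b + m^-(1) >= 0."""
    rng = random.Random(seed)
    for _ in range(trials):
        m = [F(rng.randint(-8, 8), 4) for _ in range(n)]
        b = -neg_mass(m) + F(rng.randint(0, 4), 4)  # makes (m, b) an sa-measure
        L = [F(rng.randint(0, 4), 4) for _ in range(n)]
        g = [F(rng.randint(0, 4), 4) for _ in range(n)]
        new_m, new_b = raw_update(m, b, L, g)
        if new_b + neg_mass(new_m) < 0:
            return False
    return True

# One worked instance: m = (2, -1), b = 1 sits exactly on the sa-measure boundary.
example = raw_update([F(2), F(-1)], F(1), [F(1, 2), F(1, 2)], [F(1), F(0)])
```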
For positive-minimals and bounded-minimals, we invoke Lemma 9, (ugL(H))min⊆ugL(Hmin). All minimal points in ugL(H) must have a preimage minimal in H, which is an a-measure. Chopping down a measure by L keeps it a measure, so we still have no negative components post-update, and all minimal points in ugL(H) are a-measures. Similarly, chopping down a measure by L reduces the λ value, and we had an upper bound of λ⊙ originally, so the upper bound still works post-update. This gets bounded-minimals.
This just leaves closure. Fix a sequence Mn in ugL(H)uc limiting to M. The Mn break down into ugL(M′n)+M∗n, where M′n∈H. M′n further breaks down into Mminn+M∗∗n, where Mminn∈Hmin. By Proposition 5, the Mminn sequence is wandering around in a compact set since we have bounded-minimals on H, so there's a convergent subsequence which has a limit point Mmin. Map that convergent subsequence and limit point through ugL which is continuous by Lemma 8 to get a sequence of points ugL(Mminn) limiting to ugL(Mmin)∈ugL(H). Fix some really big n where d(M,Mn)<ϵ and d(ugL(Mminn),ugL(Mmin))<ϵ.
Now, ugL(Mmin)+ugL(M∗∗n)+M∗n lies in the upper completion of the point ugL(Mmin). We'll show that this sum of 3 terms is close to M. Since we're working in a Banach space, d(x+y,z+y)=d(x,z), by norm arguments. Because Mn=ugL(Mminn)+ugL(M∗∗n)+M∗n by linearity of ugL,
d(M,ugL(Mmin)+ugL(M∗∗n)+M∗n)≤d(M,Mn)+d(ugL(Mminn)+ugL(M∗∗n)+M∗n,ugL(Mmin)+ugL(M∗∗n)+M∗n)=d(M,Mn)+d(ugL(Mminn),ugL(Mmin))<2ϵ
So, M is within 2ϵ of the upper completion of {ugL(Mmin)} for all ϵ, and it's a closed set, so M lies above ugL(Mmin)∈ugL(H), so M∈(ugL(H))uc, and we have closure.
Now that all prerequisite conditions have been established, we just need to show that 1/PgH(L) and EH(0★Lg) are the proper renormalization constants to use.
The proper renormalization to use is: 1/(E(ugL(H))uc(1)−E(ugL(H))uc(0)) for the scale, and E(ugL(H))uc(0) for the shift. So let's unpack these quantities.
For the scale constant, observe that 1/(E(ugL(H))uc(1)−E(ugL(H))uc(0))=1/(EH(1★Lg)−EH(0★Lg))=1/PgH(L)
So our scale constant is also the right scale constant to use. Now, we can invoke Proposition 7: Renormalizing a bounded inframeasure produces a bounded infradistribution if the renormalization doesn't fail.
Now, if PgH(L)=0, then EH(1★Lg)=EH(0★Lg) so, for any f∈C(X,[0,1]), (1★Lg)≥(f★Lg)≥(0★Lg) by monotonicity for the h induced by H, and h(1★Lg)=h(0★Lg), so h(f★Lg)=h(0★Lg). Therefore,
EH(0★Lg)+PgH(L)EH|gL(f)=EH(0★Lg)+0=EH(f★Lg)
and we get our same result.
Proposition 15: $(H|^g_L)|^{g'}_{L'}=H|^{\left(g\star_{\frac{1-L}{1-LL'}}g'\right)}_{LL'}$
Proof sketch: First, we do some shuffling around of the stars to get a lemma that will help. Then, we can use the link between updated sets and their associated concave functionals h, getting the identity purely on the concave functional level, where it's much easier to approach.
Proof: First, the star shuffling. For any f,g,g′,L,L′∈C(X,[0,1]), we'll show that
$$f\star_{LL'}\Big(g\star_{\frac{1-L}{1-LL'}}g'\Big)=(f\star_{L'}g')\star_Lg.$$
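Before working through the proof, the identity stated above can be spot-checked pointwise. This Python sketch assumes the convention f★Lg = L·f + (1−L)·g at each point x, and treats all five functions as single values in [0,1]; the function names and the sampling grid are hypothetical.

```python
import random
from fractions import Fraction as F

def star(f, L, g):
    """Value of f star_L g at one point: L*f + (1 - L)*g."""
    return L * f + (1 - L) * g

def both_sides(f, g, g2, L, L2):
    """Evaluate both sides of the star identity at a single point x,
    given the values f(x), g(x), g'(x), L(x), L'(x)."""
    if L * L2 == 1:
        # The weight (1-L)/(1-LL') is 0/0 here; its value is irrelevant
        # because the outer weight 1 - LL' on the inner term vanishes.
        w = F(0)
    else:
        w = (1 - L) / (1 - L * L2)
    lhs = star(f, L * L2, star(g, w, g2))
    rhs = star(star(f, L2, g2), L, g)
    return lhs, rhs

def check_identity(trials=500, seed=0):
    """Compare both sides on random rational values in [0, 1]."""
    rng = random.Random(seed)
    unit = lambda: F(rng.randint(0, 6), 6)
    for _ in range(trials):
        lhs, rhs = both_sides(unit(), unit(), unit(), unit(), unit())
        if lhs != rhs:
            return False
    return True
```

The special-cased branch corresponds to the points where L(x)=L′(x)=1, which the proof handles separately for the same divide-by-zero reason.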
Let's begin. First, let's deal with points x where L(x)=L′(x)=1, because that gets you a divide-by-zero error.
Ok, so we've established our crucial $f\star_{LL'}\big(g\star_{\frac{1-L}{1-LL'}}g'\big)=(f\star_{L'}g')\star_Lg$ identity. Let's proceed. Updates for concave functionals are: $(h|^g_L)(f)=\frac{h(f\star_Lg)-h(0\star_Lg)}{h(1\star_Lg)-h(0\star_Lg)}$
Importing Proposition 14, $E_H(f\star_Lg)=E_H(0\star_Lg)+P^g_H(L)E_{H|^g_L}(f)$ and rearranging it (and unpacking the definition of $P^g_H(L)$), we get $E_{H|^g_L}(f)=\frac{E_H(f\star_Lg)-E_H(0\star_Lg)}{E_H(1\star_Lg)-E_H(0\star_Lg)}$
So, updating fulfills the positive functional definition of update, because this transfers into $(h|^g_L)(f)=\frac{h(f\star_Lg)-h(0\star_Lg)}{h(1\star_Lg)-h(0\star_Lg)}$ which is exactly our concave functional definition of updating. So, in order to verify that the two updates equal the one big update, we could just show that their concave functional definitions are equivalent. (H|gL)|g′L′ would, on the concave functional level, turn into:
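Theorem 6: $(E_\zeta H_i)|^g_L=\frac{E_\zeta(P^g_{H_i}(L)\cdot(H_i|^g_L))}{E_\zeta(P^g_{H_i}(L))}$, if the update doesn't fail.
Proof: Let ζ′ be defined as $\zeta'_i:=\frac{\zeta_iP^g_{H_i}(L)}{\sum_j\zeta_jP^g_{H_j}(L)}$. It is a probability distribution, because the denominator is nonzero: if all $P^g_{H_i}(L)=0$, then $E_\zeta P^g_{H_i}(L)=0$, and so by Lemma 10, $P^g_{E_\zeta H_i}(L)=0$, which would cause the update to fail.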
The left-hand-side corresponds to (Eζhi)|gL on the concave functional level, and the right-hand-side corresponds to Eζ′(hi|gL) on the concave functional level. Let's begin unpacking. Lemma 10 will be used throughout, as well as the definition of PgHi(L).
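The functional-level claim can be tried on a toy case. The Python sketch below assumes a two-point X, the convention f★Lg = L·f + (1−L)·g, and concave functionals given as minima over finitely many a-measures; all names and numbers are hypothetical, and both Hi are chosen so that their individual updates are defined.

```python
from fractions import Fraction as F

def star(f, L, g):
    """Pointwise f star_L g = L*f + (1 - L)*g for functions given as lists."""
    return [l * a + (1 - l) * c for a, l, c in zip(f, L, g)]

def h_from_points(points):
    """Concave functional h(f) = min over (m, b) of m(f) + b."""
    return lambda f: min(sum(mi * fi for mi, fi in zip(m, f)) + b
                         for m, b in points)

def updated(h, L, g):
    """Return (h|^g_L, P^g_H(L)) using the concave-functional update formula."""
    zero, one = [F(0)] * len(L), [F(1)] * len(L)
    low = h(star(zero, L, g))
    p = h(star(one, L, g)) - low
    if p == 0:
        raise ValueError("update fails: P^g_H(L) = 0")
    return (lambda f: (h(star(f, L, g)) - low) / p), p

hs = [h_from_points([([F(1), F(0)], F(0)), ([F(0), F(1)], F(0))]),
      h_from_points([([F(1), F(0)], F(0))])]
zeta = [F(1, 2), F(1, 2)]
L, g, f = [F(1), F(1, 2)], [F(0), F(1, 2)], [F(1, 2), F(1)]

# Left side: update the mixture. Right side: zeta'-mixture of the updates.
mixture = lambda q: sum(z * h(q) for z, h in zip(zeta, hs))
lhs = updated(mixture, L, g)[0](f)
parts = [updated(h, L, g) for h in hs]
total = sum(z * p for z, (_, p) in zip(zeta, parts))
zeta_prime = [z * p / total for z, (_, p) in zip(zeta, parts)]
rhs = sum(zp * hu(f) for zp, (hu, _) in zip(zeta_prime, parts))
```

In this instance ζ′ differs from ζ because the two components assign different probabilities to L, which is the reweighting the theorem describes.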
The previous proofs are here.
So, we have a minimal point of the form (0,b) where b is the lowest possible b amongst the minimal points. Any other distinct minimal point must be of the form (λ′μ′,b′), where b′≥b. This other minimal point can be written as (0,b)+(λ′μ′,b′−b), where the latter component is an sa-measure, and it's nonzero because the two points are distinct, so the other point isn't minimal after all. Thus, there's only one minimal a-measure and it's of the form (0,b).
Proposition 7: Renormalizing a bounded inframeasure produces a bounded infradistribution, if renormalization doesn't fail.
Proof sketch: Our first order of business is showing that our renormalization process doesn't map anything outside the cone of sa-measures. A variant of this argument establishes that the preimage of a minimal point in BR must be a minimal point in B, which quickly establishes positive-minimals and bounded-minimals for BR. Then, we verify the other conditions of a bounded infradistribution. Nonemptiness, closure, and convexity are very easy, upper-completeness is shown by adding appropriately-scaled sa-measures such that, after renormalization, they hit whatever sa-measure you want. Then, finally, we just have to verify that our renormalization procedure is the right one to use, that it makes EBR(1)=1 and EBR(0)=0.
Proof: First up, we need to show that after renormalization, nothing gets mapped outside the cone of sa-measures. Observe that the renormalization process is injective. If two points are distinct, after a scale-and-shift, they'll still be distinct.
Let B be our original set and BR be our renormalized set. Take a point in BR, given by (m,b). Undoing the renormalization, we get (EB(1)−EB(0))(m,b)+(0,EB(0))∈B.
By decomposition into a minimal point and something else via Theorem 2, we get that
(EB(1)−EB(0))(m,b)+(0,EB(0))=(mmin,bmin)+(m∗,b∗)
where (mmin,bmin)∈Bmin. Renormalizing back, we get that
$$(m,b)=\frac{1}{\mathbb{E}_B(1)-\mathbb{E}_B(0)}\left((m^{min},b^{min}-\mathbb{E}_B(0))+(m^*,b^*)\right)$$
$b^{min}\geq\mathbb{E}_B(0)$, obviously, because $\mathbb{E}_B(0)$ is the minimal b value amongst the minimal points. So, the first component is an a-measure, the second component is an sa-measure, so adding them is an sa-measure, and then we scale by a positive constant (positive because renormalization doesn't fail), so (m,b) is an sa-measure as well.
This general line of argument also establishes positive-minimals and bounded-minimals, as we'll now show. If the (m∗,b∗) isn't 0, then we just wrote (m,b) as
$$\frac{1}{\mathbb{E}_B(1)-\mathbb{E}_B(0)}(m^{min},b^{min}-\mathbb{E}_B(0))+\frac{1}{\mathbb{E}_B(1)-\mathbb{E}_B(0)}(m^*,b^*)$$
And the first component lies in BR, but the latter component is nonzero, witnessing that (m,b) isn't minimal. So, if (m,b) is minimal in BR, then (m∗,b∗)=0, so it must be the image of a single minimal point (mmin,bmin)∈Bmin by injectivity. Ie, the preimage of a minimal point in BR is a minimal point in B.
Scale-and-shift maps a-measures to a-measures, showing positive-minimals, and the positive scale constant of (EB(1)−EB(0))⁻¹ just rescales the λ⊙ upper bound on the λ values of the minimal points in B, showing bounded-minimals.
For the remaining conditions, nonemptiness, closure, and convexity are trivial. We're taking a nonempty closed convex set and doing a scale-and-shift so it's nonempty closed convex.
Time for upper-completeness. Letting B be our original set and BR be our renormalized set, take a point MR+M∗ in (BR)uc. By injectivity, MR has a single preimage point M∈B. Undoing the renormalization by multiplying by EB(1)−EB(0) (our addition of EB(0) is paired with MR to undo the renormalization on that one), consider M+(EB(1)−EB(0))M∗. This lies in B by upper-completeness, and renormalizing it back produces MR+M∗, which is in BR, so BR is upper-complete.
That just leaves showing that after renormalizing, we're normalized.
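$$\mathbb{E}_{B_R}(1)=\inf_{(\lambda\mu,b)\in B_R}(\lambda+b)=\inf_{(\lambda'\mu',b')\in B}\frac{1}{\mathbb{E}_B(1)-\mathbb{E}_B(0)}\left(\lambda'+b'-\mathbb{E}_B(0)\right)$$
$$=\frac{1}{\mathbb{E}_B(1)-\mathbb{E}_B(0)}\left(\inf_{(\lambda'\mu',b')\in B}(\lambda'+b')-\mathbb{E}_B(0)\right)=\frac{\mathbb{E}_B(1)-\mathbb{E}_B(0)}{\mathbb{E}_B(1)-\mathbb{E}_B(0)}=1$$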
For the other part,
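$$\mathbb{E}_{B_R}(0)=\inf_{(\lambda\mu,b)\in B_R}b=\inf_{(\lambda'\mu',b')\in B}\frac{1}{\mathbb{E}_B(1)-\mathbb{E}_B(0)}\left(b'-\mathbb{E}_B(0)\right)$$
$$=\frac{1}{\mathbb{E}_B(1)-\mathbb{E}_B(0)}\left(\inf_{(\lambda'\mu',b')\in B}b'-\mathbb{E}_B(0)\right)=\frac{\mathbb{E}_B(0)-\mathbb{E}_B(0)}{\mathbb{E}_B(1)-\mathbb{E}_B(0)}=0$$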
And we're done.
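As a sanity check on the renormalization constants, here is a minimal finite-state sketch in Python. Everything in it is an assumption for illustration: X is taken to be a three-point space, the list `B` is a made-up generating set of a-measures (its expectations agree with those of its closed convex upper-complete hull for f valued in [0,1]), and `expect` and `renormalize` are hypothetical helper names.

```python
# Finite-state sketch: X = {0, 1, 2}. A point is (m, b) with m a list of weights.
# B is a hypothetical generating set of a-measures, not taken from the post.
B = [([0.5, 0.2, 0.1], 0.3), ([0.1, 0.1, 0.6], 0.2), ([0.4, 0.4, 0.4], 0.1)]


def expect(points, f):
    """E(f) = min over the listed (m, b) of m(f) + b."""
    return min(sum(mx * fx for mx, fx in zip(m, f)) + b for m, b in points)


def renormalize(points):
    """Send each (m, b) to (m, b - E(0)) / (E(1) - E(0))."""
    n = len(points[0][0])
    e0 = expect(points, [0.0] * n)
    e1 = expect(points, [1.0] * n)
    scale = e1 - e0
    if scale <= 0:
        raise ValueError("renormalization fails: E(1) = E(0)")
    return [([mx / scale for mx in m], (b - e0) / scale) for m, b in points]


B_R = renormalize(B)
e_one = expect(B_R, [1.0, 1.0, 1.0])   # should be 1 up to rounding
e_zero = expect(B_R, [0.0, 0.0, 0.0])  # should be 0 up to rounding
print(e_one, e_zero)
```

The shift by E(0) is what keeps every b term nonnegative after renormalization, matching the argument that nothing gets mapped outside the cone of sa-measures.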
Lemma 6: g∗ is a continuous linear operator.
Proof sketch: First show linearity, then continuity, for the operator that just maps a signed measure through g, using some equation-crunching and characterizations of continuity. Then, since g∗ is just the pair of that and the identity function, it's trivial to show that it's linear and continuous.
We'll use g′∗ to refer to the function M±(X)→M±(Y) defined by (g′∗(m))(Z)=m(g−1(Z)), where Z is a measurable subset of Y and g∈C(X,Y). Ie, this specifies what the measure g′∗(m) is in terms of telling you what value it assigns to all measurable subsets of Y.
We'll use g∗ to refer to the function M±(X)⊕R→M±(Y)⊕R given by g∗(m,b)=(g′∗(m),b).
Our first order of business is establishing the linearity of g′∗. Observe that, for all measurable Z⊆Y, and a,a′ being real numbers, and m,m′ being signed measures over X,
(g′∗(am+a′m′))(Z)=(am+a′m′)(g−1(Z))=am(g−1(Z))+a′m′(g−1(Z))
=ag′∗(m)(Z)+a′g′∗(m′)(Z)=(ag′∗(m)+a′g′∗(m′))(Z)
So, g′∗(am+a′m′)=ag′∗(m)+a′g′∗(m′) and we have linearity of g′∗.
Now for continuity of g′∗. Let mn limit to m. The sequence g′∗(mn) converging to g′∗(m) in our metric on M±(Y) is equivalent to: ∀f∈C(Y):limn→∞g′∗(mn)(f)=g′∗(m)(f)
So, if g′∗(mn) fails to converge to g′∗(m), then there is some continuous function f∈C(Y) that witnesses the failure of convergence. But, because g is a continuous function X→Y, then f∘g∈C(X), and also mn(f∘g)=g′∗(mn)(f), so:
limn→∞g′∗(mn)(f)=limn→∞mn(f∘g)=m(f∘g)=g′∗(m)(f)
The key step in the middle is that mn limits to m, so mn(f∘g) limits to m(f∘g), by our characterization of continuity. Thus, we get a contradiction, our f that witnesses the failure of convergence actually does converge. Therefore, g′∗(mn) limits to g′∗(m) if mn limits to m, so g′∗ is continuous.
To finish up, continuity for g∗ comes from the product of two continuous functions being continuous (g′∗ which we showed already, and idR because duh), and linearity comes from:
g∗(a(m,b)+a′(m′,b′))=g∗(am+a′m′,ab+a′b′)=(g′∗(am+a′m′),ab+a′b′)
=(ag′∗(m)+a′g′∗(m′),ab+a′b′)=a(g′∗(m),b)+a′(g′∗(m′),b′)=ag∗(m,b)+a′g∗(m′,b′)
Proposition 8: If f∈C(X,[0,1]) and g is a continuous function X→Y, then Eg∗(H)(f)=EH(f∘g)
Eg∗(H)(f)=inf(m,b)∈(g∗(H))(m(f)+b)=inf(m,b)∈H(g′∗(m)(f)+b)
=inf(m,b)∈H(m(f∘g)+b)=EH(f∘g)
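The chain of equalities above can be spot-checked on finite spaces. The sketch below is only an illustration under stated assumptions: X has four points, Y has two, the map `G` and the set `H` are made up, and `pushforward`, `compose`, and `expect` are hypothetical helper names.

```python
# X = {0, 1, 2, 3}, Y = {0, 1}. G[x] is the image of x under a hypothetical map g.
G = [0, 0, 1, 1]
# H is a made-up finite generating set of a-measures over X.
H = [([0.3, 0.2, 0.1, 0.4], 0.0), ([0.1, 0.1, 0.5, 0.1], 0.2)]


def expect(points, f):
    """E(f) = min over the listed (m, b) of m(f) + b."""
    return min(sum(mx * fx for mx, fx in zip(m, f)) + b for m, b in points)


def pushforward(m, g, size_y):
    """The measure g'_*(m): each y receives the mass of its preimage under g."""
    out = [0.0] * size_y
    for x, mass in enumerate(m):
        out[g[x]] += mass
    return out


def compose(f, g):
    """The function f o g on X, for f given as a list over Y."""
    return [f[gx] for gx in g]


f = [0.25, 0.9]  # a function Y -> [0, 1]
lhs = expect([(pushforward(m, G, 2), b) for m, b in H], f)  # E_{g_*(H)}(f)
rhs = expect(H, compose(f, G))                              # E_H(f o g)
print(lhs, rhs)
```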
Proposition 9: g∗(H) is a (bounded) inframeasure if H is, and it doesn't require upper completion if g is surjective.
Proof sketch: Nonemptiness is obvious, and showing that it maps sa-measures to sa-measures is also pretty easy. Closure takes a rather long argument that the image of any closed subset of sa-measures over X, through g∗, is closed, which is fairly tedious. We may or may not invoke upper completion afterwards, but if we do, we can just appeal to the lemma that the upper completion of a closed set is closed. Convexity is immediate from linearity of g∗.
For upper completion, we can just go "we took the upper completion" if g isn't surjective, but we also need to show that we don't need to take the upper completion if g is surjective, which requires crafting a measurable inverse function to g via the Kuratowski-Ryll-Nardzewski selection theorem, in order to craft suitable preimage points.
Then we can use LF-Duality to characterize the induced h function, along with Proposition 8, which lets us get positive-minimals, bounded-minimals, and normalization fairly easily, wrapping up the proof.
Proof: Nonemptiness is obvious. For showing that it takes sa-measures to sa-measures, take an (m,b)∈H, and map it through to get (g′∗(m),b)∈g∗(H). (m,b) is an sa-measure, so b+m−(1)≥0. Now, we can use Lemma 5 to get:
b+(g′∗(m))−(1)=b+inff∈C(Y,[0,1])g′∗(m)(f)=b+inff∈C(Y,[0,1])m(f∘g)
≥b+inff′∈C(X,[0,1])m(f′)=b+m−(1)≥0
So the b term is indeed big enough that the image of (m,b) is an sa-measure.
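In the finite case the inequality is easy to see directly: pushing a signed measure forward can only cancel negative mass against positive mass, never create more of it. A small sketch, where the map `G`, the point `(m, b)`, and the helper names are all assumptions made up for illustration:

```python
G = [0, 0, 1, 1]  # hypothetical map g from X = {0, 1, 2, 3} to Y = {0, 1}


def neg_mass(m):
    """m^-(1): the total (nonpositive) mass of the negative part of m."""
    return sum(min(0.0, v) for v in m)


def pushforward(m, g, size_y):
    """The signed measure g'_*(m) on Y."""
    out = [0.0] * size_y
    for x, mass in enumerate(m):
        out[g[x]] += mass
    return out


def is_sa(m, b):
    """(m, b) is an sa-measure when b + m^-(1) >= 0."""
    return b + neg_mass(m) >= 0.0


m, b = [0.5, -0.3, -0.2, 0.4], 0.6  # a made-up sa-measure over X
pushed = pushforward(m, G, 2)
print(neg_mass(m), neg_mass(pushed), is_sa(m, b), is_sa(pushed, b))
```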
For closure, fix a sequence of (mn,bn)∈g∗(H) limiting to some (m,b), with preimage points (m′n,b′n)∈H. Due to convergence of (mn,bn) there must be some b◯ bound on the bn. g∗ preserves those values, so b◯ is an upper bound on the b′n. Since the (m′n,b′n) are sa-measures, −b◯ is a lower bound on the m′−n(1) values. Since mn converges to m, mn(1) converges to m(1), so there's a λ◯ upper bound on the mn(1) values. Further,
λ◯≥mn(1)=g′∗(m′n)(1)=m′n(1∘g)=m′n(1)=m′+n(1)+m′−n(1)≥m′+n(1)−b◯
So, for all n, m′+n(1)≤λ◯+b◯, so we have an upper bound on the b′n and m′+n(1) values. Now we can invoke the Compactness Lemma to conclude that there's a convergent subsequence of the (m′n,b′n), with a limit point (m′,b′), which must be in H since H is closed. By continuity of g∗ from Lemma 6, g∗(m′,b′) must equal (m,b), witnessing that (m,b)∈g∗(H). So, g∗(H) is closed. Now, if we take upper completion afterwards, we can just invoke Lemma 2 to conclude that the upper completion of a closed set of sa-measures is closed.
Also, g∗ is linear from Lemma 6, so it maps convex sets to convex sets getting convexity.
Now for upper completion. Upper completion is immediate if g isn't surjective, because we had to take the upper completion there. Showing we don't need upper completion if g is surjective is trickier. We must show that g∗ is a surjection from Msa(X) to Msa(Y).
First, we'll show that g(U), where U is an open subset of X, is a measurable subset of Y. In metrizable spaces (of which X is one), every open set is an Fσ set, ie, it can be written as a countable union of closed sets. Because our space is compact, all those closed sets are compact. And the continuous image of a compact set is a compact set, ie closed. Therefore, g(U) is a countable union of closed sets, ie, measurable.
X is a Polish space (all compact metric spaces are Polish), it has the Borel σ-algebra, and we'll use the function g−1. Note that g−1(y) is closed and nonempty for all y∈Y due to g being a continuous surjection. Further, the set {y:g−1(y)∩U≠∅} equals g(U) for all open sets U. In one direction, if the point y is in the first set, then there's some point x∈U where g(x)=y. In the other direction, if a point y is in g(U), then there's some point x∈U where g(x)=y so g−1(y)∩U is nonempty.
Thus, g−1 is weakly measurable, because for all open sets U of X, {y:g−1(y)∩U≠∅}=g(U) and g(U) is measurable. Now, by the Kuratowski-Ryll-Nardzewski Measurable Selection Theorem, we get a measurable function g◊ from Y to X where g◊(y)∈g−1(y) so g(g◊(y))=y, and g◊ is an injection.
So, we can push any sa-measure of interest (m∗,b∗) through g◊∗ (which preserves the amount of negative measure due to being an injection), to get an sa-measure that, when pushed through g∗ recovers (m∗,b∗) exactly. Thus, if g∗(m,b)∈g∗(H), and you want to show g∗(m,b)+(m∗,b∗)∈g∗(H), just consider
g∗((m,b)+g◊∗(m∗,b∗))=g∗(m,b)+g∗(g◊∗(m∗,b∗))=g∗(m,b)+(m∗,b∗)
So, since (m,b)+g◊∗(m∗,b∗)∈H due to upper-completeness, then g∗((m,b)+g◊∗(m∗,b∗))=g∗(m,b)+(m∗,b∗)∈g∗(H) And we have shown upper-completeness of g∗(H) if g is a surjection.
We should specify something about using LF-Duality here. If you look back through the proof of Theorem 5 carefully, the only conditions you really need for isomorphism are (on the set side) g∗(H) being closed, convex, and upper complete (in order to use Proposition 2 to rewrite g∗(H) appropriately for the subsequent arguments, we have these properties), and (on the functional side), f↦Eg∗(H)(f) being concave (free), −∞ if range(f)⊈[0,1] (by proof of Theorem 4, comes from upper completeness), and continuous over f∈C(Y,[0,1]) (showable by Proposition 8 that Eg∗(H)(f)=EH(f∘g), and the latter being continuous since H is an infradistribution)
It's a bit of a pain to run through this argument over and over again, so we just need to remember that if you can show closure, convexity, upper completeness, and the expectations to be continuous, that's enough to invoke LF-Duality and clean up the minimal point conditions. We did that, so we can invoke LF-Duality now.
Time for normalization. From Proposition 8, the g∗(h) function we get from f↦Eg∗(H)(f) is uniquely characterized as: g∗(h)(f)=h(f∘g). So,
Eg∗(H)(1)=g∗(h)(1)=h(1∘g)=h(1)=EH(1)=1
Eg∗(H)(0)=g∗(h)(0)=h(0∘g)=h(0)=EH(0)=0
and normalization is taken care of.
For bounded-minimals/weak-bounded-minimals, since g∗(H) is the LF-dual of g∗(h), we can appeal to Theorem 5 and just check whether g∗(h) is Lipschitz/uniformly continuous. If d(f,f′)<δ, then d(f∘g,f′∘g)<δ according to the sup metric on C(Y,[0,1]) and C(X,[0,1]), respectively, which (depending on whether we're dealing with Lipschitzness or uniform continuity), implies that |h(f∘g)−h(f′∘g)|<λ⊙δ, or ϵ for uniform continuity. So, we get: |g∗(h)(f)−g∗(h)(f′)|=|h(f∘g)−h(f′∘g)|<λ⊙δ (or ϵ for uniform continuity), thus establishing that f and f′ being sufficiently close means that g∗(h) doesn't change much, which, by Theorem 5, implies bounded-minimals/weak-bounded-minimals in g∗(H).
For positive-minimals it's another Theorem 5 argument. If f′≥f, then f′∘g≥f∘g, so: g∗(h)(f′)−g∗(h)(f)=h(f′∘g)−h(f∘g)≥0 And we have monotonicity for g∗(h), which, by Theorem 5, translates into positive-minimals on g∗(H).
Lemma 7: If M∈(EζHi)min, then for all decompositions of M into Mi, Mi∈(Hi)min
This is easy. Decompose M into EζMi. To derive a contradiction, assume there exists a nonminimal Mi that decomposes into Mmini+M∗i where M∗i≠0. Then,
M=EζMi=Eζ(Mmini+M∗i)=Eζ(Mmini)+Eζ(M∗i)
Thus, we have decomposed our minimal point into another point which is also present in EζHi, and a nonzero sa-measure because there's a nonzero M∗i so our original "minimal point" is nonminimal. Therefore, all decompositions of a minimal point in the mixture set must have every component part being minimal as well.
Proposition 10: EEζHi(f)=Eζ(EHi(f))
EEζHi(f)=inf(m,b)∈EζHi(m(f)+b)=inf(mi,bi)∈ΠiHi((Eζmi)(f)+Eζbi)
=inf(mi,bi)∈ΠiHi(Eζ(mi(f))+Eζ(bi))=inf(mi,bi)∈ΠiHiEζ(mi(f)+bi)
=Eζ(inf(mi,bi)∈Hi(mi(f)+bi))=Eζ(EHi(f))
Done.
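The step that moves the infimum inside the mixture works because each component can be minimized independently. Here's a small numerical sketch of that, under assumptions chosen purely for illustration: a finite mixture of three made-up finite sets `HS` over a two-point X (not normalized, which the identity doesn't need), weights `ZETA`, and hypothetical helpers `expect` and `mix`.

```python
from itertools import product

ZETA = [0.5, 0.3, 0.2]  # hypothetical mixing weights
# Three made-up finite generating sets of a-measures over X = {0, 1}.
HS = [
    [([0.6, 0.4], 0.0), ([0.2, 0.3], 0.5)],
    [([0.1, 0.9], 0.0), ([0.5, 0.5], 0.0)],
    [([0.3, 0.3], 0.4), ([1.0, 0.0], 0.0)],
]


def expect(points, f):
    """E(f) = min over the listed (m, b) of m(f) + b."""
    return min(sum(mx * fx for mx, fx in zip(m, f)) + b for m, b in points)


def mix(hs, zeta):
    """All points E_zeta(m_i, b_i), one choice of (m_i, b_i) from each H_i."""
    n = len(hs[0][0][0])
    mixed = []
    for combo in product(*hs):
        m = [sum(z * pt[0][x] for z, pt in zip(zeta, combo)) for x in range(n)]
        b = sum(z * pt[1] for z, pt in zip(zeta, combo))
        mixed.append((m, b))
    return mixed


f = [0.2, 0.7]
lhs = expect(mix(HS, ZETA), f)                          # expectation of the mixture
rhs = sum(z * expect(h, f) for z, h in zip(ZETA, HS))   # mixture of expectations
print(lhs, rhs)
```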
Proposition 11: A mixture of infradistributions is an infradistribution. If it's a mixture of bounded infradistributions with Lipschitz constants on their associated h functions of λ⊙i, and ∑iζiλ⊙i<∞, then the mixture is a bounded infradistribution.
Proof sketch: Nonemptiness, convexity, upper completion, and normalization are pretty easy to show. Closure is a nightmare.
The proof sketch of Closure is: Take a sequence (mn,bn) limiting to (m,b). Since each approximating point is a mixture of points from the Hi, we can shatter each of these (mn,bn)∈EζHi into countably many (mi,n,bi,n)∈Hi. This defines a sequence in each Hi (not necessarily convergent). Then, we take some bounds on the (mn,bn) and manage to translate them into (rather weak) i-dependent bounds on the (mi,n,bi,n) sequence. This lets us invoke the Compactness Lemma and view everything as wandering around in a compact set, regardless of Hi. Then, we take the product of these compact sets to view everything as a single sequence in the product of compact sets, which is compact by Tychonoff's theorem. This is only a countable product of compact metric spaces, so we don't need full axiom of choice. Anyways, we isolate a convergent subsequence in there, which makes a convergent subsequence in each of the Hi. And then, we can ask "what happens when we mix the limit points in the Hi according to ζ?" Well, what we can do is just take a partial sum of the mixture of limit points, like the i from 0 to 1 zillion. We can establish that (m,b) gets arbitrarily close to the upper completion of a partial sum of the mixture of limit points, so (m,b) lies above all the partial sums of our limit points. We show that the partial sums don't have multiple limits, then, we just do one more invocation of Lemma 3 to conclude that the mixture of limit points lies below (m,b). Finally, we appeal to upper completion to conclude that (m,b) is in our mixed set of interest. Whew!
Once those first 4 are out of the way, we can then invoke Theorem 5 to translate to the h view, and mop up the remaining minimal-point conditions.
First, nonemptiness. By Theorem 5, we can go "hm, the hi are monotone on C(X,[0,1]), and −∞ everywhere else, and hi(1)=1, so the affine functional ϕ:ϕ(f)=1 lies above the graph of hi". This translates to the point (0,1) being present in all the Hi. Then, we can just go: Eζ(0,1)=(0,1), so we have a point in our EζHi set.
For normalization, appeal to Proposition 10 and normalization for all the Hi. EEζHi(1)=Eζ(EHi(1))=Eζ(1)=1 and EEζHi(0)=Eζ(EHi(0))=Eζ(0)=0.
Convexity is another easy one. Take (m,b),(m′,b′)∈EζHi. They shatter into (mi,bi),(m′i,b′i)∈Hi. Then, we can just go:
p(m,b)+(1−p)(m′,b′)=pEζ(mi,bi)+(1−p)Eζ(m′i,b′i)=Eζ(p(mi,bi)+(1−p)(m′i,b′i))
and then, by convexity of the Hi, p(mi,bi)+(1−p)(m′i,b′i)∈Hi, so we wrote p(m,b)+(1−p)(m′,b′) as a mixture of points in Hi.
Upper completion is another easy one, because, if (m,b)∈EζHi, then you can go
(m,b)+(m∗,b∗)=Eζ(mi,bi)+Eζ(m∗,b∗)=Eζ((mi,bi)+(m∗,b∗))
And ((mi,bi)+(m∗,b∗))∈Hi by upper completion.
That leaves the nightmare of closure. Fix a sequence Mn∈Eζ(Hi) limiting to M. You can think of the Mn as (mn,bn). We can shatter the Mn into Mi,n∈Hi, where Mi,n can be thought of as (mi,n,bi,n).
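Now, since the $M_n$ converge to something, there must be an upper bound on the $b_n$ and $m_n(1)$ terms of the sequence, call those $b^{\bigcirc}$ and $\lambda^{\bigcirc}$. Now, for all $n$ and all $i'$, $b^{\bigcirc}\geq b_n=\sum_i\zeta_i b_{i,n}\geq\zeta_{i'}b_{i',n}$ (the $b_{i,n}$ are nonnegative because the $M_{i,n}$ are sa-measures), so, for all $n$ and $i$, $b_{i,n}\leq\frac{b^{\bigcirc}}{\zeta_i}$.
Also, for all $n$ and $i'$, $\lambda^{\bigcirc}+b^{\bigcirc}\geq m_n(1)+b_n=\sum_i\left(\zeta_i(m_{i,n}(1)+b_{i,n})\right)\geq\zeta_{i'}(m_{i',n}(1)+b_{i',n})$ and reshuffling, we get $\frac{\lambda^{\bigcirc}+b^{\bigcirc}}{\zeta_{i'}}\geq m_{i',n}(1)+b_{i',n}$, which then makes $\frac{\lambda^{\bigcirc}+b^{\bigcirc}}{\zeta_{i'}}\geq m^+_{i',n}(1)+(m^-_{i',n}(1)+b_{i',n})$. Further, due to $(m_{i',n},b_{i',n})$ being an sa-measure, $b_{i',n}+m^-_{i',n}(1)\geq 0$, so for all $n$ and $i$, $m^+_{i,n}(1)\leq\frac{\lambda^{\bigcirc}+b^{\bigcirc}}{\zeta_i}$.
Ok, so taking stock of what we've shown so far, it's that for all $i$, the sequence $M_{i,n}$ is roaming about within $$H_i\cap\left\{(m,b)\,\middle|\,b\leq\frac{b^{\bigcirc}}{\zeta_i},\ m^+(1)\leq\frac{\lambda^{\bigcirc}+b^{\bigcirc}}{\zeta_i}\right\}$$ And, by the Compactness Lemma, this set is compact, since it's got bounds (weak bounds, but bounds nonetheless). Defining
$$\overline{M}_n\in\prod_i\left(H_i\cap\left\{(m,b)\,\middle|\,b\leq\frac{b^{\bigcirc}}{\zeta_i},\ m^+(1)\leq\frac{\lambda^{\bigcirc}+b^{\bigcirc}}{\zeta_i}\right\}\right)$$
where $\overline{M}_n(i):=M_{i,n}$, we can view everything as one single sequence $\overline{M}_n$ wandering around in the product of compact sets. By Tychonoff's theorem (we've only got a countable product of compact metric spaces, so we don't need full axiom of choice, dependent choice suffices), we can fix a convergent subsequence of this, and the projections of this subsequence to every $H_i$ converge.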
Ok, so we've got a subsequence of n where, regardless of i, Mi,n converge to some Mi∈Hi (by closure of Hi). How does that help us? We don't even know if mixing these limit points converges to something or runs off to infinity. Well... fix any j you like, we'll just look at the partial sum of the first j components. Also fix any ϵ you please. On our subsequence of interest, the Mn converge to M, and in all i, the Mi,n converge to Mi. So, let n be large enough (and in our subsequence) that d(Mn,M)<ϵ, and ∀i≤j:d(Mi,n,Mi)<ϵ, we can always find such an n.
Now, ∑i≤jζiMi+∑i>jζiMi,n is a well-defined point (because it's a finite sum of points plus a convergent sequence as witnessed by the well-definedness of Mn which breaks down as ∑iζiMi,n) It also lies in the upper completion of the single point ∑i≤jζiMi. We'll show that this point is close to M. Since we're working in a space with a norm,
d(M+M∗,M′+M∗)=||(M+M∗)−(M′+M∗)||=||M−M′||=d(M,M′)
This will come in handy in the later equations.
d(∑i≤jζiMi+∑i>jζiMi,n,M)≤d(∑i≤jζiMi+∑i>jζiMi,n,Mn)+d(Mn,M)
<d(∑i≤jζiMi+∑i>jζiMi,n,∑iζiMi,n)+ϵ=d(∑i≤jζiMi,∑i≤jζiMi,n)+ϵ
≤∑i≤jd(ζiMi,ζiMi,n)+ϵ=∑i≤j||ζiMi−ζiMi,n||+ϵ=∑i≤jζi||Mi−Mi,n||+ϵ
=∑i≤jζid(Mi,Mi,n)+ϵ<∑i≤jζiϵ+ϵ≤ϵ+ϵ=2ϵ
So, M is less than 2ϵ away from the upper completion of the point ∑i≤jζiMi, which is a closed set (Minkowski sum of a closed and compact set is closed). ϵ can be shrunk to 0 with increasing n, so M has distance 0 from the upper completion of said partial sum, and thus lies above the partial sum!
Abbreviating ∑i≤jζiMi as Mj, we get that all the Mj lie in {M}−Msa(X), and are all sa-measures. Thus, if the sequence Mj converges to a unique point, then said limit point is ∑iζiMi, and all the Mi∈Hi, so ∑iζiMi would lie in EζHi. Further, by Lemma 3, ∑iζiMi∈{M}−Msa(X), since that set is compact, so M lies above ∑iζiMi, and would lie in EζHi by upper-completeness.
So, all that's left to wrap up our closure argument is showing that the sequence Mj has a single limit point. Since it's wandering around in ({M}−Msa(X))∩Msa(X) which is compact by Lemma 3, there are convergent subsequences. All we have to show now is that all convergent subsequences must have the same limit point.
Assume this is false, and there's two distinct limit points of the sequence Mj, call them M∞ and M′∞. Because it's impossible for two points to each lie above the other (in the minimal-point/adding-points sense) without being identical, either M∞∉{M′∞}−Msa(X), or vice-versa. Without loss of generality, assume M∞∉{M′∞}−Msa(X). Since the latter is a closed set, M∞ must be at least ϵ away from it for some ϵ>0. Fix some j from the subsequence that M∞ is a limit point of, where d(Mj,M∞)<ϵ/2. There must be some strictly greater j′ from the subsequence that M′∞ is a limit point of.
Mj′=∑i≤j′ζiMi=∑i≤jζiMi+∑j<i≤j′ζiMi=Mj+∑j<i≤j′ζiMi
Further, the ζi are nonzero. Also, no Mi can be the 0 point, because Mi∈Hi, and if Mi=(0,0), then EHi(1)=0, which is impossible by normalization. So, Mj lies strictly below Mj′. Also, Mj′ lies below M′∞, because for all the j∗>j′,
Mj∗=∑i≤j∗ζiMi=∑i≤j′ζiMi+∑j′<i≤j∗ζiMi=Mj′+∑j′<i≤j∗ζiMi
so Mj∗∈{Mj′}+Msa(X) for all j∗>j′. The sequence that limits to M′∞ is roaming around in this set, which is closed because the sum of a compact set (a single point) and a closed set is closed. So, M′∞ lies above Mj′ which lies above Mj. Thus, Mj∈{M′∞}−Msa(X). However, Mj is less than ϵ/2 away from M∞, which must be at least ϵ away from {M′∞}−Msa(X), and we have a contradiction.
Ok, so the sequence of partial sums Mj has a single limit point, which is ∑iζiMi, and all the Mi∈Hi, so ∑iζiMi∈EζHi, and by Lemma 3, ∑iζiMi∈{M}−Msa(X), since that set is compact, so M lies above ∑iζiMi, and lies in EζHi by upper-completeness. We're done!
For minimals, by our argument about what it takes to invoke LF-Duality in Proposition 9, we only need convexity, closure, and upper completion (which we have), and that the h induced by EζHi is continuous. By Proposition 10, EEζHi(f)=Eζ(EHi(f))=Eζ(hi(f))=(Eζhi)(f). We might as well go for uniform continuity since all the Hi are infradistributions, and so fulfill weak-bounded-minimals, so their hi are uniformly continuous. Then, this continuity lets you invoke LF-Duality, and transfer uniform continuity for the h induced by EζHi to weak-bounded-minimals for EζHi
For uniform continuity/weak-bounded-minimals, given an arbitrary ϵ, we can pick a finite j where ∑i>jζi<ϵ/2, and a δ>0 where, for all hi with i≤j, d(f,f′)<δ implies |hi(f)−hi(f′)|<ϵ/2 (there are only finitely many such hi, each uniformly continuous, so take the smallest of their δ). Monotonicity and normalization for the hi ensure that, no matter what, hi(f)∈[0,1], so regardless of the f,f′, |hi(f)−hi(f′)|≤1. Then, we can go: Ok, if d(f,f′)<δ, then
|Eζ(hi(f))−Eζ(hi(f′))|≤Eζ|hi(f)−hi(f′)|
=∑i≤jζi|hi(f)−hi(f′)|+∑i>jζi|hi(f)−hi(f′)|
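$$<\sum_{i\leq j}\zeta_i\frac{\epsilon}{2}+\sum_{i>j}\zeta_i<\sum_i\zeta_i\frac{\epsilon}{2}+\frac{\epsilon}{2}=\frac{\epsilon}{2}+\frac{\epsilon}{2}=\epsilon$$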
And by our earlier argument, we invoke LF-Duality and pick up weak-bounded-minimals.
For positive-minimals, we can just observe that, if f′≥f, then
(Eζhi)(f′)=Eζ(hi(f′))≥Eζ(hi(f))=(Eζhi)(f)
By monotonicity for the hi because Hi had positive-minimals. Going back to EζHi, since its associated h is monotone, it must have positive-minimals as well.
For bounded minimals assuming the Lipschitz constants aren't too big, fix some ϵ. We know that ∑iζiλ⊙i<∞, where λ⊙i is the Lipschitz constant of hi. So, if d(f,f′)<ϵ, then:
|Eζ(hi(f))−Eζ(hi(f′))|≤Eζ|hi(f)−hi(f′)|=∑iζi|hi(f)−hi(f′)|<∑iζiλ⊙iϵ
So, ∑iζiλ⊙i is a finite constant, and is an upper bound on the Lipschitz constant of the mixture of the hi, so the h corresponding to EζHi has a Lipschitz constant, which, by Theorem 5, translates to bounded-minimals. And we're done.
Proposition 12: g∗(Eζ(Hi))=Eζ(g∗(Hi))
Let's use Theorem 5 to translate this into the concave functional setting. We want to show that g∗(Eζhi)=Eζ(g∗(hi)) Now, given any function f∈C(Y,[0,1]),
(g∗(Eζhi))(f)=(Eζhi)(f∘g)=Eζ(hi(f∘g))=Eζ(g∗(hi)(f))=(Eζ(g∗(hi)))(f)
and we're done! The two concave functionals corresponding to those two sets are the same, so the sets themselves are the same.
Lemma 8: The "raw update" ugL:Msa(X)→Msa(L) defined by (m,b)↦(m⋅L,b+m(0★Lg)) is a continuous linear operator.
For linearity,
ugL(a(m,b)+a′(m′,b′))=ugL(am+a′m′,ab+a′b′)
=((am+a′m′)⋅L,ab+a′b′+(am+a′m′)(0★Lg))
=(a(m⋅L)+a′(m′⋅L),ab+a′b′+am(0★Lg)+a′m′(0★Lg))
=a(m⋅L,b+m(0★Lg))+a′(m′⋅L,b′+m′(0★Lg))=augL(m,b)+a′ugL(m′,b′)
Now for continuity. mn⋅L limits to m⋅L if, for all $f\in C(\overline{\mathrm{supp}(L)})$, (mn⋅L)(f) limits to (m⋅L)(f). Observe that (m⋅L)(f)=m(f★L0), and f★L0 is continuous.
Now, for any f we can go
limn→∞((mn⋅L)(f))=limn→∞(mn(f★L0))=m(f★L0)=(m⋅L)(f)
establishing continuity in the first vector component, by mn limiting to m. For the second vector component,
m(0★Lg)+b=limn→∞(mn(0★Lg))+limn→∞bn=limn→∞(mn(0★Lg)+bn)
So we have continuity in the second vector component as well, and we're done.
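To make the raw update concrete, here's a finite-state sketch of the operator and a numerical check of its linearity. The likelihood `L_FN`, off-event function `G_FN`, the two points, and the helper names are all assumptions for illustration. The sketch also keeps coordinates where L is 0 (they just carry zero mass) rather than restricting to the support of L.

```python
# X = {0, 1, 2}. L_FN and G_FN are made-up functions X -> [0, 1].
L_FN = [1.0, 0.5, 0.0]
G_FN = [0.2, 0.4, 0.6]


def raw_update(point, likelihood, g):
    """(m, b) -> (m . L, b + m(0 star_L g)), where 0 star_L g = (1 - L) g."""
    m, b = point
    off_event = sum(mx * (1.0 - lx) * gx for mx, lx, gx in zip(m, likelihood, g))
    return ([mx * lx for mx, lx in zip(m, likelihood)], b + off_event)


def combine(a, p, a2, p2):
    """The linear combination a * p + a2 * p2 of two (m, b) pairs."""
    m = [a * x + a2 * y for x, y in zip(p[0], p2[0])]
    return (m, a * p[1] + a2 * p2[1])


P1 = ([0.3, -0.1, 0.5], 0.4)  # hypothetical sa-measure
P2 = ([0.2, 0.2, 0.2], 0.3)   # hypothetical a-measure

# Update of a combination versus combination of the updates.
lhs = raw_update(combine(2.0, P1, -0.5, P2), L_FN, G_FN)
rhs = combine(2.0, raw_update(P1, L_FN, G_FN), -0.5, raw_update(P2, L_FN, G_FN))
print(lhs, rhs)
```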
Lemma 9: (ugL(H))min⊆ugL(Hmin)
As a recap, the raw update function ugL is: (m,b)↦(m⋅L,b+m(0★Lg))
Take a point (m,b)∈(ugL(H))min. Now there must be a preimage point (m′,b′)∈H that, when we apply ugL, produces (m,b). Because (m′,b′) is in an infradistribution, we can decompose it into a minimal point and something else, (m′,b′)=(mmin,bmin)+(m∗,b∗). Then,
(m,b)=ugL((m′,b′))=ugL((mmin,bmin)+(m∗,b∗))=ugL(mmin,bmin)+ugL(m∗,b∗)
This was done by using linearity of ugL via Lemma 8.
Note that, since we have written (m,b) as a sum of a different point also in ugL(H) and an sa-measure, but (m,b) is minimal in ugL(H), the sa-measure must be 0, so (m,b)=ugL(mmin,bmin)∈ugL(Hmin), and we're done.
Proposition 13: When updating a bounded infradistribution over Msa(X), if the renormalization doesn't fail, you get a bounded infradistribution over the set Msa(L). (for infradistributions in general, you may have to take the closure)
Proof sketch: It doesn't matter whether you take upper-completion before or after renormalization, so we can appeal to Proposition 7: Renormalizing a bounded inframeasure produces a bounded infradistribution (if the renormalization doesn't fail).
So, we just have to show nonemptiness, convexity, upper-completion (trivial), positive-minimals/bounded minimals (by Lemma 9, the preimage of a minimal point contains a minimal point, so we can transfer over the properties from the minimal point in the preimage), and closure. The set of minimal points in H is contained in a compact set, so we can take a sequence in (ugL(H))uc, split into a component in ugL(H) and something else, take preimage points, get minimals below all of them, isolate a convergent subsequence, map the limit point back through, and show that the limit point lands under your point of interest. That establishes all conditions for a bounded inframeasure, so then we just have to check that our renormalization is the right one to do.
Proof: Nonemptiness is trivial, ugL isn't a partial function. Upper-completion is also trivial, because we explicitly took the upper completion. For convexity, observe that ugL is a linear operator by Lemma 8, so it maps convex sets to convex sets, and the Minkowski sum of two convex sets is convex. ugL maps sa-measures to sa-measures, because
b+m(0★Lg)+(m⋅L)−(1)=b+m(0★Lg)+(m−⋅L)(1)
=b+m(0★Lg)+m−(1★L0)=b+m+(0★Lg)+m−(0★Lg)+m−(1★L0)
≥b+m−(1★Lg)≥b+m−(1)≥0
For positive-minimals and bounded-minimals, we invoke Lemma 9, (ugL(H))min⊆ugL(Hmin). All minimal points in ugL(H) must have a preimage minimal in H, which is an a-measure. Chopping down a measure by L keeps it a measure, so we still have no negative components post-update, and all minimal points in ugL(H) are a-measures. Similarly, chopping down a measure by L reduces the λ value, and we had an upper bound of λ⊙ originally, so the upper bound still works post-update. This gets bounded-minimals.
This just leaves closure. Fix a sequence Mn in (ugL(H))uc limiting to M. The Mn break down into ugL(M′n)+M∗n, where M′n∈H. M′n further breaks down into Mminn+M∗∗n, where Mminn∈Hmin. By Proposition 5, the Mminn sequence is wandering around in a compact set since we have bounded-minimals on H, so there's a convergent subsequence which has a limit point Mmin. Map that convergent subsequence and limit point through ugL which is continuous by Lemma 8 to get a sequence of points ugL(Mminn) limiting to ugL(Mmin)∈ugL(H). Fix some really big n where d(M,Mn)<ϵ and d(ugL(Mminn),ugL(Mmin))<ϵ.
Now, ugL(Mmin)+ugL(M∗∗n)+M∗n lies in the upper completion of the point ugL(Mmin). We'll show that this sum of 3 terms is close to M. Since we're working in a Banach space, d(x+y,z+y)=d(x,z), by norm arguments.
d(ugL(Mmin)+ugL(M∗∗n)+M∗n,M)≤d(ugL(Mmin)+ugL(M∗∗n)+M∗n,Mn)+d(Mn,M)
<d(ugL(Mmin)+ugL(M∗∗n)+M∗n,ugL(M′n)+M∗n)+ϵ
=d(ugL(Mmin)+ugL(M∗∗n),ugL(M′n))+ϵ=d(ugL(Mmin)+ugL(M∗∗n),ugL(Mminn+M∗∗n))+ϵ
=d(ugL(Mmin)+ugL(M∗∗n),ugL(Mminn)+ugL(M∗∗n))+ϵ=d(ugL(Mmin),ugL(Mminn))+ϵ<2ϵ
So, M is within 2ϵ of the upper completion of {ugL(Mmin)} for all ϵ, and it's a closed set, so M lies above ugL(Mmin)∈ugL(H), so M∈(ugL(H))uc, and we have closure.
Now that all prerequisite conditions have been established, we just need to show that $\frac{1}{P^g_H(L)}$ and $\mathbb{E}_H(0\star_L g)$ are the proper renormalization constants to use.
The proper renormalization to use is: $\frac{1}{\mathbb{E}_{(u^g_L(H))^{uc}}(1)-\mathbb{E}_{(u^g_L(H))^{uc}}(0)}$ for the scale, and $\mathbb{E}_{(u^g_L(H))^{uc}}(0)$ for the shift. So let's unpack these quantities.
E(ugL(H))uc(0)=EugL(H)(0)=inf(m,b)∈ugL(H)b=inf(m,b)∈H(b+m(0★Lg))=EH(0★Lg)
So, our shift constant checks out, it's the proper shift constant to use. In the other direction,
E(ugL(H))uc(1)=EugL(H)(1)=inf(m,b)∈ugL(H)(m(1)+b)
=inf(m,b)∈H((m⋅L)(1)+b+m(0★Lg))=inf(m,b)∈H(m(1★L0)+b+m(0★Lg))
=inf(m,b)∈H(m(1★Lg)+b)=EH(1★Lg)
For the scale constant, observe that $$\frac{1}{\mathbb{E}_{(u^g_L(H))^{uc}}(1)-\mathbb{E}_{(u^g_L(H))^{uc}}(0)}=\frac{1}{\mathbb{E}_H(1\star_L g)-\mathbb{E}_H(0\star_L g)}=\frac{1}{P^g_H(L)}$$
So our scale constant is also the right scale constant to use. Now, we can invoke Proposition 7: Renormalizing a bounded inframeasure produces a bounded infradistribution if the renormalization doesn't fail.
Proposition 14: EH(f★Lg)=EH(0★Lg)+PgH(L)EH|gL(f)
Proof: if PgH(L)≠0, then
EH(0★Lg)+PgH(L)EH|gL(f)=EH(0★Lg)+PgH(L)(inf(m,b)∈H|gL(m(f)+b))
=EH(0★Lg)+PgH(L)(inf(m,b)∈H(((1/PgH(L))(m⋅L))(f)+(1/PgH(L))(b+m(0★Lg)−EH(0★Lg))))
=EH(0★Lg)+inf(m,b)∈H((m⋅L)(f)+b+m(0★Lg)−EH(0★Lg))
=inf(m,b)∈H((m⋅L)(f)+b+m(0★Lg))
=inf(m,b)∈H(m(f★L0)+b+m(0★Lg))=inf(m,b)∈H(m(f★Lg)+b)=EH(f★Lg)
Now, if PgH(L)=0, then EH(1★Lg)=EH(0★Lg). For any f∈C(X,[0,1]), we have (1★Lg)≥(f★Lg)≥(0★Lg) pointwise, so by monotonicity of the h induced by H, h(1★Lg)≥h(f★Lg)≥h(0★Lg). Since h(1★Lg)=h(0★Lg), this forces h(f★Lg)=h(0★Lg). Therefore,
EH(0★Lg)+PgH(L)EH|gL(f)=EH(0★Lg)+0=EH(f★Lg)
and we get our same result.
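As a numerical illustration of Proposition 14 in the case PgH(L)≠0, here is a minimal sketch under the same kind of simplifying assumptions: a finite space, H represented by finitely many a-measures, and exact rationals. The function `update` below is a hypothetical helper that applies the raw update and then the shift and scale described in the text; it is a toy model, not a general implementation.

```python
from fractions import Fraction as F

def star(f, L, g):
    """Pointwise (f ★_L g)(x) = L(x) f(x) + (1 - L(x)) g(x)."""
    return [l * a + (1 - l) * c for a, l, c in zip(f, L, g)]

def integrate(m, f):
    return sum(mi * fi for mi, fi in zip(m, f))

def expect(H, f):
    """E_H(f) = min over (m, b) in H of m(f) + b."""
    return min(integrate(m, f) + b for m, b in H)

def update(H, L, g):
    """Return (H|^g_L, E_H(0 ★_L g), P^g_H(L)) for a finitely generated H."""
    n = len(L)
    off_L = star([F(0)] * n, L, g)
    shift = expect(H, off_L)
    P = expect(H, star([F(1)] * n, L, g)) - shift
    if P == 0:
        raise ValueError("update fails: P^g_H(L) = 0")
    updated = [([mi * li / P for mi, li in zip(m, L)],
                (b + integrate(m, off_L) - shift) / P)
               for m, b in H]
    return updated, shift, P

# Hypothetical toy data on a three-point space.
H = [([F(1, 2), F(1, 4), F(1, 4)], F(0)),
     ([F(1, 4), F(1, 4), F(1, 2)], F(1, 10))]
L = [F(1), F(1, 2), F(0)]
g = [F(0), F(1, 2), F(1)]
f = [F(1, 3), F(1, 2), F(1, 4)]

H_updated, shift, P = update(H, L, g)
lhs = expect(H, star(f, L, g))            # E_H(f ★_L g)
rhs = shift + P * expect(H_updated, f)    # E_H(0 ★_L g) + P^g_H(L) E_{H|gL}(f)
print(lhs == rhs)
```

The PgH(L)=0 branch is deliberately not modelled here, since the update itself is undefined in that case and the text handles it by a separate monotonicity argument.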
Proposition 15: (H|gL)|g′L′=H|(g★((1−L)/(1−LL′))g′)LL′, where the inner star is weighted by the function (1−L)/(1−LL′).
Proof sketch: First, we do some shuffling around of the stars to get a lemma that will help. Then, we can use the link between updated sets and their associated concave functionals h, getting the identity purely on the concave functional level, where it's much easier to approach.
Proof: First, the star shuffling. For any f,g,g′,L,L′∈C(X,[0,1]), we'll show that
f★LL′(g★((1−L)/(1−LL′))g′)=(f★L′g′)★Lg.
Let's begin. First, let's deal with points x where L(x)=L′(x)=1, because that gets you a divide-by-zero error.
(f★LL′(g★((1−L)/(1−LL′))g′))(x)=L(x)L′(x)f(x)+(1−L(x)L′(x))(g★((1−L)/(1−LL′))g′)(x)
=L(x)L′(x)f(x)+0+0=L(x)L′(x)f(x)+L(x)⋅0⋅g′(x)+0⋅g(x)
=L(x)L′(x)f(x)+L(x)(1−L′(x))g′(x)+(1−L(x))g(x)
=L(x)(L′(x)f(x)+(1−L′(x))g′(x))+(1−L(x))g(x)
=((L′f+(1−L′)g′)★Lg)(x)=((f★L′g′)★Lg)(x)
and we're done with the divide-by-zero case. In the other case, we can safely assume there's no divide-by-zero errors.
f★LL′(g★((1−L)/(1−LL′))g′)=LL′f+(1−LL′)(g★((1−L)/(1−LL′))g′)
=LL′f+(1−LL′)(((1−L)/(1−LL′))g+(1−(1−L)/(1−LL′))g′)
=LL′f+(1−LL′)(((1−L)/(1−LL′))g+((1−LL′−1+L)/(1−LL′))g′)
=LL′f+(1−L)g+(1−LL′−1+L)g′=LL′f+(1−L)g+L(1−L′)g′
=L(L′f+(1−L′)g′)+(1−L)g=(L′f+(1−L′)g′)★Lg=(f★L′g′)★Lg
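The star identity can be checked pointwise with exact rationals. The sketch below is illustrative only: the function values are hypothetical, and `merged_weight` picks the arbitrary value 0 for (1−L)/(1−LL′) at points where L=L′=1, which is harmless because that weight is multiplied by 1−LL′=0 there.

```python
from fractions import Fraction as F

def star(f, L, g):
    """Pointwise (f ★_L g)(x) = L(x) f(x) + (1 - L(x)) g(x)."""
    return [l * a + (1 - l) * c for a, l, c in zip(f, L, g)]

def merged_weight(L, Lp):
    """Pointwise (1 - L) / (1 - L L'), with the arbitrary value 0 where L = L' = 1."""
    return [F(0) if l * lp == 1 else (1 - l) / (1 - l * lp)
            for l, lp in zip(L, Lp)]

# Hypothetical values on a four-point space, chosen to cover:
# x=0: L = L' = 1 (the divide-by-zero case), x=1: L = 1 only,
# x=2: L = 0, x=3: a generic interior point.
L  = [F(1), F(1),    F(0),    F(1, 2)]
Lp = [F(1), F(1, 3), F(1, 2), F(3, 4)]
f  = [F(1, 5), F(2, 5), F(3, 5), F(4, 5)]
g  = [F(1),    F(1, 2), F(0),    F(1, 4)]
gp = [F(0),    F(1, 3), F(1),    F(1, 2)]

LLp = [l * lp for l, lp in zip(L, Lp)]
r = merged_weight(L, Lp)

lhs = star(f, LLp, star(g, r, gp))   # f ★_{LL'} (g ★_{(1-L)/(1-LL')} g')
rhs = star(star(f, Lp, gp), L, g)    # (f ★_{L'} g') ★_L g
print(lhs == rhs)
```

The check also confirms that the merged weight stays in [0,1], so g★((1−L)/(1−LL′))g′ is again a function into [0,1] whenever g and g′ are.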
Ok, so we've established our crucial f★LL′(g★((1−L)/(1−LL′))g′)=(f★L′g′)★Lg identity. Let's proceed. Updates for concave functionals are: (h|gL)(f)=(h(f★Lg)−h(0★Lg))/(h(1★Lg)−h(0★Lg))
Importing Proposition 14, EH(f★Lg)=EH(0★Lg)+PgH(L)EH|gL(f) and rearranging it (and unpacking the definition of PgH(L)), we get EH|gL(f)=(EH(f★Lg)−EH(0★Lg))/(EH(1★Lg)−EH(0★Lg))
So, updating fulfills the positive functional definition of update, because this transfers into (h|gL)(f)=(h(f★Lg)−h(0★Lg))/(h(1★Lg)−h(0★Lg)) which is exactly our concave functional definition of updating. So, in order to verify that the two updates equal the one big update, we could just show that their concave functional definitions are equivalent. (H|gL)|g′L′ would, on the concave functional level, turn into:
((h|gL)|g′L′)(f)=((h|gL)(f★L′g′)−(h|gL)(0★L′g′))/((h|gL)(1★L′g′)−(h|gL)(0★L′g′))
=[(h((f★L′g′)★Lg)−h(0★Lg))/(h(1★Lg)−h(0★Lg))−(h((0★L′g′)★Lg)−h(0★Lg))/(h(1★Lg)−h(0★Lg))]/[(h((1★L′g′)★Lg)−h(0★Lg))/(h(1★Lg)−h(0★Lg))−(h((0★L′g′)★Lg)−h(0★Lg))/(h(1★Lg)−h(0★Lg))]
=(h((f★L′g′)★Lg)−h(0★Lg)−h((0★L′g′)★Lg)+h(0★Lg))/(h((1★L′g′)★Lg)−h(0★Lg)−h((0★L′g′)★Lg)+h(0★Lg))
=(h((f★L′g′)★Lg)−h((0★L′g′)★Lg))/(h((1★L′g′)★Lg)−h((0★L′g′)★Lg))
and now we can use our earlier star identity to rewrite as:
=(h(f★LL′(g★((1−L)/(1−LL′))g′))−h(0★LL′(g★((1−L)/(1−LL′))g′)))/(h(1★LL′(g★((1−L)/(1−LL′))g′))−h(0★LL′(g★((1−L)/(1−LL′))g′)))
=(h|(g★((1−L)/(1−LL′))g′)LL′)(f)
establishing our identity of updating twice, vs one big update of a different form.
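Proposition 15 can be illustrated on the concave functional level with a small exact computation. This is a sketch under stated assumptions: h is taken to be the minimum of finitely many affine functionals on a three-point space, all data are hypothetical, and `updated_functional` is a hypothetical helper implementing the formula (h|gL)(f)=(h(f★Lg)−h(0★Lg))/(h(1★Lg)−h(0★Lg)).

```python
from fractions import Fraction as F

def star(f, L, g):
    """Pointwise (f ★_L g)(x) = L(x) f(x) + (1 - L(x)) g(x)."""
    return [l * a + (1 - l) * c for a, l, c in zip(f, L, g)]

def expect(H, f):
    """E_H(f) = min over (m, b) in H of m(f) + b."""
    return min(sum(mi * fi for mi, fi in zip(m, f)) + b for m, b in H)

def updated_functional(h, L, g):
    """Return the functional h|^g_L, raising if the update fails."""
    n = len(L)
    low = h(star([F(0)] * n, L, g))
    width = h(star([F(1)] * n, L, g)) - low
    if width == 0:
        raise ValueError("update fails: h(1 ★_L g) = h(0 ★_L g)")
    return lambda f: (h(star(f, L, g)) - low) / width

def merged_weight(L, Lp):
    """Pointwise (1 - L) / (1 - L L'), with the arbitrary value 0 where L = L' = 1."""
    return [F(0) if l * lp == 1 else (1 - l) / (1 - l * lp)
            for l, lp in zip(L, Lp)]

# Hypothetical toy data; x = 0 has L = L' = 1.
H = [([F(1, 2), F(1, 4), F(1, 4)], F(0)),
     ([F(1, 4), F(1, 4), F(1, 2)], F(1, 10))]
L,  g  = [F(1), F(1, 2), F(0)],    [F(0), F(1, 2), F(1)]
Lp, gp = [F(1), F(1, 2), F(1, 2)], [F(1), F(0),    F(1, 2)]
f = [F(1, 3), F(1, 2), F(1, 4)]

def h(func):
    return expect(H, func)

# Two successive updates versus one combined update.
twice = updated_functional(updated_functional(h, L, g), Lp, gp)
LLp = [l * lp for l, lp in zip(L, Lp)]
once = updated_functional(h, LLp, star(g, merged_weight(L, Lp), gp))
print(twice(f) == once(f))
```

Working with functionals rather than sets mirrors the proof strategy in the text, where the identity is established purely on the concave functional level.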
Corollary 2: Regardless of L, L′, and g, (H|gL)|gL′=H|g(LL′)
Just use Proposition 15, and notice that: g★((1−L)/(1−LL′))g=((1−L)/(1−LL′))g+(1−(1−L)/(1−LL′))g=g, getting us our result.
Corollary 3: If Y and Z are clopen sets, then, abusing notation by glossing over the difference between indicator functions and sets, (H|gY)|gZ=H|g(Y∩Z)
Invoke Corollary 2, and observe that 1Y⋅1Z=1Y∩Z.
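Corollary 3 can be illustrated the same way, with indicator functions standing in for sets. The sketch below assumes a three-point space (where every subset is clopen), a functional h given as a minimum of finitely many affine functionals, and hypothetical helper names and data.

```python
from fractions import Fraction as F

def star(f, L, g):
    """Pointwise (f ★_L g)(x) = L(x) f(x) + (1 - L(x)) g(x)."""
    return [l * a + (1 - l) * c for a, l, c in zip(f, L, g)]

def expect(H, f):
    """E_H(f) = min over (m, b) in H of m(f) + b."""
    return min(sum(mi * fi for mi, fi in zip(m, f)) + b for m, b in H)

def updated_functional(h, L, g):
    """Return the functional h|^g_L, raising if the update fails."""
    n = len(L)
    low = h(star([F(0)] * n, L, g))
    width = h(star([F(1)] * n, L, g)) - low
    if width == 0:
        raise ValueError("update fails")
    return lambda f: (h(star(f, L, g)) - low) / width

def indicator(S, n):
    """Indicator function of the subset S of {0, ..., n-1}."""
    return [F(1) if x in S else F(0) for x in range(n)]

# Hypothetical toy data.
H = [([F(1, 2), F(1, 4), F(1, 4)], F(0)),
     ([F(1, 4), F(1, 4), F(1, 2)], F(1, 10))]
g = [F(0), F(1, 2), F(1)]
f = [F(1, 3), F(1, 2), F(1, 4)]
Y, Z = {0, 1}, {1, 2}

def h(func):
    return expect(H, func)

ind_Y, ind_Z, ind_YZ = indicator(Y, 3), indicator(Z, 3), indicator(Y & Z, 3)
product_of_indicators = [a * b for a, b in zip(ind_Y, ind_Z)]

twice = updated_functional(updated_functional(h, ind_Y, g), ind_Z, g)
once = updated_functional(h, ind_YZ, g)
print(product_of_indicators == ind_YZ, twice(f) == once(f))
```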
Lemma 10: PgEζHi(L)=Eζ(PgHi(L))
Proof: Invoke Proposition 10 to go:
PgEζHi(L)=EEζHi(1★Lg)−EEζHi(0★Lg)=Eζ(EHi(1★Lg))−Eζ(EHi(0★Lg))
=Eζ(EHi(1★Lg)−EHi(0★Lg))=Eζ(PgHi(L))
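Lemma 10 can be checked on a toy mixture. The sketch below assumes that ζ has finite support, that each Hi is generated by finitely many a-measures, and that the mixture EζHi consists of all points ∑iζi(mi,bi) with (mi,bi)∈Hi. The helpers `mix` and `prob` and the data are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

def star(f, L, g):
    """Pointwise (f ★_L g)(x) = L(x) f(x) + (1 - L(x)) g(x)."""
    return [l * a + (1 - l) * c for a, l, c in zip(f, L, g)]

def expect(H, f):
    """E_H(f) = min over (m, b) in H of m(f) + b."""
    return min(sum(mi * fi for mi, fi in zip(m, f)) + b for m, b in H)

def mix(sets, zeta):
    """All points sum_i zeta_i (m_i, b_i), picking one (m_i, b_i) from each H_i."""
    mixed = []
    for choice in product(*sets):
        n = len(choice[0][0])
        m = [sum(z * pt[0][x] for z, pt in zip(zeta, choice)) for x in range(n)]
        b = sum(z * pt[1] for z, pt in zip(zeta, choice))
        mixed.append((m, b))
    return mixed

def prob(H, L, g):
    """P^g_H(L) = E_H(1 ★_L g) - E_H(0 ★_L g)."""
    n = len(L)
    return expect(H, star([F(1)] * n, L, g)) - expect(H, star([F(0)] * n, L, g))

# Hypothetical toy data on a three-point space.
H1 = [([F(1, 2), F(1, 4), F(1, 4)], F(0)),
      ([F(1, 4), F(1, 4), F(1, 2)], F(1, 10))]
H2 = [([F(1, 3), F(1, 3), F(1, 3)], F(0)),
      ([F(1, 6), F(1, 6), F(1, 6)], F(1, 2))]
zeta = [F(1, 4), F(3, 4)]
L = [F(1), F(1, 2), F(0)]
g = [F(0), F(1, 2), F(1)]

p_of_mixture = prob(mix([H1, H2], zeta), L, g)
mixture_of_p = sum(z * prob(Hi, L, g) for z, Hi in zip(zeta, [H1, H2]))
print(p_of_mixture == mixture_of_p)
```

The equality relies on the minimum of a weighted sum over independent choices splitting into a weighted sum of minima, which is the finite analogue of the Proposition 10 step used in the proof.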
Theorem 6: (EζHi)|gL=Eζ(PgHi(L)⋅(Hi|gL))/Eζ(PgHi(L)), if the update doesn't fail.
Proof: Let ζ′ be defined as ζ′i:=ζiPgHi(L)/∑jζjPgHj(L). It is a probability distribution because the normalizing sum is nonzero: if ∑jζjPgHj(L)=Eζ(PgHi(L)) were 0, then by Lemma 10, PgEζHi(L)=0, which would cause the update to fail.
The left-hand-side corresponds to (Eζhi)|gL on the concave functional level, and the right-hand-side corresponds to Eζ′(hi|gL) on the concave functional level. Let's begin unpacking. Lemma 10 will be used throughout, as well as the definition of PgHi(L).
(Eζ′(hi|gL))(f)=Eζ′((hi|gL)(f))=∑i((ζiPgHi(L)/∑jζjPgHj(L))⋅(hi(f★Lg)−hi(0★Lg))/(hi(1★Lg)−hi(0★Lg)))
=∑i((ζiPgHi(L)/∑jζjPgHj(L))⋅(hi(f★Lg)−hi(0★Lg))/PgHi(L))=∑i(ζi(hi(f★Lg)−hi(0★Lg))/∑jζjPgHj(L))
=(∑iζi(hi(f★Lg)−hi(0★Lg)))/Eζ(PgHi(L))=Eζ(hi(f★Lg)−hi(0★Lg))/PgEζHi(L)
=(Eζ(hi(f★Lg))−Eζ(hi(0★Lg)))/(Eζ(hi(1★Lg))−Eζ(hi(0★Lg)))=((Eζhi)(f★Lg)−(Eζhi)(0★Lg))/((Eζhi)(1★Lg)−(Eζhi)(0★Lg))=((Eζhi)|gL)(f)
So, (Eζhi)|gL=Eζ′(hi|gL) as desired, which shows our result.
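The reweighting in Theorem 6 can be illustrated on the functional level. This is a minimal sketch assuming a finitely supported ζ, functionals hi given as minima of finitely many affine functionals, and hypothetical data; components with PgHi(L)=0 get weight 0 under ζ′ and are skipped, since hi|gL is undefined for them.

```python
from fractions import Fraction as F

def star(f, L, g):
    """Pointwise (f ★_L g)(x) = L(x) f(x) + (1 - L(x)) g(x)."""
    return [l * a + (1 - l) * c for a, l, c in zip(f, L, g)]

def expect(H, f):
    """E_H(f) = min over (m, b) in H of m(f) + b."""
    return min(sum(mi * fi for mi, fi in zip(m, f)) + b for m, b in H)

def updated_functional(h, L, g):
    """Return the functional h|^g_L, raising if the update fails."""
    n = len(L)
    low = h(star([F(0)] * n, L, g))
    width = h(star([F(1)] * n, L, g)) - low
    if width == 0:
        raise ValueError("update fails")
    return lambda f: (h(star(f, L, g)) - low) / width

def mixture(hs, weights):
    """The functional f -> sum_i weights_i * h_i(f)."""
    return lambda f: sum(w * h(f) for w, h in zip(weights, hs))

# Hypothetical toy data on a three-point space.
H1 = [([F(1, 2), F(1, 4), F(1, 4)], F(0)),
      ([F(1, 4), F(1, 4), F(1, 2)], F(1, 10))]
H2 = [([F(1, 3), F(1, 3), F(1, 3)], F(0)),
      ([F(1, 6), F(1, 6), F(1, 6)], F(1, 2))]
hs = [lambda f, H=H: expect(H, f) for H in (H1, H2)]
zeta = [F(1, 4), F(3, 4)]
L = [F(1), F(1, 2), F(0)]
g = [F(0), F(1, 2), F(1)]
f = [F(1, 3), F(1, 2), F(1, 4)]
zero, one = [F(0)] * 3, [F(1)] * 3

# zeta'_i = zeta_i P_i / sum_j zeta_j P_j, with P_i = h_i(1 ★_L g) - h_i(0 ★_L g).
P = [h(star(one, L, g)) - h(star(zero, L, g)) for h in hs]
total = sum(z * p for z, p in zip(zeta, P))
zeta_prime = [z * p / total for z, p in zip(zeta, P)]

lhs = updated_functional(mixture(hs, zeta), L, g)(f)        # ((E_zeta h_i)|gL)(f)
rhs = sum(zp * updated_functional(h, L, g)(f)               # (E_zeta'(h_i|gL))(f)
          for zp, h in zip(zeta_prime, hs) if zp != 0)
print(sum(zeta_prime) == 1, lhs == rhs)
```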