{0,1}L and {0,1}∗ are the sets of all bitstrings of length L, and the set of all finite bitstrings, respectively. Elements of these sets are denoted by x. The length i prefix of x is denoted by x:i.
BL is the set of all satisfiable booleans with all variables having an index ≤L, and B is the set of all satisfiable booleans. Similarly, B∗L and B∗ are the set of all booleans with all variables having an index ≤L, and the set of all booleans. Elements of these sets are denoted by B. B∧B′ is also a boolean. Bitstrings x may be interpreted as boolean constraints saying that the prefix of the bitstring must equal x.
f is a function N+→N+ that is time-constructible, monotonically increasing, and f(t)>t, which gives the runtime bound for traders.
l is some function N+→N+ upper-bounded by 2f(t) that is time-constructible, monotonically increasing, and l(t)>t. It gives the most distant bit the oracle inductor thins about on turn t.
d is the function N+→N+ given by d(t)=l(t)+⌈log2l(t)⌉+2t+7, giving the number of iterations of binary-search deployed on turn t.
δ is the function N+→[0,1]∪Q given by δ(t)=2−(t+3). It gives the proportion of the inductor distribution that is made up by the uniform distribution.
The operations query(A,p) and flip(p) take unit time to execute in our model of computation, though the input still has to be written down in the usual amount of time.
Given some arbitrary distribution Δ over {0,1}L, and a B∈B∗L, Δ(B) is the probability mass assigned to bitstrings that fulfill B. Given a B∈BL, ΔB is the conditional distribution given by ΔB(x)=Δ(B∧x)Δ(B). Δ∗B is the distribution induced by Approx(Δ,L,D,B). There is an implicit dependence on D which will be suppressed in the notation.
Lemma 1 (n-Lemma):Let ϵ=2−D. Given some distribution Δ over {0,1}L for which ∀x∈{0,1}L:Δ(x)≥nϵ, then
To begin with, if x is incompatible with B, Δ∗B(x)=0 by the construction of Approx, and the inequality is trivial. So we can consider the case where x is compatible with B. Given σ as a strict prefix of x, abbreviate NextBit(Δ,L,D,|σ|+1,B,σ) as NB(σ).
Because we're using a binary-search depth of D on each bit, and taking the average of the interval, the probabilities are approximated to within ϵ2.
Taking the midle three terms and subtracting 1 from all of them, which flips the sign of the inequality, and rewriting 1−P(NB(σ)=1) as P(NB(σ)=0) and rewriting 1 as Δ(B∧σ)±ϵ2Δ(B∧σ)±ϵ2, we get
For an individual term in the product of the lower/upper bound (± and ∓ will be used, the upper sign is used for the lower bound, the lower sign is used for the upper bound, and ⋛ will be abused to mean ≥ for the lower bound and ≤ for the upper bound), we can get the following equality.
Lemma 2: The distribution induced by OI(t) has the probability of all bitstrings bounded above by nϵ.
Identify D with d(t)=l(t)+⌈log2l(t)⌉+2t+7. Identify L with l(t). Identify n with l(t)2t+4.
Because ϵ=2−D, ϵ=2−(l(t)+⌈log2l(t)⌉+2t+7)≤1l(t)2−(l(t)+2t+7). Due to δ(t) of the distribution being composed of the uniform distribution, the probability of all x is bounded below by
From here on, we will frequently be going from the approximation of a conditional distribution to a conditional distribution via the n-Lemma and Lemma 2, so let ϵt=2−d(t), and nt=l(t)2t+4, and Lt=l(t). Δi and ΔiB will be used to refer to the probability distribution and conditional probability distribution over bitstrings that OI(i) produces, and Δ∗i and Δ∗iB refer to the approximation of that probability distribution with D=d(t).
Lemma 3:The assessed value at time t in world x of a trader T with a runtime and oracle-call filter attached is lower-bounded by
−t+∑1≤i≤tmax((nt−1nt+1)Lt∑B∈BLt(P(Ti=B)ΔiB(x))−ϵt,0)Δi(x)+ϵtand upper-bounded by
−t+(nt+1nt−1)Lt∑1≤i≤t∑B∈BLtP(Ti=B)ΔiB(x)Δi(x) .
To begin, the assessed value at time t in world x of a trader T by OverBudget is −t+∑1≤i≤t(∑B∈BLt(P(Ti=B)Δ∗iB(x)))∗Δ∗i(x) .
Δ∗i(x) has d(t) rounds of binary search run on it, which yields an interval with a width of ϵt, and an upper-bound estimate is taken, so Δi(x)≤Δ∗i(x)≤Δi(x)+ϵt. Similarly, the net value of a trade has a lower-bound estimate run on it, so
Further, by the n-Lemma and Lemma 2 (probability of x is either 0 or over ntϵt), we can replace Δ∗iB(x) by (nt+1nt−1)LiΔiB(x) or (nt−1nt+1)LiΔiB(x) respectively, which are then upper and lower-bounded by (nt+1nt−1)LtΔiB(x) and (nt−1nt+1)LtΔiB(x), which, because the fraction doesn't depend on i, can be pulled out of the sum, yielding the stated bounds.
In the next theorem, we will be abbreviating ∑B∈Bl(i)P(Ti=B)ΔiB(x) as ai and abbreviating Δi(x) as ci, so the value of a trader at timestep t in world x against the sequence of OI distributions is −t+∑1≤i≤taici.
Theorem 1:If a trader exploits the market, there is a budgeted version that trader that exploits the market.
As before, the net value of a trader at timestep t in world x against the sequence of OI distributions is −t+∑1≤i≤taici. Because the trader exploits the market, for all t and all x plausible at t, the net value ≥−b for some integer constant b, we get that for all t and all x plausible at t, ∑1≤i≤taici≥t−b. Call this quantity the gain of the trader. Separate the sum into "good" days where ai≥tϵt, and "bad" days where this is false, and use Lemma 2 to get
Abbreviate (nt−1nt+1)Lt as yt, and abbreviate (nt−1nt+1)Lt+1 as zt. Using Lemma 3, the worst-case assessed gain of the trader is ≥∑1≤i≤tmax(ytai−ϵt,0)ci+ϵt≥∑goodytai−ϵtci+ϵt. Focusing on an individual day, and using the fact that ci≥ntϵt, and it's a good day so ai≥tϵt, we get
By putting both sides in an exponent, this inequality is equivalent to (nt−1nt+1)Lt+1>1−1/4t. Multiplying both sides by t and using the definition of zt, we get tzt>t−14.
Using the previous inequalities, we get that the worst-case assessed gain of the trader is greater than t−b−14−564−1=t−b−8564. Therefore, the worst-case assessed value of the trader is greater than −b−8564, so when the budget is b+3, OverBudget will at worst evaluate the value as −b−8564>−(b+3)+1, so it will never return 1 and interfere with the trade, so the budgeted trader makes the same trades as the original trader and exploits the market.
Lemma 4: The worst-case value of a trader with a budget of b is≥−b−1/4.
Assume the value of a trader equals −b for some b∈R≥0 on some turn t. Then the gain of the trader, ∑1≤i≤taici is ≤t−b. By Lemma 3, the assessed gain is less than
By putting both sides in an exponent, we get the equivalent inequality (nt+1nt−1)Lt<1+1/4t and multiplying both sides by t yields t(nt+1nt−1)Lt<t+1/4, so the assessed gain is less than t−b+1/4 and the assessed value is ≤−b+1/4.
Then, if the trader has a budget of b (an integer), the worst-case scenario is that on some turn t it has a true value of −b+3/4 and it is assessed to have a value of −b+1 (maximum possible), so it barely passes the budgeting filter, and then loses 1 dollar on the next turn, getting a final value of −b−1/4 (after which it only outputs B⊤ which has a value of 0).
Theorem 2:If a budgeted trader exploits the market, the supertrader exploits the market.
Pi(A∧b)=Pi(A)Pi(b) is the probability of the supertrader drawing a bitstring a which encodes the algorithm A and budget b on timestep i. P(Aib=B) is the probability of A(i) with a runtime and oracle call filter and a budget filter of b outputting the boolean B. P(A) is the probability of a randomly selected infinite string encoding A, and P(b)=2−b. For sufficiently large i, Pi(A)=P(A) and Pi(b)=P(b).
A useful thing to note is that, because the bitstring a that is chosen is of length f(t), and it is only run on the UTM for f(t) steps, its behavior is identical to that of all algorithms that are encoded by a bitstring that has a as a prefix. Therefore, even though the probability of some A being drawn may be 0 on a timestep, there is some amount of probability mass in the supertrader that might as well have come from A, in the sense that its behavior is indistinguishable from A after the runtime and oracle call filter is applied.
Similarly, because the maximum selectable budget is f(t), and a trader can lose at most 1 dollar each turn, for all A and positive integers c, Aif(i) has the same exact behavior as Aif(i)+c, so we can pretend we are dealing with the full distribution over all budgets.
Therefore, for all i, the distribution over satisfiable booleans given by ∑A,bPi(A∧b)P(Aib=B) equals the distribution over satisfiable booleans given by ∑A,bP(A)P(b)P(Aib=B), and because of this,
Because V≤t(Tb′) is unbounded above for appropriate choices of t and x by assumption, it continues to be unbounded above when multiplied by P(A)2−b′ and has 94 subtracted from it. Therefore, the supertrader has plausible value unbounded above as well.
Theorem 3:The supertrader doesn't exploitOI.
Now, we will show that the maximum value that the supertrader can possibly get in worlds plausible at some turn t is upper-bounded by 2−t, so the value of the supertrader is upper-bounded by 1.
To begin with, the probability mass the supertrader places on world x at timestep t is ∑A,bP(A)2−b∑B∈BP(Atb=B)ΔB(x). This sum can be rewritten as ∑B∈BΔB(x)∑A,bP(A)2−bP(Atb=B), and for brevity, from here on, we will abbreviate ∑A,bP(A)2−bP(Atb=B) as Pt(B). With this abbreviation, we can express the value of the supertrader's trade on day t and world x as ∑B∈BPt(B)ΔB(x)δ(t)2−Lt+(1−δ(t))∑B∈BPt(B)Δ∗B(x)−1. Note that the numerator of the fraction is the supertrader, which uses the true conditional distribution, while the denominator is what the actual distribution is composed of, a small fragment of a uniform distribution with the rest is composed of a mixture of approximations to the appropriate conditional distribution. To begin the bounding argument, apply the n-Lemma to yield
By putting both sides in an exponent, this inequality is equivalent to (nt+1nt−1)Lt1−δ(t)<1+2−t, so (nt+1nt−1)Lt1−δ(t)−1<2−t, so the value of the supertrader gained on any given day is bounded above by 2−t, so the total value of the supertrader at any time is bounded above by 1.
Runtime Analysis:
The strength of the bounded reflective oracle needed is controlled by the longest runtime of the algorithms that the oracle is queried about. The relevant oracle calls are the invocations of SAT in NextBit, eval(B',OI(t)) in the binary search portion of NextBit, OverBudget, the SAT call in TradeToBool, the eval(B,OI(i)) calls that the trader may make, and the oracle calls in the binary search portion of OverBudget.
To begin with, the runtime of eval(B,x) is the time needed to flip enough coins to get to the most distant variable index in B (call it n), and then the time to run the boolean circuit on the padded x itself, which would be be O(n⋅|B|), to check each bit of the padded x and do a pass over the boolean to substitute ⊤ or ⊥ for that variable, and then evaluating the resulting boolean with variables filled in.
Similarly, the runtime of SAT is the time to write down B and the binary string corresponding to the probability bound, which has a length of approximately n, for a runtime of O(n+|B|).
Now we can look at the runtime of BinSearch. It is always called with D=d(t)=O(l(t)), and there are that many iterations of writing down Δ (which, everywhere it's invoked, takes less than d(t) bits) and the average of the two bounds, which, by a suitable binary encoding, takes at most about d(t) bits, so we get a runtime of about d(t)2, which is O(l(t)2).
As for the runtime of NextBit, the length of the boolean at the start is at most f(t)+O(l(t)) (the first part is because B was returned by an algorithm that ran for at most f(t) steps, and the second part is the bound on the length of x). A boolean of this length is written down 3 times, and SAT is invoked twice, and BinSearch is run twice, for a runtime of O(3(f(t)+l(t))+2(l(t)+f(t)+l(t))+2l(t)2)=O(f(t)+l(t)2) . Adding in the time to write in/read B and x from the input just adds another f(t)+l(t) term which doesn't affect runtime that much.
Now for Approx. There are at most l(t) iterations of NextBit, and we already accounted for the time to write the input to NextBit, so the runtime is O(f(t)l(t)+l(t)3).
TradeToBool reads the input, which takes O(f(t)) time, emulates f(t) steps of a turing machine, which takes O(f(t)logf(t)) time, clips the input which takes O(f(t)+l(t)) time (the latter is an upper bound on the time to compute l(t) because l is time-constructible), calls SAT on a boolean of length at most f(t) with a maximum index of l(t), and writes the boolean, for a runtime of O(f(t)logf(t)+l(t)).
BTtoBool reads the input and writes parts of it into the query, which takes O(f(t)) time, writes down the probability bound in O(l(t)) time, and runs TradeToBool, for a runtime of O(f(t)logf(t)+l(t)).
OI either writes down δ(t) in O(t) time and generates a string in O(l(t)) time, or computes f(t), l(t), and d(t) in O(f(t)+l(t)) time, generates the trader and budget string in O(f(t)) time, and invokes TradeToBool and Approx for a runtime of O(l(t)3+f(t)l(t)+f(t)logf(t)) .
Now we can address the runtime of all the algorithms fed into an oracle query but one. The SAT calls in NextBit ask about eval(B',λ) with a runtime of O(l(t)+f(t)), which is sufficiently low. The queries about eval(B',OI(t)) in BinSearch have eval(B',OI(t)) running in O(l(t)2) time (for the evaluation) plus O(l(t)3+f(t)l(t)+f(t)logf(t)) time (to calculate OI(t)), so the runtime is O(l(t)3+f(t)l(t)+f(t)logf(t)) which is sufficiently low. The SAT call in TradeToBool asks about eval(B,λ) with a runtime of O(l(t)+f(t)) which is sufficiently low. The SAT call the trader makes about eval(B,OI(i)) isn't about a longer-running algorithm than the algorithm invoked in BinSearch, so we're good there as well. Finally, the oracle queries in the binary search portion of OverBudget take O(l(t)2) time (for the evaluation), plus the runtime of Approx and the runtime of TradeToBool (in the numerator) or the runtime of OI (in the denominator), for a runtime of O(l(t)3+f(t)l(t)+f(t)logf(t)) in both cases, which is sufficiently low.
This just leaves establishing the runtime of OverBudget to ensure that the oracle is strong enough for our needs.
Generating n takes O(t) time. Generating the world x takes O(l(t)) time. Running the deductive process takes O(f(t)+l(t)) time, and so does writing down the resulting boolean, and the maximum index is l(t) so the SAT invocation takes O(f(t)+l(t)) time as well, for a runtime so far of O(f(t)+l(t)). That leaves the sum (computing the < isn't harder than computing the sum in the first place). We are carrying out at most t fraction-additions. Each fraction-addition involves three multiplications, one addition, and two iterations of binary search. The length of the numbers is l(t), but with each multiplication, they get longer by l(t), so the maximum length of the numbers is tl(t). Using the standard inefficient multiplication algorithm for O(t2l(t)2) runtime per multiplication, and with addition being faster than multiplication, we get O(t(t2l(t)2+l(t)2))=O(t3l(t)2) for the whole sum, to get a runtime of O(t3l(t)2+f(t)), which again is sufficiently low to permit oracle calls.
Thus, an oracle strength of O(l(t)3+t3l(t)2+f(t)l(t)+f(t)logf(t)) suffices to make all oracle calls directed to algorithms with permissibly low runtime.
Notation:
{0,1}L and {0,1}∗ are the sets of all bitstrings of length L, and the set of all finite bitstrings, respectively. Elements of these sets are denoted by x. The length i prefix of x is denoted by x:i.
BL is the set of all satisfiable booleans with all variables having an index ≤L, and B is the set of all satisfiable booleans. Similarly, B∗L and B∗ are the set of all booleans with all variables having an index ≤L, and the set of all booleans. Elements of these sets are denoted by B. B∧B′ is also a boolean. Bitstrings x may be interpreted as boolean constraints saying that the prefix of the bitstring must equal x.
f is a function N+→N+ that is time-constructible, monotonically increasing, and f(t)>t, which gives the runtime bound for traders.
l is some function N+→N+ upper-bounded by 2f(t) that is time-constructible, monotonically increasing, and l(t)>t. It gives the most distant bit the oracle inductor thins about on turn t.
d is the function N+→N+ given by d(t)=l(t)+⌈log2l(t)⌉+2t+7, giving the number of iterations of binary-search deployed on turn t.
δ is the function N+→[0,1]∪Q given by δ(t)=2−(t+3). It gives the proportion of the inductor distribution that is made up by the uniform distribution.
The operations query(A,p) and flip(p) take unit time to execute in our model of computation, though the input still has to be written down in the usual amount of time.
Given some arbitrary distribution Δ over {0,1}L, and a B∈B∗L, Δ(B) is the probability mass assigned to bitstrings that fulfill B. Given a B∈BL, ΔB is the conditional distribution given by ΔB(x)=Δ(B∧x)Δ(B). Δ∗B is the distribution induced by Approx(Δ,L,D,B). There is an implicit dependence on D which will be suppressed in the notation.
Lemma 1 (n-Lemma): Let ϵ=2−D. Given some distribution Δ over {0,1}L for which ∀x∈{0,1}L:Δ(x)≥nϵ, then
∀B∈BL,x∈{0,1}L:(n−1n+1)LΔB(x)≤Δ∗B(x)≤(n+1n−1)LΔB(x) .
To begin with, if x is incompatible with B, Δ∗B(x)=0 by the construction of Approx, and the inequality is trivial. So we can consider the case where x is compatible with B. Given σ as a strict prefix of x, abbreviate NextBit(Δ,L,D,|σ|+1,B,σ) as NB(σ).
Because we're using a binary-search depth of D on each bit, and taking the average of the interval, the probabilities are approximated to within ϵ2.
Δ(B∧σ1)−ϵΔ(B∧σ)+ϵ<Δ(B∧σ1)−ϵ2Δ(B∧σ)+ϵ2≤P(NB(σ)=1)≤Δ(B∧σ1)+ϵ2Δ(B∧σ)−ϵ2<Δ(B∧σ1)+ϵΔ(B∧σ)−ϵ
Taking the midle three terms and subtracting 1 from all of them, which flips the sign of the inequality, and rewriting 1−P(NB(σ)=1) as P(NB(σ)=0) and rewriting 1 as Δ(B∧σ)±ϵ2Δ(B∧σ)±ϵ2, we get
Δ(B∧σ)−ϵΔ(B∧σ)+ϵ<Δ(B∧σ)−ϵ2Δ(B∧σ)−ϵ2−Δ(B∧σ1)+ϵ2Δ(B∧σ)−ϵ2≤P(NB(σ)=0)≤Δ(B∧σ)+ϵ2Δ(B∧σ)+ϵ2−Δ(B∧σ1)−ϵ2Δ(B∧σ)+ϵ2<Δ(B∧σ)+ϵΔ(B∧σ)−ϵ
Now, because Δ∗B(x)=∏Li=1P(NB(x:i−1)=xi), using the previous inequalities, we can establish
∏Li=1Δ(B∧x:i)−ϵΔ(B∧x:i−1)+ϵ<Δ∗B(x)<∏Li=1Δ(B∧x:i)+ϵΔ(B∧x:i−1)−ϵ
For an individual term in the product of the lower/upper bound (± and ∓ will be used, the upper sign is used for the lower bound, the lower sign is used for the upper bound, and ⋛ will be abused to mean ≥ for the lower bound and ≤ for the upper bound), we can get the following equality.
Δ(B∧x:i)∓ϵΔ(B∧x:i−1)±ϵ=Δ(B∧x:i)Δ(B∧x:i−1)±ϵ∓ϵΔ(B∧x:i−1)±ϵ=Δ(B∧x:i)Δ(B∧x:i−1)Δ(B∧x:i−1)Δ(B∧x:i−1)±ϵ(1∓ϵΔ(B∧x:i−1))
Because, by assumption, all bitstrings with nonzero probability and therefore prefixes of them have probability bounded below by nϵ, we get
Δ(B∧x:i−1)Δ(B∧x:i−1)±ϵ⋛nϵnϵ±ϵ=nn±1 and 1∓ϵΔ(B∧x:i−1)⋛1∓ϵnϵ=1∓1n=n∓1n
which can be applied to conclude
Δ(B∧x:i)Δ(B∧x:i−1)Δ(B∧x:i−1)Δ(B∧x:i−1)±ϵ(1∓ϵΔ(B∧x:i−1))⋛Δ(B∧x:i)Δ(B∧x:i−1)nn±1n∓1n=Δ(B∧x:i)Δ(B∧x:i−1)n∓1n±1
Therefore, since we have lower-bounded or upper-bounded every term in the product, and Δ(B∧x:L)=Δ(B∧x)=Δ(x), we can show
Δ∗B(x)≷∏Li=1Δ(B∧x:i)∓ϵΔ(B∧x:i−1)±ϵ⋛∏Li=1Δ(B∧x:i)Δ(B∧x:i−1)n∓1n±1=Δ(x)Δ(B)(n∓1n±1)L=ΔB(x)(n∓1n±1)L
The n-Lemma is thus proven.
Lemma 2: The distribution induced by OI(t) has the probability of all bitstrings bounded above by nϵ.
Identify D with d(t)=l(t)+⌈log2l(t)⌉+2t+7. Identify L with l(t). Identify n with l(t)2t+4.
Because ϵ=2−D, ϵ=2−(l(t)+⌈log2l(t)⌉+2t+7)≤1l(t)2−(l(t)+2t+7). Due to δ(t) of the distribution being composed of the uniform distribution, the probability of all x is bounded below by
δ(t)2−L=2−(t+3)2−l(t)=l(t)2t+41l(t)2−(l(t)+2t+7)≥nϵ
And Lemma 2 is proved.
From here on, we will frequently be going from the approximation of a conditional distribution to a conditional distribution via the n-Lemma and Lemma 2, so let ϵt=2−d(t), and nt=l(t)2t+4, and Lt=l(t). Δi and ΔiB will be used to refer to the probability distribution and conditional probability distribution over bitstrings that OI(i) produces, and Δ∗i and Δ∗iB refer to the approximation of that probability distribution with D=d(t).
Lemma 3: The assessed value at time t in world x of a trader T with a runtime and oracle-call filter attached is lower-bounded by
−t+∑1≤i≤tmax((nt−1nt+1)Lt∑B∈BLt(P(Ti=B)ΔiB(x))−ϵt,0)Δi(x)+ϵt and upper-bounded by
−t+(nt+1nt−1)Lt∑1≤i≤t∑B∈BLtP(Ti=B)ΔiB(x)Δi(x) .
To begin, the assessed value at time t in world x of a trader T by OverBudget is −t+∑1≤i≤t(∑B∈BLt(P(Ti=B)Δ∗iB(x)))∗Δ∗i(x) .
Δ∗i(x) has d(t) rounds of binary search run on it, which yields an interval with a width of ϵt, and an upper-bound estimate is taken, so Δi(x)≤Δ∗i(x)≤Δi(x)+ϵt. Similarly, the net value of a trade has a lower-bound estimate run on it, so
∑B∈BLt(P(Ti=B)Δ∗iB(x))−ϵt≤(∑B∈BLtP(Ti=B)Δ∗iB(x))∗≤∑B∈BLtP(Ti=B)Δ∗iB(x)
Further, by the n-Lemma and Lemma 2 (probability of x is either 0 or over ntϵt), we can replace Δ∗iB(x) by (nt+1nt−1)LiΔiB(x) or (nt−1nt+1)LiΔiB(x) respectively, which are then upper and lower-bounded by (nt+1nt−1)LtΔiB(x) and (nt−1nt+1)LtΔiB(x), which, because the fraction doesn't depend on i, can be pulled out of the sum, yielding the stated bounds.
In the next theorem, we will be abbreviating ∑B∈Bl(i)P(Ti=B)ΔiB(x) as ai and abbreviating Δi(x) as ci, so the value of a trader at timestep t in world x against the sequence of OI distributions is −t+∑1≤i≤taici.
Theorem 1: If a trader exploits the market, there is a budgeted version that trader that exploits the market.
As before, the net value of a trader at timestep t in world x against the sequence of OI distributions is −t+∑1≤i≤taici. Because the trader exploits the market, for all t and all x plausible at t, the net value ≥−b for some integer constant b, we get that for all t and all x plausible at t, ∑1≤i≤taici≥t−b. Call this quantity the gain of the trader. Separate the sum into "good" days where ai≥tϵt, and "bad" days where this is false, and use Lemma 2 to get
∑goodaici≥t−b−∑badaici>t−b−ttϵtntϵt≥t−b−t22−(t+4)>t−b−564
Abbreviate (nt−1nt+1)Lt as yt, and abbreviate (nt−1nt+1)Lt+1 as zt. Using Lemma 3, the worst-case assessed gain of the trader is ≥∑1≤i≤tmax(ytai−ϵt,0)ci+ϵt≥∑goodytai−ϵtci+ϵt. Focusing on an individual day, and using the fact that ci≥ntϵt, and it's a good day so ai≥tϵt, we get
ytai−ϵtci+ϵt=cici+ϵtaici(yt−ϵtai)>ntnt+1(yt−1t)aici>(nt−1nt+1yt−1t)aici=(zt−1t)aici
Substituting this into the sum and pulling out terms that don't depend on i, we get
∑goodytai−ϵtci+ϵt>(zt−1t)∑goodaici>(zt−1t)(t−b−564)>tzt−b−564−1
The last step was done by zt<1 and ignoring some positive terms. Now we will bound tzt.
(Lt+1)∫nt+1nt−1dxx<2(Lt+1)nt−1≤8Ltnt=2−(t+1)≤14t<∫tt−1/4dxx
Multiplying both sides by −1, which flips the upper and lower bounds of integration, we get
ln((nt−1nt+1)Lt+1)=(Lt+1)(ln(nt−1)−ln(nt+1))=(Lt+1)∫nt−1nt+1dxx>∫t−1/8tdxx=ln(t−1/4)−ln(t)=ln(t−1/4t)=ln(1−1/4t)
By putting both sides in an exponent, this inequality is equivalent to (nt−1nt+1)Lt+1>1−1/4t. Multiplying both sides by t and using the definition of zt, we get tzt>t−14.
Using the previous inequalities, we get that the worst-case assessed gain of the trader is greater than t−b−14−564−1=t−b−8564. Therefore, the worst-case assessed value of the trader is greater than −b−8564, so when the budget is b+3, OverBudget will at worst evaluate the value as −b−8564>−(b+3)+1, so it will never return 1 and interfere with the trade, so the budgeted trader makes the same trades as the original trader and exploits the market.
Lemma 4: The worst-case value of a trader with a budget of b is ≥−b−1/4.
Assume the value of a trader equals −b for some b∈R≥0 on some turn t. Then the gain of the trader, ∑1≤i≤taici is ≤t−b. By Lemma 3, the assessed gain is less than
(nt+1nt−1)Lt∑1≤i≤taici=(nt+1nt−1)Lt(t−b)<t(nt+1nt−1)Lt−b
Now we will bound the last term.
ln((nt+1nt−1)Lt)=Lt(ln(nt+1)−ln(nt−1))=Lt∫nt+1nt−1dxx<2Ltnt−1≤4Ltnt=2−(t+2)≤1/4t+1<1/4t+1/4<∫t+1/4tdxx=ln(t+1/4)−ln(t)=ln(t+1/4t)=ln(1+1/4t)
By putting both sides in an exponent, we get the equivalent inequality (nt+1nt−1)Lt<1+1/4t and multiplying both sides by t yields t(nt+1nt−1)Lt<t+1/4, so the assessed gain is less than t−b+1/4 and the assessed value is ≤−b+1/4.
Then, if the trader has a budget of b (an integer), the worst-case scenario is that on some turn t it has a true value of −b+3/4 and it is assessed to have a value of −b+1 (maximum possible), so it barely passes the budgeting filter, and then loses 1 dollar on the next turn, getting a final value of −b−1/4 (after which it only outputs B⊤ which has a value of 0).
Theorem 2: If a budgeted trader exploits the market, the supertrader exploits the market.
Pi(A∧b)=Pi(A)Pi(b) is the probability of the supertrader drawing a bitstring a which encodes the algorithm A and budget b on timestep i. P(Aib=B) is the probability of A(i) with a runtime and oracle call filter and a budget filter of b outputting the boolean B. P(A) is the probability of a randomly selected infinite string encoding A, and P(b)=2−b. For sufficiently large i, Pi(A)=P(A) and Pi(b)=P(b).
A useful thing to note is that, because the bitstring a that is chosen is of length f(t), and it is only run on the UTM for f(t) steps, its behavior is identical to that of all algorithms that are encoded by a bitstring that has a as a prefix. Therefore, even though the probability of some A being drawn may be 0 on a timestep, there is some amount of probability mass in the supertrader that might as well have come from A, in the sense that its behavior is indistinguishable from A after the runtime and oracle call filter is applied.
Similarly, because the maximum selectable budget is f(t), and a trader can lose at most 1 dollar each turn, for all A and positive integers c, Aif(i) has the same exact behavior as Aif(i)+c, so we can pretend we are dealing with the full distribution over all budgets.
Therefore, for all i, the distribution over satisfiable booleans given by ∑A,bPi(A∧b)P(Aib=B) equals the distribution over satisfiable booleans given by ∑A,bP(A)P(b)P(Aib=B), and because of this,
∀i∀x∈{0,1}Li:∑A,bPi(A∧b)∑B∈BLiP(Aib=B)ΔiB(x)=∑A,bP(A)2−b∑B∈BLiP(Aib=B)ΔiB(x)
Also, let Vi(Ab) be the value of A(i)'s trade on day i according to some fixed x. V≤i(Ab) is the value that Ab accumulates over all days up to t.
The value of the supertrader at time t according to a world x that is plausible at that time is
∑1≤i≤t(−1+∑A,bPi(A∧b)∑B∈BLiP(Aib=B)ΔiB(x)Δi(x))=∑1≤i≤t∑A,bP(A)2−b(−1+∑B∈BLiP(Aib=B)ΔiB(x)Δi(x))=∑1≤i≤t∑A,bP(A)2−bVi(Ab)=∑A,bP(A)2−bV≤t(Ab)
Now the worst-case value of V≤t(Ab) is −b−14 by Lemma 4, so we get that the value of the supertrader is bounded below by
∑AP(A)∑b2−b(−b−1/4)=−94∑AP(A)=−94.
Also, if our exploiting trader and the budget which ensures its trade is never altered are T and b′, we get that the value of the supertrader equals
∑A,bP(A)2−bV≤t(Ab)=P(A)2−b′V≤t(Tb′)+∑A,b≠T,b′P(A)2−bV≤t(Ab)>P(A)2−b′V≤t(Tb′)−94
Because V≤t(Tb′) is unbounded above for appropriate choices of t and x by assumption, it continues to be unbounded above when multiplied by P(A)2−b′ and has 94 subtracted from it. Therefore, the supertrader has plausible value unbounded above as well.
Theorem 3: The supertrader doesn't exploit OI.
Now, we will show that the maximum value that the supertrader can possibly get in worlds plausible at some turn t is upper-bounded by 2−t, so the value of the supertrader is upper-bounded by 1.
To begin with, the probability mass the supertrader places on world x at timestep t is ∑A,bP(A)2−b∑B∈BP(Atb=B)ΔB(x). This sum can be rewritten as ∑B∈BΔB(x)∑A,bP(A)2−bP(Atb=B), and for brevity, from here on, we will abbreviate ∑A,bP(A)2−bP(Atb=B) as Pt(B). With this abbreviation, we can express the value of the supertrader's trade on day t and world x as ∑B∈BPt(B)ΔB(x)δ(t)2−Lt+(1−δ(t))∑B∈BPt(B)Δ∗B(x)−1. Note that the numerator of the fraction is the supertrader, which uses the true conditional distribution, while the denominator is what the actual distribution is composed of, a small fragment of a uniform distribution with the rest is composed of a mixture of approximations to the appropriate conditional distribution. To begin the bounding argument, apply the n-Lemma to yield
∑B∈BPt(B)ΔB(x)δ(t)2−Lt+(1−δ(t))∑B∈BPt(B)Δ∗B(x)−1<∑B∈BPt(B)ΔB(x)(1−δ(t))(nt−1nt+1)Lt∑B∈BPt(B)ΔB(x)−1=(nt+1nt−1)Lt1−δ(t)−1
Also,
ln((nt+1nt−1)Lt)−ln(1−δ(t))=Lt(ln(nt+1)−ln(nt−1))−ln(1−δ(t))+ln(1)=Lt∫nt+1nt−1dxx+∫11−δ(t)dxx<2Ltnt−1+δ(t)1−δ(t)<4Ltnt+2δ(t)=2−(t+2)+2−(t+2)=2−(t+1)<2−t1+2−t<∫1+2−t1dxx=ln(1+2−t)−ln(1)=ln(1+2−t)
By putting both sides in an exponent, this inequality is equivalent to (nt+1nt−1)Lt1−δ(t)<1+2−t, so (nt+1nt−1)Lt1−δ(t)−1<2−t, so the value of the supertrader gained on any given day is bounded above by 2−t, so the total value of the supertrader at any time is bounded above by 1.
Runtime Analysis:
The strength of the bounded reflective oracle needed is controlled by the longest runtime of the algorithms that the oracle is queried about. The relevant oracle calls are the invocations of SAT in NextBit, eval(B',OI(t)) in the binary search portion of NextBit, OverBudget, the SAT call in TradeToBool, the eval(B,OI(i)) calls that the trader may make, and the oracle calls in the binary search portion of OverBudget.
To begin with, the runtime of eval(B,x) is the time needed to flip enough coins to get to the most distant variable index in B (call it n), and then the time to run the boolean circuit on the padded x itself, which would be be O(n⋅|B|), to check each bit of the padded x and do a pass over the boolean to substitute ⊤ or ⊥ for that variable, and then evaluating the resulting boolean with variables filled in.
Similarly, the runtime of SAT is the time to write down B and the binary string corresponding to the probability bound, which has a length of approximately n, for a runtime of O(n+|B|).
Now we can look at the runtime of BinSearch. It is always called with D=d(t)=O(l(t)), and there are that many iterations of writing down Δ (which, everywhere it's invoked, takes less than d(t) bits) and the average of the two bounds, which, by a suitable binary encoding, takes at most about d(t) bits, so we get a runtime of about d(t)2, which is O(l(t)2).
As for the runtime of NextBit, the length of the boolean at the start is at most f(t)+O(l(t)) (the first part is because B was returned by an algorithm that ran for at most f(t) steps, and the second part is the bound on the length of x). A boolean of this length is written down 3 times, and SAT is invoked twice, and BinSearch is run twice, for a runtime of O(3(f(t)+l(t))+2(l(t)+f(t)+l(t))+2l(t)2)=O(f(t)+l(t)2) . Adding in the time to write in/read B and x from the input just adds another f(t)+l(t) term which doesn't affect runtime that much.
Now for Approx. There are at most l(t) iterations of NextBit, and we already accounted for the time to write the input to NextBit, so the runtime is O(f(t)l(t)+l(t)3).
TradeToBool reads the input, which takes O(f(t)) time, emulates f(t) steps of a turing machine, which takes O(f(t)logf(t)) time, clips the input which takes O(f(t)+l(t)) time (the latter is an upper bound on the time to compute l(t) because l is time-constructible), calls SAT on a boolean of length at most f(t) with a maximum index of l(t), and writes the boolean, for a runtime of O(f(t)logf(t)+l(t)).
BTtoBool reads the input and writes parts of it into the query, which takes O(f(t)) time, writes down the probability bound in O(l(t)) time, and runs TradeToBool, for a runtime of O(f(t)logf(t)+l(t)).
OI either writes down δ(t) in O(t) time and generates a string in O(l(t)) time, or computes f(t), l(t), and d(t) in O(f(t)+l(t)) time, generates the trader and budget string in O(f(t)) time, and invokes TradeToBool and Approx for a runtime of O(l(t)3+f(t)l(t)+f(t)logf(t)) .
Now we can address the runtime of all the algorithms fed into an oracle query but one. The SAT calls in NextBit ask about eval(B',λ) with a runtime of O(l(t)+f(t)), which is sufficiently low. The queries about eval(B',OI(t)) in BinSearch have eval(B',OI(t)) running in O(l(t)2) time (for the evaluation) plus O(l(t)3+f(t)l(t)+f(t)logf(t)) time (to calculate OI(t)), so the runtime is O(l(t)3+f(t)l(t)+f(t)logf(t)) which is sufficiently low. The SAT call in TradeToBool asks about eval(B,λ) with a runtime of O(l(t)+f(t)) which is sufficiently low. The SAT call the trader makes about eval(B,OI(i)) isn't about a longer-running algorithm than the algorithm invoked in BinSearch, so we're good there as well. Finally, the oracle queries in the binary search portion of OverBudget take O(l(t)2) time (for the evaluation), plus the runtime of Approx and the runtime of TradeToBool (in the numerator) or the runtime of OI (in the denominator), for a runtime of O(l(t)3+f(t)l(t)+f(t)logf(t)) in both cases, which is sufficiently low.
This just leaves establishing the runtime of OverBudget to ensure that the oracle is strong enough for our needs.
Generating n takes O(t) time. Generating the world x takes O(l(t)) time. Running the deductive process takes O(f(t)+l(t)) time, and so does writing down the resulting boolean, and the maximum index is l(t) so the SAT invocation takes O(f(t)+l(t)) time as well, for a runtime so far of O(f(t)+l(t)). That leaves the sum (computing the < isn't harder than computing the sum in the first place). We are carrying out at most t fraction-additions. Each fraction-addition involves three multiplications, one addition, and two iterations of binary search. The length of the numbers is l(t), but with each multiplication, they get longer by l(t), so the maximum length of the numbers is tl(t). Using the standard inefficient multiplication algorithm for O(t2l(t)2) runtime per multiplication, and with addition being faster than multiplication, we get O(t(t2l(t)2+l(t)2))=O(t3l(t)2) for the whole sum, to get a runtime of O(t3l(t)2+f(t)), which again is sufficiently low to permit oracle calls.
Thus, an oracle strength of O(l(t)3+t3l(t)2+f(t)l(t)+f(t)logf(t)) suffices to make all oracle calls directed to algorithms with permissibly low runtime.