I don't yet know whether I can extend it to two nonconstant advisors, but I do know I can extend it to a countably infinite number of constant-prediction advisors. Let be an enumeration of their predictions that contains each one an infinite number of times. Then:
def M(p, E, P):
prev, this, next = 0, 0, 1
def bad(i):
return sum(log(abs((E[k] + P[i] - 1) /
(E[k] + p[k] - 1)))
for k in xrange(prev))
for k in xrange(this, next): p[k] = 0.5
prev, this, next = this, next, floor(exp
... def M(p, E):
p1, p2 = 1./3, 2./3
prev, this, next = 0, 0, 1
bad1
and bad2
compute log-badnesses of M
relative to p1
and p2
, on E[:prev]
; the goal of M
is to ensure neither one goes to . prev, this, next
are set in such a way that M
is permitted access to this when computing p[this:next]
.
def bad(advisor):
return lambda:
sum(log(abs((E[i] + advisor(i) - 1) /
(E[i] + p[i] - 1)))
for i in xrange(prev))
bad1, bad2 = bad(lambda i: p1), bad(lambda i: p2)
for i in xrange(this, next)
... These results are still a bit unsatisfying.
The first half constructs an invariant measure which is then shown to be unsatisfactory because UTMs can rank arbitrarily high while only being good at encoding variations of themselves. This is mostly the case because the chain is transient; if it was positive recurrent then the measure would be finite, and UTMs ranking high would have to be good at encoding (and being encoded by) the average UTM rather than just a select family of UTMs.
The second half looks at whether we can get better results (ie a probability
...There is a lot more to say about the perspective that isn't relaxed to continuous random variables. In particular, the problem of finding the maximum entropy joint distribution that agrees with particular pairwise distributions is closely related to Markov Random Fields and the Ising model. (The relaxation to continuous random variables is a Gaussian Markov Random Field.) It is easily seen that this maximum entropy joint distribution must have the form where is the normalizing constant, or partition funct
...In order to understand what the measure that was constructed from will reward, here's the sort of machine that comes close to :
Let be an arbitrary UTM. Now consider the function (or, really, any function with that visits every nonnegative integer infinitely many times), and let . (The indices here are zero-based.) Choose such that has no proper prefix in . Then, construct the UTM that does:
rep
... Consider the function where . The reversible Markov chain with transition probabilities has a bounded positive invariant measure . Of course, as the post showed, the total measure is infinite. Also, because the chain is reversible and transient, the invariant measure is far from unique - indeed, for any machine , the measure will be a bounded positive invariant meas
...Actually, on further thought, I think the best thing to use here is a log-bilinear distribution over the space of truth-assignments. For these, it is easy to efficiently compute exact normalizing constants, conditional distributions, marginal distributions, and KL divergences; there is no impedance mismatch. KL divergence minimization here is still a convex minimization (in the natural parametrization of the exponential family).
The only shortcoming is that 0 is not a probability, so it won't let you eg say that ; but this can be remedied using a
...An easy way to get rid of the probabilities-outside-[0,1] problem in the continuous relaxation is to constrain the "conditional"/updated distribution to have (which is a convex constraint; it's equivalent to ), and then minimize KL-divergence accordingly.
The two obvious flaws are that the result of updating becomes ordering-dependent (though this may not be a problem in practice), and that the updated distribution will sometimes have , and it's not clear how
...It may still be possible to get a unique (up to scaling) invariant measure (with infinite sum) over the UTMs by invoking something like the Krein-Rutman theorem and applying it to the transition operator. I haven't yet verified that all the conditions hold.
This measure would then be an encoding-invariant way to compare UTMs' "intrinsic complexity" in the sense of "number of bits needed to simulate".
This is interesting! I would dispute, though, that a good logical conditional probability must be able to condition on arbitrary, likely-non-r.e. sets of sentences.
What's the harm in requiring prior coordination, considering there's already a prior agreement to follow a particular protocol involving s? (And something earlier on in the context about a shared source of randomness to ensure convexity of the feasible set.)
If the fairness constraints are all pairwise (ie each player has fairness curves for each opponent), then the scheme generalizes directly. Slightly more generally, if each player's fairness set is weakly convex and closed under componentwise max, the scheme still generalizes directly (in effect the componentwise max creates a fairness curve which can be intersected with the surfaces to get the points.
In order to generalize fully, the agents should each precommunicate their fairness sets. In fact, after doing this, the algorithm is very si
...You did miss something: namely from PA+2 X wants to show feasibility of , not . In your example, , so the Löbian circle you describe will fail.
I'll walk through what will happen in the example.
The are just areas (ie ), not rectangles. In this example, is enough to contain and . For conciseness let's have , , and (so ).
Both X and Y have . According to X, , , and .
First the speculative phase will happen:
X will try to prove in PA+1 that and that
...How about a gridless scheme like:
The agents agree that they will each output how much utility they will get, and if they fail to choose a feasible point they both get 0.
Now discretize the possible "rectangle areas": let them be . (This requires a way to agree on that, but this seems easier than agreeing on grid points; the finer the better, basically. Perhaps the most obvious way to do it is to have these be evenly spaced from to ; then only and need to be agreed upon.)
X will do the following:
let be the first area
...How about a gridless scheme like:
The agents agree that they will each output how much utility they will get, and if they fail to choose a feasible point they both get 0.
Now discretize the possible "rectangle areas": let them be . (This requires a way to agree on that, but this seems easier than agreeing on grid points; the finer the better, basically. Perhaps the most obvious way to do it is to have these be evenly spaced from to ; then only and need to be agreed upon.)
X will do the following:
let be the first area
...
Ah, I think I can stymy M with 2 nonconstant advisors. Namely, let A1(n)=12−1n+3 and A2(n)=12+1n+3. We (setting up an adversarial E) precommit to setting E(n)=0 if p(n)≥A2(n) and E(n)=1 if p(n)≤A1(n); now we can assume that M always chooses p(n)∈[A1(n),A2(n)], since this is better for M.
Now define b′i(j)=|Ai(j)+E(j)−1|−|p(j)+E(j)−1| and bi(n)=∑j<nb′i(j). Note that if we also define badi(n)=∑j<n(log|Ai(j)+E(j)−1|−log|p(j)+E(j)−1|) then ∑j<n|2bi(j)−badi(j)|≤∑j<n(2A1(j)−1−log(2A1(j))))=∑j<nO((12−A1(j))2) is bounded; therefore if we can force b1
... (read more)