Futarchy

Robin Hanson's Futarchy is a proposal to let prediction markets make governmental decisions. We can view an operating Futarchy as an agent, and ask if it is aligned with the interests of its constituents. I am aware of two main failures of alignment: (1) since predicting rare events is rewarded in proportion to their rareness, prediction markets heavily incentivise causing rare events to happen (I'll call this the entropy-market problem); (2) it seems prediction markets would not be able to assign probability to existential risk, since you can't collect on bets after everyone's dead (I'll call this the existential risk problem). I provide three formulations of (1) and solve two of them, and make some comments on (2). (Thanks to Scott for pointing out the second of these problems to me; I don't remember who originally told me about the first problem, but also thanks.)


Futarchy and Agents

My motivation here is twofold. On the one hand, Futarchy is something which many people are enthusiastic about. There is a current effort to implement it for small-to-medium-size group coordination. Futarchy is just the idea of using prediction markets to make decisions that matter, and decision markets are already a thing. So, from one perspective, we're already on our way there. But, there are alignment problems which have not been addressed. The misaligned Futarchy problem seems less bad than the misaligned superhuman AI problem. Arguably, we are already embedded in misaligned markets, and Futarchy seems like an improvement on those. Furthermore, Futarchy is made of people; if a Futarchy becomes too evil, the people can just stop, right?...

Well, one way or the other, it seems like it would be a good idea to go ahead and solve the Futarchy alignment problem before there's a problem.

My second reason for being interested in this is as a case-study for alignment. Maybe the Futarchy alignment problem is an easier instance, which could inform us about the AI alignment problem? Or perhaps it could be more directly relevant, informing logical-induction decision theories which look something like Futarchies?

More specifically, the entropy-market problem is an instance of "directionality" in decision theory, as discussed in The Foundations of Epistemic Decision Theory by Jason Konek and Ben Levinstein. The idea is that it's not always beneficial to enforce rationality constraints. A failure of a decision-theoretic constraint suggests some failure of rationality somewhere, but some ways of enforcing the constraint are "backward". For example, Aumann's Agreement Theorem does not imply that you should accept the Modesty Argument; doing so may harm your epistemology rather than improve it. It seems to me that getting directionality right would yield more convincing arguments for decision theory (relating to my puzzling at the end of this post), since it would explain when enforcing a constraint should be compelling. More importantly, perhaps, it may yield insight into the sort of directionality we need to extend causal decision theory to logical counterfactuals.

So, how does a prediction market get directionality wrong?

Entropy-Market Problem

In a typical prediction market, if you can reliably predict an event which the market assigns probability p, you stand to make (1 - p)/p on your initial investment -- nearly 1/p when p is small. This means potentially huge returns for making rare events happen. This seems like a bad thing by default; it is easier to commit unpredictable acts of destruction than of creation. Effectively, a prediction market is also a market for purchasing chaos and havoc in the world; the one who can make the unthinkable happen has a fat paycheck waiting for them.
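To make the asymmetry concrete, here is a minimal sketch (the function names and example probabilities are mine) comparing the payoff for forcing an unlikely event against the payoff for preventing it, in an unmodified market:

```python
def roi_force(p):
    """Buy the event contract at price p, then force the event: net gain per dollar staked."""
    return (1 - p) / p

def roi_prevent(p):
    """Buy the no-event contract at price 1 - p, then prevent the event."""
    return p / (1 - p)

for p in [0.5, 0.1, 0.01]:
    print(f"p = {p}: force -> {roi_force(p):.2f}, prevent -> {roi_prevent(p):.2f}")
# p = 0.5: 1.00 vs 1.00;  p = 0.1: 9.00 vs 0.11;  p = 0.01: 99.00 vs 0.01
```

The return from forcing the event blows up as p shrinks, while the return from preventing it shrinks toward zero.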

I offer three possible formalizations of the problem. All three rest on the idea of asymmetric incentives to cause unlikely events, but give different notions of "symmetric incentives" to aim for.

  1. The differential value of being able to make an event slightly more likely is the same as that of making the event slightly less likely.
  2. The value of being able to make an event happen is always the same as the value of preventing it.
  3. The value of being able to add x to the log-odds of the event is the same as the value of subtracting x.

In all three cases, I am ignoring the question of how difficult the event is to cause or prevent. Instead, I am trying to equalize the ROI of manipulating the "true" probability of an event. This does not eliminate the value of betting on events which you can personally manipulate -- the "directionality" problem is not totally solved. However, both of my solutions at least don't give arbitrarily high ROI to manipulators. (As a consequence, they also avoid giving arbitrarily high ROI to legitimate prediction of events the market doesn't expect, which is unfortunate. The ROI can still be decent, though; the market is still incentivised to achieve calibrated probabilities overall.)

The first interpretation is based on the intuition that what's more important is the incentive with respect to shifting probabilities a little bit. We can never cause or prevent an event with complete certainty. So, perhaps it's better to be sure that little shifts in the probability of an event aren't incentivised one way or the other.

The second interpretation seems like the most obvious. It is based on the intuition that what we want to avoid incentivising is someone stepping in and forcing an unlikely event to happen. Put simply, we want a would-be assassin to have the same incentive as a would-be bodyguard, rather than infinitely more.

The third interpretation is somewhere between the two. It could be justified by the intuition that it's probably about as difficult to double the odds of an event as to cut them in half. Unlike the first or second interpretation, it explicitly takes into account that it's likely a lot harder to move probabilities which are already near 1 or 0. I haven't figured out whether this interpretation is achievable yet, but I liked it enough that I wanted to include it.

Solution 1: Equalized Differential ROI

How do we change the ROI of the market? The method I'll employ is to lock away some money with a transaction. For example, normally a share of an event which the market assigns probability 0.1 would cost 10¢ and pay out $1 on success or $0 on failure. We can add an artificial cost of 50¢; now, it costs 60¢, pays $1.50 on success, and pays back your 50¢ on failure. The ROI of these transactions goes down because the cost of initial investment goes up. Suppose you had assigned probability 0.2 to the event in our example. For the unmodified bet formula, the expected value we assign to the bet would be 20¢, at 10¢ cost, so ROI = 1. (ROI is net income / investment.) In the new scheme, the expected value is 70¢ -- but at a cost of 60¢. So, the new ROI is 1/6. I will call this extra money set aside a "reserve".
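A minimal sketch of that arithmetic (the helper function is my own, just to make the numbers checkable):

```python
def reserve_bet(price, reserve, belief):
    """Cost, expected value, and ROI of a $1-payout contract bought at `price`,
    with an extra `reserve` locked up (returned in every outcome), evaluated
    under subjective probability `belief`."""
    cost = price + reserve
    expected_value = belief * 1.0 + reserve   # $1 on success, plus the reserve back either way
    return cost, expected_value, (expected_value - cost) / cost

print(reserve_bet(0.10, 0.00, 0.20))  # cost 0.10, expected value 0.20, ROI = 1
print(reserve_bet(0.10, 0.50, 0.20))  # cost 0.60, expected value 0.70, ROI = 1/6
```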

Now, suppose we add a reserve of 1 - 2p for p < 1/2, and 0 otherwise.

If the market assigns probability p to an event (take p < 1/2, so the reserve is 1 - 2p), and we can do something to make that event happen with probability p + x (x positive), then we can pay p + (1 - 2p) = 1 - p for a contract which we assign expected worth (p + x) + (1 - 2p) = 1 - p + x. The ROI is x/(1 - p).

On the other hand, if we can make the event less probable, we would take the other side of the bet: pay 1 - p (with no reserve, since 1 - p > 1/2) for a bet which pays off $1 if the event doesn't happen. The expected value for us is (1 - p) + x, for an ROI of x/(1 - p).

Obviously, this gets me the differential property I wanted; whether increasing the probability of the event or decreasing it, the limit of ROI/x as x gets very small is 1/(1 - p). This is, of course, because the ROI function is the same in both directions: we've got the much stronger property that adding or subtracting x is always equally valuable. I didn't put this property on my list of problem formulations, though, since it doesn't strike me as intuitively appealing; adding and subtracting x are not intuitively similarly difficult when p is close to 1 or 0, except when x is very small.
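A quick numerical check of this symmetry, using the 1 - 2p reserve schedule from above (the helper names are mine; take this as a sanity check of the arithmetic rather than anything more):

```python
def reserve(p):
    """Solution 1 reserve schedule: 1 - 2p below one half, nothing above."""
    return max(0.0, 1.0 - 2.0 * p)

def roi_push_up(p, x):
    """ROI of buying the event contract when you can raise its probability from p to p + x."""
    cost = p + reserve(p)
    expected = (p + x) + reserve(p)
    return (expected - cost) / cost

def roi_push_down(p, x):
    """ROI of buying the no-event contract when you can lower the event's probability by x."""
    q = 1.0 - p                      # market price of the no-event side
    cost = q + reserve(q)
    expected = (q + x) + reserve(q)
    return (expected - cost) / cost

for p in [0.01, 0.1, 0.3]:
    print(p, roi_push_up(p, 0.05), roi_push_down(p, 0.05))
    # both come out to x / (1 - p)
```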

Another issue with this solution is that setting aside a reserve to modify the ROI is an artificial warping of the market, which can be easily corrected by external creditors. You will receive the reserve back with certainty; so, a lender might happily lend you up to the quantity you've got in reserve, for a small fee. This problem is somewhat corrected in the next proposed solution.

Solution 2: Equalized Event-Forcing ROI

We've seen that we can artificially decrease ROI (fragile as the method is when external lenders arrive on the scene). Can we artificially increase ROI? Yes, but there is a cost. We can artificially decrease the cost of an investment by offering leverage. For example, suppose we offer leverage of half the cost. Where I would normally pay $p for a contract paying out $1 on an event, now I pay $p/2; the contract pays out $1-p/2, and if the event I bet on doesn't happen, I still need to pay back the $p/2.

We can extend the solution from the previous problem by interpreting negative reserve as leverage. Again, we use the formula 1 - 2p for the reserve (leverage when negative). The cost of a contract is now 1 - p in all cases, and the payout is 2 - 2p if the event happens (and 1 - 2p, which may be negative, if it doesn't). If we can make the event happen, the ROI is always 100%, no matter what probability the market assigns! Obviously, if we can prevent the event from happening, we just take the opposite side of the bet, which also gives an ROI of 100%.
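The same kind of sanity check for this scheme (again, my own illustration of the contract terms just described):

```python
def contract(p):
    """Solution 2 contract on an event priced at p, with reserve 1 - 2p (leverage when negative)."""
    cost = p + (1.0 - 2.0 * p)              # = 1 - p in every case
    payout_if_yes = 1.0 + (1.0 - 2.0 * p)   # the $1 plus the reserve back = 2 - 2p
    payout_if_no = 1.0 - 2.0 * p            # just the reserve back (repay the leverage if negative)
    return cost, payout_if_yes, payout_if_no

for p in [0.01, 0.3, 0.9]:
    cost, yes, no = contract(p)
    print(p, (yes - cost) / cost)   # ROI if you can force the event: 1.0 (100%) for every p
```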

Once again, I got more than I asked for: not only are the values of making an event happen or preventing it identical to each other, they're also the same for all values of p. This means there's no special incentive for manipulation as p gets small (which might be concerning even if the manipulation were even-handed with respect to causing and preventing the event).

An added bonus of this scheme is that we can dump the risk of leveraging bets on likely events onto the people taking the other side of the bet: the reserve on the other side of the bet is always exactly the amount we need for leverage, so why not use that? It's not a very nice thing to do to the people betting on unlikely events, granted, but it solves the problem mentioned in the previous section: now the lenders will be more hesitant to loan money based on what's in reserve. This makes my warped market incentives more robust to external forces.

Half the people in the market will be unhappy comparing the situation to a normal betting market, but the other half will be happy about the free leverage and increased ROI. So, this market won't automatically be out-competed by alternate betting markets. Granted, any solution of the kind I'm exploring here works by blocking some contracts which people would be happier to make than the ones offered. Perhaps, for a betting market tied to a futarchy, people will be understanding of the need to avoid incentivising disorder.

Obviously, there are more formulas and desirable properties one could play around with. (The reserve formula is somewhat fun.) Since it is a futarchy, the system could even adapt to purposefully increase ROI of betting on events which we'd like to happen, and decrease the ROI of the reverse; I haven't explored this yet, but it seems even more promising.

Forecasting Doom

What of the second problem I mentioned at the beginning, the existential risk problem? It would be a sad irony if the aspiring rationalist community worked hard to put a futarchy in place only to find that the new government still systematically under-allocated resources to deal with existential risk.

The inability of a prediction market to forecast existential risk is a special case of an inaccuracy in forecasting events correlated with changes in the value of money. If money will predictably be worth half as much in a certain outcome, the market underestimates the odds of that outcome by that very ratio, just as if the bet contracts failed to pay out half the time. (Observation due to Scott.) Actually, though, that problem is not so hard to solve. We know exactly how this effect skews predictions about the value of money itself; so, we can adjust the market-estimated probabilities accordingly. Similarly, we can adjust for the skew on an arbitrary event E by looking at conditional bets on the value of money given E. (If such conditional bets are not traded enough, we might assume no adjustment is needed, or perhaps do something more clever.) None of this changes how the market itself operates; it's all an "overlay" to let you see the true probability estimates implied by the market.
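Here is a minimal sketch of what such an overlay could look like, under the simplifying assumption that a dollar is worth a fraction v of its normal value in worlds where E happens (with v read off from the conditional bets) and full value otherwise, so that the market's quoted odds on E are the true odds scaled by v. The function and its normalization are my own illustration, not something specified above:

```python
def adjusted_probability(q, v):
    """Recover an adjusted probability for event E from its market price q, assuming money
    is worth a factor v of normal in E-worlds, so that market odds = true odds * v."""
    market_odds = q / (1.0 - q)
    true_odds = market_odds / v
    return true_odds / (1.0 + true_odds)

print(adjusted_probability(0.05, 0.5))   # market says 5%; money worth half as much -> about 9.5%
print(adjusted_probability(0.05, 1e-6))  # as v -> 0 the correction diverges: the market price
                                         # carries almost no information about E
```

The second line previews the next paragraph's problem: when the value of money in E-worlds goes to zero, there is nothing left to rescale.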

Unfortunately, this doesn't do much good for existential risk. Bets on the value of money can't be re-scaled appropriately when that value is zero.

However, I think this problem is also fixable. In fact, logical-inductor decision theories don't run into any problem like this. Granted, the agents there aren't using money for anything but the prediction market, so it is easier to give them purely epistemic concerns. However, I don't think that's the best thing to point to. A more important factor is that a logical induction market doesn't adjudicate its own destruction by waiting for it to happen. It adjudicates with proofs instead. It can see versions of itself die in many alternate worlds, and to the extent it knows which world it resides in, it may see itself die as well.

We can adjudicate doom before doom actually happens. We "simply" need a sufficiently trusted underlying adjudication system which is capable of estimating the death toll from nuclear missiles, plagues, asteroids, nanotech disasters, and AIs, and of adjudicating all variety of conditional bets about such events.

I think many people think of prediction markets as the last word in knowledge aggregation, but the need for an underlying knowledge system which is slower but more trusted is nothing new. If you were to make bets on fundamental physics, you would not make a contract which hinges directly on a property of a particular particle. Instead, you would have to bet on the appearance of a peer-reviewed article in a sufficiently trusted venue with such-and-such a result, or something of that kind. A prediction market can only adjudicate bets on a direct fact in so far as the fact is accessible to it -- represented in data accessible by the computer. Beyond that, you need witnesses, data-entry, something to get the knowledge to the system in a trusted way. The advantage of the prediction market is that it can tell you what to expect before the trusted knowledge system does.

Adjudicating existential risk is hard. A Futarchy would likely need to leave the question of how to do it somewhat open, perhaps first installing a very minimal system and developing better options over time. Coming up with high-quality underlying knowledge systems to do the adjudication seems like the direction to go, though. Fortunately, in many cases, we can solve this well with an open market -- different bets can hinge on different knowledge systems, based on the whims of those making the contracts. When it comes to the critical bets for a Futarchy's decision-making, though, a top-down choice must be made as to how to adjudicate.

Comments

The property of futarchy that I really don't like is the fact that one person with a lot of money can bet on "Policy X will lead to bad outcome Y," causing policy X to never be tried in the first place, and all of that person's money to be refunded, allowing them to make the same bets next time.

This may or may not be a problem in practice, but I would really like to see a good fix for it in theory.

This problem is what causes the failure to take the 10 in the 5 and 10 problem described here. One trader in the logical inductor can say that taking the 10 will lead to 0 utility, and then get all his money back, because the markets conditional on taking the 10 never get resolved. (I often refer to this problem as "the futarchy exploit.")

The only way I see to get around this is:

  • Be willing to try X whenever enough people are willing to bet at sufficiently aggressive odds.
  • Assume that honest (greedily log-wealth-maximizing) players have enough money that they can cause any given X to be tried if a manipulator attempts to suppress it.

It would be interesting to see this style of solution fleshed out, to see exactly how strong the assumptions have to be in order to avoid trouble.

The analog of EXP3 is to have investors put their money on policies (rather than predictions about policy outcomes), to pick each policy with probability proportional to the amount of money behind it, and then to take money away from the people who financed the chosen option based on how badly it performs relative to the best possible outcome (giving that money to the people who financed the non-chosen options). This prevents you from cheating the system in the way you describe, though it also means that investing is quite risky even if you know exactly what is going to happen.
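A rough sketch of the policy-market mechanism described above, to make the mechanics concrete (the redistribution rule is my reading of the comment, and I've left out the importance weighting a faithful EXP3 analogue would apply to the observed loss):

```python
import random

def run_round(stakes, losses):
    """One round of the policy market: pick a policy in proportion to the money behind it,
    penalize its financiers in proportion to its loss (in [0, 1], relative to the best
    possible outcome), and hand the penalty to the financiers of the other policies.
    stakes: {policy: {investor: dollars}}; losses: {policy: loss}."""
    totals = {pol: sum(backers.values()) for pol, backers in stakes.items()}
    chosen = random.choices(list(totals), weights=list(totals.values()))[0]
    # Take money away from the people who financed the chosen option...
    pot = 0.0
    for investor, money in stakes[chosen].items():
        penalty = money * losses[chosen]
        stakes[chosen][investor] = money - penalty
        pot += penalty
    # ...and give it to the people who financed the non-chosen options, pro rata.
    other_total = sum(t for pol, t in totals.items() if pol != chosen)
    if other_total > 0:
        for pol, backers in stakes.items():
            if pol != chosen:
                for investor, money in backers.items():
                    backers[investor] = money + pot * (money / other_total)
    return chosen, stakes

chosen, stakes = run_round({"A": {"alice": 2.0}, "B": {"bob": 1.0}},
                           {"A": 0.3, "B": 0.0})
print(chosen, stakes)
```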

In this analogy, futarchy corresponds to estimating Q values (with a regression loss defined by the market maker you use in the decision markets) and then picking the Q-maximizing action. This can have lower variance but has no guarantees of any kind.

I suspect the optimal thing is to run both kinds of markets in parallel, to use the policy market with the EXP3 rule for picking actions, and to use the decision markets only for variance reduction.

I have thought about this a little bit in the context of online learning, and suspect that we can prove an optimality theorem along these lines. It would be nice to see the analogous claim with markets, and the market version would probably be more relevant to alignment. A clear and convincing exposition would also likely be of interest to researchers in RL.

(As usual, this comment is not intended as a land grab; if anyone executes on this idea and it works out, it's all theirs.)

In my current way of thinking about futarchy, it seems like the right way to do this is through good adjudication. It passes the buck, just like my assumption in a recent post that a logical inductor had a correct logical counterfactual in its underlying deductive system. But for a futarchy, the situation isn't quite as bad. We could rely on human judgement somehow.

But another alternative for an underlying adjudication system occurred to me today. Maybe the market could be adjudicated via models. My intuition is that a claim of existential risk (if made in the underlying adjudication system rather than as a bet) must be accompanied by a plausible model - a relatively short computer program which fits the data so far. A counter-claim would have to give an alternative plausible model which shows no risk. These models would lead to payouts.

This could address your problem as well, since a counterfactual claim of doom could be (partially) adjudicated as false by giving a causal model. (I don't intend this proposal to help for logical counterfactuals; it just allows regular causal counterfactuals, described in some given formalism.) But I haven't thought through how this would work yet.

Now, whether a perfect market should pick up an existential risk signature is different from whether a real market would. The behaviour of the Dow around the Cuban missile crisis isn't encouraging in that regard.

it seems prediction markets would not be able to assign probability to existential risk, since you can’t collect on bets after everyone’s dead (I’ll call this the existential risk problem)

It seems that a rational market should implicitly price existential risk. Imagine a world with no chance of disaster, and a functioning futures market that prices everything relative to everything else.

Then add in a 10% chance of immediate extinction at some specific time t, and assume the market participants know this. Then futures contracts for perishable consumption goods after t should go down 10% (relative to before t), while futures contracts for durable consumption goods will change values as a function of their depreciation rate.

This kind of signature should be possible to pick up from the futures prices.

You make a good point. I wonder if there's a nice way to exploit this to make conditional risk estimates, consumable by futarchy.

The problem is that this is rational behaviour for a market, but, if it fails, it's not really exploitable.

As in, the only way to profit is if a disaster happens, and then you've just profited a little bit by having a more rational consumption profile, rather than profited arbitrarily.

When modeling the incentives to change probabilities of events, it probably makes sense to model the payoff of changing probabilities of events and the cost of changing probabilities of events separately. You'd expect someone to alter the probabilities if they gain more in expectation from the bets than the cost to them of altering the probabilities. If someone bets on an event and changes the probability that it occurs from p to q, then their expected payoff is q/p times their investment; so, in a prediction market in which there are n possible outcomes, the expected payoff you can get from changing the probability distribution from p to q is proportional to the maximum over outcomes i of q_i/p_i (you put your whole stake on the outcome whose probability you can inflate the most relative to its price).

The cost of changing a probability distribution seems harder to model, but the Fisher information metric might be a good crude estimate of how difficult you should expect it to be to change the probability distribution over outcomes from one distribution to another.
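For concreteness, here is a small sketch putting those two quantities side by side; pairing the max-ratio payoff with the Fisher-metric geodesic distance as a cost proxy is my own illustration of the suggestion, not a worked-out proposal:

```python
import math

def manipulation_payoff(p, q):
    """Best expected payoff per dollar from betting at market prices p and then shifting
    the true distribution to q: stake everything on the outcome with the best ratio."""
    return max(qi / pi for pi, qi in zip(p, q))

def fisher_distance(p, q):
    """Geodesic distance between p and q under the Fisher information metric on the simplex
    (2 * arccos of the Bhattacharyya coefficient), as a crude difficulty estimate."""
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return 2.0 * math.acos(min(1.0, bc))

p = [0.01, 0.99]   # market distribution
q = [0.10, 0.90]   # distribution after manipulation
print(manipulation_payoff(p, q), fisher_distance(p, q))   # payoff 10x; distance about 0.44
```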

I found a paper addressing the entropy-market issue, probably more thoroughly than I do (but I haven't read it yet): http://www.cs.duke.edu/csed/honors/pengshi.pdf