Counterfactuals, thick and thin

by Nisan
31st Jul 2018

Summary: There's a "thin" concept of counterfactual that's easy to formalize and a "thick" concept that's harder to formalize.

Suppose you're trying to guess the outcome of a coinflip. You guess heads, and the coin lands tails. Now you can ask how the coin would have landed if you had guessed tails. The obvious answer is that it would still have landed tails. One way to think about this is that we have two variables, your guess A and the coin C, that are independent in some sense; so we can counterfactually vary A while keeping C constant.

But consider the variable X=A XOR C. If we change A to tails and keep X the same, we conclude that if we had guessed tails, the coin would have landed heads!
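
To see the two choices side by side, here is a minimal sketch in Python (the encoding and function names are mine, not the post's):

```python
# Minimal sketch: the two candidate counterfactuals for the coinflip
# example. Encode heads as 0 and tails as 1, so XOR is addition mod 2.

HEADS, TAILS = 0, 1

def hold_C(guess, coin, new_guess):
    """Hold the coin C fixed: changing the guess leaves the coin alone."""
    return coin

def hold_X(guess, coin, new_guess):
    """Hold X = A XOR C fixed: the new coin must satisfy new_guess XOR new_coin = X."""
    x = guess ^ coin
    return new_guess ^ x

guess, coin = HEADS, TAILS  # actual world: guessed heads, coin landed tails

print(hold_C(guess, coin, TAILS))  # 1 (tails): the coin is unchanged
print(hold_X(guess, coin, TAILS))  # 0 (heads): the coin "flips"
```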

Now this is clearly silly. In real life, we have a causal model of the world that tells us that the first counterfactual is correct. But we don't have anything like that for logical uncertainty; the best we have is logical induction, which just gives us a joint distribution. Given a joint distribution over A×C, there's no reason to prefer holding C constant rather than holding A XOR C constant. I want a thin concept of counterfactuals that includes both choices. Here are a few definitions, in increasing generality:

1. Given independent discrete random variables A and C, such that C is uniform, a thin counterfactual is a choice of permutation ϕ(a) of C for every a∈A (see the sketch after this list).

2. Given a joint distribution over A and Y, a thin counterfactual is a random variable Z independent of A and an isomorphism of probability spaces A×Z≈A×Y that commutes with the projection to A.

3. Given a probability space A and a probability kernel κ:A→Y, a thin counterfactual is a probability space Z and a kernel λ:A×Z→Y such that ∫_Z λ dz = κ.
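
As a concrete illustration of definition 1, here is a minimal sketch (the latent-variable reading, names, and encoding are my own interpretation, not the post's) showing that both counterfactuals above arise as different families of permutations:

```python
# Minimal sketch of definition 1: a thin counterfactual assigns each value
# a of A a permutation phi[a] of C's outcomes. Read it as relating C to a
# latent variable Z via C = phi[a](Z): infer Z from the actual world, then
# replay it under the counterfactual guess.

HEADS, TAILS = 0, 1

def counterfactual(phi, a, c, new_a):
    z = phi[a].index(c)   # invert phi[a]: the z with phi[a](z) = c
    return phi[new_a][z]  # apply phi[new_a] to the same latent z

# Identity permutations recover the "hold C constant" counterfactual.
phi_identity = {HEADS: [HEADS, TAILS], TAILS: [HEADS, TAILS]}

# phi[a](z) = a XOR z recovers the "hold A XOR C constant" counterfactual.
phi_xor = {HEADS: [HEADS, TAILS], TAILS: [TAILS, HEADS]}

a, c = HEADS, TAILS  # actual world: guessed heads, coin landed tails
print(counterfactual(phi_identity, a, c, TAILS))  # 1 (tails)
print(counterfactual(phi_xor, a, c, TAILS))       # 0 (heads)
```

On this reading, uniformity of C is what makes any family of permutations acceptable: permuting a uniform variable preserves its distribution, so every choice of family reproduces C's marginal.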

There are often multiple choices of thin counterfactual. When we say that one of the thin counterfactuals is more natural or better than the others, we are using a thick concept of counterfactuals. Pearl's concept of counterfactuals is a thick one. No one has yet formalized a thick concept of counterfactuals in the setting of logical uncertainty.

Comments (sorted by top scoring)

Dacyn: The question "how would the coin have landed if I had guessed tails?" seems to me like a reasonably well-defined physical question about how accurately you can flip a coin without having the result be affected by random noise such as someone saying "heads" or "tails" (as well as quantum fluctuations). It's not clear to me what the answer to this question is, though I would guess that the coin's counterfactual probability of landing heads is somewhere strictly between 0% and 50%.

Nisan: Oh, interesting. Would your interpretation be different if the guess occurred well after the coinflip (but before we get to see the coinflip)?

Dacyn: Sure, in that case there is a 0% counterfactual chance of heads; your words aren't going to flip the coin.

Nisan: Ok. I think that's the way I should have written it, then.

Mentioned in
Alignment Newsletter #18
Counterfactuals and reflective oracles