Finite Factored Sets: Inferring Time

Scott Garrabrant

The fundamental theorem of finite factored sets tells us that (conditional) orthogonality data can be inferred from probabilistic data. Thus, if we can infer temporal data from orthogonality data, we will be able to combine these to infer temporal data purely from probabilistic data. In this section, we will discuss the problem of inferring temporal data from orthogonality data, mostly by going through a couple of examples.

6.1. Factored Set Models

We'll begin with a sample space, $Ω$ .

Naively, one might expect that temporal inference in this paradigm involves inferring a factorization of $Ω$ . What we'll actually be doing, however, is inferring a factored set model of $Ω$ . This will allow for the possibility that some situations are distinct without being distinct in $Ω$ : that is, there can be latent structure not represented in $Ω$ .

Definition 38 (model). Given a set $Ω$ , a model of $Ω$ is a pair $M = (F, f)$ , where $F$ is a finite factored set and $f : set (F) \to Ω$ is a function from the set of $F$ to $Ω$ .

Definition 39. Let $S$ and $Ω$ be sets, and let $f : S \to Ω$ be a function from $S$ to $Ω$ .

Given an $ω \in Ω$ , we let $f^{- 1} (ω) = {s \in S ∣ f (s) = ω}$ .

Given an $E \subseteq Ω$ , we let $f^{- 1} (E) = {s \in S ∣ f (s) \in E}$ .

Given an $X \in Part (Ω)$ , we let $f^{- 1} (X) \in Part (S)$ be given by $f^{- 1} (X) = {f^{- 1} (x) | x \in X, f^{- 1} (x) \neq {}}$ .
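As a small illustration of the last clause of Definition 39, here is a minimal Python sketch of the preimage of a partition. The helper name `preimage_partition` and the toy sets `S` and `X` are hypothetical, chosen only to show that parts with empty preimage are dropped.

```python
def preimage_partition(S, f, X):
    """Return f^{-1}(X) as a set of frozensets, dropping empty preimages."""
    parts = set()
    for x in X:
        pre = frozenset(s for s in S if f(s) in x)
        if pre:  # Definition 39 discards parts whose preimage is empty
            parts.add(pre)
    return parts

# Hypothetical toy data: f sends "b" and "c" to the same outcome and misses 2.
S = ["a", "b", "c"]
f = {"a": 0, "b": 1, "c": 1}.get
X = [frozenset({0}), frozenset({1}), frozenset({2})]  # discrete partition of {0, 1, 2}

result = preimage_partition(S, f, X)
print(result)
```

Here the part `{2}` has no preimage, so the result has only two parts, and the two points that $f$ identifies end up in the same part.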

Definition 40 (orthogonality database). Given a set $Ω$ , an orthogonality database on $Ω$ is a pair $D = (O, N)$ , where $O$ and $N$ are both subsets of $Part (Ω) \times Part (Ω) \times Part (Ω)$ .

Definition 41. Given an orthogonality database $D = (O, N)$ on a set $Ω$ , and partitions $X, Y, Z \in Part (Ω)$ , we write $X ⊥_{D} Y | Z$ if $(X, Y, Z) \in O$ , and we write $X ⇌_{D} Y | Z$ if $(X, Y, Z) \in N$ .

Definition 42. Given a set $Ω$ , a model $M = (F, f)$ of $Ω$ , and an orthogonality database $D = (O, N)$ on $Ω$ , we say $M$ models $D$ if for all $X, Y, Z \in Part (Ω)$ ,

if $X ⊥_{D} Y | Z$ then $f^{- 1} (X) ⊥^{F} f^{- 1} (Y) | f^{- 1} (Z)$ , and
if $X ⇌_{D} Y | Z$ then $\neg (f^{- 1} (X) ⊥^{F} f^{- 1} (Y) | f^{- 1} (Z))$ .

Definition 43. An orthogonality database $D$ on a set $Ω$ is called consistent if there exists a model $M$ of $Ω$ such that $M$ models $D$ .

Definition 44. An orthogonality database $D$ on a set $Ω$ is called complete if for all $X, Y, Z \in Part (Ω)$ , either $X ⊥_{D} Y | Z$ or $X ⇌_{D} Y | Z$ .

Definition 45. Given a set $Ω$ , an orthogonality database $D$ on $Ω$ , and $X, Y \in Part (Ω)$ , we say $X <_{D} Y$ if for all models $(F, f)$ of $Ω$ that model $D$ , we have $f^{- 1} (X) <^{F} f^{- 1} (Y)$ .

6.2. Examples

Example 1. Let $Ω = {00, 01, 10, 11}$ be the set of all bit strings of length $2$ . For $i \in {0, 1}$ , let $x_{i} = {i 0, i 1}$ be the event that the first bit is $i$ , and let $y_{i} = {0 i, 1 i}$ be the event that the second bit is $i$ . Let $X = {x_{0}, x_{1}}$ and let $Y = {y_{0}, y_{1}}$ .

Let $v_{0} = {00, 11}$ be the event that the two bits are equal, let $v_{1} = {01, 10}$ be the event that the two bits are unequal, and let $V = {v_{0}, v_{1}}$ .

Let $D = (O, N)$ , where $O = {(X, V, {Ω})}$ and $N = {(V, V, {Ω})}$ .

Proposition 33. In Example 1, $D$ is consistent.

Proof. First observe that $F = (Ω, {X, V})$ is a factored set, and so $M = (F, f)$ is a model of $Ω$ , where $f$ is the identity on $Ω$ . It suffices to show that $M$ models $D$ .

Indeed $h^{F} (X) = {X}$ , and $h^{F} (V) = {V}$ , so $X ⊥^{F} V$ , so $f^{- 1} (X) ⊥^{F} f^{- 1} (V) | f^{- 1} ({Ω})$ .

Further, it is not the case that $V ⊥^{F} V$ , since $V \neq {Ind}_{Ω}$ . Thus it is not the case that $f^{- 1} (V) ⊥^{F} f^{- 1} (V) | f^{- 1} ({Ω})$ .

Thus $M$ satisfies all of the conditions to model $D$ , so $D$ is consistent. $□$
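The two histories used in this proof can also be checked by brute force. The following is a minimal sketch, assuming we encode each factor as a labelling function on $Ω$ and search for the smallest set of factors whose labels determine the part of a given partition; the function name `history` and this encoding are our own choices for illustration, not part of the formal development.

```python
from itertools import combinations, product

S = ["".join(bits) for bits in product("01", repeat=2)]

# The factors of F = (Omega, {X, V}), each given as a labelling function.
factors = {
    "X": lambda s: s[0],          # first bit
    "V": lambda s: s[0] == s[1],  # are the two bits equal?
}

def history(part_of):
    """Smallest set of factor names H such that agreeing on H forces the same part."""
    names = sorted(factors)
    for k in range(len(names) + 1):
        for H in combinations(names, k):
            if all(part_of(s) == part_of(t)
                   for s in S for t in S
                   if all(factors[b](s) == factors[b](t) for b in H)):
                return set(H)

h_X = history(lambda s: s[0])
h_V = history(lambda s: s[0] == s[1])
print(h_X, h_V)
```

The two histories come out disjoint, matching $X ⊥^{F} V$ , and the history of $V$ is nonempty, matching the failure of $V ⊥^{F} V$ .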

Proposition 34. In Example 1, $X <_{D} Y$ .

Proof. Let $(F, f)$ be any model of $Ω$ that models $D$ . Let $F = (S, B)$ . For any $A \in Part (Ω)$ , let $H_{A} = h^{F} (f^{- 1} (A))$ . Our goal is to show that $H_{X}$ is a strict subset of $H_{Y}$ .

First observe that $X \leq_{Ω} Y \lor_{Ω} V$ , so for any $s, t \in S$ , if $s \sim_{f^{- 1} (Y)} t$ and $s \sim_{f^{- 1} (V)} t$ , then $f (s) \sim_{Y} f (t)$ and $f (s) \sim_{V} f (t)$ , so $f (s) \sim_{X} f (t)$ , so $s \sim_{f^{- 1} (X)} t$ . Thus $f^{- 1} (X) \leq_{S} f^{- 1} (Y) \lor_{S} f^{- 1} (V)$ .
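The starting observation, that $Y$ and $V$ together determine $X$ in $Ω$ , is easy to confirm mechanically. This is a minimal sketch; the helper names `partition_by`, `common_refinement`, and `is_coarser` are hypothetical.

```python
from itertools import product

Omega = ["".join(bits) for bits in product("01", repeat=2)]

def partition_by(key):
    """Partition Omega by the value of key(w)."""
    blocks = {}
    for w in Omega:
        blocks.setdefault(key(w), set()).add(w)
    return {frozenset(b) for b in blocks.values()}

X = partition_by(lambda w: w[0])          # first bit
Y = partition_by(lambda w: w[1])          # second bit
V = partition_by(lambda w: w[0] == w[1])  # bits equal?

def common_refinement(P, Q):
    """The join of P and Q: all nonempty pairwise intersections."""
    return {p & q for p in P for q in Q if p & q}

def is_coarser(P, Q):
    """True if every part of Q lies inside a single part of P."""
    return all(any(q <= p for p in P) for q in Q)

YV = common_refinement(Y, V)
print(is_coarser(X, YV), is_coarser(X, Y))
```

The join of $Y$ and $V$ is the discrete partition of $Ω$ , so $X$ is coarser than it, while $X$ is not coarser than $Y$ alone.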

It follows that $H_{X} \subseteq h^{F} (f^{- 1} (Y) \lor_{S} f^{- 1} (V)) = H_{Y} \cup H_{V}$ . However, since $X ⊥_{D} V | {Ω}$ , we have that $H_{X} \cap H_{V} = {}$ , so $H_{X} \subseteq H_{Y}$ .

By swapping $X$ and $V$ in the argument above, we also get that $H_{V} \subseteq H_{Y}$ . Since $V ⇌_{D} V | {Ω}$ , we have that $H_{V} \neq {}$ . Thus $H_{V}$ contains some element $b$ . Observe that $b \notin H_{X}$ , but $b \in H_{Y}$ . Thus $H_{X}$ is a strict subset of $H_{Y}$ , so $f^{- 1} (X) <^{F} f^{- 1} (Y)$ .

Since $(F, f)$ was an arbitrary model of $Ω$ that models $D$ , this implies that $X <_{D} Y$ . $□$

Example 2. Let $Ω = {000, 001, 010, 011, 100, 101, 110, 111}$ be the set of all bit strings of length $3$ . For $i \in {0, 1}$ , let $x_{i} = {i 00, i 01, i 10, i 11}$ be the event that the first bit is $i$ , let $y_{i} = {0 i 0, 0 i 1, 1 i 0, 1 i 1}$ be the event that the second bit is $i$ , and let $z_{i} = {00 i, 01 i, 10 i, 11 i}$ be the event that the third bit is $i$ . Let $X = {x_{0}, x_{1}}$ , let $Y = {y_{0}, y_{1}}$ , and let $Z = {z_{0}, z_{1}}$ .

Let $v_{0} = {000, 001, 110, 111}$ be the event that the first two bits are equal, let $v_{1} = {010, 011, 100, 101}$ be the event that the first two bits are unequal, and let $V = {v_{0}, v_{1}}$ .

Let $D = (O, N)$ , where $O = {(X, V, {Ω}), (X, Z, Y), (V, Z, Y)}$ and $N = {(X, Z, {Ω}), (V, Z, {Ω}), (Z, Z, Y)}$ .

Proposition 35. In Example 2, $D$ is consistent.

Proof. Let $S = Ω \cup {00, 01, 10, 11}$ be the set of all bit strings of length either $2$ or $3$ .

For $i \in {0, 1}$ , let $x_{i}^{'} = {i 00, i 01, i 10, i 11, i 0, i 1}$ be the event that the first bit is $i$ , and let $X^{'} = {x_{0}^{'}, x_{1}^{'}}$ .

For $i \in {0, 1}$ , let $y_{i}^{'} = {0 i 0, 0 i 1, 1 i 0, 1 i 1, 0 i, 1 i}$ be the event that the second bit is $i$ , and let $Y^{'} = {y_{0}^{'}, y_{1}^{'}}$ .

Let $v_{0}^{'} = {000, 001, 110, 111, 00, 11}$ be the event that the first two bits are equal, let $v_{1}^{'} = {010, 011, 100, 101, 01, 10}$ be the event that the first two bits are unequal, and let $V^{'} = {v_{0}^{'}, v_{1}^{'}}$ .

For $i \in {0, 1}$ , let $z_{i}^{'} = {00 i, 01 i, 10 i, 11 i}$ be the event that the third bit exists and is $i$ , let $z_{2}^{'} = {00, 01, 10, 11}$ be the event that there are only two bits, and let $Z^{'} = {z_{0}^{'}, z_{1}^{'}, z_{2}^{'}}$ .

Let $B = {X^{'}, V^{'}, Z^{'}}$ . Clearly, $F = (S, B)$ is a finite factored set.

Let $f : S \to Ω$ be given by $f (s) = s$ if $s \in Ω$ , $f (00) = 000$ , $f (01) = 011$ , $f (10) = 100$ , and $f (11) = 111$ , so $f$ copies the last bit on inputs of length $2$ , and otherwise leaves the bit string alone. We will show that $(F, f)$ models $D$ .

First, observe that $f^{- 1} (X) = X^{'}$ , $f^{- 1} (Y) = Y^{'}$ , $f^{- 1} (V) = V^{'}$ , and $f^{- 1} (Z) = {{000, 010, 100, 110, 00, 10}, {001, 011, 101, 111, 01, 11}}$ .

It is easy to verify that $h^{F} (X^{'}) = {X^{'}}, h^{F} (V^{'}) = {V^{'}}, h^{F} (Y^{'}) = {X^{'}, V^{'}}$ , and $h^{F} (f^{- 1} (Z)) = B$ . From this, we get that $X^{'} ⊥^{F} V^{'}$ holds, but $X^{'} ⊥^{F} f^{- 1} (Z)$ and $V^{'} ⊥^{F} f^{- 1} (Z)$ do not hold.

Next, observe that for $i \in {0, 1}$ , $X^{'} | y_{i}^{'} = V^{'} | y_{i}^{'} = {{0 i 0, 0 i 1, 0 i}, {1 i 0, 1 i 1, 1 i}}$ . It is easy to verify that $h^{F} (X^{'} | y_{i}^{'}) = h^{F} (V^{'} | y_{i}^{'}) = {X^{'}, V^{'}}$ .

Also, observe that $f^{- 1} (Z) | y_{0}^{'} = {{000, 100, 00, 10}, {001, 101}}$ , and observe that $f^{- 1} (Z) | y_{1}^{'} = {{010, 110}, {011, 111, 01, 11}}$ . It is easy to verify that $h^{F} (f^{- 1} (Z) | y_{0}^{'}) = h^{F} (f^{- 1} (Z) | y_{1}^{'}) = {Z^{'}}$ .

From this, we get that $X^{'} ⊥^{F} f^{- 1} (Z) | Y^{'}$ and $V^{'} ⊥^{F} f^{- 1} (Z) | Y^{'}$ hold, and $f^{- 1} (Z) ⊥^{F} f^{- 1} (Z) | Y^{'}$ does not hold.

Thus, $(F, f)$ models $D$ , so $D$ is consistent. $□$
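The "easy to verify" steps in this proof can be checked by exhaustive search. The sketch below assumes the standard characterization of a conditional history $h^{F} (P | E)$ as the smallest set of factors $H$ such that (1) elements of $E$ agreeing on $H$ lie in the same part of $P$ , and (2) mixing the $H$ -factors of one element of $E$ with the remaining factors of another stays inside $E$ . The encoding of factors as labelling functions and all function names are our own, for illustration only.

```python
from itertools import combinations, product

Omega = ["".join(b) for b in product("01", repeat=3)]
S = Omega + ["".join(b) for b in product("01", repeat=2)]

# The three factors X', V', Z' of the model, as labelling functions on S.
factors = {
    "X'": lambda s: s[0],
    "V'": lambda s: s[0] == s[1],
    "Z'": lambda s: s[2] if len(s) == 3 else "none",
}

def f(s):
    """Copy the last bit of length-2 strings; leave length-3 strings alone."""
    return s if len(s) == 3 else s + s[-1]

def agree(s, t, H):
    return all(factors[b](s) == factors[b](t) for b in H)

def history(label, E=S):
    """Brute-force h^F(P | E), where P is the partition of S induced by label."""
    names = sorted(factors)
    for k in range(len(names) + 1):
        for H in combinations(names, k):
            rest = [b for b in names if b not in H]
            generates = all(label(s) == label(t)
                            for s in E for t in E if agree(s, t, H))
            closed = all(r in E
                         for s in E for t in E for r in S
                         if agree(r, s, H) and agree(r, t, rest))
            if generates and closed:
                return set(H)

# Labelling functions for f^{-1}(X), f^{-1}(Y), f^{-1}(V), f^{-1}(Z).
pre = {
    "X": lambda s: f(s)[0],
    "Y": lambda s: f(s)[1],
    "V": lambda s: f(s)[0] == f(s)[1],
    "Z": lambda s: f(s)[2],
}

def orthogonal(a, b, c=None):
    """Is f^{-1}(a) orthogonal to f^{-1}(b) given f^{-1}(c)? (c=None means {Omega}.)"""
    if c is None:
        events = [S]
    else:
        events = [[s for s in S if pre[c](s) == v] for v in {pre[c](s) for s in S}]
    return all(history(pre[a], E).isdisjoint(history(pre[b], E)) for E in events)

O = [("X", "V", None), ("X", "Z", "Y"), ("V", "Z", "Y")]
N = [("X", "Z", None), ("V", "Z", None), ("Z", "Z", "Y")]
models_D = (all(orthogonal(*t) for t in O)
            and not any(orthogonal(*t) for t in N))
print(models_D)
```

The search is exponential in the number of factors, which is harmless here with three factors and twelve points, but it is only a sanity check for this one model and says nothing about other models of $D$ .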

Proposition 36. In Example 2, $X <_{D} Y <_{D} Z$ .

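Proof. As in the proof of Proposition 34, let $(F, f)$ be any model of $Ω$ that models $D$ , let $F = (S, B)$ , and for any $A \in Part (Ω)$ , let $H_{A} = h^{F} (f^{- 1} (A))$ .

First observe that $X \leq_{Ω} Y \lor_{Ω} V$ , so $f^{- 1} (X) \leq_{S} f^{- 1} (Y) \lor f^{- 1} (V)$ , so $H_{X} \subseteq H_{Y} \cup H_{V}$ . Since $X ⊥_{D} V | {Ω}$ , $H_{X} \cap H_{V} = {}$ , so $H_{X} \subseteq H_{Y}$ . Symmetrically, $H_{V} \subseteq H_{Y}$ , so $H_{X} \cup H_{V} \subseteq H_{Y}$ .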

Similarly, $Y \leq_{Ω} X \lor_{Ω} V$ , so $H_{Y} \subseteq H_{X} \cup H_{V}$ . Thus $H_{Y} = H_{X} \cup H_{V}$ .

We also know that $H_{X}$ and $H_{V}$ are nonempty, because $X ⇌_{D} Z | {Ω}$ and $V ⇌_{D} Z | {Ω}$ .

Thus $H_{X}$ is a strict subset of $H_{Y}$ , so $X <_{D} Y$ .

Let $C \subseteq B$ be arbitrary such that $H_{X} \cap C$ and $H_{V} \cap (B ∖ C)$ are both nonempty. Fix some $b_{X} \in H_{X} \cap C$ and $b_{V} \in H_{V} \cap (B ∖ C)$ .

Since $b_{X} \in H_{X}$ , there must exist $s_{0}, s_{1} \in S$ such that $s_{0} \sim_{b} s_{1}$ for all $b \in B ∖ {b_{X}}$ , but not $s_{0} \sim_{f^{- 1} (X)} s_{1}$ . Thus it is not the case that $f (s_{0}) \sim_{X} f (s_{1})$ . Without loss of generality, assume that $f (s_{0}) \in x_{0}$ and $f (s_{1}) \in x_{1}$ .

Similarly, since $b_{V} \in H_{V}$ , there must exist $t_{0}, t_{1} \in S$ such that $t_{0} \sim_{b} t_{1}$ for all $b \in B ∖ {b_{V}}$ , but not $t_{0} \sim_{f^{- 1} (V)} t_{1}$ . Again, without loss of generality, assume that $f (t_{0}) \in v_{0}$ and $f (t_{1}) \in v_{1}$ .

For $i, j \in {0, 1}$ , let $r_{i j} = χ_{H_{X}}^{F} (s_{i}, t_{j})$ .

Next, observe that $r_{i j} \sim_{f^{- 1} (X)} s_{i}$ , so $f (r_{i j}) \sim_{X} f (s_{i}) \in x_{i}$ , so $f (r_{i j}) \in x_{i}$ . Similarly, $f (r_{i j}) \in v_{j}$ , so $f (r_{i j}) \in x_{i} \cap v_{j}$ . Thus, if $i = j, f (r_{i j}) \in y_{0}$ , and if $i \neq j$ , $f (r_{i j}) \in y_{1}$ .

Further, observe that $χ_{C}^{F} (r_{00}, r_{11}) = r_{01}$ , since $r_{00}$ and $r_{11}$ agree on all factors other than $b_{X}$ and $b_{V}$ . In particular, this means that $χ_{C}^{F} (f^{- 1} (y_{0}), f^{- 1} (y_{0})) \neq f^{- 1} (y_{0})$ . Similarly, since $χ_{C}^{F} (r_{01}, r_{10}) = r_{00}$ , we have that $χ_{C}^{F} (f^{- 1} (y_{1}), f^{- 1} (y_{1})) \neq f^{- 1} (y_{1})$ .
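The proof works in an arbitrary model, but the mixing step may be easier to follow in a concrete one. The sketch below illustrates it in the specific model from the proof of Proposition 35, with the assumed choices $b_{X} = X^{'}$ , $b_{V} = V^{'}$ , and $C = {X^{'}}$ ; the function name `chi` and the particular witnesses `s` and `t` are hypothetical.

```python
from itertools import product

S = (["".join(b) for b in product("01", repeat=3)]
     + ["".join(b) for b in product("01", repeat=2)])

factors = {
    "X'": lambda s: s[0],
    "V'": lambda s: s[0] == s[1],
    "Z'": lambda s: s[2] if len(s) == 3 else "none",
}

def chi(C, s, t):
    """The unique element agreeing with s on factors in C and with t on the rest."""
    matches = [r for r in S
               if all(factors[b](r) == factors[b](s if b in C else t)
                      for b in factors)]
    assert len(matches) == 1  # holds because the factors form a factorization of S
    return matches[0]

s = ["000", "110"]  # differ only on the factor X'
t = ["000", "010"]  # differ only on the factor V'

# r[i, j] takes its X'-value from s[i] and everything else from t[j].
r = {(i, j): chi({"X'"}, s[i], t[j]) for i in (0, 1) for j in (0, 1)}

# Mix r[0, 0] and r[1, 1] along C = {X'}.
mixed = chi({"X'"}, r[(0, 0)], r[(1, 1)])
print(r, mixed)
```

In this model $f$ leaves the second bit unchanged, so the second character of each string tells us which part of $f^{- 1} (Y)$ it lies in: `r[(0, 0)]` and `r[(1, 1)]` both have second bit $0$ , while their mixture along $C$ has second bit $1$ .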

We will use this to show that for any $y \in f^{- 1} (Y)$ and $A \in Part (y)$ , either $h^{F} (A) \cap H_{Y} = {}$ , or $H_{Y} \subseteq h^{F} (A)$ . This is because $h^{F} (A) ⊢^{F} A$ , so $χ_{h^{F} (A)}^{F} (y, y) = y$ , so by the above argument, if $h^{F} (A) \cap H_{X}$ is nonempty, then $H_{V} \subseteq h^{F} (A)$ , which since $H_{V}$ is nonempty means $h^{F} (A) \cap H_{V}$ is nonempty, so $H_{X} \subseteq h^{F} (A)$ , so $H_{Y} \subseteq h^{F} (A)$ . Symmetrically, we also have that if $h^{F} (A) \cap H_{V}$ is nonempty, then $H_{Y} \subseteq h^{F} (A)$ . Thus, if $h^{F} (A) \cap H_{Y}$ is nonempty, then either $h^{F} (A) \cap H_{X}$ or $h^{F} (A) \cap H_{V}$ is nonempty, so $H_{Y} \subseteq h^{F} (A)$ .

Note that for any $y \in f^{- 1} (Y)$ , two of the elements among the four $r_{i j}$ defined above are in $y$ , and those two elements are in different parts in $f^{- 1} (X)$ , so $f^{- 1} (X) | y$ has at least two parts, so $h^{F} (f^{- 1} (X) | y)$ is nonempty. However, $h^{F} (f^{- 1} (X) | y) \subseteq h^{F} (f^{- 1} (X) \lor_{S} f^{- 1} (Y)) = H_{Y}$ . Thus, $h^{F} (f^{- 1} (X) | y) \cap H_{Y} \neq {}$ , so $H_{Y} \subseteq h^{F} (f^{- 1} (X) | y)$ , so $h^{F} (f^{- 1} (X) | y) = H_{Y}$ . Symmetrically, $h^{F} (f^{- 1} (V) | y) = H_{Y}$ .

In particular, this means that $h^{F} (f^{- 1} (Z) | y) \cap H_{Y} = {}$ , since $X ⊥_{D} Z | Y$ .

Since $X ⇌_{D} Z | {Ω}$ , there exists some $b_{Z} \in H_{X} \cap H_{Z}$ . Since $b_{Z} \in H_{Z}$ , there exist $u_{0}, u_{1} \in S$ such that $u_{0} \sim_{b} u_{1}$ for all $b \in B ∖ {b_{Z}}$ , but it is not the case that $u_{0} \sim_{f^{- 1} (Z)} u_{1}$ . Without loss of generality, assume that $f (u_{0}) \in z_{0}$ and $f (u_{1}) \in z_{1}$ . Let $y = [u_{0}]_{f^{- 1} (Y)}$ .

Let $b_{Y}$ be an arbitrary element of $H_{Y}$ . Since $b_{Y} \in H_{Y}$ , there exist $q_{0}, q_{1} \in S$ such that $q_{0} \sim_{b} q_{1}$ for all $b \in B ∖ {b_{Y}}$ , but it is not the case that $q_{0} \sim_{f^{- 1} (Y)} q_{1}$ . Without loss of generality, assume that $q_{0} \in y$ and $q_{1} \notin y$ .

Consider $p_{0} = χ_{H_{Y}}^{F} (q_{0}, u_{0}) = χ_{H_{Y}}^{F} (q_{0}, u_{1})$ . Since $q_{0} \in y$ , $p_{0} \in y$ . Since $u_{0}$ is also in $y$ , $χ_{h^{F} (f^{- 1} (Z) | y)}^{F} (p_{0}, u_{0}) \sim_{f^{- 1} (Z)} p_{0}$ . However, since $h^{F} (f^{- 1} (Z) | y) \cap H_{Y} = {}$ , we have $χ_{h^{F} (f^{- 1} (Z) | y)}^{F} (p_{0}, u_{0}) = u_{0}$ , so $u_{0} \sim_{f^{- 1} (Z)} p_{0}$ .

If $u_{1}$ were in $y$ , we would similarly have $u_{1} \sim_{f^{- 1} (Z)} p_{0}$ , which would contradict the fact that it is not the case that $u_{0} \sim_{f^{- 1} (Z)} u_{1}$ . Thus $u_{1} \notin y$ .

Next, consider $p_{1} = χ_{H_{Y}}^{F} (q_{1}, u_{0}) = χ_{H_{Y}}^{F} (q_{1}, u_{1})$ . Since $q_{1} \notin y$ , $p_{1} \notin y$ . Since $u_{1}$ is also not in $y$ , $χ_{h^{F} (f^{- 1} (Z) | (S ∖ y))}^{F} (p_{1}, u_{1}) \sim_{f^{- 1} (Z)} p_{1}$ . However, since $h^{F} (f^{- 1} (Z) | (S ∖ y)) \cap H_{Y} = {}$ , we have $χ_{h^{F} (f^{- 1} (Z) | (S ∖ y))}^{F} (p_{1}, u_{1}) = u_{1}$ , so $u_{1} \sim_{f^{- 1} (Z)} p_{1}$ .

Thus, it is not the case that $p_{0} \sim_{f^{- 1} (Z)} p_{1}$ . However, we constructed $p_{0}$ and $p_{1}$ such that $p_{0} \sim_{b} p_{1}$ for all $b \neq b_{Y}$ . Thus $b_{Y} \in H_{Z}$ . Since $b_{Y}$ was arbitrary in $H_{Y}$ , we have that $H_{Y} \subseteq H_{Z}$ . Finally, we need to show that this subset relation is strict.

Since $Z ⇌_{D} Z | Y$ , there is some $y$ such that $h^{F} (f^{- 1} (Z) | y) \neq {}$ . Let $b$ be any element of $h^{F} (f^{- 1} (Z) | y)$ . Since $h^{F} (f^{- 1} (Z) | y) \cap H_{Y} = {}$ , $b \notin H_{Y}$ . However, $b \in h^{F} (f^{- 1} (Z) | y) \subseteq h^{F} (f^{- 1} (Z) \lor_{S} f^{- 1} (Y)) = H_{Z} \cup H_{Y}$ . Therefore $b \in H_{Z}$ . Thus $H_{Y}$ is a strict subset of $H_{Z}$ , so $Y <_{D} Z$ . $□$

In the next post, we'll discuss applications and future research directions.

Rohin Shah (4y)

For Example 2 / Prop 35, would this model also work?

Define $W$ to be the factor corresponding to the question "are the second and third bits equal or not?" Then $((Ω, {X, V, W}), id)$ is a model of $Ω$ . I believe this is consistent with $D$ :

For $O = {(X, V, {Ω}), (X, Z, Y), (V, Z, Y)}$ :

We have $h^{F} (X) = {X}$ and $h^{F} (V) = {V}$ for the first condition.

We have $h^{F} (X ∣ Y) = {X}$ and $h^{F} (Z ∣ Y) = {W}$ for the second condition.

We have $h^{F} (V ∣ Y) = {V}$ and $h^{F} (Z ∣ Y) = {W}$ for the third condition.

For $N = {(X, Z, {Ω}), (V, Z, {Ω}), (Z, Z, Y)}$ .

We have $h^{F} (Z) = {X, V, W}$ for the first and second conditions.

We have $h^{F} (Z ∣ Y) = {W}$ for the third condition.

Scott Garrabrant (4y)

I think that works, though I didn't look very hard. Your histories of $X$ given $Y$ and $V$ given $Y$ are wrong, but it doesn't change the conclusion.

Yeah, both of those should be ${X, V}$ , if I'm not mistaken (a second time).

Scott Garrabrant (4y)

Yeah, also note that the history of $X$ given $Y$ is not actually a well-defined concept. There is only the history of $X$ given $y$ for $y \in Y$ . You could define it to be the union of all of those, but that would not actually be used in the definition of orthogonality. In this case $h^{F} (X | y)$ , $h^{F} (V | y)$ , and $h^{F} (Z | y)$ are all independent of the choice of $y \in Y$ , but in general, you should be careful about that.

Rohin Shah (4y)

Yeah, fair point. (I did get this right in the summary; turns out if you try to explain things from first principles it becomes blindingly obvious what you should and shouldn't be doing.)
