Cartesian frames as generalised models

Stuart_Armstrong

Scott presented Cartesian frames/Chu spaces as follows:

Let $W$ be a set of possible worlds. A Cartesian frame $C$ over $W$ is a triple $C = (A, D, \cdot)$ , where $A$ represents a set of possible ways the agent can be, $D$ represents a set of possible ways the environment can be, and $\cdot : A \times D \to W$ is an evaluation function that returns a possible world given an element of $A$ and an element of $D$ .

In a previous post, I defined $G M$ , the category of generalised models.

In this post, I'll try and see how these two formalisms relate to each other.

Equivalence with Cartesian frames

We'll now demonstrate the equivalence of Cartesian frame morphisms with the morphisms of generalised models. To do so while avoiding a collision of symbols, I've slightly tweaked the notation for Cartesian frames.

Equivalence of morphisms

Let $C_{0} = (A_{0}, D_{0}, ⋆_{0})$ and $C_{1} = (A_{1}, D_{1}, ⋆_{1})$ be Cartesian frames over $W$ : thus there are relations $⋆_{0} : A_{0} \times D_{0} \to W$ (written as $a_{0} ⋆_{0} d_{0} = w$ ) and $⋆_{1} : A_{1} \times D_{1} \to W$ (written as $a_{1} ⋆_{1} d_{1} = w^{'}$ ).

A morphism between them is a pair of maps $(g_{0} : A_{0} \to A_{1}, h_{1} : D_{1} \to D_{0})$ , such that, for all $a_{0} \in A_{0}$ and $d_{1} \in D_{1}$ , $g_{0} (a_{0}) ⋆_{1} d_{1} = a_{0} ⋆_{0} h_{1} (d_{1})$ .
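To make the morphism condition concrete, here is a minimal Python sketch that checks it by brute force on finite frames. The toy frames, the dictionary encoding of $⋆$ and the names `is_frame_morphism`, `star0`, `star1` are illustrative assumptions, not part of either formalism.

```python
from itertools import product

def is_frame_morphism(g, h, star0, star1, A0, D1):
    """Check g(a0) *1 d1 == a0 *0 h(d1) for every a0 in A0 and d1 in D1."""
    return all(star1[(g[a0], d1)] == star0[(a0, h[d1])]
               for a0, d1 in product(A0, D1))

# Hypothetical toy frames over W = {"w0", "w1"}.
A0, D0 = ["a", "b"], ["d"]
A1, D1 = ["x", "y"], ["u", "v"]
star0 = {("a", "d"): "w0", ("b", "d"): "w1"}
star1 = {("x", "u"): "w0", ("x", "v"): "w0",
         ("y", "u"): "w1", ("y", "v"): "w1"}
g0 = {"a": "x", "b": "y"}   # A0 -> A1
h1 = {"u": "d", "v": "d"}   # D1 -> D0 (note the reversed direction)

ok = is_frame_morphism(g0, h1, star0, star1, A0, D1)
# Swapping the images of a and b breaks the condition.
swapped = is_frame_morphism({"a": "y", "b": "x"}, h1, star0, star1, A0, D1)
```

Here `ok` holds because both sides of the condition pick out the same world for every pair, while `swapped` fails already at $a_{0} = a$ , $d_{1} = u$ .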

How can we express this in the generalised model formalism?

First, let $E_{i} = A_{i} \times D_{i} \times W$ . In terms of features, this can be defined by setting $\overline{f}_{A_{i}} = A_{i}$ , $\overline{f}_{D_{i}} = D_{i}$ and $\overline{f}_{W} = W$ . Then $F_{i} = \{f_{A_{i}}, f_{D_{i}}, f_{W}\}$ , and $M_{i} = (F_{i}, E_{i}, Q_{i})$ is the feature-split generalised model with $A_{i} \subset 2^{\overline{f}_{A_{i}}} = 2^{A_{i}}$ , $D_{i} \subset 2^{\overline{f}_{D_{i}}} = 2^{D_{i}}$ , and $W \subset 2^{\overline{f}_{W}} = 2^{W}$ .

As we'll see in the bears example, there can be more interesting ways of defining the feature split $M_{i}$ .

Then the map pair $(g_{0}, h_{1})$ is equivalent to the (feature-split) relation $r$ , defined such that $(a_{0}, d_{0}, w) \sim_{r} (a_{1}, d_{1}, w^{'})$ iff:

$g_{0} (a_{0}) = a_{1}$ ,
$h_{1} (d_{1}) = d_{0}$ ,
and $w = w^{'}$ .

Without loss of clarity, we can thus write $r$ as the feature-split relation $(g_{0}, h_{1}, I d_{W})$ .
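The three conditions defining $r$ can be turned into an explicit set of related pairs. The following is a rough sketch for finite sets; the helper name `relation_from_maps` and the toy maps are assumptions made for illustration.

```python
from itertools import product

def relation_from_maps(g, h, A0, D1, W):
    """Return r as a set of pairs ((a0, d0, w), (a1, d1, w')),
    keeping exactly those with g(a0) = a1, h(d1) = d0 and w = w'."""
    # Every related pair is fixed by a free choice of (a0, d1, w).
    return {((a0, h[d1], w), (g[a0], d1, w))
            for a0, d1, w in product(A0, D1, W)}

# Hypothetical toy data.
A0, D1, W = ["a", "b"], ["u", "v"], ["w0", "w1"]
g0 = {"a": "x", "b": "y"}
h1 = {"u": "d", "v": "d"}

r = relation_from_maps(g0, h1, A0, D1, W)
```

Since a related pair is determined by $(a_{0}, d_{1}, w)$ , this toy $r$ has $2 \times 2 \times 2 = 8$ elements.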

Composing $(g_{0}, h_{1})$ and $(g_{1}, h_{2})$ generates $(g_{1} \circ g_{0}, h_{1} \circ h_{2})$ . Take $r$ as the relation defined by $(g_{0}, h_{1})$ and $p$ as the relation defined by $(g_{1}, h_{2})$ . Then if $(a_{0}, d_{0}, w) \sim_{p r} (a_{2}, d_{2}, w^{''})$ , there must exist an $(a_{1}, d_{1}, w^{'})$ with $(a_{0}, d_{0}, w) \sim_{r} (a_{1}, d_{1}, w^{'}) \sim_{p} (a_{2}, d_{2}, w^{''})$ . Then:

$g_{1} g_{0} (a_{0}) = g_{1} (a_{1}) = a_{2}$ ,
$h_{1} h_{2} (d_{2}) = h_{1} (d_{1}) = d_{0}$ ,
$w = w^{'} = w^{''}$ .

So composition of morphisms $(g, h)$ for Cartesian frames is the same as the composition of corresponding relations $(g, h, I d_{W})$ .
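This can be checked mechanically on a small case: compose the two relations as relations, and compare with the relation built directly from the composed maps. The sketch below works purely at the level of the relations (it ignores $⋆$ and $Q$ ); all names and the toy data are hypothetical.

```python
from itertools import product

def relation_from_maps(g, h, A_src, D_tgt, W):
    """Relation (g, h, Id_W) as a set of pairs of triples."""
    return {((a, h[d], w), (g[a], d, w))
            for a, d, w in product(A_src, D_tgt, W)}

def compose(r, p):
    """Relational composite: e0 ~ e2 iff e0 ~_r e1 ~_p e2 for some e1."""
    return {(e0, e2) for (e0, e1) in r for (f1, e2) in p if e1 == f1}

W = ["w0", "w1"]
A0, A1 = ["a", "b"], ["x", "y"]
D1, D2 = ["u", "v"], ["s", "t"]
g0 = {"a": "x", "b": "y"}
g1 = {"x": "z", "y": "z"}
h1 = {"u": "d", "v": "d"}
h2 = {"s": "u", "t": "v"}

r = relation_from_maps(g0, h1, A0, D1, W)
p = relation_from_maps(g1, h2, A1, D2, W)

g10 = {a: g1[g0[a]] for a in A0}   # g1 after g0
h12 = {d: h1[h2[d]] for d in D2}   # h1 after h2
direct = relation_from_maps(g10, h12, A0, D2, W)

same = compose(r, p) == direct
```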

The extra structure

We have two structures to add: Cartesian frames have the $⋆$ map, while generalised models have the probability measures $Q$ ; we need to relate them.

One natural way to relate them is to consider that if $a ⋆ d = w$ , then we should get $Q (w ∣ a, d) = 1$ and have $Q (w^{'} ∣ a, d) = 0$ for $w^{'} \neq w$ . This reflects the fact that action $a$ and environment $d$ lead inevitably to world $w$ .

Now $Q (w ∣ a, d) = Q (a, d, w) / Q (a, d, W)$ , where $Q (a, d, W)$ denotes $Q$ on the set $\{a\} \times \{d\} \times W$ ; this is $\sum_{w^{'} \in W} Q (a, d, w^{'})$ .

Hence the desired condition on $Q (w ∣ a, d)$ is equivalent to requiring that $Q (a, d, w) = 0$ iff $a ⋆ d \neq w$ . There are, of course, multiple possible $Q$ s with that property for any given $⋆$ .
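As an illustration, here is one such $Q$ among many: the one that spreads mass uniformly over the cells of the matrix. This is only a sketch, with the uniform choice, the function names and the toy frame all being assumptions; exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

def uniform_Q(A, D, W, star):
    """Q(a, d, w) = 1/|A x D| if a * d = w, and 0 otherwise."""
    n = len(A) * len(D)
    return {(a, d, w): (Fraction(1, n) if star[(a, d)] == w else Fraction(0))
            for a, d, w in product(A, D, W)}

def conditional(Q, W, w, a, d):
    """Q(w | a, d) = Q(a, d, w) / Q(a, d, W)."""
    total = sum(Q[(a, d, v)] for v in W)
    return Q[(a, d, w)] / total

# Hypothetical toy frame.
A, D, W = ["a", "b"], ["d"], ["w0", "w1"]
star = {("a", "d"): "w0", ("b", "d"): "w1"}
Q = uniform_Q(A, D, W, star)

p_hit = conditional(Q, W, "w0", "a", "d")    # the world a * d
p_miss = conditional(Q, W, "w1", "a", "d")   # any other world
```

Any other assignment of strictly positive weights to the cells $(a, d, a ⋆ d)$ would give the same conditional probabilities.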

The categorical equivalence

Now let's tie these together, and define $C (W)$ , a subcategory of $G M$ , the category of generalised models. This $C (W)$ will map surjectively to the category of Cartesian frames.

The objects of $C (W)$ are those (feature-split) generalised models which have $E = A \times D \times W$ for some sets $A$ and $D$ , and have $Q (a, d, w) = 0$ iff $a ⋆ d \neq w$ for some evaluation function $⋆ : A \times D \to W$ .

The morphisms of $C (W)$ are those morphisms of $G M$ that map $C (W)$ to itself, and that are of the form $r = (g, h, I d_{W})$ for $(g, h)$ a morphism of Cartesian frames.

Thus morphisms of $C (W)$ are derived from morphisms of $C h u (W)$ , and are also compatible with the $Q$ structures (since they are also morphisms of $G M$ ). Also included are the identity morphisms $(I d_{A}, I d_{D}, I d_{W})$ , which trivially preserve the $Q$ structures.

To demonstrate that $C (W)$ is a category, we need to show that $p r$ is a morphism of it whenever $r = (g_{0}, h_{1}, I d_{W})$ and $p = (g_{1}, h_{2}, I d_{W})$ are. We know that $p r$ must respect the $Q$ structures (since $r$ and $p$ are morphisms of $G M$ ), while $p r = (g_{1} \circ g_{0}, h_{1} \circ h_{2}, I d_{W})$ , which derives from $(g_{1} \circ g_{0}, h_{1} \circ h_{2})$ , a morphism of $C h u (W)$ .

Thus $C (W)$ is a category. Let $Φ : C (W) \to C h u (W)$ be the map that sends $(F, A \times D \times W, Q)$ to $(A, D, ⋆)$ , and sends $r = (g, h, I d_{W})$ to $(g, h)$ .

This $Φ$ is clearly a functor of categories, and it is surjective on the objects of $C h u (W)$ (the Cartesian frames). Now we need to show that it's also surjective on the morphisms, which comes from the following result:

Let $(g_{0}, h_{1})$ be a morphism between $C_{0} = (A_{0}, D_{0}, ⋆_{0})$ and $C_{1} = (A_{1}, D_{1}, ⋆_{1})$ . Then there exist $M_{0}, M_{1} \in C (W)$ and a morphism $r = (g_{0}, h_{1}, I d_{W})$ between them such that $Φ (M_{i}) = C_{i}$ .

To show that, we need to choose $Q_{0}$ and $Q_{1}$ that are compatible with $⋆_{0}$ and $⋆_{1}$ , and are compatible with $r$ .

In fact, we'll show a slightly stronger result: that for any $M_{0}$ with $Φ (M_{0}) = C_{0}$ , we can pick an $M_{1}$ (ie pick a $Q_{1}$ ) with the required properties.

To show this, note that $r = (g_{0}, h_{1}, I d_{W})$ will relate every element of $(g_{0}^{- 1} (a_{1}), d_{0}, w)$ with every element of $(a_{1}, h_{1}^{- 1} (d_{0}), w)$ . In fact, $r$ is defined by such relations, for any $a_{1} \in A_{1}$ , $d_{0} \in D_{0}$ and $w \in W$ . No other elements are related by $r$ .

For compatibility of $r$ with the $Q$ s, it suffices that $Q_{0} (g_{0}^{- 1} (a_{1}), d_{0}, w)$ be equal to $Q_{1} (a_{1}, h_{1}^{- 1} (d_{0}), w)$ .

For any $d_{1} \in D_{1}$ , define $\#_{d_{1}}$ as the size of $h_{1}^{- 1} (h_{1} (d_{1}))$ ; since $d_{1} \in h_{1}^{- 1} (h_{1} (d_{1}))$ , $\#_{d_{1}} \geq 1$ .

Then define $Q_{1} (a_{1}, d_{1}, w)$ as $Q_{0} (g_{0}^{- 1} (a_{1}), h_{1} (d_{1}), w) / \#_{d_{1}}$ . This will give the compatibility that we want.
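The construction of $Q_{1}$ can be sketched in Python and the sufficient equality above checked cell by cell. This is a toy illustration under stated assumptions: finite sets, maps stored as dictionaries, and a hypothetical example in which both $g_{0}$ and $h_{1}$ are surjective, so every preimage is non-empty. The function names are invented for this sketch.

```python
from fractions import Fraction
from itertools import product

def preimage(f, y):
    """All x with f(x) = y, for a map f stored as a dict."""
    return [x for x in f if f[x] == y]

def build_Q1(Q0, g0, h1, A1, D1, W):
    """Q1(a1, d1, w) = Q0(g0^{-1}(a1), h1(d1), w) / #_{d1}."""
    Q1 = {}
    for a1, d1, w in product(A1, D1, W):
        count = len(preimage(h1, h1[d1]))  # #_{d1}, always >= 1
        mass = sum((Q0[(a0, h1[d1], w)] for a0 in preimage(g0, a1)),
                   Fraction(0))
        Q1[(a1, d1, w)] = mass / count
    return Q1

def compatible(Q0, Q1, g0, h1, A1, D0, W):
    """Check Q0(g0^{-1}(a1), d0, w) == Q1(a1, h1^{-1}(d0), w) everywhere."""
    for a1, d0, w in product(A1, D0, W):
        lhs = sum((Q0[(a0, d0, w)] for a0 in preimage(g0, a1)), Fraction(0))
        rhs = sum((Q1[(a1, d1, w)] for d1 in preimage(h1, d0)), Fraction(0))
        if lhs != rhs:
            return False
    return True

# Hypothetical toy data; g0 and h1 are both surjective here.
A1, D0, D1, W = ["x", "y"], ["d"], ["u", "v"], ["w0", "w1"]
g0 = {"a": "x", "b": "y"}
h1 = {"u": "d", "v": "d"}
half, zero = Fraction(1, 2), Fraction(0)
Q0 = {("a", "d", "w0"): half, ("a", "d", "w1"): zero,
      ("b", "d", "w0"): zero, ("b", "d", "w1"): half}

Q1 = build_Q1(Q0, g0, h1, A1, D1, W)
ok = compatible(Q0, Q1, g0, h1, A1, D0, W)
```

In this example $h_{1}$ sends both $u$ and $v$ to $d$ , so each unit of $Q_{0}$ mass is split evenly between the two columns of the second frame.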

Hence $Φ : C (W) \to C h u (W)$ is a surjective functor of categories, from a subcategory of $G M$ , the category of generalised models.

More functors

Given two sets $W$ and $V$ , and a function $p : W \to V$ , there is an induced functor $p : C h u (W) \to C h u (V)$ , sending $(a, d, w)$ to $(a, d, p (w))$ and sending the morphism $(g, h)$ to the morphism with the same underlying functions, $(g, h)$ .

Then by the above, we have $C (W)$ and $C (V)$ as distinct subcategories of $G M$ , with functors $Φ_{W}$ and $Φ_{V}$ sending these subcategories to $C h u (W)$ and $C h u (V)$ .

Then $p$ also induces a functor $C (W) \to C (V)$ , by sending $(a, d, w) \in E = A \times D \times W$ to $(a, d, p (w))$ . The induced $Q$ is given by $Q (a, d, v) = \sum_{w \in p^{- 1} (v)} Q (a, d, w)$ .
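The induced $Q$ is just a pushforward along $p$ in the last coordinate. A minimal sketch, with the name `pushforward` and the toy numbers being assumptions; the example $Q$ is deliberately spread over several worlds for one $(a, d)$ so that the sum is visible, which means it is not itself an object of $C (W)$ .

```python
from fractions import Fraction
from collections import defaultdict

def pushforward(Q, p):
    """Induced Q on A x D x V: sum Q(a, d, w) over all w with p(w) = v."""
    out = defaultdict(Fraction)   # missing cells start at Fraction(0)
    for (a, d, w), mass in Q.items():
        out[(a, d, p[w])] += mass
    return dict(out)

# Hypothetical toy data: p merges w0 and w1 into v0.
p = {"w0": "v0", "w1": "v0", "w2": "v1"}
Q = {("a", "d", "w0"): Fraction(1, 4),
     ("a", "d", "w1"): Fraction(1, 4),
     ("a", "d", "w2"): Fraction(1, 2)}

QV = pushforward(Q, p)
```

Cells $(a, d, v)$ with empty $p^{- 1} (v)$ do not appear in the returned dictionary; they carry mass zero.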

Note that $p$ is not only a functor $C (W) \to C (V)$ , it also acts as a morphism from any $M_{0} \in C (W)$ to $p (M_{0}) \in C (V)$ , when both are seen as elements of $G M$ . We'll designate these various morphisms by $p$ as well. As a functor, $p$ maps the relation $r = (g, h, I d_{W})$ to $p (r) = (g, h, I d_{V})$ ; seeing $p$ as a morphism, $p (r)$ is precisely $p r p^{- 1}$ , where $p^{- 1}$ is the opposite morphism to $p$ : $(a, d, v) \sim_{p^{- 1}} (a, d, w)$ iff $p (w) = v$ .

We can see that the functor $p$ commutes with the $Φ$ s:

$Φ_{V} \circ p = p \circ Φ_{W}$ .

This is probably enough exploration of the functorial properties of these spaces for one post.

An example: colours and bears

To illustrate, let's use the Cartesian frame from this post; this construction will also show how features can figure non-trivially in this construction.

Here the agent has two unrelated choices: which colour to think about (green, $G$ , or red, $R$ ) and whether to go for a walk or stay home ( $W$ or $H$ ). So $A = \{G H, G W, R H, R W\}$ . The environment is either safe or has bears: $D = \{S, B\}$ .

This gives the following frame $C_{0}$ :

$C_{0} = \begin{matrix} & \begin{matrix} S & B \end{matrix} \\ \begin{matrix} G H \\ G W \\ R H \\ R W \end{matrix} & \begin{pmatrix} w_{0} & w_{1} \\ w_{2} & w_{3} \\ w_{4} & w_{5} \\ w_{6} & w_{7} \end{pmatrix} \end{matrix}$

Of course, $w_{0}$ and $w_{4}$ only differ in the colour that the agent is thinking about (similarly for $w_{1}$ and $w_{5}$ , etc...). We could choose a $C_{1}$ frame that doesn't distinguish between these thoughts:

$C_{1} = \begin{matrix} & \begin{matrix} S & B \end{matrix} \\ \begin{matrix} G H \\ G W \\ R H \\ R W \end{matrix} & \begin{pmatrix} w_{0} & w_{1} \\ w_{2} & w_{3} \\ w_{0} & w_{1} \\ w_{2} & w_{3} \end{pmatrix} \end{matrix}$

Let $V = \{w_{0}, w_{1}, w_{2}, w_{3}\}$ . Then we can define the various sets through features; specifically, in this example, $F_{A} = \{f_{G / R}, f_{W / H}\}$ . Similarly $F_{D} = \{f_{S / B}\}$ .

Adding a definition of $F_{V} = \{f_{V}\}$ and $F_{W} = \{f_{V}, f_{G / R}\}$ , we can construct the feature-split generalised models:

$M_{0} = (F_{A} ⊔ F_{D} ⊔ F_{W}, A \times D \times W, Q_{0})$ .
$M_{1} = (F_{A} ⊔ F_{D} ⊔ F_{V}, A \times D \times V, Q_{1})$ .

The $Q_{i}$ are defined by the matrices above; if we want them to make sense as traditional probability distributions, we might require that $Q_{i} (a, d, w) = 1 / 8$ whenever it is non-zero, with $8 = | A \times D |$ the number of entries in each matrix. In that case, $Q_{i} (E_{i}) = 1$ , as required.
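The whole example is small enough to write out. The sketch below encodes the two matrices as dictionaries, takes $p : W \to V$ to be the map that forgets the colour (an assumption consistent with $w_{4}, \ldots, w_{7}$ collapsing onto $w_{0}, \ldots, w_{3}$ in the matrices), and uses the uniform $1 / 8$ choice for the $Q_{i}$ . The variable and function names are hypothetical.

```python
from fractions import Fraction

A = ["GH", "GW", "RH", "RW"]
D = ["S", "B"]

# C0: entries w0..w7 read off the first matrix, row by row.
star0 = {(a, d): f"w{2 * i + j}"
         for i, a in enumerate(A) for j, d in enumerate(D)}

# p forgets the colour: w4..w7 collapse onto w0..w3.
p = {f"w{k}": f"w{k % 4}" for k in range(8)}

# C1: the same rows and columns, with worlds pushed through p.
star1 = {ad: p[w] for ad, w in star0.items()}

def uniform_Q(star, worlds):
    """Q(a, d, w) = 1/8 on the cells picked out by the matrix, else 0."""
    n = len(star)   # |A x D| = 8
    return {(a, d, w): (Fraction(1, n) if star[(a, d)] == w else Fraction(0))
            for (a, d) in star for w in worlds}

W = [f"w{k}" for k in range(8)]
V = W[:4]
Q0 = uniform_Q(star0, W)
Q1 = uniform_Q(star1, V)
```

Both `Q0` and `Q1` then sum to 1 over their respective environments $E_{0} = A \times D \times W$ and $E_{1} = A \times D \times V$ .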

Notes on non-synonyms

Some of the terminology is repeated between the two formalisms, but doesn't mean the same things. Specifically:

Environments: for Cartesian frames, this is $D$ , the different columns of the matrix. For generalised models, this is the larger set $E = A \times D \times W$ .
Worlds: for Cartesian frames, this is $W$ , the possible values of the elements of the matrix. For generalised models, this is $\mathcal{W} = 2^{\overline{F}}$ , the set of all possible values all the features could take. At the very least, $\mathcal{W}$ contains $E = A \times D \times W$ , but it could be much larger.

AI ALIGNMENT FORUM