Compositional language for hypotheses about computations

Good idea!

Example 1

Fix some alphabet . Here's how you make an automaton that checks that the input sequence (an element of $Σ^{*}$ ) is a subsequence of some infinite periodic sequence with period $n$ . For every $k$ in $Z / n$ , let $A_{k}$ be an automaton that checks whether the symbols in the input sequences at places $i$ s.t. $i \equiv k (mod n)$ are all equal (its number of states is $O (n | Σ |)$ ). We can modify it to make a transducer $A_{k}^{'}$ that produces its unmodified input sequence if the test passes and $⊥$ if the test fails. It also produces $⊥$ when the input is $⊥$ . We then chain $A_{0}^{'}, A_{1}^{'} \dots A_{n - 1}^{'}$ and get the desired automaton. Alternatively, we can connect the $A_{k}$ in parallel and then add an automaton $B$ with $n$ boolean inputs that acts as an AND gate. $B$ is a valid multi-input automaton in our language because AND is associative and commutative (so we indeed get a functor on the product category).

Notice that the description size of this automaton in our language is polynomial in $n$ . On the other hand, a tabular description would be exponential in $n$ (the full automaton has exponentially many states). Moreover, I think that any regular expression for this language is also exponentially large.

Example 2

We only talked about describing deterministic (or probabilistic, or monadic) automata. What about nondeterministic? Here is how you can implement a nondeterministic automaton in the same language, without incurring the exponential penalty of determinization, assuming non-free categories are allowed.

Let $B$ be some category that contains an object $B$ and a morphism $b : B \to B$ s.t. $b \neq {i d}_{B}$ and $b^{2} = b$ . For example it can be the closed cartesian category freely generated by this datum (which I think is well-defined). Then, we can simulate a non-deterministic automaton $A$ on category $C$ by a deterministic transducer from $C$ to $B$ :

The state set is always the one element set (or, it can be two elements: "accept" and "reject").
For every state of $A$ , we have a variable of signature $B \to B$ . This variable is intended to hold $b$ when the state is achievable with the current input and ${i d}_{B}$ otherwise.
We use the fact that composition of endomorophisms of $B$ behaves as an "OR" operator to implement the transition rules of $A$ .

However, this trick doesn't give us nondeterministic transducers. A completely different way to get nondeterminism, which works for transducers as well, is using infra-Bayesian Knightian uncertainty to represent it. We can do via the memo monad approach, but then we get the constraint that nondeterministic transitions happen identically in e.g. identical subgraphs. This doesn't matter for ordinary automata (that process words) but it matters for tree/graph automata. If we don't want this constraint, we can probably do it in a framework that uses "infra-Markov" categories (which I don't know how to define atm, but it seems likely to be possible).

Together with Example 1, this implies that (this version of) our language is strictly more powerful than regular expressions.

Example 3

Suppose we want to take two input sequences (elements of $Σ^{*}$ ) and check them for equality. This is actually impossible without the meta-vertex, because of the commutativity properties of category products. With the meta-vertex (the result is not an automaton anymore, we can call it a "string machine"), here is how we can do it. Denote the input sequences $x$ and $y$ . We apply a transducer to $x$ that transforms it into a string diagram. Each symbol $a$ in $x$ is replaced by a transducer that checks whether the first symbol of its input is $a$ and outputs the same sequence with the first symbol removed. These transducers are chained together. In the end of the chain we add a final transducer that checks whether its input is empty. Finally, we use the meta-vertex to feed $y$ into this chain of transducers.

Notice that this requires only one level of meta: the string diagram we generate doesn't contain any meta-vertices.

Example 4

More generally, we can simulate any multi-tape automaton using a succinct string machine with one level of meta: First, there is a transducer $T$ that takes the tapes as separate inputs and simulates a single step of the multi-tape automaton. Here, moving the tape head forward corresponds to outputting a sequence with the first symbol omitted. Second, there is a transducer $P$ that takes the inputs $x_{1} \dots x_{n}$ and produces a string diagram that consists of $\sum_{i} | x_{i} |$ copies of $T$ chained together (this transducer just adds another $T$ to the chain per every input symbol). Finally, the desired string machine runs $P$ and then uses a meta-vertex to apply the diagram $P$ produces to the input.

Example 5

A string machine with unbounded meta can simulate a 1D cellular automaton, and hence any Turing machine: First, there is a transducer $T$ which simulates a single step of the automaton (this is a classical transducer, not even the stronger version we allow). Second, we already know there is an automaton $E$ that tests for equality. Third, fix some automaton $A$ whose job is to read the desired output off the final state of the cellular automaton. We modify $E$ to make a transducer $E^{'}$ , which (i) when the inputs are equal, produces a string diagram that consists only of $A$ (ii) when the inputs are different, produces this string machine. This string macine applies $T$ to the input $x$ , then applies $E^{'}$ to $x$ and $T (x)$ , then uses the meta-vertex to apply $E^{'} (x, T (x))$ to $T (x)$ ...

Except that I cheated here because the description of $E^{'}$ references a string machine that contains $E^{'}$ . However, with closed machines such quining can be implemented: instead of producing a string diagram, the transducer produces an morphism that converts transducers to string diagrams, and then we feed the same transducer composed in the same way with itself into this morphism.

We can use the same methods to talk about arbitrary describable facts, but I'm content with computable. ↩︎
The analogy is very rough, it is possible that there is some stronger compositional language that contains both as segments. ↩︎
This is obvious, but I haven't seen it in the literature. ↩︎
I don't know whether even this special case is already known under some other name in automata theory. ↩︎
I don't know whether there is a standard name for this in category theory. ↩︎ ↩︎
In full generality, I'm not sure the composition of transducers yields transducers (however, it certainly does for closed cartesian automata, see below), so this language might define a broader class of "automata" rather than just describing them. ↩︎
Instead of $S e t$ , we can use a different cartesian category for the state spaces of our automata, e.g. compact Polish spaces. But, we will continue the explanation using $S e t$ , for the sake of simplicity. ↩︎
Equivalently, $M_{0} (X)$ is the free magma on $X$ . In fact, algebras over $M_{0}$ are precisely magmas. ↩︎
Not sure what is the right term for 2-monads. ↩︎

AI ALIGNMENT FORUM
AF

AI ALIGNMENT FORUM
AF

17

Compositional language for hypotheses about computations

17

Example 1

Example 2

Example 3

Example 4

Example 5

Motivation

Deterministic Operadic Automata

Deterministic Operadic Transducers

Deterministic Cartesian Automata/Transducers

Compositional language

Monadic Cartesian Automata (MCA)

Processing DAGs with MCAs

(Monadic) Closed Cartesian Automata/Transducers