Weak HCH accesses EXP

evhub

This post is a follow-up to my “Alignment proposals and complexity classes” post. Thanks to Sam Eisenstat for helping with part of the proof here.

Previously, I proved that imitative amplification with weak HCH, approval-based amplification, and recursive reward modeling access while AI safety via market making accesses $EXP$ . At the time, I wasn't sure whether my market making proof would generalize to the others, so I just published it with the $PSPACE$ proofs instead. However, I have since become convinced that the proof does generalize—and that it generalizes for all of the proposals I mentioned—such that imitative amplification with weak HCH, approval-based amplification, and recursive reward modeling all actually access $EXP$ . This post attempts to prove that.

Updated list of proposals by complexity class

$P$ : Imitation learning (trivial)

$PSPACE$ : AI safety via debate (proof)

$EXP$ : AI safety via market making (proof), Imitative amplification with weak HCH (proof below), Approval-based amplification (proof below), Recursive reward modeling (proof below)

$NEXP$ : Debate with cross-examination (proof)

$R$ : Imitative amplification with strong HCH (proof), AI safety via market making with pointers (proof)

Proofs

Imitative amplification with weak HCH accesses $EXP$

The proof here is similar in structure to my previous proof that weak HCH accesses $PSPACE$ , so I'll only explain where this proof differs from that one.

First, since $l \in EXP$ , we know that for any $x \in X$ , $T_{l} (x)$ halts in $O (2^{poly (n)})$ steps where $n = | x |$ . Thus, we can construct a function $f_{l} (n) = c_{1} + c_{2} e^{c_{3} n^{c_{4}}}$ such that for all $x \in X$ , $T_{l} (x)$ halts in less than or equal to $f_{l} (x)$ steps by picking $c_{3}, c_{4}$ large enough that they dominate all other terms in the polynomial for all $n \in N$ . Note that $f_{l}$ is then computable in time polynomial in $n$ .

Second, let $H$ 's new strategy be as follows:

Given $p$ , let $s, x = M (p : f (| x |))$ . Then, return accept/reject based on whether $s$ is an accept or reject state (it will always be one or the other)
Given $p : 0$ , return $s_{0}, x_{0}$ where $s_{0}$ is the starting state and $x_{0}$ is the empty tape symbol.
Given $p : i$ , let $\begin{matrix} s, x = M (p : i - 1) x_{left} = M (p : i - 1 : - 1) x_{right} = M (p : i - 1 : 1) . \end{matrix}$ Then, return $s^{'}, x^{'}$ where $s^{'}$ is the next state of $T_{l}$ from state $s$ over symbol $x$ (where if we're already in an accept/reject state we just stay there) and $x^{'}$ is the new tape symbol at the head (which can be determined given $s, x, x_{left}, x_{right}$ ).
Given $p : 0 : j$ , return $x_{0}$ .
Given $p : i : j$ , let $s, x_{head} = M (p : i - 1)$ . Then, let $h$ be the amount by which $T_{l}$ on $s, x_{head}$ changes the head position by (such that $h \in {- 1, 0, 1}$ ). If $j + h = 0$ , let $s^{'}, x^{'} = M (p : i - 1)$ otherwise let $x^{'} = M (p : i - 1 : j + h)$ . Return $x^{'}$ .

Note that this strategy precisely replicates the strategy used in my proof that market making accesses $EXP$ for inputs $p : i$ and $p : i : j$ . Thus, I'll just defer to that proof for why the strategy works and is polynomial time on those inputs. Note that this is where $l \in EXP$ becomes essential, as it guarantees that $i$ and $j$ can be represented in polynomial space.

Then, the only difference between this strategy and the market making one is for the base $p$ input. On $p$ , given that $M (p : i)$ works, $M (p : f (| x |))$ will always return a state after $T_{l}$ has halted, which will always be an accept state if $T_{l}$ accepts and a reject state if it rejects. Furthermore, since $| M (p : f (| x |) | \in O (1)$ and $f$ is computable in polynomial time, the strategy for $p$ is polynomial, as desired.

Since this procedure works for all $l \in EXP$ , we get that amplification with weak HCH accesses $EXP$ , as desired.

Approval-based amplification accesses $EXP$

The proof here is almost precisely the same as my previous proof that approval-based amplification accesses $PSPACE$ with the only modification being that we need to verify that $| M (q) | + | H (q, M) |$ are still polynomial in size for the new imitative amplification strategy given above. However, the fact that $i$ and $j$ are bounded above by $t_{p}$ means that they are representable in polynomial space, as desired.

Recursive reward modeling accesses $EXP$

Again, the proof here is almost precisely the same as my previous proof that recursive reward modeling accesses $PSPACE$ with the only modification being the same as with the above proof that approval-based amplification accesses $EXP$ .

AI ALIGNMENT FORUM
AF

10

Weak HCH accesses EXP

10

Updated list of proposals by complexity class

Proofs

Imitative amplification with weak HCH accesses $EXP$

Approval-based amplification accesses $EXP$

Recursive reward modeling accesses $EXP$

10

Weak HCH accesses EXP

10

Updated list of proposals by complexity class

Proofs

Imitative amplification with weak HCH accesses EXP

Approval-based amplification accesses EXP

Recursive reward modeling accesses EXP

Imitative amplification with weak HCH accesses $EXP$

Approval-based amplification accesses $EXP$

Recursive reward modeling accesses $EXP$