Huh, what would you recommend I do to reduce my uncertainty around meta-execution (e.g. "read x", "ask about it as a top-level question", etc.)?
Is this necessarily true? It seems like this describes what Christiano calls "delegation" in his paper, but it wouldn't apply to IDA schemes with other capability amplification methods (such as the other examples in the appendix of "Capability Amplification").
I found this immensely helpful overall, thank you!
However, I'm still somewhat confused by meta-execution. Is it essentially a more sophisticated capability amplification strategy that replaces the role filled by "deliberation" in Christiano's IA paper?
Two basic questions I couldn't figure out (sorry):
Can you use a different oracle for every subquestion? If you can, how would this affect the concern Wei_Dai raises?
If we know the oracle is only optimizing for the specified objective function, are mesa-optimisers still a problem for the proposed system as a whole?
Huh, I thought that all amplification/distillation procedures were intended as a way to approximate HCH, which is itself a tree. Can you not meaningfully discuss "this amplification procedure is like an n-depth approximation of HCH at step x", for any amplification procedure?
For example, the internal structure of the distilled agent described in Christiano's paper is unlikely to look anything like a tree. However, my (potentially incorrect?) impression is that the agent's capabilities at step x are identical to an HCH tree of depth x if the underlying learning system is arbitrarily capable.
It's possible that I'm not understanding the difference between "depth", "tree-based" and "recursion" in this context.
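
To make the impression above concrete, here is a toy Python sketch of it. Everything in it is my own assumption rather than anything from Christiano's paper: the "human" can only answer a one-element sum unaided, or split a list in half and delegate both halves; amplification is plain delegation; and `distill` is the identity function, standing in for an arbitrarily capable learner. The names `human`, `hch`, `amplify`, `distill` and `ida` are all hypothetical.

```python
from typing import Callable, List, Optional

# An agent maps a question (here: a list of ints to sum) to an answer,
# or None when the question is beyond its capabilities.
Agent = Callable[[List[int]], Optional[int]]


def human(question: List[int], helper: Optional[Agent]) -> Optional[int]:
    """Toy human: answers one-element lists alone, otherwise splits the
    list in half and delegates each half to `helper` (if there is one)."""
    if len(question) == 1:
        return question[0]
    if helper is None:
        return None
    mid = len(question) // 2
    left = helper(question[:mid])
    right = helper(question[mid:])
    if left is None or right is None:
        return None
    return left + right


def hch(depth: int) -> Agent:
    """Explicit tree: a human whose subquestions go to HCH of depth - 1."""
    if depth == 0:
        return lambda q: human(q, None)
    return lambda q: human(q, hch(depth - 1))


def amplify(agent: Agent) -> Agent:
    """Delegation-style amplification: a human consulting the current agent."""
    return lambda q: human(q, agent)


def distill(agent: Agent) -> Agent:
    """ASSUMPTION: perfect distillation, so the learned agent matches its
    target exactly. A real learner would only approximate it."""
    return agent


def ida(steps: int) -> Agent:
    """Run `steps` rounds of amplify-then-distill from the unaided human."""
    agent: Agent = lambda q: human(q, None)
    for _ in range(steps):
        agent = distill(amplify(agent))
    return agent
```

Under these assumptions, `ida(x)` and `hch(x)` give the same answer on every question, even though `ida(x)` is built by a loop and never constructs a tree explicitly. That only illustrates the delegation case with perfect distillation; it says nothing about whether the same correspondence holds for other amplification methods.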