I strongly recommend @beren's "Deconfusing Direct vs Amortised Optimisation". It's a very important conceptual clarification that has changed how I think about many issues bearing on technical AI safety.
Currently, it's the most important blog post I've read this year.
This sequence (if I get around to completing it) is an attempt to draw more attention to Beren's conceptual frame and its implications for how to think about issues of alignment and agency.
This first post presents a distillation of the concept, and subsequent posts explore its implications.
Beren introduces a taxonomy categorising intelligent systems according to the kind of optimisation they are performing. I think it's more helpful to think of these as two ends of a spectrum rather than as distinct, discrete categories; sophisticated real-world intelligent systems (e.g. humans) appear to be a hybrid of the two approaches (see the sketch after the table below).
Naively, direct optimisers can be understood as computing (an approximation of) $\arg\max_x f(x)$ (or $\arg\min_x f(x)$) for a suitable objective function $f$ during inference.
Naively, amortised optimisers can be understood as evaluating a (fixed) learned function; they're not directly computing $\arg\max$ (or $\arg\min$) for any particular objective function during inference.
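To make the distinction concrete, here's a minimal toy sketch (my own illustration, not code from Beren's post). The direct optimiser searches for the argmax of the objective at inference time; the amortised optimiser is fit offline on (context, solution) pairs and answers queries with a single evaluation of the learned function. The quadratic objective, the brute-force search, and the linear fit are all simplifications chosen for brevity.

```python
import numpy as np

# Toy objective: for a given "context" c, the optimum is at x = c / 2.
def objective(x, c):
    return -(x - c / 2.0) ** 2

# --- Direct optimisation: search for argmax_x objective(x, c) at inference time.
def direct_optimiser(c, n_candidates=10_000):
    candidates = np.linspace(-10, 10, n_candidates)  # crude brute-force search
    return candidates[np.argmax(objective(candidates, c))]

# --- Amortised optimisation: fit a function approximator x* ≈ g(c) offline,
#     then answer queries with a single (cheap) evaluation.
rng = np.random.default_rng(0)
train_c = rng.uniform(-10, 10, size=1000)
train_x = np.array([direct_optimiser(c) for c in train_c])  # "expensive" labels

# A linear fit suffices for this toy problem; in practice g would be a neural net.
coeffs = np.polyfit(train_c, train_x, deg=1)

def amortised_optimiser(c):
    return np.polyval(coeffs, c)  # one forward pass, no search

print(direct_optimiser(3.0))     # ≈ 1.5, found by searching at inference time
print(amortised_optimiser(3.0))  # ≈ 1.5, read off the learned approximator
```

Note where the compute goes: the direct optimiser pays its cost per query, while the amortised optimiser pays it once, up front, in collecting data and fitting the approximator.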
| Aspect | Direct Optimisation | Amortised Optimisation |
|---|---|---|
| Problem Solving | Computes optimal responses "on the fly" | Evaluates the learned function approximator on the given input |
| Computational Approach | Searches through a solution space | Learns a function approximator |
| Runtime Cost | Higher, as it requires in-depth search for a suitable solution | Lower, as it only needs a forward pass through the function approximator |
| Scalability with Compute | Scales by expanding search depth | Scales by better approximating the posterior distribution |
| Convergence | In the limit of arbitrary compute, the system's policy converges to the $\arg\max$ of the appropriate objective function | In the limit of arbitrary compute, the system's policy converges to the best description of the training dataset |
| Performance | More favourable in "simple" domains | More favourable in "rich" domains |
| Data Efficiency | Little data needed for high performance (e.g. an MCTS agent can attain strongly superhuman performance in Chess/Go given only the rules and sufficient compute) | Requires (much) more data for high performance (e.g. an amortised agent needs to observe millions of chess games to learn skilled play) |
| Generalisation | Dependent on search depth and compute | Dependent on the learned function approximator/training dataset |
| Alignment Focus | Emphasis on safe reward function design | Emphasis on reward function and dataset design |
| Out-of-Distribution Behaviour | Can diverge arbitrarily from previous behaviour | Constrained by the learned function approximator |
| Examples | AIXI, MCTS, model-based RL | Supervised learning, model-free RL, GPT models |
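Tying back to the point that real systems sit on a spectrum rather than in discrete categories, here is a hedged sketch of a hybrid, loosely in the spirit of AlphaZero-style systems that combine a learned network with search: an amortised approximator proposes an answer, and a small amount of direct search refines it at inference time. It reuses `objective` and `amortised_optimiser` from the toy sketch above; the search radius and candidate count are arbitrary illustrative choices.

```python
import numpy as np

# Hybrid of the two ends of the spectrum: an amortised proposal initialises a
# narrow direct search, which refines the answer at inference time.
# (Reuses `objective` and `amortised_optimiser` from the sketch above.)
def hybrid_optimiser(c, search_radius=1.0, n_candidates=200):
    proposal = amortised_optimiser(c)                    # cheap amortised guess
    candidates = np.linspace(proposal - search_radius,   # local direct search
                             proposal + search_radius,
                             n_candidates)
    return candidates[np.argmax(objective(candidates, c))]

print(hybrid_optimiser(3.0))  # close to the true optimum c / 2 = 1.5
```

Spending more inference-time compute (a wider `search_radius`, more candidates) pushes the system toward the direct end of the spectrum; spending none reduces it to the purely amortised answer.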