Predicted corrigibility: pareto improvements — AI Alignment Forum