A toy model of a corrigibility problem — AI Alignment Forum