Understanding the two-head strategy for teaching ML to answer questions honestly — AI Alignment Forum