Evaluating Stability of Unreflective Alignment — AI Alignment Forum