OpenAI: Detecting misbehavior in frontier reasoning models — AI Alignment Forum