200 COP in MI: Analysing Training Dynamics — AI Alignment Forum