[Interim research report] Activation plateaus & sensitive directions in GPT2 — AI Alignment Forum