You can remove GPT2’s LayerNorm by fine-tuning for an hour — AI Alignment Forum