Analysing Adversarial Attacks with Linear Probing — AI Alignment Forum