[Paper] AI Sandbagging: Language Models can Strategically Underperform on Evaluations — AI Alignment Forum