Auditing language models for hidden objectives — AI Alignment Forum