Interpreting the Learning of Deceit — AI Alignment Forum