Understanding and controlling a maze-solving policy network — AI Alignment Forum