Reward learning summary — AI Alignment Forum