Reward functions and updating assumptions can hide a multitude of sins — AI Alignment Forum