Towards deconfusing wireheading and reward maximization — AI Alignment Forum