Comparing reward learning/reward tampering formalisms — AI Alignment Forum