Reward splintering as reverse of interpretability