Disentangling inner alignment failures — AI Alignment Forum