Trying to isolate objectives: approaches toward high-level interpretability — AI Alignment Forum