Laying the Foundations for Vision and Multimodal Mechanistic Interpretability & Open Problems — AI Alignment Forum