Interpretability with Sparse Autoencoders (Colab exercises) — AI Alignment Forum