Limitations on the Interpretability of Learned Features from Sparse Dictionary Learning — AI Alignment Forum