Decomposing the QK circuit with Bilinear Sparse Dictionary Learning — AI Alignment Forum