Can we efficiently explain model behaviors? — AI Alignment Forum