Activation space interpretability may be doomed — AI Alignment Forum