The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks
Lucius Bushnaq,
Jake Mendel,
Dan Braun,
Stefan Heimersheim,
Nicholas Goldowsky-Dill,
Kaarel Hänni,
Avery,
Joern Stoehler,
Cindy Wu,
Magdalena Wache,
Marius Hobbhahn
Apollo Research 1-year update
Marius Hobbhahn,
Lee Sharkey,
Lucius Bushnaq,
Dan Braun,
Mikita Balesni,
Jérémy Scheurer,
Nicholas Goldowsky-Dill,
Stefan Heimersheim,
Jake Mendel,
AlexMeinke,
rusheb