The typical noise on feature caused by 1 unit of activation from feature , for any pair of features , , is (derived from Johnson–Lindenstrauss lemma)
1. ... This is a worst-case scenario. I have not calculated the typical case; I expect it to be somewhat smaller, but still of the same order of magnitude.
Perhaps I'm misunderstanding your claim here, but the "typical" (i.e. RMS) inner product between two independently random unit vectors in is . So I think the ...
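The RMS claim above is easy to check numerically. Here is a minimal sketch, assuming the vectors are drawn uniformly from the unit sphere in `dim` dimensions (the symbol names are mine, since the originals were lost from this excerpt). For such vectors the squared inner product has mean exactly `1/dim`, so the RMS should come out close to `1/sqrt(dim)`.

```python
import math
import random


def random_unit_vector(dim, rng):
    # Normalising an isotropic Gaussian sample gives a uniform point on the sphere.
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def rms_inner_product(dim, n_pairs=2000, seed=0):
    # Monte Carlo estimate of the root-mean-square inner product between
    # independent random unit vectors in `dim` dimensions.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        u = random_unit_vector(dim, rng)
        w = random_unit_vector(dim, rng)
        dot = sum(a * b for a, b in zip(u, w))
        total += dot * dot
    return math.sqrt(total / n_pairs)


if __name__ == "__main__":
    for dim in (16, 64, 256):
        # Columns: dimension, empirical RMS, predicted 1/sqrt(dim).
        print(dim, rms_inner_product(dim), 1.0 / math.sqrt(dim))
```

This only checks the typical (RMS) overlap between one pair. It says nothing about the worst case over many pairs, which is what a Johnson–Lindenstrauss style bound controls.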
Nice work! I'm not sure I fully understand what the "gated-ness" is adding, i.e. what role the Heaviside step function is playing. What would happen if we did away with it? Namely, consider this setup:
Let $f$ and $\hat{x}$ be the encoder and decoder functions, as in your paper, and let $x$ be the model activation that is fed into the SAE.
The usual SAE reconstruction is $\hat{x}(f(x))$, which suffers from the shrinkage problem.
Now, introduce a new learned parameter $t \in \mathbb{R}^{n_{\text{features}}}$, and define an "expanded" reconstruction yexpand...
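For readers unfamiliar with the shrinkage problem mentioned above, here is a toy one-dimensional sketch. It is not the paper's setup: I assume a single feature with a unit-norm decoder direction, an input of magnitude `x` along that direction, and a loss of squared reconstruction error plus an L1 penalty with weight `lam` (all names are illustrative). Under those assumptions the loss-minimising activation is `max(0, x - lam/2)`, so it systematically undershoots `x`.

```python
def l1_penalized_loss(a, x, lam):
    # Squared reconstruction error plus L1 sparsity penalty on the activation a.
    return (x - a) ** 2 + lam * abs(a)


def best_activation_grid(x, lam, step=1e-3, upper=2.0):
    # Brute-force search over non-negative activations in [0, upper].
    n = int(round(upper / step))
    candidates = [i * step for i in range(n + 1)]
    return min(candidates, key=lambda a: l1_penalized_loss(a, x, lam))


def best_activation_closed_form(x, lam):
    # Setting the derivative 2(a - x) + lam to zero, clipped at a >= 0.
    return max(0.0, x - lam / 2.0)


if __name__ == "__main__":
    x, lam = 1.0, 0.2
    # Both values sit below x: the penalty pulls the activation toward zero.
    print(best_activation_grid(x, lam), best_activation_closed_form(x, lam))
```

The gap between `x` and the optimum grows with `lam`, which is why a fix that only rescales or shifts the activations after the fact is a natural thing to ask about.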