AI ALIGNMENT FORUM

Superposition

• Applied to Effects of Non-Uniform Sparsity on Superposition in Toy Models by Shreyans Jain 1mo ago
• Applied to Circuits in Superposition: Compressing many small neural networks into one by Lucius Bushnaq 2mo ago
• Applied to Toy Models of Superposition: Simplified by Hand by Axel Sorensen 3mo ago
• Applied to Superposition through Active Learning Lens by Akanksha Devkar 3mo ago
• Applied to Crafting Polysemantic Transformer Benchmarks with Known Circuits by Evan Anders 4mo ago
• Applied to Limitations on the Interpretability of Learned Features from Sparse Dictionary Learning by Tom Angsten 5mo ago
• Applied to Superposition is not "just" neuron polysemanticity by Lawrence Chan 8mo ago
• Applied to Scaling Laws and Superposition by Pavan Katta 9mo ago
• Applied to Sparse autoencoders find composed features in small toy models by Evan Anders 9mo ago
• Applied to Some costs of superposition by Linda Linsefors 10mo ago
• Applied to From Conceptual Spaces to Quantum Concepts: Formalising and Learning Structured Conceptual Models by Roman Leventov 11mo ago
• Applied to AI alignment as a translation problem by Roman Leventov 11mo ago
• Applied to Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small by Joseph Isaac Bloom 11mo ago
• Applied to Toward A Mathematical Framework for Computation in Superposition by Nina Panickssery 1y ago
• Applied to Sparse MLP Distillation by slavachalnev 1y ago
• Applied to Towards Monosemanticity: Decomposing Language Models With Dictionary Learning by duck_master 1y ago