AI ALIGNMENT FORUM
Superposition
• Applied to Effects of Non-Uniform Sparsity on Superposition in Toy Models by Shreyans Jain 1mo ago
• Applied to Circuits in Superposition: Compressing many small neural networks into one by Lucius Bushnaq 2mo ago
• Applied to Toy Models of Superposition: Simplified by Hand by Axel Sorensen 3mo ago
• Applied to Superposition through Active Learning Lens by Akanksha Devkar 3mo ago
• Applied to Crafting Polysemantic Transformer Benchmarks with Known Circuits by Evan Anders 4mo ago
• Applied to Limitations on the Interpretability of Learned Features from Sparse Dictionary Learning by Tom Angsten 5mo ago
• Applied to Superposition is not "just" neuron polysemanticity by Lawrence Chan 8mo ago
• Applied to Scaling Laws and Superposition by Pavan Katta 9mo ago
• Applied to Sparse autoencoders find composed features in small toy models by Evan Anders 9mo ago
• Applied to Some costs of superposition by Linda Linsefors 10mo ago
• Applied to From Conceptual Spaces to Quantum Concepts: Formalising and Learning Structured Conceptual Models by Roman Leventov 11mo ago
• Applied to AI alignment as a translation problem by Roman Leventov 11mo ago
• Applied to Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small by Joseph Isaac Bloom 11mo ago
• Applied to Toward A Mathematical Framework for Computation in Superposition by Nina Panickssery 1y ago
• Applied to Sparse MLP Distillation by slavachalnev 1y ago
• Applied to Towards Monosemanticity: Decomposing Language Models With Dictionary Learning by duck_master 1y ago