AI ALIGNMENT FORUM
AI Capabilities
• Applied to INTELLECT-1 Release: The First Globally Trained 10B Parameter Model by Matrice Jacobine, 4d ago
• Applied to Have we seen any "ReLU instead of sigmoid-type improvements" recently by KvmanThinking, 11d ago
• Applied to A short project on Mamba: grokking & interpretability by Alejandro Tlaie Boria, 2mo ago
• Applied to The case for a negative alignment tax by Cameron Berg, 3mo ago
• Applied to On agentic generalist models: we're essentially using existing technology the weakest and worst way you can use it by Yuli_Ban, 3mo ago
• Applied to Molecular dynamics data will be essential for the next generation of ML protein models by Lauren (often wrong), 3mo ago
• Applied to Lifelogging for Alignment & Immortality by Roland, 4mo ago
• Applied to Diffusion Guided NLP: better steering, mostly a good thing by Nathan Helm-Burger, 4mo ago
• Applied to "AI achieves silver-medal standard solving International Mathematical Olympiad problems" by Lauren (often wrong), 4mo ago
• Applied to How bad would AI progress need to be for us to think general technological progress is also bad? by Jim Buhler, 5mo ago
• Applied to [Paper] Stress-testing capability elicitation with password-locked models by Lauren (often wrong), 6mo ago
• Applied to Memorizing weak examples can elicit strong behavior out of password-locked models by Lauren (often wrong), 6mo ago
• Applied to The thing I don't understand about AGI by Jeremy Kalfus, 6mo ago
• Applied to Getting 50% (SoTA) on ARC-AGI with GPT-4o by Ryan Greenblatt, 6mo ago
• Applied to What’s the future of AI hardware? by Itay Dreyfus, 6mo ago
• Applied to [Paper] AI Sandbagging: Language Models can Strategically Underperform on Evaluations by Teun van der Weij, 6mo ago
• Applied to An Introduction to AI Sandbagging by Teun van der Weij, 7mo ago
• Applied to Addressing Accusations of Handholding by Yeshua God, 8mo ago
• Applied to Timelines to Transformative AI: an investigation by Zershaaneh Qureshi, 8mo ago