AI ALIGNMENT FORUM
Archive Recommendations
146
AGI Ruin: A List of Lethalities
Eliezer Yudkowsky
3y
144
218
Where I agree and disagree with Eliezer
Paul Christiano
3y
59
142
SolidGoldMagikarp (plus, prompt generation)
Jessica Rumbelow, mwatkins
2y
16
63
The Waluigi Effect (mega-post)
Cleo Nardo
2y
25
130
Simulators
janus
2y
90
154
Let’s think about slowing down AI
KatjaGrace
2y
3
111
What 2026 looks like
Daniel Kokotajlo
3y
29
121
Steering GPT-2-XL by adding an activation vector
Alex Turner, Monte MacDiarmid, David Udell, lisathiergart, Ulisse Mini
2y
63
97
chinchilla's wild implications
nostalgebraist
2y
13
103
What failure looks like
Paul Christiano
6y
28