AI ALIGNMENT FORUM
Charlie Griffin
Posts
Sorted by New
22 | Subversion Strategy Eval: Can language models statelessly strategize to subvert control protocols? | 24d | 0
20 | Games for AI Control | 9mo | 0
23 | Scenario Forecasting Workshop: Materials and Learnings | 1y | 1
9 | Five projects from AI Safety Hub Labs 2023 | 1y | 0
48 | Goodhart's Law in Reinforcement Learning | 2y | 5