AI ALIGNMENT FORUM
METR (org)
• Applied to METR is hiring! by Jérémy Perret 5mo ago
• Applied to ARC Evals new report: Evaluating Language-Model Agents on Realistic Autonomous Tasks by Jérémy Perret 5mo ago
• Applied to ARC Evals: Responsible Scaling Policies by Jérémy Perret 5mo ago
• Applied to Review of METR’s public evaluation protocol by Ruben Bloom 5mo ago
Ruben Bloom v1.0.0 Jul 1st 2024 GMT (+18)
Formerly ARC Evals
• Created by Ruben Bloom at 5mo