Scalable Oversight
Posts tagged Scalable Oversight
1
57
Discriminating Behaviorally Identical Classifiers: a model problem for applying interpretability to scalable oversight
Sam Marks
7mo
7
1
43
Scalable oversight as a quantitative rather than qualitative problem
Buck Shlegeris
4mo
8
1
15
Inference-Only Debate Experiments Using Math Problems
Arjun Panickssery, Abhimanyu Pallavi Sudhir, JacksonKaunismaa
3mo
0
2
13
AXRP Episode 35 - Peter Hase on LLM Beliefs and Easy-to-Hard Generalization
DanielFilan
3mo
0
1
22
On scalable oversight with weak LLMs judging strong LLMs
Zachary Kenton, Noah Siegel, janos, Jonah Brown-Cohen, Samuel Albanie, David Lindner, Rohin Shah
4mo
18