AI ALIGNMENT FORUM
Scalable Oversight
Posts tagged Scalable Oversight
1
57
Discriminating Behaviorally Identical Classifiers: a model problem for applying interpretability to scalable oversight
Sam Marks
5mo
5
1
43
Scalable oversight as a quantitative rather than qualitative problem
Buck Shlegeris
2mo
8
1
14
Inference-Only Debate Experiments Using Math Problems
Arjun Panickssery, Abhimanyu Pallavi Sudhir, JacksonKaunismaa
1mo
0
2
13
AXRP Episode 35 - Peter Hase on LLM Beliefs and Easy-to-Hard Generalization
DanielFilan
21d
0
1
21
On scalable oversight with weak LLMs judging strong LLMs
Zachary Kenton, Noah Siegel, janos, Jonah Brown-Cohen, Samuel Albanie, David Lindner, Rohin Shah
2mo
18