By Sophie Bridgers, Rishub Jain, Rory Greig, and Rohin Shah
For more details and full list of contributors, please see our paper: https://arxiv.org/abs/2510.26518
Human oversight is critical for ensuring that Artificial Intelligence (AI) models remain safe and aligned to human values. But AI systems are rapidly advancing in capabilities and are being used to complete ever more complex tasks, making it increasingly challenging for humans to verify AI outputs and provide high-quality feedback. How can we ensure that humans can continue to meaningfully evaluate AI performance? An avenue of research to tackle this problem is “Amplified Oversight” (also called “Scalable Oversight”), which aims to develop techniques to use AI to amplify humans’ abilities to oversee increasingly powerful AI systems, even if they eventually surpass human capabilities in particular domains.
With...