Debate (AI safety technique)

• Applied to Deception Chess: Game #2 by RobertM 5d ago
• Applied to AI debate: test yourself against chess 'AIs' by Richard Willis 13d ago
• Applied to Debate helps supervise human experts [Paper] by Roger Dearnaley 18d ago
• Applied to AI Safety 101 - Chapter 5.1 - Debate by Charbel-Raphael Segerie 1mo ago
• Applied to Evaluating Superhuman Models with Consistency Checks by Daniel Paleka 4mo ago
• Applied to A Proposal for AI Alignment: Using Directly Opposing Models by Arne B 7mo ago
• Applied to Empathy bandaid for immediate AI catastrophe by installgentoo 8mo ago
• Applied to [New LW Feature] "Debates" by Jim Babcock 8mo ago
• Applied to Why I’m not working on {debate, RRM, ELK, natural abstractions} by Steve Byrnes 10mo ago
• Applied to Alignment with argument-networks and assessment-predictions by Tor Økland Barstad 1y ago
• Applied to Take 9: No, RLHF/IDA/debate doesn't solve outer alignment. by Ruben Bloom 1y ago
• Applied to Notes on OpenAI’s alignment plan by RobertM 1y ago
• Applied to Questions about Value Lock-in, Paternalism, and Empowerment by Sam F. Brown 1y ago
• Applied to AI Unsafety via Non-Zero-Sum Debate by Noosphere89 1y ago