PhD student at the University of Toronto, studying machine learning and working on AI safety problems.
Reinforcement Learning in the Iterated Amplification Framework
HCH is not just Mechanical Turk
[Link] Improbable Oversight, An Attempt at Informed Oversight
[Link] Informed Oversight through Generalizing Explanations
[Link] Proposal for an Implementable Toy Model of Informed Oversight