Disentangling Perspectives On Strategy-Stealing in AI Safety
This post was written under Evan Hubinger’s direct guidance and mentorship, as part of the Stanford Existential Risks Institute ML Alignment Theory Scholars (MATS) program. Additional thanks to Ameya Prabhu and Callum McDougall for their thoughts and feedback on this post.

Introduction

I’ve seen that in various posts people...
Dec 18, 2021