Why we are excited about confession!
Boaz Barak, Gabriel Wu, Jeremy Chen, Manas Joglekar

[Linkposting from the OpenAI alignment blog, where we post more speculative/technical/informal results and thoughts on safety and alignment.]

> TL;DR We go into more detail and present some follow-up results from our paper on confessions (see the original blog post). We give...
Jan 14