Introduction How many years will pass before transformative AI is built? Three people who have thought about this question a lot are Ajeya Cotra from Open Philanthropy, Daniel Kokotajlo from OpenAI and Ege Erdil from Epoch. Despite each spending at least hundreds of hours investigating this question, they still...
I worked on my draft report on biological anchors for forecasting AI timelines mainly between ~May 2019 (three months after the release of GPT-2) and ~Jul 2020 (a month after the release of GPT-3), and posted it on LessWrong in Sep 2020 after an internal review process. At the time,...
I think that in the coming 15-30 years, the world could plausibly develop “transformative AI”: AI powerful enough to bring us into a new, qualitatively different future, via an explosion in science and technology R&D. This sort of AI could be sufficient to make this the most important century of...
ARC has published a report on Eliciting Latent Knowledge, an open problem which we believe is central to alignment. We think reading this report is the clearest way to understand what problems we are working on, how they fit into our plan for solving alignment in the worst case, and...
This post is a transcript of a discussion between Paul Christiano, Ajeya Cotra, and Eliezer Yudkowsky (with some comments from Rob Bensinger, Richard Ngo, and Carl Shulman), continuing from 1, 2, and 3. Color key: Chat by Paul and Eliezer Other chat 10.2. Prototypes, historical perspectives, and betting [Bensinger][4:25] I...
This post is a transcript of a discussion between Paul Christiano, Ajeya Cotra, and Eliezer Yudkowsky on AGI forecasting, following up on Paul and Eliezer's "Takeoff Speeds" discussion. Color key: Chat by Paul and Eliezer Chat by Ajeya Inline comments 8. September 20 conversation 8.1. Chess and Evergrande [Christiano][15:28] I...
By Ajeya Cotra Training powerful models to maximize simple metrics (such as quarterly profits) could be risky. Sufficiently intelligent models could discover strategies for maximizing these metrics in perverse and unintended ways. For example, the easiest way to maximize profits may turn out to involve stealing money, manipulating whoever keeps...