Various arguments have been made for why advanced AI systems will plausibly not have the goals their operators intended them to have (due to either outer or inner alignment failure). I would really like a distilled collection of the strongest arguments. Does anyone know if this has been done? If...
Epistemic status: lots of this involves interpreting/categorising other people’s scenarios, and could be wrong. We’d really appreciate being corrected if so. [ETA: so far, no corrections.]

TLDR: see the summary table.

In the last few years, people have proposed various AI takeover scenarios. We think this type of scenario building...
Cross-posted to the EA forum.

Summary

* In August 2020, we conducted an online survey of prominent AI safety and governance researchers. You can see a copy of the survey at this link.[1]
* We sent the survey to 135 researchers at leading AI safety/governance research organisations (including AI Impacts,...
Thanks to Jess Whittlestone, Daniel Eth, Shahar Avin, Rose Hadshar, Eliana Lorch, Alexis Carlier, Flo Dorner, Kwan Yee Ng, Lewis Hammond, Phil Trammell and Jenny Xiao for valuable conversations, feedback and other support. I am especially grateful to Jess Whittlestone for long conversations and detailed feedback on drafts, and her...