Advanced AI systems could lead to existential risks via several different pathways, some of which may not fit neatly into traditional risk forecasts. Many previous forecasts, for example the well known report by Joe Carlsmith, decompose a failure story into a conjunction of different claims, and in doing so risk missing some important dangers. ‘Safety’ and ‘Alignment’ are both now used by labs to refer to things which seem far enough from existential risk reduction that using the term ‘AI notkilleveryoneism’ instead is becoming increasingly popular among AI researchers who are particularly focused on existential risk.
This post presents a series of scenarios that we must avoid, ranked by how embarrassing it would... (read 5577 more words →)