AI ALIGNMENT FORUM
Threat Models (AI)

Edited by Quinn Dougherty, Jacob Pfau, et al.; last updated 12th Apr 2023

A threat model is a story of how a particular risk (e.g. AI) plays out.

In the case of AI risk, according to Rohin Shah, a threat model is ideally a:

Combination of a development model that says how we get AGI and a risk model that says how AGI leads to existential catastrophe.

See also

AI risk
AI Risk Concrete Stories

Posts tagged Threat Models (AI)
74 · Another (outer) alignment failure story · Paul Christiano · 4y · 25 comments
106 · What failure looks like · Paul Christiano · 6y · 28 comments
25 · Distinguishing AI takeover scenarios · Sam Clarke, Samuel Dylan Martin · 4y · 6 comments
93 · What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs) · Andrew Critch · 4y · 49 comments
24 · Vignettes Workshop (AI Impacts) · Daniel Kokotajlo · 4y · 2 comments
147 · AGI Ruin: A List of Lethalities · Eliezer Yudkowsky · 3y · 144 comments
98 · On how various plans miss the hard bits of the alignment challenge · Nate Soares · 3y · 48 comments
39 · Less Realistic Tales of Doom · Mark Xu · 4y · 0 comments
21 · Survey on AI existential risk scenarios · Sam Clarke, apc, Jonas Schuett · 4y · 2 comments
15 · Investigating AI Takeover Scenarios · Samuel Dylan Martin · 4y · 1 comment
118 · Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover · Ajeya Cotra · 3y · 34 comments
28 · Current AIs Provide Nearly No Data Relevant to AGI Alignment · Thane Ruthenis · 2y · 28 comments
31 · Rogue AGI Embodies Valuable Intellectual Property · Mark Xu, CarlShulman · 4y · 6 comments
45 · Clarifying AI X-risk · Zachary Kenton, Rohin Shah, David Lindner, Vikrant Varma, Victoria Krakovna, Mary Phuong, Ramana Kumar, Elliot Catt · 3y · 16 comments
30 · What Failure Looks Like: Distilling the Discussion · Ben Pace · 5y · 3 comments
(Showing 15 of 43 tagged posts.)