AI ALIGNMENT FORUM
Threat Models (AI)

Edited by Quinn Dougherty, Jacob Pfau, et al.; last updated 12th Apr 2023

A threat model is a story of how a particular risk (e.g. AI) plays out.

In the case of AI risk, according to Rohin Shah, a threat model is ideally a:

Combination of a development model that says how we get AGI and a risk model that says how AGI leads to existential catastrophe.

See also

AI risk
AI Risk Concrete Stories

Posts tagged Threat Models (AI)
74 · Another (outer) alignment failure story · Paul Christiano · 4y · 25 comments
106 · What failure looks like · Paul Christiano · 6y · 28 comments
25 · Distinguishing AI takeover scenarios · Sam Clarke, Samuel Dylan Martin · 4y · 6 comments
93 · What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs) · Andrew Critch · 4y · 49 comments
24 · Vignettes Workshop (AI Impacts) · Daniel Kokotajlo · 4y · 2 comments
147 · AGI Ruin: A List of Lethalities · Eliezer Yudkowsky · 3y · 144 comments
98 · On how various plans miss the hard bits of the alignment challenge · Nate Soares · 3y · 48 comments
39 · Less Realistic Tales of Doom · Mark Xu · 4y · 0 comments
21 · Survey on AI existential risk scenarios · Sam Clarke, apc, Jonas Schuett · 4y · 2 comments
15 · Investigating AI Takeover Scenarios · Samuel Dylan Martin · 4y · 1 comment
118 · Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover · Ajeya Cotra · 3y · 34 comments
28 · Current AIs Provide Nearly No Data Relevant to AGI Alignment · Thane Ruthenis · 2y · 28 comments
31 · Rogue AGI Embodies Valuable Intellectual Property · Mark Xu, CarlShulman · 4y · 6 comments
45 · Clarifying AI X-risk · Zachary Kenton, Rohin Shah, David Lindner, Vikrant Varma, Victoria Krakovna, Mary Phuong, Ramana Kumar, Elliot Catt · 3y · 16 comments
30 · What Failure Looks Like: Distilling the Discussion · Ben Pace · 5y · 3 comments
(Showing 15 of 43 tagged posts.)