AI ALIGNMENT FORUM


Threat Models (AI)

Edited by Quinn, Jacob Pfau, et al.; last updated 12th Apr 2023

A threat model is a story of how a particular risk (e.g. risk from AI) plays out.

In the AI risk case, according to Rohin Shah, a threat model is ideally a combination of a development model, which says how we get AGI, and a risk model, which says how AGI leads to existential catastrophe.

See also: AI Risk Concrete Stories.

Posts tagged Threat Models (AI)
AGI Ruin: A List of Lethalities (147 points) · Eliezer Yudkowsky · 3y · 144 comments
What failure looks like (106 points) · paulfchristiano · 6y · 28 comments
Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover (118 points) · Ajeya Cotra · 3y · 34 comments
What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs) (93 points) · Andrew_Critch · 4y · 49 comments
Deep Deceptiveness (93 points) · So8res · 2y · 16 comments
On how various plans miss the hard bits of the alignment challenge (98 points) · So8res · 3y · 48 comments
Another (outer) alignment failure story (74 points) · paulfchristiano · 4y · 25 comments
A central AI alignment problem: capabilities generalization, and the sharp left turn (74 points) · So8res · 3y · 18 comments
Without fundamental advances, misalignment and catastrophe are the default outcomes of training powerful AI (63 points) · Jeremy Gillen, peterbarnett · 2y · 0 comments
AI x-risk, approximately ordered by embarrassment (46 points) · Alex Lawsen · 2y · 1 comment
Ten Levels of AI Alignment Difficulty (44 points) · Sammy Martin · 2y · 3 comments
The Main Sources of AI Risk? (35 points) · Daniel Kokotajlo, Wei Dai · 6y · 11 comments
Clarifying AI X-risk (45 points) · zac_kenton, Rohin Shah, David Lindner, Vikrant Varma, Vika, Mary Phuong, Ramana Kumar, Elliot Catt · 3y · 16 comments
Less Realistic Tales of Doom (39 points) · Mark Xu · 4y · 0 comments
[Linkpost] Some high-level thoughts on the DeepMind alignment team's strategy (47 points) · Vika, Rohin Shah · 3y · 11 comments
(Showing 15 of 44 tagged posts)