Threat Models (AI)

Edited by Quinn, Jacob Pfau, et al., last updated 12th Apr 2023

A threat model is a story of how a particular risk (e.g., risk from AI) plays out.

In the AI risk case, according to Rohin Shah, a threat model is ideally a combination of a development model that says how we get AGI and a risk model that says how that AGI leads to existential catastrophe.

See also: AI Risk Concrete Stories.

Posts tagged Threat Models (AI)
(karma · title · author(s) · age · comments)

74 · Another (outer) alignment failure story · paulfchristiano · 4y · 25
106 · What failure looks like · paulfchristiano · 6y · 28
25 · Distinguishing AI takeover scenarios · Sam Clarke, Sammy Martin · 4y · 6
93 · What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs) · Andrew_Critch · 4y · 49
24 · Vignettes Workshop (AI Impacts) · Daniel Kokotajlo · 4y · 2
147 · AGI Ruin: A List of Lethalities · Eliezer Yudkowsky · 3y · 144
98 · On how various plans miss the hard bits of the alignment challenge · So8res · 3y · 48
39 · Less Realistic Tales of Doom · Mark Xu · 4y · 0
21 · Survey on AI existential risk scenarios · Sam Clarke, apc, Jonas Schuett · 4y · 2
15 · Investigating AI Takeover Scenarios · Sammy Martin · 4y · 1
118 · Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover · Ajeya Cotra · 3y · 34
28 · Current AIs Provide Nearly No Data Relevant to AGI Alignment · Thane Ruthenis · 2y · 28
31 · Rogue AGI Embodies Valuable Intellectual Property · Mark Xu, CarlShulman · 4y · 6
45 · Clarifying AI X-risk · zac_kenton, Rohin Shah, David Lindner, Vikrant Varma, Vika, Mary Phuong, Ramana Kumar, Elliot Catt · 3y · 16
30 · What Failure Looks Like: Distilling the Discussion · Ben Pace · 5y · 3
(Showing 15 of 44 tagged posts.)