AI Alignment Forum: Tags

Threat Models

Contributors: Quinn Dougherty, Multicore

A threat model is a story of how a particular risk (e.g. risk from AI) plays out.

In the AI risk case, according to Rohin Shah, a threat model is ideally:

Combination of a development model that says how we get AGI and a risk model that says how AGI leads to existential catastrophe.

...

Posts tagged Threat Models
Most Relevant
Another (outer) alignment failure story (Paul Christiano, 2y): 69 karma, 25 comments, relevance 6
What failure looks like (Paul Christiano, 4y): 84 karma, 27 comments, relevance 6
Distinguishing AI takeover scenarios (Sam Clarke, Samuel Dylan Martin, 2y): 25 karma, 6 comments, relevance 5
What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs) (Andrew Critch, 2y): 82 karma, 49 comments, relevance 5
Vignettes Workshop (AI Impacts) (Daniel Kokotajlo, 2y): 24 karma, 2 comments, relevance 6
Less Realistic Tales of Doom (Mark Xu, 2y): 38 karma, 0 comments, relevance 3
Survey on AI existential risk scenarios (Sam Clarke, Alexis Carlier, Jonas Schuett, 2y): 21 karma, 2 comments, relevance 4
Investigating AI Takeover Scenarios (Samuel Dylan Martin, 2y): 15 karma, 1 comment, relevance 4
Rogue AGI Embodies Valuable Intellectual Property (Mark Xu, CarlShulman, 2y): 31 karma, 6 comments, relevance 2
Clarifying AI X-risk (Zachary Kenton, Rohin Shah, David Lindner, Vikrant Varma, Victoria Krakovna, Mary Phuong, Ramana Kumar, Elliot Catt, 5mo): 39 karma, 14 comments, relevance 2
What Failure Looks Like: Distilling the Discussion (Ben Pace, 3y): 30 karma, 3 comments, relevance 2
My AGI Threat Model: Misaligned Model-Based RL Agent (Steve Byrnes, 2y): 34 karma, 24 comments, relevance 3
On how various plans miss the hard bits of the alignment challenge (Nate Soares, 8mo): 86 karma, 45 comments, relevance 1
A central AI alignment problem: capabilities generalization, and the sharp left turn (Nate Soares, 9mo): 83 karma, 18 comments, relevance 1
My Overview of the AI Alignment Landscape: A Bird's Eye View (Neel Nanda, 1y): 38 karma, 4 comments, relevance 1