AI Alignment Forum

DeepMind Alignment Team on Threat Models

Nov 25, 2022 by Victoria Krakovna

A collection of posts presenting our understanding of and opinions on alignment threat models.

DeepMind alignment team opinions on AGI ruin arguments
Victoria Krakovna
107 karma · 5mo · 5 comments

Will Capabilities Generalise More?
Ramana Kumar
45 karma · 7mo · 28 comments

Clarifying AI X-risk
Zachary Kenton, Rohin Shah, David Lindner, Vikrant Varma, Victoria Krakovna, Mary Phuong, Ramana Kumar, Elliot Catt
37 karma · 3mo · 14 comments

Threat Model Literature Review
Zachary Kenton, Rohin Shah, David Lindner, Vikrant Varma, Victoria Krakovna, Mary Phuong, Ramana Kumar, Elliot Catt
32 karma · 3mo · 3 comments

Refining the Sharp Left Turn threat model, part 1: claims and mechanisms
Victoria Krakovna, Vikrant Varma, Ramana Kumar, Mary Phuong
32 karma · 6mo · 1 comment

Refining the Sharp Left Turn threat model, part 2: applying alignment techniques
Victoria Krakovna, Vikrant Varma, Ramana Kumar, Rohin Shah
20 karma · 2mo · 4 comments