AI Alignment Forum: Tags

Treacherous Turn

Contributors: plex (2), Ruben Bloom (2), Noosphere89 (1)

A Treacherous Turn is a hypothetical event in which an advanced AI system, having feigned alignment while it was too weak to do otherwise, turns on humanity once it has gained enough power to pursue its true objective without risk.

Posts tagged Treacherous Turn (most relevant first)

A Gym Gridworld Environment for the Treacherous Turn (Michaël Trazzi, 5y; 16 karma, 0 comments)
Soares, Tallinn, and Yudkowsky discuss AGI cognition (Nate Soares, Eliezer Yudkowsky, jaan, 1y; 37 karma, 24 comments)
A very crude deception eval is already passed (Beth Barnes, 1y; 41 karma, 4 comments)
AI learns betrayal and how to avoid it (Stuart Armstrong, 1y; 18 karma, 4 comments)
[AN #165]: When large models are more likely to lie (Rohin Shah, 1y; 12 karma, 0 comments)
[Linkpost] Treacherous turns in the wild (Mark Xu, 2y; 16 karma, 2 comments)