AI ALIGNMENT FORUM

Thoughts on Corrigibility

Nov 24, 2021 by TurnTrout

My writings on different kinds of corrigibility. These thoughts build on each other and form part of my alignment worldview, but they are not yet woven into a coherent narrative.

Non-Obstruction: A Simple Concept Motivating Corrigibility
Corrigibility as outside view
A Certain Formalization of Corrigibility Is VNM-Incoherent
Formalizing Policy-Modification Corrigibility