AI ALIGNMENT FORUM

If I were a well-intentioned AI...

Jan 22, 2020 by Stuart_Armstrong

I look at how some of the major problems in AI alignment (Goodhart problems, distributional shift, mesa-optimising, etc.) look from the perspective of a well-intentioned but ignorant AI, and whether this perspective can suggest ways to improve safety.

If I were a well-intentioned AI... I: Image classifier
If I were a well-intentioned AI... II: Acting in a world
If I were a well-intentioned AI... III: Extremal Goodhart
If I were a well-intentioned AI... IV: Mesa-optimising