Value Learning

Oct 29, 2018

by rohinmshah

This is a sequence investigating the feasibility of one approach to AI alignment: value learning.

Preface to the sequence on value learning

Ambitious Value Learning

What is ambitious value learning?

The easy goal inference problem is still hard

Latent Variables and Model Mis-Specification

Future directions for ambitious value learning

Goals vs Utility Functions

Ambitious value learning aims to give the AI the correct utility function in order to avoid catastrophe. Given the difficulty of this approach, we revisit the arguments for using utility functions in the first place.

Intuitions about goal-directed behavior

Coherence arguments do not imply goal-directed behavior

Will humans build goal-directed agents?

AI safety without goal-directed behavior

Narrow Value Learning

What is narrow value learning?

Ambitious vs. narrow value learning

Human-AI Interaction

Reward uncertainty

The human side of interaction

Following human norms

Future directions for narrow value learning

Conclusion to the sequence on value learning
