Value Learning

Oct 29, 2018

by rohinmshah

This is a sequence investigating the feasibility of one approach to AI alignment: value learning.

Preface to the sequence on value learning

Ambitious Value Learning

What is ambitious value learning?
The easy goal inference problem is still hard
Humans can be assigned any values whatsoever…
Latent Variables and Model Mis-Specification
Future directions for ambitious value learning

Goals vs Utility Functions

Ambitious value learning aims to give the AI the correct utility function in order to avoid catastrophe. Given the difficulty of this approach, we revisit the arguments for using utility functions in the first place.

Intuitions about goal-directed behavior
Coherence arguments do not imply goal-directed behavior
Will humans build goal-directed agents?
AI safety without goal-directed behavior

Narrow Value Learning

What is narrow value learning?
Ambitious vs. narrow value learning
Human-AI Interaction
Reward uncertainty
The human side of interaction
Following human norms
Future directions for narrow value learning
Conclusion to the sequence on value learning