Value Learning

Oct 29, 2018

by rohinmshah

This is a sequence investigating the feasibility of one approach to AI alignment: value learning.

Preface to the Sequence on Value Learning

Ambitious Value Learning

What is ambitious value learning?

The easy goal inference problem is still hard

Humans can be assigned any values whatsoever…

Latent Variables and Model Mis-Specification

Future directions for ambitious value learning

Goals vs Utility Functions

Ambitious value learning aims to give the AI the correct utility function in order to avoid catastrophe. Given how difficult that turns out to be, we revisit the arguments for using utility functions in the first place.

Intuitions about goal-directed behavior

Coherence arguments do not imply goal-directed behavior

Will humans build goal-directed agents?

AI safety without goal-directed behavior

Narrow Value Learning

What is narrow value learning?

Ambitious vs. narrow value learning

Human-AI Interaction

Reward uncertainty

Following human norms
