AI ALIGNMENT FORUM
Guodong Zhang
Inner Alignment: Explain like I'm 12 Edition
Guodong Zhang · 4y

I'm still a bit confused about the difference between inner alignment and out-of-distribution generalization. What's the fundamental difference between the cat-classifying problem and the maze problem? Is it that, for the latter, the model itself is an optimizer? But why is that special?

What if the neural network used to solve the maze problem just learns a mapping (but doesn't do any search)? Is that still an inner-alignment problem?
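To make the distinction in the question concrete, here is a hypothetical sketch (not from the post) contrasting the two kinds of maze solvers: one that performs an explicit search over states toward a goal (optimizer-like, the case inner alignment usually worries about), and one that is just a fixed state-to-action lookup with no goal representation and no search. The maze, the `LOOKUP` table, and both policy functions are illustrative assumptions.

```python
from collections import deque

# Toy 3x3 maze: S = start, G = goal, # = wall.
MAZE = [
    "S.#",
    ".#.",
    "..G",
]

def neighbors(r, c):
    """Yield adjacent non-wall cells."""
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0]) and MAZE[nr][nc] != "#":
            yield nr, nc

def search_policy(start=(0, 0), goal=(2, 2)):
    """Optimizer-like: explicitly searches for a path, evaluating
    states against an internally represented goal (here via BFS)."""
    queue, seen = deque([(start, [start])]), {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for nxt in neighbors(*state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None

# "Just a mapping": a fixed lookup from state to next state, with no
# goal representation and no search -- analogous to a network that has
# memorized reflexive moves.
LOOKUP = {(0, 0): (1, 0), (1, 0): (2, 0), (2, 0): (2, 1), (2, 1): (2, 2)}

def mapping_policy(start=(0, 0), goal=(2, 2)):
    state, path = start, [start]
    while state != goal and state in LOOKUP:
        state = LOOKUP[state]
        path.append(state)
    return path
```

On this toy maze, both policies trace the same path, even though only the first one is doing anything that looks like optimization internally; that behavioral equivalence is exactly why the question of whether the mapping case still counts as an inner-alignment problem is interesting.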
