Prizes for ELK proposals
CitizenTen · 4y

Silly question warning. 


You think that when an AI performs a bad action (say, removing the diamond), the AI has to have knowledge that the diamond is in fact no longer there, even when the camera (falsely) shows the diamond is still there and the human confirms that it is.

You call eliciting this latent knowledge ELK.

You want the human to have access to this knowledge, since it is useful for making the decisions the human actually wants.
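(To check that I have the picture right, here's a toy sketch in Python. It's just my own illustration, not anything from the report; names like diamond_actually_present and human_simulator are made up for the example.)

from dataclasses import dataclass

@dataclass
class PredictorState:
    diamond_actually_present: bool   # the predictor's latent knowledge
    camera_shows_diamond: bool       # what the predicted video feed displays

def human_judgement(state):
    # The human only sees the camera, so a tampered feed fools them.
    return state.camera_shows_diamond

def honest_reporter(state):
    # What ELK wants: report what the predictor actually "knows".
    return state.diamond_actually_present

def human_simulator(state):
    # The failure mode: just report whatever the human would conclude anyway.
    return human_judgement(state)

# The case that matters: the diamond is gone but the feed was spoofed.
spoofed = PredictorState(diamond_actually_present=False, camera_shows_diamond=True)
print(human_judgement(spoofed))   # True  -- the human is fooled
print(honest_reporter(spoofed))   # False -- the answer we want the human to get
print(human_simulator(spoofed))   # True  -- agrees with the human on everything they can check

If I've got that right, the hard part is getting the second behavior rather than the third.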

This is hard, so you have people propose ways to do it.

And then people try to explain why that strategy wouldn't work.

Rinse and repeat.

Once you have a proposal that nobody is able to show doesn't work... profit?


Correct any misunderstandings in my basic overview above.  
