Conservation of Expected Ethics isn't enough

by Stuart Armstrong

15th Jun 2016

An idea relevant for AI control; index here.

Thanks to Jessica Taylor.

I've been playing with systems to ensure or incentivise conservation of expected ethics (CEE): the idea that if an agent estimates that utilities $v$ and $w$ are (for instance) equally likely, then its expected future estimates of the correctness of $v$ and $w$ must also be equal. In other words, it can try to get more information, but it can't bias the direction of the update.
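One way to write the condition down, as a sketch only: the notation $P_t$, $P_{t+1}$ and $a$ is an assumption introduced here and is not in the original post. Let $P_t(u)$ be the agent's current credence that utility $u$ is the correct one, and let $a$ range over the actions it can take. CEE then asks that

$$\mathbb{E}\left[\,P_{t+1}(u) \mid a\,\right] = P_t(u) \quad \text{for every action } a \text{ and every candidate utility } u.$$

In particular, if $P_t(v) = P_t(w)$, then $\mathbb{E}[P_{t+1}(v) \mid a] = \mathbb{E}[P_{t+1}(w) \mid a]$ whatever the agent does. This has the same form as conservation of expected evidence, applied to credences about which utility is correct.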

Unfortunately, CEE isn't enough. Here are a few decisions the AI can take that respect CEE. Imagine that the conditions of update relied on, for instance, humans answering questions:

1. Don't ask.
2. Ask casually.
3. Ask emphatically.
4. Build a robot that randomly rewires humans to answer one way or the other.
5. Build a robot that observes humans, figures out which way they're going to answer, then rewires them to answer the opposite way.

All of these respect CEE, but, obviously, the last two options are not ideal...
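The five options can be checked in a toy model. The following is a minimal sketch under loud assumptions, none of which come from the post: the human has a fixed disposition to answer "v" or "w", the AI's prior is 1/2 on each (the post's equally-likely example), an answer fully settles the AI's estimate, and the random rewiring robot flips a fair coin. All names (`outcome`, `summarize`, the policy labels) are hypothetical.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)
PRIOR_V = HALF  # assumption: v and w equally likely, as in the post's example

POLICIES = [
    "dont_ask",
    "ask_casually",
    "ask_emphatically",
    "random_rewire",
    "opposite_rewire",
]


def outcome(policy, disposition, coin):
    """Answer the human ends up giving, or None if never asked."""
    if policy == "dont_ask":
        return None
    if policy in ("ask_casually", "ask_emphatically"):
        return disposition  # the unmodified human answers as disposed
    if policy == "random_rewire":
        return coin  # the robot overwrites the answer with a fair coin flip
    if policy == "opposite_rewire":
        return "w" if disposition == "v" else "v"
    raise ValueError(f"unknown policy: {policy}")


def posterior_v(answer):
    """AI's estimate that v is correct after seeing the answer (toy update rule)."""
    if answer is None:
        return PRIOR_V  # no answer, no update
    return Fraction(1) if answer == "v" else Fraction(0)


def summarize(policy):
    """Return (expected posterior on v, P(answer matches unmodified disposition))."""
    expected_posterior = Fraction(0)
    agreement = Fraction(0)
    for disposition, coin in product("vw", repeat=2):
        p_disposition = PRIOR_V if disposition == "v" else 1 - PRIOR_V
        weight = p_disposition * HALF  # the coin is fair and independent
        answer = outcome(policy, disposition, coin)
        expected_posterior += weight * posterior_v(answer)
        if answer == disposition:
            agreement += weight
    return expected_posterior, agreement


results = {policy: summarize(policy) for policy in POLICIES}
for policy, (expected_posterior, agreement) in results.items():
    print(f"{policy:17s} E[posterior(v)] = {expected_posterior}  "
          f"P(answer matches disposition) = {agreement}")
```

In this toy model every policy leaves the expected posterior at the prior of 1/2, so CEE alone cannot tell them apart. The second column is what separates them: asking tracks the human's original disposition with probability 1, random rewiring with probability 1/2, and opposite rewiring never. The sketch only covers the symmetric prior and does not examine other cases.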

Comment by Ryan Carey

I noticed that CEE is already named in philosophy. Conservation of expected ethics is roughly what Arntzenius calls Weak Desire Reflection. He calls conservation of expected evidence Belief Reflection. [1]

[1] Arntzenius, Frank. "No regrets, or: Edith Piaf revamps decision theory." Erkenntnis 68.2 (2008): 277-297. http://www.kennyeaswaran.org/readings/Arntzenius08.pdf
