Please answer with yes or no, then explain your thinking step by step.
Wait, why give the answer before the reasoning? You'd probably get better performance if the model thinks step by step first and only gives the decision at the end.
[Based on conversations with Alex Flint, and also John Wentworth and Adam Shimi]
One of the design goals of the ELK proposal is to sidestep the problem of learning human values, and settle instead for learning human concepts. A system that can answer questions about human concepts allows for schemes that let us learn all the relevant information about proposed plans and decide on them ourselves, using our values.
So, we have some process in which we consider lots of possible scenarios and collect...
Note that the way Paul phrases it in that post is much clearer and more accurate:
> "I believe this concept was introduced in the context of AI by Eliezer and named by Robert Miles"
Yeah, I definitely wouldn't say I 'coined' it; I just suggested the name.
Yeah, nuclear power is a better analogy than weapons, but I think the two are linked, and the link itself may be a useful analogy, because risk/coordination is affected by the dual-use nature of some of the technologies.
One thing that makes non-proliferation difficult is that nations legitimately want nuclear facilities because they want to use nuclear power, but 'rogue states' that want to acquire nuclear weapons will also claim that this is their only goal. How do we know who really just wants power plants?
And power generation comes with its ow...
Makes sense. It seems to flow from the fact that the source code is in some sense allowed to use concepts like 'Me' or 'I', which refer to the agent itself. So both agents have source code which says "Maximise the resources that I have control over", but in Agent 1 this translates to the utility function "Maximise the resources that Agent 1 has control over", and in Agent 2 this translates to the different utility function "Maximise the resources that Agent 2 has control over".
So this source code thing that...
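To make the indexical point concrete, here is a minimal Python sketch. Everything in it is a hypothetical illustration rather than anything from the original discussion: the `Agent` class, the `utility` method, and the `world` dictionary of resource holdings are all assumed names. The idea is just that both agents run the same code ("maximise the resources that I control"), but because `self` resolves to a different agent in each case, the resulting utility functions over world states differ.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """Hypothetical agent whose source code refers to itself via `self`."""

    name: str

    def utility(self, holdings: dict[str, float]) -> float:
        # "The resources that I have control over": the indexical "I"
        # is resolved through self.name, so the same code scores the
        # same world state differently for different agents.
        return holdings[self.name]


agent_1 = Agent("Agent 1")
agent_2 = Agent("Agent 2")

# One shared world state (made-up numbers, for illustration only).
world = {"Agent 1": 10.0, "Agent 2": 3.0}

u1 = agent_1.utility(world)  # resources controlled by Agent 1
u2 = agent_2.utility(world)  # resources controlled by Agent 2
```

Both agents share the identical `utility` definition, yet they disagree about how good `world` is, which is the sense in which identical source code need not mean identical utility functions.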