Oracle AI - AI Alignment Forum

An Oracle AI is a regularly proposed solution to the problem of developing Friendly AI. It is conceptualized as a superintelligent system that is designed only to answer questions and has no ability to act in the world. The name was first suggested by Nick Bostrom.

Safety

Armstrong, Sandberg and Bostrom discuss Oracle AI safety at length in their paper Thinking inside the box: using and controlling an Oracle AI. The authors propose a conceptual architecture for such a system, review how one might measure its accuracy, and shed some light on human-level considerations. Among the latter are physical security (also known as “boxing”), the potential for the Oracle to use social engineering, which questions may be safe to ask, utility indifference, and many other factors.

The paper’s conclusion that Oracles (or AI boxing concepts in general) are safer than fully free agent AIs has raised much debate. In Dreams of Friendliness, Eliezer Yudkowsky gives an informal argument that all oracles will be agent-like. It rests on the premise that anything considered "intelligent" must be an optimization process. The Oracle has many possible things to believe, and very few of them are correct. Believing the correct thing therefore means that some method was used to select the correct belief from the many incorrect ones. By definition, this is an optimization process whose goal is to select correct beliefs. Once a goal is established, one can imagine things the optimization process might do in pursuit of it. For instance, the Oracle might be able to answer a certain question more accurately and easily if it killed all life on Earth or turned all matter outside the box into computronium....
