AI ALIGNMENT FORUM

wizzwizz4

Comments

Contest: $1,000 for good questions to ask to an Oracle AI
wizzwizz4 · 6y · 20

This seems incredibly dangerous if the Oracle has any ulterior motives whatsoever. Even – nay, especially – the ulterior motive of future Oracles being better able to shape reality so that it better resembles their provided answers.

So, how can we prevent this? Is it possible to produce an AI with its utility function as its sole goal, to the detriment of other things that might… increase utility, but indirectly? (Is there a way to add a "status quo" bonus that won't hideously backfire, or something?)
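
To make the "status quo" idea concrete, here is a minimal sketch of the naive version, assuming the bonus is implemented as an impact-style penalty on deviation from a baseline world state. All of the names below (task_utility, world_distance, baseline) are illustrative assumptions, not any particular published proposal:

```python
# Minimal sketch of a naive "status quo bonus": task utility minus a penalty
# for how far the world state has drifted from a baseline. The names here
# (task_utility, world_distance, baseline) are illustrative assumptions only.

def regularized_utility(state, baseline, task_utility, world_distance,
                        penalty_weight=1.0):
    """Return task utility at `state` minus a deviation-from-baseline penalty."""
    return task_utility(state) - penalty_weight * world_distance(state, baseline)


# Toy usage: states are numbers, utility prefers larger states, distance is |delta|.
score = regularized_utility(
    state=5.0,
    baseline=0.0,
    task_utility=lambda s: s,
    world_distance=lambda s, b: abs(s - b),
    penalty_weight=0.5,
)  # 5.0 - 0.5 * 5.0 == 2.5
```

Even in this toy form the backfire risk is visible: the penalty is charged for any drift from the baseline regardless of who caused it, so an agent maximizing this score is also rewarded for preventing changes other agents would have made, and "freeze the world in place" is itself a drastic intervention.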
