AI ALIGNMENT FORUM
Categorizing failures as “outer” or “inner” misalignment is often confused
Jai · 3y

The second scenario omits the details about continuing to create and submit pull requests after takeover, and instead just refers to human farms. Since it doesn't explicitly say that the system is still optimizing for the original objective criteria, and instead just refers to world domination, it appears to be inner misalignment (i.e., the system is no longer aligned with the original optimizer). Did the original posing of this question specify that the system in scenario 2 still maximizes pull requests after world domination?
