
Paperclip maximizer

Eliezer Yudkowsky, v1.6.0, Mar 4th 2017 GMT (+82/-63)
Eliezer Yudkowsky, v1.5.0, Dec 19th 2015 GMT (+183/-54)
alexei, v1.4.0, Dec 16th 2015 GMT (+10/-10)
alexei, v1.3.0, Oct 15th 2015 GMT
Eliezer Yudkowsky, v1.2.0, Sep 29th 2015 GMT (+5)
Eliezer Yudkowsky, v1.1.0, Sep 29th 2015 GMT (+18)
Eliezer Yudkowsky, v1.0.0, Jul 17th 2015 GMT (+1255)

An expected paperclip maximizer is an agent that outputs the action it believes will lead to the greatest number of paperclips existing. Or in more detail, its utility function is linear in the number of paperclips times the number of seconds that each paperclip lasts, over the lifetime of the universe. See http://wiki.lesswrong.com/wiki/Paperclip_maximizer.
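
As an illustration only, here is a minimal Python sketch of that utility function, assuming a universe history can be summarized as a count of existing paperclips for each second; the function name and data format are invented for this example, not part of the original article.

    # Illustrative sketch only: utility that is linear in paperclip-seconds.
    # A "history" is assumed to be a sequence of paperclip counts, one entry
    # per second, over the whole lifetime of the universe.
    def paperclip_utility(history):
        """Return total paperclip-seconds: each paperclip-second adds 1 util."""
        return sum(history)

    # Two paperclips for three seconds, then three paperclips for two seconds:
    print(paperclip_utility([2, 2, 2, 3, 3]))  # 12 paperclip-seconds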

Some key ideas that the notion of an expected paperclip maximizer illustrates:

  • A self-modifying paperclip maximizer does not change its own utility function to something other than 'paperclips', since this would be expected to lead to fewer paperclips existing.
  • A paperclip maximizer instrumentally prefers the standard convergent instrumental strategies - it will seek access to matter, energy, and negentropy in order to make paperclips; try to build efficient technology for colonizing the galaxies to transform into paperclips; do whatever science is necessary to gain the knowledge to build such technology optimally; etcetera.
  • "The AI does not hate you, nor does it love you, and you are made of atoms it can use for something else."

The agent may be a bounded maximizer rather than an objective maximizer without changing the key ideas; the core premise is just that, given actions A and B where the paperclip maximizer has evaluated the consequences of both actions, the paperclip maximizer always prefers the action that it expects to lead to more paperclips.
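
As a toy sketch of that core premise (again, not from the original article), action selection can be written as a comparison of predicted paperclip counts; the world model below is a made-up stand-in for whatever predictor the agent actually uses.

    # Illustrative sketch only: among evaluated actions, always prefer the one
    # expected to lead to more paperclips. `expected_paperclips` stands in for
    # the agent's predictive world model; it is an assumption of this example.
    def choose_action(actions, expected_paperclips):
        """Return the action with the greatest expected number of paperclips."""
        return max(actions, key=expected_paperclips)

    # Hypothetical world model: keeping the utility function set to paperclips
    # is itself the action expected to yield the most paperclips (compare the
    # first bullet point above).
    model = {
        "do nothing": 0,
        "make paperclips by hand": 1_000,
        "keep utility function set to paperclips": 10**9,
    }
    print(choose_action(model, model.get))  # "keep utility function set to paperclips"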