Nonperson Predicate

A Nonperson Predicate is a theorized test that can definitively identify some computational structures as not being people. Specifically, it is a predicate that returns 1 for all people and returns either 0 or 1 for nonpeople. If it returns 1, the structure may or may not be a person; if it returns 0, the structure is definitely not a person. In other words, any time at least one trusted nonperson predicate returns 0, we know we can run that program without creating a person. (The impossibility of perfectly distinguishing people and nonpeople is a trivial consequence of Rice's Theorem, which in turn follows from the undecidability of the halting problem.)
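The one-sided guarantee can be written out formally. The following is only a sketch; the symbols $S$, $P$, $N$, and $N_i$ are notation introduced here for illustration and do not come from the original posts.

```latex
Let $S$ be the set of computational structures and $P \subseteq S$ the set of persons.
A nonperson predicate is a function $N : S \to \{0, 1\}$ such that
\[
  x \in P \implies N(x) = 1,
  \qquad\text{equivalently}\qquad
  N(x) = 0 \implies x \notin P .
\]
Given trusted nonperson predicates $N_1, \dots, N_k$, the combined test
\[
  N(x) = \min_{1 \le i \le k} N_i(x)
\]
is again a nonperson predicate: if $x \in P$ then every $N_i(x) = 1$, so $N(x) = 1$.
```

Note that nothing is required of $N$ on nonpersons, so the constant function $N(x) = 1$ trivially qualifies; a useful predicate is one that also returns 0 on many nonpersons.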

The need for such a test arises from the possibility that when an Artificial General Intelligence predicts a person's actions, it may develop a model of them so complete that the model itself qualifies as a person (though not necessarily the same person). As the AGI investigates possibilities, these simulated people might be subjected to a large number of unpleasant situations. With a trusted nonperson predicate, either the AGI's designers or the AGI itself could ensure that no actual people are created.

Any practical implementation would likely consist of a large number of nonperson predicates of increasing complexity. For most nonpersons, an early predicate will quickly return 0, indicating that the structure is not a person, and the test concludes. Although any number of predicates may be tried before the test declares that something is not a person, it is crucial that no predicate in the test ever claims that a person is not a person. Since unclassifiable cases are unavoidable in principle, it is preferable that the AGI err on the side of treating possible persons as persons.
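The layered test described above might be sketched roughly as follows. This is a minimal illustration, not a proposed implementation: the function `run_cascade` and the two toy predicates are hypothetical names invented here, and the toy predicates are placeholders that say nothing real about personhood. The only point is the control flow: stop at the first 0, and otherwise default to 1.

```python
from typing import Callable, Iterable

# A predicate returns 0 ("definitely not a person") or 1 ("may or may not be a person").
Predicate = Callable[[object], int]


def run_cascade(structure: object, predicates: Iterable[Predicate]) -> int:
    """Apply predicates in order (cheapest first) and stop at the first 0.

    Returns 0 only if some trusted predicate returned 0. If every predicate
    returns 1, or there are no predicates, the result is 1: the structure is
    treated as a possible person.
    """
    for predicate in predicates:
        if predicate(structure) == 0:
            return 0
    return 1


# Hypothetical placeholder predicates, for illustration only.
def is_empty(structure: object) -> int:
    # ASSUMPTION: an empty description is taken to be a nonperson.
    return 0 if structure == "" else 1


def is_tiny(structure: object) -> int:
    # ASSUMPTION: the length threshold 8 is arbitrary and purely illustrative.
    return 0 if isinstance(structure, str) and len(structure) < 8 else 1


cheap_to_expensive = [is_empty, is_tiny]
verdict_small = run_cascade("abc", cheap_to_expensive)       # some predicate returns 0
verdict_large = run_cascade("x" * 1000, cheap_to_expensive)  # unclassified, so 1
```

Defaulting to 1 when no predicate fires reflects the preference stated above: an unclassifiable structure is handled as though it might be a person.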

Blog Posts

Nonperson Predicates by Eliezer Yudkowsky
Computational Hazards by Alex Altair

