Reflective decision theory is a term occasionally used for a decision theory that does not cause an agent to regret having used it. Such regret would be a reflective inconsistency, as seen in a causal decision theorist who regrets being unable to achieve the optimal outcome in Newcomb's problem.
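
As a rough illustration of that regret (a minimal sketch, not part of the original article, assuming a predictor with a fixed accuracy and the standard payoff amounts), the expected winnings of a one-boxer and a two-boxer in Newcomb's problem can be compared directly:

```python
# Hypothetical sketch: expected payoffs in Newcomb's problem, assuming a
# predictor with accuracy p. Box A always holds $1,000; box B holds
# $1,000,000 only if the predictor anticipated one-boxing.

def expected_payoff(one_box: bool, predictor_accuracy: float = 0.99) -> float:
    """Expected winnings given the agent's policy and the predictor's accuracy."""
    p = predictor_accuracy
    if one_box:
        # Predictor correctly foresees one-boxing with probability p,
        # in which case box B contains the $1,000,000.
        return p * 1_000_000
    else:
        # Predictor correctly foresees two-boxing with probability p,
        # so box B is usually empty; the agent always keeps box A's $1,000.
        return 1_000 + (1 - p) * 1_000_000

if __name__ == "__main__":
    print("one-boxer expects:", expected_payoff(one_box=True))   # ~$990,000
    print("two-boxer expects:", expected_payoff(one_box=False))  # ~$11,000
```

An agent whose decision procedure commits it to two-boxing predictably ends up with the smaller payoff, which is the sense in which it regrets having used that procedure.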

Many hypothesized AGIs are expected to be powerful specifically because they can access their own source code and self-modify. Because such an AGI could change its decision algorithm when facing a situation like Newcomb's problem, a reflectively consistent decision theory is needed to understand the AGI's behavior. In particular, reflective consistency would be needed to ensure that an AGI preserves a Friendly value system throughout its self-modifications.

For the reasons above, this is a topic of interest to SIAI's research team. Proposed solutions include Eliezer Yudkowsky's Timeless Decision Theory.

See also