Produced as part of the SERI ML Alignment Theory Scholars Program - Summer 2023 Cohort. My thanks to Eric Chen, Elliott Thornley, and John Wentworth for invaluable discussion and comments on earlier drafts. All errors are mine.

This article presents a few theorems about the invulnerability of agents with incomplete preferences. Elliott Thornley’s (2023) proposed approach to the AI shutdown problem relies on these preferential gaps, but John Wentworth and David Lorell have argued that they make agents play strictly dominated strategies.[1] I claim this is false.

Summary

Suppose there exists a formal description of an agent that willingly shuts down when a certain button is pressed. Elliott Thornley’s (2023) Incomplete Preference Proposal aims to offer such a description. It’s plausible that, for it to constitute a promising approach to solving the AI shutdown problem, this description also needs to (i) permit the agent to be broadly capable and (ii) assure us that the agent will remain willing to shut down as time passes. This article formally derives a set of sufficient conditions for an agent with incomplete preferences to satisfy properties relevant to (i) and (ii).

A seemingly relevant condition for an agent to be capably goal-directed is that it avoids sequences of actions that foreseeably leave it worse off.[2] I will say that an agent satisfying this condition is invulnerable. This is related to two intuitive conditions. The weaker one is unexploitability: that the agent cannot be forcibly money pumped (i.e., compelled by its preferences to sure loss). The stronger condition is opportunism: that the agent never accepts sure losses or forgoes sure gains.[3]

To achieve invulnerability, I propose a dynamic choice rule for agents with incomplete preferences. This rule, Dynamic Strong Maximality (DSM), requires that the agent consider the available plans that are acceptable at the time of choosing and, among these, pick any plan that wasn’t previously strictly dominated by any other such plan. I prove in section 2 that DSM-based backward induction is sufficient for invulnerability, even under uncertainty.

Having shown that incompleteness does not imply that agents will pursue dominated strategies, I consider the issue of whether DSM leads agents to act as if their preferences were complete. Section 3 begins with a conceptual argument suggesting that DSM-based choice under uncertainty will not, even behaviourally, effectively alter the agent’s preferences over time. This argument does not apply when the agent is unaware of the structure of its decision tree, so I provide some formal results for these cases which bound the extent to which preferences can de facto be completed.

These results show that there will always be sets of options with respect to which the agent never completes its preferences. This holds no matter how many choices it faces. In particular, if no new options appear in the decision tree, then no amount of completion will occur; and if new options do appear, the amount of completion is permanently bounded above by the number of mutually incomparable options. These results apply naturally to cases in which agents are unaware of the state space, but readers sceptical of the earlier conceptual argument can re-purpose them to make analogous claims in standard cases of certainty and uncertainty. Therefore, imposing DSM as a choice rule can get us invulnerability without sacrificing incompleteness, even in the limit.

1.  Incompleteness and Choice

The aim of this brief section is to show that the results that follow in section 2 do not require transitivity. Some question the requirement of transitivity when preferences are incomplete (cf. Bradley 2015, p. 3), but if that doesn’t apply to you, a quick skim of this section will provide enough context for the rest.

1.1.  Suzumura Consistency

Requiring that preferences be transitive may require that they be complete. To see this, notice that the weak preference relation $\succsim$ on a set of prospects $\Omega$ is transitive just in case, for all $\alpha, \beta, \gamma \in \Omega$, $\alpha \succsim \beta$ and $\beta \succsim \gamma$ implies $\alpha \succsim \gamma$. Suppose an agent does weakly prefer $\alpha$ to $\beta$ and $\beta$ to $\gamma$ but has no preference between $\alpha$ and $\gamma$. Then transitivity is violated. Suzumura (1976) proposes a weakening of transitivity for agents with incomplete preferences, which allows for such preferences while preserving some desirable properties. We will say that a weak preference relation is strongly acyclic just in case

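    (Strong Acyclicity)   $\alpha_1 \succsim \alpha_2, \ldots, \alpha_{n-1} \succsim \alpha_n$ implies $\alpha_n \not\succ \alpha_1$, for all $\alpha_1, \ldots, \alpha_n \in \Omega$.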

We'll say that an agent whose preferences satisfy this property is Suzumura consistent. Bossert and Suzumura (2010) show that such an agent has some noteworthy features:

  1. Strong acyclicity rules out cycles containing at least one strict preference. This will make Suzumura consistent agents invulnerable to (forcing) money pumps.
  2. Strong acyclicity is necessary and sufficient for the existence of a complete and transitive extension of the agent’s preference relation.
  3. Any preference relation that is both strongly acyclic and complete is transitive.

Preferences that are incomplete may also be intransitive. Whether or not transitivity is a rationality condition, strong acyclicity is weaker but preserves some desirable properties. Below I will mostly just assume strong acyclicity. But since transitivity implies strong acyclicity, all the (sufficiency) results likewise apply to agents with transitive preferences.
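To make the condition concrete, here is a minimal sketch of a strong acyclicity check. It assumes a finite set of prospects and represents weak preference as a set of ordered pairs; the function names and the two example relations are mine and purely illustrative.

```python
from itertools import product


def transitive_closure(weak):
    """Transitive closure of a set of pairs (a, b), each read as 'a is weakly preferred to b'."""
    closure = set(weak)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so that the closure can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def strictly_prefers(weak, a, b):
    """Strict preference: a is weakly preferred to b, but not vice versa."""
    return (a, b) in weak and (b, a) not in weak


def is_suzumura_consistent(weak):
    """Strong acyclicity: no chain of weak preferences from a to b while b is strictly preferred to a."""
    return not any(strictly_prefers(weak, b, a) for a, b in transitive_closure(weak))


# Intransitive but strongly acyclic: x over y and y over z, with x and z unranked.
gappy = {("x", "y"), ("y", "z")}
# A cycle x over y over z over x in which every link is strict.
cyclic = {("x", "y"), ("y", "z"), ("z", "x")}

print(is_suzumura_consistent(gappy))   # True
print(is_suzumura_consistent(cyclic))  # False
```

The first relation violates transitivity yet passes the check, which is the sense in which strong acyclicity is the weaker requirement.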

1.2.  Strong Maximality

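Bradley (2015) proposes a choice rule for Suzumura consistent agents which differs from the standard condition, Maximality, for agents with incomplete preferences.[4] This rule, Strong Maximality, effectively asks the agent to eliminate dominated alternatives in the following way: eliminate any options that you strictly disprefer to any others, then if you are indifferent between any remaining options and any eliminated ones, eliminate those as well. To state the rule formally, we define a sequence $\langle C^{\tau}(A) \rangle_{\tau \geq 0}$ satisfying

$C^{0}(A) = \{\alpha \in A : \nexists \beta \in A \text{ with } \beta \succ \alpha\}$ and $C^{\tau}(A) = \{\alpha \in C^{\tau-1}(A) : \nexists \beta \in A \setminus C^{\tau-1}(A) \text{ with } \alpha \sim \beta\}$ whenever $\tau \geq 1$

for any nonempty set of prospects $A$. We can then state the rule as

    (Strong Maximality)   $C(A) = \bigcap_{\tau \geq 0} C^{\tau}(A)$.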

Let’s see intuitively what this rule captures. Suppose $\alpha \succ \beta$ and $\beta \sim \gamma$ but $\alpha \bowtie \gamma$.[5] The traditional maximality rule implies that the choice set from $\{\alpha, \beta, \gamma\}$ is $\{\alpha, \gamma\}$. But strong maximality simply outputs $\{\alpha\}$. This is intuitive: don’t pick an option that’s just as bad as an option you dislike. And, more importantly, Theorem 2 of Bradley (2015) shows that Suzumura consistency is both necessary and sufficient for decisive, strongly maximal choice.[6]
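The elimination procedure can be sketched in a few lines. This is an illustration of the rule as described in prose above, not Bradley's own formulation. It assumes finitely many options and encodes weak preference as ordered pairs; the example has one option strictly preferred to a second, the second indifferent to a third, and the first and third incomparable. All names are mine.

```python
def strict(weak, a, b):
    """a is strictly preferred to b."""
    return (a, b) in weak and (b, a) not in weak


def indifferent(weak, a, b):
    """a and b are distinct and each is weakly preferred to the other."""
    return a != b and (a, b) in weak and (b, a) in weak


def maximal(options, weak):
    """Ordinary maximality: keep every option that nothing strictly beats."""
    return {a for a in options if not any(strict(weak, b, a) for b in options)}


def strongly_maximal(options, weak):
    """Maximal options, minus any that are indifferent to an eliminated option."""
    options = set(options)
    survivors = maximal(options, weak)
    while True:
        eliminated = options - survivors
        refined = {a for a in survivors
                   if not any(indifferent(weak, a, b) for b in eliminated)}
        if refined == survivors:  # nothing more to eliminate
            return survivors
        survivors = refined


# alpha beats beta, beta ties with gamma, alpha and gamma are incomparable.
weak = {("alpha", "beta"), ("beta", "gamma"), ("gamma", "beta")}
options = {"alpha", "beta", "gamma"}

print(sorted(maximal(options, weak)))           # ['alpha', 'gamma']
print(sorted(strongly_maximal(options, weak)))  # ['alpha']
```

The loop stops once a pass removes nothing, which for a finite option set plays the role of taking the limit of the sequence.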

2.  Uncertain Dynamic Choice

In this section I prove some theorems about the performance of agents with incomplete preferences in dynamic settings. I will amend strong maximality slightly to adapt it to dynamic choice, and show that this is sufficient to guarantee the invulnerability of these agents, broadly construed. I will say that an agent is invulnerable iff it is both unexploitable and opportunistic.

An agent is unexploitable just in case it is immune to all forcing money pumps. These are sequences of decisions through which an agent is compelled by their own preferences to sure loss. An agent is opportunistic just in case it is immune to all non-forcing money pumps. These are situations in which sure loss (or missed gain) is merely permissible, according to an agent’s preferences.

A few aspects of the broad approach are worth flagging. Normative decision theory has not settled on an ideal set of norms to govern dynamic choice. I will therefore provide results with respect to each of the following three dynamic choice principles: naivety, sophistication, and resoluteness. (More details below.) Agents will in general behave differently depending on which principle they follow. So, evaluating the behaviour resulting from an agent’s preferences should, at least initially, be done separately for each dynamic choice principle.

2.1.  Framework

The notation going forward comes from an edited version of Hammond's (1988) canonical construction of Bayesian decision trees. I will only describe the framework briefly and refer interested readers to Rothfus (2020a), section 1.6, for discussion.

Definition 1  A decision tree is an eight-tuple  where

  1.  is a finite set of nodes partitioned into , and .
  2.  is the set of choice nodes. Here agents can pick the node’s immediate successor.
  3.  is the set of natural nodes. Agents have credences over their possible realisations.
  4.  is the set of terminal nodes. These determine the outcome of a trajectory.
  5.  is the immediate successor function.
  6.  is the initial node.
  7.  assigns the set of states that remain possible once a node is reached.
  8.  is the consequence of reaching terminal node .

Definition 2  A set of plans available at node  of tree , denoted , contains the propositions and continuations consistent with the agent being at that node. Formally,

  1.  if .
  2.  if .
  3.  if .[7]

I will begin the discussion below with dynamic choice under certainty, but to set up the more general results, I will now lay out the framework for uncertain choice as well and specify later what is assumed. To start, it will be useful to have a representation theorem for our incomplete agents.

Imprecise Bayesianism (Bradley 2017, Theorem 37)   Let  be a non-trivial preference relation on a complete and atomless Boolean algebra of prospects , which has a minimal coherent extension on  that is both continuous and impartial.[8] Then there exists a unique rationalising state of mind, , on  containing all pairs of probability and desirability measures jointly consistent with these preferences, in the sense that for all ,

    .

This theorem vindicates the claim that the state of mind of a broad class of rational agents with incomplete preferences can be represented by a set of pairs of probability and desirability functions. Although this theorem applies to imprecise credences, I’ll work with the special case of precise beliefs throughout. This will simplify the analysis. I’ll therefore use the following notation going forward: .

Next, I define some conditions that will be invoked in some of the derivations.

Definition 3  Material Planning: For a plan to specify choices across various contingencies, I formalise a planning conditional as follows. Let  be a natural node and  a possible realisation of it. Here, a plan

    

assigns a chosen continuation, , to each possible realisation of . When preferences are complete, a plan at a natural node is then evaluated as follows:

    .

This is a natural extension of Jeffrey’s equation. And when preferences are incomplete:

     iff .

This makes use of Bradley’s representation theorem for Imprecise Bayesianism.
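As a rough illustration of how plans at a natural node might be compared, here is a sketch under two simplifying assumptions of mine: a plan's value under a given desirability function is the probability-weighted sum of the desirabilities of its chosen continuations, and one plan is weakly preferred to another just in case every desirability function in the representing set agrees. The prospects, probabilities and numbers are made up for the example.

```python
def expected_value(plan, prob, value):
    """Probability-weighted desirability of a plan mapping realisations to prospects."""
    return sum(prob[s] * value[plan[s]] for s in prob)


def weakly_prefers(p, q, prob, values):
    """p is weakly preferred to q iff every desirability function ranks p at least as high."""
    return all(expected_value(p, prob, v) >= expected_value(q, prob, v)
               for v in values)


# Precise beliefs over two realisations of a natural node (hypothetical numbers).
prob = {"s1": 0.5, "s2": 0.5}
# Two desirability functions that agree on A versus A- but disagree about B.
values = [
    {"A": 2, "A-": 1, "B": 0},
    {"A": 2, "A-": 1, "B": 3},
]

p = {"s1": "A", "s2": "B"}
q = {"s1": "A-", "s2": "B"}
r = {"s1": "B", "s2": "B"}

print(weakly_prefers(p, q, prob, values))  # True
print(weakly_prefers(q, p, prob, values))  # False
print(weakly_prefers(p, r, prob, values))  # False
print(weakly_prefers(r, p, prob, values))  # False
```

Plans `p` and `r` come out incomparable because the two desirability functions disagree about them, which is how a set-valued representation yields preferential gaps between plans.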

Definition 4  Preference Stability (Incomplete): For all nodes  and  and plans  where , we have .

Definition 5  Plan-Independent Probabilities (PIP): For any decision tree  in which  is a natural node,  is a realisation of , and .

The results can now be stated. I will include proofs of the central claims in the main text; others are relegated to the appendix. I suggest scrolling past them if you aren’t particularly surprised by the result.

2.2.  Myopia

Let’s begin with the simplest case of exploitation. We can stay silent on which dynamic choice principle to employ here: even if our agent is myopic, it will never be vulnerable to forcing money pumps under certainty. This follows immediately from Suzumura consistency since this guarantees that the agent never has a strict preference in a cycle of weak preferences.

Proposition 1   Suzumura consistent agents myopically applying strong maximality are unexploitable under certainty. [Proof.]

Two points are worth noting about the simple proof. First, it relies on strict preferences. This is because, if a money pump must go through a strongly maximal choice set with multiple elements (due to indifference or appropriate incomparability), it is necessarily non-forcing. That's the topic of the next section. Second, the proof doesn't rely on foresight. Myopia is an intentionally weak assumption that lets us show that no knowledge of the future is required for Suzumura consistent agents to avoid exploitation via these forcing money pumps.

2.3.  Naive Choice

Although Suzumura consistent agents using strong maximality can't be forced into money pumps, concern remains. Such an agent might still incidentally do worse by their own lights. That is, it remains vulnerable to non-forcing money pumps and thereby fails to be opportunistic. The single-souring money pump below is a purported example of this.

Figure 1. Adapted from Gustafsson (2022, Figure 9).

The agent’s preferences satisfy $A \succ A^-$ and $B \bowtie A, A^-$. Suppose that it’s myopic and uses strong maximality at each node. The agent begins with $A$ at node 0 (if we let ‘down’ be the default). It is permitted, though not compelled, to go ‘up’ at node 0 instead (since $B$ will become available), but also to go ‘up’ upon arrival at node 1 (since $A^- \bowtie B$). Suppose it in fact goes ‘up’ at both node 0 and at node 1. This would leave it with $A^-$, which is strictly worse than what it began with. This money pump is ‘non-forcing’ because the agent’s preferences are also consistent with avoiding it.
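The pump can be simulated with a small sketch. I am assuming the tree has the shape described in the text: at each choice node, ‘down’ keeps the prospect the agent currently has and ends the tree, while ‘up’ swaps it for the next prospect on offer. The labels `A`, `A-` and `B` are placeholder names for the three prospects, and modelling myopia as a pairwise comparison between the current holding and the next offer is my own simplification.

```python
def strict(weak, a, b):
    return (a, b) in weak and (b, a) not in weak


def indifferent(weak, a, b):
    return a != b and (a, b) in weak and (b, a) in weak


def strongly_maximal(options, weak):
    """Undominated options, minus any that are indifferent to an eliminated option."""
    options = set(options)
    survivors = {a for a in options if not any(strict(weak, b, a) for b in options)}
    while True:
        eliminated = options - survivors
        refined = {a for a in survivors
                   if not any(indifferent(weak, a, b) for b in eliminated)}
        if refined == survivors:
            return survivors
        survivors = refined


# A is strictly preferred to A-; B is incomparable to both.
WEAK = {("A", "A-")}


def myopic_outcomes(holding, offers):
    """Every prospect a myopic agent may end up with, given a sequence of swap offers."""
    if not offers:
        return {holding}
    offer, rest = offers[0], offers[1:]
    permitted = strongly_maximal({holding, offer}, WEAK)
    outcomes = set()
    if holding in permitted:  # 'down': keep the current prospect, tree ends
        outcomes.add(holding)
    if offer in permitted:    # 'up': swap and face the next offer
        outcomes |= myopic_outcomes(offer, rest)
    return outcomes


print(sorted(myopic_outcomes("A", ["B", "A-"])))  # ['A', 'A-', 'B']
```

The soured prospect appears among the reachable outcomes even though no single step goes against the agent's preferences, which is what makes the pump non-forcing.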

The agent need not be myopic, however. It can plan ahead.[9] To achieve opportunism for Suzumura consistent agents, I propose a choice rule which I’ll dub Dynamic Strong Maximality (DSM). DSM states that a plan $p$ is permissible at node $n$ just in case (a) it is strongly maximal at $n$ and (b) no other such plan was previously more choiceworthy than $p$.

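(Dynamic Strong Maximality)   Writing $\Omega(T)(n)$ for the set of plans available at node $n$ of tree $T$, and $n_0$ for the initial node, $p \in D(\Omega(T)(n))$ iff

   (a)   $p \in C(\Omega(T)(n))$ and

   (b)   there is no $q \in C(\Omega(T)(n))$ such that $q \in C(\Omega(T)(n_0))$ while $p \notin C(\Omega(T)(n_0))$.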

DSM is a slight refinement of strong maximality. Condition (b) simply offers a partial tie-breaking rule whenever the agent faces multiple choiceworthy prospects. So, importantly, it never asks the agent to pick counter-preferentially. An agent following naive choice with DSM will, at each node, look ahead in the decision tree, select their favourite trajectory using DSM, and embark on it. It can continually re-evaluate its plans using naive-DSM as time progresses and, as the following result establishes, the agent will thereby never end up with something worse than what it began with.

Proposition 2 (Strong Dynamic Consistency Under Certainty via Naivety)   Let  be an arbitrary tree where  is a choice node, , and  is consistent with . Then  iff . [Proof.]

Intuitively, this means that (i) if the agent now considers a trajectory acceptable, it will continue to do so as time passes, and (ii) if it at any future point considers some plan continuation acceptable, its past self would agree. It follows immediately that all and only the strongly maximal terminal nodes are reachable by agents choosing naively using DSM (derived as Corollary 1). This gives us opportunism: the agent will never pick a plan that’s strictly dominated by some other available plan.
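Here is a minimal sketch of naive choice with DSM under certainty, run on a single-souring tree. It rests on assumptions of mine: plans are identified with the terminal prospects they reach, ‘previously’ in condition (b) is read as ‘at the initial node’, and the tree is the one described in section 2.3 (at node 0, ‘down’ gives `A` and ‘up’ leads to node 1; at node 1, ‘down’ gives `B` and ‘up’ gives `A-`). The labels and function names are illustrative.

```python
def strict(weak, a, b):
    return (a, b) in weak and (b, a) not in weak


def indifferent(weak, a, b):
    return a != b and (a, b) in weak and (b, a) in weak


def strongly_maximal(options, weak):
    """Undominated options, minus any that are indifferent to an eliminated option."""
    options = set(options)
    survivors = {a for a in options if not any(strict(weak, b, a) for b in options)}
    while True:
        eliminated = options - survivors
        refined = {a for a in survivors
                   if not any(indifferent(weak, a, b) for b in eliminated)}
        if refined == survivors:
            return survivors
        survivors = refined


def leaves(node):
    """Terminal prospects reachable from a node (a string is a terminal node)."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node.values()))


def dsm(node, root, weak):
    """(a) strongly maximal now; (b) not passed over by a plan that was already maximal at the root."""
    now = strongly_maximal(leaves(node), weak)
    before = strongly_maximal(leaves(root), weak)
    previously_favoured = now & before
    # If any currently maximal plan was maximal at the outset, only those stay permissible.
    return previously_favoured if previously_favoured else now


def naive_outcomes(node, root, weak):
    """Every terminal prospect reachable by re-applying DSM at each choice node."""
    if isinstance(node, str):
        return {node}
    permitted = dsm(node, root, weak)
    outcomes = set()
    for child in node.values():
        if leaves(child) & permitted:  # this move is consistent with a permissible plan
            outcomes |= naive_outcomes(child, root, weak)
    return outcomes


WEAK = {("A", "A-")}  # A strictly preferred to A-; B incomparable to both
TREE = {"down": "A", "up": {"down": "B", "up": "A-"}}

print(sorted(dsm(TREE["up"], TREE, WEAK)))         # ['B']
print(sorted(naive_outcomes(TREE, TREE, WEAK)))    # ['A', 'B']
print(sorted(strongly_maximal(leaves(TREE), WEAK)))  # ['A', 'B']
```

At node 1 plain strong maximality would permit either remaining prospect, but condition (b) rules out the soured one because the other was already strongly maximal at the outset. In this tree the reachable outcomes coincide with the strongly maximal terminal nodes.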

Under certainty, this result is unsurprising.[10] What is less obvious is whether this also holds under uncertainty. I will say that a Bayesian decision tree exhibits PIP-uncertainty just in case the probability of any given event does not depend on what the agent plans to do after the event has occurred. We can now state the next result.

Proposition 3 (Strong Dynamic Consistency Under PIP-Uncertainty via Naivety)  
Let node  be non-terminal in decision tree , and plan  be consistent with . Assume Material Planning, Preference Stability, and Plan-Independent Probabilities. Then  iff .

Proof.  Lemma 3 establishes that  implies . To prove the converse, suppose that . Node  was either a choice node or a natural node. If it was a choice node, then it follows immediately from Proposition 2 that .

Now let  be a natural node. By Lemma 2,  under coherent extension. So by Theorem 37 of Bradley (2017),

     for all .

Let  denote the continuation selected by plan  upon reaching choice node . Thus

     for all .

By Preference Stability,

     for all . (1)

Notice that this implies that

     for all . (2)

And by Plan-Independent Probabilities, this is equivalent to

     for all . (3)

By Material Planning,  holds iff

     for all .

Using (2)-(3) this is equivalent to

for some  and all . (4)

Formally,  omits description of  for all . So, we can let  at all remaining realisations . Then (4) reduces to

. (The probability is nonzero as  is realised by assumption.)

This holds via (1), so  and, by Lemma 2, 

We have thereby shown that a naive DSM agent is strongly dynamically consistent in a very broad class of cases. Although these results are restricted to PIP-uncertainty, the same restriction applies to agents with complete preferences. It is simply a consequence of the forward-looking nature of naive choice. In the next section, I will drop the PIP restriction by employing a different dynamic choice principle.

2.4.  Sophisticated Choice

Sophistication is the standard principle in dynamic choice theory. It achieves dynamic consistency in standard cases by using backward induction to achieve the best feasible outcome. (This is also the method by which subgame perfect equilibria are found in dynamic games of perfect information.) However, backward induction is undefined whenever the prospects being compared are either incomparable or of equal value. Rabinowicz (1995) proposed a ‘splitting procedure’ to address this, and Asheim (1997) used a similar approach to refine subgame perfection in games.

According to this procedure, whenever there is a tie between multiple prospects, the agent will split the sequential decision problem into parts, where each part assumes that a particular prospect was chosen at that node. The agent then compares each part’s solution and picks its favourite as the solution to the grand decision problem. Intuitively, these agents will follow a plan as long as they lack “a positive reason for deviation” (Rabinowicz 1997). In other words, the agent will consider all the plans that would not make it locally act against its own preferences and, among those plans, proceed to pick the one with the best ex-ante outcome.

In the case of incomplete preferences, however, it turns out that strongly maximal sophisticated choice with splitting will not suffice to guarantee opportunism. The reason behind this is that strong maximality does not satisfy Set Expansion, and that backward induction makes local comparisons.[11]

Proposition 4   Strongly maximal sophisticated choice with splitting does not guarantee opportunism for Suzumura consistent agents. [Proof.]

DSM, however, will suffice.

Proposition 5 (Strong Dynamic Consistency Under Certainty via Sophistication)
Assume certainty and Suzumura consistency. Then DSM-based backward induction with splitting reaches a strongly maximal terminal node. [Proof.]

More importantly, we can guarantee this property under uncertainty even without PIP.

Proposition 6 (Strong Dynamic Consistency Under Uncertainty via Sophistication)
Let  be a Bayesian decision tree in which  is a non-terminal node,  is a successor, and  is consistent with . Assume Material Planning and Preference Stability. Then with DSM-based backward induction (DSM-BI),  is permissible at  iff  is permissible at .

Proof.  We begin by formalising DSM-BI. For tree , let  denote its terminal nodes,  its choice nodes, and  its natural nodes. Let  denote the set of plans available at  that are consistent with some associated plan continuation in . We then define the tree’s permissible set of plans, , recursively:

  1.  for .
  2.  where  , for .
  3.  for .

The permissible set of plans is then given by . Notice that this is the set of plans consistent with backward induction using DSM, and that splitting is implicit since each pruning of the decision tree is set-valued.

Now suppose that  for some .

Then  , where .

So necessarily . By construction of , we know that .

And since , we have .

Next, suppose that  for some . Then  . So, for any particular . Therefore  as needed.

Having established , we proceed to the converse. ()

Notice that the only nodes reachable via sophisticated choice are those that are consistent with some plan  which was initially permissible. Therefore, for any  that is reached, there must be some , consistent with  where , which satisfies .

Suppose  for some reachable . Whether  is a choice or a natural node, we know by construction of (2) and (3) that

    .

Therefore, if  is reached, then the permissible plans at  must be continuations of plans that were permissible at . Hence .

Together with (), this establishes that  iff 

This lets us achieve strong dynamic consistency even in cases where the probability of an event depends on what the agent plans to do after the event has occurred. An example of such a decision problem is Sequential Transparent Newcomb, as devised by Bryan Skyrms and Gerard Rothfus (cf. Rothfus 2020b). So, even in this very general setting, our agent’s incomplete preferences aren’t hindering its opportunism.

2.5.  Resolute Choice

A final canonical principle of dynamic choice is resoluteness as introduced by McClennen (1990). An informal version of this principle is often discussed in the AI alignment community under the name ‘updatelessness’. Briefly, a resolute agent will, first, pick its favourite trajectory now under the assumption that this plan will continue to be implemented as the agent moves through the decision tree. And, second, the agent will continually implement that plan, even if this makes it locally choose counter-preferentially at some future node.[12] It respects its ex-ante commitments.

Under certainty or uncertainty, this is the easiest principle with which to guarantee the invulnerability of agents with incomplete preferences (relative to those with complete preferences). By construction, the agent never implements a plan that was suboptimal from an earlier perspective. I will therefore omit formal derivations of dynamic consistency for resolute choosers.[13]

3.  The Trammelling Concern

A possible objection to the relevance of the results above is what I’ll call the Trammelling Concern. According to this objection, agents with incomplete preferences who adopt DSM as a dynamic choice rule will, in certain sequential choice problems, or over the course of sufficiently many and varied ones, eventually converge in behaviour to an agent with complete preferences using Optimality as a choice rule. This would be worrying since complete preferences and optimal choice resemble the kind of consequentialism that precludes Thornley-style corrigibility.

This section aims to quell such worries. I begin with a taxonomy of our agent’s possible doxastic attitudes towards decision trees. First, the set of non-terminal nodes of a given tree will either contain only choice nodes, or it will contain some natural nodes. Second, the structure of a given tree will either be entirely known to the agent, or it will not. If it is not known, the agent will either have beliefs about all its possible structures, or its credence function will not be defined over some possible tree structures. We will refer to these three cases as certainty, uncertainty, and unawareness about structure, respectively.[14] Finally, at least under unawareness, the possible tree structures may or may not include terminal nodes (prospects) that are absent from the tree structure currently being considered by the agent. I will say the prospects are new if so and fixed if not.

Table 1.

This table summarises the situations our agent might find itself in and its cells indicate whether trammelling can occur in each case. This is based on the arguments and results below. The conclusions for (1)-(2) are based on conceptual considerations in section 3.1. Cases (3)-(4) are not discussed explicitly since uncertain tree structures can simply be represented as certain tree structures with natural nodes. The conclusions for (5) and (7) are based on formal results in section 3.2. For the sake of brevity, the effects of natural nodes under unawareness are left for later work.

3.1.  Aware Choice

This section was co-authored with Eric Chen.

Let’s begin with an important distinction. As Bradley (2015) puts it, there’s a difference between "the choices that are permissible given the agent’s preferences, those that are mandatory and those that she actually makes" (p. 1). If an agent is indifferent between two options, for example, then it can be the case that (i) both are permissible, (ii) neither is mandatory, and (iii) a particular one is in fact chosen. One aspect of choice between incomparables that we need to preserve is that all were permissible ex-ante. The fact that one is in fact chosen, ex-post, is immaterial.

To see what this implies, consider first the case of certainty. Here we proved that DSM will result in only ex-ante strongly maximal plans being implemented (Proposition 2; Corollary 1; Proposition 5). Now consider the following toy case (Example 1). A Suzumura consistent agent can’t compare  and . It’s offered a choice between the two. Suppose it chooses . Does this constrain the agent to never pick  in future choices between  and ?

No. In fact, it’s a bit unclear what this would mean formally. Once we expand the decision tree to include future choices between  and  in some branches, everything boils down to the fact that all plans through this tree will result in either  or . And so any plan will be permissible. What DSM constrains is which terminal nodes are reached; not how the agent gets there. Let’s see how far this reasoning can get us.

Example 2   A Suzumura consistent agent’s preferences satisfy  and . It’s offered a choice between going ‘down’ for  or going ‘up’ to face a choice between  and . It happens to go ‘up’. Does this constrain the agent not to pick  over ?

This is a case of an agent ending up on some trajectory that is only consistent with one of its coherent extensions ('completions'). One might worry that, at this point, we can treat the agent as its unique extension, and that this denies us any kind of corrigible behaviour. But there is a subtle error here. It is indeed the case that, once the agent has reached a node consistent with only one extension, we can predict how it will act going forward via its uniquely completed preferences. But this is not worrisome. To see why, let’s look at two possibilities: one where the agent is still moving through the tree, and another where it has arrived at a terminal node.

First, if it is still progressing through the tree, then it is merely moving towards the node it picked at the start, by a process we are happy with. In Example 2, option  was never strongly maximal in comparison to  and . We knew, even before the agent went ‘up’, that it wouldn’t pick . So the agent is no more trammelled once it has gone ‘up’ than it was at the beginning.

For a more concrete case, picture a hungry and thirsty mule. For whatever reason, this particular mule is unable to compare a bale of hay to a bucket of water. Each one is then placed five metres from the mule in different directions. Now, even if the mule picks arbitrarily, we will still be able to predict which option it will end up at once we see the direction in which the mule is walking. But this is clearly not problematic. We wanted the mule to pick its plan arbitrarily, and it did.

Second, if the agent has reached the end of the tree, can we predict how it would act when presented with a new tree? No; further decisions just constitute a continuation of the tree. And this is just the case we described above. If it did not expect the continuation, then we are simply dealing with a Bayesian tree with natural nodes. In the case of PIP-uncertainty, we saw that DSM will once again only result in the ex-ante strongly maximal plans being implemented (Proposition 3).[15] Here it's useful to recall that plans are formalised using conditionals:

    , where  and .

That is, plans specify what path to take at every contingency. The agent selects plans, not merely prospects, so no trammelling can occur that was not already baked into the initial choice of plans. The general observation is that, if many different plans are acceptable at the outset, DSM will permit the agent to follow through on any one of these; the ex-post behaviour will “look like” it strictly prefers one option, but this gets close to conflating what options are mandatory versus actually chosen. Ex-post behavioural trajectories do not uniquely determine the ex-ante permissibility of plans.

3.2.  Unaware Choice

The conceptual argument above does not apply straightforwardly to the case of unawareness. The agent may not even have considered the possibility of a certain prospect that it ends up facing, so we cannot simply appeal to ex-ante permissibility. This section provides some results on trammelling in this context. But, first, given the argument about Bayesian decision trees under certain structure, one may ask: what, substantively, is the difference between the realisation of a natural node and awareness growth (if all else is equal)? Would the behaviour of our agent just coincide? I don't think so.

Example 3   Consider two cases. Again, preferences satisfy  and .

Figure 2.

Up top, we have a Bayesian tree. On the bottom, we have awareness growth, where each tree represents a different awareness context. In both, let’s suppose our agent went ‘up’ at node 0. In the Bayesian case, let’s also suppose the natural node (1) resolves such that the agent reaches node 2. In both diagrams, the agent ends up having to pick  or .

The permissible plans at the outset of the Bayesian tree are those that lead to  or to  (since  is strictly dominated by ). But in the bottom case, it seems that our agent faces two distinct decision problems: one between  and ; one between  and . In the first problem, both options are permissible. And likewise in the second. So, the difference between these situations is that, in the latter case, our agent faces a new decision problem.

In the Bayesian case, our agent might end up at a future node in which the only available options are incomparable but nevertheless reliably picks one. But the reason for this is that another node, also available at the moment the agent picked its plan, was strictly preferred to one of these. The agent’s decision problem was between , and . The arbitrary choice was between  and . This is as it should be; there is no trammelling.

However, in the case of unawareness, the agent initially chose between  and  and now it must choose between  and . It never got to compare  to  to . It compared  to  and, separately,  to . Therefore, if it reliably picks  over , for the sake of opportunism, one could see this as a form of trammelling. I now turn to this issue.

For brevity, I will focus on cases 5 and 7 from the taxonomy above. In these worlds, all decision trees have choice nodes and terminal nodes, but they lack natural nodes. And our agent is unaware of some possible continuations of the tree it is facing. That is, it’s unaware of (places no credence on) the possibility that a terminal node is actually another choice node. But once it in fact gets to such a node, it will realise that it wasn’t terminal. There are then two ways this could play out: either the set of available prospects expands when the tree grows, or it doesn’t. Let’s consider these in turn. As we’ll see, the most salient case is under a certain form of expansion (section 3.2.3., Example 6).

3.2.1.  Fixed Prospects (No Trammelling)

Here I’ll show that when prospects are fixed, opportunism and non-trammelling are preserved. We begin with an illustrative example.

Example 4   The agent’s preferences still satisfy  and .

Figure 3.

The agent initially sees the top tree. DSM lets it pick  or , arbitrarily. Suppose it in fact goes for . Then its awareness grows and the tree now looks like the bottom one. It can stay put (go ‘down’ to ) or move ‘up’ and choose between  and . DSM still says to pick, arbitrarily, any path leading to  or to . Suppose it goes for . There—no trammelling.

Here’s the lesson. Our agent will embark on a path that’ll lead to a node that is DSM with respect to the initial tree. When its awareness grows, that option will remain available in the new tree. By Set Contraction (Lemma 4) this option will also remain DSM in the new tree. So, picking it will still satisfy opportunism. And regarding trammelling, it can still pick entirely arbitrarily between all initially-DSM nodes. DSM constrains which prospects are reached; not how to get there. We can state this as follows.

Proposition 7   Assume no natural nodes. Then, DSM-based choice under awareness growth will remain arbitrary whenever the set of available prospects is fixed. [Proof.]
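The role of Set Contraction in this lesson can be illustrated with a minimal sketch. It is not the DSM rule itself: it uses plain maximality as a stand-in for strong maximality (an assumption that is only harmless when strict preference is transitive), and the prospect labels and the `STRICT` relation are hypothetical, chosen purely for illustration. The check confirms that anything chosen from a larger menu stays choosable in every smaller menu that still contains it, which is the property the fixed-prospects argument relies on.

```python
from itertools import combinations

# Hypothetical strict preferences: "A+" beats "A", "B+" beats "B";
# every other pair is left incomparable. This relation is (vacuously) transitive.
STRICT = {("A+", "A"), ("B+", "B")}


def strictly_prefers(x, y):
    """True iff x is strictly preferred to y under the illustrative relation."""
    return (x, y) in STRICT


def maximal(options, prefers):
    """Options not strictly dominated by any other option in the same menu."""
    return {x for x in options if not any(prefers(y, x) for y in options)}


def satisfies_set_contraction(universe, prefers):
    """Check: for all A within B within universe, x in maximal(B) and x in A
    implies x in maximal(A)."""
    items = sorted(universe)
    menus = [set(c) for r in range(1, len(items) + 1)
             for c in combinations(items, r)]
    for big in menus:
        chosen_big = maximal(big, prefers)
        for small in menus:
            if small <= big:
                for x in chosen_big & small:
                    if x not in maximal(small, prefers):
                        return False
    return True


UNIVERSE = {"A", "A+", "B", "B+"}
contraction_holds = satisfies_set_contraction(UNIVERSE, strictly_prefers)
```

In this toy setting, a node that was choosable in the initial tree stays choosable after the menu shrinks to the prospects still reachable, so the agent's choice among initially acceptable prospects is not narrowed.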

3.2.2.  New Prospects (No Trammelling)

Now consider the case where the set of available prospects expands with the tree. It's of course possible that our agent gets to a terminal node that isn’t strongly maximal according to the new tree. This could happen if it gets to an initially-acceptable terminal node and then realises that a now-inaccessible branch would have let it access a prospect that’s strictly preferred to all those in the initial tree.[16] But, importantly, this applies to agents with complete preferences too. It’s simply an unfortunate feature of the environment and the agent’s epistemic state. Complete preferences don’t help here.

But there is a class of cases in which completeness, at first glance, seems to help.

Example 5   The agent’s preferences satisfy  and  (unlike before).

Figure 4.

The agent sees the tree on the left. DSM lets it pick  or , arbitrarily. Suppose it in fact goes for . Now its awareness grows and the tree looks like the one on the right. It can stay put (go ‘down’ to ) or move ‘up’ to get .

Now, none of the available options are DSM in the full tree, but the agent still has to pick one. This, on its own, is not problematic since it could just be an unfortunate aspect of the situation (as dismissed above). But in this particular case, because an agent with complete preferences would satisfy transitivity, it would never go for  in the first place. This kind of issue wouldn’t occur if our agent’s incomplete preferences were transitive, rather than just strongly acyclic.

So let’s impose transitivity. Then our agent wouldn’t pick  in the first place. In general it will now remain opportunistic in the same way that its complete counterparts are. That’s because it would never initially choose arbitrarily between options that all its coherent extensions have a strict preference between. So it would never pick in a way that makes it realise that it made a foreseeable mistake once more options come into view. To operationalise this, I'll say that an agent facing deterministic decision trees fails to be opportunistic under awareness growth iff

  1. The agent fails to reach a strongly maximal prospect in the grand tree, and
  2. A strongly maximal prospect in the grand tree was available in the initial tree.

This agent is opportunistic otherwise.[17] We can now state the associated result.

Proposition 8  Assume transitive preferences and no natural nodes. Then, under all possible forms of awareness growth, naive choice via DSM is opportunistic. [Proof.]
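The two-part failure condition above can be sketched as a simple check. This is a sketch under stated assumptions: preferences are taken to be transitive, so plain maximality is used in place of strong maximality (as the appendix proof of Proposition 8 also does), and the function names, prospect labels, and the `STRICT` relation are hypothetical.

```python
# Hypothetical strict preferences: "Z" beats "X"; "Y" is incomparable to both.
STRICT = {("Z", "X")}


def strictly_prefers(x, y):
    """True iff x is strictly preferred to y under the illustrative relation."""
    return (x, y) in STRICT


def maximal(options, prefers):
    """Options not strictly dominated by any other option in the same set."""
    return {x for x in options if not any(prefers(y, x) for y in options)}


def fails_opportunism(reached, initial_prospects, grand_prospects, prefers):
    """Return True iff both failure conditions hold:
    1. the reached prospect is not maximal in the grand tree, and
    2. some prospect that is maximal in the grand tree was available initially.
    """
    best_in_grand = maximal(grand_prospects, prefers)
    missed_best = reached not in best_in_grand
    best_was_available = bool(best_in_grand & set(initial_prospects))
    return missed_best and best_was_available


# The agent initially sees X and Y; awareness growth later reveals Z.
INITIAL = {"X", "Y"}
GRAND = {"X", "Y", "Z"}
ends_at_x = fails_opportunism("X", INITIAL, GRAND, strictly_prefers)
ends_at_y = fails_opportunism("Y", INITIAL, GRAND, strictly_prefers)
# If only X was ever available initially, ending at X is merely unlucky.
unlucky = fails_opportunism("X", {"X"}, {"X", "Z"}, strictly_prefers)
```

The third call shows why condition 2 matters: ending up at a dominated prospect only counts as a failure of opportunism when a better-or-unbeaten prospect was already on the table in the initial tree.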

That gives us opportunism, but what about non-trammelling? Will our agent ever have to pick one option over another despite them being incomparable? Yes, in some cases.

3.2.3.  New Prospects (Bounded Trammelling)

Example 6   The agent’s preferences satisfy  and .

Figure 4 (repeated).

It faces the same trees as in the previous example. DSM first lets it pick  or , arbitrarily. Suppose it goes for . Now its awareness grows and the tree looks like the one on the right. It can stay put (go ‘down’ to ) or move ‘up’ to get . Since only  is DSM in the global tree, our agent might reliably pick  over , despite them being incomparable. Trammelled.

This is a special property of unawareness. Thankfully, however, we can bound the degree of trammelling. And I claim that this can be done in a rather satisfying way. To do this, let’s formally define an extension of DSM that will give us opportunism under awareness growth. Let  denote the initial tree, with initial node , and  the tree available after awareness growth, with initial node . And let  denote the appending of  on .

(Global-DSM)    iff

    (i)   and

    (ii)  .

Notice that G-DSM doesn’t look at what happens at other previously-terminal nodes after awareness grows. These are unreachable and would leave the choice rule undefined in some cases. (This holds with complete preferences too.) This rule naturally extends opportunism to these cases; it says to never pick something if you knew that you could’ve had something better before. We can now state some results.

Proposition 9 (Non-Trammelling I)   Assume transitive preferences and no natural nodes. Then, under all possible forms of awareness growth, globally-DSM naive choice will remain arbitrary whenever the available prospects are (i) mutually incomparable and (ii) not strictly dominated by any prospect. [Proof.]

That gives us a sufficient condition for arbitrary choice. To find out how often this will be satisfied, we define ‘comparability classes’ as subsets of prospects within which all are comparable and between which none are. A comparability-based partitioning of the prospects is possible when comparability is transitive.[18] The following results then follow.

Proposition 10 (Bounded Trammelling I)   Assume that comparability and preference are transitive and that there are no natural nodes. Then, under all possible forms of awareness growth, there are at least as many prospects between which globally-DSM naive choice is guaranteed to be arbitrary as there are comparability classes.

Proof.  We first partition the (possibly uncountable) set of prospects  into (possibly uncountable) subsets of (possibly uncountable) mutually comparable prospects.

Given transitivity,  is an equivalence relation on . Then, for any , we construct an equivalence class: . Call this set the comparability class of .

This lets  form a partition of .

We identify the class-optimal prospects as follows: .

Let . Suppose, for contradiction, that .

Since  is a choice function,  for any set . Then .

Then, by G-DSM, there exists some  such that

    (i’)   or

    (ii’)  .

Given transitivity, and since all elements of  are mutually incomparable or indifferent, condition (i’) does not hold. Therefore .

By transitivity, this implies that  for some . (1)

But since  for some , we know that . (2)

By Set Contraction,  contradicts (1)-(2).

This establishes that  whenever .

The set of all prospects satisfying this is of size , as needed. 

Corollary 2 (Bounded Trammelling II)  Assume that comparability and preference are transitive and that there are no natural nodes. Then, whenever  class-optimal prospects are available, choice will be arbitrary between at least  prospects. [Proof.]

3.3.  Discussion

It seems we can’t guarantee non-trammelling in general and between all prospects. But we don’t need to guarantee this for all prospects to guarantee it for some, even under awareness growth. Indeed, as we’ve now shown, there are always prospects with respect to which the agent never gets trammelled, no matter how many choices it faces. In fact, whenever the tree expansion does not bring about new prospects, trammelling will never occur (Proposition 7). And even when it does, trammelling is bounded: choice is guaranteed to remain arbitrary between at least as many prospects as there are comparability classes (Proposition 10).

And it’s intuitive why this would be: we’re simply picking out the best prospects in each class. For instance, suppose prospects were representable as pairs  that are comparable iff the -values are the same, and then preferred to the extent that  is large. Then here’s the process: for each value of , identify the options that maximise . Put all of these in a set. Then choice between any options in that set will always remain arbitrary; never trammelled.
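The process just described can be written out as a minimal sketch. The representation of prospects as `(group, score)` tuples, and the names `group`, `score`, and `PROSPECTS`, are my own illustrative assumptions: the first coordinate fixes the comparability class and the second ranks prospects within it.

```python
from collections import defaultdict


def comparable(p, q):
    """Two prospects are comparable iff they share the same first coordinate."""
    return p[0] == q[0]


def strictly_prefers(p, q):
    """Within a class, the prospect with the larger second coordinate is preferred."""
    return comparable(p, q) and p[1] > q[1]


def class_optimal(prospects):
    """Return the prospects that maximise the second coordinate within their class."""
    prospects = list(prospects)
    best = defaultdict(lambda: float("-inf"))
    for group, score in prospects:
        best[group] = max(best[group], score)
    return {(group, score) for group, score in prospects if score == best[group]}


# Hypothetical prospects spread over three comparability classes.
PROSPECTS = [("x", 1), ("x", 3), ("y", 2), ("y", 2.5), ("z", 0)]
OPTIMAL = class_optimal(PROSPECTS)
NUM_CLASSES = len({group for group, _ in PROSPECTS})
```

No member of `OPTIMAL` is strictly preferred to another, so choice among them can stay arbitrary, and in this toy case there is at least one such prospect per comparability class.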

Three caveats are worth noting. First, sceptical readers may not agree with my initial treatment of (non-)trammelling under aware choice (i.e., known tree structures, section 3.1.). That section is based on a conceptual argument rather than formal results, so it should be evaluated accordingly. However, at least some of the results from the section on unawareness could be used to dampen many reasonable worries here. Whenever non-trammelling is satisfied under awareness growth with Global-DSM, it will likewise be satisfied, mutatis mutandis, by DSM when the tree structure is known.

Second, I've only considered part of the taxonomy described above. Due to time constraints, I left out discussion of unaware choice in trees with natural nodes. I suspect that extending the analysis to these kinds of cases would not meaningfully affect the main conclusions, but I hope to look into this in later work. Finally, I haven't provided a full characterisation of choice under unawareness. The literature hasn't satisfactorily achieved this even in the case of complete preferences, so this falls outside the scope of this article.

Conclusion

With the right choice rule, we can guarantee the invulnerability—unexploitability and opportunism—of agents with incomplete preferences. I’ve proposed one such rule, Dynamic Strong Maximality, which nevertheless doesn’t ask agents to pick against their preferences. What’s more, the choice behaviour this rule induces is not representable as the agent having implicitly completed its preferences. Even under awareness growth, the extent to which the rule can effectively complete an agent’s implied preferences is permanently bounded above. And with the framework provided, it’s possible to make statements about which kinds of completions are possible, and in what cases.

This article aims to be somewhat self-contained. In future work, I’ll more concretely consider the implications of this for Thornley’s Incomplete Preference Proposal. In general, however, I claim that worries about whether a competent agent with preferential gaps would in practice (partially) complete its preferences need to engage with the particulars of the situation: the preference structure, the available decision trees, and so on. Full completion won't occur, so the relevant question is whether preferential gaps will disappear in a way that matters.

References

Asheim, Geir. 1997. “Individual and Collective Time-Consistency.” The Review of Economic Studies 64, no. 3: 427–43.

Bales, Adam. 2023. “Will AI avoid exploitation? Artificial general intelligence and expected utility theory.” Philosophical Studies.

Bossert, Walter and Kotaro Suzumura. 2010. “Consistency, Choice, and Rationality.” Harvard University Press.

Bradley, Richard and Mareile Drechsler. 2014. “Types of Uncertainty.” Erkenntnis 79, no. 6: 1225–48.

Bradley, Richard. 2015. “A Note on Incompleteness, Transitivity and Suzumura Consistency.” In: Binder, C., Codognato, G., Teschl, M., Xu, Y. (eds) Individual and Collective Choice and Social Welfare. Studies in Choice and Welfare. Springer, Berlin, Heidelberg.

Bradley, Richard. 2017. Decision Theory with a Human Face. Cambridge: Cambridge University Press.

Gustafsson, Johan. 2022. Money-Pump Arguments. Elements in Decision Theory and Philosophy. Cambridge: Cambridge University Press.

Hammond, Peter J. 1988. “Consequentialist foundations for expected utility.” Theory and Decision 25, 25–78.

Huttegger, Simon and Gerard Rothfus. 2021. “Bradley Conditionals and Dynamic Choice.” Synthese 199 (3-4): 6585-6599.

Laibson, David and Leeat Yariv. 2007. “Safety in Markets: An Impossibility Theorem for Dutch Books.” Working Paper, Department of Economics, Harvard University.

McClennen, Edward. 1990. Rationality and Dynamic Choice: Foundational Explorations. Cambridge: Cambridge University Press.

Rabinowicz, Wlodek. 1995. “To Have One’s Cake and Eat It, Too: Sequential Choice and Expected-Utility Violations.” The Journal of Philosophy 92, no. 11: 586–620.

Rabinowicz, Wlodek. 1997. “On Seidenfeld’s Criticism of Sophisticated Violations of the Independence Axiom.” Theory and Decision 43, 279–292.

Rothfus, Gerard. 2020a. “The Logic of Planning.” Doctoral dissertation, University of California, Irvine.

Rothfus, Gerard. 2020b. “Dynamic consistency in the logic of decision.” Philosophical Studies 177:3923–3934.

Suzumura, Kotaro. 1976. “Remarks on the Theory of Collective Choice.” Economica 43: 381-390.

Thornley, Elliott. 2023. “The Shutdown Problem: Two Theorems, Incomplete Preferences as a Solution.” AI Alignment Awards.

Appendix: Proofs

Proposition 1   Suzumura consistent agents myopically applying strong maximality are unexploitable under certainty.

Proof.  By myopic strong maximality, whenever an agent is presented with a set of alternatives containing a prospect strictly preferred to all others, it is chosen. Suppose such a Suzumura consistent agent were forcibly money pumped: beginning with some  and ending with a strictly dispreferred , with each choice resulting from a strict preference. Then there must be a set of prospects satisfying  where . This trivially implies that . By Strong Acyclicity, we have , a contradiction. 
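To make the consistency condition used in this proof concrete, here is a minimal sketch that tests a finite weak-preference relation for Suzumura consistency. It assumes the usual definition (no chain of weak preferences from x to y when y is strictly preferred to x) and assumes that this is what Strong Acyclicity amounts to here. The function names and the example relations are hypothetical.

```python
def transitive_closure(pairs):
    """Smallest transitive relation containing the given set of (x, y) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def is_suzumura_consistent(weak):
    """weak: set of (x, y) pairs read as 'x is weakly preferred to y'.

    Consistent iff whenever a chain of weak preferences leads from x to y,
    y is not strictly preferred to x.
    """
    strict = {(x, y) for (x, y) in weak if (y, x) not in weak}
    closure = transitive_closure(weak)
    return not any((y, x) in strict for (x, y) in closure)


# A three-step strict cycle: the shape a forcing money pump would need.
MONEY_PUMP = {("A", "B"), ("B", "C"), ("C", "A")}
# Incomplete but consistent: "A+" beats "A", "B+" beats "B", nothing else ranked.
GAPPY = {("A+", "A"), ("B+", "B")}

pump_is_consistent = is_suzumura_consistent(MONEY_PUMP)
gappy_is_consistent = is_suzumura_consistent(GAPPY)
```

The cyclic relation is rejected while the gappy one passes, which matches the shape of the argument: a forcing money pump would require exactly the kind of strict-preference chain that the consistency condition rules out.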

Lemma 1 (Dynamic Consistency Under Certainty via Naivety)   Let  be an arbitrary tree where  is a choice node, , and  is consistent with . Then  implies  .

Proof.  There are no natural nodes, so for any nodes  and  consistent with plan , we have . Therefore, by DSM,  is equivalent to

     and

    .

Noting that  and that Strong Maximality satisfies Set Contraction (Bradley 2015), we know . So

    . (1)

Any plan unavailable at  is also unavailable at , which implies that

    . (2)

By DSM, (1)-(2) is equivalent to 

Proposition 2 (Strong Dynamic Consistency Under Certainty via Naivety) Let  be an arbitrary tree where  is a choice node, , and  is consistent with . Then  iff .

Proof.  There are no natural nodes, so for any nodes  and  consistent with plan . Suppose that  while .

Suppose . Then by Set Contraction, . And since there are no natural nodes, . Hence , a contradiction.

Suppose . By Theorem 2 of Bradley (2015), Strong Maximality is decisive. Therefore, there must be another plan  that was initially more choiceworthy; that is, . Again, . This shows that . Hence , a contradiction.

We have a contradiction, so  . Using Lemma 1, we have thereby shown that  iff 

Corollary 1   Under certainty, all and only strongly maximal terminal prospects are reachable by naive DSM.

Proof.  Recall that, under certainty, a plan can be identified by the conjunction of propositions describing the information state at each node of its continuation. A node always logically entails its preceding nodes, so a plan under certainty can simply be identified by its unique terminal node. Let  denote the set of propositions at the terminal nodes.

We can thereby establish, for part (a) of DSM, that . Part (b) follows trivially at the initial node , so . Proposition 2 establishes that under certainty, a plan is DSM iff its immediate continuation is DSM. By induction on continuation-consistent nodes, all Strongly Maximal terminal prospects are reachable by DSM. And since  for all nodes  that are consistent with a plan , we know that DSM reaches only Strongly Maximal terminal prospects. 

Lemma 2   Under certainty and the coherent extension of a (Strongly Acyclic) preference relation, Dynamic Strong Maximality reduces to Optimality.

Proof.  By Theorem 1 of Bradley (2015), when a Strongly Acyclic preference relation is completed, Strong Maximality coincides with Optimality. DSM becomes equivalent to:  iff

   (a’)  Plan  is Optimal at . That is, .

   (b’)  No Optimal plan was previously better: .

Let  and suppose that . Then, because we have , the Set Expansion property of Optimality implies , i.e., that . These -Optimal plans were chosen arbitrarily, so this suffices to show that . Hence (b’) follows from (a’) under certainty. 

Lemma 3 (Dynamic Consistency Under PIP-Uncertainty via Naivety)   Let  be a non-terminal node in Bayesian decision tree , and plan  be consistent with . Assume Material Planning, Preference Stability, and Plan-Independent Probabilities. Then  implies  .

Proof.  Suppose that . Node  is either a choice node or a natural node. If it is a choice node, then it follows immediately from Lemma 1 that . Now let  be a natural node.

Then by DSM, . By Lemma 2,  under coherent extension. Therefore, by Theorem 37 of Bradley (2017),

     for all . (1)

Suppose, for contradiction, that  for some . This implies that

    , i.e., that

     .

Let  denote the continuation selected by plan  upon reaching choice node .

Then we can re-write the above as .

And by Preference Stability, . (2)

By applying Material Planning to (1), we get, for all ,

    

and, by assuming Plan-Independent Probabilities,

    . (3)

Let  denote the set of completions according to which (3) holds.

Then  for all . That is because, if this failed to hold for any , then plan  could be altered such that . But such Pareto improvements are unavailable since  is DSM at node .

In particular,  as . But since , condition (2) implies that , a contradiction. 

Proposition 4   Strongly maximal sophisticated choice with splitting does not guarantee opportunism for Suzumura consistent agents.

Proof.  The proof is by counterexample. Consider a Suzumura consistent agent facing the following simple decision tree with preferences satisfying  and .

Figure 5.

Proceeding via backward induction, the only permissible choice at node 1 is . The agent then compares  and  at node 0. These are incomparable so the agent splits the problem and gets two partial solutions:  and . The strongly maximal solutions are . Although  is indeed permissible, the agent could incidentally end up with  despite having been able to reach , a strictly preferred alternative. 

Remark. Decision trees lack a natural assignment of the ‘default’ outcome; i.e., what the agent ‘starts out with’. In this case we can think of the agent as starting with , and choosing whether to engage with a decision problem by going ‘up’ at node 0. Then we can claim that it is permissible, according to strongly maximal sophistication with splitting, for the agent to stay put at a strictly dispreferred node. The agent is therefore not opportunistic. It is worth noting that  is also impermissible according to planning-DSM as described above. The only permissible plans under DSM are . But since  dominates , a DSM agent is nevertheless opportunistic: it could not even incidentally pick a strictly dominated prospect.

Lemma 4   DSM satisfies Set Contraction.

Proof.  Recall Set Contraction: if  and  then . Now let  be sets of plans in tree . Suppose that  and . Then by DSM, . By Theorem 3 of Bradley (2015),  satisfies Set Contraction, so

    . (1)

By DSM, . And since , we also know that

    . (2)

By DSM, (1)-(2) is equivalent to 

Proposition 5 (Strong Dynamic Consistency Under Certainty via Sophistication)   Assume certainty and Suzumura consistency. Then DSM-based backward induction with splitting (DSM-BIS) reaches a strongly maximal terminal node.

Proof.  We first establish that every strongly maximal terminal node is preserved under some split of DSM-BIS. Let  denote the terminal nodes of a tree and denote by  the number of final choice nodes. For any such node , its associated terminal nodes are . By Set Contraction, we know that for any  we have . And since , this more generally implies that

    . (1)

Let . Whenever  for some , the splitting procedure will induce at least  separate trees. Each will initially preserve one element of the DSM set of terminal nodes following . We therefore know from (1) that, for any , there is some split which will initially preserve . The splitting procedure is repeatedly nested, as needed, within each subtree created via DSM-BIS elimination, so by Set Contraction (Lemma 4), this likewise applies to all subsequent subtrees. With  denoting possible continuations under split , we can now claim that

    .

Let  denote the partial solutions under split . Then since , this establishes that . And letting , we can state

    , i.e. . (2)

Because we also have , Set Contraction implies that

    . (3)

DSM-BIS then selects among . Since , we know from (2) that

    .

So by Set Contraction, . Clearly,  since  are all the terminal nodes. Therefore . This establishes that

    , (4)

i.e., that all strongly maximal terminal nodes are DSM at . To show the other direction, suppose for contradiction that .

Since  is not a strongly maximal terminal node, we know by the decisiveness of strong maximality that . By (3)-(4), this  must be a member of both  and . So by condition (b) of DSM, , a contradiction.

Therefore , which with (4) implies that 

Proposition 7   Assume no natural nodes. Then, DSM-based choice under awareness growth will remain arbitrary whenever the set of available prospects is fixed.

Proof.  Let  denote the initially available prospects. Upon reaching  the agent’s awareness grows and it faces  . Suppose  .

By Set Contraction, .

Choice was arbitrary within , and will be arbitrary in .

Therefore, we know that for any  between which choice was arbitrary, choice will remain arbitrary between  and  whenever 

Proposition 8   Assume transitive preferences and no natural nodes. Then, under all possible forms of awareness growth, naive choice via G-DSM is opportunistic.

Proof.  Let  denote the set of all prospects over which the agent’s preference relation is defined (i.e., candidates for terminal nodes). Let  denote the terminal nodes reachable in the initial tree  the terminal nodes reachable (once awareness grows) in the new tree  at the previously chosen node ; and  the set of all terminal nodes (reachable or not) in the grand tree .

Since there are no natural nodes,  and  according to the agent’s doxastic state at those nodes. By Theorem 1 of Bradley (2015) strong maximality and maximality coincide under transitivity. We can therefore use maximality going forward.

Node  is, by construction, chosen via DSM in . Therefore, by Proposition 2, DSM guarantees that  for any  consistent with .

(Note: with some abuse of notation, I will apply choice functions and preference relations to terminal nodes. This should be interpreted as attitudes towards their associated prospects.)

Awareness then grows and the agent faces . Consider two cases.

Case 1: . Here the best nodes are not the initially terminal nodes (nor are they accessible after reaching ). Therefore, no strongly maximal prospect in the grand tree was available in the initial tree. So, opportunism is satisfied automatically.

Case 2: . This is the situation in which the best nodes are either in the initial tree or the new tree. Notice that  for some set  (the now-inaccessible new terminal nodes). Sub-cases:

If , then by Set Expansion, . And so for any . Since , by Set Contraction,