In a previous post, I posited that a significant portion of the confusion surrounding counterfactuals can be traced back to an ontological shift. In this post, I will seek to elaborate on this argument and explain how we should reinterpret decision theory problems in light of this insight.
Our Naive Ontology Assumes Libertarian Free Will
Consider our naive ontology, by which I mean how humans tend to think without substantial philosophical reflection. In our naive ontology, we assume both that:
1. We possess libertarian free will, i.e. our choices are not predetermined before we make them.
2. When we're making a decision, both the past and present state of the universe are fixed independently of that decision.
Unfortunately, scientific evidence seems to indicate:
3. At the point of a decision, each brain-state corresponds to only one possible choice.
It is impossible for all three of these statements to be true simultaneously. If they were, there would be multiple possible choices we could make, each of which would necessitate a different brain-state. Yet the second assumption requires the current state of the universe, and therefore our brain-state, to be fixed independently of these choices.
From my perspective, it is clear that the mistake was in assuming the existence of libertarian free will (assumption 1). Given this, I'll take it for granted that we have to revise our naive conception of decisions so as to be rooted in a deterministic ontology.
Adapting to Determinism by Augmenting with Counterfactuals
Determinism dictates that whatever decision is ultimately made was the only decision that could ever have been made. This immediately poses a challenge: how do we prevent every decision from becoming trivial?
The obvious answer is that we must find a way to augment the factual with a counterfactual for each option we desire beyond the first. But how should we construct this and what kind of object should it even be?
In Favour of Consistent Counterfactuals
As I previously argued in my post Counterfactuals are an Answer, Not a Question, the definition of a counterfactual will depend on the types of questions we seek to use them to answer. For example, while Causal Decision Theory results in counterfactual scenarios that are inconsistent, they can still be useful in practice, because the inconsistencies are often not relevant to the decisions we are attempting to make.
To illustrate, we typically disregard the fact that making a different decision would require a different brain state, which would in turn imply numerous changes to the past. However, when considering Newcomb-like problems within a deterministic framework, these are exactly the kinds of facts about the world that we cannot ignore. While it may not be immediately obvious that we should handle Newcomb-like problems by backpropagating the full effects of our decision, it is clear that we must backpropagate at least some of these consequences. Someone may yet devise a concept of counterfactuals that allows for the selective backpropagation of certain consequences; however, at present, I can't see any obvious way of accomplishing this, leading me to favour full propagation at least until a better theory arises.
Furthermore, once we have begun backpropagating even some consequences, there is a compelling argument for fully embracing this approach, as we have already relinquished the primary benefit of not backpropagating at all: when the past is fixed between counterfactuals, it is easier to justify that they constitute a fair comparison of the different choices.
However, once we've accepted the need for some degree of backpropagation, it becomes necessary to devise a method for determining whether counterfactuals with pasts that are different in decision-relevant ways are comparable. Given this, it seems reasonable to decide that we may as well backpropagate all consequences of our decision.
Lastly, even though I recognise that the above arguments aren't conclusive, at worst, this path should still lead to a productive error.
Ranking Consistent Counterfactuals in the Obvious Way
For the remainder of this article, we will assume that we are working with consistent counterfactuals. If we accept that each brain state corresponds to a single potential choice, then we must construct a consistent counterfactual for each choice beyond the one that occurs in the factual reality. While we will set aside the question of how these counterfactuals are constructed, once we have N different, consistent universes representing N different choices, and we have determined that the comparisons are fair, the most natural way to select among them is to pick the option corresponding to the universe with the highest utility.
Modelling Decisions As Choices from Outside the Universe(s)
I think it's worth being as explicit as possible about what's actually going on here. When we consider selecting a universe, we are effectively modelling the scenario as if it were an external agent, outside of any of these universes, making a decision about which universe to choose.
One might argue that this approach merely pushes the problem to another level, as we were initially attempting to understand what a decision is within the context of our naive ontology, and now we are stating that a decision made by an agent embedded in a universe should be modeled as an external agent choosing between universes. Does this not simply lead to the same problem, just one level of recursion down the chain?
Surprisingly, this is not necessarily the case. We needed to make this ontological shift in the first place because, in Newcomb-like problems, the state of the external world at the time a decision is made depends on that decision in a way that impacts the optimal choice. Our decision influenced Omega's prediction, which in turn affected the money in the "mystery" box.
However, once we've made this ontological shift, we can safely disregard the external agent's past. In the context of Newcomb's problem, it may appear as though Omega is predicting the decision of the external agent, as their choice determines the decision of the agent within the universe, which in turn influences Omega's prediction. However, this overlooks the fact that the external agent is separate from the agent within the counterfactuals, and that the decision of the external agent exists in a completely different timeline, neither before nor after Omega's decision. This means that we can utilize a naive libertarian free will ontology without encountering any issues analogous to those presented in Newcomb's problem.
Application to Newcomb's Problem
If we assume that we are using consistent counterfactuals and that Omega is a perfect predictor for the sake of simplicity, we can consider the following scenario:
- In the world where the interior agent one-boxes, Omega predicts this choice and puts $1 million in the mystery box, resulting in a total of $1 million for the internal agent.
- In the world where the interior agent two-boxes, Omega predicts this choice and puts $0 in the mystery box, resulting in a total of $1000 for the internal agent.
From the perspective of an external agent outside of the universe, the first world would be preferred over the second. Therefore, if we accept this ontological shift as valid, we should choose to one-box (see this post if you're worried that this implies backwards causation).
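The comparison above can be sketched in a few lines of code. This is a minimal sketch assuming a perfect predictor; the function and variable names are illustrative, not from the original problem statement:

```python
# Each consistent counterfactual world pairs the internal agent's
# choice with Omega's matching prediction; the external agent then
# picks the world with the highest utility.

def newcomb_world(choice):
    """Utility for the internal agent in the consistent world where
    they make `choice` and Omega has predicted it correctly."""
    prediction = choice                      # perfect predictor
    mystery_box = 1_000_000 if prediction == "one-box" else 0
    visible_box = 1_000
    return mystery_box if choice == "one-box" else mystery_box + visible_box

worlds = {choice: newcomb_world(choice) for choice in ("one-box", "two-box")}
best_choice = max(worlds, key=worlds.get)    # picks "one-box"
```

Note that the prediction is set equal to the choice inside each world; that is exactly the "backpropagation" of the decision into the past that consistency demands.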
Similarly, in the Transparent Newcomb's Problem:
- We can observe that Omega has placed $1 million in the box. This implies that we must choose to one-box, leading Omega to predict this choice and place the $1 million in the box as anticipated. Thus, this choice becomes the factual, and any other choice must be a counterfactual.
- If we two-box instead, Omega would predict this choice and therefore must place $0 in the box. While it may be stated that we see $1 million in the box, this does not pose a problem if we interpret the problem as only stating that we see $1 million in the factual, rather than as a claim about every counterfactual.
An external agent outside of the universe would prefer the first world over the second, leading us to once again choose to one-box.
For these problems, once we decided that we wanted our counterfactuals to be consistent, constructing them ended up being straightforward since the problem itself contains information about how we should construct the counterfactuals. If this feels strange to you, well done on making that observation. You may wish to refer to Counterfactuals as a Matter of Social Convention in which I briefly discuss how the method of constructing counterfactuals is frequently implicit in the problem description, rather than something we must independently deduce.
But We're Still Using the Naive Ontology!
You may object that, given our knowledge that the libertarian free will ontology is naive, we should eliminate it entirely from our world model. While it would be desirable to do so, I am skeptical that it is possible.
Decisions are such a fundamental concept that I am uncertain how we could construct our understanding of them other than by starting from our current naive notion and refining it by bootstrapping up. They seem too central to our reasoning for us to be able to step outside of this framework.
For further support, you might also want to see my argument for counterfactuals being circularly justified. This argument can be extended to decisions by noting that we can only select one notion of decision over another if we already possess at least an implicit concept of decision.
So You're Saying We Should Use Timeless Decision Theory? We Already Knew That!
The model of an external agent considering universes corresponding to different choices, with the impacts of those choices being propagated both forwards and backwards, bears a strong resemblance to Timeless Decision Theory. Given that so many individuals within the Less Wrong community already agree that this approach is correct, it may seem that this article is unnecessary.
On the other hand, we have not yet addressed all questions regarding how these types of decision theories function. If we're encountering significant challenges with questions such as "How do we determine which variables are subjunctively linked to which?", then one of the most obvious things to do is to try to understand in as much detail as possible what is going on when we shift to a timeless perspective. Once we've achieved that, we should expect to be in a better position to make progress. Hopefully, this article helps provide some clarity on what shifting to the timeless perspective entails.
What about Updateless Decision Theory?
I'll address the Counterfactual Mugging and hence updatelessness more generally in a future article.
We've made a slight simplification here by ignoring quantum mechanics. Quantum mechanics only shifts us from the state of the world being deterministic, to the probability distribution being deterministic. It doesn't provide scope for free will, so it doesn't avoid the ontological shift.
I understand that the free-will/determinism debate is contentious, but I don't want to revisit it in this post.
One alternate solution would be to toss out the notion of Newcomb-like problems. However, this seems difficult, as even if we are skeptical of perfect predictors for humans, we can set up this situation in code where Omega has access to the contestant program's source code. So I don't see this as a solution.
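To make this concrete, here is a minimal sketch of such a code-based setup, assuming Omega predicts by simply running the contestant's code (all names here are illustrative):

```python
# Hypothetical sketch: Omega "reads the source code" by simulating
# the contestant program before filling the mystery box.

def omega(contestant):
    prediction = contestant()                # simulate to predict
    mystery_box = 1_000_000 if prediction == "one-box" else 0
    choice = contestant()                    # the actual decision
    return mystery_box if choice == "one-box" else mystery_box + 1_000

def one_boxer():
    return "one-box"

def two_boxer():
    return "two-box"

# Deterministic contestants make Omega's prediction exact, so the
# Newcomb-like dependence arises without any exotic physics.
payoffs = {f.__name__: omega(f) for f in (one_boxer, two_boxer)}
```

Because the contestant is deterministic, the simulated run and the actual run necessarily agree, giving us a perfect predictor for free.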
If we adopt the perspective in which the counterfactual when we one-box includes $1 million in the box, while the counterfactual when we two-box lacks the million, it becomes unclear whether it is fair to compare these two counterfactuals, as it appears as though the agent is facing different problem setups in the two counterfactuals.
Or any one of the best options in the event of a tie.
The box that is either empty or contains the million.
Technically, they generally employ a variant known as either Updateless Decision Theory or Functional Decision Theory, depending on the preferred terminology.
Updateless decisions are made by agents that know less, to an arbitrary degree. In UDT proper, there is no choice in how much an agent doesn't know, you just pick the best policy from a position of maximal ignorance. It's this policy that needs to respond to possible and counterfactual past/future observations, but the policy itself is no longer making decisions, the only decision was about picking the policy.
But in practice, knowing too little leads to an inability to actually compute (or even meaningfully "write down") an optimal decision/policy, so it becomes necessary to forget less, and that leads to a decision about how much to forget. When forgetting something, you turn into a different, more ignorant agent. So the choice to forget something in particular is a choice to turn into a particular other agent. More generally, you would interact with this other agent instead of turning into it. Knowing too little is also a problem when there is no clear abstraction of preference that survives the amnesia.
This way, updateless decision making turns into acausal trade where you need to pick who to trade with. There is a change in perspective here, where instead of making a decision personally, you choose whose decision to follow. The object-level decision itself is made by someone else, but you pick who to follow based on considerations other than the decision they make. This someone else could also be a moral principle, or common knowledge you have between yourself and another agent; this moral principle or common knowledge just needs to itself take the form of an agent. See also these comments.
UDT doesn't really counter my claim that Newcomb-like problems are problems in which we can't ignore that our decisions aren't independent of the state of the world when we make that decision, even though in UDT we know less. To make this clear in the example of Newcomb's, the policy we pick affects the prediction which then affects the results of the policy when the decision is made. UDT isn't ignoring the fact that our decision and the state of the world are tied together, even if it possibly represents it in a different fashion. The UDT algorithm takes this into account regardless of whether the UDT agent models this explicitly.
I'll get to talking about UDT rather than TDT soon. I intend for my next post to be about Counterfactual Mugging and why this is such a confusing problem.
UDT still doesn't forget enough. Variations on UDT that move towards acausal trade with arbitrary agents are more obviously needed because UDT forgets too much, since that makes it impossible to compute in practice and forgetting less poses a new issue of choosing a particular updateless-to-some-degree agent to coordinate with (or follow). But not forgetting enough can also be a problem.
In general, an external/updateless agent (whose suggested policy the original agent follows) can forget the original preference and pursue a different version of it that has undergone an ontological shift. So it can forget the world and its laws, as long as the original agent would still find it a good idea to follow its policy (in advance, based on the updateless agent's nature, without looking at the policy). This updateless agent is shared among the counterfactual variants of the original agent that exist in the updateless agent's ontology; it's their chosen updateless core, the source of coherence in their actions.
How much do you think we should forget?