Entangled Equilibria and the Twin Prisoners' Dilemma

AI ALIGNMENT FORUM

In this post, I present a generalization of Nash equilibria to non-CDT agents. I will use this formulation to model mutual cooperation in a twin prisoners' dilemma, caused by the belief that the other player is similar to you, and not by mutual prediction. (This post came mostly out of a conversation with Sam Eisenstat, with contributions from Tsvi Benson-Tilsen and Jessica Taylor.)

This post is sketchy. If someone would like to go through the work of making it more formal/correct, let me know. Also, let me know if this concept already exists.

Let $A_{1}, \dots, A_{n}$ be a finite collection of players. Let $M_{i}$ be a finite collection of moves available to $A_{i}$. Let $U_{i} : \prod_{j \leq n} M_{j} \to \mathbb{R}$ be a utility function for player $A_{i}$.

Let $Δ_{i}$ be the simplex of all probability distributions over $M_{i}$. Let $V_{i}$ denote the vector space of functions $v : M_{i} \to \mathbb{R}$ with $\sum_{m \in M_{i}} v (m) = 0$.

Given a distribution $p \in Δ_{i}$ and a vector $v \in V_{i}$, we say that $v$ is unblocked from $p$ if there exists an $ε > 0$ such that $p + ε v \in Δ_{i}$; i.e., it is possible to move in the direction of $v$ from $p$ and stay in $Δ_{i}$.
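Since $v$ sums to zero, $p + ε v$ always sums to one, so the only way to leave the simplex is for some coordinate to go negative. A minimal sketch of the resulting test, assuming distributions and directions are stored as plain Python lists (the name `is_unblocked` and the tolerance `tol` are illustrative, not from the post):

```python
def is_unblocked(p, v, tol=1e-12):
    """Return True if p + eps*v stays in the simplex for some eps > 0.

    Assumes p is a probability vector and v sums to zero.
    """
    # A coordinate with positive mass can absorb a small move in either
    # direction; a coordinate at zero can only be increased.
    return all(vm >= -tol for pm, vm in zip(p, v) if pm <= tol)


print(is_unblocked([1.0, 0.0], [-1.0, 1.0]))  # moving mass onto the unused move
print(is_unblocked([1.0, 0.0], [1.0, -1.0]))  # would push the second coordinate below zero
print(is_unblocked([0.5, 0.5], [1.0, -1.0]))  # interior point: every direction is unblocked
```

In particular, every direction is unblocked from a point in the interior of the simplex.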

Given a strategy profile $P = (p_{1}, \dots, p_{n})$ in $\prod_{j \leq n} Δ_{j}$ , and a vector $V = (v_{1}, \dots, v_{n})$ in $\prod_{j \leq n} V_{j}$ , we say that $V$ improves $P$ for $A_{i}$ if

$\lim_{ε \to 0} \frac{\sum_{m_{1} \in M_{1}} \dots \sum_{m_{n} \in M_{n}} U_{i} (m_{1}, \dots, m_{n}) \left( \prod_{j \leq n} \left( p_{j} (m_{j}) + ε v_{j} (m_{j}) \right) - \prod_{j \leq n} p_{j} (m_{j}) \right)}{ε} > 0.$

We call this limit the utility differential for $A_{i}$ at $P$ in the direction of $V$ .

That is, $V$ improves $P$ for $A_{i}$ if $U_{i}$ increases when $P$ is moved an infinitesimal amount in the direction of $V$. (Note that this is defined even when the vectors are not unblocked from the distributions.)
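By the product rule, the limit reduces to a finite sum in which one factor of the product at a time is replaced by $v_{k}$. The sketch below computes it that way. The representation is an assumption made for illustration: moves are integer indices, each $p_{j}$ and $v_{j}$ is a list, and $U_{i}$ is a dict keyed by move tuples. The sample payoffs are those of the twin prisoners' dilemma used later in the post, with index 0 for cooperate and 1 for defect.

```python
import itertools


def utility_differential(U, P, V):
    """Derivative at eps = 0 of the expected utility under prod_j (p_j + eps * v_j)."""
    n = len(P)
    total = 0.0
    for moves in itertools.product(*(range(len(p)) for p in P)):
        # Product rule: differentiate one player's factor at a time.
        d_prob = 0.0
        for k in range(n):
            term = V[k][moves[k]]
            for j in range(n):
                if j != k:
                    term *= P[j][moves[j]]
            d_prob += term
        total += U[moves] * d_prob
    return total


# Player 1's payoffs: (own move, other's move) -> utility.
U1 = {(0, 0): 2.0, (0, 1): 0.0, (1, 0): 3.0, (1, 1): 1.0}
P = [[0.5, 0.5], [0.5, 0.5]]
v = [1.0, -1.0]      # shift probability from defect to cooperate
zero = [0.0, 0.0]

print(utility_differential(U1, P, [v, zero]))  # -1.0: only player 1 moves toward cooperation
print(utility_differential(U1, P, [v, v]))     # 1.0: both players move together
```

The first call is the ordinary CDT-style derivative, which is negative; the second shows how a correlated move by the other player can flip the sign.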

A "counterfactual gradient" for player $A_{i}$ is a linear function from $V_{i}$ to $\prod_{j \leq n} V_{j}$ , such that the function into the $V_{i}$ component is the identity. This represents how much player $A_{i}$ expects the probabilities of the other players to move when she moves her probabilities.

A "counterfactual system" for $A_{i}$ is a continuous function which takes in a strategy profile $P$ in $\prod_{j \leq n} Δ_{j}$ and outputs a counterfactual gradient. This represents the fact that a player's counterfactual gradient could be different depending on the strategy profile the game ends up in. We fix a counterfactual system $C_{i}$ for each player.

Claim: There exists a strategy profile $P = (p_{1}, \dots, p_{n})$ in $\prod_{j \leq n} Δ_{j}$ such that for all players $A_{i}$ , if $v_{i} \in V_{i}$ is unblocked from $p_{i}$ , then $C_{i} (P) (v_{i})$ does not improve $P$ .

I will call such a point an entangled equilibrium.

Proof Sketch: (Very sketchy, and I have not verified this. There is probably a better way. It might not be true.)

We will construct a continuous function from $\prod_{j \leq n} Δ_{j}$ to itself. We do this by moving $p_{i}$ by adding the gradient of the function from $p_{i}$ to $U_{i}$, assuming all other players' probabilities change according to the linear function $C_{i} (P)$. If adding this gradient would take the point out of the simplex $Δ_{i}$, you hit a boundary after moving some proportion $α$ of the gradient. Then, now that you are on the boundary, you move by $1 - α$ times the gradient of the same function restricted to the boundary, and repeat.

This function has a fixed point, by Brouwer. For each player, this fixed point must have 0 gradient, or be on the boundary, pointing outward. If it is on a boundary, it has the same property when restricted to that boundary.

If there were an unblocked direction in which a player could move to improve its utility, then the gradient would be nonzero. If the player is on a boundary with the gradient pointing outward, and there is an unblocked direction that would be an improvement, then there is such a direction that stays on that boundary, so the gradient restricted to that boundary would be nonzero. $□$

Example 1

Consider a twin prisoners' dilemma game. Two players can either cooperate or defect. They get 0 utility for being exploited, 3 for exploiting, 2 for mutual cooperation, and 1 for mutual defection.

Both players believe that the other is using a decision procedure that is entangled with their own, but they are not completely sure.

Formally, both players think that when they increase their probability of cooperation by $ε$, the other player increases theirs by $2 ε / 3$, regardless of where the probabilities start.

Thus $C_{1} (p, q) (v_{1}) = (v_{1}, 2 v_{2} / 3)$, where $v_{1}$ is the vector ( $A_{1}$ cooperates)-( $A_{1}$ defects) in $V_{1}$, and $v_{2}$ is the analogous vector for player 2. Similarly, $C_{2} (p, q) (v_{2}) = (2 v_{1} / 3, v_{2})$.

Since both players always think that increasing their probability of cooperation increases utility, the only equilibrium is when both players cooperate with probability 1.
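As a check on this claim, the utility differential can be worked out directly. Writing $p$ and $q$ for the cooperation probabilities of players 1 and 2 (notation assumed here for the calculation), player 1's expected utility is

$E [U_{1}] = 2 p q + 3 (1 - p) q + (1 - p) (1 - q),$

so

$\frac{\partial E [U_{1}]}{\partial p} = 2 q - 3 q - (1 - q) = - 1, \qquad \frac{\partial E [U_{1}]}{\partial q} = 2 p + 3 (1 - p) - (1 - p) = 2.$

If player 1 believes that raising $p$ by $ε$ raises $q$ by $2 ε / 3$, the utility differential in the direction of increased cooperation is

$- 1 + 2 \cdot \frac{2}{3} = \frac{1}{3} > 0,$

independent of $p$ and $q$. By symmetry the same holds for player 2. Increasing cooperation is unblocked whenever a player's cooperation probability is below 1, so no such profile can be an entangled equilibrium.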

Example 2

Now let us look at a case where the gradients are a function of the distributions. Again, both players believe that the other is using a decision procedure that is entangled with their own, but they are not completely sure.

Formally, player 1 thinks that when they increase (or decrease) their probability of cooperation by $ε$, the other player will increase (or decrease) their probability by $δ ε$, where $δ$ is one minus the square of the difference between their two probabilities.

Player 2 on the other hand, thinks the other player will increase (or decrease) their probability by $δ ε$ , where $δ$ is one minus the absolute value of the difference between their two probabilities.

Thus $C_{1} (p, q) (v_{1}) = (v_{1}, (1 - (p - q)^{2}) \cdot v_{2})$ , where $v_{1}$ is the vector ( $A_{1}$ cooperates)-( $A_{1}$ defects) in $V_{1}$ , and $v_{2}$ is the analogous vector for player 2. Similarly, $C_{2} (p, q) (v_{2}) = ((1 - | p - q |) \cdot v_{1}, v_{2})$ .

Thus, if player 1 cooperates with probability $p$ and player 2 cooperates with probability $q$, then player 1 expects to gain $(2 (1 - (p - q)^{2}) - 1) ε$ utility by increasing his probability of cooperation by $ε$. This quantity is 0 only if $| p - q | = 1 / \sqrt{2}$. Thus the only entangled equilibria with mixed strategies for player 1 have $| p - q | = 1 / \sqrt{2}$. Similarly, player 2 expects to gain $(2 (1 - | p - q |) - 1) ε$, so the only entangled equilibria with mixed strategies for player 2 have $| p - q | = 1 / 2$. These two conditions are incompatible, so there is no equilibrium in which both players use mixed strategies.

The only pure equilibrium is $p = q = 1$. It remains to check the profiles in which exactly one player mixes.

If $p = 0$ , then $(2 (1 - (p - q)^{2}) - 1)$ must be nonpositive, so $q$ must be at least $1 / \sqrt{2}$ . None of these points are equilibria for player 2.

If $p = 1$ , then $(2 (1 - (p - q)^{2}) - 1)$ must be nonnegative, so $q$ must be at least $1 - 1 / \sqrt{2}$ . This is in equilibrium if $q = 1 / 2$ .

If $q = 0$, then $p$ must be at least $1 / 2$. This is in equilibrium if $p = 1 / \sqrt{2}$.

If $q = 1$, then $p$ must be at least $1 / 2$, and there is no equilibrium with a mixed strategy for player 1, since that would require $p = 1 - 1 / \sqrt{2} < 1 / 2$.

Thus, there are three entangled equilibria: $(1, 1)$, $(1, 1 / 2)$, and $(1 / \sqrt{2}, 0)$.
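The case analysis for Example 2 can be spot-checked numerically. The sketch below is only a sanity check under stated assumptions: `d1` and `d2` are the per-unit utility differentials $2 (1 - (p - q)^{2}) - 1$ and $2 (1 - | p - q |) - 1$ from the text, and the helper names and the tolerance `tol` are made up for illustration. A player is treated as stable if the differential is zero at an interior probability, nonpositive at probability 0, or nonnegative at probability 1.

```python
import math


def d1(p, q):
    # Player 1's differential per unit increase in p.
    return 2 * (1 - (p - q) ** 2) - 1


def d2(p, q):
    # Player 2's differential per unit increase in q.
    return 2 * (1 - abs(p - q)) - 1


def stable(x, d, tol=1e-9):
    """True if a player at cooperation probability x has no unblocked improving direction."""
    if x <= tol:              # can only increase x
        return d <= tol
    if x >= 1 - tol:          # can only decrease x
        return d >= -tol
    return abs(d) <= tol      # interior: both directions are unblocked


def is_equilibrium(p, q):
    return stable(p, d1(p, q)) and stable(q, d2(p, q))


r = 1 / math.sqrt(2)
candidates = [(1, 1), (1, 0.5), (r, 0),           # the three claimed equilibria
              (0, 0), (1, 0), (0, 1),             # the other pure profiles
              (0, r), (1 - r, 1), (0.5, 0.5)]     # some rejected mixed profiles
for p, q in candidates:
    print((round(p, 3), round(q, 3)), is_equilibrium(p, q))
```

Only the first three candidates pass the check; this tests the listed points but does not by itself rule out other equilibria, which is what the case analysis above does.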