x

Book report: Theory of Games and Economic Behavior (von Neumann & Morgenstern) — AI Alignment Forum

I finally read a game theory textbook. Von Neumann and Morgenstern's "Theory of Games and Economic Behavior" basically started the field of game theory. I'll summarize the main ideas and my opinions.

This was also the book that introduced the VNM theorem about decision-theoretic utility. They presented it as an improvement on "indifference curves", which apparently was how economists thought about preferences back then.

2-player zero-sum games

We start with 2-player zero-sum games, where the payoffs to the players sum to 0 in every outcome. (We could just as well consider constant-sum games, where the sum of payoffs is the same in every outcome.) Maximizing your reward is the same as minimizing your opponent's reward. An example is Rock, Paper, Scissors.

Suppose player 1 moves first, and then player 2 moves with, and then the players get their rewards $R$ , $- R$ . Then player 2 should play ${argmin}_{y} R (x, y)$ and player 1 should play ${argmax}_{x} {min}_{y} R (x, y)$ . If instead player 2 moves first, then player 1 should play ${argmax}_{x} R (x, y)$ and player 2 should play ${argmin}_{y} {max}_{x} R (x, y)$ .

The payoff to player 1 will be the "maximin" or "minimax" value, depending on whether they move first or second. And moving second is better:

$max x min y R (x, y) \leq min y max x R (x, y)$

If the action space is finite and we allow mixed strategies, then by the Kakutani fixed point theorem equality obtains. (The book proves this with elementary linear algebra instead.) In Rock, Paper, Scissors, the maximin value is 0, which you get by playing randomly.

VNM then argue that if both players must move simultaneously, they should play the maximin/minimax strategy. Their argument is kind of sketchy: They say that since player 1 can secure at least the maximin reward for themselves, and since player 2 can prevent player 1 from getting more than the maximin reward, it's rational for player 1 to play the strategy that guarantees them the maximin reward (17.8.2). They claim they aren't assuming knowledge of rationality of all players, but I think they're smuggling it in.

Also, the minimax strategies form a Nash equilibrium (which they call a "saddle point"). But their argument for playing minimax doesn't rely on this fact directly.

Despite the sketchiness I agree that you should play maximin in 2-player zero-sum games, unless you have reason to believe your opponent won't play maximin.

$n$ -player zero-sum games

Next we move on to $n$ -player zero-sum games. We allow coalitions, coordination, and side payments, so Nash equilibrium is no longer relevant. An example is the 3-player game where you point at another player; if two players point to each other, they each get 1 point and the third player gets -2; otherwise, everyone gets 0. We pass through four layers of abstraction:

First we define a game in extensive form: There are $n$ players and a finite sequence of moves. A move may be chosen by a player, or be determined randomly with fixed probabilities. A player makes a limited observation of the previous moves before moving. The number of turns, the identity of the player who gets to move, and the allowed observations can all depend on previous moves. All players get numerical rewards at the end.
A game can be expressed in normal form, where each each player chooses a strategy without making any observations, and then the reward is a function of the strategies chosen.
The characteristic function for a game is defined as follows: If we divide the players into two coalitions and allow players within a coalition to coordinate, then we get a 2-player game. Let $v (S)$ be the minimax value for coalition $S \subset {1, \dots, n}$ . It's the total rewards of all coalition members if they coordinate on an optimal strategy.

Note that this requires the coalition to have access to a shared source of randomness that the other coalition can't see (in order to implement mixed strategies). They also need to coordinate on which minimax strategy to execute, if there's more than one.

A solution to a game is not a single outcome but a set of outcomes, defined as follows:

An imputation is a vector $α$ of rewards for each player such that $\sum_{i} α_{i} = 0$ and $α_{i} \geq v ({i})$ for all players $i$ . Every imputation is the result of some choice of strategies.

For imputations $α$ , $β$ , we say that $α$ dominates $β$ ( $α ⪰ β$ ) if there's a coalition $S$ satisfying:

$S \neq \emptyset$

$\forall i \in S α_{i} > β_{i}$

$\sum i \in S α_{i} \leq v (S)$

The last condition means that the coalition can secure enough collective reward to guarantee the imputation for everyone in the coalition. (This is where we assume the players can make side payments.) Domination isn't necessarily transitive.

A solution (nowadays called a von Neumann–Morgenstern stable set) is a set $V$ of imputations such that no element of $V$ dominates another, and every imputation outside of $V$ is dominated by an element of $V$ . Solutions aren't necessarily unique, and we'll see later that they might not even exist. A solution to the pointing game described above is:

$V = {(1, 1, - 2), (1, - 2, 1), (- 2, 1, 1)}$

This represents the outcome where two players always point at each other, but the solution doesn't determine which two players those are. Surprisingly, this is not the only solution.

$n$ -player non-zero-sum games

Finally, we can remove the constant-sum condition. So we get non-zero-sum games like the Prisoner's Dilemma, whose characteristic function is:

$v (\emptyset) = 0$

$v ({1}) = v ({2}) = 0$ (Each player's maximin or BATNA corresponds to mutual defection, with zero reward.)

$v ({1, 2}) = 2$ (If the players cooperate, they get 1 reward each.)

An $n$ -player non-zero-sum game is simply an $(n + 1)$ -player zero-sum game where the extra player has only one strategy and can't make side payments. It turns out that you can equivalently define the solutions without referring to this extra player; you just need to change the definition of imputation to require $\sum_{i} α_{i} \leq v ({1, \dots, n})$ . It turns out that the solutions are all Pareto-optimal. So a solution to Prisoner's Dilemma is one where both players always cooperate, but then there's an undetermined side payment:

$V = {(a, 2 - a) ∣ 0 \leq a \leq 2}$

What does it mean?

VNM say that a solution is a "standard of behavior". I think it looks kind of like an epistemic state. If $V$ is a solution, then we can imagine it being common knowledge that the outcome will belong to $V$ . If $V$ is not a solution, then either there's an imputation in $V$ that we can deduce will not occur, or there's a coalition that can force an outcome outside of $V$ ; either way, $V$ is not a consistent epistemic state.

The theory doesn't say anything about the "negotiating, bargaining, higgling, contracting and recontracting" (61.2.2) that picks out a particular solution or outcome. Morgenstern really wanted to discover a dynamic theory (Rellstab 1992), and such a theory would probably be a model of negotiation. But this book presents only a static theory, a theory of equilibria. The authors speculate that a dynamic theory would talk about a single imputation-under-consideration changing over time (4.8.2-3).

I don't know what progress has been made toward a dynamic theory in the last 75 years, but it seems like we now have the tools to build one: Logical induction tells us how epistemic states change over time, and the Löbian handshake is a mechanism for forming coalitions in open-source game theory.

Disappointment

Unfortunately, the authors had a lot of trouble finding solutions. This is understandable because it involves a search over the powerset of a manifold. The authors found all solutions to all 3-player zero-sum games. But they didn't find all solutions to all 4-player zero-sum games; they say there's a "bewildering variety" of them (35.3.1). The more players there are, the more complicated it gets.

They find families of solutions for some games of $n$ players, but they aren't exhaustive. They describe how to decompose games into smaller games, but most games aren't decomposable. They suggest modeling markets by taking $n$ to be large, but they don't actually do this. They show that all solutions are closed, but otherwise they prove nothing about solutions in general.

They even fail to prove that every game has a solution, and in fact (Lucas 1968) is an example of a 10-player game with no solution. I imagine a dynamic theory would say that in this game, the players fail to converge to an equilibrium in the limit of infinite computation.

Fun bits

The book requires only an elementary background in mathematics. It's a bit wordy. Von Neumann came up with pretty much all the results (Rellstab 1992).

There are applications to Poker and 3-person markets that I found interesting.

There are at least two easter eggs in the text. Fig. 4 in (8.3.2) contains a hidden elephant because von Neumann's wife liked elephants (Morgenstern 1976). And I'm pretty sure there's a hidden allusion to the Devil in (56.10.4): This paragraph is about the extra, fictional player in a non-zero-sum game, who wants to minimize the rewards of the real players. Real players can collude with the fictional player to make everyone as a whole worse off, and in exchange they are enriched by side payments from the fictional player.

Soon after writing this book, von Neumann went on to build nuclear bombs. This feels profound, but I can't say why exactly.

References

W. F. Lucas. "A game with no solution". Bull. Amer. Math. Soc., Volume 74, Number 2 (1968). 237–239. https://projecteuclid.org/euclid.bams/1183529519

Oskar Morgenstern. "The Collaboration between Oskar Morgenstern and John von Neumann on the Theory of Games". Journal of Economic Literature, Vol. 14, No. 3 (Sep. 1976). p. 805–816. https://www.jstor.org/stable/2722628 (paywalled)

John von Neumann, Oskar Morgenstern. Theory of Games and Economic Behavior. Princeton University Press (1944).

Urs Rellstab. "New Insights into the Collaboration between John von Neumann and Oskar Morgenstern on the Theory of Games and Economic Behavior". History of Political Economy (1992), 24 (Supplement). p. 77–93. https://doi.org/10.1215/00182702-24-Supplement-77 (paywalled)