My take on higher-order game theory
This is how I currently think about higher-order game theory, the study of agents thinking about agents thinking about agents... This post doesn't add any big new ideas beyond what was already in the post by Diffractor linked above. I just have a slightly different perspective that emphasizes the "metathreat" approach and the role of nondeterminism.

This is a work in progress. There's a bunch of technical work that must be done to make this rigorous. I'll save the details for the last section.

Multiple levels of strategic thinking

Suppose you're an agent with accurate beliefs about your opponents. It doesn't matter where your beliefs come from; perhaps you have experience with these opponents, or perhaps you read your opponents' source code and thought about it. Your beliefs are accurate, although for now we'll be vague about what exactly "accurate" means.

In a game of Chicken, you may want to play Swerve, since that's the maximin strategy. This is zeroth-order thinking because you don't need to predict what your opponent will do.

Or maybe you predict what your opponent will do and play the best response to that, Swerving if they'll go Straight and going Straight if they'll Swerve. This is first-order thinking.

Or maybe you know your opponent will use first-order thinking. So you resolve to go Straight, your opponent will predict this, and they will Swerve. This is second-order thinking.

Beyond second-order, Chicken turns into a commitment race.

In Prisoner's Dilemma, zeroth- and first-order thinking both recommend playing Defect, as that's a dominant strategy.

Second-order thinking says that it's good to Cooperate on the margin if by doing so you cause your opponent to also Cooperate on the margin. Let's say it's worth it to Cooperate with probability p if your opponent thereby Cooperates with probability more than p/2 (although the exact numbers depend on the payoff matrix).

If you're a third-order agent, you might commit to Cooperating with probability equal