Identifiability Problem for Superrational Decision Theories

abramdemski (5y)

Now I think the reasoning presented is correct in both cases, and the lesson here is for our expectations of rationality.

I agree that the reasoning is correct in both cases (or rather: could be correct, assuming some details), but the lesson I derive is that we have to be really careful about our assumptions here.

Normally, in game theory, we're comfortable asserting that re-labeling the options doesn't matter (and re-numbering the players also doesn't matter). But normally we aren't worried about anthropic uncertainty in a game.

If we suppose that players can see their numbers, as well, this can be used as a signal to break symmetry for anti-matching. Player 1 can choose option 1, and player 2 can choose option 2. (Or whatever -- they just have to agree on an anti-matching policy acausally.)

Thinking physically, the question is: are the two players physically precisely the same (including environment), at least insofar as the players can tell? Then anti-matching is hard. Usually we don't need to think about such things for game theory (since a game is a highly abstracted representation of the physical situation).
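A minimal sketch of the point about perfect sameness, restricted to deterministic policies (an assumption; randomized policies are not modelled here). The function names and observation labels are hypothetical. It enumerates every policy the two copies could share and asks whether any of them anti-matches.

```python
from itertools import product

OPTIONS = (1, 2)

def all_policies(observations):
    """Yield every deterministic map from observation to option."""
    for choices in product(OPTIONS, repeat=len(observations)):
        yield dict(zip(observations, choices))

def can_anti_match(obs_player_1, obs_player_2):
    """True if some shared policy makes the two players pick different options."""
    observations = sorted({obs_player_1, obs_player_2})
    return any(
        policy[obs_player_1] != policy[obs_player_2]
        for policy in all_policies(observations)
    )

print(can_anti_match("same", "same"))  # identical observations: False
print(can_anti_match(1, 2))            # players see their numbers: True
```

With identical observations, no shared deterministic policy separates the players. Once each player sees its own number, the policy "play your number" does.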

But this is one reason why correlated equilibria are, usually, a better abstraction than Nash equilibria. For example, a game of chicken is similar to anti-matching. In correlated equilibria, there is a "fair" solution to chicken: each player goes straight with 50% probability (and the other player swerves). This corresponds to the idea of a traffic light. If traffic lights were not invented, some other correlating signal from the environment might be used (particularly as we assume increasingly intelligent agents). This is a possible game-theoretic explanation for divination practices such as reading entrails.
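To make the traffic-light idea concrete, here is a small sketch that checks the correlated-equilibrium condition directly: no player should gain by deviating from the action the signal recommends. The payoff numbers are hypothetical and chosen only to have the usual chicken ordering; `is_correlated_equilibrium` is an illustrative helper, not a library function.

```python
S, W = "straight", "swerve"
ACTIONS = (S, W)

# Hypothetical payoffs (player 0, player 1); only the ordering matters.
PAYOFF = {
    (S, S): (-10, -10),
    (S, W): (1, -1),
    (W, S): (-1, 1),
    (W, W): (0, 0),
}

def is_correlated_equilibrium(dist, tol=1e-12):
    """Check that no player gains by deviating from any recommended action."""
    for player in (0, 1):
        for recommended in ACTIONS:
            for deviation in ACTIONS:
                gain = 0.0
                for other in ACTIONS:
                    obey = (recommended, other) if player == 0 else (other, recommended)
                    dev = (deviation, other) if player == 0 else (other, deviation)
                    gain += dist.get(obey, 0.0) * (
                        PAYOFF[dev][player] - PAYOFF[obey][player]
                    )
                if gain > tol:
                    return False
    return True

# The "traffic light": a fair signal tells exactly one player to go straight.
traffic_light = {(S, W): 0.5, (W, S): 0.5}
# For contrast: independent, uniformly random recommendations.
uniform = {profile: 0.25 for profile in PAYOFF}

print(is_correlated_equilibrium(traffic_light))
print(is_correlated_equilibrium(uniform))
```

Under these assumed payoffs the traffic-light distribution passes the check, while the uniform distribution fails it: a player told to go straight would rather swerve, since the other player goes straight half the time.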

Nash equilibria, otoh, are a better abstraction for the case where there truly is no "environment" to take complicated signals from (besides what you explicitly represent in the game). It better fits a way of thinking where models are supposed to be complete.

Bunthut (5y)

are the two players physically precisely the same (including environment), at least insofar as the players can tell?

In the examples I gave, yes. Because that's the case where we have a guarantee of equal policy, from which people try to generalize. If we say players can see their number, then the twins in the prisoner's dilemma needn't play the same way either.

But this is one reason why correlated equilibria are, usually, a better abstraction than Nash equilibria.

The "signals" players receive for correlated equilibria are already semantic. So I'm suspicious that they seem better only because they call more heavily on our intuition, with the risks that implies. For example, I remember reading about a result to the effect that correlated equilibria are easier to learn. This is not something we would expect from your explanation of the differences: if we explicitly added something (like the signals) into the game, it would generally get more complicated.

abramdemski (5y)

The "signals" players receive for correlated equilibria are already semantic. So I'm suspicious that they seem better only because they call more heavily on our intuition, with the risks that implies. For example, I remember reading about a result to the effect that correlated equilibria are easier to learn. This is not something we would expect from your explanation of the differences: if we explicitly added something (like the signals) into the game, it would generally get more complicated.

It's not something we would naively expect, but it does further speak in favor of CE, yes?

In particular, if you look at those learnability results, it turns out that the "external signal" which the agents are using to correlate their actions is the play history itself. I.e., they are only using information which must be available to learning agents. (Granted, sufficiently forgetful learning agents might forget the history; however, I do not think the learnability results actually rely on any detailed memory of the history. The result still holds with very simple agents who only remember a few parameters, with no explicit episodic memory, unlike, e.g., tit-for-tat.)

adamShimi (5y)

I don't see how the two problems are the same. They are basically the agreement and symmetry breaking problems of distributed computing, and those two are not equivalent in all models. What you're saying is simply that in the no-communication model (where the same algorithm is used on two processes that can't communicate), these two problems are not equivalent. But they are asking for fundamentally different properties, and are not equivalent in many models that actually allow communication.

Bunthut (5y)

"The same" in what sense? Are you saying that what I described in the context of game theory is not surprising, or outlining a way to explain it in retrospect?

Communication won't make a difference if you're playing with a copy.

adamShimi (5y)

Well, if I understand the post correctly, you're saying that these two problems are fundamentally the same problem, and so rationality should be able to solve them both if it can solve one. I disagree with that, because from the perspective of distributed computing (which I'm used to), these two problems are exactly the two kinds of problems that are fundamentally distinct in a distributed setting: agreement and symmetry-breaking.

Communication won't make a difference if you're playing with a copy.

Actually it could. Basically all of distributed computing assumes that every process is running the same algorithm, and you can solve symmetry-breaking in this case with communication and additional constraints on the scheduling of processes. (The difficulty here is that the underlying graph is symmetric, whereas if you had some form of asymmetry, like three processes in a line, such that the one in the middle has two neighbors but the others only have one, then you could use that asymmetry directly to solve symmetry-breaking.)

(By the way, you just gave me the idea that maybe I can use my knowledge of distributed computing to look at the sort of decision problems where you play with copies? Don't know if it would be useful, but that's interesting at least)

Bunthut (5y)

Well, if I understand the post correctly, you're saying that these two problems are fundamentally the same problem

No. I think:

...the reasoning presented is correct in both cases, and the lesson here is for our expectations of rationality...

As outlined in the last paragraph of the post. I want to convince people that TDT-like decision theories won't give a "neat" game theory, by giving an example where they're even less neat than classical game theory.

Actually it could.

I think you're thinking about a realistic case (same algorithm, similar environment) rather than the perfect symmetry used in the argument. A communication channel is of no use there because you could just ask yourself what you would send, if you had one, and then you know you would have just gotten that message from the copy as well.
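The argument can be sketched as a tiny simulation, assuming both copies are the same deterministic procedure and the channel delivers messages symmetrically. The names `run_copies`, `example_message` and `example_decide` are hypothetical; any deterministic pair of functions could be substituted.

```python
def run_copies(message, decide, rounds):
    """Run two copies of one deterministic agent over a symmetric channel."""
    inbox_a, inbox_b = (), ()
    for _ in range(rounds):
        # Each copy computes its next message from what it has received so far.
        msg_a, msg_b = message(inbox_a), message(inbox_b)
        # Each copy receives the message the other one sent.
        inbox_a, inbox_b = inbox_a + (msg_b,), inbox_b + (msg_a,)
    return decide(inbox_a), decide(inbox_b)

def example_message(inbox):
    """Arbitrary deterministic message rule (illustrative only)."""
    return (sum(inbox) * 3 + len(inbox)) % 5

def example_decide(inbox):
    """Arbitrary deterministic choice between options 1 and 2."""
    return 1 + sum(inbox) % 2

choice_a, choice_b = run_copies(example_message, example_decide, rounds=4)
print(choice_a == choice_b)  # True
```

Both inboxes start equal, so both copies send the same message, so the inboxes stay equal after every round. The final choices therefore coincide however the two functions are written, as long as they are deterministic.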

I can use my knowledge of distributed computing to look at the sort of decision problems where you play with copies

I'd be interested. I think even just more solved examples of the reasoning we want are useful currently.

adamShimi (5y)

As outlined in the last paragraph of the post. I want to convince people that TDT-like decision theories won't give a "neat" game theory, by giving an example where they're even less neat than classical game theory.

Hum, then I'm not sure I understand in what way classical game theory is neater here?

I think you're thinking about a realistic case (same algorithm, similar environment) rather than the perfect symmetry used in the argument. A communication channel is of no use there because you could just ask yourself what you would send, if you had one, and then you know you would have just gotten that message from the copy as well.

As long as the probabilistic coin flips are independent on both sides (you also mention the case where they're symmetric, but let's put that aside for the example), then you can apply the basic probabilistic algorithm for leader election: both copies flip a coin n times to get an n-bit number, which they exchange. If the numbers are different, then the copy with the smaller one says 0 and the other says 1; otherwise they flip a coin and return the answer. With this algorithm, you have probability at least 1 - 2^(-n) of deciding different values (the two n-bit numbers coincide with probability 2^(-n)), and so you can get as close as you want to 1 (by paying the price in more random bits).
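A minimal sketch of this protocol, under one reading of the tie-break: on a tie, each copy flips its own independent coin (that detail is an assumption, as are the function names). The two `random.Random` instances stand in for the two copies' private randomness.

```python
import random
from fractions import Fraction

def elect(n, rng_a, rng_b):
    """One run of the protocol; returns the bits output by copy A and copy B."""
    number_a = rng_a.getrandbits(n)
    number_b = rng_b.getrandbits(n)
    if number_a != number_b:
        # The copy holding the smaller number outputs 0, the other outputs 1.
        return (0, 1) if number_a < number_b else (1, 0)
    # Tie: each copy falls back to its own fresh coin flip (assumed reading).
    return rng_a.getrandbits(1), rng_b.getrandbits(1)

def probability_different(n):
    """Exact chance the outputs differ, given independent fair coins."""
    tie = Fraction(1, 2 ** n)
    return (1 - tie) + tie * Fraction(1, 2)

# Independent randomness: the outputs differ in almost every run.
rng_a, rng_b = random.Random(1), random.Random(2)
runs = [elect(8, rng_a, rng_b) for _ in range(1000)]
print(sum(a != b for a, b in runs) / len(runs))

# Perfectly symmetric randomness (same seed): the outputs never differ.
twin_a, twin_b = random.Random(0), random.Random(0)
print(any(a != b for a, b in (elect(8, twin_a, twin_b) for _ in range(1000))))
```

Under that reading the exact probability of differing outputs is 1 - 2^(-(n+1)), which `probability_different` returns as a fraction. The second experiment shows the other side of the discussion: if the copies' coins are perfectly correlated, the numbers always tie and the protocol never breaks symmetry.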

I'd be interested. I think even just more solved examples of the reasoning we want are useful currently.

Do you have examples of problems with copies that I could look at and that you think would be useful to study?

AI ALIGNMENT FORUM
AF
