# 0


I'm going to consider two players, Alice and Bob, who like to share media over a communication channel. For example, Alice and Bob might share music over Spotify. I assume both players send music at equal, but stochastic, rates. I'm also going to make some value assumptions. First, assume Alice and Bob's behavior can be reduced to the optimization of a predictive coding objective. Second, Alice and Bob's media consumption habits are i.i.d. samples from their respective prior distributions. Moreover, these consumption habits strongly reveal their personal preferences. Altogether we have:

1. Alice and Bob are optimizing a predictive coding objective
2. Alice and Bob's consumption habits can be described as i.i.d. samples from their respective prior distributions
3. Consumption habits are revealed preferences
4. Alice and Bob send each other i.i.d. sampled media at equal rates

With these assumptions, I can introduce the game. Alice and Bob send each other media sampled from their respective message distributions. Because both send at equal rates, the distribution of the message channel then becomes an equal mixture of the two message distributions. By assumption one, Alice and Bob pay a cost that depends on how well their consumption habits match this channel. There's also an additional prediction cost: Alice and Bob's message distributions also serve as predictions of what the other person would like to receive. By assumption three, Alice and Bob should therefore pay a cost that depends on how well their message distributions match each other. The goal of both Alice and Bob is to minimize their objectives.

What shape should the cost take? I will interpret the predictive coding objective literally and say that Alice and Bob pay costs proportional to how much additional processing is necessary to properly encode the messages they receive. I've argued that KL divergence is a natural measure for this, and it also fits with what you might get out of predictive coding. Thus, overall, each player's objective combines two KL-divergence terms.

The left term indicates how well Alice's consumption habits inform her about the message channel between her and Bob. The right term indicates how well the messages she sends to Bob line up with Bob's revealed preferences. Similar reasoning applies to Bob's objective.
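As a sketch of what such objectives could look like, suppose Alice and Bob have priors $p_A, p_B$, message distributions $m_A, m_B$, and self-interest weights $\alpha_A, \alpha_B \in [0, 1]$. All of this notation is an assumption made for illustration, as are the direction of each divergence and the choice of the other player's prior as the target of the second term:

```latex
\begin{align}
c   &= \tfrac{1}{2}\,(m_A + m_B)
      && \text{(channel: equal mixture, since both send at equal rates)} \\
J_A &= \alpha_A \, D_{\mathrm{KL}}(c \,\|\, p_A)
       + (1 - \alpha_A) \, D_{\mathrm{KL}}(m_A \,\|\, p_B) \\
J_B &= \alpha_B \, D_{\mathrm{KL}}(c \,\|\, p_B)
       + (1 - \alpha_B) \, D_{\mathrm{KL}}(m_B \,\|\, p_A)
\end{align}
```

In this form the left term charges a player for the extra code length needed to encode the channel with a code built from their own habits, and the right term charges them for sending media that the other player's revealed preferences would not predict. Other directions for the divergences are possible and would change which strategies carry infinite cost.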

If both players are purely self-interested, the model reduces to one where Alice and Bob share things they like that the other person isn't sharing. On the other hand, say Alice is purely social and Bob is purely self-interested. In this case, Bob only sends media he likes and Alice will match his preferences. Similarly, if Alice is purely self-interested and Bob is purely social, then Alice only sends media she likes and Bob will match her preferences. Finally, if both become social at the same rate, Alice and Bob will match each other's preferences and jointly optimize the message channel so that it's as enjoyable for both parties as possible.
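The asymmetric cases can be checked numerically. The sketch below assumes a specific objective that is not taken from the text: each player pays `alpha * KL(channel || own prior) + (1 - alpha) * KL(own messages || other's prior)`, where `alpha` is a hypothetical self-interest weight in [0, 1] and the channel is the equal mixture of the two message distributions. The priors are made-up numbers.

```python
import math

def kl(p, q):
    """KL divergence D(p || q) in nats; infinite if p has mass where q has none."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # a zero-probability outcome contributes nothing
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total

def objective(alpha, prior_self, prior_other, m_self, m_other):
    """Assumed cost: alpha * D(channel || prior_self) + (1 - alpha) * D(m_self || prior_other)."""
    channel = [(a + b) / 2 for a, b in zip(m_self, m_other)]
    cost = 0.0
    if alpha > 0:  # skip zero-weight terms so that 0 * inf never appears
        cost += alpha * kl(channel, prior_self)
    if alpha < 1:
        cost += (1 - alpha) * kl(m_self, prior_other)
    return cost

# Hypothetical priors over three pieces of media.
p_alice = [0.7, 0.2, 0.1]
p_bob = [0.1, 0.3, 0.6]

# Alice purely social (alpha = 0), Bob purely self-interested (alpha = 1):
# Bob sends what he likes and Alice matches his preferences.
cost_alice = objective(0.0, p_alice, p_bob, m_self=p_bob, m_other=p_bob)
cost_bob = objective(1.0, p_bob, p_alice, m_self=p_bob, m_other=p_bob)

# If Alice instead sent what she likes, her social term would be positive.
deviation_cost = objective(0.0, p_alice, p_bob, m_self=p_alice, m_other=p_bob)

print(cost_alice, cost_bob, deviation_cost)
```

Under these assumptions both players reach zero cost, which is the smallest value a KL-based objective can take, so neither can gain by deviating. Swapping the roles of Alice and Bob gives the mirrored case.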

At the moment, a player is defined by a prior distribution, a message distribution, and a self-interest parameter. If a player's self-interest is small, we'll say the player is social. At this point I'd also like to introduce a symbol to indicate a null message, or silence, on the communication channel.

The main difficulty with the analysis is that there are a variety of strategies for game-play. I'm interested in finding Nash equilibria, but I also want to show that there are certain quantities that behave in predictable ways. In that direction I have a partial result:

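Theorem: If Alice and Bob have empty common support and both players are sufficiently social, then the Nash equilibrium for communication is an empty communication channel.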

It's easier to see this with an example. We'll assume that the sample space consists of . Say the two players have an empty shared support and that both are social. A player can either send nothing or defect. I claim the payoffs are,

The case of cooperation is easy to see. Both players stay silent, and this allows everyone to avoid paying an infinite cost. However, imagine either player defects. Clearly, neither would add something outside of their own personal support, so we're really just talking about shifting mass. Say Alice shifts probability mass from silence onto her own support; then the social-divergence term will increase. Thus, if Alice is sufficiently social, this increased cost will outweigh her personal savings. The same holds for Bob. It follows that if Alice and Bob are sufficiently social, sub-opt really will be less than opt, which means silence is a Nash equilibrium. In other words, if Alice and Bob are sufficiently social and have empty common support, the Nash equilibrium for communication is an empty communication channel.
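The argument can be made concrete with a small numerical sketch. Everything specific here is an assumption for illustration: the three outcomes (one item only Alice likes, one item only Bob likes, and silence), the prior values, the hypothetical self-interest weight `alpha`, and the form of the cost, `alpha * KL(channel || own prior) + (1 - alpha) * KL(own messages || other's prior)`.

```python
import math

def kl(p, q):
    """KL divergence D(p || q) in nats; infinite if p has mass where q has none."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # a zero-probability outcome contributes nothing
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total

def objective(alpha, prior_self, prior_other, m_self, m_other):
    """Assumed cost: alpha * D(channel || prior_self) + (1 - alpha) * D(m_self || prior_other)."""
    channel = [(a + b) / 2 for a, b in zip(m_self, m_other)]
    cost = 0.0
    if alpha > 0:  # skip zero-weight terms so that 0 * inf never appears
        cost += alpha * kl(channel, prior_self)
    if alpha < 1:
        cost += (1 - alpha) * kl(m_self, prior_other)
    return cost

# Outcomes: [item only Alice likes, item only Bob likes, silence].
# The priors share no media; only silence is common to both.
p_alice = [0.5, 0.0, 0.5]
p_bob = [0.0, 0.5, 0.5]

silence = [0.0, 0.0, 1.0]
defect = [0.5, 0.0, 0.5]  # Alice shifts half her mass onto her own item

# A partly social Alice (alpha = 0.5) facing a silent Bob.
social_silent = objective(0.5, p_alice, p_bob, silence, silence)
social_defect = objective(0.5, p_alice, p_bob, defect, silence)

# A purely self-interested Alice (alpha = 1) facing a silent Bob.
selfish_silent = objective(1.0, p_alice, p_bob, silence, silence)
selfish_defect = objective(1.0, p_alice, p_bob, defect, silence)

print(social_silent, social_defect)
print(selfish_silent, selfish_defect)
```

With this assumed cost, defecting lowers Alice's personal term but makes her social term infinite, so any positive social weight is enough to keep her silent, while a purely self-interested Alice prefers to defect. A different choice of divergence direction could make the penalty finite, in which case how social a player must be would depend on the priors.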

The point here is that even if Alice and Bob have a small set of mutual support, that is not enough to guarantee that their predictive loss will be finite. We also need Alice and Bob to be sufficiently social, that is, invested in using information from the other's revealed preferences.

This basic model could be generalized in a few different directions. First, we can introduce non-i.i.d. messages using KL-divergence rates. Second, we could introduce multiple players and start analyzing which preferences are most relatable to a group. Finally, one could refine the analysis presented here to look at convergence rates and finite sampling effects.
