Superrational Agents Kelly Bet Influence!

[-]Bunthut1y83

Maybe I'm missing something, but it seems to me that all of this is straightforwardly justified through simple selfish Pareto improvements.

Take a look at Critch's cake-splitting example in section 3.5. Now imagine varying the utility of splitting. How high does it need to get before [red->Alice;green->Bob] is no longer a Pareto improvement over [(split)] from both players' selfish perspectives before the observation? It's 27, and that's also exactly where the decision flips when weighing Alice 0.9 and Bob 0.1 in red, and Alice 0.1 and Bob 0.9 in green.
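The threshold can be checked with a few lines of arithmetic. This is only a sketch under assumptions that are not stated in the comment: the whole cake is taken to be worth 30 to whoever receives it and 0 to the other player (chosen because it reproduces the 27 figure), and each player assigns probability 0.9 to their own color. The function names are hypothetical.

```python
# Assumed for illustration: whole cake is worth 30 to its recipient, 0 otherwise.
WHOLE_CAKE = 30.0
# Assumed: Alice puts 0.9 on red, Bob puts 0.9 on green.
P_OWN_COLOR = 0.9


def bet_is_pareto_improvement(split_utility):
    """Before the observation, does each player selfishly prefer the bet to splitting?

    By symmetry, Alice and Bob face the same comparison, so one check covers both.
    """
    expected_from_bet = P_OWN_COLOR * WHOLE_CAKE + (1 - P_OWN_COLOR) * 0.0
    return expected_from_bet > split_utility


def weighted_choice_in_red(split_utility, w_alice=0.9, w_bob=0.1):
    """After red is observed, maximize the influence-weighted sum of utilities."""
    give_alice = w_alice * WHOLE_CAKE + w_bob * 0.0
    split = (w_alice + w_bob) * split_utility
    return "alice" if give_alice > split else "split"


for s in (20.0, 26.9, 27.1):
    print(s, bet_is_pareto_improvement(s), weighted_choice_in_red(s))
```

Under these assumptions both comparisons reduce to 0.9 * 30 versus the split utility, which is why the selfish Pareto check and the weighted decision flip at the same point.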

Intuitively, I would say that the reason you don't bet influence all-or-nothing, or with some other strategy, is precisely because influence is not money. Influence can already be all-or-nothing all by itself, if one player never cares that much more than the other. The influence the "losing" bettor retains in the world where he lost is not some kind of direct benefit to him, the way money would be: it functions instead as a reminder of how bad a treatment he was willing to risk in the unlikely world, and that is of course proportional to how unlikely he thought it is.

So I think all this complicated strategizing you envision in influence betting actually just comes out exactly to Critch's results. It's true that there are many situations where this leads to influence bets that don't matter to the outcome, but they also don't hurt. The theorem only says that actions must be describable as following a certain policy; it doesn't exclude that they can be described by other policies as well.

[-]abramdemski1y21

I think you are right; I was confused. A Pareto-improvement bet is necessarily one which all parties would selfishly consent to (or at least not actively object to).

[-]Gurkenglas5y*30

Suppose instead of a timeline with probabilistic events, the coalition experiences the full tree of all possible futures - but we translate everything to preserve behavior. Then beliefs encode which timelines each member cares about, and bets trade influence (governance tokens) between timelines.

[-]abramdemski5y20

Can you justify Kelly "directly" in terms of Pareto-improvement trades rather than "indirectly" through Pareto-optimality? I feel this gets at the distinction between the selfish vs altruistic view.

[-]SimonM5y00

I also looked into this after that discussion. At the time I thought that this might have been something special about Kelly, but when I did some calculations afterwards I found that I couldn't get this to work in the other direction. I haven't fully parsed what you mean by:

> (And since payoffs of the bet-against-yourself strategy are exactly identical to Kelly betting payoffs, a bunch of Kelly bets at house odds rearrange money in exactly the same way as this.)
> But this is clearly equivalent to how hypotheses redistribute weight during Bayesian updates!
> So, a market of Kelly betters re-distributes money according to Bayesian updates.

So take the following with a (large) grain of salt before I can recheck my reasoning, but:

Everything you've written (as I currently understand it) also applies for many other betting strategies. eg if everyone was betting (the same constant) fractional Kelly.

Specifically the market will clear at the same price (weighted average probability) and "everyone who put money on the winning side picks up a fraction of money proportional to the fraction they originally contributed to that side".

[-]abramdemski5y20

> I also looked into this after that discussion. At the time I thought that this might have been something special about Kelly, but when I did some calculations afterwards I found that I couldn't get this to work in the other direction.

I'm not sure what you mean here. What is "this" in "looked into this" -- Critch's theorem? What is "the other direction"?

> Everything you've written (as I currently understand it) also applies for many other betting strategies. eg if everyone was betting (the same constant) fractional Kelly.
> Specifically the market will clear at the same price (weighted average probability) and "everyone who put money on the winning side picks up a fraction of money proportional to the fraction they originally contributed to that side".

It seems obvious to me that the market will clear at the same price if everyone is using the same fractional Kelly, but if people are using different Kelly fractions, the weighted sum would be correspondingly skewed, right? Anyway, that's not really important here...
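A rough way to see the skew, assuming a binary contract priced at q that pays 1 if the event happens: a bettor with wealth w, belief p, and Kelly fraction f has net demand proportional to w * f * (p - q), so the market clears where these sum to zero. The helper name `clearing_price` and the numbers are made up for illustration.

```python
def clearing_price(wealth, belief, fraction):
    """Price q solving sum_i w_i * f_i * (p_i - q) = 0.

    This is the average belief, weighted by wealth times Kelly fraction.
    """
    stakes = [w * f for w, f in zip(wealth, fraction)]
    return sum(s * p for s, p in zip(stakes, belief)) / sum(stakes)


wealth = [1.0, 1.0]
belief = [0.9, 0.1]

full = clearing_price(wealth, belief, [1.0, 1.0])   # everyone full Kelly
half = clearing_price(wealth, belief, [0.5, 0.5])   # everyone half Kelly
mixed = clearing_price(wealth, belief, [1.0, 0.5])  # different fractions

print(full, half, mixed)
```

With a shared fraction the factor f cancels and the price is the wealth-weighted average belief. With different fractions the price leans toward the beliefs of whoever bets more aggressively.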

The important thing for the connection to Critch's theorem is: the total wealth gets adjusted like Bayes' Law. Other betting strategies may not have this property; for example, fractional Kelly means losers lose less, and winners win less. This doesn't limit us to exactly Kelly (for example, the bet-against-yourself strategy in the post also has the desired property); however, all such strategies must be equivalent to Kelly in terms of the payoffs (otherwise, they wouldn't be equivalent to Bayes in terms of the updates!).

For example, if everyone uses fractional Kelly with the same fraction, then on the first round of betting, the market clears with all the right prices, since everyone is just scaling down how much they bet. However, the subsequent decisions will then get messed up, because everyone has the wrong weights (the weights changed less than they should have).
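Here is a small sketch of that point, assuming a single binary event, a market price equal to the wealth-weighted average belief, and everyone using the same Kelly fraction. The function names are hypothetical. A Kelly bettor's wealth is multiplied by p/q if the event happens and by (1-p)/(1-q) if it does not; fractional Kelly with fraction f moves only a share f of the way toward that multiplier.

```python
def market_price(wealth, belief):
    """Wealth-weighted average belief (assumes a shared Kelly fraction)."""
    return sum(w * p for w, p in zip(wealth, belief)) / sum(wealth)


def settle(wealth, belief, outcome, fraction=1.0):
    """Wealth after one round of (fractional) Kelly betting at the market price."""
    q = market_price(wealth, belief)
    new_wealth = []
    for w, p in zip(wealth, belief):
        kelly_ratio = p / q if outcome else (1 - p) / (1 - q)
        new_wealth.append(w * ((1 - fraction) + fraction * kelly_ratio))
    return new_wealth


def bayes_update(weights, belief, outcome):
    """Posterior weights on each bettor-as-hypothesis after seeing the outcome."""
    joint = [w * (p if outcome else 1 - p) for w, p in zip(weights, belief)]
    total = sum(joint)
    return [j / total for j in joint]


wealth = [0.5, 0.5]
belief = [0.9, 0.1]

kelly = settle(wealth, belief, outcome=True)
half_kelly = settle(wealth, belief, outcome=True, fraction=0.5)
posterior = bayes_update(wealth, belief, outcome=True)

print(kelly, half_kelly, posterior)
```

In this example full Kelly reproduces the Bayesian posterior (about 0.9 and 0.1), while half Kelly leaves the weights at about 0.7 and 0.3: total wealth is still conserved, but the weights have moved less than Bayes' Law requires.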
