Work done during SERI MATS 3.0 with mentorship from Jesse Clifton. Huge thanks to Anthony DiGiovanni, Daniel Kokotajlo, Martín Soto, Rubi J. Hudson, and Jan Betley for all the feedback and discussions! Also posted to the EA Forum.
Daniel's post about commitment races motivates why they may be a severe problem. Here, I'll describe a concrete protocol that, if adopted, would let us avoid some of the miscoordination they cause.
The key ingredient is a mandatory time delay during which commitments aren't yet binding. At the end of that delay, you decide whether to make your commitment binding or revert it, and this decision can be conditional on the earlier decisions of other participants. On its own, this would give rise to new races, but those can be managed by adding some additional rules.
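As a minimal sketch of this delay mechanism (the class, the field names, and the one-minute delay below are my own illustration, not part of any specified protocol):

```python
DELAY = 60  # length of the tentative period in seconds (illustrative)

class Commitment:
    """A commitment that only becomes binding after a mandatory delay."""

    def __init__(self, agent: str, content: str, made_at: float):
        self.agent = agent
        self.content = content
        self.made_at = made_at
        self.status = "tentative"  # tentative -> binding | reverted

    def freeze_time(self) -> float:
        # Earliest moment at which the agent may finalize or revert.
        return self.made_at + DELAY

    def decide(self, make_binding: bool, now: float) -> None:
        # The decision may be computed from other agents' earlier decisions;
        # here we only enforce the timing and the one-shot nature of the choice.
        if now < self.freeze_time():
            raise ValueError("still inside the tentative period")
        if self.status != "tentative":
            raise ValueError("already decided")
        self.status = "binding" if make_binding else "reverted"
```

The point of the sketch is just the state machine: a commitment starts tentative, cannot be decided before its freeze time, and is decided exactly once.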
I think the biggest challenge would be to convince the "commitment infrastructure" (which I describe below) to adopt such a protocol.
The protocol relies on some mechanism M on which agents can make commitments - a "commitment infrastructure". M could be something like the Ethereum network, or some powerful international body.
We require that:
2. is needed because the protocol relies on certain commitments being forbidden. Agents could make those forbidden commitments outside of M, so we need to make that as hard as possible for them, compared to committing on M. I think this is the hardest part of the whole proposal. M would need to be locked into place by a network effect: everyone is using M because everyone else is using M.
Here are the rules:
Those rules may seem like a lot, but I think they (or some comparably complex set of rules) are all needed if we want to avoid creating new races later in time. The aim is to have only one race, at the very beginning; everything afterwards should be calm, non-racy, and completely independent of how fast agents can make commitments (e.g., what their ping is, or how well connected they are to the commitment infrastructure).
We have a modified game of chicken with the following payoffs:
Let's set the length of the tentative period at one minute, and let’s say that they have 3 minutes before they potentially crash into each other.
Note that in principle at 0:53 Bob could instead decide to unconditionally Dare even though he is second, hoping that Alice may be too scared to Dare.
But with Boomerang, such ruthless Daring is much less likely than without it. At the time of the decision, Alice and Bob have shared knowledge of who committed first, and only the second one can make a conditional commitment. This breaks the symmetry of the original game of chicken. The option of making a conditional commitment (when you have that option) is pretty compelling: it's safe while still seizing opportunities when they arise. Additionally, it would create a focal point for what the participants are "supposed to do" - everyone expects that the first committer gets to Dare and the second must make a conditional commitment, and deviating from this equilibrium would only hurt you.
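The broken symmetry can be captured in a few lines. This is only my illustration of the focal equilibrium (the decision names and the dictionary encoding are made up; the payoffs play no role here):

```python
# Bob, knowing he is second, commits conditionally on Alice's final decision:
bob = {"if_alice_binds": "Swerve", "if_alice_reverts": "Dare"}

def bobs_action(alice_binds: bool) -> str:
    """Bob's action as a function of whether Alice's Dare became binding."""
    return bob["if_alice_binds" if alice_binds else "if_alice_reverts"]

# Alice, the first committer, unconditionally Dares and binds it; Bob swerves:
assert bobs_action(alice_binds=True) == "Swerve"
# Had Alice reverted instead, Bob's conditional commitment lets him Dare safely:
assert bobs_action(alice_binds=False) == "Dare"
```

Whatever Alice does, Bob's conditional commitment never leaves him in a crash, which is why it dominates unconditional Daring for the second mover.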
With the three rules described above, we managed to avoid the most catastrophic outcome. But that outcome is still pretty poor, because the initial commitments were chosen with almost zero thought. If the agents later notice some Pareto improvement, then to move to this new solution the first agent (Alice) would need to revert her original commitment and give up her privileged position. To be willing to do that, Alice would need a guarantee from the second agent (Bob) that he will also revert. But under the existing protocol, Alice cannot have such a guarantee: after she reverts, Bob could still do whatever he wants, and R3 forbids her from conditioning on commitments that come after hers.
To fix that, we can add another rule:
It may be tricky to see how that helps, so let's rerun our example with that new rule:
We could even have a chain of multiple commitments “conditioning on the future”. In practice we may want to limit that somehow, so that the resolution cannot be delayed indefinitely.
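One crude way to impose such a limit, assuming M tracks the length of the current chain (the cap value and the helper below are purely illustrative, not part of the protocol as stated):

```python
MAX_FUTURE_CHAIN = 3  # illustrative cap; the real value is a design choice for M

def may_condition_on_future(current_chain_length: int) -> bool:
    """Accept a new commitment that conditions on later commitments only
    while the chain is below the cap, so resolution cannot stall forever."""
    return current_chain_length < MAX_FUTURE_CHAIN
```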
Some non-crucial technical details that you may want to skip:
This would only work in very simple cases like chicken, because you would need to know in advance all the possible commitments that others could make, in order to define exactly what "being second in a race" means.
An alternative rule could be to have M generate some random number at freeze_time, and only then let an agent make the final decision, by requiring them to reference that number in the decision message. But that could create a race where the second committer decides to Dare anyway, hoping this information will reach the first committer soon enough to sway them. For this reason we would need to postpone the generation of the second committer's random number until the first committer has decided. But if the protocol is used by many agents at the same time, and we play it safe and assume that everyone may potentially clash with anyone, then we have to postpone every commitment on the network, which scales badly.
To be clear, the decisions would actually be written as formal statements, not natural language, and also explicitly state which commitments they reference.
The order of sending these hashes is irrelevant here. That's why Bob can send that hash first, even though he's the second committer.
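This is a standard commit-reveal scheme; here is a sketch in Python (the decision strings are illustrative):

```python
import hashlib
import secrets

def make_hash(decision: str) -> tuple:
    """Publish only H(decision || nonce) now; reveal (decision, nonce) later.
    The random nonce stops others from brute-forcing the few possible decisions."""
    nonce = secrets.token_hex(16)
    digest = hashlib.sha256((decision + nonce).encode()).hexdigest()
    return digest, nonce

def check(digest: str, decision: str, nonce: str) -> bool:
    """Anyone can verify a reveal against the earlier digest."""
    return hashlib.sha256((decision + nonce).encode()).hexdigest() == digest

# Bob (the second committer) can publish his digest first; it reveals nothing:
bob_digest, bob_nonce = make_hash("revert")
alice_digest, alice_nonce = make_hash("revert")
# Later both reveal, and the reveals are checked against the digests:
assert check(bob_digest, "revert", bob_nonce)
assert check(alice_digest, "revert", alice_nonce)
```

Since a digest leaks nothing about the decision inside it, publishing it early costs nothing, which is why the ordering of the hashes is irrelevant.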
It may be better to adopt Boomerang sooner rather than later: once someone has established a strategic advantage that lets them commit more ruthlessly, they will oppose the adoption of such a protocol. But agents should be keener to accept the protocol while they don't yet know whether they'll be among the advantaged or the disadvantaged.
This works best if commitments on those alternative mechanisms are crisp, so that you can clearly define what will be penalized. E.g., committing through smart contracts is crisper than committing by staking your reputation. But this penalization may be tricky, because it's costly for the penalizer, and you would prefer others to carry that cost. So it requires participants to coordinate to all penalize together. Here's an example technique which may help.
But if we require full anonymity, we lose any positive reputation effects we had. And if we “erase the identity” of whoever behaves ruthlessly, then encountering someone with a fresh identity serves as evidence that they are ruthless, defeating the purpose of this erasure.