AI coordination needs clear wins

evhub

Thanks to Kate Woolverton and Richard Ngo for useful conversations, comments, and feedback.

EA and AI safety have invested a lot of resources into building our ability to get coordination and cooperation between big AI labs. So far, however, despite that investment, it doesn’t seem to me like we’ve had that many big coordination “wins” yet. I don’t mean to imply that our efforts have failed; the obvious other hypothesis is just that we don’t really have that much to coordinate on right now, other than the very nebulous goal of improving our general coordination/cooperation capabilities.

However, I think that our lack of clear wins is actually a pretty big problem, and not just because I think there are useful things that we can plausibly coordinate on right now, but also because I expect our lack of clear wins now to limit our ability to get the sort of cooperation we need in the future.

In the theory of political capital, it is a fairly well-established fact that “Everybody Loves a Winner.” That is: the more you succeed at leveraging your influence to get things done, the more influence you get in return. This phenomenon is most thoroughly studied in the context of the ability of U.S. presidents to get their agendas through Congress. Contrary to a naive model that might predict that legislative success uses up a president’s influence, what is actually found is the opposite: legislative success engenders future legislative success, greater presidential approval, and long-term gains for the president’s party.

I think many people who think about the mechanics of leveraging influence don’t really understand this phenomenon and conceptualize their influence as a finite resource to be saved up over time so it can all be spent down when it matters most. But I think that is just not how it works: if people see you successfully leveraging influence to change things, you become seen as a person who has influence, has the ability to change things, can get things done, etc. in a way that gives you more influence in the future, not less.

Of course, you do have to actually succeed to make this work—if you try to spend your influence to make something happen and fail, you get the opposite effect. This suggests the obvious strategy, however, of starting with small but nevertheless clear coordination wins and working our way up towards larger ones—which is exactly the strategy that I think we should be pursuing.^[1]

[1] In that vein, in a follow-up post, I will propose a particular clear, concrete coordination task that I think might be achievable soon given the current landscape, would generate a clear win, and would be highly useful in and of itself.

EA and AI safety have invested a lot of resources into building our ability to get coordination and cooperation between big AI labs.

Wait, really? Can you name some examples? I thought this was mostly being left to the big AI labs. Maybe I should be talking to the people investing these resources.

The one big coordination win I recall us having was the 2015 Research Priorities document that among other things talked about the threat of superintelligence. The open letter it was an attachment to was signed by over 8000 people, including prominent AI and ML researchers.

And then there's basically been nothing of equal magnitude since then.

Is the best way to suggest how to do political and policy strategy, or coordination, to post it publicly on LessWrong? This seems obviously suboptimal, and I'd think that you should probably ask for feedback and look into how to promote cooperation privately first.

That said, I think everything you said here is correct on an object level, and worth thinking about.

I'd think that you should probably ask for feedback and look into how to promote cooperation privately first.

I have done this also.

I disagree with the conclusion of this post, but still found it a valuable reference for a bunch of arguments I do think are important to model in the space.