Musings on general systems alignment

Epistemic status: Musings from the past month. Still far too vague for satisfaction.

Friends.

I have been away on a retreat this past week, seeking clarity on how to move forward with living a vibrant and beneficial life that resolves problems in the world, but I’m afraid that I have come back empty-handed. I have only vague musings of a bigger picture, but no clear sense of how to take decisive action. I’ll try to share what I have succinctly, so as not to take up too much time.

Understand alignment, not intelligence

Look, our task here is to align the systems that will most influence the future with what is actually good. To do that, we should look out at the world, identify which kinds of systems are most influential, and seek to align them to the benefit of all life on the planet. Intelligence is the means by which certain very powerful systems could have a very large influence over the future, and to that end we ought to be interested in understanding intelligence. But we need not have any particular interest in understanding intelligence for its own sake. What we should be interested in understanding is the means by which any system can exert influence over the future, and the means to align such powerful systems with that which is worth protecting.

Align systems, not AI

There has been great debate about what AI might look like. Will it look like a singleton, or like a tool, or like a set of cloud services, or like a society of competing entities? One person says that a powerful singleton might be dangerous, then another person says that AI might not look much like a powerful singleton.

Yet there is a single unifying issue to resolve here, which is this: how do we build things in the world that are and remain consistently beneficial to all life? How do we construct international treaties that are aligned in this way? How do we construct financial systems that are aligned in this way? How do we construct tools that are aligned in this way? How do we construct belief-forming, observation-making, action-taking agents that are aligned in this way? These questions are connected at a deep level, not merely a surface one, because they all come down to clarifying what is good and implementing it in a tangible system.

There is a hard problem of alignment

There are many difficult problems in AI alignment, but there seems to be one problem at the center that has an entirely different character of difficulty. The hard problem, as I see it, is this: how do we set up any system in a way that is aligned with what is actually good, when any particular operationalization of what is good is certain to be wrong?

The world now looks to us

In the early days of AI safety, there was a narrative that the world was mostly not on our side, that it was our job to beat the world over the head with the hard stick of difficult truths about the dangers of advanced AI in order to wake people up to the impending destruction of life on this planet. This was a good narrative to have in the early days, and it served its purpose, but it is no longer serving us. I think that a better narrative to have now is the following.

The world is like an extremely wealthy but depressed person who realizes that their business empire is rapidly causing the destruction of life, and despite not finding the energy to make sweeping changes on their own, summons just enough clarity to make a large financial gift to a deputy who seems unusually agentic and trustworthy and ethical. That deputy -- that is, us, this community -- faces the difficult task of reforming an empire that is caught up in harmful patterns of politics and finance and prestige, so it is not exactly the case that everyone is "on their side", yet almost everyone in the empire sees that things are not going well, and in moments of clarity urges this deputy onwards, even if they soon return to participate in the very patterns that they hope the deputy will help to resolve.

We are the great hope of our civilization. Us, here, in this community. It is not that our civilization has woken up completely to the dangers of advanced AI. It is that our civilization has not woken up, yet wishes to wake up, and knows that it wishes to wake up, and has found just enough clarity to bestow significant power and resources on us in the hope that we will take up leadership.

In this subtle way, everyone is now on our side. Yet everyone is caught up in the very patterns that, at moments of clarity, they see are causing harm. Our job is to find the resolve to move forward with this difficult task, without getting caught up in the harmful patterns that exist in the world, and without losing track of the subtle way in which everyone is on our side.

This is the story. It is a way of seeing things, an ethos for carrying on with a difficult task that requires coordination with many people. It is a good way of seeing things to the extent that, if we chose to see things in this way, our actions would be beneficial to all life. It seems to me that seeing things this way would indeed be beneficial to all life because it calls us to befriend exactly that within everyone that seeks The Good, without giving even the tiniest accommodation to the patterns of behavior that are causing existential risk.

AI ALIGNMENT FORUM