Alignment tax: “How much more difficult will it be to create an aligned AI vs an unaligned AI when it becomes possible to create powerful AI?”
If the alignment tax is low, people have less incentive to build an unaligned AI, since they'd prefer to build a system that's trying to do what they want. In that case, one way to increase the probability that our AI trajectory goes well is to focus on reducing the alignment tax.
Unipolar / Multipolar: "Will transformative AI systems be privately controlled by one organization or many?" Questions to consider when making this more precise are: What if all the relevant organizations are within one political bloc, like the USA, or many, like the USA + China + Russia + India? What if the humans are unified into a single faction, but the AIs are divided into multiple camps? What if it's the other way around? Also, for each of these kinds of unipolarity or multipolarity, perhaps the world will transition from one type to another at some point, so the question becomes whether the world is unipolar or multipolar "in the crucial period."
There are implications of the unipolar/multipolar variable for AI governance, but also for technical AI safety (e.g. it's more important to build AI that can do bargaining and game theory well, to the extent that the world will be multipolar).
Huh, I'm surprised this one got downvoted -- I had always thought of it as uncontroversially important. I'd be interested to hear why. Maybe the idea is that we should be reducing the alignment tax rather than organizing to pay it, and reducing the tax works the same way in unipolar and multipolar scenarios? EDIT: OK, now it's been strong-upvoted, lol. I guess my takeaway is that this is a controversial one.
Risk Awareness: “In the critical period, will it be widely believed by most of the relevant people that AI is a serious existential risk?”
This is closely related to whether or not there are warning shots or fire alarms, but in principle it could happen without either.
I would add: "Will relevant people expect AI to have extreme benefits, such as a significant percentage-point reduction in other existential risks or a technological solution to aging?"
Value-symmetry: "Will AI systems in the critical period be equally useful for different values?"
This could fail if, for example, we can build AI systems that are very good at optimizing for easy-to-measure values but significantly worse at optimizing for hard-to-measure values. It might be easy to build a sovereign AI that maximizes the profit of a company, but hard to create one that cares about humans and what they want.
Evan Hubinger has some operationalizations of things like this here and here.
Deceptive alignment: “In the critical period, will AIs be deceptive?”
Within the framework of Risks from Learned Optimization, this occurs when a mesa-optimizer has a different objective than the base objective, but instrumentally optimizes the base objective in order to deceive humans. More generally, it can refer to any scenario in which an AI system instrumentally behaves one way in order to deceive humans.
Craziness: "Will the world be weird and crazy in the crucial period?" For example: are lots of important things happening fast, such that it's hard to keep up without AI assistance? Is the strategic landscape importantly different from what we expected, thanks to new technologies and/or other developments? Does the landscape of effective strategies for AI risk reducers look importantly different than it does now?
Coordination Easier/Harder: "In the crucial period, will the relevant kind of coordination be easier or harder than it is today?" For example, perhaps the relevant kind of coordination is coordination to not build AGI until more safety work has been done. This is closely related to the question of whether collective epistemology will have improved or deteriorated.
Homogeneity: "Will transformative AI systems be trained/created all in the same way?"
From Evan’s post:
If there is only one AI, or many copies of the same AI, then you get a very homogenous takeoff, whereas if there are many different AIs trained via very different training regimes, then you get a heterogenous takeoff. Of particular importance is likely to be how homogenous the alignment of these systems is—that is, are deployed AI systems likely to all be equivalently aligned/misaligned, or some aligned and others misaligned?
Timelines: The intuitive definition is "When will crazy AI stuff start to happen?" The best analysis of timelines I know of is Ajeya's, which uses the definition of Transformative AI given here: roughly, "When will the first AI be built that is capable of causing a change in the world comparable to the Industrial Revolution or greater?" My own preferred definition of timelines is "When is the first AI-induced potential point of no return?"
Takeoff speeds: "Will takeoff be fast or slow (or hard or soft, etc.)?"
This post gives an excellent overview of the various versions and operationalizations of this variable.
How dependent is the AGI on idiosyncratic hardware? While any algorithm can run on any hardware, in practice every algorithm will run faster and more energy-efficiently on hardware designed specifically for that algorithm. But there's a continuum from "runs perfectly fine on widely-available hardware, with maybe 10% speedup on a custom ASIC" to "runs a trillion times faster on a very specific type of room-sized quantum computer that only one company on earth has figured out how to make".
If your AGI algorithm requires a weird new chip / processor technology to run at a reasonable cost, that makes it less far-fetched (although still pretty far-fetched, I think) to hope that governments or other groups could control who is running the AGI algorithm, at least for a couple of years until that chip / processor technology is reinvented, stolen, or reverse-engineered, even when everyone knows that the AGI algorithm exists and how it works.
I think this is an interesting and unique variable -- but it seems too predictable to me. In particular, I'd be surprised if custom hardware gives more than a 100x speedup to whatever the relevant transformative AI turns out to be, and in fact I'd be willing to bet the speedup would be less than 10x, compared to the hardware used by other major AI companies. (Obviously it'll be 1000x faster than, say, the CPUs on consumer laptops). Do you disagree? I'd be interested to hear your reasons!
Public sympathy vs. dehumanization? ... People could perceive AI algorithms as they do now (just algorithms), or they could perceive (some) AI algorithms as deserving of rights and sympathy, as they and their human friends are. Or other possibilities, I suppose. I think it would depend strongly on the nature of the algorithm, as well as on superficial things like whether there are widely available AI algorithms with cute faces and charismatic, human-like personalities, and whether the public even knows that the algorithm exists, as well as random things like how the issue gets politicized. A related issue is whether the algorithms are actually conscious, capable of suffering, etc., which would presumably feed into public perceptions, as well as mattering in its own right.
Open / Closed: "Will transformative AI systems in the critical period be publicly available?"
A world where everyone has access to transformative AI systems, for example by being able to rent them (like GPT-3's API once it's publicly available), might be very different from one where they are kept private by one or more organizations.
For example, if strategy stealing doesn't hold, this could dramatically change the distribution of power, because the systems might be more helpful for some tasks and values than others.
This variable could also affect timeline estimates if publicly accessible TAI systems increase GDP growth, among other effects it could have on the world.
Which variables are most important for predicting and influencing how AI goes?
Here are some examples:
We made this question to crowd-source more entries for our list, along with operationalizations and judgments of relative importance. This is the first step of a larger project.
Instructions: