TL;DR—We’re distributing $20k in total as prizes for submissions that make effective arguments for the importance of AI safety. The goal is to generate short-form content for outreach to policymakers, management at tech companies, and ML researchers. This competition will be followed by another competition in around a month that focuses on long-form content.
This competition is for short-form arguments for the importance of AI safety. For the competition for distillations of posts, papers, and research agendas, see the Distillation Contest.
Objectives of the arguments
To mitigate AI risk, it’s essential that we convince relevant stakeholders sooner rather than later. To this end, we are initiating a pair of competitions to build effective arguments for a range of audiences. In particular, our audiences include policymakers, tech executives, and ML researchers.
- Policymakers may be unfamiliar with the latest advances in machine learning, and may not have the technical background necessary to understand some/most of the details. Instead, they may focus on societal implications of AI as well as which policies are useful.
- Tech executives are likely aware of the latest technology, but may lack a mechanistic understanding of it. They may come from technical backgrounds and are likely highly educated. They will likely be reading with an eye towards how these arguments concretely affect which projects they fund and who they hire.
- Machine learning researchers can be assumed to have high familiarity with the state of the art in deep learning. They may have previously encountered talk of x-risk but were not compelled to act. They may want to know how the arguments could affect what they should be researching.
We’d like arguments to be written for at least one of the three audiences listed above. Some arguments could speak to multiple audiences, but we expect that trying to speak to all at once could be difficult. After the competition ends, we will test arguments with each audience and collect feedback. We’ll also compile top submissions into a public repository for the benefit of the x-risk community.
Note that we are not interested in arguments for very specific technical strategies towards safety. We are simply looking for sound arguments that AI risk is real and important.
The present competition addresses shorter arguments (paragraphs and one-liners) with a total prize pool of $20k. The prizes will be split among roughly 20-40 winning submissions. Please feel free to make numerous submissions and try your hand at motivating various different risk factors; it's possible that an individual with multiple great submissions could win a good fraction of the prize pool. The prize distribution will be determined by effectiveness and epistemic soundness as judged by us. Arguments must not be misleading.
To submit an entry:
- Please leave a comment on this post (or submit a response to this form), including:
- The original source, if the entry is not your own original work.
- If the entry contains factual claims, a source for the factual claims.
- The intended audience(s) (one or more of the audiences listed above).
- In addition, feel free to adapt another user’s comment by leaving a reply—prizes will be awarded based on the significance and novelty of the adaptation.
Note that if two entries are extremely similar, we will, by default, give credit to the entry which was posted earlier. Please do not submit multiple entries in one comment; if you want to submit multiple entries, make multiple comments.
The first competition will run until May 27th, 11:59 pm PT. In around a month, we’ll release a second competition for generating longer “AI risk executive summaries” (more details to come). If you win an award, we will contact you via your forum account or email.
We are soliciting argumentative paragraphs (of any length) that build intuitive and compelling explanations of AI existential risk.
- Paragraphs could cover various hazards and failure modes, such as weaponized AI, loss of autonomy and enfeeblement, objective misspecification, value lock-in, emergent goals, power-seeking AI, and so on.
- Paragraphs could make points about the philosophical or moral nature of x-risk.
- Paragraphs could be counterarguments to common misconceptions.
- Paragraphs could use analogies, imagery, or inductive examples.
- Paragraphs could contain quotes from intellectuals: “If we continue to accumulate only power and not wisdom, we will surely destroy ourselves” (Carl Sagan), etc.
For a collection of existing paragraphs that submissions should try to do better than, see here.
Paragraphs need not be wholly original. If a paragraph was written by or adapted from somebody else, you must cite the original source. We may provide a prize to the original author as well as the person who brought it to our attention.
Effective one-liners are statements (25 words or fewer) that make memorable, “resounding” points about safety. Here are some (unrefined) examples just to give an idea:
- Vladimir Putin said that whoever leads in AI development will become “the ruler of the world.” (source for quote)
- Inventing machines that are smarter than us is playing with fire.
- Intelligence is power: we have total control of the fate of gorillas, not because we are stronger but because we are smarter. (based on Russell)
One-liners need not be full sentences; they might be evocative phrases or slogans. As with paragraphs, they can be arguments about the nature of x-risk or counterarguments to misconceptions. They do not need to be novel as long as you cite the original source.
Conditions of the prizes
If you accept a prize, you consent to the addition of your submission to the public domain. We expect that top paragraphs and one-liners will be collected into executive summaries in the future. After some experimentation with target audiences, the arguments will be used for various outreach projects.
(We thank the Future Fund regrant program and Yo Shavit and Mantas Mazeika for earlier discussions.)
In short, make a submission by leaving a comment with a paragraph or one-liner. Feel free to enter multiple submissions. In around a month we'll divide the $20k among the best submissions.
I'd like to complain that this project sounds epistemically absolutely awful. It's offering money for arguments explicitly optimized to be convincing (rather than true), it offers money only for prizes making one particular side of the case (i.e. no money for arguments that AI risk is no big deal), and to top it off it's explicitly asking for one-liners.
I understand that it is plausibly worth doing regardless, but man, it feels so wrong having this on LessWrong.
Think of it as a "practicing a dark art of rationality" post, and it would seem less off-putting.
I think it would be less "off-putting" if we had common knowledge of it being such a post. I think the authors don't think of it as that from reading Sidney's comment.
-- Stuart Russell on a February 25, 2021 podcast with the Future of Life Institute.
Look, we already have superhuman intelligences. We call them corporations and while they put out a lot of good stuff, we're not wild about the effects they have on the world. We tell corporations 'hey do what human shareholders want' and the monkey's paw curls and this is what we get.
Anyway yeah that but a thousand times faster, that's what I'm nervous about.
Look, we already have superhuman intelligences. We call them governments and while they put out a lot of good stuff, we're not wild about the effects they have on the world. We tell governments 'hey do what human voters want' and the monkey's paw curls and this is what we get.
Anyway yeah that but a thousand times faster, that's what I'm nervous about.
As recent experience has shown, exponential processes don't need to be smarter than us to utterly upend our way of life. They can go from a few problems here and there to swamping all other considerations in a span of time too fast to react to, if preparations aren't made and those knowledgeable don't have the leeway to act. We are in the early stages of an exponential increase in the power of AI algorithms over human life, and people who work directly on these problems are sounding the alarm right now. It is plausible that we will soon have processes that can escape the lab just as a virus can, and we as a species are pouring billions into gain-of-function research for these algorithms, with little concomitant funding or attention paid to the safety of such research.
What about graphics? e.g. https://twitter.com/DavidSKrueger/status/1520782213175992320
"Most AI research focuses on building machines that do what we say. Alignment research is about building machines that do what we want."
Source: Me, probably heavily inspired by "Human Compatible" and that type of argument. I have used this argument in conversations to explain AI alignment for a while, and I don't remember when I started. But the argument is very CIRL (cooperative inverse reinforcement learning).
I'm not sure if this works as a one-liner explanation. But it does work as a conversation starter for why trying to specify goals directly is a bad idea, and how the things we care about are often hard to measure and therefore hard to instruct an AI to do. Insert reference to King Midas, or talk about what can go wrong with a superintelligent YouTube algorithm that only optimises for clicks.
"Humans rule the earth because we are smart. Some day we'll build something smarter than us. When that happens, we had better make sure it's on our side."
Inspiration: I don't know. I probably stole the structure of this argument from somewhere, but it was too long ago to remember.
By "our side" I mean on the side of humans. I don't mean it as an us-vs-them thing, but maybe it can be read that way. That would be bad. I've never run into that misunderstanding, though, but I also have not talked to politicians.
Question: "effective arguments for the importance of AI safety" - is this about arguments for the importance of just technical AI safety, or more general AI safety, to include governance and similar things?