We are announcing a $20k bounty for publicly-understandable explainers of AI safety concepts. We are also releasing the results of the AI Safety Arguments competition.
Of the technologists, ML researchers, and policymakers thinking about AI, very few are seriously thinking about AI existential safety. This results in less high-quality research and could also pose difficulties for deployment of safety solutions in the future.
There is no single solution to this problem. However, an increase in the number of publicly accessible discussions of AI risk can help to shift the Overton window towards more serious consideration of AI safety.
Capability advancements have surprised many in the broader ML community. Just as they have made discussion of AGI more possible, they can also help make discussion of existential safety more possible. Still, there are not many good introductory resources on the topic or its various subtopics. Somebody with no background might need to read books or very long sequences of posts to get an idea of why people are worried about AI x-risk. There are a few strong, short introductions to AI x-risk, but some of them are out of date and they aren’t suited for all audiences.
Shane Legg, a co-founder of DeepMind, recently said the following about AGI:
If you go back 10-12 years ago the whole notion of Artificial General Intelligence was lunatic fringe. People [in the field] would literally just roll their eyes and just walk away. [I had that happen] multiple times. [...] [But] every year [the number of people who roll their eyes] becomes less.
We hope that the number of people rolling their eyes at AI safety can be reduced, too. In the case of AGI, increased AI capabilities and public relations efforts by major AI labs have fed more discussion. Conscious efforts to increase public understanding and knowledge of safety could have a similar effect.
The Center for AI Safety is announcing a $20,000 bounty for the best publicly-understandable explainers of topics in AI safety. Each winner will receive $2,000, for a total of up to ten bounty recipients. The bounty is subject to the Terms and Conditions below.
By publicly understandable, we mean understandable to somebody who has never read a book or technical paper on AI safety and who has never read LessWrong or the EA Forum. Work may or may not assume technical knowledge of deep learning and related math, but should make minimal assumptions beyond that.
By explainer, we mean that it digests existing research and ideas into a coherent and comprehensible piece of writing. This means that the work should draw from multiple sources. This is not a bounty for original research, and is intended for work that covers more ground at a higher level than the distillation contest.
Below are some examples of public materials that we value. This should not be taken as an exhaustive list of all existing valuable public contributions.
Note that many of the works above are quite different and do not always agree with each other. Listing them isn’t to say that we agree with everything in them, and we don’t expect to necessarily agree with all claims in the pieces we award bounties to. However, we will not award bounties to work we believe is false or misleading.
Here are some categories of work we believe could be valuable:
There is no particular length of submission we are seeking, but we expect most winning submissions will take less than 30 minutes for a reader/viewer/listener to digest.
Judging will be conducted on a rolling basis, and we may award bounties at any time. Judging is at the discretion of the Center for AI Safety. Winners of the bounty are required to allow for their work to be reprinted with attribution to the author but not necessarily with a link to the original post.
We thank the FTX Future Fund regranting program for the funding for this competition.
We will accept several kinds of submissions:
If your submission is released somewhere other than the forums above, you may submit a link to it here.
The competition will run from today, August 4th, 2022, until December 31st, 2022. The bounty will be awarded on a rolling basis. If funds run out before the end date, we will edit this post to indicate that and also notify everyone who filled out the interest form below.
If you are interested in potentially writing something for this bounty, please fill out this interest form! We may connect you with others interested in working on similar things.
In our previously announced AI Safety Arguments competition, we aimed to compile short arguments for the importance of AI safety. The main intention of the competition was to compile a collection of points to riff on in other work.
We received over 800 submissions, and in this post we are releasing the top ~10% here. These submissions were selected through an effort by 29 volunteers followed by manual review and fact checking by our team, and prizes will be distributed amongst them in varying proportions. Many of the submissions were drawn from previously existing work. The spreadsheet format is inspired by Victoria Krakovna’s very useful spreadsheet of examples of specification gaming.
We would like to note that it is important to be mindful of potential risks when doing any kind of broader outreach, and it’s especially important that those doing outreach are familiar with the audiences they plan to interact with. If you are thinking of using the arguments for public outreach, please consider reaching out to us beforehand. You can contact us at email@example.com.
We hope that the arguments we have identified can serve as a useful compilation of common points of public outreach for AI safety, and can be used in a wide variety of work including the kind of work we are seeking in the competition above.
This was part of one of the winning submissions in the AI safety arguments competition, detailed below.
All winners have been notified. If you haven’t received notice yet via email or LessWrong/EA Forum message, that unfortunately means you did not win.