Review
[People at AI labs] expected heavy scrutiny by leadership and communications teams on what they can state publicly. [...] One discussion with a person working at DeepMind is pending approval before publication. [...] We think organizations discouraging their employees from speaking openly about their views on AI risk is harmful, and we want to encourage more openness.
(I'm the person in question.)
I just want to note that in the case of DeepMind:
(Nothing in the post contradicts what I'm saying here, but I'm worried that readers would get a mistaken impression from it.)
At the end of 2022, following the success of the 2021 MIRI Conversations, Conjecture started a project to host discussions about AGI and alignment with key people in the field. The goal was simple: surface positions and disagreements, identify cruxes, and make these debates public whenever possible for collective benefit.
Given that people and organizations will have to coordinate to best navigate AI's increasing effects, this is the first, minimum-viable step toward coordination. Coordination is impossible without at least common knowledge of the relevant actors' positions and models.
People sharing their beliefs, discussing them, and making as much of that as possible public is strongly positive for several reasons.
First, beliefs expressed in public discussions count as micro-commitments or micro-predictions, and help keep the field honest and truth-seeking. When things are only discussed privately, humans tend to weasel around and take inconsistent positions over time, be it intentionally or involuntarily.
Second, commenters help debates progress faster by pointing out mistakes.
Third, public debates compound. Knowledge shared publicly makes the next generation of arguments more refined and advances public discourse.
We circulated a document about the project to various groups in the field, and invited people from OpenAI, DeepMind, Anthropic, Open Philanthropy, FTX Future Fund, ARC, and MIRI, as well as some independent researchers, to participate in the discussions. We prioritized speaking to people at AGI labs, given that they are focused on building AGI capabilities.
The format of discussions was as follows:
People from ARC, DeepMind, and OpenAI, as well as one independent researcher, agreed to participate. The two discussions with Paul Christiano and John Wentworth will be published shortly. One discussion with a person working at DeepMind is pending approval before publication. After a discussion with an OpenAI researcher took place, OpenAI strongly recommended that its employee not publish it, so we will not be publishing that discussion.
Most people we were in touch with were very interested in participating. However, after checking with their own organizations, many returned saying their organizations would not approve them sharing their positions publicly.
This was in spite of the extensive provisions we made to reduce downsides for them: making it possible to edit the transcript, veto publishing, strict comment moderation, and so on. We think organizations discouraging their employees from speaking openly about their views on AI risk is harmful, and we want to encourage more openness.
We are pausing the project for now, and we have mixed feelings about it. It cost a lot of time to organize and conduct, and we were disappointed to see resistance to having and publishing discussions. On the other hand, the participants and moderators did find them enjoyable and valuable. We expect that even the few discussions that we'll be able to publish will improve public discourse and understanding of cruxes.
We believe Conjecture's status in the AI alignment field at the time was not sufficient to get enough traction, but we encourage any high-status person to try launching similar initiatives.
We'll be interested in running discussions like these again in the future if there's renewed interest, and we appreciate everyone involved in this round.