Comments

Eli Tyre · 6mo

Man, I agree with almost all the content of this post, but dispute the framing. This seems like maybe an opportunity to write up some related thoughts about transparency in the x-risk ecosystem.


A few months ago, I had the opportunity to talk with a number of EA-aligned or x-risk-concerned folks working in policy or policy-adjacent roles as part of a grant evaluation process. My views here are informed by those conversations, but I am overall quite far from the action of AI policy. I try to carefully flag my epistemic state regarding the claims below.

Omission

I think a lot of people, especially in AI governance, are...  

  1. Saying things that they think are true
  2. while leaving out other important things that they think are true, but that are so extreme or weird-sounding that saying them would cost credibility.

A central example is promoting regulations on frontier AI systems on the grounds that powerful AI systems could develop bio-weapons that could be misused to wipe out large swaths of humanity.

I think that most of the people promoting that policy agenda with that argumentation do in fact think that AI-developed bioweapons are a real risk of the next 15 years. And, I guess, many to most of them think that there is also a risk of an AI takeover (including one that results in human extinction) within a similar timeframe. They're in fact more concerned about the AI takeover risk, but they're focusing on the bio-weapons misuse case because that's more defensible, and (they think) easier to get others to take seriously.[1] So they're more likely to succeed in getting their agenda passed into law if they focus on those more plausible-sounding risks.

This is not, according to me, a lie. They are not making up the danger of AI-designed bio-weapons. And it is normal, in politics, to not say many things that you think and believe. If a person were asked point-blank about the risk of AI takeover, and they gave an answer that implied the risk was lower than they privately think it is, I would consider that a lie. But failing to volunteer that info when you're not being asked for it is something different.[2]

However, I do find this dynamic of saying defensible things within the Overton window, and leaving out your more extreme beliefs, concerning.

It is on the table that we will have Superintelligence radically transforming planet Earth by 2028. And government actors who might be able to take action on that now are talking to advisors who do think that kind of radical transformation is possible on that short a timeframe. But those advisors hold back from telling the government actors that they think that, because they expect to lose the credibility they have.

This sure looks sus to me. It sure seems like a sad world where almost all of the people who were in a position to give a serious warning to the people in power opted not to, and so the people in power didn't take the threat seriously until it was too late.

But it is important to keep in mind that the people I'm criticizing are much closer to the relevant action than I am. They may just be straightforwardly correct that they would be discredited if they talked about Superintelligence in the near term.

I would be pretty surprised by that, given that e.g. OpenAI is talking about Superintelligence in the near term. And it overall becomes a lot less weird to talk about if 50 people from FHI, OpenPhil, the labs, etc. are openly saying that they think the risk of human extinction is >10%, instead of just that weird, longstanding kooky guy, Eliezer Yudkowsky.

And it seems like if you lose credibility for soothsaying, and then your soothsaying looks like it's coming true, you will earn your credibility back later? I don't know if that's actually how it works in politics.

But I'm not an expert here. I've heard at least one secondhand anecdote of an EA in DC "coming out" as seriously concerned about Superintelligence and AI takeover risk, and losing points for doing so.

And overall, I have 1000x less experience engaging with government than these people, who have specialized in this kind of thing. I suspect that they're pretty calibrated about how different classes of people will react.

I am personally not sure how to balance advocating for a policy that seems more sensible and higher-integrity to me, on my inside view, with taking into account the expertise of the people in these positions. For the time being, I'm trying to be transparent that my inside view says EA policy people should be much more transparent about what they think, while also not punishing those people for following a different standard.

Belief-suppression

However, it gets worse than that. It's not only that many policy folks are not expressing their full beliefs; I think they're further exerting pressure on others not to express their full beliefs.

When I talk to EA people working in policy, about new people entering the advocacy space, they almost universally express some level of concern, due to "poisoning the well" dynamics.

To lay out an example of poisoning the well:

Let's say that some young EAs are excited about the opportunities to influence AI policy. They show up in DC and manage to schedule meetings with staffers. They talk about AI and AI risk, and maybe advocate for some specific policy like a licensing regime.

But they're amateurs. They don't really know what they're doing, and they commit a bunch of faux pas, revealing that they don't know important facts about the relevant coalitions in Congress, or which kinds of things are at all politically feasible. The staffers mark these people as unserious fools who don't know what they're talking about and who wasted their time. They disregard whatever proposal was put forward as unserious. (The staffer doesn't let on about this, though. Standard practice is to act polite, and then laugh about the meeting with your peers over drinks.)

Then, 6 months later, a different, more established advocacy group or think tank comes forward with a very similar policy. But they're now fighting an uphill battle, since people in government have already formed associations with that policy, and with the general worldview behind it.

As near as I can tell, this poisoning-the-well effect is real.

People in government are overwhelmed with ideas, policies, and decisions. They don't have time to read the full reports, and often make relatively quick judgments. 

And furthermore, they're used to reasoning according to a coalitional logic. Getting legislation passed is not just a matter of whether it is a good idea; it largely depends on the social context of the legislation. Who an idea is associated with is a strong determinant of whether to take it seriously.[3]

But this dynamic causes some established EA DC policy people to be wary of new people entering the space unless they already have a lot of policy experience, such that they can avoid making those kinds of faux pas. They would prefer that anyone entering the space have high levels of native social tact and additionally be familiar with DC etiquette.

I don't know this to be the case, but I wouldn't be surprised if people's sense of "DC etiquette" includes not talking about, or not focusing too much on, extreme, sci-fi-sounding scenarios. I would guess that one person working in the policy space can mess things up for everyone else in that space, and so that creates a kind of conformity pressure whereby everyone expresses the same sorts of things.

To be clear, I know that that isn't happening universally. There's at least one person that I talked to, working at org X, who suggested the opposite approach: they wanted a new advocacy org to explicitly not try to sync their messaging with org X. They thought it made more sense for different groups, especially if they had different beliefs about what's necessary for a good future, to advocate for different policies.

But my guess is that there's a lot of this kind of thing, where there's a social pressure, amongst EA policy people, toward revealing less of one's private beliefs, lest one be seen as something of a loose cannon.

Even insofar as my inside view is mistaken about how productive it would be for individuals to speak their beliefs straightforwardly, there's an additional question of how well-coordinated this kind of policy should be. My guess is that by trying to all stay within the Overton window, the EA policy ecosystem as a whole is preventing the Overton window from shifting, and it would be better if there were less social pressure towards conformity, to enable more cascading social updates.

  1. ^

    I'm sure that some of those folks would deny that they're more concerned about AI takeover risks. Some of them would claim something like agnosticism about which risks are biggest. 

  2. ^

    That said, my guess is that many of the people I'm thinking of in these policy positions, if they were asked point-blank, might lie in exactly that way. I have no specific evidence of that, but it does seem like the most likely way many of them would respond, given their overall policy about communicating their beliefs.

    I think that kind of lying is very bad: it both misleads the person or people seeking info from you, and is a defection against our collective discourse commons, making it harder for everyone who agrees with you to say what is true.

    And anyone who might be tempted to lie in a situation like that should take some time in advance to think through how they could respond in a way that is both an honest representation of their actual beliefs and also not disruptive to their professional and political commitments.

  3. ^

    And there are common knowledge effects here. Maybe some bumbling fools present a policy to you. You happen to have the ability to assess that their policy proposal is actually a really good idea. But you know that the bumbling fools also presented to a number of your colleagues, who are now snickering at how dumb and non-savvy they were. 

Yet, at no point during this development did any project leap forward by a huge margin. Instead, every paper built upon the last one by making minor improvements and increasing the compute involved. Since these minor improvements nonetheless happened rapidly, the result is that GANs followed a fast development trajectory relative to the lifetimes of humans.

Does anyone have time series data on the effectiveness of Go-playing AI? Does that similarly follow a gradual trend?

AlphaGo seems much closer to "one project leaps forward by a huge margin." But maybe I'm mistaken about how big an improvement AlphaGo was over previous Go AIs.