(Co-written by Connor Leahy and Gabe)

We have talked to a whole bunch of people about pauses and moratoriums. Members of the AI safety community, investors, business peers, politicians, and more.

Too many of them claimed to be pursuing the following approach:

  1. It would be great if AGI progress stopped, but that is infeasible.
  2. Therefore, I will advocate for what I think is feasible, even if it is not ideal. 
  3. The Overton window being what it is, if I claim a belief that is too extreme, or endorse an infeasible policy proposal, people will take me less seriously on the feasible stuff.
  4. Given this, I will be tactical in what I say, though I will avoid stating outright lies.

Consider if this applies to you, or people close to you.

If it does, let us be clear: hiding your beliefs, in ways that predictably lead people to believe false things, is lying. This is the case regardless of your intentions, and regardless of how it feels.

Not only is it morally wrong, it also makes for a terrible strategy. As it stands, the AI Safety Community itself cannot coordinate to state that we should stop AGI progress right now!

Not only can it not coordinate, the AI Safety Community is defecting, by making it more costly for those who do say it to do so.

We all feel like we are working on the most important things, and that we are being pragmatic realists.

But remember: If you feel stuck in the Overton window, it is because YOU ARE the Overton window.

1. The AI Safety Community is making our job harder

In a saner world, all AGI progress should have already stopped. If we don’t, there’s more than a 10% chance we all die.

Many people in the AI safety community believe this, but they have not stated it publicly. Worse, they have stated different beliefs more saliently, which misdirect everyone else about what should be done, and what the AI safety community believes.

To date, in our efforts to inform, motivate, and coordinate with people, public lying by people in the AI Safety Community has been one of the biggest direct obstacles we have encountered.

The newest example of this is “Responsible Scaling Policies”, with many AI Safety people being much more vocal about their endorsement of RSPs than about their private belief that, in a saner world, all AGI progress should stop right now.

Because of them, we have been told many times that we are a minority voice, and that most people in the AI Safety community (read: Open Philanthropy adjacent) disagree that we should stop all AGI progress right now.

That actually, there is an acceptable way to continue scaling! And since that would make things easier, if there is indeed an acceptable way to continue scaling, that is what we should do, rather than stop all AGI progress right now!

Recently, Dario Amodei (Anthropic CEO) has used the RSP to frame the moratorium position as the most extreme version of an extreme position, and this is the framing that we have seen used over and over again. ARC mirrors this in its version of the RSP proposal, describing it as a “pragmatic middle ground” between a moratorium and doing nothing.

Obviously, all AGI Racers use this against us when we talk to people.

There are very few people that we have consistently seen publicly call for a stop to AGI progress. The clearest ones are Eliezer’s “Shut it All Down” and Nate’s “Fucking stop”.

The loudest silence is from Paul Christiano, whose RSPs are being used to safety-wash scaling.

Proving me wrong is very easy. If you do believe that, in a saner world, we would stop all AGI progress right now, you can just write this publicly.

When called out on this, most people we talk to just fumble.

2. Lying for Personal Gain

We talk to many people who publicly lie about their beliefs.

The justifications are always the same: “it doesn’t feel like lying”, “we don’t state things we do not believe”, “we are playing an inside game, so we must be tactical in what we say to gain influence and power”.

Let me call this for what it is: lying for personal gain. If you state things whose main purpose is to get people to think you believe something else, and you do so to gain more influence and power: you are lying for personal gain.

The results of this “influence and power-grabbing” have materialised many times over in the safety-washing of the AGI race. What a coincidence it is that DeepMind, OpenAI and Anthropic are all related to the AI Safety community.

The only benefit we see from this politicking is that the people lying gain more influence, while the time we have left until AGI keeps getting shorter.

Consider what happens when a community rewards the people who gain more influence by lying!

So many people lie, and they screw not only humanity, but one another.

Many AGI corp leaders will privately state that in a saner world, AGI progress should stop, but they will not state it because it would hurt their ability to race against each other!

Safety people will lie so that they can keep ties with labs in order to “pressure them” and seem reasonable to politicians.

Whatever: they just lie to gain more power.

“DO NOT LIE PUBLICLY ABOUT GRAVE MATTERS” is a very strong baseline. If you want to defect, you need a much stronger reason than “it will benefit my personal influence, and I promise I’ll do good things with it”.

And you need to accept the blame when you’re called out. You should not muddy the waters by justifying your lies, covering them up, telling people they misunderstood, or trying to maintain more influence within the community.

We have seen so many people taken in by this web of lies: from politicians and journalists, to engineers and intellectuals, all the way to the concerned EA or regular citizen who wants to help, but is confused by our message when it looks like the AI safety community is ok with scaling.

Your lies compound and make the world a worse place.

There is an easy way to fix this situation: we can adopt the norm of publicly stating our true beliefs about grave matters.

If you know someone who claims to believe that in a saner world we should stop all AGI progress, tell them to publicly state their beliefs, unequivocally. Very often, you’ll see them fumbling, caught in politicking. And not that rarely, you’ll see that they actually want to keep racing. In these situations, you might want to stop finding excuses for them.

3. The Spirit of Coordination

A very sad thing that we have personally felt is that many people seem so entangled in these politics that they do not understand what the point of honesty even is.

Indeed, from the inside, it is not obvious that honesty is a good choice. If you are honest, publicly honest, or even adversarially honest, you just make more opponents, you have less influence, and you can help less.

This is typical deontology vs consequentialism. Should you be honest, if from your point of view, it increases the chances of doom?

The answer is YES.

a) Politicking has many more unintended consequences than expected.

Whenever you lie, you shoot potential allies at random in the back.
Whenever you lie, you make it more acceptable for people around you to lie.

b) Your behavior, especially if you are a leader, a funder or a major employee (first 10 employees, or responsible for >10% of the headcount of the org), ripples down to everyone around you.

People lower in the respectability/authority/status ranks do defer to your behavior.
People outside of these ranks look at you.
Our work toward stopping AGI progress becomes easier whenever a leader/investor/major employee at OpenAI, DeepMind, Anthropic, ARC, Open Philanthropy, etc. states their beliefs about AGI progress more clearly.
 

c) Honesty is Great.

Existential Risks from AI are now going mainstream. Academics talk about it. Tech CEOs talk about it. You can now talk about it, not be a weirdo, and gain more allies. Polls show that even non-expert citizens express diverse opinions about superintelligence.

Consider the following timeline:

  • ARC & Open Philanthropy state in a press release: “In a sane world, all AGI progress should stop. If we don’t, there’s more than a 10% chance we will all die.”
  • People at AGI labs working in the safety teams echo this message publicly.
  • AGI lab leaders who think this state it publicly.
  • We start coordinating explicitly against orgs (and groups within orgs) that race.
  • We coordinate on a plan whose final publicly stated goal is to get to a world state that most of us agree is not one where humanity’s entire existence is at risk.
  • We publicly, relentlessly optimise for this plan, without compromising on our beliefs.

Whenever you lie for personal gain, you fuck up this timeline.

When you start being publicly honest, you will suffer a personal hit in the short term. But we truly believe that, coordinated and honest, we will have timelines much longer than any Scaling Policy will ever get us.

Comments

Here is a short post explaining some of my views on responsible scaling policies, regulation, and pauses. I wrote it last week in response to several people asking me to write something. Hopefully this helps clear up what I believe.

I don’t think I’ve ever hidden my views about the dangers of AI or the advantages of scaling more slowly and carefully. I generally aim to give honest answers to questions and present my views straightforwardly. I often point out that catastrophic risk would be lower if we could coordinate to build AI systems later and slower; I usually caveat that doing so seems costly and politically challenging and so I expect it to require clearer evidence of risk.

I think this post is quite misleading and unnecessarily adversarial.

I'm not sure if I want to engage further; I might give examples of this later. (See examples below)

(COI: I often talk to and am friendly with many of the groups criticized in this post.)

Examples:

  • It seems to conflate scaling pauses (which aren't clearly very useful) with pausing all AI related progress (hardware, algorithmic development, software). Many people think that scaling pauses aren't clearly that useful due to overhang issues, but hardware pauses are pretty great. However, hardware development and production pauses would clearly be extremely difficult to implement. IMO the sufficient pause AI ask is more like "ask nvidia/tsmc/etc to mostly shut down" rather than "ask AGI labs to pause".
  • More generally, the exact type of pause which would actually be better than (e.g.) well implemented RSPs is a non-trivial technical problem which makes this complex to communicate. I think this is a major reason why people don't say stuff like "obviously, a full pause with XYZ characteristics would be better". For instance, if I was running the US, I'd probably slow down scaling considerably, but I'd mostly be interested in implementing safety standards similar to RSPs due to lack of strong international coordination.
  • The post says "many people believe" a "pause is necessary" claim[1], but the exact claim you state probably isn't actually believed by the people you cite below without additional complications. Like what exact counterfactuals are you comparing? For instance, I think that well implemented RSPs required by a regulatory agency can reduce risk to <5% (partially by stopping in worlds where this appears needed). So as an example, I don't believe a scaling pause is necessary and other interventions would probably reduce risk more (while also probably being politically easier). And, I think a naive "AI scaling pause" doesn't reduce risk that much, certainly less than a high quality US regulatory agency which requires and reviews something like RSPs. When claiming "many people believe", I think you should make a more precise claim that the people you name actually believe.
  • Calling something a "pragmatic middle ground" doesn't imply that there aren't better options (e.g., shut down the whole hardware industry).
  • For instance, I don't think it's "lying" when people advocate for partial reductions in nuclear arms without noting that it would be better to secure sufficient international coordination to guarantee world peace. Like world peace would be great, but idk if it's necessary to talk about. (There is probably less common knowledge in the AI case, but I think this example mostly holds.)
  • This post says "When called out on this, most people we talk to just fumble.". I strongly predict that the people actually mentioned in the part above this (Open Phil, Paul, ARC evals, etc) don't actually fumble and have a reasonable response. So, I think this misleadingly conflates the responses of two different groups at best.
  • More generally, this post seems to claim people have views that I don't actually think they have and assumes the motives for various actions are powerseeking without any evidence for this.
  • The use of the term lying seems like a case of "noncentral fallacy" to me. The post presupposes a communication/advocacy norm and states violations of this norm should be labeled "lying". I'm not sure I'm sold on this communication norm in the first place. (Edit: I think "say the ideal thing" shouldn't be a norm (something where we punish people who violate this), but it does seem probably good in many cases to state the ideal policy.)

  1. The exact text from the post is:

    In a saner world, all AGI progress should have already stopped. If we don’t, there’s more than a 10% chance we all die.

    Many people in the AI safety community believe this, but they have not stated it publicly. Worse, they have stated different beliefs more saliently, which misdirect everyone else about what should be done, and what the AI safety community believes.

  • The title doesn't seem supported by the content. The post doesn't argue that people are being cowardly or aren't being strategic (it does argue they are incorrect and seeking power in an immoral way, but this is different).

For instance, if I was running the US, I'd probably slow down scaling considerably, but I'd mostly be interested in implementing safety standards similar to RSPs due to lack of strong international coordination.

Surely if you were running the US, that would be a great position to try to get international coordination on policies you think are best for everyone?

Sure, but it seems reasonably likely that it would be hard to get that much international coordination.

Maybe - but you definitely can't get it if you don't even try to communicate the thing you think would be better.

[I agree with most of this, and think it's a very useful comment; just pointing out disagreements]

For instance, I think that well implemented RSPs required by a regulatory agency can reduce risk to <5% (partially by stopping in worlds where this appears needed).

I assume this would be a crux with Connor/Gabe (and I think I'm at least much less confident in this than you appear to be).

  • We're already in a world where stopping appears necessary.
  • It's entirely possible we all die before stopping was clearly necessary.
  • What gives you confidence that RSPs would actually trigger a pause?
    • If a lab is stopping for reasons that aren't based on objective conditions in an RSP, then what did the RSP achieve?
    • Absent objective tests that everyone has signed up for, a lab may well not stop, since there'll always be the argument "Well we think that the danger is somewhat high, but it doesn't help if only we pause".
    • It's far from clear that we'll get objective and sufficient conditions for safety (or even for low risk). I don't expect us to - though it'd obviously be nice to be wrong.
      • [EDIT: or rather, ones that allow scaling to continue safely - we already know sufficient conditions for safety: stopping]

Calling something a "pragmatic middle ground" doesn't imply that there aren't better options

I think the objection here is more about what is loosely suggested by the language used, and what is not said - not about logical implications. What is loosely suggested by the ARC Evals language is that it's not sensible to aim for the more "extreme" end of things (pausing), and that this isn't worthy of argument.

Perhaps ARC Evals have a great argument, but they don't make one. I think it's fair to say that they argue the middle ground is practical. I don't think it can be claimed they argue it is pragmatic until they address both the viability of other options, and the risks of various courses. Doing a practical thing that would predictably lead to higher risk is not pragmatic.

It's not clear what the right course is here, but making no substantive argument gives a completely incorrect impression. If they didn't think it was the right place for such an argument, then it'd be easy to say that: that this is a complex question, that it's unclear this course is best, and that RSPs vs Pause vs ... deserves a lot more analysis.

The post presupposes a communication/advocacy norm and states violations of this norm should be labeled "lying". I'm not sure I'm sold on this communication norm in the first place.

I'd agree with that, but I do think that in this case it'd be useful for people/orgs to state both a [here's what we'd like ideally] and a [here's what we're currently pushing for]. I can imagine many cases where this wouldn't hold, but I don't see the argument here. If there is an argument, I'd like to hear it! (fine if it's conditional on not being communicated further)

Thanks for the response, one quick clarification in case this isn't clear.

On:

For instance, I think that well implemented RSPs required by a regulatory agency can reduce risk to <5% (partially by stopping in worlds where this appears needed).

I assume this would be a crux with Connor/Gabe (and I think I'm at least much less confident in this than you appear to be).

It's worth noting here that I'm responding to this passage from the text:

In a saner world, all AGI progress should have already stopped. If we don’t, there’s more than a 10% chance we all die.

Many people in the AI safety community believe this, but they have not stated it publicly. Worse, they have stated different beliefs more saliently, which misdirect everyone else about what should be done, and what the AI safety community believes.

I'm responding to the "many people believe this" which I think implies that the groups they are critiquing believe this. I want to contest what these people believe, not what is actually true.

Like, many of these people think policy interventions other than a pause reduce X-risk below 10%.

Maybe I think something like (numbers not well considered):

  • P(doom) = 35%
  • P(doom | scaling pause by executive order in 2024) = 25%
  • P(doom | good version of regulatory agency doing something like RSP and safety arguments passed into law in 2024) = 5% (depends a ton on details and political buy in!!!)
  • P(doom | full and strong international coordination around pausing all AI related progress for 10+ years which starts by pausing hardware progress and current manufacturing) = 3%

Note that these numbers take into account evidential updates (e.g., probably other good stuff is happening if we have super strong international coordination around pausing AI).

Ah okay - thanks. That's clarifying.

Agreed that the post is at the very least not clear.
In particular, it's obviously not true that [if we don't stop today, there's more than a 10% chance we all die], and I don't think [if we never stop, under any circumstances...] is a case many people would be considering at all.

It'd make sense to be much clearer on the 'this' that "many people believe".

(and I hope you're correct on P(doom)!)

Calling something a "pragmatic middle ground" doesn't imply that there aren't better options

I think the objection here is more about what is loosely suggested by the language used, and what is not said - not about logical implications. What is loosely suggested by the ARC Evals language is that it's not sensible to aim for the more "extreme" end of things (pausing), and that this isn't worthy of argument.

Perhaps ARC Evals have a great argument, but they don't make one. I think it's fair to say that they argue the middle ground is practical. I don't think it can be claimed they argue it is pragmatic until they address both the viability of other options, and the risks of various courses. Doing a practical thing that would predictably lead to higher risk is not pragmatic.

It's not clear what the right course is here, but making no substantive argument gives a completely incorrect impression. If they didn't think it was the right place for such an argument, then it'd be easy to say that: that this is a complex question, that it's unclear this course is best, and that RSPs vs Pause vs ... deserves a lot more analysis.

Yeah, I probably want to walk back my claim a bit. Maybe I want to say "doesn't strongly imply"?

It would have been better if ARC evals noted that the conclusion isn't entirely obvious. It doesn't seem like a huge error to me, but maybe I'm underestimating the ripple effects etc.

As an aside, I think it's good for people and organizations (especially AI labs) to clearly state their views on AI risk, see e.g., my comment here. So I agree with this aspect of the post.

Stating clear views on what ideal government/international policy would look like also seems good.

(And I agree with a bunch of other misc specific points in the post like "we can maybe push the overton window far" and "avoiding saying true things to retain respectability in order to get more power is sketchy".)

(Edit: from a communication best practices perspective, I wish I had noted where I agree in the parent comment rather than here.)

Man, I agree with almost all the content of this post, but dispute the framing. This seems like maybe an opportunity to write up some related thoughts about transparency in the x-risk ecosystem.

 

A few months ago, I had the opportunity to talk with a number of EA-aligned or x-risk-concerned folks working in policy or policy-adjacent roles as part of a grant evaluation process. My views here are informed by those conversations, but I am overall quite far from the action of AI policy stuff. I try to carefully flag my epistemic state regarding the claims below.

Omission

I think a lot of people, especially in AI governance, are...  

  1. Saying things that they think are true
  2. while leaving out other important things that they think are true, but are also so extreme or weird-sounding that they would lose credibility.

A central example is promoting regulations on frontier AI systems because powerful AI systems could develop bio-weapons that could be misused to wipe out large swaths of humanity. 

I think that most of the people promoting that policy agenda with that argumentation do in fact think that AI-developed bioweapons are a real risk over the next 15 years. And, I guess, many to most of them think that there is also a risk of an AI takeover (including one that results in human extinction), within a similar timeframe. They're in fact more concerned about the AI takeover risk, but they're focusing on the bio-weapons misuse case, because that's more defensible, and (they think) easier to get others to take seriously.[1] So they're more likely to succeed in getting their agenda passed into law, if they focus on those more-plausible-sounding risks.

This is not, according to me, a lie. They are not making up the danger of AI-designed bio-weapons. And it is normal, in politics, to not say many things that you think and believe. If a person was asked point-blank about the risk of AI takeover, and they gave an answer that implied the risk was lower than they think it is in private, I would consider that a lie. But failing to volunteer that info when you're not being asked for it is something different.[2]

However, I do find this dynamic of saying defensible things in the overton window, and leaving out your more extreme beliefs, concerning.

It is on the table that we will have Superintelligence radically transforming planet earth by 2028. And government actors who might be able to take action on that now are talking to advisors who do think that that kind of radical transformation is possible in so short a time frame. But those advisors hold back from telling the government actors that they think that, because they expect to lose the credibility they have.

This sure looks sus to me. It sure seems like a sad world where almost all of the people that were in a position to give a serious warning to the people in power opted not to, and so the people in power didn't take the threat seriously until it was too late.

But it is important to keep in mind that the people I am criticizing are much closer to the relevant action than I am. They may just be straightforwardly correct that they will be discredited if they talk about Superintelligence in the near term.

I would be pretty surprised by that, given that e.g. OpenAI is talking about Superintelligence in the near term. And it overall becomes a lot less weird to talk about if 50 people from FHI, OpenPhil, the labs, etc. are openly saying that they think the risk of human extinction is >10%, instead of just that weird, longstanding kooky guy, Eliezer Yudkowsky.

And it seems like if you lose credibility for soothsaying, and then your soothsaying looks like it's coming true, you will earn your credibility back later? I don't know if that's actually how it works in politics.

But I'm not an expert here. I've heard at least one secondhand anecdote of an EA in DC "coming out" as seriously concerned about Superintelligence and AI takeover risk, and losing points for doing so.

And overall, I have 1000x less experience engaging with government than these people, who have specialized in this kind of thing. I suspect that they're pretty calibrated about how different classes of people will react.

I am personally not sure how to balance advocating for a policy that seems more sensible and higher integrity to me, on my inside view, with taking into account the expertise of the people in these positions. For the time being, I'm trying to be transparent that my inside view wishes EA policy people would be much more transparent about what they think, while also not punishing those people for following a different standard.

Belief-suppression

However, it gets worse than that. It's not only that many policy folks are not expressing their full beliefs, I think they're further exerting pressure on others not to express their full beliefs.

When I talk to EA people working in policy, about new people entering the advocacy space, they almost universally express some level of concern, due to "poisoning the well" dynamics.

To lay out an example of poisoning the well:

Let's say that some young EAs are excited about the opportunities to influence AI policy. They show up in DC and manage to schedule meetings with staffers. They talk about AI and AI risk, and maybe advocate for some specific policy like a licensing regime.

But they're amateurs. They don't really know what they're doing, and they commit a bunch of faux pas, revealing that they don't know important facts about the relevant coalitions in Congress, or which kinds of things are at all politically feasible. The staffers mark these people as unserious fools, who don't know what they're talking about, and who wasted their time. They disregard whatever proposal was put forward as un-serious. (The staffer doesn't let on about this, though. Standard practice is to act polite, and then laugh about the meeting with your peers over drinks.)

Then, 6 months later, a different, more established advocacy group or think tank comes forward with a very similar policy. But they're now fighting an uphill battle, since people in government have already formed associations with that policy, and with the general worldview.

As near as I can tell, I think this poisoning-the-well effect is real.

People in government are overwhelmed with ideas, policies, and decisions. They don't have time to read the full reports, and often make relatively quick judgments. 

And furthermore, they're used to reasoning according to a coalition logic. Getting legislation passed is not just a matter of whether it is a good idea, but largely depends on the social context of the legislation. Who an idea is associated with is a strong determinant of whether to take it seriously. [3]

But this dynamic causes some established EA DC policy people to be wary of new people entering the space unless they already have a lot of policy experience, such that they can avoid making those kinds of faux pas. They would prefer that anyone entering the space have high levels of native social tact, and additionally be familiar with DC etiquette.

I don't know this to be the case, but I wouldn't be surprised if people's sense of "DC etiquette" includes not talking about, or not focusing too much on, extreme, sci-fi-sounding scenarios. I would guess that one person working in the policy space can mess things up for everyone else in that space, and so that creates a kind of conformity pressure whereby everyone expresses the same sorts of things.

To be clear, I know that that isn't happening universally. There's at least one person that I talked to, working at org X, who suggested the opposite approach: they wanted a new advocacy org to explicitly not try to sync their messaging with org X. They thought it made more sense for different groups, especially if they had different beliefs about what's necessary for a good future, to advocate for different policies.

But my guess is that there's a lot of this kind of thing, where there's a social pressure, amongst EA policy people, toward revealing less of one's private beliefs, lest one be seen as something of a loose cannon.

Even insofar as my inside view is mistaken about how productive it would be to speak straightforwardly, there's an additional question of how well-coordinated this kind of policy should be. My guess is that by trying to all stay within the overton window, the EA policy ecosystem as a whole is preventing the overton window from shifting, and it would be better if there were less social pressure toward conformity, to enable more cascading social updates.

  1. ^

    I'm sure that some of those folks would deny that they're more concerned about AI takeover risks. Some of them would claim something like agnosticism about which risks are biggest. 

  2. ^

    That said, my guess is that many of the people that I'm thinking of, in these policy positions, if they were asked, point blank, might lie in exactly that way. I have no specific evidence of that, but it does seem like the most likely way many of them would respond, given their overall policy about communicating their beliefs. 

    I think that kind of lying is very bad: it both misleads the person or people who are seeking info from you and defects on our collective discourse commons, by making it harder for everyone who agrees with you to say what is true.

    And anyone who might be tempted to lie in a situation like that should take some time in advance to think through how they could respond in a way that is both an honest representation of their actual beliefs and also not disruptive to their professional and political commitments.

  3. ^

    And there are common knowledge effects here. Maybe some bumbling fools present a policy to you. You happen to have the ability to assess that their policy proposal is actually a really good idea. But you know that the bumbling fools also presented to a number of your colleagues, who are now snickering at how dumb and non-savvy they were. 

If a person was asked point-blank about the risk of AI takeover, and they gave an answer that implied the risk was lower than they think it is in private, I would consider that a lie

[...]

That said, my guess is that many of the people that I'm thinking of, in these policy positions, if they were asked, point blank, might lie in exactly that way. I have no specific evidence of that, but it does seem like the most likely way many of them would respond, given their overall policy about communicating their beliefs. 

As a relevant piece of evidence here, Jason Matheny, when asked point-blank in a senate committee hearing about "how concerned should we be about catastrophic risks from AI?" responded with "I don't know", which seems like it qualifies as a lie by the standard you set here (which, to be clear, I don't super agree with and my intention here is partially to poke holes in your definition of a lie, while also sharing object-level relevant information).

See this video 1:39:00 to 1:43:00: https://www.hsgac.senate.gov/hearings/artificial-intelligence-risks-and-opportunities/ 

Quote (slightly paraphrased because transcription is hard): 

Senator Peters: "The last question before we close. We've heard thoughts from various experts about the risk of human-like artificial intelligence or Artificial General Intelligence, including various catastrophic projections. So my final question is, what is the risk that Artificial General Intelligence poses, and how likely is that to matter in the near future?"

[...]

Matheny: "As is typically my last words: I don't know. I think it's a really difficult question. I think whether AGI is nearer or farther than thought, I think there are things we can do today in either case. Including regulatory frameworks that include standards with third party tests and audits, governance of supply chains so we can understand where large amounts of computing is going, and so that we can prevent large amounts of computing going to places with lower ethical standards that we and other democracies have"

Given my best model of Matheny's beliefs, this sure does not seem like an answer that accurately summarizes his beliefs here, and represents a kind of response that I think causes people to be quite miscalibrated about the beliefs of experts in the field.

In my experience people raise the hypothetical of "but they would be honest when asked point blank" to argue that people working in the space are not being deceptive. However, I have now seen people being asked point blank, and I haven't seen them be more honest than their original evasiveness implied, so I think this should substantially increase people's priors on people doing something more deceptive here. 

Jason Matheny is approximately the most powerful person in the AI policy space. I think he is setting a precedent here for making statements that meet at least the definition of lying you set out in your comment (I am still unsure whether to count that as lying, though it sure doesn't feel honest), and if-anything, if I talk to people in the field, Matheny is generally known as being among the more open and honest people in the space.

I agree that it is important to be clear about the potential for catastrophic AI risk, and I am somewhat disappointed in the answer above (though I think calling "I don't know" lying is a bit of a stretch). But on the whole, I think people have been pretty upfront about catastrophic risk, e.g. Dario has given an explicit P(doom) publicly, all the lab heads have signed the CAIS letter, etc.

Notably, though, that's not what the original post is primarily asking for: it's asking for people to clearly state that they agree that we should pause/stop AI development, not to clearly state that they think AI poses a catastrophic risk. I agree that people should clearly state that they think there's a catastrophic risk, but I disagree that people should clearly state that they think we should pause.

Primarily, that's because I don't actually think trying to get governments to enact some sort of a generic pause would make good policy. Analogizing to climate change, I think getting scientists to say publicly that they think climate change is a real risk helped the cause, but putting pressure on scientists to publicly say that environmentalism/degrowth/etc. would solve the problem has substantially hurt the cause (despite the fact that a magic button that halved consumption would probably solve climate change).

I'm happy to state on the record that, if I had a magic button that I could press that would stop all AGI progress for 50 years, I would absolutely press that button. I don't agree with the idea that it's super important to trot everyone out and get them to say that publicly, but I'm happy to say it for myself.

How do you feel about "In an ideal world, we'd stop all AI progress"? Or "ideally, we'd stop all AI progress"?

I agree with most of this, but I think the "Let me call this for what it is: lying for personal gain" section is silly and doesn't help your case.

The only sense in which it's clear that it's "for personal gain" is that it's lying to get what you want.
Sure, I'm with you that far - but if what someone wants is [a wonderful future for everyone], then that's hardly what most people would describe as "for personal gain".
By this logic, any instrumental action taken towards an altruistic goal would be "for personal gain".

That's just silly.
It's unhelpful too, since it gives people a somewhat legitimate reason to dismiss the broader point.

Of course it's possible that the longer-term altruistic goal is just a rationalization, and people are after power for its own sake, but I don't buy that this is often true - at least not in any clean [they're doing this and only this] sense. (one could have similar altruistic-goal-is-rationalization suspicions about your actions too)

In many cases, I think overconfidence is sufficient explanation.
And if we get into "Ah, but isn't it interesting that this overconfidence leads to power gain", then I'd agree - but then I claim that you should distinguish [conscious motivations] from [motivations we might infer by looking at the human as a whole, deep shadowy subconscious included]. If you're pointing at the latter, please make that clear. (and we might also ask "What actions are not for personal gain, in this sense?")

Again, entirely with you on the rest.
I'm not against accusations that may hurt feelings - but I think that more precision would be preferable here.

The only sense in which it's clear that it's "for personal gain" is that it's lying to get what you want.
Sure, I'm with you that far - but if what someone wants is [a wonderful future for everyone], then that's hardly what most people would describe as "for personal gain".

If Alice lies in order to get influence, with the hope of later using that influence for altruistic ends, it seems fair to call the influence Alice gets 'personal gain'. After all, it's her sense of altruism that will be promoted, not a generic one.

This is not what most people mean by "for personal gain". (I'm not disputing that Alice gets personal gain)

Insofar as the influence is required for altruistic ends, aiming for it doesn't imply aiming for personal gain.
Insofar as the influence is not required for altruistic ends, we have no basis to believe Alice was aiming for it.

"You're just doing that for personal gain!" is not generally taken to mean that you may be genuinely doing your best to create a better world for everyone, as you see it, in a way that many would broadly endorse.

In this context, an appropriate standard is the post's own:
Does this "predictably lead people to believe false things"?
Yes, it does. (if they believe it)

"Lying for personal gain" is a predictably misleading description, unless much stronger claims are being made about motivation (and I don't think there's sufficient evidence to back those up).

The "lying" part I can mostly go along with. (though based on a contextual 'duty' to speak out when it's unusually important; and I think I'd still want to label the two situations differently: [not speaking out] and [explicitly lying] may both be undesirable, but they're not the same thing)
(I don't really think in terms of duties, but it's a reasonable shorthand here)

hiding your beliefs, in ways that predictably lead people to believe false things, is lying

I think this has got to be tempered by Grice to be accurate. Like, if I don't bring up some unusual fact about my life in a brief conversation (e.g. that I consume iron supplements once a week), this predictably leads people to believe something false about my life (that I do not consume iron supplements once a week), but is not reasonably understood as the bad type of lie - otherwise to be an honest person I'd have to tell everyone tons of minutiae about myself all the time that they don't care about.

Is this relevant to the point of the post? Maybe a bit - if I (that is, literally me) don't tell the world that I wish people would stop advancing the frontier of AI, I don't think that's terribly deceitful or ruining coordination. What has to be true for me to have a duty to say that? Maybe for me to be a big AI thinkfluencer or something? I'm not sure, and the post doesn't really make it clear.