This is a special post for quick takes by simeon_c. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

Idk what the LW community can do but somehow, to the extent we think liberalism is valuable, the Western democracies need to urgently put a hard stop to Russia's and China's war (preparation) efforts. I fear that rearmament is a key component of the only viable path at this stage.

I won't argue in detail here but link to Noahpinion, who's been quite vocal on those topics. The TLDR is that China and Russia have been scaling their war industry preparation efforts for years, while Western democracies' industries keep declining and remain crazily dependent on Chinese industry. This creates a new global equilibrium where the US is no longer powerful enough to disincentivize all authoritarian regimes from grabbing more land etc.

Some readings relevant to that:

I know this is not a core LW theme but to the extent this threat might be existential to liberalism, and to the existence of LW as a website in the first place, I think we should all care. It would also be quite terrible for safety if AGI was developed during a global war, which seems uncomfortably likely (~10% imo).


Something which concerns me is that transformative AI will likely be a powerful destabilizing force, which will place countries currently behind in AI development (e.g. Russia and China) in a difficult position. Their governments are currently in the position of seeing that peacefully adhering to the status quo may lead to rapid disempowerment, and that the potential for coercive action to interfere with disempowerment is high. It is pretty clearly easier and cheaper to destroy chip fabs than create them, easier to kill tech employees with potent engineering skills than to train new ones.

I agree that conditions of war make safe transitions to AGI harder and make people more likely to accept higher risk. I don't see what to do about the fact that the development of AI power is itself creating pressures towards war. This seems bad. I don't know what I can do to make the situation better though.

Why Putin probably won't stop with Ukraine: https://en.m.wikipedia.org/wiki/Minsk_agreements

How do you draw that conclusion from the Minsk agreements? In those, Ukraine committed to pass laws for Decentralisation of power, including through the adoption of the Ukrainian law "On temporary Order of Local Self-Governance in Particular Districts of Donetsk and Luhansk Oblasts". Instead of Decentralization they passed laws forbidding those districts from teaching children in the languages that those districts want to teach them.

Ukraine's unwillingness to follow the agreements was a key reason why the invasion in 2022 happened and was very popular with the Russian population. Being in denial about that is not helpful if you want to help prevent wars from breaking out.

Having maximalist foreign policy goals is not the way you get peace. 

This creates a new global equilibrium where the US is no longer powerful enough to disincentivize all authoritarian regimes from grabbing more land etc.

The latest illegal land grab was done by Israel without any opposition by the US. If you are truly worried about land grabs being a problem why not speak against that US position of being okay with some land grabs instead of just speaking for buying more weapons?

In those, Ukraine committed to pass laws for Decentralisation of power, including through the adoption of the Ukrainian law "On temporary Order of Local Self-Governance in Particular Districts of Donetsk and Luhansk Oblasts". Instead of Decentralization they passed laws forbidding those districts from teaching children in the languages that those districts want to teach them.

Ukraine's unwillingness to follow the agreements was a key reason why the invasion in 2022 happened and was very popular with the Russian population

I wasn't aware of that; that's useful, thank you.

My (simple) reasoning is that I pattern matched hard to the Anschluss (https://en.wikipedia.org/wiki/Anschluss) as a prelude to WW2 where democracies accepted a first conquest hoping that it would stop there (spoiler: it didn't). 

Minsk very much feels the same way. From the perspective of democracies it seems kinda reasonable to try a peaceful resolution once, accepting a conquest and seeing if Putin stops (although in hindsight it's unreasonable not to prepare for the possibility that he doesn't). Now that he has started invading Ukraine as a whole, it seems really hard for me to believe "once he gets Ukraine, he'll really stop". I expect many reasons to invade other adjacent countries to come up as well.

The latest illegal land grab was done by Israel without any opposition by the US. If you are truly worried about land grabs being a problem why not speak against that US position of being okay with some land grabs instead of just speaking for buying more weapons?

Two things on this. 

  1. Object-level: I'm not ok with this. 
  2. At a meta-level, there's a repugnant moral dilemma fundamental to this:
    1. The American hegemonic power was abused, e.g. see https://en.wikipedia.org/wiki/July_12,_2007,_Baghdad_airstrike or a number of wars that the US created for dubious reasons (i.e. usually some economic or geostrategic interests). (same for France, I'm just focusing on the US here for simplicity)
    2. Still, despite those deep injustices, the 2000s have been the least lethal in terms of interstate conflicts, because hegemony, with its threat of being crushed by the great power, heavily disincentivizes anyone from fighting.
      1. It seems to me that hegemony of some power or coalition of powers is the most stable state for that reason. So I find this state quite desirable.
    3. Then the other question is, who should be in that position?
      1. I have the chance to be able to write this about my country without ending up in jail for it. And if I do end up in jail, I have higher odds than in most other countries of being able to contest it.
      2. So, although western democracies are quite bad and repugnant in a bunch of ways, I find them the least bad and most beneficial existing form of political power whose hegemony to currently defend and preserve.

Minsk very much feels the same way.

The key aspect of Minsk was that it was not put into practice. The annexation of Austria by Germany was fully put into practice and accepted by other states. 

From the perspective of democracies it seems kinda reasonable to try a peaceful resolution once, accepting a conquest and seeing if Putin stops

Ukraine didn't try. They didn't pass the laws that Minsk called for. They did pass laws to discriminate against the Russian-speaking population. They said that they wanted to retake Crimea sooner or later. Ukraine never accepted losing any territory to Russia. 

I expect many reasons to invade other adjacent countries to come up as well.

I don't see why we should ignore reasons. Georgia seems to be willing to produce reasons to be invaded. Maybe, Georgia shouldn't pass such laws? If you are worried about being invaded under the pretext of removing civil rights, maybe not remove civil rights?

I don't think any of the EU countries that border Russia are in a situation that's remotely similar, either in the reasons to invade or in Russia's ability to launch a promising invasion against them.

I am sure that Putin had something like the Anschluss in mind when he started his invasion. 

Luckily for the west, he was wrong about that. 

From a Machiavellian perspective, the war in Ukraine is good for the West: for a modest investment in resources, we can bind a belligerent Russia while someone else does all the dying. From a humanitarian perspective, war is hell and we should hope for a peace where Putin gets whatever he has managed to grab while the rest of Ukraine joins NATO and will be protected by NATO nukes from further aggression. 

I am also not sure that a conventional arms race is the answer to Russia. I am very doubtful that a war between a NATO member and Russia would stay a regional or conventional conflict.

When I last looked a couple of months back, I found very little discussion of this topic in the rationalist communities. The most interesting post was probably this one from 2021: https://forum.effectivealtruism.org/posts/8cr7godn8qN9wjQYj/decreasing-populism-and-improving-democracy-evidence-based

I suppose it's not a popular topic because it rubs up against politics. But I do think that liberal democracy is the operating system for running things like LW, EA, and other communities we all love. It's worth defending it--though what that means exactly is vague to me.

Defending liberal democracy is complex, because everyone wants to say that they are on the side of liberal democracy.

If you take the Verified Voting Foundation as one of the examples of highly recommended projects in the link, mainstream opinion these days is probably that their talking points are problematic because people might trust elections less when the foundation speaks about the need for a more trustworthy election process.

While I personally believe that pushing for a more secure voting system is good, it's a complex situation and many other projects in the space are similar. It's easy for a project that's funded for the purpose of strengthening liberal democracy to do the opposite. 

Sure, but that's no reason not to try.

I think this is a strong argument against "just do something that feels like it's working toward liberal democracy". But not against actually trying to work toward liberal democracy.

I think this is a subset of work on most important problems: time spent figuring out what to work on is surprisingly effective. People don't do it as much as they should because it's frustrating and doesn't feel like it's working toward a rewarding outcome.

For sure! It's a devilishly hard problem. Despite dipping in and out of the topic, I don't feel confident in even forming a problem statement about it. I feel more like one of the blind men touching different parts of an elephant.

But it seems like having many projects like the Verified Voting Foundation should hedge the risk--if each such project focuses on a small part, then the blast radius of unfortunate mistakes should be limited. I would just hope that, on average, we would be trending in the right direction.

One aspect of having many small projects is that it makes it harder to see the whole picture. It obfuscates and makes public criticism harder. 

If someone builds a Ministry of Truth it's easy to criticize it as an Orwellian attack on liberal democracy. If they instead distribute it over hundreds of different organizations, it's a lot harder to conceptualize.

Indeed. One consideration is that the LW community used to be much less into policy-adjacent stuff and hence much less relevant in that domain. Now, with AI governance becoming an increasingly big deal, I think we could potentially use some of that presence to push for certain things in defense.

Pushing for things in the genre of what Noah describes in the first piece I shared seems feasible for some people in policy.

jmh (18d)

On the AI aspect I suspect we could make a small case study out of Israel's use of their AI.

jmh (18d)

I wonder if potential war is the greatest concern with regard to either loss of liberalism or sites like LW. Interesting news story on views about democratic electoral processes and public trust in them, as well as trust that the more democratic form of government will accomplish what it needs to. (Perhaps I'm reading a lot into it with that particular summary, but it was the simplest way I could express it.)

I've not read the report so I'm not sure if the headline is actually accurate about elections -- certainly what is reported in the story doesn't quite support the "voters skeptical about fairness of elections" headline claim. The rest does seem to align with lots of news and events over the past 5 or 10 years.

Rephrasing based on an ask: "Western Democracies need to urgently put a hard stop to Russia's and China's war (preparation) efforts" -> Western Democracies need to urgently take actions to stop the current shift towards a new World order where conflicts are a lot more likely due to Western democracies no longer being a hegemonic power able to crush authoritarian powers that grab land etc. This shift is currently primarily driven by the fact that Russia & China are heavily rearming themselves whereas Western democracies are not.

@Elizabeth

jmh (17d)

I'm unsure if the rephrasing is really helpful or if it's perhaps actually counterproductive. I think the conflict and arming are in many ways the symptom, so focusing on that is not going to be a solution. Additionally, that language seems to play directly into the way both the Russian government and the Chinese government are framing things.

I mean the full option space obviously also includes "bargain with Russia and China to make credible commitments that they stop rearming (possibly in exchange for something)", and I think we should totally explore that path as well, I just don't have much hope in it at this stage which is why I'm focusing on the other option, even if it is a fucked up local Nash equilibrium.

It would also be quite terrible for safety if AGI was developed during a global war, which seems uncomfortably likely (~10% imo).

This may be likely; iirc during wars countries tend to spend more on research, and they could potentially just race to AGI like what happened with the space race. Which could make hard takeoff even more likely.

Given the recent argument on whether Anthropic really did commit to not push the frontier or just misled most people into thinking that it was the case, it's relevant to reread the RSPs in hairsplitting mode. I was rereading the RSPs and noticed a few relevant findings:

Disclaimer: this is focused on negative stuff but does not deny the merits of RSPs etc etc.

  1. I couldn't find any sentence committing to not significantly increase extreme risks. OTOH I found statements that if taken literally could imply an implicit acknowledgment of the opposite: "our most significant immediate commitments include a high standard of security for ASL-3 containment, and a commitment not to deploy ASL-3 models until thorough red-teaming finds no risk of catastrophe.". 
    Note that the statement about risk bears only on deployment measures and not on security. Given that the lack of security is probably the biggest source of risk of ASL-3 systems & the biggest weakness of RSPs, I find it pretty likely that this is not random.
  2. I found a number of commitments that are totally unenforceable in hairsplitting mode. Here are two examples: 
    1. "World-class experts collaborating with prompt engineers should red-team the deployment thoroughly and fail to elicit information at a level of sophistication, accuracy, usefulness, detail, and frequency which significantly enables catastrophic misuse." 
      1. The use of five underdefined adjectives + "significantly" is a pretty safe barrier against any enforcement.
    2. "When informed of a newly discovered model vulnerability enabling catastrophic harm (e.g. a jailbreak or a detection failure), we commit to mitigate or patch it promptly (e.g. 50% of the time in which catastrophic harm could realistically occur)."
      1. The combination of "or", of the characterization of promptly as "50% of the time", of the use of "e.g." and of "realistically" is also a safe barrier against enforceability.
  3. It's only my subjective judgment here and you don't have to trust it but I also found Core Views on AI Safety to have a number of similar patterns.

This debate comes from before the RSP so I don’t actually think that’s cruxy. Will try to dig up an older post.

There was a hot debate recently but regardless, the bottom line is just "RSPs should probably be interpreted literally and nothing else. If a literal statement is not strictly there, it should be assumed it's not a commitment."

I've not seen people doing very literal interpretation on those so I just wanted to emphasize that point.

Raemon (21d)

I currently think Anthropic didn't "explicitly publicly commit" to not advance the rate of capabilities progress. But, I do think they made deceptive statements about it, and when I complain about Anthropic I am complaining about deception, not "failing to uphold literal commitments."

I'm not talking about the RSPs because the writing and conversations I'm talking about came before that. I agree that the RSP is more likely to be a good predictor of what they'll actually do.

I think most of the generator for this was more like "in person conversations", at least one of which was between Dario and Dustin Moskovitz:

Image

The most explicit public statement I know is from this blogpost (which I agree is not an explicit commitment, but, I do think 

  • Capabilities: AI research aimed at making AI systems generally better at any sort of task, including writing, image processing or generation, game playing, etc. Research that makes large language models more efficient, or that improves reinforcement learning algorithms, would fall under this heading. Capabilities work generates and improves on the models that we investigate and utilize in our alignment research. We generally don’t publish this kind of work because we do not wish to advance the rate of AI capabilities progress. In addition, we aim to be thoughtful about demonstrations of frontier capabilities (even without publication). We trained the first version of our headline model, Claude, in the spring of 2022, and decided to prioritize using it for safety research rather than public deployments. We've subsequently begun deploying Claude now that the gap between it and the public state of the art is smaller.

If you wanna reread the debate, you can scroll through this thread (https://x.com/bshlgrs/status/1764701597727416448). 

I've been thinking a lot recently about taxonomizing AI-risk-related concepts to reduce the dimensionality of AI threat modelling while remaining quite comprehensive. It's in the context of developing categories to assess whether labs' plans cover various areas of risk.

There are two questions I'd like to get takes on. Any take on one of these 2 would be very valuable.

  1. In the misalignment threat model space, a number of safety teams tend to assume that the only type of goal misgeneralization that could lead to X-risks is deceptive misalignment. I'm not sure I understand where that confidence comes from. Could anyone make or link to a case that rules out the plausibility of all other forms of goal misgeneralization?
  2. It seems to me that to minimize the dimensionality of the threat modelling, it's sometimes more useful to think about the threat model (e.g. a terrorist misuses an LLM to develop a bioweapon) and sometimes more useful to think about a property which has many downstream consequences on the level of risk. I'd like to get takes on one such property:
    1. Situational awareness: It seems to me that it's most useful to think of this property as its own hazard which has many downstream consequences on the level of risk (most prominently that a model with it can condition on being tested when completing tests). Do you agree or disagree with this take? Or would you rather discuss situational awareness only in the context of the deceptive alignment threat model?

There are a number of properties of AI systems that make it easier to collect information in a safe way about those systems and hence demonstrate their safety: interpretability, formal verifiability, modularity etc. Which adjective would you use to characterize those properties?

 

I'm thinking of "resilience" because from the perspective of an AI developer it helps a lot in understanding the risk profile, but do you have other suggestions?

Some alternatives: 

  1. auditability properties
  2. legibility properties