Safetywashing

Adam Scholl

In southern California there’s a two-acre butterfly preserve owned by the oil company Chevron. They spend little to maintain it, but many millions on television advertisements featuring it as evidence of their environmental stewardship.^[1]

Environmentalists have a word for behavior like this: greenwashing. Greenwashing is when companies misleadingly portray themselves, or their products, as more environmentally friendly than they are.

Greenwashing often does cause real environmental benefit. Take the signs in hotels discouraging you from washing your towels.

My guess is that the net environmental effect of these signs is in fact mildly positive. And while the most central examples of greenwashing involve deception, I’m sure some of these signs are put up by people who earnestly care. But I suspect hotels might tend to care less about water waste if utilities were less expensive, and that Chevron might care less about El Segundo Blue butterflies if environmental regulations were less expensive.

The field of AI alignment is growing rapidly. Each year it attracts more resources, more mindshare, more people trying to help. The more it grows, the more people will be incentivized to misleadingly portray themselves or their projects as more alignment-friendly than they are.

I think some of this is happening already. For example, a capabilities company launched recently with the aim of training transformers to use every API in the world, which they described as the “safest path to general intelligence.” As I understand it, their argument is that this helps with alignment because it involves collecting feedback about people’s preferences, and because humans often wish AI systems could more easily take actions in the physical world, which is easier once you know how to use all the APIs.^[2]

It’s easier to avoid things that are easier to notice, and easier to notice things with good handles. So I propose adopting the handle “safetywashing.”

^[1] From what I can tell, the original source for this claim is the book “The Corporate Planet: Ecology and Politics in the Age of Globalization,” which from my samples seems about as pro-Chevron as you’d expect from the title. So I wouldn’t be stunned if the claim were misleading, though the numbers passed my sanity check, and I did confirm the preserve and advertisements exist.
^[2] I haven’t talked with anyone who works at this company, and all I know about their plans is from the copy on their website. My guess is that their project harms, rather than helps, our ability to ensure AGI remains safe, but I might be missing something.

Oliver Habryka (review for the 2022 Review):

I've used the term "safetywashing" at least once every week or two in the last year. I don't know whether I picked it up from this post, but it still seems good to have an explanation available for a term this useful and this common.
