Many people are very nationalistic, putting their country above all others. Such people can be hazy about what "above all others" can mean, outside of a few clear examples - eg winning a total war totally. They're also very hazy on what is meant by "their country" - geography is certainly involved, as is proclaimed or legal nationality, maybe some ethnic groups or a language, or even just giving deference to certain ideals.
Consider the plight of a communist Croatian Yugoslav nationalist during the 1990s...
I'd argue that the situation these nationalists find themselves in - strong views on poorly defined concepts - is the general human state for preferences. Or, to use an appropriate map and territory analogy:
Some of the debates about the meaning of words are about this extension-of-preferences process. Scott Alexander recommends that we dissolve concepts such as disease, looking for the relevant categories of 'deserves sympathy' and 'acceptable to treat in a medical way'.
And that dissolving is indeed the correct thing for rationalists to do. But, for most people, including most rationalists, 'sick people deserve sympathy' is a starting moral principle, one we've learnt by example and experience in childhood. When we ask 'do obese people deserve sympathy?' we're trying to extend that moral principle to a situation where our map/model (which includes, say, three categories of people: healthy, mildly sick, very sick) no longer matches up with reality.
Scott's dissolving process requires decomposing 'disease' into more nodes, and then applying moral principles to those individual nodes. In this case, a compelling consequentialist analysis is to look at whether condemnation or praise is effective at changing the condition; ie does fat-shaming people make them less likely to be fat, or others less likely to become fat in the first place? Here the moral principle involved is something like "it's wrong to harm someone (eg through shaming them) if there is no benefit to them or others from doing so".
And that's a compelling moral principle, but it's not the same one that we started with. Some people will have a strong "no harm" intuition, of which "sick people deserve sympathy" is merely an illustrative example. But many (most?) will have been taught that sick people deserve sympathy, as a specific moral requirement they should follow. When we dissolve the definition of disease, we lose a part of our moral preferences.
And yes, human values are such a mess that we could do with losing or simplifying a bunch of them. But human values are genuinely complicated, and we don't want to over-simplify them. So it's important to note that the "dissolving" process also generally involves discarding a portion of our values, those that don't fit neatly on the new map we have. It's important to decide when we're willing to pay that price, and when we're not.
We generally see maps as working the other way round: as tools that serve the purposes of our "real" goals. Eliezer writes about how, if definitions didn't stand for some query, something relevant to our "real" preferences, we'd have no reason to care about them.
But if, as I've argued, most of our preferences live in our mental maps, then changing definitions or improving maps can tear up our preferences and values - or at least force us to re-assess them.
This is why I spend so much time thinking about "conservative" values, especially those around the moral foundation of purity. I mainly don't share that moral foundation, so it's clear to me how incoherent it is. It's painful to listen to someone who has that moral foundation twist and turn, trying to justify it with more consequentialist reasoning. Yes, rituals can bind a community together; but are you really telling me that if, say, TV shows or Facebook games were shown to do a better binding job, you'd cheerfully discard those rituals?
But I strongly suspect that, ultimately, the moral foundations I do care about, such as care/harm, are also incoherent when we push too far into unfamiliar territory. So I want to forge something coherent out of purity, as practice for forging something coherent out of all our values.
Your parent, on their deathbed, gives you your mission in life: an old map, a compass, and the instructions "Go west, young man!"
The compass is fine, but, as we know, its concept of west is not exactly the same as the standard geographical one.
In the era and place that your hypothetical parent was from, the connotations of "going west" involve adventure and potential richness.
And, most importantly, neither of you has yet realised that the world is round.
So, for a short while, "going west" seems like a clear, well-defined goal. But as we get to the edge of the map, both literally and metaphorically, the concept starts to lose definition and become far more uncertain; and hence, so does your goal.
What will you do with your goal when your mental maps are forced to change?
Don't worry if you're not actually a young man; their mind was starting to go, towards the end.
Planned summary for the Alignment Newsletter:
This post argues that by default, human preferences are strong views built upon poorly defined concepts that may not have any coherent extrapolation in new situations. To put it another way, humans build mental maps of the world, and their preferences are defined on those maps, so in new situations where the map no longer reflects the world accurately, it is unclear how preferences should be extended. As a result, anyone interested in preference learning should find some incoherent moral intuition that other people hold, and figure out how to make it coherent, as practice for the case we will face where our own values will be incoherent in the face of new situations.
This seems right to me -- we can also see this by looking at the various paradoxes found in the philosophy of ethics, which involve taking everyday moral intuitions and finding extreme situations in which they conflict, and it is unclear which moral intuition should “win”.
Cool, neat summary.