Have human values improved over the last few centuries? Or is it just that more recent human values are naturally closer to our own (current) values, and so we think that there's been moral progress towards us?
If we project out into the future, the first scenario posits continuing moral improvement (as the "improvement trend" continues), and the second posits moral degeneration (as values drift away from our own). So which is it?
I'll make the case that both trends are happening. We have a lot less slavery, racism, and ethnic conflict, and far fewer endorsements of slavery, racism, and ethnic conflict. In an uneven way, poorer people have more effective rights than they did before, so it's somewhat harder to abuse them.
Notice something interesting about the previous examples? They can all be summarised as "some people who were treated badly are now treated better". Many people throughout time would agree that these people are actually being treated better. On the issue of slavery, consider the following question:
- "If X would benefit from being a non-slave more than being a slave, and there were no costs to society, would it be better for X not to be a slave?"
Almost everyone throughout history would have agreed with that, barring a few examples of extremely motivated reasoning. So most defences of slavery rested either on the idea that some classes of people are better off as slaves (almost always a factual error, and generally motivated reasoning), or on the idea that some morally relevant group of people benefited from slavery enough to make it worthwhile.
So most clear examples of moral progress consist of giving benefits to people, such that anyone who knew all the facts would agree those people were better off.
We might expect that trend to continue: as we gain greater knowledge of how to benefit people, and as we gain greater resources, we can expect more people to be benefited.
Values that we have degenerated on
But I'll argue that there is a second class of values that have less of a "direction" to them, and where it could plausibly be argued that we have "degenerated". And hence, where we might expect our descendants to "degenerate" further (ie move further away from us).
Community and extended family values, for example, are areas where much of the past would be horrified by the present. Why are people not (generally) meeting up with their second cousins every two weeks, and why do people waste time gossiping about irrelevant celebrities rather than friends and neighbours?
On issues of honour and reputation, why have we so meekly agreed to become citizens of administrative bureaucracies, deferring to laws and courts, rather than taking pride in meting out our own justice and defending our own honour? "Yes, yes," the hypothetical past person would say, "your current system is fairer and more efficient; but why did it have to turn you all so supine? Are you not free men?"
Play around with vaguely opposite virtues: spontaneity versus responsibility; rationality versus romanticism; pride versus humility; honesty versus tact; and so on. Where is the ideal mean between each pair of extremes? Different people and different cultures put the ideal mean in different places, and there's no reason to suspect that the means are "getting better" rather than just "moving around randomly".
I won't belabour the point; it just seems to me that there are areas where the moral progress narrative makes more sense (giving clear benefits to people who didn't have them) and areas where the "values drift around" narrative makes more sense. And hence we might hope for continuing moral progress in some areas, and degeneration (or at least stagnation) in others.
I think at least some of this has to do with the fact that some forms of local coordination can hurt global coordination (think of price fixing, organized crime, nepotism, nimbyism), so evolution has favored cultures that managed to reduce such local coordination. (Of course if community and extended family values are terminal values instead of instrumental ones this would still imply "degeneration", but I'm not sure if they are.)
I think community and extended family are (or were) terminal values, to some extent at least (which doesn't preclude them being instrumental values also).
I think that conditional on some form of moral anti-realism, community and extended family likely were terminal values, and there has been "moral degeneration" in the sense that we now weigh such values less than before. But it seems to me that conditional on moral anti-realism, slavery was also a kind of terminal value, in the sense that slave owners weighed their own welfare higher than the welfare of slaves, and racism was a kind of terminal value in that people weighed the welfare of people of their own race higher than people of other races. This seems to be what's going on if we put aside the factual claims.
If you disagree with this, can you explain more why it's a terminal value to weigh one's local community or extended family more than others, but not a terminal value to weigh oneself or people in one's race or one's social class (e.g., the nobility or slave owners) more than others? Or why that's not what's going on with racism or slavery?
I was talking about "extended family values" in the sense of "it is good for families to stick together and spend time with each other"; this preference can (and often does) apply to other families as well. I see no analogue for that with slavery.
But yeah, you could argue that racism can be a terminal value, and that slave owners would develop it, as a justification for what might have started as an instrumental value.
It seems that at least some people valued slavery in the sense of wanting to preserve a culture and way of life that included slavery. The following quotes from https://www.battlefields.org/learn/articles/why-non-slaveholding-southerners-fought seem to strongly suggest that slavery/racism (it seems hard to disentangle these) was a terminal value at least for some (again assuming moral anti-realism):
Back to you:
What scares me is the possibility that moral anti-realism is false, but we build an AI under the assumption that it's true, and it "synthesizes" or "learns" or "extrapolates" some terminal value like or analogous to racism, which turns out to be wrong.
One way of dealing with this, in part, is to figure out what would convince you that moral realism was true, and put that in as a strong conditional meta-preference.
I can see two possible ways to convince me that moral realism is true:
Do these seem like things that could be "put in as a strong conditional meta-preference" in your framework?
Yes, very easily.
The main issue is whether these should count as an overwhelming meta-preference - one that outweighs all other considerations. And, as I currently have things set up, the answer is no. I have no doubt that you feel strongly about potentially true moral realism. But I'm certain that this strong feeling is not absurdly strong compared to your other preferences at other moments in your life. So if we synthesised your current preferences, and 1. or 2. ended up being true, then the moral realism would end up playing a large-but-not-dominating role in your moral preferences.
I wouldn't want to change that, because what I'm aiming for is an accurate synthesis of your current preferences, and your current preference for moral-realism-if-it's-true is not, in practice, dominating your preferences. If you wanted to ensure the potential dominance of moral realism, you'd have to put that directly into the synthesis process, as a global meta-preference (section 2.8 of the research agenda).
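To make that weighting distinction concrete, here is a toy sketch. It is not taken from the research agenda; the function, the preference names, and all the numbers are invented for illustration. It contrasts a conditional meta-preference that gets synthesised with whatever strength the person actually gives it (ending up large but not dominating) against a global meta-preference imposed on the synthesis process itself (which can dominate everything else).

```python
# Toy sketch (hypothetical names and weights): conditional vs global
# meta-preference about moral realism in a preference synthesis.

def synthesise(preferences, moral_realism_true, global_override=False):
    """Combine object-level preferences with a meta-preference to follow U_R.

    `preferences` maps preference names to their empirical strengths
    (how strongly the person actually holds them). The conditional
    meta-preference "follow U_R" only activates if moral realism is true.
    """
    weights = dict(preferences)

    if moral_realism_true:
        if global_override:
            # Global meta-preference: the synthesis process itself is told
            # that following U_R dominates all other considerations.
            weights = {name: 0.0 for name in weights}
            weights["follow_U_R"] = 1.0
        else:
            # Conditional meta-preference synthesised like any other:
            # it gets the strength the person actually gives it,
            # so it ends up large but not dominating.
            weights["follow_U_R"] = 0.3

    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


if __name__ == "__main__":
    current_preferences = {"community": 0.2, "honesty": 0.3, "welfare": 0.5}
    # Large-but-not-dominating role for U_R:
    print(synthesise(current_preferences, moral_realism_true=True))
    # Dominant role for U_R, via a global meta-preference:
    print(synthesise(current_preferences, moral_realism_true=True,
                     global_override=True))
```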
But the whole discussion feels a bit peculiar to me. One property often assumed of moral realism is that it is, in some sense, ultimately convincing - that all systems of morality (or all systems derived from humans) will converge to it. Yet when I said a "large-but-not-dominating role in your moral preferences", I was positing that moral realism is true, but that we have a system of morality - U_H - that does not converge to it. I'm not really grasping how this could be possible (you could argue that the moral realism U_R is some sort of acausal-trade convergent function, but that gives an instrumental reason to follow U_R, not an actual reason to have U_R; and I know that a moral system need not be a utility function ^_^).
So yes, I'm a bit confused by true-but-not-convincing moral realisms.