Let’s consider three ways you can be altruistic towards another agent:
I think a lot of unresolved tensions in ethics come from seeing these types of morality as in opposition to each other, when they’re actually complementary:
Cooperation-morality and deference-morality have the weakness that they can be exploited by the agents we hold those attitudes towards; and so we also have adaptations for deterring or punishing this (which I’ll call conflict-morality). I’ll mostly treat conflict-morality as an implicit part of cooperation-morality and deference-morality; but it’s worth noting that a crucial feature of morality is the coordination of coercion towards those who act immorally.
I’ve mentioned that many moral principles are rational strategies for multi-agent environments even for selfish agents. So when we’re modeling people as rational agents optimizing for some utility function, it’s not clear whether we should view those moral principles as part of their utility functions, versus as part of their strategies. Some arguments for the former:
Some arguments for the latter:
The rough compromise which I use here is to:
I’ll finish by elaborating on how different decision theories endorse different instrumental strategies. Causal decision theories only endorse the same actions as our cooperation-morality intuitions in specific circumstances (e.g. iterated games with indefinite stopping points). By contrast, functional decision theories do so in a much wider range of circumstances (e.g. one-shot prisoner’s dilemmas) by accounting for logical connections between your choices and other agents’ choices. Functional decision theories follow through on commitments you previously made, and sometimes on commitments that you would have made. However, the question of which hypothetical commitments they should follow through on depends on how updateless they are.
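As a rough illustration of this contrast, here is a minimal Python sketch. The payoff values, the grim-trigger opponent, and all function names are illustrative assumptions, not part of the argument itself. The "mirrored" case is a toy stand-in for the logical connection that functional decision theories take into account: the other player is assumed to make the same choice you do, as in a prisoner’s dilemma against an exact copy.

```python
# Prisoner's dilemma payoffs for the row player (illustrative values only).
# T > R > P > S: temptation, reward, punishment, sucker's payoff.
T, R, P, S = 5, 3, 1, 0

PAYOFF = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}


def cdt_one_shot(p_opponent_cooperates: float) -> str:
    """Best response when the opponent's move is treated as independent of mine."""
    def expected_value(me: str) -> float:
        return (p_opponent_cooperates * PAYOFF[(me, "C")]
                + (1 - p_opponent_cooperates) * PAYOFF[(me, "D")])
    return max("CD", key=expected_value)


def mirrored_one_shot() -> str:
    """Best choice when the opponent is assumed to make the same choice I do."""
    return max("CD", key=lambda me: PAYOFF[(me, me)])


def grim_trigger_threshold() -> float:
    """Smallest continuation probability at which cooperating against a
    grim-trigger opponent does at least as well as defecting once."""
    return (T - R) / (T - P)


def cooperation_sustainable(continue_prob: float) -> bool:
    """Compare cooperating forever with a one-time defection followed by
    mutual defection. Requires continue_prob < 1."""
    cooperate_forever = R / (1 - continue_prob)
    defect_once = T + continue_prob * P / (1 - continue_prob)
    return cooperate_forever >= defect_once
```

With these payoffs, the independent-move reasoner defects in the one-shot game whatever it believes about the opponent, and cooperates in the iterated game only when the chance of another round is high enough (here, at least 0.5). The mirrored reasoner cooperates even in the one-shot case.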
Updatelessness can be very powerful: it’s essentially equivalent to making commitments behind a veil of ignorance, which provides an instrumental rationale for implementing cooperation-morality. But it’s very unclear how to assess how justified different levels of updatelessness are. So although it’s tempting to think of updatelessness as a way of deriving care-morality as an instrumental goal, for now I think it’s mainly just an interesting pointer in that direction. (In particular, I feel confused about the relationship between single-agent updatelessness and multi-agent updatelessness like the original veil of ignorance thought experiment; I also don’t know what it looks like to make commitments “before” having values.)
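To make the veil-of-ignorance framing concrete, here is a toy Python sketch under strong simplifying assumptions: two roles (a "helper" who can pay a cost and a "recipient" who gains a benefit), a known probability of landing in each role, and payoffs that are purely selfish. The roles, numbers, and function names are hypothetical. The sketch says nothing about how much updatelessness is justified.

```python
def value_of_commitment_behind_veil(benefit: float, cost: float,
                                    p_helper: float = 0.5) -> float:
    """Expected selfish payoff of committing to 'help if I turn out to be
    the helper', evaluated before learning which role I occupy."""
    return (1 - p_helper) * benefit - p_helper * cost


def updated_helper_would_help(cost: float) -> bool:
    """A selfish agent that re-optimizes after learning it is the helper
    pays only if helping costs it nothing."""
    return -cost >= 0
```

For example, with `benefit=3.0` and `cost=1.0` the commitment has positive expected value before roles are revealed, yet an agent that updates on being the helper would decline to pay. An updateless agent would follow through anyway.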
Lastly, I think deference-morality is the most straightforward to derive as an instrumentally useful strategy, conditional on fully trusting the agent you’re deferring to; epistemic deference intuitions are pretty common-sense. If you don’t fully trust that agent, though, then it seems very tricky to reason about how much you should defer to them, because they may be manipulating you heavily. In such cases the approach that seems most robust is to diversify worldviews using a meta-rationality strategy which includes some strong principles.