A Rationality Condition for CDT Is That It Equal EDT (Part 1)

COEDT can be thought of as "learning" from an infinite sequence of agents who explore less and less.

Interestingly, the issue COEDT has with sequential decision problems looks suspiciously similar to the folk theorems of iterated game theory. Those theorems also imply that completely aligned agents can get a very bad outcome, because each will maximally punish anyone who deviates from the grim trigger strategy. There might be some kind of folk theorem for COEDT, though there is a complication: conditioning on yourself taking a probability-0 action, you get both worlds where you are being punished and worlds where you are punishing someone else, which might mean counterfactual punishments can't be maximal for everyone at once (yay?).

COEDT ensures that conditionals on each action exist at all, but it doesn't ensure that agents behave even remotely sanely in these conditionals, as it's still conditioning on a very rare event, and the relevant rationality conditions permit agents to behave insanely with very small probability. What would be really nice is to get some set of conditional beliefs under which:

- No one takes any strictly dominated action with nonzero probability (i.e. an action such that all possible worlds where the agent takes this action are worse than all possible worlds where the agent doesn't).
- Conditional on any subset of the agents taking non-strictly-dominated actions, no agent takes any strictly dominated action with nonzero probability.

(I suspect this is easier for common-payoff problems; in non-common-payoff problems, agents might take strictly dominated actions as a form of extortion.)

COEDT doesn't get this but perhaps a similar construction (maybe using the hyperreals?) does.
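To make the strict-dominance condition above concrete, here is a minimal sketch in Python. It is not COEDT itself, only a check of the first condition over a finite set of worlds. The `World` type, the function names, and the convention that an action with nothing to compare against is not dominated are all assumptions made for illustration.

```python
from typing import Dict, Hashable, Iterable, NamedTuple, Set


class World(NamedTuple):
    action: Hashable  # the action the agent takes in this world
    utility: float    # the agent's utility in this world


def is_strictly_dominated(action: Hashable, worlds: Iterable[World]) -> bool:
    """True if every world where `action` is taken is worse than
    every world where it is not taken."""
    worlds = list(worlds)
    taken = [w.utility for w in worlds if w.action == action]
    not_taken = [w.utility for w in worlds if w.action != action]
    if not taken or not not_taken:
        # Assumed convention: with nothing to compare, call it undominated.
        return False
    return max(taken) < min(not_taken)


def dominated_actions(worlds: Iterable[World]) -> Set[Hashable]:
    worlds = list(worlds)
    return {w.action for w in worlds if is_strictly_dominated(w.action, worlds)}


def satisfies_first_condition(belief: Dict[World, float]) -> bool:
    """Check that no strictly dominated action gets nonzero probability.
    `belief` maps each possible world to its probability."""
    bad = dominated_actions(belief)
    return all(p == 0 for w, p in belief.items() if w.action in bad)


# Hypothetical example: action "a" is strictly dominated by not taking "a".
worlds = [World("a", 1.0), World("a", 2.0), World("b", 3.0), World("b", 5.0)]
sane_belief = {worlds[0]: 0.0, worlds[1]: 0.0, worlds[2]: 0.5, worlds[3]: 0.5}
trembling_belief = {worlds[0]: 0.01, worlds[1]: 0.0, worlds[2]: 0.49, worlds[3]: 0.5}
```

Here `sane_belief` passes the check while `trembling_belief` fails it, since the latter puts a small nonzero probability on the dominated action. That is the kind of rare insanity the rationality conditions mentioned above still permit.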

jessicata (7y):

> #4 (implementability): I think of this as the shakiest assumption; it is easy to set up decision problems which violate it. However, I tend to think such setups get the causal structure wrong. Other parents of the action should instead be thought of as children of the action. Furthermore, if an agent is learning about the structure of a situation by repeated exposure to that situation, implementability seems necessary for the agent to come to understand the situation it is in: parents of the action will look like children if you try to perform experiments to see what happens when you do different things.

This assumption seems sketchy to me. In particular, what if you make two copies of a deterministic agent, move them physically far from each other, give them the same information, and ask each to select an action? Clearly, a rational observer who is uncertain about either copy's action will still believe the two copies' actions to be (perfectly) correlated. The two actions can't each be children of each other...
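The perfect correlation in the two-copy setup can be illustrated with a small sketch. It assumes the observer's uncertainty is entirely about which deterministic policy the agent runs; the policy names, the prior, and the observation are all hypothetical.

```python
from fractions import Fraction

# Hypothetical candidate policies: deterministic maps from observation to action.
POLICIES = {
    "always_cooperate": lambda obs: "C",
    "always_defect": lambda obs: "D",
    "echo": lambda obs: obs,
}


def joint_action_distribution(prior, observation):
    """Observer's joint distribution over (copy 1 action, copy 2 action).
    Both copies run the same policy on the same observation."""
    joint = {}
    for name, p in prior.items():
        action = POLICIES[name](observation)
        joint[(action, action)] = joint.get((action, action), 0) + p
    return joint


def prob_actions_match(joint):
    return sum(p for (a1, a2), p in joint.items() if a1 == a2)


# The observer does not know which policy the agent runs.
prior = {
    "always_cooperate": Fraction(1, 2),
    "always_defect": Fraction(1, 4),
    "echo": Fraction(1, 4),
}
joint = joint_action_distribution(prior, observation="C")
match_probability = prob_actions_match(joint)
```

Under these assumptions the observer is uncertain about each copy's action individually (`("C", "C")` has probability 3/4 and `("D", "D")` has probability 1/4), yet `match_probability` is exactly 1. The correlation comes from the shared policy rather than from either action causing the other.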

abramdemski (7y):

Perhaps I should have clarified that when I say CDT, I'm referring to a steel-man CDT which would use some notion of logical causality. I don't think physical counterfactuals are a live hypothesis in our circles, but several people advocate reasoning which looks like logical causality.

Implementability asserts that you should think of yourself as logico-causally controlling your clone when it is a perfect copy.

AI ALIGNMENT FORUM
