Un-manipulable counterfactuals — AI Alignment Forum