https://auai.org/uai2021/pdf/uai2021.89.preliminary.pdf (this really is preliminary, i.e. they have not yet uploaded a newer version that incorporates peer review suggestions).
---
Can't do stuff in the second paper without worrying about stuff in the first (unless your model is very simple).
Pretty interesting. Since you are interested in policies that operate along some paths only, you might find these of interest:
https://pubmed.ncbi.nlm.nih.gov/31565035/
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6330047/
We have some recent stuff on generalizing MDPs to have a causal model inside every state ('path dependent structural equation models', to appear in UAI this year).
You can read Halpern's stuff if you want an axiomatization of something like the responses to the do-operator.
Or you can try to understand the relationship of do() and counterfactual random variables, and try to formulate causality as a missing data problem (whereby a full data distribution on counterfactuals and an observed data distribution on factuals are related via a coarsening process).
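To make the missing data view concrete, here is a minimal sketch, not anything taken from the papers above. It assumes a binary treatment A assigned by a fair coin, a constant treatment effect of 2, and consistency (the observed Y equals Y(A)). The function names `simulate_full_data` and `coarsen` are made up for illustration. The full data has both counterfactuals Y(0) and Y(1) for every unit; coarsening keeps only the one that matches the assigned treatment.

```python
import random


def simulate_full_data(n, seed=0):
    """Full data: every unit has both counterfactuals (y0, y1) plus a treatment a."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y0 = rng.gauss(0.0, 1.0)   # outcome had the unit not been treated
        y1 = y0 + 2.0              # outcome had the unit been treated (assumed effect of 2)
        a = rng.randint(0, 1)      # randomized treatment, independent of (y0, y1)
        rows.append((a, y0, y1))
    return rows


def coarsen(full_rows):
    """Observed data: keep only the counterfactual matching the treatment received."""
    observed = []
    for a, y0, y1 in full_rows:
        y = y1 if a == 1 else y0   # consistency: Y = Y(A); the other value is missing
        observed.append((a, y))
    return observed


def mean(values):
    values = list(values)
    return sum(values) / len(values)


full = simulate_full_data(20000)
obs = coarsen(full)

# Computable only from the full data, which we never see in practice.
true_ate = mean(y1 - y0 for _, y0, y1 in full)

# Computable from the observed data; it targets the same quantity here
# only because treatment was randomized in this simulation.
est_ate = mean(y for a, y in obs if a == 1) - mean(y for a, y in obs if a == 0)

print(true_ate, est_ate)
```

The difference in observed means recovers the average effect here only because A is independent of the counterfactuals. If the coarsening depended on (Y(0), Y(1)), you would be in the usual "missing not at random" situation and would need further assumptions to identify anything.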