Counterfactual resiliency test for non-causal models — AI Alignment Forum