Koen Holtman

Computing scientist and Systems architect. Currently doing self-funded AGI safety research.


Counterfactual Planning



LCDT, A Myopic Decision Theory

I'd be interested on your take on...

See the comment here for my take.

LCDT, A Myopic Decision Theory

Joe asked me in this comment:

I'd be interested on your take on Evan's comment on incoherence in LCDT.

To illustrate his point on incoherence, Joe gives a kite example:

Let's say I'm an LCDT agent, and you're a human flying a kite.

My action set: [Say "lovely day, isn't it?"] [Burn your kite]

Your action set: [Move kite left] [Move kite right] [Angrily gesticulate]

Let's say I initially model you as having p = 1/3 of each option, based on your expectation of my actions.

Now I decide to burn your kite.

What should I imagine will happen? If I burn it, your kite pointers are dangling.

Do the [Move kite left] and [Move kite right] actions become NOOPs?

Do I assume that my [burn kite] action fails?

My take is that there is indeed a problem that 'your kite pointers are dangling' in the projection that the LCDT world model will compute. So the projected world will be somewhat weird.

In my mental picture of the most obvious way to implement LCDT and the structural functions attached to the LCDT model, the projection will be weird in the following way. After [burn kite], the action [Move kite left], when applied to the world state produced by [burn kite], will produce a world state where the human is miming that they are flying a kite. They will make the right gestures to move an invisible kite left, they might even be holding a kite rope when making the gestures, but the rope will not be connected to an actual kite.

So this is weird. However, I would not call it 'incoherent' or 'requiring a contradiction' as Joe does:

I cannot coherently assume that the agent has a distribution over action sets that it does not have: this requires a contradiction in my world model.

The phrasing 'contradiction in the world model' evokes the concern that the LCDT-constructed world model might crash or not be solvable, when we use it to score the action [burn kite]. But a nice feature of causal models, even counterfactual ones as generated by LCDT, is that they will never crash: they will always compute a future reward score for any possible candidate action or policy. The score may however be weird. There is a potential GIGO problem here.
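A minimal sketch of why such a projection always yields a score, however weird. The world state and the structural functions (`World`, `burn_kite`, `move_kite_left`, `reward`) are hypothetical names made up for this illustration, they are not taken from the LCDT post:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class World:
    kite_exists: bool = True
    kite_x: int = 0              # horizontal kite position
    human_gesture: str = "none"

def burn_kite(w: World) -> World:
    # Agent action: the kite is destroyed.
    return replace(w, kite_exists=False)

def move_kite_left(w: World) -> World:
    # Human action, drawn from a prior that ignores the agent's action.
    # The gesture always happens; the kite only moves if it still exists.
    new_x = w.kite_x - 1 if w.kite_exists else w.kite_x
    return replace(w, human_gesture="pull left", kite_x=new_x)

def reward(w: World) -> float:
    # Assumed reward function: total, so it is defined on every world state.
    return 1.0 if w.kite_exists else 0.0

# The human mimes flying a kite that is no longer there.
projected = move_kite_left(burn_kite(World()))
```

Because every structural function is total, `reward(projected)` is always defined, even though the projected world contains a human pulling on a kite that does not exist.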

The word 'incoherent' evokes the concern that the model will be so twisted that we can definitely expect weird scores being computed more often than not. If so, the agent actions computed may be ineffective, strangely inappropriate, or even dangerous when applied to the real world.

In other words: garbage world model in, garbage agent decision out.

One specific worry discussed here is that a counterfactual model may output potentially dangerous garbage because it pushes the inputs of the structural functions being used way out of training distribution.

That being said, there can be advantages to imperfection too. If we design just the right kind of 'garbage' into the agent's world model, we may be able to suppress certain dangerous agent incentives, while still having an agent that is otherwise fairly good at doing the job we intend it to do. This is what LCDT is doing, for certain agent jobs, and it is also what my counterfactual planning agent designs here are doing, for certain other agent jobs.

That being said, it is clear (from the comments and I think also from the original post) that most feel that applying LCDT does not produce useful outcomes for all possible jobs we would want agents to do. Notably, when applied to a decision making problem where the agent has to come up with a multi-step reward-maximizing policy/plan, i.e. a typical MDP or RL benchmark problem, LCDT will produce an agent with a hugely impaired planning ability. How hugely will depend in part on the prior used.

Evan's take is that he is not too concerned with this, as he has other agent applications in mind:

an LCDT agent should still be perfectly capable of tasks like simulating HCH

i.e. we can apply LCDT when building an imitation learner, which is different from a reinforcement learner. In the argmax HCH examples above, the agent furthermore is not imitating a human mentor who is present in the real agent environment, but a simulated mentor built up out of simulated humans consulting simulated humans.

On a philosophical thought-experiment level, this combination of LCDT and HCH works for me, it is even elegant. But in applied safety engineering terms, I see several risks with using HCH. For example, if the learned model of humans that the agent uses in HCH calculations is not perfect, then the recursive nature of HCH might amplify these imperfections rather than dampen them, producing outcomes that are very much unaligned. Also, on a more moral-philosophical point, might all these simulated humans become aware that they live in a simulation, and if so will they then seek to take sweet revenge on the people who put them there?

Back to the topic of incoherence. Joe also asks:

Specifically, do you think the issue I'm pointing at is a difference between LCDT and counterfactual planners? (or perhaps that I'm just wrong about the incoherence??)

I see LCDT agents as a subset of all possible counterfactual planning agent architectures, so in that sense there is no difference.

However, in my sequence and paper on counterfactual planning, I construct planning worlds by using quite different world model editing steps than those considered in LCDT. These different steps produce different results in terms of the weirdness or garbage-ness of the planning world model.

The editing step I consider in the main examples of counterfactual planning is that I edit the real world model to construct a planning world model that has a different agent compute core in it, while leaving the physical world outside of the compute core unchanged. Specifically, the planning world models I considered do not accurately depict the software running inside the agent compute core; they depict a compute core running different software.

In terms of plausibility and internal consistency, a compute core running different software is more plausible/coherent than what can happen in the models constructed by LCDT.

As I currently understand things, I believe that CPs are doing planning in a counterfactual-but-coherent world, whereas LCDT is planning in an (intentionally) incoherent world - but I might be wrong in either case.

You are right in both cases, at least if we picture coherence as a sliding scale, not as a binary property. It also depends on the world model you start out with, of course.

LCDT, A Myopic Decision Theory

By this setting, you ensure that the goal-keeper isn't a causal descendant of the LCDT-agent.

Oops! You are right, there is no cutting involved to create C from B in my toy example. Did not realise that. Next time, I need to draw these models on paper before I post, not just in my head.

B and C do work as examples to explore what one might count as deception or non-deception. But my discussion of a random prior above makes sense only if you first extend B to a multi-step model, where the knowledge of the goalkeeper explicitly depends on earlier agent actions.

LCDT, A Myopic Decision Theory

However, I don't think this is quite right (unless I'm missing something) [...] I don't think the situation is significantly different between B and C here. In B, the agent will decide to kick left most of the time since that's the Nash equilibrium. In C the agent will also decide to kick left most of the time: knowing the goalkeeper's likely action still leaves the same Nash solution.

To be clear: the point I was trying to make is also that I do not think that B and C are significantly different in the goalkeeper benchmark. My point was that we need to go to a random prior to produce a real difference.

But your question makes me realise that this goalkeeper benchmark world opens up a bigger can of worms than I expected. When writing it, I was not thinking about Nash equilibrium policies, which I associate mostly with iterated games, and I was specifically thinking about an agent design that uses the planning world to compute a deterministic policy function. To state what I was thinking about in different terms, I was thinking of an agent design that is trying to compute the action that is optimal in the non-iterated gameplay world C.

To produce the Nash equilibrium type behaviour you are thinking about (i.e. the agent will kick left most of the time but not all the time), you need to start out with an agent design that will use the C constructed by LCDT to compute a nondeterministic policy function, which it will then use to compute its real world action. If I follow that line of thought, I would need additional ingredients to make the agent actually compute that Nash equilibrium policy function. I would need to have iterated gameplay in C, with mechanics that allow the goalkeeper to observe whether the agent is playing a non-Nash-equilibrium policy/strategy, so that the goalkeeper will exploit this inefficiency for sure if the agent plays the non-Nash-equilibrium strategy. The possibility of exploitation by the goalkeeper is what would push the optimal agent policy towards a Nash equilibrium. But interestingly, such mechanics where the goalkeeper can learn about a non-Nash agent policy being used might be present in an iterated version of the real world model B, but they will be removed by LCDT from an iterated version of C. (Another wrinkle: some AI algorithms for solving the optimal policy in a single-shot game in B or C would turn B or C into an iterated game automatically and then solve the iterated game. Such iteration might also update the prior, if we are not careful. But if we solve B or C analytically or with Monte Carlo simulation, this type of expansion to an iterated game will not happen.)

Hope this clarifies what I was thinking about. I think it is also true that, if the prior you use in your LCDT construction is that everybody is playing according to a Nash equilibrium, then the agent may end up playing exactly that under LCDT.
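To make the 'kick left most of the time but not all the time' behaviour concrete, here is a minimal sketch that solves the single-shot goalkeeper game as a 2x2 zero-sum game. The scoring probabilities in `score` and the function name `mixed_nash` are assumptions made up for this illustration; they do not come from the post or the comments:

```python
from fractions import Fraction as F

# score[a][k]: assumed probability that the agent scores when it kicks
# toward corner a and the goalkeeper runs toward corner k.
score = {
    "L": {"L": F(2, 5), "R": F(1)},   # stronger left kick: harder to stop
    "R": {"L": F(1), "R": F(1, 5)},
}

def mixed_nash(score):
    """Mixed equilibrium of the 2x2 zero-sum game, assuming no pure one exists.

    Returns (P(agent kicks L), P(goalkeeper runs L)), each chosen so that
    the opponent is indifferent between its two moves.
    """
    ll, lr = score["L"]["L"], score["L"]["R"]
    rl, rr = score["R"]["L"], score["R"]["R"]
    denom = ll - lr - rl + rr
    kick_left = (rr - rl) / denom   # makes the goalkeeper indifferent
    run_left = (rr - lr) / denom    # makes the agent indifferent
    return kick_left, run_left

kick_left, run_left = mixed_nash(score)
```

With these assumed numbers the equilibrium has the agent kicking left 4/7 of the time, so a nondeterministic policy is needed to express it; a deterministic policy function cannot.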

(I plan to comment on your question about incoherence in a few days.)

LCDT, A Myopic Decision Theory


LCDT has major structural similarities with some of the incentive-managing agent designs that have been considered by Everitt et al in work on Causal Influence Diagrams (CIDs), e.g. here and by me in work on counterfactual planning, e.g. here. These similarities are not immediately apparent however from the post above, because of differences in terminology and in the benchmarks chosen.

So I feel it is useful (also as a multi-disciplinary or community-bridging exercise) to make these similarities more explicit in this comment. Below I will map the LCDT defined above to the frameworks of CIDs and counterfactual planning, frameworks that were designed to avoid (and/or expose) all ambiguity by relying on exact mathematical definitions.

Mapping LCDT to detailed math

Lonely CDT is a twist on CDT: an LCDT agent will make its decision by using a causal model just like a CDT agent would, except that the LCDT agent first cuts the last link in every path from its decision node to any other decision node, including its own future decision nodes.

OK, so in the terminology of counterfactual planning defined here, an LCDT agent is built to make decisions by constructing a model of a planning world inside its compute core, then computing the optimal action to take in the planning world, and then doing the same action in the real world. The LCDT planning world model is a causal model, let's call it C. This C is constructed by modifying a causal model B by cutting links. The B we modify is a fully accurate, or reasonably approximate, model of how the LCDT agent interacts with its environment, where the interaction aims to maximize a reward or minimize a loss function.

The planning world model C is a modification of the real world model B that intentionally mis-approximates some of the real world mechanics visible in B. C is constructed to predict future agent actions less accurately than is possible, given all information in B. This intentional mis-approximation makes the LCDT agent into what I call a counterfactual planner. The LCDT agent plans actions that maximize reward (or minimize losses) in C, and then performs these same actions in the real world it is in.
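The link-cutting step in the LCDT definition quoted above can be sketched as graph surgery on a real world model to obtain a planning world model. This is a minimal sketch under stated assumptions: the graph is a plain adjacency dict, and the node names and the helper names `descendants` and `lcdt_cut` are hypothetical, chosen only for this illustration:

```python
def descendants(edges, start):
    """All nodes reachable from `start` by following directed edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for child in edges.get(node, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def lcdt_cut(edges, agent_decision, decision_nodes):
    """Cut the last link of every path from the agent's decision node
    to any other decision node; all other links are kept."""
    influenced = descendants(edges, agent_decision) | {agent_decision}
    return {
        parent: [
            child for child in children
            if not (parent in influenced
                    and child in decision_nodes
                    and child != agent_decision)
        ]
        for parent, children in edges.items()
    }

# Assumed toy real world model: agent action A changes state S, the human
# decision H observes S and an unrelated observation O, reward R depends on S and H.
real_world = {"A": ["S"], "S": ["H", "R"], "O": ["H"], "H": ["R"]}
planning_world = lcdt_cut(real_world, agent_decision="A", decision_nodes={"A", "H"})
```

In this toy graph only the link from S to H is removed: the human decision still has its other input O and still affects the reward, but it no longer depends on anything the agent does.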

Some mathematical detail: in many graphical models of decision making, the nodes that represent the decision(s) made by the agent(s) do not have any incoming arrows. For the LCDT definition above to work, we need a graphical model where the decision-making nodes do have such incoming arrows. Conveniently, CIDs are such models. So we can disambiguate LCDT by saying that B and C are full causal models as defined in the CID framework. Terminology/mathematical details: in the CID definitions here, these full causal models B and C are called SCIMs; in the terminology defined here they are called policy-defining world models whose input parameters are fully known.

Now I identify some ambiguities that are left in the LCDT definition of the post. First, the definition has remained silent on how the initial causal world model B is obtained. It might be by learning, by hand-coding (as in the benchmark examples), or a combination of the two. For an example of a model that is constructed with a combination of hand-coding and machine learning, see the planning world (p) here. There is also significant work in the ML community on using machine learning to construct from scratch full causal models including the nodes and the routing of the arrows themselves, or (more often) full Bayesian networks with nodes and arrows where the authors do not worry too much about any causal interpretation of the arrows. I have not tried this out in any examples, but I believe the LCDT approach might be usefully applied to predictive Bayesian networks too.

Regardless of how B is obtained, we can do some safety analysis on the construction of C out of B.

The two works on CIDs here and here both consider that we can modify agent incentives by removing paths in the CID-based world model that the agent uses for planning its actions. In the terminology of the first paper above, the modifications made by LCDT to produce the model C work to 'remove an instrumental control incentive on a future action'. In the terminology of the second paper, the modifications will 'make the agent indifferent about downstream nodes representing agent actions'. The post above speculates:

LCDT shows a form of indifference (related to indifference corrigibility maybe)

This is not a maybe: the indifference produced is definitely related to indifference corrigibility, the type of indifference-that-causes-corrigibility that the 2015 MIRI/FHI paper titled Corrigibility talks about. For some detailed mathematical work relating the two, see here.

A second ambiguity in LCDT is that it does not tell us how exactly the nodes in B that represent agent decisions are to be identified. If B is a hand-coded model of a game world, identifying these nodes may be easy. If B is a somewhat opaque model produced by machine learning, identifying the nodes may be difficult. In many graphical world models, a single node may represent the state of a huge chunk of the agent environment: say both the vases and conveyor belts in the agent environment and the people in the agent environment. Does this node then become a node that represents agent decisions? We might imagine splitting the node into two nodes (this is often called factoring the state) to separate out the humans.

That being said, even a less-than-perfect identification of these nodes would work to suppress certain deceptive forms of manipulation, so LCDT could be usefully applied even to somewhat opaque learned causal models.

A third ambiguity is in the definition of the operations needed to create a computable causal model C after taking a copy of B and cutting incoming links to the downstream decision nodes:

What do we replace these decision nodes with (as their actual expression does depend on our decision)? We assume that the model has some fixed prior over its own decision, and then we marginalize the cut decision node with this prior, to leave the node with a distribution independent of our decision.

It is ambiguous how to construct this 'fixed prior over its own decision' that we should use to marginalize on. Specifically, is this prior allowed to take into account some or all of the events that preceded the decision to be made? This ambiguity leaves a large degree of freedom in constructing C by modifying B, especially in a setting where the agents involved make multiple decisions over time. This ambiguity is not necessarily a bad thing: we can interpret it as an open (hyper)parameter choice that allows us to create differently tuned versions of C that trade off differently between suppressing manipulation and still achieving a degree of economic decision making effectiveness. On a side note, in a multi-decision setting, drawing a C that encodes marginalization on 10 downstream decisions will generally create a huge diagram: it will add 10 new sub-diagrams feeding input observations into these decisions.
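The marginalization step itself is simple once a prior has been picked. A minimal sketch, assuming the simplest possible choice of a fixed prior that ignores all preceding events; the conditional table `human_given_agent`, the prior, and the function name `marginalize` are hypothetical values and names made up for this illustration:

```python
from fractions import Fraction as F

def marginalize(cpt, prior):
    """cpt[a][h] = P(h | agent action a); prior[a] = fixed prior P(a).

    Returns P(h), a distribution that no longer depends on the agent action.
    """
    marginal = {}
    for action, p_action in prior.items():
        for human_move, p_move in cpt[action].items():
            marginal[human_move] = marginal.get(human_move, 0) + p_action * p_move
    return marginal

# Assumed reaction of a human decision node to the agent's action.
human_given_agent = {
    "say_nice": {"calm": F(9, 10), "angry": F(1, 10)},
    "burn_kite": {"calm": F(0), "angry": F(1)},
}
prior = {"say_nice": F(1, 2), "burn_kite": F(1, 2)}

human_in_planning_world = marginalize(human_given_agent, prior)
```

A prior that is allowed to depend on earlier events would replace the single `prior` dict with one prior per history, which is where the freedom discussed above comes in.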

LCDT also considers agent self-modification. However, given the way these self-modification decisions are drawn, I cannot easily see how these would generalize to a multi-decision situation where the agent makes several decisions over time. Representations of self-modification in a multi-decision CID framework usually require that one draws a lot of extra nodes, see e.g. this paper. As this comment is long already, I omit the topic of how to map multi-action self-modification to unambiguous math. My safety analysis below is therefore limited to the case of the LCDT agent manipulating other agents, not the agent manipulating itself.

Some safety analysis

LCDT obviously removes some agent incentives, incentives to control the future decisions made by human agents in the agent environment. This is nice because one method of control is deception, so it suppresses deception. However, I do not believe LCDT removes all incentives to deceive in the general case.

As I explain in this example and in more detail in sections 9.2 and 11.5.2 here, the use of a counterfactual planning world model for decision making may remove some incentives for deception, compared to using a fully correct world model, but the planning world may still retain some game-theoretical mechanics that make deception part of an optimal planning world strategy. So we have to consider the value of deception in the planning world.

I'll now do this for a particular toy example: the decision making problem of a soccer playing agent that tries to score a goal, with a human goalkeeper trying to block the goal. I simplify this toy world by looking at one particular case only: the case where the agent is close to the goal, and must decide whether to kick the ball in the left or right corner. As the agent is close, the human goalkeeper will have to decide to run to the left corner or right corner of the goal even before the agent takes the shot: the goalkeeper does not have enough time to first observe where the ball is going and only then start moving. So this toy world decision problem has the agent deciding on kicking left or right, and the goalkeeper simultaneously deciding on running left or right.

[Edited to add: as discussed in the comments below, the discussion about marginalisation that follows is somewhat wrong/confusing. It fails to mention that if we construct B exactly as described above, there is no causal link from the agent action to the goalkeeper action, so LCDT would construct a C that is exactly the same model as B, and the question of what prior to marginalise on does not even come up. To make that question come up, we need to model an iterative game where the goalkeeper remembers (learns from) past moves by the agent.]

Say that the agent is mechanically stronger at kicking left, so that a ball kicked left, when the goalkeeper also moves left, has a lower probability of being intercepted by the goalkeeper than in the right+right alternative. In that case, the most reasonable prior over the agent action will model an agent kicking left most of the time. Now, if we use this prior to marginalize the expectations of the human goalkeeper in the planning world, the planning world goalkeeper will expect the agent to kick the ball left most of the time, so they are more likely to move left.

Now observe that in the LCDT planning world model constructed by marginalization, this knowledge of the goalkeeper is a known parameter of the ball kicking optimization problem that the agent must solve. If we set the outcome probabilities right, the game theoretical outcome will be that the optimal policy is for the agent to kick right, so it plays the opposite of the move that the goalkeeper expects. I'd argue that this is a form of deception, a deceptive scenario that LCDT is trying to prevent.

A safer decision theory would marginalize the goalkeeper expectations with a random prior over agent actions, thereby removing the game-theoretical benefit of the agent doing something unexpected. If the goalkeeper knows the agent is using this safer decision theory, they can always run left.
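A minimal numeric sketch of this argument. All numbers are assumptions made up for the illustration: `score` holds hypothetical scoring probabilities, and the goalkeeper move probabilities 0.8 and 0.5 are hypothetical stand-ins for what marginalizing on an informed prior and on a random prior might produce in the planning world:

```python
# score[a][k]: assumed probability of scoring for kick a and goalkeeper move k.
score = {"L": {"L": 0.4, "R": 1.0}, "R": {"L": 1.0, "R": 0.2}}

def best_kick(p_keeper_left):
    """Best single kick against a goalkeeper whose move distribution is fixed."""
    value = {
        kick: p_keeper_left * score[kick]["L"]
        + (1 - p_keeper_left) * score[kick]["R"]
        for kick in ("L", "R")
    }
    return max(value, key=value.get), value

# Planning world goalkeeper who expects a left kick and mostly runs left.
against_informed, _ = best_kick(0.8)
# Planning world goalkeeper with no useful expectation about the agent.
against_uniform, _ = best_kick(0.5)
```

Under these assumptions the agent kicks right against the informed goalkeeper, playing the unexpected move, and kicks its stronger left side against the uninformed one.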

Now, I must admit that I associate the word 'deception' mostly with multi-step policies that aim to implant incorrect knowledge into the opposite party's world model first, and then exploit that incorrect knowledge in later steps. The above example does only one of these things. So maybe others would deconfuse (define) the term 'deception' differently in a single-action setting, so that the above example does not in fact count as deception.


The post above does not benchmark LCDT on Newcomb’s Problem, which I feel is a welcome change, compared to many other decision theory posts on this forum. Still, I feel that there is somewhat of a gap in the benchmarking coverage provided by the post above, as 'mainstream' ML agent designs are usually benchmarked in MDP or RL problem settings, that is on multi-step decision making problems where the objective is to maximize a time discounted sum of rewards. (Some of the benchmarks in the post above can be mapped to MDP problems in toy worlds, but they would be somewhat unusual MDP toy worlds.)

A first obvious MDP-type benchmark would be an RL setting where the reward signal is provided directly by a human agent in the environment. When we apply LCDT in this context, it makes the LCDT agent totally indifferent to influencing the human-generated reward signal: any random policy will perform equally well in the planning world C. So the LCDT agent becomes totally non-responsive to its reward signal, and non-competitive as a tool to achieve economic goals.
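A minimal sketch of this indifference, assuming a single-step problem where the human's reward decision is the only downstream node. The action names, probabilities, and function names are hypothetical, chosen only for this illustration:

```python
from fractions import Fraction as F

# Real world model: assumed probability that the human gives reward 1
# after each agent action.
p_reward = {"do_task": F(9, 10), "do_nothing": F(1, 10)}

# Assumed fixed prior over agent actions used for marginalization.
prior = {"do_task": F(1, 2), "do_nothing": F(1, 2)}

def real_value(action):
    # Expected reward in the real world model.
    return p_reward[action]

def planning_value(action):
    # In the planning world the link from the agent action to the human's
    # reward decision is cut, so the reward is marginalized over the prior
    # and the `action` argument has no effect on the result.
    return sum(prior[a] * p_reward[a] for a in prior)
```

Here `planning_value` returns the same number for every action, so an argmax over actions in the planning world has nothing to prefer.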

In a second obvious MDP-type benchmark, the reward signal is provided by a sensor in the environment, or by some software that reads and processes sensor signals. If we model this sensor and this software as not being agents themselves, then LCDT may perform very well. Specifically, if there are innocent human bystanders too in the agent environment, bystanders who are modeled as agents, then we can expect that the incentive of the agent to control or deceive these human bystanders into helping it achieve its goals is suppressed. This is because under LCDT, the agent will lose some, potentially all, of its ability to correctly anticipate the consequences of its own actions on the actions of these innocent human bystanders.

Other remarks

There is an interesting link between LCDT and counterfactual oracles: whereas LCDT breaks the last link in any causal chain that influences human decisions, counterfactual oracle designs can be said to break the first link. See e.g. section 13 here for example causal diagrams.

When applying an LCDT-like approach to construct a C from a causal model B, it may sometimes be easier to keep the incoming links to nodes in B that model future agent decisions intact, and instead cut the outgoing links. This would mean replacing these nodes in C with fresh nodes that generate probability distributions over future actions taken by the future agent(s). These fresh nodes could potentially use node values that occurred earlier in time than the agent action(s) as inputs, to create better predictions. When I picture this approach visually as editing a causal graph B into a C, the approach is easier to visualize than the approach of marginalizing on a prior.

To conclude, my feeling is that LCDT can definitely be used as a safety mechanism, as an element of an agent design that suppresses deceptive policies. But it is definitely not a perfect safety tool that will offer perfect suppression of deception in all possible game-theoretical situations. When it comes to suppressing deception, I feel that time-limited myopia and the use of very high time discount factors are equally useful but imperfect tools.

Research agenda update

I know all about that kind of to-do list.

Definitely my sequence of 6 months ago is not about doing counterfactual planning by modifying somewhat opaque million-node causal networks that might be generated by machine learning. The main idea is to show planning world model modifications that you can apply even when you have no way of decoding opaque machine-learned functions.

Research agenda update

Just FYI, for me personally this [from scratch] presumption comes from my trying to understand human brain algorithms.

Thanks for clarifying. I see how you might apply a 'from scratch' assumption to the neocortex. On the other hand, if the problem is to include both learned and hard-coded parts in a world model, one might take inspiration from things like the visual cortex, from the observation that while initial weights in the visual cortex neurons may be random (not sure if this is biologically true though), the broad neural wiring has been hardcoded by evolution. In AI terminology, this wiring represents a hardcoded prior, or (if you want to take the stance that you are learning without a prior) a hyperparameter.

So, the AI pioneers wrote into their source code whether each door in the building is open or closed? And if a door is closed when the programmer expected it to be open, then the robot would just slam right into the closed door?? That doesn't seem like something to be proud of! Or am I misunderstanding you?

The robots I am talking about were usually not completely blind, but they had very limited sensing capabilities. The point about hardcoding here is that the processing steps which turned sensor signals into world model details were often hardcoded. Other necessary world model details for which no sensors were available would have to be hardcoded as well.

If I understand you correctly, you're assuming that the programmer will manually set up a giant enormous Bayesian network that represents everything in the world

I do not think you understand me correctly.

You are assuming I am talking about handcoding giant networks where each individual node might encode a single basic concept like a dowsing rod, and then ML may even add more nodes dynamically. This is not at all what the example networks I linked to look like, and not at all how ML works on them.

Look, I included this link to the sequence to clarify exactly what I mean: please click the link and take a look. The planning world causal graphs you see there are not world models for toy agents in toy worlds, they are plausible AGI agent world models. A single node typically represents a truly giant chunk of current or future world state. The learned details of a complex world are all inside the learned structural functions, in what I call the model parameter in the sequence.

The linked-to approach is not the only way to combine learned and hardcoded model parts, but I think it shows a very useful technique. My more general point is also that there are a lot of not-in-fashion historical examples that may offer further inspiration.

Research agenda update

Here are some remarks for anybody who wants to investigate the problem of Learned world-models with hardcoded pieces -- hope they will be useful.

My main message is that when thinking about this problem, you should be very aware that there are fashions in AI research. The current fashion is all about ML, about creating learned world models. In the most extreme expression of this fashion, represented by the essay the bitter lesson, even the act of hand-coding some useful prior for the learned world model is viewed with suspicion. It is viewed as domain-specific or benchmark-specific tweaking that will not teach us anything about making the next big ML breakthrough.

Fashion used to be different: there were times when AI pioneers built robots with fully hardcoded world models and were very proud of it.

Hardcoding parts of the world model never went out of fashion in the applied AI and cyber-physical systems community, e.g. with people who build actual industrial robots, and people who want to build safe self-driving cars.

Now, you are saying that 'My default presumption is that our AGIs will learn a world-model from scratch', i.e. learn their full world model from scratch. In this, you are following the prevailing fashion in theoretical (as opposed to applied) ML. But if you follow that fashion it will blind you to a whole class of important solutions for building learned world models with hardcoded pieces.

It is very easy, an almost routine software engineering task, to build predictive world models that combine both hardcoded and learned pieces. One example of building such a model is to implement it as a Bayesian network or a Causal graph. The key thing to note here is that each single probability distribution/table for each graph node (each structural function in case of a causal graph) might be produced either by machine learning from a training set, or simply be hard-coded by the programmer. See my sequence counterfactual planning for some examples of the design freedom this creates in adding safety features to an AGI agent's world model.
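A minimal sketch of such a mixed-mode model, assuming a two-link chain (time of day, door state, sensor reading) where one table is learned by counting and the other is written by hand. The node names, the training data, and the helper names `learn_table` and `predict_sensor` are hypothetical, made up for this illustration:

```python
from collections import Counter

def learn_table(samples):
    """Learn P(child | parent) from (parent, child) pairs by counting."""
    counts = {}
    for parent, child in samples:
        counts.setdefault(parent, Counter())[child] += 1
    return {
        parent: {child: n / sum(cs.values()) for child, n in cs.items()}
        for parent, cs in counts.items()
    }

# Learned piece: how often the door is open, given the time of day.
training_set = [("day", "open"), ("day", "open"), ("day", "closed"), ("night", "closed")]
learned_door = learn_table(training_set)

# Hardcoded piece: the programmer asserts that the door sensor is reliable.
hardcoded_sensor = {"open": {"reads_open": 1.0}, "closed": {"reads_closed": 1.0}}

# The world model simply holds one table per node, whatever its origin.
model = {"door | time": learned_door, "sensor | door": hardcoded_sensor}

def predict_sensor(time):
    """P(sensor reading | time), obtained by summing over the door state."""
    out = {}
    for door, p_door in model["door | time"][time].items():
        for reading, p_reading in model["sensor | door"][door].items():
            out[reading] = out.get(reading, 0.0) + p_door * p_reading
    return out
```

The prediction code does not care which tables were learned and which were hardcoded, which is what makes swapping in a hand-written safety-relevant table a routine change.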

Good luck with your further research! Feel free to reach out if you want to discuss this problem of mixed-mode model construction further.

Finite Factored Sets

Some general comments:

Overcoming blindness

You mention above that Pearl's ontology 'has blinded us to the obvious next question'. I am very sympathetic to research programmes that try to overcome such blindness; this is the kind of research I have been doing myself recently. The main type of blindness that I have been trying to combat is blindness to complex types of self-referencing and indirect representation that can be present inside online machine learning agents. Specifically, in my recent work I have added a less blind viewpoint by modifying and extending Pearl's causal graphs, so that you end up with a two-causal-diagram model of agency and machine learning. These extensions may be of interest to you, especially in relation to problems of embeddedness, but the main point I want to make here is a methodological one.

What I found, somewhat to my surprise, is that I did not need to develop the full mathematical equivalent of all of Pearl's machinery, in order to shed more light on the problems I wanted to investigate. For example, the idea of d-separation is very fundamental to the type of thing that Pearl does with causal graphs, fundamental to clarifying problems of experimental design and interpretation in medical experiments. But I found that this concept was irrelevant to my aims. Above, you have a table of how concepts like d-separation map to the mathematics developed in your talk. My methodological suggestion here is that you probably do not want to focus on defining mathematical equivalents for all of Pearl's machinery, instead it will be a sign of de-blinding progress if you define new stuff that is largely orthogonal.

While I have been looking at blindness to problems of indirection, your part two subtitle suggests you are looking at blindness with respect to the problem of 'time' instead. However, my general feeling is that you are addressing another type of blindness, both in this talk and in 'Cartesian frames'. You are working to shed more light on the process that creates a causal model, be it a Pearlian or semi-Pearlian model: the process that generates the nodes and the arrows/relations between these nodes.

The mechanical generation of correct (or at least performant) causal models from observational data is a whole (emerging?) subfield of ML, I believe. I have not read much of the literature in this field, but here is one recent paper that may serve as an entry point.

How I can interpret factoring graphically

Part of your approach is to convert Pearl's partly graphical math into a different, non-graphical formalism you are more comfortable with. That being said, I will now construct a graphical analogy to the operation of factoring you define.

You define factoring as taking a set $S$ and creating a set of factors (sets) $B_1, \dots, B_n$, such that (in my words) every $s \in S$ can be mapped to an equivalent tuple $(b_1, \dots, b_n)$, where $b_1 \in B_1$, etc.

Graphically, I can depict $S$ as a causal graph with just a single node, a node representing a random variable that takes values in $S$. The factoring would be an n-node graph where each node represents a random variable taking values from one factor $B_i$. So I can imagine factorization as an operation that splits a single graph node $S$ into many nodes $B_1, \dots, B_n$.

In terms of mainstream practice in experimental design, this splitting operation replaces a single observable with several sub-observables. Where you depart from normal practice is that you require the splitting operation to create a full bijection: this kind of constraint is much more loosely applied in normal practice. It feels to me you are after some kind of no-loss-of-information criterion in defining partitioning as you do -- the criterion you apply seems to be unnecessarily strict however, though it does create a fun mathematical sequence.
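The difference between a strict, bijective split and a looser one can be sketched in a few lines. The sketch below follows my informal 'in my words' reading rather than the partition-based definition in the talk: each factor is represented by a hypothetical function from world states to sub-observable values, and the helper `is_factorisation` (a name made up here) checks whether the resulting tuples correspond one-to-one with the full Cartesian product of the factor values.

```python
from itertools import product

def is_factorisation(S, factor_maps):
    """True if s -> (f_1(s), ..., f_n(s)) is a bijection from S onto the
    Cartesian product of the value sets of the factor maps."""
    tuples = [tuple(f(s) for f in factor_maps) for s in S]
    value_sets = [{f(s) for s in S} for f in factor_maps]
    full_product = set(product(*value_sets))
    injective = len(set(tuples)) == len(tuples)   # no two states share a tuple
    surjective = set(tuples) == full_product      # every combination occurs
    return injective and surjective

# Toy world with six states, labelled 0..5.
S = range(6)

# Strict split: (s mod 2, s mod 3) identifies each state uniquely.
strict_split = [lambda s: s % 2, lambda s: s % 3]

# Looser split: (s mod 2, s // 3) maps several states to the same tuple,
# so descriptive information about s is lost.
lossy_split = [lambda s: s % 2, lambda s: s // 3]

print(is_factorisation(S, strict_split))  # True
print(is_factorisation(S, lossy_split))   # False
```

The lossy split is still a perfectly usable set of sub-observables in the sense of normal experimental practice; it just fails the no-loss-of-information test.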

In any case, if a single node $S$ splits into nodes $B_1, \dots, B_n$, we can wonder how we should picture the arrows between these nodes that need to be drawn in after the split. It seems to me that this is a key question you are trying to answer: how does the split create arrows, or other relations that are almost but not entirely like Pearl's causal arrows? My own visual picture here is that, in the most general case, the split creates a fully connected directed graph: each node $B_i$ has an arrow to every other node $B_j$. This would be a model representation that is compatible with the theory that all observables represented by the nodes are dependent on each other. Then, we might transform this fully connected graph into a DAG, a DAG that is still compatible with observed statistical relations, by deleting certain arrows, and potentially by adding unobserved nodes with emerging arrows. (Trivial example: drawing an arrow $B_1 \to B_2$ is equivalent to stating a theory that maybe $B_2$ is not statistically independent of $B_1$. If I can disprove that theory, I can remove the arrow.)

This transformation process typically allows for many different candidate DAGs to be created which are all compatible with observational data. Pearl also teaches that we may design and run experiments with causal interventions in order to generate more observational data which can eliminate many of these candidate DAGs.
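The arrow-deletion step can be illustrated with a deliberately crude sketch. It starts from a fully connected directed graph and removes every arrow between two nodes whose empirical joint distribution matches the product of their marginals. This only tests pairwise (unconditional) independence; it does not orient arrows, test conditional independence, or add unobserved nodes, so it is not a real causal discovery algorithm. The function names, the tolerance value, and the toy data are all assumptions made for the example.

```python
from collections import Counter
from itertools import permutations

def dependence(samples, i, j):
    """Largest gap between the empirical joint P(x_i, x_j) and the
    product of the empirical marginals P(x_i) * P(x_j)."""
    n = len(samples)
    joint = Counter((s[i], s[j]) for s in samples)
    left = Counter(s[i] for s in samples)
    right = Counter(s[j] for s in samples)
    return max(abs(joint[(x, y)] / n - (left[x] / n) * (right[y] / n))
               for x in left for y in right)

def prune_arrows(samples, names, tolerance=0.01):
    """Start fully connected, keep only arrows between dependent nodes."""
    arrows = set(permutations(range(len(names)), 2))  # every ordered pair
    kept = {(i, j) for (i, j) in arrows
            if dependence(samples, i, j) > tolerance}
    return {(names[i], names[j]) for (i, j) in kept}

# Toy observations of three sub-observables (A, B, C):
# A and B vary independently, and C is always a copy of A.
samples = [(a, b, a) for a in (0, 1) for b in (0, 1)] * 25

print(sorted(prune_arrows(samples, ["A", "B", "C"])))
```

On this data only the arrows between A and C survive, in both directions. Both directions remain because observational data alone cannot tell the two candidate DAGs apart, which is where interventions come in.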

Finite Factored Sets

My thoughts on naming this finite factored sets: I agree with Paul's observation that

| Factorization seems analogous to describing a world as a set of variables

By calling this 'finite factored sets', you are emphasizing the process of coming up with individual random variables, the variables that end up being the (names of the) nodes in a causal graph. With $s \in S$ representing the entire observable 4D history of a world (like a computation starting from a single game of life board state), a factorisation splits such an $s$ into a tuple of separate, more basic observables $(b_1, \dots, b_n)$, where $b_1 \in B_1$, etc. In the normal narrative that explains Pearl causal graphs, this splitting of the world into smaller observables is not emphasized. Also, the splitting does not necessarily need to be a bijection. It may lose descriptive information with respect to $s$.

So I see the naming finite factored sets as a way to draw attention to this splitting step: it draws attention to the fact that if you split things differently, you may end up with very different causal graphs. This of course leaves open the question of whether you really want to name your framework in a way that draws attention to this part of the process. You definitely spend a lot of time on creating an equivalent to the arrows between the nodes too.
