AI ALIGNMENT FORUM
AF

Finite Factored Sets: Introduction and Factorizations — AI Alignment Forum

This is a longer, more mathematically dense introduction to "Finite Factored Sets." For a shorter introduction, see my Topos talk/transcript.

Abstract: We propose a new approach to temporal inference, inspired by the Pearlian causal inference paradigm—though quite different from Pearl's approach formally. Rather than using directed acyclic graphs, we make use of factored sets, which are sets expressed as Cartesian products. We show that finite factored sets are powerful tools for inferring temporal relations. We introduce an analog of d-separation for factored sets, conditional orthogonality, and we demonstrate that this notion is equivalent to conditional independence in all probability distributions on a finite factored set.

1. Introduction

1.1. Pearlian Causal Inference

Judea Pearl's theory of inferred causation (e.g., as presented in chapter 2 of Causality: Models, Reasoning, and Inference) was a deep advance in our understanding of the nature of time. The Pearlian paradigm allows us to infer causal relationships between variables using statistical data, and thereby infer temporal sequence—in defiance of the old adage that correlation does not imply causation.

In particular, given a collection of variables and a joint probability distribution over those variables, the Pearlian paradigm can often infer temporal relationships between the variables.

The joint probability distribution is usually what gets emphasized in discussions of Pearl's approach. Quite a bit of work is being done, however, by the assumption that we are handed "a collection of variables" to reason about. The Pearlian paradigm is not inferring temporal relationships from purely statistical data, but rather inferring temporal relationships from statistical data together with data about how to factorize the world into variables.[1]

A doctor who misdiagnoses their patient or misidentifies a symptom may base their subsequent reasoning on a wrong factorization of the situation into causally relevant variables. We would ideally like to build fewer assumptions like this into our model of inference, and instead allow the reasoner to figure such facts out, consider the merits of different factorizations into variables, etc.

Instead of beginning with a collection of variables and a joint probability distribution over those variables, one could imagine starting with just a finite sample space and a probability distribution on that sample space. In this way, we might hope to do temporal inference purely using statistical data, without relying on a priori knowledge of a canonical way of factoring the situation into variables.

How might one do temporal inference without an existing factorization? One way might be to just consider all possible variables that can be defined on the sample space. This gives us one variable for each partition of the set.

However, when one tries to apply Pearl's methods to this collection of variables, one quickly runs into a problem: many of the variables definable on a fixed set are deterministic functions of each other. The Pearlian paradigm, as presented in the early chapters of Causality, lacks tools for performing temporal inference on variables that are highly deterministically related.[2]

We will introduce a new approach to temporal inference instead—one which is heavily inspired by the Pearlian paradigm, but approaches the problem with a very different formal apparatus, and does not make use of graphical models.

1.2. Overview

We'll begin by introducing the concept of a finite factored set, in Section 2. This will be our analogue of the directed acyclic graphs in Pearl's framework.

In Section 3, we will introduce the concepts of time and orthogonality, which can be read off of a finite factored set. In Pearl's framework, "time" corresponds to directed paths between nodes, and "orthogonality" corresponds to nodes that have no common ancestor.

In Section 4, we will introduce conditional orthogonality, which is our analogue of d-separation. We show that conditional orthogonality satisfies (a modified version of) the compositional semigraphoid axioms. We then (in Section 5) prove the fundamental theorem of finite factored sets, which states that conditional orthogonality is equivalent to conditional independence in all probability distributions on the finite factored set.

In Section 6, we discuss how to do temporal inference using finite factored sets, and give two examples. Finally, in Section 7 we discuss applications and future work, with an emphasis on temporal and conceptual inference, generalizing finite factored sets to the infinite case, and applications to embedded agency.

And here, we take our leave of Pearl. We've highlighted this approach's relationship to the Pearlian paradigm in order to motivate finite factored sets and explain how we'll be using them in this sequence. Formally, however, our approach is quite unlike Pearl's, and the rest of the sequence will stand alone.

2. Factorizations

Before giving a definition of finite factored sets, we will recall the definition of a partition, and give some basic notation related to partitions.

We do this for two reasons. First, we will use partitions in the definition of a factored set; and second, we want to draw attention to a duality between the notion of a partition, and the notion of a factorization.

2.1. Partitions

We begin with a definition of disjoint union.

Definition 1 (disjoint union). Given a set of sets, let $⊔ (S)$ denote the set of all ordered pairs $(T, t)$ , where $T \in S$ and $t \in T$ .[3]

Definition 2 (partition). A partition of a set $S$ is a set $X \subseteq P (S)$ of nonempty subsets of $S$ such that the function $ι : ⊔ (X) \to S$ given by $ι (x, s) = s$ is a bijection.[4]

Let $P a r t (S)$ denote the set of all partitions of $S$ . The elements of a partition are called parts.

An equivalent definition of partition is often given: a partition is a set $A$ of nonempty subsets of $S$ that are pairwise disjoint and union to $S$ . We choose the above definition because it will make the symmetry between partitions and factorizations more obvious.

Definition 3 (trivial partition). A partition $X$ of a set $S$ is called trivial if $| X | = 1$ .

Definition 4. Given a partition $X$ of a set $S$ , and an element $s \in S$ , let $[s]_{X}$ denote the unique $x \in X$ such that $s \in x$ .

Definition 5. Given a partition $X$ of a set $S$ , and elements $s_{0}, s_{1} \in S$ , we say $s_{0} \sim_{X} s_{1}$ if $[s_{0}]_{X} = [s_{1}]_{X}$ .

Proposition 1. Given a partition $X$ of a set $S$ , $\sim_{X}$ is an equivalence relation on $S$ .

Proof. Trivial. $□$

Definition 6 (finer and coarser). We say that a partition $X$ of $S$ is finer than another partition $Y$ of $S$ , if for all $s_{0}, s_{1} \in S$ , if $s_{0} \sim_{X} s_{1}$ , then $s_{0} \sim_{Y} s_{1}$ .

If $X$ is finer than $Y$ , we also say $Y$ is coarser than $X$ , and we write $X \geq_{S} Y$ and $Y \leq_{S} X$ .

Definition 7 (discrete and indiscrete partitions). Given a set $S$ , let ${Dis}_{S} = {{s} ∣ s \in S}$ .

If $S$ is empty, let ${Ind}_{S} = {}$ , and if $S$ is nonempty, let ${Ind}_{S} = {S}$ .

${Dis}_{S}$ is called the discrete partition, and ${Ind}_{S}$ is called the indiscrete partition.

Proposition 2. For any set $S$ , $\geq_{S}$ is a partial order on $Part (S)$ . Further, for all $X \in P a r t (S)$ , ${Dis}_{S} \geq_{S} X$ and $X \geq_{S} {Ind}_{S}$ .

Proof. Trivial. $□$

While both notations are sometimes used, it is more standard to draw the symbol in the opposite direction and have $X \leq Y$ when $X$ is finer than $Y$ . We choose to go against that standard because we want to think of partitions in part as the ability to distinguish between elements, and finer partitions correspond to greater ability to distinguish.[5]

Definition 8 (common refinement). Given a set $C$ of partitions of a fixed set $S$ , let $⋁_{S} (C)$ denote the partition $X \in P a r t (S)$ satisfying $s_{0} \sim_{X} s_{1}$ if and only if $s_{0} \sim_{c} s_{1}$ for all $c \in C$ . Given $X, Y \in P a r t (S)$ , we let $X \lor_{S} Y = ⋁_{S} ({X, Y})$ .

2.2. Factorizations

We start with a definition of Cartesian product.

Definition 9 (Cartesian product). Given a set $S$ of sets, let $⊓ (S)$ denote the set of all functions $f : S \to ⊔ (S)$ such that for all $T \in S$ , $f (T)$ is of the form $(T, t)$ , for some $t \in T$ .

We can now give the definition of a factorization of a set.

Definition 10 (factorization). A factorization of a set $S$ is a set $B \subseteq P a r t (S)$ of nontrivial partitions of $S$ such that the function $π : S \to ⊓ (B)$ , given by $π (s) = (b \mapsto (b, [s]_{b}))$ , is a bijection.

Let $Fact (S)$ denote the set of all factorizations of $S$ . The elements of a factorization are called factors.

In other words, a set of nontrivial partitions is a factorization of $S$ if for each way of choosing one part from each factor, there exists a unique element of $S$ in the intersection of those parts.

Notice the duality between the definitions of partition and factorization. We replace subsets with partitions, nonempty with nontrivial, and disjoint union with Cartesian product, and we reverse the direction of the function. We can think of a factorization of $S$ as a way to view $S$ as a product, in the same way that a partition was a way to view $S$ as a disjoint union.

A factored set is just a set together with a factorization of that set.

Definition 11 (factored set). A factored set $F$ is an ordered pair $(S, B)$ , such that $B$ is a factorization of $S$ .

If $F = (S, B)$ is a factored set, we let $set (F) = S$ , and let $basis (F) = B$ .

An important fact about factored sets is that the factors are enough to distinguish distinct elements.

Proposition 3. Given a factored set $F = (S, B)$ , and elements $s_{0}, s_{1} \in S$ , if $s_{0} \sim_{b} s_{1}$ for all $b \in B$ , then $s_{0} = s_{1}$ .

Proof. Let $F = (S, B)$ be a finite factored set, and let $s_{0}, s_{1} \in S$ satisfy $s_{0} \sim_{b} s_{1}$ for all $b \in B$ .

Let $π : S \to ⊓ (B)$ be given by $π (s) = (b \mapsto (b, [s]_{b}))$ , as in the definition of factorization. Then $π (s_{0}) = (b \mapsto (b, [s_{0}]_{b})) = (b \mapsto (b, [s_{1}]_{b})) = π (s_{1})$ . Since $π$ is bijective, this means $s_{0} = s_{1}$ . $□$

2.3. Chimera Functions

The following theorem can be viewed as an alternate characterization of factorization. We will use this alternate characterization to define chimera functions, which will be useful tools for manipulating elements of factored sets.

Theorem 1. Given a set $S$ , a set $B$ of nontrivial partitions of $S$ is a factorization of $S$ if and only if for every function $g : B \to S$ , there exists a unique $s \in S$ such that for all $b \in B$ , $s \sim_{b} g (b)$ .

Proof. First, we let $B$ be a factorization of $S$ , and let $g : B \to S$ be any function. We want to show that there exists a unique $s \in S$ such that for all $b \in B$ , $s \sim_{b} g (b)$ . Let $π : S \to ⊓ (B)$ be given by $π (s) = (b \mapsto (b, [s]_{b}))$ , as in the definition of factorization. Note that $π$ is bijective, and thus has an inverse.

Let $s = π^{- 1} (b \mapsto (b, [g (b)]_{b}))$ . Observe that this is well-defined, because $(b \mapsto (b, [g (b)]_{b}))$ is in fact in $⊓ (B)$ . We will show that $s \sim_{b} g (b)$ for all $b \in B$ , and the uniqueness of this $s$ will then follow directly from Proposition 3.

We have $π (s) = (b \mapsto [s]_{b})$ by the definition of $π$ . However, we also have $π (s) = (b \mapsto [g (b)]_{b})$ by the definition of $s$ . Thus, $b \mapsto [s]_{b}$ and $b \mapsto [g (b)]_{b}$ are the same function, so $[s]_{b} = [g (b)]_{b}$ for all $b \in B$ , so $s \sim_{b} g (b)$ for all $b \in B$ .

Conversely, let $S$ be any set, and let $B$ be any set of nontrivial partitions of $S$ . Assume that for all $g : B \to S$ , there exists a unique $s \in S$ satisfying $s \sim_{b} g (b)$ for $b \in B$ . Again, let $π : S \to ⊓ (B)$ be given by $π (s) = (b \mapsto (b, [s]_{b}))$ , as in the definition of factorization. We want to show that $π$ is invertible.

First, we show that $π$ is injective. Take an arbitrary $s_{0} \in S$ , and let $g : B \to S$ be the constant function satisfying $g (b) = s_{0}$ for all $b \in B$ . Given another $s_{1} \in S$ , if $π (s_{0}) = π (s_{1})$ , then $(b \mapsto [s_{0}]_{b}) = (b \mapsto [s_{1}]_{b})$ , so $[s_{1}]_{b} = [s_{0}]_{b} = [g (b)]_{b}$ for all $b \in B$ , so $s_{0} \sim_{b} s_{1} \sim_{b} g (b)$ for all $b \in B$ . Since there is a unique $s \in S$ satisfying $s \sim_{b} g (b)$ for all $b \in B$ , this means $s_{0} = s_{1}$ . Thus $π$ is injective.

To see that $π$ is surjective, consider some arbitrary $h \in ⊓ (B)$ . We want to show that there exists an $s \in S$ with $h = π (s)$ .

For all $b \in B$ , let $H_{b} \in b$ be given by $h (b) = (b, H_{b})$ , which is well-defined since $h \in ⊓ (B)$ . Note that $H_{b}$ is a nonempty subset of $S$ , so there exists a function $g : B \to S$ with $g (b) \in H_{b}$ for all $b \in B$ . Fix any such $g$ , and let $s$ satisfy $s \sim_{b} g (b)$ for all $b \in B$ .

We thus have that for all $b \in B$ , $h (b) = (b, H_{b}) = (b, [g (b)]_{b}) = (b, [s]_{b}) = π (s) (b)$ , so $h = π (s)$ . Thus $π$ is surjective.

Since $π$ is bijective, we have that $B$ is a factorization of $S$ . $□$

This also gives us that factors are disjoint from each other.

Corollary 1. Given a factored set $F = (S, B)$ and distinct factors $b_{0}, b_{1} \in B$ , $b_{0} \cap b_{1} = {}$ .

Proof. Assume by way of contradiction that $T \in b_{0} \cap b_{1}$ . Since $b_{0}$ is nontrivial, there must be some other $T^{'} \in b_{0}$ with $T \cap T^{'} = {}$ . Let $g : B \to S$ be any function such that $g (b_{0}) \in T^{'}$ and $g (b_{1}) \in T$ . Then there can be no $s$ such that $s \sim_{b_{0}} g (b_{0})$ and $s \sim_{b_{1}} g (b_{1})$ , since then $s$ would be in both $T$ and $T^{'}$ . This contradicts Theorem 1. $□$

We are now ready to define the chimera function of a factored set.

Definition 12 (chimera function). Given a factored set $F = (S, B)$ , the chimera function (of $F$ ) is the function $χ^{F} : (B \to S) \to S$ defined by $χ^{F} (g) \sim_{b} g (b)$ for all $g : B \to S$ and $b \in B$ .

The name "chimera function" comes from the fact that $χ^{F}$ can be viewed as building an element of $S$ by fusing together the properties of various different elements. Since we will often apply the chimera function to functions $g$ that only take on two values, we will give notation for this special case.

Definition 13. Given a factored set $F = (S, B)$ , and a subset $C \subseteq B$ , let $χ_{C}^{F} : S \times S \to S$ be given by $χ_{C}^{F} (s, t) = χ^{F} (g)$ , where $g : B \to S$ is given by $g (b) = s$ if $b \in C$ , and $g (b) = t$ otherwise.

For $T, R \subseteq S$ , we will write $χ_{C}^{F} (T, R)$ for ${χ_{C}^{F} (t, r) ∣ t \in T, r \in R}$ .

The following is a list of properties of $χ_{C}^{F}$ , which will be useful in later proofs. All of these properties follow directly from the definition of $χ_{C}^{F}$ .

Proposition 4. Fix $F = (S, B)$ , a factored set, $C, D \subseteq B$ , and $s, t, r \in S$ .

$χ_{C}^{F} (s, t) \sim_{c} s$ for all $c \in C$ .
$χ_{C}^{F} (s, t) \sim_{b} t$ for all $b \in B ∖ C$ .
$χ_{C}^{F} (s, s) = s$ .
$χ_{B ∖ C}^{F} (s, t) = χ_{C}^{F} (t, s)$ .
$χ_{C \cup D}^{F} (s, t) = χ_{C}^{F} (s, χ_{D}^{F} (s, t))$ .
$χ_{C \cap D}^{F} (s, t) = χ_{C}^{F} (χ_{D}^{F} (s, t), t)$ .
$χ_{C}^{F} (χ_{C}^{F} (s, t), r) = χ_{C}^{F} (s, χ_{C}^{F} (t, r)) = χ_{C}^{F} (s, r)$ .
$χ_{C}^{F} (s, χ_{D}^{F} (t, r)) = χ_{D}^{F} (χ_{C}^{F} (s, t), χ_{C}^{F} (s, r))$ .
$χ_{C}^{F} (χ_{D}^{F} (s, t), r) = χ_{D}^{F} (χ_{C}^{F} (s, r), χ_{C}^{F} (t, r))$ .
$χ_{B}^{F} (s, t) = s$ .
$χ_{{}}^{F} (s, t) = t$ .

Proof. Trivial. $□$

2.4. Trivial Factorizations

We now define a notion of a trivial factorization of a set, and show that every set has a unique trivial factorization.

Definition 14 (trivial factorization). A factorization $B$ of a set $S$ is called trivial if $| B | \leq 1$ . A factored set $(S, B)$ is called trivial if $B$ is trivial.

Proposition 5. For every set $S$ , there exists a unique trivial factorization $B$ of $S$ . If $| S | \neq 1$ , this trivial factorization is given by $B = {{Dis}_{S}}$ , and if $| S | = 1$ , it is given by $B = {}$ .

Proof. We start with the case where $| S | = 0$ . The only partition of $S$ is ${}$ , so we only need to consider the sets of partitions ${{}}$ and ${}$ as potential factorizations. ${{}}$ is vacuously a factorization of $S$ by Theorem 1, since there are no functions from ${{}}$ to $S$ . ${}$ is not a factorization by Theorem 1, since there is a function from ${}$ to $S$ , but there is no element of $S$ . Thus, when $| S | = 0$ , ${{}} = {{Dis}_{S}}$ is the unique trivial factorization of $S$ .

Next, consider the case where $| S | = 1$ . First, observe that the unique $s \in S$ vacuously satisfies $s \sim_{b} g (b)$ for all $g : {} \to S$ and $b \in {}$ , since there is no $b \in {}$ . Thus, by Theorem 1, ${}$ is a factorization of $S$ . Further, ${}$ is the only factorization of $S$ , since there are no nontrivial partitions of $S$ . Thus, when $| S | = 1$ , ${}$ is the unique trivial factorization of $S$ .

Next, we consider the case where $| S | \geq 2$ . Observe that ${Dis}_{S}$ is a nontrivial partition of $S$ . Let $B = {{Dis}_{S}}$ . We want to show that $B$ is a factorization of $S$ . By Theorem 1, it suffices to show that for all $g : B \to S$ , there exists a unique $s \in S$ with $s \sim_{{Dis}_{S}} g ({Dis}_{S})$ . We can take $s = g ({Dis}_{S})$ , which clearly satisfies $s \sim_{{Dis}_{S}} g ({Dis}_{S})$ . This $s$ is unique, since if $s^{'} \sim_{{Dis}_{S}} g ({Dis}_{S})$ , then $s^{'} \in [g ({Dis}_{S})]_{{Dis}_{S}} = [s]_{b} = {s}$ , so $s^{'} = s$ . Thus $B$ is a factorization of $S$ .

On the other hand, if $| S | \geq 2$ , ${}$ is not a factorization of $S$ , since if it were, Proposition 3 would imply that all elements of $S$ are equal. Further, for any partition $b$ of $S$ , with $b \neq {{Dis}_{S}}$ , there must exist $s_{0}, s_{1} \in S$ , with $s_{0} \sim_{b} s_{1}$ , but $s_{0} \neq s_{1}$ . Thus ${b}$ cannot be a factorization of $S$ by Proposition 3. Thus when $| S | \geq 2$ , ${Dis}_{S}$ is the unique trivial factorization of $S$ . $□$

2.5. Finite Factored Sets

This sequence will primarily be about finite factored sets.

Definition 15. If $F = (S, B)$ is a factored set, the size of $F$ , written $size (F)$ , is the cardinality of $S$ . The dimension of $F$ , written $dim (F)$ , is the cardinality of $B$ . $F$ is called finite if its size is finite, and finite-dimensional if its dimension is finite.

We suspect that the theory of infinite factored sets is both interesting and important. However, it is outside of the scope of this sequence, which will require finiteness for many of its key results.

Some of the definitions and results in this sequence will be given for finite factored sets, in spite of the fact that they could easily be extended to finite-dimensional or arbitrary factored sets. This is because they can often be extended in more than one way, and determining which extension is most natural requires further developing the theory of arbitrary factored sets.

Proposition 6. Every finite factored set is also finite-dimensional.

Proof. If $F = (S, B)$ is a factored set, $B$ is a set of sets of subsets of $S$ . Thus, $| B | \leq 2^{2^{| S |}}$ . $□$

This bound is horrible and will be improved in Proposition 9. First, however, we will take a look at the number of factorizations of a fixed finite set.

Proposition 7. Let $F = (S, B)$ be a finite factored set. Then $| S | = \prod_{b \in B} | b |$ .

Proof. Trivial. $□$

Proposition 8. If $| S |$ is equal to $0$ , $1$ , or a prime, the trivial factorization of $S$ is the only factorization of $S$ .

Proof. If $| S | = 0$ or $| S | = 1$ , then $| P a r t (S) | = 1$ , so $B \subseteq P a r t (S)$ can have cardinality at most 1.

If $| S | = p$ , a prime, then by Proposition 7, $| b |$ must divide $p$ for all $b \in B$ . Since factorizations cannot contain trivial partitions, this means $| b | = p$ for all $b \in B$ . However, ${{s} ∣ s \in S}$ is the only element of $P a r t (S)$ of cardinality $p$ , so $| B | \leq 1$ . $□$

On the other hand, in the case where $| S |$ is finite and composite, the number of factorizations of $S$ grows very quickly, as seen in Table 1. Table 1 shows the number of factorizations of a set $S$ with cardinality up to 25:

$\| S \|$	$\| F a c t (S) \|$	$\| S \|$	$\| F a c t (S) \|$
0	1	13	1
1	1	14	8648641
2	1	15	1816214401
3	1	16	181880899201
4	4	17	1
5	1	18	45951781075201
6	61	19	1
7	1	20	3379365788198401
8	1681	21	1689515283456001
9	5041	22	14079294028801
10	15121	23	1
11	1	24	4454857103544668620801
12	13638241	25	538583682060103680001

Given the naturalness of the notion of factorization, we were surprised to discover that this sequence did not exist on the On-Line Encyclopedia of Integer Sequences (OEIS). We added the sequence, A338681, on April 30, 2021.

To give one concrete example, the four factorizations of the set ${0, 1, 2, 3}$ are:

${{{0}, {1}, {2}, {3}}}$ ,
${{{0, 1}, {2, 3}}, {{0, 2}, {1, 3}}}$ ,
${{{0, 1}, {2, 3}}, {{0, 3}, {1, 2}}}$ , and
${{{0, 2}, {1, 3}}, {{0, 3}, {1, 2}}}$ .

Proposition 9. Let $F$ be a finite factored set.

If $size (F) = 0$ , then $dim (F) = 1$ .
If $size (F) = 1$ , then $dim (F) = 0$ .
If $size (F) = p$ is prime, then $dim (F) = 1$ .
If $size (F) = p_{0} \dots p_{k - 1}$ is a product of $k \geq 2$ primes, then $1 \leq dim (F) \leq k$ .

Proof. The first three parts follow directly from Proposition 5 and Proposition 8. For the fourth part, let $F = (S, B)$ , and let $| S | = p_{0} \dots p_{k - 1}$ be a product of $k \geq 2$ primes.

By Proposition 7, $| S | = \prod_{b \in B} | b |$ . Consider an arbitrary $b \in B$ . Since $b$ is a nontrivial partition of a finite set $S$ , $| b |$ is finite and $| b | \neq 1$ . If $| b |$ were $0$ , then $| S |$ would be $0$ . Thus $| b |$ is a natural number greater than or equal to $2$ . $B$ cannot be empty, since $| S | \neq 1$ . If $| B |$ were greater than $k$ , then we would be able to express $| S |$ as a product of more than $k$ natural numbers greater than or equal to $2$ , which is clearly not possible since $| S |$ is a product of $k$ primes. Thus $1 \leq dim (F) \leq k$ . $□$

In the next post, we will introduce the notions of the history of a partition, orthogonality between partitions, and time.

Acknowledgments: My thanks to Alex Appel, Ramana Kumar, Xiaoyu He, Tsvi Benson-Tilsen, Andrew Critch, Sam Eisenstat, Rob Bensinger, and Claire Wang for discussion and feedback on this sequence.

Footnotes

[1] Although I say "factorize" here, note that this will not be the kind of factorization that shows up in finite factored sets, because (as we will see) disjoint factors must be independent in a finite factored set. I appeal to the same concept in both contexts because factorization is just a very general and useful concept, rather than to indicate a direct connection.

[2] At least, it lacks such causal inference tools unless we assume access to interventional data.

[3] Note that this definition and Definition 9 could have been made more general by taking $S$ to be a multiset.

[4] $P (S)$ denotes the power set of $S$ .

[5] In our view, " $Y \geq X$ " is also a more natural way to visually represent a mapping between a three-part partition $Y$ that is finer than a two-part partition $X$ .

18

Finite Factored Sets: Introduction and Factorizations

18

1. Introduction

1.1. Pearlian Causal Inference

1.2. Overview

2. Factorizations

2.1. Partitions

2.2. Factorizations

2.3. Chimera Functions

2.4. Trivial Factorizations

2.5. Finite Factored Sets

Footnotes