A simple example of conditional orthogonality in finite factored sets

Thanks for writing this.

On the finiteness point, I conjecture that "finite dimensional" (|B| is finite) is sufficient for all of my results so far, although some of my proofs actually use "finite" (|S| is finite). The example with real numbers is still finite dimensional, so I don't expect any problems.

Yeah, this is the point that orthogonality is a stronger notion than just all values being mutually compatible. Any x1 and x2 values are mutually compatible, but we don't call them orthogonal. This is similar to how we don't want to say that x1 and (the level sets of) x1+x2 are compatible.

The coordinate system has a collection of surgeries: you can take a point and change the x1 value without changing the other values. When you condition on E, that surgery is no longer well defined. However, the surgery of only changing the x4 value is still well defined, and the surgery of changing x1, x2, and x3 simultaneously is still well defined (provided you change them to something compatible with E).

We could define a surgery that says that when you increase x1, you decrease x2 by the same amount, but that is a new surgery that we invented, not one that comes from the original coordinate system.

Recently, MIRI researcher Scott Garrabrant has publicized his work on finite factored sets. It allegedly offers a way to understand agency and causality in a set-up like the causal graphs championed by Judea Pearl. Unfortunately, the definition of conditional orthogonality is very confusing. I'm not aware of any public examples of people demonstrating that they understand it, but I didn't really understand it until an hour ago, and I've heard others say that it went above their heads. So, I'd like to give an example of it here.

In a finite factored set, you have your base set S, and a set B of 'factors' of your set. In my case, the base set S will be four-dimensional space - I'm sorry, I know that's one more dimension than the number that well-adjusted people can visualize, but it really would be a much worse example if I were restricted to three dimensions. We'll think of the points in this space as tuples (x1,x2,x3,x4) where each xi is a real number between, say, -2 and 2 [footnote 1]. We'll say that X1 is the 'factor', aka partition, that groups points together based on what their value of x1 is, and similarly for X2, X3, and X4, and set B={X1,X2,X3,X4}. I leave it as an exercise for the reader to check whether this is in fact a finite factored set. Also, I'll talk about the 'value' of partitions and factors - technically, I suppose you could say that the 'value' of some partition at a point is the set in the partition that contains the point, but I'll use it to mean that, for example, the 'value' of X1 at point (x1,x2,x3,x4) is x1. If you think of partitions as questions where different points in S give different answers, the 'value' of a partition at a point is the answer to the question.
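For anyone who would rather do that exercise by machine, here is a minimal sketch on a coarse stand-in for S: integer coordinates from -2 to 2 instead of reals. The names `factor` and `is_factorization` are made up for this illustration. The test it applies (every factor is a nontrivial partition of S, and picking one part from each factor pins down exactly one point) is a paraphrase of the factorization requirement, so treat it as a sanity check rather than the official definition.

```python
from itertools import product

VALUES = range(-2, 3)               # assumption: coarse integer grid standing in for [-2, 2]
S = set(product(VALUES, repeat=4))  # points (x1, x2, x3, x4)

def factor(i):
    """Partition of S that groups points by coordinate i (0-based), i.e. X_{i+1}."""
    parts = {}
    for s in S:
        parts.setdefault(s[i], set()).add(s)
    return [frozenset(p) for p in parts.values()]

B = [factor(i) for i in range(4)]   # B = {X1, X2, X3, X4}

def is_factorization(S, B):
    """True if each b in B is a nontrivial partition of S and every choice of
    one part per factor intersects in exactly one point."""
    for b in B:
        covers = set().union(*b) == S
        disjoint = sum(len(p) for p in b) == len(S)
        if len(b) < 2 or not (covers and disjoint):
            return False
    return all(len(frozenset.intersection(*choice)) == 1
               for choice in product(*B))

print(is_factorization(S, B))      # all four factors
print(is_factorization(S, B[:3]))  # X4 dropped, so x4 is left undetermined
```

On this grid the first call reports True and the second False: without X4, each choice of parts leaves five points that differ only in x4.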

[EDIT: for the rest of the post, you might want to imagine S as points in space-time, where x4 represents the time, and (x1,x2,x3) represent spatial coordinates - for example, inside a room, where you're measuring from the north-east corner of the floor. In this analogy, we'll imagine that there's a flat piece of sheet metal leaning on the floor against two walls, over that corner. We'll try conditioning on that - so, looking only at points in space-time that are spatially located on that sheet - and see that distance left is no longer orthogonal to distance up, but that both are still orthogonal to time.]

Now, we'll want to condition on the set E={(x1,x2,x3,x4)|x1+x2+x3=1}. The thing with E is that once you know you're in E, x1 is no longer independent of x2, like it was before, since they're linked together by the condition that x1+x2+x3=1. However, x4 has nothing to do with that condition. So, what's going to happen is that conditioned on being in E, X1 is orthogonal to X4 but not to X2.

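In order to show this, we'll check the definition of conditional orthogonality, which actually refers to this thing called conditional history. I'll write out the definition of conditional history formally, and then try to explain it informally: the conditional history of X given E, which we'll write as h(X|E), is the smallest set of factors H⊆B satisfying the following two conditions (here s ∼_b t means that s and t lie in the same part of the partition b):

1. For all s,t∈E, if s ∼_b t for all b∈H, then s ∼_X t.
2. For all s,t∈E and r∈S, if r ∼_b s for all b∈H and r ∼_b t for all b∈B∖H, then r∈E.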

Condition 1 means that, if you think of the partitions as carving up the set S, then the partition X doesn't carve E up more finely than if you carved according to everything in h(X|E). Another way to say that is that if you know you're in E, knowing everything in the conditional history of X in E tells you what the 'value' of X is, which hopefully makes sense.

Condition 2 says that if you want to know if a point is in E, you can separately consider the 'values' of the partitions in the conditional history, as well as the other partitions that are in B but not in the conditional history. So it's saying that there's no 'entanglement' between the partitions in and out of the conditional history regarding E. This is still probably confusing, but it will make more sense with examples.
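The two conditions can be sketched as predicates. This is an illustration under some assumptions: a coarse integer grid from -2 to 2 stands in for the reals, factors are named by 0-based coordinate index (so X1 is index 0), and a partition X is given as a function from points to 'values'. The names `condition_1`, `condition_2` and `glue` are hypothetical. Because S here is a full grid, there is exactly one point that matches s on the candidate history H and matches t everywhere else, so condition 2 reduces to checking that this glued point lands in E.

```python
from itertools import product

VALUES = range(-2, 3)  # assumption: coarse integer grid instead of reals in [-2, 2]
S = set(product(VALUES, repeat=4))
E = {s for s in S if s[0] + s[1] + s[2] == 1}

def condition_1(H, X, E):
    """Inside E, points that agree on every factor in H also agree on X."""
    value_for = {}
    for s in E:
        key = tuple(s[i] for i in sorted(H))
        if value_for.setdefault(key, X(s)) != X(s):
            return False
    return True

def glue(s, t, H):
    """Point taking s's coordinates at the indices in H and t's elsewhere."""
    return tuple(s[i] if i in H else t[i] for i in range(len(s)))

def condition_2(H, E):
    """Gluing any two points of E along the split H / not-H stays inside E."""
    return all(glue(s, t, H) in E for s, t in product(E, repeat=2))

x4 = lambda s: s[3]                       # the 'value' of X4 at a point
everything, nothing = {0, 1, 2, 3}, set()
print(condition_1(everything, x4, E), condition_2(everything, E))
print(condition_1(nothing, x4, E), condition_2(nothing, E))
```

The two extreme candidates show why both conditions are needed: the full set B passes both, while the empty set passes condition 2 trivially but fails condition 1 for X4. The conditional history is whatever smallest set sits in between.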

Now, what's conditional orthogonality? That's pretty simple once you get conditional histories: X and Y are conditionally orthogonal given E if the conditional history of X given E doesn't intersect the conditional history of Y given E. So it's saying that once you're in E, the things determining X are different to the things determining Y, in the finite factored sets way of looking at things.

Let's look at some conditional histories in our concrete example: what's the history of X1 given E? The obvious first guess is {X1}, which satisfies condition 1, since being told the value of X1 trivially tells you the value of X1. But that can't be the whole thing. Consider the point s=(0.5,0.4,0.4,0.7). If you just knew the value of X1 at s, that would be compatible with s actually being (0.5,0.25,0.25,1), which is in E. And if you just knew the values of X2, X3, and X4, you could imagine that s was actually equal to (0.2,0.4,0.4,0.7), which is also in E. So, if you considered the factors in {X1} separately to the other factors, you'd conclude that s could be in E - but it's actually not! This is exactly the thing that condition 2 is telling us can't happen. (You might also try leaving X1 out: once you're in E, x1=1-x2-x3, so {X2,X3} passes condition 1 too. But it fails condition 2 in the same way as {X1}.) In fact, the conditional history of X1 given E is {X1,X2,X3}, which I'll leave for you to check. I'll also let you check that the conditional history of X2 given E is {X1,X2,X3}.
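The counterexample can be replayed with exact arithmetic. This is a small sketch, with `in_E`, `glue`, `p` and `q` as made-up names and 0-based indices standing for the factors (index 0 is X1). It uses Python's `fractions.Fraction` so that sums like 0.2+0.4+0.4 are compared exactly rather than in floating point.

```python
from fractions import Fraction as F

def in_E(point):
    """Membership in E: the first three coordinates sum to exactly 1."""
    return point[0] + point[1] + point[2] == 1

def glue(s, t, H):
    """Point taking s's coordinates at the indices in H and t's elsewhere."""
    return tuple(s[i] if i in H else t[i] for i in range(len(s)))

# The two points of E named in the text, written as exact fractions.
p = (F(1, 2), F(1, 4), F(1, 4), F(1))      # (0.5, 0.25, 0.25, 1)
q = (F(1, 5), F(2, 5), F(2, 5), F(7, 10))  # (0.2, 0.4, 0.4, 0.7)

# Candidate history {X1}: agree with p on X1 and with q on X2, X3, X4.
glued = glue(p, q, H={0})
print(glued == (F(1, 2), F(2, 5), F(2, 5), F(7, 10)))  # this is (0.5, 0.4, 0.4, 0.7)
print(in_E(p), in_E(q), in_E(glued))

# Candidate history {X1, X2, X3}: the glued point stays in E.
print(in_E(glue(p, q, H={0, 1, 2})))
```

Both source points are in E, but the point glued along {X1} has first three coordinates summing to 1.3, so {X1} breaks condition 2. Gluing along {X1,X2,X3} gives (0.5,0.25,0.25,0.7), which is in E.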

Now, what's the conditional history of X4 given E? It has to include X4, because if someone doesn't tell you X4 you can't figure it out. In fact, it's exactly {X4}. Let's check condition 2: it says that if all the factors outside the conditional history are compatible with some point being in E, and all the factors inside the conditional history are compatible with some point being in E, then it must be in E. That checks out here: you need to know the values of all three of X1, X2, and X3 at once to know if something's in E, but you get those together when you jointly consider the factors outside your conditional history, namely {X1,X2,X3}. So looking at (0.5,0.4,0.4,0.7), if you only look at the values that aren't told to you by the conditional history, which is to say the first three numbers, you can tell it's not in E and aren't tricked. And if you look at (0.5,0.25,0.25,0.7), you look at the factors in {X4} (namely X4), and it checks out, you look at the factors outside {X4} and that also checks out, and the point is really in E.

Hopefully this gives you some insight into condition 2 of the definition of conditional history. It's saying that when we divide factors up to get a history, we can't put factors that are entangled by the set we're conditioning on on 'different sides' - all the entangled factors have to be in the history, or they all have to be out of the history.
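To see the "all in or all out" pattern directly, one can run condition 2 against every subset of B. This is a brute-force sketch on a coarse integer grid from -2 to 2 (an assumption made so that enumeration is cheap), with factors named by 0-based coordinate index and `condition_2` a made-up name.

```python
from itertools import combinations, product

VALUES = range(-2, 3)  # assumption: coarse integer grid instead of reals in [-2, 2]
E = {s for s in product(VALUES, repeat=4) if s[0] + s[1] + s[2] == 1}

def condition_2(H, E):
    """For every s, t in E, the point that matches s on H and t off H is in E."""
    return all(tuple(s[i] if i in H else t[i] for i in range(4)) in E
               for s, t in product(E, repeat=2))

names = ["X1", "X2", "X3", "X4"]
# Try all 16 subsets of B, smallest first, and keep those passing condition 2.
passing = [set(H) for size in range(5) for H in combinations(range(4), size)
           if condition_2(H, E)]
for H in passing:
    print(sorted(names[i] for i in H))
```

Only four subsets survive on this grid: the empty set, {X4}, {X1,X2,X3}, and all of B. Every subset that splits X1, X2, and X3 across the two sides fails.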

In summary: h(X1|E)=h(X2|E)={X1,X2,X3}, and h(X4|E)={X4}. So, is X1 orthogonal to X2 given E? No, their conditional histories overlap - in fact, they're identical! Is X1 orthogonal to X4 given E? Yes, they have disjoint conditional histories.
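The summary can be double-checked by brute force. This is a minimal sketch under the same kind of assumptions as before: a coarse integer grid from -2 to 2 stands in for the reals, factors are 0-based coordinate indices, and `satisfies_both`, `conditional_history` and `orthogonal_given` are hypothetical names. It searches subsets of B from smallest to largest and returns the first one that passes both conditions, relying on the definition's claim that there is a single smallest such set.

```python
from itertools import combinations, product

VALUES = range(-2, 3)  # assumption: coarse integer grid instead of reals in [-2, 2]
E = {s for s in product(VALUES, repeat=4) if s[0] + s[1] + s[2] == 1}

def satisfies_both(H, X, E):
    # Condition 1: inside E, agreeing on H forces agreeing on X.
    value_for = {}
    for s in E:
        key = tuple(s[i] for i in H)
        if value_for.setdefault(key, X(s)) != X(s):
            return False
    # Condition 2: gluing s on H with t off H never leaves E.
    return all(tuple(s[i] if i in H else t[i] for i in range(4)) in E
               for s, t in product(E, repeat=2))

def conditional_history(X, E):
    """First (hence smallest) subset of factor indices passing both conditions."""
    for size in range(5):
        for H in combinations(range(4), size):
            if satisfies_both(H, X, E):
                return frozenset(H)

def orthogonal_given(X, Y, E):
    """X and Y are conditionally orthogonal if their histories are disjoint."""
    return conditional_history(X, E).isdisjoint(conditional_history(Y, E))

def coordinate(i):
    return lambda s: s[i]

X1, X2, X4 = coordinate(0), coordinate(1), coordinate(3)
print(sorted(conditional_history(X1, E)), sorted(conditional_history(X4, E)))
print(orthogonal_given(X1, X2, E), orthogonal_given(X1, X4, E))
```

On this grid the histories come out as indices [0, 1, 2] for X1 and [3] for X4, so X1 and X2 are not orthogonal given E while X1 and X4 are, matching the summary.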

Some notes:

[footnote 1] I know what you're saying - "That's not a finite set! Finite factored sets have to be finite!" Well, if you insist, you can think of them as only the numbers between -2 and 2 with two decimal places. That makes the set finite and doesn't really change anything. (Which suggests that a more expansive concept could be used instead of finite factored sets.)