Introduction to Cartesian Frames

[-]Ramana Kumar4y140

A formalisation of the ideas in this sequence in higher-order logic, including machine verified proofs of all the theorems, is available here.

[-]Rohin Shah5y*120

Planned summary (of the full sequence) for the Alignment Newsletter:

The <@embedded agency sequence@>(@Embedded Agents@) hammered in the fact that there is no clean, sharp dividing line between an agent and its environment. This sequence proposes an alternate formalism: Cartesian frames. Note this is a paradigm that helps us _think about agency_: you should not be expecting some novel result that, say, tells us how to look at a neural net and find agents within it.
The core idea is that rather than _assuming_ the existence of a Cartesian dividing line, we consider how such a dividing line could be _constructed_. For example, when we think of a sports team as an agent, the environment consists of the playing field and the other team; but we could also consider a specific player as an agent, in which case the environment consists of the rest of the players (on both teams) and the playing field. Each of these is a valid way of carving up what actually happens into an “agent” and an “environment”; they are _frames_ by which we can more easily understand what’s going on, hence the name “Cartesian frames”.
A Cartesian frame takes **choice** as fundamental: the agent is modeled as a set of options that it can freely choose between. This means that the formulation cannot be directly applied to deterministic physical laws. It instead models what agency looks like [“from the inside”](https://www.lesswrong.com/posts/yA4gF5KrboK2m2Xu7/how-an-algorithm-feels-from-inside). _If_ you are modeling a part of the world as capable of making choices, _then_ a Cartesian frame is appropriate to use to understand the perspective of that choice-making entity.
Formally, a Cartesian frame consists of a set of agent options A, a set of environment options E, a set of possible worlds W, and an interaction function that, given an agent option and an environment option, specifies which world results. Intuitively, the agent can “choose” an agent option, the environment can “choose” an environment option, and together these produce some world. You might notice that we’re treating the agent and environment symmetrically; this is intentional, and means that we can define analogs of all of our agent notions for environments as well (though they may not have nice philosophical interpretations).
The full sequence uses a lot of category theory to define operations on these sorts of objects and show various properties of the objects and their operations. I will not be summarizing this here; instead, I will talk about their philosophical interpretations.
First, let’s look at an example of using a Cartesian frame on something that isn’t typically thought of as an agent: the atmosphere, within the broader climate system. The atmosphere can “choose” whether to trap sunlight or not. Meanwhile, in the environment, either the ice sheets could melt or they could not. If sunlight is trapped and the ice sheets melt, then the world is Hot. If exactly one of these is true, then the world is Neutral. Otherwise, the world is Cool.
(Yes, this seems very unnatural. That’s good! The atmosphere shouldn’t be modeled as an agent! I’m choosing this example because its unintuitive nature makes it more likely that you think about the underlying rule, rather than just the superficial example. I will return to more intuitive examples later.)
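As a minimal sketch of the definition, the atmosphere example can be written down directly. The class name `Frame`, the option labels, and the helper `climate` are illustrative assumptions, not notation from the sequence.

```python
from dataclasses import dataclass
from typing import Callable, Hashable


@dataclass(frozen=True)
class Frame:
    """A Cartesian frame: agent options, environment options, and an
    interaction function sending each (agent, environment) pair to a world."""
    agent: tuple
    env: tuple
    evaluate: Callable[[Hashable, Hashable], Hashable]

    def world(self, a, e):
        return self.evaluate(a, e)

    def matrix(self):
        # Rows are agent options, columns are environment options.
        return [[self.evaluate(a, e) for e in self.env] for a in self.agent]


def climate(a, e):
    # Hot if both hold, Neutral if exactly one holds, Cool otherwise.
    count = (a == "trap") + (e == "melt")
    return ["Cool", "Neutral", "Hot"][count]


atmosphere = Frame(agent=("trap", "no_trap"), env=("melt", "no_melt"), evaluate=climate)

for row in atmosphere.matrix():
    print(row)
```

Representing the frame as a matrix (rows for the agent, columns for the environment) is just one convenient encoding; any function from pairs of options to worlds would do.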
**Controllables**
A _property_ of the world is something like “it is neutral or warmer”. An agent can _ensure_ a property if it has some option such that no matter what environment option is chosen, the property is true of the resulting world. The atmosphere could ensure the warmth property above by “choosing” to trap sunlight. Similarly the agent can _prevent_ a property if it can guarantee that the property will not hold, regardless of the environment option. For example, the atmosphere can prevent the property “it is hot”, by “choosing” not to trap sunlight. The agent can _control_ a property if it can both ensure and prevent it. In our example, there is no property that the atmosphere can control.
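The definitions of ensure, prevent, and control can be checked by brute force on the atmosphere example. This is a sketch under the assumption that a property is represented as a set of worlds; the function names are hypothetical.

```python
from itertools import combinations

A = ["trap", "no_trap"]          # agent options
E = ["melt", "no_melt"]          # environment options
WORLDS = ["Hot", "Neutral", "Cool"]


def world(a, e):
    return ["Cool", "Neutral", "Hot"][(a == "trap") + (e == "melt")]


def can_ensure(prop):
    # Some agent option lands in prop against every environment option.
    return any(all(world(a, e) in prop for e in E) for a in A)


def can_prevent(prop):
    # Some agent option avoids prop against every environment option.
    return any(all(world(a, e) not in prop for e in E) for a in A)


def can_control(prop):
    return can_ensure(prop) and can_prevent(prop)


# Enumerate every property (every subset of the worlds).
all_props = [set(c) for r in range(len(WORLDS) + 1) for c in combinations(WORLDS, r)]
controllable = [p for p in all_props if can_control(p)]

print(can_ensure({"Neutral", "Hot"}))   # "neutral or warmer"
print(can_prevent({"Hot"}))             # "it is hot"
print(controllable)
```

Enumerating all eight subsets of the three worlds is only feasible because the example is tiny; the point is just to confirm the claims in the paragraph above.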
**Coarsening or refining worlds**
We often want to describe reality at different levels of abstraction. Sometimes we would like to talk about the behavior of various companies; at other times we might want to look at an individual employee. We can do this by having a function that maps low-level (refined) worlds to high-level (coarsened) worlds. In our example above, consider the possible worlds {YY, YN, NY, NN}, where the first letter of a world corresponds to whether sunlight was trapped (Yes or No), and the second corresponds to whether the ice sheets melted. The worlds {Hot, Neutral, Cool} that we had originally are a coarsened version of this, where we map YY to Hot, YN and NY to Neutral, and NN to Cool.
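A small sketch of this coarsening, assuming worlds are encoded as two-letter strings as in the text. The names `COARSEN`, `preimage`, and `can_ensure` are illustrative.

```python
A = ["Y", "N"]  # Y: the atmosphere traps sunlight
E = ["Y", "N"]  # Y: the ice sheets melt

# Map from refined worlds to coarsened worlds.
COARSEN = {"YY": "Hot", "YN": "Neutral", "NY": "Neutral", "NN": "Cool"}


def fine_world(a, e):
    return a + e


def coarse_world(a, e):
    return COARSEN[fine_world(a, e)]


def preimage(coarse_prop):
    # The refined worlds that map into a coarse property.
    return {w for w, c in COARSEN.items() if c in coarse_prop}


def can_ensure(world_fn, prop):
    return any(all(world_fn(a, e) in prop for e in E) for a in A)


print(preimage({"Neutral"}))
print(can_ensure(coarse_world, {"Neutral", "Hot"}))
print(can_ensure(fine_world, preimage({"Neutral", "Hot"})))
# A property only expressible in the refined worlds: "sunlight was trapped".
print(can_ensure(fine_world, {"YY", "YN"}))
```

Any coarse property can be pulled back to a refined one via `preimage`, but not every refined property (such as "sunlight was trapped") corresponds to a coarse one.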
**Interfaces**
A major upside of Cartesian frames is that given the set of possible worlds that can occur, we can choose how to divide it up into an “agent” and an “environment”. Most of the interesting aspects of Cartesian frames are in the relationships between different ways of doing this division, for the same set of possible worlds.
First, we have interfaces. Given two different Cartesian frames <A, E, W> and <B, F, W> with the same set of worlds, an interface allows us to interpret the agent A as being used in place of the agent B. Specifically, if A would choose an option a, the interface maps this to one of B’s options b. This is then combined with the environment option f (from F) to produce a world w.
A valid interface also needs to be able to map the environment option f to e, and then combine it with the agent option a to get the world. This alternate way of computing the world must always give the same answer.
Since A can be used in place of B, all of A’s options must have equivalents in B. However, B could have options that A doesn’t. So the existence of this interface implies that A is “weaker” in a sense than B. (There are a bunch of caveats here.)
(Relevant terms in the sequence: _morphism_)
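The consistency condition on an interface can be sketched as a check that both routes to a world agree. The two frames here are a made-up toy example (a one-option agent placed inside a two-option button-pressing agent), and `is_interface`, `g`, and `h` are hypothetical names.

```python
# Frame C: a hypothetical agent whose only option is to press the button.
A, E = ["press"], ["e"]


def eval_c(a, e):
    return "pressed"


# Frame D: a two-option agent facing a one-option environment.
B, F = ["press", "dont"], ["f"]


def eval_d(b, f):
    return "pressed" if b == "press" else "not_pressed"


def is_interface(g, h):
    """g maps A's options to B's options, h maps F's options to E's options.

    The interface is valid if, for every a and f, translating the agent option
    forward gives the same world as translating the environment option back.
    """
    return all(eval_d(g[a], f) == eval_c(a, h[f]) for a in A for f in F)


good = is_interface({"press": "press"}, {"f": "e"})
bad = is_interface({"press": "dont"}, {"f": "e"})
print(good, bad)
```

Here B has an option ("dont") with no counterpart in A, which matches the sense in which A is the "weaker" of the two.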
**Decomposing agents into teams of subagents**
The first kind of subagent we will consider is a subagent that can control “part of” the agent’s options. Consider for example a coordination game, where there are N players who each individually can choose whether or not to press a Big Red Button. There are only two possible worlds: either the button is pressed, or it is not pressed. For now, let’s assume there are two players, Alice and Bob.
One possible Cartesian frame is the frame for the entire team. In this case, the team has perfect control over the state of the button -- the agent options are either to press the button or not to press the button, and the environment does not have any options (or more accurately, it has a single “do nothing” option).
However, we can also decompose this into separate Alice and Bob _subagents_. What does a Cartesian frame for Alice look like? Well, Alice also has two options -- press the button, or don’t. However, Alice does not have perfect control over the result: from her perspective, Bob is part of the environment. As a result, for Alice, the environment also has two options -- press the button, or don’t. The button is pressed if Alice presses it _or_ if the environment presses it. (The Cartesian frame for Bob is identical, since he is in the same position that Alice is in.)
Note however that this decomposition isn’t perfect: given the Cartesian frames for Alice and Bob, you cannot uniquely recover the original Cartesian frame for the team. This is because both Alice and Bob’s frames say that the environment has some ability to press the button -- _we_ know that this is just from Alice and Bob themselves, but given just the frames we can’t be sure that there isn’t a third person Charlie who also might press the button. So, when we combine Alice and Bob back into the frame for a two-person team, we don’t know whether the environment should have the ability to press the button. This makes the mathematical definition of this kind of subagent a bit trickier, though it still works out.
Another important note is that this is relative to how coarsely you model the world. We used a fairly coarse model in this example: only whether or not the button was pressed. If we instead used a finer model that tracked which subset of people pressed the button, then we _would_ be able to uniquely recover the team’s Cartesian frame from Alice and Bob’s individual frames.
(Relevant terms in the sequence: _multiplicative subagents, sub-tensors, tensors_)
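The difference between the team's frame and Alice's frame in the button example can be made concrete. This sketch uses the coarse two-world model from the text; the helper names are assumptions.

```python
WORLDS = {"pressed", "not_pressed"}


def team_world(team, env):
    # The environment has a single "do nothing" option and never matters.
    return "pressed" if team == "press" else "not_pressed"


def alice_world(alice, env):
    # From Alice's perspective Bob is the environment; either can press.
    return "pressed" if "press" in (alice, env) else "not_pressed"


def can_ensure(agent, env, world, prop):
    return any(all(world(a, e) in prop for e in env) for a in agent)


def can_control(agent, env, world, prop):
    # Preventing prop is the same as ensuring its complement.
    return can_ensure(agent, env, world, prop) and can_ensure(agent, env, world, WORLDS - prop)


OPTIONS = ["press", "dont"]

team_controls = can_control(OPTIONS, ["nothing"], team_world, {"pressed"})
alice_ensures_pressed = can_ensure(OPTIONS, OPTIONS, alice_world, {"pressed"})
alice_ensures_unpressed = can_ensure(OPTIONS, OPTIONS, alice_world, {"not_pressed"})
print(team_controls, alice_ensures_pressed, alice_ensures_unpressed)
```

The team controls the button, while Alice alone can ensure it is pressed but cannot ensure it stays unpressed, since Bob might press it.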
**Externalizing and internalizing**
This decomposition isn’t just for teams of people: even a single “mind” can often be thought of as the interaction of various parts. For example, hierarchical decision-making can be thought of as the interaction between multiple agents at different levels of the hierarchy.
This decomposition can be done using _externalization_. Externalization allows you to take an existing Cartesian frame and some specific property of the world, and then construct a new Cartesian frame where that property of the world is controlled by the environment.
Concretely, let’s imagine a Cartesian frame for Alice, that represents her decision on whether to cook a meal or eat out. If she chooses to cook a meal, then she must also decide which recipe to follow. If she chooses to eat out, she must decide which restaurant to eat out at.
We can externalize the high-level choice of whether Alice cooks a meal or eats out. This results in a Cartesian frame where the environment chooses whether Alice is cooking or eating out, and the agent must then choose a restaurant or recipe as appropriate. This is the Cartesian frame corresponding to the low-level policy that must pursue whatever subgoal is chosen by the high-level planning module (which is now part of the environment). The agent of this frame is a subagent of Alice.
The reverse operation is called internalization, where some property of the world is brought under the control of the agent. In the above example, if we take the Cartesian frame for the low-level policy, and then internalize the cooking / eating out choice, we get back the Cartesian frame for Alice as a unified whole.
Note that in general externalization and internalization are _not_ inverses of each other. As a simple example, if you externalize something that is already “in the environment” (e.g. whether it is raining, in a frame for Alice), that does nothing, but when you then internalize it, that thing is now assumed to be under the agent’s control (e.g. now the “agent” in the frame can control whether or not it is raining). We will return to this point when we talk about observability.
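Externalization of the cook / eat out choice can be sketched for the simplified case where the agent's option alone determines the externalized property and the original environment is trivial. The meal names and the function `externalize` are illustrative assumptions, and the general construction in the sequence handles a nontrivial environment as well.

```python
from itertools import product

# Alice's options: a high-level choice paired with a low-level choice.
ALICE = [("cook", "pasta"), ("cook", "curry"), ("out", "diner"), ("out", "sushi")]


def externalize(options):
    """Hand the high-level (first) component of each option to the environment."""
    blocks = {}
    for high, low in options:
        blocks.setdefault(high, []).append(low)
    highs = list(blocks)
    # A policy picks one low-level option for every possible high-level choice.
    policies = [dict(zip(highs, picks)) for picks in product(*(blocks[h] for h in highs))]
    env = highs  # the environment now makes the high-level choice

    def evaluate(policy, high):
        return (high, policy[high])

    return policies, env, evaluate


policies, env, evaluate = externalize(ALICE)
reachable = {evaluate(p, h) for p in policies for h in env}

print(len(policies), env)
print(evaluate(policies[0], "out"))
```

The new agent has one policy per combination of recipe and restaurant, and the set of reachable worlds is unchanged; only the split of control between agent and environment has moved.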
**Decomposing agents into disjunctions of subagents**
Our subagents so far have been “team-based”: the original agent could be thought of as a supervisor that got to control all of the subagents together. (The team agent in the button-pressing game could be thought of as controlling both Alice and Bob’s actions; in the cooking / eating out example Alice could be thought of as controlling both the high-level subgoal selection as well as the low-level policy that executes on the subgoals.)
The sequence also introduces another decomposition into subagents, where the superagent can be thought of as a supervisor that gets to choose _which_ of the subagents gets to control the overall behavior. Thus, the superagent can do anything that either of the subagents could do.
Let’s return to our cooking / eating out example. We previously saw that we could decompose Alice into a high-level subgoal-choosing subagent that chooses whether to cook or eat out, and a low-level subgoal-execution subagent that then chooses which recipe to make or which restaurant to go to. We can also decompose Alice as being the choice of two subagents: one that chooses which restaurant to go to, and one that chooses which recipe to make. The union of these subagents is an agent that first chooses whether to go to a restaurant or to make a recipe, and then uses the appropriate subagent to choose the restaurant or recipe: this is exactly a description of Alice.
(Relevant terms in the sequence: _additive subagents, sub-sums, sums_)
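One way to sketch this union of agents: take a tagged union of the two option sets and pair up the two environments. The frames, the tags `"left"` and `"right"`, and the function `frame_sum` are assumptions made for illustration.

```python
def frame_sum(c, d):
    """Combine two frames so the agent may pick any option from either one."""
    (A, E, eval_c), (B, F, eval_d) = c, d
    agent = [("left", a) for a in A] + [("right", b) for b in B]
    env = [(e, f) for e in E for f in F]

    def evaluate(option, pair):
        side, x = option
        e, f = pair
        # Only the environment of the chosen subagent matters.
        return eval_c(x, e) if side == "left" else eval_d(x, f)

    return agent, env, evaluate


restaurant_chooser = (["diner", "sushi"], ["e"], lambda a, e: ("out", a))
recipe_chooser = (["pasta", "curry"], ["f"], lambda b, f: ("cook", b))

agent, env, evaluate = frame_sum(restaurant_chooser, recipe_chooser)
meals = {evaluate(a, p) for a in agent for p in env}
print(len(agent), env)
print(sorted(meals))
```

The combined agent has four options and can reach all four meals, which is the unified Alice described above.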
**Committing and assuming**
One way to think about the subagents of the previous example is that they are the result of Alice _committing_ to a particular subset of choices. If Alice commits to eating out (but doesn’t specify at what restaurant), then the resulting frame is equivalent to the restaurant-choosing subagent.
Similarly to committing, we can also talk about _assuming_. Just as commitments restrict the set of options available to the agent, assumptions restrict the set of options available to the environment.
Just as we can union two agents together to get an agent that gets to choose between two subagents, we can also union two environments together to get an environment that gets to choose between two subenvironments. (In this case the agent is more constrained: it must be able to handle the environment regardless of which way the environment chooses.)
(Relevant terms in the sequence: _product_)
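Committing and assuming amount to deleting rows or columns of the frame. A rough sketch, reusing the button example with Bob as Alice's environment (the function names are hypothetical):

```python
AGENT = ["press", "dont"]   # Alice's options
ENV = ["press", "dont"]     # Bob's options, seen as Alice's environment


def world(a, e):
    return "pressed" if "press" in (a, e) else "not_pressed"


def commit(agent, keep):
    # Restrict the agent to a subset of its options.
    return [a for a in agent if a in keep]


def assume(env, keep):
    # Restrict the environment to a subset of its options.
    return [e for e in env if e in keep]


def can_ensure(agent, env, prop):
    return any(all(world(a, e) in prop for e in env) for a in agent)


before = can_ensure(AGENT, ENV, {"not_pressed"})
after_assuming = can_ensure(AGENT, assume(ENV, {"dont"}), {"not_pressed"})
after_committing = can_ensure(commit(AGENT, {"press"}), ENV, {"not_pressed"})
print(before, after_assuming, after_committing)
```

Assuming that Bob will not press gives Alice the ability to keep the button unpressed, while committing to press removes options and can only shrink what she is able to ensure.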
**Observables**
The most interesting (to me) part of this sequence was the various equivalent definitions of what it means for something to be observable. The overall story is similar to the one in [Knowledge is Freedom](https://www.alignmentforum.org/posts/b3Bt9Cz4hEtR26ANX/knowledge-is-freedom): an agent is said to “observe” a property P if it is capable of making different decisions based on whether P holds or not.
Thus we get our first definition of observability: **a property P of the world is _observable_ if, for any two agent options a and b, the agent also has an option that is equivalent to “if P then a else b”.**
This is meant to be similar to the notion of “inputs” to an agent. Intuitively, a neural net should be able to express arbitrary computations over its inputs, and so if we view the neural net as “choosing” what computation to do (by “choosing” what its parameters are), then the neural net can have its outputs (agent options) depend in arbitrary ways on the inputs. Thus, we say that the neural net “observes” its inputs, because what the neural net does can depend freely on the inputs.
Note that this is a very black-or-white criterion: we must be able to express _every_ conditional policy on the property for it to be observable; if even one such policy is not expressible then the property is not observable.
One way to think about this is that an observable property needs to be completely under the control of the environment, that is, the environment option should completely determine whether the resulting world satisfies the property or not -- nothing the agent does can matter (for this property). To see this, suppose that there was some environment option e that didn’t fully determine a property P, so that there are agent options a and b such that the world corresponding to (a, e) satisfies P but the one corresponding to (b, e) does not. Then our agent cannot implement the conditional policy “if P then b else a”, because it would lead to a self-referential contradiction (akin to “this sentence is false”) when the environment chooses e. Thus, P cannot be observable.
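The first definition of observability, and the failure case just described, can be checked on a toy frame. This is a sketch under assumed names: the environment picks the weather, the agent's options are policies from weather to gear, and a world is a (weather, gear) pair.

```python
from itertools import product

ENV = ["rain", "sun"]
GEAR = ["umbrella", "none"]

# Agent options are policies: one gear choice for each weather.
POLICIES = [dict(zip(ENV, picks)) for picks in product(GEAR, repeat=len(ENV))]
# A weaker agent that cannot react to the weather.
CONSTANT = [p for p in POLICIES if p["rain"] == p["sun"]]


def world(policy, weather):
    return (weather, policy[weather])


def is_rainy(w):
    return w[0] == "rain"


def carries_umbrella(w):
    return w[1] == "umbrella"


def implements_if(c, a, b, prop, env):
    # c must match a wherever prop holds of c's world, and match b elsewhere.
    for e in env:
        w = world(c, e)
        target = world(a, e) if prop(w) else world(b, e)
        if w != target:
            return False
    return True


def observable(agent, env, prop):
    return all(
        any(implements_if(c, a, b, prop, env) for c in agent)
        for a in agent
        for b in agent
    )


print(observable(POLICIES, ENV, is_rainy))
print(observable(CONSTANT, ENV, is_rainy))
print(observable(POLICIES, ENV, carries_umbrella))
```

The second check fails because the constant agent lacks a conditional policy, and the third fails because "if I carry an umbrella then carry none, else carry one" is the self-referential contradiction from the paragraph above.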
This is not equivalent to observability: it is possible for the environment to fully control P, while the agent is still unable to always condition on P. So we do need something extra. Nevertheless, this intuition suggests a few other ways of thinking about observability. The key idea is to identify a decomposition of the agent based on P that should only work if the environment has all the control, and then to identify a union step that puts the agent back together, that automatically adds in all of the policies that are conditional on P. I’ll describe these definitions here; the sequence proves that they are in fact equivalent to the original definition above.
First, recall that externalization and internalization are methods that allow us to “transfer” control of some property from the agent to the environment and vice versa. Thus, if all the control of P is in the environment, one would hope that internalization followed by externalization just transfers the control back and forth. In addition, when we externalize P, the externalization process will enforce that the agent can condition on P arbitrarily (this is how it is defined). This suggests the definition: **P is observable if and only if internalizing P followed by externalizing P gives us back the original frame.**
Second, if the environment has all of the control over P, then we should be able to decompose the agent into two parts: one that decides what to do when P is true, and one that decides what to do when P is false. We can achieve this using _assumptions_, that is, the first agent is the original agent under the assumption that P is true, and the second is under the assumption that P is false. Note that if the environment didn’t have perfect control over P, this would not work, as the environment options where P is not guaranteed to be true or false would simply be deleted, and could not be reconstructed from the two new agents.
We now need to specify how to put the agents back together, in a way that includes all the conditional policies on P. There are actually two variants in how we can do this:
In the first case, we combine the agents by unioning the environments, which lets the environment choose whether P is true or not. Given how this union is defined, the new agent is able to specify both what to do given the environment’s choice, _as well as_ what it would have done in the counterfactual case where the environment had decided P differently. This allows it to implement all conditional policies on P. So, **P is observable if and only if decomposing the frame using assumptions on P, and then unioning the environments of the resulting frames gives back the original frame.**
In the second case, after getting agents via assumption on P, we extend each agent so that in the case where its assumption is false, it is as though it takes a noop action. Intuitively, the resulting agent is an agent that is hobbled so that it has no power in worlds where P comes out differently than was assumed. These agents are then combined into a team. Intuitively, the team selects an option of the form “the first agent tries to do X (which only succeeds when P is true) and the second agent tries to do Y (which only succeeds when P is false)”. Like the previous decomposition, this specifies both what to do in whatever actual environment results, as well as what would have been done in the counterfactual world where the value of P was reversed. Thus, this way of combining the agents once again adds in all conditional policies on P. So, **P is observable if and only if decomposing the frame using assumptions on P, then hobbling the resulting frames in cases where their assumptions are false, and then putting the agents back in a team, is equivalent to the original frame.**
**Time**
Cartesian frames do not have an intrinsic notion of time. However, we can still use them to model sequential processes, by having the agent options be _policies_ rather than actions, and having the worlds be histories or trajectories rather than states.
To say useful things about time, we need to broaden our notion of observables. So far I’ve been talking about whether you can observe binary properties P that are either true or false. In fact, all of the definitions can be easily generalized to n-ary properties P that can take on one of n values. We’ll be using this notion of observability here.
Consider a game of chess where Alice plays as white and Bob as black. Intuitively, when Alice is choosing her second move, she can observe Bob’s first move. However, the property “Bob’s first move” would not be observable in Alice’s Cartesian frame, because Alice’s _first_ move cannot depend on Bob’s first move (since Bob hasn’t made it yet), and so when deciding the first move we can’t implement policies that condition on what Bob’s first move is.
Really, we want some way to say “after Alice has made her first move, from the perspective of the rest of her decisions, Bob’s first move is observable”. But we know how to remove some control from the agent in order to get the perspective of “everything else” -- that’s externalization! In particular, in Alice’s frame, if we externalize the property “Alice’s first move”, then the property “Bob’s first move” _is_ observable in the new frame.
This suggests a way to define a sequence of frames that represent the passage of time: we define the Tth frame as “the original frame, but with the first T moves externalized”, or equivalently as “the T-1th frame, but with the Tth move externalized”. Each of these frames is a subagent of the original frame, since we can think of the full agent (Alice) as the team of “the agent that plays the first T moves” and “the agent that plays the T+1th move and onwards”. As you might expect, as “time” progresses, the agent loses controllables and gains observables. For example, by move 3 Alice can no longer control her first two moves, but she can now observe Bob’s first two moves, relative to Alice at the beginning of the game.

Planned opinion:

I like this way of thinking about agency: we’ve been talking about “where to draw the line around the agent” for quite a while in AI safety, but there hasn’t been a nice formalization of this until now. In particular, it’s very nice that we can compare different ways of drawing the line around the agent, and make precise various concepts around this, such as “subagent”.
I’ve also previously liked the notion that “to observe P is to be able to change your decisions based on the value of P”, but I hadn’t really seen much discussion about it until now. This sequence makes some real progress on conceptual understanding of this perspective: in particular, the notion that observability requires “all the control to be in the environment” is not one I had until now. (Though I should note that this particular phrasing is mine, and I’m not sure the author would agree with the phrasing.)
One of my checks for the utility of foundational theory for a particular application is to see whether the key results can be explained without having to delve into esoteric mathematical notation. I think this sequence does very well on this metric -- for the most part I didn’t even read the proofs, yet I was able to reconstruct conceptual arguments for many of the theorems that are convincing to me. (They aren’t and shouldn’t be as convincing as the proofs themselves.) However, not all of the concepts score so well on this -- for example, the generic subagent definition was sufficiently unintuitive to me that I did not include it in this summary.

[-]Scott Garrabrant5y40

Looks like a pretty good summary to me.

[-]riceissa5y20

So the existence of this interface implies that A is “weaker” in a sense than A’.

Should that say B instead of A', or have I misunderstood? (I haven't read most of the sequence.)

[-]Rohin Shah5y20

It should, good catch, thanks!

[-]Mark Xu5y*120

This is very exciting. Looking forward to the rest of the sequence.

As I was reading, I found myself reframing a lot of things in terms of the rows and columns of the matrix. Here's my loose attempt to rederive most of the properties under this view.

The world is a set of states. One way to think about these states is by putting them in a matrix, which we call a "Cartesian frame". In this frame, the rows of the matrix are possible "agents" and the columns are possible "environments".
- Note that you don't have to put all the states in the matrix.
Ensurables are the part of the world that the agent can always ensure we end up in. Ensurables are the rows of the matrix, closed under supersets.
Preventables are the part of the world that the agent can always ensure we don't end up in. Preventables are the complements of the rows, closed under subsets.
Controllables are parts of the world that are both ensurable and preventable. Controllables are rows (or sets of rows) for which there exist rows that are disjoint. [edit: previous definition of "contains elements not found in other rows" was wrong, see comment by crabman]
Observables are parts of the environment that the agent can observe and act conditionally according to. Observables are columns such that for every pair of rows there is a third row that equals the 1st row if the environment is in that column and the 2nd row otherwise. This means that for every two rows, there's a third row that's made by taking the first row and swapping elements with the 2nd row where it intersects with the column.
- Observables have to be sets of columns because if they weren't, you can find a column that is partially observable and partially not. This means you can build an action that says something like "if I am observable, then I am not observable. If I am not observable, I am observable" because the swapping doesn't work properly.
- Observables are closed under boolean combination (note it's sufficient to show closure under complement and unions):
  - Since swapping index 1 of a row is the same as swapping all non-1 indexes, observables are closed under complements.
  - Since you can swap indexes 1 and 2 by first swapping index 1, then swapping index 2, observables are closed under union.
    - This is equivalent to saying "If A or B, then a0, else a2" is logically equivalent to "if A, then a0, else (if B, then a0, else a2)"
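The logical equivalence in the last bullet can be confirmed with a brute-force truth table. This is only a sanity check of the propositional claim, with `flat` and `nested` as made-up names for the two forms.

```python
from itertools import product


def flat(A, B, a0, a2):
    # "If A or B, then a0, else a2"
    return a0 if (A or B) else a2


def nested(A, B, a0, a2):
    # "if A, then a0, else (if B, then a0, else a2)"
    return a0 if A else (a0 if B else a2)


equivalent = all(
    flat(A, B, "a0", "a2") == nested(A, B, "a0", "a2")
    for A, B in product([False, True], repeat=2)
)
print(equivalent)
```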
Since controllables are rows with specific properties and observables are columns with specific properties, then nothing can be both controllable and observable. (The only possibility is the entire matrix, which is trivially not controllable because it's not preventable)
- This assumes that the matrix has at least one column
The image of a cartesian frame is the actual matrix part.
Since an ensurable is a row (or superset) and an observable is a column (or set of columns), then if something is ensurable and observable, then it must contain every column, so it must be the whole matrix (image).
If the matrix has 1 or 0 rows, then the observable constraint is trivially satisfied, so the observables are all possible sets of (possible) environment states (since 0/1 length columns are the same as states).
- "0 rows" doesn't quite make sense, but just pretend that you can have a 0 row matrix which is just a set of world states.
If the matrix has 0 columns, then the ensurable/preventable constraint is trivially satisfied, so the ensurables are the same as the preventables are the same as the controllables, which are all possible sets of (possible) environment states (since "length 0" rows are the same as states).
- "0 columns" doesn't make that much sense either, but pretend that you can have a 0 column matrix which is just a set of world states.
If the matrix has exactly 1 column, then the ensurable/preventable constraint is trivially satisfied for states in the image (matrix), so the ensurables are all non-empty sets of states in the matrix (since length 1 columns are the same as states), closed under union with states outside the matrix. It should be easy to see that controllables are all possible sets of states that intersect the matrix non-trivially, closed under union with states outside the matrix.

[-]Diffractor5y70

I will be hosting a readthrough of this sequence on MIRIxDiscord again, PM for a link.

[-]Ben Pace4y*60 Review for 2020 Review

Introduction to Cartesian Frames is a piece that also gave me a new philosophical perspective on my life.

I don't know how to simply describe it. I don't know what even to say here.

One thing I can say is that the post formalized the idea of having "more agency" or "less agency", in terms of "what facts about the world can I force to be true?". The more I approach the world by stating things that are going to happen, that I can't change, the more I'm boxing-in my agency over the world. The more I treat constraints as things I could fight to change, the more I have power and agency over the world. If I can't imagine a fact being false, I don't have agency over it. (This applies to mathematical and logical claims too, which ties into logical induction and decision theory.)

Writing the last sentence I realize the idea is one with the post I wrote "Taking your environment as object" vs "Being subject to your environment" which is another chunk of this element of growth I've experienced in the last year.

Anyway, that was a big deal — the first few times I read the math of cartesian frames I didn't get the idea at all, then after seeing some examples and reflecting on it, it clicked and helped me understand this whole thing better.

(Also that Scott has formalized it is very valuable and impressive, and even more so is this notion of factorizations of a set and the apparently new sequence he discovered which is insane and can't be true. Factorization of a set seems like the third thing you'd invent about sets once you thought of the idea, and if Scott discovered it in 2020 I'll be like wtaf.)

(But this is not the primary reason I'm endorsing it in the review. The primary reason is that it captures something that seems philosophically important to me.)

In retrospect I'm bumping this up to a +9 for the review. I didn't think about it properly in the early vote, and it's a lot of technical stuff and I forgot about the core concepts I got from it.

(This review is taken from my post Ben Pace's Controversial Picks for the 2020 Review.)

[-]Ramana Kumar5y50

Do we lose much by restricting attention to finite Cartesian frames (i.e., with finite agent and environment)? I ask because I'm formalising these results in higher-order logic (HOL), and the category is too big to represent if it really must contain frames with infinite agents and also for any pair of frames the frame whose agents are the morphisms between them. The root problem is probably that I require any category's class of objects to be a set, but it's hard to avoid this requirement in HOL in a nice way. Everything should work out for finite frames though. (I haven't come across any compelling examples of infinite frames, but I haven't tried hard to think of them.)

[-]Scott Garrabrant5y20

I don't think you lose much by focusing on finite Cartesian frames. I have mostly only been imagining finite cases.

I think there is some potential for later extending the theory to encompass game theory and probabilistic strategies, and then we might want to think of the infinite space of mixed strategies as the agent, but it wouldn't surprise me if in doing this, we also put continuity into the system and want to assume compactness.

[-]Ramana Kumar5y20

To see that some restriction is required here (not imposed by HOL), consider that if may contain arbitrary Cartesian frames over $w$ then we would have an injection $2^{\mathrm{Chu}(w)} \to \mathrm{Chu}(w)$ that, for example, encodes a set $S \subseteq \mathrm{Chu}(w)$ as the Cartesian frame $C_S$ with $\mathrm{Agent}(C_S) = S$ (the environment and evaluation function are unimportant), which runs afoul of Cantor's theorem regarding the cardinality of $\mathrm{Chu}(w)$.

I wouldn't be surprised if a similar encoding/injection could be made using just the operations used to construct Cartesian frames that appear in this sequence - though I have not found one explicitly myself yet.

[-]Charlie Steiner5y50

If every pair led to a different world-state, this would be the boring case of complete factorizability, right? As in, you couldn't distinguish this from the world having no dynamics at all, just a recording of the choices of $a$ and $e$. Therefore it seems important that your dynamics send some pairs of choices to identical states.

But that's not necessarily how the micro-scale laws of physics work. You can't squish state space irreversibly like that. And so $W$ can't be the actual microphysical world, it has to be some macro-level abstract model of it, or else it's boring.

So I'm a little confused about what you have in mind when you talk about putting different bases $A$ and $E$ onto the same $W$. What's so great about keeping the same $W$, if it's an abstraction of the microphysical world, tailor-made to help us model exactly this agent? I suspect that the answer is that you're using this to model an agent that also has subagents, so I'm excited for that post :)

[-]Scott Garrabrant5y30

Your suspected answer is right.

[-]Ben Pace5y40

Curated.

I'm exceedingly excited about this sequence. The Embedded Agency sequence laid out a core set of confusions, and it seems like this is a formal system that deals with those issues far better than the current alternatives e.g. the cybernetics model. This post lays out the basics of Cartesian Frames clearly and communicates key parts of the overall approach ("reasoning like Pearl's to objects like game theory's, with a motivation like Hutter's"). I've also never seen math explained with as much helpful philosophical justification (e.g. "Part of the point of the Cartesian frame framework is that we are not privileging either interpretation"), and I appreciate all of that quite a bit.

It seems likely that by the end of this sequence it will be on a list of my all-time favorite things posted to LessWrong 2.0. I'm looking forward to getting to grips with Cartesian Frames, understanding how they work, and to start applying those intuitions to my other discussions of agency.

I'm also curating it a little quickly to let people know that Scott is giving a talk on this sequence this Sunday at 12:00PM PT. Furthermore, Scott is holding weekly office hours (see the same link for more info) for people to ask questions, and Diffractor is running a reading group in the MIRIx Discord, which I recommend people PM him to get an invite to (I just did so myself, it's a nice Discord server).

[-]Stuart_Armstrong5y30

Did posts on generalised models as a category and how one can see Cartesian frames as generalised models.

[-]Stuart_Armstrong5y30

I like it. I'll think about how it fits with my ways of thinking (eg model splintering).

[-]Mark Xu5y20

In 4.1:

Given a0 and a1, since S∈Obs(C), there exists an a2∈A such that for all e∈E, we have a2∈if(S,a0,a1). Then, since T∈Obs(C), there exists an a3∈A such that for all e∈E, we have a3∈if(S,a0,a2). Unpacking and combining these, we get for all e∈E, a3∈if(S∪T,a0,a1). Since we could construct such an a3 from an arbitrary a0,a1∈A, we know that S∪T∈Obs(C). □

I think there's a typo here. Should be $a_3 \in \mathrm{if}(T, a_0, a_2)$, not $a_3 \in \mathrm{if}(S, a_0, a_2)$.

(also not sure how to copy latex properly).

[-]Scott Garrabrant5y10

Yep. Fixed. Thanks.

|  | base information | derived information |
| --- | --- | --- |
| causality | joint probability distribution | causal diagram |
| games | normal-form game | extensive-form game |
| agency | Cartesian frame | control, observation, subagents, time, etc. |
