Scott Garrabrant

I believe this post is (for the most part) accurate and demonstrates understanding of what is going on with logical induction. Thanks for writing (and coding) it!

I think your numbers are wrong, and the right column on the output should say 20% 20% 20%.

The output actually agrees with each of the components on every event in that component's sigma algebra. The input distributions don't actually have any conflicting beliefs, and so of course the output chooses a distribution that doesn't disagree with either.

I agree that the 0s are a bit unfortunate.

I think the best way to think of the type of the object you get out is not a probability distribution on but what I am calling a partial probability distribution on . A partial probability distribution is a partial function from that can be completed to a full probability distribution on (with some sigma algebra that is a superset of the domain of the partial probability distribution).

I like to think of the argmax function as something that takes in a distribution on probability distributions on with different sigma algebras, and outputs a partial probability distribution that is defined on the set of all events that are in the sigma algebra of (and given positive probability by) one of the components.
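As a rough illustration of the type described above, here is a minimal Python sketch. It covers only the easy case where the components do not conflict, so the partial distribution can be read off the components directly; it does not implement the argmax itself. The finite world set, the representation of a sigma algebra as a partition, the numbers, and all function names (`events_of`, `partial_distribution`, `is_completion`) are assumptions made up for the example, not taken from the post.

```python
from fractions import Fraction
from itertools import combinations


def events_of(component):
    """All events (unions of blocks) in the sigma algebra generated by a partition.

    `component` maps each block (a frozenset of worlds) to its probability.
    """
    blocks = list(component)
    return {
        frozenset().union(*combo)
        for r in range(len(blocks) + 1)
        for combo in combinations(blocks, r)
    }


def component_prob(component, event):
    """Probability a component assigns to an event in its own sigma algebra."""
    return sum((p for block, p in component.items() if block <= event), Fraction(0))


def partial_distribution(components):
    """Collect every event that some component measures with positive probability.

    Assumption: the components never disagree on a shared event. If they do,
    this sketch raises instead of trying to reconcile them.
    """
    partial = {}
    for component in components:
        for event in events_of(component):
            p = component_prob(component, event)
            if p == 0:
                continue  # only events given positive probability are in the domain
            if partial.setdefault(event, p) != p:
                raise ValueError("components conflict on a shared event")
    return partial


def is_completion(full, partial):
    """Check that a full distribution on worlds agrees with the partial one."""
    return all(
        sum((full[w] for w in event), Fraction(0)) == p
        for event, p in partial.items()
    )


# Hypothetical components over worlds {1, 2, 3}, with different sigma algebras.
A = {frozenset({1}): Fraction(1, 4), frozenset({2, 3}): Fraction(3, 4)}
B = {frozenset({1, 2}): Fraction(1, 2), frozenset({3}): Fraction(1, 2)}

partial = partial_distribution([A, B])
full = {1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(1, 2)}
```

In this toy case `partial` is defined on five events and leaves `{2}` undefined, since neither component can see that event. `full` is one completion that agrees with both components on every event in their own sigma algebras.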

One nice thing about this definition is that it makes the argmax always take on a unique value. (Proof omitted.)

This doesn't really make it that much better, but the point here is that this framework admits that it doesn't really make much sense to ask about the probability of the middle column. You can ask about any of the events in the original pair of sigma algebras, and indeed, the two inputs don't disagree with the output at all on any of these sets.

Yeah, remember the above is all for updateless agents, which are already computationally intractable. For updateful agents, we will want to talk about conditional counterfactability. For example, if you and I are in a prisoner's dilemma, we could condition on all the stuff that happened prior to us being put in separate cells, and given this condition, the histories are much smaller.

Also, we could do all of our reasoning up to a high level world model that makes histories more reasonably sized.

Also, we could think of counterfactability as a spectrum. Some events are especially hard to reason about, because there are lots of different ways we could have done it, and we can selectively add details to make it more and more counterfactable, meaning it approximately screens off its history from that which you care about.

I agree, this is why I said I am being sloppy with conflating the output and our understanding of the output. We want our understanding of the output to screen off the history.

I mean, the definition is a little vague. If your meaning is something like "It goes in A if it is more accurately described as controlled by the viscera, and it goes in P if it is more accurately described as controlled by the environment," then I guess you can get a bijection by definition, but it is not obvious these are natural categories. I think there will be parts of the boundary that feel like they are controlled by both or neither, depending on how strictly you mean "controlled by."

My default plan is to not try to rename Cartesian frames, mostly because the benefit seems small, and I care more about building up the FFS ontology over the Cartesian frame one.

I agree completely. I am not really happy with any of the language in this post, and I want it to have scope limited to this post. I will for the most part say boundary for both the additive and multiplicative variants.

More of my thoughts here.

To be clear, everywhere I say “is wrong,” I mean I wish the model were slightly different, not that anything is actually mistaken. In most cases, I don’t really have much of an idea how to actually implement my recommendation.

This underrated post is pretty good at explaining how to translate between FFSs and DAGs.