Probability is Real, and Value is Complex

[-]Thomas Kwa4y*120

I think a lot of commenters misunderstand this post, or think it's trying to do more than it is. TLDR of my take: it's conveying intuition, not suggesting we should model preferences with 2D vector spaces.

The risk-neutral measure in finance is one way that "rotations" between probability and utility can be made:

- Under the actual measure P, agents have utility nonlinear in money (e.g. risk aversion), and probability corresponds to frequentist notions.
- Under the risk-neutral measure Q, agents have utility linear in money, and probability is skewed towards losing outcomes.

These two interpretations explain the same agent behavior. The risk-neutral measure still "feels" like probability due to its uniqueness in an efficient market (fundamental theorem of asset pricing), plus the fact that quants use and think in it every day to price derivatives. Mathematically, it's no different from the actual measure P.
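To make the P versus Q picture concrete, here is a minimal sketch of a one-period, two-state market. All numbers (prices, the 50/50 actual measure, the zero interest rate, the strike) are made up for illustration, and `expect`, `density`, and `call` are hypothetical names. It shows that Q puts more weight on the losing state than P does, and that pricing under Q agrees with pricing under P once payoffs are reweighted by the density dQ/dP.

```python
from fractions import Fraction

# Hypothetical one-period, two-state market with zero interest rate.
s0 = Fraction(100)                                    # stock price today
s1 = {"up": Fraction(120), "down": Fraction(90)}      # stock price tomorrow
p = {"up": Fraction(1, 2), "down": Fraction(1, 2)}    # assumed "actual" measure P

# Risk-neutral measure Q: the weights under which today's price equals
# the expected price tomorrow (no discounting, since the rate is 0).
q_up = (s0 - s1["down"]) / (s1["up"] - s1["down"])
q = {"up": q_up, "down": 1 - q_up}

# Radon-Nikodym derivative dQ/dP, state by state.
density = {state: q[state] / p[state] for state in p}

def expect(measure, payoff):
    """Expected payoff under a measure on the two states."""
    return sum(measure[s] * payoff[s] for s in measure)

# Price a call option with strike 100 in two equivalent ways.
call = {s: max(s1[s] - 100, Fraction(0)) for s in s1}
price_under_q = expect(q, call)
price_under_p = expect(p, {s: density[s] * call[s] for s in call})
```

In this toy market Q assigns 2/3 to the down state where P assigns 1/2, which is the "skewed towards losing outcomes" effect, and both pricing routes give the same number.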

The Radon-Nikodym theorem tells you how to transform between probability measures in general. For any utility function satisfying certain properties (which I don't know exactly), I think one can find a measure Q such that you're maximizing that utility function under Q. Sometimes when making career decisions, I think in terms of the "actionable AI alignment probability measure" P_A, which is P conditioned on my counterfactually saving the world. Under P_A, the alignment problem has a closer to 50% chance of being solved, my research directions are more tractable, etc. Again, P_A is just a probability measure, and "feels like" probability.
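Conditioning is the simplest case of such a change of measure: the density of P_A with respect to P is 1/P(A) on the event A and 0 elsewhere. A minimal sketch follows, with made-up worlds and probabilities chosen purely for illustration (the names `in_A`, `density`, and `prob_solved` are hypothetical).

```python
from fractions import Fraction

# Made-up worlds: (my work is pivotal?, alignment solved?) -> probability under P.
P = {
    (True, True): Fraction(1, 20),
    (True, False): Fraction(1, 20),
    (False, True): Fraction(3, 20),
    (False, False): Fraction(15, 20),
}

def in_A(world):
    """A is the event that my work is pivotal."""
    pivotal, _solved = world
    return pivotal

p_of_A = sum(prob for world, prob in P.items() if in_A(world))

# dP_A/dP is 1/P(A) on A and 0 off A.
density = {w: (1 / p_of_A if in_A(w) else Fraction(0)) for w in P}
P_A = {w: density[w] * P[w] for w in P}

def prob_solved(measure):
    """Probability that alignment is solved under the given measure."""
    return sum(prob for (_pivotal, solved), prob in measure.items() if solved)
```

With these illustrative numbers the chance of a solution is 1/5 under P and 1/2 under P_A, and P_A still sums to 1, so it behaves like any other probability measure.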

This post finds a particular probability measure Q which doesn't really have a physical meaning [1]. But its purpose is to make it more obvious that probability and utility are inextricably intertwined, because

- instead of explaining behavior in terms of P and the utility function V, you can represent it using P and Q;
- P and Q form a vector space, and you can perform literal "rotations" between probability and utility that still predict the same agent behavior.

As far as I can tell, this is the entire point. I don't see this 2D vector space actually being used in modeling agents, and I don't think Abram does either.
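A small numerical sketch of such a rotation, under an assumption not spelled out in this comment: each event E is treated as the vector (P(E), Q(E)) and its expected value is V(E) = Q(E) / P(E). The event names and numbers are made up, and `rotate` and `ranking` are hypothetical helpers. The sketch rotates every vector by the same angle, rescales so the whole space has probability 1, and compares preference rankings before and after.

```python
import math

# Each event is a vector (P(E), Q(E)); its expected value is V(E) = Q(E) / P(E).
# Made-up numbers: "a", "b", "c" partition the whole space "omega".
events = {
    "a": (0.2, 0.3),
    "b": (0.5, -0.1),
    "c": (0.3, -0.2),
    "omega": (1.0, 0.0),
}

def rotate(vec, theta):
    """Rotate a 2D vector counterclockwise by theta radians."""
    p, q = vec
    return (p * math.cos(theta) - q * math.sin(theta),
            p * math.sin(theta) + q * math.cos(theta))

def ranking(evts):
    """Order events from least to most preferred by V(E) = Q(E) / P(E)."""
    return sorted(evts, key=lambda name: evts[name][1] / evts[name][0])

theta = math.radians(20)
rotated = {name: rotate(vec, theta) for name, vec in events.items()}

# Renormalize so that the whole space has probability 1 again.
scale = 1 / rotated["omega"][0]
rotated = {name: (p * scale, q * scale) for name, (p, q) in rotated.items()}
```

Here the probabilities change noticeably (event "a" drops from 0.2 to roughly 0.09) while the ranking of events is unchanged. This only works while every vector stays strictly in the right half-plane; the 20 degree angle was picked so that it does.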

Personally, I find it pretty compelling to just think of the risk-neutral measure, to understand why probability and utility are inextricably linked. But actually knowing there is symmetry between probability and utility does add to my intuition.

[1]: actually, if we're upweighting the high-utility worlds, maybe it can be called "rosy probability measure" or something.

[-]abramdemski4y90

"As far as I can tell, this is the entire point. I don't see this 2D vector space actually being used in modeling agents, and I don't think Abram does either."

I largely agree. In retrospect, a large part of the point of this post for me is that it's practical to think of decision-theoretic agents as having expected value estimates for everything, without there being a utility function anywhere that the expected values are "expectations of".

A utility function is a gadget for turning probability distributions into expected values. This object makes sense in a context like VNM, where you are asking agents to judge between arbitrary gambles. In the Jeffrey-Bolker setting, you instead only ask agents to choose between events, not gambles. This allows us to directly derive coherence constraints on expectations without introducing a function they're expectations "of".
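One such coherence constraint can be written down directly. As a sketch in standard notation (the symbols $P$, $V$, $A$, $B$ are introduced here for illustration): for disjoint events $A$ and $B$ with $P(A) + P(B) > 0$, Jeffrey-style expected values are required to average as

$$V(A \cup B) = \frac{P(A)\,V(A) + P(B)\,V(B)}{P(A) + P(B)}$$

so the constraint relates the expected values of events to one another without mentioning any function defined on whole world-descriptions.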

For me, this fits better with the way humans seem to think; it's relatively easy to compare events to each other, but nigh impossible to take entire world-descriptions and compare them (which is what a utility function does).

The rotation comes into play because looking at preferences this way is much more 'situated': you are only required to have preferences relating to your current beliefs, rather than relating to arbitrary probability distributions (as in VNM). We can intuit from our experience that there is some wiggle room between probability vs preference when representing situations in the real world. VNM doesn't model this, because probabilities are simply given to us in the VNM setting, and we're to take them as gospel truth.

So Jeffrey-Bolker seems to do a better job of representing the subjective nature of probability, and the vector rotations illustrate this.

On the other hand, I think there is a real advantage to the 2d vector representation of a preference structure. For agents with identical beliefs (the "common prior assumption"), Harsanyi showed that cooperative preference structures can be represented by simple linear mixtures (Harsanyi's utilitarian theorem). However, Critch showed that combining preferences in general is not so simple. You can't separately average two agents' beliefs and their utility functions; you have to dynamically change the weights of the utility-function averaging based on how Bayesian updates shift the weights of the probability mixture.

Averaging the vector-valued measures together works fine, though, I believe. (I haven't worked it out in detail.) If true, this makes vector-valued measures an easier way to think about coalitions of cooperating agents who merge preferences in order to select a Pareto-optimal joint policy.
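A short calculation suggests why this is plausible. It is only a sketch under assumed notation: agent $i$ has the vector-valued measure $(P_i, Q_i)$ with $Q_i(E) = P_i(E)\,V_i(E)$, and the coalition uses fixed weights $w_i \ge 0$ summing to 1. Mixing both components linearly, $P = \sum_i w_i P_i$ and $Q = \sum_i w_i Q_i$, gives for any event $E$ with $P(E) > 0$

$$V(E) = \frac{Q(E)}{P(E)} = \sum_i \frac{w_i\,P_i(E)}{\sum_j w_j\,P_j(E)}\;V_i(E)$$

so the effective weight on agent $i$'s expected value is proportional to $w_i\,P_i(E)$, which is how a Bayesian update on $E$ would shift the weights of the probability mixture. The belief-dependent reweighting described above therefore falls out of a plain linear mixture of the vectors. This identity alone does not establish the Pareto-optimality claim.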

[-]Scott Garrabrant7y30

The uniqueness of 0 is only roughly equivalent to the half plane definition if you also assume convexity (i.e. the existence of independent coins of no value).

[-]dranorter7y20

What does it look like to rotate and then renormalize?

There seem to be two answers. The first answer is that the highest probability event is the one farthest to the right. This event must be the entire $\Omega$. All we do to renormalize is scale until this event is probability 1.

If we rotate until some probabilities are negative, and then renormalize in this way, the negative probabilities stay negative, but rescale.
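A quick numerical sketch of this first renormalization, assuming events are vectors (P(E), Q(E)) and using made-up numbers (`rotate` is a hypothetical helper). The angle is chosen large enough that one event crosses the vertical axis, so its first coordinate becomes negative.

```python
import math

def rotate(vec, theta):
    """Rotate a 2D vector counterclockwise by theta radians."""
    p, q = vec
    return (p * math.cos(theta) - q * math.sin(theta),
            p * math.sin(theta) + q * math.cos(theta))

# Made-up vectors (P(E), Q(E)); "a" and "b" partition "omega".
events = {"a": (0.1, 0.4), "b": (0.9, -0.4), "omega": (1.0, 0.0)}

theta = math.radians(30)   # large enough to push "a" past the vertical axis
rotated = {name: rotate(vec, theta) for name, vec in events.items()}

# First renormalization: scale everything so that omega has probability 1.
scale = 1 / rotated["omega"][0]
renormalized = {name: (p * scale, q * scale) for name, (p, q) in rotated.items()}
```

In this example "a" keeps a negative first coordinate after rescaling (about -0.13), and "b" ends up slightly above 1 so that the two still sum to the value assigned to omega.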

The second way to renormalize is to choose a separating line, and use its normal vector as probability. This keeps probability positive. Then we find the highest probability event as before, and call this probability 1.

Trying to picture this, an obvious question is: can the highest probability event change when we rotate?

[-]cousin_it7y20

I can't make sense of the part with R-world and L-world. You assign probabilities to your possible actions (by what rule?) then do arithmetic on them to decide which action to take (why does that depend on probabilities of actions?) then rotate the picture and find that actions are correlated with hidden facts (how can such correlation happen?) It looks like this metaphor doesn't work very well for decision-making, or we're using it wrong.

[-]abramdemski7y10

Well... I agree with all of the "that's peculiar" implications there. To answer your question:

The assignment of probabilities to actions doesn't influence the final decision here. We just need to assign probabilities to everything. They could be anything, and the decision would come out the same.

The magic correlation is definitely weird. Before I worked out an example for this post, I thought I had a rough idea of what Jeffrey-Bolker rotation does to the probabilities and utilities, but I was wrong.

I see the epistemic status of this as "counterintuitive fact" rather than "using the metaphor wrong". The vector-valued measure is just a way to visualize it. You can set up axioms in which the Jeffrey-Bolker rotation is impossible (like the Savage axioms), but in my opinion they're cheating to rule it out. In any case, this weirdness clearly follows from the Jeffrey-Bolker axioms of decision theory.
