## AI ALIGNMENT FORUM


I want to go a bit deep here on "maximum entropy" and misunderstandings thereof by the straw-man Humbali character, mostly to clarify things for myself, but also in the hopes that others might find it useful. I make no claim to novelty here—I think all this ground was covered by Jaynes (1968)—but I do have a sense that this perspective (and the measure-theoretic intuition behind it) is not pervasive around here, the way Bayesian updating is.

First, I want to point out that entropy of a probability measure $p$ is only definable relative to a base measure $\mu$, as follows:

$$H_\mu(p) = -\int_X \frac{dp}{d\mu}(x) \, \log \frac{dp}{d\mu}(x) \, d\mu(x)$$

(The derivatives notated here denote Radon-Nikodym derivatives; the integral is Lebesgue.) Shannon's formulae, the discrete $H(p) = -\sum_i p(x_i) \log p(x_i)$ and the continuous $H(p) = -\int_X p(x) \log p(x) \, dx$, are the special cases of this where $\mu$ is assumed to be counting measure or Lebesgue measure, respectively. These formulae actually treat $p$ as having a subtly different type than "probability measure": namely, they treat it as a density with respect to counting measure (a "probability mass function") or a density with respect to Lebesgue measure (a "probability density function"), and implicitly supply the corresponding $\mu$.

If you're familiar with Kullback–Leibler divergence ($D_{\mathrm{KL}}$), and especially if you've heard $D_{\mathrm{KL}}$ called "relative entropy," you may have already surmised that $H_\mu(p) = -D_{\mathrm{KL}}(p \,\|\, \mu)$. Usually, KL divergence is defined with both arguments being probability measures (measures that add up to 1), but that's not required for it to be well-defined (what is required is absolute continuity, which is sort of orthogonal). The principle of "maximum entropy," or $\max_p H_\mu(p)$, is equivalent to $\min_p D_{\mathrm{KL}}(p \,\|\, \mu)$. In the absence of additional constraints on $p$, the solution of this is $p = \mu$, so maximum entropy makes sense as a rule for minimum confidence to exactly the same extent that the implicit base measure $\mu$ makes sense as a prior. The principle of maximum entropy should really be called "the principle of minimum updating", i.e., making a minimum-KL-divergence move from your prior $\mu$ to your posterior $p$ when the posterior is constrained to exactly agree with observed facts. (Standard Bayesian updating can be derived as a special case of this.)
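To make the relationship between entropy, base measure, and KL divergence concrete, here is a minimal sketch on a four-element set. The names `entropy_relative_to`, `kl_divergence`, `counting`, and `prior`, and the specific numbers, are illustrative assumptions and not part of the original argument. It checks that Shannon's discrete formula is the counting-measure case, that entropy relative to a base measure is the negated KL divergence, and that the base measure itself (when normalized) attains the maximum.

```python
import math

def entropy_relative_to(p, mu):
    """Entropy of p relative to base measure mu on a finite set.

    Computes -sum_x p(x) * log(p(x) / mu(x)). Here p is a probability
    mass function and mu is any positive measure (it need not sum to 1).
    Points with p(x) == 0 contribute nothing.
    """
    return -sum(px * math.log(px / mx) for px, mx in zip(p, mu) if px > 0)

def kl_divergence(p, q):
    """D_KL(p || q) on a finite set, with the same conventions."""
    return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0)

# Hypothetical example data, chosen only for illustration.
p = [0.5, 0.25, 0.125, 0.125]
counting = [1.0, 1.0, 1.0, 1.0]   # counting measure, not normalized
prior = [0.7, 0.1, 0.1, 0.1]      # a non-uniform base measure used as a prior

# Shannon's discrete formula, which silently assumes counting measure.
shannon = -sum(px * math.log(px) for px in p)

print("Shannon entropy:            ", shannon)
print("Entropy w.r.t. counting:    ", entropy_relative_to(p, counting))
print("Entropy of p w.r.t. prior:  ", entropy_relative_to(p, prior))
print("-KL(p || prior):            ", -kl_divergence(p, prior))
print("Entropy of prior w.r.t. prior:", entropy_relative_to(prior, prior))
```

Relative to `prior`, the distribution `p` has negative entropy, while `prior` itself scores zero, the maximum. Nothing about `p` changed; only the base measure did.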

Sometimes, the structure of a situation has some symmetry group with respect to which the situation of uncertainty seems to be invariant, with classic examples being relabeling heads/tails on a coin, or arbitrarily permuting a shuffled deck of cards. In these examples, the requirement that a prior be invariant with respect to those symmetries (in Jaynes' terms, the principle of transformation groups) uniquely characterizes counting measure as the only consistent prior (the classical principle of indifference, which still lies at the core of grade-school probability theory). In other cases, like a continuous roulette wheel, other Haar measures (which generalize both counting and Lebesgue measure) are justified. But taking "indifference" or "insufficient reason" to justify using an invariant measure as a prior in an arbitrary situation (as Laplace apparently did) is fraught with difficulties:

1. Most obviously, the invariant measure on $\mathbb{R}$ with respect to translations, namely Lebesgue measure, is an improper prior: it is a non-probability measure because its integral (formally, $\int_{\mathbb{R}} 1 \, dx$) is infinite. If we're talking about forecasting the timing of a future event, $[0, \infty)$ is a very natural space, but $\int_{[0,\infty)} 1 \, dx$ is no less infinite. Discretizing into year-sized buckets doesn't help, since counting measure on $\mathbb{N}$ is also infinite (formally, $\sum_{n \in \mathbb{N}} 1$). In the context of maximum entropy, using an infinite measure for $\mu$ means that there is no maximum-entropy $p$: you can always get more entropy by spreading the probability mass even thinner.
2. But what if we discretize and also impose a nothing-up-my-sleeve outrageous-but-finite upper bound, like the maximum binary64 number at around $1.8 \times 10^{308}$? Counting measure on $\{0, 1, \ldots, N\}$ with $N \approx 1.8 \times 10^{308}$ can be normalized into a probability measure, so what stops that from being a reasonable "unconfident" prior? Sometimes this trick can work, but the deeper issue is that the original symmetry-invariance argument that successfully justifies counting measure for shuffled cards just makes no sense here. If one relabels all the years, say reversing their order, the situation of uncertainty is decidedly not equivalent.
3. Another difficulty with using invariant measures as priors as a general rule is that they are not always uniquely characterized, as in the Bertrand paradox, or the obvious incompatibility between uniform priors (invariant to addition) and log-uniform priors (invariant to multiplication).
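Two of the difficulties above are easy to check numerically. The sketch below is illustrative only: the helper names and the interval endpoints are assumptions, not part of the original comment. It first shows that the entropy of a uniform distribution over $n$ year-sized buckets (relative to counting measure) keeps growing like $\log n$, so no maximum exists on an unbounded range. It then shows that a uniform prior and a log-uniform prior on the same interval assign very different probabilities to the same event.

```python
import math

def uniform_entropy(n):
    """Shannon entropy in nats of the uniform distribution on n buckets,
    i.e. its entropy relative to counting measure."""
    p = 1.0 / n
    return -sum(p * math.log(p) for _ in range(n))

# Spreading mass over more buckets always yields more entropy.
bucket_counts = (10, 100, 1000, 10000)
entropies = [uniform_entropy(n) for n in bucket_counts]
for n, h in zip(bucket_counts, entropies):
    print(f"n = {n:>6}: entropy = {h:.4f}, log(n) = {math.log(n):.4f}")

def uniform_prob(a, b, lo, hi):
    """P(a <= x <= b) under a prior uniform on [lo, hi]."""
    return (b - a) / (hi - lo)

def log_uniform_prob(a, b, lo, hi):
    """P(a <= x <= b) under a prior uniform in log(x) on [lo, hi]."""
    return (math.log(b) - math.log(a)) / (math.log(hi) - math.log(lo))

# Same event, two "indifferent" priors, two different answers.
print("uniform:    ", uniform_prob(1, 10, 1, 100))
print("log-uniform:", log_uniform_prob(1, 10, 1, 100))
```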

I think Humbali's confusion can be partially explained as conflating an invariant measure and a prior—in both directions:

1. First, Humbali implicitly uses a translation-invariant base measure as a prior when he claims as absolute a notion of "entropy" which is actually relative to that particular base measure. Something like this mistake was made by both Laplace and Shannon, so Humbali is in good company here—but already confused, because translation on the time axis is not a symmetry with respect to which forecasts ought to be invariant.
2. Then, when cornered about a particular absurd prediction that inevitably arises from the first mistake, Humbali implicitly uses his (socially-driven) prior as a base measure, when he says "somebody with a wide probability distribution over AGI arrival spread over the next century, with a median in 30 years, is in realistic terms about as uncertain as anybody could possibly be." Assuming he's still using "entropy" at all as the barometer of virtuous unconfidence, he's now saying that the way to fix the absurd conclusions of maximum-entropy relative to Lebesgue measure is that one really ought to measure unconfidence with respect to a socially-adjusted "base rate" measure, which just happens to be his own prior. (I think the lexical overlap between "base rate" and "base measure" is not a coincidence.) This second position is more in bad faith than the first because it still has the bluster of objectivity without any grounding at all, but it has more hope of formal coherence: one can imagine a system of collectively navigating uncertainty where publicly maintaining one's own epistemic negentropy, explicitly relative to some kind of social median, comes at a cost (e.g. hypothetically or literally wagering with others).

There is a bit of motte-and-bailey uncovered by the bad faith in position 2. Humbali all along primarily wants to defend his prior as unquestionably reasonable (the bailey), and when he brings up "maximum entropy" in the first place, he's retreating to the motte of Lebesgue measure, which seems to have a formidable air of mathematical objectivity about it. Indeed, by its lights, Humbali's own prior does happen to have more entropy than Eliezer's, though Lebesgue measure fails to support the full bailey of Humbali's actual prior. However, in this case even the motte is not defensible, since Lebesgue measure is an improper prior and the translation-invariance that might justify it simply has no relevance in this context.

Meta: any feedback about how best to make use of the channels here (commenting, shortform, posting, perhaps others I'm not aware of) is very welcome; I'm new to actually contributing content on AF.

Ha, I was just about to write this post. To add something, I think you can justify the uniform measure on bounded intervals of reals (for illustration purposes, say $[0,1]$) by the following argument: "Measuring a real number $x \in [0,1]$" is obviously simply impossible if interpreted literally, since the result would contain an infinite amount of data. Instead this is supposed to be some sort of idealization of a situation where you can observe "as many bits as you want" of the binary expansion of the number (choosing another base gives the same measure). If you now apply the principle of indifference to each measured bit, you're left with Lebesgue measure.
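A quick simulation of this argument, as a rough sketch: treat each binary digit as an independent fair coin flip and check that the resulting numbers land in subintervals of $[0,1]$ in proportion to their length. The function name, the 32-bit truncation, the seed, and the sample size are all assumptions made for illustration.

```python
import random

def number_from_fair_bits(rng, n_bits=32):
    """Build x in [0, 1) from n_bits independent fair coin flips,
    read as the leading binary digits of x."""
    x = 0.0
    for k in range(1, n_bits + 1):
        x += rng.getrandbits(1) * 2.0 ** -k
    return x

rng = random.Random(0)  # fixed seed so the run is reproducible
samples = [number_from_fair_bits(rng) for _ in range(20000)]

# Under Lebesgue measure, an interval's probability equals its length.
frac_below_quarter = sum(s < 0.25 for s in samples) / len(samples)
frac_in_middle = sum(0.4 <= s < 0.7 for s in samples) / len(samples)

print("fraction in [0, 0.25):  ", frac_below_quarter)   # close to 0.25
print("fraction in [0.4, 0.7): ", frac_in_middle)       # close to 0.30
```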

It's not clear that there's a "right" way to apply this type of thinking to produce "the correct" prior on $[0, \infty)$ (or $\mathbb{R}$, or any other non-compact space).

Given any particular admissible representation of a topological space, I do agree you can generate a Borel probability measure by pushing forward the Haar measure of the digit-string space (considered as a countable product of copies of $\{0, \ldots, b-1\}$, considered as a group with the modular-arithmetic structure of $\mathbb{Z}/b\mathbb{Z}$) along the representation. This construction is studied in detail in (Mislove, 2015).

But, actually, the representation itself (in this case, the Cantor map) smuggles in Lebesgue measure, because each digit happens to cut the target space "in half" according to Lebesgue measure. If I postcompose some other continuous bijection of $[0,1]$ onto itself after the Cantor map, that is also an admissible representation of $[0,1]$, but it no longer induces Lebesgue measure. This works for any continuous bijection, so any absolutely continuous probability measure on $[0,1]$ can be induced by such a representation. In fact, this is why the inverse-CDF algorithm for drawing samples from arbitrary distributions, given only uniform random bits, works.
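Here is a small sketch of that point. The same fair bits are used throughout, and only the map applied after decoding them changes. The square-root map and the exponential inverse CDF are illustrative choices of mine, not taken from the comment above, and the helper names, seed, and sample size are likewise assumptions.

```python
import math
import random

def uniform_from_bits(rng, n_bits=32):
    """Decode n_bits fair coin flips as the binary expansion of u in [0, 1)."""
    u = 0.0
    for k in range(1, n_bits + 1):
        u += rng.getrandbits(1) * 2.0 ** -k
    return u

def sample_via_inverse_cdf(rng, inverse_cdf):
    """Postcompose a map after the bit decoding. When the map is the
    inverse CDF of a distribution, the output follows that distribution."""
    return inverse_cdf(uniform_from_bits(rng))

rng = random.Random(1)  # fixed seed so the run is reproducible
n = 20000

# sqrt is the inverse CDF of F(t) = t**2 on [0, 1]: still a valid
# representation of [0, 1], but the induced measure is no longer Lebesgue.
sqrt_samples = [sample_via_inverse_cdf(rng, math.sqrt) for _ in range(n)]
frac_sqrt_below_half = sum(s < 0.5 for s in sqrt_samples) / n

# -log(1 - u) is the inverse CDF of the rate-1 exponential distribution,
# which shows the same trick reaching a non-compact space.
exp_samples = [
    sample_via_inverse_cdf(rng, lambda u: -math.log(1.0 - u)) for _ in range(n)
]
exp_mean = sum(exp_samples) / n

print("P(sqrt(u) < 0.5):", frac_sqrt_below_half)  # near 0.25, not 0.5
print("exponential mean:", exp_mean)              # near 1.0
```

Which distribution comes out is decided entirely by the choice of map, which is the sense in which the representation smuggles in the measure.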

That being said, you can apply this to non-compact spaces. I could get a probability measure on $[0, \infty)$ via a decimal representation, where, say, the number of leading zeros encodes the exponent in unary and the rest is the mantissa. [Edit: I didn't think this through all the way, and it can only represent real numbers $\geq 1$. No big obstacle; post-compose $\log$.] The reason there doesn't seem to be a "correct" way to do so is that, because there's no Haar probability measure on non-compact spaces (at least, usually?), there's no digit representation that happens to cut up the space "evenly" according to such a canonical probability measure.

Among computational constraints, I think the most significant/fundamental are, in order,

1. Semicomputability
2. P (polynomial time)
3. PSPACE
4. Computability
5. BPP
6. NP
7. (first-order hypercomputable)
8. All the rest (BQP, PP, RP, EXPTIME, etc)