AI ALIGNMENT FORUM

[ Question ]

Egan's Theorem?

by johnswentworth

2 min read, 13th Sep 2020, 1 answer, 6 comments

Adding Up to Normality, Rationality

Frontpage

When physicists were figuring out quantum mechanics, one of the major constraints was that it had to reproduce classical mechanics in all of the situations where we already knew that classical mechanics works well - i.e. most of the macroscopic world. Likewise for special and general relativity - they had to reproduce Galilean relativity and Newtonian gravity, respectively, in the parameter ranges where those were known to work. Statistical mechanics had to reproduce the fluid theory of heat; Maxwell's equations had to agree with more specific equations governing static electricity, currents, magnetic fields and light under various conditions.
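As one concrete instance of this constraint, here is the standard textbook expansion of relativistic kinetic energy for speeds well below the speed of light. It is not taken from the post; it is only meant to illustrate what "reproduces the old theory in its known range" looks like. The symbols m (mass), v (speed) and c (speed of light) are the usual ones.

```latex
\begin{align}
E_k &= (\gamma - 1)\, m c^2, \qquad \gamma = \frac{1}{\sqrt{1 - v^2/c^2}} \\
\gamma &= 1 + \frac{1}{2}\frac{v^2}{c^2} + \frac{3}{8}\frac{v^4}{c^4} + O\!\left(\frac{v^6}{c^6}\right) \\
E_k &= \frac{1}{2} m v^2 + \frac{3}{8}\frac{m v^4}{c^2} + O\!\left(\frac{m v^6}{c^4}\right)
\end{align}
```

The leading term is the Newtonian kinetic energy, and the first correction is smaller by a factor of order v²/c², which is why classical mechanics kept working for everyday speeds.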

Even if the entire universe undergoes some kind of phase change tomorrow and the macroscopic physical laws change entirely, it would still be true that the old laws did work before the phase change. Any new theory would still have to be consistent with the old laws working, where and when they actually did work.

This is Egan's Law: it all adds up to normality. When new theory/data comes along, the old theories are still just as true as they always were. New models must reproduce the old in all the places where the old models worked; otherwise the new models are incorrect, at least in the places where the old models work and the new models disagree with them.

It really seems like this should be not just a Law, but a Theorem.

I imagine Egan's Theorem would go something like this. We find a certain type of pattern in some data. The pattern is highly unlikely to arise by chance, or allows significant compression of the data, or something along those lines. Then the theorem would say that, in any model of the data, either:

The model has some property (corresponding to the pattern), or
The model is "wrong" or "incomplete" in some sense - e.g. we can construct a strictly better model, or show that the model consistently fails to predict the pattern, or something like that.

The meat of such a theorem would be finding classes of patterns which imply model-properties less trivial than just "the model must predict the pattern" - i.e. patterns which imply properties we actually care about. Structural properties like e.g. (approximate) conditional independencies seem particularly relevant, as well as properties involving abstractions/embedded submodels (in which case the theorem should tell us how to find the abstraction/embedding).
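To make "conditional independence as a pattern" concrete, here is a minimal sketch on a toy distribution. Everything in it is an illustrative assumption rather than anything from the post: Z is a fair bit, X and Y are independent noisy copies of Z, and the helper names (`joint`, `marginal`, and so on) are hypothetical. X and Y are correlated, but the correlation vanishes once Z is known, which is the kind of structural property a model of the data might be required to reflect.

```python
import itertools
import math


def joint(p_flip=0.1):
    """Toy joint p(x, y, z): Z is a fair bit; X and Y each copy Z, flipped with prob p_flip."""
    table = {}
    for x, y, z in itertools.product((0, 1), repeat=3):
        px = 1 - p_flip if x == z else p_flip
        py = 1 - p_flip if y == z else p_flip
        table[(x, y, z)] = 0.5 * px * py
    return table


def marginal(table, keep):
    """Sum out every coordinate not listed in `keep` (indices into the (x, y, z) key)."""
    out = {}
    for key, p in table.items():
        sub = tuple(key[i] for i in keep)
        out[sub] = out.get(sub, 0.0) + p
    return out


def mutual_information(table):
    """I(X;Y) in bits, with Z marginalized out."""
    pxy = marginal(table, (0, 1))
    px = marginal(table, (0,))
    py = marginal(table, (1,))
    return sum(
        p * math.log2(p / (px[(x,)] * py[(y,)]))
        for (x, y), p in pxy.items() if p > 0
    )


def conditional_mutual_information(table):
    """I(X;Y|Z) in bits; zero exactly when X and Y are independent given Z."""
    pz = marginal(table, (2,))
    pxz = marginal(table, (0, 2))
    pyz = marginal(table, (1, 2))
    return sum(
        p * math.log2(p * pz[(z,)] / (pxz[(x, z)] * pyz[(y, z)]))
        for (x, y, z), p in table.items() if p > 0
    )


table = joint()
mi = mutual_information(table)               # clearly positive: X and Y are correlated
cmi = conditional_mutual_information(table)  # zero up to floating-point rounding
print(f"I(X;Y) = {mi:.4f} bits, I(X;Y|Z) = {cmi:.2e} bits")
```

With real data the conditional mutual information would only be approximately zero, so any theorem of the kind proposed would presumably need a tolerance in place of the exact equality used here.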

Does anyone know of theorems like that? Maybe this is equivalent to some standard property in statistics and I'm just overthinking it?

Mentioned in

A Correspondence Theorem in the Maximum Entropy Framework

A Correspondence Theorem

1 Answer

Charlie Steiner

Sep 13, 2020

The answer to the question you actually asked is no, there is no ironclad guarantee of properties continuing, nor any guarantee that there will be a simple mapping between theories. With some effort you can construct some perverse Turing machines with bad behavior.

But the answer to the more generalized question is yes, simple properties can be expected (in a probabilistic sense) to generalize even if the model is incomplete. This is basically Minimum Message Length prediction, which you can put on the theoretical basis of the Solomonoff prior (it's somewhere in Li and Vitanyi - chapter 5?).
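For readers who have not met these terms, here is a rough sketch of the standard definitions being referred to. The answer itself gives no formulas, so the notation is an assumption: U is a universal monotone Turing machine, |p| is the length of program p in bits, and U(p) = x* means the output of U on p begins with the string x.

```latex
% Solomonoff's universal a priori semimeasure: short programs get more weight.
M(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-|p|}

% Predicting the next symbol b after observing x:
M(b \mid x) \;=\; \frac{M(xb)}{M(x)}

% Two-part (MML-style) message length for data D under hypothesis H:
L(D) \;\approx\; \min_{H} \bigl[\, L(H) + L(D \mid H) \,\bigr]
```

On this reading, a simple pattern corresponds to a short program or a short hypothesis, so continuations that preserve the pattern receive most of the probability mass. That is a probabilistic expectation and not a guarantee.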

johnswentworth 4y

"there is no ironclad guarantee of properties continuing"

Properties continuing is not what I'm asking about. The example in the OP is relevant: even if the entire universe undergoes some kind of phase change tomorrow and the macroscopic physical laws change entirely, it would still be true that the old laws did work before the phase change, and any new theory needs to account for that in order to be complete.

"nor any guarantee that there will be a simple mapping between theories"

I do not know of any theorem or counterexample which actually says this. Do you?

Charlie Steiner 4y

If by "account for that" you mean not being in direct conflict with earlier sense data, then sure. All tautologies about the data will continue to be true. Suppose some data can be predicted by classical mechanics with 75% accuracy. This is a tautology given the data itself, and no future theory will somehow make classical mechanics stop giving 75% accurate predictions for that past data.

Maybe that's all you meant?

I'd sort of interpreted you as asking questions about properties of the theory. E.g. "this data is really well explained by the classical mechanics of point particles, therefore any future theory should have a particularly simple relationship to the point particle ontology."

It seems like there shouldn't be a guaranteed relationship that's much simpler than reconstructing the data and recomputing the inferred point particles. I spent a little while trying to phrase this in terms of Turing machines but I don't think I quite managed to capture the spirit.

johnswentworth 4y

Yeah, I'm claiming exactly the opposite of this. When the old theory itself has some simple structure (e.g. classical mechanics), there should be a guaranteed relationship that's much simpler than reconstructing the data and recomputing the inferred point particles.

One possible formulation: if I find that a terabyte of data compresses down to a gigabyte, and then I find a different model which compresses it down to 500MB, there should be a relationship between the two models which can be expressed without expanding out the whole terabyte. (Or, if there isn't such a relationship, that means the two models are capturing different patterns from the data, and there should exist another model which compresses the data more than either by capturing the patterns found by both models.)
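The parenthetical case can be illustrated with a toy example. This is a minimal sketch under loud assumptions: the data, the two "models" and the helper `code_length_bits` are all hypothetical, the models are i.i.d. Bernoulli models, and "compressed size" means the ideal Shannon code length, ignoring the cost of describing the model itself. It only shows the fallback branch (two models capturing different patterns, plus a combined model that beats both); it does not formalize the "relationship" between models.

```python
import math


def code_length_bits(bits, p_one):
    """Ideal Shannon code length of `bits` under an i.i.d. Bernoulli(p_one) model."""
    return sum(-math.log2(p_one if b == 1 else 1.0 - p_one) for b in bits)


n = 1000
# Two independent patterns in the data: stream `a` is 90% ones, stream `b` is 10% ones.
a = [0 if i % 10 == 0 else 1 for i in range(n)]
b = [1 if i % 10 == 3 else 0 for i in range(n)]

# No model: every bit costs one bit.
raw = code_length_bits(a, 0.5) + code_length_bits(b, 0.5)
# Model A captures the bias in `a` only; model B captures the bias in `b` only.
model_a = code_length_bits(a, 0.9) + code_length_bits(b, 0.5)
model_b = code_length_bits(a, 0.5) + code_length_bits(b, 0.1)
# The combined model captures both patterns and beats either one alone.
combined = code_length_bits(a, 0.9) + code_length_bits(b, 0.1)

print(f"raw={raw:.0f}  A={model_a:.0f}  B={model_b:.0f}  combined={combined:.0f} bits")
```

Here the two models share nothing, so the combined model is just their product. The open question in the thread is what can be said when the patterns overlap.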

Charlie Steiner 4y

Right, it's a little tricky to specify exactly what this "relationship" is. Is the notion that you should be able to compress the approximate model, given an oracle for the code of the best one (i.e. that they share pieces)? Because most Turing machines don't compress well, and so it's easy to find counterexamples (the most straightforward class is where the approximate model is already extremely simple).

Anyhow, like I said, hard to capture the spirit of the problem. But when I *do* try to formalize the problem, it tends to not have the property, which is definitely driving my intuition.

johnswentworth 4y

I'd expect Turing machines to be a bad way to model this. They're inherently blackboxy; the only "structure" they make easy to work with is function composition. The sort of structures relevant here don't seem like they'd care much about function boundaries. (This is why I use models like these as my default model of computation these days.)

Anyway, yeah, I'm still not sure what the "relationship" should be, and it's hard to formulate in a way that seems to capture the core idea.

romeostevensit 4y

Sounds similar to Noether's Theorem in some ways when you take that theorem philosophically and not just mathematically.