Eigil Rischel — AI Alignment Forum

AI ALIGNMENT FORUM
AF

Clarifying the Agent-Like Structure Problem

I mean, "is a large part of the state space" is basically what "high entropy" means!

For case 3, I think the right way to rule out this counterexample is the probabilistic criterion discussed by John: the vast majority of initial states for your computer don't include a zero-day exploit and a script to automatically deploy it. The only way to make this likely is to include you programming your computer in the picture, and of course you do have a world model (without which you could not have programmed your computer).

davidad's Shortform

Eigil Rischel 4y 10

Ha, I was just about to write this post. To add something, I think you can justify the uniform measure on bounded intervals of reals (for illustration purposes, say $[0, 1]$) by the following argument: "Measuring a real number $x \in [0, 1]$" is obviously impossible if interpreted literally, since it would involve an infinite amount of data. Instead, this is supposed to be some sort of idealization of a situation where you can observe "as many bits as you want" of the binary expansion of the number (choosing another base gives the same measure). If you now apply the principle of indifference to each measured bit, you're left with Lebesgue measure.

It's not clear that there's a "right" way to apply this type of thinking to produce "the correct" prior on $\mathbb{N}$ (or $\mathbb{R}$, or any other non-compact space).
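The bit-by-bit argument for $[0, 1]$ can be checked numerically. Fixing the first $n$ bits pins $x$ down to a dyadic interval of length $2^{-n}$, and $n$ independent fair bits give that interval probability $2^{-n}$, which is its Lebesgue measure. Here is a minimal sketch, assuming we truncate at a finite number of bits; the function names `sample_from_fair_bits` and `empirical_interval_mass` are made up for illustration.

```python
import random


def sample_from_fair_bits(n_bits, rng):
    """Build the dyadic rational 0.b1 b2 ... bn (base 2) from n fair bits.

    Each bit is uniform on {0, 1}: the principle of indifference
    applied to every measured bit.
    """
    bits = rng.getrandbits(n_bits)  # n_bits independent fair bits
    return bits / 2 ** n_bits       # value in [0, 1)


def empirical_interval_mass(a, b, n_bits=32, n_samples=100_000, seed=0):
    """Estimate the probability that the sampled number lands in [a, b)."""
    rng = random.Random(seed)
    hits = sum(a <= sample_from_fair_bits(n_bits, rng) < b
               for _ in range(n_samples))
    return hits / n_samples


if __name__ == "__main__":
    # The empirical mass of an interval should be close to its length.
    for a, b in [(0.0, 0.5), (0.25, 0.3), (0.6, 1.0)]:
        mass = empirical_interval_mass(a, b)
        print(f"[{a}, {b}): empirical {mass:.3f}, length {b - a:.3f}")
```

The estimates should approach the interval lengths as `n_samples` grows. Nothing analogous is available for $\mathbb{N}$ or $\mathbb{R}$, since there is no finite list of equally sized cells to be indifferent over.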

Biextensional Equivalence

Eigil Rischel 5y 20

But then shouldn't there be a natural biextensional equivalence $C \to \hat{C}$? Suppose $C = (A, E, \star)$, and denote $\hat{C} = (\hat{A}, \hat{E}, \star)$. Then the map $A \to \hat{A}$ is clear enough: it's simply the quotient map. But there's not a unique map $\hat{E} \to E$; any section of the quotient map will do, and it doesn't seem we can make this choice naturally.

I think maybe the subcategory of just "agent-extensional" frames is reflective, and then the subcategory of "environment-extensional" frames is coreflective. And there's a canonical (i.e. natural) zig-zag $C \to (\hat{A}, E, \star) \leftarrow \hat{C}$.
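To make the non-uniqueness concrete, here is a small sketch for a finite frame written as a matrix (rows are agents, columns are environments). It assumes the usual definition of a morphism of Cartesian frames $(g, h) : (A, E, \cdot) \to (B, F, \star)$ with $g : A \to B$, $h : F \to E$ and $g(a) \star f = a \cdot h(f)$. The helper names `collapse`, `sections`, and `is_morphism` are made up for illustration, and the matrix is assumed to be non-empty.

```python
from itertools import product


def collapse(matrix):
    """Identify equal rows and equal columns of a non-empty matrix.

    Returns (collapsed, row_class, col_class), where row_class[i] is the
    collapsed row that original row i maps to (the quotient A -> A-hat),
    and col_class[j] plays the same role for columns (E -> E-hat).
    """
    row_keys, row_class = [], []
    for row in matrix:
        key = tuple(row)
        if key not in row_keys:
            row_keys.append(key)
        row_class.append(row_keys.index(key))

    col_keys, col_class = [], []
    for col in zip(*matrix):
        if col not in col_keys:
            col_keys.append(col)
        col_class.append(col_keys.index(col))

    # Any representative works: equal rows/columns give equal entries.
    row_rep = [row_class.index(k) for k in range(len(row_keys))]
    col_rep = [col_class.index(k) for k in range(len(col_keys))]
    collapsed = [[matrix[i][j] for j in col_rep] for i in row_rep]
    return collapsed, row_class, col_class


def sections(class_of):
    """All sections of a quotient map: one representative per class."""
    n_classes = max(class_of) + 1
    fibers = [[i for i, c in enumerate(class_of) if c == k]
              for k in range(n_classes)]
    return list(product(*fibers))


def is_morphism(matrix, collapsed, g, h):
    """Check g(a) * f == a . h(f) for every agent a and collapsed env f."""
    return all(collapsed[g[a]][f] == matrix[a][h[f]]
               for a in range(len(matrix))
               for f in range(len(collapsed[0])))


# Hypothetical example: rows 0 and 1 coincide, columns 0 and 1 coincide.
EXAMPLE = [
    [0, 0, 1],
    [0, 0, 1],
    [1, 1, 0],
]

if __name__ == "__main__":
    collapsed, row_class, col_class = collapse(EXAMPLE)
    for h in sections(col_class):
        print(h, is_morphism(EXAMPLE, collapsed, row_class, h))
```

In this example the quotient on agents is forced, while each section of the quotient on environments gives a different but equally valid morphism into the collapse, which is the choice that does not look natural.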

Biextensional Equivalence

Eigil Rischel 5y 10

Does the biextensional collapse satisfy a universal property? There doesn't seem to be an obvious map either $C \to \hat{C}$ or $\hat{C} \to C$ (in each case one of the arrows is going the wrong way), but maybe there's some other way to make it universal?
