Simplicity priors with reflective oracles

Benya_Fallenstein

Previously, I described a type of probabilistic oracle machines that I think is powerful enough to allow the implementation of a variant of AIXI that can reason about worlds containing other instances of this variant of AIXI. In this post, I describe how to implement a simplicity prior (a variant of Solomonoff induction) in this framework; there are some technical issues which will also come up when implementing AIXI that are easier to talk about in the context of Solomonoff-like induction.

Solomonoff induction is a prior probability distribution over infinite bit strings, ${0, 1}^{ω}$ . We want to define a similar prior using probabilistic oracle machines with reflective oracles instead of Turing machines---and then "implement" that prior as a probabilistic oracle machine with that oracle.

The basic idea is simple. Say that a probabilistic oracle machine $H$ is a hypothesis if, for every oracle $O^{'} : N \times (Q \cap [0, 1]) \to [0, 1]$ , invoking $H [O^{'}]$ almost surely outputs an infinite bit string. Fix some programming language for probabilistic oracle machines, and randomly choose a program implementing a hypothesis $H$ , where the probability of each program of length $ℓ$ is proportional to $2^{- ℓ}$ . Then run $H [O]$ , where $O$ is our actual, reflective oracle, to obtain a sample bit string. This concludes the definition of our prior.

This also suggests how we could implement this prior, as a probabilistic oracle machine $S$ that samples bit strings with these probabilities; it seems like all we need is a way to test whether or not a given program implements a hypothesis. I'll tackle that problem later in this post, but first let me point out a different problem: simply running the hypothesis $H$ that we've sampled in the first step is not an ideal way to implement the prior.

Suppose that we want to determine the probability that the output of $S$ starts with $0111010$ . Well, here's an idea: why don't we use our reflective oracle?

To do so, we need to write a query $Q$ which returns $1$ if the output of $S$ has the right prefix, $0$ otherwise. Easy, or so it seems; run $S$ until it has outputted the first seven bits, check whether they're $0111010$ , return $0$ or $1$ accordingly. Then we can call our oracle on $(┌ Q ┐, p)$ , for any $p \in Q \cap [0, 1]$ , to check whether the probability of the prefix $0111010$ is greater or smaller than $p$ .

But a requirement for a machine $Q$ to be a query is that it $Q [O^{'}]$ halts with probability one, for every oracle $O^{'}$ . In the implementation of $S$ given above, any way we might find to check whether an arbitrary program $H$ is a hypothesis will certainly involve a use of the reflective oracle, which means that if we run $S$ on an arbitrary oracle $O^{'}$ , we might end up choosing a "hypothesis" which loops forever without producing any output. Thus, the machine described in the previous paragraph also has a positive probability of looping forever when run on this bad oracle, which means that it isn't a query.

To fix this, let's change the definition of $S$ in such a way as to turn it into a hypothesis (i.e., into a machine that almost surely outputs an infinite bitstring, no matter what oracle it's run on). Then the $Q$ described above will be a query, and we can use this method to compute the probabilities of arbitrary finite prefixes.

To do so, we need to make sure that once we sampled an $H$ , we definitely output an infinite bitstring, no matter what our oracle is and whether or not $H$ is actually a hypothesis. Of course, if $H$ is a hypothesis and we're running on a reflective oracle, then we want to output a sample from $H [O]$ (with the correct distribution).

We can do this---with a simple trick. Let's start by letting $Q$ be the machine which runs $H$ until it outputs its first bit, and then returns that bit. If $H$ is a hypothesis, then $Q$ is a query.

So we call our oracle on the pair $(┌ Q ┐, 0.5)$ , and throw a fair coin.

If the coin comes up heads and the oracle says "false" (the probability of $Q$ returning $1$ is smaller than 0.5), we output a zero.
If the coin lands heads and the oracle says "true", we output a one.
If the coin lands tails and the oracle says "false", we call our oracle on $(┌ Q ┐, 0.25)$ and repeat the process; if the coin lands tails and the oracle says "true", we call the oracle on $(┌ Q ┐, 0.75)$ and repeat.

This way, if we're running on a reflective oracle, we output a bit with the same distribution as the first bit outputted by $H$ ; but no matter what oracle we're running on, we definitely output a bit with probability one (because in each step of the binary search described above, we have a 50% probability of producing output).

This technique generalizes beyond the particular query we were talking about above:

Lemma (Protecting queries). There is a computable function $f (┌ Q ┐)$ which takes the Gödel number of a probabilistic oracle machine and returns the Gödel number of a different probabilistic oracle machine, such that (i) $f (┌ Q ┐)$ is a query, and (ii) if $Q$ is also a query, and $O$ is a reflective oracle, then $f (┌ Q ┐) [O]$ has the same distribution as $Q [O]$ .

This gives us the first bit. Now we want to produce a second bit of output, whose probability distribution is the same as the distribution of the second bit of $H$ ---conditional on $H$ having produced the first bit that we just produced.

Here's a simple way to do so. Suppose that the first bit was a $1$ . Let $Q$ now be the following machine: Run $H$ until it produces its first bit. If that bit is a $0$ , start over. If the bit is a $1$ , run it until it produces its second bit, and return that bit.

If $H$ is a hypothesis which has a positive probability of producing a $1$ as its first bit, then $Q$ is a query. Assuming this is safe if we're running on a reflective oracle, since in this case the probabability that we choose $1$ as our first bit is equal to the probability that $H$ chooses $1$ as its first bit.

Now we apply the above lemma about queries, with this new $Q$ . This gives us an actual query, which we can just run since it is guaranteed to halt, and which has the correct distribution if we're running on a reflective oracle.

We can use exactly the same procedure for each of the later bits, leading to a lemma exactly analogous to the earlier one for queries:

Lemma (Protecting hypotheses). There is a computable function $f (┌ H ┐)$ which takes the Gödel number of a probabilistic oracle machine and returns the Gödel number of a different probabilistic oracle machine, such that (i) $f (┌ H ┐)$ is a hypothesis, and (ii) if $H$ is also a hypothesis, and $O$ is a reflective oracle, then $f (┌ H ┐) [O]$ has the same distribution as $H [O]$ .

Given this result, the main missing piece is how to test whether a given machine $┌ H ┐$ is a hypothesis. We need this test to give the correct result if we're running on a reflective oracle, and we also need it to halt no matter which oracle we're running on---in other words, we need it to be a query.

It turns out that this is quite straight-forward to implement, given our definition of reflective oracles as always returning "false" when passed anything other than a query. Given the source code $┌ H ┐$ of a machine that may or may not be a hypothesis, we first implement the following machine, $M$ ; $M$ has the property that it is a query if and only if $H$ is a hypothesis.

$M$ first chooses a natural number, $n$ , according to any distribution that places positive probability on arbitrary large numbers.
Then, it runs $H$ until $H$ has outputted $n$ bits.
If $H$ halts before it has outputted $n$ bits, then $M$ loops forever. (If $H$ goes into an infinite loop before it has outputted $n$ bits, then $M$ does of course also loop forever.)
If $H$ does in fact output $n$ bits, then $M$ outputs $1$ and halts.

Our test is now to call our oracle on the pair $(┌ M ┐, 0)$ . Since this is only a single invocation of the oracle, it always halts. Moreover, if we're running on a reflective oracle, then there are two cases:

If $H$ is a hypothesis, then $M$ halts with probability one; whatever number $M$ selects in the first step, $H$ eventually outputs that many bits. In this case, $M$ is a query that always outputs $1$ , implying that invoking the oracle on $(┌ M ┐, 0)$ will return "true".
If $H$ is not a hypothesis, there is a positive probability that it will halt or loop forever after outputting only some finite number of bits, and there is a positive probability that $M$ will choose a number larger than this. Hence, there is a positive probability that $M$ loops forever, which implies that it is not a query, and hence that invoking the oracle on $(┌ M ┐, 0)$ will return "false".

We now have a query to test whether something is a hypothesis, and way of safely running a potential hypothesis, which will not go into infinite loops even if we're not running on a reflective oracle. Are we done?

Unfortunately not quite; there's still an annoying problem remaining, and so far I only have a slightly kludgy solution to it. The problem is that we would like our prior to, with probability one, choose a $┌ H ┐$ that actually is a hypothesis, and to do so it can't use the obvious method of trying random machines until finds one for which the query returns true.

Why not? Well, our query consists of a single oracle invocation---and if we're not running on a reflective oracle, all instances of that oracle invocation might return "false", in which case we would loop forever. And that would mean that our sampler $S$ would still not be a hypothesis, whose distribution we can probe with a reflective oracle.

My kludge is simple: Select only a single machine $┌ H ┐$ at random. If our test tells us that this isn't a hypothesis, then output an infinite stream of zeros. If our test tells us it is a hypothesis, then output a single $1$ before executing the protected version of $H$ . Then $S$ is a hypothesis, and we can use the tools detailed at the beginning of this post to figure out its distribution; and if we can find the probability of arbitrary finite prefixes of $S$ 's output, we can also compute the conditional probability, given that the first bit is a $1$ . This gives us the probability distribution we were initially looking for.

There's probably something more elegant, but I haven't yet been able to come up with an alternative that's both elegant and simple.