History of the Development of Logical Induction

Scott Garrabrant

I have been asked several times about how the development of logical induction happened, so I am writing it up.

June 2013 - I write my first Less Wrong Post. It may not seem very related to logical uncertainty, but it was in my head. I wanted to know how to take different probabilities for the same event and aggregate them, so I could take an agent that could maintain probabilities on facts even when facts that it originally thought were separate turn out to be logically equivalent. (I am not sure if all this was in my head at the time, but it was at some point over the next year.)

I make a proposal for a distribution on completions of a theory: repeatedly observe that a set of sentences whose probabilities should sum to one fail to sum to one, and shift their probabilities in a way inspired from the above post. I do not actually prove that this process converges, but I conjecture that it converges to the distribution I describe here. (This post is from when I wrote it up in June 2014; I can't remember exactly what parts I already had at the time.)

December 2013 - I tell my proposal to Abram Demski, who at some point says that he thinks it is either wrong or equivalent to his proposal for the same problem. (He was right and his proposal was better.) At this point I got very lucky; when I told this to Abram, I thought he was just a smart person at my local Less Wrong meet up, and it turned out that he was almost the only person to also try to do the thing I was doing. Abram and I start talking about my proposal a bunch together, and start trying to prove the above conjecture.

April 2014 - Abram and I start the first MIRIx to think about logical uncertainty, and especially this proposal. I at the time had not contacted MIRI, even to apply for a workshop, because I was dumb.

At some point we realize that the proposal is bad. The thing that makes us give up on it is the fact that sometimes observing that $A \to B$ can drastically decrease your probability for $B$ .

August 2014 - Abram and I go to MIRI to talk about logical uncertainty with Nate, Benya, Eliezer, and Paul. We share the thing we were thinking about, even though we had given up on it at the time. At some point in there, we talk about assigning probability $1 / 10$ to a sufficiently late digit of $π$ being 0.

Soon after that, I propose a new project for our MIRIxLA group to work on, which I call the Benford Test. I wanted an algorithm which on day $n$ , after thinking for some function of $n$ time, assigned probabilities to the $n$ th logical sentence in some enumeration of logical sentences. If I took a subsequence of logical sentences whose truth values appeared pseudorandom to anything that ran sufficiently quickly, I wanted the algorithm to converge to the correct probability on that subsequence. I.e., it should assign probability $l o g_{10} (2)$ to the first digit of Ackerman( $n$ ) being 1. The Benford thing was to make it concrete; I was thinking about it as pseudorandomness. There are a bunch of ugly things about the way I originally propose the problem. For example, I only assign a probability to one sentence on each day. We think about this quite a bit over the next 6 months and repeatedly fail to find anything that passes the Benford test.

March 2015 - I eventually find an algorithm that passes the Benford Test, but it is really hacky and ugly. I know writing it up is going to be a chore, so I decide to ask Luke if I can go to MIRI for a summer job and work on turning it into a paper. I become a MIRI contractor instead.

May 2015 - I go to my first MIRI workshop. During the workshop, there is reserved time for writing blog posts for agentfoundations.org. I start making writing blog posts a major part of my motivation system while doing research and writing the paper.

At some point in there, I notice that the algorithm we use to pass the Benford test does not obviously converge even in the probabilities that it assigns to a single sentence. I change the framework slightly and make it cleaner so that the Benford test algorithm can be seen in the same type of framework as the Demski prior. Now I have two algorithms. One can do well on pseudorandom sentences, and the other converges to a distribution in the limit. I set my next goal to do both at the same time. One hope is that in order to do both of these tasks at the same time, we will have to use something less hacky than we used to solve the Benford test.

July 2015 - I go to the first MIRI summer fellows program, and while there, I make the first noticeable step towards combining limit coherence and the Benford test - I define inductive coherence as a subgoal. This is a strengthening of limit coherence that I believed would bring it closer to having good behaviors before reaching the limit. I do not actually find an algorithm that is inductively coherent, but I have a goal that I think will help. A couple months later, Benya shows that a modification of the Demski prior is inductively coherent.

November 2015 - I make a significantly cleaner algorithm that passes the Benford test, and present it at the Berkeley PDTAI seminar. The cleaner algorithm is based on sleeping experts, which is pretty close to things that are going on in logical induction. It involves a bundle of hypotheses that are assigning probabilities to logical sentences, but are also allowed to sleep and not assign probabilities some days if they don't want to. I also notice that reflective oracle Solomonoff induction can be used to implement some sleeping expert stuff, because the hypotheses have access to the overall distribution, and can not be on the hook on a given day by deferring to the average.

December 2015 - I start working for MIRI full time. One thing I am working on is writing up inductive coherence. At some point around here, I make a stronger version of the Benford test that my algorithm fails that I think is pushing it closer to limit coherence.

January 2016 - I figure out how to do the stronger version of the Benford test. My algorithm uses a lemma that Jessica proved for a separate thing she was doing, so we start to write up a paper on it.

February 2016 - While walking to work, I notice the linchpin for Logical Induction. I notice that the Benford test and limit coherence are both special cases of a sequence of probability assignments with the property that no gambler who can see the probabilities each day, who can choose to bet whenever they want and has a finite starting wealth, can reach infinite returns. There is one type of gambler that can take advantage of a market that fails the Benford test, and another type that can take advantage of a market that is not limit coherent.

I am very excited. I know at the time that I am likely about to solve my goal from roughly the last 10 months. I tell Abram and everyone at the MIRI office. Jessica is actually the first to point out that we can use continuity, similarly to what you do with reflective oracles. The same day, Jessica and I manage to show that you can make a market no continuous gambler can exploit as described, and verify that continuous gamblers are enough to ensure you pass the Benford test and have a coherent limit. At this point the meat of logical induction is done.

Over the next 6 months, we put a lot of effort into making the theory clean, but we didn't really have to make the algorithm better. The very first thing we found that was able to pass the Benford test and have a coherent limit had basically all the properties that you see in the final paper. Over much of this time, we were still not sure if we wanted to make logical induction public.

AI ALIGNMENT FORUM
AF

History of the Development of Logical Induction

31

31