Andrew Critch

This is Dr. Andrew Critch's professional LessWrong account. Andrew is the CEO of Encultured AI, and works for ~1 day/week as a Research Scientist at the Center for Human-Compatible AI (CHAI) at UC Berkeley. He also spends around a ½ day per week volunteering for other projects like the Berkeley Existential Risk initiative and the Survival and Flourishing Fund. Andrew earned his Ph.D. in mathematics at UC Berkeley studying applications of algebraic geometry to machine learning models. During that time, he cofounded the Center for Applied Rationality and SPARC. Dr. Critch has been offered university faculty and research positions in mathematics, mathematical biosciences, and philosophy, worked as an algorithmic stock trader at Jane Street Capital’s New York City office, and as a Research Fellow at the Machine Intelligence Research Institute. His current research interests include logical uncertainty, open source game theory, and mitigating race dynamics between companies and nations in AI development.

Thanks! Added a note to the OP explaining that hereby means "by this utterance".

Hat tip to Ben Pace for pointing out that invitations are often self-referential, such as when people say "You are hereby invited", because "hereby" means "by this utterance":

https://www.lesswrong.com/posts/rrpnEDpLPxsmmsLzs/open-technical-problem-a-quinean-proof-of-loeb-s-theorem-for?commentId=CFvfaWGzJjnMP8FCa

That comment was like 25% of my inspiration for this post :)

I've now fleshed out the notation section to elaborate on this a bit. Is it better now?

In short, is our symbol for talking about what PA can prove, and is shorthand for PA's symbols for talking about what (a copy of) PA can prove.

- " 1+1=2" means "Peano Arithmetic (PA) can prove that 1+1=2". No parentheses are needed; the "" applies to the whole line that follows it. Also, does not stand for an expression in PA; it's a symbol we use to talk about what PA can prove.
- "" basically means the same thing. More precisely, it stands for a numerical expression
within PAthat can be translated as saying " 1+1=2". This translation is possible because of something called a Gödel numbering which allows PA to talk about a (numerically encoded version of) itself.- " " is short for "" in cases where "" is just a single character of text.
- "" means "PA, along with X as an additional axiom/assumption, can prove Y". In this post we don't have any analogous notation for .

Well, the deduction theorem is a fact about PA (and, propositional logic), so it's okay to use as long as means "PA can prove".

But you're right that it doesn't mix seamlessly with the (outer) necessitation rule. Necessitation is a property of "", but not generally a property of "". When PA can prove something, it can prove that it can prove it. By contrast, if PA+X can prove Y, that *does* mean that PA can prove that PA+X can prove Y (because PA alone can work through proofs in a Gödel encoding), but it *doesn't* mean that PA+X can prove that PA can prove Y. This can be seen by example, by setting ".

As for the case where you want to refer to K or S5 instead of PA provability, those logics are still built on propositional logic, for which the deduction theorem does hold. So if you do the deduction only using propositional logic from theorems in along with an additional assumption X, then the deduction theorem applies. In particular, inner necessitation and box distributivity are both theorems of for every and you stick into them (rather than meta theorems about , which is what necessitation is). So the application of the deduction theorem here is still valid.

Still, the deduction theorem isn't safe to just use willy nilly along with the (outer) necessitation rule, so I've just added a caveat about that:

Note that from we

cannotconclude , because still means "PA can prove", and not "PA+X can prove".

Thanks for calling this out.

Well, is just short for , i.e., "(not A) or B". By contrast, means that there exists a sequence of (very mechanical) applications of modus ponens, starting from the axioms of Peano Arithmetic (PA) with appended, ending in . We tried hard to make the rules of so that it would agree with in a lot of cases (i.e., we tried to design to make the deduction theorem true), but it took a lot of work in the design of Peano Arithmetic and can't be taken for granted.

For instance, consider the statement . If you believe Peano Arithmetic is consistent, then you believe that , and therefore you also believe that . But PA cannot prove that (by Gödel's Theorem, or Löb's theorem with ), so we *don't* have .

It's true that the deduction theorem is not needed, as in the Wikipedia proof. I just like using the deduction theorem because I find it intuitive (assume , prove , then drop the assumption and conclude ) and it removes the need for lots of parentheses everywhere.

I'll add a note about the meaning of so folks don't need to look it up, thanks for the feedback!

I agree this is a big factor, and might be the main pathway through which people end up believing what people believe the believe. If I had to guess, I'd guess you're right.

E.g., if there's a evidence E in favor of H and evidence E' against H, if the group is really into thinking about and talking about E as a topic, then the group will probably end up believing H too much.

I think it would be great if you or someone wrote a post about this (or whatever you meant by your comment) and pointed to some examples. I think the LessWrong community is somewhat plagued by attentional bias leading to collective epistemic blind spots. (Not necessarily more than other communities; just different blind spots.)

Ah, thanks for the correction! I've removed that statement about "integrity for consequentialists" now.

Thanks for raising this! I assume you're talking about this part?

They explore a pretty interesting set-up, but they don't avoid the narrowly-self-referential sentence Ψ:

So, I don't think their motivation was the same as mine. For me, the point of trying to use a quine is to try to get away from that sentence, to create a different perspective on the foundations for people that find that kind of sentence confusing, but who find self-referential *documents* less confusing. I added a section "**Further meta-motivation (added Nov 26)" **about this in the OP.

Based on a potential misreading of this post, I added the following caveat today:

Important Caveat:Arguments in natural language are basically never "theorems". The main reason is that human thinking isn't perfectly rational in virtually any precisely defined sense, so sometimes the hypotheses of an argument can hold while its conclusion remains unconvincing. Thus, the Löbian argument pattern of this postdoes not constitute a "theorem"about real-world humans: even when the hypotheses of the argument hold, the argument will not always play out like clockwork in the minds of real people. Nonetheless, Löb's-Theorem-like argumentscanplay out relatively simply in the English language, and this post shows what would look like.