AI ALIGNMENT FORUM
AF

Alex Mennen
Ω182137727
Message
Dialogue
Subscribe

Posts

Sorted by New

Wikitag Contributions

Comments

Sorted by
Newest
2AlexMennen's Shortform
6y
1
the void
Alex Mennen1mo20

This post claims that Anthropic is embarrassingly far behind twitter AI psychologists at skills that are possibly critical to Anthropic's mission. This suggests to me that Anthropic should be trying to recruit from the twitter AI psychologist circle.

Reply
Dequantifying first-order theories
Alex Mennen1y10

Yeah, sorry that was unclear; there's no need for any form of hypercomputation to get an enumeration of the axioms of U. But you need a halting oracle to distinguish between the axioms and non-axioms. If you don't care about distinguishing axioms from non-axioms, but you do want to get an assignment of truthvalues to the atomic formulas Q(i,j) that's consistent with the axioms of U, then that is applying a consistent guessing oracle to U.

Reply
Dequantifying first-order theories
Alex Mennen1y10

I see that when I commented yesterday, I was confused about how you had defined U. You're right that you don't need a consistent guessing oracle to get from U to a completion of U, since the axioms are all atomic propositions, and you can just set the remaining atomic propositions however you want. However, this introduces the problem that getting the axioms of U requires a halting oracle, not just a consistent guessing oracle, since to tell whether something is an axiom, you need to know whether there actually is a proof of a given thing in T.

Reply
Dequantifying first-order theories
Alex Mennen1y10

I think what you proved essentially boils down to the fact that a consistent guessing oracle can be used to compute a completion of any consistent recursively axiomatizable theory. (In fact, it turns out that a consistent guessing oracle can be used to compute a model (in the sense of functions and relations on a set) of any consistent recursively axiomatizable theory; this follows from what you showed and the fact that an oracle for a complete theory can be used to compute a model of that theory.)

I disagree with

Philosophically, what I take from this is that, even if statements in a first-order theory such as Peano arithmetic appear to refer to high levels of the Arithmetic hierarchy, as far as proof theory is concerned, they may as well be referring to a fixed low level of hypercomputation, namely a consistent guessing oracle.

The translation from T to U is computable. The consistent guessing oracle only came in to find a completion of U, but it could also find a completion of T (in fact, a completion of U can be computably translated to a completion of T), so the consistent guessing oracle doesn't really have anything to do with the relationship between T and U.

Reply
Dequantifying first-order theories
Alex Mennen1y30

a consistent guessing oracle rather than a halting oracle (which I theorize to be more powerful than a consistent guessing oracle).

This is correct. Or at least, the claim I'm interpreting this as is that there exist consistent guessing oracles that are strictly weaker than a halting oracle, and that claim is correct. Specifically, it follows from the low basis theorem that there are consistent guessing oracles that are low, meaning that access to a halting oracle makes it possible to tell whether any Turing machine with access to the consistent guessing oracle halts. In contrast, access to a halting oracle does not make it possible to tell whether any Turing machine with access to a halting oracle halts.

Reply
Dequantifying first-order theories
Alex Mennen1y10

I don't understand what relevance the first paragraph is supposed to have to the rest of the post.

Reply
Concave Utility Question
Alex Mennen2y40

Oh, derp. You're right.

Reply
Concave Utility Question
Alex Mennen2y3-2

I think the way I would rule out my counterexample is by strengthening A3 to if A≻B and B≻C then there is p∈(0,1)...

Reply
Concave Utility Question
Answer by Alex MennenApr 15, 2023102

Q2: No. Counterexample: Suppose there's one outcome x such that all lotteries are equally good, except for the lottery than puts probability 1 on x, which is worse than the others.

Reply
Open technical problem: A Quinean proof of Löb's theorem, for an easier cartoon guide
Alex Mennen3y78

It would kind of use assumption 3 inside step 1, but inside the syntax, rather than in the metalanguage. That is, step 1 involves checking that the number encoding "this proof" does in fact encode a proof of C. This can't be done if you never end up proving C.

One thing that might help make clear what's going on is that you can follow the same proof strategy, but replace "this proof" with "the usual proof of Lob's theorem", and get another valid proof of Lob's theorem, that goes like this: Suppose you can prove that []C->C, and let n be the number encoding a proof of C via the usual proof of Lob's theorem. Now we can prove C a different way like so:

  1. n encodes a proof of C.
  2. Therefore []C.
  3. By assumption, []C->C.
  4. Therefore C.

Step 1 can't be correctly made precise if it isn't true that n encodes a proof of C.

Reply
Load More
6Event [Berkeley]: Alignment Collaborator Speed-Meeting
3y
0
19Mapping Out Alignment
5y
0
2AlexMennen's Shortform
6y
1
12When wishful thinking works
7y
1
9Safely and usefully spectating on AIs optimizing over toy worlds
7y
12
15A comment on the IDA-AlphaGoZero metaphor; capabilities versus alignment
7y
0
0Logical uncertainty and mathematical uncertainty
7y
0
0Value learning subproblem: learning goals of simple agents
8y
0
2Being legible to other agents by committing to using weaker reasoning systems
8y
1
1Metamathematics and probability
8y
0
Load More
Computing Overhang
8y
Moral Divergence
12y
(+47/-56)
Consequentialism
12y
(+25)
Consequentialism
12y
(+942/-1879)
Adversarial Collaboration (Dispute Protocol)
12y
(+1911)
Death
12y
(+17)
Cryonics
12y
(+20/-17)
Cryonics
12y
(+17/-20)
Personal Identity
12y
(+3365)
Counterfactual Resiliency
12y
Load More