A rough and incomplete review of some of John Wentworth's research

habryka (3y)

Perhaps I've simply been misreading John, and he's been intending to say "I have some beliefs, and separately I have some suggestive technical results, and they feel kinda related to me! Which is not to say that any onlooker is supposed to be able to read the technical results and then be persuaded of any of my claims; but it feels promising and exciting to me!".

For what it's worth, I ask John about once every month or two about his research progress, and his answer has so far been (paraphrased) "I think I am making progress. I don't think I have anything to show you that would definitely convince you of my progress, which is fine because this is a preparadigmatic field. I could give you some high-level summaries or we could try to dive into the math, though I don't think I have anything super robust in the math so far, though I do think I have interesting approaches."

You might have had a totally different experience, but my epistemic state so far has definitely been that John's math was in the "trying to find remotely reasonable definitions with tenuous connection of formalism to reality" stage, and not the "I have actually demonstrated robust connection of math to reality" stage, so I feel very non-misled by John. A good chunk of this impression comes from random short social interactions I've had with John, so someone who engaged more with just his online writing might come away with a different impression (though I've also done that a lot and don't super feel like John has ever tried to sell me in his writing on having super robust math to back things up).

So8res (3y)

John has also made various caveats to me, of the form "this field is pre-paradigmatic and the math is merely suggestive at this point". I feel like he oversold his results even so.

Part of it is that I get the sense that John didn't understand the limitations of his own results--like the fact that the telephone theorem only says anything in the infinite case, and the thing it says then does not (in its current form) arise as a limit of sensible things that can be said in finite cases. Or like the fact that the alleged interesting results of the gKPD theorem are a relatively-shallow consequence of the overly-strong assumption of .

My impression was that I had to go digging into the theorems to see what they said, only to be disappointed by how little resemblance they bore to what I'd heard John imply. (And it sounds to me like Lawrence, Leon, and Erik had a similar experience, although I might be misreading them on account of confirmation bias or w/e.)

I acknowledge that it's tricky to draw a line between "someone has math that they think teaches them something, and is inarticulate about exactly what it teaches" and "someone has math that they don't understand and are overselling". The sort of observation that would push me towards the former end in John's case is stuff like: John being able to gesture more convincingly at ways concepts like "tree" or "window" are related to his conserved-property math even in messy finite cases. I acknowledge that this isn't a super legible distinction and that that's annoying.

(Also, I had the above convos with John >1y ago, and perhaps John simply changed since then.)

Note that I continue to think John's cool for pursuing this particular research direction, and I'd enjoy seeing his math further fleshed out (and with more awareness on John's part of its current limitations). I think there might be interesting results down this path.

johnswentworth (3y)

(Also, I had the above convos with John >1y ago, and perhaps John simply changed since then.)

In hindsight, I do think the period when our discussions took place was a local maximum of (my own estimate of the extent of applicability of my math), partially thanks to your input and partially because I was in the process of digesting a bunch of the technical results we talked about and figuring out the next hurdles. In particular, I definitely underestimated the difficulty of extending the results to finite approximations.

That said, I doubt that fully accounts for the difference in perception.

Insofar as the above feels like a more concise description of why there might be any hope at all in studying natural abstractions, and what those studies might entail, I reiterate that it seems to me like this community has a dearth of distillations. Alternatively, it's plausible to me that John's motivations make more sense to everyone else than they do to me, and/or that my attempts at explanation make no more sense to anybody else than John's.

Analogy: if you know that the sum of two dice is 5, then you know that the first die definitely didn't come up six. This is some "extra" information above and beyond the fact that the average dice-value is 2.5. If instead you know that the sum of two thousand dice is 5000, then you can basically just ignore that "extra" information, and focus only on the average value. And somewhere around here, there's a theorem saying that the extra information goes to zero in the limit.
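
For anyone who wants to poke at the dice analogy numerically, here is a minimal sketch (mine, not John's math, and not the theorem itself). It computes the exact distribution of the first die given that n dice sum to 2.5n, and compares it against an exponentially tilted distribution with mean 2.5. Treating that tilted distribution as "what you'd believe knowing only the average" is an assumption of this sketch, as are all the function names. It uses 200 dice rather than two thousand purely to keep the brute-force counting fast.

```python
from fractions import Fraction
import math

FACES = range(1, 7)  # a fair six-sided die


def sum_counts(n_dice):
    """Return {s: number of ways n_dice fair dice can sum to s}."""
    counts = {0: 1}
    for _ in range(n_dice):
        nxt = {}
        for s, c in counts.items():
            for f in FACES:
                nxt[s + f] = nxt.get(s + f, 0) + c
        counts = nxt
    return counts


def first_die_given_sum(n_dice, total):
    """Exact distribution of the first die, given that all n_dice sum to total."""
    rest = sum_counts(n_dice - 1)
    # P(first = f | sum = total) is proportional to the number of ways
    # the remaining dice can make up the difference.
    weights = [rest.get(total - f, 0) for f in FACES]
    z = sum(weights)
    return [Fraction(w, z) for w in weights]


def tilted_with_mean(mean):
    """p(f) proportional to exp(lam * f), with lam found by bisection to hit `mean`.

    Assumption of this sketch: this stands in for "knowing only the average".
    """
    def dist(lam):
        w = [math.exp(lam * f) for f in FACES]
        z = sum(w)
        return [x / z for x in w]

    lo, hi = -5.0, 5.0  # the mean is increasing in lam on this bracket
    for _ in range(100):
        mid = (lo + hi) / 2
        m = sum(f * p for f, p in zip(FACES, dist(mid)))
        if m < mean:
            lo = mid
        else:
            hi = mid
    return dist((lo + hi) / 2)


def total_variation(p, q):
    """Total variation distance between two distributions over the faces."""
    return 0.5 * sum(abs(float(a) - b) for a, b in zip(p, q))


reference = tilted_with_mean(2.5)
for n in (2, 20, 200):
    exact = first_die_given_sum(n, total=5 * n // 2)
    # Columns: n, P(first die = 6 | sum), distance to the average-only guess.
    print(n, float(exact[-1]), total_variation(exact, reference))
```

With two dice the probability of a six is exactly zero, which is the "extra" information in the analogy. The printed distance for larger n is one rough way to watch that extra information shrink; it is an illustration, not a proof of any limit statement.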

Or, well, when we know all the conserved properties, and the rest of the laws of physics are sufficiently ergodic or chaotic or something; I'm not sure exactly what theorem we'd want here; I'm just trying to summarize my understanding of John's position. I'd welcome further formalization.

If you want those examples, then… sorry. I'm going to go ahead and say that they're an exercise for the reader. If nobody else can reconstruct them, and you really want them, I might go delve through the chat logs. (My apologies for the inconvenience. Skipping that delve-and-cleanup process is part of the cost of getting this dang thing out at all, rather than never.)

I also note that I was super annoying in my attempts to extract a working version of this theorem from John. I started out by trying to probe all his verbal intuitions about the "natural abstractions are like conserved quantities" stuff, and then when I couldn't make any sense of that we went to the math. And, because none of his English phrases were making sense to me, I just meticulously tried to understand the details of the math, which involved a whole lot of not knowing what the heck his notation meant, and a whole lot of inability to fill out partial definitions in "the obvious way", which I suspect was frustrating. Sorry John; thanks for putting up with me.

But John, commenting on a draft of this post, was like "Nope!" and helpfully provided a quote.

John noted in a draft of this document that this post of his was largely intended as a response to me on this point.

AI ALIGNMENT FORUM