AI ALIGNMENT FORUM · Nick_Tarleton · Comments (sorted by newest)
Views on when AGI comes and on strategy to reduce existential risk
Nick_Tarleton · 2mo* · 10

Strong minds are the most structurally rich things ever. That doesn't mean they have high algorithmic complexity; obviously brains are less algorithmically complex than entire organisms, and the relevant aspects of brains are presumably considerably simpler than actual brains. But still, IDK, it just seems weird to me to expect to make such an object "by default" or something? Craig Venter made a quasi-synthetic lifeform--but how long would it take us to make a minimum viable unbounded invasive organic replicator actually from scratch, like without copying DNA sequences from existing lifeforms?

I think I don't understand this argument. In creating AI we can draw on training data, which breaks the analogy to making a replicator actually from scratch (are you using a premise that this is a dead end, or something, because "Nearly all [thinkers] do not write much about the innards of their thinking processes..."?). We've seen that supervised (EDIT: unsupervised) learning and RL (and evolution) can create structural richness (if I have the right idea of what you mean) out of proportion to the understanding that went into them. Of course this doesn't mean any particular learning process is able to create a strong mind, but, IDK, I don't see a way to put a strong lower bound on how much more powerful a learning process would need to be, and ISTM observations so far suggest 'less than I would have guessed'.

(EDIT: Maybe (you'd say) I should be drawing such a strong lower bound — or a lower bound on the needed difference from current techniques, not 'power level' — from the point about sample efficiency...? Like maybe I should think that we don't have a good enough sample space to learn over and will probably have to jump far outside it; this comment seems in that direction.)

(Nor do I get what view you're paraphrasing as 'expecting to make a strong mind "by default"'. Did LLMs or AlphaZero come about "by default"?)

(EDIT: I feel like I get "by default" more after looking again at your "Let me restate my view again" passage here.)

I think my timelines would have been considered normalish among X-risk people 15 years ago? And would have been considered shockingly short by most AI people.

Unfortunately I can't find the written artifact that came out of it, but I (very imperfectly) recall a large conversation around SIAI in 2010 in which a 2040 median was pretty typical. I agree that "X-risk people" more broadly had longer timelines, and "most AI people" much longer.

I think most of the difference is in how we're updating, rather than on priors? IDK.

Yeah, in particular it seems like I'm updating more than you from induction on the conceptual-progress-to-capabilities ratio we've seen so far / on what seem like surprises to the 'we need lots of ideas' view. (Or maybe you disagree about observations there, or disagree with that frame.) (The "missing update" should weaken this induction, but doesn't invalidate it IMO.)

Views on when AGI comes and on strategy to reduce existential risk
Nick_Tarleton · 7mo · 20

I don't really have an empirical basis for this, but: If you trained something otherwise comparable to, if not current, then near-future reasoning models without any mention of angular momentum, and gave it a context with several different problems to which angular momentum was applicable, I'd be surprised if it couldn't notice that r⃗ × p⃗ was a common interesting quantity, and then, in an extension of that context, correctly answer questions about it. If you gave it successive problem sets where the sum of that quantity was applicable, the integral, maybe other things, I'd be surprised if a (maybe more powerful) reasoning model couldn't build something worth calling the ability to correctly answer questions about angular momentum. Do you expect otherwise, and/or is this not what you had in mind?
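(For concreteness, here is a minimal sketch of the regularity such a model would have to pick up on. This is my illustration, not something from the original exchange, and the function name and parameters are made up. For motion under a central force, r⃗ × p⃗ is constant along the trajectory, so it recurs as a "common interesting quantity" across otherwise different problems.)

```python
# Illustrative sketch (not from the comment above): the quantity in question,
# L = r x p, is conserved for motion under a central force. Trajectories like
# this are the kind of data in which a model could, in principle, notice that
# r x p is a "common interesting quantity" without ever seeing the phrase
# "angular momentum".
import numpy as np

def simulate_central_force(r0, v0, mass=1.0, k=1.0, dt=1e-4, steps=100_000):
    """Integrate motion under an attractive inverse-square central force,
    sampling L = r x p along the way."""
    r, v = np.array(r0, dtype=float), np.array(v0, dtype=float)
    samples = []
    for i in range(steps):
        a = -k * r / np.linalg.norm(r) ** 3       # acceleration always points along -r
        v += a * dt                                # semi-implicit Euler step
        r += v * dt
        if i % 10_000 == 0:
            samples.append(np.cross(r, mass * v))  # L = r x p
    return np.array(samples)

L = simulate_central_force(r0=[1.0, 0.0, 0.0], v0=[0.0, 0.8, 0.3])
print(L)              # each row is (Lx, Ly, Lz); the rows are essentially identical
print(L.std(axis=0))  # spread is down at floating-point / integration-error level
```

(Semi-implicit Euler happens to conserve r⃗ × p⃗ exactly for central forces, so the printed spread reflects only floating-point error; the point is just that the invariant is there in the data to be noticed.)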

Views on when AGI comes and on strategy to reduce existential risk
Nick_Tarleton · 7mo · 33

It seems right to me that "fixed, partial concepts with fixed, partial understanding" that are "mostly 'in the data'" likely block LLMs from being AGI in the sense of this post. (I'm somewhat confused / surprised that people don't talk about this more — I don't know whether to interpret that as not noticing it, or having a different ontology, or noticing it but disagreeing that it's a blocker, or thinking that it'll be easy to overcome, or what. I'm curious if you have a sense from talking to people.)

These also seem right:

  • "LLMs have a weird, non-human shaped set of capabilities"
  • "There is a broken inference"
  • "we should also update that this behavior surprisingly turns out to not require as much general intelligence as we thought"
  • "LLMs do not behave with respect to X like a person who understands X, for many X"

(though I feel confused about how to update on the conjunction of those, and the things LLMs are good at — all the ways they don't behave like a person who doesn't understand X, either, for many X.)

But: you seem to have a relatively strong prior[1] on how hard it is to get from current techniques to AGI, and I'm not sure where you're getting that prior from. I'm not saying I have a strong inside view in the other direction, but, like, just for instance — it's really not apparent to me that there isn't a clever continuous-training architecture, requiring relatively little new conceptual progress, that's sufficient; if that's less sample-efficient than what humans are doing, it's not apparent to me that it can't still accomplish the same things humans do, with a feasible amount of brute force. And it seems like that is apparent to you.

Or, looked at from a different angle: to my gut, it seems bizarre if whatever conceptual progress is required takes multiple decades, given the world I expect to see even with no further conceptual progress, where probably:

  • AI is transformative enough to motivate a whole lot of sustained attention on overcoming its remaining limitations
  • AI that's narrowly superhuman on some range of math & software tasks can accelerate research
[1] It's hard for me to tell how strong: "—though not super strongly" is hard for me to square with your butt-numbers, even taking into account that you disclaim them as butt-numbers.
