Epistemic status: Trying to explain why I have certain intuitions. Not sure whether people will find this obvious vs controversial.
Part 1: Brains probably do some useful things in utterly inscrutable ways
I'm not so much interested in arguing the strong claim that the brain does some useful things in infinitely inscrutable ways—i.e., that understanding them is fundamentally impossible. I merely want to make the weaker claim that the brain probably does some useful things in ways that are for all intents and purposes inscrutable.
Where did I get this intuition? A few places:
- Evolved FPGA circuits - see the awesome blog post On the Origin of Circuits focusing on the classic 1996 paper by Adrian Thompson. An evolved circuit of just 37 logic gates managed to perform a function which kinda seems impossible with those components. It turned out that the components were used in weird ways—the circuit ran differently on nominally-identical FPGAs, the transistors were not all used as on/off switches, there was some electromagnetic coupling or power-line coupling going on, etc. Can we understand how this circuit works? In the paper, they didn't try. I imagine that a good physicist, given enough time and experimental data, could get at least a vague idea of the most important aspects. But there might be subtleties that can't really be explained better than a simulation, or maybe some component has 17 unrelated functions that occur at different parts of the cycle, or maybe you need to account for a microscopic bump in some wire, or whatever. If it were 370 components instead of 37, and there were limits on what you can measure experimentally, it would be that much harder.
- The Busy Beaver Function Σ(n) was unknown for n as low as 5 until 2024, and it is still unknown for n=6. So we have a bunch of really simple computer programs, and no one knows whether they run forever or halt. When you get to larger n it gets even worse: For n≥1919 (and perhaps much smaller n too), the value of Σ(n) cannot be proven within standard set theory (ZFC). While that's not exactly the same as saying that we will never understand these programs, I kinda expect that there are in fact programs whose asymptotic behavior really is "infinitely inscrutable", i.e. programs which don't halt, but where there is fundamentally no way to understand why they don't halt, short of actually running them forever, and that's true even if you have a brain the size of Jupiter. (I could be wrong, and this is not an important part of my argument.)
- Riemann hypothesis: We have a simple-to-define function that exhibits an obvious pattern of behavior. Like those busy beaver Turing machines, the answer to "why" is "I dunno, we ran the calculation, and that's what we've found, at least so far". In this case, I assume that an explanation probably exists, but I find it interesting that we haven't discovered it yet, after 150 years of intense effort.
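To make the busy beaver item concrete, here is a minimal sketch of how small these programs are. The simulator and its names (`run_turing_machine`, `BB2`, `max_steps`) are illustrative assumptions, not taken from any of the sources above. The table is the well-known 2-state, 2-symbol champion, which halts quickly; for larger machines the same loop can simply run out of steps, and then we have learned nothing about whether the machine would ever halt.

```python
from collections import defaultdict

# Transition table: (state, symbol read) -> (symbol to write, head move, next state).
# This is the 2-state, 2-symbol busy beaver champion.
BB2 = {
    ("A", 0): (1, +1, "B"),
    ("A", 1): (1, -1, "B"),
    ("B", 0): (1, -1, "A"),
    ("B", 1): (1, +1, "HALT"),
}

def run_turing_machine(table, max_steps=1_000_000):
    """Run from a blank tape; return (halted, steps taken, number of 1s on tape)."""
    tape = defaultdict(int)          # unbounded tape, blank cells read as 0
    head, state, steps = 0, "A", 0
    while state != "HALT" and steps < max_steps:
        write, move, state = table[(state, tape[head])]
        tape[head] = write
        head += move
        steps += 1
    return state == "HALT", steps, sum(tape.values())

halted, steps, ones = run_turing_machine(BB2)
print(halted, steps, ones)  # True 6 4
```

The whole "program" is four table entries. The inscrutability shows up when a machine of similar size hits `max_steps` without halting: raising the limit tells you nothing definite, which is the situation described above.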
In summary, my intuition is that:
- Simple components can give rise to recognizable emergent patterns of behavior for inscrutably complicated reasons that can't necessarily be distilled down to any "explanation" beyond "we simulated it and that's what happens", and
- Neurons are not simple components, in that even if they have a legible primary input-output function, they probably have dozens of "side-channel" input-output functions that probably get sporadically used by evolution as well. (If you tug on a dendrite, then it's a spring!)
These two considerations coalesce to give me a prior expectation that there may be large numbers of very deep rabbit holes when you try to work out low-level implementation details of how the brain does any particular thing. The brain might do that thing by a beautiful, elegant, simple design ... or it might do that thing in some bizarre, ridiculous way, which we will not understand except by looking in weird places, like measuring mechanical stresses on cell membranes, or by measuring flows of chemicals that by all accounts ought to have no relation whatsoever to neuron firing, or by simulating systems of 492 components which interact in a complicated way that can't really be boiled down into anything simpler.
The book The Idea of the Brain has some great examples of the horrors facing neuroscientists trying to understand seemingly-simple neural circuits:
…Despite having a clearly established connectome of the thirty-odd neurons involved in what is called the crustacean stomatogastric ganglion, Marder's group cannot yet fully explain how even some small portions of this system function. ...in 1980 the neuroscientist Allen Selverston published a much-discussed think piece entitled "Are Central Pattern Generators Understandable?"...the situation has merely become more complex in the last four decades...The same neuron in different [individuals] can also show very different patterns of activity—the characteristics of each neuron can be highly plastic, as the cell changes its composition and function over time...
…Decades of work on the connectome of the few dozen neurons that form the central pattern generator in the lobster stomatogastric system, using electrophysiology, cell biology and extensive computer modelling, have still not fully revealed how its limited functions emerge.
Even the function of circuits like [frog] bug-detecting retinal cells—a simple, well-understood set of neurons with an apparently intuitive function—is not fully understood at a computational level. There are two competing models that explain what the cells are doing and how they are interconnected (one is based on a weevil, the other on a rabbit); their supporters have been thrashing it out for over half a century, and the issue is still unresolved. In 2017 the connectome of a neural substrate for detecting motion in Drosophila was reported, including information about which synapses were excitatory and which were inhibitory. Even this did not resolve the issue of which of those two models is correct.
I haven't chased down these references, and can't verify that understanding these things is really as difficult as this author says. On the other hand, these are really really simple systems; if they're even remotely approaching the limits of our capabilities, imagine an interacting bundle of 10× or 100× more neurons, doing something more complicated, in a way that is harder to experimentally measure.
So anyway, maybe scientists will eventually understand how the brain does absolutely everything it does, at the “implementation level”. I don't think that's ruled out. But I sure don't think it's likely, even for the simplest worm nervous system, in the foreseeable future.
Part 2: …But that doesn't mean brain-inspired AGI is hard!
Side note 1: I use "brain-inspired AGI" in the sense of copying (or reinventing) high-level data structures and algorithms, not in the sense of copying low-level implementation details, e.g. neurons that spike. "Neuromorphic hardware" is a thing, but I see no sign that neuromorphic hardware will be relevant for AGI. Most neuromorphic hardware researchers are focused on low-power sensors, as far as I understand.
Side note 2: The claim “brain-inspired AGI is likely” is unrelated to the claim “brain-inspired AGI will bring about a better future for humankind than other types of AGIs”, although these two claims sometimes get intuitively bundled together under the heading of "cheerleading for brain-like AGI". I have grown increasingly sympathetic to the former claim, but am undecided about the latter claim, and see it as an open research question—indeed, a particularly urgent open question, as it informs high-leverage research prioritization decisions that we can act on immediately.
OK, back to the main text. I want to argue something like this:
If some circuit in the brain is doing something useful, then it's humanly feasible to understand what that thing is and why it's useful, and to write our own CPU code that does the same useful thing.
In other words, the brain's implementation of that thing can be super-complicated, but the input-output relation cannot be that complicated—at least, the useful part of the input-output relation cannot be that complicated.
The crustacean stomatogastric ganglion central pattern generators discussed above are a great example: their mechanisms are horrifically complicated, but their function is simple: they create a rhythmic oscillation. Hey, you need a rhythmic oscillation in your AGI? No problem! I can do that in one line of Python.
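As a minimal sketch of that claim (the function name `oscillation` and the 1 Hz default are my own illustrative choices, and a plain sine wave is only a stand-in for whatever rhythm you actually need), the core really is a single expression:

```python
import math

def oscillation(t, freq_hz=1.0):
    """Rhythmic signal in [-1, 1] at time t (in seconds)."""
    return math.sin(2 * math.pi * freq_hz * t)

# Sample one full cycle at 8 points per second.
samples = [oscillation(i / 8) for i in range(9)]
```

Nothing about ion channels, neuromodulators, or cell-to-cell variability appears here, because none of that is part of the useful input-output relation.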
At the end of the day, we survive by exploiting regularities in our ecological niche and environment. If the brain does something that's useful, I feel like there has to be a legible explanation in those terms; and from that, that there has to be legible CPU code that does the same thing.
I feel most strongly about the boldface statement above in regards to the neocortex. The neocortex is a big uniform-ish machine that learns patterns in inputs and outputs and rewards, builds a predictive model, and uses that model to choose outputs that increase rewards, using some techniques we already understand, and others we don’t. If the neocortex does some information-processing thing, and the result is that it does its job better, then I feel like there has to be some legible explanation for what it's doing, why, and how, in terms of that primary prediction-and-action task … there has to be some reason that it systematically helps run smarter searches, or generates better models, or makes more accurate predictions, etc.
I feel much less strongly about that above boldface statement in regards to the brainstem and hypothalamus (the home of evolved instinctive responses to different situations, I would argue, see here). For example, I can definitely imagine that the human brain has an instinctual response to a certain input which is adaptive in 500 different scenarios that ancestral humans typically encountered, and maladaptive in another 499 scenarios that ancestral humans typically encountered. So on average it's beneficial, and our brains evolved to have that instinct, but there's no tidy story about why that instinct is there and no simple specification for exactly what calculation it's doing.
By the same token, in this sense, I expect that understanding the key operating principles of human intelligence will be dramatically easier than understanding the key operating principles of the nervous system of a 100-neuron microscopic worm!! Weird thought, right?! But again, every little aspect of those worm neurons could be a random side-effect of something else, or it could be an adaptive strategy for some situation that comes up in the worm's environment once every 5 generations, and how on earth are you ever going to figure out which is which?? And if you can't figure out which is which, how can you hope to “understand” the system in any way besides running a molecule-by-molecule simulation?? By contrast, “human intelligence” is a specific suite of capabilities, including things like “can carry on conversations, invent new technology, etc.”—a known target to aim for.
(Added for clarification: The point of the previous paragraph is that “understanding how a nervous system gives rise to a particular identifiable set of behaviors” is tractable, whereas “understanding the entire design spec of a nervous system”—i.e., every way that it optimizes inclusive genetic fitness—is not tractable. And I'm saying that this is such a big factor that it outweighs even the many-orders-of-magnitude difference in complexity between microscopic worms' and humans' nervous systems.)
I guess I have a not-terribly-justified gut feeling that we already vaguely understand how neocortical algorithms work to create human intelligence, and that “soon” (few decades?) this vague understanding will develop into full-fledged AGIs, assuming that the associated R&D continues. On the other hand, I acknowledge that this is very much not a common view, including among people far more knowledgeable than myself, and in particular there are plenty of neuroscientists who view the project of understanding the human brain as a centuries-long endeavor. I guess this post is a little piece of how I reconcile those two facts: At least in some cases, when neuroscientists talk about understanding the brain, I think they mean understanding what all the calculations are and how they are implemented—like what those researchers have been trying and failing to do with the crustacean stomatogastric ganglion in that book quote from part 1 above—but for a human brain with 10⁹× more neurons. Yup, that sounds like a centuries-long endeavor to me too! But I think understanding human intelligence well enough to make a working AGI algorithm is dramatically easier than that. (Update: See further discussion in my later post series, Sections 2.8, 3.7, and 3.8.)
…And I do think that latter type of work is actually getting done, particularly by those researchers who go in armed with an understanding of (1) what useful algorithms might look like in general, (2) neuroscience, and (3) psychology / behavior, and then go hunting for ways that those three ingredients might come together, without getting too bogged down in explaining every last neuron spike.
Incidentally, this is also the lens through which I think about the arguments over whether or not glial cells (in addition to neurons) do computations. If glial cells are predictable systems that interact with neurons, of course they'll wind up getting entrained in computations! That's what evolution does, just like an evolved PCB circuit would probably use the board itself as a mechanical resonator or whatever other ridiculous things you can imagine. So my generic expectation is: (1) If you removed the glial cells, it would break lots of brain computations; (2) If there were no such thing as glial cells, a functionally-identical circuit would have evolved, and I bet it wouldn't even look all that different. By the way, I know almost nothing about glial cells, I'm just speculating. :-)
This was a great read! I wonder how much you're committed to "brain-inspired" vs "mind-inspired" AGI, given that the approach to "understanding the human brain" you outline seems to correspond to Marr's computational and algorithmic levels of analysis, as opposed to the implementational level (see link for reference). In which case, some would argue, you don't necessarily have to do too much neuroscience to reverse engineer human intelligence. A lot can be gleaned by doing classic psychological experiments to validate the functional roles of various aspects of human intelligence, before examining in more detail their algorithms and data structures (perhaps this time with the help of brain imaging, but also carefully designed experiments that elicit human problem solving heuristics, search strategies, and learning curves).
I ask because I think "brain-inspired" often gets immediately associated with neural networks, and not say, methods for fast and approximate Bayesian inference (MCMC, particle filters), which are less the AI zeitgeist nowadays, but still very much how cognitive scientists understand the human mind and its capabilities.
Thanks! I guess my feeling is that we have a lot of good implementation-level ideas (and keep getting more), and we have a bunch of algorithm ideas, and psychology ideas and introspection and evolution and so on, and we keep piecing all these things together, across all the different levels, into coherent stories, and that's the approach I think will (if continued) lead to AGI.
Like, I am in fact very interested in "methods for fast and approximate Bayesian inference" as being relevant for neuroscience and AGI, but I wasn't really interested in it until I learned a bunch of supporting ideas about what part of the brain is doing that, and how it works on the neuron level, and how and when and why that particular capability evolved in that part of the brain. Maybe that's just me.
I haven't seen compelling (to me) examples of people going successfully from psychology to algorithms without stopping to consider anything whatsoever about how the brain is constructed. Hmm, maybe very early Steve Grossberg stuff? But he talks about the brain constantly now.
One reason it's tricky to make sense of psychology data on its own, I think, is the interplay between (1) learning algorithms, (2) learned content (a.k.a. "trained models"), (3) innate hardwired behaviors (mainly in the brainstem & hypothalamus). What you especially want for AGI is to learn about #1, but experiments on adults are dominated by #2, and experiments on infants are dominated by #3, I think.
Some recent examples, off the top of my head!
I guess this depends on how much you think we can make progress towards AGI by learning what's innate / hardwired / learned at an early age in humans and building that into AI systems, vs. taking more of a "learn everything" approach! I personally think there may still be a lot of interesting human-like thinking and problem solving strategies that we haven't figured out how to implement as algorithms yet (e.g. how humans learn to program, and edit + modify programs and libraries to make them better over time), and that adult and child studies would be useful for characterizing what we might even be aiming for, even if ultimately the solution is to use some kind of generic learning algorithm to reproduce it. I also think there's a fruitful middle ground between (1) and (3), which is to ask, "What are the inductive biases that guide human learning?", which I think you can make a lot of headway on without getting to the neural level.
Robin Hanson makes a similar argument in "Signal Processors Decouple":
FWIW I have come to similar conclusions along similar lines. I've said that I think human intelligence minus rat intelligence is probably easier to understand and implement than rat intelligence alone. Rat intelligence requires a long list of neural structures fine-tuned by natural selection, over tens of millions of years, to enable the rat to do very specific survival behaviors right out of the womb. How many individual fine-tuned behaviors? Hundreds? Thousands? Hard to say. Human intelligence, by contrast, cannot possibly be this fine tuned, because the same machinery lets us learn and predict almost arbitrarily different* domains.
I also think that recent results in machine learning have essentially proven the conjecture that moar compute regularly and reliably leads to moar performance, all things being equal. The human neocortical algorithm probably wouldn't work very well if it were applied in a brain 100x smaller because, by its very nature, it requires massive amounts of parallel compute to work. In other words, the neocortex needs trillions of synapses to do what it does for much the same reason that GPT-3 can do things that GPT-2 can't. Size matters, at least for this particular class of architectures.
*I think this is actually wrong - I don't think we can learn arbitrary domains, not even close. Humans are not general. Yann LeCun has repeatedly said this and I'm inclined to trust him. But I think that the human intelligence architecture might be general. It's just that natural selection stopped seeing net fitness advantage at the current brain size.
I disagree; as I discussed here, I think the neocortex is uniform-ish and that a cortical column in humans is doing a similar calculation to a cortical column in rats or the equivalent bundle of cells (arranged not as a column) in a bird pallium or lizard pallium. I do think you need lots and lots of cortical columns, initialized with appropriate region-to-region connections, to get human intelligence. Well, maybe that's what you meant by "human neocortical algorithm", in which case I agree. You also need appropriate subcortical signals guiding the neocortex, for example to flag human speech sounds as being important to attend to.
Well, I do think that there are a lot of non-neocortical innovations between humans and rats, particularly to build our complex suite of social instincts, see here. I don't think understanding those innovations is necessary for AGI, although I do think it would be awfully helpful to understand them if we want aligned AGI. And I think they are going to be hard to understand, compared to the neocortex.
Sure. A good example is temporal sequence learning. If a sequence of things happens, we expect the same sequence to recur in the future. In principle, we can imagine an anti-inductive universe where, if a sequence of things happens, then it's especially unlikely to recur in the future, at all levels of abstraction. Our learning algorithm would crash and burn in such a universe. This is a particular example of the no-free-lunch theorem, and I think it illustrates that, while there are domains that the neocortical learning algorithm can't learn, they may be awfully weird and unlikely to come up.
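Here is a toy sketch of that contrast, under loud assumptions: the first-order predictor below is a deliberately crude stand-in for sequence learning (not a model of the neocortex), and both "worlds" are hypothetical three-symbol environments I made up for illustration. The inductive world repeats a fixed cycle; the anti-inductive world always emits whichever successor has followed the current symbol least often so far.

```python
from collections import Counter, defaultdict

ALPHABET = "ABC"

class SequencePredictor:
    """Guess the symbol that has most often followed the previous one."""
    def __init__(self):
        self.counts = defaultdict(Counter)

    def predict(self, prev):
        seen = self.counts[prev]
        return max(ALPHABET, key=lambda s: seen[s])  # ties -> earliest symbol

    def update(self, prev, nxt):
        self.counts[prev][nxt] += 1

def inductive_world():
    """Past sequences recur: A -> B -> C -> A -> ..."""
    def step(prev):
        return ALPHABET[(ALPHABET.index(prev) + 1) % len(ALPHABET)]
    return step

def anti_inductive_world():
    """Whatever has followed `prev` least often so far comes next."""
    history = defaultdict(Counter)
    def step(prev):
        seen = history[prev]
        nxt = min(reversed(ALPHABET), key=lambda s: seen[s])  # ties -> latest symbol
        seen[nxt] += 1
        return nxt
    return step

def accuracy(world, steps=300):
    """Fraction of steps on which the predictor's guess matched the world."""
    model, prev, hits = SequencePredictor(), ALPHABET[0], 0
    for _ in range(steps):
        guess = model.predict(prev)
        actual = world(prev)
        hits += guess == actual
        model.update(prev, actual)
        prev = actual
    return hits / steps

print("inductive world:     ", accuracy(inductive_world()))
print("anti-inductive world:", accuracy(anti_inductive_world()))
```

In the cyclic world the predictor is right on almost every step after seeing each transition once; in the anti-inductive world the most frequent past successor is by construction never the next one, so it is never right. The same learning rule is excellent or useless depending only on whether the environment has the regularity it assumes.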