Jon Garcia

I have a PhD in Computational Neuroscience from UCSD (Bachelor's was in Biomedical Engineering with Math and Computer Science minors). Ever since junior high, I've been trying to figure out how to engineer artificial minds, and I've been coding up artificial neural networks ever since I first learned to program. Obviously, all my early designs were almost completely wrong/unworkable/poorly defined, but I think my experiences did prime my brain with inductive biases that are well suited for working on AGI.

Although I now work as a data scientist in R&D at a large medical device company, I continue to spend my free time studying the latest developments in AI/ML/DL/RL and neuroscience and trying to come up with models for how to bring it all together into systems that could actually be implemented. Unfortunately, I don't seem to have much time to develop my ideas into publishable models, but I would love to have the opportunity to share ideas with those who do.

Of course, I'm also very interested in AI Alignment (hence the account here). My ideas on that front mostly fall into the "learn (invertible) generative models of human needs/goals and hook those up to the AI's own reward signal" camp. I think methods of achieving alignment that depend on restricting the AI's intelligence or behavior are about as destined to fail in the long term as Prohibition or the War on Drugs in the USA. We need a better theory of what reward signals are for in general (probably something to do with maximizing (minimizing) the attainable (dis)utility with respect to the survival needs of a system) before we can hope to model human values usefully. This could even extend to modeling the "values" of the ecological/socioeconomic/political supersystems in which humans are embedded or of the biological subsystems that are embedded within humans, both of which would be crucial for creating a better future.




[Intro to brain-like-AGI safety] 2. “Learning from scratch” in the brain

Memory ≠ Unstructured memory (and likewise, locally-random ≠ globally-random): [...]

Agreed. I didn't mean to imply that you thought otherwise.

"just" a memory system + learning algorithm—with a dismissive tone of voice on the "just": [...]

I apologize for how that came across. I had no intention of being dismissive. When I respond to a post or comment, I typically try to frame what I say for a typical reader as much as for the original poster. In this case, I had a sense that a typical reader could get the wrong impression about how the neocortex does what it does if the only sorts of memory systems and learning algorithms that came to mind were things like a blank computer drive and stochastic gradient descent on a feed-forward neural network.

You are absolutely right that the neocortex is equipped to learn from scratch, starting out generating garbage and gradually learning to make sense of the world/body/other-brain-regions/etc., which can legitimately be described as a memory system + learning algorithm. I just wanted anyone reading to appreciate that, at least in biological brains, there is no clean separation between learning algorithm and memory, and that the neocortex's role as a hierarchical, dynamic, generative simulator is precisely what makes learning from scratch so efficient, since it only has to correlate its intrinsic dynamics with the statistically similar dynamics of learned experience.

I'm sure that there are vastly more ways of implementing learning-from-scratch, maybe some much better ways in fact, and I realize that the exact implementation is probably not relevant to the arguments you plan to make in this sequence. I just feel that a basic understanding of what a real learning-from-scratch system looks like could help drive intuitions of what is possible.

Generative models can be learned from scratch [...]

Indeed, but of course including their own particular structural priors.

Dynamics is not unrelated to neural architecture: [...]

Well, what is a recurrent neural network after all but an arbitrarily deep feed-forward neural network with shared weights across layers? My comment on cortical waves was just to point out a clever way that the brain learns to organize its cortical maps and primes them to expect causality to operate (mostly) locally in space and time. For example, orientation columns in V1 may be adjacent to each other because similarly oriented edges (from moving objects) were consistently presented to the same part of the visual field close in time, such that traveling waves of excitation would teach pyramidal cell A to learn orientation A at time A and then teach neuron B to learn orientation B at time B.
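To make the equivalence in that first sentence concrete, here is a minimal sketch in plain Python. The function names, the tanh nonlinearity, and the toy weights are all assumptions chosen for illustration; nothing here is meant to model cortex. It runs the same layer function once as a recurrent loop and once as a feed-forward stack whose layers happen to share one set of weights.

```python
import math

def layer(W, U, h, x):
    """One layer: h' = tanh(W h + U x). W is hidden-to-hidden, U is input-to-hidden."""
    return [
        math.tanh(
            sum(W[i][j] * h[j] for j in range(len(h)))
            + sum(U[i][k] * x[k] for k in range(len(x)))
        )
        for i in range(len(W))
    ]

def run_recurrent(W, U, h0, xs):
    """Recurrent view: one set of weights, applied once per time step."""
    h = h0
    for x in xs:
        h = layer(W, U, h, x)
    return h

def run_unrolled(layers, h0, xs):
    """Feed-forward view: layer t has its own (W_t, U_t) and reads input x_t."""
    h = h0
    for (W_t, U_t), x in zip(layers, xs):
        h = layer(W_t, U_t, h, x)
    return h

# Toy weights and inputs (hypothetical values, chosen only for the demo).
W = [[0.5, -0.3], [0.2, 0.1]]
U = [[1.0], [-1.0]]
h0 = [0.0, 0.0]
xs = [[1.0], [0.5], [-0.25]]

recurrent_out = run_recurrent(W, U, h0, xs)
# Tie every layer of the feed-forward stack to the same (W, U).
unrolled_out = run_unrolled([(W, U)] * len(xs), h0, xs)
```

With the weights tied, the two outputs are identical, and the depth of the feed-forward stack is simply the number of time steps. Untying the weights in `layers` would give an ordinary deep feed-forward network.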

Lottery ticket hypothesis: [...]

"Lottery tickets" (i.e., subnetworks with random initializations that just so happen to give them the right inductive bias to generalize well from the training data for a particular task) probably occur in the brain as much as in current deep learning architectures. However, the issue in DL is that the rest of the network often fails to contribute much to test performance beyond what the lottery ticket subnetwork was able to achieve, as though there were a chasm in model space that the other subnetworks were unable to cross to reach a solution. Evolution seems to have found a way around this problem, at least by the time the neocortex came along, in the sense that the brain seems adept at giving every subnetwork a useful job.

Again, I think the nature of the neocortex as a generative simulator is what makes this feasible with sparse training data. The structure and dynamics of the neocortex have enough in common statistically with the structure and dynamics of the natural world that it is easy for it to align with experience. In contrast, the structure and (lack of) dynamics of current DL systems makes them more brittle when trying to make sense of the natural world.

All that being said, I realize that my comments might be taking things down a rabbit hole. But I appreciate your feedback and look forward to seeing your perspective fleshed out more in the rest of this sequence.

[Intro to brain-like-AGI safety] 2. “Learning from scratch” in the brain

Great summary of the argument. I definitely agree that this will be an important distinction (learning-from-scratch vs. innate circuitry) for AGI alignment, as well as for developing a useful Theory of Cognition. The core of what motivates our behavior must be innate to some extent (e.g., heuristics that evolution programmed into our hypothalamus that tell us how far from homeostasis we're veering), to act as a teaching signal to the rest of our brains (e.g., learn to generate goal states that minimize the effort required to maintain or return to homeostasis, goal states that would be far too complex to encode genetically).

However, I think that characterizing the telencephalon+cerebellum as just a memory system + learning algorithm is selling it a bit short. Even at the level of abstraction that you're dealing with here, it seems important to recognize that the neocortex is a dynamic generative simulator. It has intrinsic dynamics that generate top-down spatiotemporal patterns, from regions that represent the highest levels of abstraction down to regions that interface directly with the senses and motor system.

At the beginning, yes, it is simulating nonsense, but the key is that it is more than just randomly initialized memory slots. The sorts of hierarchical patterns that it generates contain information about the dynamical priors that it expects to deal with in life. (Cortical waves, for instance, set priors for spatiotemporally local causal interactions. Retinal waves are probably doing something similar.)

The job of the learning algorithm is, then, to bind the dynamics of experience to the intrinsic dynamics of the neural circuitry, so that the top-down system is better able to simulate real-world dynamics in the future. It probably does so by propagating prediction errors bottom-up from the sensorimotor areas up to the regions representing abstract concepts, making adjustments and learning from them along the way.
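As a rough illustration of that loop, here is a toy, single-layer linear sketch in the spirit of predictive coding. Everything in it is an assumption made for the example: the names `predict` and `step`, the linear generative model, the shared learning rate, and the toy numbers. It is not a claim about how the neocortex implements this. A latent state `z` generates a top-down prediction of the input `x`, the prediction error is passed back up, and both the latent state and the generative weights `W` are nudged to reduce it.

```python
def predict(W, z):
    """Top-down pass: generate a predicted input from the latent state z."""
    return [sum(W[i][j] * z[j] for j in range(len(z))) for i in range(len(W))]

def squared_error(x, W, z):
    """Total squared prediction error between the input and its prediction."""
    return sum((xi - pi) ** 2 for xi, pi in zip(x, predict(W, z)))

def step(x, W, z, lr=0.1):
    """One update: send the prediction error bottom-up, then adjust z and W."""
    err = [xi - pi for xi, pi in zip(x, predict(W, z))]
    # The latent state moves along W^T err (gradient descent on squared error).
    new_z = [
        z[j] + lr * sum(W[i][j] * err[i] for i in range(len(x)))
        for j in range(len(z))
    ]
    # Each weight changes by (local error) * (latent activity), a Hebbian-like rule.
    new_W = [
        [W[i][j] + lr * err[i] * z[j] for j in range(len(z))]
        for i in range(len(x))
    ]
    return new_W, new_z

# Toy input, weights, and latent state (hypothetical values for the demo).
x = [1.0, -0.5]
W = [[0.3], [0.1]]
z = [0.0]

initial_error = squared_error(x, W, z)
for _ in range(200):
    W, z = step(x, W, z)
final_error = squared_error(x, W, z)
```

In this toy setting the prediction error shrinks as the updates repeat. A hierarchical version would stack such layers, with each layer's latent state serving as the "input" that the layer above tries to predict.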

In my opinion, some sort of predictive coding scheme using some sort of hierarchical, dynamic generative simulator will be necessary (though not sufficient) for building generally intelligent systems. For learning-from-scratch to be effective (i.e., not require millions of labelled training examples before it can make useful generalizations), it needs to have such a head start.

[Intro to brain-like-AGI safety] 1. What's the problem & Why work on it now?

Great introduction. As someone with a background in computational neuroscience, I'm really looking forward to what you have to say on all this.

By the way, you seem to use a very broad umbrella for covering "brain-like-AGI" approaches. Would you classify something like a predictive coding network as more brain-like or more prosaic? What about a neural Turing machine? In other words, do you have a distinct boundary in mind for separating the red box from the blue box, or is your classification more fuzzy (or does it not matter)?

Infra-Bayesian physicalism: a formal theory of naturalized induction

The monotonicity principle requires it to be non-decreasing w.r.t. the manifesting of less facts. Roughly speaking, the more computations the universe runs, the better.

I think this is what I was missing. Thanks.

So, then, the monotonicity principle sets a baseline for the agent's loss function that corresponds to how much less stuff can happen to whatever subset of the universe it cares about, getting worse the fewer opportunities become available, due to death or some other kind of stifling. Then the agent's particular value function over universe-states gets added/subtracted on top of that, correct?

Infra-Bayesian physicalism: a formal theory of naturalized induction

Could you explain what the monotonicity principle is, without referring to any symbols or operators? I gathered that it is important, that it is problematic, that it is a necessary consequence of physicalism absent from cartesian models, and that it has something to do with the min-(across copies of an agent) max-(across destinies of the agent copy) loss. But I seem to have missed the content and context that make sense of all of that, or even in what sense and over what space the loss function is being monotonic.

Your discussion section is good. I would like to see more of the same without all the math notation.

If you find that you need to use novel math notation to convey your ideas precisely, I would advise you to explain what every symbol, every operator, and every formula as a whole means every time you reference them. With all the new notation, I forgot what everything meant after the first time they were defined. If I had a year to familiarize myself with all the symbols and operators and their interpretations and applications, I imagine that this post would be much clearer.

That being said, I appreciate all the work you put into this. I can tell there's important stuff to glean here. I just need some help gleaning it.

There is essentially one best-validated theory of cognition.

So essentially, which types of information get routed for processing to which areas during the performance of some behavioral or cognitive algorithm, and what sort of processing each module performs?

There is essentially one best-validated theory of cognition.

So, from what I read, it looks like ACT-R is mostly about modeling which brain systems are connected to which and how fast their interactions are, not in any way how the brain systems actually do what they do. Is that fair? If so, I could see this framework helping to set useful structural priors for developing AGI (e.g., so we don't make the mistake of hooking up the declarative memory module directly to the raw sensory or motor modules), but I would expect most of the progress still to come from research in machine learning and computational neuroscience.