(Note: I published a much-improved discussion of this topic two years later: See Symbol Grounding & Human Social Instincts. I suggest you read that one instead.)

(Transcript of the little casual talk I gave here.)

I’ve been mulling over the algorithms behind social emotions. How does the brain know when to feel jealous? How does the brain know when to feel guilty? And so on. Knowing the answer would be useful for understanding the human condition, and perhaps also for programming AIs: if we want to make AIs that are pro-social for the same reason that humans are pro-social—well, at least some humans!—then the AIs would wind up with human-like intuitions even in novel circumstances. So again, how does the brain know when to feel jealous? What are the calculations? I’ve looked in the literature, and I haven’t found an answer. The books and papers say things like “The amygdala helps process emotion”, or whatever. Uhh, OK, what does that mean? The amygdala has 10 million neurons. You can do a lot of calculations with 10 million neurons! So, what are those neurons calculating?

Now, the reason social emotions are especially tricky is that they need to interface with your understanding of the world. And your understanding of the world is some horribly complicated data structure in your brain. Like, my friend Rita just won a trophy and I didn’t, and that makes me jealous. OK. Rita winning the trophy is represented by neuron firing pattern #6,000,000,423. So that’s supposed to trigger the jealousy circuit. How does that work? You can’t just say “The genome wires them up,” because, how does the genome know to do that? How many of my ancestors were friends with Rita when she won a trophy? None! There has to be more to the story than that. And likewise, you can’t just say “A learning algorithm will figure out the connection,” because there’s no ground truth about when you should be jealous. There's ground truth over evolutionary time, but there's no ground truth within a lifetime. So how do you learn it?
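If it helps, here’s that dilemma as a toy Python sketch. Everything in it (the concept IDs, the names, the numbers) is invented purely for illustration, not a proposal:

```python
# A toy illustration of the grounding problem described above. All names
# and numbers here are made up; this is just the dilemma, not an answer.

# The world-model is learned within a lifetime, so the "address" of any
# given concept is arbitrary and different in every brain.
world_model = {
    6_000_000_423: "Rita won a trophy and I didn't",  # learned, idiosyncratic
    6_000_000_424: "red bird",
    # ...millions more...
}

# Option 1: the genome hardwires the connection. But the genome has no
# way to know which learned firing pattern will wind up meaning
# "my friend won a trophy and I didn't":
JEALOUSY_TRIGGERS = {6_000_000_423}  # impossible to specify innately

# Option 2: a learning algorithm figures out the connection. But that
# needs labeled examples, and within one lifetime there are none:
def jealousy_label(situation):
    raise NotImplementedError("no within-lifetime ground truth for jealousy")
```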

I couldn’t find anyone talking about this in a way that made sense to me, so eventually I paused my literature search in favor of just making things up. A proud tradition here at LessWrong! I’m at the stage of trying to formulate a hypothesis. This is not a scientific talk; this is brainstorming at the watercooler with my friends! I kinda have a framework for a hypothesis, I think, but it’s a bit vague in some areas, and I’m trying to shore it up, nail it down, sanity-check it, and so on; actually proving it is a long way off! I’m going to skip a lot of the algorithm stuff in this talk and just summarize the upshot in an intuitive way, and then all of you can think about whether it’s intuitively plausible and in accordance with your experience. Of course, introspection can be unreliable, and you certainly can’t prove or disprove this kind of thing through introspection alone, but if careful thinkers tell me that everything I’m saying in this talk is wildly at odds with their lived experience, that would at least give me pause, and chatting about it might lead me in a more promising direction. And if anyone has seen relevant discussion or literature, that’s even better.

So the story starts with empathy, or as I like to call it, “empathetic simulation”. That’s when you simulate something that you think is happening in someone else’s head. Like, here on the left, you’re thinking about a red bird, and this involves activating many thousands of concepts in your predictive world-model, to varying degrees; I put a few of them in the picture here. I should mention that this isn’t just a silly diagram; there’s a story about an algorithm underneath it, which again I’m skipping for this talk. Now, here on the right is what it looks like when you think that Alice is thinking about a red bird. You’ll notice that you’re obviously using the “red bird” concept that’s already in your world-model. You’re not building a new concept of “red bird” from scratch, right? Well, I consider that obvious, but some people would disagree, and one of the objections in the literature is: what if Alice’s mind is different from yours? I say: no problem! You just learn little ad hoc patches that account for the differences. Like in this example, you happen to know Alice is colorblind (see the bottom-left of the diagram). Our brain algorithms are really good at that kind of thing.
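To make that concrete, here’s a toy Python sketch of the “reuse your own concepts, plus ad hoc patches” idea. The data structures, numbers, and the ALICE_PATCHES business are placeholders I invented for this talk, not a real claim about neural representations:

```python
# A toy sketch: empathetic simulation reuses your own world-model,
# plus small learned corrections for known differences.

# Your own world-model: concept name -> activation strength in [0, 1].
def think_about_red_bird():
    return {"bird": 0.9, "red": 0.8, "flying": 0.4, "outdoors": 0.3}

# You happen to know Alice is colorblind, so her "red" doesn't activate.
ALICE_PATCHES = {"red": 0.0}

def simulate_alice(activations, patches=ALICE_PATCHES):
    # Start from your own activations, then overwrite the patched ones.
    simulated = dict(activations)
    simulated.update(patches)
    return simulated

print(simulate_alice(think_about_red_bird()))
# {'bird': 0.9, 'red': 0.0, 'flying': 0.4, 'outdoors': 0.3}
```

The point of the sketch is that the simulation is cheap: it borrows your whole world-model for free, and only the little patches are new.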

So by the same token, here on the left are the active concepts in your head when you’re laughing. When you think about the fact that Bob is laughing, it seems to me that this has to activate all those same concepts.

Now, I’m using the term “concepts” for these little gray bubbles, but that’s not a great term. You can also call them “generative models” or “hypotheses”; Kurzweil calls them “patterns”; Minsky calls them “subagents” in some contexts... whatever. The important thing is: don’t think of them as dry and theoretical and abstract. Many of them are connected to actions, at least to some degree. Like, if you imagine yourself jumping, you’ll often notice your muscles twitch a little bit, as if preparing to jump. Or if you imagine feeling scared, you’ll notice your heart rate rise a bit. So by the same token, when you call to mind the idea that Bob is laughing, I suggest that you’ll feel just a little hint of a feeling of safety, playfulness, and so on.
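Here’s that claim as a toy sketch, with invented concept-to-output links and invented gains. The point is just that a faint activation produces a proportionally faint bodily trace, while a vivid one produces a strong trace:

```python
# A toy sketch of "concepts are connected to actions": activating one
# leaks a proportional trace into bodily outputs. The links and gains
# below are invented placeholders, not measurements.

CONCEPT_TO_BODY_LINKS = {
    "jumping":  {"leg_muscle_tension": 1.0},
    "scared":   {"heart_rate_increase": 1.0},
    "laughing": {"feeling_of_safety": 0.8, "playfulness": 0.9},
}

def bodily_trace(concept, activation):
    """Vivid imagination (activation near 1.0) drives outputs strongly;
    a fleeting glimpse (activation near 0.1) drives them just a little."""
    links = CONCEPT_TO_BODY_LINKS.get(concept, {})
    return {output: gain * activation for output, gain in links.items()}

print(bodily_trace("jumping", 0.1))   # a faint muscle twitch
print(bodily_trace("laughing", 0.1))  # a faint hint of safety/playfulness
```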

To be clear, the empathy I’ve been talking about so far is not effortful empathy, like “walk a mile in Bob’s shoes”, as the saying goes; I’ll get back to effortful empathy later. Instead, it’s more like this: try looking at a chair and seeing it not as a chair, but rather as a bunch of meaningless lines and colors. It’s very difficult. Not impossible, but very difficult. We just recognize things when we see them, we activate those concepts, it’s automatic. So by the same token, if you recognize a sound as a laughing sound, my claim is that you just naturally get a little glimpse in your head of a light-hearted, playful mood, and the other concepts that go with it.

So here you are. You’re in pain, hurt and scared. And you tell your friend Bob about it, and he just laughs at you. Ouch. Maybe it pisses you off, maybe it crushes you, but I sure bet you feel something! This is a great example of a social emotion. So the question from the beginning of this talk is: what's the chain of events in your brain that leads to that feeling? What's the proximate cause? 

My hypothesis is that it's that little glimpse of a playful mood, within the empathetic simulation of Bob—I claim that's the most immediate and direct cause of your emotional reaction. I think when you’re feeling hurt and scared, there's a circuit in your brain that’s looking for an empathetic simulation of a playful feeling, along with some other signals and context, and if it sees that, it triggers this strong negative reaction, like with cortisol and negative valence.

So in general, yes, I feel crushed that Bob is laughing at me, but more specifically, I feel that feeling most strongly right at the moment when I call up that little glimpse of Bob’s playful feeling. And the more vivid and salient Bob’s playful feeling is in my mind, the more strongly I react.
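Putting the last two paragraphs together, here’s a cartoon of the hypothesized circuit in Python. The threshold, the multiplicative scaling, and the “cortisol”/“valence” outputs are all stand-ins I picked just to make the sketch run, not claims about the actual mechanism:

```python
# A cartoon of the hypothesized "laughed at while hurt" circuit. The
# threshold, the multiplication, and the "cortisol"/"valence" outputs
# are stand-ins, not claims about the real mechanism.

def laughed_at_detector(my_state, empathetic_sim):
    """Fires when I'm hurt/scared AND my empathetic simulation of the
    other person contains a glimpse of a playful feeling. The reaction
    scales with how vivid and salient that simulated feeling is."""
    distress = my_state.get("hurt", 0.0) + my_state.get("scared", 0.0)
    # The key input: the glimpse of playfulness inside the simulation.
    simulated_playfulness = empathetic_sim.get("playful", 0.0)

    if distress > 0.5 and simulated_playfulness > 0.0:
        strength = distress * simulated_playfulness
        return {"cortisol": strength, "valence": -strength}
    return {"cortisol": 0.0, "valence": 0.0}

# I'm hurt and scared, and I get a vivid glimpse of Bob's amusement:
print(laughed_at_detector({"hurt": 0.8, "scared": 0.6}, {"playful": 0.7}))
```

Note how the reaction scales with both my own distress and the salience of the simulated playful feeling, matching the intuition above.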

So I want to put those little glimpses of empathetic simulation center stage, right at the core of how all our social instincts are implemented—maybe not the whole story, but the key ingredient. Like, if my friend is impressed by something I did, I feel proud, but I especially feel proud at the exact moment when I imagine my friend feeling that emotion. If my friend is disappointed in me, I feel guilty, but I especially feel guilty at the exact moment when I imagine my friend feeling that emotion. And conversely, I feel like I can’t really summon those feelings strongly in any other way. Maybe that’s just me—y’know, typical-mind fallacy—or maybe I’m bad at introspection, or maybe introspection is misleading here. But that’s where I’m at right now.
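In that spirit, you could caricature the proposal as a little table of innate rules, one per social emotion. To be clear, this table is my invented illustration of the shape of the idea, not a claimed inventory of actual circuits:

```python
# A caricature of the proposal: a handful of innate rules, each keyed on
# (my own context, the glimpsed feeling in the simulation). The entries
# are made up for illustration, not a claimed inventory.

SOCIAL_INSTINCT_RULES = [
    # (my own context,   glimpsed feeling,  resulting emotion)
    ("hurt_and_scared",  "playful",         "crushed / pissed off"),
    ("did_something",    "impressed",       "proud"),
    ("did_something",    "disappointed",    "guilty"),
    ("missed_out",       "delighted",       "jealous"),
]

def social_reaction(my_context, glimpsed_feeling):
    for context, feeling, emotion in SOCIAL_INSTINCT_RULES:
        if context == my_context and feeling == glimpsed_feeling:
            return emotion
    return None

print(social_reaction("did_something", "impressed"))  # 'proud'
```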

Now, again, I want to distinguish these glimpses of empathy from real empathy, empathy in the everyday sense, like when you really try to “walk a mile in Bob’s shoes”. When you do that, you’re not just taking a glimpse at an empathetic simulation of Bob’s feelings; rather, you’re making that empathetic simulation very intense, so intense that it pushes aside whatever you were feeling before, like it fills your brain, so to speak. Now it’s a different context—not the “hurt and scared” context from before, since those feelings were pushed away—so the empathetic simulation gives rise to a different emotional reaction. You probably wind up more sympathetic to Bob. You see where he’s coming from. Unless he’s a psychopath, I guess.
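One way to sketch that distinction, reusing the same toy structures: let the intensity of the simulation determine how much it displaces my own current feelings. The linear displacement rule here is just a guess at the shape of the claim, not a specific mechanism:

```python
# A toy sketch of glimpse vs. effortful empathy: intensity controls how
# much the simulation displaces my own current feelings. The linear rule
# is an invented placeholder.

def empathize(my_state, simulated_state, intensity):
    """At low intensity (a glimpse), my own feelings stay in place, so
    the innate detectors see (my own state, glimpse) side by side. At
    high intensity, the simulation crowds out my own feelings, so the
    detectors fire in a different context, or not at all."""
    remaining = {k: v * (1.0 - intensity) for k, v in my_state.items()}
    glimpse = {k: v * intensity for k, v in simulated_state.items()}
    return remaining, glimpse

# Glimpse: I'm still hurt, so the "laughed at" circuit fires hard.
print(empathize({"hurt": 0.8}, {"playful": 0.7}, intensity=0.1))
# Walking a mile in Bob's shoes: my hurt is pushed aside, different reaction.
print(empathize({"hurt": 0.8}, {"playful": 0.7}, intensity=0.9))
```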

Anyway, if this story is basically right, with the little glimpses of empathy, then I can start to imagine how to implement an algorithm for social instincts, although that’s still a bit of a long story that I won’t get into now. And I’m still hazy about some details, like maybe it’s not just empathetically simulated feelings, but there are other essential signals too, and if so, what are they? Oh and if I’m totally wrong, then back to the drawing board! So like I said before, this is brainstorming, go think about it and if you have any insights or ideas, let’s talk!

One comment from a reader: Have a look at the work of Guillaume Dumas (https://www.ppsp.team/#papers) on brains doing some form of synchronisation at different scales.