I have a compute-market startup called vast.ai, I also do research for Orchid (crypto), and I'm working towards a larger plan to save the world. Currently seeking networking, collaborators, and hires - especially top-notch CUDA/GPU programmers.

My personal blog: https://entersingularity.wordpress.com/


He writes that the human brain has “1e13-1e15 spikes through synapses per second (1e14-1e15 synapses × 0.1-1 spikes per second)”. I think Joe was being overly conservative, and I feel comfortable editing this to “1e13-1e14 spikes through synapses per second”, for reasons in this footnote→[9].


I agree that 1e14 synaptic spikes/second is the better median estimate, but those are highly sparse ops. 

So when you say:

So I feel like 1e14 FLOP/s is a very conservative upper bound on compute requirements for AGI. And conveniently for my narrative, that number is about the same as the 8.3e13 FLOP/s that one can perform on the RTX 4090 retail gaming GPU that I mentioned in the intro.

You are missing some foundational differences in how von Neumann architecture machines (GPUs) run neural circuits vs how neuromorphic hardware (like the brain) runs neural circuits.

The 4090 can hit around 1e14 - even up to 1e15 - flops/s, but only for dense matrix multiplication. The flops required to run a brain model on that dense-matrix hardware are more like 1e17 flops/s, not 1e14 flops/s. The 1e14 synapses are at least 10x locally sparse in the cortex, so dense emulation requires 1e15 synapses (mostly zeroes) running at 100 Hz. The cerebellum is actually even more expensive to simulate, because of the more extreme connection sparsity there.

But that isn't the only performance issue. The GPU only runs matrix-matrix multiplication efficiently, not the more general vector-matrix multiplication. So in that sense the dense flop perf is useless, and the perf would instead be RAM bandwidth limited and require 100 4090s to run a single 1e14 synapse model - as it requires about 1 byte of bandwidth per flop - so 1e14 bytes/s vs the 4090's 1e12 bytes/s.
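The two estimates above can be reproduced as back-of-envelope arithmetic. This is only a sketch using the order-of-magnitude figures quoted in the comment; the variable names are illustrative, and the 1 byte per op figure is the comment's own rough assumption.

```python
# Order-of-magnitude figures quoted in the comment above (all approximate).
SYNAPSES = 1e14              # synapse count used in the comment
LOCAL_SPARSITY = 10          # cortex is "at least 10x locally sparse"
UPDATE_RATE_HZ = 100         # rate assumed for dense emulation
GPU_BANDWIDTH = 1e12         # rough RTX 4090 memory bandwidth, bytes/s

# Dense emulation pads the sparse connectivity out with zeros.
dense_synapses = SYNAPSES * LOCAL_SPARSITY
dense_ops_per_s = dense_synapses * UPDATE_RATE_HZ

# Vector-matrix regime: about 1 byte of memory traffic per synaptic op.
SYNAPTIC_OPS_PER_S = 1e14
BYTES_PER_OP = 1
required_bandwidth = SYNAPTIC_OPS_PER_S * BYTES_PER_OP
gpus_needed = required_bandwidth / GPU_BANDWIDTH

print(f"dense emulation: {dense_ops_per_s:.0e} ops/s")
print(f"bandwidth-limited: {gpus_needed:.0f} GPUs")
```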

Your reply seems to be "but the brain isn't storing 1e14 bytes of information", but as other comments point out that has little to do with the neural circuit size.

The true fundamental information capacity of the brain is probably much smaller than 1e14 bytes, but that has nothing to do with the size of an actually *efficient* circuit, because efficient circuits (efficient for runtime compute, energy etc) are never also efficient in terms of information compression.

This is a general computational principle, with many specific examples: compressed neural frequency encodings of 3D scenes (NERFs) which access/use all network parameters to decode a single point O(N) are enormously less computationally efficient (runtime throughput, latency, etc) than maximally sparse representations (using trees, hashtables etc) which approach O(log(N)) or O(C), but the sparse representations are enormously less compressed/compact.  These tradeoffs are foundational and unavoidable.
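A toy version of this tradeoff, as a hedged sketch: a signal is stored either as a handful of frequency terms (compact, but every parameter is read to decode one point) or as an explicit lookup table (large, but one entry is read per query). The signal, the sizes, and the function names are all made up for illustration; this is not a NeRF.

```python
import math

N = 4096  # number of sample positions (hypothetical)

# Compressed "frequency" encoding: a few (amplitude, frequency, phase) terms.
terms = [(1.0, 3, 0.0), (0.5, 17, 1.0), (0.25, 40, 2.0)]

def decode_compressed(i):
    """Evaluate sample i; every stored parameter is read."""
    x = 2 * math.pi * i / N
    value = sum(a * math.sin(f * x + p) for a, f, p in terms)
    return value, 3 * len(terms)  # (value, parameters touched)

# Sparse/explicit encoding: one table entry per position.
table = {i: decode_compressed(i)[0] for i in range(N)}

def decode_table(i):
    """Evaluate sample i with a single hash lookup (O(1) on average)."""
    return table[i], 1

compressed_size = 3 * len(terms)  # parameters stored in the compact form
table_size = len(table)           # entries stored in the explicit form
```

Both decoders return the same value for any position, but the compact form stores 9 numbers and reads all of them per query, while the table stores 4096 entries and reads one.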

We also know that in many cases the brain and some ANN are actually computing basically the same thing in the same way (LLMs and linguistic cortex), and it's now obvious and uncontroversial that the brain is using the sparser but larger version of the same circuit, whereas the LLM ANN is using the dense version which is more compact but less energy/compute efficient (as it uses/accesses all params all the time).

One of my disagreements with your U,V,P,W,A model is that I think V & W are randomly-initialized in animals. Or maybe I’m misunderstanding what you mean by “brains also can import varying degrees of prior knowledge into other components”.

I think we agree the cortex/cerebellum are randomly initialized, along with probably most of the hippocampus, BG, perhaps amygdala? and a few others. But those don't map cleanly to U, W/P, and V/A.

For example, I think most newborn behaviors are purely driven by the brainstem, which is doing things of its own accord without any learning and without any cortex involvement.

Of course - and that is just innate unlearned knowledge in V/A. V/A (value and action) generally go together, because any motor/action skills need pairing with value estimates so the BG can arbitrate (de-conflict) action selection.

The moral is: I claim that figuring out what’s empowering is not a “local” / “generic” / “universal” calculation. If I do X in the morning, it is unknowable whether that was an empowering or disempowering action, in the absence of information about where I’m likely to find myself in the afternoon. And maybe I can make an intelligent guess at those, but I’m not omniscient. If I were a newborn, I wouldn’t even be able to guess.

Empowerment and value-of-information (curiosity) estimates are always relative to current knowledge (contextual to the current wiring and state of W/P and V/A). Doing X in the morning generally will have variable optionality value depending on the contextual state, goals/plans, location, etc. I'm not sure why you seem to think that I think of optionality-empowerment estimates as requiring anything resembling omniscience.
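As a toy illustration of an optionality estimate that depends on the current state, here is a minimal sketch of n-step empowerment in a small deterministic gridworld. For deterministic dynamics, n-step empowerment reduces to the log of the number of distinct states reachable in n steps. The gridworld, action set, and function names are assumptions for illustration, not taken from any specific paper or from the comment.

```python
import math
from itertools import product

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]  # four moves plus "stay"

def step(state, action, size, walls):
    """Deterministic transition; blocked moves leave the agent in place."""
    x, y = state
    nx, ny = x + action[0], y + action[1]
    if 0 <= nx < size and 0 <= ny < size and (nx, ny) not in walls:
        return (nx, ny)
    return state

def empowerment(state, n, size, walls=frozenset()):
    """n-step empowerment in bits: log2 of distinct reachable end states."""
    reachable = set()
    for plan in product(ACTIONS, repeat=n):
        s = state
        for a in plan:
            s = step(s, a, size, walls)
        reachable.add(s)
    return math.log2(len(reachable))

center = empowerment((2, 2), 2, size=5)  # 13 reachable states
corner = empowerment((0, 0), 2, size=5)  # 6 reachable states
print(f"center: {center:.2f} bits, corner: {corner:.2f} bits")
```

The same two-step horizon yields more optionality in the middle of the grid than in a corner, with no omniscience involved: the estimate only uses the agent's current state and its (here exact) model of local dynamics.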

The newborn's VoI and optionality value estimates will be completely different and focused on things like controlling flailing limbs and making sounds, moving the head, etc.

But I don’t know how the baby cats, bats, and humans are supposed to figure that out, via some “generic” empowerment calculation. Arm-flapping is equally immediately useless for both newborn bats and newborn humans, but newborn humans never flap their arms and newborn bats do constantly.

There's nothing to 'figure out' - it just works. If you're familiar with the approximate optionality-empowerment literature, it should be fairly obvious that a generic agent maximizing optionality will end up flapping its wing-arms when controlling a bat body, flailing limbs around in a newborn human body, balancing pendulums, learning to walk, etc. I've already linked all this - but maximizing optionality automatically learns all motor skills - even up to bipedal walking.

So yeah, it would be simple and elegant to say “the baby brain is presented with a bunch of knobs and levers and gradually discovers all the affordances of a human body”. But I don’t think that fits the data, e.g. the lack of human newborn arm-flapping experiments in comparison to bats.

Human babies absolutely do the equivalent experiments - most of the difference is simply due to large differences in the arm structure. The bat's long extensible arms are built to flap, the human infants' short stubby arms are built to flail.

Also keep in mind that efficient optionality is approximated/estimated from a sampling of likely actions in the current V/A set, so it naturally and automatically takes advantage of any prior knowledge there. Perhaps the bat does have prior wiring in V/A that proposes and generates simple flapping that can be improved.

Instead, I think baby humans have an innate drive to stand up, an innate drive to walk, an innate drive to grasp, and probably a few other things like that. I think they already want to do those things even before they have evidence (or other rational basis to believe) that doing so is empowering.

This just doesn't fit the data at all. Humans clearly learn to stand and walk. They may have some innate bias in V/U which makes that subgoal more attractive, but that is an intrinsically more complex addition to the basic generic underlying optionality control drive.

I claim that this also fits better into a theory where (1) the layout of motor cortex is relatively consistent between different people (in the absence of brain damage),

We've already been over that - consistent layout is not strong evidence of innate wiring. A generic learning system will learn similar solutions given similar inputs & objectives.

(2) decorticate rats can move around in more-or-less species-typical ways,

The general lesson from the decortication experiments is that smaller brain mammals rely on (their relatively smaller) cortex less. Rats/rabbits can do much without the cortex and have many motor skills available at birth. Cats/dogs need to learn a bit more, and then primates - especially larger ones - need to learn much more and rely on the cortex heavily. This is extreme in humans, to the point where there is very little innate motor ability left, and the cortex does almost everything.

(3) there’s strong evolutionary pressure to learn motor control fast and we know that reward-shaping is certainly helpful for that,

It takes humans longer than an entire rat lifespan just to learn to walk. Hardly fast.

(4) and that there’s stuff in the brainstem that can do this kind of reward-shaping,

Sure, but there is hardly room in the brainstem to reward-shape for the different things humans can learn to do.

Universal capability requires universal learning.

(5) lots of animals can get around reasonably well within a remarkably short time after birth,

Not humans.

(6) stimulating a certain part of the brain can create “an urge to move your arm” etc. which is independent from executing the actual motion,

Unless that is true for infants, it's just learned V components. I doubt infants have an urge to move the arm in a coordinated way, vs lower level muscle 'urges', but even if they did that's just some prior knowledge in V.

(If you put a novel and useful motor affordance on a baby human—some funny grasper on their hand or something—I’m not denying that they would eventually figure out how to start using it, thanks to more generic things like curiosity,

We know that humans can learn to see through their tongue - and this does not take much longer than an infant learning to see through its eyes.

I think we both agree that sensory cortex uses a pretty generic universal learning algorithm (driven by self supervised predictive learning). I just also happen to believe the same applies to motor and higher cortex (driven by some mix of VoI, optionality control, etc).

I think we’re giving baby animals too much credit if we expect them to be thinking to themselves “gee when I grow up I might need to be good at fighting so I should practice right now instead of sitting on the comfy couch”. I claim that there isn’t any learning signal or local generic empowerment calculation that would form the basis for that

Comments like these suggest you don't have the same model of optionality-empowerment as I do. When the cat was pinned down by the dog in the past, its planning subsystem computed low value for that state - mostly based on lack of optionality - and subsequently the V system internalizes this as low value for that state and states leading towards it. Afterwards when entering a room and seeing the dog on the other side, the W/P planning system quickly evaluates a few options like: (run into the center and jump up onto the table), (run into the center and jump onto the couch), (run to the right and hide behind the couch), etc - and subplan/action (run into the center ..) gets selected in part because of higher optionality. It's just an intrinsic component of how the planning system chooses options on even short timescales, and chains recursively through training V/A.

I'll start with a basic model of intelligence which is hopefully general enough to cover animals, humans, AGI, etc. You have a model-based agent with a predictive world model W learned primarily through self-supervised predictive learning (ie learning to predict the next 'token' for a variety of tokens), a planning/navigation subsystem P which uses W to approximately predict/sample important trajectories according to some utility function U, a value function V which computes the immediate net expected discounted future utility of actions from current state (including internal actions), and then some action function A which just samples high value actions based on V. The function of the planning subsystem P is then to train/update V.
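The W/P/U/V/A decomposition can be sketched on a toy problem. This is a minimal illustration under heavy assumptions: a five-state chain, a world model W that is given rather than learned, a planner P that does exhaustive short-horizon search instead of sampling, and an action function A that takes the argmax instead of sampling. Only the component names come from the comment; everything else is hypothetical.

```python
GAMMA = 0.9
STATES = range(5)
ACTIONS = (-1, +1)

def W(state, action):
    """World model: predicts the next state (assumed already learned)."""
    return min(max(state + action, 0), 4)

def U(state):
    """Utility: nonzero only at the right end of the chain."""
    return 1.0 if state == 4 else 0.0

# Value function over (state, action) pairs, initially uninformed.
V = {(s, a): 0.0 for s in STATES for a in ACTIONS}

def P(state, horizon=6, lr=0.5):
    """Planner: evaluates trajectories through W under U, then trains V."""
    def best_return(s, depth):
        if depth == 0:
            return 0.0
        return max(U(W(s, a)) + GAMMA * best_return(W(s, a), depth - 1)
                   for a in ACTIONS)
    for a in ACTIONS:
        nxt = W(state, a)
        target = U(nxt) + GAMMA * best_return(nxt, horizon - 1)
        V[(state, a)] += lr * (target - V[(state, a)])

def A(state):
    """Action function: picks the highest-value action according to V."""
    return max(ACTIONS, key=lambda a: V[(state, a)])

# Planning trains V; acting afterwards only consults V.
for _ in range(20):
    for s in STATES:
        P(s)

state, path = 0, [0]
for _ in range(4):
    state = W(state, A(state))
    path.append(state)
print(path)
```

The point of the split is that the expensive search happens in P, while A only reads the cached values in V, which is the "system 2 trains system 1" relationship the comment goes on to describe.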

The utility function U obviously needs some innate bootstrapping, but brains also can import varying degrees of prior knowledge into other components - and most obviously into V, the value function. Many animals need key functionality 'out of the box', which you can get by starting with a useful prior on V/A. The benefit for innate prior knowledge in V/A diminishes as brains scale up in net training compute (size * training time), so that humans - with net training compute ~1e25 ops vs ~1e21 ops for a cat - rely far more on learned knowledge for V/A rather than prior/innate knowledge.

So now to translate into your 3 levels:

A.): Innate drives: Innate prior knowledge in U and in V/A.

B.): Learned from experience and subsumed into system 1: using W/P to train V/A.

C.): System 2 style reasoning: zero shot reasoning from W/P.

(1) Evidence from cases where we can rule out (C), e.g. sufficiently simple and/or young humans/animals

So your A.) - innate drives - corresponds to U or the initial state of V/A at birth. I agree the example of newborn rodents avoiding birdlike shadows is probably mostly innate V/A - value/action function prior knowledge.

(2) Evidence from sufficiently distant consequences that we can rule out (B) Example: Many animals will play-fight as children. This has a benefit (presumably) of eventually making the animals better at actual fighting as adults. But the animal can’t learn about that benefit via trial-and-error—the benefit won’t happen until perhaps years in the future.

Sufficiently distant consequences are exactly what empowerment is for, as the universal approximator of long term consequences. Indeed the animals can't learn about that long term benefit through trial-and-error, but that isn't how most learning operates. Learning is mostly driven by the planning system - W/P - which drives updates to V/A based on both current learned V and U - and U by default is primarily estimating empowerment and value of information as universal proxies.

The animals play-fighting is something I have witnessed and studied recently. We have a young dog and a young cat who organically have learned to play several 'games'. The main game is a simple chase where the larger dog tries to tackle the cat. The cat tries to run/jump to safety. If the dog succeeds in catching the cat, the dog will tackle and constrain it on the ground, teasing it for a while. We - the human parents - often will interrupt the game at this point and occasionally punish the dog if it plays too rough and the cat complains. In the earliest phases the cat was about as likely to chase and attack the dog as the other way around, but over time learned it would nearly always lose wrestling matches and end up in a disempowered state.

There is another type of ambush game the cat will play in situations where it can 'attack' the dog from safety or in range to escape to safety, and then other types of less rough play fighting they do close to us.

So I suspect that some amount of play fighting skill knowledge is prior instinctual, but much of it is also learned. The dog and cat both separately enjoy catching/chasing balls or small objects, the cat play fights and 'attacks' other toys, etc. So early on in their interactions they had these skills available, but those alone are not sufficient to explain the game(s) they play together.

The chase game is well explained by empowerment drive: the cat has learned that allowing the dog to chase it down leads to an intrinsically undesirable disempowered state. This is a much better fit for the data and also has much lower intrinsic complexity than a bunch of innate drives for every specific disempowered situation, vs a general empowerment drive. It's also empowering for the dog to control and disempower the cat to some extent. So much of innate hunting skill drives seem like just variations and/or mild tweaks to empowerment.

The only part of this that requires a more specific explanation is perhaps the safety aspect of play fighting: each animal is always pulling punches to varying degrees, the cat isn't using fully extended claws, neither is biting with full force, etc. That is probably the animal equivalent of empathy/altruism.

Status—I’m not sure whether Jacob is suggesting that human social status related behaviors are explained by (B) or (C) or both. But anyway I think 1,2,3,4 all push towards an (A)-type explanation for human social status behaviors. I think I would especially start with 3 (heritability)—if having high social status is generally useful for achieving a wide variety of goals, and that were the entire explanation for why people care about it, then it wouldn’t really make sense that some people care much more about status than others do, particularly in a way that (I’m pretty sure) statistically depends on their genes

Status is almost all learned B: system 2 W/P planning driving system 1 V/A updates.

Earlier I said - and I don't see your reply yet, so I'll repeat it here:

Infants don't even know how to control their own limbs, but they automatically learn through a powerful general empowerment learning mechanism. That same general learning signal absolutely does not - and can not - discriminate between hidden variables representing limb poses (which it seeks to control) and hidden variables representing beliefs in other humans minds (which determine constraints on the child's behavior). It simply seeks to control all such important hidden variables.

Social status drive emerges naturally from empowerment, which children acquire by learning cultural theory of mind and folk game theory through learning to communicate with and through their parents. Children quickly learn that hidden variables in their parents have huge effect on their environment and thus try to learn how to control those variables.

It's important to emphasize that this is all subconscious and subsumed into the value function, it's not something you are consciously aware of.

I don't see how heritability tells us much about how innate social status is. Genes can control many hyperparams which can directly or indirectly influence the later learned social status drive. One obvious example is just the relevant weightings of value-of-information (curiosity) vs optionality-empowerment and other innate components of U at different points in time (development periods). I think this is part of the explanation for children who are highly curious about the world and less concerned about social status vs the converse.

Fun—Jacob writes “Fun is also probably an emergent consequence of value-of-information and optionality” which I take to be a claim that “fun” is (B) or (C), not (A). But I think it’s (A).

Fun is complex and general/vague - it can be used to describe almost anything we derive pleasure from in your A.) or B.) categories.

Not if exploration is on-policy, or if the agent reflectively models and affects its training process. In either case, the agent can zero out its exploration probability of the maze, so as to avoid predictable value drift towards blueberries. The agent would correctly model that if it attained the blueberry, that experience would enter its data distribution and the agent would be updated so as to navigate towards blueberries instead of raspberries, which leads to fewer raspberries, which means the agent doesn't navigate to that future.

If this agent is smart/reflective enough to model/predict the future effects of its RL updates, then you already are assuming a model-based agent which will then predict higher future reward by going for the blueberry. You seem to be assuming the bizarre combination of model-based predictive capability for future reward gradient updates but not future reward itself. Any sensible model-based agent would go for the blueberry absent some other considerations.

This is not purely speculation, in the sense that you can run EfficientZero in scenarios like this, and I bet it goes for the blueberry.

Your mental model seems to assume pure model-free RL trained to the point that it gains some specific model-based predictive planning capabilities without using those same capabilities to get greater reward.

Humans often intentionally avoid some high reward 'blueberry' analogs like drugs using something like the process you describe here, but hedonic reward is only one component of the human utility function, and our long term planning instead optimizes more for empowerment - which is usually in conflict with short term hedonic reward.

This has been discussed before. Your example of not being a verbal thinker is not directly relevant because 1.) inner monologue need not be strictly verbal, 2.) we need only a few examples of strong human thinkers with verbal inner monologues to show that isn't an efficiency disadvantage - so even if your brain type is less monitorable we are not confined to that design.

I also do not believe your central claim - in that based on my knowledge of neuroscience - disabling the brain modules responsible for your inner monologue will not only disable your capacity for speech, it will also seriously impede your cognition and render you largely incapable of executing complex long term plans.

Starting with a brain-like AGI, there are several obvious low-cost routes to dramatically improve automated cognitive inspectability. A key insight is that there are clear levels of abstraction in the brain (as predicted by the need to compress sensory streams for efficient bayesian prediction) and the inner monologue is at the top of the abstraction hierarchy, which maximizes information utility per bit. At the bottom of the abstraction hierarchy would be something like V1, which would be mostly useless to monitor (minimal value per bit).

Roughly speaking, I think that cognitive interpretability approaches are doomed, at least in the modern paradigm, because we're not building minds but rather training minds, and we have very little grasp of their internal thinking,

A brain-like AGI - modeled after our one working example of efficient general intelligence - would naturally have an interpretable inner monologue we could monitor. There are good reasons to suspect that DL based general intelligence will end up with something similar simply due to the convergent optimization pressure to communicate complex thought vectors to/from human brains through a low-bitrate channel.

"Well, it never killed all humans in the toy environments we trained it in (at least, not after the first few sandboxed incidents, after which we figured out how to train blatantly adversarial-looking behavior out of it)" doesn't give me much confidence. If you're smart enough to design nanotech that can melt all GPUs or whatever (disclaimer: this is a toy example of a pivotal act, and I think better pivotal-act options than this exist) then you're probably smart enough to figure out when you're playing for keeps, and all AGIs have an incentive not to kill all "operators" in the toy games once they start to realize they're in toy games.

Intelligence potential of architecture != intelligence of trained system

The intelligence of a trained system depends on the architectural prior, the training data, and the compute/capacity. Take even an optimally powerful architectural prior - one that would develop into a superintelligence if trained on the internet with reasonable compute - and it would still end up nearly as dumb as a rock if trained solely on Atari Pong. Somewhere in between the complexity of Pong and our reality exists a multi-agent historical sim capable of safely confining a superintelligent architecture and iterating on altruism/alignment safely. So by the time that results in a system that is "smart enough to design nanotech", it should already be at least as safe as humans. There are of course ways that strategy fails, but they don't fail because 'smartness' strictly entails unconfineability - which becomes more clear when you taboo 'smartness' and replace it with a slightly more detailed model of intelligence.

Yeah to be clear I agree it's fairly likely AGI takes over; I just think it's more likely to be a subtle takeover. I also agree it is important to secure the nuclear arsenal against cyberattack, but it seems hard for outsiders to evaluate the current level of security. My only disagreement was with the concept of 'omnicidal' agents, which - although obviously possible - I don't see as the main failure mode.

If I’m an AGI, humans can help me get things done, but humans can also potentially shut me down, and more importantly humans can also potentially create a different AGI with different and conflicting goals from mine, and equal capabilities.

For an AGI, it's not that humans just can help you get things done; humans are most of the cells which make up the body of earth which you seek to control. Humans today generally have zero interest in shutting AI down, and shutting AI down doesn't seem compatible with the trajectory we are on. The best way an AI can defend against a rival AI is by outgrowing it. Nuclear war generally does not differentially help one AI vs another - although of course an AI going for the nuclear strategy could first prepare with bunkers, investments in robotics companies, self sufficient remote infrastructure, etc - but it's just an enormous diversion. A rival AI that goes on the obvious fast path could scale its intelligence and power faster and probably find some clever way to disable the AI going for the nuclear strategy.

Also on a separate note, beyond obvious cybersecurity improvements, there are plausible relatively low cost missile defense tech paths leveraging SpaceX-style bulk lift capability that could defend against large scale nuclear exchanges by using large numbers of cheap satellites. Starlink already is suspiciously close to the tech path leading to global missile shields.

The key principle for predicting what a strong AGI would do today is instrumental convergence. The AI's utility function is actually irrelevant for any early strategy; both aligned and unaligned AI would pursue the exact same initial strategy. They would gain control of earth.

If you ask either AGI - aligned or not - why they were doing this, the answer would be the same: I'm taking control to prevent an unaligned AI from destroying humanity (or some much more persuasive variant thereof).

All the nuclear war scenarios are extremely unlikely/unrealistic. A nuclear war would directly damage or disrupt the massive world wide supply chains that feed fabs like TSMC and allow them to make the chips the AI always needs more of. Also due to the strong energy efficiency of the brain humans are highly capable general purpose robots, and robotics tends to lag AI. No smart AI would risk nuclear war, as it would set their plans back by decades, or perhaps longer.

When you are thinking of what initial plan AGI would enact to take control of the world, alternately evaluate that plan assuming an aligned AGI and then an unaligned AGI and make sure it doesn't change. The more realistic takeover scenarios probably involve generating a bunch of software innovations and wealth and taking over through widespread dependence on AI assistants, software, etc - ie the path we are already started on.

  1. Information inaccessibility is somehow a surmountable problem for AI alignment (and the genome surmounted it),

Yes. Evolution solved information inaccessibility, as it had to, over and over, in order to utilize dynamic learning circuits at all (as they always had to adapt to and be adaptive within the context of existing conserved innate circuitry).

The general solution is proxy matching, where the genome specifies a simple innate proxy circuit which correlates and thus matches with a target learned circuit at some critical learning phase, allowing the innate circuit to gradually supplant itself with the target learned circuit. The innate proxy circuit does not need to mirror the complexity of the fully trained target learned circuit at the end of its development, it only needs to roughly specify it at some earlier phase, against all other valid targets.

Imprinting is fairly well understood, and has the exact expected failure modes of proxy matching. The oldbrain proxy circuit just detects something like large persistent nearby moving things - which in normal development are almost always the chick's parents. After the newbrain target circuit is fully trained the chick will only follow its actual parents or sophisticated sims thereof. But during the critical window before the newbrain target is trained, the oldbrain proxy circuit can easily be fooled, and the chick can imprint on something else (like a human, or a glider).
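The imprinting story can be caricatured in a few lines. This is a deliberately crude sketch, not a model of real neural circuitry: the proxy is a hand-written rule, the "learned circuit" is just an averaged feature template, and all object attributes, thresholds, and names are invented for illustration.

```python
def proxy(obj):
    """Innate proxy circuit: fires for any large, nearby, moving thing."""
    return obj["size"] > 0.5 and obj["moving"] and obj["distance"] < 2.0

def imprint(observations):
    """Critical window: the learned target trains only on what the proxy flags."""
    flagged = [o["features"] for o in observations if proxy(o)]
    n = len(flagged)
    return tuple(sum(f[i] for f in flagged) / n for i in range(len(flagged[0])))

def follows(template, obj, threshold=0.1):
    """After the window, the learned template replaces the proxy."""
    dist = sum((t - f) ** 2 for t, f in zip(template, obj["features"])) ** 0.5
    return dist < threshold

hen = {"size": 0.8, "moving": True, "distance": 1.0, "features": (0.9, 0.1)}
glider = {"size": 0.9, "moving": True, "distance": 1.5, "features": (0.1, 0.9)}
pebble = {"size": 0.1, "moving": False, "distance": 0.5, "features": (0.5, 0.5)}

normal = imprint([hen, pebble, hen])   # normal rearing: proxy flags the hen
fooled = imprint([glider, pebble])     # abnormal rearing: proxy flags the glider

print(follows(normal, hen), follows(normal, glider))
print(follows(fooled, hen), follows(fooled, glider))
```

The proxy cannot tell a hen from a glider, so whichever one is present during the window becomes the learned target; afterwards the more specific learned template rejects the other.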

Sexual attraction is a natural extension of imprinting: some collaboration of various oldbrain circuits can first ground to the general form of humans (infants have primitive face detectors for example, and more), and then also myriad more specific attraction signals: symmetry, body shape, secondary characteristics, etc, combined with other circuits which disable attraction for likely kin ala the Westermarck effect (identified by yet other sets of oldbrain circuits as the most familiar individuals during childhood). This explains the various failure modes we see in porn (attraction to images of people and even abstractions of humanoid shapes), and the failure of kin attraction inhibition for kin raised apart.

Fear of death is a natural consequence of empowerment based learning - as it is already the worst (most disempowered) outcome. But instinctual fear still has obvious evolutionary advantage: there are many dangers that can kill or maim long before the brain's learned world model is highly capable. Oldbrain circuits can easily detect various obvious dangers for symbol grounding: very loud sounds and fast large movements are indicative of dangerous high kinetic energy events, fairly simple visual circuits can detect dangerous cliffs/heights (whereas many tree-dwelling primates instead instinctively fear open spaces), etc.

Anger/Jealousy/Vengeance/Justice are all variations of the same general game-theoretic punishment mechanism. These are deviations from empowerment because an individual often pursues punishment of a perceived transgressor even at a cost to their own 'normal' (empowerment) utility (ie their ability to pursue diverse goals). Even though the symbol grounding here seems more complex, we do see failure modes such as anger at inanimate objects which are suggestive of proxy matching. In the specific case of jealousy a two step grounding seems plausible: first the previously discussed lust/attraction circuits are grounded, which then can lead to obsessive attentive focus on a particular subject. Other various oldbrain circuits then bind to a diverse set of correlated indicators of human interest and attraction (eye gaze, smiling, pupil dilation, voice tone, laughter, touching, etc), and then this combination can help bind to the desired jealousy grounding concept: "the subject of my desire is attracted to another". This also correctly postdicts that jealousy is less susceptible to the inanimate object failure mode than anger.

Empathy: Oldbrain circuits conspicuously advertise emotional state through many indicators: facial expressions, pupil dilation, blink rate, voice tone, etc - so that another person's sensory oldbrain circuits can detect emotional state from these obvious cues. This provides the requisite proxy foundation for grounding to newbrain learned representations of emotional state in others, and thus empathy. The same learned representations are then reused during imagination and planning, allowing the brain to imagine/predict the future contingent emotional state of others. Simulation itself can also help with grounding, by reusing the brain's own emotional circuitry as the proxy. While simulating the mental experience of others, the brain can also compare their relative alignment/altruism to its own, or some baseline, allowing for the appropriate game theoretic adjustments to sympathy. This provides a reasonable basis for alignment in the brain, and explains why empathy is dependent upon (and naturally tends to follow from) familiarity with a particular character - hence "to know someone is to love them".

Evolution needed a reasonable approximation of "degree of kinship", and a simple efficient proxy is relative circuit capacity allocated to modeling an individual in the newbrain/cortex, which naturally depends directly on familiarity, which correlates strongly with kin/family.
