A Brief History of Intelligence: Evolution, AI, and the Five Breakthroughs That Made Our Brains - by Max Bennett

We need to ground our understanding of how the brain works and how it evolved in our understanding of how intelligence works—for which we must look to the field of artificial intelligence. The relationship between AI and the brain goes both ways; while the brain can surely teach us much about how to create artificial humanlike intelligence, AI can also teach us about the brain. If we think some part of the brain uses some specific algorithm but that algorithm doesn’t work when we implement it in machines, this gives us evidence that the brain might not work this way. Conversely, if we find an algorithm that works well in AI systems, and we find parallels between the properties of these algorithms and properties of animal brains, this gives us some evidence that the brain might indeed work this way.

The World Before Brains

Even the simplest single-celled organisms—such as bacteria—have proteins designed for movement, motorized engines that convert cellular energy into propulsion, rotating propellers using a mechanism no less complex than the motor of a modern boat. Bacteria also have proteins designed for perception—receptors that reshape when they detect certain features of the external environment, such as temperature, light, or touch. Armed with proteins for movement and perception, early life could monitor and respond to the outside world.

Cellular respiration requires sugar to produce energy, and this basic need provided the energetic foundation for the eventual intelligence explosion that occurred uniquely within the descendants of respiratory life. While most, if not all, microbes at the time exhibited primitive levels of intelligence, it was only in respiratory life that intelligence was later elaborated and extended. Respiratory microbes differed in one crucial way from their photosynthetic cousins: they needed to hunt. And hunting required a whole new degree of smarts.

This fueled the engine of evolutionary progress; for every defensive innovation prey evolved to stave off being killed, predators evolved an offensive innovation to overcome that same defense. Life became caught in an arms race, a perpetual feedback loop: offensive innovations led to defensive innovations that required further offensive innovations.

Animals with neurons share a common ancestor, an organism in whom the first neurons evolved and from whom all neurons descend. It seems that in this ancient grandmother of animals, neurons attained their modern form; from this point on, evolution rewired neurons but made no meaningful adjustments to the basic unit itself. This is a glaring example of how prior innovations impose constraints on future innovations, often leaving early structures unchanged—the fundamental building blocks of brains have been essentially the same for over six hundred million years.

These features of neurons—all-or-nothing spikes, rate coding, adaptation, and chemical synapses with excitatory and inhibitory neurotransmitters—are universal across all animals, even in animals that have no brain, such as coral polyps and jellyfish. Why do all neurons share these features? If early animals were, in fact, like today’s corals and anemones, then these aspects of neurons enabled ancient animals to successfully respond to their environment with speed and specificity, something that had become necessary to actively capture and kill level-two multicellular life. All-or-nothing electrical spikes triggered rapid and orchestrated reflexive movements so animals could catch prey in response to even the subtlest of touches or smells. Rate coding enabled animals to modify their responses based on the strengths of a touch or smell. Adaptation enabled animals to adjust the sensory threshold for when spikes are generated, allowing them to be highly sensitive to even the subtlest of touches or smells while also preventing overstimulation at higher strengths of stimuli. What about inhibitory neurons? Why did they evolve? Consider the simple task of a coral polyp opening or closing its mouth. For its mouth to open, one set of muscles must contract and another must relax. And the converse for closing its mouth. The existence of both excitatory and inhibitory neurons enabled the first neural circuits to implement a form of logic required for reflexes to work. They can enforce the rule of “do this, not that,” which was perhaps the first glimmer of intellect to emerge from circuits of neurons.

While the first animals, whether gastrula-like or polyp-like creatures, clearly had neurons, they had no brain. Like today’s coral polyps and jellyfish, their nervous system was what scientists call a nerve net: a distributed web of independent neural circuits implementing their own independent reflexes. But with the evolutionary feedback loop of predator-prey in full force, with the animal niche of active hunting, and the building blocks of neurons in place, it was only a matter of time before evolution stumbled on breakthrough #1, which led to rewiring nerve nets into brains.

Breakthrough #1: Steering and the First Bilaterians

The Birth of Good and Bad

Almost all animals on Earth have the same body plan. They all have a front that contains a mouth, a brain, and the main sensory organs (such as eyes and ears), and they all have a back where waste comes out. Evolutionary biologists call animals with this body plan bilaterians because of their bilateral symmetry. This is in contrast to our most distant animal cousins—coral polyps, anemones, and jellyfish—which have body plans with radial symmetry; that is, with similar parts arranged around a central axis, without any front or back. The most obvious difference between these two categories is how the animals eat. Bilaterians eat by putting food in their mouths and then pooping out waste products from their butts. Radially symmetrical animals have only one opening—a mouth-butt if you will—which swallows food into their stomachs and spits it out.

Radially symmetrical body plans work fine with the coral strategy of waiting for food. But they work horribly for the hunting strategy of navigating toward food. Radially symmetrical body plans, if they were to move, would require an animal to have sensory mechanisms to detect the location of food in any direction and then have the machinery to move in any direction. In other words, they would need to be able to simultaneously detect and move in all different directions. Bilaterally symmetrical bodies make movement much simpler. Instead of needing a motor system to move in any direction, they simply need one motor system to move forward and one to turn. Bilaterally symmetrical bodies don’t need to choose the exact direction; they simply need to choose whether to adjust to the right or the left. Even modern human engineers have yet to find a better structure for navigation. Cars, planes, boats, submarines, and almost every other human-built navigation machine are bilaterally symmetric. It is simply the most efficient design for a movement system. Bilateral symmetry allows a movement apparatus to be optimized for a single direction (forward) while solving the problem of navigation by adding a mechanism for turning.

There is another observation about bilaterians, perhaps the more important one: They are the only animals that have brains. This is not a coincidence. The first brain and the bilaterian body share the same initial evolutionary purpose: They enable animals to navigate by steering. Steering was breakthrough #1.

It turns out that to successfully navigate in the complicated world of the ocean floor, you don’t actually need an understanding of that two-dimensional world. You don’t need an understanding of where you are, where food is, what paths you might have to take, how long it might take, or really anything meaningful about the world. All you need is a brain that steers a bilateral body toward increasing food smells and away from decreasing food smells.
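The steering rule is simple enough to sketch in a few lines of code. Below is a minimal, hypothetical simulation (all names and numbers are invented for illustration, not taken from the book): an agent on a flat plane samples the local "smell," keeps moving forward while the smell is getting stronger, and turns in a random direction whenever it is getting weaker.

```python
import math
import random

def smell_at(x, y, food=(0.0, 0.0)):
    """Toy smell gradient: stronger the closer the agent is to the food (made-up field)."""
    return 1.0 / (1.0 + (x - food[0]) ** 2 + (y - food[1]) ** 2)

def steer(steps=500):
    x, y = 5.0, 5.0                            # start away from the food
    heading = random.uniform(0.0, 2 * math.pi)
    last_smell = smell_at(x, y)
    for _ in range(steps):
        smell = smell_at(x, y)
        if smell < last_smell:                 # smell fading: turn
            heading += random.uniform(-2.0, 2.0)
        last_smell = smell                     # smell growing: keep going forward
        x += 0.1 * math.cos(heading)
        y += 0.1 * math.sin(heading)
    return x, y                                # tends to end up near the food on most runs
```

This biased "run and tumble" strategy climbs the gradient without any map, memory of locations, or model of the world, which is the whole point: steering needs only a local comparison of better versus worse.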

In the 1980s and 1990s a schism emerged in the artificial intelligence community. On one side were those in the symbolic AI camp, who were focused on decomposing human intelligence into its constituent parts in an attempt to imbue AI systems with our most cherished skills: reasoning, language, problem solving, and logic. In opposition were those in the behavioral AI camp, led by the roboticist Rodney Brooks at MIT, who believed the symbolic approach was doomed to fail because “we will never understand how to decompose human level intelligence until we’ve had a lot of practice with simpler level intelligences.”

The navigational strategies of the Roomba and first bilaterians were not identical. But it may not be a coincidence that the first successful domestic robot contained an intelligence not so unlike the intelligence of the first brains. Both used tricks that enabled them to navigate a complex world without actually understanding or modeling that world. While others remained stuck in the lab working on million-dollar robots with eyes and touch and brains that attempted to compute complicated things like maps and movements, Brooks built the simplest possible robot, one that contained hardly any sensors and that computed barely anything at all. But the market, like evolution, rewards three things above all: things that are cheap, things that work, and things that are simple enough to be discovered in the first place.

The breakthrough of steering required bilaterians to categorize the world into things to approach (“good things”) and things to avoid (“bad things”). Even a Roomba does this—obstacles are bad; charging station when low on battery is good. Earlier radially symmetric animals did not navigate, so they never had to categorize things in the world like this. When animals categorize stimuli into good and bad, psychologists and neuroscientists say they are imbuing stimuli with valence. Valence is the goodness or badness of a stimulus. Valence isn’t about a moral judgment; it’s something far more primitive: whether an animal will respond to a stimulus by approaching it or avoiding it. The valence of a stimulus is, of course, not objective; a chemical, image, or temperature, on its own, has no goodness or badness. Instead, the valence of a stimulus is subjective, defined only by the brain’s evaluation of its goodness or badness.

This requirement of integrating input across sensory modalities was likely one reason why steering required a brain and could not have been implemented in a distributed web of reflexes like those in a coral polyp. All these sensory inputs voting for steering in different directions had to be integrated together in a single place to make a single decision; you can go in only one direction at a time. The first brain was this mega-integration center—one big neural circuit in which steering directions were selected.

This is another example of how past innovations enabled future innovations. Just as a bilaterian cannot both go forward and turn at the same time, a coral polyp cannot both open and close its mouth at the same time. Inhibitory neurons evolved in earlier coral-like animals to enable these mutually exclusive reflexes to compete with each other so that only one reflex could be selected at a time; this same mechanism was repurposed in early bilaterians to enable them to make trade-offs in steering decisions. Instead of deciding whether to open or close one’s mouth, bilaterians used inhibitory neurons to decide whether to go forward or turn.

Steering requires at least four things: a bilateral body plan for turning, valence neurons for detecting and categorizing stimuli into good and bad, a brain for integrating input into a single steering decision, and the ability to modulate valence based on internal states.

The Origin of Emotion

Our internal states are not only imbued with a level of valence, but also a degree of arousal. Blood-boiling fury is not only a bad mood but an aroused bad mood. Different from an unaroused bad mood, like depression or boredom. Similarly, the tingly serenity of lying on a warm beach is not only a good mood but a good mood with low arousal. Different from the highly arousing good mood produced by getting accepted to college or riding a roller coaster (if you like that sort of thing). Neuroscientists and psychologists use the word affect to refer to these two attributes of emotions; at any given point, humans are in an affective state represented by a location across these two dimensions of valence and arousal. While rigorous definitions of categories of human emotions themselves elude philosophers, psychologists, and neuroscientists alike, affect is the relatively well accepted unifying foundation of emotion.

The defining feature of these affective states is that, although often triggered by external stimuli, they persist long after the stimuli are gone.

The idea that the function of the first brain was for steering provides a clue as to why nematodes have—and why the first bilaterians likely also had—these states: such persistence is required for steering to work. Sensory stimuli, especially the simple ones detected by nematodes, offer transient clues, not consistent certainties, of what exists in the real world. In the wild, outside of a scientist’s petri dish, food does not make perfectly distributed smell gradients—water currents can distort or even completely obscure smells, disrupting a worm’s ability to steer toward food or away from predators. These persistent affective states are a trick to overcome this challenge: If I detect a passing sniff of food that quickly fades, it is likely that there is food nearby even if I no longer smell it. Therefore, it is more effective to persistently search my surroundings after encountering food, as opposed to only responding to food smells in the moment that they are detected. Similarly, a worm passing through an area full of predators won’t experience a constant smell of predators but rather catch a transient hint of one nearby; if a worm wants to escape, it is a good idea to persistently swim away even after the smell has faded.

The brain of a nematode generates these affective states using chemicals called neuromodulators. Two of the most famous neuromodulators are dopamine and serotonin. The simple brain of the nematode offers a window into the first, or at least very early, functions of dopamine and serotonin. In the nematode, dopamine is released when food is detected around the worm, whereas serotonin is released when food is detected inside the worm. If dopamine is the something-good-is-nearby chemical, then serotonin is the something-good-is-actually-happening chemical. Dopamine drives the hunt for food; serotonin drives the enjoyment of it once it is being eaten.

While the exact functions of dopamine and serotonin have been elaborated throughout different evolutionary lineages, this basic dichotomy between dopamine and serotonin has been remarkably conserved since the first bilaterians. In species as divergent as nematodes, slugs, fish, rats, and humans, dopamine is released by nearby rewards and triggers the affective state of arousal and pursuit (exploitation); and serotonin is released by the consumption of rewards and triggers a state of low arousal, inhibiting the pursuit of rewards (satiation).

Dopamine is not a signal for pleasure itself; it is a signal for the anticipation of future pleasure.

The affective state of escape, whereby nematodes rapidly attempt to swim to a new location, is in part triggered by a different class of neuromodulators: norepinephrine, octopamine, and epinephrine (also called adrenaline). Across bilaterians, including species as divergent as nematodes, earthworms, snails, fish, and mice, these chemicals are released by negative-valenced stimuli and trigger the well-known fight-or-flight response: increasing heart rate, constricting blood vessels, dilating pupils, and suppressing various luxury activities, such as sleep, reproduction, and digestion. These neuromodulators work in part by directly counteracting the effectiveness of serotonin—reducing the ability of an animal to rest and be content.

Any consistent, inescapable, or repeating negative stimuli, such as constant pain or prolonged starvation, will shift a nematode brain into a state of chronic stress. Chronic stress isn’t all that different from acute stress; stress hormones and opioids remain elevated, chronically inhibiting digestion, immune response, appetite, and reproduction. But chronic stress differs from acute stress in at least one important way: it turns off arousal and motivation.

Affect, despite all its modern color, evolved 550 million years ago in early bilaterians for nothing more than the mundane purpose of steering. The basic template of affect seems to have emerged from two fundamental questions in steering. The first was the arousal question: Do I want to expend energy moving or not? The second was the valence question: Do I want to stay in this location or leave this location? The release of specific neuromodulators enforced specific answers to each of these questions. And these global signals for stay and leave could then be used to modulate suites of reflexes, such as whether it was safe to lay eggs, mate, and expend energy digesting food.

Associating, Predicting, and the Dawn of Learning

It turns out that Pavlov’s associative learning is an intellectual ability of all bilaterians, even simple ones. If you expose nematodes simultaneously to both a yummy food smell and a noxious chemical that makes them sick, nematodes will subsequently steer away from that food smell. If you feed nematodes at a specific temperature, they will shift their preferences toward that temperature. Pair a gentle tap to the side of a slug with a small electric shock, which triggers a withdrawal reflex, and the slug will learn to withdraw to just the tap, an association that will last for days.

While learning in modern AI systems is not continual, learning in biological brains has always been continual. Even our ancestral nematode had no choice but to learn continually. The associations between things were always changing. In some environments, salt was found on food; in others, it was found on barren rocks without food. In some environments, food grew at cool temperatures; in others, it grew at warm temperatures. In some environments, food was found in bright areas; in others, predators were found in bright areas. The first brains needed a mechanism to not only acquire associations but also quickly change these associations to match the changing rules of the world.

The first bilaterians used these tricks of acquisition, extinction, spontaneous recovery, and reacquisition to navigate changing contingencies in their world.

Associative learning comes with another problem: When an animal gets food, there is never a single predictive cue beforehand but rather a whole swath of cues. If you pair a tap to the side of a slug with a shock, how does a slug’s brain know to associate only the tap with the shock and not the many other sensory stimuli that were present, such as the surrounding temperature, the texture of the ground, or the diverse chemicals floating around the seawater? In machine learning, this is called the credit assignment problem: When something happens, what previous cue do you give credit for predicting it? The ancient bilaterian brain, which was capable of only the simplest forms of learning, employed four tricks to solve the credit assignment problem. These tricks were both crude and clever, and they became foundational mechanisms for how neurons make associations in all their bilaterian descendants.

The first trick used what are called eligibility traces. A slug will associate a tap with a subsequent shock only if the tap occurs one second before the shock. If the tap occurs two seconds or more before the shock, no association will be made. A stimulus like a tap creates a short eligibility trace that lasts for about a second. Only within this short time window can associations be made. This is clever, as it invokes a reasonable rule of thumb: stimuli that are useful for predicting things should occur right before the thing you are trying to predict.

The second trick was overshadowing. When animals have multiple predictive cues to use, their brains tend to pick the cues that are the strongest—strong cues overshadow weak cues. If a bright light and a weak odor are both present before an event, the bright light, not the weak odor, will be used as the predictive cue.

The third trick was latent inhibition—stimuli that animals regularly experienced in the past are inhibited from making future associations. In other words, frequent stimuli are flagged as irrelevant background noise. Latent inhibition is a clever way to ask, “What was different this time?” If a slug has experienced the current texture of the ground and the current temperature a thousand times but has never experienced a tap before, then the tap is far more likely to be used as a predictive cue.

The fourth and final trick for navigating the credit assignment problem was blocking. Once an animal has established an association between a predictive cue and a response, all further cues that overlap with the predictive cue are blocked from association with that response. If a slug has learned that a tap leads to shock, then a new texture, temperature, or chemical will be blocked from being associated with the shock. Blocking is a way to stick to one predictive cue and avoid redundant associations.
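The first of these tricks, the eligibility trace, lends itself to a short sketch. The toy code below uses hypothetical numbers and function names (not anything from the book): a stimulus leaves a decaying trace, and a stimulus-outcome association is strengthened only to the degree that the trace is still active when the outcome arrives.

```python
import math

TRACE_DECAY = 1.0  # seconds; assumed here to mimic the slug's roughly one-second window

def eligibility(seconds_since_stimulus):
    """Exponentially decaying trace left behind by a stimulus."""
    return math.exp(-seconds_since_stimulus / TRACE_DECAY)

def update_association(weight, seconds_since_stimulus, outcome_strength, lr=0.5):
    """Strengthen the association only as much as the stimulus is still 'eligible' for credit."""
    return weight + lr * eligibility(seconds_since_stimulus) * outcome_strength

w_recent_tap = update_association(0.0, 0.5, outcome_strength=1.0)  # ~0.30: gets most of the credit
w_stale_tap = update_association(0.0, 3.0, outcome_strength=1.0)   # ~0.02: too old, gets almost none
```

With a roughly one-second decay, a tap half a second before the shock earns most of the credit, while a tap three seconds earlier earns almost none.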

Learning occurs when synapses change their strength or when new synapses are formed or old synapses are removed.

From the bilaterian brain onward, the evolution of learning was primarily a process of finding new applications of preexisting synaptic learning mechanisms, without changing the learning mechanisms themselves. Learning was not the core function of the first brain; it was merely a feature, a trick to optimize steering decisions. Association, prediction, and learning emerged for tweaking the goodness and badness of things. In some sense, the evolutionary story that will follow is one of learning being transformed from a cute feature of the brain to its core function.

Summary of Breakthrough #1: Steering

Our ancestors from around 550 million years ago transitioned from a radially symmetric brainless animal, like a coral polyp, to a bilaterally symmetric brain-enabled animal, like a nematode. And while many neurological changes occurred across this transition, a surprisingly broad set of them can be understood through the lens of enabling a singular breakthrough: that of navigating by steering. These include:

  • A bilateral body plan that reduces navigational choices to two simple options: go forward or turn

  • A neural architecture for valence in which stimuli are evolutionarily hard-coded into good and bad

  • Mechanisms for modulating valence responses based on internal states

  • Circuits whereby different valence neurons can be integrated into a singular steering decision (hence a big cluster of neurons we identify as a brain)

  • Affective states for making persistent decisions as to whether to leave or stay

  • The stress response for energy management of movements in the presence of hardship

  • Associative learning for changing steering decisions based on previous experience

  • Spontaneous recovery and reacquisition for dealing with changing contingencies in the world (making continual learning work, even if imperfectly)

  • Eligibility traces, overshadowing, latent inhibition, and blocking for (imperfectly) tackling the credit assignment problem

All of these changes made steering possible and solidified our ancestors’ place as the first large multicellular animals who survived by navigating—moving not with microscopic cellular propellers but with muscles and neurons.

Breakthrough #2: Reinforcing and the First Vertebrates

The Cambrian Explosion

The discovery of steering in our nematode-like ancestor accelerated the evolutionary arms race of predation. This triggered what is now known as the Cambrian explosion, the most dramatic expansion in the diversity of animal life Earth has ever seen. From the heat of the Cambrian explosion was forged the vertebrate brain template, one that, even today, is shared across all the descendants of these early fishlike creatures.

The first animals gifted us neurons. Then early bilaterians gifted us brains, clustering these neurons into centralized circuits, wiring up the first system for valence, affect, and association. But it was early vertebrates who transformed this simple proto-brain of early bilaterians into a true machine, one with subunits, layers, and processing systems.

This ability of fish to learn arbitrary sequences of actions through trial and error has been replicated many times. Fish can learn to find and push a specific button to get food; fish can learn to swim through a small escape hatch to avoid getting caught in a net; and fish can even learn to jump through hoops to get food. Fish can remember how to do these tasks for months or even years after being trained. The process of learning is the same in all these tests: fish try some relatively random actions and then progressively refine their behavior depending on what gets reinforced. Indeed, Thorndike’s trial and error learning often goes by another name: reinforcement learning.

The second breakthrough was reinforcement learning: the ability to learn arbitrary sequences of actions through trial and error. Thorndike’s idea of trial-and-error learning sounds so simple—reinforce behaviors that lead to good things and punish behaviors that lead to bad things. But this is an example where our intuitions about what is intellectually easy and what is hard are mistaken. It was only when scientists tried to get AI systems to learn through reinforcement that they realized that it wasn’t as easy as Thorndike had thought.

The Evolution of Temporal Difference Learning

Minsky was one of the first to realize that training algorithms the way that Thorndike believed animals learned—by directly reinforcing positive outcomes and punishing negative outcomes—was not going to work. Here’s why. Suppose we teach an AI to play checkers using Thorndike’s version of trial-and-error learning. This AI would start by making random moves, and we would give it a reward whenever it won and a punishment whenever it lost. Presumably, if it played enough games of checkers, it should get better. But here’s the problem: The reinforcements and punishments in a game of checkers—the outcome of winning or losing—occur only at the end of the game. A game can consist of hundreds of moves. If you win, which moves should get credit for being good? If you lose, which moves should get credit for being bad?

Minsky realized that reinforcement learning would not work without a reasonable strategy for assigning credit across time; this is called the temporal credit assignment problem.

When training an AI to play checkers, navigate a maze, or do any other task using reinforcement learning, we cannot merely reinforce recent moves, and we cannot merely reinforce all the moves. How, then, can AI ever learn through reinforcement?

Sutton proposed a simple but radical idea. Instead of reinforcing behaviors using actual rewards, what if you reinforced behaviors using predicted rewards? Put another way: Instead of rewarding an AI system when it wins, what if you reward it when the AI system thinks it is winning? Sutton decomposed reinforcement learning into two separate components: an actor and a critic. The critic predicts the likelihood of winning at every moment during the game; it predicts which board configurations are great and which are bad. The actor, on the other hand, chooses what action to take and gets rewarded not at the end of the game but whenever the critic thinks that the actor’s move increased the likelihood of winning. The signal on which the actor learns is not rewards, per se, but the temporal difference in the predicted reward from one moment in time to the next. Hence Sutton’s name for his method: temporal difference learning.
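Here is a minimal sketch of Sutton's idea under my own toy assumptions (a short corridor rather than the checkers setup described above): a tabular actor-critic in which the only real reward sits at the far end. The critic learns a value for each state; the actor is nudged not by rewards but by the temporal difference error, the moment-to-moment change in the critic's prediction.

```python
import math
import random

N_STATES, GOAL, GAMMA, ALPHA = 6, 5, 0.9, 0.1
values = [0.0] * N_STATES                        # critic: predicted future reward for each state
prefs = [[0.0, 0.0] for _ in range(N_STATES)]    # actor: preference for [move left, move right]

def pick_action(state):
    left, right = prefs[state]
    p_right = 1.0 / (1.0 + math.exp(left - right))   # the actor still explores a little
    return 1 if random.random() < p_right else 0

for episode in range(500):
    state = 0
    while state != GOAL:
        action = pick_action(state)
        next_state = max(0, state - 1) if action == 0 else state + 1
        reward = 1.0 if next_state == GOAL else 0.0

        # Temporal difference error: did the predicted future reward just go up or down?
        td_error = reward + GAMMA * values[next_state] - values[state]
        values[state] += ALPHA * td_error            # the critic refines its predictions
        prefs[state][action] += ALPHA * td_error     # the actor is reinforced by the *change* in prediction
        state = next_state
```

Note that the actor is almost never paid in actual reward; nearly all of its learning signal comes from the critic's changing predictions, which is exactly the decoupling of reinforcement from reward discussed next.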

Dopamine is not a signal for reward but for reinforcement. As Sutton found, reinforcement and reward must be decoupled for reinforcement learning to work. To solve the temporal credit assignment problem, brains must reinforce behaviors based on changes in predicted future rewards, not actual rewards. This is why animals get addicted to dopamine-releasing behaviors even when those behaviors are not pleasurable, and this is why dopamine responses quickly shift their activations to the moments when animals predict upcoming reward and away from rewards themselves.

In early bilaterians, dopamine was a signal for good things nearby—a primitive version of wanting.* In the transition to vertebrates, however, this good-things-are-nearby signal was elaborated to not only trigger a state of wanting but also to communicate a precisely computed temporal difference learning signal. Indeed, it makes sense that dopamine was the neuromodulator that evolution reshaped into a temporal difference learning signal: as the signal for nearby rewards, it was the closest thing to a measure of predicted future reward. And so, dopamine was transformed from a good-things-are-nearby signal to a there-is-a-35-percent-chance-of-something-awesome-happening-in-exactly-ten-seconds signal. It was repurposed from a fuzzy average of recently detected food to an ever-fluctuating, precisely measured, and meticulously computed predicted-future-reward signal.

From the ancient seed of TD learning sprouted several features of intelligence. Two of these—disappointment and relief—are so familiar that they almost disappear from view, so ubiquitous that it is easy to miss the unavoidable fact that they did not always exist. Both disappointment and relief are emergent properties of a brain designed to learn by predicting future rewards. Indeed, without an accurate prediction of a future reward, there can be no disappointment when it does not occur. And without an accurate prediction of future pain, there can be no relief when it does not occur.

TD learning, disappointment, relief, and the perception of time are all related. The precise perception of time is a necessary ingredient to learn from omission, to know when to trigger disappointment or relief, and thereby to make TD learning work. Without time perception, a brain cannot know whether something was omitted or simply hasn’t happened yet;

As neuroscientists traced the circuitry of the basal ganglia, its function became quite clear. The basal ganglia learns to repeat actions that maximize dopamine release. Through the basal ganglia, actions that lead to dopamine release become more likely to occur (the basal ganglia ungates those actions), and actions that lead to dopamine inhibition become less likely to occur (the basal ganglia gates those actions). Sound familiar? The basal ganglia is, in part, Sutton’s “actor”—a system designed to repeat behaviors that lead to reinforcement and inhibit behaviors that lead to punishment.

The hypothalamus is, in principle, just a more sophisticated version of the steering brain of early bilaterians; it reduces external stimuli to good and bad and triggers reflexive responses to each. The valence neurons of the hypothalamus connect to the same cluster of dopamine neurons that propagates dopamine throughout the basal ganglia. When the hypothalamus is happy, it floods the basal ganglia with dopamine, and when it is upset, it deprives the basal ganglia of dopamine. And so, in some ways, the basal ganglia is a student, always trying to satisfy its vague but stern hypothalamic judge. The hypothalamus doesn’t get excited by predictive cues; it gets excited only when it actually gets what it wants—food when hungry, warmth when cold. The hypothalamus is the decider of actual rewards; in our AI-playing-backgammon metaphor, the hypothalamus tells the brain whether it won or lost the game but not how well it is doing as the game is unfolding.

But as Minsky found with his attempts to make reinforcement learning algorithms in the 1950s, if brains learned only from actual rewards, they would never be able to do anything all that intelligent. They would suffer from the problem of temporal credit assignment. So then how is dopamine transformed from a valence signal for actual rewards to a temporal difference signal for changes in predicted future reward? In all vertebrates, there is a mysterious mosaic of parallel circuits within the basal ganglia, one that flows down to motor circuits and gates movement, and another that flows back toward dopamine neurons directly. One leading theory of basal ganglia function is that these parallel circuits are literally Sutton’s actor-critic system for implementing temporal difference learning. One circuit is the “actor,” learning to repeat the behaviors that trigger dopamine release; the other circuit is the “critic,” learning to predict future rewards and trigger its own dopamine activation. In our metaphor, the basal ganglian student initially learns solely from the hypothalamic judge, but over time learns to judge itself, knowing when it makes a mistake before the hypothalamus gives any feedback.

The Problems of Pattern Recognition

When you recognize that a plate is too hot or a needle too sharp, you are recognizing attributes of the world the way early bilaterians did, with the activations of individual neurons. However, when you recognize a smell, a face, or a sound, you are recognizing things in the world in a way that was beyond early bilaterians; you are using a skill that emerged later in early vertebrates. Early vertebrates could recognize things using brain structures that decoded patterns of neurons. This dramatically expanded the scope of what animals could perceive.

If you were training a neural network to categorize smell patterns into egg smells or flower smells, you would show it a bunch of smell patterns and simultaneously tell the network whether each pattern is from an egg or a flower (as measured by the activation of a specific neuron at the end of the network). In other words, you tell the network the correct answer. You then compare the actual output with the desired output and nudge the weights across the entire network in the direction that makes the actual output closer to the desired output. If you do this many times (like, millions of times), the network eventually learns to accurately recognize patterns—it can identify smells of eggs and flowers. They called this learning mechanism backpropagation: they propagate the error at the end back throughout the entire network, calculate the exact error contribution of each synapse, and nudge that synapse accordingly.
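As a concrete illustration of that training loop, here is a tiny supervised network with made-up "smell" patterns and arbitrary sizes; it is a sketch of the procedure described above, not of any biological circuit. The forward pass produces a guess, the error at the output is propagated backward, and every weight is nudged in proportion to its contribution to that error.

```python
import numpy as np

# Four hypothetical smell patterns (activations of four olfactory neurons), labeled 1 = egg, 0 = flower.
X = np.array([[1.0, 1.0, 0.0, 0.0],
              [1.0, 0.9, 0.1, 0.0],
              [0.0, 0.0, 1.0, 1.0],
              [0.1, 0.0, 0.9, 1.0]])
y = np.array([[1.0], [1.0], [0.0], [0.0]])

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(4, 3))      # input -> hidden weights
W2 = rng.normal(scale=0.5, size=(3, 1))      # hidden -> output weights
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(5000):
    hidden = sigmoid(X @ W1)                 # forward pass: the network's guess
    output = sigmoid(hidden @ W2)

    # Backpropagation: send the output error backward and compute each layer's share of it.
    delta_out = (output - y) * output * (1.0 - output)
    delta_hid = (delta_out @ W2.T) * hidden * (1.0 - hidden)
    W2 -= 0.5 * hidden.T @ delta_out         # nudge every weight against its error gradient
    W1 -= 0.5 * X.T @ delta_hid

print(sigmoid(sigmoid(X @ W1) @ W2).round(2))  # close to the desired labels after training
```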

The above type of learning, in which a network is trained by providing examples alongside the correct answer, is called supervised learning (a human has supervised the learning process by providing the network with the correct answers). Many supervised learning methods are more complex than this, but the principle is the same: the correct answers are provided, and networks are tweaked using backpropagation to update weights until the categorization of input patterns is sufficiently accurate. This design has proven to work so generally that it is now applied to image recognition, natural language processing, speech recognition, and self-driving cars. But even one of the inventors of backpropagation, Geoffrey Hinton, realized that his creation, although effective, was a poor model of how the brain actually works. First, the brain does not do supervised learning—you are not given labeled data when you learn that one smell is an egg and another is a strawberry. Even before children learn the words egg and strawberry, they can clearly recognize that they are different. Second, backpropagation is biologically implausible. Backpropagation works by magically nudging millions of synapses simultaneously and in exactly the right amount to move the output of the network in the right direction. There is no conceivable way the brain could do this. So then how does the brain recognize patterns?

Neuroscientists have also found hints of how the cortex might solve the problem of generalization. Pyramidal cells of the cortex send their axons back onto themselves, synapsing on hundreds to thousands of other nearby pyramidal cells. This means that when a smell pattern activates a pattern of pyramidal neurons, this ensemble of cells gets automatically wired together through Hebbian plasticity.* The next time a pattern shows up, even if it is incomplete, the full pattern can be reactivated in the cortex. This trick is called auto-association; neurons in the cortex automatically learn associations with themselves. This offers a solution to the generalization problem—the cortex can recognize a pattern that is similar but not the same.
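One classic, simplified stand-in for this kind of auto-associative wiring is a Hopfield-style network. The sketch below uses toy patterns and is my own choice of illustration, not a claim about real cortical circuitry: a "smell" pattern is stored by Hebbian wiring, then recovered from a corrupted cue.

```python
import numpy as np

def store(patterns):
    """Hebbian storage: neurons that fire together in a pattern get wired together."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, cue, steps=5):
    """Each neuron is repeatedly driven by the neurons it is wired to until the pattern settles."""
    state = cue.copy()
    for _ in range(steps):
        state = np.where(W @ state >= 0, 1, -1)
    return state

smell = np.array([1, -1, 1, 1, -1, -1, 1, -1])   # a stored pattern of +1/-1 "neurons"
W = store(np.array([smell]))

partial = smell.copy()
partial[1] = 1                                    # corrupt part of the cue
print(recall(W, partial))                         # the full original pattern is reactivated
```

Feeding in a fragment or a noisy version of a stored pattern settles back to the whole pattern, which is the content-addressable behavior described next.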

Auto-association reveals an important way in which vertebrate memory differs from computer memory. Auto-association suggests that vertebrate brains use content-addressable memory—memories are recalled by providing subsets of the original experience, which reactivate the original pattern. If I tell you the beginning of a story you’ve heard before, you can recall the rest; if I show you half a picture of your car, you can draw the rest. However, computers use register-addressable memory—memories that can be recalled only if you have the unique memory address for them. If you lose the address, you lose the memory. Auto-associative memory does not have this challenge of losing memory addresses, but it does struggle with a different form of forgetfulness. Register-addressable memory enables computers to segregate where information is stored, ensuring that new information does not overwrite old information. In contrast, auto-associative information is stored in a shared population of neurons, which exposes it to the risk of accidentally overwriting old memories.

Cohen and McCloskey referred to this property of artificial neural networks as the problem of catastrophic forgetting. This was not an esoteric finding but a ubiquitous and devastating limitation of neural networks: when you train a neural network to recognize a new pattern or perform a new task, you risk interfering with the network’s previously learned patterns. How do modern AI systems overcome this problem? Well, they don’t yet. Programmers merely avoid the problem by freezing their AI systems after they are trained. We don’t let AI systems learn things sequentially; they learn things all at once and then stop learning. The artificial neural networks that recognize faces, drive cars, or detect cancer in radiology images do not learn continually from new experiences. As of this book going to print, even ChatGPT, the famous chatbot released by OpenAI, does not continually learn from the millions of people who speak to it. It too stopped learning the moment it was released into the world. These systems are not allowed to learn new things because of the risk that they will forget old things (or learn the wrong things). So modern AI systems are frozen in time, their parameters locked down; they are allowed to be updated only when retrained from scratch with humans meticulously monitoring their performance on all the relevant tasks.
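Here is a deterministic toy illustration of the interference behind catastrophic forgetting (made-up patterns and a single linear neuron, nothing from the book): the neuron first learns task A, then is trained only on task B, whose input overlaps with A's; the shared weight gets overwritten and performance on A silently degrades.

```python
import numpy as np

pattern_a = np.array([1.0, 1.0, 0.0])   # task A: should produce 1
pattern_b = np.array([0.0, 1.0, 1.0])   # task B: should produce 0; shares the middle input with A
w = np.zeros(3)

for _ in range(200):                    # phase 1: learn task A only
    w += 0.1 * (1.0 - w @ pattern_a) * pattern_a
print("response to A after phase 1:", round(w @ pattern_a, 2))   # ~1.0

for _ in range(200):                    # phase 2: learn task B, never revisiting A
    w += 0.1 * (0.0 - w @ pattern_b) * pattern_b
print("response to A after phase 2:", round(w @ pattern_a, 2))   # ~0.75: task A has partly been overwritten
```

A real network with millions of shared weights suffers the same interference on a far larger scale, which is why modern systems are frozen after training rather than allowed to keep learning.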

The pattern of olfactory neurons activated by the smell of an egg is the same no matter the rotation, distance, or location of the egg. The same molecules diffuse through the air and activate the same olfactory neurons. But this is not the case for other senses such as vision. The same visual object can activate different patterns depending on its rotation, distance, or location in your visual field. This creates what is called the invariance problem: how to recognize a pattern as the same despite large variances in its inputs. Nothing we have reviewed about auto-association in the cortex provides a satisfactory explanation for how the brain so effortlessly did this. The auto-associative networks we described cannot identify an object you have never seen before from completely different angles. An auto-associative network would treat these as different objects because the input neurons are completely different. This is not only a problem with vision. When you recognize the same set of words spoken by the high-pitched voice of a child and the low-pitched voice of an adult, you are solving the invariance problem. The neurons activated in your inner ear are different because the pitch of the sound is completely different, and yet you can still tell they are the same words. Your brain is somehow recognizing a common pattern despite huge variances in the sensory input.

Most modern AI systems that use computer vision, from your self-driving car to the algorithms that detect tumors in radiology images, use Fukushima’s convolutional neural networks. AI was blind, but now can see, a gift that can be traced all the way back to probing cat neurons over fifty years ago. The brilliance of Fukushima’s convolutional neural network is that it imposes a clever “inductive bias.” An inductive bias is an assumption made by an AI system by virtue of how it is designed. Convolutional neural networks are designed with the assumption of translational invariance, that a given feature in one location should be treated the same as that same feature but in a different location. This is an inescapable fact of our visual world: the same thing can exist in different places without the thing being different. And so, instead of trying to get an arbitrary web of neurons to learn this fact about the visual world, which would require too much time and data, Fukushima simply encoded this rule directly into the architecture of the network.
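The core of that inductive bias can be shown in a few lines: one small set of weights, slid across every position of the input, so the same feature is detected the same way wherever it appears. The example below is a toy one-dimensional convolution with an invented "edge detector" filter, not a description of Fukushima's full Neocognitron.

```python
import numpy as np

def convolve1d(signal, kernel):
    """Slide one shared set of weights across every position of the input."""
    k = len(kernel)
    return np.array([float(np.dot(signal[i:i + k], kernel))
                     for i in range(len(signal) - k + 1)])

edge_detector = np.array([-1.0, 1.0])             # one filter, reused at every location
scene_left = np.array([0, 0, 0, 1, 1, 1, 1, 1])   # an edge near the left
scene_right = np.array([0, 0, 0, 0, 0, 1, 1, 1])  # the same edge, shifted right

print(convolve1d(scene_left, edge_detector))      # peak response where the edge sits
print(convolve1d(scene_right, edge_detector))     # an identical peak, just at a different position
```

Because the filter's weights are shared across positions, the feature only has to be learned once; the network does not have to relearn it separately for every location.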

Perhaps the best lesson from CNNs is not the success of the specific assumptions they attempt to emulate—such as translational invariance—but the success of assumptions themselves. Indeed, while CNNs may not capture exactly how the brain works, they reveal the power of a good inductive bias. In pattern recognition, it is good assumptions that make learning fast and efficient. The vertebrate cortex surely has such an inductive bias; we just don’t know what it is.

In the predatory arms race of the Cambrian, evolution shifted from arming animals with new sensory neurons for detecting specific things to arming animals with general mechanisms for recognizing anything. With this new ability of pattern recognition, vertebrate sensory organs exploded with complexity, quickly flowering into their modern form. Noses evolved to detect chemicals; inner ears evolved to detect frequencies of sound; eyes evolved to detect sights. The coevolution of the familiar sensory organs and the familiar brain of vertebrates is not a coincidence—they each facilitated the other’s growth and complexity. Each incremental improvement to the brain’s pattern recognition expanded the benefits to be gained by having more detailed sensory organs; and each incremental improvement in the detail of sensory organs expanded the benefits to be gained by more sophisticated pattern recognition. In the brain, the result was the vertebrate cortex, which somehow recognizes patterns without supervision, somehow accurately discriminates overlapping patterns and generalizes patterns to new experiences, somehow continually learns patterns without suffering from catastrophic forgetting, and somehow recognizes patterns despite large variances in its input.

Why Life Got Curious

Sutton had always known that a problem with any reinforcement learning system is something called the exploitation-exploration dilemma. For trial-and-error learning to work, agents need to, well, have lots of trials from which to learn. This means that a reinforcement learning agent can’t just exploit the behaviors it predicts will lead to rewards; it must also explore new behaviors. In other words, reinforcement learning requires two opponent processes—one for behaviors that were previously reinforced (exploitation) and the other for behaviors that are new (exploration). These choices are, by definition, opposed to each other. Exploitation will always drive behavior toward known rewards, and exploration will always drive behavior toward what is unknown.

The approach is to make AI systems explicitly curious, to reward them for exploring new places and doing new things, to make surprise itself reinforcing. The greater the novelty, the larger the compulsion to explore it.
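One common way to implement this in a reinforcement learner, sketched here under my own assumptions rather than tied to any specific system, is a count-based novelty bonus: states the agent has rarely visited add an extra "reward" on top of whatever the environment actually provides, and the bonus fades as the state becomes familiar.

```python
from collections import defaultdict

visit_counts = defaultdict(int)

def curious_reward(state, external_reward, bonus_scale=0.5):
    """The reward the learner actually sees: real reward plus a fading novelty bonus."""
    visit_counts[state] += 1
    novelty_bonus = bonus_scale / (visit_counts[state] ** 0.5)
    return external_reward + novelty_bonus

# A brand-new state is worth exploring even with no real reward attached to it.
print(curious_reward("new_cave", external_reward=0.0))   # 0.5
print(curious_reward("new_cave", external_reward=0.0))   # ~0.35, the novelty is wearing off
```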

The importance of curiosity in reinforcement learning algorithms suggests that a brain designed to learn through reinforcement, such as the brain of early vertebrates, should also exhibit curiosity. And indeed, evidence suggests that it was early vertebrates who first became curious. Curiosity is seen across all vertebrates, from fish to mice to monkeys to human infants. In vertebrates, surprise itself triggers the release of dopamine, even if there is no “real” reward.

Games of gambling are carefully designed to exploit this. In games of gambling, you don’t have a 0 percent chance of winning (which would lead you not to play); you have a 48 percent chance of winning, high enough to make it possible, uncertain enough to make it surprising when you win (giving you a dopamine boost), and low enough so that the casino will, in the long run, suck you dry. Our Facebook and Instagram feeds exploit this as well. With each scroll, there is a new post, and randomly, after some number of scrolls, something interesting shows up. Even though you might not want to use Instagram, the same way gamblers don’t want to gamble or drug addicts don’t want to use anymore, the behavior is subconsciously reinforced, making it harder and harder to stop.

The First Model of the World

The ability to learn a spatial map is seen across vertebrates. Fish, reptiles, mice, monkeys, and humans all do this. And yet simple bilaterians like nematodes are incapable of learning such a spatial map—they cannot remember the location of one thing relative to another thing.

The evolution of spatial maps in the minds of early vertebrates marked numerous firsts. It was the first time in the billion-year history of life that an organism could recognize where it was. It is not hard to envision the advantage this would have offered. While most invertebrates steered around and executed reflexive motor responses, early vertebrates could remember the places where arthropods tended to hide, how to get back to safety, and the locations of nooks and crannies filled with food. It was also the first time a brain differentiated the self from the world. To track one’s location in a map of space, an animal needs to be able to tell the difference between “something swimming toward me” and “me swimming toward something.” And most important, it was the first time that a brain constructed an internal model—a representation of the external world. The initial use of this model was, in all likelihood, pedestrian: it enabled brains to recognize arbitrary locations in space and to compute the correct direction to a given target location from any starting location. But the construction of this internal model laid the foundation for the next breakthrough in brain evolution. What began as a trick for remembering locations would go on to become much more.

Summary of Breakthrough #2: Reinforcing

Our ancestors from around five hundred million years ago transitioned from simple wormlike bilaterians to fishlike vertebrates. Many new brain structures and abilities emerged in these early vertebrate brains, most of which can be understood as enabling and emerging from breakthrough #2: reinforcement learning. These include:

  • Dopamine became a temporal difference learning signal, which helped solve the temporal credit assignment problem and enabled animals to learn through trial and error.

  • The basal ganglia emerged as an actor-critic system, enabling animals to generate this dopamine signal by predicting future rewards and to use this dopamine signal to reinforce and punish behaviors.

  • Curiosity emerged as a necessary part of making reinforcement learning work (solving the exploration-exploitation dilemma).

  • The cortex emerged as an auto-associative network, making pattern recognition possible.

  • The perception of precise timing emerged, enabling animals to learn, through trial and error, not only what to do but when to do it.

  • The perception of three-dimensional space emerged (in the hippocampus and other structures), enabling animals to recognize where they were and remember the location of things relative to other things.

Reinforcement learning in early vertebrates was possible only because the mechanisms of valence and associative learning had already evolved in early bilaterians. Reinforcement learning is bootstrapped on simpler valence signals of good and bad. Conceptually, the vertebrate brain is built on top of the more ancient steering system of bilaterians. Without steering, there is no starting point for trial and error, no foundation on which to measure what to reinforce or un-reinforce.

Breakthrough #3: Simulating and the First Mammals

The Neural Dark Ages

History repeats itself. One and a half billion years ago, the explosion of cyanobacteria starved the Earth of carbon dioxide and polluted it with oxygen. Over a billion years later, the explosion of plants on land seems to have committed a similar crime. The inland march of plants was too rapid for evolution to accommodate and rebalance carbon dioxide levels through the expansion of more CO2-producing animals. Carbon dioxide levels plummeted, which caused the climate to cool. The oceans froze over and gradually became inhospitable to life. This was the Late Devonian Extinction, the first great death of this era. From the end of this extinction event and for the next one hundred fifty million years, reptiles would rule.

In order to survive this ravenous era of predatory dinosaurs, pterosaurs, and other massive reptilian beasts, cynodonts got smaller and smaller until they were no more than four inches long. Equipped with warm-bloodedness and miniaturization, they survived by hiding in burrows during the day and emerging during the cold night when archosaurs were relatively blind and immobile. They made their homes in dug-out burrowed mazes or in the thick bark of trees. They hunted by quietly wandering the twilight forest floors and tree branches in search of insects. They became the first mammals.

At some point in this hundred-million-year reign of dinosaurs, as these small mammals survived tucked away in nooks and crannies of the world, they added one more survival trick to their repertoire. They evolved a new cognitive ability, the biggest neural innovation since the Cambrian fish.

The neocortex gave this small mouse a superpower—the ability to simulate actions before they occurred. It could look out at a web of branches leading from its hole to a tasty insect. It could see the faraway eyes of a nearby predatory bird. The mouse could simulate going down different paths, simulate the bird chasing it and the insects hopping away, then pick the best path—the one that, in its simulation, it found itself both alive and well fed. If the reinforcement-learning early vertebrates got the power of learning by doing, then early mammals got the even more impressive power of learning before doing—of learning by imagining.

The neocortex of this early mammal was small and took up only a small fraction of the brain. Most volume was given to the olfactory cortex (early mammals, like many modern mammals, had an incredible sense of smell). But despite the small size of the neocortex in early mammals, it was still the kernel from which human intelligence would arise. In the human brain, the neocortex takes up 70 percent of brain volume. In the breakthroughs that followed, this originally small structure would progressively expand from a clever trick to the epicenter of intelligence.

Generative Models and the Neocortical Mystery

According to Mountcastle, the neocortex does not do different things; each neocortical column does exactly the same thing. The only difference between regions of neocortex is the input they receive and where they send their output; the actual computations of the neocortex itself are identical. The only difference between, for example, the visual cortex and the auditory cortex is that the visual cortex gets input from the retina, and the auditory cortex gets input from the ear.

To those in the AI community, Mountcastle’s hypothesis is a scientific gift like no other. The human neocortex is made up of over ten billion neurons and trillions of connections; it is a hopeless endeavor to try to decode the algorithms and computations performed by such an astronomically massive hairball of neurons. So hopeless that many neuroscientists believe attempting to decode how the neocortex works is doomed to fail. But Mountcastle’s theory offers a more hopeful research agenda—instead of trying to understand the entire human neocortex, perhaps we only have to understand the function of the microcircuit that is repeated a million or so times. Instead of understanding the trillions of connections in the entire neocortex, perhaps we only have to understand the million or so connections within the neocortical column. Further, if Mountcastle’s theory is correct, it suggests that the neocortical column implements some algorithm that is so general and universal that it can be applied to extremely diverse functions such as movement, language, and perception across every sensory modality.

In the nineteenth century, a German physicist and physician named Hermann von Helmholtz proposed a novel theory to explain these properties of perception. He suggested that a person doesn’t perceive what is experienced; instead, he or she perceives what the brain thinks is there—a process Helmholtz called inference. Put another way: you don’t perceive what you actually see; you perceive a simulated reality that you have inferred from what you see. This idea explains three peculiar properties of perception. Your brain fills in missing parts of objects because it is trying to decipher the truth that your vision is suggesting (“Is there actually a sphere there?”). You can see only one thing at a time because your brain must pick a single reality to simulate—in reality, the animal can’t be both a rabbit and a duck. And once you see that an image is best explained as a frog, your brain maintains this reality when observing it.

In the 1990s, Geoffrey Hinton and some of his students (including the same Peter Dayan that had helped discover that dopamine responses are temporal difference learning signals) set their sights on building an AI system that learned in the way that Helmholtz suggested. The Helmholtz machine was an early proof of concept of a much broader class of models called generative models. Most modern generative models are more complicated than the Helmholtz machine, but they share the essential property that they learn to recognize things in the world by generating their own data and comparing the generated data to the actual data.

While most AI advancements that occurred in the early 2000s involved applications of supervised-learning models, many of the recent advancements have been applications of generative models. Deepfakes, AI-generated art, and language models like GPT-3 are all examples of generative models at work.

The neocortex (and presumably the bird equivalent) is always in an unstable balance between recognition and generation, and during our waking life, humans spend an unbalanced amount of time recognizing and comparatively less time generating. Perhaps dreams are a counterbalance to this, a way to stabilize the generative model through a process of forced generation. If we are deprived of sleep, this imbalance of too much recognition and not enough generation eventually becomes so severe that the generative model in the neocortex becomes unstable. Hence, mammals start hallucinating, recognition becomes distorted, and the difference between generation and recognition gets blurred.

One way to think about the generative model in the neocortex is that it renders a simulation of your environment so that it can predict things before they happen. The neocortex is continuously comparing the actual sensory data with the data predicted by its simulation. This is how you can immediately identify anything surprising that occurs in your surroundings.
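The predict-and-compare loop can be stated very compactly. The sketch below uses an arbitrary threshold and made-up numbers, purely to show the shape of the idea: measure the gap between what the internal model expected and what the senses report, and flag a large gap as surprise.

```python
SURPRISE_THRESHOLD = 0.5   # assumed; how big a prediction error counts as "surprising"

def compare(predicted_input, actual_input):
    """Compare the simulation's prediction against the actual sensory data."""
    prediction_error = abs(actual_input - predicted_input)
    return prediction_error, prediction_error > SURPRISE_THRESHOLD

# The simulation expects the door to stay shut (0.0); the senses report it swinging open (1.0).
error, surprised = compare(predicted_input=0.0, actual_input=1.0)
print(error, surprised)    # 1.0 True
```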

The reason the neocortex is so powerful is not only that it can match its inner simulation to sensory evidence (Helmholtz’s perception by inference) but, more important, that its simulation can be independently explored. If you have a rich enough inner model of the external world, you can explore that world in your mind and predict the consequences of actions you have never taken.

This was the gift the neocortex gave to early mammals. It was imagination—the ability to render future possibilities and relive past events—that was the third breakthrough in the evolution of human intelligence.

Mice in the Imaginarium

There were three new abilities that neocortical simulation provided early mammals, all three of which were essential for surviving the one-hundred-and-fifty-million-year predatory onslaught of sharp-toothed dinosaurs.

  • New Ability #1: Vicarious Trial and Error

  • New Ability #2: Counterfactual Learning

  • New Ability #3: Episodic Memory. This is the form of memory in which we recall specific past episodes of our lives. This is distinct from, say, procedural memory, where we remember how to do various movements, such as speaking, typing, or throwing a baseball. But here is the weird thing—we don’t truly remember episodic events. The process of episodic remembering is one of simulating an approximate re-creation of the past. When imagining future events, you are simulating a future reality; when remembering past events, you are simulating a past reality. Both are simulations.

Model-Based Reinforcement Learning

While model-free approaches like temporal difference learning can do well in backgammon and certain video games, they do not perform well in more complex games like chess. The problem is that in complex situations, model-free learning—which contains no planning or playing out of possible futures—is not good at finding the moves that don’t look great right now but set you up well for the future.

How did AlphaZero achieve superhuman performance at Go and chess? How did AlphaZero succeed where temporal difference learning could not? The key difference was that AlphaZero simulated future possibilities. Like TD-Gammon, AlphaZero was a reinforcement learning system—its strategies were not programmed into it with expert rules but learned through trial and error. But unlike TD-Gammon, AlphaZero was a model-based reinforcement learning algorithm; AlphaZero searched through possible future moves before deciding what to do next.
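
The contrast can be sketched in a few lines of Python (a made-up two-step game of my own, not AlphaZero’s actual machinery, which combines a neural network with Monte Carlo tree search): the model-free chooser consults only cached one-step values, while the model-based chooser simulates futures with a model of the game and backs up the total reward before committing to a move.

```python
# A toy contrast between model-free and model-based choice (illustrative only).

# A hypothetical world model: state -> action -> (next_state, reward).
WORLD = {
    "start":    {"safe": ("dead_end", 1.0), "risky": ("open", 0.0)},
    "dead_end": {"stay": ("dead_end", 0.0)},
    "open":     {"win":  ("dead_end", 5.0)},
}

def model_free_choice(state, cached_values):
    # Act on whatever has been reinforced most in the past; no lookahead at all.
    return max(cached_values[state], key=cached_values[state].get)

def model_based_choice(state, depth=2):
    # Simulate future moves with the world model and back up the total reward.
    def value(s, d):
        if d == 0 or s not in WORLD:
            return 0.0
        return max(r + value(ns, d - 1) for ns, r in WORLD[s].values())
    return max(WORLD[state],
               key=lambda a: WORLD[state][a][1] + value(WORLD[state][a][0], depth - 1))

cached = {"start": {"safe": 1.0, "risky": 0.0}}        # immediate payoffs only
print("model-free picks: ", model_free_choice("start", cached))   # 'safe'
print("model-based picks:", model_based_choice("start"))          # 'risky'
```

The “risky” move looks worse right now but sets up the larger future reward, which only the chooser that plays out possible futures can see.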

How Mammals Make Choices

  • Step #1: Triggering Simulation

  • Step #2: Simulating Options

  • Step #3: Choosing an Option

Habits are automated actions triggered by stimuli directly (they are model-free). They are behaviors controlled directly by the basal ganglia. They are the way mammalian brains save time and energy, avoiding unnecessarily engaging in simulation and planning. When such automation occurs at the right times, it enables us to complete complex behaviors easily; when it occurs at the wrong times, we make bad choices.

The duality between model-based and model-free decision-making methods shows up in different forms across different fields. In AI, the terms model-based and model-free are used. In animal psychology, this same duality is described as goal-driven behavior and habitual behavior. And in behavioral economics, as in Daniel Kahneman’s famous book Thinking, Fast and Slow, this same duality is described as “system 2” (thinking slow) versus “system 1” (thinking fast). In all these cases, the duality is the same: Humans and, indeed, all mammals (and some other animals that independently evolved simulation) sometimes pause to simulate their options (model-based, goal-driven, system 2) and sometimes act automatically (model-free, habitual, system 1). Neither is better; each has its benefits and costs. Brains attempt to intelligently select when to do each, but brains do not always make this decision correctly, and this is the origin of many of our irrational behaviors.

The basal ganglia has no intent or goals. A model-free reinforcement learning system like the basal ganglia is intent-free; it is a system that simply learns to repeat behaviors that have previously been reinforced. This is not to say that such model-free systems are dumb or devoid of motivation; they can be incredibly intelligent and clever, and they can rapidly learn to produce behavior that maximizes the amount of reward. But these model-free systems do not have “goals” in the sense that they do not set out to pursue a specific outcome. This is one reason why model-free reinforcement learning systems are painfully hard to interpret—when we ask, “Why did the AI system do that?,” we are asking a question to which there is really no answer. Or at least, the answer will always be the same: because it thought that was the choice with the most predicted reward.
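
A minimal sketch of such an intent-free learner (a toy two-armed bandit of my own, not a claim about the basal ganglia’s actual circuitry): it keeps nothing but a cached reward estimate per action and nudges that estimate by the prediction error, so the only “explanation” for its choices is that they previously predicted the most reward.

```python
# A toy model-free learner: no goals, no world model, just reinforced values.
import random

random.seed(0)
actions = ["lever_A", "lever_B"]
value = {a: 0.0 for a in actions}     # cached predicted reward per action
alpha, epsilon = 0.1, 0.1

def reward(action):
    # Hidden environment: lever_B pays off more often (unknown to the learner).
    return 1.0 if random.random() < (0.3 if action == "lever_A" else 0.7) else 0.0

for step in range(1000):
    # Mostly repeat the most-reinforced action, occasionally explore.
    a = random.choice(actions) if random.random() < epsilon else max(value, key=value.get)
    r = reward(a)
    value[a] += alpha * (r - value[a])   # prediction-error update, and nothing more

print(value)   # "why did it pull lever_B?" has no answer beyond these numbers
```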

In contrast, the aPFC does have explicit goals—it wants to go to the fridge to eat strawberries or go to the water fountain to drink water. By simulating a future that terminates at some end result, the aPFC has an end state (a goal) that it seeks to achieve. This is why it is possible, at least in circumstances where people make aPFC-driven (goal-oriented, model-based, system 2) choices, to ask why a person did something.

It is somewhat magical that the very same neocortical microcircuit that constructs a model of external objects in the sensory cortex can be repurposed to construct goals and modify behavior to pursue these goals in the frontal cortex. Karl Friston of University College London—one of the pioneers of the idea that the neocortex implements a generative model—calls this “active inference.” The sensory cortex engages in passive inference—merely explaining and predicting sensory input. The aPFC engages in active inference—explaining one’s own behavior and then using its predictions to actively change that behavior. By pausing to play out what the aPFC predicts will happen and thereby vicariously training the basal ganglia, the aPFC is repurposing the neocortical generative model for prediction to create volition.

In a typical neuroscience textbook, the four functions ascribed to the frontal neocortex are attention, working memory, executive control, and, as we have already seen, planning. The connecting theme of these functions has always been confusing; it seems odd that one structure would subserve all these distinct roles. But through the lens of evolution, it makes sense that these functions are all intimately related—they are all different applications of controlling the neocortical simulation.

The Secret to Dishwashing Robots

The motor cortex was clearly not the locus of motor commands in early mammals, and it was only later—in primates—that it became required for movement. So why did the motor cortex evolve? What was its original function? What changed with primates?

Karl Friston, the pioneer of the theory of active inference, offers an alternative interpretation of the motor cortex. While the prevailing view has always been that the motor cortex generates motor commands, telling muscles exactly what to do, Friston flips this idea on its head: Perhaps the motor cortex doesn’t generate motor commands but rather motor predictions. Perhaps the motor cortex is in a constant state of observing the body movements that occur in the nearby somatosensory cortex (hence why there is such an elegant mirror of motor cortex and somatosensory cortex) and then tries to explain the behavior and use these explanations to predict what an animal will do next. And perhaps the wiring is merely tweaked so that motor cortex predictions flow to the spinal cord and control our movement—in other words, the motor cortex is wired to make its predictions come true.

Any goal, whether high-level or low-level, has both a self model in the frontal neocortex and a model-free system in the basal ganglia. The neocortex offers a slower but more flexible system for training, and the basal ganglia offers a faster but less flexible version for well-trained paths and movements.

The secret to dishwashing robots lives somewhere in the motor cortex and the broader motor system of mammals. Just as we do not yet understand how the neocortical microcircuit renders an accurate simulation of sensory input, we also do not yet understand how the motor cortex simulates and plans fine body movements with such flexibility and accuracy and how it continuously learns as it goes.

If we successfully build robots with motor systems similar to those of mammals, they will come along with many desirable properties. These robots will automatically learn new complex skills on their own. They will adjust their movements in real time to account for perturbations and changes in the world. We will give them high-level goals, and they will be able to figure out all the subgoals necessary to achieve them. When they try to learn some new task, they will be slow and careful as they simulate each body movement before they act, but as they get better, the behavior will become more automatic. Over the course of their lifetimes, the speed with which they learn new skills will increase as they reapply previously learned low-level skills to newly experienced higher-level goals. And if their brains work at all like mammal brains, they will not require massive supercomputers to accomplish these tasks. Indeed, the entire human brain operates on about the same amount of energy as a lightbulb. Or maybe not. Perhaps roboticists will get all this to work in a very nonmammalian way—perhaps roboticists will figure it all out without reverse-engineering human brains. But just as bird wings were an existence proof for the possibility of flight—a goal for humans to strive for—the motor skills of mammals are our existence proof for the type of motor skills we hope to build into machines one day, and the motor cortex and the surrounding motor hierarchy are nature’s clues about how to make it all work.

Summary of Breakthrough #3: Simulating

The primary new brain structure that emerged in early mammals was the neocortex. With the neocortex came the gift of simulation—the third breakthrough in our evolutionary story. To summarize how this occurred and how it was used:

  • Sensory neocortex evolved, which created a simulation of the external world (a world model).

  • The agranular prefrontal cortex (aPFC) evolved, which was the first region of the frontal neocortex. The aPFC created a simulation of an animal’s own movements and internal states (a self model) and constructed “intent” to explain one’s own behavior.

  • The aPFC and sensory neocortex worked together to enable early mammals to pause and simulate aspects of the world that were not currently being experienced—in other words, model-based reinforcement learning.

  • The aPFC somehow solved the search problem by intelligently selecting paths to simulate and determining when to simulate them.

  • These simulations enabled early mammals to engage in vicarious trial and error—to simulate future actions and decide which path to take based on the imagined outcomes.

  • These simulations enabled early mammals to engage in counterfactual learning, thereby offering a more advanced solution to the credit assignment problem—enabling mammals to assign credit based on causal relationships.

  • These simulations enabled early mammals to engage in episodic memory, which allowed mammals to recall past events and actions, and use these recollections to adjust their behavior.

  • In later mammals, the motor cortex evolved, enabling mammals to plan and simulate specific body movements.

Our mammalian ancestors from a hundred million years ago weaponized the imaginarium to survive. They engaged in vicarious trial and error, counterfactual learning, and episodic memory to outplan dinosaurs. Our ancestral mammal, like a modern cat, could look at a set of branches and plan where it wanted to place its paws. With these abilities, ancient mammals behaved more flexibly, learned faster, and performed more clever motor skills than their vertebrate ancestors. Most vertebrates at the time, as with modern lizards and fish, could still move quickly, remember patterns, track the passage of time, and intelligently learn through model-free reinforcement learning, but their movements were not planned.

Breakthrough #4: Mentalizing and the First Primates

The Arms Race for Political Savvy

In the 1980s and 1990s, numerous primatologists and evolutionary psychologists, including Nicholas Humphrey, Frans de Waal, and Robin Dunbar, began speculating that the growth of the primate brain had nothing to do with the ecological demands of being a monkey in the African jungles ten to thirty million years ago and was instead a consequence of the unique social demands. They argued that these primates had stable mini-societies: groups of individuals that stuck together for long periods. Scientists hypothesized that to maintain these uniquely large social groups, these individuals needed unique cognitive abilities. This created pressure, they argued, for bigger brains.

It isn’t group size in general but the specific type of group that early primates created that seemed to have required larger brains.

What makes these monkey societies unique is not the presence of a social hierarchy (many animal groups have social hierarchies), but how the hierarchy is constructed. If you examined the social hierarchy of different monkey groups, you would notice that it often isn’t the strongest, biggest, or most aggressive monkey who sits at the top. Unlike most other social animals, for primates, it is not only physical power that determines one’s social ranking but also political power.

It isn’t clear how political savviness would even be possible if a species did not have at least a basic and primitive version of theory of mind—only through this ability can individuals infer what others want and thereby figure out whom to cozy up to and how. Only through theory of mind can individual primates know not to mess with a low-ranking individual with high-ranking friends; this requires understanding the intent of the high-ranking individuals and what they will do in future situations. Only through this ability of theory of mind can you figure out who is likely to become powerful in the future, whom you need to make friends with, and whom you can deceive.

How to Model Other Minds

As far back as Plato, there has been a running hypothesis about how humans understand the minds of other humans. The theory is that we first understand our own minds and then use this understanding of ourselves to understand others. Modern formulations of this old idea are referred to as “simulation theory” or “social projection theory.” When we try to understand why someone else did something, we do so by imagining ourselves in their situation—with their knowledge and life history: “She probably yelled at me because she is stressed out about having a test tomorrow; I know I yell more when I am stressed.” When we try to understand what others will do, we imagine what we would do in their situation if we had their knowledge and their background: “I don’t think James will share his food with George anymore; I believe James saw George steal, and I know if I saw my friend steal from me, I wouldn’t share with him anymore.” We understand others by imagining ourselves in their shoes. The best evidence for social projection theory is the fact that tasks that require understanding yourself and tasks that require understanding others both activate and require the same uniquely primate neural structures. Reasoning about your own mind and reasoning about other minds are, in the brain, the same process.

Monkey Hammers and Self-Driving Cars

If the driver of brain evolution in early primates was a politicking arms race, why would primates be uniquely good tool users? If the new brain regions of primates were “designed” to enable theory of mind, then from where do the unique tool-using skills of primates emerge?

One reason it is useful to simulate other people’s movements is that doing this helps us understand their intentions. By imagining yourself doing what others are doing, you can begin to understand why they are doing what they are doing: you can imagine yourself tying strings on a shoe or buttoning a shirt and then ask yourself “why would I do something like this?” and thereby begin to understand the underlying intentions behind other people’s movements.

The ability to use tools is less about ingenuity and more about transmissibility. Ingenuity must occur only once if transmissibility occurs frequently; if at least one member of a group figures out how to manufacture and use a termite-catching stick, the entire group can acquire this skill and continuously pass it down throughout generations.

Acquiring novel skills through observation required theory of mind, while selecting known skills through observation did not.

There is still much work to be done when it comes to imitation learning in robotics. But the fact that inverse reinforcement learning (whereby AI systems infer the intent of observed behavior) seems necessary for observational learning to work, at least in some tasks, supports the idea that theory of mind (whereby primates infer the intent of observed behavior) was required for observational learning and the transmission of tool skills among themselves. It is unlikely to be a coincidence that the ingenuity of roboticists and the iteration of evolution converged on similar solutions: a novice cannot reliably acquire a new motor skill by merely observing an expert’s movements; novices must also peer into an expert’s mind.
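
Here is a minimal sketch of the idea (a toy of my own invention, far simpler than real inverse reinforcement learning algorithms; the feature names and numbers are made up): the observer scores candidate goals by how well each explains an expert’s observed choices, infers the best-fitting intent, and then applies that inferred intent to a situation the expert never faced.

```python
# A toy "infer the intent behind observed choices" sketch (illustrative only).

# Each option is described by hypothetical features: (sweetness, nearness).
observed_choices = [
    # (options the expert faced, index of the option the expert picked)
    ([(0.9, 0.1), (0.2, 0.9)], 0),
    ([(0.7, 0.3), (0.3, 0.8)], 0),
    ([(0.1, 0.9), (0.8, 0.2)], 1),
]

candidate_goals = {"wants_sweetness": (1.0, 0.0), "wants_nearness": (0.0, 1.0)}

def explains(goal_weights, choices):
    # Count how many observed picks this candidate goal would have made itself.
    score = lambda opt: sum(w * f for w, f in zip(goal_weights, opt))
    return sum(max(range(len(opts)), key=lambda i: score(opts[i])) == picked
               for opts, picked in choices)

inferred = max(candidate_goals, key=lambda g: explains(candidate_goals[g], observed_choices))
print("inferred intent:", inferred)              # 'wants_sweetness'

# The observer now pursues the inferred goal in a brand-new situation.
new_options = [(0.4, 0.9), (0.6, 0.1)]
score = lambda opt: sum(w * f for w, f in zip(candidate_goals[inferred], opt))
print("novice picks option", max(range(len(new_options)), key=lambda i: score(new_options[i])))
```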

Theory of mind evolved in early primates for politicking. But this ability was repurposed for imitation learning. The ability to infer the intent of others enabled early primates to filter out extraneous behaviors and focus only on the relevant ones (what did the person mean to do?); it helped youngsters stay focused on learning over long stretches of time; and it may have enabled early primates to actively teach each other by inferring what a novice does and does not understand. While our ancestral mammal likely could select known skills by observing others, it was with early primates, armed with theory of mind, that the ability to acquire truly novel skills through observation emerged. This created a new degree of transmissibility: skills that were discovered by clever individuals, and that would once have faded when they died, could now propagate throughout a group and be passed down endlessly through generations. This is why primates use hammers and rats do not.

Why Rats Can’t Go Grocery Shopping

Although Robin Dunbar's social-brain hypothesis has, for the past several decades, held primacy among scientists as the leading explanation of brain expansion in primates, there is an alternative explanation: what has been called the ecological-brain hypothesis. The ecological-brain hypothesis argues that it was the frugivore diet of early primates that drove the rapid expansion of their brains.

The ability to anticipate future needs would have offered numerous benefits to our ancestral frugivores. It would have enabled our ancestors to plan their foraging routes long in advance, thereby ensuring they were the first to get newly ripened fruits. Our ability to make decisions today for faraway, abstract, and not-yet-existent goals was inherited from these tree-faring primates, a trick that was perhaps first used for getting the first pick of fruits but that today, in humans, is used for far greater purposes. It laid the foundation for our ability to make long-term plans over vast stretches of time.

Summary of Breakthrough #4: Mentalizing

There are three broad abilities that seem to have emerged in early primates:

  • Theory of mind: inferring intent and knowledge of others

  • Imitation learning: acquiring novel skills through observation

  • Anticipating future needs: taking an action now to satisfy a want in the future, even though I do not want it now

These may not, in fact, have been separate abilities but rather emergent properties of a single new breakthrough: the construction of a generative model of one’s own mind, a trick that can be called “mentalizing.” We see this in the fact that these abilities emerge from shared neural structures (such as the gPFC) that evolved first in early primates. We see this in the fact that children seem to acquire these abilities at similar developmental times. We see this in the fact that damage that impairs one of these abilities tends to impair many of them. And most important, we see this in the fact that the structures from which these skills emerge are the same areas from which our ability to reason about our own mind emerges. These new primate areas are required not only for simulating the mind of others but also for projecting yourself into your imagined futures, identifying yourself in the mirror (mirror-sign syndrome), and identifying your own movements (alien-hand syndrome). And a child’s ability to reason about her own mind tends to precede a child’s development of all three of these abilities.

However, the best evidence for this idea goes all the way back to Mountcastle. The main change to the brain of early primates, besides its size, was the addition of new areas of neocortex. So if we are to stick to the general idea—inspired by Mountcastle, Helmholtz, Hinton, Hawkins, Friston, and many others—that every area of neocortex is made up of identical microcircuits, this imposes strict constraints on how we explain the newfound abilities of primates. It suggests that these new intellectual skills must emerge from some new clever application of the neocortex and not some novel computational trick. This makes it an appealing proposal to interpret theory of mind, imitation learning, and anticipation of future needs as nothing more than emergent properties of a second-order generative model: all three abilities can emerge from nothing more than new applications of neocortex.

All these abilities—theory of mind, imitation learning, and anticipating future needs—would have been particularly adaptive in the unique niche of early primates. Dunbar argues that the social-brain hypothesis and the ecological-brain hypothesis are two sides of the same coin. The ability to mentalize may have simultaneously unlocked both the ability to successfully forage fruits and to successfully politick. The pressures of both frugivorism and social hierarchies may have converged to produce continual evolutionary pressure to develop and elaborate brain regions—such as the gPFC—for modeling your own mind.

Breakthrough #5: Speaking and the First Humans

The Search for Human Uniqueness

Linguists make a distinction between declarative and imperative labels. An imperative label is one that yields a reward: “When I hear sit, if I sit, I will get a treat” or “When I hear stay, if I stop moving, I will get a treat.” This is basic temporal difference learning—all vertebrates can do this. Declarative labeling, on the other hand, is a special feature of human language. A declarative label is one that assigns an object or behavior an arbitrary symbol—“That is a cow,” “That is running”—without any imperative at all. No other form of naturally occurring animal communication has been found to do this. The second way in which human language differs from other animal communication is that it contains grammar. Human language contains rules by which we merge and modify symbols to convey specific meanings. We can thereby weave these declarative labels into sentences, and we can knit these sentences into concepts and stories. This allows us to convert the few thousand words present in a typical human language into a seemingly infinite number of unique meanings.
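
A toy illustration of that combinatorial leverage (a made-up three-slot grammar of my own, not a serious model of syntax): even a handful of labels, combined under a few simple rules, yields over a hundred distinct sentences.

```python
# A toy grammar: a few labels plus combination rules -> many distinct meanings.
import itertools

nouns = ["the cow", "the antelope", "the hunter", "the river"]
verbs = ["sees", "follows", "avoids"]
modifiers = ["", " at dawn", " near the camp"]

sentences = [f"{subj} {verb} {obj}{mod}"
             for subj, obj in itertools.permutations(nouns, 2)  # subject != object
             for verb in verbs
             for mod in modifiers]

print(len(sentences))   # 12 * 3 * 3 = 108 distinct sentences from about ten labels
print(sentences[0])     # "the cow sees the antelope"
```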

Our unique language, with declarative labels and grammar, enables groups of brains to transfer their inner simulations to each other with an unprecedented degree of detail and flexibility.

This trick of thought transfer would have provided many practical benefits to early humans. It would have enabled more accurate teaching of tool use, hunting techniques, and foraging tricks. It would have enabled flexible coordination of scavenging and hunting behaviors across individuals—a human could say, “Follow me, there is an antelope carcass two miles east” or “Wait here, let’s ambush the antelope when you hear me whistle three times.” All these practical benefits emerge from the fact that language expands the scope of sources a brain can extract learnings from. The breakthrough of reinforcing enabled early vertebrates to learn from their own actual actions (trial and error). The breakthrough of simulating enabled early mammals to learn from their own imagined actions (vicarious trial and error). The breakthrough of mentalizing enabled early primates to learn from other people’s actual actions (imitation learning). But the breakthrough of speaking uniquely enabled early humans to learn from other people’s imagined actions.

By sharing what we see in our imaginations, it is also possible for common myths to form and for entirely made-up imaginary entities and stories to persist merely because they hop between our brains. We tend to think about myths as the province of fantasy novels and children’s books, but they are the foundation of modern human civilizations. Money, gods, corporations, and states are imaginary concepts that exist only in the collective imaginations of human brains. One of the earlier versions of this idea was articulated by the philosopher John Searle, but was famously popularized by Yuval Harari’s book Sapiens. The two argue that humans are unique because we “cooperate in extremely flexible ways with countless numbers of strangers.” And to Searle and Harari, we can do this because we have such “common myths.”

An analogy to DNA is useful. The true power of DNA is not the products it constructs (hearts, livers, brains) but the process it enables (evolution). In this same way, the power of language is not its products (better teaching, coordinating, and common myths) but the process of ideas being transferred, accumulated, and modified across generations. Just as genes persist by hopping from parent cell to offspring cell, ideas persist by hopping from brain to brain, from generation to generation. And as with genes, this hopping is not uniform but operates under its own quasi-evolutionary rules—there is a continual selecting of good ideas and pruning of bad ideas. Ideas that helped humans survive persisted, while those that did not perished. This analogy of ideas evolving was proposed by Richard Dawkins in his famous book The Selfish Gene. He called these hopping ideas memes.

All human inventions, both technological and cultural, require an accumulation of basic building blocks before a single inventor can go “Aha!,” merge the preexisting ideas into something new, and transfer this new invention to others. If the baseline of ideas always fades after a generation or two, then a species will be forever stuck in a nonaccumulating state, always reinventing the same ideas over and over again. This is how it is for all other creatures in the animal kingdom. Even chimpanzees, who learn motor skills through observation, do not accumulate learnings across generations. Going from no accumulation across generations to some accumulation across generations was the subtle discontinuity that changed everything.

Eventually, the corpus of ideas accumulated reached a tipping point of complexity when the total sum of accumulated ideas no longer fit into the brain of a single human. This created a problem in sufficiently copying ideas across generations. In response, four things happened that further expanded the extent of knowledge that could be transferred across generations. First, humans evolved bigger brains, which increased the amount of knowledge that can be passed down through individual brains. Second, humans became more specialized within their groups, with ideas distributed across different members—some were the spear makers, others clothing makers, others hunters, others foragers. Third, population sizes expanded, which offered more brains to store ideas across generations. And fourth, most recent and most important, we invented writing. Writing allows humans to have a collective memory of ideas that can be downloaded at will and that can contain effectively an infinite corpus of knowledge.

The real reason why humans are unique is that we accumulate our shared simulations (ideas, knowledge, concepts, thoughts) across generations. We are the hive-brain apes. We synchronize our inner simulations, turning human cultures into a kind of meta-life-form whose consciousness is instantiated within the persistent ideas and thoughts flowing through millions of human brains over generations. The bedrock of this hive brain is our language. The emergence of language marked an inflection point in humanity’s history, the temporal boundary when this new and unique kind of evolution began: the evolution of ideas. In this way, the emergence of language was as monumental an event as the emergence of the first self-replicating DNA molecules. Language transformed the human brain from an ephemeral organ to an eternal medium of accumulating inventions.

Language in the Brain

We don’t realize it, but when we happily go back and forth making incoherent babbles with babies (proto-conversations), when we pass objects back and forth and smile (joint attention), and when we pose and answer even nonsensical questions from infants, we are unknowingly executing an evolutionarily hard-coded learning program designed to give human infants the gift of language. This is why humans deprived of contact with others will develop emotional expressions, but they’ll never develop language. The language curriculum requires both a teacher and a student. And as this instinctual learning curriculum is executed, young human brains repurpose older mentalizing areas of the neocortex for the new purpose of language. It isn’t Broca’s or Wernicke’s areas that are new, it is the underlying learning program that repurposes them for language that is new.

Here is the point: There is no language organ in the human brain, just as there is no flight organ in the bird brain. Asking where language lives in the brain may be as silly as asking where playing baseball or playing guitar lives in the brain. Such complex skills are not localized to a specific area; they emerge from a complex interplay of many areas. What makes these skills possible is not a single region that executes them but a curriculum that forces a complex network of regions to work together to learn them. So this is why your brain and a chimp brain are practically identical and yet only humans have language. What is unique in the human brain is not in the neocortex; what is unique is hidden and subtle, tucked deep in older structures like the amygdala and brain stem. It is an adjustment to hardwired instincts that makes us take turns, makes children and parents stare back and forth, and that makes us ask questions.

The Perfect Storm

We diverged from chimpanzees around seven million years ago, and brains stayed largely the same size until around two and a half million years ago, at which point something mysterious and dramatic happened. The human brain rapidly became over three times larger and earned its place as one of the largest brains on Earth. In the words of the neurologist John Ingram, some mysterious force more than two million years ago triggered a “runaway growth of the brain.”

Homo erectus was our meat-eating, stone-tool-using, (possibly) fire-wielding, premature-birthing, (mostly) monogamous, grandmothering, hairless, sweating, big-brained ancestor.

Defecting in language—directly lying or withholding information—has many benefits to an individual. And the presence of liars and cheaters defeats the value of language. In a group where everyone is lying to each other with words, those who spoke no language and were immune to the lies might in fact survive better than those with language. So the presence of language creates a niche for defectors, which eliminates the original value of language. How, then, could language ever propagate and persist within a group? In this way, the fifth breakthrough in the evolution of the human brain—language—is unlike any other breakthrough chronicled in this book. Steering, reinforcing, simulating, and mentalizing were adaptations that clearly benefited any individual organism in which they began to emerge, and thus the evolutionary machinations by which they propagated are straightforward. Language, however, is only valuable if a group of individuals is using it. And so more nuanced evolutionary machinations must have been at work.

Much behavior of modern humans, however, doesn’t fit cleanly into kin selection or reciprocal altruism. Sure, humans are clearly biased toward their own kin. But people still regularly help strangers without expecting anything in return. We donate to charity; we are willing to go to war and risk our lives for our fellow citizens, most of whom we’ve never met; and we take part in social movements that don’t directly benefit us but help strangers we feel have been disadvantaged. Think about how weird it would be for a human to see a lost and scared child on the street and just do nothing. Most humans would stop to help a child and do so without expecting any reciprocity in return. Humans are, relative to other animals, by far the most altruistic to unrelated strangers. Of course, humans are also one of the cruelest species. Only humans will make incredible personal sacrifices to impose pain and suffering on others. Only humans commit genocide. Only humans hate entire groups of people. This paradox is not a random happenstance; it is not a coincidence that our language, our unparalleled altruism, and our unmatched cruelty all emerged together in evolution; all three were, in fact, merely different features of the same evolutionary feedback loop, one from which evolution made its finishing touches in the long journey of human brain evolution.

With the basics of language among kin in place, the opportunity for using language among non-kin became possible. Instead of makeshift languages constructed between mothers and offspring, it would have been possible for an entire group to share in labels. But as we have seen, information shared with unrelated individuals in a group would have been tenuous and unstable, ripe for defectors and liars. Here is where Robin Dunbar—the famous anthropologist who came up with the social-brain hypothesis—proposes something clever. What do we humans naturally have an instinct to talk about? What is the most natural activity we use language for? Well, we gossip.

If someone lied or freeloaded in a group that tended to gossip, everyone would quickly learn about it: “Did you hear that Billy stole food from Jill?” If groups imposed costs on cheaters by punishing them, either by withholding altruism or by directly harming them, then gossip would enable a stable system of reciprocal altruism among a large group of individuals.

The key point: The use of language for gossip plus the punishment of moral violators makes it possible to evolve high levels of altruism. Early humans born with extra altruistic instincts would have more successfully propagated in an environment that easily identified and punished cheaters and rewarded altruists. The more severe the costs of cheating, the more altruistic it was optimal to behave. Herein lies both the tragedy and beauty of humanity. We are indeed some of the most altruistic animals, but we may have paid the price for this altruism with our darker side: our instinct to punish those whom we deem to be moral violators; our reflexive delineation of people into good and evil; our desperation to conform to our in-group and the ease with which we demonize those in the out-group. And with these new traits, empowered by our newly enlarged brains and cumulative language, the human instinct for politics—derived from our ancestral primates—was no longer a little trick for climbing social hierarchies but a cudgel of coordinated conquest. All this is the inevitable result of a survival niche requiring high levels of altruism between unrelated individuals.

This is exactly the kind of feedback loop where evolutionary changes occur rapidly. For every incremental increase in gossip and punishment of violators, the more altruistic it was optimal to be. For every incremental increase in altruism, the more optimal it was to freely share information with others using language, which would select for more advanced language skills. For every incremental increase in language skills, the more effective gossip became, thereby reinforcing the cycle. Every roundabout of this cycle made our ancestors’ brains bigger and bigger. As social groups got bigger (powered by improved gossip, altruism, and punishment), it created more pressure for bigger brains to keep track of all the social relationships. As more ideas accumulated across generations, it created more pressure for bigger brains to increase the storage capacity of ideas that could be maintained within a generation. As the usefulness of inner simulations increased due to more reliable sharing of thoughts through language, it created more pressure for bigger brains to render more sophisticated inner simulations in the first place.

ChatGPT and the Window into the Mind

Human brains have an automatic system for predicting words (one probably similar, at least in principle, to models like GPT-3) and an inner simulation. Much of what makes human language powerful is not the syntax of it, but its ability to give us the necessary information to render a simulation about it and, crucially, to use these sequences of words to render the same inner simulation as other humans around us.
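
As a minimal illustration of an automatic word-prediction system (a toy bigram counter of my own, nothing like GPT-3 in scale or mechanism), such a system simply proposes whichever word most often followed the current word in its training text:

```python
# A toy next-word predictor built from bigram counts (illustrative only).
from collections import Counter, defaultdict

training_text = (
    "the cat sat on the mat . the cat chased the dog . the cat sat on the rug ."
).split()

# Count which word follows which.
follows = defaultdict(Counter)
for prev, nxt in zip(training_text, training_text[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    # Propose the most frequent continuation seen in training.
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))   # 'cat' (follows 'the' three times in the toy text)
print(predict_next("sat"))   # 'on'
```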

By training GPT-4 to not just predict the answer, but to predict the next step in reasoning about the answer, the model begins to exhibit emergent properties of thinking, without, in fact, thinking—at least not in the way that a human thinks by rendering a simulation of the world.

What is most amazing about the success of LLMs is how much they seemingly understand about the world despite being trained on nothing but language. LLMs can correctly reason about the physical world without ever having experienced that world. Like a military cryptanalyst decoding the meaning behind encrypted secret messages, finding patterns and meanings in what was originally gibberish, these LLMs have been able to tease out aspects of a world they have never seen or heard, that they have never touched or experienced, by merely scanning the entire corpus of our uniquely human code for transferring thoughts.

Summary of Breakthrough #5: Speaking

Early humans got caught in an unlikely perfect storm of effects. The dying forests of the African savannah pushed early humans into a tool-making, meat-eating niche, one that required the accurate propagation of tool use across generations. Proto-languages emerged, enabling tool use and manufacture skills to successfully propagate across generations. The neurological change that enabled language was not a new neurological structure but an adjustment to more ancient structures, which created a learning program for language: the program of proto-conversations and joint attention that enables children to tether names to components of their inner simulation. Trained with this curriculum, older areas of the neocortex were repurposed for language.

From here, humans began experimenting with using this proto-language with unrelated individuals, and this kicked off a feedback loop of gossip, altruism, and punishment, which continuously selected for more sophisticated language skills. As social groups expanded and ideas began hopping from brain to brain, the human hive mind emerged, creating an ephemeral medium for ideas to propagate and accumulate across generations. This would have begged for bigger brains to store and share more accumulated knowledge. And perhaps due to this, or enabling it, cooking was invented, offering a huge caloric surplus that could be spent on tripling the size of brains. And so, from this perfect storm emerged the fifth and final breakthrough in the evolutionary story of the human brain: language. And along with language came the many unique traits of humans, from altruism to cruelty. If there is anything that truly makes humans unique, it is that the mind is no longer singular but is tethered to others through a long history of accumulated ideas.

Conclusion: The Sixth Breakthrough

Thus far, humanity’s story has been a saga of two acts. Act 1 is the evolutionary story: how biologically modern humans emerged from the raw lifeless stuff of our universe. Act 2 is the cultural story: how societally modern humans emerged from largely biologically identical but culturally primitive ancestors from around one hundred thousand years ago. While act 1 unfolded over billions of years, most of what we have learned in history class unfolded during the comparatively much shorter time of act 2—all civilizations, technologies, wars, discoveries, dramas, mythologies, heroes, and villains unfolded in this time window that, compared to act 1, was a mere blink of an eye.

Of course, we don’t know what breakthrough #6 will be. But it seems increasingly likely that the sixth breakthrough will be the creation of artificial superintelligence; the emergence of our progeny in silicon, the transition of intelligence—made in our image—from a biological medium to a digital medium. From this new medium will come an astronomical expansion in the scale of a single intelligence’s cognitive capacity. The cognitive capacity of the human brain is hugely limited by the processing speed of neurons, the caloric limitations of the human body, and the size constraints of how big a brain can be and still fit in a carbon-based life-form. Breakthrough #6 will be when intelligence unshackles itself from these biological limitations. A silicon-based AI can infinitely scale up its processing capacity as it sees fit. Indeed, individuality will lose its well-defined boundaries as AIs can freely copy and reconfigure themselves; parenthood will take on new meaning as biological mechanisms of mating give way to new silicon-based mechanisms of training and creating new intelligent entities. Even evolution itself will be abandoned, at least in its familiar form; intelligence will no longer be entrapped by the slow process of genetic variation and natural selection but will instead operate under more fundamental evolutionary principles, the purest sense of variation and selection—as AIs reinvent themselves, those that select features that support better survival will, of course, be the ones that survive.

And so we stand on the precipice of the sixth breakthrough in the story of human intelligence, at the dawn of seizing control of the process by which life came to be and of birthing superintelligent artificial beings. At this precipice, we are confronted with a very unscientific question but one that is, in fact, far more important: What should be humanity’s goals? This is not a matter of veritas—truth—but of values. As we have seen, past choices propagate through time. And so how we answer this question will have consequences for eons to come.

As we look forward into this new era, it behooves us to look backward at the long billion-year story by which our brains came to be. As we become endowed with godlike abilities of creation, we should learn from the god—the unthinking process of evolution—that came before us. The more we understand about our own minds, the better equipped we are to create artificial minds in our image. The more we understand about the process by which our minds came to be, the better equipped we are to choose which features of intelligence we want to discard, which we want to preserve, and which we want to improve upon. We are the stalwarts of this grand transition, one that has been fourteen billion years in the making. Whether we like it or not, the universe has passed us the baton.