Rationality: From AI to Zombies - by Eliezer Yudkowsky

Book I Map and Territory

Part A Predictably Wrong

What Do I Mean By “Rationality”?

I mean:

  1. Epistemic rationality: systematically improving the accuracy of your beliefs.
  2. Instrumental rationality: systematically achieving your values.

So rationality is about forming true beliefs and making winning decisions.

Feeling Rational

A popular belief about “rationality” is that rationality opposes all emotion—that all our sadness and all our joy are automatically anti-logical by virtue of being feelings. Yet strangely enough, I can’t find any theorem of probability theory which proves that I should appear ice-cold and expressionless. So is rationality orthogonal to feeling? No; our emotions arise from our models of reality. If I believe that my dead brother has been discovered alive, I will be happy; if I wake up and realize it was a dream, I will be sad. P. C. Hodgell said: “That which can be destroyed by the truth should be.” My dreaming self’s happiness was opposed by truth. My sadness on waking is rational; there is no truth which destroys it.

Why Truth? And...

It is written: “The first virtue is curiosity.” Curiosity is one reason to seek truth, and it may not be the only one, but it has a special and admirable purity. If your motive is curiosity, you will assign priority to questions according to how the questions, themselves, tickle your personal aesthetic sense. A trickier challenge, with a greater probability of failure, may be worth more effort than a simpler one, just because it is more fun.

...What’s a Bias, Again?

We seem to label as “biases” those obstacles to truth which are produced, not by the cost of information, nor by limited computing power, but by the shape of our own mental machinery. Perhaps the machinery is evolutionarily optimized to purposes that actively oppose epistemic accuracy; for example, the machinery to win arguments in adaptive political contexts. Or the selection pressure ran skew to epistemic accuracy; for example, believing what others believe, to get along socially. Or, in the classic heuristics-and-biases pattern, the machinery operates by an identifiable algorithm that does some useful work but also produces systematic errors: the availability heuristic is not itself a bias, but it gives rise to identifiable, compactly describable biases. Our brains are doing something wrong, and after a lot of experimentation and/or heavy thinking, someone identifies the problem in a fashion that System 2 can comprehend; then we call it a “bias.” Even if we can do no better for knowing, it is still a failure that arises, in an identifiable fashion, from a particular kind of cognitive machinery—not from having too little machinery, but from the machinery’s shape.

Availability

The availability heuristic is judging the frequency or probability of an event by the ease with which examples of the event come to mind.

Selective reporting is one major source of availability biases. In the ancestral environment, much of what you knew, you experienced yourself; or you heard it directly from a fellow tribe-member who had seen it. There was usually at most one layer of selective reporting between you and the event itself. With today’s Internet, you may see reports that have passed through the hands of six bloggers on the way to you—six successive filters. Compared to our ancestors, we live in a larger world, in which far more happens, and far less of it reaches us—a much stronger selection effect, which can create much larger availability biases.

Burdensome Details

The conjunction fallacy occurs when humans rate the probability P(A,B) higher than the probability P(B), even though it is a theorem that P(A,B) ≤ P(B). For example, in one experiment in 1981, 68% of the subjects ranked it more likely that “Reagan will provide federal support for unwed mothers and cut federal support to local governments” than that “Reagan will provide federal support for unwed mothers.”

By adding extra details, you can make an outcome seem more characteristic of the process that generates it. You can make it sound more plausible that Reagan will support unwed mothers, by adding the claim that Reagan will also cut support to local governments. The implausibility of one claim is compensated by the plausibility of the other; they “average out.” Which is to say: Adding detail can make a scenario SOUND MORE PLAUSIBLE, even though the event necessarily BECOMES LESS PROBABLE.
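The theorem itself is a one-line computation. Here is a minimal sketch in Python, with invented probabilities attached to the Reagan example (the specific numbers are not from the experiment):

```python
# B = "Reagan provides federal support for unwed mothers"
# A = "Reagan cuts federal support to local governments"
# The probabilities below are invented for illustration.
p_B = 0.30          # probability of B alone
p_A_given_B = 0.50  # probability of A, given that B happens

# The conjunction: A and B both happen.
p_A_and_B = p_B * p_A_given_B  # 0.30 * 0.50 = 0.15

# P(A,B) <= P(B) holds no matter what numbers you pick,
# because P(A|B) can never exceed 1.
assert p_A_and_B <= p_B
print(f"P(B) = {p_B:.2f}, P(A,B) = {p_A_and_B:.2f}")  # P(B) = 0.30, P(A,B) = 0.15
```

However plausible the extra detail makes the scenario sound, multiplying by a factor that is at most 1 can only shrink the probability.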

Planning Fallacy

When people are asked for a “realistic” scenario, they envision everything going exactly as planned, with no unexpected delays or unforeseen catastrophes—the same vision as their “best case.” Reality, it turns out, usually delivers results somewhat worse than the “worst case.” Unlike most cognitive biases, we know a good debiasing heuristic for the planning fallacy.

People tend to generate their predictions by thinking about the particular, unique features of the task at hand, and constructing a scenario for how they intend to complete the task—which is just what we usually think of as planning. When you want to get something done, you have to plan out where, when, how; figure out how much time and how much resource is required; visualize the steps from beginning to successful conclusion. All this is the “inside view,” and it doesn’t take into account unexpected delays and unforeseen catastrophes. As we saw before, asking people to visualize the “worst case” still isn’t enough to counteract their optimism—they don’t visualize enough Murphyness.

The outside view is when you deliberately avoid thinking about the special, unique features of this project, and just ask how long it took to finish broadly similar projects in the past. This is counterintuitive, since the inside view has so much more detail—there’s a temptation to think that a carefully tailored prediction, taking into account all available data, will give better results. But experiment has shown that the more detailed subjects’ visualization, the more optimistic (and less accurate) they become.

Illusion of Transparency: Why No One Understands You

We always know what we mean by our words, and so we expect others to know it too. Reading our own writing, the intended interpretation falls easily into place, guided by our knowledge of what we really meant. It’s hard to empathize with someone who must interpret blindly, guided only by the words.

Be not too quick to blame those who misunderstand your perfectly clear sentences, spoken or written. Chances are, your words are more ambiguous than you think.

Expecting Short Inferential Distances

Homo sapiens’s environment of evolutionary adaptedness (a.k.a. EEA or “ancestral environment”) consisted of hunter-gatherer bands of at most 200 people, with no writing. All inherited knowledge was passed down by speech and memory. In a world like that, all background knowledge is universal knowledge. All information not strictly private is public, period. In the ancestral environment, you were unlikely to end up more than one inferential step away from anyone else. When you discover a new oasis, you don’t have to explain to your fellow tribe members what an oasis is, or why it’s a good idea to drink water, or how to walk. Only you know where the oasis lies; this is private knowledge. But everyone has the background to understand your description of the oasis, the concepts needed to think about water; this is universal knowledge. When you explain things in an ancestral environment, you almost never have to explain your concepts. At most you have to explain one new concept, not two or more simultaneously. In the ancestral environment there were no abstract disciplines with vast bodies of carefully gathered evidence generalized into elegant theories transmitted by written books whose conclusions are a hundred inferential steps removed from universally shared background premises.

Combined with the illusion of transparency and self-anchoring, I think this explains a lot about the legendary difficulty most scientists have in communicating with a lay audience—or even communicating with scientists from other disciplines. When I observe failures of explanation, I usually see the explainer taking one step back, when they need to take two or more steps back. Or listeners assume that things should be visible in one step, when they take two or more steps to explain. Both sides act as if they expect very short inferential distances from universal knowledge to any new knowledge.

A clear argument has to lay out an inferential pathway, starting from what the audience already knows or accepts. If you don’t recurse far enough, you’re just talking to yourself. If at any point you make a statement without obvious justification in arguments you’ve previously supported, the audience just thinks you’re crazy.

The Lens That Sees Its Own Flaws

The brain is a flawed lens through which to see reality. This is true of both mouse brains and human brains. But a human brain is a flawed lens that can understand its own flaws—its systematic errors, its biases—and apply second-order corrections to them. This, in practice, makes the lens far more powerful. Not perfect, but far more powerful.

Part B Fake Beliefs

Making Beliefs Pay Rent (in Anticipated Experiences)

The rationalist virtue of empiricism consists of constantly asking which experiences our beliefs predict—or better yet, prohibit. Do you believe that phlogiston is the cause of fire? Then what do you expect to see happen, because of that? Do you believe that Wulky Wilkinsen is a post-utopian? Then what do you expect to see because of that? No, not “colonial alienation”; what experience will happen to you? Do you believe that if a tree falls in the forest, and no one hears it, it still makes a sound? Then what experience must therefore befall you?

It is even better to ask: what experience must not happen to you? Do you believe that élan vital explains the mysterious aliveness of living beings? Then what does this belief not allow to happen—what would definitely falsify this belief? A null answer means that your belief does not constrain experience; it permits anything to happen to you. It floats.

When you argue a seemingly factual question, always keep in mind which difference of anticipation you are arguing about. If you can’t find the difference of anticipation, you’re probably arguing about labels in your belief network—or even worse, floating beliefs, barnacles on your network. If you don’t know what experiences are implied by Wulky Wilkinsen being a post-utopian, you can go on arguing forever. Above all, don’t ask what to believe—ask what to anticipate. Every question of belief should flow from a question of anticipation, and that question of anticipation should be the center of the inquiry. Every guess of belief should begin by flowing to a specific guess of anticipation, and should continue to pay rent in future anticipations. If a belief turns deadbeat, evict it.

Pretending to be Wise

It’s common to put on a show of neutrality or suspended judgment in order to signal that one is mature, wise, impartial, or just has a superior vantage point. This I call “pretending to be Wise.” Of course there are many ways to try to signal wisdom. But trying to signal wisdom by refusing to make guesses—refusing to sum up evidence—refusing to pass judgment—refusing to take sides—staying above the fray and looking down with a lofty and condescending gaze—which is to say, signaling wisdom by saying and doing nothing—well, that I find particularly pretentious.

There’s a difference between:

  • Passing neutral judgment;

  • Declining to invest marginal resources;

  • Pretending that either of the above is a mark of deep wisdom, maturity, and a superior vantage point; with the corresponding implication that the original sides occupy lower vantage points that are not importantly different from up there.

Belief as Attire

Yet another form of improper belief is belief as group identification—as a way of belonging. Robin Hanson uses the excellent metaphor of wearing unusual clothing, a group uniform like a priest’s vestments or a Jewish skullcap, and so I will call this “belief as attire.”

Belief-as-attire may help explain how people can be passionate about improper beliefs. Mere belief in belief, or religious professing, would have some trouble creating genuine, deep, powerful emotional effects. Or so I suspect; I confess I’m not an expert here. But my impression is this: People who’ve stopped anticipating-as-if their religion is true, will go to great lengths to convince themselves they are passionate, and this desperation can be mistaken for passion. But it’s not the same fire they had as a child. On the other hand, it is very easy for a human being to genuinely, passionately, gut-level belong to a group, to cheer for their favorite sports team. (This is the foundation on which rests the swindle of “Republicans vs. Democrats” and analogous false dilemmas in other countries, but that’s a topic for another time.) Identifying with a tribe is a very strong emotional force. People will die for it. And once you get people to identify with a tribe, the beliefs which are attire of that tribe will be spoken with the full passion of belonging to that tribe.

Applause Lights

Most applause lights are blatant, and can be detected by a simple reversal test. For example, suppose someone says: We need to balance the risks and opportunities of AI. If you reverse this statement, you get: We shouldn’t balance the risks and opportunities of AI. Since the reversal sounds abnormal, the unreversed statement is probably normal, implying it does not convey new information. There are plenty of legitimate reasons for uttering a sentence that would be uninformative in isolation. “We need to balance the risks and opportunities of AI” can introduce a discussion topic; it can emphasize the importance of a specific proposal for balancing; it can criticize an unbalanced proposal. Linking to a normal assertion can convey new information to a bounded rationalist—the link itself may not be obvious. But if no specifics follow, the sentence is probably an applause light.

Part C Noticing Confusion

What Is Evidence?

What is evidence? It is an event entangled, by links of cause and effect, with whatever you want to know about. If the target of your inquiry is your shoelaces, for example, then the light entering your pupils is evidence entangled with your shoelaces. This should not be confused with the technical sense of “entanglement” used in physics—here I’m just talking about “entanglement” in the sense of two things that end up in correlated states because of the links of cause and effect between them.

Not every influence creates the kind of “entanglement” required for evidence. It’s no help to have a machine that beeps when you enter winning lottery numbers, if the machine also beeps when you enter losing lottery numbers. The light reflected from your shoes would not be useful evidence about your shoelaces, if the photons ended up in the same physical state whether your shoelaces were tied or untied. To say it abstractly: For an event to be evidence about a target of inquiry, it has to happen differently in a way that’s entangled with the different possible states of the target.

This is why rationalists put such a heavy premium on the paradoxical-seeming claim that a belief is only really worthwhile if you could, in principle, be persuaded to believe otherwise. If your retina ended up in the same state regardless of what light entered it, you would be blind. Some belief systems, in a rather obvious trick to reinforce themselves, say that certain beliefs are only really worthwhile if you believe them unconditionally—no matter what you see, no matter what you think. Your brain is supposed to end up in the same state regardless. Hence the phrase, “blind faith.” If what you believe doesn’t depend on what you see, you’ve been blinded as effectively as by poking out your eyeballs. If your eyes and brain work correctly, your beliefs will end up entangled with the facts. Rational thought produces beliefs which are themselves evidence.

How Much Evidence Does It Take?

In general, the rules for weighing “how much evidence it takes” follow a similar pattern: The larger the space of possibilities in which the hypothesis lies, or the more unlikely the hypothesis seems a priori compared to its neighbors, or the more confident you wish to be, the more evidence you need. You cannot defy the rules; you cannot form accurate beliefs based on inadequate evidence.
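One way to make “how much evidence” concrete is to count evidence in bits: singling out one hypothesis from a space of N equally likely alternatives takes log2(N) bits, and each independent observation contributes the logarithm of its likelihood ratio. A minimal sketch, where the lottery and the beeping box are hypothetical:

```python
import math

def bits_needed(num_hypotheses):
    # Locating one hypothesis among N equally likely alternatives
    # requires log2(N) bits of evidence.
    return math.log2(num_hypotheses)

def bits_from_observation(p_seen_if_true, p_seen_if_false):
    # An observation carries log2 of its likelihood ratio in evidence.
    return math.log2(p_seen_if_true / p_seen_if_false)

# A hypothetical lottery with about a million possible winning numbers:
print(bits_needed(2 ** 20))  # 20.0 bits just to single out the winner

# A box that always beeps for the winning ticket, but also beeps
# for a quarter of the losing tickets, gives log2(1.00 / 0.25) bits:
print(bits_from_observation(1.00, 0.25))  # 2.0
# You would need ten independent beeps like that to accumulate 20 bits.
```

The larger the hypothesis space, the more bits you must accumulate before your belief is justified.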

Occam’s Razor

The more complex an explanation is, the more evidence you need just to find it in belief-space. (In Traditional Rationality this is often phrased misleadingly, as “The more complex a proposition is, the more evidence is required to argue for it.”) How can we measure the complexity of an explanation? How can we determine how much evidence is required? Occam’s Razor is often phrased as “The simplest explanation that fits the facts.”

Your Strength as a Rationalist

Alas, belief is easier than disbelief; we believe instinctively, but disbelief requires a conscious effort.

Your strength as a rationalist is your ability to be more confused by fiction than by reality. If you are equally good at explaining any outcome, you have zero knowledge.

I should have paid more attention to that sensation of “still feels a little forced.” It’s one of the most important feelings a truthseeker can have, a part of your strength as a rationalist. It is a design flaw in human cognition that this sensation manifests as a quiet strain in the back of your mind, instead of a wailing alarm siren and a glowing neon sign reading: EITHER YOUR MODEL IS FALSE OR THIS STORY IS WRONG.

Absence of Evidence Is Evidence of Absence

In probability theory, absence of evidence is always evidence of absence. If E is a binary event and P(H|E) > P(H), i.e., seeing E increases the probability of H, then P(H|¬E) < P(H), i.e., failure to observe E decreases the probability of H. The probability P(H) is a weighted mix of P(H|E) and P(H|¬E), and necessarily lies between the two.
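The argument is a short consequence of the law of total probability, and easy to check numerically. The probabilities below are arbitrary stand-ins:

```python
p_E = 0.4             # probability of observing the evidence E
p_H_given_E = 0.9     # probability of H if E is observed
p_H_given_notE = 0.2  # probability of H if E is not observed

# Law of total probability: P(H) is a weighted mix of the two,
# so it must lie between them.
p_H = p_E * p_H_given_E + (1 - p_E) * p_H_given_notE
print(round(p_H, 2))  # 0.48

# Seeing E raises the probability of H...
assert p_H_given_E > p_H
# ...so failing to see E must lower it:
assert p_H_given_notE < p_H
```

Whatever probabilities you substitute, P(H) lands strictly between P(H|E) and P(H|¬E) as long as the two differ and 0 < P(E) < 1; the two conditional probabilities cannot both sit on the same side of the prior.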

Under the vast majority of real-life circumstances, a cause may not reliably produce signs of itself, but the absence of the cause is even less likely to produce the signs. The absence of an observation may be strong evidence of absence or very weak evidence of absence, depending on how likely the cause is to produce the observation. The absence of an observation that is only weakly permitted (even if the alternative hypothesis does not allow it at all) is very weak evidence of absence (though it is evidence nonetheless). This is the fallacy of “gaps in the fossil record”—fossils form only rarely; it is futile to trumpet the absence of a weakly permitted observation when many strong positive observations have already been recorded. But if there are no positive observations at all, it is time to worry; hence the Fermi Paradox.

Your strength as a rationalist is your ability to be more confused by fiction than by reality; if you are equally good at explaining any outcome you have zero knowledge. The strength of a model is not what it can explain, but what it can’t, for only prohibitions constrain anticipation. If you don’t notice when your model makes the evidence unlikely, you might as well have no model, and also you might as well have no evidence; no brain and no eyes.

Hindsight Devalues Science

Daphna Baratz exposed college students to pairs of supposed findings, one true (“In prosperous times people spend a larger portion of their income than during a recession”) and one the truth’s opposite. In both sides of the pair, students rated the supposed finding as what they “would have predicted.” Perfectly standard hindsight bias. Which leads people to think they have no need for science, because they “could have predicted” that. (Just as you would expect, right?)

Hindsight will lead us to systematically undervalue the surprisingness of scientific findings, especially the discoveries we understand—the ones that seem real to us, the ones we can retrofit into our models of the world. If you understand neurology or physics and read news in that topic, then you probably underestimate the surprisingness of findings in those fields too. This unfairly devalues the contribution of the researchers; and worse, will prevent you from noticing when you are seeing evidence that doesn’t fit what you really would have expected. We need to make a conscious effort to be shocked enough.

Part D Mysterious Answers

Fake Explanations

Once upon a time, there was an instructor who taught physics students. One day the instructor called them into the classroom and showed them a wide, square plate of metal, next to a hot radiator. The students each put their hand on the plate and found the side next to the radiator cool, and the distant side warm. And the instructor said, Why do you think this happens? Some students guessed convection of air currents, and others guessed strange metals in the plate. They devised many creative explanations, none stooping so low as to say “I don’t know” or “This seems impossible.” And the answer was that before the students entered the room, the instructor turned the plate around.

Consider the student who frantically stammers, “Eh, maybe because of the heat conduction and so?” I ask: Is this answer a proper belief? The words are easily enough professed—said in a loud, emphatic voice. But do the words actually control anticipation? Ponder that innocent little phrase, “because of,” which comes before “heat conduction.” Ponder some of the other things we could put after it. We could say, for example, “Because of phlogiston,” or “Because of magic.”

“Magic!” you cry. “That’s not a scientific explanation!” Indeed, the phrases “because of heat conduction” and “because of magic” are readily recognized as belonging to different literary genres. “Heat conduction” is something that Spock might say on Star Trek, whereas “magic” would be said by Giles in Buffy the Vampire Slayer.

However, as Bayesians, we take no notice of literary genres. For us, the substance of a model is the control it exerts on anticipation. If you say “heat conduction,” what experience does that lead you to anticipate? Under normal circumstances, it leads you to anticipate that, if you put your hand on the side of the plate near the radiator, that side will feel warmer than the opposite side. If “because of heat conduction” can also explain the radiator-adjacent side feeling cooler, then it can explain pretty much anything. And as we all know by this point (I do hope), if you are equally good at explaining any outcome, you have zero knowledge. “Because of heat conduction,” used in such fashion, is a disguised hypothesis of maximum entropy. It is anticipation-isomorphic to saying “magic.” It feels like an explanation, but it’s not.

The deeper error of the students is not simply that they failed to constrain anticipation. Their deeper error is that they thought they were doing physics. They said the phrase “because of,” followed by the sort of words Spock might say on Star Trek, and thought they thereby entered the magisterium of science. Not so. They simply moved their magic from one literary genre to another.

Guessing the Teacher’s Password

There is an instinctive tendency to think that if a physicist says “light is made of waves,” and the teacher says “What is light made of?,” and the student says “Waves!,” then the student has made a true statement. That’s only fair, right? We accept “waves” as a correct answer from the physicist; wouldn’t it be unfair to reject it from the student? Surely, the answer “Waves!” is either true or false, right?

Which is one more bad habit to unlearn from school. Words do not have intrinsic definitions. If I hear the syllables “bea-ver” and think of a large rodent, that is a fact about my own state of mind, not a fact about the syllables “bea-ver.” The sequence of syllables “made of waves” (or “because of heat conduction”) is not a hypothesis, it is a pattern of vibrations traveling through the air, or ink on paper. It can associate to a hypothesis in someone’s mind, but it is not, of itself, right or wrong.

But in school, the teacher hands you a gold star for saying “made of waves,” which must be the correct answer because the teacher heard a physicist emit the same sound-vibrations. Since verbal behavior (spoken or written) is what gets the gold star, students begin to think that verbal behavior has a truth-value. After all, either light is made of waves, or it isn’t, right?

And this leads into an even worse habit. Suppose the teacher presents you with a confusing problem involving a metal plate next to a radiator; the far side feels warmer than the side next to the radiator. The teacher asks “Why?” If you say “I don’t know,” you have no chance of getting a gold star—it won’t even count as class participation. But, during the current semester, this teacher has used the phrases “because of heat convection,” “because of heat conduction,” and “because of radiant heat.” One of these is probably what the teacher wants. You say, “Eh, maybe because of heat conduction?” This is not a hypothesis about the metal plate. This is not even a proper belief. It is an attempt to guess the teacher’s password.

Science as Attire

The X-Men comics use terms like “evolution,” “mutation,” and “genetic code,” purely to place themselves in what they conceive to be the literary genre of science. The part that scares me is wondering how many people, especially in the media, understand science only as a literary genre. I encounter people who very definitely believe in evolution, who sneer at the folly of creationists. And yet they have no idea of what the theory of evolutionary biology permits and prohibits. They’ll talk about “the next step in the evolution of humanity,” as if natural selection got here by following a plan. Or even worse, they’ll talk about something completely outside the domain of evolutionary biology, like an improved design for computer chips, or corporations splitting, or humans uploading themselves into computers, and they’ll call that “evolution.” If evolutionary biology could cover that, it could cover anything. Probably an actual majority of the people who believe in evolution use the phrase “because of evolution” because they want to be part of the scientific in-crowd—belief as scientific attire, like wearing a lab coat. If the scientific in-crowd instead used the phrase “because of intelligent design,” they would just as cheerfully use that instead—it would make no difference to their anticipation-controllers.

Is there any idea in science that you are proud of believing, though you do not use the belief professionally? You had best ask yourself which future experiences your belief prohibits from happening to you. That is the sum of what you have assimilated and made a true part of yourself. Anything else is probably passwords or attire.

Semantic Stopsigns

Consider the seeming paradox of the First Cause. Science has traced events back to the Big Bang, but why did the Big Bang happen? It’s all well and good to say that the zero of time begins at the Big Bang—that there is nothing before the Big Bang in the ordinary flow of minutes and hours. But saying this presumes our physical law, which itself appears highly structured; it calls out for explanation. Where did the physical laws come from? You could say that we’re all a computer simulation, but then the computer simulation is running on some other world’s laws of physics—where did those laws of physics come from? At this point, some people say, “God!” What could possibly make anyone, even a highly religious person, think this even helped answer the paradox of the First Cause? Why wouldn’t you automatically ask, “Where did God come from?” Saying “God is uncaused” or “God created Himself” leaves us in exactly the same position as “Time began with the Big Bang.” We just ask why the whole metasystem exists in the first place, or why some events but not others are allowed to be uncaused.

Jonathan Wallace suggested that “God!” functions as a semantic stopsign—that it isn’t a propositional assertion, so much as a cognitive traffic signal: do not think past this point. Saying “God!” doesn’t so much resolve the paradox, as put up a cognitive traffic signal to halt the obvious continuation of the question-and-answer chain. Of course you’d never do that, being a good and proper atheist, right? But “God!” isn’t the only semantic stopsign, just the obvious first example.

Be careful here not to create a new generic counterargument against things you don’t like—“Oh, it’s just a stopsign!” No word is a stopsign of itself; the question is whether a word has that effect on a particular person. Having strong emotions about something doesn’t qualify it as a stopsign. What distinguishes a semantic stopsign is failure to consider the obvious next question.

Mysterious Answers to Mysterious Questions

Ignorance exists in the map, not in the territory. If I am ignorant about a phenomenon, that is a fact about my own state of mind, not a fact about the phenomenon itself. A phenomenon can seem mysterious to some particular person. There are no phenomena which are mysterious of themselves. To worship a phenomenon because it seems so wonderfully mysterious is to worship your own ignorance.

But the deeper failure is supposing that an answer can be mysterious. If a phenomenon feels mysterious, that is a fact about our state of knowledge, not a fact about the phenomenon itself. The vitalists saw a mysterious gap in their knowledge, and postulated a mysterious stuff that plugged the gap. In doing so, they mixed up the map with the territory. All confusion and bewilderment exist in the mind, not in encapsulated substances. This is the ultimate and fully general explanation for why, again and again in humanity’s history, people are shocked to discover that an incredibly mysterious question has a non-mysterious answer. Mystery is a property of questions, not answers. Therefore I call theories such as vitalism mysterious answers to mysterious questions.

These are the signs of mysterious answers to mysterious questions:

  • First, the explanation acts as a curiosity-stopper rather than an anticipation-controller.

  • Second, the hypothesis has no moving parts—the model is not a specific complex mechanism, but a blankly solid substance or force. The mysterious substance or mysterious force may be said to be here or there, to cause this or that; but the reason why the mysterious force behaves thus is wrapped in a blank unity.

  • Third, those who proffer the explanation cherish their ignorance; they speak proudly of how the phenomenon defeats ordinary science or is unlike merely mundane phenomena.

  • Fourth, even after the answer is given, the phenomenon is still a mystery and possesses the same quality of wonderful inexplicability that it had at the start.

The Futility of Emergence

The failures of phlogiston and vitalism are historical hindsight. Dare I step out on a limb, and name some current theory which I deem analogously flawed? I name emergence or emergent phenomena—usually defined as the study of systems whose high-level behaviors arise or “emerge” from the interaction of many low-level elements.

Taken literally, that description fits every phenomenon in our universe above the level of individual quarks, which is part of the problem. Imagine pointing to a market crash and saying “It’s not a quark!” Does that feel like an explanation? No? Then neither should saying “It’s an emergent phenomenon!”

I have lost track of how many times I have heard people say, “Intelligence is an emergent phenomenon!” as if that explained intelligence. This usage fits all the checklist items for a mysterious answer to a mysterious question. What do you know, after you have said that intelligence is “emergent”? You can make no new predictions. You do not know anything about the behavior of real-world minds that you did not know before. It feels like you believe a new fact, but you don’t anticipate any different outcomes. Your curiosity feels sated, but it has not been fed. The hypothesis has no moving parts—there’s no detailed internal model to manipulate. Those who proffer the hypothesis of “emergence” confess their ignorance of the internals, and take pride in it; they contrast the science of “emergence” to other sciences merely mundane. And even after the answer of “Why? Emergence!” is given, the phenomenon is still a mystery and possesses the same sacred impenetrability it had at the start.

“Emergence” has become very popular, just as saying “magic” used to be very popular. “Emergence” has the same deep appeal to human psychology, for the same reason. “Emergence” is such a wonderfully easy explanation, and it feels good to say it; it gives you a sacred mystery to worship. Emergence is popular because it is the junk food of curiosity. You can explain anything using emergence, and so people do just that; for it feels so wonderful to explain things.

Say Not “Complexity”

What you must avoid is skipping over the mysterious part; you must linger at the mystery to confront it directly. There are many words that can skip over mysteries, and some of them would be legitimate in other contexts—“complexity,” for example. But the essential mistake is that skip-over, regardless of what causal node goes behind it. The skip-over is not a thought, but a microthought. You have to pay close attention to catch yourself at it. And when you train yourself to avoid skipping, it will become a matter of instinct, not verbal reasoning. You have to feel which parts of your map are still blank, and more importantly, pay attention to that feeling.

Marcello and I developed a convention in our AI work: when we ran into something we didn’t understand, which was often, we would say “magic”—as in, “X magically does Y”—to remind ourselves that here was an unsolved problem, a gap in our understanding. It is far better to say “magic,” than “complexity” or “emergence”; the latter words create an illusion of understanding. Wiser to say “magic,” and leave yourself a placeholder, a reminder of work you will have to do later.

Positive Bias: Look into the Dark

Subjects who attempt the 2-4-6 task usually try to generate positive examples, rather than negative examples—they apply the hypothetical rule to generate a representative instance, and see if it is labeled “Yes.” Thus, someone who forms the hypothesis “numbers increasing by two” will test the triplet 8-10-12, hear that it fits, and confidently announce the rule. Someone who forms the hypothesis X-2X-3X will test the triplet 3-6-9, discover that it fits, and then announce that rule. In every case the actual rule is the same: the three numbers must be in ascending order. But to discover this, you would have to generate triplets that shouldn’t fit, such as 20-23-26, and see if they are labeled “No.” Which people tend not to do, in this experiment.
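The structure of the task is easy to sketch in code. The following is a minimal illustration, not the experiment itself; the function name `secret_rule` and the specific triplets are mine:

```python
def secret_rule(triplet):
    """The experimenter's actual rule: the three numbers are in ascending order."""
    a, b, c = triplet
    return a < b < c

# A subject who hypothesizes "numbers increasing by two" and tests only
# positive examples of that hypothesis hears nothing but "Yes":
positive_tests = [(8, 10, 12), (20, 22, 24), (1, 3, 5)]
print(all(secret_rule(t) for t in positive_tests))  # True -- hypothesis "confirmed"

# The discriminating tests are triplets the hypothesis predicts should FAIL.
# If these also come back "Yes", the hypothesis is falsified:
negative_tests = [(20, 23, 26), (1, 2, 39)]
print(all(secret_rule(t) for t in negative_tests))  # True -- so "increasing by two" is wrong
```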

This cognitive phenomenon is usually lumped in with “confirmation bias.” However, it seems to me that the phenomenon of trying to test positive rather than negative examples, ought to be distinguished from the phenomenon of trying to preserve the belief you started with. “Positive bias” is sometimes used as a synonym for “confirmation bias,” and fits this particular flaw much better.

One may be lectured on positive bias for days, and yet overlook it in-the-moment. Positive bias is not something we do as a matter of logic, or even as a matter of emotional attachment. The 2-4-6 task is “cold,” logical, not affectively “hot.” And yet the mistake is sub-verbal, on the level of imagery, of instinctive reactions. Because the problem doesn’t arise from following a deliberate rule that says “Only think about positive examples,” it can’t be solved just by knowing verbally that “We ought to think about both positive and negative examples.” Which example automatically pops into your head? You have to learn, wordlessly, to zag instead of zig. You have to learn to flinch toward the zero, instead of away from it.

Failing to Learn from History

Once upon a time, in my wild and reckless youth, when I knew not the Way of Bayes, I gave a Mysterious Answer to a mysterious-seeming question. Many failures occurred in sequence, but one mistake stands out as most critical: My younger self did not realize that solving a mystery should make it feel less confusing. I was trying to explain a Mysterious Phenomenon—which to me meant providing a cause for it, fitting it into an integrated model of reality. Why should this make the phenomenon less Mysterious, when that is its nature? I was trying to explain the Mysterious Phenomenon, not render it (by some impossible alchemy) into a mundane phenomenon, a phenomenon that wouldn’t even call out for an unusual explanation in the first place.

I thought the lesson of history was that astrologers and alchemists and vitalists had an innate character flaw, a tendency toward mysterianism, which led them to come up with mysterious explanations for non-mysterious subjects. But surely, if a phenomenon really was very weird, a weird explanation might be in order? It was only afterward, when I began to see the mundane structure inside the mystery, that I realized whose shoes I was standing in. Only then did I realize how reasonable vitalism had seemed at the time, how surprising and embarrassing had been the universe’s reply of, “Life is mundane, and does not need a weird explanation.”

Truly Part of You

How can you realize that you shouldn’t trust your seeming knowledge that “light is waves”? One test you could apply is asking, “Could I regenerate this knowledge if it were somehow deleted from my mind?”

How much of your knowledge could you regenerate? From how deep a deletion? It’s not just a test to cast out insufficiently connected beliefs. It’s a way of absorbing a fountain of knowledge, not just one fact.

When you contain the source of a thought, that thought can change along with you as you acquire new knowledge and new skills. When you contain the source of a thought, it becomes truly a part of you and grows along with you. Strive to make yourself the source of every thought worth thinking. If the thought originally came from outside, make sure it comes from inside as well. Continually ask yourself: “How would I regenerate the thought if it were deleted?” When you have an answer, imagine that knowledge being deleted as well. And when you find a fountain, see what else it can pour.

Book II How to Actually Change Your Mind

Part E Overly Convenient Excuses

“To be humble is to take specific actions in anticipation of your own errors. To confess your fallibility and then do nothing about it is not humble; it is boasting of your modesty.”

The Third Alternative

Believing in Santa Claus gives children a sense of wonder and encourages them to behave well in hope of receiving presents. If Santa-belief is destroyed by truth, the children will lose their sense of wonder and stop behaving nicely. Therefore, even though Santa-belief is false-to-fact, it is a Noble Lie whose net benefit should be preserved for utilitarian reasons.

Classically, this is known as a false dilemma, the fallacy of the excluded middle, or the package-deal fallacy. Even if we accept the underlying factual and moral premises of the above argument, it does not carry through. Even supposing that the Santa policy (encourage children to believe in Santa Claus) is better than the null policy (do nothing), it does not follow that Santa-ism is the best of all possible alternatives. Other policies could also supply children with a sense of wonder, such as taking them to watch a Space Shuttle launch or supplying them with science fiction novels. Likewise (if I recall correctly), offering children bribes for good behavior encourages the children to behave well only when adults are watching, while praise without bribes leads to unconditional good behavior. Noble Lies are generally package-deal fallacies; and the response to a package-deal fallacy is that if we really need the supposed gain, we can construct a Third Alternative for getting it.

Beware when you find yourself arguing that a policy is defensible rather than optimal; or that it has some benefit compared to the null action, rather than the best benefit of any action. False dilemmas are often presented to justify unethical policies that are, by some vast coincidence, very convenient.

To do better, ask yourself straight out: If I saw that there was a superior alternative to my current policy, would I be glad in the depths of my heart, or would I feel a tiny flash of reluctance before I let go? If the answers are “no” and “yes,” beware that you may not have searched for a Third Alternative.

Which leads into another good question to ask yourself straight out: Did I spend five minutes with my eyes closed, brainstorming wild and creative options, trying to think of a better alternative? It has to be five minutes by the clock, because otherwise you blink—close your eyes and open them again—and say, “Why, yes, I searched for alternatives, but there weren’t any.” Blinking makes a good black hole down which to dump your duties. An actual, physical clock is recommended. And those wild and creative options—were you careful not to think of a good one? Was there a secret effort from the corner of your mind to ensure that every option considered would be obviously bad?

The Fallacy of Gray

The Sophisticate: “The world isn’t black and white. No one does pure good or pure bad. It’s all gray. Therefore, no one is better than anyone else.” The Zetet: “Knowing only gray, you conclude that all grays are the same shade. You mock the simplicity of the two-color view, yet you replace it with a one-color view...”
— Marc Stiegler, David’s Sling

I don’t know if the Sophisticate’s mistake has an official name, but I call it the Fallacy of Gray.

“The Moon is made of green cheese” and “the Sun is made mostly of hydrogen and helium” are both uncertainties, but they are not the same uncertainty. Everything is shades of gray, but there are shades of gray so light as to be very nearly white, and shades of gray so dark as to be very nearly black. Or even if not, we can still compare shades, and say “it is darker” or “it is lighter.”

Likewise the folly of those who say, “Every scientific paradigm imposes some of its assumptions on how it interprets experiments,” and then act like they’d proven science to occupy the same level with witchdoctoring. Every worldview imposes some of its structure on its observations, but the point is that there are worldviews which try to minimize that imposition, and worldviews which glory in it. There is no white, but there are shades of gray that are far lighter than others, and it is folly to treat them as if they were all on the same level.

If the Moon has orbited the Earth these past few billion years, if you have seen it in the sky these last years, and you expect to see it in its appointed place and phase tomorrow, then that is not a certainty. And if you expect an invisible dragon to heal your daughter of cancer, that too is not a certainty. But they are rather different degrees of uncertainty—this business of expecting things to happen yet again in the same way you have previously predicted to twelve decimal places, versus expecting something to happen that violates the order previously observed. Calling them both “faith” seems a little too un-narrow.

Absolute Authority

In the world of the unenlightened ones, there is authority and un-authority. What can be trusted, can be trusted; what cannot be trusted, you may as well throw away. There are good sources of information and bad sources of information. If scientists have changed their stories ever in their history, then science cannot be a true Authority, and can never again be trusted—like a witness caught in a contradiction, or like an employee found stealing from the till.

One obvious source for this pattern of thought is religion, where the scriptures are alleged to come from God; therefore to confess any flaw in them would destroy their authority utterly; so any trace of doubt is a sin, and claiming certainty is mandatory whether you’re certain or not. But I suspect that the traditional school regimen also has something to do with it. The teacher tells you certain things, and you have to believe them, and you have to recite them back on the test. But when a student makes a suggestion in class, you don’t have to go along with it—you’re free to agree or disagree (it seems) and no one will punish you.

What might you try, rhetorically, in front of an audience? Hard to say... maybe:

  • “The power of science comes from having the ability to change our minds and admit we’re wrong. If you’ve never admitted you’re wrong, it doesn’t mean you’ve made fewer mistakes.”

  • “Anyone can say they’re absolutely certain. It’s a bit harder to never, ever make any mistakes. Scientists understand the difference, so they don’t say they’re absolutely certain. That’s all. It doesn’t mean that they have any specific reason to doubt a theory—absolutely every scrap of evidence can be going the same way, all the stars and planets lined up like dominos in support of a single hypothesis, and the scientists still won’t say they’re absolutely sure, because they’ve just got higher standards. It doesn’t mean scientists are less entitled to certainty than, say, the politicians who always seem so sure of everything.”

  • “Scientists don’t use the phrase ‘not absolutely certain’ the way you’re used to from regular conversation. I mean, suppose you went to the doctor, and got a blood test, and the doctor came back and said, ‘We ran some tests, and it’s not absolutely certain that you’re not made out of cheese, and there’s a non-zero chance that twenty fairies made out of sentient chocolate are singing the “I love you” song from Barney inside your lower intestine.’ Run for the hills, your doctor needs a doctor. When a scientist says the same thing, it means that they think the probability is so tiny that you couldn’t see it with an electron microscope, but the scientist is willing to see the evidence in the extremely unlikely event that you have it.”

  • “Would you be willing to change your mind about the things you call ‘certain’ if you saw enough evidence? I mean, suppose that God himself descended from the clouds and told you that your whole religion was true except for the Virgin Birth. If that would change your mind, you can’t say you’re absolutely certain of the Virgin Birth. For technical reasons of probability theory, if it’s theoretically possible for you to change your mind about something, it can’t have a probability exactly equal to one. The uncertainty might be smaller than a dust speck, but it has to be there. And if you wouldn’t change your mind even if God told you otherwise, then you have a problem with refusing to admit you’re wrong that transcends anything a mortal like me can say to you, I guess.”

Once you realize you don’t need probabilities of 1.0 to get along in life, you’ll realize how absolutely ridiculous it is to think you could ever get to 1.0 with a human brain. A probability of 1.0 isn’t just certainty, it’s infinite certainty. In fact, it seems to me that to prevent public misunderstanding, maybe scientists should go around saying “We are not INFINITELY certain” rather than “We are not certain.” For the latter case, in ordinary discourse, suggests you know some specific reason for doubt.

0 And 1 Are Not Probabilities

In the usual way of writing probabilities, probabilities are between 0 and 1. A coin might have a probability of 0.5 of coming up tails, or the weatherman might assign probability 0.9 to rain tomorrow. This isn’t the only way of writing probabilities, though. For example, you can transform probabilities into odds via the transformation O = (P∕(1 - P)). So a probability of 50% would go to odds of 0.5/0.5 or 1, usually written 1:1, while a probability of 0.9 would go to odds of 0.9/0.1 or 9, usually written 9:1. To take odds back to probabilities you use P = (O∕(1 + O)), and this is perfectly reversible, so the transformation is an isomorphism—a two-way reversible mapping. Thus, probabilities and odds are isomorphic, and you can use one or the other according to convenience.
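The transformation and its inverse can be sketched directly; the helper names `to_odds` and `to_prob` below are my own, for illustration:

```python
# O = P/(1 - P) maps probabilities to odds; P = O/(1 + O) maps them back.

def to_odds(p):
    """Probability in [0, 1) -> odds ratio in [0, infinity)."""
    return p / (1 - p)

def to_prob(o):
    """Odds ratio -> probability; the inverse of to_odds."""
    return o / (1 + o)

print(to_odds(0.5))   # 1.0, i.e. odds of 1:1
print(to_odds(0.9))   # ~9, i.e. odds of 9:1 (up to float rounding)

# The mapping is reversible for any p strictly below 1 -- an isomorphism:
for p in [0.0, 0.1, 0.5, 0.9, 0.999]:
    assert abs(to_prob(to_odds(p)) - p) < 1e-12

# The endpoints differ sharply: 0 maps to 0, but as p approaches 1 the
# odds grow without bound -- "certainty" sits at infinity, off the scale.
print(to_odds(0.999999))  # roughly a million to one
```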

For example, it’s more convenient to use odds when you’re doing Bayesian updates. Let’s say that I roll a six-sided die: If any face except 1 comes up, there’s a 10% chance of hearing a bell, but if the face 1 comes up, there’s a 20% chance of hearing the bell. Now I roll the die, and hear a bell. What are the odds that the face showing is 1? Well, the prior odds are 1:5 (corresponding to the real number 1/5 = 0.20) and the likelihood ratio is 0.2:0.1 (corresponding to the real number 2) and I can just multiply these two together to get the posterior odds 2:5 (corresponding to the real number 2/5 or 0.40). Then I convert back into a probability, if I like, and get (0.4∕1.4) = 2∕7 = ~29%. So odds are more manageable for Bayesian updates—if you use probabilities, you’ve got to deploy Bayes’s Theorem in its complicated version. But probabilities are more convenient for answering questions like “If I roll a six-sided die, what’s the chance of seeing a number from 1 to 4?” You can add up the probabilities of 1/6 for each side and get 4/6, but you can’t add up the odds ratios of 0.2 for each side and get an odds ratio of 0.8.
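The die-and-bell arithmetic above checks out numerically; a quick sketch (variable names mine):

```python
# Prior odds 1:5 that the face is 1; likelihood ratio 0.2:0.1 = 2 for
# hearing the bell; posterior odds 2:5; posterior probability 2/7.

prior_odds = 1 / 5
likelihood_ratio = 0.2 / 0.1                      # 2.0
posterior_odds = prior_odds * likelihood_ratio    # 2/5 = 0.4
posterior_prob = posterior_odds / (1 + posterior_odds)
print(round(posterior_prob, 4))                   # 0.2857, i.e. ~29%

# The additivity point: probabilities of exclusive outcomes add, odds don't.
prob_1_to_4 = 4 * (1 / 6)             # 4/6 -- correct
true_odds_1_to_4 = (4 / 6) / (2 / 6)  # 2.0 -- the actual odds of seeing 1-4
summed_odds = 4 * 0.2                 # 0.8 -- NOT the odds of seeing 1-4
print(true_odds_1_to_4, summed_odds)
```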

Why am I saying all this? To show that “odds ratios” are just as legitimate a way of mapping uncertainties onto real numbers as “probabilities.” Odds ratios are more convenient for some operations, probabilities are more convenient for others.

Why does it matter that odds ratios are just as legitimate as probabilities? Probabilities as ordinarily written are between 0 and 1, and both 0 and 1 look like they ought to be readily reachable quantities—it’s easy to see 1 zebra or 0 unicorns. But when you transform probabilities onto odds ratios, 0 goes to 0, but 1 goes to positive infinity. Now absolute truth doesn’t look like it should be so easy to reach.

I propose that it makes sense to say that 1 and 0 are not in the probabilities; just as negative and positive infinity, which do not obey the field axioms, are not in the real numbers.

Your Rationality Is My Business

What business is it of mine, if someone else chooses to believe what is pleasant rather than what is true? Can’t we each choose for ourselves whether to care about the truth?

I believe that it is right and proper for me, as a human being, to have an interest in the future, and what human civilization becomes in the future. One of those interests is the human pursuit of truth, which has strengthened slowly over the generations (for there was not always Science). I wish to strengthen that pursuit further, in this generation. That is a wish of mine, for the Future. For we are all of us players upon that vast gameboard, whether we accept the responsibility or not. And that makes your rationality my business.

Part F Politics and Rationality

Politics is the Mind-Killer

People go funny in the head when talking about politics. The evolutionary reasons for this are so obvious as to be worth belaboring: In the ancestral environment, politics was a matter of life and death. And sex, and wealth, and allies, and reputation... When, today, you get into an argument about whether “we” ought to raise the minimum wage, you’re executing adaptations for an ancestral environment where being on the wrong side of the argument could get you killed. Being on the right side of the argument could let you kill your hated rival!

Policy Debates Should Not Appear One-Sided

On questions of simple fact (for example, whether Earthly life arose by natural selection) there’s a legitimate expectation that the argument should be a one-sided battle; the facts themselves are either one way or another, and the so-called “balance of evidence” should reflect this. Indeed, under the Bayesian definition of evidence, “strong evidence” is just that sort of evidence which we only expect to find on one side of an argument. But there is no reason for complex actions with many consequences to exhibit this one-sidedness property. Why do people seem to want their policy debates to be one-sided?

Politics is the mind-killer. Arguments are soldiers. Once you know which side you’re on, you must support all arguments of that side, and attack all arguments that appear to favor the enemy side; otherwise it’s like stabbing your soldiers in the back. If you abide within that pattern, policy debates will also appear one-sided to you—the costs and drawbacks of your favored policy are enemy soldiers, to be attacked by any means necessary.

One should also be aware of a related failure pattern, thinking that the course of Deep Wisdom is to compromise with perfect evenness between whichever two policy positions receive the most airtime. A policy may legitimately have lopsided costs or benefits. If policy questions were not tilted one way or the other, we would be unable to make decisions about them. But there is also a human tendency to deny all costs of a favored policy, or deny all benefits of a disfavored policy; and people will therefore tend to think policy tradeoffs are tilted much further than they actually are.

Correspondence Bias

The correspondence bias is the tendency to draw inferences about a person’s unique and enduring dispositions from behaviors that can be entirely explained by the situations in which they occur.
— Gilbert and Malone

We tend to see far too direct a correspondence between others’ actions and personalities. When we see someone else kick a vending machine for no visible reason, we assume they are “an angry person.” But when you yourself kick the vending machine, it’s because the bus was late, the train was early, your report is overdue, and now the damned vending machine has eaten your lunch money for the second day in a row. Surely, you think to yourself, anyone would kick the vending machine, in that situation. We attribute our own actions to our situations, seeing our behaviors as perfectly normal responses to experience. But when someone else kicks a vending machine, we don’t see their past history trailing behind them in the air. We just see the kick, for no reason we know about, and we think this must be a naturally angry person—since they lashed out without any provocation.

The “fundamental attribution error” refers to our tendency to overattribute others’ behaviors to their dispositions, while reversing this tendency for ourselves. To understand why people act the way they do, we must first realize that everyone sees themselves as behaving normally. Don’t ask what strange, mutant disposition they were born with, which directly corresponds to their surface behavior. Rather, ask what situations people see themselves as being in. Yes, people do have dispositions—but there are not enough heritable quirks of disposition to directly account for all the surface behaviors you see.

Are Your Enemies Innately Evil?

When someone actually offends us—commits an action of which we (rightly or wrongly) disapprove—then, I observe, the correspondence bias redoubles. There seems to be a very strong tendency to blame evil deeds on the Enemy’s mutant, evil disposition. Not as a moral point, but as a strict question of prior probability, we should ask what the Enemy might believe about their situation that would reduce the seeming bizarrity of their behavior. This would allow us to hypothesize a less exceptional disposition, and thereby shoulder a lesser burden of improbability.

So let’s come right out and say it—the 9/11 hijackers weren’t evil mutants. They did not hate freedom. They, too, were the heroes of their own stories, and they died for what they believed was right—truth, justice, and the Islamic way. If the hijackers saw themselves that way, it doesn’t mean their beliefs were true. If the hijackers saw themselves that way, it doesn’t mean that we have to agree that what they did was justified. If the hijackers saw themselves that way, it doesn’t mean that the passengers of United Flight 93 should have stood aside and let it happen. It does mean that in another world, if they had been raised in a different environment, those hijackers might have been police officers. And that is indeed a tragedy. Welcome to Earth.

Reversed Stupidity Is Not Intelligence

As Robert Pirsig puts it, “The world’s greatest fool may say the Sun is shining, but that doesn’t make it dark out.”

If stupidity does not reliably anticorrelate with truth, how much less should human evil anticorrelate with truth? The converse of the halo effect is the horns effect: All perceived negative qualities correlate. If Stalin is evil, then everything he says should be false. You wouldn’t want to agree with Stalin, would you? Stalin also believed that 2 + 2 = 4. Yet if you defend any statement made by Stalin, even “2 + 2 = 4,” people will see only that you are “agreeing with Stalin”; you must be on his side.

Corollaries of this principle:

  • To argue against an idea honestly, you should argue against the best arguments of the strongest advocates. Arguing against weaker advocates proves nothing, because even the strongest idea will attract weak advocates. If you want to argue against transhumanism or the intelligence explosion, you have to directly challenge the arguments of Nick Bostrom or Eliezer Yudkowsky post-2003. The least convenient path is the only valid one.

  • Exhibiting sad, pathetic lunatics, driven to madness by their apprehension of an Idea, is no evidence against that Idea. Many New Agers have been made crazier by their personal apprehension of quantum mechanics.

  • Someone once said, “Not all conservatives are stupid, but most stupid people are conservatives.” If you cannot place yourself in a state of mind where this statement, true or false, seems completely irrelevant as a critique of conservatism, you are not ready to think rationally about politics.

  • Ad hominem argument is not valid.

  • You need to be able to argue against genocide without saying “Hitler wanted to exterminate the Jews.” If Hitler hadn’t advocated genocide, would it thereby become okay?

  • In Hansonian terms: Your instinctive willingness to believe something will change along with your willingness to affiliate with people who are known for believing it—quite apart from whether the belief is actually true. Some people may be reluctant to believe that God does not exist, not because there is evidence that God does exist, but rather because they are reluctant to affiliate with Richard Dawkins or those darned “strident” atheists who go around publicly saying “God does not exist.”

  • If your current computer stops working, you can’t conclude that everything about the current system is wrong and that you need a new system without an AMD processor, an ATI video card, a Maxtor hard drive, or case fans—even though your current system has all these things and it doesn’t work. Maybe you just need a new power cord.

  • If a hundred inventors fail to build flying machines using metal and wood and canvas, it doesn’t imply that what you really need is a flying machine of bone and flesh. If a thousand projects fail to build Artificial Intelligence using electricity-based computing, this doesn’t mean that electricity is the source of the problem. Until you understand the problem, hopeful reversals are exceedingly unlikely to hit the solution.

Hug the Query

In the art of rationality there is a discipline of closeness-to-the-issue—trying to observe evidence that is as near to the original question as possible, so that it screens off as many other arguments as possible. The Wright Brothers say, “My plane will fly.” If you look at their authority (bicycle mechanics who happen to be excellent amateur physicists) then you will compare their authority to, say, Lord Kelvin, and you will find that Lord Kelvin is the greater authority. If you demand to see the Wright Brothers’ calculations, and you can follow them, and you demand to see Lord Kelvin’s calculations (he probably doesn’t have any apart from his own incredulity), then authority becomes much less relevant. If you actually watch the plane fly, the calculations themselves become moot for many purposes, and Kelvin’s authority not even worth considering.

The more directly your arguments bear on a question, without intermediate inferences—the closer the observed nodes are to the queried node, in the Great Web of Causality—the more powerful the evidence. It’s a theorem of these causal graphs that you can never get more information from distant nodes, than from strictly closer nodes that screen off the distant ones. Jerry Cleaver said: “What does you in is not failure to apply some high-level, intricate, complicated technique. It’s overlooking the basics. Not keeping your eye on the ball.”
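The screening-off claim can be illustrated with a toy causal chain, enumerated exactly. This is a sketch with made-up probabilities; the chain X → Y → Z stands in for query node, near node, and distant node:

```python
from itertools import product

# X -> Y -> Z: X is the queried node, Y the near node, Z the distant one.
# All the probabilities below are illustrative.
p_x = {1: 0.3, 0: 0.7}
p_y_given_x = {1: {1: 0.9, 0: 0.1}, 0: {1: 0.2, 0: 0.8}}
p_z_given_y = {1: {1: 0.8, 0: 0.2}, 0: {1: 0.1, 0: 0.9}}

def joint(x, y, z):
    """Joint probability under the chain factorization."""
    return p_x[x] * p_y_given_x[x][y] * p_z_given_y[y][z]

def p_x1_given(y, z=None):
    """P(X=1 | Y=y), or P(X=1 | Y=y, Z=z) when z is supplied."""
    zs = (0, 1) if z is None else (z,)
    num = sum(joint(1, y, zv) for zv in zs)
    den = sum(joint(xv, y, zv) for xv in (0, 1) for zv in zs)
    return num / den

# Once you condition on the near node Y, the distant node Z adds nothing:
for y, z in product((0, 1), repeat=2):
    assert abs(p_x1_given(y, z) - p_x1_given(y)) < 1e-12
print("Y screens off Z from X")
```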

Whenever you can, dance as near to the original question as possible—press yourself up against it—get close enough to hug the query!

Rationality and the English Language

If you really want an artist’s perspective on rationality, then read Orwell; he is mandatory reading for rationalists as well as authors. Orwell was not a scientist, but a writer; his tools were not numbers, but words; his adversary was not Nature, but human evil. If you wish to imprison people for years without trial, you must think of some other way to say it than “I’m going to imprison Mr. Jennings for years without trial.” You must muddy the listener’s thinking, prevent clear images from outraging conscience. You say, “Unreliable elements were subjected to an alternative justice process.” Orwell was the outraged opponent of totalitarianism and the muddy thinking in which evil cloaks itself—which is how Orwell’s writings on language ended up as classic rationalist documents on a level with Feynman, Sagan, or Dawkins.

“Writers are told to avoid usage of the passive voice.” A rationalist whose background comes exclusively from science may fail to see the flaw in the previous sentence; but anyone who’s done a little writing should see it right away. I wrote the sentence in the passive voice, without telling you who tells authors to avoid passive voice. Passive voice removes the actor, leaving only the acted-upon. “Unreliable elements were subjected to an alternative justice process”—subjected by whom? What does an “alternative justice process” do? With enough static noun phrases, you can keep anything unpleasant from actually happening. Passive voice obscures reality.

Part G Against Rationalization

Knowing About Biases Can Hurt People

If you’re irrational to start with, having more knowledge can hurt you. For a true Bayesian, information would never have negative expected utility. But humans aren’t perfect Bayes-wielders; if we’re not careful, we can cut ourselves. I’ve seen people severely messed up by their own knowledge of biases. They have more ammunition with which to argue against anything they don’t like. And that problem—too much ready ammunition—is one of the primary ways that people with high mental agility end up stupid, in Stanovich’s “dysrationalia” sense of stupidity.

Update Yourself Incrementally

It’s okay if your cherished belief isn’t perfectly defended. If the hypothesis is that the coin comes up heads 95% of the time, then one time in twenty you will expect to see what looks like contrary evidence. This is okay. It’s normal. It’s even expected, so long as you’ve got nineteen supporting observations for every contrary one. A probabilistic model can take a hit or two, and still survive, so long as the hits don’t keep on coming in. Yet it is widely believed, especially in the court of public opinion, that a true theory can have no failures and a false theory no successes.

Rationality is not a walk, but a dance. On each step in that dance your foot should come down in exactly the correct spot, neither to the left nor to the right. Shifting belief upward with each iota of confirming evidence. Shifting belief downward with each iota of contrary evidence. Yes, down. Even with a correct model, if it is not an exact model, you will sometimes need to revise your belief down. If an iota or two of evidence happens to countersupport your belief, that’s okay. It happens, sometimes, with probabilistic evidence for non-exact theories. (If an exact theory fails, you are in trouble!) Just shift your belief downward a little—the probability, the odds ratio, or even a nonverbal weight of credence in your mind. Just shift downward a little, and wait for more evidence. If the theory is true, supporting evidence will come in shortly, and the probability will climb again. If the theory is false, you don’t really want it anyway.
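The dance can be sketched in log-odds, where each observation contributes a fixed additive step. A minimal illustration, assuming the 95%-heads theory is pitted against a fair-coin rival; the 19:1 observation sequence is mine, echoing the ratio above:

```python
import math

up = math.log(0.95 / 0.5)    # a confirming head: shift up (about +0.64)
down = math.log(0.05 / 0.5)  # a contrary tail: shift DOWN (about -2.30)

observations = [True] * 19 + [False]  # nineteen heads, then one tail
log_odds = 0.0                        # start at even odds between the theories
for heads in observations:
    log_odds += up if heads else down

# The single contrary observation drags belief down a little, but the net
# weight of evidence still favors the true theory:
print(log_odds > 0)  # True
```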

The problem with using black-and-white, binary, qualitative reasoning is that any single observation either destroys the theory or it does not. When not even a single contrary observation is allowed, it creates cognitive dissonance and has to be argued away. And this rules out incremental progress; it rules out correct integration of all the evidence. Reasoning probabilistically, we realize that on average, a correct theory will generate a greater weight of support than countersupport. And so you can, without fear, say to yourself: “This is gently contrary evidence, I will shift my belief downward.” Yes, down. It does not destroy your cherished theory. That is qualitative reasoning; think quantitatively.
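The arithmetic behind this dance is ordinary Bayesian updating: multiply your odds by the likelihood ratio of each new observation, up for confirming evidence, down for contrary evidence. A minimal sketch in Python, pitting the 95%-heads coin against a fair coin (the even prior and the particular flip sequence are assumptions chosen for illustration):

```python
def update(prior, lik_true, lik_alt):
    """One Bayesian update: posterior from the prior and the two likelihoods."""
    joint_true = prior * lik_true
    joint_alt = (1 - prior) * lik_alt
    return joint_true / (joint_true + joint_alt)

p = 0.5                       # even prior between the two hypotheses
history = [p]
for flip in "HHHHHTHHHH":     # nine heads, one tails
    if flip == "H":
        p = update(p, 0.95, 0.5)   # mild confirmation: shift belief up
    else:
        p = update(p, 0.05, 0.5)   # contrary evidence: shift down, not to zero
    history.append(p)

print([round(x, 3) for x in history])
```

The single tails knocks the probability down from about 0.96 to about 0.71; the four heads that follow bring it back up to about 0.97. The theory takes a hit and survives.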

One Argument Against An Army

When people encounter a contrary argument, they prevent themselves from downshifting their confidence by rehearsing already-known support. The problem, of course, is that by rehearsing arguments you already knew, you are double-counting the evidence. This would be a grave sin even if you double-counted all the evidence. (Imagine a scientist who does an experiment with 50 subjects and fails to obtain statistically significant results, so the scientist counts all the data twice.) But to selectively double-count only some evidence is sheer farce.

With the right kind of wrong reasoning, a handful of support—or even a single argument—can stand off an army of contradictions.
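In odds form, rehearsing old support is multiplying by the same likelihood ratio twice. A toy calculation, with all numbers invented: ten old arguments each favoring the theory 2:1, and one new contrary argument at 4:1 against.

```python
prior_odds = 1.0              # even prior odds (an assumption)
supporting = [2.0] * 10       # ten old arguments, each 2:1 for the theory
contrary = 0.25               # one new argument, 4:1 against

honest = prior_odds * contrary
for lr in supporting:
    honest *= lr              # each piece of evidence counted exactly once

rehearsed = honest
for lr in supporting:
    rehearsed *= lr           # the same support counted a second time

print(honest, rehearsed)
```

The honest total is 256:1 in favor; the rehearsed total is 2^10 times larger, and the single contrary argument is buried under evidence that was simply counted twice.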


Rationalization

“Rationalization.” What a curious term. I would call it a wrong word. You cannot “rationalize” what is not already rational. It is as if “lying” were called “truthization.”

Not every change is an improvement, but every improvement is necessarily a change. You cannot obtain more truth for a fixed proposition by arguing it; you can make more people believe it, but you cannot make it more true. To improve our beliefs, we must necessarily change our beliefs. Rationality is the operation that we use to obtain more accuracy for our beliefs by changing them. Rationalization operates to fix beliefs in place; it would be better named “anti-rationality,” both for its pragmatic results and for its reversed algorithm. “Rationality” is the forward flow that gathers evidence, weighs it, and outputs a conclusion.

“Rationalization” is a backward flow from conclusion to selected evidence. First you write down the bottom line, which is known and fixed; the purpose of your processing is to find out which arguments you should write down on the lines above. This, not the bottom line, is the variable unknown to the running process.

If you genuinely don’t know where you are going, you will probably feel quite curious about it. Curiosity is the first virtue, without which your questioning will be purposeless and your skills without direction. Feel the flow of the Force, and make sure it isn’t flowing backwards.

Avoiding Your Belief’s Real Weak Points

The reason that educated religious people stay religious, I suspect, is that when they doubt, they are subconsciously very careful to attack their own beliefs only at the strongest points—places where they know they can defend. Moreover, places where rehearsing the standard defense will feel strengthening.

When it comes to spontaneous self-questioning, one is much more likely to spontaneously self-attack strong points with comforting replies to rehearse, than to spontaneously self-attack the weakest, most vulnerable points. Similarly, one is likely to stop at the first reply and be comforted, rather than further criticizing the reply. A better title than “Avoiding Your Belief’s Real Weak Points” would be “Not Spontaneously Thinking About Your Belief’s Most Painful Weaknesses.”

To do better: When you’re doubting one of your most cherished beliefs, close your eyes, empty your mind, grit your teeth, and deliberately think about whatever hurts the most. Don’t rehearse standard objections whose standard counters would make you feel better. Ask yourself what smart people who disagree would say to your first reply, and your second reply. Whenever you catch yourself flinching away from an objection you fleetingly thought of, drag it out into the forefront of your mind. Punch yourself in the solar plexus. Stick a knife in your heart, and wiggle to widen the hole. In the face of the pain, rehearse only this:

What is true is already so.
Owning up to it doesn’t make it worse.
Not being open about it doesn’t make it go away.
And because it’s true, it is what is there to be interacted with.
Anything untrue isn’t there to be lived.
People can stand what is true,
for they are already enduring it.
— Eugene Gendlin

Motivated Stopping and Motivated Continuation

Gilovich’s distinction between motivated skepticism and motivated credulity highlights how conclusions a person does not want to believe are held to a higher standard than conclusions a person wants to believe. A motivated skeptic asks if the evidence compels them to accept the conclusion; a motivated credulist asks if the evidence allows them to accept the conclusion. I suggest that an analogous bias in psychologically realistic search is motivated stopping and motivated continuation: when we have a hidden motive for choosing the “best” current option, we have a hidden motive to stop, and choose, and reject consideration of any more options. When we have a hidden motive to reject the current best option, we have a hidden motive to suspend judgment pending additional evidence, to generate more options—to find something, anything, to do instead of coming to a conclusion.

The moral is that the decision to terminate a search procedure (temporarily or permanently) is, like the search procedure itself, subject to bias and hidden motives. You should suspect motivated stopping when you close off search, after coming to a comfortable conclusion, and yet there’s a lot of fast cheap evidence you haven’t gathered yet—there are websites you could visit, there are counter-counter arguments you could consider, or you haven’t closed your eyes for five minutes by the clock trying to think of a better option. You should suspect motivated continuation when some evidence is leaning in a way you don’t like, but you decide that more evidence is needed—expensive evidence that you know you can’t gather anytime soon, as opposed to something you’re going to look up on Google in thirty minutes—before you’ll have to do anything uncomfortable.

Part H Against Doublethink

Doublethink (Choosing to be Biased)

What if self-deception helps us be happy? What if just running out and overcoming bias will make us—gasp!—unhappy? Surely, true wisdom would be second-order rationality, choosing when to be rational. That way you can decide which cognitive biases should govern you, to maximize your happiness. Leaving the morality aside, I doubt such a lunatic dislocation in the mind could really happen.

For second-order rationality to be genuinely rational, you would first need a good model of reality, to extrapolate the consequences of rationality and irrationality. If you then chose to be first-order irrational, you would need to forget this accurate view. And then forget the act of forgetting.

You can’t know the consequences of being biased, until you have already debiased yourself. And then it is too late for self-deception. The other alternative is to choose blindly to remain biased, without any clear idea of the consequences. This is not second-order rationality. It is willful stupidity.

Part I Seeing with Fresh Eyes

Anchoring and Adjustment

Subjects take the initial, uninformative number as their starting point or anchor; and then they adjust upward or downward from their starting estimate until they reach an answer that “sounds plausible”; and then they stop adjusting. This typically results in under-adjustment from the anchor—more distant numbers could also be “plausible,” but one stops at the first satisfying-sounding answer.

There are obvious applications in, say, salary negotiations, or buying a car. I won’t suggest that you exploit it, but watch out for exploiters. And watch yourself thinking, and try to notice when you are adjusting a figure in search of an estimate. Debiasing manipulations for anchoring have generally proved not very effective. I would suggest these two: First, if the initial guess sounds implausible, try to throw it away entirely and come up with a new estimate, rather than sliding from the anchor. But this in itself may not be sufficient—subjects instructed to avoid anchoring still seem to do so. So, second, even if you are trying the first method, try also to think of an anchor in the opposite direction—an anchor that is clearly too small or too large, instead of too large or too small—and dwell on it briefly.
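The adjust-until-plausible mechanism can be caricatured as a procedure, which makes the under-adjustment visible. This is a toy model of the description above, not a reconstruction of any experiment; the anchors, the true estimate, and the "plausible range" are all invented numbers.

```python
def anchored_estimate(anchor, plausible_low, plausible_high, step=1.0):
    """Start at the anchor; adjust toward the plausible range; stop at its edge."""
    x = float(anchor)
    direction = 1 if anchor < plausible_low else -1
    while not (plausible_low <= x <= plausible_high):
        x += direction * step
    return x

# An unanchored reasoner would say 50; values from 40 to 60 "sound plausible".
low_anchor = anchored_estimate(10, 40, 60)
high_anchor = anchored_estimate(90, 40, 60)
print(low_anchor, high_anchor)
```

Both runs stop at the nearest edge of the plausible range, 40 from below and 60 from above, even though the underlying belief is 50 in both cases; the leftover gap is the under-adjustment.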

Priming and Contamination

Suppose you ask subjects to press one button if a string of letters forms a word, and another button if the string does not form a word (e.g., “banack” vs. “banner”). Then you show them the string “water.” Later, they will more quickly identify the string “drink” as a word. This is known as “cognitive priming”; this particular form would be “semantic priming” or “conceptual priming.” The fascinating thing about priming is that it occurs at such a low level—priming speeds up identifying letters as forming a word, which one would expect to take place before you deliberate on the word’s meaning.

Priming is subconscious and unstoppable, an artifact of the human neural architecture. Trying to stop yourself from priming is like trying to stop the spreading activation of your own neural circuits.

The more general result is that completely uninformative, known false, or totally irrelevant “information” can influence estimates and decisions. In the field of heuristics and biases, this more general phenomenon is known as contamination.

Cached Thoughts

It’s a good guess that the actual majority of human cognition consists of cache lookups. In modern civilization particularly, no one can think fast enough to think their own thoughts. If I’d been abandoned in the woods as an infant, raised by wolves or silent robots, I would scarcely be recognizable as human. No one can think fast enough to recapitulate the wisdom of a hunter-gatherer tribe in one lifetime, starting from scratch. As for the wisdom of a literate civilization, forget it. But the flip side of this is that I continually see people who aspire to critical thinking, repeating back cached thoughts which were not invented by critical thinkers.
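The cache-lookup analogy is borrowed directly from computing, where memoization answers a repeated question from storage instead of recomputing it. A minimal Python illustration of the analogy only; nothing here models a brain:

```python
from functools import lru_cache

calls = 0

@lru_cache(maxsize=None)
def answer(question):
    """The slow path, analogous to actually thinking a question through."""
    global calls
    calls += 1
    return f"considered reply to {question!r}"

answer("What is the meaning of death?")
answer("What is the meaning of death?")   # served from the cache
print(calls)                              # the slow path ran only once
```

The second call returns instantly with the first call's answer, which is exactly the convenience, and exactly the danger: the stored reply comes back whether or not it was any good.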

What patterns are being completed, inside your mind, that you never chose to be there? It can be hard to see with fresh eyes. Try to keep your mind from completing the pattern in the standard, unsurprising, already-known way. It may be that there is no better answer than the standard one, but you can’t think about the answer until you can stop your brain from filling in the answer automatically. Now that you’ve read this, the next time you hear someone unhesitatingly repeating a meme you think is silly or false, you’ll think, “Cached thoughts.” My belief is now there in your mind, waiting to complete the pattern. But is it true? Don’t let your mind complete the pattern! Think!

The Virtue of Narrowness

Within their own professions, people grasp the importance of narrowness; a car mechanic knows the difference between a carburetor and a radiator, and would not think of them both as “car parts.” A hunter-gatherer knows the difference between a lion and a panther. A janitor does not wipe the floor with window cleaner, even if the bottles look similar to one who has not mastered the art. Outside their own professions, people often commit the misstep of trying to broaden a word as widely as possible, to cover as much territory as possible. Is it not more glorious, more wise, more impressive, to talk about all the apples in the world? How much loftier it must be to explain human thought in general, without being distracted by smaller questions, such as how humans invent techniques for solving a Rubik’s Cube. Indeed, it scarcely seems necessary to consider specific questions at all; isn’t a general theory a worthy enough accomplishment on its own?

And what could be more virtuous than seeing connections? Surely the wisest of all human beings are the New Age gurus who say, “Everything is connected to everything else.” If you ever say this aloud, you should pause, so that everyone can absorb the sheer shock of this Deep Wisdom. There is a trivial mapping between a graph and its complement. A fully connected graph, with an edge between every two vertices, conveys the same amount of information as a graph with no edges at all. The important graphs are the ones where some things are not connected to some other things. When the unenlightened ones try to be profound, they draw endless verbal comparisons between this topic, and that topic, which is like this, which is like that; until their graph is fully connected and also totally useless. The remedy is specific knowledge and in-depth study. When you understand things in detail, you can see how they are not alike, and start enthusiastically subtracting edges off your graph. Likewise, the important categories are the ones that do not contain everything in the universe. Good hypotheses can only explain some possible outcomes, and not others.
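The claim about graphs can be checked by counting. On six vertices there is exactly one empty graph and one complete graph, so specifying either conveys zero bits, while the number of distinct graphs, and hence the information an edge set can carry, peaks when about half of the possible edges are present:

```python
from math import comb, log2

n = 6
pairs = comb(n, 2)            # 15 possible edges on 6 vertices

for k in [0, 1, pairs // 2, pairs - 1, pairs]:
    count = comb(pairs, k)    # number of distinct edge sets with exactly k edges
    bits = log2(count)        # information needed to single one out
    print(k, count, round(bits, 2))
```

With zero or all fifteen edges there is exactly one graph and nothing to say; with seven edges there are 6,435 possibilities, nearly thirteen bits' worth of distinctions.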

There’s nothing wrong with focusing your mind, narrowing your categories, excluding possibilities, and sharpening your propositions. Really, there isn’t! If you make your words too broad, you end up with something that isn’t true and doesn’t even make good poetry.

How to Seem (and Be) Deep

There’s a stereotype of Deep Wisdom. Death. Complete the pattern: “Death gives meaning to life.” Everyone knows this standard Deeply Wise response. And so it takes on some of the characteristics of an applause light. If you say it, people may nod along, because the brain completes the pattern and they know they’re supposed to nod. They may even say “What deep wisdom!”, perhaps in the hope of being thought deep themselves. But they will not be surprised; they will not have heard anything outside the box; they will not have heard anything they could not have thought of for themselves. One might call it belief in wisdom—the thought is labeled “deeply wise,” and it’s the completed standard pattern for “deep wisdom,” but it carries no experience of insight. People who try to seem Deeply Wise often end up seeming hollow, echoing as it were, because they’re trying to seem Deeply Wise instead of optimizing.

I suspect this is one reason Eastern philosophy seems deep to Westerners—it has a nonstandard but coherent cache for Deep Wisdom. Symmetrically, in works of Japanese fiction, one sometimes finds Christians depicted as repositories of deep wisdom and/or mystical secrets.

To seem deep, study nonstandard philosophies. Seek out discussions on topics that will give you a chance to appear deep. Do your philosophical thinking in advance, so you can concentrate on explaining well. Above all, practice staying within the one-inferential-step bound. To be deep, think for yourself about “wise” or important or emotionally fraught topics. Thinking for yourself isn’t the same as coming up with an unusual answer. It does mean seeing for yourself, rather than letting your brain complete the pattern. If you don’t stop at the first answer, and cast out replies that seem vaguely unsatisfactory, in time your thoughts will form a coherent whole, flowing from the single source of yourself, rather than being fragmentary repetitions of other people’s conclusions.

We Change Our Minds Less Often Than We Think

We change our minds less often than we think. And most of the time we become able to guess what our answer will be within half a second of hearing the question. How swiftly that unnoticed moment passes, when we can’t yet guess what our answer will be; the tiny window of opportunity for intelligence to act. In questions of choice, as in questions of fact. The principle of the bottom line is that only the actual causes of your beliefs determine your effectiveness as a rationalist. Once your belief is fixed, no amount of argument will alter the truth-value; once your decision is fixed, no amount of argument will alter the consequences.

You might think that you could arrive at a belief, or a decision, by non-rational means, and then try to justify it, and if you found you couldn’t justify it, reject it. But we change our minds less often—much less often—than we think. I’m sure that you can think of at least one occasion in your life when you’ve changed your mind. We all can. How about all the occasions in your life when you didn’t change your mind? Are they as available, in your heuristic estimate of your competence? Between hindsight bias, fake causality, positive bias, anchoring/priming, et cetera, et cetera, and above all the dreaded confirmation bias, once an idea gets into your head, it’s probably going to stay there.

Hold Off On Proposing Solutions

Once an idea gets into your head, it will probably require way too much evidence to get it out again. I suspect that a more powerful (and more difficult) method is to hold off on thinking of an answer. To suspend, draw out, that tiny moment when we can’t yet guess what our answer will be; thus giving our intelligence a longer time in which to act.

Part J Death Spirals

The Affect Heuristic

The affect heuristic is when subjective impressions of goodness/badness act as a heuristic—a source of fast, perceptual judgments. Pleasant and unpleasant feelings are central to human reasoning, and the affect heuristic comes with lovely biases—some of my favorites.

Finucane et al. found that for nuclear reactors, natural gas, and food preservatives, presenting information about high benefits made people perceive lower risks; presenting information about high risks made people perceive lower benefits; and so on across the quadrants. People conflate their judgments about particular good/bad aspects of something into an overall good or bad feeling about that thing. Finucane et al. also found that time pressure greatly increased the inverse relationship between perceived risk and perceived benefit, consistent with the general finding that time pressure, poor information, or distraction all increase the dominance of perceptual heuristics over analytic deliberation.

The Halo Effect

The affect heuristic is how an overall feeling of goodness or badness contributes to many other judgments, whether it’s logical or not, whether you’re aware of it or not. Subjects told about the benefits of nuclear power are likely to rate it as having fewer risks; stock analysts rating unfamiliar stocks judge them as generally good or generally bad—low risk and high returns, or high risk and low returns—in defiance of ordinary economic theory, which says that risk and return should correlate positively.

The halo effect is the manifestation of the affect heuristic in social psychology. Robert Cialdini, in Influence: Science and Practice, summarizes: Research has shown that we automatically assign to good-looking individuals such favorable traits as talent, kindness, honesty, and intelligence. Furthermore, we make these judgments without being aware that physical attractiveness plays a role in the process.

These studies on the halo effect of attractiveness should make us suspicious that there may be a similar halo effect for kindness, or intelligence. Let’s say that you know someone who not only seems very intelligent, but also honest, altruistic, kindly, and serene. You should be suspicious that some of these perceived characteristics are influencing your perception of the others. Maybe the person is genuinely intelligent, honest, and altruistic, but not all that kindly or serene. You should be suspicious if the people you know seem to separate too cleanly into devils and angels.

Affective Death Spirals

Many, many, many are the flaws in human reasoning which lead us to overestimate how well our beloved theory explains the facts. The phlogiston theory of chemistry could explain just about anything, so long as it didn’t have to predict it in advance. And the more phenomena you use your favored theory to explain, the truer your favored theory seems—has it not been confirmed by these many observations? As the theory seems truer, you will be more likely to question evidence that conflicts with it. As the favored theory seems more general, you will seek to use it in more explanations.

This positive feedback cycle of credulity and confirmation is indeed fearsome, and responsible for much error, both in science and in everyday life. But it’s nothing compared to the death spiral that begins with a charge of positive affect—a thought that feels really good. A new political system that can save the world. A great leader, strong and noble and wise. An amazing tonic that can cure upset stomachs and cancer.

When the Great Thingy feels good enough to make you seek out new opportunities to feel even better about the Great Thingy, applying it to interpret new events every day, the resonance of positive affect is like a chamber full of mousetraps loaded with ping-pong balls. You could call it a “happy attractor,” “overly positive feedback,” a “praise locked loop,” or “funpaper.” Personally I prefer the term “affective death spiral.”

Resist the Happy Death Spiral

You avoid a Happy Death Spiral by:

  • Splitting the Great Idea into parts;

  • Treating every additional detail as burdensome;

  • Thinking about the specifics of the causal chain instead of the good or bad feelings;

  • Not rehearsing evidence; and

  • Not adding happiness from claims that “you can’t prove are wrong”;

but not by:

  • Refusing to admire anything too much;

  • Conducting a biased search for negative points until you feel unhappy again;

  • Forcibly shoving an idea into a safe box.

Every Cause Wants to Be a Cult

The ingroup-outgroup dichotomy is part of ordinary human nature. So are happy death spirals and spirals of hate. A Noble Cause doesn’t need a deep hidden flaw for its adherents to form a cultish in-group. It is sufficient that the adherents be human. Everything else follows naturally, decay by default, like food spoiling in a refrigerator after the electricity goes off. In the same sense that every thermal differential wants to equalize itself, and every computer program wants to become a collection of ad-hoc patches, every Cause wants to be a cult. It’s a high-entropy state into which the system trends, an attractor in human psychology. It may have nothing to do with whether the Cause is truly Noble.

Every group of people with an unusual goal—good, bad, or silly—will trend toward the cult attractor unless they make a constant effort to resist it. You can keep your house cooler than the outdoors, but you have to run the air conditioner constantly, and as soon as you turn off the electricity—give up the fight against entropy—things will go back to “normal.”

Guardians of the Truth

The criticism is sometimes leveled against rationalists: “The Inquisition thought they had the truth! Clearly this ‘truth’ business is dangerous.”

The Inquisition believed that there was such a thing as truth, and that it was important; well, likewise Richard Feynman. But the Inquisitors were not Truth-Seekers. They were Truth-Guardians.

The perfect age of the past, according to our best anthropological evidence, never existed. But a culture that sees life running inexorably downward is very different from a culture in which you can reach unprecedented heights. (I say “culture,” and not “society,” because you can have more than one subculture in a society.) You could say that the difference between e.g. Richard Feynman and the Inquisition was that the Inquisition believed they had truth, while Richard Feynman sought truth.

I don’t mean to provide a grand overarching single-factor view of history. I do mean to point out a deep psychological difference between seeing your grand cause in life as protecting, guarding, preserving, versus discovering, creating, improving. Does the “up” direction of time point to the past or the future? It’s a distinction that shades everything, casts tendrils everywhere.

I would also argue that this basic psychological difference is one of the reasons why an academic field that stops making active progress tends to turn mean. When there’s not many discoveries being made, there’s nothing left to do all day but witch-hunt the heretics. To get the best mental health benefits of the discover/create/improve posture, you’ve got to actually be making progress, not just hoping for it.

On Expressing Your Concerns

The scary thing about Asch’s conformity experiments is that you can get many people to say black is white, if you put them in a room full of other people saying the same thing. The hopeful thing about Asch’s conformity experiments is that a single dissenter tremendously drove down the rate of conformity, even if the dissenter was only giving a different wrong answer. And the wearisome thing is that dissent was not learned over the course of the experiment—when the single dissenter started siding with the group, rates of conformity rose back up. Being a voice of dissent can bring real benefits to the group. But it also (famously) has a cost. And then you have to keep it up. Plus you could be wrong.

The most fearsome possibility raised by Asch’s experiments on conformity is the specter of everyone agreeing with the group, swayed by the confident voices of others, careful not to let their own doubts show—not realizing that others are suppressing similar worries. This is known as “pluralistic ignorance.”

I think the most important lesson to take away from Asch’s experiments is to distinguish “expressing concern” from “disagreement.” Raising a point that others haven’t voiced is not a promise to disagree with the group at the end of its discussion.

Unfortunately, there’s not much difference socially between “expressing concerns” and “disagreement.” A group of rationalists might agree to pretend there’s a difference, but it’s not how human beings are really wired. Once you speak out, you’ve committed a socially irrevocable act; you’ve become the nail sticking up, the discord in the comfortable group harmony, and you can’t undo that. Anyone insulted by a concern you expressed about their competence to successfully complete task XYZ, will probably hold just as much of a grudge afterward if you say “No problem, I’ll go along with the group” at the end.

Lonely Dissent

Asch’s conformity experiment showed that the presence of a single dissenter tremendously reduced the incidence of “conforming” wrong answers. Individualism is easy, experiment shows, when you have company in your defiance. Every other subject in the room, except one, says that black is white. You become the second person to say that black is black. And it feels glorious: the two of you, lonely and defiant rebels, against the world! (Followup interviews showed that subjects in the one-dissenter condition expressed strong feelings of camaraderie with the dissenter—though, of course, they didn’t think the presence of the dissenter had influenced their own nonconformity.)

But you can only join the rebellion, after someone, somewhere, becomes the first to rebel. Someone has to say that black is black after hearing everyone else, one after the other, say that black is white. And that—experiment shows—is a lot harder. Lonely dissent doesn’t feel like going to school dressed in black. It feels like going to school wearing a clown suit. That’s the difference between joining the rebellion and leaving the pack.

It really isn’t necessary to be different for the sake of being different. If you do things differently only when you see an overwhelmingly good reason, you will have more than enough trouble to last you the rest of your life.

Part K Letting Go

The Importance of Saying “Oops”

There is a powerful advantage to admitting you have made a large mistake. It’s painful. It can also change your whole life. It is important to have the watershed moment, the moment of humbling realization. To acknowledge a fundamental problem, not divide it into palatable bite-size mistakes.

Until you admit you were wrong, you cannot get on with your life; your self-image will still be bound to the old mistake.

Human beings make mistakes, and not all of them are disguised successes. Human beings make mistakes; it happens, that’s all. Say “oops,” and get on with your life.

The Proper Use of Doubt

We all want to be seen as rational—and doubting is widely believed to be a virtue of a rationalist. But it is not widely understood that you need a particular reason to doubt, or that an unresolved doubt is a null-op. Instead people think it’s about modesty, a submissive demeanor, maintaining the tribal status hierarchy—almost exactly the same problem as with humility, on which I have previously written. Making a great public display of doubt to convince yourself that you are a rationalist will do around as much good as wearing a lab coat. To avoid professing doubts, remember:

  • A rational doubt exists to destroy its target belief, and if it does not destroy its target it dies unfulfilled.

  • A rational doubt arises from some specific reason the belief might be wrong.

  • An unresolved doubt is a null-op.

  • An uninvestigated doubt might as well not exist.

  • You should not be proud of mere doubting, although you can justly be proud when you have just finished tearing a cherished belief to shreds.

  • Though it may take courage to face your doubts, never forget that to an ideal mind doubt would not be scary in the first place.

Leave a Line of Retreat

The principle behind the technique is simple: as Sun Tzu advises you to do with your enemies, you must do with yourself—leave yourself a line of retreat, so that you will have less trouble retreating. The prospect of losing your job, say, may seem a lot more scary when you can’t even bear to think about it, than after you have calculated exactly how long your savings will last, and checked the job market in your area, and otherwise planned out exactly what to do next. Only then will you be ready to fairly assess the probability of keeping your job in the planned layoffs next month. Be a true coward, and plan out your retreat in detail—visualize every step—preferably before you first come to the battlefield. The hope is that it takes less courage to visualize an uncomfortable state of affairs as a thought experiment, than to consider how likely it is to be true. But then after you do the former, it becomes easier to do the latter.

Remember that Bayesianism is precise—even if a scary proposition really should seem unlikely, it’s still important to count up all the evidence, for and against, exactly fairly, to arrive at the rational quantitative probability. Visualizing a scary belief does not mean admitting that you think, deep down, it’s probably true. You can visualize a scary belief on general principles of good mental housekeeping. “The thought you cannot think controls you more than thoughts you speak aloud”—this happens even if the unthinkable thought is false! The leave-a-line-of-retreat technique does require a certain minimum of self-honesty to use correctly. For a start: You must at least be able to admit to yourself which ideas scare you, and which ideas you are attached to. But this is a substantially less difficult test than fairly counting the evidence for an idea that scares you.

Crisis of Faith

It ain’t a true crisis of faith unless things could just as easily go either way.
— Thor Shenkel

Not every doubt calls for staging an all-out Crisis of Faith. But you should consider it when:

  • A belief has long remained in your mind;

  • It is surrounded by a cloud of known arguments and refutations;

  • You have sunk costs in it (time, money, public declarations);

  • The belief has emotional consequences (note this does not make it wrong);

  • It has gotten mixed up in your personality generally.

None of these warning signs are immediate disproofs. These attributes place a belief at risk for all sorts of dangers, and make it very hard to reject when it is wrong. But they also hold for Richard Dawkins’s belief in evolutionary biology, and for the Pope’s Catholicism. This does not say that we are only talking about different flavors of ice cream. Only the unenlightened think that all deeply-held beliefs are on the same level regardless of the evidence supporting them, just because they are deeply held. The point is not to have shallow beliefs, but to have a map which reflects the territory. I emphasize this, of course, so that you can admit to yourself, “My belief has these warning signs,” without having to say to yourself, “My belief is false.” But what these warning signs do mark, is a belief that will take more than an ordinary effort to doubt effectively. So that if it were in fact false, you would in fact reject it. And where you cannot doubt effectively, you are blind, because your brain will hold the belief unconditionally.

Book III The Machine in the Ghost

Part L The Simple Math of Evolution

To sum up, if you have all of the following properties:

  • Entities that replicate;

  • Substantial variation in their characteristics;

  • Substantial variation in their reproduction;

  • Persistent correlation between the characteristics and reproduction;

  • High-fidelity long-range heritability in characteristics;

  • Frequent birth of a significant fraction of the breeding population;

  • And all this remains true through many iterations...

Then you will have significant cumulative selection pressures, enough to produce complex adaptations by the force of evolution.
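The conditions above can be illustrated with a toy simulation, a minimal Python sketch in which every quantity (population size, genome length, mutation rate, number of generations) is invented purely for illustration:

```python
import random

def evolve(pop_size=100, genome_len=50, mutation_rate=0.01, generations=200):
    """Toy cumulative selection: the 'characteristic' is the number of 1-bits."""
    random.seed(0)  # deterministic for illustration
    # Entities that replicate, with substantial variation in characteristics:
    pop = [[random.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Persistent correlation between characteristic and reproduction:
        pop.sort(key=sum, reverse=True)
        parents = pop[:pop_size // 2]   # substantial variation in reproduction
        pop = []
        for parent in parents:
            for _ in range(2):          # frequent birth, through many iterations
                # High-fidelity (but not perfect) heritability:
                child = [bit ^ (random.random() < mutation_rate)
                         for bit in parent]
                pop.append(child)
    return max(sum(genome) for genome in pop)

best = evolve()
# A random 50-bit string averages 25 one-bits; cumulative selection pushes
# the best genome far above what any single round of chance would produce.
```

Remove any one of the listed properties, for instance by raising `mutation_rate` until heritability fails, and the cumulative pressure disappears.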

Adaptation-Executers, Not Fitness-Maximizers

Fifty thousand years ago, the taste buds of Homo sapiens directed their bearers to the scarcest, most critical food resources—sugar and fat. Calories, in a word. Today, the context of a taste bud’s function has changed, but the taste buds themselves have not. Calories, far from being scarce (in First World countries), are actively harmful. Micronutrients that were reliably abundant in leaves and nuts are absent from bread, but our taste buds don’t complain. A scoop of ice cream is a superstimulus, containing more sugar, fat, and salt than anything in the ancestral environment. No human being with the deliberate goal of maximizing their alleles’ inclusive genetic fitness would ever eat a cookie unless they were starving. But individual organisms are best thought of as adaptation-executers, not fitness-maximizers.

Smushing several of the concepts together, you could sort-of-say, “Modern humans do today what would have propagated our genes in a hunter-gatherer society, whether or not it helps our genes in a modern society.” But this still isn’t quite right, because we’re not actually asking ourselves which behaviors would maximize our ancestors’ inclusive fitness. And many of our activities today have no ancestral analogue. In the hunter-gatherer society there wasn’t any such thing as chocolate. So it’s better to view our taste buds as an adaptation fitted to ancestral conditions that included near-starvation and apples and roast rabbit, which modern humans execute in a new context that includes cheap chocolate and constant bombardment by advertisements. Therefore it is said: Individual organisms are best thought of as adaptation-executers, not fitness-maximizers.

Part M Fragile Purposes

Optimization and the Intelligence Explosion

Among the topics I haven’t delved into here is the notion of an optimization process. Roughly, this is the idea that your power as a mind is your ability to hit small targets in a large search space—this can be either the space of possible futures (planning) or the space of possible designs (invention).

Suppose you have a car, and suppose we already know that your preferences involve travel. Now suppose that you take all the parts in the car, or all the atoms, and jumble them up at random. It’s very unlikely that you’ll end up with a travel-artifact at all, even so much as a wheeled cart; let alone a travel-artifact that ranks as high in your preferences as the original car. So, relative to your preference ordering, the car is an extremely improbable artifact. The power of an optimization process is that it can produce this kind of improbability.

You can view both intelligence and natural selection as special cases of optimization: processes that hit, in a large search space, very small targets defined by implicit preferences. Natural selection prefers more efficient replicators. Human intelligences have more complex preferences. Neither evolution nor humans have consistent utility functions, so viewing them as “optimization processes” is understood to be an approximation. You’re trying to get at the sort of work being done, not claim that humans or evolution do this work perfectly. This is how I see the story of life and intelligence—as a story of improbably good designs being produced by optimization processes. The “improbability” here is improbability relative to a random selection from the design space, not improbability in an absolute sense—if you have an optimization process around, then “improbably” good designs become probable.

Very recently, certain animal brains have begun to exhibit both generality of optimization power (producing an amazingly wide range of artifacts, in time scales too short for natural selection to play any significant role) and cumulative optimization power (artifacts of increasing complexity, as a result of skills passed on through language and writing). Natural selection takes hundreds of generations to do anything and millions of years for de novo complex designs. Human programmers can design a complex machine with a hundred interdependent elements in a single afternoon. This is not surprising, since natural selection is an accidental optimization process that basically just started happening one day, whereas humans are optimized optimizers handcrafted by natural selection over millions of years.

We have meta-level inventions like science, that try to instruct humans in how to think. But the first person to invent Bayes’s Theorem did not become a Bayesian; they could not rewrite themselves, lacking both that knowledge and that power. Our significant innovations in the art of thinking, like writing and science, are so powerful that they structure the course of human history; but they do not rival the brain itself in complexity, and their effect upon the brain is comparatively shallow.

Now... some of us want to intelligently design an intelligence that would be capable of intelligently redesigning itself, right down to the level of machine code. The machine code at first, and the laws of physics later, would be a protected level of a sort. But that “protected level” would not contain the dynamic of optimization; the protected levels would not structure the work. The human brain does quite a bit of optimization on its own, and screws up on its own, no matter what you try to tell it in school. But this fully wraparound recursive optimizer would have no protected level that was optimizing. All the structure of optimization would be subject to optimization itself. And that is a sea change which breaks with the entire past since the first replicator, because it breaks the idiom of a protected meta level.

Terminal Values and Instrumental Values

I rarely notice people losing track of plans they devised themselves. People usually don’t drive to the supermarket if they know the chocolate is gone. But I’ve also noticed that when people begin explicitly talking about goal systems instead of just wanting things, mentioning “goals” instead of using them, they oft become confused. In particular, I’ve noticed people get confused when—in abstract philosophical discussions rather than everyday life—they consider the distinction between means and ends; more formally, between “instrumental values” and “terminal values.”

Part of the problem, it seems to me, is that the human mind uses a rather ad-hoc system to keep track of its goals—it works, but not cleanly. English doesn’t embody a sharp distinction between means and ends: “I want to save my sister’s life” and “I want to administer penicillin to my sister” use the same word “want.” Can we describe, in mere English, the distinction that is getting lost?

As a first stab: “Instrumental values” are desirable strictly conditional on their anticipated consequences. “I want to administer penicillin to my sister,” not because a penicillin-filled sister is an intrinsic good, but in anticipation of penicillin curing her flesh-eating pneumonia. If instead you anticipated that injecting penicillin would melt your sister into a puddle like the Wicked Witch of the West, you’d fight just as hard to keep her penicillin-free. “Terminal values” are desirable without conditioning on other consequences: “I want to save my sister’s life” has nothing to do with your anticipating whether she’ll get injected with penicillin after that.
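The distinction can be sketched in code (the probabilities and utilities below are invented for illustration): terminal values assign utility to outcomes directly, while an action’s instrumental value is computed from the agent’s anticipated consequences, and so reverses when the anticipations reverse.

```python
def instrumental_value(action, terminal_utility, anticipated_consequences):
    """Expected terminal utility of an action under the agent's world-model.

    terminal_utility: utility assigned to outcomes directly, unconditionally.
    anticipated_consequences: P(outcome | action) according to the agent.
    """
    return sum(p * terminal_utility[outcome]
               for outcome, p in anticipated_consequences[action].items())

# Terminal value: a living sister is good, not conditional on anything else.
utility = {"sister alive": 100.0, "sister dead": 0.0}

# If you anticipate that penicillin cures her, injecting it has high
# instrumental value:
cure_model = {"inject":   {"sister alive": 0.9, "sister dead": 0.1},
              "withhold": {"sister alive": 0.1, "sister dead": 0.9}}

# If instead you anticipated that penicillin would melt her, the very same
# action's instrumental value reverses, while the terminal value is unchanged:
melt_model = {"inject":   {"sister alive": 0.0, "sister dead": 1.0},
              "withhold": {"sister alive": 0.5, "sister dead": 0.5}}
```

Nothing about the sister’s life changed between the two models; only the anticipated consequences of the action did.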

In moral arguments, some disputes are about instrumental consequences, and some disputes are about terminal values. If your debating opponent says that banning guns will lead to lower crime, and you say that banning guns will lead to higher crime, then you agree about a superior instrumental value (crime is bad), but you disagree about which intermediate events lead to which consequences. This important distinction often gets flushed down the toilet in angry arguments. People with factual disagreements and shared values each decide that their debating opponents must be sociopaths. As if your hated enemy, gun control/rights advocates, really wanted to kill people, which should be implausible as realistic psychology.

I fear the human brain does not strongly type the distinction between terminal moral beliefs and instrumental moral beliefs. “We should ban guns” and “We should save lives” don’t feel different, as moral beliefs, the way that sight feels different from sound. Despite all the other ways that the human goal system complicates everything in sight, this one distinction it manages to collapse into a mishmash of things-with-conditional-value.

Part N A Human’s Guide to Words

Extensions and Intensions

“What is red?” “Red is a color.” “What’s a color?” “A color is a property of a thing.” But what is a thing? And what’s a property? Soon the two are lost in a maze of words defined in other words, the problem that Steven Harnad once described as trying to learn Chinese from a Chinese/Chinese dictionary. Alternatively, if you asked me “What is red?” I could point to a stop sign, then to someone wearing a red shirt, and a traffic light that happens to be red, and blood from where I accidentally cut myself, and a red business card, and then I could call up a color wheel on my computer and move the cursor to the red area.

To give an “intensional definition” is to define a word or phrase in terms of other words, as a dictionary does. To give an “extensional definition” is to point to examples, as adults do when teaching children. The preceding sentence gives an intensional definition of “extensional definition,” which makes it an extensional example of “intensional definition.” In Hollywood Rationality and popular culture generally, “rationalists” are depicted as word-obsessed, floating in endless verbal space disconnected from reality. But the actual Traditional Rationalists have long insisted on maintaining a tight connection to experience.

The strongest definitions use a crossfire of intensional and extensional communication to nail down a concept. Even so, you only communicate maps to concepts, or instructions for building concepts—you don’t communicate the actual categories as they exist in your mind or in the world.

How An Algorithm Feels From Inside

Before you can question your intuitions, you have to realize that what your mind’s eye is looking at is an intuition—some cognitive algorithm, as seen from the inside—rather than a direct perception of the Way Things Really Are. People cling to their intuitions, I think, not so much because they believe their cognitive algorithms are perfectly reliable, but because they can’t see their intuitions as the way their cognitive algorithms happen to look from the inside. And so everything you try to say about how the native cognitive algorithm goes astray, ends up being contrasted to their direct perception of the Way Things Really Are—and discarded as obviously wrong.

The Argument from Common Usage

Once any empirical proposition is at stake, or any moral proposition, you can no longer appeal to common usage.

If you want to know whether atheism should be clustered with supernaturalist religions for purposes of some particular empirical inference, the dictionary can’t answer you. If you want to know whether blacks are people, the dictionary can’t answer you. If everyone believes that the red light in the sky is Mars the God of War, the dictionary will define “Mars” as the God of War. If everyone believes that fire is the release of phlogiston, the dictionary will define “fire” as the release of phlogiston. There is an art to using words; even when definitions are not literally true or false, they are often wiser or more foolish. Dictionaries are mere histories of past usage; if you treat them as supreme arbiters of meaning, it binds you to the wisdom of the past, forbidding you to do better.

Taboo Your Words

ALBERT: “A tree falling in a deserted forest makes a sound.”
BARRY: “A tree falling in a deserted forest does not make a sound.”

Clearly, since one says “sound” and one says “not sound,” we must have a contradiction, right? But suppose that they both dereference their pointers before speaking:

ALBERT: “A tree falling in a deserted forest matches [membership test: this event generates acoustic vibrations].”
BARRY: “A tree falling in a deserted forest does not match [membership test: this event generates auditory experiences].”

Now there is no longer an apparent collision—all they had to do was prohibit themselves from using the word sound.

When you find yourself in philosophical difficulties, the first line of defense is not to define your problematic terms, but to see whether you can think without using those terms at all. Or any of their short synonyms. And be careful not to let yourself invent a new word to use instead. Describe outward observables and interior mechanisms; don’t use a single handle, whatever that handle may be.

Albert says that people have “free will.” Barry says that people don’t have “free will.” Well, that will certainly generate an apparent conflict. Most philosophers would advise Albert and Barry to try to define exactly what they mean by “free will,” on which topic they will certainly be able to discourse at great length. I would advise Albert and Barry to describe what it is that they think people do, or do not have, without using the phrase “free will” at all. (If you want to try this at home, you should also avoid the words “choose,” “act,” “decide,” “determined,” “responsible,” or any of their synonyms.) This is one of the nonstandard tools in my toolbox, and in my humble opinion, it works way way better than the standard one. It also requires more effort to use; you get what you pay for.

Replace the Symbol with the Substance

You have to visualize. You have to make your mind’s eye see the details, as though looking for the first time. You have to perform an Original Seeing.

The chief obstacle to performing an original seeing is that your mind already has a nice neat summary, a nice little easy-to-use concept handle. Like the word “baseball,” or “bat,” or “base.” It takes an effort to stop your mind from sliding down the familiar path, the easy path, the path of least resistance, where the small featureless word rushes in and obliterates the details you’re trying to see. A word itself can have the destructive force of cliché; a word itself can carry the poison of a cached thought. Playing the game of Taboo—being able to describe without using the standard pointer/label/handle—is one of the fundamental rationalist capacities. It occupies the same primordial level as the habit of constantly asking “Why?” or “What does this belief make me anticipate?”

To categorize is to throw away information. If you’re told that a falling tree makes a “sound,” you don’t know what the actual sound is; you haven’t actually heard the tree falling. If a coin lands “heads,” you don’t know its radial orientation. You want to use categories to throw away irrelevant information, to sift gold from dust, but often the standard categorization ends up throwing out relevant information too. And when you end up in that sort of mental trouble, the first and most obvious solution is to play Taboo.

If you see your activities and situation originally, you will be able to originally see your goals as well. If you can look with fresh eyes, as though for the first time, you will see yourself doing things that you would never dream of doing if they were not habits. Purpose is lost whenever the substance (learning, knowledge, health) is displaced by the symbol (a degree, a test score, medical care). To heal a lost purpose, or a lossy categorization, you must do the reverse: Replace the symbol with the substance; replace the signifier with the signified; replace the property with the membership test; replace the word with the meaning; replace the label with the concept; replace the summary with the details; replace the proxy question with the real question; dereference the pointer; drop into a lower level of organization; mentally simulate the process instead of naming it; zoom in on your map.

Categorizing Has Consequences

You can see this in terms of similarity clusters: once you draw a boundary around a group, the mind starts trying to harvest similarities from the group. And unfortunately the human pattern-detectors seem to operate in such overdrive that we see patterns whether they’re there or not; a weakly negative correlation can be mistaken for a strong positive one with a bit of selective memory. You can see this in terms of neural algorithms: creating a name for a set of things is like allocating a subnetwork to find patterns in them. You can see this in terms of a compression fallacy: things given the same name end up dumped into the same mental bucket, blurring them together into the same point on the map. Or you can see this in terms of the boundless human ability to make stuff up out of thin air and believe it because no one can prove it’s wrong. As soon as you name the category, you can start making up stuff about it. The named thing doesn’t have to be perceptible; it doesn’t have to exist; it doesn’t even have to be coherent.

Any way you look at it, drawing a boundary in thingspace is not a neutral act. Maybe a more cleanly designed, more purely Bayesian AI could ponder an arbitrary class and not be influenced by it. But you, a human, do not have that option. Categories are not static things in the context of a human brain; as soon as you actually think of them, they exert force on your mind. One more reason not to believe you can define a word any way you like.

Arguing “By Definition”

When people argue definitions, they usually start with some visible, known, or at least widely believed set of characteristics; then pull out a dictionary, and point out that these characteristics fit the dictionary definition; and so conclude, “Therefore, by definition, atheism is a religion!” But visible, known, widely believed characteristics are rarely the real point of a dispute.

People feel the need to squeeze the argument onto a single course by saying “Any P, by definition, has property Q!,” on exactly those occasions when they see, and prefer to dismiss out of hand, additional arguments that call into doubt the default inference based on clustering. So too with the argument “X, by definition, is a Y!” E.g., “Atheists believe that God doesn’t exist; therefore atheists have beliefs about God, because a negative belief is still a belief; therefore atheism asserts answers to theological questions; therefore atheism is, by definition, a religion.” You wouldn’t feel the need to say, “Hinduism, by definition, is a religion!” because, well, of course Hinduism is a religion. It’s not just a religion “by definition,” it’s, like, an actual religion.

Atheism does not resemble the central members of the “religion” cluster, so if it wasn’t for the fact that atheism is a religion by definition, you might go around thinking that atheism wasn’t a religion. That’s why you’ve got to crush all opposition by pointing out that “Atheism is a religion” is true by definition, because it isn’t true any other way. Which is to say: People insist that “X, by definition, is a Y!” on those occasions when they’re trying to sneak in a connotation of Y that isn’t directly in the definition, and X doesn’t look all that much like other members of the Y cluster.

Over the last thirteen years I’ve been keeping track of how often this phrase is used correctly versus incorrectly—though not with literal statistics, I fear. But eyeballing suggests that using the phrase by definition, anywhere outside of math, is among the most alarming signals of flawed argument I’ve ever found. It’s right up there with “Hitler,” “God,” “absolutely certain,” and “can’t prove that.”

Where to Draw the Boundary?

Just because there’s a word “art” doesn’t mean that it has a meaning, floating out there in the void, which you can discover by finding the right definition. It feels that way, but it is not so. Wondering how to define a word means you’re looking at the problem the wrong way—searching for the mysterious essence of what is, in fact, a communication signal.

Figuring where to cut reality in order to carve along the joints—this is the problem worthy of a rationalist. It is what people should be trying to do, when they set out in search of the floating essence of a word.

The way to carve reality at its joints is to draw boundaries around concentrations of unusually high probability density.

Interlude: An Intuitive Explanation of Bayes’s Theorem

The original proportion of patients with breast cancer is known as the prior probability. The chance that a patient with breast cancer gets a positive mammography, and the chance that a patient without breast cancer gets a positive mammography, are known as the two conditional probabilities. Collectively, this initial information is known as the priors. The final answer—the estimated probability that a patient has breast cancer, given that we know she has a positive result on her mammography—is known as the revised probability or the posterior probability. What we’ve just seen is that the posterior probability depends in part on the prior probability.
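The arithmetic can be made concrete with a short Python sketch. The prior proportion of patients with breast cancer is not given in this excerpt, so the 1% figure below is an assumed illustrative prior; the 80% hit rate and 9.6% false positive rate match the mammography example discussed in this chapter.

```python
def posterior(prior, p_pos_given_cancer, p_pos_given_healthy):
    """Bayes's Theorem: revised probability of cancer, given a positive result."""
    p_positive = (prior * p_pos_given_cancer
                  + (1 - prior) * p_pos_given_healthy)
    return prior * p_pos_given_cancer / p_positive

# Assumed illustrative prior: 1% of patients have breast cancer.
# Conditionals: 80% hit rate, 9.6% false positive rate.
p = posterior(0.01, 0.80, 0.096)
# The posterior comes out near 7.8%, nowhere near the 80% hit rate:
# the answer depends on the prior as well as on the test.
```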

The mammography result doesn’t replace your old information about the patient’s chance of having cancer; the mammography slides the estimated probability in the direction of the result. A positive result slides the original probability upward; a negative result slides the probability downward.

The probability that a test gives a true positive divided by the probability that a test gives a false positive is known as the likelihood ratio of that test. The likelihood ratio for a positive result summarizes how much a positive result will slide the prior probability. Does the likelihood ratio of a medical test then sum up everything there is to know about the usefulness of the test? No, it does not! The likelihood ratio sums up everything there is to know about the meaning of a positive result on the medical test, but the meaning of a negative result on the test is not specified, nor is the frequency with which the test is useful. For example, a mammography with a hit rate of 80% for patients with breast cancer and a false positive rate of 9.6% for healthy patients has the same likelihood ratio as a test with an 8% hit rate and a false positive rate of 0.96%. Although these two tests have the same likelihood ratio, the first test is more useful in every way—it detects disease more often, and a negative result is stronger evidence of health.
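The two tests just described can be compared directly in a short Python sketch (the numbers are taken from the paragraph above):

```python
def lr_positive(hit_rate, false_positive_rate):
    """Likelihood ratio of a positive result: P(pos | sick) / P(pos | healthy)."""
    return hit_rate / false_positive_rate

def lr_negative(hit_rate, false_positive_rate):
    """Likelihood ratio of a negative result: P(neg | sick) / P(neg | healthy)."""
    return (1 - hit_rate) / (1 - false_positive_rate)

test_a = (0.80, 0.096)    # 80% hit rate, 9.6% false positives
test_b = (0.08, 0.0096)   # 8% hit rate, 0.96% false positives

# Both positive likelihood ratios are about 8.33: a positive result slides
# the prior by the same amount on either test...
pos_a, pos_b = lr_positive(*test_a), lr_positive(*test_b)

# ...but a negative result on test A is much stronger evidence of health
# (a ratio further below 1 slides the probability further downward):
neg_a = lr_negative(*test_a)   # 0.20 / 0.904, roughly 0.22
neg_b = lr_negative(*test_b)   # 0.92 / 0.9904, roughly 0.93
```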

Bayes’s Theorem describes what makes something “evidence” and how much evidence it is. Statistical models are judged by comparison to the Bayesian method because, in statistics, the Bayesian method is as good as it gets—the Bayesian method defines the maximum amount of mileage you can get out of a given piece of evidence, in the same way that thermodynamics defines the maximum amount of work you can get out of a temperature differential. This is why you hear cognitive scientists talking about Bayesian reasoners. In cognitive science, Bayesian reasoner is the technically precise code word that we use to mean rational mind.

There are also a number of general heuristics about human reasoning that you can learn from looking at Bayes’s Theorem. For example, in many discussions of Bayes’s Theorem, you may hear cognitive psychologists saying that people do not take prior frequencies sufficiently into account, meaning that when people approach a problem where there’s some evidence X indicating that condition A might hold true, they tend to judge A’s likelihood solely by how well the evidence X seems to match A, without taking into account the prior frequency of A.

A related error is to pay too much attention to P(X|A) and not enough to P(X|¬A) when determining how much evidence X is for A. The degree to which a result X is evidence for A depends not only on the strength of the statement we’d expect to see result X if A were true, but also on the strength of the statement we wouldn’t expect to see result X if A weren’t true.

The Bayesian revolution in the sciences is fueled, not only by more and more cognitive scientists suddenly noticing that mental phenomena have Bayesian structure in them; not only by scientists in every field learning to judge their statistical methods by comparison with the Bayesian method; but also by the idea that science itself is a special case of Bayes’s Theorem; experimental evidence is Bayesian evidence. The Bayesian revolutionaries hold that when you perform an experiment and get evidence that “confirms” or “disconfirms” your theory, this confirmation and disconfirmation is governed by the Bayesian rules. For example, you have to take into account not only whether your theory predicts the phenomenon, but whether other possible explanations also predict the phenomenon. Previously, the most popular philosophy of science was probably Karl Popper’s falsificationism—this is the old philosophy that the Bayesian revolution is currently dethroning. Karl Popper’s idea that theories can be definitely falsified, but never definitely confirmed, is yet another special case of the Bayesian rules; if P(X|A) ≈ 1—if the theory makes a definite prediction—then observing ¬X very strongly falsifies A. On the other hand, if P(X|A) ≈ 1, and we observe X, this doesn’t definitely confirm the theory; there might be some other condition B such that P(X|B) ≈ 1, in which case observing X doesn’t favor A over B.
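The asymmetry between falsification and confirmation can be checked numerically in a Python sketch (the priors and likelihoods here are invented for illustration):

```python
def update(prior_a, p_x_given_a, p_x_given_b, observed_x):
    """Posterior P(A | observation), with rival hypothesis B (priors sum to 1)."""
    prior_b = 1 - prior_a
    if observed_x:
        like_a, like_b = p_x_given_a, p_x_given_b
    else:
        like_a, like_b = 1 - p_x_given_a, 1 - p_x_given_b
    return prior_a * like_a / (prior_a * like_a + prior_b * like_b)

# A makes a near-definite prediction, P(X|A) = 0.99; rival B says P(X|B) = 0.5.
# Observing not-X very strongly falsifies A...
p_falsified = update(0.5, 0.99, 0.5, observed_x=False)  # about 0.02

# ...but observing X only moderately favors A over B...
p_confirmed = update(0.5, 0.99, 0.5, observed_x=True)   # about 0.66

# ...and not at all if B also predicted X, P(X|B) = 0.99:
p_tied = update(0.5, 0.99, 0.99, observed_x=True)       # 0.50
```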

Book IV Mere Reality

Part P Reductionism 101

Wrong Questions

Where the mind cuts against reality’s grain, it generates wrong questions—questions that cannot possibly be answered on their own terms, but only dissolved by understanding the cognitive algorithm that generates the perception of a question. One good cue that you’re dealing with a “wrong question” is when you cannot even imagine any concrete, specific state of how-the-world-is that would answer the question. When it doesn’t even seem possible to answer the question. Take the Standard Definitional Dispute, for example, about the tree falling in a deserted forest. Is there any way-the-world-could-be—any state of affairs—that corresponds to the word “sound” really meaning only acoustic vibrations, or really meaning only auditory experiences? (“Why, yes,” says the one, “it is the state of affairs where ‘sound’ means acoustic vibrations.” So Taboo the word “means,” and “represents,” and all similar synonyms, and describe again: What way-the-world-can-be, what state of affairs, would make one side right, and the other side wrong?)

Mystery exists in the mind, not in reality. If I am ignorant about a phenomenon, that is a fact about my state of mind, not a fact about the phenomenon itself. All the more so if it seems like no possible answer can exist: Confusion exists in the map, not in the territory. Unanswerable questions do not mark places where magic enters the universe. They mark places where your mind runs skew to reality. Such questions must be dissolved. Bad things happen when you try to answer them. It inevitably generates the worst sort of Mysterious Answer to a Mysterious Question: The one where you come up with seemingly strong arguments for your Mysterious Answer, but the “answer” doesn’t let you make any new predictions even in retrospect, and the phenomenon still possesses the same sacred inexplicability that it had at the start.

Righting a Wrong Question

When you are faced with an unanswerable question—a question to which it seems impossible to even imagine an answer—there is a simple trick that can turn the question solvable. Compare:

  • “Why do I have free will?”

  • “Why do I think I have free will?”

The nice thing about the second question is that it is guaranteed to have a real answer, whether or not there is any such thing as free will. Asking “Why do I have free will?” or “Do I have free will?” sends you off thinking about tiny details of the laws of physics, so distant from the macroscopic level that you couldn’t begin to see them with the naked eye. And you’re asking “Why is X the case?” where X may not be coherent, let alone the case. “Why do I think I have free will?,” in contrast, is guaranteed answerable. You do, in fact, believe you have free will. This belief seems far more solid and graspable than the ephemerality of free will. And there is, in fact, some nice solid chain of cognitive cause and effect leading up to this belief.

Mind Projection Fallacy

E. T. Jaynes used the term Mind Projection Fallacy to denote the error of projecting your own mind’s properties into the external world. It is in the argument over the real meaning of the word sound, and in the magazine cover of the monster carrying off a woman in the torn dress, and Kant’s declaration that space by its very nature is flat, and Hume’s definition of a priori ideas as those “discoverable by the mere operation of thought, without dependence on what is anywhere existent in the universe”...

Probability is in the Mind

Probabilities express uncertainty, and it is only agents who can be uncertain. A blank map does not correspond to a blank territory. Ignorance is in the mind.

Qualitatively Confused

I suggest that a primary cause of confusion about the distinction between “belief,” “truth,” and “reality” is qualitative thinking about beliefs. Consider the archetypal postmodernist attempt to be clever: “The Sun goes around the Earth” is true for Hunga Huntergatherer, but “The Earth goes around the Sun” is true for Amara Astronomer! Different societies have different truths!

No, different societies have different beliefs. Belief is of a different type than truth; it’s like comparing apples and probabilities.

The dichotomy between belief and disbelief, being binary, is confusingly similar to the dichotomy between truth and untruth. So let’s use quantitative reasoning instead. Suppose that I assign a 70% probability to the proposition that snow is white. It follows that I think there’s around a 70% chance that the sentence “snow is white” will turn out to be true. If the sentence “snow is white” is true, is my 70% probability assignment to the proposition, also “true”? Well, it’s more true than it would have been if I’d assigned 60% probability, but not so true as if I’d assigned 80% probability. When talking about the correspondence between a probability assignment and reality, a better word than “truth” would be “accuracy.” “Accuracy” sounds more quantitative, like an archer shooting an arrow: how close did your probability assignment strike to the center of the target?
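One standard way to quantify this notion of accuracy is the logarithmic scoring rule: take the log of the probability you assigned to what actually happened. A minimal Python sketch:

```python
import math

def log_score(prob_assigned, actually_true):
    """Log of the probability assigned to the actual outcome; higher is better."""
    p = prob_assigned if actually_true else 1 - prob_assigned
    return math.log(p)

# Snow is, in fact, white.  A 70% assignment scores better than 60% and
# worse than 80%: accuracy is quantitative, not a binary true/false.
scores = [log_score(p, actually_true=True) for p in (0.6, 0.7, 0.8)]
```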

Think Like Reality

Whenever I hear someone describe quantum physics as “weird”—whenever I hear someone bewailing the mysterious effects of observation on the observed, or the bizarre existence of nonlocal correlations, or the incredible impossibility of knowing position and momentum at the same time—then I think to myself: This person will never understand physics no matter how many books they read. Reality has been around since long before you showed up. Don’t go calling it nasty names like “bizarre” or “incredible.” The universe was propagating complex amplitudes through configuration space for ten billion years before life ever emerged on Earth. Quantum physics is not “weird.” You are weird. You have the absolutely bizarre idea that reality ought to consist of little billiard balls bopping around, when in fact reality is a perfectly normal cloud of complex amplitude in configuration space. This is your problem, not reality’s, and you are the one who needs to change. Human intuitions were produced by evolution and evolution is a hack.

Calling reality “weird” keeps you inside a viewpoint already proven erroneous. Probability theory tells us that surprise is the measure of a poor hypothesis; if a model is consistently stupid—consistently bumps into events to which it assigns tiny probability—then it’s time to discard that model. A good model makes reality look normal, not weird; a good model assigns high probability to that which is actually the case. Intuition is only a model by another name: poor intuitions are shocked by reality, good intuitions make reality feel natural. You want to reshape your intuitions so that the universe looks normal. You want to think like reality. This end state cannot be forced. It is pointless to pretend that quantum physics feels natural to you when in fact it feels strange. That is merely denying your confusion, not becoming less confused. But it is just as much a hindrance to keep thinking How bizarre! Spending emotional energy on incredulity wastes time you could be using to update. It repeatedly throws you back into the frame of the old, wrong viewpoint. It feeds your sense of righteous indignation at reality daring to contradict you.

The principle extends beyond physics. Have you ever caught yourself saying something like, “I just don’t understand how a PhD physicist can believe in astrology?” Well, if you literally don’t understand, this indicates a problem with your model of human psychology. Perhaps you are indignant—you wish to express strong moral disapproval. But if you literally don’t understand, then your indignation is stopping you from coming to terms with reality. It shouldn’t be hard to imagine how a PhD physicist ends up believing in astrology. People compartmentalize, enough said.

I now try to avoid using the English idiom “I just don’t understand how...” to express indignation. If I genuinely don’t understand how, then my model is being surprised by the facts, and I should discard it and find a better model. Surprise exists in the map, not in the territory. There are no surprising facts, only models that are surprised by facts. Likewise for facts called such nasty names as “bizarre,” “incredible,” “unbelievable,” “unexpected,” “strange,” “anomalous,” or “weird.” When you find yourself tempted by such labels, it may be wise to check if the alleged fact is really factual. But if the fact checks out, then the problem isn’t the fact—it’s you.


Reductionism

First, let it be said that I do indeed hold that “reductionism,” according to the meaning I will give for that word, is obviously correct; and to perdition with any past civilizations that disagreed. This seems like a strong statement, at least the first part of it. General Relativity seems well-supported, yet who knows but that some future physicist may overturn it? On the other hand, we are never going back to Newtonian mechanics. The ratchet of science turns, but it does not turn in reverse. There are cases in scientific history where a theory suffered a wound or two, and then bounced back; but when a theory takes as many arrows through the chest as Newtonian mechanics, it stays dead. “To hell with what past civilizations thought” seems safe enough, when past civilizations believed in something that has been falsified to the trash heap of history.

So when your mind simultaneously believes explicit descriptions of many different levels, and believes explicit rules for transiting between levels, as part of an efficient combined model, it feels like you are seeing a system that is made of different level descriptions and their rules for interaction. But this is just the brain trying to efficiently compress an object that it cannot remotely begin to model on a fundamental level. The airplane is too large. Even a hydrogen atom would be too large. Quark-to-quark interactions are insanely intractable. You can’t handle the truth. But the way physics really works, as far as we can tell, is that there is only the most basic level—the elementary particle fields and fundamental forces. You can’t handle the raw truth, but reality can handle it without the slightest simplification. (I wish I knew where Reality got its computing power.) The laws of physics do not contain distinct additional causal entities that correspond to lift or airplane wings, the way that the mind of an engineer contains distinct additional cognitive entities that correspond to lift or airplane wings. This, as I see it, is the thesis of reductionism. Reductionism is not a positive belief, but rather, a disbelief that the higher levels of simplified multilevel models are out there in the territory.

Fake Reductionism

There is a very great distinction between being able to see where the rainbow comes from, and playing around with prisms to confirm it, and maybe making a rainbow yourself by spraying water droplets— —versus some dour-faced philosopher just telling you, “No, there’s nothing special about the rainbow. Didn’t you hear? Scientists have explained it away. Just something to do with raindrops or whatever. Nothing to be excited about.” I think this distinction probably accounts for a hell of a lot of the deadly existential emptiness that supposedly accompanies scientific reductionism. You have to interpret the anti-reductionists’ experience of “reductionism,” not in terms of their actually seeing how rainbows work, not in terms of their having the critical “Aha!,” but in terms of their being told that the password is “Science.” The effect is just to move rainbows to a different literary genre—a literary genre they have been taught to regard as boring.

Part Q Joy in the Merely Real

Is Humanism a Religion Substitute?

There is an acid test of attempts at post-theism. The acid test is: “If religion had never existed among the human species—if we had never made the original mistake—would this song, this art, this ritual, this way of thinking, still make sense?” If humanity had never made the original mistake, there would be no hymns to the nonexistence of God. But there would still be marriages, so the notion of an atheistic marriage ceremony makes perfect sense—as long as you don’t suddenly launch into a lecture on how God doesn’t exist. Because, in a world where religion never had existed, nobody would interrupt a wedding to talk about the implausibility of a distant hypothetical concept. They’d talk about love, children, commitment, honesty, devotion, but who the heck would mention God? And, in a human world where religion never had existed, there would still be people who got tears in their eyes watching a space shuttle launch. Which is why, even if experiment shows that watching a shuttle launch makes “religion”-associated areas of my brain light up, associated with feelings of transcendence, I do not see that as a substitute for religion; I expect the same brain areas would light up, for the same reason, if I lived in a world where religion had never been invented.


Scarcity

Scarcity, as that term is used in social psychology, is when things become more desirable as they appear less obtainable.

The conventional theory for explaining this is “psychological reactance,” social-psychology-speak for “When you tell people they can’t do something, they’ll just try even harder.” The fundamental instincts involved appear to be preservation of status and preservation of options. We resist dominance, when any human agency tries to restrict our freedom. And when options seem to be in danger of disappearing, even from natural causes, we try to leap on the option before it’s gone. Leaping on disappearing options may be a good adaptation in a hunter-gatherer society—gather the fruits while they are still ripe—but in a money-based society it can be rather costly.

As Cialdini remarks, a chief sign of this malfunction is that you dream of possessing something, rather than using it. (Timothy Ferriss offers similar advice on planning your life: ask which ongoing experiences would make you happy, rather than which possessions or status-changes.) But the really fundamental problem with desiring the unattainable is that as soon as you actually get it, it stops being unattainable. If we cannot take joy in the merely available, our lives will always be frustrated...

Part R Physicalism 201

Zombies! Zombies?

Your “zombie,” in the philosophical usage of the term, is putatively a being that is exactly like you in every respect—identical behavior, identical speech, identical brain; every atom and quark in exactly the same position, moving according to the same causal laws of motion—except that your zombie is not conscious. It is furthermore claimed that if zombies are “possible” (a term over which battles are still being fought), then, purely from our knowledge of this “possibility,” we can deduce a priori that consciousness is extra-physical, in a sense to be described below; the standard term for this position is “epiphenomenalism.”

Based on my limited experience, the Zombie Argument may be a candidate for the most deranged idea in all of philosophy. There are times when, as a rationalist, you have to believe things that seem weird to you. Relativity seems weird, quantum mechanics seems weird, natural selection seems weird. But these weirdnesses are pinned down by massive evidence. There’s a difference between believing something weird because science has confirmed it overwhelmingly— —versus believing a proposition that seems downright deranged, because of a great big complicated philosophical argument centered around unspecified miracles and giant blank spots not even claimed to be understood— —in a case where even if you accept everything that has been told to you so far, afterward the phenomenon will still seem like a mystery and still have the same quality of wondrous impenetrability that it had at the start.

Excluding the Supernatural

By far the best definition I’ve ever heard of the supernatural is Richard Carrier’s: A “supernatural” explanation appeals to ontologically basic mental things, mental entities that cannot be reduced to nonmental entities. This is the difference, for example, between saying that water rolls downhill because it wants to be lower, and setting forth differential equations that claim to describe only motions, not desires. It’s the difference between saying that a tree puts forth leaves because of a tree spirit, versus examining plant biochemistry. Cognitive science takes the fight against supernaturalism into the realm of the mind.

Part S Quantum Physics and Many Worlds

Quantum Explanations

There’s a widespread belief that quantum mechanics is supposed to be confusing. This is not a good frame of mind for either a teacher or a student. As a Bayesian, I don’t believe in phenomena that are inherently confusing. Confusion exists in our models of the world, not in the world itself. If a subject is widely known as confusing, not just difficult... you shouldn’t leave it at that. It doesn’t satisfice; it is not an okay place to be. Maybe you can fix the problem, maybe you can’t; but you shouldn’t be happy to leave students confused.

It is always best to think of reality as perfectly normal. Since the beginning, not one unusual thing has ever happened. The goal is to become completely at home in a quantum universe. Like a native. Because, in fact, that is where you live.

Living in Many Worlds

Don’t think that many-worlds is there to make strange, radical, exciting predictions. It all adds up to normality. Then why should anyone care? Because there was once asked the question, fascinating unto a rationalist: What all adds up to normality? And the answer to this question turns out to be: quantum mechanics. It is quantum mechanics that adds up to normality. If there were something else there instead of quantum mechanics, then the world would look strange and unusual. Bear this in mind, when you are wondering how to live in the strange new universe of many worlds: You have always been there.

Live in your own world. Before you knew about quantum physics, you would not have been tempted to try living in a world that did not seem to exist. Your decisions should add up to this same normality: you shouldn’t try to live in a quantum world you can’t communicate with.

But, by and large, it all adds up to normality. If your understanding of many-worlds is the tiniest bit shaky, and you are contemplating whether to believe some strange proposition, or feel some strange emotion, or plan some strange strategy, then I can give you very simple advice: Don’t. The quantum universe is not a strange place into which you have been thrust. It is the way things have always been.

Part T Science and Rationality

Science Doesn’t Trust Your Rationality

It seems to me that there is a deep analogy between (small-“l”) libertarianism and Science:

  • Both are based on a pragmatic distrust of reasonable-sounding arguments.

  • Both try to build systems that are more trustworthy than the people in them.

  • Both accept that people are flawed, and try to harness their flaws to power the system.

The core argument for libertarianism is historically motivated distrust of lovely theories of “How much better society would be, if we just made a rule that said XYZ.” If that sort of trick actually worked, then more regulations would correlate to higher economic growth as society moved from local to global optima. But when some person or interest group gets enough power to start doing everything they think is a good idea, history says that what actually happens is Revolutionary France or Soviet Russia. The plans that in lovely theory should have made everyone happy ever after, don’t have the results predicted by reasonable-sounding arguments. And power corrupts, and attracts the corrupt. So you regulate as little as possible, because you can’t trust the lovely theories and you can’t trust the people who implement them. You don’t shake your finger at people for being selfish. You try to build an efficient system of production out of selfish participants, by requiring transactions to be voluntary. So people are forced to play positive-sum games, because that’s how they get the other party to sign the contract. With violence restrained and contracts enforced, individual selfishness can power a globally productive system. Of course none of this works quite so well in practice as in theory, and I’m not going to go into market failures, commons problems, etc. The core argument for libertarianism is not that libertarianism would work in a perfect world, but that it degrades gracefully into real life. Or rather, degrades less awkwardly than any other known economic principle.

Science first came to know itself as a rebellion against trusting the word of Aristotle. If the people of that revolution had merely said, “Let us trust ourselves, not Aristotle!” they would have flashed and faded like the French Revolution. But the Scientific Revolution lasted because—like the American Revolution—the architects propounded a stranger philosophy: “Let us trust no one! Not even ourselves!” In the beginning came the idea that we can’t just toss out Aristotle’s armchair reasoning and replace it with different armchair reasoning. We need to talk to Nature, and actually listen to what It says in reply. This, itself, was a stroke of genius. But then came the challenge of implementation. People are stubborn, and may not want to accept the verdict of experiment. Shall we shake a disapproving finger at them, and say “Naughty”? No; we assume and accept that each individual scientist may be crazily attached to their personal theories. Nor do we assume that anyone can be trained out of this tendency—we don’t try to choose Eminent Judges who are supposed to be impartial.

Instead, we try to harness the individual scientist’s stubborn desire to prove their personal theory, by saying: “Make a new experimental prediction, and do the experiment. If you’re right, and the experiment is replicated, you win.” So long as scientists believe this is true, they have a motive to do experiments that can falsify their own theories. Only by accepting the possibility of defeat is it possible to win. And any great claim will require replication; this gives scientists a motive to be honest, on pain of great embarrassment. And so the stubbornness of individual scientists is harnessed to produce a steady stream of knowledge at the group level. The System is somewhat more trustworthy than its parts.

Libertarianism secretly relies on most individuals being prosocial enough to tip at a restaurant they won’t ever visit again. An economy of genuinely selfish human-level agents would implode. Similarly, Science relies on most scientists not committing sins so egregious that they can’t rationalize them away. To the extent that scientists believe they can promote their theories by playing academic politics—or game the statistical methods to potentially win without a chance of losing—or to the extent that nobody bothers to replicate claims—science degrades in effectiveness. But it degrades gracefully, as such things go. The part where the successful predictions belong to the theory and theorists who originally made them, and cannot just be stolen by a theory that comes along later—without a novel experimental prediction—is an important feature of this social process. The final upshot is that Science is not easily reconciled with probability theory. If you do a probability-theoretic calculation correctly, you’re going to get the rational answer. Science doesn’t trust your rationality, and it doesn’t rely on your ability to use probability theory as the arbiter of truth. It wants you to set up a definitive experiment.

No Safe Defense, Not Even Science

Of the people I know who are reaching upward as rationalists, who volunteer information about their childhoods, there is a surprising tendency to hear things like, “My family joined a cult and I had to break out,” or, “One of my parents was clinically insane and I had to learn to filter out reality from their madness.” My own experience with growing up in an Orthodox Jewish family seems tame by comparison... but it accomplished the same outcome: It broke my core emotional trust in the sanity of the people around me. Until this core emotional trust is broken, you don’t start growing as a rationalist. I have trouble putting into words why this is so. Maybe any unusual skills you acquire—anything that makes you unusually rational—requires you to zig when other people zag. Maybe that’s just too scary, if the world still seems like a sane place unto you. Or maybe you don’t bother putting in the hard work to be extra bonus sane, if normality doesn’t scare the hell out of you.

I know that many aspiring rationalists seem to run into roadblocks around things like cryonics or many-worlds. Not that they don’t see the logic; they see the logic and wonder, “Can this really be true, when it seems so obvious now, and yet none of the people around me believe it?” Yes. Welcome to the Earth where ethanol is made from corn and environmentalists oppose nuclear power. I’m sorry. (See also: Cultish Countercultishness. If you end up in the frame of mind of nervously seeking reassurance, this is never a good thing—even if it’s because you’re about to believe something that sounds logical but could cause other people to look at you funny.)

People who’ve had their trust broken in the sanity of the people around them seem to be able to evaluate strange ideas on their merits, without feeling nervous about their strangeness. The glue that binds them to their current place has dissolved, and they can walk in some direction, hopefully forward. Lonely dissent, I called it. True dissent doesn’t feel like going to school wearing black; it feels like going to school wearing a clown suit. That’s what it takes to be the lone voice who says, “If you really think you know who’s going to win the election, why aren’t you picking up the free money on the Intrade prediction market?” while all the people around you are thinking, “It is good to be an individual and form your own opinions, the shoe commercials told me so.” Maybe in some other world, some alternate Everett branch with a saner human population, things would be different... but in this world, I’ve never seen anyone begin to grow as a rationalist until they make a deep emotional break with the wisdom of their pack.

Your trust will not break, until you apply all that you have learned here and from other books, and take it as far as you can go, and find that this too fails you—that you have still been a fool, and no one warned you against it—that all the most important parts were left out of the guidance you received—that some of the most precious ideals you followed steered you in the wrong direction.

It is living with uncertainty—knowing on a gut level that there are flaws, they are serious and you have not found them—that is the difficult thing.

My Childhood Role Model

Your era supports you more than you realize, in unconscious assumptions, in subtly improved technology of mind. Einstein was a nice fellow, but he talked a deal of nonsense about an impersonal God, which shows you how well he understood the art of careful thinking at a higher level of abstraction than his own field. Thinking of Einstein as flawed may seem less like sacrilege if you keep at least one imaginary galactic supermind to compare him with, so that he is not the far right end of your intelligence scale. If you only try to do what seems humanly possible, you will ask too little of yourself. When you imagine reaching up to some higher and inconvenient goal, all the convenient reasons why it is “not possible” leap readily to mind. The most important role models are dreams: they come from within ourselves. To dream of anything less than what you conceive to be perfection is to draw on less than the full power of the part of yourself that dreams.

Interlude: A Technical Explanation of Technical Explanation

On Popper’s philosophy, the strength of a scientific theory is not how much it explains, but how much it doesn’t explain. The virtue of a scientific theory lies not in the outcomes it permits, but in the outcomes it prohibits. Freud’s theories, which seemed to explain everything, prohibited nothing.

Translating this into Bayesian terms, we find that the more outcomes a model prohibits, the more probability density the model concentrates in the remaining, permitted outcomes. The more outcomes a theory prohibits, the greater the knowledge-content of the theory. The more daringly a theory exposes itself to falsification, the more definitely it tells you which experiences to anticipate. A theory that can explain any experience corresponds to a hypothesis of complete ignorance—a uniform distribution with probability density spread evenly over every possible outcome.
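As a toy illustration of how prohibiting outcomes concentrates probability, compare a maximally vague model with a daring one over six possible experimental outcomes. The numbers below are invented for illustration, not drawn from the text:

```python
import math

# Two toy models over six possible experimental outcomes.
uniform = {o: 1 / 6 for o in range(6)}  # "explains" every outcome equally
daring = {0: 0.90, 1: 0.02, 2: 0.02, 3: 0.02, 4: 0.02, 5: 0.02}  # nearly prohibits 5 of 6

def surprisal(model: dict, outcome: int) -> float:
    """Bits of surprise on seeing `outcome`: -log2 of the probability assigned."""
    return -math.log2(model[outcome])

observed = 0  # suppose the daring theory's favored outcome occurs
print(f"uniform model: {surprisal(uniform, observed):.2f} bits of surprise")
print(f"daring model:  {surprisal(daring, observed):.2f} bits of surprise")
```

Both models sum to probability one; the daring model has simply moved density out of the outcomes it nearly prohibits and into the one it predicts. If the predicted outcome occurs, it is far less surprised than the uniform model; if a prohibited outcome occurs, it is far more surprised. That asymmetric exposure is the knowledge-content.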

Book V Mere Goodness

Not for the Sake of Happiness (Alone)

I value freedom: When I’m deciding where to steer the future, I take into account not only the subjective states that people end up in, but also whether they got there as a result of their own efforts. The presence or absence of an external puppet master can affect my valuation of an otherwise fixed outcome. Even if people wouldn’t know they were being manipulated, it would matter to my judgment of how well humanity had done with its future. This is an important ethical issue, if you’re dealing with agents powerful enough to helpfully tweak people’s futures without their knowledge. So my values are not strictly reducible to happiness: There are properties I value about the future that aren’t reducible to activation levels in anyone’s pleasure center; properties that are not strictly reducible to subjective states even in principle. Which means that my decision system has a lot of terminal values, none of them strictly reducible to anything else. Art, science, love, lust, freedom, friendship... And I’m okay with that. I value a life complicated enough to be challenging and aesthetic—not just the feeling that life is complicated, but the actual complications—so turning into a pleasure center in a vat doesn’t appeal to me. It would be a waste of humanity’s potential, which I value actually fulfilling, not just having the feeling that it was fulfilled.

Part V Value Theory

Could Anything Be Right?

You should be willing to accept that you might know a little about morality. Nothing unquestionable, perhaps, but an initial state with which to start questioning yourself. Baked into your brain but not explicitly known to you, perhaps; but still, that which your brain would recognize as right is what you are talking about. You will accept at least enough of the way you respond to moral arguments as a starting point to identify “morality” as something to think about.

But that’s a rather large step. It implies accepting your own mind as identifying a moral frame of reference, rather than all morality being a great light shining from beyond (that in principle you might not be able to perceive at all). It implies accepting that even if there were a light and your brain decided to recognize it as “morality,” it would still be your own brain that recognized it, and you would not have evaded causal responsibility—or evaded moral responsibility either, on my view. It implies dropping the notion that a ghost of perfect emptiness will necessarily agree with you, because the ghost might occupy a different moral frame of reference, respond to different arguments, be asking a different question when it computes what-to-do-next. And if you’re willing to bake at least a few things into the very meaning of this topic of “morality,” this quality of rightness that you are talking about when you talk about “rightness”—if you’re willing to accept even that morality is what you argue about when you argue about “morality”—then why not accept other intuitions, other pieces of yourself, into the starting point as well?

Why not accept that, ceteris paribus, joy is preferable to sorrow? You might later find some ground within yourself or built upon yourself with which to criticize this—but why not accept it for now? Not just as a personal preference, mind you; but as something baked into the question you ask when you ask “What is truly right”? But then you might find that you know rather a lot about morality! Nothing certain—nothing unquestionable—nothing unarguable—but still, quite a bit of information. Are you willing to relinquish your Socratic ignorance?

Magical Categories

Why, Friendly AI isn’t hard at all! All you need is an AI that does what’s good! Oh, sure, not every possible mind does what’s good—but in this case, we just program the superintelligence to do what’s good. All you need is a neural network that sees a few instances of good things and not-good things, and you’ve got a classifier. Hook that up to an expected utility maximizer and you’re done! I shall call this the fallacy of magical categories—simple little words that turn out to carry all the desired functionality of the AI.

The novice thinks that Friendly AI is a problem of coercing an AI to make it do what you want, rather than the AI following its own desires. But the real problem of Friendly AI is one of communication—transmitting category boundaries, like “good,” that can’t be fully delineated in any training data you can give the AI during its childhood. Relative to the full space of possibilities the Future encompasses, we ourselves haven’t imagined most of the borderline cases, and would have to engage in full-fledged moral arguments to figure them out.

Value is Fragile

If I had to pick a single statement that relies on more Overcoming Bias content I’ve written than any other, that statement would be: Any Future not shaped by a goal system with detailed reliable inheritance from human morals and metamorals will contain almost nothing of worth.

Value isn’t just complicated, it’s fragile. There is more than one dimension of human value, where if just that one thing is lost, the Future becomes null. A single blow and all value shatters. Not every single blow will shatter all value—but more than one possible “single blow” will do so.

If you loose the grip of human morals and metamorals—the result is not mysterious and alien and beautiful by the standards of human value. It is moral noise, a universe tiled with paperclips. To change away from human morals in the direction of improvement rather than entropy requires a criterion of improvement; and that criterion would be physically represented in our brains, and our brains alone. Let go of the steering wheel, and the Future crashes.

Part W Quantified Humanism

Scope Insensitivity

Once upon a time, three groups of subjects were asked how much they would pay to save 2,000 / 20,000 / 200,000 migrating birds from drowning in uncovered oil ponds. The groups respectively answered $80, $78, and $88. This is scope insensitivity or scope neglect: the number of birds saved—the scope of the altruistic action—had little effect on willingness to pay.

People visualize “a single exhausted bird, its feathers soaked in black oil, unable to escape.” This image, or prototype, calls forth some level of emotional arousal that is primarily responsible for willingness-to-pay—and the image is the same in all cases. As for scope, it gets tossed out the window—no human can visualize 2,000 birds at once, let alone 200,000. The usual finding is that exponential increases in scope create linear increases in willingness-to-pay—perhaps corresponding to the linear time for our eyes to glaze over the zeroes; this small amount of affect is added, not multiplied, with the prototype affect. This hypothesis is known as “valuation by prototype.”
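Using the figures quoted above, the implied valuation per bird collapses as scope grows, a quick arithmetic check that scope is doing almost none of the work:

```python
# Willingness-to-pay figures from the oil-pond study quoted above.
scope = [2_000, 20_000, 200_000]
wtp = [80, 78, 88]

for n, w in zip(scope, wtp):
    print(f"{n:>7,} birds: ${w}  ->  ${w / n:.5f} per bird")
```

A hundredfold increase in birds saved lowers the implied per-bird value by a factor of roughly ninety, consistent with the prototype, not the scope, driving the response.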

An alternative hypothesis is “purchase of moral satisfaction.” People spend enough money to create a warm glow in themselves, a sense of having done their duty. The level of spending needed to purchase a warm glow depends on personality and financial situation, but it certainly has nothing to do with the number of birds.

We are insensitive to scope even when human lives are at stake. The moral: If you want to be an effective altruist, you have to think it through with the part of your brain that processes those unexciting inky zeroes on paper, not just the part that gets real worked up about that poor struggling oil-soaked bird.

When human lives are at stake, we have a duty to maximize, not satisfice; and this duty has the same strength as the original duty to save lives. Whoever knowingly chooses to save one life, when they could have saved two—to say nothing of a thousand lives, or a world—they have damned themselves as thoroughly as any murderer.

Humans are not expected utility maximizers. Whether you want to relax and have fun, or pay some extra money for a feeling of certainty, depends on whether you care more about satisfying your intuitions or actually achieving the goal.

If what you care about is the warm fuzzy feeling of certainty, then fine. If someone’s life is at stake, then you had best realize that your intuitions are a greasy lens through which to see the world. Your feelings are not providing you with direct, veridical information about strategic consequences—it feels that way, but they’re not. Warm fuzzies can lead you far astray. There are mathematical laws governing efficient strategies for steering the future. When something truly important is at stake—something more important than your feelings of happiness about the decision—then you should care about the math, if you truly care at all.

Feeling Moral

Research shows that people distinguish “sacred values,” like human lives, from “unsacred values,” like money. When you try to trade off a sacred value against an unsacred value, subjects express great indignation. (Sometimes they want to punish the person who made the suggestion.)

Trading off a sacred value against an unsacred value feels really awful. To merely multiply utilities would be too cold-blooded—it would be following rationality off a cliff... But altruism isn’t the warm fuzzy feeling you get from being altruistic. If you’re doing it for the spiritual benefit, that is nothing but selfishness. The primary thing is to help others, whatever the means. So shut up and multiply!

And I say also this to you: That if you set aside your regret for all the spiritual satisfaction you could be having—if you wholeheartedly pursue the Way, without thinking that you are being cheated—if you give yourself over to rationality without holding back, you will find that rationality gives to you in return. But that part only works if you don’t go around saying to yourself, “It would feel better inside me if only I could be less rational.” Should you be sad that you have the opportunity to actually help people? You cannot attain your full potential if you regard your gift as a burden.

The rhetoric of sacredness gets bonus points for seeming to express an unlimited commitment, an unconditional refusal that signals trustworthiness and refusal to compromise. So you conclude that moral rhetoric espouses qualitative distinctions, because espousing a quantitative tradeoff would sound like you were plotting to defect. On such occasions, people vigorously want to throw quantities out the window, and they get upset if you try to bring quantities back in, because quantities sound like conditions that would weaken the rule.

When (Not) to Use Probabilities

It may come as a surprise to some readers that I do not always advocate using probabilities. Or rather, I don’t always advocate that human beings, trying to solve their problems, should try to make up verbal probabilities, and then apply the laws of probability theory or decision theory to whatever number they just made up, and then use the result as their final belief or decision.

Now there are benefits from trying to translate your gut feelings of uncertainty into verbal probabilities. It may help you spot problems like the conjunction fallacy. It may help you spot internal inconsistencies—though it may not show you any way to remedy them. But you shouldn’t go around thinking that if you translate your gut feeling into “one in a thousand,” then, on occasions when you emit these verbal words, the corresponding event will happen around one in a thousand times. Your brain is not so well-calibrated. If instead you do something nonverbal with your gut feeling of uncertainty, you may be better off, because at least you’ll be using the gut feeling the way it was meant to be used.
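Checking whether your "one in a thousand" really behaves like one in a thousand is a bookkeeping exercise: group your stated probabilities into buckets and compare each bucket's observed frequency with its stated value. A minimal sketch, with made-up forecast data purely for illustration:

```python
from collections import defaultdict

def calibration_table(forecasts):
    """Compare stated probabilities with observed frequencies.

    `forecasts` is a list of (stated_probability, event_happened)
    pairs; the data fed in below is invented for illustration.
    """
    buckets = defaultdict(lambda: [0, 0])  # stated_p -> [hits, total]
    for p, happened in forecasts:
        buckets[p][1] += 1
        if happened:
            buckets[p][0] += 1
    return {p: hits / total for p, (hits, total) in sorted(buckets.items())}

# Someone who says "90% sure" but is right only 3 times out of 4:
data = [(0.9, True)] * 3 + [(0.9, False)]
print(calibration_table(data))  # → {0.9: 0.75}
```

A well-calibrated forecaster's table would show each bucket's frequency close to its stated probability; the gap between 0.9 and 0.75 above is the kind of miscalibration the passage warns about.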

Interlude: Twelve Virtues of Rationality

The first virtue is curiosity. A burning itch to know is higher than a solemn vow to pursue truth. To feel the burning itch of curiosity requires both that you be ignorant, and that you desire to relinquish your ignorance. If in your heart you believe you already know, or if in your heart you do not wish to know, then your questioning will be purposeless and your skills without direction. Curiosity seeks to annihilate itself; there is no curiosity that does not want an answer.

The second virtue is relinquishment. P. C. Hodgell said: “That which can be destroyed by the truth should be.” Do not flinch from experiences that might destroy your beliefs. The thought you cannot think controls you more than thoughts you speak aloud. Submit yourself to ordeals and test yourself in fire. Relinquish the emotion which rests upon a mistaken belief, and seek to feel fully that emotion which fits the facts.

The third virtue is lightness. Let the winds of evidence blow you about as though you are a leaf, with no direction of your own. Beware lest you fight a rearguard retreat against the evidence, grudgingly conceding each foot of ground only when forced, feeling cheated. Surrender to the truth as quickly as you can. Do this the instant you realize what you are resisting, the instant you can see from which quarter the winds of evidence are blowing against you.

The fourth virtue is evenness. One who wishes to believe says, “Does the evidence permit me to believe?” One who wishes to disbelieve asks, “Does the evidence force me to believe?” Beware lest you place huge burdens of proof only on propositions you dislike, and then defend yourself by saying: “But it is good to be skeptical.” If you attend only to favorable evidence, picking and choosing from your gathered data, then the more data you gather, the less you know.

The fifth virtue is argument. Those who wish to fail must first prevent their friends from helping them. Those who smile wisely and say “I will not argue” remove themselves from help and withdraw from the communal effort. In argument strive for exact honesty, for the sake of others and also yourself: the part of yourself that distorts what you say to others also distorts your own thoughts. Do not believe you do others a favor if you accept their arguments; the favor is to you. Do not think that fairness to all sides means balancing yourself evenly between positions; truth is not handed out in equal portions before the start of a debate.

The sixth virtue is empiricism. The roots of knowledge are in observation and its fruit is prediction. What tree grows without roots? What tree nourishes us without fruit? Do not ask which beliefs to profess, but which experiences to anticipate. Always know which difference of experience you argue about. Do not be blinded by words. When words are subtracted, anticipation remains.

The seventh virtue is simplicity. Antoine de Saint-Exupéry said: “Perfection is achieved not when there is nothing left to add, but when there is nothing left to take away.” Simplicity is virtuous in belief, design, planning, and justification. When you profess a huge belief with many details, each additional detail is another chance for the belief to be wrong. Each specification adds to your burden; if you can lighten your burden you must do so.

The eighth virtue is humility. To be humble is to take specific actions in anticipation of your own errors. To confess your fallibility and then do nothing about it is not humble; it is boasting of your modesty. Who are most humble? Those who most skillfully prepare for the deepest and most catastrophic errors in their own beliefs and plans.

The ninth virtue is perfectionism. The more errors you correct in yourself, the more you notice. As your mind becomes more silent, you hear more noise. When you notice an error in yourself, this signals your readiness to seek advancement to the next level. If you tolerate the error rather than correcting it, you will not advance to the next level and you will not gain the skill to notice new errors. In every art, if you do not seek perfection you will halt before taking your first steps. If perfection is impossible that is no excuse for not trying. Hold yourself to the highest standard you can imagine, and look for one still higher. Do not be content with the answer that is almost right; seek one that is exactly right.

The tenth virtue is precision. One comes and says: The quantity is between 1 and 100. Another says: The quantity is between 40 and 50. If the quantity is 42 they are both correct, but the second prediction was more useful and exposed itself to a stricter test. The narrowest statements slice deepest, the cutting edge of the blade.

The eleventh virtue is scholarship. Study many sciences and absorb their power as your own. Each field that you consume makes you larger. If you swallow enough sciences the gaps between them will diminish and your knowledge will become a unified whole. If you are gluttonous you will become vaster than mountains. It is especially important to eat math and science which impinge upon rationality: evolutionary psychology, heuristics and biases, social psychology, probability theory, decision theory. But these cannot be the only fields you study. The Art must have a purpose other than itself, or it collapses into infinite recursion.

Before these eleven virtues is a virtue which is nameless. Do not ask whether it is “the Way” to do this or that. Ask whether the sky is blue or green. If you speak overmuch of the Way you will not attain it. You may try to name the highest principle with names such as “the map that reflects the territory” or “experience of success and failure” or “Bayesian decision theory.” But perhaps you describe incorrectly the nameless virtue. How will you discover your mistake? Not by comparing your description to itself, but by comparing it to that which you did not name.

These then are twelve virtues of rationality: Curiosity, relinquishment, lightness, evenness, argument, empiricism, simplicity, humility, perfectionism, precision, scholarship, and the void.

Book VI Becoming Stronger

Part X Yudkowsky’s Coming of Age

Eliezer2000 lives by the rule that you should always be ready to have your thoughts broadcast to the whole world at any time, without embarrassment. Otherwise, clearly, you’ve fallen from grace: either you’re thinking something you shouldn’t be thinking, or you’re embarrassed by something that shouldn’t embarrass you.

If there’s one thing I’ve learned from this history, it’s that saying “Oops” is something to look forward to. Sure, the prospect of saying “Oops” in the future means that the you of right now is a drooling imbecile, whose words your future self won’t be able to read because of all the wincing. But saying “Oops” in the future also means that, in the future, you’ll acquire new Jedi powers that your present self doesn’t dream exist. It makes you feel embarrassed, but also alive. Realizing that your younger self was a complete moron means that even though you’re already in your twenties, you haven’t yet gone over your peak. So here’s to hoping that my future self realizes I’m a drooling imbecile: I may plan to solve my problems with my present abilities, but extra Jedi powers sure would come in handy.

Part Y Challenging the Difficult

Tsuyoku Naritai! (I Want to Become Stronger)

Tsuyoku naritai is Japanese. Tsuyoku is “strong”; naru is “becoming,” and the form naritai is “want to become.” Together it means “I want to become stronger,” and it expresses a sentiment embodied more intensely in Japanese works than in any Western literature I’ve read. You might say it when expressing your determination to become a professional Go player—or after you lose an important match, but you haven’t given up—or after you win an important match, but you’re not a ninth-dan player yet—or after you’ve become the greatest Go player of all time, but you still think you can do better. That is tsuyoku naritai, the will to transcendence.

Take no pride in your confession that you too are biased; do not glory in your self-awareness of your flaws. This is akin to the principle of not taking pride in confessing your ignorance; for if your ignorance is a source of pride to you, you may become loath to relinquish your ignorance when evidence comes knocking. Likewise with our flaws—we should not gloat over how self-aware we are for confessing them; the occasion for rejoicing is when we have a little less to confess. Otherwise, when the one comes to us with a plan for correcting the bias, we will snarl, “Do you think to set yourself above us?” We will shake our heads sadly and say, “You must not be very self-aware.” Never confess to me that you are just as flawed as I am unless you can tell me what you plan to do about it. Afterward you will still have plenty of flaws left, but that’s not the point; the important thing is to do better, to keep moving ahead, to take one more step forward. Tsuyoku naritai!

It can be fun to proudly display your modesty, so long as everyone knows how very much you have to be modest about. But do not let that be the endpoint of your journeys. Even if you only whisper it to yourself, whisper it still: Tsuyoku, tsuyoku! Stronger, stronger! And then set yourself a higher target. That’s the true meaning of the realization that you are still flawed (though a little less so). It means always reaching higher, without shame. Tsuyoku naritai! I’ll always run as fast as I can, even if I pull ahead, I’ll keep on running; and someone, someday, will surpass me; but even though I fall behind, I’ll always run as fast as I can.

Shut Up and Do the Impossible!

Setting out to make an effort is distinct from setting out to win.

One of the key Rules For Doing The Impossible is that, if you can state exactly why something is impossible, you are often close to a solution.

Part Z The Craft and the Community

Raising the Sanity Waterline

Suppose we have a scientist who’s still religious, either full-blown scriptural religion, or in the sense of tossing around vague casual endorsements of “spirituality.” We now know this person is not applying any technical, explicit understanding of...

  • ... what constitutes evidence and why;

  • ... Occam’s Razor;

  • ... how the above two rules derive from the lawful and causal operation of minds as mapping engines, and do not switch off when you talk about tooth fairies;

  • ... how to tell the difference between a real answer and a curiosity-stopper;

  • ... how to rethink matters for themselves instead of just repeating things they heard;

  • ... certain general trends of science over the last three thousand years;

  • ... the difficult arts of actually updating on new evidence and relinquishing old beliefs;

  • ... epistemology 101;

  • ... self-honesty 201;

  • ... et cetera, et cetera, et cetera, and so on.

When you consider it—these are all rather basic matters of study, as such things go.

That’s what the dead canary, religion, is telling us: that the general sanity waterline is currently really ridiculously low. Even in the highest halls of science. If we throw out that dead and rotting canary, then our mine may stink a bit less, but the sanity waterline may not rise much higher. This is not to criticize the neo-atheist movement. The harm done by religion is a clear and present danger, or rather, a current and ongoing disaster. Fighting religion’s directly harmful effects takes precedence over its use as a canary or experimental indicator. But even if Dawkins, and Dennett, and Harris, and Hitchens, should somehow win utterly and absolutely to the last corner of the human sphere, the real work of rationalists will be only just beginning.

Why aren’t “rationalists” surrounded by a visible aura of formidability? Why aren’t they found at the top level of every elite selected on any basis that has anything to do with thought? Why do most “rationalists” just seem like ordinary people, perhaps of moderately above-average intelligence, with one more hobbyhorse to ride? Of this there are several answers; but one of them, surely, is that they have received less systematic training of rationality in a less systematic context than a first-dan black belt gets in hitting people.

Why Our Kind Can’t Cooperate

There’s a saying I sometimes use: “It is dangerous to be half a rationalist.” For example, I can think of ways to sabotage someone’s intelligence by selectively teaching them certain methods of rationality. Suppose you taught someone a long list of logical fallacies and cognitive biases, and trained them to spot those fallacies and biases in other people’s arguments. But you are careful to pick those fallacies and biases that are easiest to accuse others of, the most general ones that can easily be misapplied. And you do not warn them to scrutinize arguments they agree with just as hard as they scrutinize incongruent arguments for flaws. So they have acquired a great repertoire of flaws of which to accuse only arguments and arguers who they don’t like. This, I suspect, is one of the primary ways that smart people end up stupid. (And note, by the way, that I have just given you another Fully General Counterargument against smart people whose arguments you don’t like.) Similarly, if you wanted to ensure that a group of “rationalists” never accomplished any task requiring more than one person, you could teach them only techniques of individual rationality, without mentioning anything about techniques of coordinated group rationality.

Here I want to focus on what you might call the culture of disagreement, or even the culture of objections, which is one of the two major forces preventing the technophile crowd from coordinating.

Yes, a group that can’t tolerate disagreement is not rational. But if you tolerate only disagreement—if you tolerate disagreement but not agreement—then you also are not rational. You’re only willing to hear some honest thoughts, but not others. You are a dangerous half-a-rationalist. We are as uncomfortable together as flying-saucer cult members are uncomfortable apart. That can’t be right either. Reversed stupidity is not intelligence.

Doing worse with more knowledge means you are doing something very wrong. You should always be able to at least implement the same strategy you would use if you were ignorant, and preferably do better. You definitely should not do worse. If you find yourself regretting your “rationality” then you should reconsider what is rational.

We would seem to be stuck in an awful valley of partial rationality where we end up more poorly coordinated than religious fundamentalists, able to put forth less effort than flying-saucer cultists. True, what little effort we do manage to put forth may be better-targeted at helping people rather than the reverse—but that is not an acceptable excuse.

Our culture puts all the emphasis on heroic disagreement and heroic defiance, and none on heroic agreement or heroic group consensus. We signal our superior intelligence and our membership in the nonconformist community by inventing clever objections to others’ arguments. Perhaps that is why the technophile / Silicon Valley crowd stays marginalized, losing battles with less nonconformist factions in larger society. No, we’re not losing because we’re so superior, we’re losing because our exclusively individualist traditions sabotage our ability to cooperate.

The other major component that I think sabotages group efforts in the technophile community is being ashamed of strong feelings. We still have the Spock archetype of rationality stuck in our heads, rationality as dispassion. Or perhaps a related mistake, rationality as cynicism—trying to signal your superior world-weary sophistication by showing that you care less than others. Being careful to ostentatiously, publicly look down on those so naive as to show they care strongly about anything.

The best informal definition I’ve ever heard of rationality is “That which can be destroyed by the truth should be.” We should aspire to feel the emotions that fit the facts, not aspire to feel no emotion. If an emotion can be destroyed by truth, we should relinquish it. But if a cause is worth striving for, then let us by all means feel fully its importance.

If the nonconformist crowd ever wants to get anything done together, we need to move in the direction of joining groups and staying there at least a little more easily. Even in the face of annoyances and imperfections! Even in the face of unresponsiveness to our own better ideas!

I suspect that the largest step rationalists could take toward matching the per-capita power output of the Catholic Church would be to have regular physical meetings of people contributing to the same task—just for purposes of motivation. In the absence of that... We could try for a group norm of being openly allowed—nay, applauded—for caring strongly about something. And a group norm of being expected to do something useful with your life—contribute your part to cleaning up this world. Religion doesn’t really emphasize the getting-things-done aspect as much.

Should the Earth last so long, I would like to see, as the form of rationalist communities, taskforces focused on all the work that needs doing to fix up this world. Communities in any geographic area would form around the most specific cluster that could support a decent-sized band. If your city doesn’t have enough people in it for you to find 50 fellow Linux programmers, you might have to settle for 15 fellow open-source programmers... or in the days when all of this is only getting started, 15 fellow rationalists trying to spruce up the Earth in their assorted ways.

Money: The Unit of Caring

In our society, this common currency of expected utilons is called “money.” It is the measure of how much society cares about something. This is a brutal yet obvious point, which many are motivated to deny.

There is this very, very old puzzle/observation in economics about the lawyer who spends an hour volunteering at the soup kitchen, instead of working an extra hour and donating the money to hire someone to work for five hours at the soup kitchen. There’s this thing called “Ricardo’s Law of Comparative Advantage.” There’s this idea called “professional specialization.” There’s this notion of “economies of scale.” There’s this concept of “gains from trade.” The whole reason why we have money is to realize the tremendous gains possible from each of us doing what we do best. This is what grownups do. This is what you do when you want something to actually get done. You use money to employ full-time specialists.
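The arithmetic behind the lawyer example is worth spelling out. With hypothetical wage figures matching the passage's five-to-one ratio:

```python
# Hypothetical wages chosen to match the passage's 5:1 ratio.
lawyer_wage  = 100.0  # dollars the lawyer earns per extra hour worked
kitchen_wage = 20.0   # dollars per hour of hired soup-kitchen labor

# One hour volunteered in person buys exactly 1 hour of kitchen labor.
hours_by_volunteering = 1.0

# One hour billed and donated buys lawyer_wage / kitchen_wage hours.
hours_by_donating = lawyer_wage / kitchen_wage

print(hours_by_donating)  # → 5.0
```

Whenever the wage ratio exceeds one, the donated hour funds more soup-kitchen labor than the volunteered hour supplies; that ratio is the gain from comparative advantage.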

Yes, frugality is a virtue. Yes, spending money hurts. But in the end, if you are never willing to spend any units of caring, it means you don’t care.

Purchase Fuzzies and Utilons Separately

If I had to give advice to some new-minted billionaire entering the realm of charity, my advice would go something like this:

  • To purchase warm fuzzies, find some hard-working but poverty-stricken woman who’s about to drop out of state college after her husband’s hours were cut back, and personally, but anonymously, give her a cashier’s check for $10,000. Repeat as desired.

  • To purchase status among your friends, donate $100,000 to the current sexiest X-Prize, or whatever other charity seems to offer the most stylishness for the least price. Make a big deal out of it, show up for their press events, and brag about it for the next five years.

  • Then—with absolute cold-blooded calculation—without scope insensitivity or ambiguity aversion—without concern for status or warm fuzzies—figuring out some common scheme for converting outcomes to utilons, and trying to express uncertainty in percentage probabilities—find the charity that offers the greatest expected utilons per dollar. Donate up to however much money you wanted to give to charity, until their marginal efficiency drops below that of the next charity on the list.

I would furthermore advise the billionaire that what they spend on utilons should be at least, say, 20 times what they spend on warm fuzzies—5% overhead on keeping yourself altruistic seems reasonable, and I, your dispassionate judge, would have no trouble validating the warm fuzzies against a multiplier that large. Save that the original fuzzy act really should be helpful rather than actively harmful.
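The rule in the list above, donating to the top charity until its marginal efficiency drops below the next one's, is a greedy allocation over diminishing-returns tiers. Here is a minimal sketch; the charity names, tier capacities, and utilon figures are all invented:

```python
def allocate(budget, charities):
    """Greedy allocation: fund the highest marginal-utilons-per-dollar
    tier first, moving on once efficiency drops below the next tier.

    `charities` maps name -> list of (capacity_dollars, utilons_per_dollar)
    tiers, listed in diminishing-returns order. All numbers hypothetical.
    """
    tiers = [(upd, cap, name)
             for name, ts in charities.items()
             for cap, upd in ts]
    tiers.sort(reverse=True)  # best marginal efficiency first
    grants, utilons = {}, 0.0
    for upd, cap, name in tiers:
        if budget <= 0:
            break
        spend = min(cap, budget)
        grants[name] = grants.get(name, 0) + spend
        utilons += spend * upd
        budget -= spend
    return grants, utilons

charities = {
    "NetFund":  [(50_000, 3.0), (100_000, 1.0)],  # diminishing returns
    "VaxDrive": [(80_000, 2.0), (200_000, 0.5)],
}
grants, total = allocate(100_000, charities)
print(grants, total)  # → {'NetFund': 50000, 'VaxDrive': 50000} 250000.0
```

Note that the budget splits across charities exactly when the first charity's marginal efficiency falls below the second's, which is the behavior the bullet point describes.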

The main lesson is that all three of these things—warm fuzzies, status, and expected utilons—can be bought far more efficiently when you buy separately, optimizing for only one thing at a time.

Of course, if you’re not a millionaire or even a billionaire—then you can’t be quite as efficient about things, can’t so easily purchase in bulk. But I would still say—for warm fuzzies, find a relatively cheap charity with bright, vivid, ideally in-person and direct beneficiaries. Volunteer at a soup kitchen. Or just get your warm fuzzies from holding open doors for little old ladies. Let that be validated by your other efforts to purchase utilons, but don’t confuse it with purchasing utilons. Status is probably cheaper to purchase by buying nice clothes. And when it comes to purchasing expected utilons—then, of course, shut up and multiply.

Practical Advice Backed by Deep Theories

I think that the advice I need is from someone who reads up on a whole lot of experimental psychology dealing with willpower, mental conflicts, ego depletion, preference reversals, hyperbolic discounting, the breakdown of the self, picoeconomics, et cetera, and who, in the process of overcoming their own akrasia, manages to understand what they did in truly general terms—thanks to experiments that give them a vocabulary of cognitive phenomena that actually exist, as opposed to phenomena they just made up. And moreover, someone who can explain what they did to someone else, thanks again to the experimental and theoretical vocabulary that lets them point to replicable experiments that ground the ideas in very concrete results, or mathematically clear ideas.

Practical advice really, really does become a lot more powerful when it’s backed up by concrete experimental results, causal accounts that are actually true, and math validly interpreted.