Metamagical Themas: Questing for the Essence of Mind and Pattern - by Douglas Hofstadter

A healthy sentence is one that, so to speak, practices what it preaches, whereas a neurotic sentence is one that says one thing while doing its opposite. Alan Auerbach has given us a good example in each category. His healthy sentence is: "Terse!" His neurotic sentence is: "Proper writing—and you've heard this a million times—avoids exaggeration."

Ashleigh Brilliant is the inventor of a vast number of aphorisms he calls "potshots", many of which have become very popular phrases in this country. For some reason, he has a self-imposed limit of seventeen words per potshot. A few typical potshots (all taken from his four books listed in the Bibliography) are:

  • What would life be, without me?
  • As long as I have you, I can endure all the troubles you inevitably bring.
  • Remember me? I'm the one who never made any impression on you.
  • Why does trouble always come at the wrong time?
  • Due to circumstances beyond my control, I am master of my fate and captain of my soul.

Hofstadter's Law states: "It always takes longer than you think it will take, even if you take into account Hofstadter's Law."

Chunking is the perception as a whole of an assembly of many parts. An excellent example is the difference between 100 pennies and the concept of one dollar. We would find it exceedingly hard to deal with the prices of cars and houses and computers if we always had to express them in pennies. A dollar has psychological reality, in that we usually do not break it down into its pieces. The concept is valuable for that very reason. It seems to me a pity that the monetary chunking process stops at the dollar level. We have inches, feet, yards, miles. Why could we not have pennies, dollars, grands, megs, gigs? We might be better able to digest newspaper headlines if they were expressed in terms of such chunked units —provided that those units had come to mean something to us, as such. We all have a pretty good grasp of the notion of a grand. But what can a meg or a gig buy you these days? How many megs does it take to build a high school? How many gigs is the annual budget of your state? Most numerically-oriented people, in order to answer these questions, will have to resort to calculation. They do not have such concepts at their mental fingertips. But in a numerate populace, everyone should. It should be a commonplace that a new high school equals about 20 megs, a state budget several gigs, and so on. These terms should not be thought of as shorthand for "million dollars" and "billion dollars" any more than "dollar" is a shorthand for "100 cents". They should be autonomous concepts— mental "nodes"—with information and associations dangling from them without any need for conversion to some other units or calculation of any sort. If that kind of direct sense of certain big numbers were available, then we would have a much more concrete grasp on what otherwise are nearly hopeless abstractions.
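
Purely for illustration (and not anything from the original essay), the chunking idea can be written down in a few lines of Lisp; the unit thresholds, the fallback to cents, and the function names below are invented for this sketch.

    ;; A sketch of monetary chunking: express an amount in the largest unit
    ;; that fits.  Thresholds and names are illustrative assumptions.
    (defparameter *chunks*
      '((1000000000 "gigs")
        (1000000    "megs")
        (1000       "grands")
        (1          "dollars")))

    (defun chunked (amount)
      "Express AMOUNT, a number of dollars, in the largest chunked unit that fits."
      (dolist (chunk *chunks* (format nil "~D cents" (round (* 100 amount))))
        (destructuring-bind (size name) chunk
          (when (>= amount size)
            (return (format nil "~,1F ~A" (float (/ amount size)) name))))))

    ;; (chunked 20000000)   => "20.0 megs"   roughly one new high school
    ;; (chunked 3500000000) => "3.5 gigs"    a state budget, give or take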

Why is gender, and gender alone, such a crucial variable? Think of the street sign that shows a man in silhouette walking across the street, intended to tell you "Pedestrian Crossing" in sign language. What if it were recognizably a woman walking across the street? Since it violates the standard default assumption that people have for people, it would immediately arouse a kind of suspicion: "Hmm... 'Women Crossing'? Is there a nunnery around here?" This would be the reaction not merely of dyed-in-the-wool sexists, but of anyone who grew up in our society, where women are portrayed—not deliberately or consciously, but ubiquitously and subliminally—as "exceptions". If I write, "In the nineteenth century, the kings of nonsense were Edward Lear and Lewis Carroll", people will with no trouble get the message that those two men were the best of all nonsense writers at that time. But now consider what happens if I write, "The queen of twentieth-century nonsense is Gertrude Stein". The implication is unequivocal: Gertrude Stein is, among female writers of nonsense, the best. It leaves completely open her ranking relative to males. She might be way down the list! Now isn't this preposterous? Why is our language so asymmetric? This is hardly chivalry—it is utter condescension. A remarkable and insidious slippery-slope phenomenon is what has happened recently to formerly all-women's colleges that were paired with formerly all-men's colleges, such as Pembroke and Brown, Radcliffe and Harvard, and so on. As the two merged, the women's school gradually faded out of the picture. Do men now go to Radcliffe or Pembroke or Douglass? Good God, no! But women are proud to go to Harvard and Brown and Rutgers. Sometimes, the women's college keeps some status within the larger unit, but that larger unit is always named after the men's college. In a weird twist on this theme, Stanford University has no sororities at all—but guess what kinds of people it now allows in its fraternities! Another pernicious slippery slope has arisen quite recently. That is the one involving "gay" as both masculine and generic, and "Lesbian" as feminine. What is problematic here is that some people are very conscious of the problem, and refuse to use "gay" as a generic, replacing it with "gay or Lesbian" or "homosexual". (Thus there are many "Gay and Lesbian Associations".) Other people, however, have eagerly latched onto "gay" as a generic and use it freely that way, referring to "gay people", "gay men", "gay women", "gay rights", and so on. As a consequence, the word "gay" has a much broader flavor to it than does "Lesbian". What does "the San Francisco gay community" conjure up? Now replace "gay" by "Lesbian" and try it again. The former image probably is capable of flitting between that of both sexes and that of men only, while the latter is certainly restricted to women. The point is simply that men are made to seem standard, ordinary, somehow proper; women as special, deviant, exceptional. That is the essence of the slippery slope.

Variations on a Theme as the Crux of Creativity

Einstein didn't go around racking his brain, muttering to himself, "How, oh how, can I come up with a Great Idea?" Like Einstein (although perhaps on a lesser scale), Eve never needs to ask herself, "Hmm, let's see, shall I try to figure out some way to spin off a variation on this object sitting here in front of me?" No; she just does what comes naturally. The bottom line is that invention is much more like falling off a log than like sawing one in two. Despite Thomas Alva Edison's memorable remark, "Genius is 1 percent inspiration and 99 percent perspiration", we're not all going to become geniuses simply by sweating more or resolving to try harder. A mind follows its path of least resistance, and it's when it feels easiest that it is most likely being its most creative. Or, as Mozart used to say, things should "flow like oil"—and Mozart ought to know! Trying harder is not the name of the game; the trick is getting the right concept to begin with, so that making variations on it is like taking candy from a baby. Uh-oh—now I've given the cat away! So let me boldly state the thesis that I shall now elaborate: Making variations on a theme is really the crux of creativity. On the face of it, this thesis is crazy. How can it possibly be true? Aren't variations simply derivative notions, never truly original creations? Isn't the notion of a 4x4x4 cube simply a result of "twiddling a knob" on the concept of Rubik's-Cubicity? You merely twist the knob from its "factory setting" of 3 to the new setting of 4, and presto—you've got it! An inner voice protests: "That's just too easy. That's certainly not where Rubik's Cube, the Rite of Spring, relativity, or Romeo and Juliet came from, is it?"

There is a way that concepts have of "slipping" from one into another, following a quite unpredictable path. Careful observation and theorizing about such slippages affords us perhaps our best chance to probe deeply into the hidden murk of our conceptual networks. An example of such a slip is furnished to us whenever we make a typo or a grammatical mistake, utter a malapropism ("She's just back from a one-year stench at Berkeley") or a malaphor (a novel phrase concocted unconsciously from bits and pieces of other phrases, such as "He's such an easy-go-lucky fellow" or "Uh-oh, now I've given the cat away"), or confuse two concepts at a deeply semantic level (e.g., saying "Tuesday" but meaning "February", or saying "midnight" in lieu of "zero degrees"). These types of slip are totally accidental and come straight out of our unconscious mind. However, sometimes a slippage can be nonaccidental yet still come from the unconscious mind. By "nonaccidental" here, I do not mean to imply that the slip is deliberate. It's not that we say to ourselves, "I think I shall now slip from one concept into a variation of it"; indeed, that kind of deliberate, conscious slippage is most often quite uninspired and infertile. "How to Think" and "How to Be Creative" books—even very thoughtful ones such as George Pólya's How to Solve It—are, for that reason, of little use to the would-be genius. Strange though it may sound, nondeliberate yet nonaccidental slippage permeates our mental processes, and is the very crux of fluid thought. That is my firmly held conviction.

We should use the power of computers to aid us in seeing the full concept—the implicit "sphere of hypothetical variations"—surrounding any static, frozen perception. I have concocted a playful name for this imaginary sphere: I call it the implicosphere, which stands for implicit counterfactual sphere, referring to things that never were but that we cannot help seeing anyway. (The word can also be taken as referring to the sphere of implications surrounding any given idea.) If we wish to enlist computers as our partners in this venture of inventing variations on a theme, which is to say, turning implicospheres into "explicospheres", we have to give them the ability to spot knobs themselves, not just to accept knobs that we humans have spotted. To do this we will have to look deeply into the nature of "slippability", into the fine-grained structure of those networks of concepts in human minds.

So let me present a few more examples of slippage, made by taking a new notion and slipping some of its parts in their simplest ways. The notion I have chosen is that of yourself sitting there, reading this very column at this very moment. Here are some elements of the implicosphere of that concept:

  • You are almost reading the September 1981 issue of Scientific American.
  • You are almost reading a piece by Richard Hofstadter, the historian.
  • You are almost reading a column by Martin Gardner.
  • Your identical twin is almost reading this column.
  • You are almost reading this column in French.
  • You are almost reading Gödel, Escher, Bach.
  • You are almost reading a letter from me.
  • You are almost writing this column.
  • You are almost hearing my voice.
  • I am almost talking to you.
  • You are almost ready to throw this copy of Mad magazine out in disgust.

By now, the original concept is almost lost in a silly sea of "almost" variations—but it has been enriched by this exploration, and when you come back to it, it will have been that much more reified as a stand-alone concept, a single entity rather than a compound entity. After a while, under the proper triggering circumstances, this very example may be retrieved from memory as naturally and effortlessly as the concept of "fish" is. This is an important idea: the test of whether a concept has really come into its own, the test of its genuine mental existence, is its retrievability by that process of unconscious recall. That's what lets you know that it has been firmly planted in the soil of your mind. It is not whether that concept appears to be "atomic", in the sense that you have a single word to express it by. That is far too superficial.
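
The one-slot-at-a-time slipping shown above can be caricatured in a few lines of Lisp (a sketch added for illustration, not anything from the original column; the slots and alternative fillers are invented stand-ins, and a real implicosphere is nothing so tidy):

    ;; A caricature of knob-twiddling: a frozen perception is a set of slots
    ;; with fillers; its "almost" variations differ from it in exactly one slot.
    (defparameter *reading-now*
      '((reader . you) (activity . reading) (author . douglas-hofstadter)
        (language . english)))

    (defparameter *nearby-fillers*
      ;; for each slot, a few fillers it could "almost" have had
      '((reader   your-identical-twin)
        (activity writing hearing)
        (author   richard-hofstadter martin-gardner)
        (language french)))

    (defun almost-variations (frame fillers)
      "Return frames differing from FRAME in exactly one slot: a crude implicosphere."
      (loop for (slot . alternatives) in fillers
            append (loop for alt in alternatives
                         collect (cons (cons slot alt)
                                       (remove slot frame :key #'car)))))

    ;; (almost-variations *reading-now* *nearby-fillers*)
    ;; => (((READER . YOUR-IDENTICAL-TWIN) (ACTIVITY . READING) ...)
    ;;     ((ACTIVITY . WRITING) (READER . YOU) ...)  ...and so on)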

I stated a thesis: that the crux of creativity resides in the ability to manufacture variations on a theme. I hope now to have sufficiently fleshed out this thesis that you understand the full richness of what I meant when I said "variations on a theme". The notion encompasses knobs, parameters, slippability, counterfactual conditionals, subjunctives, "almost"-situations, implicospheres, conceptual skeletons, mental reification, memory retrieval—and more. The question may persist in your mind: Aren't variations on a theme somehow trivial, compared to the invention of the theme itself? This leads one back to that seductive notion that Einstein and other geniuses are "cut from a different cloth" from ordinary mortals, or at least that certain cognitive acts done by them involve principles that transcend the everyday ones. This is something I do not believe at all.

My own mental image of the creative process involves viewing the organization of a mind as consisting of thousands, perhaps millions, of overlapping and intermingling implicospheres, at the center of each of which is a conceptual skeleton. The implicosphere is a flickering, ephemeral thing, a bit like a swarm of gnats around a gas-station light on a hot summer's night, perhaps more like an electron cloud, with its quantum-mechanical elusiveness, about a nucleus, blurring out and dying off the further removed from the core it is.

Making variations is not just twiddling a knob before you; part of the act is to manufacture the knob yourself. Where does a knob come from? The question amounts to asking: How do you see a variable where there is actually a constant? More specifically: What might vary, and how might it vary? It's not enough to just have the desire to see something different from what is there before you. Often the dullest knobs are a result of someone's straining to be original, and coming up with something weak and ineffective. So where do good knobs come from? I would say they come from seeing one thing as something else. Once an abstract connection is set up via some sort of analogy or reminding-incident, then the gate opens wide for ideas to slosh back and forth between the two concepts.

Serendipitous observation and quick exploration of potential are vital elements in the making of a knob. What goes hand in hand with the willingness to playfully explore a serendipitous connection is the willingness to censor or curtail an exploration that seems to be leading nowhere. It is the flip side of the risk-taking aspect of serendipity. It's fine to be reminded of something, to see an analogy or a vague connection, and it's fine to try to map one situation or concept onto another in the hopes of making something novel emerge—but you've also got to be willing and able to sense when you've lost the gamble, and to cut your losses. One of the problems with the ever-popular self-help books on how to be creative is that they all encourage "off-the-wall" thinking (under such slogans as "lateral thinking", "conceptual blockbusting", "getting whacked on the head", etc.) while glossing over the fact that most off-the-wall connections are of very little worth and that one could waste lifetimes just toying with ideas in that way. One needs something much more reliable than a mere suggestion to "think zany, out-of-the-system thoughts". Frantic striving to be original will usually get you nowhere. Far better to relax and let your perceptual system and your category system work together unconsciously, occasionally coming up with unbidden connections. At that point, you—the lucky owner of the mind in question—can seize the opportunity and follow out the proffered hint.

When I first heard the French saying Plus ça change, plus c'est la même chose, it struck me as annoyingly nonsensical: "The more it changes, the samer it gets" (in my own colloquial translation). I was not amused, but nonetheless it stuck in my mind for years, and finally it dawned on me that it was full of meanings. My favorite way of interpreting it is this. The more different manifestations you observe of one phenomenon, the more deeply you understand that phenomenon, and therefore the more clearly you can see the vein of sameness running through all those different things. Or put another way, experience with a wide variety of things refines your category system and allows you to make incisive, abstract connections based on deep shared qualities. A more cynical way of putting it, and probably more in line with the intended meaning, would be that superficially different things are often boringly the same. But the saying need not be taken cynically. Seeing clear to the essence of something unfamiliar is often best achieved by finding one or more known things that you can see it as, then being able to balance these views.

Once you have decided to try out a new way of viewing a phenomenon, you can let that view suggest a set of knobs to vary. The act of varying them will lead you down new pathways, generating new images ripe for perception in their own right. This sets up a closed loop:

  • fresh situations get unconsciously framed in terms of familiar concepts;

  • those familiar concepts come equipped with standard knobs to twiddle;

  • twiddling those knobs carries you into fresh new conceptual territory.

A visual image that I always find coming back in this context is that of a planet orbiting a star, its orbit bringing it so close to another star that it gets "captured" and begins orbiting the second star. As it swings around the new star, perhaps it finds itself coming very close to yet another star, and, fickle as ever, changes allegiance. And thus it do-si-do's its way around the universe! The mental analogue of such stellar peregrinations is what the loop above attempts to convey. You can think of concepts as stars, and knob-twiddling as carrying you from one point on an orbit to another point. If you twiddle enough, you may well find yourself deep within the attractive zone of an unexpected but interesting concept and be captured by it. You may thus migrate from concept to concept. In short, knob-twiddling is a device that carries you from one concept to another, taking advantage of their overlapping orbits.

Slippage of thought is a remarkably invisible phenomenon, given its ubiquity. People simply don't recognize how curiously selective they are in their "choice" of what is and what is not a hingepoint in how they think of an event. It all seems so natural as to require no explanation. I dropped a slice of pizza on the floor of a pizza place the other evening. My friend Don, who was less hungry than I was, immediately sympathized, saying, "Too bad I didn't drop one of my pieces—or that you didn't drop one of mine instead of one of yours." Sounds sensible. But why didn't he say, "Too bad the pizza isn't larger"? His choice revealed that to his unconscious mind, it seemed sensible to switch the role-filler in a given event, as if to imply that a pizza-slice-droppage had been in the cards for that evening, that God had flipped a coin and, unluckily for me, it had come out with me as the dropper instead of Don—but that it might have come out the other way around. Some hypothetical replacement scenarios—I like to call them "subjunctive instant replays"—are compelling, and come to mind by reflex. They are not idle musings but very natural human emotional responses to a common type of occurrence. Other subjunctive instant replays have little intuitive appeal and seem far-fetched, although it is hard to say just why. Consider the following list:

  • Too bad they didn't give us a replacement piece.
  • Lucky we weren't in a really fancy restaurant.
  • Too bad gravity isn't weaker, so that you could have caught it before it hit the ground.
  • Lucky it wasn't a beaker filled with poison.
  • Too bad it wasn't a fork.
  • Lucky it wasn't a piece of good china.
  • Too bad eating off floors isn't hygienic.
  • Lucky you didn't drop the whole pizza.
  • Too bad it wasn't the people at the next table who dropped their pizza.
  • Lucky there was no carpet in here.
  • Too bad you were the hungry one, rather than me.

I'll leave it to you to generate other subjunctive instant replays that he might have come up with. There is a rough rank ordering to them, in terms of plausibility of springing to mind. It's the rhyme and reason behind that ordering that fascinates me.

There is such a thing, ephemeral though it may be, as "Art Deco spirit", just as there is undeniably such a thing as "French spirit" in music or "impressionistic spirit" in art. (Marcia Loeb has recently designed a whole series of typefaces in the Art Deco style, in case anyone doubts that the spirit of those times can be captured. And then there is the book Zany Afternoons by Bruce McCall, in which the entire spirit of several recent decades is wonderfully spoofed on all stylistic levels simultaneously.) Stylistic moods permeate whole periods and cultures, and they indirectly determine the kinds of creations—artistic, scientific, technological—that people in them come up with. They exert gentle but definite "downward" pressures. As a consequence, not only are the alphabets of a given period and area distinctive, but one can even recognize "the same spirit" in such things as teapots, coffee cups, furniture, automobiles, architecture, and so on, as Donald Bush clearly demonstrates in his book The Streamlined Decade.

People tend to think that only extreme versions of things pose deep problems. That's why few people see modeling the creativity of, say, the trite television character of Archie Bunker as a difficult task. It's strange and disorienting to realize that if we could write a program that could compose Muzak or write trashy novels, we would be 99 percent of the way to mechanizing Mozart and Einstein. Even a program that could act like a mentally retarded person would be a huge advance. The commonest mental abilities—not the rarest ones—are still the central mystery. John McCarthy, one of the founders of the field of artificial intelligence, is fond of talking of the day when we'll have "kitchen robots" to do chores for us, such as fixing a lovely Shrimp Creole. Such a robot would, in his view, be exploitable like a slave because it would not be conscious in the slightest. To me, this is incomprehensible. Anything that could get along in the unpredictable kitchen world would be as worthy of being considered conscious as would a robot that could survive for a week in the Rockies. To me, both worlds are incredibly subtle and potentially surprise-filled. Yet I suspect that McCarthy thinks of a kitchen as Sampson thinks of book faces: as some sort of simple and "closed" world, in contrast to "open-ended" worlds, such as the Rockies. This is just another example, in my opinion, of vastly underestimating the complexity of a world we take for granted, and thus underestimating the complexity of the beings that could get along in such a world.

Lisp: Atoms and Lists

Although there is among [AI researchers] a considerable divergence of opinion concerning the best route to AI, one thing that is nearly unanimous is the choice of programming language. Most AI research efforts are carried out in a language called "Lisp". (The name is not quite an acronym; it stands for "list processing".) Why is most AI work done in Lisp? There are many reasons, most of which are somewhat technical, but one of the best is quite simple: Lisp is crisp.

The heart of Lisp is its manipulable structures. All programs in Lisp work by creating, modifying, and destroying structures. Structures come in two types: atomic and composite, or, as they are usually called, atoms and lists. Thus, every Lisp object is either an atom or a list (but not both). The only exception is the special object called nil, which is both an atom and a list.
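
A handful of expressions typed to a Lisp listener make the division, and nil's dual citizenship, concrete. (Rendered here in standard Common Lisp.)

    (atom 'shampoo)     ; => T    a symbol is an atom
    (atom 3.14)         ; => T    so is a number
    (listp '(a b c))    ; => T    a three-element list
    (atom '(a b c))     ; => NIL  ...and therefore not an atom
    (atom nil)          ; => T    nil is an atom...
    (listp nil)         ; => T    ...and a list as well, the lone exception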

Lists are the flexible data structures of Lisp. A list is pretty much what it sounds like: a collection of some parts in a specific order. The parts of a list are usually called its elements or members. What can these members be? Well, not surprisingly, lists can have atoms as members. But just as easily, lists can contain lists as members, and those lists can in turn contain other lists as members, and so on, recursively.

Then there is the empty list—the list with no elements at all. How is this written down? You might think that an empty pair of parentheses—()— would work. Indeed, it will work—but there is a second way of indicating the empty list, and that is by writing nil. The two notations are synonymous, although nil is more commonly written than () is. The empty list, nil, is a key concept of Lisp; in the universe of lists, it is what zero is in the universe of numbers.
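
The analogy between nil and zero is more than poetic; in standard Common Lisp it can be typed in directly: appending the empty list is an identity operation, just as adding zero is, and the two spellings name one and the same object.

    (eq nil '())           ; => T        nil and () are the very same object
    (null '())             ; => T        the emptiness test accepts either spelling
    (append '() '(a b c))  ; => (A B C)  appending nil changes nothing,
    (+ 0 7)                ; => 7        just as adding zero changes nothing
    (length nil)           ; => 0        and the empty list has length zero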

Psychologically, one of the great powers of programming is the ability to define new compound operations in terms of old ones, and to do this over and over again, thus building up a vast repertoire of ever more complex operations. It is quite reminiscent of evolution, in which ever more complex molecules evolve out of less complex ones, in an ever-upward spiral of complexity and creativity. It is also quite reminiscent of the industrial revolution, in which people used very simple early machines to help them build more complex machines, then used those in turn to build even more complex machines, and so on, once again in an ever-upward spiral of complexity and creativity. At each stage, whether in evolution or revolution, the products get more flexible and more intricate, more "intelligent" and yet more vulnerable to delicate "bugs" or breakdowns. Likewise with programming in Lisp, only here the "molecules" or "machines" are now Lisp functions defined in terms of previously known Lisp functions.
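
Here, as an added illustration (not an example from the columns), is the smallest possible tower of that kind in standard Common Lisp; the particular functions matter not at all, only the way each one rests on the ones defined before it.

    ;; Each definition uses only functions already in the repertoire.
    (defun square (n)
      (* n n))

    (defun sum-of-squares (a b)
      (+ (square a) (square b)))

    (defun hypotenuse (a b)
      "Built on SUM-OF-SQUARES, which was built on SQUARE, which was built on *."
      (sqrt (sum-of-squares a b)))

    ;; (hypotenuse 3 4) => 5.0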

One goal that has seemed to some people to be both desirable and feasible using Lisp and related programming languages is (1) to make every single statement return a value and (2) to have it be through this returned value and only through it that the statement has any effect. The idea of (1) is that values are handed "upward" from the innermost function calls to the outermost ones, until the full statement's value is returned to you. The idea of (2) is that during all these calls, no atom has its value changed at all (unless the atom is a dummy variable). Such behavior is to be contrasted with that of functions that leave "side effects" in their wake. Such side effects are usually in the form of changed variable bindings, although there are other possibilities, such as causing input or output to take place.
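
The contrast can be shown in miniature (again in standard Common Lisp, as an added illustration). The first function below does nothing but hand a value upward; the second also returns a value, but its real work is the change it quietly leaves behind in a global variable.

    ;; Pure style: the call's only effect is its returned value.
    (defun double (n)
      (* 2 n))

    ;; Side-effecting style: the returned value is the same, but the binding
    ;; of *CALLS-SO-FAR* has been changed behind the scenes.
    (defvar *calls-so-far* 0)

    (defun double-and-count (n)
      (setq *calls-so-far* (+ *calls-so-far* 1))   ; the side effect
      (* 2 n))                                     ; the returned value

    ;; (double 21)           => 42, and nothing else happens
    ;; (double-and-count 21) => 42, but *calls-so-far* has crept up by one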

Certain types of inert, or passive, information-containing data structures are sometimes referred to as declarative knowledge—"declarative" because they often have a form abstractly resembling that of a declarative sentence, and "knowledge" because they encode facts about the world in some way, accessible by looking in an index in somewhat the way "book-learned" facts are accessible to a person. By contrast, animate, or active, pieces of code are referred to as procedural knowledge—"procedural" since they define sequences of actions ("procedures") that actually manipulate data structures, and "knowledge" since they can be viewed as embodying the program's set of skills, something like a human's unconscious skills that were once learned through long, rote drill sessions. (Sometimes these contrasting knowledge types are referred to as "knowledge that" and "knowledge how".) This distinction should remind biologists of that between genes— relatively inert structures inside the cell—and enzymes, which are anything but inert. Enzymes are the animate agents of the cell; they transform and manipulate all the inert structures in indescribably sophisticated ways. Moreover, Lisp's loop of program and data should remind biologists of the way that genes dictate the form of enzymes, and enzymes manipulate genes (among other things). Thus Lisp's procedural-declarative program-data loop provides a primitive but very useful and tangible example of one of the most fundamental patterns at the base of life: the ability of passive structures to control their own destiny, by creating and regulating active structures whose form they dictate.
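
That loop can be seen in its smallest form at any Lisp listener: a list built by ordinary list operations becomes, when handed to the evaluator, a running piece of code. (A minimal Common Lisp illustration added here; the function name is invented.)

    ;; Build a list with ordinary list operations...
    (defvar *recipe* (list '+ 1 2 3))   ; just a four-element list: (+ 1 2 3)

    (first *recipe*)   ; => +    inert data, inspectable like any other list
    (eval *recipe*)    ; => 6    and, handed to EVAL, a running program

    ;; ...and, going the other way, let code manufacture more code:
    (defun adder-form (n)
      "Return a piece of code (a list) that, when evaluated, adds N to 10."
      (list '+ n 10))

    ;; (eval (adder-form 32)) => 42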

Why, in conclusion, is Lisp popular in artificial intelligence? There is no single answer, nor even a simple answer. Here is an attempt at a summary. (1) Lisp is elegant and simple. (2) Lisp is centered on the idea of lists and their manipulation—and lists are extremely flexible, fluid data structures. (3) Lisp code can easily be manufactured in Lisp, and run. (4) Interpreters for new languages can easily be built and experimented with in Lisp. (5) "Swimming" in a Lisp-like environment feels natural for many people. (6) The "recursive spirit" permeates Lisp. Perhaps it is this last, rather intangible, statement that best gets at it. For some reason, many people in artificial intelligence seem to have a deep sense that recursivity, in some form or other, is connected with the "trick" of intelligence. This is a hunch, an intuition, a vaguely felt and slightly mystical belief, one that I certainly share—but whether it will pay off in the long run remains to be seen.
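
As a small token of that recursive spirit, here is the kind of definition every Lisp beginner meets (a standard textbook example, not one from the columns): counting the atoms buried at any depth in a list, by the self-referential observation that a list's atoms are those of its head plus those of its tail.

    (defun count-atoms (s)
      (cond ((null s) 0)                        ; the empty list holds nothing
            ((atom s) 1)                        ; an atom counts itself
            (t (+ (count-atoms (first s))       ; otherwise recur on the head...
                  (count-atoms (rest s))))))    ; ...and on the tail

    ;; (count-atoms '(a (b (c d)) e)) => 5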

I do use Lisp, I do think it is very convenient and natural in many ways, I do advocate that anyone seriously interested in AI learn Lisp well; all this is true, but I do not think that deep study of Lisp is the royal road to AI any more than I think that deep study of bricks is the royal road to understanding architecture. Indeed, I would suggest that the raw materials to be found in Lisp are to AI what raw materials are to architecture: convenient building blocks out of which far larger structures are made. It would be ridiculous for anyone to hope to acquire a deep understanding of what AI is all about without first having a clear, precise understanding of what computers are all about. I know of no shorter cut to that latter goal than the study of Lisp, and that is one reason Lisp is so good for AI students. Beginners in Lisp encounter, and are in a good position to understand, fundamental issues in computer science that even some advanced programmers in other languages may not have encountered or thought about. Such concepts as lists, recursion, side effects, quoting and evaluating pieces of code, and many others that I did not have the space to present in my three columns, are truly central to the understanding of the potential of computing machinery. Moreover, without languages that allow people to deal with such concepts directly, it would be next to impossible to make programs of subtlety, grace, and multi-level complexity. Therefore I advocate Lisp very strongly. It would similarly be next to impossible to build structures of subtlety, grace, and multi-level complexity such as the Golden Gate Bridge and the Empire State Building out of bricks or stone. Until the use of steel as an architectural substrate was established, such constructions were unthinkable. Now we are in a position to erect buildings that use steel in even more sophisticated ways. But steel itself is not the source of great architects' inspiration; it is simply a liberator.

Spirit and Substrate

The world has traditionally been divided into the animate and the inanimate. Inanimate things do not have feelings or wills of their own, and can therefore be smashed, burned, or harnessed by animate ones without the animate ones having to feel guilty. This borderline, so long taken for granted by people, is gradually becoming blurrier with the advent of computers, especially as programs acquire more and more flexibility—and with that flexibility, a seeming mentality or personality. How and when could mind and emotions—surely the essence of the animate—emerge from complex inanimate substrates? What does it take to make spirit out of pure matter pattern? A number of recent artificial-intelligence programs have been touted as "thinking". Yet no one who looked closely could fail to see that there remains a huge gap between human self-aware fluidity and such programs. Even the best of them is still relatively rigid and unaware of anything, let alone itself. But where is the borderline between the highest inanimate flexibility and the lowest animate sentience? When does a system or organism have the right to call itself "I", and to be called "you" by us? Will we be able to recognize systems deserving of our respect when they come along, or will we abuse them? Will such systems have as much free will as we don't?

Through Gödel's and Turing's work, mathematics was revealed to be unmechanizable—or, more precisely, incompletely mechanizable, no matter how complex the machine involved. Though on the surface this defeat for mechanism might seem to imply that human reasoning can always outwit or transcend mechanical imitations, on deeper analysis it turns out that Turing's argument can be applied to humans as well. Consider the yes-no question, "Will your answer to this particular question be 'no'?" You will find that you too go into a sort of computational vertigo in trying to answer it with a "yes" or a "no". This question exemplifies the sort of undecidability problems that Turing showed machines and mathematical systems are subject to. Though the example is simplistic, it reminds us of an essential fact of the human condition—that people, no matter how aware they are of their minds, cannot fully take their own complexity into account in attempting to understand themselves, and, quite like Turing machines baffled by their own descriptions, may be plunged into a vertigo of the psyche when they attempt to calculate their own hypothetical or future acts. Just as people can be surprised by their own complexity, so can machines, in that they can't predict their own behavior. People attribute this feature of themselves to "free will", and speak of "making choices". Turing's observation that machines will go into endless loops when trying to predict their own behavior suggests that a sufficiently complex machine might also come to suffer from that seemingly inevitable human delusion: believing that one has free will and is able to make choices that transcend physical law. Thus Turing's seemingly negative result about machines can be seen as a positive result, in that it sheds new light on how physical objects might reflect on themselves and even consider themselves to be conscious, deliberating beings. A mechanical approach to the mysteries of consciousness was Alan Turing's dream, and probably by the late 1930's he was a believer in the possibility that a properly organized machine could be intelligent, conscious, and have free will—at least to the extent that we or any physical object can do so.

In short, it seems that people who feel that machines—even intelligent ones—will always remain duller than minds are tacitly relying on the following thesis: Creativity is part of the very fabric of all human thought, rather than some esoteric, rare, exceptional, and fluky by-product of the ability to think, which every so often surfaces in places spread far and wide. With this thesis I agree. Where I differ with the antimechanists is over the matter of whether creativity lies beyond intelligence. I see creativity and insight, for machines no less than for people, as intimately bound up with intelligence, so that I cannot imagine a noncreative yet intelligent machine—something that, in order to make a point about what is essentially human, they seem to be willing and able to do. To me, "noncreative intelligence" is a flat-out contradiction in terms.

The gist of my notion is that having creativity is an automatic consequence of having the proper representation of concepts in a mind. It is not something you add on afterward. It is built into the way concepts are. To spell this out more concretely: If you have succeeded in making an accurate model of concepts, you have thereby also succeeded in making a model of the creative process, and even of consciousness.

Only through a deep understanding of the organization of memory—which is to say, only by answering the question "What is a concept?"—will we be able to make models of the creative process. This will be a long and arduous process, not one that will yield answers overnight, or even in a few decades. Nonetheless, we have the right beginnings, in the sciences of cognitive psychology and artificial intelligence. Philosophers of mind and neuroscientists will undoubtedly contribute as well. The union of all these disciplines is called "cognitive science".

A question that arises at the outset is: "What kinds of objects have concepts stored inside them, and what kinds do not?" One of my favorite passages that opens this question wide is in Dean Wooldridge's book Mechanical Man: The Physical Basis of Intelligent Life, and it runs this way:

When the time comes for egg laying, the wasp Sphex builds a burrow for the purpose and seeks out a cricket which she stings in such a way as to paralyze but not kill it. She drags the cricket into the burrow, lays her eggs alongside, closes the burrow, then flies away, never to return. In due course, the eggs hatch and the wasp grubs feed off the paralyzed cricket, which has not decayed, having been kept in the wasp equivalent of a deepfreeze. To the human mind, such an elaborately organized and seemingly purposeful routine conveys a convincing flavor of logic and thoughtfulness—until more details are examined. For example, the wasp's routine is to bring the paralyzed cricket to the burrow, leave it on the threshold, go inside to see that all is well, emerge, and then drag the cricket in. If the cricket is moved a few inches away while the wasp is inside making her preliminary inspection, the wasp, on emerging from the burrow, will bring the cricket back to the threshold, but not inside, and will then repeat the preparatory procedure of entering the burrow to see that everything is all right. If again the cricket is removed a few inches while the wasp is inside, once again she will move the cricket up to the threshold and reenter the burrow for a final check. The wasp never thinks of pulling the cricket straight in. On one occasion this procedure was repeated forty times, with the same result.

One can make the obvious remark that perhaps not the wasp but the experimenter was the one in the rut—but humor aside, this is a rather shocking revelation of the mechanical underpinning, in a living creature, of what looks like quite reflective behavior. There seems to be something supremely unconscious about the wasp's behavior here, something totally opposite to what we feel we are all about, particularly when we talk about our own consciousness. I propose to call the quality here portrayed sphexishness, and its opposite antisphexishness (a vexish word to pronounce!), and then I propose that consciousness is simply the possession of antisphexishness to the highest possible degree.
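
The wasp's routine is, in effect, a loop that keeps no record of its own repetitions. A deliberately crude simulation (added here as a caricature, not Wooldridge's description in code) makes the point: the checking subroutine never asks how many times it has already been run, so a meddling experimenter can invoke it indefinitely.

    ;; A caricature of the Sphex routine.  MEDDLINGS is the number of times
    ;; the experimenter nudges the cricket; the wasp's "program" never asks
    ;; how many inspections it has already made.
    (defun sphexish-provisioning (meddlings)
      (let ((inspections 0))
        (loop
          (format t "Wasp drags the cricket to the threshold.~%")
          (format t "Wasp goes inside to see that all is well.~%")
          (incf inspections)
          (cond ((plusp meddlings)        ; the cricket has been moved a few inches
                 (decf meddlings)
                 (format t "Cricket has moved; start the routine over.~%"))
                (t
                 (format t "Wasp drags the cricket inside after ~D inspections.~%"
                         inspections)
                 (return inspections))))))

    ;; (sphexish-provisioning 40)  ; reproduces the forty-repetition observation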

All human beings have that readiness, that alertness, and that is what makes them so antisphexish. Whenever they get into some kind of "loop", they quickly sense it. Something happens inside their heads—a kind of "loop detector" fires. Or you can think of it as a "rut detector", a "sameness detector"—but no matter how you phrase it, the possession of this ability to break out of loops of all sorts seems the antithesis of the mechanical. Or, to put it the other way around, the essence of the mechanical seems to be in its lack of novelty and its repetitiveness, in its trappedness in some kind of precisely delimited space. This is why the wasp, the dog, even some humans seem so mechanical.

Computers are not inherently bored by adding long columns of numbers, even when all the numbers are the same. But people are. What is the difference? Clearly there is something lacking in the machine that allows it to have this unbounded tolerance for repetitive actions. This thing that is lacking can be described in a few words: It is the ability to watch oneself as one deals with the world, to perceive in one's own activities a pattern, and to be able to do so at many levels of abstraction.

Critical to the way our memory is organized is our automatic mode of storing and retrieving items, our knowledge of when we know and do not know, of how we know or why we wouldn't know. Such aspects of what is sometimes called "metaknowledge" are fluidly integrated into the way our concepts are meshed together. They are not some sort of "extra layer" added on top by a second-generation programmer who decided that metaknowledge is a good thing, over and above knowledge! No, metaknowledge and knowledge are simmering together in a single stew, totally fused and flavoring each other richly. This makes self-watching an automatic consequence of how memory is structured. How is this wondrous stew of antisphexishness realized in the human brain? And how can we create a program that, like a human brain, is all "of a piece", a program that is not simply a stack of ever-higher "other-watchers", but is truly a seamless "self-watcher", where all levels are collapsed into one?
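
By way of contrast, here is exactly the kind of bolted-on watcher that is easy to write and that falls far short of what the question above asks for: a toy "rut detector" (an added sketch, with invented names) that logs an agent's actions and halts when the recent history starts repeating. It is an extra layer, not a seamless self-watcher, but it shows how little machinery even crude loop detection takes.

    ;; A toy rut detector: watch a stream of actions and stop when the most
    ;; recent ones repeat with a short period.  NEXT-ACTION is any function
    ;; of the step number, standing in for "dealing with the world".
    (defun rut-p (history period)
      "True if the last 2*PERIOD actions in HISTORY repeat with that period."
      (and (>= (length history) (* 2 period))
           (equal (subseq history 0 period)
                  (subseq history period (* 2 period)))))

    (defun run-with-rut-detector (next-action &key (max-steps 100))
      (let ((history '()))
        (dotimes (step max-steps :no-rut-detected)
          (push (funcall next-action step) history)
          (when (some (lambda (p) (rut-p history p)) '(1 2 3))
            (return (list :rut-detected-after step))))))

    ;; (run-with-rut-detector (lambda (step) (declare (ignore step))
    ;;                          'add-the-same-column))
    ;; => (:RUT-DETECTED-AFTER 1)   ; it notices the period-1 rut at once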

To the extent of having an individual style, any artist is sphexish—trapped within invisible, intangible, but inescapable boundaries of mental space. But that is nothing to lament. Artists in groups form movements or schools or periods, and what limits one artist need not limit another. Thus, by the fact that its boundaries are wider, a school is less sphexish—more conscious—than any of its members. But even the collective movement of a school of art has its limits, shows its finitude, after a period of time. It begins to wind down, to lose fertility, to stagnate. And a new school begins to form. What no individual can make out clearly is perhaps seen collectively, on the level of a society. Thus art progresses towards an ever wider vision of beauty—a "prospective" vision of beauty—by a series of repeated "diagonalizations": processes of recognizing and breaking out of ruts. As I like to put it, this is the process of jootsing (jumping out of the system) to ever wider worlds. This endless jootsing is a process whose totality (so says Gödel) cannot be formalized, either in a computer or in any finite brain or set of brains. Thus one need not fear that the mechanization of creativity, if ever it comes about, will mark the end of art. Quite the contrary: It is a day to look forward to, for on that day our eyes will open—as will those of computers, to be sure—onto whole new worlds of beauty.

It is funny how certain fads catch on, seemingly for no reason, while other things die, again for no clear reason. We all laugh at the Edsel today—yet what exactly is there to laugh at, except the fact that it did so poorly? What exactly was wrong with the Edsel? What is wrong with those thousands upon thousands of melodies that are composed every year and go nowhere? What made Michael Jackson and Pachelbel's simple Canon all the rage? Why did the typeface Helvetica catch on like wildfire when it was first invented, when a dozen extremely similar ones died on the vine? Why did the typographical gimmick of symmetrically capitalizing both the first and the last letter of a word or title become a sudden vogue about four years ago? Why is it now faddish to write run-on words such as "Intelligenetics" or "PEOPLExpress"? What makes words like "Da-glo", "Turbomatic", and "Rayon" seem slightly dated? Why is "Qantas" still modern-sounding? What is poor about brand names like "Luggo" and "Flimp"? Why are 'x's now so popular in brand names? And yet why would "Goxie" be a weak name compared with, say, "Exigo" or "Xigeo"? Why are the ordinary-seeming names that nasal-voiced comedians Bob and Ray come up with—for example, "Wally Ballou", "Hudley Pierce", "Bodin Pardew", and "John W. Norbis"—apt to evoke snickers? How come Norma Jean Baker changed her name to "Marilyn Monroe"? Why would it not do for a movie star to be named "Arnold Wilberforce"? Why is the name "Tiffany" popular today, and why was "Lisa" so popular a few years earlier? Is something wrong with "Agnes", "Edna", or "Thelma"? With "Clyde", "Lance", or "Bartholomew"? Mere length certainly cannot be the answer (think of "Elizabeth"). Nor can the sound, in any simple sense. (Why is "Lance" bad if "Vance" is okay?) All this may seem a far, far cry from sphexishness and self-watching computers and brains. But what I am getting at is the unbelievable number of forces and factors that interact in our unconscious processing of even very tiny structures composed of discrete parts, such as words and names only a few letters long, let alone melodies several dozen notes long. Most of us could not put our finger on the answers to any of these questions. In fact, nobody could really answer these questions definitively. If we are going to try to get machines to do the subtlest of cognitive tasks, we had jolly well better be able to explain how mere words are appealing or repelling!

Some analogies help, others hinder. Our current mechanisms for analogy-making must certainly have emerged as a consequence of natural selection. Good mechanisms were selected for, bad ones were selected against, way back when, in the old times when you and I were but monkeys and rodents scampering about in tree branches ('member?). The point, then, is that far more than being just a matter of taste, variations in analogy-making skill can spell the difference between life and death. That's why "right answer" means something even for analogies; it's why analogies are only to some degree a matter of taste.

Among the various answers to any analogy problem, some will be decidedly weaker than others, even if you find that no one answer emerges as the clear victor. There is a radius beyond which analogies will be very likely to bring bad consequences to their proposers, at least if they are acted upon. It is for this kind of reason that I unbudgeably believe that there are better and worse answers to analogies, whether in life or in the Copycat domain. Elegance is more than just a frill in life; it is one of the driving criteria behind survival. Elegance is just another way of talking about getting at the essence of situations. If you don't trust the word "elegance" in this context, then you may substitute "compactness", "efficiency", or "generality"—in short, survivability.
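
The difference between better and worse answers is easiest to see in the letter-string world of the Copycat domain. For the problem "abc changes to abd; what does ijk change to?", here are two candidate rules, coded as a toy for illustration (Copycat itself works nothing like this literal code, and the contiguity of letter codes is an assumption of the sketch):

    (defun successor (ch)
      "The next letter of the alphabet (assuming contiguous character codes)."
      (code-char (1+ (char-code ch))))

    (defun rigid-rule (s)
      "Weaker answer: replace the last letter by the literal letter d."
      (concatenate 'string (subseq s 0 (1- (length s))) "d"))

    (defun slipped-rule (s)
      "Stronger answer: replace the last letter by its alphabetic successor."
      (let ((n (1- (length s))))
        (concatenate 'string (subseq s 0 n) (string (successor (char s n))))))

    ;; (rigid-rule "ijk")   => "ijd"   legal, but most people find it lame
    ;; (slipped-rule "ijk") => "ijl"   the answer most people prefer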

Still, it must be admitted, analogies that seem to require a deep perceptual shift after an initially unsatisfactory first stab are the ones that beguile us, for they seem to promise insight into that mystery of mysteries: insight. I must admit to the belief, or at least the strong intuition, that all the depth of scientific discovery, even the profoundest discovery, is wrapped up in the mechanisms for solving these simple problems in which conflicting pressures push around one's percepts and concepts, letting things bounce against each other until, all at once, something falls into place and then, presto! A sense of certainty crystallizes, so powerful that you know you have found the right way to look at things. I firmly believe, in short, that "mini-breakthroughs" and "maxi-breakthroughs" have precisely the same texture.

There is in AI today a tendency toward flashy, splashy domains—that is, toward developing programs that can do such things as medical diagnosis, geological consultation (for oil prospecting), designing of experiments in molecular biology, molecular spectroscopy, configuring of large computer systems, designing of VLSI circuits, and on and on. Yet there is no program that has common sense; no program that learns things that it has not been explicitly taught how to learn; no program that can recover gracefully from its own errors. The "artificial expertise" programs that do exist are rigid, brittle, inflexible. Like chess programs, they may serve a useful intellectual or even practical purpose, but despite much fanfare, they are not shedding much light on human intelligence. Mostly, they are being developed simply because various agencies or industries fund them. This does not follow the traditional pattern of basic science. That pattern is to try to isolate a phenomenon, to reduce it to its simplest possible manifestation.

The problem is, AI programs are carrying out all these cognitive activities in the absence of any subcognitive activity. There is no substrate that corresponds to what goes on in the brain. There is no fluid recognition and recall and reminding. These programs have no common sense, little sense of similarity or repetition or pattern. They can perceive some patterns as long as they have been anticipated—and particularly, as long as the place where they will occur has been anticipated—but they cannot see patterns where nobody told them explicitly to look. They do not learn at a high level of abstraction. This style is in complete contrast to how people are. People perceive patterns anywhere and everywhere, without knowing in advance where to look. People learn automatically in all aspects of life. These are just facets of common sense. Common sense is not an "area of expertise", but a general—that is, domain-independent—capacity that has to do with fluidity in representation of concepts, an ability to sift what is important from what is not, an ability to find unanticipated analogical similarities between totally different concepts ("reminding", as Schank calls it). We have a long way to go before our programs exhibit this cognitive style. Recognition of one's mother's face is still nearly as much of a mystery as it was 30 years ago. And what about such things as recognizing family resemblances between people, recognizing a "French" face, recognizing kindness or earnestness or slyness or harshness in a face? Even recognizing age—even sex!—these are fantastically difficult problems. As Donald Knuth has pointed out, we have written programs that can do wonderfully well at what people have to work very hard at doing consciously (e.g., doing integrals, playing chess, medical diagnosis, etc.)—but we have yet to write a program that remotely approaches our ability to do what we do without thinking or training—things like understanding a conversation partner with an accent at a loud cocktail party with music blaring in the background, while at the same time overhearing wisps of conversations in the far corner of the room. Or perhaps finding one's way through a forest on an overgrown trail. Or perhaps just doing some anagrams absentmindedly while washing the dishes. Asking for a program that can discover new scientific laws without having a program that can, say, do anagrams, is like wanting to go to the moon without having the ability to find your way around town.
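
Even the anagram domain, humble as it is, shows the gulf: the best a few lines of code can offer is the brute, conscious version of what people do absentmindedly over the dishes. An illustrative sketch (the little word list is an arbitrary stand-in):

    ;; Two words are anagrams if their letters, sorted, coincide.
    (defun letter-signature (word)
      (sort (copy-seq (string-downcase word)) #'char<))

    (defun anagrams-of (word word-list)
      "Members of WORD-LIST that are anagrams of WORD, excluding WORD itself."
      (let ((sig (letter-signature word)))
        (remove-if-not (lambda (w)
                         (and (not (string-equal w word))
                              (equal sig (letter-signature w))))
                       word-list)))

    ;; (anagrams-of "listen" '("silent" "tinsel" "enlist" "salute" "listen"))
    ;; => ("silent" "tinsel" "enlist")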

Is the domain of anagrams simply a trivial, silly, "toy" domain? Or is it serious? I maintain that it is a far purer, far more interesting domain than many of the complex real-world domains of the expert systems, precisely because it is so playful, so unconscious, so enjoyable, for people. It is obviously more related to creativity and spontaneity than it is to logical derivations, but that does not make it—or the mode of thinking that it represents—any less worthy of attention. In fact, because it epitomizes the unconscious mode of thought, I think it more worthy of attention. In short, it seems to me that something fundamental is missing in the orthodox AI "information-processing" model of cognition, and that is some sort of substrate from which intelligence emerges as an epiphenomenon. Most AI people do not want to tackle that kind of underpinning work.

Such beliefs arise, in my opinion, from a confusion of levels, exemplified by the title of Barr's paper: "Cognition as Computation". Am I really computing when I think? Admittedly, my neurons may be performing sums in an analog way, but does this pseudo-arithmetical hardware mean that the epiphenomena themselves are also doing arithmetic, or should be—or even can be—described in conventional computer-science terminology? Does the fact that taxis stop at red lights mean that traffic jams stop at red lights? One should not confuse the properties of objects with the properties of statistical ensembles of those objects. In this analogy, traffic jams play the role of thoughts and taxis play the role of neurons or neuron-firings. It is not meant to be a deep analogy, only one that emphasizes that what you see at the top level need not have anything to do with the underlying swarm of activities bringing it into existence. In particular, something can be computational at one level, but not at another level. Yet many AI people, despite considerable sophistication in thinking about a given system at different levels, still seem to miss this. Most AI work goes into efforts to build rational thought ("cognition") out of smaller rational thoughts (elementary steps of deduction, for instance, or elementary motions in a tree). It comes down to thinking that what we see at the top level of our minds—our ability to think—comes out of rational "information-processing" activity, with no deeper levels below that.

The brain itself does not "manipulate symbols"; the brain is the medium in which the symbols are floating and in which they trigger each other. There is no central manipulator, no central program. There is simply a vast collection of "teams"—patterns of neural firings that, like teams of ants, trigger other patterns of neural firings. The symbols are not "down there" at the level of the individual firings; they are "up here" where we do our verbalization. We feel those symbols churning within ourselves in somewhat the same way as we feel our stomach churning; we do not do symbol manipulation by some sort of act of will, let alone some set of logical rules of deduction. We cannot decide what we will next think of, nor how our thoughts will progress. Not only are we not symbol manipulators; in fact, quite to the contrary, we are manipulated by our symbols! As Scott Kim once cleverly remarked, rather than speak of "free will", perhaps it is more appropriate to speak of "free won't". This way of looking at things turns everything on its head, placing cognition—that rational-seeming level of our minds—where it belongs, namely as a consequence of much deeper processes of myriads of interacting subcognitive structures. The rational has had entirely too much made of it in AI research; it is time for some of the irrational and subcognitive to be recognized for its pivotal role.

It is my belief that until AI has been stood on its head and is 100 percent bottom-up, it won't achieve the same level or type of intelligence as humans have. To be sure, when that kind of architecture exists, there will still be high-level, global, cognitive events—but they will be epiphenomenal, like those in a brain. They will not in themselves be computational. Rather, they will be constituted out of, and driven by, many many smaller computational events, rather than the reverse. In other words, subcognition at the bottom will drive cognition at the top. And, perhaps most importantly, the activities that take place at that cognitive top level will neither have been written nor anticipated by any programmer. This is the essence of what I call statistically emergent mentality.

Systems that are not interfaced with our tangible, three-dimensional world via perceptors and motor capacities, no matter how sophisticated their innards, seem to be un-identifiable-with, by most people. I have in mind a certain kind of program that most people would probably find it ludicrous to ever consider conscious: a program that does symbolic mathematical manipulations. Take the famous system called Macsyma, for instance, which can do calculus and algebra problems of a very high order of difficulty. Its performance would have been so unimaginable in the days of Gauss or Euler that many smart people would have gasped and many brilliant people might have worshiped it. No one could pooh-pooh it—but today we do. Today we are "sophisticated". In a way, this is good, but in a way it is bad. What bothers me is a kind of "hardware chauvinism" that we humans evince. This chauvinism says, "Real Things live in three dimensions; they are made of atoms. Photons bounce off Real Things. Real Things make noises when you drop them. Real Things are material, not insubstantial mental ghosts." The idea that numbers or functions or sets or any other kind of mathematical construct might be Real would provoke guffaws in many if not most intellectual quarters today. The idea that being able to maneuver about in a "space" or "universe" of pure abstractions might entitle a robot to be called "sentient" would be ridiculed to the skies, no matter if the maneuvering in that abstruse high-dimensional space were as supple and graceful as that of the most skilled Olympic ice-skating champion or the greatest jazz pianist. Speaking of which, the musical universe provides another wonderful testbed. Would a robot able to devise incredibly beautiful, lyrical, flowing passages that brought tears to your eyes be entitled to a bit of empathy? Suppose it were otherwise immobile, its only conception of "reality" being inward-directed rather than something accessible through hands or eyes or ears. How would you feel then? I personally don't think that such a program could come to exist in actuality, but as a thought experiment it asks something interesting about our conception of sentience. Does access to the "real world" count for a lot? Why should the intangible world of the intellect be any less real than the tangible world of the body? Does it have less structure? No, not if you get to know it. Every type of complexity in the physical world has its mirror image in the world of mathematical constructs, including time. What kind of prejudice is it, then, that biases us in favor of our kind so strongly? As questions of mind and matter grow ever more subtle, we must watch out for tacit assumptions of this sort ever more vigilantly, for they affect us at the deepest level and provide pat answers to exceedingly non-pat questions.

Selection and Stability

I have returned to thinking about games in which patterns in one's play can be taken advantage of, even if game theory in some theoretical sense can find the optimal strategy. There is still something curiously compelling and fascinating about the teasing and flirting and other ploys that arise in these games, something that vividly recalls strategies in evolution, and even seems relevant to many political situations today. Furthermore, there is something strikingly academic and bookish about adopting a purely game-theoretic strategy when playing against a human opponent, especially in the face of "teasing" strategies. Obviously, humans have more complex goals in life than merely winning the game, and this fact determines a lot about how they play a game. Impatience and audacity, for instance, are both important psychological elements in human game-playing, and an optimal strategy in the ordinary game-theoretic sense does not take them into account. Therefore I feel games of this sort are still important models of how people and larger organizations tackle complex challenges and threats.
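
A toy case makes the tension concrete. In a repeated matching-pennies-style game (not one of the games referred to above, just the simplest possible stand-in), the game-theoretically optimal policy is to randomize fifty-fifty: it can never be exploited, but it never exploits anyone either. A player with a habit, by contrast, can be read and beaten, which is exactly what all the teasing and flirting trades on. The sketch below is merely illustrative; every strategy in it is invented for the example.

    import random

    # Repeated matching pennies: the matcher wins a round by showing the same
    # face as its opponent. This "predictor" simply plays whatever face its
    # opponent has shown most often so far.
    def predictor(their_moves):
        if not their_moves:
            return random.choice("HT")
        return max("HT", key=their_moves.count)

    def minimax_player(own_moves):
        return random.choice("HT")                    # the unexploitable 50/50 strategy

    def patterned_player(own_moves):
        return "H" if random.random() < 0.8 else "T"  # a readable habit: heads 80% of the time

    def exploit_rate(opponent, rounds=20000):
        their_moves, wins = [], 0
        for _ in range(rounds):
            my_move = predictor(their_moves)
            their_move = opponent(their_moves)
            wins += (my_move == their_move)
            their_moves.append(their_move)
        return wins / rounds

    print(exploit_rate(minimax_player))    # about 0.50: nothing to exploit
    print(exploit_rate(patterned_player))  # about 0.80: the habit is punished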

Much of the spontaneous and creative teasing behavior that tends to occur in these games has its parallels in evolution. The most picturesque and vivid portrayal that I know of the uncanny patterns and canny counter-patterns set up by living beings competing against each other is provided by Richard Dawkins in his book The Selfish Gene. The discussion centers on the notion of an evolutionarily stable strategy, or ESS—a term due to J. Maynard Smith. An ESS is defined as: "a strategy which, if most members of a population adopt it, cannot be bettered by an alternative strategy". However, here, "adoption of a strategy by an individual" really means that that individual has genes for that behavioral policy. It's not a question of choice. In essence, Dawkins maintains, this is what nature has done over eons: vast numbers of strategies have fought each other, nature's profligacy paying off in the long run in the development of species with optimal strategies, in some sense of the term. Dawkins uses this concept to show how group selection can seem to be taking place in a population, when in fact mere gene selection can account for what is observed. He says: "Maynard Smith's concept of the ESS will enable us, for the first time, to see clearly how a collection of independent selfish entities can come to resemble a single organized whole... Selection at the low level of the single gene can give the impression of selection at some higher level."
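
Maynard Smith's notion can be stated a bit more sharply; the formal condition below is the standard one from the game-theory literature, not a quotation from Dawkins. Writing E(A, B) for the expected payoff to an individual playing strategy A against an opponent playing strategy B, a strategy I is an ESS if, for every alternative ("mutant") strategy J,

    \[
      E(I,I) > E(J,I)
      \quad \text{or} \quad
      \bigl[\, E(I,I) = E(J,I) \ \text{and} \ E(I,J) > E(J,J) \,\bigr].
    \]

The first clause says that a population of I-players cannot be invaded outright; the second says that a mutant which merely ties against I is nonetheless beaten whenever it meets its own kind, and so can never spread.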

Today, the most important problems facing humanity are in the arena of international relations, where independent, egoistic nations face each other in a state of near anarchy. Many of these problems take the form of an iterated Prisoner's Dilemma. Examples include arms races, nuclear proliferation, crisis bargaining, and military escalation. Of course, a realistic understanding of these problems would have to take into account many factors not incorporated into the simple Prisoner's Dilemma formulation, such as ideology, bureaucratic politics, commitments, coalitions, mediation, and leadership. Nevertheless, we can use all the insights we can get. Robert Gilpin [in his book War and Change in World Politics] points out that from the ancient Greeks to contemporary scholarship all political theory addresses one fundamental question: "How can the human race, whether for selfish or more cosmopolitan ends, understand and control the seemingly blind forces of history?" In the contemporary world this question has become especially acute because of the development of nuclear weapons. The advice given in this book to players of the Prisoner's Dilemma might serve as good advice to national leaders as well: Don't be envious, don't be the first to defect, reciprocate both cooperation and defection, and don't be too clever.
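
Much of that advice amounts to a description of the simple strategy TIT FOR TAT: cooperate on the first move, and thereafter do whatever the other player did on the move before. The sketch below is only an illustration of how such a contest plays out; the payoff values 5, 3, 1, 0 are the conventional ones for the Prisoner's Dilemma, and the code is in no way Axelrod's tournament program.

    # Iterated Prisoner's Dilemma with the conventional payoffs:
    # T = 5 (temptation), R = 3 (reward), P = 1 (punishment), S = 0 (sucker).
    PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
              ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

    def tit_for_tat(own_moves, their_moves):
        # Never defect first; afterwards, echo the opponent's previous move.
        return "C" if not their_moves else their_moves[-1]

    def always_defect(own_moves, their_moves):
        return "D"

    def play(strategy_a, strategy_b, rounds=200):
        moves_a, moves_b, score_a, score_b = [], [], 0, 0
        for _ in range(rounds):
            a = strategy_a(moves_a, moves_b)
            b = strategy_b(moves_b, moves_a)
            pa, pb = PAYOFF[(a, b)]
            score_a, score_b = score_a + pa, score_b + pb
            moves_a.append(a)
            moves_b.append(b)
        return score_a, score_b

    print(play(tit_for_tat, tit_for_tat))      # (600, 600): steady mutual cooperation
    print(play(tit_for_tat, always_defect))    # (199, 204): TIT FOR TAT loses, but only by a hair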

Sanity and Survival

What do you do when, in a crushingly cold winter, you hear over the radio that there is a severe natural gas shortage in your part of the country, and everyone is requested to turn their thermostat down to 60 degrees? There's no way anyone will know if you've complied or not. Why shouldn't you toast in your house and let all the rest of the people cut down their consumption? After all, what you do surely can't affect what anyone else does. This is a typical "tragedy of the commons" situation. A common resource has reached the point of saturation or exhaustion, and the questions for each individual now are: "How shall I behave? Am I typical? How does a lone person's action affect the big picture?" Garrett Hardin's article "The Tragedy of the Commons" frames the scene in terms of grazing land shared by a number of herders. Each one is tempted to increase their own number of animals even when the land is being used beyond its optimum capacity, because the individual gain outweighs the individual loss, even though in the long run, that decision, multiplied throughout the population of herders, will destroy the land totally.
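
Hardin's reasoning at this point is a small piece of arithmetic, and a toy calculation shows its shape (the numbers below are invented purely for illustration; nothing in Hardin's article fixes them):

    # The herder's private reckoning. One more animal brings its owner the
    # full gain, while the overgrazing damage it causes is spread evenly
    # over everyone who shares the pasture.
    n = 100        # herders sharing the commons (hypothetical)
    gain = 1.0     # private benefit to the owner of one more animal
    damage = 3.0   # total harm that animal does to the already-saturated land

    payoff_to_owner = gain - damage / n    # 1.0 - 0.03 = +0.97: add the animal
    payoff_to_all   = gain - damage        # 1.0 - 3.00 = -2.00: the commons loses

    print(payoff_to_owner, payoff_to_all)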

In an era when resources are running out in a way humanity has never had to face heretofore, new kinds of social arrangements and expectations must be imposed, Hardin feels, by society as a whole. He is a dire pessimist about any kind of superrational cooperation, emphasizing that cooperators in the birth-control game will breed themselves right out of the population. Hardin puts it bluntly: "Conscience is self-eliminating." He goes even further and says: The argument has here been stated in the context of the population problem, but it applies equally well to any instance in which society appeals to an individual exploiting a commons to restrain himself for the general good—by means of his conscience. To make such an appeal is to set up a selective system that works toward the elimination of conscience from the race.
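
Hardin's claim that conscience is self-eliminating is itself a piece of population arithmetic, and it can be sketched as such, assuming (as Hardin does) that the disposition to restrain oneself is passed on to one's descendants. The fertility figures below are invented purely to show the shape of the argument.

    # If those who heed the appeal to restrain themselves consistently leave
    # fewer descendants than those who ignore it, their share of the
    # population shrinks every generation, however large it starts.
    conscientious, heedless = 0.9, 0.1   # starting fractions (hypothetical)
    births_c, births_h = 1.8, 2.4        # average offspring per individual (hypothetical)

    for generation in range(1, 11):
        c, h = conscientious * births_c, heedless * births_h
        conscientious, heedless = c / (c + h), h / (c + h)
        print(generation, round(conscientious, 3))
    # The conscientious fraction falls steadily from 0.90 toward zero.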

I sometimes wonder whether there haven't been many civilizations Out There, in our galaxy and beyond, that have already dealt with just these types of gigantic social problems—Prisoner's Dilemmas, Tragedies of the Commons, and so forth. Most likely some would have survived, some would have perished. And it occurs to me that perhaps the ultimate difference in those societies may have been the survival of the meme that, in effect, asserts the logical, rational validity of cooperation in a one-shot Prisoner's Dilemma. In a way, this would be the opposite thesis to Hardin's. It would say that lack of conscience is self-eliminating—provided you wait long enough that natural selection can act at the level of entire societies.