Introduction: The Mind Explained in One Page or Less

“If we do discover a complete theory, it should in time be understandable in broad principle by everyone, not just a few scientists. Then we shall all, philosophers, scientists, and just ordinary people, be able to take part in the discussion of the question of why it is that we and the universe exist. If we find the answer to that, it would be the ultimate triumph of human reason—for then we would know the mind of God.” — Stephen Hawking, A Brief History of Time

We exist. We don’t quite know why we or the universe exists, but we know that we think, therefore we are. The problem is that we don’t know we know anymore. Worse still, we have convinced ourselves we don’t. It is a temptation of modern physical science to explain ourselves right out of existence. First, since the Earth is a small and insignificant place, certainly nothing that happens here can have any cosmic significance. But, more than that, the laws of physics have had such explanatory success that surely they must explain us as well, reducing the phenomenon of us to a dance of quarks and leptons. Well, I am here to tell you that Descartes was right, because we are here, and that science took a left turn at Francis Bacon and needs to get back on the right track. The problem is that we’ve been waiting for science to pull the mind down to earth, to dissect it into its component nuts and bolts, but we’ve had it backward. What we have to do first is use our minds to pull science up from the earth into the heavens, to invest it with the explanatory reach it needs to study imaginary things. Because minds aren’t made of nuts and bolts; brains are. Minds and imagination are made of information, function, capacity, and purpose, all well-established nonphysical things from the human realm that science can’t see under a microscope.

In the above quote, Stephen Hawking is, of course, referring only to a complete theory of how we physically exist. Explaining the physical basis of the universe has been the driving quest of the physical sciences for four hundred years. It has certainly produced some fascinating and unexpected theories, but despite that, it matters much less to us than a complete theory of how we mentally exist, or, for that matter, of how we and the universe both physically and mentally exist. Can one theory embrace everything we can think of as existing, and if so, could we even understand it? Yes, it can, and we can. All we have to do is take a few steps back and approach the subject again confidently. Everything we know, we learned from experience, and at no point in human history have we had more scientific knowledge or more personal experience informed by fairly robust scientific perspectives. In short, we know more than we think we know, and if we just think about the subject while keeping all that science in mind, we should be able to ferret out a well-supported scientific explanation of the mind.

I am going to go back to first principles to reseat the foundation of science and then use its expanded scope over both real and imaginary things to approach the concept of mind from the bottom up and the top down to develop a unified theory. The nature of the mind was long the sole province of philosophers, who approached it with reason but lacked any tools for uncovering its mechanisms. Wilhelm Wundt, the “father of modern psychology”, took on the conscious mind as a subject of experimental scientific study in the 1870s. Immanuel Kant, himself probably the greatest philosopher of mind, had held that the mind could only be studied through deductive reasoning, i.e. from an a priori stance, which (it turns out) makes the incorrect assumption that deduction in the mind stands apart from observation and experience. He had disputed that psychology could ever be an empirical (experimental) science because mental phenomena could not be expressed mathematically, individual thoughts could not be isolated, and any attempt to study the mind introspectively would itself change the object being studied, to say nothing of introducing innumerable opportunities for bias.1 Wundt nevertheless founded experimental psychology and remained a staunch supporter of introspection, provided it was done under strict experimental control. Introspection’s dubious objectivity caught up with it, and in 1912, Knight Dunlap published an article called “The Case Against Introspection” that pointed out that no evidence supports the idea that we can observe the mechanisms of the mind with the mind. This set the stage for a fifty-year reign of behaviorism, which, in its most extreme forms, held that nothing mental was real and that behavior was all there was.2 Kant had made the philosophical case and the behaviorists the scientific case that the inner workings of the mind could not be studied by any means.

A cognitive revolution began to challenge this idea in the late 1950s. In 1959, Noam Chomsky famously refuted B.F. Skinner’s 1957 Verbal Behavior, which sought to explain language through behavior, by arguing that language acquisition could not happen through behavior alone.3,4 George Miller’s 1956 article “The Magical Number Seven, Plus or Minus Two” proposed a mental capacity that was independent of behavior. Ideas started to emerge from computer science that the mind might be computational and from neuroscience that networks of neurons could perform such computation. The idea that the nature of the mind could be studied scientifically was reborn, but it was clear to everyone that psychology was not broad enough to tackle it alone. Cognitive science was conceived in the 1970s as a new field to study how the mind works, driven at first mostly by the artificial intelligence community. But because the overall mission would need to draw support from many fields, cognitive science was intentionally chartered as a multidisciplinary effort rather than a research plan. It was to be about the journey rather than the destination. Instead of resting on a firm foundation of one or more prevailing paradigms, as one finds in every other science, it floats on a makeshift interdisciplinary boat that lashes together rafts from psychology, philosophy, artificial intelligence, neuroscience, linguistics, and anthropology. Instead of a planned city, we now have an assorted collection of flotsam and jetsam covering every imaginable perspective, with no means or mandate to sort it all out. Detailed work is being done on each raft, with assistance from the others, but there is no plan to fit everything together. While the field has been productive, the forest can’t be seen for the trees. That it is relatively open-minded is a big improvement over the closed-mindedness of behaviorism, but cognitive science desperately needs to establish a prevailing paradigm of the sort one finds in other fields. To pull these diverse subfields together, we need to develop a philosophical stance that finds some common bedrock.

We need to roll the clock back to when things started, to the beginning of minds and the beginning of science. We need to think about what really happened and why, about what went right and what went wrong. What we will find is that the essence of more explanatory perspectives was there all along, but we didn’t develop them past the intuitive stage to form an overall rational explanation. With a better model that can bridge that gap, we can establish a new framework for science that can explain both material and immaterial things. From this new vantage point, everything will fit together better and we can explain the mind just using available knowledge. I don’t want to hold you in suspense for hundreds of pages until I get to the point, so I am going to explain how the mind works right here on the first page. And then I’m going to do it again in a bit more detail over a few pages, and then across a few chapters, and then over the rest of the book. Each iteration will go into more detail, will be better supported, and will expand my theory further. I’m going to start on firm ground and make it firmer. My conclusions should sound obvious, intuitive, and scientific, pulling together the best of both common sense and established science. My theory should be comprehensive if not yet complete, and should be understandable in broad principle by everyone and not just by scientists.

From a high level, it is easy to understand what the mind does. But you have to understand evolution first. Evolution works by induction, which means trial and error. It keeps trying. It makes mistakes. It detects the mistakes with feedback and tries another way, building a large toolkit of ways that work well. Regardless of the underlying mechanisms, however, life persists. It is possible for these feedback structures to keep going, and this creates in them a logical disposition to do so. They keep living because they can. Living things thus combine physical matter with a “will” to live, which is really just an opportunity. This disposition or will is not itself physical; it is the stored capacities of feedback loops tested over long experience. These capacities capture the possibilities of doing things without actually doing them. They are the information of life, but through metabolism they have an opportunity to act, and those actions are not coincidentally precisely the actions that keep life alive another day. The kinds of actions that worked before are usually the kinds that will work again because that is how natural selection works: it creates “kinds” as statistical averages of prior successes that by induction are then statistically likely to work again. And yet, ways that can work better still can be found, and organisms that can find such better ways outcompete those that don’t, creating a functional arms race called evolution that always favors sets of capacities that have been, and thus are likely to continue being, more effective at perpetuating themselves. Survival depends on many such capacities, each of which competes against its variants in separate arms races but which are still selected based on their contribution to the overall fitness of the unit of selection, which is most significantly the individual organism because the genes and cells of an individual live or die together. 
As Richard Lewontin put it in his seminal 1970 paper “The Units of Selection”,5 “any selective effect of variation” in biomolecules, genes, ribosomes, and everything “under the control of nuclear genes” “will be at the level of the whole organisms of which they are a part.” So genes are not selfish; it is all for one and one for all.6 Lewontin also pointed out that any subpopulation of individuals can act as a unit of selection relative to other subpopulations.
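The inductive loop described above, vary, test against feedback, keep what works, can be sketched in a few lines of toy code. This is only an illustration of the logic of trial-and-error selection, not a claim about biological mechanisms; the names, parameters, and fitness function are all invented for the example.

```python
import random

random.seed(42)

def select(population, fitness, generations=100, mutation=0.05):
    """Toy natural-selection loop: vary, test against feedback, keep what works."""
    for _ in range(generations):
        # Trial: each individual produces a slightly mutated offspring.
        offspring = [x + random.gauss(0, mutation) for x in population]
        # Error detection: feedback (the fitness score) judges parents and
        # offspring together.
        pool = population + offspring
        pool.sort(key=fitness, reverse=True)
        # Selection: only the fitter half persists to the next generation.
        population = pool[:len(population)]
    return population

# A "capacity" is modeled as a single number; fitness peaks at 1.0.
survivors = select([random.random() for _ in range(20)],
                   fitness=lambda x: -abs(x - 1.0))
print(round(sum(survivors) / len(survivors), 2))  # population converges near 1.0
```

The surviving variants are the loop’s “stored capacities”: a statistical record of what worked before, which is why, by induction, they are likely to work again.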

Freedom of motion created a challenge and an opportunity for some living things to develop complex behavioral interactions with their environment, if only they could make their bodies pursue high-level plans. Animals met this challenge by evolving brains as control centers and minds as high-level control centers of brains. At the level the mind operates, the body is logically an agent, and its activities are not biochemical reactions but high-level (i.e. abstract) tasks like eating and mating. Unlike evolution, which gathers information slowly from natural selection, brains and minds gather information in real time from experience. Their primary strategy for doing that is also inductive trial and error. Patterns are detected and generalized from feedback into abstractions like one’s body, the seen world, friends, foes, and food sources. Most of the brain’s inductive work happens outside of conscious awareness in what I call the nonconscious mind (I am going to avoid the word unconscious as it also implies consciousness is turned off. I am also avoiding subconscious as it implies only that which is just below the surface.) The nonconscious mind is like the staff of a large corporation and the conscious mind is like the CEO. Only the CEO makes top-level decisions, but she also delegates habituated tasks to underlings who can then take care of them without bothering the CEO. Another good analogy is to search engines. The conscious mind enters the search phrases and the nonconscious mind brings back the results in the form of memories and intuitions.7 Nearly all the processing effort is nonconscious, but the nonconscious mind needs direction from the top to know what to do. The nonconscious mind packages up instincts, senses, emotions, common sense, and intuition, and also executes all motor actions. The conscious mind sits in an ivory tower in which everything has been distilled into the high-level, logical perspective that we think of as the first-person. 
This mind’s-eye view of the world (or mind’s-eye world for short) is analogous to a cartoon vs. a video, and for the same reasons: the cartoonist isolates relevant things and omits irrelevant detail. For minds to be effective at their job, the drive to survive needs to be translated into preferences that appropriately influence the high-level agent to choose beneficial actions. Minds therefore experience a network of feelings, ranging from pain and other senses through emotions that shape complex social interactions, which motivate them to act in their agent’s best interests. Where DNA evolves based on its contribution to survival, knowledge in minds evolves based on its contribution to satisfying or frustrating conscious desires. The subjective lens of consciousness is only accountable to survival indirectly. We do always try to protect our interests, but we protect the interests we feel as desires rather than the ultimate reasons those desires evolved. In the long run, those selections must result in increased rates of survival, but this requires tuning over time, and the possibility thus exists that some survival advantages may become selected for at the expense of others, eventually reducing net fitness. For example, some birds select mates for the appeal of their plumage, and because being chosen as a mate confers a direct reproductive advantage, this can eclipse the physical fitness of the birds, making them less competitive against other species. I will argue that this effect, called Fisherian runaway, has not happened to humans and is not a factor in gender differences, our concepts of beauty, war, or anything else. For humans, conscious desires are well-tuned to survival benefit.

Humans and some of the most functional animals also think a lot, making special use of the brand of logic we call deduction. Where induction works from the bottom up (from specifics to generalities), deduction works from the top down (from generalities to specifics). Deduction is a conscious process that builds discrete logical models out of the high-level abstractions presented to consciousness. First, it takes the approximate kinds suggested by bottom-up induction and packages them up into abstract buckets called concepts. Concepts are the line in the sand that separates deduction from induction, but they are built entirely out of bottom-up inductive knowledge I call “subconcepts”. We feel around in our subconceptual stew to establish discrete relationships and rules between concepts, like causes and effects. Although concepts and the deductive models we build out of them form only simplified high-level views that only approximately line up with the real world, they comprise the better part of our knowledge and understanding because they are explanatory. Deduction makes explanation possible. It is mostly what separates human intelligence from animal intelligence. Chimps, dolphins, elephants, and crows have a basic sense that objects can have general properties which lets them be used in different ways, and so they can fashion and/or use simple tools, but they can’t go very far down this path, even though, genetically, their brains are built much like ours.

How did conceptual thinking become easy for us? The leap from the jumble of subconcepts to cleanly delineated concepts required a sufficiently worthwhile motive to make it worth the effort. It was language that forced us to put concepts into neat little buckets called words (or, more strictly, into morphemes or semantic units, which include roots, prefixes, suffixes, whole words, and idioms). To communicate, we have to respect the implied lines around each word. After we learn how to delineate concepts using language, we develop a conceptual approach to thinking in general, creating concepts around every little thing we think about to a much finer level of granularity than language alone supports. This raises the question, “Where did language come from?” Humans have been communicating in increasingly complex ways for over two million years, probably first with hand gestures and later with improved vocalizations. These efforts gave an edge to early humans who could think more conceptually and verbalize better, gradually making us naturally inclined to create and learn language. In any case, once we have learned a language and are thinking more conceptually, our nonconscious mind can develop habits and intuitions that leverage both subconceptual and conceptual knowledge. Creating concepts and deriving implications from them is a conscious activity because it requires forming new long-term memories, which seems to require conscious participation. Most higher-order thinking and deduction need capacities that are only available to consciousness, most notably the ability to create long-term memories of concepts, models, and their intermediate and final results.8 Consciousness uses many very short cognitive cycles that we perceive as a non-overlapping train of thought, but which do overlap somewhat.
These cycles have privileged access to kinds of memory and processing abilities that distinguish conscious from nonconscious processing and account for how consciousness feels and for its ability to make top-level decisions.

The real world is not a conceptual place, so conceptual models never perfectly describe it. But they can come arbitrarily close, and as we shape the world into a collection of manmade artifacts that are each conceptual in nature, this mapping gets even better. Still, all conceptual models carry some measure of uncertainty because they must be shoehorned into any given application. But perfect is the enemy of good: dwelling on every uncertainty would be crippling. To function well using deductive models, we need a way to establish a threshold above which we won’t care about uncertainties, and we have a built-in mechanism for doing this called belief or trust. When either inductive or deductive knowledge feels good enough to act on, we invest it with our trust and push nagging second thoughts aside. Having pushed the second thoughts aside, we proceed from that moment forward as if the deductive model were actually true, even though all models are generalizations, which means all sorts of compromises have been made both in their definition and their application. Trust is not permanent; we establish a level of trust in everything, which helps us manage exceptions and decide when to review our beliefs. The important benefit, however, is that we can act confidently once trust has been conferred. Trust is a 100/0 rule rather than an 80/20 rule because doubts have been swept aside as irrelevant, even though they could become relevant at any moment given sufficient reason.
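The 100/0 character of trust can be illustrated with a toy sketch. The class, names, and threshold value below are all assumptions invented for the example, not a claim about how belief is actually implemented in the mind:

```python
class Belief:
    """Toy model of trust as a 100/0 rule: once confidence clears a
    threshold, we act as if the belief were simply true, but a strong
    enough exception can reopen it for review."""

    def __init__(self, claim, confidence, threshold=0.8):
        self.claim = claim
        self.confidence = confidence
        self.threshold = threshold

    def trusted(self):
        # Doubts below the threshold are swept aside entirely, not weighted:
        # the answer is all-or-nothing, not a probability.
        return self.confidence >= self.threshold

    def encounter_exception(self, severity):
        # Sufficient reason can make the swept-aside doubts relevant again.
        self.confidence -= severity
        return self.trusted()

b = Belief("this bridge will hold my weight", confidence=0.9)
print(b.trusted())                 # True: we act without hedging
print(b.encounter_exception(0.3))  # False: a visible crack triggers review
```

The design point is that `trusted()` returns a boolean, not a probability: once conferred, trust suppresses the remaining uncertainty entirely until an exception forces a review.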

The mind is more to us than just the capacity to experience feelings and thoughts. We also have the feeling that our existence as mental entities is both meaningful and consequential. While we think cats and all feeling creatures have a claim to some level of mental existence, we think a human’s concept of self, with a robust inner life full of specific memories, knowledge, and reflections, makes us more worthy. That is fair: the capacity to feel and think more deeply because we have been through more makes human experience more compelling. That said, all animals should be treated with respect proportional to their mental capacity, a standard we often ignore because it is inconvenient. While anything about our inner life could be called an aspect of the self, what we primarily mean by the term is our tendency to think of our integrated sense of agency in the world as an entity in its own right. The stability of our desires and memories creates permanence and makes the self undeniable. This self identifies with consciousness as the active part of its existence, but its memories and plans for the future are inactive parts that it frequently consults. Its memories, plans, and current state of mind are data constructs which it consciously processes to make decisions. The initiative of consciousness derives from attention, short-term memory, and consideration, which is the capacity to reflect on ideas in short-term memory. The nonconscious mind gathers patterns from memory, but I don’t think it can iteratively turn ideas over in a concerted fashion to derive associations systematically and develop perspectives. Consideration uses attention and short-term memory to form concepts and conceptual models and then derives logical implications from them. Nonconscious thought is a first-order use of neural networks and conscious thought is a second-order use.
Nonconsciously, we can push data through the net to recognize or intuit things, and we can also find new patterns and add to the net. Consciously, we build a secondary net out of concepts, models, and perspectives, and we derive logical ways of processing these higher-level entities. This second-order network, which is mostly what we mean by intelligence, is entirely grounded in the first-order nonconscious network but can achieve cognitive feats well beyond it. This is what makes humans more capable than other animals and what makes our selves more significant.

As agents with selves, we believe we act with free will to choose our path into the future. If our minds make the top-level decisions that control our bodies, then this makes sense. But cognitive science increasingly suggests, and I agree, that we always and only pick the best plan we can given our constitution and background: our nature and nurture. How can we say we have any choice or free will if our choice must be the best one? It is just because picking the best plan is hard. It requires a staggering amount of information processing. Every decision we make draws indirectly on everything we know, and the outcome of that process can’t be predicted correctly by any imaginable science. Yes, if you know someone well, you can predict what they will do a good part of the time, but not always. And if real-time brain scanning technology improves, it could become quite good at knowing in advance what we are thinking about and what our most likely actions are. But it will always fall short of making perfect predictions because our decisions draw on the whole network, and in an iterative feedback system, even the tiniest differences can eventually produce big effects. In any case, whether someone else knows what you are going to do or not, you still have free will because you don’t know what you are going to do, but you have to decide. Quite the opposite of lacking free will, you don’t have any choice but to exercise free will so long as you have the mental capacity to make evaluations. The mentally incompetent are rightly recognized as being without free will, but everyone else is a mental entity whose only real job is to consult their nature and nurture to see what they say. Although the decisions you reach are deterministic, you are the machine and are obeying yourself, which is not such a bad thing. Neither you nor anybody else knows for sure what you are going to do until you do it, so for all practical purposes, you have chosen your path into the future.
You can’t double-think your way out of going through the motions, because all of that falls within your nature and nurture. You have to take responsibility for the outcome because you own yourself; our mind has to identify with the brain and body to which it is assigned because that is its job.

How could aliens studying people decide whether they were automatons or were exercising free will? In other words, at what point does a toaster or a robot develop a capacity for “free choice”? The answer just depends on the kind of algorithm it uses to initiate action. If that algorithm consults a knowledge network that was gathered from real-time experience and that can be extended when making a decision, then the choice is free. Because different minds have different amounts of knowledge and ability to consider implications, they differ in their degree of free will. The mentally incompetent retain some measure of free will, but the scope falls far enough below the normal range that they can’t grasp the implications of many decisions adequately. Insects act mostly from instinct and not gathered knowledge, but they can learn some things, and actions attributed to such knowledge must be considered acts of free will, even if the scope of their free will is shockingly limited relative to our own. When we make a decision, bringing all our talent, insight, and experience to bear, we give ourselves a lot of credit for marshaling all the relevant information, but from a more omniscient perspective we have brought little more to the table than an ant. In the end, we are just acting on our best hunches and hoping it will work out. The measure of whether any of our decisions “work out” is not physical; it is functional. Physically, atoms will still be atoms regardless of whether they are locally organized as rocks or living things. Life “works out” if the patterns and strategies it has devised continue to function through some physical mechanism. The mind “works out” if our conception of the world continues to function through some physical mechanism. We each develop our own conception of adequate, degraded, and improved functionality, and we struggle to always maintain and improve functionality from our own perspective.
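The criterion suggested above, that a choice counts as free when the deciding algorithm consults a knowledge store gathered from real-time experience and can extend that store while deciding, can be sketched as a toy program. Everything here (the names, the update rule, the numbers) is an invented illustration, not a proposed cognitive architecture:

```python
def choose(options, knowledge, situation):
    """Pick the option whose learned value is highest, extending the store
    when a decision raises a case it has never encountered before."""
    for option in options:
        # Extension during decision-making: unknown cases get a neutral entry.
        knowledge.setdefault((situation, option), 0.0)
    return max(options, key=lambda o: knowledge[(situation, o)])

def learn(knowledge, situation, option, outcome):
    """Feedback from experience updates the store that future choices consult."""
    key = (situation, option)
    prior = knowledge.get(key, 0.0)
    # Move the stored value halfway toward the observed outcome.
    knowledge[key] = prior + 0.5 * (outcome - prior)

knowledge = {}
learn(knowledge, "rain", "umbrella", outcome=1.0)
learn(knowledge, "rain", "sunglasses", outcome=-1.0)
print(choose(["umbrella", "sunglasses"], knowledge, "rain"))  # "umbrella"
```

By this criterion, a toaster fails (its behavior consults no experience-gathered store), while even an insect’s learned behavior passes, which matches the conclusion in the text that degrees of free will track the scope of gathered knowledge.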

That the mind exists as its own logical realm independent of the physical world is thus not an ineffable enigma; it is an inescapable consequence of the high-level control needs of complex mobile organisms. Our inner world is not magic; it is computed. My goal here is to start from the top down, starting with the knowledge of which we are the most certain, to explain how I came to the above conclusions. By sticking to solid ground in the first place, I hope to avoid intuitive leaps to unwarranted conclusions. In other words, everything I say should be well-supported scientifically and nothing I say should be controversial. This is very hard to do, because I am wading into very controversial waters, but bear with me and challenge me if I go too far. What I will be presuming from the outset is that the mind is a real natural phenomenon that can be explained and is not fundamentally ineffable or an illusion. All our direct knowledge of the mind comes from our own experience of having one, so we have to seek confirming evidence from our own minds for any theory we might put forth. In other words, our hypothesis must not only be supported by all relevant scientific disciplines, it has to pass the smell test of our own intuitive and deductive assessment of its legitimacy. Conventionally, science rules out subjective assessment as an acceptable measure, which is a very good rule for the study of non-subjective subjects. But the mind, and indeed all the social sciences, are very subjective. To exclude subjectivity on the grounds that it can be biased eliminates the possibility of achieving much of an explanation at all. Instead, we need to be looking for a scientifically credible approach to explain the mind on its own terms. We have to participate in the process, finding ways to pull objective meaning from subjective experience. The explanation that emerges should not be my opinion but should be an amalgam of the best-supported philosophical and scientific theories. 
Ideally, weaker or incorrect scientific theories will not find a way into this discussion, but I will sometimes refer to the more prominent of these if they have significantly distracted science from its mission.

Although I promise a full explanation, that fullness refers to breadth and not depth. Explanation provides a deductive model that comes at a subject from the top down, but in so doing it necessarily glosses over much detail that is visible from the bottom up. Also, there is an infinite number of possible deductive models one can construct to explain any phenomenon, but I am only presenting one. More accurately, I am going to present a set of models that fit together into a larger framework. Science strives to find the one model that best explains each phenomenon, which it accomplishes in some measure by finding the simplest models that cover just certain aspects. Once one is constrained by simplicity (Occam’s razor), the number of possible models that correctly explain things to a given level of detail goes way down, though one still needs a framework of such models. The concepts that underlie these models are not completely rigid; they allow some room for interpretation. They are, as I say, full of breadth but a little uncertain in depth. We have to remember that the sea of inductive subconceptual knowledge is vast and can never be explained except in this compromised way offered by deductive explanation. So although I will deliver a comprehensive explanation of the mind, most of our experience of having a mind is complex beyond explanation. The mind is far more amazing in the details than any theory can adequately convey.

Beyond the theoretical limits of explanation, I am also constrained by some practical limits. The theory I will develop here is derived from our existing knowledge using the most objective approach I could devise. Although I think we know enough right now to provide a pretty good overall explanation of the mind, it is still very early days in the history of civilization, and future explanations will provide better breadth and depth. However, if I have done my job right, such explanations will still fundamentally concur with my conclusions. Science is safe from revision to the extent it doesn’t overreach, so I will be taking great pains to avoid claims that can be faulted. Physicalists overreach when they claim all phenomena are physical, and social scientists overreach when they claim behavior is narrowly predictable. These claims, on inspection, turn out to be untenable. But Descartes, in saying, “I think, therefore I am”, stated an inescapable and irreducible conclusion, however much physicalists have tried to suggest it can be reduced. Quite the contrary, what science has been lacking is a periodic table of elements of existence that can’t be reduced to each other. I will propose here that two such elements exist: form, including all things physical, and function, including all things informational, which includes the mind. The physical sciences strictly describe forms but use function to do it, while the social sciences mostly describe functions, also using function to do it. The biological sciences describe both forms and functions, and the formal sciences strictly describe functions using function. Form and function have been the bedrock of science from the beginning, but they have never before been called out as its fundamental elementary building blocks.
This starts to create real problems in the study of subjects heavily dependent on both form and function, like biology, because functional theories like evolution can’t be supported by purely physical theories like chemistry. The mind is a much more functional realm still, for which the underlying chemistry is almost beside the point, so looking to chemistry for answers is like examining the Library of Congress with a microscope. To understand the mind, we will need a framework of explanations that spans both physical and functional aspects, so I am going to start out by establishing the nature of form and function and how they pertain to the mind.

Part 1: The Duality of Mind

“Normal science, the activity in which most scientists inevitably spend almost all their time, is predicated on the assumption that the scientific community knows what the world is like”
― Thomas S. Kuhn, The Structure of Scientific Revolutions

The mind exists to control the body from the top level. Control is the use of feedback to regulate a device. Historically, science was not directly concerned with control and left its development to engineers. The first feedback control device is thought to be the water clock of Ctesibius in Alexandria, Egypt around the third century B.C. It kept time by regulating the water level in a vessel and, therefore, the water flow from that vessel.1
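The feedback principle behind Ctesibius's clock can be sketched in a few lines of code. The numbers and the proportional-valve rule below are purely illustrative, not a model of the actual device; the point is only that regulation means comparing a measurement to a goal and adjusting accordingly.

```python
# A minimal sketch of feedback control: the measured water level feeds
# back to adjust the inflow valve, holding the level roughly steady so
# the outflow rate (and hence the clock's timekeeping) stays constant.
# All constants here are invented for illustration.

def regulate_water_level(target=10.0, level=4.0, steps=50):
    for _ in range(steps):
        error = target - level          # feedback: compare goal to measurement
        inflow = max(0.0, 0.5 * error)  # valve opens in proportion to the error
        outflow = 0.1 * level           # drain rate depends on current level
        level += inflow - outflow
    return level

print(round(regulate_water_level(), 2))  # settles at a steady level just below the target
```

Note that this simple proportional rule settles slightly below its target, a well-known limitation of purely proportional feedback; the stability, not the exact setpoint, is what matters for the analogy.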

All living organisms are perfectly tuned feedback control systems, but brains, in particular, are organs that specialize in top-level control. The mind is the part of the brain of which we have first-hand knowledge; it consists of the following properties of consciousness: awareness, attention, feelings, and thoughts. The body, brain, and mind work together harmoniously to keep us alive, but how do they do it? As a software engineer, I’ve spent my whole life devising algorithms to help people control things better with computers. Developing a comprehensive theory of the mind based on existing scientific information is a lot like writing a big program — I develop different ideas that look promising, then step back and see if everything still works, which leads to multiple revisions across the whole program until everything seems to be running smoothly. It is more a job for an engineer than a scientist because it is mostly about generalizing functionality to work together rather than specializing in underlying problems. Generalizing from specifics to functions that solve general cases is most of what computer programmers do. Perhaps it is temperamental, but I think engineers are driven more by a top-down perspective to get things done than a bottom-up perspective to discover details.

In this section, I am going to develop the idea that science has overlooked the fundamental role control plays in life and the mind and has consequently failed to devise an adequate control-based orientation to study them. By reframing our scientific perspective, we can develop the explanatory power we need to understand how life and the mind work.

1.1 Approaching the Mind Scientifically

“You unlock this door with the key of imagination. Beyond it is another dimension: a dimension of sound, a dimension of sight, a dimension of mind. You’re moving into a land of both shadow and substance, of things and ideas. You’ve just crossed over into… the Twilight Zone.” — Rod Serling

Many others before me have attempted to explain what the mind is and how it works. And some of them have been right on the money as far as they have gone. But no explanation has taken it to the nth degree to uncover the fundamental nature of the mind both physically and functionally, fully encompassing both how the brain and mind came to be and what they really consist of. Each branch of science that touches on the mind comes at it from a different direction, and each is fruitful in its own way, but a unified understanding requires a unified framework that spans those perspectives. I don’t see much effort being expended on that unification, so I am going to do it here. My basic contention is that science has become too specialized and we’ve been missing the forest for the trees. Whose job is it in the sciences to conceive generalized, overarching frameworks? Nobody’s; all scientists are paid to dig into the details. I believe minds are designed to collect useful knowledge, and each of us already has encyclopedic knowledge about how our mind works. Our deeply held intuitions about the special transcendent status of the mind have merit, but science finds them hard to substantiate and so discounts them. Scientists are quick to assume that the ideas we have about how we think are biased, imaginary, or even delusional because they don’t fit into present-day scientific frameworks. I don’t agree that we should discount the value of intuition on these grounds, and instead propose that intuition should lead the way. I am going to use intuition and logic to devise an expanded framework for science, one that encompasses the mind and aligns with both common sense and the latest scientific thinking. I am not suggesting we are immune to delusion or bias. They are very real and are the enemy of good science, but if we are careful, we can avoid logical fallacies and see the inner workings of the mind in a new light.

We know things simply from experience, which leverages a number of techniques to develop useful knowledge. Advice columnists expound on new problems based on their presumably greater experience in a subject domain. But science goes beyond the scope of advice by proposing to have conceived and demonstrated cause-and-effect explanations for phenomena. Science is a formalization of knowledge which, in its fully formalized state, declares laws that can perfectly predict future behavior. We recognize that science falls a bit short of perfection in its applicability to the physical world for two reasons. First, we only know the world from its behavior, not from seeing its underlying mechanisms. Second, most laws make simplifying assumptions, so one must consider the impact of complexities beyond the scope of the model. The critical quality that science adds to experienced opinion through these steps of formalization and verification is objectivity. What exactly objective means is a topic I will explore in more detail later, but from a high level, it means to be independent of subjectivity. Knowledge that is not dependent on personal perspective becomes universal, and if it uses a reliable, causal model then we can count it as scientific truth.

My explanation of the mind is part framework and part explanation. It is easier to establish objectivity for an explanation than a framework. An explanation stands on models and evidence, but a framework is one level further removed, and so stands on whether the explanations based on it are objective. A framework is a philosophy of science, and philosophy is sometimes studied independently of the object under study. What I am saying is that to establish objectivity, I can’t do that. I have to develop the philosophy in the specific context of the explanations it supports to establish the overall consistency and reliability that objectivity demands. I do believe all existing philosophies and explanations of science have merit, but in some cases they will need minor revisions, extensions, or reinterpretations to fit into the framework I am proposing. I am going to try to justify everything I propose as I propose it, but to keep things moving, I won’t always be as thorough as I would like on the first pass. In these cases, I will come back to the subject later and fill in. My primary aim is to keep things simple and clear, to appeal to common sense, and to stay as far within the scientific canon as possible. I am presuming readers have no specialized scientific background, both because I am approaching this from first principles and because I want to make this accessible to everyone.

Even the idea of studying the mind objectively is questionable considering we only know of the mind from our own subjective experience. We feel we have one and we have a sense that we know what it is up to, but all we can prove about it is that it somehow resides in the brain. Brain scans show what areas of the brain are active when our minds are active. We can even approximately tell what areas of the brain are related to what aspects of the mind by correlating personal reports with activity in brain scans.1 Beyond that, our knowledge of neurochemistry and computer science suggests that the brain potentially has the processing power to produce mental states. Other sciences, from biological to social, assume this processing is happening and draw conclusions based on that assumption. But how can we connect the physical, biological, and social sciences to see the mind in a consistent way? This search for common ground quickly takes us into a scientific twilight zone where things and ideas join the physical world and the world of imagination. It is very easy to overreach in these waters, so I will remain cognizant of Richard Feynman’s injunction against cargo cult science, which he said could only be avoided by scientific integrity: “a kind of leaning over backwards” to make sure scientists do not fool themselves or others. I’ll be trying to do that to ensure my objective — a coherent and unified theory of the mind — stands up to scrutiny.

Science has been fighting some pitched philosophical debates in recent decades which reached a standstill and left it on pretty shaky ground. I am referring to the so-called science wars of the 1990s, in which postmodernists pushed the claim that all of science was a social construction. Scientific realism alone is inadequate to fight off postmodern critiques, so, given that this is the stance on which science most firmly depends, science is formally losing the battle against relativism. The relativists have been held off for now with Richard Dawkins’ war cry, “Science works, bitches!”2, which presumably implies that a firm foundation exists even if it has not been expressed. I aim to provide that solid ground using a philosophy that explains and subsumes relativism itself. These skirmishes don’t affect most scientific progress because local progress can be made independent of the big picture. But relativism is a big problem for the science of mind because while battles can still be won, we don’t, in a sense, know what we are fighting for.

Two diametrically opposed frameworks of science collide in the mind and we have to resolve that conflict to proceed. The first framework is physicalism (described below), which supports the physical sciences. The second framework is an assortment of philosophies which support the biological and social sciences. These philosophies use life and the mind as starting points and then build on that premise. Biological philosophies, which now mostly rest on Darwinism and refinements to it, are fundamentally naturalistic, which means that they assert that the forces that created life are natural. But it is not clear what those forces are, because life already exists in the form of complex, self-sustaining, engineered systems, and our theories describing how it managed to arise naturally are still somewhat incomplete. The social sciences are also naturalistic but usually add humanism as well, which emphasizes the fundamental significance and agency of human beings. In this case, it is not clear why humans or their agency should be fundamental, but to make progress, these premises are taken as foundational. While I agree with the mountains of evidence suggesting that life and the mind are natural phenomena, and hence I agree that naturalism correctly describes the universe, it is not at all clear that naturalism is the same as physicalism, and I will show that it is not. Spoiler alert: the extra force found in nature that is not part of physicalism is, in short, the disposition of life to live and of minds to think. After correctly defining naturalism, we will be in a much better position to explain complex natural phenomena like life and the mind.

As I am planning to stay within the bounds of the best-established science, I want to highlight the theories from which I will draw the most support. Everything I say will be as consistent as possible with these theories. These theories do continue to be refined, as science is never completely settled, and I will cite credible published hypotheses that refine them as needed. Also, some of these theories are guilty of overreaching, so I will have to rein them in. Here they are:

  1. Physicalism, the idea that only physical entities composed of matter and energy exist. Under the predominant physicalist paradigm, these entities’ behavior is governed by four fundamental forces, namely gravity, the electromagnetic force, and the strong and weak nuclear forces. The latter three are nicely wrapped up into the Standard Model of particle physics, and gravity is described by general relativity. So far, a theory of everything that unites these two theories remains elusive. Physicalists acknowledge that their theories cannot now or ever be proven correct or reveal why the universe behaves as it does. Rather, they stand as deductive models that map with a high degree of confidence to inductive evidence.

  2. Evolution, the idea that inanimate matter became animate over time through a succession of heritable changes. The paradigm Darwin introduced in 1859 itself evolved during the first half of the 20th century into the Modern Synthesis, incorporating genetic traits, rules of recombination, and population genetics. Watson and Crick’s 1953 discovery of the structure of DNA, the carrier of the genetic code, provided the molecular basis for this theory. Since that time, however, our knowledge of molecular mechanisms has exploded, undermining much of that paradigm. The evolutionary biologist Eugene Koonin feels that “the edifice of the [early 20th century] Modern Synthesis has crumbled, apparently, beyond repair”3, but updated syntheses have been proposed. The most widely supported post-Modern Synthesis proposal is the extended evolutionary synthesis, which adds a variety of subtle mechanisms that are still consistent with natural selection but which are not as obvious as the basic rules behind genetic traits. These mechanisms include ways organisms can change quickly and then develop full genetic stability (facilitated variation, the Baldwin effect, and epigenetic inheritance) and the effects of kin and groups on natural selection. The Baldwin effect is the idea that learned behavior maintained over many generations will create a selection pressure for adaptations that support that behavior. Eva Jablonka and Marion J. Lamb proposed in Evolution in Four Dimensions: Genetic, Epigenetic, Behavioral, and Symbolic Variation in the History of Life that the Baldwin effect lets organisms change quickly using regulatory genes (epigenes), learned behavior, and language to shift more transient changes permanently into DNA.

  3. Information theory, the idea that something nonphysical called information exists and can be manipulated. The study of information is almost exclusively restricted to the manipulation of information and not to its nature, because the manipulation has great practical value while the nature is seen as a point of only philosophical interest. However, understanding the nature of information is critical to understanding how life and the mind work, so I will be primarily concerned in this work with nature rather than manipulation. Because the nature of information has been almost completely marginalized in the study of information theory, existing science on its nature doesn’t go very far, and I have mostly had to derive my own theory of the nature of information from first principles, building on the available evidence.

  4. The Computational Theory of Mind, the idea that the human mind is an information processing system (IP) and that both cognition and consciousness result from this computation. While we normally think of computation as being mathematical, under this theory computation is generalized to include any transformation of input and internal state information using rules to produce output information. This implies that the mind has ways of encoding and processing information, which seemed radical when this idea was first proposed 70 years ago, but now seems obvious and inescapable. Where mechanical computers use symbolic states stored in digital memory and manipulated electronically, neural computers use neurochemical inputs, states, outputs, and rules. This theory, more than any other, has guided my thinking in this book. It is considered by many to be the only scientific theory that appears capable of providing a natural explanation for much if not all of the mind’s capabilities, yet its implications have not been thoroughly pursued. I am going to do that here. However, I largely reject the ideas of the representational theory of mind and especially the language of thought, as they unnecessarily and incorrectly go too far in proposing a rigid algorithmic approach when a more generalized solution is needed. Note that whenever I use the word “process” in this book, I mean a computational information process, unless I preface it with a differentiating adjective, e.g. a biological process. Although my focus in this book is on the mind, I am incidentally proposing the Computational Theory of Life, the formal statement that all living things are first and foremost information processing systems and only secondarily biological processes.
There is de facto acceptance in the biological sciences that life is computational because it is known that genes drive life and genes contain information, but the full implications of this fact should have transformed the biological sciences, and so far they have not. Also, note that I am not saying that everything is computational; I am specifically saying that life and the mind are computational. Some generalists like to extend this line of thought to propose that the universe is a giant computer, but this is a bad analogy because the universe is, for the most part (i.e. the part that is not alive), physical and devoid of information and information processing.
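The generalized sense of computation used above can be made concrete with a toy information processor: a table of rules that maps input and internal state to a new state and an output. The thermostat-like rules here are invented purely for illustration and say nothing about how neurons encode anything.

```python
# A minimal sketch of generalized computation: rules transform
# (internal state, input) pairs into (new state, output) pairs.
# The states, inputs, and rules are hypothetical examples.

def step(state, observation, rules):
    """Apply the matching rule, yielding a new state and an output."""
    return rules[(state, observation)]

rules = {
    ("idle", "cold"):    ("heating", "turn heater on"),
    ("idle", "warm"):    ("idle",    "do nothing"),
    ("heating", "cold"): ("heating", "keep heating"),
    ("heating", "warm"): ("idle",    "turn heater off"),
}

state = "idle"
for observation in ["cold", "cold", "warm"]:
    state, output = step(state, observation, rules)
    print(output)  # prints: turn heater on, keep heating, turn heater off
```

Nothing here is mathematical in the arithmetic sense, yet it is computation under the theory's definition: state plus input, transformed by rules, yields output.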

While the scientific community would broadly agree that these four theories are the leading paradigms in their respective areas, they would not agree on any one version of each theory. They are still evolving, and in some cases have parallel, contradictory lines of development. I will cite appropriate sources that are representative of these theories as needed. When I don’t cite sources, you can assume that I am presenting my own proposal or interpretation, but if I have made my case well then my points should seem sound and uncontroversial.

1.2 Information is Fundamental

Physical scientists have become increasingly committed to physicalism over the past four centuries. Physicalism is intentionally a closed-minded philosophy: it says that only physical things exist, where physical includes matter and energy in spacetime. It seems, at first glance, to be obviously true given our modern perspective: there are no ghosts, and if there were, we should reasonably expect to see some physical evidence of them. Therefore, all that is left is physical. But this attitude is woefully blind; it completely misses the better part of our existence, the world of ideas. Of course, physicalism has an answer for that — thought is physical. But are we really supposed to believe that concepts like three, red, golf, pride, and concept are physical? They aren’t. But the physicalists are not deterred. They simply say that while we may find it convenient to talk about things in a free-floating, hypothetical sense, that doesn’t constitute existence in any real sense and so will ultimately prove to be irrelevant. From their perspective, all that is “really” happening is that neurons are firing in the brain, analogously to a CPU running in a computer, and our first-person perspective of the mind with thoughts and feelings is just the product of that purely physical process.

Now, it is certainly true that the physicalist perspective has been amazingly successful for studying many physical things, including everything unrelated to life. However, once life enters the picture, philosophical quandaries arise around three problems:

(a) the origin of life,
(b) the mind-body problem and
(c) the explanatory gap.

In 1859, Charles Darwin proposed an apparent solution to (a) the origin of life in On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life. His answer was that life evolved naturally through small incremental changes made possible by competitive natural selection between individuals. The basic idea of evolution through naturally occurring “selections” is now nearly universally endorsed by the scientific community because a vast and ever-growing body of evidence supports it while no convincing evidence refutes it. But just how these small incremental changes were individually selected was not understood in Darwin’s time, and even today’s models are somewhat superficial because so many intermediate steps are unknown. Two big unresolved problems in Darwin’s time were the inadequate upper limit of 100 million years for the age of the Earth and the great similarity of animals from different continents. It was nearly a century before the earth was found to be 4.5 billion years old (with life originating at least 4 billion years ago) and plate tectonics explained the separation of the continents. By the mid-20th century, evolutionary theory had developed into a paradigm known as the Modern Synthesis that standardized notions of how variants of discrete traits are inherited. This now classical view holds that each organism has a fixed number of inherited traits called genes, that random mutations lead to gene variants called alleles, and that each parent contributes one gene at random for each trait from the two it inherited from its parents to create offspring with a random mixture of traits. Offspring compete by natural selection, which allows more adaptive traits to increase in numbers over time. 
While the tenets of the Modern Synthesis are still considered to be broadly true, what we have learned in the past seventy or so years has greatly expanded the repertoire of evolutionary mechanisms, substantially undermining the Modern Synthesis in the process. I will discuss some of that new knowledge later on, but for now, it is sufficient to recognize that life and the mind evolved over billions of years from incremental changes.
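The classical inheritance rules just described are simple enough to simulate. The sketch below is a toy, assuming a single trait with two alleles and ignoring everything the extended synthesis adds; it only shows the 1:2:1 ratio that falls out of each parent contributing one allele at random.

```python
# A toy simulation of the Modern Synthesis rule described above:
# each parent contributes one of its two alleles at random.
# Crossing two Aa heterozygotes yields AA : Aa : aa near 1 : 2 : 1.
import random
from collections import Counter

def offspring(parent1, parent2):
    """Each parent passes on one allele chosen at random."""
    return "".join(sorted(random.choice(parent1) + random.choice(parent2)))

random.seed(0)  # fixed seed so the run is repeatable
counts = Counter(offspring(("A", "a"), ("A", "a")) for _ in range(10000))
print(counts)  # counts of AA, Aa, aa, roughly 2500 : 5000 : 2500
```

The same sampling logic, iterated over generations with a fitness bias, is the skeleton of every population-genetics model of natural selection.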

Science still draws a blank trying to solve (b) the mind-body problem. In 1637, René Descartes, after thinking about his own thoughts, concluded “that I, who was thinking them, had to be something; and observing this truth, I am thinking therefore I exist”1, which is popularly shortened to Cogito ergo sum or I think, therefore I am. Now, we still know that “three” is something that exists that is persistent and can be shared among us regardless of the myriad ways we might use our brain’s neurochemistry to hold it as an idea, so intuitively we know that Descartes was right. But officially, under the inflexible auspices of physicalism, three doesn’t exist at all. Descartes saw that ideas were a wholly different kind of thing than physical objects and that somehow the two “interacted” in the brain. The idea that two kinds of things exist at a fundamental level and that they can interact is called interactionist dualism. I will demonstrate that interactionist dualism is the correct ontology of the natural world (an ontology is a philosophy itemizing what kinds of things exist), but not, as it turns out, the brand that Descartes devised. Descartes famously, but incorrectly, proposed that a special mental substance existed that interacted with the physical substance of the brain in the pineal gland. He presumed his mental substance occupied a realm of existence independent from our physical world which had some kind of extent in time and possibly its own kind of space, which made it similar to physical substance. We call his dualism substance dualism. We now know substance dualism is incorrect because the substance of our brains alone is sufficient to create thought.

Physicalism is an ontological monism that says only one kind of thing, physical things, exist. But what is existence? Something that exists can be discriminated on some basis or another as being distinct from other things that exist and is able to interact with them in various ways. Physical things certainly qualify, but I am claiming that concepts also qualify. They can certainly be discriminated and have their own logic of interactions. This doesn’t quite get us down to their fundamental nature, but bear with me and I will get there soon. Physicalism sees the mind as an activity of the brain, and activities are physical events in spacetime, so the mind is just another way of talking about the same thing. At a low level, the mind/brain consists of neurons connected in some kind of web. Physicalism endorses the idea that one can model higher levels as convenient, aggregated ways of describing lower levels with fewer words. In principle, though, higher levels can always be “reduced” to lower levels incrementally by breaking them down in enough detail. So we may see cells and organs and thoughts as conveniences of higher-level perspectives which arise from purely physical forms. I am going to demonstrate that this is false and that cells, organs, and thoughts do not fully reduce to physical existence. The physicalists are partly right. The mind is a computational process of the brain like digestion is a biological process of the gastrointestinal system. Just as computers bundle data into variables, thinking bundles data into thoughts and concepts which may be stored as memories in neurons. Computers are clearly physical machines, so physicalists conclude that brains are also just physical machines with a “mind” process that is set up to “experience” things. This view misses the forest for the trees because neither computers nor brains are just physical machines… something more is going on that physical laws alone don’t explain.

This brings us to the third problem, (c) the explanatory gap. The explanatory gap is “the difficulty that physicalist theories have in explaining how physical properties give rise to the way things feel when they are experienced.” In the prototypical example, Joseph Levine said, “Pain is the firing of C fibers”, which provides the neurological basis but doesn’t explain the feeling of pain. Of course, we know, independently of how it feels, that the function of pain is to inform the brain that something is happening that threatens the body’s physical integrity. That the brain should have feedback loops that can assist with the maintenance of health sounds analogous to physical feedback mechanisms, so a physical explanation seems sufficient to explain pain. But why things feel the way they do, or why we should have any subjective experience at all, does not seem to follow from physical laws. Bridging this gap is called the hard problem of consciousness because no physical solution seems possible. However, once we recognize that certain nonphysical things exist as well, this problem will go away.

We can resolve these three philosophical quandaries by correcting the underlying mistake of physicalism. That mistake is in assuming that only physical things can arise from physical causes. In one sense, it’s true: life and minds are entirely physical systems following physical laws. But in almost every way that matters to us, it is false: some physical systems (living things) can use feedback to perpetuate themselves and also to become better at doing so, which in turn creates in them the disposition to do so. This inclination, backed up by the capacity to pursue it by capturing and storing information, is not itself a physical thing, even though it exists in a physical system. For simplicity, I will usually just refer to this kind of existence as information, which is a term that usually refers to physically-encoded functionality, but it is understood to have meaning beyond the encoding. Often I will call it function, since information only exists to serve a function and disposition is managed through functional capacities. Information, functions, capacities, and dispositions are not physical and do exist. Ideas are information, but information is much more than just ideas. The ontology of science needs to be reframed to define and encompass information. Before I can do that, I am going to take a very hard look at what information is, and what it is not. I am going to approach the subject from several directions to build my case, but I’m going to start with how life, and later minds, expanded the playing field of conventional physics.

During the billions of years before life came along, particles behaved in ways that could be considered very direct consequences of the Standard Model of particle physics and general relativity. This is not to say these theories are in their final form, but one could apply the four fundamental forces to any bunch of matter or energy and be able to predict pretty well what would happen next. But when life came along, complex structures started to develop with intricate biochemistries that seemed to go far beyond what the basic laws of physics would have predicted would happen. This is because living things are information processing systems, or information processors (IPs) for short, and information can make things happen that would be extremely unlikely to happen otherwise. Organisms today manage heritable information using DNA (or RNA for some viruses) as their information repository. While the information resides in the DNA, its meaning is only revealed when it is translated into biological functions via biochemical processes. Most famously, the genetic code of DNA uses four nucleotide letters to spell 64 3-letter words that map to twenty amino acids (plus start and stop). Technically, DNA is always transcribed first into RNA and then translated from RNA into protein. A string of amino acids forms a protein, and proteins do most of the heavy lifting of cell maintenance. But only two percent of the DNA in humans codes for proteins. Much of the rest regulates when proteins get translated, which most critically controls cell differentiation and specialization in multicellular organisms. Later on I will discuss some other hypothesized additional functions of non-coding DNA that substantially impact how new adaptations arise. But, as regards the storage of information, we know that DNA or RNA store the information and it is translated one-way only to make proteins.
The stored information in no way “summarizes” the function the DNA, RNA or proteins can perform; only careful study of their effects can reveal what they do. Consequently, knowing the sequence of the human genome tells us nothing about what it does; we have to figure out what pieces of DNA and RNA are active and (where relevant) what proteins they create, and then connect their activities back to the source.
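The one-way flow just described, from DNA through RNA to protein, can be illustrated with a toy translator. The codon table below is a small excerpt of the standard 64-entry genetic code, and the DNA strand is invented for the example.

```python
# A sketch of the one-way translation described above: template-strand DNA
# is transcribed into messenger RNA, and RNA codons (3-letter words from a
# 4-letter alphabet) map to amino acids until a stop codon is reached.

CODON_TABLE = {  # excerpt of the standard genetic code
    "AUG": "Met",  # methionine, also the start signal
    "UUU": "Phe", "GGC": "Gly", "AAA": "Lys",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}

def transcribe(dna):
    """Template-strand DNA -> mRNA: complementary bases, with U replacing T."""
    pairs = {"A": "U", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in dna)

def translate(rna):
    """Read 3-letter codons until a stop codon, yielding an amino acid chain."""
    chain = []
    for i in range(0, len(rna) - 2, 3):
        amino = CODON_TABLE[rna[i:i + 3]]
        if amino == "STOP":
            break
        chain.append(amino)
    return chain

rna = transcribe("TACAAACCGTTTATT")  # a made-up template strand
print(translate(rna))  # ['Met', 'Phe', 'Gly', 'Lys']
```

Note how the point made in the text shows up even in this toy: the letters alone reveal nothing about what the resulting protein does; that meaning lives entirely in the downstream biochemistry.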

Animals take information a step further by processing and storing real-time information using neurochemistry in brains2. While other multicellular organisms, like plants, fungi, and algae, react to their environments, they do so very slowly from our perspective. Sessile animals like sponges, corals, and anemones also appear plantlike and seem to lack coordinated behavior. Mobile animals encounter a wide variety of situations for which they need a coordinated response, so they evolved brains to assess and then prioritize and select behaviors appropriate to their current circumstances. Many and perhaps all animals with brains go further still by using agent-centric processes called minds within their brains that represent the external world to them through sensory information that is felt or experienced in a subjective, first-person way. This first-person thinking then contributes to top-level decisions.

While nobody disputes that organisms and brains use information, it is not at all obvious why this makes them fundamentally different from, say, simple machines that don’t use information. To see why they are fundamentally different, we have to think harder about what information really is and not just how it is used by life and brains. Colloquially, information is facts (as opposed to opinions) that provide reliable details about things. More formally, information is “something that exists that provides the answer to a question of some kind or resolves uncertainty.” But provides answers to whom? The answer must be to an information processor. Unless the information informs “someone” about something, it isn’t information. But this doesn’t mean information must be used to be information; it only has to provide answers that could be used. Information is a potential or capacity that can remain latent, but must potentially be usable by some information processor to do something. So what is fundamentally different about organisms and brains from the rest of the comparably inert universe is that they are IPs, and only IPs can create or use information.
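This definition of information as what resolves uncertainty has a standard quantitative form: Shannon's measure counts how many yes/no answers (bits) it takes to single out one outcome among equally likely alternatives. A minimal sketch:

```python
# Quantifying "resolving uncertainty": picking one outcome out of N
# equally likely possibilities conveys log2(N) bits of information.
import math

def bits_to_resolve(num_outcomes):
    """Bits needed to single out one of N equally likely outcomes."""
    return math.log2(num_outcomes)

print(bits_to_resolve(2))   # a coin flip resolves 1.0 bit
print(bits_to_resolve(64))  # identifying one codon out of 64 resolves 6.0 bits
```

This is the manipulation side of information theory mentioned earlier; it measures quantities of information while remaining silent about its nature.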

But wait, you are thinking, isn’t the universe full of physical information? Isn’t that what science has been recording with instruments about every observable aspect of the world around us in ways that are quite objectively independent of our minds as IPs? If we have one gram of pure water at 40 degrees Fahrenheit at sea level at 41°20’N 70°0’W (which is in Nantucket Harbor), then this information tells us everything knowable by our science about that gram of matter, and so could be used to answer any question or resolve any uncertainty we might have about it. Of course, the universe doesn’t represent that gram of water using the above sentence; it uses molecules, of which there are sextillions in that gram. One might think this would produce astronomically complex behavior, but the prevailing paradigm of physics claims a uniformity of nature in which all water molecules behave the same. Chemistry and materials science then provide many macroscopic properties that work with great uniformity as well. Materials science reduces to chemistry, and chemistry to physics, so higher-level properties are conveniences of description that can be reduced to lower-level properties and so are not fundamentally different. In principle, then, physical laws can be used to predict the behavior of anything. Once you know the structure, quantity, temperature, pressure, and location of anything, then the laws of the universe presumably take care of the rest. Our knowledge of physical laws is still a bit incomplete, but it is good enough that we can make quite accurate predictions about all the things we are familiar with.

Physical information clearly qualifies as information once we have taken it into our minds as knowledge, which is information within our minds’ awareness. But if we are thinking objectively about physical information outside the context of what our minds are doing, that means we are thinking of this information as being present in the structure of matter itself. But is that information really in the matter itself? Matter can clearly have different structures. First, it can differ in the subatomic particles that comprise it, and there are quite a variety of such particles. Next, how these particles combine into larger particles and then atoms and then molecules can vary tremendously. And finally, the configurations in which molecules can be assembled into crystalline and aggregate solids are nearly endless. Information can describe all these structural details, and also the local conditions the substance is under, which chiefly include quantity, temperature, pressure, and location (though gravity and the other fundamental forces work at a distance, which makes each spot in the universe somewhat unique). But while we can use information to describe these things, is it meaningful to say the information is there even if we don’t measure and describe it? Wouldn’t it be fair to say that information is latent in physical things as a potential or capacity which can be extracted by us as needed? After all, I did say that information is a potential that doesn’t have to be used to exist.

The answer is no, physical things contain no information. Physical information is created by our minds when we describe physical things, but the physical things themselves don’t have it. Their complex structure is simply physical and that is it. The laws of the universe then operate uniformly at the subatomic level as particles or waves or whatever they really are. The universe doesn’t need to take measurements or collect information just as a clock doesn’t; it just ticks. It is a finite state machine that moves ahead one step at a time using local rules at each spot in the universe. This explanation doesn’t say how it does that or what time is, but I am not here to solve that problem. It is sufficient to know that outside of information processors, the universe has no dispositions, functions, capacities or information. Now, how close particles get to each other affects what atoms, molecules and aggregate substances form, and can create stars and black holes at high densities. But all this happens based on physical laws without any information. While there are patterns in nature that arise from natural processes, e.g. in stars, planets, crystals, and rivers, these patterns just represent the rather direct consequences of the laws of physics and are not information in and of themselves. They only become information at the point where an IP creates information about them. So let’s look at what life does to create information where none existed before.
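The clockwork picture above, local rules ticking ahead at each spot with no measurement or information involved, can be made concrete with a toy model. The sketch below uses an elementary cellular automaton (Rule 110 is an arbitrary choice, purely for illustration); the update rule consults only a cell and its two neighbors, and nothing in the system collects, stores, or uses information about the whole.

```python
# A toy "universe" as a finite state machine: each cell advances one
# step at a time using only local rules (its own state and its two
# neighbors). No part of the system measures anything or represents
# anything; it just ticks, like the clock in the text.
RULE = 110  # an arbitrary elementary cellular-automaton rule


def step(cells):
    """Advance the whole row one tick using purely local rules."""
    n = len(cells)
    out = []
    for i in range(n):
        left, mid, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        neighborhood = (left << 2) | (mid << 1) | right  # a value 0..7
        out.append((RULE >> neighborhood) & 1)  # look up the local rule
    return out


row = [0] * 31
row[15] = 1  # a single "disturbance" in an otherwise uniform row
for _ in range(10):
    print("".join(".#"[c] for c in row))
    row = step(row)
```

Complex-looking structure emerges, but there is no information here until some IP describes or exploits the patterns; the rule itself answers no questions.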

Living things are complicated because they have microstructure down to the molecular level. Cells are pretty small but still big enough to contain trillions of molecules, all potentially doing different things, which is a lot of complexity. We aren’t currently able to collect all that information and project what each molecule will do using either physics or chemistry alone. But we have found many important biochemical reactions that considerably illuminate how living things collect and use energy and matter. And physicalism maintains that, given a complete enough picture of such reactions, we can completely understand how life works. But this isn’t true. Perfect knowledge of the biochemistry involved would still leave us unable to predict much of anything about how a living thing will behave. Physical laws alone provide essentially no insight. Our understanding of biological systems depends mostly on theories of macroscopic properties that don’t reduce to physical laws. We are just used to thinking in terms of biological functions so we don’t realize how irreducible they are. Even at a low level, we take for granted that living things maintain their bodies by taking in energy and materials for growth and eliminating waste. But rocks and lakes don’t do that, and nothing in the laws of physics suggests complex matter should organize itself to preserve such fragile, complex, energy-consuming structures. Darwin was the first to suggest a plausible physical mechanism: incremental change steered by natural selection. This continues to be the only idea on the table, and it is still thought to be correct. But what is still not well appreciated is how this process creates information.

At the heart of the theory of evolution is the idea of conducting a long series of trials in which two mechanisms compete and the fitter one vanquishes the less fit and gets to survive longer as its reward. In practice, the competition is not head-to-head this way, and fitness is defined not by the features of competing traits but by the probability that an organism will survive and replicate. So all I mean by “more fit” is able to survive longer. Birds can and apparently have evolved into less physically fit forms to produce more spectacular plumage, possibly sometimes even evolving themselves into extinction3. Also, under conventional evolutionary theory, simpler forms can emerge with equal likelihood as more complex forms, leading to things like blind cave fish4. Genetic recombination provided by sexual reproduction allows the fitness of each trait to evolve independently of the fitness of individual organisms. No one trait may make a life-or-death difference, but over time, the traits that support survival better will outcompete and displace less capable traits. Finally, evolution includes the idea that mutation can change traits or create new ones. If you look over this short summary of evolution, you can see the places where I implicitly departed from classical physics and invoked something new by using words like “traits”, “fitness”, “probability”, and “compete”. These words are generalizations whose meaning relative to evolution is lost as soon as we think about them as physical specifics. Biological information is created at the moment that feedback from one or more situations is taken as evidence that can inform a future situation, which is to say that it can give us better than random odds of being able to predict something about that future situation.
This concept of information is entirely nonphysical; it is only about similarities of features, where features themselves are informational constructs that depend on being able to be recognized with better than random odds. Two distinct physical things can be exactly alike except for their position in time and space, but we can never prove it. All that we can know is that two physical things have observable features which can be categorized as the same or different based on some criteria. These criteria of categorization, and the concept of generalized categories, are the essence of information. For now, let’s focus only on biological information captured by living organisms in DNA and not on mental information managed by brains. Natural selection implies that biological information is created by inductive logic, which consists of generalizations about specifics whose logical truth is a matter of probabilities rather than logical certainty. Logic produces generalities, which are not physical things one can point to. And the inductive trial-and-error of evolution creates and preserves traits that carry information, but it doesn’t describe what any of those traits are. Furthermore, any attempt to describe them will itself necessarily be an approximate generalization because the real definition of the information is tied to its measure of fitness, not to any specific effects it creates.

We know that evolution works as we are here as evidence, but why did processes that collected biological information form and progress so as to create all the diverse life on earth? The reason is what I call the functional ratchet, a dynamic that has previously been called an arms race. A ratchet is a mechanical device that allows motion in only one direction, as with a cogged wheel with backward angled teeth. Let’s call the fitness advantage a given trait provides its function. More generally capable functions will continuously displace less capable ones over time because of competition. This happens in two stages. First, useful functionality provided by entirely new traits will tend to persist because it provides capabilities other organisms lack. Second, variants of the same trait compete head to head to improve each trait continuously. It is often said that evolution is directionless and human beings are not the “most” evolved creatures at the inevitable endpoint, but this is an incorrect characterization of what is happening. Evolution is always pulled in the direction of greater functionality by the functional ratchet. What functionality means is local to the value each trait contributes to each organism at each moment, so because circumstances change and there is a wide variety of ecological niches, evolution has no specific target and no given function will necessarily ever become advantageous. But the relentless pull toward greater functionality has great potential to produce ever more complex and capable organisms, and this is why we see such a large variety. Simpler forms, like blind cave fish, have actually become increasingly functional relative to the needs of living in dark caves. The functionality that is lost because it is no longer sufficiently useful doesn’t violate the one-way movement of the ratchet; it has just become irrelevant because the ratchet changed.
So it is not at all a coincidence that life is more diverse now than in the past or that human intelligence evolved. I will discuss later on how the cognitive ratchet created human brains in the evolutionary blink of an eye5.
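The one-way pull of the functional ratchet can be sketched as a toy selection loop. This is a minimal illustration under made-up assumptions (a “trait variant” is reduced to a single functionality score, mutation is an unbiased random nudge, and selection is simple truncation of the less fit half), not a model of real genetics:

```python
import random

random.seed(1)


def ratchet(generations=200, pop=100):
    """Toy selection loop. Each individual is just a functionality
    score. Mutation moves scores randomly in either direction, but
    selection preferentially keeps the more functional variants, so
    the population ratchets toward greater functionality over time."""
    scores = [0.0] * pop
    for _ in range(generations):
        # mutate: the direction of change is random (no built-in bias)
        mutated = [s + random.gauss(0, 1) for s in scores]
        # select: fitter variants displace less fit ones head to head
        mutated.sort(reverse=True)
        survivors = mutated[: pop // 2]
        scores = survivors + survivors  # survivors replicate
    return sum(scores) / pop


print(round(ratchet(), 2))  # mean functionality climbs well above zero
```

Even though mutation has no preferred direction, selection alone drags the population’s mean functionality steadily upward, which is the one-way tooth of the ratchet.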

Note that while the word function suggests we can list the effects the trait can cause in advance, I am using it in a more abstract sense to include any general effects it can cause, whether they are knowable or not. In practice, because any effects caused by the trait in specific situations are more likely to be preserved over time if they create net benefits to survival, a collection of effects that are probably more helpful than not overall are likely to evolve for the trait, given that it can change over time. The trait has arguably been causing effects continuously for millions to billions of years, all of which have contributed probabilistically to the trait’s current functionality. However, for entirely physical reasons, traits are likely to be highly specialized, usually having just one fairly obvious functional effect. Any given protein coded by DNA can only have a small, finite number of effects, and it will likely only be used for effects for which it does a better job than any other trait. My point is that the exact benefits and costs can be very subtle and any understanding we acquire is likely to overlook such subtleties. Beyond subtleties, cases of protein moonlighting, in which the same protein performs quite unrelated functions, are now well-documented. In the best-known case, some crystallins can act both as enzymes that catalyze reactions and as the transparent structural material of eye lenses.6 But even proteins that can only perform one enzymatic function can use that function in many contexts, effectively creating many functions.

Induction, the idea that function is a byproduct of a long series of trial and error experiments whose feedback has been aggregated, is sufficient to explain evolution, but the mind also uses deduction. I noted before that where induction works from the bottom up (from specifics to generalities), deduction works from the top down (generalities to specifics). From the deductive perspective, we see functions in their simplified, discrete forms which cause specific effects. The body is an agent, limbs are for locomotion, eyes are for seeing, hearts are for pumping blood, gullets are for taking in food, etc. Viewed this way, these functions describe clear contributions to overall survival and fitness, and detailed study always reveals many more subtle subsidiary functions. Of course, we know that evolution didn’t “design” anything because it used trial and error rather than discrete deductive causes and effects, but we know from experience that deduction can provide very helpful and hence functional support, even though it is not the way the world works. Why and how it does this I will get into later, but for now, let’s review how deduction sees design problems. Deduction begins with a disposition, which is a tendency toward certain actions, that becomes an intent, which is an identified inclination to achieve possible effects or goals. Effects and goals are inherently abstractions in that they don’t refer to anything physical but instead to a general state of affairs, for which the original and driving state of affairs concerning life is to continue to survive. The manipulation of abstractions as logical chess pieces is called deductive reasoning. Techniques to reach goals or purposes are called strategies, designs, or planning. I call the actions of such techniques maneuvers. All these terms except disposition, function, cause, and effect are strictly deductive terms because they require abstractions to be identified.
I will expand more in the next chapter on how disposition, functionality, and causality (cause and effect) can be meaningful in inductive contexts alone, without deduction. My point, for now, is that while evolution has produced a large body of function entirely by inductive means, deductive means can help us a lot to understand what it has done. Provided we develop an understanding of the limitations of deductive explanations, we can be well-justified in using them. I am not going to credit the philosophy of biology with fully exploring those limitations, but we can safely say they are approximately understood, and so on this basis it is reasonable for biologists both to use deductive models to explain life and to characterize evolutionary processes as having intent and designs. There is, however, an unspoken understanding among biologists that the forces of evolution, using only inductive processes, have created something that can fairly be called functional. This something has to sit beneath the surface of their deductive explanations because all explanations must be formed with words, which are themselves abstract tools of deductive logic. In other words, information and function are very much present in all living structures and have largely been recorded in DNA, and this information and function are not physical at all. Physicalists go a step too far, then, by discounting the byproducts of inductive information processes as the incidental effects of physical processes.

Although it is possible to create, collect, and use information in a natural universe, it is decidedly nontrivial, as the complexity of living things demonstrates. Beyond the already complex task of creating it with new traits, recombination, and natural selection, living things need to have a physical way of recording and transcribing information so that it can be deployed as needed going forward. I have said how DNA and RNA do this for life on earth. Because of this, we can see the information of life captured in discrete packages called genes. DNA and RNA are physical structures, and the processes that replicate and translate them are physical, but as units of function, genes are not physical. Their physical components should be viewed as a means to an end, where the end is the function. It is not a designed end but an inductively shaped one. The physical shapes of living structures are cajoled into forms that would have been entirely unpredictable based on forward-looking design goals, but which patient trial and error demonstrated are better than alternatives.

Beyond biological information, animals have brains that collect and use mental information in real time that is stored neurologically. And beyond that, humans can encode mental information as linguistic information or representational information. Linguistic information can either be in a natural language or a formal language. Natural languages assume a human mind as the IP, while formal languages declare the permissible terms and rules, which is most useful for logic, mathematics, and computers. Representational information simulates visual, audio or other sensory experience in any medium, but most notably nowadays in digital formats. And finally, humans create artificial information, which is information created by computer algorithms, most notably using machine learning. All of these forms of information, like biological information, answer questions or resolve uncertainties to inform a future situation. They do this by generalizing and applying nonphysical categorical criteria capable of distinguishing differences and similarities. Some of this information is inductive like biological information, but, as we will see, some of it is deductive, which expands the logical power of information.

We have become accustomed to focusing mostly on encoded information because it can be readily shared, but all encodings presume the existence of an IP capable of using them. For organisms, the whole body processes biological information. Brains (or technically, the whole nervous and endocrine systems) are the IP of mental information in animals. Computers can act as the IPs for formal languages, formalized representations, and artificial information, but can’t process natural languages or natural representational information. However, artificial information processing can simulate natural information processing adequately for many applications, such as voice recognition and self-driving cars. My point here is that encoded information is only an incremental portion of any function, which requires an IP to be realized as function. We can take the underlying IPs for granted for any purpose except understanding how the IP itself works, which is the point of this book. While we have perfect knowledge of how electronic IPs work, we have only a vague idea of how biological or mental information processors work.

Consider the following incremental piece of biological information. Bees can see ultraviolet light and we can’t. This fact builds on prevailing biological paradigms, e.g. that bees and people see light with eyes. This presumes bees and people are IPs for which living and seeing are axiomatic underlying functions. The new incremental fact tells us that certain animals, namely bees, see ultraviolet as well. This fact extends what we knew, which seems simple enough. A child who knows only that animals can see and bees are small flying animals that like flowers can now understand how bees see things in flowers that we can’t. A biologist working on bee vision needs no more complex paradigm than the child; living and seeing can be taken for granted axiomatically. She can focus on the ultraviolet part without worrying about why bees are alive or why they see. But if our goal is to explain bees or minds in general, we have to think about these things.

Our biological paradigm needs to define what animals and sight are, but the three philosophical quandaries of life cited above stand in the way of a detailed answer. Physicalists would say that lifeforms are just like clocks but more intricate. That is true; they are intricate machines, but, like clocks, an explanation of all their pieces, interconnections, and enabling physical forces says nothing about why they have the form they do. Living things, unlike glaciers, are shaped by feedback processes that gradually make them a better fit for what they are doing. Everything that happened to them back to their earliest ancestors about four billion years ago has contributed. A long series of feedback events leveraged inductive logic to create biological information, rather than relying on the laws of physics alone. Yes, biological IPs leverage physical laws, but they add something important for which the physical mechanisms are just the means to an end. The result is complex creations that have essentially a zero probability of arising by physical mechanisms alone.

How, exactly, do these feedback processes that created life create this new kind of entity called information, and what is information made of? The answer to both questions is actually the same definition given for information above: the reduction of uncertainty, which can also be phrased as an ability to predict the future with better odds than random chance. Information is made out of what it can do, so we are what we can do. We can do things with a pretty fair expectation that outcomes will align with our expectations. It isn’t really predicting in a physical sense because we see nothing about the actual future and any number of things could always go wrong with our predictions. We could only know the future in advance with certainty if we had perfect knowledge of the present and a perfectly deterministic universe. But we can never get perfect knowledge because we can’t measure everything and because quantum uncertainty limits how much we can know about how things will behave. But biological information isn’t based on perfect predictions, only approximate ones. A prediction that is right more than it is wrong can arise in a physical system if it can use feedback from a set of situations to make generalized guesses about future situations that can be deemed similar. That similarity, measured any way you like, carries predictive information by exploiting the uniformity of nature, which usually causes situations that are sufficiently similar to behave similarly. It’s not magic, but it seems like magic relative to conventional laws of physics, which have no framework for measuring similarity or saying anything about the future. A physical system with this capacity is exceptionally nontrivial: living systems took billions of years to evolve into impressive IPs that now centrally manage their heritable information using DNA. Animals then spent hundreds of millions of years evolving minds that manage real-time information using neurochemistry.
Finally, humans have built IPs that can manage information using either standardized practices (e.g. by institutions) or computers. But in each case the functional ratchet has acted to strongly conserve more effective functionality, pulling evolution in the direction of greater functionality. It has often been said that evolution is “directionless” because it seems to pull toward simplicity as much as toward complexity. As Christie Wilcox put it in Scientific American, “Evolution only leads to increases in complexity when complexity is beneficial to survival and reproduction. … the more simple you are, the faster you can reproduce, and thus the more offspring you can have. … it may instead be the lack of complexity, not the rise of it, that is most intriguing.”7 It is true that evolution is not about increasing complexity; it is about increasing functionality. Inductive trial and error always chooses more functionality over less, provided you define “more” as what induction did. In other words, it is a statistical amalgamation of successful performances where the criteria for each success were situation-specific. But what is “success”? Selections not made by brains depend only on the impact on survival and reproduction, but minds make real-time selections based on high-level models (desires), which must have evolved for reasons that aligned with survival or reproduction but may no longer accurately reflect a benefit to survival.
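The claim that similarity carries predictive information can be sketched as nearest-neighbor prediction: remember past situations and their outcomes, then bet that a new situation will turn out like the most similar remembered one. Everything below (the features, thresholds, and data) is made up purely for illustration:

```python
# Prediction as similarity, not physics: past situations (feature
# vectors) and their outcomes are remembered; a new situation is
# judged by its nearest remembered neighbor. Nothing here "sees" the
# future. It just exploits the uniformity of nature: sufficiently
# similar situations usually behave similarly.


def distance(a, b):
    """A similarity measure (here, plain Euclidean distance)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def predict(past, situation):
    """Return the outcome of the most similar remembered situation."""
    best = min(past, key=lambda pair: distance(pair[0], situation))
    return best[1]


# hypothetical experience: (temperature, humidity) -> did it rain?
experience = [
    ((30, 0.9), "rain"),
    ((28, 0.8), "rain"),
    ((15, 0.3), "dry"),
    ((12, 0.2), "dry"),
]
print(predict(experience, (29, 0.85)))  # prints "rain"
```

The prediction is only approximate and can be wrong, but because it is right more often than chance, the remembered feedback genuinely counts as information by the definition above.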

A functional entity has the capacity to do something useful, where useful means able to act so as to cause outcomes substantially similar to outcomes seen previously. To be able to do this, one must also be able to do many things that one does not actually do, which is to say one must be prepared for a range of circumstances for which appropriate responses are possible. Physical matter and energy are composed of a vast number of small pieces whose behavior is relatively well-understood using physical laws. Functional entities are composed of capacities and generalized responses based on those capacities. Both are natural phenomena. Until information processing came along through life, function (being generalized capacity and response) did not exist on earth (or perhaps anywhere). But now life has introduced an uncountable number of functions in the form of biological traits. As Eugene Koonin of the National Center for Biotechnology Information puts it, “The biologically relevant concept of information has to do with ‘meaning’, i.e. encoding various biological functions with various degrees of evolutionary conservation.”8 The mechanism behind each trait is itself purely physical, but the fact that the trait works across a certain range of circumstances is because “works” and “range” generalize abstract capacities, which one could call the reasons for the trait. The traits don’t know why they work, because knowledge is a function of minds, but their utility across a generalized range of situations is what causes them to form. That is why information is not a physical property of the DNA; it is a functional property.

Function starts to arise independent of physical existence at the moment a mechanism arises that can abstract from a token to a type, and, going the other way, from a type to a token. A token is a specific situation and a type is a generalization of that token to an abstract set of tokens that could be deemed similar based on one or more criteria. Each criterion permits a range of values that could be called a dimension, and so divides the full range of values into categories. Abstracting a token to a type is a form of indirection and is used all the time in computers, for example to let variables hold quantities not known in advance. An indirect reference to a token can either be a particular, in which case it will only ever refer to that one token, or a generality, in which case it is a type referring to the token. By referring to tokens through different kinds of references we can apply different kinds of functionality to them. Just as we can build physical computers that can use indirection, biological mechanisms can implement indirection as well. I am not suggesting that all types are representational; that is too strong a position. Information is necessarily “about” something else, but only in the sense that its collection and application must move between generalities and specifics. Inductive trial-and-error information doesn’t know it employs types because only minds can know things, but it does divide the world up this way. When we explain inductive information deductively with knowledge, we are simplifying what is happening by making analogies to cause-and-effect models even though they really use trial-and-error models. Cells have general approaches for moving materials across cell membranes which we can classify as taking resources in and expelling wastes, but the cells themselves don’t realize they have membranes and the simplification that materials are resources or waste neglects cases where they are both or neither. 
Sunlight is important to plants, so sunlight is a category plants process, which is to say they are organized so as to gather sunlight well, e.g. by turning their leaves to the sun, but they don’t pass around messages representing sunlight as a type and instructing cells to collect it.
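The token-to-type move described above can be sketched in code, since indirection is exactly what programming languages are built on. In this toy sketch (all names and thresholds are invented for illustration, loosely echoing the earlier bee example), a type is nothing but a reusable categorization criterion, and a function refers to tokens indirectly through that criterion rather than listing them in advance:

```python
# Tokens and types via indirection: a token is a specific thing; a
# type is a criterion that carves a dimension into categories. The
# categorize() function never names the tokens it will handle; it
# refers to them indirectly, through each type's criterion.


def make_type(criterion):
    """A type is just a reusable categorization criterion."""
    return criterion


# one dimension (wavelength in nanometers) divided into categories;
# the boundary values are illustrative assumptions
ultraviolet = make_type(lambda wavelength: wavelength < 380)
visible_to_us = make_type(lambda wavelength: 380 <= wavelength <= 750)


def categorize(token, types):
    """Abstract a specific token to whichever types it satisfies."""
    return [name for name, t in types.items() if t(token)]


types = {"ultraviolet": ultraviolet, "visible_to_us": visible_to_us}
print(categorize(365, types))  # a bee-visible token: ['ultraviolet']
print(categorize(550, types))  # green light: ['visible_to_us']
```

The point of the sketch is that the same generality (the criterion) applies to endless specifics never seen before, which is what lets functionality float free of any particular physical token.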

To clarify further, we can now see that function is all about applying generalities to specifics using indirect references, while physical things are just about specifics. [Now is a good time to point out that “generalize” and “generalization” mean the same thing as “general” and “generality” except that a generalization is created by inference from specific cases, while a generality is unconcerned with whether it was created with inductive or deductive logic. Because I will argue that deductive logic can only be applied by aligning it to inductive findings, I will use the terms interchangeably but according to the more fitting connotation.] We can break generalities down into increasingly specific subcategories, arriving eventually at particulars.

Natural selection allows small functional changes to spread in a population, and these changes are accompanied by small DNA changes that caused them. The physical change to the DNA caused the functional change, but it is really that functional change that brought about the DNA change. Usually, if not always, a deductive cause-and-effect model can be found that accounts for most of the value of an inductive trial-and-error functional feature. For example, hearts pump blood because bodies need circulation. The form and function line up very closely in an obvious way. We can pretty confidently expect that all animals with hearts will continue to have them in future designs to fulfill their need for circulation. While I don’t know what genes build the circulatory system, it is likely that most of them have contributed in straightforward ways for millions of years.

Sex, on the other hand, is not as stable a trait. Sometimes populations benefit from parity between the sexes and sometimes from disproportionately more females. Having more females is beneficial during times of great stability, and having more males during times of change. I will discuss why this is later, but the fact that this pressure can change makes it advantageous sometimes for a new mechanism of sex determination to spring up. For example, all placental mammals used to use the Y chromosome to determine the sex of the offspring. Only males have it, but males also have an X chromosome. With completely random recombination, this means that offspring have a 50% chance of inheriting their father’s Y chromosome and being male. However, two species of mole voles, small rodents of Asia, have no Y chromosome, so males have XX chromosomes like females. We don’t know what trigger creates male mole voles, but a mechanism that could produce more than 50% females would be quite helpful to the propagation of polygamous mole vole populations (as some are), because there would be more reproducing (i.e., female) offspring.9,10,11 The exact reason a change in sex determination was more adaptive is not relevant; all that matters is that it was, and the old physical mechanism was simply abandoned. A physical mechanism is necessary, and so only possible physical mechanisms can be employed, but the selection between physical mechanisms is not based on their physical merits but only on their functional contribution. As we move into the area of mental functions, the link between physical mechanisms and mental functions becomes increasingly abstract, effectively making the prediction of animal behavior based on physical knowledge alone impossible. To understand functional systems we have to focus on what capacities the functions bring to the table, not on the physical means they employ.
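The 50% figure above follows directly from the coin-flip nature of Y-chromosome inheritance, which a tiny simulation can confirm (this is only a sketch of the baseline XY case, not of the mole voles’ unknown mechanism):

```python
import random

random.seed(0)


# XY sex determination as a coin flip: the father passes on either
# his X or his Y with equal probability, so about half of offspring
# are male. A different mechanism (like the mole voles') is free to
# skew this ratio; here we just check the baseline expectation.
def offspring_sex():
    fathers_contribution = random.choice(["X", "Y"])
    return "male" if fathers_contribution == "Y" else "female"


n = 100_000
males = sum(offspring_sex() == "male" for _ in range(n))
print(f"fraction male: {males / n:.3f}")  # very close to 0.5
```

The simulation illustrates the functional point: the 50:50 outcome is a consequence of one particular physical mechanism, and abandoning that mechanism frees the ratio to move wherever selection favors.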

I have introduced the idea that information and the function it brings are the keys to resolving the three philosophical quandaries created by life. In the next chapter, I will develop it into a comprehensive ontology that is up to the task of supporting the scientific study of all manner of things.

1.3 Dualism and the Five Levels of Existence

To review, an information processor or IP is a physical construction, e.g. a living thing, that manages (creates and uses) information. As I have noted before, the reason that living things develop functions to help them survive is that they can — the opportunity exists that disposes them to keep existing. In other words, they have a reason to live. If I wrote a pointless computer program that had no disposition to do anything or accomplish any function, we would say it was devoid of any information or function; it processes data but no information. Something only counts as information or as having function when it is useful, meaning that it can be applied toward an end or purpose. But use, end, and purpose are not physical things or events; they are at most ways of thinking about things. However, if we could think about an approximate future state of physical things or events, where this approximation was defined in terms of similarities to past things and events, then we could think about making it our purpose to cause that future state to happen. Information processors are physical machines that use physical mechanisms to model physical things and events in a nonphysical way and then apply those nonphysical models back to physical circumstances to change them. It all sounds wildly complicated and unlikely, except that is exactly what life does: it collects physical processes (genetic traits) that approximately cause future physical events to transpire in such a way that it can keep doing it. The purpose of the prediction and application is nothing more than to be able to keep predicting and applying: to survive. But the “it” that keeps doing “it” is always changing or evolving, because time doesn’t repeat. Living IPs only do things similar to things they did before, and instead of just maintaining themselves, they create new IPs as offspring that are similar but never quite the same as themselves.

Information and function make generalizations about physical (or functional) things. These generalizations are references to these things, and references are not physical themselves even though a physical mechanism exists in an IP to hold them. The information is not how the reference is physically implemented (in neurons or computer chips); it is the useful, purposeful, functional ends to which it can be applied indirectly through references. These ends exist but are not physical; they are a new kind of existence I call functional existence. The fabric of functional existence is capacity, not spacetime. Specifically, it is the capacity to predict what will happen with better than random odds. This capacity can be described in other ways, such as the ability to answer questions, resolve uncertainties, or be useful, all of which refer to generalized ways of guessing that new things will happen that are similar to past things. I will use information synonymously with functional existence, but there is a subtle difference: information is often spoken of independently of the information processing that can be done on it, but functional existence requires both the information and its accompanying information processing as a functional whole.
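The “capacity to predict what will happen with better than random odds” can be made concrete with a toy example (hypothetical, purely illustrative): an observer that generalizes inductively from past outcomes of a biased source outguesses a purely random guesser, and that edge is what the generalization’s information amounts to.

```python
import random

random.seed(42)

# A biased source: "H" 70% of the time. An observer that induces the bias
# from past outcomes gains predictive capacity; a random guesser has none.
def biased_flip():
    return "H" if random.random() < 0.7 else "T"

history = [biased_flip() for _ in range(1000)]

# Inductive "information": always guess the majority outcome seen so far.
majority = max(set(history), key=history.count)

trials = [biased_flip() for _ in range(1000)]
informed_hits = sum(1 for t in trials if t == majority)
random_hits = sum(1 for t in trials if random.choice("HT") == t)

print(informed_hits > random_hits)  # the learned generalization beats chance
```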

We are justified in saying that function (or information) actually exists because things exist when we can discriminate them, they persist, and we can do things with them. We can discriminate information, it persists, and we can do things with it, yet it is not physical, so this qualifies it as a distinct category of being. We can therefore conclude that interactionist dualism is true after all. The idea that something’s existence can be defined in terms of the value it produces is called functionalism. For this reason, I call my brand of interactionist dualism form and function dualism, in which physical substance is “form” and information is “function”. I hold that physical things except for IPs are best explained using physicalism and IPs are best explained using a combination of physicalism and functionalism. While this means I am endorsing physicalism and functionalism, I am only endorsing a version of each. Specifically, I endorse physicalism but require it to drop the restriction that everything IPs do is physical, and I endorse functionalism only in the sense that I describe here. Many variations of functionalism with different ontologies exist which I will not describe or defend. The version I propose says that function (aka information) exists in an abstract, nonphysical way, but that a physical world like ours can use function through information processors. Consequently, although function itself is abstract and independent of physical support, the use of function in a physical world is quite concrete and dependent on elaborate feedback systems that all derive from living things. As an interactionist, I hold that form and function interact in the mind and that they do so via information processing.

Probably most cognitive scientists already consider themselves to be functionalists, in that they view mental states and processes functionally, but that doesn’t make functionalism a well-defined stance. While progress can be made without a coherent definition, a vagueness pervades the conclusions that creates uncertainty about what has been shown. By explicitly defining function and information as a kind of existence that is independent of physical substance, I hope to clarify both the physical and functional aspects of information processing to show how these two kinds of existence persist, interact, and influence the future.

To develop this idea, I’m going to further distinguish five levels of understanding we can have for each of the two kinds of existence, only the first two of which apply to physical things:

Noumenon – the thing-in-itself. Keeps to itself.

Phenomenon – that which can be observed about a noumenon. Reaches out to others.

Percept, from perception – first-order information created by an information processor (IP) using inductive reasoning on phenomena received. Notices others.

Concept, from conception – second-order information created with deductive reasoning, usually by building on percepts. Understands others.

Metaconcept, from metacognition – third-order information or “thoughts about thoughts”. Understands self.

We believe our senses tell us that the world around us exists. We know our senses can fool us, but by accumulating multiple observations using multiple senses, we build a very strong inductive case that physical things are persistent and hence exist. Science has increased this certainty enormously with instruments that are both immune to many kinds of bias and can observe things beyond our sensory range. Still, though, no matter how much evidence accumulates, we can’t know for sure that the world exists because it is out there and we are in here. But we can imagine that a thing physically exists independent of our awareness of it, and we refer to this standalone type of existence as the thing’s noumenon, or thing-in-itself (what Kant called das Ding an sich). The only way we can ever come to know anything about noumena is through phenomena, which are emanations from or interactions with a noumenon. Example phenomena include light or sound bouncing off an object, but can also include matter and energy interactions like touch, smell, and temperature. When we talk about atoms, we are referring to their noumena, or actual nature, but we don’t really know what that nature is. We only know noumena by observing their phenomena. So everything science or experience tells us of physical apples or atoms is entirely in terms of their phenomena. We believe they have a noumenal existence because they can be measured in so many different ways, and this consistency would be unlikely if the apple or atom were an illusion. We know this because an illusion of an apple, say a picture or projection of one, lacks many phenomena that real apples provide.

All knowledge based on phenomena is called a posteriori, which includes all knowledge we have of the physical world. We can have direct or a priori knowledge of noumena we can logically perceive in our own minds, which most notably includes things that are true by definition. A priori knowledge includes everything that is true by construction, which includes the logical implications of explicit deductive models. If we define rules of arithmetic such that addition necessarily works, then the rules are just part of the definition and all their implications are a priori even if we can’t easily see what all those implications are. While Kant’s greatest contribution to philosophy was the recognition that we can only know the world through phenomena, leaving physical noumena unknowable, he was perturbed that this implied that philosophers can never derive anything about the physical world from reason alone. Philosophers had always thought some truths about the world (e.g. the idea of cause and effect) could be known through thought alone, yet he had apparently proven that knowledge of the outside world must be the exclusive domain of natural philosophers, i.e. scientists. While some saw this as a fatal blow to philosophy, all it really did was clarify that philosophy is a functional affair, not a physical one. The recognition that we only know the world through phenomena was an important breakthrough because now we can readily accept that everything about the physical world is approximate, a posteriori knowledge that, far from being absolutely true, merely extrapolates about the future based on patterns seen before. Causes and effects are convenient generalizations about the world, not intrinsic physical essences.

Perception is the receipt of a phenomenon by a sensor and adequate accompanying information processing to create information about it. Physical things have noumena that radiate phenomena, but they never have perception since perception is information. A single percept is never created entirely from a single phenomenon; the capacity for perception must be built over billions of inductive trial-and-error interactions as life has done it. We notice a camera flash as a percept, but only because our brain evolved the capacity over millions of years to convert data into information. So if a tree falls in the forest and there was nobody to hear it, there was a phenomenon but no perception. Because IPs exploit the uniformity of nature, our perceptions can very accurately characterize both the phenomena we observe and the underlying noumena from which they emanate, even if complete certainty is impossible.

Perception includes everything we consciously experience without conscious effort, including sensory information about our bodies and the world as well as emotions, common sense, and intuition, which somehow bubble up into our awareness as needed. Information we receive from perception divides into two parts, one from nature and one from nurture. The nature part, innate perception, provides information in the form of feelings from senses and emotions that require no experience to feel and which don’t change given more experience. The nurture part, learned perception, provides information in the form of memories and impressions, and continually changes during our lives based on the contents of stored experience. Color and fear have innate parts independent of experience, but they are only meaningful to us because of our experience with them; we have to learn what our bodies are telling us. For convenience, I will usually call learned percepts or intuitions subconcepts, since perception comes just below conception. Our capacity to develop common sense and intuition as subconcepts is itself innate, but the experiences themselves were not anticipated by our genetics and are entirely circumstantial to our individual lives. Everything we perceive is influenced by both innate and learned perception, even though they originate from completely independent sources. So we see red using innate perception, but a lifetime of experience seeing red things then influences our perception with impressions we attribute to either common sense or intuition. All information created by perception is first-order information because it is based on induction, which is the first kind of information one can extract from data. Inductive reasoning or “bottom-up logic” generalizes conclusions from multiple experiences based on similarities, a trial-and-error approach.
Entirely outside the purposes of brains, all genetic information is also created inductively, so I am going to use the word “perception” more broadly than mental perception, extending it to the evolutionary capture of functionality into biological traits. More slowly than our intuitive minds, evolution “perceives” patterns that can provide functionality and captures them in DNA as traits. A few of those genetic traits are mental and create the innate perceptions we experience as senses and emotions.

Conception approaches information from a different direction. Instead of looking for associations from patterns from the bottom up, it works from the top down by proposing the existence of abstract entities called concepts that interact with each other according to rules of cause and effect. Concepts idealize frequently seen patterns into discrete buckets that group things or events into chunks that engage in predictable sorts of interactions with related concepts to form a conceptual model. Conceptual models obey whatever rules of logic we imagine for them, but they will predict best what will happen if they use deductive logic, because then they can reach conclusions that are logically certain. (Although conceptual thinking can follow any brand of logic and not necessarily full-fledged deductive logic, I will often refer to top-down or conceptual thinking as deductive for simplicity. However, conceptual models are broader than deductive models because concepts are vaguer than the more precisely-specified axioms of deduction.) The challenge of concepts is in building concepts and models that correspond well to situations in which they can be applied. To help with this, our base concepts are strongly linked to percepts and our conceptual rules are heavily influenced by patterns we intuit from perception.

The transition from percept to concept is gradual and is arguably a matter of perspective. From a functional standpoint, it has more to do with how the information is used than how it is structured. Structurally, information in the brain is all just interconnections, but functionally, top-level perspectives are necessary, and this makes conceptual interpretations meaningful. We trust our perceptions based on our memory and familiarity with them, but they don’t tell us why things happen. Understanding, comprehension, grasp, and explanation generally imply a conceptual model that says why. Within the logic of the conceptual model, especially if it uses deductive logic that reaches inescapable conclusions, we can know exactly what will happen with perfect foreknowledge as a logical consequence, which gives us the confident feeling that comes with understanding. We know that models never apply perfectly to the physical world, but when they come close enough for our purposes we take our chances with them (and even trust them; more on this later). The implications or entailments of logic can be chained together, allowing conceptual models to take us with certainty many steps further than induction, which can basically only reach probable one-step conclusions. I call knowledge built from conceptual models second-order information because it gives us understanding, as opposed to the mere familiarity of inductive first-order information.

Like perception, our capacity for conception is itself innate even though our concepts themselves are all learned1. So, as with perception, I will distinguish innate conception and learned conception as different components. A big difference between perception and conception, however, is that learned perception (subconcepts or intuitive knowledge) only grows at a fixed rate with experience, while learned conception (concepts or rational knowledge) is essentially unlimited because conceptual models can build on each other to become ever more powerful. This means we can not only leverage up our conceptual models over our own lifetimes, we can pass them on from generation to generation. Notably, although our innate conception is probably not much different than two thousand — or possibly even 50,000 to 200,000 — years ago, language and civilization have dramatically transformed our understanding of the world through learned conception.

I mentioned before that I would discuss whether cause and effect are meaningful in the context of induction. Inductive information processing acts based on past experience, but only looks one step ahead instead of chaining causes and effects. We can call the circumstances before an inductive action the cause and the result the effect. However, this uses concepts, calling out specific causes and effects in a general way. Induction itself doesn’t need the concepts of cause or effect; it just happens. However, given that caveat, it is fair to describe inductive processes conceptually using single-step causes and effects. Although these causes and effects happen without any intent, purpose, design, or strategy, they are dispositional. Life is predisposed to survive, and that is why the inductive processes that produce these simple causes and effects happen. Survival is the ultimate cause that produces all the inductive effects of life. This is very different from loose rocks that fall from cliffs, which have no disposition to do anything. The laws of physics are deductive tools that describe behavior in causative terms, e.g. that the loosening of clumps called rocks along cracks in cliffs will cause falling. Physically, however, the same subatomic rules are followed everywhere, and there are no rocks or cliffs except as abstract groupings in our heads.

Metacognition is thoughts about thoughts, or, more specifically, deductive reasoning about thoughts. Conception is a first-order use of deductive reasoning in which the premises are groupings of percepts, while metacognition is a higher-order use of deductive reasoning in which the premises can be concepts themselves, or concepts about concepts abstracted any number of levels. All physical things, grouped to any level of generality, are still just first-order concepts because we have a strong perceptual sense of their scope. So apple tree, tree, and plant are all first-order concepts, but lumber source and fruit season are metaconcepts about trees. Looking inward, we have a concept of ourself that is based on our subjective experience of doing things, but we also have a metaconcept of ourself which holds thoughts we have thought about ourselves. Metacognition expands our realm of comprehension from matters of immediate relevance to matters abstracted one or more levels away. It extends our reach from physical reality to unlimited imagination. I call metaconcepts third-order information because this move to arbitrary degrees of indirection unlocks new kinds of explanatory power. Conception, both first and higher-order, heavily leverages perception but gives us a kind of window into the future.

Noumena and phenomena just happen. That is, they are not functional in that they do nothing to influence future events. All physical things are necessarily just noumenal and phenomenal because they just happen, which we believe means that invariant physical laws of the universe apply to them. We can also apply the idea of noumena and phenomena to functional things by referring to them. Percepts and concepts are about things and we usually know what they are about and don’t need to observe them to find out. But things can get complicated and it may be necessary to observe a functional system to learn more about it. This is the case when we are using our minds to figure out the functions of other IPs, or if we have created a conceptual model whose implications are too hard to figure out logically. In these cases, we isolate the functional entity as a noumenon called a black box whose internal workings are taken to be unknown, and we make phenomenal observations of its behavior. For example, if we find a calculator-like device, we can play around with its buttons and observe what happens. We can never know for sure the true function the device was created to have, but we can develop theories based on observations of it. In this way, functional noumena are ultimately unknowable just like physical noumena. Our theories can become increasingly accurate, but only as they pertain to aspects of their existence that are useful to us, which may be quite different from what their creators had in mind (if they were even created for a purpose).
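The black-box procedure can be sketched in a few lines of Python (the “device” below is hypothetical, standing in for the calculator-like object above): we probe inputs, record phenomenal observations, and induce a theory of the device’s function, which remains a surmise rather than knowledge of its noumenon.

```python
# A "black box" found in the wild: we can only probe its behavior,
# not inspect its purpose or internals.
def black_box(a, b):
    return a + b  # hidden implementation, unknown to the observer

# Phenomenal observations: try inputs, record outputs.
observations = [((a, b), black_box(a, b)) for a in range(5) for b in range(5)]

# A theory induced from the observations: "the device adds its inputs."
theory = lambda a, b: a + b

# The theory is supported by every observation so far, but the device's
# true intended function can never be known for certain.
consistent = all(theory(a, b) == out for (a, b), out in observations)
print(consistent)  # True
```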

When our concepts are based on formal models like mathematics, we have access to their actual noumena because we defined them. In this case, all their logical implications are also noumenal by definition. But the implications of many formal models can be too complex for us to reason out (i.e. prove), so we may instead opt to gather information about them by induction. If we can run the model on a computer, we can do this by running millions of simulations and analyzing the results for patterns. For example, weather simulators are precisely defined, but we have no idea what all the implications of their rules might be except by running simulations and seeing what pops out. Similarly, in our own minds we can’t logically forecast all the implications of many conceptual models, so we run simulations that project what will happen using subconceptual and conceptual heuristics we have refined over time. In so doing, we have effectively built functional noumena that describe the world about which we then make phenomenal observations.
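A minimal analogue of this simulate-and-look-for-patterns approach (a hypothetical example using a random walk in place of a weather model): the model’s rules are fully defined, yet one of their implications, that typical displacement grows like the square root of the number of steps, emerges only by running many trials and inspecting the results.

```python
import random

random.seed(1)

# A fully specified formal model: a 1D random walk of unit steps.
def walk(steps):
    return sum(random.choice((-1, 1)) for _ in range(steps))

n_runs, steps = 5_000, 100
finals = [walk(steps) for _ in range(n_runs)]

# A pattern "pops out" of the runs: the root-mean-square displacement is
# about sqrt(steps), a consequence not obvious from the rules alone.
rms = (sum(x * x for x in finals) / n_runs) ** 0.5
print(rms)  # near 10 for 100 steps
```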

Evolved functions or traits can be referred to as noumena to which evolution makes adjustments based on phenomenal observations called natural selection. As with the above cases, the noumena themselves can never be understood directly, but only in terms of surmised functions of their phenomena. The noumenon of any gene itself blends information from all the feedback that created it, which, if you think about it, would take an infinite amount of information to describe because time and space don’t break down into quantized units, even if matter and energy do. Even a finite attempt to characterize all that feedback would be astronomically complex. But we can simplify a genetic function using a deductive cause-and-effect model that helps us understand it in the sense that it empowers us to make any number of useful predictions about what it will do. We know such conceptual models leave out details, but they are still useful. Daniel Dennett calls explanations of evolved noumena “free-floating rationales”2. This is a great way of putting it, because it emphasizes that the underlying logic is not dependent on anything physical, which is important because function is not a physical thing. All functional noumena are necessarily free-floating in the sense that they don’t have to be implemented to exist; they embody logical relationships whether anyone knows it or not. But, of course, all physical IPs (either through DNA or thoughts) are implemented and are ultimately concerned only with what can be implemented, because function must ultimately be useful. In other words, we can imagine information and IPs in the abstract, but abstract IPs can’t actually process any information.

Perception, conception, and metacognition are purely functional modes of existence. They depend on a physical IP, but the same function could potentially be rendered by any number of different physical implementations. We could happily live our lives entirely within a computer simulation if it did a good enough job. This doesn’t mean implementations of information on computers would be indistinguishable from implementations using physical matter; they could always be distinguished. Scientific experiments within the simulation could expose differences between physical and simulated reality. Of course, we could either prevent simulants from doing such experiments, or change the results they obtain, or modify their thought processes to fool them. But I digress; my point is not whether simulation could be a feasible alternative to physical life, but that we live our lives entirely as functional entities and only require a physical environment as a source of new information. We may require physical IPs to exist and think, but we are first and foremost not physical ourselves.

Let’s review our three quandaries in the light of form and function dualism. First, the origin of life. Outside of life, phenomena naturally occur and explanations of them comprise the laws of physics, chemistry, materials science and all the physical sciences. These sciences work out rules that describe the interactions of matter and energy. They essentially define matter and energy in terms of their interactions without really concerning themselves with their noumenal nature. As deductive explanations, they are based in the functional world of comprehension and draw their evidence from our perception of phenomena. While the target is ultimately the true noumena of nature, we realize that models are functional and not physical, and also only approximations, even if nearly perfect in their accuracy. With the arrival of life, a new kind of existence, a functional existence, arose when the feedback loops of natural selection developed perception to find patterns in nature that could be exploited in “useful” ways. The use that concerns life is survival, or the propagation of function for its own sake, and that use is sufficient to drive functional change. But perception forms its own rules transcendent to physical laws because it uses patterns to learn new patterns. The growth of patterns is directed toward ever greater function because of the functional ratchet. It exploits the fact that appropriately-configured natural systems are shaped by functional objectives to replicate similar patterns and not just by physical laws indifferent to similarity.

Next, let’s consider the mind-body problem. The essence of this problem is the feeling that what is happening in the mind is of an entirely different quality than the physical events of the external world. Form and function dualism tells us that this feeling is a reflection of the fact that some of the underlying natural entities are physical while others are functional. Specifically, the mind is a construction of functional entities, and the nonliving physical world is constructed of physical entities, but all living things are a composite of functional and physical entities. This division is not reducible, as physicalists would have us believe, because function is concerned with and defined by what is possible, and the realm of the possible is entirely outside the scope of mere physical things. While function doesn’t reduce to the physical, it does depend on cellular metabolism in the case of life, or on the brain in the case of the mind. The mind is a natural entity comprised of a complex of functional capacities implemented using the physical machinery of the brain. The language conveniently has one word, mind, to refer strictly to the functional aspects, and another word, brain, to refer strictly to the physical aspects. The conscious mind is the functional portion to which we attribute conscious traits like awareness, attention, feelings, and thoughts, but nonconscious parts also contribute function to the overall mind. Some nonconscious thoughts can “rise” up to consciousness in one form or another, while others can’t, but either way, our overall state of mind depends on the operation of the whole mind.

Finally, let’s look at the explanatory gap, which is about explaining with physical laws why our senses and emotions feel the way they do. I said this gap would evaporate with an expanded ontology. By recognizing functional existence as real, we can see that it opens up a vastly richer space than physical existence because it means anything can be related to anything in any number of ways. The world of imagination is unbounded, while the physical world is closely ruled by rather rigid laws. The creation of IPs that can first generalize inductively (via evolution of life and minds) and then later deductively and metacognitively (via further evolution of minds) gave them increasing degrees of access to this unbounded world. The functional part alone is powerless in the physical world; it needs the physical manifestation of the IP and its limbs (manipulative extremities) to impact physical things; there is nothing spectral going on here. Physical circumstances are always finite and so IPs are finite, but their capacities are potentially unlimited because capacities can be generalized beyond specific circumstances to be as broad as we like. So to close the explanatory gap and explain what it means to feel something, we should first recognize that the scope of feeling, experience, and understanding was never itself physical; it was a functional effect within an IP. So what happens in the IP to create feelings?

I’m just going to say the answer here and develop and support it in more detail later on. The role of the brain is to control the body in a coordinated way, and as a practical matter, it solves this using a combination of bottom-up and top-down information processing. These two styles, which have to meet somewhere in the middle, are usually called intuitive and rational. The rational mind is entirely conscious, while the intuitive mind provides the conscious mind with a wealth of impressions and hunches from some inner source I call the nonconscious mind.

The role of consciousness is to focus specifically on top-level problems that the nonconscious mind can’t handle by itself. To create this logical view of top-level concerns, the nonconscious mind presents information to the conscious mind by creating a theater of consciousness. Conscious experience is a highly produced and streamlined version of the information the nonconscious mind processes. The analogy to a movie is particularly good because movies are designed to simulate consciousness. What we feel as pain is really just part of the user interface between the nonconscious and conscious processes in the brain. Senses, feelings, and lessons from the school of hard knocks bubble up to consciousness through the intuitive mind. We are aware of our bodies, our minds, the world, and the passage of time, and we have a specific conscious feeling of them through senses and emotions. Rationally, we organize the world into objects and other concepts that follow rules of cause and effect. Consciousness merges the two styles together pretty seamlessly, but they are actually two quite different, entirely functional constructs. Some of our intuitive and rational knowledge, though itself functional, is about the physical world, and some of it is about our mental world or other nonphysical subjects. Most notably, our somatic sensory information is about our bodies and our emotions are about our minds. They are not about our bodies and minds in a physical way; rather, they tell us things our bodies and minds need. Whether about physical, mental, or other things, knowledge serves functional purposes and so can be said to be a functional entity.

To summarize my initial defense of dualism, I have proposed that form and function, also called physical and functional existence, encompass the totality of possible existence. We have evidence of physical things in our natural universe. We could potentially someday acquire evidence of other kinds of physical things from other universes, and they would still be physical, but they may produce different measurements that suggest an entirely different set of physical laws. Functional existence needs no time or space, but for physical creatures to benefit from it, there must be a way for functional existence to manifest in a natural universe. Fortunately, the feedback loops necessary for that to happen are physically possible and have arisen through evolution, and have then gone further to develop minds which can not only perceive, but can also comprehend and reflect on themselves. Note that this naturalistic view is entirely scientific, provided one expands the ontology of science to include functional things, and yet it is entirely consistent with both common sense and conventional wisdom, which hold that a “life force” is something fundamentally lacking in inanimate matter. We also see evidence of that “life force” in human artifacts because we are good at sensing patterns with a functional origin. Some patterns that occur in nature without any help from life do surprise us by appearing to have a functional origin when they don’t.3 Life isn’t magic, but some of its noumenal mystery is intrinsically beyond a complete deductive understanding. But we can continue to improve our deductive understanding of life and the mind to give us a better explanatory grasp of how they work.

1.4 Hey, Science, We’re Over Here

Between the physical view that the mind is a machine subject entirely to physical laws and the ideal view that the mind is a transcendent entity unto itself that exists independently of the body, science has come down firmly in the former camp. This is understandable considering we have unequivocally established that processes in the brain create the mind, although just how this happens is still not known. The latter camp, idealism, in its most extreme form is called solipsism, the idea that only one’s own mind exists and everything else is a figment of it. Most idealists don’t go quite that far and will acknowledge physical existence, but still claim that our mental states like capacities, desires, beliefs, goals, and principles are more fundamental than concrete reality. So idealists are either mental monists or mental/physical dualists. Our intuition and language strongly support the idea of our mental existence independent of physical existence, so we consequently take mental existence for granted in our minds and in discourse. The social sciences also start from the assumption that minds exist and go from there to draw out implications in many directions. But none of this holds much sway with physicalists, who, taking the success of physical theories as proof that physical laws are sufficient, have found a number of creative ways to discount mental existence. Some hold that there are no mental states, but just brain states (eliminativism), while others acknowledge mental states, but say they can be viewed as or reduced to brain states (reductionism). Eliminativism, also called eliminative materialism (though it would be more accurate to call it eliminative physicalism to include energy and spacetime), holds that physical causes work from the bottom up to explain all higher-level causes, which will ultimately demonstrate that our common-sense or folk-psychology understanding of the mind is false. Reductionism seems to apply well in nonliving systems.
We can predict subatomic and atomic interactions using physics, and molecular interactions using chemistry. Linus Pauling’s 1931 paper “The Nature of the Chemical Bond” showed that chemistry could in principle be reduced to physics12. Applied to the mind, reductionism says that everything happening in the mind can be explained in terms of underlying physical causes. Reductionism doesn’t say higher-level descriptions are invalid, just that, like chemistry, they are merely a convenience; a physical description is possible. However, both eliminativism and reductionism build on the incorrect assumption that information and its byproducts don’t fundamentally alter the range of natural possibility. This assumption, formally called the Causal Closure of the Physical, states that physical effects can only have physical causes. Stated correctly, it would say that physical effects can only have natural causes and recognize that information can be created and have effects in a natural world.

The ontology I am proposing is form and function dualism. This is the idea that the physical world exists exactly as the physicalists have described, but also that life and the mind have capacities or functions, which are entirely natural but cause things that would not otherwise happen. It is easy to get confused when talking about function because language itself is purely functional, so we have to use functional means to talk about either physical or functional things. So we have to be careful to distinguish the references or words, which are functional, that we use to talk about referents, which are either physical or functional noumena. Here are some general words commonly used to refer to either physical things or functional things:

form: function, capacity, information
concrete: abstract
things, events: feelings, ideas, thoughts, concepts
action, process: performance, behavior, maneuver
Note that we have no language that describes biological processes in a functional way beyond words like function, capacity, information, and behavior. This is because most of the functional terminology of language derives from our subjective experience, which leads to a mental vocabulary for thoughts and feelings and a physical vocabulary for things and events. Many words, like hard or deep, have both physical and functional meanings, but we can tell from context which is meant. Since we don’t subjectively experience living processes except for those of our own minds, we mostly just use ordinary physical observational terminology to describe them, e.g. as things and processes. Here, contextually, we might know that certain things and processes are actually biological things and processes, and hence much of what we are thinking about when we talk about them is actually functional and not physical. This is all sort of implied by context, but unfortunately it makes it harder for us to keep track of where information processes are critically involved. Mental vocabulary has a similar drawback. Since we only know what it means from personal experience, it is difficult to attach objective scientific meaning to it. However, the persistence of feelings and thoughts, the degree to which others seem to have similar feelings and thoughts, and plans and behaviors that we can associate with them can all provide confirming evidence of their existence and functionality, though not their underlying mechanism.

Function only becomes a kind of existence independent of physical existence if its existence can cause physical changes in the world. Since function, in the form of information, stored either in DNA or brains, is the product of a higher level of physical systems, specifically of information processors, and the impact is at the lower, purely physical level, the impact is called downward causation34. Downward causation directly rejects reductionism and asserts emergence5, the idea that something new, namely information, is created that can effect change. Information doesn’t violate any physical laws when it causes things to happen that would not otherwise happen. All it has done, really, is create a more complex feedback response in the physical system. Or, put another way, it has stored up potential using information instead of energy, so, like a spring, it can release that potential when it is needed. This complex feedback response is entirely natural, and one we could arguably call physical as well, except that the rules that govern it go far beyond anything conventional physical laws contemplate. First, because this complex response is indirect and uses logic, it is not physical, so conventional physical laws can’t help explain it. Second, the way the feedback has been tuned by long periods of inductive logic to develop specific functional capacities further makes it irreducible to physical laws. The consequence, that information is created, is natural but complex and can only be understood by interpreting function as an independent form of existence. This is because “understanding” itself, being a property of information processing, is all about being functional or useful, which can only happen if it can predict what will happen, and we can only accurately predict what will happen physically using physical laws and functionally using functional laws (or explanations, as laws is too strict a term for much of what happens in the functional realm).
So any physicalists out there who want to claim victory, on the grounds that the action of information in a physical system is a physical process akin to the conversion of potential energy to kinetic energy, please, go right ahead. Just keep in mind that you will never be able to predict what that release of information will do until you embrace functional existence, i.e. the logic of information processing.
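The spring analogy can be made concrete with a toy simulation. This is my own illustrative sketch, not anything from the argument itself, and every name in it (the thermostat, its setpoint, the simulation loop) is hypothetical: stored information, not stored energy, determines when a physical effect occurs.

```python
# Toy illustration of "downward causation": stored information (a setpoint),
# not stored energy, determines when a physical effect (heating) occurs.

class Thermostat:
    """A minimal information processor: it holds a setpoint whose physical
    embodiment is irrelevant; only its informational content has effects."""
    def __init__(self, setpoint):
        self.setpoint = setpoint  # information stored up "like a spring"

    def decide(self, temperature):
        # The rule is general: it applies a universal (the setpoint rule)
        # to a particular measurement.
        return "heat on" if temperature < self.setpoint else "heat off"

def simulate(thermostat, start_temp, steps):
    """Each step the room cools by 1 degree unless heated, which adds 2."""
    temp, history = start_temp, []
    for _ in range(steps):
        action = thermostat.decide(temp)
        temp += 2 if action == "heat on" else -1
        history.append((action, temp))
    return history

if __name__ == "__main__":
    for action, temp in simulate(Thermostat(setpoint=20), start_temp=18, steps=5):
        print(action, temp)
```

Nothing here violates physics; the point is only that predicting when the heater fires requires knowing the stored information, not just the energy balance.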

So what happens when information is “released” to cause physical effects? When information is applied, resulting in downward causation, the physical effect is specific, but informationally it can be viewed as a special case of a general rule characterizing things that could happen in similar situations. Events happen that are generally similar to prior events, but physically all that is happening is a set of specific events because similarity means nothing from a physical standpoint. Functionally, it looks like things that have happened “before” are happening “again”, but nothing in the universe ever happens twice. Something entirely new has happened. Yes, it is similar to something that happened before, but only according to some ultimately arbitrary definition of similarity that is relevant to a specific information processor. So we must not conflate things happening in general with things happening specifically. As soon as we even speak of things happening in general, we have admitted the existence of function and we are no longer talking about the physical world independent of our functional perspective of it. Our minds and our language are very function-oriented, so seeing things in general comes very naturally to us; it can be hard to separate functional ideas from non-functional ones, but it is always possible.

Aside from the fact that we have words for concrete or physical things and words for functional or abstract things, nouns of any kind may be specific or general in that they can refer to a unique thing (a particular) or to a class of things (a universal). Wording and context help differentiate these cases. For example, “I own a green car” probably refers to a specific physical object and my conception of it, while “I am going to get a green car” refers to a categorical, functional thing (and not really to anything physical at all). While I am likely referring to my only green car, the indefinite article “a” doesn’t single out a definite car, and I may actually have several green cars, and so have not indicated which one I mean. Using the definite article “the”, as in “I own the green car”, refers to just one unique car, as would using a proper noun (name). In the functional world of our minds, we are always very clear with ourselves whether our ideas are specifically referring to unique, particular things or universally to classes of things. This distinction is at the heart of function because the way function works is to “extract” order from the uniformity of nature by identifying and then capitalizing on generalities as universals.
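One loose analogy, offered only as a sketch of my own (all names below are hypothetical): in programming, a class plays the role of a universal while an instance plays the role of a particular.

```python
# A class is like a universal: it names a general category and the
# properties anything falling under that category must have.
class GreenCar:
    def __init__(self, vin):
        self.vin = vin  # each instance is a particular, unique thing

# "I am going to get a green car" refers to the universal (the class):
wanted = GreenCar              # no particular car need exist yet

# "I own the green car" refers to a particular (one unique instance):
mine = GreenCar(vin="example-vin-123")  # hypothetical identifier

assert isinstance(mine, wanted)  # the particular falls under the universal
assert mine is not GreenCar      # but the universal is not any one car
```

The analogy is imperfect, but it captures the point that the universal and the particulars that fall under it are different sorts of things.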

Whenever information processors (IPs) collect and apply information, they are causing function to emerge in a physical world. The word “emergence” suggests something comes out of nothing, and if you are willing to count a preestablished potential to do things as something, then something new has been created. It is not magical or inexplicable; it is just the result of feedback loops exploiting the uniformity of nature. As Bob Doyle puts it, “Some biologists (e.g., Ernst Mayr) have argued that biology is not reducible to physics and chemistry, although it is completely consistent with the laws of physics. … Biological systems process information at a very fine (atomic/molecular) level. Information is neither matter nor energy, but it needs matter for its embodiment and energy for its communication.”6 Arthur C. Clarke’s third law says “Any sufficiently advanced technology is indistinguishable from magic.” What evolution has created in life and minds qualifies as being sufficiently more advanced than our own technology to count as magic. We now know that it is technology, but we are still quite a ways off from understanding it in enough depth to say we really “get it”. But increasingly refined deductive explanations, such as I am developing here, bring us closer and will eventually bring our technology up to the level evolution has already attained.
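How a blind feedback loop can accumulate information can be shown with a minimal hill-climbing caricature of selection. This is my own illustration, not a model from the text, and the target string and parameters are hypothetical: no single mutation “knows” anything, yet the keep-if-better rule gradually stores information about the environment in the retained genome.

```python
import random

def evolve(target, alphabet, seed=0, max_steps=100000):
    """Random mutation plus a keep-if-better feedback rule. Each mutation
    is just a specific physical event; the general pattern (the target)
    accumulates only through repeated selection."""
    rng = random.Random(seed)
    genome = [rng.choice(alphabet) for _ in target]
    score = sum(g == t for g, t in zip(genome, target))
    for step in range(max_steps):
        if score == len(target):
            return "".join(genome), step
        i = rng.randrange(len(genome))           # pick a site to mutate
        candidate = genome[:]
        candidate[i] = rng.choice(alphabet)      # blind variation
        new_score = sum(c == t for c, t in zip(candidate, target))
        if new_score >= score:                   # feedback: selection
            genome, score = candidate, new_score
    return "".join(genome), max_steps

if __name__ == "__main__":
    final, steps = evolve("function", "abcdefghijklmnopqrstuvwxyz")
    print(final, steps)
```

The loop never stores the target explicitly; it only compares outcomes, which is the sense in which feedback “extracts” order from uniformity.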

Downward causation (i.e. the application of function by IPs) can be called an “interaction” between life and body or between mind and body because life and mind affect the body and vice versa. The IP, being a complex system with both physical and nonphysical aspects, has established mechanisms to mediate its stored potential capacities with effectors, which are generally proteins in the case of living things and muscles in the case of animals. Physical laws most effectively explain purely physical aspects and functional principles most effectively explain more functional aspects, though those functional principles are often tuned for physical applications. Still, though, what makes them functional is that they work by applying generalities to specific situations.

What other ways do physical things differ from functional ones? Each physical thing is unique, but the same information can exist in multiple ways or formats, which is called multiple realizability. What makes them the “same” is that they refer to the same things in the same ways. This characterization of sameness is itself subject to the judgment of an IP, but provided IPs agree, then sameness can be admitted. This allows information to be abstracted from the physical forms to which it refers and from the IPs that manage it. Information consists of generalizations and particulars derived from generalities, which are only indirectly about physical things and are not physical themselves. Next, while matter and energy are conserved in that they can neither be created nor destroyed (though quantum effects challenge this a bit), information is inherently infinite. The amount of information captured by IPs in the universe is finite but can grow over time without bound.7
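Multiple realizability is easy to demonstrate with a short sketch (my own illustration; the example datum is hypothetical): the “same” information can be realized as a parsed structure, a string of characters, or a sequence of bytes, and what makes them the same is that interpreters agree on what they refer to.

```python
import json

# One piece of information -- "the freezing point of water is 0 C" --
# realized in three physically different formats.
as_dict  = {"quantity": "freezing_point_C", "value": 0}
as_text  = json.dumps(as_dict)        # a string of characters
as_bytes = as_text.encode("utf-8")    # a sequence of bytes

# The realizations are physically distinct kinds of objects...
assert type(as_dict) is not type(as_text) is not type(as_bytes)

# ...but an interpreter (here, the JSON parser) recovers the same content,
# which is what licenses calling them the "same" information.
assert json.loads(as_text) == as_dict
assert json.loads(as_bytes.decode("utf-8")) == as_dict
```

Note that the judgment of sameness lives in the interpreter, not in the bits themselves, which matches the point that sameness is subject to the judgment of an IP.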

Although physicalism specifically rejects the possibility of anything existing beyond the physical world as characterized by the laws of physics, in so doing it overlooks unexpected consequences of feedback. Perhaps overlooks is too strong a word, because physicalists can see life and the mind and can call them physical, but physicalists do presume that the behavior of such systems must be reducible to physical terms just because physical laws were sufficient to create them. But this assumption is both unjustified and wrong. Effects can happen in a physical world that can’t be traced back to the physical conditions that caused them because the connections between causes and effects have been hopelessly muddled by uncountable feedback effects that have become increasingly indirect. Rather, the real spirit of physicalist philosophy is naturalism, which says that everything arises from natural causes rather than supernatural ones. Instead of declaring up front that the only natural causes to be allowed are from the Standard Model of particle physics or from general relativity, we should be open to other causes, such as information interactions. This is why I conclude that naturalism is best supported not by a monism of form but by a dualism of form and function.

Physicalists see the functional effects of life and the mind (and, not incidentally, make all their arguments using their minds), but they conflate biological information with complex physical structure, and they are not the same. Just because we can collect all sorts of physical, chemical, and structural information about both nonliving and living matter does not mean they are the same sort of thing. As I said earlier, rocks have a complex physical structure but contain no information. We collect information about them using physical laws and measurements that help us describe and understand them, but their structure by itself is information-free. Weather patterns are even more complex and also chaotic, but they too contain no information, just complex physical structure. We have devised models based on physical laws that are pretty helpful for predicting the weather. But the weather and all other nonliving systems don’t control their own behavior; they are reactive and not proactive. Living things introduce functions or capabilities built from information generated from countless feedback experiments and stored in DNA. This is a fundamental, insurmountable, and irreducible difference between the two. Living things are still entirely physical objects that follow physical laws, but when they use information they are triggering physical events that would not happen without it. Because abstraction has unlimited scope, information processing vastly increases the range of what is physically attainable, as the diversity of life and human achievement demonstrate.

What I am calling form corresponds to what Plato and Aristotle called the material cause, which is its physical substance, and what I am calling function corresponds to their final cause, or telos, which is its end, goal, or purpose. They understood that while material causes were always present and hence necessary, they were not sufficient to explain why many things were the way they were. The idea that one must invoke a final cause or purpose to fully explain why things happen is called teleology. Aristotle expanded on this to identify four kinds of causes that resolve different kinds of questions about why changes happen in the world. Of these, material, formal, efficient, and final, I have discussed the first and last. The formal cause is based on Plato’s closely-held notion of universals, the idea that general qualities or characteristics of things are somehow inherent in them, e.g. that females have “femaleness”, chairs have “chairness”, and beautiful things have “beauty”. While the Greeks clung to the idea that universals were intrinsic, William of Ockham put metaphysics on a firmer footing in the 14th century by advocating nominalism, the view that universals are extrinsic, i.e. that they have no existence except as classifications we create in our minds. While classifications are a critical function of the mind, I think everyone would now agree that we can safely say formal causes are descriptive but not causative. The efficient cause is what we usually mean by cause today, i.e. cause and effect. The laws of physicalism all start with matter and energy (no longer considered causative, but which simply exist) and then provide efficient causes to explain how they interact to bring about change. A table is thus caused to exist because wood is cut from trees and tools are used in a sequence of events that results in a table.

Although Aristotle could see that these steps had to happen to create a table, they don’t explain why the table was built. The telos or purpose of the table, and the whole reason it was built, is so that people can use it to support things at a convenient height. Physicalists reject this final, teleological cause because they see no mechanism — how can one put purpose into physical terms? For example, physically, objects sink to lower places because of gravity, not because it is their purpose or final cause. This logic is sound enough for explaining gravity, but it doesn’t work at all for tables, and, as I have mentioned, it doesn’t work for life and the mind in general. So was it really reasonable to dispense with the final cause just because it wasn’t understood? How did such a non-explanatory stance come to be the default perspective of science? To see why, we have to go back to William of Ockham. 1650 years after Aristotle, William of Ockham laid the groundwork for the Scientific Revolution, which would still need another 300 years to get significantly underway. With his recognition that universals were not intrinsic properties but extrinsic classifications, Ockham eliminated a mystical property that was impeding understanding of the prebiotic physical world. But he did much more than identify the mind as the source of formal causes; he explained how the mind worked. Ockham held that knowledge and thought were functions of the mind which could be divided into two categories, intuitive and abstractive.8 Intuitive cognition is the process of deriving knowledge about objects from our perceptions of them, which our minds can do without conscious effort. Abstractive cognition derives knowledge from positing abstract or independent properties about things and drawing conclusions about them. Intuitive knowledge depends on the physical existence of things, while abstractive knowledge does not, but can operate on suppositions and hypotheses.
I concur with Ockham that these are the two fundamental kinds of knowledge, and I will develop a much deeper view of them as we proceed. Ockham further asserted that intuitive knowledge precedes abstractive knowledge, which means all knowledge derives from intuitive knowledge. Since intuitive knowledge is fundamental, and it must necessarily be based on actual experience, we must look first to experience for knowledge and not to abstract speculation. Ockham can thus be credited with introducing the now ubiquitous notion that empiricism — the reliance on observation and experiment in the natural sciences — is the foundation of scientific knowledge. He recognized the value of mathematics (i.e. formal sciences) as useful tools to interpret observation and experiment, but cautioned that they are abstract and so can’t be sources of knowledge of the physical world in their own right.

Francis Bacon formally established the paramountcy of empiricism and the scientific method in his 1620 work, Novum Organum. Bacon repeatedly emphasizes how only observation with the senses can be trusted to generate truth about the natural world. His Aphorism 19, in particular, dismisses ungrounded, top-down philosophizing and endorses grounded, bottom-up empiricism:

“There are and can only be two ways of investigating and discovering truth. The one rushes up from the sense and particulars to axioms of the highest generality and, from these principles and their indubitable truth, goes on to infer and discover middle axioms; and this is the way in current use. The other way draws axioms from the sense and particulars by climbing steadily and by degrees so that it reaches the ones of highest generality last of all; and this is the true but still untrodden way.”

Bacon built on Ockham’s point that words alone could be misleading by citing a number of biases or logical fallacies that can so easily permeate top-down thinking and so obscure what is really happening. Specifically, he cited innate bias, personal bias, and rhetorical biases (under which one could include traditional logical fallacies like ad hominem, appeal to authority, begging the question, etc.).

Bacon didn’t dispense with Aristotle’s four causes but repartitioned them into two sets. He felt that physics should deal with material and efficient causes while metaphysics should deal with formal and final causes.9 He then laid out the basic form of the scientific method. The objective is to find and prove physical laws, which are formal causes that are universal. While Ockham had rejected such abstractions, Bacon accepted them, but rebranded the only legitimate ones as those that were demonstrable by his method. Using the example of heat as a formal cause, he recommended collecting evidence for and against, i.e. listing things with heat, things without, and things where heat varies. Comparative analysis of the cases should then lead to a hypothesis of the formal cause of heat. Bacon could see that further cycles of observation and analysis could inductively demonstrate universal natural laws, and he attempted to formalize the process, but never quite finished that work. Even now people would disagree that the scientific method has a precise form, but would agree that it depends on iterative observation and analysis. Bacon had little to say about the final cause because it was the least helpful to his inductive method, and in any case could easily be perverted by bias to lead away from the discovery of efficient causes that underlie formal causes. Ultimately, the success of the inductive, physicalist approach since Bacon and the inability of detractors to refute its universal scope have led to the outright rejection of teleology as an appeal to mysticism when physical laws seem to be sufficient.

We are now quite comfortable with the idea that all our knowledge of the physical world must derive only from observations of it and not suppositions about it. And we concur with Ockham that our primary knowledge of the physical world is inductive, but that secondary abstractive knowledge can group that knowledge into classifications and rules which provide us with causative explanatory power. We recognize that our explanations are extrinsic to the fabric of reality but are nevertheless very effective. However, this shift away from the more magical thinking of the ancients (not to mention the Christian idea that God designed everything) blinded us to something surprising that happens in biological systems and even more significantly in brains: the creation of function. Function is in many ways a subtle phenomenon, and this is why it has been overlooked or underappreciated. Function is not something specific you can point to; it results from creating indirect references to things and generalizing about what might happen to them.

In supposing that knowledge must originate inductively, Ockham and Bacon inadvertently put a spotlight on direct natural phenomena. How could they have known, how could anyone know, that indirect natural phenomena would play a critical role in the development of life and then the brain? Charles Darwin, of course, figured it out by process of elimination (shifting from direct forces to the indirect influences of a nearly infinite series of natural selections), but that is not the same thing as recognizing the source of the power of indirection. Aristotle had already pointed out that every phenomenon had an efficient cause, so of course some sequence of events must have caused life to arise, and Darwin put the pieces together to propose a basic strategy for it to start from nothing and end up where it is now. The events that power natural selection are, taken individually, entirely physical, and so it seems natural to assume that the whole of the process is entirely physical. But this assumption is a fundamental mistake, because natural selection is only superficially physical. The specific selection events of evolution don’t matter; what matters is how they are interpreted or applied in a general way so as to influence future similar events. By collecting evidence of a mechanism’s value across a series of events, natural selection justifies the conclusion that the mechanism has an indirect or general power across an abstract range of situations.

Rene Descartes tried to unravel function, but, coming long before Darwin, he could see no physical source and resorted to conjecture. As I mentioned before, he proposed a mental substance that interacted with the physical substance of the brain in the pineal gland. This is a wildly inaccurate conclusion which has only served to accentuate the value of experimental research over philosophy, but it is still true that knowledge is a nonphysical capacity of the brain whose functional character physical science has not yet attempted to explain. But Descartes’ mistaken assumptions and the rise of monism have led to a concomitant fall in the popularity of all stripes of dualism, even to the point where many consider it a proven dead end. Gilbert Ryle famously put the nail in the coffin of Cartesian dualism in The Concept of Mind10 in 1949. We know (and knew then) that Descartes’ mental “thinking substance” does not exist as a physical substance, but Ryle felt it still had tacit if not explicit “official” support. He felt we officially or implicitly accepted two independent arenas in which we live our lives, one of “inner” mental happenings and one of “outer” physical happenings. This view goes all the way down to the structure of language, which has a distinct vocabulary for mental things (using abstract nouns which denote ideas or qualities) and physical things (using concrete nouns which connect to the physical world through senses). As Ryle put it, we have “assumed that there are two different kinds of existence or status. What exists or happens may have the status of physical existence, or it may have the status of mental existence.” He disagreed with this view, contending that the mind is not a “ghost in the machine,” something independent from the brain that happens to interact with it. 
To explain why, he introduced the term “category mistake” to describe a situation where one inadvertently assumes something to be a member of a category when it is actually of a different sort of category. His examples focused on parts not being the same sort of thing as wholes, e.g. someone expecting to see a forest but being shown some trees might ask, “But where is the forest?” In this sort of example, he identified the mistake as arising from a failure to understand that forest has a different scope than tree.11 He then contended that the way we isolate our mental existence from our physical existence was just a much larger category mistake which happens because we speak and think of the physical and the mental with two non-intersecting vocabularies and conceptual frameworks, yet we assume it makes sense to compare them with each other. As he put it, “The belief that there is a polar opposition between Mind and Matter is the belief that they are terms of the same logical type.” Ryle advocated the eliminativist stance: if we understood neurochemistry well enough, we could describe the mechanical processes by which the mind operates instead of saying things like think and feel.

But Ryle was more mistaken than Descartes. His mistake was in thinking that the whole problem was a category mistake, when actually only a superficial aspect of it was. Yes, it is true, the mechanics of what happens mentally can be explained in physical terms because the brain is a physical mechanism like a clock. So his reductionist plan can get us that far. But that is not the whole problem, and it is not the part that interested Descartes or that interests us, because saying how the clock works is not really the interesting part. The interesting part is the purpose of the clock: to tell time. Why the brain does what it does cannot be explained physically because function is not physical. The brain and the mind exist to control the body, but that function is not a physical feature. One can tell that nerves from the brain animate the hands, but one must invoke the concept of function to see why. As Aristotle would say, material and efficient causes are necessary but not sufficient, which is why we need to know their function. Ryle saw the superficial category mistake (forgetting that the brain is a machine) but missed the significant categorical difference (that function is not form). So, ironically, his argument falls apart due to a category mistake, a term that he coined.

Function can never be reduced to form because it is not built from subatomic particles; it is built from logic to characterize similarities and implications. It is true that function can only exist in a natural universe by leveraging physical mechanisms, but this dependency doesn’t mean it doesn’t exist. All it means is that nature supports both generalized and specific kinds of existence. We know the mind is the product of processes running in the brain, just as software is the product of signals in semiconductors, but that doesn’t tell us what either is for. Why we think and why we use software are both questions the physical mechanisms are not qualified to answer. Ryle concluded, “It is perfectly proper to say, in one logical tone of voice, that there exist minds and to say, in another logical tone of voice, that there exist bodies. But these expressions do not indicate two different types of existence, for ‘existence’ is not a generic word like ‘colored’ or ‘sexed’.” But he was wrong because there are two different kinds of existence, and living things exhibit both. Information processors have a physical mechanism for storing and manipulating information and use it to deliver functionality. For thinking, the brain, along with the whole nervous and endocrine systems, is the physical part and the mind is the functional part. For living things, the whole metabolism is the physical part and behavior is the functional part. This is the kind of dualistic distinction Descartes was grasping for. While Descartes overstepped by providing an incorrect physical explanation, we can be more careful. The true explanation is that functional things are not physical and their existence is not dependent on space or time, but they can have physical implementations, and they must for function to impact the physical world.

The path of scientific progress has understandably influenced our perspective. The scientific method was designed to unravel mysteries of the natural world, and was created on the assumption that fixed natural laws govern all natural activity. Despite his advocacy of dualism, Descartes promoted the idea of a universal mechanism behind the universe and living things, and his insistence that matter should be measured and studied mathematically as an extension of what we now call spacetime helped found modern physics: “I should like you to consider that these functions (including passion, memory, and imagination) follow from the mere arrangement of the machine’s organs every bit as naturally as the movements of a clock or other automaton follow from the arrangement of its counter-weights and wheels.”12 He only invoked mental substance to bridge the explanatory gap of mental experience. If we instead identify the missing piece of the puzzle as function, then we can see that nature, through life, can “learn things about itself” using feedback to organize activities in a functional way we call behavior. Behavior guides actions through indirect assessments instead of direct interactions, which changes the rules of the game sufficiently to call it a different kind of existence.

Darwin described how indirect assessments could use feedback to shape physical mechanisms, but he didn’t call out functional existence specifically, and, in the 150 years since, I don’t think anyone else has either. But if this implies, as I am suggesting, that the underlying metaphysics of biology has been lacking all this time, then we have to ask ourselves what foundation it has been built on instead. The short answer is a physicalist one. Both before and after Darwin, traits were assumed to have a physical explanation, and they are still mostly thought to be physical today. And because function does always leverage a physical mechanism, this is true, but, as Aristotle said in the first place, it is not sufficient to tell us why. But if biologists honestly thought only in terms of physical mechanisms, they would have made very little progress. After all, we still have no idea, except by gross analogies to simple machines like levers, pipes, and circuits, how bodies work, let alone minds. Biology, as practiced, makes observations of functioning biological mechanisms and attempts to “reverse engineer” an explanation of them to create a natural history. Much of what is to be explained is provided by the result that is to be explained.13 We assume certain functions, like energy production or consumption, and work out biochemical details based on them, but we couldn’t build anything like a homeostatic, self-replicating living creature if our lives depended on it because we only understand superficial aspects. Biology is thus building on an unspoken foundation of some heretofore ineffable consequence of natural selection which I have now called out as biological function or information. Darwin gave biologists a license to identify function on the grounds that it is “adaptive”, and they have been doing that ever since, not overtly as a new kind of existence but covertly as “phenomena” to be explained, presumably with physical laws.
I am saying that these phenomena are functional and not physical ones, and so their explanations must be based on functional principles, not physical.

But what of teleology? Do hearts pump blood because it is their purpose or final cause? We can certainly explain how hearts work using purposeful language, but that is just an artifact of our description. Evolved functionality gets there by inductive trial and error, while purpose must “put forth” a reason or goal to be attained. Evolution never looks forward because induction doesn’t work that way, so we can’t correctly use the word purpose or teleology to describe information created by inductive means. But we can use the word functional, because biological information is functional by generalizing on past results even though it is not forward-looking. And we can talk about biological causes and effects, because information is used to cause general kinds of outcomes. Biological causes and effects are never certainties the way physical laws are, because information always generalizes to best fits. Physical effects can also be said to have causes, but we should keep in mind that the causality models behind physical laws are for our benefit and not part of nature themselves. They are conceptual models that make generalizations about kinds of things which we then inductively map onto physical objects to “predict” what will happen to them, which will give us a good idea of the kinds of things that will most likely happen.

With our minds, however, we can use abstraction with conceptual models to “look forward” in the sense that we can run simulations on general types which we know could be mapped to potential real future situations. We can label elements of these forward-looking models as goals or purposes, because bringing reality into alignment with a desired simulation is another way of saying we attain goals. So we really can say that the purpose of a table is to support things at a convenient height for people. But tables are not pulled toward this purpose; they may also serve no purpose or be used for other purposes. Aristotle claimed that an acorn’s intrinsic telos is to become a fully grown oak tree.14 Biological functions can be said to be pulled inexorably toward fulfillment by metabolic processes. The difference is actually semantic. Biological processes can be said to run continuously until death, but again, it only looks like things that have happened “before” are happening “again” when really nothing ever happens twice. Similar biological processes run continuously, but each “instance” of such a process is over in an instant, so we are accustomed to using general and not specific terminology to describe biological functions. These processes have no purpose, per se, because none was put forth, but they do behave similarly to ways that have been effective in the past for reasons that we can call causes and effects. Many of the words we use to describe causes and effects imply intent and purpose, so it is natural for us to use such language, but we should keep in mind it is only metaphorical. Tables, on the other hand, are not used continuously and have no homeostatic regulation ensuring that people keep using them, so they may or may not be used for their intended purpose.
Designers don’t always convey intended purposes to users, and users sometimes find unintended uses which become purposes for them, and both can be influenced by inductive or deductive approaches, so it is hard to speak with certainty about the purpose of anything. But it is definitely true that we sometimes have purposes and intentionally act until we consider them to be generally fulfilled, so minds can be teleological.

Part 2: The Rise of Function

I’ve outlined what function is and how it came to be, but to understand the detailed kinds of function we see in life and the mind, we need to back up to the start and consider the selection pressures at work. Humans have taken slightly over four billion years to evolve. Of that, the last 600 or so million years (about 15%) has been as animals with minds, the last 4 million years (about 0.1%) as human-like primates, and the last 10,000 or so years (about 0.00025%) as what we think of as civilized. The rate of change has been accelerating, and we know that our descendants will soon think of us as shockingly primitive (and some already do!). An explanation of the mind should account for what happened in each of these four periods and why.

Life: 4 billion to 600 million years ago
Minds: 600 million to 4 million years ago
Humans: 4 million to 10,000 years ago
Civilization: 10,000 years ago to present
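The percentages above follow directly from the round dates in the text. A quick sketch, using those approximate figures, verifies the arithmetic:

```python
# Rough arithmetic behind the percentages above, using the approximate
# round figures from the text (all values in years).
TOTAL = 4_000_000_000  # ~4 billion years since life arose

eras = {
    "Minds": 600_000_000,     # animals with minds
    "Humans": 4_000_000,      # human-like primates
    "Civilization": 10_000,   # what we think of as civilized
}

for name, span in eras.items():
    print(f"{name}: {100 * span / TOTAL:g}% of life's history")
```

Running this prints 15%, 0.1%, and 0.00025%, matching the figures in the text.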

2.1 Life: 4 billion to 600 million years ago

While we don’t know many of the details of how life emerged, the latest theories connect a few more dots than we could before. Deep-sea hydrothermal vents1 may have provided at least these four necessary precursors for early life to arise around four billion years ago:

(a) a way for hydrogen to react directly with carbon dioxide to create organic compounds (called carbon fixation).

(b) an electrochemical gradient to power biochemical reactions that led to ATP (adenosine triphosphate) as the store of energy for biochemical reactions.

(c) the formation of the RNA world in which RNA could replicate itself and catalyze reactions in protected areas (possibly iron-sulfur bubbles or other compartments). The ribosome evolved to translate RNA to proteins; up to six likely phases of ribosome evolution have been proposed.2 Early protein chemistry and basic metabolism were established.

(d) the chance creation of lipid bubbles enclosing this RNA metabolic soup to form the first RNA-based cells. Genes don’t code for everything in life — they can’t create membranes. But genes quickly evolved proteins to strengthen and improve those original cell walls to make them very sophisticated biological entities. Genetically-driven cell division evolved at this time, allowing cells to multiply.

This scenario is at least a plausible way for the precursors of life to congregate in one place and have opportunities for feedback loops to develop which could start to capture function and then ratchet it up. Many steps are missing here, and much of the early feedback probably depended more on chance than on mechanisms that actually capture and leverage it as information. Alexander Rich first proposed the concept of the RNA world in 1962 because RNA can both store information and catalyze reactions, and thus do both the tasks that DNA and proteins later specialized at. The RNA world may have evolved for hundreds of millions of years before viruses and DNA brought about the next big changes.

(e) the creation of viruses. Genomic analysis suggests a critical viral contribution to the origin of life as we know it. I propose an extensive “age of viruses” that spanned hundreds of millions of years and included a vast proliferation and refinement of biochemical mechanisms to manage cells, genes and proteins from which only the few most successful lines survived.

While it has long been contentious whether viruses should even be classified as living given that they are obligate parasites, they are alive because they have their own genome, and they metabolize and reproduce during part of their life cycle.3 Patrick Forterre suggests calling normal cells “ribocells” and cells producing virions (inactive virus particles) “virocells”. Only ribocells code for ribosomes (needed to make proteins), while only virocells code for capsids, which are the protein coats of virions. Viruses are thus distinct living organisms when they are virocells and are like seeds when they are virions. All viruses today kill their ribocell host when they are virocells by rapidly reproducing and then bursting the cell wall (a process called lysis), but the first viruses probably lived symbiotically as “ribovirocells” until the capacity for lysis evolved. Many viruses also live symbiotically with their hosts via a lysogenic phase in which they integrate into the host genome until induced to leave, which can happen many generations later or even never. About 8% of the human genome is known to be viral as embedded (lysogenic) retroviruses, most of which are now permanent residents. All cells today use double-stranded DNA, but viruses come in both single and double-stranded RNA and DNA forms, and also have a more diverse molecular biology than today’s cellular life, suggesting an older origin. At least six surviving lines of viruses appear to have evolved independently, and many more have probably not survived.4,5

At the end of the age of viruses, just three lines of DNA-based cells and six lines of viruses symbiotic with them emerged. Those three lines are bacteria, archaea, and eukaryotes. These three either share a last universal common ancestor, or LUCA, about 3.5 billion years ago, or they come from three independent conversions of RNA cells to DNA cells. I suspect the latter case, along with many other cell lines from the age of viruses that did not survive. Furthermore, eukaryotes are much more highly evolved than bacteria and archaea, so I discuss them separately below. Viruses probably “invented” DNA as a superior replication strategy and then took over the replication machinery of some RNA cells they infected to create DNA-based cells. The stability of double-stranded DNA as a genetic material eventually eclipsed single or double-stranded RNA and single-stranded DNA in all living cells, but single-stranded and RNA-based viruses still exist.6 While it is theoretically possible that some new lines of viruses evolved after DNA-based cells took over, I suspect that the “wild west” window of opportunity to evolve something as novel as a virus had shut down for good because the competition from existing mechanisms had become too great.

Early life must have been very bad at even basic cell functions compared to modern forms, so much of the adaptive pressure in the early days must have focused on improving the core mechanisms of metabolism, replication, and adaptation during the RNA world and the age of viruses. As life first became more robust, it became less dependent on the hydrothermal vents and was gradually able to move away from them. Although the central mandate of evolution is to survive over time, we can roughly prioritize the set of component skills that needed to evolve along the way. As each of these skills improved over time, organisms that could do them better would squeeze out those that could not:

1. Metabolism is, of course, the fundamental function as life must be able to maintain itself. A source of energy was critical to this, which is why hydrothermal vents are such a likely starting point.

2. Reproduction was the next most critical function, as any kind of organism that could produce more like itself would quickly squeeze out those that could not. This is where RNA comes in. Although RNA is too complex to have been the first approach used to replicate functionality, we can guess that a functional ratchet got to RNA through a series of simpler but less effective molecules that have not survived in any lifeform today.

3. Natural selection at the level of traits is the next most critical function needed because it would make possible the piecewise improvement of organisms. Metabolism and replication critically need trait-level selection to improve, so these mechanisms coevolved. Bacteria developed a mechanism called conjugation that lets two bacterial cells connect and copy a piece of genetic material called a plasmid from one to the other. Most plasmids ensure that the recipient cell doesn’t already have a similar plasmid, which protects against counterproductive changes. There are so many bacteria that a good strategy for them is to try out everything and see what works.

4. Proactive gene creation. Directed mutation is currently a controversial theory, but I think it will turn out that nearly all genetic change is pretty carefully coordinated and that the mechanisms that make it possible evolved in these early years. I am talking about ways a cell can assemble new genes by combining snippets of DNA called transposable elements (also called TEs, “jumping genes” or transposons) and then inserting the result back into chromosomes. Sometimes this creates “junk DNA” that does nothing, and sometimes it creates new, active genes. Viruses depend on this kind of technology, and we know genes can jump, but it is hard to see how a mechanism could evolve that could do this in a useful way. What we have to remember is that it only has to be more useful than chance to survive and prosper. Cells that were “open minded” about mixing up their DNA could evolve strategies like this if they work sometimes. And because this could happen, it almost certainly did happen, because those that could gain such an advantage, even if it took many generations to show benefits, would have squeezed out those that could not. This very long-term selection of gene-editing technology has by now had as much time to evolve as the more visible traits of genes, even though it is much harder to see or even imagine how it might be working. Adi Livnat calls this mechanism the “writing phenotype” to contrast it with the “performing phenotype”, the genes behind the observable traits of an organism. If true, this would be the largest extension to the theory of evolution since Darwin.7

The next big step was:

(f) the arrival of eukaryotes

All cellular (non-viral) life today is in the bacteria, archaea, and eukaryote domains. Bacteria and archaea, collectively called prokaryotes, are primitive single-celled organisms, but eukaryotes are elaborate single-celled creatures typically 10,000 times the size of prokaryotes, and also comprise nearly all multicellular lifeforms on earth. Eukaryotes are a minuscule fraction of all living things by number, but because they are much bigger they have about the same worldwide biomass as prokaryotes. Prokaryotic cells lack internal structures, but eukaryotic cells have a variety of cell organelles, most notably a cell nucleus and mitochondria, which both have double membranes, and an endomembrane system including the endoplasmic reticulum and the Golgi apparatus. Eukaryotes must have a complex prehistory now lost to us, but the evidence suggests they arose from an original cell line, dubbed chronocytes, that must already have been much more complex creatures than bacteria or archaea.8 Chronocytes had a cytoskeleton, a network of protein fibers attached to the cell wall that gave the cell its shape and made it possible for them to engage in endocytosis, the ingestion of bacteria and other objects by engulfment. The eukaryotic ancestor likely ingested a number of bacteria and archaea that made permanent alterations, as genetic remnants of bacteria, archaea and a third unrelated line (the postulated chronocytes line) are now found in eukaryotes. The nucleus and mitochondria are probably such engulfed organisms that retained much of their structure (a process called symbiogenesis).9 The double membrane is the expected signature of a single-membraned creature engulfed by an outer cell wall.10 Algae and plants are eukaryotes that engulfed organelles called plastids. Mitochondria and plastids reproduce with their own DNA, while cell nuclei became the repository for the host cell’s DNA.

These physical enhancements gave eukaryotes a whole host of new functional capabilities prokaryotes lack, including endocytosis (as noted) and locomotion using flagella, cilia, or pseudopods. The endomembrane system, which comprises more than half of the total membrane in eukaryotic cells, likely originated by folding and refolding an inner or outer membrane. It facilitates the synthesis and transport of proteins like a post office, which likely accounts for why eukaryotic cells can be so much larger, more complex, and more functional than prokaryotes. Eukaryotes acquired energy-producing capabilities already refined by prokaryotes through mitochondria and chloroplasts, which are plastids that can photosynthesize. But perhaps the greatest invention of the eukaryotes was:

(g) sexual reproduction, which combines genes from two parents to create a new combination of genes in every offspring.

Sexual reproduction is a nearly universal feature of eukaryotic organisms11 and the basic mechanisms are believed to have been fully established in the last eukaryotic common ancestor (LECA) about 2.2 billion years ago. In the short term, sex has a high cost but few benefits. However, in the long term it provides enough advantages that eukaryotes almost always use it. Asexual reproduction is used by prokaryotes and by the somatic (non-sex) cells of eukaryotes. In prokaryotes it is called binary fission and in somatic cells it is called mitosis. In both cases, a double strand of DNA is separated and each single strand is then used as a template to create two new double strands. When the cell divides into two, each daughter cell ends up with one set of DNA.

Sexual reproduction uses a modified cell-division process called meiosis and a cell fusion process called fertilization. Cells that undergo meiosis contain a complete set of genes from each of two parents. They first replicate the DNA, making four sets of DNA in all, and then randomly shuffle genes between parent strands in a process called crossing over. The cell then divides twice to make four gametes each with a unique combination of parental genes. Gametes from different parents then fuse during fertilization to create a new organism with a complete set of genes from each of its two parents, where each set is now a largely random mixture from each parent.
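The bookkeeping of meiosis described above can be caricatured in a few lines of code. This is a toy sketch of the steps only (replication, crossing over, two divisions), not of the molecular machinery, and the gene names are invented for illustration:

```python
import random

def meiosis(maternal, paternal, rng=random):
    """Toy sketch of meiosis: replicate both parental gene sets,
    shuffle genes between strands (crossing over), then divide
    twice, yielding four gametes with mixed parental genes."""
    # Replication: four strands in all, two copies of each parental set
    strands = [list(maternal), list(maternal), list(paternal), list(paternal)]
    # Crossing over: at each gene position, sometimes swap the gene
    # between two randomly chosen strands
    for i in range(len(maternal)):
        a, b = rng.sample(range(4), 2)
        if rng.random() < 0.5:
            strands[a][i], strands[b][i] = strands[b][i], strands[a][i]
    # Two cell divisions distribute the four strands into four gametes
    return strands

gametes = meiosis(["A1", "B1", "C1"], ["A2", "B2", "C2"])
print(gametes)  # four gametes, each some mix of "1" and "2" variants
```

Note that however the shuffling falls out, each gene position still holds exactly two copies of each parent's variant across the four gametes, which is the bookkeeping real crossing over also preserves.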

Sexual reproduction is clearly a much more complex and seemingly unlikely process compared to asexual reproduction, but I will show why sex is probably a necessary development in the functional ratchet of life. The underlying reason for sex is that it facilitates points 3 and 4 above, namely natural selection at the level of traits and proactive gene creation. Because mechanisms evolved to do both 3 and 4 well, prokaryotes evolved in just two billion years instead of two trillion or quadrillion. Of course, I can only guess about time frames this large, but in my estimation evolution would have made almost no progress at all without refining these two mechanisms, so any organisms that could improve on them would have a huge advantage over those that did them less well. We know that conjugation is not the only mechanism prokaryotes use to transfer genetic material between themselves. All such mechanisms outside of sexual reproduction are called horizontal gene transfer (HGT), and also include transformation and transduction. Transduction is the incorporation of DNA from viruses, and viruses likely created most or even all of the very elaborate machinery behind horizontal gene transfer in the first place. Any mechanism that can share genetic information at the gene or function level with other organisms creates opportunities for new combinations of genes to compete. Life on earth has been a group (horizontal) effort because advantageous mutations useful to different domains arose in different lines of vertical descent. Without sex, prokaryotes would be evolutionary dead-ends that died off as deleterious mutations accumulated, but HGT gives them access to enough new genetic material to ward off this fate and even to adapt well to new environments. HGT allows many new genetic combinations to be tried at a fairly low cost since the number of single-cell organisms is very high. But it also lacks many mathematical advantages that sex brings to the table. 
If we assume “that the protoeukaryote → LECA era featured numerous sexual experiments, most of which failed but some of which were incorporated, integrated, and modified,”12 then nearly all of the steps that created sex, which is a highly-refined but complex mechanism, are lost to us.

What benefits does sex provide that led to its evolution? John Maynard Smith famously pointed out that in a male-female sexual population, a mutation causing asexual reproduction (i.e. parthenogenesis, which does naturally arise sometimes, allowing females to reproduce as clones without males) should rapidly spread because asexual reproduction has a “twofold” advantage since it no longer needs males. It is true that when resources allow unlimited growth, asexual reproduction can thus spread faster, but this rarely happens. Usually, populations are constrained by resources to a roughly stable population. Achieving the fastest reproduction cycle is not the critical factor in long-term success in these situations, and it is actually rather irrelevant. In any case, eukaryotic populations can evolve, and some have evolved, ways to switch between sexual and asexual modes of reproduction to capitalize on this asexual advantage, but in practice this almost never happens. I think this is because the ability to multiply faster comes with the cost of being monoclonal, and this staggering loss of genetic diversity is likely to create a genetic dead end. All major vertebrate groups except mammals have species that can sometimes reproduce parthenogenetically13, including about eighty species of unisex reptiles, amphibians, and fishes. While these lines may last for quite a while, they have few prospects for further adaptation. Sexual reproduction makes natural selection at the level of traits (point 3 above) possible. Only through sexual reproduction can variants of each gene in a population vie for success independently from all the other traits. Sexual reproduction can thus create an almost unlimited number of genomes with different combinations of genes, while all asexual creatures remain clones (barring HGT, though prokaryotic genomes stay very small, so they must both take genes in and knock them out).
Beneficial traits can spread through a population, “surgically” replacing less effective alleles (variants of the same gene). Sex gives a species vastly more capacity to adapt to changing environments because variants of every gene can remain in the gene pool waiting to spread when conditions make them more desirable.14 Asexual creatures can’t keep genes around for long that aren’t useful right now, because they can’t generate new combinations (except by HGT). We can conclude that Maynard Smith was right that asexual reproduction provides a “quick win”, but because it is a poor long-term strategy its use is very limited in multicellular life. Overall, sex is what makes eukaryotic evolution possible because it provides a controlled way for traits to evolve independently.
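The combinatorial point is easy to quantify. As a back-of-the-envelope sketch (the numbers are illustrative, not biological data): if a gene pool holds several variants at each of many loci, recombination can in principle assemble any combination of them, while an asexual lineage is confined to the one combination it inherited:

```python
def possible_genomes(loci, alleles_per_locus):
    """Number of distinct genomes recombination could assemble from a
    gene pool with `alleles_per_locus` variants at each of `loci` genes."""
    return alleles_per_locus ** loci

# A clone has exactly 1 genome (barring mutation or HGT), while even a
# tiny gene pool explodes combinatorially under sexual recombination:
print(possible_genomes(10, 2))    # 1024 combinations from ten two-allele genes
print(possible_genomes(1000, 2))  # far more than there are atoms in the universe
```

This exponential supply of novel combinations is what lets variants of every gene "vie for success independently," as described above.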

Finally, we see:

(h) complex multicellularity, meaning organisms with specialized cell types.

Multicellular life has arisen independently dozens of times, starting about 1 billion years ago, and even some prokaryotes have achieved it, but only six independently achieved complex multicellularity: animals, two kinds of fungi, green algae (including land plants), red algae, and brown algae. The relatively new science of evo-devo (evolutionary development) is focused largely on cell differentiation in complex multicellular (eukaryotic) organisms. The way that the cells of the body achieve such dramatically different forms, simplistically, is by first dividing and then turning on regulatory genes that usually then stay on permanently. Regulatory genes don’t code for proteins, but they do determine what other regulatory genes will do and ultimately what proteins will be transcribed. Consequently, as an embryo grows, each area can become specialized to perform specific tasks based on what proteins the cell produces.
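The one-way regulatory switching described above can be sketched as a toy program. Everything here (the flag names, the protein lists) is invented for illustration; the point is only that daughter cells inherit their parent's regulatory flags, and a flag once turned on stays on, committing each lineage to a fate:

```python
def divide(cell, switch_on=None):
    """Return a daughter cell that inherits its parent's regulatory
    flags; a newly switched-on flag stays on permanently."""
    daughter = set(cell)
    if switch_on is not None:
        daughter.add(switch_on)
    return daughter

# Which proteins each regulatory state transcribes (a made-up mapping)
PROTEINS = {
    frozenset(): ["housekeeping"],
    frozenset({"mesoderm"}): ["housekeeping", "actin"],
    frozenset({"mesoderm", "cardiac"}): ["housekeeping", "actin", "myosin"],
}

zygote = set()
meso = divide(zygote, "mesoderm")   # commits this lineage to the middle layer
heart = divide(meso, "cardiac")     # commits it further to heart muscle
print(PROTEINS[frozenset(heart)])   # ['housekeeping', 'actin', 'myosin']
```

Because the flags only accumulate, each region of a growing embryo in this sketch becomes progressively more specialized, mirroring how regulatory genes that stay on determine which proteins a cell lineage produces.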

The most dramatic demonstration of the power of triggered differentiation is radial and bilateral symmetry. Most animals (the bilateria) have near perfect bilateral symmetry because the same regulatory strategy is deployed on each side, which means that so long as growth conditions are maintained equally on both sides, a perfect (but reversed) “clone” will form on each side. Evo-devo has revealed that the eyes of insects, vertebrates, and cephalopods (and probably all bilateral animals) evolved from a common ancestral eye, contrary to earlier theory. Homeoboxes are parts of regulatory genes shared widely across eukaryotic species that regulate what organs develop where. As evo-devo uncovers the functions of regulatory genes, the new science of genomics is mathematically exposing the specific evolutionary origins of every gene. Knowing each gene’s origins and roughly what it does will coalesce into a comprehensive understanding of development.

Multicellularity and differentiation created opportunities for specialized structures to arise in bodies to perform different functions. Tissues are groups of cells with similar functions, organs are groups of tissues that provide a higher level of functionality still, and organ systems coordinate the organs at the highest level. A stream has no purpose; water just flows downhill. But a blood vessel is built specifically to deliver resources to tissues and to remove waste. This may not be the only purpose it serves, but it is definitely one of them. All tissues, organs, and organ systems have specific functions which we can identify, and usually one that seems primary. Additional functions can and often do arise because having multiple applications is sometimes the most convenient way for evolution to solve problems with the available resources. Making high-level generalizations about the functions of tissues, organs, and organ systems is the best way to understand them, provided we recognize that generalizations usually have exceptions. The heart definitely specializes in pumping blood and the brain in overall control of the body. The study of these structures should focus first on their function and only secondarily on their form because their form is driven by their function. The blood will still need circulation and the body will still need coordinated control regardless of what physical mechanisms are drafted to do it. So physicalism must take a back seat to functionalism in areas driven by function, which means in the study of life.

Evolution superficially appears to be a process in which complex forms supersede simpler ones, but it is more accurate to think of it as a continuously improving functional web. Functions can be lost along the way, but barring complete system collapse (and mass extinctions do happen), functionality will tend to increase steadily. Plants and animals could not survive without a complex symbiosis with countless bacteria, archaea, fungi, protists (single-cell eukaryotes), and viruses, which is fortunate because they left breadcrumbs that help us understand how small incremental steps moved the functional ratchet forward. I’ve broken those steps down as much as I could above, but what are the biggest missing links? Before about fifty years ago, we had no insight into what preceded multicellular life, and now we can break that period down into seven stages (a to g) with some detail. Even so, we can only surmise that vast complexity arose and was later simplified (through population bottlenecks) to create the RNA world, viruses, the eukaryotes, and sex. Because these developments were streamlined from complexity now lost, we can never be quite sure how they unfolded, but we will continue to develop insights. What we do know is that everything that has happened since multicellularity arose is child’s play compared to what happened before. We have enough genetic evidence in multicellular creatures that we should eventually be able to piece together almost exactly how each detail evolved. The traditional missing link, the leap from ape to man, still holds many mysteries, but they are all solvable.

2.2 Minds: 600 million to 4 million years ago

The Concerns of Animals

Animals are mobile. Mobile organisms need brains while sessile ones don’t. This point is so obvious it hardly needs to be said, but everything follows from it. Fungi are close relatives to animals that have evolved some highly specialized features that make them ideally suited to life under ground, and plants are arguably more evolved than fungi or animals because their cells can photosynthesize using chloroplasts. But they don’t need brains because they don’t move. They just sit tight and grow, making the best of whatever happens to them. Plants can afford to wait for food (i.e. sunlight) and mates (i.e. pollen) to come to them, but animals need to seek them out and compete for them. They need algorithms to decide where to go and what to do when they get there. The body must be controlled as a logical unit called an agent, and its activities in the world can be subdivided into discrete functions, starting with eating, mating, and sleeping (an activity most animals do for reasons still only partially understood). These can be subdivided further based on physical considerations, chiefly how to control the body and external objects, and functional considerations, chiefly maximizing survival potential by meeting needs and avoiding risks. Brains are the specialized control organs animals developed to weigh these considerations. Brains first collect information about their bodies and the environment from the bottom up but then fit that information into control algorithms that address discrete functions and control considerations from the top down. Top-down prioritization is essential to coordinate body movements and actions effectively. Let’s take a closer look at how animals evolved to get a better idea how they have met these challenges.

The last common ancestor of all animals is called the urmetazoan, aka “first animal”, and is thought to have been a flagellate marine creature. The urmetazoan is important because, like the LUCA and LECA before it, an unknown but perhaps significant amount of animal evolution went into making the urmetazoan, and an unknown but perhaps significant number of competing multicellular mobile forms were squeezed out by the metazoans (aka animals). Now we only see what got through this bottleneck. The surviving animals have differentiated into many branches with a wide variety of forms, so I will climb up through the animal family tree.

Sponges are the most primitive animals from a control standpoint, having no neurons or indeed any organs or specialized cells. But they have animal-like immune systems and some capacity for movement in distress.1 Cnidarians (like jellyfish, anemones, and corals) come next and feature diffuse nervous systems with nerve cells distributed throughout the body rather than a central brain; a nerve net often coordinates movements of a radially symmetric body. Although jellyfish move with more energy efficiency than any other animal, a radial body design provides limited movement options, which may explain why all higher animals are bilateral (though some, like sea stars and sea urchins, have bilateral larvae but radial adults). Nearly all creatures that seem noticeably “animal-like” to us do so because of their bilateral design, which features forward eyes and a mouth. This group is so important that we have to think about the features of the urbilaterian, the first bilateral animal, which lived about 570-600 million years ago. As I mentioned above, we now have evidence that the urbilaterian had eyes.

While the exact order in which the features of animals first appeared is still unknown, the centralized brain became the dominant strategy in most bilateral animals. A few exceptions to centralized control exist among the invertebrates, most notably the octopus (a mollusk), which has a brain for each arm and a central brain that loosely administers them. Having independent eight-way control of its arms comes in handy for an octopus because the arms can often usefully pursue independent tasks. Octopus arms are vastly more capable than those of any other animal, and octopuses use them in amazingly coordinated ways, including to “bounce-walk” across the sea floor and to jump out at the water’s edge to capture crabs.

Why, then, don’t animals all have separate brains for each limb and organ? The way function evolves is always a compromise between logical need and physical mechanism. To some degree, historical accident has undoubtedly shaped and constrained evolution, but, on the other hand, where logical needs exist, nature often finds a way, which sometimes results in convergent evolution of the same trait through completely different mechanisms. In the case of control, it seems likely that it was physically feasible for animals to either localize or centralize control according to which strategy was more effective. An example of decentralized control in the human body is the enteric nervous system, or “gut-brain”, which lines the gut with more than 100 million nerve cells. This is about 0.1% of the 100 billion or so neurons in the human brain. Its main role is controlling digestion, which is largely an internal affair that doesn’t require overall control from the brain.2 However, the brain and gut-brain do communicate in both directions, and the gut-brain has “advice” for the brain in the form of gut feelings. Much of the information sent from the gut to the brain is now thought to arise from our microbiota. The microbes in our gut can weigh several pounds and comprise hundreds of times more genes than our own genome. So gut feelings are probably a show of “no digestion without representation” that works to both parties’ benefit.3,4 The key point in terms of distributed control is that if the gut has information relevant to the control of the whole animal, it needs to convey that information in a form that can impact top-level control, and it does this through feelings and not thoughts. The animal’s highest-level decisions need to be centralized so that it can carry out coordinated plans.
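
To make that division of labor concrete, here is a minimal sketch of the idea. It is my own illustration, not anything from the neuroscience literature: the function names, signal scales, and thresholds are all invented assumptions. A local subsystem handles its own details and exports only one compressed “feeling” signal for top-level control to weigh against other priorities.

```python
# Toy model: a local subsystem (the "gut-brain") processes its own raw
# signals and exports a single summarized "feeling"; the central controller
# weighs that compressed advice, never the raw data.

def gut_subsystem(local_signals):
    """Handle digestion locally; export one 'gut feeling' in [-1, 1].

    local_signals: raw readings the central brain never sees individually
    (negative values represent distress, positive values well-being).
    """
    # Local control happens here, invisible to the central brain; only the
    # clamped average escapes as advice.
    avg = sum(local_signals) / len(local_signals)
    return max(-1.0, min(1.0, avg))

def central_brain(gut_feeling, other_priorities):
    """Top-level decision: act on strong gut advice, else pursue the
    highest-scoring external priority."""
    if gut_feeling < -0.5:
        return "attend to gut"
    return max(other_priorities, key=other_priorities.get)

feeling = gut_subsystem([-0.9, -0.7, -0.8])   # widespread internal distress
print(central_brain(feeling, {"forage": 0.6, "rest": 0.3}))  # attend to gut
```

The design choice this illustrates is the one the paragraph above describes: the gut’s information reaches top-level control only in a form (a “feeling”) that can be prioritized against everything else.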

So let’s consider how control of the body is accomplished in the other two families of highly mobile, complex animals, namely the arthropods and vertebrates. The control system of these animals is most broadly called the neuroendocrine system, as the nervous and endocrine systems are complementary control systems that work together. The endocrine system sends chemical messages using hormones traveling in the blood, while the nervous system sends electrochemical messages through axons, which are long, slender projections of nerve cells, aka neurons, and then between neurons through specialized connections called synapses. Endocrine signals generally start slower and last longer than nerve-based signals. Both arthropods and vertebrates have endocrine glands in the brain and about the body, including the ovaries and testes. Hormones regulate the physiology and behavior of bodily functions like digestion, metabolism, respiration, tissue function, sensory perception, sleep, excretion, lactation, stress, growth and development, movement, and reproduction. Hormones also affect our conscious mood, which encompasses a range of slowly changing subjective states that can influence our behavior.

While the endocrine system focuses on control of specific functions, the nervous system provides overall control of the body, which includes communication to and from the endocrine system. In addition to the enteric nervous system (gut-brain), the body has two other peripheral systems called the somatic and autonomic nervous systems that control movement and regulation of the body somewhat independently from the brain. The central nervous system (CNS) comprises the spinal cord and the brain itself. Nerve cells divide into sensory or afferent neurons, which send information from the body to the CNS; motor or efferent neurons, which send information from the brain to the body; and interneurons, which comprise the brain itself.

The functional capabilities of brains have developed quite differently in arthropods and vertebrates. I am not going to review arthropod brains in detail because vertebrates ultimately developed much larger brains with more generalized functionality, but arthropods are much more successful at small scales than vertebrates. They appear to depend much more on instinctive behavior than vertebrates do. However, many can adapt their behavior by learning about new features in their environment.5 Moving through the vertebrates on the way to Homo sapiens, first

fish branch off, then
amphibians, and then
amniotes, which enclose embryos with an amniotic sac that provides a protective environment and makes it possible to lay eggs on land. Amniotes divide into
reptiles, from which derive
birds (by way of dinosaurs), and
mammals. And mammals then divide into
monotremes (like the duck-billed platypus), then
marsupials, and then
placentals, which gestate their young internally to a relatively late stage of development. There are eighteen orders of placentals, one of which is
primates, to which
humans belong.

It seems to us that evolution reached its apotheosis (divine perfection) in Homo sapiens, and yet we all know that all species have had the same amount of time to evolve, so none should be “more evolved” than others. And yet, by almost any measure, brain power (viewed as a general capacity to solve problems) generally increases as one moves through the above branches toward humans. Furthermore, new brain structures appear along the way that help to account for that increase in power. Of course, the living representatives of each line above have continued to evolve greater brain power and new brain structures of their own. Some birds, in particular, are smarter by almost any measure than some primitive mammals. But birds and mammals have specialized to many new environments, while fish, amphibians, and reptiles have mostly continued to occupy environments they were already well-adapted for. By comparison, fish haven’t had the need or the opportunity to evolve much more powerful brains than they have had for millions of years. The truth is, brain power has generally increased over time in all animal lines because evolution is not random but directed. It is not directed to more complex forms but to more functional forms. In animals, that functionality is most critically driven by the power of the top-level control center, which is the brain. So some species have made better use of the time they have had available to evolve because they have faced greater environmental challenges that demanded better control systems. But it is also worth noting that fish, amphibians, and reptiles are cold-blooded. Warm-blooded animals need much more food but can engage in a more active lifestyle in a wider temperature range. Also, warm-blooded animals can support more energy-dependent brains, making it easier for them to think more and faster.6

Let’s consider for a moment the differences in brain development among vertebrates. Fish and amphibians have no cerebral cortex, the outer layer of neural tissue of the cerebrum. The cerebral cortex is thought to be the principal control center of more complex behavior. Although neuron counts only provide a rough indication of brain power, they do suggest potential, so I have listed them for certain vertebrates:

Animal                        | Total Neurons     | Cerebral Cortex Neurons
fish, amphibians              | 0.02 billion      | none
small rodents                 | 0.03-0.3 billion  | 0.01-0.04 billion
cats, dogs, herbivores        | 0.5-2.5 billion   | 0.2-0.6 billion
monkeys                       | 3-6 billion       | 0.5-1.7 billion
smarter birds                 | 0.8-3 billion     | 0.8-1.9 billion
elephants                     | 250 billion       | 6 billion
cetaceans (dolphins, whales)  | ?                 | 5-40 billion
apes                          | 10-40 billion     | 2.5-10 billion
humans                        | 86 billion        | 16 billion

  • By total number of neurons, humans have substantially more at 86 billion than any animals except elephants and probably dolphins and whales.7
  • By total number of cerebral cortex neurons, humans have the most (about 16 billion), except that some whales may have more. Elephants have about 6 billion, which is bested only by cetaceans and primates.8

Consciousness as the Top-Down Logical Perspective

That the brain must control the body as a logical agent that pursues discrete tasks implies that it needs to maintain a top-down logical perspective that defines the world in terms of the functions the agent needs to perform. So, for example, it must distinguish its own body from everything that is not part of its body, and it must distinguish high-level categories like plant food, animal (prey) food, predators, members of the same species (conspecifics), relatives, mates, offspring, other plants and animals, terrain features (ground, water, mountains, sky, etc.), and so forth. While these are all things made from matter, they are functionally distinct to animals based on their expected interactions with them. It is more accurate to say that we define these things, and in fact all physical things, principally in terms of what they can do for us and only secondarily in more purely compositional or mechanical terms. However, brains can only gather information about the world outside them by looking for patterns in data collected from the senses. So how can they maintain a top-down perspective when information comes to them through bottom-up channels? The answer is consciousness, aka the mind.

We can thus separate information processing into two broad categories, bottom-up and top-down:

  • Bottom-up processing finds patterns locally in data one source at a time
  • Top-down processing proposes functional units to subdivide the world

The brain manages these two kinds of processing through two logically distinct subprocesses:

Technically, what we call the conscious mind comprises only our subjective awareness at one moment, including our current sensory perceptions, emotions, and thoughts, which have access to our short-term memory, which reputedly holds about four chunks of information and fades out after a few to at most thirty seconds. However, this is not the definition of consciousness I will be using in this book, as it is too narrow. While I do certainly mean current awareness and attention, I also include the scope of past and future awareness and attention. In other words, I also include the long-term memory that is reachable by consciousness. Technically, our long-term memory is part of our nonconscious mind, which also includes every other cognitive task that we surmise must be happening but of which we lack awareness. We infer the existence of the nonconscious mind by process of elimination — it is the tasks that need information processing that we can’t feel happen. Some nonconscious tasks are fundamentally outside the reach of conscious awareness, while others just don’t happen to be in our awareness right now, but we could bring them into consciousness if the need arose. In other words, they are within the scope of conscious processing. Conscious processing is structured around all the information that can be made conscious, not just around current information, which lacks sufficient context to accomplish anything by itself. So while memory itself is a nonconscious process, much of what we store in our memory is shaped by conscious processing. We have an excellent sense of its content, scope, and implications for our current thoughts even though we can’t pull much of it into our awareness at a time. 
Furthermore, the way we perceive our conscious thinking processes is, of course, part of consciousness, but that perception hides a lot of nonconscious support algorithms that we take for granted, which notably include recognition and language, but also many other talents whose mechanisms are invisible to us. Consciousness is coordinated using many nonconscious processes, and this makes it hard to say where one leaves off and the other begins.

My distinction based on the scope of conscious reach is sufficient to call the conscious and nonconscious minds distinct subprocesses, but they are highly interdependent, so it would be overreach to label them independent subprocesses. All information in the conscious mind comes from the nonconscious mind, and, to the extent the conscious mind has executive power, the nonconscious mind takes much of its direction from it. The relationship is analogous to that between a company’s CEO and its other employees. It has been estimated, based on neural activity, that 90 to 99 percent of mental processing is nonconscious,10 but we can’t quantify this precisely because the two blur into each other. Vision is processed nonconsciously for us in parallel. Each part of the image at a pixel-like level is simultaneously converted in real time from an input signal to an internal representation, which is then also often converted in real time into recognized objects. In addition to this nonconscious parallel processing of the input, our conscious perception of the output uses built-in (nonconscious) parallel processing because we can see the whole image at once even though we feel like we are doing one thing at a time.11

Before going further, let me contrast “nonconscious mind” with the more commonly used term “unconscious mind” popularized by Sigmund Freud. Freud’s unconscious mind was the union of repressed conscious thoughts that are no longer accessible (at least not without psychoanalytic assistance) and the nonconscious mind. He saw the preconscious, which is quite similar to what we now call the subconscious, as the mediator between them:

Freud described the unconscious section as a big room that was extremely full with thoughts milling around while the conscious was more like a reception area (small room) with fewer thoughts. The preconscious functioned as the guard between the two spaces and would let only some thoughts pass into the conscious area of thoughts. The ideas that are stuck in the unconscious are called “repressed” and are therefore unable to be “seen” by any conscious level. The preconscious allows this transition from repression to conscious thought to happen.12

Freud either didn’t contemplate or was not concerned with neural processing that happens below the level at which it could be understood even if it became conscious, repressed or not. Instead of the big room/reception area analogy, my nonconscious and conscious are much more like film production and finished movie — lots of processing with tools unfamiliar to consciousness is done to package information up in a streamlined form that consciousness can understand. The parts of the mind permanently outside conscious scope arguably don’t matter to psychoanalysts, but if they help explain how the conscious mind works, they matter to me. In any case, the term unconscious also refers to a loss of consciousness, which can be confusing, so I will only use the term “nonconscious” going forward. Originally, Freud used the term “subconscious” instead of unconscious, but he abandoned it in 1893 because he felt it could be misunderstood to mean an alternate “subterranean” consciousness. It now persists in popular parlance as a synonym for intuitive, which is the word I will use instead to avoid any confusion.

To summarize, I am saying two different things about consciousness that don’t necessarily go together:

  • 1. Consciousness is one of two subprocesses in the brain, namely the one that works from the top down, and
  • 2. We are aware of consciousness (but not nonconsciousness).

This raises some bigger questions, namely, why does awareness exist, and why is it limited to the conscious part? The answer is that awareness is how consciousness maintains its top-down perspective. An agent must ultimately put one plan into effect: a cheetah decides to chase down a specific gazelle. But to do that, a myriad of bottom-up information must be condensed into relevant factors to create the top-down view. How can this information be condensed to a single decision while maintaining a comprehensive grasp of all the details, any of which might impact the next decision, e.g. to call off the chase? The answer is awareness and attention. Awareness simplifies bottom-up information into a set of discrete information channels called qualia, which I will discuss in the next section. Attention prioritizes qualia and thought processes to bring the most relevant factors into consideration for decisions. It is by no means a coincidence that the information processing that simplifies bottom-up data into top-down digestible forms is the same thing as conscious awareness. Awareness is just a specific kind of information processing, and it coincides with the consciousness subprocess (and not nonconsciousness) because it is the kind of processing that consciousness needs to do and is set up to do.

Consciousness feels like a theater because that is the approach to managing this information that works best. Once we recognize that this approach has been used, we can cite any number of good reasons why it evolved. Just from common sense, we know animals have to be aware and alert to get things done and to stay safe. Consciousness is clearly an extremely effective strategy for them. Of course, we can’t tell what consciousness feels like to them, but we can draw analogies to our own experience to see when and how they experience comparable sensory, emotional, and cognitive states. This does not mean that any top-level decision-making process, be it a computer program, a robot, or a zombie, would therefore have conscious awareness. The theater of consciousness is a user interface set up specifically to feed bottom-up information into a top-down algorithm that can continuously reassess priorities to produce a stream of decisions. The algorithms we associate with computers, robots, and zombies are just not in the same league of functionality as what animal minds do. We can’t even remotely imagine how to design such algorithms yet. But if we could devise algorithms that used nonconscious and conscious subprocesses to take the same kinds of things into consideration for the same kinds of reasons to produce the same kind of stream of decisions, then it would be fair to say that they would have awareness comparable to our own. Could we instead design intelligent robots that are not conscious? Yes, undoubtedly, and we arguably have already started to do so, but theirs would not be a comparable kind of intelligence. Many tasks can be done very competently without any of the concerns that animals face.

Most of my focus in this book is on the processing done by the consciousness subprocess or for it by nonconscious processes because these are the aspects of the mind that matter the most to us. I’m going to take a closer look at these processes, starting with qualia.

Qualia – A Way to Holistically Manage Incoming Information

Living organisms are homeostatic, meaning they must maintain their internal conditions in a working state at all times. Animals consequently had to evolve homeostatic control systems, meaning systems that can adjust their supervision on a continuous basis. But such a system still needs to fulfill tasks smoothly and efficiently, not in a herky-jerky panic. Karl Friston was the first to characterize these requirements of a homeostatic control system through his free energy principle.13 This principle says that a homeostatic control system must minimize its surprise, meaning that it should proceed calmly with its ongoing actions so long as all incoming information falls within expected ranges. Any unexpected information is a surprise, which should be bumped up in priority and dealt with until it can itself be brought back into an expected range. It would really be more accurate and informative to call it the surprise-minimization principle because it isn’t really about energy or anything physical at all. This principle says the control system must try to know what to expect, and, beyond that, it must also minimize the chances that inputs will go outside expected ranges. Animals have to follow this principle simply because it is maladaptive not to. Unlike machines we build, which are not homeostatic or homeostatically controlled, animals must have a holistic reaction strategy that can deal with control issues fractally, that is, as needed and at every level of concern.
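
In the information-theoretic sense Friston borrows, the “surprise” of an observation is its negative log-probability under the organism’s model of what to expect. Here is a toy sketch of just that one idea; the Gaussian model and the body-temperature numbers are my own invented assumptions, and the real free energy principle is far richer than this.

```python
import math

# Surprise = -log p(observation) under the organism's expectations.
# Observations near the expected value yield low surprise; deviant ones
# yield high surprise and so demand priority handling.

def surprise(observation, mean, sd):
    """Negative log-probability of an observation under a Gaussian model."""
    p = math.exp(-((observation - mean) ** 2) / (2 * sd ** 2)) \
        / (sd * math.sqrt(2 * math.pi))
    return -math.log(p)

# A homeostatic controller expecting body temperature near 37.0 C (sd 0.5):
print(surprise(37.1, 37.0, 0.5))  # small: proceed calmly with ongoing actions
print(surprise(39.5, 37.0, 0.5))  # large: bump in priority, take corrective action
```

Staying within expected ranges and acting to pull deviant inputs back into range are then two faces of the same policy: keeping this quantity low.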

Simple animals have simple expectations. Even a single-cell creature, like yeast, can sense its environment in some ways and respond to it. Simple creatures evolve a very limited range of expectations and fixed responses to them, but animals developed a broader range of senses, which made it possible to develop a broader range of expectations and responses. In a control arms race, animals have ratcheted up their capacity to respond to an increasing array of sensory information to develop ever more functional responses. But it all starts with the idea of real-time information, which is, of course, the specialty of sensory neurons. These neurons bring signals to the brain, but what the brain needs to know about each sense has to be converted logically into an expected range. Information within the range is irrelevant and can be ignored. Information outside the range requires a response. This requirement to translate the knowledge into a form usable for top-level control created the mind as we know it.

From a logical standpoint, here is what the brain does. First, it monitors its internal and external environment using a large number of sensory neurons, which are bundled into specific functional channels. The brain reprocesses each channel using a logical transformation and feeds it to a subprocess called the mind, which maintains an “awareness” state over each channel while ignoring it by default. The channels are kept open so that a secondary process in the brain, an “attention” process, can evaluate each one to see whether it falls outside its expected range. When a channel does, the attention process focuses on it, which moves the mind subprocess from an aware (but ignoring) state to a focused (attentive) state. The purpose of the mind subprocess is to collect incoming information that has been converted into a logical form that is relevant to tasks at hand so that it can prioritize and act so as to minimize future surprise. Of course, its more complex reactions complete necessary functions, and that is its “objective” if we view the problem deductively, but the brain doesn’t have to operate deductively or understand that it has objectives. All it needs to be able to do is convert sensory information into expected ranges and have ways of keeping them there.
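
The logical loop just described can be sketched in a few lines of code. This is a minimal illustration under my own assumptions (the channel names, signal values, and ranges are all invented): each channel carries an expected range, and the attention pass promotes a channel from ignored-but-aware to focused the moment its signal leaves that range.

```python
# Each sensory channel: name -> (current signal, (expected low, expected high)).
# Channels inside their range stay in the 'aware but ignoring' state;
# channels outside it are promoted to the focused (attentive) state.

def attention_pass(channels):
    """Return the names of channels whose signals demand attention."""
    focused = []
    for name, (signal, (low, high)) in channels.items():
        if not (low <= signal <= high):   # surprise: outside expected range
            focused.append(name)
    return focused

channels = {
    "skin_temp":  (33.1, (31.0, 35.0)),   # within range: ignored
    "joint_pain": (8.0,  (0.0, 2.0)),     # out of range: grabs attention
    "light":      (0.4,  (0.0, 1.0)),     # within range: ignored
}
print(attention_pass(channels))  # ['joint_pain']
```

Everything not returned stays open and monitored, which is exactly the “aware but ignoring” state the text describes.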

Relatively simple animal brains, like those of insects, use entirely instinctive strategies to make this happen. But you can still tell from observing them that, from a logical standpoint, they are operating with both awareness and attention. This alone doesn’t make their mind subprocess comparable to ours in any way we can intuitively identify with, but it does mean that they have a mind subprocess. They are very capable of shifting their focus when inputs fall outside expected ranges, and they then select new behaviors to deal with the situation. Do they “go back” to what they were doing once a minor problem has been stabilized? The free energy principle doesn’t answer questions like that directly, but it does indirectly. Once a crisis has been averted, the next most useful thing the animal can do to avoid a big surprise in its future is to return to what it was doing before. But for very simple animals it may be sufficiently competitive to just continually evaluate current conditions to decide what to do next rather than to devise longer-term plans. After all, current conditions can include desires for food or sex, which can then be prioritized to devise a new plan on a continuous basis. Insects have very complex instinctive strategies for getting food which often depend on monitoring and remembering environmental features. So even though their mind subprocess is simple compared to ours, it must be capable of bringing things to attention, considering remembered data, and consulting known strategies to prioritize its actions and choose an effective logical sequence of steps to take.

People usually consider the ability to feel pain the most significant hallmark of consciousness. Insects have little use for nociceptors, the sensory neurons that transmit pain to the brain, so they apparently don’t suffer when their legs are pulled off. It is just not sufficiently helpful or functional for insects to feel pain because their reproductive strategy is to make lots of expendable units. More complex animals make a larger investment in each individual and need them to be able to recover from injuries, and pain provides its own signal, which is interpreted within an expected range to let the mind subprocess know whether it should ignore or act. Every sensory nerve (a bundle of sensory neurons) creates its own discrete and simultaneous channel of awareness in the mind. If you have followed my argument so far, you can see that what we think of as our first-person awareness or experience of the world is just the mind subprocess doing its job. Minds don’t have to be aware of themselves, or have comprehension or metacognition, to feel things. Feelings are, at their lowest level, just the way these nerve channels are processed for the mind subprocess. Feelings in this sense of being experienced sensations are called qualia. We distinguish red from green as very different qualia, but we could never describe the difference to a person who has red-green color blindness. The feelings themselves are indescribable experiences; words can only list associations we may have with them.

We don’t count each sensory nerve as its own quale (pronounced kwol-ee, singular of qualia), even though we can tell it apart from all others. Instead, the brain groups the sensory nerves functionally into a fixed number of categories, and the feeling of each quale as we experience it is exactly the same regardless of which nerve triggered it. Red looks the same to me no matter which optic nerve sends it. The red we experience is a function of perception and is not part of nature itself, which deals only in wavelengths, so our experience seems like magic, as if it were supernatural. But it isn’t really magic because there is a natural explanation: outside of our conscious awareness in the mind subprocess, the brain has done some information processing and presented a data channel to the mind in the form of a quale. The most important requirement of each sensory nerve is that we can distinguish it from all others, and the second most important requirement is that we can concurrently categorize it into a functional group, its quale. The third most important requirement is that we monitor each channel for being what we expect, and that unexpected signals then demand our attention. These requirements of qualia must all hold from ant minds to human minds, and, in an analogous way, for senses in yeast. But the detection and response range in yeast is much simpler than in ants, and that in ants is much simpler than in people. As we will see, the differences that arise are not just quantitative but also qualitative, as they bring new kinds of function to the table.

The way the brain processes qualia for us makes each one feel different in a special, customized way that is indescribable. More accurately, we can describe them, but words can’t create the feeling. One can describe colors to a blind person or sounds to a deaf person, but the meaning really lies in the experience. Where does this special feeling come from? The short answer is that it is made up. The longer answer is that everything the mind experiences is just information; the mind can only access and process information. But not all information is the same. Because the brain is creating complex signals for the qualia which are designed to let us instantly tell them apart, it has invested some energy in giving each of them a special data signature or “look and feel” which only that quale can produce. To some degree, we can remember that look and feel, but it is not as convincing or fresh as firsthand experience of it. Synesthesia is a rare brain condition in which some qualia trigger other qualia. Most commonly, synesthetes who see letters or numbers, or who hear certain sounds, then see or think of colors or shapes they associate with them. This indicates that some internal malfunction has allowed one quale’s channel to overlap or bleed into another. The overlap almost always goes beyond simple sensory qualia to include words, numbers, shapes or ideas, which suggests that other data channels feed these into our conscious awareness as well. But more to my point at hand, it suggests that the brain invents qualia but generally shields the mind from the details.

Our sensory perceptions provide us with information about our bodies or the world around us. The five classic human senses are sight, hearing, taste, smell, and touch. Sight combines senses for color, brightness, and depth to create composite percepts for objects and movement. The fovea (the highest-resolution area of the retina) only sees the central two degrees of the visual field, but our weaker peripheral vision extends to about 200 to 220 degrees. Smell combines over 1000 independent smell senses. Taste is based on five underlying taste senses (sweet, sour, salty, bitter, and umami). Hearing combines senses for pitch, volume, and other dimensions. And touch combines senses for pressure, temperature, and pain. Beyond these five used most for sensing external phenomena, we have a number of somatic senses that monitor internal body state, including balance (equilibrioception), proprioception (limb awareness), vibration sense, velocity sense, time sense (chronoception), hunger and thirst, erogenous sensation, chemoreception (e.g. salt, carbon dioxide or oxygen levels in blood), and a few more14. Most qualia are strictly informational, providing us with useful clues about ourselves or the world, but some are also dispositional, making us feel inclined to act or to take a position regarding them. Among senses, this most notably applies to pain, temperature, some smells, hunger, thirst, and sex. Dispositional senses are sometimes called drives.

We possess another large class of dispositional qualia called emotions that monitor our mental needs. If we recognize a spider or a snake, that is just a point of information, but if we feel fear, then we know we need to avoid it. Emotions give us motivation to act on a wide variety of needs. Emotions are metaperceptions our brain creates for us by forming perceptions about our conscious mental states. You could say our brain reads our mind and reacts to it. Our nonconscious mind computes what emotions we should feel by “peeking” at our conscious thoughts and feeding its conclusions back to us as emotions. It needs our conscious assessments because only the conscious mind understands the nuances involved, especially with interpersonal interactions. Emotions react to what we really believe, and so can’t be fooled easily, but thanks to metacognition we can potentially bring ourselves to believe things on one level that we don’t believe on another and so can manipulate our emotions.

If everything meets our default expectations, no emotion is stirred because no further action is needed. But if an event falls short of or exceeds our expectations, emotions may be generated to spur further action. Negative emotions motivate us to take corrective action, while positive emotions motivate us to take reinforcing action. We may be aware of rational reasons to act (reasons that ultimately tie back to motivations from drives and emotions), but reasoning lacks urgency. Emotion will inspire us to act quickly. Most emotions are intense but short-lived because quick action is needed. They can also be more diffuse and longer-lived to signal longer-term needs, at which point we call them moods.

We have more emotions than we have qualia for emotions, which causes many emotions to overlap in how they feel. The analysis of facial expressions suggests that there are just four basic emotions: happiness, sadness, fear, and anger.15 While that is a bit of an oversimplification, it is approximately true. Dozens of discernible emotions share these four qualia, but they affect us in different ways because we know the emotions not just by how they feel but by what they are about. So satisfaction, amusement, joy, awe, admiration, adoration, and appreciation are distinct emotions that all feel happy, while anguish, depression, despair, grief, loneliness, regret, and sorrow all feel sad, yet we distinguish them based on context. The feel of an emotion spurs a certain kind of reaction. Happiness spurs reinforcement, sadness spurs diminishment, fear spurs retreat, and anger spurs corrective action. Sexual desire has its own qualia that spur sex. So emotions that call for similar reactions can share qualia, and in some sense, an emotion can be said to feel “like” the action it inspires us to take. Fear and anger make us feel like doing something, happiness feels like something we want more of, and sadness makes us feel like pulling away from its source, which, in the long run, will help us overcome it. Wikipedia lists seventy or so emotions, while the Greater Good Science Center identifies twenty-seven.16 Just as we can distinguish millions of colors with three qualia, we can probably distinguish a nearly unlimited range of emotions by combining the four to roughly twelve emotional qualia with an almost unlimited number of objects at which they can be directed. For example, embarrassment, shyness, and shame mostly trigger qualia for awkwardness, sadness, anxiety, and fear, but also correspond respectively to social appropriateness, comfort around others, and breaking social norms.

Awareness and attention themselves can be said to have a custom feel to them and so can be called qualia. Awareness is informational while attention is dispositional. Their quality is just a sense of existence and interest, and so is not as specific as senses and emotions, but they are near permanent sensations in our conscious life. Qualia are the special effects of the theater of consciousness that make it feel “first-person” and so seamless and well-produced that we believe it shows us the world around us “as it really is”. We know that our visual, aural, tactile, and other sensory ranges are highly idiosyncratic and only represent a very biased view of the world around us, but because that view is entirely consistent with our interactions with that world, it is real for all intents and purposes. The world we are imagining in our heads counts as real if our interactions with it are faithfully executed. We recognize our senses can be fooled, and, more than that, we know that they fill in gaps for us to keep the show on the road, which invariably introduces some mistakes, but we also know we can reconfirm any information in doubt as needed.

2.3 Humans: 4 million to 10,000 years ago

How Cooperation and Engineering Evolved Through Niche Pressure and the Baldwin Effect

People are much more capable than our nearest animal relatives, but why? Clearly, something significant happened in the seven million years since we diverged from chimpanzees. To help figure out what mental functions are uniquely human, let’s first take a look at the most advanced capabilities of animals. Apes and another line of clever animals, the corvids (crows, ravens, and rooks), can fashion simple tools from small branches, which requires cause-and-effect thinking using conceptual models. Most apes and corvids have complex social behaviors. As with many other animals, group living helps them defend against predators and extends foraging opportunities, but they also groom each other, share care of offspring, share knowledge, and communicate vocally and visually for warning, mating, and other purposes.12 Apes and corvids also have a substantial capacity to attribute mental states (such as senses, emotions, desires, and beliefs) to themselves and others, an ability called Theory of Mind (ToM).3,4 In particular, if they see food being hidden and they are aware of another animal (agent) observing it being hidden, this knowledge of the other animal’s knowledge will affect their behavior. Mammals and birds evolved all these capabilities independently, which indicates both that a functional ratchet is at work and that there is some universality to the kinds of functions that it is useful for animals to achieve.

But while apes, corvids, and a few other smart animals can do some clever things in isolation, they can’t abstract beyond the here and now to more generalized applications. If they devise a customized (non-instinctive) strategy to solve a problem, the goal needs to be very obvious. And while their social interactions can be mutually beneficial, they can’t cooperate in novel ways to solve problems. Rather than exhibiting full cooperation, they are “co-acting” by acting only on private motivations which happen to benefit the group when they act in concert. Social insects, for example, seem to work cooperatively to solve problems, but their strategies are instinctive, even to the point of having specialized roles. Apes and corvids can’t achieve that level of cooperation because they can’t devise plans beyond themselves. Humans, however, can engineer solutions for which the goal, the means to it, and the benefits are abstracted to any degree. Because of this, we can easily imagine a group of people working together, using specialized talents, to accomplish a project that one person alone could not. Only humans can both create and refine tools (engineering), and then communicate their plans to each other and execute them in a coordinated fashion. These coordinated activities depend critically on language. The semantic content of language is conceptual in that it uses words to call out generalizations in a top-down way. Gestures contribute to semantic content, and language may even have been mostly gestural at first, but except for sign languages, most semantic content now depends on verbal language.5 Language also conveys nonconceptual content through connotation, including emotional and intuitive undertones. There are two basic reasons language carries this extra layer. First, people will only be willing to cooperate if they trust each other, which develops mostly from our theory-of-mind ability to pick up on what others are thinking from their words, actions, emotions, and so on.
But to read people well, we need to interact with them a lot, and small talk creates lots of opportunity for people to develop trust in each other. Secondly, our thoughts connect to other thoughts in a vast network, but language forces us to distill that down to a single stream of primary meaning that follows an overt or implied conceptual model shared by the speaker and the listener. We use connotation and emotion to imply all sorts of secondary meaning, either subtly to persuade without being too direct, or subliminally as we draw on associations that have helped in the past and so are likely to help again. Language is inherently metacognitive because it represents shared generalizations with words (or, most granularly, with morphemes, the smallest grammatical unit in a language), and this means we need to devote some thought to what each word means. The meaning of a morpheme is established by how people use it, so it can never be completely rigid because usage patterns both vary and shift.

Though neither evolution nor early man could realize it, cooperation and engineering uncorked an entirely new kind of cognitive ratchet that quickly drove human evolution toward our present capabilities. The reason is that cooperation and engineering can be used together in an unlimited number of ways to improve our chances of survival. They also come with costs which must be carefully managed to produce net benefit, so the right balance of ability and restraint of ability had to be struck over the course of human evolution. Perhaps most notably, cooperation and engineering quickly lead to the need for specialized service and manufacturing roles. Hunting is an oft-cited activity requiring both kinds of specialization — the creation of spears and weapons and the teamwork to use them — but if we could do this we could also cooperate to engineer housing, clothing, and food production, and we have probably been doing these things for more than a million years. Starting with rudimentary uses of semantic communication and tools, bands of humans established roles for group-level strategies that slowly evolved into elaborate cultural heritages. Other animals have an intuitive sense of time, but culture greatly expanded our need to contemplate the persistence of artifacts and practices both in our past and into our future. Remembering specific details of past exploits or social interactions matters much more to us, and our much greater ability to plan makes the future matter more as well. By comparison, animals live mostly in the here and now, but humans more substantially live in the past and future. This expansion in time is accompanied by an expansion into possible worlds. Our capacity to project our thoughts abstractly is so great that we can think of ourselves as having two distinct lives, one in the real world and a second, virtual life in our own imaginations.

Of course, just because something is possible doesn’t mean it will happen, let alone happen quickly. Assuming conditions were finally right for a species to start deriving new functional benefit from cooperation and engineering, what accounts for such seemingly significant evolutionary changes happening so quickly, considering how long evolution usually seems to take? It was well known by Darwin’s time that the fossil evidence shows organisms stay relatively unchanged for millions of years. Darwin said of this: “the periods during which species have undergone modification, though long as measured in years, have probably been short in comparison with the periods during which they retain the same form.”6 Niles Eldredge and Stephen Jay Gould published a paper in 1972 that named this phenomenon punctuated equilibrium and contrasted it with the more widely held notion of phyletic gradualism, which held that evolutionary change was gradual and constant. Evolutionary theory, from Darwin through the Modern Synthesis, holds that only the mutation rate affects the rate of evolution, and since it should be constant, evolutionary change should be gradual. To date, nobody has explained why punctuated equilibrium happens. But I propose a simple explanation, which I call niche pressure. In brief, change happens quickly when the slope toward local maxima of potential functionality is steepest, and then slows down and nearly stops when the local maximum is achieved.

Niche pressure will cause the rate of genetic change to decline the longer an organism has lived in the same niche. This is because the organism gradually exhausts the range of physically reachable advantages from small functional changes to the existing genome, climbing up to a local maximum in the space of all possible functionality. Humans that could fly might be more functional, but flight is not physically reachable from small changes. Evolutionary change always happens fastest when the fit of a species to its niche is worst and slows as that fit is perfected. This is not because mutation is any faster; it is just that mutations can make bigger strides when the range of reachable functional possibilities is largest and make less difference when all they can do is make subtle refinements. In other words, evolution is a function of environmental stability. If the environment changes, evolution will be spurred to make species fit better. If the environment stays the same, each interbreeding population will approach stasis as its gene pool comes to represent an optimal solution to the challenges presented by the niche. However, if that population is separated geographically into two subpopulations, then this divides the niche as well, and differences which were previously averaged now impact the two populations in different ways. Each population will quickly evolve to fit its new subniche. Rapid evolution can happen both when the environment changes quickly and when a niche is divided in two, but in the latter case a new species will form. In both cases, however, a single interbreeding population changes rapidly because mutants survive better than the standard forms when they fit the new niche better.
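The dynamic behind niche pressure can be sketched with a toy hill-climbing simulation. Everything here (the one-dimensional trait, the Gaussian mutation size, the survival rule) is an illustrative assumption rather than a biological model; the point is only that change is rapid while the population is far from its local optimum and slows to near-stasis as that optimum is approached:

```python
import random

def evolve(pop, optimum, generations, mut_sd=0.02, seed=None):
    """Hill-climbing selection on a single numeric trait: each generation,
    every individual produces one mutant offspring, and whichever of the
    two lies closer to the niche optimum survives."""
    rng = random.Random(seed)
    history = [sum(pop) / len(pop)]          # track the population mean
    for _ in range(generations):
        next_pop = []
        for x in pop:
            mutant = x + rng.gauss(0, mut_sd)
            next_pop.append(mutant if abs(mutant - optimum) < abs(x - optimum) else x)
        pop = next_pop
        history.append(sum(pop) / len(pop))
    return pop, history

# A population that fits its niche badly (trait 0.0, optimum 1.0) ...
pop, hist = evolve([0.0] * 50, optimum=1.0, generations=200, seed=1)
early_change = abs(hist[20] - hist[0])       # movement in the first 20 generations
late_change = abs(hist[200] - hist[180])     # movement in the last 20 generations
print(early_change, late_change)
```

With these settings, the population covers most of the distance to the optimum in its early generations and then barely moves at all, mimicking a burst of change followed by stasis even though the mutation rate never varies.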

It is usually sufficient to view functional potential from the perspective of environmental opportunity, but organisms are also information processors, and sometimes entirely new ways of processing information create new functional opportunities. This was the case with cooperation and engineering. Together, they launched a new cognitive ratchet because they greatly extended the range of what was physically reachable from small functional changes. Michael Tomasello identified differences in ape and human social cognition using comparative studies that show just what capacities apes lack. Humans do more pointing, imitating, teaching, and reassessing from different angles, and our Theory of Mind goes deeper, so we not only realize what others know, but also what they know we know, and so forth recursively. These features combine to establish group-mindedness, or what he calls “collective intentionality”: ideas of belonging with associated expectations. Though our early cooperating ancestors, Australopithecus four million years ago and Homo erectus two million years ago, didn’t know it, they were bad fits for their new niche, because they had barely begun to explore the range of tools and tasks now possible. (We are still bad fits for our niche because more tools and tasks are possible than ever, so we nervously await the arrival of the technological singularity, when everything possible will be attainable.) In fact, we were the worst fit for our niche that the history of life had ever seen, because the slope toward our potential achievements was steepest (and growing steeper). Of course, we were also the only creatures yet to appear that could attempt to fill that niche.

Even given niche pressure, the idea that humans could evolve from something chimp-like to human-like in just a few million years seems pretty fast based on random mutations. The extended evolutionary synthesis and other attempts to update evolutionary theory include a variety of mechanisms that could “speed up” evolution. Taken together, these mechanisms basically all leverage the idea that most of the genetic sequences that comprise our genomes today predate the chimp-human split. They can consequently be thought of as a reserve of genetic potential which niche pressure drew on to shape us. Most of these mechanisms are still hypothetical, and since I am trying to stick to established science, I am not going to describe or defend them in detail. But some of these mechanisms will likely pan out, showing how feasible it is for niche pressure to work through punctuated equilibrium. First, it is well established that the genetic variation of each gene in a population provides a large reserve of adaptability to new circumstances. We used this diversity to domesticate animals and crops over a much shorter timespan than humans have been evolving. Less demonstrable are proposed mechanisms that could pull inactive genetic sequences into active use. Because DNA replication must primarily produce error-free copies, it seems sensible to assume that mechanisms allowing or even encouraging certain kinds of genetic change, which is what I am suggesting, could not evolve. But this is a bad assumption. Consider this: organisms that could promote useful genetic changes more often than those that could not would quickly come to predominate. This adaptive advantage alone creates constant demand for such (currently unknown) mechanisms that could produce useful changes more frequently than blind chance alone, even if they are heavily dependent on chance themselves. Consequently, any such mechanisms that are physically possible are likely to have evolved.
In fact, such mechanisms are unavoidable because inaction is also a kind of action. A mechanism that guaranteed perfect replication would be an evolutionary dead end, which means that selection pressures exist on the kinds of errors that can happen during replication. Over time, mechanisms can arise that allow certain kinds of “mistakes” which have historically been more helpful than chance. The existence of transposons, sometimes called jumping genes, demonstrates that active gene editing can occur through highly evolved mechanisms, and at least suggests we have barely scratched the surface of their potential. Considering that nearly all organisms have transposons and that they comprise 44% of the human genome, the possibility that they participate in tinker-toy mechanisms of new gene creation is significant. In any case, a tendency for helpful new genes to come together by “lucky” accidents through a variety of subtle mechanisms is more likely than not. While we don’t yet know just how important active gene editing is to evolution, it has seemed likely for some time that random pointwise mutation alone is not enough to account for what we see.

Language is often cited as the critical evolutionary development that drove human intelligence, and while I basically agree with this, there is more to the story than that. First, because language is a universal skill found in all human societies and seems so deeply entrenched in our thought processes, it has been proposed that it is an instinct, sometimes called the language acquisition device. This is completely untrue, as language is entirely a manmade communication system that must be learned through years of study. The innate (instinctive) skills we possess that enable us to learn language are all fairly general-purpose in nature, but it is true that only humans have a sufficient set of such skills to learn language readily. I will be making a considerable effort as the book proceeds to identify these and other skills that contribute to the more general intelligence of humans, but it must be understood at the outset that none of our intelligence or linguistic ability derives from specialized modules of the brain that process grammar or give us a “language” of thought (aka “mentalese”). The additional innate skills of humans versus other animals are best thought of as subtly shifting our interests and focus rather than providing qualitatively different abilities. All animals use general-purpose neural networks that study sensory inputs to create a mental model of their bodies and the world around them by finding and reinforcing patterns in the data. Language is unique in that the patterns must be coordinated with patterns in the minds of others, and their underlying content is consequently relevant to higher-order actions as well. But each mind makes sense of these signals in its own way by making neural connections from its own experience and nothing else. We understand and control our bodies through long-habituated processing of feedback from senses.
We understand our own conscious thoughts only because of our long habituated processing of feedback from thoughts. And we also understand language and all other learned skills only because of our long habituated processing of feedback from using those skills. The mind has one general systematic approach that it uses for everything: look for patterns and attach more significance to the ones you see and use the most.

Now, this said, we do have innate talents that work at a low level, and because every species has a unique set of such talents, every species is cognitively different. As we cooperated and engineered more, we were both literally and figuratively playing with fire, because we initiated a cognitive ratchet that led to the development of a wide variety of distinctively human cognitive talents. None of these is entirely unique to humans; they have just been weighted and refined a bit, so we can see them in slightly different forms in other animals. I’m not going to be able to describe any of them from a genetic perspective because we just don’t have that kind of knowledge yet. However, the good news is that this doesn’t really matter at this stage because function drives form. We will, in time, be able to provide genetic explanations, but to do that we first have to know what we are looking for and why. We need to unravel what functions the cognitive ratchet was clicking into place, which means we have to know what traits were providing cognitive benefit. It’s a lot harder to contemplate the genetic basis of these traits because, unlike eye color, the brain is very holistic, with every innate talent potentially providing subtle benefits and costs across the whole system. The value of cooperation and engineering cascaded, which led to the selection of untold innate talents that facilitated doing them better. We can test for specific mental talents to see how humans vary. For example, a simple memory test shows that chimps surprisingly have much better short-term working memory than humans.7 But it is very hard to assess most cognitive skills from such tests, which still leaves us knowing next to nothing about why humans seem to be more intelligent.

While we don’t know much about our mental skills from experiments, we can still ferret out information about them in other ways, most notably from general considerations of how we think, which I will take up in Part 3. But first I’d like to point out that while all our mental talents are broadly general-purpose, they can be, and have been, specifically selected for how well they help us with language through the Baldwin effect. The Baldwin effect, first mentioned by Douglas Spalding in 1873 and then promoted by American psychologist James Mark Baldwin in 1896, proposes that the ability to learn new behaviors will lead animals to choose behaviors that help them fit their niche better, which will in turn lead to natural selection for traits that support those behaviors. As Daniel Dennett put it, learning lets animals “pretest the efficacy of particular different designs by phenotypic (individual) exploration of the space of nearby possibilities. If a particularly winning setting is thereby discovered, this discovery will create a new selection pressure: organisms that are closer in the adaptive landscape to that discovery will have a clear advantage over those more distant.” The Baldwin effect is Lamarckian-like in that offspring tend to become better at what their ancestors did the most. It is entirely consistent with natural selection and is an accepted part of the Modern and Extended Syntheses because it in no way causes anything parents have learned to be inherited by their offspring. All it does is slowly cause behavior that is learned in every generation to become increasingly natural and innate, as those that can do naturally what they are doing anyway will prosper more. Language has likely been evolving for millions of years, which makes it very likely that a number of instincts that help us with language are Baldwin instincts. None of them are language itself, but they probably help us to make and recognize sounds.
As language starts to help us convey meaning better, those who can use it more effectively will be selected more, leading to natural talents that help us manage words and concepts better. For the capacity to conduct a conversation to evolve, the participants need to have enough attention span and memory to participate. Many small Baldwin refinements to general innate talents evolved as people communicated more, which led to us being highly predisposed to learning language specifically without it becoming an instinct per se.

Could these talents evolve far enough to make an ability for Universal Grammar (UG) innate, as Noam Chomsky proposes through his principles and parameters approach to generative grammar? While anything could evolve to an instinctive level given enough time and the right pressures, UG would be wildly maladaptive because it would be a drastic overspecialization. The whole power of thought and language derives from not being written in stone; to legislate any aspect of them would quickly paint us into a corner we could not get out of. Rather than demonstrating that grammar is universal, the study of the world’s languages shows that they are highly idiosyncratic, with highly specific words and word orders for everything. Nothing about them springs into our minds, either in the primary or secondary languages we learn; everything must be memorized from long exposure and use. They only generalize along very narrow paths, and one can find exceptions that defy any generalization one might make about them. That they have similarities only reflects common origins and ultimately the common need to communicate, not any underlying regularity. That said, although nothing is guaranteed, languages do have lots of regularities because people want communication to be easy, and so they will use similar patterns to express similar ideas. In many languages, you can categorize words into well-defined parts of speech and describe well-defined rules of grammar, but such descriptions are, as is the nature of description, simplifications that only characterize predominant patterns of usage, not hard and fast rules. Language can also be used poetically to an arbitrary degree, as James Joyce did in Finnegans Wake, in which the obfuscation of direct meaning helps to highlight the significance of indirect meaning. In any case, grammar is a superficial aspect of language, which itself offers a superficial window into our thought processes, which, far from being linear, symbolic, or syntactic as mentalese proponents argue, float in the vast network of information held in the brain.

The Baldwin effect shaped many, and perhaps most, complex animal behaviors. I consider dam building in beavers to be a Baldwin instinct. It seems like it might have been reasoned out and taught by parents to offspring, but actually “young beavers, who had never seen or built a dam before, built a similar dam to the adult beavers on their first try.”8 Over the long period of time when this instinct was developing, beavers were gnawing wood and sometimes blocking streams. Those that blocked streams more did better. Beavers, like all animals, have some capacity for learning and so, in any given generation, will learn a few tricks on their own or from their parents that make a dam-oriented lifestyle more effective. This little bit of learning, over many generations, could have nudged evolution toward making dam building instinctive. We now know that the instinct to block water is triggered by the sound of running water, because it can be turned on and off with a recording of that sound. This is a great trigger because it leaves much of the logistics up to general beaver intelligence, which, in addition to logs, can also use twigs, mud, and debris to block the flow of water. Consequently, without ever having conceived that dams would be a good idea, beavers’ ability to learn translated over time into a complex instinctive behavior. Chance mutations that inclined them to do the kinds of things they were already doing through learning thus let the Baldwin effect backfill learned behaviors into instincts.

Children raised without language will not simply speak fluent Greek. Both Holy Roman Emperor Frederick II and King James IV of Scotland performed such experiments, in the 13th and 15th centuries respectively.9 In the former case, the infants died, probably from lack of love, while in the latter they did not speak any language, though they may have developed a sign language. The critical period hypothesis strongly suggests that normal brain development, including the ability to use language, requires adequate social exposure during the critical early years of brain development. Children with very limited exposure to language who interact with other similar kids will often develop an idioglossia, or private language, which is not a full-featured language. Fifty deaf children, probably possessing idioglossias or home sign systems, were brought together in Nicaragua at a center for deaf education in 1977. Efforts to teach them Spanish had little success, but in the meantime, over a nine-year period, they developed what became a full-fledged sign language now called Idioma de Señas de Nicaragua (ISN).10 Languages themselves must be created through a great deal of human interaction, but our facility with language, and our inclination to use it, is so great that we can quickly create complete languages given adequate opportunity. While every fact and rule about any given language must be learned, and while our general capacity for learning includes the ability to learn other complex skills as well, language has been with humans long enough to be heavily influenced by the Baldwin effect. A 2008 study using computer simulations of the feasibility of the Baldwin effect influencing language evolution found that it was quite plausible.11 I think human populations have been using proto-languages for millions of years and that the Baldwin effect has been significant in preferentially selecting traits that help us learn them.
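The logic of the Baldwin effect can itself be sketched in code, in the spirit of Hinton and Nowlan’s well-known 1987 simulation of learning guiding evolution. This is a hedged toy model, not the method of the 2008 study mentioned above: the gene count, population size, trial count, and fitness bonus for fast learners are all arbitrary illustrative assumptions:

```python
import random

def run_baldwin(generations=30, pop_size=200, genes=8, trials=200, seed=7):
    """Toy Hinton & Nowlan-style model. Each gene is '1' (innately right),
    '0' (innately wrong), or '?' (settable by lifetime learning). Learning
    lets lucky genomes find the full behavior within their lifetime, and
    selection then gradually makes the behavior innate."""
    rng = random.Random(seed)
    pop = [[rng.choice('10?') for _ in range(genes)] for _ in range(pop_size)]

    def fitness(g):
        if '0' in g:
            return 1.0              # an innately wrong gene blocks the behavior
        unknowns = g.count('?')
        for t in range(trials):     # random guessing over the learnable genes
            if all(rng.random() < 0.5 for _ in range(unknowns)):
                # the sooner the behavior is learned, the more lifetime benefit
                return 1.0 + 19.0 * (trials - t) / trials
        return 1.0

    for _ in range(generations):
        scores = [fitness(g) for g in pop]
        total = sum(scores)

        def pick():                 # fitness-proportional parent selection
            r = rng.random() * total
            for g, s in zip(pop, scores):
                r -= s
                if r <= 0:
                    return g
            return pop[-1]

        new_pop = []
        for _ in range(pop_size):
            a, b = pick(), pick()
            cut = rng.randrange(1, genes)   # one-point crossover
            new_pop.append(a[:cut] + b[cut:])
        pop = new_pop

    flat = [allele for g in pop for allele in g]
    return {c: flat.count(c) / len(flat) for c in '10?'}

# Each allele starts near a 1/3 share of the gene pool.
end = run_baldwin()
print(end)
```

Run for a few dozen generations, innately wrong alleles ('0') are purged and innately right alleles ('1') spread at the expense of learnable ones ('?'): the learned behavior becomes progressively more innate even though nothing any individual learned is ever directly inherited.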

While linguists tend to focus on grammar, which relates only to the semantic content of language, much of language is nonverbal. Consider that Albert Mehrabian famously claimed in 1967 that only 7% of the information transmitted by verbal communication was due to words, while 38% was tone of voice and 55% was body language. This breakdown was based on two studies in which nonverbal factors could be very significant and does not fairly represent all human communication. While other studies have shown that 60 to 80% of communication is nonverbal in typical face-to-face conversations, in a conversation about purely factual matters most of the information is, of course, carried by the semantic content of the words. This tells us that information carried nonverbally usually matters more to us than the facts of the matter. Cooperation depends more on goodwill and trust than on good information, and goodwill and trust are the chief contributions of nonverbal information. Reading and writing are not interactive and don’t require a relationship to be established, so they work well without body language. But written language also conveys substantial nonverbal content through wording that evokes emotion or innuendo.

Reason and Responsibility: The Hidden Prerequisites of Greater Intelligence

Even though the cognitive ratchet created new opportunities for adaptive success, it also expanded the number of ways we could fail. Superficially, it seems that cooperating and planning better would translate into surviving better. But being able to do these things better also creates more ways to use them disadvantageously. Without safeguards, we will use these talents either counterproductively or antagonistically. Just being able to think better doesn’t motivate us to succeed. We might just play more, and this risk actually does keep many people from achieving their potential. To inspire us to apply ourselves productively, we have dispositional qualia like pain, temperature, some smells, hunger, thirst, and sex that give us subjective incentives to protect our adaptive self-interests. These usually protect our physical well-being adequately, but we also have emotions to protect our social well-being. On the positive side, strong relationships with others build affection, love, trust, confidence, empathy, pride, and social connection, while erosion of these relationships leads to hostility, distrust, guilt, envy, jealousy, resentment, embarrassment, and shame. How emotions affect us is at least partially a social construct, but our difficulty in controlling them consciously suggests a strong innate component as well. We don’t decide what emotions we will feel; rather, the nonconscious mind reviews our conscious thoughts and computes what emotions we should feel, which it presents to our consciousness as emotional qualia. Some emotions can be easily read from our faces and behavior, which would be maladaptive if revealing that information gave others an advantage over us, but adaptive if it fostered support and trust. Since we need people to work with us, and the rewards of double-crossing are great, traits that telegraph our honest feelings were distinctly adaptive.

We don’t make our decisions based on what will maximize the survival of our gene line; we decide things based on our conscious desires. Conscious desire is the indirect approach minds use to translate low-level needs into high-level actions. Any high-level decision needs to balance the relevant factors, and to help us balance a wide variety of factors efficiently, the mind has dispositional qualia that make us feel like doing desirable things and avoiding undesirable things. We don’t eat because we need energy to survive; we eat because food tastes good and it starts to hurt if we don’t. We don’t do good deeds because they will build our reputation and lead to more money and procreative success; we do them because they build positive emotions like pride and deflect negative ones like shame. The filter of consciousness acts as an intermediary, roughly translating physical needs into psychological ones. In the very long run, the basis of mental decisions needs to sync up fairly well with physical needs, but over a shorter time frame, traits can evolve that minds prefer even though they are detrimental to survival. For example, we seem to prefer junk food to healthy food, which was not a problem over the time frame our taste buds evolved because we didn’t have access to junk food. Alternatively, many female birds prefer mates with ostentatious plumage rather than, say, physical strength12. Eating badly negatively impacts survival, so a preference for healthier food will start to evolve. Selecting mates for plumage, however, positively impacts fitness up to a point, because reproduction is such a critical part of the life cycle. My point here is just that conscious desires are pushed by evolution to line up with adaptive needs.

While awareness, attention, and qualia are all computed by nonconscious processes and fed to our conscious minds, much and probably most of our thinking is also computed nonconsciously. One significant part, rational thinking, does appear to be entirely conscious so far as I can tell, but let me briefly list some important nonconscious parts. First, and most significantly, long-term memory is an entirely nonconscious process that works quite conveniently, from a conscious perspective, to retrieve memories based on associations. Somehow, we are not only able to recall many things in significant detail, but we also know approximately in advance what we will find. Our knowledge is indexed in a fractal way: we have an approximate sense of all of it, and as we begin to consider special areas of our knowledge, we recall first an approximate sense of the range of our knowledge in that area, and then details start to come to mind, and so forth. A considerable part of our memory seems to be devoted to summarizing how much more memory we have. Forgetting is likely an adaptive feature of memory, since nearly everyone eventually forgets nearly everything they have seen. The existence of eidetic (sometimes called photographic) memory and hyperthymesia (the ability to remember one’s own life in almost perfect detail) demonstrates that the brain is capable of keeping nearly all long-term memories. When we forget, have we just lost conscious access to memories that may still be used for nonconscious purposes, or are they gone? We don’t yet know, but forgetting may be a natural way of giving more recent memories higher priority, since they are more likely to help us, or it may just keep us from becoming obsessed with our past.

The next most significant nonconscious thought process is belief. However helpful planning can be, with or without the cooperation of others, we have to be able to deploy it decisively and parsimoniously, leveraging our past experience to produce a quick and confident response. Otherwise, we could fall into analysis paralysis, either hesitating a bit too long or freezing entirely when quick action is needed. Belief is how the brain keeps this from happening. We catalog all our knowledge with an appropriate degree of belief, and then belief gives us the green light to act on that knowledge without further consideration of whether it is right or not. I believe chairs will support my weight, so I sit on them without further thought.

The third most significant nonconscious talent is the habituation of thought processes. Belief is a way of habituating our use of long-term memory so we can trust it and act on it safely and efficiently. In a similar way, we develop ways of thinking over a lifetime of practice which conveniently become habituated so we can use them without thinking about how or why they work. This applies most significantly to language, which is complex enough that we can spend a lifetime improving our language skills. Beyond language, our whole conception of cause and effect, and each of the life skills we have learned across thousands of areas, become second nature to us, effectively letting us think without active effort.

And the last nonconscious talent I want to mention at this time is our facility with mental models. It is easy to take this one for granted; the only way we can have any conception of the outside world is through mental models that represent it to us. But we need those models, and they need to seamlessly integrate awareness, attention, and qualia with knowledge structures that tell us what we are seeing and experiencing. Mental models do that for us; we have only to wish for them and they appear.

Finally, given the right nonconscious supports, the crown jewel of intelligence, rational thinking, can flourish. In Part 3 of the book, I will look more closely at how rational thought works and at the subsystems that support it, which will lead into the final part, examining how the whole system works together.