Recent Posts

Introduction: The Mind Explained in One Page or Less

“If we do discover a complete theory, it should in time be understandable in broad principle by everyone, not just a few scientists. Then we shall all, philosophers, scientists, and just ordinary people, be able to take part in the discussion of the question of why it is that we and the universe exist. If we find the answer to that, it would be the ultimate triumph of human reason—for then we would know the mind of God.” — Stephen Hawking, A Brief History of Time

We exist. We don’t quite know why we or the universe exists, but we know that we think, therefore we are. The problem is that we don’t know we know anymore. Worse still, we have convinced ourselves we don’t. It is a temptation of modern physical science to mitigate ourselves right out of existence. First, since the Earth is a small and insignificant place, certainly nothing that happens here can have any cosmic significance. But, more than that, the laws of physics have had such explanatory success that surely they must explain us as well, reducing the phenomenon of us to a dance of quarks and leptons. Well, I am here to tell you that Descartes was right, because we are here, and that science took a left turn at Francis Bacon and needs to get back on the right track. The problem is that we’ve been waiting for science to pull the mind down to earth, to dissect it into its component nuts and bolts, but we’ve had it backward. What we have to do first is use our minds to pull science up from the earth into the heavens, to invest it with the explanatory reach it needs to study imaginary things. Because minds aren’t made of nuts and bolts; brains are. Minds — and imagination — are made of information, function, capacity, and purpose, which are all well-established nonphysical things or forces from the human realm which science can’t see under a microscope.

I am going to go back to first principles to reseat the foundation of science and then use its expanded scope over both real and imaginary things to approach the concept of mind from the bottom up and the top down to develop a unified theory. The nature of the mind was long the sole province of philosophers, who approached it with reason but lacked any tools for uncovering its mechanisms. Wilhelm Wundt, the “father of modern psychology“, took on the conscious mind as a subject of experimental scientific study in the 1870’s. Immanuel Kant, himself probably the greatest philosopher of mind, held that the mind could only be studied through deductive reasoning, i.e. from an a priori stance. He disputed that psychology could ever be an empirical (experimental) science because mental phenomena could not be expressed mathematically, individual thoughts could not be isolated, and any attempt to study the mind introspectively would itself change the object being studied, not to mention opening up innumerable opportunities for bias.1 Wundt nevertheless founded experimental psychology and remained a staunch supporter of introspection, provided it was done under strict experimental control. Introspection’s dubious objectivity caught up with it, and in 1912 Knight Dunlap published an article called “The Case Against Introspection” that pointed out that no evidence supports the idea that we can observe the mechanisms of the mind with the mind. This set the stage for a fifty-year reign of behaviorism, which, in its most extreme forms, sought to deny that anything mental was real and that behavior was all there was2. Kant had made the philosophical case and the behaviorists the scientific case that the inner workings of the mind could not be studied by any means.

A cognitive revolution slowly started to challenge this idea starting in the late 1950s. In 1959, Noam Chomsky famously refuted B.F. Skinner’s 1957 Verbal Behavior, which sought to explain language through behavior, by claiming that language acquisition could not happen through behavior alone3. George Miller’s 1956 article “The Magical Number Seven, Plus or Minus Two” proposed a mental capacity that was independent of behavior. Ideas from computer science that the mind might be computational and from neuroscience that neurons could do it started to emerge. The nature of the mind might be studied scientifically was reborn, but it was clear to everyone that psychology was not broad enough to tackle it. A new field was, cognitive science, was conceived in the 1970s, driven at first mostly from the artificial intelligence community, to figure out how the mind works. Psychology remains the study of the mind as informed only by its use, but because other fields could supply insight into how it works, cognitive science was intentionally chartered without a firm foundation. Instead, it floats on a makeshift interdisciplinary boat that lashes together rafts from psychology, philosophy, artificial intelligence, neuroscience, linguistics, and anthropology. And it has taken this interdisciplinary ball and run with it, welcoming contributing ideas from any direction. Detailed work is being done on each raft, with assistance from the others, but with no clear idea of how everything should fit together. While it has been productive, much of the forest can’t be seen for the trees. Open-mindedness is a big improvement over the closed-mindedness of behaviorism, but cognitive science desperately needs to find a prevailing paradigm, like one finds in all other fields. What philosophical stance can pull its diverse subfields together? I will propose a unifying philosophy that plants cognitive science on solid ground, and I will then use it to explain how the mind works.

What we need to do is roll the clock back to when things started, to the beginning of minds and the beginning of science. We need to think about what really happened and why, about what went right and what went wrong. What we will find is that the essence of more explanatory perspectives was there all along, but they did not quite get past making intuitive sense to forming an overall rational explanation. With a better model that can bridge that gap, we can establish a new framework for science that can explain both material and immaterial things. From this new vantage point, everything will fit together better using only available knowledge. I don’t want to hold you in suspense for hundreds of pages until I get to the point, so I am going to explain how the mind works right here on the first page. And then I’m going to do it again in a bit more detail over a few pages, and then across a few chapters, and then over the rest of the book. Each iteration will go into more detail, will be better supported, and will expand my theory further. I’m going to stand on firm ground and make it firmer. My conclusions should sound obvious, intuitive, and scientific, pulling together the best of both common sense and established science. My theory should be comprehensive if not yet complete, and should be understandable in broad principle by everyone and not just a few scientists.

From a high level, it is easy to understand what the mind does. But you have to understand evolution first. Evolution works by induction, which means trial and error. It keeps trying. It makes mistakes. It detects the mistakes with feedback and tries another way, building a large toolkit of ways that work well. Regardless of the underlying mechanisms, however, life persists. It is possible for these feedback structures to keep going, and this creates it them a logical disposition to do so. They keep living because they can. Living things thus combine physical matter with a “will” to live, which is really just an opportunity. This disposition or will is not itself physical; it is the stored capacities of feedback loops tested over long experience. These capacities capture the possibilities of doing things without actually doing them. They are the information of life, but through metabolism they have an opportunity to act, and those actions are not coincidentally precisely the actions that keep life alive another day. The kinds of actions that worked before are usually the kinds that will work again because the ways those kinds are delineated have themselves worked before. And yet, ways that can work better exist, and organisms that find better ways outcompete those that don’t, creating a functional arms race called evolution that always favors capacities more effective at survival.

Freedom of motion created a challenge and an opportunity for some living things to develop complex behavioral interactions with their environment, if only they could make their bodies pursue high-level plans. Animals met this challenge by evolving brains as control centers and minds as high-level control centers of brains. At the level the mind operates, the body is logically an agent, and its activities are not biochemical reactions but high-level (i.e. abstract) tasks like eating and mating. Unlike evolution, which gathers information slowly from natural selection, brains and minds gather information in real time from experience. Their primary strategy for doing that is also inductive trial and error. Patterns are detected and generalized from feedback into abstractions like friends, foes, and food sources. Most of the brain’s inductive work happens outside of conscious awareness, that is, outside the mind, but it then feeds a relevant distillation of that work up to the mind as instincts, senses, emotions, common sense and intuition. This distillation creates a high-level, logical perspective for the mind that we think of as the first-person: the capacity to experience things. This mind’s-eye view of the world is analogous to a cartoon vs. live action, and for the same reasons: the cartoonist isolates relevant things and omits irrelevant detail. For minds to be effective at their job, the drive to survive needs to be translated into preferences that appropriately influence the high-level agent to choose beneficial actions. Minds therefore experience a network of feelings from pain and other senses through emotions that can influence complex social interactions to make sure they are properly motivated to act in their agent’s best interests.

Humans and some of the most functional animals also use deduction. Where induction works from the bottom up (from specifics to generalities), deduction works from the top down (generalities to specifics). Deduction is a conscious process that builds discrete logical models out of the high-level abstractions presented to consciousness. First, it takes the approximate kinds suggested by bottom-up induction and packages them up into buckets called concepts. Then, it takes the approximate implications suggested by bottom-up induction and packages them up into fixed rules called causes and effects. Then it tunes those models with feedback until they work effectively as a simplified but useful high-level view of the world that we think of as knowledge and understanding. Deduction can only happen consciously because logical models, their concepts, their rules, and how to think with them are themselves all learned behaviors, and our nonconscious intuitive capacities are all innate. Conceptualization and the deduction it supports are hard-won skills we consciously build over a lifetime.

That the mind exists as its own logical realm independent of the physical world is thus not an ineffable enigma, it is an inescapable consequence of the high-level control needs of complex mobile organisms. Our inner world is not magic; it is computed.

Part 1: The Duality of Mind

“Normal science, the activity in which most scientists inevitably spend almost all their time, is predicated on the assumption that the scientific community knows what the world is like”
― Thomas S. Kuhn, The Structure of Scientific Revolutions

The mind exists to control the body from the top level. Control is the use of feedback to regulate a device. Historically, science was not directly concerned with control and left its development to engineers. The first feedback control device is thought to be the water clock of Ctesibius in Alexandria, Egypt around the third century B.C. It kept time by regulating the water level in a vessel and, therefore, the water flow from that vessel.1

All living organisms are perfectly tuned feedback control systems, but brains, in particular, are organs that specialize in top-level control. The mind is a part of the brain of which we have first-hand knowledge that consists of the following properties of consciousness: awareness, attention, feelings, and thoughts. The body, brain, and mind work together harmoniously to keep us alive, but how do they do it? As a software engineer; I’ve spent my whole life devising algorithms to help people control things better with computers. Developing a comprehensive theory of the mind based on existing scientific information is a lot like writing a big program — I develop different ideas that look promising, then step back and see if everything still works, which leads to multiple revisions across the whole program until everything seems to be running smoothly. It is more a job for an engineer than a scientist because it is mostly about generalizing functionality to work together rather than specializing in underlying problems. Generalizing from specifics to functions that solve general cases is most of what computer programmers do. Perhaps it is temperamental, but I think engineers are driven more by a top-down perspective to get things done than a bottom-up perspective to discover details.

In this section, I am going to develop the idea that science has overlooked the fundamental role control plays in life and the mind and has consequently failed to devise an adequate control-based orientation to study them. By reframing our scientific perspective, we can develop the explanatory power we need to understand how life and the mind work.

1.1 Approaching the Mind Scientifically

“You unlock this door with the key of imagination. Beyond it is another dimension: a dimension of sound, a dimension of sight, a dimension of mind. You’re moving into a land of both shadow and substance, of things and ideas. You’ve just crossed over into… the Twilight Zone.” — Rod Serling

Many others before me have attempted to explain what the mind is and how it works. And some of them have been right on the money as far as they have gone. But no explanation has taken it to the nth degree to uncover the fundamental nature of the mind both physically and functionally, fully encompassing both how the brain and mind came to be and what they really consist of. Each branch of science that touches on the mind comes at it from a different direction, and each is fruitful in its own way, but a unified understanding requires a unified framework that spans those perspectives. I don’t see much effort being expended to do that unification, so I am going to do it here. My basic contention is that science has become too specialized and we’ve been missing the forest for the trees. Whose job is it in the sciences to conceive generalized, overarching frameworks? Nobody; all scientists are paid to dig into the details. I believe minds are designed to collect useful knowledge, and each of us already has encyclopedic knowledge about how our mind works. Our deeply held intuitions about the special transcendent status of the mind have merit, but science finds them hard to substantiate and so discounts them. Scientists are quick to assume that the ideas we have about how we think are biased, imaginary, or even delusional because they don’t fit into present-day scientific frameworks. I don’t agree that we should discount the value of intuition on these grounds, and instead propose that intuition should lead the way. I am going to use intuition and logic to devise an expanded framework for science that encompasses the mind that aligns with both common sense and the latest scientific thinking. I am not suggesting we are immune to delusion or bias. They are very real and are the enemy of good science, but if we are careful, we can avoid logical fallacies and see the inner workings of the mind in a new light.

We know things simply from experience, which leverages a number of techniques to develop useful knowledge. Advice columnists expound on new problems based on their presumably greater experience in a subject domain. But science goes beyond the scope of advice by proposing to have conceived and demonstrated cause-and-effect explanations for phenomena. Science is a formalization of knowledge, which, in its fully formalized state declares laws that can perfectly predict future behavior. We recognize that science falls a bit short of perfection in its applicability to the physical world for two reasons. First, we only know the world from its behavior, not from seeing its underlying mechanisms. Second, most laws make simplifying assumptions, so one must consider the impact of complexities beyond the scope of the model. The critical quality that science adds above experienced opinion from these steps to formalize and verify is objectivity. What objective exactly means is a topic I will explore in more detail later, but from a high level, it means to be independent of subjectivity. Knowledge that is not dependent on personal perspective becomes universal, and if it uses a reliable, causal model then we can count it as scientific truth.

My explanation of the mind is part framework and part explanation. It is easier to establish objectivity for an explanation than a framework. An explanation stands on models and evidence, but a framework is one level further removed, and so stands on whether the explanations based on it are objective. A framework is a philosophy of science, and philosophy is sometimes studied independently of the object under study. What I am saying is that to establish objectivity, I can’t do that. I have to develop the philosophy in the specific context of the explanations it supports to establish the overall consistency and reliability that objectivity demands. I do believe all existing philosophies and explanations of science have merit, but in some cases they will need minor revisions, extensions, or reinterpretations to fit into the framework I am proposing. I am going to try to justify everything I propose as I propose it, but to keep things moving, I won’t always be as thorough as I would like on the first pass. In these cases, I will come back to the subject later and fill in. My primary aim is to keep things simple and clear, to appeal to common sense, and to stay as far within the scientific canon as possible. I am presuming readers have no specialized scientific background, both because I am approaching this from first principles and trying to make this accessible to everyone.

Even the idea of studying the mind objectively is questionable considering we only know of the mind from our own subjective experience. We feel we have one and we have a sense that we know what it is up to, but all we can prove about it is that it somehow resides in the brain. Brain scans show what areas of the brain are active when our minds are active. We can even approximately tell what areas of the brain are related to what aspects of the mind by correlating personal reports with activity in brain scans.1 Beyond that, our knowledge of neurochemistry and computer science suggest that the brain potentially has the processing power to produce mental states. Other sciences, from biological to social, assume this processing is happening and draw conclusions based on that assumption. But how can we connect the physical, biological, and social sciences to see the mind in a consistent way? This search for common bounds quickly takes us into a scientific twilight zone where things and ideas join the physical world and the world of imagination. It is very easy to overreach in these waters, so I will remain cognizant of Richard Feynman’s injunction against cargo cult science, which he said could only be avoided by scientific integrity, which he described as, “a kind of leaning over backwards” to make sure scientists do not fool themselves or others. I’ll be trying to do that to ensure my objective — a coherent and unified theory of the mind — stands up to scrutiny.

Science has been fighting some pitched philosophical debates in recent decades which reached a standstill and left it on pretty shaky ground. I am referring to the so-called science wars of the 1990s, in which postmodernists pushed the claim that all of science was a social construction. Scientific realism alone is inadequate to fight off postmodern critiques, so, given that this is the stance on which science most firmly depends, science is formally losing the battle against relativism. The relativists have been held off for now with Richard Dawkins’ war cry, “Science works, bitches!”2, which presumably implies that a firm foundation exists even if it has not been expressed. I aim to provide that solid ground using a philosophy that explains and subsumes relativism itself. These skirmishes don’t affect most scientific progress because local progress can be made independent of the big picture. But relativism is a big problem for the science of mind because while battles can still be won, we don’t, in a sense, know what we are fighting for.

Two diametrically opposed frameworks of science collide in the mind and we have to resolve that conflict to proceed. The first framework is physicalism (described below), which supports the physical sciences. The second framework is an assortment of philosophies which support the biological and social sciences. These philosophies use life and the mind as starting points and then build on that premise. Biological philosophies, which now mostly rest on Darwinism and refinements to it, are fundamentally naturalistic, which means that they assert that the forces that created life are natural. But it is not clear what those forces are, because life already exists as complex, self-sustaining, engineered systems, and our theories describing how it managed to arise naturally are still somewhat incomplete. The social sciences are also naturalistic but usually add humanism as well, which emphasizes the fundamental significance and agency of human beings. In this case, it is not clear why humans or their agency should be fundamental, but to make progress, these premises are taken as foundational. While I agree with the mountains of evidence that suggests that life and the mind are natural phenomena, and hence I agree that naturalism correctly describes the universe, it is not at all clear that it is the same as physicalism, and I will show that it is not. Spoiler alert: the extra force found in nature that is not part of physicalism is, in short, the disposition of life to live and of minds to think. After correctly defining naturalism, we will be in a much better position to explain complex natural phenomena like life and the mind.

As I am planning to stay within the bounds of the best-established science, I want to highlight the theories from which I will draw the most support. Everything I say will be as consistent as possible with these theories. These theories do continue to be refined, as science is never completely settled, and I will cite credible published hypotheses that refine them as needed. Also, some of these theories are guilty of overreaching, so I will have to rein them in. Here they are:

  1. Physicalism, the idea that only physical entities comprised of matter and energy exist. Under the predominant physicalist paradigm, these entities’ behavior is governed by four fundamental forces, namely gravity, the electromagnetic force, and the strong and weak nuclear forces. The latter three are nicely wrapped up into the Standard Model of particle physics, and gravity by general relativity. So far a grand unified theory that unites these two theories remains elusive. Physicalists acknowledge that their theories cannot now or ever be proven correct or reveal why the universe behaves as it does. Rather, they stand as deductive models that map with a high degree of confidence to inductive evidence.

  2. Evolution, the idea that inanimate matter become animate over time through a succession of heritable changes. The paradigm Darwin introduced in 1859 itself evolved during the first half of the 20th century into the Modern Synthesis to incorporated genetic traits and rules of recombination and population genetics. Watson and Crick’s discovery of DNA in 1953 as the source of the genetic code provided the molecular basis for this theory. Since that time, however, our knowledge of molecular mechanisms has exploded, undermining much of that paradigm. The evolutionary biologist Eugene Koonin feels that “the edifice of the [early 20th century] Modern Synthesis has crumbled, apparently, beyond repair”3, but updated syntheses have been proposed. The most widely-supported post-modern synthesis is the extended evolutionary synthesis, which adds a variety of subtle mechanisms that are still consistent with natural selection but which are not as obvious as the basic rules behind genetic traits. These mechanisms include ways organisms can change quickly and then develop full genetic stability (facilitated variation, the Baldwin effect, and epigenetic inheritance) and the effects of kin and groups on natural selection. The Baldwin effect is the idea that learned behavior maintained over many generations will create a selection pressure for adaptations that support that behavior. Eva Jablonka and Marion J. Lamb proposed in Evolution in Four Dimensions: Genetic, Epigenetic, Behavioral, and Symbolic Variation in the History of Life that the Baldwin effect lets organisms change quickly using regulatory genes (epigenes), learned behavior, and language to shift more transient changes permanently into DNA.

  3. Information theory, the idea that something nonphysical called information exists and can be manipulated. The study of information is almost exclusively restricted to the study of the manipulation of information and not to its nature, because the manipulation has great practical value but the nature is seen a point of only philosophical interest. However, understanding the nature of information is critical to understanding how life and the mind work, so I will be primarily concerned in this work with nature rather than manipulation. Because the nature of information has been almost completely marginalized in the study of information theory, existing science its nature doesn’t go very far and I have mostly had to derive my own theory of the nature of information from first principles, building on the available evidence.

  4. The Computational Theory of Mind, the idea that the human mind is an information processing system (IP) and that both cognition and consciousness result from this computation. While we normally think of computation as being mathematical, under this theory computation is generalized to include any transformation of input and internal state information using rules to produce output information. This implies that the mind has ways of encoding and processing information, which seemed radical when this idea was first proposed 70 years ago, but now seems obvious and inescapable. Where mechanical computers use symbolic states stored in digital memory and manipulated electronically, neural computers use neurochemical inputs, states, outputs, and rules. This theory, more than any other, has guided my thinking in this book. It is considered by many to be the only scientific theory that appears capable of providing a natural explanation for the much if not all of the mind’s capabilities, yet its implications have not been thoroughly pursued. I am going to do that here. However, I largely reject the ideas of the representational theory of mind and especially the language of thought, as they unnecessarily and incorrectly go too far in proposing a rigid algorithmic approach when a more generalized solution is needed. Note that whenever I use the word “process” in this book, I mean a computational information process, unless I preface it with a differentiating adjective, e.g. biological process. Although my focus in this book is on the mind, I am incidentally proposing the Computational Theory of Life, which is the formal statement that all life is first and foremost information processing systems and only secondarily biological processes. There is de facto acceptance in the biological sciences that life is computational because it is known that genes drive life and genes contain information, but the full implications of this fact should have transformed the biological sciences, and they have not yet. Also, note that I am not saying that everything is computational; I am specifically saying that life and the mind are computational. Some generalists like to extend this line of thought to propose that the universe is a giant computer, but this is a bad analogy because the universe is, for the most part (i.e. the part that is not alive), physical and devoid of information and information processing.

While the scientific community would broadly agree that these four theories are the leading paradigms in their respective areas, they would not agree on any one version of each theory. They are still evolving, and in some cases have parallel, contradictory lines of development. I will cite appropriate sources that are representative of these theories as needed. When I don’t cite sources, you can assume that I am presenting my own proposal or interpretation, but if I have made my case well then my points should seem sound and uncontroversially.

1.2 Information is Fundamental

Physical scientists have become increasingly committed to physicalism over the past four centuries. Physicalism is intentionally a closed-minded philosophy: it says that only physical things exist, where physical includes matter and energy in spacetime. It seems, at first glance, to be obviously true given our modern perspective: there are no ghosts, and if there were, we should reasonably expect to see some physical evidence of them. Therefore, all that is left is physical. But this attitude is woefully blind; it completely misses the better part our existence, the world of ideas. Of course, physicalism has an answer for that — thought is physical. But are we really supposed to believe that concepts like three, red, golf, pride, and concept are physical? They aren’t. But the physicalists are not deterred. They simply say that while we may find it convenient to talk about things in a free-floating, hypothetical sense, that doesn’t constitute existence in any real sense and so will ultimately prove to be irrelevant. From their perspective, all that is “really” happening is that neurons are firing in the brain, analogously to a CPU running in a computer and our first-person perspective of the mind with thoughts and feelings is just the product of that purely physical process.

Now, it is certainly true that the physicalist perspective has been amazingly successful for studying many physical things, including everything unrelated to life. However, once life enters the picture, philosophical quandaries arise around three problems:

(a) the origin of life,
(b) the mind-body problem and
(c) the explanatory gap.

In 1859, Charles Darwin proposed an apparent solution to (a) the origin of life in On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life. His answer was that life evolved naturally through small incremental changes made possible by competitive natural selection between individuals. The idea of evolution is now nearly universally endorsed by the scientific community because a vast and ever-growing body of evidence supports it while no convincing evidence refutes it. But just how these small incremental changes were individually selected was not understood in Darwin’s time, and even today’s models are somewhat superficial because so many intermediate steps are unknown. Two big unresolved problems in Darwin’s time were the inadequate upper limit of 100 million years for the age of the Earth and the great similarity of animals from different continents. It was nearly a century before the earth was found to be 4.5 billion years old (with life originating at least 4 billion years ago) and plate tectonics explained the separation of the continents. By the mid-20th century, evolutionary theory had developed into a paradigm known as the Modern Synthesis that standardized notions of how variants of discrete traits are inherited. This now classical view holds that each organism has a fixed number of inherited traits called genes, that random mutations lead to gene variants called alleles, and that each parent contributes one gene at random for each trait from the two it inherited from its parents to create offspring with a random mixture of traits. Offspring compete by natural selection, which allows more adaptive traits to increase in numbers over time. While the tenets of the Modern Synthesis are still considered to be broadly true, what we have learned in the past seventy or so years has greatly expanded the repertoire of evolutionary mechanisms, substantially undermining the Modern Synthesis in the process. I will discuss some of that new knowledge later on, but for now, it is sufficient to recognize that life and the mind evolved over billions of years from incremental changes.

Science still draws a blank trying to solve (b) the mind-body problem. In 1637, René Descartes, after thinking about his own thoughts, concluded “that I, who was thinking them, had to be something; and observing this truth, I am thinking therefore I exist”1, which is popularly shortened to Cogito ergo sum or I think, therefore I am. Now, we still know that “three” is something that exists that is persistent and can be shared among us regardless of the myriad ways we might use our brain’s neurochemistry to hold it as an idea, so intuitively we know that Descartes was right. But officially, under the inflexible auspices of physicalism, three doesn’t exist at all. Descartes saw that ideas were a wholly different kind of thing than physical objects and that somehow the two “interacted” in the brain. The idea that two kinds of things exist at a fundamental level and that they can interact is called interactionist dualism. And I will demonstrate that interactionist dualism is the correct ontology of the natural world (an ontology is a philosophy itemizing what kinds of things exist), but not, as it turns out, the brand that Descartes devised. Descartes famously, but incorrectly, proposed that a special mental substance existed that interacted with the physical substance of the brain in the pineal gland. He presumed his mental substance occupied a realm of existence independent from our physical world which had some kind of extent in time and possibly its own kind of space, which made it similar to physical substance. We call his dualism substance dualism. We know now substance dualism is incorrect because the substance of our brains alone is sufficient to create thought.

Physicalism is an ontological monism that says only one kind of thing, physical things, exist. But what is existence? Something that exists can be discriminated on some basis or another as being distinct from other things that exist and is able to interact with them in various ways. Physical things certainly qualify, but I am claiming that concepts also qualify. They can certainly be discriminated and have their own logic of interactions. This doesn’t quite get us down to their fundamental nature, but bear with me I and I will get there soon. Physicalism sees the mind is an activity of the brain, and activities are physical events in spacetime, so it just another way of talking about the same thing. At a low level, the mind/brain consists of neurons connected in some kind of web. Physicalism endorses the idea that one can model higher levels as convenient, aggregated ways of describing lower levels with fewer words. In principle, though, higher levels can always be “reduced” to lower levels incrementally by breaking them down in enough detail. So we may see cells and organs and thoughts as conveniences of higher-level perspectives which arise from purely physical forms. I am going to demonstrate that this is false and that cells, organs, and thoughts do not fully reduce to physical existence. The physicalists are partly right. The mind is a computational process of the brain like digestion is a biological process of the gastrointestinal system. Just as computers bundle data into variables, thinking bundles data into thoughts and concepts which may be stored as memories in neurons. Computers are clearly physical machines, so physicalists conclude that brains are also just physical machines with a “mind” process that is set up to “experience” things. This view misses the forest for the trees because neither computers nor brains are just physical machines… something more is going on that physical laws alone don’t explain.

This brings us to the third problem, (c) the explanatory gap. The explanatory gap is “the difficulty that physicalist theories have in explaining how physical properties give rise to the way things feel when they are experienced.” In the prototypical example, Joseph Levine said, “Pain is the firing of C fibers”, which provides the neurological basis but doesn’t explain the feeling of pain. Of course, we know, independently of how it feels, that the function of pain is to inform the brain that something is happening that threatens the body’s physical integrity. That the brain should have feedback loops that can assist with the maintenance of health sounds analogous to physical feedback mechanisms, so a physical explanation seems sufficient to explain pain. But why things feel the way they do, or why we should have any subjective experience at all, does not seem to follow from physical laws. Bridging this gap is called the hard problem of consciousness because no physical solution seems possible. However, once we recognize that certain nonphysical things exist as well, this problem will go away.

We can resolve these three philosophical quandaries by correcting the underlying mistake of physicalism. That mistake is in assuming that only physical things can arise from physical causes. In one sense, it’s true: life and minds are entirely physical systems following physical laws. But in almost every way that matters to us, it is false: some physical systems (living things) can use feedback to perpetuate themselves and also to become better at doing so, which in turn creates in them the disposition to do so. This inclination, backed up by the capacity to pursue it by capturing and storing information, is not itself a physical thing, even though it exists in a physical system. For simplicity, I will usually just refer to this kind of existence as information, which is a term that usually refers to physically-encoded functionality, but it is understood to have meaning beyond the encoding. Often I will call it function, since information only exists to serve a function and disposition is managed through functional capacities. Information, functions, capacities, and dispositions are not physical and do exist. Ideas are information, but information is much more than just ideas. The ontology of science needs to be reframed to define and encompass information. Before I can do that, I am going to take a very hard look at what information is, and what it is not. I am going approach the subject from several directions to build my case, but I’m going to start with how life, and later minds, expanded the playing field of conventional physics.

During the billions of years before life came along, particle behaved in ways that could be considered very direct consequences of the Standard Model of particle physics and general relativity. This is not to say these theories are in their final form, but one could apply the four fundamental forces to any bunch of matter or energy and be able to predict pretty well what would happen next. But when life came along, complex structure started to develop with intricate biochemistries that seemed to go far beyond what the basic laws of physics would have predicted would happen. This is because living things are a information processing systems, or information processors (IPs) for short, and information can make things happen that would be extremely unlikely to happen otherwise. Organisms today manage heritable information using DNA (or RNA for some viruses) as their information repository. While the information resides in the DNA, its meaning is only revealed when it is translated into biological functions via biochemical processes. Most famously, the genetic code of DNA uses four nucleotide letters to spell 64 3-letter words that map to twenty amino acids (plus start and stop). Technically, DNA is always transcribed first into RNA and from RNA into protein. A string of amino acids forms a protein, and proteins do most of the heavy lifting of cell maintenance. But only two percent of the DNA in humans codes for proteins. Much of the rest regulates when proteins get translated, which most critically controls cell differentiation and specialization in multicellular organisms. Later on I will discuss some other hypothesized additional functions of non-coding DNA that substantially impact how new adaptations arise. But, as regards the storage of information, we know that DNA or RNA store the information and it is translated one-way only to make proteins. The stored information in no way “summarizes” the function the DNA, RNA or proteins can perform; only careful study of their effects can reveal what they do. Consequently, knowing the sequence of the human genome tells us nothing about what it does; we have to figure out what pieces of DNA and RNA are active and (where relevant) what proteins they create, and then connect their activities back to the source.

Animals take information a step further by processing and storing real-time information using neurochemistry in brains2. While other multicellular organisms, like plants, fungi, and algae, react to their environments, they do so very slowly from our perspective. Sessile animals like sponges, corals, and anemones also seem plantlike and seem to lack coordinated behavior. Mobile animals encounter a wide variety of situations for which they need a coordinated response, so they evolved brains to assess and then prioritize and select behaviors appropriate to their current circumstances. Many and perhaps all animals with brains go further still by using agent-centric processes called minds within their brains that represent the external world to them through sensory information that is felt or experienced in a subjective, first-person way. Then first-person thinking contributes to top-level decisions.

While nobody disputes that organisms and brains use information, it is not at all obvious why this makes them fundamentally different from, say, simple machines that don’t use information. To see why they are fundamentally different, we have to think harder about what information really is and not just how it is used by life and brains. Colloquially, information is facts (as opposed to opinions) that provide reliable details about things. More formally, information is “something that exists that provides the answer to a question of some kind or resolves uncertainty.” But provides answers to whom? The answer must be to an information processor. Unless the information informs “someone” about something, it isn’t information. But this doesn’t mean information must be used to be information; it only has to provide answers that could be used. Information is a potential or capacity that can remain latent, but must potentially be usable by some information processor to do something. So what is fundamentally different about organisms and brains from the rest of the comparably inert universe is that they are IPs, and only IPs can create or use information.

But wait, you are thinking, isn’t the universe full of physical information? Isn’t that what science has been recording with instruments about every observable aspect of the world around us in ways that are quite objectively independent of our minds’ IPs? If we have one gram of pure water at 40 degrees Fahrenheit at sea level at 41°20’N 70°0’W (which is in Nantucket Harbor), then this information tells us everything knowable by our science about that gram of matter, and so could be used to answer any question or resolve any uncertainty we might have about it. Of course, the universe doesn’t represent that gram of water using the above sentence, it uses molecules, of which there are sextillions in that gram. One might think this would produce astronomically complex behavior, but the prevailing paradigm of physics claims a uniformity of nature in which all water molecules behave the same. Chemistry and materials science then provide many macroscopic properties that work with great uniformity as well. Materials science reduces to chemistry, and chemistry to physics, so higher-level properties are conveniences of description that can be reduced to lower-level properties and so are not fundamentally different. So, in principle, then, physical laws can be used to predict the behavior of anything. Once you know the structure, quantity, temperature, pressure, and location of anything, then the laws of the universe presumably take care of the rest. Our knowledge of physical laws is still a bit incomplete, but it is good enough that we can make quite accurate predictions about all the things we are familiar with.

Physical information clearly qualifies as information once we have taken it into our minds as knowledge, which is information within our minds’ awareness. But if we are thinking objectively about physical information outside the context of what our minds are doing, that means we are thinking of this information as being present in the structure of matter itself. But is that information really in the matter itself? Matter can clearly have different structures. First, it can differ in the subatomic particles that comprise it, and there are quite a variety of such particles. Next, how these particles combine into larger particles and then atoms and then molecules can vary tremendously. And finally, the configurations into which molecules can be assembled into crystalline and aggregate solids is nearly endless. Information can describe all these structural details, and also the local conditions the substance is under, which chiefly include quantity, temperature, pressure, and location (though gravity and the other fundamental forces work at a distance, which make each spot in the universe somewhat unique). But while we can use information to describe these things, is it meaningful to say the information is there even if we don’t measure and describe it? Wouldn’t it be fair to say that information is latent in physical things as a potential or capacity which can be extracted by us as needed? After all, I did say that information is a potential that doesn’t have to be used to exist.

The answer is no, physical things contain no information. Physical information is created by our minds when we describe physical things, but the physical things themselves don’t have it. Their complex structure is simply physical and that is it. The laws of the universe then operate uniformly at the subatomic level as particles or waves or whatever they really are. The universe doesn’t need to take measurements or collect information just as a clock doesn’t; it just ticks. It is a finite state machine that moves ahead one step at a time using local rules at each spot in the universe. This explanation doesn’t say how it does that or what time is, but I am not here to solve that problem. It is sufficient to know that outside of information processors, the universe has no dispositions, functions, capacities or information. Now, how close particles get to each other affects what atoms, molecules and aggregate substances form, and can create stars and black holes at high densities. But all this happens based on physical laws without any information. While there are patterns in nature that arise from natural processes, e.g. in stars, planets, crystals, and rivers, these patterns just represent the rather direct consequences of the laws of physics and are not information in and of themselves. They only become information at the point where an IP creates information about them. So let’s look at what life does to create information where none existed before.

Living things are complicated because they have microstructure down to the molecular level. Cells are pretty small but still big enough to contain trillions of molecules, all potentially doing different things, which is a lot of complexity. We aren’t currently able to collect all that information and project what each molecule will do using either physics or chemistry alone. But we have found many important biochemical reactions that illuminate considerably how living things collect and use energy and matter. And physicalism maintains that given a complete enough picture of such reactions we can completely understand how life works. But this isn’t true. Perfect knowledge of the biochemistry involved would still leave us unable to predict much of anything about how a living thing will behave. Physical laws alone provide essentially no insight. Our understanding of biological systems depends mostly on theories of macroscopic properties that don’t reduce to physical laws. We are just used to thinking in terms of biological functions so we don’t realize how irreducible they are. Even at a low level, we take for granted that living things maintain their bodies by taking in energy and materials for growth and eliminating waste. But rocks and lakes don’t do that, and nothing in the laws of physics suggests complex matter should organize itself to preserve such fragile, complex, energy-consuming structures. Darwin was the first to suggest a plausible physical mechanism: incremental change steered by natural selection. This continues to be the only idea on the table, and it is still thought to be correct. But what is still not well appreciated is how this process creates information.

At the heart of the theory of evolution is the idea of conducting a long series of trials in which two mechanisms compete and the fitter one vanquishes the less fit and gets to survive longer as its reward. In practice, the competition is not head-to-head this way and fitness is defined not by the features of competing traits but by the probability that an organism will replicate. Genetic recombination provided by sexual reproduction means that the fitness of an organism also measures the fitness of each of its traits. No one trait may make a life-or-death difference, but over time, the traits that support survival better will outcompete and displace less capable traits. Finally, note that mechanisms by which new or changed traits may arise must exist. If you look over this short summary of evolution, you can see the places where I implicitly departed from classical physics and invoked something new by using the words “traits” and “probability”. These words are generalizations whose meaning relative to evolution is lost as soon as we think about them as physical specifics. Biological information is created at the moment that feedback from one or more situations is taken as evidence that can inform a future situation, which is to say that it can give us better than random odds of being able to predict something about that future situation. This concept of information is entirely nonphysical; it is only about similarities of features, where features themselves are informational constructs that depend on being able to be recognized with better than random odds. Two distinct physical things can be exactly alike except for their position in time and space, but we can never prove it. All that we can know is that two physical things have observable features which can be categorized as the same or different based on some criteria. These criteria of categorization, and the concept of generalized categories, are the essence of information. For now, let’s focus only on biological information captured by living organisms in DNA and not on mental information managed by brains. Natural selection implies that biological information is created by inductive logic, which consists of generalizations about specifics whose logical truth is a matter of probabilities rather than logical certainty. Logic produces generalities, which are not physical things one can point to. And the inductive trial-and-error of evolution creates and preserves traits that carry information, but it doesn’t describe what any of those traits are. Furthermore, any attempt to describe them will itself necessarily be an approximate generalization because the real definition of the information is tied to its measure of fitness, not to any specific effects it creates.

We know that evolution works as we are here as evidence, but why did processes that collected biological information form and progress so as to create all the diverse life on earth? The reason is what I call the functional ratchet, and also previously called an arms race. A ratchet is a mechanical device that allows motion in only one direction, as with a cogged wheel with backward angled teeth. Let’s call the fitness advantage a given trait provides its function. More generally capable functions will continuously displace less capable ones over time because of competition. This happens in two stages. First, useful functionality provided by entirely new traits will tend to persist because it provides capabilities other organisms lack. Second, variants of the same trait compete head to head to improve each trait continuously. It is often said that evolution is directionless and human beings are not the “most” evolved creatures at the inevitable endpoint, but this is an incorrect characterization of what is happening. Evolution is always pulled in the direction of greater functionality by the functional ratchet. What functionality means is local to the value each trait contributes to each organism at each moment, so because circumstances change and there is a wide variety of ecological niches, evolution has no specific target and no given function will necessarily ever become advantageous. But the relentless pull toward greater functionality has great potential to produce ever more complex and capable organisms, and this is why we see such a large variety. It is not at all a coincidence that life is more diverse now than in the past or that human intelligence evolved. I will discuss later on how the cognitive ratchet created human brains in the evolutionary blink of an eye.

Note that while the word function suggests we can list the effects the trait can cause in advance, I am using it in a more abstract sense to include any general effects it can cause whether they are knowable or not. In practice, because any effects caused by the trait in specific situations are more likely to be preserved over time if they create net benefits to survival, a collection of effects that are probably more helpful than not overall are likely to evolve for the trait, given that it can change over time. The trait has arguably been causing effects continuously for millions to billions of years, all of which have contributed probabilistically to the trait’s current functionality. However, for entirely physical reasons, traits are likely to be highly specialized, usually having just one fairly obvious functional effect. Any given protein coded by DNA can only have a small, finite number of effects, and it will likely only be used for effects for which it does a better job than any other trait. My point is that the exact benefits and costs can be very subtle and any understanding we acquire is likely to overlook such subtleties. Beyond subtleties, cases of protein moonlighting, in which the same protein performs quite unrelated functions, are now well-documented. In the best-known case, some crystallins can act both as enzymes that catalyze reactions or as transparent structural material of eye lenses.3 But even proteins that can only perform one enzymatic function can use that function in many contexts, effectively creating many functions.

Induction, the idea that function is a byproduct of a long series of trial and error experiments whose feedback has been aggregated, is sufficient to explain evolution, but the mind also uses deduction. I noted before that where induction works from the bottom up (from specifics to generalities), deduction works from the top down (generalities to specifics). From the deductive perspective, we see functions in their simplified, discrete forms which cause specific effects. The body is an agent, limbs are for locomotion, eyes are for seeing, hearts are for pumping blood, gullets are for taking in food, etc. Viewed this way, these functions describe clear contributions to overall survival and fitness, and detailed study always reveals many more subtle subsidiary functions. Of course, we know that evolution didn’t “design” anything because it is used trial and error rather than discrete deductive causes and effects, but we know from experience that deduction can provide very helpful and hence functional support, even though it is not the way the world works. Why and how it does this I will get into later, but for now, let’s review how deduction sees design problems. Deduction begins with a disposition, which is a tendency toward certain actions, that becomes an intent, which is an identified inclination to achieve possible effects or goals. Effects and goals are inherently abstractions in that they don’t refer to anything physical but instead to a general state of affairs, for which the original and driving state of affairs concerning life is to continue to survive. The manipulation of abstractions as logical chess pieces is called deductive reasoning. Techniques to reach goals or purposes are called strategies, designs, or planning. I call the actions of such techniques maneuvers. All these terms except disposition, function, cause, and effect are strictly deductive terms because they require abstractions to be identified. I will expand more in the next chapter how disposition, functionality, and causality (cause and effect) can be meaningful in inductive contexts alone without deduction. My point, for now, is that while evolution has produced a large body of function entirely by inductive means, deductive means can help us a lot to understand what it has done. Provided we develop an understanding of the limitations of deductive explanations, we can be well-justified in using them. I am not going to credit the philosophy of biology with fully exploring those limitations, but we can safely say they are approximately understood, and so on this basis it is reasonable for biologists both to use deductive models to explain life and to characterize evolutionary processes as having intent and designs. There is, however, an unspoken understanding among biologists that the forces of evolution, using only inductive processes, have created something functional that can be fairly called functional. This something has to sit beneath the surface of their deductive explanations because all explanations must be formed with words, which are themselves abstract tools of deductive logic. In other words, information and function are very much present in all living structures and has largely been recorded in DNA, and this information and function are not physical at all. Physicalists go a step too far, then, by discounting the byproducts of inductive information processes as the incidental effects of physical processes.

Although it is possible to create, collect, and use information in a natural universe, it is decidedly nontrivial, as the complexity of living things demonstrates. Beyond the already complex task of creating information with new traits, recombination, and natural selection, living things need a physical way of recording and transcribing information so that it can be deployed as needed going forward. I have said how DNA and RNA do this for life on earth. Because of this, we can see the information of life captured in discrete packages called genes. DNA and RNA are physical structures, and the processes that replicate and translate them are physical, but as units of function, genes are not physical. Their physical components should be viewed as a means to an end, where the end is the function. It is not a designed end but an inductively-shaped one. The physical shapes of living structures are cajoled into forms that would have been entirely unpredictable based on forward-looking design goals, but which patient trial and error has demonstrated are better than the alternatives.

Beyond biological information, animals have brains that collect and use mental information in real time that is stored neurologically. And beyond that, humans can encode mental information as linguistic information or representational information. Linguistic information can either be in a natural language or a formal language. Natural languages assume a human mind as the IP, while formal languages declare the permissible terms and rules, which is most useful for logic, mathematics, and computers. Representational information simulates visual, audio or other sensory experience in any medium, but most notably nowadays in digital formats. And finally, humans create artificial information, which is information created by computer algorithms, most notably using machine learning. All of these forms of information, like biological information, answer questions or resolve uncertainties to inform a future situation. They do this by generalizing and applying nonphysical categorical criteria capable of distinguishing differences and similarities. Some of this information is inductive like biological information, but, as we will see, some of it is deductive, which expands the logical power of information.

We have become accustomed to focusing mostly on encoded information because it can be readily shared, but all encodings presume the existence of an IP capable of using them. For organisms, the whole body processes biological information. Brains (or, technically, the whole nervous and endocrine systems) are the IPs of mental information in animals. Computers can act as the IPs for formal languages, formalized representations, and artificial information, but can’t process natural languages or natural representational information. However, artificial information processing can simulate natural information processing adequately for many applications, such as voice recognition and self-driving cars. My point here is that encoded information is only an incremental portion of any function, which requires an IP to be realized as function. We can take the underlying IPs for granted for any purpose except understanding how the IP itself works, which is the point of this book. While we have perfect knowledge of how electronic IPs work, we have only a vague idea of how biological or mental information processors work.

Consider the following incremental piece of biological information. Bees can see ultraviolet light and we can’t. This fact builds on prevailing biological paradigms, e.g. that bees and people see light with eyes. This presumes bees and people are IPs for which living and seeing are axiomatic underlying functions. The new incremental fact tells us that certain animals, namely bees, see ultraviolet as well. This fact extends what we knew, which seems simple enough. A child who knows only that animals can see and bees are small flying animals that like flowers can now understand how bees see things in flowers that we can’t. A biologist working on bee vision needs no more complex paradigm than the child; living and seeing can be taken for granted axiomatically. She can focus on the ultraviolet part without worrying about why bees are alive or why they see. But if our goal is to explain bees or minds in general, we have to think about these things.

Our biological paradigm needs to define what animals and sight are, but the three philosophical quandaries of life cited above stand in the way of a detailed answer. Physicalists would say that lifeforms are just like clocks but more intricate. That is true; they are intricate machines, but, as with clocks, an explanation of all their pieces, interconnections, and enabling physical forces says nothing about why they have the form they do. Living things, unlike glaciers, are shaped by feedback processes that gradually make them a better fit for what they are doing. Everything that happened to them back to their earliest ancestors about four billion years ago has contributed. A long series of feedback events created biological information by leveraging inductive logic rather than the laws of physics alone. Yes, biological IPs leverage physical laws, but they add something important for which the physical mechanisms are just the means to an end. The result is complex creations that have essentially zero probability of arising by physical mechanisms alone.

How, exactly, do these feedback processes that created life create this new kind of entity called information, and what is information made out of? The answer to both questions is actually the same definition given for information above: the reduction of uncertainty, which can also be phrased as an ability to predict the future with better odds than random chance. Information is made out of what it can do, so we are what we can do. We can do things with a fair expectation that the outcome will align with our expectations. It isn’t really predicting in a physical sense because we see nothing about the actual future, and any number of things could always go wrong with our predictions. We could only know the future in advance with certainty if we had perfect knowledge of the present and a perfectly deterministic universe. But we can never get perfect knowledge because we can’t measure everything and because quantum uncertainty limits how much we can know about how things will behave. Biological information isn’t based on perfect predictions, though, only approximate ones. A prediction that is right more often than it is wrong can arise in a physical system if the system can use feedback from a set of situations to make generalized guesses about future situations that can be deemed similar. That similarity, measured any way you like, carries predictive information by exploiting the uniformity of nature, which usually causes situations that are sufficiently similar to behave similarly. It’s not magic, but it seems like magic relative to conventional laws of physics, which have no framework for measuring similarity or saying anything about the future. A physical system with this capacity is exceptionally nontrivial: living systems took billions of years to evolve into impressive IPs that now centrally manage their heritable information using DNA. Animals then spent hundreds of millions of years evolving minds that manage real-time information using neurochemistry. Finally, humans have built IPs that can manage information using either standardized practices (e.g. by institutions) or computers. But in each case the functional ratchet has acted to conserve more effective functionality, pulling evolution in the direction of greater functionality.

It has often been said that evolution is “directionless” because it seems to pull toward simplicity as much as toward complexity. As Christie Wilcox put it in Scientific American, “Evolution only leads to increases in complexity when complexity is beneficial to survival and reproduction. … the more simple you are, the faster you can reproduce, and thus the more offspring you can have. … it may instead be the lack of complexity, not the rise of it, that is most intriguing.”4 It is true that evolution is not about increasing complexity; it is about increasing functionality. Inductive trial and error always chooses more functionality over less, provided you define “more” as whatever induction selected: a statistical amalgamation of successful performances in which the criterion for each success was situation-specific.
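
To make this concrete, here is a minimal sketch of similarity-based prediction, written in Python with entirely hypothetical data and names. It illustrates only the principle that aggregated feedback plus a similarity measure can beat random chance; it is not a model of any actual biological mechanism.

```python
# Past situations paired with the outcomes that followed them.
# The features, outcomes, and function names are all hypothetical.
history = [((0.1, 0.2), "safe"), ((0.9, 0.8), "danger"),
           ((0.2, 0.1), "safe"), ((0.8, 0.9), "danger")]

def similarity(a, b):
    # Inverse squared distance: closer situations count as more similar.
    return 1.0 / (1e-9 + sum((x - y) ** 2 for x, y in zip(a, b)))

def predict(situation):
    # A generalized, one-step inductive guess: weight each remembered
    # outcome by how similar its situation is to the current one.
    votes = {}
    for past, outcome in history:
        votes[outcome] = votes.get(outcome, 0.0) + similarity(past, situation)
    return max(votes, key=votes.get)

# A never-before-seen situation: the guess beats random chance only
# because sufficiently similar situations tend to behave similarly.
print(predict((0.15, 0.25)))  # -> safe
```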

A functional entity has the capacity to do something useful, where useful means able to act so as to cause outcomes substantially similar to outcomes seen previously. To be able to do this, one must also be able to do many things that one does not actually do, which is to say one must be prepared for a range of circumstances for which appropriate responses are possible. Physical matter and energy are composed of a vast number of small pieces whose behavior is relatively well understood using physical laws. Functional entities are composed of capacities and generalized responses based on those capacities. Both are natural phenomena. Until information processing came along through life, function (being generalized capacity and response) did not exist on earth (or perhaps anywhere). But now life has introduced an uncountable number of functions in the form of biological traits. As Eugene Koonin of the National Center for Biotechnology Information puts it, “The biologically relevant concept of information has to do with ‘meaning’, i.e. encoding various biological functions with various degrees of evolutionary conservation.”5 The mechanism behind each trait is itself purely physical, but the fact that the trait works across a certain range of circumstances is because “works” and “range” generalize abstract capacities, which one could call the reasons for the trait. The traits don’t know why they work, because knowledge is a function of minds, but their utility across a generalized range of situations is what causes them to form. That is why information is not a physical property of DNA; it is a functional property.

Function starts to arise independent of physical existence at the moment a mechanism appears that can abstract from a token to a type and, going the other way, from a type to a token. A token is a specific situation, and a type is a generalization of that token to an abstract set of tokens that could be deemed similar based on one or more criteria. Each criterion permits a range of values that could be called a dimension, and so divides the full range of values into categories. Abstracting a token to a type is a form of indirection and is used all the time in computers, for example to let variables hold quantities not known in advance. An indirect reference to a token can either be a particular, in which case it will only ever refer to that one token, or a generality, in which case it is a type referring to the token. By referring to tokens through different kinds of references, we can apply different kinds of functionality to them. Just as we can build physical computers that use indirection, biological mechanisms can implement indirection as well. I am not suggesting that all types are representational; that is too strong a position. Information is necessarily “about” something else, but only in the sense that its collection and application must move between generalities and specifics. Inductive trial-and-error information doesn’t know it employs types, because only minds can know things, but it does divide the world up this way. When we explain inductive information deductively with knowledge, we are simplifying what is happening by making analogies to cause-and-effect models, even though the underlying processes really use trial-and-error models. Cells have general approaches for moving materials across cell membranes, which we can classify as taking resources in and expelling wastes, but the cells themselves don’t realize they have membranes, and the simplification that materials are resources or waste neglects cases where they are both or neither. Sunlight is important to plants, so sunlight is a category plants process, which is to say they are organized so as to gather sunlight well, e.g. by turning their leaves to the sun, but they don’t pass around messages representing sunlight as a type and instructing cells to collect it.
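
As a loose illustration of token/type indirection (a sketch with hypothetical names and values, not a claim about biological implementation), consider how a program can act on a general category rather than on a particular instance:

```python
# A token is one specific thing; a type refers indirectly to any token
# meeting a criterion. All names and values here are hypothetical.

my_apple = ("apple", 140)  # a particular: bound to exactly one token

def small_object(token):
    # A type: one criterion over the "grams" dimension carves the
    # world of tokens into two categories.
    name, grams = token
    return grams < 500

# Functionality written against the type applies even to tokens that
# did not exist when the rule was written: the essence of generalization.
tokens = [my_apple, ("berry", 2), ("boulder", 9000)]
print([name for (name, grams) in tokens if small_object((name, grams))])
# -> ['apple', 'berry']
```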

To clarify further, we can now see that function is all about applying generalities to specifics using indirect references, while physical things are just about specifics. [Now is a good time to point out that “generalize” and “generalization” mean the same thing as “general” and “generality” except that a generalization is created by inference from specific cases, while a generality is unconcerned with whether it was created with inductive or deductive logic. Because I will argue that deductive logic can only be applied by aligning it to inductive findings, I will use the terms interchangeably but according to the more fitting connotation.] We can break generalities down into increasingly specific subcategories, arriving eventually at particulars.

Natural selection allows small functional changes to spread in a population, and these changes are accompanied by the small DNA changes that caused them. The physical change to the DNA caused the functional change, but it is really the functional change that, by being selected, entrenched the DNA change. Usually, if not always, a deductive cause-and-effect model can be found that accounts for most of the value of an inductive trial-and-error functional feature. For example, hearts pump blood because bodies need circulation. The form and function line up very closely in an obvious way. We can pretty confidently expect that all animals with hearts will continue to have them in future designs to fulfill their need for circulation. While I don’t know which genes build the circulatory system, it is likely that most of them have contributed in straightforward ways for millions of years.

Sex determination, on the other hand, is not as stable a trait. Sometimes populations benefit from parity between the sexes and sometimes from disproportionately more females. Having more females is beneficial during times of great stability, and having more males during times of change. I will discuss why this is later, but the fact that this pressure can change makes it advantageous sometimes for a new mechanism of sex determination to spring up. For example, all placental mammals used to use the Y chromosome to determine the sex of the offspring. Only males have it, but males also have an X chromosome. With completely random recombination, this means that offspring have a 50% chance of inheriting their father’s Y chromosome and being male. However, two species of mole voles, small rodents of Asia, have no Y chromosome, so males have XX chromosomes like females. We don’t know what trigger creates male mole voles, but a mechanism that could produce more than 50% females would be quite helpful to the propagation of polygamous mole vole populations, as some are, because there would be more reproducing (i.e. female) offspring.678 The exact reason a change in sex determination was more adaptive is not relevant; all that matters is that it was, and the old physical mechanism was simply abandoned. A physical mechanism is necessary, and so only possible physical mechanisms can be employed, but the selection between physical mechanisms is not based on their physical merits but only on their functional contribution. As we move into the area of mental functions, the link between physical mechanisms and mental functions becomes increasingly abstract, effectively making the prediction of animal behavior based on physical knowledge alone impossible. To understand functional systems, we have to focus on what capacities the functions bring to the table, not on the physical means they employ.

I have introduced the idea that information and the function it brings are the keys to resolving the three philosophical quandaries created by life. In the next chapter, I will develop it into a comprehensive ontology that is up to the task of supporting the scientific study of all manner of things.

1.3 Dualism and the Five Levels of Existence

To review, an information processor or IP is a physical construction, e.g. a living thing, that manages (creates and uses) information. As I have noted before, the reason that living things develop functions to help them survive is that they can: the opportunity to keep existing presents itself, and they are disposed to take it. In other words, they have a reason to live. If I wrote a pointless computer program that had no disposition to do anything or accomplish any function, we would say it was devoid of any information or function; it processes data but no information. Something only counts as information or as having function when it is useful, meaning that it can be applied toward an end or purpose. But use, end, and purpose are not physical things or events; they are at most ways of thinking about things. However, if we could think about an approximate future state of physical things or events, where this approximation was defined in terms of similarities to past things and events, then we could think about making it our purpose to cause that future state to happen. Information processors are physical machines that use physical mechanisms to model physical things and events in a nonphysical way and then apply those nonphysical models back to physical circumstances to change them. It all sounds wildly complicated and unlikely, except that it is exactly what life does: it collects physical processes (genetic traits) that approximately cause future physical events to transpire in such a way that it can keep doing so. The purpose of the prediction and application is nothing more than to be able to keep predicting and applying: to survive. But the “it” that keeps doing “it” is always changing or evolving, because time doesn’t repeat. Living IPs only do things similar to things they did before, and instead of just maintaining themselves, they create new IPs as offspring that are similar but never quite the same as themselves.
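
As a toy illustration of that data-versus-information distinction (a hedged sketch in Python; the program, data, and threshold are all invented for the example), the first computation transforms data toward no end, while the second applies the same data toward a purpose, reducing uncertainty about what to do next:

```python
data = [7, 1, 4, 9]  # raw measurements; entirely hypothetical

# Data but no information: a transformation with no disposition toward
# any end. Nothing is answered or resolved by computing it.
pointless = [((x * 31) ^ 5) % 11 for x in data]

# The same data becomes information when an IP applies it toward an
# end, here a (hypothetical) survival-relevant decision rule.
THRESHOLD = 5
decision = "flee" if sum(data) / len(data) > THRESHOLD else "stay"

print(pointless)  # arbitrary numbers, serving no purpose
print(decision)   # "flee": uncertainty about what to do was reduced
```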

Information and function make generalizations about physical (or functional) things. These generalizations are references to those things, and references are not physical themselves even though a physical mechanism exists in an IP to hold them. The information is not how the reference is physically implemented (in neurons or computer chips); it is the useful, purposeful, functional ends to which it can be applied indirectly through references. These ends exist but are not physical; they are a new kind of existence I call functional existence. The fabric of functional existence is capacity, not spacetime. Specifically, it is the capacity to predict what will happen with better than random odds. This capacity can be described in other ways, such as the ability to answer questions, resolve uncertainties, or be useful, all of which refer to generalized ways of guessing that new things will happen that are similar to past things. I will use information synonymously with functional existence, but there is a subtle difference: information is often spoken of independently of the information processing that can be done on it, but functional existence requires both the information and its accompanying information processing as a functional whole.

We are justified in saying that function (or information) actually exists because things exist when we can discriminate them, they persist, and we can do things with them. We can discriminate information, it persists, and we can do things with it, yet it is not physical, so this qualifies it as a distinct category of being. We can therefore conclude that interactionist dualism is true after all. The idea that something’s existence can be defined in terms of the value it produces is called functionalism. For this reason, I call my brand of interactionist dualism form and function dualism, in which physical substance is “form” and information is “function”. I hold that physical things except for IPs are best explained using physicalism, and IPs are best explained using a combination of physicalism and functionalism. While this means I am endorsing physicalism and functionalism, I am only endorsing a version of each. Specifically, I endorse physicalism but require it to drop the restriction that everything IPs do is physical, and I endorse functionalism only in the sense that I describe here. Many variations of functionalism with different ontologies exist which I will not describe or defend. The version I propose says that function (aka information) exists in an abstract, nonphysical way, but that a physical world like ours can use function through information processors. Consequently, although function itself is abstract and independent of physical support, the use of function in a physical world is quite concrete and dependent on elaborate feedback systems that all derive from living things. As an interactionist, I hold that form and function interact in the mind and that they do so via information processing.

Probably most cognitive scientists already consider themselves to be functionalists, in that they view mental states and processes functionally, but that doesn’t make functionalism a well-defined stance. While progress can be made without a coherent definition, a vagueness pervades the conclusions that creates uncertainty about what has been shown. By explicitly defining function and information as a kind of existence that is independent of physical substance, I hope to clarify both the physical and functional aspects of information processing to show how these two kinds of existence persist, interact, and influence the future.

To develop this idea, I’m going to further distinguish five levels of understanding we can have for each of the two kinds of existence, only the first two of which apply to physical things:

Noumenon – the thing-in-itself. Keeps to itself.

Phenomenon – that which can be observed about a noumenon. Reaches out to others.

Percept, from perception – first-order information created by an information processor (IP) using inductive reasoning on received phenomena. Notices others.

Concept, from conception – second-order information created with deductive reasoning, usually by building on percepts. Understands others.

Metaconcept, from metacognition – third-order information or “thoughts about thoughts”. Understands self.

We believe our senses tell us that the world around us exists. We know our senses can fool us, but by accumulating multiple observations using multiple senses, we build a very strong inductive case that physical things are persistent and hence exist. Science has increased this certainty enormously with instruments that are both immune to many kinds of bias and able to observe things beyond our sensory range. Still, though, no matter how much evidence accumulates, we can’t know for sure that the world exists, because it is out there and we are in here. But we can imagine that a thing physically exists independent of our awareness of it, and we refer to this standalone type of existence as the thing’s noumenon, or thing-in-itself (what Kant called das Ding an sich). The only way we can ever come to know anything about noumena is through phenomena, which are emanations from or interactions with a noumenon. Example phenomena include light or sound bouncing off an object, but can also include matter and energy interactions like touch, smell, and temperature. When we talk about atoms, we are referring to their noumena, or actual nature, but we don’t really know what that nature is. We only know noumena by observing their phenomena. So everything science or experience tells us of physical apples or atoms is entirely in terms of their phenomena. We believe they have a noumenal existence because they can be measured in so many different ways, and this consistency would be unlikely if the apple or atom were an illusion. We know this because an illusion of an apple, say a picture or projection of one, lacks many phenomena that real apples provide.

All knowledge based on phenomena is called a posteriori, which includes all knowledge we have of the physical world. We can have direct or a priori knowledge of noumena we can logically perceive in our own minds, which most notably includes things that are true by definition. A priori knowledge includes everything that is true by construction, which includes the logical implications of explicit deductive models. If we define rules of arithmetic such that addition necessarily works, then the rules are just part of the definition and all their implications are a priori even if we can’t easily see what all those implications are. While Kant’s greatest contribution to philosophy was the recognition that we can only know the world through phenomena, leaving physical noumena unknowable, he was perturbed that this implied that philosophers can never derive anything about the physical world from reason alone. Philosophers had always thought some truths about the world (e.g. the idea of cause and effect) could be known through thought alone, yet he had apparently proven that knowledge of the outside world must be the exclusive domain of natural philosophers, i.e. scientists. While some saw this as a fatal blow to philosophy, all it really did was clarify that philosophy is a functional affair, not a physical one. The recognition that we only know the world through phenomena was an important breakthrough because now we can readily accept that everything about the physical world is approximate, a posteriori knowledge that, far from being absolutely true, merely extrapolates about the future based on patterns seen before. Causes and effects are convenient generalizations about the world, not intrinsic physical essences.

Perception is the receipt of a phenomenon by a sensor along with adequate accompanying information processing to create information about it. Physical things have noumena that radiate phenomena, but they never have perception, since perception is information. A single percept is never created entirely from a single phenomenon; the capacity for perception must be built over billions of inductive trial-and-error interactions, as life has done it. We notice a camera flash as a percept, but only because our brain evolved the capacity over millions of years to convert data into information. So if a tree falls in the forest and there was nobody to hear it, there was a phenomenon but no perception. Because IPs exploit the uniformity of nature, our perceptions can very accurately characterize both the phenomena we observe and the underlying noumena from which they emanate, even if complete certainty is impossible.

Perception includes everything we consciously experience without conscious effort: sensory information about our bodies and the world, and also emotions, common sense, and intuition, which somehow bubble up into our awareness as needed. Information we receive from perception divides into two parts, one from nature and one from nurture. The nature part, innate perception, provides information in the form of feelings from senses and emotions that require no experience to feel and that don’t change given more experience. The nurture part, learned perception, provides information in the form of memories and impressions, and it continually changes during our lives based on the contents of stored experience. For convenience, I will usually call learned percepts or intuitions subconcepts, since perception comes just below conception. Our capacity to develop common sense and intuition as subconcepts is itself innate, but the experiences themselves were not anticipated by our genetics and are entirely circumstantial to our individual lives. Everything we perceive is influenced by both innate and learned perception, even though they originate from completely independent sources. So we see red using innate perception, but a lifetime of experience seeing red things then influences our perception with impressions we attribute to either common sense or intuition. All information created by perception is first-order information because it is based on induction, which is the first kind of information one can extract from data. Inductive reasoning or “bottom-up logic” generalizes conclusions from multiple experiences based on similarities, a trial-and-error approach. Entirely outside the purposes of brains, all genetic information is created inductively, so I am going to use the word “perception” more broadly than mental perception, extending it to the evolutionary capture of functionality into biological traits. More slowly than our intuitive mind, evolution “perceives” patterns that can provide functionality, and it captures them in DNA as traits. A few of those genetic traits are mental and create the innate perceptions we experience as senses and emotions.

Conception approaches information from a different direction. Instead of looking for associations in patterns from the bottom up, it works from the top down by proposing the existence of abstract entities called concepts that interact with each other according to rules of cause and effect. Concepts idealize frequently seen patterns into discrete buckets, grouping things or events into chunks that engage in predictable sorts of interactions with related concepts to form a conceptual model. Conceptual models obey whatever rules of logic we imagine for them, but they will best predict what will happen if they use deductive logic, because then they can reach conclusions that are logically certain. (Although conceptual thinking can follow any brand of logic and not necessarily full-fledged deductive logic, I will often refer to top-down or conceptual thinking as deductive for simplicity.) The challenge is in building concepts and models that correspond well to the situations in which they can be applied. To help with this, our base concepts are strongly linked to percepts, and our conceptual rules are heavily influenced by patterns we intuit from perception. We trust our perceptions based on our memory and familiarity with them, but they don’t tell us why things happen. Understanding, comprehension, grasp, and explanation generally imply a conceptual model that says why. Within the logic of the conceptual model, especially if it uses deductive logic that reaches inescapable conclusions, we can know exactly what will happen with perfect foreknowledge as a logical consequence, which gives us the confident feeling that comes with understanding. We know that models never apply perfectly to the physical world, but when they come close enough for our purposes we take our chances with them (and even trust them; more on this later). The implications or entailments of logic can be chained together, allowing conceptual models to take us with certainty many steps further than induction, which can basically only reach probable one-step conclusions. I call knowledge built from conceptual models second-order information because it gives us understanding, as opposed to the mere familiarity of inductive first-order information.
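
To illustrate that chaining, here is a minimal sketch in Python with a toy, entirely hypothetical rule set (real conceptual models are of course far richer): each deduced conclusion can serve as a premise for further rules, carrying us many certain steps beyond the starting facts.

```python
# A toy deductive model: starting facts plus implication rules,
# chained until nothing new follows. The domain is hypothetical.
facts = {"rain", "running"}
rules = [({"rain"}, "wet_ground"),
         ({"wet_ground"}, "slippery"),
         ({"slippery", "running"}, "fall_risk")]

def deduce(facts, rules):
    # Forward chaining: every conclusion can feed later rules, so the
    # model reaches multi-step conclusions with logical certainty.
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

print(sorted(deduce(set(facts), rules)))
# ['fall_risk', 'rain', 'running', 'slippery', 'wet_ground']
```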

Like perception, our capacity for conception is itself innate even though our concepts themselves are all learned1. So, as with perception, I will distinguish innate conception and learned conception as different components. A big difference between perception and conception, however, is that learned perception (subconcepts or intuitive knowledge) only grows at a fixed rate with experience, while learned conception (concepts or rational knowledge) is essentially unlimited, because conceptual models can build on each other to become ever more powerful. This means we can not only build up our conceptual models over our own lifetimes, but also pass them on from generation to generation. Notably, although our innate conception is probably not much different than it was two thousand (or possibly even 50,000 to 200,000) years ago, language and civilization have dramatically transformed our understanding of the world through learned conception.

I mentioned before that I would discuss whether cause and effect are meaningful in the context of induction. Inductive information processing acts based on past experience, but only looks one step ahead instead of chaining causes and effects. We can call the circumstances before an inductive action the cause and the result the effect. However, this uses concepts, calling out specific causes and effects in a general way. Induction itself doesn’t need the concepts of cause or effect; it just happens. However, given that caveat, it is fair to describe inductive processes conceptually using single-step causes and effects. Although these causes and effects happen without any intent, purpose, design, or strategy, they are dispositional. Life is predisposed to survive, and that is why the inductive processes that produce these simple causes and effects happen. Survival is the ultimate cause that produces all the inductive effects of life. This is very different from loose rocks that fall from cliffs, which have no disposition to do anything. The laws of physics are deductive tools that describe behavior in causative terms, e.g. that the loosening of clumps called rocks along cracks in cliffs will cause falling. Physically, however, the same subatomic rules are followed everywhere, and there are no rocks or cliffs except as abstract groupings in our heads.

Metacognition is thoughts about thoughts, or, more specifically, deductive reasoning about thoughts. Conception is a first-order use of deductive reasoning in which the premises are groupings of percepts, while metacognition is a higher-order use of deductive reasoning in which the premises can be concepts themselves, or concepts about concepts abstracted any number of levels. All physical things, grouped to any level of generality, are still just first-order concepts because we have a strong perceptual sense of their scope. So apple tree, tree, and plant are all first-order concepts, but lumber source and fruit season are metaconcepts about trees. Looking inward, we have a concept of ourselves that is based on our subjective experience of doing things, but we also have a metaconcept of ourselves that holds thoughts we have thought about ourselves. Metacognition expands our realm of comprehension from matters of immediate relevance to matters abstracted one or more levels away. It extends our reach from physical reality to unlimited imagination. I call metaconcepts third-order information because this move to arbitrary degrees of indirection unlocks new kinds of explanatory power. Conception, both first- and higher-order, heavily leverages perception but gives us a kind of window into the future.
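
A rough way to picture the difference in orders (a sketch using a toy, hypothetical ontology, not a model of the brain): first-order concepts group things directly by perceptual kind, while metaconcepts refer to concepts by the roles they can play, one level of indirection up.

```python
# First-order concepts: groupings whose scope tracks perceptual kinds.
first_order = {
    "apple tree": "tree",
    "oak":        "tree",
    "wheat":      "plant",
}

# Metaconcepts: concepts about concepts. "lumber source" does not group
# percepts by kind; it groups concepts by a role abstracted a level away.
metaconcepts = {
    "lumber source": {"oak", "apple tree"},
    "fruit season":  {"apple tree"},
}

# Indirection in action: query by role rather than by kind.
for role, members in metaconcepts.items():
    kinds = {first_order[m] for m in members}
    print(f"{role}: {sorted(members)} (first-order kinds: {sorted(kinds)})")
```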

Noumena and phenomena just happen. That is, they are not functional, in that they do nothing to influence future events. All physical things are necessarily just noumenal and phenomenal because they just happen, which we believe means that invariant physical laws of the universe apply to them. But we can apply the idea of noumena and phenomena to functional things as well by referring to them. Percepts and concepts are about things, and we usually know what they are about without needing to observe them to find out. But things can get complicated, and it may be necessary to observe a functional system to learn more about it. This is the case when we are using our minds to figure out the functions of other IPs, or when we have created a conceptual model whose implications are too hard to figure out logically. In these cases, we isolate the functional entity as a noumenon called a black box whose internal workings are taken to be unknown, and we make phenomenal observations of its behavior. For example, if we find a calculator-like device, we can play around with its buttons and observe what happens. We can never know for sure the true function the device was created to have, but we can develop theories based on observations of it. In this way, functional noumena are ultimately unknowable just like physical noumena. Our theories can become increasingly accurate, but only as they pertain to aspects of their existence that are useful to us, which may be quite different from what their creators had in mind (if they were even created for a purpose).

When our concepts are based on formal models like mathematics, we have access to their actual noumena because we defined them. In this case, all their logical implications are also noumenal by definition. But the implications of many formal models can be too complex for us to reason out (i.e. prove), so we may instead opt to gather information about them by induction. If we can run the model on a computer, we can do this by running millions of simulations and analyzing the results for patterns. For example, weather simulators are precisely defined, but we have no idea what all the implications of their rules might be except by running simulations and seeing what pops out. Similarly, in our own minds we can’t logically forecast all the implications of many conceptual models, so we run simulations that project what will happen using subconceptual and conceptual heuristics we have refined over time. In so doing, we have effectively built functional noumena that describe the world about which we then make phenomenal observations.
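
Here is a minimal sketch of that inductive probing of a formal model, in Python; the damped random walk is a hypothetical stand-in for something like a weather simulator, and its rule and parameters are invented for illustration.

```python
import random

# A precisely defined formal model whose implications are hard to prove:
# a damped random walk. The rule and its parameters are hypothetical.
def simulate(steps=100):
    x = 0.0
    for _ in range(steps):
        x = 0.95 * (x + random.uniform(-1.0, 1.0))  # the a priori rule
    return x

# We wrote the model, so its noumenon is fully defined by us, yet we
# learn its behavior inductively, by observing many runs for patterns.
runs = [simulate() for _ in range(100_000)]
mean = sum(runs) / len(runs)
std = (sum((r - mean) ** 2 for r in runs) / len(runs)) ** 0.5
print(f"mean ~ {mean:.3f}, std ~ {std:.3f}")
```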

Evolved functions or traits can be referred to as noumena to which evolution makes adjustments based on the phenomenal observations called natural selection. As with the above cases, the noumena themselves can never be understood directly, but only in terms of the surmised functions of their phenomena. The noumenon of any gene blends information from all the feedback that created it, which, if you think about it, would take an infinite amount of information to describe, because time and space don’t break down into quantized units even if matter and energy do. Even a finite attempt to characterize all that feedback would be astronomically complex. But we can simplify a genetic function using a deductive cause-and-effect model that helps us understand it, in the sense that it empowers us to make any number of useful predictions about what it will do. We know such conceptual models leave out details, but they are still useful. Daniel Dennett calls explanations of evolved noumena “free-floating rationales”2. This is a great way of putting it because it emphasizes that the underlying logic is not dependent on anything physical, which is important because function is not a physical thing. All functional noumena are necessarily free-floating in the sense that they don’t have to be implemented to exist; they embody logical relationships whether anyone knows it or not. But, of course, all physical IPs (whether through DNA or thoughts) are implemented and are ultimately concerned only with what can be implemented, because function must ultimately be useful. In other words, we can imagine information and IPs in the abstract, but abstract IPs can’t actually process any information.

Perception, conception, and metacognition are purely functional modes of existence. They depend on a physical IP, but the same function could potentially be rendered by any number of different physical implementations. We could happily live our lives entirely within a computer simulation if it did a good enough job. This doesn’t mean implementations of information on computers would be indistinguishable from implementations using physical matter; they could always be distinguished. Scientific experiments within the simulation could expose differences between physical and simulated reality. Of course, we could either prevent simulants from doing such experiments, or change the results they obtain, or modify their thought processes to fool them. But I digress; my point is not whether simulation could be a feasible alternative to physical life, but that we live our lives entirely as functional entities and only require a physical environment as a source of new information. We may require physical IPs to exist and think, but we are first and foremost not physical ourselves.

Let’s review our three quandaries in the light of form and function dualism. First, the origin of life. Outside of life, phenomena naturally occur, and explanations of them comprise the laws of physics, chemistry, materials science, and all the physical sciences. These sciences work out rules that describe the interactions of matter and energy. They essentially define matter and energy in terms of their interactions without really concerning themselves with their noumenal nature. As deductive explanations, they are based in the functional world of comprehension and draw their evidence from our perception of phenomena. While the target is ultimately the truth about the noumena of nature, we realize that models are functional and not physical, and also only approximations, even if nearly perfect in their accuracy. With the arrival of life, a new kind of existence, a functional existence, arose when the feedback loops of natural selection developed perception to find patterns in nature that could be exploited in “useful” ways. The use that concerns life is survival, or the propagation of function for its own sake, and that use is sufficient to drive functional change. But perception forms its own rules, transcendent to physical laws, because it uses patterns to learn new patterns. The growth of patterns is directed toward ever greater function because of the functional ratchet. It exploits the fact that appropriately-configured natural systems are shaped by functional objectives to replicate similar patterns, and not just by physical laws indifferent to similarity.

Next, let’s consider the mind-body problem. The essence of this problem is the feeling that what is happening in the mind is of an entirely different quality than the physical events of the external world. Form and function dualism tells us that this feeling reflects actual underlying natural entities, some of which are physical and some functional. Specifically, the mind is entirely concerned with functional entities and the external physical world is entirely concerned with physical entities, except that other living things are themselves both functional and physical. Contrary to what physicalists would have us believe, this division is not reducible at all, because function is concerned with and defined by what is possible, and the realm of the possible is entirely outside the scope of mere physical things. While function doesn’t reduce to the physical, it does depend on the brain. The mind is a natural entity composed of a complex of functional capacities implemented using the physical machinery of the brain. So the mind can be said to have both a functional aspect and a physical aspect. Since the mind is the subset of brain processing that we associate with consciousness and experience, which is arguably a small subset of the large amount of neural processing that happens outside our conscious awareness, it is quite relevant to discuss the physical nature of the mind in terms of the subset of brain functions directly associated with consciousness. But one can also talk about the functional aspect in isolation from the IP that physically supports it. Although this aspect is just an abstraction in the sense that it needs the brain to support it, as an abstraction it can be thought of as entirely immaterial, our “soul”, if you like. This view of the soul is not supernatural; it just distinguishes function from form and, more to the point, higher-level functions from lower-level functions, the latter being the essential activities of survival, like sleeping and eating, of which we have conscious awareness but which don’t define our long-term objectives.

Finally, let’s look at the explanatory gap, which is about explaining with physical laws why our senses and emotions feel the way they do. I said this gap would evaporate with an expanded ontology. By recognizing functional existence as real, we can see that it opens up a vastly richer space than physical existence, because it means anything can be related to anything in any number of ways. The world of imagination is unbounded, while the physical world is closely ruled by rather rigid laws. The creation of IPs that can first generalize inductively (via the evolution of life and minds) and then later deductively and metacognitively (via the further evolution of minds) gave them increasing degrees of access to this unbounded world. The functional part alone is powerless in the physical world; it needs the physical manifestation of the IP and its limbs (manipulative extremities) to impact physical things; there is nothing spectral going on here. Physical circumstances are always finite, and so IPs are finite, but their capacities are potentially unlimited because capacities are general and not constrained to handle only specific circumstances. So to close the explanatory gap and explain what it means to feel something, we should first recognize that the scope of feeling, experience, and understanding was never itself physical; it was a functional effect within an IP. So what happens in the IP to create feelings?

I’m just going to say the answer here and develop and support it in more detail later on. The role of the brain is to control the body in a coordinated way, and as a practical matter, it solves this using a combination of bottom-up and top-down information processing. These two styles, which have to meet somewhere in the middle, are usually called intuitive and rational. The rational mind is entirely conscious, while the intuitive mind provides the conscious mind with a wealth of impressions and hunches from some inner source I call the nonconscious mind. The role of consciousness is to focus specifically on top-level problems that the nonconscious mind can’t handle by itself. To create this logical view of top-level concerns, the nonconscious mind presents information to the conscious mind by creating a theater of consciousness. Conscious experience is a highly produced and streamlined version of the information the nonconscious mind processes. The analogy to a movie is particularly apt because movies are designed to simulate consciousness. What we feel as pain is really just part of the user interface between the nonconscious and conscious processes in the brain. Senses, feelings, and lessons from the school of hard knocks bubble up to consciousness through the intuitive mind. We are aware of our bodies, our minds, the world, and the passage of time, and we have a specific conscious feeling of them through senses and emotions. Rationally, we organize the world into objects and other concepts that follow rules of cause and effect. Consciousness merges the two styles together pretty seamlessly, but they are actually quite different, entirely functional constructs. Some of our intuitive and rational knowledge, though itself functional, is about the physical world, and some of it is about our mental world or other nonphysical subjects. Most notably, our somatic sensory information is about our bodies and our emotions are about our minds. They are not about our bodies and minds in a physical way; rather, they tell us things our bodies and minds need. Whether about physical, mental, or other things, knowledge serves functional purposes and so can be said to be a functional entity.

To summarize my initial defense of dualism, I have proposed that form and function, also called physical and functional existence, encompass the totality of possible existence. We have evidence of physical things in our natural universe. We could potentially someday acquire evidence of other kinds of physical things from other universes, and they would still be physical, but they may produce different measurements that suggest an entirely different set of physical laws. Functional existence needs no time or space, but for physical creatures to benefit from it, there must be a way for functional existence to manifest in a natural universe. Fortunately, the feedback loops necessary for that to happen are physically possible and have arisen through evolution, and have then gone further to develop minds which can not only perceive, but can also comprehend and reflect on themselves. Note that this naturalistic view is entirely scientific, provided one expands the ontology of science to include functional things, and yet it is entirely consistent with both common sense and conventional wisdom, which hold that a “life force” is something fundamentally lacking in inanimate matter. We also see evidence of that “life force” in human artifacts because we are good at sensing patterns with a functional origin. Some patterns that occur in nature without any help from life do surprise us by appearing to have a functional origin when they don’t.3 Life isn’t magic, but some of its noumenal mystery is intrinsically beyond a complete deductive understanding. But we can continue to improve our deductive understanding of life and the mind to give us a better explanatory grasp of how they work.

1.4 Hey, Science, We’re Over Here

Between the physical view that the mind is a machine subject entirely to physical laws and the ideal view that the mind is a transcendent entity unto itself that exists independently of the body, science has come down firmly in the former camp. This is understandable considering we have unequivocally established that processes in the brain create the mind, although just how this happens is still not known. The latter camp, idealism, in its most extreme form is called solipsism: the idea that only one’s own mind exists and everything else is a figment of it. Most idealists don’t go quite that far and will acknowledge physical existence, but still claim that our mental states like capacities, desires, beliefs, goals, and principles are more fundamental than concrete reality. So idealists are either mental monists or mental/physical dualists. Our intuition and language strongly support the idea of our mental existence independent of physical existence, so we consequently take mental existence for granted in our minds and in discourse. The social sciences also start from the assumption that minds exist and go from there to draw out implications in many directions. But none of this holds much sway with physicalists, who, taking the success of physical theories as proof that physical laws are sufficient, have found a number of creative ways to discount mental existence. Some hold that there are no mental states, just brain states (eliminativism), while others acknowledge mental states but say they can be viewed as or reduced to brain states (reductionism). Eliminativism, also called eliminative materialism (though it would be more accurate to call it eliminative physicalism to include energy and spacetime), holds that physical causes work from the bottom up to explain all higher-level causes, which will ultimately demonstrate that our common-sense or folk-psychology understanding of the mind is false. Reductionism seems to apply well in nonliving systems. We can predict subatomic and atomic interactions using physics, and molecular interactions using chemistry. Linus Pauling’s 1931 paper “On the Nature of the Chemical Bond” showed that chemistry could in principle be reduced to physics12. Applied to the mind, reductionism says that everything happening in the mind can be explained in terms of underlying physical causes. Reductionism doesn’t say higher-level descriptions are invalid, just that, like chemistry, they are a convenience; a physical description is possible. However, both eliminativism and reductionism build on the incorrect assumption that information and its byproducts don’t fundamentally alter the range of natural possibility. This assumption, formally called the Causal Closure of the Physical, states that physical effects can only have physical causes. Stated correctly, it would say that physical effects can only have natural causes, recognizing that information can be created and have effects in a natural world.

The ontology I am proposing is form and function dualism. This is the idea that the physical world exists exactly as the physicalists have described, but also that life and the mind have capacities or functions, which are entirely natural but cause things that would not otherwise happen. It is easy to get confused when talking about function because language itself is purely functional, so we have to use functional means to talk about either physical or functional things. We therefore have to be careful to distinguish the references, or words, which are functional, from the referents, which are either physical or functional noumena. Here are some general words commonly used to refer to either physical things or functional things:

Physical | Functional
form | function, capacity, information
concrete | abstract
things, events | feelings, ideas, thoughts, concepts
action, process | performance, behavior, maneuver
Note that we have no language that describes biological processes in a functional way beyond words like function, capacity, information, and behavior. This is because most of the functional terminology of language derives from our subjective experience, which leads to a mental vocabulary for thoughts and feelings and a physical vocabulary for things and events. Many words, like hard or deep, have both physical and functional meanings, but we can tell from context which is meant. Since we don’t subjectively experience living processes except for those of our own minds, we mostly just use ordinary physical observational terminology to describe them, e.g. as things and processes. Here, contextually, we might know that certain things and processes are actually biological things and processes, and hence that much of what we are thinking about when we talk about them is actually functional and not physical. This is all sort of implied by context, but unfortunately it makes it harder for us to keep track of where information processes are critically involved. Mental vocabulary has a similar drawback. Since we only know what it means from personal experience, it is difficult to attach objective scientific meaning to it. However, the persistence of feelings and thoughts, the degree to which others seem to have similar feelings and thoughts, and the plans and behaviors that we can associate with them can all provide confirming evidence of their existence and functionality, though not their underlying mechanism.

Function only becomes a kind of existence independent of physical existence if its existence can cause physical changes in the world. Since function, in the form of information stored either in DNA or brains, is the product of a higher level of physical systems (specifically, of information processors), while its impact is at the lower, purely physical level, the impact is called downward causation34. Downward causation directly rejects reductionism and asserts emergence5, the idea that something new, namely information, is created that can effect change. Information doesn’t violate any physical laws when it causes things to happen that would not otherwise happen. All it has done, really, is create a more complex feedback response in the physical system. Or, put another way, it has stored up potential using information instead of energy, so, like a spring, it can release that potential when it is needed. This complex feedback response is entirely natural, and one we could arguably call physical as well, except that the rules that govern it go far beyond anything conventional physical laws contemplate. First, because this complex response is indirect and uses logic, it is not physical, so conventional physical laws can’t help explain it. Second, the way the feedback has been tuned by long periods of inductive logic to develop specific functional capacities further makes it irreducible to physical laws. The consequence, that information is created, is natural but complex, and can only be understood by interpreting function as an independent form of existence. This is because “understanding” itself, being a property of information processing, is all about being functional or useful, which requires predicting what will happen, and we can only accurately predict what will happen physically using physical laws and functionally using functional laws (or explanations, since law is too strict a term for much of what happens in the functional realm). So any physicalists out there who want to claim victory, on the grounds that the action of information in a physical system is a physical process akin to the conversion of potential energy to kinetic energy, please, go right ahead. Just keep in mind that you will never be able to predict what that release of information will do until you embrace functional existence, i.e. the logic of information processing.
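
As a loose illustration of information acting as stored potential (a minimal sketch; the thermostat, its setpoint, and the rule are hypothetical stand-ins chosen for simplicity, not the author’s mechanism), the stored value is information rather than energy, yet it selects which physical effect happens downstream:

```python
# A trivial feedback controller. The stored setpoint is information,
# not energy, yet it determines the physical response.
class Thermostat:
    def __init__(self, setpoint):
        self.setpoint = setpoint  # stored information: "the spring"

    def act(self, temperature):
        # The "release": a physical effect selected by comparing the
        # present situation against stored information.
        return "heat_on" if temperature < self.setpoint else "heat_off"

t = Thermostat(setpoint=20.0)
print(t.act(18.5))  # -> heat_on
print(t.act(21.0))  # -> heat_off
```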

So what happens when information is “released” to cause physical effects? When information is applied, resulting in downward causation, the physical effect is specific, but informationally it can be viewed as a special case of a general rule characterizing things that could happen in similar situations. Events happen that are generally similar to prior events, but physically all that is happening is a set of specific events, because similarity means nothing from a physical standpoint. Functionally, it looks like things that have happened “before” are happening “again”, but nothing in the universe ever happens twice. Something entirely new has happened. Yes, it is similar to something that happened before, but only according to some ultimately arbitrary definition of similarity that is relevant to a specific information processor. So we must not conflate things happening in general with things happening specifically. As soon as we even speak of things happening in general, we have admitted the existence of function and we are no longer talking about the physical world independent of our functional perspective of it. Our minds and our language are very function-oriented, so seeing things in general comes very naturally to us; it can be hard to separate functional ideas from non-functional ones, but it is always possible.

Aside from the fact that we have words for concrete or physical things and words for functional or abstract things, nouns of any kind may be specific or general in that they can refer to a particular thing or to a class of things. Wording and context help differentiate these cases. For example, “I own a green car” probably refers to a specific physical object and my conception of it, while “I am going to get a green car” refers to a categorical, functional thing (and not really to anything physical at all). While I am likely referring to my only green car, the indefinite article “a” doesn’t single out a definite car, and I may actually have several green cars, and so have not indicated which one I mean. “I own the green car” makes the identification definite, as would use of a proper noun (name). In the functional world of our minds, we are always very clear with ourselves whether our ideas are specifically referring to single, particular things or are generally referring to classes of things. This distinction is at the heart of function because the way function works is to “extract” order from the uniformity of nature by identifying and then capitalizing on generalities.
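In programming terms, the distinction maps onto types and instances. A minimal sketch, with invented names, assuming the analogy that a class or predicate is a general (functional) category while an object is a particular thing:

```python
# "The green car" names one definite object; "a green car" states a general
# condition that any number of objects may satisfy.

class Car:
    def __init__(self, color):
        self.color = color

my_car = Car("green")        # a particular: one definite object

def is_green_car(x):
    """A generality: true of any green car, not tied to any one of them."""
    return isinstance(x, Car) and x.color == "green"

print(is_green_car(my_car))       # True: the particular satisfies the general
print(is_green_car(Car("red")))   # False
```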

Whenever IPs collect and apply information, they are causing function to emerge in a physical world. The word “emergence” suggests something comes out of nothing, and if you are willing to count a preestablished potential to do things as something, then something new has been created. It is not magical or inexplicable; it is just the result of feedback loops exploiting the uniformity of nature. As Bob Doyle puts it, “Some biologists (e.g., Ernst Mayr) have argued that biology is not reducible to physics and chemistry, although it is completely consistent with the laws of physics. … Biological systems process information at a very fine (atomic/molecular) level. Information is neither matter nor energy, but it needs matter for its embodiment and energy for its communication.”6 Arthur C. Clarke’s third law says “Any sufficiently advanced technology is indistinguishable from magic.” What evolution has created in life and minds qualifies as being sufficiently more advanced than our own technology to count as magic. We now know that it is technology, but we are still quite a ways off from understanding it in enough depth to say we really “get it”. But increasingly refined deductive explanations, such as I am developing here, bring us closer and will eventually bring our technology up to the level evolution has already attained.

Downward causation (i.e. the application of function by IPs) can be called an “interaction” between life and body or between mind and body because life and mind affect the body and vice versa. The IP, being a complex system with both physical and nonphysical aspects, has established mechanisms to mediate its stored potential capacities with effectors, which are generally proteins in the case of living things and muscles in the case of animals. Physical laws most effectively explain purely physical aspects and functional principles most effectively explain more functional aspects, though those functional principles are often tuned for physical applications. Still, though, what makes them functional is that they work by applying generalities to specific situations.

What other ways do physical things differ from functional ones? Each physical thing is unique, but the same information can exist in multiple ways or formats, which is called multiple realizability. What makes them the “same” is that they refer to the same things in the same ways. This characterization of sameness is itself subject to the judgment of an IP, but provided IPs agree, then sameness can be admitted. This allows information to be abstracted from the physical forms to which it refers and from the IPs that manage it. Information consists of generalizations and particulars derived from generalities, which are only indirectly about physical things and are not physical themselves. Next, while matter and energy are conserved in that they can neither be created nor destroyed (though quantum effects challenge this a bit), information is not conserved and is inherently unbounded: the amount of information captured by IPs in the universe is finite at any moment but can grow over time without bound.7
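Multiple realizability has an everyday software analogue: the same text embodied in physically different byte patterns. A minimal sketch:

```python
# The same information realized in two different physical formats.

message = "form and function"

utf8_bytes = message.encode("utf-8")     # one realization
utf16_bytes = message.encode("utf-16")   # a physically different realization

print(utf8_bytes == utf16_bytes)   # False: the embodiments differ
print(utf8_bytes.decode("utf-8") == utf16_bytes.decode("utf-16"))   # True: same information
```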

Although physicalism specifically rejects the possibility of anything existing beyond the physical world as characterized by the laws of physics, in so doing it overlooks unexpected consequences of feedback. Perhaps overlooks is too strong a word, because physicalists can see life and the mind and can call them physical, but physicalists do presume that the behavior of such systems must be reducible to physical terms just because physical laws were sufficient to create them. But this assumption is both unjustified and wrong. Effects can happen in a physical world that can’t be traced back to the physical conditions that caused them because the connections between causes and effects have been hopelessly muddled by uncountable feedback effects that have become increasingly indirect. Instead, the real spirit of physicalist philosophy is naturalism, which says that everything arises from natural causes rather than supernatural ones. Instead of declaring up front that the only natural causes to be allowed are from the Standard Model of particle physics or from general relativity, we should be open to other causes, such as information interactions. This is why I conclude that naturalism is best supported not by a monism of form but by a dualism of form and function.

Physicalists see the functional effects of life and the mind (and, not incidentally, make all their arguments using their minds), but they conflate biological information with complex physical structure, and they are not the same. Just because we can collect all sorts of physical, chemical, and structural information about both nonliving and living matter does not mean they are the same sort of thing. As I said earlier, rocks have a complex physical structure but contain no information. We collect information about them using physical laws and measurements that help us describe and understand them, but their structure by itself is information-free. Weather patterns are even more complex and also chaotic, but they too contain no information, just complex physical structure. We have devised models based on physical laws that are pretty helpful for predicting the weather. But the weather and all other nonliving systems don’t control their own behavior; they are reactive and not proactive. Living things introduce functions or capabilities built from information generated from countless feedback experiments and stored in DNA. This is a fundamental, insurmountable, and irreducible difference between the two. Living things are still entirely physical objects that follow physical laws, but when they use information they are triggering physical events that would not happen without it. Because abstraction has unlimited scope, information processing vastly increases the range of what is physically attainable, as the diversity of life and human achievement demonstrate.

What I am calling form corresponds to what Plato and Aristotle called the material cause, which is a thing’s physical substance, and what I am calling function corresponds to their final cause, or telos, which is its end, goal, or purpose. They understood that while material causes were always present and hence necessary, they were not sufficient to explain why many things were the way they were. The idea that one must invoke a final cause or purpose to fully explain why things happen is called teleology. Aristotle expanded on this to identify four kinds of causes that resolve different kinds of questions about why changes happen in the world. Of these, material, formal, efficient, and final, I have discussed the first and last. The formal cause is based on Plato’s closely-held notion of universals, the idea that general qualities or characteristics of things are somehow inherent in them, e.g. that females have “femaleness”, chairs have “chairness”, and beautiful things have “beauty”. While the Greeks clung to the idea that universals were intrinsic, William of Ockham put metaphysics on a firmer footing in the 14th century by advocating nominalism, the view that universals are extrinsic, i.e. that they have no existence except as classifications we create in our minds. While classifications are a critical function of the mind, I think everyone would now agree that we can safely say formal causes are descriptive but not causative. The efficient cause is what we usually mean by cause today, i.e. cause and effect. The laws of physicalism all start with matter and energy (no longer considered causative, but which simply exist) and then provide efficient causes to explain how they interact to bring about change. A table is thus caused to exist because wood is cut from trees and tools are used in a sequence of events that results in a table.

Although Aristotle could see that these steps had to happen to create a table, that doesn’t explain why the table was built. The telos or purpose of the table, and the whole reason it was built, is so that people can use it to support things at a convenient height. Physicalists reject this final, teleological cause because they see no mechanism — how can one put purpose into physical terms? For example, objects physically sink to lower places because of gravity, not because it is their purpose or final cause. This logic is sound enough for explaining gravity, but it doesn’t work at all for tables, and, as I have mentioned, it doesn’t work for anything involving life and the mind in general. So was it really reasonable to dispense with the final cause just because it wasn’t understood? How did such a non-explanatory stance come to be the default perspective of science? To see why, we have to go back to William of Ockham. 1650 years after Aristotle, William of Ockham laid the groundwork for the Scientific Revolution, which would still need another 300 years to get significantly underway. With his recognition that universals were not intrinsic properties but extrinsic classifications, Ockham eliminated a mystical property that was impeding understanding of the prebiotic physical world. But he did much more than identify the mind as the source of formal causes; he explained how the mind worked. Ockham held that knowledge and thought were functions of the mind which could be divided into two categories, intuitive and abstractive.8 Intuitive cognition is the process of deriving knowledge about objects from our perceptions of them, which our minds can do without conscious effort. Abstractive cognition derives knowledge by positing abstract or independent properties about things and drawing conclusions about them. Intuitive knowledge depends on the physical existence of things, while abstractive knowledge does not, but can operate on suppositions and hypotheses. I concur with Ockham that these are the two fundamental kinds of knowledge, and I will develop a much deeper view of them as we proceed. Ockham further asserted that intuitive knowledge precedes abstractive knowledge, which means all knowledge derives from intuitive knowledge. Since intuitive knowledge is fundamental, and it must necessarily be based on actual experience, we must look first to experience for knowledge and not to abstract speculation. Ockham can thus be credited with introducing the now ubiquitous notion that empiricism — the reliance on observation and experiment in the natural sciences — is the foundation of scientific knowledge. He recognized the value of mathematics (i.e. the formal sciences) as a useful tool to interpret observation and experiment, but cautioned that they are abstract and so can’t be sources of knowledge of the physical world in their own right.

Francis Bacon formally established the paramountcy of empiricism and the scientific method in his 1620 work, Novum Organum. Bacon repeatedly emphasizes how only observation with the senses can be trusted to generate truth about the natural world. His Aphorism 19, in particular, dismisses ungrounded, top-down philosophizing and endorses grounded, bottom-up empiricism:

“There are and can only be two ways of investigating and discovering truth. The one rushes up from the sense and particulars to axioms of the highest generality and, from these principles and their indubitable truth, goes on to infer and discover middle axioms; and this is the way in current use. The other way draws axioms from the sense and particulars by climbing steadily and by degrees so that it reaches the ones of highest generality last of all; and this is the true but still untrodden way.”

Bacon built on Ockham’s point that words alone could be misleading by citing a number of biases or logical fallacies that can easily permeate top-down thinking and obscure what is really happening. Specifically, he cited innate bias, personal bias, and rhetorical biases (among which one could include traditional logical fallacies like ad hominem, appeal to authority, begging the question, etc.).

Bacon didn’t dispense with Aristotle’s four causes but repartitioned them into two sets. He felt that physics should deal with material and efficient causes while metaphysics should deal with formal and final causes.9 He then laid out the basic form of the scientific method. The objective is to find and prove physical laws, which are formal causes that are universal. While Ockham had rejected such abstractions, Bacon accepted them, but rebranded the only legitimate ones as those that were demonstrable by his method. Using the example of heat as a formal cause, he recommended collecting evidence for and against, i.e. listing things with heat, things without, and things where heat varies. Comparative analysis of the cases should then lead to a hypothesis of the formal cause of heat. Bacon could see that further cycles of observation and analysis could inductively demonstrate universal natural laws, and he attempted to formalize the process, but never quite finished that work. Even now, people disagree about whether the scientific method has a precise form, but agree that it depends on iterative observation and analysis. Bacon had little to say about the final cause because it was the least helpful to his inductive method, and in any case could easily be perverted by bias to lead away from the discovery of the efficient causes that underlie formal causes. In any case, the success of the inductive, physicalist approach since Bacon and the inability of detractors to refute its universal scope have led to the outright rejection of teleology as an appeal to mysticism when physical laws seem to be sufficient.

We are now quite comfortable with the idea that all our knowledge of the physical world must derive only from observations of it and not suppositions about it. And we concur with Ockham that our primary knowledge of the physical world is intuitive, but that secondary abstractive knowledge can group that knowledge into classifications and rules which provide us with causative explanatory power. We recognize that our explanations are extrinsic to the fabric of reality but are nevertheless very effective. However, this shift away from the more magical thinking of the ancients (not to mention the Christian idea that God designed everything) blinded us to something surprising that happens in biological systems and even more significantly in brains: the creation of function. Function is in many ways a subtle phenomenon, and this is why it has been overlooked or underappreciated. Function is not something specific you can point to; it results from creating indirect references to things and generalizing about what might happen to them.

In supposing that knowledge must originate inductively, Ockham and Bacon inadvertently put a spotlight on direct natural phenomena. How could they have known, how could anyone know, that indirect natural phenomena would play a critical role in the development of life and then the brain? Charles Darwin, of course, figured it out by process of elimination (shifting from direct forces to the indirect influences of a nearly infinite series of natural selections), but that is not the same thing as recognizing the source of the power of indirection. Aristotle had already pointed out that every phenomenon had an efficient cause, so of course some sequence of events must have caused life to arise, and Darwin put the pieces together to propose a basic strategy for it to start from nothing and end up where it is now. The events that power natural selection are, taken individually, entirely physical, and so it seems natural to assume that the whole of the process is entirely physical. But this assumption is a fundamental mistake, because natural selection is only superficially physical. The specific selection events of evolution don’t matter; what matters is how they are interpreted or applied in a general way so as to influence future similar events. By collecting evidence of the value of a mechanism across a series of events, natural selection justifies the conclusion that the mechanism has an indirect or general power across an abstract range of situations.

Rene Descartes tried to unravel function, but, coming long before Darwin, he could see no physical source and resorted to conjecture. As I mentioned before, he proposed a mental substance that interacted with the physical substance of the brain in the pineal gland. This is a wildly inaccurate conclusion which has only served to accentuate the value of experimental research over philosophy, but it is still true that knowledge is a nonphysical capacity of the brain whose functional character physical science has not yet attempted to explain. But Descartes’ mistaken assumptions and the rise of monism have led to a concomitant fall in the popularity of all stripes of dualism, even to the point where many consider it a proven dead end. Gilbert Ryle famously put the nail in the coffin of Cartesian dualism in The Concept of Mind10 in 1949. We know (and knew then) that Descartes’ mental “thinking substance” does not exist as a physical substance, but Ryle felt it still had tacit if not explicit “official” support. He felt we officially or implicitly accepted two independent arenas in which we live our lives, one of “inner” mental happenings and one of “outer” physical happenings. This view goes all the way down to the structure of language, which has a distinct vocabulary for mental things (using abstract nouns which denote ideas or qualities) and physical things (using concrete nouns which connect to the physical world through senses). As Ryle put it, we have “assumed that there are two different kinds of existence or status. What exists or happens may have the status of physical existence, or it may have the status of mental existence.” He disagreed with this view, contending that the mind is not a “ghost in the machine,” something independent from the brain that happens to interact with it. To explain why, he introduced the term “category mistake” to describe a situation where one inadvertently assumes something to be a member of a category when it is actually of a different sort of category. His examples focused on parts not being the same sort of thing as wholes, e.g. someone expecting to see a forest but being shown some trees might ask, “But where is the forest?”. In this sort of example, he identified the mistake as arising from a failure to understand that forest has a different scope than tree.11 He then contended that the way we isolate our mental existence from our physical existence was just a much larger category mistake which happens because we speak and think of the physical and the mental with two non-intersecting vocabularies and conceptual frameworks, yet we assume it makes sense to compare them with each other. As he put it, “The belief that there is a polar opposition between Mind and Matter is the belief that they are terms of the same logical type.” Ryle advocated the eliminativist stance: if we understood neurochemistry well enough, we could describe the mechanical processes by which the mind operates instead of saying things like think and feel.
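Ryle’s forest example can even be rendered as a type confusion in code. A tiny sketch, with invented names, assuming the analogy that a collection and its members occupy different logical levels:

```python
# The "forest" is not a fourth item standing among the trees; it is the
# trees taken collectively, a thing at a different logical level.

trees = ["oak", "elm", "ash"]
forest = trees   # the forest just *is* the trees, collectively

print("forest" in trees)   # False: searching for the whole among the parts
print(forest is trees)     # True: same thing, viewed at a different level
```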

But Ryle was more mistaken than Descartes. His mistake was in thinking that the whole problem was a category mistake, when actually only a superficial aspect of it was. Yes, it is true, the mechanics of what happens mentally can be explained in physical terms because the brain is a physical mechanism like a clock. So his reductionist plan can get us that far. But that is not the whole problem, and it is not the part that interested Descartes or that interests us, because saying how the clock works is not really the interesting part. The interesting part is the purpose of the clock: to tell time. Why the brain does what it does cannot be explained physically because function is not physical. The brain and the mind exist to control the body, but that function is not a physical feature. One can tell that nerves from the brain animate the hands, but one must invoke the concept of function to see why. As Aristotle would say, material and efficient causes are necessary but not sufficient, which is why we need to know their function. Ryle saw the superficial category mistake (forgetting that the brain is a machine) but missed the significant categorical difference (that function is not form). So, ironically, his argument falls apart due to a category mistake, a term that he coined.

Function can never be reduced to form because it is not built from subatomic particles; it is built from logic to characterize similarities and implications. It is true that function can only exist in a natural universe by leveraging physical mechanisms, but this dependency doesn’t mean it doesn’t exist. All it means is that nature supports both generalized and specific kinds of existence. We know the mind is the product of processes running in the brain, just as software is the product of signals in semiconductors, but that doesn’t tell us what either is for. Why we think and why we use software are both questions the physical mechanisms are not qualified to answer. Ryle concluded, “It is perfectly proper to say, in one logical tone of voice, that there exist minds and to say, in another logical tone of voice, that there exist bodies. But these expressions do not indicate two different types of existence, for ‘existence’ is not a generic word like ‘colored’ or ‘sexed.'” But he was wrong because there are two different kinds of existence, and living things exhibit both. Information processors have a physical mechanism for storing and manipulating information and use it to deliver functionality. For thinking, the brain, along with the whole nervous and endocrine systems, is the physical part and the mind is the functional part. For living things, the whole metabolism is the physical part and behavior is the functional part. This is the kind of dualistic distinction Descartes was grasping for. While Descartes overstepped by providing an incorrect physical explanation, we can be more careful. The true explanation is that functional things are not physical and their existence is not dependent on space or time, but they can have physical implementations, and they must for function to impact the physical world.

The path of scientific progress has understandably influenced our perspective. The scientific method was designed to unravel mysteries of the natural world, and was created on the assumption that fixed natural laws govern all natural activity. Despite his advocacy of dualism, Descartes promoted the idea of a universal mechanism behind the universe and living things, and his insistence that matter should be measured and studied mathematically as an extension of what we now call spacetime helped found modern physics: “I should like you to consider that these functions (including passion, memory, and imagination) follow from the mere arrangement of the machine’s organs every bit as naturally as the movements of a clock or other automaton follow from the arrangement of its counter-weights and wheels.” 12 He only invoked mental substance to bridge the explanatory gap of mental experience. If we instead identify the missing piece of the puzzle as function, then we can see that nature, through life, can “learn things about itself” using feedback to organize activities in a functional way we call behavior. Behavior guides actions through indirect assessments instead of direct interactions, which changes the rules of the game sufficiently to call it a different kind of existence.

Darwin described how indirect assessments could use feedback to shape physical mechanisms, but he didn’t call out functional existence specifically, and, in the 150 years since, I don’t think anyone else has either. But if this implies, as I am suggesting, that the underlying metaphysics of biology has been lacking all this time, then we have to ask ourselves what foundation it has been built on instead. The short answer is a physicalist one. Both before and after Darwin, traits were assumed to have a physical explanation, and they are still mostly thought to be physical today. And because function does always leverage a physical mechanism, this is true, but, as Aristotle said in the first place, it is not sufficient to tell us why. But if biologists honestly thought only in terms of physical mechanisms, they would have made very little progress. After all, we still have no idea, except by gross analogies to simple machines like levers, pipes, and circuits, how bodies work, let alone minds. Biology, as practiced, makes observations of functioning biological mechanisms and attempts to “reverse engineer” an explanation of them to create a natural history. Much of the explanation is supplied by the very result that is to be explained.13 We assume certain functions, like energy production or consumption, and work out biochemical details based on them, but we couldn’t build anything like a homeostatic, self-replicating living creature if our lives depended on it because we only understand superficial aspects. Biology is thus building on an unspoken foundation of some heretofore ineffable consequence of natural selection which I have now called out as biological function or information. Darwin gave biologists a license to identify function on the grounds that it is “adaptive”, and they have been doing that ever since, not overtly as a new kind of existence but covertly as “phenomena” to be explained, presumably with physical laws. I am saying that these phenomena are functional and not physical, and so their explanations must be based on functional principles, not physical ones.

But what of teleology? Do hearts pump blood because it is their purpose or final cause? We can certainly explain how hearts work using purposeful language, but that is just an artifact of our description. Evolved functionality gets there by inductive trial and error, while purpose must “put forth” a reason or goal to be attained. Evolution never looks forward because induction doesn’t work that way, so we can’t correctly use the word purpose or teleology to describe information created by inductive means. But we can use the word functional, because biological information achieves function by generalizing on past results even though it is not forward-looking. And we can talk about biological causes and effects, because information is used to cause general kinds of outcomes. Biological causes and effects never deal in certainties the way physical laws do because information is always generalizing to best fits. Physical effects can also be said to have causes, but we should keep in mind that the causality models behind physical laws are for our benefit and not part of nature themselves. They are deductive models that make generalizations about kinds of things which we then inductively map onto physical objects to “predict” what will happen to them, which gives us a good idea of the kinds of things that will most likely happen.

With our minds, however, through abstraction used with deductive models, we can “look forward” in the sense that we can run simulations on general types which we know could be mapped to potential real future situations. We can label elements of these forward-looking models as goals or purposes, because bringing reality into alignment with a desired simulation is another way of saying we attain goals. So we really can say that the purpose of a table is to support things at a convenient height for people. But tables are not pulled toward this purpose; they may also serve no purpose or be used for other purposes. Aristotle claimed that an acorn’s intrinsic telos is to become a fully grown oak tree.14 Biological functions, similarly, can be said to be pulled inexorably toward fulfillment by metabolic processes. The difference is actually semantic. Biological processes can be said to run continuously until death, but again, it only looks like things that have happened “before” are happening “again” when really nothing ever happens twice. Similar biological processes run continuously, but each “instance” of such a process is over in an instant, so we are accustomed to using general and not specific terminology to describe biological functions. These processes have no purpose, per se, because none was put forth, but they do behave similarly to ways that have been effective in the past for reasons that we can call causes and effects. Many of the words we use to describe causes and effects imply intent and purpose, so it is natural for us to use such language, but we should keep in mind it is only metaphorical. Tables, on the other hand, are not used continuously and have no homeostatic regulation ensuring that people keep using them, so they may or may not be used for their intended purpose. Designers don’t always convey intended purposes to users, and users sometimes find unintended uses which become purposes for them, and both can be influenced by inductive or deductive approaches, so it is hard to speak with certainty about the purpose of anything. But it is definitely true that we sometimes have purposes and intentionally act until we consider them to be generally fulfilled, so minds can be teleological.

Part 2: The Rise of Function

I’ve outlined what function is and how it came to be, but to understand the detailed kinds of function we see in life and the mind, we need to back up to the start and consider the selection pressures at work. Humans have taken slightly over four billion years to evolve. Of that, the last 600 or so million years (about 15%) has been as animals with minds, the last 4 million years (about 0.1%) as human-like primates, and the last 10,000 or so years (about 0.00025%) as what we think of as civilized. The rate of change has been accelerating, and we know that our descendants will soon think of us as shockingly primitive (and some already do!). An explanation of the mind should account for what happened in each of these four periods and why.

Life: 4 billion to 600 million years ago
Minds: 600 million to 4 million years ago
Humans: 4 million to 10,000 years ago
Civilization: 10,000 years ago to present
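For what it’s worth, the fractions above are easy to verify; a quick check using the round figures from the text:

```python
# Rough spans in years, taken from the timeline above.
total = 4_000_000_000
for label, span in [("Minds", 600_000_000), ("Humans", 4_000_000),
                    ("Civilization", 10_000)]:
    print(label, f"{span / total:.5%}")
# Minds 15.00000%, Humans 0.10000%, Civilization 0.00025%
```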

2.1 Life: 4 billion to 600 million years ago

While we don’t know many of the details of how life emerged, the latest theories connect a few more dots than we could before. Deep-sea hydrothermal vents1 may have provided at least these four necessary precursors for early life to arise around four billion years ago:

(a) a way for hydrogen to react directly with carbon dioxide to create organic compounds (called carbon fixation),

(b) an electrochemical gradient to power biochemical reactions that led to ATP (adenosine triphosphate) as the store of energy for biochemical reactions,

(c) formation of the “RNA world” within iron-sulfur bubbles, in which RNA could replicate itself and catalyze reactions,

(d) the chance enclosure of these bubbles within lipid bubbles, and the preferential selection of proteins that would increase the integrity of these outer bubbles, which eventually led to the first cells

This scenario is at least a plausible way for the precursors of life to congregate in one place and have opportunities for feedback loops to develop which could start to capture function and then ratchet it up. Many steps are missing here, and much of the early feedback probably depended more on chance than on mechanisms that actually capture and leverage it as information. Alexander Rich first proposed the concept of the RNA world in 1962 because RNA can both store information and catalyze reactions, and thus do both the tasks that DNA and proteins later specialized in. But whatever the exact order was, let’s just assume that in the first few hundred million years life arose.

(e) expansion of biochemical processes, including organized cell division and the use of proteins and DNA,

(f) the last universal common ancestor, or LUCA, about 3.5 billion years ago

Early life must have been very bad at even basic cell functions compared to modern forms, so the adaptive pressure in the early days must have mostly focused on improving the core mechanisms of metabolism, replication, and adaptation. As life became more robust, it became less dependent on the vents and was gradually able to move away from them. Although we know that all living cells on earth must descend from a single common ancestor roughly 250 to 750 million years after life first arose, this does not include viruses. While there are several theories of the origin of viruses, I believe all viral lines are remnants of pre-LUCA strategies that evolved before the LUCA line was firmly established.

The central mechanics of life on earth evolved during these early years. Although the central mandate of evolution is survival over time, we can roughly prioritize the set of component skills that needed to evolve to make it happen. As each of these skills improved over time, organisms that could do them better would squeeze out those that could not:

1. Metabolism is, of course, the fundamental function as life must be able to maintain itself. A source of energy was critical to this, which is why hydrothermal vents are such a likely starting point.

2. Reproduction was the next most critical function, as any kind of organism that could produce more like itself would quickly squeeze out those that could not. This is where RNA comes in. RNA is too complex to have been the first approach used to replicate functionality, but one can imagine a functional ratchet that used simpler but less effective molecules first.

3. Natural selection at the level of traits is the next most critical function needed because it would make possible the piecewise improvement of organisms. Bacteria developed a mechanism called conjugation that lets two bacterial cells connect and copy a piece of genetic material called a plasmid from one to the other. Most plasmids ensure that the recipient cell doesn’t already have a similar plasmid, which protects against counterproductive changes. There are so many bacteria that a good strategy for them is to try out everything and see what works.

4. Proactive gene creation. Directed mutation is currently a controversial theory, but I think it will turn out that nearly all genetic change is pretty carefully coordinated and that the mechanisms that make it possible evolved in these early years. I am talking about ways a cell can assemble new genes by combining snippets of DNA called transposable elements that are then put back into chromosomes where their functional effects can be inherited by daughter cells. If these changes are made in germ cells, they will affect all future generations. If organisms were able to evolve ways to do this in the early years, they could have easily outcompeted other organisms. I think much of the genetic arms race in the beginning focused on better ways to direct change, not because the result of such tinkering was known in advance but because organisms that tinkered with their own DNA when under stress survived better in the long run. Such directed mutation capacities probably started out by directly impacting the next generation so that they could be selected for right away, but over time were refined into strategies that could take many generations to produce new genes or even be held in reserve indefinitely until environmental stress indicated that change was needed.2

The next big step was:

(g) the arrival of eukaryotes

Eukaryotes are now widely thought to have arisen by symbiogenesis, which is the absorption of certain single-cell creatures by others that resulted in one living inside the other permanently. Two organelles common to all eukaryotes are double-membraned, which would be expected if one membrane originated from the cell membrane of the endosymbiont while the other originated in the host vesicle which enclosed it.3 The first is the cell nucleus and the other is the mitochondrion. Algae and plants also contain plastids, which likewise have double membranes. Mitochondria and plastids reproduce with their own DNA, while cell nuclei seem to have become the repository for the host DNA. While eukaryotes need these organelles, their key evolutionary enhancement was sexual reproduction, which combines genes from two parents to create a new combination of genes in every offspring. Sexual reproduction is a nearly universal feature of eukaryotic organisms4 and the basic mechanisms are believed to have been fully established in the last eukaryotic common ancestor (LECA) about 2.2 billion years ago. In the short term, sex has a high cost but few benefits. However, in the long term it provides enough advantages that eukaryotes almost always use it. Asexual reproduction, which is used by prokaryotes (non-eukaryotes, including the bacteria and archaea) and by somatic (non-sex) cells of eukaryotes, is done by a cell division process called mitosis. During mitosis, a double strand of DNA is separated and each single strand is then used as a template to create two new double strands. When the cell divides into two, each daughter cell ends up with one set of DNA. Sexual reproduction uses a modified cell division process called meiosis and a cell fusion process called fertilization. Cells that undergo meiosis contain a complete set of genes from each of two parents. They first replicate the DNA, making four sets of DNA in all, and then randomly shuffle genes between parent strands in a process called crossing over. The cell then divides twice to make four gametes, each with a unique combination of parental genes. Gametes from different parents then fuse during fertilization to create a new organism with a complete set of genes from each of its two parents, where each set is a random mixture from each parent’s parents.
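A toy sketch may help fix the idea of crossing over; it glosses over DNA replication and chromatid pairing entirely, and all names are invented:

```python
import random

def crossover(strand_a, strand_b):
    """Swap gene segments between two parental strands at a random point."""
    point = random.randrange(1, len(strand_a))
    return strand_a[:point] + strand_b[point:], strand_b[:point] + strand_a[point:]

def meiosis(strand_a, strand_b):
    """Return four gametes, each mixing the two parental strands differently."""
    x1, x2 = crossover(strand_a, strand_b)
    y1, y2 = crossover(strand_a, strand_b)
    return [x1, x2, y1, y2]

mom = ["a1", "a2", "a3", "a4"]   # genes inherited from one grandparent
dad = ["b1", "b2", "b3", "b4"]   # genes inherited from the other

print(meiosis(mom, dad))   # four gametes, each a different combination
```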

Sexual reproduction is clearly a much more complex and seemingly unlikely process compared to asexual reproduction, but I will show why sex is probably a necessary development in the functional ratchet of life. The underlying reason for sex is that it facilitates points 3 and 4 above, namely natural selection at the level of traits and proactive gene creation. Because mechanisms evolved to do both 3 and 4 well, prokaryotes evolved in just two billion years instead of two trillion or quadrillion. Of course, I can only guess about time frames this large, but in my estimation evolution would have made almost no progress at all without refining these two mechanisms, so any organisms that could improve on them would have a huge advantage over those that did them less well. We know that conjugation is not the only mechanism prokaryotes use to transfer genetic material between cells; all such mechanisms outside of sexual reproduction are called horizontal gene transfer (HGT), and they also include transformation and transduction, the latter of which is the incorporation of DNA from viruses. Any mechanism that can share genetic information at the gene or function level with other organisms creates opportunities for new combinations of genes to compete, which makes it possible for individual advantageous functions to spread preferentially over less capable ones. HGT has been sufficient for the evolution of two large groups of single-celled organisms, bacteria and archaea, and so is no doubt deployed in many strategic ways we can still only guess at, but the outcome is fundamentally pretty haphazard, which makes it inadequate to support multicellular life. On the one hand, it allows many new genetic combinations to be tried at a fairly low cost since the number of single-cell organisms is very high. But on the other hand, it lacks many mathematical advantages that sex brings to the table. I will assume “that the protoeukaryote → LECA era featured numerous sexual experiments, most of which failed but some of which were incorporated, integrated, and modified,”5 and that consequently a great many intermediate forms that arose before LECA are no longer extant to give us insight into the incremental stages of evolution.6

What benefits does sex provide that led to its evolution? John Maynard Smith famously pointed out that in a male-female sexual population, a mutation causing asexual reproduction (i.e. parthenogenesis, which does sometimes arise naturally, allowing females to reproduce as clones without males) should rapidly spread, because asexual reproduction has a “twofold” advantage: no males need be produced. It is true that when resources allow unlimited growth, asexual reproduction can thus spread faster, but this rarely happens. Usually, populations are constrained by resources to a roughly stable size. Achieving the fastest reproduction cycle is not the critical factor in long-term success in these situations, and it is actually rather irrelevant. In any case, eukaryotic populations probably could and would have evolved a way to switch between sexual and asexual reproduction depending on which is more beneficial at the time, and very few ever choose asexual. This strongly suggests that sexual reproduction nearly always confers more advantages than asexual reproduction. We are aware of a number of such advantages, but I think the critical ones are better solutions to my points 3 and 4 above. Sexual reproduction is set up to create an almost unlimited number of genomes with different combinations of genes, while all asexual reproduction can do is accumulate genes (although prokaryotic genomes stay pretty small, so they must also have ways of knocking genes out). And sex pits each trait against its direct competitors so that natural selection can operate on each independently. Beneficial traits can spread through a population, “surgically” knocking out less effective alleles, something asexual reproduction can’t do. Sex gives a species vastly more capacity to adapt to changing environments because variants of every gene can remain in the gene pool waiting to spread when conditions make them more desirable.7 Asexual creatures can’t keep genes around for long that aren’t useful right now, because they can’t generate new combinations. Horizontal gene transfer is apparently sufficient to allow prokaryotes to adapt, but obligate parthenogenesis in multicellular species leaves them with essentially no prospects for further adaptation and so represents a dead end. This includes about 80 species of unisex reptiles, amphibians, and fishes. All major vertebrate groups except mammals have species that can sometimes reproduce parthenogenetically.8 We can conclude that Maynard Smith was right that asexual reproduction provides a “quick win”, but because it is a poor long-term strategy its use is limited in multicellular life. Overall, I would estimate that eukaryotes are roughly 10 to 100 times “better” at evolution than prokaryotes, mostly because of sex, but their improved technologies really start to shine in multicellular organisms, because their ability to pinpoint the focus of natural selection allows complex organisms to arise.
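One of those advantages, the combining of beneficial mutations that arise in different individuals (the Fisher-Muller effect), is easy to see in a toy simulation. This is an illustration, not a rigorous model; the parameters are arbitrary and results vary from run to run:

```python
import random

N, GENERATIONS, MUTATION_RATE = 200, 80, 0.005

def fitness(genome):
    return 1 + sum(genome)   # each beneficial allele adds to fitness

def evolve(sexual):
    pop = [[0, 0] for _ in range(N)]   # two loci; 1 = beneficial allele
    for _ in range(GENERATIONS):
        weights = [fitness(g) for g in pop]
        new_pop = []
        for _ in range(N):
            mom = random.choices(pop, weights=weights)[0]
            dad = random.choices(pop, weights=weights)[0]
            # Sexual offspring draw each locus from either parent;
            # asexual offspring simply clone one parent.
            child = ([random.choice([mom[i], dad[i]]) for i in range(2)]
                     if sexual else mom[:])
            if random.random() < MUTATION_RATE:
                child[random.randrange(2)] = 1   # rare beneficial mutation
            new_pop.append(child)
        pop = new_pop
    return sum(g == [1, 1] for g in pop) / N   # fraction carrying both alleles

print("asexual:", evolve(False), " sexual:", evolve(True))
```

On typical runs, the sexual population assembles both beneficial alleles in single genomes sooner, because recombination lets mutations that arose separately be combined rather than having to occur sequentially in one lineage.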

(h) complex multicellularity, meaning organisms with specialized cell types.

Multicellular life has arisen independently dozens of times, starting about 1 billion years ago, and even some prokaryotes have achieved it, but only six lineages independently achieved complex multicellularity: animals, two kinds of fungi, green algae (including land plants), red algae, and brown algae. The relatively new science of evo-devo (evolutionary developmental biology) is focused largely on cell differentiation in complex multicellular (eukaryotic) organisms. The way that the cells of the body achieve such dramatically different forms, simplistically, is by first dividing and then turning on regulatory genes that usually then stay on permanently. Regulatory genes don’t code for proteins, but they do determine what other regulatory genes will do and ultimately what proteins will be transcribed. Consequently, as an embryo grows, each area can become specialized to perform specific tasks based on what proteins the cell produces.
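A toy sketch of this ratchet of permanent switches, with invented regulator and protein names, may make the logic clearer:

```python
# Differentiation as one-way regulatory switches: each division may turn a
# regulator on, switches stay on, and what a cell makes depends on which
# regulators its lineage has accumulated. All names are invented.

def divide(cell, switch=None):
    """Return a daughter cell, optionally with one more regulator turned on."""
    daughter = set(cell)
    if switch:
        daughter.add(switch)
    return daughter

def expressed_proteins(cell):
    """What the cell produces is determined by which regulators are on."""
    table = {frozenset(): "stem factors",
             frozenset({"reg1"}): "muscle proteins",
             frozenset({"reg1", "reg2"}): "cardiac muscle proteins"}
    return table.get(frozenset(cell), "unknown lineage")

zygote = set()
muscle = divide(zygote, "reg1")   # the lineage commits: reg1 stays on
heart = divide(muscle, "reg2")    # a further commitment within that lineage

print(expressed_proteins(zygote), "|", expressed_proteins(muscle), "|",
      expressed_proteins(heart))
```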

The most dramatic demonstration of the power of triggered differentiation is radial and bilateral symmetry. Most animals (the bilateria) have near-perfect bilateral symmetry because the same regulatory strategy is deployed on each side, which means that so long as growth conditions are maintained equally on both sides, a perfect (but reversed) “clone” will form on each side. Evo-devo has revealed that the eyes of insects, vertebrates, and cephalopods (and probably all bilateral animals) evolved from the same common ancestor, contrary to earlier theory. Homeoboxes are parts of regulatory genes shared widely across eukaryotic species that regulate what organs develop where. As evo-devo uncovers the functions of regulatory genes, the new science of genomics is mathematically exposing the specific evolutionary origins of every gene. Knowing each gene’s origins and roughly what it does will coalesce into a comprehensive understanding of development.

Multicellularity and differentiation created opportunities for specialized structures to arise in bodies to perform different functions. Tissues are groups of cells with similar functions, organs are groups of tissues that provide a higher level of functionality still, and organ systems coordinate the organs at the highest level. A stream has no purpose; water just flows downhill. But a blood vessel is built specifically to deliver resources to tissues and to remove waste. This may not be the only purpose it serves, but it is definitely one of them. All tissues, organs, and organ systems have specific functions which we can identify, and usually one that seems primary. Additional functions can and often do arise because having multiple applications is the most convenient way for evolution to solve problems with the available resources. Making high-level generalizations about the functions of tissues, organs, and organ systems is the best way to understand them, provided we recognize the limitations of generalizations. The heart definitely specializes in pumping blood and the brain in overall control of the body. To study the form of these structures without considering their function is rather obviously a waste of time; physicalism must take a back seat to functionalism in areas driven by function.

Before I move on, I should note that all complex multicellular eukaryotes live symbiotically with countless single-cell bacteria, archaea, fungi, and protists, and also with viruses, which have no cell membranes at all. Evolution has built on its successes in surprisingly deep ways which we are only beginning to appreciate.

2.2 Minds: 600 million to 4 million years ago

The Concerns of Animals

Animals are mobile. Mobile organisms need brains while sessile ones don’t. This point is so obvious it hardly needs to be said, but everything follows from it. Fungi are close relatives to animals that have evolved some highly specialized features that make them ideally suited to life under ground, and plants are arguably more evolved than fungi or animals because their cells can photosynthesize using chloroplasts. But they don’t need brains because they don’t move. They just sit tight and grow, making the best of whatever happens to them. Plants can afford to wait for food (i.e. sunlight) and mates (i.e. pollen) to come to them, but animals need to seek them out and compete for them. They need algorithms to decide where to go and what to do when they get there. The body must be controlled as a logical unit called an agent, and its activities in the world can be subdivided into discrete functions, starting with eating, mating, and sleeping (an activity most animals do for reasons still only partially understood). These can be subdivided further based on physical considerations, chiefly how to control the body and external objects, and functional considerations, chiefly maximizing survival potential by meeting needs and avoiding risks. Brains are the specialized control organs animals developed to weigh these considerations. Brains first collect information about their bodies and the environment from the bottom up but then fit that information into control algorithms that address discrete functions and control considerations from the top down. Top-down prioritization is essential to coordinate body movements and actions effectively. Let’s take a closer look at how animals evolved to get a better idea how they have met these challenges.
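Before looking at the evolutionary record, here is a minimal sketch of the top-down prioritization just described, with invented drives and sensations; real nervous systems are incomparably richer than this:

```python
# Bottom-up information (sensations, need levels) is fitted into a
# top-down priority scheme that selects one action for the whole agent.

def choose_action(sensations, needs):
    if sensations.get("predator"):
        return "flee"                    # risk avoidance outranks everything
    return max(needs, key=needs.get)     # otherwise serve the strongest need

needs = {"eat": 0.7, "mate": 0.4, "sleep": 0.2}
print(choose_action({"predator": False}, needs))   # eat
print(choose_action({"predator": True}, needs))    # flee
```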

The last animalian common ancestor is called the urmetazoan, aka “first animal”, and is thought to have been a flagellate marine creature. The urmetazoan is important because, like the LUCA and LECA before it, an unknown but perhaps significant amount of animal evolution went into making the urmetazoan and an unknown but perhaps significant number of competing multicellular mobile forms were squeezed out by the metazoans (aka animals). Now we only see what got through this bottleneck. The surviving animals have differentiated into many branches with a wide variety of forms, so I will climb up through the animal family tree.

Sponges are the most primitive animals from a control standpoint, having no neurons or indeed any organs or specialized cells. But they have animal-like immune systems and some capacity for movement in distress.1 Cnidarians (like jellyfish, anemones, and corals) come next and feature diffuse nervous systems with nerve cells distributed throughout the body without a central brain, but often featuring a nerve net that coordinates movements of a radially symmetric body. Although jellyfish move with more energy efficiency than any other animal, a radial body design provides limited movement options, which may explain why all higher animals are bilateral (though some, like sea stars and sea urchins, have bilateral larvae but radial adults). Nearly all creatures that seem noticeably “animal-like” to us do so because of their bilateral design, which features forward eyes and a mouth. This group is so important that we have to think about the features of the urbilaterian, the first bilateral animal, which lived about 570-600 million years ago. As I mentioned above, we now have evidence that the urbilaterian did have eyes. While the exact order in which the features of animals first appeared is still unknown, a functional principle that developed in many bilateral animals was a centralized control center that can make high-level decisions leveraging a variety of sensory information.

A few exceptions to centralized control exist among the invertebrates, most notably the octopus (a mollusk), which has a brain for each arm and a central brain that loosely administers them. Having independent eight-way control of its arms comes in handy for an octopus because the arms can often usefully pursue independent tasks. Octopus arms are vastly more capable than those of any other animals, and they use them in amazingly coordinated ways, including to “bounce-walk” across the sea floor and to jump out of the water’s edge to capture crabs.

Why, then, don’t animals all have separate brains for each limb and organ? The way function evolves is always a compromise between logical need and physical mechanism. To some degree, historical accident has undoubtedly shaped and constrained evolution, but, on the other hand, where logical needs exist, nature often finds a way, which sometimes results in convergent evolution of the same trait through completely different mechanisms. In the case of control, it seems likely that it was physically feasible for animals to either localize or centralize control according to which strategy was more effective. An example of decentralized control in the human body is the enteric nervous system, or “gut-brain”, which lines the gut with more than 100 million nerve cells. This is about 0.1% of the 100 billion or so neurons in the human brain. Its main role is controlling digestion, which is largely an internal affair that doesn’t require overall control from the brain.2 However, the brain and gut-brain do communicate in both directions, and the gut-brain has “advice” for the brain in the form of gut feelings. Much of the information sent from the gut to the brain is now thought to arise from our microbiota. The microbes in our gut can weigh several pounds and comprise hundreds of times more genes than our own genome. So gut feelings are probably a show of “no digestion without representation” that works to both parties’ benefit.34 The key point in terms of distributed control is that if the gut has information relevant to the control of the whole animal, it needs to convey that information in a form that can impact top-level control, and it does this through feelings and not thoughts.

So let’s consider how control of the body is accomplished in the other two families of highly mobile, complex animals, namely the arthropods and vertebrates. The control system of these animals is most broadly called the neuroendocrine system, as the nervous and endocrine systems are complementary control systems that work together. The endocrine system sends chemical messages using hormones traveling in the blood, while the nervous system sends electrochemical messages through axons, which are long, slender projections of nerve cells, aka neurons, and then between neurons through specialized connections called synapses. Endocrine signals generally start slower and last longer than nerve-based signals. Both arthropods and vertebrates have endocrine glands in the brain and about the body, including the ovaries and testes. Hormones regulate both physiology and behavior, affecting bodily functions like digestion, metabolism, respiration, tissue function, sensory perception, sleep, excretion, lactation, stress, growth and development, movement, and reproduction. Hormones also affect our conscious mood, which encompasses a range of slowly-changing subjective states that can influence our behavior.

While the endocrine system focuses on control of specific functions, the nervous system provides overall control of the body, which includes communication to and from the endocrine system. In addition to the enteric nervous system (gut-brain), the body has two other peripheral systems, called the somatic and autonomic nervous systems, that control movement and regulation of the body somewhat independently from the brain. The central nervous system (CNS) comprises the spinal cord and the brain itself. Nerve cells divide into sensory or afferent neurons, which send information from the body to the CNS; motor or efferent neurons, which send information from the brain to the body; and interneurons, which comprise the brain itself.

The functional capabilities of brains have developed quite differently in arthropods and vertebrates. I am not going to review arthropod brains in detail because vertebrates ultimately developed much larger brains with more generalized functionality, but arthropods are much more successful at small scales than vertebrates. They appear to depend much more on instinctive behavior than vertebrates do, though many can adapt their behavior by learning about new features in their environment.5 Moving through the vertebrates on the way to Homo sapiens, first

fish branch off, then
amphibians, and then
amniotes, which enclose embryos with an amniotic sac that provides a protective environment and makes it possible to lay eggs on land. Amniotes divide into
reptiles, from which derive
birds (by way of dinosaurs), and
mammals. And mammals then divide into
monotremes (like the duck-billed platypus), then
marsupials, and then
placentals, which gestate their young internally to a relatively late stage of development. There are eighteen orders of placentals, one of which is
primates, to which
humans belong.

It seems to us that evolution reached its apotheosis (divine perfection) in Homo sapiens, and yet we all know that all species have had the same amount of time to evolve, so none should be “more evolved” than others. And yet, by almost any measure, brain power (viewed as a general capacity to solve problems) increases as one moves through the above branches toward humans. Furthermore, new brain structures appear along the way that help account for that increase in power. Of course, the living representatives of each line above have also continued to evolve more brain power and new brain structures. Some birds, in particular, are smarter by almost any measure than some primitive mammals. But birds and mammals have specialized to many new environments, while fish, amphibians, and reptiles have mostly continued to occupy environments they were already well-adapted for. Consequently, fish have had neither the need nor the opportunity to evolve much more powerful brains than they have had for millions of years. The truth is, brain power has generally increased over time in all animal lines because evolution is not random but directed. It is directed not to more complex forms but to more functional forms. In animals, that functionality is most critically driven by the power of the top-level control center, which is the brain. So some species have made better use of their available evolutionary time because they have faced greater environmental challenges that demanded better control systems. It is also worth noting that fish, amphibians, and reptiles are cold-blooded. Warm-blooded animals need much more food but can sustain a more active lifestyle across a wider temperature range. They can also support more energy-hungry brains, making it easier for them to think more and faster.

Let’s consider for a moment the differences in brain development among vertebrates. Fish and amphibians have no cerebral cortex, the outer layer of neural tissue of the cerebrum. The cerebral cortex is thought to be the principal control center of more complex behavior. Although neuron counts only provide a rough indication of brain power, they do suggest potential, so I have listed them for certain vertebrates:

Animal | Total Neurons | Cerebral Cortex Neurons
fish, amphibians | 0.02 billion | none
small rodents | 0.03-0.3 billion | 0.01-0.04 billion
cats, dogs, herbivores | 0.5-2.5 billion | 0.2-0.6 billion
monkeys | 3-6 billion | 0.5-1.7 billion
smarter birds | 0.8-3 billion | 0.8-1.9 billion
elephants | 250 billion | 6 billion
apes | 10-40 billion | 2.5-10 billion
cetaceans (dolphins, whales) | ? | 5-40 billion
humans | 86 billion | 16 billion

  • By total number of neurons, humans have substantially more at 86 billion than any animals except elephants and probably dolphins and whales.6
  • By total number of cerebral cortex neurons, humans have the most (about 16 billion), except that some whales may have more. Elephants have about 6 billion, which is only bested by cetaceans and primates.7

Consciousness as the Top-Down Logical Perspective

That the brain must control the body as a logical agent pursuing discrete tasks implies that it needs to maintain a top-down logical perspective that defines the world in terms of the functions the agent needs to perform. So, for example, it must distinguish its own body from everything that is not its body, and it must distinguish high-level categories like plant food, animal (prey) food, predators, members of the same species (conspecifics), relatives, mates, offspring, other plants and animals, terrain features (ground, water, mountains, sky, etc.), and so forth. While these are all things made from matter, they are functionally distinct to animals based on their expected interactions with them. It is more accurate to say that we define these things, and in fact all physical things, principally in terms of what they can do for us and only secondarily in more purely compositional or mechanical terms. However, brains can only gather information about the world outside them by looking for patterns in data collected from the senses. So how can they maintain a top-down perspective when information comes to them through bottom-up channels? The answer is consciousness, aka the mind.

We can thus separate information processing into two broad categories, bottom-up and top-down:

  • Bottom-up processing finds patterns locally in data, one source at a time
  • Top-down processing proposes functional units to subdivide the world

The brain manages these two kinds of processing through two logically-distinct subprocesses, the conscious mind and the nonconscious mind.

Technically, what we call the conscious mind comprises only our subjective awareness at one moment, including our current sensory perceptions, emotions, and thoughts, which have access to our short-term memory, which reputedly holds about four chunks of information and fades after a few seconds to at most thirty. However, this is not the definition of consciousness I will be using in this book, as it is too narrow. While I certainly mean current awareness and attention, I also include the scope of past and future awareness and attention. In other words, I also include the long-term memory that is reachable by consciousness. Technically, our long-term memory is part of our nonconscious mind, which also includes every other cognitive task that we surmise must be happening but of which we lack awareness. We infer the existence of the nonconscious mind by process of elimination: it covers the tasks that need information processing but that we can’t feel happening. Some nonconscious tasks are fundamentally outside the reach of conscious awareness, while others just don’t happen to be in our awareness right now but could be brought into consciousness if the need arose. In other words, they are within the scope of conscious processing. Conscious processing is structured around all the information that can be made conscious, not just around current information, which lacks sufficient context to accomplish anything by itself. So while memory itself is a nonconscious process, much of what we store in memory is shaped by conscious processing. We have an excellent sense of its content, scope, and implications for our current thoughts even though we can’t pull much of it into awareness at any one time. Furthermore, the way we perceive our conscious thinking processes is, of course, part of consciousness, but that perception hides a lot of nonconscious support algorithms that we take for granted, notably recognition and language, but also many other talents whose mechanisms are invisible to us. Consciousness is coordinated using many nonconscious processes, and this makes it hard to say where one leaves off and the other begins.

My distinction based on the scope of conscious reach is sufficient to call the conscious and nonconscious minds distinct subprocesses, but they are highly interdependent, so it would be overreach to label them independent subprocesses. All information in the conscious mind comes from the nonconscious mind, and, to the extent the conscious mind has executive power, the nonconscious mind takes much of its direction from it. The relationship is analogous to that between the CEO and the other employees of a company. It has been estimated from neural activity that 90 to 99 percent of mental processing is nonconscious9, but we can’t quantify this precisely because the two blur into each other. Vision is processed nonconsciously for us in parallel. Each part of the image is simultaneously converted in real time, at a pixel-like level, from an input signal to an internal representation, which is then also often converted in real time into recognized objects. In addition to this nonconscious parallel processing of the input, our conscious perception of the output uses built-in (nonconscious) parallel processing because we can see the whole image at once even though we feel like we are doing one thing at a time.10

Before going further, let me contrast “nonconscious mind” with the more commonly used term “unconscious mind” popularized by Sigmund Freud. Freud’s unconscious mind was the union of repressed conscious thoughts that are no longer accessible (at least not without psychoanalytic assistance) and the nonconscious mind. He saw the preconscious, which is quite similar to what we now call the subconscious, as the mediator between them:

Freud described the unconscious section as a big room that was extremely full with thoughts milling around while the conscious was more like a reception area (small room) with fewer thoughts. The preconscious functioned as the guard between the two spaces and would let only some thoughts pass into the conscious area of thoughts. The ideas that are stuck in the unconscious are called “repressed” and are therefore unable to be “seen” by any conscious level. The preconscious allows this transition from repression to conscious thought to happen.11

Freud either didn’t contemplate or was not concerned with neural processing that happens below the level that could be understood were it to become conscious, repressed or not. Instead of the big room/reception area analogy, my nonconscious and conscious minds are much more like a film production and the finished movie: lots of processing with tools unfamiliar to consciousness is done to package information up in a streamlined form that consciousness can understand. The parts of the mind permanently outside conscious scope arguably don’t matter to psychoanalysts, but if they help explain how the conscious mind works, they matter to me. In any case, the term unconscious also refers to a loss of consciousness, which can be confusing, so I will only use the term “nonconscious” going forward. Originally, Freud used the term “subconscious” instead of unconscious, but he abandoned it in 1893 because he felt it could be misunderstood to mean an alternate “subterranean” consciousness. It now persists in popular parlance as a synonym for intuitive, which is the word I will use instead to avoid any confusion.

To summarize, I am saying two different things about consciousness that don’t necessarily go together:

  • 1. Consciousness is one of two subprocesses in the brain, namely the one that works from the top down, and
  • 2. We are aware of consciousness (but not nonconsciousness).

This raises some bigger questions, namely, why does awareness exist, and why is it limited to the conscious part? The answer is that awareness is how consciousness maintains its top-down perspective. An agent must ultimately put one plan into effect: a cheetah decides to chase down a specific gazelle. But to do that, a myriad of bottom-up information must be condensed into relevant factors to create the top-down view. How can this information be condensed to a single decision while maintaining a comprehensive grasp of all the details, any of which might impact the next decision, e.g. to call off the chase? The answer is awareness and attention. Awareness simplifies bottom-up information into a set of discrete information channels called qualia, which I will discuss in the next section. Attention prioritizes qualia and thought processes to bring the most relevant factors into consideration for decisions. It is no coincidence that the information processing that simplifies bottom-up data into top-down digestible forms is the same thing as conscious awareness. Awareness is just a specific kind of information processing, and it coincides with the consciousness subprocess (and not nonconsciousness) because it is the kind of processing that consciousness needs to do and is set up to do.

Consciousness feels like a theater because that is the approach to managing this information that works best. Once we recognize that this strategy is in play, we can cite any number of good reasons why it evolved. Just from common sense, we know animals have to be aware and alert to get things done and to stay safe. Consciousness is clearly an extremely effective strategy for them. Of course, we can’t tell what consciousness feels like to them, but we can draw analogies to our own experience to see when and how they experience comparable sensory, emotional, and cognitive states. This does not mean that any top-level decision-making process, be it a computer program, a robot, or a zombie, would therefore have conscious awareness. The theater of consciousness is a user interface set up specifically to feed bottom-up information into a top-down algorithm that can continuously reassess priorities to produce a stream of decisions. The algorithms we associate with computers, robots, and zombies are just not in the same league of functionality as what animal minds do. We can’t even remotely imagine how to design such algorithms yet. But if we could devise algorithms that used nonconscious and conscious subprocesses to take the same kinds of things into consideration, for the same kinds of reasons, to produce the same kind of stream of decisions, then it would be fair to say that they had awareness comparable to our own. Could we instead design intelligent robots that are not conscious? Yes, undoubtedly, and we arguably have already started to do so, but it would not be a comparable kind of intelligence. Many tasks can be done very competently without any of the concerns that animals face.

Most of my focus in this book is on the processing done by the consciousness subprocess or for it by nonconscious processes because these are the aspects of the mind that matter the most to us. I’m going to take a closer look at these processes, starting with qualia.

Qualia – A Way to Holistically Manage Incoming Information

Living organisms are homeostatic, meaning they must maintain their internal conditions in a working state at all times. Animals thus had to evolve a homeostatic control system, one able to adjust its supervision on a continuous basis. But such a system still needs to fulfill tasks smoothly and efficiently, not in a herky-jerky panic. Karl Friston was the first to characterize these requirements of a homeostatic control system through his free energy principle.12 This principle says that a homeostatic control system must minimize its surprise, meaning that it should proceed calmly with its ongoing actions so long as all incoming information falls within expected ranges. Any unexpected information is a surprise, which should be bumped up in priority and dealt with until it can itself be brought back into an expected range. It would be more accurate and informative to call it the surprise-minimization principle because it isn’t really about energy or anything physical at all. This principle says the control system must try to know what to expect, and, beyond that, it must also minimize the chances that inputs will go outside expected ranges. Animals have to follow this principle simply because it is maladaptive not to. Unlike the machines we build, which are neither homeostatic nor homeostatically controlled, animals must have a holistic reaction strategy that can deal with control issues fractally, that is, as needed and at every level of concern.
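To put the idea in one line of notation (my gloss only; Friston’s full formalism involves much more machinery): information theory measures the surprise, or surprisal, of an observation o under a model m as its negative log probability, and the free energy F that Friston’s principle minimizes is an upper bound on that surprise:

\[ \text{surprise}(o) \;=\; -\ln p(o \mid m), \qquad F \;\ge\; -\ln p(o \mid m) \]

Minimizing F therefore indirectly keeps observations within expected, high-probability ranges, which is exactly the behavior described above.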

Simple animals have simple expectations. Even a single-cell creature, like yeast, can sense its environment in some ways and respond to it. Simple creatures evolve a very limited range of expectations and fixed responses to them, but animals developed a broader range of senses, which made it possible to develop a broader range of expectations and responses. In a control arms race, animals have ratcheted up their capacity to respond to an increasing array of sensory information to develop ever more functional responses. But it all starts with the idea of real-time information, which is, of course, the specialty of sensory neurons. These neurons bring signals to the brain, but what the brain needs to know about each sense has to be converted logically into an expected range. Information within the range is irrelevant and can be ignored. Information outside the range requires a response. This requirement to translate the knowledge into a form usable for top-level control created the mind as we know it.

From a logical standpoint, here is what the brain does. First, it monitors its internal and external environment using a large number of sensory neurons, which are bundled into specific functional channels. The brain reprocesses each channel using a logical transformation and feeds it to a subprocess called the mind, which maintains an “awareness” state over each channel that it can otherwise ignore. The channels are kept open because a secondary “attention” process in the brain evaluates each one to see if it falls outside its expected range. When a channel does, the attention process focuses on it, which moves the mind subprocess from an aware (but ignoring) state to a focused (attentive) state. The purpose of the mind subprocess is to collect incoming information that has been converted into a logical form relevant to the tasks at hand so that it can prioritize and act so as to minimize future surprise. Of course, its more complex reactions complete necessary functions, and that is its “objective” if we view the problem deductively, but the brain doesn’t have to operate deductively or understand that it has objectives. All it needs to be able to do is convert sensory information into expected ranges and have ways of keeping it there.
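Because this paragraph describes what is essentially an algorithm, a toy sketch may make it concrete. Everything below is my illustration only, not a claim about neural implementation; the channel names and numeric ranges are invented for the example:

```python
# A toy sketch of the awareness/attention loop just described: every
# channel stays open ("awareness"), and "attention" focuses on whichever
# channel has strayed furthest outside its expected range, if any.

from dataclasses import dataclass

@dataclass
class Channel:
    name: str
    low: float           # bottom of the expected range
    high: float          # top of the expected range
    value: float         # latest reading, already logically transformed

    def surprise(self) -> float:
        """Zero inside the expected range; grows with the deviation."""
        if self.value < self.low:
            return self.low - self.value
        if self.value > self.high:
            return self.value - self.high
        return 0.0

def attend(channels: list[Channel]) -> Channel | None:
    """Return the most surprising channel, or None if all is as expected."""
    most = max(channels, key=lambda c: c.surprise())
    return most if most.surprise() > 0 else None

channels = [
    Channel("skin temperature", low=33.0, high=36.0, value=34.5),
    Channel("blood CO2 level",  low=35.0, high=45.0, value=48.0),
]
focus = attend(channels)
print(focus.name if focus else "all channels within expected ranges")
# -> blood CO2 level: it is out of range, so it captures attention
```

The design point is that awareness (keeping every channel open) and attention (focusing on the most out-of-range channel) are separate steps, mirroring the aware-but-ignoring versus focused states described above.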

Relatively simpler animal brains, like those of insects, use entirely instinctive strategies to make this happen. But you can still tell from observing them that, from a logical standpoint, they are operating with both awareness and attention. This alone doesn’t make their mind subprocess comparable to ours in any way we can intuitively identify with, but it does mean that they have a mind subprocess. They are very capable of shifting their focus when inputs fall outside expected ranges, and they then select new behaviors to deal with the situation. Do they “go back” to what they were doing once a minor problem has been stabilized? The free energy principle doesn’t answer questions like that directly, but it does indirectly. Once a crisis has been averted, the next most useful thing the animal can do to avoid a big surprise in its future is to return to what it was doing before. But for very simple animals it may be sufficiently competitive to just continually evaluate current conditions to decide what to do next rather than to devise longer-term plans. After all, current conditions can include desires for food or sex, which can then be prioritized to devise a new plan on a continuous basis. Insects have very complex instinctive strategies for getting food which often depend on monitoring and remembering environmental features. So even though their mind subprocess is simple compared to ours, it must be capable of bringing things to attention, considering remembered data, and consulting known strategies to prioritize its actions to choose an effective logical sequence of steps to take.

People usually consider the ability to feel pain as the most significant hallmark of consciousness. Insects appear to lack nociceptors, the sensory neurons that transmit pain to the brain, so they do not seem to suffer when their legs are pulled off. It is just not sufficiently helpful or functional for insects to feel pain because their reproductive strategy is to make lots of expendable units. More complex animals make a larger investment in each individual and need them to be able to recover from injuries, and pain provides its own signal which is interpreted within an expected range to let the mind subprocess know whether it should ignore or act. Every sensory nerve (a bundle of sensory neurons) creates its own discrete and simultaneous channel of awareness in the mind. If you have followed my argument so far, you can see that what we think of as our first-person awareness or experience of the world is just the mind subprocess doing its job. Minds don’t have to be aware of themselves, or have comprehension or metacognition, to feel things. Feelings are, at their lowest level, just the way these nerve channels are processed for the mind subprocess. Feelings in this sense of being experienced sensations are called qualia. We distinguish red from green as very different qualia, but we could never describe the difference to a person who has red-green color blindness. The feelings themselves are indescribable experiences; words can only list associations we may have with them.

We don’t count each sensory nerve as its own quale (pronounced kwol-ee, the singular of qualia), even though we can tell it apart from all others. Instead, the brain groups the sensory nerves functionally into a fixed number of categories, and the feeling of each quale as we experience it is exactly the same regardless of which nerve triggered it. Red looks the same to me no matter which nerve fiber sends it. The red we experience is a function of perception and is not part of nature itself, which deals only in wavelengths, so our experience seems like magic, as if it were supernatural. But it isn’t really magic because there is a natural explanation: outside of our conscious awareness in the mind subprocess, the brain has done some information processing and presented a data channel to the mind in the form of a quale. The most important requirement of each sensory nerve is that we can distinguish it from all others, and the second most important requirement is that we can concurrently categorize it into a functional group, its quale. The third most important requirement is that we monitor each channel for being what we expect, and that unexpected signals then demand our attention. These requirements of qualia must all hold from ant minds to human minds, and, in an analogous way, for senses in yeast. But the detection and response range in yeast is much simpler than in ants, and that in ants is much simpler than in people. As we will see, the differences that arise are not just quantitative but also qualitative, as they bring new kinds of function to the table.

The way the brain processes qualia for us makes each one feel different in a special, customized way that is indescribable. More accurately, we can describe them, but words can’t create the feeling. One can describe colors to a blind person or sounds to a deaf person, but the meaning really lies in the experience. Where does this special feeling come from? The short answer is that it is made up. The longer answer is that everything the mind experiences is just information; the mind can only access and process information. But not all information is the same. Because the brain is creating complex signals for the qualia which are designed to let us instantly tell them apart, it has invested some energy in giving each of them a special data signature or “look and feel” which only that quale can produce. To some degree, we can remember that look and feel, but it is not as convincing or fresh as firsthand experience of it. Synesthesia is a rare brain condition in which some qualia trigger other qualia. Most commonly, synesthetes who see letters or numbers, or hear certain sounds, then see or think of colors or shapes they associate with them. This indicates that some internal malfunction has allowed one quale’s channel to overlap or bleed into another. The overlap almost always goes beyond simple sensory qualia to include words, numbers, shapes, or ideas, which suggests that other data channels feed these into our conscious awareness as well. But more to my point at hand, it suggests that the brain invents qualia but generally shields the mind from the details.

Our sensory perceptions provide us with information about our bodies or the world around us. The five classic human senses are sight, hearing, taste, smell, and touch. Sight combines senses for color, brightness, and depth to create composite percepts for objects and movement. The fovea (the highest-resolution area of the retina) only sees the central two degrees of the visual field, but our weaker peripheral vision extends to about 200 to 220 degrees. Smell combines the outputs of hundreds of independent smell receptor types. Taste is based on five underlying taste senses (sweet, sour, salty, bitter, and umami). Hearing combines senses for pitch, volume, and other dimensions. And touch combines senses for pressure, temperature, and pain. Beyond these five used most for sensing external phenomena, we have a number of somatic senses that monitor internal body state, including balance (equilibrioception), proprioception (limb awareness), vibration sense, velocity sense, time sense (chronoception), hunger and thirst, erogenous sensation, chemoreception (e.g. salt, carbon dioxide, or oxygen levels in blood), and a few more13. Most qualia are strictly informational, providing us with useful clues about ourselves or the world, but some are also dispositional, making us feel inclined to act or to take a position regarding them. Among senses, this most notably applies to pain, temperature, some smells, hunger, thirst, and sex. Dispositional senses are sometimes called drives.

We possess another large class of dispositional qualia called emotions that monitor our mental needs. If we recognize a spider or a snake, that is just a point of information, but if we feel fear, then we know we need to avoid it. Emotions give us motivation to act on a wide variety of needs. Emotions are metaperceptions our brain creates for us by forming perceptions about our conscious mental states. You could say our brain reads our mind and reacts to it. Our nonconscious mind computes what emotions we should feel by “peeking” at our conscious thoughts and feeding its conclusions back to us as emotions. It needs our conscious assessments because only the conscious mind understands the nuances involved, especially with interpersonal interactions. Emotions react to what we really believe, and so can’t be fooled easily, but thanks to metacognition we can potentially bring ourselves to believe things on one level that we don’t believe on another and so can manipulate our emotions.

If everything meets our default expectations, no emotion is stirred because no further action is needed. But if an event falls short of or exceeds our expectations, emotions may be generated to spur further action. Negative emotions motivate us to take corrective action, while positive emotions motivate us to take reinforcing action. We may be aware of rational reasons to act (that ultimately tie back to motivations from drives and emotions), but reasoning lacks urgency. Emotion will inspire us to act quickly. Most emotions are intense but short-lived because quick action is needed. They can also be more diffuse and longer-lived to signal longer-term needs, at which point we call them moods.

We have more emotions than we have qualia for emotions, which causes many emotions to overlap in how they feel. The analysis of facial expressions suggests that there are just four basic emotions: happiness, sadness, fear, and anger.14 While that is a bit of an oversimplification, it is approximately true. Dozens of discernible emotions share these four qualia, but they affect us in different ways because we know the emotions not just by how they feel but by what they are about. So satisfaction, amusement, joy, awe, admiration, adoration, and appreciation are distinct emotions that all feel happy, while anguish, depression, despair, grief, loneliness, regret, and sorrow all feel sad, yet we distinguish them based on context. The feel of an emotion spurs a certain kind of reaction. Happiness spurs reinforcement, sadness spurs diminishment, fear spurs retreat, and anger spurs corrective action. Sexual desire has qualia of its own that spur sex. So emotions that call for similar reactions can share qualia, and in some sense, an emotion can be said to feel “like” the action it inspires us to take. Fear and anger make us feel like doing something, happiness feels like something we want more of, and sadness makes us feel like pulling away from its source, which, in the long run, will help us overcome it. Wikipedia lists seventy or so emotions, while the Greater Good Science Center identifies twenty-seven15. Just as we can distinguish millions of colors with three qualia, we can probably distinguish a nearly unlimited range of emotions by combining the four to about twelve emotional qualia with an almost unlimited number of objects at which they can be directed. For example, embarrassment, shyness, and shame mostly trigger qualia for awkwardness, sadness, anxiety, and fear, but also correspond respectively to social appropriateness, comfort around others, and breaking social norms, as the sketch after this paragraph illustrates.
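A toy sketch of this combinatorial claim, using only the terms above (the intensity numbers and example objects are invented for illustration): if an emotion is a small vector of qualia intensities plus the object it is directed at, a handful of qualia can distinguish an almost unlimited range of emotions:

```python
# An emotion modeled as intensities over a few basic qualia plus the
# object it is about. Two emotions can share a dominant quale (both
# "feel sad") yet remain distinct because they are about different things.

from dataclasses import dataclass

QUALIA = ("happiness", "sadness", "fear", "anger")  # the four basic qualia

@dataclass(frozen=True)
class Emotion:
    name: str
    intensities: tuple[float, float, float, float]  # one weight per quale
    about: str                                      # what the emotion is about

grief  = Emotion("grief",  (0.0, 0.9, 0.1, 0.0), about="the loss of a loved one")
regret = Emotion("regret", (0.0, 0.7, 0.0, 0.2), about="a past decision")

def dominant_quale(e: Emotion) -> str:
    return QUALIA[e.intensities.index(max(e.intensities))]

print(dominant_quale(grief), dominant_quale(regret))  # sadness sadness
```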

Awareness and attention themselves can be said to have a custom feel to them and so can be called qualia. Awareness is informational while attention is dispositional. Their quality is just a sense of existence and interest, and so is not as specific as senses and emotions, but they are near-permanent sensations in our conscious life. Qualia are the special effects of the theater of consciousness that make it feel “first-person” and so seamless and well-produced that we believe it shows us the world around us “as it really is”. We know that our visual, aural, tactile, and other sensory ranges are highly idiosyncratic and only represent a very biased view of the world around us, but because that view is entirely consistent with our interactions with that world, it is real for all intents and purposes. The world we are imagining in our heads counts as real if our interactions with it are faithfully executed. We recognize our senses can be fooled, and, more than that, we know that they fill in gaps for us to keep the show on the road, which invariably introduces some mistakes, but we also know we can reconfirm any information in doubt as needed.

2.3 Humans: 4 million to 10,000 years ago

Cooperation and Engineering

People are much more capable than our nearest animal relatives, but why? Clearly, something significant happened in the seven million years since we diverged from chimpanzees. Before I get into that, let’s consider what some of the smartest animals can do. Apes and another line of clever animals, the corvids (crows, ravens, and rooks), can fashion simple tools from small branches, which requires cause-and-effect thinking using conceptual models. Most apes and corvids live in groups to defend against predators and extend foraging opportunities, as do many other kinds of animals, but they derive many other subtle benefits from social life. They groom each other, share care of offspring, share knowledge, and are quite vocal, using calls for warning, mating, and other purposes.12 Apes and corvids also have a substantial capacity to attribute mental states (such as senses, emotions, desires, and beliefs) to themselves and others, an ability called Theory of Mind (TOM)34. In particular, if they see food being hidden and they are aware of another animal (agent) observing it being hidden, they will take this knowledge of the other animal’s knowledge into account in their behavior. That mammals and birds evolved all these capabilities independently of each other indicates that a functional ratchet is at work.

But while apes, corvids, and some other animals can devise novel (non-instinctive) strategies to solve problems using both intuitive and conceptual learning, and can benefit from social interactions, only humans can work together cooperatively to solve new problems with engineering. Many animals, e.g. social insects and beavers, work cooperatively to solve problems using strategies honed by instinct. They are acting individually, without realizing that their behavior benefits the group, rather than using new information to work out solutions. Among animals that can conceptually work out solutions to new problems, only humans can reuse and improve on previously used tools (engineering) and then communicate their plans to each other and execute them in a coordinated fashion. They do this using language, which can be subdivided into semantic and nonverbal meaning. Semantic meaning is conceptual while nonverbal meaning is emotional and intuitive (so I am counting conceptual gestures as semantic, though they are technically also nonverbal). Early languages forced people to develop formal expressions of their conceptual models, which in turn made them think more clearly about them. So language is inherently metacognitive because every linguistic expression is both about something and is also an expression, which is a way of thinking about that something.

Cooperation and engineering gave us more ways to think about things and more reasons to do so because they both expand the range of beneficial activities indefinitely. Most notably, they create the prospect of individuals learning specialized service or manufacturing roles. Making spears for cooperative hunting is an often-cited activity requiring both kinds of specialization, but just having the cognitive capacity to do this suggests we were likely also cooperating to engineer housing, clothing, and food practices, probably for longer than a million years. Starting with rudimentary uses of semantic communication and tools, bands of humans established roles for group-level strategies that slowly evolved into ever more elaborate cultural heritages. No other animal communities can develop technologies and teach them to future generations. Because artifacts and the skills to use them persist, humans started to develop a frame of reference beyond the here and now that encompassed the past and future. Culture created the need for a broader concept of time. Cooperation had opened up an entirely new kind of cognitive ratchet that pushed human evolution forward quickly.

But why and how did such dramatic evolutionary change happen so rapidly after our divergence from chimps? The conventional theory of speciation requires geographic separation but doesn’t hint at reasons why intelligence might evolve quickly. A number of factors come together to make the rapid evolution of humans more likely. First, let’s consider the rate of change. Mutations spread when they help a species survive better in a given niche. Traditional theories of evolution assume that organisms will continue to experience a steady rate of mutational change given no environmental change because useful mutations are always possible. But while always possible, beneficial mutations become steadily less likely the longer an organism has lived in the same niche. This is because it gradually exhausts the range of what is physically reachable from the existing genome through small functional changes, causing the organism to climb up to a local maximum in the space of all possible functionality. Humans that could fly might be more functional, but flight is not physically reachable. So I am saying that evolutionary change always happens fastest when the fit of a species to its niche is worst and slows as that fit is perfected. In other words, evolution is a function of environmental stability. If the environment changes, evolution will be spurred to make species fit better. If the environment stays the same, each interbreeding population will approach stasis as its gene pool comes to represent an optimal solution to the challenges presented by the niche. However, if that population is separated geographically into two subpopulations, then one now has two new niches, each of which may be quite different from the average of the combined niche. Each population will quickly evolve to fit its new subniche. Rapid evolution can happen either when the environment changes quickly or when a niche is divided in two, but the difference is that in the latter case a new species will form. In both cases, however, a single interbreeding population changes rapidly because mutants that fit the new environmental conditions better outcompete the standard forms.

Scientists have noticed from the beginning that rates of evolutionary change are uneven. Darwin knew of this phenomenon and wrote: “the periods during which species have undergone modification, though long as measured in years, have probably been short in comparison with the periods during which they retain the same form.”5 Many species in the fossil record experience stasis over tens or even hundreds of millions of years. Niles Eldredge and Stephen Jay Gould published a paper in 1972 that named this phenomenon punctuated equilibrium and contrasted it with the more widely subscribed notion of phyletic gradualism, which held that evolutionary change was gradual and constant. But nobody has explained why punctuated equilibrium happens. It is as simple as this: change happens quickly when the slope toward a local maximum of potential functionality is steepest, and then slows and nearly stops when the local maximum is reached, as the sketch below illustrates.
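The claim in that last sentence is easy to demonstrate with a toy model. The sketch below is my illustration only, not a biological simulation: selection acts as hill-climbing on a one-dimensional fitness landscape, and improvement is rapid far from the peak but nearly stops close to it:

```python
# Selection as hill-climbing toward a local fitness maximum: large,
# frequent improvements far below the peak taper off to near-stasis
# as the peak is approached.

import random

def fitness(x: float) -> float:
    # a one-dimensional "landscape" with a single local peak at x = 3
    return -(x - 3.0) ** 2

x = 0.0  # the population starts poorly fitted to its niche
for generation in range(31):
    mutant = x + random.gauss(0, 0.25)   # small heritable variation
    if fitness(mutant) > fitness(x):     # selection keeps improvements
        x = mutant
    if generation % 10 == 0:
        print(f"gen {generation:2d}: trait={x:.2f} fitness={fitness(x):.3f}")
# Early generations improve quickly; later ones barely move:
# punctuation, then equilibrium.
```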

It is usually sufficient to view functional potential from the perspective of environmental opportunity, but organisms are also information processors, and sometimes entirely new ways of processing information create new functional opportunities. This was the case with cooperation and engineering, which together launched a new cognitive ratchet because they greatly extended the range of what was physically reachable through small functional changes. Michael Tomasello identified differences in ape and human social cognition using comparative studies that show just what capacities apes lack. Humans do more pointing, imitating, teaching, and reassessing from different angles, and our Theory of Mind goes deeper, so we not only realize what others know, but also what they know we know, and so forth recursively. These features combine to establish group-mindedness, or what he calls “collective intentionality”: ideas of belonging with associated expectations. Though our early cooperating ancestors Australopithecus four million years ago and Homo erectus two million years ago didn’t know it, they were bad fits for their new niche, which was effectively transformed by our potential to build arbitrarily complex tools to perform arbitrarily complex tasks. (We are still bad fits for our new niche because we have not reached the limits of arbitrary complexity, so we nervously await the arrival of the technological singularity to see what that really means.) In fact, we were arguably the worst fit for our niche that the history of life had ever seen, in that the slope toward our potential achievements was greatest, but we were also, of course, the only creatures yet to appear in a position to fill that niche. Making matters “worse”, the more functionality we evolved to fit our niche better, the more potential became accessible, which has accelerated evolution further right up to the present moment.

Language is often cited as the critical evolutionary development that drove human intelligence, and while I basically agree, there is more to the story. First, let me address the question of whether language is an instinct: it isn’t. Our brains are not wired with a universal grammar capacity that only needs to be provided with words to bring forth language automatically. However, language does have considerable instinctive support because of the Baldwin effect. The Baldwin effect, first mentioned by Douglas Spalding in 1873 and then promoted by the American psychologist James Mark Baldwin in 1896, proposes that the ability to learn new behaviors will lead animals to choose behaviors that help them fit their niche better, which will in turn lead to natural selection that makes them better at those behaviors. As Daniel Dennett put it, learning lets animals “pretest the efficacy of particular different designs by phenotypic (individual) exploration of the space of nearby possibilities. If a particularly winning setting is thereby discovered, this discovery will create a new selection pressure: organisms that are closer in the adaptive landscape to that discovery will have a clear advantage over those more distant.” The Baldwin effect is Lamarckian-like in that offspring tend to become better at what their ancestors did the most, yet it is entirely consistent with natural selection, and an accepted part of the Modern and Extended Synthesis, because nothing parents learn is inherited by their offspring. All that happens is that natural selection improves the fit of an animal to its niche. The upshot is that behaviors resulting from information processing done in real time, aka learning from experience (which can include learning from others and being passed down through generations), can gain genetic support over many generations. So one can imagine that a number of instincts that help us with language are Baldwin instincts. Couldn’t this evolve far enough to make an algorithm for Universal Grammar, e.g. Noam Chomsky’s principles and parameters approach to generative grammars, become innate? Yes, anything can evolve given enough time, but it is just not necessary. A fairly small number of Baldwin refinements to our general pool of cognitive instincts were enough to support language, and evolution will not specialize talents unnecessarily because generality has more functional potential.
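The Baldwin effect itself has been demonstrated in simulation. The sketch below is a simplified variant in the spirit of Hinton and Nowlan’s well-known 1987 simulation (the parameters and fitness function here are my simplifications, not theirs): genes can be hard-wired correct (1), hard-wired wrong (0), or learnable (“?”), and lifetime learning gives partially-correct genotypes a fitness gradient that selection can climb, gradually hard-wiring what was learned:

```python
# Lifetime learning smooths the fitness landscape: a genotype with some
# learnable "?" genes has a chance of finding the all-correct target
# during its life, so selection can reward partial solutions and then
# gradually replace learnable genes with hard-wired correct ones.

import random

L, POP, TRIALS = 10, 100, 50  # genome length, population size, learning trials

def random_genome():
    return [random.choice([0, 1, "?"]) for _ in range(L)]

def fitness(genome):
    if 0 in genome:
        return 0.0  # a hard-wired wrong gene can never be learned around
    unknowns = genome.count("?")
    # probability that random guessing over the "?" positions hits the
    # all-correct target at least once within TRIALS lifetime tries
    return 1 - (1 - 0.5 ** unknowns) ** TRIALS

pop = [random_genome() for _ in range(POP)]
for _ in range(50):
    weights = [fitness(g) or 1e-6 for g in pop]           # selection
    parents = random.choices(pop, weights=weights, k=POP)
    pop = [[g[i] if random.random() > 0.01                # 1% mutation rate
            else random.choice([0, 1, "?"])
            for i in range(L)]
           for g in parents]

hard_wired = sum(g.count(1) for g in pop) / (POP * L)
print(f"fraction of hard-wired correct genes: {hard_wired:.2f}")
# Learned settings get "backfilled" into instinct over the generations.
```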

Many, and perhaps most, complex animal behaviors were shaped by the Baldwin effect. I consider dam building in beavers to be a Baldwin instinct. It seems like it might have been reasoned out and taught from parents to offspring, but actually “young beavers, who had never seen or built a dam before, built a similar dam to the adult beavers on their first try.”6 This instinct results largely from an innate desire to suppress the sound of running water. But why would evolution select for that? Over the long period when this instinct was developing, beavers were gnawing wood and sometimes blocking streams. Those that blocked streams more did better. Beavers do learn from their environment and always try to use that knowledge to improve their lot. They did not conceive the idea of building dams, but they did do little things that built on their experience. For example, beavers plug holes in dams with mud and debris. While this now derives from an instinctive urge, pre-instinctive beavers building shoddier dams would have found from experience that flowing water was a problem and sometimes figured out ways to address it. A chance mutation that makes beavers more inclined to plug their dams will outcompete the learned capacity because it happens more reliably. In this way, Baldwin instincts backfill functions developed in real time into instinct.

Unlike beavers, young humans raised without language will not simply speak fluent Greek. Both Holy Roman Emperor Frederick II and King James IV of Scotland performed such experiments, in the 13th and 15th centuries respectively7. In the former case, the infants died, probably from lack of love, while in the latter they did not speak any language, though they may have developed a sign language. The critical period hypothesis strongly suggests that normal brain development, including the ability to use language, requires adequate social exposure during the critical early years of brain development. Children with very limited exposure to language who interact with other similar children will often develop an idioglossia, or private language, which is not a full-featured language. Fifty deaf children, probably possessing idioglossias or home sign systems, were brought together in 1977 at a center for deaf education in Nicaragua. Efforts to teach them Spanish had little success, but in the meantime, over a nine-year period, they developed what became a full-fledged sign language now called Idioma de Señas de Nicaragua (ISN)8. Languages themselves must be created through a great deal of human interaction, but our facility with language, and our inclination to use it, is so great that we can quickly create complete languages given adequate opportunity. While every fact and rule about any given language must be learned, and while our general capacity for learning includes the ability to learn other complex skills as well, language has been with humans long enough to be heavily influenced by the Baldwin effect. A 2008 computer-simulation study on the feasibility of the Baldwin effect influencing language evolution found it quite plausible9. I think human populations have been using proto-languages for millions of years and that the Baldwin effect has been significant in preferentially selecting traits that help us learn them.

While linguists tend to focus on grammar, which relates only to the semantic content of language, much of language is nonverbal. Consider that Albert Mehrabian famously claimed in 1967 that only 7% of the information transmitted by verbal communication was due to words, while 38% was tone of voice and 55% was body language. This breakdown was based on two studies in which nonverbal factors could be very significant and does not fairly represent all human communication. While other studies have shown that 60 to 80% of communication is nonverbal in typical face-to-face conversations, in a conversation about purely factual matters most of the information is, of course, carried by the semantic content of the words. This tells us that information carried nonverbally usually matters more to us than the facts of the matter. Cooperation depends more on goodwill than good information, and goodwill is the chief contribution of nonverbal information. Reading and writing are not interactive and don’t require a relationship to be established, so they still work well without body language. But written language also conveys substantial nonverbal content through wording that evokes emotions and intuitions, essentially capturing a mood.

Having established how the cognitive ratchet got started with humans and then accelerated, I’m now going to look closer at the ways functionality expanded in human minds to increase our capacity to cooperate and engineer. The overall role of the mind is the same in humans as in other animals, namely to control the body effectively, so we need to consider the problem holistically with this in mind. That said, I will show how our capacity for conceptual thinking was boosted while remaining integrated with the evolutionary demands of survival.

The Rational and Intuitive Minds

“The intuitive mind is a sacred gift and the rational mind is a faithful servant. We have created a society that honors the servant and has forgotten the gift.” –attributed to Albert Einstein

This quote divides the conscious mind into an intuitive and a rational component. The rational mind consists of the logical information processing done in the conscious mind, and the intuitive mind is information processing that is either not logical or only partly logical because it draws on impressions or hunches. Our intuitive impressions appear in our conscious awareness after having been triggered somehow by conscious thoughts. I am proposing that a level of processing below consciousness that I call the nonconscious intuitive mind creates these impressions by drawing heavily on memory but also considering a preponderance of the evidence relating to the matter at hand. It is paired with the conscious intuitive mind, which is the part of intuition we are aware of. Although the rational mind also draws heavily on memory, which is a nonconscious process, I will, as I mentioned in the last chapter, consider memory that we can bring back to consciousness as part of consciousness, and the rational thinking we do is also conscious, so there is no nonconscious rational mind. Qualia appear in our conscious awareness almost immediately after we sense things, as if the nonconscious processes that convert senses to a form we can consciously understand were instantaneous. We can call the part of the mind that manages qualia the qualitative mind. The qualitative mind then divides into the aware mind, the attentive mind, the sensory mind, and the emotional mind. As with the intuitive mind, each of these parts has both conscious and nonconscious parts, and it should be understood that the nonconscious parts feed the conscious parts.

It is well known that the left and right hemispheres of the brain work differently. Most basically, the right brain controls the left half of the body and vice versa. But of more interest here, the right brain can approximately be called the intuitive or non-verbal mind and the left the rational or verbal mind. Although this is a gross simplification, to a first-order approximation it is true. To a second-order approximation, each half is generally capable overall and could carry on without the other half, but specialized skills do often preferentially develop in just one hemisphere. The most likely reason for hemisphere lateralization is that the two halves can only communicate with each other through the corpus callosum (a band of nerve fibers connecting the two halves), and possibly to a much lesser degree through lower parts of the brain. This allows each half to develop considerable autonomy. To act effectively as top-down information processors, we need to carefully balance both top-down (rational) and bottom-up (intuitive) perspectives, so it makes sense that we would develop lateralized specialties for these perspectives using each hemisphere. Each side must do both top-down and bottom-up processing because both are needed to make sense of anything, but the weighting is different, making one side seem relatively more logical and the other more intuitive. Both halves contribute their efforts to create our overall, balanced perspective.

In The Master and His Emissary: The Divided Brain and the Making of the Western World, Iain McGilchrist argues that the left brain has become increasingly dominant in Western society, leading to rationality eclipsing intuition. While he linked his argument to the physical structure of the brain, one could alternatively make the purely functional argument that rational thinking, from either or both hemispheres, has become more dominant in Western society. This is exactly the point the quote above was making. Is this general shift related to left-brain dominance? No; it’s not the side of the brain; the rational/intuitive division is too simplistic to describe how the hemispheres specialize. But rephrased in functional terms, does this general shift exist, and is it related to the dominance of rationality? Yes to both. It is not a failing of modern society; it is just an inescapable part of the cognitive ratchet: concepts and metaconcepts are rational, so the more cognitive functionality we develop, the more rational we become. But the need for intuition never goes away, because we will always have to join top-down solutions back to bottom-up perceptions, which are the source of all our drives, motives, and satisfaction with life. This mutual dependence has been recognized for a long time. Immanuel Kant put it like this: “Thoughts without intuitions are empty, intuitions without concepts are blind”10. The full passage from Kant’s 1781 edition of Critique of Pure Reason, p. 50-51, is:

Intuition and concepts … constitute the elements of all our cognition, so that neither concepts without intuition corresponding to them in some way nor intuition without concepts can yield a cognition. Thoughts without [intensional] content (Inhalt) are empty (leer), intuitions without concepts are blind (blind). It is, therefore, just as necessary to make the mind’s concepts sensible—that is, to add an object to them in intuition—as to make our intuitions understandable—that is, to bring them under concepts. These two powers, or capacities, cannot exchange their functions. The understanding can intuit nothing, the senses can think nothing. Only from their unification can cognition arise.

To clarify, Kant did think that intuition and concepts existed independently in our minds. It was just that neither alone could create understanding or cognition, and with this I completely agree.

Learning – A Way to Conditionally Manage Incoming Information

I outlined before how information processors create three orders of information: percepts, concepts and metaconcepts. I also said that our capacity for percepts and concepts has both innate and learned components. Qualia are how we subjectively experience innate perception as either senses or feelings, as I discussed in the last chapter11. But I haven’t yet suggested why learning exists, except in the larger context that everything functional is useful. Yes, learning is useful, but why? Instincts are great for handling frequently-encountered problems, but animals face challenges every day where novel solutions could do a better job, and it would not be practical or probably even possible to wait for solutions to such problems to evolve into instincts. Real-time systems that could tailor a solution to the problem at hand would provide a tremendous competitive advantage. This is the beginning of the cognitive ratchet I have been discussing.

We call this creation of real-time solutions from experience learning, and all animals developed two quite distinct ways to do it: an inductive approach that works from the bottom up and a deductive approach that works from the top down. These are what, as I said previously, William of Ockham called intuitive and abstractive cognition. Although every problem animals solve must incorporate elements of both, they are different specializations. Bottom-up techniques, which are the specialty of the qualitative and intuitive minds, create qualia and subconcepts respectively. Top-down techniques, which are the specialty of the rational mind, create concepts and metaconcepts. Inductive methods find patterns in data independent of context, while deductive methods create contexts (or models), starting with the context of the organism as an agent, and decompose that context into causes and effects, i.e. reasons. While qualia process sensory information in a fixed way, subconcepts and concepts are learned and stored in memory, starting from nothing when the brain first forms and building an increasingly complex network throughout life.

Feelings, subconcepts, and concepts collectively comprise “thoughts”, which is not a rigid term but encompasses any kind of experience passing through our conscious awareness. We consider thoughts that are well-established and reliable or correct to be knowledge. Knowledge can be either specific or general, where specific knowledge is tied to a single circumstance and general knowledge is expected to apply to some range of circumstances. Qualia are always specific and take place in real time, but subconcepts and concepts (including the memory of qualia) can be either specific or general, and can either take place in real time as we think about them or be remembered for later use. Though qualia thus constitute much of our current knowledge, they comprise none of our learning, experience, or long-term knowledge. I need a term I can use to refer just to what we have learned, so I am going to use the words “knowledge” and “thought” to mean only subconcepts and concepts going forward, and I will explicitly mention qualia (aka feelings) separately as needed. Furthermore, I will call a piece of specific knowledge a “memento”, indicating it is a distinct item from memory, and a piece of general knowledge either a “lesson” if it follows directly from experience or a “notion” if it is a novel projection. Arguably, everything in our memory is knowledge by definition because to remember is to know, but knowledge varies dramatically in how well we know it and how reliable or correct we consider it to be. We also start to forget things immediately and increasingly over time, so we tend to reserve the word knowledge for memory that meets a standard of robustness appropriate to any given situation.
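Since these are load-bearing terms for what follows, a tiny sketch may help fix them (the class names simply mirror the taxonomy just defined; the example contents are invented):

```python
# A memento is specific, a lesson generalizes direct experience, and a
# notion is a novel projection beyond experience.

from dataclasses import dataclass

@dataclass
class Knowledge:
    content: str

class Memento(Knowledge):  # specific: a distinct item from memory
    pass

class Lesson(Knowledge):   # general: follows directly from experience
    pass

class Notion(Knowledge):   # general: a novel projection
    pass

memory = [
    Memento("the stove burned my hand yesterday"),
    Lesson("hot stoves burn"),
    Notion("anything glowing red is probably too hot to touch"),
]
for item in memory:
    print(type(item).__name__, "-", item.content)
```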

All feelings and thoughts create information from patterns discovered in real time, that is, through experience and not through evolution, even though they all leverage cognitive mechanisms that developed over evolutionary time. But where feelings are “special effects” in the theater of consciousness, which evolved to deliver them to our awareness in a customized way, thoughts (i.e. subconcepts and concepts) have no sensation. We can associate remembered qualia with them, but this second-hand feeling falls far short of live qualia. Knowledge doesn’t “feel” like anything; all it does is bring other knowledge to mind. Thoughts connect to other thoughts. They connect serially in language, or we can go depth-first to dig into deeper levels of detail, or breadth-first to find similarities or analogies, or we can just think by free association to follow our intuition. Thoughts seem to be composed out of memories of other thoughts, and our ability to traverse them depends on what we are looking for. Language seems like a special case: while it doesn’t prevent us from roaming through our memory as we like, thinking in language, following our own inner voice, helps us guide trains of thought more easily down desired pathways. As we think, we form memories and gather experience, recording them first as mementos of things that happened to us and then as lessons that help us predict what will happen in the future in similar situations. This collecting of experience happens both beneath our conscious awareness through learned perception (making subconcepts) and through conception, which is the conscious creation of concepts.

Finally, it is time to start decomposing subconcepts and concepts. Subconcepts have both nonconscious and conscious aspects. They are created nonconsciously from impressions we form from experience. This internal analysis builds them from inductive trials by weighing the preponderance of the evidence. Subconcepts can only be spoken of in the plural because they are networks of impressions rather than discrete ideas. To refer to something discretely, we need to use concepts. Our conscious awareness of our intuitive mind consists of impressions and hunches which we know are based on experience and thus potentially carry useful information. We also have intuition about how reliable our intuitions are, so we actually trust our intuition for most things most of the time. Although we don’t know just why things pop into our minds intuitively, intuition tends to be very appropriate because it works the same way as memory recall. Specific memories, or mementos, tie to individual moments, but general memories, or lessons, draw conclusions about the broader applicability of specific memories to more general circumstances. Intuitions are lessons we learn below the conceptual level from experience. So we don’t just remember them; we are nonconsciously drawing inductive conclusions from them.

Concepts

Conceptual thinking, as I discussed in the chapter on dualism, is based on deductive reasoning. Deduction establishes logical models, which are sets of abstract premises, logical rules, and conclusions one can reach by applying the rules to the premises. Logical models are closed, meaning we can assume their premises and rules are completely true, and also that all conclusions that follow from the rules are true (the rules are typically binary, but they don’t have to be). In our minds, we create sufficiently logical frameworks called conceptual models, for which the underlying premises are concepts. Concepts are abstract entities with two parts in the mind: a handle with which we refer to them and a content that tells us what they mean. The concept’s handle is a reference we keep for it in our minds, like a container. Concepts are often named by words or phrases, but we know many more concepts than we have named with words, including, for example, a detailed conceptual memory of events. From the perspective of the handle, the concept is fully abstract and might be about anything.

The concept’s meaning is its content, which consists of one or more relationships to other concepts. At its core, information processing finds similarities among things and applies those similarities to specific situations. Because of this, the primary feature of every concept’s content is whether it is a generality or a particular. A generality or type embraces many particulars that can be said to be examples of the type. The generality is said to be superordinate to the subordinate example across one or more variable ranges. Providing a value for one of those ranges creates an example or instance called a token of the type, and if all ranges are specified one arrives at a particular, which is necessarily unique because two tokens with the same content are indistinguishable and so are the same token. Generalities are always abstract, while particulars can be either concrete or abstract, which, in my terminology, means they are either about something physical or something functional. A concrete or physical particular will correspond to something spatiotemporal, i.e. a physical thing or event. Each physical thing has a noumenon (or thing-in-itself) we can’t see and phenomena that we can. From the phenomena, we create information (feelings, subconcepts, and concepts) which can be linked as the content of a concept. Mentally, we catalog physical particulars as facts, which is a recognition that the physical circumstance they describe is immutable, i.e. what happened at any point in space and time cannot change. Note that concrete particulars are still generalities with respect to the time dimension, because we take physical existence as persisting through a duration of time. Since concrete particulars eventually change over time, we model them as a series of particulars linked abstractly as a generality. What happens at a given point in space and time is noumenal, but we only know of it by aligning our perceptions of phenomena with our subconcepts and concepts, which sometimes leads to mistaken conclusions. We reduce that risk and establish trust by performing additional observations to verify facts, and from the amount of confirming evidence we establish a degree of certainty about how well our thoughts characterize noumena. Belief is a special ability, which I will describe later, that improves certainty further by quarantining doubt.

An abstract or functional particular is any non-physical concept that is specific in that it doesn’t itself characterize a range of possible concepts. The number “2” is an abstract particular, as it can’t be refined further. A circle is also an abstract particular until we introduce the concept of distance, at which point circle is a type that becomes a particular given a radius. If we introduce location within a plane, we would also need the coordinates of the circle’s center. So we can see that whether an abstract concept is particular or not depends on what relationships exist within the logical model that contains it. The number x such that x+2=4 is variable until we solve the equation, at which point we see it is the particular 2. The number x such that x^2=4 is variable even after we solve the equation because it can be either -2 or 2. So for functional entities, once all variability is specified within a given context, one has an abstract particular. Mathematics lays out sets of rules that permit variability and then let us move from general to particular mathematical objects. Deductive thought employs logical models that permit variability and can similarly arrive at particulars. For example, we can conceive of altruism as a type of behavior. If I write a story in which I open a door for someone in some situation, then that is a fully specified abstract particular of altruism. So just as we see the physical world as a collection of concrete particulars that we categorize using abstract generalities about concrete things, we see the mental world as a set of abstract particulars categorized by abstract generalities about abstract things. Thus, both our concrete and abstract worlds divide nicely into particular and general parts. Concrete particulars can be verified with our senses (if we can still access the situation physically), but abstract particulars can only be verified logically. In both cases, we can remember a particular and how we verified it.
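One way to picture the handle/content machinery just described is as a small data structure, sketched below in Python. The class and field names are mine, invented only to make the type/token/particular distinctions concrete; nothing here is meant as a model of how brains actually store concepts.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    handle: str                                   # the reference we keep for it
    ranges: dict = field(default_factory=dict)    # variable dimensions of the type
    bindings: dict = field(default_factory=dict)  # values supplied so far

    def bind(self, **values):
        """Supplying a value for a range yields a token of the type."""
        return Concept(self.handle, self.ranges, {**self.bindings, **values})

    def is_particular(self):
        """A particular has every variable range specified; otherwise the
        concept is still a generality over its unbound ranges."""
        return set(self.bindings) == set(self.ranges)

# The circle example from the text: CIRCLE is a type once distance and
# location are in play; binding all its ranges produces an abstract particular.
CIRCLE = Concept("CIRCLE", ranges={"radius": float, "center": tuple})
token = CIRCLE.bind(radius=2.0)             # an instance, still general over center
particular = token.bind(center=(0.0, 0.0))  # fully specified, hence unique
print(CIRCLE.is_particular(), token.is_particular(), particular.is_particular())
# False False True
```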

Our senses send a flood of information to our minds, which inherently form concrete particulars from it, while the process of recognition also categorizes things into a wide variety of abstract types we have established in our memories as concepts. Our nonconscious mind doesn’t think about these concepts, but it does retrieve them for us, demonstrating that some of our capacity to work with concepts is innate, even if their content is shaped through experience.

Beyond whether they are generalities or particulars, concepts can have considerably more content. But what does that content look like? The surprising fact we have to keep in mind is that a concept is what it can do — its meaning is its functionality. So we shouldn’t try to decompose concepts into qualities or relationships but instead into units of purpose. Deductive models can provide much better control than inductive models because they can predict the outcomes of multistep processes through causal chains of reasoning, but to do that their ranges of variability have to align closely with the inductive variability observed in the kinds of applications in which we expect to use them. When this alignment is good, the deductive models become highly functional because their predictions tend to come true. Viewed abstractly, then, the premises and rules of deductive models exist because they are functional, i.e. because they work. So concepts are not just useful as an incidental side effect; being useful is fundamental to their nature. This is what I have been saying about information all along — it is bundled-up functionality.

Given this perspective, what can we say about content? Let’s start simply. The generic concept invoked by this clause’s use of the phrase “generic concept”, for example, is an abstract generality with no further meaning at all; it is just a placeholder for any concept. Or, the empty particular concept in this clause is an example of an abstract particular with no further meaning, since it is the unique abstract particular whose function is to represent an empty particular. But these are degenerate cases; almost every concept we think of has some real content. A concrete particular concept includes spatiotemporal information about it, as noted above, and all our spatiotemporal information comes originally from our senses as qualia. We additionally gain experience with an object which is generalized into subconcepts that draw parallels to similar objects. Much of the content of concrete particulars consists of links to feelings and subconcepts that remind us what it and other things like it feel like. Each concrete particular is also linked to every abstract generality for which it is a token. Abstract generalities then indirectly link to feelings and subconcepts of their tokens, with better examples forming stronger associations. What does it mean to link a concept to other feelings (sensory or emotional), subconcepts, or concepts? We suspect that this is technically accomplished using the 700 trillion or so synapses that join neurons to other neurons in our brains12, which implies that knowledge is logically a network of relationships linking subconcepts and concepts together and from there down to feelings. Our knowledge is vast and interconnected, so such a vast web of connections seems like it could be powerful enough to explain it, but how might it work? Simplistically, thinking about a concept could activate the feelings and thoughts linked by its contents by activating the linked neurons. Of course, it is more complicated than that; chiefly, activation has to be managed holistically so that each concept (and subconcept and feeling) contributes an appropriate influence on the overall control problems being solved. The free energy (surprise-minimization) principle is one holistic rule that helps provide this balance, and beneath it sit attention and prioritization systems. But for now, I am trying to focus on how the information is organized, not how it is used.
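As a toy illustration of the “activating linked neurons” idea, the sketch below spreads activation through a small hand-built concept network. The graph, weights, decay, and threshold are all invented; real activation management, as just noted, would be governed holistically by attention, prioritization, and surprise minimization.

```python
# A toy spreading-activation pass over a network of linked concepts,
# subconcepts, and feelings. All links and numbers are invented.

links = {
    "APPLE": [("RED", 0.8), ("FRUIT", 0.9), ("sweet-taste", 0.6)],
    "FRUIT": [("TREE", 0.5), ("EAT", 0.7)],
    "RED":   [("FIRE_TRUCK", 0.3)],
}

def spread(start, steps=2, decay=0.5, threshold=0.1):
    activation = {start: 1.0}
    frontier = {start}
    for _ in range(steps):
        next_frontier = set()
        for node in frontier:
            for neighbor, weight in links.get(node, []):
                a = activation[node] * weight * decay
                if a > threshold:  # weak associations never reach awareness
                    activation[neighbor] = max(activation.get(neighbor, 0.0), a)
                    next_frontier.add(neighbor)
        frontier = next_frontier
    return activation

# Thinking of APPLE brings related thoughts and remembered feelings to mind,
# each with a strength that falls off with link weight and distance.
print(spread("APPLE"))
```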

Central to the idea of concepts is their top-down organization. To manage our bodies productively, we, that is, our minds as the top-level control centers of our brains, have to look at the world as agents. When we first start to figure the world out, we learn simple categories like me and not-me, food and not-food, safe and not-safe. Our brains are wired to pick up on these kinds of binary distinctions to help us plan top-level behavior, and they soon develop into a full set of abstract generalities about concrete things. It is now impossible to say how much of this classification framework is influenced by innate preferences and how much was created culturally through language over thousands of generations, because we all now learn to understand the world with the help of language. In any case, our framework is largely shared, but we also know how to create new personal or ad hoc classifications as the need arises. For categories and particulars to be functional we need deductive models with rules that tell us how they behave. Many of these models, too, are embedded in language and culture, and in recent centuries we have devised scientific models that have raised the scope and reliability of our conceptual knowledge to a new level.

Some examples of concepts will clarify the above points. The concept APPLE (all caps signifies a concept) is an abstract generality about a kind of fruit and not any specific apple. We have one reference point or handle in our minds for APPLE, which is not about the word “apple” or a thing like or analogous to an apple, but only about an actual apple that meets our standard of being sufficiently apple-like to match all the variable dimensions we associate with being an APPLE. From our personal experience, we know an APPLE’s feel, texture, and taste from many interactions, and we also know intuitively in what contexts APPLEs are likely to appear. We match these dimensions through recognition, which is a nonconscious process that just tells us whether our intuitive subconcepts for APPLE are met by a given instance of one. We also have deductive or causative models that tell us how APPLEs can be expected to behave and interact with other things. Although each of us has customized subconceptual and conceptual content for APPLE, we each have just one handle for APPLE, and through it we refer to the same functionality for most purposes. How can this be? While each of us has distinct APPLE content from our personal experiences, the functional interactions we commonly associate with apples are about the same. Most generally, our functional understanding of them is that they are fruits of a certain size eaten in certain ways. In more detail, we probably all know and would agree that an APPLE is the edible fruit of the apple tree, is typically red, yellow or green, is about the size of a fist, has a core that should not be eaten, and is often sliced up and baked into apple pies. We will all have different feelings about their sweetness, tartness, or flavor, but this doesn’t have a large impact on the functions APPLEs can perform. That these interactions center around eating them is just an anthropomorphic perspective, and yet that perspective is generally what matters to us (and, in any case, not so incidentally, fruits appear to have evolved to appeal to animal appetites to help spread their seeds).

Most of us realize apples come in different varieties, but none of us have seen them all (about 7500 cultivars), so we just allow for flexibility within the concept. Some of us may know that apples are defined to be the fruit of a single species of tree, Malus pumila, and some may not, but this has little impact on most functional uses. The person who thinks that pears or apple pears are also apples is quite mistaken relative to the broadly accepted standard, but their overly generalized concept still overlaps with the standard and may be adequate for their purposes. One can endlessly debate the exact standard for any concept, but exactness is immaterial in most cases because only certain general features are usually relevant to the functions that typically come under consideration. Generality is usually more relevant and helpful than precision, so concepts all tend to get fuzzy around the edges. But in any case, as soon as irrelevant details become relevant, they can simply be clarified for the purpose at hand.

Suppose I have an apple in my hand which I can call APPLE_1 for the purposes of this discussion. APPLE_1 is a concrete particular or token of an APPLE, and we would consider its existence a fact based on just a few points of confirming evidence.

The fact that a given word can refer to a given concept in a given context is what makes communication possible. It also accounts for the high level of consistency in our shared concepts and the accelerating proliferation of new concepts through culture over thousands of years. The word “apple” is the word we use to refer to APPLE in English. The word “apple” is itself a concept, call it WORD_APPLE. WORD_APPLE has a spelling and a pronunciation and the content that it is a word for APPLE, while APPLE does not. We never confuse WORD_APPLE with APPLE and can tell from context what content is meant in any communication. Generally speaking, WORD_APPLE refers only to the APPLE fruit and the plant it comes from, but many other words have several or even many meanings, each of which is a different concept. Even so, WORD_APPLE, and all words, can be used idiomatically (e.g. “the Big Apple” or “apple of my eye”) or metaphorically to refer to anything based on any similarity to APPLE. We usually don’t name instances like APPLE_1, but proper nouns are available to name specific instances as we like. We don’t have specific words or phrases for most of the concepts in our heads, either because they are particulars or because they are generalities that are too specific to warrant their own words or names. A wax apple is not an APPLE, but it is meant to seem like an APPLE, and it matches the APPLE content at a high level, so we will often just refer to it using WORD_APPLE, only clarifying that it is a different concept, namely WAX_APPLE, if the functional distinction becomes relevant.
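The word/concept split lends itself to a small sketch: an entry like WORD_APPLE carries spelling, pronunciation, and pointers to the concepts it can refer to, with context selecting among them. The lexicon structure and context labels below are invented for illustration.

```python
# A toy lexicon separating WORD_APPLE (a concept about a word) from the
# concepts the word can refer to. All entries here are invented.

lexicon = {
    "apple": {
        "spelling": "apple",
        "pronunciation": "/ˈæp.əl/",
        "senses": {
            "fruit-context": "APPLE",          # the fruit and its plant
            "nyc-idiom": "NEW_YORK_CITY",      # "the Big Apple"
            "favorite-idiom": "CHERISHED_ONE", # "apple of my eye"
        },
    }
}

def concept_for(word, context):
    """Resolve a word to a concept handle, falling back to its main sense."""
    senses = lexicon[word]["senses"]
    return senses.get(context, senses["fruit-context"])

print(concept_for("apple", "nyc-idiom"))     # NEW_YORK_CITY
print(concept_for("apple", "grocery-list"))  # APPLE (the default sense)
```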

Some tokens seem to be the perfect or prototypical exemplars of an abstract category, while others seem to be minor cases or only seem to fit partially. For example, if you think of APPLE, a flawless red apple probably comes to mind. If you think of CHAIR, you are probably thinking of an armless, rigid, four-legged chair with a straight back. Green or worm-eaten apples are worse fits, as are stools or recliners. Why does this happen? It’s just a consequence of familiarity, which is to say that some inductive knowledge is more strongly represented. All the subcategories or instances of a completely impartial deductively-specified category are totally equivalent, but if we have more experience with one than another, then that will invariably color our thoughts. Exemplars are shaped by the weighting of our own experience and our assessment of the experience of others. We develop personal ideals, personal conceptions of shared ideals, and even ideals customized to each situation at hand that balance many factors. Beyond ideals, we develop similar notions for rarities and exceptions. Examples that fit a category only partially demonstrate nothing more than that the category was not generalized with them in mind. Nothing fundamental can be learned about categories by pursuing these kinds of idiosyncratic differences. Plato famously conceived the idea that categories were somehow fundamental with his theory of Forms, which held that all physical things are imitations or approximations of ideal essences, called Forms or Ideas, to which they in some sense aspire. I pointed out earlier that William of Ockham realized that categories are actually extrinsic. They consequently differ somewhat for everyone, but they also share commonalities based on our conception of what we have in common.
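A toy exemplar model shows how prototypes can fall out of nothing more than frequency weighting, as this paragraph argues. The instances and counts below are invented; the point is only that an impartial category plus lopsided experience yields a “perfect exemplar” without any Platonic Form.

```python
# Toy exemplar model: the category itself is impartial, but experience is
# not, so prototypes fall out of frequency weighting. All numbers invented.

from collections import Counter

# Each remembered APPLE instance, with how often we've met something like it.
experience = Counter({
    ("red", "flawless"): 40,
    ("green", "flawless"): 12,
    ("red", "worm-eaten"): 3,
    ("yellow", "bruised"): 2,
})

def prototype(memories):
    """The exemplar that comes to mind first is just the most strongly
    represented one, not a privileged ideal."""
    return memories.most_common(1)[0][0]

def typicality(instance, memories):
    """How good a fit an instance feels: its share of our experience."""
    return memories[instance] / sum(memories.values())

print(prototype(experience))                                    # ('red', 'flawless')
print(round(typicality(("green", "flawless"), experience), 2))  # 0.21
```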

Deriving an Appropriate Scientific Perspective for Studying the Mind

I’ve made the case for developing a unified and expanded scientific framework that can cleanly address both mental and physical phenomena. I’ve reviewed how scientific physicalism squeezed out the functional view. And I’ve reviewed how function arose in nature and led to consciousness. This culminates in a new challenge: we need to develop an appropriate scientific perspective to study the mind, which will also impact how we view the study of science at large. I will follow these five steps:

1. Our Common Knowledge Understanding of the Mind
2. Form & Function Dualism: things and ideas exist
3. The nature of knowledge: pragmatism, rationalism and empiricism
4. What Makes Knowledge Objective?
5. Orienting science (esp. cognitive science) with form & function dualism and pragmatism

1. Our Common Knowledge Understanding of the Mind

Before we get all sciency, we should reflect on what we know about the mind from common knowledge. Common knowledge has much of the reliability of science in practice, so we should not discount its value. Much of it is uncontroversial and does not depend on explanatory theories or schools of thought, including our knowledge of language and many basic aspects of our existence. So what about the mind can we say is common knowledge? This brief summary just characterizes the subject and is not intended to be exhaustive. While some of the things I will assume from common knowledge are perhaps debatable, my larger argument will not depend on them.

First and foremost, having a mind means being conscious. Consciousness is our first-person (subjective) awareness of our surroundings through our senses and our ability to think and control our bodies. We implicitly trust our sensory connection to the world, but we also know that our senses can fool us, so we’re always re-sensing and reassessing. Our sensations, formally called qualia, are subjective mental states like redness, warmth, and roughness, or emotions like anger, fear, and happiness. Qualia have a persistent feel that occurs in direct response to stimuli. When not actually sensing we can imagine we are sensing, which stimulates the memory of what qualia felt like. It is less vivid than actual sensation, though dreams and hallucinations can seem pretty real. All qualia are strictly functional, but about different kinds of things. While our sensory qualia generate properties about physical things (forms), our drives and emotional qualia generate properties about mental states (functions). Fear, desire, love, hunger, etc., feel as real to us as sight and sound, though we recognize them as abstract constructions of the mind. As with sensory qualia, we can recall emotions, but again, the feeling is less vivid.

Even more than our senses, we identify our conscious selves with our ability to think. We can tell that our thoughts are happening inside our heads, and not, say, in our hearts. It is common knowledge that our brains are in our heads and brains think1, so this impression is a well-supported fact, but why do we feel it? Let’s say we call this awareness of our brains “encephaloception”. It is a subset of proprioception (our sense of where the parts of our body are), but also draws on other somatosenses like pain, touch, and pressure. Encephaloception pinpoints our thoughts in our heads because we need to know the impact that pain, motion, balance, etc. have on our ability to think. Of course, our senses of vision and hearing are close to the brain, which further enhances the feeling that we think with our brains, but we can tell without seeing or hearing.

But what is thinking? Loosely speaking, it is the union of everything we feel happening in our heads. We mostly consider thinking to be something we do rather than something that happens to us, but the difference is rather subtle and will be the subject of a later section on free will. For now let’s just think of doing and happening as the same sort of thing. We experience a continuous train of images, events, words, and other mental constructs flowing together in a connected way that creates what feels like internal and external worlds, though we know they are entirely imaginary. With little conscious effort, it feels like we are directing our own movie. Our minds just match what we see to similar things and situations we have seen before and get a feel for what will probably happen next based on how much familiarity we have. While most of our thinking involves the sensory and event-based thoughts that comprise the mind’s real world, we also have other ways to think, notably through language, memory, and logical reasoning. Our innate gift for language lets us form trains of thought that are entirely abstracted from senses and events. Our ability to remember things relevant to our thoughts gives us an intuitive capacity to free-associate in useful ways. Though intuition can be quite powerful, it essentially amounts to making good use of our associative memory. We just prod our memories to recall patterns or strategies related to what we need, and either right away or after a while useful recollections materialize in bursts of intuition. Finally, we can think logically, chaining ideas together based on logical relationships rather than just senses, language, and intuition. One more critical thinking skill is learning, which is the review and assessment of feedback to discover more useful patterns. Focused learning in one domain over months and years results in mastery, which is a combination of knowledge and motor skills that gives us expertise with relatively little conscious thought.

I’ve listed some different kinds of thinking, but that still doesn’t tell us what they are. We can feel, remember, and learn from our sensory perception of the world, but we can’t break it down subjectively. Our senses and memories either just happen or feel like they do our bidding, but we can’t explain how they work subjectively. But we can at least partially break down two of our subjective abilities: reasoning and language. We feel we have access to reasons, and to rules of logic for manipulating those reasons, which we use to help us reach decisions. We carry a large number of mental models in our heads which describe simplified situations in which idealized objects interact according to logical rules, and we are always trying to apply these models to real-world situations that seem like good fits for them. When we find a good fit, we then believe that the implications we see in the mental model will also hold in the real world so long as the fit remains good. All of our cause-and-effect understanding of the world derives from this kind of logical modeling. We roughly know from common knowledge how we reason logically because we do this reasoning entirely consciously. We could not do it at all without great subconscious support from recognition, sensory and/or linguistic capacities, learning, and our natural facility to project models in our heads. But given that support, the final product, logical reasoning, is entirely conscious, and anyone can explain their lines of reasoning.
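Since this fit-then-trust pattern is something we do consciously, it is easy to caricature in code. The sketch below, with invented models and feature sets, shows the idea: if a situation satisfies a model’s idealized premises, we provisionally believe the model’s implications for as long as the fit holds.

```python
# Sketch of applying mental models: check that a real situation fits a
# model's idealized premises, then trust its implications while the fit
# holds. The models and situations here are invented placeholders.

models = {
    "LEVER": {
        "premises": {"rigid_beam", "fixed_pivot"},
        "implication": "small force far from pivot moves a big load",
    },
    "TRADE": {
        "premises": {"two_agents", "each_values_others_goods_more"},
        "implication": "both sides gain from the exchange",
    },
}

def fit_and_predict(situation_features):
    """Return the implications of every model whose premises the situation meets."""
    predictions = []
    for name, model in models.items():
        if model["premises"] <= situation_features:  # a good enough fit?
            predictions.append((name, model["implication"]))
    return predictions

seesaw = {"rigid_beam", "fixed_pivot", "two_agents"}
print(fit_and_predict(seesaw))
# [('LEVER', 'small force far from pivot moves a big load')]
```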

Most of language just appears in our heads as we need it, but we can define every word in terms of other words. It is common knowledge that words are well-defined and are not circular and fuzzy. But how can this be? Since I’m sticking to common knowledge and not getting too technical, I’ll just say that we feel we know the concepts the words represent, and we know that dictionary definitions rather accurately describe those concepts to the appropriate degree of detail to explain them. We further know, though we may not realize it, that every definition is either physical or functional but not both. Physical things are ultimately only knowable through our senses, so they break down into ways we can sense the objects. Functional things are ultimately only knowable through what they can do, so they break down into capacities of the objects. This linguistic division alone essentially proves that our world has a form and function dualism. But for both physical and functional things, words are functional entities — they are phenomena through which we can refer to the noumena — so the definitions and the words are tools we can use to achieve our functional aims. So we differentiate physical things from each other only because it is helpful to us for functional reasons to do so, not because there is any intrinsic reason to draw those lines. Ultimately, words rest on inexplicable subconscious support, which is either a sensory basis for physical things or a functional basis for functional things. That functional basis is ultimately ineffable: we distinguish methods that work from those that don’t based on experience. We can articulate specific sets of logical rules in formal systems that are perfectly precise and certain, but they are not functional. Function requires an application, and application requires fitting, which is invariably approximate, and approximate is not certain. Though it can’t be explained at the lowest level logically, function can be explained as the consequence of finding patterns in data, fitting them to situations, and assessing the amount of function that results. In our brains, this happens subconsciously and so is beyond our ability to explain via common knowledge, but we know it when we see it.

2. Form & Function Dualism: things and ideas exist

We can’t study anything without a subject to study. What we need first is an ontology, a doctrine about what kinds of things exist. We are all familiar with the notion of physical existence, and so to the extent we are referring to things in time and space that can be seen and measured, we share the well-known physicalist ontology. Physicalism is an ontological monism, which means it says just one kind of thing exists, namely physical things. But is physicalism a sufficient ontology to explain the mind? Die-hards insist it is and must be, and that anything else is new-age nonsense. I am sympathetic to the extent that I agree that mysticism is not explanatory and has no place in science. And we can certainly agree from common knowledge that physical things exist. But we also know that physical things alone don’t yet explain our subjective experience, which is so much more complex than the observed physical properties of the brain would seem to suggest. So we really need to consider whether we can extend science’s reach into the mind without resorting to the supernatural.

We are intimately familiar with the notion of mental existence, as in Descartes’ “I think therefore I am.” Feeling and thinking (as states of mind) seem to us to exist in a distinct way from physical things as they lack extent in space or time. Idealism is the monistic ontology that asserts that only mental things exist, and what we think of as physical things are really just mental representations. In other words, we dream up reality any way we like. But science and our own experience offer overwhelming evidence of a persistent physical reality that doesn’t fluctuate in accord with our imagination, which makes pure idealism untenable. But if we join the two together, we can imagine a dualism of mind and matter with both mental and physical entities that don’t reduce to each other. Religions seized on this idea, stipulating a soul (or something like it) that is distinct from the body. Descartes also promoted dualism, but he got into trouble identifying the mechanism: he guessed that the brain had a special mental substance that did the thinking, a substance that could in principle be separated from the body. Descartes imagined the two substances somehow interacted in the pineal gland. But no such substance was ever found and the pineal gland’s primary role is to make melatonin, which helps regulate sleep.

We know from science that the brain works using physical laws with nothing supernatural added, so we need an explanation of the mind bound by that constraint. While Descartes’ substance dualism doesn’t deliver, two other forms of dualism have been proposed. Property dualism tries to separate mind from matter by asserting that mental states are nonphysical properties of physical substances (namely brains). This misses the mark, too, because it suggests a direct or inherent relationship between mental states and the physical substance that holds the state (the brain), and, as we will see, this relationship is not direct. It is like saying software is a non-physical property of hardware. But while software runs on hardware, the hardware reveals nothing about what the software is meant to do. Predicate dualism proposes that predicates, being any subjects of conversation, are not reducible to physical explanations and so constitute a separate kind of existence. I will demonstrate that this is true and so hold that predicate dualism is the correct ontology science needs, but I am rebranding it as form and function dualism (just why is explained below). Sean Carroll writes,2

“Does baseball exist? It’s nowhere to be found in the Standard Model of particle physics. But any definition of “exist” that can’t find room for baseball seems overly narrow to me.”

Me too. Baseball encompasses everything from an abstract set of rules to a national pastime to specific sporting events featuring two baseball teams. Some of these have a physical corollary and some don’t, but the physical part isn’t the point. A game is an abstraction about possible outcomes when two sides compete under a set of rules. “Three” is an abstraction of quantity, “red” of color, “happy” of emotion. Quantity is an abstraction of groups, color of light frequency, brightness and context, and emotion of experienced mental states. Even common physical items can be rather abstract. Water is the liquid that comprises lakes, oceans, and rain, even though all have dissolved solids, with water from oceans having up to 3.5% salts. Diet soda has far less aspartame (0.05%), yet we would never call it water. So whether we use the word water depends on the functional impact of the dissolved solids — if no impact, then it still counts as plain water. Seawater counts as water for many purposes, just notably not for hydrating plants or animals.

So why don’t I like the term predicate dualism? The problem is that it suggests that because propositional attitudes can’t be eliminated from explanations of the mind, they are also irreducible, but that is not true. They are readily reduced to simpler functional entities. Let’s take a quick look at how that happens. Brains work within physical laws, but are different from rock slides or snowstorms because they manage information. Information is entirely natural and can be managed by a physical system that can systematically leverage feedback. Living things are systems capable of doing this.3 Genes can’t manage real-time information, but brains, a second-order living information management system, can. We don’t really need to posit souls or minds to get started; we only need to focus on information, which is another way of saying capacity or function. So I prefer form and function dualism to predicate dualism because it more clearly describes the two kinds of things that exist. Function is much bigger than predicates, which are items that can be affirmed or denied. Information is broader than simple truth or falsity and includes any patterns which can be leveraged to achieve function. For example, while predicates are the subjects (and objects) of logical reasoning, function includes not just these active elements that can be manipulated by logical reasoning but also passive forms, like the capacities imbued in us by evolution, instinct, and conditioning. These are mechanisms and behaviors that have been effective in past situations. Evolution established fins, legs, and wings mostly for locomotion. Animals don’t need to know the details so long as they work, but the selection pressures are on function, not form. However, we can actively reason out the passive function of wings to derive principles that help us build planes. Some behaviors originally established with reason, like tying shoelaces, can be executed passively (on autopilot) without active use of predicates or reasoning. So we should more generally think of this nonphysical existence as a capacity for doing things rather than as yes-or-no predicates.

This diagram shows how form and function dualism compares to substance dualism and several monisms. These two perspectives, form and function, are not just different ways of viewing a subject, but define different kinds of existences. Physical things have form, e.g. in spacetime, or potentially in any dimensional state in which they can have an extent. Physical systems that leverage information are no longer just physical but physical and functional systems. Function has no extent but is instead measured in terms of its predictive power. Evolution uses feedback to refine algorithms (e.g. catalysis and pattern-matching) to increase their functionality. The higher-order information management systems found in brains use real-time feedback to accelerate the development of functionality. Although information management systems make function possible in an otherwise physical world, form and function can’t be reduced to each other. I show them as planes with a line of intersection not because they meet in the pineal gland but because there are relationships between them. Physical information management systems allow functional entities to operate using physical mechanisms. These entities are not in the physical universe because they are not physical, but they control the behavior of physical systems and so change the physical universe. Viewed the other way, we create the physical world in our minds by modeling it via phenomena. We never know the actual form of physical things (i.e. their noumena) but only our interpretation of them (i.e. their phenomena), so to our minds the physical world is primarily a functional construct and only secondarily physical. So the physical world is capable of simulating some measure of function, and the functional world is capable of simulating some measure of form.

As I have noted, the uniformity of nature gives an otherwise physical universe the capacity to develop functional entities through feedback, so our universe is not strictly just physical because life has unleashed function into it. For this reason, function can be said to emerge from form, meaning that certain interactions of forms make function “spring” into existence with new capabilities not otherwise present in forms. It isn’t magic; it just results from the fact that patterns can be used to predict what will happen next in a uniform universe, and competitive feedback systems leverage those patterns to survive. Living things are still physical, but the function they manage is not. Function can be said to exist in an abstract, timeless, nonphysical sense independent of whether it is ever implemented. This is true because an idea is not made possible because we think it; it is “out there” waiting to be thought whether we think it or not.

Genes can only capture information gathered from feedback across generations of life cycles. This can lead to instinctive support for some complex mental behaviors, like dam-building in beavers, but it can’t manage information in real time. Brains do gather information in real time, and learning commits it to memory to let them surpass their instinctive behavior. Humans can apply information in arbitrarily abstract ways, which could, in principle, let them think any thought or attain any function. Our own brains are, of course, heavily constrained by their architecture, and any artificial brain we build would still have physical constraints, so we can’t, in practice, think just anything. Across the infinite range of possible functions we can only access a smaller, but still infinite, set.

So the problem with physicalism as it is generally presented is that form is not the only thing a physical universe can create; it can create form and function, and function can’t be explained with the same kind of laws that apply to form but instead needs its own set of rules. If physicalism had just included rules for both direct and abstract existence in the first place, we would not need to have this discussion. But instead, it was (inadvertently) conceived to exclude an important part of the natural world, the part whose power stems from the fact that it is abstracted away from the natural world. This is ironic, considering that scientific explanation (and all explanation) is itself immaterial function and not form. How can science see both the forest and the trees if it won’t acknowledge the act of looking?

[Image: Magritte’s pipe painting, “Ceci n’est pas une pipe”]

A thought about something is not the thing itself. “Ceci n’est pas une pipe,” as Magritte said4. The phenomenon is not the noumenon, as Kant would have put it: the thing-as-sensed is not the thing-in-itself. If it is not the thing itself, what is it? Its whole existence is wrapped up in its potential to predict the future; that is it. However, to us, as mental beings, it is very hard to distinguish phenomena from noumena, because we can’t know the noumena directly. Knowledge is only about representations, and isn’t and can’t be the physical things themselves. The only physical world the mind knows is actually a mental model of the physical world. So while Magritte’s picture of a pipe is not a pipe, the image in our minds of an actual pipe is not a pipe either: both are representations. And what they represent is a pipe you can smoke. What this critically tells us is that we don’t care about the pipe; we only care about what the pipe can do for us, i.e. what we can predict about it. Our knowledge was never about the noumenon of the pipe; it was only about the phenomena that the pipe could enter into. In other words, knowledge is about function and only cares about form to the extent it affects function. We know that physical things have a provable physical existence — that the noumena are real — it is just that our knowledge of them is always mediated through phenomena. Our minds experience phenomena as a combination of passive and active information, where the passive work is done for us subconsciously, finding patterns in everything, and the active work is our conscious train of thought applying abstracted concepts to whatever situations seem to be good matches for them.

Given the foundation of form and function dualism, what can we now say distinguishes the mind from the brain? I will argue that the mind is a process in the brain viewed from its role of performing the active function of controlling the body. That’s a mouthful, so let me break it down. First, the mind is not the brain but a process in the brain. Technically, a process is any series of events that follows some kind of rules or patterns, but in this case I am referring specifically just to the information managing capabilities of the brain as mediated by neurons. We don’t know quite how they do it, but we can draw an analogy to a computer process that uses inputs and memory to produce outputs. But, as argued before, we are not so concerned with how this brain process works technically as with what function it performs because we now see the value of distinguishing functional from physical existence. Next, I said the mind is about active function. To be clear, we only have one word for mind, but might be referring to several things. Let’s call the “whole mind” the set of all processes in the brain taken from a functional perspective. Most of that is subconscious and we don’t necessarily know much about it consciously. When I talk about the mind, I generally mean just the conscious mind, which consists only of the processes that create our subjective experience. That experience has items under direct focused attention and also items under peripheral attention. It includes information we construct actively and also provides us access to much information that was constructed passively (e.g. via senses, instinct, intuition, and recollection). The conscious mind exists as a distinct process from the whole mind because it is an effective way for animals to make the kinds of decisions they need to make on a continuous basis.

3. The nature of knowledge: pragmatism, rationalism and empiricism

Given that we agree to break entities down into form and function, things and ideas, physical and mental, we next need to consider what we can know about them, and what it even means to know something. A theory about the nature of knowledge is called an epistemology. I described the mental world as being the product of information, which is patterns that can be used to predict the future. What if we propose that knowledge and information are the same thing? Charles Sanders Peirce called this epistemology pragmatism, the idea that knowledge consists of access to patterns that help predict the future for practical uses. As he put it, pragmatism is the idea that our conception of the practical effects of the objects of our conception constitutes our whole conception of them. So “practical” here doesn’t mean useful; it means usable for prediction, e.g. for statistical or logical entailment. Practical effects are the function as opposed to the form. It is just another way of saying that information and knowledge differ from noise to the extent they can be used for prediction. Being able to predict well doesn’t confer certainty like mathematical proofs; it improves one’s chances but proves nothing.

Pragmatism gets a bad rap because it carries a negative connotation of compromise. The pragmatist has given up on theory and has “settled” for the “merely” practical. But the whole point of theory is to explain what will really happen and not simply to be elegant. It is not the burden of life to live up to theory, but of theory to live up to life. When an accepted scientific theory doesn’t exactly match experimental evidence, it is because the experimental conditions are more complex than the theory’s ideal model. After all, the real world is full of imperfections that the simple equations of ideal models don’t take into account. We can potentially model secondary and tertiary effects with additional ideal models and then combine the models and theories to get a more accurate overall picture. However, in real-world situations it is often impractical to build this more perfect overall ideal model, both because the information is not available and because most situations we face include human factors, for which physical theories don’t apply and social theories are imprecise. In these situations pragmatism shines. The pragmatist, whose goal is to achieve the best prediction given real-world constraints, will combine all available information and approaches to do it. This doesn’t mean giving up on theory; on the contrary, a pragmatist will use well-supported theory to the limit of practicality. They will then supplement that with experience, which is their pragmatic record of what worked best in the past, and merge the two to reach a plan of action. Recall that information is the product of both a causative (reasoned) approach and a pattern-analysis (e.g. intuitive) approach. Both kinds of information can be used to build the axioms and rules of a theoretical model. We aspire to causative rules for science because they lead to necessary conclusions, but in their absence we will leverage statistical correlations. We associate subconscious thinking with the pattern-analysis approach, but it also leverages concepts established explicitly with a causative approach. Both our informal and our formal thinking combine causation and pattern analysis at many levels. Because our conscious and subconscious minds work together in a way that appears seamless to us, we are inclined to believe that reasoned arguments are correct and not dependent on subjective (biased) intuition and experience. But we are strongly wired to think in biased ways, not because we are fundamentally irrational creatures but because biased thinking is often a more effective strategy than unbiased reason. We are both irrational and rational because both help in different ways, but we have to spot and overcome irrational biases or we will make decisions that conflict with our own goals. All of our top-level decisions have to strike a balance between intuition/experience-based (conservative) thinking and reasoned (progressive) thinking. Conservative methods let us act quickly and confidently so we can focus our attention on other problems. Progressive methods slow us down by casting doubt, but they reveal better solutions. It is the principal role of consciousness to provide the progressive element, to make the call between a tried-and-true or a novel approach to any situation. These calls are always themselves pragmatic, but if in the process we spot new causal links then we may develop new ad hoc or even formal theories, and we will remember these theories along with the amount of supporting evidence they seem to have. Over time our library of theories and their support will grow, and we will draw on them for rational support as needed.

Although pragmatism is necessary at the top level of our decision-making process, where experience and reason come together to effect changes in the physical world, it is not a part of the theories themselves, which exist independently as constructs of the mental (i.e. functional) world. We do have to be pragmatic about what theories we develop and about how we apply them, but since theories represent idealized functional solutions independent of practical concerns, the knowledge they represent is based on a narrower epistemology than pragmatism. But what is this narrower epistemology? After all, it is still the case that theories help predict the future for practical benefits. And Peirce’s definition, that our conception of the practical effects of the objects of our conception constitutes our whole conception of them, is also still true. What is different about theory is that it doesn’t speak to our whole conception of effects, inclusive of our experience, but focuses on causes and effects in idealized systems using a set of rules. Though technically a subset of pragmatism, rule-based systems literally have their own rules and can be completely divorced from all practical concerns, so for all practical purposes they have a wholly independent epistemology based on rules instead of effects. This theory of knowledge is called rationalism, and it holds that reason (i.e. logic) is the chief source of knowledge. Put another way, where pragmatism uses both causative and pattern-analysis approaches to create information, reason only uses the logical, causative approach, though it leverages axioms derived from both causative and pattern-based knowledge. A third epistemology is empiricism, which holds that knowledge comes only or primarily from sensory experience. Empiricism is also a subset of pragmatism; it differs in that it pushes where pragmatism pulls. In other words, empiricism says that knowledge is created as stimuli come in, while pragmatism says it arises as actions and effects go out. The actions and effects do ultimately depend on the inputs, and so pragmatism subsumes empiricism, which is not prescriptive about how the inputs (evidence) might be used. In science, the word empiricism is taken to mean rationalism + empiricism, i.e. scientific theory and the evidence that supports it, so one can say that rationalism is the epistemology of theoretical science and empiricism is the epistemology of applied science.

Mathematics and highly mathematical physical theories are often studied on an entirely theoretical basis, with considerations as to their applicability left for others to contemplate. The study of algorithms is mostly theoretical as well because their objectives are established artificially, so they can’t be faulted for inapplicability to real-world situations. Developing algorithms can’t, in and of itself, explain the mind, because even if the mind does employ an algorithm (or constellation of algorithms), the applicability of those algorithms to the real-world problems the mind solves must be established. But iteratively we can propose algorithms and tune them so that they do align with problems the mind seems to solve. Guessing at algorithms will never reveal the exact algorithm the mind or brain uses, but that’s ok. Scientists never discover the exact laws of nature; they only find rules that work in all or most observed situations. What we end up calling an understanding or explanation of nature is really just a framework of generalizations that helps us predict certain kinds of things. Arguably, laws of nature reveal nothing about the “true” nature of the universe. So it doesn’t matter whether the algorithms we develop to explain the mind have anything to do with what the mind is “actually” doing; to the extent they help us predict what the mind will do they will provide us with a greater understanding of it, which is to say an explanation of it.

Because proposing algorithms, or outlines of potential algorithms, and then testing them against empirical evidence is entirely consistent with the way science is practiced (i.e. empiricism), this is how I will proceed. But we can’t just propose algorithms at random; we will need a basis for establishing appropriate artificial objectives, and that basis has to be related to what it is we think minds are up to. This is exactly the feedback loop of the scientific method: propose a hypothesis, test it, and refine it ad infinitum. The available evidence informs our choice of solution, and the effectiveness of the solution informs how we refine or revise it. From the high level at which I approach this subject in this book, I won’t need to be very precise in saying just how the algorithms work because that would be premature. All we can do at this stage is provide a general outline for what kinds of skills and considerations are going into different aspects of the thought process. Once we have come to a general agreement on that, we can start to sweat the details.
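That feedback loop can be summarized as a skeleton, sketched below in Python. Every function body is deliberately left as a placeholder (the evidence format, the scoring, and the revision strategy are all open questions); only the shape of the propose-test-refine cycle matters here.

```python
# Skeleton of the propose-test-refine loop described above. Every function
# here is a placeholder; the point is only the shape of the feedback cycle.

def propose(evidence, previous=None):
    """Form or revise a candidate algorithm from the available evidence."""
    ...

def test(candidate, evidence):
    """Score how well the candidate's predictions match observed behavior."""
    ...

def refine_until_adequate(evidence, adequate=0.95, max_rounds=100):
    candidate = propose(evidence)
    for _ in range(max_rounds):
        score = test(candidate, evidence)
        if score >= adequate:
            break
        candidate = propose(evidence, previous=candidate)  # revise and retry
    return candidate  # never the "true" algorithm, just a useful predictor
```

The loop never terminates with the mind’s “actual” algorithm; it halts when the candidate predicts well enough, which is all the preceding paragraphs ask of an explanation.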

While my approach to the subject will be scientifically empirical, we need to remember that the mind itself is primarily pragmatic and only secondarily capable of reason (or intuition) to support that pragmatism. So my perspective for studying the mind is not itself the way the mind principally works. This isn’t a problem so long as we keep it in mind: we are using a reasoned approach to study something that itself uses a highly integrated combination of reason and intuition (basically causation and pattern). It would be disingenuous to suggest that I have freed myself of all possible biases in this quest and that my conclusions are perfectly objective; even established science can never be completely free of biases. But over time science can achieve ever more effective predictive models, which meet the ultimate standard for objectivity: can results be duplicated? The hallmark of objectivity, though, is not its measure but its methods: logic and reason. The conclusions one reaches through logic using a system of rules built on postulates can be provably true, contingent on the truth of the postulates, which makes logic a very powerful tool. Although postulates are true by definition from the perspective of the logical model that employs them, they have no absolute truth in the physical world, because our direct knowledge of the physical world is always based on evidence from individual instances and not on generalities across similar instances. So truth in the physical world (as we see it from the mental world) is always a matter of degree, the degree to which we can correlate a given generality to a group of phenomena. That degree depends both on the clarity of the generalization and on the quality of the evidence, and so is always approximate at best, but can often be close enough to a perfect correlation to be taken as truth (for practical purposes). Exceptions to such truths are often seen more as “shortcomings of reality” than as shortcomings of the truth, since truth (like all concepts) exists more in a functional sense than in the sense of having a perfect correlation to reality.

But how can we empirically approach the study of the mind? If we can accept the idea that the mind is principally a functional entity, it is largely pointless to look for physical evidence of its existence, beyond establishing the physical mechanism (the brain) that supports it. This is because physical systems can make information management possible but can’t explain all the uses to which the information can be put, just as understanding the hardware of the internet doesn’t say anything about the information flowing through it. We must instead look at the functional “evidence.” We can never get direct evidence, being facts or physical signs, of function (because function has no form), so we either need to look at physical side effects or develop a way to see “evidence” of function directly, independent of the physical. Behavior provides the clearest physical evidence of mental activity, but our more interesting behavior results from complex chains of thought and can’t be linked directly to stimulus and response. Next, we have personal evidence of our own mind from our own experience of it. This evidence is much more direct than behavioral evidence but has some notable shortcomings as well. Introspection has a checkered past as a tool for studying the mind. Early hopes that introspection might be able to qualitatively and quantitatively describe all conscious phenomena were overly optimistic, largely because they misunderstood the nature of the tool. Our conscious minds have access to information based both on causation and pattern analysis, but our conscious awareness of this information is filtered through an interpretive layer that generalizes the information into conceptual buckets. So these generalized interpretations are not direct evidence but, like behavior, are downstream effects of information processing. Even so, our interpretations can provide useful clues even if they can’t be trusted outright. Freud was too quick to attach significance to noise in his interpretation of dreams, as we have no reason to assume that the content of dreams serves any function. Many activities of the mind do serve a function, however, so we can study them from the perspective of those functions. As the conscious mind makes a high-level decision, it will access functionally relevant information packaged in a form that the conscious subprocess can handle, which is at least partially in the form of concepts or generalizations. These concepts are the basis of reason (i.e. rationality), so to the extent our thinking is rational, our interpretation of how we think is arguably exactly how we think (because we are conscious of it). But that extent is never exact or complete, both because our concepts draw on a vast pool of subconscious information which heavily colors how we use them, and because we also rely on subconscious data-analysis algorithms (most notably memory/recognition). For both of these reasons, any conscious interpretation will only be approximate and may cause us to overlook or completely misinterpret our actual motivations (which we may even have further motivations to suppress).

While both behavior and introspection can provide evidence that can suggest or support models of the mind, they are pretty indirect and can’t provide very firm support for those models. But another way to study function is to speculate about what function is being performed. Functionalism holds that the defining characteristics of mental states are the functions they bring about, quite independent of what we think about those functions (introspectively) or whether we act on them (behaviorally). This is the “direct” study of function independent of the physical to which I alluded. Speculation as to function, aka the study of causes and effects, is an exercise in logic. It depends on setting up an idealized model with generalized components that describes a problem. These components don’t exist physically but are exemplars that embody only those properties of their underlying physical referents that are relevant to the situation. Given the existence of these exemplars (including their associated properties) as postulates, we can then reason about what behavior we can expect from them. Within such a model, function can be understood very well or even perfectly, but it is never our expectation that these models will align perfectly with real-world situations. What we hope for is that they will match well enough that predictions made using the model will come true in the real world. Our models of the functions of mental states won’t exactly describe the true functions of those mental states (if we could ever discover them), but they will still be good explanations of the mind if they are good at predicting the functions our minds perform.

Folk explanations differ from scientific explanations in the breadth and reliability of their predictive power. While there are unlimited folk perspectives we can concoct to explain how the mind works, all of which will have some value in some situations, scientific perspectives (theories) seek a higher standard. Ideally, science can make perfect predictions, and in many physical situations it nearly does. Less ideally, science should at least be able to make predictions with odds better than chance. The social sciences usually have to settle for such a reduced level of certainty because people, and the circumstances in which they become involved, are too complex for any idealized model to describe. So how, then, can we distinguish bona fide scientific efforts in matters involving minds from pseudoscience? I will investigate this question next.

4. What Makes Knowledge Objective?

It is easier to define subjective knowledge than objective knowledge. Subjective knowledge is anything we think we know, and it counts as knowledge as long as we think it does. We set our own standard. It starts with our memory; a memory of something is knowledge of it. Our minds don’t record the past for its own sake but for its potential to help us in the future. From past experience we have a sense of what kinds of things we will need to remember, and these are the details we are most likely to commit to memory. This bias aside, our memory of events and experiences is fairly automatic and has considerable fidelity. The next level of memory is of our reflections: thoughts we have had about our experiences, memories, and other thoughts. I call these two levels of memory and knowledge detailed and summary. There is no exact line separating the two, but details are kept as raw and factual as possible while summaries are higher-order interpretations that derive uses for the details. It takes some initial analysis, mostly subconscious, to study our sensory data so we can even represent details in a way that we can remember. Summaries are a subsidiary analysis of details and other summary information performed using both conscious (reasoned) and subconscious (intuitive) methods. These details and summaries are what we know subjectively.

We are designed to gather and use knowledge subjectively, so where does objectivity come in? Objectivity creates knowledge that is more reliable and broadly applicable than subjective knowledge. Taken together, reliability and broad applicability account for science’s explanatory power. After all, to be powerful, knowledge must both fit the problem and do so dependably. Objective approaches let us create physical and social technologies that manage goods and services to high standards. How can we create objective knowledge that can do these things? As I noted above, it’s all about the methods. Not all methods of gathering information are equally effective. Throughout our lives, we discover better ways of doing things, and we will often use these better ways again. Science makes more of an effort to identify and leverage methods that produce better information, i.e. information with reliability and broad applicability. These methods are collectively called the “scientific method”. It isn’t one method but an evolving set of best practices. These practices are only intended to bring some order to the pursuit and do not presume to cover everything; in particular, they say nothing of the creative process, nor do they seek to constrain the flow of ideas. The scientific method is a technology of the mind, a set of heuristics to help us achieve more objective knowledge.

The philosophy of science is the conviction that an objective world independent of our perceptions exists and that we can gain an understanding of it that is also independent of our perceptions. Though it is popularly thought that science reveals the “true” nature of reality, it has been and must always be a level removed from reality. An explanation or understanding of the world will always be just one of many possible descriptions of reality and never reality itself. But science doesn’t seek a multitude of explanations. When more than one explanation exists, science looks for common ground between them and tries to express them as varying perspectives on the same underlying thing. For example, wave-particle duality allows particles to be described as both particles and waves. Both descriptions work and provide explanatory power, even though we can’t imagine macroscopic objects being both at the same time. We are left with little intuitive feel for the nature of reality, which serves to remind us that the goal of objectivity is not to see what is actually there but to gain the most explanatory power over it that we can. The canon of generally accepted scientific knowledge at any point in time will be considered charming, primitive, and not terribly powerful when looked back on a century or two later, but this doesn’t diminish its objectivity or claim on success.

That said, the word “objectivity” hints at certainty. While subjectivity acknowledges the unique perspective of each subject, objectivity is ostensibly entirely about the object itself, its reality independent of the mind. If an object actually did exist, any direct knowledge we had of it would then remain true no matter which subject viewed it. This goal, knowledge independent of the viewer, is admirable but unattainable. Any information we gather about an object must always ultimately depend on observations of it, either with our own senses or using instruments we devise. And no matter how reliable that information becomes, it is still just information, which is not the object itself but only a characterization of traits with which we ultimately predict behavior. So despite its etymology, we must never confuse objectivity with “actual” knowledge of an object, which is not possible. Objectivity only characterizes the reliability of knowledge based on the methods used to acquire it.

With those caveats out of the way, a closer look at the methods of science will show how they work to reduce the influence of personal opinion and maximize the likelihood of reliable reproduction of results. Below I list the principal components of the scientific method, from most to least helpful (approximately) in establishing its mission of objectivity.

    1. The refinement of hypotheses. This cornerstone of the scientific method is the idea that one can propose a rule describing how kinds of phenomena will occur, and that one can test this rule and refine it to make it more reliable. While it is popularly thought that scientific hypotheses are true until proven otherwise (i.e. falsified, as Karl Popper put it), we need to remember that the product of objective methods, including science, is not truth but reliability5. It is not so much that laws are true or can be proven false as that they can be relied on to predict outcomes in similar situations. The Standard Model of particle physics purports (with considerable success) that any two subatomic particles of the same kind are identical for all predictive purposes except for occupying a different location in spacetime6. Maybe they are identical (despite this being impossible to prove), and this helps account for the many consistencies we observe in nature. But location in spacetime is a big wrinkle. The three-body problem remains insoluble in the general case, and solving for the movements of all astronomical bodies in the solar system is considerably more so. Predictive models of how large groups of particles will behave (e.g. for climate) will always just be models for which reliability is the measure and falsifiability is irrelevant. Also, in most real-world situations many factors limit the exact alignment of scientific theory to circumstances, e.g. impurities, the ability to acquire accurate data, and subsidiary effects beyond the primary theory being applied. Even so, by controlling the conditions adequately, we can build many things that work very reliably under normal operating conditions. Some aspects of mental function will prove to be highly predictable while others will be more chaotic, but our standard for scientific value should still be explanatory power.
    2. Scientific techniques. This most notably includes measurement via instrumentation rather than use of the senses. Instruments are inherently objective in that they can’t have a bias or opinion regarding the outcome, at least to the extent they are mechanical and don’t employ computer programs into which biases may have been unintentionally embedded. However, they are not completely free from biases or errors in how they are used, and the reliability of any instrument degrades near the limits of its operating specifications. Scientific techniques also include a wide variety of practices that have been demonstrated to be effective and are written up into standard protocols in all scientific disciplines to increase the chances that results can be replicated by others, which is ultimately the objective of science.
    3. Critical thinking. I will define critical thinking here without defense, as a defense requires a more detailed understanding of the mind than I have yet provided. Critical thinking is an effort to employ objective methods of thought with proven reliability while excluding subjective methods known to be more susceptible to bias. Next, I distinguish five of the most significant components of critical thinking:

3a. Rationality. Rationality is, in my theory of the mind, the subset of thinking concerned with applying causality to concepts, aka reasoning. As I noted in The Mind Matters, thinking and the information that is thought about divide into two camps: reason, which manages information derived using a causative approach, and intuition, which manages information derived using a pattern-analysis approach. Both approaches are used to some degree for almost every thought we have, but it is often useful to focus on one of them as the sole or predominant one for the purpose of analysis. The value of the rational approach over the intuitive is its reproducibility, which is the primary objective of science and the knowledge it seeks to create. Because rational techniques can be written down to characterize both starting conditions and all the rules and conclusions they imply, they have the potential to be very reliable.

3b. Inductive reasoning. Inductive reasoning extrapolates patterns from evidence. While science seeks causative links, it will settle for statistical correlations if it has to. Newton used inductive reasoning to posit gravity; Einstein’s theory of general relativity later supplied a cause, a deformation of spacetime geometry.

3c. Abductive reasoning. Abductive reasoning seeks the simplest and most likely explanations, which is a pattern matching heuristic that picks kinds of matches that tend to work out best. Occam’s Razor is an example of this often used in science: “Among competing hypotheses, the one with the fewest assumptions should be selected”.

3d. Open-mindedness. Closed-mindedness means having a fixed strategy to deal with any situation. It enables a confident response in any circumstance but works badly when the strategy is used beyond the conditions it was designed to handle. Open-mindedness is an acceptance of the limitations of one’s knowledge along with a curiosity about exploring those limitations to discover better strategies. While everyone must be open-minded in situations where ignorance is unavoidable, one hopes to develop sufficient mastery over most situations one encounters to be able to act confidently in a closed-minded way without fear of making a mistake. While this is often possible, the scientist must remember that perfect knowledge is unattainable and must stay alert for possible cracks in that knowledge. These cracks should be explored with objective methods to discover more reliable knowledge and strategies than one might already possess. By acknowledging the limits and fallibility of its approaches and conclusions, science can criticize, correct, and improve itself. Thus, more than just a bag of tricks to move knowledge forward, it is characterized by a willingness to admit to being wrong.

3e. Countering cognitive biases. More than just prejudice or closed-mindedness, cognitive biases are subconscious pattern-analysis algorithms that usually work well for us but are less reliable than objective methods. The insidiousness of cognitive biases was first exposed by Tversky and Kahneman in their 1971 paper, “Belief in the law of small numbers”7,8. Cognitive biases use pattern analysis to lead us to conclusions based on correlations and associations rather than causative links. They are not simply inferior to objective methods, because they can account for indirect influences that objective methods can overlook. But robust causative explanations are always more reliable than associative explanations, and in practice they tend to be right where biases are wrong (where “right” and “wrong” are taken not as absolutes but as expressions of very high and low reliability). A small simulation following this list illustrates the law of small numbers at work.

    4. Peer review. Peer review is the evaluation of a scientific work by one or more people of similar competence to assess whether it was conducted using appropriate scientific standards.
    5. Credentials. Academic credentials attest to the completion of specific education programs. Titular credentials, publication history, and reputation add to a researcher’s credibility. While no guarantee, credentials help establish an author’s scientific reliability.
    6. Pre-registration. A recently added best practice is pre-registration, which clears a study for publication before it has been conducted. This ensures that the decision to publish is not contingent on the results, which would be biased9.
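To make item 3e concrete, here is a small Python simulation of the bias Tversky and Kahneman named: trusting small samples. The sample sizes, thresholds, and trial counts below are arbitrary choices for illustration, not figures from their paper.

```python
# Illustrative simulation: how often does a fair coin look "unfair"
# (at least 70% heads or 70% tails) at different sample sizes?

import random

def lopsided_rate(sample_size, trials=10_000):
    """Fraction of trials in which a fair coin yields a lopsided sample."""
    lopsided = 0
    for _ in range(trials):
        heads = sum(random.random() < 0.5 for _ in range(sample_size))
        if heads >= 0.7 * sample_size or heads <= 0.3 * sample_size:
            lopsided += 1
    return lopsided / trials

for n in (10, 100, 1000):
    print(f"n = {n:4d}: lopsided samples {lopsided_rate(n):.1%}")

# Roughly a third of 10-flip samples look lopsided, while 1000-flip samples
# essentially never do. Intuition expects small samples to resemble the
# population far more than they actually do -- the "law of small numbers".
```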

The physical world is not itself a rational place, because reason has a functional existence, not a physical existence. So rational understanding, and consequently what we think of as truth about the physical world, depends on the degree to which we can correlate a given generality to a group of phenomena. But how can we expect a generality (i.e. hypothesis) that worked for some situations to work for all similar situations? As I noted above, the Standard Model holds that any two subatomic particles of the same kind are identical for all predictive purposes except for location in spacetime10, yet that location is a big wrinkle: the three-body problem remains insoluble in the general case, and predictive models of how large groups of particles will behave will always just be models for which reliability is the measure and falsifiability is irrelevant. Particles are not simply free-moving; they clump into atoms and molecules in pretty strict accordance with laws of physics and chemistry that have been elaborated pretty well. Macroscopic objects in nature, or manufactured to serve specific purposes, seem to obey many rules with considerably more fidelity than free-moving weather systems, a fact upon which our whole technological civilization depends. Still, in most real-world situations many factors limit the exact alignment of scientific theory to circumstances, e.g. impurities, the ability to acquire accurate data, and subsidiary effects beyond the primary theory being applied. Even so, by controlling the conditions adequately, we can build many things that work very reliably under normal operating conditions. The question I am going to explore in this book is whether scientific, rational thought can be successfully applied to function and not just form, and specifically to the mental function comprising our minds. Are some aspects highly predictable while others remain chaotic?

We have to keep in mind just how much we take the correlation of theory to reality for granted when we move above the realm of subatomic particles. No two apples are alike, nor any two gun parts, though Eli Whitney’s success with interchangeable parts has led us to think of them as being so. They are interchangeable once we slot them into a model or hypothesis, but in reality any two macroscopic objects have many differences between them. A rational view of the world breaks down as the boundaries between objects become unclear and imperfections mount. Is a blemished or rotten apple still an apple? What about a wax apple or a picture of an apple? Is a gun part still a gun part if it doesn’t fit? A hypothesis that is completely logical and certain will still have imperfect applicability to any real-world situation because the objects that comprise it are idealized, and the world is not ideal. But still, in many situations this uncertainty is small, often vanishingly small, which allows us to build guns and many other things that work very reliably under normal operating conditions.

How can we mitigate subjectivity and increase objectivity? More observations from more people help, preferably with instruments, which are much more accurate and bias-free than senses. This addresses evidence collection, but it is not so easy to increase objectivity over strategizing and decision-making. These are functional tasks, not matters of form, and so are fundamentally outside the physical realm and not subject to observation. Luckily, formal systems follow internal rules and not subjective whims, so to the degree we use logic we retain our objectivity. But this can only get us so far because we still have to agree on the models we are going to use in advance, and our preference for one model over another ultimately has subjective aspects. To the degree we use statistical reasoning we can improve our objectivity by using computers rather than innate or learned skills. Statistical algorithms exist that are quite immune to preference, bias, and fallacy (though again, deciding what algorithm to use involves some subjectivity). But we can’t yet program a computer to do logical reasoning on a par with humans. So we need to examine how we reason in order to find ways to be more objective about it so we can be objective when we start to study it. It’s a catch-22: we have to understand the mind before we can figure out how to understand it. If we rush in without establishing a basis for objectivity, then everything we do will be a matter of opinion.

While there is no perfect formal escape from this problem, we informally overcome this bootstrapping problem with every thought through the power of assumption. An assumption, called a proposition in logic, is an unsupported statement which, if taken to be true, can support other statements. All models are built using assumptions. While the model will ultimately only work if the assumptions are true, we can build the model and start to use it in the hope that the assumptions will hold up. So can I use a model of how the mind works built on the assumption that I was being objective to then establish the objectivity I need to build the model? Yes. The approach is a bit circular, but that isn’t the whole story. Bootstrapping is superficially impossible, but in practice it is just a way of building up a more complicated process through a series of simpler processes: “at each stage a smaller, simpler program loads and then executes the larger, more complicated program of the next stage”. In our case, we need to use our minds to figure out our minds, which means we need to start with some broad generalizations about what we are doing and then start using those, then move to a more detailed but still agreeable model and start using that, and so on. So yes, we can only start filling in the details, even regarding our approach to studying the subject, by establishing models and then running them. While there is no guarantee this will work, we can be guaranteed it won’t work if we don’t go down this path. And while it is not provably correct, nothing in nature can be proven; all we can do is develop hypotheses and test them. By iterating on the hypotheses and expanding them with each pass, we bootstrap them to greater explanatory power. Looking back, I have already done the first (highest-level) iteration of bootstrapping by endorsing form & function dualism and the idea that the mind consists of processes that manage information. For the next iteration, I will propose an explanation for how the mind reasons, which I will then use to support arguments for achieving objectivity.

So then, from a high level, how does reasoning work? I presume a mind that starts out with some innate information processing capabilities and a memory bank into which experience can record learned information and capabilities. The mind is free of memories (a blank slate) when it first forms but is hardwired with many ways to process information (e.g. senses and emotions). Because our new knowledge and skills (stored in memory) build on what came before, we are essentially continually bootstrapping ourselves into more capable versions of ourselves. I mention all this because it means that the framework with which we reason is already highly evolved even from the very first time we start making conscious decisions. Our theory of reasoning has to take into account the influence of every event in our past that changed our memory. Every event that even had a short-term impact on our memory has the potential for long-term effects because long-term memories continually form and affect our overall impressions even if we can’t recall them specifically.

One could view the mind as a morass of interconnected information that links every experience or thought to every other. That view won’t get us very far because it gives us nothing to manipulate, but it is true, and any more detailed views we develop should not contradict it. But on what basis can we propose to deconstruct reasoning if the brain has been gradually accumulating and refining a large pool of data for many years? On functional bases, of which I have already proposed two, logical and statistical, which I introduced above with pragmatism. Are these the only two approaches that can aid prediction? Supernatural prophecy is the only other way I can think of, but we lack reliable (if any) access to it, so I will not pursue it further. Just knowing that the mind, however it works, uses logical and/or statistical techniques to accomplish its goals gives us a lot to work with. First, it would make sense, and I contend that it is true, that the mind uses both statistical and logical means to solve any problem, using each to the maximum degree it helps. In brief, statistical means excel at establishing the assumptions and logical means at drawing out conclusions from the assumptions.

While we can’t yet say how neurons make reasoning possible, we can say that reasoning uses statistics and logic, and from our knowledge of the kinds of problems we solve and how we solve them, we can see more detail about what statistical and logical techniques we use. Statistically, we know that all our experience contributes supporting evidence to generalizations we make about the world. More frequently used generalizations come to mind more readily than lesser-used ones and are sometimes also associated with words or phrases, as with the concept APPLE. An APPLE could be a specimen of fruit of a certain kind, or a reproduction or representation of such a specimen, or an element of metaphor or simile, situations where the APPLE concept helps illustrate something else. We can use innate statistical capabilities to recognize something as an APPLE by correlating the observed (or imagined) aspects of that thing against our large database of every encounter we have ever had with APPLES. It’s a lot of analysis, but we can do it instantly with considerable confidence. Our concepts are defined by the union of our encounters, not by dictionaries. Dictionaries just summarize words, and yet words are generalizations and generalizations are summaries, so dictionaries are very effective because they summarize well. But brains are like dictionaries on steroids; our summaries of the assumptions and rules behind our concepts and models are much deeper and were reinforced by every affirming or opposing interaction we ever had. Again, most of this is innate: we generalize, memorize, and recognize whether we want to or not using built-in capacities. Consciousness plays an important role I will discuss later, but “sees” only a small fraction of the computational work our brains do for us.
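As a toy sketch of that statistical recognition step, the Python below scores an observed feature set against remembered encounters and picks the best-matching concept. The features, the stored encounters, and the Jaccard-overlap scoring are all invented for illustration; the mind’s actual database and correlation machinery are vastly richer.

```python
# Illustrative sketch: recognizing a concept by correlating an observed
# thing's features against stored encounters, as described above.

from collections import Counter

# Each remembered encounter is a set of observed features tagged by concept.
encounters = [
    ({"red", "round", "stem", "sweet"}, "APPLE"),
    ({"green", "round", "stem", "tart"}, "APPLE"),
    ({"red", "round", "wax", "shiny"}, "APPLE"),     # even a wax apple reads as APPLE-ish
    ({"yellow", "curved", "peel", "sweet"}, "BANANA"),
    ({"yellow", "curved", "peel", "spotted"}, "BANANA"),
]

def recognize(observed):
    """Score each concept by its average feature overlap with the observation."""
    scores, counts = Counter(), Counter()
    for features, concept in encounters:
        overlap = len(observed & features) / len(observed | features)  # Jaccard overlap
        scores[concept] += overlap
        counts[concept] += 1
    return max(scores, key=lambda c: scores[c] / counts[c])

print(recognize({"red", "round", "sweet"}))     # -> APPLE
print(recognize({"yellow", "curved", "peel"}))  # -> BANANA
```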

Let’s move on to logical abilities. Logic operates in a formal system, which is a set of assumptions or axioms and rules of inference that apply to them. We have some facility for learning formal systems, such as the rules of arithmetic, but everyday reasoning is not done using formal systems for which we have laid out a list of assumptions and rules. And yet, the formal systems must exist, so where do they come from? The answer is that we have an innate capacity to construct mental models, which are both informal and formal systems. They are informal on many levels, which I will get into, but also serve the formal need required for their use in logic. How many mental models (models, for short) do we have in our heads? Looked at most broadly, we each have one, being the whole morass of all the information we have ever processed. But it is not very helpful to take such a broad view, nor is it compatible with our experience using mental models. Rather, it makes sense to think of a mental model as the fairly small set of assumptions and rules that describe a problem we typically encounter. So we might have a model of a tree or of the game of baseball. When we want to reason about trees or baseball, we pull out our mental model and use it to draw logical conclusions. From the rules of trees, we know trees have a trunk with ever smaller branches branching off, bearing leaves that usually fall off in the winter. From the rules of baseball, we know that an inning ends on the third out. Referring back a paragraph, we can see that models and concepts are the same things: they are generalizations, which is to say they are assessments that combine a set of experiences into a prototype. Though built from the same data, models and concepts take different functional perspectives: models view the data from the inside, as the framework in which logic operates, and concepts view it from the outside, as the generalized meaning it represents.
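Here is an equally minimal sketch of the logical side: a mental model rendered as a tiny formal system, with tree rules as assumptions and forward chaining drawing out their conclusions. The facts and rules are, of course, invented stand-ins.

```python
# Toy sketch of reasoning within a mental model: a handful of assumptions
# (facts) plus if-then rules, with conclusions drawn by forward chaining.

facts = {"is_tree", "is_winter"}
rules = [
    ({"is_tree"}, "has_trunk"),
    ({"is_tree"}, "has_branches"),
    ({"has_branches"}, "has_leaves"),
    ({"has_leaves", "is_winter"}, "leaves_fallen"),
]

# Forward chaining: keep applying rules until no new conclusions appear.
changed = True
while changed:
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print(facts)  # includes 'leaves_fallen', deduced from the model's rules
```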

While APPLE, TREE, and BASEBALL are individual concepts/models, no two instances of them are the same. Any two apples must differ at least in time and/or place. When we use a model for a tree (let’s call it the model instance), we customize the model to fit the problem at hand. So for an evergreen tree, for example, we will think of needles as a degenerate or alternate form of leaves. Importantly, we don’t consciously reason out the appropriate model for the given tree; we recognize it using our innate statistical capabilities. A model or concept instance is created through recognition of underlying generalizations we have stored from long experience, and then tweaked on an ad hoc basis (via further recognition and reflection) to add unique details to this instance. Reflection can be thought of as a conscious tool to augment recognition. So a typical model instance will be based on recognition of a variety of concepts/models, some of which will overlap and even contradict each other. Every model instance thus contains a set of formal systems, so I generally call it a constellation of models rather than a model instance.

We reason with a model constellation by using logic within each component model and then using statistical means to weigh them against each other. The critical aspect of the whole arrangement is that it sets up formal systems in which logic can be applied. Beyond that, statistical techniques provide the huge amount of flexibility needed to line up formal systems to real-world situations. The whole trick of the mind is to represent the external world with internal models and to run simulations on those models to predict what will happen externally. We know that all animals have some capacity to generalize to concepts and models because their behavior depends on being able to predict the future (e.g. where food will be). Most animals, but humans in particular, can extend their knowledge faster than their own experience allows by sharing generalizations with others via communication and language, which have genetic cognitive support. And humans can extend their knowledge faster still through science, which formally identifies objective models.
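Combining the two preceding sketches, the fragment below shows one way a constellation might be reasoned with: each component model contributes its own logical conclusion, and statistical weights (here just hand-set match scores, purely hypothetical) arbitrate among them.

```python
# Sketch of reasoning over a "constellation" of models: logic runs inside
# each component model; statistical weights from recognition arbitrate.

def conifer_model(tree):
    return "keeps_needles"    # conclusion within the evergreen model

def deciduous_model(tree):
    return "drops_leaves"     # conclusion within the deciduous model

# Recognition assigns each model a match weight for this instance.
constellation = [
    (conifer_model, 0.8),     # strong match: needles observed
    (deciduous_model, 0.2),   # weak match: overall tree shape only
]

def predict(tree, constellation):
    """Weigh each model's conclusion by its match score and pick the best."""
    votes = {}
    for model, weight in constellation:
        conclusion = model(tree)
        votes[conclusion] = votes.get(conclusion, 0.0) + weight
    return max(votes, key=votes.get)

print(predict({"needles", "cones"}, constellation))  # -> keeps_needles
```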

So what steps can we take to increase the objectivity of what goes on in our minds, which has some objective elements in its use of formal models, but which also has many subjective elements that help form and interpret the models? Devising software that could run mental models would help because it could avoid fallacies and guard against biases. It would still ultimately need to prioritize using preferences, which are intrinsically subjective, but we could at least try to be careful and fair setting them up. Although it could guard against the abuses of bias, we have to remember that all generalizations are a kind of bias, being arguments for one way of organizing information over another. We can’t write software yet that can manage concepts or models, but machine learning algorithms, which are statistical in nature, are advancing quickly. They are becoming increasingly generalized to behave in ever more “clever” ways. Since concepts and models are themselves statistical entities at their core, we will need to leverage machine learning as a starting point for software that simulates the mind.

Still, there is much we can do to improve our objectivity of thought short of replacing ourselves with machines, and science has been refining methods to do it from the beginning. Science’s success depends critically on its objectivity, so it has long tried to reject subjective biases. It does this principally by cultivating a culture of objectivity. Scientists try to put opinion aside to develop hypotheses in response to observations. They then test them with methods that can be independently confirmed. Scientists also use peer review to increase independence from subjectivity. But what keeps peers from being subjective? In his 1962 classic, The Structure of Scientific Revolutions11, Thomas Kuhn noted that even a scientific community that considers itself objective can become biased toward existing beliefs and will resist shifting to a new paradigm until the evidence becomes overwhelming. This observation inadvertently opened a door which postmodern deconstructionists used to launch the science wars, an argument that sought to undermine the objective basis of science, calling it a social construction. To some degree this is undeniable, which has left science with a desperate need for a firmer foundation. The refutation science has fallen back on for now was best put by Richard Dawkins, who noted in 2013 that “Science works, bitches!”12. Yes, it does, but until we establish why, we are blustering much like the social constructionists. The reason science works is that scientific methods increase objectivity while reducing subjectivity and relativism. It doesn’t matter that they don’t (and in fact can’t) eliminate them. All that matters is that they reduce them, which distinguishes science from social construction by directing it toward goals. Social constructions go nowhere, but science creates an ever more accurate model of the world. So, yes, science is a social construction, but one that continually moves closer to truth, if truth is defined in terms of knowledge that can be put to use. In other words, from a functional perspective, truth just means increasing the amount and quality of useful information. It is not enough for scientific communities to assume best efforts will produce objectivity; we must also discover how preferences, biases, and fallacies can mislead the whole community. Tversky and Kahneman did groundbreaking work exposing the extent of cognitive biases in scientific research, most notably in their 1971 paper, “Belief in the law of small numbers”13,14. Beyond just being aware of biases, scientists should not have to work in situations with a vested interest in specific outcomes. This can potentially happen in both public and private settings, but is more commonly a problem when science is used to justify a commercial enterprise.

5. Orienting science (esp. cognitive science) with form & function dualism and pragmatism

The paradigm I am proposing to replace physicalism, rationalism, and empiricism is a superset of them. Form & function dualism embraces everything physicalism stands for but doesn’t exclude function as a form of existence. Pragmatism embraces everything rationalism and empiricism stand for but also includes knowledge gathered from statistical processes and function.

But wait, you say, what about biology and the social sciences: haven’t they been making great progress within the current paradigm? Well, they have been making great progress, but they have been doing it using an unarticulated paradigm. Since Darwin, biology has pursued a function-oriented approach. Biologists examine all biological systems with an eye to the function they appear to be serving, and they consider the satisfaction of function to be an adequate scientific justification, but it isn’t under physicalism, rationalism, or empiricism. Biologists cite Darwin and evolution as justification for this kind of reasoning, but that doesn’t make it science. The theory of evolution is unsupportable under physicalism, rationalism, and empiricism alone, but instead of acknowledging this metaphysical shortfall, some scientists just ignore evolution and reasoning about function while others just embrace it without being overly concerned that it falls outside the scientific paradigm. Evolutionary function occupies a somewhat confusing place in reasoning about function because it is not teleological, meaning that evolution is not directed toward an end or shaped by a purpose but rather is a blind process without a goal. But this is irrelevant from an informational standpoint because information never directs toward an end anyway; it just helps predict. Goals are artifacts of formal systems, and so contribute to logical but not statistical information-management techniques. In other words, goals and logic are imaginary constructs; they are critical for understanding the mind but can be ignored for studying evolution and biology, which has allowed biology to carry on despite this weakness in its foundation.

The social sciences, too, have been proceeding on an unarticulated paradigm. Officially, they are trying to stay within the bounds of physicalism, rationalism, and empiricism, but the human mind introduces a black box, which is what scientists call a part of the system that is studied entirely through its inputs and outputs without any attempt to explain the inner workings. Some efforts to explain it have been attempted. Pavlov and Skinner proposed that behaviorism could explain the mind as nothing more than conditioned responses (classical and operant conditioning, respectively), which sounded good at first but didn’t explain all that minds do. Chomsky refuted it in a rebuttal to Skinner’s Verbal Behavior by explaining how language acquisition leverages innate linguistic talents15. And Piaget extended the list of innate cognitive skills by developing his staged theory of intellectual development. So we now have good reason to believe the mind is much more than conditioned behavior and employs reasoning and subconscious know-how. But that is not the same thing as having an ontology or epistemology to support it. Form & function dualism and pragmatism give us the leverage to separate the machine (the brain) from its control (the mind) and to dissect the pieces.

Expanding the metaphysics of science has a direct impact across science and not just regarding the mind. First, it finds a proper home for the formal sciences in the overall framework. As Wikipedia says, “The formal sciences are often excluded as they do not depend on empirical observations.” Next, and critically, it provides a justification for the formal sciences to be the foundation for the other sciences, which are dependent on mathematics, not to mention logic and hypotheses themselves. The truth is that physicalism, rationalism, and empiricism offer no metaphysical justification for invoking the formal sciences at all. With my paradigm, the justification becomes clear: function plays an indispensable role in the way the physical sciences leverage generalizations (scientific laws) about nature. In other words, scientific theories are from the domain of function, not form. Next, it explains the role evolutionary thinking already plays in biology by revealing how biological mechanisms use information stored in DNA to control life processes through feedback loops. Finally, this expanded framework will ultimately let the social sciences shift from black boxes to knowable quantities.

But my primary motivation for introducing this new framework is to provide a scientific perspective for studying the mind, which is the domain of cognitive science. It will elevate cognitive science from a loose collaboration of sciences to a central role in fleshing out the foundation of science. Historically the formal sciences have been almost entirely theoretical pursuits because formal systems are abstract constructs with no apparent real-world examples. But software and minds are the big exceptions to this rule and open the door for formalists to study how real-world computational systems can implement formal systems. Theoretical computer science is a well-established formal treatment of computer science, but there is no well-established formal treatment for cognitive science, although the terms theoretical cognitive science and computational cognitive science are occasionally used. Most of what I discuss in this book is theoretical cognitive science because most of what I am doing is outlining the logic of minds, human or otherwise, but with a heavy focus on the design decisions that seem to have impacted earthly, and especially human, minds. Theoretical cognitive science studies the ways minds could work, looking at the problem from the functional side, and leaves it as a (big) future exercise to work out how the brain actually brings this sort of functionality to life.

It is worth noting here that we can’t conflate software with function: software exists physically as a series of instructions, while function exists mentally and has no physical form (although, as discussed, software and brains can produce functional effects in the physical world and this is, in fact, their purpose). Drew McDermott (whose class I took at Yale) characterized this confusion in the field of AI like this (as described by Margaret Boden in Mind as Machine):

A systematic source of self-deception was their common habit (made possible by LISP: see 10.v.c) of using natural-language words to name various aspects of programs. These “wishful mnemonics”, he said, included the widespread use of “UNDERSTAND” or “GOAL” to refer to procedures and data structures. In more traditional computer science, there was no misunderstanding; indeed, “structured programming” used terms such as GOAL in a liberating way. In AI, however, these apparently harmless words often seduced the programmer (and third parties) into thinking that real goals, if only of a very simple kind, were being modelled. If the GOAL procedure had been called “G0034” instead, any such thought would have to be proven, not airily assumed. The self-deception arose even during the process of programming: “When you [i.e. the programmer] say (GOAL… ), you can just feel the enormous power at your fingertips. It is, of course, an illusion” (p. 145). 16

This raises the million-dollar question: if an implementation of an algorithm is not itself function, where is the function, i.e. real intelligence, hiding? I am going to develop the answer to this question as the book unfolds, but the short answer is that information management is a blind watchmaker both in evolution and in the mind. That is, from a physical perspective the universe can be thought of as deterministic, so there is no intelligence or free will. But the main thrust of my book is that this doesn’t matter, because algorithms that manage information are predictive, and this capacity is equivalent to both intelligence and free will. So if procedure G0034 is part of a larger system that uses it to effectively predict the future, it can fairly be called by whatever functional name describes this aspect. Such mnemonics are actually not wishful. It is no illusion that the subroutines of a self-driving car that get it to its destination in one piece wield enormous power and achieve actual goals. This doesn’t mean we are ready to start programming goals at the level human minds conceive them (and certainly not UNDERSTAND!), but function, i.e. predictive power, can be broken down into simple examples and implemented using today’s computers.
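As a toy illustration of the point (all names and numbers invented): a procedure deliberately given the opaque name G0034 still functions as a predictor, and the control loop that uses it achieves a simple goal regardless of what we call it.

```python
# Whether we call the procedure G0034 or PREDICT, its functional role is
# fixed by what it does for the system that uses it: predicting forward
# motion so a simple controller can reach a destination.

def G0034(position, velocity, dt):
    """Predict where the vehicle will be after dt seconds."""
    return position + velocity * dt

def steer_toward(destination, position, velocity, dt=1.0):
    """Slow down if the prediction overshoots the destination."""
    predicted = G0034(position, velocity, dt)
    return velocity * 0.5 if predicted > destination else velocity

position, velocity, destination = 0.0, 10.0, 30.0
while abs(position - destination) > 0.5:
    velocity = steer_toward(destination, position, velocity)
    position += velocity * 1.0

# The name G0034 is opaque, but its predictive role in the loop is real:
# renaming it would describe, not create, its function.
print(f"arrived near {position}")
```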

What are the next steps? My main point is that we need to start thinking about how minds achieve function and stop thinking that a breakthrough in neurochemistry will magically solve the problem. We have to solve the problem by solving the problem, not by hoping a better understanding of the hardware will explain the software. While the natural sciences decompose the physical world from the bottom up, starting with subatomic particles, we need to decompose the mental world from the top down, starting (and ending) with the information the mind manages.