2.3 Humans: 4 million to 10,000 years ago

Cooperation and Engineering

People are much more capable than our nearest animal relatives, but why? Clearly, something significant happened in the seven million years since we diverged from chimpanzees. Before I get into that, let’s consider what some of the smartest animals can do. Apes and another line of clever animals, the corvids (crows, ravens, and rooks), can fashion simple tools from small branches, which requires cause-and-effect thinking using conceptual models. Most apes and corvids live in groups to defend against predators and extend foraging opportunities, as do many other kinds of animals, but they derive many other subtle benefits from social life. They groom each other, share care of offspring, share knowledge, and are quite vocal, using calls for warning, mating, and other purposes.1,2 Apes and corvids also have a substantial capacity to attribute mental states (such as senses, emotions, desires, and beliefs) to themselves and others, an ability called Theory of Mind (ToM)3,4. In particular, if they see food being hidden and they are aware of another animal (agent) observing it being hidden, they will take this knowledge of the other animal’s knowledge into account in their behavior. That mammals and birds evolved all these capabilities independently of each other indicates that a functional ratchet is at work.

But while apes, corvids, and some other animals can devise novel (non-instinctive) strategies to solve problems using both intuitive and conceptual learning, and can benefit from social interactions, only humans can work together cooperatively to solve new problems with engineering. Many animals, e.g. social insects and beavers, work cooperatively to solve problems using strategies honed by instinct. They are acting individually, without realizing that their behavior benefits the group, rather than using new information to work out solutions. Among animals that can conceptually work out solutions to new problems, only humans can reuse and improve on previously used tools (engineering) and then communicate their plans to each other and execute them in a coordinated fashion. They do this using language, whose meaning can be subdivided into semantic and nonverbal components. Semantic meaning is conceptual while nonverbal meaning is emotional and intuitive (so I am counting conceptual gestures as semantic, though they are technically also nonverbal). Early languages forced people to develop formal expressions of their conceptual models, which in turn made them think more clearly about them. So language is inherently metacognitive because every linguistic expression is both about something and is also an expression, which is a way of thinking about that something.

Cooperation and engineering gave us more ways to think about things and reasons to do so because they both expand the range of beneficial activities indefinitely. Most notably, they create the prospect of individuals learning specialized service or manufacturing roles. Making spears for cooperative hunting is an often-cited activity requiring both kinds of specialization, but just having the cognitive capacity to do this suggests we were likely also cooperating to engineer housing, clothing, and food practices, probably for longer than a million years. Starting with rudimentary uses of semantic communication and tools, bands of humans established roles for group-level strategies that slowly evolved into ever-more elaborate cultural heritages. No other animal communities have the ability to develop technologies and teach them to future generations. Because artifacts and the skills to use them persist, humans started to develop a frame of reference beyond the here and now that encompassed the past and future. Culture created the need for a broader concept of time. Cooperation had opened up access to functionality for an entirely new kind of cognitive ratchet that pushed human evolution quickly.

But why and how did such dramatic evolutionary change happen so rapidly after our divergence from chimps? The conventional theory of speciation requires geographic separation but doesn’t suggest why intelligence might evolve quickly. A number of factors came together to make the rapid evolution of humans more likely. First, let’s consider the rate of change. Mutations spread when they help a species survive better in a given niche. Traditional theories of evolution assume that organisms will continue to experience a steady rate of mutational change even absent environmental change, because useful mutations are always possible. But while possible, beneficial mutations become steadily less likely the longer an organism has lived in the same niche. This is because the organism gradually exhausts the range of what is physically reachable from the existing genome through small functional changes, climbing up to a local maximum in the space of all possible functionality. Humans that could fly might be more functional, but flight is not physically reachable. So I am saying that evolutionary change happens fastest when the fit of a species to its niche is worst and slows as that fit is perfected. In other words, evolution is a function of environmental stability. If the environment changes, evolution will be spurred to make species fit better. If the environment stays the same, each interbreeding population will approach stasis as its gene pool comes to represent an optimal solution to the challenges presented by the niche. However, if that population is separated geographically into two subpopulations, the result is two new niches, each of which may differ considerably from their combined average. Each population will quickly evolve to fit its new subniche.
Rapid evolution can happen either when the environment changes quickly or when a niche is divided in two; the difference is that in the latter case a new species will form. In both cases, however, a single interbreeding population changes rapidly because mutants fit the new environmental conditions better and so survive better than the established type.

Scientists have long noticed that rates of evolutionary change are uneven. Darwin knew of this phenomenon and wrote: “the periods during which species have undergone modification, though long as measured in years, have probably been short in comparison with the periods during which they retain the same form.”5 In the fossil record, many species experience stasis over tens or even hundreds of millions of years. Niles Eldredge and Stephen Jay Gould published a paper in 1972 that named this phenomenon punctuated equilibrium and contrasted it with the more widely accepted notion of phyletic gradualism, which held that evolutionary change was gradual and constant. But nobody has explained why punctuated equilibrium happens. The answer is as simple as this: change happens quickly when the slope toward a local maximum of potential functionality is steepest, and then slows and nearly stops as the local maximum is achieved.
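The claim that change is fastest where the slope is steepest can be illustrated with a toy hill-climbing simulation. This is my own sketch, not a model from the literature; the landscape, step size, and generation count are all invented for illustration. A population takes small mutational steps, keeps only improvements, and so races up a fitness slope before stalling at a local maximum it cannot escape:

```python
import random

random.seed(42)

def fitness(x):
    # Toy landscape: a local maximum at x = 3 (height 9) and a higher
    # peak at x = 10 (height 20) that small mutations cannot reach.
    return max(9 - (x - 3) ** 2, 20 - (x - 10) ** 2)

x = 0.0                      # current trait value of the population
history = []
for generation in range(200):
    mutant = x + random.uniform(-0.2, 0.2)   # small heritable change
    if fitness(mutant) > fitness(x):         # selection keeps improvements
        x = mutant
    history.append(x)

early_change = abs(history[49] - history[0])    # movement in first 50 generations
late_change = abs(history[199] - history[150])  # movement in last 50 generations
print(f"early change: {early_change:.2f}, late change: {late_change:.2f}")
# Change is rapid while the slope is steep, then nearly stops near the
# local maximum at x = 3: punctuation, then equilibrium.
```

The population never reaches the higher peak at x = 10 because every path to it passes through lower fitness, which is exactly the "physically reachable" limitation described above.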

It is usually sufficient to view functional potential from the perspective of environmental opportunity, but organisms are also information processors, and sometimes entirely new ways of processing information create new functional opportunities. This was the case with cooperation and engineering. Cooperation with engineering launched a new cognitive ratchet because together they greatly extended the range of what was physically reachable through small functional changes. Michael Tomasello identified differences in ape and human social cognition using comparative studies that show just what capacities apes lack. Humans do more pointing, imitating, teaching, and reassessing from different angles, and our Theory of Mind goes deeper, so we not only realize what others know but also what they know we know, and so forth recursively. These features combine to establish group-mindedness, or what he calls “collective intentionality”: ideas of belonging with associated expectations. Though our early cooperating ancestors, Australopithecus four million years ago and Homo erectus two million years ago, didn’t know it, they were bad fits for their new niche, which was effectively transformed by our potential to build arbitrarily complex tools to perform arbitrarily complex tasks. (We are still bad fits for our new niche because we have not reached the limits of arbitrary complexity, so we nervously await the arrival of the technological singularity to see what that really means.) In fact, we were arguably the worst fit for our niche that the history of life had ever seen, in that the slope toward our potential achievements was greatest, but we were also, of course, the only creatures yet to appear in a position to fill that niche. Making matters “worse”, the more functionality we evolved to fit our niche better, the more potential became accessible, which has accelerated evolution further up to the present moment.

Language is often cited as the critical evolutionary development that drove human intelligence, and while I basically agree with this, there is more to the story than that. First, let me address the question of whether language is an instinct: it isn’t. Our brains are not wired with a universal grammar capacity that only needs to be provided with words to bring forth language automatically. However, language does have considerable instinctive support because of the Baldwin effect. The Baldwin effect, first mentioned by Douglas Spalding in 1873 and then promoted by American psychologist James Mark Baldwin in 1896, proposes that the ability to learn new behaviors will lead animals to choose behaviors that help them fit their niche better, which will in turn lead to natural selection making them better at those behaviors. As Daniel Dennett put it, learning lets animals “pretest the efficacy of particular different designs by phenotypic (individual) exploration of the space of nearby possibilities. If a particularly winning setting is thereby discovered, this discovery will create a new selection pressure: organisms that are closer in the adaptive landscape to that discovery will have a clear advantage over those more distant.” The Baldwin effect is Lamarckian-like in that offspring tend to become better at what their ancestors did the most. It is entirely consistent with natural selection and is an accepted part of the Modern and Extended Syntheses because nothing parents learn is inherited by their offspring. All that happens is that natural selection improves the fit of an animal to its niche. The upshot is that behaviors resulting from information processing done in real time, aka learning from experience, which can include learning from others and be passed down through generations, can impact genetic support for those behaviors given many generations. So one can imagine that a number of instincts that help us with language are Baldwin instincts.
Couldn’t this evolve far enough for an algorithm for Universal Grammar, e.g. Noam Chomsky’s principles-and-parameters approach to generative grammars, to become innate? Yes, anything can evolve given enough time, but it is just not necessary. A fairly small number of Baldwin refinements to our general pool of cognitive instincts were enough to support language, and evolution will not specialize talents unnecessarily because generality has more functional potential.

Many, and perhaps most, complex animal behaviors were shaped by the Baldwin effect. I consider dam building in beavers to be a Baldwin instinct. It seems like it might have been reasoned out and taught by parents to offspring, but actually “young beavers, who had never seen or built a dam before, built a similar dam to the adult beavers on their first try.”6 This instinct results largely from an innate desire to suppress the sound of running water. But why would evolution select for that? Over the long period when this instinct was developing, beavers were gnawing wood and sometimes blocking streams. Those that blocked streams more did better. Beavers do learn from their environment and always try to use that knowledge to improve their lot. They did not conceive the idea of building dams, but they did do little things that built on their experience. For example, beavers plug holes in dams with mud and debris. While this now derives from an instinctive urge, pre-instinctive beavers building shoddier dams would have found from experience that flowing water was a problem and sometimes figured out ways to address it. A chance mutation that makes beavers more inclined to plug their dams will outcompete the learned capacity because it operates more reliably. In this way, Baldwin instincts backfill functions developed in real time into instinct.
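This backfilling process can be caricatured in a few lines of code. The model below is entirely my own invention, with made-up parameters; the point is only that when learning guarantees the behavior anyway, selection quietly rewards whatever fraction of it comes for free from instinct, so the innate component ratchets up over generations:

```python
import random

random.seed(1)

POP, GENS = 200, 150
LEARNING_COST = 0.5   # learning in real time is slower and riskier than instinct

def fitness(innate):
    # Every individual ends up performing the behavior: whatever instinct
    # doesn't supply, trial-and-error learning fills in. But learning costs
    # time and mistakes (plus some noise), so fitness rewards doing more of
    # the behavior innately -- the Baldwin effect.
    learned_fraction = 1.0 - innate
    return 1.0 - LEARNING_COST * learned_fraction + random.gauss(0, 0.05)

# Start with almost no innate support for the behavior.
population = [random.uniform(0.0, 0.1) for _ in range(POP)]

for _ in range(GENS):
    # The fitter (more instinctive) half breed; offspring inherit the
    # innate trait with a small mutation, clamped to [0, 1].
    survivors = sorted(population, key=fitness, reverse=True)[:POP // 2]
    population = [min(1.0, max(0.0, random.choice(survivors) + random.gauss(0, 0.03)))
                  for _ in range(POP)]

mean_innate = sum(population) / POP
print(f"mean innate tendency after {GENS} generations: {mean_innate:.2f}")
```

Nothing any individual learns is inherited; only the innate trait passes to offspring, which is why this toy remains consistent with natural selection.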

Unlike beavers, young humans raised without language will not simply speak fluent Greek. Both Holy Roman Emperor Frederick II and King James IV of Scotland performed such experiments, in the 13th and 15th centuries respectively7. In the former case, the infants died, probably from lack of love, while in the latter they did not speak any language, though they may have developed a sign language. The critical period hypothesis strongly suggests that normal development of the brain, including the ability to use language, requires adequate social exposure during the critical early years. Children with very limited exposure to language who interact with other similar kids will often develop an idioglossia, or private language, which is not a full-featured language. Fifty deaf children, probably possessing idioglossia or home sign systems, were brought together in 1977 at a center for deaf education in Nicaragua. Efforts to teach them Spanish had little success, but in the meantime, they developed what became a full-fledged sign language, now called Idioma de Señas de Nicaragua (ISN), over a nine-year period8. Languages themselves must be created through a great deal of human interaction, but our facility with language, and our inclination to use it, is so great that we can quickly create complete languages given adequate opportunity. While every fact and rule about any given language must be learned, and while our general capacity for learning includes the ability to learn other complex skills as well, language has been with humans long enough to be heavily influenced by the Baldwin effect. A 2008 study on the feasibility of the Baldwin effect influencing language evolution using computer simulations found that it was quite plausible9. I think human populations have been using proto-languages for millions of years and that the Baldwin effect has been significant in preferentially selecting traits that help us learn them.

While linguists tend to focus on grammar, which is related only to the semantic content of language, much of language is nonverbal. Consider that Albert Mehrabian famously claimed in 1967 that only 7% of the information transmitted by verbal communication was due to words, while 38% was tone of voice and 55% was body language. This breakdown was based on two studies in which nonverbal factors could be very significant and does not fairly represent all human communication. While other studies have shown that 60 to 80% of communication is nonverbal in typical face-to-face conversations, in a conversation about purely factual matters most of the information is, of course, carried by the semantic content of the words. This tells us that information carried nonverbally usually matters more to us than the facts of the matter. Cooperation depends more on goodwill than good information, and that is the chief contribution of nonverbal information. Reading and writing are not interactive and don’t require a relationship to be established, so they still work well without body language. But written language also conveys substantial nonverbal content through wording that evokes emotions and intuitions that essentially capture a mood.

Having established how the cognitive ratchet got started with humans and then accelerated, I’m now going to look closer at the ways functionality expanded in human minds to increase our capacity to cooperate and engineer. The overall role of the mind is the same in humans as in other animals, to control the body effectively, so we need to consider the problem holistically with this in mind. That said, I will show how our capacity for conceptual thinking was boosted while remaining integrated with the evolutionary demands of survival.

The Rational and Intuitive Minds

“The intuitive mind is a sacred gift and the rational mind is a faithful servant. We have created a society that honors the servant and has forgotten the gift”. –Albert Einstein

Einstein’s quote divides the conscious mind into an intuitive and a rational component. The rational mind consists of logical information processing done in the conscious mind, and the intuitive mind is information processing that is either not logical or only partly logical because it draws on impressions or hunches. Our intuitive impressions appear in our conscious awareness after having been triggered somehow by conscious thoughts. I am proposing that a level of processing below consciousness, which I call the nonconscious intuitive mind, creates these impressions by drawing heavily on memory but also considering a preponderance of the evidence relating to the matter at hand. It is paired with the conscious intuitive mind, which is the part of intuition we are aware of. Although the rational mind also draws heavily on memory, which is a nonconscious process, I will, as I mentioned in the last chapter, consider memory that we can bring back to consciousness to be part of consciousness. Since the rational thinking we do is also conscious, there is no nonconscious rational mind. Qualia appear in our conscious awareness almost immediately after we sense things, as if the nonconscious processes that convert senses to a form we can consciously understand were instantaneous. We can call the part of the mind that manages qualia the qualitative mind. The qualitative mind then divides into the aware mind, the attentive mind, the sensory mind, and the emotional mind. As with the intuitive mind, each of these parts has both conscious and nonconscious aspects, and it should be understood that the nonconscious parts feed the conscious parts.

It is well known that the left and right hemispheres of the brain work differently. Most significantly, the right brain controls the left half of the body and vice versa. But of more interest here, the right brain can approximately be called the intuitive or nonverbal mind and the left the rational or verbal mind. Although this is a gross simplification, to a first-order approximation it is true. To a second-order approximation, each half is generally capable overall and could carry on without the other, but specialized skills do often preferentially develop in just one hemisphere. The most likely reason for hemisphere lateralization is that the two halves can only communicate with each other through the corpus callosum (a band of nerve fibers connecting them), and possibly, to a much lesser degree, through lower parts of the brain. This allows each half to develop considerable autonomy. To act effectively as top-down information processors, we need to carefully balance both top-down (rational) and bottom-up (intuitive) perspectives, so it makes sense that we would develop lateralized specialties for these perspectives using each hemisphere. Each side must do both top-down and bottom-up processing because both are needed to make sense of anything, but the weighting is different, making one side seem relatively more logical and the other more intuitive. Both halves contribute their efforts to create our overall, balanced perspective.

In The Master and His Emissary: The Divided Brain and the Making of the Western World, Iain McGilchrist argues that the left brain has become increasingly dominant in Western society, leading to rationality eclipsing intuition. While he linked his argument to the physical structure of the brain, one could alternatively make the purely functional argument that rational thinking, from either or both hemispheres, has become more dominant in Western society. This is exactly the point that Einstein was making. Is this general shift related to left-brain dominance? No: it is not really about the side of the brain, because the rational/intuitive division is too simplistic to describe how the hemispheres specialize. But rephrased in functional terms, does this general shift exist, and is it related to the dominance of rationality? Yes to both. It is not a failing of modern society; it is just an inescapable part of the cognitive ratchet: concepts and metaconcepts are rational, so the more cognitive functionality we develop, the more rational we become. But the need for intuition never goes away, because we will always have to join top-down solutions back to bottom-up perceptions, which are the source of all our drives, motives, and satisfaction with life. This mutual dependence has been recognized for a long time. Immanuel Kant put it like this: “Thoughts without content are empty, intuitions without concepts are blind”10. The full passage from the 1781 edition of Kant’s Critique of Pure Reason, pp. 50-51, is:

Intuition and concepts … constitute the elements of all our cognition, so that neither concepts without intuition corresponding to them in some way nor intuition without concepts can yield a cognition. Thoughts without [intensional] content (Inhalt) are empty (leer), intuitions without concepts are blind (blind). It is, therefore, just as necessary to make the mind’s concepts sensible—that is, to add an object to them in intuition—as to make our intuitions understandable—that is, to bring them under concepts. These two powers, or capacities, cannot exchange their functions. The understanding can intuit nothing, the senses can think nothing. Only from their unification can cognition arise.

To clarify, Kant did think that intuition and concepts existed independently in our minds. It was just that neither alone could create understanding or cognition, and with this I completely agree.

Learning – A Way to Conditionally Manage Incoming Information

I outlined before how information processors create three orders of information: percepts, concepts, and metaconcepts. I also said that our capacity for percepts and concepts has both innate and learned components. Qualia are how we subjectively experience innate perception as either senses or feelings, as I discussed in the last chapter11. But I haven’t yet suggested why learning exists, except in the larger context that everything functional is useful. Yes, learning is useful, but why? Instincts are great for handling frequently encountered problems, but animals face challenges every day where novel solutions could do a better job, and it would not be practical, or probably even possible, to wait for solutions to such problems to evolve into instincts. Real-time systems that could tailor a solution to the problem at hand would provide a tremendous competitive advantage. This is the beginning of the cognitive ratchet I have been discussing.

We call this creation of real-time solutions from experience learning, and all animals developed two quite distinct ways to do it: an inductive approach that works from the bottom up and a deductive approach that works from the top down. These are what I previously noted William of Ockham called intuitive and abstractive cognition. Although every problem animals solve must incorporate elements of both, they are different specializations. Bottom-up techniques, which are the specialty of the qualitative and intuitive minds, create qualia and subconcepts respectively. Top-down techniques, which are the specialty of the rational mind, create concepts and metaconcepts. Inductive methods find patterns in data independent of context, while deductive methods create contexts (or models), starting with the context of the organism as an agent, and decompose that context into causes and effects, i.e. reasons. While qualia result from processing sensory information in a fixed way, subconcepts and concepts are learned and stored in memory, starting from nothing when the brain first forms and building an increasingly complex network throughout life.

Feelings, subconcepts, and concepts collectively comprise “thoughts”, which is not a rigid term but encompasses any kind of experience passing through our conscious awareness. We take thoughts that are well established and considered reliable or correct to be knowledge. Knowledge can be either specific or general, where specific knowledge is tied to a single circumstance and general knowledge is expected to apply to some range of circumstances. Qualia are always specific and take place in real time, but subconcepts and concepts (including the memory of qualia) can be either specific or general, and can either take place in real time as we think about them or be remembered for later use. Though qualia thus constitute much of our current knowledge, they comprise none of our learning, experience, or long-term knowledge. I need a term I can use to refer just to what we have learned, so I am going to use the words “knowledge” and “thought” to mean only subconcepts and concepts going forward, and I will explicitly mention qualia (aka feelings) separately as needed. Furthermore, I will call a piece of specific knowledge a “memento”, indicating it is a distinct item from memory, and a piece of general knowledge either a “lesson” if it follows directly from experience or a “notion” if it is a novel projection. Arguably, everything in our memory is knowledge by definition because to remember is to know, but knowledge varies dramatically in how well we know it and how reliable or correct we consider it to be. We also start to forget things immediately and increasingly over time, so we tend to reserve the word knowledge for memory that meets a standard of robustness appropriate to any given situation.

All feelings and thoughts create information from patterns discovered in real time, that is, through experience and not through evolution, even though they all leverage cognitive mechanisms that developed over evolutionary time. But where feelings are “special effects” in the theater of consciousness which evolved to deliver them to our awareness in a customized way, thoughts (i.e. subconcepts and concepts) have no sensation. We can associate remembered qualia with them, but this second-hand feeling falls far short of live qualia. Knowledge doesn’t “feel” like anything; all it does is bring other knowledge to mind. Thoughts connect to other thoughts. They connect serially in language, or we can go depth-first to dig into deeper levels of detail, or breadth-first to find similarities or analogies, or we can just think by free-association to follow our intuition. Thoughts seem to be composed out of memories of other thoughts, and our ability to traverse them depends on what we are looking for. Language seems like a special case because while it doesn’t prevent us from roaming through our memory as we like, when we think using language, following our own inner voice, it helps us guide trains of thought more easily down desired pathways. As we think, we form memories and gather experience, recording them first as mementos of things that happened to us and then as lessons that help us predict what will happen in the future in similar situations. This collecting of experience happens both beneath our conscious awareness through learned perception (making subconcepts) and through conception, which is the conscious creation of concepts.

Finally, it is time to start decomposing subconcepts and concepts. Subconcepts have both nonconscious and conscious aspects. They are created nonconsciously from impressions we form from experience. This internal analysis builds them from inductive trials by weighing the preponderance of the evidence. Subconcepts can only be spoken of in the plural because they are networks of impressions rather than discrete ideas. To refer to something discretely, we need to use concepts. Our conscious awareness of our intuitive mind consists of impressions and hunches which we know are based on experience and thus potentially carry useful information. We also have intuition about how reliable our intuitions are, so we actually trust our intuition for most things most of the time. Although we don’t know just why things pop into our minds intuitively, intuition tends to be very appropriate because it works the same way as memory recall. Specific memories, or mementos, tie to individual moments, but general memories, or lessons, draw conclusions about the broader applicability of specific memories to more general circumstances. Intuitions are lessons we learn below the conceptual level from experience. So we don’t just remember them; we are nonconsciously drawing inductive conclusions about them.


Conceptual thinking, as I discussed in the chapter on dualism, is based on deductive reasoning. Deduction establishes logical models, which are sets of abstract premises, logical rules, and conclusions one can reach by applying the rules to the premises. Logical models are closed, meaning we can assume their premises and rules are completely true, and also that all conclusions that follow from the rules are true (given binary rules, though the rules don’t have to be binary). In our minds, we create sufficiently logical frameworks called conceptual models, for which the underlying premises are concepts. Concepts are abstract entities that have two parts in the mind: a handle with which we refer to them and a content that tells us what they mean. The concept’s handle is a reference we keep for it in our minds, like a container. Concepts are often named by words or phrases, but we know many more concepts than we have named with words, including, for example, a detailed conceptual memory of events. From the perspective of the handle, the concept is fully abstract and might be about anything.

The concept’s meaning is its content, which consists of one or more relationships to other concepts. At its core, information processing finds similarities among things and applies those similarities to specific situations. Because of this, the primary feature of every concept’s content is whether it is a generality or a particular. A generality or type embraces many particulars that can be said to be examples of the type. The generality is said to be superordinate to the subordinate example across one or more variable ranges. Providing a value for one of those ranges creates an example or instance called a token of the type, and if all ranges are specified one arrives at a particular, which is necessarily unique because two tokens with the same content are indistinguishable and so are the same token. Generalities are always abstract, while particulars can be either concrete or abstract, which, in my terminology, means they are either about something physical or something functional. A concrete or physical particular will correspond to something spatiotemporal, i.e. a physical thing or event. Each physical thing has a noumenon (or thing-in-itself) we can’t see and phenomena that we can. From the phenomena, we create information (feelings, subconcepts, and concepts) which can be linked as the content of a concept. Mentally, we catalog physical particulars as facts, which is a recognition that the physical circumstance they describe is immutable, i.e. what happened at any point in space and time cannot change. Note that concrete particulars are still generalities with respect to the time dimension, because we take physical existence as persisting through a duration of time. Since concrete particulars eventually change over time, we model them as a series of particulars linked abstractly as a generality.
What happens at a given point in space and time is noumenal, but we only know of it by aligning our perceptions of phenomena with our subconcepts and concepts, which sometimes leads to mistaken conclusions. We reduce that risk and establish trust by performing additional observations to verify facts, and from the amount of confirming evidence we establish a degree of mathematical certainty about how well our thoughts characterize noumena. Belief is a special ability which I will describe later that improves certainty further by quarantining doubt.
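The type/token machinery described above can be made concrete with a small sketch: a type is a set of named variable ranges, fixing some ranges yields a token, and fixing them all yields a particular. The `Type` class and the color/size dimensions are invented for illustration (borrowing the APPLE example used later in this chapter).

```python
# A minimal sketch of the generality/particular distinction: a type is a
# bundle of variable ranges; specifying all of them yields a particular.
# Names and dimensions are illustrative assumptions, not from the text.

class Type:
    """A generality: a set of named variable ranges."""
    def __init__(self, name, ranges):
        self.name = name
        self.ranges = ranges  # e.g. {"color": {"red", "green"}, ...}

    def token(self, **values):
        """Fix some ranges; fixing them all yields a particular."""
        for var, val in values.items():
            assert val in self.ranges[var], f"{val} outside range {var}"
        unspecified = set(self.ranges) - set(values)
        return {"type": self.name, "values": values,
                "is_particular": not unspecified}

APPLE = Type("APPLE", {"color": {"red", "green", "yellow"},
                       "size": {"small", "medium", "large"}})

t = APPLE.token(color="red")                  # a token, still general in size
p = APPLE.token(color="red", size="medium")   # fully specified: a particular
print(t["is_particular"], p["is_particular"])  # False True
```

Two calls with identical values would produce indistinguishable results, matching the point that two tokens with the same content are the same token.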

An abstract or functional particular is any non-physical concept that is specific in that it doesn’t itself characterize a range of possible concepts. The number “2” is an abstract particular, as it can’t be refined further. A circle is also an abstract particular until we introduce the concept of distance, at which point circle is a type that becomes a particular given a radius. If we introduce location within a plane, we would also need the coordinates of the circle’s center. So we can see that whether an abstract concept is particular or not depends on what relationships exist within the logical model that contains it. The number x such that x+2=4 is variable until we solve the equation, at which point we see it is the particular 2. The number x such that x^2=4 is variable even after we solve the equation because it can be either -2 or 2. So for functional entities, once all variability is specified within a given context one has an abstract particular. Mathematics lays out sets of rules permitting variability that let us move from general to particular mathematical objects. Deductive thought employs logical models that permit variability and can similarly arrive at particulars. For example, we can conceive of altruism as a type of behavior. If I write a story in which I open a door for someone in some situation, then that is a fully specified abstract particular of altruism. So just as we see the physical world as a collection of concrete particulars that we categorize using abstract generalities about concrete things, we see the mental world as a set of abstract particulars categorized by abstract generalities about abstract things. Thus, both our concrete and abstract worlds divide nicely into particular and general parts. Concrete particulars can be verified with our senses (if we can still access the situation physically), but abstract particulars can only be verified logically. In both cases, we can remember a particular and how we verified it.
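The two equations above make the point computationally: one constraint pins x down to a single abstract particular, while the other still leaves variability. This sketch checks them by brute force over a small integer range rather than by symbolic algebra.

```python
# The two equations from the text, solved by brute force over a small
# integer range (a sketch, not symbolic algebra).

candidates = range(-10, 11)

# x + 2 = 4 pins x down to a single abstract particular.
sol1 = {x for x in candidates if x + 2 == 4}

# x^2 = 4 still permits variability: two values satisfy it.
sol2 = {x for x in candidates if x * x == 4}

print(sol1 == {2})        # True: a unique particular
print(sol2 == {-2, 2})    # True: still a type with two tokens
```

Only when every remaining degree of freedom is fixed, as in the first equation, does the functional entity become an abstract particular.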

While our senses send a flood of information to our minds, from which we inherently form concrete particulars, the process of recognition also categorizes things into a wide variety of abstract types we have established in our memories as concepts. Our nonconscious mind doesn’t think about these concepts, but it does retrieve them for us, demonstrating that some of our capacity to work with concepts is innate, even if their content is shaped through experience.

Beyond whether they are generalities or particulars, concepts can have considerably more content. But what does that content look like? The surprising fact we have to keep in mind is that concepts are what they can do — their meaning is their functionality. So we shouldn’t try to decompose concepts into qualities or relationships but instead into units of purpose. Deductive models can provide much better control than inductive models because they can predict the outcomes of multistep processes through causal chains of reasoning, but to do that their ranges of variability have to align closely with the inductive variability observed in the kinds of situations in which we expect to apply them. When this alignment is good, the deductive models become highly functional because their predictions tend to come true. Viewed abstractly, then, the premises and rules of deductive models exist because they are functional, i.e. because they work. So concepts are not just useful as an incidental side effect; being useful is fundamental to their nature. This is what I have been saying about information all along — it is bundled-up functionality.

Given this perspective, what can we say about content? Let’s start simply. The very generic concept in this clause’s use of the phrase “generic concept”, for example, is an abstract generality with no further meaning at all; it is just a placeholder for any concept. Or, the empty particular concept in this clause is an example of an abstract particular with no further meaning, since it is the unique abstract particular whose function is to represent an empty particular. But these are degenerate cases; almost every concept we think of has some real content. A concrete particular concept includes spatiotemporal information about it, as noted above, and all our spatiotemporal information comes originally from our senses as qualia. We additionally gain experience with an object which is generalized into subconcepts that draw parallels to similar objects. Much of the content of concrete particulars consists of links to feelings and subconcepts that remind us what they and other things like them feel like. Each concrete particular is also linked to every abstract generality for which it is a token. Abstract generalities then indirectly link to feelings and subconcepts of their tokens, with better examples forming stronger associations. What does it mean to link a concept to other feelings (sensory or emotional), subconcepts, or concepts? We suspect that this is technically accomplished using the 700 trillion or so synapses that join neurons to other neurons in our brains12, which implies that knowledge is logically a network of relationships linking subconcepts and concepts together and from there down to feelings. Our knowledge is vast and interconnected, so such a vast web of connections seems like it could be powerful enough to explain it, but how might it work? Simplistically, thinking about concepts could activate the feelings and thoughts linked by their contents by activating the linked neurons.
Of course, it is more complicated than that; chiefly, activation has to be managed holistically so that each concept (and subconcept and feeling) contributes an appropriate influence on the overall control problems being solved. The free energy (surprise-minimization) principle is one holistic rule that helps provide this balance, and attention and prioritization systems manage it in finer detail. But for now, I am trying to focus on how the information is organized, not how it is used.
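The simplistic picture above, where thinking about a concept activates the feelings and thoughts linked by its content, can be sketched as spreading activation over a tiny network. Every node name, link weight, and parameter here is an invented assumption purely for illustration; it is not a model of actual neural circuitry.

```python
# A toy spreading-activation network: concepts are nodes, content links
# are weighted edges, and activating one concept partially activates its
# neighbors, attenuating with each hop. All values are illustrative.

links = {
    "APPLE": {"RED": 0.8, "FRUIT": 0.9, "SWEET": 0.5},
    "FRUIT": {"TREE": 0.6, "EAT": 0.7},
    "SWEET": {"EAT": 0.4},
    "RED": {}, "TREE": {}, "EAT": {},
}

def activate(start, decay=0.5, threshold=0.1):
    """Propagate activation outward, keeping the strongest level per node."""
    activation = {start: 1.0}
    frontier = [(start, 1.0)]
    while frontier:
        node, level = frontier.pop()
        for neighbor, weight in links[node].items():
            spread = level * weight * decay
            if spread > threshold and spread > activation.get(neighbor, 0.0):
                activation[neighbor] = spread
                frontier.append((neighbor, spread))
    return activation

print(activate("APPLE"))
```

Thinking of APPLE lights up RED, FRUIT, and SWEET strongly and TREE and EAT more weakly, a crude stand-in for the holistic balancing the text says real activation management requires.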

Central to the idea of concepts is their top-down organization. To manage our bodies productively, we, that is, our minds as the top-level control centers of our brains, have to look at the world as agents. When we first start to figure the world out, we learn simple categories like me and not-me, food and not-food, safe and not-safe. Our brains are wired to pick up on these kinds of binary distinctions to help us plan top-level behavior, and they soon develop into a full set of abstract generalities about concrete things. It is now impossible to say how much of this classification framework is influenced by innate preferences and how much was created culturally through language over thousands of generations, because we all now learn to understand the world with the help of language. In any case, our framework is largely shared, but we also know how to create new personal or ad hoc classifications as the need arises. For categories and particulars to be functional, we need deductive models with rules that tell us how they behave. Many of these models, too, are embedded in language and culture, and in recent centuries we have devised scientific models that have raised the scope and reliability of our conceptual knowledge to a new level.

Some examples of concepts will clarify the above points. The concept APPLE (all caps signifies a concept) is an abstract generality about a kind of fruit and not any specific apple. We have one reference point or handle in our minds for APPLE, which is not about the word “apple” or a thing like or analogous to an apple, but only about an actual apple that meets our standard of being sufficiently apple-like to match all the variable dimensions we associate with being an APPLE. From our personal experience, we know an APPLE’s feel, texture, and taste from many interactions, and we also know intuitively in what contexts APPLEs are likely to appear. We match these dimensions through recognition, which is a nonconscious process that just tells us whether our intuitive subconcepts for APPLE are met by a given instance of one. We also have deductive or causative models that tell us how APPLEs can be expected to behave and interact with other things. Although each of us has customized subconceptual and conceptual content for APPLE, we each have just one handle for APPLE and through it we refer to the same functionality for most purposes. How can this be? While each of us has distinct APPLE content from our personal experiences, the functional interactions we commonly associate with apples are about the same. Most generally, our functional understanding of them is that they are fruits of a certain size eaten in certain ways. In more detail, we probably all know and would agree that an APPLE is the edible fruit of the apple tree, is typically red, yellow or green, is about the size of a fist, has a core that should not be eaten, and is often sliced up and baked into apple pies. We will all have different feelings about their sweetness, tartness, or flavor, but this doesn’t have a large impact on the functions APPLEs can perform. 
That these interactions center around eating them is just an anthropomorphic perspective, and yet that perspective is generally what matters to us (and, not so incidentally, fruits appear to have evolved to appeal to animal appetites to help spread their seeds). Most of us realize apples come in different varieties, but none of us has seen them all (about 7,500 cultivars), so we just allow for flexibility within the concept. Some of us may know that apples are defined to be the fruit of a single species of tree, Malus pumila, and some may not, but this has little impact on most functional uses. The person who thinks that pears or apple pears are also apples is quite mistaken relative to the broadly accepted standard, but their overly generalized concept still overlaps with the standard and may be adequate for their purposes. One can endlessly debate the exact standard for any concept, but exactness is immaterial in most cases because only certain general features are usually relevant to the functions that typically come under consideration. Generality is usually more relevant and helpful than precision, so concepts all tend to get fuzzy around the edges. But in any case, as soon as irrelevant details become relevant, they can simply be clarified for the purpose at hand. Suppose I have an apple in my hand, which I can call APPLE_1 for the purposes of this discussion. APPLE_1 is a concrete particular or token of an APPLE, and we would consider its existence a fact based on just a few points of confirming evidence.
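Recognition as matching against a concept's variable dimensions can be sketched as a scoring function: an instance counts as an APPLE if it fits enough dimensions. The dimensions, the example instances, and the 75% threshold are all invented for illustration; real recognition is nonconscious and far richer.

```python
# A hedged sketch of recognition as dimension-matching. An instance
# matches APPLE if it satisfies enough of the concept's dimensions.
# Dimensions and threshold are illustrative assumptions.

APPLE_DIMENSIONS = {
    "is_fruit": lambda x: x["kind"] == "fruit",
    "color_ok": lambda x: x["color"] in {"red", "green", "yellow"},
    "size_ok":  lambda x: x["size"] == "fist",
    "has_core": lambda x: x["core"],
}

def recognize(instance, dimensions=APPLE_DIMENSIONS, threshold=0.75):
    """Return True if the instance meets enough dimensions of the concept."""
    score = sum(check(instance) for check in dimensions.values())
    return score / len(dimensions) >= threshold

apple_1 = {"kind": "fruit", "color": "red", "size": "fist", "core": True}
pear    = {"kind": "fruit", "color": "green", "size": "fist", "core": True}
banana  = {"kind": "fruit", "color": "yellow", "size": "long", "core": False}

print(recognize(apple_1), recognize(pear), recognize(banana))  # True True False
```

Note that the pear also passes: at this coarse level of description the overly generalized concept overlaps the standard one, echoing the point above that such a concept may still be adequate until finer functional distinctions become relevant.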

The fact that a given word can refer to a given concept in a given context is what makes communication possible. It also accounts for the high level of consistency in our shared concepts and the accelerating proliferation of new concepts through culture over thousands of years. The word “apple” is the word we use to refer to APPLE in English. The word “apple” is itself a concept, call it WORD_APPLE. WORD_APPLE has a spelling and a pronunciation and the content that it is a word for APPLE, while APPLE does not. We never confuse WORD_APPLE with APPLE and can tell from context what content is meant in any communication. Generally speaking, WORD_APPLE refers only to the APPLE fruit and the plant it comes from, but many other words have several or even many meanings, each of which is a different concept. Even so, WORD_APPLE, and all words, can be used idiomatically (e.g. “the Big Apple” or “apple of my eye”) or metaphorically to refer to anything based on any similarity to APPLE. We usually don’t name instances like APPLE_1, but proper nouns are available to name specific instances as we like. We don’t have specific words or phrases for most of the concepts in our heads, either because they are particulars or because they are generalities that are too specific to warrant their own words or names. A wax apple is not an APPLE, but it is meant to seem like an APPLE, and it matches the APPLE content at a high level, so we will often just refer to it using WORD_APPLE, only clarifying that it is a different concept, namely WAX_APPLE, if the functional distinction becomes relevant.

Some tokens seem to be the perfect or prototypical exemplars of an abstract category, while others seem to be minor cases or only seem to fit partially. For example, if you think of APPLE, a flawless red apple probably comes to mind. If you think of CHAIR, you are probably thinking of an armless, rigid, four-legged chair with a straight back. Green or worm-eaten apples are worse fits, as are stools or recliners. Why does this happen? It’s just a consequence of familiarity, which is to say that some inductive knowledge is more strongly represented. All the subcategories or instances of a completely impartial deductively-specified category are totally equivalent, but if we have more experience with one than another, then that will invariably color our thoughts. Exemplars are shaped by the weighting of our own experience and our assessment of the experience of others. We develop personal ideals, personal conceptions of shared ideals, and even ideals customized to each situation at hand that balance many factors. Beyond ideals, we develop similar notions for rarities and exceptions. Examples that only partially fit categories only demonstrate that the category was not generalized with them in mind. Nothing fundamental can be learned about categories by pursuing these kinds of idiosyncratic differences. Plato famously conceived the idea that categories were somehow fundamental with his theory of Forms, which held that all physical things are imitations or approximations of ideal essences called Forms or Ideas to which they in some sense aspire. I pointed out earlier that William of Ockham realized that categories were actually extrinsic. They consequently differ somewhat for everyone, but they also share commonalities based on our conception of what we have in common.
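The familiarity explanation above can be sketched directly: the "prototype" is just the most strongly represented inductive experience. The feature tuples and frequency counts below are invented for illustration.

```python
# A small sketch of prototypes as weighted familiarity: the exemplar of a
# category is simply the most frequently encountered kind of token.
# The counts are illustrative assumptions.

from collections import Counter

# Imagined record of encountered apples, as (color, condition) tuples.
experience = Counter({
    ("red", "flawless"): 40,
    ("green", "flawless"): 12,
    ("red", "worm-eaten"): 3,
})

prototype = experience.most_common(1)[0][0]
print(prototype)  # ('red', 'flawless')
```

Nothing in the category itself privileges the flawless red apple; the weighting of experience alone makes it come to mind first, which is the sense in which exemplars are extrinsic rather than Platonic.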

  1. “The social life of corvids,” Current Biology, vol. 17, no. 16, pp. R652–R656, August 21, 2007.
  2. Larissa Swedell, “Primate Sociality and Social Systems,” Nature Education, 2012. (Queens College, City University of New York; New York Consortium in Evolutionary Primatology.)
  3. Katharina Friederike Brecht, “A multi-faceted approach to investigating theory of mind in corvids,” University of Cambridge, April 2017.
  4. “Punctuated equilibrium,” Wikipedia.
  5. “Dam Building: Instinct or Learned Behavior?,” Beaver Economy Santa Fe D11, February 2, 2011.
  6. “Language deprivation experiments,” Wikipedia.
  7. “Nicaraguan Sign Language,” Wikipedia.
  8. Yusuke Watanabe et al., “Language Evolution and the Baldwin Effect,” Graduate School of Information Science, Nagoya University, Japan, 2008.
  9. I didn’t discuss how we experience innate conception, and I’m not going to cover the innate aspects of conception for a while, because we first need to understand what concepts are.
  10. “Neuron,” Wikipedia.
