2.1 Life: 4 billion to 600 million years ago

While we don’t know many of the details of how life emerged, the latest theories connect a few more dots than we could before. Deep-sea hydrothermal vents 1 may have provided at least these four necessary precursors for early life to arise around four billion years ago:

(a) a way for hydrogen to react directly with carbon dioxide to create organic compounds (called carbon fixation).

(b) an electrochemical gradient to power biochemical reactions that led to ATP (adenosine triphosphate) as the store of energy for biochemical reactions.

(c) the formation of the RNA world in which RNA could replicate itself and catalyze reactions in protected areas (possibly iron-sulfur bubbles or other compartments). The ribosome evolved to translate RNA to proteins; up to six likely phases of ribosome evolution have been proposed.2 Early protein chemistry and basic metabolism were established.

(d) the chance creation of lipid bubbles enclosing this RNA metabolic soup to form the first RNA-based cells. Genes don’t code for everything in life — they can’t create membranes. But genes quickly evolved proteins to strengthen and improve those original cell walls to make them very sophisticated biological entities. Genetically-driven cell division evolved at this time, allowing cells to multiply.

This scenario is at least a plausible way for the precursors of life to congregate in one place and have opportunities for feedback loops to develop which could start to capture function and then ratchet it up. Many steps are missing here, and much of the early feedback probably depended more on chance than on mechanisms that actually capture and leverage it as information. Alexander Rich first proposed the concept of the RNA world in 1962 because RNA can both store information and catalyze reactions, and thus do both the tasks that DNA and proteins later specialized at. The RNA world may have evolved for hundreds of millions of years before viruses and DNA brought about the next big changes.

(e) the creation of viruses. Genomic analysis suggests a critical viral contribution to the origin of life as we know it. I propose an extensive “age of viruses” that spanned hundreds of millions of years and included a vast proliferation and refinement of biochemical mechanisms to manage cells, genes and proteins from which only the few most successful lines survived.

While it has long been contentious whether viruses should even be classified as living given that they are obligate parasites, they are alive because they have their own genome, and they metabolize and reproduce during part of their life cycle.3 Patrick Forterre suggests calling normal cells “ribocells” and cells producing virions (inactive virus particles) “virocells”. Only ribocells code for ribosomes (needed to make proteins), while only virocells code for capsids, which are the protein coats of virions. Viruses are thus distinct living organisms when they are virocells and are like seeds when they are virions. All viruses today kill their ribocell host when they are virocells by rapidly reproducing and then bursting the cell wall (a process called lysis), but the first viruses probably lived symbiotically as “ribovirocells” until the capacity for lysis evolved. Many viruses also live symbiotically with their hosts via a lysogenic phase in which they integrate into the host genome until induced to leave, which can happen many generations later or even never. About 8% of the human genome is known to be viral as embedded (lysogenic) retroviruses, most of which are now permanent residents. All cells today use double-stranded DNA, but viruses come in both single and double-stranded RNA and DNA forms, and also have a more diverse molecular biology than today’s cellular life, suggesting an older origin. At least six surviving lines of viruses appear to have evolved independently, and many more have probably not survived.45

At the end of the age of viruses, just three lines of DNA-based cells and six lines of viruses symbiotic with them emerged. Those three lines are bacteria, archaea, and eukaryotes. These three either share a last universal common ancestor, or LUCA, about 3.5 billion years ago, or they come from three independent conversions of RNA cells to DNA cells. I suspect the latter case, along with many other cell lines from the age of viruses that did not survive. Furthermore, eukaryotes are much more highly evolved than bacteria and archaea, so I discuss them separately below. Viruses probably “invented” DNA as a superior replication strategy and then took over the replication machinery of some RNA cells they infected to create DNA-based cells. The stability of double-stranded DNA as a genetic material eventually eclipsed single or double-stranded RNA and single-stranded DNA in all living cells, but single-stranded and RNA-based viruses still exist.6 While it is theoretically possible that some new lines of viruses evolved after DNA-based cells took over, I suspect that the “wild west” window of opportunity to evolve something as novel as a virus had shut down for good because the competition from existing mechanisms had become too great.

Early life must have been very bad at even basic cell functions compared to modern forms, so much of the adaptive pressure in the early days must have focused on improving the core mechanisms of metabolism, replication, and adaptation during the RNA world and the age of viruses. As life first became more robust, it became less dependent on the hydrothermal vents and was gradually able to move away from them. Although the central mandate of evolution is to survive over time, we can roughly prioritize the set of component skills that needed to evolve along the way. As each of these skills improved over time, organisms that could do them better would squeeze out those that could not:

1. Metabolism is, of course, the fundamental function as life must be able to maintain itself. A source of energy was critical to this, which is why hydrothermal vents are such a likely starting point.

2. Reproduction was the next most critical function, as any kind of organism that could produce more like itself would quickly squeeze out those that could not. This is where RNA comes in. Although RNA is too complex to have been the first approach used to replicate functionality, we can guess that a functional ratchet got to RNA through a series of simpler but less effective molecules that have not survived in any lifeform today.

3. Natural selection at the level of traits is the next most critical function needed because it would make possible the piecewise improvement of organisms. Metabolism and replication critically need trait-level selection to improve, so these mechanisms coevolved. Bacteria developed a mechanism called conjugation that lets two bacterial cells connect and copy a piece of genetic material called a plasmid from one to the other. Most plasmids ensure that the recipient cell doesn’t already have a similar plasmid, which protects against counterproductive changes. There are so many bacteria that a good strategy for them is to try out everything and see what works.

4. Proactive gene creation. Directed mutation is currently a controversial theory, but I think it will turn out that nearly all genetic change is pretty carefully coordinated and that the mechanisms that make it possible evolved in these early years. I am talking about ways a cell can assemble new genes by combining snippets of DNA called transposable elements (also called TEs, “jumping genes” or transposons) and then inserting the result back into chromosomes. Sometimes this creates “junk DNA” that does nothing, and sometimes it creates new, active genes. Viruses depend on this kind of technology, and we know genes can jump, but it is hard to see how a mechanism could evolve that could do this in a useful way. What we have to remember is that it only has to be more useful than chance to survive and prosper. Cells that were “open minded” about mixing up their DNA could evolve strategies like this if they work sometimes. And because this could happen, it almost certainly did happen, because those that could gain such an advantage, even if it took many generations to show benefits, would have squeezed out those that could not. This very long-term selection of gene-editing technology has by now had as much time to evolve as the more visible traits of genes, even though it is much harder to see or even imagine how it might be working. Adi Livnat calls this mechanism the “writing phenotype” to contrast it with the “performing phenotype”, which are the genes behind the observable traits of an organism. If true, this would be the largest extension to the theory of evolution since Darwin.7

The next big step was:

(f) the arrival of eukaryotes

All cellular (non-viral) life today is in the bacteria, archaea, and eukaryote domains. Bacteria and archaea, collectively called prokaryotes, are primitive single-celled organisms, but eukaryotes are elaborate single-celled creatures typically 10,000 times the size of prokaryotes, and also comprise nearly all multicellular lifeforms on earth. Eukaryotes are a minuscule fraction of all living things by number, but because they are much bigger they have about the same worldwide biomass as prokaryotes. Prokaryotic cells lack internal structures, but eukaryotic cells have a variety of cell organelles, most notably a cell nucleus and mitochondria, which both have double membranes, and an endomembrane system including the endoplasmic reticulum and the Golgi apparatus. Eukaryotes must have a complex prehistory now lost to us, but the evidence suggests they arose from an original cell line, dubbed chronocytes, that must already have been much more complex creatures than bacteria or archaea.8 Chronocytes had a cytoskeleton, which is a network of protein fibers that attaches to the cell wall and gives it shape, and which made it possible for them to engage in endocytosis, the ingestion of bacteria and other objects by engulfment. The eukaryotic ancestor likely ingested a number of bacteria and archaea that made permanent alterations, as genetic remnants of bacteria, archaea and a third unrelated line (the postulated chronocytes line) are now found in eukaryotes. The nucleus and mitochondria are probably such engulfed organisms that retained much of their structure (a process called symbiogenesis).9 The double membrane is the expected signature of a single-membraned creature engulfed by an outer cell wall.10 Algae and plants are eukaryotes that engulfed organelles called plastids. Mitochondria and plastids reproduce with their own DNA, while cell nuclei became the repository for the host cell’s DNA.

These physical enhancements to eukaryotes gave them a whole host of new functional capabilities prokaryotes lack, including endocytosis (as noted) and locomotion using flagella, cilia, or pseudopods. The endomembrane system, which comprises more than half of the total membrane in eukaryotic cells, likely originated by folding and refolding an inner or outer membrane. It facilitates the synthesis and transport of proteins like a post office, which likely accounts for why eukaryotic cells can be so much larger, more complex, and more functional than prokaryotes. Eukaryotes acquired energy-producing capabilities already refined by prokaryotes through mitochondria and chloroplasts, which are plastids that can photosynthesize. But perhaps the greatest invention of the eukaryotes was:

(g) sexual reproduction, which combines genes from two parents to create a new combination of genes in every offspring.

Sexual reproduction is a nearly universal feature of eukaryotic organisms11 and the basic mechanisms are believed to have been fully established in the last eukaryotic common ancestor (LECA) about 2.2 billion years ago. In the short term, sex has a high cost but few benefits. However, in the long term it provides enough advantages that eukaryotes almost always use it. Asexual reproduction is used by prokaryotes and by the somatic (non-sex) cells of eukaryotes. In prokaryotes it is called binary fission and in somatic cells it is called mitosis. In both cases, a double strand of DNA is separated and each single strand is then used as a template to create two new double strands. When the cell divides into two, each daughter cell ends up with one set of DNA.
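
The template-copying step described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the function names (`complement`, `replicate`) are hypothetical, the strands are plain strings of bases, and the sketch ignores strand direction, enzymes, and copying errors.

```python
# Toy model (illustrative assumption): a double strand is a pair of base strings.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Build the partner strand by pairing A with T and C with G."""
    return "".join(COMPLEMENT[base] for base in strand)

def replicate(double_strand):
    """Separate the two strands and use each one as a template for a new partner."""
    top, bottom = double_strand
    daughter_1 = (top, complement(top))        # old top strand + newly built partner
    daughter_2 = (complement(bottom), bottom)  # newly built partner + old bottom strand
    return daughter_1, daughter_2

parent = ("ATGC", "TACG")
daughter_1, daughter_2 = replicate(parent)
print(daughter_1, daughter_2)
```

Each daughter keeps one old strand and gains one new one, which is why both daughter cells end up with the same set of DNA as the parent.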

Sexual reproduction uses a modified cell-division process called meiosis and a cell fusion process called fertilization. Cells that undergo meiosis contain a complete set of genes from each of two parents. They first replicate the DNA, making four sets of DNA in all, and then randomly shuffle genes between parent strands in a process called crossing over. The cell then divides twice to make four gametes each with a unique combination of parental genes. Gametes from different parents then fuse during fertilization to create a new organism with a complete set of genes from each of its two parents, where each set is now a largely random mixture from each parent.
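
As a rough illustration of the meiosis steps just described, here is a minimal Python sketch. It rests on several simplifying assumptions that are mine rather than the biology's: a single chromosome per parent, genes written as letters (uppercase for one parent, lowercase for the other), and exactly one crossover between one pair of copies. The function name `meiosis` and its parameters are hypothetical.

```python
import random

def meiosis(maternal, paternal, rng):
    """Return four gametes from one cell (toy model: one chromosome, one crossover)."""
    if len(maternal) != len(paternal) or len(maternal) < 2:
        raise ValueError("chromosomes must match in length and hold at least two genes")
    # Replicate the DNA: two copies of each parental chromosome, four sets in all.
    copies = [list(maternal), list(maternal), list(paternal), list(paternal)]
    # Crossing over: one maternal copy and one paternal copy swap everything
    # past a randomly chosen point.
    point = rng.randrange(1, len(maternal))
    m, p = copies[1], copies[2]
    m[point:], p[point:] = p[point:], m[point:]
    # Two divisions hand one copy to each of the four gametes.
    return ["".join(copy) for copy in copies]

gametes = meiosis("ABCDEF", "abcdef", random.Random(0))
print(gametes)
```

In this simplified version two gametes carry the unmixed parental chromosomes and two carry recombined ones. Real cells have many chromosomes and can cross over at several points, so the number of possible gametes is far larger than this sketch suggests.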

Sexual reproduction is clearly a much more complex and seemingly unlikely process compared to asexual reproduction, but I will show why sex is probably a necessary development in the functional ratchet of life. The underlying reason for sex is that it facilitates points 3 and 4 above, namely natural selection at the level of traits and proactive gene creation. Because mechanisms evolved to do both 3 and 4 well, prokaryotes evolved in just two billion years instead of two trillion or quadrillion. Of course, I can only guess about time frames this large, but in my estimation evolution would have made almost no progress at all without refining these two mechanisms, so any organisms that could improve on them would have a huge advantage over those that did them less well. We know that conjugation is not the only mechanism prokaryotes use to transfer genetic material between themselves. All such mechanisms outside of sexual reproduction are called horizontal gene transfer (HGT), and also include transformation and transduction. Transduction is the incorporation of DNA from viruses, and viruses likely created most or even all of the very elaborate machinery behind horizontal gene transfer in the first place. Any mechanism that can share genetic information at the gene or function level with other organisms creates opportunities for new combinations of genes to compete. Life on earth has been a group (horizontal) effort because advantageous mutations useful to different domains arose in different lines of vertical descent. Without sex, prokaryotes would be evolutionary dead-ends that died off as deleterious mutations accumulated, but HGT gives them access to enough new genetic material to ward off this fate and even to adapt well to new environments. HGT allows many new genetic combinations to be tried at a fairly low cost since the number of single-cell organisms is very high. But it also lacks many mathematical advantages that sex brings to the table. 
If we assume “that the protoeukaryote → LECA era featured numerous sexual experiments, most of which failed but some of which were incorporated, integrated, and modified,”12 then nearly all of the steps that created sex, which is a highly-refined but complex mechanism, are lost to us.

What benefits does sex provide that led to its evolution? John Maynard Smith famously pointed out that in a male-female sexual population, a mutation causing asexual reproduction (i.e. parthenogenesis, which does naturally arise sometimes allowing females to reproduce as clones without males) should rapidly spread because asexual reproduction has a “twofold” advantage since it no longer needs males. It is true that when resources allow unlimited growth, asexual reproduction can thus spread faster, but this rarely happens. Usually, populations are constrained by resources to a roughly stable population. Achieving the fastest reproduction cycle is not the critical factor in long-term success in these situations, and it is actually rather irrelevant. In any case, eukaryotic populations can and have evolved ways to switch between sexual and asexual modes of reproduction to capitalize on this asexual advantage, but in practice this almost never happens. I think this is because the ability to multiply faster comes with the cost of being monoclonal, and this staggering loss of genetic diversity is likely to create a genetic dead end. All major vertebrate groups except mammals have species that can sometimes reproduce parthenogenetically13, including about eighty species of unisex reptiles, amphibians, and fishes. While these lines may last for quite a while, they have few prospects for further adaptation. Sexual reproduction makes natural selection at the level of traits (point 3 above) possible. Only through sexual reproduction can variants of each gene in a population vie for success independently from all the other traits. Sexual reproduction can thus create an almost unlimited number of genomes with different combinations of genes, while all asexual creatures remain clones (barring HGT, though prokaryotic genomes stay very small, so they must both take genes in and knock them out). 
Beneficial traits can spread through a population “surgically” replacing less effective alleles (variants of the same gene). Sex gives a species vastly more capacity to adapt to changing environments because variants of every gene can remain in the gene pool waiting to spread when conditions make them more desirable.14 Asexual creatures can’t keep genes around for long that aren’t useful right now, because they can’t generate new combinations (except by HGT). We can conclude that Maynard Smith was right that asexual reproduction provides a “quick win”, but because it is a poor long-term strategy its use is very limited in multicellular life. Overall, sex is what makes eukaryotic evolution possible because it provides a controlled way for traits to evolve independently.
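
Maynard Smith's “twofold” advantage can be made concrete with a small calculation. The Python sketch below is a toy model under assumptions of my own choosing: unlimited growth, a 50:50 sex ratio, the same number of offspring per female in both lineages, and an asexual clone that starts at 1% of the population. The function name `asexual_share` and its parameters are hypothetical. It covers only the short-term “quick win” and says nothing about the long-term cost of lost genetic diversity discussed above.

```python
def asexual_share(generations, offspring_per_female=2, start_share=0.01):
    """Fraction of the population that is asexual after each generation."""
    sexual = 1.0 - start_share
    asexual = start_share
    shares = []
    for _ in range(generations):
        # Only the female half of the sexual population bears offspring.
        sexual = (sexual / 2) * offspring_per_female
        # Every asexual individual is a reproducing female.
        asexual = asexual * offspring_per_female
        shares.append(asexual / (sexual + asexual))
    return shares

for generation, share in enumerate(asexual_share(10), start=1):
    print(generation, round(share, 3))
```

Under these assumptions the ratio of asexual to sexual individuals doubles every generation, which is the sense in which the advantage is twofold.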

Finally, we see:

(h) complex multicellularity, meaning organisms with specialized cell types.

Multicellular life has arisen independently dozens of times, starting about 1 billion years ago, and even some prokaryotes have achieved it, but only six independently achieved complex multicellularity: animals, two kinds of fungi, green algae (including land plants), red algae, and brown algae. The relatively new science of evo-devo (evolutionary developmental biology) is focused largely on cell differentiation in complex multicellular (eukaryotic) organisms. The way that the cells of the body achieve such dramatically different forms, simplistically, is by first dividing and then turning on regulatory genes that usually then stay on permanently. Regulatory genes don’t code for proteins, but they do determine what other regulatory genes will do and ultimately what proteins will be produced. Consequently, as an embryo grows, each area can become specialized to perform specific tasks based on what proteins the cell produces.

The most dramatic demonstration of the power of triggered differentiation is radial and bilateral symmetry. Most animals (the bilateria) have near perfect bilateral symmetry because the same regulatory strategy is deployed on each side, which means that so long as growth conditions are maintained equally on both sides, a perfect (but reversed) “clone” will form on each side. Evo-devo has revealed that the eyes of insects, vertebrates, and cephalopods (and probably all bilateral animals) evolved from the same common ancestor, contrary to earlier theory. Homeoboxes are parts of regulator genes shared widely across eukaryotic species that regulate what organs develop where. As evo-devo uncovers the functions of regulatory genes, the new science of genomics is mathematically exposing the specific evolutionary origins of every gene. Knowing each gene’s origins and roughly what it does will coalesce into a comprehensive understanding of development.

Multicellularity and differentiation created opportunities for specialized structures to arise in bodies to perform different functions. Tissues are groups of cells with similar functions, organs are groups of tissues that provide a higher level of functionality still, and organ systems coordinate the organs at the highest level. A stream has no purpose; water just flows downhill. But a blood vessel is built specifically to deliver resources to tissues and to remove waste. This may not be the only purpose it serves, but it is definitely one of them. All tissues, organs, and organ systems have specific functions which we can identify, and usually one that seems primary. Additional functions can and often do arise because having multiple applications is sometimes the most convenient way for evolution to solve problems with the available resources. Making high-level generalizations about the functions of tissues, organs, and organ systems is the best way to understand them, provided we recognize that generalizations usually have exceptions. The heart definitely specializes in pumping blood and the brain in overall control of the body. The study of these structures should focus first on their function and only secondarily on their form because their form is driven by their function. The blood will still need circulation and the body will still need coordinated control regardless of what physical mechanisms are drafted to do it. So physicalism must take a back seat to functionalism in areas driven by function, which means in the study of life.

Evolution superficially appears to be a process in which complex forms supersede simpler ones, but it is more accurate to think of it as a continuously improving functional web. Functions can be lost along the way, but barring complete system collapse (and mass extinctions do happen), functionality will tend to increase steadily. Plants and animals could not survive without a complex symbiosis with countless bacteria, archaea, fungi, protists (single-cell eukaryotes), and viruses, which is fortunate because they left breadcrumbs that help us understand how small incremental steps moved the functional ratchet forward. I’ve broken those steps down as much as I could above, but what are the biggest missing links? Before about fifty years ago, we had no insight into what preceded multicellular life, and now we can break that period down into seven stages (a to g) with some detail. Even so, we can only surmise that vast complexity arose and was later simplified (through population bottlenecks) to create the RNA world, viruses, the eukaryotes, and sex. Because these developments were streamlined from complexity now lost, we can never be quite sure how they unfolded, but we will continue to develop insights. What we do know is that everything that has happened since multicellularity arose is child’s play compared to what happened before. We have enough genetic evidence in multicellular creatures that we should eventually be able to piece out almost exactly how each detail evolved. The traditional missing link, the leap from ape to man, still holds many mysteries, but they are all solvable.

  1. Nick Lane, “The Cradle of Life”, The New Scientist, 17 October 2009
  2. Anton S. Petrov et al, History of the ribosome and the origin of translation, PNAS December 15, 2015 112 (50) 15396-15401
  3. Some have suggested that viruses may have evolved before cells as the first independent replicators of genetic material. The first cells would then have contained some viruses, some of which would eventually replicate with the cell instead of via capsids. But the genetic evidence shows no remnants of viral DNA in cellular DNA, and I think a later parasitic origin for viruses is more likely.
  4. “modern viruses (and plasmids, most likely originated from them) would have inherited from this ancient virosphere many molecular mechanisms that have disappeared from modern DNA cells. This would explain why the molecular biology of the viral world for transcription, replication repair, and recombination is more diverse than that of the cellular world (despite the fact that we have only explored a tiny fraction of the modern virosphere). If this view is correct, many still unknown molecular mechanisms (and their associated proteins) remain to be discovered in viruses.”, Marc H. V. van Regenmortel, Brian W. J. Mahy, Desk Encyclopedia of General Virology, 2009
  5. Patrick Forterre, Mart Krupovic, The Origin of Virions and Virocells: The Escape Hypothesis Revisited, Viruses: Essential Agents of Life, pp.43-60
  6. Patrick Forterre, Giant Viruses: Conflicts in Revisiting the Virus Concept, Unité de Biologie du Gène chez les Extrêmophiles, Institut Pasteur
  7. Adi Livnat, Interaction-based evolution: how natural selection and nonrandom mutation work together, Biol Direct. 2013; 8: 24. The writing phenotype might even be capable of creating or enabling genes on demand in response to changing environmental circumstances, which is a phenomenon that has sometimes been observed but can’t be explained.
  8. Hyman Hartman and Alexei Fedorov, The origin of the eukaryotic cell: A genomic investigation, PNAS February 5, 2002 99 (3) 1420-1425
  9. This is why it is awkward to suggest eukaryotes have a common ancestor with prokaryotes — their DNA was transferred to an engulfed prokaryote from a chronocyte. This kind of transfer of DNA can happen, thanks to gene-editing enzymes that probably derive from viruses. Could the first chronocytes have been archaea? Not really; it is like asking if the first chimps were lemurs — you’d have to go back to a very different ancestral form.
  10. Hartman H, The origin of the eukaryotic cell, 1984, Speculations Sci Technol. 1984;7(2):77-81
  11. Ursula Goodenough and Joseph Heitman, Origins of Eukaryotic Sexual Reproduction, Cold Spring Harbor Perspectives in Biology
  12. Ursula Goodenough and Joseph Heitman, Origins of Eukaryotic Sexual Reproduction, Cold Spring Harbor Perspectives in Biology
  13. Switch from sexual to parthenogenetic reproduction in a zebra shark
  14. Jef Akst, “Why Sex Evolved”, The Scientist, October 13, 2010
