Evo Devo: A Computational View

Ugo O. Gagliardi

By far the most exciting development in biology in the last few years is the emergence of the new subdiscipline of "Evolutionary Development", also known as "Evo Devo". This discipline studies how the embryo-building process has evolved across the tree of life and over the eons. A very good and readable book on this topic is:

Sean B. Carroll: "Endless Forms Most Beautiful: The New Science of Evo Devo", W.W. Norton, 2005, ISBN 0-393-06016-0

Evo Devo Evidence

The key findings of Evo Devo to date (2006) are:

These findings are very significant for three major reasons:

Specifically, signaling proteins are produced by cells. If the k-th cell type C_k requires the set {P_i}_k of signaling proteins, and these are produced by the set {C_j}_k of cell types, then all cell types in {C_j}_k MUST EXIST before C_k can come into existence. This, in turn, sets up precedence relationships between the cell types of an organism. Embryo development MUST respect these precedence relationships; that is, it must follow temporal sequences of tissue types that agree with them.

The existence of such precedence relationships implies that the processes that generate the organism's cell types have to be arranged as a partial order[ing], that is, as a graph whose nodes represent the individual cell differentiation processes and whose links represent the precedence relationships that apply to each cell type. Such a graph is referred to as a DAG, for Directed Acyclic Graph. Directed means that each link can only be traversed in one direction (the time direction, typically drawn from left to right). Acyclic means that the graph contains no loops; that is, later, more differentiated cell types do not produce signaling proteins used in the generation of earlier, less differentiated cell types.
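The precedence idea can be illustrated with a minimal Python sketch. The cell type names A to D and the PREREQUISITES table are hypothetical placeholders, not data from any organism; the sketch only shows that an acyclic set of precedence relations always admits at least one valid development order, while a cyclic one does not.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical precedence relations (illustrative only): each cell type maps
# to the set of cell types whose signaling proteins it requires.
PREREQUISITES = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},
}

def development_order(prerequisites):
    """Return one ordering of cell types that respects every precedence."""
    return list(TopologicalSorter(prerequisites).static_order())

def is_acyclic(prerequisites):
    """True if the precedence relations contain no loop."""
    try:
        development_order(prerequisites)
        return True
    except CycleError:
        return False

order = development_order(PREREQUISITES)
print(order)
print(is_acyclic({"X": {"Y"}, "Y": {"X"}}))  # a loop admits no valid order
```

Several orders may be valid for the same graph (here B and C can appear in either sequence), which is exactly what "partial order" means.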

These observations suggest that the overall embryogenesis process is, mostly, a quasi-tree (more precisely a Directed Acyclic Graph or DAG) of sub-processes, in the sense that most sub-processes have a single parent process. Such sub-processes respect the precedence relationships mentioned above. It is very likely that the structure of this DAG, which we will call the BioPert, and the associated precedence relationships reflect, to a significant extent, evolutionary history; that is, the oldest organs and their associated processes appear early in the BioPert.

Results from Computer Science (computational complexity) indicate that such BioPerts have to be, of necessity, the result of evolutionary (trial & error) processes, which therefore negates the possibility of alternative generative hypotheses for biology.

Embryo development is a bootstrap process. That is, earlier tissue types are prerequisites for the development of subsequent tissues and organs. In other words, embryo development is characterized by temporally sequenced tissue/organ development processes. Another way to say this is that the various embryonic tissues must respect a number of precedence relationships. The BioPert could therefore also be represented by a PERT chart (Program Evaluation & Review Technique, used in many engineering fields to describe complex development processes) with nodes corresponding to the distinct tissue types. The arcs or transitions in the PERT chart represent the precedence relationships that must hold during the embryonic bootstrap. The incoming links of a node represent all the signaling proteins, and the cell/tissue types that generate them, that MUST already be present before the tissue corresponding to the node can begin to form by cellular differentiation.

The recent Evo Devo findings show that major cell/tissue differentiation processes are under the control of specific Master Control Genes (MCGs) and their related cascades of controlled genes, or GeNets. It is very likely that all distinct cell differentiation processes will turn out to be under the control of one or more specific MCGs. This would mean that the PERT chart states can alternatively be labeled with the respective MCG(s). Therefore, progress in Evo Devo research should eventually allow us to label all the BioPert nodes with the pertinent MCG(s).

Interesting computational insights result if one considers the differential impact of random mutations on the links or nodes of the BioPert. Clearly, the potentially disastrous impact of a mutation will be much greater if it affects nodes (and their outgoing links) that have the highest number of "descendant" nodes. The nodes that have no descendants are the freest to evolve under random mutation and thus the least likely to be conserved. Such nodes will be found on the "skin" of the BioPert; that is, they will be external nodes.

Conversely, all internal BioPert nodes will be more likely to be conserved under mutation, with the nodes corresponding to early tissues being the most strongly conserved. Mutations in such nodes can more adversely impact the viability of the developing organism and will therefore be weeded out by the death of the organism. In other words, tissue types corresponding to internal BioPert nodes, especially early internal nodes, will be strongly conserved and will, effectively, not evolve in a random fashion.
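As a rough illustration of this argument, the following Python sketch counts the descendants of each node in a toy BioPert and lists its external nodes. The CHILDREN graph and its node names are invented for the example, and using the raw descendant count as a proxy for how strongly a node is conserved is an assumption of the sketch, not a measured quantity.

```python
# Toy BioPert (hypothetical): each node maps to the nodes that depend on it.
CHILDREN = {
    "morula": ["gastrula"],
    "gastrula": ["neurula", "gut"],
    "neurula": ["eye"],
    "gut": [],
    "eye": [],
}

def descendants(children, node):
    """Return the set of all nodes reachable from `node`."""
    seen = set()
    stack = list(children.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, ()))
    return seen

def conservation_ranking(children):
    """Rank nodes by descendant count, most descendants first (ties by name)."""
    counts = {n: len(descendants(children, n)) for n in children}
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

def external_nodes(children):
    """Nodes with no descendants: the 'skin' of the BioPert."""
    return sorted(n for n, kids in children.items() if not kids)

print(conservation_ranking(CHILDREN))
print(external_nodes(CHILDREN))
```

In this toy graph the earliest node has the most descendants and the external nodes have none, matching the expectation that early internal nodes are the most constrained.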

This implies that major biological innovations, such as those involved in major speciation events (biological radiations), very likely result from appending new GeNets as external [later] nodes of the BioPert. This, in turn, means that if we plot the time at which a particular feature (such as eyes, limbs, penis, etc.) was "invented" by the evolutionary process against the time at which the feature begins to form during embryogenesis, we should get a monotonically ascending line; i.e., features discovered later appear later in embryo development. That is, ontogeny will recapitulate phylogeny.
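The predicted monotonic relationship could be tested with a check like the one sketched below in Python. The numeric pairs are made-up placeholders in arbitrary units, not measurements; real values would have to come from the fossil record and from embryological staging.

```python
def is_monotonic(pairs):
    """pairs: (evolutionary_origin_time, embryonic_onset_time) per feature.

    Both values are assumed to grow toward the present / toward birth.
    Returns True if embryonic onset never decreases as the evolutionary
    origin moves later.
    """
    ordered = sorted(pairs)  # sort features by evolutionary origin
    onsets = [onset for _, onset in ordered]
    return all(a <= b for a, b in zip(onsets, onsets[1:]))

# Placeholder values in arbitrary units (hypothetical, for illustration only).
consistent = [(1, 2.0), (2, 3.5), (3, 5.0)]
inconsistent = [(1, 4.0), (2, 3.0)]

print(is_monotonic(consistent))
print(is_monotonic(inconsistent))
```

A feature that evolved late but forms early in the embryo would make the check fail, and would count as evidence against the model.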

The strong conservation of the BioPert internal nodes, especially the early internal nodes, implies that during the process of speciation the BioPerts of the emerging new species will retain most, if not all, of the internal nodes of the parent species. That is, the parent BioPert will appear as an internal sub-network of the offspring species' BioPerts.

This means that evolution "in the large" will be well described by sets of "nested" BioPerts, with ancestral species' BioPerts being deeply nested in the BioPerts of all subsequent descendant species. This mode of development can be referred to as Nested Derivative Descent (NDD). Nested refers to the nesting of ancestors' BioPerts in descendants' BioPerts; Derivative refers to the fact that proximate descendant BioPerts are derived via incremental modifications of the internal nodes; finally, Descent refers to the fact that the derivation relies on an uninterrupted chain of parent-child relationships. In general, the study of evolution "in the large" will be best done by obtaining and focusing on the various BioPerts involved. In fact, the BioPert, as indicated above, would contain, in a well-structured way, the minimal necessary and sufficient information to describe a species' actual design.
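The nesting claim can be stated as a simple graph test, sketched here in Python: an ancestral BioPert is nested in a descendant BioPert if all of its nodes and all of its precedence links reappear in the descendant. The two small graphs and the helper names are hypothetical and serve only to illustrate the test.

```python
def edges(prerequisites):
    """Return the set of (prerequisite, node) precedence links."""
    return {(pre, node) for node, pres in prerequisites.items() for pre in pres}

def is_nested_in(ancestor, descendant):
    """True if every node and every precedence link of `ancestor`
    also appears in `descendant`."""
    return (set(ancestor) <= set(descendant)
            and edges(ancestor) <= edges(descendant))

def appended_nodes(ancestor, descendant):
    """Nodes present in the descendant BioPert but not in the ancestor."""
    return set(descendant) - set(ancestor)

# Hypothetical BioPerts: node -> set of prerequisite nodes.
ANCESTOR = {"A": set(), "B": {"A"}}
DESCENDANT = {"A": set(), "B": {"A"}, "C": {"B"}}

print(is_nested_in(ANCESTOR, DESCENDANT))
print(appended_nodes(ANCESTOR, DESCENDANT))
```

Applied along a lineage, the same test would check that each BioPert in the chain is nested in the next one.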

The "Intelligent Design" critics of evolution theory have a point. Namely, it is not possible to assemble something as complex as even a worm via random mutations. However, where they fail is in not recognizing that complex structures will not evolve by random mutation at the same rate for all their parts. Namely, when a successful design for a part or subsystem has evolved by mutation & selection, it will tend to be conserved in all the descendants. The slow accumulation of such partial successes will, in hindsight, give the impression that a conscious designer was at work.

The set of precedence relations for a given species must be unique, reflecting the unique evolutionary history of the species. These relations are, therefore, shared by all its members. In other words, the truly invariant representation of a species would be the BioPert. In fact, the BioPert would contain the absolute minimum amount of information capable of characterizing the species, while abstracting out all the information relating to its members' variability. That is, the actual final body parts and organ architecture must be the result of dynamic feedback mechanisms.

Such a computational architecture would be compatible with the observed facts in tetragametic chimerism as well as sexual reproduction in general.

To sum up, the embryo generation process is composed of a network of cell differentiation processes. These processes can be arrayed as a PERT network: the BioPert. Each of its nodes is a specific cell differentiation process. A node is "fired" once all the cell types corresponding to its inputs are present in "adequate" numbers to produce enough signaling protein(s) to activate the node's MCG(s) and related GeNets. The wave of firings proceeds across the BioPert until all of its states have been activated, i.e., until all the tissue/organ types have been generated.
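The firing rule summarized above can be simulated with a short Python sketch. Everything quantitative here is an assumption made for illustration: the four-node graph, the threshold of 4 cells standing in for "adequate" numbers, and the rule that every existing cell type doubles at each time step. The per-type cell counts play the role of the embryo's state.

```python
# Hypothetical BioPert: node -> list of prerequisite cell types.
PREREQUISITES = {"A": [], "B": ["A"], "C": ["B"], "D": ["B", "C"]}

def simulate(prerequisites, threshold=4, max_steps=100):
    """Fire each node once all its inputs reach `threshold` cells.

    Assumed dynamics (not from any experiment): at every step, the cell
    count of each existing type doubles, and a newly fired node starts
    with a single cell. Returns (fired_at, counts).
    """
    counts = {node: 0 for node in prerequisites}
    fired_at = {}
    for step in range(max_steps):
        # A node is ready if it has not fired and all its inputs are adequate.
        ready = [n for n, pres in prerequisites.items()
                 if n not in fired_at
                 and all(counts[p] >= threshold for p in pres)]
        for n in counts:          # growth of already existing cell types
            counts[n] *= 2
        for n in ready:           # differentiation: seed the new cell type
            fired_at[n] = step
            counts[n] = 1
        if len(fired_at) == len(prerequisites):
            break                 # every tissue type has been generated
    return fired_at, counts

fired_at, counts = simulate(PREREQUISITES)
state = [counts[n] for n in PREREQUISITES]  # cell count per cell type
print(fired_at)
print(state)
```

The firing times come out in an order compatible with the precedence relations, and the wave stops once every node has fired.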

Such PERT networks can indeed perform computations since they possess state (memory). The state of the developing embryo is the collection of the embryo's cells at a given point in time. The state could be denoted by the vector (n1, n2, ..., nr), where nj denotes the number of cells of the j-th cell type. The BioPert, for a given species, will be identical for all members of the species. The topology of a species' BioPert will be shaped by the evolutionary history of the species: the BioPerts of earlier, more primitive species will be found as early subnetworks in the BioPerts of later, more developed species. In other words, evolution is a bootstrap process for bootstrap processes, or alternatively ontogeny recapitulates phylogeny.

It is clear that the study of emerging genomic shuffles in our own and other species is very likely to yield crucial insights into the mechanisms of speciation, that is, evolution "in the large" or macroevolution.

For example, one might discover that virtually all BioPerts have a subnet for creating a morula as an initial state of the embryo. Next, most BioPerts may have a subnet for generating a gastrula, a small balloon of cells with initially two layers of cells of different types, and later three (endoderm, mesoderm & ectoderm). The next MCG could be one that flattens the gastrula into a pancake-like structure. In the case of Chordata (animals that have a notochord, which include all vertebrates) there should be MCGs for neurulation, which causes the pancake to roll up (like a cannoli pastry), creating a neurula. The presence or addition of the neurula MCG would distinguish all Chordata BioPerts from non-Chordata ones.

The proposed model suggests that each major radiation of the tree of life is probably associated with the addition of one or more NEW MCGs to the BioPert. Therefore, traveling down the tree from the ancient metazoan ancestor to us, one would encounter, at least, the following distinct BioPerts: Bilateria, Deuterostomia, Chordata, Craniata, Vertebrata, Gnathostomata, Sarcopterygii, Tetrapoda, Amniota, Synapsida, Therapsida, Mammalia, Eutheria, Primates, Catarrhini, Hominoidea, Hominidae, Hominids, Homo, Homo sapiens.

In other words, the evolutionary process progressively discovered how to assemble successful structures, which were then retained in successive ages. For example, the Deuterostomia BioPert "knew" how to build animals with two openings (a mouth & an anus). This process know-how is retained to the present day. The Chordata BioPert knew how to build a bundle of nerve tissue spanning the animal's whole length. The Craniata BioPert knew how to assemble animals with a brain encased in a bone structure (cranium). The Gnathostomata BioPert knew how to assemble animals with jaws, etc. That is, the process for assembling a complex modern animal was not invented as an act of separate and possibly recent creation, but rather as many discoveries of how to assemble specific tissues, organs and sub-systems, with these individual discoveries strewn over the eons of evolutionary time. It is, therefore, possible to assign an epoch and a biological ancestor to each of the features of our body. Indeed, advanced evolutionary theory does not just say that we evolved from the great apes, but rather that different parts and subsystems of our bodies descend from many much more primitive ancestors.

A major research goal for Evo Devo could be to verify, at least for the earliest nodes of this path, that each BioPert on the path most likely has some completely new MCGs and that, normally, all the MCGs of each such BioPert are retained, except for some minor mutations, by all its descendant BioPerts.

If this model is correct, the Evo Devo finding that ancient MCGs are retained almost "intact" across a wide range of apparently unrelated species, such as fruit flies & humans, is not surprising at all! In fact, once a "successful" process is found for building an organ or a type of tissue, the corresponding MCGs & associated GeNets that control its development are very likely to be retained, with a minimum of alteration, by all the descendants. This is so because any drastic reinvention of the process is not likely to equal the incumbent process, so evolution is bound to be extremely conservative; i.e., it conserves "good" solutions or computations. Drastic innovation will arise only in connection with new functions, organs or tissues, but such new processes will require new MCGs and their attendant cascades of control & controlled genes.

If these views are nearly correct, they imply specific strategies for biological research. In the first place, the study of the functionality and architecture of a species' genome can be greatly helped by manipulating homologous genes in more suitable species. Suitability is determined by freedom from strict ethical constraints, a rapid reproductive cycle and a low cost of procuring specimens. The first consideration is particularly important in the case of our own species, and it can rise to a level of significant importance for species we have emotional ties to, such as dogs, cats, horses, etc. The other two considerations are related to the cost and time factors in conducting studies.

The above evolutionary paradigm also suggests differential strategies for shedding light on the evolutionary process at different time depths. In fact, a particular time slice of evolution should be best illuminated by doing transgenic studies using two or more species that separated at that particular time depth. Thus if one is interested in studying recently evolved genomic structures one should consider recently separated species, for example Humans & Chimps (less than six MYA). The converse would be true if one is studying very ancient structures such as eyes.

Ultimately, the systematic analysis & documentation of the BioPerts & genomic architectures of a large number of species will give mankind the most definitive understanding of how life evolved. Thus Earth's life will finally understand itself. This understanding, in turn, will usher in an age of miraculous Bioengineering developments. Together with the resultant piecing together of the architecture of the evolutionary process, it should also provide us with insights into how life may have evolved elsewhere in the Universe.

Copyright Ugo O. Gagliardi 2009