Literature DB >> 22417916

Widespread recurrent evolution of genomic features.

Ignacio Maeso¹, Scott William Roy, Manuel Irimia.

Abstract

The recent explosion of genome sequences from all major phylogenetic groups has unveiled an unexpected wealth of cases of recurrent evolution of strikingly similar genomic features in different lineages. Here, we review the diverse known types of recurrent evolution in eukaryotic genomes, with a special focus on metazoans, ranging from reductive genome evolution to origins of splice-leader trans-splicing, from tandem exon duplications to gene family expansions. We first propose a general classification scheme for evolutionary recurrence at the genomic level, based on the type of driving force-mutation or selection-and the environmental and genomic circumstances underlying these forces. We then discuss various cases of recurrent genomic evolution under this scheme. Finally, we provide a broader context for repeated genomic evolution, including the unique relationship of genomic recurrence with the genotype-phenotype map, and the ways in which the study of recurrent genomic evolution can be used to understand fundamental evolutionary processes.

Entities: Chemical

Mesh：

Year: 2012 PMID： 22417916 PMCID： PMC3342872 DOI： 10.1093/gbe/evs022

Source DB: PubMed Journal: Genome Biol Evol ISSN： 1759-6653 Impact factor: 3.416

Evolutionary Biology in the Era of Ubiquitous Genomes

The explosion of genomic sequences over the past few years has revolutionized our understanding of evolution. Ten years after publication of the human genome sequence (Lander et al. 2001; Venter et al. 2001), hundreds of genomes are now available, spanning nearly all major phylogenetic groups, and providing an increasingly focused picture of evolutionary processes. These resources have allowed identification of troves of both broadly shared genomic features (allowing the reconstruction of presumed ancestral traits, e.g., the gene complements of the eukaryotic and metazoan ancestors; Putnam et al. 2007; Fritz-Laylin et al. 2010) and lineage-specific genomic changes (in some cases allowing associations with phenotypic novelties, e.g., Wang et al. 2005; Zhang et al. 2010; McLean et al. 2011). In addition, many instances of a third more puzzling phylogenetic pattern have been observed: traits whose distribution is “scattered” across the evolutionary tree (fig. 1), indicating repeated independent evolution of similar genomic features in different lineages.

—Phylogenetic distribution of some genomic features across metazoans. Genome-wide/gene-wide traits are mapped to a phylogenetic tree of metazoans (plus choanoflagellates) depicted by empty/solid forms above/below the tree branches, as indicated in the legend. Red shapes denote recurrent loss of ancestral features, whereas green features involve overall gain of genomic sequence; blue represents more complex characters. Each symbol indicates that a particular feature has evolved independently at least once within the corresponding taxonomic group. For example, “reductive evolution” in the teleost branch indicates that at least one lineage within the group (pufferfish) is known to show this feature. In the case of WGD, several symbols along the same branch represent the existence of lineages with successive rounds of WGD (i.e., octoploidy, dodecaploidy, etc.). Numbers in parentheses indicate which tropomyosin (TPM) exon(s) have duplicated in tandem in each event. The cases represented here are selected examples from the literature and are not intended as an exhaustive list; in addition, many yet unknown cases are expected to be discovered with the increasing availability of whole-genome sequences.

Recurrent Evolution: Phenotypic, Molecular, and Genomic

Recurrent evolution has been extensively studied at a variety of levels and has often led to confusion due to a lack of explicit definitions (Doolittle 1994; Arendt and Reznick 2008). It is therefore useful to begin our discussion by comparing recurrent genomic evolution as defined and reviewed here with previous definitions and work.

Recurrent Phenotypic Evolution

Recurrent evolution has most commonly been studied at the level of organismal phenotype (fig. 2), comprising an extremely rich field with hundreds of articles spanning three centuries exploring a wide diversity of recurrent phenotypes and lineages (Scotland 2011). A central concern of phenotypic work has been understanding the physical or genetic causes for recurrence. This pursuit often focuses on distinguishing between convergent evolution and parallel evolution (a distinction which itself has been extensively debated; Arendt and Reznick 2008; Scotland 2011). Generally, the distinctions follow etymology: parallel comes from the Greek for “beside” + “each other” (Παρα + αλληλος) and thus involves lineages with initially similar starting points arriving at similar endpoints by taking similar paths; on the other hand, convergence comes from the Latin for “with or together” (com-) and “to incline, tend toward” (vergere) and thus generally involves lineages with different starting points taking different paths to arrive at similar endpoints. For instance, one proposed distinction between parallelism and convergence focuses on the starting points for the two lineages: whether similar (closely related species, parallel) or different (distantly related species, convergent). Another proposed distinction focuses on paths (the specific genetic mutations underlying the changes) taken by the two lineages—whether the same (parallel) or different (convergent) (Arendt and Reznick 2008). Importantly, the two proposed distinctions are related since, because of their higher genetic and developmental similarities, closely related species are more likely to evolve similar traits by identical genetic changes than are species with more disparate biology (although this is not always the case; Arendt and Reznick 2008).

—Levels of recurrent evolution. Different levels of biological organization in which recurrent evolution may be studied. Although the phenotype should be considered a continuum across the different scales of biological complexity, for practical reasons, we may divide it into three levels: 1) organismal: individual features such as anatomy, physiology, behavior, etc.; 2) cellular: characteristics of single cells, including cell movements, secretory capacities, morphology, organellar composition, etc. (equivalent to the organismal level in unicellular species); and 3) molecular: all observed traits below the cellular level, including transcriptome, proteome, biochemical properties, chromatin structure, etc. Genomic level (gray box) corresponds only to the nucleotide sequence (i.e., elements that can be recognized at the sequence level) and may be comparable to the classic concept of genotype.

Recurrent Molecular Evolution

An equally diverse range of phenomena is subsumed under the heading of “recurrent molecular evolution.” A useful starting point here is Doolittle's (1994) four-category schema. He identified 1) functional convergence, in which the same molecular function arises multiple times (e.g., unrelated enzymes catalyzing the same reaction; Galperin and Koonin 2012); 2) mechanistic convergence, involving evolution of similar mechanisms for accomplishing similar functions in unrelated molecules (e.g., similar sidechain geometries in unrelated serine proteases; Kraut 1977); 3) structural convergence, in which unrelated sequences fold into similar structures (e.g., repeated evolution of alpha-helices and beta-sheets or similar RNA secondary structures); and 4) sequence convergence, in which similar specific molecular (either DNA or protein) sequences evolve multiple times independently.

Recurrent Genomic Evolution

We are now in a position to define recurrent genomic evolution, the topic of this review, and to see how it differs from nearly all of these other levels of recurrence. Whereas organismal (i.e., anatomical, physiological, etc.) and most categories of molecular recurrence are observed at any of the phenotypic levels (fig. 2), genomic recurrence is directly observed as similar changes in the genotype—that is, at the level of DNA sequence. Notably, then, even most of Doolittle’s molecular categories (functional, mechanistic, and structural) do not qualify as genomic recurrence because they relate to phenotype. Although these categories are defined at the molecular level (and thus intuitively “closer” to the genomic or genotypic level), they are in fact phenotypic. This becomes clear when the general definition of phenotype—the observable characteristics of an organism—is recalled. We may recognize different “levels” across the phenotypic continuum—molecular, cellular, and organismal (fig. 2)—but this does not change the fact that they are all clearly aspects of phenotype and not genotype: they reflect directly observable characteristics of the organism or cell. Another fundamental distinction between “classical” and genomic recurrence involves the focus of the study: in classical studies of recurrence in molecules, cells, or organisms, repeated evolution is initially observed at the phenotypic level and only thereafter interrogated at the genotype/genomic level. By contrast, genotypic convergence involves direct observation of similar or same changes in the genome in different lineages, notwithstanding these changes' effects on the various levels of phenotype (whether similar, different, or even potentially nonexistent). Genotypic (∼genomic) recurrence is thus most closely related to Doolittle’s fourth category, sequence convergence.

The Importance of Being Recurrent and the “Rules” of Evolution

The study of recurrent evolution is of special importance for understanding the forces shaping genomes. Because of the inherent stochasticity of evolutionary processes, inferring evolutionary forces from the occurrence of a given (set of) change(s) in a single lineage is difficult. Recurrent evolution of the same genomic characteristics suggests predictability of evolution, elucidating the rules of genome evolution by revealing commonalities of evolutionary forces experienced across disparate lineages (Conway Morris 2009). We believe that the wealth of recurrent genomic features indicate unappreciated similarity of fundamental forces across lineages. Although the large number of genomic characters and finite nature of sequence space implies that genomic recurrence may sometimes occur simply by chance (see below), many cases have now been unearthed that suggest specific forces driving genome evolution down similar paths in different lineages. Identifying and understanding these forces or causes are perhaps the major challenge of the study of recurrent genome evolution.

Chance, Heterogeneity of Causes, and Genomic Recurrence

Inherent to the treatment of recurrence as a valuable and biologically meaningful tool to understand evolution is the notion that cases of repeated genomic evolution are informative if they occur in excess of the level of coincidence expected simply from the action of stochastic processes in finite sequence space. In some cases discussed here, this null hypothesis can be rejected. Other cases await direct testing, generally because of the lack of enough data to assess the statistical significance of the pattern and/or to properly define the null hypothesis (i.e., specific mutation rates across lineages, etc.). Although we have chosen to discuss mostly cases that we believe are likely to reflect unexpected levels of recurrence (with some exceptions such as whole-genome duplications [WGDs], see below), it remains possible that some of these examples do not significantly differ from the chance expectation. Similarly, it is worth pointing out that different instances of a particular trait may be due to different pressures acting in different lineages (this is particularly possible for cases in which fundamentally different mechanisms for a given genomic change are imaginable). Although recurrent patterns caused by different pressures should be considered true recurrence, their subsequent evolutionary interpretation will be much more obscure. These considerations place similar caveats on most or all cases discussed below, and thus, they will not be discussed extensively for each instance, but just in a few particularly enlightening examples. Ultimately, random chance and our proposed explanations represent testable alternative hypotheses that could and should be directly tested.

The Causes of Recurrent Evolution of Genomic Features

What forces may explain genomic recurrence? In contrast to recurrent anatomical or physiological characters, which are usually (and reasonably) assumed to reflect adaption, often due to shared peculiarities of the organisms' environmental niches, the potential causes of observed recurrent genomic features are more diverse and may be very different for different recurrent traits—indeed, in some cases, the adaptative value of repeated genomic outcomes is dubious. In understanding the forces driving recurrent genomic evolution, we believe that the following two axes are particularly important.

Forces Driving the Pressure: Mutation, Positive Selection, or Relaxed Selection

A species undergoes a genomic change when 1) a spontaneous mutation occurs and 2) the resultant mutated allele spreads through the population, a process highly dependent on selective strength and efficiency (incorporating demography, effective population size, etc.). Thus, insofar as recurrent changes reflect similar pressures or constraints across lineages, these similarities may involve forces that are “mutational” or “selective” (or even both). The notion that selection could impart a directionality to evolutionary change is familiar to any evolutionary biologist; however, that mutation could be directional may be less familiar (the interested reader should consult Yampolsky and Stoltzfus 2001). Mutation can be no less a directional force if a certain class of mutation (G-to-A, small genomic deletions, intron loss, etc.) is more frequent than its reverse (A-to-G, small insertions, intron gain, etc.). Thus, all that is needed for mutation-driven recurrent evolution is that multiple lineages are experiencing similar mutational biases in parallel. For selective pressure, a second question is whether the recurrence is due to similar “positive” selective pressure in multiple lineages or to similar “relaxation” of selective pressure in multiple lineages. Notably, differences in selective pressure include not only classical fitness variation but also in effective population size (Ne) that leads to differences in the effectiveness of selection versus drift. Indeed, according to one influential model, a general prediction of this is that several general aspects of the genome architecture should evolve recurrently in lineages exposed long enough to similar Ne (and mutation rates) (Lynch and Conery 2003; Lynch 2006, 2007).

Nature of the Pressure: General, Recurrent Environmental, or Recurrent Genetic

Another important consideration involves the distribution of the pressure driving convergence and the source of that pressure. Similar evolutionary pressures and constraints in two lineages can either be 1) “general” (or ancestral), that is, applying to most or all lineages within a group or 2) “recurrent,” that is, pressures that themselves arose independently in only a subset of lineages. For recurrent pressure, a second question is whether the pressure arose due to a previous change in the genome of the species (“genetic” or intrinsic) or in its environment (“environmental” or extrinsic). Using this framework, we next review some of the major known cases (or classes) of recurrent genomic evolution (summarized in table 1), beginning with the illustrative case of reductive genome evolution (RGE). Notably, for many of the phenomena discussed here, the causes remain unclear and often debated. Our goal is to frame the questions and to engender debate, not to arbitrate between competing hypotheses. In addition, we have chosen to focus on eukaryotic nuclear genomes, and thus, we will not discuss an equal number of interesting cases of recurrent evolution in prokaryotes and eukaryotic organelles.

Table 1

Possible Causes of Recurrent Genomic Evolution

	Driving force			Nature of the pressure			Probability of occurrence by chance
		Selectional			Recurrent
	Mutational	Positive	Relaxed	General/Ancestral	Environmental	Genetic
Genomic organization
Reductive evolution	X	X	X		X	X	Null
Genome expansion	X	X	X			X	Low
WGDs	X			X			High
Sex chromosomes	X					X	Low
Nucleotide composition	X	X				X	Low
Genome-wide gene structures
Massive intron loss	X	X			X	X	Low
Strong intron boundaries		X				X	Null
SLTS	X					X	Low
Complete loss of ancestral U12 introns	X			X			Low
Gene/gene family level
Gene family expansions		X		X	X		High
Cluster formation and assembly of syntenic blocks		X					Low
Disruption of gene clusters and other syntenic blocks			X			X	High
Gene losses	X		X		X	X	High
Specific intragenic features
Tandem exon duplications		X		X			Low
Gene structures	X			X			High
Loss of gene segments			X			X	Low

Possible Causes of Recurrent Genomic Evolution

An Example: On the Causes of Reductive Genome Evolution

These distinctions are illustrated by different hypotheses about the evolutionary causes of RGE. RGE is perhaps the best-known instance of recurrent genome evolution. RGE has been observed in nearly all eukaryotic superkingdoms (Venkatesh et al. 2000; Lane et al. 2007; Morrison et al. 2007; Opperman et al. 2008; Slamovits and Keeling 2009; Ankarklev et al. 2010; Corradi et al. 2010) and can include pronounced gene loss, elimination of repetitive elements, evolution of overlapping genes, reduction of average intron sizes, and/or intron numbers and other genomic changes leading to more compact genomes. In addition, significant genome contractions have occurred even in typically large genomes: For instance, multiple mammalian orders have experienced parallel patterns of genome contraction (including loss of nuclear mitochondrial sequences [NumtS], pseudogenes, and long terminal repeat retrotransposons) following the Cretaceous–Tertiary (KT) boundary (Rho et al. 2009). Several hypotheses have been proposed for genome reduction. First, RGE is often argued to reflect positive selection for loss of inessential genomic elements acting specifically on parasitic/fast-replicating lineages. This hypothesis is an example of a recurrent (acting only or especially on some lineages) environmental (due to considerations of an organism's niche) “positive-selective” pressure. Another alternative is that RGE reflects loss of genomic sequences that are no longer efficiently maintained by selection (“relaxed-selective” pressure). Several possible reasons for relaxed-selective pressure are possible. Changes in lifestyle could render some processes obsolete (e.g., parasites that obtain products from their hosts may lose biosynthetic pathways), an example of “recurrent-environmental” causes. Reduced efficiency of selection due to reduced effective population size in parasites could also lead to weakly selected elements (also recurrent-environmental) (Lynch 2007). In some cases, loss of one gene may render related/interacting genes nonfunctional, leading to their loss. This case of relaxed-selective pressure is due to changes within the organism's genome (gene loss) and thus is a case of recurrent-genetic. Finally, it is also possible that some aspects of RGE simply reflect a strong tendency toward deletion at the genome level (mutational pressure). Such a deletion process could arise due to changes in the DNA replication/repair machinery (genetic) or due to changes in the environment (e.g., increased ultra violet exposure leading to a greater rate of double-strand breaks in DNA; environmental). Notably, it is also conceivable that the pressures governing recurrent RGE are general: Gene loss is known even in species without striking genome reduction, and many lineages appear to experience an excess of DNA deletions over insertion (Petrov 2002a, 2002b). From this perspective, lineages undergoing RGE could potentially be exhibiting general pressures that have simply proceeded to a more advanced stage.

Multiple Levels of Recurrent Genomic Evolution

We next proceed to a discussion of different examples of observed genomic recurrence. We have organized these examples by the “scale” of their changes: recurrent genomic evolution can be recognized at multiple scales, ranging from whole-genome patterns such as RGE (globally affecting numerous individual features at the same time) to specific changes within individual genes (such as the recurrent deletion of a regulatory DNA motif). Although these different levels are interconnected and, in many cases, are probably interdependent, for clarity, we will divide the examples discussed here into four broad categories. We will first review cases of genome-wide patterns of recurrent evolution, subdivided into changes in genomic organization (such as RGE) and global changes of gene structures. Then, we will focus on cases affecting single genes or gene families. Last, we will zoom in to discuss examples of recurrent evolution of features within the individual genes themselves.

Cases of Recurrent Evolution of Genomic Organization

Expansive Genome Evolution

Another repeatedly observed evolutionary trajectory is pronounced expansion of genome size and content. At least in animal, plant, and fungi, some species have dramatically increased total DNA content (Gregory et al. 2007). In some cases, gene numbers have increased several-fold relative to related lineages (often through WGDs [see below]), accompanied by evolution of large gene families, apparently increased intergenic and intron lengths, and, in nearly all cases, massive proliferation of repetitive elements (e.g., Lander et al. 2001; Bennetzen 2002; Kidwell 2002; Piegu et al. 2006; Ungerer et al. 2006; Gregory et al. 2007). Similar histories may have also been experienced by other lineages; however, systematic undersampling of large genomes outside of these three groups has hampered our knowledge of other such taxa. Here, again, the causes for convergent genome expansion remain unclear, although, given that massive genome expansions require hundreds of mutations accumulating in the same direction, they are unlikely to evolve simply by chance. Some hypotheses closely associate genome expansion with multicellularity. One possibility is that multicellularity promotes evolution of regulatory complexity and gene family expansion (Vogel and Chothia 2006; Taft et al. 2007; Lang et al. 2010). Another influential hypothesis suggests that genome expansion in multicellular organisms largely reflects reduced selection against mildly deleterious insertions (such as gene duplicates, transposable element insertions, and introns) in species with reduced Ne, such as plants or animals (Lynch and Conery 2003; Lynch 2007). However, recent work questioning the correlation between Ne and genomic complexity urge caution (Whitney and Garland 2010, but see Lynch 2011). Finally, it is possible that genetic changes, such as high expression of active retrotranscriptases, can lead to increased proliferation of repeated elements, a recurrent-genetic mutational cause.

Whole-Genome Duplications

A polyploid is a cell, organism, or species that contains more than two homologous sets of chromosomes. The mutation that produce them is referred to as WGD or polyploidization, and it has been repeatedly described in many eukaryotic groups, including animals (Bisbee et al. 1977; Amores et al. 1998; Gallardo et al. 1999; Evans et al. 2004; Edger and Pires 2009), plants (Fawcett et al. 2009), ciliates (Aury et al. 2006), oomycetes (Martens and Van de Peer 2010), and fungi (Wolfe and Shields 1997; Ma et al. 2009). Although extensive gene losses in paleopolyploids could result in a diploid-like gene complement, WGDs are generally not reversible and therefore are a case of mutational ratchet, a “general mutational” cause (see below). In some lineages, this phenomenon is especially pervasive, with a high prevalence of multiple extra rounds of polyploidizations after a first WGD event (especially common in plants, but also several animal lineages) (Evans et al. 2004). However, it is not clear whether recurrent WGDs, although very frequent, occur and accumulate more often than expected for a random process. From a selectional perspective, although WGDs can have immediate phenotypic effects (Kennedy et al. 2006; Thompson and Merg 2008), these may not explain the fixation in most cases. However, Fawcett et al. (2009) have suggested that plant lineages that underwent WGDs had a better chance to survive after the KT mass extinction. In addition, WGDs have been postulated to have served as a frequent source of increased evolutionary potential for subsequent evolution (Blomme et al. 2006; Zhang and Cohn 2008), even though hypotheses linking WGDs with big taxonomic radiations and evolutionary novelties have been controversial (Donoghue and Purnell 2005; Hurley et al. 2007). In total then, although WGD may result in dramatic recurrent patterns at a genome-wide level, it may not be caused by common evolutionary forces acting on a particular set of lineages but may simply respond to a high mutational frequency (i.e., a higher rate of mutations leading to polyploidization).

Sex Chromosomes

In many distantly related eukaryotes, sex is determined at the genetic level by chromosomal complement. This is thought to involve a cascade of events driven largely by sexual antagonistic selection, including 1) a gene at a previously autosomal locus develops a dominant ability to determine sex; 2) recombination is suppressed at this locus; 3) additional sex-related genes accumulate nearby on the chromosome, further driving recombination suppression; 4) stepwise degradation of the chromosome containing the dominant sex determinant (Y/W); and 5) increased traffic of genes between the sex chromosomes and autosomes. Evolution of similar sex chromosome systems has occurred repeatedly in vertebrates, invertebrates, fungi, and plants (Fraser et al. 2004; Fraser and Heitman 2005; Bergero et al. 2007; Bellott et al. 2010; Charlesworth and Mank 2010; Davis and Thomas 2010; Kaiser and Bachtrog 2010; Ellegren 2011). Sex chromosomes are thus an example of a “selectional” cascade of events triggered by recurrent genetic changes. Finally, another interesting case of recurrent evolution of a genome-based sex determination system is the X-autosome balance in at least Drosophila and Caenorhabditis (reviewed in Haag 2005) and the plant genus Rumex (Navajas-Pérez et al. 2005).

Changes in Global and Local Nucleotide Composition

Global nucleotide composition (or GC content) ranges widely across eukaryotic and prokaryotic genomes. In particular, many divergent lineages have recurrently evolved highly AT-rich genomes throughout eukaryotic evolution (Gardner et al. 2002; Eichinger et al. 2005; Eisen et al. 2006; Ghedin et al. 2007), whereas the evolution of highly GC-rich genomes is rarer among eukaryotes (Merchant et al. 2007). These differences are likely due to a combination of selectional and mutational pressures (including mutational bias and biased recombination-associated DNA repair) (Yampolsky and Stoltzfus 2001; Birdsell 2002). Interestingly, because genome-wide GC-content is a major determinant of global codon bias (Hershberg and Petrov 2009), independent evolution of similar GC-contents in two different species will usually result in recurrent evolution of similar preferential codon usages. The same pressures—especially local differences in recombination (Duret 2006; Duret and Arndt 2008)—are likely to cause local differences in GC-content also within genomes (e.g., isochores). Notably, these regions are continuously evolving; for example, several mammalian lineages are undergoing a recurrent process of GC-rich isochore erosion, with a significant trend of G/C to A/T substitutions, whereas others are independently increasing their overall GC-content (Duret et al. 2002; Belle et al. 2004; Romiguier et al. 2010). Interestingly, in addition to repeated patterns of nucleotide composition at a genomic scale, these trends sometimes result in cases of striking recurrence of GC-content at specific genes (e.g., the gene RAG1 in two marsupial species; Gruber et al. 2007).

Cases of Genome-Wide Recurrent Evolution of Gene Structures

Widespread Genome-Wide Intron Loss

Whereas most studied eukaryotic species have plentiful spliceosomal introns (at least one per gene on average), several distantly related lineages contain far fewer (<0.1 per gene, Matsuzaki et al. 2004; Vanacova et al. 2005; Morrison et al. 2007), apparently due to independent episodes of massive intron loss (Irimia and Roy 2008). Why should this be? Perhaps, the leading hypothesis is that massive intron reduction reflects strong positive selection for intron loss in lineages that are optimized for fast replication (Doolittle 1978). This is a recurrent-environmental positive-selection model, since it invokes increased positive selection due to peculiarities of species' environments, related to RGE. On the other hand, massive reduction in intron number could reflect “runaway” mutation, for instance due to elevated rates of creation of intronless DNA copies of genes by widespread retroposition associated with retroelement invasion (Roy and Penny 2007). This is a recurrent-genetic mutational model, since it invokes increased mutation due to peculiarities of species' genomes (retroelement invasion). Finally, evidence for more gradual intron number reduction in many lineages suggests a general mutational pressure toward intron loss, potentially due to a near absence of intron gain in many lineages (Roy and Irimia 2009a). This hypothesis provides an example of a “ratchet-like” effect (Covello and Gray 1993; Doolittle 1998), in which transition in one direction (from intron presence to absence) occurs much more readily than the reverse (intron gain), leading to a strong directionality to evolution. Ratchets can be due to mutation, selection, or a complicated combination of the two and are a common phenomenon across recurrent evolution of genomic features (see below for further discussion on the role of ratchet processes on the evolution of genome complexity and the constructive neutral evolution [CNE]; Stoltzfus 1999; Gray et al. 2010; Doolittle et al. 2011; Speijer 2011).

Transformation of Intron Structures after Massive Intron Loss

In each case in which a eukaryotic lineage has experienced nearly complete intron loss, the few remaining introns exhibit modified splicing signals, with strengthened consensus sequences for core splicing motifs (5′ splice site and branch point), and even highly constrained distance between the branch point and the 3′ intron boundary (Irimia et al. 2007, 2009; Irimia and Roy 2008; Schwartz et al. 2008). Such a tight association between two genomic transformations—intron loss and intron sequence change—suggests that genetic changes associated with one lead to selective pressures driving the other: a case of recurrent genetic positive-selective pressures. However, although several mechanistic hypotheses have been proposed (Irimia and Roy 2008; Irimia et al. 2009), a clear explanation is still lacking.

Spliced Leader Trans-Splicing

Spliced leader trans-splicing (SLTS) is a variation on the spliceosomal splicing mechanism that attaches short trans-encoded RNA “leader” sequences to the 5′ end of transcripts of a generally well-defined subset of genes. SLTS systems exhibit a highly punctate phylogenetic distribution across protists and animals (Lukes et al. 2009; Roy and Irimia 2009b; Douris et al. 2010; fig. 1). Phylogenetic evidence suggests frequent evolution of SLTS from a non-SLTS ancestor; by contrast, no case of loss of SLTS in any lineage is known (Roy and Irimia 2009b), although, with current data and methods for detecting SLTS, cases of secondary loss of SLTS are hard to prove. This suggests a model in which 1) new SLTS systems arise at some rate over evolutionary time, likely by creation from spliced leader-like sequences from traditional spliceosomal RNAs by largely neutral mutations (Lukes et al. 2009) and 2) degradation of defunct 5′ untranslated regions (UTRs) following the evolution of SLTS leads to a very low probability of loss of SLTS. Thus, SLTS may be another case of mutational ratchet in which transition from one state to another is common over evolutionary time, but the reverse is rare, therefore leading to recurrent evolution of the same feature. Interestingly, the cascade of events leading to the evolution of SLTS may result in increased molecular complexity, by enabling new molecular paths of gene expression. One instance of the increased molecular complexity associated with SLTS is the evolution of polycistronic transcripts, which is tightly associated with SLTS in diverse eukaryotic lineages (and is very rare in eukaryotes without SLTS). This difference likely reflects the fact that in eukaryotes, translation of downstream open reading frames (ORFs) is generally inefficient. As such, in eukaryotes that lack SLTS, polycistronic transcripts will be rare; however, SLTS upstream of ORFs can create monocistronic mature messenger RNAs from polycistronic transcripts, resolving this difficulty. Dynamics of operon creation and loss may also reflect a ratchet: Mutations affecting transcription termination of upstream genes and leading to long transcripts may allow effective expression of trans-spliced downstream genes from polycistronic messages; on the other hand, internal promoters in operons are likely to eventually degrade, inhibiting the opposite transition, from operons back to independent promoters. In total, then, the evolution of SLTS (and operonic systems) are perhaps the best example of recurrent CNE (Lukes et al. 2009), an alternative mechanism to generate increased biological diversity (Stoltzfus 1999; Gray et al. 2010; Doolittle et al. 2011; and see Speijer 2011 for counterarguments).

Massive Loss of U12 Introns

U12 or minor introns are a rare class of introns that are removed by a distinct spliceosomal machinery and characterized by strict extended splice signals. U12 introns are likely to have been present in the last common ancestors of eukaryotes but have been independently reduced in number or completely lost in many lineages (Russell et al. 2005; Alioto 2007; Dávila López et al. 2008; Roy and Irimia 2009b). The dynamics may be governed by a general mutational ratchet (in this case, not associated to CNE): whereas both loss of U12-intron sequences and conversion from U12- to “standard” major U2-spliceosomal introns are routinely observed, and simple mutations causing these changes have been identified in the laboratory, the opposite (U2-to-U12) has never been documented (Burge et al. 1998; Roy and Irimia 2009b).

Case of Recurrent Genome Evolution at the Gene or Gene Family Level

Gene Duplications and Family Expansions

Gene duplication is a frequent phenomenon (Lipinski et al. 2011), which affects a wide variety of gene families and biological processes, suggesting much recurrent gene duplication may be largely stochastic. However, exceptions in which recurrent gene duplication has underpinned parallel phenotypic evolution are also known. One clear example involves duplication of RNAse genes (Zhang 2006). In two lineages of leaf-eating monkeys, a new digestive tract-specific RNAse gene arose by duplication of the same ancestral RNAse and acquired identical amino acid changes altering RNAse activity and resulting in improved leaf digestion. Such cases represent recurrent genomic evolution due to selective environmental pressures acting at on a specific subset of lineages. Other cases evidence general environmental adaptation by recurrent massive gene family expansion. Some biological functions, such as immunity, chemoreception, and detoxification, require the interaction or the recognition of a vast range of substrates, and, thus, increased molecular diversity of paralogs within the genome could be favored. For instance, cytochrome-P450 genes, which participate in detoxification of various compounds, have undergone pronounced independent expansion in many metazoan lineages (Thomas 2007; Baldwin et al. 2009). A similar situation is found in chordate olfactory receptors, where a correlation with environmental positive-selective pressures is evident (Niimura and Nei 2007; Niimura 2009). On the other hand, other cases of recurrent massive gene family expansion—which are overwhelmingly statistically significant over a random expectation obtained from related gene families—suggest important adaptation of unknown functional significance, raising important questions for further exploration (e.g., EXTK tyrosine kinases, for which dozens of members have independently evolved in several lineages; fig. 1, in contrast to all other related tyrosine kinase families, for which nearly no gene duplications are known in other metazoans lineages, D'Aniello et al. 2008).

Cluster Formation and Assembly of Syntenic Blocks

Pairs or groups of genes may be closely physically linked in different species due to functional reasons. In most cases, this reflects retention of an ancestral association; however, some instances of repeated evolution of physical linkage between pairs or groups of genes have been described. One set of these involves recurrent evolution of clusters of paralogous genes, presumably by tandem gene duplication and selection against gene translocation. These genomic structures may provide a genetic positive-selective advantage by allowing subtle coding sequence and transcriptional diversification of new gene copies under the control of a shared set of regulatory elements (Tena et al. 2011). Accordingly, many described cases correspond to key developmental genes with complex transcriptional expression patterns (Peterson 2004; Duncan et al. 2008; Irimia et al. 2008; Kuraku et al. 2008; Takatori et al. 2008; Kerner et al. 2009; Negre and Simpson 2009); for example, Iroquois genes have independently evolved gene clusters in at least five metazoan lineages (Irimia et al. 2008; Takatori et al. 2008; Kerner et al. 2009), arguing for positive-selective reasons versus stochastic occurrence. More rarely, recurrent linkage of nonparalogous genes may occur, and this association may be favored due to functional advantages (e.g., improved coordination of expression): for instance, for three genes involved in galactose metabolism in two divergent fungal phyla (Slot and Rokas 2010).

Disruption of Highly Conserved Gene Clusters and Other Syntenic Blocks

Ancestral blocks of syntenic genes have been maintained in diverse modern animals, indicating strong selection for their retention in diverse lineages, generally associated with specific developmental programs (e.g., Hox gene clusters; Duboule 2007). However, these associations have been recurrently disrupted in several different animal lineages (Ferrier and Holland 2002; Seo et al. 2004; Pierce et al. 2005; Duboule 2007; Negre and Ruiz 2007). This indicates that these linkages have repeatedly become nonessential, suggesting modification of fundamental animal developmental programs, a potential case of relaxed-selective pressures. Similarly, disruption of ancient associations of phylogenetically unrelated genes, acting as genomic regulatory blocks (Engstrom et al. 2007; Kikuta et al. 2007), have also been reported (e.g., Iroquois genes with Sowah genes in several lineages; Irimia et al. 2008; Maeso et al. 2012).

Gene Losses

Gene losses constitute an obvious example of nonconstructive mutational ratchet for rather unessential genes. In extreme examples, such as the GFP gene family in metazoans and the oxylipin pathway genes in holozoans, the taxonomic distribution implies at least five independent losses (Deheyn et al. 2007; Lee et al. 2008, fig. 1). Alternatively, the loss of the same selective pressure in two lineages due to a common change in lifestyle and/or developmental process (e.g., loss of vision in lightless environments; Protas et al. 2011) may result in dispensability of the same genes and thus in their recurrent loss (environmental relaxed-selection). An example of this is the repeated loss of oxidative phosphorylation complex I genes in anaerobic fungi (Marcet-Houben et al. 2009). In such cases, the loss of one of the genes involved in a particular protein complex or biological pathway could render its interacting partners nonfunctional, further enhancing the loss of the latter. This is exemplified by the absence of all six proteins integrating the fifth adaptor protein (AP-5) complex independently in five different eukaryotic lineages (Hirst et al. 2011). Genic redundancy, by individual gene duplication or WGD, configures yet another evolutionary scenario for recurrent gene losses (genetic relaxed selection). In these cases, although simple chance is likely to underlie most patterns of gene loss, there are instances in which not all genes seem to be equally prone to retention. For example, some paralogs have been repeatedly lost specifically in different vertebrate lineages, as is the case of Pdx2 genes in teleosts and tetrapods (Mulley and Holland 2010), EvxB in elephant shark and tetrapods (Ravi et al. 2009), Alx3 in frogs, lizards, and chicken (McGonnell et al. 2011), or globin-E gene (GbE) in all major vertebrate lineages but birds (Hoffmann et al. 2011). (It should be noted, however, that although intriguing and suggestive, these patterns of coincidental loss across four/five major vertebrate lineages cannot be statistically significantly different from the null expectation due to the small sample size. Further availability of genomic sequences should overcome this limitation.) More globally, this nonramdom pattern of paralog losses seems to be the rule in yeast (Scannell et al. 2007). Finally, some recurrent losses may reflect positive-selective genetic pressure: for instance, recurrent reduction to a single copy of the same gene families following WGD in plants, fungi, and animals likely reflects strong purifying selection on gene dosage (Paterson et al. 2006).

Cases of Recurrent Evolution of Specific Intragenic Features

Tandem Exon Duplications

Seven to 17% of metazoan genes have tandem exon duplications (Letunic et al. 2002; Gao and Lynch 2009), generally associated with mutually exclusive alternative splicing (Kondrashov and Koonin 2001; Irimia et al. 2008). This alternative processing generates internal redundancy (internal paralogy), which can be exploited to produce functionally divergent transcripts. Although many exon duplications may be (nearly) neutral and occurring by chance, extreme recurrent cases suggest positive-selective forces. A classic example is the DSCAM gene, in which exons 6 and 9 have undergone massive, independent expansions in different insect and crustacean lineages (Brites et al. 2008; Lee et al. 2010). Alternative splicing generates many isoforms of the DSCAM gene, which encodes receptors involved in axon guidance, potentially allowing for increased wiring complexity (Schmucker et al. 2000). In the tropomyosin cytoskeletal gene, independent duplication of many different exons has occurred in most bilaterian lineages (Vrhovski et al. 2008; Irimia, Maeso, et al. 2010; Koziol et al. 2011; fig. 1) at a frequency statistically significantly higher than expected even from the highest estimates of intragenic duplications (Gao and Lynch 2009). The explanation appears to lie in the use of alternative promoters to produce two different protein isoforms with radically different cellular functions. Following duplication, each exon copy is “assigned” to one of the two isoforms, reducing pleiotropy and allowing “general positive selection” for optimized function of each protein (Irimia, Maeso, et al. 2010). Finally, another classic example is the parallel evolution of alternative splicing of recurrent tandem exon duplicates in ion channel receptors in flies and mammals (Copley 2004; Fodor and Aldrich 2009).

Gain or Loss of Individual Introns

Intron loss is a relatively common process, especially in some lineages, so the loss of the same intron in a specific gene is likely to occur repeatedly in different lineages simply by chance (Roy and Penny 2006; Roy and Irimia 2008a). However, certain gene features, such as conserved high expression level (Carmel and Koonin 2009), could generate trends toward recurrent intron loss from some genes (a case of general positive selection). Intron gain, on the other hand, is generally thought to be less common, although the extent of parallel gains have been widely debated (e.g., Csurös 2005; Nguyen et al. 2005; Sverdlov et al. 2005), and genome-wide comparisons showed that they may account for up to 8% of the shared intron positions across eukaryotic genes (Carmel et al. 2007). In addition, clear individual cases have been identified (Tarrio et al. 2003; Qiu et al. 2004; Ahmadinejad et al. 2010), even as polymorphisms within populations (Omilian et al. 2008; Li et al. 2009). Nonetheless, despite its lower frequency, parallel intron gain is also likely to occur largely by chance, particularly given than no case of parallel gain in multiple lineages has been described yet. Alternatively, intron gain has long been proposed to be biased toward certain sequences (proto-splice sites; Dibb and Newman 1989), which could impose a general mutational pressure underlying the recurrent patterns.

Recurrent Loss of Gene Parts

Repeated loss of coding sequences of genes may provide parallel changes in protein function or protein–protein interactions (e.g., truncation of C-terminal transactivation domain in meis/hth proteins, Irimia et al. 2011; and loss of Snag domains in C2H2 zinc fingers, Barrallo-Gimeno and Nieto 2009; Irimia et al. 2010). At the regulatory level, recurrent loss of cis-regulatory sequences can have major phenotypic and adaptative consequences with minimal pleiotropic effects (e.g., repeated deletion of a pelvic enhancer in stickleback populations; Chan et al. 2010). In other cases, change in body plans and/or developmental programs may render some regulatory elements unnecessary, even for otherwise deeply conserved sequences (e.g., the only known regulatory element conserved from cnidarians to vertebrates has been lost (or diverged beyond recognition) independently in protostomes, tunicates, and hydra; Royo et al. 2011). Thus, a great variety of causes can be devised for this type of genomic changes, depending on the gene and lineages involved (recurrent-environmental positive-selection, recurrent-environmental and recurrent-genetic relaxed-selection, general mutation, etc.).

Evolution of Coding Sequences

Cases of identical changes in amino acid sequences in different lineages have been extensively studied and represent the paradigmatic example of recurrent molecular phenotypic evolution (Doolittle 1994; Zhang and Kumar 1997; Christin et al. 2010). Parallel amino acid replacements are probably very frequent and happen extensively by chance even at generally highly conserved sites (i.e., “rare amino acid replacements,” RGC_CAMs; Irimia et al. 2007; Rogozin et al. 2007a, 2007b, 2008; Roy and Irimia 2008b). However, it has been estimated that homoplastic amino acid substitutions are 2-fold more common than expected under neutral models of protein evolution (Rokas and Carroll 2008). Not surprisingly, then, in addition to the plethora of neutral cases, many studied examples are linked to recurrent environmental positive-selective pressures, with amino acid substitutions conferring adaptative changes to the new environment (e.g., optimal activity at lower pH conditions in the aforementioned RNAses, Zhang 2006; or changes in “hearing genes” in mammals with echolocating systems; Liu et al. 2010; Davies et al. 2011).

The Relationship between Recurrent Genome Evolution and Phenotype

What are the phenotypic effects of this wealth of recurrent genomic changes? It is worth noting that, with regard to the genotype–phenotype map, the study of recurrent genomic changes may be seen as the inverse of the study of recurrent phenotypic changes. The study of recurrent phenotypic evolution is an inherently “top-down” enterprise (fig. 2): study begins with the observation of similar morphological, physiological, or even molecular phenotypes and then investigates whether or not the underlying genetic changes also share similarities (redeployment of the same key developmental genes or similar types of mutations). Recurrent phenotypes may or may not reflect changes in the same pathways, the same genes within those pathways, the same types of changes within those genes (e.g., exon duplication vs. protein changes), the same specific change (e.g., a specific amino acid change), or the same genome-level change giving rise to the transcript/protein change (e.g., Threonine-to-Serine changes can occur due to substitutions at the first or third codon position). Even if the transcript changes are the same, this could reflect identical or nonidentical changes in the genome (e.g., genomic change vs. RNA editing). In all cases, the organismal phenotypes are equivalent, regardless of the similarity or difference of their genomic bases. By contrast, study of recurrent genomic evolution is a fundamentally bottom-up pursuit (fig. 2): study begins with an observation of similarity encoded at the genomic level (e.g., independently duplicated exons in tropomyosin genes) and then investigates whether or not these similarities are reflected in resemblance at phenotypic levels (optimization of the same two protein functions). For instance, consider a recurrent intragenic tandem duplication. The duplications may affect the transcriptome or may not (e.g., an intronic duplication may not). Exonic duplications may affect the protein sequence/function/structure or may not (e.g., an exon in a UTR). Protein-affecting changes may or may not affect cellular/organismal phenotype. Fundamentally, then, whereas repeated phenotypic evolution may speak directly of adaptative values, but only rarely (and sometimes indirectly) about the evolutionary mechanisms of genetic change, recurrent genomic evolution directly informs about the genetic changes themselves, although adaptative causes can remain more elusive. The types and extents of phenotypic changes due to recurrent genomic changes—and the similarities of these changes across lineages—remain largely unknown and represent an important set of questions in understanding recurrent evolution.

What Do Recurrent Genomic Features Then Tell Us about Evolution?

Genomic recurrence provides a new perspective on evolutionary processes, informing us in often unexpected ways about commonalities of forces—mutational and/or selectional—acting across different lineages. Cases of genomic recurrence caused by ratchet mutations are fundamental to understanding the evolutionary constraints and canalizations that shape the way in which the “genomespace,” as the morphospace, is explored through evolution, underscoring predictability in the overall outcome of neutral mutation, whether or not this will be “constructive” (Stoltzfus 1999; Gray et al. 2010; Doolittle et al. 2011; Speijer 2011). For example, the observation of recurrent emergence of SLTS suggests that the mutational path to a new SLTS system is readily available over long evolutionary times; on the other hand, the lack of reversion from SLTS to non-SLTS presumably indicates general selective forces opposing loss of SLTS, for instance due to loss of the machinery involved in the non–SLTS-dependent expression of the genes subject to SLTS. Other quasineutral changes that have been repeatedly used as substrate for molecular innovations suggest that certain genomic traits confer evolutionary flexibility, opening new venues that can be explored during evolution. Thus, their mere presence would be indicative of evolutionary potential, allowing specific hypotheses about the occurrence of typically accompanying features (e.g., reorganization of conserved synteny after WGDs or the creation of operons in the presence of SLTS). In other cases, although cellular/organismal phenotypic consequences of genomic recurrence may not be immediately evident, careful study of genomic patterns can provide straightforward testable hypotheses about phenotypic consequences. For instance, the observation of recurrent evolution of gastrointestinal RNAase paralogs in two leaf-eating monkey lineages made specific predictions that protein sequence changes in the gastrointestinal RNAase gene would enhance digestion, which were later experimentally confirmed (Zhang 2006). However, it is in the less predictable cases in which the study of recurrent genome evolution arguably reaches the height of its power. For instance, the finding that splicing motifs become highly similar among the remaining introns in nearly intronless species came as a profound surprise (Irimia et al. 2007; Irimia and Roy 2008; Schwartz et al. 2008). This pattern indicates a rule that is at the same time extremely clear and poorly understood: In the context of (or following) nearly complete intron loss, selection for consensus sequences increases on remaining introns. In such cases, the repeatability of the evolutionary outcomes is likely to point at specific ways in how selection acts on these features, illuminating the path for future research.

Concluding Remarks

The diverse instances discussed here represent only a subset of the known cases of repeated evolution at the genome level that have been found largely serendipitously, suggesting that recurrent patterns of genome evolution are widespread. In addition, although recurrent evolution can occur by sheer chance, the above examples provide extensive evidence that genomic recurrence often respond to specific evolutionary forces. As ancestrally shared features are the result of a common evolutionary history, shared features evolved by recurrent evolution are often the result of common evolutionary forces acting on different lineages. These cases improve our understanding of genome evolution, the causes and the modes, allowing us to make specific predictions about evolutionary outcomes. Unraveling the manifold significance of repeated genomic outputs will necessarily require comprehensive and systematic analyses of recurrent phenomena as well as rigorous statistical testing and greater phylogenetic sampling to assess the dynamics underlying observed cases of convergence. Given the increasing availability of complete genome sequences, these analyses are increasingly possible, and as with replicates in experimental research, recurrent events will help us to sketch an increasingly focused picture of genome evolution.

170 in total

Review 1. Mutational equilibrium model of genome size evolution.

Authors: Dmitri A Petrov
Journal: Theor Popul Biol Date: 2002-06 Impact factor: 1.570

Review 2. Convergence and parallelism reconsidered: what have we learned about the genetics of adaptation?

Authors: Jeff Arendt; David Reznick
Journal: Trends Ecol Evol Date: 2007-11-19 Impact factor: 17.712

3. Mystery of intron gain: new data and new models.

Authors: Scott William Roy; Manuel Irimia
Journal: Trends Genet Date: 2008-12-11 Impact factor: 11.639

4. Splicing in the eukaryotic ancestor: form, function and dysfunction.

Authors: Scott William Roy; Manuel Irimia
Journal: Trends Ecol Evol Date: 2009-07-01 Impact factor: 17.712

5. Sequence and genetic map of Meloidogyne hapla: A compact nematode genome for plant parasitism.

Authors: Charles H Opperman; David M Bird; Valerie M Williamson; Dan S Rokhsar; Mark Burke; Jonathan Cohn; John Cromer; Steve Diener; Jim Gajan; Steve Graham; T D Houfek; Qingli Liu; Therese Mitros; Jennifer Schaff; Reenah Schaffer; Elizabeth Scholl; Bryon R Sosinski; Varghese P Thomas; Eric Windham
Journal: Proc Natl Acad Sci U S A Date: 2008-09-22 Impact factor: 11.205

6. A mitochondrial DNA phylogeny of African clawed frogs: phylogeography and implications for polyploid evolution.

Authors: Ben J Evans; Darcy B Kelley; Richard C Tinsley; Don J Melnick; David C Cannatella
Journal: Mol Phylogenet Evol Date: 2004-10 Impact factor: 4.286

7. Vanishing GC-rich isochores in mammalian genomes.

Authors: Laurent Duret; Marie Semon; Gwenaël Piganeau; Dominique Mouchiroud; Nicolas Galtier
Journal: Genetics Date: 2002-12 Impact factor: 4.562

Review 8. Divergence and convergence in enzyme evolution.

Authors: Michael Y Galperin; Eugene V Koonin
Journal: J Biol Chem Date: 2011-11-08 Impact factor: 5.157

9. Eukaryotic genome size databases.

Authors: T Ryan Gregory; James A Nicol; Heidi Tamm; Bellis Kullman; Kaur Kullman; Ilia J Leitch; Brian G Murray; Donald F Kapraun; Johann Greilhuber; Michael D Bennett
Journal: Nucleic Acids Res Date: 2006-11-07 Impact factor: 16.971

10. Evolutionary history of the iroquois/Irx genes in metazoans.

Authors: Pierre Kerner; Aissam Ikmi; Dario Coen; Michel Vervoort
Journal: BMC Evol Biol Date: 2009-04-15 Impact factor: 3.260

15 in total

1. Genealogical correspondence of a forebrain centre implies an executive brain in the protostome-deuterostome bilaterian ancestor.

Authors: Gabriella H Wolff; Nicholas J Strausfeld
Journal: Philos Trans R Soc Lond B Biol Sci Date: 2016-01-05 Impact factor: 6.237

2. Origin of spliceosomal introns and alternative splicing.

Authors: Manuel Irimia; Scott William Roy
Journal: Cold Spring Harb Perspect Biol Date: 2014-06-02 Impact factor: 10.005

3. Tibet as a potential domestication center of cultivated barley of China.

Authors: Xifeng Ren; Eviatar Nevo; Dongfa Sun; Genlou Sun
Journal: PLoS One Date: 2013-05-03 Impact factor: 3.240

4. Extensive conservation of ancient microsynteny across metazoans due to cis-regulatory constraints.

Authors: Manuel Irimia; Juan J Tena; Maria S Alexis; Ana Fernandez-Miñan; Ignacio Maeso; Ozren Bogdanovic; Elisa de la Calle-Mustienes; Scott W Roy; José L Gómez-Skarmeta; Hunter B Fraser
Journal: Genome Res Date: 2012-06-21 Impact factor: 9.043

5. Evolution of protein indels in plants, animals and fungi.

Authors: Pravech Ajawatanawong; Sandra L Baldauf
Journal: BMC Evol Biol Date: 2013-07-04 Impact factor: 3.260

6. Population-genetic analysis of HvABCG31 promoter sequence in wild barley (Hordeum vulgare ssp. spontaneum).

Authors: Xiaoying Ma; Hanan Sela; Genlin Jiao; Chao Li; Aidong Wang; Mohammad Pourkheirandish; Dmitry Weiner; Shun Sakuma; Tamar Krugman; Eviatar Nevo; Takao Komatsuda; Abraham Korol; Guoxiong Chen
Journal: BMC Evol Biol Date: 2012-09-24 Impact factor: 3.260

7. Helical antifreeze proteins have independently evolved in fishes on four occasions.

Authors: Laurie A Graham; Rod S Hobbs; Garth L Fletcher; Peter L Davies
Journal: PLoS One Date: 2013-12-06 Impact factor: 3.240

8. A paneukaryotic genomic analysis of the small GTPase RABL2 underscores the significance of recurrent gene loss in eukaryote evolution.

Authors: Marek Eliáš; Vladimír Klimeš; Romain Derelle; Romana Petrželková; Jan Tachezy
Journal: Biol Direct Date: 2016-02-02 Impact factor: 4.540

Review 9. Domestication of self-splicing introns during eukaryogenesis: the rise of the complex spliceosomal machinery.

Authors: Julian Vosseberg; Berend Snel
Journal: Biol Direct Date: 2017-12-01 Impact factor: 4.540

10. The medaka mutation tintachina sheds light on the evolution of V-ATPase B subunits in vertebrates.

Authors: Claudia Müller; Ignacio Maeso; Joachim Wittbrodt; Juan R Martínez-Morales
Journal: Sci Rep Date: 2013-11-14 Impact factor: 4.379