
Genomic Convergence in the Adaptation to Extreme Environments.

Shaohua Xu, Jiayan Wang, Zixiao Guo, Ziwen He, Suhua Shi.

Abstract

Convergent evolution is especially common in plants that have independently adapted to the same extreme environments (i.e., extremophile plants). The recent burst of omics data has alleviated many limitations that have hampered molecular convergence studies of non-model extremophile plants. In this review, we summarize cases of genomic convergence in these taxa to examine the extent and type of genomic convergence during the process of adaptation to extreme environments. Despite being well studied by candidate gene approaches, convergent evolution at individual sites is rare and often has a high false-positive rate when assessed in whole genomes. By contrast, genomic convergence at higher genetic levels has been detected during adaptation to the same extreme environments. Examples include the convergence of biological pathways and changes in gene expression, gene copy number, amino acid usage, and GC content. Higher convergence levels play important roles in the adaptive evolution of extremophiles and may be more frequent and involve more genes. In several cases, multiple types of convergence events have been found to co-occur. However, empirical and theoretical studies of this higher level convergent evolution are still limited. In conclusion, both the development of powerful approaches and the detection of convergence at various genetic levels are needed to further reveal the genetic mechanisms of plant adaptation to extreme environments.
© 2020 The Author(s).

Keywords:  adaptive evolution; convergent evolution; extreme environments; plant genomes

Year: 2020; PMID: 33367270; PMCID: PMC7747959; DOI: 10.1016/j.xplc.2020.100117

Source DB: PubMed; Journal: Plant Commun; ISSN: 2590-3462


Introduction

The study of adaptation plays a central role in evolutionary biology. Although adaptive events are often challenging to identify, independent evolution of similar traits in multiple clades is one of the clearest pieces of evidence that adaptation has occurred (Losos, 2011; Martin and Orgogozo, 2013; Stern, 2013; Storz, 2016). Such convergent evolution is common across many taxa. Classic examples include the independent evolution of flight in insects, birds, and bats; eyes in octopi, vertebrates, and spiders; and C4 photosynthesis in diverse plant lineages. A central question is whether the convergence in phenotype reflects similar underlying molecular events. Most empirical and theoretical work to address this question has focused on unicellular organisms, mammals, model plants, and crops (Zhang, 2006; Li et al., 2010; Lin et al., 2012; Tenaillon et al., 2012; Zhen et al., 2012; Lenser and Theißen, 2013; Stern, 2013; Storz, 2016; Woodhouse and Hufford, 2019; Wu et al., 2020a, 2020b; Zhang et al., 2020). Recently, the availability of genomic data has enabled the expansion of molecular convergence analyses to non-model organisms. The detection of convergence at the whole-genome level is emerging as a new frontier in evolutionary biology (Parker et al., 2013; Foote et al., 2015; He et al., 2020a; Zhang et al., 2020). However, it is still unknown whether genomic convergence is a common feature of adaptive evolution. Further work is needed to investigate the organization and extent of convergence across multiple genomes to determine its evolutionary frequency.

Plants that have adapted to extreme environments, i.e., extremophiles, are particularly promising subjects for genomic convergent evolution research. Plants are found in virtually all terrestrial habitats, including the most challenging environments.
These organisms have established themselves in unstable intertidal zones, dry and hot deserts, aquatic environments, extremely cold arctic environments, and high altitudes with both cold temperatures and strong UV radiation. Stressors from these extreme environments would be detrimental to most plants, hence there are high selective pressures for traits that allow survival. Having experienced similar strong selective pressures from similar extreme environments, phylogenetically distinct plant taxa have convergently evolved similar traits and adaptive strategies (Hoekstra et al., 2001; Shi et al., 2005; Zhou et al., 2016; Thorogood et al., 2018). With the increasing availability of plant genome data, exploration of the molecular bases that underlie these similar traits has been attempted in several extremophiles (Oh et al., 2012; Wu et al., 2012; Xu et al., 2017a, 2017b; He et al., 2020a). Despite some limitations, the study of molecular convergence using genomic data not only helps to reveal the molecular basis of phenotypic convergence in extremophiles but also provides insight into the principles of convergent evolution. For example, although adaptive molecular convergence has been detected in a small number of candidate genes, its detection at the whole-genome level has been controversial because of the lack of suitable controls and a large amount of neutral convergence (Parker et al., 2013; Zou and Zhang, 2015a; Foote et al., 2015; Thomas and Hahn, 2015). Extremophiles that have adapted to the same extreme environment and are subject to similar selective pressures may offer greater opportunity for the detection of genomic-level convergent evolution (He et al., 2020b). Comparison of extremophiles can also help to identify molecular convergence at various genetic levels (Figure 1). 
The best-studied molecular events are the convergent site substitutions (Zhang and Nei, 1997; Li et al., 2010; Liu et al., 2010; Zhen et al., 2012; Parker et al., 2013; Foote et al., 2015; Hu et al., 2017). Many studies have also found that extremophile plants have undergone the same amino acid (AA) substitutions during adaptation to the same environment (Christin et al., 2010; Xu et al., 2017a; Fukushima et al., 2017). Convergent evolution at other genetic levels has also been detected in extremophiles. For example, an important feature of plant genomes is the high number of gene duplications (Zhang, 2003; Oh et al., 2012). Duplicated genes may evolve new functions or increase the abundance of proteins, both of which have the potential to convergently promote extreme environmental adaptation (Conn et al., 2015; Van Buren et al., 2019). The availability of whole-genome data provides opportunities to comprehensively survey convergent evolution at other genetic levels, such as gene expression variation, TE (transposable element) content, and other features of plant genomes (Yang et al., 2017; Lyu et al., 2018; He et al., 2020a).
Figure 1

Types of Genomic Convergence.

We classify genomic convergence as events that affect nucleotide substitutions at individual sites, gene copy number, gene expression alteration, and genome composition. The source data, detection methods, and examples are shown for each type. See Box 1 for details of detection methods. The examples in (A) and (C) are adapted from Yang et al. (2017). The examples in (B) and (D) are based on data from Ma et al. (2013), Wu et al. (2012), Yang et al. (2013) and Lyu et al. (2018).

In this review, we summarize advances in the study of plant genome convergence during adaptation to extreme environments and explore the extent and the genetic levels of genomic convergence (Table 1). We first discuss convergent site substitutions in a small number of candidate genes, in individual whole genomes, and in population genomics. We then examine genomic convergence at higher levels, focusing on changes in gene copy number, gene expression, and whole-genome composition. We discuss the probability of convergence at these genetic levels. We highlight cases where genomic convergence at various genetic levels contributes to plant adaptation to the same extreme environments and highlight the necessity of developing powerful approaches and comprehensive analysis of convergence at various genetic levels to further advance genomic convergence studies.
Table 1

Summary of Molecular Convergence Events in Plants Adapting to Extreme Environments.

Environmental stress | Plant taxa | Phenotype | Genetic level of convergence (site substitution; expression; copy number; whole-genome feature) | References
Drought and hot environment | eight independent grass C4 lineages | C4 photosynthesis | PEPC | Christin et al. (2007)
Drought and hot environment | C4 lineages in Cyperaceae, grass, and eudicot | C4 photosynthesis | PEPC | Besnard et al. (2009)
Drought and hot environment | 23 independent monocot C4 lineages | C4 photosynthesis | rbcL | Christin et al. (2008)
Drought and hot environment | A. comosus (pineapple), P. equestris (moth orchid), and K. fedtschenkoi | CAM photosynthesis | PEPC, HY5 (elongated hypocotyl 5) and two CAZyme genes; 54 genes including PPCK; both PEPC and PPCK have duplications | Yang et al. (2017)
Drought and hot environment | C4 lineages in Alloteropsis | C4 photosynthesis | lateral gene transfer of ppc and pck | Christin et al. (2012)
Low nutrient availability in soils | C. follicularis, N. alata, D. adelae, and S. purpurea | carnivorous | GH19 chitinases, purple acid phosphatases, and RNase T2s | Fukushima et al. (2017)
Invasive insect | ash trees (Fraxinus) | EAB (Agrilus planipennis) resistance | 53 genes | Kelly et al. (2020)
Highly fluctuating environment of intertidal zone | mangrove taxa: Avicennia marina, Rhizophora apiculata, and Sonneratia alba | High salinity tolerance, vivipary, aerial root | ∼400 genes; Convergent AA composition change, convergent TE and genome size reduction | Xu et al. (2017a); Lyu et al. (2018); He et al. (2020a)
Polar extreme environments | Antarctic psychrophilic green algae (Chlamydomonas sp. ICE-L, and Tetrabaena socialis) | resistance to low temperature | ∼150 genes | Zhang et al. (2019)
High latitude and low temperature | lodgepole pine (Pinus contorta) and interior spruce (Picea glauca, Picea engelmannii, and their hybrids) | resistance to low temperature | 47 genes; 61 genes; duplicated genes are enriched for site convergence | Yeaman et al. (2016)
Calamine metalliferous soil | two populations of A. halleri and two populations of A. arenosa | resistance to heavy metal toxicity | 24 genes | Arnold et al. (2016); Preite et al. (2019)
High salinity | T. salsuginea and P. euphratica | | tandem duplication of HKT1 | Wu et al. (2012); Ma et al. (2013)
Desiccation | eight resurrection plants | resurrection | tandem duplication of ELIPs | Van Buren et al. (2019)
Parasitic habits | obligate parasitic plants in the Orobanchaceae | host detection | KAI2d; KAI2d is a rapid-evolving clade of KAI2 duplication | Conn et al. (2015)
Aquatic environments | Halophila and Zosteraceae | loss or simplification of stomata and complex flower structures | convergent loss of ∼1200 conservative core genes in monocot | Lee et al. (2018)

Convergence of Site Substitutions

Convergent AA Substitutions in Candidate Genes

Early attempts to detect molecular convergence usually focused on phenotypes associated with known candidate genes. One of the most well-studied examples of convergent evolution among extremophiles is the independent development of C4 and Crassulacean acid metabolism (CAM) photosynthesis from C3 photosynthesis. These alternative modes of photosynthesis have evolved in diverse plant species more than 60 times as a result of adaptation to drought and high temperatures (Sage et al., 2011; Edwards and Ogburn, 2012; Heyduk et al., 2019). Several genes involved in photosynthesis appear to have experienced convergent site substitutions. The phosphoenolpyruvate carboxylase (PEPC) gene, which encodes a key enzyme in the C4 photosynthetic pathway, has the highest number of convergent AA substitutions. Among eight independent grass C4 lineages, 21 PEPC sites have converged to similar or identical AAs (Christin et al., 2007). In the second most species-rich C4 family, Cyperaceae, the PEPC gene from 104 C4 species in five phylogenetically independent groups shares 16 parallel AA substitutions driven by positive selection, five of which also appear in grass and eudicot C4 lineages (Besnard et al., 2009). Rubisco, an enzyme involved in the fixation of CO2 into organic compounds, is another key enzyme that has undergone convergent evolution in the photosynthetic system. Rubisco from C4 plants shows greater catalytic efficiency but lower specificity than that of C3 plants. This is probably an adaptation to higher CO2 concentrations in C4 photosynthetic cells. The rbcL gene, which encodes the large subunit of Rubisco, has eight codons with adaptive convergent substitutions among 23 independent monocot C4 lineages (Christin et al., 2008). Another example of convergent site substitution occurs in carnivorous plants. 
Most carnivorous plants are found in damp heaths, bogs, swamps, and muddy or sandy shores where nitrogenous material is often scarce or unavailable because of acidic or other unfavorable soil conditions. To survive in these stressful environments, more than 600 plant species have evolved carnivorous traits; e.g., trap morphology and the presence of digestive fluid proteins (Adamec, 1997; Thorogood et al., 2018). By sequencing digestive fluid proteins from four carnivorous species that originated independently from three clades (Cephalotus follicularis in Oxalidales, Sarracenia purpurea in Ericales, and Nepenthes alata and Drosera adelae in Caryophyllales), Fukushima et al. (2017) found that carnivorous plants repeatedly co-opted stress-response genes to putatively enable a particular digestive physiology. The repeatedly utilized proteins (GH19 chitinases, purple acid phosphatases, and RNase T2s) have also acquired convergent AA substitutions. As the convergent substitutions tend to be located at exposed positions that constitute the protein-environment interface, the proteins may experience the same selective pressure from the digestive fluid environment (including the presence of insect-derived substrates, high endogenous proteolytic activity, low pH, and microbial invasion or symbiosis) to maintain catalytic activity. The candidate gene approach used to study convergent evolution is usually supported by both evolutionary and functional evidence. Therefore, this approach has identified well-supported convergent events that accompany adaptation to extreme environments. However, the approach is limited to phenotypes associated with pre-existing candidate genes, which are not available for most extremophiles.

Genome-wide Convergence of AA Substitutions

Genomic data provide an opportunity to expand the scope of research to phenotypes with an unknown genetic basis. A series of approaches developed for candidate genes have been applied to detect adaptive genomic convergence (Box 1). In a genome-wide survey of AA substitutions among three CAM plant genomes, Yang et al. (2017) identified four CAM photosynthesis-related genes that had experienced convergent evolution, including the previously identified PEPC gene (Figure 1A). There is an unexpectedly large number of convergent AA substitutions in the PEPC proteins of the CAM plants Kalanchoe fedtschenkoi and Phalaenopsis equestris (Yang et al., 2017). The convergent AA substitutions can increase PEPC activity. Another gene, HY5 (elongated hypocotyl 5), carries a convergent AA substitution in K. fedtschenkoi and P. equestris and functions in the blue light signaling pathway, which acts as an input to entrain the circadian clock. Because CAM species temporally separate CO2 uptake and fixation, the circadian clock should play a key role in controlling the robust oscillation of the physiological and biochemical features of CAM.

AA substitutions were the main targets of early attempts to detect convergent substitutions. These studies involve two steps (Zhang and Kumar, 1997). First, AA sites that experienced convergent substitutions are identified. Second, candidate sites are tested for evidence of adaptive evolution against a neutral null hypothesis. Several approaches using different criteria are listed below.

Test of Whether the Observed Numbers of Convergent Sites Can be Attributed to Random Chance

This method estimates the number of expected convergent substitutions under neutral processes using simulations. Computation is performed based on parameters such as the AA substitution rate matrix, AA equilibrium frequencies, and branch lengths (Zou and Zhang, 2015b). Statistical tests are then employed to evaluate whether the number of observed sites can be explained by neutral evolution. When applied to a small number of candidate genes, this method can successfully identify convergent AA substitutions with high confidence (Zhang and Kumar, 1997; Zhang, 2006; Castoe et al., 2009; Li et al., 2010; Zhen et al., 2012; Stern, 2013; Storz, 2016). However, when applied to whole genomes, the expected number of convergent substitutions may vary greatly depending on the AA substitution model selected (Zou and Zhang, 2015b; He et al., 2020b). For example, by ignoring the variation of acceptable AAs among different sites and at different genetic distances, nonadaptive convergence may be underestimated.
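The final testing step can be illustrated with a minimal sketch. It assumes, purely for illustration, that the number of neutral convergent sites is approximately Poisson distributed around the simulated expectation; the function name `poisson_upper_tail` and all numbers are hypothetical and are not taken from the cited studies.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected).

    `expected` stands in for the neutral expectation obtained from
    simulations; the Poisson model itself is an assumption of this sketch.
    """
    if observed <= 0:
        return 1.0
    cdf = 0.0
    term = math.exp(-expected)  # P(X = 0)
    for k in range(observed):
        cdf += term                 # accumulate P(X = k) for k < observed
        term *= expected / (k + 1)  # move on to P(X = k + 1)
    return max(0.0, 1.0 - cdf)


# Hypothetical illustration: the same observed count judged against the
# expectations produced by two different AA substitution models.
observed_sites = 10
for label, expected_sites in [("model A", 2.0), ("model B", 8.0)]:
    p = poisson_upper_tail(observed_sites, expected_sites)
    print(f"{label}: expected={expected_sites}, P(X >= {observed_sites}) = {p:.4g}")
```

In this toy setting the same observation looks highly significant under one expectation and unremarkable under the other, which mirrors the model sensitivity described above.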

Δ Site-Specific Likelihood Support

The ΔSSLS (site-specific likelihood support) method is a tree-based approach. The method first defines a null hypothesis of a species tree (H0) and an alternative hypothesis of a convergence tree (H1). The method then calculates SSLS using maximum likelihood (ML) for both trees. The difference in SSLS between H0 and H1 is calculated at each site: ΔSSLS = SSLS (H0) – SSLS (H1). The method was first developed to detect the convergence of 13 mitochondrial genes between snakes and agamid lizards (Castoe et al., 2009) and was also applied to detect genome-wide convergence in echolocating mammals by Parker et al. (2013). To calculate genic level ΔSSLS values, Parker et al. (2013) used the mean ΔSSLS of all sites within a gene as the measure of the strength of support for convergent evolution. A negative ΔSSLS value is a signature of convergence. However, the application of the ΔSSLS method to whole-genome data has been shown to produce false-positives because of the lack of suitable controls (Zou and Zhang, 2015a; Thomas and Hahn, 2015).
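The calculation described above can be sketched in a few lines. The sketch assumes that SSLS values are per-site log-likelihoods already computed by an external ML program for each tree; the function names and the input numbers are hypothetical.

```python
from typing import List, Sequence


def delta_ssls(ssls_h0: Sequence[float], ssls_h1: Sequence[float]) -> List[float]:
    """Per-site delta SSLS = SSLS(H0) - SSLS(H1).

    H0 is the species tree and H1 the convergence tree, as in the text.
    """
    if len(ssls_h0) != len(ssls_h1):
        raise ValueError("both trees must be scored on the same sites")
    return [h0 - h1 for h0, h1 in zip(ssls_h0, ssls_h1)]


def gene_mean_delta_ssls(ssls_h0: Sequence[float], ssls_h1: Sequence[float]) -> float:
    """Gene-level support: mean delta SSLS over all sites of one gene."""
    deltas = delta_ssls(ssls_h0, ssls_h1)
    if not deltas:
        raise ValueError("gene has no sites")
    return sum(deltas) / len(deltas)


# Hypothetical per-site log-likelihoods for a three-site gene.
h0 = [-2.0, -3.5, -1.0]
h1 = [-1.5, -3.0, -1.5]
print(delta_ssls(h0, h1))            # negative entries favor the convergence tree
print(gene_mean_delta_ssls(h0, h1))  # a negative mean supports convergence
```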

Convergence/Divergence Ratio

A simple and widely used method is the comparison of the ratio of convergence (C) and divergence (D). The C/D ratio test has been widely used as auxiliary evidence of convergence. Under neutral evolution, the number of divergent sites is highly correlated with the number of convergent sites. Within a group of species without convergent evolution, the ratio between each pair of species should be linearly correlated. A pair of species shows evidence of convergent evolution if it has a higher convergent/divergent ratio than other species pairs.
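A minimal sketch of the counting behind a C/D ratio follows. It assumes one common working definition that is not spelled out in the text: a site is convergent when both branches changed to the same AA, and divergent when both changed to different AAs. Ancestral and derived sequences are taken as given; the function names and the toy sequences are hypothetical.

```python
from typing import Tuple


def count_convergent_divergent(anc1: str, der1: str, anc2: str, der2: str) -> Tuple[int, int]:
    """Count convergent and divergent sites along two independent branches.

    Each branch is described by aligned ancestral and derived AA sequences.
    Only sites that changed on BOTH branches are counted (assumed definition).
    """
    if not (len(anc1) == len(der1) == len(anc2) == len(der2)):
        raise ValueError("all sequences must be aligned to the same length")
    convergent = divergent = 0
    for a1, d1, a2, d2 in zip(anc1, der1, anc2, der2):
        if a1 != d1 and a2 != d2:
            if d1 == d2:
                convergent += 1
            else:
                divergent += 1
    return convergent, divergent


def cd_ratio(convergent: int, divergent: int) -> float:
    """Convergence/divergence ratio for one pair of branches."""
    if divergent == 0:
        raise ValueError("C/D ratio is undefined when there are no divergent sites")
    return convergent / divergent


# Toy example: two convergent sites (positions 1 and 5), one divergent site (position 3).
c, d = count_convergent_divergent("AAAAA", "SATAG", "AAAAA", "SAVGG")
print(c, d, cd_ratio(c, d))
```

In practice the ratio of a focal pair would be compared with the ratios of other species pairs in the same phylogeny, as described above.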

CCS Method

The CCS (convergence at conservative sites) method was developed to eliminate the influence of neutral convergence and concentrate on adaptive events (Xu et al., 2017a; He et al., 2020a). The CCS method is based on two principles. First, symmetric phylogenies are used for focal and control groups. The method assumes that convergent evolution is neutral in the control group and can be used as the baseline for detecting adaptive convergence in the focal group. Second, only conservative AA sites are used for convergence detection. The AA sites that are highly variable among species can produce a large number of convergent events by chance. By contrast, conservative sites are usually functionally important, and their convergent substitutions are more likely to be adaptive.

Although whole-genome approaches promise to reveal novel convergence events, careful statistical analyses must be performed to distinguish real events from background noise. This problem is more tractable when a small number of candidate genes are analyzed, as background events can be distinguished using neutrality tests. However, whole-genome studies include tens of thousands of genes, only a small fraction of which are expected to acquire adaptive convergent substitutions. The signal from rare events can be overwhelmed by background AA substitutions (Rokas and Carroll, 2008; Goldstein et al., 2015). To robustly control for genome-wide background noise, Xu et al. (2017a) developed the CCS method for the detection of genome-wide convergence at only conservative sites (Box 1). The method focuses exclusively on conservative sites, discarding rapidly evolving loci that account for the bulk of background noise. Simulation results strongly suggest that the CCS method has a low false-positive rate. The method can be made even more robust by including controls for each focal taxon to form a symmetrical design.
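The two CCS principles described above (restricting the scan to conservative sites, and using a control group as the neutral baseline) can be illustrated with a toy sketch. This is not the published implementation: the conservation threshold `min_fraction`, the majority-residue shortcut used in place of ancestral state reconstruction, and all function names and data are assumptions made for illustration only.

```python
from collections import Counter
from typing import Iterable, Tuple


def is_conservative(background: str, min_fraction: float = 0.9) -> bool:
    """A site counts as conservative if one residue dominates the background species.

    `min_fraction` is a hypothetical threshold chosen for this sketch.
    """
    top_count = Counter(background).most_common(1)[0][1]
    return top_count / len(background) >= min_fraction


def is_convergent(group: str, background: str) -> bool:
    """All members of the group share one residue that differs from the background majority."""
    majority = Counter(background).most_common(1)[0][0]
    return len(set(group)) == 1 and group[0] != majority


def ccs_counts(sites: Iterable[Tuple[str, str, str]]) -> Tuple[int, int]:
    """Count convergent conservative sites in the focal and the control group.

    Each site is (background residues, focal residues, control residues).
    The control count serves as the neutral baseline for the focal count.
    """
    focal = control = 0
    for background, focal_residues, control_residues in sites:
        if not is_conservative(background):
            continue  # highly variable sites are discarded
        focal += is_convergent(focal_residues, background)
        control += is_convergent(control_residues, background)
    return focal, control


# Toy alignment columns (hypothetical data).
toy_sites = [
    ("AAAAAAAAAA", "SS", "AA"),  # focal convergence at a conservative site
    ("AAAAAAAAAA", "AA", "TT"),  # control convergence (neutral baseline)
    ("ACDEFGHIKL", "SS", "AA"),  # variable site, skipped
    ("LLLLLLLLLL", "FF", "LL"),  # focal convergence
    ("GGGGGGGGGG", "GA", "GG"),  # focal taxa differ from each other, not convergent
]
print(ccs_counts(toy_sites))
```

An excess of focal over control counts is what such a design looks for; deciding whether the excess is meaningful still requires the statistical and functional checks discussed in the text.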
The CCS method has been fruitfully applied to the study of convergence in mangroves, woody plants that originated independently from inland ancestors and have adapted to intertidal environments. Habitats at the interface between terrestrial and marine environments are characterized by high salinity, hypoxia, daily fluctuating tides, strong UV light, high temperature and sedimentation, and muddy anaerobic soils (Giri et al., 2011; He et al., 2019). To colonize new habitats, mangroves have evolved a series of highly specialized characteristics such as salt tolerance, viviparous embryos, and aerial roots (Tomlinson, 1986; Liang et al., 2008). Furthermore, the mangrove suite of traits has evolved many times, with closely related non-mangrove species available as controls. However, despite bringing together the stringent CCS method and a well-suited biological system, almost 80% of the candidate convergent substitutions can be attributed to neutral noise (Xu et al., 2017a). Nevertheless, the study did identify ∼400 genes that harbored candidate adaptive convergent AA substitutions (Xu et al., 2017a). The ubiquitin-mediated proteolysis, N-glycan biosynthesis, and glutathione metabolism pathways were enriched in convergently evolving genes. These processes are involved in stress response and are likely to be important for mangroves' adaptation to their habitats. He et al. (2020a) improved the CCS method to identify genes that harbor convergent AA substitutions. They identified 73 genes that likely underlie salinity tolerance and are located mainly in the plasma membrane (Figure 2A).
Figure 2

Multi-level Convergence of Mangrove Genomes.

(A) Examples of convergent AA substitutions in three mangrove genomes. Each of the three genes contains at least three convergent AA substitutions and participates in salinity tolerance.

(B) Convergence of AA usage in mangrove genomes. The AA usage of the three mangrove genomes is distinct from that of more than 50 inland dicotyledon genomes. The five most underused AAs are shown on the left (blue font) and the four most overused in mangroves are shown on the right (red font).

(C) The numbers of long terminal repeat-retrotransposons (LTR-RTs) in the three mangrove genomes are convergently smaller than those of their inland relatives. (A) and (B) are based on data from He et al. (2020a); (C) is based on data from Lyu et al. (2018).

The CCS method was also applied to detect convergence in two psychrophilic green algae adapted to extremely cold polar environments (Zhang et al., 2019). Transcriptome sequencing of two independently originated Antarctic psychrophilic green algae (Chlamydomonas sp. ICE-L and Tetrabaena socialis) identified genes that harbored convergent substitutions. There were 1,031 convergent AA substitutions in psychrophilic algae and 761 in mesophilic algae. The excess convergent substitutions in psychrophilic algae suggest that adaptation to extreme environments is a causal factor in the convergence process. The photosynthetic machinery, multiple antioxidant systems, and several crucial translation elements in Antarctic psychrophilic algae appear to harbor many convergent substitutions, suggesting that these organisms possess relatively stable photosynthetic apparatuses and multiple protective mechanisms. Because of the high degree of neutral convergence, the identification of candidate convergent genes in whole genomes is a probability statement. For example, even with additional criteria, the probability that any candidate convergent gene identified by Xu et al. (2017a) has experienced convergent evolution is only 50%. Methods innovation and functional tests are needed to explicitly identify true adaptive convergence (He et al., 2020a).
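As a rough worked example of the excess reported above for the polar algae, the two counts can be compared directly. Treating the mesophilic count as an approximate neutral baseline is an assumption of this sketch, not a claim made by the cited study, and the variable names are hypothetical.

```python
# Counts quoted in the text (Zhang et al., 2019).
psychrophilic_sites = 1031  # convergent AA substitutions in psychrophilic algae
mesophilic_sites = 761      # convergent AA substitutions in mesophilic algae

# Assumption: the mesophilic count approximates the neutral background.
excess = psychrophilic_sites - mesophilic_sites
ratio = psychrophilic_sites / mesophilic_sites
excess_fraction = excess / psychrophilic_sites

print(f"excess sites: {excess}")
print(f"focal/control ratio: {ratio:.2f}")
print(f"fraction of focal sites above baseline: {excess_fraction:.2f}")
```

Under that assumption, roughly a quarter of the convergent sites in the psychrophilic pair exceed the baseline, which is consistent with the point above that most candidate sites in a whole-genome scan may still reflect neutral convergence.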

Detection of Site Convergence Using Population Genomic Data

The cases summarized above involve convergent evolution in unrelated or phylogenetically distinct species. Independent adaptation may also occur between closely related species or between populations of the same species. Unlike phylogenetically independent adaptive convergence in which evolution is dominated by independently occurring mutations, the convergent evolution of closely related species or populations can be driven by introgression or by independent selection on standing variation (Stern, 2013). Furthermore, the fitness effects of AA substitutions are often conditional on the genetic background. Thus, convergent site substitutions are more likely to have similar beneficial effects in closely related species (Storz, 2016). The same single nucleotide polymorphisms (SNPs) seem to be repeatedly recruited during adaptation to the same extreme environment, and the comparison of genomic data provides evidence of both adaptive and independent evolution (Lee and Coop, 2019). Kelly et al. (2020) exploited the convergent evolution of resistance to the emerald ash borer (EAB, Agrilus planipennis) in ash trees (genus Fraxinus). By testing for both phenotypic and molecular convergence associated with EAB resistance in Fraxinus, they showed that EAB-resistant taxa occur within three independent phylogenetic lineages. They detected 53 genes with evidence of convergent AA evolution in genomes from these resistant lineages. Gene-tree reconstruction indicated that for 48 of these candidates, the convergent AAs are more likely to have arisen by independent evolution than by hybridization or incomplete lineage sorting. Seven of the candidate genes have putative roles in the phenylpropanoid biosynthesis pathway, which generates products closely related to disease and insect resistance. An additional 17 genes are involved in herbivore recognition, defense signaling, or programmed cell death. 
Phylogenetically distant conifers provide an example of genomic convergence as a result of adaptation to high latitude and low temperature. Both the lodgepole pine (Pinus contorta) and interior spruce (Picea glauca, Picea engelmannii, and their hybrids), which diverged approximately 140 million years ago, have evolved a pattern of local adaptation to the climate that reflects a tradeoff between competition for light resources and the acquisition of freezing tolerance. To investigate the basis of this adaptation, Yeaman et al. (2016) searched for correlations between individual SNPs and environmental variables in individuals from >250 populations across the geographic ranges of lodgepole pine and interior spruce. They identified a suite of 47 genes, many of them duplicated, with polymorphisms associated with spatial variation in temperature or cold hardiness in both species. Several of the convergent genes are related to seasonal transitions and abiotic stress (e.g., PRR5, FY, FLC, and RCAR1). These results suggest that long-diverged conifers share a suite of genes that play an important role in adaptation to temperature. In a study of Arabidopsis populations that have independently adapted to calamine metalliferous soils, a low degree of convergence was detected in both genes and functional networks (Preite et al., 2019). A high concentration of heavy metal ions is stressful because metalliferous soils are usually nutritionally imbalanced and toxic to plants (Brady et al., 2005; Wójcik et al., 2017). The authors detected convergent genomic footprints of selection in Arabidopsis halleri and Arabidopsis arenosa at two calamine-type metalliferous (M) sites, each of which was compared with an NM (non-metalliferous) site in its vicinity. These species co-occur at two calamine metalliferous sites with toxic levels of the heavy metals zinc and cadmium. 
The authors identified five candidate genes with selective sweep signatures that were convergent between both population pairs Mias (versus Zapa) and Klet (versus Kowa) in A. halleri and another five convergent candidate genes in A. arenosa. No convergent candidate genes were common to both species. These results suggest that species-specific metal handling and other biological features may explain the low degree of convergence between species. In another study, the same loci were highly differentiated in both Arabidopsis lyrata and A. arenosa during adaptation to a serpentine habitat (Arnold et al., 2016). Artificial domestication is a special kind of extreme environment. Under domestication, strong artificial selection drives diverse domesticated crops to evolve the same traits, such as increased fruit size and seed number, increased yield, loss of seed dormancy, loss of bitterness, and shattering (seed dispersal; Lenser and Theißen, 2013; Woodhouse and Hufford, 2019). The molecular convergence that underlies these shared phenotypic traits has been identified in different plants because of the large quantity of sequencing data available for domesticated plants. For example, the sh1 gene was found to control the loss of shattering in rice, sorghum, and maize (He et al., 2011; Lin et al., 2012). Because many previous studies on the convergent evolution of domesticated plants have been well summarized (Lenser and Theißen, 2013; Woodhouse and Hufford, 2019), we refer readers to these excellent reviews.

Convergence of Gene Copy Number Variation

A striking feature of plant genomes is the high proportion of duplicated genes. Gene duplication increases gene expression and can improve tolerance to environmental stressors (Oh et al., 2014; Van De Peer et al., 2017). Furthermore, redundant copies may also accumulate genetic changes and obtain new functions or expression patterns that increase genomic plasticity. For example, the expression of duplicated genes is more variable compared with that of single-copy loci in response to environmental stresses (Ha et al., 2007; Oh et al., 2014). Convergent evolution of both whole-genome and tandem duplicates has been observed as a result of adaptation to extreme environments (Figure 1B).

Whole-genome duplications (WGDs) occurred not only before the diversification of extant angiosperms and seed plants but also independently in the common ancestors of many groups (Jiao et al., 2011; Wu et al., 2019). By doubling genome content simultaneously, WGDs provide plenty of genetic material for evolution and are proposed to play key roles in adaptation to new environments and diversification (Ohno, 1970; Conant and Wolfe, 2008; Selmecki et al., 2015; Li et al., 2016; Van De Peer et al., 2017; Ren et al., 2018; Wu et al., 2019). Convergent retention of genes related to conditional response to stress and local adaptation may suggest the importance of gene duplication for adaptation to stressful conditions (Li et al., 2016; Van De Peer et al., 2017). A wave of ancient WGD events occurred around the K-Pg (Cretaceous-Paleogene) boundary independently in many plant lineages (Vanneste et al., 2014; Van De Peer et al., 2017; Ren et al., 2018; Wu et al., 2019). Key genes in cold response and shade avoidance pathways convergently maintain duplicated copies in diverse lineages (Wu et al., 2019). The additional copies evolved new interactions with other genes and promoted the survival of plants in cold and dark environments.
Modern polyploids, such as tetraploid Arabidopsis thaliana, rice, and citrus, have been shown to exhibit increased tolerance to salt or drought stresses (Chao et al., 2013; Yang et al., 2014; Ruiz et al., 2016). Tandem duplication of individual genes disproportionately affects loci important for stress response (Hanada et al., 2008; Freeling, 2009; Li et al., 2016; Van Buren et al., 2018). Most tandem duplications in plant genomes are relatively new and therefore unlikely to have evolved new functions or expression patterns. They are more likely to benefit survival by increasing absolute transcript abundance. One example is the massive tandem duplication of early light-induced proteins (ELIPs) shared by plants tolerant to desiccation (Van Buren et al., 2019). More than 300 diverse resurrection plants with vegetative desiccation tolerance can revive from typically lethal prolonged dehydration (Oliver et al., 2000, 2005). The ELIP gene family has expanded convergently through repeated tandem gene duplication in all eight sequenced resurrection plant genomes. Whereas 66 desiccation-sensitive land plants have fewer than 10 ELIPs, with an average of 3.1 per genome, eight desiccation-tolerant species have an average of 20.7 ELIPs per genome due to independent massive tandem duplications. The HKT1 (high-affinity K+ transporter 1) gene has also experienced repeated tandem duplication during adaptation to saline environments (Figure 1B). Compared with only one copy in A. thaliana, there are three expressed HKT1 genes arranged in a tandem gene array in the halophyte Thellungiella salsuginea (Wu et al., 2012). A similar tandem duplication has also occurred in the desert poplar, a member of the order Malpighiales that inhabits saline desert environments. The HKT1 gene family has expanded from one member in the closely related Populus trichocarpa genome to four in the Populus euphratica genome, with three copies arranged as tandem duplicates (Ma et al., 2013). 
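The ELIP copy-number contrast quoted above can be made concrete with a line of arithmetic. The following is only an illustrative sketch: the two averages (20.7 and 3.1 ELIPs per genome) are taken from the text, whereas the function name `fold_expansion` and the constant names are hypothetical and are not part of any published pipeline.

```python
def fold_expansion(tolerant_mean, sensitive_mean):
    """Return how many times larger a gene family is in tolerant genomes."""
    if sensitive_mean <= 0:
        raise ValueError("sensitive_mean must be positive")
    return tolerant_mean / sensitive_mean


# Mean ELIP copies per genome, as quoted in the text (Van Buren et al., 2019).
ELIP_MEAN_TOLERANT = 20.7   # eight desiccation-tolerant species
ELIP_MEAN_SENSITIVE = 3.1   # 66 desiccation-sensitive land plants

elip_fold = fold_expansion(ELIP_MEAN_TOLERANT, ELIP_MEAN_SENSITIVE)
print(f"ELIP family is about {elip_fold:.1f}-fold larger in resurrection plants")
```

On these averages the family is roughly 6.7-fold larger in resurrection plants. A ratio of means says nothing about whether the expansions arose independently; that inference rests on the phylogenetic distribution of the tandem arrays.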
Duplicated genes may also acquire new functions (Conant and Wolfe, 2008). Examples include the convergently evolved strigolactone recognition conferred by D14 and duplicated KAI2d genes, which enable host detection in parasitic plants (Conn et al., 2015). Obligate parasitic plants in the Orobanchaceae germinate after sensing plant hormones, strigolactones, exuded from host roots. In A. thaliana, the α/β-hydrolase D14 acts as a strigolactone receptor that controls shoot branching, whereas its ancestral paralog KAI2 does not. In a survey of 10 species that represent the full range of parasitism in the Orobanchaceae, more copies of KAI2 were found in parasite genomes than in those of non-parasitic Lamiales species. The additional copies in parasites formed the rapidly evolving clade KAI2d, which had accumulated an excess of AA substitutions. The AA substitutions in KAI2d caused protein structure changes and enabled strigolactone recognition. In addition to gene duplication, other kinds of gene copy number variation also play a role in adaptation to extreme environments. Repeated lateral transfers of genes in the C4 pathway have been found in the grass lineage Alloteropsis (Christin et al., 2012). The vertically acquired C4 PEPC gene in the three C4 clades is either absent or has no C4 characteristics. By contrast, the C4-specific PEPC genes from the three C4 clades each clustered with a distantly related clade. The lateral gene transfer was supported by the high similarity between the intron and untranslated region (UTR) sequences of horizontally acquired genes and those of the putative donors. Another key gene, pck (phosphoenolpyruvate carboxykinase), was also transferred from a member of Cenchrinae to the common ancestor of Alloteropsis angusta and Alloteropsis semialata and now performs a key function in the C4 pathways of these taxa.
The convergent lateral gene transfer may have occurred during direct contact after the deposition of foreign pollen on the Alloteropsis stigma. Natural selection may then have led to rapid fixation of the transferred genes in populations. A convergent decrease in gene copy number (i.e., gene loss) has occurred in seagrasses that have independently returned to the sea. Physiological and morphological features such as stomata and complex floral structures have been lost or simplified in seagrasses due to their re-acquired aquatic life habits (Kuo et al., 2006). The diffusion rate of volatile substances is also reduced underwater. In a survey of two independent seagrass lineages, Halophila and Zosteraceae, ∼1,200 conserved monocot core genes have been convergently lost. These losses primarily affect genes associated with volatile substance synthesis and signaling and with stomatal development (Lee et al., 2018).
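The logic behind calling a gene loss "convergent", as in the seagrass comparison above, can be sketched as a set operation: a core gene counts only if it is missing from every independently derived lineage. This is a minimal illustration under stated assumptions. The function name `convergent_losses` and all gene identifiers are hypothetical, and the sketch ignores the annotation-quality and orthology-inference checks that a real analysis would need.

```python
def convergent_losses(core_genes, lineage_gene_sets):
    """Return core genes that are missing from every lineage examined."""
    if not lineage_gene_sets:
        raise ValueError("need at least one lineage")
    core = set(core_genes)
    # Genes from the core set that each lineage lacks.
    lost_per_lineage = [core - set(genes) for genes in lineage_gene_sets]
    # Only losses shared by all lineages are candidates for convergence.
    return set.intersection(*lost_per_lineage)


# Hypothetical toy data: five "core" genes and two independent lineages.
core = ["g1", "g2", "g3", "g4", "g5"]
lineage_a = {"g1", "g2", "g5"}   # lacks g3 and g4
lineage_b = {"g1", "g4", "g5"}   # lacks g2 and g3

shared_losses = convergent_losses(core, [lineage_a, lineage_b])
print(sorted(shared_losses))
```

In this toy case only `g3` is absent from both lineages. A shared absence is a candidate for convergent loss only if the lineages are known to have lost the gene independently rather than inheriting the loss from a common ancestor.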

Convergence of Expression Regulation and Noncoding Variation

During adaptation to stressful environments, molecular phenotypes such as gene expression may have more direct effects on whole-organism traits than site substitutions or gene copy number variations. A large number of plant genes are up- or downregulated in response to environmental stresses (Maruyama et al., 2014; Feng et al., 2020; Gong et al., 2020). However, most of the identified stress-response genes show similar expression changes, and therefore expression regulation is likely to be an ancestral state shared by diverse plant taxa. Convergent gene expression evolution, which is the independent derivation of the same gene expression pattern, is relatively poorly understood because of a lack of relevant studies (Figure 1). As a key enzyme of C4 and CAM photosynthesis, PEPC not only accumulates extensive convergent AA substitutions but also shows higher expression in all C4 and CAM plants (Heyduk et al., 2019). Compared with only four genes that show convergent AA substitutions, 54 genes show convergent shifts in diel gene expression patterns in the CAM species K. fedtschenkoi and Ananas comosus (pineapple; Yang et al., 2017). These significant time shifts in diel transcript expression occur in genes related to nocturnal CO2 fixation, circadian rhythm, carbohydrate metabolism, stomatal movement, and heat stress response. For example, the maximum transcript abundance of PPCK (phosphoenolpyruvate carboxylase kinase 1), a key regulator of PEPC that promotes nocturnal CO2 uptake, shifted from daytime in C3 photosynthetic species to nighttime in the two CAM species (Figure 1C). The study of convergent evolution of distantly related conifers also revealed convergent gene expression regulation in addition to site substitutions (Yeaman et al., 2016). In response to climate stress, 61 transcripts displayed conserved differential expression patterns in both species. 
Furthermore, genes with signatures of convergent evolution are also enriched in transcription factors and are more likely to affect the expression of other genes. Convergence of gene expression may also combine with convergence in gene copy number variation. For example, the convergent tandem duplication of ELIPs in resurrection plants results in a convergent gene expression pattern change (Van Buren et al., 2019). The expression of ELIPs is low or undetectable under well-watered conditions in all surveyed tolerant and sensitive species; however, ELIPs are consistently among the most highly expressed genes in resurrection plants during dehydration and rehydration. The dramatic expansion of ELIPs results in 622-fold higher expression in resurrection plants than in desiccation-sensitive plants under water-deficit conditions. The relatively few examples of gene expression convergence may reflect the scarcity of gene expression data compared with whole-genome sequence data.
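The diel shift described earlier in this section for PPCK, whose peak transcript abundance moves from day to night, can be expressed as a difference in peak time measured on a 24-h clock. The sketch below is illustrative only. The function names and the expression values are made up, and it does not reproduce the statistical procedure used by Yang et al. (2017).

```python
def peak_time(expression_by_hour):
    """Return the sampling hour with the highest transcript abundance."""
    return max(expression_by_hour, key=expression_by_hour.get)


def circular_shift(hour_a, hour_b, period=24):
    """Smallest distance between two times of day on a circular clock."""
    diff = (hour_b - hour_a) % period
    return min(diff, period - diff)


# Hypothetical transcript abundances keyed by sampling hour (made-up numbers).
c3_profile = {2: 1.0, 6: 3.0, 10: 9.0, 14: 5.0, 18: 2.0, 22: 1.0}
cam_profile = {2: 8.0, 6: 3.0, 10: 1.0, 14: 1.0, 18: 2.0, 22: 6.0}

shift = circular_shift(peak_time(c3_profile), peak_time(cam_profile))
print(f"peak shifted by {shift} h")
```

The distance is taken on a circle so that a peak at 22:00 and one at 02:00 are treated as 4 h apart rather than 20 h. Testing whether such a shift is convergent would additionally require profiles from the C3 relatives of each CAM lineage.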

Convergence of Whole-Genome Composition

In addition to these functional genes, whole-genome features, such as GC content, AA usage, and TE content, may also be under selective pressure in extreme environments. Using whole-genome sequences from three independent mangrove clades, He et al. (2020a) examined the convergence of genomic features related to intertidal environment adaptation. The three mangrove taxa overuse the same set of AAs and underuse others (Figure 2B). Mangroves underuse AAs with large hydrophobic residues whose non-specific inter- and intra-molecular interactions may disrupt proper protein folding and conformation under hypersaline conditions (Paul et al., 2008). The degree of AA usage alteration is highest in the extracellular region where fluctuating salinity has a direct effect. The trend also helps mangroves to reduce energy costs in low-nutrient intertidal environments. The study also found that AA substitution patterns in mangroves are quite different from those of other organisms (He et al., 2020a). Therefore, mangrove proteins have been evolving using a common substitution mechanism that appears to be unique to them, resulting in the observed changes in AA composition. Lyu et al. (2018) found that all mangrove lineages massively and convergently reduced TE loads, resulting in genome size reduction (Figure 2C). By modeling TE dynamics, they found that the disappearance of these elements is a consequence of fewer births rather than additional deaths. TE-mediated genome instability can accelerate in the face of environmental insults such as high salinity and strong UV radiation (Pfeiffer et al., 2000; Schuermann et al., 2005; Argueso et al., 2008), leading to enhanced long-term selection pressure on the TE load. Therefore, the elimination of most mobile elements leading to a reduced TE load and genome size may be a convergent strategy employed by mangroves during adaptation to new stressful environments.
Furthermore, convergence in transcriptome profiles was found between the two mangroves Rhizophora mangle and Heritiera littoralis (Dassanayake et al., 2009). Greater GC content has been associated with species that grow in seasonally cold and/or dry climates, perhaps suggesting an advantage of GC-rich DNA during cell freezing and desiccation (Šmarda et al., 2014).
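The two compositional features discussed in this section, GC content and AA usage, are simple frequencies that can be computed directly from sequence. The following is a minimal sketch, assuming plain DNA and protein strings as input. The function names `gc_content` and `aa_usage` are hypothetical, and the example sequences are invented rather than drawn from any mangrove genome.

```python
from collections import Counter


def gc_content(dna):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    counts = Counter(dna.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return (counts["G"] + counts["C"]) / total


def aa_usage(proteins):
    """Relative frequency of each amino acid across a set of proteins."""
    counts = Counter("".join(proteins).upper())
    counts.pop("*", None)  # ignore stop symbols if present
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no amino acids found")
    return {aa: n / total for aa, n in counts.items()}


# Invented example sequences.
print(gc_content("ATGCNN"))
usage = aa_usage(["MKK", "L*"])
print(usage["K"])
```

Comparing such frequencies between extremophiles and their non-extremophile relatives across independent lineages is what turns a descriptive statistic into a test of convergence. The comparison itself is not shown here.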

Probability of Genomic Convergence in the Adaptation to Extreme Environments

The number of available paths leading to adaptation determines the level of molecular convergence. Relative to the large number of site substitutions across whole genomes, detected convergent substitutions are very rare in plant adaptation to extreme environments. In the two distantly related CAM plant genomes, only four protein-coding genes harbor convergent AA substitutions (Yang et al., 2017). Similar patterns emerge during the adaptation of Arabidopsis to calamine metalliferous soil (Preite et al., 2019). In this case, only five convergently evolving genes were detected in two populations that were independently adapting to these habitats, and no convergence was detected between species. The rarity of site convergence may be the result of overly rigid criteria. For example, the criteria used in the CAM plants required that whole genes be clustered into one phylogenetic clade, which is unexpected for genes with a small number of site convergences. On the other hand, site convergences detected at the whole-genome level may be false positives, given the large degree of neutral convergence. It is still an open question whether the rarity of genomic site convergence is due to a lack of powerful statistical approaches or whether phenotypic convergence involves a small number of underlying DNA sequence changes. Although convergent evolution is rare at individual sites, it may be more frequent at higher levels of organization. For example, although only four genes contained convergent AA substitutions, 54 genes experienced convergent shifts in diel expression in CAM species (Yang et al., 2017). The higher level of gene expression convergence may reflect the fact that the same expression change can be achieved by various genetic changes, such as gene copy number expansion or mutations in regulatory regions. Convergence in gene copy number may also be more frequent than has been reported among extremophiles.
The frequency of gene duplication is much higher in plants than in other domains of life. Gene duplications, especially tandem duplications, can rapidly increase the abundance of their products and may play an important role in short-term adaptation to new environments (Bianconi et al., 2018). Gene duplications produced by WGD have been proposed to promote the adaptation of plants to diverse environments through the evolution of new functions (Van De Peer et al., 2017; Ren et al., 2018; Wu et al., 2019). In addition to increasing gene copy number, gene duplication may also influence the rate of site convergence, although the relationship between gene duplication and convergent site substitution is still unclear. It is widely believed that convergence results from the minimization of pleiotropy and the maximization of phenotypic change (Stern, 2013). Therefore, gene duplications that produce functional redundancy and alleviate pleiotropic constraints would decrease convergent evolution rates (Zhen et al., 2012; Storz, 2016). This hypothesis is supported by the convergent evolution of resistance to toxic cardenolides through genetic changes in Na+, K+-ATPase among diverse insect taxa. In taxa that possess two copies of this gene, a greater number of unique substitutions appear to contribute to cardenolide resistance (Zhen et al., 2012). However, a higher level of site convergence in duplicated genes was observed during adaptation to cold temperatures in conifers. On average, signatures of convergence were 65% more common in cases where one ortholog was duplicated than in one-to-one orthologs (Yeaman et al., 2016). The promotion of adaptation by site convergence in duplicated genes has also been observed in other organisms (Chen et al., 1997; Christin et al., 2007). Convergent evolution of whole-genome features may affect the majority of protein-coding genes. 
Selective advantages of AA composition changes may be very small for individual genes but may accumulate to produce large overall effects. Convergent changes in genomic composition, such as AA and GC usage, have been widely explored in microorganisms like halophilic and thermophilic bacteria (Paul et al., 2008). Such convergence may occur in unicellular psychrophilic green algae, whose cells are directly influenced by the environment. Although organs of multicellular plants avoid direct contact with extreme environments, environmental selective pressure may still drive changes in whole-genome composition, such as those observed in mangrove genomes. In addition to the convergence of digestive physiology, convergence of whole-genome composition may also occur in carnivorous plants. For example, nutrient deficiency, which is the major selective pressure on carnivory, may also drive AA composition change that is constrained by nutrition and often selected to achieve cost minimization (Akashi and Gojobori, 2002). Compared with other convergence types, the convergence of whole-genome composition involves the majority of genes or genomic sequences. Therefore, the convergence of genomic composition requires sufficient time to evolve and is likely to be a relatively long-term adaptive strategy. This may have been the case for the three mangrove taxa discussed above, all of which originated from inland ancestors about 55 million years ago. If they had independently invaded similar environments more recently, perhaps between one million and five million years ago, they would not be expected to exhibit comparable convergent signals at the genomic level.
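The site-level criterion discussed at the start of this section can be illustrated with a deliberately naive screen: a column of a protein alignment is flagged when all focal (extremophile) taxa share one residue that no background taxon carries. This is a sketch under simplifying assumptions and is not the method of any study cited here. The function name, taxon labels, and sequences are hypothetical, the focal taxa are assumed to be phylogenetically independent, and there is no control for neutral convergence, which is the source of false positives noted above.

```python
def convergent_sites(alignment, focal, background):
    """Return 0-based columns where all focal taxa share a residue absent
    from every background taxon (gaps are never counted as shared)."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for i in range(lengths.pop()):
        focal_states = {alignment[taxon][i] for taxon in focal}
        background_states = {alignment[taxon][i] for taxon in background}
        if (
            len(focal_states) == 1
            and "-" not in focal_states
            and focal_states.isdisjoint(background_states)
        ):
            hits.append(i)
    return hits


# Invented toy alignment of one protein in four taxa.
toy_alignment = {
    "cam_1": "MKTLA",
    "cam_2": "MKSLA",
    "c3_1": "MRTLA",
    "c3_2": "MRALA",
}
sites = convergent_sites(toy_alignment, ["cam_1", "cam_2"], ["c3_1", "c3_2"])
print(sites)
```

Here only column 1 (K in both focal taxa, R in both background taxa) is flagged. Applied genome-wide, a screen of this kind would also return sites that match by chance, which is why comparison with a neutral expectation is needed before interpreting any hit as adaptive.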

Conclusions and Future Directions

Adaptive evolution is the central topic of evolutionary biology, and convergent evolution provides the most compelling evidence of adaptive evolution. With the development of sequencing technologies, genomic studies have become prevalent in the field of evolutionary biology, and the large amount of whole-genome data provides opportunities for genomic convergence studies. However, whether such studies can reveal the adaptive mechanisms of genome evolution and to what extent convergence occurs at the whole genome level are still unknown. In this review, we explored the extent of genomic convergence in plants that have independently adapted to the same extreme environments. Similar and well-defined environmental selection pressures should facilitate the discovery of convergent evolution in these situations. Although convergent evolution studies still focus primarily on model species and experimentation, the accessibility of genomic data has made it possible to test for genomic convergence in non-model species. Indeed, the molecular bases of adaptation in several extremophile plants have been revealed by studying convergent evolution. We summarized the genomic convergence of plants during adaptation to various extreme environmental conditions, such as the unstable intertidal zone, the extreme drought in deserts, aquatic environments, the extreme cold in arctic conditions, pests and pathogens, and cold combined with strong UV at high altitudes. We found that genomic convergence studies revealed extensive candidate molecular mechanisms of extreme environmental adaptation. Genomic convergence is prevalent at various genetic levels, including site convergence, convergence in gene copy number variation, gene expression, and whole-genome composition. Moreover, this mode of adaptation often co-occurs at several genetic levels. 
For example, mangrove genomes experienced convergent evolution of AA substitutions, TE and genome size reductions, and AA composition change simultaneously (Xu et al., 2017a; Lyu et al., 2018; He et al., 2020a). CAM plants experienced convergence in both AA substitutions and gene expression alterations (Yang et al., 2017). These studies strongly indicate that genomic convergence is prevalent in extremophile genomes and promises to be a new frontier in comparative genomic studies. However, there are still challenges in genomic convergence studies. For example, when candidate gene approaches are applied to whole genomes, the large number of neutrally convergent sites can easily overwhelm true convergence. Even when detection is possible, the results are not always reliable because of the lack of suitable controls for neutral convergence (He et al., 2020b). Lateral gene transfer between distantly related species and hybridization between closely related species may also mislead the assessment of convergent evolution (Dunning and Christin, 2020). Therefore, the development of innovative approaches that detect adaptive convergence using genomic data is needed. More empirical and theoretical studies of convergence at higher genetic levels are also needed. For example, analyses have detected convergence in AA substitution and composition, as well as TE reduction, in several mangrove species from three clades. How much of the adaptation among over 70 mangrove species can be explained by convergent evolution? Do changes in convergent gene copy number and expression pattern also play a role? Similar questions remain in other extremophiles, such as desert plants. In addition to the convergence at these genetic levels, convergence in other genetic changes, such as epigenetic modifications, expression regulation networks, 3D structure of proteins, and sequence insertions and deletions, may also play a role in adaptive convergence and could be explored in future studies. 
Compared with site convergence, convergence at higher genetic levels lacks a theoretical basis. Most importantly, the comprehensive analysis of multi-level genomic convergence is still quite limited. In the comprehensive analysis of multi-level genomic convergence, we could examine the order of occurrence of different convergence types during adaptation to a new environment and ask whether large or small effect convergence is more likely to occur. The answers to these questions will help to reveal not only the mechanisms of genomic convergence but also the roles of different genetic variations in adaptive evolution. To analyze multi-level genomic convergence, the collection of multidimensional data is crucial. In addition to the assembly of reference genomes, future studies should also design experiments to generate multidimensional data, such as gene expression and population genetic variations. In this review, we have focused on plant genome convergence in extreme environments; i.e., under strong selective pressures. Genomic convergence studies could also be applied to a wider range of environmental adaptation processes. Such studies may help to explore the extent of genomic convergence in situations where there is weak selective pressure and determine whether convergence is a common feature of adaptation to similar environments, even when they are not extreme. Genomic convergence studies not only provide an opportunity to explore the general principles of adaptive evolution but are also effective approaches for identifying functional genes related to environmental adaptation, discovering domestication genes, and improving agronomic traits. Therefore, the comprehensive study of genomic convergence in plants should be a new frontier in evolutionary biology.

Funding

This work was supported by the (31830005 and 31971540), the (2017FY100705), and the Guangdong Basic and Applied Basic Research Foundation (2019A1515010752).
References

Review 1.  Mechanisms of DNA double-strand break repair and their potential to induce chromosomal aberrations.

Authors:  P Pfeiffer; W Goedecke; G Obe
Journal:  Mutagenesis       Date:  2000-07       Impact factor: 3.000

Review 2.  Life at the extreme: lessons from the genome.

Authors:  Dong-Ha Oh; Maheshi Dassanayake; Hans J Bohnert; John M Cheeseman
Journal:  Genome Biol       Date:  2012       Impact factor: 13.583

3.  Evidence for an ancient adaptive episode of convergent molecular evolution.

Authors:  Todd A Castoe; A P Jason de Koning; Hyun-Min Kim; Wanjun Gu; Brice P Noonan; Gavin Naylor; Zhi J Jiang; Christopher L Parkinson; David D Pollock
Journal:  Proc Natl Acad Sci U S A       Date:  2009-04-28       Impact factor: 11.205

4.  Genome-Wide Convergence during Evolution of Mangroves from Woody Plants.

Authors:  Shaohua Xu; Ziwen He; Zixiao Guo; Zhang Zhang; Gerald J Wyckoff; Anthony Greenberg; Chung-I Wu; Suhua Shi
Journal:  Mol Biol Evol       Date:  2017-04-01       Impact factor: 16.240

5.  Desiccation tolerance in bryophytes: a reflection of the primitive strategy for plant survival in dehydrating habitats?

Authors:  Melvin J Oliver; Jeff Velten; Brent D Mishler
Journal:  Integr Comp Biol       Date:  2005-11       Impact factor: 3.326

6.  PLANT EVOLUTION. Convergent evolution of strigolactone perception enabled host detection in parasitic plants.

Authors:  Caitlin E Conn; Rohan Bythell-Douglas; Drexel Neumann; Satoko Yoshida; Bryan Whittington; James H Westwood; Ken Shirasu; Charles S Bond; Kelly A Dyer; David C Nelson
Journal:  Science       Date:  2015-07-31       Impact factor: 47.728

Review 7.  The genetics of convergent evolution: insights from plant photosynthesis.

Authors:  Karolina Heyduk; Jose J Moreno-Villena; Ian S Gilman; Pascal-Antoine Christin; Erika J Edwards
Journal:  Nat Rev Genet       Date:  2019-08       Impact factor: 53.242

8.  Analysis of 41 plant genomes supports a wave of successful genome duplications in association with the Cretaceous-Paleogene boundary.

Authors:  Kevin Vanneste; Guy Baele; Steven Maere; Yves Van de Peer
Journal:  Genome Res       Date:  2014-05-16       Impact factor: 9.043

9.  Genomic comparison of two independent seagrass lineages reveals habitat-driven convergent evolution.

Authors:  HueyTyng Lee; Agnieszka A Golicz; Philipp E Bayer; Anita A Severn-Ellis; Chon-Kit Kenneth Chan; Jacqueline Batley; Gary A Kendrick; David Edwards
Journal:  J Exp Bot       Date:  2018-06-27       Impact factor: 6.992

10.  Convergent molecular evolution among ash species resistant to the emerald ash borer.

Authors:  Laura J Kelly; William J Plumb; David W Carey; Mary E Mason; Endymion D Cooper; William Crowther; Alan T Whittemore; Stephen J Rossiter; Jennifer L Koch; Richard J A Buggs
Journal:  Nat Ecol Evol       Date:  2020-05-25       Impact factor: 15.460
