Rates of Protein Evolution across the Marsupial Phylogeny: Heterogeneity and Link to Life-History Traits.

Agusto R Luzuriaga-Neira1, David Alvarez-Ponce1.   

Abstract

Despite the importance of effective population size (Ne) in evolutionary and conservation biology, it remains unclear what factors have an impact on this quantity. The Nearly Neutral Theory of Molecular Evolution predicts a faster accumulation of deleterious mutations (and thus a higher dN/dS ratio) in populations with small Ne; thus, measuring dN/dS ratios in different groups/species can provide insight into their Ne. Here, we used an exome data set of 1,550 loci from 45 species of marsupials representing 18 of the 22 extant families, to estimate dN/dS ratios across the different branches and families of the marsupial phylogeny. We found a considerable heterogeneity in dN/dS ratios among families and species, which suggests significant differences in their Ne. Furthermore, our multivariate analyses of several life-history traits showed that dN/dS ratios (and thus Ne) are affected by body weight, body length, and weaning age.
© The Author(s) 2021. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

Keywords:  dN/dS; effective population size; extinction; life-history traits; marsupials

Year:  2022        PMID: 34894228      PMCID: PMC8759560          DOI: 10.1093/gbe/evab277

Source DB:  PubMed          Journal:  Genome Biol Evol        ISSN: 1759-6653            Impact factor:   3.416


Significance

We estimated the rates of protein evolution in each of the branches of the marsupial phylogeny. We found substantial heterogeneity (proteins have evolved faster in certain families and species compared with others), which provides clues about their different effective population sizes (proteins tend to evolve fast in species with small effective population sizes). Rates of protein evolution tend to be higher in lineages with large body mass and body length, and with a high weaning age.

Introduction

The effective population size (Ne) is a critical concept in evolutionary and conservation biology because it is directly linked to the effectiveness of natural selection and the amount of neutral variation that a population contains. Species with small Ne tend to be at high risk of extinction because they have low levels of genetic diversity, are susceptible to accumulating deleterious mutations due to genetic drift, and potentially present low rates of adaptive evolution (Charlesworth 2009). Comparative genomics studies in several groups such as rodents and primates (Wu and Li 1985; Weinreich 2001; Romiguier et al. 2014); mammals, birds, and reptiles (Figuet et al. 2016; Botero-Castro et al. 2017); and large versus small-bodied mammals (Popadin et al. 2007) have validated the prediction that species with small Ne tend to display a high probability of fixation of deleterious mutations (Ohta 1976; Ohta and Ina 1995). The significance of Ne as an evolutionary and conservation factor is highlighted by several findings showing that Ne values are often lower than the census numbers of breeding individuals in a species (Crow and Morton 1955; Frankham 1995). Understanding what factors affect Ne is essential from the conservation point of view. Whereas a number of factors are known to affect Ne (Charlesworth 2009), it is still unclear which life-history traits impact Ne, and to what extent. Because of the direct association between Ne and genome evolution, the patterns of genetic variation of a particular species reflect the past variations in its Ne (Nadachowska-Brzyska et al. 2015). Mutations affecting protein-coding sequences can be classified into synonymous mutations (which largely evolve neutrally) and nonsynonymous mutations (which are often deleterious). The Nearly Neutral Theory of Molecular Evolution predicts a higher nonsynonymous to synonymous divergence ratio (ω = dN/dS) in species with a small Ne (Kimura 1983; Ohta 1993).
A decrease in Ne increases the fraction of nonsynonymous mutations with selection coefficients below 1/Ne, which can be fixed by drift (Wright 1931; Kimura 1983). In agreement with this prediction, there is evidence for an accelerated accumulation of nonsynonymous mutations in mammals with low Ne. Ohta (1993) reported a higher dN/dS in primates compared with artiodactyls or rodents. This conclusion, based on 17 genes, was confirmed by the analysis of thousands of putative orthologs in primates and rodents (Wu and Li 1985; Weinreich 2001). Given the general scarcity of direct estimates of Ne in mammals, correlates are usually used as proxies. Body mass is the most frequently used proxy of Ne, based on the general inverse relationship between body mass and Ne (Damuth 1981; Peters and Peters 1986; Damuth 1987). Several studies have shown a positive correlation between body mass and dN/dS (Nikolaev et al. 2007; Popadin et al. 2007; Romiguier et al. 2012, 2014). Besides body mass and dN/dS, a number of life-history traits such as lifespan and generation time have also been used as proxies of Ne, given their positive correlation with dN/dS (Nikolaev et al. 2007; Nabholz et al. 2008; Welch et al. 2008; Lartillot and Poujol 2011). In addition, James and Eyre-Walker (2020) showed that there is a positive correlation between a species’ distribution area and its genetic diversity, using mitochondrial DNA data from 639 species of mammals. Because life-history traits and distribution area are related to Ne, they are directly associated with species vulnerability, thus representing intrinsic extinction factors of the species (Purvis et al. 2000; Cardillo et al. 2005; Collen et al. 2011; Rolland et al. 2020). The link between life-history traits and Ne remains relatively unexplored. The few studies available are often limited by the number of species and loci studied. 
Moreover, these studies are largely limited to analyses of a few eutherian mammals or species from a wide variety of orders (Nikolaev et al. 2007; Popadin et al. 2007; Romiguier et al. 2012, 2014). Marsupials are an attractive group to test this link because they are a relatively young order (∼160 Myr old; Luo et al. 2011) and are thus relatively homogeneous in their biology, yet they exhibit substantial diversity in their life-history traits. In addition, a large exome data set encompassing 45 species of Australasian and American marsupials (representing 18 of the 22 extant families) has recently been generated (Duchêne et al. 2018). The aim of the current study is two-fold. First, we computed the average rates of protein evolution (dN/dS ratios) in each branch of the marsupial phylogeny. This analysis revealed a high amount of heterogeneity in the rates of protein evolution both across families and within each family, which points to differences in Ne. Second, we used multivariate analyses to establish the relationship between the dN/dS ratio of each species and several life-history traits. We found that body length, body mass, and weaning age independently impact dN/dS, indicating that they are determinants of Ne.

Results

Differences in Rates of Protein Evolution among Marsupial Families and Species

We generated a concatenated alignment that included sequences for 1,472 exons. The concatenome had a total length of 808,941 base pairs for 43 marsupial species. Our analysis included a complete deletion step; thus, from 269,647 possible codons, only 122,308 were retained for analysis. We did not use the remaining codons as they contained missing data and/or ambiguities. We first used the M0 model, which assumes a homogeneous dN/dS value for all branches in the phylogenetic tree, to estimate an overall dN/dS of 0.1542. The log-likelihood of the alignment under this model was l = −2,040,465.59. We next applied the 19ω model to estimate a separate dN/dS for each of the 18 marsupial families included in our analysis, obtaining a log-likelihood value of l = −2,039,789.51. Comparison of the log-likelihoods of both models using a likelihood ratio test indicated that the 19ω model fitted the data significantly better than the M0 model [2Δl = 2 × (l_19ω − l_M0) = 1,352.16, d.f. = 18, P < 10^−10]; this implies significant heterogeneity in the rates of protein evolution among the different families. The dN/dS values ranged from 0.1217 (family Dasyuridae) to 0.2397 (family Vombatidae). The dN/dS values are detailed in figure 1 and supplementary table S2, Supplementary Material online.
Fig. 1.

Estimated dN/dS values for the 18 marsupial families included in our study. Numbers in blue represent the dN/dS for each family estimated using the 19ω model. The blue horizontal line represents time in millions of years.

We then used the FR model, which assumes an independent dN/dS for each branch in the phylogenetic tree, obtaining a log-likelihood value of l = −2,039,488.51. The FR model fitted the data significantly better than the M0 model [2Δl = 2 × (l_FR − l_M0) = 1,954.16, d.f. = 82, P < 10^−10] and the 19ω model [2Δl = 2 × (l_FR − l_19ω) = 602, d.f. = 64, P < 10^−10], implying significant heterogeneity in the rates of protein evolution across the different branches of the phylogeny (even among branches in the same family). Across the studied species, the lowest dN/dS is 0.106 (Phascogale tapoatafa), while the highest dN/dS is 0.302 (Pseudocheirus occidentalis) (fig. 2, supplementary figs. S1 and S2, Supplementary Material online).
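The three likelihood ratio tests reported above can be re-derived from the stated log-likelihoods. The following is a minimal sketch, not the authors' code: the function and dictionary names are illustrative, the numbers of ω parameters (1, 19, and 83) are taken from the model descriptions in the text, and the χ² tail probability uses the closed-form expression that holds only for even degrees of freedom (which happens to cover 18, 82, and 64). The tail is computed on a log10 scale because some of these probabilities are too small to represent as ordinary floating-point numbers.

```python
import math

def lrt_statistic(loglik_null, loglik_alt):
    """Likelihood ratio statistic 2Δl for a pair of nested models."""
    return 2.0 * (loglik_alt - loglik_null)

def log10_chi2_sf_even_df(x, df):
    """log10 of the chi-square upper-tail probability (even d.f. only).

    For d.f. = 2m the tail is exp(-x/2) * sum_{i<m} (x/2)^i / i!,
    evaluated here in log space to avoid underflow.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even d.f.")
    if x <= 0:
        return 0.0  # P = 1
    half = x / 2.0
    log_terms = [i * math.log(half) - math.lgamma(i + 1) for i in range(df // 2)]
    peak = max(log_terms)
    log_sum = peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    return (log_sum - half) / math.log(10)

# Log-likelihoods as reported in the text ("19w" stands for the 19ω model).
LOGLIK = {"M0": -2040465.59, "19w": -2039789.51, "FR": -2039488.51}
# Number of ω parameters per model; all other parameters are assumed shared.
N_OMEGA = {"M0": 1, "19w": 19, "FR": 83}

def compare(null, alt):
    """Return (2Δl, d.f., log10 P) for a nested pair of models."""
    stat = lrt_statistic(LOGLIK[null], LOGLIK[alt])
    df = N_OMEGA[alt] - N_OMEGA[null]
    return stat, df, log10_chi2_sf_even_df(stat, df)

for null, alt in [("M0", "19w"), ("M0", "FR"), ("19w", "FR")]:
    stat, df, log10_p = compare(null, alt)
    print(f"{null} vs {alt}: 2Δl = {stat:.2f}, d.f. = {df}, log10(P) = {log10_p:.1f}")
```

For odd degrees of freedom a general incomplete-gamma routine would be needed instead of this closed form.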
Fig. 2.

Estimated dN/dS at each branch of the marsupial phylogenetic tree. The numbers in blue on each branch represent the estimated dN/dS (ω) values. The blue horizontal line represents time in millions of years.

Next, we applied the FR model to each of the 1,472 exon alignments separately. Thus, we obtained a total of 778 (number of exons after removing those with missing data) × 83 (branches) = 64,574 dN/dS values. For each pair of branches (a total of 3,403 comparisons), we compared their dN/dS values using a paired Wilcoxon test. We obtained significant differences (P < 0.05) in 81% of the comparisons (fig. 3 and supplementary fig. S3, Supplementary Material online). The test detected significant differences for most pairs of external branches, except for those involving species from the same genus, such as Bettongia penicillata and Bettongia lesueur. However, we noted that nonsignificant differences (P > 0.05) also occurred between distantly related species, such as Didelphis virginiana and Monodelphis domestica (fig. 3).
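The counts quoted in this section are mutually consistent, as the short check below illustrates. It is a sketch under one assumption that the text does not state explicitly: that the 83 branches correspond to an unrooted, fully bifurcating tree of the 43 retained species (2n − 3 branches for n tips). The function name is illustrative.

```python
from math import comb

def unrooted_branch_count(n_tips):
    """Number of branches in an unrooted, fully bifurcating tree."""
    return 2 * n_tips - 3

n_species = 43                                   # species retained in the analysis
n_exons = 778                                    # exons without missing data
n_branches = unrooted_branch_count(n_species)    # assumed tree shape, see lead-in

n_estimates = n_exons * n_branches               # per-exon, per-branch dN/dS values
n_branch_pairs = comb(n_branches, 2)             # pairwise branch comparisons

# Degrees of freedom of the likelihood ratio tests on the concatenated alignment
df_fr_vs_m0 = n_branches - 1     # one ω per branch versus a single ω
df_fr_vs_19w = n_branches - 19   # one ω per branch versus 19 ω classes

print(n_branches, n_estimates, n_branch_pairs, df_fr_vs_m0, df_fr_vs_19w)
```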
Fig. 3.

Heatmap representing the paired Wilcoxon tests using the dN/dS values for 778 exon sequences from 43 marsupial species. Each cell represents a P-value calculated under the alternative hypothesis of difference between each pair of species. The dark color for most of the cells represents P-values lower than 0.05, indicating significant differences between species. The light color represents P-values greater than 0.05, showing no significant differences between species. An expanded version of this figure (including internal branches) is presented in supplementary figure S3, Supplementary Material online.

Life-History Traits and Their Association with dN/dS in Marsupials

For each of the 43 species included in our analysis, we gleaned information on distribution area and seven life-history trait variables (body mass, body length, sexual maturity age, weaning age, gestation period, litter size, and the number of litters per year; supplementary table S1, Supplementary Material online). Then we evaluated their correlations with dN/dS using Spearman's rank correlation coefficients. We detected a significant correlation (P < 0.05) for body mass, body length, weaning age, and litter size (supplementary fig. S4, Supplementary Material online). Except for litter size, all the correlations were positive. Then we used a phylogenetic generalized least squares regression (pgls) to avoid the effects of the shared phylogenetic history among the studied species. We found a significant correlation between dN/dS and three variables: body mass (R² = 0.37, P = 0.008), body length (R² = 0.37, P = 0.009), and weaning age (R² = 0.35, P = 0.02) (fig. 4, supplementary table S3, Supplementary Material online). Thus, the association between dN/dS and litter size was not significant once we controlled for phylogenetic inertia.
Fig. 4.

Phylogenetic generalized least squares regressions between dN/dS and distribution area and seven life-history traits in marsupials. Each panel displays the regression lines, R² coefficient, and P-values. Shaded gray areas represent 95% confidence intervals. *P < 0.05, **P < 0.01, ***P < 0.001.

Discussion

Our analyses show that rates of protein evolution, as measured from the dN/dS ratios, vary widely among the different marsupial families (fig. 1). Families with the lowest dN/dS ratios are expected to be the ones with the largest Ne. Our results revealed that the Vombatidae family is the one with the highest dN/dS (0.2397), meaning that its species have accumulated more nonsynonymous divergence per unit of synonymous divergence compared with the other 17 families analyzed. Vombatus ursinus and Lasiorhinus latifrons (both represented in the tree) are species with a large body mass and length (26,650 g and 876.5 mm on average). Furthermore, both species present a very low reproduction rate (0.75 litters per year), with weaning periods ranging from 225 to 545 days. In contrast, the Dasyuridae family, which has the lowest dN/dS (0.1217), is represented by species with a much smaller body mass (ranging from 7 to 8,000 g) and length (63–763 mm), and a much higher reproductive rate (3–12 litters per year and weaning periods between 22 and 95 days). The contrast between the families with the highest and lowest dN/dS ratios in the phylogenetic tree suggests that the above-mentioned life-history traits can be used as proxies of Ne in marsupials, similar to previous studies in other species. Our analyses based on the free-ratios model also revealed significant differences in the rates of protein evolution of most of the species studied. Species from the genus Pseudocheirus exhibited the highest dN/dS values, whereas species from the genus Phascogale presented the lowest dN/dS. In our study, the critically endangered species Pseudocheirus occidentalis has the highest dN/dS ratio (0.3021). Currently, P. occidentalis is confined to a small area in southwestern Australia (Bader et al. 2019). Our estimations suggest a potential distribution area of 18,010 km². This species is characterized by a low reproduction rate (1 litter per year) and a considerably long weaning time (210 days).
In contrast, the species with the lowest dN/dS (0.1059), Phascogale tapoatafa, classified as near threatened according to the IUCN criteria, has a much larger potential distribution area (1,068,122 km²), a higher reproduction rate (6 litters per year), and a lower weaning age (122 days). Body mass and length are the most commonly used proxies of Ne. Comparing the species with the lowest and highest dN/dS values, Phascogale tapoatafa has a considerably lower body mass and length (211 g and 195 mm) than Pseudocheirus occidentalis (700 g and 360 mm), supporting the inverse relationship between these two variables and Ne. Even though our correlation and pgls regression plots provide insights into the relationship between dN/dS and several life-history traits frequently used as proxies of Ne, correlation tests are significant for only four factors: body mass, body length, litter size, and weaning age, with low-dN/dS species tending to exhibit low body mass, low body length, high litter size, and low weaning age (supplementary fig. S4, Supplementary Material online). These results differ from previous findings in placental mammals, where sexual maturity age and generation time were more strongly correlated with dN/dS, indicating that they are better predictors of Ne than body mass (Nikolaev et al. 2007; Lartillot 2013; Romiguier et al. 2013). In addition, our pgls regression analyses indicate that only body mass, body length, and weaning age have an impact on dN/dS once phylogenetic effects are accounted for (fig. 4). Fisher et al. (2003) studied the influence of four life-history traits (body size, reproductive rate, habitat specialization, and diet) and former distribution area on marsupial susceptibility to decline and extinction.
They found that even though extrinsic factors such as geographical range overlap with domestic animals played a major role, intrinsic factors such as body size and other life-history traits could also contribute to the extinction of the Australian marsupials. In this context, our results also imply that besides well-known proxies of Ne such as body mass and size, other variables like weaning age can be used as proxies of Ne in marsupials. These variables are commonly considered as intrinsic factors in the decline and extinction in marsupials. In summary, using exome data from 43 marsupial species, we report considerable variation in rates of protein evolution among families and species, indicative of important differences in Ne. We found that this variation is associated with a number of life-history traits. Some of these traits have been described before as intrinsic extinction factors in marsupials and other vertebrates.

Materials and Methods

Molecular Data and Phylogeny

We performed our analyses using a data set of 1,550 aligned exons and the phylogenetic tree obtained by Duchêne et al. (2018). The length of the alignments ranged from 141 to 3,660 bp (47–1,220 codons). Each alignment contained sequences for 45 species of marsupials representing 18 of the 22 extant families. To generate each alignment, the authors identified the orthologs of their targeted exons using as reference the genome of Monodelphis domestica (Mikkelsen et al. 2007), using the condition of a single BlastN hit with a bit score >380. The phylogenetic tree was estimated with RAxML v8.1.1 (Stamatakis 2014), using a concatenated alignment of 867,000 bp, including all the sequenced loci. For the phylogenetic tree calculation, they used 12 fossil-based age constraints on internal nodes in the tree for calibration (see further details in Duchêne et al. 2018). All the alignments and the phylogenetic tree generated by Duchêne et al. (2018) are available online (github.com/duchene/marsupial_family_phylogenomics; last accessed January 21, 2021).

Life-History Traits and Distribution Area Data

For each of the marsupial species included in our analysis, we collected data on distribution area and seven life-history traits from the PanTHERIA database (Jones et al. 2009) and the Animal Diversity Web (available at http://animaldiversity.ummz.umich.edu; last accessed May 20, 2021). The life-history traits used in this study include body mass, body length, litter size, litters per year, sexual maturity age, weaning age, and gestation time. We estimated the distribution area for each species with the QGIS program (QGIS Development Team 2016), using the data retrieved from The IUCN Red List of Threatened Species (IUCN 2020). The retrieved life-history traits information and estimated distribution areas are shown in supplementary table S1, Supplementary Material online.

Estimation of Rates of Protein Evolution

Out of the 45 species in the original data set, we discarded two species (Potorous tridactylus and Pseudochirops corinnae) because they exhibited more than 10% missing data. We also removed 78 loci because their sequence files contained more than 30% missing or ambiguous data. We then created a concatenated alignment comprising 1,472 exons (808,941 bp; 269,647 codons) for 43 marsupial species. We removed the branches for the two discarded species from the phylogenetic tree using the R-project package ape v5.4 (Paradis et al. 2004). Using the concatenated alignment, we computed rates of evolution using model 0 (M0), the free-ratios model (FR), and a 19-ratios model (which we called 19ω), as implemented in the CODEML module of the PAML package, version 4.8d (Yang 1997). Each model computes a number of nonsynonymous to synonymous divergence ratios (ω = dN/dS). The M0 model assumes a constant dN/dS for all branches in the tree, while the FR model assumes an independent dN/dS for each branch. The other model, which is intermediate between M0 and FR, estimates a separate dN/dS for any user-defined set of branches. In our case, we used the model to calculate a separate dN/dS for each of the 18 marsupial families, plus an additional ω for internal branches not belonging to any of these families. Significant differences between pairs of nested models (M0 vs FR, M0 vs 19ω, and 19ω vs FR) imply heterogeneity in the dN/dS values among the different species and families represented in the tree. To compare the fit of the models, we compared the results of the likelihood ratio test with a χ² distribution (using as degrees of freedom the difference between the number of parameters of each pair of nested models). In addition to the analysis of the concatenated alignment, we used the FR model to estimate the dN/dS for all branches for each exon separately.
We tested for significant differences for each pair of branches using a paired Wilcoxon test (as implemented in the R-project v3.5.2 software; R Core Team 2018); for each pair of branches, the dN/dS values of all pairs of orthologous exons were compared.
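To illustrate the kind of comparison described above, the following sketch implements a two-sided paired Wilcoxon signed-rank test with a normal approximation and continuity correction. It is not the R implementation used in the study and may differ from it in details (for example, exact P-values for small samples); the function name and the per-exon dN/dS values are hypothetical and serve only as an example.

```python
import math

def paired_wilcoxon(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (W_plus, p_value). Zero differences are dropped and tied
    absolute differences receive average ranks.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        avg_rank = (start + end) / 2.0 + 1.0   # average rank of the tied group
        for k in range(start, end + 1):
            ranks[order[k]] = avg_rank
        size = end - start + 1
        tie_term += size ** 3 - size           # variance correction for ties
        start = end + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4.0
    var = n * (n + 1) * (2 * n + 1) / 24.0 - tie_term / 48.0
    shift = w_plus - mean
    # Continuity correction of 0.5 toward the mean
    z = (shift - math.copysign(0.5, shift)) / math.sqrt(var) if shift != 0 else 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return w_plus, p_value

# Hypothetical per-exon dN/dS values for two branches (same exons, same order).
branch_a = [0.12, 0.30, 0.08, 0.25, 0.19, 0.41, 0.15, 0.22, 0.33, 0.10]
branch_b = [0.10, 0.21, 0.09, 0.18, 0.11, 0.30, 0.16, 0.12, 0.20, 0.06]
w_plus, p_value = paired_wilcoxon(branch_a, branch_b)
print(f"W+ = {w_plus}, P = {p_value:.4f}")
```

Pairing by exon means that each difference compares the same locus on two branches, so gene-specific constraint cancels out and only the branch effect remains.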

The Relationship between Rates of Protein Evolution and Morpho-Ecological Variables

We evaluated the correlation between each of the morpho-ecological factors (life-history traits and distribution area) and the dN/dS of each species using Spearman's rank correlation. However, simple correlations do not take into account the effect of phylogenetic inertia. Consequently, we complemented our analyses by performing a phylogenetic generalized least squares regression (pgls). To visualize the pgls regression and correct for nonindependence caused by the phylogeny, we performed a transformation of the dN/dS and life-history variables, assuming a normal distribution and equal variance (as in Rolland et al. 2020). For the pgls regression analysis, we used the method and the R-project code implemented by Rolland et al. (2020).
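As an illustration of the first step only, the sketch below computes Spearman's rank correlation as the Pearson correlation of average ranks. The function names and the trait values are hypothetical (they are not taken from supplementary table S1), and the sketch applies no phylogenetic correction, which is precisely the limitation that the pgls analysis is meant to address.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(values):
        end = start
        while end + 1 < len(values) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2.0 + 1.0
        start = end + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical body masses (g) and dN/dS values for five species.
body_mass_g = [20, 150, 700, 4000, 26000]
dnds = [0.11, 0.14, 0.13, 0.19, 0.24]
print(f"rho = {spearman_rho(body_mass_g, dnds):.2f}")
```

Because only ranks enter the calculation, the result is unchanged by any monotonic transformation of the traits, such as taking the logarithm of body mass.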

Supplementary Material

Supplementary data are available at Genome Biology and Evolution online.
  32 in total

1.  Predicting extinction risk in declining species.

Authors:  A Purvis; J L Gittleman; G Cowlishaw; G M Mace
Journal:  Proc Biol Sci       Date:  2000-10-07       Impact factor: 5.349

2.  Multiple causes of high extinction risk in large mammal species.

Authors:  Marcel Cardillo; Georgina M Mace; Kate E Jones; Jon Bielby; Olaf R P Bininda-Emonds; Wes Sechrest; C David L Orme; Andy Purvis
Journal:  Science       Date:  2005-07-21       Impact factor: 47.728

3.  Evolution in Mendelian Populations.

Authors:  S Wright
Journal:  Genetics       Date:  1931-03       Impact factor: 4.562

4.  Interaction between selection and biased gene conversion in mammalian protein-coding sequence evolution revealed by a phylogenetic covariance analysis.

Authors:  Nicolas Lartillot
Journal:  Mol Biol Evol       Date:  2012-09-29       Impact factor: 16.240

5.  Comparative population genomics in animals uncovers the determinants of genetic diversity.

Authors:  J Romiguier; P Gayral; M Ballenghien; A Bernard; V Cahais; A Chenuil; Y Chiari; R Dernat; L Duret; N Faivre; E Loire; J M Lourenco; B Nabholz; C Roux; G Tsagkogeorga; A A-T Weber; L A Weinert; K Belkhir; N Bierne; S Glémin; N Galtier
Journal:  Nature       Date:  2014-08-20       Impact factor: 49.962

6.  Avian Genomes Revisited: Hidden Genes Uncovered and the Rates versus Traits Paradox in Birds.

Authors:  Fidel Botero-Castro; Emeric Figuet; Marie-Ka Tilak; Benoit Nabholz; Nicolas Galtier
Journal:  Mol Biol Evol       Date:  2017-12-01       Impact factor: 16.240

7.  An examination of the generation-time effect on molecular evolution.

Authors:  T Ohta
Journal:  Proc Natl Acad Sci U S A       Date:  1993-11-15       Impact factor: 11.205

8.  Rare variant alleles in the light of the neutral theory.

Authors:  M Kimura
Journal:  Mol Biol Evol       Date:  1983-12       Impact factor: 16.240

9.  PAML: a program package for phylogenetic analysis by maximum likelihood.

Authors:  Z Yang
Journal:  Comput Appl Biosci       Date:  1997-10

10.  Genome of the marsupial Monodelphis domestica reveals innovation in non-coding sequences.

Authors:  Tarjei S Mikkelsen; Matthew J Wakefield; Bronwen Aken; Chris T Amemiya; Jean L Chang; Shannon Duke; Manuel Garber; Andrew J Gentles; Leo Goodstadt; Andreas Heger; Jerzy Jurka; Michael Kamal; Evan Mauceli; Stephen M J Searle; Ted Sharpe; Michelle L Baker; Mark A Batzer; Panayiotis V Benos; Katherine Belov; Michele Clamp; April Cook; James Cuff; Radhika Das; Lance Davidow; Janine E Deakin; Melissa J Fazzari; Jacob L Glass; Manfred Grabherr; John M Greally; Wanjun Gu; Timothy A Hore; Gavin A Huttley; Michael Kleber; Randy L Jirtle; Edda Koina; Jeannie T Lee; Shaun Mahony; Marco A Marra; Robert D Miller; Robert D Nicholls; Mayumi Oda; Anthony T Papenfuss; Zuly E Parra; David D Pollock; David A Ray; Jacqueline E Schein; Terence P Speed; Katherine Thompson; John L VandeBerg; Claire M Wade; Jerilyn A Walker; Paul D Waters; Caleb Webber; Jennifer R Weidman; Xiaohui Xie; Michael C Zody; Jennifer A Marshall Graves; Chris P Ponting; Matthew Breen; Paul B Samollow; Eric S Lander; Kerstin Lindblad-Toh
Journal:  Nature       Date:  2007-05-10       Impact factor: 49.962
