Pierre Luisi, David Alvarez-Ponce, Marc Pybus, Mario A Fares, Jaume Bertranpetit, Hafid Laayouni.
Abstract
Genes vary in their likelihood of undergoing adaptive evolution. The genomic factors that determine adaptability, however, remain poorly understood. Genes function in the context of molecular networks, with some occupying more important positions than others and thus being likely to be under stronger selective pressures. However, how positive selection is distributed across the different parts of molecular networks is still not fully understood. Here, we inferred positive selection using comparative genomics and population genetics approaches through the comparison of 10 mammalian and 270 human genomes, respectively. In agreement with previous results, we found that genes with lower network centralities are more likely to evolve under positive selection (as inferred from divergence data). Surprisingly, polymorphism data yield results in the opposite direction to divergence data: genes with higher centralities are more likely to have been targeted by positive selection during recent human evolution. Our results indicate that the relationship between centrality and the impact of adaptive evolution depends strongly on the mode of positive selection and/or the evolutionary time-scale.
Keywords: humans; mammals; natural selection; physical protein interaction; positive selection; protein interaction network
Year: 2015 PMID: 25840415 PMCID: PMC4419801 DOI: 10.1093/gbe/evv055
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Figure: Distribution of genes with putative signatures of positive selection within the protein interaction network (PIN). ZF and 2Δℓ were used to estimate the likelihood of having evolved under positive selection in human populations and in mammals, respectively. (A) Average degree (number of interactions) for genes with and without signatures of positive selection. We represent the mean of the centrality measure ± 1 SE for the genes with a putative signal of positive selection (in red) and the other genes (in blue). The significance of the difference between the means of the two groups was assessed through 10,000 permutations. Asterisks represent significant differences. *P < 0.05; **P < 0.01. (B) Human PIN with genes with signatures of positive selection according to divergence data (P < 0.05 estimated from 2Δℓ) represented in red. (C) Human PIN with genes with signatures of positive selection according to polymorphism data represented in red.
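The permutation assessment described for panel A can be sketched roughly as follows. This is a minimal illustration rather than the authors' code: the function name `permutation_p_value`, the two-sided absolute-difference statistic, the add-one correction, and the toy degree values are all assumptions made for the example.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_perm=10_000, seed=0):
    """Two-sided permutation P value for the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        # Reassign group labels at random while keeping group sizes fixed.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction (an assumption) keeps the estimate above zero.
    return (hits + 1) / (n_perm + 1)

# Toy degrees (made-up numbers, not data from the study).
selected = [12, 15, 9, 20, 14, 18]
others = [5, 7, 6, 8, 4, 9, 6, 7]
p = permutation_p_value(selected, others, n_perm=2000)
```

With 10,000 permutations the smallest P value this estimator can report is about 10⁻⁴, which is sufficient for the *P < 0.05 and **P < 0.01 thresholds used in the figure.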
Relationship between Degree and the Impact of Natural Selection
| | Positive Selection | | | | Purifying Selection | | |
|---|---|---|---|---|---|---|---|
| Test | YRI (ZF) | CEU (ZF) | CHB (ZF) | Mammals (2Δℓ) | Recent Humans (DAF) | Humans (NI) | Mammals (ω) |
| Spearman correlation | 0.0501 | 0.0409 | 0.0471 | −0.0841 | −0.0879 | 0.0770 | −0.2039 |
| P value | 1.11 × 10⁻⁵*** | 0.0004*** | 3.48 × 10⁻⁵*** | 9.29 × 10⁻¹¹*** | 4.51 × 10⁻¹⁶*** | 7.29 × 10⁻⁶*** | 6.91 × 10⁻⁵⁶*** |
| Partial Spearman correlation | 0.0451 | 0.0326 | 0.0374 | −0.0340 | −0.0668 | 0.0314 | −0.1698 |
| P value | 0.0001*** | 0.0059** | 0.0015** | 0.0107 | 2.35 × 10⁻⁹*** | 0.0742 | 2.79 × 10⁻³⁷*** |
| Nonparametric ANOVA | 5.324 | 5.844 | 5.074 | 11.90 | 18.03 | 9.084 | 77.85 |
| P value | 0.0012** | 0.0006*** | 0.0017** | 9.18 × 10⁻⁸*** | 1.16 × 10⁻¹¹*** | 5.49 × 10⁻⁶*** | 2.26 × 10⁻⁴⁹*** |
| Trend test on ranks | 15.88 | 12.14 | 14.12 | 33.51 | 52.60 | 20.66 | 229.4 |
| P value | 6.79 × 10⁻⁵*** | 0.0005** | 0.0002*** | 7.45 × 10⁻⁹*** | 4.43 × 10⁻¹³*** | 5.67 × 10⁻⁶*** | 7.30 × 10⁻⁵¹*** |
| Partial nonparametric ANOVA | 2.731 | 3.149 | 2.080 | 2.537 | 6.353 | 2.343 | 51.93 |
| P value | 0.0423 | 0.0240 | 0.1006 | 0.0548 | 0.0003*** | 0.0713 | 4.27 × 10⁻³³*** |
| Partial trend test on ranks | 7.794 | 2.360 | 5.107 | 6.281 | 16.48 | 2.964 | 153.6 |
| P value | 0.0053** | 0.1246 | 0.0239 | 0.0122 | 4.97 × 10⁻⁵*** | 0.0852 | 8.05 × 10⁻³⁵*** |
(a) Spearman correlation between degree and selection scores (ZF for positive selection in YRI, CEU, and CHB populations; 2Δℓ for positive selection in mammals; DAF for purifying selection during recent human evolution; NI for purifying selection in the human lineage; and ω for purifying selection in mammals). High ZF and 2Δℓ scores indicate a higher probability of having evolved under positive selection as inferred from polymorphism and divergence data, respectively. Low DAF and ω scores indicate higher evolutionary constraint estimated from polymorphism and divergence data, respectively, whereas high NI scores indicate higher evolutionary constraint estimated from both polymorphism and divergence data.
(b) To test for an association between degree and natural selection scores while controlling for potentially confounding factors, we fitted a linear regression of the selection scores on protein length, expression level, and expression breadth. The regression residuals were then used to perform the Spearman’s correlation analysis, the nonparametric ANOVA, and the linear trend test on ranks.
(c) Nonparametric ANOVA and linear trend tests on ranks were performed to test whether the scores used as proxies of natural selection are equal across the degree groups.
*P < 0.05; **P < 0.01; ***P < 0.001.
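The residual-based "partial" correlation described in the table notes can be sketched as follows. This is only an illustration under stated assumptions: the helper names (`ranks`, `pearson`, `spearman`, `ols_residuals`, `partial_spearman`) and the toy values are hypothetical, only the selection score is residualized (as the note describes), and the least-squares fit is solved with plain Gauss-Jordan elimination rather than whatever software the authors used.

```python
from statistics import mean

def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def ols_residuals(y, covariates):
    """Residuals of y after least-squares regression on covariates + intercept."""
    rows = [[1.0, *cols] for cols in zip(*covariates)]
    p = len(rows[0])
    # Augmented normal equations: (X^T X | X^T y).
    aug = []
    for i in range(p):
        xtx = [sum(r[i] * r[j] for r in rows) for j in range(p)]
        xty = sum(r[i] * yi for r, yi in zip(rows, y))
        aug.append(xtx + [xty])
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    beta = [aug[i][p] / aug[i][i] for i in range(p)]
    return [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]

def partial_spearman(degree, score, covariates):
    """Spearman correlation between degree and covariate-adjusted scores."""
    return spearman(degree, ols_residuals(score, covariates))

# Toy values (made up for illustration, not data from the study).
degree = [1, 2, 3, 4, 5, 6, 7, 8]
length = [300, 120, 450, 200, 800, 150, 600, 350]
expression = [5.0, 2.0, 7.5, 3.0, 9.0, 1.0, 6.0, 4.0]
score = [6.1, 5.8, 6.4, 6.0, 6.9, 6.2, 7.1, 6.8]
rho = partial_spearman(degree, score, [length, expression])
```

The same residuals could be passed to the ANOVA and trend tests in place of the raw scores, which is how the note describes the "partial" rows of the table.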
Figure: Impact of natural selection among groups of genes divided according to degree quartiles. Genes were divided into four groups according to the degree quartiles. The median selection score ± 1 median absolute deviation for each group is represented on the y axis. ZF and 2Δℓ scores were used to estimate the likelihood of positive selection in human populations and in mammals, respectively. DAF, NI, and ω were used to estimate the impact of purifying selection in recent human populations, in the human lineage, and in mammals, respectively. Lower DAF and ω indicate higher evolutionary constraint estimated from polymorphism and divergence data, respectively, whereas higher NI scores indicate higher evolutionary constraint estimated from both polymorphism and divergence data. A nonparametric ANOVA was performed to test whether the medians of the scores are equal across the groups. A trend test on ranks was also carried out to test for a linear relationship between the four groups (encoded from 1 to 4) and natural selection scores. A Tukey’s honestly significant difference test was further applied to test for all pairwise differences. Significantly different pairs are marked with asterisks according to the level of significance. *P < 0.05; **P < 0.01; ***P < 0.001.
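The quartile grouping and the median ± median absolute deviation (MAD) summary in this caption could be computed along the following lines. This is a sketch under assumptions: the function names and toy values are hypothetical, the cut points use the default "exclusive" method of Python's `statistics.quantiles`, and genes whose degree equals a cut point go to the lower group. The source does not say how ties or cut points were handled.

```python
from statistics import median, quantiles

def median_and_mad(values):
    """Median and median absolute deviation of a list of scores."""
    med = median(values)
    return med, median(abs(v - med) for v in values)

def summarize_by_degree_quartile(degrees, scores):
    """Group scores by degree quartile (1 to 4) and summarize each group."""
    q1, q2, q3 = quantiles(degrees, n=4)
    groups = {1: [], 2: [], 3: [], 4: []}
    for d, s in zip(degrees, scores):
        # Each cut point exceeded moves the gene one group higher.
        group = 1 + (d > q1) + (d > q2) + (d > q3)
        groups[group].append(s)
    return {g: median_and_mad(v) for g, v in groups.items() if v}

# Toy values (made up for illustration, not data from the study).
toy_degrees = [1, 2, 3, 4, 5, 6, 7, 8]
toy_scores = [10, 12, 20, 24, 30, 38, 40, 50]
summary = summarize_by_degree_quartile(toy_degrees, toy_scores)
```

Because degree is a discrete and typically heavy-tailed quantity, many genes can share the same value, so the four groups need not be of equal size in practice.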
Figure: DAF for three site classes near genes with signals of recent positive selection. Crosses represent the median of the maximum DAF observed for three site classes near polyPSGs: cis-eQTLs, 0-fold degenerate sites, and 4-fold degenerate sites. The violin plots represent the distribution of the median of maximum DAF scores observed for a given site class in 10,000 random sets of PIN genes. The analysis was restricted to PIN genes for which the DAF could be calculated for at least one SNP for each of the three site classes. (A) Using eQTLs reported by the GEUVADIS consortium (Lappalainen et al. 2013) and located within 100 kb of the associated gene. The polyPSGs set contained 29 genes and the permutations were performed on a set of 358 PIN genes. (B) Using eQTLs reported by Liang et al. (2013) and located within 100 kb of the associated gene. The polyPSGs set contained 14 genes and the permutations were performed on a set of 198 PIN genes. A significantly higher median DAF in a site class for polyPSGs, as compared with the 10,000 permutations, is marked with asterisks. **P < 0.01.
Association between Gene Essentiality and Degree and the Impact of Natural Selection
| | Lethal versus Viable Genes (a) | | | Indispensability Score (b) | |
|---|---|---|---|---|---|
| Score | Mean Lethal | Mean Viable | P value | Spearman correlation | P value |
| Degree | 14.55 | 7.048 | 6.62 × 10⁻⁵²*** | 0.2311 | 3.03 × 10⁻¹⁰⁷*** |
| Positive selection in YRI (c) | 6.419 | 6.154 | 0.0047** | 0.0473 | 4.34 × 10⁻⁵*** |
| Positive selection in CEU (c) | 6.754 | 6.350 | 0.0009*** | 0.0695 | 2.00 × 10⁻⁹*** |
| Positive selection in CHB (c) | 6.712 | 6.423 | 0.0235 | 0.0380 | 0.0010** |
| Positive selection in mammals (d) | 1.830 | 2.270 | 2.03 × 10⁻⁸*** | −0.1157 | 3.62 × 10⁻²⁵*** |
| Purifying selection in recent humans (e) | 0.1041 | 0.1109 | 4.66 × 10⁻⁸*** | −0.1131 | 5.14 × 10⁻²⁵*** |
| Purifying selection in humans (f) | 11.81 | 6.848 | 2.37 × 10⁻⁹*** | 0.1932 | 3.70 × 10⁻²⁹*** |
| Purifying selection in mammals (g) | 0.0768 | 0.1160 | 3.70 × 10⁻²⁹*** | −0.2600 | 6.67 × 10⁻⁸⁹*** |
(a) Mann–Whitney test to compare the degree or the natural selection score between genes that are essential and genes that are not essential, that is, lethal and viable when knocked out in mice, respectively (data from the Mouse Genome Database [Bult et al. 2008] “MRK_Ensembl_Pheno.rpt” file downloaded on October 7, 2010).
(b) Spearman’s correlation analysis to test for the relationship between degree or the natural selection score and the functional indispensability score (Khurana et al. 2013).
(c, d) High ZF and 2Δℓ scores indicate a higher probability of having evolved under positive selection during human and mammal evolution, respectively.
(e) Low DAF scores indicate higher selective constraints during recent human evolution.
(f) High NI scores indicate higher selective constraints during the human lineage evolution.
(g) Low ω scores indicate higher selective constraints during mammal evolution.
*P < 0.05; **P < 0.01; ***P < 0.001.
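For readers unfamiliar with the Mann–Whitney test used for the lethal versus viable comparison, a bare-bones version can be sketched as follows. The function name `mann_whitney_u` and the toy values are hypothetical, and the P value uses the large-sample normal approximation without tie or continuity correction, which is a simplification and not necessarily what the authors' software did.

```python
import math

def mann_whitney_u(lethal, viable):
    """U statistic for the first group and a two-sided approximate P value."""
    # Count pairs where the lethal-gene score is larger; ties count as half.
    u = sum((a > b) + 0.5 * (a == b) for a in lethal for b in viable)
    n1, n2 = len(lethal), len(viable)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p

# Toy scores (made up for illustration, not data from the study).
u_stat, p_value = mann_whitney_u([5, 6, 7], [1, 2, 3])
```

The pairwise count is quadratic in the number of genes, which is acceptable for a sketch; a rank-based formulation would be preferable for genome-scale data.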
Figure: Comparison of the impact of natural selection between essential and nonessential genes. We performed a Mann–Whitney test to compare the selection scores between genes that are lethal (essential, in yellow) and viable (nonessential, in green) when knocked out in mice (data from the Mouse Genome Database [Bult et al. 2008]; “MRK_Ensembl_Pheno.rpt” file downloaded on October 7, 2010). ZF and 2Δℓ scores were used to estimate the likelihood of positive selection in human populations and in mammals, respectively. DAF, NI, and ω were used to estimate the impact of purifying selection in recent human populations, in the human lineage, and in mammals, respectively. Lower DAF and ω indicate higher evolutionary constraint estimated from polymorphism and divergence data, respectively, whereas higher NI scores indicate higher evolutionary constraint estimated from both polymorphism and divergence data. To put all the scores on the same scale, the mean standardized scores are plotted (standardized scores were calculated by subtracting the mean and dividing by the standard deviation). Significant differences between lethal and viable genes are marked with asterisks. *P < 0.05; **P < 0.01; ***P < 0.001.
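The standardization mentioned in this caption (subtract the mean, divide by the standard deviation) and the per-group means that are plotted can be sketched as below. The function names and toy values are hypothetical, and the use of the population standard deviation (`pstdev`) rather than the sample version is an assumption, since the caption does not specify which was used.

```python
from statistics import mean, pstdev

def standardize(scores):
    """Z-scores: subtract the mean and divide by the standard deviation."""
    mu, sigma = mean(scores), pstdev(scores)
    return [(s - mu) / sigma for s in scores]

def mean_standardized_by_group(scores, is_lethal):
    """Mean standardized score for lethal and for viable genes."""
    z = standardize(scores)
    lethal = [v for v, flag in zip(z, is_lethal) if flag]
    viable = [v for v, flag in zip(z, is_lethal) if not flag]
    return mean(lethal), mean(viable)

# Toy values (made up for illustration, not data from the study).
lethal_mean, viable_mean = mean_standardized_by_group(
    [1, 2, 3, 4], [True, True, False, False]
)
```

Standardizing each score across all genes before averaging is what lets quantities on very different scales, such as degree-like counts and ω ratios, share one axis.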