Xavier Farré, Ruben Molina, Fabio Barteri, Paul R H J Timmers, Peter K Joshi, Baldomero Oliva, Sandra Acosta, Borja Esteve-Altava, Arcadi Navarro, Gerard Muntané.
Abstract
The enormous variation in mammalian lifespan is the result of each species' adaptations to its own biological trade-offs and ecological conditions. Comparative genomics has demonstrated that the genomic factors underlying both species lifespans and the longevity of individuals are in part shared across the tree of life. Here, we compared protein-coding regions across the mammalian phylogeny to detect individual amino acid (AA) changes shared by the most long-lived mammals and genes whose rates of protein evolution correlate with longevity. We discovered a total of 2,737 AA in 2,004 genes that distinguish long- and short-lived mammals, significantly more than expected by chance (P = 0.003). These genes belong to pathways involved in regulating lifespan, such as inflammatory response and hemostasis. Of these AA, a total of 1,157 showed a significant association with maximum lifespan in a phylogenetic test. Interestingly, most of the detected AA positions do not vary in extant human populations (81.2%) or have allele frequencies below 1% (99.78%). Consequently, almost none of these putatively important variants could have been detected by genome-wide association studies. Additionally, we identified four more genes whose rate of protein evolution correlated with longevity in mammals. Crucially, SNPs located in the detected genes explain a larger fraction of human lifespan heritability than expected, successfully demonstrating for the first time that comparative genomics can be used to enhance interpretation of human genome-wide association studies. Finally, we show that the human longevity-associated proteins are significantly more stable than the orthologous proteins from short-lived mammals, strongly suggesting that general protein stability is linked to increased lifespan.
Keywords: GWAS; aging; comparative genomics; convergent evolution; genetics; maximum lifespan
Year: 2021 PMID: 34297086 PMCID: PMC8557403 DOI: 10.1093/molbev/msab219
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Fig. 1. Workflow used in this study for the detection of convergent amino acid substitutions (CAAS). In the Discovery phase, we identified AA substitutions that were exclusive to species in the top (yellow) and bottom (blue) deciles. In the Validation phase, we classified the species from the intermediate deciles (green) into two groups: species carrying the “long-lived” AA version and species carrying the “short-lived” AA version. Finally, we ran an RRPP phylogenetic ANOVA to validate each discovered AA, keeping as significant only those for which we validated the direction of the effect (FDR < 0.05).
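The Discovery phase described in the Fig. 1 caption can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes a pre-aligned protein given as a dictionary of equal-length sequences, and the function name `find_candidate_positions` and the toy species labels are hypothetical. The sketch flags alignment columns where the long-lived and short-lived groups share no residue, and it does not distinguish the scenarios listed in the table, whose definitions are not given here.

```python
def find_candidate_positions(alignment, long_lived, short_lived):
    """Flag alignment columns where the two groups share no amino acid.

    alignment:   dict mapping species name -> aligned protein sequence
                 (all sequences must have the same length).
    long_lived:  species names in the top lifespan group (illustrative).
    short_lived: species names in the bottom lifespan group (illustrative).
    Returns a list of (position, long_lived_residues, short_lived_residues).
    """
    length = len(next(iter(alignment.values())))
    hits = []
    for pos in range(length):
        top = {alignment[s][pos] for s in long_lived}
        low = {alignment[s][pos] for s in short_lived}
        # Assumption: skip columns with gaps or unknown residues in either group.
        if {"-", "X"} & (top | low):
            continue
        # Candidate position: no residue is shared between the two groups.
        if top.isdisjoint(low):
            hits.append((pos, "".join(sorted(top)), "".join(sorted(low))))
    return hits


# Toy alignment with made-up species labels and sequences.
toy = {"sp_a": "MKT", "sp_b": "MKT", "sp_c": "MRT", "sp_d": "MRS"}
print(find_candidate_positions(toy, ["sp_a", "sp_b"], ["sp_c", "sp_d"]))
```

In the toy alignment only the middle column separates the groups (K in the long-lived pair, R in the short-lived pair); the last column is rejected because T occurs in both groups.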
Lists of Discovered and Validated CAAS and Genes.
| | Discovered CAAS | Discovered Genes | Validated RRPP CAAS | Validated RRPP Genes |
|---|---|---|---|---|
| Scenario 1 | 284 | 273 | 131 (46.1%) | 128 |
| Scenario 2 | 2,453 | 1,822 | 1,025 (41.8%) | 891 |
| Scenario 3 | 533 | 495 | 185 (33.5%) | 182 |
| Scenarios 1 + 2 | 2,737 | 2,004 | 1,157 (42.3%) | 996 |
Note.—Numbers in parentheses represent the percentage of phylogenetically validated positions.
Fig. 2. (A) Manhattan plot of gene-based association results of the phylogenetically controlled PGLS regressions for LQ. Each dot represents a gene; dots depicted in red have FDR < 0.1. The negative logarithm of the FDR P value for each gene tested is reported on the y axis. P value cutoffs corresponding to the Benjamini–Hochberg thresholds FDR = 0.05 and FDR = 0.1, based on the 17,969 genes tested, are denoted by the dashed and dotted lines, respectively. Phylogenetically controlled (PGLS) regressions of log10 root-to-tip ω against log10 LQ are displayed for the significant genes (B) SPAG16, (C) TOR2A, (D) ADCY7, and (E) CDK12. Black lines represent the fitted regression lines. UCSC version names were used for species labeling. Correspondence to the species names can be found in supplementary table 1, Supplementary Material online.
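The Benjamini–Hochberg thresholds mentioned in the Fig. 2 caption follow the standard step-up rule: sort the m P values, find the largest rank k with p(k) ≤ (k/m)·FDR, and call the k smallest P values significant. The sketch below is a plain-Python illustration of that rule, not the code used in the study. The function name and the example P values are made up for illustration.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which tests pass the BH step-up rule."""
    m = len(p_values)
    # Indices of the tests ordered from smallest to largest P value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k (1-based) such that p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is declared significant.
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant


# Illustrative P values only; these are not results from the study.
print(benjamini_hochberg([0.001, 0.03, 0.035, 0.9], fdr=0.05))
```

The example shows the step-up behavior: 0.03 exceeds its own rank threshold (0.025) but is still declared significant, because the third-ranked value 0.035 falls below its threshold (0.0375).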
Fig. 3. Stratified Q–Q plot for human longevity shows consistent enrichment across several assessed gene sets. Annotation categories were: 1) all SNPs in the GWAS (orange); 2) SNPs in genic regions of genes screened by the CAAS method (yellow); 3) SNPs in the discovered genes (light blue); 4) SNPs in genes validated with RRPP (dark blue); and 5) SNPs in genes nominally significant (Pcons < 0.05) in the PGLS regression (green). All genic regions were defined by gene boundaries plus 5 kb. In summary, the genes we validated in the study were enriched for human longevity signal.
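For readers unfamiliar with Q–Q plots of association P values, the sketch below computes the points for one SNP category: observed −log10 P values paired with the values expected under a uniform null. It is only an illustration. The function name `qq_points` is hypothetical, and the expected quantiles use the (i − 0.5)/n convention, which is one common choice and an assumption here. A stratified plot would repeat this per category; a curve rising above the diagonal indicates enrichment.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10 P values, ascending."""
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    # Assumption: uniform-null quantiles taken at (i - 0.5) / n.
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))


# Illustrative P values only; these are not results from the study.
for exp, obs in qq_points([0.2, 0.01, 0.6, 0.0005]):
    print(f"expected={exp:.3f} observed={obs:.3f}")
```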