Alejandro Andirkó, Juan Moriano, Alessandro Vitriolo, Martin Kuhlwilm, Giuseppe Testa, Cedric Boeckx.
Abstract
Large-scale estimations of the time of emergence of variants are essential to examine hypotheses concerning human evolution with precision. Using an open repository of genetic variant age estimations, we offer here a temporal evaluation of various evolutionarily relevant datasets, such as Homo sapiens-specific variants, high-frequency variants found in genetic windows under positive selection, introgressed variants from extinct human species, as well as putative regulatory variants specific to various brain regions. We find a recurrent bimodal distribution of high-frequency variants, but also evidence for specific enrichments of gene categories in distinct time windows, pointing to different periods of phenotypic changes, resulting in a mosaic. With a temporal classification of genetic mutations in hand, we then applied a machine learning tool to predict what genes have changed more in certain time windows, and which tissues these genes may have impacted more. Overall, we provide a fine-grained temporal mapping of derived variants in Homo sapiens that helps to illuminate the intricate evolutionary history of our species.
Year: 2022 PMID: 35705575 PMCID: PMC9200848 DOI: 10.1038/s41598-022-13589-0
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1(a) Density of distribution of derived Homo sapiens alleles over time in an aggregated control set (n = 1000) of random variants across the genome and two sets of derived ones: all derived variants, and those found at high-frequency. Horizontal lines mark distribution quantiles 0.25, 0.5 and 0.75. (b) Line plot showing the bimodal distribution of high-frequency variants using different generation times (in the text, we used 29 years, following[62]).
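The generation-time dependence in panel (b) comes down to a unit conversion: an age expressed in generations is multiplied by an assumed generation time. The following is a minimal sketch of that arithmetic, assuming the input ages are in generations. The function name, the toy ages, and the alternative generation times of 20 and 25 years are illustrative assumptions, not values or code taken from the paper; only the 29-year figure comes from the caption.

```python
def generations_to_years(age_generations, years_per_generation=29.0):
    """Convert a variant age from generations to years.

    The default of 29 years per generation follows the caption;
    any other value is an assumption made for comparison.
    """
    if age_generations < 0 or years_per_generation <= 0:
        raise ValueError("age must be >= 0 and generation time must be > 0")
    return age_generations * years_per_generation


# Hypothetical ages in generations (illustrative only, not GEVA output).
toy_ages = [1000.0, 5000.0, 20000.0]

# Rescale the same ages under several assumed generation times.
for gen_time in (20.0, 25.0, 29.0):
    years = [generations_to_years(a, gen_time) for a in toy_ages]
    print(f"{gen_time:>4} years/generation -> {years}")
```

Because the conversion is a linear rescaling, changing the generation time stretches or compresses the time axis without changing the number of modes in the distribution.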
Figure 2(a) Selected temporal windows used in our study to further interrogate the nature and distribution of HF variants. (b) Distribution of introgressed alleles over time, as identified by[27,30]. (c) Plots of HF variants in datasets relevant to human evolution, including regions under positive selection[29], regions depleted of archaic introgression[27,28] and genes showing an excess of HF variants (‘excess’ and ‘length’)[19]. Variant counts in (a,c,d) are squared to aid visualization. (d) Kernel density difference between the highest point in the distributions of (c) (leftmost peak) and the second, older highest density peak, normalized, in percentage units.
Big40 Brain volume GWAS[46] top hits with high predicted gene expression in ExPecto (, RPKM), along with dating as provided by GEVA.
| Location | rsid | Nearest gene(s) | GWAS trait | Age (GEVA) |
|---|---|---|---|---|
| 20:49070644 | rs75994450 | PTPN1 | Fractional anisotropy measurement, Splenium (Corpus Callosum) | 36,735.46 |
| 14:59669037 | rs75255901 | DAAM1 | Functional connectivity (rfMRI) | 39,543.24 |
| 1:22498451 | rs2807369 | WNT4 | Volume of gray matter in Cerebellum (left) | 50,060.96 |
| 2:63144695 | rs17432559 | EHBP1 | Volume of Corpus Callosum (Posterior) | 52,290.48 |
| 12:2231744 | rs75557252 | CACNA1C | Functional connectivity (rfMRI) | 93,924.62 |
| 10:92873811 | rs17105731 | PCGF5 | Volume of inferior temporal gyrus (right) | 255,792.5 |
| 17:59312894 | rs73326893 | BCAS3 | Functional connectivity (rfMRI) | 418,742.6 |
| 22:27195261 | rs72617274 | CRYBA4 | Functional connectivity (rfMRI) | 445,477.7 |
| 2:230367803 | rs56049535 | DNER | Functional connectivity (rfMRI) | 523,629.8 |
| 16:3687973 | rs78315731 | DNASE1 | Volume of Pars triangularis (left) | 698,856.5 |
‘Functional connectivity’ is a measure of temporal activity synchronization between brain parcels at rest (originally defined in[51]).
Figure 3(a) Venn diagram of GO terms associated with genes shared across time windows. (b) Top GO terms per time window.
Figure 4(A) Sum of all directional mutation effects within 1 kb of the TSS per time window in 22 brain and brain-related tissues (red) and the rest of the tissues included in the ExPecto trained model as a control group (blue). Significant differences exist across time periods when non-brain and brain-related tissues are compared (Kruskal–Wallis test). (B) Genes with a high sum of all directional mutation effects, and cumulative directionality of expression values in brain tissues per time window.
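The aggregation described in panel (A) can be illustrated with a small sketch: signed per-variant effects are summed for each time window and tissue group, keeping only variants within 1 kb of a TSS. This is a hedged illustration under stated assumptions, not the authors' pipeline. The record fields (`window`, `tissue_group`, `tss_distance`, `effect`), the window labels, the inclusive 1000 bp cutoff, and all numeric values are hypothetical.

```python
from collections import defaultdict


def sum_directional_effects(variants, max_tss_distance=1000):
    """Sum signed predicted effects per (time window, tissue group).

    Only variants whose absolute distance to the TSS is at most
    `max_tss_distance` base pairs are counted (assumed inclusive cutoff).
    """
    totals = defaultdict(float)
    for v in variants:
        if abs(v["tss_distance"]) <= max_tss_distance:
            totals[(v["window"], v["tissue_group"])] += v["effect"]
    return dict(totals)


# Toy records (hypothetical values, not ExPecto output).
toy_variants = [
    {"window": "recent", "tissue_group": "brain", "tss_distance": 120, "effect": 0.5},
    {"window": "recent", "tissue_group": "brain", "tss_distance": -800, "effect": -0.25},
    {"window": "recent", "tissue_group": "other", "tss_distance": 300, "effect": 0.125},
    # Farther than 1 kb from the TSS, so it is skipped.
    {"window": "old", "tissue_group": "brain", "tss_distance": 2500, "effect": 1.0},
]

totals = sum_directional_effects(toy_variants)
for key, value in sorted(totals.items()):
    print(key, value)
```

Keeping the sign of each effect means that opposing up- and down-regulating predictions cancel within a group, which is what makes the sum directional rather than a measure of total magnitude.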