Neus Font-Porterias1, Rocio Caro-Consuegra1, Marcel Lucas-Sánchez1, Marie Lopez2, Aaron Giménez3, Annabel Carballo-Mesa4, Elena Bosch1,5, Francesc Calafell1, Lluís Quintana-Murci2,6, David Comas1.
Abstract
Demographic history plays a major role in shaping the distribution of genomic variation. Yet the interaction between different demographic forces and their effects in the genomes is not fully resolved in human populations. Here, we focus on the Roma population, the largest transnational ethnic minority in Europe. They have a South Asian origin and their demographic history is characterized by recent dispersals, multiple founder events, and extensive gene flow from non-Roma groups. Through the analyses of new high-coverage whole exome sequences and genome-wide array data for 89 Iberian Roma individuals together with forward simulations, we show that founder effects have reduced their genetic diversity and proportion of rare variants, gene flow has counteracted the increase in mutational load, runs of homozygosity show ancestry-specific patterns of accumulation of deleterious homozygotes, and selection signals primarily derive from preadmixture adaptation in the Roma population sources. The present study shows how two demographic forces, bottlenecks and admixture, act in opposite directions and have long-term balancing effects on the Roma genomes. Understanding how demography and gene flow shape the genome of an admixed population provides an opportunity to elucidate how genomic variation is modeled in human populations.
Keywords: Roma; adaptation; admixture; demography; exomes; mutational load
Year: 2021 PMID: 33713133 PMCID: PMC8233508 DOI: 10.1093/molbev/msab070
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Fig. 1. Population structure and distribution of synonymous variants. (A) PCA with the merged data set of genome-wide array and WES variants. (B) Unfolded site-frequency spectrum for synonymous WES variants. (C) Genetic diversity measures (πvar and θw) from synonymous WES variants. Other diversity metrics (Tajima’s D and θπ) are shown in supplementary figure 8, Supplementary Material online. Significant differences were tested between Roma and non-Roma populations (*** refers to P value <0.001 in all comparisons).
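As an illustration of how the diversity metrics named in this caption relate to an unfolded site-frequency spectrum, the following is a minimal sketch using the standard textbook estimators (Watterson's θw, θπ, and Tajima's D). It is not the authors' pipeline: the function name `sfs_diversity` and the input layout are assumptions, and the values are totals over sites rather than the per-variant normalization that πvar may imply.

```python
from math import sqrt


def sfs_diversity(sfs):
    """Return (theta_w, theta_pi, tajimas_d) from an unfolded SFS.

    Assumed layout (hypothetical): sfs[i - 1] is the number of sites whose
    derived allele is observed i times, for i = 1 .. n - 1, where n is the
    number of sampled chromosomes.
    """
    n = len(sfs) + 1                      # number of chromosomes
    s = sum(sfs)                          # number of segregating sites
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))

    # Watterson's estimator: segregating sites scaled by the harmonic number.
    theta_w = s / a1
    # Mean pairwise differences, computed from derived-allele counts.
    theta_pi = sum(2.0 * i * (n - i) * x
                   for i, x in enumerate(sfs, start=1)) / (n * (n - 1))

    # Variance constants from Tajima (1989).
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    var = e1 * s + e2 * s * (s - 1)

    d = (theta_pi - theta_w) / sqrt(var) if var > 0 else float("nan")
    return theta_w, theta_pi, d


# Toy spectrum for n = 5 chromosomes (made-up counts, not data from the study).
theta_w, theta_pi, d = sfs_diversity([12, 6, 4, 3])
```

A spectrum with an excess of rare variants pulls θπ below θw and gives a negative D, while a loss of rare variants, as expected after a founder event, shifts D upward.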
Fig. 2. Proportion of missense variants from each GERP category in each frequency bin (low-frequency, common) for each population. Low-frequency: singletons and doubletons; common: tripletons and above.
Fig. 3. Deleterious load comparisons and trajectory estimations. (A) Mutational load proxy (Nalleles and Nhom) ratios between Roma and non-Roma for missense variants in each deleterious GERP category. Point estimates and 95% CIs are shown. *P value <0.05; **P value <0.01; ***P value <0.001. Only P values <0.001 are considered significant to account for multiple testing errors. (B) Relative mutational load (Lg/LANC) in the Roma in each sampled generation for each simulated model. LANC: load in the ancestral population (proto-Roma 20 generations before the “Out-of-India” event). “Out-of-India” and “Out-of-Balkans” represent the two simulated bottlenecks at 63 and 38 generations ago, respectively.
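The two load proxies in panel A can be sketched as simple per-individual counts. This assumes the common definitions, which the caption does not spell out: Nalleles as the number of derived alleles an individual carries and Nhom as the number of sites where the individual is homozygous for the derived allele. The genotype coding (0, 1, 2 derived copies), the function names, and the toy data are all hypothetical, and the confidence intervals shown in the figure are not reproduced here.

```python
def load_proxies(genotypes):
    """Return (n_alleles, n_hom) for one individual.

    genotypes: derived-allele counts per site, coded 0, 1, or 2 (assumed coding).
    """
    n_alleles = sum(genotypes)                      # hets count once, homs twice
    n_hom = sum(1 for g in genotypes if g == 2)     # homozygous-derived sites
    return n_alleles, n_hom


def mean_proxy_ratios(pop_a, pop_b):
    """Ratio of mean N_alleles and mean N_hom between two populations.

    pop_a, pop_b: lists of per-individual genotype lists over the same sites.
    """
    def means(pop):
        counts = [load_proxies(ind) for ind in pop]
        mean_alleles = sum(c[0] for c in counts) / len(counts)
        mean_hom = sum(c[1] for c in counts) / len(counts)
        return mean_alleles, mean_hom

    a_alleles, a_hom = means(pop_a)
    b_alleles, b_hom = means(pop_b)
    return a_alleles / b_alleles, a_hom / b_hom


# Made-up genotypes for two individuals per population (illustration only).
pop_a = [[0, 1, 2, 2], [1, 1, 0, 2]]
pop_b = [[0, 1, 1, 2], [1, 0, 0, 1]]
ratio_alleles, ratio_hom = mean_proxy_ratios(pop_a, pop_b)
```

A ratio above 1 would indicate more derived alleles (or more derived homozygotes) per individual in the first population for the variant class being counted.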
R / ratios between Roma and non-Roma populations for missense variants in each deleterious GERP category normalized by synonymous variants.
| Comparison | 2 < GERP < 4 | 4 < GERP < 6 | 6 < GERP |
|---|---|---|---|
| Roma-IBS | 0.986 (0.954–1.0172) | 1.032 (0.965–1.107) | 1.152 (−4.333 to 3.917) |
| Roma-TSI | 0.989 (0.957–1.018) | 1.053 (0.983–1.132) | 1.225 (−5.176 to 4.417) |
| Roma-Hungarian | 0.993 (0.954–1.030) | 1.033 (0.951–1.114) | 0.946 (−3.918 to 4.207) |
| Roma-PJL | 0.987 (0.945–1.028) | 1.013 (0.922–1.112) | 1.058 (−2.953 to 4.287) |
| Roma-GIH | 0.973 (0.930–1.014) | 1.034 (0.938–1.154) | 1.048 (−2.842 to 4.934) |
| Roma-ITU | 0.984 (0.938–1.028) | 0.970 (0.861–1.074) | 0.833 (−2.496 to 3.553) |
Note.—Point estimates and 95% CIs are shown.
Correlations (Spearman’s ρ) between the global proportion of South Asian ancestry in the Roma population inferred with RFMix and the number/length of ROHs per-individual.
| | All ROHs | 0.5 < ROHs ≤ 2.5 (Mb) | 2.5 < ROHs (Mb) |
|---|---|---|---|
| Number of ROHs | 0.1051 | −0.0148 | 0.3587 |
| Total ROH length | 0.3766 | 0.2518 | 0.3563 |
P value <0.05;
P value <0.01.
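For readers who want to see how a correlation of this kind is obtained, here is a minimal pure-Python sketch of Spearman's ρ: both variables are converted to ranks (ties receive the average rank) and a Pearson correlation is computed on the ranks. The function names and the per-individual values are hypothetical and are not taken from the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Made-up values: South Asian ancestry proportion and ROH count per individual.
ancestry = [0.55, 0.61, 0.64, 0.70, 0.72]
n_rohs = [5, 6, 7, 8, 7]
rho = spearman_rho(ancestry, n_rohs)
```

Because the statistic depends only on ranks, it captures any monotonic relationship between ancestry proportion and ROH burden, not just a linear one.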
Fig. 4. Fraction of Nhom in ancestry-specific ROHs versus the total length of ancestry-specific ROHs per individual. South Asian ROHs in red, and European ROHs in blue. The first three panels show a deleterious GERP category each and the last panel shows synonymous variants. β2 and β3 show intercept and slope differences between regressions. *P value <0.05; **P value <0.01; ***P value <0.001.
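One common way to obtain intercept and slope differences between two regressions is a single linear model with an ancestry indicator and an interaction term. The form below is a sketch under that assumption; the caption names only β2 and β3, so the symbols β0, β1, y, x, A, and ε are introduced here for illustration.

```latex
y_i = \beta_0 + \beta_1 x_i + \beta_2 A_i + \beta_3 \, (x_i A_i) + \varepsilon_i
```

Here y_i would be the fraction of Nhom in ancestry-specific ROHs for individual i, x_i the total length of those ROHs, and A_i an indicator equal to 1 for one ancestry and 0 for the other. Under this parameterization, β2 is the difference in intercepts and β3 the difference in slopes between the two ancestry-specific regression lines.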
Fig. 5. Selection test results (XP-EHH) and mean local ancestry in two candidate regions. (A) Results for chromosome 20: 50,000,000–56,000,000. Top panel shows South Asian (dark red) ancestry (mean and 4.42 standard deviations in solid and dotted lines, respectively). The genomic location of the DOK5 gene is shown. Middle and bottom panels show XP-EHH analysis comparing Roma against Europe and South Asia against Europe (top 1% and 5% are shown with dashed lines). The region within chr20: 52,813,832–53,454,024 is highlighted in red. (B) Results for chromosome 18: 20,000,000–23,000,000. Top panel shows European (blue) ancestry (mean and 4.42 standard deviations in solid and dotted lines, respectively). The genomic location of the LAMA3 gene is shown. Middle and bottom panels show XP-EHH analysis comparing Roma against South Asia and Europe against South Asia (top 1% and 5% are shown with dashed lines). The region within chr18: 21,276,048–21,740,878 is highlighted in blue.