Denis Pierron, Margit Heiske, Harilanto Razafindrazaka, Veronica Pereda-Loth, Jazmin Sanchez, Omar Alva, Amal Arachiche, Anne Boland, Robert Olaso, Jean-Francois Deleuze, Francois-Xavier Ricaut, Jean-Aimé Rakotoarisoa, Chantal Radimilahy, Mark Stoneking, Thierry Letellier.
Abstract
While admixed populations offer a unique opportunity to detect selection, the admixture in most of the studied populations occurred too recently to produce conclusive signals. By contrast, Malagasy populations originate from admixture between Asian and African populations that occurred ~27 generations ago, providing power to detect selection. We analyze local ancestry across the genomes of 700 Malagasy and identify a strong signal of recent positive selection, with an estimated selection coefficient >0.2. The selection is for African ancestry and affects 25% of chromosome 1, including the Duffy blood group gene. The null allele at this gene provides resistance to Plasmodium vivax malaria, and previous studies have suggested positive selection for this allele in the Malagasy population. This selection event also influences numerous other genes implicated in immunity, cardiovascular diseases, and asthma and decreases the Asian ancestry genome-wide by 10%, illustrating the role played by selection in recent human history.
Year: 2018 PMID: 29500350 PMCID: PMC5834599 DOI: 10.1038/s41467-018-03342-5
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Average Asian ancestry estimated using ELAI, across the 700 Malagasy individuals, for each position of each chromosome. In this analysis the African and Asian populations from the 1000 Genomes Project[22] were used as proxies for the African and Asian ancestry in the Malagasy. Results from other reference populations are provided in Supplementary Figure, and results using other approaches and reference populations in Supplementary Figures 2 and 3. Black lines represent a deviation of 3 SD from the mean and blue lines represent a deviation of 6 SD from the mean.
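The 3 SD and 6 SD lines in Fig. 1 can be illustrated with a minimal sketch that flags positions whose average ancestry lies far from the genome-wide mean. The function name `outlier_positions`, the use of the population standard deviation over all positions, and the toy values are assumptions made for illustration; the study's exact computation is not described in this section.

```python
from statistics import mean, pstdev


def outlier_positions(asian_ancestry, n_sd):
    """Return indices whose value deviates from the mean by more than n_sd SDs.

    `asian_ancestry` is a list of average Asian ancestry values, one per
    genomic position (hypothetical input layout).
    """
    mu = mean(asian_ancestry)
    sd = pstdev(asian_ancestry)  # assumption: population SD across positions
    return [i for i, a in enumerate(asian_ancestry) if abs(a - mu) > n_sd * sd]


# Synthetic values for illustration only, not the study's data:
# 90 positions near 0.37 and one position with strongly reduced Asian ancestry.
toy = [0.37] * 50 + [0.36, 0.38] * 20 + [0.10]

print(outlier_positions(toy, 3))  # positions beyond the 3 SD line
print(outlier_positions(toy, 6))  # positions beyond the 6 SD line
```

Raising `n_sd` narrows the flagged set, which matches how the SD3, SD6, and SD9 regions are described as nested intervals on chromosome 1.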
Fig. 2 Asian ancestry proportion of the Malagasy population by genomic region, based on an unsupervised ADMIXTURE analysis for k = 2. SD3 (3 standard deviations from the average ancestry) represents the region chr1: 114423653–175653680; SD6 represents the region chr1: 154764878–161975281; and SD9 represents the region chr1: 158934324–160006961.
Fig. 3 Distribution of Asian ancestry across the genome. (a) The simulated distribution based on the “realistic model”. (b) The simulated distribution based on the “drift model”. (c) The observed distribution for all positions on chromosome 1. (d) The observed distribution for all positions in the genome (excluding chromosome 1).
Fig. 4 Change in local ancestry expected around a site under selection, computed for various times of selection and selection coefficients. The gray dots represent the observed Asian ancestry around position rs12075 (ACKR1 gene); the colored bars are the expected values for the indicated selection coefficient, interval between the onset of admixture and the onset of selection, and duration of selection.
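To give a feel for how a selection coefficient and a duration of selection translate into a shift in ancestry at the selected site, here is a minimal sketch using the textbook deterministic model of genic selection. This is a simplification, not the model behind Fig. 4: it ignores drift, dominance, and the decay of the signal with recombination distance. The function name and the starting proportion of 0.6 are hypothetical.

```python
def ancestry_after_selection(p0, s, generations):
    """Proportion of the favored ancestry after deterministic genic selection.

    Each generation the favored ancestry has relative fitness 1 + s, so its
    odds p / (1 - p) are multiplied by (1 + s).
    """
    p = p0
    for _ in range(generations):
        p = p * (1 + s) / (1 + s * p)
    return p


# Illustrative inputs: s = 0.2 and 27 generations echo the orders of magnitude
# quoted in the abstract; the starting proportion 0.6 is a made-up value.
for duration in (5, 10, 27):
    print(duration, round(ancestry_after_selection(0.6, 0.2, duration), 3))
```

Because the odds grow geometrically, even a few generations with s around 0.2 shift the local ancestry substantially, which is why a recent admixture event can still carry a detectable signal.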
Fig. 5 Distribution of the amount of Asian ancestry across Malagasy individuals for the 1q23 chromosomal region, and the Fst value between Asian and African populations for each SNP. SNPs marked in red have been linked to a phenotype by GWAS.
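As a rough illustration of the per-SNP Fst values mentioned in Fig. 5, the sketch below computes Wright's Fst for one biallelic SNP from two population allele frequencies, weighting both populations equally. The estimator actually used in the study is not stated in this section, so this formula is an assumption, and the function name is hypothetical.

```python
def wright_fst(p1, p2):
    """Wright's Fst for a biallelic SNP in two equally weighted populations.

    Fst = (H_T - H_S) / H_T, where H_T is the expected heterozygosity of the
    pooled population and H_S the mean within-population heterozygosity.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # SNP is monomorphic in both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


# Frequencies listed for rs12075 in the table: 2% in Africa, 92% in East Asia.
print(round(wright_fst(0.02, 0.92), 3))
```

A value close to 1 indicates that the two source populations are nearly fixed for different alleles, which is what makes such SNPs informative about local ancestry.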
Fig. 6 Spatial distribution of African ancestry percentage across Madagascar: (a) global ancestry computed on all positions (excluding chromosome 1); (b) local ancestry computed for the ACKR1 locus; (c) frequency of the Duffy null allele. Sampled villages are represented by gray dots. The underlying map was generated in R using the mapdata library (2016).
Allele frequencies for SNPs present on the genotyping array that exhibit a strong association with a phenotype (Phen-gen p-value < 10⁻⁸), and are in the 1q23 chromosomal region
| Region | SNP | Ref/alt | Associated phenotype | Observed frequency: Africa (%) | Observed frequency: East Asia (%) | Observed frequency: Madagascar (%) | Expected frequency (mean ± SD, %) | Probability of observed data |
|---|---|---|---|---|---|---|---|---|
| SD9 | rs1101999 | T/C | Asthma | 31 | 0 | 32 | 19 ± 1.9 | <10⁻⁶ |
| | rs3026968 | C/T | Chemokine CCL2 | 3 | 53 | 8 | 22 ± 2.1 | <10⁻⁶ |
| | rs3093059 | A/G | C-reactive protein | 32 | 15 | 35 | 25 ± 2.2 | 2 × 10⁻⁵ |
| | rs12075 | A/G | Leukocyte count | 2 | 92 | 7 | 37 ± 2.4 | <10⁻⁶ |
| SD6 | rs6684514 | G/A | Erythrocyte indices | 8 | 23 | 7 | 14 ± 1.7 | 4 × 10⁻⁶ |
| | rs1801274 | G/A | Mucocutaneous lymph node syndrome | 47 | 72 | 46 | 57 ± 2.5 | 1.4 × 10⁻⁵ |
| | rs7528684 | G/A | Diabetes mellitus, type 1 | 6 | 56 | 11 | 25 ± 2.1 | <10⁻⁶ |
| | rs1142287 | T/C | Crohn disease | 46 | 31 | 37 | 40 ± 2.5 | 9.55 × 10⁻² |
| | rs12566888 | T/G | Platelet aggregation | 35 | 51 | 36 | 41 ± 2.5 | 1.84 × 10⁻² |
| | rs13376333 | C/T | Atrial fibrillation | 30 | 3 | 23 | 20 ± 2.0 | 4.87 × 10⁻² |
The frequency observed in Madagascar is compared with the frequencies observed in Africa and East Asia, and with the expected frequency produced by admixture without selection (based on one million computer simulations[15]). The probability of the observed data is based on the quantile of the observed frequency in the distribution of the simulation results.
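The comparison described in this note, an observed frequency placed within a distribution simulated under admixture without selection, can be sketched as follows. This is a minimal Wright-Fisher drift simulation, not the simulation framework cited above: the constant population size, the African ancestry share of 0.6, the number of chromosomes, the two-sided tail rule, and all function names are assumptions chosen for illustration.

```python
import random


def simulate_neutral_frequencies(p_afr, p_asn, afr_share, n_chrom,
                                 generations, n_sims, seed=0):
    """Simulate allele frequencies after admixture followed by pure drift."""
    rng = random.Random(seed)
    # Starting frequency: mixture of the two source-population frequencies.
    p0 = afr_share * p_afr + (1 - afr_share) * p_asn
    finals = []
    for _ in range(n_sims):
        p = p0
        for _ in range(generations):
            # Binomial resampling of n_chrom chromosomes (Wright-Fisher step).
            copies = sum(rng.random() < p for _ in range(n_chrom))
            p = copies / n_chrom
        finals.append(p)
    return finals


def empirical_tail_probability(observed, simulated):
    """Two-sided tail probability of `observed` in the simulated distribution."""
    n = len(simulated)
    lower = sum(x <= observed for x in simulated) / n
    upper = sum(x >= observed for x in simulated) / n
    return min(1.0, 2 * min(lower, upper))


# rs12075 frequencies from the table: 2% (Africa), 92% (East Asia), 7% observed.
# afr_share, n_chrom and n_sims are hypothetical; 27 generations follows the abstract.
sims = simulate_neutral_frequencies(p_afr=0.02, p_asn=0.92, afr_share=0.6,
                                    n_chrom=500, generations=27, n_sims=200)
print(empirical_tail_probability(0.07, sims))
```

With only a few hundred simulations the smallest nonzero probability that can be resolved is on the order of 1/n_sims, which is why resolving values below 10⁻⁶ requires on the order of a million simulations, as in the table.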