Julius Mulindwa, Harry Noyes, Hamidou Ilboudo, Luca Pagani, Oscar Nyangiri, Magambo Phillip Kimuda, Bernardin Ahouty, Olivier Fataki Asina, Elvis Ofon, Kelita Kamoto, Justin Windingoudi Kabore, Mathurin Koffi, Dieudonne Mumba Ngoyi, Gustave Simo, John Chisi, Issa Sidibe, John Enyaru, Martin Simuunza, Pius Alibu, Vincent Jamonneau, Mamadou Camara, Andy Tait, Neil Hall, Bruno Bucheton, Annette MacLeod, Christiane Hertz-Fowler, Enock Matovu.
Abstract
Africa contains more human genetic variation than any other continent, but most population-scale analyses of African peoples have focused on just two of the four major linguistic groups, the Niger-Congo and Afro-Asiatic, leaving the Nilo-Saharan and Khoisan populations under-represented. To assess genetic variation and signatures of selection within a Nilo-Saharan population, and between Nilo-Saharan populations and Niger-Congo and Afro-Asiatic populations, we sequenced 50 genomes from the Nilo-Saharan Lugbara population of North-West Uganda and 250 genomes from 6 previously unsequenced Niger-Congo populations. We compared these data to data from a further 16 Eurasian and African populations, including the Gumuz, another putative Nilo-Saharan population from Ethiopia. Of the 21 million variants identified in the Nilo-Saharan population, 3.57 million (17%) were not represented in dbSNP and included predicted non-synonymous mutations with possible phenotypic effects. We found greater genetic differentiation between the Nilo-Saharan Lugbara and Gumuz populations than between any two Afro-Asiatic or Niger-Congo populations. F3 tests showed that the Gumuz contributed a genetic component to most Niger-Congo B populations whereas the Lugbara did not. We scanned the genomes of the Lugbara for evidence of selective sweeps and found sweeps at four loci associated with skin pigmentation (SLC24A5, SNX13, TYRP1, and UVRAG), three of which have already been reported to be under selection. These selective sweeps point toward adaptations to the intense UV radiation of the Sahel.
Keywords: Nilo-Saharan; population genetic variation; signatures of selection
Year: 2020 PMID: 32781046 PMCID: PMC7477016 DOI: 10.1016/j.ajhg.2020.07.007
Source DB: PubMed Journal: Am J Hum Genet ISSN: 0002-9297 Impact factor: 11.025
Figure 1. Map of Africa Showing the Distribution of Five Major African Linguistic Families, the Locations Where Samples Were Collected, and the Proportions of Different Genetic Components
Pie chart size is proportional to sample size, and pie chart proportions and colors correspond to the proportions and colors of the ADMIXTURE components within that population for K = 6 (Figure 3). Note that the map colors for languages are not associated with pie chart colors. The legend shows first the map color for each major linguistic group and second the major colors (>25% admixture component) of the admixture pie charts for each population in that linguistic group. The linguistic distribution map was compiled from data in Ethnologue and used under the Creative Commons Attribution-ShareAlike 4.0 International License. Populations sequenced in this study were sampled from Guinea (GUI), Côte d’Ivoire (CIV), Cameroon (CAM), Democratic Republic of Congo (DRC), Zambia (ZAM), and Uganda (UNL and UBB); additional populations came from the 1000 Genomes Project (Gambia [GWD], Sierra Leone [MSL], Nigeria [ESN, YRI], Kenya [LWK], Egypt [EGY]) and the African Genome Variation Project (Ethiopia [AMH, GUM, ORO, SOM, WOL]). The inset map shows sampling sites in Uganda. The Lugbara (UNL) were from the West Nile region, which is predominantly occupied by Nilo-Saharan speakers, and the Basoga (UBB) were from the southern region, which is occupied by Bantu-speaking people. The map was overlaid with pie charts derived from the admixture plot using R tools. The Ugandan map was generated using QGIS 3.6 (see Web Resources) with regional ethnicity classification traced with inference from “Ethnologue languages of Uganda.”
Figure 3. Genetic Admixture and Differentiation in Our Data, Selected 1000 Genomes, and AGVP Populations
Admixture plot (731 samples) for K = 3 to K = 9. Samples comprise genome sequences from this study, 1000 Genomes African samples, AGVP Egyptian and Ethiopian samples, and European populations (GBR, British from England and Scotland; TSI, Toscani in Italy; IBS, Iberian in Spain; FIN, Finnish in Finland; CEU, Utah residents with Northern and Western European ancestry). Three replicates were carried out for each value of K.
Figure 2. Multidimensional Scaling Analysis of Sequenced Populations
(A) This study: Guinea (GAS), Côte d’Ivoire (CIV), Cameroon (CAM), Democratic Republic of Congo (DRC), Uganda (Nilotic, UNL; Niger-Congo B, UBB), and Zambia (ZAM). Seven Soli/Chikunda (Niger-Congo B)-speaking individuals were outliers by MDS and are not shown in this plot but are shown in Figure S5A.
(B) This study, the African Genome Variation Project Ethiopian samples (Amhara [AMH], Welayta [WOL], Oromo [ORO], Ethiopian Somali [SOM], and Gumuz [GUM]), and 50 samples from each 1000 Genomes African population: Nigeria (ESN, YRI), Gambia (GWD), Mende Sierra Leone (MSL), and Kenya (LWK). The color of each cluster is the color of the dominant genetic component for that cluster in the admixture plot at K = 6.
Figure 4. F3 Tests of Admixture
(A) Target UBB; Z scores for probability that a pair of populations contributed ancestry to the Ugandan Niger-Congo B Basoga.
(B) Target LWK; Z scores for probability that a pair of populations contributed ancestry to Kenyan Luhya.
Heatmap color represents intensity of Z score for probability that a population contributes genetic components to the target. Negative Z scores (yellow to red) are associated with increasingly strong evidence of a contribution and positive scores (cyan to blue) are associated with increasingly strong evidence against a contribution. White squares are inconclusive.
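The sign convention in this caption can be illustrated with a minimal sketch of the f3 statistic, f3(C; A, B), taken here as the mean over SNPs of (c - a)(c - b), where a, b, and c are allele frequencies in the two candidate sources and the target. This is an assumption-laden illustration and not the authors' pipeline: it uses plain allele frequencies without the finite-sample bias correction applied by dedicated tools, a simple delete-one-block jackknife for the standard error, simulated frequencies, and hypothetical function names.

```python
import numpy as np

def f3(c, a, b):
    """Uncorrected f3(C; A, B): mean of (c - a) * (c - b) over SNPs."""
    return float(np.mean((c - a) * (c - b)))

def f3_zscore(c, a, b, n_blocks=20):
    """Z score of f3 using a delete-one-block jackknife over contiguous SNP blocks."""
    per_snp = (c - a) * (c - b)
    blocks = np.array_split(per_snp, n_blocks)
    total, n = per_snp.sum(), per_snp.size
    # Leave-one-block-out estimates of the mean.
    loo = np.array([(total - blk.sum()) / (n - blk.size) for blk in blocks])
    est = total / n
    se = np.sqrt((n_blocks - 1) / n_blocks * np.sum((loo - loo.mean()) ** 2))
    return est / se

# Simulated allele frequencies (toy data, not from the study).
rng = np.random.default_rng(0)
n_snps = 5000
a = rng.uniform(0.05, 0.95, n_snps)          # candidate source A
b = rng.uniform(0.05, 0.95, n_snps)          # candidate source B
admixed = 0.5 * a + 0.5 * b                  # target built as a 50/50 mixture
unrelated = rng.uniform(0.05, 0.95, n_snps)  # target independent of A and B

z_admixed = f3_zscore(admixed, a, b)      # strongly negative: supports admixture
z_unrelated = f3_zscore(unrelated, a, b)  # positive: no support for admixture
```

For the mixed target every per-SNP term equals -0.25(a - b)^2, so the statistic is negative by construction, which corresponds to the yellow-to-red end of the heatmap scale.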
Figure 5. Genome-wide Signatures of Selection in the Lugbara and Basoga
Manhattan plot showing SNPs with extreme absolute iHS values (|iHS| > 3.0) in the Lugbara (UNL, blue) and Basoga (UBB, red) populations.
The Top 20% of Protein-Coding Genes with Strongest Signatures of Selection in the Lugbara Population
| Chr | Genes | Associated Effect |
| --- | --- | --- |
| 1 | | (a) myeloid leukemia, |
| 2 | | (c) diabetes |
| 3 | | (d) anemia |
| 4 | | (e) schizophrenia |
| 5 | | (f) tuberculosis, (g) HIV |
| 6 | | (h) HIV, (h) tuberculosis, (h) diabetes |
| 7 | | (i) odor perception |
| 8 | | |
| 9 | | (j) deafness |
| 10 | | (k) stroke |
| 11 | | (l) autophagy |
| 12 | | (m) neuroblastoma |
| 13 | | |
| 14 | | |
| 15 | | (o) ichthyosis, (p) SLE |
| 16 | | (q) kidney disease |
| 17 | | |
| 18 | | |
| 19 | | |
| 20 | | |
| 21 | | |
| 22 | | (r) pathogen immunity |
Genes were extracted from the protein-coding genes in the top 1% of 100 kb iHS windows (Table S8), with each gene having a mean iHS > 3.0 in the Lugbara population. Genes in bold also show evidence of selection in the Basoga population. Genes with superscripts are associated with the phenotype listed in the “Associated Effect” column.
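The windowing step described in this note can be sketched as follows: SNPs are binned into 100 kb windows, the iHS values in each window are averaged, and only windows in the top 1% that also exceed a mean of 3.0 are kept. This is a hedged illustration under stated assumptions: it handles a single chromosome, averages absolute iHS (the note says only "mean iHS"), skips the gene annotation step, and uses simulated data with one planted signal. The function name `top_ihs_windows` is hypothetical.

```python
import numpy as np

def top_ihs_windows(pos, ihs, window=100_000, top_frac=0.01, min_mean=3.0):
    """Return (window_start, mean_abs_ihs) for windows that are in the top
    fraction by mean |iHS| and whose mean |iHS| exceeds min_mean."""
    pos = np.asarray(pos)
    abs_ihs = np.abs(np.asarray(ihs, dtype=float))
    idx = pos // window                              # window index of each SNP
    uniq, inverse = np.unique(idx, return_inverse=True)
    sums = np.bincount(inverse, weights=abs_ihs)     # per-window sum of |iHS|
    counts = np.bincount(inverse)                    # per-window SNP count
    means = sums / counts
    cutoff = np.quantile(means, 1.0 - top_frac)      # top-fraction threshold
    keep = (means >= cutoff) & (means > min_mean)
    return [(int(u * window), float(m)) for u, m in zip(uniq[keep], means[keep])]

# Toy data: 200 windows of 10 SNPs each, with a planted signal in window 42.
rng = np.random.default_rng(1)
n_windows, snps_per_window = 200, 10
pos = (np.repeat(np.arange(n_windows) * 100_000, snps_per_window)
       + np.tile(np.arange(snps_per_window) * 1_000, n_windows))
ihs = rng.normal(0.0, 1.0, pos.size)
ihs[42 * snps_per_window:43 * snps_per_window] = 4.0

hits = top_ihs_windows(pos, ihs)
```

Applying both the quantile cutoff and the absolute threshold means a window must be extreme relative to the rest of the genome and also extreme in absolute terms.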
Top-Ranked Extreme Signatures that Are Highly Differentiated between the Lugbara and Basoga Populations
| 3 | 3.21 | 3.35 | 0.24 | 48/199 | 2.05 | 0.06 | 4.38 | 61 | |
| 3 | 4.15 | 3.37 | 0.23 | 43/189 | 1.92 | 0.02 | 3.58 | 62 | |
| 11 | 4.14 | 3.31 | 0.23 | 72/312 | 1.73 | 0.03 | 3.88 | 68 | |
| 7 | 4.87 | 3.10 | 0.19 | 51/265 | 2.40 | 0.04 | 2.94 | 70 | |
| 12 | 3.63 | 3.65 | 0.23 | 66/283 | 1.95 | 0.02 | 3.02 | 77 | |
| 5 | 4.31 | 3.08 | 0.21 | 61/291 | 1.84 | 0.02 | 4.60 | 88 | |
| 5 | 3.44 | 3.19 | 0.34 | 104/305 | 1.73 | 0.01 | 4.23 | 90 | |
| 3 | 4.04 | 3.07 | 0.27 | 57/208 | 0.36 | 0.05 | 3.57 | 91 | |
| 11 | 4.14 | 3.26 | 0.23 | 72/312 | 1.45 | 0.02 | 2.32 | 95 | |
| 5 | 3.50 | 3.04 | 0.17 | 38/218 | 2.34 | 0.05 | 3.42 | 101 | |
| 3 | 4.15 | 3.98 | 0.23 | 43/189 | 1.03 | 0.01 | 1.69 | 105 | |
| 3 | 3.21 | 3.00 | 0.24 | 48/199 | 2.05 | 0.08 | 2.67 | 106 | |
| 10 | 4.43 | 3.04 | 0.17 | 68/404 | 2.50 | 0.02 | 1.19 | 108 | |
| 2 | 3.70 | 3.21 | 0.17 | 48/279 | 1.82 | 0.01 | 3.32 | 111 | |
| 10 | 3.68 | 3.15 | 0.16 | 55/337 | 1.76 | 0.02 | 3.03 | 111 | |
| 1 | 3.80 | 3.18 | 0.15 | 21/136 | 1.61 | 0.01 | 4.17 | 113 | |
| 22 | 4.99 | 3.35 | 0.23 | 45/200 | 0.88 | 0.00 | 1.26 | 115 | |
| 14 | 3.23 | 3.34 | 0.15 | 38/262 | 2.30 | 0.02 | 2.53 | 117 | |
| 10 | 3.68 | 3.03 | 0.16 | 55/337 | 3.46 | 0.04 | 1.86 | 119 | |
| 3 | 3.57 | 3.17 | 0.26 | 43/165 | 0.12 | 0.03 | 1.79 | 122 |
Genes were ranked separately by xpEHH, FST, and Tajima D, and the overall rank score was obtained by summing the three ranks.
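A minimal sketch of this rank-sum scoring is given below. The direction of each ranking is an assumption rather than something stated in the note: larger xpEHH and FST values and more negative Tajima D values are treated as stronger signals, so a lower summed score marks a stronger candidate. Gene names and values are hypothetical, and ties are broken by input order.

```python
import numpy as np

def ranks(values, descending=True):
    """Rank 1 = strongest signal; ties are broken by input order."""
    v = np.asarray(values, dtype=float)
    order = np.argsort(-v if descending else v, kind="stable")
    r = np.empty(v.size, dtype=int)
    r[order] = np.arange(1, v.size + 1)
    return r

def rank_score(xpehh, fst, tajima_d):
    """Sum of the three per-metric ranks (lower = stronger combined signal)."""
    # Assumption: Tajima D is ranked ascending, most negative first.
    return ranks(xpehh) + ranks(fst) + ranks(tajima_d, descending=False)

# Hypothetical genes and values, for illustration only.
genes = ["geneA", "geneB", "geneC", "geneD"]
xpehh = [4.1, 3.2, 3.9, 3.0]
fst = [0.24, 0.15, 0.30, 0.10]
tajima_d = [-1.8, -0.5, -2.1, 0.3]

scores = rank_score(xpehh, fst, tajima_d)
ranked = sorted(zip(genes, scores.tolist()), key=lambda g: g[1])
# ranked → [("geneC", 4), ("geneA", 5), ("geneB", 9), ("geneD", 12)]
```

Summing ranks rather than raw values keeps the three metrics on a common scale, so no single statistic dominates the combined score.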
Figure 6. Signatures of Selection Unique to the Uganda Nilotic Lugbara Population
Evidence (iHS, xpEHH, and Tajima D) for differential selection signatures between the Lugbara (UNL) and Basoga (UBB) at (A) the UVRAG locus on chromosome 11 and (B) the NEK4 locus on chromosome 3.