Eliška Podgorná1,2, Issa Diallo3, Christelle Vangenot4, Alicia Sanchez-Mazas5, Audrey Sabbagh6, Viktor Černý7, Estella S Poloni8.
Abstract
BACKGROUND: Dietary changes associated with shifts in subsistence strategies during human evolution may have induced new selective pressures on phenotypes, as is currently held for lactase persistence. Similar hypotheses exist for arylamine N-acetyltransferase 2 (NAT2)-mediated acetylation capacity, a well-known pharmacogenetic trait with wide inter-individual variation explained by polymorphisms in the NAT2 gene. The environmental causative factor (if any) driving its evolution is as yet unknown, but significant differences in the prevalence of acetylation phenotypes are found between hunter-gatherer and food-producing populations, both in sub-Saharan Africa and worldwide, and between agriculturalists and pastoralists in Central Asia. These two subsistence strategies also prevail among sympatric populations of the African Sahel, but knowledge of NAT2 variation among African pastoral nomads has so far been very scarce. Here we addressed the hypothesis that different selective pressures associated with the agriculturalist or pastoralist lifestyles have acted on the evolution of NAT2 by sequencing the gene in 287 individuals from five pastoralist and one agriculturalist Sahelian populations.
Year: 2015 PMID: 26620671 PMCID: PMC4665893 DOI: 10.1186/s12862-015-0543-6
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
African population samples studied for NAT2 sequence variation
| Population | Code | Sample size | Geographical region (Country) | Living in dry savanna biome1 | Linguistic affiliation | Subsistence mode | Reference |
|---|---|---|---|---|---|---|---|
| Egyptians | EGY | 10 | Northern (Egypt) | no | Afro-Asiatic | Agricultural | [ |
| Mandenka | MAN | 97 | Western (Senegal) | yes | Niger-Congo | Agricultural | [ |
| Fulani Banfora | FBAN | 49 | Western (Burkina Faso) | yes | Niger-Congo | Pastoralist | this study |
| Fulani Tindangou | FTIN | 50 | Western (Burkina Faso) | yes | Niger-Congo | Pastoralist | this study |
| Fulani Ader | FADE | 48 | Western (Niger) | yes | Niger-Congo | Pastoralist | this study |
| Yoruba in Ibadan | YRI | 88 | Western (Nigeria) | no | Niger-Congo | Agricultural | [ |
| Yoruba Bantus | YOR | 31 | Western (Nigeria) | no | Niger-Congo | Agricultural | [ |
| Yoruba | YRB | 18 | Western (Nigeria) | no | Niger-Congo | Agricultural | [ |
| Yoruba CEPH | YO | 12 | Western (Nigeria) | no | Niger-Congo | Agricultural | [ |
| Ibo | IBO | 19 | Western (Nigeria) | no | Niger-Congo | Agricultural | [ |
| Hausa | HAU | 17 | Western (Nigeria) | yes | Afro-Asiatic | Agricultural | [ |
| Kanembou | KANE | 49 | Central (Chad) | yes | Nilo-Saharan | Agricultural | this study |
| Daza | DAZ | 41 | Central (Chad) | yes | Nilo-Saharan | Pastoralist | this study |
| Fulani Bongor | FBON | 50 | Central (Chad) | yes | Niger-Congo | Pastoralist | this study |
| Fulani | FU | 13 | Central (Cameroon) | yes | Niger-Congo | Pastoralist | [ |
| Kanuri | KN | 12 | Central (Cameroon) | yes | Nilo-Saharan | Agricultural | [ |
| Mada | MD | 14 | Central (Cameroon) | yes | Afro-Asiatic | Agricultural | [ |
| Ngumba Bantus | NGU | 16 | Central (Cameroon) | no | Niger-Congo | Agricultural | [ |
| Lemande | LM | 14 | Central (Cameroon) | no | Niger-Congo | Agricultural | [ |
| Bakola Pygmy | PYG | 26 | Central (Cameroon) | no | Niger-Congo | Hunter-gatherer | [ |
| Bedzan Pygmy | BEZ | 32 | Central (Cameroon) | no | Niger-Congo | Hunter-gatherer | [ |
| Baka Pygmy Cameroon | BAKC | 31 | Central (Cameroon) | no | Niger-Congo | Hunter-gatherer | [ |
| Baka Pygmy Gabon | BAKG | 16 | Central (Gabon) | no | Niger-Congo | Hunter-gatherer | [ |
| Akele Bantus Gabon | GAB | 26 | Central (Gabon) | no | Niger-Congo | Agricultural | [ |
| Biaka Pygmy | BIA | 24 | Central (C. A. R.) | no | Niger-Congo | Hunter-gatherer | [ |
| Mbuti Pygmy | MBU | 24 | Central (D. R. C.) | no | Nilo-Saharan | Hunter-gatherer | [ |
| Dinka | DN | 13 | Eastern (South Sudan) | yes | Nilo-Saharan | Pastoralist | [ |
| Luhya in Webuye | LWK | 97 | Eastern (Kenya) | yes | Niger-Congo | Agricultural | [ |
| Maasai | MAS | 12 | Eastern (Kenya) | yes | Nilo-Saharan | Pastoralist | [ |
| Luo | LUO | 14 | Eastern (Kenya) | yes | Nilo-Saharan | Pastoralist | [ |
| Somali | SOM | 20 | Eastern (Somalia) | yes | Afro-Asiatic | Pastoralist | [ |
| Turu | TR | 15 | Eastern (Tanzania) | yes | Niger-Congo | Agro-pastoralist | [ |
| Hadza | HZ | 14 | Eastern (Tanzania) | yes | Khoisan | Hunter-gatherer | [ |
| Sandawe | SW | 18 | Eastern (Tanzania) | yes | Khoisan | Hunter-gatherer | [ |
| Burunge | BG | 17 | Eastern (Tanzania) | yes | Afro-Asiatic | Agro-pastoralist | [ |
| Chagga Bantus | CHA | 32 | Eastern (Tanzania) | yes | Niger-Congo | Agricultural | [ |
| Maasai | MS | 14 | Eastern (Tanzania) | yes | Nilo-Saharan | Pastoralist | [ |
| San | SAN | 38 | Eastern (Zimbabwe) | yes | Khoisan | Hunter-gatherer | [ |
| African Americans2 | ASW | 61 | ND3 | ND3 | ND3 | ND3 | [ |
1 Classification according to climatic zone and biome (ecoregion), based on [99]
2 African Americans: Americans of African ancestry in Southwestern US
3 ND: not defined
Fig. 1 Schematic diagram of the NAT2 locus on 8p22, including the non-coding 100 bp-long Exon 1, the 8.6 Kb-long intronic region, and the 870 bp-long single protein-coding Exon 2, adapted from [102] and [77]. The first (+1) and last (+873) positions of the open reading frame (ORF) of Exon 2 are indicated below it, as well as the relative positions of Exon 1 and of the polyadenylation signals. The positions of the 15 polymorphic sites (SNPs) observed among the 287 individuals from the six Sahelian samples sequenced in this study are shown as heavy-black (non-synonymous mutations) or light-gray (synonymous mutations) vertical bars below the diagram. Segments link the SNP positions to the list of 21 haplotypes inferred from the combination of the 15 SNPs. The acetylation activity associated with each haplotype (taken from the official NAT2 gene nomenclature, http://nat.mbg.duth.gr/) and its average frequency among the six Sahelian samples are shown on the left and right sides of the list, respectively. A diagram displaying the positions of the 30 SNPs observed in protein-coding Exon 2 among the 39 African population samples (1,192 individuals) analyzed in this study is shown in Additional file 3: Figure S2
Haplotype frequencies and molecular diversity of the six Sahelian samples in a 1,396 bp sequence encompassing the NAT2 coding exon
| Haplotype | Acetylation activity2 | Population1: FBAN | FTIN | FADE | FBON | DAZ | KANE | Total |
|---|---|---|---|---|---|---|---|---|
|
| fast | 2 | 4 | 6 | 8 | 7 | 2 | 29 |
|
| fast | 1 | 1 | |||||
|
| fast | 6 | 5 | 4 | 11 | 6 | 32 | |
|
| fast | 11 | 10 | 8 | 7 | 2 | 2 | 40 |
|
| slow | 1 | 1 | 3 | 5 | |||
|
| slow | 39 | 46 | 42 | 47 | 28 | 40 | 242 |
|
| slow | 2 | 1 | 1 | 4 | |||
|
| slow | 3 | 4 | 6 | 2 | 1 | 3 | 19 |
|
| slow | 32 | 17 | 20 | 23 | 23 | 25 | 140 |
|
| slow | 1 | 1 | |||||
|
| slow | 1 | 1 | |||||
|
| slow | 1 | 2 | 2 | 2 | 7 | ||
|
| slow | 2 | 4 | 7 | 1 | 14 | ||
|
| slow | 3 | 5 | 2 | 2 | 1 | 5 | 18 |
|
| slow | 1 | 1 | |||||
|
| unknown | 1 | 1 | |||||
|
| unknown | 1 | 1 | 2 | ||||
|
| unknown | 2 | 1 | 4 | 7 | 14 | ||
|
| unknown | 1 | 1 | |||||
|
| unknown | 1 | 1 | |||||
|
| unknown | 1 | 1 | |||||
| Total (2N chromosomes) | 98 | 100 | 96 | 100 | 82 | 98 | 574 | |
| Number of haplotypes (k) | 8 | 13 | 10 | 13 | 12 | 13 | ||
| Number of segregating sites (S) | 6 | 9 | 9 | 10 | 10 | 11 | ||
| Gene diversity (expected heterozygosity, h) | 0.72 | 0.75 | 0.75 | 0.72 | 0.78 | 0.76 | ||
| Nucleotide diversity (π) x 10−3 | 1.81 | 1.79 | 1.85 | 1.78 | 1.83 | 1.93 | ||
| Tajima’s D (P-value5) | | 1.09 (0.875) | 1.18 (0.891) | 0.73 (0.801) | 0.72 (0.799) | 0.68 (0.875) | ||
1 Population codes as in Table 1
2 Reported activity in the official NAT2 gene nomenclature (nat.mbg.duth.gr)
3 Small caps alphabetical suffixes were added to the names of haplotypes that differ from known haplotypes in the flanking region of the NAT2 coding exon (see text)
4 New haplotypes submitted to the official NAT2 gene nomenclature and included in it (see text)
5 P-value associated with Tajima’s D test for departure from selective neutrality: it is given as the proportion of random D values generated under the neutral equilibrium model that are smaller than, or equal to the observed value. The sole significant result is shown in bold; it corresponds to a type I error rate of 0.006, and it remains significant after Bonferroni correction for multiple testing
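The summary rows of this table can be approximately recomputed from the quantities it reports. The following is a minimal sketch, not necessarily the authors' pipeline: it assumes the usual unbiased estimator of gene diversity, h = n/(n−1) · (1 − Σp²), and the standard Tajima (1989) formulas for D. The function names `gene_diversity` and `tajimas_d` are hypothetical, and the inputs are taken from the Total column and from the FTIN column of the table.

```python
import math

def gene_diversity(counts):
    """Unbiased expected heterozygosity: h = n/(n-1) * (1 - sum(p_i^2))."""
    n = sum(counts)
    sum_p2 = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_p2)

def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D from the number of chromosomes n, the number of
    segregating sites S, and the mean number of pairwise differences
    (per-site nucleotide diversity multiplied by sequence length)."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    s = seg_sites
    variance = e1 * s + e2 * s * (s - 1)
    # Difference between the two estimators of theta, scaled by its std. dev.
    return (mean_pairwise_diff - s / a1) / math.sqrt(variance)

# Total column of the haplotype table (574 chromosomes, six samples pooled)
pooled_counts = [29, 1, 32, 40, 5, 242, 4, 19, 140, 1, 1,
                 7, 14, 18, 1, 1, 2, 14, 1, 1, 1]
h_pooled = gene_diversity(pooled_counts)

# FTIN column: 2N = 100, S = 9, pi = 1.79e-3 per site over 1,396 bp
d_ftin = tajimas_d(100, 9, 1.79e-3 * 1396)

print(f"pooled h = {h_pooled:.2f}, Tajima's D from FTIN inputs = {d_ftin:.2f}")
```

With these inputs the pooled gene diversity comes out at about 0.75, in line with the per-sample values of 0.72 to 0.78, and D at about 1.09. Small discrepancies are expected because π is reported to three significant digits only.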
Analysis of molecular variance (AMOVA) under four criteria of classification of populations, based on the 870 bp long NAT2 coding exon
| Dataset1 | Categories grouping2 | Number of population samples | Number of groups | Percentage of variation between groups | Percentage of variation between populations within groups | Fixation index ΦCT | P-value3 | Fixation index ΦSC | P-value4 |
|---|---|---|---|---|---|---|---|---|---|
| AFR | Geo5 | 37 | 3 | −0.01⁶ | 3.23 | −0.00012⁶ | 0.3722 | 0.03230 | |
| | Lang | 38 | 4 | 0.43 | 3.04 | 0.00425 | 0.1323 | 0.03053 | |
| | Subsist | 38 | 4 | 1.82 | 2.02 | 0.01819 | | 0.02055 | |
| | Clim | 38 | 2 | 2.35 | 2.11 | 0.02345 | | 0.02161 | |
| FP | Geo5 | 28 | 3 | −0.04⁶ | 2.21 | −0.00037⁶ | 0.4572 | 0.02207 | |
| | Lang | 29 | 3 | −0.21⁶ | 2.30 | −0.00207⁶ | 0.6201 | 0.02297 | |
| | Subsist | 29 | 3 | 1.16 | 1.58 | 0.01156 | | 0.01600 | |
| | Clim | 29 | 2 | 3.57 | 0.71 | 0.03572 | | 0.00739 | |
| FPLS | Geo | 13 | 3 | −0.53⁶ | 2.99 | −0.00527⁶ | 0.7238 | 0.02974 | |
| | Lang7 | 12 | 2 | −0.45⁶ | 2.84 | −0.00447⁶ | 0.5612 | 0.02832 | |
| | Subsist | 13 | 2 | 1.20 | 1.99 | 0.01202 | 0.0530 | 0.02016 | |
| | Clim | 13 | 2 | 5.40 | 0.54 | 0.05403 | | 0.00568 | |
1 As described in Methods, the three population data subsets are AFR: 38 samples, excluding the Americans of African ancestry (ASW of [53], see Table 1); FP: 29 samples of African food-producing populations; FPLS: 13 samples of African food-producing populations with sample size ≥ 20 individuals. Thus, the ASW sample was not considered in any of the AMOVA analyses
2 Categories as in Table 1: classification according to geographical region (Geo), subsistence mode (Subsist), linguistic affiliation (Lang), and ecoregion (Clim), namely climatic zone and biome, which defines the fourth categorization criterion that considers whether populations live within the dry savanna biome or outside of it
3 Significance of the ΦCT index and of the corresponding percentage of variation due to differences between groups. Significant P-values (i.e., <5 %) are shown in bold, and adjusted P-values after Bonferroni correction for multiple testing (here, four tests) are provided in brackets
4 Significance of the ΦSC index and of the corresponding percentage of variation due to differences between populations within groups. Significant P-values (i.e., <5%) are shown in bold, and adjusted P-values after Bonferroni correction for multiple testing (here, four tests) are provided in brackets
5 Only three geographical regions are considered here (Western, Central and Eastern, Table 1) because the fourth region (Northern) is represented by one population sample only (EGY)
6 Because variance components in AMOVA are actually defined as covariances, negative values can occur [95]. A negative ΦCT value would be expected if gene copies were more correlated between groups than between populations within groups. However, none of the negative ΦCT values in the table is statistically significant, indicating that they do not differ significantly from zero
7 Only two linguistic families are considered here (Niger-Congo and Nilo-Saharan, Table 1) because the third family (Afro-Asiatic) is represented by one population sample only (SOM)
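The percentages of variation and the fixation indexes in this table are two views of the same variance components. As a hedged illustration, assuming the standard hierarchical AMOVA definitions (ΦCT = σa²/σT², ΦSC = σb²/(σb² + σc²), ΦST = (σa² + σb²)/σT², with σc² the within-population component), the indexes can be recovered from the two reported percentages. The function name `phi_from_percentages` is hypothetical.

```python
def phi_from_percentages(pct_between_groups, pct_between_pops_within_groups):
    """Derive AMOVA fixation indexes from percentages of total variance.

    The within-population percentage is not listed in the table; it is
    obtained here as the remainder to 100 %.
    """
    a = pct_between_groups
    b = pct_between_pops_within_groups
    c = 100.0 - a - b              # within populations
    phi_ct = a / 100.0             # groups relative to total
    phi_sc = b / (b + c)           # populations relative to their group
    phi_st = (a + b) / 100.0       # populations relative to total
    return phi_ct, phi_sc, phi_st

# AFR dataset, Clim grouping: 2.35 % between groups, 2.11 % within groups
phi_ct, phi_sc, phi_st = phi_from_percentages(2.35, 2.11)
print(round(phi_ct, 5), round(phi_sc, 5), round(phi_st, 5))
```

For the AFR/Clim row this gives ΦCT ≈ 0.0235 and ΦSC ≈ 0.0216, matching the tabulated 0.02345 and 0.02161 up to rounding of the percentages. The three indexes also satisfy (1 − ΦST) = (1 − ΦSC)(1 − ΦCT).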
Fig. 2 MDS plot of pairwise Reynolds genetic distances between the 38 populations of the AFR dataset. The Stress value is 0.071. The same plot is reproduced 4 times, with populations color-coded according to: (a) geographical region, (b) linguistic affiliation, (c) subsistence mode, and (d) biome (see text)
Fig. 3 Map showing the frequency distributions of predicted NAT2 phenotypes in African populations screened for sequence variation in the coding exon (pie charts are proportional to sample size). Map created with the QGis open source software [98], with climatic zones defined according to [99]
Fig. 4 Boxplots of predicted prevalence of slow acetylators in populations classified according to subsistence mode and further to ecoregion. Each box extends from the first to the third quartile and displays the median (thick line), and the dispersion of populations’ individual values (filled circles) is shown by dashed lines. (a) Thirty-eight African populations (AFR dataset); (b) same dataset crossed with biome information, i.e., separating populations living in the seasonally dry zones (Sahel, Savanna) from those living in the humid tropical and equatorial zones; (c) same cross-analysis as in (b) but excluding population samples of less than 20 individuals
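The predicted phenotypes shown in Figs. 3 and 4 rest on the acetylation activity assigned to each haplotype. The sketch below shows one way such a prediction could be made, assuming the common convention that two slow haplotypes give a slow acetylator, one gives an intermediate, and none gives a rapid acetylator. The haplotype labels `H1` to `H4`, the handling of haplotypes of unknown activity, and the function names are all hypothetical and are not taken from the study.

```python
from collections import Counter

# Illustrative activity lookup; labels are placeholders, not real haplotype names
ACTIVITY = {"H1": "fast", "H2": "slow", "H3": "slow", "H4": "unknown"}

def predict_phenotype(hap_a, hap_b, activity=ACTIVITY):
    """Predict an acetylation phenotype from a pair of haplotypes."""
    acts = (activity[hap_a], activity[hap_b])
    if "unknown" in acts:
        return "undetermined"      # assumption: leave such genotypes uncalled
    n_fast = acts.count("fast")
    return {0: "slow", 1: "intermediate", 2: "rapid"}[n_fast]

def slow_prevalence(diplotypes, activity=ACTIVITY):
    """Proportion of slow acetylators among individuals with a called phenotype."""
    phenos = Counter(predict_phenotype(a, b, activity) for a, b in diplotypes)
    called = sum(n for p, n in phenos.items() if p != "undetermined")
    return phenos["slow"] / called if called else float("nan")

# Made-up sample of five individuals (one carries an unknown-activity haplotype)
sample = [("H2", "H3"), ("H1", "H2"), ("H2", "H2"), ("H1", "H1"), ("H3", "H4")]
print(slow_prevalence(sample))
```

In this made-up sample two of the four individuals with a called phenotype are slow acetylators, so the function returns 0.5.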
Kruskal-Wallis test for equality of frequency of the slow acetylation phenotype across geographical regions, linguistic families, subsistence modes, and climatic zones
| Dataset1 | Categories grouping2 | Number of populations | Number of groups | Kruskal-Wallis H statistic | P-value3 |
|---|---|---|---|---|---|
| AFR | Geo4 | 37 | 3 | 2.38 | 0.3039 |
| | Lang | 38 | 4 | 7.13 | 0.0680 |
| | Subsist | 38 | 4 | 15.62 | |
| | Clim | 38 | 2 | 13.17 | |
| FP5 | Geo4 | 28 | 3 | 2.52 | 0.2843 |
| | Lang | 29 | 3 | 3.61 | 0.1649 |
| | Subsist | 29 | 3 | 8.25 | |
| | Clim | 29 | 2 | 7.74 | |
| FPLS | Geo | 13 | 3 | 0.33 | 0.8480 |
| | Lang5 | 12 | 2 | 1.15 | 0.2827 |
| | Subsist | 13 | 2 | 5.90 | |
| | Clim | 13 | 2 | 6.43 | |
1 As described in Methods, the three population data subsets are AFR: 38 samples, excluding the 1KG Americans of African ancestry (ASW, see Table 1); FP: 29 samples of African food-producing populations; FPLS: 13 samples of African food-producing populations with sample size ≥ 20 individuals. Thus, the ASW sample was not considered in any of the Kruskal-Wallis tests
2 Categories as in Table 1: classification according to geographical region (Geo), subsistence mode (Subsist), linguistic affiliation (Lang), and ecoregion (Clim), namely climatic zone and biome, which defines the fourth categorization criterion that considers whether populations live within the dry savanna biome or outside of it
3 Significant P-values (i.e., <5 %) are shown in bold, and adjusted P-values after Bonferroni correction for multiple testing (here, four tests) are provided in brackets
4 Only three geographical regions are considered here (Western, Central and Eastern, Table 1) because the fourth region (Northern) is represented by one population sample only (EGY)
5 Only two linguistic families are considered here (Niger-Congo and Nilo-Saharan, Table 1) because the third family (Afro-Asiatic) is represented by one population sample only (SOM)
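To make the test in this table concrete, here is a minimal pure-Python sketch of the Kruskal-Wallis H statistic, its P-value and a Bonferroni adjustment. It assumes that P-values come from the chi-square approximation with (number of groups − 1) degrees of freedom, which may differ from the procedure the authors used, and it applies no tie correction. All function names and the per-population frequencies are hypothetical; the population-level frequencies behind the table are not listed here.

```python
import math

def kruskal_wallis_h(groups):
    """H statistic using average ranks for ties (no tie correction)."""
    pooled = sorted(v for g in groups for v in g)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2.0   # mean of ranks i+1 .. j
        i = j
    n = len(pooled)
    total = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * terms
    extra = sum(half ** (i - 0.5) / math.gamma(i + 0.5)
                for i in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * extra

def bonferroni(p, n_tests):
    """Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p * n_tests)

# Made-up slow-acetylator frequencies for two hypothetical groups of populations
groups = [[0.45, 0.50, 0.48], [0.60, 0.65, 0.58, 0.62]]
h = kruskal_wallis_h(groups)
p = chi2_sf(h, df=len(groups) - 1)
print(f"H = {h:.2f}, P = {p:.4f}, Bonferroni (4 tests) = {bonferroni(p, 4):.4f}")

# Check against tabulated rows: H = 2.38 with 3 groups, H = 7.13 with 4 groups
print(round(chi2_sf(2.38, 2), 4), round(chi2_sf(7.13, 3), 4))
```

Under this approximation the rounded H values of the AFR Geo and Lang rows give P-values within about 0.001 of the reported 0.3039 and 0.0680, which suggests, but does not prove, that the tabulated P-values were obtained in a similar way.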