Audrey Sabbagh, André Langaney, Pierre Darlu, Nathalie Gérard, Rajagopal Krishnamoorthy, Estella S Poloni.
Abstract
BACKGROUND: The N-acetyltransferase 2 (NAT2) gene plays a crucial role in the metabolism of many drugs and xenobiotics. As it represents a likely target of population-specific selection pressures, we fully sequenced the NAT2 coding region in 97 Mandenka individuals from Senegal, and compared these sequences to extant data on other African populations. The Mandenka data were further included in a worldwide dataset composed of 41 published population samples (6,727 individuals) from four continental regions that were adequately genotyped for all common NAT2 variants so as to provide further insights into the worldwide haplotype diversity and population structure at NAT2.
Year: 2008 PMID: 18304320 PMCID: PMC2292740 DOI: 10.1186/1471-2156-9-21
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Table 1. SNP and haplotype frequencies in the Mandenka sample.
| Nucleotide change^a | G191A | C282T | T341C | C345T^d | C403G^c | C481T | G590A | G609T^c | C638T^d | A803G | G838A^c | G857A | |
| Amino acid change | R64Q | None | I114T | None | L135V | None | R197Q | E203D | P213L | K268R | V280M | G286E | |
| SNP frequency | 0.1031 | 0.3041 | 0.3608 | 0.0258 | 0.0206 | 0.3454 | 0.1701 | 0.0051 | 0.0103 | 0.5052 | 0.0051 | 0.0722 | |
| Haplotype^b | | | | | | | | | | | | | Haplotype frequency |
| | G | C | T | C | C | C | G | G | C | A | G | G | 0.0928 |
| | . | . | C | . | . | T | . | . | . | . | . | . | 0.0103 |
| | . | . | C | . | . | T | . | . | . | G | . | . | 0.3300 |
| | . | . | C | . | . | . | . | . | . | G | . | . | 0.0155 |
| | . | T | . | . | . | . | A | . | . | . | . | . | 0.1289 |
| | . | T | . | . | . | . | . | . | . | . | . | A | 0.0670 |
| | . | . | . | . | . | . | . | . | . | G | . | . | 0.1289 |
| | . | T | . | . | . | . | . | . | . | . | . | . | 0.0515 |
| | A | . | . | . | . | . | . | . | . | . | . | . | 0.0876 |
| NAT2*14B | A | T | . | . | . | . | . | . | . | . | . | . | 0.0155 |
| | . | . | . | . | . | . | . | T | . | G | . | . | 0.0052 |
| | . | T | . | . | . | . | A | . | . | . | . | A | 0.0052 |
| | . | . | . | . | G | . | . | . | . | G | . | . | 0.0206 |
| | . | . | C | . | . | T | . | . | . | G | A | . | 0.0052 |
| | . | T | . | . | . | . | A | . | T | . | . | . | 0.0103 |
| | . | T | . | T | . | . | A | . | . | . | . | . | 0.0258 |
a Polymorphic sites were numbered considering +1 as the A of the translation start codon in the cDNA sequence (GenBank CR407631)
b NAT2 haplotypes were named in accordance with the consensus gene nomenclature of human NAT2 alleles [65]
c These SNPs/haplotypes have been recently described in Patin et al. [10]
d Newly reported SNPs/haplotypes in the present study
e NAT2*4 is the reference allele that was shown to be the ancestral haplotype through outgroup comparisons with the chimpanzee and rhesus monkey sequences
f Total number of chromosomes in the Mandenka sample
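The SNP frequencies in the table above should equal, for each site, the summed frequencies of the haplotypes that carry the non-reference allele. The following is a minimal Python sketch of that consistency check, not the authors' procedure. The names `SITES`, `REPORTED`, `HAPLOTYPES`, and `snp_frequencies` are illustrative, and the allele strings are transcribed from the table rows, with `.` meaning the same allele as the first row.

```python
# Site order follows the header row of the SNP/haplotype table.
SITES = ["G191A", "C282T", "T341C", "C345T", "C403G", "C481T",
         "G590A", "G609T", "C638T", "A803G", "G838A", "G857A"]

# SNP frequencies as reported in the table, in the same order.
REPORTED = [0.1031, 0.3041, 0.3608, 0.0258, 0.0206, 0.3454,
            0.1701, 0.0051, 0.0103, 0.5052, 0.0051, 0.0722]

# (alleles, frequency) per haplotype row; "." = same allele as the first row.
HAPLOTYPES = [
    ("GCTCCCGGCAGG", 0.0928),
    ("..C..T......", 0.0103),
    ("..C..T...G..", 0.3300),
    ("..C......G..", 0.0155),
    (".T....A.....", 0.1289),
    (".T.........A", 0.0670),
    (".........G..", 0.1289),
    (".T..........", 0.0515),
    ("A...........", 0.0876),
    ("AT..........", 0.0155),
    (".......T.G..", 0.0052),
    (".T....A....A", 0.0052),
    ("....G....G..", 0.0206),
    ("..C..T...GA.", 0.0052),
    (".T....A.T...", 0.0103),
    (".T.T..A.....", 0.0258),
]


def snp_frequencies(haplotypes):
    """For each site, sum the frequencies of haplotypes with a non-reference allele."""
    reference = haplotypes[0][0]
    totals = [0.0] * len(reference)
    for alleles, freq in haplotypes:
        for i, allele in enumerate(alleles):
            if allele != "." and allele != reference[i]:
                totals[i] += freq
    return totals


if __name__ == "__main__":
    for site, got, want in zip(SITES, snp_frequencies(HAPLOTYPES), REPORTED):
        print(f"{site}: computed {got:.4f}, reported {want:.4f}")
```

Small differences in the fourth decimal place are expected, since the haplotype frequencies in the table are themselves rounded.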
Table 2. Summary statistics and coalescence estimates for NAT2 sequence data.
| Sample | Haplotype diversity^a | Nucleotide diversity^b (× 10⁻²) | Watterson's estimator^c (× 10⁻²) | Effective population size^d | TMRCA^e (thousand years) |
| Mandenka (1188 bp) | 0.837 | 0.218 | 0.173 | 10,932 | 799 ± 267 |
| Mandenka (870 bp) | 0.837 | 0.298 | 0.238 | 15,658 | 1,145 ± 382 |
| 12 African populations^f,g (870 bp) | | | | | |
| Mean | 0.806 | 0.265 | 0.220 | 14,098 | 1,071 ± 372 |
| Standard deviation | ± 0.061 | ± 0.038 | ± 0.034 | ± 2,733 | |
| Minimum | 0.678 | 0.181 | 0.156 | 9,173 | 947 ± 332 |
| Maximum | 0.879 | 0.302 | 0.271 | 19,903 | 1,297 ± 418 |
a Haplotype diversity
b Nucleotide diversity per bp (× 10⁻²) with Jukes and Cantor's correction
c Watterson's estimator of the population mutation parameter per bp (× 10⁻²)
d Effective population size
e Time to the most recent common ancestor (TMRCA) of all sampled genes (in thousand years)
f Sequence data from Patin et al. [10]
g Mean (with standard deviation), minimum, and maximum values of the statistics computed for each of the twelve sub-Saharan African samples of Patin et al. [10]
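Two of the Mandenka summary statistics can be approximated from the haplotype frequencies listed earlier. The sketch below is a minimal Python illustration under stated assumptions, not the authors' pipeline: haplotype diversity is taken to be Nei's unbiased estimator, n/(n-1) × (1 - Σp²), and Watterson's estimator is S / a_n divided by the sequence length, with a_n = Σ 1/i for i = 1..n-1. The sample is assumed to comprise 2 × 97 = 194 chromosomes (diploid individuals) and S = 12 segregating sites (the twelve SNPs reported for the Mandenka). Function and variable names are illustrative.

```python
# Haplotype frequencies transcribed from the Mandenka SNP/haplotype table.
HAP_FREQS = [0.0928, 0.0103, 0.3300, 0.0155, 0.1289, 0.0670, 0.1289, 0.0515,
             0.0876, 0.0155, 0.0052, 0.0052, 0.0206, 0.0052, 0.0103, 0.0258]

N_CHROMOSOMES = 2 * 97   # assumption: 97 diploid individuals
N_SEGREGATING = 12       # assumption: the 12 SNPs listed for the Mandenka sample


def haplotype_diversity(freqs, n_chromosomes):
    """Nei's unbiased gene diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = n_chromosomes
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


def watterson_theta_per_bp(n_segregating, n_chromosomes, length_bp):
    """Watterson's estimator per bp: S / a_n / L, with a_n = sum_{i=1}^{n-1} 1/i."""
    a_n = sum(1.0 / i for i in range(1, n_chromosomes))
    return n_segregating / a_n / length_bp


if __name__ == "__main__":
    hd = haplotype_diversity(HAP_FREQS, N_CHROMOSOMES)
    theta = watterson_theta_per_bp(N_SEGREGATING, N_CHROMOSOMES, 1188)
    print(f"Haplotype diversity: {hd:.3f}")
    print(f"Watterson's estimator per bp (x 10^-2), 1188 bp: {theta * 100:.3f}")
```

Under these assumptions, both values for the 1188 bp Mandenka row come out close to the reported 0.837 and 0.173. The effective population size and TMRCA columns depend on a mutation rate and a coalescent model that are not given here, so they are not reproduced.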
Table 3. NAT2 haplotype^a frequencies in samples of the worldwide genotyping survey^b.
| SNP position^c (ancestral state^d) | ||||||||||||||||||||||
| 191 (G) | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | |||
| 282 (C) | . | . | . | . | . | . | T | T | . | T | . | T | . | . | T | . | T | T | . | T | T | |
| 341 (T) | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | |||||||
| 481 (C) | . | T | T | . | . | . | . | . | . | . | . | . | T | . | . | T | . | T | . | . | . | |
| 590 (G) | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | ||||||
| 803 (A) | . | . | G | G | . | . | . | . | . | G | . | . | . | G | G | G | . | . | . | . | . | |
| 857 (G) | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | ||||
| Population code^h | ||||||||||||||||||||||
| 1 | 27 | 1 | 47 | 17 | 0 | 0 | 0 | 34 | 4 | 0 | 0 | 0 | 0 | 34 | 7 | 1 | 13 | 0 | 6 | 11 | 0 | |
| 2 | 10 | 1 | 37 | 3 | 0 | 0 | 1 | 16 | 0 | 0 | 0 | 2 | 0 | 9 | 2 | 0 | 6 | 0 | 3 | 10 | 0 | |
| 3 | 8 | 0 | 4 | 0 | 0 | 0 | 0 | 11 | 0 | 0 | 0 | 0 | 0 | 39 | 1 | 0 | 13 | 0 | 0 | 4 | 0 | |
| 4 | 4 | 0 | 13 | 5 | 0 | 0 | 0 | 8 | 0 | 0 | 0 | 0 | 0 | 19 | 3 | 0 | 6 | 0 | 0 | 2 | 0 | |
| 6 | 18 | 2 | 65 | 3 | 0 | 0 | 0 | 32 | 0 | 0 | 0 | 13 | 0 | 30 | 0 | 0 | 10 | 0 | 17 | 3 | 1 | |
| 7 | 13 | 0 | 22 | 8 | 0 | 0 | 0 | 32 | 5 | 0 | 2 | 1 | 0 | 7 | 1 | 0 | 4 | 0 | 2 | 3 | 0 | |
| 8 | 3 | 0 | 19 | 0 | 0 | 0 | 0 | 17 | 0 | 0 | 0 | 1 | 0 | 7 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | |
| 9 | 13 | 0 | 45 | 0 | 0 | 0 | 0 | 22 | 0 | 0 | 0 | 3 | 0 | 4 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | |
| 10 | 133 | 1 | 242 | 0 | 0 | 0 | 0 | 127 | 2 | 0 | 0 | 3 | 0 | 6 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | |
| 11 | 20 | 1 | 52 | 1 | 0 | 0 | 0 | 24 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 12 | 22 | 6 | 54 | 2 | 0 | 0 | 0 | 30 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 5 | 0 | 0 | 0 | 0 | |
| 14 | 44 | 8 | 104 | 6 | 0 | 0 | 0 | 54 | 1 | 0 | 0 | 5 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | |
| 15 | 187 | 20 | 318 | 17 | 0 | 0 | 0 | 206 | 0 | 0 | 0 | 15 | 0 | 3 | 0 | 0 | 7 | 0 | 0 | 1 | 0 | |
| 17 | 11 | 6 | 50 | 3 | 0 | 0 | 0 | 28 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 18 | 23 | 3 | 42 | 0 | 0 | 0 | 0 | 15 | 0 | 0 | 0 | 13 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 19 | 383 | 70 | 647 | 68 | 0 | 0 | 0 | 470 | 0 | 0 | 0 | 22 | 0 | 0 | 0 | 0 | 26 | 0 | 0 | 2 | 0 | |
| 21 | 109 | 26 | 164 | 30 | 0 | 0 | 0 | 149 | 0 | 0 | 0 | 17 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 23 | 9 | 0 | 41 | 0 | 0 | 0 | 0 | 29 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 25 | 134 | 12 | 213 | 17 | 0 | 0 | 0 | 184 | 0 | 0 | 0 | 17 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 26 | 140 | 8 | 216 | 29 | 0 | 0 | 0 | 185 | 0 | 0 | 0 | 27 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 27 | 13 | 1 | 29 | 4 | 0 | 0 | 0 | 43 | 0 | 1 | 0 | 6 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 28 | 31 | 0 | 23 | 3 | 0 | 0 | 0 | 30 | 0 | 0 | 0 | 12 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 30 | 132 | 0 | 4 | 0 | 1 | 0 | 0 | 46 | 0 | 0 | 0 | 38 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | |
| 32 | 46 | 0 | 6 | 0 | 0 | 0 | 0 | 22 | 0 | 0 | 0 | 13 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | |
| 33 | 208 | 0 | 1 | 0 | 0 | 1 | 0 | 55 | 0 | 0 | 0 | 22 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | |
| 36 | 353 | 1 | 5 | 0 | 1 | 0 | 0 | 126 | 3 | 0 | 6 | 70 | 0 | 4 | 0 | 1 | 6 | 0 | 0 | 0 | 0 | |
| 38 | 26 | 0 | 10 | 0 | 0 | 0 | 0 | 34 | 0 | 0 | 0 | 17 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | |
| 41 | 114 | 7 | 86 | 5 | 0 | 0 | 0 | 46 | 1 | 1 | 0 | 0 | 0 | 8 | 0 | 2 | 3 | 0 | 1 | 0 | 0 | |
| Global | 2234 | 174 | 2559 | 221 | 2 | 1 | 1 | 2075 | 16 | 2 | 8 | 321 | 0 | 180 | 14 | 4 | 107 | 0 | 32 | 36 | 1 | |
a NAT2 haplotypes were named in accordance with the consensus gene nomenclature of human NAT2 alleles [65]
b Only the 28 samples with no missing data (i.e., with genotype data available for the seven common SNPs in NAT2) were considered here
c Polymorphic sites were numbered considering +1 as the A of the translation start codon in the cDNA sequence (GenBank CR407631)
d The ancestral state of each SNP was deduced from both the chimpanzee and rhesus monkey sequences, and is represented here as a dot
e Nucleotide substitutions shown in bold have a functional consequence on enzyme activity. Bold-faced haplotypes are 'slow alleles' associated with a decreased acetylation capacity; the others display an enzymatic activity comparable to the reference 'rapid' allele NAT2*4
f NAT2*6J is a newly described allele found in the Mandenka sample in the present study (see Table 1). It was predicted to be a 'slow allele' since it contains two inactivating mutations
g Total number of chromosomes
h Samples are numbered according to their population code [see Additional file 1]: (1) 101 Tswana, (2) 50 Ateke Bantus, (3) 40 Bakola Pygmies, (4) 30 Baka Pygmies, (6) 97 Mandenka, (7) 50 Dogons, (8) 24 Somali, (9) 44 Moroccans, (10) 258 Spanish, (11) 49 Sardinians, (12) 60 French, (14) 112 UK Caucasians, (15) 387 US Caucasians, (17) 50 Swedes, (18) 48 Saami, (19) 844 Germans, (21) 248 Polish, (23) 40 Ashkenazi Jews, (25) 290 Russians, (26) 303 Turks, (27) 50 Gujarati, (28) 50 Turkmen, (30) 112 Han Chinese, (32) 44 Chinese, (33) 144 Japanese, (36) 288 Koreans, (38) 44 Thai, (41) 137 Nicaraguans.
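The rows of the worldwide table hold haplotype counts, not frequencies. Since each individual contributes two chromosomes, a row should sum to twice the sample size given in footnote h. The following minimal Python sketch checks this for three rows and converts the counts to frequencies. The `SAMPLES` layout and the function names are illustrative, and only three populations are transcribed to keep the example short.

```python
# population code -> (name, sampled individuals, counts for the 21 haplotype columns)
SAMPLES = {
    6: ("Mandenka", 97,
        [18, 2, 65, 3, 0, 0, 0, 32, 0, 0, 0, 13, 0, 30, 0, 0, 10, 0, 17, 3, 1]),
    19: ("Germans", 844,
         [383, 70, 647, 68, 0, 0, 0, 470, 0, 0, 0, 22, 0, 0, 0, 0, 26, 0, 0, 2, 0]),
    30: ("Han Chinese", 112,
         [132, 0, 4, 0, 1, 0, 0, 46, 0, 0, 0, 38, 0, 0, 0, 0, 3, 0, 0, 0, 0]),
}


def chromosomes_match(n_individuals, counts):
    """True if the haplotype counts sum to two chromosomes per individual."""
    return sum(counts) == 2 * n_individuals


def haplotype_frequencies(counts):
    """Convert haplotype counts to relative frequencies."""
    total = sum(counts)
    return [c / total for c in counts]


if __name__ == "__main__":
    for code, (name, n, counts) in SAMPLES.items():
        freqs = haplotype_frequencies(counts)
        status = "ok" if chromosomes_match(n, counts) else "mismatch"
        print(f"({code}) {name}: {sum(counts)} chromosomes [{status}], "
              f"first-column frequency {freqs[0]:.4f}")
```

For the Mandenka, the first haplotype column gives 18/194, which agrees with the 0.0928 listed for the first haplotype row of the Mandenka table.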
Figure 1. SNP frequencies in the 41 samples of the worldwide genotyping survey. Samples are ordered geographically (as in Additional file 1), thus including: SSAFR, sub-Saharan Africa (Tswana, Ateke Bantus, Bakola Pygmies, Baka Pygmies, Yoruba, Mandenka, Dogons, and Somali); NA, North Africa (Moroccans); EUR, Europe (Spanish, Sardinians, French, French-Canadians, UK Caucasians, both US Caucasian samples, Swedes, Saami, both German samples, Polish, Slovaks, Ashkenazi Jews, Romanians, Russians, and Turks); CSASIA, Central/South Asia (Gujarati, Turkmen, and Kyrgyz); EASIA, East Asia (the three Chinese, three Japanese, and two Korean samples, and Thai); AME, America (Embera, Ngawbe, and Nicaraguans).
Figure 2. (A) Haplotype frequencies in the 28 samples genotyped for all seven common SNPs at NAT2. Single populations are reported on the left side of the plot, with population codes in brackets; geographic areas are indicated on the right side, as follows: SSAFR, sub-Saharan Africa; NA, North Africa; EUR, Europe; CSASIA, Central/South Asia; EASIA, East Asia; AME, America. Only haplotypes with frequencies > 5% in at least one geographical region were represented individually; all other haplotypes were pooled into a single group (in white). Also, haplotypes NAT2*14A and NAT2*14B were pooled into the NAT2*14 cluster. (B) Median-joining networks of the inferred NAT2 haplotypes. Only haplotypes with frequencies > 0.005 within a geographic area were considered to construct the networks. Circle areas are proportional to haplotype frequency, and branch lengths are proportional to the number of mutations separating haplotypes. Haplotype labels are shown in black; mutations are shown in red on corresponding network branches.
Figure 3. Multidimensional scaling plot of Reynolds genetic distances among the 41 population samples of the worldwide genotyping survey. The stress value is 0.07, indicating a very good fit of the projection to the original data. Samples are numbered according to their population code [see Additional file 1 and Additional file 2]. The shaded areas highlight the distribution of samples from sub-Saharan Africa, Europe, and East Asia in the plot.
Table 4. Fixation index (F) values inferred from the 41 samples of the worldwide genotyping survey for each of the seven common SNPs of the NAT2 gene.
| | G191A | C282T | T341C | C481T | G590A | A803G | G857A | Average |
| World subdivided into | ||||||||
| Five geographic groups (41 populations)^a | 0.104 | 0.004^c | 0.257 | 0.231 | 0.022 | 0.241 | 0.061 | 0.152 |
| Three geographic groups (35 populations)^b | 0.122 | -0.001^c | 0.308 | 0.280 | 0.011 | 0.291 | 0.074 | 0.189 |
a Sub-Saharan Africa, Europe/North Africa, Central/South Asia, East Asia, Central America [see Additional file 1 for the population composition of each geographic area]
b Sub-Saharan Africa, Europe/North Africa, East Asia
c Not significantly different from 0 at the 5% level