Bonnie R Joubert, Kari E North, Yunfei Wang, Victor Mwapasa, Nora Franceschini, Steven R Meshnick, Ethan M Lange.
Abstract
Understanding genetic variation between populations is important because it affects the portability of human genome-wide analytical methods. We compared genetic variation and substructure between Malawians and other African and non-African HapMap populations. Allele frequencies and adjacent linkage disequilibrium (LD) were measured for 617,715 single nucleotide polymorphisms (SNPs) across subject genomes. Allele frequencies in the Malawian population (N=226) were highly correlated with allele frequencies in HapMap populations of African ancestry (AFA, N=376), namely Yoruban in Ibadan, Nigeria (Spearman's r²=0.97), Luhya in Webuye, Kenya (r²=0.97), African Americans in the southwest United States (r²=0.94) and Maasai in Kinyawa, Kenya (r²=0.91). This correlation was much lower between Malawians and populations of other ancestries (r²<0.52). LD correlations between Malawians and HapMap populations were strongest for the populations of AFA (AFA r²>0.82, other ancestries r²<0.57). Principal components analyses revealed little population substructure within our Malawi sample but provided clear distinction between Malawians, AFA populations and two European populations. Five SNPs within the lactase gene (LCT) had substantially different allele frequencies between the Malawi population and Maasai in Kinyawa, Kenya (rs3769013, rs730005, rs3769012, rs2304370; P-values <1 × 10⁻³³).
Year: 2010 PMID: 20485449 PMCID: PMC2909738 DOI: 10.1038/jhg.2010.41
Source DB: PubMed Journal: J Hum Genet ISSN: 1434-5161 Impact factor: 3.172
Correlation of allele frequency across populations
| Ancestry | Population | BMW | YRI | LWK | MKK | ASW | CEU | TSI | CHB | CHD | GIH | JPT | MEX |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AFA | BMW | 1 | | | | | | | | | | | |
| AFA | YRI | 0.972 | 1 | | | | | | | | | | |
| AFA | LWK | 0.971 | 0.960 | 1 | | | | | | | | | |
| AFA | MKK | 0.913 | 0.906 | 0.932 | 1 | | | | | | | | |
| AFA | ASW | 0.942 | 0.947 | 0.937 | 0.917 | 1 | | | | | | | |
| EUA | CEU | 0.475 | 0.474 | 0.498 | 0.618 | 0.622 | 1 | | | | | | |
| EUA | TSI | 0.486 | 0.485 | 0.510 | 0.634 | 0.628 | 0.967 | 1 | | | | | |
| ASA | CHB | 0.418 | 0.417 | 0.433 | 0.500 | 0.498 | 0.607 | 0.602 | 1 | | | | |
| ASA | CHD | 0.415 | 0.415 | 0.430 | 0.496 | 0.495 | 0.602 | 0.597 | 0.976 | 1 | | | |
| ASA | GIH | 0.511 | 0.510 | 0.532 | 0.632 | 0.628 | 0.850 | 0.848 | 0.712 | 0.709 | 1 | | |
| ASA | JPT | 0.415 | 0.415 | 0.430 | 0.497 | 0.495 | 0.603 | 0.598 | 0.959 | 0.952 | 0.709 | 1 | |
| MXA | MEX | 0.490 | 0.490 | 0.508 | 0.603 | 0.609 | 0.834 | 0.826 | 0.735 | 0.727 | 0.811 | 0.733 | 1 |
Spearman’s correlation coefficients for allele frequencies, MAF > 0.05
Abbreviations: ASW, African ancestry in Southwest USA; BMW, Individuals of various self-reported ancestry in Blantyre, Malawi; CEU, Utah residents with Northern and Western European ancestry from the CEPH collection; CHB, Han Chinese in Beijing, China; CHD, Chinese in Metropolitan Denver, Colorado; GIH, Gujarati Indians in Houston, Texas; JPT, Japanese in Tokyo, Japan; LWK, Luhya in Webuye, Kenya; MEX, Mexican ancestry in Los Angeles, California; MKK, Maasai in Kinyawa, Kenya; TSI, Toscani in Italy; YRI, Yoruba in Ibadan, Nigeria.
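The entries in the table above are pairwise Spearman correlations of allele frequencies over SNPs with MAF > 0.05. A minimal sketch of such a calculation follows, using only the Python standard library. The SNP identifiers and frequencies are made up for illustration, and applying the MAF filter in both populations is an assumption; the source does not say how the filter was applied.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def maf(freq):
    return min(freq, 1.0 - freq)

def allele_frequency_correlation(freq_a, freq_b, min_maf=0.05):
    """freq_a, freq_b: dict of SNP id -> frequency of the same reference allele.

    Assumption: a SNP is kept only if its MAF exceeds min_maf in both populations.
    """
    shared = sorted(s for s in freq_a
                    if s in freq_b
                    and maf(freq_a[s]) > min_maf
                    and maf(freq_b[s]) > min_maf)
    x = [freq_a[s] for s in shared]
    y = [freq_b[s] for s in shared]
    return spearman(x, y), len(shared)

# Hypothetical allele frequencies for two populations (not data from the study).
pop_a = {"snp1": 0.10, "snp2": 0.25, "snp3": 0.40, "snp4": 0.33, "snp5": 0.02}
pop_b = {"snp1": 0.12, "snp2": 0.30, "snp3": 0.35, "snp4": 0.45, "snp5": 0.20}

rho, n_snps = allele_frequency_correlation(pop_a, pop_b)
print(f"Spearman rho = {rho:.3f} over {n_snps} SNPs")
```

In this toy input, `snp5` is dropped because its frequency in `pop_a` is below the threshold, so the correlation is computed over the remaining four SNPs.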
Correlation of allele frequency across Malawi ethnic groups
| Population | Ngoni | Lomwe | Yao | Chewa | Tumbuka | Nyanja/Mang'anja | Sena | Other |
|---|---|---|---|---|---|---|---|---|
| Ngoni | 1 | | | | | | | |
| Lomwe | 0.969 | 1 | | | | | | |
| Yao | 0.958 | 0.956 | 1 | | | | | |
| Chewa | 0.937 | 0.935 | 0.925 | 1 | | | | |
| Tumbuka | 0.928 | 0.927 | 0.916 | 0.897 | 1 | | | |
| Nyanja/Mang'anja | 0.919 | 0.918 | 0.907 | 0.888 | 0.880 | 1 | | |
| Sena | 0.914 | 0.913 | 0.903 | 0.883 | 0.875 | 0.867 | 1 | |
| Other | 0.928 | 0.927 | 0.916 | 0.897 | 0.889 | 0.879 | 0.876 | 1 |
Spearman’s correlation coefficients for allele frequencies, MAF > 0.05
Correlation of adjacent linkage disequilibrium across populations
| Ancestry | Population | BMW | YRI | LWK | MKK | ASW | CEU | TSI | CHB | CHD | GIH | JPT | MEX |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AFA | BMW | 1 | | | | | | | | | | | |
| AFA | YRI | 0.896 | 1 | | | | | | | | | | |
| AFA | LWK | 0.887 | 0.857 | 1 | | | | | | | | | |
| AFA | MKK | 0.826 | 0.805 | 0.836 | 1 | | | | | | | | |
| AFA | ASW | 0.822 | 0.819 | 0.802 | 0.805 | 1 | | | | | | | |
| EUA | CEU | 0.543 | 0.533 | 0.555 | 0.666 | 0.634 | 1 | | | | | | |
| EUA | TSI | 0.550 | 0.540 | 0.562 | 0.676 | 0.636 | 0.937 | 1 | | | | | |
| ASA | CHB | 0.523 | 0.514 | 0.529 | 0.604 | 0.569 | 0.700 | 0.695 | 1 | | | | |
| ASA | CHD | 0.520 | 0.511 | 0.526 | 0.600 | 0.564 | 0.694 | 0.689 | 0.945 | 1 | | | |
| ASA | GIH | 0.564 | 0.553 | 0.574 | 0.673 | 0.634 | 0.852 | 0.848 | 0.768 | 0.761 | 1 | | |
| ASA | JPT | 0.517 | 0.508 | 0.523 | 0.597 | 0.562 | 0.692 | 0.687 | 0.932 | 0.925 | 0.761 | 1 | |
| MXA | MEX | 0.551 | 0.542 | 0.560 | 0.652 | 0.623 | 0.829 | 0.820 | 0.760 | 0.753 | 0.810 | 0.755 | 1 |
Spearman’s correlation coefficients, MAF > 0.05. Linkage disequilibrium measured in each population using adjacent-marker r².
Abbreviations: ASW, African ancestry in Southwest USA; BMW, Individuals of various self-reported ancestry in Blantyre, Malawi; CEU, Utah residents with Northern and Western European ancestry from the CEPH collection; CHB, Han Chinese in Beijing, China; CHD, Chinese in Metropolitan Denver, Colorado; GIH, Gujarati Indians in Houston, Texas; JPT, Japanese in Tokyo, Japan; LWK, Luhya in Webuye, Kenya; MEX, Mexican ancestry in Los Angeles, California; MKK, Maasai in Kinyawa, Kenya; TSI, Toscani in Italy; YRI, Yoruba in Ibadan, Nigeria.
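Adjacent-marker r² can be illustrated with a short sketch. The version below uses the squared Pearson correlation of genotype dosages (0, 1 or 2 copies of an allele), which is one common phase-free estimator. Whether the authors used this estimator or a haplotype-based one is not stated in the source, so treat the choice as an assumption. The genotype vectors are invented for illustration.

```python
from statistics import mean

def dosage_r2(g1, g2):
    """Squared Pearson correlation between two SNPs' genotype dosages (0, 1, 2)."""
    m1, m2 = mean(g1), mean(g2)
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    if v1 == 0 or v2 == 0:
        return None  # monomorphic SNP: r^2 is undefined
    return cov * cov / (v1 * v2)

def adjacent_r2(genotypes):
    """genotypes: list of per-SNP dosage vectors, ordered by chromosomal position.

    Returns one r^2 value per pair of neighbouring SNPs.
    """
    return [dosage_r2(a, b) for a, b in zip(genotypes, genotypes[1:])]

# Hypothetical dosages for 4 SNPs in 6 individuals (not data from the study).
snps = [
    [0, 1, 2, 1, 0, 2],
    [0, 1, 2, 1, 0, 2],  # identical to the previous SNP
    [2, 1, 0, 1, 2, 0],  # mirror image of the previous SNP
    [1, 0, 1, 2, 1, 1],  # uncorrelated with the previous SNP
]

ld = adjacent_r2(snps)
print(ld)
```

Computing this vector separately in each population, over the same ordered SNP pairs, gives per-population LD profiles that could then be compared pairwise with a rank correlation, as in the table above.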
Figure 1. SNP associations with population membership. Individuals in Blantyre, Malawi (BMW) were compared to each African ancestry HapMap population: Individuals of African ancestry in the Southwest USA (ASW), Luhya in Webuye, Kenya (LWK), Maasai in Kinyawa, Kenya (MKK), and Yoruban of Ibadan, Nigeria (YRI).
Top 10 Outlier SNPs for each comparison of BMW vs. HapMap populations of African ancestry
| Comparison | CHR | SNP | POS | MAF1 | MAF2 | UNADJ | BONF | Gene |
|---|---|---|---|---|---|---|---|---|
| BMW | 2 | rs6430594 | 136435643 | 0.69 | 0.08 | 5.31E-68 | 2.64E-62 | Aspartyl-tRNA synthetase |
| | 2 | rs12472293 | 136364547 | 0.11 | 0.70 | 7.30E-61 | 3.63E-55 | NA |
| | 2 | rs309143 | 136430648 | 0.78 | 0.18 | 4.70E-58 | 2.34E-52 | Aspartyl-tRNA synthetase |
| | 2 | rs3769013 | 136272652 | 0.81 | 0.23 | 8.33E-54 | 4.14E-48 | Lactase (LCT) |
| | 2 | rs730005 | 136299164 | 0.71 | 0.16 | 2.13E-51 | 1.06E-45 | Lactase (LCT) |
| | 2 | rs3769012 | 136272950 | 0.71 | 0.16 | 2.73E-50 | 1.36E-44 | Lactase (LCT) |
| | 2 | rs961360 | 136110128 | 0.70 | 0.17 | 2.81E-48 | 1.40E-42 | R3H domain containing 1 |
| | 2 | rs6430585 | 136223397 | 0.15 | 0.70 | 1.35E-47 | 6.72E-42 | UBX domain protein 4 |
| | 2 | rs3806502 | 136004743 | 0.73 | 0.21 | 6.98E-46 | 3.47E-40 | R3H domain containing 1 |
| | 2 | rs2305248 | 135644782 | 0.72 | 0.21 | 9.44E-44 | 4.69E-38 | RAB3 GTPase activating protein subunit 1 |
| BMW | 19 | rs2190687 | 14765415 | 0.50 | 0.06 | 9.26E-28 | 4.68E-22 | NA |
| | 2 | rs6733349 | 231976556 | 0.44 | 0.06 | 2.48E-22 | 1.25E-16 | NA |
| | 7 | rs1717725 | 38071558 | 0.05 | 0.30 | 1.50E-18 | 7.59E-13 | NA |
| | 7 | rs6944302 | 79942827 | 0.49 | 0.17 | 5.20E-18 | 2.62E-12 | Guanine nucleotide binding protein, alpha |
| | 7 | rs12700014 | 18930601 | 0.54 | 0.23 | 1.19E-15 | 6.00E-10 | Histone deacetylase 9 |
| | 9 | rs3739821 | 129742298 | 0.32 | 0.08 | 8.61E-15 | 4.35E-09 | Family with sequence similarity 102, |
| | 21 | rs494619 | 18347077 | 0.35 | 0.07 | 1.84E-14 | 9.27E-09 | NA |
| | 6 | rs2301220 | 33146744 | 0.26 | 0.57 | 6.68E-14 | 3.38E-08 | Major histocompatibility complex, class II, |
| | 7 | rs10216027 | 79968467 | 0.30 | 0.08 | 2.12E-13 | 1.07E-07 | Guanine nucleotide binding protein, alpha |
| | 6 | rs6457713 | 33185754 | 0.33 | 0.63 | 3.33E-13 | 1.68E-07 | NA |
| BMW | 2 | rs6733349 | 231976556 | 0.44 | 0.05 | 3.22E-19 | 1.62E-13 | NA |
| | 16 | rs1017228 | 21971218 | 0.31 | 0.06 | 2.16E-17 | 1.08E-11 | Chromosome 16 open reading frame 52 |
| | 7 | rs11772387 | 48019075 | 0.29 | 0.06 | 9.83E-15 | 4.95E-09 | Sad1 and UNC84 domain containing 1 |
| | 1 | rs2236906 | 208038108 | 0.58 | 0.27 | 7.48E-13 | 3.77E-07 | Interferon regulatory factor 6 |
| | 1 | rs4304614 | 107256813 | 0.25 | 0.55 | 1.08E-12 | 5.45E-07 | NA |
| | 2 | rs3789106 | 111437355 | 0.28 | 0.07 | 1.54E-12 | 7.77E-07 | Acyl-Coenzyme A oxidase-like |
| | 6 | rs1572438 | 803970 | 0.16 | 0.42 | 3.71E-12 | 1.87E-06 | NA |
| | 7 | rs1915960 | 48011003 | 0.07 | 0.27 | 9.55E-12 | 4.80E-06 | Sad1 and UNC84 domain containing 1 |
| | 7 | rs10248243 | 47478639 | 0.05 | 0.24 | 1.09E-11 | 5.49E-06 | Tensin 3 |
| | 7 | rs983186 | 27155184 | 0.06 | 0.25 | 1.36E-11 | 6.84E-06 | |
| BMW | 1 | rs12030126 | 234879762 | 0.05 | 0.39 | 4.84E-20 | 2.43E-14 | NA |
| | 9 | rs7020021 | 132242790 | 0.48 | 0.10 | 7.63E-19 | 3.83E-13 | Hemicentin 2 |
| | 23 | rs226711 | 98339530 | 0.08 | 0.49 | 9.45E-19 | 4.74E-13 | NA |
| | 2 | rs282268 | 224628420 | 0.06 | 0.39 | 5.16E-18 | 2.59E-12 | NA |
| | 9 | rs7045276 | 191644 | 0.07 | 0.39 | 3.87E-17 | 1.94E-11 | NA |
| | 2 | rs282273 | 224631266 | 0.05 | 0.35 | 5.75E-17 | 2.89E-11 | NA |
| | 2 | rs2577284 | 224637656 | 0.07 | 0.39 | 2.56E-16 | 1.28E-10 | NA |
| | 9 | rs3739821 | 129742298 | 0.42 | 0.08 | 3.32E-16 | 1.66E-10 | Family with sequence similarity 102, |
| | 1 | rs6586395 | 232715442 | 0.10 | 0.45 | 4.77E-16 | 2.39E-10 | NA |
| | 8 | rs7003117 | 115759827 | 0.08 | 0.39 | 6.30E-16 | 3.16E-10 | NA |
Abbreviations: ASW, African ancestry in Southwest USA; BMW, Individuals of various self-reported ancestry in Blantyre, Malawi; LWK, Luhya in Webuye, Kenya; MKK, Maasai in Kinyawa, Kenya; YRI, Yoruba in Ibadan, Nigeria. NA if SNP not located within a gene.
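The UNADJ and BONF columns above are consistent with a Bonferroni correction, in which the unadjusted P-value is multiplied by the number of SNPs tested: for the first row, 2.64E-62 / 5.31E-68 is roughly 5 × 10⁵ tests. That reading is an inference from the table values, not something the source states. The source also does not name the association test behind UNADJ, so the sketch below assumes a 1-df allelic chi-square test on a 2×2 table of allele counts, with hypothetical counts.

```python
import math

def allelic_chi2(minor_1, total_1, minor_2, total_2):
    """1-df chi-square test on a 2x2 table of allele counts.

    minor_x: count of the minor allele in population x
    total_x: number of chromosomes sampled in population x (2 per individual)
    """
    a, b = minor_1, total_1 - minor_1
    c, d = minor_2, total_2 - minor_2
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

def bonferroni(p, n_tests):
    """Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p * n_tests)

# Hypothetical allele counts: 40 of 200 chromosomes vs. 80 of 200 chromosomes.
chi2, p = allelic_chi2(40, 200, 80, 200)
p_adj = bonferroni(p, n_tests=1000)  # n_tests is illustrative only
print(f"chi2 = {chi2:.2f}, unadjusted P = {p:.3g}, Bonferroni P = {p_adj:.3g}")

# Number of tests implied by the first table row (values rounded in the table).
implied_tests = 2.64e-62 / 5.31e-68
print(f"implied number of tests: about {implied_tests:.3g}")
```

Because the table reports only three significant figures, the implied number of tests is approximate and differs slightly from row to row.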
Figure 2. Lactase gene SNP frequencies by African ancestry population.
Figure 3. No evidence of population substructure in Malawi population: Component 1 vs. 2. Analyses performed in EIGENSOFT software using 23,612 SNPs.
Figure 4. Separation of BMW and African ancestry HapMap populations: Component 1 vs. 2. Analyses performed in EIGENSOFT software using 18,481 SNPs.
Admixture analyses for clusters of size K = 2, 3 and 5, reported as mean (standard deviation) [range] by ancestral population.
| K / Cluster | CEU | TSI | ASW | YRI | BMW | LWK | MKK |
|---|---|---|---|---|---|---|---|
| | (N = 109) | (N = 77) | (N = 42) | (N = 108) | (N = 226) | (N = 83) | (N = 143) |
| K=2 | | | | | | | |
| 1 | 0.014 (0.010) | 0.028 (0.008) | 0.733 (0.088) | 0.930 (0.008) | 0.959 (0.014) | 0.900 (0.011) | 0.714 (0.034) |
| | [0.000,0.057] | [0.010,0.049] | [0.457,0.906] | [0.909,0.948] | [0.865,0.984] | [0.869,0.926] | [0.617,0.834] |
| 2 | 0.986 (0.010) | 0.972 (0.008) | 0.267 (0.088) | 0.070 (0.008) | 0.041 (0.014) | 0.100 (0.011) | 0.286 (0.034) |
| | [0.943,1.000] | [0.951,0.990] | [0.094,0.543] | [0.052,0.091] | [0.016,0.135] | [0.074,0.131] | [0.166,0.383] |
| K=3 | | | | | | | |
| 1 | 0.012 (0.013) | 0.050 (0.013) | 0.098 (0.024) | 0.095 (0.023) | 0.070 (0.021) | 0.249 (0.037) | 0.678 (0.124) |
| | [0.000,0.056] | [0.007,0.075] | [0.047,0.143] | [0.037,0.151] | [0.019,0.142] | [0.177,0.321] | [0.323,0.953] |
| 2 | 0.017 (0.010) | 0.006 (0.008) | 0.663 (0.081) | 0.858 (0.017) | 0.905 (0.019) | 0.721 (0.033) | 0.242 (0.107) |
| | [0.000,0.039] | [0.000,0.032] | [0.411,0.831] | [0.814,0.896] | [0.834,0.947] | [0.655,0.785] | [0.035,0.605] |
| 3 | 0.971 (0.013) | 0.945 (0.010) | 0.239 (0.089) | 0.047 (0.011) | 0.025 (0.016) | 0.030 (0.010) | 0.080 (0.034) |
| | [0.919,0.997] | [0.924,0.967] | [0.068,0.518] | [0.022,0.075] | [0.000,0.131] | [0.010,0.057] | [0.000,0.206] |
| K=5 | | | | | | | |
| 1 | 0.010 (0.011) | 0.006 (0.010) | 0.442 (0.062) | 0.667 (0.045) | 0.132 (0.044) | 0.224 (0.044) | 0.089 (0.050) |
| | [0.000,0.047] | [0.000,0.038] | [0.301,0.600] | [0.549,0.776] | [0.000,0.278] | [0.112,0.329] | [0.000,0.194] |
| 2 | 0.007 (0.011) | 0.036 (0.016) | 0.055 (0.029) | 0.042 (0.023) | 0.043 (0.021) | 0.202 (0.037) | 0.601 (0.181) |
| | [0.000,0.046] | [0.003,0.075] | [0.000,0.127] | [0.000,0.106] | [0.000,0.010] | [0.133,0.290] | [0.000,1.000] |
| 3 | 0.012 (0.013) | 0.003 (0.007) | 0.256 (0.048) | 0.248 (0.042) | 0.774 (0.045) | 0.495 (0.050) | 0.106 (0.097) |
| | [0.000,0.045] | [0.000,0.030] | [0.136,0.382] | [0.157,0.357] | [0.626,0.918] | [0.395,0.631] | [0.000,0.499] |
| 4 | 0.963 (0.012) | 0.938 (0.10) | 0.219 (0.091) | 0.019 (0.010) | 0.021 (0.015) | 0.019 (0.010) | 0.066 (0.035) |
| | [0.912,0.989] | [0.917,0.957] | [0.042,0.503] | [0.000,0.044] | [0.000,0.125] | [0.001,0.047] | [0.000,0.189] |
| 5 | 0.008 (0.009) | 0.017 (0.012) | 0.028 (0.016) | 0.025 (0.014) | 0.031 (0.014) | 0.060 (0.013) | 0.138 (0.160) |
| | [0.000,0.035] | [0.000,0.043] | [0.000,0.063] | [0.000,0.066] | [0.000,0.080] | [0.027,0.096] | [0.000,1.000] |
Abbreviations: ASW, African ancestry in Southwest USA; BMW, Individuals of various self-reported ancestry in Blantyre, Malawi; CEU, Utah residents with Northern and Western European ancestry from the CEPH collection; LWK, Luhya in Webuye, Kenya; MKK, Maasai in Kinyawa, Kenya; TSI, Toscani in Italy; YRI, Yoruba in Ibadan, Nigeria.
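Each cell of the admixture table summarizes per-individual cluster membership fractions as mean (standard deviation) [range] within a population. The sketch below shows how such a summary could be derived from a matrix of per-individual fractions. The population labels and fractions are hypothetical, and the use of the sample standard deviation is an assumption, since the source does not specify which form was used.

```python
from statistics import mean, stdev

def summarize_cluster(q_by_population, cluster):
    """Summarize one cluster's membership fraction within each population.

    q_by_population: dict of population label -> list of per-individual tuples,
                     one fraction per cluster, each tuple summing to 1.
    cluster:         0-based index of the cluster to summarize.
    Returns a dict of population label -> formatted 'mean (SD) [min,max]' string.
    """
    summary = {}
    for pop, rows in q_by_population.items():
        vals = [row[cluster] for row in rows]
        summary[pop] = (f"{mean(vals):.3f} ({stdev(vals):.3f}) "
                        f"[{min(vals):.3f},{max(vals):.3f}]")
    return summary

# Hypothetical K=2 membership fractions for two made-up populations.
q = {
    "POP_A": [(0.95, 0.05), (0.97, 0.03), (0.93, 0.07)],
    "POP_B": [(0.02, 0.98), (0.01, 0.99), (0.03, 0.97)],
}

for cluster in range(2):
    print(f"cluster {cluster + 1}:", summarize_cluster(q, cluster))
```

One quick consistency check on a table of this kind is that, for a given K, the cluster means within a population should sum to approximately 1, as they do for the K=2 rows above (for example, 0.014 + 0.986 for CEU).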