Anna González-Neira, Francesc Calafell, Arcadi Navarro, Oscar Lao, Howard Cann, David Comas, Jaume Bertranpetit.
Abstract
Recent studies of haplotype diversity in a number of genomic regions have suggested that long stretches of DNA are preserved in the same chromosome, with little evidence of recombination events. Knowledge of the extent and strength of these haplotypes could become a powerful tool for future genetic analysis of complex traits. Different patterns of linkage disequilibrium (LD) have been found when comparing individuals of African and European descent, but little is known about worldwide population stratification. Thus, the study of haplotype composition and the pattern of LD from a global perspective is relevant for elucidating their geographical stratification, as it may have implications for the future analysis of complex traits. We have typed 12 single nucleotide polymorphisms in a chromosome 22 region (previously described as having high LD levels in European populations) in 39 different world populations. Haplotype composition has a clear continental structure, with marked heterogeneity within some continents (Africa, America). The pattern of LD among neighbouring markers exhibits a strong clustering of all East Asian populations on the one hand and of Western Eurasian populations (including Europe) on the other, revealing only two major LD patterns, but with some very specific outliers due to specific demographic histories. Moreover, it should be taken into account that African populations are highly heterogeneous. The present results support the existence of a wide (but not total) communality in LD patterns in human populations from different continental regions, despite differences in their demographic histories, as population factors seem to be less relevant than genomic forces in shaping the patterns of LD.
Year: 2004 PMID: 15606995 PMCID: PMC3500194 DOI: 10.1186/1479-7364-1-6-399
Source DB: PubMed Journal: Hum Genomics ISSN: 1473-9542 Impact factor: 4.639
Table 1. List of markers analysed in the present study.
| SNP name (a) | Position (b) | Distance (c) | Polymorphism |
|---|---|---|---|
| rs139433 | 39847691 | 30773 | G/C |
| rs139495 | 39878464 | 87439 | C/T |
| rs3927 | 39965903 | 54236 | T/C |
| rs738499 | 40020139 | 126713 | T/G |
| rs137831 | 40146852 | 365815 | C/A |
| rs133291 | 40512667 | 25622 | C/T |
| rs713881 | 40538289 | 73607 | G/C |
| rs739292 | 40611896 | 188003 | G/A |
| rs714002 | 40799899 | 92186 | T/C |
| rs134874 | 40892085 | 529673 | G/A |
| rs2013730 | 41421758 | 206746 | C/T |
| rs737782 | 41628504 | - | C/G |
a Name according to the National Center for Biotechnology Information database (dbNCBI; dbSNP Build 120).
b Position in base pairs according to the dbNCBI Build 34.
c Distance to the next single nucleotide polymorphism in base pairs.
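As a quick consistency check on the marker table, each reported distance should equal the difference between two consecutive positions. The sketch below recomputes the gaps from the listed positions; the variable and function names are illustrative and not part of the original study.

```python
# (SNP name, position in bp, reported distance to next SNP in bp), copied from the marker table.
markers = [
    ("rs139433", 39847691, 30773),
    ("rs139495", 39878464, 87439),
    ("rs3927", 39965903, 54236),
    ("rs738499", 40020139, 126713),
    ("rs137831", 40146852, 365815),
    ("rs133291", 40512667, 25622),
    ("rs713881", 40538289, 73607),
    ("rs739292", 40611896, 188003),
    ("rs714002", 40799899, 92186),
    ("rs134874", 40892085, 529673),
    ("rs2013730", 41421758, 206746),
    ("rs737782", 41628504, None),  # last marker: there is no next SNP
]


def distances_to_next(rows):
    """Return the gap in bp between each marker and the one that follows it."""
    positions = [pos for _, pos, _ in rows]
    return [b - a for a, b in zip(positions, positions[1:])]


computed = distances_to_next(markers)
reported = [dist for _, _, dist in markers[:-1]]

# zip stops at the shorter list, so the last marker (no distance) is skipped.
mismatches = [name for (name, _, dist), gap in zip(markers, computed) if dist != gap]

# Total length of the typed region, from first to last marker.
region_span = markers[-1][1] - markers[0][1]

print("markers with inconsistent distance:", mismatches)
print("region span (bp):", region_span)
```

An empty mismatch list means the position and distance columns agree with each other.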
Table 2. Population-descriptive parameters grouped by continental region.
| Continental region / Population | N (a) | Dh (b) | Kh (c) | Kmax (d) | FNF (e) | Kh shared (f) | % shared (g) | S fixed (h) | % fixed (i) |
|---|---|---|---|---|---|---|---|---|---|
| Sub-Saharan Africa: | | | | | | | | | |
| Bantu (BAN) | 40 | 0.9628 ± 0.0144 | 24 | 34 | 0.31 | 11 | 45.83 | - | - |
| Mandenka (MAN) | 48 | 0.9344 ± 0.0247 | 26 | 39 | 0.35 | 11 | 42.31 | - | - |
| Yoruba (YOR) | 50 | 0.8555 ± 0.0491 | 27 | 40 | 0.34 | 11 | 40.74 | - | - |
| San (SAN) | 14 | 0.8681 ± 0.0594 | 7 | 12 | 0.50 | 5 | 71.43 | 5 | 41.67 |
| Mbuti Pygmies (MBU) | 28 | 0.9524 ± 0.0242 | 18 | 24 | 0.27 | 8 | 44.44 | 1 | 8.33 |
| Biaka Pygmies (BIA) | 72 | 0.8975 ± 0.0242 | 26 | 44 | 0.43 | 18 | 69.23 | - | - |
| Average | 42 | 0.9117 | 21 | 32 | 0.36 | 11 | 52.33 | | |
| Europe: | | | | | | | | | |
| Orcadian (ORC) | 32 | 0.9355 ± 0.0240 | 17 | 31 | 0.48 | 15 | 88.24 | - | - |
| English (ENG) (j) | 140 | 0.9581 ± 0.0071 | 51 | 117 | 0.57 | 41 | 80.39 | - | - |
| Adygei (ADY) | 34 | 0.9340 ± 0.0264 | 19 | 33 | 0.45 | 13 | 68.42 | - | - |
| Russian (RUS) | 50 | 0.9567 ± 0.0178 | 31 | 48 | 0.37 | 18 | 58.06 | - | - |
| French Basque (FRB) | 48 | 0.9371 ± 0.0215 | 24 | 46 | 0.50 | 19 | 79.17 | - | - |
| French (FRE) | 58 | 0.9383 ± 0.0222 | 33 | 56 | 0.43 | 22 | 66.67 | - | - |
| Catalan (CAT) | 94 | 0.9526 ± 0.0114 | 41 | 87 | 0.54 | 31 | 75.61 | - | - |
| Continental Italian (CIT) | 44 | 0.9440 ± 0.0177 | 22 | 43 | 0.51 | 14 | 63.64 | - | - |
| Sardinian (SAR) | 56 | 0.9325 ± 0.0186 | 26 | 54 | 0.54 | 19 | 73.08 | - | - |
| Average | 62 | 0.9432 | 29 | 57 | 0.51 | 21 | 72.59 | | |
| Middle East/North Africa: | | | | | | | | | |
| Mozabite (MOZ) | 60 | 0.9384 ± 0.0183 | 28 | 54 | 0.50 | 16 | 57.14 | - | - |
| Bedouin (BED) | 98 | 0.9552 ± 0.0115 | 46 | 92 | 0.51 | 33 | 71.74 | - | - |
| Druze (DRU) | 96 | 0.9575 ± 0.0090 | 41 | 90 | 0.56 | 31 | 75.61 | - | - |
| Palestinian (PAL) | 102 | 0.9639 ± 0.0081 | 48 | 96 | 0.51 | 27 | 56.25 | - | - |
| Average | 89 | 0.9537 | 41 | 83 | 0.52 | 27 | 65.19 | | |
| Central/South Asia: | | | | | | | | | |
| Balochi (BAL) | 50 | 0.9706 ± 0.0093 | 28 | 48 | 0.43 | 21 | 75.00 | - | - |
| Brahui (BRA) | 50 | 0.9314 ± 0.0262 | 28 | 48 | 0.43 | 13 | 46.43 | - | - |
| Makrani (MAK) | 50 | 0.9551 ± 0.0165 | 30 | 48 | 0.39 | 17 | 56.67 | - | - |
| Sindhi (SIN) | 50 | 0.9649 ± 0.0134 | 30 | 49 | 0.40 | 19 | 63.33 | - | - |
| Pathan (PAT) | 50 | 0.9429 ± 0.0233 | 30 | 48 | 0.39 | 18 | 60.00 | - | - |
| Burusho (BUR) | 50 | 0.9535 ± 0.0137 | 25 | 47 | 0.49 | 18 | 72.00 | - | - |
| Hazara (HAZ) | 50 | 0.9527 ± 0.0153 | 26 | 45 | 0.44 | 15 | 57.69 | - | - |
| Kalash (KAL) | 50 | 0.9143 ± 0.0246 | 21 | 47 | 0.58 | 18 | 85.71 | - | - |
| Average | 50 | 0.948 | 27 | 48 | 0.45 | 17 | 64.60 | | |
| East Asia: | | | | | | | | | |
| Han (HAN) | 90 | 0.9348 ± 0.0135 | 34 | 71 | 0.54 | 23 | 67.65 | 1 | 8.33 |
| North China (NCH) | 138 | 0.9305 ± 0.0144 | 52 | 109 | 0.53 | 33 | 63.46 | - | - |
| South China (SCH) | 140 | 0.9298 ± 0.0146 | 48 | 103 | 0.54 | 30 | 62.50 | 1 | 8.33 |
| Cambodian (CAM) | 22 | 0.9221 ± 0.0381 | 13 | 20 | 0.39 | 11 | 84.62 | 2 | 16.67 |
| Japanese (JAP) | 62 | 0.8911 ± 0.0328 | 31 | 49 | 0.38 | 20 | 64.52 | 1 | 8.33 |
| Yakut (YAK) | 48 | 0.9273 ± 0.0254 | 27 | 39 | 0.32 | 22 | 81.48 | - | - |
| Average | 83 | 0.9225 | 34 | 65 | 0.49 | 23 | 70.70 | | |
| Oceania: | | | | | | | | | |
| Non-Austronesian (NAN) Melanesian | 44 | 0.6332 ± 0.0581 | 5 | 12 | 0.70 | 5 | 100.00 | 6 | 50.00 |
| Papuan (PAP) | 34 | 0.8111 ± 0.0454 | 10 | 12 | 0.20 | 7 | 70.00 | 3 | 25.00 |
| Average | 39 | 0.7167 | 8 | 12 | 0.45 | 6 | 85.00 | | |
| America: | | | | | | | | | |
| Karitiana (KAR) | 48 | 0.7881 ± 0.0369 | 7 | 19 | 0.71 | 6 | 85.71 | 6 | 50.00 |
| Suruí (SUR) | 42 | 0.7573 ± 0.0333 | 6 | 15 | 0.69 | 6 | 100.00 | 7 | 58.33 |
| Colombian (COL) | 26 | 0.8246 ± 0.0475 | 7 | 15 | 0.62 | 6 | 85.71 | 4 | 33.33 |
| Maya (MAY) | 50 | 0.7967 ± 0.0565 | 20 | 26 | 0.25 | 17 | 85.00 | - | - |
| Pima (PIM) | 50 | 0.8204 ± 0.0335 | 12 | 16 | 0.29 | 10 | 83.33 | 5 | 41.67 |
| Average | 43 | 0.797 | 10 | 18.2 | 0.48 | 9 | 87.95 | | |
a N, number of chromosomes
b Dh, haplotype diversity
c Kh, observed number of haplotypes
d Kmax, number of haplotypes expected under equilibrium
e FNF, fraction of haplotypes not found for each population defined (see Methods)
f Kh shared, number of haplotypes shared between two or more populations
g % shared, percentage of shared haplotypes
h S fixed, number of non-polymorphic single nucleotide polymorphisms (SNPs)
i % fixed, percentage of non-polymorphic SNPs
j ENG, English reference sample [8].
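The two percentage columns can be recomputed from the counts in the same table: % shared is Kh shared as a fraction of Kh, and % fixed is S fixed as a fraction of the 12 typed SNPs. A minimal sketch on a few rows copied from the table; the helper name `percent` and the dictionary layout are illustrative choices, not the authors' code.

```python
N_SNPS = 12  # number of SNPs typed in the study


def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals as in the table."""
    return round(100.0 * part / whole, 2)


# population code -> (Kh, Kh shared, S fixed); None where the table shows "-"
rows = {
    "BAN": (24, 11, None),
    "SAN": (7, 5, 5),
    "CAM": (13, 11, 2),
    "SUR": (6, 6, 7),
}

for code, (kh, kh_shared, s_fixed) in rows.items():
    pct_shared = percent(kh_shared, kh)
    pct_fixed = percent(s_fixed, N_SNPS) if s_fixed is not None else None
    print(code, pct_shared, pct_fixed)
```

For these rows the recomputed values agree with the tabulated ones (for example, San: 5 of 7 haplotypes shared, 5 of 12 SNPs fixed).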
Figure 1. Plot of the first three dimensions obtained in the correspondence analysis based on the 182 shared haplotypes found in 40 populations (English population included). The first dimension separates the six African populations from the rest. Native Americans are clearly differentiated in the second dimension, and East Asians and Oceanians are separated from the rest of the Eurasian populations by the third dimension.
Figure 2. Principal components analysis plots based on the r2 values. Populations with more than three missing r2 values were excluded from the analysis (Suruí, San, Non-Austronesian (NAN) Melanesian, Colombian, Karitiana, Pima and Papuan). The first and second components explain 42.1% and 18.9% of the variance, respectively. The plot of the first two components pools the populations into two groups: East Asians and European/Western Eurasians. African populations are scattered due to their lack of a single linkage disequilibrium pattern. Note that four Native American populations were excluded from the analysis and only one such population (Mayans, from Mexico) could be included.
Table 3. Correlation coefficients for pairs of populations (below the diagonal) within geographical regions: Africa (3A), Europe (3B), Central/South Asia (3C) and East Asia (3D), and among geographical regions, with one population from each region (3E).
A) Africa

| | | | | |
|---|---|---|---|---|
| | 0.395 | 0.001* | 0.920 | 0.479 |
| -0.286 | | 0.933 | 0.664 | 0.847 |
| 0.832 | -0.029 | | 0.969 | 0.820 |
| 0.034 | 0.148 | -0.013 | | 0.010 |
| 0.272 | 0.075 | -0.089 | 0.800 | |

B) Europe

| | | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| | 0.000* | 0.109 | 0.000* | 0.000* | 0.000* | 0.000* | 0.000* | 0.000* |
| 0.977 | | 0.000* | 0.000* | 0.000* | 0.000* | 0.000* | 0.000* | 0.000* |
| 0.509 | 0.500 | | 0.348 | 0.159 | 0.016 | 0.148 | 0.029 | 0.212 |
| 0.901 | 0.924 | 0.312 | | 0.002 | 0.004 | 0.000* | 0.000* | 0.000* |
| 0.910 | 0.919 | 0.455 | 0.819 | | 0.012 | 0.000* | 0.000* | 0.000* |
| 0.863 | 0.858 | 0.699 | 0.781 | 0.719 | | 0.002 | 0.000* | 0.001 |
| 0.943 | 0.970 | 0.466 | 0.933 | 0.899 | 0.803 | | 0.000* | 0.000* |
| 0.907 | 0.950 | 0.651 | 0.881 | 0.880 | 0.906 | 0.934 | | 0.000* |
| 0.954 | 0.977 | 0.408 | 0.958 | 0.909 | 0.830 | 0.986 | 0.933 | |

C) Central/South Asia

| | | | | | | | |
|---|---|---|---|---|---|---|---|
| | 0.004 | 0.002* | 0.267 | 0.275 | 0.014 | 0.174 | 0.278 |
| 0.788 | | 0.003 | 0.077 | 0.115 | 0.004 | 0.263 | 0.147 |
| 0.813 | 0.799 | | 0.169 | 0.008 | 0.000* | 0.012 | 0.009 |
| 0.367 | 0.555 | 0.446 | | 0.476 | 0.137 | 0.866 | 0.931 |
| 0.361 | 0.503 | 0.749 | 0.240 | | 0.001* | 0.000* | 0.000* |
| 0.712 | 0.790 | 0.966 | 0.477 | 0.865 | | 0.002* | 0.002* |
| 0.441 | 0.370 | 0.721 | 0.058 | 0.926 | 0.813 | | 0.000* |
| 0.359 | 0.468 | 0.741 | 0.030 | 0.922 | 0.831 | 0.903 | |

D) East Asia

| | | | | | |
|---|---|---|---|---|---|
| | 0.001* | 0.001* | 0.001* | 0.027 | 0.009 |
| 0.914 | | 0.000* | 0.000* | 0.001* | 0.000* |
| 0.889 | 0.959 | | 0.000* | 0.001* | 0.000* |
| 0.886 | 0.967 | 0.928 | | 0.017 | 0.003* |
| 0.726 | 0.870 | 0.911 | 0.764 | | 0.003* |
| 0.841 | 0.956 | 0.966 | 0.886 | 0.896 | |

E) Among geographical regions

| | | | | | |
|---|---|---|---|---|---|
| | 0.749 | 0.843 | 0.145 | 0.575 | 0.372 |
| 0.109 | | 0.000* | 0.008* | 0.949 | 0.225 |
| 0.068 | 0.882 | | 0.045 | 0.901 | 0.279 |
| 0.470 | 0.752 | 0.612 | | 0.515 | 0.440 |
| -0.190 | 0.022 | -0.042 | -0.220 | | 0.337 |
| -0.299 | -0.398 | -0.359 | -0.260 | 0.320 | |

The Bonferroni correction was applied and only significant p values were labelled with an asterisk (above the diagonal). Names of the populations are as in Table 2.
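For reference, a Bonferroni correction divides the family-wise significance level by the number of tests performed. The sketch below assumes a family-wise level of 0.05 and one test per pair of populations within a panel; neither assumption is stated in this excerpt, so the thresholds it prints are illustrative and may not match the authors' exact cut-offs.

```python
from math import comb


def bonferroni_threshold(n_populations, alpha=0.05):
    """Per-test threshold when every pair of populations is tested once.

    alpha=0.05 is an assumed family-wise level, not taken from the source.
    """
    n_tests = comb(n_populations, 2)  # number of unordered population pairs
    return alpha / n_tests


def is_significant(p_value, n_populations, alpha=0.05):
    """True if p_value passes the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(n_populations, alpha)


# Panel sizes of 5, 6, 8 and 9 populations correspond to the row counts of the panels.
for n in (5, 6, 8, 9):
    print(n, "populations:", comb(n, 2), "pairs, threshold",
          round(bonferroni_threshold(n), 5))
```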