Rui Pereira, Christopher Phillips, Nádia Pinto, Carla Santos, Sidney Emanuel Batista dos Santos, António Amorim, Ángel Carracedo, Leonor Gusmão.
Abstract
Ancestry-informative markers (AIMs) show high allele frequency divergence between different ancestral or geographically distant populations. These genetic markers are especially useful in inferring the likely ancestral origin of an individual or estimating the apportionment of ancestry components in admixed individuals or populations. The study of AIMs is of great interest in clinical genetics research, particularly to detect and correct for population substructure effects in case-control association studies, but also in population and forensic genetics studies. This work presents a set of 46 ancestry-informative insertion-deletion polymorphisms selected to efficiently measure population admixture proportions of four different origins (African, European, East Asian and Native American). All markers are analyzed in short fragments (under 230 base pairs) through a single PCR followed by capillary electrophoresis (CE), allowing a very simple one-tube PCR-to-CE approach. HGDP-CEPH diversity panel samples from the four groups, together with Oceanians, were genotyped to evaluate the efficiency of the assay in clustering populations from different continental origins and to establish reference databases. In addition, other populations from diverse geographic origins were tested using the HGDP-CEPH samples as reference data. The results revealed that the AIM-INDEL set developed is highly efficient at inferring the ancestry of individuals and provides good estimates of ancestry proportions at the population level. In conclusion, we have optimized the multiplexed genotyping of 46 AIM-INDELs in a simple and informative assay, enabling a more straightforward alternative to the commonly available AIM-SNP typing methods dependent on complex, multi-step protocols or implementation of large-scale genotyping technologies.
Year: 2012 PMID: 22272242 PMCID: PMC3260179 DOI: 10.1371/journal.pone.0029684
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
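The abstract describes AIMs as markers whose allele frequencies differ strongly between groups, which is what makes it possible to infer an individual's likely origin. The following is a minimal sketch of that idea as a naive-Bayes-style likelihood comparison, assuming Hardy-Weinberg genotype proportions and independent markers. The three markers and their insertion-allele frequencies in `INSERTION_FREQ` are hypothetical values chosen for illustration, not data from the paper, and the sketch is not claimed to reproduce the STRUCTURE or Snipper analyses.

```python
import math

# Hypothetical insertion-allele frequencies for three made-up markers in
# three reference groups. These are illustrative values, NOT from the paper.
INSERTION_FREQ = {
    "AFR": [0.90, 0.15, 0.20],
    "EUR": [0.10, 0.85, 0.25],
    "EAS": [0.12, 0.20, 0.88],
}


def genotype_likelihood(p, n_insertions):
    """Hardy-Weinberg probability of carrying 0, 1 or 2 insertion alleles."""
    q = 1.0 - p
    return {0: q * q, 1: 2.0 * p * q, 2: p * p}[n_insertions]


def log_likelihoods(genotype):
    """Sum per-marker log-likelihoods, treating the markers as independent."""
    return {
        group: sum(
            math.log(genotype_likelihood(p, g)) for p, g in zip(freqs, genotype)
        )
        for group, freqs in INSERTION_FREQ.items()
    }


def assign(genotype):
    """Return the reference group with the highest log-likelihood."""
    scores = log_likelihoods(genotype)
    return max(scores, key=scores.get)


# A profile homozygous for the insertion at the first marker only.
profile = [2, 0, 0]
print(assign(profile))
```

With larger frequency differences between groups, each marker contributes a larger log-likelihood gap, which is why a small panel of well-chosen markers can separate continental groups.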
AIM-INDELs used in the multiplex.
| MID | rs number | Chromosome | Position (bp) | Alleles described in dbSNP | References |
| MID-1470 | rs2307666 | 11 | 64729920 | -/ | |
| MID-777 | rs1610863 | 16 | 6551830 | -/ | |
| MID-196 | rs16635 | 6 | 99789775 | -/ | |
| MID-881 | rs1610965 | 5 | 79746093 | -/ | |
| MID-3122 | rs35451359 | 18 | 45110983 | -/ | |
| MID-548 | rs140837 | 6 | 3708909 | -/ | |
| MID-659 | rs1160893 | 2 | 224794577 | -/ | |
| MID-2011 | rs2308203 | 2 | 109401291 | -/ | |
| MID-2929 | rs33974167 | 8 | 87813725 | -/ | |
| MID-593 | rs1160852 | 6 | 137345857 | -/ | |
| MID-798 | rs1610884 | 5 | 56122323 | -/ | |
| MID-1193 | rs2067280 | 5 | 89818959 | -/ | |
| MID-1871 | rs2308067 | 7 | 127291541 | -/ | |
| MID-17 | rs4183 | 3 | 3192524 | -/ | |
| MID-2538 | rs3054057 | 15 | 86010538 | -/ | |
| MID-1644 | rs2307840 | 1 | 36099090 | -/ | |
| MID-3854 | rs60612424 | 6 | 84017514 | -/ | |
| MID-2275 | rs3033053 | 14 | 42554496 | -/ | |
| MID-94 | rs16384 | 22 | 42045009 | -/ | |
| MID-3072 | rs34611875 | 18 | 67623917 | -/ | |
| MID-772 | rs1610859 | 5 | 128317275 | -/ | |
| MID-2313 | rs3045215 | 1 | 234740917 | -/ | |
| MID-397 | rs25621 | 6 | 139858158 | -/ | |
| MID-1636 | rs2307832 | 1 | 55590789 | -/ | |
| MID-51 | rs16343 | 4 | 17635560 | -/ | |
| MID-2431 | rs3031979 | 8 | 73501951 | -/ | |
| MID-2264 | rs34122827 | 13 | 63778778 | -/ | |
| MID-2256 | rs133052 | 22 | 41042364 | -/ | |
| MID-128 | rs6490 | 12 | 108127168 | -/ | |
| MID-15 | rs4181 | 2 | 42577803 | -/ | |
| MID-2241 | rs3030826 | 6 | 67176774 | -/ | |
| MID-419 | rs140708 | 6 | 170720016 | -/ | |
| MID-943 | rs1611026 | 5 | 82545545 | -/ | |
| MID-159 | rs16438 | 20 | 25278470 | -/ | |
| MID-2005 | rs2308161 | 10 | 69800909 | -/ | |
| MID-250 | rs16687 | 7 | 83887882 | -/ | |
| MID-1802 | rs2307998 | 5 | 7814345 | -/ | |
| MID-1607 | rs2307803 | 3 | 108981031 | -/ | |
| MID-1734 | rs2307930 | 6 | 84476378 | -/ | |
| MID-406 | rs25630 | 6 | 14734341 | -/ | |
| MID-1386 | rs2307582 | 1 | 247768775 | -/ | |
| MID-1726 | rs2307922 | 1 | 39896964 | -/ | |
| MID-3626 | rs11267926 | 15 | 45526069 | -/ | |
| MID-360 | rs25584 | 12 | 112145217 | -/ | |
| MID-1603 | rs2307799 | 5 | 70828427 | -/ | |
| MID-2719 | rs34541393 | 20 | 30701405 | -/ | |
*Nomenclature according to [7] and Marshfield Diallelic Insertion/Deletion Polymorphisms database;
**Mapping data according to dbSNP (build 132).
Figure 1. Example of an electropherogram obtained for the HGDP-CEPH 0452 sample with the 46 AIM-INDEL multiplex (markers are identified by MID number).
Figure 2. Analysis of HGDP-CEPH diversity panel samples from four continental origins using a set of 46 AIM-INDELs.
A) ancestral membership proportions (based on STRUCTURE results from 3 independent runs treated in CLUMPP and plotted with distruct; individuals were first sorted by geographic origin of population, and within those by ascending population code and HGDP individual number); B) estimated ln probability of the data (−ln P(D) obtained with STRUCTURE and plotted using Structure Harvester); C) principal component analysis 3D plots; D) estimation of population assignment success (results from leave-one-out cross-validation studies using the Snipper app suite; see methods for details of the analyses). AFR: Africa; EUR: Europe; EAS: East Asia; NAM: Native America.
Figure 3. Ancestral membership proportions for testing population samples from different continental origins using the HGDP-CEPH diversity panel genetic data as training sets.
Angola (Africa); Portugal (Europe); Taiwan (East Asia); Brazilian Amazonas tribes (Native America); Belém is an example of a highly admixed Brazilian city in northeastern Amazonas.
Ancestral membership proportions for HGDP-CEPH diversity panel samples and testing populations from four continental origins.
| | 46 AIM-INDELs (this study) | | | | 210 INDELs | | | | | 48 In4 AIM-SNP set | | | |
| | AFR | EUR | EAS | NAM | AFR | EUR | EAS | NAM | | AFR | EURA | EAS | AMI |
| HGDP-CEPH AFR | | 0.011 | 0.012 | 0.008 | | 0.009 | 0.009 | 0.005 | AFR | | 0.02 | 0.01 | 0.01 |
| HGDP-CEPH EUR | 0.008 | | 0.014 | 0.014 | 0.007 | | 0.013 | 0.013 | EURA | 0.01 | | 0.02 | 0.01 |
| HGDP-CEPH EAS | 0.006 | 0.018 | | 0.024 | 0.007 | 0.021 | | 0.017 | EAS | 0.01 | 0.04 | | 0.03 |
| HGDP-CEPH NAM | 0.008 | 0.041 | 0.027 | | 0.011 | 0.028 | 0.015 | | AMI | 0.01 | 0.03 | 0.04 | |
| Testing populations: | AFR | EUR | EAS | NAM | | | | | | | | | |
| Angola | | 0.011 | 0.011 | 0.008 | | | | | | | | | |
| Portugal | 0.018 | | 0.008 | 0.008 | | | | | | | | | |
| Taiwan | 0.004 | 0.003 | | 0.009 | | | | | | | | | |
| Br. Amazonas tribes | 0.010 | 0.013 | 0.032 | | | | | | | | | | |
| Belém (4G-analysis) | 0.148 | 0.535 | 0.088 | 0.229 | | | | | | | | | |
| Belém (3G-analysis) | 0.168 | 0.537 | - | 0.295 | | | | | | | | | |
(AFR: Africa; EUR: Europe; EAS: East Asia; NAM: Native America).
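Membership proportions for a given sample are expected to sum to 1 across the inferred groups. As a small sanity check under that assumption, the sketch below adds up the two Belém rows reported in the table and identifies the largest component; the variable names are illustrative only.

```python
# Belém membership proportions copied from the table above.
belem = {
    "4G": {"AFR": 0.148, "EUR": 0.535, "EAS": 0.088, "NAM": 0.229},
    "3G": {"AFR": 0.168, "EUR": 0.537, "NAM": 0.295},  # EAS not estimated
}

totals = {}
largest = {}
for label, props in belem.items():
    totals[label] = sum(props.values())
    largest[label] = max(props, key=props.get)
    print(
        f"Belém ({label}-analysis): total = {totals[label]:.3f}, "
        f"largest component = {largest[label]}"
    )
```

Both rows total 1.000 at the reported precision, and the European component is the largest in both the four-group and three-group analyses.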
Figure 4. Ancestral membership proportions for HGDP-CEPH diversity panel samples from five continental origins using a set of 46 AIM-INDELs (based on STRUCTURE results from 3 independent runs treated in CLUMPP and plotted with distruct; individuals were first sorted by geographic origin of population, and within those by ascending population code and HGDP individual number).
AFR: Africa; EUR: Europe; EAS: East Asia; NAM: Native America; OCE: Oceania.