Tesfaye M Baye, Hemant K Tiwari, David B Allison, Rodney C Go.
Abstract
BACKGROUND: New technologies make it possible for the first time to genotype hundreds of thousands of SNPs simultaneously. A wealth of genomic information in publicly available databases remains underutilized as a resource for uncovering functionally relevant markers underlying complex human traits. Given the huge amount of SNP data available from the annotation of human genetic variation, data mining is a reasonable approach to investigating how many SNPs are informative for ancestry.
Year: 2009 PMID: 19216798 PMCID: PMC2649128 DOI: 10.1186/1756-0381-2-1
Source DB: PubMed Journal: BioData Min ISSN: 1756-0381 Impact factor: 2.522
Number of SNPs investigated for data mining of AIMs on each chromosome in the Yoruban and European populations, listed by genotyping platform or source.
| Chr | HapMap | 500 k Affymetrix | 100 k Illumina | AIMs |
| 1 | 286584 | 39418 | 9820 | 241 |
| 2 | 304922 | 40633 | 8702 | 230 |
| 3 | 235256 | 33120 | 7207 | 190 |
| 4 | 224433 | 31339 | 6000 | 132 |
| 5 | 230257 | 31595 | 6329 | 136 |
| 6 | 251838 | 31130 | 6579 | 192 |
| 7 | 196235 | 25407 | 5581 | 124 |
| 8 | 199358 | 26948 | 4891 | 129 |
| 9 | 169079 | 22596 | 4480 | 115 |
| 10 | 197292 | 28217 | 5240 | 144 |
| 11 | 189407 | 28217 | 5240 | 144 |
| 12 | 177798 | 25998 | 5928 | 164 |
| 13 | 146641 | 24712 | 5465 | 129 |
| 14 | 114909 | 18910 | 3093 | 76 |
| 15 | 99603 | 15432 | 3420 | 91 |
| 16 | 101959 | 14190 | 3307 | 87 |
| 17 | 83339 | 15069 | 3388 | 103 |
| 18 | 111158 | 11127 | 4079 | 130 |
| 19 | 51689 | 14631 | 2570 | 83 |
| 20 | 111869 | 6284 | 3520 | 117 |
| 21 | 45994 | 12266 | 3007 | 83 |
| 22 | 51037 | 7014 | 1381 | 59 |
| X | 103517 | 6123 | 1886 | 94 |
| Y | 54 | - | - | - |
| Total | 3684228 | 492556 | 109366 | 3011 |
Chr = chromosome
Distribution of allele frequency differences (Yoruba vs. European) across SNP marker databases
| Allele freq difference | HapMap SNPs | HapMap % | Affymetrix SNPs | Affymetrix % | Illumina SNPs | Illumina % | AIMs SNPs | AIMs % |
| 0 | 635890 | 17.26 | 12813 | 2.60 | 1392 | 1.27 | - | - |
| 0.01–0.29 | 2477910 | 67.257 | 385585 | 78.28 | 91992 | 84.11 | - | - |
| 0.3–0.50 | 440866 | 11.966 | 73066 | 14.83 | 15833 | 14.62 | 993 | 33.83 |
| 0.51–0.70 | 114055 | 3.096 | 18910 | 3.84 | | | 1515 | 51.63 |
| 0.71–0.90 | 14957 | 0.406 | 2138 | 0.44 | - | - | 414 | 14.11 |
| 0.91–0.99 | 520 | 0.014 | 28 | 0.01 | - | - | 12 | 0.40 |
| 1 | 30 | 0.001 | - | - | - | - | - | - |
Chr = chromosome
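The bins in the table above can be reproduced from per-SNP allele frequencies. The following is a minimal sketch, assuming that the allele frequency difference is the absolute difference in the frequency of the same allele between the two populations and that it is rounded to two decimals before binning. The function names `bin_delta` and `delta_distribution` and the example frequencies are hypothetical and are not taken from the study.

```python
from collections import Counter

# Bin edges in hundredths, mirroring the table rows (assumption: deltas are
# rounded to two decimals before binning).
BINS = [
    ("0", 0, 0),
    ("0.01-0.29", 1, 29),
    ("0.30-0.50", 30, 50),
    ("0.51-0.70", 51, 70),
    ("0.71-0.90", 71, 90),
    ("0.91-0.99", 91, 99),
    ("1", 100, 100),
]

def bin_delta(freq_pop1, freq_pop2):
    """Return the table bin for the absolute allele frequency difference."""
    hundredths = round(abs(freq_pop1 - freq_pop2) * 100)
    for label, low, high in BINS:
        if low <= hundredths <= high:
            return label
    raise ValueError("allele frequencies must lie in [0, 1]")

def delta_distribution(freq_pairs):
    """Count SNPs per bin and report each bin's share in percent."""
    counts = Counter(bin_delta(p1, p2) for p1, p2 in freq_pairs)
    total = sum(counts.values())
    return {label: (counts[label], 100 * counts[label] / total)
            for label, _, _ in BINS if counts[label]}

# Hypothetical frequencies for four SNPs (not data from the study).
example = [(0.80, 0.50), (0.10, 0.25), (0.50, 0.50), (1.00, 0.00)]
distribution = delta_distribution(example)
```

Working in integer hundredths avoids floating-point surprises at the bin edges, for example when a difference such as 0.80 - 0.50 evaluates to slightly more than 0.3.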
Total number of SNPs per chromosome and the number with delta ≥ 0.3 (percentage in parentheses) for the HapMap, Affymetrix, Illumina and AIM databases.
| Chr | HapMap SNPs total | HapMap SNPs delta (%) | Affymetrix SNPs total | Affymetrix SNPs delta (%) | Illumina SNPs total | Illumina SNPs delta (%) | AIM SNPs total | AIM SNPs delta (%) |
| 1 | 270009 | 36255 (13) | 39418 | 4439 (11) | 9820 | 1471 (15) | 235 | 235 (100) |
| 2 | 293090 | 46551 (16) | 40633 | 4649 (11) | 8702 | 1280 (15) | 217 | 217 (100) |
| 3 | 225937 | 35394 (16) | 33120 | 3779 (11) | 7207 | 1021 (14) | 178 | 178 (100) |
| 4 | 214465 | 33242 (16) | 31339 | 3449 (11) | 6000 | 929 (15) | 124 | 124 (100) |
| 5 | 221858 | 31821 (14) | 31595 | 3245 (10) | 6329 | 879 (14) | 129 | 129 (100) |
| 6 | 244251 | 32121 (13) | 31130 | 3126 (10) | 6579 | 894 (14) | 184 | 184 (100) |
| 7 | 182354 | 26745 (15) | 25407 | 2785 (11) | 5581 | 826 (15) | 121 | 121 (100) |
| 8 | 192846 | 32106 (17) | 26948 | 3142 (12) | 4891 | 751 (15) | 122 | 122 (100) |
| 9 | 162192 | 23800 (15) | 22596 | 2447 (11) | 4480 | 585 (13) | 108 | 108 (100) |
| 10 | 189583 | 26671 (14) | 28217 | 3005 (11) | 5240 | 784 (15) | 135 | 135 (100) |
| 11 | 180434 | 23850 (13) | 28217 | 2767 (11) | 5240 | 863 (15) | 154 | 154 (100) |
| 12 | 169898 | 23058 (14) | 25998 | 2672 (11) | 5928 | 768 (14) | 125 | 125 (100) |
| 13 | 142568 | 18327 (13) | 24712 | 1909 (10) | 5465 | 399 (13) | 71 | 71 (100) |
| 14 | 110229 | 16581 (15) | 18910 | 1667 (11) | 3093 | 499 (15) | 88 | 88 (100) |
| 15 | 95436 | 16511 (17) | 15432 | 1778 (13) | 3420 | 533 (16) | 83 | 83 (100) |
| 16 | 96742 | 14331 (15) | 14190 | 1741 (12) | 3307 | 536 (16) | 95 | 95 (100) |
| 17 | 79038 | 12212 (16) | 15069 | 1342 (12) | 3388 | 626 (15) | 115 | 115 (100) |
| 18 | 107243 | 15605 (15) | 11127 | 1613 (11) | 4079 | 354 (14) | 79 | 79 (100) |
| 19 | 48447 | 6970 (14) | 14631 | 620 (10) | 2570 | 439 (12) | 102 | 102 (100) |
| 20 | 108979 | 12941 (12) | 6284 | 1357 (11) | 3520 | 431 (14) | 80 | 80 (100) |
| 21 | 43739 | 6775 (16) | 12266 | 772 (11) | 3007 | 169 (12) | 54 | 54 (100) |
| 22 | 49009 | 6419 (13) | 7014 | 620 (10) | 1381 | 304 (16) | 88 | 88 (100) |
| X | 102866 | 23368 (23) | 6123 | 1867 (18) | 1886 | 640 (18) | 156 | 156 (100) |
| Y | 55 | 11 (20) | - | - | - | - | - | - |
Chr = chromosome
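The per-chromosome counts in the table above amount to a threshold filter followed by a tally. The sketch below shows one way this could be done, assuming each SNP record carries a chromosome label and the frequency of the same allele in the two populations. The function name `summarize_by_chromosome`, the record layout, and the small floating-point tolerance are assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict

DELTA_THRESHOLD = 0.3  # cutoff used in the table caption (delta >= 0.3)

def summarize_by_chromosome(snps, threshold=DELTA_THRESHOLD):
    """Tally SNPs per chromosome and those with delta >= threshold.

    snps: iterable of (chromosome, freq_pop1, freq_pop2) tuples
          (hypothetical record layout).
    Returns {chromosome: (total, n_above_threshold, rounded_percent)}.
    """
    totals = defaultdict(int)
    above = defaultdict(int)
    for chrom, freq_pop1, freq_pop2 in snps:
        totals[chrom] += 1
        # A tiny tolerance keeps values that are exactly at the cutoff.
        if abs(freq_pop1 - freq_pop2) >= threshold - 1e-9:
            above[chrom] += 1
    return {
        chrom: (totals[chrom], above[chrom],
                round(100 * above[chrom] / totals[chrom]))
        for chrom in totals
    }

# Hypothetical records (not data from the study).
records = [("1", 0.90, 0.20), ("1", 0.50, 0.45), ("X", 0.80, 0.50)]
summary = summarize_by_chromosome(records)
```

The rounded percentage follows the same arithmetic as the table: for chromosome 1 in HapMap, 36255 of 270009 SNPs is about 13%.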
Number of overlapping SNP AIMs selected by different platforms (HapMap, Affymetrix, Illumina, and AIMs).
| Chr | H<->A | H<->I | H<->S | A<->I | A<->S | I<->S | H<->A<->I | H<->A<->S | H<->I<->S | A<->I<->S | All |
| 1 | 1923 | 0 | 107 | 0 | 5 | 0 | 0 | 1 | 0 | 0 | 0 |
| 2 | 2180 | 0 | 114 | 0 | 7 | 0 | 0 | 5 | 0 | 0 | 0 |
| 3 | 1983 | 0 | 88 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
| 4 | 1433 | 0 | 58 | 0 | 6 | 0 | 0 | 3 | 0 | 0 | 0 |
| 5 | 1590 | 0 | 71 | 0 | 7 | 0 | 0 | 4 | 0 | 0 | 0 |
| 6 | 1836 | 0 | 105 | 0 | 8 | 0 | 0 | 7 | 0 | 0 | 0 |
| 7 | 1240 | 0 | 55 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 0 |
| 8 | 1546 | 0 | 61 | 0 | 5 | 0 | 0 | 4 | 0 | 0 | 0 |
| 9 | 1073 | 0 | 63 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 0 |
| 10 | 1275 | 0 | 69 | 0 | 5 | 0 | 0 | 2 | 0 | 0 | 0 |
| 11 | 1167 | 0 | 67 | 0 | 2 | 0 | 0 | 1 | 0 | 0 | 0 |
| 12 | 1071 | 0 | 67 | 0 | 6 | 0 | 0 | 5 | 0 | 0 | 0 |
| 13 | 1073 | 0 | 38 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
| 14 | 760 | 0 | 46 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 0 |
| 15 | 863 | 0 | 45 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 0 |
| 16 | 824 | 0 | 52 | 0 | 4 | 0 | 0 | 3 | 0 | 0 | 0 |
| 17 | 694 | 0 | 56 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 0 |
| 18 | 883 | 0 | 55 | 0 | 5 | 0 | 0 | 3 | 0 | 0 | 0 |
| 19 | 244 | 0 | 48 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
| 20 | 863 | 0 | 47 | 0 | 3 | 0 | 0 | 3 | 0 | 0 | 0 |
| 21 | 431 | 0 | 31 | 0 | 6 | 0 | 0 | 5 | 0 | 0 | 0 |
| 22 | 361 | 0 | 41 | 0 | 5 | 0 | 0 | 1 | 0 | 0 | 0 |
| X | 1075 | 0 | 95 | 0 | 7 | 0 | 0 | 5 | 0 | 0 | 0 |
Chr = chromosome, H = HapMap, A = Affymetrix, I = Illumina, and S = AIMs identified by Smith et al. (2004)
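Overlap counts of this kind can be obtained with set intersections over SNP identifiers. The following is a rough sketch under the assumption that each source is available as a set of identifiers such as rs numbers. The function name `overlap_counts` and the example identifiers are hypothetical.

```python
from itertools import combinations

def overlap_counts(panels):
    """Count SNPs shared by every combination of two or more panels.

    panels: dict mapping a panel label (e.g. "H", "A", "I", "S") to a set
            of SNP identifiers. Labels keep the dict's insertion order.
    """
    labels = list(panels)
    counts = {}
    for size in range(2, len(labels) + 1):
        for combo in combinations(labels, size):
            shared = set.intersection(*(panels[label] for label in combo))
            counts["<->".join(combo)] = len(shared)
    return counts

# Hypothetical identifiers (not data from the study).
panels = {
    "H": {"rs1", "rs2", "rs3"},
    "A": {"rs2", "rs3"},
    "I": {"rs9"},
    "S": {"rs3"},
}
counts = overlap_counts(panels)
```

With four panels this yields eleven combinations, in the same order as the table columns; the four-way key `H<->A<->I<->S` corresponds to the "All" column.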