Stefan Wilkening, Bowang Chen, Michael Wirtenberger, Barbara Burwinkel, Asta Försti, Kari Hemminki, Federico Canzian.
Abstract
BACKGROUND: Genotyping technologies for whole genome association studies are now available. To perform such studies at an affordable price, pooled DNA can be used. Recent studies have shown that GeneChip Human Mapping 10 K and 50 K arrays are suitable for estimating allele frequencies in pooled DNA. In the present study, we tested the accuracy of the 250 K Nsp array, which is part of the 500 K array set representing 500,568 SNPs. Furthermore, we compared different algorithms for estimating allele frequencies in pooled DNA.
Year: 2007 PMID: 17367522 PMCID: PMC1839100 DOI: 10.1186/1471-2164-8-77
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Comparison of accuracies of three algorithms
| Simpson_our | 0.046 | 0.034 | 0.951 | 0.051 |
| Simpson_web | 0.051 | 0.038 | 0.941 | 0.056 |
| Craig_our | 0.067 | 0.049 | 0.909 | 0.072 |
| Craig_web | 0.080 | 0.061 | 0.903 | 0.075 |
| PPC_our | 0.043 | 0.033 | 0.959 | 0.022 |
| PPC_brohede | 0.050 | 0.038 | 0.946 | 0.022 |
The errors are based on estimates from 3574 SNPs which could be analyzed by all methods.
*Data used for normalization: "our" = 34 individuals analyzed in our lab; "Brohede" = 26 individuals analyzed in the lab of Brohede et al. [1]; "web" = more than 3000 individuals analyzed in the lab of Craig et al. [9]; files are available at [15].
Estimation accuracy in the Nsp 250 K array
| 1 | 0.056 | 91647 | 0.971 | 0.000 |
| 2 | 0.051 | 93654 | 0.976 | 0.041 |
| 3 | 0.047 | 99922 | 0.980 | 0.044 |
| 4 | 0.046 | 102687 | 0.983 | 0.046 |
| 1 | 0.095 | 4790 | 0.980 | 0.041 |
| 2 | 0.079 | 3544 | 0.987 | 0.041 |
| 3 | 0.070 | 3479 | 0.989 | 0.041 |
| 4 | 0.064 | 3523 | 0.989 | 0.042 |
| 5 | 0.061 | 3364 | 0.986 | 0.043 |
| 6 | 0.057 | 3623 | 0.989 | 0.043 |
| 7 | 0.054 | 3543 | 0.988 | 0.043 |
| 8 | 0.052 | 3356 | 0.987 | 0.045 |
| 9 | 0.049 | 3419 | 0.988 | 0.045 |
| 10 | 0.048 | 3524 | 0.987 | 0.046 |
| 15 | 0.042 | 3545 | 0.980 | 0.048 |
| 20 | 0.035 | 3701 | 0.976 | 0.048 |
| 25 | 0.030 | 3208 | 0.959 | 0.046 |
| 30 | 0.027 | 1329 | 0.941 | 0.043 |
| 35 | 0.024 | 141 | 0.954 | 0.041 |
| 0.0 – 0.1 | 0.096 | 27688 | 0.915 | 0.037 |
| 0.1 – 0.2 | 0.045 | 23875 | 0.983 | 0.043 |
| 0.2 – 0.3 | 0.038 | 18843 | 0.977 | 0.048 |
| 0.3 – 0.4 | 0.033 | 17339 | 0.953 | 0.051 |
| 0.4 – 0.5 | 0.030 | 15783 | 0.778 | 0.053 |
| A/T, T/A | 0.052 | 6799 | 0.979 | 0.050 |
| A/C, T/G | 0.048 | 16056 | 0.982 | 0.046 |
| A/G, T/C | 0.045 | 69445 | 0.983 | 0.045 |
| C/G, G/C | 0.043 | 10387 | 0.981 | 0.045 |
*To get the error for different numbers of replicates, we took the mean over all possible combinations of the four replicates. For 3 replicates, for example, we took the mean values of pool combinations 123, 124, 134, 234.
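The averaging over replicate combinations described in the footnote can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the data layout (one list of estimated allele frequencies per replicate), and the choice to average the replicate estimates SNP by SNP before computing the absolute error are all assumptions made for the example.

```python
from itertools import combinations
from statistics import mean


def replicate_combinations(replicates, k):
    """All ways to choose k replicates, e.g. 123, 124, 134, 234 for k = 3 of 4."""
    return list(combinations(replicates, k))


def mean_error_for_k(estimates, known, k):
    """Mean absolute error when k replicates are combined.

    estimates: dict mapping replicate id -> list of estimated allele
               frequencies, one per SNP (hypothetical layout).
    known:     list of known allele frequencies, same SNP order.
    """
    errors = []
    for combo in replicate_combinations(sorted(estimates), k):
        # Assumption: replicate estimates are averaged per SNP first.
        pooled = [mean(values) for values in zip(*(estimates[r] for r in combo))]
        # Error = absolute difference between estimated and known frequency.
        errors.append(mean(abs(p - q) for p, q in zip(pooled, known)))
    # Final value is the mean over all replicate combinations of size k.
    return mean(errors)


# Made-up numbers for two SNPs and four replicate arrays.
known = [0.5, 0.2]
estimates = {
    1: [0.6, 0.2],
    2: [0.4, 0.2],
    3: [0.5, 0.3],
    4: [0.5, 0.1],
}
print(replicate_combinations([1, 2, 3, 4], 3))
for k in range(1, 5):
    print(k, mean_error_for_k(estimates, known, k))
```

With four replicates there are 4, 6, 4 and 1 combinations for k = 1, 2, 3 and 4, so the single-replicate and three-replicate values are each an average over four combinations.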
Figure 1. Graph showing the correlation between detection rate (MDR) and call rate. Data derived from 100 NspI and 100 StyI arrays, hybridized with individual DNA. A 93% call rate corresponds to about 97.8% MDR.
Figure 2. Graph showing the correlation between detection rate (MDR) and the error (absolute difference between estimated and known allele frequency). Each cross stands for one 250 K array, all hybridized with the same DNA pool.