Guohong Hu, Hui-Yun Wang, Danielle M Greenawalt, Marco A Azaro, Minjie Luo, Irina V Tereshchenko, Xiangfeng Cui, Qifeng Yang, Richeng Gao, Li Shen, Honghua Li.
Abstract
Microarray-based analysis of single nucleotide polymorphisms (SNPs) has many applications in large-scale genetic studies. To minimize the influence of experimental variation, microarray data usually need to be processed in several ways, including background subtraction, normalization and low-signal filtering, before genotype determination. Although many sophisticated algorithms exist for these purposes, biases are still present. In the present paper, new algorithms for SNP microarray data analysis and the software, AccuTyping, developed based on these algorithms are described. The algorithms take advantage of the large number of SNPs included in each assay, and the fact that the top and bottom 20% of SNPs can be safely treated as homozygous after sorting based on the ratios between their signal intensities. These SNPs are then used as controls for color channel normalization and background subtraction. Genotype calls are made based on the logarithms of signal intensity ratios using two cutoff values, which were determined after training the program with a dataset of approximately 160,000 genotypes and validated by non-microarray methods. AccuTyping was used to determine >300,000 genotypes of DNA and sperm samples. The accuracy was shown to be >99%. AccuTyping can be downloaded from http://www2.umdnj.edu/lilabweb/publications/AccuTyping.html.
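
The control-selection step described in the abstract (sort SNPs by signal intensity ratio, then treat the top and bottom 20% as homozygous) can be illustrated with a minimal Python sketch. This is not the AccuTyping implementation: the function name `presumed_homozygous_controls`, the `red`/`green` channel naming, and the decision to skip spots with non-positive intensity are assumptions made for illustration. How the selected controls are then used for normalization and background subtraction is not specified in the abstract and is not attempted here.

```python
import math


def presumed_homozygous_controls(red, green, fraction=0.2):
    """Rank SNPs by ln(red/green) and return (bottom, top) index lists.

    Hypothetical helper: each list holds the indices of the lowest and
    highest `fraction` of SNPs, i.e. the ones the abstract says can be
    treated as homozygous controls.
    """
    if len(red) != len(green):
        raise ValueError("red and green must have the same length")
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5]")

    # Assumption: spots with a non-positive intensity have no defined
    # log ratio and are left out of the ranking.
    usable = [i for i in range(len(red)) if red[i] > 0 and green[i] > 0]
    ranked = sorted(usable, key=lambda i: math.log(red[i] / green[i]))

    k = int(len(ranked) * fraction)
    bottom = ranked[:k]
    top = ranked[len(ranked) - k:]
    return bottom, top


# Illustrative intensities for ten SNPs (made-up numbers, not study data).
red = [100, 10, 50, 900, 5, 60, 40, 600, 20, 55]
green = [10, 100, 50, 9, 500, 55, 45, 30, 100, 60]
bottom, top = presumed_homozygous_controls(red, green)
```

With 1172 SNPs per array, as in the table on this page, a 20% fraction would select 234 presumed homozygous controls at each end of the ranking.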
Year: 2006 PMID: 16982644 PMCID: PMC1635267 DOI: 10.1093/nar/gkl601
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Scatter plots of data from a microarray for 1172 SNPs. The signal intensity log ratios, Ln(R)'s, are plotted against the signal sums of the two color intensities (R + G, Cy3 + Cy5). Upper panel, plot using the original data. Lower panel, plot after data processing. Note, spots with both color intensities smaller than the low-signal filtering values (Equation 2) were eliminated by the program and are not plotted in the lower panel. The two lines, y = ±1.5, encompass the heterozygous cluster.
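
To make the caption concrete, the following Python sketch calls a genotype from the natural log of the intensity ratio, using the two lines y = ±1.5 as the cutoffs. Several things here are assumptions rather than facts from the paper: that the trained cutoff values equal the ±1.5 lines drawn in the figure, that the ratio is taken as red over green, the function name `call_genotype`, and the `low_red`/`low_green` thresholds, which stand in for the low-signal filtering values of Equation 2 (not reproduced on this page).

```python
import math


def call_genotype(red, green, cutoff=1.5, low_red=0.0, low_green=0.0):
    """Return a genotype label from two channel intensities, or None.

    Hypothetical sketch: `cutoff` mirrors the y = +/-1.5 lines in the
    figure; `low_red` and `low_green` are placeholder filter values.
    """
    # Low-signal filter: the caption says spots with BOTH intensities
    # below the filtering values are eliminated.
    if red < low_red and green < low_green:
        return None
    # Assumption: a non-positive intensity has no defined log ratio.
    if red <= 0 or green <= 0:
        return None

    log_ratio = math.log(red / green)
    if log_ratio > cutoff:
        return "homozygous (red-channel allele)"
    if log_ratio < -cutoff:
        return "homozygous (green-channel allele)"
    return "heterozygous"
```

For example, intensities of 1000 and 100 give ln(10) ≈ 2.30, which lies above 1.5 and is called homozygous, while 300 and 200 give ln(1.5) ≈ 0.41, which falls inside the heterozygous band.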
Summary of genotyping results of 1172 SNPs in 24 human genomic DNA samples
| Sample No. | Total SNPs | Each direction, AG: No. | Each direction, AG: % | Each direction, AG: heterozygous No. | Each direction, AG: heterozygous % | Each direction, CT: No. | Each direction, CT: % | Each direction, CT: heterozygous No. | Each direction, CT: heterozygous % | Both directions, total: No. | Both directions, total: % | Both directions, concordant: No. | Both directions, concordant: % | Both directions, concordant: heterozygous No. | Both directions, concordant: heterozygous % | Accuracy(CT) | Both directions, non-concordant: No. | Both directions, non-concordant: % | One direction, AG: No. | One direction, AG: % | One direction, CT: No. | One direction, CT: % | Undetectable: No. | Undetectable: % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1172 | 1161 | 99.06 | 356 | 30.66 | 1159 | 98.89 | 362 | 31.23 | 1149 | 98.04 | 1110 | 96.61 | 337 | 30.38 | 99.97 | 39 | 3.39 | 12 | 1.02 | 10 | 0.85 | 1 | 0.09 |
| 2 | 1172 | 1151 | 98.21 | 317 | 27.54 | 1162 | 99.15 | 320 | 27.54 | 1144 | 97.61 | 1108 | 96.85 | 298 | 26.90 | 99.98 | 36 | 3.15 | 7 | 0.60 | 18 | 1.54 | 3 | 0.26 |
| 3 | 1172 | 1150 | 98.12 | 315 | 27.39 | 1160 | 98.98 | 318 | 27.41 | 1140 | 97.27 | 1102 | 96.67 | 298 | 27.04 | 99.97 | 38 | 3.33 | 10 | 0.85 | 20 | 1.71 | 2 | 0.17 |
| 4 | 1172 | 1151 | 98.21 | 323 | 28.06 | 1169 | 99.74 | 325 | 27.80 | 1150 | 98.12 | 1118 | 97.22 | 308 | 27.55 | 99.98 | 32 | 2.78 | 1 | 0.09 | 19 | 1.62 | 2 | 0.17 |
| 5 | 1172 | 1154 | 98.46 | 330 | 28.60 | 1165 | 99.40 | 326 | 27.98 | 1148 | 97.95 | 1111 | 96.78 | 309 | 27.81 | 99.97 | 37 | 3.22 | 6 | 0.51 | 17 | 1.45 | 1 | 0.09 |
| 6 | 1172 | 1161 | 99.06 | 388 | 33.42 | 1168 | 99.66 | 386 | 33.05 | 1158 | 98.81 | 1130 | 97.58 | 372 | 32.92 | 99.99 | 28 | 2.42 | 3 | 0.26 | 10 | 0.85 | 1 | 0.09 |
| 7 | 1172 | 1158 | 98.81 | 266 | 22.97 | 1149 | 98.04 | 255 | 22.19 | 1136 | 96.93 | 1108 | 97.54 | 245 | 22.11 | 99.98 | 28 | 2.46 | 22 | 1.88 | 13 | 1.11 | 1 | 0.09 |
| 8 | 1172 | 1162 | 99.15 | 274 | 23.58 | 1153 | 98.38 | 259 | 22.46 | 1145 | 97.70 | 1112 | 97.12 | 249 | 22.39 | 99.98 | 33 | 2.88 | 17 | 1.45 | 8 | 0.68 | 2 | 0.17 |
| 9 | 1172 | 1157 | 98.72 | 269 | 23.25 | 1150 | 98.12 | 266 | 23.13 | 1138 | 97.10 | 1099 | 96.57 | 249 | 22.66 | 99.97 | 39 | 3.43 | 19 | 1.62 | 12 | 1.02 | 3 | 0.26 |
| 10 | 1172 | 1158 | 98.81 | 248 | 21.42 | 1164 | 99.32 | 255 | 21.91 | 1152 | 98.29 | 1123 | 97.48 | 239 | 21.28 | 99.98 | 29 | 2.52 | 6 | 0.51 | 12 | 1.02 | 2 | 0.17 |
| 11 | 1172 | 1158 | 98.81 | 281 | 24.27 | 1151 | 98.21 | 271 | 23.54 | 1139 | 97.18 | 1109 | 97.37 | 257 | 23.17 | 99.98 | 30 | 2.63 | 19 | 1.62 | 12 | 1.02 | 2 | 0.17 |
| 12 | 1172 | 1158 | 98.81 | 265 | 22.88 | 1160 | 98.98 | 263 | 22.67 | 1147 | 97.87 | 1117 | 97.38 | 252 | 22.56 | 99.98 | 30 | 2.62 | 11 | 0.94 | 13 | 1.11 | 1 | 0.09 |
| 13 | 1172 | 1151 | 98.21 | 325 | 28.24 | 1157 | 98.72 | 318 | 27.48 | 1141 | 97.35 | 1112 | 97.48 | 304 | 27.34 | 99.98 | 29 | 2.54 | 10 | 0.85 | 16 | 1.37 | 5 | 0.43 |
| 14 | 1172 | 1145 | 97.70 | 337 | 29.43 | 1165 | 99.40 | 350 | 30.04 | 1141 | 97.35 | 1115 | 97.72 | 329 | 29.51 | 99.99 | 26 | 2.28 | 4 | 0.34 | 24 | 2.05 | 3 | 0.26 |
| 15 | 1172 | 1147 | 97.87 | 339 | 29.56 | 1160 | 98.98 | 339 | 29.22 | 1142 | 97.44 | 1110 | 97.20 | 320 | 28.83 | 99.98 | 32 | 2.80 | 5 | 0.43 | 18 | 1.54 | 7 | 0.60 |
| 16 | 1172 | 1155 | 98.55 | 330 | 28.57 | 1162 | 99.15 | 336 | 28.92 | 1147 | 97.87 | 1109 | 96.69 | 314 | 28.13 | 99.97 | 38 | 3.31 | 8 | 0.68 | 15 | 1.28 | 2 | 0.17 |
| 17 | 1172 | 1160 | 98.98 | 310 | 26.72 | 1162 | 99.15 | 318 | 27.37 | 1152 | 98.29 | 1117 | 96.96 | 296 | 26.50 | 99.98 | 35 | 3.04 | 8 | 0.68 | 10 | 0.85 | 2 | 0.17 |
| 18 | 1172 | 1161 | 99.06 | 330 | 28.42 | 1165 | 99.40 | 330 | 28.33 | 1154 | 98.46 | 1123 | 97.31 | 314 | 27.96 | 99.98 | 31 | 2.69 | 7 | 0.60 | 11 | 0.94 | 0 | 0.00 |
| 19 | 1172 | 1141 | 97.35 | 332 | 29.10 | 1157 | 98.72 | 341 | 29.47 | 1130 | 96.42 | 1095 | 96.90 | 317 | 28.95 | 99.98 | 35 | 3.10 | 11 | 0.94 | 27 | 2.30 | 4 | 0.34 |
| 20 | 1172 | 1162 | 99.15 | 330 | 28.40 | 1140 | 97.27 | 320 | 28.07 | 1132 | 96.59 | 1094 | 96.64 | 301 | 27.51 | 99.97 | 38 | 3.36 | 30 | 2.56 | 8 | 0.68 | 2 | 0.17 |
| 21 | 1172 | 1149 | 98.04 | 283 | 24.63 | 1154 | 98.46 | 289 | 25.04 | 1133 | 96.67 | 1100 | 97.09 | 268 | 24.36 | 99.98 | 33 | 2.91 | 16 | 1.37 | 21 | 1.79 | 2 | 0.17 |
| 22 | 1172 | 1159 | 98.89 | 334 | 28.82 | 1152 | 98.29 | 324 | 28.13 | 1140 | 97.27 | 1101 | 96.58 | 307 | 27.88 | 99.97 | 39 | 3.42 | 19 | 1.62 | 12 | 1.02 | 1 | 0.09 |
| 23 | 1172 | 1151 | 98.21 | 276 | 23.98 | 1164 | 99.32 | 275 | 23.63 | 1145 | 97.70 | 1109 | 96.86 | 259 | 23.35 | 99.98 | 36 | 3.14 | 6 | 0.51 | 19 | 1.62 | 2 | 0.17 |
| 24 | 1172 | 1156 | 98.63 | 320 | 27.68 | 1157 | 98.72 | 328 | 28.35 | 1143 | 97.53 | 1106 | 96.76 | 303 | 27.40 | 99.97 | 37 | 3.24 | 13 | 1.11 | 14 | 1.19 | 2 | 0.17 |
| Average | 1172 | 1155 | 98.54 | 312 | 26.98 | 1159 | 98.85 | 311.4 | 26.87 | 1144 | 97.58 | 1110 | 97.06 | 294 | 26.44 | 99.98 | 33.7 | 2.94 | 11.3 | 0.96 | 15 | 1.28 | 2.2 | 0.19 |
| SD | — | 5.8 | 0.50 | 33.9 | 2.95 | 6.9 | 0.59 | 36.29 | 3.07 | 7.1 | 0.61 | 8.9 | 0.36 | 34 | 3.01 | 0.50 | 4.1 | 0.36 | 6.96 | 0.59 | 5 | 0.43 | 1.5 | 0.13 |
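
The percentage columns for SNPs detectable in both directions can be checked against the counts in the table. The short Python sketch below does this for sample 1; the helper name `percent` is hypothetical, and the choice of denominators (total SNPs for the detection rate, SNPs detectable in both directions for the concordance rates) is inferred from the reported figures rather than stated on this page.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Counts for sample 1, copied from the table above.
total_snps = 1172
detectable_both = 1149
concordant = 1110
non_concordant = 39

detect_rate = percent(detectable_both, total_snps)        # share of all SNPs
concordant_rate = percent(concordant, detectable_both)    # share of both-direction SNPs
non_concordant_rate = percent(non_concordant, detectable_both)
```

These reproduce the tabulated 98.04%, 96.61% and 3.39% for sample 1, and the concordant and non-concordant counts sum to the both-direction total (1110 + 39 = 1149).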

Figure 2. The graphic interface of AccuTyping.