Farhat Naureen Memon, Graham J G Upton, Andrew P Harrison.
Abstract
We have previously discovered that probes containing runs of four or more contiguous guanines are not reliable for measuring gene expression in Human HG_U133A Affymetrix GeneChip data. These probes are not correlated with other members of their probe set, but they are correlated with each other. We now extend our analysis to different 3' GeneChip designs for mouse, rat, and human. We find that, in all these chip designs, the G-stack probes (probes with a run of exactly four consecutive guanines) are highly correlated with each other, indicating that such probes are not reliable measures of gene expression in mammalian studies. Furthermore, no single G-stack position gives the highest correlation on all the chips. We also find that the latest designs of rat and mouse chips have significantly fewer G-stack probes than their predecessors, whereas there has not been a similar reduction in G-stack density across the successive human chip designs. Moreover, we find significant changes in RMA values (after removing G-stack probes) as the number of G-stack probes in a probe set increases.
Year: 2010 PMID: 20725627 PMCID: PMC2915844 DOI: 10.4061/2010/489736
Source DB: PubMed Journal: J Nucleic Acids ISSN: 2090-0201
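The abstract defines a G-stack probe by a run of consecutive guanines in the probe sequence. The following is a minimal sketch of how such probes, and the position of the run, might be detected. The function name `g_stack_position` and its parameters are hypothetical, and the sketch assumes that the position is the 1-based index of the first G in the run; it is not the authors' code.

```python
import re


def g_stack_position(probe_seq, min_run=4, exact=False):
    """Return the 1-based start of the first G-run in a probe, or None.

    Assumption: positions are counted from the first base of the probe.
    With exact=False a run of `min_run` or more Gs matches; with
    exact=True only a run of exactly `min_run` Gs matches.
    """
    if exact:
        # Lookbehind/lookahead reject runs that are longer than min_run.
        pattern = rf"(?<!G)G{{{min_run}}}(?!G)"
    else:
        pattern = rf"G{{{min_run},}}"
    match = re.search(pattern, probe_seq.upper())
    return match.start() + 1 if match else None


# Made-up 25-base sequence, used only to exercise the function.
example_probe = "ACGT" + "GGGG" + "ACGTACGTACGTACGTA"
print(g_stack_position(example_probe))
```

For a 25-base probe and a run of four Gs, the start position can range from 1 to 22, which is consistent with the 22 positions reported in the position table.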
List of organisms and their chip designs used in this study. The number of annotated probes and the number of G-stack probes include both the main and control probes.
| Organism | Chip design | Chip size | No. of annotated probes | No. of G-stack probes | No. of affected probe sets |
|---|---|---|---|---|---|
| Human (Homo sapiens) | HG_U133_Plus_2 | 1164 × 1164 | 604,258 | 24,980 | 16,254 |
| | HG_U133A | 712 × 712 | 247,965 | 12,868 | 8,298 |
| | HG_U95A | 640 × 640 | 201,807 | 7,329 | 3,733 |
| | HG_U95B | 640 × 640 | 201,862 | 6,334 | 3,240 |
| | HG_U95D | 640 × 640 | 201,858 | 7,198 | 3,227 |
| | HG_U95E | 640 × 640 | 201,863 | 7,880 | 3,514 |
| Mouse (Mus musculus) | MOE430A | 712 × 712 | 249,958 | 372 | 314 |
| | MOE430B | 712 × 712 | 248,704 | 252 | 203 |
| | MG_U74Av2 | 640 × 640 | 197,993 | 7,360 | 3,556 |
| | MG_U74Bv2 | 640 × 640 | 197,131 | 7,006 | 3,614 |
| Rat (Rattus norvegicus) | RAE230A | 602 × 602 | 175,477 | 81 | 58 |
| | Rat230_2 | 834 × 834 | 342,410 | 208 | 163 |
| | RG_U34A | 534 × 534 | 140,317 | 3,691 | 2,104 |
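The abstract's comparison of G-stack density between chip generations can be reproduced from the counts in the table above. The sketch below assumes that "density" means G-stack probes as a percentage of annotated probes; the source does not spell out the exact definition, and the names `chips` and `density` are illustrative only.

```python
# (annotated probes, G-stack probes) per chip design, copied from the table.
chips = {
    "HG_U133_Plus_2": (604_258, 24_980),
    "HG_U133A": (247_965, 12_868),
    "HG_U95A": (201_807, 7_329),
    "HG_U95B": (201_862, 6_334),
    "HG_U95D": (201_858, 7_198),
    "HG_U95E": (201_863, 7_880),
    "MOE430A": (249_958, 372),
    "MOE430B": (248_704, 252),
    "MG_U74Av2": (197_993, 7_360),
    "MG_U74Bv2": (197_131, 7_006),
    "RAE230A": (175_477, 81),
    "Rat230_2": (342_410, 208),
    "RG_U34A": (140_317, 3_691),
}

# Assumed definition: G-stack probes as a percentage of annotated probes.
density = {
    name: 100.0 * g_stack / annotated
    for name, (annotated, g_stack) in chips.items()
}

for name, pct in density.items():
    print(f"{name:15s} {pct:5.2f}%")
```

Under this definition the newer mouse design MOE430A comes out at roughly 0.15%, against roughly 3.7% for the older MG_U74Av2, while the human designs all stay above 3%.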
Figure 1. Contour plot illustrating that, in human chip HG_U133A, the average correlation coefficient for a group of probes changes with the position of the G-stack (runs of exactly four Gs only).
Effect of the position of the G-stack on the average correlation coefficient in G-stack probes. For each chip, the first row gives n, the number of affected probes, and the second row gives the average correlation between these n probes.
Position of G-stack: 1 to 11

| Chip | Statistic | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HG_U133A | n | 871 | 449 | 548 | 583 | 664 | 546 | 599 | 604 | 531 | 471 | 458 |
| | Avg. correlation | 0.51 | 0.32 | 0.36 | 0.36 | 0.34 | 0.34 | 0.36 | 0.40 | 0.44 | 0.42 | 0.44 |
| HG_U133_Plus_2 | n | 1758 | 925 | 1072 | 1170 | 1234 | 1087 | 1214 | 1093 | 1049 | 923 | 903 |
| | Avg. correlation | 0.29 | 0.15 | 0.17 | 0.18 | 0.19 | 0.18 | 0.18 | 0.23 | 0.26 | 0.22 | 0.27 |
| HG_U95A | n | 398 | 297 | 255 | 271 | 315 | 308 | 367 | 448 | 417 | 47 | 47 |
| | Avg. correlation | 0.40 | 0.28 | 0.30 | 0.31 | 0.31 | 0.29 | 0.33 | 0.35 | 0.40 | 0.42 | 0.39 |
| HG_U95B | n | 314 | 267 | 237 | 261 | 282 | 293 | 326 | 375 | 339 | 19 | 23 |
| | Avg. correlation | 0.66 | 0.51 | 0.54 | 0.55 | 0.56 | 0.57 | 0.58 | 0.64 | 0.67 | 0.63 | 0.68 |
| HG_U95D | n | 293 | 278 | 243 | 272 | 284 | 311 | 381 | 471 | 478 | 56 | 73 |
| | Avg. correlation | 0.60 | 0.36 | 0.37 | 0.39 | 0.38 | 0.41 | 0.42 | 0.44 | 0.50 | 0.41 | 0.47 |
| HG_U95E | n | 346 | 323 | 317 | 285 | 350 | 363 | 398 | 532 | 559 | 73 | 57 |
| | Avg. correlation | 0.54 | 0.29 | 0.32 | 0.35 | 0.34 | 0.33 | 0.41 | 0.41 | 0.46 | 0.39 | 0.47 |
| MOE430A | n | 15 | 19 | 16 | 14 | 15 | 22 | 36 | 12 | 12 | 13 | 0 |
| | Avg. correlation | 0.51 | 0.26 | 0.32 | 0.30 | 0.15 | 0.28 | 0.27 | 0.33 | 0.33 | 0.40 | n/a |
| MOE430B | n | 4 | 13 | 14 | 13 | 16 | 13 | 22 | 4 | 9 | 8 | 0 |
| | Avg. correlation | 0.92 | 0.42 | 0.39 | 0.31 | 0.54 | 0.50 | 0.49 | 0.60 | 0.62 | 0.49 | n/a |
| MG_U74Av2 | n | 357 | 315 | 259 | 292 | 292 | 349 | 349 | 428 | 431 | 39 | 46 |
| | Avg. correlation | 0.29 | 0.15 | 0.17 | 0.18 | 0.19 | 0.18 | 0.18 | 0.23 | 0.26 | 0.22 | 0.27 |
| MG_U74Bv2 | n | 326 | 286 | 246 | 257 | 271 | 298 | 369 | 436 | 496 | 17 | 14 |
| | Avg. correlation | 0.54 | 0.33 | 0.33 | 0.35 | 0.39 | 0.40 | 0.42 | 0.46 | 0.54 | 0.61 | 0.63 |
| RG_U34A | n | 194 | 148 | 145 | 126 | 166 | 160 | 183 | 176 | 144 | 94 | 85 |
| | Avg. correlation | 0.33 | 0.23 | 0.21 | 0.24 | 0.28 | 0.29 | 0.32 | 0.32 | 0.41 | 0.38 | 0.36 |

Position of G-stack: 12 to 22

| Chip | Statistic | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HG_U133A | n | 491 | 424 | 523 | 580 | 592 | 604 | 650 | 615 | 737 | 611 | 689 |
| | Avg. correlation | 0.45 | 0.47 | 0.50 | 0.50 | 0.47 | 0.43 | 0.41 | 0.38 | 0.34 | 0.29 | 0.26 |
| HG_U133_Plus_2 | n | 949 | 872 | 1009 | 1140 | 1098 | 1193 | 1271 | 1185 | 1310 | 1192 | 1308 |
| | Avg. correlation | 0.29 | 0.32 | 0.39 | 0.42 | 0.39 | 0.41 | 0.39 | 0.38 | 0.32 | 0.27 | 0.24 |
| HG_U95A | n | 33 | 37 | 631 | 417 | 384 | 388 | 359 | 433 | 467 | 399 | 570 |
| | Avg. correlation | 0.42 | 0.49 | 0.54 | 0.55 | 0.55 | 0.57 | 0.55 | 0.57 | 0.55 | 0.54 | 0.54 |
| HG_U95B | n | 15 | 18 | 555 | 336 | 350 | 342 | 340 | 383 | 390 | 365 | 463 |
| | Avg. correlation | 0.77 | 0.84 | 0.81 | 0.79 | 0.81 | 0.81 | 0.81 | 0.80 | 0.77 | 0.78 | 0.75 |
| HG_U95D | n | 72 | 74 | 832 | 364 | 357 | 337 | 330 | 391 | 426 | 334 | 500 |
| | Avg. correlation | 0.43 | 0.54 | 0.64 | 0.64 | 0.66 | 0.75 | 0.69 | 0.70 | 0.74 | 0.71 | 0.75 |
| HG_U95E | n | 61 | 73 | 833 | 412 | 378 | 371 | 354 | 420 | 434 | 370 | 530 |
| | Avg. correlation | 0.39 | 0.47 | 0.62 | 0.61 | 0.66 | 0.64 | 0.67 | 0.68 | 0.67 | 0.69 | 0.65 |
| MOE430A | n | 2 | 25 | 16 | 15 | 27 | 20 | 19 | 18 | 8 | 14 | 7 |
| | Avg. correlation | 0.37 | 0.33 | 0.40 | 0.42 | 0.37 | 0.36 | 0.26 | 0.29 | 0.31 | 0.10 | 0.07 |
| MOE430B | n | 0 | 7 | 10 | 19 | 23 | 12 | 10 | 10 | 6 | 6 | 6 |
| | Avg. correlation | n/a | 0.55 | 0.72 | 0.66 | 0.62 | 0.68 | 0.70 | 0.45 | 0.28 | 0.32 | 0.48 |
| MG_U74Av2 | n | 42 | 51 | 682 | 434 | 388 | 373 | 362 | 457 | 471 | 357 | 545 |
| | Avg. correlation | 0.29 | 0.32 | 0.39 | 0.42 | 0.39 | 0.41 | 0.39 | 0.38 | 0.32 | 0.27 | 0.24 |
| MG_U74Bv2 | n | 6 | 11 | 790 | 404 | 392 | 336 | 326 | 443 | 419 | 357 | 465 |
| | Avg. correlation | 0.78 | 0.54 | 0.68 | 0.67 | 0.67 | 0.69 | 0.62 | 0.62 | 0.57 | 0.56 | 0.55 |
| RG_U34A | n | 82 | 76 | 182 | 185 | 214 | 200 | 204 | 215 | 213 | 202 | 268 |
| | Avg. correlation | 0.47 | 0.43 | 0.46 | 0.49 | 0.46 | 0.42 | 0.42 | 0.45 | 0.34 | 0.26 | 0.21 |
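The average correlation reported for each group of n probes could be computed along the following lines. This is a sketch under stated assumptions: it takes the "average correlation" to be the mean Pearson correlation over all pairs of probes in the group, with each probe represented by its intensity profile across arrays. The record does not state the exact estimator, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_pairwise_correlation(profiles):
    """Mean Pearson correlation over all probe pairs.

    `profiles` holds one intensity profile per probe (one value per array).
    Returns None when fewer than two probes are available, since no pair
    exists to correlate.
    """
    pairs = list(combinations(profiles, 2))
    if not pairs:
        return None
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)


# Made-up profiles for three probes measured on three arrays.
toy_profiles = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]
print(mean_pairwise_correlation(toy_profiles))
```

Returning `None` for an empty group mirrors the table entries where n is 0 and no correlation is reported.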
Figure 2. Contour plot illustrating that, in mouse chip MOE430B, the average correlation coefficient for a group of probes changes with the position of the G-stack (runs of exactly four Gs only).
Figure 3. Plot of the diagonal correlation coefficient values for each chip design. Diagonal values represent the correlation among probes in the same group.
The effect on RMA of removing G-stack probes from probe sets. Subtraction of the original RMA value from the RMA value after removal of G-stack probes gives the quantity d. Entries are column percentages.
| No. of G-stack probes: | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| No. of probe sets: | 38,422 | 10,216 | 4,192 | 1,280 | 384 | 121 | 36 |
| | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| 1.0 < | 0 | 0 | 1 | 1 | 3 | 3 | 9 |
| 0.5 < | 0 | 1 | 4 | 5 | 7 | 6 | 11 |
| Between −0.5 and 0.5 | 100 | 94 | 76 | 61 | 51 | 46 | 34 |
| −0.5 > | 0 | 5 | 14 | 21 | 19 | 19 | 10 |
| −1.0 > | 0 | 0 | 5 | 11 | 17 | 21 | 18 |
| | 0 | 0 | 0 | 1 | 2 | 5 | 16 |
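The quantity d and the column percentages in the table above can be illustrated with a short sketch. It assumes that the RMA values before and after removing G-stack probes are already available as two parallel lists. It uses only the thresholds that are legible in the table (±0.5 and ±1.0), so it produces five bins rather than the table's seven, and the bin labels and function names are hypothetical.

```python
from collections import Counter


def classify_d(d):
    """Assign d to one of five bins built from the legible thresholds."""
    if d > 1.0:
        return "d > 1.0"
    if d > 0.5:
        return "0.5 < d <= 1.0"
    if d >= -0.5:
        return "-0.5 <= d <= 0.5"
    if d >= -1.0:
        return "-1.0 <= d < -0.5"
    return "d < -1.0"


def column_percentages(rma_original, rma_without_g_stack):
    """Percentage of probe sets in each bin of d.

    d = (RMA after removing G-stack probes) - (original RMA), following
    the definition in the table caption.
    """
    d_values = [
        after - before
        for before, after in zip(rma_original, rma_without_g_stack)
    ]
    counts = Counter(classify_d(d) for d in d_values)
    return {label: 100.0 * c / len(d_values) for label, c in counts.items()}


# Made-up RMA values for four probe sets, for illustration only.
toy_before = [8.0, 8.0, 8.0, 8.0]
toy_after = [8.1, 9.5, 7.3, 6.5]
print(column_percentages(toy_before, toy_after))
```

Applying such a function separately to the probe sets with 0, 1, 2, and more G-stack probes would give one column of percentages per group, in the same layout as the table.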