William J Lemon, Sandya Liyanarachchi, Ming You.
Abstract
Logit-t employs a logit transformation for normalization, followed by statistical testing at the probe level. It was evaluated on four publicly available datasets, which together provide 2,710 known positive incidences of differential expression and 2,913,813 known negative incidences. Logit-t provided 75% positive predictive value, compared with 5% for Affymetrix Microarray Suite 5, 6% for dChip perfect match (PM)-only, and 9% for Robust Multi-array Analysis at the p < 0.01 threshold. Logit-t provided 70% sensitivity, compared with 46% for Microarray Suite 5, 53% for dChip and 63% for Robust Multi-array Analysis.
Year: 2003 PMID: 14519202 PMCID: PMC328456 DOI: 10.1186/gb-2003-4-10-r67
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
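The abstract describes Logit-t only in outline: a logit transformation for normalization, then statistical testing at the probe level. The following is a minimal sketch of that idea, not the authors' implementation. The transformation bounds (`margin`), the z-standardization, the pooled-variance Student's t, and the median summary over probes are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
from statistics import mean, median, stdev


def logit_normalize(intensities, margin=0.001):
    """Logit-transform one array's probe intensities, then z-standardize.

    Assumption: the lower and upper bounds sit just outside the observed
    minimum and maximum (by `margin` times the range). The source text
    does not specify how the bounds are chosen.
    """
    lo, hi = min(intensities), max(intensities)
    span = hi - lo
    lower, upper = lo - margin * span, hi + margin * span
    logits = [math.log((x - lower) / (upper - x)) for x in intensities]
    mu, sd = mean(logits), stdev(logits)
    return [(v - mu) / sd for v in logits]


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


def probe_set_t(group_a, group_b):
    """Summarize one probe set by the median of its per-probe t statistics.

    group_a, group_b: lists of arrays; each array lists the normalized
    values of this probe set's probes, in the same probe order.
    Assumption: the median is used as the probe-set summary.
    """
    n_probes = len(group_a[0])
    per_probe = [
        student_t([arr[i] for arr in group_a], [arr[i] for arr in group_b])
        for i in range(n_probes)
    ]
    return median(per_probe)
```

A probe set would then be called differentially expressed when the absolute value of its summary statistic exceeds a threshold chosen from the degrees of freedom.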
Summary of statistical test results
| Method | TP | FP | TN | FN | PPV | Sens | Rank, 1st Q | Rank, median | Rank, 3rd Q | Known positives achieving rank |
| MAS5 | 950 | 13,641 | 1,134,051 | 324 | 7% | 75% | 13 | 36 | 130 | 335 |
| dChip PM | 1,068 | 14,390 | 1,133,302 | 206 | 7% | 84% | 6 | 19 | 63 | 558 |
| RMA | 1,098 | 10,406 | 1,137,286 | 176 | 10% | 86% | 5 | 12 | 38 | 734 |
| logit-Exp | 1,037 | 12,311 | 1,135,381 | 237 | 8% | 81% | 6 | 15 | 53 | 636 |
| logit-ExpR | 1,002 | 11,667 | 1,136,025 | 272 | 8% | 79% | 6 | 15 | 69 | 619 |
| logit-t | 1,110 | 345 | 1,147,347 | 164 | 76% | 87% | 4 | 8 | 13 | 1066 |
| MAS5 | 24 | 1,305 | 263,631 | 186 | 2% | 11% | 151 | 456 | 1,283 | 10 |
| dChip PM | 38 | 1,729 | 263,207 | 172 | 2% | 18% | 72 | 255 | 1,505 | 19 |
| RMA | 91 | 1,860 | 263,076 | 119 | 5% | 43% | 21 | 112 | 450 | 41 |
| logit-t | 106 | 79 | 264,857 | 104 | 57% | 50% | 4 | 8 | 21 | 151 |
| MAS5 | 86 | 3,473 | 690,352 | 519 | 2% | 14% | 172 | 816 | 3,952 | 21 |
| dChip PM | 84 | 3,790 | 690,035 | 521 | 2% | 14% | 64 | 703 | 5,296 | 44 |
| RMA | 199 | 3,504 | 690,321 | 406 | 5% | 33% | 15 | 330 | 5,166 | 139 |
| logit-t | 263 | 107 | 693,718 | 342 | 71% | 43% | 4 | 8 | 738 | 349 |
| MAS5 | 239 | 5,854 | 826,736 | 487 | 4% | 33% | 35 | 127 | 681 | 81 |
| dChip PM | 307 | 3,760 | 828,830 | 419 | 8% | 42% | 12 | 49 | 418 | 180 |
| RMA | 398 | 4,693 | 827,897 | 328 | 8% | 55% | 7 | 33 | 653 | 251 |
| logit-t | 490 | 1,752 | 830,838 | 236 | 22% | 67% | 3 | 7 | 15 | 524 |
| MAS5 | 234 | 4,540 | 802,820 | 470 | 5% | 33% | | | | |
| dChip PM | 295 | 2,378 | 804,982 | 409 | 11% | 42% | | | | |
| RMA | 381 | 3,263 | 804,097 | 323 | 10% | 54% | | | | |
| logit-t | 473 | 116 | 807,244 | 231 | 80% | 67% | | | | |
Spiked-in RNAs are considered positives and all others are considered negatives. In the Tonsil dataset, two comparisons (0.75 versus 2 pM and 0.75 versus 75 pM) resulted in scores much worse than the others for all methods; the last block of results has these comparisons removed. In each block, the rows for MAS5, dChip and RMA tally t-test results on the respective indexes, with positives having p < 0.01. The last row tallies t-values of the Logit-t method, with positives having |t| > threshold, where the threshold depends on the degrees of freedom (df) and corresponds approximately to p < 0.01. The first four columns tally calls: TP, true positive; FP, false positive; TN, true negative; FN, false negative. The next two columns give performance measures: PPV, positive predictive value, TP/(TP+FP); Sens, sensitivity, TP/(TP+FN). The next three columns give the interquartile range (IQR) for ranks: the ranks of the statistics marking the 1st quartile, median and 3rd quartile for known positives. The last column shows the number of known positives achieving a rank at or below the number of known positives in a comparison. The maximum achievable number for the final column is the total number of known positives for all comparisons.
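The performance measures defined in this legend can be checked directly from the tallies. The sketch below recomputes PPV and sensitivity for the logit-t row of the first block, and illustrates the rank-based column on a small made-up example. The function names and the toy scores are hypothetical, and ties in rank are not handled specially, which is an assumption.

```python
from statistics import median


def ppv(tp, fp):
    """Positive predictive value, TP/(TP+FP)."""
    return tp / (tp + fp)


def sensitivity(tp, fn):
    """Sensitivity, TP/(TP+FN)."""
    return tp / (tp + fn)


def rank_summary(scores, is_positive):
    """Rank genes by descending |score| (rank 1 = largest).

    Returns the median rank of the known positives and the number of
    known positives whose rank is at or below the number of known
    positives in the comparison.
    """
    order = sorted(range(len(scores)), key=lambda i: -abs(scores[i]))
    ranks = {gene: r for r, gene in enumerate(order, start=1)}
    pos_ranks = sorted(ranks[i] for i, p in enumerate(is_positive) if p)
    n_pos = len(pos_ranks)
    achieving = sum(1 for r in pos_ranks if r <= n_pos)
    return median(pos_ranks), achieving


# logit-t row of the first block: TP = 1,110, FP = 345, FN = 164
print(round(100 * ppv(1110, 345)), round(100 * sensitivity(1110, 164)))  # 76 87

# Toy comparison (made-up scores): three known positives among five genes
print(rank_summary([5.0, -4.0, 0.5, 3.0, -0.2], [True, True, False, False, True]))  # (2, 2)
```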
Figure 1. Receiver-operator characteristic (ROC) plots for all methods on each of the four datasets. (a) Affymetrix Latin Square dataset, (b) Gene Logic Spike, (c) Gene Logic AML and (d) Gene Logic Tonsil. Results for all comparisons within each dataset were pooled to produce the plots. The dChip and Logit_Exp lines are nearly identical until about 5% FP. Panels (b-d) do not include results for Logit_Exp or Logit_ExpR.
Figure 2. Histogram of Logit-Log slopes. The slopes constituting the histogram come from least-squares linear fits of logit-transformed intensity versus log concentration of the spike, one fit for each PM probe from the spiked-in probesets in the Affymetrix Latin Square dataset. The major mode of the histogram near -0.15 logit intensity units per log concentration unit represents the majority of probes and is used in the Logit_Exp and Logit_ExpR gene-expression indexes.
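Each slope in such a histogram is the result of an ordinary least-squares line fit for one probe. A minimal sketch of a single fit follows. The concentrations, the probe response, the log base and the scaling of intensity to a fraction in (0, 1) are all hypothetical, since the legend does not state them.

```python
import math


def logit(p):
    """Logit of a fraction p in the open interval (0, 1)."""
    return math.log(p / (1 - p))


def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical spike concentrations (pM) and scaled intensities for one probe
concentrations = [0.25, 0.5, 1.0, 2.0, 4.0]
scaled_intensity = [0.10, 0.14, 0.20, 0.27, 0.36]

log_conc = [math.log2(c) for c in concentrations]  # assumption: log base 2
logit_int = [logit(p) for p in scaled_intensity]

slope = least_squares_slope(log_conc, logit_int)
print(f"slope = {slope:.3f} logit units per log2 concentration unit")
```

Repeating the fit for every PM probe in the spiked-in probesets and binning the resulting slopes would give a histogram of the kind described.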
Figure 3. The relationship between log(index) and log(RNA) for each model using the Affymetrix dataset. Plots of log RNA concentration versus (a,d) log(MAS5), (b,e) log(dChip PM-only) and (c,f) RMA. Panels (a-c) show results for all 14 spiked genes and the least-squares regression line through the aggregate data. Panels (d-f) show data for one probe set, 37777_at, with its least-squares regression line.
Parameters (β-exponents) indicative of non-linearity of gene-expression index relative to RNA concentration
| Probe set | MAS5 | dChip | RMA |
| 37777_at | 1.27 | 1.68 | 1.49 |
| 684_at | 1.40 | 1.63 | 1.54 |
| 1597_at | 1.38 | 1.80 | 1.64 |
| 38734_at | 1.46 | 1.70 | 1.52 |
| 39058_at | 1.35 | 1.58 | 1.51 |
| 36311_at | 1.25 | 1.56 | 1.43 |
| 36889_at | 1.17 | 1.71 | 1.51 |
| 1024_at | 1.37 | 1.80 | 1.48 |
| 36202_at | 1.31 | 1.68 | 1.42 |
| 36085_at | 1.50 | 2.04 | 1.66 |
| 40322_at | 1.52 | 1.88 | 1.64 |
| 407_at | 1.32 | 1.61 | 1.63 |
| 1091_at | 1.58 | 2.63 | 2.25 |
| 1708_at | 1.37 | 1.85 | 1.58 |
| Average | 1.37 | 1.80 | 1.59 |
| CV | 8.1% | 15.2% | 12.8% |
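The Average and CV rows can be reproduced from the per-probe-set values. The sketch below does this for the MAS5 column, taking CV as the sample standard deviation divided by the mean. That definition is an assumption, although it agrees with the tabulated 8.1% to within the rounding of the listed exponents.

```python
from statistics import mean, stdev

# MAS5 β-exponents for the 14 spiked probe sets, as listed in the table
mas5_beta = [1.27, 1.40, 1.38, 1.46, 1.35, 1.25, 1.17,
             1.37, 1.31, 1.50, 1.52, 1.32, 1.58, 1.37]

average = mean(mas5_beta)
# Assumption: CV = sample standard deviation (n - 1 denominator) / mean
cv = stdev(mas5_beta) / average

print(f"average = {average:.3f}, CV = {100 * cv:.1f}%")
```

Because the inputs are already rounded to two decimals, the recomputed average lands at about 1.375, on the boundary between 1.37 and 1.38, so small differences from the tabulated values are expected.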
Figure 4. Contour plots of densities of CV for gene-expression indexes versus probe-level data. (a) CV density for MAS5 indexes versus probe-level, (b) CV density for dChip PM-only versus probe-level, (c) CV density for known positives only (n = 1,274) for MAS5 versus probe-level and (d) CV density for known positives for dChip PM-only versus probe-level. Dashed line indicates equal CV. Dotted line indicates optimal CV ratio = .