M Bilban, L K Buehler, S Head, G Desoye, V Quaranta.
Abstract
BACKGROUND: Genome-wide or application-targeted microarrays containing a subset of genes of interest have become widely used as a research tool with the prospect of diagnostic application. Intrinsic variability of microarray measurements poses a major problem in defining signal thresholds for absent/present or differentially expressed genes. Most strategies have used fold-change threshold values, but variability at low signal intensities may invalidate this approach, which also provides no information about false positives and false negatives.
Year: 2002 PMID: 12123529 PMCID: PMC117791 DOI: 10.1186/1471-2164-3-19
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Signal distributions for specific and non-specific hybridizations overlap at low absolute intensities. The median intensity of 4 B. subtilis genes (n = 24 replicates per gene × 4 = 96) was used as a linear scaling factor to balance the Cy3 and Cy5 channels. Following this normalization step, normalized intensities were log2-transformed for efficient graphical illustration. Shown are positive control spots (open bars) and negative control spots (filled bars) from (A) array 1 and (B) array 2 microarray hybridizations. The positive control group includes seven housekeeping genes (n = 42) and four B. subtilis genes (24 repeats per sequence; n = 96) representing sequence-specific hybridization. The negative control sequences (six repeats per sequence) include three plant genes (n = 18), three E. coli genes (n = 18), and seven human cytomegalovirus (hCMV) genes (n = 42) representing non-specific hybridization events. Data for Cy3 and Cy5 signals were pooled. Signal distributions for test genes (n = 154) are shown for (C) array 1 and (D) array 2.
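The normalization described in this caption can be sketched as follows. This is a minimal illustration, assuming each channel is divided by the median of its own spike-in control spots; the function names `scale_channel` and `log2_values` are hypothetical and do not come from the source.

```python
import math
from statistics import median


def scale_channel(raw, spike_raw):
    """Divide raw intensities of one channel by the median spike-in intensity.

    Assumption: the median of the B. subtilis control spots is applied as a
    per-channel linear scaling factor (the source does not give the formula).
    """
    factor = median(spike_raw)
    if factor <= 0:
        raise ValueError("median spike-in intensity must be positive")
    return [value / factor for value in raw]


def log2_values(values):
    """Log2-transform normalized intensities for plotting.

    Non-positive values have no logarithm and are skipped here.
    """
    return [math.log2(value) for value in values if value > 0]
```

Applying `scale_channel` to the Cy3 and Cy5 channels separately puts both on a common scale before pooling; the log2 step is only needed for the histograms.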
Figure 2. Specificity and sensitivity of select cut-offs for individual microarrays. Specific (spiked B. subtilis and housekeepers) and non-specific hybridization control groups (plant, bacterial and viral genes) represent sensitivity (squares) and specificity (circles), respectively. The intersection point of the two graphs indicates the threshold TM at which Sp equals Se. TM values were 0.18 (array 1) and 0.09 (array 2). Indicated thresholds a-d are described in Table 1. The Tj values presented in Table 1 were used to construct these curves. Note the different signal range (abscissa values) for array 1 (A) and array 2 (B).
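As a rough sketch of how sensitivity, specificity and the crossing threshold TM relate, the snippet below scans candidate cut-offs and picks the one where the two rates are closest. The function names and the convention that a spot counts as "present" when its signal is at or above the cut-off are assumptions for illustration, not details given in the source.

```python
def sensitivity(positive_signals, threshold):
    """Fraction of positive-control spots called present (signal >= threshold)."""
    called = sum(1 for s in positive_signals if s >= threshold)
    return called / len(positive_signals)


def specificity(negative_signals, threshold):
    """Fraction of negative-control spots called absent (signal < threshold)."""
    rejected = sum(1 for s in negative_signals if s < threshold)
    return rejected / len(negative_signals)


def balanced_threshold(positive_signals, negative_signals, candidates):
    """Return the candidate cut-off where |Se - Sp| is smallest (an estimate of TM)."""
    return min(
        candidates,
        key=lambda t: abs(
            sensitivity(positive_signals, t) - specificity(negative_signals, t)
        ),
    )
```

With a finer grid of candidate cut-offs, the returned value approaches the intersection point of the two curves.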
Use of ROC analysis for the selection of cut-off values for individual microarray experiments.
| Array 1 | | | | Array 2 | | | |
| Threshold | Specificity [%] | Sensitivity [%] | % genes below threshold* | Threshold | Specificity [%] | Sensitivity [%] | % genes below threshold* |
| 0.010 | 0.0 | 100 | 0 | 0.005 | 0.0 | 100 | 0 |
| 0.030 | 1.8 | 100 | 0 | 0.010 | 2.5 | 100 | 0 |
| 0.040 | 13.4 | 100 | 0 | 0.015 | 9.3 | 100 | 0 |
| | | | | 0.020 | 30.5 | 100 | 0 |
| 0.060 | 52.7 | 99.6 | 1 | | | | |
| | | | | 0.030 | 57.7 | 100 | 1 |
| 0.080 | 71.1 | 99.4 | 7 | | | | |
| 0.100 | 81.3 | 99.0 | 17 | 0.040 | 75.4 | 99.9 | 12 |
| 0.120 | 86.8 | 98.2 | 26 | 0.050 | 89.9 | 99.8 | 20 |
| 0.140 | 90.0 | 97.7 | 36 | 0.060 | 93.9 | 99.6 | 29 |
| 0.170 | 94.9 | 96.5 | 44 | 0.062 | 94.9 | 99.6 | 31 |
| | | | | 0.064 | 95.6 | 99.9 | 33 |
| 0.200 | 97.4 | 95.7 | 48 | | | | |
| 0.300 | 98.6 | 92.8 | 65 | 0.100 | 99.1 | 98.3 | 45 |
| 0.400 | 99.5 | 89.9 | 77 | 0.110 | 100 | 97.6 | 48 |
| 0.500 | 100 | 85.0 | 79 | 0.200 | 100 | 92.0 | 65 |
| 0.700 | 100 | 74.7 | 83 | 0.500 | 100 | 85.7 | 81 |
| 1.000 | 100 | 56.6 | 89 | 0.800 | 100 | 72.4 | 90 |
| 1.500 | 100 | 28.7 | 95 | 1.000 | 100 | 50.8 | 94 |
| 2.000 | 100 | 19.4 | 98 | 1.500 | 100 | 17.1 | 97 |
| 3.000 | 100 | 10.2 | 98 | 2.000 | 100 | 7.7 | 98 |
| 5.000 | 100 | 5.4 | 99 | 2.500 | 100 | 2.9 | 98 |
| 10.000 | 100 | 1.2 | 100 | 3.000 | 100 | 2.1 | 99 |
| 12.000 | 100 | 0.0 | 100 | 5.000 | 100 | 0.0 | 100 |
Specificity and sensitivity for 25 signal cut-offs from 2 microarray experiments. Three thresholds (a-c) are calculated from the negative reference controls only and compared to the threshold with maximum specificity and sensitivity (TM), obtained as the point of intersection in Figure 2. a: mean (TX); b: median (T0.5X); c: mean + 2 standard deviations (TX2SD); d: TM. Note the different signal range for array 1 and array 2. *Normalized Cy3 and Cy5 pool.
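The three thresholds derived from negative controls alone (a-c) can be computed as in the sketch below. The function name is hypothetical, and the use of the sample standard deviation is an assumption, since the note does not say which estimator was used.

```python
from statistics import mean, median, stdev


def negative_control_thresholds(negative_signals):
    """Return thresholds a-c computed from negative-control signals only.

    a: mean, b: median, c: mean + 2 standard deviations.
    Assumption: sample standard deviation (n - 1 denominator).
    """
    avg = mean(negative_signals)
    return {
        "a_mean": avg,
        "b_median": median(negative_signals),
        "c_mean_plus_2sd": avg + 2 * stdev(negative_signals),
    }
```

Because these cut-offs ignore the positive controls, they say nothing about sensitivity, which is why they are compared against TM.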
Figure 3. ROC analysis of selected signal cut-off values as a predictor for specific hybridization. ROC curves demonstrate the capacity to discriminate between the absence or presence of sequence-specific hybridization in individual microarray experiments. The closer an ROC curve is to the upper left-hand corner of the graph, the more accurate it is, because at that corner the true positive rate is 100% and the false positive rate is 0%. ROC plots are based on percentile rank calculations for 25 cut-off signal thresholds (taken from Table 1). The meaning of the positions of thresholds a-d (Table 1) is explained in the text. The area under the ROC curve was (A) 0.994 (array 1) and (B) 0.999 (array 2). The rising diagonal indicates no discrimination between positive and negative control signals.
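For orientation, an ROC curve and its area can be built from control signals roughly as follows. This sketch uses the trapezoidal rule over a set of cut-offs; the source reports percentile rank calculations, so this is a stand-in technique rather than the authors' procedure, and the function names are hypothetical.

```python
def roc_points(positive_signals, negative_signals, thresholds):
    """Return sorted (false positive rate, true positive rate) pairs.

    A spot is called present when its signal is >= the threshold (assumption).
    The end points (0, 0) and (1, 1) are added so the curve spans the full range.
    """
    points = {(0.0, 0.0), (1.0, 1.0)}
    for t in thresholds:
        tpr = sum(1 for s in positive_signals if s >= t) / len(positive_signals)
        fpr = sum(1 for s in negative_signals if s >= t) / len(negative_signals)
        points.add((fpr, tpr))
    return sorted(points)


def area_under_curve(points):
    """Trapezoidal area under a curve given as sorted (x, y) pairs."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area
```

An area of 1.0 corresponds to perfectly separated control groups, and 0.5 corresponds to the rising diagonal.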
ROC parameters for different negative control groups.
| | Threshold with maximum specificity and sensitivity* | β error | α error | Threshold value for α = 0.05 | Discriminatory power (area under ROC plot) |
| Array 1 | 0.40 (+SSC) | 0.101 | 0.096 | 0.50 | 0.969 |
| Array 1 | 0.18 (-SSC) | 0.039 | 0.044 | 0.17 | 0.994 |
| Array 2 | 0.25 (+SSC) | 0.089 | 0.099 | 0.33 | 0.984 |
| Array 2 | 0.09 (-SSC) | 0.015 | 0.019 | 0.06 | 0.999 |
SSC spots and spots with deposited DNA perform differently in ROC analysis, yielding different areas under the ROC curve as well as different thresholds with maximum specificity and sensitivity. The area under the ROC curve may be used as an indicator of microarray hybridization quality. *Characterized by smallest α and β errors. Note that in this case the α error can be >0.05.
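The α and β errors and the α = 0.05 threshold can be illustrated with the sketch below. It assumes the usual definitions, α as the false positive rate among negative controls and β as the false negative rate among positive controls; the function names and the "signal >= threshold means present" convention are illustrative assumptions.

```python
def error_rates(positive_signals, negative_signals, threshold):
    """Return (alpha, beta) for one cut-off.

    alpha: fraction of negative controls called present (false positives).
    beta: fraction of positive controls called absent (false negatives).
    """
    alpha = sum(1 for s in negative_signals if s >= threshold) / len(negative_signals)
    beta = sum(1 for s in positive_signals if s < threshold) / len(positive_signals)
    return alpha, beta


def threshold_for_alpha(negative_signals, candidates, alpha_max=0.05):
    """Return the lowest candidate cut-off whose alpha error is <= alpha_max.

    Returns None when no candidate satisfies the limit.
    """
    for t in sorted(candidates):
        alpha = sum(1 for s in negative_signals if s >= t) / len(negative_signals)
        if alpha <= alpha_max:
            return t
    return None
```

Fixing α at 0.05 trades some sensitivity for a guaranteed false positive rate, which is why this threshold can differ from the one that minimizes both errors together.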
Analysis of fluorescence ratios of genes known to be involved in cancer invasion.
| Gene | Invasive signal (array 1) | Invasive signal (array 2) | Non-invasive signal (array 1) | Non-invasive signal (array 2) | Ratio (array 1) | Ratio (array 2) | Confidence category |
| TIMP-1 | 1.70 | 0.92 | 0.25 | 0.30 | 6.9 | 3.1 | A |
| TIMP-2 | 0.63 | 0.25 | 0.24 | 0.70 | 2.5 | 0.4 | |
| MMP-9 | 4.16 | 2.22 | 1.70 | 1.01 | 2.5 | 2.2 | |
| Ln-5, γ2 | 1.27 | 1.22 | 0.06 | 0.03 | 22.5 | 35.5 | B |
| Integrin α3 | 0.24 | 0.16 | 0.12 | 0.06 | 2.0 | 2.8 | |
| Ln-5, β3 | 0.15 | 0.06 | 0.09 | 0.04 | 1.9 | 1.6 | C |
Ratios were qualitatively assigned to confidence categories according to their level of expression. All ratios met p < 0.05. TM was 0.18 for array 1 and 0.09 for array 2.
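The note does not spell out the rule behind the confidence categories. One reading that reproduces the categories listed in the table is sketched below: A when both signals are at or above TM, B when only one is, and C when neither is. This rule is an assumption inferred from the tabulated values, not a definition taken from the source, and the function name is hypothetical.

```python
def confidence_category(invasive_signal, non_invasive_signal, tm):
    """Assign a confidence category from the two signals and the threshold TM.

    Assumed rule (not stated in the source):
      A: both signals >= TM
      B: exactly one signal >= TM
      C: neither signal >= TM
    """
    above = int(invasive_signal >= tm) + int(non_invasive_signal >= tm)
    return {2: "A", 1: "B", 0: "C"}[above]
```

For example, with the array 1 values for TIMP-1 (1.70 and 0.25, TM = 0.18) both signals exceed the threshold, so the ratio falls in the category where both measurements are considered reliable.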