Phillip Stafford, Marcel Brun.
Abstract
Microarray gene expression data becomes more valuable as our confidence in the results grows. Guaranteeing data quality becomes increasingly important as microarrays are being used to diagnose and treat patients (1-4). The MicroArray Quality Control (MAQC) Consortium, the FDA's Critical Path Initiative, NCI's caBIG and others are implementing procedures that will broadly enhance data quality. As GEO continues to grow, its usefulness is constrained by the level of correlation across experiments and general applicability. Although RNA preparation and array platform play important roles in data accuracy, pre-processing is a user-selected factor that has an enormous effect. Normalization of expression data is necessary, but the methods have specific and pronounced effects on precision, accuracy and historical correlation. As a case study, we present a microarray calibration process using normalization as the adjustable parameter. We examine the impact of eight normalizations across both Agilent and Affymetrix expression platforms on three expression readouts: (1) sensitivity and power, (2) functional/biological interpretation and (3) feature selection and classification error. The reader is encouraged to measure their own discordant data, whether cross-laboratory, cross-platform or across any other variance source, and to use their results to tune the adjustable parameters of their laboratory to ensure increased correlation.
Year: 2007 PMID: 17478523 PMCID: PMC1904274 DOI: 10.1093/nar/gkl1133
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Sample size, normalization methods, platform and tissues used
| Platform | Normalization method | Probes | Gene overlap | Tissues | N (samples) |
|---|---|---|---|---|---|
| Agilent Human 1Av2 | BSUB | 18703 | 11504 | Liver, lung, spleen | 6 |
| Agilent Human 1Av2 | MEAN | 18703 | 11504 | Liver, lung, spleen | 6 |
| Agilent Human 1Av2 | PROC | 18703 | 11504 | Liver, lung, spleen | 6 |
| Affymetrix U133Av2 | MAS5 | 22215 | 11504 | Liver, lung, spleen | 6 |
| Affymetrix U133Av2 | GC-RMA | 22215 | 11504 | Liver, lung, spleen | 6 |
| Affymetrix U133Av2 | RAW | 22215 | 11504 | Liver, lung, spleen | 6 |
| Affymetrix U133Av2 | PM | 22215 | 11504 | Liver, lung, spleen | 6 |
| Affymetrix U133Av2 | PM–MM | 22215 | 11504 | Liver, lung, spleen | 6 |
Agilent's MEAN value is the signal intensity per channel plus local and global background. BSUB is MEAN minus local background. Local background is calculated using negative controls, mean local background and a spatial detrending calculation based on scanner-induced low-frequency multiplicative noise. PROC is background-subtracted, spatially detrended, lowess-normalized and error-modeled data. The error model separates the additive error component for low intensity from the multiplicative component for high intensity, and adds the squared results of all error terms plus the error from the simple background-subtracted signal. Affymetrix MAS5 is the mismatch-subtracted data from GCOS. GC-RMA is the GC-modified robust multi-array variance-stabilizing method. The dChip PM and PM–MM methods are iterative, model-based methods that automatically exclude high-error datapoints.
Figure 1. Graphical view of precision. Intensity replicates (left three columns) are log10 scatter plots of technical replicates for each normalization and tissue. Low scatter indicates higher precision. MvA plots (center three columns) are Bland–Altman charts showing variability (M = log2(S1/S2)) as a function of the average intensity (A = log2 sqrt(S1 × S2)), where S1 and S2 are the two replicate samples for each normalization and tissue. Linearity and low spread indicate high precision without intensity-sourced bias. Ratio replicates (right three columns) are log2 plots of tissue:tissue ratio replicates for each combination of tissue.
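The M and A values behind an MvA plot can be sketched in a few lines. This is only an illustration, assuming the conventional definitions M = log2(S1/S2) and A = log2 sqrt(S1 × S2); the function name `mva` and the example intensities are hypothetical and are not taken from the study.

```python
import math

def mva(s1, s2):
    """Return (M, A) lists for two replicate intensity vectors.

    M = log2(S1 / S2) is the log ratio between replicates.
    A = log2(sqrt(S1 * S2)) = 0.5 * (log2 S1 + log2 S2) is the mean log intensity.
    Probe pairs with a non-positive intensity are skipped (an assumption of
    this sketch, since the log is undefined there).
    """
    m_values, a_values = [], []
    for x, y in zip(s1, s2):
        if x <= 0 or y <= 0:
            continue
        m_values.append(math.log2(x / y))
        a_values.append(0.5 * (math.log2(x) + math.log2(y)))
    return m_values, a_values

# Hypothetical intensities for two technical replicates (not study data).
rep1 = [100.0, 400.0, 1600.0]
rep2 = [100.0, 200.0, 3200.0]
m, a = mva(rep1, rep2)
```

For perfectly reproducible replicates every M value is zero, so spread of M around zero at a given A is a direct readout of imprecision at that intensity.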
Figure 2. Intensity plots using boxplots (top) and line-plots (bottom). Top: boxplots of each array are colored by normalization type. The upper boxplots show Agilent data, with the CY3 and CY5 channels arranged from left to right, respectively. The lower boxplots show Affymetrix data. Bottom: the log10-transformed intensity values are shown as line-plots. High intensity genes are colored red, low intensity genes are colored green. All data is log10-transformed and median normalized.
Figure 3. Hierarchical grouping of 1000 genes selected using a Model I ANOVA for tissue differences ignoring the normalization class. Data was clustered using Euclidean distance to create the gene and experiment trees. Colored bars at the bottom of each dendrogram indicate the normalization method, tissue type or channel where appropriate. Vertical colored bars represent the Euclidean-based k-means gene clusters. Gene overlap was determined sequentially, using probename to RefSeq to HUGO Gene Symbol inside GeneSpring (translate genome function).
Figure 4. Power calculations indicate limits of detection. The log2 ratio between the three tissues is plotted as blue bars along the X-axis. The X-axis is the probe number sorted by the calculated delta; the Y-axis is the log2 fold-change. Red circles indicate statistical significance at P < 0.00001. The black curve is each probe's delta (the minimum detectable difference expressed as a log2 ratio), calculated by computing the post-hoc power for each probe at α = 0.05, β = 0.20 and N = 3 per tissue. The lower the delta, the smaller the difference between tissues that is needed for a ratio to be significant. Wider delta curves imply that a ratio must be large in order to reach significance. The delta curves roughly recapitulate the precision seen in Figure 1, but also provide a graphical view of the distribution and magnitude of ratios versus the proportion of significant genes. GC-RMA tends to show ratios close to the calculated delta; MAS5 shows many high ratios but fewer actual significant genes, implying that false positives are a concern. The PM-only normalization shows good stability across the tissue replicates. The Agilent data shows a uniform distribution of high and low ratios and many significant genes, implying low false positives and, given the number of significant genes, likely low false negatives. Raw Affymetrix data has seemingly high precision, but analysis shows high false negatives and ratios that often disagree in magnitude and direction with other highly correlative probes across both Affymetrix and Agilent data.
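To make the idea of a per-probe delta concrete, here is a rough sketch of a minimum detectable difference for a two-group comparison. It uses the textbook normal approximation, which is an assumption of this example and not necessarily the post-hoc power calculation the authors ran; with only three replicates per group a t-based calculation would give a somewhat larger delta. The function name and the standard deviation of 0.1 are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def min_detectable_log2_diff(sd, n, alpha=0.05, beta=0.20):
    """Approximate minimum detectable difference between two group means.

    Two-sided normal approximation:
        delta = (z_{1 - alpha/2} + z_{1 - beta}) * sd * sqrt(2 / n)
    sd : per-group standard deviation of the log2 intensities
    n  : number of replicates in each group
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = z.inv_cdf(1 - beta)        # quantile for the target power
    return (z_alpha + z_beta) * sd * sqrt(2 / n)

# Hypothetical probe with a log2 standard deviation of 0.1 and N = 3 per tissue.
delta_log2 = min_detectable_log2_diff(sd=0.1, n=3)
fold_change = 2 ** delta_log2  # the same delta expressed in fold-change units
```

Because delta scales linearly with the standard deviation, a normalization that reduces replicate variance lowers the delta curve for every probe.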
Sensitivity results
| Data set | Average Δ | Average MDFC (95th percentile ratio) | Median MDFC (95th percentile ratio) | N |
|---|---|---|---|---|
| Agilent BSUB | 1.13 ± 0.03 | 1.37 ± 0.08 | 1.34 | 3 |
| Agilent MEAN | 1.14 ± 0.08 | 1.30 ± 0.07 | 1.15 | 3 |
| Agilent PROCESSED | 1.28 ± 0.07 | 1.61 ± 0.13 | 1.37 | 3 |
| Affymetrix MAS5 | 1.99 ± 0.69 | 2.38 ± 0.52 | 2.16 | 3 |
| Affymetrix GC-RMA | 1.32 ± 0.21 | 1.31 ± 0.26 | 1.43 | 3 |
| Affymetrix RAW | 1.56 ± 0.21 | 1.58 ± 0.14 | 1.19 | 3 |
| Affymetrix PM | 1.85 ± 0.19 | 2.3 ± 0.14 | 2.16 | 3 |
| Affymetrix PM–MM | 1.65 ± 0.11 | 2.01 ± 0.25 | 1.99 | 3 |
Delta is the minimum detectable difference at α = 0.05, β = 0.20, N = 3, in fold-change units. Delta was averaged per probe, per case and per tissue with the standard deviation shown. The minimum detectable fold-change is the ratio of two technical replicates at the 95th percentile probe. The average was taken across all probes, all tissues and all possible technical replicates. The median MDFC was the middle value across all possible cases.
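The minimum detectable fold-change described above can be illustrated as the 95th percentile of probe-wise ratios between two technical replicates. The sketch below makes two assumptions that the text does not state: each ratio is folded to be at least 1 (larger value over smaller), and the percentile uses linear interpolation. The names `percentile` and `mdfc` and the example intensities are hypothetical.

```python
def percentile(values, q):
    """Percentile with linear interpolation between order statistics; q in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values supplied")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def mdfc(rep1, rep2, q=95.0):
    """Fold-change between two technical replicates at the q-th percentile probe.

    Each probe's ratio is folded so that it is >= 1 (an assumption of this
    sketch); probes with a non-positive intensity are skipped.
    """
    ratios = [max(x, y) / min(x, y)
              for x, y in zip(rep1, rep2) if x > 0 and y > 0]
    return percentile(ratios, q)

# Hypothetical replicate intensities for five probes (not study data).
rep_a = [100.0, 100.0, 100.0, 100.0, 100.0]
rep_b = [100.0, 110.0, 120.0, 130.0, 200.0]
mdfc_value = mdfc(rep_a, rep_b)
```

A fold-change observed between tissues that is smaller than this value cannot be distinguished from replicate-to-replicate noise for 5% of probes.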
Gene Ontology analysis of genes selected by t-test at p < 5.3 × 10⁻⁵ for Agilent and p < 4.5 × 10⁻⁵ for Affymetrix
| Data set | t-test | Liver:Spleen (case1) | t-test | Liver:Lung (case2) | t-test | Spleen:Lung (case3) |
|---|---|---|---|---|---|---|
| Agilent BSUB | 4975 (27%) | catalytic activity: 5.36 × 10⁻¹³ | 6867 (37%) | catalytic activity: 8.87 × 10⁻¹⁰ | 3356 (18%) | |
| e− transport: 1.95 × 10⁻¹² | O2 binding: 1.21 × 10⁻⁹ | lipid binding: 5.67 × 10⁻⁵ | ||||
| signal transducer: 1.53 × 10⁻⁴ | ||||||
| Agilent MEAN | 3682 (20%) | catalytic activity: 2.86 × 10⁻¹⁰ | 4681 (25%) | 2443 (13%) | ||
| e− transport: 2.65 × 10⁻⁸ | catalytic activity: 2.81 × 10⁻¹¹ | lipid binding: 3.25 × 10⁻⁷ | ||||
| structural activity: 8.86 × 10⁻¹¹ | cell adhesion: 8.56 × 10⁻⁶ | |||||
| Agilent PROCESSED | 4979 (26%) | 6809 (36%) | catalytic activity: 8.68 × 10⁻¹¹ | 3440 (18%) | ||
| catalytic activity: 8.89 × 10⁻¹⁰ | O2 binding: 1.37 × 10⁻⁹ | lipid binding: 4.74 × 10⁻⁴ | ||||
| e− transport: 3.22 × 10⁻⁹ | signal transducer: 9.58 × 10⁻⁴ | |||||
| Affymetrix MAS5 | 2644 (12%) | 1065 (5%) | transferase: 2.21 × 10⁻²⁸ | 450 (2%) | cell adhesion: 1.81 × 10⁻¹⁷ | |
| transferase: 3.26 × 10⁻³⁵ | ||||||
| e− transport: 4.93 × 10⁻²³ | transporter: 1.02 × 10⁻²⁴ | receptor binding: 1.63 × 10⁻⁸ | ||||
| Affymetrix GC-RMA | 2192 (10%) | 11916 (54%) | ion channel: 3.63 × 10⁻⁸ | 12793 (58%) | structural molecule: 2.2 × 10⁻⁶ | |
| transferase: 2.31 × 10⁻²¹ | transporter: 5.72 × 10⁻⁸ | ion channel: 7.96 × 10⁻⁴ | ||||
| e− transport: 8.65 × 10⁻¹⁴ | e− transport: 1.76 × 10⁻³ | |||||
| Affymetrix RAW | 1371 (6%) | 2215 (10%) | O2 binding: 1.19 × 10⁻¹⁴ | 2838 (13%) | ||
| O2 binding: 1.56 × 10⁻²⁴ | lipid binding: 2.63 × 10⁻⁹ | structural activity: 3.94 × 10⁻³ | ||||
| transferase: 4.47 × 10⁻¹⁸ | cell adhesion: 9.55 × 10⁻³ | |||||
| Affymetrix PM–MM | 2448 (11%) | 1300 (6%) | lipid binding: 2.12 × 10⁻²¹ | 933 (4%) | ||
| O2 binding: 5.87 × 10⁻²⁰ | structural molecule: 3.9 × 10⁻⁴ | |||||
| MHC antigen: 4.39 × 10⁻¹⁹ | O2 binding: 8.5 × 10⁻¹⁸ | cell adhesion: 1.54 × 10⁻³ | ||||
| Affymetrix PM | 1730 (8%) | DNA binding: 3.99 × 10⁻⁹ | 479 (2%) | immunoglobulin: 3.05 × 10⁻¹⁵ | 1870 (8%) | nucleic acid binding: 3 × 10⁻¹⁵ |
| transcription factor: 8.84 × 10⁻⁶ | immunity protein: 2.87 × 10⁻¹² | structural activity: 9.78 × 10⁻¹³ | ||||
| transcription: 7.69 × 10⁻⁴ | NF-κB cascade: 2.34 × 10⁻⁶ | cell adhesion: 1.4 × 10⁻¹¹ |
The number of significant genes is listed in the t-test column; the top three biological categories from GO are identified along with the probability calculated by a hypergeometric test for overabundance. Agilent data used only the CY5 channel, but the CY3 data is almost identical (data not shown). Bold terms are common across each case.
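A hypergeometric test for overabundance asks how likely it is to see at least the observed number of genes from one GO category in a selected list, if the list were drawn at random from all genes on the array. A minimal sketch using only the standard library follows; the function name and the small example counts are hypothetical and do not reproduce the GeneSpring calculation or any value in the table.

```python
from math import comb

def hypergeom_enrichment_p(k, n_selected, n_category, n_total):
    """Upper-tail hypergeometric probability P(X >= k).

    k          : selected genes that belong to the category
    n_selected : size of the selected (significant) gene list
    n_category : genes on the array annotated to the category
    n_total    : all annotated genes on the array
    """
    upper = min(n_selected, n_category)
    denom = comb(n_total, n_selected)
    # Sum the probability of every outcome at least as extreme as k.
    tail = sum(comb(n_category, i) * comb(n_total - n_category, n_selected - i)
               for i in range(k, upper + 1))
    return tail / denom

# Hypothetical toy example: 10 genes, 4 in the category, 3 selected, 2 of them hits.
p_value = hypergeom_enrichment_p(k=2, n_selected=3, n_category=4, n_total=10)
```

Exact integer arithmetic keeps the sketch simple; for array-sized gene lists a log-space implementation from a statistics library would be faster.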
Figure 5. Overlap between Agilent and Affymetrix data. Using a Model I ANOVA we identified the 1000 genes that are most differentially expressed across the three tissues tested. This analysis identifies the influence of normalization on the amount of overlap. (A) shows the most unmodified data (MEAN and RAW) versus a strong background subtraction method (MAS5). (B) is a comparison among the Agilent normalization methods. (C) and (D) compare highly processed Affymetrix data with Agilent methods. (E) and (F) compare four Affymetrix normalization methods to RAW data. (G) and (L) show that the highest Affymetrix/Agilent overlaps occur between the PROCESSED or BSUB and PM–MM normalizations. (H), (I), (J) and (K) illustrate the various overlaps between and among Agilent and Affymetrix normalizations.
GenMAPP, BioCarta and KEGG metabolic pathways
| Data set | Database | Liver:Spleen (case1) | Liver:Lung (case2) | Spleen:Lung (case3) |
|---|---|---|---|---|
| Agilent BSUB | BioCarta | Intrinsic prothrombin activation | Intrinsic prothrombin activation | NFAT and hypertrophy |
| | GenMAPP | Blood clotting cascade | Blood clotting cascade | Inflammatory response |
| | KEGG | Complement and coagulation | Complement and coagulation | Cytokine–cytokine receptor |
| Agilent MEAN | BioCarta | Complement pathway | Intrinsic prothrombin activation | Nuclear receptors in lipid metabolism and toxicity |
| | GenMAPP | Ribosomal proteins | Blood clotting cascade | GPCRDB Rhodopsin-like |
| | KEGG | Complement and coagulation | Complement and coagulation | Cell communication |
| Agilent PROCESSED | BioCarta | Fibrinolysis | Complement pathway | NFAT and hypertrophy |
| | GenMAPP | Blood clotting | Complement activation classical | Inflammatory response |
| | KEGG | Complement and coagulation cascade | Complement and coagulation cascade | Cytokine–cytokine receptor |
| Affymetrix MAS5 | BioCarta | Intrinsic prothrombin pathway | Intrinsic prothrombin pathway | Oxidative stress-induced gene expression |
| | GenMAPP | Irinotecan pathway | Irinotecan pathway | Inflammation response |
| | KEGG | Complement and coagulation cascade | Complement and coagulation cascade | Cell communication |
| Affymetrix GC-RMA | BioCarta | Intrinsic prothrombin activation | T Helper cell surface molecules | Role of Src kinases in GPCR signaling |
| | GenMAPP | Irinotecan pathway | GPCRDB Rhodopsin-like | GPCRDB Class A Rhodopsin-like |
| | KEGG | Complement and coagulation cascade | Neuroactive ligand receptor interaction | Cytokine–cytokine receptor interaction |
| Affymetrix RAW | BioCarta | TSP1 Induced apoptosis | Toll-like receptor pathway | Regulation of splicing |
| | GenMAPP | Smooth muscle contraction | Apoptosis | Smooth muscle contraction |
| | KEGG | MAPK signaling | MAPK signaling | MAPK signaling |
| Affymetrix PM–MM | BioCarta | Intrinsic prothrombin activation pathway | Intrinsic prothrombin activation pathway | B lymphocyte surface molecules |
| | GenMAPP | Blood clotting cascade | Blood clotting cascade | GPCRDB Class A Rhodopsin-like |
| | KEGG | Complement and coagulation cascade | Complement and coagulation cascade | Cell communication |
| Affymetrix PM | BioCarta | METS effect on macrophage differentiation | Fc epsilon receptor I signaling in Mast cells | T-cell receptor signaling pathway |
| | GenMAPP | Apoptosis | GPCRDB Class A Rhodopsin-like | GPCRDB Class A Rhodopsin-like |
| | KEGG | Cell cycle | Leukocyte transendothelial migration | Insulin signaling pathway |
Each case was used to select 100 significant genes which were tested for the most obvious gene regulatory pathway.
Figure 6. Classifier error rates for tissue comparisons for Agilent and Affymetrix platforms and the associated normalizations. For each iteration, 500 of the most significantly differentially expressed genes were removed until fewer than 500 genes remained. A two-feature forward floating search with bolstering error estimation scored the features; linear discriminant analysis was the classifier rule. Overall error was estimated using cross validation with 500 replicates. (A) shows the lung versus spleen error rates, (B) the liver versus spleen error and (C) the liver versus lung error. Dashed lines in all cases correspond to the Agilent normalization methods; solid lines correspond to the Affymetrix normalizations. Area under the curve was used to establish the rank order.
Area under the error curves (Figure 6) and the corresponding proportion of significant genes at p < 5.3 × 10⁻⁵ for Agilent and p < 4.5 × 10⁻⁵ for Affymetrix (called%
| Data set | Area liver:spleen | % > | Area liver:lung | % > | Area spleen:lung | % > |
|---|---|---|---|---|---|---|
| Agilent BSUB | 0.06 | 72% | 0.05 | 62% | 0.17 | 78% |
| Agilent MEAN | 0.04 | 80% | 0.02 | 75% | 0.16 | 87% |
| Agilent PROCESSED | 0.06 | 73% | 0.05 | 63% | 0.19 | 81% |
| Affymetrix MAS5 | 0.35 | 90% | 0.24 | 46% | 0.19 | 42% |
| Affymetrix GC-RMA | 0.21 | 88% | 0.03 | 95% | 0.02 | 98% |
| Affymetrix RAW | 0.37 | 94% | 0.23 | 90% | 0.15 | 87% |
| Affymetrix PM–MM | 0.25 | 88% | 0.29 | 67% | 0.10 | 78% |
| Affymetrix PM | 0.22 | 90% | 0.17 | 87% | 0.24 | 82% |
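An area under an error curve, as tabulated above, can be approximated with the trapezoidal rule. The sketch below assumes the error is sampled at equally spaced gene-removal iterations and that the x-axis is rescaled to [0, 1]; how the authors scaled the axis is not stated, so the result is illustrative only. The function name and the example error rates are hypothetical.

```python
def area_under_error_curve(errors):
    """Trapezoidal area under an error curve.

    errors : classifier error rates at equally spaced iterations.
    The x-axis is rescaled to [0, 1] (an assumption of this sketch), so the
    area equals the average height of the curve.
    """
    if len(errors) < 2:
        raise ValueError("need at least two points")
    step = 1.0 / (len(errors) - 1)
    return sum((a + b) / 2.0 * step for a, b in zip(errors, errors[1:]))

# Hypothetical error rates that rise as the most significant genes are removed.
example_errors = [0.0, 0.1, 0.2, 0.3, 0.4]
area = area_under_error_curve(example_errors)
```

A smaller area means the classifier stays accurate for longer as the most discriminating genes are stripped away, which is the basis for the rank order.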
Figure 7. Probe distance comparisons. Probe locations for the 11 Affymetrix 25-mers and the single Agilent 60-mer are plotted along the target gene on the X-axis. Color in this case indicates the average log2 ratio between liver and spleen for two single normalizations, GC-RMA (Affymetrix) and PROCESSED CY5 (Agilent). Other normalizations and tissues produced similar results. Red indicates high relative signal in liver; green indicates high relative signal in spleen. The length of the probe is proportional to the amount of gene sequence shown in the diagram, which in turn is defined by the distance between the most distant probes. Blue triangles indicate introns; numbers along the bottom of each graph indicate the amount of gene up- and downstream of the current window. The Y-axis (temp) is the Tm for each probe calculated in standard salt conditions. The left column contains genes that correlated well across Agilent CY5 PROCESSED and Affymetrix GC-RMA. The right column contains genes with poor correlation. Other normalization/tissue combinations produced lists of different genes that were either well or poorly correlated, but the pattern seen here was conserved.