Li-Xuan Qin, Richard P Beyer, Francesca N Hudson, Nancy J Linford, Daryl E Morris, Kathleen F Kerr.
Abstract
BACKGROUND: There are currently many different methods for processing and summarizing probe-level data from Affymetrix oligonucleotide arrays. It is of great interest to validate these methods and identify those that are most effective. There is no single best way to do this validation, and a variety of approaches is needed. Moreover, gene expression data are collected to answer a variety of scientific questions, and the same method may not be best for all questions. Only a handful of validation studies have been done so far, most of which rely on spike-in datasets and focus on detecting differential expression. Here we seek methods that excel at estimating relative expression. We evaluate methods by identifying those that give the strongest linear association between expression measurements by array and by the "gold-standard" assay. Biologists generally consider quantitative reverse-transcription polymerase chain reaction (qRT-PCR) the "gold-standard" assay for measuring gene expression, and it is often used to confirm findings from microarray data. Here we use qRT-PCR measurements to validate methods for the components of processing oligo array data: background adjustment, normalization, mismatch adjustment, and probeset summary. An advantage of our approach over spike-in studies is that methods are validated on a real dataset that was collected to address a scientific question.
Year: 2006 PMID: 16417622 PMCID: PMC1360686 DOI: 10.1186/1471-2105-7-23
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Biological Samples. RNA samples were from an unbalanced 2 × 2 factorial design. The 24 mice were either young or old, and either wild-type or carriers of the MCAT transgene, which directs overexpression of human catalase to the mitochondrial cellular compartment. Transgene overexpression extends lifespan [16], and thus gene expression differences between MCAT and age-matched wild-type mice would be expected.
| | Wild-type | MCAT |
| Young | N = 6 | N = 8 |
| Old | N = 5 | N = 5 |
Methods under Evaluation. Summary of the six methodologies for oligonucleotide array data that were compared in this study. Details on the methodologies can be found in the references. An asterisk (*) marks components of methods that were studied in the follow-up analysis (see Table 5).
| Method | Background Adjustment | Normalization | Mismatch adjustment | Probeset Summary | Reference |
| MAS5 | regional adjustment* | scaling by a constant* | subtract idealized mismatch* | Tukey biweight average* | [17] |
| gcRMA | by GC content of probe* | quantile normalization* | PM only* | medianpolish* (robust fit of linear model) | [9] |
| RMA | whole array adjustment | quantile normalization* | PM only* | medianpolish* (robust fit of linear model) | [2] |
| VSN | none* | variance stabilizing transformation | PM only* | medianpolish* (robust fit of linear model) | [18] |
| dChip | none* | invariant set* | PM only* | Li-Wong multiplicative model* | [19] |
| dChip.mm | none* | invariant set* | subtract mismatch* | Li-Wong multiplicative model* | [19] |
Figure 1. Agreement between array and qRT-PCR for the comparison of Y-WT and O-WT mice. For the Y-WT vs. O-WT contrast, the figure shows estimates of relative expression from the array data, processed with six different methodologies, compared to qRT-PCR. Estimated differences are on the log2 scale. Genes indicated with an open circle are influential genes according to the sensitivity analysis. The number on each scatterplot is the Pearson correlation.
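As a rough illustration of the number printed on each scatterplot, the Pearson correlation between array-based and qRT-PCR-based log2 differences could be computed along these lines. The function name `pearson` and all numeric values are made up for illustration; they are not data from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical log2 differences (Y-WT minus O-WT) for five genes.
array_log2_diff = [0.8, -0.3, 1.5, 0.1, -1.1]
qpcr_log2_diff = [1.0, -0.5, 2.1, 0.0, -1.6]

r = pearson(array_log2_diff, qpcr_log2_diff)
print(f"Pearson r = {r:.3f}")
```

Repeating this for each of the six processed versions of the array data would give one correlation per method for a given contrast.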
Figure 2. Genes selected for qRT-PCR are medium to high intensity in array data. The plot highlights the genes selected for qRT-PCR in a scatterplot of the YWT vs. OWT contrast against the mean signal intensity. Data were processed with gcRMA for this plot. Selected genes span a large range of average signal intensity with the notable exception of low-intensity genes. See AdditionalFigures.doc for similar figures for the other contrasts.
Figure 3. Relative performance of the six methodologies for six summary contrasts of the data. MAS5, gcRMA, and dChip.mm consistently outperform the other methods, although all methods performed comparably on the 'Interaction' contrast. Correlations are lower for contrasts for which there is less differential expression, as seen in the scatterplots such as Figure 1 [see AdditionalFigures.doc]. However, the interesting comparisons are between the six correlations for a given contrast.
Results of the leave-one-out sensitivity analysis. For each contrast, an individual gene is listed if its removal produced a change in the ranking of the six methodologies. The third column shows how the ranking of the six methodologies changed upon removal of the gene. Here, M = MAS5, G = gcRMA, R = RMA, V = VSN, D = dChip, D- = dChip.mm. Bold font highlights changes. Note that all changes in rankings, with two exceptions, were transpositions of two adjacent methods or a shuffle of three adjacent methods.
| CONTRAST | GENE | RANKING |
| Age | 36 | ( |
| Genotype | 3 | ( |
| Genotype | 12 | ( |
| Genotype | 26 | (G,M, |
| Genotype | 27 | (G, |
| Interaction | 4,19 | (M, |
| Interaction | 6,11,24,27,34,37,43,47 | ( |
| Interaction | 8 | (M,D-, |
| Interaction | 26 | ( |
| Interaction | 32 | (M,D-,G,R, |
| Interaction | 12 | ( |
| YWT-OWT | 3 | (M,G,D,R, |
| YWT-OWT | 37 | ( |
| YWT-YMCAT | 3 | (M, |
| OWT-OMCAT | 2,4,8,15,16,20,27,31,34,35,43,47 | (D-,G,V, |
| OWT-OMCAT | 10,12,17,18,21,36,39,40,45 | (D-,G, |
| OWT-OMCAT | 23 | (D-, |
| OWT-OMCAT | 5,6,13,32,33 | (D-,G, |
| OWT-OMCAT | 3 | (D-, |
| OWT-OMCAT | 26 | ( |
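The leave-one-out procedure described in the caption can be sketched as follows: drop one gene at a time, recompute each method's correlation with qRT-PCR, and flag the gene if the ordering of methods changes. This is a minimal sketch that assumes methods are ranked by Pearson correlation, as in Figure 1; the helper names and the toy data (three methods, five genes) are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rank_methods(array_by_method, qpcr, skip=None):
    """Method names ordered by decreasing correlation with qRT-PCR,
    optionally leaving out the gene at index `skip`."""
    keep = [i for i in range(len(qpcr)) if i != skip]
    ref = [qpcr[i] for i in keep]
    scores = {m: pearson([v[i] for i in keep], ref)
              for m, v in array_by_method.items()}
    return tuple(sorted(scores, key=scores.get, reverse=True))

def influential_genes(array_by_method, qpcr):
    """Return the full-data ranking and, for each gene whose removal
    changes that ranking, the ranking obtained without it."""
    baseline = rank_methods(array_by_method, qpcr)
    changed = {}
    for g in range(len(qpcr)):
        ranking = rank_methods(array_by_method, qpcr, skip=g)
        if ranking != baseline:
            changed[g] = ranking
    return baseline, changed

# Hypothetical log2 differences for five genes (not study data).
qpcr = [1.0, -0.5, 2.0, 0.0, -1.5]
array_by_method = {
    "M": [0.9, -0.4, 1.8, 0.1, -1.4],
    "G": [1.0, -0.5, 2.0, 0.0, 0.5],   # agrees with qRT-PCR except for the last gene
    "R": [0.5, 0.2, 0.9, -0.3, -0.6],
}

baseline, changed = influential_genes(array_by_method, qpcr)
print("baseline ranking:", baseline)
for gene, ranking in changed.items():
    print(f"removing gene index {gene} gives {ranking}")
```

In the toy data, method "G" disagrees with qRT-PCR only on the last gene, so removing that gene moves it to the top of the ranking, which is the kind of change the table records.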
Results of the leave-two-out sensitivity analysis. The table shows that removing gene pairs produced only minor changes in our findings.
| CONTRAST | # gene-pairs considered (non-influential singleton genes) | # influential gene-pairs | # of these pairs that produce a single transposition of neighbors | # of these pairs that produce a shuffle of the top-three methods |
| Age | 1035 | 11 | 11 | 0 |
| Genotype | 903 | 3 | 3 | 0 |
| Interaction | 528 | 51 | 43 | 8 |
| YWT-OWT | 990 | 10 | 10 | 0 |
| YWT-YMCAT | 1035 | 14 | 14 | 0 |
| OWT-OMCAT | 153 | 2 | 2 | 0 |
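The "# gene-pairs considered" column can be cross-checked against the leave-one-out table: if pairs are drawn only from genes that were not influential on their own, the count should be the number of 2-element subsets of the remaining genes. The short check below assumes 47 qRT-PCR genes in total; that figure is an inference from the largest gene label in the leave-one-out table, not a number stated in this excerpt.

```python
from math import comb

TOTAL_GENES = 47  # assumption: inferred from the largest gene label in the leave-one-out table

# Number of distinct genes listed for each contrast in the leave-one-out table.
influential_singletons = {
    "Age": 1,
    "Genotype": 4,
    "Interaction": 14,
    "YWT-OWT": 2,
    "YWT-YMCAT": 1,
    "OWT-OMCAT": 29,
}

# "# gene-pairs considered" column of the leave-two-out table.
pairs_reported = {
    "Age": 1035,
    "Genotype": 903,
    "Interaction": 528,
    "YWT-OWT": 990,
    "YWT-YMCAT": 1035,
    "OWT-OMCAT": 153,
}

# Pairs that can be formed from the non-influential genes of each contrast.
pairs_computed = {
    contrast: comb(TOTAL_GENES - k, 2)
    for contrast, k in influential_singletons.items()
}

for contrast in pairs_reported:
    print(contrast, pairs_computed[contrast], pairs_reported[contrast])
```

Under that assumption the computed and reported counts agree for all six contrasts.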
Figure 4. Variability of methods within biological replicates as related to signal intensity. For each version of the data, the standard deviation of measurements within each of the four biological groups was calculated, and these were pooled to form a single standard deviation for each gene. These were plotted against the mean intensity for that gene, and fitted with a non-parametric smoother to summarize the trend. The fitted smooths are shown above. MAS5 shows the greatest variability, and MAS5 and the dChip mismatch model both show greater variability at low intensities.
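The caption does not spell out how the four within-group standard deviations were pooled. A minimal sketch, assuming the usual pooled estimate that weights each group's sample variance by its degrees of freedom, is shown below; the function name and expression values are hypothetical, and only the group sizes (6, 8, 5, 5) come from the design table.

```python
import math
from statistics import variance

def pooled_sd(groups):
    """Pooled standard deviation: within-group sample variances
    weighted by their degrees of freedom (n_i - 1)."""
    num = sum((len(g) - 1) * variance(g) for g in groups)
    den = sum(len(g) - 1 for g in groups)
    return math.sqrt(num / den)

# Hypothetical log2 expression values for one gene in the four groups.
young_wt = [7.1, 7.3, 6.9, 7.0, 7.2, 7.1]
young_mcat = [7.4, 7.6, 7.5, 7.3, 7.7, 7.5, 7.4, 7.6]
old_wt = [6.8, 6.6, 6.9, 6.7, 6.5]
old_mcat = [7.0, 7.2, 6.9, 7.1, 7.3]

sd = pooled_sd([young_wt, young_mcat, old_wt, old_mcat])
print(f"pooled SD = {sd:.3f}")
```

Pooling within groups keeps real differences between the groups from inflating the variability estimate, which is why the group means are not combined first.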
Figure 5. Bias across contrasts for the six methods. Slopes of the least squares regression lines fitted to scatterplots such as in Figure 1. The three methods that showed the best correlation between array and qRT-PCR, MAS5, gcRMA, and the dChip mismatch model, consistently show the least bias, with slopes closest to 1.
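For reference, the slope used as a bias measure is the ordinary least squares slope of one set of log2 differences regressed on the other. The sketch below assumes the array estimate is regressed on the qRT-PCR estimate (the caption does not say which variable is on which axis); the function name and data are illustrative only.

```python
def ols_slope(x, y):
    """Slope of the least squares line for y regressed on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical log2 differences: the array values are exactly half the qRT-PCR values.
qpcr_log2_diff = [1.0, -0.5, 2.0, 0.0, -1.5]
array_log2_diff = [0.5, -0.25, 1.0, 0.0, -0.75]

slope = ols_slope(qpcr_log2_diff, array_log2_diff)
print(f"slope = {slope:.2f}")
```

Under this orientation, a slope below 1 means the array compresses differences relative to qRT-PCR even when the correlation is high, which is why slope and correlation are reported separately.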
Components of Methods Examined in Follow-Up Analysis. The table gives abbreviations for the methods in Table 2 that are studied in the follow-up analysis. These abbreviations are used in RESULTS and Figures 6–9.
| Method | Background Adjustment | Normalization | Mismatch adjustment | Probeset Summary | Reference |
| MAS5 | BA-RA | constant | adjustedMM | TukeyAverage | [17] |
| gcRMA | BA-GC | quantile | PMonly | medianpolish | [9] |
| dChip.mm | BA-none | invariantset | subtractMM | Li-Wong | [19] |
Figure 6. Correlations for each of the six contrasts for 56 combinations. See Table 5 for notation. Curves are colored by (a) the method of background adjustment, (b) the method of normalization, (c) the use of MM probe data, (d) the method for summarizing data across a probeset. No sub-method is clearly uniformly superior.
Figure 7. BA-GC and Li-Wong perform well together. See Table 5 for notation. Combinations that used both BA-GC and the Li-Wong summary method perform consistently well (green curves). Replacing BA-GC with subtractMM was not as good (yellow curves). Replacing Li-Wong with medianpolish is also arguably less effective (blue curves). Note: BA-GC could not be combined with subtractMM, because this resulted in negative values.
Figure 8. adjustedMM and TukeyAverage perform well together. See Table 5 for notation. Combinations that used two components of MAS5, adjustedMM and TukeyAverage, perform consistently well (black solid curves), as long as BA-GC was not also used (dotted black curves). Using another probeset summary method with adjustedMM was not as effective (red and green curves).
Figure 9. Groups of methods that consistently performed well. Methods that combine BA-GC with Li-Wong are consistently near the top (green curves), as are methods that combine adjustedMM with TukeyAverage (black curves), as long as BA-GC was not also used. Other consistent performers are also noted in the figure.