Carole L Yauk, M Lynn Berndt.
Abstract
DNA microarray technologies are used in a variety of biological disciplines. The diversity of platforms and analytical methods employed has raised concerns over the reliability, reproducibility and correlation of data produced across the different approaches. Initial investigations (years 2000-2003) found discrepancies in the gene expression measures produced by different microarray technologies. Increasing knowledge and control of the factors that result in poor correlation among the technologies have led to much higher levels of correlation in more recent publications (years 2004 to present). Here, we review the studies examining the correlation among microarray technologies. We find that with improvements in the technology (optimization and standardization of methods, including data analysis) and annotation, analysis across platforms yields highly correlated and reproducible results. We suggest several key factors that should be controlled when comparing across technologies and that constitute good microarray practice in general.
Year: 2007 PMID: 17370338 PMCID: PMC2682332 DOI: 10.1002/em.20290
Source DB: PubMed Journal: Environ Mol Mutagen ISSN: 0893-6692 Impact factor: 3.216
Fig. 1. Number of publications retrieved from PubMed* using DNA microarray technologies. *PubMed search criteria: “microarray” [all fields] OR “microarrays” [all fields] OR “genechip” [all fields] OR “genechips” [all fields] AND “dna” [all fields] OR “cdna” [all fields] OR “complimentary dna” [all fields] OR “oligonucleotides” [all fields] OR “oligonucleotide” [all fields] Limits: XXXX [Publication Date].
Fig. 2. Summary of choices for microarray experiments.
Studies Examining the Correlation Among Microarray Technologies from 2000 to 2003
| Publication | Platforms | Probe ID | Validation | Authors' conclusion |
|---|---|---|---|---|
| | Operon 50mer, PCR probes | Sequence similarity | None | Agreement |
| Hughes et al. [2001] | Agilent oligo, cDNA | Sequence similarity; not clearly specified | None | Agreement |
| | Affymetrix, custom cDNA | Sequence similarity (47 genes studied, all sequence confirmed on cDNA platform) | Quantitative (Q) RT-PCR | Agreement |
| | Affymetrix, cDNA | Sequence similarity: for each cDNA probe, a best-matching probe set was identified by sequence alignment | None | Disagreement. Poor correlation between technologies. 56 of the NCI-60 cell lines were studied; independent microarray experiments were compared from two labs using different materials and protocols. Cells were cultured independently and all were processed separately. Observed that cross-hybridization of genes to various probes reduced correlation of cDNA arrays to the oligonucleotide arrays. Low-abundance transcripts performed poorly. Subsequent re-analysis and improvement by sequence matching [ |
| | Affymetrix, Incyte cDNA | Sequence similarity: a subset of clones was sequence verified | Northern | Disagreement. Noted that a large proportion of cDNAs were incorrectly annotated. Discrepancies were related to probes |
| | Affymetrix, Incyte cDNA | UniGene or GenBank | QRT-PCR | Disagreement. Conclusion based primarily on sensitivity and specificity: Affymetrix found 218 genes differentially expressed versus 4 for cDNA |
| Barczak et al. [2003] | Affymetrix, Operon 70mer | UniGene ID | None | Agreement |
| | Agilent oligonucleotide, cDNA | Sequence matched | QRT-PCR | Agreement |
| | Affymetrix, cDNA | UniGene | QRT-PCR | Agreement |
| | cDNA, custom oligo | Sequence similarity | RT-PCR | Agreement |
| | Affymetrix, Clontech cDNA | GenBank ID | QRT-PCR and Q-immunoblot | Disagreement. Small sample size; examined a list of genes that arbitrarily had 1.7-fold or greater change |
| | Affymetrix, Agilent cDNA, Amersham 30mer | GenBank ID | None | Disagreement. Correlations in expression level and significant gene expression were divergent. Subsequent reanalysis and improvement by |
| | Febit Geniom, Affymetrix, cDNA | Not described | QRT-PCR | Agreement |
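
The comparisons summarized in this table rest on two steps: pairing probes between platforms through a shared identifier (sequence match, UniGene ID or GenBank ID) and then correlating the paired measurements. The following is a minimal sketch of that idea, not the method of any study listed. The function names, gene identifiers and log2 ratios are all hypothetical and chosen only for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def cross_platform_correlation(platform_a, platform_b):
    """Correlate log2 ratios for identifiers present on both platforms."""
    shared = sorted(set(platform_a) & set(platform_b))
    r = pearson([platform_a[g] for g in shared],
                [platform_b[g] for g in shared])
    return shared, r


# Hypothetical log2(treated/control) ratios keyed by a shared identifier.
oligo_array = {"geneA": 1.8, "geneB": -0.9, "geneC": 0.1, "geneD": 2.4}
cdna_array = {"geneA": 1.2, "geneB": -0.7, "geneC": 0.0, "geneE": 0.5}

shared_ids, r = cross_platform_correlation(oligo_array, cdna_array)
print(shared_ids, round(r, 3))
```

Only identifiers found on both platforms enter the correlation, which is why the quality of the probe matching (the "Probe ID" column) bounds the agreement that can be observed.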
Studies Examining the Correlation Among Microarray Technologies from 2004 to 2006
| Publication | Platforms | Probe ID | Validation | Authors' conclusion |
|---|---|---|---|---|
| | Affymetrix, Agilent cDNA, Agilent oligo, Codelink (Amersham), Mergen, NIA cDNA | UniGene ID | None | Agreement. Good platforms correlate well. Expression profiles clustered by biology rather than technology. cDNA platforms less sensitive than oligonucleotide platforms |
| | Affymetrix, in-house spotted cDNA and oligonucleotide | MGI identifiers | None | Agreement. Good concordance in expression level and statistical significance between Affymetrix and oligonucleotide arrays; cDNA arrays showed poor concordance with other platforms |
| | Affymetrix, Agilent cDNA | UniGene ID | Real-time RT-PCR | Moderate agreement. Gene changes overlapping between the two platforms were co-directional; RT-PCR validation rates were similar |
| | Affymetrix, Agilent cDNA, custom oligonucleotides from three different sources printed by Agilent | UniGene ID | Real-time RT-PCR | Agreement. Expression level correlation low, but log ratio correlation high; correlation is stronger for highly expressed genes |
| | Affymetrix, Agilent cDNA | Sequence matched | None | Agreement. Cross-platform analysis greatly improved by sequence matching |
| | Affymetrix, cDNA | UniGene (sequence matched) | Real-time RT-PCR | Disagreement. Poor correlation when matched by expression level |
| | Affymetrix, Agilent cDNA, custom cDNA | UniGene ID | None | Moderate agreement. Good correlations between commercial platforms, whereas correlations between the custom-made and either commercial platform were lower. Discrepant findings due to clone errors, old annotations, or unknown causes |
| | Affymetrix, Codelink (Amersham) | UniGene ID | Real-time RT-PCR | Agreement. After noise adjustment (use present genes) |
| | Affymetrix, Clontech, Incyte, NIEHS, Molecular Dynamics, PHASE-1 | Comparison of pathways | Real-time RT-PCR | Agreement. Correlation of the biological pathways involved in response to toxicant exposure |
| | Affymetrix, Millennium Pharmaceuticals cDNA | UniGene ID, then sequence matched using BLAST | None | Moderate agreement. Increased significantly after sequence matching; discrepant correlations between Affymetrix and cDNA measurements could be explained by probe sequence differences |
| | Affymetrix (5 labs), cDNA (3 labs), 2-color oligonucleotide (Qiagen 70mer; analyzed in 2 labs) | UniGene, LocusLink, RefSeq | Real-time RT-PCR | Agreement among best-performing laboratories; increased data quality with more stringent pre-processing |
| | Affymetrix (inter- and intra-laboratory correlation) | None required | None | Agreement. Intra-laboratory correlation was only slightly stronger than inter-laboratory correlation. Samples clustered by biology rather than laboratory |
| | Affymetrix, TIGR cDNA | Sequence mapped TIGR | Real-time RT-PCR | Agreement. Biological treatment had a greater effect on gene expression than platform for 90% of the genes |
| | Affymetrix, Agilent, Amersham (Codelink), Compugen, Operon, 2 custom oligo, 5 custom cDNA | Transcripts matched using NIA mouse index | None | Moderate agreement. Standardized protocols and data analysis required |
| | Affymetrix, Genomic Amplicon arrays, Operon oligo | Locus ID (Arabidopsis) | Northern blot | Moderate agreement. Signal intensity-dependent |
| | | GenBank accession no. | N/A | Agreement. Concluded that the quality of the original dataset was poor and inappropriate methods were applied for analysis. Alternate analysis had 10× greater concordance |
| Barnes et al. [2005] | Affymetrix, Illumina BeadArrays | Sequence matched using BLAST [ | None | Agreement for genes with high expression; concordance improved for probes that were verified to target the same transcript |
| Gwinn et al. [2005] | Affymetrix, Amersham (Codelink), cDNA | Amersham (LocusLink ID), Affymetrix (GenBank); links made via IDs given by company; genes of interest used probe sequence | Real-time RT-PCR | Disagreement. Each platform yielded unique gene expression profiles |
| | Affymetrix, in-house cDNA, in-house Operon oligonucleotide | UniGene | Real-time RT-PCR | Agreement. High concordance for significant expression ratios and 1.5 to 2-fold changes (93–99%) |
| | Affymetrix, Stanford cDNA | Sequence matched | None | Agreement. Re-examination of NCI-60 cell line data. Overlapping probes correlate well |
| | Affymetrix, Agilent oligo, cDNA | UniGene ID | None | Agreement. Data were more consistent between two commercial platforms and less consistent between custom arrays and commercial arrays; expression at the gene level exhibited an acceptable level of agreement. Lab and sample effect was greater than platform effect |
| | Affymetrix, in-house long oligo | UniGene ID | Real-time RT-PCR | Agreement. Similar profiles and strong correlations were found for the 2 platforms |
| | 6 different cDNA and oligo array studies previously published from several laboratories | UniGene ID | None | Agreement. Integrated raw microarray data from different studies for supervised classifications. More platforms better for predictive analysis |
| | Affymetrix, Applied Biosystems | Promote analysis | Real-time RT-PCR | Agreement. AB more sensitive and more correlated with RT-PCR |
| | Affymetrix, Amersham (Codelink) | LocusLink ID | Real-time RT-PCR | Disagreement. Only 9 genes found to be differentially expressed in common out of 42 (Affymetrix) and 105 (Codelink) in total |
| | Affymetrix, GE Healthcare (Amersham), Agilent | Sequence mapped | Real-time RT-PCR | Moderate agreement. 1-color more precise than 2-color; Affymetrix and Agilent were more concordant based on detection of differential genes |
| | Agilent, Applied Biosystems | Sequence matched (BLAST) | Real-time RT-PCR | Agreement. 1375 genes confirmed with RT-PCR |
| | Affymetrix, Amersham, Mergen, ABI, custom cDNA, MGH, MWG, Agilent, Compugen, Operon | Probes sequence matched within 1 exon (UniGene, LocusLink, RefSeq, RefSeq exon) | Real-time RT-PCR | Strong agreement. Commercial better than in-house, 1-color better than 2-color |
| | Affymetrix, Agilent (1- and 2-color), Applied Biosystems, Eppendorf, GE Healthcare, Illumina, in-house spotted Operon oligonucleotide | Probes sequence mapped to RefSeq and to AceView using 3′ probe for genes with multiple oligonucleotides | Real-time RT-PCR; TaqMan® (Roche Molecular Systems); StaRT-PCR and QuantiGene carried out by | Strong agreement. Intra-platform consistency across test sites and high inter-platform concordance with respect to differentially expressed genes; high correlation between QRT-PCR values and microarray results |
| | Affymetrix, Agilent, Applied Biosystems, GE Healthcare | Probes sequence mapped to RefSeq | None | Agreement. High inter-site and cross-platform concordance in the detection of differential gene expression using fold change rankings; fold change ranking outperforms other analysis methods |
| | Affymetrix, Agilent, Applied Biosystems, Eppendorf, GE Healthcare, Illumina | Probes sequence mapped to RefSeq | Real-time RT-PCR; TaqMan® (997 genes), StaRT-PCR (205 genes), and QuantiGene (244 genes) | Agreement. High correlation between gene expression values and microarray results. Main variable was probe sequence and target location |
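
Several studies in this table judge concordance by the overlap between lists of differentially expressed genes, and one reports that ranking genes by fold change gives high inter-site and cross-platform concordance. The sketch below illustrates that kind of comparison under stated assumptions: the helper names are hypothetical, the log2 ratios are invented, and "percent overlap" is defined here simply as shared genes divided by list length, which may differ from the metric used in the cited work.

```python
def top_by_fold_change(log_ratios, n):
    """Return the n identifiers with the largest absolute log2 ratio.

    Ties are broken alphabetically so the ranking is deterministic.
    """
    ranked = sorted(log_ratios, key=lambda g: (-abs(log_ratios[g]), g))
    return ranked[:n]


def percent_overlap(list_a, list_b):
    """Percentage of genes shared by two equal-length gene lists."""
    if not list_a or len(list_a) != len(list_b):
        raise ValueError("lists must be non-empty and of equal length")
    return 100.0 * len(set(list_a) & set(list_b)) / len(list_a)


# Hypothetical log2 ratios for the same genes measured at two sites
# (or on two platforms); values are invented for illustration.
site_1 = {"g1": 2.1, "g2": -1.7, "g3": 0.2, "g4": 1.1, "g5": -0.1}
site_2 = {"g1": 1.9, "g2": -1.5, "g3": 1.3, "g4": 0.4, "g5": 0.0}

top_1 = top_by_fold_change(site_1, 3)
top_2 = top_by_fold_change(site_2, 3)
print(top_1, top_2, percent_overlap(top_1, top_2))
```

Ranking on the absolute log ratio treats up- and down-regulated genes alike; a stricter comparison could also require that shared genes change in the same direction.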