Huixia Wang, Xuming He, Mark Band, Carole Wilson, Lei Liu.
Abstract
As gene expression profile data from DNA microarrays accumulate rapidly, there is a natural need to compare data across labs and platforms. Comparisons of microarray data can be quite challenging due to data complexity and variability. Different labs may adopt different technology platforms. One may ask about the degree of agreement we can expect from different labs and different platforms. To address this question, we conducted a study of inter-lab and inter-platform agreement of microarray data across three platforms and three labs. The statistical measures of consistency and agreement used in this paper are the Pearson correlation, intraclass correlation, kappa coefficients, and a measure of intra-transcript correlation. The three platforms used in the present paper were Affymetrix GeneChip, custom cDNA arrays, and custom oligo arrays. Using the within-platform variability as a benchmark, we found that these technology platforms exhibited an acceptable level of agreement, but the agreement between two technologies within the same lab was greater than that between two labs using the same technology. The consistency of replicates in each experiment varies from lab to lab. When there is high consistency among replicates, different technologies show good agreement within and across labs using the same RNA samples. On the other hand, the lab effect, especially when confounded with the RNA sample effect, plays a bigger role than the platform effect on data agreement.
Year: 2005 PMID: 15888200 PMCID: PMC1142313 DOI: 10.1186/1471-2164-6-71
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Summary of data collection
| Label | Array | Array size | RNA source | No. of replicates | Expression measure | Technology | Lab | Data source |
| KC | CI 15K cDNA | 15K | Clontech | 4 | Raw intensity | cDNA | Keck | In house |
| KAV | Affymetrix 430A | 23K | Clontech | 2 | AV(Bioconductor) | Affymetrix | Keck | In house |
| KLW | Affymetrix 430A | 23K | Clontech | 2 | Li and Wong | Affymetrix | Keck | In house |
| KRMA | Affymetrix 430A | 23K | Clontech | 2 | RMA | Affymetrix | Keck | In house |
| CC | Riken16K cDNA by Agilent | 16K | Clontech | 3 | Raw intensity | cDNA | Cal Tech | NCBI GEO |
| CO | Riken16K Oligo by Agilent | 16K | Clontech | 3 | Raw intensity | Oligo | Cal Tech | NCBI GEO |
| GNF | Affymetrix U74Av2 | 12K | In house | 2 | AV(MAS4.0) | Affymetrix | GNF | expression.gnf.org |
Figure 1. Consistency of replicates.
Correlation coefficients for pairwise comparisons between data sets. Pearson correlation coefficients (PCC), kappa coefficients (Kappa), intraclass correlation coefficients (ICC) and intra-transcript correlation coefficients (ITC) for pairwise comparisons.
| Comparisons | No. of Matched Unigene IDs | PCC | Kappa | ICC | ITC |
| GNF vs. KC | 1,838 | 0.590 | 0.327 | 0.693 | 0.748 |
| GNF vs. CC | 1,374 | 0.513 | 0.312 | 0.678 | 0.774 |
| GNF vs. CO | 1,914 | 0.633 | 0.365 | 0.707 | 0.729 |
| GNF vs. KAV | 2,058 | 0.727 | 0.452 | 0.686 | 0.724 |
| GNF vs. KLW | 3,295 | 0.640 | 0.374 | 0.681 | 0.690 |
| GNF vs. KRMA | 3,452 | 0.686 | 0.400 | 0.706 | 0.705 |
| KC vs. CC | 2,730 | 0.597 | 0.363 | 0.681 | 0.830 |
| KC vs. CO | 3,043 | 0.641 | 0.423 | 0.714 | 0.812 |
| KC vs. KAV | 2,964 | 0.747 | 0.523 | 0.726 | 0.908 |
| KC vs. KLW | 4,362 | 0.680 | 0.461 | 0.714 | 0.868 |
| KC vs. KRMA | 4,516 | 0.725 | 0.493 | 0.736 | 0.893 |
| CC vs. CO | 3,262 | 0.688 | 0.429 | 0.770 | 0.836 |
| CC vs. KAV | 2,285 | 0.708 | 0.461 | 0.746 | 0.859 |
| CC vs. KLW | 3,658 | 0.650 | 0.407 | 0.739 | 0.837 |
| CC vs. KRMA | 3,843 | 0.707 | 0.472 | 0.772 | 0.862 |
| CO vs. KAV | 3,001 | 0.806 | 0.555 | 0.781 | 0.865 |
| CO vs. KLW | 4,725 | 0.759 | 0.503 | 0.782 | 0.847 |
| CO vs. KRMA | 5,018 | 0.805 | 0.580 | 0.813 | 0.854 |
| KAV vs. KLW | 7,181 | 0.923 | 0.666 | 0.832 | 0.917 |
| KAV vs. KRMA | 7,237 | 0.955 | 0.734 | 0.848 | 0.938 |
| KLW vs. KRMA | 14,130 | 0.921 | 0.732 | 0.765 | 0.971 |
Figure 2. Correlation coefficients for pairwise comparisons between data sets.
Sensitivity check of the overall comparison among all the data sets
| | ICC |
| Before leaving out | 0.662 |
| Leave out GNF | 0.703 |
| Leave out KLW | 0.650 |
| Leave out KC | 0.663 |
| Leave out CC | 0.684 |
| Leave out CO | 0.670 |
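The leave-one-out procedure behind this sensitivity check can be sketched as follows. This is only an illustration under stated assumptions: the record does not say which ICC estimator was used, so the sketch assumes a one-way random-effects ICC, and the function names (`icc_oneway`, `leave_one_out_icc`) and the toy values are hypothetical and not taken from the paper.

```python
from statistics import mean

def icc_oneway(rows):
    """One-way random-effects ICC (an assumed estimator, not confirmed by the source).

    Each row holds one transcript's values; each column is one data set.
    """
    n = len(rows)        # number of transcripts
    k = len(rows[0])     # number of data sets
    grand = mean(v for row in rows for v in row)
    row_means = [mean(row) for row in rows]
    # Between-transcript and within-transcript mean squares
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum((v - m) ** 2 for row, m in zip(rows, row_means) for v in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

def leave_one_out_icc(data):
    """data maps a data set label to values aligned on the same transcripts."""
    labels = list(data)

    def icc_for(keep):
        # Transpose the kept columns into one row per transcript
        return icc_oneway([list(vals) for vals in zip(*(data[label] for label in keep))])

    result = {"all": icc_for(labels)}
    for left_out in labels:
        result["leave out " + left_out] = icc_for([l for l in labels if l != left_out])
    return result

# Hypothetical toy values (NOT from the paper): 5 transcripts, 4 data sets.
toy = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [1.1, 2.1, 2.9, 4.2, 4.8],
    "C": [0.9, 1.8, 3.2, 3.9, 5.1],
    "D": [3.0, 1.0, 4.0, 2.0, 3.5],  # deliberately discordant data set
}
icc_table = leave_one_out_icc(toy)
for name, value in icc_table.items():
    print(f"{name}: {value:.3f}")
```

In the toy example, dropping the discordant data set raises the ICC. The table above shows the same pattern when GNF is left out.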
Figure 3. Boxplot of the full data set of CO, with 7,282 Unigene IDs.
Figure 4. Boxplot of the subset of CO that overlaps with the other 4 data sets, with 551 Unigene IDs.
Figure 5. Comparisons between labs using the same technology (KC_CC) and comparisons between different technologies in the same lab (KRMA_KC and CO_CC).
Frequency table for KAV and KC
| KAV (rows) vs. KC (columns) | -2 | 0 | 2 | Total |
| -2 | 173 | 136 | 5 | 314 |
| 0 | 157 | 1,972 | 146 | 2,275 |
| 2 | 3 | 112 | 260 | 375 |
| Total | 333 | 2,220 | 411 | 2,964 |
Kappa coefficient = 0.523
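The reported value can be checked from the nine cell counts alone. The sketch below assumes the coefficient is the unweighted Cohen's kappa, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement on the diagonal and p_e is the agreement expected by chance from the row and column totals. The function name `cohen_kappa` is illustrative and does not come from the paper.

```python
def cohen_kappa(table):
    """Unweighted Cohen's kappa for a square table of counts (rows vs. columns)."""
    n = sum(sum(row) for row in table)
    # Observed agreement: share of counts on the diagonal
    observed = sum(table[i][i] for i in range(len(table))) / n
    # Chance agreement from the marginal totals
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)

# Cell counts from the KAV (rows) by KC (columns) frequency table,
# with categories in the order -2, 0, 2.
counts = [
    [173, 136, 5],
    [157, 1972, 146],
    [3, 112, 260],
]
kappa = cohen_kappa(counts)
print(round(kappa, 3))  # 0.523
```

The marginal totals are recomputed from the cells, so the check does not depend on the Total row or column as printed.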