Karen Kapur, Yi Xing, Zhengqing Ouyang, Wing Hung Wong.
Abstract
We have developed a strategy for estimating gene expression on Affymetrix Exon arrays. The method includes a probe-specific background correction and a probe selection strategy in which a subset of probes with highly correlated intensities across multiple samples is chosen to summarize gene expression. Our results demonstrate that the proposed background model offers improvements over the default Affymetrix background correction and that Exon arrays may provide more accurate measurements of gene expression than traditional 3' arrays.
Year: 2007 PMID: 17504534 PMCID: PMC1929160 DOI: 10.1186/gb-2007-8-5-r82
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Probe design of Exon arrays. (a) Exon-intron structure of a gene. Black boxes represent exons. Gray boxes represent introns. Introns are not drawn to scale. (b) Probe design of Exon arrays. Four probes target each putative exon. (c) Probe design of 3' expression arrays. Probes target the 3' end of the mRNA sequence.
Training the MAT background model using different sets of probes
| Train/Test | Cerebellum | Heart | Liver |
| --- | --- | --- | --- |
| Cerebellum | | 0.64 | 0.67 |
| Heart | 0.64 | | 0.65 |
| Liver | 0.66 | 0.64 | |
| Cerebellum | | 0.61 | 0.63 |
| Heart | 0.61 | | 0.63 |
| Liver | 0.64 | 0.63 | |
MAT background model parameters were estimated from background probes or full probes. After the parameters were trained on a given tissue, the model was evaluated on the other tissues using the R² statistic.
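As an illustration of the evaluation statistic, the following minimal sketch computes R² between observed probe intensities in a test tissue and the values predicted by a model trained elsewhere. It assumes the standard coefficient-of-determination definition (1 minus the ratio of residual to total sum of squares); the source does not state which definition was used, and the function name and toy values are hypothetical, not data from the paper.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Toy log-intensities for four probes in a held-out tissue (illustrative only).
observed_log_intensity = [4.0, 5.0, 6.0, 7.0]
# Hypothetical predictions from a background model trained on another tissue.
predicted_log_intensity = [4.5, 5.0, 5.5, 7.0]

r2 = r_squared(observed_log_intensity, predicted_log_intensity)
print(f"R^2 = {r2:.2f}")
```

A value near 1 would mean the model trained on one tissue explains most of the intensity variation in the other tissue.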
Comparison of MAT and Affymetrix GC bin background models
| | Cerebellum | Heart | Liver |
| --- | --- | --- | --- |
| MAT | 0.24 | 0.30 | 0.35 |
| GC Bin | 0.07 | 0.24 | 0.25 |
The MAT and GC bin background models were trained from background probes. R² statistics are reported for the fit of background models to the set of full probes.
Figure 2. Heatmap visualization of Exon array pairwise probe correlations. Heatmap visualization of probe intensities of CD44 (Exon array transcript cluster 3326635). Each cell of the heatmap shows the correlation of two probe intensities among 11 tissues (breast, cerebellum, heart, kidney, liver, muscle, pancreas, prostate, spleen, testes, and thyroid). The top color bar indicates the probe annotation type: core probes (red), extended probes (blue), and full probes (yellow). The signal intensities of core probes tend to have high correlation (the top right corner of the heatmap).
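The pairwise correlations shown in such a heatmap can be sketched as follows. This is a minimal illustration, assuming Pearson correlation of each probe's intensities across samples, followed by a simple filter that keeps probes whose median correlation with the other probes exceeds a threshold. The filter, the threshold of 0.8, and the toy intensities are assumptions for illustration and are not the selection procedure or data from the paper.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length intensity vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant probe carries no correlation signal
    return cov / (sd_x * sd_y)


def correlation_matrix(probes):
    """Probe-by-probe correlation matrix; each probe is a list over samples."""
    return [[pearson(p, q) for q in probes] for p in probes]


def select_correlated(probes, min_median_corr=0.8):
    """Indices of probes whose median correlation with the others is high.

    Hypothetical rule: the median is used so that a single discordant
    probe does not pull down otherwise concordant probes.
    """
    corr = correlation_matrix(probes)
    keep = []
    for i in range(len(probes)):
        others = [corr[i][j] for j in range(len(probes)) if j != i]
        if others and statistics.median(others) >= min_median_corr:
            keep.append(i)
    return keep


# Toy data: 4 probes measured in 5 samples (illustrative values only).
toy_probes = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.1, 4.0, 6.2, 8.0, 9.9],
    [0.9, 2.2, 2.9, 4.1, 5.2],
    [5.0, 1.0, 4.0, 2.0, 3.0],  # discordant probe
]
selected = select_correlated(toy_probes)
print(selected)
```

In this toy example the first three probes track each other across samples and are retained, while the fourth does not follow the shared pattern and is dropped.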
Figure 3. Comparison of A/P calls using ROC curves. Different models of probe-specific background are used as the basis for generating A/P calls of gene expression. We plot the ROC curve as the true positive rate versus the false positive rate of agreement between each A/P call method and gold-standard sets of expressed and unexpressed genes generated from independent SAGE data. The ROC curves from several A/P methods are shown here in each of three tissues, (a) cerebellum, (b) heart, (c) liver: Exon array MAT background (red); Exon array Affymetrix DABG (blue); maximum 3' array MAS 5.0 probeset statistic (purple); minimum 3' array MAS 5.0 probeset statistic (brown).
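For readers unfamiliar with how such curves are built, the sketch below traces an ROC curve from per-gene scores and gold-standard labels. It assumes a score for which larger values indicate stronger evidence of expression (a p-value-based call would need to be negated or inverted first); the function names and toy values are hypothetical and do not come from the paper.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs.

    labels: True for gold-standard expressed genes, False for unexpressed.
    The threshold is swept from the highest score to the lowest, and
    tied scores are processed together.
    """
    n_pos = sum(1 for label in labels if label)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one expressed and one unexpressed gene")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy scores and gold-standard labels for six genes (illustrative only).
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
toy_labels = [True, True, False, True, False, False]

curve = roc_points(toy_scores, toy_labels)
auc = area_under_curve(curve)
print(curve)
print(f"AUC = {auc:.3f}")
```

A method whose curve lies closer to the top left corner, and therefore has a larger area under the curve, agrees better with the gold-standard sets.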
Figure 4. Correlation of expression indices between human and mouse. Gene expression indices on a set of orthologous genes were computed using identical human and mouse tissues from Exon arrays. (a) Exon array ortholog gene correlations on heart tissue. (b) Exon array correlation of tissue expression between human and mouse.