Joshua S Bloom, Zia Khan, Leonid Kruglyak, Mona Singh, Amy A Caudy.
Abstract
BACKGROUND: High-throughput cDNA synthesis and sequencing of poly(A)-enriched RNA is rapidly emerging as a technology competing to replace microarrays as a quantitative platform for measuring gene expression.
Year: 2009 PMID: 19435513 PMCID: PMC2686739 DOI: 10.1186/1471-2164-10-221
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Sequencing and arrays show correlated differential expression, but sequencing is more susceptible to sampling error. Read counts are not evenly distributed across genes. For the RMg sample, log10 read counts per gene are shown (A), with genes ordered by abundance. The log2 ratio of the medians of six replicate microarray experiments for RM in ethanol vs RM in glucose is compared to the log2 ratio of sequencing read counts. The methods are correlated (R = 0.75356, 95% CI: 0.7236–0.785). Colors indicate genes that are significantly differentially expressed at an FDR < 1% with a 1.5-fold or greater change, where significance is determined using Fisher's exact test for the sequencing data and the Mann-Whitney test for the array data. Purple indicates significantly different by both methods, green is significantly different by sequencing only, blue is significantly different by microarrays only, and red is significant by both methods but with opposite directionality (B). Data from (B) represented as a Venn diagram of significant differences; note in red the 9 genes measured as significantly changed but in opposite directions (C). The results from (B) can be modeled by sampling from binomial distributions for each gene. Here a single random sampling is shown (D). The correlation of log2 expression ratios determined by microarrays and sequencing is highly dependent on the number of read counts per gene. For both the actual data (black) and simulated data (green) with 95% confidence intervals (light green), correlation improves as the thresholds for sequence coverage increase (E).
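The binomial sampling idea behind panels (D) and (E) can be illustrated with a minimal sketch. It is not the authors' simulation: it assumes two libraries of equal size, so that the reads of a gene with no true expression change split between the two conditions with probability 0.5, and it adds a pseudocount of 0.5 to avoid taking the log of zero. The function names `simulate_log2_ratio` and `ratio_spread` are hypothetical.

```python
import math
import random


def simulate_log2_ratio(total_reads, frac_condition_a, rng):
    """Split a gene's reads between two conditions with one binomial draw.

    Assumption: each read lands in condition A independently with
    probability frac_condition_a. A pseudocount of 0.5 keeps the ratio
    finite when one condition receives no reads.
    """
    reads_a = sum(1 for _ in range(total_reads) if rng.random() < frac_condition_a)
    reads_b = total_reads - reads_a
    return math.log2((reads_a + 0.5) / (reads_b + 0.5))


def ratio_spread(total_reads, frac_condition_a=0.5, draws=1000, seed=0):
    """Sample standard deviation of the simulated log2 ratio over many draws."""
    rng = random.Random(seed)
    values = [
        simulate_log2_ratio(total_reads, frac_condition_a, rng)
        for _ in range(draws)
    ]
    mean = sum(values) / draws
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (draws - 1))


# Spread of the log2 ratio for a gene with no true change,
# at three illustrative (made-up) read depths.
for depth in (10, 100, 1000):
    print(depth, round(ratio_spread(depth), 3))
```

Under these assumptions the spread of the log2 ratio shrinks as the read count per gene grows, which is consistent with the caption's statement that agreement with arrays improves as the coverage threshold increases.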
Figure 2. Quantitative PCR of significantly differentially expressed genes shows better agreement with arrays than with sequencing. 192 randomly sampled significantly differentially expressed genes were analyzed by qPCR. qPCR results are highly correlated with both microarrays (R = 0.86, bootstrap 95% CI: 0.7043–0.953) (A) and sequencing results (R = 0.82, bootstrap 95% CI: 0.7031–0.8917) (B). However, the subset of the tested genes that were called significantly differentially expressed by the arrays only (see Fig. 1A, red dots) was more highly correlated (R = 0.925, bootstrap 95% CI: 0.8621–0.9648) (C) than the subset of genes that were called significant by sequencing (see Fig. 1A, green) (R = 0.518, bootstrap 95% CI: 0.3227–0.7069) (D). Error bars represent 95% confidence intervals for the differential expression measurements.
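A bootstrap confidence interval for a correlation coefficient, as quoted in this caption, can be sketched as follows. The caption does not say which bootstrap variant was used, so this example assumes the simple percentile bootstrap over resampled gene pairs. The data are synthetic and the names `pearson_r` and `bootstrap_ci` are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_ci(xs, ys, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for Pearson's R.

    Assumption: (x, y) pairs are resampled together with replacement,
    and the interval is read off the sorted bootstrap statistics.
    """
    rng = random.Random(seed)
    n = len(xs)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        # Skip degenerate resamples in which either variable is constant.
        if len(set(bx)) < 2 or len(set(by)) < 2:
            continue
        stats.append(pearson_r(bx, by))
    stats.sort()
    m = len(stats)
    lower = stats[int((alpha / 2) * m)]
    upper = stats[int((1 - alpha / 2) * m) - 1]
    return lower, upper


# Synthetic log2 ratios standing in for two platforms (not the paper's data).
data_rng = random.Random(1)
xs = [data_rng.gauss(0, 1) for _ in range(60)]
ys = [x + data_rng.gauss(0, 0.6) for x in xs]

r = pearson_r(xs, ys)
low, high = bootstrap_ci(xs, ys)
print(f"R = {r:.3f}, bootstrap 95% CI: {low:.3f} to {high:.3f}")
```

Resampling pairs rather than each variable separately preserves the gene-level pairing between the two measurements, which is what the correlation is meant to summarize.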