Matthew A Kayala, Pierre Baldi.
Abstract
The Bayesian regularization method for high-throughput differential analysis, described in Baldi and Long (A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes. Bioinformatics 2001; 17: 509-519) and implemented in the Cyber-T web server, is one of the most widely validated. Cyber-T implements a t-test using a Bayesian framework to compute a regularized variance of the measurements associated with each probe under each condition. This regularized estimate is derived by flexibly combining the empirical measurements with a prior, or background, derived from pooling measurements associated with probes in the same neighborhood. This approach addresses problems associated with low replication levels and technology biases, not only for DNA microarrays, but also for other technologies, such as protein arrays, quantitative mass spectrometry and next-generation sequencing (RNA-seq). Here we present an update to the Cyber-T web server, incorporating several new additions and improvements. Several data preprocessing and normalization options are included, among them logarithmic and Variance Stabilizing Normalization (VSN) transforms. To augment two-sample t-tests, a one-way analysis of variance is implemented. Several methods for multiple-testing correction, including standard frequentist methods and a probabilistic mixture model treatment, are available. Diagnostic plots allow visual assessment of the results. The web server provides comprehensive documentation and example data sets. The Cyber-T web server, with R source code and data sets, is publicly available at http://cybert.ics.uci.edu/.
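The regularized variance described in the abstract can be illustrated with a minimal Python sketch. It is not the server's R code. It assumes the weighted-average form σ² = (ν0·σ0² + (n − 1)·s²) / (ν0 + n − 2), where s² is a probe's empirical variance over n replicates, σ0² is the background variance pooled from neighboring probes, and ν0 is the weight given to that background. The function name `regularized_variances`, the `window` and `nu0` parameters, the choice of "neighborhood" as adjacent probes after ranking by mean intensity, and the pooling of variances by a simple mean are all illustrative assumptions; the exact estimator and defaults should be checked against Baldi and Long (2001) and the published R source.

```python
from statistics import mean, variance


def regularized_variances(rows, window=3, nu0=10):
    """Return one regularized variance per probe for a single condition.

    rows   : list of replicate lists, one per probe, all of equal length n >= 2
    window : number of neighboring probes pooled into the background (assumed)
    nu0    : pseudo-count weight given to the background variance (assumed)
    """
    n = len(rows[0])
    means = [mean(r) for r in rows]
    emp_var = [variance(r) for r in rows]  # sample variance, n - 1 denominator

    # Rank probes by mean intensity so that neighbors have similar expression.
    order = sorted(range(len(rows)), key=lambda i: means[i])

    reg = [0.0] * len(rows)
    half = window // 2
    for rank, i in enumerate(order):
        lo = max(0, rank - half)
        hi = min(len(order), rank + half + 1)
        # Background variance: mean empirical variance over the neighborhood.
        background = mean(emp_var[j] for j in order[lo:hi])
        # Weighted combination of background and empirical variance.
        reg[i] = (nu0 * background + (n - 1) * emp_var[i]) / (nu0 + n - 2)
    return reg


if __name__ == "__main__":
    # Three probes with two replicates each; the last has an inflated variance.
    data = [[0.0, 2.0], [10.0, 12.0], [20.0, 30.0]]
    print(regularized_variances(data))
```

With only two replicates per probe, the empirical variance of the third probe is unstable, and the sketch pulls it toward the variance of its neighbors. That is the behavior the abstract attributes to regularization at low replication levels.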
Year: 2012 PMID: 22600740 PMCID: PMC3394347 DOI: 10.1093/nar/gks420
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Mean versus SD plots for Condition 1 of both the low- and high-replicate Plasmodium falciparum protein microarray data sets. The Bayes-regularized estimates approximate the ‘truth’ of the high-replicate empirical measurements better than the low-replicate empirical measurements. The plots are shown with density estimate smoothing, a plotting option on the web server. Darker colors indicate high local density.
Figure 2. Plasmodium falciparum mean raw intensity versus empirical SD showing a clear mean–variance dependence. Outliers have been removed to make the relationship clearer.
Figure 3. Plasmodium falciparum VSN mean normalized intensity versus empirical and Bayes-regularized SD. The systematic mean–variance relationship seen in the raw data has largely been removed in the empirical variance. The regularization shows regression toward the mean experiment SD.
Figure 4. Plasmodium falciparum ROC curve from PPDE analysis. ∼25% false positives must be accepted to discover ∼90% of the true positives.