Abstract
Microarrays are a powerful tool for genome-wide gene expression analysis. In microarray expression data, the mean and variance are often related. We present a non-parametric mean-variance smoothing method (NPMVS) to identify differentially expressed genes. In this method, a nonlinear smoothing curve is fitted to estimate the relationship between mean and variance. Inference is then based on shrinkage estimation of posterior means, assuming the variances are known. The methods were applied to simulated datasets on which a variety of mean-variance relationships were imposed. The simulation study showed that NPMVS outperformed two other popular shrinkage estimation methods under some mean-variance relationships and was competitive with them under the others. A real biological dataset, in which a cold stress transcription factor gene, CBF2, was overexpressed, was also analyzed with the three methods. Gene ontology and cis-element analyses showed that NPMVS identified more cold- and stress-responsive genes than the other two methods did. The good performance of NPMVS is mainly due to its shrinkage estimation of both means and variances. In addition, NPMVS exploits a non-parametric regression between mean and variance instead of assuming a specific parametric relationship. The source code, written in R, is available from the authors on request.
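The shrinkage step described in the abstract can be illustrated with a generic conjugate normal model. This is a minimal sketch in Python, not the authors' R code: it assumes a normal prior on each gene's true effect, and the names `posterior_mean`, `prob_positive`, `prior_mean`, and `prior_var` are hypothetical.

```python
import math


def posterior_mean(observed, known_var, prior_mean, prior_var):
    """Conjugate normal-normal update, treating the sampling variance as known.

    Assumption (not from the paper): observed ~ N(theta, known_var) and
    theta ~ N(prior_mean, prior_var).
    """
    # Fraction of the posterior mean that comes from the data.
    weight = prior_var / (prior_var + known_var)
    post_mean = prior_mean + weight * (observed - prior_mean)
    post_var = weight * known_var
    return post_mean, post_var


def prob_positive(post_mean, post_var):
    """Posterior probability that the true effect is greater than zero."""
    z = post_mean / math.sqrt(post_var)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# A noisy gene is pulled strongly toward the prior mean of 0 ...
print(posterior_mean(2.0, known_var=4.0, prior_mean=0.0, prior_var=1.0))
# ... while a precisely measured gene keeps most of its observed effect.
print(posterior_mean(2.0, known_var=0.25, prior_mean=0.0, prior_var=1.0))
```

In a pipeline of the kind the abstract outlines, `known_var` would presumably be taken from the smoothed mean-variance curve rather than from each gene's own sample variance, which is noisy when there are few replicates.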
Year: 2011 PMID: 21611181 PMCID: PMC3096627 DOI: 10.1371/journal.pone.0019640
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Relationship between sample mean and sample variance.
Sample mean versus log sample variance for three different datasets from either control or treatment conditions. Variances smoothed with a non-parametric method [6], [7] are displayed as green lines. The sample size n is indicated for each dataset. The datasets were normalized with the RMA method.
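The green curves in such a plot come from regressing log sample variance on sample mean. The sketch below shows the idea with a Gaussian kernel (Nadaraya-Watson) smoother, used here only as a stand-in: the smoother cited as [6], [7] may differ, and the function name `smooth_log_variance` and the `bandwidth` parameter are hypothetical.

```python
import math
from statistics import mean, variance


def smooth_log_variance(expr_rows, bandwidth=0.5):
    """Return (gene_means, smoothed_variances) for rows of replicate values.

    Each row holds one gene's replicates (at least two values, not all equal,
    so that the log of the sample variance is defined).
    """
    means = [mean(row) for row in expr_rows]
    log_vars = [math.log(variance(row)) for row in expr_rows]
    smoothed = []
    for m in means:
        # Gaussian kernel weights: genes with similar means count the most.
        weights = [math.exp(-0.5 * ((m - other) / bandwidth) ** 2)
                   for other in means]
        fitted = sum(w * lv for w, lv in zip(weights, log_vars)) / sum(weights)
        smoothed.append(math.exp(fitted))  # back from log scale
    return means, smoothed


# Toy data: three genes with three replicates each (values are made up).
genes = [[5.1, 5.3, 4.9], [5.2, 5.9, 4.6], [9.0, 9.1, 9.2]]
gene_means, gene_vars = smooth_log_variance(genes, bandwidth=1.0)
print(gene_means, gene_vars)
```

Smoothing on the log scale keeps every fitted variance positive, which matches the log-variance axis used in the figure.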
Figure 2. Simulation results from four mean-variance relationships.
(A) Mean versus log variance in the four simulated datasets, case 0 to case 3. Simulated control data are shown on the left and differentially expressed data on the right. Variances smoothed with a non-parametric method are displayed as green lines. (B) False negative rate versus false positive rate for identifying DE genes in the simulated data with the different methods. The rates are averages over 100 simulated datasets and were estimated over a range of cut-off values for each method. Dashed, solid, and dotted lines represent the Bayesian method of Gottardo et al. [8], NPMVS, and limma, respectively. The four mean-variance relationships, case 0, case 1, case 2, and case 3, are shown in black, red, green, and blue, respectively.
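Curves like those in panel (B) can be traced by sweeping a cut-off over each method's per-gene score. The caption does not define the two rates, so the sketch below assumes the usual definitions: false positives divided by the number of truly non-DE genes, and false negatives divided by the number of truly DE genes. The names `error_rates` and `error_curve` are hypothetical.

```python
def error_rates(scores, is_de, cutoff):
    """Call a gene DE when its score >= cutoff.

    Returns (false positive rate, false negative rate) under the assumed
    definitions FP / (number of non-DE genes) and FN / (number of DE genes).
    """
    fp = sum(1 for s, de in zip(scores, is_de) if s >= cutoff and not de)
    fn = sum(1 for s, de in zip(scores, is_de) if s < cutoff and de)
    n_null = sum(1 for de in is_de if not de)
    n_de = len(is_de) - n_null
    return fp / n_null, fn / n_de


def error_curve(scores, is_de, cutoffs):
    """Evaluate the two rates at every cut-off, giving one curve per method."""
    return [error_rates(scores, is_de, c) for c in cutoffs]


# Toy example: posterior-probability-like scores for four genes (made up).
scores = [0.9, 0.8, 0.3, 0.1]
truth = [True, False, True, False]
print(error_curve(scores, truth, cutoffs=[0.2, 0.5, 0.95]))
```

Averaging the resulting pairs over repeated simulated datasets, at matching cut-offs, would give one averaged curve per method.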
Figure 3. Identification of CBF2_OX differentially expressed genes.
Up- and down-regulated genes with more than a 2-fold change were identified by three different methods, using an adjusted p value (Benjamini & Hochberg method) below 0.01 for limma and a posterior probability above 0.99 for the Bayesian method of Gottardo et al. [8] and for NPMVS.
Gene ontology enrichment analysis for CBF2_OX up- and down-regulated genes.
| | limma | Gottardo et al. [8] | NPMVS |
| Enriched GO in up-regulated genes | response to stress (8.39E-09) | response to stress (2.82E-10) | response to stress (2.83E-12) |
| Enriched GO in down-regulated genes | N.A. | response to stress (0.034) | response to stress (0.00016) |
| | N.A. | cell wall (0.034) | cell wall (8.91E-06) |
P values, shown in parentheses, were adjusted by the Benjamini & Hochberg method.
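For readers unfamiliar with the adjustment named in this footnote, the following is a minimal sketch of the standard Benjamini & Hochberg step-up procedure. It is a generic implementation, not code from the paper, and the function name `bh_adjust` is hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini & Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    # Visit p values from largest to smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank when sorted in ascending order
        # Scale by m / rank, then enforce monotonicity with a running minimum.
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p values (made up) for four tested categories.
print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

An adjusted value below a chosen threshold, such as 0.01, then controls the expected false discovery rate at that level across all tests.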
Enriched cis-regulatory elements in CBF2_OX up-regulated genes.
| | limma | | Gottardo et al. [8] | | NPMVS | | PLACE |
| word | p value | counts | p value | counts | p value | counts | |
| CCGAC | 7.45E-35 | 78 | 1.06E-43 | 113 | 1.36E-43 | 137 | DRE |
| ACGTG | 6.02E-06 | 69 | 2.48E-09 | 104 | 1.42E-11 | 133 | ABRE |
| ATGTCG | 9.30E-26 | 47 | 2.04E-33 | 65 | 6.64E-31 | 77 | N.A. |
| CCACG | 4.01E-06 | 73 | 1.09E-05 | 89 | BOXIIPCCHS | ||
| CGGCA | 9.04E-06 | 59 | 2.79E-06 | 71 | N.A. | ||
| ACACG | 9.95E-07 | 133 | GADOWNAT | ||||
| CACGTG | 6.68E-05 | 55 | CACGTMOTIF | ||||
| CGTGTC | 1.44E-05 | 51 | N.A. |