Abstract
Microarray studies generate large numbers of p-values from many gene expression comparisons. Estimating the proportion of p-values sampled from the null hypothesis is of broad interest, and a two-component mixture model is often used for this purpose. If the data are generated under the null hypothesis, the p-values follow the uniform distribution. What is the distribution of p-values when the data are sampled from the alternative hypothesis? This distribution is derived for the chi-squared test and then used to estimate the proportion of p-values sampled from the null hypothesis in a parametric framework. Simulation studies are conducted to evaluate the performance of the new method in comparison with five recent methods. The new method performs robustly even in scenarios with clusters of correlated p-values and with a multicomponent or continuous mixture in the alternative. The methods are demonstrated through an analysis of a real microarray dataset.
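The two-component mixture idea can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes the simplest chi-squared setting: one degree of freedom, with the statistic equal to Z^2 where Z ~ N(delta, 1) under the alternative and a single common delta. Under that assumption the p-value density under the alternative is h(p; delta) = exp(-delta^2/2) * cosh(delta * z), with z the standard normal quantile at 1 - p/2, and the mixture density is f(p) = pi0 + (1 - pi0) * h(p; delta). The function names, the simulated data, and the grid-search fit are all hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for cdf and quantiles


def z_from_p(p):
    """Map a p-value from a 1-df chi-squared test back to |z| = sqrt(statistic)."""
    return STD_NORMAL.inv_cdf(1.0 - p / 2.0)


def alt_density(p, delta):
    """Density of the p-value when the statistic is Z**2 with Z ~ N(delta, 1).

    Equals exp(-delta^2 / 2) * cosh(delta * z); delta = 0 gives the uniform density 1.
    """
    return math.exp(-0.5 * delta * delta) * math.cosh(delta * z_from_p(p))


def simulate_pvalues(n, pi0, delta, rng):
    """Draw n p-values; each comes from the null with probability pi0."""
    pvals = []
    for _ in range(n):
        shift = 0.0 if rng.random() < pi0 else delta
        z = rng.gauss(shift, 1.0)
        p = 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))
        pvals.append(min(max(p, 1e-12), 1.0 - 1e-12))  # keep p strictly inside (0, 1)
    return pvals


def fit_mixture(pvals, pi0_grid, delta_grid):
    """Grid-search maximum likelihood for f(p) = pi0 + (1 - pi0) * h(p; delta)."""
    zs = [z_from_p(p) for p in pvals]
    best = None
    for delta in delta_grid:
        h = [math.exp(-0.5 * delta * delta) * math.cosh(delta * z) for z in zs]
        for pi0 in pi0_grid:
            loglik = sum(math.log(pi0 + (1.0 - pi0) * hi) for hi in h)
            if best is None or loglik > best[0]:
                best = (loglik, pi0, delta)
    return best[1], best[2]


# Illustrative simulated data (assumed values, not taken from the paper).
rng = random.Random(2017)
pvals = simulate_pvalues(2000, pi0=0.8, delta=3.0, rng=rng)
pi0_grid = [i / 20 for i in range(1, 20)]     # 0.05, 0.10, ..., 0.95
delta_grid = [0.5 * j for j in range(1, 11)]  # 0.5, 1.0, ..., 5.0
pi0_hat, delta_hat = fit_mixture(pvals, pi0_grid, delta_grid)
print(f"estimated pi0 = {pi0_hat:.2f}, estimated delta = {delta_hat:.1f}")
```

A grid search is used here only to keep the sketch dependency-free; a numerical optimizer or an EM algorithm would be the more usual choice, and the general chi-squared case would need the noncentral chi-squared density in place of the closed form above.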
Keywords: distribution of p-values; microarray studies; mixture model; proportion from the null hypothesis
Year: 2017 PMID: 28827889 PMCID: PMC5562234 DOI: 10.1016/j.csda.2017.04.008
Source DB: PubMed Journal: Comput Stat Data Anal ISSN: 0167-9473 Impact factor: 1.681