| Literature DB >> 29416190 |
Chuan Hong1, Yang Ning2, Shuang Wang3, Hao Wu4, Raymond J Carroll5, Yong Chen6.
Abstract
Motivated by analyses of DNA methylation data, we propose a semiparametric mixture model, namely the generalized exponential tilt mixture model, to account for heterogeneity between differentially methylated and non-differentially methylated subjects in the cancer group, and capture the differences in higher order moments (e.g. mean and variance) between subjects in cancer and normal groups. A pairwise pseudolikelihood is constructed to eliminate the unknown nuisance function. To circumvent boundary and non-identifiability problems as in parametric mixture models, we modify the pseudolikelihood by adding a penalty function. In addition, the test with simple asymptotic distribution has computational advantages compared with permutation-based test for high-dimensional genetic or epigenetic data. We propose a pseudolikelihood based expectation-maximization test, and show the proposed test follows a simple chi-squared limiting distribution. Simulation studies show that the proposed test controls Type I errors well and has better power compared to several current tests. In particular, the proposed test outperforms the commonly used tests under all simulation settings considered, especially when there are variance differences between two groups. The proposed test is applied to a real data set to identify differentially methylated sites between ovarian cancer subjects and normal subjects.Entities:
Keywords: Asymptotics; Conditional likelihood; Non-regular problem; Penalized likelihood; Semiparametric mixture model
Year: 2017 PMID: 29416190 PMCID: PMC5798902 DOI: 10.1080/01621459.2017.1280405
Source DB: PubMed Journal: J Am Stat Assoc ISSN: 0162-1459 Impact factor: 5.033