A hybrid parameter estimation algorithm for beta mixtures and applications to methylation state classification.

Christopher Schröder, Sven Rahmann.

Abstract

BACKGROUND: Mixtures of beta distributions are a flexible tool for modeling data with values on the unit interval, such as methylation levels. However, maximum likelihood parameter estimation for beta distributions is problematic when some observations are exactly 0 or 1, because the log-likelihood function has singularities at these values.
METHODS: While ad-hoc corrections have been proposed to mitigate this problem, we propose a different approach to parameter estimation for beta mixtures where such problems do not arise in the first place. Our algorithm combines latent variables with the method of moments instead of maximum likelihood, which has computational advantages over the popular EM algorithm.
RESULTS: As an application, we demonstrate that methylation state classification is more accurate when using adaptive thresholds from beta mixtures than non-adaptive thresholds on observed methylation levels. We also demonstrate that we can accurately infer the number of mixture components.
CONCLUSIONS: The hybrid algorithm between likelihood-based component un-mixing and moment-based parameter estimation is a robust and efficient method for beta mixture estimation. We provide an implementation of the method ("betamix") as open source software under the MIT license.
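
The hybrid scheme described in METHODS can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the "betamix" implementation: the function names (`beta_pdf`, `moments_to_ab`, `fit_beta_mixture`), the initial parameters, the stopping rule, and the clipping constant `eps` are all hypothetical. Each iteration computes latent component weights from the current densities, then re-estimates each component from its weighted mean m and variance v via the standard method-of-moments relations phi = m(1 - m)/v - 1, a = m * phi, b = (1 - m) * phi. Because the update uses only moments, observations equal to 0 or 1 enter without evaluating log(0). Clipping the data for the density evaluation alone is an assumption of this sketch.

```python
import math
import numpy as np

def beta_pdf(x, a, b):
    """Beta(a, b) density at x, for x strictly inside (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return np.exp(log_norm + (a - 1.0) * np.log(x) + (b - 1.0) * np.log1p(-x))

def moments_to_ab(mean, var):
    """Method-of-moments beta parameters from a mean and a variance."""
    phi = mean * (1.0 - mean) / var - 1.0
    return mean * phi, (1.0 - mean) * phi

def fit_beta_mixture(x, init_ab, n_iter=200, tol=1e-6, eps=1e-6):
    """Alternate latent-weight estimation and moment-based parameter updates.

    Returns (pi, ab): mixture coefficients and one (a, b) row per component.
    """
    x = np.asarray(x, dtype=float)
    # Assumption of this sketch: clipped values are used only to evaluate
    # densities; the moment updates below use the unmodified data.
    x_safe = np.clip(x, eps, 1.0 - eps)
    ab = np.array(init_ab, dtype=float)
    pi = np.full(len(ab), 1.0 / len(ab))
    for _ in range(n_iter):
        # Latent weights: posterior probability of each component per point.
        dens = np.stack([p * beta_pdf(x_safe, a, b) for p, (a, b) in zip(pi, ab)])
        w = dens / dens.sum(axis=0)
        n_j = w.sum(axis=1)
        # Weighted sample moments per component, including exact 0s and 1s.
        mean = (w @ x) / n_j
        var = np.maximum((w @ x**2) / n_j - mean**2, 1e-12)
        new_ab = np.array([moments_to_ab(m, v) for m, v in zip(mean, var)])
        pi = n_j / len(x)
        converged = np.abs(new_ab - ab).max() < tol
        ab = new_ab
        if converged:
            break
    return pi, ab

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = np.concatenate([rng.beta(2, 10, 2000), rng.beta(10, 2, 2000),
                           np.zeros(5), np.ones(5)])
    pi, ab = fit_beta_mixture(data, init_ab=[(1.0, 5.0), (5.0, 1.0)])
    print("mixture coefficients:", pi)
    print("component parameters:", ab)
```

The sketch does not guard against degenerate cases such as a component whose weighted variance reaches m(1 - m), and it makes no claim about convergence in general; it only shows how the weight step and the moment step fit together.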

Keywords:  Beta distribution; Classification; Differential methylation; EM algorithm; Maximum likelihood; Method of moments; Mixture model

Year:  2017        PMID: 28828033      PMCID: PMC5563068          DOI: 10.1186/s13015-017-0112-1

Source DB:  PubMed          Journal:  Algorithms Mol Biol        ISSN: 1748-7188            Impact factor:   1.405


