Sparsely correlated hidden Markov models with application to genome-wide location studies.

Hyungwon Choi, Damian Fermin, Alexey I Nesvizhskii, Debashis Ghosh, Zhaohui S Qin.

Abstract

MOTIVATION: Multiply correlated datasets have become increasingly common in genome-wide location analysis of regulatory proteins and epigenetic modifications. Their correlation can be directly incorporated into a statistical model to capture underlying biological interactions, but such modeling quickly becomes computationally intractable.
RESULTS: We present sparsely correlated hidden Markov models (scHMM), a novel method for performing simultaneous hidden Markov model (HMM) inference for multiple genomic datasets. In scHMM, a single HMM is assumed for each series, but the transition probability in each series depends not only on its own hidden states but also on the hidden states of other related series. For each series, scHMM uses penalized regression to select a subset of the other data series and estimate their effects on the odds of each transition in the given series. Following this, hidden states are inferred using a standard forward-backward algorithm, with the transition probabilities adjusted by the model at each position, which helps retain the order of computation close to fitting independent HMMs (iHMM). Hence, scHMM is a collection of inter-dependent non-homogeneous HMMs, capable of giving a close approximation to a fully multivariate HMM fit. A simulation study shows that scHMM achieves comparable sensitivity to the multivariate HMM fit at a much lower computational cost. The method was demonstrated in the joint analysis of 39 histone modifications, CTCF and RNA polymerase II in human CD4+ T cells. scHMM reported fewer high-confidence regions than iHMM in this dataset, but scHMM could recover previously characterized histone modifications in relevant genomic regions better than iHMM. In addition, the resulting combinatorial patterns from scHMM could be better mapped to the 51 states reported by the multivariate HMM method of Ernst and Kellis.
AVAILABILITY: The scHMM package can be freely downloaded from http://sourceforge.net/p/schmm/ and is recommended for use in a Linux environment.
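
The inference step described under RESULTS can be illustrated with a minimal sketch: a forward-backward pass for a two-state HMM whose transition probabilities change at every position. This is not the scHMM package's code or API. The logistic link, the names `transition_matrix`, `forward_backward`, `intercept` and `coef`, the choice to drive the transition out of position t by the related series' states at position t, and all numeric values are assumptions made for illustration. The coefficients would come from the penalized regression step, which is not shown here.

```python
import math


def transition_matrix(neighbor_states, intercept, coef):
    """Build a 2x2 transition matrix for one position.

    Assumption (not from the paper's code): the log-odds of entering
    state 1 from state i is intercept[i] + sum_k coef[i][k] * state_k,
    where state_k is the hidden state (0 or 1) of related series k.
    """
    rows = []
    for i in range(2):
        logit = intercept[i] + sum(c * s for c, s in zip(coef[i], neighbor_states))
        p_to_1 = 1.0 / (1.0 + math.exp(-logit))
        rows.append([1.0 - p_to_1, p_to_1])
    return rows


def forward_backward(emission_lik, trans, init):
    """Scaled forward-backward for a non-homogeneous HMM.

    emission_lik[t][s]: likelihood of observation t under state s.
    trans[t][i][j]: P(state j at t+1 | state i at t), t = 0..T-2.
    init[s]: initial state probabilities.
    Returns posterior[t][s] = P(state s at t | all observations).
    """
    T, n = len(emission_lik), len(init)
    alpha, scale = [], []
    # Forward pass, normalizing at each position to avoid underflow.
    for t in range(T):
        if t == 0:
            raw = [init[s] * emission_lik[0][s] for s in range(n)]
        else:
            raw = [
                emission_lik[t][j]
                * sum(alpha[t - 1][i] * trans[t - 1][i][j] for i in range(n))
                for j in range(n)
            ]
        c = sum(raw)
        scale.append(c)
        alpha.append([x / c for x in raw])
    # Backward pass, reusing the forward scaling constants.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(
                trans[t][i][j] * emission_lik[t + 1][j] * beta[t + 1][j]
                for j in range(n)
            ) / scale[t + 1]
    # Combine and renormalize to get per-position posteriors.
    posterior = []
    for t in range(T):
        raw = [alpha[t][s] * beta[t][s] for s in range(n)]
        z = sum(raw)
        posterior.append([x / z for x in raw])
    return posterior


if __name__ == "__main__":
    # Hypothetical inputs: one related series, three genomic positions.
    neighbor_path = [[0], [1], [1]]
    trans = [
        transition_matrix(s, intercept=[-1.0, 1.0], coef=[[2.0], [2.0]])
        for s in neighbor_path[:-1]
    ]
    emission_lik = [[0.6, 0.4], [0.5, 0.5], [0.3, 0.7]]
    for row in forward_backward(emission_lik, trans, init=[0.5, 0.5]):
        print([round(p, 3) for p in row])
```

Because the only change from a homogeneous HMM is that `trans` is indexed by position, the cost per series stays linear in the number of positions, which is consistent with the abstract's remark that the computation remains close to that of fitting independent HMMs.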

Year: 2013   PMID: 23325620   PMCID: PMC3582268   DOI: 10.1093/bioinformatics/btt012

Source DB: PubMed   Journal: Bioinformatics   ISSN: 1367-4803   Impact factor: 6.937


  24 in total

1.  Genome-wide location and function of DNA binding proteins.

Authors:  B Ren; F Robert; J J Wyrick; O Aparicio; E G Jennings; I Simon; J Zeitlinger; J Schreiber; N Hannett; E Kanin; T L Volkert; C J Wilson; S P Bell; R A Young
Journal:  Science       Date:  2000-12-22       Impact factor: 47.728

2.  Spatial distribution of di- and tri-methyl lysine 36 of histone H3 at active genes.

Authors:  Andrew J Bannister; Robert Schneider; Fiona A Myers; Alan W Thorne; Colyn Crane-Robinson; Tony Kouzarides
Journal:  J Biol Chem       Date:  2005-03-10       Impact factor: 5.157

3.  Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome.

Authors:  Nathaniel D Heintzman; Rhona K Stuart; Gary Hon; Yutao Fu; Christina W Ching; R David Hawkins; Leah O Barrera; Sara Van Calcar; Chunxu Qu; Keith A Ching; Wei Wang; Zhiping Weng; Roland D Green; Gregory E Crawford; Bing Ren
Journal:  Nat Genet       Date:  2007-02-04       Impact factor: 38.330

4.  Combinatorial patterns of histone acetylations and methylations in the human genome.

Authors:  Zhibin Wang; Chongzhi Zang; Jeffrey A Rosenfeld; Dustin E Schones; Artem Barski; Suresh Cuddapah; Kairong Cui; Tae-Young Roh; Weiqun Peng; Michael Q Zhang; Keji Zhao
Journal:  Nat Genet       Date:  2008-06-15       Impact factor: 38.330

5.  Hidden Markov models in computational biology. Applications to protein modeling.

Authors:  A Krogh; M Brown; I S Mian; K Sjölander; D Haussler
Journal:  J Mol Biol       Date:  1994-02-04       Impact factor: 5.469

6.  Regularization Paths for Generalized Linear Models via Coordinate Descent.

Authors:  Jerome Friedman; Trevor Hastie; Rob Tibshirani
Journal:  J Stat Softw       Date:  2010       Impact factor: 6.440

7.  Hierarchical hidden Markov model with application to joint analysis of ChIP-chip and ChIP-seq data.

Authors:  Hyungwon Choi; Alexey I Nesvizhskii; Debashis Ghosh; Zhaohui S Qin
Journal:  Bioinformatics       Date:  2009-05-14       Impact factor: 6.937

8.  Discovery and characterization of chromatin states for systematic annotation of the human genome.

Authors:  Jason Ernst; Manolis Kellis
Journal:  Nat Biotechnol       Date:  2010-07-25       Impact factor: 54.908

9.  Construction of multilocus genetic linkage maps in humans.

Authors:  E S Lander; P Green
Journal:  Proc Natl Acad Sci U S A       Date:  1987-04       Impact factor: 11.205

10.  Genome-wide mapping of in vivo protein-DNA interactions.

Authors:  David S Johnson; Ali Mortazavi; Richard M Myers; Barbara Wold
Journal:  Science       Date:  2007-05-31       Impact factor: 47.728

  4 in total

Review 1.  Chromatin-state discovery and genome annotation with ChromHMM.

Authors:  Jason Ernst; Manolis Kellis
Journal:  Nat Protoc       Date:  2017-11-09       Impact factor: 13.491

2.  Bayesian adaptive group lasso with semiparametric hidden Markov models.

Authors:  Kai Kang; Xinyuan Song; X Joan Hu; Hongtu Zhu
Journal:  Stat Med       Date:  2018-11-28       Impact factor: 2.373

3.  Joint analysis of expression profiles from multiple cancers improves the identification of microRNA-gene interactions.

Authors:  Xiaowei Chen; Frank J Slack; Hongyu Zhao
Journal:  Bioinformatics       Date:  2013-06-14       Impact factor: 6.937

Review 4.  Integrating Epigenomics into the Understanding of Biomedical Insight.

Authors:  Yixing Han; Ximiao He
Journal:  Bioinform Biol Insights       Date:  2016-12-04