Yeonok Lee, Debashis Ghosh, Ross C Hardison, Yu Zhang. Department of Statistics and Department of Biochemistry and Molecular Biology, Penn State University, University Park, PA 16803, USA.
Abstract
SUMMARY: Hidden Markov models (HMMs) are flexible and widely used in scientific studies. In genomics and genetics in particular, the genome contains multiple distinct regimes, and the relationships among multivariate features differ from one regime to another. Examples include differential gene regulation that depends on gene function and experimental conditions, and varying combinatorial patterns of multiple transcription factors. We developed a software package called MRHMMs (Multivariate Regression Hidden Markov Models and the variantS) that accommodates a variety of HMMs and can be flexibly applied to many biological studies and beyond. MRHMMs supplements existing HMM software packages in two respects. First, MRHMMs provides a diverse set of emission probability structures, including mixtures of multivariate normal distributions and (logistic) regression models. Second, MRHMMs is computationally efficient for analyzing the large datasets generated in current genome-wide studies. In particular, the software is written in C for speed and is amenable to implementing alternative models to meet users' own purposes. AVAILABILITY AND IMPLEMENTATION: http://sourceforge.net/projects/mrhmms/