Scoring hidden Markov models.

C Barrett, R Hughey, K Karplus.

Abstract

MOTIVATION: Statistical sequence comparison techniques, such as hidden Markov models and generalized profiles, calculate the probability that a sequence was generated by a given model. Log-odds scoring is a means of evaluating this probability by comparing it to a null hypothesis, usually a simpler statistical model intended to represent the universe of sequences as a whole, rather than the group of interest. Such scoring leads to two immediate questions: what should the null model be, and what threshold of log-odds score should be deemed a match to the model.
RESULTS: This paper analyses these two issues experimentally. Within the context of the Sequence Alignment and Modeling software suite (SAM), we consider a variety of null models and suitable thresholds. Additionally, we consider HMMer's log-odds scoring and SAM's original Z-scoring method. Among the null model choices, a simple looping null model that emits characters according to the geometric mean of the character probabilities in the columns modeled by the hidden Markov model (HMM) performs well or best across all four discrimination experiments.
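As a rough illustration of the geometric-mean null model and the log-odds score described in the abstract, the following is a minimal Python sketch, not the SAM or HMMer implementation. It makes several simplifying assumptions that are not from the paper: the model is reduced to an ungapped list of match-column emission distributions, the sequence has exactly one character per column, every probability is strictly positive, and the geometric means are renormalized so they sum to one. The function names `geometric_mean_null` and `log_odds` are hypothetical. A real HMM score would sum or maximize over alignment paths (forward or Viterbi) rather than use a fixed column-by-column alignment.

```python
import math


def geometric_mean_null(columns):
    """Build a null distribution from match-column emission probabilities.

    columns: list of dicts mapping character -> probability (all > 0).
    Renormalizing the geometric means is an assumption of this sketch.
    """
    alphabet = sorted(columns[0])
    # Geometric mean of each character's probability across all columns,
    # computed as exp(mean of logs) for numerical stability.
    raw = {
        a: math.exp(sum(math.log(col[a]) for col in columns) / len(columns))
        for a in alphabet
    }
    total = sum(raw.values())
    return {a: p / total for a, p in raw.items()}


def log_odds(sequence, columns, null):
    """Toy ungapped log-odds score (natural log): model vs. null.

    Position i of the sequence is scored against column i only.
    """
    if len(sequence) != len(columns):
        raise ValueError("toy scorer needs one character per column")
    return sum(
        math.log(columns[i][c] / null[c]) for i, c in enumerate(sequence)
    )


# Hypothetical two-column DNA model, for illustration only.
columns = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
null = geometric_mean_null(columns)
score_match = log_odds("AC", columns, null)
score_other = log_odds("GT", columns, null)
```

In this toy setting a sequence that follows the model's preferred characters ("AC") scores above zero, while one that does not ("GT") scores below zero; where to place the match threshold is the second question the abstract raises.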

Year:  1997        PMID: 9146967     DOI: 10.1093/bioinformatics/13.2.191

Source DB:  PubMed          Journal:  Comput Appl Biosci        ISSN: 0266-7061


  19 in total

1.  Preparation of name and address data for record linkage using hidden Markov models.

Authors:  Tim Churches; Peter Christen; Kim Lim; Justin Xi Zhu
Journal:  BMC Med Inform Decis Mak       Date:  2002-12-13       Impact factor: 2.796

2.  Statistical modeling and analysis of the LAGLIDADG family of site-specific endonucleases and identification of an intein that encodes a site-specific endonuclease of the HNH family.

Authors:  J Z Dalgaard; A J Klar; M J Moser; W R Holley; A Chatterjee; I S Mian
Journal:  Nucleic Acids Res       Date:  1997-11-15       Impact factor: 16.971

Review 3.  Tools for Predicting the Functional Impact of Nonsynonymous Genetic Variation.

Authors:  Haiming Tang; Paul D Thomas
Journal:  Genetics       Date:  2016-06       Impact factor: 4.562

4.  Comparative sequence analysis of ribonucleases HII, III, II PH and D.

Authors:  I S Mian
Journal:  Nucleic Acids Res       Date:  1997-08-15       Impact factor: 16.971

5.  The proofreading domain of Escherichia coli DNA polymerase I and other DNA and/or RNA exonuclease domains.

Authors:  M J Moser; W R Holley; A Chatterjee; I S Mian
Journal:  Nucleic Acids Res       Date:  1997-12-15       Impact factor: 16.971

6.  Transcriptional organization of the Clostridium acetobutylicum genome.

Authors:  Carlos J Paredes; Isidore Rigoutsos; E Terry Papoutsakis
Journal:  Nucleic Acids Res       Date:  2004-04-01       Impact factor: 16.971

7.  CutProtFam-Pred: detection and classification of putative structural cuticular proteins from sequence alone, based on profile hidden Markov models.

Authors:  Zoi S Ioannidou; Margarita C Theodoropoulou; Nikos C Papandreou; Judith H Willis; Stavros J Hamodrakas
Journal:  Insect Biochem Mol Biol       Date:  2014-06-27       Impact factor: 4.714

8.  Using context to improve protein domain identification.

Authors:  Alejandro Ochoa; Manuel Llinás; Mona Singh
Journal:  BMC Bioinformatics       Date:  2011-03-31       Impact factor: 3.169

9.  The effectiveness of position- and composition-specific gap costs for protein similarity searches.

Authors:  Aleksandar Stojmirović; E Michael Gertz; Stephen F Altschul; Yi-Kuo Yu
Journal:  Bioinformatics       Date:  2008-07-01       Impact factor: 6.937

10.  Pleiotropic functions of catabolite control protein CcpA in Butanol-producing Clostridium acetobutylicum.

Authors:  Cong Ren; Yang Gu; Yan Wu; Weiwen Zhang; Chen Yang; Sheng Yang; Weihong Jiang
Journal:  BMC Genomics       Date:  2012-07-30       Impact factor: 3.969
