An expectation maximization algorithm for training hidden substitution models.

I Holmes, G M Rubin.

Abstract

We derive an expectation maximization algorithm for maximum-likelihood training of substitution rate matrices from multiple sequence alignments. The algorithm can be used to train hidden substitution models, where the structural context of a residue is treated as a hidden variable that can evolve over time. We used the algorithm to train hidden substitution matrices on protein alignments in the Pfam database. When the accuracy of multiple alignment algorithms is measured against BAliBASE (a database of structural reference alignments), our substitution matrices consistently outperform the PAM series, with the improvement steadily increasing as up to four hidden site classes are added. We discuss several applications of this algorithm in bioinformatics. Copyright 2002 Elsevier Science Ltd.
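
Illustrative note (not part of the original abstract): the sketch below shows the textbook maximization step for a continuous-time Markov chain, in which each off-diagonal rate is re-estimated as the expected number of substitutions divided by the expected time spent in the source state. It is not the authors' derivation. The function name `m_step`, the 3-state alphabet, and the numbers are assumptions made for illustration, and the expected counts are taken as given rather than computed from an alignment. In a hidden substitution model, the states would presumably be (site class, residue) pairs rather than residues alone.

```python
import numpy as np

def m_step(expected_transitions, expected_dwell):
    """Re-estimate a rate matrix from expected sufficient statistics.

    expected_transitions[i, j]: expected number of i -> j substitutions (i != j)
    expected_dwell[i]: expected total time spent in state i
    (Both are assumed to come from an expectation step not shown here.)
    """
    u = np.asarray(expected_transitions, dtype=float).copy()
    w = np.asarray(expected_dwell, dtype=float)
    np.fill_diagonal(u, 0.0)                 # ignore any self-transition entries
    rates = u / w[:, None]                   # off-diagonal rate = count / dwell time
    np.fill_diagonal(rates, -rates.sum(axis=1))  # make each row sum to zero
    return rates

# Invented statistics for a hypothetical 3-state alphabet.
u = np.array([[0.0, 2.0, 1.0],
              [1.0, 0.0, 3.0],
              [0.5, 0.5, 0.0]])
w = np.array([10.0, 8.0, 5.0])
R = m_step(u, w)
```

Repeating an expectation step (which recomputes `u` and `w` under the current rates) and this update until the likelihood stops improving gives the general shape of such a training loop.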

Year:  2002        PMID: 11955022     DOI: 10.1006/jmbi.2002.5405

Source DB:  PubMed          Journal:  J Mol Biol        ISSN: 0022-2836            Impact factor:   5.469


  30 in total

1.  Modeling DNA base substitution in large genomic regions from two organisms.

Authors:  Von Bing Yap; Terence P Speed
Journal:  J Mol Evol       Date:  2004-01       Impact factor: 2.395

2.  Efficient methods for estimating amino acid replacement rates.

Authors:  Lars Arvestad
Journal:  J Mol Evol       Date:  2006-04-28       Impact factor: 2.395

3.  Learning to count: robust estimates for labeled distances between molecular sequences.

Authors:  John D O'Brien; Vladimir N Minin; Marc A Suchard
Journal:  Mol Biol Evol       Date:  2009-01-08       Impact factor: 16.240

4.  Phylogenetic mixture models for proteins.

Authors:  Si Quang Le; Nicolas Lartillot; Olivier Gascuel
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2008-12-27       Impact factor: 6.237

5.  Fast, accurate and simulation-free stochastic mapping.

Authors:  Vladimir N Minin; Marc A Suchard
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2008-12-27       Impact factor: 6.237

6.  Efficient maximum likelihood parameterization of continuous-time Markov processes.

Authors:  Robert T McGibbon; Vijay S Pande
Journal:  J Chem Phys       Date:  2015-07-21       Impact factor: 3.488

7.  Historian: accurate reconstruction of ancestral sequences and evolutionary rates.

Authors:  Ian H Holmes
Journal:  Bioinformatics       Date:  2017-04-15       Impact factor: 6.937

8.  EREM: Parameter Estimation and Ancestral Reconstruction by Expectation-Maximization Algorithm for a Probabilistic Model of Genomic Binary Characters Evolution.

Authors:  Liran Carmel; Yuri I Wolf; Igor B Rogozin; Eugene V Koonin
Journal:  Adv Bioinformatics       Date:  2010-05-06

9.  Identifying novel constrained elements by exploiting biased substitution patterns.

Authors:  Manuel Garber; Mitchell Guttman; Michele Clamp; Michael C Zody; Nir Friedman; Xiaohui Xie
Journal:  Bioinformatics       Date:  2009-06-15       Impact factor: 6.937

10.  Evolutionary triplet models of structured RNA.

Authors:  Robert K Bradley; Ian Holmes
Journal:  PLoS Comput Biol       Date:  2009-08-28       Impact factor: 4.475
