
Learning to count: robust estimates for labeled distances between molecular sequences.

John D O'Brien, Vladimir N Minin, Marc A Suchard.

Abstract

Researchers routinely estimate distances between molecular sequences using continuous-time Markov chain models. We present a new method, robust counting, that protects against the possibly severe bias arising from model misspecification. We achieve this robustness by generalizing the conventional distance estimation to incorporate the empirical distribution of site patterns found in the observed pairwise sequence alignment. Our flexible framework allows for computing distances based only on a subset of possible substitutions. From this, we show how to estimate labeled codon distances, such as expected numbers of synonymous or nonsynonymous substitutions. We present two simulation studies. The first compares the relative bias and variance of conventional and robust labeled nucleotide estimators. In the second simulation, we demonstrate that robust counting furnishes accurate synonymous and nonsynonymous distance estimates based only on easy-to-fit models of nucleotide substitution, bypassing the need for computationally expensive codon models. We conclude with three empirical examples. In the first two examples, we investigate the evolutionary dynamics of the influenza A hemagglutinin gene using labeled codon distances. In the final example, we demonstrate the advantages of using robust synonymous distances to alleviate the effect of convergent evolution on phylogenetic analysis of an HIV transmission network.
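The central idea in the abstract, replacing the model-based distribution of site patterns with the empirical one, can be sketched in a few lines. The sketch below is one reading of that sentence under loud assumptions, not the authors' implementation: it assumes the JC69 nucleotide model (the abstract names no model), it assumes the robust estimate is the empirical-pattern-weighted average of the expected number of labeled substitutions given the two end states, and it computes that conditional expectation by simple midpoint integration. All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import product

NUCS = "ACGT"
# Label sets: ordered (from, to) pairs of substitutions we want to count.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
ALL_SUBS = {(a, b) for a, b in product(NUCS, NUCS) if a != b}
RATE = 1.0 / 3.0  # assumed JC69 off-diagonal rate; exit rate 1, so t is in substitutions per site


def jc69_prob(a, b, t):
    """JC69 transition probability P_ab(t) (closed form)."""
    decay = math.exp(-4.0 * RATE * t)
    return 0.25 + 0.75 * decay if a == b else 0.25 - 0.25 * decay


def site_pattern_freqs(seq1, seq2):
    """Empirical distribution of (x, y) site patterns; gaps and ambiguity codes are skipped."""
    pairs = [(x, y) for x, y in zip(seq1.upper(), seq2.upper()) if x in NUCS and y in NUCS]
    counts = Counter(pairs)
    return {pattern: n / len(pairs) for pattern, n in counts.items()}


def jc69_distance(freqs):
    """Conventional JC69 distance from the fraction of mismatched sites (requires p < 0.75)."""
    p = sum(f for (x, y), f in freqs.items() if x != y)
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def labeled_count_given_ends(i, j, t, labels, steps=200):
    """E[# labeled substitutions | X_0 = i, X_t = j], by midpoint integration over the jump time."""
    h = t / steps
    joint = 0.0
    for n in range(steps):
        s = (n + 0.5) * h
        # Be in state k at time s, make the labeled jump k -> l, then end in j at time t.
        joint += h * sum(jc69_prob(i, k, s) * RATE * jc69_prob(l, j, t - s) for k, l in labels)
    return joint / jc69_prob(i, j, t)


def robust_labeled_distance(freqs, t, labels):
    """Average the conditional expected counts over the empirical site patterns (assumed form)."""
    return sum(f * labeled_count_given_ends(i, j, t, labels) for (i, j), f in freqs.items())


# Toy pairwise alignment, made up purely for illustration.
seq1 = "ACGTACGTACGTACGTACGA"
seq2 = "ACGTACGTGCGTACATACGC"
freqs = site_pattern_freqs(seq1, seq2)
t_hat = jc69_distance(freqs)
print("conventional JC69 distance:", t_hat)
print("robust transition distance:", robust_labeled_distance(freqs, t_hat, TRANSITIONS))
print("robust total distance:", robust_labeled_distance(freqs, t_hat, ALL_SUBS))
```

When the site-pattern frequencies match the model exactly, this average collapses to the conventional model-based expectation; the two estimates differ only when the observed patterns depart from what the fitted model predicts, which is where the abstract locates the protection against misspecification.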

Year: 2009; PMID: 19131426; PMCID: PMC2734148; DOI: 10.1093/molbev/msp003

Source DB: PubMed; Journal: Mol Biol Evol; ISSN: 0737-4038; Impact factor: 16.240


  59 in total

1.  Characterizing molecular adaptation: a hierarchical approach to assess the selective influence of amino acid properties.

Authors:  Saheli Datta; Raquel Prado; Abel Rodríguez; Ananías A Escalante
Journal:  Bioinformatics       Date:  2010-09-16       Impact factor: 6.937

2.  Positive Selection in CD8+ T-Cell Epitopes of Influenza Virus Nucleoprotein Revealed by a Comparative Analysis of Human and Swine Viral Lineages.

Authors:  Heather M Machkovech; Trevor Bedford; Marc A Suchard; Jesse D Bloom
Journal:  J Virol       Date:  2015-08-26       Impact factor: 5.103

3.  Mutation-selection models of coding sequence evolution with site-heterogeneous amino acid fitness profiles.

Authors:  Nicolas Rodrigue; Hervé Philippe; Nicolas Lartillot
Journal:  Proc Natl Acad Sci U S A       Date:  2010-02-22       Impact factor: 11.205

4.  Evaluation of Ancestral Sequence Reconstruction Methods to Infer Nonstationary Patterns of Nucleotide Substitution.

Authors:  Tomotaka Matsumoto; Hiroshi Akashi; Ziheng Yang
Journal:  Genetics       Date:  2015-05-06       Impact factor: 4.562

5.  Estimating the rate of intersubtype recombination in early HIV-1 group M strains.

Authors:  Melissa J Ward; Samantha J Lycett; Marcia L Kalish; Andrew Rambaut; Andrew J Leigh Brown
Journal:  J Virol       Date:  2012-12-12       Impact factor: 5.103

6.  Comprehensive Characterization of HIV-1 Molecular Epidemiology and Demographic History in the Brazilian Region Most Heavily Affected by AIDS.

Authors:  Tiago Gräf; Hegger Machado Fritsch; Rúbia Marília de Medeiros; Dennis Maletich Junqueira; Sabrina Esteves de Matos Almeida; Aguinaldo Roberto Pinto
Journal:  J Virol       Date:  2016-08-26       Impact factor: 5.103

7.  Three roads diverged? Routes to phylogeographic inference.

Authors:  Erik W Bloomquist; Philippe Lemey; Marc A Suchard
Journal:  Trends Ecol Evol       Date:  2010-09-20       Impact factor: 17.712

8.  Contribution of Epidemiological Predictors in Unraveling the Phylogeographic History of HIV-1 Subtype C in Brazil.

Authors:  Tiago Gräf; Bram Vrancken; Dennis Maletich Junqueira; Rúbia Marília de Medeiros; Marc A Suchard; Philippe Lemey; Sabrina Esteves de Matos Almeida; Aguinaldo Roberto Pinto
Journal:  J Virol       Date:  2015-09-30       Impact factor: 5.103

9.  On the contribution of Angola to the initial spread of HIV-1.

Authors:  Andrea-Clemencia Pineda-Peña; Jorge Varanda; João Dinis Sousa; Kristof Theys; Inês Bártolo; Thomas Leitner; Nuno Taveira; Anne-Mieke Vandamme; Ana B Abecasis
Journal:  Infect Genet Evol       Date:  2016-08-10       Impact factor: 3.342

10.  Inference and characterization of horizontally transferred gene families using stochastic mapping.

Authors:  Ofir Cohen; Tal Pupko
Journal:  Mol Biol Evol       Date:  2009-10-06       Impact factor: 16.240
