Incorporating Nearest-Neighbor Site Dependence into Protein Evolution Models.

Gary Larson, Jeffrey L Thorne, Scott Schmidler.

Abstract

Evolutionary models of proteins are widely used for statistical sequence alignment and inference of homology and phylogeny. However, the vast majority of these models rely on an unrealistic assumption of independent evolution between sites. Here we focus on the related problem of protein structure alignment, a classic tool of computational biology that is widely used to identify structural and functional similarity and to infer homology among proteins. A site-independent statistical model for protein structural evolution has previously been introduced and shown to significantly improve alignments and phylogenetic inferences compared with approaches that utilize only amino acid sequence information. Here we extend this model to account for correlated evolutionary drift among neighboring amino acid positions. The result is a spatiotemporal model of protein structure evolution, described by a multivariate diffusion process convolved with a spatial birth-death process. This extended site-dependent model (SDM) comes with little additional computational cost or analytical complexity compared with the site-independent model (SIM). We demonstrate that this SDM yields a significant reduction of bias in estimated evolutionary distances and helps further improve phylogenetic tree reconstruction. We also develop a simple model of site-dependent sequence evolution, which we use to demonstrate the bias resulting from the application of standard site-independent sequence evolution models.
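As a rough illustration of what "correlated evolutionary drift among neighboring amino acid positions" could mean in practice, the sketch below simulates a mean-reverting diffusion in which the noise driving adjacent sites is correlated. It is not the authors' SDM: the Ornstein-Uhlenbeck form, the AR(1)-style covariance `rho ** |i - j|`, and the names `neighbor_covariance`, `simulate_drift`, `theta`, `sigma`, and `rho` are all assumptions made for this example, and the birth-death (indel) component is omitted entirely. Setting `rho = 0` gives the site-independent case.

```python
import numpy as np


def neighbor_covariance(n_sites, rho):
    """Hypothetical AR(1)-style covariance: Cov(site i, site j) = rho ** |i - j|."""
    idx = np.arange(n_sites)
    return float(rho) ** np.abs(idx[:, None] - idx[None, :])


def simulate_drift(n_sites, t, rho, theta=1.0, sigma=1.0, n_steps=200, seed=0):
    """Euler-Maruyama simulation of dX = -theta * X dt + sigma * L dW.

    X holds one coordinate displacement per site, and L is the Cholesky
    factor of the neighbor covariance, so noise at adjacent sites is
    correlated. All parameter values here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(neighbor_covariance(n_sites, rho))
    dt = t / n_steps
    x = np.zeros(n_sites)
    for _ in range(n_steps):
        noise = chol @ rng.standard_normal(n_sites)
        x = x - theta * x * dt + sigma * np.sqrt(dt) * noise
    return x


if __name__ == "__main__":
    # Compare drift after the same evolutionary time with and without
    # nearest-neighbor dependence.
    independent = simulate_drift(n_sites=8, t=1.0, rho=0.0)
    dependent = simulate_drift(n_sites=8, t=1.0, rho=0.8)
    print("site-independent drift:", np.round(independent, 2))
    print("site-dependent drift:  ", np.round(dependent, 2))
```

With a positive `rho`, displacements at adjacent sites tend to move together, which is the kind of dependence a site-independent model ignores when it treats every position as a separate observation.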

Keywords:  diffusion process; dynamic programming; evolution; phylogeny; protein structure

Year:  2020        PMID: 32053390      PMCID: PMC7081252          DOI: 10.1089/cmb.2019.0500

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  8 in total

1.  A stochastic evolutionary model for protein structure alignment and phylogeny.

Authors:  Christopher J Challis; Scott C Schmidler
Journal:  Mol Biol Evol       Date:  2012-06-21       Impact factor: 16.240

2.  Different versions of the Dayhoff rate matrix.

Authors:  Carolin Kosiol; Nick Goldman
Journal:  Mol Biol Evol       Date:  2004-10-13       Impact factor: 16.240

3.  MALIDUP: a database of manually constructed structure alignments for duplicated domain pairs.

Authors:  Hua Cheng; Bong-Hyun Kim; Nick V Grishin
Journal:  Proteins       Date:  2008-03

4.  An evolutionary model for maximum likelihood alignment of DNA sequences.

Authors:  J L Thorne; H Kishino; J Felsenstein
Journal:  J Mol Evol       Date:  1991-08       Impact factor: 2.395

5.  Evolution of DNA or amino acid sequences with dependent sites.

Authors:  A von Haeseler; M Schöniger
Journal:  J Comput Biol       Date:  1998       Impact factor: 1.479

6.  Simultaneous Bayesian estimation of alignment and phylogeny under a joint model of protein sequence and structure.

Authors:  Joseph L Herman; Christopher J Challis; Ádám Novák; Jotun Hein; Scott C Schmidler
Journal:  Mol Biol Evol       Date:  2014-06-04       Impact factor: 16.240

7.  Protein structure alignment beyond spatial proximity.

Authors:  Sheng Wang; Jianzhu Ma; Jian Peng; Jinbo Xu
Journal:  Sci Rep       Date:  2013       Impact factor: 4.379

8.  Trends in substitution models of molecular evolution.

Authors:  Miguel Arenas
Journal:  Front Genet       Date:  2015-10-26       Impact factor: 4.599

  1 in total

1.  Convolutional Neural Network Based Approach to in Silico Non-Anticipating Prediction of Antigenic Distance for Influenza Virus.

Authors:  Majid Forghani; Michael Khachay
Journal:  Viruses       Date:  2020-09-12       Impact factor: 5.048

