Aligning protein sequences with predicted secondary structure.

John Kececioglu, Eagu Kim, Travis Wheeler.

Abstract

Accurately aligning distant protein sequences is notoriously difficult. Since the amino acid sequence alone often does not provide enough information to obtain accurate alignments under the standard alignment scoring functions, a recent approach to improving alignment accuracy is to use additional information such as secondary structure. We make several advances in alignment of protein sequences annotated with predicted secondary structure: (1) more accurate models for scoring alignments, (2) efficient algorithms for optimal alignment under these models, (3) improved learning criteria for setting model parameters through inverse alignment, and (4) in-depth experiments evaluating model variants on benchmark alignments. More specifically, the new models use secondary structure predictions and their confidences to modify the scoring of both substitutions and gaps. All models have efficient algorithms for optimal pairwise alignment that run in near-quadratic time. These models have many parameters, which are rigorously learned using inverse alignment under a new criterion that carefully balances score error and recovery error. We then evaluate these models by studying how accurately an optimal alignment under the model recovers benchmark reference alignments that are based on the known three-dimensional structures of the proteins. The experiments show that these new models provide a significant boost in accuracy over the standard model for distant sequences. The improvement for pairwise alignment is as much as 15% for sequences with less than 25% identity, while for multiple alignment the improvement is more than 20% for difficult benchmarks whose accuracy under standard tools is at most 40%.
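
The abstract states that predicted secondary structure and its confidences modify the scoring of both substitutions and gaps, and that optimal pairwise alignment remains roughly quadratic. The sketch below illustrates that general idea only; it is not the authors' model. Everything in it is an assumption made for illustration: the function name `align_score`, the identity-based `match`/`mismatch` scores (a real aligner would use a substitution matrix), the linear gap cost, the `ss_bonus` and `ss_gap_penalty` parameters, and the particular way confidences are combined.

```python
def align_score(a, b, ss_a, ss_b, conf_a, conf_b,
                match=2.0, mismatch=-1.0, gap=-2.0,
                ss_bonus=1.0, ss_gap_penalty=1.0):
    """Optimal global alignment score in O(len(a) * len(b)) time.

    a, b           : amino acid sequences (strings)
    ss_a, ss_b     : predicted states per residue, 'H' helix, 'E' strand, 'C' coil
    conf_a, conf_b : prediction confidences per residue, each in [0, 1]
    All scoring parameters are hypothetical, chosen only for illustration.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Base score from residue identity (stand-in for a substitution matrix).
        base = match if a[i] == b[j] else mismatch
        # Assumed bonus: when both residues are predicted to be in the same
        # state, add a reward scaled by the product of the two confidences.
        if ss_a[i] == ss_b[j]:
            base += ss_bonus * conf_a[i] * conf_b[j]
        return base

    def gap_cost(ss, conf, k):
        # Assumed penalty: a gap opposite residue k costs more when that
        # residue is confidently predicted to lie in a helix or strand.
        extra = ss_gap_penalty * conf[k] if ss[k] in "HE" else 0.0
        return gap - extra

    # dp[i][j] = best score aligning the prefixes a[:i] and b[:j].
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + gap_cost(ss_a, conf_a, i - 1)
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + gap_cost(ss_b, conf_b, j - 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + sub(i - 1, j - 1),           # substitution
                dp[i - 1][j] + gap_cost(ss_a, conf_a, i - 1),   # gap in b
                dp[i][j - 1] + gap_cost(ss_b, conf_b, j - 1),   # gap in a
            )
    return dp[n][m]
```

Setting all confidences to zero reduces this to a plain global alignment score, which makes the effect of the structure terms easy to isolate. The sketch leaves out affine gap costs and parameter learning by inverse alignment; those are the parts the paper itself addresses.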

Year: 2010    PMID: 20377464    DOI: 10.1089/cmb.2009.0222

Source DB: PubMed    Journal: J Comput Biol    ISSN: 1066-5277    Impact factor: 1.479


  3 in total

1.  Defining the Domain Arrangement of the Mammalian Target of Rapamycin Complex Component Rictor Protein.

Authors:  Ping Zhou; Ning Zhang; Ruth Nussinov; Buyong Ma
Journal:  J Comput Biol       Date:  2015-07-15       Impact factor: 1.479

2.  Crystal structure of Zebrafish interferons I and II reveals conservation of type I interferon structure in vertebrates.

Authors:  Ole Jensen Hamming; Georges Lutfalla; Jean-Pierre Levraud; Rune Hartmann
Journal:  J Virol       Date:  2011-06-08       Impact factor: 5.103

3.  Boosting the accuracy of protein secondary structure prediction through nearest neighbor search and method hybridization.

Authors:  Spencer Krieger; John Kececioglu
Journal:  Bioinformatics       Date:  2020-07-01       Impact factor: 6.937
