

Aligning protein sequences with predicted secondary structure.

John Kececioglu, Eagu Kim, Travis Wheeler.

Abstract

Accurately aligning distant protein sequences is notoriously difficult. Since the amino acid sequence alone often does not provide enough information to obtain accurate alignments under the standard alignment scoring functions, a recent approach to improving alignment accuracy is to use additional information such as secondary structure. We make several advances in alignment of protein sequences annotated with predicted secondary structure: (1) more accurate models for scoring alignments, (2) efficient algorithms for optimal alignment under these models, and (3) improved learning criteria for setting model parameters through inverse alignment, as well as (4) in-depth experiments evaluating model variants on benchmark alignments. More specifically, the new models use secondary structure predictions and their confidences to modify the scoring of both substitutions and gaps. All models have efficient algorithms for optimal pairwise alignment that run in near-quadratic time. These models have many parameters, which are rigorously learned using inverse alignment under a new criterion that carefully balances score error and recovery error. We then evaluate these models by studying how accurately an optimal alignment under the model recovers benchmark reference alignments that are based on the known three-dimensional structures of the proteins. The experiments show that these new models provide a significant boost in accuracy over the standard model for distant sequences. The improvement for pairwise alignment is as much as 15% for sequences with less than 25% identity, while for multiple alignment the improvement is more than 20% for difficult benchmarks whose accuracy under standard tools is at most 40%.
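To make the idea in the abstract concrete, the following is a minimal sketch of pairwise global alignment in which predicted secondary structure and its confidence adjust both substitution and gap scores. It is not the authors' model: the function names, the scoring formulas, the default parameter values, and the use of linear (rather than affine) gap costs are all assumptions chosen for illustration. Structure states are assumed to be given as one character per residue (H = helix, E = strand, C = coil), with a confidence in [0, 1] for each.

```python
from typing import Sequence


def sub_score(x: str, y: str, sx: str, sy: str, cx: float, cy: float,
              match: float = 2.0, mismatch: float = -1.0,
              ss_weight: float = 1.0) -> float:
    """Hypothetical substitution score: a sequence term plus a structure term.

    The structure term rewards agreeing predicted states and penalizes
    disagreeing ones, weighted by the product of the two confidences.
    """
    base = match if x == y else mismatch
    agreement = cx * cy if sx == sy else -cx * cy
    return base + ss_weight * agreement


def gap_score(s: str, c: float, gap: float = -2.0,
              ss_weight: float = 1.0) -> float:
    """Hypothetical gap score: gapping a residue that is confidently
    predicted to lie in a helix (H) or strand (E) costs more."""
    extra = ss_weight * c if s in "HE" else 0.0
    return gap - extra


def align_score(a: str, b: str, ss_a: str, ss_b: str,
                conf_a: Sequence[float], conf_b: Sequence[float],
                ss_weight: float = 1.0) -> float:
    """Optimal global alignment score by dynamic programming, O(len(a) * len(b))."""
    n, m = len(a), len(b)
    D = [[0.0] * (m + 1) for _ in range(n + 1)]
    # Boundary: a prefix of one sequence aligned entirely against gaps.
    for i in range(1, n + 1):
        D[i][0] = D[i - 1][0] + gap_score(ss_a[i - 1], conf_a[i - 1],
                                          ss_weight=ss_weight)
    for j in range(1, m + 1):
        D[0][j] = D[0][j - 1] + gap_score(ss_b[j - 1], conf_b[j - 1],
                                          ss_weight=ss_weight)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = max(
                # Substitute a[i-1] with b[j-1].
                D[i - 1][j - 1] + sub_score(a[i - 1], b[j - 1],
                                            ss_a[i - 1], ss_b[j - 1],
                                            conf_a[i - 1], conf_b[j - 1],
                                            ss_weight=ss_weight),
                # Align a[i-1] against a gap.
                D[i - 1][j] + gap_score(ss_a[i - 1], conf_a[i - 1],
                                        ss_weight=ss_weight),
                # Align b[j-1] against a gap.
                D[i][j - 1] + gap_score(ss_b[j - 1], conf_b[j - 1],
                                        ss_weight=ss_weight),
            )
    return D[n][m]
```

Setting ss_weight to 0 reduces this sketch to a plain sequence-only alignment score, which gives a simple baseline for comparing the effect of the structure terms. In the paper the parameters are learned by inverse alignment; here they are fixed by hand.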

Substances: Proteins

Year: 2010 PMID: 20377464 DOI: 10.1089/cmb.2009.0222

Source DB: PubMed Journal: J Comput Biol ISSN: 1066-5277 Impact factor: 1.479

Cited by (3 in total)

1. Defining the Domain Arrangement of the Mammalian Target of Rapamycin Complex Component Rictor Protein.

Authors: Ping Zhou; Ning Zhang; Ruth Nussinov; Buyong Ma
Journal: J Comput Biol Date: 2015-07-15 Impact factor: 1.479

2. Crystal structure of Zebrafish interferons I and II reveals conservation of type I interferon structure in vertebrates.

Authors: Ole Jensen Hamming; Georges Lutfalla; Jean-Pierre Levraud; Rune Hartmann
Journal: J Virol Date: 2011-06-08 Impact factor: 5.103

3. Boosting the accuracy of protein secondary structure prediction through nearest neighbor search and method hybridization.

Authors: Spencer Krieger; John Kececioglu
Journal: Bioinformatics Date: 2020-07-01 Impact factor: 6.937
