Generalized affine gap costs for protein sequence alignment.

S F Altschul

Abstract

Based on the observation that a single mutational event can delete or insert multiple residues, affine gap costs for sequence alignment charge a penalty for the existence of a gap, and a further length-dependent penalty. From structural or multiple alignments of distantly related proteins, it has been observed that conserved residues frequently fall into ungapped blocks separated by relatively nonconserved regions. To take advantage of this structure, a simple generalization of affine gap costs is proposed that allows nonconserved regions to be effectively ignored. The distribution of scores from local alignments using these generalized gap costs is shown empirically to follow an extreme value distribution. Examples are presented for which generalized affine gap costs yield superior alignments from the standpoints both of statistical significance and of alignment accuracy. Guidelines for selecting generalized affine gap costs are discussed, as is their possible application to multiple alignment.
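The two cost schemes described above can be sketched as small functions. This is a minimal illustration, not the paper's code. The affine form follows the abstract directly: a penalty for the gap's existence plus a length-dependent penalty. The generalized form, a + b*|k1 - k2| + c*min(k1, k2), is not stated in the abstract; it is assumed here from the way generalized affine gap costs are commonly described, where each pair of residues left unaligned costs c. All function and parameter names are hypothetical, and the numeric values are illustrative only.

```python
def affine_gap_cost(length, gap_open, gap_extend):
    """Cost of a single gap spanning `length` residues.

    Assumed convention: existence penalty `gap_open` plus `gap_extend`
    for every residue in the gap. Some tools instead fold the first
    residue into the opening penalty.
    """
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0  # no gap, no cost
    return gap_open + gap_extend * length


def generalized_affine_gap_cost(k1, k2, a, b, c):
    """Cost of a generalized gap (assumed form, not given in the abstract).

    k1, k2: numbers of residues from each sequence that fall inside the
            nonconserved region and are left unaligned.
    a:      penalty for the existence of the gap.
    b:      penalty per residue inserted or deleted, i.e. |k1 - k2| of them.
    c:      penalty per pair of residues left unaligned, i.e. min(k1, k2) pairs.
    """
    if k1 < 0 or k2 < 0:
        raise ValueError("residue counts must be non-negative")
    if k1 == 0 and k2 == 0:
        return 0  # no gap, no cost
    return a + b * abs(k1 - k2) + c * min(k1, k2)


# Illustrative parameter values only.
plain = affine_gap_cost(3, gap_open=11, gap_extend=1)
same = generalized_affine_gap_cost(3, 0, a=11, b=1, c=2)
skipped = generalized_affine_gap_cost(5, 3, a=11, b=1, c=2)
```

Under this assumed form, a gap with no unaligned pairs (k2 = 0) costs exactly what the ordinary affine scheme charges, so the generalization only adds the option of skipping a nonconserved region at a per-pair cost c instead of forcing its residues to align.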

Year: 1998    PMID: 9672045

Source DB: PubMed    Journal: Proteins    ISSN: 0887-3585
