
Sequence context-specific profiles for homology searching.

A Biegert, J Söding.

Abstract

Sequence alignment and database searching are essential tools in biology because a protein's function can often be inferred from homologous proteins. Standard sequence comparison methods use substitution matrices to find the alignment with the best sum of similarity scores between aligned residues. These similarity scores do not take the local sequence context into account. Here, we present an approach that derives context-specific amino acid similarities from short windows centered on each query sequence residue. Our results demonstrate that the sequence context contains much more information about the expected mutations than just the residue itself. By employing our context-specific similarities (CS-BLAST) in combination with NCBI BLAST, we increase the sensitivity more than 2-fold on a difficult benchmark set, without loss of speed. Alignment quality is likewise improved significantly. Furthermore, we demonstrate considerable improvements when applying this paradigm to sequence profiles: Two iterations of CSI-BLAST, our context-specific version of PSI-BLAST, are more sensitive than 5 iterations of PSI-BLAST. The paradigm for biological sequence comparison presented here is very general. It can replace substitution matrices in sequence- and profile-based alignment and search methods for both protein and nucleotide sequences.
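The idea of deriving a position-specific amino acid distribution from a short window around each query residue can be illustrated with a small sketch. This is a toy under stated assumptions, not the authors' implementation: the library of "context profiles" is filled with random placeholder values rather than trained on alignments, and the window length (5), the library size (4), the uniform priors, and all function names are hypothetical choices made only for illustration. Each residue's window is scored against every profile, the scores are normalized to weights, and the profiles' central columns are mixed with those weights.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def random_context_profile(rng, window):
    """Toy context profile: one amino acid distribution per window column.

    Assumption: real profiles would be learned from data; these are random.
    """
    profile = []
    for _ in range(window):
        weights = [rng.random() + 0.05 for _ in AMINO_ACIDS]
        total = sum(weights)
        profile.append([w / total for w in weights])
    return profile


def context_specific_profile(sequence, library, priors, window):
    """For each residue, mix the central columns of the library profiles,
    weighted by how well each profile matches the surrounding window."""
    half = window // 2
    result = []
    for i in range(len(sequence)):
        # Log-likelihood of the window around position i under each profile.
        log_scores = []
        for profile, prior in zip(library, priors):
            log_p = math.log(prior)
            for j in range(-half, half + 1):
                pos = i + j
                if 0 <= pos < len(sequence):  # skip columns past the ends
                    residue = AA_INDEX[sequence[pos]]
                    log_p += math.log(profile[j + half][residue])
            log_scores.append(log_p)
        # Normalize to weights (softmax, shifted for numerical stability).
        top = max(log_scores)
        weights = [math.exp(s - top) for s in log_scores]
        norm = sum(weights)
        weights = [w / norm for w in weights]
        # Expected amino acid distribution at position i.
        column = [
            sum(w * profile[half][a] for w, profile in zip(weights, library))
            for a in range(len(AMINO_ACIDS))
        ]
        result.append(column)
    return result


rng = random.Random(0)   # fixed seed so the toy example is reproducible
WINDOW = 5               # hypothetical window length
library = [random_context_profile(rng, WINDOW) for _ in range(4)]
priors = [0.25] * 4      # hypothetical uniform priors over the profiles
profile = context_specific_profile("MKVLAAGIW", library, priors, WINDOW)
```

The result has one row per query residue and one column per amino acid, which is the shape of a sequence profile. In this sketch two identical residues can receive different rows when their neighbors differ, which a fixed substitution matrix cannot express.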

Year:  2009        PMID: 19234132      PMCID: PMC2645910          DOI: 10.1073/pnas.0810767106

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


