
FragBag, an accurate representation of protein structure, retrieves structural neighbors from the entire PDB quickly and accurately.

Inbal Budowski-Tal, Yuval Nov, Rachel Kolodny.

Abstract

Fast identification of protein structures that are similar to a specified query structure in the entire Protein Data Bank (PDB) is fundamental in structure and function prediction. We present FragBag, an ultrafast and accurate method for comparing protein structures. We describe a protein structure by the collection of its overlapping short contiguous backbone segments, and discretize this set using a library of fragments. Then, we succinctly represent the protein as a "bag-of-fragments" (a vector that counts the number of occurrences of each fragment) and measure the similarity between two structures by the similarity between their vectors. Our representation has two additional benefits: (i) it can be used to construct an inverted index for implementing a fast structural search engine of the entire PDB, and (ii) one can specify a structure as a collection of substructures, without combining them into a single structure; this is valuable for structure prediction, when there are reliable predictions for only parts of the protein. We use receiver operating characteristic curve analysis to quantify the success of FragBag in identifying neighbor candidate sets in a dataset of over 2,900 structures. The gold standard is the set of neighbors found by six state-of-the-art structural aligners. Our best FragBag library finds more accurate candidate sets than the three other filter methods: SGM, PRIDE, and a method by Zotenko et al. More interestingly, FragBag performs on a par with the computationally expensive, yet highly trusted, structural aligners STRUCTAL and CE.
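The pipeline described in the abstract (overlapping backbone segments, discretization against a fragment library, a count vector, vector similarity, and an inverted index) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy two-fragment library, the three-point fragment length, the nearest-fragment criterion (comparison of internal pairwise distances, used as a simple rotation- and translation-invariant stand-in), the cosine similarity measure, and all function names.

```python
import math
from collections import Counter, defaultdict

def internal_distances(segment):
    """Pairwise distances within a segment (rotation/translation invariant)."""
    return [math.dist(p, q) for i, p in enumerate(segment) for q in segment[i + 1:]]

def nearest_fragment(segment, library):
    """Index of the library fragment closest to the segment.

    Assumption: closeness is the squared difference of internal distances;
    the abstract does not state the matching criterion.
    """
    seg = internal_distances(segment)

    def cost(frag):
        return sum((a - b) ** 2 for a, b in zip(seg, internal_distances(frag)))

    return min(range(len(library)), key=lambda i: cost(library[i]))

def bag_of_fragments(backbone, library):
    """Count how often each library fragment best matches an overlapping window."""
    k = len(library[0])  # all fragments are assumed to have the same length
    counts = Counter(
        nearest_fragment(backbone[i:i + k], library)
        for i in range(len(backbone) - k + 1)
    )
    return [counts.get(i, 0) for i in range(len(library))]

def cosine_similarity(u, v):
    """Cosine similarity of two count vectors (one possible similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_inverted_index(bags):
    """Map each fragment id to the set of structures that contain it."""
    index = defaultdict(set)
    for name, bag in bags.items():
        for frag_id, count in enumerate(bag):
            if count:
                index[frag_id].add(name)
    return index

# Hypothetical toy library: fragment 0 is straight, fragment 1 has a right-angle bend.
library = [
    [(0, 0, 0), (1, 0, 0), (2, 0, 0)],
    [(0, 0, 0), (1, 0, 0), (1, 1, 0)],
]

# Hypothetical toy "backbones" given as lists of 3D points.
structures = {
    "straight": [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)],
    "stair": [(0, 0, 0), (1, 0, 0), (1, 1, 0), (2, 1, 0)],
    "mixed": [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0)],
}

bags = {name: bag_of_fragments(coords, library) for name, coords in structures.items()}
index = build_inverted_index(bags)

# Candidate neighbors of a query: structures sharing at least one fragment with it.
query = bags["straight"]
candidates = set().union(*(index[i] for i, c in enumerate(query) if c))
ranked = sorted(candidates, key=lambda n: cosine_similarity(query, bags[n]), reverse=True)
```

In this toy setup the inverted index restricts the comparison to structures that share at least one fragment with the query, so "stair" is never scored against "straight". Because a bag is only a count vector, it can be computed from disconnected segments without any change to the code, which is one way to read benefit (ii) in the abstract.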


Year:  2010        PMID: 20133727      PMCID: PMC2840415          DOI: 10.1073/pnas.0914097107

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


  30 in total

1.  Protein fold similarity estimated by a probabilistic approach based on C(alpha)-C(alpha) distance comparison.

Authors:  Oliviero Carugo; Sándor Pongor
Journal:  J Mol Biol       Date:  2002-01-25       Impact factor: 5.469

2.  Rapid 3D protein structure database searching using information retrieval techniques.

Authors:  Zeyar Aung; Kian-Lee Tan
Journal:  Bioinformatics       Date:  2004-02-12       Impact factor: 6.937

3.  TASSER: an automated method for the prediction of protein tertiary structures in CASP6.

Authors:  Yang Zhang; Adrian K Arakaki; Jeffrey Skolnick
Journal:  Proteins       Date:  2005

Review 4.  Rapid retrieval of protein structures from databases.

Authors:  Zeyar Aung; Kian-Lee Tan
Journal:  Drug Discov Today       Date:  2007-08-28       Impact factor: 7.851

5.  Protein structure alignment by incremental combinatorial extension (CE) of the optimal path.

Authors:  I N Shindyalov; P E Bourne
Journal:  Protein Eng       Date:  1998-09

Review 6.  Is protein classification necessary? Toward alternative approaches to function annotation.

Authors:  Donald Petrey; Barry Honig
Journal:  Curr Opin Struct Biol       Date:  2009-03-05       Impact factor: 6.809

7.  Secondary structure spatial conformation footprint: a novel method for fast protein structure comparison and classification.

Authors:  Elena Zotenko; Dianne P O'Leary; Teresa M Przytycka
Journal:  BMC Struct Biol       Date:  2006-06-08

8.  Critical assessment of methods of protein structure prediction-Round VII.

Authors:  John Moult; Krzysztof Fidelis; Andriy Kryshtafovych; Burkhard Rost; Tim Hubbard; Anna Tramontano
Journal:  Proteins       Date:  2007

9.  Sequence-similar, structure-dissimilar protein pairs in the PDB.

Authors:  Mickey Kosloff; Rachel Kolodny
Journal:  Proteins       Date:  2008-05-01

10.  SVM-Fold: a tool for discriminative multi-class protein fold and superfamily recognition.

Authors:  Iain Melvin; Eugene Ie; Rui Kuang; Jason Weston; William Noble Stafford; Christina Leslie
Journal:  BMC Bioinformatics       Date:  2007-05-22       Impact factor: 3.169

  45 in total

1.  Retrieving backbone string neighbors provides insights into structural modeling of membrane proteins.

Authors:  Jiang-Ming Sun; Tong-Hua Li; Pei-Sheng Cong; Sheng-Nan Tang; Wen-Wei Xiong
Journal:  Mol Cell Proteomics       Date:  2012-03-13       Impact factor: 5.911

2.  Reducing the dimensionality of the protein-folding search problem.

Authors:  George D Chellapa; George D Rose
Journal:  Protein Sci       Date:  2012-07-06       Impact factor: 6.725

3.  Physical-chemical determinants of coil conformations in globular proteins.

Authors:  Lauren L Perskie; George D Rose
Journal:  Protein Sci       Date:  2010-06       Impact factor: 6.725

4.  Nonlinearities in protein space limit the utility of informatics in protein biophysics.

Authors:  S Rackovsky
Journal:  Proteins       Date:  2015-09-10

5.  GOSSIP: a method for fast and accurate global alignment of protein structures.

Authors:  I Kifer; R Nussinov; H J Wolfson
Journal:  Bioinformatics       Date:  2011-02-03       Impact factor: 6.937

6.  Fast geometric consensus approach for protein model quality assessment.

Authors:  Rafal Adamczak; Jaroslaw Pillardy; Brinda K Vallat; Jaroslaw Meller
Journal:  J Comput Biol       Date:  2011-01-18       Impact factor: 1.479

7.  Maps of protein structure space reveal a fundamental relationship between protein structure and function.

Authors:  Margarita Osadchy; Rachel Kolodny
Journal:  Proc Natl Acad Sci U S A       Date:  2011-07-07       Impact factor: 11.205

8.  Rapid search for tertiary fragments reveals protein sequence-structure relationships.

Authors:  Jianfu Zhou; Gevorg Grigoryan
Journal:  Protein Sci       Date:  2014-12-31       Impact factor: 6.725

9.  Tertiary alphabet for the observable protein structural universe.

Authors:  Craig O Mackenzie; Jianfu Zhou; Gevorg Grigoryan
Journal:  Proc Natl Acad Sci U S A       Date:  2016-11-03       Impact factor: 11.205

10.  Mining tertiary structural motifs for assessment of designability.

Authors:  Jian Zhang; Gevorg Grigoryan
Journal:  Methods Enzymol       Date:  2013       Impact factor: 1.600

