Hidden Markov models that use predicted secondary structures for fold recognition.

J Hargbo, A Elofsson.

Abstract

There are many proteins that share the same fold but have no clear sequence similarity. To predict the structure of these proteins, so-called "protein fold recognition methods" have been developed. During the last few years, protein fold recognition methods have been improved through the use of predicted secondary structures (Rice and Eisenberg, J Mol Biol 1997;267:1026-1038), as well as through the use of multiple sequence alignments in the form of hidden Markov models (HMMs) (Karplus et al., Proteins Suppl 1997;1:134-139). To test the performance of different fold recognition methods, we have developed a rigorous benchmark in which representatives of all proteins of known structure are matched against each other. Using this benchmark, we have compared the performance of automatically created hidden Markov models with that of standard sequence search methods. Further, we combine predicted secondary structures and multiple sequence alignments into a single method that performs better than methods that do not use this combination of information. Using only single sequences, the correct fold of a protein was detected for 10% of the test cases in our benchmark. Including multiple sequence information increased this number to 16%, and when predicted secondary structure information was included as well, the fold was correctly identified in 20% of the cases. Moreover, if the correct secondary structure was used, 27% of the proteins could be correctly matched to a fold. For comparison, blast2, fasta, and ssearch identify the fold correctly in 13-17% of the cases. Thus, standard pairwise sequence search methods perform almost as well as hidden Markov models in our benchmark. This is probably because the automatically created multiple sequence alignments used in this study do not contain enough diversity and because the current generation of hidden Markov models does not perform very well when built from only a few sequences.
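The percentages quoted in the abstract are fractions of benchmark test cases for which the correct fold was detected. As an illustration only, the sketch below shows one way such a number could be computed from an all-against-all score table. The scoring rule (a query counts as correct when its single best-scoring non-self hit shares its fold label), the function name `top1_fold_accuracy`, and the toy data are assumptions made for this example; they are not taken from the paper.

```python
from typing import Dict, Tuple


def top1_fold_accuracy(
    scores: Dict[Tuple[str, str], float],
    fold_of: Dict[str, str],
) -> float:
    """Fraction of queries whose best-scoring non-self hit has the same fold.

    scores maps (query, target) to a similarity score, higher is better.
    fold_of maps each protein identifier to its fold label.
    Both structures are hypothetical stand-ins for real benchmark data.
    """
    proteins = sorted(fold_of)
    correct = 0
    evaluated = 0
    for query in proteins:
        # Collect every scored target except the query itself.
        candidates = [
            (scores[(query, target)], target)
            for target in proteins
            if target != query and (query, target) in scores
        ]
        if not candidates:
            continue  # no hits for this query, so it is not counted
        evaluated += 1
        _, best_target = max(candidates)
        if fold_of[best_target] == fold_of[query]:
            correct += 1
    return correct / evaluated if evaluated else 0.0


# Toy data: four proteins in two folds (invented for illustration).
fold_of = {"a1": "F1", "a2": "F1", "b1": "F2", "b2": "F2"}
scores = {
    ("a1", "a2"): 5.0, ("a1", "b1"): 3.0, ("a1", "b2"): 1.0,
    ("a2", "a1"): 2.0, ("a2", "b1"): 4.0, ("a2", "b2"): 1.0,
    ("b1", "b2"): 6.0, ("b1", "a1"): 1.0, ("b1", "a2"): 1.0,
    ("b2", "b1"): 3.0, ("b2", "a1"): 2.0, ("b2", "a2"): 1.0,
}
accuracy = top1_fold_accuracy(scores, fold_of)
print(f"{accuracy:.2f}")  # prints 0.75
```

In the toy data, three of the four queries have a top hit from their own fold, so the reported fraction is 0.75. A real benchmark would likely also need to decide how to treat ties and proteins with no same-fold partner, which this sketch does not address.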

Year:  1999        PMID: 10373007

Source DB:  PubMed          Journal:  Proteins        ISSN: 0887-3585

