Improving protein fold recognition by extracting fold-specific features from predicted residue-residue contacts.

Jianwei Zhu, Haicang Zhang, Shuai Cheng Li, Chao Wang, Lupeng Kong, Shiwei Sun, Wei-Mou Zheng, Dongbo Bu.

Abstract

MOTIVATION: Accurate recognition of protein fold types is a key step in template-based prediction of protein structures. Existing approaches to fold recognition mainly exploit features derived from alignments of the query protein against templates. These approaches have been shown to be successful for fold recognition at the family level, but usually fail at the superfamily and fold levels. To overcome this limitation, a key point is to explore more structurally informative features of proteins. Although residue-residue contacts carry abundant structural information, how to thoroughly exploit this information for fold recognition remains a challenge.
RESULTS: In this study, we present an approach (called DeepFR) to improve fold recognition at the superfamily and fold levels. The basic idea of our approach is to extract fold-specific features from the predicted residue-residue contacts of proteins using a deep convolutional neural network (DCNN). Based on these fold-specific features, we calculated the similarity between the query protein and each template, and then assigned the query protein the fold type of the most similar template. DCNNs have shown excellent performance in image feature extraction and image recognition; the rationale for applying a DCNN to fold recognition is that contact likelihood maps are essentially analogous to images, as both display a compositional hierarchy. Experimental results on the LINDAHL dataset suggest that, even using the extracted fold-specific features alone, our approach achieved a success rate comparable to the state-of-the-art approaches. When these features were further combined with traditional alignment-related features, the success rate of our approach increased to 92.3%, 82.5% and 78.8% at the family, superfamily and fold levels, respectively, which is about 18% higher than the state-of-the-art approach at the fold level, 6% higher at the superfamily level and 1% higher at the family level. An independent assessment on the SCOP_TEST dataset showed consistent performance improvement, indicating the robustness of our approach. Furthermore, bi-clustering results of the extracted features are compatible with the fold hierarchy of proteins, implying that these features are fold-specific. Together, these results suggest that the features extracted from predicted contacts are orthogonal to alignment-related features, and that combining them could greatly facilitate fold recognition at the superfamily and fold levels and template-based prediction of protein structures.
AVAILABILITY AND IMPLEMENTATION: Source code of DeepFR is freely available through https://github.com/zhujianwei31415/deepfr, and a web server is available through http://protein.ict.ac.cn/deepfr.
CONTACT: zheng@itp.ac.cn or dbu@ict.ac.cn.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
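
The assignment step described under RESULTS (compare the query's features with each template's features, then take the fold type of the most similar template) can be illustrated with a minimal Python sketch. This is not the DeepFR implementation: the abstract does not say which similarity measure is used, so cosine similarity is an assumption here, and the function names, template identifiers, fold labels and three-dimensional feature vectors are all hypothetical toy values rather than DCNN outputs.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors.

    Assumption: cosine similarity stands in for the similarity measure,
    which the abstract does not specify.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def assign_fold(query_features, templates):
    """Return (fold_label, template_id, similarity) of the most similar template.

    `templates` is a list of (template_id, fold_label, feature_vector) tuples.
    """
    if not templates:
        raise ValueError("at least one template is required")
    scored = [
        (cosine_similarity(query_features, features), template_id, fold_label)
        for template_id, fold_label, features in templates
    ]
    best_score, best_id, best_fold = max(scored, key=lambda item: item[0])
    return best_fold, best_id, best_score


if __name__ == "__main__":
    # Hypothetical toy data; real fold-specific features would come from
    # a network applied to predicted contact maps.
    toy_templates = [
        ("t1", "fold_A", [1.0, 0.0, 0.0]),
        ("t2", "fold_B", [0.0, 1.0, 0.0]),
        ("t3", "fold_A", [0.9, 0.1, 0.0]),
    ]
    fold, template_id, score = assign_fold([0.8, 0.2, 0.0], toy_templates)
    print(fold, template_id, round(score, 3))
```

In this toy setting the query is assigned the fold of whichever single template scores highest, so the quality of the result depends entirely on how well the feature vectors separate fold types.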

Year:  2017        PMID: 28961795     DOI: 10.1093/bioinformatics/btx514

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  6 in total

1.  Profiles of Natural and Designed Protein-Like Sequences Effectively Bridge Protein Sequence Gaps: Implications in Distant Homology Detection.

Authors:  Gayatri Kumar; Narayanaswamy Srinivasan; Sankaran Sandhya
Journal:  Methods Mol Biol       Date:  2022

2.  Why can deep convolutional neural networks improve protein fold recognition? A visual explanation by interpretation.

Authors:  Yan Liu; Yi-Heng Zhu; Xiaoning Song; Jiangning Song; Dong-Jun Yu
Journal:  Brief Bioinform       Date:  2021-09-02       Impact factor: 11.622

3.  Improving protein fold recognition using triplet network and ensemble deep learning.

Authors:  Yan Liu; Ke Han; Yi-Heng Zhu; Ying Zhang; Long-Chen Shen; Jiangning Song; Dong-Jun Yu
Journal:  Brief Bioinform       Date:  2021-11-05       Impact factor: 13.994

4.  Protein threading using residue co-variation and deep learning.

Authors:  Jianwei Zhu; Sheng Wang; Dongbo Bu; Jinbo Xu
Journal:  Bioinformatics       Date:  2018-07-01       Impact factor: 6.937

5.  Network-based protein structural classification.

Authors:  Khalique Newaz; Mahboobeh Ghalehnovi; Arash Rahnama; Panos J Antsaklis; Tijana Milenković
Journal:  R Soc Open Sci       Date:  2020-06-03       Impact factor: 2.963

6.  DeepFrag-k: a fragment-based deep learning approach for protein fold recognition.

Authors:  Wessam Elhefnawy; Min Li; Jianxin Wang; Yaohang Li
Journal:  BMC Bioinformatics       Date:  2020-11-18       Impact factor: 3.169
