
A feature-based approach to predict hot spots in protein-DNA binding interfaces.

Sijia Zhang, Le Zhao, Chun-Hou Zheng, Junfeng Xia.

Abstract

Hot spot residues in DNA-binding proteins are the key interface residues that contribute most of the binding free energy of protein-DNA interfaces. Because experimental methods for identifying hot spots are expensive and time-consuming, computational approaches are needed to predict hot spots on a large scale. In this work, we systematically assessed 114 features derived from protein sequence, structure, network and solvent accessibility information, along with combinations of these features and various feature selection strategies, for hot spot prediction. We then trained and compared four commonly used machine learning models, namely support vector machine (SVM), random forest, naïve Bayes and k-nearest neighbor, for hot spot identification using 10-fold cross-validation and an independent test set. Our results show that (1) features based on solvent accessible surface area have a significant effect on hot spot prediction; (2) different but complementary features generally enhance prediction performance; and (3) SVM outperforms the other machine learning methods on both the training and independent test sets. To improve predictive performance, we developed a feature-based method, PrPDH (Prediction of Protein-DNA binding Hot spots), which predicts hot spots in protein-DNA binding interfaces using an SVM trained on 10 selected optimal features. Comparative results on benchmark data sets indicate that our predictor generally achieves better performance than state-of-the-art predictors. A user-friendly web server for PrPDH is freely available at http://bioinfo.ahu.edu.cn:8080/PrPDH.
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
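
The evaluation protocol named in the abstract, 10-fold cross-validation of a classifier such as k-nearest neighbor, can be illustrated with a minimal sketch. This is not the authors' PrPDH pipeline: the two-dimensional synthetic data, the helper names (`stratified_folds`, `knn_predict`, `cross_validate`, `make_toy_data`) and the choice of k = 5 are all assumptions made for illustration, and the real study used 114 residue-level features and an SVM as the final model.

```python
import math
import random
from collections import Counter


def stratified_folds(labels, n_folds, seed=0):
    """Split sample indices into n_folds folds with similar class ratios."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across folds.
        for pos, idx in enumerate(indices):
            folds[(pos + offset) % n_folds].append(idx)
        offset += len(indices)
    return folds


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points nearest to query (Euclidean)."""
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    votes = Counter(y for _, y in nearest)
    return votes.most_common(1)[0][0]


def cross_validate(features, labels, n_folds=10, k=5, seed=0):
    """Return the held-out accuracy of k-NN on each of n_folds folds."""
    accuracies = []
    for held_out in stratified_folds(labels, n_folds, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held]
        train_x = [features[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        correct = sum(
            knn_predict(train_x, train_y, features[i], k) == labels[i]
            for i in held_out
        )
        accuracies.append(correct / len(held_out))
    return accuracies


def make_toy_data(n_per_class=50, seed=1):
    """Synthetic stand-in data: two Gaussian clusters (0 = non-hot spot, 1 = hot spot)."""
    rng = random.Random(seed)
    features, labels = [], []
    for label, centre in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            features.append((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)))
            labels.append(label)
    return features, labels


features, labels = make_toy_data()
fold_accuracies = cross_validate(features, labels)
mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
print(f"mean accuracy over {len(fold_accuracies)} folds: {mean_accuracy:.3f}")
```

Each sample is held out exactly once, so the mean over folds estimates performance on unseen residues. Comparing several models, as the abstract describes, would amount to running the same folds with a different prediction function in place of `knn_predict`.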

Keywords:  feature selection; hot spot; machine learning; protein–DNA interaction; support vector machine

Year:  2020        PMID: 30957840     DOI: 10.1093/bib/bbz037

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622


  5 in total

1.  A Deep Learning-Based Method for Identification of Bacteriophage-Host Interaction.

Authors:  Menglu Li; Yanan Wang; Fuyi Li; Yun Zhao; Mengya Liu; Sijia Zhang; Yannan Bin; A Ian Smith; Geoffrey I Webb; Jian Li; Jiangning Song; Junfeng Xia
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2021-10-07       Impact factor: 3.702

2.  A polygenic stacking classifier revealed the complicated platelet transcriptomic landscape of adult immune thrombocytopenia.

Authors:  Chengfeng Xu; Ruochi Zhang; Meiyu Duan; Yongming Zhou; Jizhang Bao; Hao Lu; Jie Wang; Minghui Hu; Zhaoyang Hu; Fengfeng Zhou; Wenwei Zhu
Journal:  Mol Ther Nucleic Acids       Date:  2022-04-06       Impact factor: 10.183

3.  Prediction of hot spots in protein-DNA binding interfaces based on supervised isometric feature mapping and extreme gradient boosting.

Authors:  Ke Li; Sijia Zhang; Di Yan; Yannan Bin; Junfeng Xia
Journal:  BMC Bioinformatics       Date:  2020-09-17       Impact factor: 3.169

4.  mmCSM-NA: accurately predicting effects of single and multiple mutations on protein-nucleic acid binding affinity.

Authors:  Thanh Binh Nguyen; Yoochan Myung; Alex G C de Sá; Douglas E V Pires; David B Ascher
Journal:  NAR Genom Bioinform       Date:  2021-11-17

5.  Computationally identifying hot spots in protein-DNA binding interfaces using an ensemble approach.

Authors:  Yuliang Pan; Shuigeng Zhou; Jihong Guan
Journal:  BMC Bioinformatics       Date:  2020-09-17       Impact factor: 3.169

