Applying the Naïve Bayes classifier with kernel density estimation to the prediction of protein-protein interaction sites.

Yoichi Murakami, Kenji Mizuguchi.

Abstract

MOTIVATION: The limited availability of protein structures often restricts the functional annotation of proteins and the identification of their protein-protein interaction sites. Computational methods to identify interaction sites from protein sequences alone are, therefore, required for unraveling the functions of many proteins. This article describes a new method (PSIVER) to predict interaction sites, i.e. residues binding to other proteins, in protein sequences. Only sequence features (position-specific scoring matrix and predicted accessibility) are used for training a Naïve Bayes classifier (NBC), and conditional probabilities of each sequence feature are estimated using a kernel density estimation method (KDE).
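
A Naïve Bayes classifier whose per-feature class-conditional densities come from kernel density estimation can be sketched as follows. This is a minimal illustration only, not the PSIVER implementation: the Gaussian kernel, the fixed bandwidth of 0.5, the class name `KDENaiveBayes` and the toy two-feature data are all assumptions made for the example, whereas the real method uses position-specific scoring matrix values and predicted accessibility as features.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a 1-D density estimate built from Gaussian kernels (illustrative)."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


class KDENaiveBayes:
    """Naive Bayes with one KDE per (class, feature) pair; hypothetical helper."""

    def __init__(self, bandwidth=0.5):  # bandwidth value is an assumption
        self.bandwidth = bandwidth

    def fit(self, X, y):
        self.classes = sorted(set(y))
        n_features = len(X[0])
        self.log_prior = {}
        self.densities = {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            # Class prior P(c), estimated from class frequencies.
            self.log_prior[c] = math.log(len(rows) / len(X))
            # One independent density P(x_j | c) per feature j.
            self.densities[c] = [
                gaussian_kde([r[j] for r in rows], self.bandwidth)
                for j in range(n_features)
            ]
        return self

    def predict_proba(self, x):
        eps = 1e-300  # guards against log(0) when a density underflows
        log_joint = {
            c: self.log_prior[c]
            + sum(math.log(d(v) + eps) for d, v in zip(self.densities[c], x))
            for c in self.classes
        }
        # Normalize in log space for numerical stability.
        top = max(log_joint.values())
        weights = {c: math.exp(v - top) for c, v in log_joint.items()}
        total = sum(weights.values())
        return {c: w / total for c, w in weights.items()}


# Toy data: label 1 stands for "interface residue", 0 for "non-interface".
X = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0], [0.1, 0.2], [2.0, 1.9], [2.1, 2.2]]
y = [0, 0, 0, 0, 1, 1]
model = KDENaiveBayes(bandwidth=0.5).fit(X, y)
posterior = model.predict_proba([1.9, 2.0])
```

Here `posterior[1]` is the estimated probability that the query residue belongs to the interface class. Thresholding that probability gives a binary prediction, and on imbalanced data the choice of threshold trades precision against recall.
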
RESULTS: The leave-one-out cross-validation of PSIVER achieved a Matthews correlation coefficient (MCC) of 0.151, an F-measure of 35.3%, a precision of 30.6% and a recall of 41.6% on a non-redundant set of 186 protein sequences extracted from 105 heterodimers in the Protein Data Bank (consisting of 36 219 residues, of which 15.2% were known interface residues). Even though the dataset used for training was highly imbalanced, a randomization test demonstrated that the proposed method managed to avoid overfitting. PSIVER was also tested on 72 sequences not used in training (consisting of 18 140 residues, of which 10.6% were known interface residues), and achieved an MCC of 0.135, an F-measure of 31.5%, a precision of 25.0% and a recall of 46.5%, outperforming other publicly available servers tested on the same dataset. PSIVER enables experimental biologists to identify potential interface residues in unknown proteins from sequence information alone, and to mutate those residues selectively in order to unravel protein functions.
AVAILABILITY: Freely available on the web at http://tardis.nibio.go.jp/PSIVER/
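
The evaluation measures named in RESULTS (precision, recall, F-measure and MCC) follow from the four cells of a binary confusion matrix. The sketch below uses the standard textbook definitions. The function names and the toy counts are assumptions made for illustration, and the abstract does not state how the reported figures were aggregated across sequences.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_score = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f": f_score, "mcc": mcc}


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy counts (not from the paper): 100 true interface residues out of 1000.
toy = binary_metrics(tp=50, fp=50, fn=50, tn=850)

# Harmonic mean of the reported cross-validation precision and recall.
cv_f = round(100 * f_measure(0.306, 0.416), 1)  # 35.3
```

The harmonic mean of the cross-validation precision (30.6%) and recall (41.6%) comes to 35.3%, which matches the F-measure reported for that experiment.
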

Year:  2010        PMID: 20529890     DOI: 10.1093/bioinformatics/btq302

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


