Seeing the trees through the forest: sequence-based homo- and heteromeric protein-protein interaction sites prediction using random forest.

Qingzhen Hou, Paul F G De Geest, Wim F Vranken, Jaap Heringa, K Anton Feenstra.

Abstract

MOTIVATION: Genome sequencing is producing an ever-increasing amount of associated protein sequences. Few of these sequences have experimentally validated annotations, however, and computational predictions are becoming increasingly successful in producing such annotations. One key challenge remains the prediction of the amino acids in a given protein sequence that are involved in protein-protein interactions. Such predictions are typically based on machine learning methods that take advantage of the properties and sequence positions of amino acids that are known to be involved in interaction. In this paper, we evaluate the importance of various features using Random Forest (RF), and include as a novel feature backbone flexibility predicted from sequences to further optimise protein interface prediction.
RESULTS: We observe that there is no single sequence feature that enables pinpointing interacting sites in our Random Forest models. However, combining different properties does increase the performance of interface prediction. Our homomeric-trained RF interface predictor is able to distinguish interface from non-interface residues with an area under the ROC curve of 0.72 on a homomeric test set. The heteromeric-trained RF interface predictor performs better than existing predictors on an independent heteromeric test set. We trained a more general predictor on the combined homomeric and heteromeric dataset, and show that in addition to predicting homomeric interfaces, it is also able to pinpoint interface residues in heterodimers. This suggests that our Random Forest model and the features included capture common properties of both homodimer and heterodimer interfaces.
AVAILABILITY AND IMPLEMENTATION: The predictors and test datasets used in our analyses are freely available (http://www.ibi.vu.nl/downloads/RF_PPI/).
CONTACT: k.a.feenstra@vu.nl.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
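
The RESULTS paragraph reports performance as an area under the ROC curve (0.72 on the homomeric test set). As an illustration of what that number measures, and not the authors' evaluation code, the following minimal sketch computes the AUC from per-residue labels and predicted scores using the rank-sum (Mann-Whitney) identity. The function name `roc_auc` and the toy labels and scores are assumptions made up for this example; they are not data from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 1 for interface residues, 0 for non-interface residues.
    scores: predicted interface propensity, higher means more likely interface.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank shared by the ties
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one residue of each class")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Toy example (hypothetical values): 3 interface and 5 non-interface residues.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.35, 0.3, 0.8, 0.7, 0.2, 0.1]
print(roc_auc(toy_labels, toy_scores))  # 0.8
```

Read this way, an AUC of 0.72 means that a randomly chosen interface residue receives a higher score than a randomly chosen non-interface residue about 72% of the time; 0.5 corresponds to random ranking.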

Year:  2017        PMID: 28073761     DOI: 10.1093/bioinformatics/btx005

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  14 in total

1.  Protein-protein interaction sites prediction by ensemble random forests with synthetic minority oversampling technique.

Authors:  Xiaoying Wang; Bin Yu; Anjun Ma; Cheng Chen; Bingqiang Liu; Qin Ma
Journal:  Bioinformatics       Date:  2019-07-15       Impact factor: 6.937

2.  Multi-task learning to leverage partially annotated data for PPI interface prediction.

Authors:  Henriette Capel; K Anton Feenstra; Sanne Abeln
Journal:  Sci Rep       Date:  2022-06-21       Impact factor: 4.996

3.  ProB-Site: Protein Binding Site Prediction Using Local Features.

Authors:  Sharzil Haris Khan; Hilal Tayara; Kil To Chong
Journal:  Cells       Date:  2022-07-05       Impact factor: 7.666

4.  Deep Learning for Protein-Protein Interaction Site Prediction.

Authors:  Arian R Jamasb; Ben Day; Cătălina Cangea; Pietro Liò; Tom L Blundell
Journal:  Methods Mol Biol       Date:  2021

5.  Online biophysical predictions for SARS-CoV-2 proteins.

Authors:  Luciano Kagami; Joel Roca-Martínez; Jose Gavaldá-García; Pathmanaban Ramasamy; K Anton Feenstra; Wim F Vranken
Journal:  BMC Mol Cell Biol       Date:  2021-04-23

6.  PIPENN: Protein Interface Prediction from sequence with an Ensemble of Neural Nets.

Authors:  Bas Stringer; Hans de Ferrante; Sanne Abeln; Jaap Heringa; K Anton Feenstra; Reza Haydarlou
Journal:  Bioinformatics       Date:  2022-02-12       Impact factor: 6.937

7.  Scoring of protein-protein docking models utilizing predicted interface residues.

Authors:  Gabriele Pozzati; Petras Kundrotas; Arne Elofsson
Journal:  Proteins       Date:  2022-03-14

8.  O-GlcNAcylation Prediction: An Unattained Objective.

Authors:  Theo Mauri; Laurence Menu-Bouaouiche; Muriel Bardor; Tony Lefebvre; Marc F Lensink; Guillaume Brysbaert
Journal:  Adv Appl Bioinform Chem       Date:  2021-06-08

9.  Reciprocal Perspective for Improved Protein-Protein Interaction Prediction.

Authors:  Kevin Dick; James R Green
Journal:  Sci Rep       Date:  2018-08-03       Impact factor: 4.379

10.  Developing Computational Model to Predict Protein-Protein Interaction Sites Based on the XGBoost Algorithm.

Authors:  Aijun Deng; Huan Zhang; Wenyan Wang; Jun Zhang; Dingdong Fan; Peng Chen; Bing Wang
Journal:  Int J Mol Sci       Date:  2020-03-25       Impact factor: 5.923
