Protein-protein interaction sites prediction by ensemble random forests with synthetic minority oversampling technique.

Xiaoying Wang, Bin Yu, Anjun Ma, Cheng Chen, Bingqiang Liu, Qin Ma.

Abstract

MOTIVATION: The prediction of protein-protein interaction (PPI) sites is key to mutation design, catalytic reactions and the reconstruction of PPI networks. It is a challenging task because of the large number of available sequences and the class imbalance among samples.
RESULTS: This study proposes a new ensemble learning-based method for PPI site prediction: Ensemble Learning of synthetic minority oversampling technique (SMOTE) for Unbalancing samples and RF algorithm (EL-SMURF). Sequence profile features and residue evolution rates were combined, and features of neighboring residues were extracted using a sliding window. SMOTE was then applied to oversample interface residues in the feature space to address the imbalance problem. The Multi-dimensional Scaling feature selection method was used to reduce feature redundancy and select a feature subset. Finally, Random Forest (RF) classifiers were used to build the ensemble learning model, and the optimal feature vectors were fed into EL-SMURF to predict PPI sites. On two independent validation datasets, EL-SMURF achieved 77.1% and 77.7% accuracy, which were 6.2-15.7% and 6.1-18.9% higher, respectively, than other existing tools.
AVAILABILITY AND IMPLEMENTATION: The source codes and data used in this study are publicly available at http://github.com/QUST-AIBBDRC/EL-SMURF/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
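
The sliding-window feature extraction and SMOTE oversampling steps described under RESULTS can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names `window_features` and `smote`, the window size, the neighbor count `k`, and the toy per-residue profile are all assumptions made for the example, and the feature selection and Random Forest ensemble stages are omitted.

```python
import random


def window_features(profile, window=3):
    """Concatenate per-residue feature rows over a centered window.

    Positions that fall off either end of the sequence are zero-padded
    (an assumption; the abstract does not state how ends are handled).
    """
    half = window // 2
    pad = [0.0] * len(profile[0])
    feats = []
    for i in range(len(profile)):
        row = []
        for j in range(i - half, i + half + 1):
            row.extend(profile[j] if 0 <= j < len(profile) else pad)
        feats.append(row)
    return feats


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by interpolation.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbors.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbors = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


# Hypothetical toy data: 6 residues, 2 features each; label 1 = interface.
profile = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.3],
           [0.5, 0.1], [0.7, 0.6], [0.2, 0.4]]
labels = [0, 1, 0, 0, 1, 0]

X = window_features(profile, window=3)
minority = [x for x, y in zip(X, labels) if y == 1]
majority = [x for x, y in zip(X, labels) if y == 0]

# Oversample interface residues until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=1)
balanced_X = X + new_points
balanced_y = labels + [1] * len(new_points)
```

In a real pipeline, oversampling of this kind is normally applied to the training split only, so that synthetic samples do not leak into the validation data.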

Year: 2019    PMID: 30520961    PMCID: PMC6612859    DOI: 10.1093/bioinformatics/bty995

Source DB: PubMed    Journal: Bioinformatics    ISSN: 1367-4803    Impact factor: 6.937


