Tanmoy Jana1, Abhirupa Ghosh2, Sukhen Das Mandal1, Raja Banerjee2,3, Sudipto Saha1.
Abstract
PPIMpred is a web server that allows high-throughput screening of small molecules for targeting specific protein-protein interactions, namely Mdm2/P53, Bcl2/Bak and c-Myc/Max. Three different kernels of support vector machine (SVM), namely linear, polynomial and radial basis function (RBF), and two other machine learning techniques, Naive Bayes and Random Forest, were used to train the models. A fivefold cross-validation technique was used to measure the performance of these classifiers. The RBF kernel of SVM outperformed or was comparable with all other methods, with accuracy values of 83%, 79% and 90% for Mdm2/P53, Bcl2/Bak and c-Myc/Max, respectively. About 80% of the predicted SVM scores of training/testing datasets from Mdm2/P53 and Bcl2/Bak have significant IC50 values and docking scores. The proposed models achieved an accuracy of 66-90% with blind sets. The three proposed models (Mdm2/P53, Bcl2/Bak and c-Myc/Max) were used to screen a large dataset of 265 242 small chemicals from the National Cancer Institute open database. To further assess the robustness of this approach, hits with high and random SVM scores were used for molecular docking in AutoDock Vina, wherein the molecules with high and random predicted SVM scores yielded moderately significant docking scores (p-values < 0.1). In addition to the above-mentioned classification scheme, this web server also allows users to get the structural and chemical similarities with known chemical modulators or drug-like molecules based on the Tanimoto coefficient similarity search algorithm. PPIMpred is freely available at http://bicresources.jcbose.ac.in/ssaha4/PPIMpred/.
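The fivefold cross-validation mentioned in the abstract can be illustrated with a small splitting routine. This is only a sketch under stated assumptions: the record does not say whether the folds were stratified, so stratification by class is an assumption here, and the function name `stratified_kfold`, the seed and the toy labels are hypothetical.

```python
import random


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs with a roughly constant class ratio per fold.

    Assumption: folds are stratified by class; the source does not confirm this.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]

    # Group sample indices by class label.
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)

    # Shuffle within each class, then deal indices round-robin into the folds.
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[j % k].append(i)

    # Each fold serves once as the test set; the rest form the training set.
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


# Toy labels with a 1 : 7 positive-to-negative ratio (invented for illustration).
labels = [1] * 10 + [0] * 70
splits = list(stratified_kfold(labels, k=5))
```

With these toy labels every test fold holds 2 positives and 14 negatives, so each fold preserves the 1 : 7 ratio of the full set.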
Keywords: docking; modulators; protein–protein interaction; support vector machine
Year: 2017 PMID: 28484602 PMCID: PMC5414239 DOI: 10.1098/rsos.160501
Source DB: PubMed Journal: R Soc Open Sci ISSN: 2054-5703 Impact factor: 2.963
Comparison of performance (fivefold cross-validation) using three different kernels of SVM (linear, polynomial and radial basis function), Naive Bayes and Random Forest on the (a) Mdm2/P53 (1 : 7), (b) Bcl2/Bak (1 : 2) and (c) c-Myc/Max (1 : 10) testing datasets.
| methods | sensitivity | specificity | accuracy | F1 score | PPV | AUC |
|---|---|---|---|---|---|---|
| (a) Mdm2/P53 (1 : 7) | | | | | | |
| SVM linear | 0.68 | 0.71 | 0.70 | 0.36 | 0.41 | 0.77 |
| SVM poly | 0.64 | 0.60 | 0.61 | 0.35 | 0.32 | 0.63 |
| SVM RBF | 0.83 | 0.82 | 0.83 | 0.45 | 0.57 | 0.88 |
| Naive Bayes | 0.16 | 0.97 | 0.87 | 0.22 | 0.39 | 0.83 |
| Random forest | 0.69 | 0.99 | 0.95 | 0.77 | 0.88 | 0.93 |
| (b) Bcl2/Bak (1 : 2) | | | | | | |
| SVM linear | 0.73 | 0.60 | 0.65 | 0.67 | 0.62 | 0.69 |
| SVM poly | 0.60 | 0.49 | 0.53 | 0.49 | 0.48 | 0.61 |
| SVM RBF | 0.86 | 0.75 | 0.79 | 0.72 | 0.77 | 0.83 |
| Naive Bayes | 0.70 | 0.87 | 0.81 | 0.73 | 0.76 | 0.87 |
| Random forest | 0.87 | 0.94 | 0.92 | 0.88 | 0.90 | 0.95 |
| (c) c-Myc/Max (1 : 10) | | | | | | |
| SVM linear | 0.80 | 0.93 | 0.92 | 0.60 | 0.65 | 0.89 |
| SVM poly | 0.8 | 0.90 | 0.89 | 0.47 | 0.58 | 0.89 |
| SVM RBF | 0.87 | 0.91 | 0.90 | 0.50 | 0.63 | 0.91 |
| Naive Bayes | 0.67 | 0.95 | 0.93 | 0.63 | 0.59 | 0.86 |
| Random forest | 0.4 | 0.99 | 0.94 | 0.55 | 0.86 | 0.89 |
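The column headings of the table follow the standard confusion-matrix definitions. The sketch below shows how sensitivity, specificity, accuracy, PPV and F1 score are derived from true/false positive and negative counts. The function name and the example counts are hypothetical and are not taken from the paper; they only mimic a 1 : 7 class ratio.

```python
def _ratio(numerator, denominator):
    """Divide safely; return NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": _ratio(tp, tp + fn),        # true positive rate (recall)
        "specificity": _ratio(tn, tn + fp),        # true negative rate
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "ppv": _ratio(tp, tp + fp),                # positive predictive value (precision)
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),    # harmonic mean of PPV and sensitivity
    }


# Invented counts: 100 positives and 700 negatives (a 1 : 7 ratio).
metrics = classification_metrics(tp=80, fp=70, tn=630, fn=20)
```

In this invented example, accuracy is about 0.89 while PPV is only about 0.53. On imbalanced datasets a modest false positive rate produces many false positives relative to the true positives, so accuracy alone can look better than the precision-based columns.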
Figure 1. (a) The ROC plot for Mdm2/P53 1 : 1 (P : N) dataset with 0.99 similarity (blue), 0.90 similarity (red), 0.80 similarity (green) and randomization trial (black). (b) The ROC plot for Bcl2/Bak 1 : 1 (P : N) dataset with 0.99 similarity (blue), 0.90 similarity (red), 0.80 similarity (green) and randomization trial (black).
Figure 2. Box plot showing the binding free energy of top hits, low hits and random hits of Bcl2/Bak.
Figure 3. (a) The home page, consisting of a submission form for molecular descriptors, target selection and threshold value selection. (b) The result page of a prediction shows ‘prediction result’, ‘tabular result’ and ‘graphical result’. (c) The similarity search page, where users can input a molecule either by drawing it in the JME editor or by pasting a file in MOL 2D format. (d) The similarity search result shows the list of compounds similar to the query structure.
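The similarity search relies on the Tanimoto coefficient, which for two binary fingerprints is the number of shared on-bits divided by the number of bits set in either. The following is a minimal sketch, assuming fingerprints are represented as sets of on-bit indices. The fingerprint type used by PPIMpred is not stated in this record, and the function names, the threshold and the toy molecules are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def rank_by_similarity(query, library, threshold=0.0):
    """Return (name, score) pairs with score >= threshold, most similar first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy fingerprints; the bit positions are made up for illustration.
library = {
    "mol_A": {1, 2, 3, 4},
    "mol_B": {1, 2, 7},
    "mol_C": {8, 9},
}
query = {1, 2, 3}
hits = rank_by_similarity(query, library, threshold=0.3)
```

Here `mol_A` shares three of four bits with the query (score 0.75), `mol_B` shares two of four (score 0.5), and `mol_C` shares none, so it falls below the threshold and is excluded.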