Ibrahim Ahmed, Peter Witbooi, Alan Christoffels.
Abstract
Motivation: Triplet amino acids have successfully been included in feature selection to predict human-HPV protein-protein interactions (PPI). The utility of supervised learning methods is curtailed because experimental data are not available in sufficient quantities. Improvements in machine learning techniques and feature selection will enhance the study of PPI between host and pathogen.
Year: 2018 PMID: 29945178 PMCID: PMC6289132 DOI: 10.1093/bioinformatics/bty504
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 2. Neural network architecture. The architecture of the neural network used to predict host-pathogen PPI. The network has four layers; the number of nodes in the input and hidden layers was varied. The network shown has 16 nodes in the input layer, 20 nodes in the first hidden layer, 20 nodes in the second hidden layer and 1 node in the output layer.
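The layer sizes in the caption can be illustrated with a small forward pass. This is a minimal sketch, not the authors' implementation: only the 16-20-20-1 layout comes from the caption, while the ReLU hidden activations, the sigmoid output, the random weight initialisation and the placeholder input vector are all assumptions made for illustration.

```python
import math
import random

# Layer sizes taken from the figure caption: input, hidden 1, hidden 2, output.
LAYER_SIZES = [16, 20, 20, 1]


def init_network(layer_sizes, seed=0):
    """Create one (weights, biases) pair per connection between adjacent layers."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        # Small random weights (assumed initialisation, not from the paper).
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, features):
    """Propagate one feature vector through the network.

    Hidden layers use ReLU and the output uses a sigmoid; both are assumptions.
    """
    activations = list(features)
    last = len(layers) - 1
    for index, (weights, biases) in enumerate(layers):
        pre = [sum(w * a for w, a in zip(row, activations)) + b
               for row, b in zip(weights, biases)]
        if index == last:
            activations = [1.0 / (1.0 + math.exp(-z)) for z in pre]
        else:
            activations = [max(0.0, z) for z in pre]
    return activations[0]


network = init_network(LAYER_SIZES)
example_features = [0.1 * i for i in range(16)]  # placeholder input, not real data
score = forward(network, example_features)  # interaction score between 0 and 1
```

With a sigmoid output, the single output node can be read as a score for the "interacting" class, which is then thresholded to give a binary prediction.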
Comparison of the performance of models generated using the triplets feature, as in Cui, versus the quadruplets feature of the current paper
| Method | SN (%) | SP (%) | AC (%) |
|---|---|---|---|
| Triplets | 80.5 | 89.7 | 85.1 |
| Quadruplets | 92.5 | 91.1 | 88.3 |
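The three columns can be related to confusion-matrix counts. The sketch below assumes that SN, SP and AC denote sensitivity, specificity and accuracy in their usual definitions; the counts in the usage line are invented for illustration and are not taken from the paper.

```python
def sensitivity_specificity_accuracy(tp, fn, tn, fp):
    """Return (SN, SP, AC) in percent from confusion-matrix counts.

    tp/fn: interacting pairs predicted correctly/incorrectly
    tn/fp: non-interacting pairs predicted correctly/incorrectly
    """
    sn = 100.0 * tp / (tp + fn)                   # fraction of true interactions recovered
    sp = 100.0 * tn / (tn + fp)                   # fraction of non-interactions rejected
    ac = 100.0 * (tp + tn) / (tp + fn + tn + fp)  # fraction of all pairs classified correctly
    return sn, sp, ac


# Hypothetical counts, chosen only to exercise the function.
sn, sp, ac = sensitivity_specificity_accuracy(tp=90, fn=10, tn=80, fp=20)
```

Under these definitions, accuracy is a weighted average of sensitivity and specificity, so it always lies between the two.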
Comparison of performance on the Indep task (B. anthracis) of the multi-task learning model of (Kshirsagar ) versus the quadruplets feature of the current paper
| Model | F1 score | Std |
|---|---|---|
| Our model | 57.36 | 0.089 |
| Multi-task learning model (Kshirsagar ) | 27.8 | 4.0 |
Note: Table 2 reports the performance of our model on the dataset used by (Kshirsagar ). The dataset is a subset of their multi-task data; specifically, we used the human-B. anthracis data on the Indep task.
Model performance (average accuracy, CV score, F1 score and Std, all in %) of 12 different feature sets, implemented using SVM and neural network
| Feature set | SVM Accuracy | SVM Score | SVM F1_score | SVM Std | Neural network Accuracy | Neural network Score | Neural network F1_score | Neural network Std |
|---|---|---|---|---|---|---|---|---|
| Triplet | 90.49 | 87.00 | 61.23 | 00.00 | 91.5649 | 83.7794 | 61.2016 | 00.1683 |
| Triplet_degree | 89.94 | 81.39 | 65.2106 | 01.2978 | 91.1869 | 81.6814 | 66.2411 | 01.2448 |
| Triplet_cluster | 91.04 | 81.39 | 65.6041 | 01.3876 | 90.7026 | 82.2588 | 65.9132 | 01.7797 |
| Triplet_between | 90.09 | 80.88 | 65.2766 | 00.4142 | 90.7799 | 81.5668 | 65.9522 | 00.7874 |
| Triplet_similarity | 89.99 | 81.84 | 65.0589 | 01.2321 | 92.0279 | 81.7859 | 65.6692 | 01.3124 |
| Triplet_all | 91.09 | 82.20 | 65.6563 | 00.7196 | 93.2626 | 83.0365 | 65.5762 | 00.8151 |
| Quadruplet | 91.693 | 81.3968 | 66.3107 | 00.7334 | 91.0106 | 82.8321 | 65.7306 | 00.7615 |
| Quadruplet_degree | 92.317 | 83.9685 | 66.6005 | 01.8392 | 91.4632 | 82.8311 | 66.0958 | 00.3209 |
| Quadruplet_between | 92.492 | 82.6440 | 66.3309 | 01.7438 | 90.8393 | 82.9580 | 65.6902 | 01.6715 |
| Quadruplet_cluster | 92.755 | 84.0455 | 66.5803 | 00.6979 | 92.3635 | 83.9358 | 66.5801 | 01.0428 |
| Quadruplet_similarity | 92.464 | 82.2044 | 66.5126 | 01.0792 | 92.6595 | 82.7353 | 65.6109 | 00.4150 |
| Quadruplet_all | 92.271 | 85.4418 | 66.2581 | 00.4571 | 94.5758 | 86.9634 | 66.4710 | 00.3613 |
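The F1_score and Std columns can be read as a per-fold F1 summarised across cross-validation folds. The sketch below is one plausible way to compute such a summary and is not the authors' procedure: the per-fold counts are hypothetical, and the use of the population standard deviation (`pstdev`) rather than the sample standard deviation is an assumption.

```python
from statistics import mean, pstdev


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; returns 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical (tp, fp, fn) counts for three folds; not data from the paper.
fold_counts = [(8, 2, 2), (6, 4, 2), (9, 1, 3)]

# F1 per fold, expressed in percent to match the table's units.
fold_f1 = [100.0 * f1_score(*counts) for counts in fold_counts]

# Mean F1 and its spread across folds.
summary = (mean(fold_f1), pstdev(fold_f1))
```

Reporting the spread alongside the mean indicates how sensitive a feature set is to the particular train/test split.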
Fig. 3. Precision-recall curve showing a neural network implementation for the quadruplet feature combined with network features and sequence similarity
Fig. 4. ROC curve showing a neural network implementation for the quadruplet feature combined with network features and sequence similarity
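A ROC curve of this kind is obtained by sweeping a decision threshold over the classifier's output scores and recording the false-positive and true-positive rates at each step. The sketch below shows the standard construction with trapezoidal area under the curve; the labels and scores are made-up placeholders, not the paper's predictions.

```python
def roc_points(labels, scores):
    """Return (false-positive rate, true-positive rate) points for a threshold sweep."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only when the next score differs, so ties share one threshold.
        if i == len(ranked) - 1 or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Placeholder data: 1 = interacting pair, 0 = non-interacting pair.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
area = auc(curve)
```

The area equals the probability that a randomly chosen positive pair is scored above a randomly chosen negative pair, which makes it a threshold-independent summary of the curve.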
Enriched molecular function GO terms, obtained using the DAVID database, for human proteins predicted by the artificial neural network to interact with proteins of B. anthracis
| GO Term | Description | |
|---|---|---|
| GO:0008066 | Glutamate receptor activity | 3.6253776435E–033 |
| GO:0020037 | Heme binding | 3.9274924471E–017 |
| GO:0046906 | Tetrapyrrole binding | 3.9274924471E–018 |
| GO:0010851 | Cyclase regulator activity | 1.5105740181E–011 |
| GO:0004672 | Protein kinase activity | 8.4592145015E–09 |
| GO:0004674 | Protein serine/threonine kinase activity | 6.6465256798E–014 |
| GO:0051119 | Sugar transmembrane transporter activity | 1.8126888218E–09 |
| GO:0005355 | Glucose transmembrane transporter activity | 1.5105740181E–006 |
| GO:0019825 | Oxygen binding | 2.1148036254E–013 |