Tanlin Sun, Bo Zhou, Luhua Lai, Jianfeng Pei.
Abstract
BACKGROUND: Protein-protein interactions (PPIs) are critical for many biological processes. It is therefore important to develop accurate high-throughput methods for identifying PPIs to better understand protein function, disease occurrence, and therapy design. Although various computational methods for predicting PPIs have been developed, their robustness when predicting on external datasets is unknown. Deep-learning algorithms have achieved successful results in diverse areas, but their effectiveness for PPI prediction has not been tested.
Keywords: Deep learning; Protein-protein interaction
Year: 2017 PMID: 28545462 PMCID: PMC5445391 DOI: 10.1186/s12859-017-1700-2
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 The structure of a stacked autoencoder (SAE)
The 10-CV training performance of the pre-training models and their prediction performances on test sets
| Code | Sen. | Spe. | Pre. | Acc. | Test set acc. | NR-test set acc. |
|---|---|---|---|---|---|---|
| AC | 0.9806 | 0.9588 | 0.9581 | 0.9695 | 0.9682 | 0.9591 |
| CT | 0.9542 | 0.9367 | 0.9357 | 0.9452 | 0.9447 | 0.9312 |
Columns 2–5 show the 10-CV results; standard deviations ranged from 0.001 to 0.003
Test set acc.: prediction accuracy for the hold-out test set
NR-test set acc.: prediction accuracy for the NR-test set
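The column abbreviations in the table above are not defined on this page. Assuming the usual definitions (Sen. = sensitivity, Spe. = specificity, Pre. = precision, Acc. = accuracy), they can be computed from confusion-matrix counts as in the minimal sketch below. The function name and the example counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute four standard binary-classification metrics from confusion counts.

    tp/fn: interacting pairs predicted as interacting / non-interacting
    tn/fp: non-interacting pairs predicted as non-interacting / interacting
    """
    return {
        "sensitivity": tp / (tp + fn),            # fraction of true PPIs recovered
        "specificity": tn / (tn + fp),            # fraction of non-PPIs rejected
        "precision": tp / (tp + fp),              # fraction of predicted PPIs that are real
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts chosen only for illustration (not from the paper).
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The sketch assumes every denominator is non-zero; a production version would need to handle empty classes explicitly.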
The 10-CV training performance of the final model
| Code | Sen. | Spe. | Pre. | Acc. |
|---|---|---|---|---|
| AC | 0.9806 | 0.9634 | 0.9627 | 0.9719 |
Columns 2–5 show the 10-CV training results; standard deviations ranged from 0.001 to 0.003
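The "10-CV" figures reported in these tables are means over ten cross-validation folds, with the standard deviation quoted in the footnote. The sketch below illustrates one common way to build such folds and summarize per-fold scores. It is an illustration under stated assumptions, not the authors' pipeline: the helper names, the shuffling seed, and the use of the sample standard deviation are all assumptions.

```python
import random
import statistics


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]     # k disjoint, near-equal folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def summarize(scores):
    """Return the mean and sample standard deviation of per-fold scores."""
    return statistics.mean(scores), statistics.stdev(scores)


# Hypothetical usage: 25 samples split into 10 folds.
splits = list(k_fold_indices(25, k=10))
print(len(splits), [len(test) for _, test in splits])

# Hypothetical per-fold accuracies, for illustration only.
mean_acc, sd_acc = summarize([0.97, 0.96, 0.98])
```

Each sample appears in exactly one test fold, so every pair is scored once by a model that did not see it during training.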
Comparison of the 10-CV training accuracy to those of previous methods using the same dataset
| References | Algorithm | Training |
|---|---|---|
| [ | SVM | 0.83 |
| [ | SVM | 0.9037 |
| [ | LDA-ROF | 0.9790 |
| [ | CS-SVM | 0.9400 |
| [ | ELM | 0.8480 |
| [ | SVM | 0.9200–0.9740 |
| Our Model | SAE | 0.9719 |
Prediction performance of the final model on external datasets
| Dataset name | Samples | Acc. | Pan et al.’s acc. |
|---|---|---|---|
| 2010 HPRD | 9214 | 0.9921 | 0.8915 |
| 2010 HPRD NR | 1482 | 0.9714 | 0.8670 |
| DIP | 2908 | 0.9377 | 0.9004 |
| HIPPIE HQ | 30074 | 0.9224 | 0.8501 |
| HIPPIE LQ | 220442 | 0.8704 | -- |
| inWeb_inbiomap HQ | 155465 | 0.9114 | -- |
| inWeb_inbiomap LQ | 459231 | 0.8799 | -- |
HQ: high quality; LQ: low quality
Training performance on PPIs from other species
| Species | Sen. | Spe. | Pre. | Acc. | Guo et al.’s acc. |
|---|---|---|---|---|---|
| | 0.9689 | 0.9528 | 0.9518 | 0.9605 | 0.9273 |
| | 0.9951 | 0.9628 | 0.9616 | 0.9784 | 0.9009 |
| | 0.9935 | 0.9528 | 0.9508 | 0.9723 | 0.9751 |
Columns 2–5 show the 5-CV training results; standard deviations ranged from 0.001 to 0.003