Long Zhang, Guoxian Yu, Maozu Guo, Jun Wang.
Abstract
BACKGROUND: Identifying protein-protein interactions (PPIs) is of paramount importance for understanding cellular processes. Machine learning-based approaches have been developed to predict PPIs, but the effectiveness of these approaches is unsatisfactory. One major reason is that they randomly choose non-interacting protein pairs (negative samples) or heuristically select non-interacting pairs with low quality.
Keywords: Deep neural networks; Non-interacting proteins; Protein-protein interactions; Random walk; Sequence similarity
Year: 2018 PMID: 30598096 PMCID: PMC6311908 DOI: 10.1186/s12859-018-2525-3
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
The 18 PPIs datasets used in this paper
| Groups | Datasets | # Positive samples | # Negative samples |
|---|---|---|---|
| SC | NIP-SS | 17257 | 17257 |
| | NIP-RW | 17257 | 17257 |
| | Subcellular location | 17257 | 17257 |
| | Random pairing | 17257 | 17257 |
| HS | NIP-SS | 3355 | 3355 |
| | NIP-RW | 3355 | 3355 |
| | Subcellular location | 3355 | 3355 |
| | Random pairing | 3355 | 3355 |
| MM | NIP-SS | 923 | 923 |
| | NIP-RW | 923 | 923 |
| | Subcellular location | 923 | 923 |
| | Random pairing | 923 | 923 |
| Test | | 4013 | 0 |
| | | 6984 | 0 |
| | | 1412 | 0 |
| | | 1420 | 0 |
| | | 313 | 0 |
| | | 0 | 1937 |

SC: S. cerevisiae; HS: H. sapiens; MM: M. musculus; Test: six independent testing datasets. Within each species group, the datasets are listed by the method used to construct the negative samples: NIP-SS, NIP-RW, subcellular location, or random pairing.
Fig. 1 The flowchart of constructing reliable negative samples. The left and right parts of the figure describe the NIP-SS and NIP-RW strategies, respectively
The original values of the seven physicochemical properties for each amino acid
| Code | H1 | H2 | V | P1 | P2 | SASA | NCI |
|---|---|---|---|---|---|---|---|
| A | 0.62 | -0.5 | 27.5 | 8.1 | 0.046 | 1.181 | 0.007187 |
| C | 0.29 | -1 | 44.6 | 5.5 | 0.128 | 1.461 | -0.03661 |
| D | -0.9 | 3 | 40 | 13 | 0.105 | 1.587 | -0.02382 |
| E | -0.74 | 3 | 62 | 12.3 | 0.151 | 1.862 | 0.006802 |
| F | 1.19 | -2.5 | 115.5 | 5.2 | 0.29 | 2.228 | 0.037552 |
| G | 0.48 | 0 | 0 | 9 | 0 | 0.881 | 0.179052 |
| H | -0.4 | -0.5 | 79 | 10.4 | 0.23 | 2.025 | -0.01069 |
| I | 1.38 | -1.8 | 93.5 | 5.2 | 0.186 | 1.81 | 0.021631 |
| K | -1.5 | 3 | 100 | 11.3 | 0.219 | 2.258 | 0.017708 |
| L | 1.06 | -1.8 | 93.5 | 4.9 | 0.186 | 1.931 | 0.051672 |
| M | 0.64 | -1.3 | 94.1 | 5.7 | 0.221 | 2.034 | 0.002683 |
| N | -0.78 | 2 | 58.7 | 11.6 | 0.134 | 1.655 | 0.005392 |
| P | 0.12 | 0 | 41.9 | 8 | 0.131 | 1.468 | 0.239531 |
| Q | -0.85 | 0.2 | 80.7 | 10.5 | 0.18 | 1.932 | 0.049211 |
| R | -2.53 | 3 | 105 | 10.5 | 0.291 | 2.56 | 0.043587 |
| S | -0.18 | 0.3 | 29.3 | 9.2 | 0.062 | 1.298 | 0.004627 |
| T | -0.05 | -0.4 | 51.3 | 8.6 | 0.108 | 1.525 | 0.003352 |
| V | 1.08 | -1.5 | 71.5 | 5.9 | 0.14 | 1.645 | 0.057004 |
| W | 0.81 | -3.4 | 145.5 | 5.4 | 0.409 | 2.663 | 0.037977 |
| Y | 0.26 | -2.3 | 117.3 | 6.2 | 0.298 | 2.368 | 0.023599 |
H1: hydrophobicity; H2: hydrophilicity; V: volume of side chains; P1: polarity; P2: polarizability; SASA: solvent accessible surface area; NCI: net charge index of side chains
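The table above feeds sequence descriptors such as auto covariance (AC). The following is a minimal sketch of an AC encoding, not the authors' code. It assumes the commonly used AC definition (z-score normalisation of each property over the 20 amino acids, followed by lagged covariance along the sequence) and a maximum lag of 30, which is inferred from the 210-dimensional per-protein AC vector reported later (7 properties × 30 lags). The names `PROPS`, `normalize` and `auto_covariance` are illustrative.

```python
from statistics import mean, pstdev

# Values copied from the table above, in the order H1, H2, V, P1, P2, SASA, NCI.
PROPS = {
    "A": [0.62, -0.5, 27.5, 8.1, 0.046, 1.181, 0.007187],
    "C": [0.29, -1, 44.6, 5.5, 0.128, 1.461, -0.03661],
    "D": [-0.9, 3, 40, 13, 0.105, 1.587, -0.02382],
    "E": [-0.74, 3, 62, 12.3, 0.151, 1.862, 0.006802],
    "F": [1.19, -2.5, 115.5, 5.2, 0.29, 2.228, 0.037552],
    "G": [0.48, 0, 0, 9, 0, 0.881, 0.179052],
    "H": [-0.4, -0.5, 79, 10.4, 0.23, 2.025, -0.01069],
    "I": [1.38, -1.8, 93.5, 5.2, 0.186, 1.81, 0.021631],
    "K": [-1.5, 3, 100, 11.3, 0.219, 2.258, 0.017708],
    "L": [1.06, -1.8, 93.5, 4.9, 0.186, 1.931, 0.051672],
    "M": [0.64, -1.3, 94.1, 5.7, 0.221, 2.034, 0.002683],
    "N": [-0.78, 2, 58.7, 11.6, 0.134, 1.655, 0.005392],
    "P": [0.12, 0, 41.9, 8, 0.131, 1.468, 0.239531],
    "Q": [-0.85, 0.2, 80.7, 10.5, 0.18, 1.932, 0.049211],
    "R": [-2.53, 3, 105, 10.5, 0.291, 2.56, 0.043587],
    "S": [-0.18, 0.3, 29.3, 9.2, 0.062, 1.298, 0.004627],
    "T": [-0.05, -0.4, 51.3, 8.6, 0.108, 1.525, 0.003352],
    "V": [1.08, -1.5, 71.5, 5.9, 0.14, 1.645, 0.057004],
    "W": [0.81, -3.4, 145.5, 5.4, 0.409, 2.663, 0.037977],
    "Y": [0.26, -2.3, 117.3, 6.2, 0.298, 2.368, 0.023599],
}


def normalize(props):
    """Z-score each property column over the 20 amino acids (assumed preprocessing)."""
    columns = list(zip(*props.values()))
    mus = [mean(col) for col in columns]
    sds = [pstdev(col) for col in columns]
    return {
        aa: [(v - m) / s for v, m, s in zip(values, mus, sds)]
        for aa, values in props.items()
    }


def auto_covariance(sequence, max_lag=30):
    """Return a (7 * max_lag)-dimensional AC vector for one protein sequence."""
    z = normalize(PROPS)
    x = [z[aa] for aa in sequence if aa in z]  # skip non-standard residues
    n = len(x)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    n_props = len(x[0])
    seq_means = [mean(row[j] for row in x) for j in range(n_props)]
    features = []
    for lag in range(1, max_lag + 1):
        for j in range(n_props):
            total = sum(
                (x[i][j] - seq_means[j]) * (x[i + lag][j] - seq_means[j])
                for i in range(n - lag)
            )
            features.append(total / (n - lag))
    return features


# Toy sequences; a protein pair is encoded by concatenating both vectors.
protein_a = "ACDEFGHIKLMNPQRSTVWY" * 2
protein_b = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQ"
pair_vector = auto_covariance(protein_a) + auto_covariance(protein_b)
```

With these assumptions each protein yields 210 values and a pair yields 420, which is consistent with the "(210+210)" dimension listed for the AC descriptor.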
Fig. 2 The framework of our deep neural networks for protein-protein interaction prediction
Recommended parameters of our model
| Name | Range | Recommended |
|---|---|---|
| Learning rate | 1,0.1,0.001,0.002,0.003,0.0001 | 0.002 |
| Batch size | 32,64,128,256,512,1024,2056 | 1024,2056 |
| Weight initialization | uniform, normal, lecun_uniform, glorot_normal, glorot_uniform | glorot_normal |
| Per-parameter adaptive learning rate | SGD, RMSprop, Adagrad, Adadelta, Adam, Adamax, Nadam | Adam |
| Activation function | relu, tanh, sigmoid, softmax, softplus | relu, sigmoid |
| Dropout rate | 0.5, 0.6, 0.7 | 0.6 |
| Depth | 2, 3, 4, 5, 6, 7, 8,9 | 3 |
| Width | 16, 32, 64, 128, 256, 1024, 2048, 4096 | 128, 64, 32 |
| GPU | Yes, No | Yes |
Optimal parameters of the compared methods
| Method | Name | Parameters | | | |
|---|---|---|---|---|---|
| Guo’s work [ | SVM+AC | C | γ | Kernel | |
| | | 32768.0 | 0.074325444687670064 | Poly | |
| Yang’s work [ | | n_neighbors | Weights | Algorithm | p |
| | | 3 | Distance | Auto | 1 |
| Zhou’s work [ | SVM+LD | C | γ | Kernel | |
| | | 3.1748021 | 0.07432544468767006 | rbf | |
| You’s work [ | RF+MCD | n_estimators | Max_features | Criterion | Bootstrap |
| | | 5000 | Auto | Gini | True |
The degree distribution of proteins of different datasets on S. cerevisiae, H. sapiens, and M. musculus
| Species | Dataset | Average degree | Maximum degree | Number | 1<degree<10 | 10<degree<20 | 20<degree<30 | 30<degree<50 | 50<degree<70 | 70<degree<80 | 80<degree<100 | 100<degree<150 | degree>150 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S. cerevisiae | Positive | 6.8763 | 252 | 4382 | 0.7976/0.3418 | 0.1130/0.2131 | 0.0436/0.1372 | 0.0299/0.1429 | 0.0087/0.0667 | 0.0014/0.0132 | 0.0030/0.0341 | 0.0025/0.0382 | 0.0005/0.0128 |
| | NIP-SS-NonControl | 8.0493 | 1439 | 3814 | 0.8589/0.2841 | 0.0679/0.1092 | 0.0260/0.0723 | 0.0207/0.0905 | 0.0105/0.0680 | 0.0029/0.0238 | 0.0026/0.0259 | 0.0042/0.0524 | 0.0063/0.2739 |
| | NIP-SS | 7.9391 | 154 | 3861 | 0.8052/0.3406 | 0.0873/0.1248 | 0.0303/0.0863 | 0.0544/0.2381 | 0.0080/0.0492 | 0.0049/0.0402 | 0.0028/0.0272 | 0.0065/0.0851 | 0.0005/0.0085 |
| | Sub | 12.7178 | 105 | 2516 | 0.6355/0.2302 | 0.2134/0.2196 | 0.0254/0.0467 | 0.0552/0.1928 | 0.0672/0.2902 | 0.0016/0.0089 | 0.0012/0.0085 | 0.0004/0.0031 | 0/0 |
| | Random method | 6.8781 | 18 | 4381 | 0.8274/0.7327 | 0.1726/0.2673 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| H. sapiens | Positive | 1.8146 | 31 | 2384 | 0.9727/0.8470 | 0.0243/0.1262 | 0.0025/0.0219 | 0.0004/0.0049 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | NIP-SS-NonControl | 2.9263 | 312 | 1709 | 0.9427/0.4741 | 0.0234/0.0866 | 0.0164/0.1043 | 0.0053/0.0514 | 0.0070/0.1067 | 0.0018/0.0343 | 0/0 | 0.0029/0.0961 | 0.0006/0.0465 |
| | NIP-SS | 2.7760 | 30 | 1777 | 0.9572/0.8012 | 0.0315/0.1250 | 0.0113/0.0738 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | Sub | 6.5563 | 39 | 888 | 0.7331/0.3221 | 0.1486/0.2697 | 0.1002/0.3279 | 0.0180/0.0803 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | Random method | 2.0212 | 9 | 2221 | 1.0000/1.0000 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| M. musculus | Positive | 0.9473 | 24 | 948 | 0.9926/0.9450 | 0.0063/0.0411 | 0.0011/0.0139 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | NIP-SS-NonControl | 1.9025 | 99 | 636 | 0.9513/0.5929 | 0.0299/0.1389 | 0.0110/0.0882 | 0.0016/0.0169 | 0.0016/0.0275 | 0.0016/0.0380 | 0.0031/0.0977 | 0/0 | 0/0 |
| | NIP-SS | 2.0163 | 20 | 612 | 0.9624/0.8212 | 0.0376/0.1788 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | Sub | 6.0190 | 52 | 263 | 0.7262/0.1446 | 0.1901/0.3857 | 0/0 | 0.0798/0.4415 | 0.0038/0.0282 | 0/0 | 0/0 | 0/0 | 0/0 |
| | Random method | 1.2678 | 8 | 814 | 1/1 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

Each degree-range cell gives the proportion of proteins/the number of interactions (non-interactions).
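As a rough illustration of how such a degree profile can be derived from a list of protein pairs, the sketch below counts how often each protein appears and bins the counts. It rests on assumptions: the bin boundaries are treated as `lo <= degree < hi`, the second value of each cell is taken to be the share of the total degree contributed by proteins in that bin, and the "average degree" is computed as the plain mean degree, which may differ from the convention behind the table. `degree_summary` and `EDGES` are hypothetical names.

```python
from collections import Counter

# Lower edges of the degree ranges used in the table; the last range is open-ended.
EDGES = (1, 10, 20, 30, 50, 70, 80, 100, 150)


def degree_summary(pairs, edges=EDGES):
    """Summarise the degree distribution of a non-empty list of (protein, protein) pairs."""
    degree = Counter()
    for a, b in pairs:
        degree[a] += 1
        degree[b] += 1
    n_proteins = len(degree)
    degree_sum = sum(degree.values())
    uppers = edges[1:] + (float("inf"),)
    protein_share, degree_share = [], []
    for lo, hi in zip(edges, uppers):
        in_bin = [d for d in degree.values() if lo <= d < hi]
        protein_share.append(len(in_bin) / n_proteins)
        degree_share.append(sum(in_bin) / degree_sum)
    return {
        "average_degree": degree_sum / n_proteins,
        "maximum_degree": max(degree.values()),
        "number_of_proteins": n_proteins,
        "protein_share": protein_share,
        "degree_share": degree_share,
    }


# Toy network: one hub paired with twelve partners.
pairs = [("hub", f"p{i}") for i in range(12)]
summary = degree_summary(pairs)
```

Running the same function on a positive set and on a candidate negative set gives two profiles that can be compared bin by bin, which is the kind of comparison the table makes between "Positive" and the negative sets.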
Fig. 3 The experimental results of NIP-SS-NonControl and NIP-SS on S. cerevisiae, H. sapiens, and M. musculus. The negative datasets constructed by NIP-SS control the degree distribution of proteins
The degree distribution of proteins of different sizes of submatrix W (k=3) on S. cerevisiae, H. sapiens, and M. musculus
| Species | Dataset | Average degree | Maximum degree | Number | 1<degree<10 | 10<degree<20 | 20<degree<30 | 30<degree<50 | 50<degree<70 | 70<degree<80 | 80<degree<100 | 100<degree<150 | degree>150 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S. cerevisiae | Positive | 6.8763 | 252 | 4382 | 0.7976/0.3418 | 0.1130/0.2131 | 0.0436/0.1372 | 0.0299/0.1429 | 0.0087/0.0667 | 0.0014/0.0132 | 0.0030/0.0341 | 0.0025/0.0382 | 0.0005/0.0128 |
| | | 35.6391 | 99 | 942 | 0.0106/0.0022 | 0.0998/0.0447 | 0.1943/0.1396 | 0.6008/0.6468 | 0.0669/0.1024 | 0.0096/0.0200 | 0.0180/0.0443 | 0/0 | 0/0 |
| | | 18.4008 | 56 | 1779 | 0.1417/0.0560 | 0.4817/0.3978 | 0.2816/0.3518 | 0.0877/0.1744 | 0.0073/0.0200 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 12.9057 | 48 | 2482 | 0.3429/0.1847 | 0.5028/0.5189 | 0.1241/0.2219 | 0.0302/0.0745 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 10.2938 | 36 | 3056 | 0.5301/0.3250 | 0.3750/0.4727 | 0.0906/0.1900 | 0.0043/0.0123 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 8.7113 | 29 | 3554 | 0.6294/0.4206 | 0.3348/0.4952 | 0.0357/0.0841 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 7.8181 | 28 | 3914 | 0.6704/0.4607 | 0.3143/0.5003 | 0.0153/0.0390 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 7.2907 | 24 | 4163 | 0.7173/0.5390 | 0.2782/0.4491 | 0.0046/0.0119 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 6.9820 | 24 | 4324 | 0.7590/0.6048 | 0.2396/0.3913 | 0.0014/0.0038 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 6.9070 | 24 | 4365 | 0.7830/0.6512 | 0.2160/0.3463 | 0.0009/0.0025 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| H. sapiens | Positive | 1.8146 | 31 | 2384 | 0.9727/0.8470 | 0.0243/0.1262 | 0.0025/0.0219 | 0.0004/0.0049 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 33.4103 | 73 | 195 | 0/0 | 0.0051/0.0030 | 0.2410/0.1938 | 0.7282/0.7548 | 0.0205/0.0376 | 0.0051/0.0109 | 0/0 | 0/0 | 0/0 |
| | | 10.7926 | 28 | 569 | 0.4095/0.2923 | 0.5536/0.6335 | 0.0369/0.0742 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 6.4638 | 25 | 899 | 0.8365/0.7101 | 0.1613/0.2829 | 0.0022/0.0070 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 4.6913 | 18 | 1179 | 0.9245/0.8328 | 0.0755/0.1672 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 3.3799 | 18 | 1532 | 0.9778/0.9371 | 0.0222/0.0629 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 2.7423 | 14 | 1793 | 0.9933/0.9791 | 0.0067/0.0209 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 2.4481 | 13 | 1946 | 0.9964/0.9881 | 0.0036/0.0119 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 2.2924 | 11 | 2038 | 0.9995/0.9984 | 0.0005/0.0016 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 2.1370 | 10 | 2139 | 1/1 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 2.0171 | 10 | 2224 | 1/1 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| M. musculus | Positive | 0.9473 | 24 | 948 | 0.9926/0.9450 | 0.0063/0.0411 | 0.0011/0.0139 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 8.6649 | 22 | 191 | 0.6545/0.5287 | 0.3351/0.4480 | 0.0105/0.0233 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 2.6700 | 13 | 503 | 0.9841/0.9474 | 0.0159/0.0526 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 1.7843 | 10 | 663 | 1/1 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 1.4745 | 10 | 746 | 1/1 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 1.2214 | 7 | 831 | 1/1 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

Each degree-range cell gives the proportion of proteins/the number of interactions (non-interactions). The row labels for the individual submatrix sizes were lost in extraction and are left blank.
Fig. 4 Left: the accuracy under different input values of p (the size of submatrix). Right: the accuracy under different input values of k (the steps of random walks)
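To make the role of k concrete, here is a generic sketch of a k-step random walk on a small interaction network. It is only an illustration of the general technique, not the NIP-RW procedure itself, whose details are not given in this record. It assumes a row-normalised adjacency matrix as the transition matrix and treats pairs that are never reached within k steps as candidate non-interacting pairs. All function names are hypothetical.

```python
def row_normalize(adj):
    """Turn an adjacency matrix into a row-stochastic transition matrix."""
    out = []
    for row in adj:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def matmul(a, b):
    """Plain-Python product of two square matrices."""
    n = len(a)
    return [
        [sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
        for i in range(n)
    ]


def k_step_reach(adj, k):
    """Sum of the 1-step to k-step transition probabilities between all nodes."""
    w = row_normalize(adj)
    step = w
    total = [row[:] for row in w]
    for _ in range(k - 1):
        step = matmul(step, w)
        total = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(total, step)]
    return total


def unreached_pairs(adj, k):
    """Pairs (i, j), i < j, that no walk of at most k steps connects."""
    reach = k_step_reach(adj, k)
    n = len(adj)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if reach[i][j] == 0 and reach[j][i] == 0
    ]


# Toy network: a path 0 - 1 - 2 - 3 - 4.
path = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
candidates_k2 = unreached_pairs(path, 2)
candidates_k3 = unreached_pairs(path, 3)
```

On the toy path, increasing k shrinks the candidate set, because more distant proteins become reachable. That is the trade-off a parameter such as k controls in this kind of scheme.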
The degree distribution of proteins under different k on S. cerevisiae, H. sapiens, and M. musculus
| Species | Dataset | Average degree | Maximum degree | Number | 1<degree<10 | 10<degree<20 | 20<degree<30 | 30<degree<50 | 50<degree<70 | 70<degree<80 | 80<degree<100 | 100<degree<150 | degree>150 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S. cerevisiae | Positive | 6.8763 | 252 | 4382 | 0.7976/0.3418 | 0.1130/0.2131 | 0.0436/0.1372 | 0.0299/0.1429 | 0.0087/0.0667 | 0.0014/0.0132 | 0.0030/0.0341 | 0.0025/0.0382 | 0.0005/0.0128 |
| | | 10.1949 | 33 | 3083 | 0.5336/0.3501 | 0.4032/0.5196 | 0.0626/0.1285 | 0.0006/0.0019 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 10.2095 | 22 | 3079 | 0.5320/0.3452 | 0.4005/0.5138 | 0.0659/0.1364 | 0.0016/0.0046 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 10.2131 | 39 | 3078 | 0.5338/0.3312 | 0.3765/0.4765 | 0.0854/0.1800 | 0.0042/0.0123 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 10.2460 | 583 | 3069 | 0.8661/0.3685 | 0.0909/0.1090 | 0.0130/0.0292 | 0.0075/0.0257 | 0.0033/0.0169 | 0.0013/0.0089 | 0.0016/0.0128 | 0.0010/0.0110 | 0.0153/0.4179 |
| | | 10.1841 | 679 | 3086 | 0.9180/0.4234 | 0.0680/0.0731 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0.0139/0.5035 |
| | | 10.3570 | 658 | 3039 | 0.9105/0.4172 | 0.0760/0.0797 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0.0135/0.5031 |
| | | 10.2350 | 574 | 3072 | 0.9134/0.4201 | 0.0713/0.0760 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0.0153/0.5038 |
| | | 10.2754 | 598 | 3061 | 0.9138/0.4205 | 0.0699/0.0749 | 0.0007/0.0012 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0.0157/0.5034 |
| | | 10.2865 | 642 | 3058 | 0.9094/0.4156 | 0.0755/0.0808 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0.0150/0.5036 |
| H. sapiens | Positive | 1.8146 | 31 | 2384 | 0.9727/0.8470 | 0.0243/0.1262 | 0.0025/0.0219 | 0.0004/0.0049 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 4.5824 | 21 | 1202 | 0.9085/0.8057 | 0.0907/0.1914 | 0.0008/0.0029 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 4.6720 | 18 | 1183 | 0.9231/0.8282 | 0.0769/0.1718 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 4.6339 | 23 | 1191 | 0.9278/0.8351 | 0.0714/0.1614 | 0.0008/0.0034 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 4.6577 | 19 | 1186 | 0.9115/0.8004 | 0.0885/0.1996 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 4.8196 | 26 | 1153 | 0.8803/0.6975 | 0.1110/0.2686 | 0.0087/0.0340 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 4.7895 | 29 | 1159 | 0.8645/0.6432 | 0.1182/0.2875 | 0.0173/0.0693 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 4.9486 | 31 | 1128 | 0.8608/0.6405 | 0.1232/0.2942 | 0.0151/0.0607 | 0.0009/0.0046 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 4.9119 | 28 | 1135 | 0.8520/0.6191 | 0.1251/0.2912 | 0.0229/0.0897 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 4.8348 | 29 | 1150 | 0.8678/0.6590 | 0.1165/0.2766 | 0.0157/0.0644 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| M. musculus | Positive | 0.9473 | 24 | 948 | 0.9926/0.9450 | 0.0063/0.0411 | 0.0011/0.0139 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 2.8863 | 14 | 475 | 0.9916/0.9745 | 0.0084/0.0255 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 2.7293 | 13 | 495 | 0.9879/0.9615 | 0.0121/0.0385 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 2.8140 | 14 | 484 | 0.9897/0.9670 | 0.0103/0.0330 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 2.8062 | 12 | 485 | 0.9959/0.9875 | 0.0041/0.0125 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 2.7673 | 13 | 490 | 0.9939/0.9794 | 0.0061/0.0206 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 2.8062 | 14 | 485 | 0.9876/0.9599 | 0.0124/0.0401 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 2.8219 | 14 | 483 | 0.9834/0.9502 | 0.0166/0.0498 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 2.8863 | 15 | 475 | 0.9789/0.9350 | 0.0211/0.0650 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| | | 2.7906 | 15 | 487 | 0.9856/0.9529 | 0.0144/0.0471 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

Each degree-range cell gives the proportion of proteins/the number of interactions (non-interactions). The row labels for the individual values of k were lost in extraction and are left blank.
Results based on different negative datasets on S. cerevisiae, H. sapiens and M. musculus
| Species | Negative samples | ACC | PE | RE | SPE | MCC | F1 | AUC |
|---|---|---|---|---|---|---|---|---|
| S. cerevisiae | NIP-SS | 94.34% ± 0.38% | 95.62% ± 0.75% | 92.96% ± 0.40% | 95.74% ± 0.75% | 88.73% ± 0.77% | 94.27% ± 0.34% | 98.24% ± 0.11% |
| | NIP-RW | 87.92% ± 0.24% | 90.04% ± 1.69% | 85.32% ± 1.90% | 90.48% ± 2.20% | 75.97% ± 0.55% | 87.59% ± 0.35% | 94.23% ± 0.12% |
| | Sub | 93.79% ± 0.43% | 95.18% ± 0.41% | 92.25% ± 0.78% | 95.33% ± 0.45% | 87.62% ± 0.83% | 93.69% ± 0.38% | 98.13% ± 0.17% |
| | Random method | 74.20% ± 0.78% | 72.68% ± 1.45% | 77.59% ± 0.89% | 70.83% ± 1.77% | 48.53% ± 1.47% | 75.04% ± 0.76% | 81.29% ± 0.34% |
| H. sapiens | NIP-SS | 86.17% ± 0.93% | 86.38% ± 1.27% | 85.88% ± 1.55% | 86.48% ± 1.00% | 72.36% ± 1.85% | 86.12% ± 1.05% | 92.20% ± 0.82% |
| | NIP-RW | 86.44% ± 0.59% | 90.05% ± 0.48% | 81.87% ± 2.35% | 90.91% ± 1.09% | 73.14% ± 1.12% | 85.75% ± 1.28% | 92.30% ± 0.70% |
| | Sub | 93.34% ± 0.58% | 93.19% ± 0.42% | 93.51% ± 0.94% | 93.17% ± 0.37% | 86.68% ± 1.16% | 93.35% ± 0.57% | 96.22% ± 0.45% |
| | Random method | 60.46% ± 1.54% | 60.07% ± 1.74% | 62.33% ± 1.91% | 58.50% ± 3.24% | 20.85% ± 3.14% | 61.17% ± 1.63% | 64.57% ± 1.35% |
| M. musculus | NIP-SS | 81.69% ± 1.48% | 80.57% ± 2.20% | 83.73% ± 2.97% | 79.51% ± 4.47% | 63.44% ± 3.16% | 82.06% ± 0.84% | 87.04% ± 1.95% |
| | NIP-RW | 80.66% ± 2.14% | 84.89% ± 5.41% | 74.83% ± 3.46% | 86.72% ± 4.62% | 61.97% ± 4.53% | 79.41% ± 2.52% | 87.75% ± 2.25% |
| | Sub | 91.82% ± 1.26% | 90.13% ± 2.57% | 93.93% ± 2.38% | 89.76% ± 2.41% | 83.78% ± 2.40% | 91.95% ± 1.44% | 94.81% ± 0.74% |
| | Random method | 50.76% ± 2.12% | 50.80% ± 5.77% | 52.17% ± 1.90% | 49.44% ± 3.26% | 1.60% ± 3.86% | 51.37% ± 3.58% | 51.40% ± 2.43% |
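For readers who want to recompute the threshold-based columns, the sketch below derives them from a confusion matrix. It assumes the usual readings of the abbreviations (PE = precision, RE = recall, SPE = specificity) and the standard textbook definitions. AUC is omitted because it needs ranked scores rather than counts. The function name `binary_metrics` is illustrative.

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (both classes present and predicted).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "ACC": (tp + tn) / (tp + fp + tn + fn),
        "PE": precision,
        "RE": recall,
        "SPE": tn / (tn + fp),
        "MCC": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        "F1": 2 * precision * recall / (precision + recall),
    }


# Toy counts for a balanced test fold of 100 positives and 100 negatives.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```

MCC and F1 both drop sharply when one class is predicted poorly, which is why the "Random method" rows can show a much lower MCC than their accuracy alone would suggest.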
Results of different network architectures on S. cerevisiae; the negative dataset is constructed by NIP-SS
| Architectures | Data set | ACC | PE | RE | SPE | MCC | F1 | AUC |
|---|---|---|---|---|---|---|---|---|
| DNNs | Fold 1 | 94.08% | 94.04% | 94.17% | 93.98% | 88.15% | 94.11% | 98.24% |
| | Fold 2 | 94.03% | 94.36% | 93.64% | 94.42% | 88.07% | 94.00% | 98.13% |
| | Fold 3 | 94.57% | 95.25% | 93.66% | 95.45% | 89.14% | 94.45% | 98.17% |
| | Fold 4 | 94.38% | 94.99% | 93.78% | 94.98% | 88.77% | 94.38% | 98.16% |
| | Fold 5 | 94.19% | 94.84% | 93.50% | 94.88% | 88.39% | 94.17% | 98.03% |
| | Average | 94.25% ± 0.22% | 94.70% ± 0.49% | 93.75% ± 0.26% | 94.74% ± 0.56% | 88.50% ± 0.45% | 94.22% ± 0.19% | 98.15% ± 0.08% |
| DNNs-Con | Fold 1 | 91.92% | 92.40% | 91.27% | 92.55% | 83.84% | 91.83% | 97.15% |
| | Fold 2 | 91.86% | 93.87% | 89.21% | 94.40% | 83.79% | 91.48% | 96.90% |
| | Fold 3 | 91.58% | 93.62% | 89.32% | 93.86% | 83.26% | 91.42% | 96.83% |
| | Fold 4 | 91.86% | 93.65% | 90.07% | 93.70% | 83.79% | 91.83% | 96.92% |
| | Fold 5 | 91.42% | 92.24% | 90.53% | 92.32% | 82.86% | 91.38% | 96.93% |
| | Average | 91.73% ± 0.21% ∙ | 93.16% ± 0.77% ∙ | 90.08% ± 0.86% ∙ | 93.37% ± 0.89% ∙ | 83.51% ± 0.43% ∙ | 91.59% ± 0.23% ∙ | 96.95% ± 0.12% ∙ |
∙/∘ indicates whether our model is statistically (according to pairwise t-test at 95% significance level) superior/inferior to the DNNs-Con
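The pairwise t-test mentioned in the footnote can be reproduced by hand for the ACC column. The sketch below uses the five per-fold accuracies from the table and assumes a two-sided paired test with 4 degrees of freedom, for which the 5% critical value of Student's t is about 2.776. Whether the original test was one- or two-sided is not stated here.

```python
from math import sqrt
from statistics import mean, stdev

# Per-fold accuracies (%) copied from the table above.
dnn_acc = [94.08, 94.03, 94.57, 94.38, 94.19]
dnn_con_acc = [91.92, 91.86, 91.58, 91.86, 91.42]


def paired_t(a, b):
    """t statistic of a paired t-test on two equally long samples."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Two-sided 5% critical value of Student's t with 4 degrees of freedom.
T_CRIT_DF4 = 2.776

t_stat = paired_t(dnn_acc, dnn_con_acc)
significant = abs(t_stat) > T_CRIT_DF4
```

Under these assumptions the statistic is far above the critical value, which is in line with the ∙ marker on the ACC average.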
Results on H. sapiens with different numbers of negative samples for NIP-SS and NIP-RW
| Method | Dataset | SEN | SPE | AUC | GM |
|---|---|---|---|---|---|
| NIP-SS | | 86.57% | 87.08% | 92.01% | 86.57% |
| | | 69.95% | 89.25% | 86.33% | 79.00% |
| | | 52.93% | 92.48% | 81.50% | 69.94% |
| NIP-RW | | 81.87% | 90.91% | 92.30% | 86.27% |
| | | 72.84% | 90.13% | 87.41% | 81.02% |
| | | 58.33% | 94.51% | 84.03% | 75.00% |
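GM is presumably the geometric mean of sensitivity (SEN) and specificity (SPE), a common summary for imbalanced test sets. That reading is an assumption, although several rows of the table are consistent with it. A minimal sketch:

```python
from math import sqrt


def g_mean(sen, spe):
    """Geometric mean of sensitivity and specificity (same unit as the inputs)."""
    return sqrt(sen * spe)


# First NIP-RW row of the table: SEN = 81.87%, SPE = 90.91%.
gm_nip_rw = g_mean(81.87, 90.91)
```

Because the geometric mean is pulled towards the smaller of the two values, GM falls quickly when sensitivity drops, even if specificity rises.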
Results of DNNs with AC, CT, LD and MCD feature descriptors on S. cerevisiae
| Model | Dimension | ACC | PE | RE | SPE | MCC | F1 | AUC |
|---|---|---|---|---|---|---|---|---|
| DNNs-AC | (210+210) | 94.25% ± 0.22% | 94.70% ± 0.49% | 93.75% ± 0.26% | 94.74% ± 0.56% | 88.50% ± 0.45% | 94.22% ± 0.19% | 98.15% ± 0.08% |
| DNNs-CT | (343+343) | 94.37% ± 0.24% | 95.55% ± 0.75% | 93.09% ± 0.81% | 95.67% ± 0.65% | 88.78% ± 0.48% | 94.30% ± 0.23% | 98.20% ± 0.21% |
| DNNs-LD | (630+630) | 94.41% ± 0.14% | 95.46% ± 0.41% | 93.25% ± 0.44% | 95.56% ± 0.44% | 88.84% ± 0.28% | 94.34% ± 0.15% | 98.23% ± 0.06% |
| DNNs-MCD | (882+882) | 94.25% ± 0.22% | 94.70% ± 0.49% | 93.75% ± 0.26% | 94.74% ± 0.56% | 88.50% ± 0.45% | 94.22% ± 0.19% | 98.15% ± 0.08% |
∙/∘ indicates whether DNNs-AC is statistically (according to pairwise t-test at 95% significance level) superior/inferior to the other descriptors
Results of our model and of other state-of-the-art methods on S. cerevisiae
| Method | Negative samples | ACC | PE | RE | SPE | MCC | F1 | AUC |
|---|---|---|---|---|---|---|---|---|
| Our method | Sub | 93.79% ± 0.43% | 95.18% ± 0.41% | 92.25% ± 0.78% | 95.33% ± 0.45% | 87.62% ± 0.83% | 93.69% ± 0.38% | 98.13% ± 0.17% |
| | NIP-SS | 94.34% ± 0.38% | 95.62% ± 0.75% | 92.96% ± 0.40% | 95.74% ± 0.75% | 88.73% ± 0.77% | 94.27% ± 0.34% | 98.24% ± 0.11% |
| | NIP-RW | 87.92% ± 0.24% | 90.04% ± 1.69% | 85.32% ± 1.90% | 90.48% ± 2.20% | 75.97% ± 0.55% | 87.59% ± 0.35% | 94.23% ± 0.12% |
| Du’s work [ | Sub | 92.58% ± 0.38% | 94.21% ± 0.45% | 90.95% ± 0.41% | 94.41% ± 0.45% | 85.41% ± 0.76% | 92.55% ± 0.39% | 97.55% ± 0.16% |
| | NIP-SS | 94.44% ± 0.35% | 95.46% ± 0.38% | 93.44% ± 0.45% | 95.45% ± 0.41% | 88.90% ± 0.68% | 94.44% ± 0.37% | 98.22% ± 0.20% |
| | NIP-RW | 88.59% ± 0.32% | 92.61% ± 0.41% | 84.14% ± 0.43% | 93.13% ± 0.35% | 77.52% ± 0.59% | 88.17% ± 0.34% | 94.73% ± 0.18% |
| You’s work [ | Sub | 89.15% ± 0.33% | 90.00% ± 0.57% | 88.10% ± 0.17% | 90.21% ± 0.61% | 78.33% ± 0.67% | 89.04% ± 0.31% | 94.78% ± 0.21% |
| | NIP-SS | 94.42% ± 0.47% | 96.71% ± 0.47% | 91.96% ± 0.64% | 96.87% ± 0.46% | 88.94% ± 0.92% | 94.28% ± 0.49% | 98.46% ± 0.12% |
| | NIP-RW | 86.03% ± 0.43% | 89.19% ± 0.60% | 82.00% ± 0.70% | 90.06% ± 0.64% | 72.30% ± 0.85% | 85.44% ± 0.46% | 93.33% ± 0.46% |
| Zhou’s work [ | Sub | 88.76% ± 0.37% | 89.44% ± 0.27% | 87.89% ± 0.45% | 89.62% ± 0.30% | 77.53% ± 0.53% | 88.66% ± 0.28% | 94.69% ± 0.31% |
| | NIP-SS | 92.10% ± 0.34% | 93.48% ± 0.45% | 90.51% ± 0.73% | 93.68% ± 0.49% | 84.24% ± 0.67% | 91.97% ± 0.37% | 97.29% ± 0.16% |
| | NIP-RW | 82.64% ± 0.33% | 83.98% ± 0.34% | 80.67% ± 0.48% | 84.61% ± 0.36% | 65.34% ± 0.65% | 82.30% ± 0.35% | 90.00% ± 0.39% |
| Yang’s work [ | Sub | 84.81% ± 0.37% | 87.53% ± 0.14% | 81.18% ± 0.84% | 88.44% ± 0.18% | 69.80% ± 0.71% | 84.23% ± 0.47% | 90.03% ± 0.31% |
| | NIP-SS | 89.18% ± 0.35% | 93.34% ± 0.33% | 84.38% ± 0.53% | 93.98% ± 0.31% | 78.73% ± 0.69% | 88.64% ± 0.38% | 95.50% ± 0.20% |
| | NIP-RW | 83.98% ± 0.48% | 86.09% ± 0.67% | 81.07% ± 0.80% | 86.89% ± 0.76% | 68.09% ± 0.97% | 83.50% ± 0.51% | 91.45% ± 0.27% |
| Guo’s work [ | Sub | 87.88% ± 0.56% | 88.16% ± 0.90% | 87.53% ± 0.59% | 88.24% ± 1.02% | 75.77% ± 1.12% | 87.84% ± 0.53% | 93.69% ± 0.33% |
| | NIP-SS | 90.00% ± 0.43% | 90.45% ± 0.68% | 89.45% ± 0.69% | 90.55% ± 0.77% | 80.01% ± 0.86% | 89.94% ± 0.43% | 95.02% ± 0.27% |
| | NIP-RW | 82.43% ± 0.27% | 83.48% ± 0.40% | 80.87% ± 0.45% | 83.99% ± 0.49% | 64.89% ± 0.54% | 82.15% ± 0.27% | 89.04% ± 0.33% |
Prediction results on seven independent PPI datasets; PPIs of H. sapiens are used as the training set
| Species | Test pairs | Negative: NIP-SS | Negative: NIP-RW | Negative: Sub |
|---|---|---|---|---|
| | 4013 (interactions) | 86.10% | 78.11% | 94.42% |
| | 6984 (interactions) | 85.34% | 79.65% | 92.68% |
| | 1412 (interactions) | 86.20% | 85.03% | 96.29% |
| | 1420 (interactions) | 81.86% | 79.15% | 92.28% |
| | 313 (interactions) | 85.64% | 80.66% | 96.10% |
| | 1937 (non-interactions) | 15.69% | 18.58% | 4.67% |
| | 2250 (313 interactions, 1937 non-interactions) | 23.45% | 27.56% | 17.75% |

The species names in the first column were lost in extraction and are left blank.