Kapil Devkota, Henri Schmidt, Matt Werenski, James M Murphy, Mert Erden, Victor Arsenescu, Lenore J Cowen.
Abstract
MOTIVATION: Protein function prediction, based on the patterns of connection in a Protein-Protein Interaction (or Association) network, is perhaps the most studied of the classical, fundamental inference problems for biological networks. A highly successful set of recent approaches uses random-walk-based low-dimensional embeddings that tend to place functionally similar proteins into coherent spatial regions. However, these approaches lose valuable local graph structure from the network when considering only the embedding. We introduce GLIDER, a method that replaces a protein-protein interaction or association network with a new graph-based similarity network. GLIDER is based on a variant of our previous GLIDE method, which was designed to predict missing links in Protein-Protein Association networks, capturing implicit local and global (i.e. embedding-based) graph properties.
Year: 2022 PMID: 35575379 PMCID: PMC9237677 DOI: 10.1093/bioinformatics/btac322
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.931
Fig. 1. Working schematic of GLIDER-knn. The original graph is transformed into GLIDER(G) by both adding and deleting edges. Then, for each node (e.g. the starred node), the k closest direct neighbors in GLIDER(G) vote for all their GO labels (created with BioRender.com).
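The voting step described in the caption can be illustrated with a minimal sketch. It assumes the similarity network is available as a nested dictionary of edge weights, that a higher weight means a closer neighbor, that every vote counts equally, and that only neighbors with known labels vote. The function name `knn_vote` and the input format are hypothetical and are not taken from the GLIDER code.

```python
from collections import Counter


def knn_vote(node, similarity, labels, k=10, top=1):
    """Predict GO labels for `node` from its k closest labeled neighbors.

    similarity: dict node -> {neighbor: weight}; a stand-in for GLIDER(G) (assumed format)
    labels:     dict node -> set of known GO labels
    """
    neighbors = similarity.get(node, {})
    # Rank direct neighbors by edge weight, highest first; ties broken by name.
    ranked = sorted(neighbors, key=lambda v: (-neighbors[v], v))
    # Assumption: neighbors without any known label do not take part in the vote.
    voters = [v for v in ranked if labels.get(v)][:k]
    votes = Counter()
    for v in voters:
        votes.update(labels[v])  # each voter casts one vote for every label it carries
    # Order labels by vote count (descending), then by name for determinism.
    ranked_labels = sorted(votes, key=lambda go: (-votes[go], go))
    return ranked_labels[:top]


if __name__ == "__main__":
    similarity = {"P": {"A": 0.9, "B": 0.8, "C": 0.1}}
    labels = {"A": {"GO:1", "GO:2"}, "B": {"GO:1"}, "C": {"GO:3"}}
    print(knn_vote("P", similarity, labels, k=2, top=1))
```

With `k=2`, only the two highest-weight neighbors of `P` vote, so the lower-weight neighbor `C` has no influence on the prediction.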
Accuracy, F1 and Resnik score results on DREAM1–4 and STRING composite networks for different function prediction methods, using the MF category of GO, reporting mean and standard deviation over 5-fold cross-validation
| Network | Metric | GLIDER- | GLIDER-25nn | Majority-Vote | DSD- | node2vec | deepNF(S) | MASHUP(S) | GLIDER-MASHUP |
|---|---|---|---|---|---|---|---|---|---|
| DREAM1 | Accuracy | | 0.643 ± 0.013 | 0.356 ± 0.022 | 0.451 ± 0.011 | 0.439 ± 0.016 | 0.182 ± 0.011 | 0.605 ± 0.020 | 0.600 ± 0.007 |
| DREAM2 | | | 0.418 ± 0.006 | 0.314 ± 0.012 | 0.364 ± 0.015 | 0.247 ± 0.011 | 0.267 ± 0.012 | 0.386 ± 0.010 | 0.384 ± 0.006 |
| DREAM3 | | | 0.392 ± 0.016 | 0.229 ± 0.016 | 0.365 ± 0.016 | 0.253 ± 0.019 | 0.197 ± 0.010 | 0.366 ± 0.014 | 0.374 ± 0.017 |
| DREAM4 | | 0.281 ± 0.013 | 0.244 ± 0.018 | | 0.249 ± 0.015 | 0.091 ± 0.015 | 0.186 ± 0.018 | 0.201 ± 0.015 | 0.175 ± 0.020 |
| STRING-E | | | 0.685 ± 0.010 | 0.375 ± 0.003 | 0.379 ± 0.006 | 0.449 ± 0.009 | 0.382 ± 0.015 | 0.636 ± 0.007 | 0.625 ± 0.011 |
| STRING-ED | | | 0.685 ± 0.010 | 0.410 ± 0.008 | 0.438 ± 0.010 | | 0.384 ± 0.018 | 0.659 ± 0.007 | 0.624 ± 0.010 |
| STRING-EDC | | | 0.664 ± 0.004 | 0.432 ± 0.008 | 0.387 ± 0.010 | 0.425 ± 0.018 | 0.362 ± 0.011 | 0.660 ± 0.014 | 0.619 ± 0.008 |
| DREAM1 | F1 | | 0.615 ± 0.003 | 0.406 ± 0.010 | 0.463 ± 0.011 | 0.360 ± 0.009 | 0.187 ± 0.014 | 0.580 ± 0.009 | 0.573 ± 0.011 |
| DREAM2 | | | 0.415 ± 0.008 | 0.332 ± 0.011 | 0.361 ± 0.006 | 0.248 ± 0.010 | 0.236 ± 0.015 | 0.379 ± 0.008 | 0.365 ± 0.007 |
| DREAM3 | | | 0.386 ± 0.018 | 0.262 ± 0.008 | 0.360 ± 0.019 | 0.263 ± 0.012 | 0.228 ± 0.028 | 0.377 ± 0.014 | 0.378 ± 0.015 |
| DREAM4 | | 0.285 ± 0.008 | 0.250 ± 0.006 | | 0.259 ± 0.008 | 0.127 ± 0.003 | 0.210 ± 0.015 | 0.212 ± 0.016 | 0.188 ± 0.009 |
| STRING-E | | | 0.655 ± 0.010 | 0.400 ± 0.011 | 0.391 ± 0.004 | 0.384 ± 0.008 | 0.324 ± 0.099 | 0.598 ± 0.004 | 0.599 ± 0.003 |
| STRING-ED | | | 0.664 ± 0.013 | 0.438 ± 0.008 | 0.456 ± 0.010 | 0.401 ± 0.007 | 0.327 ± 0.009 | 0.614 ± 0.012 | 0.605 ± 0.009 |
| STRING-EDC | | | 0.637 ± 0.010 | 0.452 ± 0.016 | 0.404 ± 0.011 | 0.348 ± 0.008 | 0.322 ± 0.009 | 0.625 ± 0.005 | 0.588 ± 0.012 |
| DREAM1 | Resnik | | 2.388 ± 0.013 | 1.629 ± 0.029 | 1.870 ± 0.035 | 1.254 ± 0.041 | 0.859 ± 0.015 | 1.934 ± 0.051 | 1.998 ± 0.042 |
| DREAM2 | | | 1.740 ± 0.035 | 1.316 ± 0.056 | 1.469 ± 0.056 | 0.936 ± 0.011 | 0.770 ± 0.027 | 1.301 ± 0.038 | 1.353 ± 0.045 |
| DREAM3 | | | 1.567 ± 0.038 | 1.081 ± 0.056 | 1.515 ± 0.051 | 1.023 ± 0.032 | 0.837 ± 0.047 | 1.296 ± 0.038 | 1.390 ± 0.061 |
| DREAM4 | | | 1.213 ± 0.021 | 1.295 ± 0.019 | 1.234 ± 0.008 | 0.773 ± 0.018 | 0.871 ± 0.079 | 0.925 ± 0.016 | 0.909 ± 0.027 |
| STRING-E | | | 2.583 ± 0.022 | 1.557 ± 0.029 | 1.621 ± 0.054 | 1.343 ± 0.020 | 1.056 ± 0.021 | 1.985 ± 0.028 | 2.095 ± 0.033 |
| STRING-ED | | | 2.593 ± 0.050 | 1.720 ± 0.028 | 1.820 ± 0.035 | 1.435 ± 0.024 | 1.069 ± 0.009 | 2.100 ± 0.031 | 2.155 ± 0.034 |
| STRING-EDC | | | 2.520 ± 0.009 | 1.742 ± 0.034 | 1.683 ± 0.016 | 1.251 ± 0.020 | 1.074 ± 0.029 | 2.136 ± 0.031 | 2.107 ± 0.010 |
Note: Best performance bolded. All method parameters set as described in Section 2.
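Each table cell reports a mean and standard deviation over the five cross-validation folds. As a small illustration of that summary, the sketch below formats a list of per-fold scores in the same "mean ± std" style. The function name `summarize_folds` is hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the source does not state which estimator was used.

```python
from statistics import mean, stdev


def summarize_folds(scores, digits=3):
    """Format per-fold scores as 'mean ± std', matching the style of the table cells."""
    # Assumption: sample standard deviation (n - 1); the source does not specify.
    return f"{mean(scores):.{digits}f} ± {stdev(scores):.{digits}f}"


if __name__ == "__main__":
    # Made-up per-fold accuracies, for illustration only.
    print(summarize_folds([0.61, 0.63, 0.60, 0.62, 0.64]))
```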
The number of GO labels having their shortest path distance from the root nodes 5, and annotating at least 50 proteins, for DREAM1–4, STRING networks and GO hierarchies: MF, BP and CC
| Network | MF | BP | CC |
|---|---|---|---|
| DREAM1 | 45 | 272 | 86 |
| DREAM2 | 38 | 218 | 72 |
| DREAM3 | 28 | 120 | 31 |
| DREAM4 | 38 | 213 | 71 |
| STRING-E | 47 | 277 | 89 |
| STRING-ED | 47 | 278 | 90 |
| STRING-EDC | 47 | 278 | 90 |
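The counts above depend on two filters: a condition on each label's depth in the GO hierarchy and a minimum of 50 annotated proteins. The sketch below illustrates only the second filter, since the first needs the GO graph itself. The function name `frequent_labels` and the protein-to-labels dictionary format are hypothetical.

```python
from collections import Counter


def frequent_labels(annotations, min_proteins=50):
    """Return the GO labels that annotate at least `min_proteins` proteins.

    annotations: dict protein -> iterable of GO labels (assumed input format)
    """
    counts = Counter()
    for go_terms in annotations.values():
        counts.update(set(go_terms))  # count each protein at most once per label
    return {go for go, n in counts.items() if n >= min_proteins}


if __name__ == "__main__":
    # Toy annotations with a lowered threshold, for illustration only.
    toy = {"p1": ["GO:a", "GO:b"], "p2": ["GO:a"], "p3": ["GO:a", "GO:a"]}
    print(frequent_labels(toy, min_proteins=3))
```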
List of 40 GWAS genes implicated in Parkinson's disease (PD) that are present in all of the DREAM1–4 networks
| BAG3 | CTSB | HTRA2 | PARK7 | SREBF1 |
| BCKDK | DLG2 | KPNA1 | PINK1 | STK39 |
| BRIP1 | DYRK1A | MAP4K4 | RIMS1 | SYNJ1 |
| CD19 | EIF4G1 | MAPT | RIT2 | SYT11 |
| CHRNB1 | FBXO7 | NOD2 | SATB1 | UBTF |
| CLCN3 | FCGR2A | NSF | SETD1A | USP25 |
| CNTN1 | FYN | NUCKS1 | SH3GL2 | VAMP4 |
| CRHR1 | GBF1 | PAM | SH3GL2 | WNT3 |
Fig. 2. GLIDER-neighbors and their induced subgraph for the protein VAMP4 in (a) DREAM1, and (b) DREAM2 networks. The number of top VAMP4 (bolded node in the figure) GLIDER neighbors k is set to 15. Note: rectangular nodes are present in both DREAM1 and DREAM2. The oval nodes in the DREAM1 subgraph (a) are absent in the whole of DREAM2. The hexagonal nodes are only present in one of the subgraphs in (a) and (b), even though these nodes are present in both DREAM1 and DREAM2.
Fraction of the GWAS genes whose GLIDE neighbors are enriched for at least one GO label, according to FuncAssociate (version 3.0), when the number of GLIDE neighbors is k
| Network | | | | |
|---|---|---|---|---|
| DREAM1 | 0.90 | 0.98 | 0.95 | 0.98 |
| DREAM2 | 0.63 | 0.80 | 0.73 | 0.95 |
| DREAM3 | 0.68 | 0.70 | 0.75 | 0.98 |
| DREAM4 | 0.50 | 0.48 | 0.65 | 0.875 |
Fig. 3. GLIDER-neighbors and their induced subgraph for the protein PINK1 in (a) DREAM1, (b) DREAM2 and (c) DREAM3 networks. The number of PINK1 (bolded node in the figure) neighbors was chosen to be 20. Note: the hexagonal nodes [in (a–c)] also appear as nodes in DREAM1. The oval nodes in (a) are absent in DREAM2, while the oval node in (b) (GPR103) is absent in DREAM1.
Accuracy, F1 and Resnik score results on DREAM1–4 and STRING composite networks for different function prediction methods, using the BP category of GO, reporting mean and standard deviation over 5-fold cross-validation
| Network | Metric | GLIDER- | GLIDER-25nn | Majority-Vote | DSD- | node2vec | deepNF(S) | MASHUP(S) | GLIDER-MASHUP |
|---|---|---|---|---|---|---|---|---|---|
| DREAM1 | Accuracy | | 0.544 ± 0.007 | 0.381 ± 0.003 | 0.476 ± 0.008 | 0.352 ± 0.017 | 0.273 ± 0.015 | 0.534 ± 0.009 | 0.521 ± 0.016 |
| DREAM2 | | 0.366 ± 0.005 | 0.363 ± 0.008 | 0.314 ± 0.006 | | 0.225 ± 0.015 | 0.215 ± 0.008 | 0.3562 ± 0.0103 | 0.334 ± 0.009 |
| DREAM3 | | 0.338 ± 0.024 | 0.333 ± 0.019 | 0.255 ± 0.018 | | 0.208 ± 0.013 | 0.200 ± 0.012 | 0.347 ± 0.026 | 0.341 ± 0.014 |
| DREAM4 | | 0.179 ± 0.011 | 0.157 ± 0.007 | | 0.164 ± 0.009 | 0.076 ± 0.006 | 0.093 ± 0.014 | 0.142 ± 0.014 | 0.146 ± 0.010 |
| STRING-E | | | 0.521 ± 0.007 | 0.375 ± 0.003 | 0.353 ± 0.007 | 0.351 ± 0.010 | 0.273 ± 0.016 | 0.504 ± 0.012 | 0.505 ± 0.005 |
| STRING-ED | | | 0.573 ± 0.008 | 0.418 ± 0.010 | 0.401 ± 0.011 | 0.417 ± 0.005 | 0.300 ± 0.010 | 0.568 ± 0.008 | 0.547 ± 0.010 |
| STRING-EDC | | | 0.545 ± 0.008 | 0.406 ± 0.011 | 0.345 ± 0.014 | 0.375 ± 0.011 | 0.282 ± 0.013 | 0.521 ± 0.016 | 0.529 ± 0.018 |
| DREAM1 | F1 | | 0.461 ± 0.006 | 0.364 ± 0.010 | 0.410 ± 0.008 | 0.272 ± 0.010 | 0.259 ± 0.021 | 0.440 ± 0.010 | 0.444 ± 0.009 |
| DREAM2 | | 0.317 ± 0.006 | | 0.285 ± 0.004 | 0.301 ± 0.003 | 0.212 ± 0.004 | 0.200 ± 0.008 | 0.308 ± 0.005 | 0.285 ± 0.006 |
| DREAM3 | | | 0.302 ± 0.011 | 0.244 ± 0.001 | 0.301 ± 0.007 | 0.211 ± 0.005 | 0.185 ± 0.012 | 0.296 ± 0.173 | 0.278 ± 0.014 |
| DREAM4 | | 0.171 ± 0.010 | 0.158 ± 0.004 | | 0.162 ± 0.004 | 0.088 ± 0.001 | 0.106 ± 0.009 | 0.136 ± 0.007 | 0.146 ± 0.010 |
| STRING-E | | | 0.439 ± 0.007 | 0.400 ± 0.011 | 0.319 ± 0.008 | 0.292 ± 0.003 | 0.240 ± 0.006 | 0.433 ± 0.005 | 0.428 ± 0.007 |
| STRING-ED | | | 0.487 ± 0.004 | 0.365 ± 0.003 | 0.355 ± 0.004 | 0.334 ± 0.001 | 0.281 ± 0.005 | 0.474 ± 0.002 | 0.468 ± 0.010 |
| STRING-EDC | | | 0.463 ± 0.005 | 0.359 ± 0.004 | 0.308 ± 0.003 | 0.284 ± 0.003 | 0.253 ± 0.005 | 0.479 ± 0.002 | 0.448 ± 0.007 |
| DREAM1 | Resnik | | 2.615 ± 0.007 | 2.164 ± 0.063 | 2.465 ± 0.069 | 1.162 ± 0.036 | 0.751 ± 0.038 | 1.751 ± 0.015 | 1.876 ± 0.038 |
| DREAM2 | | 1.964 ± 0.035 | | 1.885 ± 0.018 | 1.944 ± 0.042 | 1.090 ± 0.013 | 0.906 ± 0.020 | 1.321 ± 0.004 | 1.394 ± 0.028 |
| DREAM3 | | 1.894 ± 0.042 | 1.651 ± 0.052 | 1.627 ± 0.036 | | 1.123 ± 0.011 | 1.128 ± 0.029 | 1.346 ± 0.046 | 1.426 ± 0.078 |
| DREAM4 | | 1.144 ± 0.042 | 1.155 ± 0.030 | 1.145 ± 0.022 | | 0.729 ± 0.020 | 0.783 ± 0.084 | 0.865 ± 0.020 | 0.873 ± 0.017 |
| STRING-E | | | 2.189 ± 0.031 | 1.557 ± 0.029 | 2.011 ± 0.034 | 1.207 ± 0.045 | 0.905 ± 0.024 | 1.767 ± 0.028 | 1.897 ± 0.049 |
| STRING-ED | | | 2.787 ± 0.031 | 2.249 ± 0.018 | 2.238 ± 0.044 | 1.317 ± 0.042 | 0.978 ± 0.020 | 1.920 ± 0.013 | 2.079 ± 0.042 |
| STRING-EDC | | | 2.652 ± 0.033 | 2.134 ± 0.053 | 1.960 ± 0.036 | 1.199 ± 0.041 | 1.091 ± 0.058 | 2.008 ± 0.060 | 2.039 ± 0.047 |
Note: Best performance bolded. All method parameters set as described in Section 2.
Accuracy, F1 and Resnik score results on DREAM1–4 and STRING composite networks for different function prediction methods, using the CC category of GO, reporting mean and standard deviation over 5-fold cross-validation
| Network | Metric | GLIDER- | GLIDER-25nn | Majority-Vote | DSD- | node2vec | deepNF(S) | MASHUP(S) | GLIDER-MASHUP |
|---|---|---|---|---|---|---|---|---|---|
| DREAM1 | Accuracy | 0.596 ± 0.005 | | 0.567 ± 0.005 | 0.585 ± 0.008 | 0.374 ± 0.011 | 0.330 ± 0.017 | 0.526 ± 0.003 | 0.517 ± 0.013 |
| DREAM2 | | 0.529 ± 0.016 | 0.527 ± 0.015 | 0.494 ± 0.007 | | 0.218 ± 0.018 | 0.230 ± 0.012 | 0.410 ± 0.009 | 0.378 ± 0.010 |
| DREAM3 | | | 0.596 ± 0.017 | 0.504 ± 0.015 | 0.595 ± 0.009 | 0.318 ± 0.015 | 0.518 ± 0.016 | 0.501 ± 0.016 | 0.464 ± 0.016 |
| DREAM4 | | | 0.471 ± 0.005 | 0.471 ± 0.012 | 0.477 ± 0.008 | 0.091 ± 0.006 | 0.248 ± 0.056 | 0.282 ± 0.010 | 0.247 ± 0.007 |
| STRING-E | | | 0.625 ± 0.012 | 0.554 ± 0.013 | 0.555 ± 0.008 | 0.369 ± 0.009 | 0.356 ± 0.013 | 0.549 ± 0.010 | 0.545 ± 0.006 |
| STRING-ED | | | 0.629 ± 0.005 | 0.575 ± 0.002 | 0.576 ± 0.009 | 0.382 ± 0.010 | 0.329 ± 0.012 | 0.565 ± 0.009 | 0.553 ± 0.006 |
| STRING-EDC | | | 0.625 ± 0.009 | 0.567 ± 0.015 | 0.529 ± 0.013 | 0.371 ± 0.098 | 0.361 ± 0.020 | 0.594 ± 0.010 | 0.572 ± 0.017 |
| DREAM1 | F1 | | 0.547 ± 0.008 | 0.544 ± 0.003 | 0.550 ± 0.007 | 0.327 ± 0.011 | 0.351 ± 0.018 | 0.469 ± 0.005 | 0.469 ± 0.002 |
| DREAM2 | | 0.497 ± 0.017 | 0.492 ± 0.014 | 0.474 ± 0.006 | | 0.208 ± 0.010 | 0.264 ± 0.018 | 0.366 ± 0.008 | 0.334 ± 0.004 |
| DREAM3 | | | 0.542 ± 0.007 | 0.471 ± 0.012 | 0.543 ± 0.002 | 0.334 ± 0.014 | 0.418 ± 0.021 | 0.436 ± 0.008 | 0.410 ± 0.016 |
| DREAM4 | | 0.450 ± 0.005 | 0.437 ± 0.007 | | 0.445 ± 0.002 | 0.111 ± 0.005 | 0.310 ± 0.007 | 0.259 ± 0.009 | 0.234 ± 0.007 |
| STRING-E | | | 0.577 ± 0.005 | 0.546 ± 0.006 | 0.519 ± 0.007 | 0.341 ± 0.010 | 0.344 ± 0.007 | 0.485 ± 0.013 | 0.492 ± 0.003 |
| STRING-ED | | | 0.585 ± 0.005 | 0.559 ± 0.004 | 0.538 ± 0.005 | 0.350 ± 0.007 | 0.344 ± 0.010 | 0.499 ± 0.010 | 0.503 ± 0.010 |
| STRING-EDC | | | 0.584 ± 0.009 | 0.549 ± 0.003 | 0.499 ± 0.003 | 0.334 ± 0.013 | 0.359 ± 0.007 | 0.517 ± 0.013 | 0.509 ± 0.007 |
| DREAM1 | Resnik | | 1.483 ± 0.010 | 1.296 ± 0.009 | 1.422 ± 0.021 | 0.927 ± 0.009 | 0.708 ± 0.015 | 1.146 ± 0.022 | 1.239 ± 0.032 |
| DREAM2 | | 1.221 ± 0.023 | 1.232 ± 0.022 | 1.134 ± 0.018 | | 0.822 ± 0.008 | 0.755 ± 0.015 | 0.976 ± 0.017 | 1.003 ± 0.016 |
| DREAM3 | | 1.089 ± 0.007 | | | 1.103 ± 0.031 | 0.884 ± 0.020 | 0.990 ± 0.058 | 1.042 ± 0.021 | 1.005 ± 0.023 |
| DREAM4 | | 1.032 ± 0.018 | 1.057 ± 0.016 | 1.020 ± 0.014 | | 0.673 ± 0.016 | 0.761 ± 0.020 | 0.848 ± 0.026 | 0.848 ± 0.023 |
| STRING-E | | | 1.553 ± 0.015 | 1.222 ± 0.007 | 1.404 ± 0.018 | 0.931 ± 0.015 | 0.796 ± 0.027 | 1.226 ± 0.029 | 1.314 ± 0.023 |
| STRING-ED | | | 1.598 ± 0.038 | 1.298 ± 0.002 | 1.485 ± 0.045 | 0.987 ± 0.015 | 0.768 ± 0.022 | 1.260 ± 0.021 | 1.356 ± 0.023 |
| STRING-EDC | | | 1.596 ± 0.009 | 1.229 ± 0.015 | 1.357 ± 0.018 | 0.963 ± 0.022 | 0.927 ± 0.030 | 1.358 ± 0.032 | 1.381 ± 0.010 |
Note: Best performance bolded. All method parameters set as described in Section 2.