Lixin Cheng, Pengfei Liu, Dong Wang, Kwong-Sak Leung.
Abstract
BACKGROUND: Clustering molecular networks is a common method in systems biology and is effective for predicting protein complexes and functional modules. However, few studies have taken into account that biological molecules are spatiotemporally regulated to form a dynamic cellular network, so that only a subset of interactions takes place at the same location in the cell.
Keywords: Functional module; Network clustering; Protein interaction network; Subcellular localization; Topological overlap
Year: 2019 PMID: 30642247 PMCID: PMC6332531 DOI: 10.1186/s12859-019-2598-7
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1. The flowchart of module identification. A given protein interaction network (GPIN) is first filtered to CLPIN by the context of cell localization, and is then imputed to form LTOPIN using both locational and topological information. After that, ClusterONE is used to identify modules. Finally, the modules are evaluated using three functional categories: biological process, molecular function and cancer gene set. LTOM, Locational and Topological Overlap Model; Sn, Sensitivity; PPV, Positive Predictive Value; Acc, Accuracy
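The filtering step in the flowchart can be illustrated with a minimal sketch. It assumes, as the name Co-Localization PIN suggests, that an interaction is kept when its two proteins share at least one annotated subcellular localization; the exact criterion is not stated here, and the function name, protein names and annotations below are hypothetical.

```python
def filter_colocalized(edges, localization):
    """Keep interactions whose two proteins share at least one localization.

    edges: iterable of (protein_a, protein_b) pairs (the unfiltered network).
    localization: dict mapping protein -> set of compartment labels.
    Proteins without annotation are treated as having no localization,
    which is an assumption of this sketch.
    """
    kept = []
    for a, b in edges:
        shared = localization.get(a, set()) & localization.get(b, set())
        if shared:
            kept.append((a, b))
    return kept


# Hypothetical toy data, not taken from HPRD or BioGRID.
gpin = [("A", "B"), ("A", "C"), ("B", "C")]
loc = {"A": {"nucleus"}, "B": {"nucleus", "cytoplasm"}, "C": {"membrane"}}
clpin = filter_colocalized(gpin, loc)
print(clpin)
```

Only the A-B interaction survives in this toy case, because C shares no compartment with either partner.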
Fig. 2. The locational and topological overlap model. A toy network is used to demonstrate the calculation steps. Nodes denote proteins and edges denote interactions. Different colors represent distinct subcellular localizations. Proteins A and B share three overlapping partners. P1 and P3 have common localizations with both A and B, whereas P2 does not. Specifically, the localizations of A-P2 and B-P2 are different (green edge vs brown edge)
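The toy example in Fig. 2 can be sketched in code. The exact LTOM formula is not given here, so this sketch assumes the standard topological overlap measure (shared partners plus the direct link, divided by the smaller degree plus one minus the direct link) and, for the locational variant, counts a shared partner only when one localization is common to A, B and that partner. The compartment labels, the absence of a direct A-B edge, and the function name are all hypothetical.

```python
def overlap(adj, a, b, loc=None):
    """Topological overlap between a and b.

    adj: dict mapping node -> set of neighbours (undirected network).
    loc: optional dict mapping node -> set of localization labels. When
         given, a shared partner counts only if a, b and the partner have
         a localization in common (assumed locational criterion).
    """
    shared = adj[a] & adj[b]
    if loc is not None:
        shared = {p for p in shared if loc[a] & loc[b] & loc[p]}
    direct = 1 if b in adj[a] else 0
    return (len(shared) + direct) / (min(len(adj[a]), len(adj[b])) + 1 - direct)


# Toy network: A and B share partners P1, P2, P3 and are not directly linked.
adj = {
    "A": {"P1", "P2", "P3"},
    "B": {"P1", "P2", "P3"},
    "P1": {"A", "B"},
    "P2": {"A", "B"},
    "P3": {"A", "B"},
}
# A meets P2 in "green" and B meets P2 in "brown", so no single location
# is common to all three; P1 and P3 share "blue" with both A and B.
loc = {
    "A": {"green", "blue"},
    "B": {"brown", "blue"},
    "P1": {"blue"},
    "P2": {"green", "brown"},
    "P3": {"blue"},
}
topological = overlap(adj, "A", "B")
locational = overlap(adj, "A", "B", loc)
print(topological, locational)
```

Under these assumptions, P2 contributes to the purely topological score but not to the locational one, which is the distinction the figure draws.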
Network reliability comparison
| Network | Proteins | Interactions | Avg. no. of literature | Verified in vivo | Precision | Recall | MCC |
|---|---|---|---|---|---|---|---|
| HPRD | | | | | | | |
| GPIN | 8136 | 37,039 | 1.13 | 40.31% | 0.8755 | 0.2058 | 0.2242 |
| CLPIN | 6882 | 23,135 | 1.23ᵃ | 54.83%ᵃ | 0.9025 | 0.1900 | 0.2339 |
| BioGRID | | | | | | | |
| GPIN | 12,289 | 268,684 | 1.14 | – | 0.7842 | 0.2347 | 0.1741 |
| CLPIN | 9749 | 94,780 | 1.27ᵃ | – | 0.8369 | 0.2211 | 0.2050 |
ᵃ Significant difference by rank-sum test, p < 0.001. PIN: Protein Interaction Network; CLPIN: Co-Localization Protein Interaction Network; Avg. no. of literature: the average number of publications supporting an interaction; Verified in vivo: the percentage of protein interactions that have been verified in vivo in the HPRD database; MCC: Matthews correlation coefficient
Overview of the HPRD protein interaction networks
| Network | Proteins | Interactions | Average path length | Average clustering coefficient | Network density | Degree exponent |
|---|---|---|---|---|---|---|
| GPIN | 7969 | 30,157 | 4.2425 | 0.1428 | 0.0009 | 2.6559 |
| CLPIN | 6794 | 22,103 | 4.4730 | 0.1539 | 0.0010 | 2.6118 |
| TOPIN | 6794 | 26,473 | 4.3919 | 0.2687 | 0.0012 | 2.6687 |
| LTOPIN | 6794 | 25,007 | 4.4165 | 0.2453 | 0.0011 | 2.6138 |
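Assuming the sixth column reports the density of an undirected simple graph, 2E / (N(N − 1)), its values can be roughly reproduced from the protein and interaction counts. The sketch below is only a consistency check under that assumption; the function name is illustrative.

```python
def density(n_nodes, n_edges):
    """Density of an undirected simple graph: 2E / (N(N-1))."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


# (proteins, interactions, value reported in the sixth column)
rows = {
    "GPIN": (7969, 30157, 0.0009),
    "CLPIN": (6794, 22103, 0.0010),
    "TOPIN": (6794, 26473, 0.0012),
    "LTOPIN": (6794, 25007, 0.0011),
}

for name, (n, e, reported) in rows.items():
    print(f"{name}: computed {density(n, e):.5f}, reported {reported}")
```

The computed values agree with the reported ones to within about 1e-4, which is the precision at which the column is given.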
Performance comparison on known protein complexes
| PIN | Modules (size ≥ 5) | Sn (≥ 5) | PPV (≥ 5) | ACC (≥ 5) | Modules (size ≥ 10) | Sn (≥ 10) | PPV (≥ 10) | ACC (≥ 10) |
|---|---|---|---|---|---|---|---|---|
| HPRD | | | | | | | | |
| GPIN | 336 | 0.277 | 0.2143 | 0.2436 | 10 | 0.081 | 0.1259 | 0.101 |
| CLPIN | 252 | 0.2488 | 0.2095 | 0.2283 | 8 | 0.0691 | 0.1207 | 0.0913 |
| TOPIN | 376 | 0.2981 | | 0.2528 | 51 | 0.1109 | | 0.1395 |
| LTOPIN | 355 | | 0.1966 | | 34 | | 0.1497 | |
| Yeast | | | | | | | | |
| GPIN | 614 | 0.6148 | | 0.6066 | 112 | 0.5495 | | 0.55 |
| CLPIN | 344 | 0.717 | 0.5592 | 0.6332 | 106 | 0.5939 | 0.5046 | 0.5474 |
| TOPIN | 492 | 0.7151 | 0.5678 | 0.6372 | 143 | 0.5977 | 0.5124 | 0.5534 |
| LTOPIN | 490 | | 0.5663 | | 141 | | 0.5104 | |
Modules were identified using ClusterONE with minimum module sizes of five and ten, respectively. Bold values denote the best score for each criterion. Sn: sensitivity; PPV: positive predictive value; ACC: geometric accuracy
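ACC is described as the geometric accuracy. Assuming this means the geometric mean of Sn and PPV, the relationship can be checked against rows for which all three values are reported. The function name is illustrative.

```python
import math


def geometric_accuracy(sn, ppv):
    """Geometric mean of sensitivity and positive predictive value."""
    return math.sqrt(sn * ppv)


# (Sn, PPV, reported ACC) taken from fully reported rows of the table.
reported = {
    "HPRD GPIN, size >= 5": (0.277, 0.2143, 0.2436),
    "HPRD CLPIN, size >= 5": (0.2488, 0.2095, 0.2283),
    "HPRD GPIN, size >= 10": (0.081, 0.1259, 0.101),
    "Yeast CLPIN, size >= 5": (0.717, 0.5592, 0.6332),
    "Yeast TOPIN, size >= 10": (0.5977, 0.5124, 0.5534),
}

for label, (sn, ppv, acc) in reported.items():
    print(f"{label}: computed {geometric_accuracy(sn, ppv):.4f}, reported {acc}")
```

The computed values match the reported ACC to the precision shown, which supports reading ACC as sqrt(Sn × PPV).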
Overrepresentation scores of cancer genes in HPRD
| PIN | Modules | Cancer module ratio | ORS | Cancer gene ratio |
|---|---|---|---|---|
| Module size ≥ 10 | | | | |
| GPIN | 10 | 0.2 | 0 | 0 (0) |
| CLPIN | 8 | 0.25 | 0.125 | 0.0133 (6) |
| TOPIN | 51 | 0.4902 | 0.098 | 0.0399 (18) |
| LTOPIN | 34 | 0.441 | 0.1765 | 0.0421 (19) |
| Module size ≥ 5 | | | | |
| GPIN | 336 | 0.273 | 0.0504 | 0.0887 (40) |
| CLPIN | 252 | 0.3004 | 0.0593 | 0.0931 (42) |
| TOPIN | 376 | 0.3528 | 0.0504 | 0.1242 (56) |
| LTOPIN | 355 | 0.3539 | 0.0702 | 0.1441 (65) |
ORS: overrepresentation score; Cancer module ratio: the ratio of modules containing cancer genes; Cancer gene ratio: the ratio of cancer genes over genes in cancer modules
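The last column pairs each ratio with a number in parentheses. As a consistency check, the sketch below divides each parenthesized number by its ratio. That the implied denominator is nearly constant suggests the ratio is taken over one fixed reference set of genes; this is an inference from the tabulated numbers, not something stated here, and the variable names are illustrative.

```python
# (ratio, parenthesized number) from the last column; the all-zero row is skipped.
last_column = {
    ("CLPIN", 10): (0.0133, 6),
    ("TOPIN", 10): (0.0399, 18),
    ("LTOPIN", 10): (0.0421, 19),
    ("GPIN", 5): (0.0887, 40),
    ("CLPIN", 5): (0.0931, 42),
    ("TOPIN", 5): (0.1242, 56),
    ("LTOPIN", 5): (0.1441, 65),
}

# Implied denominator for each row: number / ratio.
implied = {key: count / ratio for key, (ratio, count) in last_column.items()}

for (pin, min_size), value in implied.items():
    print(f"{pin} (size >= {min_size}): implied denominator {value:.1f}")
```

Every row implies a denominator that rounds to the same integer, so the ratios appear to be comparable across networks and size thresholds.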