Xiaoting Wang, Nan Zhang, Yulan Zhao, Juan Wang.
Abstract
Motivation: A protein complex is a group of proteins that interact with each other. Protein-protein interaction (PPI) networks are composed of multiple protein complexes. Recognizing protein complexes from PPI data is very difficult because the data are noisy.
Keywords: GO terms; NNP; function of proteins; protein complex; protein interaction network
Year: 2021 PMID: 34966415 PMCID: PMC8711776 DOI: 10.3389/fgene.2021.792265
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
FIGURE 1. Workflow of the NNP.
Results of the methods on unweighted networks and on weighted networks computed by the TSSN.
| Metrics | | | F1 |
|---|---|---|---|
| ClusterOne-u | 0.32 | 0.415 | 0.361 |
| ClusterOne-T | | | |
| MCODE-u | 0.21 | 0.49 | 0.294 |
| MCODE-T | | | |
| MCL-u | 0.58 | 0.21 | 0.308 |
| MCL-T | | | |
Bold values represent the experimental results of ClusterOne, MCODE, and MCL on networks weighted by the TSSN method.
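The F1 column in the table above is consistent with the usual harmonic mean of the two preceding score columns. The following is a minimal sketch, assuming those two columns are precision and recall (in either order, since the formula is symmetric); the function name `f1_score` and the `rows` dictionary are illustrative and not taken from the paper.

```python
def f1_score(a: float, b: float) -> float:
    """Harmonic mean of two scores; symmetric in precision and recall."""
    if a + b == 0:
        return 0.0
    return 2 * a * b / (a + b)


# (first score, second score, reported F1) for the unweighted runs in the table.
rows = {
    "ClusterOne-u": (0.32, 0.415, 0.361),
    "MCODE-u": (0.21, 0.49, 0.294),
    "MCL-u": (0.58, 0.21, 0.308),
}

for name, (x, y, reported) in rows.items():
    # Print the recomputed F1 next to the reported one for comparison.
    print(f"{name}: computed={f1_score(x, y):.3f} reported={reported}")
```

Recomputing F1 this way is a quick sanity check on reported rows; it says nothing about how the underlying precision and recall were matched against the reference complexes.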
F1 values of NNP at different thresholds of WNT.
| t | 0 | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| F1 | 0.4 | 0.41 | **0.42** | 0.41 | 0.4 | 0.39 | 0.395 | 0.37 | 0.3 | 0.2 | 0.13 |
The bold value shows that F1 reaches its maximum of 0.42 when the threshold t is 0.2.
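Choosing the threshold amounts to taking the value of t with the highest F1. The sketch below illustrates that selection on the reported values; the helper `best_threshold` is hypothetical, and the F1 value at t = 0.2 is taken from the table note (0.42).

```python
def best_threshold(thresholds, scores):
    """Return (threshold, score) for the highest score; first one wins on ties."""
    if len(thresholds) != len(scores) or not scores:
        raise ValueError("thresholds and scores must be non-empty and equal in length")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return thresholds[best_index], scores[best_index]


# Thresholds of WNT and the F1 values reported for NNP.
thresholds = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
f1_values = [0.4, 0.41, 0.42, 0.41, 0.4, 0.39, 0.395, 0.37, 0.3, 0.2, 0.13]

best_t, best_f1 = best_threshold(thresholds, f1_values)
print(best_t, best_f1)
```

A grid this coarse only locates the neighborhood of the optimum; a finer sweep around the selected value would be needed to refine it.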
Precision values of NNP at different thresholds of WNT.
| t | 0.2 | 0.21 | 0.22 | 0.23 | 0.24 | 0.25 |
|---|---|---|---|---|---|---|
| Precision | 0.491 | 0.492 | **0.5** | 0.495 | 0.493 | 0.493 |
The bold value shows that the precision reaches its maximum of 0.5 when the threshold t is 0.22.
Cluster information identified by each algorithm.
| No. | Algorithm | Number of clusters | Average size | Coverage (proteins) |
|---|---|---|---|---|
| 1 | CYC2008 | 408 | 4.71 | 1,628 |
| 2 | CFinder | 178 | 11.31 | 2,147 |
| 3 | ClusterONE | 413 | 5 | 1,898 |
| 4 | MCODE | 110 | 6.5 | 1,299 |
| 5 | NNP | 538 | 4.54 | 1,937 |
| 6 | MCL | 623 | 6.57 | 4,096 |
| 7 | EA | 398 | 13.5 | 2,661 |
| 8 | PC2P | 434 | 4.50 | 1,953 |
Analysis of three complexes identified by the methods on the DIP data set.
| Complex | CFinder (%) | ClusterONE (%) | MCODE (%) | NNP (%) | MCL (%) | EA (%) | PC2P (%) |
|---|---|---|---|---|---|---|---|
| CFI | 100 | 100 | 100 | 100 | 100 | 100 | 83.3 |
| NEC | 83.3 | 64.1 | 91.7 | 100 | 100 | 91.7 | 83.3 |
| DRC | 56.3 | 100 | 61.4 | 91.7 | 67.5 | 83.3 | 53.3 |
Results of protein complex recognition by each algorithm.
| Metrics method | | | F1 |
|---|---|---|---|
| CFinder | 0.3408 | 0.2698 | 0.3012 |
| ClusterONE | 0.4068 | 0.3554 | 0.3794 |
| MCODE | 0.2293 | 0.501 | 0.3146 |
| NNP | | | |
| MCL | 0.3326 | 0.4093 | 0.367 |
| EA | 0.34 | 0.383 | 0.3602 |
| PC2P | 0.4340 | 0.1935 | 0.2677 |
Bold values show that the experimental results of the NNP method are optimal.
Number of protein complexes perfectly matched by each algorithm on the DIP data set.
| Algorithm | Perfect matching |
|---|---|
| CFinder | 11 |
| ClusterONE | 10 |
| MCODE | 6 |
| NNP | |
| MCL | 15 |
| EA | 14 |
| PC2P | 0 |
Bold values show that the experimental results of the NNP method are optimal.
Protein complexes with low p-values identified by the algorithm on the DIP data set.
| GO term | OL (%) | p-value |
|---|---|---|
| mRNA processing | 96 | 1.54E-36 |
| Small nuclear ribonucleoprotein complex | 86.1 | 2.73E-58 |
| mRNA splicing, | 95.7 | 4.48E-38 |
| Transferase activity, transferring glycosyl groups | 89.59 | 1.81E-76 |
| Ribosomal small subunit biogenesis | 88.2 | 2.45E-48 |
| Transporter activity | 94.38 | 6.84E-100 |
Protein complexes perfectly matched by the algorithm on the DIP data set.
| GO term | OL (%) | p-value |
|---|---|---|
| mRNA metabolic process | 100 | 7.37E-27 |
| Anaphase-promoting complex–dependent catabolic process | 100 | 4.68E-24 |
| Polyadenylation-dependent snoRNA 3′-end processing | 100 | 1.45E-32 |