Kuo-Ching Ying, Shih-Wei Lin.
Abstract
Protein Function Module (PFM) identification in Protein-Protein Interaction Networks (PPINs) is one of the most important and challenging tasks in computational biology. Quick and accurate detection of PFMs in PPINs contributes greatly to understanding the functions, properties, and biological mechanisms studied in disease research and in the development of new medicines. Although existing detection approaches have improved to some extent, there is still room to enhance their efficiency, accuracy, and robustness. Exploiting the particular structure of the network-clustering problem in PPINs, this study proposes an effective and efficient model based on the Lin-Kernighan-Helsgaun algorithm for detecting PFMs in PPINs. To demonstrate the effectiveness and efficiency of the proposed model, computational experiments are performed on datasets from three species. The computational results reveal that the proposed model outperforms existing detection techniques on two key performance indices, namely the degree of polymerization inside PFMs (cohesion) and the deviation degree between PFMs (separation), while being fast and robust. The proposed model can help researchers decide whether to conduct further expensive and time-consuming biological experiments and select target proteins from large-scale PPI data for more detailed research.
Year: 2020 PMID: 33048996 PMCID: PMC7553341 DOI: 10.1371/journal.pone.0240628
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Main stages of LKHM.
Statistics of the three datasets before and after pre-processing.
| Species | Interaction | Interactor | GO Annotation | Interactor & GO Annotation |
|---|---|---|---|---|
| Human | 8,412 | 4,823 | 20,201 | 3,394 |
| Mouse | 2,498 | 2,259 | 1,480 | 1,447 |
| Fruitfly | 680 | 607 | 3,299 | 269 |
Fig 2. An example diagram of a PPIN.
Adjacency matrix.
| Protein | A | B | C | D | E | F |
|---|---|---|---|---|---|---|
| A | 0 | 1 | 1 | 0 | 0 | 0 |
| B | 1 | 0 | 0 | 1 | 1 | 0 |
| C | 1 | 0 | 0 | 1 | 0 | 1 |
| D | 0 | 1 | 1 | 0 | 1 | 1 |
| E | 0 | 1 | 0 | 1 | 0 | 1 |
| F | 0 | 0 | 1 | 1 | 1 | 0 |
Distance matrix.
| Protein | A | B | C | D | E | F |
|---|---|---|---|---|---|---|
| A | - | 1.0000 | 1.0000 | 0.3333 | 0.6000 | 0.6000 |
| B | 1.0000 | - | 0.3333 | 0.7143 | 0.6667 | 0.3333 |
| C | 1.0000 | 0.3333 | - | 0.7143 | 0.3333 | 0.6667 |
| D | 0.3333 | 0.7143 | 0.7143 | - | 0.4286 | 0.4286 |
| E | 0.6000 | 0.6667 | 0.3333 | 0.4286 | - | 0.6667 |
| F | 0.6000 | 0.3333 | 0.6667 | 0.4286 | 0.6667 | - |
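This excerpt does not state how the distance matrix is derived from the adjacency matrix. Every tabulated value is, however, consistent with a Dice-type dissimilarity over neighbor sets, d(i, j) = 1 - 2|N(i) ∩ N(j)| / (|N(i)| + |N(j)|), where N(i) is the set of proteins adjacent to i. The following minimal sketch assumes that formula; it is an inference from the numbers, not a confirmed description of the authors' method, and the function names are hypothetical.

```python
from itertools import combinations

# Adjacency matrix copied from the table above (proteins A..F).
PROTEINS = ["A", "B", "C", "D", "E", "F"]
ADJACENCY = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 1],
    [0, 1, 1, 0, 1, 1],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 1, 0],
]


def neighbor_sets(adjacency):
    """Return, for each protein, the set of indices it interacts with."""
    return [{j for j, v in enumerate(row) if v} for row in adjacency]


def dice_distance(a, b):
    """ASSUMED formula: 1 - 2|a & b| / (|a| + |b|) over neighbor sets."""
    total = len(a) + len(b)
    if total == 0:
        # Assumption: two isolated proteins are treated as maximally distant.
        return 1.0
    return 1.0 - 2.0 * len(a & b) / total


def distance_matrix(adjacency):
    """Symmetric matrix of pairwise distances; the diagonal is left as None."""
    nbrs = neighbor_sets(adjacency)
    n = len(nbrs)
    dist = [[None] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = round(dice_distance(nbrs[i], nbrs[j]), 4)
        dist[i][j] = dist[j][i] = d
    return dist


DIST = distance_matrix(ADJACENCY)
```

Under this assumption, A and B are at distance 1.0 even though they interact directly, because they share no common neighbor, while A and D (not adjacent, but sharing neighbors B and C) are at distance 0.3333.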
Fig 3. A sketch of the revised LKH algorithm.
Degrees of polymerization inside PFMs (C) for compared approaches.
| Threshold | SSO Fruitfly | SSO Mouse | SSO Human | EPSO Fruitfly | EPSO Mouse | EPSO Human | LKHM Fruitfly | LKHM Mouse | LKHM Human |
|---|---|---|---|---|---|---|---|---|---|
| 0.050 | 1.0448 | 1.0280 | 1.0270 | 1.0842 | 1.0288 | 1.0272 | 2.1514 | 1.8548 | 2.2176 |
| 0.055 | 1.0906 | 1.0298 | 1.0242 | 1.0688 | 1.0278 | 1.0276 | 2.2928 | 1.9256 | 2.1338 |
| 0.060 | 1.0516 | 1.0296 | 1.0278 | 1.0666 | 1.0334 | 1.0274 | 2.4402 | 1.8886 | 2.2436 |
| 0.065 | 1.0634 | 1.0326 | 1.0276 | 1.0710 | 1.0310 | 1.0310 | 2.3326 | 1.9212 | 2.0004 |
| 0.070 | 1.0778 | 1.0306 | 1.0308 | 1.0852 | 1.0356 | 1.0312 | 2.3818 | 1.9446 | 2.3118 |
| 0.075 | 1.2486 | 1.0274 | 1.0276 | 1.0930 | 1.0296 | 1.0288 | 2.4328 | 1.8684 | 2.3538 |
| 0.080 | 1.0658 | 1.0300 | 1.0302 | 1.0980 | 1.0334 | 1.0296 | 2.1786 | 1.8596 | 2.3036 |
| 0.085 | 1.1094 | 1.0342 | 1.0302 | 1.1050 | 1.0362 | 1.0408 | 2.2484 | 1.8868 | 2.3032 |
| Total Ave. | 1.0940 | 1.0302 | 1.0281 | 1.0835 | 1.0319 | 1.0304 | 2.3073 | 1.8937 | 2.2324 |
Deviation degrees between PFMs (S) for compared approaches.
| Threshold | SSO Fruitfly | SSO Mouse | SSO Human | EPSO Fruitfly | EPSO Mouse | EPSO Human | LKHM Fruitfly | LKHM Mouse | LKHM Human |
|---|---|---|---|---|---|---|---|---|---|
| 0.050 | 19.9450 | 23.8480 | 22.8452 | 21.1064 | 23.9212 | 22.7966 | 18.9636 | 22.8962 | 21.5276 |
| 0.055 | 20.3240 | 23.6394 | 23.0308 | 20.5620 | 24.0596 | 22.8374 | 19.1828 | 22.6266 | 21.9868 |
| 0.060 | 19.6422 | 23.5466 | 23.1240 | 19.6758 | 24.0378 | 23.0052 | 20.5374 | 22.8688 | 21.7224 |
| 0.065 | 20.1214 | 23.6064 | 22.8264 | 20.0156 | 24.0522 | 22.5876 | 21.8196 | 23.4734 | 21.9264 |
| 0.070 | 19.7058 | 23.2210 | 22.9818 | 20.5950 | 23.9542 | 22.9120 | 21.9656 | 22.9996 | 22.4988 |
| 0.075 | 19.5328 | 23.5762 | 22.8524 | 20.3126 | 23.6772 | 22.7920 | 21.9108 | 22.9906 | 22.2826 |
| 0.080 | 20.4202 | 23.2104 | 22.9208 | 20.0296 | 23.7938 | 23.0494 | 21.9372 | 23.1178 | 22.1316 |
| 0.085 | 19.8458 | 22.7710 | 22.7432 | 20.7184 | 23.6878 | 23.0670 | 21.8264 | 22.9274 | 22.1776 |
| Total Ave. | 19.9421 | 23.4273 | 22.9155 | 20.3769 | 23.8979 | 22.8809 | 21.0179 | 22.9875 | 22.0317 |
Fig 4. Degrees of polymerization inside PFMs.
Average computational times for compared approaches.
| Threshold | SSO Fruitfly | SSO Mouse | SSO Human | EPSO Fruitfly | EPSO Mouse | EPSO Human | LKHM Fruitfly | LKHM Mouse | LKHM Human |
|---|---|---|---|---|---|---|---|---|---|
| 0.050 | 4.4 | 71.2 | 731.6 | 5.4 | 148.6 | 710.2 | 0.7 | 71.5 | 421.2 |
| 0.055 | 4.0 | 69.4 | 501.4 | 4.0 | 167.4 | 650.0 | 0.7 | 71.5 | 421.2 |
| 0.060 | 4.0 | 75.6 | 451.8 | 5.0 | 163.2 | 637.8 | 0.7 | 71.5 | 421.2 |
| 0.065 | 4.0 | 73.6 | 407.6 | 5.0 | 168.0 | 683.8 | 0.7 | 71.5 | 421.2 |
| 0.070 | 4.0 | 107.4 | 418.8 | 5.0 | 146.0 | 629.6 | 0.7 | 71.5 | 421.2 |
| 0.075 | 4.0 | 78.0 | 375.6 | 5.0 | 168.0 | 758.2 | 0.7 | 71.5 | 421.2 |
| 0.080 | 4.0 | 74.6 | 395.2 | 5.0 | 166.4 | 751.2 | 0.7 | 71.5 | 421.2 |
| 0.085 | 4.0 | 76.4 | 402.0 | 5.0 | 167.2 | 771.2 | 0.7 | 71.5 | 421.2 |
| Total Ave. | 4.0 | 78.2 | 460.5 | 4.9 | 161.8 | 699.0 | 0.7 | 71.5 | 421.2 |
* CPU times are in seconds.