Li Shen, Jian Zhang, Fang Wang, Kai Liu.
Abstract
Essential proteins are indispensable to the survival and development of cells. Predicting and analyzing essential proteins is crucial for uncovering cellular mechanisms. With the help of computer science and high-throughput technologies, predicting essential proteins from protein-protein interaction (PPI) networks has become more efficient than traditional approaches, which generally rely on expensive experimental methods. Many computational algorithms have been used to predict essential proteins; however, each has various limitations. To improve prediction accuracy, we introduce the Local Fuzzy Fractal Dimension (LFFD) of complex networks into the analysis of PPI networks and propose a novel algorithm named LDS, which combines the LFFD of the PPI network with protein subcellular location information. Tests on three different yeast PPI networks show that LDS outperforms some state-of-the-art essential protein prediction techniques.
Keywords: LFFD; PPI network; essential proteins; subcellular location information
Year: 2022; PMID: 35205217; PMCID: PMC8872415; DOI: 10.3390/genes13020173
Source DB: PubMed; Journal: Genes (Basel); ISSN: 2073-4425; Impact factor: 4.096
Figure 1. A simple example of calculating a local fractal dimension. Left: the network structure diagram. Right: the double-log plot of B(r) against r.
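The double-log fit described in the Figure 1 caption can be illustrated with a minimal Python sketch. It assumes that B(r) counts the nodes within at most r hops of a chosen center node (excluding the center itself) and that the local fractal dimension is the least-squares slope of log B(r) against log r. Both conventions, as well as the function names `ball_sizes` and `local_fractal_dimension` and the toy path graph, are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import deque


def ball_sizes(adj, center):
    """Return [(r, B(r))] for r = 1..max distance from `center`.

    Assumption: B(r) is the number of nodes within at most r hops of
    `center`, not counting `center` itself.
    """
    dist = {center: 0}
    queue = deque([center])
    while queue:  # breadth-first search gives shortest hop distances
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    max_r = max(dist.values())
    return [
        (r, sum(1 for d in dist.values() if 0 < d <= r))
        for r in range(1, max_r + 1)
    ]


def local_fractal_dimension(adj, center):
    """Least-squares slope of log B(r) versus log r around `center`."""
    points = [(math.log(r), math.log(b)) for r, b in ball_sizes(adj, center)]
    if len(points) < 2:
        raise ValueError("need at least two radii to fit a slope")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


# Toy example (hypothetical): a path of 9 nodes, centered on the middle node.
# Here B(r) = 2r, so the fitted slope is 1 up to floating-point error.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 8] for i in range(9)}
dimension = local_fractal_dimension(path, 4)
```

On a real PPI network the points rarely fall on a straight line, so the fitted slope depends on the range of r included in the fit.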
Figure 2. An example of calculating LFFD. Left: the kite network structure diagram. Right: the double-log plot of N(r) against r.
Summary of the experimental datasets.
| Datasets | Proteins | Interactions | Essential Proteins | Non-Essential Proteins |
|---|---|---|---|---|
| DIP4746 | 4746 | 15,166 | 1130 | 3616 |
| DIP5093 | 5093 | 24,743 | 1167 | 3926 |
| MIPS4546 | 4546 | 12,319 | 1016 | 3530 |
Figure 3. Comparison of the number of essential proteins predicted by LDS and other methods for dataset DIP4746. Panels (a–f) correspond to the top 1000 to 1500 proteins, respectively.
Figure 4. Comparison of the number of essential proteins predicted by LDS and other methods for dataset DIP5093. Panels (a–f) correspond to the top 1000 to 1500 proteins, respectively.
Figure 5. Comparison of the number of essential proteins predicted by LDS and other methods for dataset MIPS4546. Panels (a–f) correspond to the top 1000 to 1500 proteins, respectively.
Comparison of SN, SP, PPV, NPV, F-measure, and ACC between LDS and other methods on three different PPI datasets. Bold indicates the best result.
| Datasets | Methods | SN | SP | PPV | NPV | F-measure | ACC |
|---|---|---|---|---|---|---|---|
| DIP4746 | DC | 0.5469 | 0.7561 | 0.412 | 0.8423 | 0.470 | 0.7063 |
| | SC | 0.500 | 0.7414 | 0.3767 | 0.8259 | 0.4297 | 0.6839 |
| | BC | 0.4681 | 0.7315 | 0.3527 | 0.8148 | 0.4023 | 0.6688 |
| | CloseC | 0.4611 | 0.7293 | 0.3473 | 0.8124 | 0.3962 | 0.6654 |
| | ClusterC | 0.5336 | 0.7519 | 0.402 | 0.8376 | 0.4586 | 0.700 |
| | IC | 0.5478 | 0.7564 | 0.4127 | 0.8426 | 0.4707 | 0.7067 |
| | LAC | 0.5451 | 0.7555 | 0.4107 | 0.8417 | 0.4684 | 0.7054 |
| | PeC | 0.4717 | 0.7326 | 0.3553 | 0.8161 | 0.4053 | 0.6705 |
| | LID | 0.554 | 0.7583 | 0.4173 | 0.8447 | 0.4760 | 0.7097 |
| | LDS | | | | | | |
| DIP5093 | DC | 0.4901 | 0.7636 | 0.3813 | 0.8344 | 0.4289 | 0.701 |
| | SC | 0.4559 | 0.7534 | 0.3547 | 0.8233 | 0.399 | 0.6853 |
| | BC | 0.4165 | 0.7417 | 0.324 | 0.8105 | 0.3645 | 0.6672 |
| | CloseC | 0.4422 | 0.7494 | 0.344 | 0.8188 | 0.387 | 0.679 |
| | ClusterC | 0.4773 | 0.7598 | 0.3713 | 0.8302 | 0.4177 | 0.6951 |
| | IC | 0.4876 | 0.7629 | 0.3793 | 0.8336 | 0.4267 | 0.6998 |
| | LAC | 0.5193 | 0.7723 | 0.404 | 0.8439 | 0.4544 | 0.7143 |
| | PeC | 0.4619 | 0.7552 | 0.3593 | 0.8252 | 0.4042 | 0.688 |
| | LID | 0.5261 | 0.7743 | 0.4093 | 0.8461 | 0.4604 | 0.7175 |
| | LDS | | | | | | |
| MIPS4546 | DC | 0.4242 | 0.6972 | 0.2873 | 0.8079 | 0.3426 | 0.6362 |
| | SC | 0.2776 | 0.655 | 0.188 | 0.759 | 0.2242 | 0.5706 |
| | BC | 0.3917 | 0.6878 | 0.2653 | 0.7971 | 0.3164 | 0.6216 |
| | CloseC | 0.2825 | 0.6564 | 0.1913 | 0.7607 | 0.2281 | 0.5728 |
| | ClusterC | 0.4242 | 0.6972 | 0.2873 | 0.8079 | 0.3426 | 0.6361 |
| | IC | 0.3858 | 0.6861 | 0.2613 | 0.7951 | 0.3116 | 0.619 |
| | LAC | 0.4242 | 0.6972 | 0.2873 | 0.8079 | 0.3426 | 0.6362 |
| | PeC | 0.4232 | 0.6969 | 0.2867 | 0.8076 | 0.3418 | 0.6357 |
| | LID | 0.4311 | 0.6992 | 0.292 | 0.8102 | 0.3482 | 0.6392 |
| | LDS | | | | | | |
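The six measures compared in the table above follow the standard confusion-matrix definitions of sensitivity (SN), specificity (SP), positive predictive value (PPV), negative predictive value (NPV), F-measure, and accuracy (ACC). The Python sketch below computes them under the assumption that the top 1500 ranked proteins are treated as predicted essential. That cutoff is not stated in this excerpt; it is inferred because it reproduces the reported DC values for DIP4746. The true-positive count of 618 is likewise back-calculated from the reported PPV rather than quoted from the paper, and the function names are hypothetical.

```python
def counts_from_top_k(n_proteins, n_essential, top_k, true_positives):
    """Derive (TP, FP, FN, TN) when the top_k ranked proteins are
    predicted essential and all remaining proteins non-essential."""
    tp = true_positives
    fp = top_k - tp
    fn = n_essential - tp
    tn = n_proteins - n_essential - fp
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Standard confusion-matrix measures used in the comparison."""
    sn = tp / (tp + fn)    # sensitivity (recall)
    sp = tn / (tn + fp)    # specificity
    ppv = tp / (tp + fp)   # positive predictive value (precision)
    npv = tn / (tn + fn)   # negative predictive value
    f = 2 * sn * ppv / (sn + ppv)          # harmonic mean of SN and PPV
    acc = (tp + tn) / (tp + fp + fn + tn)  # overall accuracy
    return {"SN": sn, "SP": sp, "PPV": ppv, "NPV": npv, "F": f, "ACC": acc}


# DIP4746 has 4746 proteins, 1130 of them essential (from the dataset table).
# Assumed: top_k = 1500 and 618 true positives for DC (inferred, see above).
dc = classification_metrics(*counts_from_top_k(4746, 1130, 1500, 618))
for name, value in dc.items():
    print(f"{name}: {value:.4f}")
```

Rounded to four decimals, these values agree with the DC row for DIP4746 (0.5469, 0.7561, 0.412, 0.8423, 0.470, 0.7063), which supports the assumed cutoff without confirming it.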
Figure 6. Number of essential proteins predicted by LDS in the top 1000–1500 for the three datasets with different values of the parameter α.