Gaoshi Li, Min Li, Jianxin Wang, Jingli Wu, Fang-Xiang Wu, Yi Pan.
Abstract
BACKGROUND: Essential proteins play an indispensable role in cellular survival and development. A series of biological experimental methods exists for finding essential proteins; however, these methods are time-consuming, expensive and inefficient. To overcome these shortcomings, many computational methods have been proposed to predict essential proteins. The computational methods can be roughly divided into two categories: topology-based methods and sequence-based methods. The former use the topological features of protein-protein interaction (PPI) networks, while the latter use the sequence features of proteins. Nevertheless, improving the prediction accuracy of the computational methods remains challenging.
Keywords: Essential proteins; Orthology; Protein-protein interaction network; Subcellular localization
Year: 2016 PMID: 27586883 PMCID: PMC5009824 DOI: 10.1186/s12859-016-1115-5
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Table 1 Number and ratio of essential and nonessential proteins in each subcellular location
| Subcellular location | Essential proteins number | Essential proteins ratio | Nonessential proteins number | Nonessential proteins ratio |
|---|---|---|---|---|
| Cytoskeleton | 95 | 0.081 | 133 | 0.033 |
| Golgi apparatus | 61 | 0.052 | 184 | 0.046 |
| Cytosol | 138 | 0.118 | 289 | 0.073 |
| Endosome | 22 | 0.019 | 109 | 0.027 |
| Mitochondrion | 173 | 0.148 | 753 | 0.189 |
| Plasma membrane | 53 | 0.045 | 354 | 0.089 |
| Nucleus | 809 | 0.693 | 1407 | 0.353 |
| Extracellular space | 1 | 0.001 | 70 | 0.018 |
| Vacuole | 19 | 0.016 | 238 | 0.060 |
| Endoplasmic reticulum | 137 | 0.117 | 292 | 0.073 |
| Peroxisome | 4 | 0.003 | 61 | 0.015 |
To understand the association between subcellular localization and protein essentiality, we first count the number of essential and nonessential proteins in each subcellular location and then calculate their ratios. According to Table 1, the ratio of essential proteins is higher than that of nonessential proteins in the Cytoskeleton, Golgi apparatus, Cytosol, Nucleus and Endoplasmic reticulum. Hence, these five subcellular locations are positively correlated with essential proteins, while the other six are negatively correlated.
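The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: it reuses the ratios printed in Table 1 as given (the totals used as denominators are not stated here), and the names `RATIOS` and `positively_correlated` are hypothetical.

```python
# Ratios copied from Table 1: location -> (essential ratio, nonessential ratio).
RATIOS = {
    "Cytoskeleton": (0.081, 0.033),
    "Golgi apparatus": (0.052, 0.046),
    "Cytosol": (0.118, 0.073),
    "Endosome": (0.019, 0.027),
    "Mitochondrion": (0.148, 0.189),
    "Plasma membrane": (0.045, 0.089),
    "Nucleus": (0.693, 0.353),
    "Extracellular space": (0.001, 0.018),
    "Vacuole": (0.016, 0.060),
    "Endoplasmic reticulum": (0.117, 0.073),
    "Peroxisome": (0.003, 0.015),
}


def positively_correlated(ratios):
    """Return locations whose essential ratio exceeds the nonessential ratio."""
    return [loc for loc, (ess, non) in ratios.items() if ess > non]


positive = positively_correlated(RATIOS)
negative = [loc for loc in RATIOS if loc not in positive]
print("positive:", positive)
print("negative:", negative)
```

Under this simple rule, the positive set is the five locations named in the text and the remaining six fall into the negative set.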
Fig. 1 Influence of parameters α and β. (a) Top 1 % (Top 51) (b) Top 5 % (Top 255) (c) Top 10 % (Top 510) (d) Top 15 % (Top 764) (e) Top 20 % (Top 1019) (f) Top 25 % (Top 1274)
Fig. 2 Influence of parameters α and β for SON
Fig. 3 SON compared with several existing methods. (a) Top 1 % (Top 51) (b) Top 5 % (Top 255) (c) Top 10 % (Top 510) (d) Top 15 % (Top 764) (e) Top 20 % (Top 1019) (f) Top 25 % (Top 1274)
Fig. 4 PR curves of SON and of other methods
Fig. 5 Jackknife curves of SON and nine other methods
Table 2 Number of high- and low-connectivity essential proteins predicted by SON and nine other existing methods
| Degree | Top K | DC | IC | EC | SC | BC | CC | NC | PeC | ION | SON |
|---|---|---|---|---|---|---|---|---|---|---|---|
| degree ≤ 10 | 1 % | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 17 | 14 |
| | 5 % | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 40 | 66 | 64 |
| | 10 % | 0 | 0 | 0 | 0 | 1 | 0 | 27 | 84 | 108 | 116 |
| | 15 % | 0 | 0 | 8 | 8 | 18 | 7 | 66 | 117 | 146 | 156 |
| | 20 % | 0 | 0 | 28 | 28 | 41 | 20 | 101 | 153 | 188 | 193 |
| | 25 % | 11 | 20 | 73 | 73 | 76 | 55 | 156 | 192 | 253 | 220 |
| degree > 10 | 1 % | 22 | 24 | 24 | 24 | 24 | 24 | 32 | 39 | 24 | 29 |
| | 5 % | 101 | 102 | 96 | 96 | 95 | 104 | 156 | 133 | 122 | 129 |
| | 10 % | 207 | 210 | 195 | 195 | 181 | 193 | 255 | 209 | 223 | 233 |
| | 15 % | 320 | 316 | 271 | 271 | 253 | 277 | 307 | 255 | 299 | 313 |
| | 20 % | 413 | 406 | 349 | 349 | 320 | 344 | 363 | 285 | 359 | 387 |
| | 25 % | 491 | 484 | 394 | 394 | 357 | 393 | 388 | 302 | 381 | 441 |
As shown in the upper part of Table 2 (degree ≤ 10), the eight centrality methods are weak at predicting essential proteins with low connectivity. For example, among the top 20 % of proteins ranked in descending order of the scores computed by DC and IC, no low-connectivity essential protein is predicted. In this group, SON performs better overall than the eight centrality methods (DC, IC, EC, SC, BC, CC, NC and PeC). When K is 10 %, 15 % and 20 %, SON also outperforms ION.
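The comparisons in this paragraph can be checked directly against the low-connectivity rows of Table 2. The sketch below is illustrative only: the counts are copied from the table, and the names `METHODS`, `LOW_DEGREE` and `cutoffs_where_son_beats` are hypothetical helpers rather than part of the SON method.

```python
# Column order of Table 2.
METHODS = ["DC", "IC", "EC", "SC", "BC", "CC", "NC", "PeC", "ION", "SON"]

# Low-connectivity rows of Table 2: top-K percentage -> counts per method.
LOW_DEGREE = {
    1: [0, 0, 0, 0, 0, 0, 0, 0, 17, 14],
    5: [0, 0, 0, 0, 0, 0, 3, 40, 66, 64],
    10: [0, 0, 0, 0, 1, 0, 27, 84, 108, 116],
    15: [0, 0, 8, 8, 18, 7, 66, 117, 146, 156],
    20: [0, 0, 28, 28, 41, 20, 101, 153, 188, 193],
    25: [11, 20, 73, 73, 76, 55, 156, 192, 253, 220],
}


def cutoffs_where_son_beats(method, table=LOW_DEGREE):
    """Return the top-K percentages at which SON predicts strictly more
    essential proteins than the given method."""
    other = METHODS.index(method)
    son = METHODS.index("SON")
    return [k for k, row in table.items() if row[son] > row[other]]


for name in METHODS[:-1]:
    print(name, cutoffs_where_son_beats(name))
```

On these numbers SON exceeds each of the eight centrality methods at every cutoff, and exceeds ION at the 10 %, 15 % and 20 % cutoffs; ION predicts more at 1 %, 5 % and 25 %.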