Abstract
BACKGROUND: Protein subcellular localization is crucial information for elucidating protein functions. Owing to the need for large-scale genome analysis, computational methods that efficiently predict protein subcellular localization are in high demand. Although much previous work has addressed this task, the problem remains challenging for several reasons: the number of subcellular locations in practice is large; the distribution of proteins across locations is imbalanced, that is, the number of proteins in each location differs markedly; and many proteins are located in multiple locations. It is therefore necessary to explore new features and appropriate classification methods to improve prediction performance.
Year: 2009 PMID: 19208145 PMCID: PMC2648781 DOI: 10.1186/1471-2105-10-S1-S43
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Prediction flowchart.
Numbers of proteins in the dataset
| Subcellular locations | Number of proteins |
| Mitochondrion | 494 |
| Vacuole | 129 |
| Spindle pole | 58 |
| Cell periphery | 106 |
| Punctate composite | 123 |
| Vacuolar membrane | 54 |
| ER | 272 |
| Nuclear periphery | 59 |
| Endosome | 43 |
| Bud neck | 60 |
| Microtubule | 20 |
| Golgi | 40 |
| Late Golgi | 36 |
| Peroxisome | 20 |
| Actin | 29 |
| Nucleolus | 157 |
| Cytoplasm | 1576 |
| ER to Golgi | 6 |
| Early Golgi | 51 |
| Lipid particle | 19 |
| Nucleus | 1333 |
| Bud | 23 |
| Total number of classified proteins | 4708 |
| Total number of different proteins | 3552 |
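The two totals in the table can be checked directly from the per-location counts, and together they quantify the imbalance and multi-location properties described in the abstract. The sketch below is illustrative only: the variable names are hypothetical, and "label cardinality" (average number of locations per protein) is a standard multi-label statistic that the source does not itself report.

```python
# Protein counts per subcellular location, copied from the table above.
counts = {
    "Mitochondrion": 494, "Vacuole": 129, "Spindle pole": 58,
    "Cell periphery": 106, "Punctate composite": 123,
    "Vacuolar membrane": 54, "ER": 272, "Nuclear periphery": 59,
    "Endosome": 43, "Bud neck": 60, "Microtubule": 20, "Golgi": 40,
    "Late Golgi": 36, "Peroxisome": 20, "Actin": 29, "Nucleolus": 157,
    "Cytoplasm": 1576, "ER to Golgi": 6, "Early Golgi": 51,
    "Lipid particle": 19, "Nucleus": 1333, "Bud": 23,
}

# Each (protein, location) pair is counted once, so a protein found in
# several locations contributes to several rows.
total_annotations = sum(counts.values())
distinct_proteins = 3552  # "Total number of different proteins" in the table

# Average number of locations per protein (assumed summary statistic).
label_cardinality = total_annotations / distinct_proteins

# Ratio between the most and least populated locations.
largest = max(counts, key=counts.get)
smallest = min(counts, key=counts.get)
imbalance_ratio = counts[largest] / counts[smallest]

print(f"annotations: {total_annotations}")
print(f"locations per protein: {label_cardinality:.2f}")
print(f"{largest} vs {smallest}: {imbalance_ratio:.1f}x")
```

The per-location counts sum to the 4708 classified proteins listed in the table, which exceeds the 3552 distinct proteins because multi-location proteins are counted once per location.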
Performance comparison between fuzzy k-NN and k-NN models in three measures
| | ISORT (1-NN) | ISORT (2-NN) | ISORT (3-NN) | Fuzzy K-NN ( |
| Measure I (k = 1) (%) | 50.68 | 55.41 | 56.91 | |
| Measure I (k = 2) (%) | 59.67 | 68.85 | 70.40 | |
| Measure I (k = 3) (%) | 60.23 | 72.93 | 76.96 | |
| Measure II (%) | 47.83 | 55.73 | 58.63 | |
| Mitochondrion | 28.43 | 38.13 | 35.12 | |
| Vacuole | 30.26 | 26.32 | 26.32 | |
| Spindle pole | 27.78 | 16.67 | 22.22 | |
| Cell periphery | 26.98 | 31.75 | 30.16 | |
| Punctate composite | 6.56 | 4.92 | 3.28 | |
| Vacuolar membrane | 8.11 | 0 | 8.11 | |
| ER | 41.61 | 44.97 | 41.61 | |
| Nuclear periphery | 35.00 | 45.00 | ||
| Endosome | ||||
| Bud neck | 30.56 | 33.33 | ||
| Microtubule | ||||
| Golgi | 23.81 | 23.81 | ||
| Late Golgi | 13.04 | 17.39 | ||
| Peroxisome | ||||
| Actin | 23.53 | 23.53 | ||
| Nucleolus | 13.92 | 15.19 | 20.25 | |
| Cytoplasm | 49.08 | 64.72 | 66.18 | |
| ER to Golgi | ||||
| Early Golgi | 20.00 | 30.00 | 26.67 | |
| Lipid particle | 18.18 | 9.09 | 9.09 | |
| Nucleus | 63.47 | 78.03 | 77.91 | |
| Bud | 53.85 | 23.08 | 7.69 | |
| Measure III (%) | 37.98 | 34.77 | 35.35 | |
Figure 2. Prediction performance of k-NN and fuzzy k-NN on three measures (M-I, M-II and M-III), plotted against the number of nearest neighbours.
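This excerpt names a fuzzy k-NN classifier but does not give its decision rule. As a rough illustration, the sketch below assumes the standard formulation of Keller et al. (1985), in which each class membership of a query is a distance-weighted average of the memberships of its k nearest neighbours. The function name, the toy feature vectors, and the choice to split a two-location protein's membership evenly (0.5/0.5) are all assumptions made for this example, not details taken from the source.

```python
import math


def fuzzy_knn_memberships(query, train_x, train_u, k=3, m=2.0, eps=1e-9):
    """Class memberships of `query` under the assumed Keller-style rule.

    train_x: list of feature vectors.
    train_u: list of membership vectors (one value per class).
    m:       fuzzifier (> 1); m = 2 gives inverse squared-distance weights.
    """
    # Euclidean distance from the query to every training point.
    dists = [math.dist(query, x) for x in train_x]
    # Indices of the k closest training points.
    nearest = sorted(range(len(train_x)), key=dists.__getitem__)[:k]
    # Weight of neighbour j: 1 / d_j^(2 / (m - 1)); eps avoids division by zero.
    weights = [1.0 / max(dists[j], eps) ** (2.0 / (m - 1.0)) for j in nearest]
    total = sum(weights)
    n_classes = len(train_u[0])
    # Weighted average of neighbour memberships, class by class.
    return [
        sum(w * train_u[j][c] for w, j in zip(weights, nearest)) / total
        for c in range(n_classes)
    ]


# Toy data: three hypothetical locations; the second protein sits in two.
train_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
train_u = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

u = fuzzy_knn_memberships((0.1, 0.1), train_x, train_u, k=3)
print([round(v, 3) for v in u])
```

Unlike a plain k-NN vote, the output is a graded membership per location, so several locations can be reported for one protein by thresholding or ranking the memberships.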
A prediction performance comparison to show the effectiveness of incorporating neighbourhood information
| | No NI(*) | NI(*) |
| Prediction coverage (%) | 60 | |
| Measure-I ( | 86.14 | |
| Measure-II (%) | 63.52 | |
| Mitochondrion | 35.12 | |
| Vacuole | 15.30 | |
| Spindle pole | 38.89 | |
| Cell periphery | 30.34 | |
| Punctate composite | 16.50 | |
| Vacuolar membrane | 8.11 | |
| ER | 51.89 | |
| Nuclear periphery | 48.98 | |
| Endosome | 40.74 | |
| Bud neck | 36.11 | |
| Microtubule | 31.58 | |
| Golgi | 23.81 | |
| Late Golgi | 21.74 | |
| Peroxisome | 33.33 | |
| Actin | 52.94 | |
| Nucleolus | 32.91 | |
| Cytoplasm | 79.88 | |
| ER to Golgi | 66.67 | |
| Early Golgi | 26.67 | |
| Lipid particle | 6.67 | |
| Nucleus | 77.91 | |
| Bud | 7.69 | |
| Measure-III (%) | 39.07 | |
(*) Note: No NI: Prediction without incorporating neighbourhood information; NI: Prediction with neighbourhood information.
Prediction performance (%) of ISORT, PLPD and proposed method
| | ISORT | PLPD | Proposed method |
| Measure-I ( | 72.89 | 85.25 | |
| Measure-II (%) | 53.84 | 59.26 | |
| Mitochondrion | 32.1862 | 0.81 | |
| Vacuole | 17.8295 | 16.2791 | |
| Spindle pole | 12.069 | 42.31 | |
| Cell periphery | 22.6415 | 21.25 | |
| Punctate composite | 1.626 | 13.8211 | |
| Vacuolar membrane | 0 | 18.5185 | |
| ER | 32.3529 | 1.04 | |
| Nuclear periphery | 20.339 | 5.41 | |
| Endosome | 25.5814 | 34.29 | |
| Bud neck | 26.6667 | 19.15 | |
| Microtubule | 25 | 30 | |
| Golgi | 17.5 | 32.14 | |
| Late Golgi | 11.1111 | 25 | |
| Peroxisome | 25 | 33.33 | |
| Actin | 17.2414 | 25 | |
| Nucleolus | 22.293 | 11.4 | |
| Cytoplasm | 64.5305 | 77.7919 | |
| ER to Golgi | 50 | 66.6667 | |
| Early Golgi | 27.451 | 32.43 | |
| Lipid particle | 15.7895 | 5.2632 | |
| Nucleus | 65.28 | 79.6699 | |
| Bud | 39.1304 | 21.7391 | |
| Measure-III (%) | 26.7254 | 34.74 | |