Xiaohua Xu, Lin Lu, Ping He, Ling Chen.
Abstract
BACKGROUND: Understanding where proteins are localized in cells is vital to characterizing their functions and possible interactions. Identifying the (sub)cellular compartment in which a protein is located is therefore an important problem in protein classification. The task amounts to predicting labels in a dataset in which only a limited number of data points are labeled. Using a graph representation of protein data, random walk techniques have performed well in sequence classification and functional prediction; however, they have not yet been applied to protein localization. Accordingly, we propose a novel classifier for predicting the localization sites of proteins, based on random walks on a graph.
Year: 2013 PMID: 23815126 PMCID: PMC3654884 DOI: 10.1186/1471-2105-14-S8-S4
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Classification accuracies (in %) of yeast data given varying random walk steps and laziness parameters.
Figure 2. Classification accuracies (in %) of gram-negative bacteria data given different random walk steps and laziness parameters.
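The two quantities varied in Figures 1 and 2, the number of random walk steps and the laziness parameter, can be illustrated with a minimal Python sketch of a lazy random walk classifier on a partially labeled graph. This is not the authors' RaWa implementation, which this page does not describe. The function names, the toy graph, and the decision rule (assign each unlabeled node the label whose labeled nodes receive the most walk probability after a fixed number of steps) are all assumptions made for illustration, as is the reading of "laziness" as the probability of staying at the current node in each step.

```python
def lazy_transition(weights, laziness):
    """Build the transition matrix of a lazy random walk.

    Assumption: with probability `laziness` the walker stays put; otherwise
    it moves to a neighbour with probability proportional to edge weight.
    """
    n = len(weights)
    P = []
    for i in range(n):
        degree = sum(weights[i])
        row = []
        for j in range(n):
            if degree == 0:
                # An isolated node can only stay where it is.
                row.append(1.0 if i == j else 0.0)
            else:
                stay = laziness if i == j else 0.0
                row.append(stay + (1.0 - laziness) * weights[i][j] / degree)
        P.append(row)
    return P


def walk_distribution(P, start, steps):
    """Distribution over nodes after `steps` steps of a walk from `start`."""
    n = len(P)
    dist = [0.0] * n
    dist[start] = 1.0
    for _ in range(steps):
        dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
    return dist


def classify(weights, labels, steps=3, laziness=0.5):
    """Label each unlabeled node (label None) by where its walk tends to land."""
    P = lazy_transition(weights, laziness)
    predictions = {}
    for node, label in enumerate(labels):
        if label is not None:
            continue
        dist = walk_distribution(P, node, steps)
        scores = {}
        for other, other_label in enumerate(labels):
            if other_label is not None:
                scores[other_label] = scores.get(other_label, 0.0) + dist[other]
        predictions[node] = max(scores, key=scores.get)
    return predictions


# Hypothetical toy graph: a path 0-1-2-3-4 with a weak edge between 2 and 3.
weights = [
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.2, 0.0],
    [0.0, 0.0, 0.2, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
]
labels = ["CYT", None, None, None, "NUC"]  # only the two end nodes are labeled

predictions = classify(weights, labels, steps=3, laziness=0.5)
print(predictions)  # {1: 'CYT', 2: 'CYT', 3: 'NUC'}
```

In this toy graph the weak edge between nodes 2 and 3 makes a walk from node 2 more likely to reach the labeled node 0 than the labeled node 4, so node 2 takes the label of node 0. Changing `steps` or `laziness` changes how far the walk spreads before the decision is made, which is the kind of dependence the two figures report.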
Sensitivity and specificity for yeast data using 10-fold cross-validation, including the total prediction accuracy.
| Site | RaWa sensitivity | RaWa specificity | SVM sensitivity | SVM specificity |
|---|---|---|---|---|
| MIT | 57.38 | 68.29 | 54.9 | 65.0 |
| NUC | 54.08 | 59.95 | 51.0 | 64.0 |
| CYT | 68.90 | 55.67 | 72.1 | 47.7 |
| ME1 | 84.09 | 55.22 | 72.7 | 68.1 |
| EXC | 51.43 | 64.29 | 57.1 | 58.8 |
| ME2 | 39.22 | 57.14 | 41.2 | 52.5 |
| ME3 | 77.91 | 74.71 | 81.6 | 76.4 |
| VAC | 0 | - | 0 | - |
| POX | 55.00 | 84.62 | 0 | - |
| ERL | 1 | 83.33 | 0 | 0 |
| Total accuracy | 61.3±0.11 | | 60.2±0.28 | |
Sensitivity and specificity for gram-negative bacteria data using 10-fold cross-validation, including the total prediction accuracy.
| Site | RaWa sensitivity | RaWa specificity | SVM sensitivity | SVM specificity |
|---|---|---|---|---|
| Cytoplasm | 89.3 | 94.0 | 93.6 | 85.6 |
| Extracellular | 82.4 | 91.0 | 83.8 | 86.1 |
| Inner membrane | 98.2 | 93.7 | 95.9 | 96.5 |
| Outer membrane | 85.6 | 89.2 | 84.5 | 90.1 |
| Periplasm | 79.3 | 91.1 | 84.5 | 85.2 |
| Total accuracy | 93.3±0.24 | | 92.1±0.46 | |
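As a reading aid for the two tables above, the following Python sketch computes per-class sensitivity, specificity, and overall accuracy from true and predicted labels. It assumes the standard one-vs-rest definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)); this page does not state which definitions the authors used, so the sketch may not reproduce their figures. The function names and the six-protein toy data are hypothetical.

```python
def per_class_rates(true_labels, predicted_labels):
    """Return {class: (sensitivity %, specificity %)} using one-vs-rest counts.

    A rate is None when its denominator is zero (the ratio is undefined).
    """
    classes = sorted(set(true_labels) | set(predicted_labels))
    rates = {}
    for cls in classes:
        tp = fn = fp = tn = 0
        for truth, guess in zip(true_labels, predicted_labels):
            if truth == cls and guess == cls:
                tp += 1
            elif truth == cls:
                fn += 1
            elif guess == cls:
                fp += 1
            else:
                tn += 1
        sensitivity = 100.0 * tp / (tp + fn) if (tp + fn) else None
        specificity = 100.0 * tn / (tn + fp) if (tn + fp) else None
        rates[cls] = (sensitivity, specificity)
    return rates


def accuracy(true_labels, predicted_labels):
    """Percentage of items whose predicted label matches the true label."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * hits / len(true_labels)


# Hypothetical toy labels, not taken from the study.
truth = ["CYT", "CYT", "CYT", "NUC", "NUC", "MIT"]
guess = ["CYT", "CYT", "NUC", "NUC", "CYT", "MIT"]

rates = per_class_rates(truth, guess)
print(rates["NUC"])            # (50.0, 75.0)
print(accuracy(truth, guess))  # about 66.67
```

In a 10-fold cross-validation setting, these quantities would typically be computed on each held-out fold and then averaged, or computed once on the pooled held-out predictions.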
Figure 3. ROC curves illustrating the comparison of RaWa and SVM methods on data from yeast.
Figure 4. ROC curves illustrating the comparison of RaWa and SVM methods on data from gram-negative bacteria.
Information about gram-negative and yeast data
| Proteins | Site | Number |
|---|---|---|
| Gram-negative bacteria proteins | Cytoplasm | 140 |
| | Extracellular | 74 |
| | Inner membrane | 687 |
| | Outer membrane | 97 |
| | Periplasm | 116 |
| Yeast | Cytosolic or cytoskeletal (CYT) | 463 |
| | Nuclear (NUC) | 429 |
| | Mitochondrial (MIT) | 244 |
| | Membrane protein, no N-terminal signal (ME1) | 163 |
| | Membrane protein, uncleaved signal (ME2) | 51 |
| | Membrane protein, cleaved signal (ME3) | 44 |
| | Extracellular (EXC) | 37 |
| | Vacuolar (VAC) | 30 |
| | Peroxisomal (POX) | 20 |
| | Endoplasmic reticulum lumen (ERL) | 5 |
Figure 5. A simple partially labeled graph.