Daniel Restrepo-Montoya, Carolina Vizcaíno, Luis F Niño, Marisol Ocampo, Manuel E Patarroyo, Manuel A Patarroyo.
Abstract
BACKGROUND: The computational prediction of mycobacterial proteins' subcellular localization is of key importance for proteome annotation and for the identification of new drug targets and vaccine candidates. Several subcellular localization classifiers have been developed over the past few years, comprising both general localization and feature-based classifiers. Here, we validated the ability of different bioinformatics approaches, namely SignalP 2.0, TatP 1.0, LipoP 1.0, Phobius, PA-SUB 2.5, PSORTb v.2.0.4 and Gpos-PLoc, to predict secreted bacterial proteins. These computational tools were compared in terms of sensitivity, specificity and Matthews correlation coefficient (MCC) using a set of mycobacterial proteins sharing less than 40% identity, none of which are included in the training data sets of the validated tools and whose subcellular localization has been experimentally confirmed. These proteins belong to the training data set of TBpred, a computational tool specifically designed to predict the subcellular localization of mycobacterial proteins.
Year: 2009 PMID: 19422713 PMCID: PMC2685389 DOI: 10.1186/1471-2105-10-134
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Comparison between feature-based and general localization tools according to the validation metrics.
| TPs | 91 | 87 | 55 | 204 | 147 | 145 |
| TNs | 65 | 65 | 64 | 68 | 53 | 56 |
| FPs | 3 | 3 | 4 | 0 | 14 | 1 |
| FNs | 113 | 117 | 149 | 0 | 57 | 14 |
| Specificity, TP/(TP+FP) | 0.97 | 0.97 | 0.93 | 1.0 | 0.91 | 0.99 |
| Sensitivity, TP/(TP+FN) | 0.45 | 0.43 | 0.27 | 1.0 | 0.72 | 0.91 |
| MCC | 0.37 | 0.35 | 0.22 | 1.0 | 0.45 | 0.84 |
| Secreted proteins evaluated (TPs + FNs) | 204 | 204 | 204 | 204 | 204 | 159 |
| Non-secreted proteins evaluated (TNs + FPs) | 68 | 68 | 68 | 68 | 67 | 57 |
TPs, True Positives. TNs, True Negatives. FPs, False Positives. FNs, False Negatives. Secreted proteins (n = 204), non-secreted proteins (n = 68), N = 272
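The validation metrics can be recomputed from the four confusion-matrix counts. The following is a minimal sketch, not the authors' code: it assumes the first four rows of the table hold TPs, TNs, FPs and FNs in that order (an inference from the row sums, since secreted n = 204 and non-secreted n = 68), and it uses the standard textbook formulas. The function name `localization_metrics` and its dictionary keys are hypothetical. Because this excerpt does not state which definition of specificity was used, both TP/(TP+FP) and TN/(TN+FP) are returned.

```python
import math


def localization_metrics(tp, tn, fp, fn):
    """Return validation metrics from confusion-matrix counts.

    Hypothetical helper: the formulas are the standard ones and are an
    assumption here, since the excerpt does not spell them out.
    """
    nan = float("nan")
    # Fraction of truly secreted proteins that were predicted as secreted.
    sensitivity = tp / (tp + fn) if (tp + fn) else nan
    # Fraction of "secreted" predictions that were correct.
    precision = tp / (tp + fp) if (tp + fp) else nan
    # Fraction of non-secreted proteins that were predicted as non-secreted.
    true_negative_rate = tn / (tn + fp) if (tn + fp) else nan
    # Matthews correlation coefficient; set to 0.0 when a marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "true_negative_rate": true_negative_rate,
        "mcc": mcc,
    }


# Counts taken from the first data column of the table (assumed row order).
metrics = localization_metrics(tp=91, tn=65, fp=3, fn=113)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

For these counts the sketch gives a sensitivity of 0.45, a TP/(TP+FP) value of 0.97 and an MCC of 0.37, which agree with the corresponding entries in the same column.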
Protein sequence features related to subcellular localization prediction.
| SignalP 2.0 (Sec-dependent) | X | X | X | |||||||
| TatP 1.0 (altern system) | X | X | X | X | ||||||
| LipoP 1.0 (Sec-dependent) | X | X | X | X | X | X | ||||
| Phobius | X | X | X | |||||||
| PA-SUB 2.5 | X | X | X | |||||||
| Gpos-PLoc | X | X | X | X | X | |||||
| PSORTb v.2.0.4 | X | X | X | X | ||||||
SP, Signal Peptidase. SPI, Signal Peptidase I. SPII, Signal Peptidase II. Cyt, Cytoplasmic. TM, Transmembrane protein.
Criteria used for constructing the confusion matrix.
| TP | Protein predicted as secreted that is secreted. |
| TN | Protein predicted as non-secreted that is non-secreted. |
| FP | Protein predicted as secreted that is non-secreted. |
| FN | Protein predicted as non-secreted that is secreted. |
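The four criteria above amount to a simple rule for assigning each protein to one cell of the confusion matrix. The following is a minimal sketch under stated assumptions: each protein is represented as a pair of booleans (predicted as secreted, experimentally secreted), the names `confusion_cell` and `tally_confusion_matrix` are hypothetical, and the toy input is invented purely for illustration and is not data from the study.

```python
from collections import Counter


def confusion_cell(predicted_secreted, is_secreted):
    """Map one prediction to TP, TN, FP or FN following the four criteria."""
    if predicted_secreted and is_secreted:
        return "TP"  # predicted as secreted, being secreted
    if not predicted_secreted and not is_secreted:
        return "TN"  # predicted as non-secreted, being non-secreted
    if predicted_secreted and not is_secreted:
        return "FP"  # predicted as secreted, being non-secreted
    return "FN"      # predicted as non-secreted, being secreted


def tally_confusion_matrix(pairs):
    """Count how many (predicted, actual) pairs fall into each cell."""
    counts = Counter({"TP": 0, "TN": 0, "FP": 0, "FN": 0})
    for predicted_secreted, is_secreted in pairs:
        counts[confusion_cell(predicted_secreted, is_secreted)] += 1
    return dict(counts)


# Invented toy input: (predicted as secreted, experimentally secreted).
toy_pairs = [
    (True, True),
    (True, False),
    (False, False),
    (False, True),
    (True, True),
]
toy_counts = tally_confusion_matrix(toy_pairs)
print(toy_counts)
```

Keeping the per-protein rule separate from the tally makes it easy to apply the same criteria to the output of any one of the compared tools, provided its predictions are first reduced to a secreted or non-secreted call.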