Yuhui Hu, Hans Lehrach, Michal Janitz.
Abstract
The subcellular localization of a protein can provide important information about its function within the cell. As eukaryotic cells and particularly mammalian cells are characterized by a high degree of compartmentalization, most protein activities can be assigned to particular cellular compartments. The categorization of proteins by their subcellular localization is therefore one of the essential goals of the functional annotation of the human genome. We previously performed a subcellular localization screen of 52 proteins encoded on human chromosome 21. In the current study, we compared the experimental localization data to the in silico results generated by nine leading software packages with different prediction resolutions. The comparison revealed striking differences between the programs in the accuracy of their subcellular protein localization predictions. Our results strongly suggest that the recently developed predictors utilizing multiple prediction methods tend to perform significantly better than purely sequence-based or homology-based predictors.
Year: 2009 PMID: 20033263 PMCID: PMC2834777 DOI: 10.1007/s10735-009-9247-9
Source DB: PubMed Journal: J Mol Histol ISSN: 1567-2379 Impact factor: 2.611
Comparison of the protein localization prediction software programs used in the study
| Software | Prediction strategy | Number of predicted localizations* | Reference |
|---|---|---|---|
| SherLoc2 | Sequence-based predictions (aa composition, sorting signals), homology similarity, GO terms | 9 | Briesemeister et al. ( |
| WoLF-PSORT | Sequence-based predictions (aa composition, sorting signals, functional motifs), homology similarity | 11 | Horton et al. ( |
| pTARGET | Sequence-based predictions (aa composition, localization-specific Pfam domains) | 9 | Guda ( |
| ProtComp8 | Sequence-based predictions (signal sequences, anchors, other functional peptides), homology similarity | 9 | |
| PA-SUB v.2.5 | Homology similarity | 9 | Lu et al. ( |
| MultiLoc2# | Sequence-based predictions (aa composition, sorting signals), homology similarity, GO terms | 4 | Blum et al. ( |
| ESLPred2 | Sequence-based predictions (aa composition, sorting signals), homology similarity | 4 | Garg and Raghava ( |
| BaCelLo | Sequence composition | 4 | Pierleoni et al. ( |
| SubLoc | Aa composition | 4 | Hua and Sun ( |
* The number of sites was counted only for eukaryotic proteins
#Only the low-resolution function of MultiLoc2 was used; the high-resolution module was included in SherLoc2; aa amino acids
Comparison of experimental localization results for 52 Chr.21 proteins to in silico low-resolution predictions
| Gene symbol | GenBank protein acc. no. | Function class | Localization in HEK293T | MultiLoc2-LowRes | ESLPred2 | BaCelLo | SubLoc |
|---|---|---|---|---|---|---|---|
| | CAA62631.1 | ATPase | PM/Golgi | Cyto | Cyto | Cyto | Secr. Path. |
| | AAH11971.1 | Acyltransferase | ER/PM (less) | Mito | Secr. Path. | Secr. Path. | Cyto |
| | NP_006048.1 | Galactosyl-transferase | Golgi/ER | Secr. Path. | Secr. Path. | Cyto | Mito |
| | BAA24932.1 | Transcription regulation | Cyto (punct) Nuc-M-phase | Nuc | Nuc | Nuc | Nuc |
| | NP_853633.1 | Unclear | Cyto | Cyto | Secr. Path. | Secr. Path. | Nuc |
| | AAL34462.1 | Unknown | Nuc/Cyto | Cyto | Cyto | Cyto | Nuc |
| | XP_032945.2 | Unknown | Nuc/Cyto | Cyto | Nuc | Nuc | Nuc |
| | CAB56001.2 | Unknown | Nuc | Cyto | Nuc | Nuc | Nuc |
| | AAC05974.2 | Unknown | PM | Cyto | Secr. Path. | Secr. Path. | Cyto |
| | AAG00496.1 | Unknown | Nuc/Cyto | Cyto | Mito | Cyto | Cyto |
| | AAK60445.1 | Unknown | ER | Mito | Nuc | Secr. Path. | Nuc |
| | NP_079419.1 | Unknown | Cyto (punct) | Nuc | Nuc | Mito | Mito |
| | NP_000062.1 (splicing isoform) | Cystathionine-beta-synthase | Cyto | Cyto | Cyto | Cyto | Cyto |
| | BAA02792.1 | Chaperonin | Cyto | Cyto | Cyto | Cyto | Cyto |
| | NP_005432.1 | Chromatin assembly factor | Nucleoplasm Cyto-M phase | Cyto | Nuc | Nuc | Cyto |
| | AAG60052.1 | Tight junction | ER/PM (less) | Secr. Path. | Secr. Path. | Secr. Path. | Secr. Path. |
| | CAB60616.1 | Tight junction | PM/Golgi | Secr. Path. | Secr. Path. | Secr. Path. | Secr. Path. |
| | NP_036264.1 | Tight junction | ER/PM (less) | Secr. Path. | Secr. Path. | Secr. Path. | Secr. Path. |
| | BAA91605.1 | Oxidoreductase | Cyto | Cyto | Cyto | Cyto | Cyto |
| | AAH10536.1 | Receptor | PM | Secr. Path. | Secr. Path. | Cyto | Nuc |
| | AAH02560.1 | Methyltransferase-like | Nuc/Cyto | Cyto | Secr. Path. | Cyto | Cyto |
| | NP_006043.1 | Unknown | Nuc | Cyto | Cyto | Cyto | Cyto |
| | NP_005230.1 | Transcription factor | Nuc | Nuc | Nuc | Cyto | Nuc |
| | AAD34617.1 | Transcriptional repressor | Cyto | Nuc | Nuc | Nuc | Nuc |
| | NP_000402.2 | Protein ligase | Cyto | Cyto | Nuc | Cyto | Cyto |
| | AAA52676.1 | DNA binding | Nuc | Nuc | Nuc | Nuc | Nuc |
| | NP_008962.1 | Transcription factor binding | Cyto | Cyto | Nuc | Cyto | Cyto |
| | AAH03624.1 | Receptor | ER/PM (less) | Secr. Path. | Secr. Path. | Secr. Path. | Secr. Path. |
| | AAH36452.1 | K-channel | Lyso/PM | Secr. Path. | Secr. Path. | Cyto | Secr. Path. |
| | NP_005127.1 | K-channel | Lyso/PM | Cyto | Secr. Path. | Secr. Path. | Secr. Path. |
| | NP_002234.2 | K-channel | PM/Golgi | Cyto | Cyto | Cyto | Cyto |
| | NP_002231.1 | K-channel | PM/Golgi | Cyto | Cyto | Cyto | Cyto |
| | XP_035973.4 | Unknown | Nuc/Cyto (punct)-M phase | Nuc | Nuc | Nuc | Nuc |
| | BAA25170.1 | DNA binding | Cyto/Nuc | Cyto | Cyto | Cyto | Cyto |
| | NP_002453.1 | Dynamin and large GTPases | Cyto (punct) | Cyto | Mito | Cyto | Cyto |
| | AAH00380.1 | RNA processing | Nucleolus | Nuc | Nuc | Nuc | Cyto |
| | AAH12061.1 | RNA binding | Cyto/Nuc | Cyto | Cyto | Nuc | Cyto |
| | CAA63724.1 | Unknown | Nuc/Cyto | Cyto | Nuc | Cyto | Mito |
| | AAH09047.1 | Phosphodiesterase | Cyto (accum) | Cyto | Cyto | Nuc | Nuc |
| | AAH00123.1 | Kinase | Cyto | Cyto | Cyto | Cyto | Secr. Path. |
| | AAH09919.1 | Kinase | Cyto (accum) | Cyto | Cyto | Mito | Mito |
| | AAH07746.1 | Transcription factor | Nuc/Cyto | Nuc | Nuc | Nuc | Nuc |
| | CAA37039.1 | Peptidylprolyl isomerase A | Nuc/Cyto | Cyto | Cyto | Secr. Path. | Cyto |
| | Pseudogene, 81% identity to BAB79493.1 | Unknown | Cyto | Cyto | Cyto | Cyto | Secr. Path. |
| | AAH06371.1 | SH3 adaptor | Cyto | Cyto | Cyto | Cyto | Nuc |
| | AAF81754.1 | Transcription factor-like | Nuc/Cyto | Cyto | Nuc | Cyto | Secr. Path. |
| | NP_076927.1 | Protease | ER | Cyto | Secr. Path. | Cyto | Secr. Path. |
| | NP_543136.1 | Chromosome-associated | Cyto/Nuc | Cyto | Cyto | Cyto | Cyto |
| | NP_061834.1 | Catalytic activity | Cyto | Cyto | Cyto | Cyto | Cyto |
| | AAC32312.1 | Ligase | Cyto | Cyto | Cyto | Cyto | Nuc |
| | AAH06341.1 | Unknown | Nucleoplasm | Secr. Path. | Nuc | Nuc | Cyto |
| | BAA92123.1 | Unknown | Nuc | Nuc | Nuc | Nuc | Nuc |
The localization properties of 52 Chr.21 proteins determined experimentally in HEK293T cells were compared to prediction results given by four computational programs that can only classify proteins into four subcellular compartments. Accum, accumulated; Cyto, cytosol; ER, endoplasmic reticulum; Lyso, lysosome and endosome; Mem-bound, membrane-bound; Mito, mitochondria; Nuc, nucleus; PM, plasma membrane; Punct, punctuated; Secr. Path., extracellular secreted protein or secretory pathway protein
Comparison of experimental localization results for 52 Chr.21 proteins to in silico high-resolution predictions
| Gene symbol | GenBank protein acc. no. | Function class | Localization in HEK293T | SherLoc2 | WoLF-PSORT | pTARGET | ProtComp8 | PA-SUB v2.5 |
|---|---|---|---|---|---|---|---|---|
| | CAA62631.1 | ATPase | PM/Golgi | Cyto | PM | PM | PM | ER |
| | AAH11971.1 | Acyltransferase | ER/PM (less) | ER | Extracell | ER | ER | Mito |
| | NP_006048.1 | Galactosyl-transferase | Golgi/ER | Golgi | Extracell | Golgi | Golgi | Golgi |
| | BAA24932.1 | Transcription regulation | Cyto (punct) Nuc-M-phase | Nuc | Nuc | Nuc | Nuc | Nuc |
| | NP_853633.1 | Unclear | Cyto | Cyto | Extracell | Extracell | PM | Cyto |
| | AAL34462.1 | Unknown | Nuc/Cyto | Mito | Nuc | PM | Extracell | – |
| | XP_032945.2 | Unknown | Nuc/Cyto | Nuc | Nuc | Nuc | Extracell | Extracell |
| | CAB56001.2 | Unknown | Nuc | Cyto | Nuc | Extracell | Mem-bound Perox | – |
| | AAC05974.2 | Unknown | PM | PM | PM | PM | Extracell | – |
| | AAG00496.1 | Unknown | Nuc/Cyto | Cyto | Cyto | Cyto | Extracell | – |
| | AAK60445.1 | Unknown | ER | Mito | Extracell | Extracell | Cyto | – |
| | NP_079419.1 | Unknown | Cyto (punct) | Cyto | Cyto_Nuc | Cyto | Extracell | – |
| | NP_000062.1 (splicing isoform) | Cystathionine-beta-synthase | Cyto | Cyto | PM | – | Cyto | Cyto |
| | BAA02792.1 | Chaperonin | Cyto | Cyto | Cyto | Mito | Cyto | Cyto |
| | NP_005432.1 | Chromatin assembly factor | Nucleoplasm Cyto-M phase | Nuc | Nuc | Nuc | Nuc | Nuc |
| | AAG60052.1 | Tight junction | ER/PM (less) | PM | PM | PM | PM | – |
| | CAB60616.1 | Tight junction | PM/Golgi | PM | PM | PM | PM | – |
| | NP_036264.1 | Tight junction | ER/PM (less) | PM | PM | PM | PM | – |
| | BAA91605.1 | Oxidoreductase | Cyto | Cyto | Cyto | Cyto | Extracell | Cyto |
| | AAH10536.1 | Receptor | PM | PM | PM | Extracell | PM | Extracell |
| | AAH02560.1 | Methyltransferase-like | Nuc/Cyto | Cyto | Nuc | PM | Nuc | Nuc |
| | NP_006043.1 | Unknown | Nuc | Cyto | Cyto | Cyto | Extracell | Cyto |
| | NP_005230.1 | Transcription factor | Nuc | Nuc | Nuc | Nuc | Nuc | Nuc |
| | AAD34617.1 | Transcriptional repressor | Cyto | Nuc | Cyto | Nuc | Mem-bound Perox | Nuc |
| | NP_000402.2 | Protein ligase | Cyto | Cyto | Cyto | Mito | Extracell | Cyto |
| | AAA52676.1 | DNA binding | Nuc | Nuc | Nuc | Nuc | Mito | Nuc |
| | NP_008962.1 | Transcription factor binding | Cyto | Cyto | Cyto | Cyto | Extracell | Cyto |
| | AAH03624.1 | Receptor | ER/PM (less) | PM | PM | Lyso | PM | Extracell |
| | AAH36452.1 | K-channel | Lyso/PM | PM | Extracell | PM | PM | ER |
| | NP_005127.1 | K-channel | Lyso/PM | PM | Cyto | PM | PM | ER |
| | NP_002234.2 | K-channel | PM/Golgi | PM | PM | PM | PM | ER |
| | NP_002231.1 | K-channel | PM/Golgi | PM | PM | PM | PM | Mito |
| | XP_035973.4 | Unknown | Nuc/Cyto (punct)-M phase | Nuc | Nuc | Nuc | Nuc | Nuc |
| | BAA25170.1 | DNA binding | Cyto/Nuc | Nuc | Nuc | Lyso | Extracell | Nuc |
| | NP_002453.1 | Dynamin and large GTPases | Cyto (punct) | Cyto | Cyto | Cyto | Cyto | Cyto |
| | AAH00380.1 | RNA processing | Nucleolus | Nuc | Nuc | Nuc | Nuc | Nuc |
| | AAH12061.1 | RNA binding | Cyto/Nuc | Nuc | Cysk | Cyto | Cyto | Cyto |
| | CAA63724.1 | Unknown | Nuc/Cyto | Cyto | Cyto | Cyto | Mito | Cyto |
| | AAH09047.1 | Phosphodiesterase | Cyto (accum) | Cyto | Cyto | Cyto | Extracell | Cyto |
| | AAH00123.1 | Kinase | Cyto | Cyto | Cyto | Cyto | Cyto | Cyto |
| | AAH09919.1 | Kinase | Cyto (accum) | Cyto | Cyto | – | Cyto | Cyto |
| | AAH07746.1 | Transcription factor | Nuc/Cyto | Nuc | Nuc | Cyto | Nuc | Nuc |
| | CAA37039.1 | Peptidylprolyl isomerase A | Nuc/Cyto | Cyto | Cyto | Cyto | Cyto | Cyto |
| | Pseudogene, 81% identity to BAB79493.1 | Unknown | Cyto | Cyto | Cyto | – | Extracell | – |
| | AAH06371.1 | SH3 adaptor | Cyto | Cyto | Cyto | Nuc | Extracell | Cyto |
| | AAF81754.1 | Transcription factor-like | Nuc/Cyto | Cyto | Cyto | Cyto | Cyto | Cyto |
| | NP_076927.1 | Protease | ER | PM | Cyto | PM | ER | Extracell |
| | NP_543136.1 | Chromosome-associated | Cyto/Nuc | Cyto | Cyto | Cyto | Extracell | Cyto |
| | NP_061834.1 | Catalytic activity | Cyto | Cyto | Nuc | Golgi | Extracell | Cyto |
| | AAC32312.1 | Ligase | Cyto | Perox | Mito | ER | Extracell | Cyto |
| | AAH06341.1 | Unknown | Nucleoplasm | Cyto | Extracell | Golgi | Extracell | Cyto |
| | BAA92123.1 | Unknown | Nuc | Nuc | Nuc | – | Nuc | Cyto |
The localization properties of 52 Chr.21 proteins determined experimentally in HEK293T cells were compared to prediction results given by five computational programs that can classify proteins into at least nine subcellular compartments. Accum, accumulated; Cysk, cytoskeleton; Cyto, cytosol; ER, endoplasmic reticulum; Extracell, extracellular secreted protein; Lyso, lysosome and endosome; Mem-bound, membrane-bound; Mito, mitochondria; Nuc, nucleus; PM, plasma membrane; Punct, punctuated; Perox, peroxisome
Fig. 1 Comparison of the prediction performances of five computational predictors with high resolution. Prediction performance varied among the programs. SherLoc2 and WoLF-PSORT showed the highest agreement with the experimental results (indicated as Hek), at 83% and 75%, respectively, which was significantly better than pTARGET (60%), ProtComp8 (56%) and PA-SUB v2.5 (54%). Prediction accuracy was found to be associated with the specific localization site. Abbreviations: Nuc nucleus, Cyto cytoplasm, PM plasma membrane, ER endoplasmic reticulum, Lyso lysosome and endosome. *For proteins with dual localization sites, all five predictors predicted only one site, but such predictions were still counted as fully correct
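The counting rule in the Fig. 1 footnote (a single predicted site counts as fully correct when it matches either of two observed sites) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' scoring procedure: the function names are hypothetical, and the handling of qualifiers such as "(less)" or "(punct)" is an assumption. The five sample pairs are SherLoc2 predictions taken from the high-resolution table above.

```python
import re

def observed_sites(localization):
    """Split an experimental annotation such as 'ER/PM (less)' into site labels.

    Assumption: parenthesised qualifiers like '(less)' or '(punct)' are ignored.
    """
    cleaned = re.sub(r"\([^)]*\)", "", localization)
    return {part.strip() for part in cleaned.split("/") if part.strip()}

def is_correct(localization, predicted):
    """A prediction counts as fully correct if it matches any observed site."""
    return predicted in observed_sites(localization)

def accuracy(rows):
    """Fraction of (observed, predicted) pairs scored as correct."""
    hits = sum(is_correct(obs, pred) for obs, pred in rows)
    return hits / len(rows)

# (localization in HEK293T, SherLoc2 prediction) for five table rows
rows = [
    ("PM/Golgi", "Cyto"),      # CAA62631.1
    ("Golgi/ER", "Golgi"),     # NP_006048.1
    ("PM", "PM"),              # AAC05974.2
    ("Nuc/Cyto", "Mito"),      # AAL34462.1
    ("ER/PM (less)", "PM"),    # AAG60052.1
]
print(f"{accuracy(rows):.0%}")  # 3 of 5 sample rows match -> 60%
```

The 60% here describes only these five sample rows and is unrelated to the per-program accuracies reported in the figure. Annotations such as "Nucleolus" or "Nucleoplasm" would need an explicit mapping to "Nuc" before scoring, which this sketch does not attempt.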
Fig. 2 Comparison of the prediction performances of four computational predictors with low resolution. The recently developed predictors were found to have similar prediction accuracies, with 75% (MultiLoc2-LowRes, ESLPred2) and 71% (BaCelLo) agreement with the experimental data (indicated as Hek). A relatively low percentage of correct predictions, 60%, was observed for SubLoc, which was developed in 2001. Prediction accuracy was found to be associated with the specific localization site. Abbreviations: Nuc nucleus, Cyto cytoplasm, Secr. path. secretory pathway protein (including plasma membrane, ER, Golgi and lysosomal proteins in this study)
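Because the low-resolution predictors report only four classes, the experimentally observed compartments have to be collapsed before they can be compared. The Python sketch below illustrates one way to do this, grouping plasma membrane, ER, Golgi and lysosomal proteins under the secretory pathway as stated in the Fig. 2 caption. The dictionary and function names are hypothetical, and the sample pairs are MultiLoc2-LowRes predictions taken from the low-resolution table above.

```python
# Collapse observed compartments into the four low-resolution classes.
# The secretory-pathway grouping follows the Fig. 2 caption; the names
# below are illustrative and not taken from any of the programs.
LOW_RES_CLASS = {
    "Nuc": "Nuc",
    "Cyto": "Cyto",
    "Mito": "Mito",
    "PM": "Secr. Path.",
    "ER": "Secr. Path.",
    "Golgi": "Secr. Path.",
    "Lyso": "Secr. Path.",
}

def to_low_res(sites):
    """Map a collection of observed site labels to low-resolution classes."""
    return {LOW_RES_CLASS[site] for site in sites}

def matches_low_res(sites, predicted):
    """True if the predicted class matches any collapsed observed site."""
    return predicted in to_low_res(sites)

# (observed sites in HEK293T, MultiLoc2-LowRes prediction)
examples = [
    (["PM", "Golgi"], "Cyto"),          # CAA62631.1
    (["Golgi", "ER"], "Secr. Path."),   # NP_006048.1
    (["Nuc"], "Nuc"),                   # NP_005230.1
    (["Lyso", "PM"], "Secr. Path."),    # AAH36452.1
]
for sites, predicted in examples:
    print("/".join(sites), predicted, matches_low_res(sites, predicted))
```

A lookup table keeps the grouping explicit, so a different grouping could be tested by editing one dictionary.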