
PSLDoc: Protein subcellular localization prediction based on gapped-dipeptides and probabilistic latent semantic analysis.

Jia-Ming Chang, Emily Chia-Yu Su, Allan Lo, Hua-Sheng Chiu, Ting-Yi Sung, Wen-Lian Hsu.

Abstract

Prediction of protein subcellular localization (PSL) is important for genome annotation, protein function prediction, and drug discovery. Many computational approaches for PSL prediction based on protein sequences have been proposed in recent years for Gram-negative bacteria. We present PSLDoc, a method based on gapped-dipeptides and probabilistic latent semantic analysis (PLSA) to solve this problem. A protein is treated as a term string composed of gapped-dipeptides, which are defined as any two residues separated by one or more positions. The weight of each gapped-dipeptide is calculated from a position-specific scoring matrix, which carries sequence evolutionary information. PLSA is then applied for feature reduction, and the reduced vectors are input to five one-versus-rest support vector machine classifiers. The localization site with the highest probability is assigned as the final prediction. It has been reported that there is a strong correlation between sequence homology and subcellular localization (Nair and Rost, Protein Sci 2002;11:2836-2847; Yu et al., Proteins 2006;64:643-651). To evaluate the performance of PSLDoc properly, each target protein is classified into a low- or high-homology data set. PSLDoc's overall accuracy on the low- and high-homology data sets reaches 86.84% and 98.21%, respectively, and it compares favorably with that of CELLO II (Yu et al., Proteins 2006;64:643-651). In addition, we set a confidence threshold to achieve high precision at specified levels of recall. When the confidence threshold is set at 0.7, PSLDoc achieves a precision of 97.89%, which is considerably better than that of PSORTb v.2.0 (Gardy et al., Bioinformatics 2005;21:617-623). Our approach demonstrates that this feature representation for proteins can be successfully applied to the prediction of protein subcellular localization and improves prediction accuracy. Moreover, because the representation is general, our method can be extended to eukaryotic proteomes in the future. The web server of PSLDoc is publicly available at http://bio-cluster.iis.sinica.edu.tw/~bioapp/PSLDoc/. (c) 2008 Wiley-Liss, Inc.
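
As an illustration of two steps named in the abstract, the following is a minimal sketch and not the authors' implementation. It makes several assumptions: terms are weighted by plain occurrence counts rather than by a position-specific scoring matrix, "distance" means the index offset between the two residues, the PLSA and SVM stages are omitted, and the function names, the default max_distance, and the localization site labels are hypothetical.

```python
from collections import Counter


def gapped_dipeptide_counts(sequence, max_distance=3):
    """Count gapped-dipeptide terms in a protein sequence.

    A term is (first residue, distance, second residue), where distance is
    the index offset between the two residues (assumption: 1..max_distance).
    Plain counts stand in for the PSSM-based weights described in the text.
    """
    counts = Counter()
    for i, first in enumerate(sequence):
        for distance in range(1, max_distance + 1):
            j = i + distance
            if j >= len(sequence):
                break  # no residue this far to the right
            counts[(first, distance, sequence[j])] += 1
    return counts


def predict_with_threshold(site_probabilities, threshold=0.7):
    """Return the most probable site, or None when it falls below threshold.

    site_probabilities maps a site label to a classifier probability;
    the labels used in the example below are illustrative only.
    """
    site, probability = max(site_probabilities.items(), key=lambda kv: kv[1])
    return site if probability >= threshold else None


if __name__ == "__main__":
    terms = gapped_dipeptide_counts("MKKL", max_distance=2)
    for term, count in sorted(terms.items()):
        print(term, count)
    print(predict_with_threshold({"cytoplasmic": 0.9, "periplasmic": 0.1}))
```

Returning None below the threshold corresponds to withholding a prediction, which is how a confidence cutoff trades recall for precision.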

Year: 2008    PMID: 18260102    DOI: 10.1002/prot.21944

Source DB: PubMed    Journal: Proteins    ISSN: 0887-3585


  13 in total

1.  Predicted protein subcellular localization in dominant surface ocean bacterioplankton.

Authors:  Haiwei Luo
Journal:  Appl Environ Microbiol       Date:  2012-07-06       Impact factor: 4.792

2.  EuLoc: a web-server for accurately predict protein subcellular localization in eukaryotes by incorporating various features of sequence segments into the general form of Chou's PseAAC.

Authors:  Tzu-Hao Chang; Li-Ching Wu; Tzong-Yi Lee; Shu-Pin Chen; Hsien-Da Huang; Jorng-Tzong Horng
Journal:  J Comput Aided Mol Des       Date:  2013-01-03       Impact factor: 3.686

3.  Protein subcellular localization prediction of eukaryotes using a knowledge-based approach.

Authors:  Hsin-Nan Lin; Ching-Tai Chen; Ting-Yi Sung; Shinn-Ying Ho; Wen-Lian Hsu
Journal:  BMC Bioinformatics       Date:  2009-12-03       Impact factor: 3.169

4.  PSORTb 3.0: improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes.

Authors:  Nancy Y Yu; James R Wagner; Matthew R Laird; Gabor Melli; Sébastien Rey; Raymond Lo; Phuong Dao; S Cenk Sahinalp; Martin Ester; Leonard J Foster; Fiona S L Brinkman
Journal:  Bioinformatics       Date:  2010-05-13       Impact factor: 6.937

5.  FGsub: Fusarium graminearum protein subcellular localizations predicted from primary structures.

Authors:  Chenglei Sun; Xing-Ming Zhao; Weihua Tang; Luonan Chen
Journal:  BMC Syst Biol       Date:  2010-09-13

6.  Efficient and interpretable prediction of protein functional classes by correspondence analysis and compact set relations.

Authors:  Jia-Ming Chang; Jean-Francois Taly; Ionas Erb; Ting-Yi Sung; Wen-Lian Hsu; Chuan Yi Tang; Cedric Notredame; Emily Chia-Yu Su
Journal:  PLoS One       Date:  2013-10-11       Impact factor: 3.240

7.  An ensemble method for predicting subnuclear localizations from primary protein structures.

Authors:  Guo Sheng Han; Zu Guo Yu; Vo Anh; Anaththa P D Krishnajith; Yu-Chu Tian
Journal:  PLoS One       Date:  2013-02-27       Impact factor: 3.240

8.  Prediction of nuclear proteins using nuclear translocation signals proposed by probabilistic latent semantic indexing.

Authors:  Emily Chia-Yu Su; Jia-Ming Chang; Cheng-Wei Cheng; Ting-Yi Sung; Wen-Lian Hsu
Journal:  BMC Bioinformatics       Date:  2012-12-13       Impact factor: 3.169

9.  Predicting RNA-binding sites of proteins using support vector machines and evolutionary information.

Authors:  Cheng-Wei Cheng; Emily Chia-Yu Su; Jenn-Kang Hwang; Ting-Yi Sung; Wen-Lian Hsu
Journal:  BMC Bioinformatics       Date:  2008-12-12       Impact factor: 3.169

10.  Chlamydiae has contributed at least 55 genes to Plantae with predominantly plastid functions.

Authors:  Ahmed Moustafa; Adrian Reyes-Prieto; Debashish Bhattacharya
Journal:  PLoS One       Date:  2008-05-21       Impact factor: 3.240
