
Gpos-PLoc: an ensemble classifier for predicting subcellular localization of Gram-positive bacterial proteins.

Hong-Bin Shen, Kuo-Chen Chou.

Abstract

A statistical analysis indicated that, of the 35,016 Gram-positive bacterial proteins in the recent Swiss-Prot database, approximately 57% have no subcellular location annotation. In the gene ontology database the corresponding figure is approximately 67%, so the share of proteins without subcellular component annotations is even higher there. With the avalanche of gene products generated in the post-genomic era, the number of such location-unknown entries will continue to increase. An automated method for identifying their subcellular localization quickly and accurately is therefore highly desirable, because the information obtained is very useful for both basic research and drug discovery. In view of this, an ensemble classifier called 'Gpos-PLoc' was developed for predicting Gram-positive protein subcellular localization. The new predictor works by fusing many basic classifiers, each of which was engineered according to the optimized evidence-theoretic K-nearest neighbors rule. As a demonstration, tests were performed on Gram-positive proteins among the following five subcellular location sites: (1) cell wall, (2) cytoplasm, (3) extracell, (4) periplasm and (5) plasma membrane. To eliminate redundancy and homology bias, only proteins with < 25% sequence identity to any other protein in the same subcellular location were included in the benchmark datasets. The overall success rates achieved by Gpos-PLoc were > 80% for both the jackknife cross-validation test and the independent dataset test, implying that Gpos-PLoc might become a very useful vehicle for expediting the analysis of Gram-positive bacterial proteins. Gpos-PLoc is freely accessible to the public as a web server at http://202.120.37.186/bioinf/Gpos/.
To support the needs of investigators in the relevant areas, a downloadable file is provided at the same website listing the results obtained by Gpos-PLoc for 31,898 Gram-positive bacterial protein entries in the Swiss-Prot database that either have no subcellular location annotation or are annotated with uncertain terms such as 'probable', 'potential', 'perhaps' and 'by similarity'. These large-scale results will be updated once a year to include new Gram-positive bacterial protein entries and to reflect the continuous development of Gpos-PLoc.
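The two ideas the abstract relies on, fusing several basic nearest-neighbour classifiers and scoring the result with a jackknife (leave-one-out) test, can be illustrated with a small sketch. This is not the Gpos-PLoc implementation: it uses plain K-nearest-neighbour voting rather than the optimized evidence-theoretic rule, it assumes the basic classifiers differ only in K, and the feature vectors, function names and toy data are hypothetical.

```python
from collections import Counter
import math


def knn_predict(train, query, k):
    """Plain K-nearest-neighbour vote using Euclidean distance.

    Simplification: the paper uses an optimized evidence-theoretic
    KNN rule, which is not reproduced here.
    """
    ranked = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def ensemble_predict(train, query, ks=(1, 3, 5)):
    """Fuse the basic classifiers (one per K, an assumption) by majority vote."""
    votes = Counter(knn_predict(train, query, k) for k in ks)
    return votes.most_common(1)[0][0]


def jackknife_success_rate(data, ks=(1, 3, 5)):
    """Leave each sample out in turn and predict it from the remaining ones."""
    correct = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if ensemble_predict(rest, features, ks) == label:
            correct += 1
    return correct / len(data)


# Hypothetical 2-D feature vectors for two of the five location sites.
toy_data = [
    ((0.0, 0.0), "cytoplasm"), ((0.1, 0.2), "cytoplasm"),
    ((0.2, 0.1), "cytoplasm"), ((0.15, 0.15), "cytoplasm"),
    ((5.0, 5.0), "cell wall"), ((5.1, 5.2), "cell wall"),
    ((5.2, 5.1), "cell wall"), ((5.15, 5.15), "cell wall"),
]

if __name__ == "__main__":
    print(ensemble_predict(toy_data, (0.05, 0.05)))
    print(jackknife_success_rate(toy_data))
```

In the jackknife test every protein is predicted by a model that has never seen it, which is why the abstract reports it alongside the independent dataset test.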

Year:  2007        PMID: 17244638     DOI: 10.1093/protein/gzl053

Source DB:  PubMed          Journal:  Protein Eng Des Sel        ISSN: 1741-0126            Impact factor:   1.650


  29 in total

1.  Prediction of protein function improving sequence remote alignment search by a fuzzy logic algorithm.

Authors:  Antonio Gómez; Juan Cedano; Jordi Espadaler; Antonio Hermoso; Jaume Piñol; Enrique Querol
Journal:  Protein J       Date:  2008-02       Impact factor: 2.371

2.  Prediction of subcellular location of mycobacterial protein using feature selection techniques.

Authors:  Hao Lin; Hui Ding; Feng-Biao Guo; Jian Huang
Journal:  Mol Divers       Date:  2009-11-12       Impact factor: 2.943

3.  PMLPR: A novel method for predicting subcellular localization based on recommender systems.

Authors:  Elnaz Mirzaei Mehrabad; Reza Hassanzadeh; Changiz Eslahchi
Journal:  Sci Rep       Date:  2018-08-13       Impact factor: 4.379

4.  EuLoc: a web-server for accurately predict protein subcellular localization in eukaryotes by incorporating various features of sequence segments into the general form of Chou's PseAAC.

Authors:  Tzu-Hao Chang; Li-Ching Wu; Tzong-Yi Lee; Shu-Pin Chen; Hsien-Da Huang; Jorng-Tzong Horng
Journal:  J Comput Aided Mol Des       Date:  2013-01-03       Impact factor: 3.686

5.  Computational Approaches for Automated Classification of Enzyme Sequences.

Authors:  Akram Mohammed; Chittibabu Guda
Journal:  J Proteomics Bioinform       Date:  2011-08-23

6.  Some illuminating remarks on molecular genetics and genomics as well as drug development. (Review)

Authors:  Kuo-Chen Chou
Journal:  Mol Genet Genomics       Date:  2020-01-01       Impact factor: 3.291

7.  CoBaltDB: Complete bacterial and archaeal orfeomes subcellular localization database and associated resources.

Authors:  David Goudenège; Stéphane Avner; Céline Lucchetti-Miganeh; Frédérique Barloy-Hubler
Journal:  BMC Microbiol       Date:  2010-03-23       Impact factor: 3.605

8.  Computational prediction and experimental assessment of secreted/surface proteins from Mycobacterium tuberculosis H37Rv.

Authors:  Carolina Vizcaíno; Daniel Restrepo-Montoya; Diana Rodríguez; Luis F Niño; Marisol Ocampo; Magnolia Vanegas; María T Reguero; Nora L Martínez; Manuel E Patarroyo; Manuel A Patarroyo
Journal:  PLoS Comput Biol       Date:  2010-06-24       Impact factor: 4.475

9.  PSORTb 3.0: improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes.

Authors:  Nancy Y Yu; James R Wagner; Matthew R Laird; Gabor Melli; Sébastien Rey; Raymond Lo; Phuong Dao; S Cenk Sahinalp; Martin Ester; Leonard J Foster; Fiona S L Brinkman
Journal:  Bioinformatics       Date:  2010-05-13       Impact factor: 6.937

10.  Computer aided selection of candidate vaccine antigens.

Authors:  Darren R Flower; Isabel K Macdonald; Kamna Ramakrishnan; Matthew N Davies; Irini A Doytchinova
Journal:  Immunome Res       Date:  2010-11-03