Hum-PLoc: a novel ensemble classifier for predicting human protein subcellular localization.

Kuo-Chen Chou, Hong-Bin Shen.

Abstract

Predicting the subcellular localization of human proteins is challenging, especially when a query protein has no significant homology to proteins of known subcellular location and when many locations must be covered. To address this, protein samples are represented by hybridizing the gene ontology (GO) database with amphiphilic pseudo amino acid composition (PseAA). On this representation, a novel ensemble classifier called "Hum-PLoc" was developed by fusing many basic individual classifiers through a voting system. Each basic classifier operates by the K-nearest neighbor (KNN) rule. As a demonstration, the ensemble classifier was tested on human proteins from the following 12 locations: (1) centriole; (2) cytoplasm; (3) cytoskeleton; (4) endoplasmic reticulum; (5) extracell; (6) Golgi apparatus; (7) lysosome; (8) microsome; (9) mitochondrion; (10) nucleus; (11) peroxisome; (12) plasma membrane. To reduce redundancy and homology bias, none of the proteins investigated had ≥ 25% sequence identity to any other protein in the same subcellular location. The overall success rates obtained by the jackknife cross-validation test and the independent dataset test were 81.1% and 85.0%, respectively, which are more than 50% higher than those obtained by other existing methods on the same stringent datasets. Furthermore, an analysis is given to show that the high success rate of the new predictor is not due to a trivial use of the GO annotations: for proteins annotated as "subcellular location unknown" in the Swiss-Prot database, most (more than 99%) of the corresponding GO numbers in the GO database are also annotated as "cellular component unknown".
The information and clues for predicting the subcellular locations of proteins are buried in a series of GO numbers, much as they are buried in amino acid sequences, although in a different manner and at a different "depth". Extracting this knowledge requires a sophisticated operation engine; the current predictor is one such engine and has proved to be a very powerful one. The Hum-PLoc classifier is available as a web server at http://202.120.37.186/bioinf/hum.
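The voting scheme and the jackknife test described in the abstract can be illustrated with a minimal sketch. It is not the Hum-PLoc implementation: the abstract does not say how the basic classifiers differ, so the sketch assumes they differ only in the value of K, and it uses toy two-dimensional vectors in place of the GO and PseAA representation. The names `knn_predict`, `ensemble_predict`, `jackknife_accuracy`, and `SAMPLES` are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Return the majority label among the k training samples nearest to query.

    train is a list of (vector, label) pairs; distance is Euclidean.
    Ties go to the label encountered first among the nearest neighbors.
    """
    neighbors = sorted(train, key=lambda sample: math.dist(sample[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def ensemble_predict(train, query, ks=(1, 3, 5)):
    """Fuse several KNN classifiers (assumed here to differ only in K) by voting."""
    votes = Counter(knn_predict(train, query, k) for k in ks)
    return votes.most_common(1)[0][0]


def jackknife_accuracy(samples, ks=(1, 3, 5)):
    """Leave each sample out in turn, predict it from the rest, and score the hits."""
    correct = 0
    for i, (vector, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if ensemble_predict(rest, vector, ks) == label:
            correct += 1
    return correct / len(samples)


# Toy data (made up for illustration): two well-separated clusters.
SAMPLES = [
    ((0.0, 0.0), "nucleus"),
    ((0.1, 0.2), "nucleus"),
    ((0.2, 0.1), "nucleus"),
    ((0.15, 0.15), "nucleus"),
    ((5.0, 5.0), "cytoplasm"),
    ((5.1, 5.2), "cytoplasm"),
    ((5.2, 5.1), "cytoplasm"),
    ((5.15, 5.15), "cytoplasm"),
]

if __name__ == "__main__":
    print(ensemble_predict(SAMPLES, (0.05, 0.05)))
    print(jackknife_accuracy(SAMPLES))
```

In the jackknife loop each protein is predicted by a classifier that has never seen it, which is why this test is considered a stringent estimate of the success rate.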

Year:  2006        PMID: 16808903     DOI: 10.1016/j.bbrc.2006.06.059

Source DB:  PubMed          Journal:  Biochem Biophys Res Commun        ISSN: 0006-291X            Impact factor:   3.575

