Literature DB >> 16889410

Predicting eukaryotic protein subcellular location by fusing optimized evidence-theoretic K-Nearest Neighbor classifiers.

Kuo-Chen Chou1, Hong-Bin Shen.   

Abstract

Facing the explosion of newly generated protein sequences in the post genomic era, we are challenged to develop an automated method for fast and reliably annotating their subcellular locations. Knowledge of subcellular locations of proteins can provide useful hints for revealing their functions and understanding how they interact with each other in cellular networking. Unfortunately, it is both expensive and time-consuming to determine the localization of an uncharacterized protein in a living cell purely based on experiments. To tackle the challenge, a novel hybridization classifier was developed by fusing many basic individual classifiers through a voting system. The "engine" of these basic classifiers was operated by the OET-KNN (Optimized Evidence-Theoretic K-Nearest Neighbor) rule. As a demonstration, predictions were performed with the fusion classifier for proteins among the following 16 localizations: (1) cell wall, (2) centriole, (3) chloroplast, (4) cyanelle, (5) cytoplasm, (6) cytoskeleton, (7) endoplasmic reticulum, (8) extracell, (9) Golgi apparatus, (10) lysosome, (11) mitochondria, (12) nucleus, (13) peroxisome, (14) plasma membrane, (15) plastid, and (16) vacuole. To get rid of redundancy and homology bias, none of the proteins investigated here had >/=25% sequence identity to any other in a same subcellular location. The overall success rates thus obtained via the jack-knife cross-validation test and independent dataset test were 81.6% and 83.7%, respectively, which were 46 approximately 63% higher than those performed by the other existing methods on the same benchmark datasets. Also, it is clearly elucidated that the overwhelmingly high success rates obtained by the fusion classifier is by no means a trivial utilization of the GO annotations as prone to be misinterpreted because there is a huge number of proteins with given accession numbers and the corresponding GO numbers, but their subcellular locations are still unknown, and that the percentage of proteins with GO annotations indicating their subcellular components is even less than the percentage of proteins with known subcellular location annotation in the Swiss-Prot database. It is anticipated that the powerful fusion classifier may also become a very useful high throughput tool in characterizing other attributes of proteins according to their sequences, such as enzyme class, membrane protein type, and nuclear receptor subfamily, among many others. A web server, called "Euk-OET-PLoc", has been designed at http://202.120.37.186/bioinf/euk-oet for public to predict subcellular locations of eukaryotic proteins by the fusion OET-KNN classifier.

Mesh:

Substances:

Year:  2006        PMID: 16889410     DOI: 10.1021/pr060167c

Source DB:  PubMed          Journal:  J Proteome Res        ISSN: 1535-3893            Impact factor:   4.466


  50 in total

1.  Robust quantitative modeling of peptide binding affinities for MHC molecules using physical-chemical descriptors.

Authors:  Ovidiu Ivanciuc; Werner Braun
Journal:  Protein Pept Lett       Date:  2007       Impact factor: 1.890

2.  Prediction of protein function improving sequence remote alignment search by a fuzzy logic algorithm.

Authors:  Antonio Gómez; Juan Cedano; Jordi Espadaler; Antonio Hermoso; Jaume Piñol; Enrique Querol
Journal:  Protein J       Date:  2008-02       Impact factor: 2.371

3.  Prediction of protein subcellular localization by incorporating multiobjective PSO-based feature subset selection into the general form of Chou's PseAAC.

Authors:  Monalisa Mandal; Anirban Mukhopadhyay; Ujjwal Maulik
Journal:  Med Biol Eng Comput       Date:  2015-01-07       Impact factor: 2.602

4.  iAFP-Ense: An Ensemble Classifier for Identifying Antifreeze Protein by Incorporating Grey Model and PSSM into PseAAC.

Authors:  Xuan Xiao; Mengjuan Hui; Zi Liu
Journal:  J Membr Biol       Date:  2016-11-03       Impact factor: 1.843

5.  Subcellular location prediction of apoptosis proteins using two novel feature extraction methods based on evolutionary information and LDA.

Authors:  Lei Du; Qingfang Meng; Yuehui Chen; Peng Wu
Journal:  BMC Bioinformatics       Date:  2020-05-24       Impact factor: 3.169

6.  Study of peptide fingerprints of parasite proteins and drug-DNA interactions with Markov-Mean-Energy invariants of biopolymer molecular-dynamic lattice networks.

Authors:  Lázaro Guillermo Pérez-Montoto; María Auxiliadora Dea-Ayuela; Francisco J Prado-Prado; Francisco Bolas-Fernández; Florencio M Ubeira; Humberto González-Díaz
Journal:  Polymer (Guildf)       Date:  2009-06-03       Impact factor: 4.430

7.  Computational analysis and prediction of lysine malonylation sites by exploiting informative features in an integrative machine-learning framework.

Authors:  Yanju Zhang; Ruopeng Xie; Jiawei Wang; André Leier; Tatiana T Marquez-Lago; Tatsuya Akutsu; Geoffrey I Webb; Kuo-Chen Chou; Jiangning Song
Journal:  Brief Bioinform       Date:  2019-11-27       Impact factor: 11.622

8.  Predicting drug-target interaction networks based on functional groups and biological features.

Authors:  Zhisong He; Jian Zhang; Xiao-He Shi; Le-Le Hu; Xiangyin Kong; Yu-Dong Cai; Kuo-Chen Chou
Journal:  PLoS One       Date:  2010-03-11       Impact factor: 3.240

9.  A new method for predicting the subcellular localization of eukaryotic proteins with both single and multiple sites: Euk-mPLoc 2.0.

Authors:  Kuo-Chen Chou; Hong-Bin Shen
Journal:  PLoS One       Date:  2010-04-01       Impact factor: 3.240

10.  Protein domain boundary predictions: a structural biology perspective.

Authors:  Svetlana Kirillova; Suresh Kumar; Oliviero Carugo
Journal:  Open Biochem J       Date:  2009-01-21
View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.