Euk-PLoc: an ensemble classifier for large-scale eukaryotic protein subcellular location prediction.

H-B Shen, J Yang, K-C Chou.

Abstract

With the avalanche of newly found protein sequences emerging in the post-genomic era, it is highly desirable to develop an automated method for quickly and reliably identifying their subcellular locations, because the knowledge thus obtained can provide key clues for revealing their functions and understanding how they interact with each other in cellular networks. However, predicting the subcellular location of eukaryotic proteins is a challenging problem, particularly when the query proteins have no significant homology to proteins of known subcellular location and when more locations need to be covered. To cope with this challenge, protein samples are formulated by hybridizing information derived from the Gene Ontology database with amphiphilic pseudo amino acid composition. Based on this representation, a novel ensemble classifier was developed by fusing many basic individual classifiers through a voting system. Each basic classifier was built on the K-nearest neighbor (KNN) principle. As a demonstration, a new benchmark dataset was constructed that covers the following 18 locations: (1) cell wall, (2) centriole, (3) chloroplast, (4) cyanelle, (5) cytoplasm, (6) cytoskeleton, (7) endoplasmic reticulum, (8) extracell, (9) Golgi apparatus, (10) hydrogenosome, (11) lysosome, (12) mitochondria, (13) nucleus, (14) peroxisome, (15) plasma membrane, (16) plastid, (17) spindle pole body, and (18) vacuole. To avoid homology bias, none of the proteins included has ≥25% sequence identity to any other protein in the same subcellular location. The overall success rates obtained via the 5-fold and jackknife cross-validation tests were 81.6% and 80.3%, respectively, which are 40-50% higher than those achieved by other existing methods on the same strict dataset. The predictor, named "Euk-PLoc", is available as a web server at http://202.120.37.186/bioinf/euk.
Furthermore, to support researchers working in the relevant areas, a downloadable file will be provided at the same website listing the results predicted by Euk-PLoc for all eukaryotic protein entries (excluding fragments) in the Swiss-Prot database that lack subcellular location annotations or are annotated as uncertain. These large-scale results will be updated twice a year to include new eukaryotic protein entries and to reflect the continued development of Euk-PLoc.
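
The idea of fusing several KNN classifiers by voting and scoring the result with a jackknife (leave-one-out) test can be illustrated with a minimal Python sketch. This is not the Euk-PLoc implementation: the abstract does not say how the basic classifiers differ, so varying K across ensemble members is an assumption made here for illustration, and plain 20-dimensional amino acid composition stands in for the Gene Ontology and amphiphilic pseudo amino acid composition features. All function names are hypothetical.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Plain amino acid composition (a stand-in feature, not the paper's)."""
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def knn_predict(train, query, k):
    """Majority label among the k training samples nearest to the query.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def ensemble_predict(train, query, ks=(1, 3, 5)):
    """Fuse one KNN classifier per K value by simple majority voting.

    Assumption: ensemble members differ only in K.
    """
    votes = Counter(knn_predict(train, query, k) for k in ks)
    return votes.most_common(1)[0][0]


def jackknife_accuracy(samples, ks=(1, 3, 5)):
    """Leave-one-out test: predict each sample from all the others."""
    correct = 0
    for i, (vector, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if ensemble_predict(rest, vector, ks) == label:
            correct += 1
    return correct / len(samples)


# Toy two-dimensional data with made-up labels, purely for demonstration.
toy = [
    ([0.0, 0.0], "nucleus"), ([0.1, 0.0], "nucleus"), ([0.0, 0.1], "nucleus"),
    ([1.0, 1.0], "vacuole"), ([0.9, 1.0], "vacuole"), ([1.0, 0.9], "vacuole"),
]
print(ensemble_predict(toy, [0.05, 0.05]))
print(jackknife_accuracy(toy))
```

In a real setting each sequence would first be converted to a feature vector (for example with `composition`) and the training set would be filtered by the sequence identity cutoff before any cross-validation is run.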

Year:  2007        PMID: 17235453     DOI: 10.1007/s00726-006-0478-8

Source DB:  PubMed          Journal:  Amino Acids        ISSN: 0939-4451            Impact factor:   3.520

