A novel representation of protein sequences for prediction of subcellular location using support vector machines.

Setsuro Matsuda, Jean-Philippe Vert, Hiroto Saigo, Nobuhisa Ueda, Hiroyuki Toh, Tatsuya Akutsu.

Abstract

As the number of complete genomes rapidly increases, accurate methods to automatically predict the subcellular location of proteins are increasingly useful for their functional annotation. In order to improve the predictive accuracy of the many prediction methods developed to date, a novel representation of protein sequences is proposed. This representation involves local compositions of amino acids and twin amino acids, and local frequencies of distance between successive (basic, hydrophobic, and other) amino acids. For calculating the local features, each sequence is split into three parts: N-terminal, middle, and C-terminal. The N-terminal part is further divided into four regions to account for ambiguity in the length and position of signal sequences. We tested this representation with support vector machines on two data sets extracted from the SWISS-PROT database. In fivefold cross-validation tests, overall accuracies of more than 87% and 91% were obtained for eukaryotic and prokaryotic proteins, respectively. We conclude that considering the respective features in the N-terminal, middle, and C-terminal parts is helpful for predicting the subcellular location.
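
The representation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not give region lengths, residue groupings, or the distance range, so the constants `N_TERM_LEN`, `C_TERM_LEN`, `MAX_DIST`, the `HYDROPHOBIC` set, and all function names below are assumptions made for illustration. "Twin amino acids" is interpreted here as two identical adjacent residues, which is also an assumption.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Illustrative values only; the abstract does not state them.
N_TERM_LEN = 40      # assumed length of the N-terminal part
C_TERM_LEN = 20      # assumed length of the C-terminal part
N_SUBREGIONS = 4     # the abstract says the N-terminal part has four regions
MAX_DIST = 5         # assumed number of distance bins

BASIC = set("KRH")
HYDROPHOBIC = set("AVLIMFWC")  # assumed grouping
OTHER = set(AMINO_ACIDS) - BASIC - HYDROPHOBIC


def split_sequence(seq, n_len=N_TERM_LEN, c_len=C_TERM_LEN, n_parts=N_SUBREGIONS):
    """Return [N1, ..., N4, middle, C-terminal] as a list of substrings."""
    c_start = max(n_len, len(seq) - c_len)
    n_term, middle, c_term = seq[:n_len], seq[n_len:c_start], seq[c_start:]
    step = n_len // n_parts
    n_regions = [n_term[i * step:(i + 1) * step] for i in range(n_parts)]
    return n_regions + [middle, c_term]


def composition(region):
    """Fraction of each of the 20 amino acids in the region."""
    if not region:
        return [0.0] * len(AMINO_ACIDS)
    return [region.count(aa) / len(region) for aa in AMINO_ACIDS]


def twin_composition(region):
    """Fraction of adjacent positions holding the same amino acid twice."""
    pairs = list(zip(region, region[1:]))
    if not pairs:
        return [0.0] * len(AMINO_ACIDS)
    return [sum(1 for a, b in pairs if a == b == aa) / len(pairs)
            for aa in AMINO_ACIDS]


def distance_frequencies(region, members, max_d=MAX_DIST):
    """Normalized histogram of gaps between successive residues in `members`.

    Gaps larger than max_d are counted in the last bin.
    """
    positions = [i for i, aa in enumerate(region) if aa in members]
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    hist = [0.0] * max_d
    for gap in gaps:
        hist[min(gap, max_d) - 1] += 1.0
    return [h / len(gaps) for h in hist] if gaps else hist


def featurize(seq):
    """Concatenate the local features of all six regions into one vector."""
    vector = []
    for region in split_sequence(seq):
        vector += composition(region)
        vector += twin_composition(region)
        for members in (BASIC, HYDROPHOBIC, OTHER):
            vector += distance_frequencies(region, members)
    return vector
```

With these assumed settings, each sequence maps to a fixed-length vector of 6 × (20 + 20 + 3 × 5) = 330 values, which could then be passed to any support vector machine implementation.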

Year:  2005        PMID: 16251364      PMCID: PMC2253224          DOI: 10.1110/ps.051597405

Source DB:  PubMed          Journal:  Protein Sci        ISSN: 0961-8368            Impact factor:   6.725


