ESLpred: SVM-based method for subcellular localization of eukaryotic proteins using dipeptide composition and PSI-BLAST.

Manoj Bhasin, G P S Raghava.

Abstract

Automated prediction of the subcellular localization of proteins is an important step in the functional annotation of genomes. Existing subcellular localization prediction methods are based on either the amino acid composition or the N-terminal characteristics of proteins. In this paper, a support vector machine (SVM) was used to predict the subcellular location of eukaryotic proteins from several features: amino acid composition, dipeptide composition and physico-chemical properties. The SVM module based on dipeptide composition performed better than the SVM modules based on amino acid composition or physico-chemical properties. In addition, PSI-BLAST was used to search the query sequence against a dataset of experimentally annotated proteins to predict its subcellular location. To improve prediction accuracy, we developed a hybrid module using all features of a protein, with an input vector of 458 dimensions (400 dipeptide compositions, 33 physico-chemical properties, 20 amino acid compositions and 5 values from the PSI-BLAST output). With this hybrid approach, the prediction accuracies for nuclear, cytoplasmic, mitochondrial and extracellular proteins reached 95.3, 85.2, 68.2 and 88.9%, respectively. The overall prediction accuracies of the SVM modules based on amino acid composition, physico-chemical properties, dipeptide composition and the hybrid approach were 78.1, 77.8, 82.9 and 88.0%, respectively. The accuracy of all modules was evaluated using 5-fold cross-validation. When a reliability index is assigned, 73.5% of predictions (those with reliability index ≥3) can be made with an accuracy of 96.4%. Based on this approach, an online web server, ESLpred, was developed and is available at http://www.imtech.res.in/raghava/eslpred/.
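
The composition features named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the handling of non-standard residues, and the ordering of blocks inside the 458-dimensional vector are assumptions made for illustration. The 33 physico-chemical properties and the 5 PSI-BLAST values are not specified in the abstract, so they are passed in as placeholders.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
# All 400 ordered residue pairs, in a fixed order.
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Fraction of each of the 20 standard residues in seq (20 values)."""
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / len(residues) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each ordered pair of adjacent residues (400 values).

    Assumption: pairs containing a non-standard residue are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for first, second in zip(seq, seq[1:]):
        pair = first + second
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def hybrid_vector(seq, properties, psiblast):
    """Concatenate 400 + 33 + 20 + 5 = 458 features.

    properties (33 values) and psiblast (5 values) are hypothetical inputs
    supplied by the caller; the block order is an assumption.
    """
    if len(properties) != 33 or len(psiblast) != 5:
        raise ValueError("expected 33 property values and 5 PSI-BLAST values")
    return (dipeptide_composition(seq) + list(properties)
            + amino_acid_composition(seq) + list(psiblast))
```

For a sequence of length n there are n - 1 adjacent pairs, so the dipeptide fractions capture local residue order that the 20-value amino acid composition discards, at the cost of a 400-dimensional input.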

Year:  2004        PMID: 15215421      PMCID: PMC441488          DOI: 10.1093/nar/gkh350

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


  16 in total

1.  PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization.

Authors:  K Nakai; P Horton
Journal:  Trends Biochem Sci       Date:  1999-01       Impact factor: 13.807

2.  A novel approach to the recognition of protein architecture from sequence using Fourier analysis and neural networks.

Authors:  Adrian J Shepherd; Denise Gorse; Janet M Thornton
Journal:  Proteins       Date:  2003-02-01

3.  Basic local alignment search tool.

Authors:  S F Altschul; W Gish; W Miller; E W Myers; D J Lipman
Journal:  J Mol Biol       Date:  1990-10-05       Impact factor: 5.469

4.  Expert system for predicting protein localization sites in gram-negative bacteria.

Authors:  K Nakai; M Kanehisa
Journal:  Proteins       Date:  1991

Review 5.  Wanted: subcellular localization of proteins based on sequence.

Authors:  F Eisenhaber; P Bork
Journal:  Trends Cell Biol       Date:  1998-04       Impact factor: 20.808

6.  Using neural networks for prediction of the subcellular location of proteins.

Authors:  A Reinhardt; T Hubbard
Journal:  Nucleic Acids Res       Date:  1998-05-01       Impact factor: 16.971

7.  Improved tools for biological sequence comparison.

Authors:  W R Pearson; D J Lipman
Journal:  Proc Natl Acad Sci U S A       Date:  1988-04       Impact factor: 11.205

8.  The DEF data base of sequence based protein fold class predictions.

Authors:  M Reczko; H Bohr
Journal:  Nucleic Acids Res       Date:  1994-09       Impact factor: 16.971

9.  Analysis and prediction of affinity of TAP binding peptides using cascade SVM.

Authors:  Manoj Bhasin; G P S Raghava
Journal:  Protein Sci       Date:  2004-03       Impact factor: 6.725

10.  Support vector machines with selective kernel scaling for protein classification and identification of key amino acid positions.

Authors:  Nela Zavaljevski; Fred J Stevens; Jaques Reifman
Journal:  Bioinformatics       Date:  2002-05       Impact factor: 6.937

  97 in total

1.  GPCRpred: an SVM-based method for prediction of families and subfamilies of G-protein coupled receptors.

Authors:  Manoj Bhasin; G P S Raghava
Journal:  Nucleic Acids Res       Date:  2004-07-01       Impact factor: 16.971

2.  RASCAL is a new human cytomegalovirus-encoded protein that localizes to the nuclear lamina and in cytoplasmic vesicles at late times postinfection.

Authors:  Matthew S Miller; Wendy E Furlong; Leesa Pennell; Marc Geadah; Laura Hertel
Journal:  J Virol       Date:  2010-04-14       Impact factor: 5.103

3.  Combining machine learning and homology-based approaches to accurately predict subcellular localization in Arabidopsis.

Authors:  Rakesh Kaundal; Reena Saini; Patrick X Zhao
Journal:  Plant Physiol       Date:  2010-07-20       Impact factor: 8.340

4.  The expression profile of the major mouse SPO11 isoforms indicates that SPO11beta introduces double strand breaks and suggests that SPO11alpha has an additional role in prophase in both spermatocytes and oocytes.

Authors:  Marina A Bellani; Kingsley A Boateng; Dianne McLeod; R Daniel Camerini-Otero
Journal:  Mol Cell Biol       Date:  2010-07-20       Impact factor: 4.272

5.  A novel representation of protein sequences for prediction of subcellular location using support vector machines.

Authors:  Setsuro Matsuda; Jean-Philippe Vert; Hiroto Saigo; Nobuhisa Ueda; Hiroyuki Toh; Tatsuya Akutsu
Journal:  Protein Sci       Date:  2005-11       Impact factor: 6.725

6.  EHPred: an SVM-based method for epoxide hydrolases recognition and classification.

Authors:  Jia Jia; Liang Yang; Zi-Zhang Zhang
Journal:  J Zhejiang Univ Sci B       Date:  2006-01       Impact factor: 3.066

7.  Prediction of mitochondrial proteins using discrete wavelet transform.

Authors:  Lin Jiang; Menglong Li; Zhining Wen; Kelong Wang; Yuanbo Diao
Journal:  Protein J       Date:  2006-06       Impact factor: 2.371

8.  Interleukin-4-inducing principle from Schistosoma mansoni eggs contains a functional C-terminal nuclear localization signal necessary for nuclear translocation in mammalian cells but not for its uptake.

Authors:  Ishwinder Kaur; Gabriele Schramm; Bart Everts; Thomas Scholzen; Karin B Kindle; Christian Beetz; Cristina Montiel-Duarte; Silke Blindow; Arwyn T Jones; Helmut Haas; Snjezana Stolnik; David M Heery; Franco H Falcone
Journal:  Infect Immun       Date:  2011-01-10       Impact factor: 3.441

9.  ESLpred2: improved method for predicting subcellular localization of eukaryotic proteins.

Authors:  Aarti Garg; Gajendra P S Raghava
Journal:  BMC Bioinformatics       Date:  2008-11-28       Impact factor: 3.169

10.  Prediction of nuclear proteins using SVM and HMM models.

Authors:  Manish Kumar; Gajendra P S Raghava
Journal:  BMC Bioinformatics       Date:  2009-01-19       Impact factor: 3.169
