Better prediction of protein cellular localization sites with the k nearest neighbors classifier.

P Horton, K Nakai.

Abstract

We compared four classifiers on the problem of predicting the cellular localization sites of proteins in yeast and E. coli. A set of sequence-derived features, such as regions of high hydrophobicity, was used for each classifier. The methods compared were a structured probabilistic model specifically designed for the localization problem, the k nearest neighbors classifier, the binary decision tree classifier, and the naïve Bayes classifier. Tests using stratified cross-validation show that the k nearest neighbors classifier performs better than the other methods. For yeast, this difference was statistically significant under a cross-validated paired t-test. The resulting accuracy is approximately 60% for 10 yeast classes and 86% for 8 E. coli classes. The best previously reported accuracies for these datasets were 55% and 81%, respectively.
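
The evaluation described in the abstract, a k nearest neighbors classifier scored with stratified cross-validation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Euclidean distance, the two-dimensional synthetic features, and the three toy class labels are all assumptions made for illustration, and the real yeast and E. coli feature sets are not reproduced here.

```python
import math
import random
from collections import Counter, defaultdict


def knn_predict(train_X, train_y, x, k):
    """Return the majority label among the k training points nearest to x."""
    nearest = sorted(
        range(len(train_X)), key=lambda i: math.dist(train_X[i], x)
    )[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def stratified_folds(labels, n_folds, seed=0):
    """Split example indices into n_folds, spreading each class evenly."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled members of this class round-robin across folds.
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds


def cross_validated_accuracy(X, y, k, n_folds=5, seed=0):
    """Return one accuracy per fold, training on all the other folds."""
    per_fold = []
    for held_out in stratified_folds(y, n_folds, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in held_out
        )
        per_fold.append(correct / len(held_out))
    return per_fold


def make_toy_data(seed=0, per_class=30):
    """Synthetic stand-in data: three Gaussian clusters with toy labels."""
    rng = random.Random(seed)
    centers = {
        "cytoplasm": (0.0, 0.0),
        "membrane": (5.0, 5.0),
        "periplasm": (0.0, 5.0),
    }
    X, y = [], []
    for label, (cx, cy) in centers.items():
        for _ in range(per_class):
            X.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    accuracies = cross_validated_accuracy(X, y, k=5)
    print("per-fold accuracy:", accuracies)
    print("mean accuracy:", sum(accuracies) / len(accuracies))
```

Running two classifiers over the same folds yields paired per-fold accuracies, which is the kind of input a paired t-test compares.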

Year: 1997
PMID: 9322029
Source DB: PubMed
Journal: Proc Int Conf Intell Syst Mol Biol
ISSN: 1553-0833


