MOTIVATION: Identifying the destination or localization of proteins is key to understanding their function and facilitating their purification. A number of existing computational prediction methods are based on sequence analysis. However, these methods are limited in scope, accuracy and, most particularly, breadth of coverage. Rather than using sequence information alone, we have explored the use of database text annotations from homologs and machine learning to substantially improve the prediction of subcellular location.

RESULTS: We have constructed five machine-learning classifiers for predicting subcellular localization of proteins from animals, plants, fungi, Gram-negative bacteria and Gram-positive bacteria, which are 81% accurate for fungi and 92-94% accurate for the other four categories. These are the most accurate subcellular predictors across the widest set of organisms ever published. Our predictors are part of the Proteome Analyst web-service.
Authors: Duane Szafron; Paul Lu; Russell Greiner; David S Wishart; Brett Poulin; Roman Eisner; Zhiyong Lu; John Anvik; Cam Macdonell; Alona Fyshe; David Meeuwis Journal: Nucleic Acids Res Date: 2004-07-01 Impact factor: 16.971
Authors: Michael Seringhaus; Alberto Paccanaro; Anthony Borneman; Michael Snyder; Mark Gerstein Journal: Genome Res Date: 2006-08-09 Impact factor: 9.043
Authors: Frank Galka; Sun Nyunt Wai; Harald Kusch; Susanne Engelmann; Michael Hecker; Bernd Schmeck; Stefan Hippenstiel; Bernt Eric Uhlin; Michael Steinert Journal: Infect Immun Date: 2008-02-04 Impact factor: 3.441
Authors: Elvira García Osuna; Juchang Hua; Nicholas W Bateman; Ting Zhao; Peter B Berget; Robert F Murphy Journal: Ann Biomed Eng Date: 2007-02-07 Impact factor: 3.934