John Hawkins, Lynne Davis, Mikael Bodén.
Abstract
Nuclear localization of proteins is a crucial element in the dynamic life of the cell. It is complicated by the massive diversity of targeting signals and the existence of proteins that shuttle between the nucleus and cytoplasm. Nevertheless, a majority of subcellular localization tools that predict nuclear proteins have been developed without involving dual localized proteins in the data sets. Hence, in general, the existing models are focused on predicting statically nuclear proteins, rather than nuclear localization itself. We present an independent analysis of existing nuclear localization predictors, using a nonredundant data set extracted from Swiss-Prot R50.0. We demonstrate that accuracy on truly novel proteins is lower than that of previous estimations, and that existing models generalize poorly to dual localized proteins. We have developed a model trained to identify nuclear proteins including dual localized proteins. The results suggest that using more recent data and including dual localized proteins improves the overall prediction. The final predictor NUCLEO operates with a realistic success rate of 0.70 and a correlation coefficient of 0.38, as established on the independent test set. (NUCLEO is available at: http://pprowler.itee.uq.edu.au.)
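The two figures reported above can be illustrated with a minimal sketch. It assumes that "success rate" means overall accuracy on a binary nuclear/non-nuclear test set and that "correlation coefficient" means the Matthews correlation coefficient; the abstract does not state either definition. The function names and the confusion-matrix counts are hypothetical and are not taken from the paper.

```python
import math


def success_rate(tp, tn, fp, fn):
    """Fraction of all test proteins classified correctly (assumed definition)."""
    total = tp + tn + fp + fn
    return (tp + tn) / total


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Assumption: this is the 'correlation coefficient' meant in the abstract.
    Returns 0.0 when any marginal count is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the calculation.
    tp, tn, fp, fn = 50, 20, 20, 10
    print("success rate:", success_rate(tp, tn, fp, fn))
    print("correlation coefficient:", matthews_cc(tp, tn, fp, fn))
```

Under this reading, the correlation coefficient accounts for class imbalance in a way that accuracy alone does not, which would explain why a predictor can reach a success rate of 0.70 while its correlation coefficient remains modest.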
Year: 2007 PMID: 17319708 DOI: 10.1021/pr060564n
Source DB: PubMed Journal: J Proteome Res ISSN: 1535-3893 Impact factor: 4.466