J P Vert.
Abstract
A new class of kernels for strings is introduced. These kernels can be used by any kernel-based data analysis method, including support vector machines (SVMs). They are derived from probabilistic models to integrate biologically relevant information. We show how to compute the kernels corresponding to several classical probabilistic models, and illustrate their use by building an SVM for the problem of predicting the cleavage site of signal peptides from the amino-acid sequence of a protein. At a given rate of false positives, this method retrieves up to 47% more true positives than the classical weight matrix method.
Year: 2002 PMID: 11928516 DOI: 10.1142/9789812799623_0060
Source DB: PubMed Journal: Pac Symp Biocomput ISSN: 2335-6928