Aaron Smalter, Jun Huan, Gerald Lushington.
Abstract
Classifying chemical compounds is an active topic in drug design and other cheminformatics applications. Graphs are general tools for organizing information from heterogeneous sources and have been applied in modelling many kinds of biological data. With the fast accumulation of chemical structure data, building highly accurate predictive models for chemical graphs emerges as a new challenge. In this paper, we demonstrate a novel technique called the Graph Pattern Matching kernel (GPM). Our idea is to leverage existing frequent pattern discovery methods and explore their application to kernel classifiers (e.g. support vector machines) for graph classification. In our method, we first identify all frequent patterns from a graph database. We then map subgraphs to graphs in the database and use a diffusion process to label nodes in the graphs. Finally, the kernel is computed using a set matching algorithm. We performed experiments on 16 chemical structure data sets and compared our method to other major graph kernels. The experimental results demonstrate excellent performance of our method.
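The three steps named in the abstract (frequent patterns, diffusion-based node labeling, set matching) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the pattern miner, the diffusion rule, or the matching algorithm, so everything below is an assumption chosen for brevity. Frequent labeled edges stand in for frequent subgraph patterns, diffusion is a fixed number of neighbor-sum steps with a hypothetical `decay` weight, and the matching scores each node against its most similar node in the other graph. All function names are hypothetical.

```python
from collections import Counter

# A graph is a pair (labels, edges): labels maps node id -> atom label,
# edges is a list of undirected (u, v) pairs.

def edge_patterns(graph):
    """Stand-in for subgraph patterns: the set of labeled edges in a graph."""
    labels, edges = graph
    return {tuple(sorted((labels[u], labels[v]))) for u, v in edges}

def frequent_patterns(graphs, min_support):
    """Keep patterns that occur in at least min_support graphs."""
    counts = Counter(p for g in graphs for p in edge_patterns(g))
    return sorted(p for p, c in counts.items() if c >= min_support)

def node_features(graph, patterns):
    """Mark, for each node, which frequent patterns it takes part in."""
    labels, edges = graph
    index = {p: i for i, p in enumerate(patterns)}
    feats = {n: [0.0] * len(patterns) for n in labels}
    for u, v in edges:
        p = tuple(sorted((labels[u], labels[v])))
        if p in index:
            feats[u][index[p]] = 1.0
            feats[v][index[p]] = 1.0
    return feats

def diffuse(graph, feats, steps=2, decay=0.5):
    """Assumed diffusion rule: add decay-weighted neighbor features each step."""
    labels, edges = graph
    nbrs = {n: [] for n in labels}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    for _ in range(steps):
        new = {}
        for n in labels:
            vec = list(feats[n])
            for m in nbrs[n]:
                for i, x in enumerate(feats[m]):
                    vec[i] += decay * x
            new[n] = vec
        feats = new
    return feats

def set_matching(fa, fb):
    """Assumed matching: average of both directions of best-match dot products."""
    def dot(x, y):
        return sum(a * b for a, b in zip(x, y))
    def one_way(f, g):
        return sum(max((dot(x, y) for y in g.values()), default=0.0)
                   for x in f.values())
    return 0.5 * (one_way(fa, fb) + one_way(fb, fa))

def kernel_value(g1, g2, patterns):
    """Chain the three steps for a pair of graphs."""
    f1 = diffuse(g1, node_features(g1, patterns))
    f2 = diffuse(g2, node_features(g2, patterns))
    return set_matching(f1, f2)

# Tiny made-up database of three molecule-like graphs.
graphs = [
    ({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)]),
    ({0: "C", 1: "O"}, [(0, 1)]),
    ({0: "C", 1: "C", 2: "N"}, [(0, 1), (1, 2)]),
]
patterns = frequent_patterns(graphs, min_support=2)
gram = [[kernel_value(a, b, patterns) for b in graphs] for a in graphs]
```

The resulting `gram` matrix is symmetric by construction and could be passed to a kernel classifier that accepts precomputed kernels. A best-match score of this kind is not guaranteed to be positive semidefinite, so a real implementation would need to address that point, and would replace the labeled-edge stand-in with a proper frequent subgraph miner.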
Year: 2008 PMID: 20428463 PMCID: PMC2860184 DOI: 10.1109/BIBE.2008.4696654
Source DB: PubMed Journal: Proc IEEE Int Symp Bioinformatics Bioeng ISSN: 2159-5410