
GPD: a graph pattern diffusion kernel for accurate graph classification with applications in cheminformatics.

Aaron Smalter, Jun Luke Huan, Yi Jia, Gerald Lushington.

Abstract

Graph data mining is an active research area. Graphs are general modeling tools for organizing information from heterogeneous sources and have been applied in many scientific, engineering, and business fields. With the rapid accumulation of graph data, building highly accurate predictive models for such data has emerged as a new challenge that has not been fully explored in the data mining community. In this paper, we present a novel technique called the graph pattern diffusion (GPD) kernel. Our idea is to leverage existing frequent pattern discovery methods and to explore the application of kernel classifiers (e.g., support vector machines) to building highly accurate graph classification models. In our method, we first identify all frequent patterns in a graph database. We then map these subgraph patterns to the graphs in the database and use a process we call "pattern diffusion" to label the nodes of the graphs. Finally, we design a graph alignment algorithm to compute the inner product of two graphs. We have tested our algorithm on a number of chemical structure data sets. The experimental results demonstrate that our method is significantly better than competing methods such as kernel functions based on paths, cycles, and subgraphs.
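The pipeline described in the abstract (label nodes by diffusing pattern membership, then align two graphs to obtain an inner product) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the diffusion rule, the `keep` and `steps` parameters, the function names `diffuse` and `alignment_kernel`, and the brute-force alignment are all assumptions made for illustration. Frequent pattern mining is also skipped; the pattern occurrences are supplied by hand as seed vectors.

```python
from itertools import permutations


def diffuse(adj, seeds, steps=2, keep=0.5):
    """Spread pattern-membership mass over an undirected graph.

    adj   : dict mapping node -> list of neighbouring nodes
    seeds : dict mapping node -> list of floats, one entry per pattern
            (1.0 if the node lies in an occurrence of that pattern)
    The update rule below is an assumed, simplified diffusion: each node
    keeps a fraction `keep` of its mass and splits the rest evenly among
    its neighbours.
    """
    labels = {v: list(vec) for v, vec in seeds.items()}
    for _ in range(steps):
        nxt = {v: [keep * x for x in labels[v]] for v in adj}
        for v, nbrs in adj.items():
            # An isolated node has nobody to share with, so it keeps everything.
            targets = nbrs if nbrs else [v]
            share = (1.0 - keep) / len(targets)
            for u in targets:
                for p, x in enumerate(labels[v]):
                    nxt[u][p] += share * x
        labels = nxt
    return labels


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def alignment_kernel(labels_g, labels_h):
    """Score of the best one-to-one node alignment between two graphs.

    Brute force over all injective mappings from the smaller node set into
    the larger one, so it is only usable on very small graphs.
    """
    small, large = sorted(
        (list(labels_g.values()), list(labels_h.values())), key=len
    )
    best = 0.0
    for perm in permutations(range(len(large)), len(small)):
        score = sum(dot(small[i], large[j]) for i, j in enumerate(perm))
        best = max(best, score)
    return best


# Toy input (hypothetical): one pattern, marked on nodes a,b of G and x,y of H.
g_adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
g_seed = {"a": [1.0], "b": [1.0], "c": [0.0]}
h_adj = {"x": ["y"], "y": ["x"]}
h_seed = {"x": [1.0], "y": [1.0]}

g_lab = diffuse(g_adj, g_seed)
h_lab = diffuse(h_adj, h_seed)
k_gh = alignment_kernel(g_lab, h_lab)
print(k_gh)
```

The resulting value could be placed in a precomputed kernel matrix for a support vector machine. For graphs of realistic size, the brute-force search would have to be replaced by a polynomial-time assignment solver.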


Year: 2010   PMID: 20431140   PMCID: PMC3058227   DOI: 10.1109/TCBB.2009.80

Source DB: PubMed   Journal: IEEE/ACM Trans Comput Biol Bioinform   ISSN: 1545-5963   Impact factor: 3.710





