Jose Lugo-Martinez (1), Daniel Zeiberg (2), Thomas Gaudelet (3), Noël Malod-Dognin (4), Natasa Przulj (4, 5), Predrag Radivojac (2)
1. Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
2. Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA.
3. Department of Computer Science, University College London, London WC1E 6BT, UK.
4. Barcelona Supercomputing Center (BSC), Barcelona 08034, Spain.
5. ICREA, Pg. Lluis Companys 23, Barcelona 08010, Spain.
Abstract
MOTIVATION: Biological and cellular systems are often modeled as graphs in which vertices represent objects of interest (e.g. genes, proteins and drugs) and edges represent relational ties between these objects (e.g. binds-to, interacts-with and regulates). This approach has been highly successful owing to the theory, methodology and software that support analysis and learning on graphs. Graphs, however, suffer from information loss when modeling physical systems because an edge can relate only two objects and therefore cannot accurately represent multiobject relationships. Hypergraphs, a generalization of graphs, provide a framework to mitigate information loss and unify disparate graph-based methodologies.

RESULTS: We present a hypergraph-based approach for modeling biological systems and formulate vertex classification, edge classification and link prediction problems on (hyper)graphs as instances of vertex classification on (extended, dual) hypergraphs. We then introduce a novel kernel method on vertex- and edge-labeled (colored) hypergraphs for analysis and learning. The method is based on exact and inexact (via hypergraph edit distances) enumeration of hypergraphlets, i.e. small hypergraphs rooted at a vertex of interest. We empirically evaluate this method on fifteen biological networks and show its potential use in a positive-unlabeled setting to estimate the interactome sizes in various species.

AVAILABILITY AND IMPLEMENTATION: https://github.com/jlugomar/hypergraphlet-kernels.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
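The abstract's reduction of edge classification to vertex classification on a dual hypergraph can be illustrated with a minimal sketch. It assumes the standard definition of the dual (each original hyperedge becomes a vertex, and each original vertex becomes the hyperedge that groups the hyperedges containing it) and a hypothetical dictionary representation. The function name `dual_hypergraph` and the toy data are illustrative only and are not taken from the authors' repository.

```python
from collections import defaultdict


def dual_hypergraph(hyperedges):
    """Build the dual of a hypergraph (illustrative helper, not the authors' code).

    `hyperedges` maps a hyperedge id to the set of vertices it contains.
    In the dual, every original hyperedge becomes a vertex and every
    original vertex becomes a hyperedge containing the ids of the
    original hyperedges it belonged to.
    """
    dual = defaultdict(set)
    for edge_id, members in hyperedges.items():
        for vertex in members:
            dual[vertex].add(edge_id)
    return dict(dual)


# Hypothetical toy data: a three-protein complex (e1) and a binary
# interaction (e2). A graph would need three pairwise edges to
# approximate e1 and would lose the fact that it is one joint relationship.
toy = {"e1": {"A", "B", "C"}, "e2": {"C", "D"}}
toy_dual = dual_hypergraph(toy)
```

In `toy_dual`, the identifiers `e1` and `e2` play the role of vertices, so a label attached to an original hyperedge becomes a label on a vertex of the dual. For a hypergraph with no isolated vertices and no empty hyperedges, applying the function twice recovers the original structure.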