Alessio Martino, Antonello Rizzi.
Abstract
Graph kernels are one of the mainstream approaches for measuring similarity between graphs, especially in pattern recognition and machine learning tasks. In turn, graphs have gained a lot of attention due to their ability to model several real-world phenomena, in fields ranging from bioinformatics to social network analysis. However, attention has recently moved towards hypergraphs, a generalization of plain graphs in which multi-way relations (beyond pairwise relations) can be considered. In this paper, four (hyper)graph kernels are proposed and their efficiency and effectiveness are compared in a twofold fashion: first, by inferring simplicial complexes on top of the underlying graphs and comparing against state-of-the-art approaches on 18 benchmark datasets; second, by facing a real-world case study (i.e., metabolic pathways classification) where input data are natively represented by hypergraphs. With this work, we aim at fostering the extension of graph kernels towards hypergraphs and, more generally, at bridging the gap between structural pattern recognition and the domain of hypergraphs.
Keywords: graph kernels; hypergraphs; kernel methods; simplicial complexes; support vector machines; topological data analysis
Year: 2020 PMID: 33286924 PMCID: PMC7597323 DOI: 10.3390/e22101155
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Figure 1. Average accuracy on the test set. The color scale has been normalized row-wise (i.e., for each dataset) from yellow (lower values) towards red (higher values, preferred). The times sign (×) indicates that the experiment was aborted after exceeding a 24-h deadline. The dagger (†) indicates that the experiment ran out of memory.
Figure 2. Average running times (in seconds) for evaluating the kernel matrix on the entire dataset. The color scale has been normalized row-wise (i.e., for each dataset) from yellow (lower values, preferred) towards red (higher values). The times sign (×) indicates that the experiment was aborted after exceeding a 24-h deadline. The dagger (†) indicates that the experiment ran out of memory. For the four proposed kernels, running times also include the evaluation of the simplicial complexes starting from the underlying graphs.
Accuracy on the test set (average ± standard deviation) for the four metabolic pathways classification problems.
| Kernel | Problem 1 | Problem 2 | Problem 3 | Problem 4 |
|---|---|---|---|---|
| HCK | | | | |
| WJK | | | | |
| EK | | | | |
| SEK | | | | |
Specificity on the test set (average ± standard deviation) for the four metabolic pathways classification problems.
| Kernel | Problem 1 | Problem 2 | Problem 3 | Problem 4 |
|---|---|---|---|---|
| HCK | | | | |
| WJK | | | | |
| EK | | | | |
| SEK | | | | |
Sensitivity on the test set (average ± standard deviation) for the four metabolic pathways classification problems.
| Kernel | Problem 1 | Problem 2 | Problem 3 | Problem 4 |
|---|---|---|---|---|
| HCK | | | | |
| WJK | | | | |
| EK | | | | |
| SEK | | | | |
Figure 3. Negative eigenfraction for the 18 tested datasets.
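The negative eigenfraction in Figure 3 is not defined on this page. As a rough illustration, the sketch below assumes the usual definition from the indefinite-kernel literature: the sum of the absolute values of the negative eigenvalues of the kernel matrix, divided by the sum of the absolute values of all eigenvalues. A value of 0 then means the matrix is positive semi-definite. The function name `negative_eigenfraction` is hypothetical, and the symmetrization step is an assumption, not necessarily what the authors did.

```python
import numpy as np


def negative_eigenfraction(K):
    """Fraction of the spectrum's total magnitude carried by negative eigenvalues.

    Assumed definition (not stated in the source):
        NEF = sum(|lambda_i| for lambda_i < 0) / sum(|lambda_i|)
    """
    K = np.asarray(K, dtype=float)
    # Assumption: symmetrize to guard against small numerical asymmetries.
    K = 0.5 * (K + K.T)
    # eigvalsh returns the real eigenvalues of a symmetric matrix.
    eigvals = np.linalg.eigvalsh(K)
    total = np.abs(eigvals).sum()
    if total == 0.0:
        # All-zero matrix: no negative mass by convention.
        return 0.0
    return float(np.abs(eigvals[eigvals < 0]).sum() / total)
```

Under this definition, a kernel matrix with a negative eigenfraction close to 0 is nearly positive semi-definite, which is the property kernel machines such as support vector machines normally assume.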