Abstract
Biomedical ontologies have been used extensively to formally define and organize biomedical terminologies, and these ontologies are typically created manually by biomedical experts. As more biomedical ontologies are built independently, matching them to address heterogeneity and enable interoperability has become a critical challenge in many biomedical applications. Existing matching methods have mostly focused on capturing terminological, structural, and contextual semantic features of ontologies. However, these feature engineering-based techniques are not only labor-intensive but also ignore the hidden semantic relations in ontologies. In this study, we propose BioHAN, an alternative biomedical ontology-matching framework based on a hybrid graph attention network, which consists of three techniques. First, we propose an effective ontology-enriching method that refines and enriches the ontologies through axioms and external resources. Subsequently, we use hyperbolic graph attention layers to encode hierarchical concepts in a unified hyperbolic space. Finally, we aggregate the features of both the direct and distant neighbors with a graph attention network. Experimental results on real-world biomedical ontologies demonstrate that BioHAN is competitive with state-of-the-art ontology-matching methods.
Keywords: biomedical ontology; embedding; graph attention network; hyperbolic attention; ontology matching
Year: 2022 PMID: 35938027 PMCID: PMC9354052 DOI: 10.3389/fgene.2022.893409
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.772
FIGURE 1. Heterogeneity of biomedical ontologies.
FIGURE 2. Hierarchical structure in biomedical ontologies UBERON (left) and FMA (right).
FIGURE 3. Framework of BioHAN.
Example of ontology collection.
FIGURE 4. Hyperbolic graph attention layers in BioHAN.
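The caption above refers to hyperbolic attention layers. As a rough illustration of why a hyperbolic space suits hierarchies, the sketch below computes geodesic distances in the Poincaré ball. The choice of the Poincaré ball model, the function name `poincare_distance`, and the toy 2-D coordinates are assumptions made for illustration only; this record does not state which hyperbolic model or embedding dimension BioHAN uses.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball.

    Assumes the Poincare ball model with curvature -1 (an illustrative
    choice, not taken from the source).
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


# Hypothetical embeddings: a general concept at the origin and two
# specific concepts placed near the boundary of the ball.
root = (0.0, 0.0)
leaf_a = (0.9, 0.0)
leaf_b = (0.0, 0.9)

d_root_a = poincare_distance(root, leaf_a)
d_root_b = poincare_distance(root, leaf_b)
d_a_b = poincare_distance(leaf_a, leaf_b)
euclid_a_b = math.dist(leaf_a, leaf_b)

print(f"root to leaf_a:       {d_root_a:.3f}")
print(f"leaf_a to leaf_b:     {d_a_b:.3f}")
print(f"Euclidean leaf_a-b:   {euclid_a_b:.3f}")
```

In this toy setup the two boundary points are much farther apart in hyperbolic terms than their Euclidean separation suggests, and their distance approaches the length of the path through the origin. That tree-like behaviour is what makes such spaces attractive for isA hierarchies.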
Results of ontology matching.
| Method | FMA-NCI P | FMA-NCI R | FMA-NCI F1 | FMA-SNOMED P | FMA-SNOMED R | FMA-SNOMED F1 | SNOMED-NCI P | SNOMED-NCI R | SNOMED-NCI F1 |
|---|---|---|---|---|---|---|---|---|---|
| AML | | 0.910 | | 0.923 | 0.762 | | 0.906 | 0.746 | 0.818 |
| LogMap | 0.940 | 0.898 | 0.919 | | 0.689 | 0.796 | | 0.667 | 0.785 |
| LogMapBio | 0.904 | 0.920 | 0.912 | 0.911 | 0.711 | 0.799 | 0.909 | 0.696 | 0.88 |
| MTransE | 0.627 | 0.640 | 0.633 | 0.505 | 0.475 | 0.490 | 0.254 | 0.378 | 0.304 |
| GCN-align | 0.813 | 0.783 | 0.798 | 0.763 | 0.729 | 0.746 | 0.745 | 0.775 | 0.760 |
| DAEOM | 0.882 | 0.689 | 0.774 | 0.719 | 0.693 | 0.706 | 0.891 | 0.682 | 0.773 |
| MEDTO | 0.944 | 0.874 | 0.908 | 0.871 | 0.762 | 0.813 | 0.901 | | 0.849 |
| BioHAN | 0.930 | | 0.926 | 0.898 | | 0.832 | 0.911 | 0.797 | |
| BioHAN (w/o OB) | 0.930 | 0.922 | 0.926 | 0.782 | 0.731 | 0.756 | 0.788 | 0.709 | 0.746 |
| BioHAN (w/o HB) | 0.831 | 0.822 | 0.826 | 0.771 | 0.729 | 0.749 | 0.850 | 0.711 | 0.774 |
| BioHAN (w/o AM) | 0.860 | 0.842 | 0.851 | 0.819 | 0.726 | 0.770 | 0.864 | 0.719 | 0.785 |
| BioHAN (w/o MN) | 0.893 | 0.849 | 0.870 | 0.822 | 0.745 | 0.782 | 0.877 | 0.701 | 0.779 |
Bold values represent the best results for the column in which they are located.
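The P, R, and F1 columns above are related by the standard definition of F1 as the harmonic mean of precision and recall. The sketch below recomputes F1 for a few complete rows of the table as a consistency check. The helper name `f1_score` and the selection of rows are illustrative choices, and the source does not describe how its scores were rounded.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the FMA-NCI columns above.
rows = {
    "LogMapBio": (0.904, 0.920, 0.912),
    "MTransE": (0.627, 0.640, 0.633),
    "GCN-align": (0.813, 0.783, 0.798),
}

for method, (p, r, reported) in rows.items():
    recomputed = round(f1_score(p, r), 3)
    print(f"{method}: recomputed F1 = {recomputed:.3f}, reported F1 = {reported:.3f}")
```

Because the published precision and recall are already rounded to three decimals, a recomputed F1 can differ from the reported one in the last digit for other rows.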
Summary statistics of ontology enriching.
| Ontology | Nodes | isA (origin) | isA (enriching) |
|---|---|---|---|
| FMA | 78,988 | 78,985 | 78,985 |
| NCI | 66,724 | 59,794 | 75,454 |
| SNOMED | 122,222 | 105,624 | 203,942 |
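To make the effect of enriching easier to read, the sketch below derives the number and percentage of isA relations added for each ontology from the counts in the table above. The helper name `enrichment_gain` is hypothetical, and the percentages are simple arithmetic on the listed counts rather than figures reported by the source.

```python
# isA counts copied from the table above: (origin, after enriching).
isa_counts = {
    "FMA": (78_985, 78_985),
    "NCI": (59_794, 75_454),
    "SNOMED": (105_624, 203_942),
}


def enrichment_gain(before, after):
    """Return (added relations, percentage increase over the original count)."""
    added = after - before
    return added, 100.0 * added / before


for name, (before, after) in isa_counts.items():
    added, pct = enrichment_gain(before, after)
    print(f"{name}: +{added:,} isA relations ({pct:.1f}% increase)")
```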