Jingshan Huang, Jiangbo Dang, Michael N Huhns, W Jim Zheng.
Abstract
BACKGROUND: As formal, declarative knowledge representation models, ontologies help address the problem of imprecise terminologies in biological and biomedical research. However, ontologies constructed under the auspices of the Open Biomedical Ontologies (OBO) group exhibit a great deal of variety, because different parties can design ontologies according to their own conceptual views of the world. It is therefore becoming critical to align ontologies from different parties. During automated or semi-automated alignment across biological ontologies, different semantic aspects (concept name, concept properties, and concept relationships) contribute to different degrees to the alignment results. A vector of weights must therefore be assigned to these semantic aspects. Determining what those weights should be is not trivial, and current methodologies depend heavily on human heuristics.
Year: 2008 PMID: 18831781 PMCID: PMC2559880 DOI: 10.1186/1471-2164-9-S2-S16
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Program running log containing a portion of the initial similarity matrix. Inside the matrix, for each concept pair, the first line is the concept from BiologicalProcess, the second line is the concept from Pathway, and the third line gives their similarity values in concept name, concept properties, and concept relationships.
Training examples in OAANN.
| lipoprotein_metabolic_process (0042157) | vs. | lipoprotein_metabolic_pathway (0000482) |
| inositol_phosphate_metabolic_process (0043647) | vs. | inositol_phosphate_metabolic_pathway (0000154) |
| glutathione_catabolic_process (0006751) | vs. | glutathione_metabolic_pathway (0000134) |
| brassinosteroid_biosynthetic_process (0016132) | vs. | brassinosteroid_biosynthetic_pathway (0000552) |
| gamma-aminobutyric_acid_catabolic_process (0009450) | vs. | gamma-aminobutyric_acid_metabolic_pathway (0000412) |
| chondroitin_sulfate_biosynthetic_process (0030206) | vs. | chondroitin_sulfate_biosynthetic_pathway (0000195) |
| generation_of_precursor_metabolites_and_energy (0010497) | vs. | energy_metabolic_pathway (248) |
| acetylcholine_catabolic_process (0006581) | vs. | acetylcholine_metabolic_pathway (0000408) |
| cell_cycle_checkpoint (0000075) | vs. | cell_cycle_checkpoint_pathway (0000094) |
| DNA_replication_checkpoint (0000076) | vs. | G2/M_DNA_replication_checkpoint_pathway (0000385) |
| purine_metabolic_process (0006143) | vs. | purine_metabolic_pathway (0000031) |
| dopamine_catabolic_process (0042420) | vs. | dopamine_metabolic_pathway (0000409) |
| epinephrine_catabolic_process (0042419) | vs. | epinephrine_metabolic_pathway (0000441) |
| leukotriene_metabolic_process (0006691) | vs. | leukotriene_metabolic_pathway (0000464) |
| norepinephrine_catabolic_process (0042422) | vs. | norepinephrine_metabolic_pathway (0000442) |
| ganglioside_biosynthetic_process (0001574) | vs. | ganglioside_biosynthetic_pathway (0000164) |
| glycine_catabolic_process (0006546) | vs. | glycine_metabolic_pathway (0000440) |
| glucose_homeostasis (0042593) | vs. | glucose_homeostasis_pathway (0000553) |
| aspartate_metabolic_process (0006531) | vs. | aspartate_metabolic_pathway (0000439) |
| arachidonic_acid_metabolic_process (0019369) | vs. | arachidonic_acid_metabolic_pathway (0000460) |
| histamine_catabolic_process (0001695) | vs. | histamine_metabolic_pathway (0000411) |
| alanine_metabolic_process (0006522) | vs. | alanine_metabolic_pathway (0000438) |
| glycogen_biosynthetic_process (0005978) | vs. | glycogen_biosynthetic_pathway (0000532) |
| germ_cell_programmed_cell_death (0035234) | vs. | altered_programmed_cell_death (0000287) |
| C21-steroid_hormone_catabolic_process (0008208) | vs. | C21-Steroid_hormone_metabolic_pathway (0000070) |
| glycerophospholipid_metabolic_process (0006650) | vs. | glycerophospholipid_metabolic_pathway (0000354) |
| regulated_secretory_pathway (0045055) | vs. | regulated_secretory_pathway (0000537) |
| linoleic_acid_metabolic_process (0043651) | vs. | linoleic_acid_metabolic_pathway (0000523) |
| serotonin_catabolic_process (0042429) | vs. | serotonin_metabolic_pathway (0000410) |
| globoside_metabolic_process (0001575) | vs. | globoside_metabolic_pathway (0000196) |
Figure 2. Plot of similarity threshold versus number of equivalent concept pairs. The axis value (0.75) at the beginning of the plateau in this figure was adopted as the similarity threshold. The intuition is that the curve shows an initial drop followed by a plateau, which is in turn followed by a second drop, so it is reasonable to assign the threshold the value corresponding to the beginning of the plateau. Please refer to the "Experiment Design and Results" section for a more detailed explanation.
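The plateau heuristic described in the Figure 2 caption can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `count_pairs` and `plateau_start`, the `min_run` and `tolerance` parameters, and the toy counts are all hypothetical and are not taken from the paper, whose exact procedure is described in its "Experiment Design and Results" section.

```python
from typing import Optional, Sequence


def count_pairs(similarities: Sequence[float], threshold: float) -> int:
    """Number of concept pairs whose similarity reaches the threshold."""
    return sum(1 for s in similarities if s >= threshold)


def plateau_start(
    thresholds: Sequence[float],
    counts: Sequence[int],
    min_run: int = 3,      # assumed: how many flat points count as a plateau
    tolerance: int = 0,    # assumed: allowed change between flat points
) -> Optional[float]:
    """Return the first threshold that follows a drop and starts a flat run.

    `thresholds` must be in increasing order and `counts[i]` is the number
    of equivalent pairs found at `thresholds[i]`.
    """
    for i in range(1, len(counts) - min_run + 1):
        dropped = counts[i - 1] - counts[i] > tolerance
        flat = all(
            abs(counts[j] - counts[j + 1]) <= tolerance
            for j in range(i, i + min_run - 1)
        )
        if dropped and flat:
            return thresholds[i]
    return None  # no plateau found


# Invented toy data: a drop, a plateau, then a second drop.
toy_thresholds = [0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90]
toy_counts = [40, 31, 22, 15, 15, 15, 6]
chosen = plateau_start(toy_thresholds, toy_counts)
```

With noisy counts, a non-zero `tolerance` would let small fluctuations still count as a plateau. Whether the authors used an automated rule or read the value off the plot is not stated in this excerpt.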
Figure 3. Portion of the equivalent concept pairs output from OAANN. The final result, the equivalent concept pairs between the two test ontologies, was presented to domain experts, and Precision and Recall were evaluated. A portion of these equivalent concept pairs is shown in this figure.
Figure 4. Similarity matrix sorted by concept name similarity. The similarity matrix was sorted by the "concept name similarity" column. Comparing this result with that in Figure 3 indicates that OAANN avoids the situation where string matching alone is considered.
Figure 5. Neural network structure. The input into this network is a vector consisting of s1, s2, and s3, which represent the similarity in name, properties, and ancestors, respectively, for a given pair of concepts. w1, w2, and w3 are the weights assigned to the respective inputs. The output from this network is s, the similarity value between these two concepts as given by Formula 3.
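Formula 3 is not reproduced in this record. The sketch below assumes it is a plain weighted sum, s = w1*s1 + w2*s2 + w3*s3, and pairs it with a simple delta-rule update as a stand-in for the training step. The update rule, the clipping and renormalisation, the function names, and the toy examples are all assumptions made for illustration and should not be read as the authors' actual learning procedure.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], float]  # ((s1, s2, s3), target similarity)


def combined_similarity(weights: Sequence[float], sims: Sequence[float]) -> float:
    """Weighted sum of the per-aspect similarities (assumed form of Formula 3)."""
    return sum(w * s for w, s in zip(weights, sims))


def train_weights(
    examples: Sequence[Example],
    learning_rate: float = 0.1,
    epochs: int = 200,
) -> List[float]:
    """Fit w1..w3 with a delta-rule update (an assumed training rule)."""
    weights = [1.0 / 3.0] * 3  # start from equal weights
    for _ in range(epochs):
        for sims, target in examples:
            error = target - combined_similarity(weights, sims)
            weights = [w + learning_rate * error * s for w, s in zip(weights, sims)]
        # Clip negatives and renormalise so the weights remain non-negative
        # and sum to 1 (a design choice of this sketch, not of the paper).
        weights = [max(w, 0.0) for w in weights]
        total = sum(weights)
        weights = [w / total for w in weights] if total > 0 else [1.0 / 3.0] * 3
    return weights


# Invented toy examples: both pairs have similar names, but only the first
# also agrees on properties and ancestors, so it is labelled equivalent (1.0).
toy_examples = [
    ((0.9, 0.8, 0.7), 1.0),
    ((0.9, 0.1, 0.2), 0.0),
]
learned = train_weights(toy_examples)
score = combined_similarity(learned, (0.9, 0.8, 0.7))
```

In this toy setting the name similarity cannot separate the two pairs, so the update shifts weight away from it. This mirrors the point made in the Figure 4 caption that string matching alone is not sufficient.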