Mohamed Ben Aouicha, Mohamed Ali Hadj Taieb.
Abstract
The exploitation of heterogeneous clinical sources and healthcare records is fundamental in clinical and translational research. The determination of semantic similarity between word pairs is an important component of text understanding that enables the processing and structuring of textual resources. Some semantic similarity measures have been adapted to the biomedical field by incorporating domain information extracted from clinical data or from medical ontologies such as MeSH. This study focuses on Information Content (IC) based measures that exploit the topological parameters of the taxonomy to express the semantics of a concept. We propose a new intrinsic IC computing method, based on the taxonomical parameters of the ancestors' subgraph, that assigns an IC value to each biomedical concept in the "is a" hierarchy. Moreover, we present a study of the topological parameters of the MeSH taxonomy. This study addresses the semantic interpretation of the depth and descendants' subgraph parameters and the different ways of expressing them. Using MeSH as the input ontology, the accuracy of our proposal is evaluated and compared against other IC-based measures on several widely used benchmarks of biomedical terms. The correlation between the similarity scores obtained with the proposed approach and the ratings of human experts shows that our proposal outperforms the previous measures.
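To make the idea of an intrinsic, taxonomy-only IC concrete, here is a minimal sketch in Python. It is not the ancestor-subgraph method proposed in the paper, whose formula the abstract does not give. Instead it assumes the well-known descendant-count formulation, IC(c) = 1 - log(desc(c) + 1) / log(N), and plugs it into Lin's similarity. The `PARENTS` hierarchy and every function name are hypothetical and only loosely imitate an "is a" DAG such as MeSH.

```python
import math

# Hypothetical toy "is a" hierarchy (child -> parents); NOT real MeSH data.
# "pneumonia" has two parents, so the structure is a DAG rather than a tree.
PARENTS = {
    "disease": [],
    "infection": ["disease"],
    "respiratory_disease": ["disease"],
    "pneumonia": ["infection", "respiratory_disease"],
    "asthma": ["respiratory_disease"],
}


def ancestors(concept):
    """Return the concept itself plus every concept reachable via parent links."""
    seen = {concept}
    stack = [concept]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def descendant_count(concept):
    """Count the concepts that have `concept` as a proper ancestor."""
    return sum(1 for c in PARENTS if c != concept and concept in ancestors(c))


def intrinsic_ic(concept):
    """Descendant-based intrinsic IC (assumed formulation, not the paper's):
    leaves get IC 1, the root gets IC 0."""
    total = len(PARENTS)
    return 1.0 - math.log(descendant_count(concept) + 1) / math.log(total)


def lin_similarity(a, b):
    """Lin similarity using the most informative common ancestor (MICA)."""
    common = ancestors(a) & ancestors(b)
    mica_ic = max(intrinsic_ic(c) for c in common)
    denom = intrinsic_ic(a) + intrinsic_ic(b)
    # Both concepts are the root when denom is 0; return 0.0 by convention here.
    return 2.0 * mica_ic / denom if denom > 0 else 0.0


if __name__ == "__main__":
    for pair in [("pneumonia", "infection"), ("pneumonia", "asthma")]:
        print(pair, round(lin_similarity(*pair), 4))
```

In this toy hierarchy, "pneumonia" scores higher against "infection" than against "asthma", because "infection" has fewer descendants than "respiratory_disease" and is therefore the more informative shared ancestor. An evaluation like the one described in the abstract would then correlate such scores with human expert ratings on benchmark term pairs.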
Keywords: Biomedicine; DAG topological parameters; Information content; MeSH; Semantic similarity
Year: 2015 PMID: 26707454 DOI: 10.1016/j.jbi.2015.12.007
Source DB: PubMed Journal: J Biomed Inform ISSN: 1532-0464 Impact factor: 6.317