An ontology-based similarity measure for biomedical data – application to radiology reports.

Thusitha Mabotuwana, Michael C Lee, Eric V Cohen-Solal.

Abstract

BACKGROUND: Determining the similarity between two individual concepts, or between two sets of concepts extracted from a free-text document, is important for various aspects of biomedicine, for instance, to find prior clinical reports for a patient that are relevant to the current clinical context. Simple concept-matching techniques, such as lexicon-based comparisons, are typically not sufficient to produce an accurate measure of similarity.
METHODS: In this study, we tested an enhancement to the standard document vector cosine similarity model in which ontological parent-child (is-a) relationships are exploited. For a given concept, we define a semantic vector consisting of all parent concepts and their corresponding weights, as determined by the shortest distance between the concept and each parent after accounting for all possible paths. Similarity between two concepts is then determined by taking the cosine of the angle between the two corresponding vectors. To test the improvement over the non-semantic document vector cosine similarity model, we measured the similarity between groups of reports arising from similar clinical contexts, including anatomy and imaging procedure. We further applied the similarity metrics within a k-nearest-neighbor (k-NN) algorithm to classify reports based on their anatomical and procedure-based groups. 2150 production CT radiology reports (952 abdomen reports and 1128 neuro reports) were used in testing, with SNOMED CT, restricted to the Body structure, Clinical finding and Procedure branches, as the reference ontology.
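The semantic-vector construction described in METHODS can be illustrated with a minimal sketch. Several details here are assumptions rather than the authors' implementation: the abstract does not state the weighting formula, so the sketch uses 1 / (1 + shortest distance); it also places the concept itself in its own vector with weight 1. The `PARENTS` hierarchy and the function names are hypothetical and do not reflect real SNOMED CT content.

```python
import math
from collections import deque

# Hypothetical toy is-a hierarchy (child -> list of parents); not real SNOMED CT.
PARENTS = {
    "liver": ["abdominal organ"],
    "spleen": ["abdominal organ"],
    "abdominal organ": ["body structure"],
    "brain": ["head structure"],
    "head structure": ["body structure"],
    "body structure": [],
}

def semantic_vector(concept, parents=PARENTS):
    """Return {concept or ancestor: weight} based on shortest is-a distance."""
    dist = {concept: 0}
    queue = deque([concept])
    # Breadth-first search gives the shortest distance over all possible paths.
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in dist:
                dist[parent] = dist[node] + 1
                queue.append(parent)
    # Assumed weighting: closer ancestors count more. The abstract does not
    # specify the actual formula.
    return {node: 1.0 / (1 + d) for node, d in dist.items()}

def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(key, 0.0) for key, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def concept_similarity(a, b):
    """Similarity of two concepts via their semantic vectors."""
    return cosine(semantic_vector(a), semantic_vector(b))
```

In this toy hierarchy, `concept_similarity("liver", "spleen")` exceeds `concept_similarity("liver", "brain")` because the first pair shares a closer ancestor, whereas plain lexical matching would score both pairs as zero.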
RESULTS: The semantic algorithm preferentially increased the intra-class similarity over the inter-class similarity, with mean increases of 0.07 and 0.08 in the neuro-neuro and abdomen-abdomen pairs, respectively, versus a mean increase of 0.04 in the neuro-abdomen pairs. Using leave-one-out cross-validation, in which each document was iteratively used as a test sample while being excluded from the training data, the k-NN based classification accuracy was consistently higher with the semantics-based measure than with the non-semantic measure. Moreover, the accuracy remained steady even as the value of k was increased: for the two anatomy-related classes, accuracy for k=41 was 93.1% with semantics compared to 86.7% without semantics. Similarly, for the eight imaging-procedure-related classes, accuracy (for k=41) with semantics was 63.8% compared to 60.2% without semantics. At the same k, accuracy improved significantly to 82.8% and 77.4%, respectively, when procedures were logically grouped together into four classes (such as by ignoring contrast information in the imaging procedure description). Similar results were seen at other k values.
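The leave-one-out k-NN evaluation described in RESULTS can be sketched as follows. This is an illustrative outline, not the authors' code: the document vectors in `DOCS` are hypothetical concept counts, the function names are invented for the example, and classification uses a simple majority vote among the k most cosine-similar training documents. Tie handling is also an assumption, since the abstract does not describe it.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(key, 0.0) for key, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def knn_predict(query, training, k):
    """Majority vote over the k training documents most similar to query."""
    ranked = sorted(training, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def leave_one_out_accuracy(docs, k):
    """Classify each document using all other documents as training data."""
    correct = 0
    for i, (vector, label) in enumerate(docs):
        training = docs[:i] + docs[i + 1:]  # exclude the test document
        if knn_predict(vector, training, k) == label:
            correct += 1
    return correct / len(docs)

# Hypothetical toy concept-count vectors with anatomy labels.
DOCS = [
    ({"liver": 2, "spleen": 1}, "abdomen"),
    ({"liver": 1, "kidney": 1}, "abdomen"),
    ({"spleen": 1, "kidney": 2}, "abdomen"),
    ({"brain": 2, "skull": 1}, "neuro"),
    ({"brain": 1, "ventricle": 1}, "neuro"),
    ({"skull": 1, "ventricle": 2}, "neuro"),
]
```

Calling `leave_one_out_accuracy(DOCS, k)` for several values of k mirrors the comparison across k values reported above. To evaluate a semantic variant, the document vectors would first be expanded with weighted ancestor concepts before being passed in.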
CONCLUSIONS: The addition of semantic context into the document vector space model improves the ability of the cosine similarity to differentiate between radiology reports of different anatomical and image procedure-based classes. This effect can be leveraged for document classification tasks, which suggests its potential applicability for biomedical information retrieval.
Copyright © 2013 Elsevier Inc. All rights reserved.

Keywords:  Document similarity comparison; Natural Language Processing; Radiology informatics; Radiology information systems; Semantic similarity; Semantics; Systematized Nomenclature of Medicine

Year:  2013        PMID: 23850839     DOI: 10.1016/j.jbi.2013.06.013

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317


  8 in total

1.  [No exchange of information without technology : modern infrastructure in radiology].

Authors:  H Hupperts; K-G A Hermann
Journal:  Radiologe       Date:  2014-01       Impact factor: 0.635

2.  Patient Similarity: Emerging Concepts in Systems and Precision Medicine.

Authors:  Sherry-Ann Brown
Journal:  Front Physiol       Date:  2016-11-24       Impact factor: 4.566

3.  Use of the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) for Processing Free Text in Health Care: Systematic Scoping Review.

Authors:  Christophe Gaudet-Blavignac; Vasiliki Foufi; Mina Bjelogrlic; Christian Lovis
Journal:  J Med Internet Res       Date:  2021-01-26       Impact factor: 5.428

4.  HESML: a real-time semantic measures library for the biomedical domain with a reproducible survey.

Authors:  Juan J Lastra-Díaz; Alicia Lara-Clares; Ana Garcia-Serrano
Journal:  BMC Bioinformatics       Date:  2022-01-06       Impact factor: 3.169

5.  BIOSSES: a semantic sentence similarity estimation system for the biomedical domain.

Authors:  Gizem Sogancioglu; Hakime Öztürk; Arzucan Özgür
Journal:  Bioinformatics       Date:  2017-07-15       Impact factor: 6.937

6.  Annotations, Ontologies, and Whole Slide Images - Development of an Annotated Ontology-Driven Whole Slide Image Library of Normal and Abnormal Human Tissue.

Authors:  Karin Lindman; Jerómino F Rose; Martin Lindvall; Claes Lundström; Darren Treanor
Journal:  J Pathol Inform       Date:  2019-07-23

7.  A Triangular Similarity Measure for Case Retrieval in CBR and Its Application to an Agricultural Decision Support System.

Authors:  Zhaoyu Zhai; José-Fernán Martínez Ortega; Pedro Castillejo; Victoria Beltran
Journal:  Sensors (Basel)       Date:  2019-10-23       Impact factor: 3.576

8.  Empowering study of breast cancer data with application of artificial intelligence technology: promises, challenges, and use cases. (Review)

Authors:  Maryam Panahiazar; Nolan Chen; Dmytro Lituiev; Dexter Hadley
Journal:  Clin Exp Metastasis       Date:  2021-10-26       Impact factor: 5.150
