Claire L Gordon (1), Chunhua Weng (2).
(1) Department of Medicine, Columbia University Medical Center, 630 West 168th Street, New York, USA; Department of Biomedical Informatics, Columbia University Medical Center, 622 West 168th Street, New York, NY 10032, USA; Department of Medicine, University of Melbourne, Melbourne, VIC 3010, Australia.
(2) Department of Biomedical Informatics, Columbia University Medical Center, 622 West 168th Street, New York, NY 10032, USA. Electronic address: cw2384@columbia.edu.
Abstract
INTRODUCTION: A common bottleneck during ontology evaluation is knowledge acquisition from domain experts for gold standard creation. This paper contributes a novel semi-automated method for evaluating the concept coverage and accuracy of biomedical ontologies. The method complements expert knowledge with knowledge automatically extracted from clinical practice guidelines and electronic health records, which minimizes reliance on expensive domain expertise for gold standard generation.
METHODS: We developed a bacterial clinical infectious diseases ontology (BCIDO) to assist clinical infectious disease (ID) treatment decision support. Using a semi-automated method, we integrated diverse knowledge sources, including publicly available infectious disease guidelines from international repositories, electronic health records, and expert-generated infectious disease case scenarios, to generate a compendium of infectious disease knowledge, and used it to evaluate the accuracy and coverage of BCIDO.
RESULTS: BCIDO has three classes (i.e., infectious disease, antibiotic, bacteria) containing 593 distinct concepts and 2345 distinct concept relationships. Our semi-automated method generated an ID knowledge compendium consisting of 637 concepts and 1554 concept relationships. Overall, BCIDO covered 79% (504/637) of the concepts and 89% (1378/1554) of the concept relationships in the ID compendium. BCIDO coverage of ID compendium concepts was 92% (121/131) for antibiotic, 80% (205/257) for infectious disease, and 72% (178/249) for bacteria. The low coverage of bacterial concepts in BCIDO was due to a difference in concept granularity between BCIDO and infectious disease guidelines. Guidelines and expert-generated scenarios were the richest sources of ID concepts and relationships, while patient records provided relatively fewer concepts and relationships.
CONCLUSIONS: Our semi-automated method was cost-effective for generating a useful knowledge compendium with minimal reliance on domain experts. This method can be useful for continued development and evaluation of biomedical ontologies for better accuracy and coverage.
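The coverage figures in the RESULTS paragraph are simple ratios of compendium items found in BCIDO to compendium items overall. As an illustration only, the following minimal Python sketch recomputes them from the counts stated in the abstract. The variable and function names are hypothetical and are not taken from the authors' method or tooling; values are shown to one decimal place, whereas the abstract reports whole-number percentages.

```python
# Counts copied from the RESULTS paragraph:
# (concepts covered by BCIDO, concepts in the ID compendium) per class.
# All names below are illustrative assumptions, not the authors' code.
concept_counts = {
    "antibiotic": (121, 131),
    "infectious disease": (205, 257),
    "bacteria": (178, 249),
}
relationship_counts = (1378, 1554)  # (covered, total) concept relationships


def coverage_percent(covered, total):
    """Return coverage as a percentage rounded to one decimal place."""
    return round(100.0 * covered / total, 1)


# Per-class concept coverage.
per_class = {
    name: coverage_percent(covered, total)
    for name, (covered, total) in concept_counts.items()
}

# Overall concept coverage is the pooled ratio, not the mean of class ratios.
covered_concepts = sum(covered for covered, _ in concept_counts.values())
total_concepts = sum(total for _, total in concept_counts.values())
overall_concepts = coverage_percent(covered_concepts, total_concepts)

overall_relationships = coverage_percent(*relationship_counts)

for name, pct in per_class.items():
    print(f"{name}: {pct}%")
print(f"all concepts: {overall_concepts}% ({covered_concepts}/{total_concepts})")
print(f"relationships: {overall_relationships}%")
```

The per-class counts sum to the overall totals reported in the abstract (504 covered out of 637), so the overall concept coverage is the pooled ratio across the three classes.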