Shashank Agarwal (1), Hong Yu. (1) Medical Informatics, University of Wisconsin-Milwaukee, Milwaukee, WI, USA.
Abstract
OBJECTIVE: Hedging is frequently used in both the biological literature and clinical notes to denote uncertainty or speculation. Text-mining applications need to detect hedge cues and their scope; otherwise, uncertain events are incorrectly identified as factual events. However, due to the complexity of language, identifying hedge cues and their scope in a sentence is not a trivial task. Our objective was to develop an algorithm that automatically detects hedge cues and their scope in biomedical literature.
METHODOLOGY: We used conditional random fields (CRFs), a supervised machine-learning algorithm, to train models to detect hedge cue phrases and their scope in biomedical literature. The models were trained on the publicly available BioScope corpus. We evaluated the performance of the CRF models in identifying hedge cue phrases and their scope by calculating recall, precision and F1-score. We compared our models with three competitive baseline systems.
RESULTS: Our best CRF-based model performed statistically significantly better than the baseline systems, achieving F1-scores of 88% and 86% in detecting hedge cue phrases and their scope, respectively, in biological literature, and F1-scores of 93% and 90%, respectively, in clinical notes.
CONCLUSIONS: Our approach is robust, as it can identify hedge cues and their scope in both biological and clinical text. To benefit text-mining applications, our system is publicly available as a Java API and as an online application at http://hedgescope.askhermes.org. To our knowledge, this is the first publicly available system to detect hedge cues and their scope in biomedical literature.
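The evaluation measures named in the methodology (recall, precision and F1-score) can be illustrated with a minimal Python sketch. The abstract does not state how a predicted cue or scope was matched against the gold annotation, so the exact-match criterion, the `Span` representation, and the toy data below are all assumptions made for illustration; they are not the authors' evaluation code and do not reproduce the reported scores.

```python
from typing import Set, Tuple

# Hypothetical representation of an annotated hedge cue (or scope):
# (sentence_id, first_token_index, last_token_index).
Span = Tuple[int, int, int]


def precision_recall_f1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 under an assumed exact-match criterion."""
    # A prediction counts as correct only if the same span is in the gold set.
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data, invented for illustration only.
gold_cues = {(0, 3, 4), (1, 0, 1), (2, 5, 7)}
predicted_cues = {(0, 3, 4), (2, 5, 7), (2, 9, 10), (3, 1, 2)}

p, r, f1 = precision_recall_f1(gold_cues, predicted_cues)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

In this toy case two of the four predicted spans are correct and two of the three gold spans are found, so precision is 2/4, recall is 2/3, and F1 is 4/7. A looser criterion, such as partial overlap between spans, would yield different numbers from the same predictions.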