Shashank Agarwal¹, Hong Yu. ¹Medical Informatics, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, USA.
Abstract
OBJECTIVE: Negation is a linguistic phenomenon that marks the absence of an entity or event. Negated events are frequently reported in both biological literature and clinical notes. Text mining applications benefit from the detection of negation and its scope. However, due to the complexity of language, identifying the scope of negation in a sentence is not a trivial task. DESIGN: Conditional random fields (CRF), a supervised machine-learning algorithm, were used to train models to detect negation cue phrases and their scope in both biological literature and clinical notes. The models were trained on the publicly available BioScope corpus. MEASUREMENT: The CRF models were evaluated on identifying negation cue phrases and their scope by calculating recall, precision and F1-score. The models were compared with four competitive baseline systems. RESULTS: The best CRF-based model performed statistically significantly better than all baseline systems and NegEx, achieving F1-scores of 98% and 95% on detecting negation cue phrases and their scope, respectively, in clinical notes, and F1-scores of 97% and 85% on detecting negation cue phrases and their scope, respectively, in biological literature. CONCLUSIONS: This approach is robust, as it can identify negation scope in both biological and clinical text. To benefit text mining applications, the system is publicly available as a Java API and as an online application at http://negscope.askhermes.org.
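To make the MEASUREMENT step concrete, the sketch below computes token-level precision, recall and F1-score for one label. It is illustrative only and is not the authors' evaluation code: the three-way label scheme (`CUE`, `SCOPE`, `O`), the example sentence, and the function name `precision_recall_f1` are all assumptions made for this example, and the paper may score cues and scopes differently (for instance, per scope rather than per token).

```python
from typing import List, Tuple


def precision_recall_f1(gold: List[str], pred: List[str], target: str) -> Tuple[float, float, float]:
    """Token-level precision, recall and F1 for a single label (hypothetical helper)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # Count true positives, false positives and false negatives for `target`.
    tp = sum(1 for g, p in zip(gold, pred) if g == target and p == target)
    fp = sum(1 for g, p in zip(gold, pred) if g != target and p == target)
    fn = sum(1 for g, p in zip(gold, pred) if g == target and p != target)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


# Assumed example: "no" is the negation cue and "fever or cough" is its scope.
tokens = ["The", "patient", "has", "no", "fever", "or", "cough", "."]
gold = ["O", "O", "O", "CUE", "SCOPE", "SCOPE", "SCOPE", "O"]
# A hypothetical prediction that finds the cue but cuts the scope one token short.
pred = ["O", "O", "O", "CUE", "SCOPE", "SCOPE", "O", "O"]

cue_scores = precision_recall_f1(gold, pred, "CUE")
scope_scores = precision_recall_f1(gold, pred, "SCOPE")
print("cue   P/R/F1:", cue_scores)
print("scope P/R/F1:", scope_scores)
```

In this toy case the cue is recovered exactly, while the truncated scope keeps precision at 1.0 but lowers recall, which illustrates how scope detection can score below cue detection, as it does in the results reported above.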