Jon Patrick (1), Min Li. (1) Faculty of Engineering and IT, The University of Sydney, Sydney, Australia. jonpat@it.usyd.edu.au
Abstract
OBJECTIVE: Medication information is among the most valuable sources of data in clinical records. This paper describes the use of a cascade of machine learners that automatically extracts medication information from clinical records. DESIGN: The authors developed a novel supervised learning model that incorporates two machine learning algorithms and several rule-based engines. MEASUREMENTS: Each step was evaluated using precision, recall and F-measure. The final outputs of the system were scored using the i2b2 workshop evaluation metrics, including strict and relaxed matching against a gold standard. RESULTS: Evaluation results showed greater than 90% accuracy on five out of seven entities in the named entity recognition task, and an F-measure greater than 95% on the relationship classification task. The strict micro-averaged F-measure for the system output was the best submitted performance in the competition, at 85.65%. LIMITATIONS: Clinical staff will only use practical processing systems if they have confidence in their reliability. The authors estimate that an acceptable accuracy for such a working system should be approximately 95%. This leaves a significant performance gap of 5 to 10% from current processing capabilities. CONCLUSION: A multistage method with mixed computational strategies, combining rule-based classifiers and statistical classifiers, seems to provide a near-optimal strategy for automated extraction of medication information from clinical records.
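As a rough illustration of the strict micro-averaged F-measure mentioned under MEASUREMENTS and RESULTS, the sketch below pools all extracted entries across documents and counts only exact matches with the gold standard. It is not the official i2b2 scoring script: the `Entry` tuple layout, the `micro_prf` function name, and the sample data are hypothetical, and relaxed matching (which credits partial overlap) is not implemented.

```python
from typing import Set, Tuple

# Hypothetical entry layout (an assumption): (document id, field name, extracted text).
Entry = Tuple[str, str, str]


def micro_prf(gold: Set[Entry], predicted: Set[Entry]) -> Tuple[float, float, float]:
    """Precision, recall and F-measure over entries pooled across all documents.

    Pooling before counting is what makes the average "micro"; requiring
    identical tuples is a simplified stand-in for strict matching.
    """
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Made-up example data, for illustration only.
gold = {
    ("d1", "medication", "aspirin"),
    ("d1", "dosage", "81 mg"),
    ("d2", "medication", "lisinopril"),
    ("d2", "frequency", "daily"),
}
predicted = {
    ("d1", "medication", "aspirin"),
    ("d1", "dosage", "81"),  # partial span: not counted under strict matching
    ("d2", "medication", "lisinopril"),
}

precision, recall, f_measure = micro_prf(gold, predicted)
print(f"P={precision:.4f} R={recall:.4f} F={f_measure:.4f}")
```

In this toy case two of three predicted entries match exactly, so precision is 2/3, recall is 2/4, and the F-measure is 4/7. A relaxed scorer would give partial credit for the truncated dosage span.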