Tianyong Hao, Hongfang Liu, Chunhua Weng1. 1. Chunhua Weng, Ph.D., Department of Biomedical Informatics, Columbia University, New York City, 622 W 168th Street, PH-20, New York, NY 10032, USA, E-mail: chunhua@columbia.edu.
Abstract
OBJECTIVES: To develop an automated method for extracting and structuring numeric lab test comparison statements from text and evaluate the method using clinical trial eligibility criteria text. METHODS: Leveraging semantic knowledge from the Unified Medical Language System (UMLS) and domain knowledge acquired from the Internet, Valx takes seven steps to extract and normalize numeric lab test expressions: 1) text preprocessing, 2) numeric, unit, and comparison operator extraction, 3) variable identification using hybrid knowledge, 4) variable - numeric association, 5) context-based association filtering, 6) measurement unit normalization, and 7) heuristic rule-based comparison statements verification. Our reference standard was the consensus-based annotation among three raters for all comparison statements for two variables, i.e., HbA1c and glucose, identified from all of Type 1 and Type 2 diabetes trials in ClinicalTrials.gov. RESULTS: The precision, recall, and F-measure for structuring HbA1c comparison statements were 99.6%, 98.1%, 98.8% for Type 1 diabetes trials, and 98.8%, 96.9%, 97.8% for Type 2 diabetes trials, respectively. The precision, recall, and F-measure for structuring glucose comparison statements were 97.3%, 94.8%, 96.1% for Type 1 diabetes trials, and 92.3%, 92.3%, 92.3% for Type 2 diabetes trials, respectively. CONCLUSIONS: Valx is effective at extracting and structuring free-text lab test comparison statements in clinical trial summaries. Future studies are warranted to test its generalizability beyond eligibility criteria text. The open-source Valx enables its further evaluation and continued improvement among the collaborative scientific community.
OBJECTIVES: To develop an automated method for extracting and structuring numeric lab test comparison statements from text and evaluate the method using clinical trial eligibility criteria text. METHODS: Leveraging semantic knowledge from the Unified Medical Language System (UMLS) and domain knowledge acquired from the Internet, Valx takes seven steps to extract and normalize numeric lab test expressions: 1) text preprocessing, 2) numeric, unit, and comparison operator extraction, 3) variable identification using hybrid knowledge, 4) variable - numeric association, 5) context-based association filtering, 6) measurement unit normalization, and 7) heuristic rule-based comparison statements verification. Our reference standard was the consensus-based annotation among three raters for all comparison statements for two variables, i.e., HbA1c and glucose, identified from all of Type 1 and Type 2 diabetes trials in ClinicalTrials.gov. RESULTS: The precision, recall, and F-measure for structuring HbA1c comparison statements were 99.6%, 98.1%, 98.8% for Type 1 diabetes trials, and 98.8%, 96.9%, 97.8% for Type 2 diabetes trials, respectively. The precision, recall, and F-measure for structuring glucose comparison statements were 97.3%, 94.8%, 96.1% for Type 1 diabetes trials, and 92.3%, 92.3%, 92.3% for Type 2 diabetes trials, respectively. CONCLUSIONS:Valx is effective at extracting and structuring free-text lab test comparison statements in clinical trial summaries. Future studies are warranted to test its generalizability beyond eligibility criteria text. The open-source Valx enables its further evaluation and continued improvement among the collaborative scientific community.
Entities:
Keywords:
Medical informatics; clinical trial; comparison statement; natural language processing; patient selection
Authors: Samson W Tu; Mor Peleg; Simona Carini; Michael Bobak; Jessica Ross; Daniel Rubin; Ida Sim Journal: J Biomed Inform Date: 2010-09-17 Impact factor: 6.317
Authors: Samir R Thadani; Chunhua Weng; J Thomas Bigger; John F Ennever; David Wajngurt Journal: J Am Med Inform Assoc Date: 2009-08-28 Impact factor: 4.497
Authors: Zhe He; Arturo Gonzalez-Izquierdo; Spiros Denaxas; Andrei Sura; Yi Guo; William R Hogan; Elizabeth Shenkman; Jiang Bian Journal: AMIA Annu Symp Proc Date: 2018-04-16