Na Hong1, Dingcheng Li2, Yue Yu3, Qiongying Xiu4, Hongfang Liu2, Guoqian Jiang5. 1. Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA; Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing, China. 2. Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA. 3. Department of Medical Informatics, School of Public Health, Jilin University, Changchun, Jilin, China. 4. Computer Science, Winona State University, Rochester, MN, USA. 5. Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA. Electronic address: Jiang.Guoqian@mayo.edu.
Abstract
BACKGROUND: Constructing standard and computable clinical diagnostic criteria is an important but challenging research field in the clinical informatics community. The Quality Data Model (QDM) is emerging as a promising information model for standardizing clinical diagnostic criteria. OBJECTIVE: To develop and evaluate automated methods for converting textual clinical diagnostic criteria into a structured format using QDM. METHODS: We used a clinical Natural Language Processing (NLP) tool known as cTAKES to detect sentences and annotate events in diagnostic criteria. We developed a rule-based approach for assigning the QDM datatype(s) to an individual criterion, and we applied a machine learning algorithm based on Conditional Random Fields (CRFs) to annotate the attributes belonging to each particular QDM datatype. We manually developed an annotated corpus as the gold standard and used standard measures (precision, recall and f-measure) for the performance evaluation. RESULTS: We harvested 267 individual criteria with the datatypes of Symptom and Laboratory Test from 63 textual diagnostic criteria. We manually annotated attributes and values in 142 individual Laboratory Test criteria. Our rule-based approach achieved an average precision of 0.84, recall of 0.86, and f-measure of 0.85; the CRFs-based classification achieved a precision of 0.95, recall of 0.88, and f-measure of 0.91. We also implemented a web-based tool that automatically translates textual Laboratory Test criteria into the QDM XML template format. The results indicated that our approaches leveraging cTAKES and CRFs are effective in facilitating diagnostic criteria annotation and classification. CONCLUSION: Our NLP-based computational framework is a feasible and useful solution for developing diagnostic criteria representation and computerization.
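The evaluation measures named above can be illustrated with a minimal sketch. It assumes the f-measure is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function names and the example counts are hypothetical and are not taken from the study.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision and recall from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f_measure(precision: float, recall: float) -> float:
    """Balanced F1 (assumed here): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not from the paper's corpus).
p, r = precision_recall(tp=8, fp=2, fn=2)
example_f = f_measure(p, r)

# Reported precision/recall pairs from the abstract, rounded to two decimals.
rule_based_f = round(f_measure(0.84, 0.86), 2)
crf_f = round(f_measure(0.95, 0.88), 2)
```

Under the F1 assumption, the reported precision and recall pairs reproduce the reported f-measures after rounding to two decimals, which suggests the figures in the abstract are internally consistent.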