Automating Clinical Chart Review: An Open-Source Natural Language Processing Pipeline Developed on Free-Text Radiology Reports From Patients With Glioblastoma.

Joeky T Senders¹,², Logan D Cho¹,³, Paola Calvachi¹, John J McNulty¹,⁴, Joanna L Ashby¹, Isabelle S Schulte¹, Ahmad Kareem Almekkawi¹, Alireza Mehrtash⁵, William B Gormley¹, Timothy R Smith¹, Marike L D Broekman²,⁶, Omar Arnaout¹.

Abstract

PURPOSE: The aim of this study was to develop an open-source natural language processing (NLP) pipeline for text mining of medical information from clinical reports. We also aimed to provide insight into why certain variables or reports are more suitable for clinical text mining than others.
MATERIALS AND METHODS: Various NLP models were developed to extract 15 radiologic characteristics from free-text radiology reports for patients with glioblastoma. Ten-fold cross-validation was used to optimize the hyperparameter settings and estimate model performance. We examined how model performance was associated with quantitative attributes of the radiologic characteristics and reports.
RESULTS: In total, 562 unique brain magnetic resonance imaging reports were retrieved. NLP extracted 15 radiologic characteristics with high to excellent discrimination (area under the curve, 0.82 to 0.98) and accuracy (78.6% to 96.6%). Model performance was correlated with the inter-rater agreement of the manually provided labels (ρ = 0.904; P < .001) but not with the frequency distribution of the variables of interest (ρ = 0.179; P = .52). All variables labeled with near-perfect inter-rater agreement were classified with excellent performance (area under the curve > 0.95). Excellent performance could be achieved for variables with only 50 to 100 observations in the minority group and class imbalances of up to a 9:1 ratio. Report-level classification accuracy was not associated with the number of words or the vocabulary size of the individual reports.
CONCLUSION: This study provides an open-source NLP pipeline that allows for text mining of narratively written clinical reports. Small sample sizes and class imbalance should not be considered absolute contraindications for text mining in clinical research. However, future studies should report measures of inter-rater agreement whenever ground truth is based on a consensus label, and should use this measure to identify clinical variables eligible for text mining.

Year: 2020 PMID: 31977252 DOI: 10.1200/CCI.19.00060

Source DB: PubMed Journal: JCO Clin Cancer Inform ISSN: 2473-4276

Cited by

3 in total

1. Comparison and interpretability of machine learning models to predict severity of chest injury.

Authors: Sujay Kulshrestha; Dmitriy Dligach; Cara Joyce; Richard Gonzalez; Ann P O'Rourke; Joshua M Glazer; Anne Stey; Jacqueline M Kruser; Matthew M Churpek; Majid Afshar
Journal: JAMIA Open Date: 2021-03-01

2. A Semiautomated Chart Review for Assessing the Development of Radiation Pneumonitis Using Natural Language Processing: Diagnostic Accuracy and Feasibility Study.

Authors: Jordan McKenzie; Rasika Rajapakshe; Hua Shen; Shan Rajapakshe; Angela Lin
Journal: JMIR Med Inform Date: 2021-11-12

3. Assessment of Electronic Health Record for Cancer Research and Patient Care Through a Scoping Review of Cancer Natural Language Processing. (Review)

Authors: Liwei Wang; Sunyang Fu; Andrew Wen; Xiaoyang Ruan; Huan He; Sijia Liu; Sungrim Moon; Michelle Mai; Irbaz B Riaz; Nan Wang; Ping Yang; Hua Xu; Jeremy L Warner; Hongfang Liu
Journal: JCO Clin Cancer Inform Date: 2022-07