A D Brown (1), J R Kachura (2).
1. Division of Vascular and Interventional Radiology, Department of Medical Imaging, Toronto General Hospital-University Health Network/University of Toronto, Toronto, Canada. Electronic address: andrew.brown@sloan.mit.edu.
2. Division of Vascular and Interventional Radiology, Department of Medical Imaging, Toronto General Hospital-University Health Network/University of Toronto, Toronto, Canada.
Abstract
OBJECTIVE: Radiology is a finite health care resource in high demand at most health centers. However, anticipating fluctuations in demand is a challenge because of the inherent uncertainty in disease prognosis. The aim of this study was to explore the potential of natural language processing (NLP) to predict downstream radiology resource utilization in patients undergoing surveillance for hepatocellular carcinoma (HCC).
MATERIALS AND METHODS: All HCC surveillance CT examinations performed at our institution from January 1, 2010, to October 31, 2017, were selected from our departmental radiology information system. We used open source NLP and machine learning software to parse radiology report text into bag-of-words and term frequency-inverse document frequency (TF-IDF) representations. Three machine learning models, namely logistic regression, support vector machine (SVM), and random forest, were used to predict future utilization of radiology department resources. A test data set was used to calculate accuracy, sensitivity, and specificity in addition to the area under the curve (AUC).
RESULTS: As a group, the bag-of-words models performed slightly worse than the models using TF-IDF feature extraction. The TF-IDF + SVM model outperformed all other models, with an accuracy of 92%, a sensitivity of 83%, a specificity of 96%, and an AUC of 0.971.
CONCLUSIONS: NLP-based models can accurately predict downstream radiology resource utilization from narrative HCC surveillance reports and have potential for translation to health care management, where they may improve decision making, reduce costs, and broaden access to care.
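To make the terms in the abstract concrete, the following is a minimal Python sketch of TF-IDF weighting over bag-of-words counts and of the accuracy, sensitivity, and specificity calculation on a test set. It is not the authors' implementation: the abstract does not name the software used, so the function names `tfidf` and `binary_metrics`, the whitespace tokenization, the smoothed IDF variant, and the toy report snippets are all assumptions made for illustration. The classifier itself (logistic regression, SVM, or random forest) is omitted.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumptions (not from the source): lowercase whitespace tokenization
    and a smoothed IDF, idf = ln((1 + N) / (1 + df)) + 1.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many reports contain each term at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)  # bag-of-words counts for this report
        weights.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return weights


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for 0/1 labels.

    Assumes both classes occur in y_true, so neither denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical report snippets, invented for illustration only.
reports = [
    "no focal lesion",
    "new arterial enhancing lesion",
    "no change",
]
weights = tfidf(reports)

# Hypothetical labels: 1 = further radiology resources used, 0 = not.
metrics = binary_metrics([1, 1, 0, 0, 0], [1, 0, 0, 0, 1])
```

In this toy corpus, a term that occurs in only one report ("focal") receives a higher weight than a term shared by two reports ("lesion"). That down-weighting of common terms is what distinguishes TF-IDF from raw bag-of-words counts.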
Authors: Chethan Jujjavarapu; Vikas Pejaver; Trevor A Cohen; Sean D Mooney; Patrick J Heagerty; Jeffrey G Jarvik Journal: Acad Radiol Date: 2021-12-01 Impact factor: 3.173
Authors: Arlene Casey; Emma Davidson; Michael Poon; Hang Dong; Daniel Duma; Andreas Grivas; Claire Grover; Víctor Suárez-Paniagua; Richard Tobin; William Whiteley; Honghan Wu; Beatrice Alex Journal: BMC Med Inform Decis Mak Date: 2021-06-03 Impact factor: 2.796