Authors: M Iorga (1,2), M Drakopoulos (3), A M Naidech (4), A K Katsaggelos (2,5,6), T B Parrish (3,2), V B Hill (3)
1. From the Departments of Radiology (M.I., M.D., T.B.P., V.B.H.); michael.iorga@northwestern.edu.
2. Departments of Biomedical Engineering (M.I., A.K.K., T.B.P.).
3. From the Departments of Radiology (M.I., M.D., T.B.P., V.B.H.).
4. Neurology (A.M.N.), Northwestern University Feinberg School of Medicine, Chicago, Illinois.
5. Electrical and Computer Engineering (A.K.K.).
6. Computer Science (A.K.K.), Northwestern University, Chicago, Illinois.
Abstract
BACKGROUND AND PURPOSE: Prioritizing reading of noncontrast head CT examinations through an automated triage system may improve time to care for patients with acute neuroradiologic findings. We present a natural language-processing approach for labeling findings in noncontrast head CT reports, which permits creation of a large, labeled dataset of head CT images for development of emergent-finding detection and reading-prioritization algorithms.
MATERIALS AND METHODS: In this retrospective study, 1002 clinical radiology reports from noncontrast head CTs collected between 2008 and 2013 were manually labeled across 12 common neuroradiologic finding categories. Each report was then encoded using an n-gram model of unigrams, bigrams, and trigrams. A logistic regression model was then trained to label each report for every common finding. Models were trained and assessed using a combination of L2 regularization and 5-fold cross-validation.
RESULTS: Model performance was strongest for the fracture, hemorrhage, herniation, mass effect, pneumocephalus, postoperative status, and volume loss models, in which the area under the receiver operating characteristic curve exceeded 0.95. Performance was relatively weaker for the edema, hydrocephalus, infarct, tumor, and white-matter disease models (area under the receiver operating characteristic curve > 0.85). Analysis of coefficients revealed finding-specific words among the top coefficients in each model. Class output probabilities were found to be a useful indicator of predictive error on individual report examples in higher-performing models.
CONCLUSIONS: Combining logistic regression with n-gram encoding is a robust approach to labeling common findings in noncontrast head CT reports.
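The pipeline described in the abstract (n-gram encoding, L2-regularized logistic regression, 5-fold cross-validation, and inspection of coefficients and output probabilities) can be illustrated with a minimal sketch for a single finding. This is not the authors' implementation. The toy reports are invented for illustration, and the whitespace tokenizer, raw-count features, gradient-descent settings (`l2`, `lr`, `epochs`), pooled out-of-fold AUC, and the 0.2 review threshold are all assumptions made here.

```python
import math
import random
from collections import Counter

# Invented toy reports (NOT from the study); label 1 = hemorrhage present.
REPORTS = [
    ("acute subdural hemorrhage along the left convexity", 1),
    ("small focus of subarachnoid hemorrhage in the right sylvian fissure", 1),
    ("intraparenchymal hemorrhage in the left basal ganglia", 1),
    ("acute hemorrhage layering in the occipital horns", 1),
    ("large right frontal hemorrhage with surrounding edema", 1),
    ("no acute intracranial hemorrhage or mass effect", 0),
    ("no evidence of acute hemorrhage", 0),
    ("normal noncontrast head ct", 0),
    ("mild volume loss without acute abnormality", 0),
    ("chronic white matter disease unchanged", 0),
]

def ngrams(text, n_max=3):
    """Count unigrams, bigrams, and trigrams in one report."""
    tokens = text.lower().split()
    grams = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams[" ".join(tokens[i:i + n])] += 1
    return grams

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(weights, bias, feats):
    """Probability that the finding is present for one encoded report."""
    return sigmoid(bias + sum(weights.get(g, 0.0) * c for g, c in feats.items()))

def train_logreg(X, y, l2=0.01, lr=0.2, epochs=300):
    """Batch gradient descent on the L2-penalized logistic loss."""
    weights, bias, n = {}, 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = Counter(), 0.0
        for feats, label in zip(X, y):
            err = predict_proba(weights, bias, feats) - label
            grad_b += err
            for g, c in feats.items():
                grad_w[g] += err * c
        for g in set(grad_w) | set(weights):
            w = weights.get(g, 0.0)
            weights[g] = w - lr * (grad_w[g] / n + l2 * w)  # l2 * w is the penalty gradient
        bias -= lr * grad_b / n  # the bias is left unpenalized
    return weights, bias

def auc(scores, labels):
    """Fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def cross_val_scores(X, y, k=5, seed=0):
    """Out-of-fold probabilities: each report is scored by a model that never saw it."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    scores = [0.0] * len(X)
    for fold in range(k):
        held_out = set(idx[fold::k])
        X_train = [X[i] for i in idx if i not in held_out]
        y_train = [y[i] for i in idx if i not in held_out]
        w, b = train_logreg(X_train, y_train)
        for i in held_out:
            scores[i] = predict_proba(w, b, X[i])
    return scores

X = [ngrams(text) for text, _ in REPORTS]
y = [label for _, label in REPORTS]

oof = cross_val_scores(X, y, k=5)
print("pooled 5-fold AUC:", round(auc(oof, y), 3))

weights, bias = train_logreg(X, y)
top = sorted(weights, key=weights.get, reverse=True)[:5]
print("n-grams with the largest positive coefficients:", top)

# Probabilities near 0.5 mark the reports whose predicted label is least certain.
uncertain = [text for (text, _), p in zip(REPORTS, oof) if abs(p - 0.5) < 0.2]
print("candidates for manual review:", uncertain)
```

In a setting like the one in the abstract, one such binary model would be fitted per finding category. With only ten toy reports the printed AUC says nothing about real performance; the sketch only shows how the pieces connect.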