W Katherine Tan1, Saeed Hassanpour2, Patrick J Heagerty1, Sean D Rundell3, Pradeep Suri4, Hannu T Huhdanpaa5, Kathryn James6, David S Carrell7, Curtis P Langlotz8, Nancy L Organ1, Eric N Meier1, Karen J Sherman7, David F Kallmes9, Patrick H Luetmer9, Brent Griffith10, David R Nerenz11, Jeffrey G Jarvik12. 1. Department of Biostatistics, University of Washington, Seattle, Washington; Center for Biomedical Statistics, University of Washington, Seattle, Washington. 2. Department of Biomedical Data Science, Dartmouth College, Hanover, New Hampshire. 3. Department of Health Services, University of Washington, Box 357660, Seattle WA 98195-7660; Department of Rehabilitation Medicine, University of Washington, Seattle, Washington; Comparative Effectiveness, Cost and Outcomes Research Center, University of Washington, 4333 Brooklyn Ave NE, Seattle, WA 98105. 4. Department of Rehabilitation Medicine, University of Washington, Seattle, Washington; Division of Rehabilitation Care Services, Seattle Epidemiologic Research and Information Center, VA Puget Sound Health Care System, Seattle, Washington; Comparative Effectiveness, Cost and Outcomes Research Center, University of Washington, 4333 Brooklyn Ave NE, Seattle, WA 98105. 5. Radia, Inc. 19020, Lynwood, Washington. 6. Department of Radiology, University of Washington, 1959 NE Pacific Street, Seattle WA 98195; Comparative Effectiveness, Cost and Outcomes Research Center, University of Washington, 4333 Brooklyn Ave NE, Seattle, WA 98105. 7. Kaiser Permanente Washington Health Research Institute, Seattle, Washington. 8. Department of Radiology, Stanford University, Palo Alto, California. 9. Department of Radiology, Mayo Clinic, Rochester, Minnesota. 10. Department of Radiology, Henry Ford Hospital, Detroit, Michigan. 11. Neuroscience Institute, Henry Ford Hospital, Detroit, Michigan. 12. Department of Health Services, University of Washington, Box 357660, Seattle WA 98195-7660; Department of Radiology, University of Washington, 1959 NE Pacific Street, Seattle WA 98195; Department of Neurological Surgery, University of Washington, 1959 NE Pacific Street, Seattle, WA 98195; Comparative Effectiveness, Cost and Outcomes Research Center, University of Washington, 4333 Brooklyn Ave NE, Seattle, WA 98105. Electronic address: jarvikj@uw.edu.
Abstract
RATIONALE AND OBJECTIVES: To evaluate a natural language processing (NLP) system built with open-source tools for identifying lumbar spine imaging findings related to low back pain in magnetic resonance (MR) and x-ray radiology reports from four health systems. MATERIALS AND METHODS: We used a limited data set (de-identified except for dates) sampled from lumbar spine imaging reports of a prospectively assembled cohort of adults. From N = 178,333 reports, we randomly selected N = 871 to form a reference-standard dataset, consisting of N = 413 x-ray reports and N = 458 MR reports. Using standardized criteria, four spine experts annotated the presence of 26 findings; 71 reports were annotated by all four experts and 800 were each annotated by two experts. We calculated inter-rater agreement and finding prevalence from the annotated data. We randomly split the annotated data into development (80%) and testing (20%) sets. We developed an NLP system using both rule-based and machine-learned models. We validated the system using accuracy metrics such as sensitivity, specificity, and area under the receiver operating characteristic curve (AUC). RESULTS: The multirater annotated dataset achieved inter-rater agreement of Cohen's kappa > 0.60 (substantial agreement) for 25 of 26 findings, with finding prevalence ranging from 3% to 89%. In the testing sample, rule-based and machine-learned predictions had comparable average specificity (0.97 and 0.95, respectively). The machine-learned approach had a higher average sensitivity (0.94, compared to 0.83 for rule-based) and a higher overall AUC (0.98, compared to 0.90 for rule-based). CONCLUSIONS: Our NLP system performed well in identifying the 26 lumbar spine findings, as benchmarked against reference-standard annotation by medical experts. Machine-learned models provided substantial gains in sensitivity with a slight loss of specificity, and a higher overall AUC.
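The metrics named in the abstract (Cohen's kappa for inter-rater agreement, and sensitivity, specificity, and AUC for validation) can be illustrated with a minimal Python sketch for a single binary finding. This is not the authors' implementation: the function names and the toy labels and scores are hypothetical, the kappa shown is the plain two-rater form, and the AUC uses the pairwise (Mann-Whitney) formulation, which may differ from how the study computed and averaged its metrics across the 26 findings.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Sensitivity and specificity for binary labels (1 = finding present).

    Assumes at least one positive and one negative in y_true.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def cohens_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Two-rater Cohen's kappa for binary annotations.

    Assumes chance agreement is below 1 (the raters are not both constant).
    """
    n = len(rater_a)
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    p_a = sum(rater_a) / n  # proportion rated positive by rater A
    p_b = sum(rater_b) / n  # proportion rated positive by rater B
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # agreement expected by chance
    return (observed - expected) / (1 - expected)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data for one finding (not taken from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
rater_a = [1, 1, 0, 0, 1, 0, 1, 0]
rater_b = [1, 1, 0, 0, 1, 0, 0, 0]

sens, spec = sensitivity_specificity(y_true, y_pred)
kappa = cohens_kappa(rater_a, rater_b)
area = auc(y_true, scores)
print(sens, spec, kappa, area)
```

In a setting like the one described, these quantities would be computed per finding on the held-out testing set and then summarized; the sketch leaves that aggregation step out.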