Gaurav Trivedi1, Esmaeel R Dadashzadeh2, Robert M Handzel3, Wendy W Chapman4, Shyam Visweswaran1,5, Harry Hochheiser1,5. 1. Intelligent Systems Program, University of Pittsburgh, Pittsburgh, Pennsylvania, United States. 2. Department of Surgery and Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States. 3. Department of Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States. 4. Department of Biomedical Informatics, University of Utah, Salt Lake City, Utah, United States. 5. Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States.
Abstract
BACKGROUND: Despite advances in natural language processing (NLP), extracting information from clinical text remains expensive. Interactive tools that ease the construction, review, and revision of NLP models can reduce this cost and improve the utility of clinical reports for clinical and secondary use. OBJECTIVES: We present the design and implementation of an interactive NLP tool for identifying incidental findings in radiology reports, along with a user study evaluating the performance and usability of the tool. METHODS: Expert reviewers provided gold standard annotations for 130 patient encounters (694 reports) at the sentence, section, and report levels. We performed a user study with 15 physicians to evaluate the accuracy and usability of our tool. Participants reviewed encounters split into intervention (with predictions) and control (no predictions) conditions. We measured changes in model performance, the time spent, and the number of user actions needed. The System Usability Scale (SUS) and an open-ended questionnaire were used to assess usability. RESULTS: Starting from bootstrapped models trained on 6 patient encounters, we observed an average increase in F1 score on a held-out test data set from 0.31 to 0.75 for reports, from 0.32 to 0.68 for sections, and from 0.22 to 0.60 for sentences over an hour-long study session. We found that the tool significantly reduced the time spent reviewing encounters (134.30 vs. 148.44 seconds in the intervention and control conditions, respectively), while maintaining the overall quality of labels as measured against the gold standard. The tool was well received by the study participants, with a very good overall SUS score of 78.67. CONCLUSION: The user study demonstrated successful use of the tool by physicians for identifying incidental findings. These results support the viability of adopting interactive NLP tools in clinical care settings for a wider range of clinical applications.
Georg Thieme Verlag KG Stuttgart · New York.