OBJECTIVE: The quality of colonoscopy procedures for colorectal cancer screening is often inadequate and varies widely among physicians. Routine measurement of quality is limited by the cost of manually reviewing free-text patient charts. Our goal was to develop a natural language processing (NLP) application to measure colonoscopy quality.

MATERIALS AND METHODS: Using a set of quality measures published by physician specialty societies, we implemented an NLP engine that extracts 21 variables for 19 quality measures from free-text colonoscopy and pathology reports. We evaluated the NLP engine on a test set of 453 colonoscopy reports and 226 pathology reports, considering both its accuracy in extracting the values of the target variables from text and the reliability of the quality measure outcomes computed from the NLP-extracted information.

RESULTS: The average accuracy of the NLP engine over all variables was 0.89 (range: 0.62-1.0) and the average F measure over all variables was 0.74 (range: 0.49-0.89). The average agreement score, measured as Cohen's κ, between the manually established and NLP-derived outcomes of the quality measures was 0.62 (range: 0.09-0.86).

DISCUSSION: For nine of the 19 colonoscopy quality measures, the agreement score was 0.70 or above, which we consider sufficient for the NLP-derived outcomes of these measures to be practically useful for quality measurement.

CONCLUSION: The use of NLP for information extraction from free-text colonoscopy and pathology reports creates opportunities for large-scale, routine quality measurement, which can support quality improvement in colonoscopy care.
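The three evaluation statistics named in the abstract (accuracy, F measure, and Cohen's κ) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the binary labels, and the toy data below are assumptions made purely for illustration, and the abstract does not state how precision and recall were weighted in the F measure, so the balanced (F1) form is assumed here.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of items where the extracted value matches the reference."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def f_measure(gold, pred, positive=1):
    """Balanced F measure (F1) for one target label (assumed weighting)."""
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement from the two raters' marginal label frequencies.
    expected = sum(counts_a[k] * counts_b[k] for k in set(a) | set(b)) / n**2
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: 1 = quality measure met, 0 = not met.
manual = [1, 1, 1, 0, 0, 0, 1, 0]   # manually established outcomes
derived = [1, 1, 0, 0, 0, 1, 1, 0]  # NLP-derived outcomes

print(accuracy(manual, derived))
print(f_measure(manual, derived))
print(cohen_kappa(manual, derived))
```

Because κ discounts agreement expected by chance, it can be well below raw accuracy for the same pair of label sequences, which is one reason agreement on measure outcomes is reported separately from extraction accuracy.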
Authors: Leonard W D'Avolio; Thien M Nguyen; Wildon R Farwell; Yongming Chen; Felicia Fitzmeyer; Owen M Harris; Louis D Fiore Journal: J Am Med Inform Assoc Date: 2010 Jul-Aug Impact factor: 4.497
Authors: Kathryn A Phillips; Su-Ying Liang; Uri Ladabaum; Jennifer Haas; Karla Kerlikowske; David Lieberman; Robert Hiatt; Mika Nagamine; Stephanie L Van Bebber Journal: Med Care Date: 2007-02 Impact factor: 2.983
Authors: Joshua C Denny; Anderson Spickard; Kevin B Johnson; Neeraja B Peterson; Josh F Peterson; Randolph A Miller Journal: J Am Med Inform Assoc Date: 2009-08-28 Impact factor: 4.497
Authors: Serguei Pakhomov; Susan A Weston; Steven J Jacobsen; Christopher G Chute; Ryan Meverden; Véronique L Roger Journal: Am J Manag Care Date: 2007-06 Impact factor: 2.229
Authors: David W Baker; Stephen D Persell; Jason A Thompson; Neilesh S Soman; Karen M Burgner; David Liss; Karen S Kmetik Journal: Ann Intern Med Date: 2007-02-20 Impact factor: 25.391
Authors: Ronilda Lacson; Kimberly Harris; Phyllis Brawarsky; Tor D Tosteson; Tracy Onega; Anna N A Tosteson; Abby Kaye; Irina Gonzalez; Robyn Birdwell; Jennifer S Haas Journal: J Digit Imaging Date: 2015-10 Impact factor: 4.056
Authors: John Heintzman; Steffani R Bailey; Megan J Hoopes; Thuy Le; Rachel Gold; Jean P O'Malley; Stuart Cowburn; Miguel Marino; Alex Krist; Jennifer E DeVoe Journal: J Am Med Inform Assoc Date: 2014-02-07 Impact factor: 4.497
Authors: Glenn T Gobbel; Jennifer Garvin; Ruth Reeves; Robert M Cronin; Julia Heavirland; Jenifer Williams; Allison Weaver; Shrimalini Jayaramaraja; Dario Giuse; Theodore Speroff; Steven H Brown; Hua Xu; Michael E Matheny Journal: J Am Med Inform Assoc Date: 2014-01-15 Impact factor: 4.497
Authors: Ateev Mehrotra; Michele Morris; Rebecca A Gourevitch; David S Carrell; Daniel A Leffler; Sherri Rose; Julia B Greer; Seth D Crockett; Andrew Baer; Robert E Schoen Journal: Gastrointest Endosc Date: 2017-09-01 Impact factor: 9.427
Authors: Felippe O Marcondes; Katie M Dean; Robert E Schoen; Daniel A Leffler; Sherri Rose; Michele Morris; Ateev Mehrotra Journal: Gastrointest Endosc Date: 2015-10 Impact factor: 9.427