Jeffrey K Lee (1), Christopher D Jensen (2), Theodore R Levin (2), Ann G Zauber (3), Chyke A Doubeni (4), Wei K Zhao (2), Douglas A Corley (2).
(1) Department of Medicine, Division of Gastroenterology, University of California San Francisco, San Francisco.
(2) Division of Research, Kaiser Permanente Northern California, Oakland, CA.
(3) Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY.
(4) Department of Family Medicine, University of Pennsylvania, Philadelphia, PA.
Abstract
OBJECTIVES: The aim of this study was to test the ability of a commercially available natural language processing (NLP) tool to accurately extract examination quality-related and large polyp information from colonoscopy reports with varying report formats.

BACKGROUND: Colonoscopy quality reporting often requires manual data abstraction. NLP is another option for extracting information; however, limited data exist on its ability to accurately extract examination quality and polyp findings from unstructured text in colonoscopy reports with different reporting formats.

STUDY DESIGN: NLP strategies were developed using 500 colonoscopy reports from Kaiser Permanente Northern California and then tested using 300 separate colonoscopy reports that underwent manual chart review. Using findings from manual review as the reference standard, we evaluated the NLP tool's sensitivity, specificity, positive predictive value (PPV), and accuracy for identifying colonoscopy examination indication, cecal intubation, bowel preparation adequacy, and polyps ≥10 mm.

RESULTS: The NLP tool was highly accurate in identifying examination quality-related variables from colonoscopy reports. Compared with manual review, sensitivity for screening indication was 100% (95% confidence interval: 95.3%-100%), PPV was 90.6% (82.3%-95.8%), and accuracy was 98.2% (97.0%-99.4%). For cecal intubation, sensitivity was 99.6% (98.0%-100%), PPV was 100% (98.5%-100%), and accuracy was 99.8% (99.5%-100%). For bowel preparation adequacy, sensitivity was 100% (98.5%-100%), PPV was 100% (98.5%-100%), and accuracy was 100% (100%-100%). For polyp(s) ≥10 mm, sensitivity was 90.5% (69.6%-98.8%), PPV was 100% (82.4%-100%), and accuracy was 95.2% (88.8%-100%).

CONCLUSION: NLP yielded a high degree of accuracy for identifying examination quality-related and large polyp information from diverse types of colonoscopy reports.
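
The evaluation described in the study design (NLP output compared against manual review as the reference standard) can be illustrated with a minimal sketch. This is not the authors' code or the commercial tool's interface. The function names (`tabulate`, `metrics`, `wilson_interval`) and the toy labels are hypothetical, and the Wilson score interval is an assumption chosen for illustration, since the abstract does not state which method produced the reported 95% confidence intervals.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Confusion:
    tp: int  # NLP positive, manual review positive
    fp: int  # NLP positive, manual review negative
    fn: int  # NLP negative, manual review positive
    tn: int  # NLP negative, manual review negative


def tabulate(nlp_flags, manual_flags):
    """Cross-tabulate NLP output against manual review (the reference)."""
    if len(nlp_flags) != len(manual_flags):
        raise ValueError("both label lists must cover the same reports")
    tp = fp = fn = tn = 0
    for nlp, ref in zip(nlp_flags, manual_flags):
        if nlp and ref:
            tp += 1
        elif nlp and not ref:
            fp += 1
        elif ref:
            fn += 1
        else:
            tn += 1
    return Confusion(tp, fp, fn, tn)


def _ratio(num, den):
    # An undefined proportion (zero denominator) is reported as NaN.
    return num / den if den else float("nan")


def metrics(c):
    """Sensitivity, specificity, PPV, and accuracy from the 2x2 counts."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "accuracy": _ratio(c.tp + c.tn, total),
    }


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion.

    Assumption: used here for illustration only; the study's interval
    method is not given in the abstract.
    """
    if n == 0:
        return (float("nan"), float("nan"))
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


# Toy labels for one variable (e.g. "polyp >= 10 mm"); NOT study data.
nlp_flags = [True, True, False, False, True, False]
manual_flags = [True, False, False, False, True, True]

counts = tabulate(nlp_flags, manual_flags)
results = metrics(counts)
sens_ci = wilson_interval(counts.tp, counts.tp + counts.fn)
```

In a setup like the one described, each extracted variable (examination indication, cecal intubation, bowel preparation adequacy, polyps ≥10 mm) would be tabulated separately against its manually reviewed value, giving one set of metrics per variable.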