PURPOSE: This study aimed to develop Natural Language Processing (NLP) approaches to supplement manual outcome validation, specifically to validate pneumonia cases from chest radiograph reports.

METHODS: We trained one NLP system, ONYX, using radiograph reports from children and adults that had previously been manually reviewed. We then assessed its validity on a test set of 5000 reports. Because we aimed to substantially decrease manual review rather than replace it entirely, we classified reports as follows: (1) consistent with pneumonia; (2) inconsistent with pneumonia; or (3) requiring manual review because of complex features. We developed processes tailored either to optimize accuracy or to minimize manual review. Using logistic regression, we jointly modeled the sensitivity and specificity of ONYX in relation to patient age, comorbidity, and care setting. We estimated positive and negative predictive values (PPV and NPV) assuming the pneumonia prevalence in the source data.

RESULTS: Tailored for accuracy, ONYX identified 25% of reports as requiring manual review (34% of true pneumonias and 18% of non-pneumonias). For the remainder, ONYX's sensitivity was 92% (95% CI 90-93%), specificity 87% (86-88%), PPV 74% (72-76%), and NPV 96% (96-97%). Tailored to minimize manual review, ONYX classified 12% of reports as needing manual review. For the remainder, ONYX had sensitivity 75% (72-77%), specificity 95% (94-96%), PPV 86% (83-88%), and NPV 91% (90-91%).

CONCLUSIONS: For pneumonia validation, ONYX can replace almost 90% of manual review while maintaining low to moderate misclassification rates. It can be tailored for different outcomes and study needs and thus warrants exploration in other settings.
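The step from sensitivity and specificity to PPV and NPV follows from Bayes' theorem once a prevalence is assumed. The sketch below illustrates that arithmetic only; it is not the authors' code. The function name `predictive_values` is hypothetical, and the prevalence of 0.30 is an illustrative assumption, since the abstract does not state the prevalence in the source data.

```python
import math


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary classifier via Bayes' theorem.

    All three arguments are proportions in [0, 1].
    """
    for value in (sensitivity, specificity, prevalence):
        if not 0.0 <= value <= 1.0:
            raise ValueError("arguments must be proportions in [0, 1]")

    # Expected share of all reports falling in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)

    # Guard against an empty row (no predicted positives or negatives).
    flagged_pos = true_pos + false_pos
    flagged_neg = true_neg + false_neg
    ppv = true_pos / flagged_pos if flagged_pos > 0 else math.nan
    npv = true_neg / flagged_neg if flagged_neg > 0 else math.nan
    return ppv, npv


if __name__ == "__main__":
    ASSUMED_PREVALENCE = 0.30  # hypothetical value, not taken from the study

    # Sensitivity and specificity as reported for the two configurations.
    settings = {
        "tailored for accuracy": (0.92, 0.87),
        "tailored to minimize manual review": (0.75, 0.95),
    }
    for label, (sens, spec) in settings.items():
        ppv, npv = predictive_values(sens, spec, ASSUMED_PREVALENCE)
        print(f"{label}: PPV={ppv:.3f}, NPV={npv:.3f}")
```

Because PPV and NPV depend on prevalence, the same sensitivity and specificity would yield different predictive values in a population where pneumonia is more or less common, which is why the prevalence assumption matters when applying these results elsewhere.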