Joel N Swerdel (1), George Hripcsak (2), Patrick B Ryan (3)
1. Janssen Research & Development, 920 Route 202, Raritan, NJ 08869, USA; OHDSI Collaborators, Observational Health Data Sciences and Informatics (OHDSI), 622 West 168th Street, PH-20, New York, NY 10032, USA. Electronic address: jswerdel@its.jns.com.
2. OHDSI Collaborators, Observational Health Data Sciences and Informatics (OHDSI), 622 West 168th Street, PH-20, New York, NY 10032, USA; Columbia University, 622 West 168th Street, PH-20, New York, NY 10032, USA.
3. Janssen Research & Development, 920 Route 202, Raritan, NJ 08869, USA; OHDSI Collaborators, Observational Health Data Sciences and Informatics (OHDSI), 622 West 168th Street, PH-20, New York, NY 10032, USA; Columbia University, 622 West 168th Street, PH-20, New York, NY 10032, USA.
Abstract
BACKGROUND: The primary approach for defining disease in observational healthcare databases is to construct phenotype algorithms (PAs), rule-based heuristics predicated on the presence, absence, and temporal logic of clinical observations. However, a complete evaluation of PAs, i.e., determining sensitivity, specificity, and positive predictive value (PPV), is rarely performed. In this study, we propose a tool (PheValuator) to efficiently estimate all three performance characteristics for a PA.
METHODS: We used 4 administrative claims datasets: OptumInsight's de-identified Clinformatics™ Datamart (Eden Prairie, MN); IBM MarketScan Multi-State Medicaid; IBM MarketScan Medicare Supplemental Beneficiaries; and IBM MarketScan Commercial Claims and Encounters from 2000 to 2017. Using PheValuator involves (1) creating a diagnostic predictive model for the phenotype, (2) applying the model to a large set of randomly selected subjects, and (3) comparing each subject's predicted probability for the phenotype to the subject's inclusion in or exclusion from each PA. We used the predictions as a 'probabilistic gold standard' measure to classify positive/negative cases. We examined 4 phenotypes: myocardial infarction, cerebral infarction, chronic kidney disease, and atrial fibrillation. We examined several PAs for each phenotype, including 1-time (1X) occurrence of the diagnosis code in the subject's record and 1-time occurrence of the diagnosis in an inpatient setting with the diagnosis code as the primary reason for admission (1X-IP-1stPos).
RESULTS: Across phenotypes, the 1X PA showed the highest sensitivity and lowest PPV among all PAs, whereas 1X-IP-1stPos yielded the highest PPV and lowest sensitivity. Specificity was very high across algorithms. Results for each algorithm were similar across datasets.
CONCLUSION: PheValuator shows promise as a tool for estimating PA performance characteristics.
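Step (3) of the method can be illustrated with a minimal sketch. It assumes one plausible reading of the 'probabilistic gold standard': each subject's predicted probability is treated as a fractional case, so a subject included by the PA with probability p adds p to the true positives and 1 - p to the false positives, and an excluded subject adds p to the false negatives and 1 - p to the true negatives. The abstract does not spell out this accounting, and the names `Subject` and `evaluate_pa` are hypothetical, not part of the PheValuator package.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Subject:
    # Hypothetical record: the model's predicted probability that the subject
    # has the phenotype, and whether the phenotype algorithm (PA) includes them.
    predicted_probability: float
    in_phenotype_algorithm: bool


def _ratio(numerator: float, denominator: float) -> float:
    # Return NaN instead of raising when a metric is undefined.
    return numerator / denominator if denominator > 0 else float("nan")


def evaluate_pa(subjects: Iterable[Subject]) -> Dict[str, float]:
    """Estimate PA performance from fractional (expected) confusion-matrix counts.

    Assumption: each predicted probability is split between the 'case' and
    'non-case' cells rather than thresholded into a hard label.
    """
    tp = fp = fn = tn = 0.0
    for s in subjects:
        p = s.predicted_probability
        if s.in_phenotype_algorithm:
            tp += p          # expected true positives
            fp += 1.0 - p    # expected false positives
        else:
            fn += p          # expected false negatives
            tn += 1.0 - p    # expected true negatives
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
    }


if __name__ == "__main__":
    # Made-up probabilities for illustration only; not data from the study.
    demo = [
        Subject(0.9, True),
        Subject(0.2, True),
        Subject(0.6, False),
        Subject(0.1, False),
    ]
    print(evaluate_pa(demo))
```

Under this accounting, a broad PA such as 1X pulls in more low-probability subjects, which raises sensitivity but adds to the expected false positives and so lowers PPV. This is the same trade-off the RESULTS describe.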