Ross D Williams, Jenna M Reps, Jan A Kors, Patrick B Ryan, Ewout Steyerberg, Katia M Verhamme, Peter R Rijnbeek.
Abstract
INTRODUCTION: External validation of prediction models is increasingly seen as a minimum requirement for acceptance in clinical practice. However, the lack of interoperability between healthcare databases has been the biggest barrier to performing it on a large scale. Recent improvements in database interoperability enable a standardized analytical framework for model development and external validation. External validation of a model in a new database lacks context because there is no benchmark in that database against which its performance can be compared. Iterative pairwise external validation (IPEV) is a framework that uses a rotating model development and validation approach to contextualize the assessment of performance across a network of databases. As a use case, we predicted 1-year risk of heart failure in patients with type 2 diabetes mellitus.
Year: 2022 PMID: 35579818 PMCID: PMC9114056 DOI: 10.1007/s40264-022-01161-8
Source DB: PubMed Journal: Drug Saf ISSN: 0114-5916 Impact factor: 5.228
Fig. 1 Rotation of databases for model development and external validation in the iterative pairwise external validation method
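The rotation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `ipev`, `auc`, `fit`, and `predict`, the pre-split `train`/`test` layout, the choice to validate externally on the whole of the other database, and the toy one-feature "model" are all hypothetical. Each database takes a turn as the development set; its model is then scored on its own held-out data (internal validation, the diagonal) and on every other database (external validation).

```python
from itertools import product


def auc(scores, labels):
    """Rank-based AUC: chance that a random case outscores a random non-case."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Ties between a case and a non-case count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ipev(databases, fit, predict):
    """Return {(development_db, validation_db): AUC} for every ordered pair.

    Assumed layout: databases[name] = {"train": (X, y), "test": (X, y)}.
    """
    # One model per database, trained on that database's training split only.
    models = {name: fit(*db["train"]) for name, db in databases.items()}
    matrix = {}
    for dev, val in product(databases, repeat=2):
        if dev == val:
            # Internal validation: held-out split of the development database.
            X, y = databases[val]["test"]
        else:
            # External validation: all patients of the other database (assumption).
            (X_tr, y_tr), (X_te, y_te) = databases[val]["train"], databases[val]["test"]
            X, y = X_tr + X_te, y_tr + y_te
        matrix[(dev, val)] = auc(predict(models[dev], X), y)
    return matrix


# Toy stand-in for a real learner: a single feature (age) and a learned direction.
def fit(X, y):
    cases = [x for x, label in zip(X, y) if label == 1]
    controls = [x for x, label in zip(X, y) if label == 0]
    return 1 if sum(cases) / len(cases) >= sum(controls) / len(controls) else -1


def predict(direction, X):
    return [direction * x for x in X]


# Invented toy data, for illustration only.
toy = {
    "A": {"train": ([50, 60, 70, 80], [0, 0, 1, 1]), "test": ([55, 75], [0, 1])},
    "B": {"train": ([40, 45, 65, 90], [0, 1, 0, 1]), "test": ([50, 85], [0, 1])},
}
matrix = ipev(toy, fit, predict)
for (dev, val), value in matrix.items():
    kind = "internal" if dev == val else "external"
    print(f"developed on {dev}, validated on {val} ({kind}): AUC = {value:.2f}")
```

Reading a column of the resulting matrix gives the context the abstract refers to: the externally developed models' performance in a database can be set against the internal performance of the model developed in that same database.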
Database characteristics
| Database | Acronym | Country | Data type | Time period | Database size (million patients) |
|---|---|---|---|---|---|
| Optum® de-identified EHR dataset | Optum EHR | USA | EHR | 2006–2018 | 87 |
| IBM MarketScan® commercial database | CCAE | USA | Claims | 2000–2018 | 155 |
| IBM MarketScan® multi-state Medicaid database | MDCD | USA | Claims | 2006–2017 | 30 |
| IBM MarketScan® Medicare supplemental database | MDCR | USA | Claims | 2000–2018 | 10 |
| Optum® de-identified Clinformatics® data mart database | Optum Clinformatics | USA | Claims | 2000–2018 | 98 |
EHR electronic health record
Number of patients and internal validation performance per database
| Database | Patients with T2DM (n) | Patients with HF (n) | Incidence (%) | Age, years, mean ± SD | Female (%) | Full model AUC | Age and sex model AUC |
|---|---|---|---|---|---|---|---|
| CCAE | 112,989 | 1843 | 1.6 | 53 ± 8 | 46 | 0.78 | 0.64 |
| MDCD | 15,860 | 650 | 4.1 | 50 ± 12 | 64 | 0.77 | 0.65 |
| MDCR | 22,433 | 1658 | 7.4 | 73 ± 6 | 48 | 0.73 | 0.64 |
| Optum Clinformatics | 92,272 | 4332 | 4.7 | 63 ± 13 | 48 | 0.80 | 0.69 |
| Optum EHR | 159,633 | 3690 | 2.3 | 58 ± 12 | 49 | 0.81 | 0.71 |
AUC area under the receiver operating characteristic curve, CCAE Commercial Claims and Encounters, EHR electronic health record, HF heart failure, MDCD Medicaid, MDCR Medicare, SD standard deviation, T2DM type 2 diabetes mellitus
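The incidence column can be reproduced from the two count columns as 100 × (patients with HF) / (patients with T2DM). The short check below uses the counts from the table; the names `counts` and `incidence_percent` are illustrative only.

```python
# (patients with T2DM, patients with HF) per database, copied from the table.
counts = {
    "CCAE": (112_989, 1_843),
    "MDCD": (15_860, 650),
    "MDCR": (22_433, 1_658),
    "Optum Clinformatics": (92_272, 4_332),
    "Optum EHR": (159_633, 3_690),
}


def incidence_percent(n_t2dm, n_hf):
    """1-year HF incidence as a percentage of the T2DM cohort, to one decimal."""
    return round(100 * n_hf / n_t2dm, 1)


incidence = {db: incidence_percent(*pair) for db, pair in counts.items()}
for db, value in incidence.items():
    print(f"{db}: {value}%")
```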
Fig. 2 A heatmap of the area under the receiver operating characteristic curve values for internal validation (values on the leading diagonal) and external validation of the developed prediction models. The colour scale runs from red (low discriminative ability) to green (high discriminative ability). The upper half shows the performance of the data-driven model. The lower half shows the same for the age and sex model. AUC area under the receiver operating characteristic curve, ccae Commercial Claims and Encounters, EHR electronic health records, mdcd Medicaid, mdcr Medicare
Fig. 3 Internal and external calibration of the Optum EHR, Optum Clinformatics, and CCAE trained models. CCAE Commercial Claims and Encounters, EHR electronic health records, mdcd Medicaid, mdcr Medicare
| External validation lacks context, which inhibits understanding of model performance. |
| Iterative pairwise external validation provides contextualised model performance across databases and across model complexity. |