PURPOSE: Claims databases offer large populations for research but lack clinical details. We aimed to develop predictive models to identify estrogen receptor positive (ER+) and human epidermal growth factor receptor 2 negative (HER2-) early stage breast cancer (ESBC) and advanced stage breast cancer (ASBC) in a claims database.
METHODS: Female breast cancer cases in Anthem's Cancer Care Quality Program served as the gold standard validation sample. Predictive models were developed from clinical knowledge and empirically from claims data using logistic and lasso regression. Model performance was assessed by classification rates and c-statistics. Models were applied to the HealthCore Integrated Research Database (claims) to identify cohorts of women with ER+/HER2- ESBC and ASBC.
RESULTS: The validation sample included 3184 women with ER+/HER2- ESBC and 1436 with ER+/HER2- ASBC. Predictive models for ER+/HER2- ESBC and ASBC included 25 and 20 factors, respectively. Models had robust discrimination in identifying cases (c-statistic = 0.92 for ESBC and 0.95 for ASBC). Compared with a traditional a priori algorithm developed with clinical insight alone, the ER+/HER2- ASBC predictive model had better positive predictive value (PPV) (0.91; 95% CI, 0.90-0.93 vs 0.69; 95% CI, 0.66-0.73) and sensitivity (0.54 vs 0.35). Models were applied to the claims database to identify cohorts of 33 001 and 3198 women with ER+/HER2- ESBC and ASBC, respectively.
CONCLUSION: We conducted a validation study and developed predictive models to identify cohorts of women with ER+/HER2- ESBC and ASBC in a claims database. The models identified large cohorts in the claims data that can be used to characterize indications in the evaluation of targeted therapies.
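The performance measures named above (PPV, sensitivity, and the c-statistic) can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 0.5 classification threshold, and the eight-record toy data set are all hypothetical, chosen only to show how each measure is computed when a model's predicted probabilities are compared against gold standard labels.

```python
# Illustrative sketch only: synthetic data, hypothetical names and threshold.

def classification_rates(y_true, y_pred):
    """Return (PPV, sensitivity) for binary labels coded 1 = case, 0 = non-case."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return ppv, sensitivity


def c_statistic(y_true, scores):
    """Share of (case, non-case) pairs in which the case has the higher score.

    Tied scores count as half a concordant pair. This equals the area under
    the ROC curve.
    """
    cases = [s for t, s in zip(y_true, scores) if t == 1]
    non_cases = [s for t, s in zip(y_true, scores) if t == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                concordant += 1.0
            elif c == n:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Synthetic gold standard labels and model-predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

threshold = 0.5  # assumed cut-off for classifying a record as a case
y_pred = [1 if s >= threshold else 0 for s in scores]

ppv, sensitivity = classification_rates(y_true, y_pred)
print(f"PPV = {ppv:.2f}, sensitivity = {sensitivity:.2f}")
print(f"c-statistic = {c_statistic(y_true, scores):.4f}")
```

Raising the threshold in a sketch like this generally trades sensitivity for PPV, which is the kind of trade-off the reported classification rates summarize.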