Yvonne L Eaglehouse (1,2), Amie B Park (1,2), Matthew W Georg (1,2), Derek W Brown (1,3), Jie Lin (1,2), Stephanie Shao (1,4), Julie A Bytnar (1,2), Craig D Shriver (1,5), Kangmin Zhu (1,2,6).
1. Murtha Cancer Center/Research Program, Uniformed Services University of the Health Sciences and Walter Reed National Military Medical Center, Bethesda, MD.
2. Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD.
3. Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD.
4. General Dynamics Information Technology Federal Health, Rockville, MD.
5. Department of Surgery, F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD.
6. Department of Preventive Medicine and Biostatistics, F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD.
Abstract
PURPOSE: Linked cancer registry and medical claims data have increased the capacity for cancer research. However, few efforts have described methods to select information between data sources, which may affect data use. We developed a systematic process to evaluate and consolidate cancer diagnosis and treatment information between the Department of Defense Central Cancer Registry (CCR) and the Military Health System Data Repository (MDR) administrative claims database, which are linked in the Military Cancer Epidemiology Data System (MilCanEpi).
METHODS: MilCanEpi contains information on cancer diagnosis and treatment of patients receiving care from 1998 to 2014. We used an iterative process guided by knowledge of data features, current literature, and logical comparisons between the CCR and MDR data to evaluate and consolidate cancer diagnosis and treatment received (yes or no) and their dates. We applied the processes to breast cancer data as an example. Agreement between diagnosis and treatment dates in the two data sources was evaluated using Cohen's κ with 95% CIs.
RESULTS: In MilCanEpi, we identified 15,965 patients with a breast cancer diagnosis and 15,145 patients who underwent breast cancer surgery; 97.9% and 84.1% of patients had records in both CCR and MDR for diagnosis and surgery, respectively. Exact agreement between the two data sources was 13.7% for diagnosis dates (Cohen's κ = 0.14; 95% CI, 0.13 to 0.14) and 68.9% for surgery dates (Cohen's κ = 0.69; 95% CI, 0.68 to 0.70). After applying systematic processes, 98.1% of patients with a breast cancer diagnosis and 99.7% of patients with surgery had information selected for analytic data sets.
CONCLUSION: The developed processes resulted in high consolidation rates of breast cancer data in MilCanEpi and may serve as a data selection template for other tumor sites and linked data sources.
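The agreement measures named in the methods (exact agreement and Cohen's κ) can be illustrated with a minimal sketch. It assumes each patient has one date in each source and treats every distinct date as its own category, which is one plausible reading of "exact agreement" and not necessarily the authors' implementation. The function names and the toy dates are hypothetical, and the 95% CIs reported in the abstract are not computed here.

```python
from collections import Counter
from datetime import date


def exact_agreement(source_a, source_b):
    """Proportion of patients whose two dates are identical."""
    if len(source_a) != len(source_b) or not source_a:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(source_a, source_b))
    return matches / len(source_a)


def cohens_kappa(source_a, source_b):
    """Cohen's kappa, treating each distinct date as a nominal category."""
    n = len(source_a)
    p_observed = exact_agreement(source_a, source_b)
    counts_a = Counter(source_a)
    counts_b = Counter(source_b)
    # Chance agreement: sum over categories of the product of the
    # marginal proportions in the two sources.
    p_expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both sources hold the same single value everywhere
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy data (hypothetical): surgery dates for four patients in each source.
ccr_dates = [date(2005, 3, 1), date(2006, 7, 15), date(2008, 1, 9), date(2010, 11, 2)]
mdr_dates = [date(2005, 3, 1), date(2006, 7, 15), date(2008, 1, 12), date(2010, 11, 2)]

agreement = exact_agreement(ccr_dates, mdr_dates)
kappa = cohens_kappa(ccr_dates, mdr_dates)
print(f"exact agreement = {agreement:.3f}, kappa = {kappa:.3f}")
```

Because dates take many distinct values, chance agreement is small and κ stays close to the raw proportion of exact matches, which is consistent with the near-identical percentages and κ values reported in the results.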