
Data Matching to Support Analysis of Cancer Epidemiology Among Veterans Compared With Non-Veteran Populations-An Exemplar in Brain Tumors.

Christine Woo, Gino N Cioffi, Taissa A Bej, Brigid Wilson, Janet M Briggs, Sarah C Markt, Fredrick R Schumacher, Carol Kruchko, Kristin A Waite, L Burt Nabors, Charles J Nock, Robin L P Jump, Jill S Barnholtz-Sloan.

Abstract

PURPOSE: State and national cancer registries do not systematically include Veteran data, which hinders analysis of the diagnosis patterns, treatment trajectories, and clinical outcomes of Veterans compared with non-Veteran populations. This study used data matching approaches to compare cases included in the Oncology Domain of the Veterans Affairs (VA) Corporate Data Warehouse and the Ohio Cancer Incidence Surveillance System, using brain tumors as an exemplar.
METHODS: We used direct data matching, on the basis of protected health information (PHI) common to both databases, to compare primary brain tumors from Veterans and non-Veterans diagnosed from 2000 to 2016. Working with this matched data set, we used six data elements that did not contain PHI, to assess the feasibility of using deterministic data matching to compare Veterans and non-Veterans.
RESULTS: Between 2000 and 2016, 223 Veterans from Ohio had a primary brain tumor; of those, 55 (25%) were not included in the Ohio Cancer Incidence Surveillance System. Direct data matching showed that Veterans experienced a greater proportion of glioblastomas (41%) compared with non-Veterans (21%). Sex did not account for this difference. Deterministic data matching within the matched data set found that 75% (126 of 168) of Veterans had exact matches for at least five of six non-PHI variables common to both databases.
CONCLUSION: This study indicated that using direct and deterministic data matching approaches to compare brain tumors in Veterans and in non-Veterans is feasible. This approach has the potential to promote comparisons of the distribution of tumors, the impact of chemical and environmental exposures, treatment trajectories, and clinical outcomes among Veteran and non-Veteran populations with brain tumors as well as other cancers and rare diseases.

Year:  2021        PMID: 34554825      PMCID: PMC8807020          DOI: 10.1200/CCI.21.00052

Source DB:  PubMed          Journal:  JCO Clin Cancer Inform        ISSN: 2473-4276


INTRODUCTION

In the United States, 18.2 million Veterans account for 6.9% of the total population, with about half of Veterans seeking care through the Veterans Health Administration (VHA).[1,2] Exposure to chemical and environmental hazards during military service may increase the risk of several diseases, including an array of cancers.[3] Data describing cancer among Veterans receiving care at the VHA (henceforward, Veterans) are reported to the Veterans Affairs Central Cancer Registry (VACCR), but following a reporting policy dating to 2007, these data are no longer systematically included in state and national central cancer registries (CCRs) because of differences in state-level data use agreements regarding inclusion of Veteran data.[4,5] Both the VACCR and CCRs use precise and robust case definitions for tracking purposes, but CCRs record neither Veteran status nor environmental exposures. Furthermore, some Veterans receive cancer care both in Veterans Affairs (VA) and non-VA settings, leading to the potential inclusion and contribution of these individuals in both the VACCR and state CCRs (Fig 1). Limitations related to sharing protected health information (PHI), including Veteran status, can further confound efforts to recognize Veterans included in both VACCR and state CCRs. The diagnosis patterns, treatment trajectories, and clinical outcomes of Veterans with cancer compared with non-Veteran populations represent a salient knowledge gap. Especially for rare conditions, this also hinders recognition of risk factors for development of cancers that may be specific to chemical and environmental hazards that Veterans might have encountered during military service.
FIG 1.

Conceptual model of how cancer registries may not yield accurate information regarding Veterans. CCRs (blue circle) do not indicate Veteran status when describing cases. The VACCR (red circle) accounts only for Veterans. Data sharing agreements between Veterans Affairs Medical Centers and CCRs vary from state to state. Accordingly, the inclusion of Veterans in CCRs, and their contribution to data analyzed within CCRs, is unknown (purple area), which hinders the assessment of diagnosis patterns, treatment trajectories, and clinical outcomes of Veterans with cancer compared with non-Veteran populations. The relative size of the circles shown in this conceptual model does not accurately represent the number of people included in the Central Cancer Registry compared with the VACCR. CCRs, central cancer registries; VA, Veterans Affairs; VACCR, Veterans Affairs Central Cancer Registry.

CONTEXT

Key Objective: Implementation of deterministic data matching approaches to assess the distribution of brain tumors between population-level registry data and Veterans Affairs data and to assess the representation of Veteran patients in registry data.
Knowledge Generated: Veteran patients were not fully represented in the Ohio state registry, which excluded 25% of the patients with brain tumors diagnosed in the Ohio Veterans Affairs Health Care System. Direct data matching showed that brain tumor distribution differed between populations, with Veterans experiencing a greater proportion of glioblastomas (41%) compared with non-Veterans (21%).
Relevance: This data matching approach has the potential to promote comparisons of the distribution of disease, treatment trajectories, and clinical outcomes among Veteran and non-Veteran populations with cancer and rare diseases.

The advent of widespread electronic health record systems offers an opportunity to better ascertain cancer diagnoses and clinical outcomes among specific populations.
Specifically, the VA's national Corporate Data Warehouse (CDW), a data repository compiled from electronic medical records and updated daily, now includes an Oncology Domain, with cases abstracted by VA cancer registrars.[6] For each Veteran in the Oncology Domain, direct data linkages to the CDW permit access to robust clinical information. State CCRs, which rely upon data submitted from electronic health records and cancer registrars, typically include demographic information and several parameters specific to the cancer diagnosis. The development of approaches to compare cases in VACCR with nonoverlapping cases in state and/or national CCRs would facilitate comparison of cancer and related metrics between Veterans and non-Veteran populations.

Previous research suggests that some military exposures may be associated with brain tumors and that brain tumor frequency and mortality in the US Veteran population may be associated with specific periods of deployment. These exposures include ionizing radiation, nerve agents related to weapons demolitions, and possibly smoke from oil well fires.[7-10]

Using brain tumors as an exemplar, we use data matching approaches to compare cases included in the VA CDW and in the Ohio Cancer Incidence Surveillance System (OCISS), the CCR for Ohio. Brain tumors, while rare in comparison with other tumor types, contribute disproportionately to morbidity and mortality. Ohio is a populous state, with demographics, including the proportion of Veterans (7.3%), similar to the overall US population.[1] Our approach involved using both direct data matching, on the basis of PHI, and deterministic data matching to assess concordance between data sources, on the basis of data elements that are common to CCRs and do not contain PHI.

METHODS

Study Design and Data Sources

The Institutional Review Boards at the VA Northeast Ohio Healthcare System and the Ohio Department of Health approved the study protocol. We conducted a retrospective cohort study of primary brain tumors among Ohio residents from 2000 to 2016. To identify individuals with brain tumors across Ohio, we used the OCISS. To identify patients who received care through the VHA (henceforward, Veterans) with brain tumors, we used the Veterans Affairs Informatics and Computing Infrastructure to access the Oncology Domain within the CDW.

Case Definition for Patients With Brain Tumor

Inclusion criteria were people age ≥ 18 years who were diagnosed with a brain tumor in Ohio. Cases were defined as individuals with a primary brain tumor, identified using administrative criteria on the basis of International Classification of Diseases for Oncology, Third Edition (ICD-O-3) anatomic site, histology, and behavior codes as defined by the Central Brain Tumor Registry of the United States.[11] For each case, we obtained patient identifying information (name, date of birth, and social security number) and data that did not include PHI, which we categorized into six data elements: sex, county of residence, year of diagnosis, age at diagnosis, brain tumor histology and behavior, and anatomic site. Veterans were defined as individuals identified in the VA Oncology Domain. Non-Veterans were defined as individuals who appeared only in the OCISS data set. Infrequently, a single individual may represent more than one case if they had more than one primary brain tumor (eg, a glioblastoma and a meningioma). For these individuals, we used demographic information from their first diagnosed brain tumor.

Data Matching Methods

Direct data matching.

We used PHI (name, date of birth, and social security number) to identify individuals with brain tumors who were present in the VA Oncology Domain and in OCISS. These Veterans were then excluded from subsequent analysis of the OCISS data, permitting comparison of brain tumor histology among Veterans with a non-Veteran population.
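The direct matching step can be sketched as a join on identifiers. This is a minimal illustration with synthetic records, not the study's actual procedure: the `Case` type, the `phi_key` normalization, and the requirement that all three identifiers agree are assumptions made for the example.

```python
from typing import NamedTuple


class Case(NamedTuple):
    # Hypothetical record layout; field names are illustrative only.
    name: str
    date_of_birth: str  # ISO date string
    ssn: str
    histology: str


def phi_key(case: Case) -> tuple:
    # Assumed normalization so trivial formatting differences do not block a match.
    return (case.name.strip().lower(), case.date_of_birth, case.ssn.replace("-", ""))


def direct_match(va_cases, registry_cases):
    """Split cases into VA + registry, VA only, and registry-only (non-Veteran)."""
    va_keys = {phi_key(c) for c in va_cases}
    registry_keys = {phi_key(c) for c in registry_cases}
    va_in_registry = [c for c in va_cases if phi_key(c) in registry_keys]
    va_only = [c for c in va_cases if phi_key(c) not in registry_keys]
    # Registry cases that match a VA case are excluded to leave non-Veterans.
    non_veterans = [c for c in registry_cases if phi_key(c) not in va_keys]
    return va_in_registry, va_only, non_veterans


# Synthetic example records (not real data).
va = [
    Case("Alex Doe", "1950-01-01", "000-00-0001", "glioblastoma"),
    Case("Sam Roe", "1948-05-05", "000-00-0002", "meningioma"),
]
registry = [
    Case("alex doe", "1950-01-01", "000000001", "glioblastoma"),
    Case("Pat Poe", "1960-09-09", "000-00-0003", "meningioma"),
]
both, va_only, non_veterans = direct_match(va, registry)
print(len(both), len(va_only), len(non_veterans))
```

In this sketch, the registry-only group plays the role of the non-Veteran comparison population once matched Veterans are removed.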

Deterministic data matching.

To determine the feasibility of estimating the overlap represented by Veterans included in both the VA Oncology Domain and OCISS, we further assessed concordance of data elements reported in both databases that did not include PHI and are commonly reported in tumor registries. For Veterans present in both the VA Oncology Domain and OCISS, we matched data at two levels of stringency. The first required exact matches for sex, county of residence, year of diagnosis, and age (in years) at diagnosis. The second used caliper matching on numeric variables in which we permitted a difference of one between year of diagnosis and age (in years) at diagnosis.
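The two stringency levels can be illustrated with a short sketch that counts how many of the six non-PHI elements agree for one already-linked pair of records. The record type, field names, and example values below are hypothetical; the only parts taken from the text are the six elements and the caliper of one on year of diagnosis and age at diagnosis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NonPhiRecord:
    # Hypothetical layout for the six non-PHI data elements.
    sex: str
    county: str
    year_of_diagnosis: int
    age_at_diagnosis: int
    histology_behavior: str  # ICD-O-3 histology/behavior code
    anatomic_site: str       # ICD-O-3 topography code


ALL_FIELDS = (
    "sex", "county", "year_of_diagnosis",
    "age_at_diagnosis", "histology_behavior", "anatomic_site",
)
NUMERIC_FIELDS = ("year_of_diagnosis", "age_at_diagnosis")


def field_matches(va: NonPhiRecord, registry: NonPhiRecord, caliper: int = 0) -> dict:
    """Return a per-field match flag; caliper applies to numeric fields only."""
    result = {}
    for field in ALL_FIELDS:
        a, b = getattr(va, field), getattr(registry, field)
        if field in NUMERIC_FIELDS:
            result[field] = abs(a - b) <= caliper
        else:
            result[field] = a == b
    return result


def match_count(va: NonPhiRecord, registry: NonPhiRecord, caliper: int = 0) -> int:
    return sum(field_matches(va, registry, caliper).values())


# Synthetic pair: year differs by one, and the registry site code is less specific.
va_rec = NonPhiRecord("M", "Cuyahoga", 2010, 62, "9440/3", "C71.1")
reg_rec = NonPhiRecord("M", "Cuyahoga", 2011, 62, "9440/3", "C71.9")

print(match_count(va_rec, reg_rec))             # exact matching
print(match_count(va_rec, reg_rec, caliper=1))  # caliper matching
```

With `caliper=0` the numeric fields must agree exactly; with `caliper=1` the one-year difference in year of diagnosis counts as a match, while the site mismatch (frontal lobe versus brain, not otherwise specified) remains.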

Statistical Methods

We used descriptive statistics to summarize Veterans and non-Veterans with primary brain tumors. Student's t-test and the chi-squared test were used to assess for differences in continuous and categorical variables, respectively, including sex and age. Statistical analyses were performed using R (version 3.5.2; Vienna, Austria).

RESULTS

Direct Data Matching

Between 2000 and 2016, 223 Veterans from Ohio had primary brain tumors. Of those, 168 (75%) were represented in OCISS (VA + OCISS), with the remaining 55 (25%) reported only in the VA Oncology Domain (VA only; Appendix Table A1). Differences in brain tumor histology existed between these two groups, with a greater proportion of glioblastomas among those in the VA + OCISS versus the VA only group (48% v 20%, respectively). The proportion of cases in the VA only group did not appear to change after 2007, when the VA policy regarding sharing data with CCRs changed.
TABLE A1.

Characteristics of Veterans Represented in Both the VA Oncology Domain and OCISS and Veterans Represented Only in the VA Oncology Domain Data, 2000-2016

Brain Tumor Histology in Veterans and Non-Veterans

After excluding Veterans from the OCISS, a total of 30,681 non-Veterans in Ohio were diagnosed with a primary brain tumor between 2000 and 2016; 110 non-Veterans had more than one primary brain tumor (Table 1). The Veteran population was predominantly male (96%), whereas the non-Veterans had a slight female predominance (57%). Histology distribution differed significantly between groups (P < .001), with glioblastomas representing a greater proportion of cases among Veterans than non-Veterans (41% v 21%, respectively) and meningiomas a smaller proportion (18% v 34%, respectively; Fig 2). However, to account for overall differences between the VA and general populations, this difference in histology distribution was assessed in males age 56-65 years and resolved (P = .063; Appendix Table A2). The distribution of the anatomic sites involved was consistent with tumor histology, with Veterans having the greatest proportion of tumors in the frontal lobe (21%) and non-Veterans having the greatest proportion of tumors in the meninges (34%). Given that the majority of Veterans were male, we further assessed the brain tumor histology specifically among male Veterans and non-Veterans. Within this subset, Veterans still experienced a greater proportion of glioblastomas compared with non-Veterans (42% v 27%, respectively). Similarly, male Veterans had a greater proportion of brain tumors in their frontal lobes (21%) compared with male non-Veterans, for whom the meninges were the most common site (21%).
TABLE 1.

Characteristics of Veterans and Non-Veterans With Primary Brain Tumors From Ohio, OCISS and VA Oncology Domain, 2000-2016

FIG 2.

Distribution of tumor histology and anatomic site for individuals from Ohio with brain tumors, 2000-2016. Data describing Veterans came from the VA Oncology Domain. Data describing non-Veterans came from the Ohio Cancer Incidence Surveillance System after excluding Veterans: (A) general adult population of Ohio and (B) adult males from Ohio. VA, Veterans Affairs.

TABLE A2.

Characteristics of Veterans and Non-Veterans With Primary Brain Tumors From Ohio, Males Only, Age 56-65 Years, OCISS and VA Oncology Domain, 2000-2016

Deterministic Data Matching

Recognizing that restrictions related to PHI can render direct data matching impractical, we used data elements that do not include PHI to assess for deterministic matches between patients in the VA Oncology Domain and the OCISS. Working within the known cohort of 168 Veterans included in both data sets (ie, already matched by name, date of birth, and social security number), we assessed for matches using six data elements that did not include PHI: sex, county of residence, year of diagnosis, age at diagnosis, brain tumor histology and behavior, and anatomic site. Overall, the concordance between data sets was highest for sex (100%) and lowest for anatomic site (58%; Fig 3). When requiring exact matches, 65 of 168 cases (39%) were concordant for six of six variables, 61 (36%) for five of six variables, and 30 (18%) for four of six variables. With caliper matching on numeric variables that allowed for a difference of one for the year of diagnosis and for the age at diagnosis, greater concordance was observed, with 76 of 168 cases (45.2%) matched on six of six variables, 63 (37.5%) on five of six variables, and 21 (12.5%) on four of six variables. Further examination of the eight cases with three or fewer matched variables indicated that for both anatomic site and tumor histology, the mismatches were due to less specific codes used in OCISS compared with VA Oncology.
For example, for six cases that were listed in OCISS as brain, not otherwise specified, VA Oncology data reported the following anatomic sites: two frontal lobe, two temporal lobe, one cerebellum, and one meninges. We observed similar discrepancies for mismatches in descriptions of tumor histology.
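The concordance counts reported above can be tallied with a few lines of arithmetic. The sketch below uses only the counts stated in the text (168 linked cases and the six-, five-, and four-variable tallies under each rule); the variable and function names are illustrative.

```python
TOTAL = 168  # Veterans present in both the VA Oncology Domain and OCISS

# Cases by number of matched variables, as reported in the text.
exact = {6: 65, 5: 61, 4: 30}
caliper = {6: 76, 5: 63, 4: 21}


def at_least(counts: dict, k: int) -> int:
    """Number of cases matched on k or more of the six variables."""
    return sum(n for matched, n in counts.items() if matched >= k)


def three_or_fewer(counts: dict) -> int:
    """Cases not accounted for by the six-, five-, and four-variable tallies."""
    return TOTAL - sum(counts.values())


for label, counts in (("exact", exact), ("caliper", caliper)):
    n = at_least(counts, 5)
    print(f"{label}: {n} of {TOTAL} ({n / TOTAL:.1%}) matched on at least five; "
          f"{three_or_fewer(counts)} matched on three or fewer")
```

Under exact matching this gives 126 of 168 (75.0%) with at least five matches; under caliper matching, 139 of 168 (82.7%). The remainder under caliper matching is eight cases, which corresponds to the eight cases with three or fewer matched variables examined above.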
FIG 3.

Concordance of variables devoid of protected health information present in both the VA Oncology Domain and the Ohio Cancer Incidence Surveillance System. The large rectangle represents the 168 Veterans present in both databases, with each individual represented as a column. White indicates a match; blue indicates a mismatch. The colored bar across the top of each panel is the number of patients with matches represented by the following color: six of six matches, teal; five of six matches, light orange; four of six matches, purple; three of six matches, blue; and two of six matches, dark orange. (A) Exact match required. (B) Caliper matching permitting a difference of one for age at diagnosis and year of diagnosis; red represents matches under less stringent requirements. *Two patients. VA, Veterans Affairs.


DISCUSSION

In the data matching approach presented here, we used brain tumors as an exemplar to compare Veterans and non-Veterans in Ohio over a 17-year time frame. To our knowledge, this study is the first report to compare the distribution of brain tumor histology between Veteran and non-Veteran populations. Furthermore, this report provides insights into the potential feasibility and limitations involved in applying data matching to other cancers using the same databases or to other conditions using other disease-focused databases. For CCRs that include PHI, our approach demonstrates that data matching can definitively identify Veterans within larger cancer registries that describe the general population. This, in turn, facilitates a more specific comparison of Veterans with non-Veteran populations, permitting detection of clinically relevant differences between these groups.

We also assessed the feasibility of using a deterministic data matching approach to account for CCRs that do not include PHI, including the National Cancer Institute Surveillance, Epidemiology, and End Results (SEER) Program.[12] Working within a limited and previously matched data set of 168 cases, we found that 75% (126 of 168) of the individuals had at least five of six variables in common between both databases. Decreasing the stringency slightly improved the concordance for at least five of six variables to 83% (139 of 168). These results indicate that a deterministic data matching approach between the VA Oncology Domain and CCRs may be feasible but requires further refinement. For brain tumors, those refinements include appropriately matching specific and nonspecific tumor histology and anatomic sites. For other cancers, subject matter experts could develop similarly tailored refinements.

Our analysis revealed three findings specific to brain tumors among Veterans. First, 25% of Veterans with brain tumors were not represented in OCISS.
This under-representation may be an unintended consequence of policies meant to safeguard Veteran data, and the extent to which this extends to other cancers is unknown.[4] Second, the Veterans included only in the VA Oncology Domain had a higher proportion of meningiomas (34.5%), whereas those in both the VA and OCISS had a higher proportion of glioblastomas (48.2%). These differences likely reflect the clinical approaches specific to these brain tumors. Meningiomas tend to be slow growing and are diagnosed by imaging modalities available at VA medical centers.[13] Veterans with meningiomas may not need care outside of the VA system and therefore only have their case reported to the VACCR. By contrast, the diagnosis and clinical care of individuals with a glioblastoma involve a biopsy and, when possible, surgical resection.[14] Across the studied years, none of the VA medical centers in Ohio offered neurosurgical services. Veterans with concerns for glioblastoma would be referred to a non-VA hospital with neurosurgical specialty care, leading to that hospital including them in OCISS. Finally, Veterans had a greater proportion of glioblastomas compared with non-Veterans, a difference that persisted after limiting the analysis to males. On the basis of the limited understanding of risk factors for developing brain tumors, the potential reasons for the disproportionate representation of glioblastomas in the Veteran population are unknown.

Our work has limitations. First, data from the VA Oncology Domain have not undergone standardization and aggregation to registry specifications such as those used for OCISS. The analysis presented here was primarily based on ICD-O codes; for cases with discrepancies between the databases, the VA Oncology Domain typically had more specific information. Second, our analysis is limited to Veterans who access at least some aspects of their health care through a VA medical center.
The VHA is not able to account for Veterans who may elect to receive their care only through non-VHA health care settings. Third, our analysis only assesses brain tumors among individuals from a single state. Ohio has four VA medical centers and is the seventh most populous state with urban, suburban, and rural communities; therefore, for this exploratory analysis, it offers a reasonable representation of other locales in the United States. Finally, brain tumors are very rare, and investigation of more common cancer types, such as prostate and lung cancer, would be beneficial to further evaluate this methodology.

Despite these limitations, our study was able to use a data matching algorithm to detect meaningful differences in the distribution of brain tumor histology between Veterans and non-Veterans in Ohio as well as notable gaps between the VA Oncology Domain and OCISS. The inability to reliably include and identify Veterans at the state registry level will result in inaccuracies in cancer surveillance, reporting, and recommendations at the state and national levels. Further work is needed on developing algorithms to account for Veteran data when considering state and national cancer registry data sets that report on treatment trajectories and outcomes. Our study raises the possibility that deterministic data matching may be a viable approach that would promote investigations that compare the distribution of tumors, impact of chemical and environmental exposures, treatment trajectories, and clinical outcomes among Veteran and non-Veteran populations. Future studies may benefit from adopting a probabilistic matching approach to expand on this, particularly when PHI is not available.
  9 in total

1.  Unreported VA data may affect SEER research, cancer surveillance, and statistics gathering.

Authors:  Liz Savage
Journal:  J Natl Cancer Inst       Date:  2007-11-27       Impact factor: 13.506

2.  Mortality in US Army Gulf War veterans exposed to 1991 Khamisiyah chemical munitions destruction.

Authors:  Tim A Bullman; Clare M Mahan; Han K Kang; William F Page
Journal:  Am J Public Health       Date:  2005-08       Impact factor: 9.308

3.  Trends in brain cancer mortality among U.S. Gulf War veterans: 21 year follow-up.

Authors:  Shannon K Barth; Erin K Dursa; Robert M Bossarte; Aaron I Schneiderman
Journal:  Cancer Epidemiol       Date:  2017-08-04       Impact factor: 2.984

4.  Structured Approach for Evaluating Strategies for Cancer Ascertainment Using Large-Scale Electronic Health Record Data.

Authors:  Ashley Earles; Lin Liu; Ranier Bustamante; Pat Coke; Julie Lynch; Karen Messer; María Elena Martínez; James D Murphy; Christina D Williams; Deborah A Fisher; Dawn T Provenzale; Andrew J Gawron; Tonya Kaltenbach; Samir Gupta
Journal:  JCO Clin Cancer Inform       Date:  2018-12

Review 5.  Epidemiology and etiology of gliomas.

Authors:  Hiroko Ohgaki; Paul Kleihues
Journal:  Acta Neuropathol       Date:  2005-02-01       Impact factor: 17.088

Review 6.  Meningioma.

Authors:  Christine Marosi; Marco Hassler; Karl Roessler; Michele Reni; Milena Sant; Elena Mazza; Charles Vecht
Journal:  Crit Rev Oncol Hematol       Date:  2008-03-14       Impact factor: 6.312

Review 7.  Glioblastoma.

Authors:  Hans-Georg Wirsching; Evanthia Galanis; Michael Weller
Journal:  Handb Clin Neurol       Date:  2016

8.  Neurological mortality among U.S. veterans of the Persian Gulf War: 13-year follow-up.

Authors:  Shannon K Barth; Han K Kang; Tim A Bullman; Mitchell T Wallin
Journal:  Am J Ind Med       Date:  2009-09       Impact factor: 2.214

9.  CBTRUS Statistical Report: Primary Brain and Other Central Nervous System Tumors Diagnosed in the United States in 2013-2017.

Authors:  Quinn T Ostrom; Nirav Patil; Gino Cioffi; Kristin Waite; Carol Kruchko; Jill S Barnholtz-Sloan
Journal:  Neuro Oncol       Date:  2020-10-30       Impact factor: 12.300
