Steven B Cohen, Jennifer Unangst.
Abstract
Project Data Sphere (PDS) is a research platform that provides the research community with broad access to both de-identified patient-level data from oncology clinical trials and related analytic tools. While these data are rich in measures that characterize the clinical trials under study, data providers are required to de-identify patient-level data by removing key demographic data. To address these analytic constraints, the data profiles in selected PDS patient-level cancer phase III clinical datasets have been augmented by linking the social, economic, and health-related characteristics of like cancer survivors from nationally representative health and health care-related survey data. Using statistical linkage and model-based techniques, patient-level records in selected PDS datasets have been linked to those of comparable cancer survivors, and are thereby augmented with survey content on social, economic, and health-related characteristics. These new analytically enhanced PDS data resources enable more targeted analyses designed to examine questions such as how disparities in cancer patients' access to health care and income impact patient outcomes in specific phase III clinical trials, and what variations in patient outcomes are associated with specific demographic, socioeconomic, and health-related factors. This study provides an overview of the methodologies used to connect patient-level clinical trial data with nationally representative health-related data on cancer survivors from the national Medical Expenditure Panel Survey (MEPS). MEPS was designed to provide national population-based health care use, expenditure, and source of payment estimates in addition to measures of health status, demographic characteristics, employment, health insurance coverage, and access to health care. 
Study findings include probabilistic assessments of the representation of the patients in the respective clinical trials relative to the characteristics of cancer survivors in the general population. The study also demonstrates how the augmented datasets enable researchers to assess the impact of socioeconomic factors, added through data integration, on cancer survival and related outcomes of interest.
Keywords: MEPS; clinical trials; data integration; health disparities; project data sphere
Year: 2018 PMID: 30254982 PMCID: PMC6141801 DOI: 10.3389/fonc.2018.00365
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
Number of MEPS lung cancer survivors eligible for linkage by MEPS year.
| MEPS year | Number | Percent |
|---|---|---|
| 2000 | 28 | 4.3 |
| 2001 | 37 | 5.7 |
| 2002 | 49 | 7.5 |
| 2003 | 46 | 7.0 |
| 2004 | 46 | 7.0 |
| 2005 | 36 | 5.5 |
| 2006 | 33 | 5.1 |
| 2007 | 49 | 7.5 |
| 2008 | 60 | 9.2 |
| 2009 | 53 | 8.1 |
| 2010 | 61 | 9.3 |
| 2011 | 59 | 9.1 |
| 2012 | 49 | 7.5 |
| 2013 | 47 | 7.2 |
| Total | 653 | 100.0 |
Medical Expenditure Panel Survey Household Component, 2000–2013, Agency for Healthcare Research and Quality, U.S. Department of Health and Human Services.
Lower bound values for EQ-5D decile categories.
| Decile category | Lower bound | Number | Percent |
|---|---|---|---|
| 1 | −0.016 | 41 | 8.5 |
| 2 | 0.620 | 35 | 7.2 |
| 3 | 0.689 | 45 | 9.3 |
| 4 | 0.725 | 47 | 9.7 |
| 5 | 0.760 | 35 | 7.2 |
| 6 | 0.796 | 76 | 15.7 |
| 7 | 0.848 | 52 | 10.8 |
| 8 | 0.883 | 22 | 4.6 |
| 9 | 1.000 | 130 | 27.0 |
PDS data file LungNo_MerckKG_2007_145, Project Data Sphere.
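The lower bounds in the table above can be used to map a continuous EQ-5D score to its category. The following is a minimal sketch, assuming each category is the half-open interval from its lower bound up to (but not including) the next lower bound, with the last category covering scores at or above 1.000. The function name `eq5d_category` is hypothetical and does not come from the source.

```python
from bisect import bisect_right

# Lower bounds of the nine EQ-5D categories, copied from the table above.
LOWER_BOUNDS = [-0.016, 0.620, 0.689, 0.725, 0.760, 0.796, 0.848, 0.883, 1.000]


def eq5d_category(score: float) -> int:
    """Return the 1-based category whose lower bound is the largest one <= score.

    Assumption: categories are half-open intervals [lower, next lower).
    """
    if score < LOWER_BOUNDS[0]:
        raise ValueError(f"score {score} is below the lowest bound {LOWER_BOUNDS[0]}")
    # bisect_right counts how many lower bounds are <= score.
    return bisect_right(LOWER_BOUNDS, score)


if __name__ == "__main__":
    for s in (-0.016, 0.70, 0.883, 1.0):
        print(s, eq5d_category(s))
```

Because ties collapsed some deciles, the mapping yields nine categories rather than ten, which matches the nine rows in the table.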
Summary of linkage approach.
| 1: Single year age, sex, race, EQ-5D score | Direct | Direct |
| 2: Categorized age, sex, race, EQ-5D score | Direct | Direct |
| 3: Categorized age, sex, race, EQ-5D decile categories | Dolan | Dolan |
| 1: Single year age, sex, race, EQ-5D decile categories | Sullivan | Dolan |
| 2: Categorized age, sex, race, EQ-5D decile categories | Sullivan | Dolan |
Medical Expenditure Panel Survey Household Component, 2000–2013, Agency for Healthcare Research and Quality, U.S. Department of Health and Human Services; PDS data file LungNo_MerckKG_2007_145, Project Data Sphere.
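The stepwise keys listed above (exact age first, then categorized age) suggest a sequential exact-match procedure. The sketch below illustrates that general idea only and is not the authors' implementation. Several details are assumptions not found in the source: the record field names, the 10-year age bands in `age_band`, the use of the EQ-5D category at every step, taking the first candidate when several survey records share a key, and allowing one survey record to be linked to more than one trial patient.

```python
from collections import defaultdict


def age_band(age: int, width: int = 10) -> int:
    """Hypothetical age categorization; the source does not give the cut points."""
    return age // width


# Match keys ordered from most to least specific (illustrative only).
STEPS = [
    lambda r: (r["age"], r["sex"], r["race"], r["eq5d_cat"]),
    lambda r: (age_band(r["age"]), r["sex"], r["race"], r["eq5d_cat"]),
]


def link(trial_records, survey_records):
    """Return {trial id: (survey id, step number)} for the first step that matches."""
    links = {}
    for step_no, key in enumerate(STEPS, start=1):
        # Index survey records by the key used at this step.
        index = defaultdict(list)
        for s in survey_records:
            index[key(s)].append(s["id"])
        for t in trial_records:
            if t["id"] in links:
                continue  # already linked at an earlier, stricter step
            candidates = index.get(key(t))
            if candidates:
                # Assumption: take the first candidate; survey records may be reused.
                links[t["id"]] = (candidates[0], step_no)
    return links


trial = [
    {"id": "T1", "age": 63, "sex": "M", "race": "White", "eq5d_cat": 6},
    {"id": "T2", "age": 58, "sex": "F", "race": "White", "eq5d_cat": 9},
    {"id": "T3", "age": 71, "sex": "M", "race": "Black", "eq5d_cat": 2},
]
survey = [
    {"id": "S1", "age": 63, "sex": "M", "race": "White", "eq5d_cat": 6},
    {"id": "S2", "age": 52, "sex": "F", "race": "White", "eq5d_cat": 9},
]

if __name__ == "__main__":
    print(link(trial, survey))
```

In this toy data, T1 links at step 1, T2 links only after age is categorized, and T3 stays unlinked. Trial patients left unlinked after the last step would need a model-based approach such as the one the abstract mentions.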
Measures considered as potential predictors of trial linkage status for MEPS lung cancer survivors.
| Measure | Description |
|---|---|
| Age | Age in years at end of the MEPS survey year |
| Race | White, Black, Other (including Hispanic) |
| Sex | Male, Female |
| EQ-5D decile category | For MEPS 2000–2003, the categorized predicted value of EQ-5D based on the Dolan prediction equation. For MEPS 2004–2013, the categorized predicted value of EQ-5D based on the Sullivan-Ghushchyan prediction model. Fewer than ten decile categories resulted due to ties. Categories range from −0.016 ≤ EQ-5D < 0.620 (lowest) to EQ-5D ≥ 1.000 (highest). |
| Marital status | Married, Not married (including divorced, separated, widowed, never married) |
| Employment status | Not employed, Employed at any time during reference period |
| Education level | No degree, Earned at least GED or high school diploma |
| Income level | High income (family income ≥400% of the poverty level), poor through middle income (family income < 400% of the poverty level) |
| MEPS survey period | 2000–2003, 2004–2013 |
| Health insurance coverage | Any private insurance, Public insurance only, Uninsured |
| Smoker status | Current smoker, Not current smoker |
| Perceived health status | Excellent/Very Good/Good, Fair/Poor |
| Limitation in physical functioning | Yes, No |
| Number of prescribed medicine purchases | Frequency in year |
| Number of hospital discharges | Frequency in year |
| Number of emergency room visits | Frequency in year |
| Number of office-based physician visits | Frequency in year |
| Total health care expenditures | Continuous measure for year |
| Access to necessary medical care | Able to get access, Unable to get access |
Medical Expenditure Panel Survey Household Component Data Files 2000–2013, Agency for Healthcare Research and Quality, U.S. Department of Health and Human Services.
Logistic regression model to identify factors associated with trial linkage status for MEPS lung cancer survivors.
| Variable | Beta | SE | p-value (t-test) | df | Wald F | p-value (Wald F) |
|---|---|---|---|---|---|---|
| Overall model | | | | 9 | 6.95 | < 0.0001 |
| Intercept | 1.06 | 0.43 | 0.0145 | | | |
| Marital status | | | | 1 | 6.16 | 0.0134 |
| Married | 1.01 | 0.40 | 0.0134 | | | |
| Sex | | | | 1 | 11.04 | 0.0010 |
| Female | −1.61 | 0.49 | 0.0010 | | | |
| MEPS Survey Year | | | | 1 | 6.47 | 0.0113 |
| 2000–2003 | −0.93 | 0.37 | 0.0113 | | | |
| EQ-5D Decile Category | 0.26 | 0.07 | 0.0001 | 1 | 14.81 | 0.0001 |
| Race | | | | 2 | 25.94 | < 0.0001 |
| Other (including Hispanic) | −4.39 | 0.85 | < 0.0001 | | | |
| Black | −5.93 | 0.91 | < 0.0001 | | | |
| Access to necessary medical care | | | | 1 | 3.17 | 0.0758 |
| Unable to get access | 1.09 | 0.61 | 0.0758 | | | |
| Smoking status | | | | 1 | 4.46 | 0.0352 |
| Current smoker | 0.94 | 0.44 | 0.0352 | | | |
n = 470.
Analysis performed using SUDAAN statistical software. The subpopn statement was used to conduct the subpopulation analysis of lung cancer survivors from among all MEPS cases (2000–2013).
Pseudo R-square: 0.429977
−2 * Normalized log-likelihood with intercepts only: 580.32
−2 * Normalized log-likelihood full model: 316.14
Approximate chi-square (−2 * log-L ratio): 264.18
Degrees of freedom: 8
Denominator degrees of freedom: 445
Medical Expenditure Panel Survey Household Component Data Files 2000–2013, Agency for Healthcare Research and Quality, U.S. Department of Health and Human Services.
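The fit statistics in the footnotes above are related by simple arithmetic, which can be checked directly. The sketch below computes the approximate chi-square as the difference of the two −2 log-likelihood values. It also evaluates a Cox-Snell style pseudo R-square, 1 − exp(−chi-square / n). The source does not name the pseudo R-square formula, so Cox-Snell is an assumption here; it reproduces the reported value to about four decimal places. The function name `likelihood_ratio_summary` is hypothetical.

```python
import math


def likelihood_ratio_summary(neg2ll_null: float, neg2ll_full: float, n: int):
    """Return (likelihood-ratio chi-square, Cox-Snell style pseudo R-square).

    Assumption: the reported pseudo R-square follows 1 - exp(-chi_square / n);
    the source does not state which formula was used.
    """
    chi_square = neg2ll_null - neg2ll_full
    pseudo_r2 = 1.0 - math.exp(-chi_square / n)
    return chi_square, pseudo_r2


if __name__ == "__main__":
    # Values copied from the footnotes of the trial linkage model (n = 470).
    chi_square, pseudo_r2 = likelihood_ratio_summary(580.32, 316.14, 470)
    print(f"chi-square = {chi_square:.2f}")
    print(f"pseudo R-square = {pseudo_r2:.4f}")
```

The same function applied to the survival model footnotes (424.65, 390.20, n = 356) gives a chi-square of 34.45 and a pseudo R-square close to the reported 0.092231.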
Measures from PDS considered as potential predictors of survival status for PDS lung cancer patients.
| Measure | Description |
|---|---|
| Age | Age in years |
| Race | White, Other |
| Sex | Male, Female |
| Most Recent EQ-5D Measurement | Most recently recorded EQ-5D measurement |
| ECOG Performance | Scale used to assess how a patient's disease is progressing and how the disease affects daily living abilities: Fully active without any physical restriction; Restricted in physical activity of a strenuous nature |
| Response to Chemo-radiotherapy | Partial/complete response, stable disease |
| Type of Chemo-radiotherapy | Concomitant, Sequential |
| N Stage | Cancer stage that describes the number and relative location of lymph nodes affected by the tumor. A higher number after the N indicates that a greater number of lymph nodes have been affected: NX/N0 (not measurable; no cancer); N1/N2; N3 |
| Histology | Adenocarcinoma, Squamous cell carcinoma, Other/Unknown |
| Smoking History | Active smoker, Nonsmoker/former smoker |
Data files from LungNo_MerckKG_2007_145 accessed via Project Data Sphere.
Unless otherwise noted, the variables represent measurements taken at baseline or screening.
Measures from MEPS considered as potential predictors of survival status for PDS lung cancer patients.
| Measure | Description |
|---|---|
| Marital status | Married, Not married (including divorced, separated, widowed, never married) |
| Employment status | Not employed, Employed at any time during reference period |
| Education level | No degree, Earned at least GED or high school diploma |
| Income level | High income (family income ≥400% of the poverty level), poor through middle income (family income < 400% of the poverty level) |
| Private insurance coverage | Yes, No (including public insurance only or uninsured) |
| Smoker status | Current smoker, Not current smoker |
| Belief: Health insurance not needed | Disagree/Uncertain, Agree |
| Belief: Health insurance not worth cost | Disagree/Uncertain, Agree |
| Belief: More likely to take risks | Disagree/Uncertain, Agree |
| Belief: Able to overcome illness without help | Disagree/Uncertain, Agree |
| Perceived health status | Excellent/Very Good/Good, Fair/Poor |
| Limitation in physical functioning | Yes, No |
| Number of prescribed medicine purchases | Frequency in year |
| Number of hospital discharges | Frequency in year |
| Number of emergency room visits | Frequency in year |
| Number of office-based physician visits | Frequency in year |
| Total health care expenditures | Continuous measure for year |
| Access to necessary medical care | Able to get access, Unable to get access |
| Medicare coverage | Covered, Not covered |
| Medicaid coverage | Covered, Not covered |
| Tricare coverage | Covered, Not covered |
| Private HMO coverage | Covered, Not covered |
| Office-based provider visit: EEG | Yes, No (including no provider visits) |
| Office-based provider visit: EKG | Yes, No (including no provider visits) |
| Office-based provider visit: MRI | Yes, No (including no provider visits) |
| Office-based provider visit: lab tests | Yes, No (including no provider visits) |
| Office-based provider visit: anesthesia | Yes, No (including no provider visits) |
| Office-based provider visit: other exams | Yes, No (including no provider visits) |
Medical Expenditure Panel Survey Household Component Data Files 2000–2013, Medical Expenditure Panel Survey Office-Based Medical Provider Visits Files 2000–2013, Agency for Healthcare Research and Quality, U.S. Department of Health and Human Services.
Logistic regression model to identify factors associated with survival status for PDS lung cancer patients.
| Variable | Beta | SE | p-value (t-test) | df | Wald F | p-value (Wald F) |
|---|---|---|---|---|---|---|
| Overall model | | | | 8 | 9.63 | < 0.0001 |
| Intercept | 0.37 | 0.51 | 0.4748 | | | |
| N Stage | | | | 2 | 4.12 | 0.0170 |
| N1/N2 | −0.42 | 0.50 | 0.4041 | | | |
| N3 | 0.54 | 0.57 | 0.3379 | | | |
| Medicaid coverage | | | | 1 | 8.07 | 0.0048 |
| Covered | 1.61 | 0.57 | 0.0048 | | | |
| Private HMO coverage | | | | 1 | 3.76 | 0.0531 |
| Covered | 0.59 | 0.31 | 0.0531 | | | |
| Smoking status | | | | 1 | 2.77 | 0.0969 |
| Current smoker | −0.49 | 0.29 | 0.0969 | | | |
| Belief: Health insurance not needed | | | | 1 | 4.86 | 0.0282 |
| Agree | 1.78 | 0.81 | 0.0282 | | | |
| Office-based visit: lab tests | | | | 1 | 7.38 | 0.0069 |
| Yes | 0.80 | 0.29 | 0.0069 | | | |
n = 356.
Pseudo R-square: 0.092231
−2 * Normalized log-likelihood with intercepts only: 424.65
−2 * Normalized log-likelihood full model: 390.20
Approximate chi-square (−2 * log-L ratio): 34.45
Degrees of freedom: 7
Denominator degrees of freedom: 355
Data files from LungNo_MerckKG_2007_145 accessed via Project Data Sphere. Medical Expenditure Panel Survey Household Component Data Files 2000–2013, Medical Expenditure Panel Survey Office-Based Medical Provider Visits Files 2000–2013, Agency for Healthcare Research and Quality, U.S. Department of Health and Human Services.
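Logistic regression coefficients are often easier to read as odds ratios. The sketch below exponentiates the coefficient and standard-error pairs from the survival status model above, reading the first two numeric values in each level row as the beta coefficient and its standard error. The 95% interval uses a normal approximation (z = 1.96), which is an assumption; survey software may use a t distribution with the design degrees of freedom instead. The excerpt does not say which outcome level is being modeled, so the direction of each effect is left uninterpreted. The names `COEFFICIENTS` and `odds_ratio` are hypothetical.

```python
import math

# (beta, standard error) pairs copied from the survival status model table.
COEFFICIENTS = {
    "N Stage: N1/N2": (-0.42, 0.50),
    "N Stage: N3": (0.54, 0.57),
    "Medicaid coverage: Covered": (1.61, 0.57),
    "Private HMO coverage: Covered": (0.59, 0.31),
    "Smoking status: Current smoker": (-0.49, 0.29),
    "Belief: Health insurance not needed: Agree": (1.78, 0.81),
    "Office-based visit: lab tests: Yes": (0.80, 0.29),
}


def odds_ratio(beta: float, se: float, z: float = 1.96):
    """Return (odds ratio, lower limit, upper limit) using a Wald-type interval.

    Assumption: normal approximation with z = 1.96 for a 95% interval.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


if __name__ == "__main__":
    for label, (beta, se) in COEFFICIENTS.items():
        or_, lo, hi = odds_ratio(beta, se)
        print(f"{label}: OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

An interval that excludes 1 corresponds to a level whose t-test p-value is below roughly 0.05, so the intervals offer a quick consistency check against the p-values in the table.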