Miguel Angel Luque-Fernandez, Aurélien Belot, Linda Valeri, Giovanni Cerulli, Camille Maringe, Bernard Rachet.
Abstract
In this paper, we propose a structural framework for population-based cancer epidemiology and evaluate the performance of double-robust estimators for a binary exposure in cancer mortality. We conduct numerical analyses to study the bias and efficiency of these estimators. Furthermore, we compare 2 model selection strategies based on 1) Akaike's Information Criterion and the Bayesian Information Criterion and 2) machine learning algorithms, and we illustrate the performance of double-robust estimators in a real-world setting. In simulations with correctly specified models and near-positivity violations, all but the naive estimators performed relatively well, although the augmented inverse-probability-of-treatment weighting estimator showed the largest relative bias. Under dual model misspecification and near-positivity violations, all double-robust estimators were biased. Nevertheless, the targeted maximum likelihood estimator showed the best bias-variance trade-off, more precise estimates, and appropriate 95% confidence interval coverage, supporting the use of data-adaptive model selection strategies based on machine learning algorithms. We applied these methods to estimate adjusted 1-year mortality risk differences in 183,426 lung cancer patients diagnosed after admittance to an emergency department versus persons with a nonemergency cancer diagnosis in England (2006-2013). The adjusted mortality risk (for patients diagnosed with lung cancer after admittance to an emergency department) was 16% higher in men and 18% higher in women, suggesting the importance of interventions targeting early detection of lung cancer signs and symptoms.
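To make the idea of a double-robust estimator concrete, the following is a minimal sketch of augmented inverse-probability-of-treatment weighting (AIPTW) for a binary exposure and binary outcome. It is an illustration under stated assumptions, not the authors' implementation: the data are simulated here, both nuisance models are plain logistic regressions fitted with NumPy, and every name (`fit_logistic`, `aiptw_ate`, the coefficients of the data-generating process) is hypothetical.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, X.T @ (y - p))
    return beta

def aiptw_ate(W, A, Y):
    """AIPTW estimate of the average treatment effect and its standard error."""
    n = len(Y)
    ones = np.ones(n)
    # Treatment model: g(W) = P(A = 1 | W)
    Xg = np.column_stack([ones, W])
    g = expit(Xg @ fit_logistic(Xg, A))
    # Outcome model: Q(a, W) = E(Y | A = a, W)
    bQ = fit_logistic(np.column_stack([ones, A, W]), Y)
    Q1 = expit(np.column_stack([ones, ones, W]) @ bQ)
    Q0 = expit(np.column_stack([ones, np.zeros(n), W]) @ bQ)
    # Outcome predictions augmented with inverse-probability-weighted residuals
    psi = (Q1 + A * (Y - Q1) / g) - (Q0 + (1 - A) * (Y - Q0) / (1 - g))
    return psi.mean(), psi.std(ddof=1) / np.sqrt(n)

# Simulated data (assumed data-generating process, chosen only for illustration)
rng = np.random.default_rng(0)
n = 20000
W = rng.normal(size=(n, 2))
A = rng.binomial(1, expit(-0.5 + 0.8 * W[:, 0] + 0.5 * W[:, 1]))
lin0 = -1.0 + 0.7 * W[:, 0] + 0.5 * W[:, 1]
Y = rng.binomial(1, expit(lin0 + 1.0 * A))

true_ate = np.mean(expit(lin0 + 1.0) - expit(lin0))  # sample-average true effect
naive = Y[A == 1].mean() - Y[A == 0].mean()          # unadjusted risk difference
ate, se = aiptw_ate(W, A, Y)
print(f"true={true_ate:.4f} naive={naive:.4f} AIPTW={ate:.4f} (SE {se:.4f})")
```

The estimator is called double-robust because it remains consistent if either the treatment model or the outcome model is correctly specified; in this sketch both are, so the estimate should land close to the simulated truth while the naive contrast does not.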
Year: 2018 PMID: 29020131 PMCID: PMC5888939 DOI: 10.1093/aje/kwx317
Source DB: PubMed Journal: Am J Epidemiol ISSN: 0002-9262 Impact factor: 4.897
Figure 1. Directed acyclic graph for a proposed structural causal framework in population-based cancer research. Conditional exchangeability of the treatment effect or exposure (A) on 1-year cancer mortality (Y) is obtained through conditioning on a set of available covariates (Y1, Y0 ⊥ A | W). The minimum sufficient set, based on the backdoor criterion, is obtained through conditioning on only W1, W3, and W4. The average treatment effect for the structural framework is estimated as the average risk difference between the expected effect of the treatment conditional on W among treated persons (E(Y | A = 1; W)) and the expected effect of the treatment conditional on W among the untreated (E(Y | A = 0; W)). W1, socioeconomic status; W2, age; W3, cancer stage; W4, comorbidity.
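The identification step described in the Figure 1 caption can be written out explicitly. The following is a sketch in standard potential-outcomes notation; it assumes consistency and positivity in addition to the conditional exchangeability stated in the caption, and the symbol $\psi$ is introduced here only for illustration.

$$
\psi = E\left[Y^{1} - Y^{0}\right] = E_{W}\left[\,E(Y \mid A = 1, W) - E(Y \mid A = 0, W)\,\right]
$$

The outer expectation averages the covariate-specific risk difference over the distribution of W in the whole population, which is what makes $\psi$ a marginal (population-level) risk difference rather than a conditional one.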
Figure 2. Overlap of the propensity scores for correctly specified (first scenario (A)) and misspecified (second scenario (B)) models for the probabilities of treatment status P(A = 1 | W) and P(A = 0 | W) in 1 random sample from 1,000 Monte Carlo simulations.
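An overlap plot such as Figure 2 is often complemented by a numerical summary of the estimated propensity scores. The sketch below is one hypothetical way to do that: the function name `overlap_summary`, the toy scores, and the 0.05/0.95 thresholds are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def overlap_summary(ps, A, lower=0.05, upper=0.95):
    """Range of propensity scores by exposure group and share of extreme scores."""
    ps = np.asarray(ps, dtype=float)
    A = np.asarray(A)
    summary = {}
    for label, mask in (("treated", A == 1), ("untreated", A == 0)):
        summary[label] = (float(ps[mask].min()), float(ps[mask].max()))
    # Scores near 0 or 1 signal near-violations of the positivity assumption
    summary["share_outside"] = float(np.mean((ps < lower) | (ps > upper)))
    return summary

# Toy propensity scores (hypothetical values)
ps = np.array([0.02, 0.10, 0.40, 0.60, 0.90, 0.98])
A = np.array([0, 0, 0, 1, 1, 1])
summary = overlap_summary(ps, A)
print(summary)
```

A large share of scores outside the chosen band suggests that some covariate patterns are almost always (or almost never) exposed, which is the situation the simulations describe as a near-positivity violation.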
Results From 10,000 Monte Carlo Simulations of the Average Treatment Effect for Correctly Specified Models (First Scenario) and Misspecified Models Using Adaptive Approaches (Second Scenario) for Different Double-Robust Estimators of 1-Year Lung Cancer Mortality, England, 2006–2013
| Simulated Scenario | ATE^a (SD) | ATE^a (SD) | Absolute Bias | Absolute Bias | Relative Bias, % | Relative Bias, % | RMSE | RMSE | 95% CI Coverage, % | 95% CI Coverage, % |
|---|---|---|---|---|---|---|---|---|---|---|
| First scenario^b | | | | | | | | | | |
| True ATE | −0.1813 | | | | | | | | | |
| Naive | −0.2234 (0.049) | −0.2218 (0.012) | 0.0421 | 0.0405 | 23.2 | 22.3 | 0.0575 | 0.0423 | 77 | 89 |
| AIPTW | −0.1843 (0.053) | −0.1848 (0.018) | 0.0030 | 0.0035 | 1.6 | 1.9 | 0.0534 | 0.0180 | 93 | 94 |
| IPTW-RA | −0.1831 (0.050) | −0.1838 (0.017) | 0.0018 | 0.0025 | 1.0 | 1.4 | 0.0500 | 0.0174 | 91 | 95 |
| TMLE^c | −0.1832 (0.048) | −0.1821 (0.016) | 0.0019 | 0.0008 | 1.0 | 0.4 | 0.0482 | 0.0158 | 95 | 95 |
| Second scenario^d | | | | | | | | | | |
| True ATE | −0.1172 | | | | | | | | | |
| Naive | −0.0127 (0.103) | −0.0121 (0.033) | 0.1045 | 0.1051 | 89.2 | 89.7 | 0.1470 | 0.1100 | 0 | 0 |
| BF^e AIPTW | −0.1155 (0.093) | −0.0920 (0.073) | 0.0017 | 0.0252 | 1.5 | 11.7 | 0.0928 | 0.0773 | 65 | 65 |
| BF^e IPTW-RA | −0.1268 (0.043) | −0.1192 (0.031) | 0.0096 | 0.0020 | 8.2 | 1.7 | 0.0442 | 0.0305 | 52 | 73 |
| TMLE^c | −0.1181 (0.028) | −0.1177 (0.011) | 0.0009 | 0.0005 | 0.8 | 0.4 | 0.0281 | 0.0107 | 93 | 95 |
Abbreviations: AIPTW, augmented inverse-probability-of-treatment weighting; ATE, average treatment effect; BF, best fit; CI, confidence interval; IPTW-RA, inverse-probability-of-treatment-weighted regression adjustment; RMSE, root mean squared error; SD, standard deviation; TMLE, targeted maximum likelihood estimation.
^a ATE across 1,000 simulated data sets.
^b First scenario: correctly specified models and near-positivity violation.
^c TMLE calling basic SuperLearner (SL) libraries: SL.Step, SL.glm, and SL.glm.interaction.
^d Second scenario: misspecification, near-positivity violation, and adaptive model selection.
^e Best fit based on Akaike’s Information Criterion and the Bayesian Information Criterion.
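The performance measures reported in the simulation table can be computed from the per-replicate estimates. The sketch below uses common textbook definitions, which are assumed here because the source does not spell them out; the function name `performance`, the toy inputs, and the Wald-type interval with z = 1.96 are illustrative choices. The last lines redo the arithmetic for one reported row (Naive, first scenario, first column) from the published mean and true ATE.

```python
import numpy as np

def performance(estimates, std_errors, truth, z=1.96):
    """Summarise Monte Carlo estimates of a parameter whose true value is known."""
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    abs_bias = abs(est.mean() - truth)
    # A replicate "covers" when its Wald interval contains the true value
    covered = (est - z * se <= truth) & (truth <= est + z * se)
    return {
        "mean": float(est.mean()),
        "sd": float(est.std(ddof=1)),
        "abs_bias": float(abs_bias),
        "rel_bias_pct": float(100 * abs_bias / abs(truth)),
        "rmse": float(np.sqrt(np.mean((est - truth) ** 2))),
        "coverage_pct": float(100 * covered.mean()),
    }

# Toy replicates (hypothetical numbers, not from the paper)
toy = performance([0.1, 0.2, 0.3], [0.05, 0.05, 0.05], truth=0.2)

# Worked check using the reported Naive row of the first scenario
naive_abs_bias = abs(-0.2234 - (-0.1813))
naive_rel_bias = 100 * naive_abs_bias / abs(-0.1813)
print(round(naive_abs_bias, 4), round(naive_rel_bias, 1))
```

The worked check reproduces the absolute and relative bias shown for that row; RMSE and coverage cannot be recomputed from the table alone because they depend on the individual replicates.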
One-Year Mortality Among Lung Cancer Patients (Incident Cases; n = 183,426 (102,535 Males and 80,891 Females)), by Cancer Stage, Comorbidity, Socioeconomic Status, and Age at Cancer Diagnosis, After Admittance to an Emergency Department Versus Nonemergency Diagnosis, England, 2006–2013
| Variable | 1-Year Mortality, %: Women | 1-Year Mortality, %: Men |
|---|---|---|
| ER presentation | | |
| No | 53.4 | 59.9 |
| Yes | 83.7 | 86.4 |
| Cancer stage | | |
| I | 18.1 | 24.2 |
| II | 35.1 | 37.6 |
| III | 58.6 | 62.4 |
| IV | 82.2 | 85.8 |
| Quartile of CCI | | |
| 1 (lowest) | 62.8 | 67.6 |
| 2 | 64.1 | 68.3 |
| 3 | 67.2 | 71.4 |
| 4 (highest) | 72.4 | 75.5 |
| Quintile of SES | | |
| 1 (lowest) | 62.6 | 66.7 |
| 2 | 63.3 | 68.1 |
| 3 | 64.0 | 69.5 |
| 4 | 64.2 | 69.6 |
| 5 (highest) | 64.1 | 68.2 |
| Age at diagnosis, years^a | 73.0 (10.8) | 72.6 (10.3) |
Abbreviations: CCI, Charlson Comorbidity Index; ER, emergency room; SES, socioeconomic status.
^a Values are presented as mean (standard deviation).
Figure 3. Sex-specific adjusted risk difference for 1-year lung cancer mortality according to different double-robust estimators among 183,426 lung cancer patients diagnosed after admittance to an emergency department versus persons with a nonemergency cancer diagnosis, England, 2006–2013. A) women; B) men. Bars, 95% confidence intervals. AIPTW, augmented inverse-probability-of-treatment weighting; BF-AIPTW, best-fit augmented inverse-probability-of-treatment weighting (data-adaptive estimation based on Akaike’s Information Criterion (AIC) and the Bayesian Information Criterion (BIC)); BF-IPTW-RA, best-fit inverse-probability-of-treatment-weighted regression adjustment (data-adaptive estimation based on AIC-BIC); IPTW-RA, inverse-probability-of-treatment-weighted regression adjustment; TMLE, targeted maximum likelihood estimation (data-adaptive estimation based on ensemble learning and k-fold cross-validation).
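The targeting step that distinguishes TMLE from the other estimators in Figure 3 can be sketched in a few lines. This is a simplified illustration, not the analysis behind the figure: plain logistic regressions stand in for the ensemble learning and cross-validation named in the caption, the data are simulated, the truncation bound of 0.025 on the propensity score is an assumption, and all function and variable names are hypothetical.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    return np.log(p / (1.0 - p))

def fit_logistic(X, y, offset=None, n_iter=25):
    """Logistic regression by Newton-Raphson, optionally with a fixed offset."""
    offset = np.zeros(len(y)) if offset is None else offset
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta + offset)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, X.T @ (y - p))
    return beta

def tmle_ate(W, A, Y, bound=0.025):
    n = len(Y)
    ones = np.ones(n)
    # Step 1: initial outcome predictions under A = 1 and A = 0
    bQ = fit_logistic(np.column_stack([ones, A, W]), Y)
    Q1 = expit(np.column_stack([ones, ones, W]) @ bQ)
    Q0 = expit(np.column_stack([ones, np.zeros(n), W]) @ bQ)
    # Step 2: propensity score, truncated away from 0 and 1
    Xg = np.column_stack([ones, W])
    g = np.clip(expit(Xg @ fit_logistic(Xg, A)), bound, 1.0 - bound)
    # Step 3: fluctuate the initial fit along the "clever covariate"
    H1, H0 = 1.0 / g, -1.0 / (1.0 - g)
    HA = np.where(A == 1, H1, H0)
    QA = np.where(A == 1, Q1, Q0)
    eps = fit_logistic(HA[:, None], Y, offset=logit(QA))[0]
    # Step 4: updated predictions and plug-in estimate
    Q1s, Q0s = expit(logit(Q1) + eps * H1), expit(logit(Q0) + eps * H0)
    ate = np.mean(Q1s - Q0s)
    # Influence-curve-based standard error
    ic = HA * (Y - np.where(A == 1, Q1s, Q0s)) + (Q1s - Q0s) - ate
    return ate, ic.std(ddof=1) / np.sqrt(n)

# Simulated data (assumed data-generating process, for illustration only)
rng = np.random.default_rng(1)
n = 20000
W = rng.normal(size=(n, 2))
A = rng.binomial(1, expit(-0.5 + 0.8 * W[:, 0] + 0.5 * W[:, 1]))
lin0 = -1.0 + 0.7 * W[:, 0] + 0.5 * W[:, 1]
Y = rng.binomial(1, expit(lin0 + 1.0 * A))
true_ate = np.mean(expit(lin0 + 1.0) - expit(lin0))

ate, se = tmle_ate(W, A, Y)
print(f"true={true_ate:.4f} TMLE={ate:.4f} (SE {se:.4f})")
```

Because the update is made on the logit scale, the targeted predictions stay within (0, 1), so the plug-in risk difference always respects the bounds of the parameter space.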
Results of a Monte Carlo Simulation of Risk Differences in 1-Year Mortality Among Lung Cancer Patients (Incident Cases; n = 183,426) Diagnosed After Admittance to an Emergency Department, England, 2006–2013
| Estimator | ATE^a (SD) | Absolute Bias | Relative Bias, % | RMSE | 95% CI Coverage, % |
|---|---|---|---|---|---|
| True ATE | 0.1621 | | | | |
| AIPTW | 0.1493 (0.010) | 0.0128 | 7.9 | 0.0165 | 79 |
| IPTW-RA | 0.1587 (0.006) | 0.0034 | 2.1 | 0.0072 | 92 |
| TMLE^b | 0.1620 (0.003) | 0.0001 | 0.1 | 0.0034 | 92 |
Abbreviations: AIPTW, augmented inverse-probability-of-treatment weighting; ATE, average treatment effect; CI, confidence interval; IPTW-RA, inverse-probability-of-treatment-weighted regression adjustment; RMSE, root mean squared error; SD, standard deviation; TMLE, targeted maximum likelihood estimation.
^a ATE across 1,000 simulated data sets.
^b TMLE calling basic SuperLearner (SL) libraries: SL.Step, SL.glm, and SL.glm.interaction.