Bas BL Penning de Vries, Maarten van Smeden, Rolf HH Groenwold.
Abstract
Joint misclassification of exposure and outcome variables can lead to considerable bias in epidemiological studies of causal exposure-outcome effects. In this paper, we present a new maximum likelihood based estimator for marginal causal effects that simultaneously adjusts for confounding and several forms of joint misclassification of the exposure and outcome variables. The proposed method relies on validation data for the construction of weights that account for both sources of bias. The weighting estimator, which is an extension of the outcome misclassification weighting estimator proposed by Gravel and Platt (Weighted estimation for confounded binary outcomes subject to misclassification. Stat Med 2018; 37: 425-436), is applied to reinfarction data. Simulation studies were carried out to study its finite sample properties and compare it with methods that do not account for confounding or misclassification. The new estimator showed favourable large sample properties in the simulations. Further research is needed to study the sensitivity of the proposed method and that of alternatives to violations of their assumptions. The implementation of the estimator is facilitated by a new R function (ipwm) in an existing R package (mecor).
Keywords: Causal inference; confounding; inverse probability weighting; joint exposure and outcome misclassification; propensity scores; validation data
Year: 2020 PMID: 32998668 PMCID: PMC8008432 DOI: 10.1177/0962280220960172
Source DB: PubMed Journal: Stat Methods Med Res ISSN: 0962-2802 Impact factor: 3.021
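The abstract describes a weighting estimator for a marginal causal effect. As a rough illustration of the weighting idea only, the following Python sketch applies plain inverse probability weighting to a binary exposure, binary outcome and one binary confounder. It is not the ipwm estimator: it assumes no misclassification and uses no validation data, and the function names and the counts are hypothetical.

```python
import math

def crude_log_odds_ratio(counts):
    """Unadjusted log OR; counts maps (l, a, y) -> number of individuals."""
    cell = {(a, y): sum(counts[(l, a, y)] for l in (0, 1))
            for a in (0, 1) for y in (0, 1)}
    return math.log((cell[(1, 1)] / cell[(1, 0)]) / (cell[(0, 1)] / cell[(0, 0)]))

def ipw_log_odds_ratio(counts):
    """Marginal log OR after weighting by 1 / P(A = a | L = l)."""
    # Propensity score P(A = 1 | L = l), estimated within each stratum of L.
    ps = {}
    for l in (0, 1):
        n_l = sum(counts[(l, a, y)] for a in (0, 1) for y in (0, 1))
        ps[l] = sum(counts[(l, 1, y)] for y in (0, 1)) / n_l
    # Accumulate weighted cell totals of the exposure-by-outcome table.
    weighted = {(a, y): 0.0 for a in (0, 1) for y in (0, 1)}
    for (l, a, y), n in counts.items():
        p_a = ps[l] if a == 1 else 1.0 - ps[l]
        weighted[(a, y)] += n / p_a
    odds_exposed = weighted[(1, 1)] / weighted[(1, 0)]
    odds_unexposed = weighted[(0, 1)] / weighted[(0, 0)]
    return math.log(odds_exposed / odds_unexposed)

# Hypothetical counts: the exposure has no effect within strata of L,
# but L affects both exposure and outcome (confounding).
counts = {
    (0, 1, 1): 20, (0, 1, 0): 180, (0, 0, 1): 80, (0, 0, 0): 720,
    (1, 1, 1): 400, (1, 1, 0): 400, (1, 0, 1): 100, (1, 0, 0): 100,
}
print(crude_log_odds_ratio(counts))  # clearly positive: confounded
print(ipw_log_odds_ratio(counts))    # close to 0, up to floating-point error
```

In this toy example the crude log OR is well above zero although the exposure has no effect within either stratum, while the weighted estimate is essentially zero.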
Cross-classification of the reinfarction data for 33,007 individuals as given by Gravel and Platt.
| 11602 | 13116 | 1302 | 5363 | |
| 890 | 589 | 49 | 96 | |
True and false positive rates for reinfarction example.
Expected cell counts (rounded to integers) for reinfarction example after misclassification was introduced.
| 10912 | 109 | 574 | 7 |
| 51 | 10 | 678 | 151 |
| 1527 | 10850 | 47 | 693 |
| 5 | 27 | 48 | 509 |
| 1148 | 116 | 23 | 14 |
| 7 | 4 | 29 | 9 |
| 334 | 4738 | 41 | 249 |
| 4 | 11 | 13 | 68 |
Note: Because of rounding, the sum of all cell entries is 33,006 rather than 33,007, the size of the reinfarction dataset.
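The rounding note can be checked directly from the two tables of counts. The short Python check below sums both tables. It also compares the row totals of the expected-count table with the original cells read column by column; that correspondence is an observation from the numbers, not something stated in the source. The variable names are illustrative.

```python
# Cross-classified reinfarction counts as given by Gravel and Platt.
original = [
    [11602, 13116, 1302, 5363],
    [890, 589, 49, 96],
]

# Expected cell counts (rounded) after misclassification was introduced.
expected = [
    [10912, 109, 574, 7],
    [51, 10, 678, 151],
    [1527, 10850, 47, 693],
    [5, 27, 48, 509],
    [1148, 116, 23, 14],
    [7, 4, 29, 9],
    [334, 4738, 41, 249],
    [4, 11, 13, 68],
]

original_total = sum(sum(row) for row in original)
expected_total = sum(sum(row) for row in expected)
print(original_total, expected_total)  # prints: 33007 33006

# Original cells read column by column, compared with expected-count row totals.
original_by_column = [cell for column in zip(*original) for cell in column]
row_totals = [sum(row) for row in expected]
differences = [r - o for r, o in zip(row_totals, original_by_column)]
print(differences)  # prints: [0, 0, 1, 0, -1, 0, -1, 0]
```

Each row total is within 1 of an original cell, and the differences add up to the single missing individual mentioned in the note.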
Figure 1. Data structure for scenarios with misclassification on the outcome only (left) or on both the exposure and outcome (right). Bullet arrowheads represent deterministic relationships.
Simulation parameter values used in the Monte Carlo studies.
| Scenarios | Exposure misclassification | | | | | | |
|---|---|---|---|---|---|---|---|
| 1a,1b,1c | Absent | −2 | 0 | −3.85 | 2 | −0.431 | −1.5 |
| 2a,2b,2c | Absent | −3 | 0 | −3.85 | 2 | −0.417 | −1.5 |
| 3a,3b,3c | Absent | −2 | 0 | −3.85 | 4 | −0.624 | −1.5 |
| 4a,4b,4c | Absent | −2 | 0 | −3.85 | 2 | −0.431 | −2.5 |
| 5a,5b,5c | Present | −2 | 2 | −3.85 | 2 | −0.431 | −1.5 |
| 6a,6b,6c | Present | −3 | 2 | −3.85 | 2 | −0.417 | −1.5 |
| 7a,7b,7c | Present | −2 | 4 | −3.85 | 2 | −0.431 | −1.5 |
| 8a,8b,8c | Present | −2 | 2 | −3.85 | 4 | −0.624 | −1.5 |
| 9a,9b,9c | Present | −2 | 2 | −3.85 | 2 | −0.431 | −2.5 |
| 10a,10b,10c | Absent | −2 | 0 | −2 | 2 | −0.470 | −1.5 |
| 11a,11b,11c | Absent | −3 | 0 | −2 | 2 | −0.445 | −1.5 |
| 12a,12b,12c | Absent | −2 | 0 | −2 | 4 | −0.641 | −1.5 |
| 13a,13b,13c | Absent | −2 | 0 | −2 | 2 | −0.470 | −2.5 |
| 14a,14b,14c | Present | −2 | 2 | −2 | 2 | −0.470 | −1.5 |
| 15a,15b,15c | Present | −3 | 2 | −2 | 2 | −0.445 | −1.5 |
| 16a,16b,16c | Present | −2 | 4 | −2 | 2 | −0.470 | −1.5 |
| 17a,17b,17c | Present | −2 | 2 | −2 | 4 | −0.641 | −1.5 |
| 18a,18b,18c | Present | −2 | 2 | −2 | 2 | −0.470 | −2.5 |
Note: Scenarios indicated with ‘a’ have n = 10,000, those with ‘b’ have n = 5000 and those with ‘c’ have n = 1000.
Results for simulation studies 1–9b on the performance of different causal estimators in various scenarios of confounding and misclassification in exposure and outcome.
Crude estimator:

| Scenario | Bias | BSE | MSE | SE | SSE | CP |
|---|---|---|---|---|---|---|
| 1b | | 0.004 | 0.169 | 0.119 | 0.118 | 0.122 |
| 2b | | 0.006 | 0.179 | 0.183 | 0.184 | 0.492 |
| 3b | | 0.004 | 0.169 | 0.117 | 0.118 | 0.116 |
| 4b | | 0.004 | 0.174 | 0.117 | 0.118 | 0.102 |
| 5b | | 0.003 | 0.169 | 0.090 | 0.088 | 0.007 |
| 6b | | 0.004 | 0.183 | 0.132 | 0.134 | 0.133 |
| 7b | | 0.003 | 0.164 | 0.086 | 0.088 | 0.009 |
| 8b | | 0.003 | 0.164 | 0.086 | 0.088 | 0.005 |
| 9b | | 0.003 | 0.166 | 0.089 | 0.088 | 0.005 |
PS: Propensity score method ignoring misclassification; CCA: complete case analysis; GP: Gravel and Platt estimator ignoring exposure misclassification, consistent with the methodology of Gravel and Platt for point (but not for variance) estimation[14]; IPWM: inverse probability weighting method for confounding and joint exposure and outcome misclassification; BSE: estimated standard error for the bias due to Monte Carlo error; SE: empirical standard error; SSE: sample standard error; CP: empirical coverage probability. In all scenarios, the true marginal log OR (estimand) was −0.4.
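The performance measures named in this note can be computed from Monte Carlo output along the following lines. This Python sketch uses common textbook definitions, which are an assumption here: the exact formulas used in the paper, and how the two standard-error quantities below map onto the SE and SSE columns, are not given in this section. The function name, the dictionary keys and the example inputs are hypothetical.

```python
import math
import statistics

def performance(estimates, std_errors, truth, z=1.96):
    """Summarise simulation replicates of an estimator of a known estimand."""
    n_sim = len(estimates)
    bias = statistics.fmean(estimates) - truth
    # Standard deviation of the point estimates across replicates.
    sd_of_estimates = statistics.stdev(estimates)
    # Monte Carlo standard error of the estimated bias.
    bias_mc_se = sd_of_estimates / math.sqrt(n_sim)
    mse = statistics.fmean((e - truth) ** 2 for e in estimates)
    # Average of the model-based standard errors reported in each replicate.
    mean_estimated_se = statistics.fmean(std_errors)
    # Share of Wald-type intervals (estimate +/- z * SE) that contain the truth.
    covered = sum(
        1 for e, s in zip(estimates, std_errors)
        if e - z * s <= truth <= e + z * s
    )
    return {
        "bias": bias,
        "bias_mc_se": bias_mc_se,
        "mse": mse,
        "sd_of_estimates": sd_of_estimates,
        "mean_estimated_se": mean_estimated_se,
        "coverage": covered / n_sim,
    }

# Hypothetical replicates; the true marginal log OR is -0.4 as in the note.
result = performance(
    estimates=[-0.5, -0.4, -0.3],
    std_errors=[0.1, 0.1, 0.01],
    truth=-0.4,
)
print(result)
```

With these definitions the mean squared error is approximately the squared bias plus the variance of the estimates, which is a useful consistency check when reading the Bias, MSE and standard-error columns together.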