Adam J Streeter, Nan Xuan Lin, Louise Crathorne, Marcela Haasova, Christopher Hyde, David Melzer, William E Henley.
Abstract
OBJECTIVES: Motivated by recent calls to use electronic health records for research, we reviewed the application and development of methods for addressing the bias from unmeasured confounding in longitudinal data. STUDY DESIGN AND …
Keywords: Electronic health records; Longitudinal; Method review; Observational data; Unmeasured confounding; Unobserved confounding
Year: 2017 PMID: 28460857 PMCID: PMC5589113 DOI: 10.1016/j.jclinepi.2017.04.022
Source DB: PubMed Journal: J Clin Epidemiol ISSN: 0895-4356 Impact factor: 6.437
Fig. 1. Flow diagram for method review.
Summary of methods to mitigate against unmeasured confounding captured by systematic review and the frequency of their use among the captured papers
| Method | Description | Obstacles to implementation | Frequency of methods |
|---|---|---|---|
| Instrumental variable analysis (IVA) | Upon identification of a suitably strong instrument, the influence of bias may be reduced through post hoc randomization. The instrumental variable should be highly determinant of the intervention or treatment received, while satisfying the exclusion assumption of being independent of the outcome other than through the treatment (Wright 1928; Angrist 1991). | In practice, finding an instrument with a sufficiently strong treatment association is a stumbling block in many analyses (Bound, Jaeger, and Baker 1995; Baser 2009). Association of the instrument with the outcome exclusively through the treatment is an untestable assumption, particularly if an indirect association exists through an unmeasured covariate. | 79 |
| Difference in differences (DiD) | A biased effect estimate between two treatment groups may be corrected by the same estimate from a treatment-free period before the exposure, which should be a measure of the confounding bias contributed to the treatment effect (Ashenfelter and Card 1984). Aggregated at the treatment group level, this is operationalized in regression as a period-treatment interaction. At an individual level, demeaning, first-differencing, or dummy variables for each individual may yield bias-free fixed effects, contingent on assumptions. | The method is contingent on the availability of repeated outcomes in both periods and invokes a time-invariant confounding assumption: that the confounding bias, as captured by the estimated treatment effect in a treatment-free period before exposure, is constant through to the study period. | 24 |
| Prior event rate ratio (PERR) | Analogous to the DiD method for time to event or rate data, a biased estimate of the hazard ratio or the incidence rate ratio is adjusted through its ratio with that from a treatment-free prior period (Tannen et al. 2008). | As with the assumption for DiD, repeatable outcomes and a constancy of the unmeasured confounding bias are required across both periods, before and after the exposure. Prior event occurrence should not influence the likelihood of future treatment. | 5 |
| Fixed effects instrumental variable analysis (FE IVA) | IVA may be applied to DiD estimation to mitigate second-order endogeneity: the time-varying part of the bias that may not have been adjusted for by DiD. | Assumptions of IVA apply. | 5 |
| Dynamic panel model or instrumental variable—generalized method of moments (IV-GMM) | Lagged observations of the confounded (endogenous) explanatory variable are introduced in a first-differences fixed effects analysis so that the differences of the lags become the instrumental variables in a generalized method of moments estimation. | Assumptions of IVA apply. Here, the differenced lags should not be correlated with the differences in the error terms. | 2 |
| Regression discontinuity (RD) | RD is a design for analysis based on a treatment assignment determined by a cutoff applied to a continuous variable that is preferably measured with some random noise (as many clinical tests may be). The outcome can then be modeled on treatment for individuals within a certain interval from the cutoff of the assignment variable to ensure exchangeability between individuals for robust causal inference (Thistlethwaite and Campbell 1960). | Where assignment is not sharply determined by the cutoff, an increase in the probability of treatment may be observed, leading to a “fuzzy” version of RD. Continuity in the assignment variable is assumed; otherwise, manipulation of assignment and reverse causality may be suspected. Assignment should be locally random around the cutoff, and the method makes the weak assumption that no unobserved covariates are discontinuous around the assignment cutoff. | 3 |
| Propensity score calibration (PSC) | PSC adjusts for residual confounding in the error-prone main data set by importing information about the unmeasured confounders from a smaller, external “gold-standard” data set (Stürmer et al. 2005). Analysis in the main data set is adjusted using a single-dimension propensity score of the measured confounders, corrected for unmeasured confounding by regression calibration against the gold-standard propensity score. | Successful adjustment is wholly dependent on the availability of another data set containing the exposure variable and error-free predictor, with individuals that are relevant enough to those in the main data set and under similar enough conditions to assure sufficient overlap between the two data sets. | 3 |
| Perturbation testing/analysis (PT/PA) | This data mining approach aims to mitigate unmeasured confounding by adjusting for many measured variables that are weakly associated with the unobserved confounding variables (Lee 2014). Simulation in the single reviewed example demonstrated that this may require hundreds, if not thousands, of perturbation variables (PVs). | This requires a very high-dimensional data set, which may ultimately obviate the need for indirect adjustment if most or all of the confounders are captured. Simulation demonstrated that the bias may be exaggerated if a confounder is inadvertently identified as a PV, requiring many more true PVs to correct the bias. The number of PVs may exceed the available degrees of freedom, necessitating clustering. | 1 |
| Negative control outcome/exposure (NCO/NCE) | A negative control is causally related to measured and unmeasured confounders affecting the exposure and main outcome but not directly causally related to exposure and outcome themselves. As such, the negative control may be used to detect confounding bias in the main study and potentially to adjust for it indirectly (Richardson et al. 2014). | This assumes that the effect of the unmeasured confounders on the main outcome is similar to that affecting the negative control. | 1 |
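
The DiD and PERR rows above describe simple arithmetic that a short sketch may make concrete. The Python below is illustrative only and is not taken from the reviewed papers: the data are synthetic, the function names (`did_interaction`, `simulate_two_periods`, `rate_ratio`, `perr`) and all numeric values are assumptions made for this example, and the simulation builds in the time-invariant confounding assumption that both methods rely on. DiD is fitted as the period-treatment interaction in an ordinary least squares regression; PERR is computed as the study-period rate ratio divided by the prior-period rate ratio.

```python
import numpy as np


def did_interaction(y, treated, post):
    """Return the coefficient on treated*post from an OLS fit.

    Design columns: intercept, treatment group, period, period-treatment interaction.
    """
    X = np.column_stack([np.ones_like(y), treated, post, treated * post])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[3]


def simulate_two_periods(n=20000, true_effect=-1.0, seed=0):
    """Synthetic data (assumption): one outcome per person before and after exposure."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                                # unmeasured, time-invariant confounder
    treated = (u + rng.normal(size=n) > 0).astype(float)  # treatment choice depends on u
    y_prior = 2.0 * u + rng.normal(size=n)                # treatment-free prior period
    y_study = 0.5 + 2.0 * u + true_effect * treated + rng.normal(size=n)
    y = np.concatenate([y_prior, y_study])
    d = np.concatenate([treated, treated])
    post = np.concatenate([np.zeros(n), np.ones(n)])
    return y, d, post


def rate_ratio(events_treated, time_treated, events_control, time_control):
    """Incidence rate ratio of treated versus control."""
    return (events_treated / time_treated) / (events_control / time_control)


def perr(ratio_study, ratio_prior):
    """Prior event rate ratio adjustment: study-period ratio over prior-period ratio."""
    return ratio_study / ratio_prior


y, d, post = simulate_two_periods()
# Naive comparison: study-period difference in means, confounded by u.
naive = y[(post == 1) & (d == 1)].mean() - y[(post == 1) & (d == 0)].mean()
did = did_interaction(y, d, post)
print(f"naive difference: {naive:.2f}, DiD estimate: {did:.2f}")

# Hypothetical event counts and person-years, chosen only for illustration.
ratio_prior = rate_ratio(30, 1000.0, 20, 1000.0)
ratio_study = rate_ratio(24, 1000.0, 20, 1000.0)
print(f"PERR-adjusted ratio: {perr(ratio_study, ratio_prior):.2f}")
```

In this sketch the naive comparison absorbs the confounder's contribution, while the interaction term recovers something close to the simulated effect only because the confounding bias was constructed to be identical in both periods. If the bias drifted between periods, neither adjustment would be expected to remove it.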
Fig. 2. Plot of frequency of reviewed methods for mitigating unmeasured confounding by: difference-in-differences (black); instrumental variable analysis (IVA) (mid-gray); other (light gray) includes regression discontinuity, prior event rate ratio method, propensity score calibration, perturbation analysis, negative control outcomes, fixed effects with IVA, and dynamic panel models. Note: the low frequencies in 2015 were attributable to the May cutoff for inclusion in that year.
Frequency of instruments categorized by type used in instrumental variable analyses
| IV type | Explanation/example | Frequency |
|---|---|---|
| Historical | Usually prescribing preference of physician or facility based on historical records of previously administered therapies | 34 |
| Geographic | Differential distance between patient's postcode and nearest health facility | 20 |
| Mendelian | Genetic characteristics: single nucleotide polymorphisms | 11 |
| Time | Time-based characteristic of treatment such as date of therapy | 10 |
| Other | Characteristics of individual, for example, age of patient, weight of offspring | 8 |
| Lagged | Previous therapy or outcome of patient | 6 |
| Randomization | Original randomization | 1 |
Abbreviation: IV, instrumental variable.
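
As a rough illustration of how an instrument of the kind tabulated above might enter an analysis, the sketch below runs a two-stage least squares fit on synthetic data. Everything in it is an assumption for the example rather than a method reported by the reviewed studies: the binary instrument `z` loosely stands in for a prescribing preference, the helper names (`ols`, `two_stage_least_squares`, `simulate_iv`) are hypothetical, and the simulation deliberately satisfies the exclusion assumption and uses a constant treatment effect.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y regressed on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def two_stage_least_squares(y, d, z):
    """2SLS with a single instrument z for a single treatment d."""
    Z = np.column_stack([np.ones_like(z), z])
    d_hat = Z @ ols(Z, d)                         # stage 1: predict treatment from instrument
    X_hat = np.column_stack([np.ones_like(d_hat), d_hat])
    return ols(X_hat, y)[1]                       # stage 2: outcome on predicted treatment


def simulate_iv(n=50000, true_effect=-1.0, seed=0):
    """Synthetic data (assumption): z shifts treatment but affects y only through d."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                        # unmeasured confounder
    z = rng.binomial(1, 0.5, size=n).astype(float)
    d = (1.5 * z + u + rng.normal(size=n) > 0.75).astype(float)
    y = true_effect * d + 2.0 * u + rng.normal(size=n)
    return y, d, z


y, d, z = simulate_iv()
naive = ols(np.column_stack([np.ones_like(d), d]), y)[1]   # confounded by u
iv = two_stage_least_squares(y, d, z)
print(f"naive OLS: {naive:.2f}, 2SLS: {iv:.2f}")
```

The point estimate from the second stage is the usual 2SLS estimate, but the standard errors from a plain second-stage regression would not be valid, so a dedicated IV routine would be preferable for inference. With a weak instrument, which the first table lists as a common obstacle, the estimate can be far less stable than it appears here.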