
Brief Report: Negative Controls to Detect Selection Bias and Measurement Bias in Epidemiologic Studies.

Benjamin F Arnold¹, Ayse Ercumen, Jade Benjamin-Chung, John M Colford.

Abstract

Biomedical laboratory experiments routinely use negative controls to identify possible sources of bias, but epidemiologic studies have infrequently used this type of control in their design or measurement approach. Recently, epidemiologists proposed the routine use of negative controls in observational studies and defined the structure of negative controls to detect bias due to unmeasured confounding. We extend this previous study and define the structure of negative controls to detect selection bias and measurement bias in both observational studies and randomized trials. We illustrate the strengths and limitations of negative controls in this context using examples from the epidemiologic literature. Given their demonstrated utility and broad generalizability, the routine use of prespecified negative controls will strengthen the evidence from epidemiologic studies.


Year: 2016 PMID: 27182642 PMCID: PMC4969055 DOI: 10.1097/EDE.0000000000000504

Source DB: PubMed Journal: Epidemiology ISSN: 1044-3983 Impact factor: 4.822

Negative controls are used in laboratory science to help detect problems with the experimental method. In epidemiologic studies, a negative control outcome acts as a surrogate for the actual outcome—the negative control should be subject to the same potential sources of bias as the outcome but is not caused by the exposure of interest. Negative control exposures are conceptually the same, but defined relative to the actual exposure. Lipsitch et al.[1] defined the structure of negative controls to detect unmeasured confounding and described by way of example how negative controls could be used to detect selection bias and measurement (information) bias. Here, we define the causal structure of negative controls with respect to selection bias[2] and measurement bias,[3] and illustrate their use with published examples.

NEGATIVE CONTROLS TO DETECT SELECTION BIAS

For clarity, we focus on structures of selection bias under the null (no effect of exposure), and specifically on the four structures we expect to be most relevant to epidemiologic research (Fig. 1).[2] Selection bias occurs when the analysis conditions on a third variable C that is a common descendant of exposure A and outcome Y, or a common descendant of unmeasured causes of either A or Y or both, denoted U_A or U_Y.[2] Defining C as the combination of censoring mechanisms during enrollment, follow-up, and analysis, standard epidemiologic measures are limited to the stratum C = 0 (uncensored, available data). We denote negative control exposures as N_A and negative control outcomes as N_Y; they could be dichotomous, categorical, or continuous.

FIGURE 1.

Simplified causal diagrams of selection bias for exposure A and outcome Y along with negative control exposures (N_A) and outcomes (N_Y). In all four structures, selection bias results from conditioning on C, a common descendant of (A) exposure A and outcome Y, (B) cause of exposure U_A and outcome Y, (C) exposure A and cause of outcome U_Y, or (D) cause of exposure U_A and cause of outcome U_Y.

A common form of selection bias can result from conditioning on a common descendant of the exposure and outcome (Fig. 1A). For example, in case–control designs where selection into the study (C) conditions on the outcome (Y → C), selection bias results if the exposure affects participant selection (A → C) differentially by case/control status. This bias structure could also occur in the re-analysis of a case–control study for a secondary outcome Z, which is intermediate between the exposure and outcome: A → Z → Y → C. This design is used in genetic epidemiology studies that repurpose costly genomic measures A and look at their association with additional outcomes Z.[4,5] Negative control outcomes or exposures to detect this type of bias would need to similarly affect participant selection (N_Y → C, Table, example 1 or N_A → C, Table, example 2).
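The Fig. 1A structure can be illustrated with a small simulation. The following is a minimal sketch, not an analysis from the paper: all probabilities, the function names, and the choice that exposed controls are half as likely to participate are assumptions made for illustration. Under the null, A and the negative control exposure N_A do not cause Y, but both lower participation among controls (A → C and N_A → C), and selection depends on case status (Y → C).

```python
import random


def odds_ratio(rows, exposure, outcome="Y"):
    """Odds ratio from the 2x2 table of exposure by outcome."""
    counts = {(e, o): 0 for e in (0, 1) for o in (0, 1)}
    for r in rows:
        counts[(r[exposure], r[outcome])] += 1
    return (counts[(1, 1)] * counts[(0, 0)]) / (counts[(1, 0)] * counts[(0, 1)])


def simulate_population(n=200_000, seed=11):
    """Source population under the null, with selection into a case-control study."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = int(rng.random() < 0.3)    # exposure A
        n_a = int(rng.random() < 0.3)  # negative control exposure N_A
        y = int(rng.random() < 0.1)    # outcome Y, caused by neither A nor N_A
        if y == 1:
            p_select = 0.8             # cases are sampled with high probability
        else:
            # Assumed mechanism: exposed controls participate half as often,
            # so A and N_A affect selection differentially by case status.
            p_select = 0.2 * (0.5 if a else 1.0) * (0.5 if n_a else 1.0)
        c = int(rng.random() >= p_select)  # C = 0 means selected (uncensored)
        rows.append({"A": a, "N_A": n_a, "Y": y, "C": c})
    return rows


if __name__ == "__main__":
    population = simulate_population()
    selected = [r for r in population if r["C"] == 0]  # analysis conditions on C = 0
    for label, data in [("source population", population), ("selected sample", selected)]:
        print(label,
              "OR(A, Y) = %.2f" % odds_ratio(data, "A"),
              "OR(N_A, Y) = %.2f" % odds_ratio(data, "N_A"))
```

In this sketch both odds ratios are close to 1 in the source population and well above 1 in the selected sample. A non-null association between N_A and Y among participants is the signal that selection, rather than a causal effect, may explain the association between A and Y.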

FIGURE 2.

Simplified causal diagrams of differential measurement error for an exposure A that causes outcome Y. The basic structures for outcome measurement error (A) and exposure measurement error (B) are summarized along with negative control exposures (N_A) and outcomes (N_Y). U_Y represents other causes of the measured value of Y* and U_A represents other causes of the measured value of A*.

A second form of selection bias can occur in cross-sectional or retrospective studies when the outcome Y and an unmeasured cause of the exposure U_A affect study enrollment C (Fig. 1B). This bias can be detected by using a negative control exposure that shares the same unmeasured parent of the exposure (U_A → N_A, Table, example 3) or a negative control outcome that similarly affects enrollment (N_Y → C).

A third form of selection bias can occur when a study conditions on a common descendant of the exposure and an unmeasured cause of the outcome (Fig. 1C). In per-protocol analyses of randomized trials, investigators limit the analysis to individuals that complied with their respective group assignments. Bias results if compliance (C) is determined by treatment assignment (A) and by unmeasured characteristics (U_Y) that affect both individuals’ willingness to comply with their assigned treatment and their outcome.[6] For example, if individuals assigned to treatment who comply with their regimen are more health conscious than noncompliers, a naive per-protocol analysis could overestimate the benefits of the treatment. Figure 1C also applies to selection bias in prospective studies if exposure A and an unmeasured cause of the outcome U_Y affect loss to follow-up C. A negative control outcome that shares the same unmeasured parent as the outcome (U_Y → N_Y, Table, example 4) or a negative control exposure that similarly affects enrollment, loss to follow-up, or compliance (N_A → C) can be used to detect this type of selection bias.

Finally, selection bias can occur if a cause of the exposure U_A and a cause of the outcome U_Y both affect enrollment C (Fig. 1D). One example is volunteer bias in cohort studies,[2] where individuals’ underlying characteristics might affect their exposures and health outcomes as well as their decision to enroll in the study. This bias could be detected by using a negative control outcome that shares the same parent as the actual outcome (U_Y → N_Y) or a negative control exposure that shares the same parent as the actual exposure (U_A → N_A); however, we are unaware of a study that has used negative controls for this bias structure.
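The per-protocol example for Fig. 1C can be sketched as a simulation. This is an illustrative sketch under assumed values, not the authors' analysis: the probabilities, the variable coding (C = 1 for noncompliance), and the function names are hypothetical. Treatment A is randomized and has no effect on Y; an unmeasured trait U_Y (health consciousness) lowers the risk of both Y and a negative control outcome N_Y, and also improves compliance in the treated arm.

```python
import random


def risk_difference(rows, exposure, outcome):
    """Outcome prevalence among the exposed minus prevalence among the unexposed."""
    exposed = [r[outcome] for r in rows if r[exposure] == 1]
    unexposed = [r[outcome] for r in rows if r[exposure] == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


def simulate_trial(n=200_000, seed=1):
    """Simulate the Fig. 1C structure under the null (A does not cause Y)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = int(rng.random() < 0.5)    # randomized treatment assignment A
        u_y = int(rng.random() < 0.5)  # unmeasured health consciousness U_Y
        # U_Y lowers the risk of Y and of N_Y; A affects neither.
        y = int(rng.random() < (0.10 if u_y else 0.30))
        n_y = int(rng.random() < (0.05 if u_y else 0.20))
        # C = 1 means censored (noncompliant). In the treated arm compliance
        # depends on U_Y, so C is a common descendant of A and U_Y.
        p_censor = (0.10 if u_y else 0.60) if a else 0.10
        c = int(rng.random() < p_censor)
        rows.append({"A": a, "Y": y, "N_Y": n_y, "C": c})
    return rows


if __name__ == "__main__":
    rows = simulate_trial()
    per_protocol = [r for r in rows if r["C"] == 0]  # condition on C = 0
    for label, data in [("all randomized", rows), ("per-protocol", per_protocol)]:
        print(label,
              "RD(Y) = %.3f" % risk_difference(data, "A", "Y"),
              "RD(N_Y) = %.3f" % risk_difference(data, "A", "N_Y"))
```

Among all randomized participants both risk differences are near zero, as expected under the null. In the per-protocol subset the treatment appears protective for Y, and the same spurious protection appears for N_Y, which the treatment cannot plausibly affect. That shared pattern is what flags the selection bias.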

NEGATIVE CONTROLS TO DETECT MEASUREMENT BIAS

Many studies have measurement error so that investigators observe Y*, which is an error-prone version of the outcome Y.[3] For example, if Y is an enteric infection that causes diarrhea, Y* could be caregiver-reported diarrhea symptoms. In the diagrams, we assume U_Y accounts for all other unmeasured causes of Y* beyond Y. Similarly, A* can be subject to unmeasured sources of error U_A. We focus our definitions on differential measurement errors (Fig. 2) because they are most likely to cause bias and the consequent bias is often the least predictable.[3] For parsimony, we have not provided formal definitions of negative controls under more complex (or simple nondifferential) measurement error scenarios, but in principle Figure 2 could be extended to accommodate them; for example, removing edges Y → U_A or A → U_Y defines negative controls for independent, nondifferential errors.

Differential outcome measurement error occurs when A influences the measured outcome Y* through U_Y (Fig. 2A). In an unblinded study, physician follow-up (U_Y) may be increased in treated patients compared with the untreated (A → U_Y), and selective follow-up causes differential measurement error of Y. Differential outcome reporting can also bias observational studies and unblinded trials with subjectively reported outcomes,[7] where participant knowledge of their exposure or treatment assignment could influence reporting. An ideal negative control outcome for this scenario shares a common source of correlated measurement error (U_Y) with the true outcome (Table, example 5). Negative control exposures for differential outcome measurement error also exist: placebo drugs in clinical trials are a classic example. Negative control exposures that act like a placebo can be devised for observational studies (Table, example 6).

Differential exposure measurement error is possible when the exposure A is measured concurrently with or after the occurrence of the outcome Y (Fig. 2B) and is of greatest concern in retrospective or cross-sectional studies. For example, retrospective case–control studies can be biased if they rely on self-reported exposures A* as a proxy for true exposures A and cases remember exposures more accurately than controls. Negative controls for exposure measurement error need to share correlated errors (U_A) with the exposure (Table, example 7). The bias described in Figure 2B can also occur if an outcome is measured concurrently with an exposure, where the measured exposure (A*) is used as a proxy for the same measure at a time in the past that is relevant for causing disease (A) (Table, example 8).[8-15]
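The differential outcome measurement error structure (Fig. 2A) described above can also be sketched in a few lines. The sketch is hypothetical and rests on stated assumptions: the prevalences, the under-reporting mechanism, and all names are invented for illustration. Treatment A is unblinded and has no effect on the true outcome Y or on the true negative control outcome; knowing one is treated makes under-reporting of symptoms (U_Y) more likely, and U_Y affects the reported values of both outcomes.

```python
import random


def prevalence_difference(rows, key):
    """Prevalence of rows[key] in the treated (A = 1) minus the untreated (A = 0)."""
    treated = [r[key] for r in rows if r["A"] == 1]
    untreated = [r[key] for r in rows if r["A"] == 0]
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)


def simulate_reporting(n=200_000, seed=7):
    """Simulate the Fig. 2A structure under the null (A does not cause Y)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = int(rng.random() < 0.5)     # unblinded treatment A
        y = int(rng.random() < 0.20)    # true outcome Y, unaffected by A
        n_y = int(rng.random() < 0.15)  # true negative control outcome, unaffected by A
        # U_Y = 1 marks a participant who under-reports symptoms;
        # this is assumed to be more common among the treated (A -> U_Y).
        u_y = int(rng.random() < (0.40 if a else 0.10))
        p_report = 0.5 if u_y else 1.0  # under-reporters report half of true events
        y_star = int(y == 1 and rng.random() < p_report)      # measured outcome Y*
        n_y_star = int(n_y == 1 and rng.random() < p_report)  # measured negative control
        rows.append({"A": a, "Y": y, "Y*": y_star, "N_Y*": n_y_star})
    return rows


if __name__ == "__main__":
    rows = simulate_reporting()
    for key in ("Y", "Y*", "N_Y*"):
        print("difference in %s: %.3f" % (key, prevalence_difference(rows, key)))
```

Here the true outcome does not differ by treatment, but the reported outcome does. Because the negative control outcome shares the reporting error U_Y and cannot be affected by A, its reported difference between groups points to differential reporting rather than a treatment effect.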

DISCUSSION

We defined the structure of negative controls to detect common forms of selection and measurement bias in observational studies and randomized trials. The examples in the Table illustrate many recent applications, and the structural definitions in Figures 1 and 2 generalize to further applications we have not discussed; for example, Figure 1C describes the structure for healthy worker bias[2] and healthy user/adherer bias.[16] For extensions beyond the detection of bias, recent efforts have used negative controls in sensitivity analyses to quantify the magnitude of bias from unobserved confounding,[17] as a tool to remove bias in standardized mortality ratios,[18] and as a basis for large-scale empirical calibration of P values in drug safety studies.[19] We envision similar extensions for the types of negative controls defined here.

Negative controls have some limitations that arise in practice. Lipsitch et al.[1] characterized negative controls as a “blunt tool” to detect bias in the context of confounding, and that characterization is equally apt in the context of selection and measurement bias. Negative controls often lack specificity in the type of bias that they detect; many examples in the Table illustrate this limitation (Discussion in the eAppendix, http://links.lww.com/EDE/B56). Moreover, negative controls may identify the presence of bias but cannot in general determine its direction or magnitude without additional assumptions.[1] Another limitation that many negative controls share is that they often fail to provide a definitive test of the absence of bias.[1,20] All of these limitations coalesce into a common challenge for selecting negative controls: a control must meet its assumed structural definition, otherwise it can be an insensitive or inappropriate diagnostic for bias. Thus, the ability of a negative control to adequately detect bias ultimately relies on the plausibility of (often untestable) assumptions encoded in its causal diagram.

Finally, prespecification of primary outcome and exposure definitions helps prevent the selective presentation of favorable results, and prespecification and complete reporting of negative controls would prevent similar problems.[20]

Selection bias or measurement bias threatens nearly every epidemiologic study design. Given their demonstrated utility and broad generalizability, the routine use of negative controls will help detect selection bias and measurement bias in epidemiologic studies.

Table.

Examples of Studies that Have Used Negative Controls to Detect Selection or Measurement Bias Following Bias Structures in Figures 1 and 2

20 in total

1. Secondary analysis of case-control data.

Authors: Yannan Jiang; Alastair J Scott; Chris J Wild
Journal: Stat Med Date: 2006-04-30 Impact factor: 2.373

2. Empirical evidence of bias in treatment effect estimates in controlled trials with different interventions and outcomes: meta-epidemiological study.

Authors: Lesley Wood; Matthias Egger; Lise Lotte Gluud; Kenneth F Schulz; Peter Jüni; Douglas G Altman; Christian Gluud; Richard M Martin; Anthony J G Wood; Jonathan A C Sterne
Journal: BMJ Date: 2008-03-03

3. Selective association of multiple sclerosis with infectious mononucleosis.

Authors: B M Zaadstra; A M J Chorus; S van Buuren; H Kalsbeek; J M van Noort
Journal: Mult Scler Date: 2008-01-21 Impact factor: 6.312

4. A general regression framework for a secondary outcome in case-control studies.

Authors: Eric J Tchetgen Tchetgen
Journal: Biostatistics Date: 2013-10-22 Impact factor: 5.899

5. Invited Commentary: Causal diagrams and measurement bias.

Authors: Miguel A Hernán; Stephen R Cole
Journal: Am J Epidemiol Date: 2009-09-15 Impact factor: 4.897

6. Negative Control Outcomes and the Analysis of Standardized Mortality Ratios.

Authors: David B Richardson; Alexander P Keil; Eric Tchetgen Tchetgen; Glinda Cooper
Journal: Epidemiology Date: 2015-09 Impact factor: 4.822

7. Using rapid indicators for Enterococcus to assess the risk of illness after exposure to urban runoff contaminated marine water.

Authors: John M Colford; Kenneth C Schiff; John F Griffith; Vince Yau; Benjamin F Arnold; Catherine C Wright; Joshua S Gruber; Timothy J Wade; Susan Burns; Jacqueline Hayes; Charles McGee; Mark Gold; Yiping Cao; Rachel T Noble; Richard Haugland; Stephen B Weisberg
Journal: Water Res Date: 2012-02-02 Impact factor: 11.236

8. Statins and risk of diabetes: an analysis of electronic medical records to evaluate possible bias due to differential survival.

Authors: Goodarz Danaei; Luis A García Rodríguez; Oscar Fernandez Cantero; Miguel A Hernán
Journal: Diabetes Care Date: 2012-12-17 Impact factor: 19.112

9. The control outcome calibration approach for causal inference with unobserved confounding.

Authors: Eric Tchetgen Tchetgen
Journal: Am J Epidemiol Date: 2013-12-20 Impact factor: 4.897

10. Evaluation of exposure to contaminated drinking water and specific birth defects and childhood cancers at Marine Corps Base Camp Lejeune, North Carolina: a case-control study.

Authors: Perri Zeitz Ruckart; Frank J Bove; Morris Maslia
Journal: Environ Health Date: 2013-12-04 Impact factor: 5.984
