Halvor Sommerfelt, Hans Steinsland, Lize van der Merwe, William C Blackwelder, Dilruba Nasrin, Tamer H Farag, Karen L Kotloff, Myron M Levine, Håkon K Gjessing.
Abstract
If individuals in a case/control study are subsequently observed as a cohort of cases and a cohort of controls, weighted regression analyses can be used to estimate the association between the exposures initially recorded and events occurring during the follow-up of the 2 cohorts. Such analyses can be conceptualized as being undertaken on a reconstructed source population from which cases and controls stem. To simulate this population, the cohort of cases is added to the cohort of controls expanded with the reciprocal of the case disease incidence odds (the sampling weight) to include all individuals in the source population who did not develop the case disease. We use a simulated dataset to illustrate how weighted generalized linear model regression can be used to estimate the association between an exposure captured during the case/control study component and an outcome that occurs during follow-up.
Year: 2012 PMID: 23169939 PMCID: PMC3502318 DOI: 10.1093/cid/cis802
Source DB: PubMed Journal: Clin Infect Dis ISSN: 1058-4838 Impact factor: 9.079
Figure 1. Venn diagram showing the distribution of 2400 cases and 2400 controls in relation to an exposure and an outcome in a population of 100 000 individuals. The numbers were generated using functions found in the Supplementary Appendix. Abbreviations: CoCa, cohort of cases; CoCo, cohort of controls.
Figure 2. Schematic presentation of associations between an exposure (E), a case disease (D), and an outcome (O) in a population, where arrows indicate the direction of causality.
Table 1. Two-by-Two Tables Showing Distributions of Exposure (E) and Outcome (O) or Disease Defining Case Status (D) as a Basis for the Conceptual Framework of the Reconstructed Population Method
| Panel | Population | E | O + | O − | Total | Risk | RR |
|---|---|---|---|---|---|---|---|
| A | Source population | + | 70 | 4930 | 5000 | 0.014 | 2.6 |
| | | − | 508 | 94 492 | 95 000 | 0.005 | |
| B | CoCa | + | 25 | 475 | 500 | 0.050 | 2.2 |
| | | − | 43 | 1857 | 1900 | 0.023 | |
| C | NC | + | 45 | 4455 | 4500 | 0.010 | 2.0 |
| | | − | 465 | 92 635 | 93 100 | 0.005 | |
| D | CoCo | + | 1 | 110 | 111 | 0.009 | 1.9 |
| | | − | 11 | 2278 | 2289 | 0.005 | |

| Panel | Population | E | D + | D − | Odds | OR |
|---|---|---|---|---|---|---|
| E | CC study | + | 500 | 111 | 4.505 | 5.4 |
| | | − | 1900 | 2289 | 0.83 | |

| Panel | Population | E | O + | O − | Total | Risk | RR | 95% CI |
|---|---|---|---|---|---|---|---|---|
| F | CoCa | + | 25 | 475 | 500 | 0.050 | 2.2 | 1.4–3.6 |
| | | − | 43 | 1857 | 1900 | 0.023 | | |
| G | CoCo | + | 1 | 110 | 111 | 0.009 | 1.9 | 0.24–14.4 |
| | | − | 11 | 2278 | 2289 | 0.005 | | |
| H | CoCa + CoCo | + | 26 | 585 | 611 | 0.043 | 3.3 | 2.1–5.2 |
| | | − | 54 | 4135 | 4189 | 0.013 | | |
Abbreviations: CC, case/control; CI, confidence interval; CoCa, cohort of cases; CoCo, cohort of controls; D, case-defining illness; NC, noncases; OR, odds ratio; RR, relative risk.
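The relative risks, confidence intervals, and odds ratio in the table above can be recomputed from the cell counts. The sketch below is illustrative only: the function name `risk_ratio` is hypothetical, and the interval is assumed to be the log-scale Wald interval, which is not stated in the source but happens to agree with the reported limits.

```python
import math

def risk_ratio(a, b, c, d, z=1.96):
    """Relative risk with a log-scale Wald interval (assumed method).

    a, b: exposed individuals with / without the outcome
    c, d: unexposed individuals with / without the outcome
    """
    n_exposed, n_unexposed = a + b, c + d
    rr = (a / n_exposed) / (c / n_unexposed)
    se_log_rr = math.sqrt(1 / a - 1 / n_exposed + 1 / c - 1 / n_unexposed)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Cell counts copied from the cohort panels of the table.
panels = {
    "CoCa": (25, 475, 43, 1857),
    "CoCo": (1, 110, 11, 2278),
    "CoCa + CoCo (unweighted pooling)": (26, 585, 54, 4135),
}

for name, cells in panels.items():
    rr, lower, upper = risk_ratio(*cells)
    print(f"{name}: RR = {rr:.1f}, 95% CI {lower:.2f} to {upper:.1f}")

# Exposure odds ratio from the case/control panel:
# odds of exposure among cases divided by odds of exposure among controls,
# written here as the ratio of the two row odds (cases/controls per row).
odds_exposed = 500 / 111
odds_unexposed = 1900 / 2289
odds_ratio = odds_exposed / odds_unexposed
print(f"CC study: OR = {odds_ratio:.1f}")
```

Pooling the two cohorts without weights gives a relative risk well above the source-population value of 2.6, which is why the combined analysis in the last panel is described as ill-advised.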
Table 2. Reconstructing the Population
| Panel | Population | E | O + | O − | Total | Risk | RR |
|---|---|---|---|---|---|---|---|
| A | rNC | + | 1 × 40.67 = 40.67 | 110 × 40.67 = 4473.33 | 4514 | 0.009 | 1.9 |
| | | − | 11 × 40.67 = 447.33 | 2278 × 40.67 = 92 639.67 | 93 086 | 0.005 | |
| B | Reconstructed population (a) | + | 40.67 + 25 = 65.67 | 4473.33 + 475 = 4948.33 | 5014 | 0.013 | 2.5 |
| | | − | 447.33 + 43 = 490.33 | 92 639.67 + 1857 = 94 496.67 | 94 986 | 0.005 | |
Two-by-two tables showing distributions of exposure, outcome, and disease that defines case status in the reconstructed noncases and in the reconstructed source population.
Abbreviations: D, case-defining illness; E, exposure; O, outcome; rNC, reconstructed noncases; RR, relative risk.
(a) rNC + cohort of cases.
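The reconstruction can be traced step by step. The following is a minimal sketch, assuming the population size and case count given in the Figure 1 caption (100 000 individuals, 2400 cases); the variable names are illustrative and not taken from the Supplementary Appendix.

```python
# Counts taken from the Figure 1 caption.
population = 100_000
n_cases = 2_400
n_noncases = population - n_cases  # 97 600 individuals without the case disease

# Sampling weight: reciprocal of the case disease incidence odds.
disease_odds = n_cases / n_noncases
sampling_weight = 1 / disease_odds  # about 40.67

# Cells are (E+ O+, E+ O-, E- O+, E- O-), copied from the cohort tables.
coca = (25, 475, 43, 1857)
coco = (1, 110, 11, 2278)

# Reconstructed noncases (rNC): each control stands in for
# `sampling_weight` noncases in the source population.
rnc = tuple(n * sampling_weight for n in coco)

# Reconstructed population: rNC plus the cohort of cases.
reconstructed = tuple(r + c for r, c in zip(rnc, coca))

def relative_risk(cells):
    a, b, c, d = cells
    return (a / (a + b)) / (c / (c + d))

print(f"sampling weight = {sampling_weight:.2f}")
print(f"RR in rNC = {relative_risk(rnc):.1f}")
print(f"RR in reconstructed population = {relative_risk(reconstructed):.1f}")
print(f"reconstructed population size = {sum(reconstructed):.0f}")
```

Multiplying every cell by the same weight leaves the relative risk among noncases unchanged; what changes is how much the noncases count relative to the cases once the two groups are combined.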
Figure 3. Regression lines reflecting the relative risk for an outcome during follow-up for (A) the cohort of cases (CoCa) + the cohort of controls (CoCo) and (B) the reconstructed population (CoCa + noncases that have been reconstructed from the CoCo × sampling weight [rNC]). The data underlying each line correspond to the 2 × 2 tables in Table 1 and Table 2, so that (T1B) is the 2 × 2 table in row B of Table 1, and (T2A) is the 2 × 2 table in row A of Table 2. Notice that the change in weights, or individuals, between (A) and (B) alters the end-point positions, and thus the slope, of the middle line. (A) depicts the ill-advised approach of analyzing the combined CoCo and CoCa data (Table 1, row H). The area of each circle is proportional to the number of exposed (Exposure = 1) and unexposed (Exposure = 0) individuals in the CoCa and the CoCo. (B) depicts the reconstructed population method (Table 2, row B). The area of each circle is proportional to the number of exposed and unexposed individuals in the CoCa and the rNC. Abbreviations: CoCa, cohort of cases; CoCo, cohort of controls; rNC, noncases that have been reconstructed from the CoCo × sampling weight; RR, relative risk.
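The contrast between the two panels can be reproduced with a weighted regression on the grouped data. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes a log link with a Poisson-type working likelihood (the excerpt does not say which generalized linear model family was used), and it uses a small hand-written Newton solver, `fit_weighted_log_linear`, only to stay free of dependencies. It yields point estimates only; standard errors for weighted data generally need robust methods, which are not shown.

```python
import math

def fit_weighted_log_linear(rows, max_iter=50, tol=1e-12):
    """Fit log(risk) = b0 + b1 * exposure by Newton's method.

    rows: list of (exposure, outcome, weight) tuples.
    Returns (b0, b1); exp(b1) is the estimated relative risk.
    """
    total_w = sum(w for _, _, w in rows)
    total_y = sum(w * y for _, y, w in rows)
    b0, b1 = math.log(total_y / total_w), 0.0  # start at the overall risk
    for _ in range(max_iter):
        s0 = s1 = i00 = i01 = i11 = 0.0
        for x, y, w in rows:
            mu = math.exp(b0 + b1 * x)
            resid = w * (y - mu)
            s0 += resid            # score for the intercept
            s1 += resid * x        # score for the exposure coefficient
            i00 += w * mu          # information matrix entries
            i01 += w * mu * x
            i11 += w * mu * x * x
        det = i00 * i11 - i01 * i01
        step0 = (i11 * s0 - i01 * s1) / det
        step1 = (i00 * s1 - i01 * s0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1

def rows_from_cells(cells, weight):
    """Expand a 2 x 2 table (E+O+, E+O-, E-O+, E-O-) into weighted rows."""
    a, b, c, d = cells
    return [(1, 1, a * weight), (1, 0, b * weight),
            (0, 1, c * weight), (0, 0, d * weight)]

coca = (25, 475, 43, 1857)   # cohort of cases
coco = (1, 110, 11, 2278)    # cohort of controls
sampling_weight = 97_600 / 2_400  # reciprocal of the case disease incidence odds

# (A) unweighted pooling of the two cohorts.
pooled = rows_from_cells(coca, 1.0) + rows_from_cells(coco, 1.0)
# (B) reconstructed population: controls carry the sampling weight.
weighted = rows_from_cells(coca, 1.0) + rows_from_cells(coco, sampling_weight)

rr_pooled = math.exp(fit_weighted_log_linear(pooled)[1])
rr_weighted = math.exp(fit_weighted_log_linear(weighted)[1])
print(f"(A) pooled RR = {rr_pooled:.1f}")
print(f"(B) weighted RR = {rr_weighted:.1f}")
```

With a single binary exposure the model is saturated, so the fitted relative risks coincide with the ratios computed directly from the corresponding 2 × 2 tables; the regression form becomes useful once further covariates are added.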