Karyn Heavner, Igor Burstyn.
Abstract
Variation in the odds ratio (OR) resulting from selection of cutoffs for categorizing continuous variables is rarely discussed. We present results for the effect of varying cutoffs used to categorize a mismeasured exposure in a simulated population in the context of autism spectrum disorders research. Simulated cohorts were created with three distinct exposure-outcome curves and three measurement error variances for the exposure. ORs were calculated using logistic regression for 61 cutoffs (mean ± 3 standard deviations) used to dichotomize the observed exposure. ORs were also calculated for five exposure categories across a wide range of cutoffs. For each scenario and cutoff, the OR, sensitivity, and specificity were calculated. The three exposure-outcome relationships had distinctly shaped OR (versus cutoff) curves, but increasing measurement error obscured the shape. At extreme cutoffs, there was non-monotonic oscillation in the ORs that cannot be attributed to "small numbers." Exposure misclassification following categorization of the mismeasured exposure was differential, as predicted by theory. Sensitivity was higher among cases and specificity among controls. Cutoffs chosen for categorizing continuous variables can have profound effects on study results. When measurement error is not too great, the shape of the OR curve may provide insight into the true shape of the exposure-disease relationship.
Keywords: autism spectrum disorders; categorization; dichotomization; epidemiology methods; misclassification
Year: 2015 PMID: 26305250 PMCID: PMC4555337 DOI: 10.3390/ijerph120810198
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
True and observed values in the simulated population.
| Values in the Population (Notation) | True Values | Observed Values | Measurement Error | Postulated True Association with Latent Measure of Outcome ᵃ | Cutoffs Range (Increments) |
|---|---|---|---|---|---|
| Environmental exposure 1 (X1) | X1 ~ N(0, 1), correlated with X2 by Pearson correlation ρ = 0.7 | W1 = X1 + ε1 | ε1 ~ N(0, σ²), where σ² ∈ {0.0625, 0.25, 1} | {0.15, 0.25, 0.5} | −3 to 3 (in increments of 0.1) |
| Environmental exposure 2 (X2) | X2 ~ N(0, 1), correlated with X1 by Pearson correlation ρ = 0.7 | W2 = X2 + ε2 | ε2 ~ N(0, 0.25) | 0 | <1 |
| Sex (Z) | Z ~ Binomial(0.5, 1) | Z | None | 1 | |
| Gestational age (Xga) | Xga = (43 − γ), where γ ~ χ²(3) | Wga = R((Xga + εga); 23, 43) | | 0.1 | <37 |
| Autism endophenotype (latent, Y) | Linear model: Y_Linear = β1X1 + β2X2 + β3Z + β4Xga + εy. Threshold model: if x1 < mean(x1) − SD(x1), then Y_Threshold = β2X2 + β3Z + β4Xga + εy; if x1 ≥ mean(x1) − SD(x1), then Y_Threshold = 1.5 × β1X1 + β2X2 + β3Z + β4Xga + εy. Saturation model: if x1 < mean(x1) − SD(x1), then Y_Saturation = 1.5 × β1X1 + β2X2 + β3Z + β4Xga + εy; if x1 ≥ mean(x1) − SD(x1), then Y_Saturation = 0.5 × β1X1 + β2X2 + β3Z + β4Xga + εy. In all models, εy ~ N(0, 1) | Y ᵇ = R(T(y); 0, 18) | Due to rounding by R(.) ᵇ | Not applicable | 0–6, 7–18 |
ᵃ Coefficients of linear regression (β's); see text and bottom of the table for details. ᵇ R(f(.); min, max) is the function that rounds values of function f(.) to integers and truncates them, retaining only values that fall within the interval [min, max].
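The data-generating process in the table can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function name `simulate_population`, the cohort size, the seed, and the default β1 and σ² are hypothetical choices; β2 = 0, β3 = 1, and β4 = 0.1 follow one reading of the association column; and the sketch stops at the latent outcome Y, omitting the observed gestational age (Wga) and the observed outcome R(T(y); 0, 18) because their error term and the transformation T are not specified in this excerpt.

```python
import numpy as np


def simulate_population(n=100_000, beta1=0.25, sigma2=0.25, shape="linear", seed=0):
    """Illustrative sketch of the simulated cohort (names and defaults are assumptions)."""
    rng = np.random.default_rng(seed)

    # True exposures X1 and X2: standard normal with Pearson correlation 0.7.
    cov = np.array([[1.0, 0.7], [0.7, 1.0]])
    x1, x2 = rng.multivariate_normal(np.zeros(2), cov, size=n).T

    # Observed exposures: additive error, W = X + epsilon.
    w1 = x1 + rng.normal(0.0, np.sqrt(sigma2), size=n)
    w2 = x2 + rng.normal(0.0, np.sqrt(0.25), size=n)

    z = rng.binomial(1, 0.5, size=n)        # sex: a single trial with p = 0.5
    x_ga = 43.0 - rng.chisquare(3, size=n)  # gestational age: 43 - chi-square(3)

    # Multiplier applied to beta1 * X1 on each side of the inflection point,
    # taken here as the sample mean(X1) - SD(X1).
    inflection = x1.mean() - x1.std()
    if shape == "linear":
        mult = np.ones(n)
    elif shape == "threshold":
        mult = np.where(x1 < inflection, 0.0, 1.5)
    elif shape == "saturation":
        mult = np.where(x1 < inflection, 1.5, 0.5)
    else:
        raise ValueError(f"unknown shape: {shape!r}")

    # Assumed reading of the association column: beta2 = 0, beta3 = 1, beta4 = 0.1.
    beta2, beta3, beta4 = 0.0, 1.0, 0.1
    eps_y = rng.normal(0.0, 1.0, size=n)
    y = mult * beta1 * x1 + beta2 * x2 + beta3 * z + beta4 * x_ga + eps_y

    return {"x1": x1, "x2": x2, "w1": w1, "w2": w2, "z": z, "x_ga": x_ga, "y": y}
```

Calling `simulate_population(shape="threshold", sigma2=1.0)` would give one of the nine shape-by-error-variance combinations described in the abstract for a single value of β1.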
Figure 1. Descriptions of four cutoffs and five categories for W1.
Figure 2. Odds ratios (ORs) for different cutoffs (between mean ± 3 standard deviations) used to dichotomize a mismeasured exposure (W1).
Figure A1. Odds ratios for different cutoffs (between mean ± 3 standard deviations) used to dichotomize a mismeasured exposure (W1) (all scenarios for Figure 2). The X1 cutoff was changed from −3.0 to 3.0 in increments of 0.1. Exposed: W1 ≥ cutoff; referent: W1 < cutoff. The vertical reference line for the semi-linear models is the inputted inflection point.
Figure 3. Odds ratios (OR) for different cutoffs used to create 5 categories for the causal yet mismeasured exposure (W1) (δ is identified as “W1 delta” in the figure). The 4 W1 cutoffs were centered at 0 with δ from 0.5 to 1.5 standard deviations, in increments of 0.1. Reference category for W1 is mean ± (0.5 × δ × standard deviation).
Figure A2. Odds ratios (OR) for different cutoffs used to create 5 categories for the causal yet mismeasured exposure (W1) (all scenarios for Figure 3) (δ is identified as “W1 delta” in the figure). The 4 W1 cutoffs were centered at 0 with δ from 0.5 to 1.5 standard deviations, in increments of 0.1. Reference category for W1 is mean ± (0.5 × δ × standard deviation).
Figure A3. Odds ratios (OR) for different cutoffs used to create 5 categories for mismeasured exposure (W1), 101 cutoffs, with the true “weak” association. The 4 W1 cutoffs were centered at 0 with δ from 0.5 to 1.5 standard deviations, in increments of 0.1. Reference category for W1 is mean ± (0.5 × δ × standard deviation).
Figure 4. Sensitivity and specificity at each cutoff for the mismeasured exposure (W1).
Figure A4. Sensitivity and specificity at each cutoff for W1 (all scenarios for Figure 4). Cases: black; Controls: grey; Sensitivity: solid lines; Specificity: dotted lines.
Figure A5. ROC curves for selected cutoffs for W1.
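The quantities plotted in Figures 2 and 4 (OR, sensitivity, and specificity at each of the 61 cutoffs) can be illustrated with a small Python sketch. It rests on several assumptions that are not taken from the source: the helper names `safe_ratio` and `sweep_cutoffs` are hypothetical; "true" exposure status is defined by dichotomizing X1 at the same cutoff as W1; cases are taken as the top 5% of a simplified latent outcome; and the crude 2×2 odds ratio stands in for logistic regression (the two agree only for an unadjusted model with a single binary predictor).

```python
import numpy as np


def safe_ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den > 0 else float("nan")


def sweep_cutoffs(x_true, w_obs, is_case, cutoffs):
    """For each cutoff, dichotomize W and report the crude OR plus
    sensitivity and specificity computed separately in cases and controls."""
    rows = []
    for c in cutoffs:
        exp_true = x_true >= c  # assumed definition of true exposure status
        exp_obs = w_obs >= c    # exposed: W >= cutoff; referent: W < cutoff

        # Cells of the 2x2 table of observed exposure by case status.
        a = np.sum(exp_obs & is_case)
        b = np.sum(exp_obs & ~is_case)
        c_ = np.sum(~exp_obs & is_case)
        d = np.sum(~exp_obs & ~is_case)

        row = {"cutoff": float(c), "or": safe_ratio(a * d, b * c_)}
        for label, grp in (("cases", is_case), ("controls", ~is_case)):
            row[f"sens_{label}"] = safe_ratio(
                np.sum(exp_obs & exp_true & grp), np.sum(exp_true & grp))
            row[f"spec_{label}"] = safe_ratio(
                np.sum(~exp_obs & ~exp_true & grp), np.sum(~exp_true & grp))
        rows.append(row)
    return rows


# Simplified demonstration data (illustrative values, not the paper's scenarios).
rng = np.random.default_rng(1)
n = 200_000
x1 = rng.normal(size=n)
w1 = x1 + rng.normal(0.0, 0.5, size=n)   # measurement error variance 0.25
y = 0.5 * x1 + rng.normal(size=n)        # linear exposure-outcome relationship
is_case = y >= np.quantile(y, 0.95)      # hypothetical case definition

cutoffs = np.linspace(-3.0, 3.0, 61)     # mean +/- 3 SD in steps of 0.1
table = sweep_cutoffs(x1, w1, is_case, cutoffs)
```

Comparing `sens_cases` with `sens_controls` (and likewise for specificity) across rows of `table` is one way to see the differential misclassification described in the abstract; rows at the extreme cutoffs rest on few exposed or unexposed subjects and may contain NaN.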