Jay H Lubin, Joanne S Colt, David Camann, Scott Davis, James R Cerhan, Richard K Severson, Leslie Bernstein, Patricia Hartge.
Abstract
Quantitative measurements of environmental factors greatly improve the quality of epidemiologic studies but can pose challenges because of the presence of upper or lower detection limits or interfering compounds, which do not allow for precise measured values. We consider the regression of an environmental measurement (dependent variable) on several covariates (independent variables). Various strategies are commonly employed to impute values for interval-measured data, including assignment of one-half the detection limit to nondetected values or of "fill-in" values randomly selected from an appropriate distribution. On the basis of a limited simulation study, we found that the former approach can be biased unless the percentage of measurements below detection limits is small (5-10%). The fill-in approach generally produces unbiased parameter estimates but may produce biased variance estimates and thereby distort inference when 30% or more of the data are below detection limits. Truncated data methods (e.g., Tobit regression) and multiple imputation offer two unbiased approaches for analyzing measurement data with detection limits. If interest resides solely on regression parameters, then Tobit regression can be used. If individualized values for measurements below detection limits are needed for additional analysis, such as relative risk regression or graphical display, then multiple imputation produces unbiased estimates and nominal confidence intervals unless the proportion of missing data is extreme. We illustrate various approaches using measurements of pesticide residues in carpet dust in control subjects from a case-control study of non-Hodgkin lymphoma.
Year: 2004 PMID: 15579415 PMCID: PMC1253661 DOI: 10.1289/ehp.7199
Source DB: PubMed Journal: Environ Health Perspect ISSN: 0091-6765 Impact factor: 9.031
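The abstract contrasts substituting one-half the detection limit with truncated-data (Tobit) estimation. As a minimal sketch of that contrast, the Python below fits an intercept-only censored-normal model to log-scale data by EM, which is one way to reach the Tobit-type maximum likelihood estimate, and computes the half-DL substitution mean alongside it. The function name `censored_normal_mle`, the simulated data, the seed, and the single detection limit are illustrative assumptions and are not taken from the paper; the study's regressions also include covariates, which this sketch omits.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal, used for pdf/cdf


def censored_normal_mle(detected, limits, iters=200):
    """EM estimate of (mu, sigma) for log-scale data left-censored at known limits.

    detected: log measurements at or above their detection limit.
    limits:   log detection limits of the nondetected measurements.
    """
    n = len(detected) + len(limits)
    # Crude starting values from the detected measurements alone.
    mu = sum(detected) / len(detected)
    sigma = math.sqrt(sum((z - mu) ** 2 for z in detected) / len(detected)) or 1.0
    for _ in range(iters):
        # E-step: mean and variance of each nondetect given that it lies below its limit.
        moments = []
        for d in limits:
            a = (d - mu) / sigma
            lam = STD.pdf(a) / max(STD.cdf(a), 1e-300)
            moments.append((mu - sigma * lam, sigma ** 2 * (1.0 - a * lam - lam ** 2)))
        # M-step: update mu and sigma from the expected sufficient statistics.
        new_mu = (sum(detected) + sum(m for m, _ in moments)) / n
        ss = sum((z - new_mu) ** 2 for z in detected)
        ss += sum(v + (m - new_mu) ** 2 for m, v in moments)
        mu, sigma = new_mu, math.sqrt(ss / n)
    return mu, sigma


# Hypothetical data: log concentrations ~ N(0, 1), with a log DL of 0 (about 50% nondetects).
random.seed(1)
log_values = [random.gauss(0.0, 1.0) for _ in range(1000)]
log_dl = 0.0
detected = [z for z in log_values if z >= log_dl]
limits = [log_dl for z in log_values if z < log_dl]

mu_hat, sigma_hat = censored_normal_mle(detected, limits)

# Half-DL substitution on the log scale: log(DL / 2) = log(DL) - log(2).
substituted = detected + [d - math.log(2.0) for d in limits]
mu_half = sum(substituted) / len(substituted)

print(f"censored MLE: mu={mu_hat:.3f}, sigma={sigma_hat:.3f}; half-DL mean={mu_half:.3f}")
```

With covariates, the same censored likelihood would be maximized over regression coefficients rather than a single mean; the E-step and M-step keep the same form.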
Percentage of measurements below detection limits (DLs) or known only within bounds, with arithmetic means (AMs), geometric means (GMs), and geometric standard deviations (GSDs) based on fill-in values from a single imputation (data on 478 control subjects).
| Insecticide | Type I percent | Type I range | Type II percent | Type II range | Type III percent | Type III range | Dust AM (ng/g) | Dust GM (ng/g) | Dust GSD |
|---|---|---|---|---|---|---|---|---|---|
| Propoxur | 21.1 | (0–46.0) | 2.9 | (0–65.0) | 1.7 | (21.1–75.7) | 456.6 | 65.6 | 6.0 |
| Carbaryl | 37.9 | (0–260.0) | 11.1 | (0–268) | 18.0 | (20.7–694.8) | 1503.0 | 64.0 | 14.0 |
| Chlorpyrifos | 28.2 | (0–77.4) | 0.2 | (0–20.9) | 0.0 | — | 893.1 | 105.6 | 8.3 |
| α-Chlordane | 60.9 | (0–44.7) | 0.0 | — | 0.4 | (20.8–29.1) | 90.7 | 11.6 | 8.0 |
Types of missing measurements are as follows: no analyte detected and no interfering compound (I), no analyte detected but with an interfering compound present (II), and analyte and interfering compounds both present (III). The range for the DLs reflects the minimum of the lower bounds (LBs) and the maximum of the upper bounds (UBs) for the nondetected measurements.
Figure 1. Plots under a log-normal distribution of quantiles of environmental measurements of (A) propoxur and (B) carbaryl, and of regression residuals of measurements (Z) and predicted values (ZPred) after accounting for covariates for (C) propoxur and (D) carbaryl. The AMs, GMs, and GSDs are computed from imputed data. (A) AM = 456.6; GM = 65.6; GSD = 6.0. (B) AM = 1503.0; GM = 64.0; GSD = 14.0. (C) AM = 3.5; GM = 0.9; GSD = 2.0. (D) AM = 15.1; GM = 0.9; GSD = 2.6.
Proportional increase in analyte concentration in carpet dust (ng/g) for selected uses.
| Insecticide, imputation approach | Adjustment | Crawling insects: exp(β) | Crawling insects: SE | Flying insects: exp(β) | Flying insects: SE | Fleas/ticks: exp(β) | Fleas/ticks: SE | Termites: exp(β) | Termites: SE | Lawn/garden insects: exp(β) | Lawn/garden insects: SE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Propoxur | |||||||||||
| DL/2 | No | 1.426 | 1.167 | 0.987 | 1.144 | 1.231 | 1.153 | 1.145 | 1.219 | 0.756 | 1.151 |
| | No | 1.432 | 1.170 | 0.986 | 1.147 | 1.231 | 1.156 | 1.135 | 1.223 | 0.751 | 1.154 |
| Fill-in | No | 1.459 | 1.189 | 0.966 | 1.163 | 1.225 | 1.173 | 1.072 | 1.249 | 0.737 | 1.171 |
| Fill-in | Yes | 1.511 | 1.182 | 1.030 | 1.157 | 1.251 | 1.166 | 1.209 | 1.239 | 0.687 | 1.165 |
| Multiple impute | Yes | 1.487 | 1.196 | 1.016 | 1.165 | 1.247 | 1.170 | 1.082 | 1.244 | 0.704 | 1.173 |
| Direct estimate | Yes | 1.503 | 1.276 | 0.994 | 1.235 | 1.245 | 1.250 | 1.090 | 1.363 | 0.714 | 1.249 |
| Carbaryl | |||||||||||
| DL/2 | No | 0.853 | 1.201 | 0.661 | 1.173 | 1.560 | 1.185 | 1.129 | 1.266 | 1.660 | 1.183 |
| | No | 0.849 | 1.226 | 0.629 | 1.194 | 1.703 | 1.208 | 1.199 | 1.300 | 1.746 | 1.205 |
| Fill-in | No | 0.830 | 1.311 | 0.591 | 1.265 | 1.812 | 1.285 | 1.486 | 1.417 | 1.735 | 1.282 |
| Fill-in | Yes | 0.940 | 1.274 | 0.432 | 1.235 | 2.337 | 1.252 | 1.538 | 1.366 | 1.779 | 1.249 |
| Multiple impute | Yes | 0.826 | 1.338 | 0.508 | 1.272 | 2.047 | 1.313 | 1.326 | 1.490 | 1.950 | 1.351 |
| Direct estimate | Yes | 0.785 | 1.499 | 0.512 | 1.413 | 2.180 | 1.452 | 1.281 | 1.651 | 2.115 | 1.444 |
| Chlorpyrifos | |||||||||||
| DL/2 | No | 1.578 | 1.209 | 0.779 | 1.181 | 1.264 | 1.182 | 1.581 | 1.276 | 0.759 | 1.188 |
| | No | 1.620 | 1.218 | 0.771 | 1.188 | 1.300 | 1.190 | 1.613 | 1.288 | 0.746 | 1.196 |
| Fill-in | No | 1.917 | 1.243 | 0.757 | 1.210 | 1.389 | 1.212 | 1.669 | 1.322 | 0.713 | 1.219 |
| Fill-in | Yes | 1.745 | 1.244 | 0.740 | 1.210 | 1.383 | 1.212 | 1.631 | 1.323 | 0.731 | 1.219 |
| Multiple impute | Yes | 1.770 | 1.252 | 0.763 | 1.223 | 1.401 | 1.223 | 1.689 | 1.336 | 0.708 | 1.234 |
| Direct estimate | Yes | 1.796 | 1.378 | 0.740 | 1.323 | 1.392 | 1.327 | 1.698 | 1.492 | 0.702 | 1.338 |
| α-Chlordane | |||||||||||
| DL/2 | No | 0.966 | 1.129 | 0.938 | 1.112 | 0.910 | 1.118 | 2.626 | 1.168 | 1.091 | 1.117 |
| | No | 0.954 | 1.153 | 0.925 | 1.132 | 0.894 | 1.140 | 3.031 | 1.199 | 1.110 | 1.138 |
| Fill-in | No | 1.060 | 1.230 | 0.828 | 1.198 | 0.868 | 1.210 | 3.110 | 1.303 | 1.079 | 1.208 |
| Fill-in | Yes | 0.762 | 1.206 | 0.927 | 1.177 | 0.908 | 1.188 | 3.898 | 1.271 | 1.293 | 1.186 |
| Multiple impute | Yes | 0.852 | 1.363 | 0.915 | 1.235 | 0.804 | 1.202 | 3.686 | 1.290 | 1.169 | 1.270 |
| Direct estimate | Yes | 0.858 | 1.379 | 0.919 | 1.316 | 0.803 | 1.339 | 3.666 | 1.442 | 1.211 | 1.334 |
Entries are exponentials of parameter estimates (β) and their SEs from linear regression models of the logarithm of pesticide analyte on age, sex, race, education, study site, season, and pesticide use variables. Regression models also included year house was built (propoxur, carbaryl, α-chlordane), type of home (carbaryl), and presence of oriental rugs (α-chlordane).
See “Materials and Methods” for a description of methods; adjusted imputation includes regression variables.
95% CI excludes 1.
90% CI excludes 1.
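Because the table reports both the estimate and its SE on the exponentiated scale, an approximate confidence interval can be recovered by returning to the log scale. The sketch below assumes a normal-approximation (Wald) interval with z = 1.96; the paper's own intervals, especially for multiple imputation, may rest on a different reference distribution. The helper name `ratio_ci` is hypothetical.

```python
import math


def ratio_ci(exp_beta, exp_se, z=1.96):
    """Approximate CI for exp(beta), given exp(beta) and exp(SE) as tabulated."""
    beta = math.log(exp_beta)  # back to the log scale
    se = math.log(exp_se)
    return math.exp(beta - z * se), math.exp(beta + z * se)


# alpha-Chlordane, termite treatment, direct estimate: exp(beta) = 3.666, exp(SE) = 1.442.
low, high = ratio_ci(3.666, 1.442)
print(f"approximate 95% CI: ({low:.2f}, {high:.2f})")

# Propoxur, flying insects, DL/2: exp(beta) = 0.987, exp(SE) = 1.144.
low2, high2 = ratio_ci(0.987, 1.144)
print(f"approximate 95% CI: ({low2:.2f}, {high2:.2f})")
```

Under these assumptions the first interval lies entirely above 1 and the second straddles 1.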
Results of simulation study of imputation approaches for log-normally distributed data with μ = 0 and σ² = 1 with a DL (entries are means of 5,000 repetitions).
| Sample size (no.) | Percent < DL | Complete data | Tobit analysis | Multi-impute using (μ̃, σ̃²) | Single impute using (μ̂, σ̂²) | Insert DL/2 | Insert E[Z \| Z < DL] |
|---|---|---|---|---|---|---|---|
| 50 | |||||||
| Estimate of μ | 10.0 | 0.002 | 0.000 | −0.003 | −0.003 | −0.020 | 0.007 |
| | 30.0 | 0.002 | −0.003 | −0.003 | −0.004 | −0.017 | 0.032 |
| | 50.0 | 0.002 | −0.004 | −0.003 | −0.003 | 0.052 | 0.073 |
| | 70.0 | 0.002 | −0.006 | −0.005 | −0.002 | 0.229 | 0.143 |
| Coverage of 95% CI | 10.0 | 0.947 | 0.944 | 0.943 | 0.943 | 0.943 | 0.942 |
| | 30.0 | 0.947 | 0.949 | 0.938 | 0.928 | 0.942 | 0.928 |
| | 50.0 | 0.947 | 0.953 | 0.928 | 0.876 | 0.938 | 0.832 |
| | 70.0 | 0.947 | 0.931 | 0.895 | 0.707 | 0.280 | 0.520 |
| 100 | |||||||
| Estimate of μ | 10.0 | 0.003 | 0.002 | 0.000 | 0.000 | −0.019 | 0.009 |
| | 30.0 | 0.003 | 0.001 | 0.000 | 0.000 | −0.015 | 0.034 |
| | 50.0 | 0.003 | 0.000 | 0.000 | −0.001 | 0.055 | 0.076 |
| | 70.0 | 0.003 | −0.006 | −0.004 | −0.002 | 0.232 | 0.142 |
| Coverage of 95% CI | 10.0 | 0.944 | 0.945 | 0.940 | 0.940 | 0.943 | 0.942 |
| | 30.0 | 0.944 | 0.949 | 0.938 | 0.929 | 0.942 | 0.914 |
| | 50.0 | 0.944 | 0.948 | 0.922 | 0.870 | 0.910 | 0.781 |
| | 70.0 | 0.944 | 0.940 | 0.904 | 0.721 | 0.036 | 0.440 |
| 200 | |||||||
| Estimate of μ | 10.0 | −0.001 | −0.002 | −0.002 | −0.002 | −0.023 | 0.006 |
| | 30.0 | −0.001 | −0.003 | −0.003 | −0.003 | −0.019 | 0.031 |
| | 50.0 | −0.001 | −0.002 | −0.002 | −0.002 | 0.052 | 0.074 |
| | 70.0 | −0.001 | −0.003 | −0.001 | −0.002 | 0.229 | 0.142 |
| Coverage of 95% CI | 10.0 | 0.952 | 0.950 | 0.951 | 0.950 | 0.941 | 0.946 |
| | 30.0 | 0.952 | 0.955 | 0.936 | 0.926 | 0.940 | 0.904 |
| | 50.0 | 0.952 | 0.948 | 0.925 | 0.874 | 0.877 | 0.708 |
| | 70.0 | 0.952 | 0.947 | 0.914 | 0.725 | 0.000 | 0.306 |
| 400 | |||||||
| Estimate of μ | 10.0 | 0.001 | 0.001 | 0.001 | 0.001 | −0.021 | 0.008 |
| | 30.0 | 0.001 | 0.000 | 0.000 | 0.000 | −0.017 | 0.034 |
| | 50.0 | 0.001 | 0.001 | 0.001 | 0.001 | 0.053 | 0.076 |
| | 70.0 | 0.001 | 0.000 | 0.000 | 0.000 | 0.230 | 0.144 |
| Coverage of 95% CI | 10.0 | 0.954 | 0.954 | 0.952 | 0.951 | 0.931 | 0.949 |
| | 30.0 | 0.954 | 0.948 | 0.938 | 0.928 | 0.941 | 0.874 |
| | 50.0 | 0.954 | 0.954 | 0.927 | 0.880 | 0.776 | 0.545 |
| | 70.0 | 0.954 | 0.947 | 0.914 | 0.723 | 0.000 | 0.128 |
Columns compare parameter estimation using observed data with DLs (Tobit analysis); multiple imputation with allowance for uncertainty in model parameters, using (μ̃, σ̃²); a single imputation using (μ̂, σ̂²); insertion of DL/2; and insertion of the expected value conditional on being below the DL, E[Z | Z < DL].
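The systematic error of the two insertion rules can be checked without simulation. The sketch below computes the large-sample mean of the log-scale data after insertion, for measurements Z with log Z distributed N(0, 1). It rests on assumptions not stated in the table: the true parameters are treated as known, and E[Z | Z < DL] is read as the conditional mean on the concentration scale, which is then logged. The function name `limiting_log_mean` and the rule labels are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # distribution of log Z under mu = 0, sigma^2 = 1


def limiting_log_mean(p_below, rule):
    """Large-sample mean of log Z after replacing values below the DL by a fixed insert."""
    log_dl = STD.inv_cdf(p_below)  # log DL that leaves a fraction p_below undetected
    if rule == "half_dl":
        inserted = log_dl - math.log(2.0)  # log(DL / 2)
    elif rule == "cond_mean":
        # E[Z | Z < DL] = exp(1/2) * Phi(log_dl - 1) / Phi(log_dl) for log Z ~ N(0, 1)
        inserted = 0.5 + math.log(STD.cdf(log_dl - 1.0) / p_below)
    else:
        raise ValueError(f"unknown rule: {rule}")
    # Detected values contribute E[log Z; log Z >= log_dl] = phi(log_dl).
    return p_below * inserted + STD.pdf(log_dl)


for pct in (10, 30, 50, 70):
    p = pct / 100.0
    half = limiting_log_mean(p, "half_dl")
    cond = limiting_log_mean(p, "cond_mean")
    print(f"{pct}% below DL: insert DL/2 -> {half:+.3f}, insert E[Z | Z < DL] -> {cond:+.3f}")
```

Since the true μ is 0, each printed value is the limiting bias; under the stated assumptions these land close to the simulated means in the two "Insert" columns and do not shrink with sample size, which is consistent with the coverage of those columns falling as n grows.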