Joshua P Keller, Maggie L Clark.
Abstract
Estimating long-term exposure to household air pollution is essential for quantifying health effects of chronic exposure and the benefits of intervention strategies. However, typically only a small number of short-term measurements are made. We compare different statistical models for combining these short-term measurements into predictions of a long-term average, with emphasis on the impact of temporal trends in concentrations and crossover in study design. We demonstrate that a linear mixed model that includes time adjustment provides the best predictions of the long-term average, which have lower error than using household averages or mixed models without time, for a variety of different study designs and underlying temporal trends. In a case study of a cookstove intervention study in Honduras, we further demonstrate how, in the presence of strong seasonal variation, long-term average predictions from the mixed model approach based on only two or three measurements can have less error than predictions based on an average of up to six measurements. These results have important implications for the efficiency of designs and analyses in studies assessing the chronic health impacts of long-term exposure to household air pollution.
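To make the comparison in the abstract concrete, the following is a minimal simulation sketch, not the authors' method. It stands in for the mixed model with time adjustment using a two-step approximation: a pooled least-squares fit of one annual harmonic, followed by method-of-moments shrinkage of the seasonally adjusted household means. All function names and parameter values (300 households, two visits, a sinusoidal trend, the variance settings) are hypothetical choices made for illustration.

```python
import math
import random

W = 2.0 * math.pi / 12.0  # angular frequency of an annual cycle, time in months


def simulate(n_households=300, n_visits=2, amplitude=1.0,
             sd_household=0.5, sd_error=0.3, seed=1):
    """Simulate log-scale measurements with a seasonal trend (hypothetical values)."""
    rng = random.Random(seed)
    mean_level = 4.0
    truth, data = [], []
    for _ in range(n_households):
        b = rng.gauss(0.0, sd_household)   # household random intercept
        start = rng.uniform(0.0, 12.0)     # staggered first visit within year one
        visits = []
        for k in range(n_visits):
            t = start + 3.0 * k            # visits spaced 3 months apart
            season = amplitude * math.sin(W * t)
            visits.append((t, mean_level + b + season + rng.gauss(0.0, sd_error)))
        data.append(visits)
        truth.append(mean_level + b)       # the sinusoid averages to zero over a year
    return truth, data


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in reversed(range(3)):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x


def household_average(data):
    """Plain mean of each household's measurements, ignoring time."""
    return [sum(y for _, y in visits) / len(visits) for visits in data]


def time_adjusted_shrinkage(data):
    """Approximate a time-adjusted random-intercept prediction in two steps."""
    # Step 1: pooled least squares for an intercept plus one annual harmonic.
    rows = [(1.0, math.sin(W * t), math.cos(W * t), y)
            for visits in data for t, y in visits]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * r[3] for r in rows) for i in range(3)]
    b0, b_sin, b_cos = solve3(xtx, xty)

    # Step 2: remove the fitted seasonal component, then average within household.
    adjusted = [[y - b_sin * math.sin(W * t) - b_cos * math.cos(W * t)
                 for t, y in visits] for visits in data]
    means = [sum(v) / len(v) for v in adjusted]

    # Step 3: method-of-moments variance components.
    # Assumes every household has the same number of visits, k >= 2.
    n, k = len(adjusted), len(adjusted[0])
    var_within = sum((y - m) ** 2 for v, m in zip(adjusted, means) for y in v) / (n * (k - 1))
    var_means = sum((m - b0) ** 2 for m in means) / (n - 1)
    var_between = max(var_means - var_within / k, 0.0)
    weight = var_between / (var_between + var_within / k)

    # Step 4: shrink each adjusted household mean toward the overall level.
    return [b0 + weight * (m - b0) for m in means]


def rmse(pred, truth):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(pred, truth)) / len(truth))


truth, data = simulate()
print("household average RMSE:", round(rmse(household_average(data), truth), 3))
print("time-adjusted RMSE:    ", round(rmse(time_adjusted_shrinkage(data), truth), 3))
```

With a strong seasonal trend and staggered visits, the household average inherits whatever season the visits happened to fall in, while the adjusted prediction removes that component before averaging. A full analysis would fit the mixed model directly (for example with a spline in time) rather than use this approximation.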
Year: 2021; PMID: 35169666; PMCID: PMC8835562; DOI: 10.1097/EE9.0000000000000188
Source DB: PubMed; Journal: Environ Epidemiol; ISSN: 2474-7882
Summary of sampling strategies (i.e., designs) in the simulation.
| Design | Study type | Study duration | Measurements per household | Description |
|---|---|---|---|---|
| Design 1 | Parallel | 4–16 months | 2–6 | Measurements are spaced 3 months apart (i.e., all participants have a visit in each of months 1, 4, 7, etc.). The time for each visit is randomly selected within a 1-month window. |
| Design 2 | Parallel | 16–28 months | 2–6 | Measurements are spaced 3 months apart, but the initial visit is randomly chosen within the first year (therefore, all months during the year contain visits). The time for each visit is randomly selected within a 1-month window. |
| Design 3 | Parallel | 12 months | 4 | Measurements are made in months 1, 3, 6, and 12. The time for each visit is randomly selected within a 5-week window. |
| Design 4 | Parallel | 30 months | 4 | Measurements are made in months 1, 3, 6, and 12, except that the initial visit for each household is randomly chosen within the first 18 months. The time for each visit is randomly selected within a 5-week window. |
| Design 5 | Parallel | 36 months | 6 | Measurements are made in months 1, 3, 6, 9, 12, and 18, except that the initial visit for each household is randomly chosen within the first 18 months. The time for each visit is randomly selected within a 5-week window. |
| Design 6 | Stepped-wedge | 16 months | 6 | Either two or four measurements in each stove group. Measurements are made every 3 months (i.e., months 1, 4, 7, 10, 13, and 16). The time for each visit is randomly selected within a 1-month window. |
| Design 7 | Stepped-wedge | 28 months | 6 | Either two or four measurements in each stove group. Measurements are made every 3 months (i.e., months 1, 4, 7, 10, 13, and 16), except that the initial visit is randomly chosen within the first year. The time for each visit is randomly selected within a 1-month window. |
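The visit schedules described for Designs 1 and 2 can be sketched in a few lines. This is an illustrative reading of the table, not the authors' simulation code: the function name `visit_times`, the 0-based month indexing, and the uniform draw within each window are assumptions.

```python
import random


def visit_times(n_visits, staggered, rng, spacing_months=3, window_months=1.0):
    """Return visit times (months since study start) for one household.

    staggered=False mimics Design 1: every household starts in month 1.
    staggered=True mimics Design 2: the first-visit month is drawn from year one.
    """
    first_month = rng.randrange(12) if staggered else 0  # 0-based month index
    # Each visit falls at a random point within its 1-month window (assumed uniform).
    return [first_month + spacing_months * k + rng.random() * window_months
            for k in range(n_visits)]


rng = random.Random(0)
design1 = [visit_times(4, staggered=False, rng=rng) for _ in range(3)]
design2 = [visit_times(4, staggered=True, rng=rng) for _ in range(3)]
print(design1)
print(design2)
```

Under the staggered option, visits across households cover every calendar month, which is what allows a seasonal trend to be estimated from only a few measurements per household.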
Figure 1. Temporal trend functions used in the simulations. Equations for each trend are provided in eTable 1; http://links.lww.com/EE/A169.
Figure 2. Root mean squared error (RMSE) of predicted long-term averages from simulation with design 1 (parallel design with measurements every 3 months, all households with baseline measurements in the same month) and Trend A (no temporal trend). HH, household; MM, mixed model.
Figure 3. Relative root mean squared error (RMSE) of predicted long-term averages, comparing a mixed model that ignores time to a mixed model that adjusts for time. A relative RMSE greater than 1 indicates that the model that adjusts for time performs better. Panels correspond to the different trends in Figure 1.
Figure 4. Relative root mean squared error (RMSE) of predictions from a mixed model without a time spline compared with predictions from a mixed model with a time spline. A relative RMSE greater than 1 indicates that the model that adjusts for time performs better.
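One reading of the relative RMSE used in Figures 3 and 4, consistent with the statement that values above 1 favor the time-adjusted model, is a ratio with the time-adjusted model in the denominator. The notation below ($\hat{\mu}_i$ for a predicted long-term average, $\mu_i$ for the value treated as truth, $n$ households) is assumed here and does not come from the source.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{\mu}_i - \mu_i\right)^2},
\qquad
\text{relative RMSE} =
  \frac{\mathrm{RMSE}_{\text{without time adjustment}}}
       {\mathrm{RMSE}_{\text{with time adjustment}}}
```

Under this reading, a relative RMSE of 1.25 would mean that ignoring time yields prediction error 25% higher than adjusting for it.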
Figure 5. Log-transformed personal PM2.5 measurements in the Honduras study. A smooth curve is shown separately by stove type.
Impact of different modeling choices and numbers of repeated measures on the accuracy of predicted long-term averages of personal PM2.5 exposure in Honduras data.
| Prediction type | Data | Correlation | RMSE |
|---|---|---|---|
| Mixed model | All (6 obs/hh) | (Reference) | (Reference) |
| | 5 obs/hh | 0.989 | 0.073 |
| | 4 obs/hh | 0.958 | 0.141 |
| | 2 obs/hh | 0.859 | 0.252 |
| Mixed model with no time spline | All (6 obs/hh) | 0.992 | 0.071 |
| | 5 obs/hh | 0.980 | 0.102 |
| | 4 obs/hh | 0.949 | 0.157 |
| | 2 obs/hh | 0.852 | 0.259 |
| Household average (by stove) | All (6 obs/hh) | 0.847 | 0.450 |
| | 5 obs/hh | 0.823 | 0.490 |
| | 4 obs/hh | 0.810 | 0.510 |
| | 2 obs/hh | 0.687 | 0.722 |
| Single observation | 1 obs/hh | 0.696 | 0.715 |
Correlations and root mean squared error (RMSE) are calculated using the prediction from the mixed model with all data as the truth.
obs/hh, observations per household.
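The two accuracy metrics in the table compare each set of predictions against the reference predictions from the mixed model fit to all data. A minimal sketch of that calculation follows; the household values are made up for illustration and the function names are hypothetical.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(pred, reference):
    """Root mean squared difference between predictions and the reference."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, reference)) / len(reference))


# Hypothetical predicted long-term averages (log scale) for five households.
reference = [3.9, 4.2, 4.6, 5.0, 5.3]  # stands in for the model fit to all observations
reduced = [4.0, 4.1, 4.7, 4.9, 5.4]    # stands in for the same model fit to a subset

print("correlation:", pearson_correlation(reduced, reference))
print("RMSE:", rmse(reduced, reference))
```

Because the reference is itself a model prediction rather than a measured long-term average, these metrics describe agreement with the full-data mixed model, not absolute accuracy.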
Impact of different modeling choices and numbers of repeated measures on the accuracy of predicted long-term averages of personal PM2.5 exposure in Honduras data, when restricting to traditional stoves only.
| Prediction type | Data | Correlation | RMSE |
|---|---|---|---|
| Mixed model | All (4 obs/hh) | (Reference) | (Reference) |
| | 3 obs/hh | 0.971 | 0.124 |
| | 2 obs/hh | 0.911 | 0.218 |
| Mixed model with no time spline | All (4 obs/hh) | 0.999 | 0.029 |
| | 3 obs/hh | 0.969 | 0.128 |
| | 2 obs/hh | 0.911 | 0.219 |
| Household average | All (4 obs/hh) | 0.992 | 0.208 |
| | 3 obs/hh | 0.962 | 0.291 |
| | 2 obs/hh | 0.908 | 0.390 |
| Single observation | 1 obs/hh | 0.782 | 0.600 |
obs/hh, observations per household.