Ireneous N Soyiri, Daniel D Reidpath.
Abstract
Forecasting higher than expected numbers of health events provides potentially valuable insights in its own right, and may contribute to health services management and syndromic surveillance. This study investigates the use of quantile regression (QR) to predict higher than expected respiratory deaths. Data taken from 70,830 deaths occurring in New York were used. Temporal, weather and air quality measures were fitted using quantile regression at the 90th percentile with half the data (in-sample). Four QR models were fitted: an unconditional model predicting the 90th percentile of deaths (Model 1), a seasonal/temporal model (Model 2), a seasonal/temporal model plus lags of weather and air quality (Model 3), and a seasonal/temporal model with 7-day moving averages of weather and air quality (Model 4). Models were cross-validated with the out-of-sample data. Performance was measured as the proportionate reduction in the weighted sum of absolute deviations achieved by a conditional model over the unconditional model, i.e., the coefficient of determination (R1). The coefficient of determination showed an improvement over the unconditional model of between 0.16 and 0.19. The greatest improvement in predictive accuracy of daily mortality (in-sample) was associated with the inclusion of seasonal and temporal predictors (Model 2). No gains were made in the predictive models with the addition of weather and air quality predictors (Models 3 and 4). However, forecasting models that included weather and air quality predictors performed slightly better than the seasonal and temporal model alone (i.e., Model 3 > Model 4 > Model 2). This study provided a new approach to predict higher than expected numbers of respiratory-related deaths. The approach, while promising, has limitations and should be treated at this stage as a proof of concept.
Year: 2013 PMID: 24147122 PMCID: PMC3795678 DOI: 10.1371/journal.pone.0078215
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Time series of respiratory-related deaths, 1987–2000.
The vertical dashed line indicates the separation between the in-sample and out-of-sample data.
A comparison of the predictive and forecasting capacity of the models.
| Sample | Measure | Model 1 | Model 2 | Model 3 | Model 4 |
|---|---|---|---|---|---|
| In sample | WSAD | 2328.2 | 1882.5 | 1943.3 | 1927.5 |
| In sample | R1 | 0 | .191 | .165 | .172 |
| Out sample | WSAD | 2529.5 | 2121.3 | 2039.7 | 2055.3 |
| Out sample | R1 | 0 | .161 | .194 | .187 |
Prediction was based on the In Sample (days=2445) and forecasting was based on cross-validation of the Out Sample (days=2548). Model 1: intercept only; Model 2: temporal/seasonal model; Model 3: temporal/seasonal model with selected lags of weather and air quality; Model 4: temporal/seasonal model with a 7-day moving average of weather and air quality. The weighted sum of the absolute deviations (WSAD) and the coefficient of determination (R1) are used to compare the models.
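
The relationship between WSAD and R1 can be illustrated with a minimal Python sketch. It assumes that WSAD is the usual quantile-regression check loss (residuals above the fitted quantile weighted by 0.9, those below by 0.1) and that R1 is one minus the ratio of a conditional model's WSAD to that of the intercept-only Model 1. The function names are illustrative and do not come from the study; the WSAD values are copied from the table.

```python
def check_loss(residual, tau=0.9):
    """Check (pinball) loss for one residual y - y_hat at quantile tau.

    Assumption: this is the weighting behind the reported WSAD.
    """
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def wsad(observed, predicted, tau=0.9):
    """Weighted sum of absolute deviations over paired observations."""
    return sum(check_loss(y - q, tau) for y, q in zip(observed, predicted))


def r1(wsad_conditional, wsad_unconditional):
    """Proportionate reduction in WSAD relative to the unconditional model."""
    return 1.0 - wsad_conditional / wsad_unconditional


# WSAD for Models 1 to 4, copied from the table (Model 1 is the baseline).
WSAD = {
    "in_sample": [2328.2, 1882.5, 1943.3, 1927.5],
    "out_sample": [2529.5, 2121.3, 2039.7, 2055.3],
}

# Recompute R1 for every model against the Model 1 baseline.
R1 = {
    sample: [round(r1(value, values[0]), 3) for value in values]
    for sample, values in WSAD.items()
}

for sample, scores in R1.items():
    print(sample, scores)
```

Under these assumptions, the recomputed values agree with the R1 rows of the table to three decimal places, which suggests the reported R1 is indeed derived directly from the WSAD figures.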
Figure 2. The quantile regression (90th percentile) model of respiratory-related deaths.
The small dots indicate daily deaths.
The dotted horizontal line shows the unconditional 90th percentile.