Stephen A Lauer, Krzysztof Sakrejda, Evan L Ray, Lindsay T Keegan, Qifang Bi, Paphanij Suangtho, Soawapak Hinjoy, Sopon Iamsirithaworn, Suthanun Suthachana, Yongjua Laosiritaworn, Derek A T Cummings, Justin Lessler, Nicholas G Reich.
Abstract
Dengue hemorrhagic fever (DHF), a severe manifestation of dengue viral infection that can cause severe bleeding, organ impairment, and even death, affects between 15,000 and 105,000 people each year in Thailand. While all Thai provinces experience at least one DHF case most years, the distribution of cases shifts regionally from year to year. Accurately forecasting where DHF outbreaks occur before the dengue season could help public health officials prioritize public health activities. We develop statistical models that use biologically plausible covariates, observed by April each year, to forecast the cumulative DHF incidence for the remainder of the year. We perform cross-validation during the training phase (2000-2009) to select the covariates for these models. A parsimonious model based on preseason incidence outperforms the 10-y median for 65% of province-level annual forecasts, reduces the mean absolute error by 19%, and successfully forecasts outbreaks (area under the receiver operating characteristic curve = 0.84) over the testing period (2010-2014). We find that functions of past incidence contribute most strongly to model performance, whereas the importance of environmental covariates varies regionally. This work illustrates that accurate forecasts of dengue risk are possible in a policy-relevant timeframe.
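The comparison against the 10-y median baseline described in the abstract can be illustrated with a minimal sketch. The function names and all numbers below are hypothetical and chosen only for illustration; they are not the authors' code or data, and the exact way the paper computes its summary statistics may differ.

```python
from statistics import median

def baseline_forecast(past_rates):
    """Baseline: median of the 10 most recent annual incidence rates."""
    return median(past_rates[-10:])

def mean_abs_error(forecasts, observed):
    """Mean absolute error across province-years."""
    return sum(abs(f - o) for f, o in zip(forecasts, observed)) / len(observed)

def share_outperforming(model_fc, base_fc, observed):
    """Fraction of forecasts whose absolute error is strictly below the baseline's."""
    wins = sum(abs(m - o) < abs(b - o)
               for m, b, o in zip(model_fc, base_fc, observed))
    return wins / len(observed)

def mae_reduction(model_fc, base_fc, observed):
    """Proportional reduction in MAE relative to the baseline."""
    return 1 - mean_abs_error(model_fc, observed) / mean_abs_error(base_fc, observed)

# Made-up annual incidence rates per 100,000 for four province-years.
observed = [100, 50, 80, 20]
model_fc = [90, 70, 85, 30]
base_fc = [70, 55, 60, 25]

share = share_outperforming(model_fc, base_fc, observed)
reduction = mae_reduction(model_fc, base_fc, observed)
print(f"outperforms baseline in {share:.0%} of forecasts; MAE reduced by {reduction:.0%}")
```

In this toy example the model beats the baseline in half of the province-years while still lowering the overall MAE, which shows why the two summaries are reported separately.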
Keywords: dengue; forecasting; infectious disease; statistics
Year: 2018; PMID: 29463757; PMCID: PMC5877997; DOI: 10.1073/pnas.1714457115
Source DB: PubMed; Journal: Proc Natl Acad Sci U S A; ISSN: 0027-8424; Impact factor: 11.205
Fig. 1. The temporal and spatial distribution of annual DHF incidence rates in Thailand. (A) The annual DHF incidence rate per 100,000 population for each Thai province and year used in this study. (B) The median annual DHF incidence rate per 100,000 population for each province from 2000 to 2014. (C) The coefficient of variation (SD divided by the mean) of the annual DHF incidence rate for each province.
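The three quantities mapped in Fig. 1 can be computed as in the following sketch. It assumes the sample SD (`statistics.stdev`); the caption does not say whether the sample or population SD was used. The province names, case counts, and rates are invented for illustration.

```python
from statistics import mean, median, stdev

def incidence_rate(cases, population):
    """Annual incidence rate per 100,000 population."""
    return 100_000 * cases / population

def coefficient_of_variation(rates):
    """SD divided by the mean (sample SD is an assumption here)."""
    return stdev(rates) / mean(rates)

# Hypothetical annual rates per 100,000 for two made-up provinces.
rates_by_province = {
    "Province A": [10, 20, 30],
    "Province B": [40, 40, 40],
}

summary = {
    name: {
        "median": median(rates),
        "cv": coefficient_of_variation(rates),
    }
    for name, rates in rates_by_province.items()
}

for name, stats in summary.items():
    print(name, stats)
```

A province with a high median but low coefficient of variation has a consistently high burden, whereas a high coefficient of variation indicates that incidence swings widely from year to year.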
Justifications for types of covariates considered for inclusion before model selection
| Covariate type | Reason for inclusion |
| --- | --- |
| Incidence | Large dengue outbreaks may temporarily deplete the susceptible population ( |
| Demographics | Higher population density may facilitate dengue transmission ( |
| Humidity | Humidity may improve the survival rate of |
| Rainfall | Rainfall is essential for |
| Temperature | Temperatures must be warm enough for |
Fig. 2. The WIP model covariate fit curves. The solid lines represent the average association between each covariate in the WIP model and annual DHF incidence per 100,000 population during the training phase, fixing all other covariates at their mean. The dashed lines are the CIs of each association defined as two SEs above and below the mean association. (A–E) The covariates are arranged by performance in the Wald test from largest reduction in deviance (A) to smallest reduction in deviance (E).
Fig. 3. Incidence-only model forecasts for each year of the testing phase compared with the baseline forecasts and the observed values. Forecasts for the annual DHF incidence rate per 100,000 population from the incidence-only model (blue triangles with gray 80% prediction intervals), baseline forecasts (red circles), and observed values (black x) for each province and year in the testing phase are shown.
Fig. 4. Geographic variation in best model and forecast performance. (A) The best-fitting model in the testing phase for each Ministry of Public Health (MOPH) region, which shows spatial patterns of performance. (B) The relative mean absolute errors (rMAEs) of the forecasts for each MOPH region from the models in A, relative to the baseline forecasts (i.e., the two northernmost MOPH regions show the rMAE of the WIP model forecasts, while the rest show the rMAE of the incidence-only model forecasts). Areas with less error than the baseline are blue, areas with more error than the baseline are red, and areas equal to the baseline are white.
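A per-region rMAE map like panel B could be tabulated along the following lines. This sketch assumes that rMAE is the ratio of the model's mean absolute error to the baseline's mean absolute error within each region, which matches the color coding in the caption but is not spelled out there. Region labels and values are invented.

```python
from collections import defaultdict

def rmae_by_region(records):
    """records: iterable of (region, model_forecast, baseline_forecast, observed).

    Returns model MAE divided by baseline MAE for each region. Because both
    MAEs share the same number of province-years, the ratio of summed
    absolute errors equals the ratio of means.
    """
    model_err = defaultdict(float)
    base_err = defaultdict(float)
    for region, model_fc, base_fc, obs in records:
        model_err[region] += abs(model_fc - obs)
        base_err[region] += abs(base_fc - obs)
    # Note: a region whose baseline error is exactly zero would raise ZeroDivisionError.
    return {region: model_err[region] / base_err[region] for region in model_err}

def map_color(rmae):
    """Color rule from the caption: blue below 1, red above 1, white at 1."""
    if rmae < 1:
        return "blue"
    if rmae > 1:
        return "red"
    return "white"

# Made-up province-year records for three hypothetical regions.
records = [
    ("north", 90, 70, 100),
    ("north", 70, 55, 50),
    ("south", 30, 25, 20),
    ("south", 60, 70, 80),
    ("central", 50, 60, 55),
]

rmae = rmae_by_region(records)
colors = {region: map_color(value) for region, value in rmae.items()}
print(rmae, colors)
```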
Fig. 5. The performance of outbreak forecasts by the incidence-only model. (A) The proportion of province-years that observed an outbreak by their forecasted outbreak probability, which is binned into quantiles. An outbreak is defined as an annual DHF incidence rate greater than two SDs above the median annual DHF incidence rate for the past 10 y. For each forecasted outbreak quantile, the black diamonds indicate the expected proportion of province-years with an outbreak based on incidence-only model forecasts, and the hollow triangles indicate the observed proportion of province-years with an outbreak. (B) The forecasted probability of an outbreak for each province-year in the testing phase and whether an outbreak was observed. The blue loess smoothed line shows the probability of observing an outbreak for a given forecasted outbreak probability from the incidence-only model. (C) The receiver operating characteristic curve based on the incidence-only model’s sensitivity and specificity on outbreak forecasts. The area under the receiver operating characteristic curve (AUC) is indicated below the line of no discrimination (dashed).
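The outbreak definition and the AUC in Fig. 5 can be made concrete with a short sketch. Two assumptions are made here: the SD is the sample SD of the same 10 past annual rates used for the median, and the AUC is computed with the rank-based (Mann-Whitney) formulation, which is a standard equivalent of the area under the ROC curve but not necessarily how the authors computed it. All inputs are invented.

```python
from statistics import median, stdev

def outbreak_threshold(past_rates):
    """Median of the past 10 annual rates plus two SDs of those same rates."""
    window = past_rates[-10:]
    return median(window) + 2 * stdev(window)

def is_outbreak(rate, past_rates):
    """True when the annual rate exceeds the outbreak threshold."""
    return rate > outbreak_threshold(past_rates)

def auc(probabilities, outcomes):
    """Probability that a randomly chosen outbreak province-year received a
    higher forecast probability than a randomly chosen non-outbreak one
    (ties count one half)."""
    pos = [p for p, y in zip(probabilities, outcomes) if y]
    neg = [p for p, y in zip(probabilities, outcomes) if not y]
    score = sum(1.0 if p > n else 0.5 if p == n else 0.0
                for p in pos for n in neg)
    return score / (len(pos) * len(neg))

# Made-up history of annual rates per 100,000 for one province.
past = [10] * 5 + [20] * 5
print(outbreak_threshold(past), is_outbreak(30, past), is_outbreak(25, past))

# Made-up forecast probabilities and observed outbreak indicators.
probs = [0.9, 0.8, 0.3, 0.2]
observed = [True, False, True, False]
print(auc(probs, observed))
```

An AUC of 0.5 corresponds to the dashed line of no discrimination in panel C, and values closer to 1 indicate that forecast probabilities rank outbreak province-years above non-outbreak ones.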