Oswaldo Santos Baquero, Lidia Maria Reis Santana, Francisco Chiaravalloti-Neto.
Abstract
Globally, the number of dengue cases has been increasing since 1990, and the same trend has been observed in Brazil and in its most populous city, São Paulo. Surveillance systems based on predictions allow for timely decision making and, in turn, timely and efficient interventions to reduce the burden of the disease. We conducted a comparative study of dengue predictions in São Paulo city to test the performance of trained seasonal autoregressive integrated moving average models, generalized additive models and artificial neural networks. We also used a naïve model as a benchmark. A generalized additive model with lags of the number of cases and meteorological variables had the best performance and predicted epidemics of unprecedented magnitude; its performance was 3.16 times higher than that of the benchmark and 1.47 times higher than that of the next best-performing model. The predictive models captured the seasonal patterns but differed in their capacity to anticipate large epidemics, and all outperformed the benchmark. In addition to being able to predict epidemics of unprecedented magnitude, the best model had computational advantages, since its training and tuning were straightforward and required seconds or at most a few minutes. These are desirable characteristics for providing timely results to decision makers. However, it should be noted that predictions are made just one month ahead, a limitation that future studies could try to reduce.
Year: 2018 PMID: 29608586 PMCID: PMC5880372 DOI: 10.1371/journal.pone.0195065
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Correlation between the number of cases and lags 1–3 of six predictors.
| Predictor (lagged) | Lag 1 | Lag 2 | Lag 3 |
|---|---|---|---|
| Cases | 0.680 | 0.240 | 0.048 |
| Precipitation | 0.101 | 0.208 | 0.239 |
| Maximum temperature | 0.180 | 0.334 | 0.374 |
| Mean temperature | 0.207 | 0.335 | 0.358 |
| Minimum temperature | 0.219 | 0.315 | 0.310 |
| Relative humidity | 0.062 | -0.071 | -0.101 |
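The lagged correlations in the table above can be illustrated with a minimal sketch. It assumes Pearson correlation between cases at time t and a predictor at time t minus the lag; the source does not state which correlation coefficient was used. The function name `lagged_correlation` and the synthetic series are hypothetical and are not the study data.

```python
import numpy as np

def lagged_correlation(target, predictor, lag):
    """Pearson correlation between target[t] and predictor[t - lag]."""
    target = np.asarray(target, dtype=float)
    predictor = np.asarray(predictor, dtype=float)
    if lag < 1 or lag >= len(target):
        raise ValueError("lag must be between 1 and len(target) - 1")
    # Align the target at time t with the predictor at time t - lag.
    return float(np.corrcoef(target[lag:], predictor[:-lag])[0, 1])

# Hypothetical monthly series (NOT the study data).
rng = np.random.default_rng(0)
months = np.arange(120)
temperature = 22 + 4 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 0.5, 120)
# Toy case counts that follow temperature with a two-month delay.
cases = 100 + 20 * np.roll(temperature, 2) + rng.normal(0, 5, 120)

correlations = {lag: lagged_correlation(cases, temperature, lag) for lag in (1, 2, 3)}
print(correlations)
```

In this toy series the lag-2 correlation is the largest by construction, which mirrors how such a table can guide the choice of lag for each predictor.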
Configuration of artificial neural networks.
| Parameter | Value |
|---|---|
| Epochs | 300 |
| Weight initialization | Uniform distribution |
| Activation of hidden layers | Rectified linear units |
| Optimizer | Adaptive moment estimation (Adam) |
Topology and tuned parameters of trained artificial neural networks.
| Type | Predictors | Hidden layer 1 (units) | Hidden layer 2 (units) | Batch size (tuned) | DR (tuned) |
|---|---|---|---|---|---|
| MLP | C | 10 | 5 | 10, 20, 50 | 0, 0.2, 0.4 |
| MLP | C, T, P, RH (lags 1–3) | 20 | 10 | 10, 20, 50 | 0, 0.2, 0.4 |
| LSTM | C | 10 | 5 | 10, 20, 50 | 0, 0.2, 0.4 |
| LSTM | C, T, P, RH (lags 1–3) | 20 | 10 | 10, 20, 50 | 0, 0.2, 0.4 |
MLP: multilayer perceptron, LSTM: long short-term memory recurrent neural networks, C: number of cases, T: temperature (maximum, mean and minimum), P: precipitation, RH: relative humidity, DR: dropout regularization.
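As a rough illustration of the search space implied by the table above, the sketch below enumerates every combination of topology, batch size and dropout rate. It assumes a full grid search, which the source does not state; the dictionary keys and variable names are hypothetical.

```python
from itertools import product

# Tuned values and topologies copied from the table above.
batch_sizes = [10, 20, 50]
dropout_rates = [0.0, 0.2, 0.4]
topologies = [
    {"type": "MLP", "predictors": "C", "hidden_units": (10, 5)},
    {"type": "MLP", "predictors": "C, T, P, RH (lags 1-3)", "hidden_units": (20, 10)},
    {"type": "LSTM", "predictors": "C", "hidden_units": (10, 5)},
    {"type": "LSTM", "predictors": "C, T, P, RH (lags 1-3)", "hidden_units": (20, 10)},
]

# Assumption: every topology is paired with every batch size and dropout rate.
candidates = [
    {**topology, "batch_size": batch_size, "dropout": dropout}
    for topology, batch_size, dropout in product(topologies, batch_sizes, dropout_rates)
]

# 4 topologies x 3 batch sizes x 3 dropout rates
print(len(candidates))
```

Under this assumption each candidate would be trained with the fixed configuration (300 epochs, uniform weight initialization, rectified linear units, Adam) and compared on held-out data.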
Best generalized additive model (GAM), artificial neural network (ANN) and seasonal autoregressive integrated moving average (SARIMA) model.
| Model | Parameter | Value |
|---|---|---|
| GAM | Likelihood | Poisson |
| GAM | Spline | Shrinkage cubic |
| GAM | Knots | 3 |
| GAM | Predictors | C (lag 1), Tmax (lag 2), P (lag 1), RH (lag 1) |
| ANN | Epochs | 300 |
| ANN | Weights initialization | Uniform distribution |
| ANN | Activation | Rectifier |
| ANN | Optimizer | Adaptive moment estimation |
| ANN | Type | Multilayer perceptron |
| ANN | Units in hidden layer 1 | 20 |
| ANN | Units in hidden layer 2 | 10 |
| ANN | Batch size | 50 |
| ANN | Dropout regularization | 0 |
| ANN | Predictors | C, T, P, RH (lags 1–3) |
| SARIMA | Transformation | Natural logarithm |
| SARIMA | Non-seasonal autoregressive order | 0 |
| SARIMA | Non-seasonal difference order | 1 |
| SARIMA | Non-seasonal moving average order | 3 |
| SARIMA | Seasonal autoregressive order | 0 |
| SARIMA | Seasonal difference order | 0 |
| SARIMA | Seasonal moving average order | 1 |
C: number of cases, T: temperature (maximum, mean and minimum), Tmax: maximum temperature, P: precipitation and RH: relative humidity.
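Written out, the best GAM and SARIMA specifications in the table above might take the following form. This is a sketch under stated assumptions, not the authors' exact formulation: the log link for the Poisson GAM, the seasonal period s = 12 (monthly data), the plus-sign convention for the moving average terms and the absence of a constant term are all assumptions that the source does not confirm. Here y_t is the number of cases in month t, each f_j is a shrinkage cubic spline with 3 knots, and B is the backshift operator.

```latex
% GAM (assumed log link)
y_t \sim \mathrm{Poisson}(\mu_t), \qquad
\log \mu_t = \beta_0 + f_1(C_{t-1}) + f_2(T^{\max}_{t-2}) + f_3(P_{t-1}) + f_4(RH_{t-1})

% SARIMA(0,1,3)(0,0,1)_s on the log scale (assumed s = 12)
(1 - B)\,\log y_t = \left(1 + \theta_1 B + \theta_2 B^2 + \theta_3 B^3\right)\left(1 + \Theta_1 B^{s}\right)\varepsilon_t,
\qquad \varepsilon_t \sim \mathcal{N}(0, \sigma^2)
```

Both forms use only information available up to the previous month, which is consistent with predictions made one month ahead.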
Root mean squared errors (RMSE) of predictive models of dengue cases.
| Model | RMSE | RMSE / RMSEnaïve |
|---|---|---|
| GAM | 2152 | 0.316 |
| Ensemble | 3164 | 0.465 |
| MLP | 4422 | 0.650 |
| SARIMA | 5984 | 0.879 |
| Naïve | 6806 | 1.000 |
* Rounded values.
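The ratio column and the relative performance figures quoted in the abstract can be reproduced from the RMSE values in the table above. The sketch below assumes that "3.16 times higher" and "1.47 times higher" refer to inverse RMSE ratios, which matches the numbers but is not stated explicitly in the source. The helper `rmse` shows the standard definition and is included only for illustration.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# RMSE values copied from the table above.
rmse_by_model = {"GAM": 2152, "Ensemble": 3164, "MLP": 4422, "SARIMA": 5984, "Naïve": 6806}

naive_rmse = rmse_by_model["Naïve"]
# Ratio of each model's RMSE to the naive benchmark (lower is better).
ratio_to_naive = {model: round(value / naive_rmse, 3) for model, value in rmse_by_model.items()}

# Assumed reading of the abstract: performance gain = inverse RMSE ratio.
gam_vs_naive = round(naive_rmse / rmse_by_model["GAM"], 2)
gam_vs_ensemble = round(rmse_by_model["Ensemble"] / rmse_by_model["GAM"], 2)

print(ratio_to_naive)
print(gam_vs_naive, gam_vs_ensemble)
```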
Fig 1. Observed and predicted number of dengue cases in training and test data from São Paulo, Brazil, 2000–2016.
Predictions were made by models presented in Table 4.
Fig 2. Observed and predicted number of dengue cases in training and test data from São Paulo, Brazil, 2000–2016.
Predictions were made by the generalized additive model presented in Table 4.