Rui Zhang, Zhen Guo, Yujie Meng, Songwang Wang, Shaoqiong Li, Ran Niu, Yu Wang, Qing Guo, Yonghong Li.
Abstract
BACKGROUND: This study aims to identify the best model for predicting the incidence of hand, foot and mouth disease (HFMD) in Ningbo by comparing Autoregressive Integrated Moving Average (ARIMA) and Long Short-Term Memory (LSTM) neural network models, each with and without exogenous meteorological variables.
Keywords: ARIMA; ARIMAX; HFMD; multivariate LSTM; univariate LSTM
Year: 2021; PMID: 34200378; PMCID: PMC8201362; DOI: 10.3390/ijerph18116174
Source DB: PubMed; Journal: Int J Environ Res Public Health; ISSN: 1660-4601; Impact factor: 3.390
Figure 1. The geographical location of Ningbo and Yinzhou station.
Figure 2. The repeating modules of RNN and LSTM. (a) The repeating module in a standard RNN contains a single layer; (b) the repeating module in an LSTM contains four interacting layers (A is a chunk of neural network).
Figure 3. A three-layer stacked Long Short-Term Memory Neural Network (LSTM) architecture.
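The "four interacting layers" of the LSTM module are commonly taken to be the forget gate, input gate, candidate layer, and output gate. The following is a minimal scalar sketch in pure Python of one LSTM step and of feeding the hidden state of one cell into the next, as in a three-layer stack. The weights (`toy_weights`) and the helper names are hypothetical and chosen only for illustration; they are not the study's trained model.

```python
import math


def sigmoid(z):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell.

    w maps each layer name ("f", "i", "g", "o") to (w_x, w_h, bias).
    All weights are hypothetical and for illustration only.
    """
    def pre(name):
        w_x, w_h, b = w[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("f"))    # forget gate: how much old cell state to keep
    i = sigmoid(pre("i"))    # input gate: how much new candidate to admit
    g = math.tanh(pre("g"))  # candidate cell state
    o = sigmoid(pre("o"))    # output gate: how much cell state to expose
    c = f * c_prev + i * g   # updated cell state
    h = o * math.tanh(c)     # updated hidden state
    return h, c


def stacked_lstm(sequence, layers):
    """Run a sequence through stacked cells; each layer's h feeds the next."""
    states = [(0.0, 0.0) for _ in layers]
    for x in sequence:
        layer_input = x
        for k, w in enumerate(layers):
            h, c = lstm_step(layer_input, states[k][0], states[k][1], w)
            states[k] = (h, c)
            layer_input = h
    return states[-1][0]  # hidden state of the top layer after the last step


toy_weights = {
    "f": (0.5, 0.1, 0.0),
    "i": (0.4, 0.2, 0.0),
    "g": (0.3, 0.3, 0.0),
    "o": (0.6, 0.1, 0.0),
}
three_layers = [toy_weights, toy_weights, toy_weights]
final_hidden = stacked_lstm([0.2, 0.5, 0.9], three_layers)
```

A real model would use vector-valued states and learned weight matrices; the scalar form is kept here only so the gate arithmetic is visible.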
Descriptive statistics of daily incidence of HFMD cases and meteorological factors in Ningbo, 2014–2017.
| Indicators | Mean ± SD | Min | P25 | P50 | P75 | Max |
|---|---|---|---|---|---|---|
| Incidence (cases) | 88.9 ± 76.8 | 1 | 33 | 64 | 120 | 479 |
| Tmean (°C) | 17.5 ± 8.4 | −4.5 | 10.1 | 18.5 | 24.2 | 32.9 |
| Pmean (hPa) | 1016.0 ± 8.8 | 985.7 | 1008.6 | 1015.7 | 1023.2 | 1039.7 |
| RHmean (%) | 79.8 ± 11.2 | 34 | 73 | 81 | 88 | 100 |
| WSmean (m/s) | 2.0 ± 0.9 | 0.1 | 1.4 | 1.8 | 2.4 | 8.3 |
| PPTN (mm) | 5.0 ± 14.4 | 0 | 0 | 0 | 3.3 | 276.2 |
| Sunshine (h) | 4.4 ± 4.1 | 0 | 0 | 3.7 | 8.3 | 12.7 |
Note: SD stands for standard deviation, Min stands for minimum value, Max stands for maximum value, P25 stands for 25th percentile, P50 stands for 50th percentile and P75 stands for 75th percentile; Tmean stands for daily mean temperature, Pmean stands for daily mean pressure, RHmean stands for daily mean relative humidity, WSmean stands for daily mean wind speed and PPTN stands for daily precipitation.
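The summary columns in this table can be reproduced for any daily series with Python's standard `statistics` module. The sketch below is an illustration under assumptions: `toy_counts` is made-up data, the SD is taken as the sample standard deviation, and percentiles use the "inclusive" interpolation method; the source does not state which conventions were used.

```python
import statistics


def describe(values):
    """Return mean, sample SD, min, quartiles, and max for a list of numbers."""
    # n=4 yields the three quartile cut points (P25, P50, P75).
    p25, p50, p75 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (assumption)
        "min": min(values),
        "p25": p25,
        "p50": p50,
        "p75": p75,
        "max": max(values),
    }


toy_counts = [12, 30, 45, 64, 80, 120, 150]  # made-up daily case counts
summary = describe(toy_counts)
```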
Figure 4. The time series distribution of the daily incidence of HFMD and meteorological variables in Ningbo, 2014–2017.
Analysis of correlation between daily incidence of HFMD and meteorological variables.
| Indicators | Tmean | Pmean | RHmean | WSmean | PPTN | Sunshine |
|---|---|---|---|---|---|---|
| HFMD | 0.34 * | −0.36 * | 0.09 * | −0.05 | 0.04 | −0.02 |
| Tmean | | −0.89 * | 0.15 * | −0.02 | 0.11 * | 0.17 * |
| Pmean | | | −0.26 * | 0.03 | −0.17 * | −0.06 * |
| RHmean | | | | −0.32 * | 0.35 * | −0.58 * |
| WSmean | | | | | 0.08 * | 0.07 * |
| PPTN | | | | | | −0.29 * |
Note: *: p < 0.05; Tmean stands for daily mean temperature, Pmean stands for daily mean pressure, RHmean stands for daily mean relative humidity and WSmean stands for daily mean wind speed.
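Each cell in this matrix is a pairwise correlation coefficient between two daily series. The note does not say which coefficient was used, so the sketch below assumes Pearson's r; a rank-based coefficient such as Spearman's would require replacing the values with their ranks first. The two toy series are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


toy_temperature = [10, 15, 20, 25, 30]         # made-up values
toy_pressure = [1025, 1020, 1015, 1010, 1005]  # made-up values
r_toy = pearson_r(toy_temperature, toy_pressure)
```

The toy series are perfectly inversely related, so the coefficient sits at the negative extreme; real series such as Tmean and Pmean show a strong but imperfect negative relationship.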
Comparison of the ARIMA and ARIMAX models.
| Models | Ljung–Box Test: X-Squared | Ljung–Box Test: p-Value | AIC | RMSE | MAE | MAPE |
|---|---|---|---|---|---|---|
| ARIMA (5,1,4) | 2.73 | 0.10 | 13,825.48 | 12.43 | 9.71 | 0.21 |
| ARIMA (5,1,2) | 0.11 | 0.74 | 13,988.18 | 14.23 | 11.59 | 0.24 |
| ARIMA (2,1,1)(0,1,0)365 | 0.02 | 0.88 | 11,439.60 | 43.27 | 32.59 | 0.57 |
| ARIMA (3,1,1)(0,1,0)365 | 0.00 | 0.99 | 11,440.77 | 43.2 | 32.61 | 0.58 |
| ARIMAX (5,1,3) | 0.04 | 0.84 | 13,973.31 | 15.98 | 12.70 | 0.22 |
| ARIMAX (4,1,3) | 0.97 | 0.32 | 14,049.60 | 17.23 | 13.49 | 0.23 |
| ARIMAX (5,1,2) | 0.33 | 0.57 | 13,973.21 | 15.92 | 12.71 | 0.22 |
| ARIMAX (5,1,4) | 3.00 | 0.08 | 13,808.40 | 14.73 | 11.26 | 0.21 |
Note: AIC stands for Akaike information criterion, RMSE stands for root mean square error, MAE stands for mean absolute error and MAPE stands for mean absolute percentage error.
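The three error metrics named in the note can be written out directly. This is a minimal sketch with made-up `actual` and `predicted` values; MAPE is returned as a fraction rather than a percentage, which is an assumption based on the magnitude of the values reported in the table.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction; actual values must be non-zero."""
    n = len(actual)
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


actual = [100, 80, 50]     # made-up observed daily cases
predicted = [90, 85, 55]   # made-up forecasts
scores = {
    "RMSE": rmse(actual, predicted),
    "MAE": mae(actual, predicted),
    "MAPE": mape(actual, predicted),
}
```

RMSE penalizes large errors more heavily than MAE, while MAPE scales each error by the observed value, so days with few cases weigh more in MAPE than in the other two.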
Comparison of the univariate LSTM and multivariate LSTM models.
| Models | No. | Time Steps | Neurons | Optimizer | Epochs | Batch Size | RMSE |
|---|---|---|---|---|---|---|---|
| Univariate LSTM | 1 | 60 | 64 | SGD | 250 | 32 | 11.20 |
| | 2 | 60 | 72 | RMSProp | 250 | 16 | 11.33 |
| | 3 | 60 | 72 | Adam | 250 | 16 | 11.33 |
| | 4 | 60 | 72 | RMSProp | 200 | 16 | 11.99 |
| | 5 | 60 | 72 | RMSProp | 250 | 64 | 12.43 |
| | 6 | 60 | 64 | RMSProp | 250 | 16 | 12.52 |
| | 7 | 60 | 128 | SGD | 250 | 32 | 19.30 |
| | 8 | 30 | 128 | SGD | 250 | 32 | 19.56 |
| | 9 | 180 | 64 | SGD | 250 | 32 | 20.59 |
| | 10 | 60 | 32 | SGD | 250 | 32 | 21.57 |
| Multivariate LSTM | 1 | 60 | 32 | Adam | 250 | 32 | 10.78 |
| | 2 | 60 | 64 | RMSProp | 250 | 32 | 11.09 |
| | 3 | 60 | 64 | Adam | 250 | 32 | 11.17 |
| | 4 | 60 | 64 | RMSProp | 250 | 64 | 12.07 |
| | 5 | 60 | 64 | RMSProp | 200 | 32 | 12.99 |
| | 6 | 30 | 32 | Adam | 250 | 32 | 13.64 |
| | 7 | 7 | 32 | Adam | 250 | 32 | 15.09 |
| | 8 | 180 | 32 | Adam | 250 | 32 | 15.48 |
| | 9 | 60 | 128 | Adam | 250 | 32 | 17.07 |
| | 10 | 60 | 64 | SGD | 250 | 32 | 19.99 |
Note: SGD stands for stochastic gradient descent, RMSProp stands for root mean square propagation and Adam stands for adaptive moment estimation.
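The "Time Steps" column is the length of the input window given to the LSTM. One common way to prepare such inputs, sketched here as an assumption rather than as the study's documented preprocessing, is to slide a fixed-length window over the series and pair each window with the value that follows it. The function name `make_windows` and the toy series are hypothetical.

```python
def make_windows(series, time_steps):
    """Split a series into (window, next_value) pairs for one-step-ahead forecasting."""
    samples = []
    for start in range(len(series) - time_steps):
        window = series[start:start + time_steps]  # the model input
        target = series[start + time_steps]        # the value to predict
        samples.append((window, target))
    return samples


toy_series = list(range(1, 11))          # made-up series: 1, 2, ..., 10
pairs = make_windows(toy_series, time_steps=3)
first_window, first_target = pairs[0]    # ([1, 2, 3], 4)
```

With 60 time steps, each sample would hold the previous 60 daily values. In the multivariate case each element of the window would be a row containing the incidence together with the meteorological variables for that day.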
The forecasting performance of the four models.
| Model | RMSE | MAE | MAPE |
|---|---|---|---|
| ARIMA (5,1,4) | 12.43 | 9.71 | 0.20 |
| ARIMAX (5,1,4) | 14.73 | 11.26 | 0.21 |
| Univariate LSTM | 11.20 | 9.03 | 0.18 |
| Multivariate LSTM | 10.78 | 8.71 | 0.17 |
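To make the comparison concrete, the reported RMSE values can be ranked and expressed as a relative reduction against a baseline. The snippet below uses only the RMSE figures from the table above; the helper `relative_reduction` is a hypothetical name introduced for illustration.

```python
# RMSE values as reported in the forecasting-performance table.
reported_rmse = {
    "ARIMA (5,1,4)": 12.43,
    "ARIMAX (5,1,4)": 14.73,
    "Univariate LSTM": 11.20,
    "Multivariate LSTM": 10.78,
}


def relative_reduction(baseline, candidate):
    """Fractional reduction in error of candidate relative to baseline."""
    return (baseline - candidate) / baseline


# Model with the lowest reported RMSE.
best_model = min(reported_rmse, key=reported_rmse.get)
reduction_vs_arima = relative_reduction(
    reported_rmse["ARIMA (5,1,4)"], reported_rmse[best_model]
)
```

On these figures the multivariate LSTM has the lowest RMSE, roughly 13% below that of ARIMA (5,1,4). The table reports point estimates only, so this says nothing about whether the difference is statistically significant.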
Figure 5. The actual daily incidence of HFMD and values predicted by the four models in December 2017.