Wendong Liu1, Qigang Dai1, Jing Bao2, Wenqi Shen1, Ying Wu1, Yingying Shi1, Ke Xu1, Jianli Hu1, Changjun Bao1, Xiang Huo1.
Abstract
Influenza activity is subject to environmental factors. Accurate forecasting of influenza epidemics would permit timely and effective implementation of public health interventions, but it remains challenging. In this study, we aimed to develop random forest (RF) regression models that include meteorological factors to predict seasonal influenza activity in Jiangsu province, China. The coefficient of determination (R²) and the mean absolute percentage error (MAPE) were used to evaluate model performance. Three RF models with optimum parameters were constructed to predict influenza-like illness (ILI) activity and the positive rates of influenza A and B (Flu-A and Flu-B) in Jiangsu. The models for Flu-B and ILI performed excellently, with MAPEs <10%. The predicted values of the Flu-A model also matched the observed trend very well, although its MAPE reached 19.49% in the test set. The lagged dependent variables were vital predictors in each model. Seasonality was more pronounced in the models for ILI and Flu-A. The effects of the meteorological factors and their lagged terms on prediction accuracy differed across the three models, although temperature always played an important role. Notably, atmospheric pressure made a major contribution to ILI and Flu-B forecasting. In brief, RF models performed well in predicting influenza activity, and the impacts of meteorological factors on the predictive models are type-specific.
Keywords: Forecast; influenza activity; meteorological factor; random forest model
Year: 2019 PMID: 31858924 PMCID: PMC7006024 DOI: 10.1017/S0950268819002140
Source DB: PubMed Journal: Epidemiol Infect ISSN: 0950-2688 Impact factor: 2.451
Fig. 1. Temporal patterns of ILI activity and influenza virus positive rates in Jiangsu province, 2011–2016.
Summary of weekly meteorological variables in Jiangsu province, 2011–2016
| Variable | Min | P25 | P50 | Mean | P75 | Max |
|---|---|---|---|---|---|---|
| AP (hPa) | 998.6 | 1006.8 | 1015.7 | 1015.2 | 1022.7 | 1034.0 |
| Mean_T (°C) | −2.191 | 6.953 | 17.092 | 15.621 | 23.551 | 32.648 |
| Max_T (°C) | 1.252 | 11.622 | 22.471 | 20.174 | 27.467 | 37.407 |
| Min_T (°C) | −5.997 | 3.44 | 12.698 | 11.938 | 20.434 | 28.199 |
| RH (%) | 45.93 | 67.86 | 74.40 | 73.40 | 80.24 | 91.18 |
| PR (mm) | 0 | 3.335 | 11.648 | 21.296 | 29.987 | 159.657 |
| SD (hour) | 2.252 | 27.574 | 37.274 | 37.987 | 48.717 | 82.009 |
Cross correlation between dependent variable and meteorological factors
| Dependent variable | Lag | AP | Mean_T | Max_T | Min_T | RH | PR | SD |
|---|---|---|---|---|---|---|---|---|
| Weekly ILI | 0 | −0.195* | 0.112* | 0.106 | 0.119* | 0.104 | 0.159* | −0.006 |
| | 1 | −0.172* | 0.081 | 0.077 | 0.086 | 0.067 | 0.181* | −0.023 |
| | 2 | −0.146* | 0.051 | 0.047 | 0.055 | 0.053 | 0.153* | −0.061 |
| | 3 | −0.126* | 0.026 | 0.025 | 0.028 | 0.033 | 0.142* | −0.072 |
| | 4 | −0.113* | 0.006 | 0.008 | 0.005 | 0.033 | 0.143* | −0.1 |
| Weekly positive rate of Flu-A | 0 | 0.193* | −0.225* | −0.225* | −0.216* | −0.051 | −0.059 | −0.006 |
| | 1 | 0.181* | −0.221* | −0.222* | −0.210* | −0.06 | −0.041 | −0.024 |
| | 2 | 0.161* | −0.204* | −0.210* | −0.189* | −0.05 | −0.013 | −0.051 |
| | 3 | 0.127* | −0.173* | −0.182* | −0.156* | 0.002 | 0.024 | −0.103 |
| | 4 | 0.096 | −0.139* | −0.152* | −0.120* | 0.041 | 0.038 | −0.131 |
| Weekly positive rate of Flu-B | 0 | 0.375* | −0.459* | −0.465* | −0.454* | −0.271* | −0.206* | −0.143* |
| | 1 | 0.427* | −0.492* | −0.496* | −0.486* | −0.278* | −0.226* | −0.142* |
| | 2 | 0.461* | −0.516* | −0.519* | −0.510* | −0.291* | −0.236* | −0.144* |
| | 3 | 0.495* | −0.539* | −0.538* | −0.533* | −0.303* | −0.256* | −0.129* |
| | 4 | 0.509* | −0.547* | −0.545* | −0.541* | −0.299* | −0.266* | −0.136* |
*Statistically significant at the 0.05 level.
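A lagged cross-correlation of the kind tabulated above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes a Pearson coefficient and assumes that lag k pairs the dependent variable in week t with the meteorological factor in week t − k, neither of which is stated in this record. The function names `pearson` and `lagged_correlation` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def lagged_correlation(target, factor, lag):
    """Correlate target[t] with factor[t - lag] (assumed lag convention)."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(target, factor)
    # Drop the first `lag` target weeks and the last `lag` factor weeks
    # so that each remaining pair is (target[t], factor[t - lag]).
    return pearson(target[lag:], factor[:-lag])
```

Calling `lagged_correlation(ili, pressure, k)` for k = 0 to 4, with two hypothetical weekly lists `ili` and `pressure`, would give one row of coefficients per lag; significance testing is not covered by this sketch.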
Fig. 2. Partial autocorrelation function of the time series of ILI percentage, positive rate of Flu-A and positive rate of Flu-B.
Predictors in different models
| Model | lag | ILI-P | Flu-A | Flu-B | time | AP | Mean_T | Max_T | Min_T | RH | PR | SD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RF-ILI_P | 0 | | | | | | | | | | | |
| | 1 | | | | | | | | | | | |
| | 2 | | | | | | | | | | | |
| | 3 | | | | | | | | | | | |
| | 4 | | | | | | | | | | | |
| RF-Flu_A | 0 | | | | | | | | | | | |
| | 1 | | | | | | | | | | | |
| | 2 | | | | | | | | | | | |
| | 3 | | | | | | | | | | | |
| | 4 | | | | | | | | | | | |
| RF-Flu_B | 0 | | | | | | | | | | | |
| | 1 | | | | | | | | | | | |
| | 2 | | | | | | | | | | | |
| | 3 | | | | | | | | | | | |
| | 4 | | | | | | | | | | | |
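The lag structure of this table (lags 0 to 4 for each candidate predictor) can be turned into a feature matrix along the following lines. This is a sketch under assumptions: which predictor and lag combinations each model actually used is not recoverable from this record, so the code simply generates every lag from 0 to `max_lag` for the meteorological series and lags 1 to `max_lag` for the dependent variable. The function name `make_lagged_rows` and the feature naming scheme are hypothetical.

```python
def make_lagged_rows(series_by_name, target_name, max_lag=4):
    """Build (features, target) pairs from equal-length weekly series.

    For week t, each feature "<name>_lag<k>" holds the value of that
    series in week t - k. Lag 0 of the target is excluded from the
    features because it is the value being predicted.
    """
    lengths = {len(values) for values in series_by_name.values()}
    if len(lengths) != 1:
        raise ValueError("all series must have the same length")
    n = lengths.pop()

    rows = []
    # The first max_lag weeks lack a full lag history and are skipped.
    for t in range(max_lag, n):
        features = {}
        for name, values in series_by_name.items():
            first_lag = 1 if name == target_name else 0
            for k in range(first_lag, max_lag + 1):
                features[f"{name}_lag{k}"] = values[t - k]
        rows.append((features, series_by_name[target_name][t]))
    return rows
```

Each returned pair could then be passed to any regression learner; a random forest, as in the source, is one option, and a seasonality term such as the week index (the `time` column above) could be added as a further feature.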
Performance evaluation of different random forest models
| Model | R² (train) | R² (test) | MAPE, % (train) | MAPE, % (test) |
|---|---|---|---|---|
| RF-ILI_P | 0.79 | 0.50 | 2.48 | 9.95 |
| RF-Flu_A | 0.89 | 0.82 | 11.24 | 19.49 |
| RF-Flu_B | 0.95 | 0.80 | 3.20 | 8.58 |
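The two metrics reported in this table can be computed as in the sketch below. It assumes the common definitions R² = 1 − SS_res / SS_tot and MAPE = 100 × mean(|observed − predicted| / |observed|); this record does not give the formulas the authors used, and the function names are illustrative.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observations are constant")
    return 1 - ss_res / ss_tot


def mape(observed, predicted):
    """Mean absolute percentage error, in percent."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    if any(o == 0 for o in observed):
        # A zero observation makes the percentage error undefined.
        raise ValueError("MAPE is undefined when an observed value is 0")
    total = sum(abs((o - p) / o) for o, p in zip(observed, predicted))
    return 100 * total / len(observed)
```

Because MAPE divides by the observed value, weeks with very small observed values inflate it sharply, which is worth bearing in mind when comparing MAPE across series with different baselines.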
Fig. 3. Plot of observed and predicted values from the different models.
Fig. 4. Variable importance in the random forest regression models (only the top 10 variables are displayed).