Hao-Yuan Cheng1, Yu-Chun Wu2, Min-Hau Lin1, Yu-Lun Liu1, Yue-Yang Tsai2, Jo-Hua Wu2, Ke-Han Pan2, Chih-Jung Ke1, Chiu-Mei Chen1, Ding-Ping Liu1,3, I-Feng Lin4, Jen-Hsiang Chuang1,4.
Abstract
BACKGROUND: Variable seasonal influenza activity in subtropical areas such as Taiwan complicates epidemic preparedness. The Taiwan Centers for Disease Control has maintained real-time national influenza surveillance systems since 2004. Beyond timely monitoring, epidemic forecasting based on the national influenza surveillance data can provide pivotal information for the public health response.
Keywords: Influenza-like illness; artificial intelligence; epidemic forecasting; forecasting; influenza; machine learning; surveillance
Year: 2020 PMID: 32755888 PMCID: PMC7439145 DOI: 10.2196/15394
Source DB: PubMed Journal: J Med Internet Res ISSN: 1438-8871 Impact factor: 5.428
Figure 1. Timeline of historical data used for model training and forecasting periods.
Figure 2. The framework of feature selection and model tuning for our model training and validation compared to the conventional method.
Figure 3. Nowcasts (current week predictions) of the influenza-like illness visits in outpatient and emergency departments by the 5 machine learning models (colored lines) compared with the observed historical data (black line), 2015-2017. ARIMA: autoregressive integrated moving average; ILI: influenza-like illness; RF: random forest; SVR: support vector regression; XGB: extreme gradient boosting.
Evaluation metrics of the 5 machine learning models for current-week forecasts (nowcasts), 1-week forecasts, 2-week forecasts, and 3-week forecasts of outpatient and emergency influenza-like illness visits, 2015 to 2017 data.
| Time period | Model | Outpatient RMSE^a | Outpatient MAPE^b, % | Outpatient hit rate | Outpatient Pearson correlation coefficient | Emergency RMSE | Emergency MAPE, % | Emergency hit rate | Emergency Pearson correlation coefficient |
|---|---|---|---|---|---|---|---|---|---|
| Nowcast | ARIMA^c | 6621.9 | 6.5 | 0.744 | 0.962 | 1689.8 | 5.2 | 0.718 | 0.965 |
| | RF^d | 10773.1 | 9.2 | 0.577 | 0.891 | 2707.8 | 7.6 | 0.609 | 0.922 |
| | SVR^e | 9265.0 | 7.7 | 0.686 | 0.923 | 4189.8 | 7.1 | 0.686 | 0.802 |
| | XGB^f | 10063.6 | 7.3 | 0.635 | 0.915 | 2696.0 | 6.6 | 0.667 | 0.935 |
| | Ensemble | 6903.5 | 6.0 | 0.756 | 0.956 | 1696.3 | 5.2 | 0.705 | 0.967 |
| 1-week forecast | ARIMA | 11165.2 | 11.5 | 0.695 | 0.892 | 2562.1 | 8.3 | 0.747 | 0.918 |
| | RF | 12256.6 | 11.7 | 0.701 | 0.855 | 3430.0 | 11.8 | 0.643 | 0.842 |
| | SVR | 11573.6 | 10.7 | 0.740 | 0.874 | 4351.8 | 9.4 | 0.701 | 0.803 |
| | XGB | 13604.7 | 11.0 | 0.695 | 0.836 | 3866.7 | 9.7 | 0.688 | 0.842 |
| | Ensemble | 11752.8 | 10.1 | 0.721 | 0.874 | 2831.6 | 8.3 | 0.708 | 0.901 |
| 2-week forecast | ARIMA | 15471.8 | 15.3 | 0.695 | 0.792 | 3206.2 | 10.1 | 0.727 | 0.867 |
| | RF | 13464.5 | 13.8 | 0.721 | 0.823 | 3639.8 | 13.7 | 0.669 | 0.816 |
| | SVR | 13972.9 | 13.7 | 0.708 | 0.808 | 4562.1 | 10.8 | 0.734 | 0.783 |
| | XGB | 15317.7 | 13.6 | 0.727 | 0.785 | 4235.0 | 11.3 | 0.708 | 0.817 |
| | Ensemble | 13758.1 | 12.0 | 0.727 | 0.823 | 3467.9 | 10.4 | 0.727 | 0.860 |
| 3-week forecast | ARIMA | 19338.3 | 18.9 | 0.669 | 0.676 | 3836.8 | 12.0 | 0.688 | 0.801 |
| | RF | 14310.9 | 14.9 | 0.753 | 0.796 | 3949.8 | 15.0 | 0.708 | 0.777 |
| | SVR | 16004.3 | 15.9 | 0.786 | 0.743 | 4903.4 | 12.1 | 0.675 | 0.731 |
| | XGB | 16888.8 | 15.7 | 0.708 | 0.723 | 4823.2 | 13.5 | 0.734 | 0.686 |
| | Ensemble | 15193.6 | 13.3 | 0.773 | 0.780 | 3937.2 | 13.1 | 0.643 | 0.797 |
^a RMSE: root mean squared error.
^b MAPE: mean absolute percentage error.
^c ARIMA: autoregressive integrated moving average.
^d RF: random forest.
^e SVR: support vector regression.
^f XGB: extreme gradient boosting.
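As a reading aid for the table above, here is a minimal Python sketch of how three of the reported metrics (RMSE, MAPE, and the Pearson correlation coefficient) are conventionally computed from paired observed and predicted weekly counts. The function names and the sample numbers are illustrative assumptions, not taken from the study; hit rate is left out because its definition is not given in this excerpt.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed values must be nonzero)."""
    ratios = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

def pearson(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

# Hypothetical weekly visit counts, for illustration only (not study data).
observed = [100, 200, 300, 400]
predicted = [110, 190, 330, 380]

print(f"RMSE: {rmse(observed, predicted):.1f}")
print(f"MAPE: {mape(observed, predicted):.1f}%")
print(f"Pearson r: {pearson(observed, predicted):.3f}")
```

Because RMSE is expressed in visit counts, it is not comparable between the outpatient and emergency columns, whose volumes differ widely; MAPE and the correlation coefficient are scale-free and can be compared across the two settings.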
Figure 4. Forecasts using the ensemble model (red) and the observed (black) number for influenza-like illness visits in (A) outpatient and (B) emergency departments, 2015-2017.
Evaluation metrics of the 5 machine learning models for current-week forecasts (nowcasts), 1-week forecasts, 2-week forecasts, and 3-week forecasts of outpatient and emergency influenza-like illness visits in 2018.
| Time period | Model | Outpatient RMSE^a | Outpatient MAPE^b, % | Outpatient hit rate | Outpatient Pearson correlation coefficient | Emergency RMSE | Emergency MAPE, % | Emergency hit rate | Emergency Pearson correlation coefficient |
|---|---|---|---|---|---|---|---|---|---|
| Nowcast | ARIMA^c | 6422.8 | 5.5 | 0.727 | 0.958 | 1125.3 | 5.7 | 0.782 | 0.965 |
| | RF^d | 6472.1 | 5.4 | 0.709 | 0.957 | 1305.8 | 6.2 | 0.709 | 0.952 |
| | SVR^e | 5343.2 | 5.8 | 0.655 | 0.969 | 2079.6 | 8.0 | 0.582 | 0.875 |
| | XGB^f | 7384.6 | 5.8 | 0.691 | 0.943 | 1643.6 | 6.2 | 0.673 | 0.923 |
| | Ensemble | 6170.7 | 5.3 | 0.600 | 0.962 | 1751.0 | 6.5 | 0.727 | 0.912 |
| 1-week forecast | ARIMA | 9874.7 | 9.0 | 0.741 | 0.897 | 1707.6 | 7.8 | 0.704 | 0.919 |
| | RF | 8644.1 | 7.2 | 0.833 | 0.921 | 1861.2 | 7.9 | 0.759 | 0.899 |
| | SVR | 7330.1 | 7.9 | 0.778 | 0.942 | 2643.7 | 10.6 | 0.759 | 0.798 |
| | XGB | 9738.7 | 9.0 | 0.741 | 0.903 | 2353.3 | 8.3 | 0.685 | 0.836 |
| | Ensemble | 9156.8 | 7.5 | 0.796 | 0.911 | 2363.0 | 8.4 | 0.722 | 0.832 |
| 2-week forecast | ARIMA | 11630.6 | 11.8 | 0.811 | 0.851 | 1922.2 | 8.6 | 0.755 | 0.893 |
| | RF | 10082.3 | 9.4 | 0.811 | 0.893 | 1975.5 | 7.7 | 0.736 | 0.888 |
| | SVR | 9292.7 | 10.1 | 0.830 | 0.905 | 3031.0 | 12.8 | 0.566 | 0.730 |
| | XGB | 11262.8 | 11.3 | 0.755 | 0.873 | 2371.6 | 7.9 | 0.698 | 0.843 |
| | Ensemble | 10078.8 | 8.6 | 0.774 | 0.889 | 2389.4 | 8.6 | 0.755 | 0.835 |
| 3-week forecast | ARIMA | 13656.7 | 13.5 | 0.865 | 0.787 | 1875.4 | 9.5 | 0.788 | 0.898 |
| | RF | 10258.0 | 10.1 | 0.769 | 0.892 | 2041.9 | 8.8 | 0.692 | 0.877 |
| | SVR | 9439.7 | 10.9 | 0.885 | 0.904 | 3106.0 | 13.5 | 0.596 | 0.721 |
| | XGB | 12789.3 | 13.0 | 0.788 | 0.830 | 2259.6 | 7.6 | 0.692 | 0.890 |
| | Ensemble | 9160.9 | 8.8 | 0.904 | 0.908 | 2478.4 | 9.6 | 0.769 | 0.814 |
^a RMSE: root mean squared error.
^b MAPE: mean absolute percentage error.
^c ARIMA: autoregressive integrated moving average.
^d RF: random forest.
^e SVR: support vector regression.
^f XGB: extreme gradient boosting.
Figure 5. Forecasts of the 5 machine learning models (red) and the observed number (black) of influenza-like illness visits in (A) outpatient and (B) emergency departments, 2018. ARIMA: autoregressive integrated moving average; MAPE: mean absolute percentage error; RF: random forest; RMSE: root mean squared error; SVR: support vector regression; XGB: extreme gradient boosting.