Tianxing Wu1, Minghao Wang1, Xiaoqing Cheng2,3, Wendong Liu2, Shutong Zhu1, Xuefeng Zhang2.
Abstract
Hepatitis E has placed a heavy burden on China, especially in Jiangsu Province, so accurately predicting its incidence can help alleviate the medical burden. In this paper, we propose a new attentive bidirectional long short-term memory network (denoted as BiLSTM-Attention) to predict the incidence of hepatitis E in all 13 cities of Jiangsu Province, China. We also explore the effect of adding meteorological factors and the Baidu index (Baidu being the most widely used Chinese search engine) as additional training data for our BiLSTM-Attention model. SARIMAX, GBDT, LSTM, BiLSTM, and BiLSTM-Attention models are tested in this study on the monthly incidence rates of hepatitis E, meteorological factors, and the Baidu index collected from 2011 to 2019 for the 13 cities. From January 2011 to December 2019, a total of 29,339 cases of hepatitis E were detected across all cities in Jiangsu Province, and the average monthly incidence rate per city was 0.359 per 100,000 persons. Root mean square error (RMSE) and mean absolute error (MAE) are used for model selection and performance evaluation. The BiLSTM-Attention model trained with meteorological factors and the Baidu index performs best for hepatitis E prediction in all cities, improving RMSE and MAE by at least 10% in all 13 cities, which indicates that the model has significantly better learning ability, generalizability, and prediction accuracy than the other models.
Keywords: Baidu index; BiLSTM; attention; hepatitis E; machine learning; meteorological factors
Year: 2022 PMID: 36262244 PMCID: PMC9574096 DOI: 10.3389/fpubh.2022.942543
Source DB: PubMed Journal: Front Public Health ISSN: 2296-2565
Figure 1. The framework of our BiLSTM–Attention model.
Figure 2. The structure of the LSTM cell.
Figure 3. The annual incidence rates (the number of people suffering from hepatitis E each year per 100,000 persons) of hepatitis E of the 13 cities in Jiangsu Province from 2011 to 2019.
Statistics of the monthly incidence rates of hepatitis E for the 13 cities in Jiangsu province from January 2011 to December 2019.
| City | | | |
|---|---|---|---|
| Nanjing | 0.163 | 0.595 | 0.025 |
| Lianyungang | 0.441 | 1.32 | 0.067 |
| Changzhou | 0.123 | 0.588 | 0 |
| Zhenjiang | 0.506 | 1.237 | 0.031 |
| Wuxi | 0.101 | 0.267 | 0 |
| Suzhou | 0.181 | 0.483 | 0.047 |
| Huai'an | 0.286 | 1.021 | 0.041 |
| Yangzhou | 0.422 | 1.637 | 0.132 |
| Xuzhou | 0.442 | 0.982 | 0.079 |
| Yancheng | 0.416 | 0.815 | 0.152 |
| Suqian | 0.455 | 1.463 | 0.041 |
| Taizhou | 0.367 | 0.840 | 0.108 |
| Nantong | 0.513 | 1.304 | 0.192 |
Parameter settings for the different models.
| Model | Parameter | Values |
|---|---|---|
| SARIMAX | Number of time lags, order of MA model ( | {(1, 1), (1, 2), ..., (3, 3)} |
| | AR, MA terms (seasonal part) ( | {(0, 0), (0, 1), ..., (3, 3)} |
| | Degree of differencing | {0, 1} |
| | Differencing term (seasonal part) | {1} |
| | Number of | {12} |
| GBDT | | {100, 200, 300} |
| | Learning rate | {0.1, 0.2, 0.5} |
| | Max depth | {1, 2, 3, 4, 5} |
| LSTM, BiLSTM, BiLSTM–Attention | Batch size | {8, 16, 32} |
| | Epochs | {64, 128, 256, 512} |
| | LSTM hidden | {2, 4, 8, 16, 32, 64, 128} |
| | Validation split | {0.1} |
| | Optimizer | {Adam, SGD} |
| | Loss | {MSE, MAE} |
AR, autoregression; MA, moving average; Adam (23), an algorithm for first-order gradient-based optimization of stochastic objective functions based on adaptive estimates of lower-order moments; SGD, stochastic gradient descent; MSE, mean squared error; MAE, mean absolute error.
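The candidate sets in the table above can be expanded into concrete configurations. The sketch below assumes an exhaustive search over every combination, which the source does not confirm, and the dictionary keys are hypothetical labels chosen for illustration. The GBDT row whose parameter name did not survive in the table is keyed as `unnamed_parameter` rather than guessed.

```python
from itertools import product

# Candidate values copied from the parameter table; the key names are
# hypothetical labels, not identifiers used by the authors.
neural_grid = {
    "batch_size": [8, 16, 32],
    "epochs": [64, 128, 256, 512],
    "lstm_hidden": [2, 4, 8, 16, 32, 64, 128],
    "validation_split": [0.1],
    "optimizer": ["Adam", "SGD"],
    "loss": ["MSE", "MAE"],
}

gbdt_grid = {
    "unnamed_parameter": [100, 200, 300],  # label missing in the table
    "learning_rate": [0.1, 0.2, 0.5],
    "max_depth": [1, 2, 3, 4, 5],
}


def expand(grid):
    """Return one dict per combination of candidate values (full grid)."""
    keys = list(grid)
    return [dict(zip(keys, combo)) for combo in product(*grid.values())]


neural_configs = expand(neural_grid)
gbdt_configs = expand(gbdt_grid)

print(len(neural_configs))  # 3 * 4 * 7 * 1 * 2 * 2 = 336
print(len(gbdt_configs))    # 3 * 3 * 5 = 45
```

Under this assumption, each neural model would be trained up to 336 times per city, which is one reason a random or staged search is sometimes preferred in practice.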
Comparison of the different models in predicting monthly incidence rates of hepatitis E for the 13 cities in Jiangsu Province.
| City | Metric | BiLSTM–Attention | | | | |
|---|---|---|---|---|---|---|
| Nanjing | RMSE | 0.025 | 0.060 | 0.046 | 0.049 | 0.074 |
| | MAE | 0.143 | 0.218 | 0.197 | 0.201 | 0.238 |
| Lianyungang | RMSE | 0.150 | 0.253 | 0.210 | 0.205 | 0.256 |
| | MAE | 0.335 | 0.396 | 0.368 | 0.363 | 0.398 |
| Changzhou | RMSE | 0.033 | 0.093 | 0.068 | 0.064 | 0.068 |
| | MAE | 0.170 | 0.290 | 0.231 | 0.227 | 0.240 |
| Zhenjiang | RMSE | 0.044 | 0.287 | 0.088 | 0.102 | 0.261 |
| | MAE | 0.188 | 0.513 | 0.267 | 0.285 | 0.467 |
| Wuxi | RMSE | 0.047 | 0.089 | 0.072 | 0.078 | 0.081 |
| | MAE | 0.192 | 0.272 | 0.242 | 0.251 | 0.248 |
| Suzhou | RMSE | 0.060 | 0.086 | 0.085 | 0.085 | 0.069 |
| | MAE | 0.212 | 0.266 | 0.263 | 0.264 | 0.239 |
| Huai'an | RMSE | 0.063 | 0.174 | 0.128 | 0.126 | 0.160 |
| | MAE | 0.214 | 0.386 | 0.306 | 0.299 | 0.344 |
| Yangzhou | RMSE | 0.073 | 0.208 | 0.166 | 0.176 | 0.155 |
| | MAE | 0.249 | 0.422 | 0.361 | 0.371 | 0.339 |
| Xuzhou | RMSE | 0.075 | 0.134 | 0.116 | 0.137 | 0.145 |
| | MAE | 0.254 | 0.336 | 0.296 | 0.307 | 0.333 |
| Yancheng | RMSE | 0.050 | 0.157 | 0.069 | 0.073 | 0.112 |
| | MAE | 0.208 | 0.354 | 0.245 | 0.239 | 0.303 |
| Suqian | RMSE | 0.071 | 0.318 | 0.119 | 0.168 | 0.246 |
| | MAE | 0.236 | 0.535 | 0.303 | 0.367 | 0.440 |
| Taizhou | RMSE | 0.107 | 0.151 | 0.129 | 0.123 | 0.189 |
| | MAE | 0.296 | 0.370 | 0.328 | 0.322 | 0.386 |
| Nantong | RMSE | 0.086 | 0.171 | 0.196 | 0.155 | 0.122 |
| | MAE | 0.272 | 0.364 | 0.391 | 0.354 | 0.315 |
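For reference, the two error metrics reported in this table follow their standard definitions. The sketch below is a minimal pure-Python version; the two short series are hypothetical values made up for illustration and are not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


# Hypothetical monthly incidence rates per 100,000 persons (illustrative only).
observed = [0.20, 0.30, 0.25, 0.40]
predicted = [0.25, 0.25, 0.25, 0.30]

print(f"RMSE = {rmse(observed, predicted):.4f}")
print(f"MAE  = {mae(observed, predicted):.4f}")
```

RMSE squares each residual before averaging, so a single large miss (such as an outbreak month) raises it more than it raises MAE.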
Figure 4. The monthly incidence rates (including the true results and predicted ones of BiLSTM–Attention, BiLSTM, and LSTM) of hepatitis E for the 13 cities in Jiangsu Province (x-axis: month, y-axis: incidence rate).
Comparison of our BiLSTM-Attention model trained on different data in predicting monthly incidence rates of hepatitis E for the 13 cities in Jiangsu Province.
| City | Metric | Full training data | | |
|---|---|---|---|---|
| Nanjing | RMSE | 0.025 | 0.033 | 0.034 |
| | MAE | 0.143 | 0.167 | 0.175 |
| Lianyungang | RMSE | 0.150 | 0.171 | 0.200 |
| | MAE | 0.335 | 0.344 | 0.362 |
| Changzhou | RMSE | 0.033 | 0.057 | 0.047 |
| | MAE | 0.170 | 0.218 | 0.204 |
| Zhenjiang | RMSE | 0.044 | 0.104 | 0.068 |
| | MAE | 0.188 | 0.288 | 0.241 |
| Wuxi | RMSE | 0.047 | 0.062 | 0.058 |
| | MAE | 0.192 | 0.233 | 0.218 |
| Suzhou | RMSE | 0.060 | 0.078 | 0.061 |
| | MAE | 0.210 | 0.252 | 0.212 |
| Huai'an | RMSE | 0.063 | 0.105 | 0.077 |
| | MAE | 0.214 | 0.269 | 0.228 |
| Yangzhou | RMSE | 0.073 | 0.096 | 0.087 |
| | MAE | 0.240 | 0.281 | 0.249 |
| Xuzhou | RMSE | 0.075 | 0.118 | 0.084 |
| | MAE | 0.250 | 0.306 | 0.254 |
| Yancheng | RMSE | 0.050 | 0.074 | 0.067 |
| | MAE | 0.208 | 0.258 | 0.243 |
| Suqian | RMSE | 0.071 | 0.106 | 0.086 |
| | MAE | 0.236 | 0.293 | 0.273 |
| Taizhou | RMSE | 0.107 | 0.133 | 0.116 |
| | MAE | 0.296 | 0.328 | 0.303 |
| Nantong | RMSE | 0.086 | 0.120 | 0.115 |
| | MAE | 0.272 | 0.306 | 0.320 |
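One plausible way to read an "improvement" between columns of this table is as a relative reduction in error. The source does not state the formula it used, so the definition below is an assumption. The sketch applies it to the Nanjing RMSE row above, comparing the lowest-error column (0.025) against the other two columns (0.033 and 0.034).

```python
def relative_reduction(reference_error, new_error):
    """Fraction by which new_error is lower than reference_error.

    Assumed definition: (reference - new) / reference. The source does not
    say how its improvement percentages were computed.
    """
    if reference_error <= 0:
        raise ValueError("reference_error must be positive")
    return (reference_error - new_error) / reference_error


# Nanjing RMSE values copied from the table above.
lowest_error = 0.025
other_columns = [0.033, 0.034]

reductions = [relative_reduction(e, lowest_error) for e in other_columns]
for reference, reduction in zip(other_columns, reductions):
    print(f"{reference:.3f} -> {lowest_error:.3f}: {reduction:.1%} lower")
```

The same function can be applied row by row to any pair of columns, for MAE as well as RMSE.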
Figure 5. The monthly incidence rates (including the true results and predicted ones of the BiLSTM–Attention model with full training data, w/o BI, and w/o MF) of hepatitis E for the 13 cities in Jiangsu Province (x-axis: month, y-axis: incidence rate).