Hui Ge, Debao Fan, Ming Wan, Lizhu Jin, Xiaofeng Wang, Xuejie Du, Xu Yang.
Abstract
Infectious diseases are a major health challenge worldwide. Because they can spread rapidly and cause severe harm, curbing transmission once an outbreak has begun is not enough: reliable prediction and early warning before an outbreak give government health departments a basis for an early, proportionate response, reduce morbidity and mortality, and limit national losses. With traditional medical data alone, however, prediction and early warning may come too late or be too difficult to implement. Medical big data has recently become a research hotspot and plays an increasingly important role in public health, precision medicine, and disease prediction. In this paper, we explore a prediction and early warning method for influenza based on medical big data. Meteorological conditions are known to influence influenza outbreaks, so we use big data analysis of meteorological factors to determine an early warning threshold for influenza outbreaks. The results show that, by analyzing meteorological conditions together with historical influenza outbreak data, an early warning threshold for influenza outbreaks can be established with reasonably high accuracy.
Year: 2020; PMID: 33343686; PMCID: PMC7725585; DOI: 10.1155/2020/8845459
Source DB: PubMed; Journal: Comput Math Methods Med; ISSN: 1748-670X; Impact factor: 2.238
Comparison of different influenza-related works.
| Reference | Methods | Data | Goal |
|---|---|---|---|
| [ | ARIMA | Influenza data | Predict trend |
| [ | Statistical methods | Influenza data | Predict trend |
| [ | Statistical methods | Influenza data | Predict trend |
| [ | Bayesian | Influenza data | Predict trend |
| [ | Machine learning methods | Influenza data | Support diagnosis |
| [ | OOC | Influenza data | Predict outbreak |
| [ | Clustering | Social media data | Monitor influenza |
| [ | Linear prediction | Medical data and search data | Monitor influenza |
| [ | Genetic algorithm | Influenza data | Predict trend |
| [ | ANN | Climatic data and influenza data | Predict trend |
| [ | LSTM | Geographical data and climatic data | Predict trend |
| [ | Nonlinear regression | Meteorological data | Monitor influenza |
| [ | MLP | Meteorological data | Predict trend |
Features of meteorological data.
| Name | Meaning | Data type | Data unit |
|---|---|---|---|
| t_avg | Daily average temperature | Continuous | °C |
| t_max | Daily highest temperature | Continuous | °C |
| t_min | Daily lowest temperature | Continuous | °C |
| precip | Cumulative precipitation | Continuous | mm |
| winds_avg | Average wind speed | Continuous | m/s |
| winds_max | Maximum wind speed | Continuous | m/s |
| rh_avg | Average relative humidity | Continuous | % |
| rh_min | Minimum relative humidity | Continuous | % |
| QNE_hPa | Average air pressure | Continuous | hPa |
| radiation | Cumulative daily radiation | Continuous | MJ/m2 |
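The feature table above maps naturally onto a typed record. The following is a minimal sketch, assuming one record per day; the class name `DailyWeather`, the `problems` helper, and the sample values are hypothetical and not taken from the paper. The consistency checks are basic physical sanity rules, not a preprocessing step described by the authors.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DailyWeather:
    """One day of meteorological features (field names follow the table)."""
    t_avg: float      # daily average temperature, °C
    t_max: float      # daily highest temperature, °C
    t_min: float      # daily lowest temperature, °C
    precip: float     # cumulative precipitation, mm
    winds_avg: float  # average wind speed, m/s
    winds_max: float  # maximum wind speed, m/s
    rh_avg: float     # average relative humidity, %
    rh_min: float     # minimum relative humidity, %
    QNE_hPa: float    # average air pressure, hPa
    radiation: float  # cumulative daily radiation, MJ/m2

    def problems(self):
        """Return a list of violated sanity rules (empty if none)."""
        issues = []
        if not (self.t_min <= self.t_avg <= self.t_max):
            issues.append("temperature ordering")
        if not (0.0 <= self.rh_min <= self.rh_avg <= 100.0):
            issues.append("relative humidity range")
        if not (0.0 <= self.winds_avg <= self.winds_max):
            issues.append("wind speed ordering")
        if self.precip < 0.0 or self.radiation < 0.0:
            issues.append("negative cumulative value")
        return issues


# Hypothetical sample values, for illustration only.
sample = DailyWeather(t_avg=4.5, t_max=9.0, t_min=-1.0, precip=0.0,
                      winds_avg=2.1, winds_max=6.3, rh_avg=55.0,
                      rh_min=30.0, QNE_hPa=1021.0, radiation=8.4)
bad = replace(sample, rh_avg=120.0)  # humidity above 100 %
print(sample.problems(), bad.problems())
```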
Critical parameters for CART.
| Name | Meaning | Data type | Default value |
|---|---|---|---|
| max_depth | The maximum tree depth | | None |
| min_impurity_decrease | The minimum impurity decrease required to split a node | | 0 |
| min_weight_fraction_leaf | The minimum weighted fraction of samples required at a leaf node | | 0 |
| class_weight | The weight of a class | | None |
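To make `min_impurity_decrease` concrete, the sketch below computes the Gini impurity of a node and the weighted impurity decrease of a candidate split. It assumes the Gini criterion, unweighted samples, and the weighting convention used by scikit-learn's decision trees, whose parameter names match the table; the excerpt does not state which implementation the authors used. The node contents and the threshold value are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure or empty node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_impurity_decrease(parent, left, right, n_total):
    """Impurity decrease of a split, scaled by the node's share of all samples."""
    n_t = len(parent)
    children = (len(left) / n_t) * gini(left) + (len(right) / n_t) * gini(right)
    return (n_t / n_total) * (gini(parent) - children)


# Hypothetical root node: 1 = outbreak day, 0 = normal day.
parent = [1, 1, 1, 0, 0, 0, 0, 0]
left, right = [1, 1, 1, 0], [0, 0, 0, 0]

decrease = weighted_impurity_decrease(parent, left, right, n_total=len(parent))
min_impurity_decrease = 0.05  # hypothetical threshold
print(decrease, decrease >= min_impurity_decrease)
```

A split is kept only if its decrease is at least the threshold, so raising `min_impurity_decrease` prunes weak splits and yields a smaller tree.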
Figure 1. Flow of our method.
Evaluation of max_depth for CART.
| max_depth | ACC | f1-score | AUC |
|---|---|---|---|
| 2 | 0.8361 | 0.6562 | 0.8019 |
| 3 | 0.8126 | 0.6793 | 0.7798 |
| 4 | 0.8135 | 0.7087 | 0.7943 |
| 5 | 0.7621 | 0.6315 | 0.7109 |
| 6 | 0.7709 | 0.6107 | 0.6954 |
| 7 | 0.7891 | 0.6051 | 0.6598 |
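The evaluation tables report ACC, f1-score, and AUC. The excerpt does not define these metrics, so the sketch below uses their standard definitions for binary labels, assuming 1 marks the outbreak class. The labels and scores are hypothetical and unrelated to the paper's data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true, scores):
    """Probability that a random positive scores above a random negative.

    Ties count as half a win. Equivalent to the area under the ROC curve.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cutoff is an assumption

print(accuracy(y_true, y_pred), f1_score(y_true, y_pred), auc(y_true, scores))
```

ACC and f1-score depend on the chosen cutoff, while AUC depends only on how the scores rank the samples, which is one reason the three metrics can favor different parameter values in the tables.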
Evaluation of min_impurity_decrease for CART.
| min_impurity_decrease | ACC | f1-score | AUC |
|---|---|---|---|
| 0 | 0.8135 | 0.7087 | 0.7943 |
| 0.005 | 0.8135 | 0.7087 | 0.7943 |
| 0.01 | 0.8143 | 0.7165 | 0.8029 |
| 0.02 | 0.8177 | 0.7254 | 0.8087 |
| 0.05 | 0.8268 | 0.7301 | 0.8109 |
| 0.08 | 0.7521 | 0.6342 | 0.7651 |
| 0.1 | 0.7196 | 0.6072 | 0.7535 |
Evaluation of min_weight_fraction_leaf for CART.
| min_weight_fraction_leaf | ACC | f1-score | AUC |
|---|---|---|---|
| 0 | 0.8291 | 0.7370 | 0.8153 |
| 0.01 | 0.8043 | 0.6909 | 0.7733 |
| 0.02 | 0.8105 | 0.7144 | 0.7992 |
| 0.05 | 0.8358 | 0.7451 | 0.8208 |
| 0.1 | 0.8470 | 0.6369 | 0.7384 |
| 0.2 | 0.8578 | 0.6882 | 0.7572 |
| 0.3 | 0.7329 | 0.6153 | 0.7023 |
Evaluation of different data tagging methods.
| Data tagging method | ACC | f1-score | AUC |
|---|---|---|---|
| Moving percentile method | 0.8586 | 0.7610 | 0.8429 |
| Monthly upper-quartile marking | 0.8317 | 0.6963 | 0.7967 |
| Dual cycle daily marking | 0.8391 | 0.7129 | 0.7508 |
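The excerpt names the moving percentile method but does not define it. One plausible reading, sketched below purely as an assumption, labels a day as an outbreak day when its case count exceeds a chosen percentile of the preceding window of days. The window length, the percentile, the nearest-rank percentile rule, and the case counts are all hypothetical.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile: smallest value with at least q% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def moving_percentile_labels(counts, window=7, q=75):
    """Label day i as 1 if counts[i] exceeds the q-th percentile of the prior window."""
    labels = []
    for i, count in enumerate(counts):
        history = counts[max(0, i - window):i]
        if len(history) < window:
            labels.append(0)  # not enough history yet; treated as a normal day
        else:
            labels.append(1 if count > percentile(history, q) else 0)
    return labels


# Hypothetical daily influenza case counts.
counts = [3, 4, 2, 5, 3, 4, 3, 9, 4, 12]
labels = moving_percentile_labels(counts, window=7, q=75)
print(labels)
```

Because the threshold moves with recent history, a count that would be unremarkable in peak season can still be flagged after a quiet week.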
Figure 2. ROC for different data tagging methods.
Figure 3. Data visualization of the CART model.
Comparison between our model and baseline models.
| Method | ACC | f1-score | AUC |
|---|---|---|---|
| Optimized model | 0.8721 | 0.7381 | 0.8709 |
| CART | 0.8586 | 0.7610 | 0.8429 |
| XGBoost | 0.8804 | 0.6998 | 0.8561 |
| LightGBM | 0.8735 | 0.7321 | 0.8224 |
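No single method leads on every metric in the comparison table. The short sketch below copies the table's values and picks the top method per metric; the dictionary layout and variable names are illustrative only.

```python
# Values copied from the comparison table above.
results = {
    "Optimized model": {"ACC": 0.8721, "f1-score": 0.7381, "AUC": 0.8709},
    "CART":            {"ACC": 0.8586, "f1-score": 0.7610, "AUC": 0.8429},
    "XGBoost":         {"ACC": 0.8804, "f1-score": 0.6998, "AUC": 0.8561},
    "LightGBM":        {"ACC": 0.8735, "f1-score": 0.7321, "AUC": 0.8224},
}

# For each metric, find the method with the highest reported value.
best = {
    metric: max(results, key=lambda name: results[name][metric])
    for metric in ("ACC", "f1-score", "AUC")
}

for metric, name in best.items():
    print(f"{metric}: {name} ({results[name][metric]:.4f})")
```

On these numbers XGBoost has the highest ACC, plain CART the highest f1-score, and the optimized model the highest AUC, so the ranking depends on which metric matters most for early warning.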
Figure 4. ROC for different basic models.
Figure 5. Comparison with other algorithms.