Taewan Goo1, Catherine Apio1, Gyujin Heo1, Doeun Lee1, Jong Hyeok Lee2, Jisun Lim3, Kyulhee Han1, Taesung Park1,2.
Abstract
For the novel coronavirus disease 2019 (COVID-19), predictive modeling in the literature broadly uses susceptible exposed infected recovered (SEIR)/SIR, agent-based, and curve-fitting models. Governments and legislative bodies rely on insights from prediction models to suggest new policies and to assess the effectiveness of enforced policies. Therefore, access to accurate outbreak prediction models is essential to obtain insights into the likely spread and consequences of infectious diseases. The objective of this study is to predict the future COVID-19 situation in Korea. Here, we employed six models for this analysis: SEIR, local linear regression (LLR), negative binomial (NB) regression, segmented Poisson, a deep-learning-based long short-term memory (LSTM) model, and a tree-based gradient boosting machine (GBM). After prediction, model performance was evaluated using relative mean squared errors (RMSE) for two sets of training data (January 20, 2020‒December 31, 2020 and January 20, 2020‒January 31, 2021) and testing data (January 1, 2021‒February 28, 2021 and February 1, 2021‒February 28, 2021). Except for the segmented Poisson model, the models predicted a decline in daily confirmed cases in the country in the near future. A comparison of RMSE values showed that LLR, GBM, SEIR, NB, and LSTM, respectively, performed well in forecasting the pandemic situation of the country. A good understanding of the epidemic dynamics would greatly enhance the control and prevention of COVID-19 and other infectious diseases. Therefore, with daily confirmed cases increasing since the start of this year, these results could help in the pandemic response by informing decisions about planning, resource allocation, and social distancing policies.
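As a rough illustration of the SEIR family named in the abstract, the sketch below integrates the textbook SEIR equations with a simple forward-Euler step. It is not the authors' implementation: the function name `simulate_seir`, the population size, the initial counts, and the rates `beta`, `sigma`, and `gamma` are all hypothetical values chosen only to make the example run.

```python
def simulate_seir(population, exposed0, infected0, beta, sigma, gamma,
                  days, steps_per_day=10):
    """Integrate the textbook SEIR equations with forward Euler.

    beta  : transmission rate per day (assumed value)
    sigma : rate of leaving the exposed state, 1 / incubation period
    gamma : rate of leaving the infectious state, 1 / infectious period
    Returns one (S, E, I, R) tuple per day, including day 0.
    """
    s = float(population - exposed0 - infected0)
    e, i, r = float(exposed0), float(infected0), 0.0
    dt = 1.0 / steps_per_day
    history = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt   # S -> E
            new_infectious = sigma * e * dt                # E -> I
            new_recovered = gamma * i * dt                 # I -> R
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


# Hypothetical parameters, not fitted to any Korean data.
trajectory = simulate_seir(
    population=1_000_000, exposed0=100, infected0=50,
    beta=0.4, sigma=1 / 5, gamma=1 / 10, days=60,
)
```

Each flow is subtracted from one compartment and added to the next, so the total population stays constant up to floating-point error. Fitting such a model to case counts would additionally require estimating the rates from data, which this sketch does not attempt.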
Keywords: COVID-19; deep learning; disease transmission; mathematical model; pandemics; statistical model
Year: 2021 PMID: 33840175 PMCID: PMC8042305 DOI: 10.5808/gi.21028
Source DB: PubMed Journal: Genomics Inform ISSN: 1598-866X
Fig. 1. Daily confirmed cases in South Korea. Daily confirmed cases for the capital, non-capital, and domestic regions are shown in red, blue, and green, respectively.
Fig. 2. Long short-term memory (LSTM) model architecture. (A) Overall architecture of the LSTM. (B) The LSTM block architecture.
RMSE by region and model for the two data splits
| Region | Model | Split 1 train RMSE (Jan 20, 2020‒Dec 31, 2020) | Split 1 test RMSE (Jan 1, 2021‒Feb 28, 2021) | Split 2 train RMSE (Jan 20, 2020‒Jan 31, 2021) | Split 2 test RMSE (Feb 1, 2021‒Feb 28, 2021) |
|---|---|---|---|---|---|
| Domestic | Segmented Poisson | 0.088 | 1194.103 | 0.251 | 16.415 |
| Domestic | Negative binomial | 0.057 | 0.409 | 0.063 | 2.088 |
| Domestic | Local regression | 0.037 | 0.793 | 0.039 | 14.856 |
| Domestic | LSTM | 0.051 | 23.117 | 0.083 | 6.327 |
| Domestic | SEIR | 0.033 | 0.956 | 0.035 | 2.658 |
| Domestic | GBM | 0.022 | 1.507 | 0.082 | 0.591 |
| Capital | Segmented Poisson | 0.075 | 668.199 | 0.235 | 5.312 |
| Capital | Negative binomial | 0.061 | 1.311 | 0.064 | 3.078 |
| Capital | Local regression | 0.042 | 1.135 | 0.046 | 3.846 |
| Capital | LSTM | 0.054 | 14.8 | 0.074 | 3.934 |
| Capital | SEIR | 0.073 | 0.410 | 0.072 | 3.109 |
| Capital | GBM | 0.021 | 1.960 | 0.095 | 0.892 |
| Non-capital | Segmented Poisson | 0.118 | 1131.838 | 0.195 | 34.935 |
| Non-capital | Negative binomial | 0.097 | 0.912 | 0.103 | 1.157 |
| Non-capital | Local regression | 0.074 | 0.522 | 0.076 | 33.232 |
| Non-capital | LSTM | 0.087 | 15.207 | 0.119 | 4.774 |
| Non-capital | SEIR | 0.036 | 0.610 | 0.036 | 1.964 |
| Non-capital | GBM | 0.015 | 0.607 | 0.083 | 0.855 |
RMSE, relative mean squared error; LSTM, long short-term memory; SEIR, susceptible exposed infected recovered; GBM, gradient boosting machine.
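To make the test-set comparison in the table easier to read, the short sketch below picks the model with the lowest test RMSE for each region and data split. The numbers are copied from the table; the dictionary name `TEST_RMSE` and the helper `best_model` are hypothetical names introduced only for this illustration.

```python
# Test-set RMSE values copied from the table: (split 1 test, split 2 test).
TEST_RMSE = {
    "Domestic": {
        "Segmented Poisson": (1194.103, 16.415),
        "Negative binomial": (0.409, 2.088),
        "Local regression": (0.793, 14.856),
        "LSTM": (23.117, 6.327),
        "SEIR": (0.956, 2.658),
        "GBM": (1.507, 0.591),
    },
    "Capital": {
        "Segmented Poisson": (668.199, 5.312),
        "Negative binomial": (1.311, 3.078),
        "Local regression": (1.135, 3.846),
        "LSTM": (14.8, 3.934),
        "SEIR": (0.410, 3.109),
        "GBM": (1.960, 0.892),
    },
    "Non-capital": {
        "Segmented Poisson": (1131.838, 34.935),
        "Negative binomial": (0.912, 1.157),
        "Local regression": (0.522, 33.232),
        "LSTM": (15.207, 4.774),
        "SEIR": (0.610, 1.964),
        "GBM": (0.607, 0.855),
    },
}


def best_model(region, split_index):
    """Return the model with the lowest test RMSE (split_index 0 or 1)."""
    scores = TEST_RMSE[region]
    return min(scores, key=lambda model: scores[model][split_index])


for region in TEST_RMSE:
    print(region, best_model(region, 0), best_model(region, 1))
# prints:
# Domestic Negative binomial GBM
# Capital SEIR GBM
# Non-capital Local regression GBM
```

On these values, the lowest test error under split 1 differs by region, while GBM has the lowest test error in every region under split 2.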