Jichao Sun, Xi Chen, Ziheng Zhang, Shengzhang Lai, Bo Zhao, Hualuo Liu, Shuojia Wang, Wenjing Huan, Ruihui Zhao, Man Tat Alexander Ng, Yefeng Zheng.
Abstract
The current outbreak of coronavirus disease 2019 (COVID-19) has been declared a pandemic and has spread to over 200 countries and territories. Forecasting the long-term trend of the COVID-19 epidemic can help health authorities determine the transmission characteristics of the virus and adopt appropriate prevention and control strategies in advance. Previous studies that applied only traditional epidemic models or only machine learning models were subject to underfitting or overfitting. We propose a new model, Dynamic-Susceptible-Exposed-Infective-Quarantined (D-SEIQ), which modifies the Susceptible-Exposed-Infective-Recovered (SEIR) model and integrates machine-learning-based parameter optimization under epidemiologically rational constraints. We used the model to predict the long-term reported cumulative numbers of COVID-19 cases in China from January 27, 2020. We evaluated the model on officially reported confirmed cases from three different regions in China, and the results demonstrated its effectiveness in simulating and predicting the trend of the COVID-19 outbreak. In the China-excluding-Hubei area, within 7 days after the first public report, our model accurately predicted the long-term trend up to 40 days ahead and the exact date of the outbreak peak. The predicted cumulative number by March 10, 2020 (12,506) differed from the actual number (13,005) by only 3.8%. The parameters obtained by our model demonstrated the effectiveness of prevention and intervention strategies for epidemic control in China. The prediction results for five other countries suggested the external validity of our model. The integrated approach of epidemic and machine learning models could accurately forecast the long-term trend of the COVID-19 outbreak. The model parameters also provided insights into COVID-19 transmission and the effectiveness of interventions in China.
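The 3.8% figure quoted in the abstract can be checked with a short calculation. The sketch below assumes the difference is expressed relative to the actual count; the function name `relative_error_percent` is hypothetical and not taken from the paper.

```python
def relative_error_percent(predicted: float, actual: float) -> float:
    """Absolute difference between prediction and truth, as a percentage of the truth."""
    return abs(actual - predicted) / actual * 100.0


# Cumulative case counts for China excluding Hubei by March 10, 2020 (from the abstract).
predicted_cases = 12_506
actual_cases = 13_005

error = relative_error_percent(predicted_cases, actual_cases)
print(f"relative error: {error:.1f}%")
```

Dividing by the predicted count instead would give roughly 4.0%, so the abstract's value is consistent with normalizing by the actual number.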
Year: 2020 PMID: 33273592 PMCID: PMC7713358 DOI: 10.1038/s41598-020-78084-w
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Epidemic spreading diagram for the SEIQ model. S: susceptible; E: exposed; I: infective; Q: quarantined. The three rate parameters are the infectious rate, the incubation rate (the reciprocal of the incubation period), and the quarantine rate (the reciprocal of the infectious period).
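The compartment flow in this caption can be illustrated with a small simulation. This is a minimal sketch, not the authors' implementation: it assumes the standard SEIR-style equations with the recovered compartment replaced by a quarantined one, constant rates (the published D-SEIQ model is dynamic), and forward-Euler integration. The names `beta`, `sigma`, `gamma`, and `simulate_seiq` are hypothetical labels for the infectious rate, incubation rate, quarantine rate, and the simulator.

```python
def simulate_seiq(population, exposed0, infective0, beta, sigma, gamma,
                  days, steps_per_day=10):
    """Forward-Euler integration of an assumed SEIQ system.

    Assumed equations (not given in the source):
        dS/dt = -beta * S * I / N
        dE/dt =  beta * S * I / N - sigma * E
        dI/dt =  sigma * E - gamma * I
        dQ/dt =  gamma * I
    Returns one (S, E, I, Q) tuple per day, including day 0.
    """
    n = float(population)
    s = n - exposed0 - infective0
    e, i, q = float(exposed0), float(infective0), 0.0
    dt = 1.0 / steps_per_day
    history = [(s, e, i, q)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / n * dt   # S -> E
            new_infective = sigma * e * dt        # E -> I
            new_quarantined = gamma * i * dt      # I -> Q
            s -= new_exposed
            e += new_exposed - new_infective
            i += new_infective - new_quarantined
            q += new_quarantined
        history.append((s, e, i, q))
    return history


# Illustrative values only: incubation period 5 and infectious period 3,
# with beta chosen under the assumption R0 = beta / gamma and R0 = 3.
gamma = 1.0 / 3.0
sigma = 1.0 / 5.0
beta = 3.0 * gamma
trajectory = simulate_seiq(1_000_000, 10, 5, beta, sigma, gamma, days=30)
print("quarantined after 30 days:", round(trajectory[-1][3]))
```

In this sketch the Q compartment only grows, so it behaves like a cumulative count; whether it maps directly onto reported confirmed cases depends on modelling choices not stated in the caption.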
The constrained range for parameters with epidemic rationality.
| Parameter | Reasonable range |
|---|---|
| Basic reproduction number | [2, 7] |
| Incubation period | [3, 11] |
| Infectious period | [1, 5] |
| Final value of [symbol missing] | [0.05, 0.35] |
| Decrease ratio of [symbol missing] | [0.05, 0.45] |
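The five ranges in this table can be treated as box constraints on a parameter search. The sketch below shows one simple way to enforce them, by clipping a candidate parameter set to its bounds, and then converts the first three parameters into rates. It is an illustration under assumptions, not the authors' optimizer: the key names are hypothetical, the pairing of ranges with parameters follows the order of the footnote's descriptions, and the conversion assumes the basic reproduction number equals the infectious rate divided by the quarantine rate.

```python
# Reasonable ranges from the table; the key names are hypothetical labels.
BOUNDS = {
    "r0": (2.0, 7.0),                  # basic reproduction number
    "incubation_period": (3.0, 11.0),
    "infectious_period": (1.0, 5.0),
    "final_value": (0.05, 0.35),       # symbol not recoverable from the source
    "decrease_ratio": (0.05, 0.45),    # symbol not recoverable from the source
}


def clip_to_bounds(params, bounds=BOUNDS):
    """Project each parameter onto its allowed interval."""
    return {
        name: min(max(value, bounds[name][0]), bounds[name][1])
        for name, value in params.items()
    }


def to_rates(params):
    """Convert periods to rates; assumes R0 = beta / gamma."""
    gamma = 1.0 / params["infectious_period"]   # quarantine rate
    sigma = 1.0 / params["incubation_period"]   # incubation rate
    beta = params["r0"] * gamma                 # infectious rate
    return {"beta": beta, "sigma": sigma, "gamma": gamma}


# A candidate that violates three of the five ranges.
candidate = {
    "r0": 9.0,
    "incubation_period": 2.0,
    "infectious_period": 3.0,
    "final_value": 0.2,
    "decrease_ratio": 0.5,
}
feasible = clip_to_bounds(candidate)
print(feasible)
print(to_rates(feasible))
```

Clipping is only one option; a bounded optimizer that never leaves the feasible region would serve the same purpose of keeping fitted values epidemiologically plausible.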
Figure 2. Simulation and prediction results of daily and cumulative confirmed cases in China excluding Hubei. Results are shown for 6 different prediction dates. Results for daily confirmed cases are in the 1st and 3rd rows, and results for cumulative confirmed cases are in the 2nd and 4th rows. Green vertical line: the prediction date, which separates training data from test data. Solid blue line: the actual number of confirmed cases before the prediction date (training data). Solid yellow line: the retrospective number of confirmed cases (test data). Red dotted line: the number predicted by the D-SEIQ model.
Figure 3. Simulation and prediction results of daily and cumulative confirmed cases in Hubei excluding Wuhan. Results are shown for 6 different prediction dates. Results for daily confirmed cases are in the 1st and 3rd rows, and results for cumulative confirmed cases are in the 2nd and 4th rows. Green vertical line: the prediction date, which separates training data from test data. Solid blue line: the actual number of confirmed cases before the prediction date (training data). Solid yellow line: the retrospective number of confirmed cases (test data). Red dotted line: the number predicted by the D-SEIQ model. Grey dashed line: the numbers after adjustment for clinically confirmed cases.
Figure 4. Simulation and prediction results of daily and cumulative confirmed cases in Wuhan. Results are shown for 6 different prediction dates. Results for daily confirmed cases are in the 1st and 3rd rows, and results for cumulative confirmed cases are in the 2nd and 4th rows. Green vertical line: the prediction date, which separates training data from test data. Solid blue line: the actual number of confirmed cases before the prediction date (training data). Solid yellow line: the retrospective number of confirmed cases (test data). Red dotted line: the number predicted by the D-SEIQ model. Grey dashed line: the numbers after adjustment for clinically confirmed cases.
Figure 5. The estimated dynamic effective reproduction number in three different regions of China. Blue line: Wuhan city. Yellow line: Hubei province excluding Wuhan city. Red line: China excluding Hubei province.