Muhammad Naeem (1), Jian Yu (2), Muhammad Aamir (1), Sajjad Ahmad Khan (3), Olayinka Adeleye (2), Zardad Khan (1)
1. Department of Statistics, Abdul Wali Khan University, Mardan, KP, Pakistan.
2. Department of Computer Science, Auckland University of Technology, Auckland, New Zealand.
3. Department of Statistics, Islamia College University, Peshawar, KP, Pakistan.
Abstract
BACKGROUND: Forecasting the timing of a forthcoming pandemic helps reduce the impact of disease by allowing precautionary steps such as public health messaging and raising awareness among doctors. With the continuous and rapid increase in the cumulative incidence of COVID-19, the research community is using statistical and outbreak prediction models, including various machine learning (ML) models, to track and predict the trend of the epidemic and to develop appropriate strategies to combat and manage its spread. METHODS: In this paper, we present a comparative analysis of several ML approaches, namely Support Vector Machine, Random Forest, K-Nearest Neighbor and Artificial Neural Network, for predicting the COVID-19 outbreak in the epidemiological domain. We first apply the autoregressive distributed lag (ARDL) method to identify and model the short-run and long-run relationships in the COVID-19 time-series datasets. That is, we determine the lags between a response variable and its explanatory time-series variables. The variables found to be significant, together with their lags, are then used in the regression model selected by the ARDL procedure to predict and forecast the trend of the epidemic. RESULTS: Four statistical measures are used to assess model accuracy: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE) and Symmetric Mean Absolute Percentage Error (SMAPE). The MAPE values of the best-selected models for confirmed, recovered and death cases are 0.003, 0.006 and 0.115, respectively, which fall under the category of highly accurate forecasts. In addition, we computed 15-day-ahead forecasts for daily deaths, recovered cases and confirmed cases; the forecast values fluctuate over time for all three series. The results also indicate the advantages of ML algorithms for supporting decision-making on evolving short-term policies.
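The four accuracy measures named in the results can be illustrated with a minimal sketch. The abstract does not give the exact formulas the authors used, so the definitions below are assumptions: MAPE and SMAPE are returned as fractions rather than percentages, and SMAPE uses one common variant with the mean of the absolute actual and predicted values in the denominator. The function names and the sample series are hypothetical and are not data from the study.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean Absolute Error."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def mape(actual, predicted):
    """Mean Absolute Percentage Error, as a fraction.

    Assumes no actual value is zero (the ratio is undefined there).
    """
    n = len(actual)
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


def smape(actual, predicted):
    """Symmetric MAPE, as a fraction (one common variant, assumed here).

    Assumes actual and predicted are never both zero at the same index.
    """
    n = len(actual)
    return sum(
        abs(a - p) / ((abs(a) + abs(p)) / 2)
        for a, p in zip(actual, predicted)
    ) / n


# Hypothetical daily counts, for illustration only.
actual = [100, 200, 400]
predicted = [110, 190, 400]

for name, fn in [("RMSE", rmse), ("MAE", mae), ("MAPE", mape), ("SMAPE", smape)]:
    print(f"{name}: {fn(actual, predicted):.4f}")
```

RMSE and MAE are expressed in the units of the series (cases per day), whereas MAPE and SMAPE are scale-free, which is why MAPE values can be compared across the confirmed, recovered and death series.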