
Statistical analysis of forecasting COVID-19 for upcoming month in Pakistan.

Muhammad Yousaf, Samiha Zahir, Muhammad Riaz, Sardar Muhammad Hussain, Kamal Shah.

Abstract

In this paper, we analyse data obtained from the National Institute of Health (NIH), Islamabad, and produce a forecast of COVID-19 confirmed cases, deaths and recoveries in Pakistan using the Auto-Regressive Integrated Moving Average (ARIMA) model. The fitted forecasting models revealed high exponential growth in the number of confirmed cases, deaths and recoveries in Pakistan. Based on our model prediction, the number of confirmed cases will increase by 2.7 times; the 95% prediction interval for the number of cases at the end of May 2020 is 5681 to 33079. There could be up to 500 deaths (95% prediction interval: 168 to 885), and there could be an eightfold increase in the number of recoveries (95% prediction interval: 2391 to 16126). The forecasting results of COVID-19 are alarming for May in Pakistan. Health officials and the government should adopt new strategies to control the further spread of the pandemic until a proper treatment or vaccine is developed.
© 2020 Elsevier Ltd. All rights reserved.

Keywords:  ARIMA; COVID-19 Pandemic; Confirmed Cases; Deaths; Forecast; Recoveries

Year:  2020        PMID: 32501377      PMCID: PMC7247520          DOI: 10.1016/j.chaos.2020.109926

Source DB:  PubMed          Journal:  Chaos Solitons Fractals        ISSN: 0960-0779            Impact factor:   5.944


Introduction

In mid-December 2019, a viral infection called coronavirus disease 2019 (COVID-19) was first identified in Wuhan City, China [1], and it is believed that humans contracted it from wild animals. It has caused thousands of deaths around the world, and the World Health Organization (WHO) declared it a pandemic on March 11, 2020 [2]. COVID-19 infection leads to respiratory illness and has flu-like signs such as fever, cough, myalgia, diarrhea, and dyspnea [3]. It is highly contagious and is transmitted through bodily contact and respiratory droplets from infected patients, which are now the main source of transmission of the disease. The virus can survive on a contaminated surface for up to 12 hours, or even two days [4]. The fatality rate is higher among young children and the elderly (aged ≥ 60 years) [5]. At present (as of April 14, 2020), the number of infected patients worldwide is recorded as 1,925,877 (1.9 million), with 119,719 deaths and 452,333 recoveries [6], and these figures are expected to increase exponentially in the coming days. In Pakistan, the first two cases of COVID-19 appeared on 26 February 2020; within 48 hours, three more cases appeared in different cities around the country, and there was no link between these patients. The cases then increased exponentially; by 14 April there were 5,716 cases, most of them in Punjab (2,826), followed by 1,452, 800, 233, 231, 131 and 43 cases in Sindh, KPK, GB, Baluchistan, Islamabad and AJK, respectively. Of these cases, 1,378 recoveries and 96 deaths were recorded [7]. COVID-19 is a pandemic, and preventive measures must be taken to control its spread. For patients, body fluids and electrolytes should be monitored continuously along with vital signs, and to avoid further spread, patients should be isolated under strict clinical measures in line with preventive guidelines [8].
The government needs a strategy to fight this war in a timely fashion; for example, authorities have taken further measures such as closing borders, suspending community services and schools, and minimizing both domestic and international travel until further notice [9]. The purpose of these measures is to limit physical contact among people so that transmission of COVID-19 is controlled, as the incubation period for this virus is relatively longer than that of other viruses. Because the virus is novel, there is great uncertainty about when the disease will disappear. Therefore, short-term forecasting, even if it gives only a rough indication for the upcoming month, is extremely important for better management of societal, economic, cultural and public health issues [10]. In the past few months, researchers have developed new mathematical and statistical methods, or employed existing ones, to predict the number of COVID-19 cases and related outcomes. A fractional time delay dynamic system (FTDD) shows good agreement between its forecasts and public data [11]. A generalized logistic model (GLM) indicates that epidemic growth was exponential in China [12]. One prediction is that the situation will worsen across Europe and that the USA will become the epicenter of new cases in mid-April 2020 [13]. Another is that around 2 million people will be infected by the beginning of May if no measures are taken [14]. Such predictions and estimates help to strengthen strategies to prevent the pandemic from worsening. In this study, we used the available data to forecast the number of confirmed COVID-19 cases, deaths and recoveries in Pakistan for the upcoming month. This forecast is intended to help government institutions, policy makers and the public in adopting new strategies and strengthening the existing preventive measures against the COVID-19 pandemic.
In addition, this study may help in relieving the current socioeconomic and psychosocial distress caused by COVID-19 among the public in Pakistan.

Methods

Data Source

We obtained data on the daily cumulative number of confirmed COVID-19 cases, deaths and recoveries from February 26, 2020 to April 12, 2020 from the official reports of the National Institute of Health (NIH), Islamabad, Pakistan [15]. NIH is an autonomous health research institute of the Ministry of National Health Services of Pakistan, situated in Islamabad; its primary responsibilities include biomedical and health-related research and vaccine manufacturing [15]. Since the first case of COVID-19 was observed in Pakistan, NIH has collected and published daily reports on COVID-19 [15]. The published data include the total number of confirmed cases in Pakistan (by province), deaths and recoveries from COVID-19. The analysis is based on aggregated numbers of confirmed cases, deaths and recoveries published online by NIH; therefore, ethical approval is not required for this study.

Statistical model used for the analysis

The available data are limited and are affected by fluctuations, i.e. the number of cases reported varied widely from day to day. As a result, cumulative data are used to predict the number of cases in Pakistan. The cumulative numbers of COVID-19 confirmed cases, deaths and recoveries are expected to show exponential growth over time. Therefore, we used a simple time series method, the Auto-Regressive Integrated Moving Average (ARIMA) model [16], to forecast the number of cases, deaths and recoveries for the upcoming month. The ARIMA model has higher fitting and forecasting accuracy than exponential smoothing [17]. It can capture both seasonal and non-seasonal patterns. Because of the limited data available, we focus on non-seasonal models to describe the pattern (growth) over time. Hence, we assumed that the pattern of current cases will continue in the near future (for at least a month). We believe that the ARIMA model, which combines Autoregressive (AR) and Moving Average (MA) components, fits the nature of the available data well and provides good forecasts for short time series. The forecasts and prediction intervals until the end of May are produced from the fitted model. To specify the model, the parameters (p, d, q) are identified using the autocorrelation function (ACF) and partial autocorrelation function (PACF), where p is the autoregressive order, d is the differencing order and q is the moving average order. Furthermore, the candidate ARIMA (p, d, q) models are compared using the Akaike information criterion (AIC), a goodness-of-fit criterion under which the model with the minimum AIC is considered best. All statistical analyses were conducted using the R packages forecast, tseries and zoo [18], which support fitting ARIMA models.
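For reference, the ARIMA (p, d, q) model and the AIC described above can be written in standard textbook notation. The paper does not state these formulas, so the symbols used here (backshift operator B, coefficients φ and θ, white-noise error ε, parameter count k and maximized likelihood L̂) are assumptions taken from the usual convention, and the model is shown without a constant term.

```latex
% ARIMA(p, d, q) in backshift notation (B y_t = y_{t-1}), written without a constant term
\phi(B)\,(1 - B)^{d}\, y_t = \theta(B)\,\varepsilon_t,
\qquad
\phi(B) = 1 - \phi_1 B - \dots - \phi_p B^{p},
\qquad
\theta(B) = 1 + \theta_1 B + \dots + \theta_q B^{q}

% Second-order differencing (d = 2) expanded
(1 - B)^{2}\, y_t = y_t - 2\,y_{t-1} + y_{t-2}

% Akaike information criterion: k = number of estimated parameters,
% \hat{L} = maximized value of the likelihood; the smaller AIC is preferred
\mathrm{AIC} = 2k - 2\ln \hat{L}
```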

Results

In Pakistan, the number of COVID-19 cases is now increasing exponentially (Fig. 1). As of 12 April 2020, the cumulative number of confirmed cases in Pakistan was 5230, with most cases occurring in Punjab (2464), followed by 1411, 744, 228, 224, 119 and 40 cases in Sindh, Khyber Pakhtunkhwa (KP), Baluchistan, Gilgit Baltistan (GB), Islamabad and Azad Jammu Kashmir (AJK), respectively (Table 1). There were 91 confirmed deaths due to COVID-19 and 13 cases were critical, while 1028 patients had recovered.
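As a quick consistency check on the figures quoted above, the regional counts can be summed and compared with the reported total. The sketch below is illustrative only; the variable names are hypothetical, and the two shares are crude proportions of confirmed cases on 12 April, not formal fatality or recovery rates.

```python
# Regional confirmed cases on 12 April 2020, in the order quoted in the text
regional_cases = [2464, 1411, 744, 228, 224, 119, 40]
total_cases = 5230   # reported national total
deaths = 91          # reported deaths
recoveries = 1028    # reported recoveries

# The regional counts should add up to the national total
regional_sum = sum(regional_cases)

# Crude shares of confirmed cases (assumption: simple ratios, no lag adjustment)
death_share = deaths / total_cases
recovery_share = recoveries / total_cases

print(regional_sum == total_cases)
print(f"deaths: {death_share:.2%}, recoveries: {recovery_share:.2%}")
```

The regional counts add up to the reported 5230 cases; deaths account for roughly 1.7% and recoveries for roughly 19.7% of confirmed cases at that date.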
Fig. 1

Daily confirmed cases, recoveries and deaths of COVID-19 in Pakistan.

Table 1

Number of confirmed cases, deaths and recoveries by date and region from 26 February to 12 April 2020.

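Date | Punjab | Sindh | KP | Baluchistan | GB | AJK | Islamabad | Total Cases | Total Deaths | Total Recoveries
26/2/20 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0
04/3/20 | 0 | 4 | 0 | 0 | 1 | 0 | 0 | 5 | 0 | 0
11/3/20 | 0 | 9 | 0 | 1 | 3 | 0 | 2 | 20 | 0 | 2
18/3/20 | 33 | 208 | 19 | 23 | 13 | 1 | 5 | 302 | 2 | 5
25/3/20 | 323 | 413 | 80 | 131 | 83 | 1 | 25 | 1057 | 8 | 21
01/4/20 | 845 | 743 | 276 | 169 | 187 | 9 | 62 | 2291 | 31 | 107
12/4/20 | 2464 | 1411 | 744 | 228 | 224 | 40 | 119 | 5230 | 91 | 1028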
Using 47 days of data, from 26 February 2020 to 12 April 2020, and the ARIMA model, we forecast the series up to 31 May. Because the time series are non-stationary, their mean and variance change over time. Therefore, double differencing was used to make the data stationary by removing trends. ARIMA (0,2,1), ARIMA (2,2,0) and ARIMA (1,2,1) models were applied to the numbers of confirmed cases, recoveries and deaths, respectively, and the forecasts over time (days) are plotted in Fig. 2 (confirmed cases), Fig. 3 (deaths) and Fig. 4 (recoveries). Results from the model revealed that the number of confirmed cases shows rapid exponential growth and may increase by 2.7 times compared to current cases by the end of May 2020. The 95% prediction interval for confirmed cases is 5681 to 33079, whose upper end implies much higher growth (Fig. 2). The forecasting model for deaths revealed that deaths may increase to as many as 500 by the end of May if the current mortality rate prevails. The 95% prediction interval for deaths is estimated to be 168 to 885 (Fig. 3). Similarly, the forecast model for recoveries showed exponential growth. The model revealed that the number of recoveries may increase eightfold by the end of May, with a 95% prediction interval of 2391 to 16126 (Fig. 4).
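To make the double differencing step concrete, the sketch below applies it to the six weekly cumulative totals of confirmed cases listed in Table 1 (26/2 to 1/4), and then converts the reported forecast figures into multiples of the 5230 cases on 12 April. This is an illustration only: the paper fitted its models to daily data in R, and the helper `difference` and all variable names here are hypothetical.

```python
def difference(series, order=1):
    """Apply `order` rounds of first differencing to a list of numbers."""
    out = list(series)
    for _ in range(order):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out

# Cumulative confirmed cases at the weekly dates in Table 1
# (26/2, 4/3, 11/3, 18/3, 25/3, 1/4); the study itself used daily data.
weekly_totals = [2, 5, 20, 302, 1057, 2291]

first_diff = difference(weekly_totals, order=1)   # new cases per week
second_diff = difference(weekly_totals, order=2)  # week-on-week change in new cases
print(first_diff)   # [3, 15, 282, 755, 1234]
print(second_diff)  # [12, 267, 473, 479]

# Reported forecast figures expressed relative to the 12 April total
current_cases = 5230
implied_cases = round(2.7 * current_cases)  # "2.7 times" the current cases
lower_ratio = 5681 / current_cases          # lower end of the 95% prediction interval
upper_ratio = 33079 / current_cases         # upper end of the 95% prediction interval
print(implied_cases, round(lower_ratio, 2), round(upper_ratio, 2))
```

On these numbers, a 2.7-fold increase corresponds to about 14,121 confirmed cases, and the 95% prediction interval spans roughly 1.09 to 6.32 times the 12 April total, which shows how wide the uncertainty around the point forecast is.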
Fig. 2

Forecast of confirmed cases up to 31 May 2020.

Fig. 3

Forecast of deaths up to 31 May 2020.

Fig. 4

Forecast of recoveries up to 31 May 2020.


Discussion and Conclusion

This is a first formal study to make a short-term forecast of COVID-19 confirmed cases as well as the number of related deaths and recoveries. The results of this study revealed that there could be a three-fold increase in the number of confirmed cases by the end of May if the current trend continues. Currently, the number of deaths is quite low in Pakistan, but results from the model revealed that it may increase to as many as 500 by the end of May. On the other hand, recoveries from COVID-19 related complications were slow in the beginning but are now increasing exponentially, and the forecasting model shows that there could be an eightfold increase in the number of recoveries. However, the number of confirmed cases in Pakistan is increasing at a higher rate than the number of recoveries, as the disease is spreading to a wider region of the country. Similar forecasts have been produced by other researchers, such as [10], but with slightly different forecasting methods and with results that are not presented in a way a layperson can easily understand, while others are planning to conduct, or are continuing to conduct, this kind of analysis [19]. It is important to note that the majority of research studies model preparedness scenarios to inform planning rather than make predictions [20]; that is, they inform the actions to be taken to slow the spread and prepare the health system to respond to the pandemic. In addition, their analysis has focused on countries in the South East Asia and Western Pacific regions and ignored other countries such as Pakistan [20], which neighbors China. Pakistan expects dire consequences from the full-fledged global COVID-19 pandemic, such as 12.3 to 18.53 million layoffs across different sectors of the economy in the aftermath of partial or complete shutdowns due to the countrywide outbreak and lockdowns.
Pakistan has formulated and taken drastic precautionary measures to contain COVID-19, including [21] establishing the National Coordination Committee for COVID-19 and the National Disaster Management Authority [22], closing all educational institutions, sealing borders with neighboring countries, imposing a travel ban and screening travelers, enforcing social distancing and a ban on public gatherings, and ensuring comprehensive food security. These precautionary measures seem to be working well in containing COVID-19 compared with data from other nations worldwide [23,24,25]. Yet the results of this study suggest an increasing trend of COVID-19 cases and deaths for the upcoming month, and we recommend continuing the above measures or adopting more stringent ones to contain COVID-19. We believe that the forecasts established by this study are useful for the Pakistani government and public in making informed decisions and taking appropriate steps to prevent further spread of COVID-19. We assume that the data recorded by NIH [26] in Pakistan are accurate, and we used appropriate forecasting methods (ARIMA time series modelling). The modelling strategy is based on current trends and non-seasonal time series variations, following the patterns shown in Fig. 1, and assumes that the data are accurate and that the trends will continue in the upcoming month of May. We used a conventional statistical approach, the AIC [27], for model assessment and selection. However, we acknowledge that our analysis rests on these assumptions, and if they do not hold, the forecast may be inaccurate. Furthermore, forecasting with time series models requires enough historical data, which is not the case in our analysis, and there is always uncertainty associated with prediction, as current patterns in the data may not continue into the future.
Pakistan is a developing country with a lack of medical facilities, which is worsening the situation further, and no vaccine or medicine has yet been developed to prevent or cure COVID-19 permanently. Public health officials and the government should take hard decisions to control the rapid increase of COVID-19. Besides officials, the general public should maintain social distancing and take precautions to ensure their safety and keep the disease from spreading further (Fig. 3).

CRediT authorship contribution statement

Muhammad Yousaf: Conceptualization, Methodology, Formal analysis, Project administration, Software, Visualization, Writing - original draft. Samiha Zahir: Data curation, Formal analysis, Investigation, Writing - original draft, Writing - review & editing. Muhammad Riaz: Conceptualization, Methodology, Supervision, Visualization, Validation, Writing - original draft, Writing - review & editing. Sardar Muhammad Hussain: . Kamal Shah: Software, Visualization, Validation.

Declaration of Competing Interest

We declare that none of the authors has any competing or conflicting interest.

1.  Coronavirus Infections-More Than Just the Common Cold.

Authors:  Catharine I Paules; Hilary D Marston; Anthony S Fauci
Journal:  JAMA       Date:  2020-02-25       Impact factor: 56.272

2.  Forecasting the novel coronavirus COVID-19.

Authors:  Fotios Petropoulos; Spyros Makridakis
Journal:  PLoS One       Date:  2020-03-31       Impact factor: 3.240

3.  Real-time forecasts of the COVID-19 epidemic in China from February 5th to February 24th, 2020.

Authors:  K Roosa; Y Lee; R Luo; A Kirpich; R Rothenberg; J M Hyman; P Yan; G Chowell
Journal:  Infect Dis Model       Date:  2020-02-14