
Forecasting daily new infections, deaths and recovery cases due to COVID-19 in Pakistan by using Bayesian Dynamic Linear Models.

Firdos Khan¹, Shaukat Ali², Alia Saeed^3,4, Ramesh Kumar³, Abdul Wali Khan⁵.

Abstract

COVID-19 has caused one of the deadliest pandemics around the globe. It emerged from the city of Wuhan, China, at the end of 2019 and has affected all continents of the world, with severe health implications as well as financial damage. Pakistan is also among the countries badly affected in terms of casualties and financial loss due to COVID-19. By 20 March 2021, Pakistan had reported 623,135 total confirmed cases and 13,799 deaths. A state space model called the 'Bayesian Dynamic Linear Model' (BDLM) was used to forecast daily new infections, deaths and recover cases of COVID-19. The recursive Kalman filter was used to estimate the states of the models and to forecast new observations. The twenty-days-ahead forecast shows that the maximum number of new infections is 4,031 per day, with a 95% prediction interval of 3,319-4,743. The death forecast shows a maximum of 81 deaths per day, with a 95% prediction interval of 67-93. Maximum daily recoveries are 3,464, with a 95% prediction interval of 2,887-5,423, in the next 20 days. The average numbers of new infections, deaths and recover cases are 3,282, 52 and 1,840 per day, respectively, in the upcoming 20 days. Because the data generating processes have been identified from the latest data, the models can be updated as new data become available to provide the latest forecast.

Year: 2021 PMID: 34138956 PMCID: PMC8211153 DOI: 10.1371/journal.pone.0253367

Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240

1. Introduction

Various pandemics and contagious viral infections, such as influenza, Zika, MERS, Spanish flu and Ebola, have emerged in the past and badly affected human lives and the economies of major regions of the world [1, 2]. Currently the world is facing a viral infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It was initially reported in the city of Wuhan, China, in December 2019, spread across all continents of the world and was named COVID-19 [3]. Within a span of three months, the virus spread rapidly and reached most countries of the world [3]. The world adopted different measures to contain the spread of the virus, such as limiting social mobility and imposing lockdowns from time to time, and consequently suffered steep economic losses. The second wave of COVID-19 was more lethal than the first wave, causing more deaths. The third wave has begun in Pakistan and is more alarming than the second wave in terms of newly emerging infections, contagiousness, complications, hospital admissions and daily reported fatalities. According to World Health Organization (WHO) and Worldometer statistics on COVID-19, Pakistan ranks 31st in the world by total confirmed cases [3, 4]. Time series models (TSMs) have a multitude of applications and play a significant role in forecasting in areas such as finance, economics, engineering, climatology, epidemiology and hydrology [5-10]. TSMs do not merely describe existing trends but can also help to explain the data generating process (DGP) and predict future trends. Moreover, TSMs have an advantage over mechanistic models (MMs) for forecasting future disease trends, because fitting MMs requires highly explicit epidemiological information [10]. 
Most TSMs, including the popular autoregressive (AR), moving average (MA), autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA) classes, are suited to stationary data. Therefore, if the data are not stationary, they must be transformed so that they become stationary. Such transformations can affect the results. In many cases results are required in the original units, so the results must then be back-transformed. State space models (SSMs) offer a very rich class of models with advantages over the above-mentioned TSMs: they do not require stationarity and therefore do not require transformation of the data [11-16]. SSMs treat time series data as the output of a dynamic system perturbed by random disturbances. The Bayesian dynamic linear model (BDLM), or simply dynamic linear model (DLM), is a special case of the general SSM that is linear and Gaussian, and it is useful for modelling and forecasting time series data. In DLMs, estimation of the states and forecasting can be done recursively using the well-known Kalman filter [11]. These models allow a natural interpretation of a time series as a combination of several components, such as seasonal, trend or regressive components. In addition, they have a powerful and sophisticated probabilistic structure and thus offer an adaptable framework for a very wide variety of applications in different areas. Given the available information, computations including estimation and forecasting can be carried out with recursive algorithms by computing the conditional distributions of the quantities of interest. These models are therefore naturally treated within a Bayesian framework. For further details about DLMs and their applications, the reader is referred to [11-16]. 
In the current pandemic, researchers have used different techniques, including statistical, mathematical, machine learning and deep learning methods, to model and forecast new infections, deaths and recoveries due to COVID-19 [17-21]. [21] compared six time series models: ARIMA [7], the Holt–Winters additive model (HWAAS) [22], TBAT [23], Facebook’s Prophet [24], DeepAR [25] and N-Beats [26]. They concluded that the ARIMA and TBAT models performed better for seven out of ten countries. [27] compared different forecasting methods to choose the best method for forecasting deaths due to COVID-19 in the world. These methods include the simple average, moving average, naive method, Holt linear trend method, single exponential smoothing, ARIMA and the Holt-Winters method. They concluded that ARIMA provided the best-fitting model for their time series data. [28] used the Holt model and the ARIMA model to predict the future situation of COVID-19 in China and the United States of America (USA). They concluded that the epidemic had ended after May in the Hubei Province of China, whereas the situation in the USA had become more severe after May 2020. [29] used a hybrid approach combining fractals and fuzzy logic to forecast COVID-19 in ten countries with forecast windows of 10 and 30 days. [30] compared a multiple ensemble neural network model with fuzzy response aggregation against monolithic neural networks and concluded that the former outperformed the latter. [31] used multivariate Markov prognostic models to identify high mortality risk in hospitalized COVID-19 patients; however, these models require several data sets, including comorbidities, demographics and laboratory values taken at admission and during hospitalization. [18] used an ARIMA model to forecast new infections, deaths and recover cases due to COVID-19 in Pakistan. 
Their results revealed high exponential growth in the number of new infections, deaths and recover cases during May 2020 in Pakistan. [32] used vector autoregressive (VAR) models to model and forecast daily new cases, deaths and recoveries of COVID-19 in Pakistan. The results of [18] deviate more from the real data, whereas the findings of [32] are closer to the recorded data. [19] compared different forecasting models for cumulative new cases and recover cases between February and June 2020. They concluded that the ARIMA model performed better than the other models. Most time series models (ARIMA, SARIMA, ARFIMA, and non-linear models including ARCH, GARCH etc.) require stationary time series data; however, by applying a transformation to the data, we may lose important information about extreme phenomena. The DLM does not require the assumption of stationarity, and it can be used to model and forecast time series data with non-stationarity, structural breaks and no clear pattern. None of the studies reviewed here used a DLM to forecast COVID-19 in Pakistan. Secondly, most of the mentioned studies in the context of Pakistan provided forecasts only 5 or 10 days ahead for new infections, deaths and recover cases. Because of its flexible framework and dynamic nature, we propose to use the DLM for modelling and forecasting daily new cases, deaths and recoveries of COVID-19 in Pakistan. In addition, this study provides forecasts for a longer horizon than the previous studies, which may help the concerned departments in making their policies. This study aims to identify the data generating process of three variables (daily new infections, deaths and recover cases) using the DLM, and to provide forecasts of these three variables for COVID-19 in Pakistan.

2. Data and study area

Daily data on new cases, deaths and recover cases of COVID-19 were collected from the World Health Organization (WHO) and Worldometer [3, 4]. The first case of COVID-19 in Pakistan was reported on 26 February 2020. Therefore, the data used in this study range from 26 February 2020 to 20 March 2021. Our study region is Pakistan; nevertheless, the proposed methods can be applied to subregions or to larger areas.

3. Methodology

3.1. DLM’s structure

The DLM is a popular technique for smoothing and forecasting time series. It has two main components, unobserved states and observed data. The model is explained for daily new infections (DNI) only; however, it can be explained in the same way for daily deaths and recover cases.

Unobserved states: $\theta_1, \theta_2, \ldots, \theta_T$

Observed data: $DNI_1, DNI_2, \ldots, DNI_T$

where $DNI_t$ represents daily new infections of COVID-19 at time $t$ in this study. $P(\theta_t \mid \theta_{t-1})$ is the transition probability of the states, implying the well-known Markov property that the current state depends only on the previous state, for $t = 1, 2, \ldots, T$. The probability of the observed data at time $t$ is $P(DNI_t \mid \theta_t)$, implying that the observed data depend on the current state. The DLM can then be represented by the following equations:

$$DNI_t = F_t^{\top}\theta_t + \epsilon_t, \qquad \epsilon_t \sim N(0, E_t) \tag{1}$$

$$\theta_t = G_t\,\theta_{t-1} + \omega_t, \qquad \omega_t \sim N(0, W_t) \tag{2}$$

Here $\epsilon_t$ and $\omega_t$ are two white noise error terms, independent both within and between themselves, with zero mean and known covariances $E_t$ and $W_t$, respectively. Eqs (1) and (2) are called the observation equation and the state (or system) equation, respectively. It is further assumed that the prior distribution of $\theta_0$ is Gaussian, i.e.,

$$\theta_0 \sim N(m_0, C_0)$$

where $\theta_t$ is a vector of unobserved states of the system of length $m$ that are assumed to evolve over time according to the linear system operator (state transition) $G_t$, a matrix of order $m \times m$. For time series data, the states or different features can be trend, seasonality or regressive components [12, 33]. The observations can then be expressed as in Eq (1). We observe a linear combination of the states with a matrix $F_t$ ($m \times p$), which serves as the observation operator that transforms the model states into a time series observation. The dependence structure of the model presented in Eqs (1) and (2) is given below:

$$\begin{array}{ccccccccc}
\theta_0 & \rightarrow & \theta_1 & \rightarrow & \theta_2 & \rightarrow & \cdots & \rightarrow & \theta_T \\
 & & \downarrow & & \downarrow & & & & \downarrow \\
 & & DNI_1 & & DNI_2 & & \cdots & & DNI_T
\end{array}$$
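To make the roles of the observation and state equations concrete, the following minimal Python sketch simulates the simplest possible DLM, a local-level (random walk plus noise) model with a scalar state. This is an illustrative assumption only: the text does not state which state components the fitted models contain, and the function name, the parameter values and the use of constant variances `E` and `W` are hypothetical.

```python
import random

def simulate_local_level(T, m0, C0, E, W, seed=0):
    """Simulate T steps of a local-level DLM (assumed structure, F = G = 1).

    Observation equation: y_t     = theta_t     + eps_t,   eps_t   ~ N(0, E)
    State equation:       theta_t = theta_{t-1} + omega_t, omega_t ~ N(0, W)
    Prior:                theta_0 ~ N(m0, C0)
    """
    rng = random.Random(seed)
    theta = rng.gauss(m0, C0 ** 0.5)          # draw theta_0 from the prior
    states, obs = [], []
    for _ in range(T):
        theta = theta + rng.gauss(0.0, W ** 0.5)   # state evolves
        y = theta + rng.gauss(0.0, E ** 0.5)       # noisy observation of the state
        states.append(theta)
        obs.append(y)
    return states, obs

# Hypothetical values chosen only for illustration.
states, obs = simulate_local_level(T=30, m0=1000.0, C0=100.0, E=400.0, W=25.0)
print(len(states), len(obs))
```

Setting `W = 0` freezes the state, so the series fluctuates around a constant level; a larger `W` lets the level drift, which is how a DLM can follow a series without a stationarity assumption.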

3.2. State’s estimation of DLM

For a given DLM, the major tasks are to draw inference about the unobserved states or to forecast future observations based on part of the available observation sequence [34]. Conditional distributions of the quantities of interest, given the available information, are used to solve the problems of estimation and forecasting. To estimate the state vector, we compute the conditional probability density $p(\theta_s \mid DNI_{1:t})$. It is important to distinguish among filtering ($s = t$), state prediction ($s > t$) and smoothing ($s < t$). In filtering, the data are assumed to arrive sequentially, which is usual for time series. We therefore need a procedure to estimate the current value of the state on the basis of the observations up to time $t$ (for example, now), and to update our estimate and forecast as new data become available at the next time point ($t+1$). To solve the filtering problem, we compute the conditional density $p(\theta_t \mid DNI_{1:t})$. In the DLM, the Kalman filter [35] provides formulae to update our current inference on the state vector as new information becomes available. This refers to passing from the filtering density $p(\theta_t \mid DNI_{1:t})$ to $p(\theta_{t+1} \mid DNI_{1:t+1})$.
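The filtering recursion described above can be sketched in a few lines of Python for the scalar local-level case. This is a minimal sketch under stated assumptions, not the fitted model of the study: the local-level structure (F = G = 1), the constant variances `E` and `W`, and the function name are all hypothetical choices made for illustration.

```python
def kalman_filter_local_level(ys, m0, C0, E, W):
    """Kalman filter for an assumed local-level DLM with scalar state.

    Returns filtered means/variances and one-step-ahead forecast
    means/variances for every observation in ys.
    """
    m, C = m0, C0
    out = {"m": [], "C": [], "f": [], "Q": []}
    for y in ys:
        a, R = m, C + W          # prior for theta_t given data up to t-1
        f, Q = a, R + E          # one-step-ahead forecast of y_t
        K = R / Q                # Kalman gain
        m = a + K * (y - f)      # filtered mean of theta_t given data up to t
        C = R * (1.0 - K)        # filtered variance
        out["m"].append(m)
        out["C"].append(C)
        out["f"].append(f)
        out["Q"].append(Q)
    return out

# Hypothetical toy series, for illustration only.
res = kalman_filter_local_level([10.0, 12.0, 11.0, 13.0], m0=0.0, C0=1.0, E=1.0, W=1.0)
print([round(v, 3) for v in res["m"]])
```

Each pass through the loop moves from the filtering density at time t-1 to the one at time t, so new observations can be absorbed one at a time without refitting from scratch.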

3.3. Validation of DLM

The fitted DLM can be validated graphically as well as numerically. Graphically, the performance of the fitted model can be assessed by analyzing the residuals and checking their histogram and probability density function, together with a quantile-quantile plot (qqplot) of observed and model-simulated data. If the model is properly specified, the residuals of the fitted model should follow a Gaussian distribution with zero mean. A comparison can also be made between the observed and model-predicted data to assess the performance of the fitted model. Numerically, the performance of the fitted model is evaluated by comparing the distributions of observed and simulated data through the minimum, first quartile, mean, median, third quartile and maximum values. In addition, the model can be evaluated by comparing the probability density function of the observed data with that of the model-simulated data.
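The numerical checks described above can be sketched with the Python standard library. The helper names and the toy data are hypothetical, and the quartile convention (`statistics.quantiles` with its default method) is an assumption; the text does not say which convention was used.

```python
import statistics

def summary(xs):
    """Minimum, quartiles, mean and maximum of a sample."""
    q1, med, q3 = statistics.quantiles(xs, n=4)
    return {"min": min(xs), "q1": q1, "median": med,
            "mean": statistics.fmean(xs), "q3": q3, "max": max(xs)}

def compare(observed, simulated):
    """Simulated minus observed for every summary statistic."""
    so, ss = summary(observed), summary(simulated)
    return {key: ss[key] - so[key] for key in so}

def standardized_residuals(ys, forecasts, variances):
    """One-step forecast errors scaled by their predictive standard deviation."""
    return [(y - f) / q ** 0.5 for y, f, q in zip(ys, forecasts, variances)]

# Hypothetical toy data, for illustration only.
observed = [1, 2, 3, 4, 5, 6, 7]
simulated = [1, 2, 3, 5, 5, 6, 8]
print(compare(observed, simulated))
```

For a well-specified model the standardized residuals should look like draws from a standard normal distribution, so their mean should be close to zero and their standard deviation close to one.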

3.4. Forecasting with DLM

Forecasting is the eventual objective of time series modelling, and the forecast horizon depends on the nature and objectives of the study. The estimation of the state is then just a step towards the prediction of future observations. For instance, suppose we want a one-step-ahead forecast of the next observation $DNI_{t+1}$ based on the available data $DNI_{1:t}$. We first need to estimate the next value of the state vector, $\theta_{t+1}$, and then forecast the next observation based on that state. The one-step-ahead predictive density of the state is $p(\theta_{t+1} \mid DNI_{1:t})$, which is based on the filtering density of $\theta_t$. From this, the predictive density of the observation can be calculated as $p(DNI_{t+1} \mid DNI_{1:t})$. If the interest is in a k-step-ahead forecast of DNI ($DNI_{t+k}$), then we need to estimate the evolution of the system, denoted by $\theta_{t+k}$. The predictive density $p(\theta_{t+k} \mid DNI_{1:t})$ can be used to solve the state prediction problem. Once the predictive density of the k-step-ahead state is obtained, the k-step-ahead predictive density $p(DNI_{t+k} \mid DNI_{1:t})$ of the future observation at time $t+k$ can be calculated from it. The forecast becomes more and more uncertain as the time $t+k$ moves further into the future; however, the uncertainty can be quantified by using the predictive density of $DNI_{t+k}$ given $DNI_{1:t}$ [12, 36]. The above procedure can be repeated in the same fashion for daily deaths and recover cases due to COVID-19 in Pakistan by replacing DNI with DD (daily deaths) and DRC (daily recover cases), respectively.
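The growth of forecast uncertainty with the horizon k can be illustrated with a short Python sketch. It again assumes a scalar local-level model with constant variances `E` and `W`, which is a simplification chosen for illustration and not the specification used in the study; the function name and numbers are hypothetical. In this simple model the forecast mean is flat, whereas the forecasts reported in this study vary from day to day, so the fitted models presumably contain additional components.

```python
from statistics import NormalDist

def forecast_local_level(m_t, C_t, E, W, k, level=0.95):
    """k-step-ahead predictive mean, variance and interval (assumed F = G = 1)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # about 1.96 for a 95% interval
    a = m_t                 # predicted state mean is unchanged when G = 1
    R = C_t + k * W         # state variance grows by W at every step
    f, Q = a, R + E         # predictive mean and variance of the observation
    half = z * Q ** 0.5
    return f, Q, (f - half, f + half)

# Hypothetical filtered state and variances, for illustration only.
for k in (1, 5, 20):
    f, Q, (lo, hi) = forecast_local_level(m_t=3000.0, C_t=2500.0, E=40000.0, W=900.0, k=k)
    print(k, round(f), round(lo), round(hi))
```

The interval widens with k because the state variance accumulates `W` at every step, which is the formal counterpart of the statement that forecasts become more uncertain further into the future.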

4. Results

The results of the study are divided and presented into three subsections:

4.1. Model’s evaluation

The specified DLMs are evaluated graphically and numerically in Figs 1–3 and Table 1. Figs 1–3 show the probability density functions of daily new cases, deaths and recover cases for both observed and model-simulated data. Lines are drawn at the 25th, 50th, 75th and 95th percentiles in each graph, for both observed and simulated data, to aid interpretation. The results show that the fitted models closely simulate the observed data sets; however, the difference between observed and simulated data is a little higher at the 95th percentile. The 95th percentile is overestimated for all three variables, but the difference is smaller for daily new cases than for daily recover cases. The model captured daily deaths better than new and recover cases, which could provide a good estimate of mortality rates due to COVID-19. Table 1 gives a numerical evaluation of the fitted DLMs for all three variables by comparing the distributions of observed and simulated data. For daily new cases, the model captured the distribution of the observed data well, and the minimum values are the same for the observed and simulated data sets. However, the remaining statistics in Table 1 are slightly underestimated for daily new cases. For daily deaths, the statistics of the simulated and observed data are very close: the observed median, 3rd quartile and maximum are 32, 56 and 153, and the simulated values are 31, 54 and 155, respectively. For daily recover cases, the differences are a little higher than for daily new cases and daily deaths. However, the model closely reproduced the minimum, median and mean, which are 0, 1,013 and 1,534 for the observed data and 0, 978 and 1,527 for the simulated data, respectively.

Fig 1

Evaluation of DLM using probability density function of observed and model simulated daily new cases of COVID-19 in Pakistan.

The vertical lines show 25th, 50th, 75th and 95th percentiles for observed and simulated data sets. Red and blue colors represent observed and simulated data sets, respectively. On the x-axis and y-axis, the number of daily new cases and density are given, respectively.

Fig 3

Evaluation of DLM using probability density function of observed and model simulated daily recover cases of COVID-19 in Pakistan.

Table 1

Distribution of observed and simulated daily new cases, daily deaths, and daily recover cases of COVID-19 in Pakistan.

Variable	Data	1st Quartile	Median	Mean	3rd Quartile	Maximum
Daily new infections	Observed	617	1,352	1,649	2,397	6,825
Daily new infections	Simulated	589	1,318	1,622	2,337	6,681
Daily deaths	Observed	9	32	36	56	153
Daily deaths	Simulated	10	31	35	54	155
Daily recover cases	Observed	411	1,013	1,534	1,855	16,813
Daily recover cases	Simulated	331	978	1,527	2,042	13,375
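As a reading aid for Table 1, the short Python sketch below computes the percentage difference between simulated and observed values for each statistic. The numbers are copied from the table; the variable and function names are hypothetical.

```python
stats = ["1st quartile", "Median", "Mean", "3rd quartile", "Maximum"]

# (observed, simulated) rows copied from Table 1.
table1 = {
    "Daily new infections": ([617, 1352, 1649, 2397, 6825], [589, 1318, 1622, 2337, 6681]),
    "Daily deaths": ([9, 32, 36, 56, 153], [10, 31, 35, 54, 155]),
    "Daily recover cases": ([411, 1013, 1534, 1855, 16813], [331, 978, 1527, 2042, 13375]),
}

def percent_diff(observed, simulated):
    """Percentage difference of simulated relative to observed values."""
    return [100.0 * (s - o) / o for o, s in zip(observed, simulated)]

pct = {name: dict(zip(stats, percent_diff(o, s))) for name, (o, s) in table1.items()}
for name, row in pct.items():
    print(name, {key: round(value, 1) for key, value in row.items()})
```

All five statistics for daily new infections come out negative, matching the slight underestimation noted in the text, and the largest relative gap in the table is the maximum of daily recover cases, which is about 20% lower in the simulated data.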

Fig 2

Evaluation of DLM using probability density function of observed and model simulated daily deaths of COVID-19 in Pakistan.

4.2. Diagnostic checking

The results of the diagnostic checking of the models are presented in Figs 4–6, which cover daily new cases, deaths and recover cases, respectively. The first panel (top-left) of Fig 4 shows the residuals plot, which is centered on zero. The histogram (top-right) and pdf (bottom-left) show that the residuals of the specified model are approximately normally distributed. The fourth panel (bottom-right) of Fig 4 presents the quantile-quantile plot of observed and model-simulated daily new cases. For daily deaths and daily recover cases, the diagnostic checking results are given in Figs 5 and 6, respectively. Once the fitted models pass the diagnostic checks, they can be used for forecasting.

Fig 4

Upper left panel is residuals, upper right panel is histogram of residuals, bottom left panel is probability density function of residuals, bottom right panel is qqplot of observed and model simulated daily new cases of COVID-19 in Pakistan.

Fig 5

Fig 6

4.3. Forecasting

The forecasting results for daily new cases, deaths and recover cases are presented in Figs 7–9, respectively. Fig 7 shows a comparison between observed and model-simulated data from March 2020 to March 2021, where it can be seen how closely the specified model reproduced the observed data. On the right side of the vertical line, the results show the forecast of daily new cases with its 95% prediction interval. Over the forecast period, the DLM captured the variability of the data very well. This is one of the reasons why DLMs are preferred over other time series models: DLMs can elegantly model a time series with non-stationarity, structural breaks, no clear pattern etc. Although the model captured the variability well, a 95% prediction interval was calculated to account for the uncertainty. The horizontal line shows the average value of observed daily infections due to COVID-19. It is clear from Fig 7 that the forecast daily new cases are higher than the average value of new cases in Pakistan. Table 2 summarizes the forecasting results. The minimum number of daily new cases in the forthcoming 20 days is 2,479, with a 95% prediction interval of 1,767–3,191. The maximum number of daily new cases is 4,031, with a 95% prediction interval of 3,319–4,743. The average number of daily new cases in the upcoming 20 days is 3,282, with a 95% prediction interval of 2,570–3,994. The model’s forecast indicates that there will be 65,638 new infections in total in the next 20 days.

Fig 7

Comparison of observed and model simulated (average of 100 simulations) daily new cases of COVID-19 in Pakistan.

Blue, red and green colors show observed, model simulated and 95% prediction intervals of daily new cases. Horizontal line shows the average value of observed data while the vertical line shows the demarcation points between forecast and observed data. On the x-axis and y-axis, time in months and the number of daily new cases are given, respectively.

Fig 9

Comparison of observed and model simulated (average of 100 simulations) daily recover cases of COVID-19 in Pakistan.

Blue, red and green colors show observed, model simulated and 95% prediction intervals of daily recover cases. Horizontal line shows the average value of observed data while the vertical line shows the demarcation points between forecast and observed data. On the x-axis and y-axis, time in months and the number of daily recover cases are given, respectively.

Table 2

Forecast values for daily infections, fatalities and recover cases about COVID-19 with their corresponding 95% confidence intervals for Pakistan in upcoming 20 days.

The forecast results based on the average of 100 simulations.

Variable	Statistic	Forecast	95% CI lower limit	95% CI upper limit	Total cases in 20 days
Infections	Min	2,479	1,767	3,191	65,638
Infections	Max	4,031	3,319	4,743
Infections	Average	3,282	2,570	3,994
Fatalities	Min	42	29	56	1,035
Fatalities	Max	81	67	93
Fatalities	Average	52	38	65
Recover cases	Min	578	0	2,536	36,784
Recover cases	Max	3,464	2,887	5,423
Recover cases	Average	1,840	1,262	3,799
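As a quick arithmetic check on Table 2, the reported 20-day totals can be compared with 20 times the reported daily averages. The Python sketch below uses only numbers copied from the table; the variable names are hypothetical.

```python
HORIZON_DAYS = 20

# (average daily forecast, reported 20-day total) copied from Table 2.
table2 = {
    "Infections": (3282, 65638),
    "Fatalities": (52, 1035),
    "Recover cases": (1840, 36784),
}

gaps = {}
for name, (average, total) in table2.items():
    implied = average * HORIZON_DAYS      # total implied by the rounded average
    gaps[name] = implied - total
    print(name, implied, total, gaps[name])
```

The implied and reported totals differ by only 2, 5 and 16 cases, respectively, which presumably reflects rounding of the daily averages shown in the table.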

Fig 8

Comparison of observed and model simulated (average of 100 simulations) daily deaths of COVID-19 in Pakistan.

Blue, red and green colors show observed, model simulated and 95% prediction intervals of daily deaths. Horizontal line shows the average value of observed data while the vertical line shows the demarcation points between forecast and observed data. On the x-axis and y-axis, time in months and the number of daily deaths are given, respectively.

Fig 8 shows the results for daily deaths due to COVID-19 in Pakistan. The left side of the vertical line in Fig 8 shows a comparison between observed and model-simulated daily deaths from March 2020 to March 2021. On the right side of the vertical line, the results show the forecast of daily deaths and its 95% prediction interval. It is clear that the model captured the structure of the time series well. The number of deaths during the forecast period is higher than the average of observed deaths, which is indicated by the horizontal line. A summary of the model’s forecast is given in Table 2. Table 2 shows that the minimum number of daily deaths is 42, with a 95% prediction interval of 29–56. The maximum number of forecast daily deaths during the upcoming twenty days is 81, with a 95% prediction interval of 67–93. The forecast results indicate that, on average, there are 52 deaths per day during the next twenty days, with a 95% prediction interval of 38–65. In addition, the forecasting results show that the total number of deaths during the upcoming 20 days is 1,035.


Fig 9 presents the results for daily recover cases of COVID-19 in Pakistan. It shows a comparison between observed and model-simulated daily recover cases from March 2020 to March 2021. On the right side of the vertical line, the results show the model’s forecast, where the red line is the forecast of daily recover cases and the green lines show its 95% prediction interval. The forecasting results suggest that the recover cases in the upcoming 20 days are a little higher than the average value of observed daily recover cases. The forecast results for daily recover cases are summarized in Table 2. The results suggest that the minimum number of daily recover cases during the next twenty days is 578, with a 95% prediction interval of 0–2,536. The maximum number of daily recover cases is 3,464, with a 95% prediction interval of 2,887–5,423. The average number of daily recover cases in the forthcoming twenty days is 1,840, with a 95% prediction interval of 1,262–3,799. The results further indicate that the total number of recover cases during the next 20 days is 36,784.

5. Discussion

COVID-19 has changed lifestyles, businesses, education systems and much more around the world since it emerged at the end of 2019 [37], and the COVID-19 pandemic is considered the most significant global crisis since World War II [38]. However, the long-term impacts depend on the persistence of the pandemic. Because of its alarming spread, the surge in cases, complications and the daily reported morbidity and mortality indicators, this pandemic has been the focus of researchers, who have tried to forecast daily cases (infections, deaths and recover cases) due to COVID-19. For instance, [18] used an ARIMA model to forecast the future spread of daily new cases, deaths and recover cases. Their results suggested that new infections would increase 2.7-fold and the number of deaths eightfold by May 2020. However, besides forecast error, government policies affected the spread of COVID-19, and the observed numbers did not reach these high values. [39] used ARIMA with the Kalman filter to model and forecast the future behavior of COVID-19 in Pakistan. [39] made forecasts for the first five days of May 2020 based on the available data. Their results suggested that new infections, deaths and recoveries would reach 15,652, 6,342 and 516, respectively. The results of [39] show an increasing trend in all variables and provide higher estimates of the maximum values than [18]. [19] used an ARIMA model for modelling and forecasting the cumulative numbers of confirmed cases, deaths and recover cases in Pakistan. Their study provides forecasts for only ten days (25 June to 4 July 2020). Their findings suggest that the cumulative numbers of new infections, deaths and recover cases would be 231,239, 5,043 and 111,616, respectively, at the end of the forecast horizon. [32] used a VAR model for forecasting future scenarios of COVID-19 and provided 10-days-ahead forecasts (28 June to 7 July 2020). 
The results of their study suggested that the maximum numbers of daily new infections, deaths and recover cases would be 5,363, 167 and 4,016 during the forecast period. [29] combined fuzzy logic and fractal dimension to model and forecast time series of COVID-19 (confirmed cases and deaths). They used forecasting windows of 10 and 30 days with a forecast accuracy of 98%. [30] used an ensemble neural network model with fuzzy response aggregation to predict COVID-19 time series in Mexico. [40] used a differential equation model to model and forecast the future situation under different assumptions. Based on their results, they recommended strong control measures for infected and asymptomatic patients to reduce the total number of infections in the future. The current study provides twenty-days-ahead forecasts (21 March to 9 April 2021) based on the available data. The forecast results suggest that the minimum and maximum numbers of daily new infections are 2,479 and 4,031, respectively. The minimum and maximum daily deaths are 42 and 81, respectively, during the forecast period. The minimum and maximum daily recover cases are 578 and 3,464, respectively, during the next twenty days. The results suggest that the total numbers of new infections, deaths and recover cases during the entire forecast period are 65,638, 1,035 and 36,784, respectively. The results of this study are stable and consistent with the trends of the considered variables. Possible reasons include the following: first, the techniques we used take the dynamic nature of the system into account; secondly, more observations have been used in this study, and this may also have influenced the findings. It is important to state that forecasting is a complicated subject; therefore, these results could change for various reasons, including government policies regarding the current or future situation of COVID-19. 
These may include lockdowns, closure of institutions or reducing the number of employees present at offices each day. The results of [41] suggested that 15 days after lockdown, daily new infections due to COVID-19 and the growth factor of the disease showed a decreasing trend; however, there was no significant decline in mortality and prevalence in 27 randomly selected countries. [42] investigated the relationship of lockdown and social distancing with deaths due to COVID-19 in 16 European countries. Their results suggested a close relationship between deaths due to COVID-19 and the days elapsed until lockdown. However, there was a weak relationship between deaths and social distancing. Other studies have found that screening, quarantine, isolation in different settings and contact tracing can help to reduce new infections due to COVID-19 [43, 44].

6. Conclusion and recommendations

The forecast findings of the study indicate that the average daily new cases are higher than the average of the observed data. The minimum and maximum numbers of daily new cases during the next twenty days are 2,479 and 4,031, respectively. The average number of daily new cases and the total number of new cases during the forecasting period are 3,282 and 65,638, respectively. The results for daily deaths show that the minimum and maximum numbers are 42 and 81 per day, respectively. Average daily deaths during the upcoming twenty days are 52. The forecast results indicate that the total number of deaths during the next twenty days is 1,035. The forecast results for daily recover cases show that, on average, there are 1,840 recoveries per day during the next twenty days. The minimum and maximum numbers of daily recover cases during the forecasting period are 578 and 3,464, respectively. The total number of recover cases during the upcoming twenty days is 36,784. The findings of this study may be helpful to epidemiologists in designing future modelling based on this evidence, and also to policy-makers, planners and managers in the health sector. The specified models can be updated with the arrival of new data, and therefore this forecast could be used on a regular basis to provide rigorous information for decision making by the relevant departments in Pakistan. Recommendations for future research include Bayesian time series modelling and forecasting of daily new infections, deaths and recover cases; however, obtaining and quantifying prior information may not be an easy task. A comparison of different models (BDLM, NNs, ARIMA, machine learning algorithms etc.) could identify the best model for modelling and forecasting COVID-19 in Pakistan. 
Based on the results of this study, it is proposed that the Government of Pakistan contain the further spread of the virus and reduce daily new infections through diligent, strict measures across the country.

12 Apr 2021. Decision letter, PLOS ONE, manuscript PONE-D-21-07441, submitted under the title "Forecasting future scenarios of COVID-19 by using Bayesian Dynamic Linear Models in Pakistan".

Dear Dr. Khan, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. The manuscript requires further revisions regarding at least motivation, contributions, results, and discussion. Please submit your revised manuscript by May 27 2021 11:59PM. Kind regards, Stefan Cristian Gherghina, PhD. Habil., Academic Editor, PLOS ONE.

Journal requirements:
1. Ensure that the manuscript meets PLOS ONE's style requirements, including those for file naming.
2. Clarify the sources of funding for the study, following the financial disclosure "No fund received."
3. Amend the abstract on the online submission form or in the manuscript so that they are identical.
4. Include the tables as part of the main manuscript and remove the individual files.

Reviewers' responses to questions:
1. Is the manuscript technically sound, and do the data support the conclusions? Reviewer #1: Yes. Reviewer #2: Yes. Reviewer #3: Partly.
2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes. Reviewer #2: Yes. Reviewer #3: No.
3. Have the authors made all data underlying the findings in their manuscript fully available? Reviewer #1: Yes. Reviewer #2: Yes. Reviewer #3: No.
4. Is the manuscript presented in an intelligible fashion and written in standard English? Reviewer #1: Yes. Reviewer #2: Yes. Reviewer #3: Yes.

Reviewer #1: The authors of the paper describe their proposed approach for forecasting COVID-19 in Pakistan with Bayesian models. The topic is interesting and with possible applicability. However, the paper needs several improvements:
1. The main contribution and originality should be explained in more detail.
2. The motivation of the approach with Bayesian models needs further clarification; why not other models, like NNs?
3. The discussion of related work on COVID-19 should be expanded with more recent work.
4. Minor grammar and syntax issues need correction.
5. More simulation results and a formal comparison of results are needed.
6. The conclusions should be extended with more future work.
7. More references to COVID-19 papers should be included, like:
- Multiple ensemble neural network models with fuzzy response aggregation for predicting COVID-19 time series: the case of Mexico. Healthcare 2020;8:181.
- Modeling COVID-19 epidemic in Heilongjiang province, China. Chaos Solitons Fractals, 138, 1–5.
- Modeling and forecasting of epidemic spreading: the case of Covid-19 and beyond. Chaos Solitons Fractals 2020;135:109794.
- Forecasting of COVID-19 time series for countries in the world based on a hybrid approach combining the fractal dimension and fuzzy logic. Chaos, Solitons and Fractals 140 (2020) 110242.
- A Novel Method for a COVID-19 Classification of Countries Based on an Intelligent Fuzzy Fractal Approach. Healthcare 2021, 9, 196.

Reviewer #2: Comments:
1. By the time the authors receive this review, they will have obtained real data for the predicted values of new cases, deaths, and recovered cases. It is suggested to validate the predicted values with the real values.
2. The paper focuses heavily on the quantitative analysis only and does not analyze how and to what extent measures, including the closure of international borders, lockdowns, etc., helped to control infection trends in different countries. A short discussion could be insightful.
3. For some instances the predicted value is higher and for other instances it is lower than the real values. One of the reasons for the ambiguity could be a change in control policy and action. The authors are suggested to explain the ambiguity.

Reviewer #3: The merit of the article lies in the production of forecasts for Pakistan's situation in a given period with Bayesian Dynamic Linear Models, with a longer horizon than previous articles. However, there are several points that need attention. I will first list the main specific ones, and at the end, the general ones.

Specific comments:
- Previous related articles in the introduction could be mentioned in a more detailed critical way, that is, stating their merits and drawbacks in a clear way.
1. The value of the forecasts cannot be judged without having other models as benchmarks. At the very least there should be some naive benchmarks.
2. The forecast precision is judged with graphical means without any numerical backing, for which plenty of error metrics exist. Plots do not reveal the detail about the forecast errors.
3. For the benefit of the reader it would be good to be explicit about how forecasts are produced for different horizons (1, 2, ... , 20 days).
4. Lines 58-61: Add references to back the statements. Be explicit about what "mechanistic models" are. Do you mean state-space models? Do you mean compartmental epidemiological models?
5. Lines 115 and 116: Name the variables in such a way that cases, deaths and recoveries are distinguished clearly from each other. Does COV_t represent all of them? It is also helpful to illustrate the dependencies of states. See for example https://www.mdpi.com/2036-7449/13/1/27/htm or https://onlinelibrary.wiley.com/doi/epdf/10.1002/sim.2566
6. Equations 1 and 2: The details of F_t and G_t are not shown. What is their formulation for your model?
7. Lines 140-141: The dependencies in your model should be made explicit. In them, the variables should be clearly included. COV seems to be a package or an aggregation of variables (infections, deaths, recoveries). Is the formulation of dependencies in lines 140 and 141 generic, or the actual dependencies in your model?
8. Section 3.3: It is not clear how the dependency structure of the model (which was not clearly added) was constructed or validated. An article that shows construction via cross-correlation and validation via examining posterior probabilities is the following: https://onlinelibrary.wiley.com/doi/epdf/10.1002/sim.2566
9. Lines 165-167: The reasons for the choice should be stated.
10. Lines 301-302: Maximum and minimum figures do not correspond with the wording. You surely mean: minimum 2,132; maximum 4,149.
11. About the data: WHO is cited (lines 54 and 55), but the citation directs to https://www.worldometers.info/coronavirus/ There must be more clarity on which data is taken from which source.

General comments: Overall, the writing of the article seems rushed. This is understandable in the current situation for many researchers. But for the benefit of the readers and the scientific community, it is important to clearly assert the decisions made in the research, supporting them with graphic means if feasible and adding enough detail of the models. To this purpose, the authors can look at previous articles of the same type. Careful writing can also improve the quality of the work. Expressions such as "[...] forecast is a tricky-subject [...]" should be articulated in a more helpful way.

The main criticism of the study is that, while it aims at forecasting scenarios, those are not included. Scenarios are different situations that can be modelled and assessed through different means. For example, by varying the underlying assumptions, or the initial conditions of the model; or by assuming different initial populations in some models; or by considering the presence or absence of government interventions. The way scenarios are formulated varies, partly depending on the type of model, and there is ample freedom to do this. In the article they are not explicitly included. A sample study with scenarios determined by how authors modelled epidemic growth can be found at https://www.mdpi.com/2077-0383/9/2/523 This one defines scenarios based on the basic reproductive number, R0: https://www.medrxiv.org/content/10.1101/2020.03.16.20036939v1.full.pdf This one explores scenarios based on changing the proportion of infected with respect to other groups of people: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0230405

The absence of scenarios, having aimed the article at it, is a major issue. On this ground, the article, in my view, should be rejected. One possibility for the authors could be to submit their work under a different title, not including scenarios. Even in that case, the other comments provided are applicable, which would lead to major changes, aimed at demonstrating that the research is of a high standard. The sample publications mentioned in the comments might be helpful.

Do you want your identity to be public for this peer review? Reviewer #1: Yes: Oscar Castillo. Reviewer #2: No. Reviewer #3: No.

7 May 2021. Author response: The authors are extremely grateful to the editor and reviewers for sparing time from their busy schedules to review our manuscript. Submitted filename: Responses to Reviewers.docx.

4 Jun 2021. Decision letter, PLOS ONE, manuscript PONE-D-21-07441R1, "Forecasting daily new infections, deaths and recovery cases due to COVID-19 in Pakistan by using Bayesian Dynamic Linear Models".

Dear Dr. Khan, We're pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Kind regards, Stefan Cristian Gherghina, PhD. Habil., Academic Editor, PLOS ONE.

Reviewers' responses to questions:
1. Have the authors adequately addressed the comments raised in the previous round of review? Reviewer #1: All comments have been addressed. Reviewer #3: All comments have been addressed.
2. Is the manuscript technically sound, and do the data support the conclusions? Reviewer #1: Yes. Reviewer #3: Yes.
3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes. Reviewer #3: Yes.
4. Have the authors made all data underlying the findings in their manuscript fully available? Reviewer #1: Yes. Reviewer #3: Yes.
5. Is the manuscript presented in an intelligible fashion and written in standard English? Reviewer #1: Yes. Reviewer #3: Yes.

Reviewer #1: The authors have made all the suggested changes and have addressed all my concerns. In my opinion, the paper deserves publication.

Reviewer #3: (No Response)

Do you want your identity to be public for this peer review? Reviewer #1: No. Reviewer #3: No.

9 Jun 2021. Acceptance letter, PLOS ONE, manuscript PONE-D-21-07441R1, "Forecasting daily new infections, deaths and recovery cases due to Covid-19 in Pakistan by using Bayesian Dynamic Linear Models".

Dear Dr. Khan: I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department. Kind regards, PLOS ONE Editorial Office Staff, on behalf of Dr. Stefan Cristian Gherghina, Academic Editor, PLOS ONE.

16 in total

1. Comparative study of four time series methods in forecasting typhoid fever incidence in China.

Authors: Xingyu Zhang; Yuanyuan Liu; Min Yang; Tao Zhang; Alistair A Young; Xiaosong Li
Journal: PLoS One Date: 2013-05-01 Impact factor: 3.240

2. Multiple Ensemble Neural Network Models with Fuzzy Response Aggregation for Predicting COVID-19 Time Series: The Case of Mexico.

Authors: Patricia Melin; Julio Cesar Monica; Daniela Sanchez; Oscar Castillo
Journal: Healthcare (Basel) Date: 2020-06-19

3. Using the kalman filter with Arima for the COVID-19 pandemic dataset of Pakistan.

Authors: Muhammad Aslam
Journal: Data Brief Date: 2020-06-12

4. Reduction in mobility and COVID-19 transmission.

Authors: Pierre Nouvellet; Sangeeta Bhatia; Anne Cori; Kylie E C Ainslie; Marc Baguelin; Samir Bhatt; Adhiratha Boonyasiri; Nicholas F Brazeau; Lorenzo Cattarino; Laura V Cooper; Helen Coupland; Zulma M Cucunuba; Gina Cuomo-Dannenburg; Amy Dighe; Bimandra A Djaafara; Ilaria Dorigatti; Oliver D Eales; Sabine L van Elsland; Fabricia F Nascimento; Richard G FitzJohn; Katy A M Gaythorpe; Lily Geidelberg; William D Green; Arran Hamlet; Katharina Hauck; Wes Hinsley; Natsuko Imai; Benjamin Jeffrey; Edward Knock; Daniel J Laydon; John A Lees; Tara Mangal; Thomas A Mellan; Gemma Nedjati-Gilani; Kris V Parag; Margarita Pons-Salort; Manon Ragonnet-Cronin; Steven Riley; H Juliette T Unwin; Robert Verity; Michaela A C Vollmer; Erik Volz; Patrick G T Walker; Caroline E Walters; Haowei Wang; Oliver J Watson; Charles Whittaker; Lilith K Whittles; Xiaoyue Xi; Neil M Ferguson; Christl A Donnelly
Journal: Nat Commun Date: 2021-02-17 Impact factor: 14.919

5. The impact of time to impose lockdown on COVID-19 cases and deaths in European countries.

Authors: Carmen Martínez-Valero; Juande D Miranda; Francisco Javier Martín-Sánchez
Journal: Med Clin (Engl Ed) Date: 2020-10-30

6. Impact of the COVID-19 Epidemic on Lifestyle Behaviors and Their Association With Subjective Well-Being Among the General Population in Mainland China: Cross-Sectional Study.

Authors: Zhao Hu; Xuhui Lin; Atipatsa Chiwanda Kaminga; Huilan Xu
Journal: J Med Internet Res Date: 2020-08-25 Impact factor: 5.428
