
Forecasting COVID-19 daily cases using phone call data.

Bahman Rostami-Tabar1, Juan F Rendon-Sanchez2.   

Abstract

The need to forecast COVID-19 related variables continues to be pressing as the epidemic unfolds. Different efforts have been made, with compartmental models in epidemiology and statistical models such as AutoRegressive Integrated Moving Average (ARIMA), Exponential Smoothing (ETS) or computational intelligence models. These efforts have proved useful in some instances by allowing decision makers to distinguish different scenarios during the emergency, but their accuracy has been disappointing, forecasts often ignore uncertainty, and less attention is given to local areas. In this study, we propose a simple Multiple Linear Regression model, optimised to use phone call data to forecast the number of daily confirmed cases. Moreover, we produce a probabilistic forecast that allows decision makers to better deal with risk. Our proposed approach outperforms ARIMA, ETS, Seasonal Naive, Prophet and a regression model without call data, evaluated by three point forecast error metrics, one prediction interval and two probabilistic forecast accuracy measures. The simplicity, interpretability and reliability of the model, obtained in a careful forecasting exercise, is a meaningful contribution to decision makers at the local level who acutely need to organise resources in already strained health services. We hope that this model will serve as a building block for other forecasting efforts that, on the one hand, would help front-line personnel and decision makers at the local level and, on the other, would facilitate communication with other modelling efforts being made at the national level to improve the way we tackle this pandemic and other similar future challenges.
© 2020 Elsevier B.V. All rights reserved.


Keywords:  ARIMA; COVID-19; Call centres; Exponential smoothing; Probabilistic forecasting; Regression; Time series forecasting

Year:  2020        PMID: 33269029      PMCID: PMC7687495          DOI: 10.1016/j.asoc.2020.106932

Source DB:  PubMed          Journal:  Appl Soft Comput        ISSN: 1568-4946            Impact factor:   6.725


Introduction

Since its discovery at the end of 2019 in Wuhan, China, the spread of coronavirus (SARS-CoV-2) has shaken one country after another, putting governments, infrastructure, and local and international cooperation to the test, as it has claimed the lives of more than a million people worldwide [1] and caused severe disruptions to everyday life and the economy. The epidemic is evolving with regional and local differences. The way it spreads, as pointed out by Hamzah et al. [2], is largely influenced by each country’s policies. Consequently, in the UK, some of the regional differences might be strongly linked to differences in the approaches or policies, or their timing, used to tackle the epidemic in England, Wales, Scotland and Northern Ireland. Additionally, other factors linked to direct human interaction and exposure, such as living conditions (including income, accommodation, vulnerability), engagement in economic activities with close interaction, and mobility patterns, can further create differences in the evolution of the epidemic at regional and local levels. These complexities are being researched in interdisciplinary areas such as digital epidemiology [3], [4]. As the epidemic unfolds, decision makers face the difficulties posed by uneven developments and the limitations of models in predicting its evolution. Increasing numbers of confirmed cases lead to an increase in demand for hospital beds, and in the most severe cases also ICU beds. The uneven spread of the infection, manifested in large variations at the local level, makes it important to monitor its spread and to attempt to predict the number of new cases. The ability to accurately forecast the number of positive confirmed cases a few weeks in advance would allow local decision makers to put in place the measures required to increase hospital bed capacity for COVID-19 patients before the surge occurs, instead of reacting to it.
This study was initially motivated by a real-life forecasting problem faced by a local county council and the National Health Service (NHS) in England. The problem highlights the lack of a rigorous method to forecast the number of confirmed cases at the ground/local level. Decision makers at the local level need forecasts to: (i) inform general strategy; (ii) adjust hospital bed capacity; (iii) cancel/postpone non-urgent hospital operations; (iv) redirect available medical resources to COVID-19 wards; and (v) plan for extra capacity in body storage at mortuary services. Additionally, some limitations in the literature have also motivated the development of our study, which are summarised as follows: (i) the forecast accuracy of models is not systematically reported; (ii) there are big differences between predictions and what happened in reality, and forecast errors vary greatly; (iii) in some studies it is unclear how forecasts are generated and evaluated, e.g. whether a rolling window is used and what the forecast horizon is; and (iv) most studies focus on point forecasts and ignore the uncertainty associated with them, which may lead to less effective decisions. Despite the limitations of these efforts, the need for appropriate tools to aid decision making continues to be pressing, and sensible contributions from forecasting are sought after, as evidenced by publications appearing since the early stages of the outbreak [5]. Frustration with regard to forecast accuracy when forecasting COVID-19 has also been summarised by Ioannidis et al. [6]. A myriad of reasons has been given, including poor data, lack of incorporation of epidemiological features and lack of transparency. It is important to note that communication between decision makers at the local and national levels is also acutely needed.
Therefore, any experience with different forecasting and other decision tools at the local level, with a clear assessment of their accuracy, will facilitate the integration of models and communication as the epidemic continues to unfold. This paper contributes to the literature by developing a probabilistic forecasting approach that uses phone call data to forecast COVID-19 cases at a local level. Overall, the forecasting approach, accuracy assessment and empirical conditions in the present work are additions to the increasing body of knowledge about forecasting this pandemic, and its products (models, assessment and conditions) can be used as building blocks in broader approaches and frameworks that encompass diverse models. It is a contribution towards enlarging the set of tools and accuracy assessments available to aid front-line health personnel and managers facing the crisis in taking decisions on resources and operations. Our contributions are fourfold: (i) we propose a novel forecasting model to accurately forecast daily COVID-19 confirmed cases; (ii) we examine the use of phone call data and provide evidence of their usefulness in forecasting daily confirmed cases; (iii) we provide probabilistic forecasts that quantify uncertainties in future confirmed cases; (iv) we benchmark the accuracy of our model against five techniques: AutoRegressive Integrated Moving Average, ARIMA [7]; Exponential Smoothing State Space, ETS [8]; Seasonal Naive, Snaive [9]; Prophet [10]; and a regression model that does not consider call data. The rest of the paper is organised as follows: Section 2 provides a brief overview of the efforts to forecast COVID-19; Section 3 starts by analysing the data and then introduces the models used in this study and the forecast performance evaluation scheme and metrics. Section 4 presents the results of the study, followed by concluding remarks in Section 5.

Research background: forecasting COVID-19

Forecasting COVID-19 has attracted considerable attention from researchers, and many studies have analysed various related variables. Some studies focus on the global and national level; others focus on the regional level, and few consider the local level. The following is an overview of approaches used to forecast different COVID-19 variables to date. Petropoulos and Makridakis [11] forecast the cumulative number of daily confirmed cases globally with exponential smoothing models. Since this early effort, the limitations of the models were clear, acknowledged by the authors and attributed to strong government interventions and the accuracy of data. Benvenuto et al. [12] predict the epidemiological trend of incidence and prevalence of COVID-19 worldwide using ARIMA models. Feroze [13] uses Bayesian Structural Time Series models [14] to forecast the daily total number of positive cases in the USA, Brazil, Russia, India and the UK for the next 30 days; the author reports better performance from these models when compared to ARIMA models. At country level, Fong et al. [15] proposed a methodology for fitting models with scarce data at the beginning of the pandemic, with data from China. A polynomial neural network with corrective feedback is found to perform best, according to RMSE (Root Mean Square Error), in a panel selection framework including other polynomial neural networks, ARIMA, Linear Regression, Holt-Winters models, Support Vector Machines and Fast Decision Trees, among others. In a similar approach, Fong et al. [16] use Monte Carlo simulation to forecast the total daily cost associated with the pandemic using scarce data. Two groups of variables, one with time-series data and the other with scalar estimates, are used as inputs, with several models being subject to a panel selection process. In the last stages, visualisation through fuzzy rules is proposed to enhance decision making.
This is an attempt to tackle data scarcity and uncertainty, with the aim of supporting decision making in an exercise of broad scope. The approach, tested with COVID-19 data from China, provides, according to the authors, better qualitative results for decision making than individual forecasting techniques alone. The approaches in these two articles are first steps towards forecasting frameworks to inform decision making during pandemics. An interesting next step would be to use open-source forecasting tools, alongside open data, to promote the construction of such frameworks. Yousaf et al. [17] forecast daily confirmed cases with ARIMA models at country level in Pakistan. Of the same nature is the research by Moftakhar et al. [18] with data from Iran, in which observed new cases are used to predict the number of patients. Artificial Neural Networks (ANN) and ARIMA are used, with the latter showing better accuracy. ARIMA models are also used by Gupta and Pal [19] to forecast infection cases in India for the next 30 days. Harun et al. [20] forecast the number of COVID-19 cases in Germany, the United Kingdom, France, Italy, Russia, Canada, Japan and Turkey using linear and non-linear regression models, ARIMA and exponential smoothing methods; different best-performing forecasting models are identified for different countries. Hazarika and Gupta [21] and references therein show that the proposal of modified machine learning models has been very active. The authors propose a hybrid Random Vector Functional Link (RVFL) network and a wavelet-coupled RVFL (WCRVFL) network to forecast the number of infected people in Brazil, India, Peru, Russia and the USA. Performance is evaluated when forecasting 60 days ahead, against support vector regression (SVR) models and the conventional RVFL.
Tomar and Gupta [22] use a long short-term memory (LSTM) recurrent neural network to forecast daily and total positive cases, total recovered and total deceased in India. A simple curve fit with an exponential model is used to assess the effect of a lock-down and social distancing. LSTM networks were also used with data from Canada by Chimmula and Zhang [23]. Perc et al. [3] forecast the number of daily confirmed cases in the United States, Slovenia, Iran and Germany with an iterative model that uses the average growth of cases and takes into account the recovered and deceased. By devising the model around growth and targeting its use to the short term, it proved helpful even in the absence of a systematic accuracy study. Logistic models of the SEIRD type, which divide the population into groups (Susceptible, Exposed, Infected, Removed and Dead) and require the solution of a system of differential equations, have been used abundantly, with different variants, to describe and predict the evolution of the epidemic globally [2] and in regions and countries [24], [25], [26], [27]. Martelloni and Martelloni [24] apply a SIRD model to Italy and adapt it to account for the introduction of lock-downs by governments, the influence of asymptomatic cases and the absence of a criterion to define the susceptible group. Fanelli and Piazza [25] use a SIRD model to explore the temporal dynamics of the epidemic in China, Italy and France. Sarkar et al. [28] adapt the SIRD model to account for asymptomatic, isolated infected and quarantined susceptible individuals. Mohd and Sulayman [26] propose a modified logistic model to account for reinfections, false detection problems and scarcity of medical equipment in the context of a developing country, and find that in such circumstances there can be unstable scenarios where decisions based on the reproduction number should be made with caution.
In some studies [24], [25] there are hints of universal characteristics, whereas in Mohd and Sulayman [26] the particular circumstances of a developing country suggest that the behaviour of the epidemic may differ importantly between countries. Sardar et al. [29] investigate several models covering the mathematical (SIRD) and statistical types to study the effect of lock-down in India: a SIRD-type logistic model, expanded further than Sarkar et al. [28], that accommodates lock-down; an ARIMA model; an exponential smoothing model with ARMA errors, trend and seasonal components (TBATS); a hybrid statistical model based on the combination of ARIMA and TBATS; and a weighted combination of the SIRD model and the best statistical one. Daily notified cases from five states in India and the overall country are used to fit the models, which helped in assessing different scenarios determined by the effective reproduction number. Sophisticated spatio-temporal models have been explored to project infections, deaths, and the ICU beds and hospital admissions needed in the US [30], [31], [32]. This and related work [33] signal the need for complex methods in some settings, a trend also manifested, for example, in the investigation of ARIMA models in conjunction with spatial relations between neighbouring countries [34]. The survey by Shinde et al. [5] focuses mostly on research at the country level and considers two categories of data sources: i) official government, and ii) trusted organisations, such as the WHO, and social media. The survey also classifies models into two groups: i) mathematical/analytical, and ii) machine learning/data science. These classifications continue to be meaningful to date. It is interesting to notice the way in which phone data has been used. Mobile phone data has been used to collect statistics on the movement of people, and data from search engines and newspaper reports has been used to predict the total number of deceased.
The scarcity of research on using data from phone calls to health services persists to date. At regional level, Anastassopoulou et al. [35] propose a methodology to estimate epidemiological parameters and apply a SIRD model to forecast the number of infections, recovered and deceased in the region of Hubei, China. After a coarse estimation of parameters, a non-linear optimisation routine is used to obtain refined estimates. The official figures from the region fell inside the upper and lower bounds produced by the authors, albeit very wide ones. Despite the inaccuracy, the lessons learned in how to calibrate this type of model are very valuable. Massonnaud et al. [36] adapt a SEIR model to formulate scenarios for French metropolitan regions based on different reproduction rates. The daily number of COVID-19 cases, hospitalisations and deaths, the need for ICU beds per region and the date on which ICU capacity limits would be reached were estimated. Hospital catchment areas are used and then aggregated by French region. Weissman et al. [37] implement a SIR model to plan scenarios (defined by doubling times of 2, 6 and 10 days) in three hospitals in the greater Philadelphia region, US. Three variables are predicted: hospital capacity, patients requiring ICU beds and patients requiring ventilators. The accuracy of the predictions is not reported, but the usefulness of the scenarios in planning ahead is recognised. This assessment and the construction of models is facilitated by a close collaboration between operational, medical and data personnel. From this overview of forecasting efforts, it is clear that accuracy has been disappointing in many cases. The majority of studies focus only on point forecasts and ignore the uncertainty, which may lead to less effective decisions. In some cases, the forecast accuracy has not been reported, or it is unclear how it has been evaluated.
The effort in proposing and testing machine learning, statistical and mathematical models is evident. In the area of machine learning there is an emphasis on hybridisation to cope with complexity, whereas in the area of SIRD models there is an attempt to adapt them to different circumstances and government interventions. However, the limitations of different models may not be reduced unless they are connected with other approaches, such as spatio-temporal complex models like the Global Epidemic and Mobility Model (GLEAM) [32], and well articulated with policy making. It is also evident that phone data from official help lines has not been extensively explored, or at least reported.

Experimental design

Data

Data used in this paper comprise the number of daily COVID-19 confirmed cases and the number of daily phone calls received by the National Health Service 111 (NHS 111) service at one of the largest non-metropolitan county councils in the East Midlands region of England, between 18 March 2020 and 19 October 2020. The data are publicly available and were extracted from Public Health UK and NHS Digital, respectively. Fig. 1 illustrates the time plot of the confirmed cases. Although the time series is noisy, some systematic patterns are visible: the time plot shows a trend, and the number of positive cases at weekends is lower than on working days. We also analyse the Autocorrelation (ACF) and Partial Autocorrelation (PACF) Functions to investigate the link between the number of confirmed cases and its lagged values. Fig. 2 shows the ACF (Fig. 2-A) and PACF (Fig. 2-B) of the time series. The ACF plot highlights a trend: as the lag increases, the value of the ACF decreases exponentially. Moreover, the PACF reveals several significant positive lags that need to be considered in the forecasting model.
Fig. 1

Time series of daily confirmed cases.

Fig. 2

Autocorrelation and partial autocorrelation of the time series of confirmed cases.
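The slowly decaying autocorrelations described for Fig. 2-A can be computed directly. The paper's analysis is done in R; the following is a minimal Python/numpy sketch with an illustrative trending series (not the paper's data), showing why a trend produces ACF values that decay slowly from near 1:

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation of a series for lags 1..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()              # centre the series
    denom = np.sum(x * x)
    return np.array([np.sum(x[k:] * x[:-k]) / denom
                     for k in range(1, max_lag + 1)])

# A pure linear trend: autocorrelations start near 1 and decay slowly,
# the signature of a trend in an ACF plot.
trend = np.arange(100, dtype=float)
r = acf(trend, 3)
```

For this 100-point trend the first three autocorrelations are all above 0.9 and strictly decreasing, mirroring the pattern discussed for the confirmed-cases series.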

Fig. 3 presents the time plot of the NHS 111 calls, highlighting the presence of a trend. The data is less noisy and there is no seasonal pattern. NHS 111 call data can be used as a predictor of the number of daily confirmed cases; therefore, we also need to produce its forecast.
Fig. 3

Time series of daily phone calls.

In the relationship between the number of confirmed cases and the number of NHS 111 calls on day t, the former may be related to past lags of the latter. The sample Cross-Correlation Function (CCF) can be used to identify lags of the NHS 111 calls series that might be useful predictors of the number of confirmed cases. For instance, consider a lag of k days: the CCF value at lag -k gives the correlation between the number of NHS 111 calls on day t-k and the number of confirmed cases on day t. Fig. 4 presents the linear relationship between confirmed cases and NHS 111 lags. We are interested in the negative lags, because we use NHS 111 calls to predict the number of confirmed cases. Confirmed cases appear relatively high roughly 0–24 days after high NHS 111 call volumes (i.e., significant correlation at lags of -24 to 0). Many models could be tried based on the CCF, ACF and PACF, the presence of trend, and the effect of weekends in these data. In the next section, we propose a Multiple Linear Regression model in which the number of confirmed cases is a linear combination of these features.
Fig. 4

Cross-correlation of the past values of NHS 111 calls and the number of confirmed cases.
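The CCF diagnostic at a negative lag can be sketched in the same way. Here a hypothetical pair of series, in which "cases" echo "calls" seven days later, produces a strong cross-correlation; the series, noise level and lag are illustrative, not the paper's data:

```python
import numpy as np

def ccf_neg_lag(x, y, k):
    """Correlation between x shifted back k days and y: corr(x[t-k], y[t])."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xs, ys = x[:len(x) - k], y[k:]          # align x[t-k] with y[t]
    xs, ys = xs - xs.mean(), ys - ys.mean()
    return float(np.sum(xs * ys) /
                 np.sqrt(np.sum(xs ** 2) * np.sum(ys ** 2)))

# Synthetic example: "cases" are "calls" delayed by 7 days plus noise,
# so the cross-correlation at lag -7 is close to 1.
rng = np.random.default_rng(1)
calls = rng.normal(size=200).cumsum()
cases = np.roll(calls, 7) + rng.normal(scale=0.1, size=200)
r7 = ccf_neg_lag(calls, cases, 7)
```

A correlation near 1 at lag -7 would flag calls lagged by a week as a candidate predictor, which is exactly how the CCF is used in the paper.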


Proposed model, multiple linear regression with call data (MLR_T)

In this section, we model the number of confirmed cases as potentially a function of past lags of confirmed cases, current and past lags of the number of NHS 111 calls, the weekend and the trend. These components allow for possible interpretations as described in Section 3.1. We denote by y_t the number of positive confirmed cases on day t. The predictors are presented using the following notation: x_t: number of NHS 111 calls on day t; t: local trend on day t; y_{t-i}: number of positive confirmed cases on day t-i; x_{t-j}: number of NHS 111 calls on day t-j; W_t: weekend dummy, equal to 1 if day t falls on a weekend and 0 otherwise. The mathematical representation of the proposed model is shown in Eq. (1):

log(y_t) = b_0 + b_1 t + b_2 W_t + sum_i g_i log(y_{t-i}) + sum_j d_j x_{t-j} + e_t.    (1)

We should note that, to ensure strictly positive values for the number of confirmed cases, we have used the log() transformation for the response variable, confirmed cases, and its lagged values as predictors. The transformation is automatically back-transformed when producing forecasts. Not all potential predictors are included in the final forecasting model. After building the model, we need to select the predictors that are useful in producing accurate forecasts. To that end, we conducted a stepwise regression and selected predictors that improve the out-of-sample forecast accuracy based on RMSE and the Continuous Ranked Probability Score (CRPS). Due to the amount of uncertainty in the number of positive confirmed cases, we produce the estimated probability distribution of the number of positive confirmed cases in future time periods, in addition to the point forecast. This provides richer information about the forecasts that can be fruitfully used by decision makers. Because the proposed model uses NHS 111 calls, their lags and the lagged values of confirmed cases as predictors, producing ex-ante forecasts [9] requires generating forecasts of these predictors. Fig. 5 depicts the steps of the forecasting process, implemented in R.
Fig. 5

Steps of the forecasting process.
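As a rough illustration of the proposed regression structure, a log-transformed linear model with lagged predictors can be estimated by ordinary least squares. The paper fits the model in R with stepwise selection; in this Python sketch the lag choices, the weekend coding (day 0 assumed to be a Monday) and the synthetic data are purely illustrative:

```python
import numpy as np

def build_design(cases, calls, case_lags, call_lags):
    """Design matrix: intercept, trend, weekend dummy, log lagged cases, lagged calls."""
    max_lag = max(case_lags + call_lags)
    t = np.arange(max_lag, len(cases))           # usable days after burn-in
    cols = [np.ones_like(t, float),              # intercept
            t.astype(float),                     # local trend
            (t % 7 >= 5).astype(float)]          # weekend dummy (day 0 = Monday, assumed)
    cols += [np.log(cases[t - l]) for l in case_lags]   # log lagged cases
    cols += [calls[t - l] for l in call_lags]           # lagged calls
    return np.column_stack(cols), np.log(cases[t])      # log response

# Illustrative synthetic data (not the NHS series).
rng = np.random.default_rng(0)
n = 120
calls = 50 + rng.poisson(10, n).cumsum() / 10.0
cases = np.exp(0.01 * np.arange(n)) * 20 + rng.uniform(0, 2, n)

X, y = build_design(cases, calls, case_lags=[1, 7], call_lags=[0, 7])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = np.exp(X @ beta)   # back-transform to the original (strictly positive) scale
```

The exp() back-transform is what guarantees strictly positive fitted case counts, as noted above; stepwise selection would then prune the lag columns by out-of-sample RMSE and CRPS.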

To forecast NHS 111 calls, we examined ARIMA, ETS and their combination. We use the forecasts generated by ARIMA for NHS 111 calls, as this yields higher forecast accuracy. For the lagged values of confirmed cases, we examined Naive [9], ARIMA, ETS and their combinations. Forecasts generated by ETS were more accurate overall; however, forecasts from the Naive method were more accurate for shorter horizons. Therefore, we replace the forecasts generated by ETS with those generated by the Naive method for short horizons. We compared various cutoff horizons, and this yields more accurate out-of-sample forecasts of confirmed cases. Fig. 6 illustrates an example of the probabilistic forecast generated by the proposed model.
Fig. 6

Sample forecast produced by the proposed model.

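The predictor-forecasting step, in which Naive forecasts override the smoothing-based ones at short horizons, can be mimicked with a toy rule. Here simple exponential smoothing stands in for ETS, and the cutoff of 3 days is an arbitrary illustration, not the cutoff chosen in the paper:

```python
import numpy as np

def ses(x, alpha=0.3):
    """Simple exponential smoothing level; used as a flat h-step-ahead forecast."""
    level = x[0]
    for v in x[1:]:
        level = alpha * v + (1 - alpha) * level
    return level

def predictor_forecast(x, horizon, cutoff=3):
    """Naive (last observed value) for h <= cutoff, smoothed level beyond."""
    naive, smooth = x[-1], ses(np.asarray(x, float))
    return np.array([naive if h <= cutoff else smooth
                     for h in range(1, horizon + 1)])

f = predictor_forecast([100, 110, 120, 130], horizon=5)
```

For this trending toy series the first three forecasts equal the last observation (130), and the remaining horizons fall back to the smoothed level, illustrating how a short-horizon method and a longer-horizon method can be spliced by cutoff.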

Benchmarks

We use five different benchmark methods to compare their forecast accuracy with that of our proposed model. We consider exponential smoothing and ARIMA models, the two most widely used approaches to time series forecasting, in addition to a regression approach without call data, Seasonal Naive and Prophet. These benchmarks are used to show whether there is any added value in using the proposed model.

Proposed model without call data (MLR_W)

In order to show the impact of incorporating the phone call data in the modelling and its effect on forecast accuracy improvement, we use the same model proposed in Section 3.2 without including NHS 111 calls and its lagged variables.

Exponential smoothing state space model (ETS)

The second benchmark is the automatic exponential smoothing model [9], implemented using the fable package in R. We use the ETS() function in the fable package [38] to generate daily forecasts of the confirmed cases.

Autoregressive integrated moving average (ARIMA)

The third benchmark model is an automatic ARIMA model [9]. The orders of the model and its parameters are selected by minimising the corrected AIC (Akaike Information Criterion). We use the implementation of automatic ARIMA in the fable package; the ARIMA() function is used to generate the forecasts of daily confirmed cases [38].

Prophet

Prophet is a forecasting procedure created by Facebook [10] that accounts for multiple seasonality and a piecewise trend. Prophet works well on daily data, is robust to missing data and shifts in the trend, and typically handles outliers well. It is popular and automated, which makes it easy to learn and use in practice. The model is implemented via the fable package in R; we use the prophet() function to generate daily forecasts of confirmed cases [38].

Seasonal Naive

The Seasonal Naive model is a simple method in which the forecast of confirmed cases is equal to the last observed value from the same day of the previous week. This method is useful for time series with seasonality.
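This rule is simple enough to state in a few lines; a Python sketch with a weekly period of m = 7 for daily data:

```python
def snaive(history, horizon, m=7):
    """Forecast h steps ahead as the value from the same day of the previous cycle."""
    return [history[len(history) - m + ((h - 1) % m)]
            for h in range(1, horizon + 1)]

week = [10, 20, 30, 40, 50, 60, 70]   # one week of toy daily values
f = snaive(week, horizon=9)
```

Each forecast simply repeats the corresponding weekday's last observed value, wrapping around after seven days.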

Forecast performance evaluation schemes and metrics

We first split the dataset into training (70%) and test (30%) sets. We apply the forecasting models to the training set and evaluate the forecast performance of the proposed MLR approach and the benchmarks on the test set. Evaluation is conducted using a rolling-origin forecasting study with re-estimation. The forecast horizon is chosen to be 21 days, which corresponds to three weeks. This allows decision makers to use the forecasts to inform decisions on resource allocation and planning in the health service. We have considered six different forecast accuracy metrics: (i) three point forecast error metrics; (ii) one prediction interval accuracy measure; and (iii) two probabilistic forecast accuracy metrics.
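A rolling-origin evaluation with re-estimation can be sketched generically: at each origin the model is re-fitted on the data up to that point and evaluated over the next h steps. The forecast function below stands in for any of the models above; it is shown with a Naive forecaster on toy data:

```python
def rolling_origin_errors(series, h, min_train, forecast):
    """Re-fit at each origin, forecast h steps ahead, collect absolute errors per horizon."""
    errors = [[] for _ in range(h)]
    for origin in range(min_train, len(series) - h + 1):
        train = series[:origin]                 # data available at this origin
        future = series[origin:origin + h]      # the next h actual values
        fc = forecast(train, h)                 # re-estimated forecast
        for i in range(h):
            errors[i].append(abs(fc[i] - future[i]))
    return errors

series = list(range(30))                        # toy upward-trending data
naive = lambda train, h: [train[-1]] * h        # flat forecast from the last value
errs = rolling_origin_errors(series, h=3, min_train=20, forecast=naive)
```

On this linearly increasing toy series, the Naive forecaster's absolute error grows by exactly one unit per horizon step at every origin, which makes the per-horizon error structure easy to verify.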

Point forecast accuracy measures

The point forecasting metrics are the ME (Mean Error), MAE (Mean Absolute Error) and RMSE (Root Mean Square Error) [39]. These measures help us to evaluate the performance of models from different perspectives. ME indicates the overall direction of the error: it reveals whether the produced forecasts are, on average, too high or too low. MAE is minimised by the median, i.e. if the MAE of a model is smaller than that of the others, it produces forecasts closer to the median of the data. Finally, RMSE is minimised by the optimal mean forecast, so a model with a lower RMSE produces more accurate mean forecasts than the others. We evaluate the ME, MAE and RMSE for each model and each forecasting step separately. We also report the overall performance of each model across all horizons. A model with an error measure closer to zero is better.
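These three metrics can be computed in a few lines. The sign convention assumed here is forecast minus actual, so a positive ME means the forecasts overestimate on average, matching the interpretation of bias used later in the results:

```python
import numpy as np

def point_errors(actual, forecast):
    """ME (bias direction), MAE (median-oriented), RMSE (mean-oriented)."""
    e = np.asarray(forecast, float) - np.asarray(actual, float)
    return {"ME": float(e.mean()),
            "MAE": float(np.abs(e).mean()),
            "RMSE": float(np.sqrt((e ** 2).mean()))}

m = point_errors([100, 120, 110], [110, 115, 120])  # illustrative values
```

Note how the signed errors (+10, -5, +10) partially cancel in the ME but not in the MAE or RMSE, which is why ME measures bias rather than accuracy.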

Prediction interval accuracy measures

An appropriate prediction interval accuracy measure should account for both the coverage and the width of the prediction interval. Winkler [40] proposed a measure to calculate the accuracy of the prediction interval of any given model. If y_t is the observation at time t and [l_t, u_t] represents the 100(1-a)% prediction interval at time t, then the score for each period is:

W_t = (u_t - l_t) + (2/a)(l_t - y_t),  if y_t < l_t;
W_t = (u_t - l_t),                     if l_t <= y_t <= u_t;
W_t = (u_t - l_t) + (2/a)(y_t - u_t),  if y_t > u_t.

The Winkler score is the average of W_t across all periods. We prefer a model that has narrow prediction intervals with high coverage; therefore a model with a smaller score is better. In this study, the Winkler score is calculated based on 95% prediction intervals (a = 0.05).
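The score's behaviour, rewarding narrow intervals but penalising misses in proportion to their size, can be seen in a small sketch (interval bounds and observations are illustrative):

```python
def winkler(lower, upper, actual, alpha=0.05):
    """Winkler score for a single 100*(1-alpha)% prediction interval."""
    width = upper - lower
    if actual < lower:
        return width + (2.0 / alpha) * (lower - actual)  # missed below
    if actual > upper:
        return width + (2.0 / alpha) * (actual - upper)  # missed above
    return width                                         # covered: just the width

inside = winkler(90, 110, 100)  # observation inside the interval
below = winkler(90, 110, 80)    # observation 10 units below the interval
```

A covered observation scores just the interval width (20 here), while a miss of 10 units at the 95% level adds a penalty of 2/0.05 * 10 = 400, so wide-but-safe and narrow-but-risky intervals can both be compared on one scale.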

Probabilistic accuracy measures

In addition to the point and prediction interval accuracy measures, we also report two probabilistic forecasting measures: (i) the percentile score, and (ii) the Continuous Ranked Probability Score (CRPS). The percentile score is a strictly proper evaluation criterion for quantiles. The evaluation is performed on a dense probability grid covering the percentiles p = 1, ..., 99 [41]. For each time period t and percentile p, we obtain the quantile forecast Q_{t,p}. The percentile score is then given by the pinball loss function at each period t:

L(Q_{t,p}, y_t) = (1 - p/100)(Q_{t,p} - y_t),  if y_t < Q_{t,p};
L(Q_{t,p}, y_t) = (p/100)(y_t - Q_{t,p}),      if y_t >= Q_{t,p}.

The score can then be calculated for each forecast horizon and across all percentiles. We also report the overall percentile score across all horizons. If the observations do not deviate from the forecast distribution, the average score is small; a model with a smaller percentile score is better. CRPS summarises the quality of a continuous probability forecast with a single score by measuring the distance between the forecast and the observed cumulative distribution function (CDF). It reveals how closely the CDF of the forecast matches that of the corresponding observations. We should note that, in order to report the results of the performance evaluation using time series cross-validation, we have averaged the score of each measure computed over the different rolling origins.
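The pinball loss is a short function; the sketch below also shows its symmetry for complementary percentiles equidistant from the outcome (the quantile values are illustrative). Averaging this loss over a dense grid of percentiles is closely related to the CRPS:

```python
def pinball(q_forecast, actual, p):
    """Pinball (quantile) loss for the p-th percentile forecast."""
    tau = p / 100.0
    if actual >= q_forecast:
        return tau * (actual - q_forecast)        # outcome above the quantile
    return (1 - tau) * (q_forecast - actual)      # outcome below the quantile

hi = pinball(120, 100, 95)  # 95th percentile, 20 units above the outcome
lo = pinball(80, 100, 5)    # 5th percentile, 20 units below the outcome
```

An upper quantile is only lightly penalised for sitting above the outcome, and a lower quantile for sitting below it, so well-calibrated tails score low; a central quantile that misses by the same margin would score much higher.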

Result and discussion

In this section, we present the forecast accuracy of our proposed MLR model and its benchmarks for point, prediction interval and probabilistic forecasts on out-of-sample data, using a time series cross-validation approach. Table 1 presents the overall performance of the forecasting models for three types of forecast accuracy measures: (i) point error measures, using ME, RMSE and MAE; (ii) a prediction interval accuracy measure, using the Winkler score; and (iii) probabilistic accuracy measures, using the percentile score and CRPS. The reported scores are averaged across all rolling origins and forecast horizons. Overall, the proposed MLR model clearly outperforms the benchmarks on all forecast accuracy measures.
Table 1

Forecast performance evaluation.

Model    | ME     | RMSE   | MAE    | Winkler | Percentile | CRPS
---------|--------|--------|--------|---------|------------|-------
Prophet  | 154.42 | 275.13 | 155.37 | 4465.83 | 73.10      | 145.61
SNaive   | 146.41 | 273.67 | 153.87 | 2628.72 | 69.14      | 137.07
ETS      | 148.89 | 271.99 | 153.28 | 3548.38 | 70.43      | 139.93
ARIMA    | 134.99 | 259.98 | 144.84 | 2942.25 | 65.34      | 129.72
MLR_W    | 117.52 | 222.59 | 124.30 | 2295.61 | 54.76      | 108.73
MLR_T    |  86.70 | 178.06 |  97.00 |  583.25 | 39.71      |  78.72
We first examine the performance of the forecasting models using the point forecast accuracy measures (columns 2–4). ME shows the forecast bias of the considered models, which is the tendency of a forecast to be consistently higher or lower than the actual values. Given that ME is positive for all models, the generated forecasts are positively biased, meaning that all forecasting models overestimate the daily number of confirmed cases on average. The proposed MLR_T model has the lowest bias of all the models considered; decision makers should take this bias into account when making decisions based on the forecasts. RMSE is the square root of the MSE, which is minimised by the mean. If the decision maker is interested in producing the expected average of the daily confirmed cases, and in identifying which model performs best in terms of expected values, then RMSE is the appropriate indicator; Table 1 indicates that MLR_T outperforms the benchmarks on RMSE. MAE is another useful point forecast accuracy indicator, minimised by the median: if the decision maker is interested in which forecasting model generates more accurate forecasts of daily confirmed cases in terms of median values, then MAE should be used. The results indicate that MLR_T also outperforms the others on MAE. In addition to point forecast accuracy, we believe that forecasts should acknowledge the uncertainty of future confirmed cases, and that this should be an integral part of the decision-making process. Reporting uncertainty allows for more informed decisions and a better risk management strategy. We first report forecast uncertainty using the Winkler score (column 5). The results show that the proposed model has the lowest Winkler score, which means that on average it has more accurate coverage and narrower intervals than all benchmarks.
Most importantly, we evaluate the accuracy of the probabilistic forecast distribution, which contains all forecast properties including the mean, median, prediction intervals and quantiles. This evaluates how well the forecast distribution captures the structure of the data. Results indicate that the proposed model provides a more accurate probabilistic forecast distribution than the others, meaning that it can inform decisions about uncertainty more accurately. In addition to the overall performance, we have also analysed the performance of each forecasting model across the forecast horizons. Fig. 7 illustrates the point forecast accuracy metrics for each forecast horizon. While the performance of the models is closer at shorter horizons, there is a substantial gain at longer horizons from using our proposed approach. For longer forecasting horizons, the proposed model retains high forecasting accuracy compared with the other methods. Clearly, the proposed model not only remains accurate across the forecasting horizon, but is also more robust than the other methods as the horizon increases.
Fig. 7

Point forecast accuracy of models for each forecast horizon.

Fig. 8 presents measures of the uncertainty of the models used in this study. Figure 8-A shows the Winkler score for the 95% prediction intervals at each horizon. The proposed model provides more accurate prediction interval forecasts for all horizons, and the gain from using our model grows as the forecast horizon increases. All benchmarks either provide wider intervals, fail to cover the actual values inside the prediction interval, or both.
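The Winkler score used here has a standard form: the interval width plus a penalty of 2/alpha times the amount by which an observation falls outside the interval. A minimal sketch, assuming that standard definition (the paper's implementation is not shown):

```python
import numpy as np

def winkler_score(lower, upper, actual, alpha=0.05):
    """Mean Winkler score of (1 - alpha) prediction intervals [lower, upper]:
    interval width, plus 2/alpha times the amount by which the actual value
    falls outside the interval. Lower is better."""
    l, u, y = (np.asarray(a, dtype=float) for a in (lower, upper, actual))
    width = u - l
    penalty = (2.0 / alpha) * (np.maximum(l - y, 0.0) + np.maximum(y - u, 0.0))
    return float(np.mean(width + penalty))
```

A covered observation contributes only the width; a miss is penalised heavily, so narrow intervals with good coverage score best.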
Fig. 8

Forecast uncertainty of models for each forecast horizon.

In evaluating the probabilistic forecast accuracy of our model and its benchmarks, we focus on the distributional characteristics of the generated forecasts, evaluated by the corresponding percentile score and the CRPS. Figures 8-B and 8-C present the percentile score and the CRPS aggregated for each forecasting horizon. The results show that the proposed model captures the density structure of the data consistently better than the benchmark models across all horizons, and the difference is substantial for longer forecast horizons.

Fig. 7 and Fig. 8 indicate that the forecast accuracy of all models deteriorates as the forecast horizon increases: the further into the future a forecast is required, the larger the forecast error becomes for every model. Our results show that the proposed model not only outperforms the benchmarks in predicting the value of future confirmed cases, but is also the most accurate in providing consistent information about the uncertainty around confirmed cases. The model is simple enough to be understood by any manager, and it provides the entire probability distribution of confirmed cases for a given day at the local level. It therefore captures a range of possibilities regarding confirmed cases for any given day that is not contained in point forecasts. This is very important for decision makers and planners in healthcare, as it helps them to assess risk and make better planning decisions. Further research is required on how to use probabilistic forecasts and interpret their output to inform decisions in healthcare. We should note that, if the focus of the decision makers is on short-term horizons, then it is possible to create an MLR-based model using the principles discussed in this paper that outperforms the benchmarks substantially.
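The percentile (pinball) score, and a common quantile-based approximation of the CRPS, can be sketched as follows; the quantile grid used in the paper is not reproduced here, so the levels in any call are illustrative:

```python
import numpy as np

def pinball_loss(y, forecast_q, q):
    """Pinball (quantile) loss of a single quantile forecast at level q."""
    y, f = float(y), float(forecast_q)
    return q * (y - f) if y >= f else (1.0 - q) * (f - y)

def crps_from_quantiles(y, quantile_forecasts):
    """Approximate the CRPS as twice the average pinball loss over a grid of
    quantile levels (dict mapping level -> forecast value)."""
    losses = [pinball_loss(y, f, q) for q, f in quantile_forecasts.items()]
    return 2.0 * float(np.mean(losses))
```

A forecast distribution that is sharp and well centred on the realised value minimises both scores, which is why they reward the full density rather than a single point.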

Conclusion

An accurate forecast of the daily number of positive confirmed cases is crucial for resource planning. Many models have been proposed in the literature to forecast confirmed cases, ranging from time series to epidemic forecasting models. In this study, we have created a simple and interpretable forecasting model that can be used to forecast positive confirmed cases at the local level. The proposed MLR model exploits the relationship between confirmed cases and phone call data (NHS 111 calls), in addition to other patterns such as the trend, the effect of weekends and autoregressive lags of confirmed cases. We compare the performance of the model with ETS, ARIMA, Seasonal Naive, Prophet and an MLR model without phone call data in an empirical study. These models are applied to the number of confirmed cases for a county council in England. In evaluating the generated forecasts, we have used three point forecast error measures, one prediction interval accuracy measure and two probabilistic forecast accuracy measures. Our analysis shows that the proposed model provides accurate and reliable forecasts, outperforming all benchmarks on every accuracy measure considered in the study. We also provide evidence that phone call data is an important predictor of COVID-19 confirmed cases and should be considered in forecasting models. We suggest that this may be due to the connection between phone calls to the health service and the dynamics of COVID-19. The factors associated with being affected by the virus are clearly social, including deprivation, underlying conditions and belonging to ethnic minorities [42], [43], and these factors are complex and dynamically entangled with mobility, economic activity, lockdowns and other government interventions. Phone calls to a centralised health service, such as the NHS, serve as an information junction where these chains of causality, rooted in social factors, are manifested.
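The structure of such an MLR model can be sketched with ordinary least squares; the variable names, lag choices and fitting method below are illustrative assumptions, not the paper's exact specification:

```python
import numpy as np

def build_design_matrix(calls_lagged, trend, is_weekend, cases_lag1):
    """Stack predictors in the spirit of the paper's MLR: lagged NHS 111 call
    volume, a linear trend, a weekend dummy and an autoregressive lag of
    confirmed cases. Names and lag choices are illustrative."""
    trend = np.asarray(trend, dtype=float)
    return np.column_stack([
        np.ones(len(trend)),                      # intercept
        np.asarray(calls_lagged, dtype=float),    # phone call predictor
        trend,                                    # linear trend term
        np.asarray(is_weekend, dtype=float),      # weekend effect dummy
        np.asarray(cases_lag1, dtype=float),      # autoregressive lag of cases
    ])

def fit_mlr(X, y):
    """Ordinary least squares; returns the coefficient vector."""
    beta, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return beta
```

Fitting by OLS keeps every coefficient directly interpretable (e.g. additional expected cases per additional call), which is central to the model's appeal to local decision makers.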
The effect of this source of information makes possible the strong performance of a relatively simple model. In addition to generating traditional point forecasts, we also provide probabilistic forecasts. Given the uncertainty involved in forecasting the number of COVID-19 confirmed cases, this is extremely important for decision makers, as it highlights the uncertainty around a single-value forecast and can be used to manage risk when allocating resources in hospitals. We have also used the proposed model to forecast the number of confirmed cases at the national level, for England. The overall performance is very similar to the local level: the proposed model outperforms the others. We have not included that analysis here, given the space limit and the study's focus on the local level, but it is available on request. One limitation of this study is the lack of access to more granular data such as age group, minority group, health conditions, local regulations, lockdowns, lockdown duration, and so on. These might allow for a better explanation of changes in the number of confirmed cases, and incorporating such information could further improve the proposed model. Another limitation is that the dataset does not contain hospital information such as COVID-19 admissions and bed occupancy. Such information would allow the forecast of COVID-19 confirmed cases to be translated into hospital admissions and bed capacity, which we consider an important research avenue.

CRediT authorship contribution statement

Bahman Rostami-Tabar: Conceptualization, Programming, Formal analysis, Model development, Writing - Original draft preparation. Juan F. Rendon-Sanchez: Conceptualization, Literature review, Writing - Original draft preparation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.