ARIMA modelling & forecasting of COVID-19 in top five affected countries.

Alok Kumar Sahai, Namita Rath, Vishal Sood, Manvendra Pratap Singh.

Abstract

BACKGROUND AND AIMS: In a little over six months, the coronavirus epidemic has affected over ten million and killed over half a million people worldwide as of June 30, 2020. With no vaccine in sight, the spread of the virus is likely to continue unabated. This article aims to analyze the time series data for the top five countries affected by COVID-19 in order to forecast the spread of the epidemic.
MATERIAL AND METHODS: Daily time series data of total infected cases from 15th February to June 30, 2020 were collected from an online database for the top five countries, namely the US, Brazil, India, Russia and Spain. ARIMA model specifications were estimated using the Hannan and Rissanen algorithm. An out-of-sample forecast for the next 77 days was computed using the ARIMA models.
RESULTS: The forecast for the first 18 days of July was compared with the actual data, and the forecast accuracy, measured using MAD and MAPE, was found to be within acceptable agreement. The graphic plots of forecast data suggest that while Russia and Spain have reached the inflexion point in the spread of the epidemic, the US, Brazil and India are still experiencing exponential growth.
CONCLUSION: Our analysis shows that India and Brazil will hit the 1.38 million and 2.47 million marks respectively, while the US will reach the 4.29 million mark by 31st July. With no effective cure available at the moment, this forecast will help governments to be better prepared to combat the epidemic by ramping up their healthcare facilities.
Copyright © 2020 Diabetes India. Published by Elsevier Ltd. All rights reserved.

Keywords:  ARIMA; COVID-19; Forecasting; Pandemic; SARV-2 Cov

Year:  2020        PMID: 32755845      PMCID: PMC7386367          DOI: 10.1016/j.dsx.2020.07.042

Source DB:  PubMed          Journal:  Diabetes Metab Syndr        ISSN: 1871-4021


The Corona pandemic, which originated in Wuhan, China in December 2019, has spread to the whole world and in six months has caused unprecedented havoc. This extremely virulent strain of coronavirus is highly contagious; it has already infected 10,101,998 people worldwide and has claimed 501,644 lives within seven months [1]. The earlier coronavirus outbreaks, namely SARS and MERS, were not as contagious and persistent as 2019-nCoV, or COVID-19 as it has come to be known. The confusion and lack of transparency in the initial stages of the outbreak only worsened the situation, and today 185 countries are suffering from the virus with no cure in sight. In its current form the virus causes death due to respiratory failure. Owing to differences in epidemiological conditions and testing facilities, the spread of the virus has varied across countries. The worst affected are developed countries like Spain, Italy, France, Germany and the US. Today, the US tops the list, followed by Brazil, Russia, India and Spain respectively [1]. In India, the first case of COVID-19 was reported on January 30, 2020 and the initial spread was extremely slow. As the severity of the viral infection became known, the Government of India resorted to a complete lockdown to contain the spread of the virus. The first lockdown was announced on 25th March and was extended in stages until the end of May. Owing to the all-round collapse of industry and the hardship of daily wage earners and migrant labourers, the government decided to lift the lockdown in a phased manner from June 2, 2020. Migrant labourers from the two hotspots of New Delhi and Mumbai returned to their home states, and this large-scale export of the virus resulted in an explosion in the number of cases. India gradually entered the top ten countries affected by COVID-19 and today is the third most badly affected country in the world [1].
Despite the government's claims of increased medical and testing facilities, the number of cases is neither flattening nor abating. The number of new patients is approaching 20,000 per day, and many concerns loom over the spread of COVID-19. How many people will be infected tomorrow? How many deaths will happen tomorrow? When will the infection curve reach inflexion or get flattened? How many people will be affected during the peak period of the outbreak? Are there mathematical models available to answer these questions? Under the circumstances, it is very important to estimate the spread of COVID-19 so that policymakers, medical personnel and the general public can be better prepared to deal with the emergency. In this paper, we employ the Auto Regressive Integrated Moving Average (ARIMA) model to predict the incidence and spread of COVID-19 in India, Russia, Brazil, Spain and the US, the five most badly hit countries [1]. Compared with other econometric models, ARIMA models have been used with success in the prediction of several diseases [2-7].

Literature review

The past two decades have seen research focused on statistical issues pertaining to the prospective detection of outbreaks of infectious diseases. The challenges lie in early detection and in anticipating the possible evolution of the epidemic so that appropriate preventive measures can be taken. This rapidly growing area is called biosurveillance [8,9]. An early regression method of outbreak detection was presented by Shewhart [10]. Assuming a normally distributed incidence of infected cases, the method tested whether the count exceeded the mean by a certain multiple of the standard deviation. However, with epidemics the normal distribution is no longer valid, and most epidemics show an exponential distribution or a highly skewed bell curve. Ref. [11] used a simple regression model which computed the expected number of cases in month t as the mean count over months t-1, t and t+1 over a specified number of years. Regression models are used to detect the onset of influenza epidemics [12,13]. When case counts are low, normal-errors regression models are inadequate and Poisson regression models have been used [14,15]. Unlike the parametric regression models described so far, semi-parametric models can be used to create a baseline model, as in monitoring mortality and other related effects. A smoothing method to obtain the baseline and standard deviations was used in work on Salmonella outbreaks [16,17]. Most regression-based models define a mean at time t and issue an alarm at t if the observed value lies above a threshold predetermined by the sample statistics and the quantiles of a suitable normal or Poisson distribution. Ref. [18] described non-thresholding regression methods which test the hypothesis that a given value y_t at time t belongs to the same distribution as the baseline. Regression techniques, however, do not capture the correlation structure of the data.
Time series methods capture this correlation structure, which is their advantage over regression methods. Syndromic and laboratory data collected with daily or weekly frequency are generally autocorrelated at some lags. They may further exhibit correlations associated with weekly or yearly seasonal patterns in the data. Failure to account properly for the autocorrelation in time series data leads to misspecified models and incorrect forecasts. The Box-Jenkins model is designed to take the autocorrelation of the time series into account. With outbreak surveillance, the trend is best estimated through a relatively simple procedure. A Serfling model [19] based on trigonometric functions may be used to estimate the trend and seasonal components for time series data with regular seasonality. Simple exponential smoothing [20,21] and the Holt-Winters procedure are employed in surveillance studies. Simple exponential smoothing makes predictions by taking a weighted average of past observations, with weights that decrease the farther back we go, so that more recent data carry greater weight. The Holt-Winters procedure is a variant of simple exponential smoothing which allows for local trend and seasonality. This method has been used with success in many surveillance studies and has done better than other forecasting methods [22]. Auto Regressive Integrated Moving Average (ARIMA) models [23] have been widely used for detecting outbreaks of infectious diseases [24-27]. Stationarity of the time series is a prerequisite for fitting an ARIMA model. An investigation of ARIMA modelling showed that it was unable to model eight out of 17 syndromic time series owing to sparse data [28]. However, for the series which were successfully modelled, one-step-ahead forecasts were highly acceptable, and forecasts up to 3 years into the future were obtained from continuously updated models.
Traditional ARIMA models require a fairly large number of parameters for the autocorrelation to be detected. Further, a model for one syndrome or outbreak cannot be automatically applied to another, and the model has to be identified each time. For shorter time series, it is prudent to use a hierarchical time series model. It is claimed that the hierarchical time series model can detect outbreaks faster than the lab-based exceedance system [29]. The ARIMA model has seen widespread usage in the study of infectious diseases for several time series events. These include leptospirosis and its relationship with rainfall and temperature [5] and the relationship of suicide cases with changes in national alcohol policies [30], among others. Time series modelling of infectious diseases, especially COVID-19, has been reported by several researchers [4,7,31-38].
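The thresholding rule described in this paragraph can be illustrated with a minimal Python sketch. It assumes a short baseline of historical counts and raises an alarm when a new observation exceeds the baseline mean by k sample standard deviations. The function name `exceeds_threshold`, the choice k = 3 and the counts themselves are illustrative assumptions and are not taken from the cited studies.

```python
from statistics import mean, stdev


def exceeds_threshold(baseline, observed, k=3.0):
    """Return True when `observed` lies above mean(baseline) + k * stdev(baseline)."""
    threshold = mean(baseline) + k * stdev(baseline)
    return observed > threshold


# Hypothetical weekly case counts used as the baseline period.
baseline_counts = [10, 12, 11, 9, 13]

# The threshold here is 11 + 3 * sqrt(2.5), roughly 15.74.
alarm_at_16 = exceeds_threshold(baseline_counts, 16)
alarm_at_15 = exceeds_threshold(baseline_counts, 15)
```

As the text notes, a rule of this kind relies on a roughly normal baseline and ignores autocorrelation, which is why it is a poor fit for exponentially growing epidemic counts.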
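Simple exponential smoothing, as described in this paragraph, can be sketched in a few lines of Python. The sketch assumes the common recursive form, level = alpha * x + (1 - alpha) * previous level, initialised at the first observation; the function names, the smoothing constant and the data are hypothetical, and the cited surveillance studies may use different initialisations.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the one-step-ahead forecast after smoothing `series`."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    level = series[0]  # assumed initialisation: first observation
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level


def smoothing_weights(alpha, n):
    """Weights placed on the n most recent observations, newest first."""
    return [alpha * (1 - alpha) ** k for k in range(n)]


forecast = simple_exponential_smoothing([10, 20, 30], alpha=0.5)
weights = smoothing_weights(0.5, 3)
```

The weights decay geometrically with age, which is the sense in which recent observations carry more weight.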

Methodology

COVID-19 daily data of all reported cases were taken from the Worldometers website (worldometers.info/coronavirus/#countries). Data for India were of primary interest, but data for the two countries above and the two countries below India in the severity of the epidemic were also studied, to allow a comparison of the epidemic and to investigate the onset of flattening of the curve. Daily data from 15 February to June 30, 2020 were collected and analysed separately for each country. We used data up to 30th June for modelling, and a 77-day out-of-sample forecast was then produced from the ARIMA models fitted to the data. Actual data from 1st to 18th July were used to compute the accuracy and forecast error.

Box Jenkins procedure

Box and Jenkins (1971) popularised a method which combines autoregressive (AR) and moving average (MA) models. An ARMA(p,q) model is a combination of AR(p) and MA(q) models and is best suited to univariate time series modelling. In an AR(p) model the future value of a variable is assumed to depend on a linear combination of p past observations and a random error term. Mathematically, an AR(p) model can be expressed as follows:

$$Y_t = c + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + \varepsilon_t$$

Here Y_t and ε_t are the actual value and the error term at time period t, ϕ_i (i = 1,2,3, …, p) are model parameters and c is a constant. The integer p is known as the order of the model. Unlike the AR(p) model, an MA(q) model uses past errors as explanatory variables. The MA(q) model is given below:

$$Y_t = \mu + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q} + \varepsilon_t$$

Here μ is the mean of the series, θ_j (j = 1,2,3, …, q) are model parameters and q is the order of the model. Mathematically, an ARMA(p,q) model is represented as follows:

$$Y_t = c + \sum_{i=1}^{p} \phi_i Y_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} + \varepsilon_t$$

AR and MA models can only be applied to a univariate stationary time series. To test the stationarity of a time series we need to test for the presence of a unit root. If the series is not stationary in level, we need to difference it d (d = 1,2,3 …) times to make it stationary. Such a time series model is called an ARIMA(p,d,q) model.
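The ARMA(p,q) recursion described above, in which each value is a constant plus a weighted sum of p past values, a weighted sum of q past errors and the current error, can be made concrete with a short Python sketch. The function name `simulate_arma`, the convention that values and errors before the start of the series are treated as zero, and the coefficients used are assumptions made for illustration only.

```python
def simulate_arma(c, phi, theta, eps):
    """Generate Y_t = c + sum(phi_i * Y_{t-i}) + sum(theta_j * eps_{t-j}) + eps_t.

    phi and theta are lists of AR and MA coefficients; eps is the list of
    error terms. Pre-sample values and errors are assumed to be zero.
    """
    y = []
    for t in range(len(eps)):
        ar_part = sum(phi[i] * y[t - 1 - i]
                      for i in range(len(phi)) if t - 1 - i >= 0)
        ma_part = sum(theta[j] * eps[t - 1 - j]
                      for j in range(len(theta)) if t - 1 - j >= 0)
        y.append(c + ar_part + ma_part + eps[t])
    return y


# ARMA(1,1) with a single unit shock at t = 0 (hypothetical coefficients).
path = simulate_arma(c=1.0, phi=[0.5], theta=[0.4], eps=[1.0, 0.0, 0.0])
```

With these inputs the path works out by hand to 2.0, 2.4 and 2.2, which shows how the shock enters once directly and once more through the MA term before decaying through the AR term.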

ARIMA modelling steps

The first step is to check the stationarity of the time series, either by plotting the series or by conducting the Augmented Dickey-Fuller (ADF) test. The second step is identification of the model: graphically, the AR and MA orders can be deduced from the autocorrelation function (ACF) and partial autocorrelation function (PACF) plots. The ARIMA parameters are then estimated by the least squares method. EViews 8 and JMulTi software were used. While EViews requires the model order (p,d,q) to be specified on the basis of the ACF and PACF plots, JMulTi performs the model specification automatically using the Hannan-Rissanen model selection algorithm (1982). The best model is selected on the basis of AIC values, after which residual analysis is carried out. The out-of-sample forecast is based on data from February 15, 2020 to June 30, 2020, and a 77-day forward forecast up to September 15, 2020 is produced from the model. The procedure is repeated for the US, Brazil, Russia and Spain to check the model specification and forecasting accuracy for the five most severely affected countries.
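The sequence of differencing, least squares estimation and out-of-sample forecasting can be sketched in plain Python for the simplest non-trivial case, an ARIMA(1,1,0) model. This is a toy stand-in and not the EViews or JMulTi procedure used in the study: the order is fixed by hand rather than chosen by the Hannan-Rissanen algorithm, there are no MA terms, and all function names and data are hypothetical.

```python
def difference(series, d=1):
    """Apply first differencing d times."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def fit_ar1(x):
    """Ordinary least squares fit of x_t = c + phi * x_{t-1} + e_t."""
    prev, curr = x[:-1], x[1:]
    n = len(prev)
    mean_prev, mean_curr = sum(prev) / n, sum(curr) / n
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    sxy = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    phi = sxy / sxx
    return mean_curr - phi * mean_prev, phi


def forecast_arima_110(series, steps):
    """Forecast cumulative levels by modelling first differences as AR(1)."""
    diffs = difference(series, 1)
    c, phi = fit_ar1(diffs)
    level, last_diff = series[-1], diffs[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = c + phi * last_diff  # next daily increase
        level += last_diff               # undo the differencing
        forecasts.append(level)
    return forecasts


# Hypothetical cumulative counts whose daily increases follow d_t = 10 + 0.5 * d_{t-1}.
cumulative = [100, 104, 116, 132, 150, 169, 188.5]
next_two = forecast_arima_110(cumulative, steps=2)
```

Because the toy data are noise-free, the fit recovers c = 10 and phi = 0.5 and the forecasts continue the same recursion; real case counts would need the stationarity checks, order selection and residual diagnostics listed above.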

Results

The first step was to test for a unit root in all five time series. A visual examination of the data plots suggested that the series were rising exponentially and were non-stationary. Other than the Russian time series of COVID incidence, all series had to be differenced. The Augmented Dickey-Fuller test established that the Russian series was stationary in level, while the Brazilian series was integrated of first order and the remaining three series, namely India, Spain and the US, were integrated of second order. The model specifications determined by the Hannan-Rissanen algorithm [29] were India (4,2,4), Brazil (3,1,2), Russia (3,0,0), Spain (4,2,4) and US (1,2,1) respectively. The residuals of the ARIMA series were plotted and found to be stationary. The ARIMA models were then used to forecast the out-of-sample COVID outbreak for 77 days up to September 15, 2020. The forecast values for each country are presented in Table 1, Table 2, Table 3, Table 4 and Table 5. The graphical plots with 95% confidence intervals are presented in Fig. 1, Fig. 2, Fig. 3, Fig. 4 and Fig. 5. We compared the forecasts with the actual data from 1st July to 18th July and checked the forecast efficiency using the mean absolute deviation (MAD) and the mean absolute percentage error (MAPE). The MAD was lowest for Spain, followed by Russia, whereas India, Brazil and the US exhibited increasing absolute deviations, indicating that the actual values lean towards the upper bound of the forecast. In other words, the forecast indicated a worsening situation and a steepening of the case graph for India, Brazil and the US in the days to come. A better measure of forecast efficiency is the MAPE, which expresses the absolute deviations as a percentage of the actual numbers. Percentages are easily compared to give a relative estimate of the severity of the spread across the countries under consideration. The MAPE for India, Brazil and the US was 3.701%, 1.844% and 2.885% respectively.
It was lowest for Russia and Spain, at 1.090% and 0.832%, indicating a very tight forecast. The smaller numbers for Russia and Spain further indicate that the forecast is following the linear trend established by the past data. Spain has even dropped out of the top five countries in the world. Even though the MAPE numbers for the US, India and Brazil are all less than 4.0%, the relatively larger numbers indicate a trend which is steepening and leaning towards the upper bound of the forecast. The MAPE numbers validate the accuracy of the forecast. The results are presented in Table 6.
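The two accuracy measures used above can be written out as a minimal Python sketch. MAD is taken here as the mean of the absolute forecast errors and MAPE as the mean of the absolute errors expressed as a percentage of the actual values, which matches the description in the text; the function names and the numbers are hypothetical and are not the July case counts.

```python
def mean_absolute_deviation(actual, forecast):
    """Mean of |actual - forecast| over the comparison window."""
    errors = [abs(a - f) for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors)


def mean_absolute_percentage_error(actual, forecast):
    """Mean of |actual - forecast| / |actual|, expressed in percent."""
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical two-day comparison.
actual_cases = [100.0, 200.0]
forecast_cases = [110.0, 190.0]

mad = mean_absolute_deviation(actual_cases, forecast_cases)
mape = mean_absolute_percentage_error(actual_cases, forecast_cases)
```

Both days miss by 10 cases, so the MAD is 10, but the MAPE of 7.5% weights the miss on the smaller day more heavily. This scale-free property is what makes MAPE comparable across countries with very different case totals.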
Table 1

ARIMA model specifications.

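| | INDIA | BRAZIL | RUSSIA | SPAIN | US |
|---|---|---|---|---|---|
| MODEL | (4,2,4) | (3,1,2) | (3,0,0) | (4,2,4) | (1,2,1) |
| AR1 | 1.375893 | 1.38224 | 1.821833 | −0.1032 | 0.224382 |
| Std Err | 0.13892 | 0.232872 | 0.080456 | 0.24914 | 0.199694 |
| T | 9.90423 | 5.93562 | 22.64392 | −0.41423 | 1.12363 |
| P | 0.00000 | 0.00000 | 0.00000 | 0.67941 | 0.26321 |
| AR2 | −1.02318 | −0.83395 | −0.647 | 0.265827 | |
| Std Err | 0.112545 | 0.328354 | 0.161136 | 0.201803 | |
| T | −9.09125 | −2.53978 | −4.01521 | 1.31726 | |
| P | 0.00000 | 0.01227 | 0.00010 | 0.19014 | |
| AR3 | 1.369517 | 0.448294 | −0.17526 | −0.43591 | |
| Std Err | 0.105984 | 0.12919 | 0.080759 | 0.125818 | |
| T | 12.92191 | 3.47003 | −2.17015 | −3.46459 | |
| P | 0.00000 | 0.00071 | 0.03177 | 0.00073 | |
| AR4 | −0.72424 | | | −0.63051 | |
| Std Err | 0.132004 | | | 0.160281 | |
| T | −5.48652 | | | −3.93377 | |
| P | 0.00000 | | | 0.00014 | |
| MA1 | 1.794408 | 0.824621 | | 0.380611 | 0.594318 |
| Std Err | | 0.24318 | | | 0.165616 |
| T | | 3.39099 | | | 3.58854 |
| P | | 0.00092 | | | 0.00047 |
| MA2 | −1.50841 | −0.11077 | | 0.150048 | |
| Std Err | | 0.23289 | | | |
| T | | −0.47564 | | | |
| P | | 0.63513 | | | |
| MA3 | 1.591911 | | | −0.37507 | |
| MA4 | −0.89826 | | | −0.4913 | |
| AIC | 2128.399 | 2689.998 | 2077.294 | 2179.333 | 2624.271 |
| SBC | 2154.546 | 2707.474 | 2088.974 | 2205.481 | 2632.987 |
| Log Likelihood | −1055.2 | −1339 | −1034.65 | −1080.67 | −1309.14 |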

Source: Authors’ own calculation.
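The AIC row of Table 1 can be related to the log likelihood row with a short Python sketch based on the standard definition AIC = 2k − 2 ln L. The parameter count k = p + q + 1 used below is our inference from the reported figures and is not stated by the authors; the dictionary simply transcribes the model orders, log likelihoods and AIC values from the table.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


# country: ((p, d, q), log likelihood, AIC), transcribed from Table 1.
reported = {
    "India": ((4, 2, 4), -1055.2, 2128.399),
    "Brazil": ((3, 1, 2), -1339.0, 2689.998),
    "Russia": ((3, 0, 0), -1034.65, 2077.294),
    "Spain": ((4, 2, 4), -1080.67, 2179.333),
    "US": ((1, 2, 1), -1309.14, 2624.271),
}

# Gap between the recomputed and the reported AIC for each country.
gaps = {}
for country, ((p, d, q), loglik, reported_aic) in reported.items():
    k = p + q + 1  # assumed count: AR terms + MA terms + one extra parameter
    gaps[country] = abs(aic(loglik, k) - reported_aic)
```

Under this assumption every recomputed value agrees with the table to within the rounding of the log likelihood. Note that AIC is meaningful for comparing candidate orders fitted to the same series, not for ranking models across countries.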

Table 2

Forecast data for COVID-19 outbreak in India.

| DATE | LOWER CI | FORECAST | UPPER CI | STD_ERR |
|---|---|---|---|---|
| 07/01/2020 | 603,907 | 605,084 | 606,262 | 600.9606 |
| 07/02/2020 | 621,640 | 623,844 | 626,048 | 1124.4701 |
| 07/03/2020 | 639,485 | 642,773 | 646,062 | 1677.9432 |
| 07/04/2020 | 658,599 | 663,129 | 667,658 | 2311.0731 |
| 07/05/2020 | 678,031 | 683,792 | 689,552 | 2939.1669 |
| 07/06/2020 | 697,137 | 704,039 | 710,940 | 3521.313 |
| 07/07/2020 | 717,145 | 725,228 | 733,311 | 4123.9509 |
| 07/08/2020 | 738,198 | 747,529 | 756,861 | 4760.9582 |
| 07/09/2020 | 759,060 | 769,604 | 780,147 | 5379.2975 |
| 07/10/2020 | 780,073 | 791,821 | 803,570 | 5994.242 |
| 07/11/2020 | 802,265 | 815,308 | 828,351 | 6654.942 |
| 07/12/2020 | 824,890 | 839,280 | 853,670 | 7341.9044 |
| 07/13/2020 | 847,242 | 862,980 | 878,719 | 8029.8984 |
| 07/14/2020 | 870,285 | 887,448 | 904,610 | 8756.6787 |
| 07/15/2020 | 894,290 | 912,992 | 931,694 | 9542.059 |
| 07/16/2020 | 918,216 | 938,511 | 958,807 | 10355.0318 |
| 07/17/2020 | 942,191 | 964,142 | 986,093 | 11199.8207 |
| 07/18/2020 | 967,129 | 990,871 | 1,014,612 | 12113.2121 |
| 07/19/2020 | 992,534 | 1,018,184 | 1,043,834 | 13087.0884 |
| 07/20/2020 | 1,017,713 | 1,045,348 | 1,072,982 | 14099.6111 |
| 07/21/2020 | 1,043,389 | 1,073,132 | 1,102,875 | 15175.4833 |
| 07/22/2020 | 1,069,915 | 1,101,928 | 1,133,942 | 16333.7611 |
| 07/23/2020 | 1,096,454 | 1,130,855 | 1,165,256 | 17551.9068 |
| 07/24/2020 | 1,122,982 | 1,159,885 | 1,196,787 | 18828.2916 |
| 07/25/2020 | 1,150,284 | 1,189,858 | 1,229,433 | 20191.4058 |
| 07/26/2020 | 1,178,063 | 1,220,473 | 1,262,884 | 21638.3565 |
| 07/27/2020 | 1,205,677 | 1,251,050 | 1,296,423 | 23149.8286 |
| 07/28/2020 | 1,233,645 | 1,282,138 | 1,330,631 | 24741.8074 |
| 07/29/2020 | 1,262,358 | 1,314,163 | 1,365,968 | 26431.5544 |
| 07/30/2020 | 1,291,161 | 1,346,437 | 1,401,714 | 28202.6936 |
| 07/31/2020 | 1,319,928 | 1,378,826 | 1,437,724 | 30050.3959 |
| 08/01/2020 | 1,349,317 | 1,412,029 | 1,474,740 | 31996.3832 |
| 08/02/2020 | 1,379,180 | 1,445,899 | 1,512,619 | 34041.2004 |
| 08/03/2020 | 1,408,942 | 1,479,831 | 1,550,720 | 36168.6273 |
| 08/04/2020 | 1,438,956 | 1,514,196 | 1,589,436 | 38388.456 |
| 08/05/2020 | 1,469,620 | 1,549,422 | 1,629,224 | 40715.8197 |
| 08/06/2020 | 1,500,436 | 1,584,987 | 1,669,539 | 43139.4772 |
| 08/07/2020 | 1,531,212 | 1,620,691 | 1,710,170 | 45653.2561 |
| 08/08/2020 | 1,562,487 | 1,657,101 | 1,751,715 | 48273.3948 |
| 08/09/2020 | 1,594,221 | 1,694,184 | 1,794,147 | 51002.6718 |
| 08/10/2020 | 1,625,914 | 1,731,415 | 1,836,916 | 53828.1115 |
| 08/11/2020 | 1,657,789 | 1,769,027 | 1,880,266 | 56755.2362 |
| 08/12/2020 | 1,690,226 | 1,807,426 | 1,924,625 | 59796.9572 |
| 08/13/2020 | 1,722,857 | 1,846,229 | 1,969,601 | 62946.0768 |
| 08/14/2020 | 1,755,460 | 1,885,202 | 2,014,944 | 66196.2032 |
| 08/15/2020 | 1,788,461 | 1,924,794 | 2,061,128 | 69559.2399 |
| 08/16/2020 | 1,821,897 | 1,965,051 | 2,108,205 | 73039.2166 |
| 08/17/2020 | 1,855,344 | 2,005,528 | 2,155,713 | 76626.1288 |
| 08/18/2020 | 1,888,925 | 2,046,354 | 2,203,784 | 80322.6473 |
| 08/19/2020 | 1,922,987 | 2,087,897 | 2,252,808 | 84139.5271 |
| 08/20/2020 | 1,957,271 | 2,129,890 | 2,302,509 | 88072.5738 |
| 08/21/2020 | 1,991,544 | 2,172,088 | 2,352,631 | 92115.7886 |
| 08/22/2020 | 2,026,135 | 2,214,836 | 2,403,537 | 96277.7173 |
| 08/23/2020 | 2,061,130 | 2,258,229 | 2,455,329 | 100562.9208 |
| 08/24/2020 | 2,096,178 | 2,301,903 | 2,507,629 | 104963.9251 |
| 08/25/2020 | 2,131,328 | 2,345,908 | 2,560,488 | 109481.5652 |
| 08/26/2020 | 2,166,887 | 2,390,567 | 2,614,248 | 114124.6156 |
| 08/27/2020 | 2,202,684 | 2,435,706 | 2,668,728 | 118891.0175 |
| 08/28/2020 | 2,238,488 | 2,481,084 | 2,723,679 | 123775.4657 |
| 08/29/2020 | 2,274,547 | 2,526,959 | 2,779,371 | 128783.9514 |
| 08/30/2020 | 2,310,975 | 2,573,455 | 2,835,936 | 133921.0867 |
| 08/31/2020 | 2,347,488 | 2,620,279 | 2,893,070 | 139181.4645 |
| 09/01/2020 | 2,384,084 | 2,667,426 | 2,950,768 | 144564.7979 |
| 09/02/2020 | 2,421,026 | 2,715,174 | 3,009,322 | 150078.1141 |
| 09/03/2020 | 2,458,210 | 2,763,417 | 3,068,624 | 155720.8153 |
| 09/04/2020 | 2,495,420 | 2,811,932 | 3,128,443 | 161488.4038 |
| 09/05/2020 | 2,532,836 | 2,860,904 | 3,188,973 | 167384.9613 |
| 09/06/2020 | 2,570,583 | 2,910,470 | 3,250,357 | 173414.8553 |
| 09/07/2020 | 2,608,441 | 2,960,401 | 3,312,360 | 179574.3196 |
| 09/08/2020 | 2,646,370 | 3,010,654 | 3,374,938 | 185862.4365 |
| 09/09/2020 | 2,684,590 | 3,061,462 | 3,438,333 | 192284.7388 |
| 09/10/2020 | 2,723,048 | 3,112,770 | 3,502,493 | 198841.583 |
| 09/11/2020 | 2,761,551 | 3,164,381 | 3,567,211 | 205529.2858 |
| 09/12/2020 | 2,800,221 | 3,216,420 | 3,632,619 | 212350.5262 |
| 09/13/2020 | 2,839,186 | 3,269,024 | 3,698,863 | 219309.2637 |
| 09/14/2020 | 2,878,278 | 3,322,020 | 3,765,762 | 226403.0034 |
| 09/15/2020 | 2,917,435 | 3,375,343 | 3,833,250 | 233630.5279 |

Source: Authors’ own computation.
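The relation between the forecast, its standard error and the 95% confidence bounds in the forecast tables can be checked with a small Python sketch. It assumes the usual normal approximation, bound = forecast ± 1.96 × standard error; the multiplier 1.96 and the function name are our assumptions, and the inputs are the first row of Table 2 (forecast 605,084, standard error 600.9606).

```python
Z_95 = 1.96  # assumed normal quantile for a two-sided 95% interval


def confidence_interval(forecast, std_err, z=Z_95):
    """Return (lower, upper) bounds as forecast -/+ z * std_err."""
    half_width = z * std_err
    return forecast - half_width, forecast + half_width


# First row of Table 2 (India).
lower, upper = confidence_interval(605084, 600.9606)
```

The computed bounds agree with the tabulated 603,907 and 606,262 to within one case, the small gap being attributable to rounding of the published forecast. Since the standard error grows with the horizon, the same formula explains why the intervals widen steadily towards September.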

Table 3

Forecast and 95% confidence interval for COVID outbreak in Brazil.

| DATE | LOWER CI | FORECAST | UPPER CI | STD_ERR |
|---|---|---|---|---|
| 07/01/2020 | 1,439,607 | 1,448,644 | 1,457,681 | 4610.585 |
| 07/02/2020 | 1,467,503 | 1,484,229 | 1,500,956 | 8534.161 |
| 07/03/2020 | 1,494,870 | 1,517,010 | 1,539,150 | 11296.27 |
| 07/04/2020 | 1,523,982 | 1,550,697 | 1,577,411 | 13630.03 |
| 07/05/2020 | 1,554,011 | 1,585,926 | 1,617,841 | 16283.47 |
| 07/06/2020 | 1,583,200 | 1,621,272 | 1,659,344 | 19424.84 |
| 07/07/2020 | 1,611,204 | 1,655,902 | 1,700,600 | 22805.37 |
| 07/08/2020 | 1,638,688 | 1,690,134 | 1,741,580 | 26248.49 |
| 07/09/2020 | 1,666,099 | 1,724,467 | 1,782,836 | 29780.16 |
| 07/10/2020 | 1,693,346 | 1,758,950 | 1,824,554 | 33472 |
| 07/11/2020 | 1,720,204 | 1,793,378 | 1,866,551 | 37334.16 |
| 07/12/2020 | 1,746,635 | 1,827,649 | 1,908,663 | 41334.32 |
| 07/13/2020 | 1,772,737 | 1,861,818 | 1,950,898 | 45450 |
| 07/14/2020 | 1,798,577 | 1,895,950 | 1,993,322 | 49680.85 |
| 07/15/2020 | 1,824,148 | 1,930,047 | 2,035,947 | 54031.33 |
| 07/16/2020 | 1,849,425 | 1,964,081 | 2,078,736 | 58498.89 |
| 07/17/2020 | 1,874,411 | 1,998,039 | 2,121,667 | 63076.58 |
| 07/18/2020 | 1,899,125 | 2,031,931 | 2,164,736 | 67759.09 |
| 07/19/2020 | 1,923,581 | 2,065,764 | 2,207,948 | 72543.85 |
| 07/20/2020 | 1,947,782 | 2,099,540 | 2,251,298 | 77428.9 |
| 07/21/2020 | 1,971,730 | 2,133,254 | 2,294,777 | 82411.53 |
| 07/22/2020 | 1,995,430 | 2,166,904 | 2,338,379 | 87488.69 |
| 07/23/2020 | 2,018,887 | 2,200,493 | 2,382,099 | 92657.68 |
| 07/24/2020 | 2,042,109 | 2,234,022 | 2,425,934 | 97916.23 |
| 07/25/2020 | 2,065,099 | 2,267,490 | 2,469,880 | 103262.2 |
| 07/26/2020 | 2,087,862 | 2,300,897 | 2,513,933 | 108693.6 |
| 07/27/2020 | 2,110,400 | 2,334,244 | 2,558,088 | 114208.2 |
| 07/28/2020 | 2,132,719 | 2,367,531 | 2,602,343 | 119804.1 |
| 07/29/2020 | 2,154,822 | 2,400,757 | 2,646,693 | 125479.6 |
| 07/30/2020 | 2,176,713 | 2,433,924 | 2,691,136 | 131232.8 |
| 07/31/2020 | 2,198,395 | 2,467,032 | 2,735,669 | 137062.1 |
| 08/01/2020 | 2,219,872 | 2,500,080 | 2,780,288 | 142965.9 |
| 08/02/2020 | 2,241,146 | 2,533,068 | 2,824,990 | 148942.6 |
| 08/03/2020 | 2,262,222 | 2,565,998 | 2,869,774 | 154990.8 |
| 08/04/2020 | 2,283,101 | 2,598,869 | 2,914,636 | 161108.9 |
| 08/05/2020 | 2,303,787 | 2,631,681 | 2,959,574 | 167295.8 |
| 08/06/2020 | 2,324,283 | 2,664,434 | 3,004,586 | 173549.9 |
| 08/07/2020 | 2,344,591 | 2,697,130 | 3,049,669 | 179870.1 |
| 08/08/2020 | 2,364,714 | 2,729,767 | 3,094,820 | 186255.1 |
| 08/09/2020 | 2,384,654 | 2,762,346 | 3,140,039 | 192703.7 |
| 08/10/2020 | 2,404,414 | 2,794,868 | 3,185,322 | 199214.7 |
| 08/11/2020 | 2,423,997 | 2,827,332 | 3,230,667 | 205787.1 |
| 08/12/2020 | 2,443,404 | 2,859,739 | 3,276,074 | 212419.8 |
| 08/13/2020 | 2,462,637 | 2,892,088 | 3,321,539 | 219111.7 |
| 08/14/2020 | 2,481,700 | 2,924,381 | 3,367,062 | 225861.7 |
| 08/15/2020 | 2,500,594 | 2,956,617 | 3,412,639 | 232668.9 |
| 08/16/2020 | 2,519,321 | 2,988,796 | 3,458,270 | 239532.4 |
| 08/17/2020 | 2,537,883 | 3,020,918 | 3,503,954 | 246451.1 |
| 08/18/2020 | 2,556,282 | 3,052,985 | 3,549,687 | 253424.2 |
| 08/19/2020 | 2,574,521 | 3,084,995 | 3,595,469 | 260450.9 |
| 08/20/2020 | 2,592,600 | 3,116,950 | 3,641,299 | 267530.1 |
| 08/21/2020 | 2,610,522 | 3,148,848 | 3,687,174 | 274661.2 |
| 08/22/2020 | 2,628,289 | 3,180,691 | 3,733,094 | 281843.2 |
| 08/23/2020 | 2,645,902 | 3,212,479 | 3,779,057 | 289075.4 |
| 08/24/2020 | 2,663,363 | 3,244,212 | 3,825,061 | 296357 |
| 08/25/2020 | 2,680,673 | 3,275,889 | 3,871,106 | 303687.3 |
| 08/26/2020 | 2,697,835 | 3,307,512 | 3,917,189 | 311065.4 |
| 08/27/2020 | 2,714,850 | 3,339,080 | 3,963,311 | 318490.7 |
| 08/28/2020 | 2,731,719 | 3,370,594 | 4,009,469 | 325962.5 |
| 08/29/2020 | 2,748,444 | 3,402,053 | 4,055,662 | 333480 |
| 08/30/2020 | 2,765,027 | 3,433,458 | 4,101,890 | 341042.7 |
| 08/31/2020 | 2,781,469 | 3,464,810 | 4,148,151 | 348649.7 |
| 09/01/2020 | 2,797,771 | 3,496,107 | 4,194,443 | 356300.6 |
| 09/02/2020 | 2,813,935 | 3,527,351 | 4,240,767 | 363994.6 |
| 09/03/2020 | 2,829,962 | 3,558,541 | 4,287,121 | 371731.1 |
| 09/04/2020 | 2,845,853 | 3,589,678 | 4,333,503 | 379509.5 |
| 09/05/2020 | 2,861,611 | 3,620,762 | 4,379,914 | 387329.3 |
| 09/06/2020 | 2,877,236 | 3,651,793 | 4,426,351 | 395189.8 |
| 09/07/2020 | 2,892,729 | 3,682,772 | 4,472,814 | 403090.4 |
| 09/08/2020 | 2,908,092 | 3,713,697 | 4,519,303 | 411030.7 |
| 09/09/2020 | 2,923,326 | 3,744,571 | 4,565,815 | 419010 |
| 09/10/2020 | 2,938,433 | 3,775,392 | 4,612,351 | 427027.9 |
| 09/11/2020 | 2,953,412 | 3,806,161 | 4,658,909 | 435083.7 |
| 09/12/2020 | 2,968,267 | 3,836,878 | 4,705,489 | 443177 |
| 09/13/2020 | 2,982,997 | 3,867,543 | 4,752,089 | 451307.2 |
| 09/14/2020 | 2,997,604 | 3,898,157 | 4,798,709 | 459473.9 |
| 09/15/2020 | 3,012,090 | 3,928,719 | 4,845,348 | 467676.6 |

Source: Authors’ own computation.

Table 4

Forecast and 95% confidence interval for COVID-19 outbreak in Russia.

| DATE | LOWER CI | FORECAST | UPPER CI | STD_ERR |
|---|---|---|---|---|
| 07/01/2020 | 653,535 | 654,393 | 655,251 | 437.7359 |
| 07/02/2020 | 659,024 | 660,807 | 662,590 | 909.7199 |
| 07/03/2020 | 664,181 | 667,085 | 669,990 | 1481.792 |
| 07/04/2020 | 669,040 | 673,227 | 677,413 | 2135.867 |
| 07/05/2020 | 673,619 | 679,229 | 684,840 | 2862.586 |
| 07/06/2020 | 677,928 | 685,091 | 692,254 | 3654.641 |
| 07/07/2020 | 681,978 | 690,810 | 699,642 | 4506.393 |
| 07/08/2020 | 685,775 | 696,385 | 706,994 | 5413.246 |
| 07/09/2020 | 689,326 | 701,813 | 714,301 | 6371.346 |
| 07/10/2020 | 692,635 | 707,095 | 721,554 | 7377.383 |
| 07/11/2020 | 695,707 | 712,226 | 728,746 | 8428.454 |
| 07/12/2020 | 698,545 | 717,208 | 735,870 | 9521.978 |
| 07/13/2020 | 701,152 | 722,037 | 742,921 | 10655.63 |
| 07/14/2020 | 703,531 | 726,712 | 749,893 | 11827.27 |
| 07/15/2020 | 705,684 | 731,232 | 756,781 | 13034.97 |
| 07/16/2020 | 707,615 | 735,597 | 763,579 | 14276.89 |
| 07/17/2020 | 709,323 | 739,804 | 770,284 | 15551.35 |
| 07/18/2020 | 710,813 | 743,852 | 776,890 | 16856.75 |
| 07/19/2020 | 712,085 | 747,740 | 783,395 | 18191.59 |
| 07/20/2020 | 713,142 | 751,468 | 789,794 | 19554.42 |
| 07/21/2020 | 713,985 | 755,034 | 796,083 | 20943.87 |
| 07/22/2020 | 714,615 | 758,438 | 802,260 | 22358.65 |
| 07/23/2020 | 715,035 | 761,677 | 808,320 | 23797.49 |
| 07/24/2020 | 715,246 | 764,753 | 814,260 | 25259.17 |
| 07/25/2020 | 715,249 | 767,663 | 820,078 | 26742.52 |
| 07/26/2020 | 715,046 | 770,408 | 825,770 | 28246.4 |
| 07/27/2020 | 714,639 | 772,986 | 831,334 | 29769.71 |
| 07/28/2020 | 714,029 | 775,398 | 836,767 | 31311.38 |
| 07/29/2020 | 713,217 | 777,642 | 842,066 | 32870.36 |
| 07/30/2020 | 712,205 | 779,718 | 847,230 | 34445.63 |
| 07/31/2020 | 710,996 | 781,625 | 852,255 | 36036.18 |
| 08/01/2020 | 709,589 | 783,365 | 857,140 | 37641.05 |
| 08/02/2020 | 707,988 | 784,935 | 861,882 | 39259.28 |
| 08/03/2020 | 706,193 | 786,336 | 866,479 | 40889.93 |
| 08/04/2020 | 704,207 | 787,568 | 870,929 | 42532.08 |
| 08/05/2020 | 702,030 | 788,631 | 875,232 | 44184.84 |
| 08/06/2020 | 699,665 | 789,524 | 879,384 | 45847.3 |
| 08/07/2020 | 697,114 | 790,249 | 883,384 | 47518.61 |
| 08/08/2020 | 694,378 | 790,804 | 887,230 | 49197.9 |
| 08/09/2020 | 691,459 | 791,190 | 890,922 | 50884.34 |
| 08/10/2020 | 688,359 | 791,408 | 894,457 | 52577.09 |
| 08/11/2020 | 685,079 | 791,457 | 897,835 | 54275.33 |
| 08/12/2020 | 681,623 | 791,338 | 901,053 | 55978.26 |
| 08/13/2020 | 677,991 | 791,051 | 904,112 | 57685.09 |
| 08/14/2020 | 674,185 | 790,597 | 907,010 | 59395.04 |
| 08/15/2020 | 670,209 | 789,977 | 909,745 | 61107.34 |
| 08/16/2020 | 666,063 | 789,190 | 912,318 | 62821.23 |
| 08/17/2020 | 661,750 | 788,238 | 914,726 | 64535.97 |
| 08/18/2020 | 657,272 | 787,121 | 916,970 | 66250.81 |
| 08/19/2020 | 652,631 | 785,840 | 919,049 | 67965.05 |
| 08/20/2020 | 647,830 | 784,396 | 920,963 | 69677.95 |
| 08/21/2020 | 642,870 | 782,790 | 922,710 | 71388.83 |
| 08/22/2020 | 637,755 | 781,022 | 924,290 | 73096.99 |
| 08/23/2020 | 632,485 | 779,094 | 925,703 | 74801.74 |
| 08/24/2020 | 627,065 | 777,007 | 926,949 | 76502.43 |
| 08/25/2020 | 621,495 | 774,761 | 928,027 | 78198.38 |
| 08/26/2020 | 615,778 | 772,358 | 928,937 | 79888.95 |
| 08/27/2020 | 609,918 | 769,799 | 929,680 | 81573.5 |
| 08/28/2020 | 603,916 | 767,086 | 930,255 | 83251.4 |
| 08/29/2020 | 597,775 | 764,219 | 930,663 | 84922.04 |
| 08/30/2020 | 591,497 | 761,200 | 930,903 | 86584.81 |
| 08/31/2020 | 585,085 | 758,030 | 930,976 | 88239.11 |
| 09/01/2020 | 578,542 | 754,712 | 930,882 | 89884.37 |
| 09/02/2020 | 571,870 | 751,246 | 930,621 | 91519.99 |
| 09/03/2020 | 565,072 | 747,633 | 930,195 | 93145.44 |
| 09/04/2020 | 558,150 | 743,877 | 929,603 | 94760.14 |
| 09/05/2020 | 551,108 | 739,977 | 928,846 | 96363.57 |
| 09/06/2020 | 543,948 | 735,937 | 927,925 | 97955.19 |
| 09/07/2020 | 536,673 | 731,757 | 926,841 | 99534.49 |
| 09/08/2020 | 529,285 | 727,439 | 925,593 | 101101 |
| 09/09/2020 | 521,788 | 722,986 | 924,184 | 102654.1 |
| 09/10/2020 | 514,184 | 718,399 | 922,615 | 104193.4 |
| 09/11/2020 | 506,476 | 713,680 | 920,885 | 105718.5 |
| 09/12/2020 | 498,667 | 708,832 | 918,996 | 107228.8 |
| 09/13/2020 | 490,760 | 703,855 | 916,950 | 108724 |
| 09/14/2020 | 482,758 | 698,753 | 914,748 | 110203.5 |
| 09/15/2020 | 474,664 | 693,527 | 912,390 | 111667 |

Source: Authors’ own computation.

Table 5

Forecast and 95% confidence intervals for COVID-19 outbreak in Spain.

| DATE | LOWER CI | FORECAST | UPPER CI | STD_ERR |
|---|---|---|---|---|
| 07/01/2020 | 295,067 | 296,504 | 297,942 | 733.5741 |
| 07/02/2020 | 294,084 | 296,695 | 299,307 | 1332.366 |
| 07/03/2020 | 292,753 | 296,853 | 300,952 | 2091.702 |
| 07/04/2020 | 291,496 | 297,115 | 302,735 | 2867.293 |
| 07/05/2020 | 290,095 | 297,437 | 304,779 | 3745.853 |
| 07/06/2020 | 288,393 | 297,774 | 307,155 | 4786.492 |
| 07/07/2020 | 286,419 | 298,102 | 309,786 | 5960.999 |
| 07/08/2020 | 284,043 | 298,345 | 312,648 | 7297.349 |
| 07/09/2020 | 281,496 | 298,554 | 315,611 | 8702.891 |
| 07/10/2020 | 278,836 | 298,739 | 318,642 | 10154.9 |
| 07/11/2020 | 276,180 | 298,962 | 321,744 | 11623.88 |
| 07/12/2020 | 273,539 | 299,246 | 324,954 | 13116.37 |
| 07/13/2020 | 270,811 | 299,568 | 328,326 | 14672.39 |
| 07/14/2020 | 267,937 | 299,903 | 331,869 | 16309.53 |
| 07/15/2020 | 264,825 | 300,198 | 335,571 | 18047.73 |
| 07/16/2020 | 261,506 | 300,447 | 339,388 | 19868.05 |
| 07/17/2020 | 258,046 | 300,664 | 343,282 | 21744.3 |
| 07/18/2020 | 254,525 | 300,883 | 347,241 | 23652.35 |
| 07/19/2020 | 251,003 | 301,140 | 351,277 | 25580.76 |
| 07/20/2020 | 247,457 | 301,439 | 355,421 | 27542.25 |
| 07/21/2020 | 243,841 | 301,766 | 359,691 | 29554.04 |
| 07/22/2020 | 240,084 | 302,085 | 364,086 | 31633.81 |
| 07/23/2020 | 236,156 | 302,372 | 368,587 | 33784.09 |
| 07/24/2020 | 232,078 | 302,623 | 373,169 | 35993.12 |
| 07/25/2020 | 227,903 | 302,859 | 377,814 | 38243.43 |
| 07/26/2020 | 223,688 | 303,107 | 382,526 | 40520.45 |
| 07/27/2020 | 219,457 | 303,387 | 387,318 | 42822.67 |
| 07/28/2020 | 215,190 | 303,700 | 392,209 | 45158.72 |
| 07/29/2020 | 210,843 | 304,024 | 397,204 | 47541.89 |
| 07/30/2020 | 206,374 | 304,335 | 402,296 | 49980.92 |
| 07/31/2020 | 201,770 | 304,619 | 407,467 | 52474.53 |
| 08/01/2020 | 197,055 | 304,879 | 412,703 | 55013.3 |
| 08/02/2020 | 192,270 | 305,134 | 417,999 | 57585.04 |
| 08/03/2020 | 187,451 | 305,406 | 423,362 | 60182.65 |
| 08/04/2020 | 182,605 | 305,705 | 428,805 | 62807.22 |
| 08/05/2020 | 177,713 | 306,024 | 434,336 | 65466.29 |
| 08/06/2020 | 172,739 | 306,347 | 439,955 | 68168.54 |
| 08/07/2020 | 167,658 | 306,654 | 445,651 | 70917.92 |
| 08/08/2020 | 162,468 | 306,940 | 451,413 | 73711.6 |
| 08/09/2020 | 157,194 | 307,213 | 457,232 | 76541.85 |
| 08/10/2020 | 151,865 | 307,488 | 463,111 | 79400.95 |
| 08/11/2020 | 146,502 | 307,779 | 469,056 | 82285.62 |
| 08/12/2020 | 141,106 | 308,091 | 475,077 | 85198.3 |
| 08/13/2020 | 135,655 | 308,416 | 481,177 | 88144.96 |
| 08/14/2020 | 130,124 | 308,737 | 487,351 | 91130.88 |
| 08/15/2020 | 124,500 | 309,045 | 493,590 | 94157.34 |
| 08/16/2020 | 118,788 | 309,338 | 499,887 | 97220.94 |
| 08/17/2020 | 113,008 | 309,624 | 506,239 | 100315.8 |
| 08/18/2020 | 107,183 | 309,916 | 512,649 | 103437.3 |
| 08/19/2020 | 101,322 | 310,224 | 519,125 | 106584.3 |
| 08/20/2020 | 95,421 | 310,546 | 525,671 | 109759.7 |
| 08/21/2020 | 89,462 | 310,874 | 532,287 | 112967.7 |
| 08/22/2020 | 83,427 | 311,198 | 538,968 | 116211.4 |
| 08/23/2020 | 77,312 | 311,509 | 545,706 | 119490.5 |
| 08/24/2020 | 71,123 | 311,810 | 552,497 | 122801.8 |
| 08/25/2020 | 64,879 | 312,110 | 559,342 | 126141 |
| 08/26/2020 | 58,592 | 312,418 | 566,244 | 129505.5 |
| 08/27/2020 | 52,269 | 312,739 | 573,209 | 132895.5 |
| 08/28/2020 | 45,900 | 313,070 | 580,239 | 136313.5 |
| 08/29/2020 | 39,473 | 313,402 | 587,332 | 139762.5 |
| 08/30/2020 | 32,975 | 313,729 | 594,482 | 143244.1 |
| 08/31/2020 | 26,407 | 314,046 | 601,684 | 146757 |
| 09/01/2020 | 19,778 | 314,358 | 608,938 | 150298.7 |
| 09/02/2020 | 13,099 | 314,671 | 616,243 | 153866.1 |
| 09/03/2020 | 6,381 | 314,993 | 623,605 | 157458 |
| 09/04/2020 | −377 | 315,325 | 631,026 | 161075.2 |
| 09/05/2020 | −7,182 | 315,663 | 638,507 | 164719.7 |
| 09/06/2020 | −14,045 | 316,000 | 646,045 | 168393.4 |
| 09/07/2020 | −20,972 | 316,331 | 653,635 | 172096.9 |
| 09/08/2020 | −27,961 | 316,657 | 661,275 | 175828.7 |
| 09/09/2020 | −35,004 | 316,979 | 668,963 | 179586.8 |
| 09/10/2020 | −42,091 | 317,306 | 676,703 | 183369.2 |
| 09/11/2020 | −49,217 | 317,640 | 684,497 | 187175.5 |
| 09/12/2020 | −56,385 | 317,981 | 692,347 | 191006.6 |
| 09/13/2020 | −63,601 | 318,326 | 700,252 | 194864.2 |
| 09/14/2020 | −70,873 | 318,668 | 708,210 | 198749.2 |
| 09/15/2020 | −78,203 | 319,006 | 716,216 | 202661.6 |

Source: Authors’ own computation.

Fig. 1

Covid-19 forecast plot for India.

Fig. 2

Covid-19 forecast plot for Brazil.

Fig. 3

Covid-19 forecast plot for Russia.

Fig. 4

Covid-19 forecast plot for Spain.

Fig. 5

Covid-19 forecast plot for US.

Table 6

Forecast and 95% confidence intervals for COVID-19 outbreak in US.

| DATE | LOWER CI | FORECAST | UPPER CI | STD_ERR |
|---|---|---|---|---|
| 07/01/2020 | 2,765,075 | 2,772,875 | 2,780,674 | 3979.266 |
| 07/02/2020 | 2,803,614 | 2,818,529 | 2,833,443 | 7609.775 |
| 07/03/2020 | 2,841,873 | 2,864,474 | 2,887,074 | 11530.92 |
| 07/04/2020 | 2,879,825 | 2,910,744 | 2,941,664 | 15775.66 |
| 07/05/2020 | 2,917,466 | 2,957,348 | 2,997,230 | 20348.26 |
| 07/06/2020 | 2,954,817 | 3,004,288 | 3,053,759 | 25240.75 |
| 07/07/2020 | 2,991,899 | 3,051,562 | 3,111,226 | 30440.89 |
| 07/08/2020 | 3,028,741 | 3,099,173 | 3,169,605 | 35935.59 |
| 07/09/2020 | 3,065,365 | 3,147,119 | 3,228,874 | 41712.28 |
| 07/10/2020 | 3,101,794 | 3,195,401 | 3,289,008 | 47759.35 |
| 07/11/2020 | 3,138,051 | 3,244,019 | 3,349,986 | 54066.19 |
| 07/12/2020 | 3,174,153 | 3,292,972 | 3,411,791 | 60623.19 |
| 07/13/2020 | 3,210,117 | 3,342,261 | 3,474,405 | 67421.61 |
| 07/14/2020 | 3,245,959 | 3,391,885 | 3,537,812 | 74453.5 |
| 07/15/2020 | 3,281,694 | 3,441,846 | 3,601,998 | 81711.6 |
| 07/16/2020 | 3,317,334 | 3,492,142 | 3,666,950 | 89189.26 |
| 07/17/2020 | 3,352,892 | 3,542,774 | 3,732,656 | 96880.35 |
| 07/18/2020 | 3,388,377 | 3,593,741 | 3,799,104 | 104779.2 |
| 07/19/2020 | 3,423,802 | 3,645,044 | 3,866,286 | 112880.6 |
| 07/20/2020 | 3,459,175 | 3,696,683 | 3,934,191 | 121179.7 |
| 07/21/2020 | 3,494,505 | 3,748,657 | 4,002,809 | 129671.8 |
| 07/22/2020 | 3,529,801 | 3,800,967 | 4,072,134 | 138352.9 |
| 07/23/2020 | 3,565,070 | 3,853,613 | 4,142,157 | 147218.8 |
| 07/24/2020 | 3,600,319 | 3,906,595 | 4,212,870 | 156265.9 |
| 07/25/2020 | 3,635,557 | 3,959,912 | 4,284,267 | 165490.5 |
| 07/26/2020 | 3,670,788 | 4,013,565 | 4,356,342 | 174889.5 |
| 07/27/2020 | 3,706,020 | 4,067,554 | 4,429,087 | 184459.5 |
| 07/28/2020 | 3,741,257 | 4,121,878 | 4,502,498 | 194197.7 |
| 07/29/2020 | 3,776,507 | 4,176,538 | 4,576,569 | 204101.1 |
| 07/30/2020 | 3,811,774 | 4,231,533 | 4,651,293 | 214167.2 |
| 07/31/2020 | 3,847,062 | 4,286,865 | 4,726,667 | 224393.2 |
| 08/01/2020 | 3,882,378 | 4,342,532 | 4,802,686 | 234776.8 |
| 08/02/2020 | 3,917,725 | 4,398,535 | 4,879,344 | 245315.6 |
| 08/03/2020 | 3,953,108 | 4,454,873 | 4,956,638 | 256007.4 |
| 08/04/2020 | 3,988,531 | 4,511,547 | 5,034,563 | 266849.9 |
| 08/05/2020 | 4,023,998 | 4,568,557 | 5,113,116 | 277841.2 |
| 08/06/2020 | 4,059,513 | 4,625,902 | 5,192,291 | 288979.3 |
| 08/07/2020 | 4,095,081 | 4,683,584 | 5,272,087 | 300262.2 |
| 08/08/2020 | 4,130,703 | 4,741,600 | 5,352,498 | 311688.2 |
| 08/09/2020 | 4,166,384 | 4,799,953 | 5,433,522 | 323255.3 |
| 08/10/2020 | 4,202,128 | 4,858,641 | 5,515,155 | 334962.1 |
| 08/11/2020 | 4,237,936 | 4,917,665 | 5,597,394 | 346806.8 |
| 08/12/2020 | 4,273,814 | 4,977,025 | 5,680,236 | 358787.8 |
| 08/13/2020 | 4,309,763 | 5,036,720 | 5,763,678 | 370903.6 |
| 08/14/2020 | 4,345,786 | 5,096,751 | 5,847,717 | 383152.7 |
| 08/15/2020 | 4,381,886 | 5,157,118 | 5,932,350 | 395533.7 |
| 08/16/2020 | 4,418,067 | 5,217,820 | 6,017,574 | 408045.1 |
| 08/17/2020 | 4,454,329 | 5,278,858 | 6,103,387 | 420685.8 |
| 08/18/2020 | 4,490,677 | 5,340,232 | 6,189,787 | 433454.2 |
| 08/19/2020 | 4,527,113 | 5,401,942 | 6,276,770 | 446349.3 |
| 08/20/2020 | 4,563,639 | 5,463,987 | 6,364,335 | 459369.7 |
| 08/21/2020 | 4,600,257 | 5,526,368 | 6,452,478 | 472514.2 |
| 08/22/2020 | 4,636,969 | 5,589,084 | 6,541,199 | 485781.8 |
| 08/23/2020 | 4,673,779 | 5,652,136 | 6,630,494 | 499171.2 |
| 08/24/2020 | 4,710,687 | 5,715,524 | 6,720,361 | 512681.4 |
| 08/25/2020 | 4,747,697 | 5,779,248 | 6,810,799 | 526311.3 |
| 08/26/2020 | 4,784,809 | 5,843,307 | 6,901,805 | 540059.8 |
| 08/27/2020 | 4,822,027 | 5,907,702 | 6,993,377 | 553925.9 |
| 08/28/2020 | 4,859,352 | 5,972,433 | 7,085,513 | 567908.8 |
| 08/29/2020 | 4,896,786 | 6,037,499 | 7,178,212 | 582007.3 |
| 08/30/2020 | 4,934,330 | 6,102,901 | 7,271,472 | 596220.5 |
| 08/31/2020 | 4,971,988 | 6,168,639 | 7,365,290 | 610547.6 |
| 09/01/2020 | 5,009,759 | 6,234,712 | 7,459,665 | 624987.6 |
| 09/02/2020 | 5,047,647 | 6,301,121 | 7,554,596 | 639539.7 |
| 09/03/2020 | 5,085,652 | 6,367,866 | 7,650,080 | 654202.9 |
| 09/04/2020 | 5,123,777 | 6,434,947 | 7,746,116 | 668976.5 |
| 09/05/2020 | 5,162,022 | 6,502,363 | 7,842,703 | 683859.7 |
| 09/06/2020 | 5,200,391 | 6,570,115 | 7,939,838 | 698851.6 |
| 09/07/2020 | 5,238,883 | 6,638,202 | 8,037,521 | 713951.5 |
| 09/08/2020 | 5,277,501 | 6,706,625 | 8,135,750 | 729158.5 |
| 09/09/2020 | 5,316,246 | 6,775,384 | 8,234,523 | 744472 |
| 09/10/2020 | 5,355,119 | 6,844,479 | 8,333,838 | 759891.3 |
| 09/11/2020 | 5,394,123 | 6,913,909 | 8,433,696 | 775415.5 |
| 09/12/2020 | 5,433,258 | 6,983,675 | 8,534,093 | 791044 |
| 09/13/2020 | 5,472,525 | 7,053,777 | 8,635,029 | 806776 |
| 09/14/2020 | 5,511,926 | 7,124,214 | 8,736,502 | 822611 |
| 09/15/2020 | 5,551,463 | 7,194,987 | 8,838,512 | 838548.3 |

Source: Authors’ own computation.
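
The bounds in Table 6 appear to follow the usual normal-approximation construction, forecast ± z × STD_ERR. The Python sketch below checks this for three rows copied from the table. The z value (the two-sided 95% normal quantile) is an assumption inferred from the tabulated numbers rather than something the authors state, and `ci95` is a hypothetical helper name.

```python
from statistics import NormalDist

# Rows copied from Table 6 (US): date, lower CI, forecast, upper CI, std. error.
ROWS = [
    ("07/01/2020", 2_765_075, 2_772_875, 2_780_674, 3979.266),
    ("07/31/2020", 3_847_062, 4_286_865, 4_726_667, 224393.2),
    ("09/15/2020", 5_551_463, 7_194_987, 8_838_512, 838548.3),
]

# Assumed: two-sided 95% quantile of the standard normal (about 1.96).
Z_95 = NormalDist().inv_cdf(0.975)


def ci95(forecast, std_err, z=Z_95):
    """Return (lower, upper) bounds of a symmetric normal interval."""
    half_width = z * std_err
    return forecast - half_width, forecast + half_width


for date, lower, forecast, upper, std_err in ROWS:
    lo, hi = ci95(forecast, std_err)
    print(f"{date}: computed [{lo:,.0f}, {hi:,.0f}], tabulated [{lower:,}, {upper:,}]")
```

Because the standard error grows with the horizon, the interval widens from roughly ±8 thousand cases on the first day to roughly ±1.6 million by 15 September.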

The graphs show that for the US, Brazil and India the situation does not yet appear to be coming under control. For Russia and Spain, the situation is seemingly under control, and the epidemic appears to have reached its inflexion point. Forecast accuracy, measured by MAD and MAPE, is reported in Table 7.
Table 7

Forecast accuracy with mean absolute deviation (MAD) and mean absolute percentage error (MAPE).

| DATE | INDIA ACTUAL | INDIA FORECAST | BRAZIL ACTUAL | BRAZIL FORECAST | RUSSIA ACTUAL | RUSSIA FORECAST | SPAIN ACTUAL | SPAIN FORECAST | US ACTUAL | US FORECAST |
|---|---|---|---|---|---|---|---|---|---|---|
| 1-Jul-20 | 605,220 | 605,084 | 1,453,369 | 1,448,644 | 654,405 | 654,393 | 296,739 | 296,504 | 2,778,452 | 2,772,875 |
| 2-Jul-20 | 627,168 | 623,844 | 1,543,341 | 1,484,229 | 661,165 | 660,807 | 297,183 | 296,695 | 2,835,684 | 2,818,529 |
| 3-Jul-20 | 649,889 | 642,773 | 1,543,341 | 1,517,010 | 667,883 | 667,085 | 297,625 | 296,853 | 2,890,588 | 2,864,474 |
| 4-Jul-20 | 673,904 | 663,129 | 1,578,376 | 1,550,697 | 674,904 | 673,227 | 297,625 | 297,115 | 2,935,770 | 2,910,744 |
| 5-Jul-20 | 697,836 | 683,792 | 1,604,585 | 1,585,926 | 681,261 | 679,229 | 297,625 | 297,437 | 2,982,928 | 2,957,348 |
| 6-Jul-20 | 720,346 | 704,039 | 1,626,071 | 1,621,272 | 687,862 | 685,091 | 298,869 | 297,774 | 3,040,833 | 3,004,288 |
| 7-Jul-20 | 743,481 | 725,228 | 1,674,655 | 1,655,902 | 694,230 | 690,810 | 299,210 | 298,102 | 3,097,084 | 3,051,562 |
| 8-Jul-20 | 769,052 | 747,529 | 1,716,196 | 1,690,134 | 700,792 | 696,385 | 299,593 | 298,345 | 3,163,318 | 3,099,173 |
| 9-Jul-20 | 794,842 | 769,604 | 1,759,103 | 1,724,467 | 707,301 | 701,813 | 300,136 | 298,554 | 3,224,892 | 3,147,119 |
| 10-Jul-20 | 822,603 | 791,821 | 1,804,338 | 1,758,950 | 713,936 | 707,095 | 300,988 | 298,739 | 3,297,170 | 3,195,401 |
| 11-Jul-20 | 850,358 | 815,308 | 1,840,812 | 1,793,378 | 720,547 | 712,226 | 301,670 | 298,962 | 3,359,174 | 3,244,019 |
| 12-Jul-20 | 879,466 | 839,280 | 1,866,176 | 1,827,649 | 727,162 | 717,208 | 302,352 | 299,246 | 3,417,795 | 3,292,972 |
| 13-Jul-20 | 907,645 | 862,980 | 1,887,959 | 1,861,818 | 733,699 | 722,037 | 303,033 | 299,568 | 3,483,584 | 3,342,261 |
| 14-Jul-20 | 937,487 | 887,448 | 1,931,204 | 1,895,950 | 739,947 | 726,712 | 303,699 | 299,903 | 3,549,632 | 3,391,885 |
| 15-Jul-20 | 970,169 | 912,992 | 1,970,909 | 1,930,047 | 746,797 | 731,232 | 304,574 | 300,198 | 3,621,637 | 3,441,846 |
| 16-Jul-20 | 1,005,637 | 938,511 | 2,014,738 | 1,964,081 | 752,797 | 735,597 | 305,935 | 300,447 | 3,695,025 | 3,492,142 |
| 17-Jul-20 | 1,040,457 | 964,142 | 2,048,697 | 1,998,039 | 759,203 | 739,804 | 307,335 | 300,664 | 3,770,012 | 3,542,774 |
| 18-Jul-20 | 1,077,864 | 990,871 | 2,075,246 | 2,031,931 | 765,437 | 743,852 | 307,335 | 300,883 | 3,833,271 | 3,593,741 |

| METRIC | INDIA | BRAZIL | RUSSIA | SPAIN | US |
|---|---|---|---|---|---|
| MAD | 33,614 | 33,277 | 8,040.4 | 2,530 | 100,761 |
| MAPE | 3.701% | 1.844% | 1.090% | 0.832% | 2.885% |
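
The accuracy measures in Table 7 can be recomputed from the actual and forecast columns. The sketch below does so for Spain, assuming the standard definitions: MAD as the mean of |actual − forecast| and MAPE as the mean of |actual − forecast| / actual, expressed in percent. The function names `mad` and `mape` are illustrative and do not come from the paper.

```python
# Spain columns of Table 7, 1-18 July 2020.
ACTUAL = [
    296_739, 297_183, 297_625, 297_625, 297_625, 298_869,
    299_210, 299_593, 300_136, 300_988, 301_670, 302_352,
    303_033, 303_699, 304_574, 305_935, 307_335, 307_335,
]
FORECAST = [
    296_504, 296_695, 296_853, 297_115, 297_437, 297_774,
    298_102, 298_345, 298_554, 298_739, 298_962, 299_246,
    299_568, 299_903, 300_198, 300_447, 300_664, 300_883,
]


def mad(actual, forecast):
    """Mean absolute deviation between observed and forecast values."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent of the observed value."""
    return 100 * sum(abs(a - f) / a for a, f in zip(actual, forecast)) / len(actual)


print(f"MAD  = {mad(ACTUAL, FORECAST):.1f}")
print(f"MAPE = {mape(ACTUAL, FORECAST):.3f}%")
```

Under these definitions the Spain figures should agree, to rounding, with the values tabulated above. The absolute error grows steadily over the 18 days, which is worth keeping in mind when reading the longer-horizon forecasts.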

Discussion

India had controlled the spread of the pandemic very successfully until May 31, 2020. Once the lockdown was lifted, the movement of migrant labourers out of the hotspots of Delhi and Mumbai resulted in an explosion of cases, which saw India break into the top ten affected countries. By the end of June, India had already reached third place after the US and Brazil. The data for Spain showed a flattening of the curve, while Russia showed a clear inflexion point and a downward trend. At the time of writing, Spain has been pushed down the ranking by Peru and Chile. At the current rate, we estimate that India and Brazil will reach the 1.38 million and 2.47 million marks respectively, while the US is expected to reach the 4.29 million mark by the end of July 2020. This modelling is expected to better prepare these countries for the burgeoning demand for healthcare facilities.

Although the forecasts agreed well with the observed data, ARIMA models suffer from serious limitations in forecasting that are characteristic of time series models. Unlike regression models, they do not take causal variables into account; nevertheless, ARIMA models have found widespread and successful application in disease outbreak modelling. An ARIMA forecast, built on the autoregressive nature of the time series coupled with corrective incremental adjustments, essentially predicts a linear pattern and fails to predict a series with turning points. We have forecast the COVID-19 incidence up to September 15, 2020, assuming that no vaccine or other cure would be found by then. The exponentially rising graph of total cases indicates possible community spread. Any successful medical intervention would, however, change the forecast significantly. Further, even without a vaccine, Russia and Spain have shown a slowdown and flattening in their growth curves. Whether and when this will happen in the US, Brazil and India cannot be determined from this forecast.
The ARIMA model does not help in predicting the onset of flattening of the pandemic cases.
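
To illustrate why such a forecast cannot anticipate a turning point, the sketch below uses the simplest twice-differenced model, ARIMA(0,2,0) without a constant, whose point forecast simply continues the last observed slope: ŷ(T+h) = y(T) + h·(y(T) − y(T−1)). This toy specification is an assumption chosen for illustration and is not one of the models fitted in this study; `arima_020_forecast` is a hypothetical helper. The input values are the US actual counts for 16 to 18 July from Table 7.

```python
def arima_020_forecast(series, horizon):
    """Point forecasts of ARIMA(0,2,0) with no constant.

    The second difference is forecast as zero, so every future step
    repeats the last observed first difference (a straight line).
    """
    last = series[-1]
    slope = series[-1] - series[-2]
    return [last + h * slope for h in range(1, horizon + 1)]


# US actual cumulative cases, 16-18 July 2020 (Table 7).
us_actual = [3_695_025, 3_770_012, 3_833_271]

path = arima_020_forecast(us_actual, horizon=5)

# Day-on-day increments along the forecast path.
previous = [us_actual[-1]] + path[:-1]
daily_increments = [b - a for a, b in zip(previous, path)]

print(path)
print(daily_increments)  # every increment equals the last observed one
```

Models with autoregressive or moving-average terms add short-lived dynamics in the first few steps, but a forecast path of this kind still carries no information about when the curve will bend, which is the limitation noted in the text.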

Conclusion

ARIMA modelling of daily reported COVID-19 cases in the top five affected countries produced forecasts of good accuracy, as measured by MAD and MAPE. The governments concerned could use these forecasts to better manage and ramp up their healthcare preparedness for the pandemic.

Declaration of competing interest

On behalf of my co-authors, I, Dr. Alok Kumar Sahai, confirm that none of the authors have any conflict of interest to report.