
A novel hybrid fuzzy time series model for prediction of COVID-19 infected cases and deaths in India.

Niteesh Kumar1, Harendra Kumar2.   

Abstract

The world is under stress due to the unforeseen pandemic of novel COVID-19. The daily growth in confirmed COVID-19 cases puts all of humanity at high risk and has put pressure on health professionals to bring the disease under control as soon as possible. It is therefore necessary to predict the number of upcoming cases in order to prepare future plans of action and medical set-ups. The present manuscript proposes a hybrid fuzzy time series model for the prediction of upcoming COVID-19 infected cases and deaths in India using a modified fuzzy C-means clustering technique. The proposed model has two phases. In phase-I, the modified fuzzy C-means clustering technique is used to form basic intervals with the help of the cluster centroids, while in phase-II these intervals are upgraded to form sub-intervals. The proposed model is tested against available COVID-19 data, and its performance is measured by mean square error, root mean square error and average forecasting error rate. The novelty of the proposed model lies in the prediction of COVID-19 infected cases and deaths for the next 31 days. In addition, the approximate number of isolation beds and ICU requirements has been estimated. The projection of the present model is intended to provide a basis for decision makers preparing protection plans during the COVID-19 pandemic.
Copyright © 2021 ISA. Published by Elsevier Ltd. All rights reserved.

Keywords:  COVID-19; Clustering; Fuzzy C-means; Fuzzy time series; Pandemic

Year:  2021        PMID: 34253340      PMCID: PMC8259256          DOI: 10.1016/j.isatra.2021.07.003

Source DB:  PubMed          Journal:  ISA Trans        ISSN: 0019-0578            Impact factor:   5.911


Introduction

Currently, a novel virus from the family of coronaviruses, named COVID-19 by the World Health Organization (WHO), has spread all over the world. The number of COVID-19 infected cases has grown rapidly in most countries from December 2019 onward, which is why WHO declared it a pandemic. It is assumed that the spread of this virus originated in Wuhan, a city in China, in mid-December 2019. On 30 January 2020, the first case of COVID-19 was reported in India. After the first case, COVID-19 grew rapidly in almost all parts of India within a few months. As a result, the number of COVID-19 infected cases and deaths per month has increased, which shows a drastic change in the growth of the novel coronavirus. Unfortunately, a vaccine for the prevention of COVID-19 has not yet been developed in India or the rest of the world. Thus, the mobility rate of COVID-19 is at a peak in India. Hence, it becomes necessary to predict the number of upcoming COVID-19 infected cases and deaths in order to prepare future plans of action and medical set-ups in advance. For forecasting purposes, several mathematical models have been developed in the past few months. Deb and Majumdar [1] proposed a time series model for the analysis of incidence trends and also obtained the reproductive number of COVID-19. Mandal et al. [2] analysed COVID-19 under restrictions on travel from other countries and the impact of quarantine in India using a simple mathematical model. Bhardwaj [3] proposed a regression-based predictive model for the total number of infections due to COVID-19 at the end of the outbreak. Bhola et al. [4] discussed a predictive model for the future projection of this pandemic in India. Chatterjee et al. [5] proposed a compartmental SEIR model for assessing the impact on healthcare and the number of infections due to the COVID-19 epidemic in India. 
Chakraborty and Ghosh [6] built a real-time forecasting method for the prediction of future COVID-19 cases and obtained case fatality rates for different countries. Roy and Kar [7] studied the impact of natural factors such as summer and humidity on COVID-19 by analysing the presently available data. Mondal and Ghosh [8] focused on the analysis of the exponential growth of COVID-19 cases in India with respect to other countries and sketched its future course. Mandal et al. [9] proposed a mathematical model that introduces a quarantine class and other government measures to control the spread of COVID-19, and forecast its trends in three states of India. Marimuthu et al. [10] estimated the number of infected cases of COVID-19 in Delhi using the SEIR model. Scheiner et al. [11] developed a mathematical model called the death kinetic law using the SEIR model and compared it with another approach, the infection-to-death delay rule. El Koufi et al. [12] extended an SIR epidemic model with non-linear incidence and vertical rates from a deterministic to a stochastic framework. Al-Qaness et al. [13] proposed a forecasting model to estimate confirmed cases of corona in China based on previous cases. Lalwani et al. [14] proposed a 3-phase SIRD model for determining the lockdown period for highly affected regions and predicted the lockdown period for India and Italy. Patrikar et al. [15] presented a modified SEIR model to forecast the estimated cases of novel coronavirus in India. Salgotra et al. [16] developed a genetic programming-based prediction model to estimate confirmed cases and deaths of COVID-19 across three states and India as a whole. Pai et al. [17] predicted the number of active cases of COVID-19 in India by considering the impact of lockdown along with the inflation of active cases. Kuniya [18] applied the SEIR compartmental model to predict the epidemic peak of COVID-19 in Japan. Tiwari et al. 
[19] built a time series forecasting model to predict the epidemic of COVID-19 in India. Tomar and Gupta [20] proposed a data-driven model to predict the active cases in India for the upcoming days and to measure the impact of lockdown on COVID-19. Krishna and Prakash [21] presented a model to estimate the mobility of COVID-19. Sarkar et al. [22] presented the SARIIqSq model to forecast the scenario of COVID-19 in India. Malavika et al. [23] presented short-term predictions using a logistic growth model and also used the SIR model to forecast the peak time, active cases and the effect of lockdown in India. Kaushik et al. [24] reviewed different aspects of COVID-19 in India such as clinical presentation, treatment and virology. Ambikapathy and Krishnamurthy [25] proposed a model for assessing the impact of lockdown on the spread of COVID-19 in India. Acharya and Porwal [26] presented an ecological study that uses a percentile ranking method to obtain domain-specific and overall vulnerability and reported the number of active cases of coronavirus in 9 districts of India. Cooper et al. [27] proposed an SIR model to examine the effectiveness of COVID-19 spreading in the community. Singhal et al. [28] proposed two mathematical models, with and without parameters, for investigating the trend of COVID-19 and gave some predictions for the upcoming days. Ranjan [29] analysed the outbreak of COVID-19 in India using an epidemiological model for long-term and short-term predictions. Poddar et al. [30] proposed a model to study the spreading rate, death rate and recovery rate of COVID-19 and its prediction in India. Vasantha and Patil [31] presented an overview of the development of different models of COVID-19 in India. Pandey et al. [32] used regression and the SEIR model to forecast the outbreak of COVID-19 in India. 
Alkahtani and Alzaid [33] presented a model of COVID-19 based on a fractional differential operator and a numerical method that uses Lagrange polynomials for solving the system of equations. Arora et al. [34] presented deep learning-based models for forecasting the number of positive cases of COVID-19 for the union territories and 32 states of India. Sahoo and Sapra [35] gave a data-driven epidemic model for the analysis of COVID-19 in India based on real COVID-19 data. Çakan [36] presented an SEIR epidemic model that considers the impact of health and analysed the global and local stability of this model. Giri et al. [37] proposed a neural network model that introduces the lockdown condition to show the infection risk of COVID-19.
Clustering is an approach in which grouped data are separated into smaller data groups based on similarity measures. In the literature, many researchers have used clustering techniques with fuzzy time series (FTS) for forecasting purposes. Song and Chissom [38], [39] proposed a time series model based on fuzzy set theory. Huarng [40] showed that the accuracy of a forecasting model can be improved by changing the length of the intervals. Li et al. [41] developed a forecasting algorithm for FTS based on fuzzy C-means (FCM) clustering. Qiu et al. [42] proposed a high-order FTS model based on fuzzy logical relationships (FLR) and automatic clustering. Sang et al. [43] presented an FTS forecasting model based on the IFCM clustering technique. Kumar et al. [44] proposed two distance metrics and developed two clustering algorithms, AMFCM and EMFCM. Zhang et al. [45] developed an FTS forecasting model based on time series clustering and multiple linear regression. However, the FCM clustering technique uses the Euclidean distance (ED), which easily gets stuck in a noisy environment and does not give good results. So, in the present manuscript, basic FCM is modified by using an exponential function to tolerate noisy data before it is used with the FTS technique. 
Along with this, most researchers use mathematical modelling, such as SEIR and SIR techniques, rather than soft computing techniques for the prediction of COVID-19. Some of these models forecast the effect of COVID-19 in the upcoming days or weeks with a higher error rate. This encouraged us to develop a novel hybrid fuzzy time series model (ANHFTS), based on a modified FCM clustering technique, for the prediction of COVID-19 infected cases and deaths in India for the coming 31 days. A primary reason for proposing this model is that the approach is more capable than classical techniques and more robust than existing predictive models. It can also predict the COVID-19 infected cases and deaths in both the short term and the long term with small error. The proposed model has two phases. In phase-I, the modified fuzzy c-means (MFCM) clustering technique is used to form intervals with the help of the centroids, while in phase-II these intervals are upgraded to form sub-intervals, which are then used to predict the approximate cases of infection and deaths in India. The main contributions of the present manuscript are: (i) a hybrid model for forecasting by the FTS technique is developed; (ii) the FCM clustering technique is modified by using an exponential function to tolerate noisy data; (iii) the model can predict the approximate COVID-19 infected cases and deaths for trained and untrained data in India; (iv) the approximate number of isolation beds and ICU requirements till 31 August 2020 in India is estimated. The proposed ANHFTS model is considered an unsupervised learning process. The rest of the manuscript is organized as follows. Some basic preliminaries of FCM and FTS are introduced in Section 2. Section 3 presents the notation and MFCM with its necessary conditions. Section 4 describes the proposed model. 
Section 5 contains the performance measures used to evaluate the proposed model. Section 6 presents the implementation of the proposed algorithm on two examples, with the prediction of COVID-19 infected cases and deaths in India for the next 31 days till 31 August 2020. Section 7 concludes this work.

Background information

This section discusses the basics of the FCM and FTS techniques that are used in the proposed model.

Fuzzy C-means clustering technique

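FCM is a well-known soft clustering technique that partitions the historical data in such a way that a data point can belong to more than one cluster with distinct membership values. Ruspini [46], [47] formed clusters using the concept of fuzzy set theory. Later on, Bezdek [48] improved the clustering process after the FCM formulation by Dunn [49]. The main objective of FCM is to minimize the objective function

$$J = \sum_{i=1}^{n} \sum_{j=1}^{c} \mu_{ij}^{m} \, \lVert x_i - v_j \rVert^{2} \tag{1}$$

with the necessary conditions

$$\mu_{ij} = \left[ \sum_{k=1}^{c} \left( \frac{\lVert x_i - v_j \rVert}{\lVert x_i - v_k \rVert} \right)^{2/(m-1)} \right]^{-1} \tag{2}$$

$$v_j = \frac{\sum_{i=1}^{n} \mu_{ij}^{m} \, x_i}{\sum_{i=1}^{n} \mu_{ij}^{m}} \tag{3}$$

where $x_i$ is the $i$th data point, $\mu_{ij}$ represents its membership value in the $j$th cluster, $\lVert \cdot \rVert$ denotes the Euclidean distance, $c$ is the number of clusters and $v_j$ is the $j$th cluster centroid. Eqs. (2), (3) are the necessary conditions that minimize Eq. (1). Here, $m > 1$ is the fuzziness exponent and $\sum_{j=1}^{c} \mu_{ij} = 1$ for every $i$.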
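The alternating updates of basic FCM can be sketched in a few lines of Python. This is a minimal illustration for scalar data, not the authors' implementation: the function name `fcm_1d`, the default fuzziness exponent `m=2.0`, the random initialisation, and the stopping rule based on centroid movement are all assumptions made for the example.

```python
import random

def fcm_1d(data, c, m=2.0, tol=1e-6, max_iter=300, seed=0):
    """Basic fuzzy C-means for scalar data (illustrative sketch).

    Returns (centroids, memberships); memberships[i][j] is the degree to
    which data[i] belongs to cluster j.
    """
    rng = random.Random(seed)
    n = len(data)
    # Random initial memberships, normalised so that each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([r / total for r in row])

    centroids = [0.0] * c
    power = 2.0 / (m - 1.0)
    for _ in range(max_iter):
        # Centroid update: mean of the data weighted by membership**m.
        new_centroids = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            new_centroids.append(sum(wi * x for wi, x in zip(w, data)) / sum(w))
        # Membership update from the relative distances to every centroid.
        for i, x in enumerate(data):
            d = [abs(x - v) for v in new_centroids]
            if min(d) == 0.0:
                hit = d.index(0.0)  # point sits exactly on a centroid
                u[i] = [1.0 if j == hit else 0.0 for j in range(c)]
            else:
                u[i] = [1.0 / sum((d[j] / d[k]) ** power for k in range(c))
                        for j in range(c)]
        # Stop once no centroid moves by more than tol (assumed stopping rule).
        shift = max(abs(a - b) for a, b in zip(new_centroids, centroids))
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, u
```

For example, `fcm_1d([1.0, 1.2, 0.8, 10.0, 10.3, 9.7], c=2)` places one centroid near each of the two groups and returns membership rows that each sum to 1.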

Some basic definitions

This section contains the important definitions used throughout the present manuscript. FTS was first defined by Song and Chissom. The basic definitions of FTS, time-variant and time-invariant FTS, FLR and FLR group (FLRG) are briefly reviewed as follows.

Fuzzy Time Series

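Let $f_i(t)$ $(i = 1, 2, \ldots)$ be fuzzy sets defined on a universe of discourse $Y(t)$ $(t = \ldots, 0, 1, 2, \ldots)$, a subset of $\mathbb{R}$. Then the collection of the $f_i(t)$ is denoted by $F(t)$ and is called an FTS defined on $Y(t)$ [38].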

First Order Model

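Suppose $F(t)$ is formed by $F(t-1)$. Then the fuzzy relation can be expressed as $F(t) = F(t-1) \circ R(t, t-1)$, where $\circ$ represents the max–min composition and $R(t, t-1)$ is the fuzzy relationship between $F(t)$ and $F(t-1)$. Then $F(t)$ is called a first order model [39].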

Time-variant and Time-invariant

If the relation $R(t, t-1)$ of $F(t)$ does not depend on $t$, i.e. $R(t, t-1) = R(t-1, t-2)$ for all $t$, then $F(t)$ is known as a time-invariant FTS; otherwise it is called time-variant [50].

Fuzzy Logical Relationship

If $F(t-1) = A_i$ and $F(t) = A_j$, then the relationship between $F(t-1)$ and $F(t)$ is known as an FLR and can be expressed as $A_i \rightarrow A_j$, where $A_i$ and $A_j$ are the previous and current states of the FLR [51].

Fuzzy Logical Relationship Group

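Assume that $A_i \rightarrow A_{j_1}, A_i \rightarrow A_{j_2}, \ldots, A_i \rightarrow A_{j_k}$ are FLRs. Then these FLRs can be grouped to form an FLRG as $A_i \rightarrow A_{j_1}, A_{j_2}, \ldots, A_{j_k}$ [40].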
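The last two definitions translate directly into code. The sketch below builds first order FLRs from a fuzzified sequence and groups them by their left-hand side; the function names and the label strings are hypothetical, and dropping repeated right-hand sides within a group is an assumed convention rather than something the definitions above prescribe.

```python
def first_order_flrs(labels):
    """Pair each state with its successor, giving FLRs of the form A_i -> A_j."""
    return list(zip(labels[:-1], labels[1:]))

def group_flrs(flrs):
    """Group FLRs that share a left-hand side (FLRG).

    Keeps first-seen order and drops repeated right-hand sides
    (an assumed convention for this sketch).
    """
    groups = {}
    for lhs, rhs in flrs:
        rhs_list = groups.setdefault(lhs, [])
        if rhs not in rhs_list:
            rhs_list.append(rhs)
    return groups

# Hypothetical fuzzified series, one linguistic label per time step.
series = ["A1", "A2", "A2", "A3", "A1", "A3"]
flrs = first_order_flrs(series)
flrgs = group_flrs(flrs)
```

With this input, `flrs` contains five relationships and `flrgs` maps `A1` to `[A2, A3]`, `A2` to `[A2, A3]` and `A3` to `[A1]`.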

Problem formulation and modified fuzzy c-means

Notation

The various notations used throughout this article are as:

Problem statement

At present, the whole of India faces a pandemic in the form of COVID-19, and the number of confirmed cases is growing rapidly day by day. It is therefore crucial to predict the number of upcoming infected cases and deaths in India, so that health professionals are prepared for the situation and able to control the COVID-19 pandemic. To address this problem, a novel hybrid predictive model has been developed using fuzzy time series based on MFCM.

Modified fuzzy C-means

Among clustering techniques, the basic FCM [48] technique is frequently used by researchers because it is easy to implement. In a noisy environment, however, it easily gets stuck and does not produce the desired output. To overcome this problem, the basic FCM has been modified by introducing a negative exponential variable for better robustness; the objective function of MFCM is given by Eq. (4), whose parameters depend on the dimension of the data and on the individual entries of each data point. The necessary conditions for the minimization of Eq. (4) are derived in the following sub-sections.

Derivation for cluster centroid

The first necessary condition for the minimization of the objective function takes the form of the cluster centroid, which is derived below. The derivation is carried out for the first cluster centroid by differentiating Eq. (5) with respect to that centroid while treating the other variables as constants. The same steps apply to any other cluster, which gives the general form of the cluster centroid for MFCM, Eq. (6).

Derivation for membership value

Another necessary condition for the minimization of Eq. (4) is determined with respect to the membership values. The Lagrangian function for Eq. (4) is given by Eq. (7). To obtain the necessary condition, Eq. (7) is differentiated with respect to the membership value while treating the other variables as constants, which gives Eq. (8). Differentiating again, this time with respect to the Lagrange multiplier, gives Eq. (9). Combining Eqs. (8), (9) yields the value of the multiplier, Eq. (10), and substituting this value back gives the general form of the membership value.

Description of the proposed ANHFTS model: MFCM based approach

The cases of COVID-19 are growing rapidly in India, which makes it difficult to obtain information on the approximate number of infected cases and deaths. The main concerns of the government are how to control this disease and how to predict the upcoming confirmed cases of COVID-19, so that public health and economic decisions can be prepared in advance on the basis of a mathematical model. The present manuscript addresses a novel model for the prediction of COVID-19 infected cases and deaths using the FTS technique based on MFCM clustering. The model predicts the cases in two phases:

Phase I: Form basic intervals using the cluster centroids obtained from the MFCM clustering technique.

Phase II: Upgrade the basic intervals into sub-intervals to forecast more accurately and predict the upcoming infected cases and deaths in India.

Phase I

In this phase, basic intervals are formed from the cluster centroids obtained by the MFCM clustering technique. To start MFCM, the number of clusters to be formed must be known. To obtain it, define the universe of discourse as $X = [X_{\min} - D_1, X_{\max} + D_2]$, where $D_1$ and $D_2$ are randomly chosen positive numbers and $X_{\min}$ and $X_{\max}$ are the minimum and maximum values of the collected historical data set, and partition $X$ into equal-length intervals; the number of these intervals is the number of clusters. Instead of applying basic FCM to obtain the centroids, the modified FCM is used to get better results. In MFCM, the membership values are assigned randomly at the start and the centroids are then obtained iteratively. The basic intervals are formed from the obtained centroids. Phase I is elaborated in algorithm A.

Phase II

In this phase, the basic intervals are upgraded by the FTS technique to forecast more accurate values. First, for each basic interval, count the elements of the historical data set that belong to it, and partition the interval into that many sub-intervals of equal length. Repeat this process until all basic intervals have been partitioned into sub-intervals. Then select only those sub-intervals that contain historical data. Define a linguistic variable for each selected sub-interval and allocate these variables to the historical data to fuzzify it. For data whose FLR is non-empty, the fuzzified time series is defuzzified by Eq. (11) using the mid-points of the upgraded intervals, where n is the total number of sub-intervals and t represents time. If the FLR is empty (untrained data), the predicted values are obtained by Eq. (12) from the previous predicted values, where r is the average rate of increment and decrement in all forecasted values obtained by Eq. (11) and h is any small number between 0 and 0.5 that damps the effect of a large increment or decrement in the value of r. Phase II is elaborated in algorithm B. Fig. 1 depicts the flow chart of the proposed ANHFTS model, containing algorithm A and algorithm B.
Fig. 1

Flow chart of proposed ANHFTS model.


Performance measure

The performance of the proposed ANHFTS model is evaluated with three measures: mean square error (MSE), root mean square error (RMSE) and average forecasting error rate (AFER).

Mean square error

The mean square error estimates the average of the squared differences between the forecasted and actual values [52]. MSE is given by Eq. (13); the lower its value, the better the forecast.

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (F_i - A_i)^2 \tag{13}$$

where n is the total number of data points, and $A_i$ and $F_i$ are the actual and forecasted values.

Root mean square error

The root mean square error measures how much the forecasted values differ from the actual values [53]. RMSE should be small for better forecasting and is calculated using Eq. (14).

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (F_i - A_i)^2} \tag{14}$$

where n is the total number of data points, and $A_i$ and $F_i$ are the actual and forecasted values.

Average forecasting error rate

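The average forecasting error rate [54] is the percentage error that reflects the absolute difference between the actual and forecasted values, relative to the actual value, at each point of time. It is defined in Eq. (15).

$$\mathrm{AFER} = \frac{1}{n} \sum_{i=1}^{n} \frac{\lvert A_i - F_i \rvert}{A_i} \times 100\% \tag{15}$$

where n is the total number of data points, and $A_i$ and $F_i$ are the actual and forecasted values.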
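The three measures are straightforward to compute. The sketch below uses their standard definitions; the function names and the three-point sample series are made up for illustration and are not taken from the COVID-19 data.

```python
import math

def mse(actual, forecast):
    """Mean of the squared differences between forecast and actual values."""
    return sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Square root of the mean square error."""
    return math.sqrt(mse(actual, forecast))

def afer(actual, forecast):
    """Average absolute error relative to the actual value, in percent."""
    return 100.0 * sum(abs(a - f) / a for a, f in zip(actual, forecast)) / len(actual)

# Illustrative values only.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
sample_mse = mse(actual, forecast)
sample_rmse = rmse(actual, forecast)
sample_afer = afer(actual, forecast)
```

Because RMSE is the square root of MSE, the two rows reported for each month can be cross-checked against each other; for instance, the April 2020 infected-case values 437.4471 and 191 360.0033 agree to within rounding.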

Experimental results and analysis

This section first shows that MFCM is robust to noisy data. The proposed ANHFTS model is then implemented on two examples to assess its performance, and the upcoming COVID-19 infected cases and deaths in India are predicted up to the end of August 2020.

Noisy environment effect

FCM is a well-known clustering technique. However, a noisy environment easily affects its result, whereas clustering should be robust to noise. So, FCM is modified by introducing a variable in negative exponential form to overcome this problem. Consider a universe of discourse with data points defined on it. By minimizing the sum of squared errors with respect to the estimate, the least-squares estimate is obtained by Eq. (16). Let {5, 3, 4.7, 6, 5.6, 5.1, 4.4, 5.3, 7, 4} be an artificial data set [55] to be tested by the least-squares method. By Eq. (16), the estimated values before and after adding the noisy value 30 to the artificial data set are 5 and 7.0833, which shows that noisy data strongly affect the minimizer. When the same data are applied to MFCM, the minimizers obtained by Eq. (6) are 4.9951 and 4.9259 before and after adding the noisy data respectively. Hence, MFCM tolerates a noisy environment.
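The effect can be illustrated with a short script. Since Eqs. (4)–(6) are not reproduced in this copy, the sketch below does not implement the authors' MFCM; it assumes a common negative-exponential weighting, $w_i = \exp(-\beta (x_i - v)^2)$ with $\beta$ set to the inverse sample variance, and iterates the resulting weighted-mean update for a single cluster. The function name `robust_center` and the choice of $\beta$ are assumptions, so the numbers it produces need not coincide with those reported above.

```python
import math

def robust_center(data, tol=1e-10, max_iter=1000):
    """Single-cluster centre under an assumed exp(-beta * d^2) weighting."""
    mean = sum(data) / len(data)
    # Assumed scale parameter: inverse of the (population) sample variance.
    beta = len(data) / sum((x - mean) ** 2 for x in data)
    v = mean
    for _ in range(max_iter):
        # Points far from the current centre receive exponentially small weight.
        w = [math.exp(-beta * (x - v) ** 2) for x in data]
        v_new = sum(wi * x for wi, x in zip(w, data)) / sum(w)
        if abs(v_new - v) < tol:
            return v_new
        v = v_new
    return v

data = [5, 3, 4.7, 6, 5.6, 5.1, 4.4, 5.3, 7, 4]  # the ten values listed above
noisy = data + [30]

mean_shift = abs(sum(noisy) / len(noisy) - sum(data) / len(data))
robust_shift = abs(robust_center(noisy) - robust_center(data))
```

Under these assumptions the plain mean moves by more than 2 when the outlier is added, while the exponentially weighted centre moves by a small fraction of that, which is the qualitative behaviour the paragraph above describes.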

Implementation of proposed ANHFTS model

The proposed ANHFTS model has been implemented for the future prediction of COVID-19 infected cases and deaths in India. Before the prediction, the model is tested against the available data of infected cases and deaths due to COVID-19 from April to July 2020 to check its efficiency. For predicting the number of infected cases of COVID-19 in India, the data from 1 April to 30 July 2020 are considered. This epidemic data has been taken from a Government authorized portal [56]. For simplicity, the data are divided into four monthly groups (April, May, June and July), and the proposed ANHFTS model is tested month-wise against the actual data of infected cases before predicting the cases for August 2020. First, apply the ANHFTS model to the April 2020 COVID-19 data. Let X be the universe of discourse containing the COVID-19 data of infected persons for April 2020. The minimum and maximum values of the data set are 2059 and 34 866, denoted by $X_{\min}$ and $X_{\max}$ respectively. For simplicity, the values of $D_1$ and $D_2$ are taken randomly as 59 and 2134 respectively. Then the universe of discourse is $X = [2000, 37\,000]$. Partition X into 7 (randomly chosen) equal-length intervals, each of length 5000. So, 7 clusters will be formed according to the present model. Now, initialize the membership values and obtain the average of the data as 14975.1. Apply steps 5 and 6 of algorithm A until the termination condition is satisfied. The iterative process of MFCM updates the membership values and the centroids successively. After the termination condition is satisfied, the centroids are arranged in increasing order, and step 8 of algorithm A yields the basic intervals. The elements belonging to the first basic interval are 2059, 2545, 3105 and 3684. 
Partition this interval into 4 sub-intervals using step 1 of algorithm B. Repeat the same procedure for the remaining intervals and select those sub-intervals that contain the given data. The obtained results are shown in Table 1.
Table 1

Selected sub-intervals with their mid-points for April 2020.

| Variable | Sub-interval | Corresponding elements | Mid-point |
|---|---|---|---|
| b1 | [1531.3185, 2165.7626] | 2059 | 1848.5405 |
| b2 | [2165.7626, 2800.2066] | 2545 | 2482.9846 |
| b3 | [2800.2066, 3434.6507] | 3105 | 3117.4286 |
| b4 | [3434.6507, 4069.0947] | 3684 | 3751.8727 |
| b5 | [4069.0947, 4660.3387] | 4293 | 4364.7167 |
| b6 | [4660.3387, 5251.5828] | 4777 | 4955.9608 |
| b7 | [5251.5828, 5842.8268] | 5350 | 5547.2048 |
| b8 | [5842.8268, 6434.0709] | 5915 | 6138.4488 |
| b9 | [6434.0709, 7025.3149] | 6728 | 6729.6929 |
| b10 | [7025.3149, 7986.8055] | 7599 | 7506.0602 |
| b11 | [7986.8055, 8948.2961] | 8453 | 8467.5508 |
| b12 | [8948.2961, 9909.7866] | 9211 | 9429.0413 |
| b13 | [9909.7866, 10871.2772] | 10 454 | 10 390.5319 |
| b14 | [10871.2772, 11844.1885] | 11 485 | 11 387.7329 |
| b15 | [11844.1885, 12817.0999] | 12 371 | 12 330.6442 |
| b16 | [12817.0999, 13790.0112] | 13 432 | 13 303.5556 |
| b17 | [13790.0112, 14762.9226] | 14 354 | 14 276.4669 |
| b18 | [14762.9226, 15735.8339] | 15 725 | 15 249.3782 |
| b19 | [15735.8339, 18591.0009] | 17 305, 18 544 | 17 877.2092 |
| b20 | [18591.0009, 21446.1679] | 20 081, 21 373 | 20 732.3762 |
| b21 | [21446.1679, 23115.8187] | 23 040 | 22 280.9933 |
| b22 | [23115.8187, 24785.4696] | 24 448 | 23 950.6441 |
| b23 | [24785.4696, 26455.1204] | 26 283 | 25 620.2950 |
| b24 | [26455.1204, 28124.7712] | 27 890 | 27 289.9458 |
| b25 | [28124.7712, 29961.8690] | 29 458 | 29 043.3201 |
| b26 | [29961.8690, 31798.9668] | 31 360 | 30 880.4179 |
| b27 | [31798.9668, 33636.0646] | 33 065 | 32 717.5157 |
| b28 | [33636.0646, 35473.1624] | 34 866 | 34 554.6135 |
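The construction behind Table 1 can be sketched as follows. Two things here are assumptions rather than statements from the text: the boundary rule (mid-points between neighbouring sorted centroids, with the outer bounds extended by half the nearest centroid gap) and the centroid values in `CENTROIDS`, which are back-calculated from the sub-interval boundaries in Table 1 because the centroids themselves are not listed. The function names are hypothetical.

```python
def basic_intervals(centroids):
    """Phase I (assumed rule): boundaries at mid-points between sorted centroids."""
    v = sorted(centroids)
    inner = [(a + b) / 2 for a, b in zip(v[:-1], v[1:])]
    lower = v[0] - (v[1] - v[0]) / 2      # extend below the first centroid
    upper = v[-1] + (v[-1] - v[-2]) / 2   # extend above the last centroid
    bounds = [lower] + inner + [upper]
    return list(zip(bounds[:-1], bounds[1:]))

def occupied_subintervals(interval, data):
    """Phase II: split an interval into as many equal parts as it has data
    points and keep only the parts that contain data."""
    lo, hi = interval
    members = [x for x in data if lo <= x < hi]
    k = len(members)
    rows = []
    for j in range(k):
        a = lo + j * (hi - lo) / k
        b = hi if j == k - 1 else lo + (j + 1) * (hi - lo) / k
        inside = [x for x in members if a <= x < b]
        if inside:
            rows.append((a, b, inside, (a + b) / 2))  # bounds, elements, mid-point
    return rows

# Back-calculated from Table 1 (assumption; not given in the text).
CENTROIDS = [2800.2066, 5337.9828, 8712.6470, 13029.9074,
             18441.7604, 24450.5754, 31798.9670]
# April 2020 cumulative infected cases, read from Table 1.
APRIL_CASES = [2059, 2545, 3105, 3684, 4293, 4777, 5350, 5915, 6728, 7599,
               8453, 9211, 10454, 11485, 12371, 13432, 14354, 15725, 17305,
               18544, 20081, 21373, 23040, 24448, 26283, 27890, 29458, 31360,
               33065, 34866]

table_rows = [row for iv in basic_intervals(CENTROIDS)
              for row in occupied_subintervals(iv, APRIL_CASES)]
```

Under these assumptions the sketch yields 28 occupied sub-intervals whose mid-points agree with Table 1 to within rounding. For rows b19 and b20 it produces narrower bounds than the printed ones (roughly [17163.42, 18591.00] and [20018.58, 21446.17]) while still reproducing the printed mid-points.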
After applying step 3 of algorithm B, linguistic variables are defined for each of the 28 sub-intervals, each expressed through the membership values of the sub-intervals in that linguistic variable. Now, allocate these linguistic variables to the historical data according to the sub-interval to which each value belongs. Then the first order FLRs and FLRGs are formed. After applying the remaining steps of the proposed ANHFTS model, the forecasted COVID-19 infected cases for April 2020 with their linguistic variables are obtained, as shown in Table 2.
Table 2

Forecasted infected cases of COVID-19 for the month of April, May, June and July 2020 in India.

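LV = linguistic variable; forecast = forecasted infected cases of COVID-19.

| Date | April 2020 LV | April 2020 forecast | May 2020 LV | May 2020 forecast | June 2020 LV | June 2020 forecast | July 2020 LV | July 2020 forecast |
|---|---|---|---|---|---|---|---|---|
| 1 | I1 | 2021 | I1 | 37 471 | I1 | 198 485 | I1 | 609 538 |
| 2 | I2 | 2399 | I2 | 39 560 | I2 | 205 184 | I2 | 626 775 |
| 3 | I3 | 3052 | I3 | 42 829 | I3 | 215 456 | I3 | 652 376 |
| 4 | I4 | 3694 | I4 | 46 155 | I4 | 225 730 | I4 | 675 653 |
| 5 | I5 | 4317 | I5 | 49 544 | I5 | 236 013 | I5 | 696 495 |
| 6 | I6 | 4920 | I6 | 53 001 | I6 | 246 307 | I6 | 716 497 |
| 7 | I7 | 5516 | I7 | 56 603 | I7 | 256 604 | I7 | 737 468 |
| 8 | I8 | 6110 | I8 | 60 366 | I8 | 266 959 | I8 | 761 458 |
| 9 | I9 | 6742 | I9 | 64 105 | I9 | 277 496 | I9 | 788 599 |
| 10 | I10 | 7503 | I10 | 67 583 | I10 | 288 226 | I10 | 817 035 |
| 11 | I11 | 8413 | I11 | 70 780 | I11 | 299 023 | I11 | 846 076 |
| 12 | I12 | 9380 | I12 | 73 879 | I12 | 309 907 | I12 | 875 750 |
| 13 | I13 | 10 347 | I13 | 77 148 | I13 | 321 065 | I13 | 905 893 |
| 14 | I14 | 11 318 | I14 | 80 959 | I14 | 332 511 | I14 | 936 820 |
| 15 | I15 | 12 292 | I15 | 85 348 | I15 | 344 057 | I15 | 968 566 |
| 16 | I16 | 13 268 | I16 | 90 050 | I16 | 355 730 | I16 | 1 000 985 |
| 17 | I17 | 14 243 | I17 | 95 092 | I17 | 367 800 | I17 | 1 034 623 |
| 18 | I18 | 15 556 | I18 | 100 500 | I18 | 380 285 | I18 | 1 069 536 |
| 19 | I19 | 17 724 | I19 | 106 148 | I19 | 392 915 | I19 | 1 105 449 |
| 20 | I19 | 17 724 | I20 | 112 150 | I20 | 406 281 | I20 | 1 143 113 |
| 21 | I20 | 20 275 | I21 | 118 533 | I21 | 421 951 | I21 | 1 182 607 |
| 22 | I20 | 20 275 | I22 | 125 090 | I22 | 440 045 | I22 | 1 225 814 |
| 23 | I21 | 22 253 | I23 | 131 770 | I23 | 458 627 | I23 | 1 278 650 |
| 24 | I22 | 23 892 | I24 | 138 581 | I24 | 476 037 | I24 | 1 339 135 |
| 25 | I23 | 25 566 | I25 | 145 484 | I25 | 492 216 | I25 | 1 395 061 |
| 26 | I24 | 27 257 | I26 | 152 533 | I26 | 507 968 | I26 | 1 442 334 |
| 27 | I25 | 29 009 | I27 | 159 739 | I27 | 524 454 | I27 | 1 486 467 |
| 28 | I26 | 30 826 | I28 | 167 056 | I28 | 543 236 | I28 | 1 530 135 |
| 29 | I27 | 32 666 | I29 | 174 555 | I29 | 564 426 | I29 | 1 573 321 |
| 30 | I28 | 33 920 | I30 | 182 246 | I30 | 579 347 | I30 | 1 602 320 |
| 31 | | | I31 | 187 506 | | | | |
| MSE | | 191 360.0033 | | 837 093.6694 | | 7 095 004.0344 | | 47 860 278.3793 |
| RMSE | | 437.4471 | | 914.9282 | | 2663.6449 | | 6918.1123 |
| AFER (%) | | 2.0093 | | 0.6061 | | 0.4802 | | 0.5561 |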
The whole process is repeated to forecast the COVID-19 infected cases for May, June and July 2020. The number of clusters formed for these months is 10, 8 and 10 respectively. After applying the steps of the proposed model, the forecasted results for May, June and July 2020 with their linguistic variables are evaluated and shown in Table 2. From Table 2, the calculated values of AFER (%), 2.0093, 0.6061, 0.4802 and 0.5561 for April, May, June and July 2020 respectively, are very small. Therefore, the forecasted numbers of infected persons obtained by the proposed model are very close to the actual values. Hence, the proposed model is well trained and suitable for the future prediction of the novel coronavirus. Fig. 2 shows the graphical representation of Table 2: Fig. 2(a), (b), (c) and (d) show the forecasted and actual COVID-19 infected cases for April, May, June and July 2020 respectively. In this figure, the forecasted values are those obtained from the data used for training, and the actual values are the official data of infected cases till the end of July 2020 in India. The graph shows that the forecasted COVID-19 infected cases closely match the available official data.
Fig. 2

Graphical representations of forecasted and actual infected cases of COVID-19 from April to July 2020 in India. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

From Table 2, it can be concluded that the spread of COVID-19 is increasing day by day in India. Now, apply step 7 of algorithm B to calculate the average rate of increment of COVID-19 for July 2020. The average rate of increment for July is 0.03451 and the randomly generated value of h is 0.007. At present, the health ministry of India is under a lot of stress due to the COVID-19 virus, so it is necessary to predict the number of COVID-19 cases in the upcoming days in order to take protective measures for a worst-case situation. The proposed ANHFTS model can also predict the approximate number of upcoming new infected cases. The predicted infected cases for the upcoming month of August 2020 are shown in Table 3. The results of Table 3 show that there could be approximately 3 659 185 COVID-19 infected cases by the end of August 2020. The predicted results of Table 3 are also depicted graphically in Fig. 3.
Table 3

Predicted COVID-19 infected cases and deaths for upcoming month August 2020 in India.

| Date | Predicted COVID-19 infected cases | Predicted COVID-19 deaths |
|---|---|---|
| 31 July | 1 644 409 | 36 259 |
| 1 August | 1 687 561 | 36 939 |
| 2 August | 1 731 761 | 37 631 |
| 3 August | 1 777 008 | 38 334 |
| 4 August | 1 823 443 | 39 052 |
| 5 August | 1 871 093 | 39 782 |
| 6 August | 1 919 990 | 40 526 |
| 7 August | 1 970 164 | 41 284 |
| 8 August | 2 021 649 | 42 056 |
| 9 August | 2 074 479 | 42 843 |
| 10 August | 2 128 691 | 43 645 |
| 11 August | 2 184 319 | 44 461 |
| 12 August | 2 241 400 | 45 293 |
| 13 August | 2 299 973 | 46 140 |
| 14 August | 2 360 077 | 47 003 |
| 15 August | 2 421 752 | 47 882 |
| 16 August | 2 485 038 | 48 778 |
| 17 August | 2 549 978 | 49 690 |
| 18 August | 2 616 616 | 50 620 |
| 19 August | 2 684 994 | 51 567 |
| 20 August | 2 755 160 | 52 531 |
| 21 August | 2 827 159 | 53 514 |
| 22 August | 2 901 039 | 54 515 |
| 23 August | 2 976 850 | 55 535 |
| 24 August | 3 054 643 | 56 574 |
| 25 August | 3 134 468 | 57 632 |
| 26 August | 3 216 379 | 58 710 |
| 27 August | 3 300 431 | 59 808 |
| 28 August | 3 386 680 | 60 927 |
| 29 August | 3 475 182 | 62 067 |
| 30 August | 3 565 997 | 63 228 |
| 31 August | 3 659 185 | 64 410 |
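A quick way to read Table 3 is through the average daily growth rate that its end points imply. The sketch below does not implement Eq. (12), which is not reproduced in this copy; it only summarises the tabulated predictions under the assumption of a constant compound daily rate between 31 July and 31 August (31 daily steps). The function names are hypothetical.

```python
import math

def implied_daily_growth(first, last, steps):
    """Constant per-step growth rate that carries `first` to `last` in `steps` steps."""
    return (last / first) ** (1.0 / steps) - 1.0

def doubling_time(rate):
    """Number of steps needed to double at a constant per-step growth rate."""
    return math.log(2.0) / math.log(1.0 + rate)

# End points of Table 3: 31 July and 31 August 2020.
case_rate = implied_daily_growth(1_644_409, 3_659_185, 31)
death_rate = implied_daily_growth(36_259, 64_410, 31)
case_doubling_days = doubling_time(case_rate)
```

Under this reading, the predicted infected cases grow by roughly 2.6% per day and the predicted deaths by roughly 1.9% per day over August 2020.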
Fig. 3

Graphical representations of predicted COVID-19 infected cases in August 2020 in India.

In this example, the approximate number of deaths due to COVID-19 is predicted for August 2020. The available data of deaths due to COVID-19 for the period April–July 2020 have been collected from the Government authorized portal [56]. Before predicting the COVID-19 deaths in India for August 2020, the proposed model is tested against the official data of deaths from COVID-19 in India. The proposed ANHFTS model is applied to the official data month-wise. The universe of discourse for April 2020 is constructed in the same way as in the previous example. According to step 1 of algorithm A, the estimated number of clusters for April 2020 is 10. After applying the remaining steps of the proposed ANHFTS model, the forecasted COVID-19 deaths for April 2020 with their linguistic variables are shown in Table 4. The universes of discourse for May, June and July 2020 are constructed similarly. By step 1 of algorithm A, the estimated number of clusters for these months is 10, 8 and 10 respectively. The forecasted values of COVID-19 deaths in India with their linguistic variables for May, June and July 2020 are shown in Table 4.
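Table 4

Forecasted COVID-19 deaths for the month of April, May, June and July 2020 in India.

| Date | April 2020 linguistic variable | April 2020 forecasted deaths | May 2020 linguistic variable | May 2020 forecasted deaths | June 2020 linguistic variable | June 2020 forecasted deaths | July 2020 linguistic variable | July 2020 forecasted deaths |
|---|---|---|---|---|---|---|---|---|
| 1 | I1 | 53 | I1 | 1257 | I1 | 5594 | I1 | 17 874 |
| 2 | I2 | 63 | I2 | 1347 | I2 | 5774 | I2 | 18 210 |
| 3 | I3 | 82 | I3 | 1477 | I3 | 6051 | I3 | 18 720 |
| 4 | I4 | 101 | I4 | 1588 | I4 | 6327 | I4 | 19 232 |
| 5 | I5 | 120 | I5 | 1693 | I5 | 6603 | I5 | 19 744 |
| 6 | I6 | 141 | I6 | 1802 | I6 | 6880 | I6 | 20 260 |
| 7 | I7 | 162 | I7 | 1914 | I7 | 7156 | I7 | 20 781 |
| 8 | I8 | 189 | I8 | 2026 | I8 | 7447 | I8 | 21 309 |
| 9 | I9 | 231 | I9 | 2131 | I9 | 7782 | I9 | 21 825 |
| 10 | I9 | 231 | I10 | 2229 | I10 | 8166 | I10 | 22 298 |
| 11 | I10 | 275 | I11 | 2325 | I11 | 8581 | I11 | 22 725 |
| 12 | I11 | 314 | I12 | 2421 | I12 | 9041 | I12 | 23 137 |
| 13 | I12 | 357 | I13 | 2521 | I12 | 9041 | I13 | 23 605 |
| 14 | I13 | 395 | I14 | 2626 | I13 | 9550 | I14 | 24 244 |
| 15 | I14 | 424 | I15 | 2731 | I14 | 10 299 | I15 | 25 029 |
| 16 | I15 | 450 | I16 | 2842 | I15 | 11 358 | I16 | 25 776 |
| 17 | I16 | 477 | I17 | 2968 | I16 | 12 204 | I17 | 26 417 |
| 18 | I17 | 510 | I18 | 3111 | I17 | 12 734 | I18 | 27 034 |
| 19 | I18 | 548 | I19 | 3260 | I17 | 12 734 | I19 | 27 685 |
| 20 | I19 | 589 | I20 | 3412 | I18 | 13 247 | I20 | 28 371 |
| 21 | I20 | 630 | I21 | 3567 | I19 | 13 741 | I21 | 29 078 |
| 22 | I21 | 672 | I22 | 3724 | I20 | 14 218 | I22 | 29 811 |
| 23 | I22 | 717 | I23 | 3885 | I21 | 14 658 | I23 | 30 570 |
| 24 | I23 | 771 | I24 | 4051 | I22 | 15 061 | I24 | 31 335 |
| 25 | I24 | 832 | I25 | 4222 | I23 | 15 450 | I25 | 32 091 |
| 26 | I25 | 891 | I26 | 4402 | I24 | 15 839 | I26 | 32 838 |
| 27 | I26 | 943 | I27 | 4593 | I25 | 16 228 | I27 | 33 583 |
| 28 | I27 | 998 | I28 | 4790 | I26 | 16 617 | I28 | 34 332 |
| 29 | I28 | 1064 | I29 | 4994 | I27 | 17 005 | I29 | 35 085 |
| 30 | I29 | 1117 | I30 | 5205 | I28 | 17 266 | I30 | 35 593 |
| 31 | | | I31 | 5350 | | | | |
| MSE | | 126.2271 | | 1203.2048 | | 27 223.6109 | | 17 023.0030 |
| RMSE | | 11.2351 | | 34.6872 | | 164.9958 | | 130.4722 |
| AFER (%) | | 2.2989 | | 1.0395 | | 1.0560 | | 0.4196 |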
Table 4 shows that the AFER values for April, May, June and July 2020 (2.2989%, 1.0395%, 1.0560% and 0.4196% respectively) are all close to 0. Therefore, the forecasted numbers of deaths due to COVID-19 obtained by the proposed model differ only slightly from the actual values. The forecasted and actual deaths in India versus date for April, May, June and July 2020 are also depicted in Fig. 4(a), (b), (c) and (d) respectively. These graphs show that the forecasted COVID-19 deaths obtained by the proposed model closely match the available official data.
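The three performance measures can be sketched in a few lines of Python. The paper defines them in an earlier section; this sketch assumes the forms commonly used in the fuzzy time series literature (AFER as the mean of |actual − forecast| / actual, in percent), and the function names are illustrative only. The last lines check that each RMSE reported in Table 4 is the square root of the corresponding MSE.

```python
import math

def mse(actual, forecast):
    """Mean square error (assumed standard definition)."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Root mean square error."""
    return math.sqrt(mse(actual, forecast))

def afer(actual, forecast):
    """Average forecasting error rate in percent (assumed definition)."""
    return 100 * sum(abs(a - f) / a for a, f in zip(actual, forecast)) / len(actual)

# (MSE, RMSE) pairs copied from Table 4.
table4 = {
    "April": (126.2271, 11.2351),
    "May": (1203.2048, 34.6872),
    "June": (27223.6109, 164.9958),
    "July": (17023.0030, 130.4722),
}

# Gap between sqrt(MSE) and the reported RMSE for each month.
rmse_gaps = {month: abs(math.sqrt(m) - r) for month, (m, r) in table4.items()}
print(rmse_gaps)
```

All four gaps are below the rounding precision of the table, so the reported MSE and RMSE values are mutually consistent.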
Fig. 4

Graphical representations of forecasted and actual deaths due to COVID-19 from April to July 2020 in India.

Using the forecasted COVID-19 deaths from Table 4, the average rate of increment in COVID-19 deaths for July is 0.02441 and the randomly generated value of is 0.005. Predicting COVID-19 deaths is also necessary for taking protective measures against the disease. The predicted deaths for August 2020 are shown in Table 3 and represented in Fig. 5. Table 3 shows that there could be approximately 64 410 deaths due to COVID-19 by the end of August 2020. These predicted values may differ from the official data obtained at the end of August 2020 because of growing public awareness and the daily upgrading of health infrastructure against the novel coronavirus.
Fig. 5

Graphical representations of predicted COVID-19 deaths in August 2020 in India.

In Examples 6.2.1 and 6.2.2, errors are calculated month-wise. To check the accuracy of the COVID-19 infected cases and deaths forecasted for India by the proposed ANHFTS model, the error percentage is also calculated at intervals of 5 days for July 2020. The forecasted values of COVID-19 infected cases and deaths obtained by the proposed model, the official data and the calculated error percentages are shown in Table 5. The table shows that the errors between the forecasted and official data are minor for both infected cases and deaths. Hence, the proposed model can be applied to predict the approximate infected cases and deaths of COVID-19 for the upcoming month of August 2020 with small errors.
Table 5

Comparison of official data and forecasted data of infected cases as well as deaths for July 2020 in India.

| Date | Official data | Forecasted value | Error percentage |
|---|---|---|---|
| Infected cases | | | |
| 05-07-2020 | 697 846 | 696 495 | 0.1936 |
| 10-07-2020 | 822 604 | 817 035 | 0.6770 |
| 15-07-2020 | 970 169 | 968 566 | 0.1652 |
| 20-07-2020 | 1 154 913 | 1 143 113 | 1.0217 |
| 25-07-2020 | 1 387 087 | 1 395 061 | −0.5749 |
| 30-07-2020 | 1 612 354 | 1 602 320 | 0.6223 |
| Deaths | | | |
| 05-07-2020 | 19 701 | 19 744 | −0.2183 |
| 10-07-2020 | 22 144 | 22 298 | −0.6954 |
| 15-07-2020 | 24 929 | 25 029 | −0.4011 |
| 20-07-2020 | 28 099 | 28 371 | −0.9680 |
| 25-07-2020 | 32 121 | 32 091 | 0.0934 |
| 30-07-2020 | 35 769 | 35 593 | 0.4920 |
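The error percentages in Table 5 can be reproduced with a short Python sketch. It assumes the signed form 100 × (official − forecast) / official, which matches the sign and magnitude of every tabulated value; the variable and function names are illustrative only.

```python
def error_percentage(official, forecast):
    """Signed error in percent, relative to the official value (assumed form)."""
    return 100 * (official - forecast) / official

dates = ["05-07", "10-07", "15-07", "20-07", "25-07", "30-07"]

# Official and forecasted values copied from Table 5.
official_cases = [697846, 822604, 970169, 1154913, 1387087, 1612354]
forecast_cases = [696495, 817035, 968566, 1143113, 1395061, 1602320]
official_deaths = [19701, 22144, 24929, 28099, 32121, 35769]
forecast_deaths = [19744, 22298, 25029, 28371, 32091, 35593]

case_errors = [error_percentage(o, f) for o, f in zip(official_cases, forecast_cases)]
death_errors = [error_percentage(o, f) for o, f in zip(official_deaths, forecast_deaths)]

for d, ce, de in zip(dates, case_errors, death_errors):
    print(f"{d}-2020  cases {ce:8.4f}%  deaths {de:8.4f}%")
```

A positive value means the model under-forecast the official figure; a negative value means it over-forecast.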

Estimation of the approximate number of isolation beds and ICUs

The recovery rate in India from COVID-19 was 64% [57] of the total infected cases on 30 July 2020. Assuming that this recovery rate remains constant until 31 August 2020, the proposed ANHFTS model gives approximately 2 341 879 recovered people by 31 August 2020. The active cases may require hospitalization, quarantine and, in emergencies, ICUs. The number of active cases is calculated by Eq. (17), which gives approximately 1 252 896 active cases at the end of August 2020. According to a recent report of the Ministry of Health and Family Welfare (MoHFW), a total of 944 170 isolation beds, 31 258 ICUs and 114 638 oxygen-supported beds are available to fight COVID-19, across 930 COVID hospitals, 2362 COVID health centres, 10 341 quarantine centres and 7195 COVID centres [58]. So far, India has prevented many infections and deaths from COVID-19 through public awareness and improved health infrastructure. However, COVID-19 infected cases continue to grow. Therefore, it is essential to increase the number of isolation beds, ICUs and ventilators for the fight against the COVID-19 pandemic. It is assumed that only 2% to 5% [59], [60] of the active cases are critical and require ventilators. So, the estimated number of ventilators required for the treatment of COVID-19 infection will be approximately 25 058 to 62 645, and the remaining active cases, approximately 1 190 252 to 1 227 838, should be hospitalized or quarantined. Therefore, India will require approximately 12.5 lakh beds for infected persons and 65 thousand ICUs for critically infected persons at the end of August 2020. The Indian government should impose a strict lockdown to break the chain of new cases. Also, the Indian people should follow the guidelines of the Health Ministry and adopt protective measures such as hand washing, wearing a mask, using sanitizer and following social distancing.
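The arithmetic behind these estimates can be retraced with a minimal Python sketch. Eq. (17) is not reproduced in this section, so the sketch assumes it has the form A_t = C_t − R_t − D_t (active = confirmed − recovered − deaths), which reproduces the figures quoted above. Rounding the recovered count up to a whole person is also an assumption, and all variable names are illustrative.

```python
import math

# Predicted cumulative totals for 31 August 2020 (Table 3).
confirmed = 3_659_185
deaths = 64_410

# Assumption: the 64% recovery rate of 30 July 2020 stays constant.
recovery_rate = 0.64
recovered = math.ceil(recovery_rate * confirmed)  # rounded up to a whole person

# Assumed form of Eq. (17): A_t = C_t - R_t - D_t.
active = confirmed - recovered - deaths

# 2% to 5% of active cases are assumed critical and in need of a ventilator.
ventilators_low = round(0.02 * active)
ventilators_high = round(0.05 * active)

print(f"recovered   ~ {recovered:,}")
print(f"active      ~ {active:,}")
print(f"ventilators ~ {ventilators_low:,} to {ventilators_high:,}")
```

The active-case total of about 12.5 lakh (1.25 million) and the upper ventilator bound of about 63 thousand are the basis of the bed and ICU requirements stated in the text.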

Analysis of variance

The results obtained by the proposed ANHFTS model and the official data of COVID-19 infected cases and deaths in India for July 2020 were compared by one-way ANOVA, carried out in MINITAB 19. The results are shown in Table 6. For infected cases, the ANOVA between the ANHFTS model and the official data gives an F-value of 0.0003 and a P-value of 0.9855, which is far greater than the significance level of 0.05. Similarly, the P-value for the number of COVID-19 deaths in India for July 2020 (0.9865) is far greater than 0.05. Hence, the mean values of the proposed ANHFTS model and the official data do not differ significantly at the 95% confidence level for infected cases or deaths in India, which supports the accuracy of the proposed model.
Table 6

ANOVA analysis of COVID-19 infected cases and deaths in July 2020.

| Source | DF | Adj. SS | Adj. MS | F-value | P-value |
|---|---|---|---|---|---|
| ANHFTS model versus official data for infected cases in India | | | | | |
| Between-group | 1 | 4.17E+07 | 4.17E+07 | 0.0003 | 0.9855 |
| Within-group | 10 | 1.20E+12 | 1.20E+11 | | |
| Total | 11 | 1.20E+12 | | | |
| ANHFTS model versus official data for deaths in India | | | | | |
| Between-group | 1 | 1.10E+04 | 1.10E+04 | 0.0003 | 0.9865 |
| Within-group | 10 | 3.65E+08 | 3.65E+07 | | |
| Total | 11 | 3.65E+08 | | | |
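The degrees of freedom in Table 6 (1 between groups, 10 within groups) correspond to two groups of six observations, which suggests that the ANOVA was run on the six dates of Table 5. Under that assumption, the sums of squares and F-values can be recomputed with a plain-Python sketch of one-way ANOVA. The function name is illustrative, and the P-value is not computed because it requires the F distribution.

```python
def one_way_anova(*groups):
    """Return DF, sums of squares and F for a one-way ANOVA."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return {"df_between": df_between, "df_within": df_within,
            "ss_between": ss_between, "ss_within": ss_within, "f": f_value}

# Official and forecasted values for July 2020, copied from Table 5.
official_cases = [697846, 822604, 970169, 1154913, 1387087, 1612354]
forecast_cases = [696495, 817035, 968566, 1143113, 1395061, 1602320]
official_deaths = [19701, 22144, 24929, 28099, 32121, 35769]
forecast_deaths = [19744, 22298, 25029, 28371, 32091, 35593]

cases = one_way_anova(forecast_cases, official_cases)
deaths = one_way_anova(forecast_deaths, official_deaths)
print(cases)
print(deaths)
```

The recomputed sums of squares agree with Table 6 to the three significant figures shown there, and both F-values round to 0.0003.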

Conclusion

The daily growth in COVID-19 cases has put the whole of humanity at high risk. It is therefore necessary to control the outbreak and to forecast the infected cases and deaths in the coming days so that the necessary plans can be executed. This study presents a hybrid predictive FTS model based on the MFCM clustering technique. The main purpose of this article is to develop an effective model for estimating the number of COVID-19 infected cases and deaths in India for the next 31 days. The proposed ANHFTS model is tested against the available COVID-19 data for India, with its performance measured by MSE, RMSE and AFER. Tables 2 and 4 show that the proposed ANHFTS model is capable of predicting infections and deaths. According to Table 3, the numbers of infected cases and deaths at the end of August 2020 in India will be approximately 3 659 185 and 64 410 respectively. The proposed model also estimates the isolation beds and ICUs required to deal with COVID-19 in the coming days: approximately 12.5 lakh beds for infected persons and 65 thousand ICUs for critically infected persons at the end of August 2020 in India. Thus, the proposed model could be important for government and health-care decision-makers preparing protection plans during this pandemic. The Indian government has substantially controlled COVID-19, but it must plan a strict strategy against the growth of cases and significantly reduce the spread of the virus; otherwise a large part of India's population will be affected. If the spread of COVID-19 is not broken, the figure of 3 659 185 may grow much larger, even up to a crore, in the coming months.
With the developed ANHFTS model we may be able to calculate important parameters such as infection rates and death rates, which will give a more accurate grasp of the transmission trend of a COVID-19-type disease should one occur in future. In recent years, several membership functions have been developed. Each membership function has advantages as well as disadvantages. It is impossible to develop a general framework for a membership function because each model has its own limitations and characteristics. In the context of each application, some membership functions have proved more appropriate than others. However, the choice of a general membership function is still a subject of research. In a future study, we will extend the present work and examine the impact of different membership functions on the predictive results.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Nomenclature

$v_i$: $i$th data point
$\mu_{ki}$: Membership value of $i$th data point in $k$th cluster
$C_k$: $k$th cluster centroid
$c$: Number of clusters, $2 \le c \le n-1$
$m$: Fuzzy index
$J(v_i, C_k)$: Objective function
$F_t$: Forecasted value at any time $t$
$I_i$: $i$th linguistic variable
$AV_i$: $i$th actual value
$r$: Rate of increment or decrement
$C_t$: Number of confirmed cases at any time $t$
$A_t$: Number of active cases at any time $t$
$R_t$: Number of recovered cases at any time $t$
$D_t$: Number of deaths at any time $t$
MFCM: Modified fuzzy C-means
ANHFTS: A novel hybrid fuzzy time series

Algorithm A: To make basic intervals by using the MFCM clustering technique
Step 1: Partition the universe of discourse $X$ into several equal-length intervals to obtain the number of clusters $c$.
Step 2: Input $m$ and the fuzzy stopping criterion $\epsilon$; here $\epsilon = 0.0001$ and $m = 2$.
Step 3: Randomly initialize the membership value for each historical datum such that $\sum_{k=1}^{c} \mu_{ki}^{t} = 1$, where $t$ is the number of iterations.
Step 4: Calculate the values of $a$ and $u_i$ by the formulas $a = \left|\sum_{i=1}^{n} v_i / n\right|$ and $u_i = \left|\sum_{j=1}^{d} m_{ij}\right|$.
Step 5: Calculate the cluster centroid by .
Step 6: Update the membership value by .
Step 7: If $\left|\mu_{ki}^{t+1} - \mu_{ki}^{t}\right| < \epsilon$, then go to the next step; otherwise go to Step 5.
Step 8: Calculate the basic intervals from the centroids by the following steps:
Step 8.1: $UB_k = \frac{C_k + C_{k+1}}{2}$; $LB_{k+1} = UB_k$; $k = 1, 2, \ldots, c-1$, where $UB$ and $LB$ are the upper bound and lower bound respectively.
Step 8.2: $LB_1 = 2C_1 - LB_2$, $UB_c = 2C_c - LB_c$.
Step 9: End.
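Step 8 of Algorithm A can be illustrated with a small Python sketch that turns cluster centroids into basic intervals. The centroid values below are hypothetical, the function name is illustrative, and sorting the centroids in ascending order is an assumption (the step implicitly treats $C_1, \ldots, C_c$ as ordered). The MFCM centroid and membership updates of Steps 5 and 6 are not covered here.

```python
def basic_intervals(centroids):
    """Build (lower, upper) bounds per Step 8 of Algorithm A."""
    c = sorted(centroids)  # assumption: centroids are used in ascending order
    if len(c) < 2:
        raise ValueError("at least two centroids are required")
    # Step 8.1: UB_k is the midpoint of consecutive centroids, LB_{k+1} = UB_k.
    upper = [(c[k] + c[k + 1]) / 2 for k in range(len(c) - 1)]
    lower = [None] + upper[:]
    # Step 8.2: mirror the inner bound about the first and last centroid.
    lower[0] = 2 * c[0] - lower[1]
    upper.append(2 * c[-1] - lower[-1])
    return list(zip(lower, upper))

# Hypothetical centroids, for illustration only.
intervals = basic_intervals([10, 20, 40])
print(intervals)
```

Each interval is centred on its own centroid at the two ends of the range, and neighbouring intervals share a boundary at the midpoint between their centroids.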

Algorithm B: To make forecasts by using FTS
Step 1: The basic intervals obtained by Algorithm A are partitioned into sub-intervals according to the number of elements $y_i$ that belong to them.
Step 2: Select the sub-intervals $b_i$ to which historical data belong and calculate the mid-point $m_i$ of each sub-interval.
Step 3: Linguistic variables are defined for each sub-interval selected in Step 2 as $I_i = \sum_{j=1}^{n} \frac{b_i}{\alpha_{ij}}$; $i \in N$, where $\alpha_{ij} = 1$ if $i = j$; $0.5$ if $j = i-1$ or $j = i+1$; $0$ otherwise.
Step 4: Allocate a linguistic variable to each historical datum according to the sub-interval to which it belongs.
Step 5: Create first-order FLRs and FLRGs from Step 4.
Step 6: Defuzzify the historical data, i.e., calculate the forecasted value by using Eq. (11), if the FLR is non-empty.
Step 7: Calculate the average rate of increment or decrement $r$ of the defuzzified values obtained in the previous step; $0 \le r \le 1$.
Step 8: Determine the predicted value by Eq. (12), if the FLR is empty.
Step 9: End.
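Steps 3 and 5 of Algorithm B can be sketched in Python as follows. The membership grade follows the piecewise rule of Step 3. The construction of first-order fuzzy logical relationships (FLRs) and their groups (FLRGs) follows the usual Chen-style convention of pairing consecutive labels and dropping repeated right-hand sides, which is an assumption about the paper's exact procedure. The function names are illustrative, and the label sequence is the one listed for 8 to 12 April in Table 4.

```python
from collections import defaultdict

def membership(i, j):
    """alpha_ij of Step 3: 1 on the diagonal, 0.5 for neighbours, 0 otherwise."""
    if i == j:
        return 1.0
    if abs(i - j) == 1:
        return 0.5
    return 0.0

def first_order_flrs(labels):
    """Pair each label with its successor: I(t-1) -> I(t)."""
    return list(zip(labels, labels[1:]))

def group_flrs(flrs):
    """Group FLRs by left-hand side, keeping each right-hand side once."""
    groups = defaultdict(list)
    for lhs, rhs in flrs:
        if rhs not in groups[lhs]:
            groups[lhs].append(rhs)
    return dict(groups)

labels = ["I8", "I9", "I9", "I10", "I11"]  # 8 to 12 April, Table 4
flrs = first_order_flrs(labels)
flrgs = group_flrs(flrs)
print(flrs)
print(flrgs)
```

Defuzzification (Step 6) and prediction (Step 8) depend on Eqs. (11) and (12), which are not part of this section, so they are left out of the sketch.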
  29 in total

1.  Temperature prediction using fuzzy time series.

Authors:  S M Chen; J R Hwang
Journal:  IEEE Trans Syst Man Cybern B Cybern       Date:  2000

2.  A SIR model assumption for the spread of COVID-19 in different communities.

Authors:  Ian Cooper; Argha Mondal; Chris G Antonopoulos
Journal:  Chaos Solitons Fractals       Date:  2020-06-28       Impact factor: 9.922

3.  A vulnerability index for the management of and response to the COVID-19 epidemic in India: an ecological study.

Authors:  Rajib Acharya; Akash Porwal
Journal:  Lancet Glob Health       Date:  2020-07-16       Impact factor: 26.763

4.  Prudent public health intervention strategies to control the coronavirus disease 2019 transmission in India: A mathematical model-based approach.

Authors:  Sandip Mandal; Tarun Bhatnagar; Nimalan Arinaminpathy; Anup Agarwal; Amartya Chowdhury; Manoj Murhekar; Raman R Gangakhedkar; Swarup Sarkar
Journal:  Indian J Med Res       Date:  2020 Feb & Mar       Impact factor: 2.375

5.  A data driven epidemic model to analyse the lockdown effect and predict the course of COVID-19 progress in India.

Authors:  Bijay Kumar Sahoo; Balvinder Kaur Sapra
Journal:  Chaos Solitons Fractals       Date:  2020-06-20       Impact factor: 9.922

6.  Mathematical modeling of COVID-19 fatality trends: Death kinetics law versus infection-to-death delay rule.

Authors:  Stefan Scheiner; Niketa Ukaj; Christian Hellmich
Journal:  Chaos Solitons Fractals       Date:  2020-05-30       Impact factor: 5.944

7.  Optimization Method for Forecasting Confirmed Cases of COVID-19 in China.

Authors:  Mohammed A A Al-Qaness; Ahmed A Ewees; Hong Fan; Mohamed Abd El Aziz
Journal:  J Clin Med       Date:  2020-03-02       Impact factor: 4.241

8.  Prediction of the Epidemic Peak of Coronavirus Disease in Japan, 2020.

Authors:  Toshikazu Kuniya
Journal:  J Clin Med       Date:  2020-03-13       Impact factor: 4.241

9.  Prediction and analysis of COVID-19 positive cases using deep learning models: A descriptive case study of India.

Authors:  Parul Arora; Himanshu Kumar; Bijaya Ketan Panigrahi
Journal:  Chaos Solitons Fractals       Date:  2020-06-17       Impact factor: 9.922

  3 in total

1.  A novel explainable COVID-19 diagnosis method by integration of feature selection with random forest.

Authors:  Mehrdad Rostami; Mourad Oussalah
Journal:  Inform Med Unlocked       Date:  2022-04-06

2.  Improved seagull optimization algorithm of partition and XGBoost of prediction for fuzzy time series forecasting of COVID-19 daily confirmed.

Authors:  Sidong Xian; Kaiyuan Chen; Yue Cheng
Journal:  Adv Eng Softw       Date:  2022-08-01       Impact factor: 4.255

3.  Knowledge-based and data-driven underground pressure forecasting based on graph structure learning.

Authors:  Yue Wang; Mingsheng Liu; Yongjian Huang; Haifeng Zhou; Xianhui Wang; Senzhang Wang; Haohua Du
Journal:  Int J Mach Learn Cybern       Date:  2022-10-02       Impact factor: 4.377
