
Isfahan and Covid-19: Deep spatiotemporal representation.

Rahele Kafieh¹, Narges Saeedizadeh¹, Roya Arian¹, Zahra Amini¹, Nasim Dadashi Serej¹, Atefeh Vaezi², Shaghayegh Haghjooy Javanmard³.

Abstract

COVID-19 is affecting 213 countries and territories around the world. Iran was one of the first countries affected by this virus. Isfahan, the third most populated province of Iran, experienced a noticeable epidemic. Predicting the epidemic size, peak value, and peak time can help policymakers make correct decisions. In this study, deep learning is selected as a powerful tool for forecasting this epidemic in Isfahan. A combination of effective Social Determinants of Health (SDHs) and COVID-19 occurrence data is used as spatiotemporal input, built from time-series information from different locations. Different models are evaluated, and the best performance is obtained with a tailored type of long short-term memory (LSTM) network. This new method incorporates the mutual effect of all classes (confirmed/death/recovered) in the prediction process. The future trajectory of the outbreak in Isfahan is forecast with the proposed model. The paper demonstrates the positive effect of adding SDHs in pandemic prediction. Furthermore, the effectiveness of different SDHs is discussed, and the most effective ones are introduced. The method shows high ability in both short- and long-term forecasting of the outbreak. The model shows that in predicting one class (like the number of confirmed cases), the effect of the other accompanying numbers (like death and recovered cases) cannot be ignored. In conclusion, the advantages of this model (particularly its long-term prediction ability) make it a reliable tool for helping health decision-makers.


Keywords: COVID-19; Deep learning; Isfahan; Prediction

Year: 2020 PMID: 33041534 PMCID: PMC7534756 DOI: 10.1016/j.chaos.2020.110339

Source DB: PubMed Journal: Chaos Solitons Fractals ISSN: 0960-0779 Impact factor: 5.944

Introduction

The beginning of 2020 was also the beginning of the COVID-19 outbreak, which was declared a pandemic on March 11th [1]. Iran was one of the first countries affected by this virus. The outbreak in Iran started in Qom and spread to Tehran, Isfahan, Guilan, and then to other provinces [2]. Isfahan, the third most populated province of Iran, is 270 km south of Qom and experienced a noticeable epidemic [3]. Four main hospitals were allocated to COVID-19 patients, and public health measures were implemented in alignment with national policies. The first case in Isfahan was hospitalized on Feb. 18th, and today, after around three months, the number of confirmed cases is still increasing. Fig. 1 demonstrates the spread of disease in Isfahan Province.

Fig. 1

Spread of confirmed cases in Isfahan.

Due to the high false-negative ratio of the reverse transcription-polymerase chain reaction (RT-PCR) test and the limited number of available tests (which led to a single test per patient), patients with a negative test cannot reliably be excluded. Accordingly, suspicious cases (with negative results) are also counted as confirmed cases in this study. Another issue is the change in the total size of the tested population over time. In the first phase of the pandemic in Iran, the government could only manage to perform the PCR test on hospitalized cases; however, in the second phase, after the most difficult period (around 31st March), more cases, including both hospitalized patients and outpatients, were gradually added to the tested community.

The prediction of epidemic size, curve, and peak time using mathematical models can help policymakers make evidence-informed decisions. To obtain a better prediction, based on the available data from all countries, we selected some influential Social Determinants of Health (SDHs) to be considered in our model. These factors, which are always of concern when talking about health, could be determinants of the outbreak's behavior. Examples are the Gross Domestic Product (GDP), fertility (or the growth rate), the country's share of the world population (World Share), and the date of school closure [4], [5], [6].

Among different prediction methods for pandemics, deep learning has proved to be a successful and powerful tool. Pandemics like COVID-19 cannot necessarily be treated as linear systems, but conventional statistical models like Auto-Regressive Moving Average (ARMA), Moving Average (MA), and Auto-Regressive (AR) methods [7, 8] mostly rely on linear assumptions. In such a pandemic, however, transmission rates can change dynamically due to environmental factors like the behavior of people, country actions, etc. In contrast, the ability of deep learning methods to deal with such nonlinearities motivates researchers to use them in forecasting time series. Furthermore, deep learning methods are known to have the capacity for automatic feature extraction, which eliminates the need for manual detection and extraction of useful characteristics.

In this paper, the proposed method is based on a novel deep learning method combined with effective SDHs and the original time series of the numbers of confirmed, death, and recovered cases. A limited number of works have already used this type of model for forecasting COVID-19 [9], [10], [11], [12], [13], [14], [15], [16], [17]. The rest of the paper is organized as follows. In Section 2, the dataset and utilized models are fully described. In Section 3, optimal model selection, the effectiveness of SDHs in prediction, and the future trajectory of the outbreak in Isfahan are elaborated. Finally, the results are discussed and concluded in the last section.

Material and methods

Dataset

We call this study a “spatiotemporal analysis” because it uses time-series information from different locations, including selected countries (Global data) and Isfahan province (Local data). The information comprises occurrences of COVID-19 and SARS plus Social Determinants of Health (SDHs) for each location, and is used to forecast the number of COVID-19 cases in Isfahan (summarized in Table 1).

Table 1

Description of data types.

Order	variable	Location	Source	Description
1	Occurrences of COVID-19 Data	Selected countries (Global data)	Johns Hopkins University	Daily number of Confirmed/ Death / Recovered people
1	Occurrences of COVID-19 Data	Isfahan Province (Local data)	Isfahan University of Medical Sciences	Daily number of Confirmed/ Death / Recovered people
2	Social Determinants of Health (SDH)	Selected countries (Global data)	[18], [19], [20], [21], [22], [23], [24]	Population, Yearly Change, Net Change, Density, Land Area (Km²), Migrants (net), Fertility, Age, Urban_percentage, World Share, Quarantine, Schools, Restrictions, Hospital Bed, sex0, sex14, sex25, sex54, sex65plus, Sex Ratio, GDP, Smoking, day_from_jan_first
2	Social Determinants of Health (SDH)	Isfahan Province (Local data)	Isfahan University of Medical Sciences
3	Occurrences of SARS Data	Selected countries (Global data)	[25]	Daily number of Confirmed/ Death / Recovered people

Population (2020), Yearly Change (yearly population change), Net Change (net change of the population), Density (P/Km²), Land Area (Km²), Migrants (net migrants of the countries), Fertility (fertility or the growth rate), Age (median lifespan), Urban percentage (urban population), World Share (the population contributed to the world's share), the dates for Quarantine, Schools, and Restrictions, Hospital Bed (per 1000 people), sex0 (age from 0 to 14 years), sex14 (age from 14 to 54 years), sex54 (age from 54 to 65 years), sex65plus (age above 65 years), Sex Ratio, GDP (Gross Domestic Product), Smoking (percentage of smokers in the population, 2016).

We collected the complete global COVID-19 data from Johns Hopkins University [2, 26], covering all infected countries, and Local data from Isfahan province. The dataset contains the cumulative number of confirmed, death, and recovered COVID-19 cases for different dates. Global data from 22 January to 3 May 2020 and Local data from 18 February to 3 May 2020 were used. We also added SDHs for each location, as described in Table 1. SARS data is also used for pre-training, which provides the initial weights of the network. Even though COVID-19 is more contagious than SARS (COVID-19 has a higher R0 [27, 28]), deep learning methods need a sample dataset to mimic the overall trend. Since only a limited number of countries have experienced the full curve (both rising and falling edges) of the COVID-19 epidemic, SARS data is used to provide this missing information for the model. The occurrence data is divided into training and validation subsets with a ratio of 7:3.

COVID-19 disease prediction model

In this study, we utilized different machine learning methods ranging from classic models to sophisticated deep learning models including Random Forest (RF) [29], Extreme Gradient Boosting (XGBoost) [30], Light Gradient Boosting Machine (LGBM) [31], Multi-Layer Perceptron (MLP) [32], Convolutional Neural Networks (CNN) [33], Long Short-Term Memory (LSTM) [10, 11], Multi-channel LSTM, and Parallel LSTM. Fig. 2 demonstrates the complete block diagram of the proposed model, emphasizing the datasets, data preparation, and comparison of the models [34, 35].

Fig. 2

COVID-19 disease prediction models.

The above-mentioned models are evaluated on Global data using COVID-19 occurrence data (confirmed/death/recovered) in time-series format. We investigated different time intervals, termed the “lag”, that is, the number of days before the prediction date whose occurrence data is fed into the model. To find the optimal lag parameter, the MAPE value is calculated on the validation data for predicting confirmed, death, and recovered cases with lag values ranging from 1 to 20 days. The lag with the lowest MAPE (common to confirmed, death, and recovered cases) is six days, which is selected as the “optimal lag parameter” [18]. Furthermore, the models are also fed with SDHs.

For comparison of the models, the Mean Absolute Percentage Error (MAPE) metric is used to measure the size of the error in percentage terms relative to the actual values. Metrics like Mean Absolute Error (MAE) [36], Mean Square Error (MSE) [12], and Root Mean Square Error (RMSE) [10, 26, 37] are not normalized and accordingly give higher values for countries with larger populations. Therefore, we selected MAPE, which is normalized and more comparable across population sizes. MAPE is calculated as:

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{X_t - \hat{X}_t}{X_t} \right|$$

where $X_t$ is the actual value and $\hat{X}_t$ is the corresponding estimated value for the $t$th sample of all $n$ available samples. The best model based on MAPE on Global data is then selected for training and forecasting on the Local dataset from Isfahan province.
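The argument for preferring MAPE over non-normalized metrics can be checked with a few lines of Python. The sketch below is illustrative only: the function names and the case counts are assumptions made for this example, not the authors' code or data. It scores the same relative errors at two population scales.

```python
def mae(actual, predicted):
    """Mean absolute error, in the same units as the data."""
    return sum(abs(x - x_hat) for x, x_hat in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    total = sum(abs((x - x_hat) / x) for x, x_hat in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Invented counts for a "small" location and the same curve scaled by 10.
small_actual, small_pred = [100, 200, 400], [110, 190, 400]
large_actual = [10 * x for x in small_actual]
large_pred = [10 * x for x in small_pred]

print(mae(small_actual, small_pred), mae(large_actual, large_pred))    # MAE grows with scale
print(mape(small_actual, small_pred), mape(large_actual, large_pred))  # MAPE is unchanged
```

MAE grows tenfold with the population scale while MAPE stays the same, which is the property the comparison between countries relies on.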

An overview of utilized models

Random Forest (RF) [29], or random decision forests, is a supervised learning algorithm based on ensemble learning that is popular in regression and prediction. Ensemble methods make more accurate predictions than individual models because they combine the predictions of multiple machine learning algorithms.

XGBoost [30] is a scalable end-to-end tree boosting system with a novel sparsity-aware algorithm. With insights on cache access patterns and data compression, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

LightGBM [31] has proved to be an effective machine learning method that uses Gradient-based One-Side Sampling (GOSS) to exclude a significant proportion of data instances with small gradients.

Multilayer Perceptron (MLP) [32], often referred to as a deep neural network (DNN), is a feed-forward method consisting of at least three layers of nodes: an input layer, a hidden layer, and an output layer. Except for the input nodes, each node is a neuron that uses a nonlinear activation function. It is trained with a supervised learning technique called backpropagation. We designed a COVID-19 prediction model using a basic MLP alongside its more advanced descendants. The MLP uses variables including bias b, input x, output y, weight w, sum function s, and activation function f. Each neuron in the MLP is fired with the following formula:

$$s = \sum_{i} w_i x_i + b, \qquad y = f(s)$$

Fig. 3 shows the overall structure of a neuron and an MLP model. Each layer of neurons in the MLP is connected to the next layer, justifying the name “Dense layer”, which is more prevalent in the DNN literature. We used the implementation in the Keras package with Python version 3.7.3 [19].
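As a minimal sketch of the neuron described above (not the Keras implementation used in the study), the weighted sum s and the activation f can be written in plain Python. The function names and the example numbers are assumptions chosen for illustration.

```python
def relu(s):
    """Rectified Linear Unit activation."""
    return max(0.0, s)


def neuron(x, w, b, f=relu):
    """Single neuron: weighted sum of the inputs plus bias, passed through f."""
    s = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    return f(s)


def dense_layer(x, weights, biases, f=relu):
    """Fully connected layer: every neuron receives the full input vector x."""
    return [neuron(x, w, b, f) for w, b in zip(weights, biases)]


# Two inputs feeding a layer of two neurons (arbitrary illustrative weights).
x = [1.0, 2.0]
layer_out = dense_layer(x, weights=[[0.5, 1.0], [0.5, -1.0]], biases=[0.5, 0.25])
print(layer_out)
```

The second neuron's weighted sum is negative, so ReLU clips its output to zero, while the first neuron passes its sum through unchanged.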

Fig. 3

The overall structure of a neuron and a MLP model.

Convolutional neural networks (CNNs) are a class of deep neural networks and can be seen as regularized versions of MLPs (fully connected, Dense-layer networks). In an MLP, each neuron is connected to all neurons in the next layer. CNNs instead take advantage of the hierarchical pattern in data and assemble more complex patterns from smaller and simpler ones [33, 38]. In this work, we use a one-dimensional CNN, which is appropriate for analyzing time sequences [19]. The CNN uses variables including bias b, input time series x of length T (T indicates the value of the lag parameter), the result of convolution C, kernel w, and a nonlinear activation function f such as the Rectified Linear Unit (ReLU). A general form of applying the convolution for a centered time sample t is given by:

$$C_t = f\big((w * x)_t + b\big), \qquad t = 1, \dots, T$$

where $(w * x)_t$ denotes the kernel w applied to the window of x centered on t. The univariate output of this step can be considered as another time series C. The kernel w is learned: the CNN updates the values of w during training to reach the best performance. Therefore, only a random initial value is set for w, and the final weights are optimized automatically. Fig. 4 shows the overall structure of a 1D CNN.
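The one-dimensional convolution step can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the study's code: it uses no padding (so the output is shorter than the input), it applies the kernel without flipping it, as deep learning libraries commonly do, and the kernel values are fixed by hand instead of being learned.

```python
def conv1d_relu(x, w, b):
    """Slide kernel w over series x (no padding), add bias b, apply ReLU."""
    k = len(w)
    if k > len(x):
        raise ValueError("kernel is longer than the series")
    out = []
    for start in range(len(x) - k + 1):
        window = x[start:start + k]
        s = sum(w_j * x_j for w_j, x_j in zip(w, window)) + b
        out.append(max(0.0, s))
    return out


# A lag-6 input window and a hand-picked kernel that responds to increases.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
rising = conv1d_relu(x, w=[-1.0, 0.0, 1.0], b=0.0)
falling = conv1d_relu(x, w=[1.0, 0.0, -1.0], b=0.0)
print(rising, falling)
```

On this steadily rising series the first kernel produces a constant positive response, and the reversed kernel is clipped to zero by ReLU.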

Fig. 4

The overall structure of a 1D CNN.

Unlike regression predictive modeling, time series also adds the complexity of a sequence dependence among the input variables [20, 21]. The LSTM model is a subcategory of recurrent neural networks (RNNs), a powerful type of neural network designed to handle sequence dependence. Unlike standard feed-forward neural networks, LSTM has feedback connections, so it can process not only single data points but also entire sequences of data. A common LSTM architecture has the following components: a cell (the memory part) and three gates (input gate, output gate, and forget gate). The LSTM gates use the logistic sigmoid activation function. The weights of an LSTM network consist of the connections into and out of the LSTM gates and are learned during training.

A Vanilla LSTM [22] is an LSTM model with a single hidden layer of LSTM units and an output layer used to make a prediction. A Stacked LSTM model is composed of multiple hidden LSTM layers. Each LSTM layer requires a three-dimensional input, whereas an LSTM layer produces a two-dimensional output by default. To stack layers, the LSTM can be made to output a value for each time step in the input data by setting the return_sequences=True argument on the layer.

In this particular application of LSTM in prediction, we deal with three different occurrences (daily number of confirmed/death/recovered people). Multivariate LSTM models are a natural match for such problems. Multivariate time series data means data where there is more than one observation for each time step. There are two main models for multivariate time series data: Multiple Input Series and Multiple Parallel Series.

Multiple Input Series: We may consider the number of confirmed cases as the output of the network, but for forecasting purposes, two or three parallel input time series (confirmed/death/recovered in previous days) may be of importance. The input time series are parallel because each series has an observation at the same time steps. We used the Stacked LSTM model described above for this network.

Multiple Parallel Series: An alternative time series problem is when there are multiple parallel time series and a value must be predicted for each. Here, we consider the number of occurrences in all classes (confirmed/death/recovered) as input data, and we want to predict the value of each of the three time series for the next time step. This might be referred to as multivariate forecasting. The characteristics of the mentioned models are collected in Table 2.
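The difference between the two multivariate settings lies mainly in how the training samples are shaped. The following sketch shows one way to build samples for the Multiple Parallel Series case; the function name and the toy counts are assumptions for illustration, not the authors' preprocessing code.

```python
def make_parallel_windows(confirmed, death, recovered, lag):
    """Build (input window, next-step target) pairs from three aligned series."""
    if not (len(confirmed) == len(death) == len(recovered)):
        raise ValueError("the three series must be aligned")
    rows = list(zip(confirmed, death, recovered))  # one (c, d, r) triple per day
    inputs, targets = [], []
    for end in range(lag, len(rows)):
        inputs.append(rows[end - lag:end])  # the `lag` previous days, 3 values each
        targets.append(rows[end])           # all three values for the next day
    return inputs, targets


# Invented counts, for illustration only.
inputs, targets = make_parallel_windows(
    confirmed=[1, 2, 3, 4], death=[0, 0, 1, 1], recovered=[0, 1, 1, 2], lag=2
)
print(inputs[0], "->", targets[0])
```

Each input has shape (lag, 3) and each target has three values, matching a final Dense layer with three outputs. For the Multiple Input Series case, the same windows would be kept and only the first element of each target (the confirmed count) would be used.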

Table 2

The best selected architecture for each model.

Model	Layers	Filters/units or hyperparameters	Batch normalization/ dropout	Activation function
RF	—	n_estimators = 300, random_state = 10	—	—
XGBoost	—	n_estimators = 300, random_state = 10, min_child_weight = 1, colsample_bytree = 1	—	—
LGBM	1 LGBM Regressor	n_estimators = 200	—	—
MLP	6	128, 128, 256, 256, 256, 1	—	relu
1D CNN	4 CNN + 1 Flatten + 2 Dense	32, 128, 128, 128, 50, 1	—	relu
Stacked LSTM	2 LSTM + 2 Dense	64, 32, 32, 1	—	relu
Multiple Input Series	2 LSTM + 2 Dense	64, 32, 32, 1	—	relu
Multiple Parallel Series	2 LSTM + 2 Dense	64, 32, 32, 3	—	relu


Results

The models were formed based on occurrences for the daily number of confirmed, death, and recovered COVID-19 in Global and Local data. A lag of six days was applied to each dataset. The networks are pre-trained with occurrences for the daily number of confirmed, death, and recovered SARS disease cases. Each model is tested with many different architectures, and the best performance is achieved with architectures described in Table 2. Each model's performance was evaluated on the test subset of the data to provide a fair comparison based on the MAPE value. Section 3.1 elaborates each model's performance with such different inputs, and the best model is then found to make the forecasting for the next days in Section 3.3. Furthermore, in Section 3.2 effectiveness of SDHs is explored.

Optimal model selection

Eight different models are chosen as candidates for this application (elaborated in Section 2.2). The proposed models are tested using the time-series data types. Different time intervals, termed the “lag”, can be considered before the prediction date to feed occurrence data into the model. For example, with a lag of l = 4, the model predicts the occurrence on 10th March using the input values from the l prior data points (9th, 8th, 7th, and 6th March). A lag of 6 days is selected as the optimal and most effective value for predictions in the next steps. We selected the MAPE metric to compare different model/input combinations. For this evaluation, a set of nine countries is chosen as Global data, covering a variety of epidemic behaviors. China is the main candidate as the starting point of COVID-19. Iran, Italy, Spain, and the USA are selected due to reports of a high number of confirmed cases and deaths. Germany and Switzerland represent a different trend, with a high number of confirmed cases and a controlled number of deaths. Finally, Korea and Japan are included to represent countries with a high degree of control over the epidemic. Table 3 summarizes the performance of the different models on Global data. The columns in Table 3 give the results with and without adding SDH features.
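The lag search described here can be sketched as a small loop that scores each candidate lag with MAPE. This is only an outline under loud assumptions: `window_mean` is a hypothetical stand-in predictor (the study's trained models would take its place), the counts are invented, and scoring every lag on the same target days is a design choice of this sketch, not something the paper specifies.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def window_mean(window):
    """Hypothetical stand-in predictor: the mean of the lag window."""
    return sum(window) / len(window)


def select_lag(series, candidate_lags, predict):
    """Return (best_lag, scores), scoring every lag on the same target days."""
    candidate_lags = list(candidate_lags)
    start = max(candidate_lags)  # first day that every candidate lag can predict
    if start >= len(series):
        raise ValueError("series is too short for the largest lag")
    actual = series[start:]
    scores = {}
    for lag in candidate_lags:
        predicted = [predict(series[day - lag:day]) for day in range(start, len(series))]
        scores[lag] = mape(actual, predicted)
    return min(scores, key=scores.get), scores


# Invented daily counts, for illustration only.
series = [10, 12, 15, 19, 24, 30, 37, 45]
best, scores = select_lag(series, range(1, 4), window_mean)
print(best, scores)
```

With this naive predictor on a rising series, the shortest lag wins; with a trained model the optimum can differ, as the six-day result in the text indicates.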

Table 3

Results of different models for the confirmed group in Global Data.

Model	Adding SDH features, MAPE (%)	Only lag, MAPE (%)
RF	4.78	4.42
XGBoost	5	7.8
LGBM	6.49	9.68
MLP	1.19	1.7
1D CNN	1.27	2.25
LSTM	1.36	2.41
Multiple Input Series	1.23	1.88
Multiple Parallel Series	1.12	1.76

According to the results in Table 3, the Multiple Parallel Series model is the best-performing network in identifying the true magnitude of the pandemic, with a MAPE of 1.12% for Global data. As elaborated in Section 2.3, this network uses lag information from confirmed, death, and recovered cases to predict each next occurrence. This result shows that considering the mutual effect of these three occurrences provides better modeling, and ignoring such dependence leads to lower performance.

Effectiveness of social determinants of health (SDH)

Comparing the columns of Table 3 shows that adding SDHs lowers the MAPE for every model except RF. The input features to the proposed models come from three types, elaborated in Table 1. The data types in the first and third rows of Table 1 are turned into time series, and the values with an optimal lag of 6 samples are fed into the models. One common approach to finding the optimum combination of features is to build a correlation matrix that computes the mutual correlation between the features and the desired output [24] (Fig. 5.a). However, such an approach ignores the ability of the network to make connections between the features. The search for effective features in this work is inspired by Class Activation Mapping (CAM) [23]. This method is most prevalent in image processing and finds the highly effective image regions by masking different regions during the classification process. Removing effective regions of an image lowers the accuracy, whereas removing less effective regions has no substantial effect. With this idea in mind, we removed each feature in turn from the inputs of a preparatory model and calculated the corresponding MAPE value. The features whose removal has the largest effect on the resulting MAPE are selected as the most effective features. Fig. 5.b shows the preparatory model's MAPE values in predicting confirmed COVID-19 cases when each feature listed as an SDH is eliminated. For each region, the four most effective SDHs are extracted with the CAM method and shown in Fig. 5.c. Comparing the results in Fig. 5.a and Fig. 5.c confirms that the effective SDHs across all countries are similar in both methods. They include the following SDHs: density, fertility, age, world share, school closure, GDP, smoking, and day_from_jan_first. As an advantage over the correlation method, the CAM method is capable of extracting effective SDHs for each separate region.
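The feature-removal procedure can be outlined as a leave-one-feature-out loop. The sketch below is a hedged illustration: `error_without` is a hypothetical callback that would retrain and evaluate the preparatory model without one feature, and the error values in `toy_errors` are invented placeholders, not results from the paper.

```python
def rank_by_removal(features, error_without):
    """Rank features by the rise in error observed when each one is removed.

    error_without(name) returns the validation error (e.g. MAPE) of a model
    trained without the feature `name`; error_without(None) is the baseline
    with all features present.
    """
    baseline = error_without(None)
    increase = {name: error_without(name) - baseline for name in features}
    return sorted(features, key=lambda name: increase[name], reverse=True)


# Placeholder errors, invented purely to exercise the function.
# They are NOT results from the paper.
toy_errors = {None: 1.0, "Density": 1.8, "GDP": 1.3, "Hospital Bed": 1.0}
ranking = rank_by_removal(["GDP", "Hospital Bed", "Density"], toy_errors.get)
print(ranking)
```

Features at the top of the ranking are those whose removal hurts the error most. Running the loop separately per region would give a per-region ranking, which the correlation matrix alone does not provide.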

Fig. 5

Effectiveness of SDHs. (a) Heatmap by correlation method. (b) MAPE values by CAM method, (c) Heatmap by CAM method.

Future trajectory of COVID-19 in Isfahan

To forecast the number of confirmed, death, and recovered cases in the Local data (Isfahan, Iran), we ran the model until August 11th, 2020. As shown in Table 3, the best performance is obtained with the Multiple Parallel Series method with added SDHs. The simulation-based estimate of the trajectory of cumulative confirmed, death, and recovered values for local data is illustrated in Fig. 6.a. Furthermore, Fig. 6.b shows the daily number of confirmed cases in Isfahan. The real values are shown in the green curve, and the predicted curve is shown in red (after training the network until April 15th).

Fig. 6

(a) Cumulative forecasting of confirmed, death, and recovered values for local data using the proposed method, (b) daily values of predicted confirmed cases, (c) long term prediction of the model.

One important parameter to be discussed about our local data is the total size of the tested population over time. In the first phase of the pandemic in Iran, the government could only manage to perform the PCR test on hospitalized cases. However, in the second phase, after the most difficult period (around 31st March), more cases, including both hospitalized patients and outpatients, were gradually added to the tested community. Here, we demonstrate our model's power for long-term prediction (even when training is stopped in the first phase of the pandemic in local data). Interestingly, the prediction (dark blue curve) in Fig. 6.c could catch the peak of the real data (green curve) around 1.5 months in advance. This comes from the fact that our training mostly uses Global data, which includes both hospitalized patients and outpatients. Therefore, our prediction on Local data is expected to behave as if PCR tests in Isfahan had covered both hospitalized patients and outpatients from the start (which is evident in the difference between the dark blue and green curves). Accordingly, it could predict the current situation correctly. To clarify this issue, the light blue curve in Fig. 6.c shows the number of confirmed cases after eliminating the outpatients. The light blue curve starts to deviate from the green curve at the start of the second phase (around 31st March, when more cases were gradually added to the tested community). Fig. 6.c carries the message that under- or overestimation of the number of confirmed cases leads to different curves. Therefore, if such incorrect data is used, the forecast may not necessarily match the observed values.

Discussion

While the pandemic is still a global concern, it is important to have an evidence-based picture of the epidemic's behavior to assess when a return to normal life may be possible. Moreover, apart from the known risk factors of COVID-19, some other determinants (SDHs) affect the epidemic and could be managed from a community-based perspective. Here we discuss the results of this study in two categories: first, the effect of SDHs on the epidemic curve and the value of these determinants; second, the behavior of the outbreak in Isfahan concerning the number of new, death, and recovered cases.

Adding SDH features to the model makes the prediction more precise for all models except RF, based on the MAPE values shown in Table 3. The critical question is which determinants are more effective, and the answer can be found in Fig. 5.c: darker squares represent more effective determinants. As seen in Fig. 5, the effective SDHs are similar in most studied countries. As COVID-19 is transmissible via person-to-person contact, it was predictable that determinants like density and school closure date are significant factors in the spread of disease and can affect the prediction. Furthermore, determinants like fertility, as an indicator of population growth, seem to be effective in the pandemic. According to our analysis in Fig. 5, effective SDHs include density, fertility, age, world share, school closure, GDP, smoking, and day_from_jan_first.

Non-pharmaceutical public health measures in Isfahan were implemented along with national outbreak responses and policies. Measures like school closure (from Feb. 26th), closure of religious gatherings (from Feb. 28th), access to PCR tests (for inpatients and outpatients from Feb. 27th and Apr. 8th, respectively), limits on traveling and transportation (from March 26th), and social distancing and staying home (from March 27th) affect the incidence and the peak time. After the outpatient test was launched on Apr. 8th, the number of hospitalized patients decreased, and cases were managed as outpatients with advice to self-isolate. From that date, the availability of tests increased, and all close contacts of cases were treated as suspicious and tested for COVID-19. As shown in Fig. 6.c, the light blue line is the number of new cases. These new cases comprise confirmed cases (with a positive PCR test) and suspicious cases (with a negative RT-PCR test). From March 17th (almost the start of Nowrouz, the Iranian New Year) to Apr. 8th (the launch of the outpatient test), the epidemic curve was flat at about 150–180 new cases every day (almost all of them inpatient cases). Then the epidemic declined slowly, so that the number of new inpatient cases fell below 100 by May 2nd. At the same time, the number of outpatient cases increased very quickly, which may result from easier access to testing and possibly from the easing of restrictions.

As elaborated above, a limited number of deep-learning-based methods have recently been developed for COVID-19 prediction. We provide a detailed comparison of our proposed method with such works. Regarding geographical focus, the proposed research is mainly developed to test performance on Isfahan Province, Iran, but the dataset provided by the Johns Hopkins Center is used for training the algorithm. Many other works, like [9, 14, 15], are also based on the Johns Hopkins dataset or the Oxford University database, while other works focus geographically on limited areas like India [11], Canada with Italy and the USA [12], Denmark, Belgium, Germany, France, the United Kingdom, Finland, Switzerland, and Turkey [13], Iran [17, 18], and China [16]. Another important issue is the number of days used as input and as output of the prediction methods. Our method uses an optimal lag of six days and provides a long-term forecast (extendable to months). For [9, 11, 13, 16, 17], lags of 21, 66, 148, 5, and 35 days and predictions of the next 7, 30, 17, 1, and 30 days are proposed, respectively. Another aspect of the available methods is the use of complementary parameters like SDHs for better prediction. We found effective SDHs including density, fertility, age, world share, school closure, GDP, smoking, and day_from_jan_first. Other works also include features like the closure of the city and travel restrictions [9], lockdown and social distancing [11], and population and population density [14]. Regarding the performance metric, we found MAPE to be more illustrative in performance evaluation. Other works additionally used mean square error (MSE), mean absolute error (MAE), root MSE (RMSE), Normalized RMSE (NRMSE), and Symmetric MAPE (SMAPE) [13, 15, 16].

In conclusion, the advantages of this model (particularly its long-term prediction ability) make it a reliable tool for helping health decision-makers. Our study predicted that the peak of the COVID-19 outbreak in Isfahan province passed around May 2nd, and that if the governmental control measures and the population's compliance with health policies continue, the epidemic curve will end by July 28th.

CRediT authorship contribution statement

Rahele Kafieh: Supervision, Methodology, Writing - review & editing. Narges Saeedizadeh: Software, Visualization. Roya Arian: Software, Visualization. Zahra Amini: Conceptualization, Methodology, Writing - review & editing. Nasim Dadashi Serej: Investigation, Writing - review & editing, Validation. Atefeh Vaezi: Data curation. Shaghayegh Haghjooy Javanmard: Supervision, Data curation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

17 in total

1. Learning to forget: continual prediction with LSTM.

Authors: F A Gers; J Schmidhuber; F Cummins
Journal: Neural Comput Date: 2000-10 Impact factor: 2.026

2. COVID-19 Pandemic and Comparative Health Policy Learning in Iran.

Authors: Azam Raoofi; Amirhossein Takian; Ali Akbari Sari; Alireza Olyaeemanesh; Hajar Haghighi; Mohsen Aarabi
Journal: Arch Iran Med Date: 2020-04-01 Impact factor: 1.354

3. Flu Outbreak Prediction Using Twitter Posts Classification and Linear Regression With Historical Centers for Disease Control and Prevention Reports: Prediction Framework Study.

Authors: Ali Alessa; Miad Faezipour
Journal: JMIR Public Health Surveill Date: 2019-06-25

4. Attention-based recurrent neural network for influenza epidemic prediction.

Authors: Xianglei Zhu; Bofeng Fu; Yaodong Yang; Yu Ma; Jianye Hao; Siqi Chen; Shuang Liu; Tiegang Li; Sen Liu; Weiming Guo; Zhenyu Liao
Journal: BMC Bioinformatics Date: 2019-11-25 Impact factor: 3.169

5. Time series prediction of COVID-19 by mutation rate analysis using recurrent neural network-based LSTM model.

Authors: Refat Khan Pathan; Munmun Biswas; Mayeen Uddin Khandaker
Journal: Chaos Solitons Fractals Date: 2020-06-13 Impact factor: 5.944

6. Statistical Explorations and Univariate Timeseries Analysis on COVID-19 Datasets to Understand the Trend of Disease Spreading and Death.

Authors: Ayan Chatterjee; Martin W Gerdes; Santiago G Martinez
Journal: Sensors (Basel) Date: 2020-05-29 Impact factor: 3.576

7. Predicting COVID-19 Incidence Through Analysis of Google Trends Data in Iran: Data Mining and Deep Learning Pilot Study.

Authors: Seyed Mohammad Ayyoubzadeh; Seyed Mehdi Ayyoubzadeh; Hoda Zahedi; Mahnaz Ahmadi; Sharareh R Niakan Kalhori
Journal: JMIR Public Health Surveill Date: 2020-04-14