
Dual attention-based sequential auto-encoder for Covid-19 outbreak forecasting: A case study in Vietnam.

Phu Pham1, Witold Pedrycz2,3, Bay Vo1.   

Abstract

To prevent Covid-19 outbreaks, many organizations and governments have studied and applied various quarantine and isolation policies, medical treatments, and mass/rapid vaccination strategies for citizens over 18. Valuable lessons have been learned in different countries during this Covid-19 battle. These studies demonstrate the usefulness of prompt action in testing, isolating confirmed infectious cases from the community, and planning/optimizing social resources through data-driven anticipation. Recently, many studies have demonstrated the effectiveness of short- and long-term forecasting of new Covid-19 case counts in the form of time-series data. Such predictions directly support the optimization of available healthcare resources and the imposition of suitable policies for slowing Covid-19 spread, especially in highly populated cities, regions, and nations. Advances in deep neural architectures such as the recurrent neural network (RNN) have brought significant improvements in learning from time-series datasets to produce better predictions. However, most recent RNN-based techniques struggle with chaotic, non-smooth sequential data. The consecutive disturbances and lagged observations in chaotic time series, such as daily confirmed Covid-19 cases, degrade the temporal feature learning of recent RNN-based models. To meet this challenge, in this paper we propose a novel dual attention-based sequential auto-encoding architecture, called DAttAE. The proposed model learns from and predicts new Covid-19 cases in the form of a chaotic, non-smooth time series.
Specifically, the integration of a dual self-attention mechanism into a Bi-LSTM based auto-encoder lets the model focus directly on specific time ranges of a sequence to achieve better predictions. We evaluated the DAttAE model against multiple traditional and state-of-the-art deep learning techniques for time-series prediction on several real-world datasets. Experimental results demonstrate the effectiveness of the proposed attention-based deep neural approach compared with state-of-the-art RNN-based architectures for time series based Covid-19 outbreak prediction.
© 2022 Elsevier Ltd. All rights reserved.


Keywords:  Attention; Auto-encoding; Bi-LSTM; Covid-19; Deep learning

Year:  2022        PMID: 35607612      PMCID: PMC9117090          DOI: 10.1016/j.eswa.2022.117514

Source DB:  PubMed          Journal:  Expert Syst Appl        ISSN: 0957-4174            Impact factor:   8.665


Introduction

At the end of 2019, the rapid worldwide spread of the novel coronavirus pandemic, Covid-19, placed tremendous pressure on multiple aspects of society. The pandemic also posed major challenges for researchers in various disciplines (Khan et al., 2021, Kucharski et al., 2020, Liu et al., 2020, Wu et al., 2020). More recently, the emergence of Covid-19's delta variant (Li, Lou, & Fan, 2021) dramatically increased the number of infected patients requiring intensive medical treatment. These public health disasters have severely burdened hospitals, even in countries with highly developed public healthcare systems and infrastructure (Micah et al., 2021). This variant is considered the most dangerous to date; it triggered a new pandemic wave across the world at unprecedented speed. Case studies from many countries show tremendous growth in healthcare requirements for Covid-19 patients, resulting from the fast person-to-person transmission (Fidan and Yuksel, 2021, Pitchaimani and Devi, 2021) of the delta variant and from the lack of suitable strategies for preventing spread from confirmed cases to healthy individuals. In addition, the shortage of proper data modelling and of short/long-term Covid-19 outbreak forecasting solutions has made it difficult for governments to manage effectively and to plan for social resource optimization (Nascimento et al., 2021). Accurate pandemic forecasting also helps governments impose suitable policies that simultaneously address the expansion of Covid-19 and ensure social and economic stability (Miao, Last, & Litvak, 2022).
Given the severe impact of this pandemic on multiple aspects of society, it is essential to build data analysis systems that capture its temporal spreading patterns. Such systems can explicitly support governments in planning both economic recovery and the return to normal daily life. Precise forecasting of the number of new COVID-19 cases is therefore an important problem. Prediction results can be used to facilitate the planning of available public healthcare resources; in the long run, they support the optimization of social management strategies and policies for preventing Covid-19 spread and treating infected patients. To address the pandemic outbreak prediction problem, researchers have proposed various statistical/mathematical and machine learning (ML) approaches as data-driven predictive solutions (Khan et al., 2021) to combat the rapid escalation of newly confirmed Covid-19 cases.

Statistical and machine learning based approach for Covid-19 outbreak forecasting

Since medical reports of confirmed infectious cases are collected and stored as time series, most predictive models are designed to learn and capture the temporal and sequential patterns of Covid-19 outbreaks. In most countries, outbreaks occurred at different spreading levels tightly tied to specific time periods. Recent studies also show that the spreading level depends on changes in natural factors such as temperature and seasonal weather (Chin et al., 2020, Zhou et al., 2020). The fluctuation patterns of these outbreaks are therefore inherently non-linear and dynamic, depending on multiple natural and non-natural factors. Traditional linear statistical prediction approaches are generally unable to capture non-linear temporal information from daily reported confirmed Covid-19 cases; this type of data may contain a high level of chaotic observations, noise, and disturbances caused by multiple internal and external factors. Most classical linear time-series prediction models rely heavily on the regression paradigm and cannot account for non-linear data patterns. In the Covid-19 outbreak prediction problem, they may therefore fail to capture the dynamics of Covid-19 transmission through the community over different time periods. Moreover, from the perspective of real-world applications, a Covid-19 prediction system should continuously learn from historical data to extract long-range temporal features and achieve more accurate predictions. Historically, statistical models such as the Auto-Regressive (AR), Moving Average (MA), Auto-Regressive Integrated Moving Average (ARIMA), and Nonlinear Auto-regression Neural Network (NARNN) models
have been widely applied to capture the linear patterns of daily confirmed case counts for forecasting. However, these models are still far from sufficiently preserving the real-time and temporal information in daily reported Covid-19 data.
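The shared idea behind the autoregressive family mentioned above can be sketched in a few lines: predict the next value as a linear combination of the last p observations, with coefficients fit by least squares. This is an illustrative toy, not the setup of any cited study; a real system would use a dedicated library and a differencing/seasonality treatment.

```python
import numpy as np

def fit_ar(series, p):
    """Fit AR(p) coefficients (intercept + p lag weights) by least squares."""
    X = np.column_stack([series[i:len(series) - p + i] for i in range(p)])
    y = series[p:]
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_next(series, coef):
    """One-step-ahead forecast from the last p observations."""
    p = len(coef) - 1
    return coef[0] + coef[1:] @ series[-p:]

# Toy daily-case series; an AR(2) fit recovers its additive pattern exactly.
cases = np.array([10, 12, 15, 17, 20, 22, 25, 27, 30, 32], dtype=float)
coef = fit_ar(cases, p=2)
print(predict_next(cases, coef))  # forecast for the next day
```

A purely linear model like this is exactly what the paragraph above criticizes: it fits smooth trends well but cannot represent the non-linear, regime-switching behavior of real outbreak data.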

Recent achievements & challenges

Several recent studies have demonstrated the efficiency of applying the ARIMA model to the Covid-19 outbreak prediction problem (Katris, 2021). In this approach, reported Covid-19 data are modelled as a time series to extract information for predicting new infections. In recent years, ARIMA-based predictive models have been applied widely to many time series datasets. ARIMA and its variants are popular because of their flexibility in analyzing the temporal information of a time series through different statistical order parameters. Recent work has demonstrated the success of ARIMA-integrated predictive systems in prediction tasks for multiple infectious diseases (Cao et al., 2020, He and Tao, 2018, Liu et al., 2011), including the Covid-19 pandemic (Benvenuto et al., 2020, Dehesh et al., 2020). However, most ARIMA-based techniques still cannot preserve temporal information or perform the non-linear regression needed to handle complex, highly noisy daily Covid-19 reports, which are heavily influenced by external factors. To handle non-linear and complex sequential patterns, several studies (Ceylan, 2021, Wu et al., 2015, Yu et al., 2014) have applied the NARNN approach, enabling predictive systems to perform non-linear time-series analysis. In this approach, temporal information is preserved from the input time-dependent observations through neural network based learning.
In general, NARNN is a neural network based approach that extracts temporal information from time series by forming different sub-sequences of a dataset and training the corresponding weights through back-propagation. Recent enhancements (Ceylan, 2021, Hansun et al., 2021) integrating NARNN into predictive systems have shown remarkable performance on short-term Covid-19 outbreak forecasting. The success of NARNN on complex time-series prediction tasks clearly indicates that deep neural networks are a key factor for better temporal/sequential representation learning and prediction.
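As a rough illustration of the NARNN idea described above (windows of past observations fed to a nonlinear network trained by back-propagation), the following sketch trains a one-hidden-layer tanh network on a toy series. The layer sizes, learning rate, and toy data are all illustrative assumptions, not the configurations of the cited works.

```python
import numpy as np

rng = np.random.default_rng(0)
p, hidden = 3, 8  # window length and hidden size (arbitrary choices)

def make_windows(series, p):
    """Build (window, next-value) training pairs from a 1-D series."""
    X = np.stack([series[i:i + p] for i in range(len(series) - p)])
    return X, series[p:]

# Toy smooth series; real inputs would be scaled daily case counts.
series = np.sin(np.linspace(0, 6, 80))
X, y = make_windows(series, p)

W1 = rng.normal(0, 0.5, (p, hidden)); b1 = np.zeros(hidden)
W2 = rng.normal(0, 0.5, hidden);      b2 = 0.0

def forward(X):
    H = np.tanh(X @ W1 + b1)          # nonlinear hidden layer
    return H, H @ W2 + b2             # linear output

losses = []
for _ in range(500):                  # plain batch gradient descent
    H, pred = forward(X)
    err = pred - y
    losses.append(float(np.mean(err ** 2)))
    gW2 = H.T @ err / len(y); gb2 = err.mean()
    gH = np.outer(err, W2) * (1 - H ** 2)   # back-prop through tanh
    gW1 = X.T @ gH / len(y);  gb1 = gH.mean(axis=0)
    W2 -= 0.1 * gW2; b2 -= 0.1 * gb2
    W1 -= 0.1 * gW1; b1 -= 0.1 * gb1

print(losses[0], "->", losses[-1])    # in-sample error drops during training
```

The nonlinearity is what separates this from the AR/ARIMA family, but the fixed small window is also why, as noted above, NARNN cannot preserve long-range temporal structure.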

Deep learning based approach for Covid-19 outbreak forecasting

Recently, deep learning has achieved notable results in multiple areas of computer science, such as natural language processing (NLP) and computer vision (CV). This progress has opened the way to higher-quality sequential/temporal representation learning and time series prediction (Lim & Zohren, 2021). Among advanced deep neural architectures, recurrent neural networks (RNNs) such as the Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), and Bidirectional LSTM (Bi-LSTM) play an indispensable role in time series modelling and representation learning. GRU and LSTM are the most popular RNN architectures and have been widely used in different types of time series modelling and fine-tuning for prediction. Specifically, GRU and LSTM can learn and capture long-range temporal information in the form of hidden states generated by sequence-connected neural cells. Several recent studies (Chimmula and Zhang, 2020, Zeroual et al., 2020) have shown that LSTM-based architectures outperform alternatives on time series based Covid-19 case prediction. The LSTM model supports temporal representation learning and time-dependent training for prediction. However, most recent RNN-based predictive models still face major limitations, mainly in handling highly noisy/non-smooth sequential data and long-range dependencies. Reported Covid-19 cases sometimes form a cumulative sequence with multiple upward/downward trends, producing a non-smooth pattern that challenges LSTM's ability to preserve these disturbances and make correct predictions.
Moreover, most RNN architectures such as GRU/LSTM cannot learn and properly interpret long input sequences because of vanishing/exploding gradients during back-propagation. In other words, LSTM-based predictive models tend to focus on and remember the most recently processed elements rather than preserve long-range dependencies across the whole input sequence. They may therefore fail to capture the trends of Covid-19 outbreaks over long periods.
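The vanishing-gradient effect mentioned above can be shown numerically: back-propagation through a recurrent network multiplies one Jacobian per time step, and when those Jacobians are contractive the gradient contribution from early steps decays toward zero. The sketch below uses a plain tanh RNN cell with a deliberately small weight scale (an assumption chosen so the recurrence is contractive), not an LSTM.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 16
W = rng.normal(0, 0.05, (n, n))   # small recurrent weights -> contractive map

h = rng.normal(0, 1, n)           # initial hidden state
J = np.eye(n)                     # accumulated Jacobian d h_t / d h_0
grad_norm = []
for t in range(60):               # unroll 60 recurrent steps
    h = np.tanh(W @ h)
    # Chain rule for h_t = tanh(W h_{t-1}): Jacobian step is diag(1-h^2) W.
    J = (np.diag(1 - h ** 2) @ W) @ J
    grad_norm.append(np.linalg.norm(J))

# The gradient signal reaching the earliest time step shrinks geometrically.
print(grad_norm[0], "->", grad_norm[-1])
```

This is why, as the text notes, plain recurrent models effectively "forget" early inputs: the attention mechanisms introduced later give the model a direct path to every time step instead.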

Our motivations & contributions

Inspired by recent literature and proposed Covid-19 predictive models (Chimmula and Zhang, 2020, Hu et al., 2020, Zeroual et al., 2020), in this paper we propose an enhanced sequential auto-encoding architecture with a dual self-attention mechanism, called DAttAE. The proposed DAttAE improves the performance of the temporal information representation learning process.

The utilization of sequential auto-encoding (AE) mechanism in Covid-19 outbreak prediction

In this study, to improve performance on the Covid-19 case prediction problem, we use a sequential (Bi-LSTM based) auto-encoder to encode historical observations into rich temporal representations. The learnt temporal information is then used in fine-tuning for Covid-19 trend forecasting. In general, an auto-encoder (AE) learns a latent representation of the data through separate neural components (encoding/decoding), which allows us to efficiently disentangle the latent time-dependent features of the historical observations under evaluation. Specifically, our proposed DAttAE model inherits the well-known sequence-to-sequence (seq2seq) architecture (Sutskever, Vinyals, & Le, 2014). To handle noise and lagged observations in the time series, it integrates the attention mechanism (Bahdanau et al., 2015, Vaswani et al., 2017), originally developed in the NLP domain. Historical confirmed Covid-19 case counts, in the form of a time series, are passed through an auto-encoding architecture composed of two separate Bi-LSTM based components, an encoder and a decoder, in a sequential self-learning approach (illustrated in the main block of Fig. 1-A). Within this AE architecture, we first obtain the concatenated (backward/forward) hidden states of the input sequences of reported Covid-19 case counts over a specific time range through the encoding component. The encoder's outputs are then fed into another Bi-LSTM that simultaneously interprets and produces transformed hidden states of the original input sequences.
Next, the learnt hidden states of both the encoding and decoding components of the AE are passed through separate attention layers to generate combined attention weights (illustrated in Fig. 1-B). The purpose of applying this dual self-attention filtering to the outputs of the Bi-LSTM based AE is to estimate the importance of every data entry in each input sequence, which directly supports the subsequent prediction fine-tuning. Finally, the concatenated attention weights of the encoder and decoder are fed into a fully connected layer to predict the number of Covid-19 cases.
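The attention-and-combine step just described can be sketched schematically: score each time step of the encoder and decoder hidden-state sequences, softmax-normalize the scores, reduce each sequence to a context vector, and map the concatenated contexts through a dense layer. All dimensions and the random weights below are placeholders for learned parameters; this is not the authors' exact implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
T, d = 14, 32                        # sequence length, hidden size (arbitrary)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(H, w):
    """Score each time step, softmax the scores, and sum into a context vector."""
    scores = softmax(H @ w)          # importance weight per time step
    return scores @ H                # (d,) weighted combination of states

H_enc = rng.normal(size=(T, d))      # stand-in for Bi-LSTM encoder hidden states
H_dec = rng.normal(size=(T, d))      # stand-in for Bi-LSTM decoder hidden states
w_enc, w_dec = rng.normal(size=d), rng.normal(size=d)  # attention parameters

# Dual attention: one context vector per component, concatenated.
context = np.concatenate([attend(H_enc, w_enc), attend(H_dec, w_dec)])

W_out = rng.normal(size=2 * d)       # final fully connected layer (scalar output)
prediction = context @ W_out         # predicted next-interval case count
print(context.shape)
```

Because the softmax weights cover every time step, noisy or lagged entries can be down-weighted rather than dominating the final representation, which is the motivation given above.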
Fig. 1

Overall architecture of our proposed DAttAE model.


Our findings and real-world case studies in this paper

In general, the Covid-19 case forecasting task can be formulated as a non-smooth, noisy time-series analysis problem in which routine reported case counts fluctuate abnormally over time. The rapid and abnormal changes in infection counts arise from several causes, including environmental/geographical factors and inconsistencies in the data reported by different organizations and governments. Recent regression and modern RNN-based predictive models may therefore not be powerful enough to analyze such data and suppress noise and temporal disturbances during representation learning and prediction. To address this problem, our approach adapts the neural attention mechanism to obtain a noise-reduced sequential representation of each input sequence: the importance of every reported Covid-19 data entry in the sequence is taken into account during the embedding process of the Bi-LSTM based auto-encoder, ultimately yielding better predictions. To demonstrate the effectiveness of our findings, we tested the proposed method against other baselines on real-world Covid-19 datasets and obtained superior results on both Covid-19 case prediction and risk-zone classification (described later in sub-section 2.1). Our research objectives, contributions, and the novelty of the proposed DAttAE model can be summarized in three parts. First, we present a Bi-LSTM based auto-encoding architecture for self-supervised representation learning, which effectively preserves the dynamic and temporal information of a given time series dataset.
The purpose of the sequential auto-encoding mechanism is to jointly facilitate data embedding and interpretation, addressing the challenges of non-smooth and chaotic sequences that may contain many lagged observations, noise, and temporal disturbances, such as daily reported Covid-19 case counts. Second, we integrate the proposed Bi-LSTM based auto-encoder with a custom dual self-attention mechanism (illustrated in Fig. 1-B), which yields smooth representations of the input sequences. These rich representations of the historical observations are then fine-tuned through a task-driven fully connected layer to achieve better prediction results. The purpose of integrating attention into the sequential AE architecture is to automatically evaluate the importance of each consecutive data entry in non-smooth input sequences, which in turn improves the accuracy of the temporal representation learning process. Finally, we conduct extensive experiments on real-world confirmed Covid-19 case datasets to demonstrate the effectiveness of the proposed DAttAE model. The first dataset, named VN-62P, contains confirmed Covid-19 cases reported in 62 provinces of Vietnam between April 27, 2021 and October 21, 2021. The second dataset, named HCMC-12D, contains the numbers of confirmed Covid-19 cases in 12 highly populated districts of Ho Chi Minh City (HCMC) between June 22, 2021 and October 19, 2021. We also publish the full versions of these two datasets as a contribution of this paper.
In addition, beyond our own datasets, we evaluated the proposed model against several deep learning baselines on another Covid-19 dataset covering different states of the United States (US). The experimental results demonstrate that the proposed DAttAE model outperforms recent state-of-the-art RNN-based architectures.

Difference between our approach and previous works

For many years, researchers have sought approaches that effectively handle and preserve the temporal dynamics of time-series datasets to produce better predictions. However, most traditional approaches, such as the popular auto-regression based techniques (ARMA, ARIMA, etc.) and naïve Bayesian probabilistic additive decomposition techniques for time series, fail to sufficiently capture the long-range time-dependent features between consecutive observations. Within the deep learning approach to short/long-term time series prediction, complex recurrent neural network architectures have been used to learn representations of long historical observation sequences. These RNN-based models capture rich time-dependent features and achieve significant improvements. However, such deep predictive models often produce inaccurate predictions on highly noisy, chaotic sequential datasets with unpredictable growth patterns, such as daily reported Covid-19 case data. Recently, a few works have concentrated on applying attention mechanisms within the sequential auto-encoding paradigm; combining neural attention with fully connected components has effectively helped capture the short-term growth patterns of disturbed, chaotic time series. Inspired by these previous works, for the case study of Covid-19 outbreak prediction we dedicated our efforts to discovering how the integration of dual self-attention neural architectures can efficiently support a sequential AE-based architecture.
Our proposed AE with a dual self-attention mechanism efficiently handles non-smooth sequences and improves time series prediction performance, specifically for daily reported confirmed Covid-19 cases in Vietnam.

Managerial implications of our studies in this paper

In recent years, the COVID-19 pandemic and its associated aspects have been a top-priority research topic for researchers worldwide. To prevent the rapid spread of this pandemic, early infectious case forecasting is a crucial task that can facilitate management activities in many social sectors, especially public administration and the healthcare system. Accurate Covid-19 case forecasts directly support the optimization of multi-sector social resources for preventing the spread of the pandemic and for planning quarantine and isolation strategies. Moreover, given the severe impact of Covid-19 on human health, effective anticipation and distribution of public healthcare resources are extremely important for providing fast, proper treatment to patients with severe COVID-19 symptoms. Within computer science, various studies in the literature have used mathematical models and deep learning based paradigms to predict the spread of Covid-19. Daily Covid-19 infection counts are typically reported as complex, highly fluctuating time series. Different RNN-based models have been applied to analyze and learn the dynamic feature representations of such datasets, but they remain insufficient for coping with abnormal and lagged observations and delivering better predictions. In this work, we propose a novel Bi-LSTM based auto-encoding architecture integrated with a dual attention mechanism to handle the complex, non-smooth, abnormal fluctuations of daily reported Covid-19 cases and thereby deliver more accurate predictions. Accurate forecasts of new Covid-19 case counts and their growth patterns carry significant management implications. The remainder of the paper is organized into four sections.
In the second section, we briefly review recent works related to our study and discuss the pros and cons of each technique, which motivate the proposed DAttAE model. In the third section, we formally present the background concepts, methodology, and detailed implementation of the proposed DAttAE model. In the fourth section, we present extensive comparative studies of the Covid-19 outbreak prediction task across different deep neural architectures on real-world datasets and discuss the experimental results. Finally, we conclude our findings and highlight promising directions for future improvements. For reference, Table 1 lists the common abbreviations and mathematical notations used in the remainder of the paper.
Table 1

List of common abbreviations/notations which are utilized in our paper.

Abbreviation/Notation: Explanation
AE: Auto-Encoding
ANN: Artificial Neural Network
ARIMA: Auto-Regressive Integrated Moving Average
ARMA: Auto-Regressive Moving Average
Bi-LSTM: Bidirectional LSTM
GRU: Gated Recurrent Unit
LSTM: Long Short-Term Memory
NARNN: Nonlinear Auto-regression Neural Network
RNN: Recurrent Neural Network
SIR: Susceptible-Infectious-Recovered (epidemic spread-of-disease model)
VMH: Vietnam Ministry of Health
WHO: World Health Organization
X: A time series dataset, i.e., a collection of observation sequences.
x: A single observation sequence.
h: A hidden state.
e: An embedding vector.
W, b: The trainable weight matrix and bias parameters, respectively.
ReLU(.): The rectified linear unit function, ReLU(x) = max(0, x).
Linear(.): A linear fully connected neural layer.
σ(.): The sigmoid function, σ(x) = 1/(1 + e^(-x)).

Our case studies & related works

Daily covid-19 infectious case prediction & risk zone classification

For many decades, the world has witnessed and suffered from severe infectious disease outbreaks such as the Asian Flu, HIV/AIDS, SARS, and now Covid-19. Many researchers, organizations, and governments in different countries have thoroughly studied, proposed, and applied management strategies to prevent the explosive growth of Covid-19 infections. According to recent reports of the World Health Organization (WHO) [4], over 248 million confirmed cases and over 4 million deaths had been reported worldwide by the start of November 2021. In Vietnam, according to official information from the Vietnam Ministry of Health (VMH) [5], approximately 880 thousand infections and 21 thousand deaths had been reported by that time. The spread of the Covid-19 pandemic varies across countries and appears unpredictable; it is influenced by both natural aspects (seasonal weather, regional temperature, etc.) and non-natural ones (population, immigration, etc.). Because of its severe effects on multiple social and economic aspects, the pandemic is a multidisciplinary issue requiring the involvement of many organizations and governments: greater government effort in imposing new social management strategies, pharmaceutical/epidemiological work on medical treatment proposals, and data modelling/analysis solutions for early forecasting and planning. In Vietnam, the government and the VMH have proposed effective management strategies that flexibly impose different quarantine policies for different regions, depending on the level of Covid-19 infection risk in each of Vietnam's regions and cities.
Recently proposed management strategies jointly optimize social resources for preventing Covid-19 outbreaks in high-risk regions while ensuring social and economic stability in low-risk/safe regions. According to the VMH, the risk level of each region is identified by two criteria: the daily/weekly number of confirmed Covid-19 cases and the vaccination rate (%) in that region. Table 2 shows the detailed assessment criteria provided by the VMH [6] for Covid-19 risk zone classification in Vietnam. Our work in this paper focuses on modelling Covid-19 data as time series for prediction, specifically on learning the evolution pattern of confirmed Covid-19 infections. We then use the predicted case counts to perform Covid-19 risk zone classification following the assessment criteria listed in Table 2. Fig. 2 illustrates the Covid-19 risk zone prediction and classification problem for 62 provinces of Vietnam.
Table 2

Covid-19 risk zone categorization in Vietnam following the VMH’s criteria.

Level of risk: Assessment criteria

Red zone (very high risk)

Number of weekly confirmed cases > 150.

Vaccination rate < 70 % (if ≥ 70 %, classified as the orange zone).

Orange zone (high risk)

Number of weekly confirmed cases in the range of 50 to 150.

Vaccination rate < 70 % (if ≥ 70 %, classified as the yellow zone).

Yellow zone (medium risk)

Number of weekly confirmed cases in the range of 20 to 50.

Vaccination rate < 70 % (if ≥ 70 %, classified as the green zone).

Green zone (normal)

Number of weekly confirmed cases less than 20.

Vaccination rate < 70 %.

Fig. 2

Covid-19 risk zone prediction and categorization through our proposed DAttAE model within 62 provinces of Vietnam.

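The VMH criteria in Table 2 can be expressed as a small rule-based lookup. This sketch assumes one reading of the parenthetical conditions, namely that a vaccination rate of at least 70 % moves a region one zone toward lower risk; the function name and thresholds at the range boundaries are illustrative.

```python
def risk_zone(weekly_cases, vaccination_rate):
    """Classify a region per the VMH criteria summarized in Table 2."""
    if weekly_cases > 150:
        level = "red"        # very high risk
    elif weekly_cases >= 50:
        level = "orange"     # high risk
    elif weekly_cases >= 20:
        level = "yellow"     # medium risk
    else:
        level = "green"      # normal

    # Assumed reading: vaccination >= 70 % softens the zone by one level.
    if vaccination_rate >= 70 and level != "green":
        level = {"red": "orange", "orange": "yellow", "yellow": "green"}[level]
    return level

print(risk_zone(200, 60), risk_zone(200, 80), risk_zone(10, 50))
```

In the paper's pipeline, the `weekly_cases` argument would come from the model's predicted case counts, turning the forecasting output into the risk-zone classification described above.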

Recent achievements in Covid-19 forecasting task

The traditional statistical and machine learning based approach

Recently, there are several works focused on applying statistical/mathematical and traditional predictive approach, like as ARIMA (Benvenuto et al., 2020, Dehesh et al., 2020). These traditional time series based techniques support to efficiently model and predict the transmission trends of Covid-19. These works have demonstrated the success in modelling and correctly predicting the linear growth of new Covid-19 infectious cases. However, real-world reported Covid-19 is much complex and dynamic in different nations in which the growth trends in number of Covid-19 reported cases are observed as non-linear growing patterns. Therefore, traditional statistical methods might be unable to sufficiently fit non-linear patterns of the given dataset which leads to the downgrade in the accuracy for short-term prediction of Covid-19 transmission growth. Same on the statistical approach, recently Zhang et al. (Zhang et al., 2020) proposed a the R0-based statistical technique for forecasting the Covid-19 infection rate of the outbreak which was occurred in the Diamond Princess cruise ship[ 7 ]. Similar to that, Varotsos, C. A. et al. proposed a novel probabilistic model, called as: Covid-19 Decision-Making System (CDMS) to measure the spread of Covid-19 disease for prompt actions and further planning activities (Varotsos & Krapivin, 2020). Also considered as an probabilistic approach, in recent studies (Amaral, Casaca, Oishi, & Cuminato, 2021), Amaral, F. et al. proposed a Covid-19 spreading predictive model which is relied on the Spread of Disease (SIR) model (Cooper, Mondal, & Antonopoulos, 2020) to evaluate and forecast the Covid-19 outbreaks in São Paulo, Brazil and Brazil. However, it was still unable to accurately predicting complex non-linear escalations of Covid-19 cases in different time steps. 
Moreover, traditional mathematical and statistical techniques are also limited in retaining the temporal information of routinely reported infectious cases in the form of time series data. To deal effectively with the non-linear patterns of daily reported infectious cases, recent attempts by Ceylan (2021) and Hansun et al. (2021) have demonstrated the effectiveness of the non-linear autoregressive neural network (NARNN) architecture. These NARNN-based models (Ceylan, 2021, Hansun et al., 2021) can model the flexible time-dependent growth patterns of Covid-19 in different real-world datasets. In general, the NARNN approach is an early artificial-neural-network-based technique; it mainly relies on a neuron-varied learning paradigm to handle the time-series modelling and prediction problem. In the same vein of applying multi-layered neural architectures to the Covid-19 spreading prediction problem, Wieczorek et al. (Wieczorek, Siłka, & Woźniak, 2020) proposed a stacked, fully connected neural architecture to preserve the temporal information of routinely reported Covid-19 datasets. However, the simplicity of the NARNN architecture means it cannot preserve the long-range temporal information of complex time series datasets well enough to perform effective short-term prediction.

The deep learning based approach

In recent years, the tremendous rise of deep learning across multiple disciplines has shed light on enhancing the performance of the Covid-19 outbreak forecasting problem (Shorten, Khoshgoftaar, & Furht, 2021). For time series modelling and representation learning, popular RNN-based architectures (e.g., GRU, LSTM, Bi-LSTM) are widely applied and achieve significant performance in multiple data analysis tasks, including time-series prediction. Recent attempts by Chimmula and Zhang (2020) and Hu et al. (2020) applied advanced deep neural architectures to capture complex non-linear and temporal information from time series data. More specifically, Chimmula et al. (Chimmula & Zhang, 2020) presented the advantages of LSTM-based neural architectures in modelling reported Covid-19 time series data for predicting short-term numbers of infection cases in Canada, the US and Italy. For a general literature review of the potential applications of different deep neural architectures to sequential data representation learning and the Covid-19 outbreak forecasting problem, Zeroual et al. (2020) conducted extensive comparative studies between RNN-based and auto-encoding (AE) based architectures, demonstrating the potential of these advanced neural architectures for handling real-world time-series Covid-19 datasets. Similarly, Chatterjee et al. (Chatterjee, Gerdes, & Martinez, 2020) studied the use of multiple LSTM-based architectures to efficiently preserve the dynamic temporal information of reported Covid-19 spreading data and conduct accurate predictions. Also related to Covid-19 time-series evaluation and learning, Nascimento et al. (2021) recently proposed a novel dynamic graph-based analysis technique with a multi-regression dynamic model (MDM) approach.
It supports finding relationships between routinely reported Covid-19 time series data and financial market trends. These well-known studies have laid strong foundations for building efficient Covid-19 prediction systems. However, given the aforementioned limitations of RNN-based architectures in remembering long-range temporal information from input sequences, as well as the problems associated with highly chaotic, noisy and non-smooth time series datasets, challenges remain that require further research effort. Motivated by these recent studies in future Covid-19 case forecasting, our work in this paper concentrates on integrating a Bi-LSTM based auto-encoding architecture with a dual self-attention mechanism to improve the performance of the Covid-19 case prediction task.

Methodology

In this section, we formally present the methodology and detailed implementation of our proposed DAttAE model. In general, the proposed DAttAE model is designed as a Bi-LSTM based auto-encoder that softly captures the temporal information of the input time series sequences through a self-supervised representation learning approach. The hidden states produced by the encoder and decoder parts are then fed into a dual self-attention mechanism to produce combined, weighted attention vectors, which estimate the importance of all data entries in the given input sequences and assist the temporal forecasting process. To do this, the concatenated output attention vectors of both the encoder and the decoder are passed through a linear fully connected layer to conduct prediction.

Problem formulation & background concepts

Short-term time series prediction task

In general, a large number of real-world datasets in multiple disciplines can be considered as time series data, in which data entries are collected at specific time intervals. A time series with n data entries is denoted as X = {x_1, x_2, ..., x_n}, with x_t being a single data entry at a specific time t. Due to the popularity of time-series datasets, numerous studies have presented notable achievements, as well as remaining challenges, for the time series prediction problem. A time series prediction model is formally designed to forecast the upcoming sequential trends/patterns of a given time series dataset by analyzing the latent temporal features of the historical data entries. For the short-term time series prediction problem, the given time series dataset (X) is normally split into m smaller observation sequences, denoted as S = {s_1, s_2, ..., s_m}, with s_i = {x_i, x_{i+1}, ..., x_{i+L-1}}. Each observation sequence is used to learn and predict the upcoming trend/pattern of the consecutive data entry x_{i+L}, with L being the pre-defined observation length, or look-back parameter. In general, a time series predictive method, denoted as f, is designed to optimize the following learning objective: minimize the error between the predicted value f(s_i) and the actual next entry x_{i+L} over all observation sequences.
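The observation-sequence split described above can be sketched as a simple sliding-window routine; the function name and the toy series are illustrative, not from the paper:

```python
import numpy as np

def make_windows(series, L=5):
    """Split a 1-D time series into (observation, target) pairs with
    look-back length L: each window s_i predicts the entry x_{i+L}."""
    X, y = [], []
    for i in range(len(series) - L):
        X.append(series[i:i + L])   # observation sequence s_i
        y.append(series[i + L])     # next data entry to predict
    return np.array(X), np.array(y)

X, y = make_windows(list(range(10)), L=5)
# A 10-entry series with L = 5 yields 5 windows; the first window
# [0, 1, 2, 3, 4] has target 5.
```

With the paper's default L = 5 (Table 4), each province's daily case series would be windowed in exactly this way before training.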

Recurrent neural network (RNN)

In the past, stacked fully connected linear neural network architectures were widely applied to model and learn latent contextual features from different types of datasets. However, these stacked multi-layered neural architectures are unable to model and retain the consecutive information in sequential/time-series datasets. To handle the continuous and temporal information present in different forms of sequential data, a different deep neural architecture is required. The RNN is the most popular neural architecture designed for sequence/time-dependent latent feature representation learning. In general, the RNN is built on the principle of considering the effects of consecutive data entries' information when generating the corresponding outputs. To do this, neural state cells are organized in a sequence-ordered structure. Each neural cell in a given RNN-based architecture contains logic gates that control the influence of the historical observations of previous data on the current input entry x_t, and it generates the corresponding output in the form of a hidden state h_t. For a specific time step t, the generated hidden state for a given input is generally obtained as h_t = f(W x_t + U h_{t-1} + b), where W, U and b stand for the trainable parameters (weights, biases) and f for the corresponding activation functions of the different gates in the given RNN architecture. Among RNN-based architectures, GRU and LSTM are the most commonly and widely applied to model and analyze different types of sequence/time-series datasets in multiple disciplines, such as natural language processing (NLP), short-term recommendation, multimedia information retrieval, etc.
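The recurrent update above can be illustrated with a minimal vanilla RNN step in NumPy; the sizes and random weights are purely illustrative (a trained RNN would learn W, U and b):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h = 1, 4                      # input and hidden sizes (illustrative)
W = rng.normal(size=(d_h, d_in))      # input-to-hidden weights
U = rng.normal(size=(d_h, d_h))       # hidden-to-hidden weights
b = np.zeros(d_h)                     # bias

def rnn_step(x_t, h_prev):
    """One vanilla RNN cell update: h_t = tanh(W x_t + U h_{t-1} + b)."""
    return np.tanh(W @ x_t + U @ h_prev + b)

h = np.zeros(d_h)                     # initial hidden state
for x_t in [np.array([0.1]), np.array([0.3]), np.array([0.2])]:
    h = rnn_step(x_t, h)              # the hidden state carries history forward
```

GRU and LSTM cells refine this same recurrence with gating, which is why they retain longer-range temporal information than the plain update shown here.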

DAttAE model

Bi-LSTM based auto-encoding architecture

To effectively model input sequences coming from a non-smooth, highly noisy time series dataset such as routinely reported Covid-19 infectious cases, we are mainly inspired by the well-known seq2seq approach (Sutskever et al., 2014) in the NLP domain: the proposed DAttAE model is designed as a sequential auto-encoding architecture with two components, an encoder and a decoder. The encoding part is composed of a Bi-LSTM architecture that effectively encodes all data entries in each input sequence in both the forward and backward directions. In more detail, for a specific data entry in a given observation sequence s_i, the LSTM in each direction produces the corresponding hidden state as shown in equation (1). In this equation, the weighting and bias parameter matrices of the different gates of the LSTM in each direction are identified by the direction operator (denoting the forward and backward directions, respectively). The last generated hidden states of the Bi-LSTM architecture are then combined to form the final sequential embedding vector of the given input sequence. These learnt temporal representations are formulated as the output hidden state embedding vectors of the last Bi-LSTM layers, with L being the length of the input sequence.
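A minimal sketch of the bi-directional encoding step in PyTorch, assuming the paper's configuration (hidden size 32, look-back length 5, one input feature per day); this is an illustration of the mechanism, not the authors' implementation:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
L_obs, d_h = 5, 32                    # look-back length and hidden size (Table 4)
encoder = nn.LSTM(input_size=1, hidden_size=d_h,
                  batch_first=True, bidirectional=True)

seq = torch.randn(1, L_obs, 1)        # one observation sequence of daily counts
out, (h_n, c_n) = encoder(seq)
# out[:, t, :] concatenates the forward and backward hidden states for entry t,
# so each of the L_obs entries receives a 2*d_h-dimensional temporal embedding.
```

The per-entry output thus has dimension 2 × 32 = 64, which is the sequential embedding the decoder consumes.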

The utilization of LSTM/Bi-LSTM and sequential auto-encoding mechanism

In general, the purpose of the LSTM/Bi-LSTM based sequential embedding mechanism in our work is to sufficiently capture the range-varied, time-dependent features of a given time-series dataset. Equations (1) and (2) illustrate the basic LSTM-based neural cell implemented in our sequential encoding mechanism. In both the encoding and decoding parts, a dual hierarchical LSTM based architecture (Smagulova & James, 2019) is applied to generate the bi-directional temporal embeddings of each input entry in the form of the last output hidden states of each LSTM. These rich structural temporal representations are combined to produce unified representations of the consecutive observation entries; the overall hidden state combination process is illustrated in equation (2), with ⊕ denoting the vector concatenation operation. The latent embedding vectors generated for each input sequence by the encoding part are then passed to the decoding part. The decoding component is designed as another Bi-LSTM based architecture that transforms them into another sequential representation form. The ultimate purpose of using different Bi-LSTM based embedding mechanisms to learn the sequential representation of the input sequence is to create a longer consecutive neural learning architecture. This Bi-LSTM based auto-encoding mechanism helps prevent problems related to gradient explosion and enables efficient calculation of the trainable weighting parameters. Moreover, the Bi-LSTM architecture also preserves temporal information better for chaotic time-series data, in which the latent features generated in each neural cell can be shared and remembered across sets of long, non-smooth input sequences.
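The encoder-to-decoder chaining described above can be sketched as two Bi-LSTMs in series, the second consuming the first's concatenated forward/backward states. Sizes follow the paper's configuration; everything else (variable names, random inputs) is illustrative:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
d_h = 32
# Encoder and decoder are both Bi-LSTMs; the decoder consumes the encoder's
# concatenated forward/backward hidden states (2*d_h values per entry).
encoder = nn.LSTM(1, d_h, batch_first=True, bidirectional=True)
decoder = nn.LSTM(2 * d_h, d_h, batch_first=True, bidirectional=True)

batch = torch.randn(8, 5, 1)          # 8 observation sequences of length 5
enc_states, _ = encoder(batch)        # (8, 5, 64): per-entry bi-directional embeddings
dec_states, _ = decoder(enc_states)   # re-encoded sequential representation
```

Stacking the two Bi-LSTMs in this way yields the "longer consecutive neural learning architecture" the text refers to, while each component stays shallow enough to train stably.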

Dual attention mechanism and time series prediction

Next, to enhance the capability of the Bi-LSTM based auto-encoder to achieve a better sequential representation of a given input sequence, in which the importance levels of all data entries are taken into consideration during the embedding process, we implement a custom hierarchical dual self-attention mechanism in our proposed DAttAE model. Two self-supervised attention layers are placed at the ends of the encoder and decoder components; they take the output embedding vectors of each component and help the overall model selectively concentrate on different parts of the given sequence for a better prediction-driven fine-tuning process at the end. Specifically, the self-supervised attention layer located at the output of the encoder and decoder components is designed as follows. In equation (3), the attention score of each data entry at a specific index is computed as a linear transformation following the attention alignment paradigm of previous works (Bahdanau et al., 2015, Vaswani et al., 2017). Unlike previously proposed attention mechanisms, which are mainly applied to the machine translation task, in our dual self-supervised attention mechanism we use the ReLU activation function to perform the transformation of the given output sequential embedding with a trainable weighting parameter matrix. The computed attention score for a specific data entry in the sequence is then used to calculate the corresponding importance score. Finally, we compute the context attention vector for the given input sequence as the importance-weighted sum over all data entries, as illustrated in equation (4).
In general, the application of this self-supervised attention mechanism in our DAttAE model directly supports the soft alignment between the sequential embedding vectors generated in the encoding and decoding parts, which are simultaneously learned and controlled by the corresponding context vectors of the encoder and decoder, respectively. We then concatenate the calculated context vectors of the encoding and decoding parts to produce the final dual attention-based weighted embedding vector of the input sequence. Finally, to let the architecture predict the value of the data entry following the given sequence, we feed the achieved attention-based embedding vector into a linear fully connected layer. As shown in equation (5), this linear fully connected layer transforms the previously obtained attention-based embedding vector of the input sequence into the space of possible entry values of the given time series dataset. We then optimize the DAttAE model's parameters through a mean square error (MSE) loss, with the learning objective defined in equation (6).
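The attention pooling and prediction head described above can be sketched as follows. This is our reading of the mechanism, not the authors' code: the ReLU-scored attention, the concatenation of the two context vectors, and the MSE objective follow the text, while the class name and the random stand-in hidden states are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionPool(nn.Module):
    """Self-attention layer of the kind described above: scores each entry's
    embedding with a ReLU-activated linear map, normalizes the scores into
    importance weights, and returns their weighted sum as a context vector."""
    def __init__(self, d):
        super().__init__()
        self.score = nn.Linear(d, 1)

    def forward(self, h):               # h: (batch, L, d)
        e = F.relu(self.score(h))       # attention score per entry, (batch, L, 1)
        alpha = torch.softmax(e, dim=1) # importance weights over the sequence
        return (alpha * h).sum(dim=1)   # context vector, (batch, d)

torch.manual_seed(0)
d = 64                                  # 2 * 32, matching the Bi-LSTM outputs
att_enc, att_dec = AttentionPool(d), AttentionPool(d)
head = nn.Linear(2 * d, 1)              # predicts the next case count

h_enc = torch.randn(8, 5, d)            # encoder hidden states (stand-ins)
h_dec = torch.randn(8, 5, d)            # decoder hidden states (stand-ins)
pred = head(torch.cat([att_enc(h_enc), att_dec(h_dec)], dim=-1))
loss = F.mse_loss(pred, torch.randn(8, 1))  # MSE learning objective, eq. (6)
```

In training, `loss.backward()` would update the attention layers, the prediction head and both Bi-LSTMs end to end.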

Experiments & discussions

In this section, we present extensive experiments evaluating the performance of the proposed DAttAE model on the new-infection-case prediction task, using two real-world reported Covid-19 datasets from Vietnam. Moreover, we also provide thorough comparative studies of the performance of our model against other state-of-the-art RNN-based baselines, assessed under standard evaluation metrics for the time series forecasting problem.

Experimental datasets & setups

Dataset descriptions & usage

To ensure the suitability of the proposed model for real-world application, we mainly utilize realistic daily reports of confirmed Covid-19 cases within Vietnam, in the form of time series data resources. The experimental datasets in this paper are collected from the official source of the VMH. Two main datasets are used for all experiments in this paper. VN-62P: this dataset contains the daily reported Covid-19 cases of 62 provinces in Vietnam. For each province, we collected the number of new daily Covid-19 infection cases confirmed and reported by the VMH within the period April 27, 2021 - October 21, 2021. We used this dataset to test the capability of deep learning baselines in predicting new infectious Covid-19 cases across the 62 provinces. To train the deep learning predictive baselines, we split the set of data entries of each province into two parts: a training set (70 %) and a test set (30 %). The training set is used to train and validate the given predictive model; the trained model is then evaluated on the test set under different evaluation metrics. HCMC-12D: similar to the VN-62P dataset, we also collected a set of Covid-19 reports for Ho Chi Minh City (HCMC), provided by the VMH, which includes the daily reported number of infection cases in 12 large and highly populated districts of HCMC. This time series dataset covers the time range June 22, 2021 - October 19, 2021. As with the VN-62P dataset, deep learning based predictive techniques are trained to predict the number of new Covid-19 cases at specific time steps, and the set of data entries in each district is divided into training and test sets.
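Because the data are temporal, the 70/30 split above must be chronological rather than random; a minimal sketch (the function name and the toy series are ours, and the exact per-region cut points in the paper are those reported in Table 3):

```python
def chronological_split(series, train_frac=0.7):
    """Chronological train/test split for one region's daily counts:
    the earlier fraction trains the model, the remainder tests it.
    No shuffling, so the test set always lies in the future."""
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]

daily_cases = list(range(177))          # stand-in for one VN-62P province (177 entries)
train, test = chronological_split(daily_cases)
# Every training entry precedes every test entry in time.
```

Shuffled splits would leak future information into training, which is why time-series evaluation keeps the test period strictly after the training period.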

Dataset usage for experiments

To evaluate the performance of the long-range Covid-19 case prediction task on the two datasets, we applied a time-dependent splitting strategy: the training set is taken from the period April 27, 2021 to the end of the week ending September 5, 2021 for the VN-62P dataset, and from June 22, 2021 to September 5, 2021 for the HCMC-12D dataset. For the experiments with the different traditional and deep learning based techniques applied to the Covid-19 prediction task, we applied the same train/test split ratio throughout: the training set is used to learn and extract temporal features from the historical observations, which are then applied to predict the future number of Covid-19 cases on the test set. Detailed information and statistics for these two datasets, as well as their usage in all experiments in our paper, can be found in Table 3.
Table 3

Detailed information and statistics of VN-62P and HCMC-12D datasets.

Dataset | Number of evaluated regions | Data reported time range | No. data entries per region | Training size | Testing size
VN-62P | 62 | April 27, 2021 - October 21, 2021 | 177 | 129 | 48
HCMC-12D | 12 | June 22, 2021 - October 19, 2021 | 110 | 64 | 46

Configurations & evaluation methods

Experimental environment & configurations. To implement the DAttAE model, we mainly used the Python programming language with the PyTorch machine learning framework. We set up our DAttAE model and the other traditional and deep learning based comparative baselines for all experiments on the same computer, with an Intel Xeon E5-2620 v4 2.10 GHz (8 cores, 16 threads) CPU and 64 GB of memory. Setups of our proposed DAttAE model for training and evaluation. For the Bi-LSTM based auto-encoding mechanism (described in sub-section 3.2.1), we set the number of LSTM-based cells for each Bi-LSTM architecture to 32. For the experiments with the LSTM/Bi-LSTM related techniques described later in sub-section 4.1.3, we used the same configured number of hidden states for all datasets. For all RNN-based predictive techniques applied to the long-range Covid-19 prediction task, the default observation/look-back length (L) is set to 5 for all datasets. For the setup of the dual self-supervised attention mechanism (described in sub-section 3.2.2), the default weighting parameter values of all attention layers are initialized using Xavier initialization. The hidden layer size of these attention-based architectures is set to match the number of cells configured for the encoder and decoder components. Table 4 lists the detailed parameters configured for our DAttAE model in all experiments.
Table 4

List of detailed configurations for our DAttAE model.

Parameter | Value
Number of LSTM-based cells used in both encoder and decoder components (kLSTM) | 32
Default observation/look-back length (L) for each input sequence | 5
Training (%) / validation (%) split ratio | 90/10
Default number of training epochs | 500
Default dropout rate for all neural architectures in our model | 0.5
Default learning rate (η) for all datasets | 1×10^-2
Default model optimizer | Adam (weight decay: 5×10^-2)
Experimental evaluation criteria. Following recent works (Chimmula and Zhang, 2020, Hu et al., 2020, Zeroual et al., 2020), to evaluate the performance of the different deep learning based time series predictive models, we use two main standard evaluation metrics: Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). These metrics assess how far a given predictive model's outputs deviate from the actual values of a time series dataset; they are calculated on a given test set as shown in equations (7) and (8). For each time series predictive model on each dataset, we ran the experiments 10 times and report the average output of each model as the final result. In our experiments, we also conducted the extended Covid-19 zone categorization problem in each assessed geographical region (provinces in the VN-62P dataset and districts in the HCMC-12D dataset), which is treated as a classification problem. To evaluate the accuracy of this Covid-19 risk zone categorization task, we mainly used two metrics, Accuracy and the F-1 measure (as shown in equations (9) and (10)). The values of precision (P) and recall (R) are calculated by identifying the true positives (TP), false positives (FP) and false negatives (FN). Specifically, TP denotes the number of regions correctly classified with their ground-truth Covid-19 risk zone labels. FP and FN indicate, respectively, the number of regions incorrectly assigned to a specific Covid-19 risk zone class, and the number of regions not classified with their actual ground-truth Covid-19 risk zone labels.
To identify the Covid-19 risk zone labels for each region (district/province) in each dataset, we mainly relied on the standard assessment criteria of the VMH mentioned in Table 2. Due to the lack of information on the vaccination rate in each region, we only considered the daily/weekly number of new reported Covid-19 infection cases to identify the ground-truth/predicted labels for each region.
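The metrics above are standard; a minimal sketch of MAE, RMSE and the per-class F-1 from the TP/FP/FN definitions in the text (function names are ours):

```python
import numpy as np

def mae(y, y_hat):
    """Mean Absolute Error, as in equation (7)."""
    return np.mean(np.abs(np.asarray(y) - np.asarray(y_hat)))

def rmse(y, y_hat):
    """Root Mean Squared Error, as in equation (8)."""
    return np.sqrt(np.mean((np.asarray(y) - np.asarray(y_hat)) ** 2))

def f1(true_labels, pred_labels, positive):
    """Per-class F-1 score from TP/FP/FN counts, as in equations (9)-(10)."""
    pairs = list(zip(true_labels, pred_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```

For instance, predicting [1, 2, 5] against the actual [1, 2, 3] gives an MAE of 2/3; a region set with one correct "red" label and one "green" region mislabelled "red" gives precision 0.5, recall 1.0, and F-1 2/3.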

Comparative baselines

To demonstrate the effectiveness of our proposed DAttAE model in comparison with recent state-of-the-art deep learning based predictive models for time series data, we implemented several techniques for predicting Covid-19 infection cases on the VN-62P and HCMC-12D datasets. Naïve Bayes (NB): to further evaluate how a traditional probabilistic model handles the time-series based Covid-19 prediction problem, we implemented a simple NB-based regression model that produces predictions through an additive decomposition approach. ARIMA: the most well-known traditional approach for the time series forecasting problem. In this study, we implemented the ARIMA model to predict the number of Covid-19 cases in the VN-62P and HCMC-12D datasets, using it for the long-range time-dependent prediction task. GRU (Zeroual et al., 2020): a classical RNN-based neural architecture in which neural state cells are organized in a sequence-ordered structure to enable the modelling and representation learning of sequence/time-series data. Each GRU cell contains two types of logic gates, reset and update, which control how much of each data entry's sequential information is preserved as it passes through. For the implementation of the GRU based predictive model in our experiments, the number of neural state cells is set the same as in our DAttAE model. LSTM (Zeroual et al., 2020): a more sophisticated sequence-ordered neural network architecture designed to prevent problems related to vanishing/exploding gradients when handling long-range-dependent sequence/time-series datasets. Unlike the GRU, the LSTM contains three types of logic gates: input, forget and output.
The advanced design of the LSTM extends its capability to preserve long-term dependencies between input sequences, giving it great ability in sequence/time-series data analysis and representation learning problems. Bi-LSTM (Zeroual et al., 2020): an enhanced version of the LSTM in which two LSTM-based architectures learn the sequential representation of all data entries in each input sequence in both directions (forward/backward). The last concatenated hidden state vector of each sequence is then used to predict the next data entry value. For all experiments, the Bi-LSTM architecture is set up with the same configuration as the encoder part of our proposed DAttAE model. COVID-ANN (Wieczorek, Siłka, & Woźniak, 2020): in the same vein of applying a deep learning based approach to the Covid-19 forecasting problem, Wieczorek et al. proposed a stacked, fully connected neural architecture, similar to the traditional multi-layered perceptron (MLP), to capture temporal information from time-series based Covid-19 datasets. In that work, the authors utilized 7 stacked linear layers with activation functions between them; the hidden states of the last layer are fed into a ReLU activation to conduct prediction. For our comparative studies, we constructed the same neural architecture to predict the number of Covid-19 cases. COVID-RNN (Chatterjee et al., 2020): a recent deep learning approach to the Covid-19 prediction problem within the recurrent-neural-network based temporal learning paradigm. In this work, Chatterjee et al. proposed a novel Covid-19 predictive model composed of multiple LSTM-based architectures.
The dynamic temporal representations of the historical observation entries, aggregated from the hidden states of the different LSTM based neural networks, are then fed into a task-driven layer to conduct prediction. Chatterjee et al. utilized LSTM architectures with different numbers of layers to conduct the Covid-19 prediction problem in different Asian countries, such as India, Iran, Singapore and Turkey. For the experiments in this paper, we implemented a 3-layered LSTM architecture to handle the routine Covid-19 case forecasting task on our datasets. All other configurations of these comparative techniques are set the same as the DAttAE model's configurations shown in Table 4.

Experimental results & discussions

Daily number of Covid-19 infection case forecasting task

In this experimental section, we present extensive comparative studies between different deep learning based predictive models handling the daily Covid-19 infection case forecasting problem on the VN-62P and HCMC-12D datasets. Routine Covid-19 infectious case forecasting is a challenging task on non-smooth time-series databases such as VN-62P and HCMC-12D. Fig. 3 and Fig. 4 show the average experimental outputs for the daily Covid-19 case prediction problem for the different traditional and deep learning based techniques, on the VN-62P and HCMC-12D datasets, respectively. The experimental outputs shown in these figures are evaluated under the standard MAE and RMSE metrics. In general, a brief look at the output charts shows that our proposed DAttAE model outperforms most recent baselines for the Covid-19 case forecasting problem.
Fig. 3

Average experimental results in terms of MAE and RMSE evaluation metrics through different baselines for daily Covid-19 case prediction task in 62 provinces of Vietnam (VN-62P dataset).

Fig. 4

Average experimental results in terms of MAE and RMSE evaluation metrics through different baselines for daily Covid-19 case prediction task in 12 districts of HCMC, Vietnam (HCMC-12D dataset).

In general, as shown by the experimental outputs (Fig. 3 and Fig. 4), our proposed DAttAE model achieved significantly better performance than traditional time-series and probabilistic/regression models such as NB and ARIMA. It achieved better MAE/RMSE performance than NB and ARIMA by about 116.21 %/104.51 % and 135.64 %/120.77 % on the VN-62P dataset (Fig. 3), and similarly by 79.33 %/66.52 % and 53.19 %/50.7 % on the HCMC-12D dataset (Fig. 4), respectively. Generally, these experimental outputs demonstrate that deep learning based approaches outperform classical probabilistic and regression methods, as better long-range time-dependent features are captured by deep neural architectures. Compared with other common sequential deep neural baselines, including GRU, LSTM and Bi-LSTM, our proposed DAttAE model also demonstrated significant improvements in terms of the MAE and RMSE metrics. Specifically, for the Covid-19 case prediction task on the VN-62P dataset, our proposed DAttAE model remarkably outperforms previous RNN-based architectures (GRU, LSTM and Bi-LSTM) by about 136.51 %, 116.67 % and 17.97 % in terms of MAE, and 117.76 %, 99.62 % and 14.26 % in terms of RMSE, respectively. Similarly, on the HCMC-12D dataset our approach also notably improves the MAE/RMSE performance by about 19.72 %/20.95 %, 80.21 %/67.84 % and 64.23 %/59.16 % compared with Bi-LSTM, LSTM and GRU, respectively.
In addition, compared with recent deep learning based methods for the Covid-19 case prediction problem such as COVID-ANN and COVID-RNN, our proposed DAttAE model also improved the MAE/RMSE performance by approximately 86.85 %/75.23 % and 4.67 %/5.98 % on the VN-62P dataset, and 24.7 %/29.53 % and 10.21 %/12.85 % on the HCMC-12D dataset, respectively. Moreover, as shown in some forecasting outputs (Fig. 5 and Fig. 6) for the VN-62P and HCMC-12D datasets, our proposed DAttAE model performs better predictions than the traditional Bi-LSTM architecture. In general, the ground-truth daily reported Covid-19 cases fluctuate extremely, which illustrates how challenging the routine infectious case prediction problem is. The experimental outputs show that our daily predicted values are closer to the actual values than the Bi-LSTM based predictions. By utilizing the custom dual attention mechanism in the Covid-19 data representation learning and prediction tasks, our DAttAE model achieves significantly better sequential representations of the Covid-19 case observation entries in both VN-62P and HCMC-12D and delivers nearly-fitting predictions. Through these experiments on Covid-19 data in Vietnam, the experimental outputs demonstrate the effectiveness of our proposed DAttAE model in handling the Covid-19 outbreak forecasting problem as a time-series prediction approach.
Fig. 5

Results of Covid-19 case prediction in some high-risked Covid-19 cities/provinces of Vietnam through the Bi-LSTM and DAttAE models in VN-62P dataset.

Fig. 6

Results of Covid-19 case prediction in some high-risked Covid-19 districts of HCMC, Vietnam through the Bi-LSTM and DAttAE models in the HCMC-12D dataset.


Daily/weekly Covid-19 risk zone classification task

For the Covid-19 risk zone identification problem (described in sub-section 2.1), this section presents extensive comparative studies on using deep neural time-series predictive techniques for Covid-19 risk zone classification. For the VN-62P dataset, zones are classified for the 62 provinces of Vietnam; similarly, for the HCMC-12D dataset, zones are classified for 12 large, highly populated districts of HCMC, Vietnam. To identify Covid-19 zones according to the criteria described in the previous section, and analogously to the daily Covid-19 case prediction problem, we used the trained deep learning models, including the proposed DAttAE model, to predict the daily number of infection cases in each region (provinces for the VN-62P dataset and districts for the HCMC-12D dataset). We then relied on the assessment criteria for Covid-19 risk zone categorization in Table 2 to assign the corresponding classes (red, orange, yellow and green zones) to all regions at specific time-steps. Finally, the risk zone categorization results are evaluated with the Accuracy and F-1 metrics to compare the different baselines. We evaluated each model's Covid-19 risk zone classification performance at two time intervals: daily and weekly. Fig. 7 and Fig. 8 show the experimental outputs for the daily and weekly Covid-19 risk zone classification tasks for the different techniques on the VN-62P and HCMC-12D datasets, respectively. As these outputs show, the proposed DAttAE model also slightly improves the accuracy of the Covid-19 risk-zone classification task on both the VN-62P and HCMC-12D datasets.
More specifically, the proposed DAttAE model achieves better F-1 scores than GRU, LSTM and Bi-LSTM by approximately 3.92 %, 2.79 % and 1.57 % on the VN-62P dataset, and by 12.33 %, 14.15 % and 1.53 % on the HCMC-12D dataset, respectively. Compared with recent deep learning based studies such as COVID-ANN and COVID-RNN, the proposed DAttAE model also performs better on this task, by about 5.41 % and 1.79 % on the VN-62P dataset and 5.06 % and 8.01 % on the HCMC-12D dataset. As in the earlier empirical studies on the Covid-19 case forecasting problem (sub-section 4.2.1), the experimental results in this section confirm the effectiveness of integrating an attention mechanism into the representation learning of long, varying sequences for the daily/weekly Covid-19 risk-zone classification problem.
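The pipeline described above — mapping predicted incidence to a zone label and then scoring the labels with Accuracy and macro F-1 — can be sketched as follows. The thresholds below are hypothetical placeholders; the paper's actual categorization criteria are those of its Table 2, which are not reproduced here, and the incidence values are illustrative.

```python
import numpy as np

# Hypothetical zone thresholds (e.g. new cases per 100k per week); the
# paper's real criteria are defined in its Table 2.
ZONES = ["green", "yellow", "orange", "red"]
THRESHOLDS = [20, 50, 150]  # upper bounds for green / yellow / orange

def classify_zone(incidence):
    """Map an incidence value (actual or model-predicted) to a zone label."""
    for bound, zone in zip(THRESHOLDS, ZONES):
        if incidence < bound:
            return zone
    return "red"

def accuracy(y_true, y_pred):
    return float(np.mean([t == p for t, p in zip(y_true, y_pred)]))

def macro_f1(y_true, y_pred, labels=ZONES):
    """Macro-averaged F-1 over zone classes present in truth or predictions."""
    f1s = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        if tp == fp == fn == 0:
            continue  # class absent everywhere: skip rather than score 0
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return float(np.mean(f1s))

# Zones derived from actual vs. model-predicted incidence in six regions
true_zones = [classify_zone(v) for v in [10, 35, 90, 200, 48, 160]]
pred_zones = [classify_zone(v) for v in [14, 42, 130, 180, 55, 140]]
print("accuracy:", round(accuracy(true_zones, pred_zones), 3))
print("macro-F1:", round(macro_f1(true_zones, pred_zones), 3))
```

Note that classification accuracy can stay high even when raw case predictions are off, as long as the predicted incidence lands in the same threshold band as the truth — one reason zone classification is evaluated separately from MAE/RMSE.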
Fig. 7

Average daily Covid-19 risk zone classification results in terms of Accuracy and F-1 metrics for 62 provinces in Vietnam and 12 districts in HCMC, Vietnam through different techniques.

Fig. 8

Average weekly Covid-19 risk zone classification results in terms of Accuracy and F-1 metrics for 62 provinces in Vietnam and 12 districts in HCMC, Vietnam through different techniques.


Model evaluation on other time-series Covid-19 dataset

To further evaluate the proposed DAttAE model on a different time-series based Covid-19 dataset, we conducted extensive experiments on a popular dataset of reported Covid-19 confirmed cases in different states of the United States (US), named US-Covid19. This dataset contains several types of information related to reported deaths and infections in time-series form, and is routinely updated with data collected by the well-known Johns Hopkins University Center for Systems Science and Engineering (CSSE) [9]. For these experiments, we extracted the number of confirmed Covid-19 cases in the five most populous US states: California, Texas, Florida, New York and Pennsylvania. The extracted confirmed-case data for these states cover the period from April 16, 2021 to January 15, 2022. For the infectious-case prediction problem on this dataset, we used several deep learning based methods: LSTM, Bi-LSTM, COVID-ANN, COVID-RNN and the proposed DAttAE model. We applied the same train/test data splitting strategy as for the previous datasets, along with the general configurations (described in Table 4), to all deep learning techniques studied in the US-Covid19 experiments. The average prediction performance over all states, in terms of the MAE and RMSE evaluation metrics, is reported in Fig. 9.
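A typical preprocessing step for the CSSE-style data is aggregating county rows to the state level and differencing the cumulative counts into daily new cases. The sketch below uses a small toy frame that mimics the general layout of the JHU CSSE US time series (one row per county, cumulative confirmed counts in date columns); the column names and values are illustrative assumptions, not the real file.

```python
import pandas as pd

# Toy frame mimicking the JHU CSSE US time-series layout: one row per county,
# cumulative confirmed counts in the date columns (metadata columns omitted).
df = pd.DataFrame({
    "Province_State": ["California", "California", "Texas"],
    "Admin2": ["Los Angeles", "San Diego", "Harris"],
    "4/16/21": [100, 40, 70],
    "4/17/21": [130, 55, 90],
    "4/18/21": [175, 60, 140],
})

date_cols = [c for c in df.columns if c not in ("Province_State", "Admin2")]

# Sum counties into state-level cumulative series, then difference the
# cumulative counts to recover daily new cases (first day has no predecessor).
state_cum = df.groupby("Province_State")[date_cols].sum()
daily_new = state_cum.diff(axis=1).iloc[:, 1:]

print(daily_new)
```

The resulting `daily_new` frame (one row per state, one column per day of new cases) is the kind of series fed to the time-series models compared in Fig. 9.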
Fig. 9

Comparative studies between different deep learning based techniques within the US-Covid19 dataset.

As the final experimental results in Fig. 9 show, the proposed DAttAE model is effective and outperforms the other deep learning based time-series predictive models. More specifically, the proposed DAttAE model achieves better MAE/RMSE performance for Covid-19 confirmed-case prediction by about 34.6 %/36.3 %, 20.84 %/18.61 %, 29.01 %/27.63 % and 14.5 %/14.89 % compared with LSTM, Bi-LSTM, COVID-ANN and COVID-RNN, respectively. This extensive comparison confirms the efficiency of utilizing the dual attention mechanism within an RNN-based architecture for handling non-smooth and chaotic time-series datasets such as routinely reported Covid-19 infection cases.

Parameter sensitivity analysis

To further study the influence of different model parameters on the accuracy of the proposed DAttAE model, this section presents extensive ablation studies on its important fine-tuning parameters, such as the number of training epochs, the number of LSTM cells used in the Bi-LSTM based auto-encoding architecture, and the length of the observation sequence for the Covid-19 case forecasting problem. As the experimental outputs in Fig. 10 show, the proposed DAttAE model is quite insensitive to these parameters. Specifically, for the experiments on the number of training epochs, we trained our model with epoch counts in the range [10, 500] and reported the changes in Covid-19 case prediction accuracy in terms of the RMSLE metric. The outputs in Fig. 10-A show that the DAttAE model's accuracy stabilizes within 400–500 training epochs on both the VN-62P and HCMC-12D datasets. For the number of LSTM cells, the experiments in Fig. 10-B indicate that the ideal number for our model is about 32. Similarly, for the default observation-sequence length, the proposed DAttAE model reaches its highest accuracy within a particular range of lengths on both the VN-62P and HCMC-12D datasets (Fig. 10).
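The observation-sequence-length parameter studied above controls how the raw daily series is cut into supervised (input window, next value) pairs. The following sketch, with hypothetical case counts, shows how varying that length changes the training data handed to the model; the helper name is our own, not from the paper.

```python
import numpy as np

def make_windows(series, obs_len):
    """Build (X, y) pairs: each input is `obs_len` consecutive observations,
    the target is the next value — the setup varied in the sensitivity study."""
    series = np.asarray(series, float)
    X = np.stack([series[i:i + obs_len] for i in range(len(series) - obs_len)])
    y = series[obs_len:]
    return X, y

daily_cases = [3, 8, 14, 25, 41, 60, 84, 120]  # hypothetical daily counts

for obs_len in (2, 3, 4):
    X, y = make_windows(daily_cases, obs_len)
    print(f"obs_len={obs_len}: {X.shape[0]} pairs, window shape {X.shape[1:]}")
```

A longer window gives each prediction more temporal context but yields fewer training pairs from the same series, which is the trade-off the sensitivity study probes.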
Fig. 10

Extensive experiments on the model’s parameter sensitivity for Covid-19 case forecasting problem within VN-62P and HCMC-12D datasets.


Conclusion & future works

In this paper, we studied the problems of routine Covid-19 infection case forecasting and risk zone classification through deep learning based analysis and representation learning on time-series data. Accurate daily Covid-19 case forecasting and risk zone categorization provide crucial information that allows governments to plan social resources effectively and to impose suitable policies to prevent pandemic escalation in different regions. To effectively model and retain the temporal and growth patterns of time-series based reported Covid-19 confirmed cases, we proposed a novel Bi-LSTM based auto-encoding architecture with a custom dual self-supervised attention mechanism, called DAttAE. The proposed DAttAE model not only sufficiently preserves the temporal information of complex/non-smooth time-series data, but also, through the self-supervised attention mechanism, keeps the learnt sequential embedding vectors ready for short-term prediction under chaotic noise/disturbance. Applying the attention mechanism to the sequential embedding vectors produced by the encoder and decoder helps to estimate the importance of each data entry in an input sequence. The computed attention weights of the input sequences are then used to explicitly guide the forecasting-driven fine-tuning process. Combining an attention mechanism with an RNN-based architecture such as Bi-LSTM makes it possible to better preserve longer chaotic time series, whose temporal latent features are then sufficiently captured by deeper neural network architectures. Extensive experiments on real-world reported Covid-19 datasets in Vietnam demonstrated the effectiveness of the proposed ideas.
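The core idea restated above — scoring every per-time-step embedding, normalizing the scores into attention weights, and forming a weighted summary — can be sketched in a few lines. This is a generic attention-pooling illustration with random stand-in hidden states and a simple dot-product scoring vector, not the paper's exact dual-attention formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(hidden_states, w):
    """Score each time step's hidden state, normalize with softmax, and
    return the attention-weighted summary vector plus the weights."""
    scores = hidden_states @ w          # (T,) importance score per time step
    weights = softmax(scores)           # (T,) non-negative, summing to 1
    context = weights @ hidden_states   # (d,) weighted combination of states
    return context, weights

T, d = 10, 8                            # sequence length, hidden size
H = rng.standard_normal((T, d))         # stand-in for Bi-LSTM hidden states
w = rng.standard_normal(d)              # scoring vector (learnable in practice)

context, weights = attention_pool(H, w)
print("weights sum to:", round(float(weights.sum()), 6))
print("context shape:", context.shape)
```

Because the weights sum to one, the summary vector stays on the same scale as the hidden states while letting influential time steps dominate, which is what allows the downstream fine-tuning to de-emphasize noisy or lagged observations.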

Achievements/advantages of our work & remaining shortcomings

Our studies in this paper focused mainly on applying a deep learning approach to the Covid-19 pandemic outbreak prediction problem. The proposed model might help governments to forecast effectively and to optimize multi-sector social resources for preventing the spread of the pandemic. To deal effectively with the complex and non-smooth fluctuations of routinely reported infectious cases and to deliver more accurate predictions, we applied a custom dual attention mechanism within our Bi-LSTM based sequential auto-encoding model to efficiently reduce noise and lagged observations in the reported Covid-19 data. The effectiveness of the proposed ideas has been demonstrated through extensive experiments on real-world Covid-19 datasets. However, the proposed DAttAE model is still unable to combine the routinely reported Covid-19 data with other associated information sources, such as external environmental/geographical aspects, to achieve better prediction results. Environmental/geographical aspects such as temperature, contamination and geographical location (Bashir et al., 2020, Rasheed et al., 2021) are considered important direct/indirect factors that might drive the rise in Covid-19 infections. Therefore, the capability of Covid-19 predictive models to integrate exogenous/external information sources could be considered a promising direction for further studies in this domain.

Our future work

For future work, we intend to extend our studies to the GIS-based spatial clustering problem in a time-series context, in order to identify the spread patterns of Covid-19 hotspots within specific geographical regions. These extensions require additional research on integrating geographical spatial clustering techniques with temporal representation learning, so as to handle clustering under temporal dynamism. In addition, to improve the accuracy of the proposed DAttAE model on chaotic time-series prediction tasks, we intend to extend the current model with a fuzzy-neural inference mechanism (Han et al., 2018, Soto et al., 2019) within the RNN-based architecture. Fuzzy-neural inference might boost prediction performance on chaotic time-series datasets such as routinely reported Covid-19 cases.

CRediT authorship contribution statement

Phu Pham: Methodology, Software, Writing – original draft. Witold Pedrycz: Writing – review & editing, Validation. Bay Vo: Writing – review & editing, Validation, Conceptualization.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

1.  Time-series forecasting with deep learning: a survey.

Authors:  Bryan Lim; Stefan Zohren
Journal:  Philos Trans A Math Phys Eng Sci       Date:  2021-02-15       Impact factor: 4.226

2.  Interval Type-2 Fuzzy Neural Networks for Chaotic Time Series Prediction: A Concise Overview.

Authors:  Min Han; Kai Zhong; Tie Qiu; Bing Han
Journal:  IEEE Trans Cybern       Date:  2018-05-28       Impact factor: 11.448

3.  Correlation between environmental pollution indicators and COVID-19 pandemic: A brief study in Californian context.

Authors:  Muhammad Farhan Bashir; Ben Jiang Ma; Bushra Komal; Muhammad Adnan Bashir; Taimoor Hassan Farooq; Najaf Iqbal; Madiha Bashir
Journal:  Environ Res       Date:  2020-05-13       Impact factor: 6.498

4.  COVID-19 with spontaneous pneumomediastinum.

Authors:  Changyu Zhou; Chen Gao; Yuanliang Xie; Maosheng Xu
Journal:  Lancet Infect Dis       Date:  2020-03-09       Impact factor: 25.071

5.  Towards Providing Effective Data-Driven Responses to Predict the Covid-19 in São Paulo and Brazil.

Authors:  Fabio Amaral; Wallace Casaca; Cassio M Oishi; José A Cuminato
Journal:  Sensors (Basel)       Date:  2021-01-13       Impact factor: 3.576

6.  Socio-economic and environmental impacts of COVID-19 pandemic in Pakistan-an integrated analysis.

Authors:  Rizwan Rasheed; Asfra Rizwan; Hajra Javed; Faiza Sharif; Asghar Zaidi
Journal:  Environ Sci Pollut Res Int       Date:  2021-01-06       Impact factor: 4.223

7.  Application of a new hybrid model with seasonal auto-regressive integrated moving average (ARIMA) and nonlinear auto-regressive neural network (NARNN) in forecasting incidence cases of HFMD in Shenzhen, China.

Authors:  Lijing Yu; Lingling Zhou; Li Tan; Hongbo Jiang; Ying Wang; Sheng Wei; Shaofa Nie
Journal:  PLoS One       Date:  2014-06-03       Impact factor: 3.240

8.  Understanding Unreported Cases in the COVID-19 Epidemic Outbreak in Wuhan, China, and the Importance of Major Public Health Interventions.

Authors:  Zhihua Liu; Pierre Magal; Ousmane Seydi; Glenn Webb
Journal:  Biology (Basel)       Date:  2020-03-08

9.  Early dynamics of transmission and control of COVID-19: a mathematical modelling study.

Authors:  Adam J Kucharski; Timothy W Russell; Charlie Diamond; Yang Liu; John Edmunds; Sebastian Funk; Rosalind M Eggo
Journal:  Lancet Infect Dis       Date:  2020-03-11       Impact factor: 25.071

