Leila Ismail, Huned Materwala, Yousef Al Hammadi, Farshad Firouzi, Gulfaraz Khan, Saaidal Razalli Bin Azzuhri.
Abstract
COVID-19 is a contagious disease that has infected over half a billion people worldwide. Due to the rapid spread of the virus, countries are struggling to cope with the growth in infections. In particular, healthcare organizations face difficulties in efficiently provisioning medical staff, equipment, hospital beds, and quarantine centers. Machine and deep learning models have been used to predict infections, but selecting a suitable model is challenging for a data analyst. This paper proposes an automated Artificial Intelligence-enabled, proactive-preparedness, real-time system that selects a learning model based on the temporal distribution of the evolution of infection. The proposed system integrates a novel methodology for determining the suitable learning model, producing an accurate forecasting algorithm with no human intervention. Numerical experiments and a comparative analysis were carried out between the proposed and state-of-the-art approaches. The results show that the proposed system predicts infections with 72.1% lower Mean Absolute Percentage Error (MAPE) and 65.2% lower Root Mean Square Error (RMSE) on average than state-of-the-art approaches.
Keywords: COVID-19 infection prediction; automated artificial intelligence (Auto-AI); coronavirus; deep learning; healthcare; machine learning; performance evaluation; time series
Year: 2022 PMID: 36111116 PMCID: PMC9468324 DOI: 10.3389/fmed.2022.871885
Source DB: PubMed Journal: Front Med (Lausanne) ISSN: 2296-858X
Summary of COVID-19 infection prediction using time series machine learning and deep learning algorithms.
| Work | Considered countries | Considered algorithms | Justification for algorithm selection | Considered period for developing the algorithm | Considered period for validating the algorithm | Outperforming algorithm |
| Ahmar and Del Val | Spain | ARIMA and SutteARIMA | NR | 02/12–04/02 2020 | 04/03–04/09 2020 | SutteARIMA |
| Gecili et al. | United States and Italy | HLT, ARIMA, TBATS, and cubic smoothing spline | NR | 02/22–04/29 2020 | 02/22–04/29 2020 | ARIMA |
| Shahid et al. | Brazil, Germany, Italy, Spain, United Kingdom, China, India, Israel, Russia, and United States | ARIMA, SVR, LSTM, Bi-LSTM, GRU | NR | 01/22–05/10 2020 | 05/11–06/27 2020 | Bi-LSTM |
| Ayoobi et al. | Australia and Iran | LSTM, Bi-LSTM, Convolutional LSTM, Bi-Convolutional LSTM, GRU, Bi-GRU | NR | (Australia) | (Australia) | LSTM (Australia) |
| Ceylan | Italy, Spain, and France | ARIMA | Widely used in literature | 02/21–04/15 2020 | NA | ARIMA |
| Singh et al. | Malaysia | ARIMA | Widely used in literature | 01/22–03/31 2020 | 04/01–04/17 2020 | ARIMA |
| Alzahrani et al. | Saudi Arabia | AR, MA, ARMA, ARIMA | Accurate for other countries | 03/02–04/20 2020 | NA | ARIMA |
AR, AutoRegressive; ARIMA, AutoRegressive Integrated Moving Average; ARMA, AutoRegressive Moving Average; Bi-LSTM, Bidirectional Long Short-Term Memory; GRU, Gated Recurrent Unit; HLT, Holt’s Linear Trend; LSTM, Long Short-Term Memory; MA, Moving Average; NR, Not Reported; NA, Not Applicable; SVR, Support Vector Regression; TBATS, Trigonometric Exponential smoothing state-space model with Box-Cox transformation.
FIGURE 1 (A) Workflow of the proposed automated artificial intelligence-enabled system for infection prediction, (B) architecture of the long short-term memory (LSTM) cell and the bidirectional LSTM (Bi-LSTM) network used in the proposed system for infection prediction, and (C) request-response workflow in the proposed system.
FIGURE 2 Implementation of the proposed real-time prediction system.
Characteristics of the COVID-19 dataset used in the experiments.
| Countries | Features | Update frequency | Considered period for the COVID-19 infections |
| Australia, Brazil, China, France, Germany, India, Iran, Israel, Italy, Malaysia, Russia, Saudi Arabia, Spain, United Kingdom, and United States | Province/state, country/region, last update, number of confirmed cases, number of recovered cases, and number of deaths | Daily | 22/01/2020–08/01/2022 |
FIGURE 3 Trend of the COVID-19 infection data for the countries under study.
Prediction models used for the countries under study.
| Country | Infection trend | Automated AI selected model | Model(s) used for comparison |
| China | Exponential + linear | HLT | Bi-LSTM |
| France | Exponential + linear | HLT | ARIMA |
| Germany | Exponential + linear | HLT | Bi-LSTM |
| Italy | Exponential + linear | HLT | ARIMA |
| Malaysia | Exponential + linear | HLT | ARIMA |
| Australia | Polynomial | QT | LSTM |
| Iran | Polynomial | QT | Bi-GRU |
| Russia | Polynomial | QT | Bi-LSTM |
| Spain | Polynomial | QT | SutteARIMA |
| United Kingdom | Polynomial | QT | Bi-LSTM |
| United States | Linear | LT | ARIMA |
| Israel | Linear | LT | Bi-LSTM |
| Brazil | Exponential + damping | DT | Bi-LSTM |
| India | Exponential + damping | DT | Bi-LSTM |
| Saudi Arabia | Exponential + damping | DT | ARIMA |
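The QT and LT models named in the table regress the infection counts on time. The following is a minimal sketch of such a polynomial trend fit, not the authors' implementation: the function names `fit_polynomial_trend` and `forecast_polynomial_trend`, the rescaling of time to [0, 1], and the use of ordinary least squares solved by Gaussian elimination are all assumptions made for illustration. A degree of 1 corresponds to a linear trend.

```python
def fit_polynomial_trend(y, degree):
    """Least-squares fit of y[t] ~ sum_k c[k] * x**k with x = t / scale.

    Hypothetical helper: time is rescaled to [0, 1] to keep the normal
    equations well conditioned (an illustrative choice, not from the source).
    """
    n = len(y)
    if n <= degree:
        raise ValueError("need more observations than the polynomial degree")
    scale = float(max(n - 1, 1))
    xs = [t / scale for t in range(n)]
    m = degree + 1
    # Normal equations A c = b, with A[i][j] = sum x^(i+j), b[i] = sum x^i * y.
    A = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum((x ** i) * v for x, v in zip(xs, y)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            factor = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(A[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs, scale


def forecast_polynomial_trend(coeffs, scale, n_observed, horizon):
    """Extrapolate the fitted polynomial for `horizon` steps past the data."""
    forecasts = []
    for t in range(n_observed, n_observed + horizon):
        x = t / scale
        forecasts.append(sum(c * x ** k for k, c in enumerate(coeffs)))
    return forecasts


# Made-up quadratic series used only to exercise the sketch.
series = [t * t for t in range(10)]
coeffs, scale = fit_polynomial_trend(series, degree=2)
next_value = forecast_polynomial_trend(coeffs, scale, len(series), horizon=1)[0]
```

Because a polynomial extrapolates without bound, forecasts from this kind of model are only meaningful over short horizons.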
Description and parameters of the prediction models used in the experiments.
| Model | Description | Parameters |
| HLT | Forecasts data with a trend by applying exponential smoothing to both the average value of the series (level) and the trend. | Smoothing parameters for level (α) and trend (β) |
| QT | Fits a polynomial relationship between time and the infection data. | Degree of polynomial |
| LT | Fits a linear relationship between time and the infection data. | Not applicable |
| DT | Extends the HLT model with a damping parameter that flattens the steeply increasing HLT forecast toward a flat trend in the future. | Smoothing parameters for level (α), trend (β), and damping parameter (Φ) |
| LSTM | A recurrent neural network capable of learning long-term dependencies. Its main concepts are the cell state and the gates. The cell state acts as a data transmission channel that carries relevant information along the chain of network cells. The gates decide which information to keep or forget, based on its relevance, during training. | Input size, number of neurons, epochs, activation function, and optimizer |
| Bi-LSTM | A recurrent neural network model consisting of two LSTM networks, one processing the input in the forward direction (previous to future timestamps) and the other in the backward direction (future to previous timestamps). | Input size, number of neurons, epochs, activation function, and optimizer |
| Bi-GRU | A bidirectional recurrent neural network consisting of two GRU networks, one taking the input in the forward direction and the other in the backward direction. GRU cells are similar to LSTM cells but use update and reset gates and do not maintain a separate internal cell state. | Input size, number of neurons, epochs, activation function, and optimizer |
| ARIMA | Combines the autoregressive (AR) and moving average (MA) models. | Orders of lag observations (p), differencing (d), and moving average (q) |
| SutteARIMA | Averages the predictions of the alpha-Sutte and ARIMA models. | Orders of lag observations (p), differencing (d), and moving average (q) |
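The HLT and DT rows can be illustrated with the textbook recursions for Holt's linear trend and its damped variant. The sketch below is written under stated assumptions and is not the authors' code: the function name `damped_holt_forecast` is hypothetical, and the initial level and trend are taken from the first two observations, which is only one of several common initialization choices. Setting `phi=1.0` reduces the damped model to plain HLT.

```python
def damped_holt_forecast(y, alpha, beta, phi=1.0, horizon=7):
    """Forecast `horizon` steps ahead with (damped) Holt exponential smoothing.

    alpha: smoothing parameter for the level
    beta:  smoothing parameter for the trend
    phi:   damping parameter (phi = 1.0 gives Holt's linear trend)
    """
    if len(y) < 2:
        raise ValueError("need at least two observations")
    # Assumed initialization: first value as level, first difference as trend.
    level = y[0]
    trend = y[1] - y[0]
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (prev_level + phi * trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    forecasts = []
    damp = 0.0
    for h in range(1, horizon + 1):
        damp += phi ** h  # phi + phi^2 + ... + phi^h
        forecasts.append(level + damp * trend)
    return forecasts


# Made-up, steadily increasing series used only to exercise the sketch.
series = [10 + 3 * t for t in range(10)]
holt = damped_holt_forecast(series, alpha=0.5, beta=0.5, phi=1.0, horizon=3)
damped = damped_holt_forecast(series, alpha=1.0, beta=0.2, phi=0.9, horizon=3)
```

With `phi` below 1, each additional step contributes a smaller share of the trend, so the damped forecast rises more slowly than the HLT forecast for the same data.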
FIGURE 4 Performance of the Holt’s linear trend (HLT) model with varying parameter values for the infection data in (A) China, (B) France, (C) Germany, (D) Italy, and (E) Malaysia.
FIGURE 5 Performance of the damped trend (DT) model with varying parameter values for the infection data in (A) Brazil, (B) India, and (C) Saudi Arabia.
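A parameter sweep of the kind summarized in Figures 4 and 5 can be sketched as a grid search that scores each (α, β) pair on held-out data. The code below is an illustrative assumption, not the procedure used in the paper: the grid of 0.1 to 1.0 in steps of 0.1, the train/validation split, the use of MAPE as the score, and the names `holt_forecast`, `mape`, and `grid_search_holt` are all hypothetical.

```python
def holt_forecast(train, alpha, beta, horizon):
    """Holt's linear trend forecast, initialized from the first two points."""
    level, trend = train[0], train[1] - train[0]
    for obs in train[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def grid_search_holt(train, validation, grid=None):
    """Return (best_mape, alpha, beta) over an assumed grid of candidate values."""
    if grid is None:
        grid = [round(0.1 * k, 1) for k in range(1, 11)]  # 0.1, 0.2, ..., 1.0
    best = None
    for alpha in grid:
        for beta in grid:
            preds = holt_forecast(train, alpha, beta, len(validation))
            score = mape(validation, preds)
            if best is None or score < best[0]:
                best = (score, alpha, beta)
    return best


# Made-up series, split into training and validation parts for illustration.
data = [10 + 3 * t for t in range(25)]
best_score, best_alpha, best_beta = grid_search_holt(data[:20], data[20:])
```

The same loop extends to the damped trend model by adding a third loop over candidate values of Φ.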
FIGURE 6 Training and validation loss vs. epochs for the long short-term memory (LSTM), bidirectional LSTM (Bi-LSTM), and bidirectional gated recurrent unit (Bi-GRU) models after hyperparameter tuning for the infection data in (A) China (Bi-LSTM), (B) Germany (Bi-LSTM), (C) Italy (Bi-LSTM), (D) Australia (LSTM), (E) Iran (Bi-GRU), (F) Russia (Bi-LSTM), (G) Spain (Bi-LSTM), (H) United Kingdom (Bi-LSTM), (I) United States (Bi-LSTM), (J) Israel (Bi-LSTM), (K) Brazil (Bi-LSTM), and (L) India (Bi-LSTM).
FIGURE 7 Autocorrelation function (ACF) and partial autocorrelation function (PACF) plots for the stationary infection data in (A) France, (B) Italy, (C) Malaysia, (D) Spain, (E) United States, and (F) Saudi Arabia.
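ACF plots such as those in Figure 7 are computed on a series that has first been made stationary, commonly by differencing. The sketch below shows one way to difference a series and compute its sample autocorrelation; the function names `difference` and `acf` are hypothetical, and the source does not state how stationarity was obtained, so differencing is an assumption here. The PACF, which needs an additional recursion, is not shown.

```python
def difference(series, order=1):
    """Apply first differencing `order` times (the d in ARIMA(p, d, q))."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (lag 0 is always 1.0)."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


# Made-up cumulative counts, differenced once before computing the ACF.
cumulative = [1, 3, 6, 10, 17, 25, 31, 40, 46, 55]
daily = difference(cumulative, order=1)
correlations = acf(daily, max_lag=3)
```

In standard Box-Jenkins practice, the lag at which the ACF cuts off suggests the moving average order q and the lag at which the PACF cuts off suggests the autoregressive order p.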
Optimal values of parameters obtained after hyperparameter tuning for the models used in the experiments.
| Model | Country | Optimal parameter values |
| HLT | China | α = 0.1, β = 1.0 |
| HLT | France | α = 0.3, β = 0.9 |
| HLT | Germany | α = 1.0, β = 0.1 |
| HLT | Italy | α = 1.0, β = 0.1 |
| HLT | Malaysia | α = 0.1, β = 0.4 |
| QT | Australia | Degree = 5 |
| QT | Iran | Degree = 2 |
| QT | Russia | Degree = 2 |
| QT | Spain | Degree = 3 |
| QT | United Kingdom | Degree = 2 |
| DT | Brazil | α = 1.0, β = 0.2, Φ = 0.99 |
| DT | India | α = 1.0, β = 0.1, Φ = 0.99 |
| DT | Saudi Arabia | α = 0.5, β = 0.1, Φ = 0.99 |
| LSTM | Australia | Input size = 250, neurons = 100, epochs = 500, activation function = ReLU, optimizer = SGD |
| Bi-LSTM | China | Input size = 250, neurons = 100, epochs = 500, activation function = SELU, optimizer = Adamax |
| Bi-LSTM | Germany | Input size = 250, neurons = 100, epochs = 500, activation function = SELU, optimizer = Adadelta |
| Bi-LSTM | Italy | Input size = 250, neurons = 100, epochs = 1,500, activation function = ReLU, optimizer = SGD |
| Bi-LSTM | Russia, Spain, United States, and Brazil | Input size = 250, neurons = 100, epochs = 500, activation function = ReLU, optimizer = Adadelta |
| Bi-LSTM | United Kingdom | Input size = 250, neurons = 100, epochs = 500, activation function = Softsign, optimizer = Adadelta |
| Bi-LSTM | Israel | Input size = 250, neurons = 100, epochs = 500, activation function = ReLU, optimizer = Adam |
| Bi-LSTM | India | Input size = 250, neurons = 100, epochs = 500, activation function = ReLU, optimizer = Nadam |
| Bi-GRU | Iran | Input size = 250, neurons = 100, epochs = 500, activation function = ReLU, optimizer = Adam |
| ARIMA | France | |
| ARIMA | Italy | |
| ARIMA | Malaysia | |
| ARIMA | Spain | |
| ARIMA | United States | |
| ARIMA | Saudi Arabia | |
FIGURE 8 Forecasting of COVID-19 infections using the model selected by the automated artificial intelligence-enabled system and state-of-the-art models in (A) China: Holt’s linear trend (HLT) vs. bidirectional long short-term memory (Bi-LSTM); (B) France: HLT vs. autoregressive integrated moving average (ARIMA); (C) Germany: HLT vs. Bi-LSTM; (D) Italy: HLT vs. ARIMA and Bi-LSTM; (E) Malaysia: HLT vs. ARIMA; (F) Australia: quadratic trend (QT) vs. LSTM; (G) Iran: QT vs. Bi-GRU; (H) Russia: QT vs. Bi-LSTM; (I) Spain: QT vs. ARIMA, SutteARIMA, and Bi-LSTM; (J) United Kingdom: QT vs. Bi-LSTM; (K) United States: linear trend (LT) vs. ARIMA and Bi-LSTM; (L) Israel: LT vs. Bi-LSTM; (M) Brazil: damped trend (DT) vs. Bi-LSTM; (N) India: DT vs. Bi-LSTM; and (O) Saudi Arabia: DT vs. ARIMA.
FIGURE 9 Mean absolute percentage error (MAPE) and normalized root mean squared error (RMSE) of the models selected by the automated artificial intelligence-enabled system and of the state-of-the-art models for the countries under study.
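The two error metrics reported in Figure 9 can be computed as in the following sketch. MAPE and RMSE follow their standard definitions; the normalization of RMSE by the range of the actual values is an assumption, since the source does not state which normalization was used, and the function names are hypothetical.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def normalized_rmse(actual, predicted):
    """RMSE divided by the range of the actual values (assumed normalization)."""
    spread = max(actual) - min(actual)
    if spread == 0:
        raise ValueError("cannot normalize by a zero range")
    return rmse(actual, predicted) / spread


# Made-up values used only to exercise the functions.
actual = [100, 200]
predicted = [110, 180]
error_pct = mape(actual, predicted)
```

MAPE is scale-free, which makes it comparable across countries with very different case counts, while plain RMSE is not; normalizing RMSE serves the same purpose.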
Limitations of time series algorithms.
| Algorithm | Limitation |
| Autoregressive Integrated Moving Average (ARIMA) | Not suitable for infection trends that become linear or dampen over time |
| SutteARIMA | Not suitable for infection trends that increase exponentially |
| Holt’s linear trend | Not suitable for infection trends with seasonality |
| Trigonometric Exponential smoothing state-space model with Box-Cox transformation | Not suitable for infection trends that increase exponentially |
| Cubic smoothing spline | Not suitable for infection trends with large differences in the number of infections between consecutive time intervals |
| Support vector regression | Not suitable for infection trends with randomness |
| Long short-term memory (LSTM), Bi-LSTM, gated recurrent unit (GRU), and Bi-GRU | Time-consuming and memory-intensive; performance is sensitive to the initial values of the hyperparameters |
| Autoregressive and Autoregressive Moving Average | Not suitable for infection trends whose average varies over time |
| Moving average | Can only predict a consistent change in infections over time |