Miguel Díaz-Lozano, David Guijo-Rubio, Pedro Antonio Gutiérrez, Antonio Manuel Gómez-Orellana, Isaac Túñez, Luis Ortigosa-Moreno, Armando Romanos-Rodríguez, Javier Padillo-Ruiz, César Hervás-Martínez.
Abstract
Much research has been carried out with the aim of combating the COVID-19 pandemic since the first outbreak was detected in Wuhan, China. Anticipating the evolution of an outbreak helps to devise suitable economic, social and health care strategies to mitigate the effects of the virus. For this reason, predicting the SARS-CoV-2 transmission rate has become one of the most important and challenging problems of the past months. In this paper, we apply a two-stage mid- and long-term forecasting framework to the epidemic situation in eight districts of Andalusia, Spain. First, an analytical procedure is performed iteratively to fit polynomial curves to the cumulative curve of contagions. Then, the extracted information is used to estimate the parameters and structure of an evolutionary artificial neural network with hybrid architectures (i.e., with different basis functions for the hidden nodes), considering both single and simultaneous time horizon estimations. The results demonstrate that including the polynomial information extracted during the training stage significantly improves the mid- and long-term estimations in seven of the eight considered districts. The increase in average accuracy (for the joint mid- and long-term horizon forecasts) is 37.61% and 35.53% when considering the single and simultaneous forecast approaches, respectively.
Keywords: COVID-19 contagion forecasting; Curve decomposition; Evolutionary artificial neural networks; Time series
Year: 2022 PMID: 35784094 PMCID: PMC9235375 DOI: 10.1016/j.eswa.2022.117977
Source DB: PubMed Journal: Expert Syst Appl ISSN: 0957-4174 Impact factor: 8.665
Fig. 1. Geography of the sanitary districts of Andalusia. The districts including provincial capitals are highlighted in green.
Fig. 2. Daily (a) and cumulative (b) reported positive COVID-19 diagnoses in the district of Córdoba from July , , to August , .
Fig. 3. Overview flowchart describing the applied methodology.
Descriptors of a point of the cumulative curve of contagions.
| Variable | Description |
|---|---|
| Estimated | |
| Estimated | |
| Estimated | |
| Estimated | |
| Cum. contagions after |
Fig. 4. Mean, maximum and minimum evolution of the eight sanitary districts for each wave.
Mean and standard deviation (SD) of the statistics for the fitted polynomial models.
| District | Mean | SD |
|---|---|---|
| Almería | 0.9954 | 0.0043 |
| Cádiz Bay | 0.9961 | 0.0023 |
| Córdoba | 0.9944 | 0.0075 |
| Granada | 0.9947 | 0.0047 |
| Huelva Coast | 0.9932 | 0.0044 |
| Jaén | 0.9929 | 0.0166 |
| Málaga | 0.9959 | 0.0029 |
| Sevilla | 0.9967 | 0.0028 |
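The table above summarises how closely polynomial curves follow the cumulative curves. As a rough illustration of that kind of fit, the sketch below fits a polynomial by ordinary least squares and scores it with the coefficient of determination. The use of R² as the statistic, the cubic degree, the function names and the toy data are all assumptions made for illustration; the record does not specify them, and this is not the authors' iterative procedure.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [c0, c1, ..., c_degree], lowest power first.
    """
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def r_squared(xs, ys, coeffs):
    """Coefficient of determination of the fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - evaluate(coeffs, x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy curve (not data from the study); time is rescaled
# to keep the normal equations well conditioned.
days = [d / 10 for d in range(30)]
cumulative = [2.0 + 0.5 * t + 1.5 * t ** 2 - 0.2 * t ** 3 for t in days]

coeffs = fit_polynomial(days, cumulative, degree=3)
r2 = r_squared(days, cumulative, coeffs)
```

Because the toy curve is itself a cubic, the fit recovers it almost exactly; real cumulative curves would give values below one.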
Inputs included in the EANNs for a given day belonging to a given outbreak. I is the total number of inputs for the generated datasets. Note that the different datasets are named according to the AR models used (either AR or VAR).
| Lags | Dataset name | I | |
|---|---|---|---|
| 2 | |||
| 3 | |||
| 4 | |||
| 3 | |||
| 5 | |||
| 7 | |||
Parameter values that have been used in the EA for all models (PU, RBF and RBFPU for both MoEANNs and MuEANNs).
| Parameter | Value |
|---|---|
| Independent runs | 40 |
| Stopping criteria: | |
| (1) maximum number of generations | 1500 |
| (2) consecutive generations without improving individuals | 10 |
| Population size | 1000 |
| Number of hidden layers of each individual | 1 |
| Minimum number of hidden neurons (initialisation) | 2 |
| Maximum number of hidden neurons (initialisation) | 3 |
| Maximum number of hidden neurons (whole process) | 4 |
| Range of hidden neurons to be added or deleted | |
| Range of links to be added or deleted | |
| Range for weights between input and hidden layer | |
| Range for weights between hidden and output layer | |
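The numeric settings in the table above can be collected into a small configuration object, as sketched below. The class name, field names and the `should_stop` helper are hypothetical, and reading the two stopping criteria as "stop when either one is met" is an assumption. The four range parameters are left out because their values are missing from this record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EAConfig:
    """Evolutionary algorithm settings taken from the parameter table."""
    independent_runs: int = 40
    max_generations: int = 1500   # stopping criterion (1)
    patience: int = 10            # stopping criterion (2): generations without improvement
    population_size: int = 1000
    hidden_layers: int = 1
    min_hidden_init: int = 2
    max_hidden_init: int = 3
    max_hidden_total: int = 4


def should_stop(generation, generations_without_improvement, cfg):
    """Assumed reading of the criteria: stop as soon as either one is satisfied."""
    return (generation >= cfg.max_generations
            or generations_without_improvement >= cfg.patience)


cfg = EAConfig()
```

Keeping the settings in one immutable object makes it easy to share the same values across all model variants, as the table caption states is done.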
Fig. 5. Train and test partitions of the district of Sevilla.
Performances of the MoEANN and MuEANN models trained with the different combinations of input features and autoregressive orders, , evaluating the errors of the forecast horizon. The results are expressed as the RMSE on the generalization set; SD stands for standard deviation.
| District | BF | MoEANN | MuEANN | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Almería | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 |||||||||||
| 2 | ||||||||||||
| Cádiz Bay | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 |||||||||||
| 2 | ||||||||||||
| Córdoba | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 |||||||||||
| 2 | ||||||||||||
| Granada | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 |||||||||||
| 2 | ||||||||||||
| Huelva Coast | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 |||||||||||
| 2 | ||||||||||||
| Jaén | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 |||||||||||
| 2 | ||||||||||||
| Málaga | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 |||||||||||
| 2 | ||||||||||||
| Sevilla | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 |||||||||||
| 2 | ||||||||||||
Performances of the MoEANN and MuEANN models trained with the different combinations of input features and autoregressive orders, , evaluating the errors of the forecast horizon. The results are expressed as the RMSE on the generalization set; SD stands for standard deviation.
| District | BF | MoEANN | MuEANN | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Almería | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 |||||||||||
| 2 | ||||||||||||
| Cádiz Bay | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 |||||||||||
| 2 | ||||||||||||
| Córdoba | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 |||||||||||
| 2 | ||||||||||||
| Granada | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 |||||||||||
| 2 | ||||||||||||
| Huelva Coast | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 |||||||||||
| 2 | ||||||||||||
| Jaén | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 |||||||||||
| 2 | ||||||||||||
| Málaga | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 |||||||||||
| 2 | ||||||||||||
| Sevilla | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 |||||||||||
| 2 | ||||||||||||
Performances of the MoEANN and MuEANN models trained with the different combinations of input features and autoregressive orders, , simultaneously evaluating the errors of the and forecast horizons. The results are expressed as the RMSE on the generalization set; SD stands for standard deviation.
| District | BF | MoEANN | MuEANN | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Almería | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 | |||||||||||
| 2 | ||||||||||||
| Cádiz Bay | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 | |||||||||||
| 2 | ||||||||||||
| Córdoba | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 | |||||||||||
| 2 | ||||||||||||
| Granada | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 | |||||||||||
| 2 | ||||||||||||
| Huelva Coast | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 | |||||||||||
| 2 | ||||||||||||
| Jaén | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 | |||||||||||
| 2 | ||||||||||||
| Málaga | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 | |||||||||||
| 2 | ||||||||||||
| Sevilla | PU | 1 | ||||||||||
| 2 | ||||||||||||
| RBF | 1 | |||||||||||
| 2 | ||||||||||||
| RBF+PU | 1 | |||||||||||
| 2 | ||||||||||||
Comparison of MoEANN and MuEANN model complexities considering double and simultaneous forecasting, . The results are expressed in terms of the average number of links involved in the EANNs of the eight considered districts.
| Model | BF | Input dataset | | | |
|---|---|---|---|---|---|
| MoEANN | PU | 1 | | | |
| | | 2 | | | |
| | RBF | 1 | | | |
| | | 2 | | | |
| | RBF+PU | 1 | 20.81 | 24.49 | 26.80 |
| | | 2 | 22.13 | 29.02 | 33.09 |
| MuEANN | PU | 1 | | | |
| | | 2 | | | |
| | RBF | 1 | | | |
| | | 2 | | | |
| | RBF+PU | 1 | 15.41 | 16.85 | 17.67 |
| | | 2 | 15.82 | 19.17 | 20.59 |
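The only rows in the table above that retain values (the last basis-function row of each model) allow a direct comparison of model size. The sketch below computes the relative reduction in the average number of links when moving from MoEANN to MuEANN. The variable and function names are hypothetical, and the three value columns are treated as unlabeled positions because their headers are missing from this record.

```python
# Average number of links, copied from the table; keys are the input dataset index.
mo_links = {1: [20.81, 24.49, 26.80], 2: [22.13, 29.02, 33.09]}  # MoEANN
mu_links = {1: [15.41, 16.85, 17.67], 2: [15.82, 19.17, 20.59]}  # MuEANN


def relative_reduction(mo, mu):
    """Percentage by which the MuEANN link count falls below the MoEANN one."""
    return 100.0 * (mo - mu) / mo


reductions = {
    dataset: [round(relative_reduction(mo, mu), 2)
              for mo, mu in zip(mo_links[dataset], mu_links[dataset])]
    for dataset in mo_links
}

for dataset, values in reductions.items():
    print(f"input dataset {dataset}: {values} % fewer links with MuEANN")
```

In every listed cell the MuEANN figure is lower than the MoEANN one, so all reductions come out positive.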
Fig. 6. Comparison boxplots of simultaneous forecast results using MuEANN models including () and not including () polynomial information in the training set. The results are expressed as the RMSE obtained for the test set of the executions for each district.
Percentage accuracy gain obtained in simultaneous double-horizon forecasting, and , training the models with the best datasets with respect to the best in all districts.
| District | MoEANN | MuEANN |
|---|---|---|
| Córdoba | 37.43% | 34.52% |
| Huelva Coast | 19.68% | 54.46% |
| Almería | 73.55% | 59.84% |
| Cádiz Bay | 50.97% | 26.72% |
| Sevilla | 48.39% | 46.93% |
| Granada | 35.71% | 50.90% |
| Jaén | 4.28% | |
| Málaga | 30.90% | 18.34% |
| Mean | 37.61% | 35.53% |
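The mean row can be checked against the per-district values. The sketch below recomputes the MoEANN mean from the eight listed gains and, since the MuEANN entry for Jaén is missing from this record, backs out the value that entry would need for the reported MuEANN mean to hold. That second figure rests on the assumption that the reported mean is taken over all eight districts; it is an arithmetic inference, not a reported result. All variable names are hypothetical.

```python
# Percentage accuracy gains copied from the table.
moeann_gain = {
    "Córdoba": 37.43, "Huelva Coast": 19.68, "Almería": 73.55, "Cádiz Bay": 50.97,
    "Sevilla": 48.39, "Granada": 35.71, "Jaén": 4.28, "Málaga": 30.90,
}
# The Jaén entry is absent from the MuEANN column in this record.
mueann_gain_listed = {
    "Córdoba": 34.52, "Huelva Coast": 54.46, "Almería": 59.84, "Cádiz Bay": 26.72,
    "Sevilla": 46.93, "Granada": 50.90, "Málaga": 18.34,
}
reported_mean_mueann = 35.53
n_districts = 8

# Mean of the eight MoEANN gains, to compare with the reported 37.61%.
mean_moeann = sum(moeann_gain.values()) / len(moeann_gain)

# Value the missing entry would need, ASSUMING the mean covers all eight districts.
implied_missing_mueann = (n_districts * reported_mean_mueann
                          - sum(mueann_gain_listed.values()))

print(f"recomputed MoEANN mean: {mean_moeann:.2f}%")
print(f"implied missing MuEANN entry (assumption-based): {implied_missing_mueann:.2f}%")
```

The recomputed MoEANN mean matches the table to two decimals, which supports reading the single surviving Jaén value as the MoEANN entry.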
Fig. 7. Test predictions of the cumulative number of contagions in Sevilla over time (a) and scattered against real values (b) using the best and datasets.
Fig. 8. Test predictions of the cumulative number of contagions in Sevilla over time (a) and scattered against real values (b) using the best MuEANN model trained with and datasets.
Statistical differences between the average mean RMSE test results of the MoEANN and MuEANN best models.
| District | t | p-value |
|---|---|---|
| Córdoba | −3.26 | 2.31E−03 |
| Huelva Coast | −0.42 | 6.78E−01 |
| Almería | −6.21 | 2.63E−07 |
| Cádiz Bay | −0.85 | 3.98E−01 |
| Sevilla | −0.92 | 3.61E−01 |
| Granada | −1.69 | 9.82E−02 |
| Jaén | −1.18 | 2.46E−01 |
| Málaga | 6.75 | 4.69E−08 |
Statistically significant differences favoring MoEANN method.
Statistically significant differences favoring MuEANN method.
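To read the table above programmatically, the sketch below flags the districts whose second value falls under a significance threshold. Treating that value as a p-value and using a threshold of 0.05 are assumptions, since this record does not state the significance level. The sketch does not say which method each difference favors, because the sign convention of the t statistic is not given here.

```python
ALPHA = 0.05  # assumed significance level; not stated in this record

# (t statistic, second column value) per district, copied from the table.
results = {
    "Córdoba": (-3.26, 2.31e-03),
    "Huelva Coast": (-0.42, 6.78e-01),
    "Almería": (-6.21, 2.63e-07),
    "Cádiz Bay": (-0.85, 3.98e-01),
    "Sevilla": (-0.92, 3.61e-01),
    "Granada": (-1.69, 9.82e-02),
    "Jaén": (-1.18, 2.46e-01),
    "Málaga": (6.75, 4.69e-08),
}

# Districts where the difference between the best models is significant.
significant = [district for district, (_, p) in results.items() if p < ALPHA]
print(significant)
```

Under this threshold only three of the eight districts show a significant difference between the two best models.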
Statistical differences between the best results obtained with the different models trained with the and datasets in the eight districts. The results are expressed as the resulting p-value of a paired t-test.
| District | MoEANN | MuEANN | MoEANN | MuEANN | MoEANN | MuEANN |
|---|---|---|---|---|---|---|
| Córdoba | 1.63E−15 | 2.11E−18 | 2.27E−12 | 2.58E−21 | 4.17E−15 | 3.63E−21 |
| Huelva Coast | 1.79E−03 | 3.84E−06 | 5.95E−05 | 5.86E−07 | 1.45E−05 | 4.36E−07 |
| Almería | 5.84E−10 | 1.52E−09 | 2.57E−07 | 4.75E−16 | 1.64E−09 | 5.89E−14 |
| Cádiz Bay | 1.91E−21 | 6.66E−04 | 3.59E−15 | 2.54E−08 | 1.56E−14 | 3.67E−07 |
| Sevilla | 6.57E−15 | 3.64E−28 | 5.26E−16 | 2.01E−33 | 6.29E−24 | 1.39E−32 |
| Granada | 4.90E−03 | 8.82E−12 | 4.35E−08 | 7.73E−15 | 1.84E−07 | 1.26E−13 |
| Jaén | 7.51E−01 | 1.56E−01 | 5.89E−01 | 3.22E−01 | 9.45E−01 | 2.23E−01 |
| Málaga | 2.00E−09 | 3.93E−02 | 8.02E−12 | 3.06E−04 | 3.76E−14 | 1.98E−04 |
Statistically significant differences favoring models trained with datasets.