Antoni Torres-Signes, María P Frías, María D Ruiz-Medina.
Abstract
A multiple-objective space-time forecasting approach is presented, involving cyclical curve log-regression and multivariate time series spatial residual correlation analysis. Specifically, the mean quadratic loss function is minimized in the framework of trigonometric regression. In the subsequent spatial residual correlation analysis, maximization of the likelihood allows us to compute the posterior mode in a Bayesian multivariate time series soft-data framework. The presented approach is applied to the analysis of COVID-19 mortality in the first wave affecting the Spanish Communities, from March 8, 2020 to May 13, 2020. An empirical comparative study with Machine Learning (ML) regression is carried out, based on random k-fold cross-validation, bootstrap confidence intervals, and probability density estimation. This empirical analysis also investigates the performance of ML regression models in hard- and soft-data frameworks. The results could be extrapolated to other counts, countries, and subsequent COVID-19 waves. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s00477-021-02021-0.
Keywords: COVID-19 analysis; Curve regression; Hard-data; Machine learning; Multivariate time series; Soft-data
Year: 2021 PMID: 33897300 PMCID: PMC8053745 DOI: 10.1007/s00477-021-02021-0
Source DB: PubMed Journal: Stoch Environ Res Risk Assess ISSN: 1436-3240 Impact factor: 3.379
Regression parameter estimates at the 17 Spanish Communities
| SC/PE | ||||||
|---|---|---|---|---|---|---|
| C1 | 3.6343 | −0.4814 | −0.0075 | −0.0258 | 0.0189 | 0.0193 |
| C2 | 3.4345 | −0.3923 | 0.0416 | 0.0265 | −0.0709 | −0.0572 |
| C3 | 3.2031 | −0.1364 | −0.0088 | 0.0221 | 0.0430 | 0.0289 |
| C4 | 3.1445 | −0.1118 | 0.0041 | 0.0337 | 0.0062 | 0.0072 |
| C5 | 3.1015 | −0.0693 | −0.0345 | 0.0352 | −0.0112 | 0.0003 |
| C6 | 3.1347 | −0.1397 | 0.0020 | 0.0300 | −0.0061 | −0.0002 |
| C7 | 4.0591 | −0.5487 | −0.0907 | 0.0951 | 0.0992 | 0.0842 |
| C8 | 3.8032 | −0.5500 | −0.1007 | 0.0633 | 0.0139 | 0.0277 |
| C9 | 4.5095 | −0.7435 | −0.1134 | 0.1809 | 0.2231 | 0.2026 |
| C10 | 3.6321 | −0.4685 | −0.0540 | 0.0384 | −0.0152 | 0.0011 |
| C11 | 3.2967 | −0.2274 | −0.0083 | 0.0553 | 0.0250 | 0.0240 |
| C12 | 3.3454 | −0.2122 | −0.0927 | −0.0330 | 0.0724 | 0.0679 |
| C13 | 4.8419 | −0.6790 | −0.2455 | 0.0311 | 0.0554 | 0.0667 |
| C14 | 3.0941 | −0.1037 | 0.0210 | 0.0141 | −0.0016 | 0.0041 |
| C15 | 3.2877 | −0.2598 | −0.0524 | 0.0842 | −0.0423 | −0.0348 |
| C16 | 3.6870 | −0.4302 | −0.0086 | 0.0078 | −0.0027 | −0.0017 |
| C17 | 3.2197 | −0.2071 | 0.0162 | 0.0079 | 0.0206 | 0.0110 |
Regression parameter estimates at the 17 Spanish Communities
| SC/PE | ||||||
|---|---|---|---|---|---|---|
| C1 | 0 | −0.0052 | −0.1330 | −0.0123 | 0.0064 | −0.0195 |
| C2 | 0 | −0.0367 | −0.0998 | −0.0462 | −0.0343 | −0.0107 |
| C3 | 0 | −0.0531 | −0.0074 | −0.0142 | −0.0003 | 0.0020 |
| C4 | 0 | −0.0074 | −0.0284 | −0.0151 | −0.0092 | 0.0012 |
| C5 | 0 | 0.0433 | −0.0438 | −0.0116 | −0.0118 | 0.0046 |
| C6 | 0 | 0.0018 | −0.0174 | −0.0068 | −0.0089 | 0.0000 |
| C7 | 0 | −0.0365 | −0.2451 | −0.1791 | −0.0820 | 0.0026 |
| C8 | 0 | 0.0953 | −0.2389 | −0.0431 | −0.0313 | −0.0045 |
| C9 | 0 | −0.1587 | −0.4054 | −0.2269 | −0.1010 | 0.0047 |
| C10 | 0 | 0.1118 | −0.1579 | −0.0458 | −0.0418 | −0.0220 |
| C11 | 0 | 0.0754 | −0.1138 | −0.0166 | −0.0048 | 0.0072 |
| C12 | 0 | −0.1104 | −0.1338 | 0.1330 | 0.0761 | −0.0017 |
| C13 | 0 | 0.4654 | −0.1302 | −0.1602 | −0.1061 | −0.0038 |
| C14 | 0 | 0.0355 | −0.0560 | 0.0119 | 0.0025 | −0.0044 |
| C15 | 0 | −0.0187 | −0.0021 | −0.0897 | −0.0562 | 0.0134 |
| C16 | 0 | 0.0025 | −0.0707 | −0.0638 | −0.0439 | −0.0267 |
| C17 | 0 | 0.0389 | −0.0270 | −0.0174 | −0.0006 | 0.0019 |
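The abstract states that the mean quadratic loss is minimized in a trigonometric regression framework. The following is a minimal sketch of such a fit, not the authors' implementation. It assumes equally spaced temporal nodes covering exactly one period and five harmonics; the harmonic count, the cosine/sine reading of the two six-column tables above (whose headers are lost in this record), and the function names `fit_trig`, `predict_trig`, and `mean_quadratic_loss` are all assumptions made for illustration. The data are synthetic.

```python
import math


def fit_trig(y, n_harmonics=5):
    """Least-squares fit of y[j] ~ a[0] + sum_k (a[k] cos(k w j) + b[k] sin(k w j)).

    Assumption: nodes j = 0..N-1 are equally spaced and span one full period
    (w = 2*pi/N). Under that assumption the trigonometric basis is orthogonal,
    so the minimizer of the mean quadratic loss is given by simple projections.
    """
    n = len(y)
    if n <= 2 * n_harmonics:
        raise ValueError("need more nodes than 2 * n_harmonics")
    w = 2.0 * math.pi / n
    a = [sum(y) / n]  # constant term
    b = [0.0]         # a sine term at frequency zero vanishes identically
    for k in range(1, n_harmonics + 1):
        a.append(2.0 / n * sum(v * math.cos(k * w * j) for j, v in enumerate(y)))
        b.append(2.0 / n * sum(v * math.sin(k * w * j) for j, v in enumerate(y)))
    return a, b


def predict_trig(a, b, n):
    """Evaluate the fitted trigonometric curve at nodes j = 0..n-1."""
    w = 2.0 * math.pi / n
    return [
        sum(a[k] * math.cos(k * w * j) + b[k] * math.sin(k * w * j)
            for k in range(len(a)))
        for j in range(n)
    ]


def mean_quadratic_loss(y, y_hat):
    """Average squared difference between observed and fitted values."""
    return sum((u - v) ** 2 for u, v in zip(y, y_hat)) / len(y)


# Synthetic log-risk curve on 265 nodes (the node count quoted in Fig. 1).
N = 265
W = 2.0 * math.pi / N
y = [3.5 - 0.4 * math.cos(W * j) - 0.1 * math.sin(2 * W * j) for j in range(N)]

a, b = fit_trig(y)
loss = mean_quadratic_loss(y, predict_trig(a, b, N))
```

On this synthetic curve the fit recovers the generating coefficients up to floating-point error. For unequally spaced nodes or a period different from the observation window, the projections would have to be replaced by a general least-squares solve.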
Fig. 1 At the top, COVID-19 mortality mean cumulative curve in Spain, from March 8, 2020 to May 13, 2020 (continuous red line, 265 temporal nodes), and bootstrap curve confidence intervals: at the left-hand side, (dashed blue lines) and (dashed magenta lines), and at the right-hand side, (dashed green lines) and (dashed yellow lines). Plots at the center and bottom show the same information for the mean intensity (spatially averaged COVID-19 mortality risk curve) and log-intensity (spatially averaged COVID-19 mortality log-risk curve) curves in Spain, respectively. All bootstrap confidence intervals are computed at confidence level from 1000 bootstrap samples
Computed values
| 0.0155 | 0.0259 | 0.0668 | 0.0408 | 0.0927 |
| 0.0623 | 0.1642 | 0.0883 | 0.2174 | 0.0313 |
| 0.0559 | 0.1904 | 0.0054 | 0.1602 | 0.1640 |
| 0.0003 | 0.1238 |
Fig. 2 A total of 1000 bootstrap samples of the spatially averaged minimum empirical regression risk (SAMERR) have been generated. The corresponding sample values are displayed at the top. The bootstrap histogram is shown at the bottom left, and the bootstrap probability density at the bottom right
Bootstrap confidence intervals for (confidence level )
| CI/S | 1000 | 10000 |
|---|---|---|
| | [0.0593, 0.1222] | [0.0594, 0.1236] |
| | [0.0564, 0.1196] | [0.0567, 0.1207] |
| | [0.0584, 0.1215] | [0.0579, 0.1217] |
| | [0.0592, 0.1233] | [0.0581, 0.1208] |
| | [0.0484, 0.1281] | [0.0494, 0.1215] |
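As an illustration of how a bootstrap confidence interval of this kind can be obtained, the sketch below resamples the 17 entries of the "Computed values" table and takes a percentile interval for their mean. Several points are assumptions: that those 17 entries are the per-community values being spatially averaged, that the confidence level is 0.95 (the level is lost in this record), and that a plain percentile interval is used (the record does not preserve which five interval types the rows correspond to). The function name `percentile_bootstrap_ci` is hypothetical.

```python
import math
import random
import statistics

# The 17 entries of the "Computed values" table, read row by row.
values = [
    0.0155, 0.0259, 0.0668, 0.0408, 0.0927,
    0.0623, 0.1642, 0.0883, 0.2174, 0.0313,
    0.0559, 0.1904, 0.0054, 0.1602, 0.1640,
    0.0003, 0.1238,
]


def percentile_bootstrap_ci(data, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `data`.

    Each bootstrap sample draws len(data) values with replacement; the
    interval endpoints are empirical quantiles of the resampled means.
    The level of 0.95 is an assumed default, not taken from the source.
    """
    rng = random.Random(seed)
    n = len(data)
    means = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(n_boot)
    )
    alpha = 1.0 - level
    lo_idx = math.floor((alpha / 2.0) * (n_boot - 1))
    hi_idx = math.ceil((1.0 - alpha / 2.0) * (n_boot - 1))
    return means[lo_idx], means[hi_idx]


sample_mean = statistics.fmean(values)
lo, hi = percentile_bootstrap_ci(values, n_boot=1000)
```

Passing `n_boot=10000` gives the analogue of the second column. Other interval constructions (for example, normal-approximation or bias-corrected intervals) would yield slightly different endpoints from the same resamples.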
Fig. 3 At the left-hand side, empirical projections of the autocorrelation operator reflecting temporal autocorrelation and cross-correlation between the 17 Spanish Communities analyzed. At the right-hand side, the considered prior probability density (red squares) of a Beta distributed random variable with shape parameters 14 and 13, scaled by a factor of 1/3, is compared with the bootstrap fit of an empirical prior (blue squares)
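The prior in Fig. 3 is described as a Beta random variable with shape parameters 14 and 13 scaled by 1/3. A small sketch of that density follows, using the standard change-of-variables rule: if U is Beta(14, 13) and X = U/3, then the density of X at x is 3 times the Beta density at 3x, supported on (0, 1/3). The function names are hypothetical, and the numerical check is only a sanity test of the formula, not part of the source.

```python
import math

A, B, SCALE = 14.0, 13.0, 1.0 / 3.0


def beta_pdf(u, a, b):
    """Density of a Beta(a, b) random variable at u (zero outside (0, 1))."""
    if u <= 0.0 or u >= 1.0:
        return 0.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(u) + (b - 1.0) * math.log(1.0 - u))


def scaled_beta_pdf(x, a=A, b=B, scale=SCALE):
    """Density of X = scale * U with U ~ Beta(a, b), by change of variables."""
    return beta_pdf(x / scale, a, b) / scale


# Trapezoidal check that the scaled density integrates to one on [0, 1/3].
n = 2000
h = SCALE / n
dens = [scaled_beta_pdf(i * h) for i in range(n + 1)]
total = h * (sum(dens) - 0.5 * (dens[0] + dens[-1]))

# Mode of Beta(a, b) is (a - 1) / (a + b - 2) for a, b > 1; scaling carries it over.
mode = SCALE * (A - 1.0) / (A + B - 2.0)
```

The scaled prior therefore concentrates around 0.52/3, roughly 0.17, well inside its support (0, 1/3).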
Bootstrap confidence intervals for the expected training standard error of the classical and Bayesian residual COVID-19 mortality log-risk predictors ()
| CI/S | Classical | Bayesian |
|---|---|---|
| | [0.0474, 0.0597] | [0.0173, 0.0228] |
| | [0.0455, 0.0578] | [0.0167, 0.0220] |
| | [0.0463, 0.0588] | [0.0169, 0.0225] |
| | [0.0460, 0.0586] | [0.0172, 0.0226] |
| | [0.0421, 0.0563] | [0.0158, 0.0215] |
Fig. 4 COVID-19 mortality risk maps, from March 8 to May 13, 2020. Observed (left-hand side) and estimated (right-hand side) maps, computed from trigonometric regression combined with classical (first row) and Bayesian (second row) residual predictors
Hard-data category. Averaged SMAPEs, based on 10 runs of random 10-fold cross-validation. Rows M. and T. give the mean and total over the 17 Spanish Communities
| SC( | GRNN | MLP | SVR | BNN | RBF | GP |
|---|---|---|---|---|---|---|
| C1 | 0.1957 | 0.0777 | 0.0700 | 0.0594 | 0.0543 | 0.0554 |
| C2 | 0.6132 | 0.1490 | 0.0663 | 0.0738 | 0.0680 | 0.0654 |
| C3 | 0.1556 | 0.0473 | 0.0350 | 0.0303 | 0.0331 | 0.0304 |
| C4 | 0.0971 | 0.0342 | 0.0135 | 0.0200 | 0.0182 | 0.0211 |
| C5 | 0.2049 | 0.0457 | 0.0318 | 0.0370 | 0.0369 | 0.0372 |
| C6 | 0.1572 | 0.0368 | 0.0177 | 0.0234 | 0.0233 | 0.0247 |
| C7 | 0.4898 | 0.0698 | 0.0644 | 0.0590 | 0.0616 | 0.0588 |
| C8 | 0.0804 | 0.0340 | 0.0171 | 0.0191 | 0.0211 | 0.0177 |
| C9 | 0.7258 | 0.1976 | 0.0979 | 0.0812 | 0.0326 | 0.0437 |
| C10 | 0.2191 | 0.0704 | 0.0556 | 0.0482 | 0.0471 | 0.0463 |
| C11 | 0.1262 | 0.0530 | 0.0310 | 0.0395 | 0.0375 | 0.0355 |
| C12 | 0.5228 | 0.1578 | 0.1341 | 0.1282 | 0.0940 | 0.0993 |
| C13 | 0.3594 | 0.0647 | 0.0576 | 0.0579 | 0.0533 | 0.0458 |
| C14 | 0.1345 | 0.0366 | 0.0209 | 0.0204 | 0.0194 | 0.0207 |
| C15 | 0.6080 | 0.1523 | 0.1411 | 0.1141 | 0.0982 | 0.1039 |
| C16 | 0.2464 | 0.0889 | 0.0709 | 0.0622 | 0.0568 | 0.0594 |
| C17 | 0.0660 | 0.0370 | 0.0148 | 0.0222 | 0.0203 | 0.0227 |
| M. | 0.2942 | 0.0796 | 0.0553 | 0.0527 | 0.0456 | 0.0463 |
| T. | 5.0022 | 1.3528 | 0.9397 | 0.8959 | 0.7757 | 0.7879 |
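The entries above are symmetric mean absolute percentage errors (SMAPEs) averaged over repeated random 10-fold cross-validation. The sketch below shows one way such a score can be computed. It is an illustration under stated assumptions: the SMAPE variant used (absolute error divided by the average of the absolute observed and predicted values) is one common definition and may differ from the one in the paper, and the straight-line model `fit_line` is a hypothetical stand-in for the ML regressors in the table, applied here to synthetic data.

```python
import random


def smape(actual, predicted):
    """Mean of |y - p| / ((|y| + |p|) / 2); terms with zero denominator count as 0."""
    terms = []
    for y, p in zip(actual, predicted):
        denom = (abs(y) + abs(p)) / 2.0
        terms.append(0.0 if denom == 0.0 else abs(y - p) / denom)
    return sum(terms) / len(terms)


def random_kfold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def repeated_kfold_smape(x, y, fit, k=10, runs=10, seed=0):
    """Average test SMAPE over `runs` repetitions of random k-fold CV."""
    rng = random.Random(seed)
    scores = []
    for _ in range(runs):
        for fold in random_kfold_indices(len(x), k, rng):
            held_out = set(fold)
            train = [i for i in range(len(x)) if i not in held_out]
            model = fit([x[i] for i in train], [y[i] for i in train])
            scores.append(
                smape([y[i] for i in fold], [model(x[i]) for i in fold])
            )
    return sum(scores) / len(scores)


def fit_line(xs, ys):
    """Hypothetical stand-in model: ordinary least-squares straight line."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((v - mx) ** 2 for v in xs)
    slope = sum((u - mx) * (v - my) for u, v in zip(xs, ys)) / sxx
    return lambda v: my + slope * (v - mx)


# Synthetic, exactly linear data: the stand-in model should predict it perfectly.
x = [float(j) for j in range(60)]
y = [2.0 + 0.5 * v for v in x]
score = repeated_kfold_smape(x, y, fit_line, k=10, runs=10)
```

Any regressor exposing the same `fit(xs, ys) -> predictor` shape could be substituted for `fit_line`, and the per-community scores could then be averaged or summed across communities.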
Soft-data category. Averaged SMAPEs, based on 10 runs of random 10-fold cross-validation. Rows M. and T. give the mean and total over the 17 Spanish Communities
| SC( | GRNN | MLP | SVR | BNN | RBF | GP |
|---|---|---|---|---|---|---|
| C1 | 0.1545 | 0.0983 | 0.0666 | 0.0573 | 0.0234 | 0.0312 |
| C2 | 0.1844 | 0.1730 | 0.0660 | 0.0749 | 0.0277 | 0.0301 |
| C3 | 0.1029 | 0.1192 | 0.0481 | 0.0452 | 0.0273 | 0.0274 |
| C4 | 0.0432 | 0.0286 | 0.0165 | 0.0158 | 0.0124 | 0.0123 |
| C5 | 0.0610 | 0.0476 | 0.0258 | 0.0248 | 0.0144 | 0.0149 |
| C6 | 0.0260 | 0.0217 | 0.0133 | 0.0140 | 0.0124 | 0.0125 |
| C7 | 0.3750 | 0.2026 | 0.1095 | 0.0924 | 0.0307 | 0.0399 |
| C8 | 0.0764 | 0.0482 | 0.0305 | 0.0300 | 0.0262 | 0.0187 |
| C9 | 0.4894 | 0.3198 | 0.1753 | 0.1212 | 0.0229 | 0.0372 |
| C10 | 0.1680 | 0.0815 | 0.0521 | 0.0462 | 0.0252 | 0.0290 |
| C11 | 0.1537 | 0.0839 | 0.0436 | 0.0397 | 0.0199 | 0.0219 |
| C12 | 0.3689 | 0.2558 | 0.1505 | 0.1249 | 0.0401 | 0.0490 |
| C13 | 0.2848 | 0.1582 | 0.0968 | 0.0792 | 0.0240 | 0.0320 |
| C14 | 0.0367 | 0.0226 | 0.0120 | 0.0143 | 0.0106 | 0.0104 |
| C15 | 0.3618 | 0.2264 | 0.1201 | 0.1227 | 0.0317 | 0.0522 |
| C16 | 0.1773 | 0.0835 | 0.0651 | 0.0545 | 0.0264 | 0.0318 |
| C17 | 0.0884 | 0.0623 | 0.0210 | 0.0231 | 0.0125 | 0.0136 |
| M. | 0.1854 | 0.1196 | 0.0655 | 0.0577 | 0.0228 | 0.0273 |
| T. | 3.1524 | 2.0333 | 1.1129 | 0.9801 | 0.3877 | 0.4642 |
Our approach. Averaged SMAPEs, based on 10 runs of random 10-fold cross-validation, for testing trigonometric regression combined with Classical (C.) and Bayesian (B.) residual analysis. Rows M. and T. give the mean and total over the 17 Spanish Communities
| SC | C. k10 | B. k10 |
|---|---|---|
| C1 | 0.0024 | |
| C2 | 0.0019 | |
| C3 | 0.0016 | |
| C4 | 0.0017 | |
| C5 | 0.0023 | |
| C6 | 0.0018 | |
| C7 | 0.0017 | |
| C8 | 0.0016 | |
| C9 | 0.0013 | |
| C10 | 0.0019 | |
| C11 | 0.0017 | |
| C12 | 0.0016 | |
| C13 | 0.0020 | |
| C14 | 0.0026 | |
| C15 | 0.0023 | |
| C16 | 0.0015 | |
| C17 | 0.0022 | |
| M. | 0.0019 | |
| T. | 0.0321 | 0.0103 |
Hard-data category. Bootstrap confidence intervals () for the spatially averaged SMAPEs from 1000 bootstrap samples ( )
| CI/ML | GRNN | MLP |
|---|---|---|
Soft-data category. Bootstrap confidence intervals () for the spatially averaged SMAPEs from 1000 bootstrap samples ( )
| CI/ML | GRNN | MLP |
|---|---|---|
Fig. 5 Hard-data category. From 1000 bootstrap samples, spatially averaged SMAPEs histograms and probability densities are plotted, for GRNN (top), MLP (center), and linear SVR (bottom)
Fig. 6 Hard-data category. From 1000 bootstrap samples, spatially averaged SMAPEs histograms and probability densities are plotted, for BNN (top), RBF (center), and GP (bottom)
Fig. 7 Soft-data category. From 1000 bootstrap samples, spatially averaged SMAPEs histograms and probability densities are plotted, for GRNN (top), MLP (center), and non-linear SVR (bottom)
Fig. 8 Soft-data category. From 1000 bootstrap samples, spatially averaged SMAPEs histograms and probability densities are plotted, for BNN (top), RBF (center), and GP (bottom)
Fig. 9 Soft-data category. From 1000 bootstrap samples, spatially averaged SMAPEs histograms and probability densities are plotted, for trigonometric regression combined with empirical-moment based classical (top) and Bayesian (bottom) residual prediction