Abstract
The choices that researchers make while conducting a statistical analysis usually have a notable impact on the results. This fact has become evident in the ongoing research on the association between the environment and the evolution of the coronavirus disease 2019 (COVID-19) pandemic, in light of the hundreds of contradictory studies that have already been published on this issue in just a few months. In this paper, a COVID-19 dataset containing the number of daily cases registered in the regions of Catalonia (Spain) from the start of the pandemic to the end of August 2020 is analysed using statistical models of diverse levels of complexity. Specifically, the possible effect of several environmental variables (solar exposure, mean temperature, and wind speed) on the number of cases is assessed. The first objective of the paper is to show how the choice of a certain type of statistical model to conduct the analysis can have a severe impact on the associations that are inferred between the covariates and the response variable. Secondly, it is shown how the use of spatio-temporal models accounting for the nature of the data makes it possible to understand the evolution of the pandemic in space and time. The results suggest that even though the models fitted to the data correctly capture the evolution of COVID-19 in space and time, determining whether there is an association between the spread of the pandemic and certain environmental conditions is complex, as it is severely affected by the choice of the model.
Keywords: COVID-19; Environmental covariates; Integrated nested Laplace approximation; Relative risk; Space-time interaction; Spatio-temporal models
Year: 2021 PMID: 33424434 PMCID: PMC7778699 DOI: 10.1007/s00477-020-01965-z
Source DB: PubMed Journal: Stoch Environ Res Risk Assess ISSN: 1436-3240 Impact factor: 3.379
Fig. 1 Map of peninsular Spain at the province level (a) and map of Catalonia at the region level (b). In a, the four provinces of Catalonia are highlighted. In b, the region of Barcelonès, where the capital city of Catalonia (Barcelona) is located, is also highlighted
Table 1 Description of the 12 main models that were considered for the comparison in terms of the specification of the logarithm of the relative risk corresponding to region i on day t
| Model | Specification of the log relative risk |
|---|---|
| Model 1 | |
| Model 2 | |
| Model 3 | |
| Model 4 | |
| Model 5 | |
| Model 6 | |
| Model 7 | |
| Model 8 | |
| Model 9 | |
| Model 10 | |
| Model 11 | |
| Model 12 | |
For all the models, the specification may include an intercept, the number of expected cases E, and the covariates. In addition, the models may include a structured and an unstructured random spatial effect, a structured and an unstructured random temporal effect, and a random spatio-temporal effect. The symbols I, II, III, IV denote the type of spatio-temporal interaction considered in Models 5 to 12, according to Table 2
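For orientation, the components listed in this note can be combined in a generic disease-mapping form. The block below is a sketch under assumptions, not the paper's own notation: the symbols $O_{it}$, $\rho_{it}$, $\alpha$, $\beta_k$, $x_{k,it}$, $u_i$, $v_i$, $\gamma_t$, $\phi_t$ and $\delta_{it}$ are hypothetical names, a Poisson likelihood is assumed, and which terms enter each of the 12 models cannot be recovered from the table as shown.

```latex
\begin{align}
O_{it} \mid \rho_{it} &\sim \mathrm{Poisson}\left(E_{it}\,\rho_{it}\right), \\
\log \rho_{it} &= \alpha + \sum_{k} \beta_k x_{k,it} + u_i + v_i + \gamma_t + \phi_t + \delta_{it}.
\end{align}
```

Under this reading, $u_i$ and $v_i$ would be the structured and unstructured spatial effects, $\gamma_t$ and $\phi_t$ the structured and unstructured temporal effects, and $\delta_{it}$ the spatio-temporal interaction. Simpler models would drop some of these terms.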
Table 2 Specification of the four types of spatio-temporal interaction considered in terms of the Kronecker product of the two matrices representing the structure of the spatial and temporal effect, respectively
| Type of spatio-temporal interaction | Structure matrix (Kronecker product) |
|---|---|
| I | |
| II | |
| III | |
| IV | |
The identity matrix corresponds to the unstructured spatial (temporal) effect, whereas a non-identity matrix corresponds to a specific structured spatial (temporal) effect
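The Kronecker-product construction described in this note can be illustrated with a small sketch. It follows the standard classification of space-time interactions (types I to IV), which the table appears to use. Several things here are assumptions and are not taken from the source: the spatial-then-temporal ordering of the product, the first-order random-walk structure for time, the intrinsic CAR structure (degree matrix minus adjacency) for space, and the toy map of three regions in a line observed at two time points. All function names are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in a
        for row_b in b
    ]


def identity(n):
    """Identity matrix: structure of an unstructured (i.i.d.) effect."""
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]


def rw1_structure(n):
    """Structure matrix of a first-order random walk (assumed temporal structure)."""
    q = [[0] * n for _ in range(n)]
    for t in range(n - 1):
        q[t][t] += 1
        q[t + 1][t + 1] += 1
        q[t][t + 1] -= 1
        q[t + 1][t] -= 1
    return q


def icar_structure(adjacency):
    """Degree matrix minus adjacency matrix (assumed spatial structure)."""
    n = len(adjacency)
    return [
        [sum(adjacency[r]) if r == c else -adjacency[r][c] for c in range(n)]
        for r in range(n)
    ]


# Hypothetical toy setting: regions 0-1-2 in a line, two time points.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
n_regions, n_times = 3, 2

I_s, I_t = identity(n_regions), identity(n_times)
R_s, R_t = icar_structure(adjacency), rw1_structure(n_times)

# Assumed ordering: spatial matrix first, temporal matrix second.
structures = {
    "I": kron(I_s, I_t),    # unstructured space x unstructured time
    "II": kron(I_s, R_t),   # unstructured space x structured time
    "III": kron(R_s, I_t),  # structured space x unstructured time
    "IV": kron(R_s, R_t),   # structured space x structured time
}

for name, matrix in structures.items():
    print(name, matrix[0])
```

Each resulting matrix has one row and one column per region-time pair (six in this toy case). Type I reduces to the identity, while type IV couples neighbouring regions and consecutive time points simultaneously.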
DIC and WAIC values corresponding to Models 1 to 12, considering a lagged effect on the covariates of 0, 7, or 14 days
| Model | DIC (0-day lag) | WAIC (0-day lag) | DIC (7-day lag) | WAIC (7-day lag) | DIC (14-day lag) | WAIC (14-day lag) |
|---|---|---|---|---|---|---|
| Model 1 | 72565.19 | 72679.40 | 73011.11 | 73124.89 | 73122.80 | 73237.75 |
| Model 2 | 67300.15 | 67382.92 | 67327.76 | 67409.48 | 67317.83 | 67399.46 |
| Model 3 | 52467.55 | 53120.39 | 52445.03 | 53094.57 | 52471.95 | 53126.27 |
| Model 4 | 52440.54 | 53362.19 | 52421.46 | 53373.55 | 52459.30 | 53366.84 |
| Model 5 | 31851.36 | 33309.72 | 31858.83 | 33315.52 | 31855.27 | 33310.80 |
| Model 6 | 32060.54 | 33300.17 | 32073.40 | 33311.08 | 32067.71 | 33303.21 |
| Model 7 | 31991.11 | 33388.21 | 32001.95 | 33398.70 | 31998.69 | 33393.09 |
| Model 8 | 31860.42 | 33131.04 | 31868.50 | 33136.35 | 31864.40 | 33131.99 |
| Model 9 | 26162.89 | 25639.98 | 26163.56 | 25641.39 | 26161.46 | 25636.84 |
| Model 10 | 29175.52 | 31005.83 | 29197.48 | 31033.46 | 29192.74 | 31031.59 |
| Model 11 | 26240.72 | 25915.22 | 26245.35 | 25919.58 | 26246.03 | 25922.32 |
| Model 12 | – | – | – | – | – | – |
In the case of Model 12, the values obtained for the two metrics were not comparable to those of the rest of the models (they were all extremely high), so they are omitted (–)
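Since lower DIC and WAIC values indicate a better trade-off between fit and complexity, the comparison in the table can be reproduced with a short ranking script. The sketch below copies the values from the table (Model 12 is left out, as in the source). The names `TABLE`, `score` and `ranking` are hypothetical helpers introduced only for illustration.

```python
LAGS = (0, 7, 14)
METRICS = ("DIC", "WAIC")

# Each row: DIC, WAIC at lag 0; DIC, WAIC at lag 7; DIC, WAIC at lag 14.
TABLE = {
    "Model 1": (72565.19, 72679.40, 73011.11, 73124.89, 73122.80, 73237.75),
    "Model 2": (67300.15, 67382.92, 67327.76, 67409.48, 67317.83, 67399.46),
    "Model 3": (52467.55, 53120.39, 52445.03, 53094.57, 52471.95, 53126.27),
    "Model 4": (52440.54, 53362.19, 52421.46, 53373.55, 52459.30, 53366.84),
    "Model 5": (31851.36, 33309.72, 31858.83, 33315.52, 31855.27, 33310.80),
    "Model 6": (32060.54, 33300.17, 32073.40, 33311.08, 32067.71, 33303.21),
    "Model 7": (31991.11, 33388.21, 32001.95, 33398.70, 31998.69, 33393.09),
    "Model 8": (31860.42, 33131.04, 31868.50, 33136.35, 31864.40, 33131.99),
    "Model 9": (26162.89, 25639.98, 26163.56, 25641.39, 26161.46, 25636.84),
    "Model 10": (29175.52, 31005.83, 29197.48, 31033.46, 29192.74, 31031.59),
    "Model 11": (26240.72, 25915.22, 26245.35, 25919.58, 26246.03, 25922.32),
}


def score(model, lag, metric):
    """Look up one DIC or WAIC value for a model at a given lag (in days)."""
    return TABLE[model][2 * LAGS.index(lag) + METRICS.index(metric)]


def ranking(lag, metric):
    """Models sorted from best (lowest value) to worst (highest value)."""
    return sorted(TABLE, key=lambda model: score(model, lag, metric))


best = {(lag, metric): ranking(lag, metric)[0] for lag in LAGS for metric in METRICS}

for (lag, metric), model in best.items():
    print(f"lag {lag:>2} days, {metric:<4}: {model}")
```

With the values in the table, Model 9 attains the lowest DIC and the lowest WAIC at all three lags, followed by Model 11.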
Fig. 2 Histograms of the PIT scores obtained for Models 3 to 8 (from left to right), considering a 0-day, a 7-day, and a 14-day lagged effect (from top to bottom) on the covariates
Fig. 3 Summary of the estimates obtained for the coefficients associated with environmental covariates for each of the 12 models fitted, considering a lagged effect on the covariates of 0, 7, or 14 days
Fig. 4 Relative risks on a weekly and a daily basis according to the structured and unstructured temporal random effects estimated through Models 3 (a) and 4 (b). The relative risk corresponding to each component is computed as the exponential of the corresponding (weekly or daily) structured or unstructured temporal random effect
Precision parameters associated with each spatial, temporal, and spatio-temporal random effect included in Models 3–12
| Model | 0-day lag | | | | | 7-day lag | | | | | 14-day lag | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Model 3 | 0.61 | 1856.80 | 8.05 | 19519.94 | – | 0.61 | 1830.99 | 7.46 | 19322.89 | – | 0.61 | 1808.52 | 8.22 | 19281.02 | – |
| Model 4 | 0.61 | 1880.24 | 2905.82 | 28751.88 | – | 0.61 | 1868.87 | 2336.19 | 28155.57 | – | 0.61 | 1828.89 | 2996.43 | 32255.73 | – |
| Model 5 | 0.58 | 1701.93 | 17.28 | 20029.05 | 1.96 | 0.58 | 1694.79 | 17.76 | 16311.47 | 1.96 | 0.57 | 1987.44 | 17.87 | 19504.56 | 1.96 |
| Model 6 | 1683.99 | 1824.58 | 71.77 | 17086.46 | 0.16 | 1669.41 | 1764.29 | 81.33 | 17326.17 | 0.16 | 1710.88 | 1782.48 | 78.25 | 17339.23 | 0.16 |
| Model 7 | 1840.15 | 2.40 | 15954.35 | 19012.55 | 1.77 | 1838.72 | 2.35 | 16686.60 | 18328.64 | 1.77 | 1844.39 | 2.34 | 16605.04 | 18684.12 | 1.77 |
| Model 8 | 2548.59 | 2549.11 | 6.03 | 22059.37 | 0.05 | 1932.06 | 1851.87 | 6.17 | 22289.05 | 0.05 | 1956.98 | 1995.20 | 6.37 | 22369.79 | 0.05 |
| Model 9 | 0.57 | 1755.44 | 4150.88 | 19984.01 | 1.56 | 0.57 | 1778.78 | 4460.82 | 20199.37 | 1.56 | 0.57 | 1655.83 | 5206.65 | 12321.17 | 1.55 |
| Model 10 | 1735.39 | 1793.79 | 38350.20 | 29884.19 | 0.01 | 2148.25 | 1996.45 | 38612.51 | 30192.31 | 0.01 | 1640.81 | 1746.67 | 38360.92 | 30166.73 | 0.01 |
| Model 11 | 1.34 | 7.47 | 43977.69 | 18991.44 | 1.33 | 0.71 | 1826.22 | 42416.58 | 19266.62 | 1.33 | 1.40 | 7.23 | 40716.07 | 18450.46 | 1.33 |
| Model 12 | 1842.10 | 1900.22 | 205.84 | 22812.31 | 0.00 | 2096.99 | 1993.96 | 229.80 | 23237.58 | 0.00 | 1893.71 | 1841.31 | 247.87 | 22423.13 | 0.00 |
The precision represents the inverse of the variance of the corresponding random effect
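Because a precision is the inverse of a variance, the table entries are easier to read once converted to standard deviations on the log relative-risk scale. The following is a minimal sketch of that conversion. The function names are hypothetical, and the two example values are taken from the Model 3 row at a 0-day lag.

```python
import math


def precision_to_variance(precision):
    """Variance of a random effect given its precision (inverse variance)."""
    if precision <= 0:
        raise ValueError("precision must be strictly positive")
    return 1.0 / precision


def precision_to_sd(precision):
    """Standard deviation of a random effect given its precision."""
    return math.sqrt(precision_to_variance(precision))


# Two precision values from the Model 3 row (0-day lag) of the table above.
for tau in (0.61, 19519.94):
    print(
        f"precision={tau}: "
        f"variance={precision_to_variance(tau):.5f}, "
        f"sd={precision_to_sd(tau):.4f}"
    )
```

A precision of 0.61 corresponds to a standard deviation of roughly 1.28, so that effect varies substantially, whereas a precision near 20000 corresponds to a standard deviation below 0.01, so that effect is almost constant. A precision reported as 0.00, as in the Model 12 row, is a rounded value and cannot be inverted from the table.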
Fig. 5 Global relative risks at the region level estimated for the period under study considering Model 3 (a) and Model 4 (b)
Fig. 6 Relative risks at the region level estimated for a selection of days within the period under study with Model 9
Fig. 7 Evolution of the relative risks, according to the estimates provided by Model 9 in the six regions of Catalonia with the highest global relative risks (according to the estimates provided by Models 3 and 4). To make this plot, the relative risks provided by Model 9 have been smoothed through a locally estimated scatterplot smoothing (LOESS) regression (Fox and Weisberg 2018) for ease of visualisation and interpretation
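The smoothing step mentioned in this caption can be sketched as follows. This is not the implementation used for the figure: it is a simplified degree-1 LOESS (tricube-weighted local linear fits, no robustness iterations), the function name `loess_linear` and the `span` value are assumptions, and the daily log relative risks are synthetic values invented for the example.

```python
import math


def loess_linear(x, y, span=0.75):
    """Degree-1 LOESS: tricube-weighted local linear fits, no robustness steps."""
    n = len(x)
    # Number of neighbours used in each local fit (at least 3, at most n).
    k = min(n, max(3, math.ceil(span * n)))
    fitted = []
    for x0 in x:
        # Bandwidth: distance from x0 to its k-th nearest point.
        h = sorted(abs(xi - x0) for xi in x)[k - 1]
        sw = swu = swy = swuu = swuy = 0.0
        for xi, yi in zip(x, y):
            u = xi - x0  # centre at x0 so the intercept is the fitted value
            if h > 0:
                d = abs(u) / h
                w = (1.0 - d ** 3) ** 3 if d < 1.0 else 0.0
            else:
                w = 1.0 if u == 0 else 0.0
            sw += w
            swu += w * u
            swy += w * yi
            swuu += w * u * u
            swuy += w * u * yi
        denom = sw * swuu - swu * swu
        if denom > 1e-12:
            slope = (sw * swuy - swu * swy) / denom
            fitted.append((swy - slope * swu) / sw)
        else:
            # Degenerate neighbourhood: fall back to the weighted mean.
            fitted.append(swy / sw)
    return fitted


# Synthetic daily log relative risks for one region (hypothetical values).
days = list(range(30))
log_rr = [0.5 * math.sin(d / 5) + (0.2 if d % 2 else -0.2) for d in days]

# Relative risk is the exponential of the log relative risk.
rr = [math.exp(v) for v in log_rr]
rr_smooth = loess_linear(days, rr, span=0.5)

for d in (0, 10, 20, 29):
    print(f"day {d:>2}: raw={rr[d]:.3f} smoothed={rr_smooth[d]:.3f}")
```

A smaller `span` follows the daily estimates more closely, while a larger one gives a smoother curve at the cost of flattening short-lived peaks.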