Environmental factors and mobility predict COVID-19 seasonality in the Netherlands.

Martijn J Hoogeveen¹, Aloys C M Kroes², Ellen K Hoogeveen³.

Abstract

BACKGROUND: We recently showed that the seasonal patterns of COVID-19 incidence and influenza-like illness incidence are highly similar in a country in the temperate climate zone, such as the Netherlands. We hypothesize that in the Netherlands the same environmental factors and mobility trends that are associated with the seasonality of flu-like illnesses are predictors of COVID-19 seasonality as well.
METHODS: We used meteorological, pollen/hay fever and mobility data from the Netherlands. For the reproduction number of COVID-19 (Rt), we used daily estimates from the Dutch State Institute for Public Health. For all datasets, we selected the overlapping period of COVID-19 and the first allergy season: from February 17, 2020 till September 21, 2020 (n = 218). Backward stepwise multiple linear regression was used to develop an environmental prediction model of the Rt of COVID-19. Next, we studied whether adding mobility trends to an environmental model improved the predictive power.
RESULTS: Through stepwise backward multiple linear regression four highly significant (p < 0.01) predictive factors are selected in our combined model: temperature, solar radiation, hay fever incidence, and mobility to indoor recreation locations. Our combined model explains 87.5% of the variance of Rt of COVID-19 and has a good and highly significant fit: F(4, 213) = 374.2, p < 0.00001. This model had a better overall predictive performance than a solely environmental model, which explains 77.3% of the variance of Rt (F(4, 213) = 181.3, p < 0.00001).
CONCLUSIONS: We conclude that the combined mobility and environmental model can adequately predict the seasonality of COVID-19 in a country with a temperate climate like the Netherlands. In this model higher solar radiation, higher temperature and hay fever are related to lower COVID-19 reproduction, and higher mobility to indoor recreation locations is related to an increased COVID-19 spread.

Keywords: Allergens; Allergies; COVID-19 reproduction number; Mobility; Solar radiation; Temperature

Year: 2022; PMID: 35257688; PMCID: PMC8895708; DOI: 10.1016/j.envres.2022.113030

Source DB: PubMed; Journal: Environ Res; ISSN: 0013-9351; Impact factor: 6.498

Introduction

COVID-19 appears to be subject to multi-wave seasonality (Kissler et al., 2020; Grech et al., 2020; Liu et al., 2021; Coccia, 2022), comparable to other respiratory viral infections and pandemics since time immemorial (Moriyama et al., 2020; Fox et al., 2017). COVID-19 community outbreaks have been observed to follow a pattern similar to that of other seasonal respiratory viruses (Sajadi et al., 2020; Poole, 2020; Burra et al., 2021; Hoogeveen and Hoogeveen, 2021), whereby the seasonal dips coincide with allergy season in regions in the temperate climate zone (Hoogeveen et al., 2021; Hoogeveen, 2020; Shah et al., 2021). The same factors that drive the seasonality of flu-like illnesses appear to drive COVID-19 seasonality: solar radiation including ultraviolet (UV) light, temperature and humidity (Byun et al., 2021), seasonal allergens (i.e., pollens) and allergies, and behavior. Regarding behavior, mobility data show the beneficial effect of restrictive measures on the reproduction number (Rt) of COVID-19 (Kajitani and Hatayama, 2021; Nouvellet et al., 2021; Linka et al., 2020), but the seasonal aspects of mobility are often overlooked. For example, during nice weather people spend more time outdoors. The advantage of including mobility data in the model is that it allows us to discriminate between indoor and outdoor locations. This distinction is relevant since government policies typically restrict mobility regarding specific location types. A single indicator for human-to-human interactions, such as “commercial trade” (Bontempi, 2020), would not allow us to discriminate mobility per location type. Moreover, selective restrictive policies may distort the association between “commercial trade” and viral spread during our research period. For flu-like illnesses, we previously showed that a compound predictor of solar radiation and seasonal allergens is highly significant, though only moderately strong: r(222) = −0.48 (p < 0.001) (Hoogeveen et al., 2021).
It is unclear why environmental factors, such as higher solar radiation, a higher level of seasonal allergens (pollens) and subsequently hay fever, are consistently associated with a lower Rt of COVID-19 and thus possibly associated with COVID-19 seasonality as well. Exposure to solar radiation might be associated with better COVID-19 outcomes (Abraham et al., 2021), and daylight is understood to regulate melatonin levels, and subsequently circadian (lung) immunity (Nosal et al., 2020). Further, increased UV light levels are associated with a more rapid degradation of SARS-CoV-2 particles (Kumar et al., 2021), although the clinical relevance of this effect is debatable. Following the observation that allergic diseases are associated with lower rates of COVID-19 hospitalizations (Larsson and Gill, 2021; Keswani et al., 2020), several possible pathophysiological explanations have been proposed, such as a lower expression of membrane-bound angiotensin-converting enzyme 2 (ACE-2) (Jackson et al., 2020; Wan et al., 2020), higher eosinophil counts that are associated with a more favorable course of COVID-19 (Licari et al., 2020; Lindsley et al., 2020; Ferastraoaru et al., 2021), a reduced risk of a cytokine storm or hyper-inflammation (Carli et al., 2020), and T cell-mediated immune responses to allergens which might be effective against COVID-19 as well (Balz et al., 2021). On the other hand, a recent international epidemiological study reported a positive correlation between pollen concentrations and COVID-19 incidence (Damialis et al., 2021). As another study, from Spain, could not confirm this finding (Moral de Gregorio et al., 2022), it is still a matter of considerable debate. Further, we noticed that an estimate of Rt discriminates better between independent variables than the incidence metrics (Hoogeveen et al., 2021) that are typically used in flu-like and COVID-19 seasonality research (Byun et al., 2021).
The Rt variable is not only a more sensitive metric, but also includes incubation time lags, and is corrected for test bias. Moreover, it is less dependent on seasonality than crude incidence. For such reasons, the Rt has become the standard in predictive modelling for COVID-19. As a consequence, it typically leads to quite different conclusions from research focusing on crude incidence, specifically regarding the nature of the relation between COVID-19 spread and humidity and temperature. Apart from the use of the Rt, the novelty of our research lies in the aim to almost fully explain the remarkable fact that every spring, COVID-19 incidence appears to quickly melt away in the temperate climate zone. We aim to do this by including a very comprehensive set of environmental and mobility parameters in a single predictive model, in which we also include recently identified variables such as mobility, solar radiation and seasonal allergens and allergies, which appear to be reliable predictors as discussed above. Finally, among experts there is considerable disagreement on the relative importance of different SARS-CoV-2 transmission routes (Freeman et al., 2021). Statistically eliminating mobility locations and identifying the best predictor can help to reduce the confusion. Our hypothesis is that a model combining both environmental factors and mobility trends improves the prediction of the seasonality of COVID-19 compared to each factor alone. Therefore, the main objective of this study is to explore a comprehensive model, including both environmental factors and mobility trends of people, to improve the prediction of the reproduction number for COVID-19 during the spring season, which coincides with the low season of flu-like respiratory diseases in a country in the temperate climate zone such as the Netherlands (latitude: 52°N).

Methods

Data

For the present analyses, we selected the overlapping period of available data sets. The baseline is defined as the first measurements of incidence of COVID-19 on February 17, 2020, and the end date coincides with the end of the first full allergy or pollen season on September 21, 2020, as will be further explained below.

Reproduction number for COVID-19

For the observations of Rt, we used the respective dataset from the Dutch State Institute for Public Health (Rijksinstituut voor Volksgezondheid en Milieu; RIVM) (dataset]D-19 r, 2021) from February 17, 2020 till September 21, 2020. RIVM uses a standard method to calculate the Rt metric on the basis of the input data described below (Wallinga and Lipsitch, 2007). RIVM's Rt metric is a daily estimate that is based on positive COVID-19 tests in the Netherlands, reported by hospitals via the national intensive care foundation (NICE) and by RIVM's own institutes in municipalities (GGD). When the first symptomatic day of a COVID-19 infected person is not known, RIVM estimates this date. Further, RIVM assumes an average 4 days delay period between infection and first symptoms, and estimates the mean incubation period to be 6.4 days (95% confidence interval (CI): 5.6–7.7) (Backer et al., 2020). As the surveillance system of COVID-19 incidence and hospitalizations in the Netherlands, on which Rt is based, is considered highly reliable and valid, we did not consider alternative metrics such as viral particle counts in waste water (Hu et al., 2021).

Meteorological data

Regarding meteorological data, we used datasets from the Royal Dutch Meteorological Institute (dataset]Daily Mete, 2021) from February 17, 2020 till September 21, 2020. The downloaded daily data included global solar radiation in J/cm2, mean relative atmospheric humidity (% RH), and average temperature in degrees Celsius. For comparison, and given their effects on pollen distribution, we also added precipitation duration in 0.1 h, precipitation amount in 0.1 mm, mean wind speed, minimum and maximum temperatures in degrees Celsius, mean dew point temperature in degrees Celsius, and sunshine duration in 0.1 h. Additionally, we calculated the wind chill temperature per day. These datasets were obtained from the KNMI's centrally located De Bilt weather station. De Bilt is traditionally chosen as it provides an approximation of modal meteorological parameters in the Netherlands, which is a small country. Furthermore, all major population centers in the Netherlands, which account for around 70% of the total Dutch population, are within a radius of only 60 km from De Bilt. We therefore assumed in this study that the measurements from De Bilt are sufficiently representative for the meteorological conditions typically experienced by the Dutch population.

Mobility data

We used Google mobility data for relative trends regarding visits to different types of locations in the Netherlands (dataset] Google. Mobility, 2021) for the same period from February 17, 2020 till September 21, 2020. These location types are: Residential, Workplaces, Indoor Recreation (called Retail & Recreation by Google, which includes restaurants, cafes, retail, shopping centers, theme parks, museums, libraries, and movie theaters), Outdoor Recreation (called Parks by Google, which includes places such as national parks, public beaches, marinas, dog parks, plazas, and public gardens), and Transit Stations (public transport hubs such as subway, bus, and train stations). For comparison, as these are less affected by lockdowns, we also included mobility trends for Grocery & Pharmacy (places such as grocery markets, food warehouses, farmers markets, specialty food shops, drug stores, and pharmacies).

Seasonal allergens and allergies

For hay fever (allergic rhinitis) we used the data from Nivel (2021), for the same period, on weekly incidence reports at primary medical care level, per 100,000 citizens in the Netherlands. Primary medical care is the day-to-day, first-line healthcare given by local healthcare practitioners to their registered clients, as is typical for the Netherlands. The hay fever incidence metric is a weekly average based on a representative group of 40 primary care units, calculated as the number of hay fever reports per primary care unit divided by the number of patients registered at that unit. This is then averaged over all primary care units and extrapolated to the complete population. We used interpolation to generate a daily data set. For comparison, we included daily mean pollen concentrations based on the data from two Dutch pollen stations: Elkerliek Ziekenhuis in Helmond (latitude 51.487059, longitude 5.662036) (Elkerliek, 2020), and Leiden University Medical Center in Leiden (latitude 52.166309, longitude 4.477315) (dataset]Pollen con, 2021). The mean pollen concentration is measured in grains/m3, whereby we used the daily totals for the 42 types of pollen particles that both stations count and average per day per 1 m3 of air. Both stations use the common Burkard spore trap. It was noticed before that a metric including all available allergenic particle types, whether lower or higher allergenic, correlates more strongly with the incidence or Rt of COVID-19 than a metric based only on higher allergenic particle types (Hoogeveen et al., 2021; Shah et al., 2021). These are the only two pollen stations in the Netherlands, whereby it is understood that the station in Leiden represents the maritime coastal zone, below sea level, and the one in Helmond the more continental zone, above sea level.
In previous studies, we saw that using the data of a single station already provided a good parameter for trend analysis (Hoogeveen et al., 2021; Hoogeveen, 2020). However, we believe that including data from both stations leads to a better estimate for the Netherlands as a whole.
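The step from weekly hay fever incidence to a daily series can be sketched as follows. The text only states that interpolation was used, so the linear scheme, the placement of each weekly value on the first day of its week, and the function name `weekly_to_daily` are assumptions for illustration, not the authors' procedure. The input numbers are hypothetical.

```python
def weekly_to_daily(weekly_values):
    """Linearly interpolate weekly values to a daily series.

    Assumption (not from the source): each weekly value is placed on the
    first day of its week and the days in between lie on a straight line.
    """
    if len(weekly_values) < 2:
        return list(weekly_values)
    daily = []
    for start, end in zip(weekly_values, weekly_values[1:]):
        step = (end - start) / 7.0
        # Seven daily values per week, starting at the weekly anchor point.
        daily.extend(start + step * day for day in range(7))
    daily.append(weekly_values[-1])  # close the series on the last anchor
    return daily


# Hypothetical weekly hay fever reports per 100,000 citizens.
weekly_incidence = [70.0, 140.0, 105.0]
daily_incidence = weekly_to_daily(weekly_incidence)
print(len(daily_incidence), daily_incidence[:8])
```

Other interpolation schemes (for example splines, or anchoring on the middle of the week) would give slightly different daily values, which the subsequent 7-day smoothing would largely even out.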

Data sets consolidation

As said before, we selected for all datasets the same overlapping period of COVID-19 and the first full allergy season (Hoogeveen and Hoogeveen, 2021), during 2020. The overlapping period therefore runs from February 17, 2020 till September 21, 2020 (n = 218 days). We defined the end of the allergy or pollen season as the point at which total pollen concentrations structurally drop below 10 grains/m3. Further, all variables were consolidated into a single parameter for the whole country, not accounting for spatial variability, given that data at provincial or municipal level is not available for all datasets. For sensitivity analyses, we also extended the datasets to the period till June 10, 2021 (n = 480 days). For the mobility datasets, the clear intra-week patterns required a 7-day moving average to reduce noise. Therefore, for reasons of consistency, we calculated 7-day moving averages for all other variables as well.
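A 7-day moving average of the kind described above can be sketched as follows. The source does not say whether the window is trailing or centered, so the trailing window used here is an assumption. The function name and the example series are hypothetical.

```python
def moving_average(values, window=7):
    """Trailing moving average.

    Assumption (not from the source): the window covers the current day and
    the preceding window-1 days; earlier days get None.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            running_sum -= values[i - window]  # drop the day leaving the window
        smoothed.append(running_sum / window if i >= window - 1 else None)
    return smoothed


# Hypothetical mobility index with a weekday/weekend pattern over three weeks.
mobility = [5, 5, 5, 5, 5, 1, 1] * 3
smoothed_mobility = moving_average(mobility)
print(smoothed_mobility[6:9])
```

Because every full window in this example contains exactly one complete week, the intra-week pattern disappears entirely from the smoothed series, which is the noise reduction the text aims for.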

Statistical analysis

Variables are presented with their sample sizes (n), means (M), and standard deviations (SD). We calculated correlation coefficients to assess the strength and direction of the relations of each independent variable with Rt, and with each other. Stepwise backward multiple linear regression of Rt on all independent variables was used to keep only candidate predictors that are significant (p < 0.05) in the model and remove insignificant predictors. Next, we removed predictors that were multicollinear as defined below. With the remaining independent variables, the F-value, standard deviations and errors, degrees of freedom (DF), and significance level were calculated to test the goodness-of-fit hypothesis for our predictive model for Rt. Further, the multiple R, multiple R squared (R2) and adjusted R2 correlation coefficients were calculated to estimate the predictive power of our model. Additionally, the algebraic equation to predict Rt was determined, which is to be understood merely as an empirical formula. Next, per independent variable the (standard) coefficient, t-stat and its 95% CI, probability, and the variance inflation factor (VIF) value were calculated as well. Further, as linear regression assumes normality of the residuals, we applied the Shapiro-Wilk test, and to test the homoscedasticity requirement (homogeneity of variance of residuals) we applied the White test. To analyze multicollinearity we used a VIF value of 2.5 as a threshold. Additionally, the a priori power was calculated for each predictor alone and compared with the full model. Although the dependent variable Rt assumes time lags, we also studied the autocorrelations of residuals, whereby we interpret an autocorrelation beyond a time lag of 7 days as an indication that our model might miss a key predictor. Finally, we created calibration plots to visually review the fit of the model.
For selected independent variables with p < 0.05 and a VIF score < 2.5, standard log10, square root and quadratic (^2) data transformations were applied to reduce non-linearity in the relations between variables, which helps to reduce skewness and, especially, to meet the normality and homoscedasticity requirements. Such data transformations do not change the nature and direction of the relations between independent variables and Rt. In the case of the relative mobility trend data, we added a constant before such data transformations to avoid loss of data because of negative numbers. For other variables that was not necessary, as they only included positive numbers. We reported the results in APA style, adapted to journal requirements, and applied the TRIPOD guidelines insofar as applicable. All statistical analyses were done with Stats Kingdom 2021, which we benchmarked against R version 3.5.
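The VIF screening against the 2.5 threshold can be illustrated with a small dependency-free sketch. It uses the standard definition VIF = 1 / (1 − R²), where R² comes from regressing one predictor on the others. The analyses in the study were run in Stats Kingdom, so this code, its helper names and its toy data are purely illustrative assumptions.

```python
VIF_THRESHOLD = 2.5  # threshold stated in the text


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(y, predictors):
    """R^2 of an ordinary least squares fit of y on predictors plus intercept."""
    rows = [[1.0] + [p[i] for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def variance_inflation_factors(predictors):
    """VIF per predictor: 1 / (1 - R^2) from regressing it on all the others."""
    vifs = []
    for j, target in enumerate(predictors):
        others = [p for i, p in enumerate(predictors) if i != j]
        vifs.append(1.0 / (1.0 - r_squared(target, others)))
    return vifs


# Toy data (hypothetical): two strongly related series.
temperature = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
dew_point = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
vifs = variance_inflation_factors([temperature, dew_point])
flagged = [v > VIF_THRESHOLD for v in vifs]
print(vifs, flagged)
```

With only two predictors the VIF reduces to 1 / (1 − r²), so the pair above (r ≈ 0.83) exceeds the threshold and one of the two would be dropped.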
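The transformations named above can be sketched as follows. Which transformation goes with which variable follows Tables 1 and 2, and the mobility constant of 60,000 is the value reported with the combined model formula. The function name and the input values, including the scale of the mobility index, are hypothetical.

```python
import math

MOBILITY_CONSTANT = 60_000  # constant reported in the source for the mobility index


def transform_predictors(mobility_index, hay_fever, solar_radiation, temperature):
    """Apply the sqrt, log10 and quadratic transformations to one day's values.

    The constant is added to the relative mobility trend first, so that
    negative index values do not break the square root.
    """
    return {
        "sqrt_mobility": math.sqrt(mobility_index + MOBILITY_CONSTANT),
        "log10_hay_fever": math.log10(hay_fever),
        "log10_solar_radiation": math.log10(solar_radiation),
        "temperature_squared": temperature ** 2,
    }


# Hypothetical daily values: a negative mobility trend, hay fever per 100,000,
# solar radiation in J/cm2 and temperature in degrees Celsius.
transformed = transform_predictors(-14204, 100.0, 1000.0, 15.0)
print(transformed)
```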

Results

Variables and their correlations

The sample sizes (N), means, and SDs of the independent variables as used in our multiple linear regression models are summarized in Table 1. The values are given for the data sets after the data transformations were applied.

Table 1

Overview means (M), standard deviations (SDs) and skewness values.

Variable	N	Mean	SD
Hay Fever	218	131	73.8
Log₁₀(Hay Fever)	218	2.06	0.215
Log₁₀(Pollen)	218	1.84	0.464
Log₁₀(Solar Radiation)	218	3.15	0.273
Log₁₀(Solar Radiation_7dma)	218	3.18	0.198
Temperature²	218	221	142
Dew point temperature	218	8.56	5.70
Sqrt(Mobility: Indoor recreation)	218	214	35.8
Sqrt(R_t)	218	1.03	0.163

Table 1: Overview of mean (M), and standard deviation (SD) per independent variable as used in the multiple linear regression models. The function Sqrt returns the square root of the variable.

During the allergy season, the factors that negatively correlate with Rt are, in order of strength: hay fever (r(218) = −0.65, p < 0.00001), solar radiation (r(218) = −0.63, p < 0.00001), pollen (r(218) = −0.62, p < 0.00001), and temperature (r(218) = −0.12, p = 0.085). Positively correlated with Rt are relative humidity (r(218) = 0.55, p < 0.00001) and the related dew point temperature (r(218) = 0.12, p = 0.082). Further, higher relative humidity is associated with rain or fog, and thus with reduced solar radiation and lower temperature. Temperature and solar radiation are associated as well, although only moderately strongly (r(218) = 0.39, p < 0.00001). Pollen and hay fever are, as expected, associated, although only moderately strongly (r(218) = 0.50, p < 0.00001). We did not add allergenicity weights to different pollen particles, and the pollen stations do not cover all types of allergenic particles such as, for example, mold spores. Therefore, having both data sets next to each other has added value, at least for our environmental model. Solar radiation is an important factor as it has, during allergy season, stimulating effects on pollen (r(218) = 0.40, p < 0.00001) and subsequently hay fever (r(218) = 0.40, p < 0.00001), in addition to its associations with temperature and Rt. The mobility places that are correlated with Rt are Indoor Recreation (r(218) = 0.761, p < 0.00001), Residential (r(218) = −0.684, p < 0.00001), Transit Stations (r(218) = 0.563, p < 0.00001), Workplaces (r(218) = 0.532, p < 0.00001), Grocery & Pharmacy (r(218) = 0.472, p < 0.00001), and, not significantly, Outdoor Recreation (r(218) = −0.048, p = 0.5).
Indoor Recreation and Residential are most strongly inversely correlated (r(218) = −0.817, p < 0.00001), and thus highly collinear (|r| > 0.8). Indoor Recreation has moderately strong positive correlations with all other mobility variables, and should therefore be seen as a representative of the mobility cluster. Temperature and dew point temperature had a high correlation of r(218) = 0.84 (p < 0.00001), and thus appear to be collinear. Although these variables have no significant standalone correlation with Rt, they still play a role in our combined and environmental models, probably because of their indirect effects on mobility and on pollen maturation and dispersion, which have opposite associations with Rt.

Outcomes combined model

After several iterations with stepwise backward multiple linear regression, four independent variables were selected from the combined pool of environmental and mobility variables that are both significant (p < 0.05) and have a VIF value below 2.5. These selected predictors are: temperature, solar radiation, hay fever, and Indoor Recreation (see Table 2). From the mobility datasets, Residential was significant as well but was deselected based on its very high multicollinearity with all other mobility variables, homoscedasticity concerns and lowered explanatory power. In other words, staying at home has a beneficial effect, but does not explain at which out-of-home location most COVID-19 infections occur. Without the hay fever data, the pollen data would have been significant, but using only the pollen data led to homoscedasticity concerns, which were fully mitigated when using the hay fever data instead.

Table 2

Multiple linear regression for mobility and environmental predictors.

	Coeff.	SE	t-stat	lower t_0.025(213)	upper t_0.975(213)	Stand. Coeff.	P	VIF
B	0.804	0.0961	8.37	0.615	0.994	0	<0.00001
Sqrt(Mobility: Indoor recreation)	0.00385	0.000174	22.1	0.0035	0.00419	0.842	<0.00001	2.48
Log₁₀(Hay Fever)	−0.132	0.0241	−5.46	−0.179	−0.084	−0.173	<0.00001	1.72
Log₁₀(Solar Radiation)	−0.0637	0.0201	−3.17	−0.103	−0.024	−0.106	0.00177	1.93
Temperature²	−0.000561	0.0000401	−14.0	−0.00063	−0.000482	−0.489	<0.00001	2.09

Table 2: Overview of outcomes per predictor after multiple linear regression for both mobility and environmental variables. Selection of predictors is based on being (highly) significant and having multicollinearity (VIF) score below 2.5. The function Sqrt returns the square root of the variable.

On the basis of the multiple linear regression test, we can reject the null hypothesis (H0) that our combined predictive model with the four selected factors does not provide a good fit: F(4, 213) = 374.2, p < 0.00001. R2 equals 0.875, which means that our predictors explain 87.5% of the variance of Rt. The adjusted R2 equals 0.873, and the coefficient of multiple correlation (R) equals 0.936. A simple Pearson correlation between our model's predicted and the observed values for Rt is equally strong and highly significant: r(218) = 0.996, p < 0.00001. It means that there is a strong, and highly significant, relationship between our combined model's predicted and the observed Rt of COVID-19 (see Fig. 1 and Fig. 2).
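As a plausibility check, the adjusted R2 and the F-statistic can be recomputed from the reported R2, the sample size (n = 218) and the number of predictors (k = 4) using the standard textbook relations. The sketch below is not part of the authors' analysis, and the function names are chosen for illustration. Because the reported R2 values are rounded to three decimals, small deviations from the reported F-values are to be expected.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for a model with k predictors fitted on n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2, n, k):
    """Overall F-statistic with (k, n - k - 1) degrees of freedom."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


N_DAYS, N_PREDICTORS = 218, 4

for label, r2 in (("combined", 0.875), ("environmental", 0.773)):
    adj = adjusted_r_squared(r2, N_DAYS, N_PREDICTORS)
    f_value = f_statistic(r2, N_DAYS, N_PREDICTORS)
    print(f"{label}: adjusted R2 = {adj:.3f}, F(4, 213) = {f_value:.1f}")
```

The recomputed adjusted R2 values round to the reported 0.873 and 0.769. The recomputed F for the environmental model rounds to the reported 181.3, while the combined model gives roughly 372.8 against the reported 374.2, a gap that is consistent with rounding of R2.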

Fig. 1

Scatter diagram predicted versus observed reproduction number. Fig. 1: The combined mobility and environmental model is superior, as its predictions explain 87.5% of the variance of the observed reproduction number of COVID-19 (Rt) during allergy season.

Fig. 2

Time series predicted versus observed reproduction number COVID-19. Fig. 2: The time series of the predicted versus the observed reproduction number of COVID-19 (Rt) in the Netherlands show the very good fit of both the combined and the environmental model during allergy season. However, the combined model predicts Rt even better. The seasonality effect in March is visible in both models.

With the coefficients from Table 2 and the square-root-transformed Rt from Table 1 as response, the combined predictive model's regression formula reads:

Sqrt(R̂t) = 0.804 + 0.00385 × Sqrt(M) − 0.132 × Log₁₀(HF) − 0.0637 × Log₁₀(SI) − 0.000561 × T²

where R̂t is the predicted effective reproduction number for COVID-19, M is the indexed mobility trend data for Indoor Recreation locations to which the mobility constant of 60,000 is added, HF is the hay fever incidence per 100K citizens, SI is the mean global solar radiation in J/cm2, and T is the mean temperature in degrees Celsius. In our dataset, the transformed variables only contain positive numbers.
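To make the combined model concrete, the sketch below evaluates it with the coefficients from Table 2. It assumes that the response variable is Sqrt(Rt), as listed in Table 1, so the prediction is squared to return to the Rt scale; this back-transformation and the function name are assumptions rather than something the text spells out. The example inputs are chosen to reproduce the transformed means from Table 1 and do not represent an actual day.

```python
import math


def predict_rt_combined(m, hf, si, t):
    """Predicted Rt from the combined model (coefficients from Table 2).

    m  : indexed Indoor Recreation mobility, with the constant 60,000 already added
    hf : hay fever incidence per 100K citizens
    si : mean global solar radiation in J/cm2
    t  : mean temperature in degrees Celsius
    Assumption: the regression response is sqrt(Rt), hence the final squaring.
    """
    sqrt_rt = (
        0.804
        + 0.00385 * math.sqrt(m)
        - 0.132 * math.log10(hf)
        - 0.0637 * math.log10(si)
        - 0.000561 * t ** 2
    )
    return sqrt_rt ** 2


# Inputs that reproduce the transformed means in Table 1 (illustrative only).
rt_at_means = predict_rt_combined(
    m=214.0 ** 2, hf=10 ** 2.06, si=10 ** 3.15, t=math.sqrt(221.0)
)
print(round(math.sqrt(rt_at_means), 3), round(rt_at_means, 3))
```

At these inputs the predicted Sqrt(Rt) is close to the mean Sqrt(Rt) of 1.03 in Table 1, as expected for a least-squares fit evaluated at the predictor means.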

Statistical outcomes environmental model

For the environmental model we excluded mobility data. Again, solar radiation and hay fever were selected as predictors of Rt. The pollen metric added explanatory power, and dew point temperature was selected at the expense of the collinear variable temperature (see Table 3). Relative humidity was again deselected as an insignificant predictor.

Table 3

Multiple linear regression for environmental predictors only.

	Coeff.	SE	t-stat	lower t_0.025(213)	upper t_0.975(213)	Stand. Coeff.	P	VIF
B	3.00	0.100	30.0	2.80	3.19	0	<0.00001
Log₁₀(Pollen)	−0.0587	0.0144	−4.08	−0.0870	−0.0303	−0.167	0.0000633	1.56
Log₁₀(Solar radiation _7dma)	−0.592	0.0370	−16.0	−0.664	−0.519	−0.717	<0.00001	1.89
Dew point temperature	0.00674	0.00109	6.19	0.00459	0.00888	0.235	<0.00001	1.35
Hay fever	−0.000262	0.0000903	−2.91	−0.000440	−0.0000844	−0.118	0.00405	1.56

Table 3: Overview of outcomes per selected environmental predictor after multiple linear regression. Selection of predictors is based on being (highly) significant and having multicollinearity (VIF) score below 2.5.

On the basis of the multiple linear regression test, we can reject the H0 that our environmental predictive model with the four selected factors does not provide a good fit: F(4, 213) = 181.3, p < 0.00001. R2 equals 0.773, which means that our environmental predictors explain 77.3% of the variance of Rt. The adjusted R2 equals 0.769, and the coefficient of multiple correlation (R) equals 0.879. It means that there is a very strong, direct and highly significant relation between our environmental model's predicted and the observed reproduction numbers of COVID-19. With the coefficients from Table 3 and the square-root-transformed Rt from Table 1 as response, the environmental model's regression formula reads:

Sqrt(R̂t) = 3.00 − 0.0587 × Log₁₀(SA) − 0.592 × Log₁₀(SI) + 0.00674 × Td − 0.000262 × HF

where R̂t is the predicted reproduction number for COVID-19, SA is the average seasonal allergens or pollen concentration in particles/m3, SI is the 7-day moving average of global solar (ir)radiation in J/cm2, Td is the average dew point temperature in degrees Celsius, and HF is the hay fever incidence per 100K citizens. In our dataset, the transformed variables only contain positive numbers.
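The environmental model can be evaluated in the same way from the coefficients in Table 3. As before, the sketch assumes that the response is Sqrt(Rt) (Table 1), so the result is squared; the function name is hypothetical and the inputs merely reproduce the Table 1 means. The second call doubles the solar radiation to show the size of that term.

```python
import math


def predict_rt_environmental(sa, si_7dma, td, hf):
    """Predicted Rt from the environmental model (coefficients from Table 3).

    sa      : mean pollen concentration in particles/m3
    si_7dma : 7-day moving average of global solar radiation in J/cm2
    td      : mean dew point temperature in degrees Celsius
    hf      : hay fever incidence per 100K citizens
    Assumption: the regression response is sqrt(Rt), hence the final squaring.
    """
    sqrt_rt = (
        3.00
        - 0.0587 * math.log10(sa)
        - 0.592 * math.log10(si_7dma)
        + 0.00674 * td
        - 0.000262 * hf
    )
    return sqrt_rt ** 2


# Inputs that reproduce the means in Table 1 (illustrative only).
mean_inputs = dict(sa=10 ** 1.84, si_7dma=10 ** 3.18, td=8.56, hf=131.0)
rt_env_at_means = predict_rt_environmental(**mean_inputs)

# Doubling solar radiation lowers sqrt(Rt) by 0.592 * log10(2), all else equal.
doubled = dict(mean_inputs, si_7dma=2 * mean_inputs["si_7dma"])
rt_env_doubled_sun = predict_rt_environmental(**doubled)
print(round(rt_env_at_means, 3), round(rt_env_doubled_sun, 3))
```

Because solar radiation enters on a log10 scale, each doubling reduces the predicted Sqrt(Rt) by the same amount of about 0.18, regardless of the starting level.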

Discussion

The predictive power of the combined environmental-mobility model including solar radiation, hay fever, temperature and visits to Indoor Recreation locations (87.5%) surpasses the environmental model (77.3%) with more than 10%. Furthermore, the improved accuracy of the combined model shows that adding mobility trends not only helps to control the environmental model for lockdown effects, but also clearly improves it by helping to show the importance of seasonal behavior better. For example, nice weather (sun shine, warmth, low humidity) in The Netherlands is related to higher pollen concentrations, and more visits to crowded non-residential locations where social distancing is hard to apply. The latter is in turn associated with increased COVID-19 infections. Interestingly, increased visits to Outdoor Recreation locations are not associated with an increase in COVID-19 infections (Rt). This finding suggests that outdoor transmission of SARS-CoV-2 is far less likely than indoor transmission, and that restrictive policies that limit visiting Outdoor Recreation locations have less added value. Although, overall, the environmental model is weaker than the combined model, it is still somewhat better at the onset of COVID-19 during February and March 2020. This is probably explained by the exclusion from the mobility data the visits to ski holiday locations abroad, in Italy and Austria, where many of the first patients contracted COVID-19, which leads to an underestimation of both the Indoor Recreation and Outdoor Recreation trend. On the other hand, the combined model is somewhat better in July when lockdown restrictions were relaxed and people were less strict, which is caught well by the mobility trends variable. Both models are almost equally strong in predicting the seasonal decline in March/April, which indicates that the relative importance of restrictive measures was probably not the main driver of that particular decline, but the seasonality effect was. 
Of the non-residential locations, especially Indoor Recreation is by far the best predictor of increasing COVID-19 infections (Rt), which makes sense as social distancing in busy shopping locations, bars, discos and other such locations, is hard to maintain. Especially, when the seasonality effects are offset by relaxed lockdown measures and social distancing discipline. Even more, if people are under the influence of alcohol and party drugs in crowded party locations, social distancing becomes a distant reality. Additionally, the strong inverse correlation of Residential with Rt, shows that staying at home, because of lockdown measures, is effective. That all other indoor locations have a positive correlation with Rt, shows basically the same: when lockdown measures are relaxed, infection rates increase as people will meet more other people. The single effect of high temperature on Rt appears to be not significant. The role of temperature can be understood only when its associations with other variables such as mobility trends and pollen maturation and dispersion are taken into account. Humidity in general, relative or specific (Td), appears to be positively associated to COVID-19 reproduction, as it is associated with reduced solar radiation and seasonal allergens, and more traffic to indoor locations which are associated with an increase in infections. Even despite observations that, indoors, very dry air, with a low absolute humidity, might favor SARS-CoV-2 transmission, which is likely caused by increased aerosolisation of infectious aqueous particles. It appears that using Rt as dependent variable instead of crude incidence, is a good approach to resolve the inconsistencies that are found in literature regarding especially humidity and COVID-19 (Byun et al., 2021). 
For example, the observation that the rainy season in tropical countries coincides with an uptick in flu-like incidence fits our environmental model: the higher outdoor humidity during the rainy season is associated with less solar radiation (and UV) and less pollen, which in turn explains the increase in incidence of viral respiratory diseases. Finally, although we assume that day length is already covered by the solar radiation variable, it might still be interesting to examine whether this solar-related variable could add to the predictive power of our models as well. Given that the same environmental factors and mobility are identified as predictors of COVID-19 spread in other countries, we hypothesize that our models would have a similar predictive power in other countries in the temperate climate zone with highly similar seasonality patterns. However, as this generalization was outside the scope of our study, further study is needed to test this hypothesis (Hoogeveen and Hoogeveen, 2021). Further, it can be argued that during flu-like season, outside allergy season, a model with solar radiation, mobility to indoor entertainment locations, and humidity (dew point temperature or specific humidity and relative humidity), but without the allergens or hay fever factor, can be determined with high predictive power. This is demonstrated in a recent Lombardian study focusing on a combination of UV radiation and mobility trends, which jointly account for 82.6%–85.5% of the variance of Rt (Falzone et al., 2022), which is only slightly lower than the outcome of our combined environmental and mobility model. It is good to note that UV radiation is collinear with our solar radiation factor.

Methodological concerns

Test bias, especially for new viral diseases such as COVID-19, is a major methodological challenge. The approach of using more reliable metrics, such as the number of hospitalizations, to generate the Rt metric appears to be a good method to reduce test bias. However, the change of methodology in June 2020, when more test stations were included with their fluctuating test capacities, most likely introduced test bias into the Rt metric. Such reliability concerns may have reduced the predictive power of our combined and environmental models. The usefulness of the pollen concentration metric might be improved by taking into account the allergenicity per particle type. The allergenicity classification is available, but it is not on a ratio scale and there are discussions about the accuracy of this classification. Furthermore, other allergenic particles, like mold spores, are hardly ever covered by European pollen stations because of budget constraints. We observed that the Indoor Recreation and Outdoor Recreation metrics might need to be expanded to holiday locations in foreign countries. Unfortunately, that is currently not possible via the Google Mobility datasets. A solution could be to make use of data about airport travelers. Further, it would be of interest to understand whether the predictive power of our models could be improved by using a single, consolidated parameter for human-to-human contacts. In future research, a comparison could be made between the mentioned “commercial trade” substitute (Bontempi, 2020) and a consolidated factor that combines all traffic to out-of-home indoor locations, all correlating positively with Rt. In our research we excluded the period of intensive vaccination from January 2021 onward. Given that it is widely observed that the protective immunity from vaccinations or infections is short-lasting, we are still confronted with resurgences of COVID-19 (Edridge et al., 2020).
A comparative Italian study concludes that “the COVID-19 pandemic is driven by seasonality and environmental factors that reduce the negative effects in the summer period, regardless control measures and/or the vaccination campaign” (Coccia, 2022). Nevertheless, it would be of interest to test our predictive models for vaccination events. In any case, it is likely that new waves will be less intense given the longer-lasting B-cell and T-cell memory of people who have already been infected or vaccinated. Therefore, it might be good to also control for herd immunity levels when testing the predictive models for subsequent allergy seasons. Additionally, it might be of interest to differentiate the Rt per virus variant, given that genetic drift typically leads to more contagious but less deadly variants that change the dynamics of COVID-19. Finally, testing the predictive models for a wider geographical scope would be of interest, but would require metrics that are not widely available, such as standardized metrics for Rt, hay fever incidence, and pollen datasets.
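The single, consolidated parameter for human-to-human contacts raised earlier in this section could be built in several ways, and the paper does not specify one. As a minimal sketch under stated assumptions, the example below standardizes each out-of-home indoor mobility series (z-scores) and averages them per day into one index. The category names, the toy values and the equal weighting are all hypothetical choices for illustration, not data or methods from this study.

```python
from statistics import mean, pstdev


def zscores(series):
    """Standardize one series to mean 0 and (population) standard deviation 1."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A constant series carries no trend information.
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def consolidated_index(trends):
    """Equal-weight daily average of the standardized mobility series."""
    lengths = {len(values) for values in trends.values()}
    if len(lengths) != 1:
        raise ValueError("all mobility series must cover the same days")
    standardized = [zscores(values) for values in trends.values()]
    # zip(*...) regroups the series so that each tuple holds one day's values.
    return [mean(day) for day in zip(*standardized)]


# Toy percent-change-from-baseline values, invented for illustration only.
toy_trends = {
    "indoor_recreation": [-40.0, -30.0, -10.0, 0.0],
    "grocery_pharmacy": [-10.0, -5.0, 0.0, 5.0],
    "workplaces": [-50.0, -40.0, -30.0, -20.0],
}

index = consolidated_index(toy_trends)
print([round(value, 2) for value in index])
```

Standardizing before averaging keeps a category with large swings from dominating the index. Whether such an index predicts Rt better than the Indoor Recreation trend alone would still have to be tested on the actual data.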

Conclusion

The combined mobility and environmental model explains 87.5% of the variance of Rt of COVID-19 during the spring season in a country in the temperate climate zone like the Netherlands, and provides a very good fit (F(4, 213) = 374.2, p < 0.00001), as the predicted and observed Rt correlate strongly and highly significantly. The significant predictors in the combined model are temperature, solar radiation, hay fever incidence, and the Indoor Recreation trend. The environmental factors are inversely associated with Rt. On the other hand, more visits to Indoor Recreation locations are associated with more infections (Rt). This seems to be the best mobility predictor for the effects of lockdown measures on the spread (Rt) of COVID-19. On the other side of the spectrum, visits to Outdoor Recreation locations are not significantly associated with changes in Rt, and including such locations in lockdown regimes appears to be ineffective. The solely environmental model explains around 10 percentage points less of the variance than the combined model. Nevertheless, the environmental model shows that pollen concentrations, and dew point temperature as a collinear of temperature, have an added explanatory value. Further, there are short periods in which the environmental model outperforms the combined model.

Credit author statement

Martijn Hoogeveen: Conceptualization, Methodology, Data curation, Formal analysis, Writing, Investigation, Visualization, Resources. Ellen Hoogeveen: Methodology, Writing, Resources. Aloys Kroes: Conceptualization, Reviewing, Data curation.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Data statement

The links to downloadable datasets are provided in the reference list or references to the sources are included. Upon request, the data used for this manuscript is available for inspection, but for other purposes we kindly refer to the respective copyright-holder(s).

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
