A novel analysis of COVID 19 risk in India incorporating climatic and socioeconomic Factors.

Srinidhi Jha¹, Manish Kumar Goyal¹, Brij Gupta²,³, Anil Kumar Gupta⁴.

Abstract

This study investigates the influence of climate variables (pressure, relative humidity, temperature and wind speed) on COVID 19 risk at rural, urban and total (rural and urban) population scales in 623 pandemic-affected districts of India, incorporating socioeconomic vulnerability factors. We employed nonstationary extreme value analysis to model the different quantiles of cumulative COVID 19 cases in the districts, using climatic factors as covariates. Wind speed was the most dominant climatic factor in the evolution of the cases, followed by relative humidity, pressure and temperature. The results reveal that stationarity, i.e., COVID 19 cases independent of pressure, relative humidity, temperature and wind speed, existed in only 148 (23.7%) of the 623 districts, whereas strong nonstationarity, i.e., climate dependence, was detected in the cases of 474 (76.08%) districts. At the rural, urban and total population scales, 334 (53.6%), 200 (32.1%) and 336 (53.9%) of the 623 districts, respectively, were at high risk (or above). Nineteen out of 35 states were observed to be under high (or above) risk, with Kerala, Maharashtra, Goa and Delhi being the most at-risk ones. The study provides high-risk maps of the COVID 19 pandemic at the district level and is aimed at helping decision-makers identify the climatic and socioeconomic factors that augment the risks.

Keywords: COVID 19; Climate; India; Nonstationary analysis; Risk; Socioeconomic

Year: 2021 PMID: 33642623 PMCID: PMC7894130 DOI: 10.1016/j.techfore.2021.120679

Source DB: PubMed Journal: Technol Forecast Soc Change ISSN: 0040-1625

Introduction

Although several studies have been carried out to understand the environment-pandemic relationship, there is no consensus in the research community about the spread of COVID 19 and its relationship with climatic factors. For example, Wu et al. (2020), studying the records of 166 countries, suggested that an increase in temperature and humidity may partially limit the COVID 19 pandemic. In contrast, Zhu and Xie (2020), investigating the cases of 122 cities in China, suggested that a 1-degree Celsius rise in the average temperature was associated with a 4.9% rise in daily confirmed cases. They added that there was no strong evidence of COVID 19 cases declining with possible warming in the weather. However, earlier studies show that the spread of disease caused by the ‘severe acute respiratory syndrome’ (SARS) family of viruses was related to air pollution, temperature, relative humidity and other meteorological and environmental factors (Bao et al., 2016; Cui et al., 2003; Lin et al., 2006). An important study over China indicated that COVID-19 incidence decreases as temperature increases (Shi et al., 2020). Recent studies have shown that COVID 19 is associated with extreme climate as well as local factors in India. For instance, Sasikumar et al. (2020) showed that COVID 19 cases were clustered around high-temperature zones, relating them to the warming conditions in South Asia (India). Similarly, Gupta et al. (2020) suggested that comparatively hot and dry regions at lower altitudes of the Indian territory are more prone to COVID-19 transmission. These studies, in combination, indicate that the interrelationship of socioeconomic and climatic factors with COVID 19 still needs to be explored for better risk assessment, preparedness and prevention (Sedik et al., 2021).
The risk of communicable diseases, in general, is known to be affected primarily by factors such as the carrier of transmission, the host and the surroundings (Lin et al., 2006). COVID 19, now a pandemic, is a risk to populations all around the globe with varying socioeconomic characteristics and climatic exposure (Masud et al., 2020; Wang et al., 2020; Dwivedi et al., 2021). India, which is the most populous country on earth after China, faces a considerable risk of damage. The first COVID 19 case in India was reported in the state of Kerala in January 2020 (Rawat, 2020). The number of confirmed cases increased from 1000 to 20,000 within a short span of 20 days. Some of the urgent steps which the government took were boosting the efficiency of healthcare and management, lockdowns to enforce social distancing, recommending home quarantine for suspected cases, home delivery of essential services, and efficient surveillance and tracing (Ghoneim et al., 2018; Dorgham et al., 2018; AlZu'bi et al., 2020). However, despite these efforts, as of 14th June 2020 the total number of confirmed COVID 19 cases had increased exponentially to 3,20,922. It is essential to understand that the fate of the COVID 19 pandemic depends on the progression of the disease in countries like India. Moreover, India's exposure to climate change and inadequate socioeconomic capacity to fight a disaster make it more susceptible to the pandemic situation (Ali et al., 2019; Das et al., 2020a; Jha et al., 2020, 2019b; Sathaye et al., 2012; Sinha et al., 2019). Therefore, understanding the complexities of the relationship between the pandemic and climatic and socioeconomic factors is necessary for a more comprehensive assessment of the risks (Anees et al., 2020; Sam et al., 2020; Vittal et al., 2020; Kumar et al., 2021).
An inclusive framework which could incorporate the impact of climatic as well as socioeconomic factors on the pandemic will help to formulate practical risk assessment, reduction and mitigation strategies. Most of the recent studies on the statistical dependence of COVID 19 risk on natural and anthropogenic factors have been done by investigating the characteristics of the involved variables. It is also rare to find any study which incorporates the time-varying probabilistic characteristics of COVID 19 cases. The association of the pandemic with the environmental and socioeconomic factors is not only intricate but also varies in time and space. For instance, the climatic variables themselves vary in time and space; therefore, their relationship with the exponentially evolving COVID cases also tends to vary. Many studies in different areas have shown that time-varying probabilistic models often prove to be more productive in comparison to their stationary counterparts (Das et al., 2020b; Jha et al., 2020; Ragno et al., 2019; Song et al., 2020). In this study, we utilize a nonstationary extreme value model to estimate the COVID 19 risk in India. The major objectives of this study can be summarized as: (i) to assess the association of climatic factors (pressure, relative humidity, temperature and wind speed) in augmenting the COVID 19 risks in India at rural, urban and total (rural and urban) population scales; (ii) to estimate vulnerability and exposure elements by assessing the socioeconomic condition of the population; and (iii) to prepare a high-resolution (district-wise) map of the pandemic risk by combining the nonstationary COVID 19 hazard measure with the vulnerability and exposure elements. The present study can help policy makers identify possible high-risk hotspots of COVID 19-like pandemics as well as understand the possible reasons governing the risks.
The current study provides a simple, computationally efficient, data intensive and feasible mechanism to perform risk analysis at several scales. This can help policy makers decide the response measures at different administrative division scales and identify the natural as well as human factors to monitor in order to minimize the risk.

Study area and data

Study area

The current study is performed for the 623 COVID 19 affected districts of India as of June 14th, 2020. Districts are the smallest administrative units of the country and also the base units of the COVID 19 response hierarchy. According to the last available census, the 2011 Census of India, there are 35 states and union territories in the country. We performed the analysis at the district level so that the conclusions drawn from the risk analysis could be implemented at the ground level. Please refer to the map in Fig. 1 for details of the district and state boundaries. The administrative boundaries have been obtained from the global administrative boundaries (GADM) database, which has been used in several studies (Hijmans et al., 2011; Kugler et al., 2015).

Fig. 1

District and state boundaries. The abbreviations of the regions are as follows: Andaman and Nicobar-AN-UT, Andhra Pradesh-AP, Arunachal Pradesh-AR, Assam-AS, Bihar-BR, Chandigarh-CH-UT, Chhattisgarh-CG, Dadra and Nagar Haveli-DH-UT, Daman and Diu-DD-UT, Delhi-DL-UT, Goa-GA, Gujarat-GJ, Haryana-HR, Himachal Pradesh-HP, Jammu and Kashmir-JnK, Jharkhand-JH, Karnataka-KA, Kerala-KL, Lakshadweep-LD-UT, Madhya Pradesh-MP, Maharashtra-MH, Manipur-MN, Meghalaya-ML, Mizoram-MZ, Nagaland-NL, Orissa-OR, Puducherry-PY-UT, Punjab-PB, Rajasthan-RJ, Sikkim-SK, Tamil Nadu-TN, Tripura-TR, Uttar Pradesh-UP, Uttaranchal-UK, West Bengal-WB.

District-wise confirmed COVID 19 cases data

Several websites, maintained primarily by governmental or non-governmental organizations, provide updated records of COVID 19 cases in India. For this study, we obtained the daily confirmed district-wise COVID 19 cases for the 103-day period from March 4th, 2020 to June 14th, 2020, from the website https://howindialives.com/gram/metrics.php. The data sets have been prepared by collecting information from various central and state government sources, and were verified before being used in this study.

Climate data

The climate data were collected from the NCEP/NCAR Reanalysis project data set, which comprises several meteorological data sets from 1948 to the present (Kalnay et al., 1996). The NCEP reanalysis data sets are the result of a comprehensive project designed to support the climate research, monitoring and investigation communities. The dataset continues to be updated with real-time data, and gridded output can be downloaded from the website https://www.psl.noaa.gov/data/gridded/. This data set has been extensively compared with several other meteorological observations to formulate prediction systems and has been found to be adequate for climate studies (Bonaccorso et al., 2003; Kanamitsu et al., 2002; Sachindra et al., 2014; Sillmann et al., 2013). We obtained the daily near-surface data of pressure, relative humidity, temperature and wind speed for 103 days (March 4th, 2020, to June 14th, 2020). These data, from the available coarse resolution of 2.5-degree latitude x 2.5-degree longitude, were re-gridded to a 0.5-degree latitude x 0.5-degree longitude grid and then extracted according to the Indian administrative boundaries. The re-gridding was performed using the inverse distance weighted average (IDWA, hereafter IDW) method (Snell et al., 2000). The advantage of using IDW is that it is simple, easy to comprehend and efficient in modeling data which do not have outliers (Wu and Hung, 2016). Outliers are unlikely in the climate data analyzed here, as data of very short duration have been used. Therefore, IDW was utilized to keep the local spatial characteristics of the variables intact, as this method interpolates the data based on the magnitude of its nearest neighbors. Once the climate data were re-gridded, district-wise average time series were obtained.
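The re-gridding step can be illustrated with a minimal sketch of inverse distance weighting. This is not the authors' implementation: the distance exponent of 2, the use of plain Euclidean distance in degrees, the function name `idw_interpolate` and the four temperature values are all assumptions made for illustration only.

```python
import math


def idw_interpolate(points, target, power=2.0):
    """Inverse-distance-weighted estimate at `target` from (lat, lon, value) points.

    Assumption: Euclidean distance in degrees and weight = 1 / distance**power.
    """
    num = 0.0
    den = 0.0
    for lat, lon, value in points:
        dist = math.hypot(lat - target[0], lon - target[1])
        if dist == 0.0:
            return value  # target coincides with a source grid point
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den


# Four corners of one hypothetical 2.5-degree cell with made-up temperatures (deg C).
coarse = [
    (20.0, 75.0, 30.0),
    (20.0, 77.5, 32.0),
    (22.5, 75.0, 28.0),
    (22.5, 77.5, 30.0),
]

# Re-grid the cell onto 0.5-degree nodes (6 x 6 nodes including the corners).
fine = {}
for i in range(6):
    for j in range(6):
        node = (20.0 + 0.5 * i, 75.0 + 0.5 * j)
        fine[node] = idw_interpolate(coarse, node)
```

Because every interpolated value is a weighted average of its neighbors, the fine grid never leaves the range of the coarse values, which is consistent with the stated aim of preserving local spatial characteristics.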

Census data for household population and socioeconomic indicator

In this study, we considered the last available census data, i.e. the Census of India 2011, to estimate the exposure and vulnerability of the rural, urban and total population to the COVID 19 pandemic in different districts of the country. The dataset was obtained from the official Census website (http://censusindia.gov.in/) of the Government of India. The exposure to the pandemic was estimated by considering the old (>65 years) and child (0–14 years) populations of each district according to the census records. Further, irrespective of age, we also considered the ‘other working’ population, assuming that the eventual easing of the lockdown restrictions will enhance the risks. Here, the ‘other working’ population implies the number of people working in sectors other than cultivation, agriculture or household industry. The vulnerability to COVID 19 was estimated by considering significant factors which are crucial in managing the pandemic risk. We assumed that spreading awareness about COVID 19 among the masses depends on the literacy level of the population. Thus, the total illiterate population in a district was considered as one of the measures of vulnerability. Similarly, regular cleaning and washing, one of the critical measures to protect oneself, depends on the availability of potable water in the household. Therefore, we quantified the number of households in each district which do not have freshwater availability in the premises. Further, one more parameter, i.e. the number of households with no electricity and sanitation facility, was added to enhance the accuracy of the vulnerability analysis.

Methodology

The methodology for COVID 19 risk estimation revolves around the nonstationary extreme value modeling of the confirmed cases in all districts of the country. The nonstationary extreme value analysis was chosen considering the time-varying extreme nature of the confirmed cases time series. In this study, we fit the extreme value distribution using the linear combination of climate variables (temperature, pressure, relative humidity and wind speed) as possible covariates. The suitability of nonstationary and stationary distribution fits was checked, and then the probabilities of exceedance at different quantiles were calculated. The average of these probabilities served as a measure of COVID 19 hazard, which was eventually combined with the vulnerability and exposure measures to get the risk estimates. Fig. 2 shows the flowchart of the methodology implemented during this study.

Fig. 2

Methodological flowchart.

Nonstationary extreme value modeling

In this study, we considered the probability distribution of confirmed COVID 19 cases to be the Generalized Extreme Value (G.E.V.) distribution. For the sake of simplicity, the cases were modeled using the G.E.V. distribution as continuous random variables. The cumulative distribution function (C.D.F.) of the G.E.V. distribution is given by Eq. (1):

$$F(x;\mu,\sigma,\xi)=\exp\left\{-\left[1+\xi\left(\frac{x-\mu}{\sigma}\right)\right]^{-1/\xi}\right\},\qquad 1+\xi\left(\frac{x-\mu}{\sigma}\right)>0 \tag{1}$$

Here, x is the time series of COVID 19 confirmed cases, and μ, σ and ξ are the location, scale and shape parameters of the G.E.V. probability distribution, respectively. Therefore, F(x; μ, σ, ξ) denotes the C.D.F. of the time series x with parameters μ, σ and ξ. Two scenarios of extreme value modeling were used. The first was the stationary case, in which the parameters of the G.E.V. distribution were considered constant. In the second, nonstationary case, the parameters of the G.E.V. distribution were considered to vary with time and could be explained by linear combinations of climatic covariates. In other words, we related the climatic influence to the COVID 19 cases by modeling the G.E.V. distribution parameters as linear combinations of the climate covariates. It should be noted that nonstationarity is introduced only in the location and scale parameters of the G.E.V. distribution, as it is difficult to model the shape parameter in a nonstationary setting (Coles, 2001; Jha et al., 2020; Yilmaz and Perera, 2014). Therefore, we kept the shape parameter constant to avoid complexity in modeling. In total, 29 linear combinations of the location and the scale parameters were formulated using the climatic covariates of pressure, relative humidity, temperature and wind speed (termed C1, C2, C3 and C4, respectively, in the models). The combinations can be understood from the models named M0, M1, M2 … as shown in Table 1 in the supplementary information. Here, M0 is the stationary model, in which the values of the distribution parameters are constant and therefore independent of the climatic influence. A few examples of nonstationary G.E.V. models are given as follows:

Table 1

Description of the models used in the present study.

Model ID	Description
M0	X∼GEV[μ,σ,ξ]
M1	X∼GEV[(μ0+μ1C1),σ,ξ]
M2	X∼GEV[(μ0+μ2C2),σ,ξ]
M3	X∼GEV[(μ0+μ3C3),σ,ξ]
M4	X∼GEV[(μ0+μ4C4),σ,ξ]
M5	X∼GEV[(μ0+μ1C1+μ2C2),σ,ξ]
M6	X∼GEV[(μ0+μ2C2+μ3C3),σ,ξ]
M7	X∼GEV[(μ0+μ3C3+μ4C4),σ,ξ]
M8	X∼GEV[(μ0+μ4C4+μ2C2),σ,ξ]
M9	X∼GEV[(μ0+μ4C4+μ1C1),σ,ξ]
M10	X∼GEV[(μ0+μ1C1+μ2C2+μ3C3),σ,ξ]
M11	X∼GEV[(μ0+μ1C1+μ2C2+μ4C4),σ,ξ]
M12	X∼GEV[(μ0+μ1C1+μ3C3+μ4C4),σ,ξ]
M13	X∼GEV[(μ0+μ2C2+μ3C3+μ4C4),σ,ξ]
M14	X∼GEV[(μ0+μ1C1+μ2C2+μ3C3+μ4C4),σ,ξ]
M15	X∼GEV[(μ0+μ1C1),(σ0+σ1C1),ξ]
M16	X∼GEV[(μ0+μ2C2),(σ0+σ2C2),ξ]
M17	X∼GEV[(μ0+μ3C3),(σ0+σ3C3),ξ]
M18	X∼GEV[(μ0+μ4C4),(σ0+σ4C4),ξ]
M19	X∼GEV[(μ0+μ1C1+μ2C2),(σ0+σ1C1+σ2C2),ξ]
M20	X∼GEV[(μ0+μ2C2+μ3C3),(σ0+σ2C2+σ3C3),ξ]
M21	X∼GEV[(μ0+μ3C3+μ4C4),(σ0+σ3C3+σ4C4),ξ]
M22	X∼GEV[(μ0+μ4C4+μ2C2),(σ0+σ4C4+σ2C2),ξ]
M23	X∼GEV[(μ0+μ4C4+μ1C1),(σ0+σ4C4+σ1C1),ξ]
M24	X∼GEV[(μ0+μ1C1+μ2C2+μ3C3),(σ0+σ1C1+σ2C2+σ3C3),ξ]
M25	X∼GEV[(μ0+μ1C1+μ2C2+μ4C4),(σ0+σ1C1+σ2C2+σ4C4),ξ]
M26	X∼GEV[(μ0+μ1C1+μ3C3+μ4C4),(σ0+σ1C1+σ3C3+σ4C4),ξ]
M27	X∼GEV[(μ0+μ2C2+μ3C3+μ4C4),(σ0+σ2C2+σ3C3+σ4C4),ξ]
M28	X∼GEV[(μ0+μ1C1+μ2C2+μ3C3+μ4C4),(σ0+σ1C1+σ2C2+σ3C3+σ4C4),ξ]

Here, C1, C2, C3 and C4 denote the climatic covariates pressure, relative humidity, temperature and wind speed, respectively.

Here, in model M1, written as X∼GEV[(μ0+μ1C1),σ,ξ] in Eq. (2), μ1 defines the trend in the location parameter through the physical covariate C1, which is the pressure time series. Similarly, in model M19, X∼GEV[(μ0+μ1C1+μ2C2),(σ0+σ1C1+σ2C2),ξ], C1 and C2, denoting pressure and relative humidity, have been used to describe the location and scale parameters of the COVID 19 confirmed cases time series. The significance of all 29 models can be understood following a similar concept.
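To make the model notation concrete, the following minimal sketch evaluates the G.E.V. distribution function with a location parameter that depends linearly on one covariate, as in model M1 of Table 1. It is an illustration under stated assumptions, not the authors' code: the function names and all coefficient values are hypothetical, and the covariate is assumed to be standardized.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """G.E.V. distribution function: exp(-[1 + xi * (x - mu) / sigma] ** (-1 / xi))."""
    if sigma <= 0.0:
        raise ValueError("scale parameter must be positive")
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-z))  # Gumbel limit as xi -> 0
    t = 1.0 + xi * z
    if t <= 0.0:
        # x lies outside the support: below it when xi > 0, above it when xi < 0
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def m1_cdf(x, c1, mu0, mu1, sigma, xi):
    """Model M1: location = mu0 + mu1 * C1, with constant scale and shape."""
    return gev_cdf(x, mu0 + mu1 * c1, sigma, xi)


# Hypothetical coefficients and standardized pressure values, for illustration only.
mu0, mu1, sigma, xi = 100.0, 20.0, 30.0, 0.1
for c1 in (-1.0, 0.0, 1.0):
    print(c1, m1_cdf(150.0, c1, mu0, mu1, sigma, xi))
```

With a positive `mu1`, a larger covariate value shifts the whole distribution upward, so the probability of staying below a fixed case count falls. The stationary model M0 is recovered by setting `mu1` to zero.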

Best-fit model and parameters

As discussed in the previous section, the stationary and nonstationary distributions were fit utilizing different covariate combinations. The significance of the stationary and nonstationary models was checked by employing the likelihood ratio test (L.R. test) (Coles, 2001), which was used to select between the two model types, i.e., stationary and nonstationary G.E.V. models. This selection is based on the statistic obtained using the following equation:

$$D = 2\left(nllh_{S} - nllh_{NS}\right)$$

where nllh_S is the negative log likelihood of the stationary model and nllh_NS is the negative log likelihood of the nonstationary model. Also,

$$D > c_{\alpha}$$

where c_α is the (1 − α) quantile of the Chi-square distribution, with degrees of freedom equal to the number of additional parameters in the nonstationary model. Under the null hypothesis of stationarity, the difference between the stationary and nonstationary models is expected to follow an approximate chi-squared distribution; a significance level of 5% was used in this study. Further, if nonstationarity is found, the best nonstationary model is obtained by examining the p-value of the chi-squared distribution. The null hypothesis of stationarity is rejected when the p-value is less than 0.05. The L.R. test was done for all 28 nonstationary model combinations. The parameters of the G.E.V. models were estimated using the maximum likelihood estimation (MLE) method. The MLE method is very popular and straightforward compared to other parameter estimation methods, such as the L-moments method, in the context of nonstationary extreme value modeling (Katz et al., 2013). The parameters were estimated using the maximum likelihood function because this method is capable of incorporating nonstationarity into the distribution parameters:

$$-\ln L(\theta) = -\sum_{i=1}^{n}\ln f(x_i;\theta)$$

Here, L(θ) is the likelihood function of a particular parameter vector θ, f is the G.E.V. probability density corresponding to Eq. (1), and n is the sample size. By minimizing the above function, the distribution parameters for both the nonstationary and stationary cases were obtained.
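The likelihood ratio comparison described above can be sketched as follows. This is a hedged illustration rather than the authors' procedure: it assumes the two negative log-likelihoods have already been obtained from a fitting step, the numeric values are invented, and the names `chi2_sf` and `lr_test` are hypothetical. The chi-squared tail probability is computed from its closed-form series for integer degrees of freedom so that no external library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(chi-squared with df degrees of freedom > x), integer df >= 1."""
    if df < 1 or df != int(df):
        raise ValueError("df must be a positive integer")
    df = int(df)
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) * sum_r x^(r - 1/2) / (1*3*...*(2r-1))
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


def lr_test(nllh_stationary, nllh_nonstationary, extra_params, alpha=0.05):
    """Return (statistic, p_value, reject_stationarity) for two nested models."""
    stat = 2.0 * (nllh_stationary - nllh_nonstationary)
    p_value = chi2_sf(stat, extra_params)
    return stat, p_value, p_value < alpha


# Made-up negative log-likelihoods for M0 and M1 in one hypothetical district.
# M1 adds one parameter (the covariate coefficient in the location) relative to M0.
stat, p_value, reject = lr_test(512.4, 507.1, extra_params=1)
print(stat, p_value, reject)
```

In this invented example the nonstationary model lowers the negative log-likelihood enough for stationarity to be rejected at the 5% level, so the district would be classed as climate dependent.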

Risk index estimation

Once the best-fit model and covariate combination were obtained, we computed the exceedance probabilities of different quantiles (25th, 50th, 75th and 95th) of COVID 19 cases. This was done in order to obtain a measure of hazard in terms of probabilities, which could later be combined with the vulnerability and exposure measures for the estimation of risk. Based on the Intergovernmental Panel on Climate Change (IPCC) recommendations, the risk based on a hazard measure can be estimated as

$$\text{Risk} = \text{Hazard} \times \text{Exposure} \times \text{Vulnerability}$$

Based on the given formula, the risk induced by COVID 19 was calculated for each district. The terms entering the risk formula used in this study are: the district-wise COVID 19 risk index value; the measure of hazard, i.e., the average probability of the different COVID 19 confirmed cases quantiles; three measures of exposure, i.e., the number of children per household, the number of old persons per household and the number of other working persons per household; and three measures of vulnerability, i.e., the number of illiterate persons per household, the density of households with no electricity and sanitation facilities, and the density of households with no fresh water availability in the premises. The risk index value was calculated for each district and then classified into five classes: low (0–0.25), moderate (0.25–0.50), high (0.50–0.75), very high (0.75–1) and extreme (>1) risk. The district values were later grouped for the 35 states and union territories.
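The hazard averaging and the five-class grouping described above can be sketched in a few lines. This is an illustrative sketch only: the exceedance probabilities are made-up numbers, the function names are hypothetical, and the treatment of values falling exactly on a class boundary (assigned here to the lower class) is an assumption, since the text does not specify it.

```python
# Upper bounds of the first four risk classes; anything above 1 is "extreme".
RISK_CLASSES = [
    (0.25, "low"),
    (0.50, "moderate"),
    (0.75, "high"),
    (1.00, "very high"),
]


def hazard_measure(exceedance_probs):
    """Average of the exceedance probabilities at the chosen quantiles."""
    if not exceedance_probs:
        raise ValueError("at least one probability is required")
    return sum(exceedance_probs) / len(exceedance_probs)


def classify_risk(risk_index):
    """Map a risk index value to one of the five classes (boundary -> lower class)."""
    if risk_index < 0.0:
        raise ValueError("risk index must be non-negative")
    for upper, label in RISK_CLASSES:
        if risk_index <= upper:
            return label
    return "extreme"


# Made-up exceedance probabilities at the 25th, 50th, 75th and 95th quantiles.
hazard = hazard_measure([0.75, 0.50, 0.25, 0.05])
print(hazard, classify_risk(0.62))
```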

Results and discussion

Fig. 3 shows the district-wise distribution of confirmed COVID 19 cases across India. It was observed from the analysis that only 17 out of the 640 selected districts had no cases until the last considered date of 14th June 2020. Forty-five districts, mostly in the north-east or upper Himalayan region, had fewer than 50 cases, whereas 417 districts had more than 1000 cumulative confirmed cases. Southern and western India had the largest number of cases. As of 14th June, more than 10,000 cumulative confirmed cases had been registered in 82 districts in the country. The central and northern parts of the country had relatively fewer cases; however, this may be a result of poor testing facilities. It should be noted that the districts with zero cases were excluded from further analysis. Once the valid districts were finalized, the time series for the covariate data, including climatic variables, were prepared. Fig. 4 represents the time-averaged distribution of the 103 days (March 4th, 2020 to June 14th, 2020) of pressure, relative humidity, temperature and wind speed data. The map of pressure distribution shows that the lowest values were obtained in the high-altitude districts of the country. The maximum and minimum values of pressure were 1.02 bar and 0.58 bar, in districts in the coastal and northern areas of the country, respectively. Similarly, the highest relative humidity was observed in the northern and southernmost parts of the country. These regions maintain a good amount of green cover throughout the year; therefore, high relative humidity in these regions is expected (Jha et al., 2019a). Moreover, the lowest relative humidity was found in the drier regions, i.e. north-west India. The relative humidity was found to vary in the range of 26% to 92%. Further, the temperature ranged from −1.04 to 32.70 °Celsius, with the high-altitude regions having the lowest temperature and the southwestern regions the highest.
The spatial distribution of the mean temperature of the selected months is coherent with the average temperature distribution suggested by other sources such as the India Meteorological Department (IMD) (Srivastava et al., 2009). For estimating the wind speed, the magnitude of the zonal (u) and meridional (v) wind velocity components was computed and averaged over the grid points falling within each individual district. It should be noted that instead of using time-averaged values of the climatic data, we related their time series to the confirmed COVID 19 cases of each corresponding district. This modeling was done using the 29 covariate combinations, as discussed in the methodology section. The covariate combinations were prepared such that the location and scale parameters, which define the magnitude and variability of the number of confirmed COVID 19 cases, could be explained by climatic variables. Further, the combinations were made such that the location and scale parameters are explained by both single and multiple climatic variables (Table 1).
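The wind speed computation mentioned above can be illustrated with a short sketch. It assumes the two horizontal wind components are available per grid point, that the magnitude is taken at each grid point before averaging over the district, and it uses made-up component values; the function name `district_wind_speed` is hypothetical.

```python
import math


def district_wind_speed(uv_pairs):
    """Mean wind speed over a district's grid points from (u, v) component pairs in m/s."""
    if not uv_pairs:
        raise ValueError("at least one grid point is required")
    speeds = [math.hypot(u, v) for u, v in uv_pairs]  # magnitude at each grid point
    return sum(speeds) / len(speeds)


# Two made-up grid points falling inside one hypothetical district.
mean_speed = district_wind_speed([(3.0, 4.0), (6.0, 8.0)])
print(mean_speed)
```

Taking the magnitude before averaging keeps opposing wind directions at neighboring grid points from cancelling each other out, which would happen if the components were averaged first.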

Fig. 3

The distribution of cumulative COVID 19 infected cases in 623 districts in India. ‘No Case’ represents districts with zero cases as on 14th June 2020.

Fig. 4

The district-wise distribution of climatic covariates utilized in the study. The figure shows the average values of the climatic variables for the period of 4th March 2020 to 14th June 2020.

As discussed, it is assumed that the nonstationarity in the confirmed cumulative COVID 19 cases is due to the influence of climatic parameters. The results from the L.R. test reveal that stationarity, i.e., COVID 19 cases independent of pressure, relative humidity, temperature and wind speed, existed in only 148 (24%) of the 623 districts, whereas strong nonstationarity, i.e., climate dependence, was detected in the cases of 474 (76%) districts. Fig. 5 shows the spatial distribution of districts with stationarity and nonstationarity as the best-fit results. It is evident from Fig. 3 and Fig. 4 that the districts which were found to be independent of the climate variables were also the ones where low COVID 19 case counts were observed. This indicates that climate variables strongly govern the evolution of cases in India. Further, nonstationarity was observed in both magnitude (through the location parameter) and variability (through the scale parameter). However, the majority of this influence was observed in the location parameter. It can be inferred from Table 2 that the combinations involving the modeling of the location parameter as a linear function of climatic variables are models M1 to M14. Fig. 6 and Table 2 reveal that climate parameters, particularly pressure, relative humidity and wind speed, individually govern the magnitude of COVID 19 cases. Moreover, the variability in confirmed cases across districts is influenced by the combined action of climate variables.
It can be understood from the table and figure that the models which include a single climatic covariate as an explanatory variable for the location parameter were a good fit for most of the districts. The occurrence and evolution of cases in 133 districts of India were most significantly influenced by the surface pressure, whereas wind speed was the most dominant factor in 80 districts. Further, relative humidity and temperature were obtained as best-fit covariates in 74 and 46 districts, respectively. It should also be noted that combinations of more than one climatic variable were generally not suited for location parameter modeling. However, the scale parameter, along with the location parameter, could be explained by such combinations. As discussed, it is also not advisable to model the scale parameter separately. Therefore, we estimated the scale parameter along with the location parameter in the respective combinations of M15 to M28, as discussed in the previous section. The results revealed that using temperature as a covariate in the location as well as the scale parameter was best suited for only 14 districts. Pressure, relative humidity and wind speed, used as covariates in both the location and scale parameters, were best suited for 8, 3 and 6 districts, respectively. The role of the combined impact of climate variables on case variability was most significantly observed in the southernmost coastal districts, where models M24, M26, M27 and M28 were the best combinations obtained for many districts (Fig. 6 and Table 1).

Fig. 5

The nonstationary and stationary classification of districts based on L.R. test results.

Table 2

Number of districts under different covariate based model categories.

Model	Climate variable combination	No. of districts
M0	Stationary	148
M1	L-Pr	133
M2	L-RH	74
M3	L-T	46
M4	L-W	80
M5	L-Pr+RH	0
M6	L-RH+T	2
M7	L-T+W	1
M8	L-W+RH	3
M9	L-W+Pr	0
M10	L-Pr+RH+T	4
M11	L-Pr+RH+W	0
M12	L-Pr+T+W	0
M13	L-RH+T+W	0
M14	L-Pr+RH+T+W	1
M15	LS-Pr	8
M16	LS-RH	3
M17	LS-T	14
M18	LS-W	6
M19	LS-Pr+RH	2
M20	LS-RH+T	8
M21	LS-T+W	5
M22	LS-W+RH	1
M23	LS-W+Pr	1
M24	LS-Pr+RH+T	25
M25	LS-Pr+RH+W	6
M26	LS-Pr+T+W	18
M27	LS-RH+T+W	13
M28	LS-Pr+RH+T+W	21

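Note: Pr, RH, T and W stand for Pressure, Relative Humidity, Temperature and Wind Speed respectively. The prefix L marks models with covariates in the location parameter only, and LS marks models with covariates in both the location and scale parameters (see Table 1).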

Fig. 6

Model distribution with different climate variable combination for each district. Pr, RH, T and W stand for Pressure, Relative Humidity, Temperature and Wind Speed, respectively.

Once the best covariate combination or model was obtained, the probabilities at different quantiles (25th, 50th, 75th and 95th) were estimated using the parameters obtained in the previous step. The average of these probabilities was used as the COVID 19 hazard measure. It is essential to understand that estimating the different quantiles and then averaging them was performed to transform the occurrence of COVID 19 cases into a probabilistic setting. This probabilistic setting was required to estimate the risk index values, as explained in Eq. (6). Further, according to the given formula, risk quantification required the estimation of exposure and vulnerability measures. For exposure, as discussed, the elderly and child population density, along with that of the other working population, was calculated for each district. The density here implies the number of people per household. In other words, the elderly, child and working populations were calculated and then individually divided by the total number of households in each district. Similarly, the vulnerability measures were also estimated for each district and divided by the number of households. It has been observed that COVID 19 propagation has been distinct in rural and urban areas. Considering this, we estimated the risk measures separately for the rural, urban and total (combined rural and urban) populations. Therefore, exposure and vulnerability measures were also calculated separately for the rural, urban and total populations.
Figure S1a and Figure S1b in the supplementary information represent the exposure and vulnerability distributions calculated for each district. Normalization of the exposure and vulnerability measures was done to bring all the values to a common scale. Eventually, the risk index was calculated using Eq. (6) at the scale of the rural, urban and total population. As discussed, the risk index values were classified into five different levels so that districts could be characterized into different classes of risk. It is very important to note that the stationary nature of COVID 19 cases was observed in only 148 out of 623 districts. This is a very strong sign of climate variables controlling the magnitude of confirmed COVID 19 cases. However, for a more detailed analysis and characterization of results based on the role of climate variables in inducing risk, we calculated the risk under two scenarios. In the first scenario (climate independent), it was assumed that COVID 19 cases were independent of climatic conditions in all 623 districts rather than only the originally obtained 148 districts. In the second scenario (climate dependent), 474 nonstationary and 148 stationary districts were considered, as obtained from the previous steps. Table S1 and Table S2 in the supplementary information give the percentage of districts under different categories of risk under the stationary as well as the nonstationary condition. The inspection of results suggested that the rural, urban and total populations in 334, 200 and 336 out of 623 districts, respectively, are at high risk under the climate-dependent condition, as compared to 303, 187 and 305 under the climate-independent condition. The results were categorized on a state-wise scale for the climate-dependent condition (Table 3). It was found that the rural population in 19 out of 35 states and union territories has at least 50% of its districts under the high or above risk classes.
In contrast, the urban population in 50% or more of the districts of 11 states was under such risk classes. Associating the COVID 19 hazard measure with the district-wise combined rural and urban population classes suggested that 50% or more of the districts in 19 out of 35 states exhibit a high risk of the pandemic. At the total population scale, 100% of districts in 6 states and union territories show high risk when the climate characteristics are taken into account. The spatial distribution of risk indicates that the rural population in the southern and northwestern districts is the most vulnerable (Fig. 7). This risk was observed to be relatively lower at the urban level, possibly because of the high capacity and low vulnerability of this population. Most of the districts in the country mainly comprise rural population; therefore, the combined analysis at the total population scale suggests a similar pattern of risk estimates. Further, the spatial pattern of risk under the climate independent assumption was similar to that under the climate dependent condition; however, the risk was more severe in the latter case.
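The normalization and five-level classification steps mentioned above can be sketched as follows. This is only an illustration under stated assumptions: min-max scaling is assumed as the normalization, the five levels are assumed to be equal-width bins on a risk index already scaled to [0, 1], and the function names are hypothetical. The class boundaries actually used in the study are not given in this section.

```python
def min_max_normalize(values):
    """Rescale values to [0, 1] (assumed normalization, not confirmed by the paper)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: map everything to 0.0 to avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def risk_level(index, n_levels=5):
    """Map a risk index in [0, 1] to a level 1..n_levels using equal-width bins.

    Equal-width bins are an assumption made for this sketch.
    """
    if not 0.0 <= index <= 1.0:
        raise ValueError("index must lie in [0, 1]")
    # An index of exactly 1.0 is assigned to the top level.
    return min(int(index * n_levels) + 1, n_levels)


# Hypothetical exposure values for three districts.
normalized = min_max_normalize([2.0, 4.0, 6.0])
levels = [risk_level(x) for x in normalized]
```

In this sketch, level 4 or 5 would play the role of the "high or above" classes referred to in the text.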

Table 3

Percentage of districts (state-wise) under high risk due to COVID 19 under the climate dependent condition.

Sl No	State	Rural	Urban	Total
1	Jammu and Kashmir	54.55	63.64	50.00
2	Himachal Pradesh	75.00	58.33	75.00
3	Punjab	95.00	50.00	90.00
4	Uttarakhand	69.23	15.38	76.92
5	Haryana	80.95	52.38	80.95
6	Rajasthan	42.42	21.21	48.48
7	Uttar Pradesh	47.89	18.31	40.85
8	Bihar	13.16	7.89	10.53
9	Sikkim	25.00	0.00	25.00
10	Arunachal Pradesh	6.25	0.00	0.00
11	Nagaland	9.09	9.09	9.09
12	Manipur	33.33	22.22	44.44
13	Mizoram	37.50	25.00	37.50
14	Tripura	50.00	50.00	50.00
15	Meghalaya	14.29	14.29	14.29
16	Assam	22.22	25.93	25.93
17	West Bengal	0.00	0.00	0.00
18	Jharkhand	20.83	16.67	16.67
19	Odisha	36.67	23.33	36.67
20	Chhattisgarh	22.22	16.67	22.22
21	Madhya Pradesh	54.00	34.00	58.00
22	Gujarat	76.92	50.00	73.08
23	Maharashtra	100.00	0.00	100.00
24	Andhra Pradesh	69.57	8.70	69.57
25	Karnataka	93.33	33.33	93.33
26	Goa	100.00	50.00	100.00
27	Kerala	100.00	85.71	100.00
28	Tamil Nadu	84.38	50.00	87.50
29	NCT of Delhi (UT)	77.78	100.00	100.00
30	Puducherry (UT)	50.00	75.00	75.00
31	Andaman & Nicobar (UT)	100.00	33.33	100.00
32	Chandigarh (UT)	100.00	100.00	100.00
33	Daman & Diu (UT)	0.00	0.00	0.00
34	Dadra & Nagar Haveli (UT)	85.71	54.29	91.43
35	Lakshadweep (UT)	0.00	0.00	0.00
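The state-wise summaries quoted in the text can be recomputed from Table 3 with a short Python sketch. The variable and function names are hypothetical; the numbers are the Table 3 values (rural, urban, total percentages per state or union territory).

```python
# (state, rural %, urban %, total %) copied from Table 3.
TABLE3 = [
    ("Jammu and Kashmir", 54.55, 63.64, 50.00),
    ("Himachal Pradesh", 75.00, 58.33, 75.00),
    ("Punjab", 95.00, 50.00, 90.00),
    ("Uttarakhand", 69.23, 15.38, 76.92),
    ("Haryana", 80.95, 52.38, 80.95),
    ("Rajasthan", 42.42, 21.21, 48.48),
    ("Uttar Pradesh", 47.89, 18.31, 40.85),
    ("Bihar", 13.16, 7.89, 10.53),
    ("Sikkim", 25.00, 0.00, 25.00),
    ("Arunachal Pradesh", 6.25, 0.00, 0.00),
    ("Nagaland", 9.09, 9.09, 9.09),
    ("Manipur", 33.33, 22.22, 44.44),
    ("Mizoram", 37.50, 25.00, 37.50),
    ("Tripura", 50.00, 50.00, 50.00),
    ("Meghalaya", 14.29, 14.29, 14.29),
    ("Assam", 22.22, 25.93, 25.93),
    ("West Bengal", 0.00, 0.00, 0.00),
    ("Jharkhand", 20.83, 16.67, 16.67),
    ("Odisha", 36.67, 23.33, 36.67),
    ("Chhattisgarh", 22.22, 16.67, 22.22),
    ("Madhya Pradesh", 54.00, 34.00, 58.00),
    ("Gujarat", 76.92, 50.00, 73.08),
    ("Maharashtra", 100.00, 0.00, 100.00),
    ("Andhra Pradesh", 69.57, 8.70, 69.57),
    ("Karnataka", 93.33, 33.33, 93.33),
    ("Goa", 100.00, 50.00, 100.00),
    ("Kerala", 100.00, 85.71, 100.00),
    ("Tamil Nadu", 84.38, 50.00, 87.50),
    ("NCT of Delhi (UT)", 77.78, 100.00, 100.00),
    ("Puducherry (UT)", 50.00, 75.00, 75.00),
    ("Andaman & Nicobar (UT)", 100.00, 33.33, 100.00),
    ("Chandigarh (UT)", 100.00, 100.00, 100.00),
    ("Daman & Diu (UT)", 0.00, 0.00, 0.00),
    ("Dadra & Nagar Haveli (UT)", 85.71, 54.29, 91.43),
    ("Lakshadweep (UT)", 0.00, 0.00, 0.00),
]

COLUMN = {"rural": 1, "urban": 2, "total": 3}


def states_at_or_above(rows, scale, threshold=50.0):
    """Names of states whose percentage at the given scale is >= threshold."""
    col = COLUMN[scale]
    return [row[0] for row in rows if row[col] >= threshold]


rural_count = len(states_at_or_above(TABLE3, "rural"))
total_count = len(states_at_or_above(TABLE3, "total"))
all_districts_high = states_at_or_above(TABLE3, "total", threshold=100.0)
```

With the Table 3 values, this gives 19 states and union territories at the rural and total scales, and six in which 100% of districts are at high risk at the total population scale.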

Fig. 7

Spatial distribution of risk for climate dependent and climate independent cases.

Conclusions

This study improves the understanding of the influence of climate variables in inducing risk due to COVID 19 at the rural, urban and combined population scales at the district level in India. The results reveal a significant relationship between climatic factors and COVID 19 risk. It was also found that the risk of the pandemic is greater in the rural population. The findings of our study are in line with other recent works which suggest a possible link between COVID 19 risk and climate variables. The investigation draws the attention of decision-makers to the need to strengthen the capacity of the population, especially in rural areas. Vulnerability to the pandemic also depends on the density of the educated population. Therefore, policymakers in India must focus on increasing awareness in climatically vulnerable rural areas of the country that lag in education, sanitation and clean water availability. The study provides useful insights for decision-makers to identify the high-risk hotspots of the pandemic in India and the exposure and vulnerability factors associated with them. Although the pandemic risk depends upon a number of factors, such as the availability of testing facilities, efficient contact tracing and the success rate of the testing method, the analysis, with limited data, performs well in characterizing the role of climatic and socioeconomic factors in inducing pandemic risk.

Author Contributions

Srinidhi Jha: conceptualized the problem, performed the nonstationary analysis and prepared the first draft of the manuscript. Manish Kumar Goyal: contributed to problem formulation and analysis, and played a supervisory role. Brij Gupta: contributed to problem formulation and analysis, and played a supervisory role. Anil Kumar Gupta: contributed to problem formulation and analysis, and played a supervisory role.

Additional Information

Data availability

The confirmed COVID 19 data were collected from the website https://howindialives.com/gram/metrics.php. Climate data (pressure, relative humidity, temperature and wind speed) were obtained from the NCEP/NCAR Reanalysis website (https://www.psl.noaa.gov/data/gridded/). Census data were obtained from the official Census website of the Government of India (http://censusindia.gov.in/). All the data sets are freely available on the given websites.

Declaration of Competing Interest

The authors declare no competing interests.

10 in total

1. Significance of geographical factors to the COVID-19 outbreak in India.

Authors: Amitesh Gupta; Sreejita Banerjee; Sumit Das
Journal: Model Earth Syst Environ Date: 2020-06-17

2. Environmental factors on the SARS epidemic: air temperature, passage of time and multiplicative effect of hospital infection.

Authors: Kun Lin; Daniel Yee-Tak Fong; Biliu Zhu; Johan Karlberg
Journal: Epidemiol Infect Date: 2006-04 Impact factor: 2.451

3. Association between ambient temperature and COVID-19 infection in 122 cities from China.

Authors: Jingui Xie; Yongjian Zhu
Journal: Sci Total Environ Date: 2020-03-30 Impact factor: 7.963

4. Effects of temperature and humidity on the daily new cases and new deaths of COVID-19 in 166 countries.

Authors: Yu Wu; Wenzhan Jing; Jue Liu; Qiuyue Ma; Jie Yuan; Yaping Wang; Min Du; Min Liu
Journal: Sci Total Environ Date: 2020-04-28 Impact factor: 7.963

5. Efficient deep learning approach for augmented detection of Coronavirus disease.

Authors: Ahmed Sedik; Mohamed Hammad; Fathi E Abd El-Samie; Brij B Gupta; Ahmed A Abd El-Latif
Journal: Neural Comput Appl Date: 2021-01-19 Impact factor: 5.102

6. Impact of Extreme Hot Climate on COVID-19 Outbreak in India.

Authors: Keerthi Sasikumar; Debashis Nath; Reshmita Nath; Wen Chen
Journal: Geohealth Date: 2020-12-01

7. Air pollution and case fatality of SARS in the People's Republic of China: an ecologic study.

Authors: Yan Cui; Zuo-Feng Zhang; John Froines; Jinkou Zhao; Hua Wang; Shun-Zhang Yu; Roger Detels
Journal: Environ Health Date: 2003-11-20 Impact factor: 5.984

8. The influence of temperature on mortality and its Lag effect: a study in four Chinese cities with different latitudes.

Authors: Junzhe Bao; Zhenkun Wang; Chuanhua Yu; Xudong Li
Journal: BMC Public Health Date: 2016-05-04 Impact factor: 3.295

9. Impact of temperature on the dynamics of the COVID-19 outbreak in China.

Authors: Peng Shi; Yinqiao Dong; Huanchang Yan; Chenkai Zhao; Xiaoyang Li; Wei Liu; Miao He; Shixing Tang; Shuhua Xi
Journal: Sci Total Environ Date: 2020-04-23 Impact factor: 7.963

10. Assessment of Risk and Resilience of Terrestrial Ecosystem Productivity under the Influence of Extreme Climatic Conditions over India.

Authors: Srinidhi Jha; Jew Das; Manish Kumar Goyal
Journal: Sci Rep Date: 2019-12-12 Impact factor: 4.379