Spatial analysis of the impact of urban geometry and socio-demographic characteristics on COVID-19, a study in Hong Kong.

Coco Yin Tung Kwok1, Man Sing Wong2, Ka Long Chan1, Mei-Po Kwan3, Janet Elizabeth Nichol4, Chun Ho Liu5, Janet Yuen Ha Wong6, Abraham Ka Chung Wai7, Lawrence Wing Chi Chan8, Yang Xu1, Hon Li1, Jianwei Huang9, Zihan Kan9.   

Abstract

The World Health Organization considered the wide spread of COVID-19 over the world as a pandemic. There is still a lack of understanding of its origin, transmission, and treatment methods. Understanding the influencing factors of COVID-19 can help mitigate its spread, but little research on the spatial factors has been conducted. Therefore, this study explores the effects of urban geometry and socio-demographic factors on the COVID-19 cases in Hong Kong. For each patient, the places they visited during the incubation period before going to hospital were identified, and matched with corresponding attributes of urban geometry (i.e., building geometry, road network and greenspace) and socio-demographic factors (i.e., demographic, educational, economic, household and housing characteristics) based on the coordinates. The local cases were then compared with the imported cases using stepwise logistic regression, logistic regression with case-control of time, and least absolute shrinkage and selection operator regression to identify factors influencing local disease transmission. Results show that the building geometry, road network and certain socio-economic characteristics are significantly associated with COVID-19 cases. In addition, the results indicate that urban geometry is playing a more important role than socio-demographic characteristics in affecting COVID-19 incidence. These findings provide a useful reference to the government and the general public as to the spatial vulnerability of COVID-19 transmission and to take appropriate preventive measures in high-risk areas.
Copyright © 2020 Elsevier B.V. All rights reserved.

Keywords:  COVID-19 pandemic; Socio-demographic characteristics; Spatial analysis; Urban geometry

Year:  2020        PMID: 33418356      PMCID: PMC7738937          DOI: 10.1016/j.scitotenv.2020.144455

Source DB:  PubMed          Journal:  Sci Total Environ        ISSN: 0048-9697            Impact factor:   7.963


Introduction

Coronavirus disease 2019 (COVID-19) is a newly emerging infectious disease that originated in Wuhan, Hubei Province, China (N. Zhu et al., 2020), and has been declared a global pandemic by the World Health Organization (WHO). The earliest known cases were identified in December 2019, and the WHO officially declared COVID-19 a pandemic on March 11, 2020 (World Health Organization, 2020a). As of June 12, 2020, a cumulative total of 7,410,510 confirmed cases and 418,294 deaths had been reported from 215 countries/regions (World Health Organization, 2020b). Although COVID-19 has had catastrophic consequences, the understanding of its epidemiology is limited and its mortality has only been roughly estimated (Chen et al., 2020; Huang et al., 2020; Ruan et al., 2020).
In recent research, geospatial and spatial-statistical analyses have been used to identify associations between geographic variables and disease incidence. These studies from around the world indicate that COVID-19 cases are highly correlated with socioeconomic and demographic factors. Socially and economically relevant factors with prominent inter-generational and social characteristics showed close correlations with COVID-19 in 20 European countries (Mogi and Spijker, 2020). In New York City, an analysis of multi-temporal positive records suggested that occupation plays a more important role in disease transmission than other factors such as income, race, gender and household size (Almagro and Orane-Hutchinson, 2020). In Sweden, a positive relationship was observed between socio-demographic factors (i.e., gender, income, education level, marital status, immigration status) and the toll of the COVID-19 pandemic (Drefahl et al., 2020). Studies performed in Italy and South Korea reported that age and household composition were associated with both the spread and mortality of COVID-19. Mollalo et al. (2020) adopted 35 environmental, socioeconomic, topographic and demographic variables to explain the spatial distribution of confirmed cases in the United States, and concluded that factors such as income, inequality, median household income, race, and the proportion of nursing practitioners were significant. Similarly, Raifman and Raifman (2020) suggested that race and income are highly associated with virus exposure in the United States. Khunti et al. (2020) highlighted the association between ethnicity and the number of confirmed cases in England and the United States, because socio-economic conditions differ considerably among racial or ethnic groups within a country. As indicated by Pareek et al. (2020), ethnicity is an important factor that is closely related to people's health status, health behavior and social behavior, so it should be considered when taking measures for pandemic control. There is therefore evidence that socio-economic and demographic characteristics contribute to COVID-19 transmission.
Similar to other infectious diseases, COVID-19 transmission is more frequently observed in urban areas. However, recent studies on COVID-19 have not focused on factors such as urban geometry and urban design, which are considered important for the outbreak of other infectious diseases. In England, for example, urbanization promotes the transmission of infectious disease due to increased contact rates and altered socio-economic conditions (Zhang and Atkinson, 2008). Another study in China suggested that the increased connection between rural and urban areas and the rural-to-urban migration caused by urbanization could speed up the transmission of infectious disease (Gong et al., 2012).
Urban geometry can affect the living environment and should be considered when studying the health conditions of residents. It can be further divided into sub-factors such as building geometry, the road network and the distribution of greenspaces (Johansson, 2006). A key factor of urban geometry is the Sky View Factor (SVF), defined as the ratio of the sky visible with obstructions to that visible without obstructions (Oke, 1988), which is widely considered a typical indicator of urban geometry (Krüger et al., 2011; Yang et al., 2016; Yang et al., 2015). Lai et al. (2013) found that the SVF, which is related to building height and density, could potentially serve as an indicator of health risk in an urban area. Some studies stressed the role of air ventilation, which is largely a result of urban geometry, in disease transmission (Cheng et al., 2011; Gao et al., 2008; Keshavarzian et al., 2020), and poor ventilation was found to be associated with Severe Acute Respiratory Syndrome (SARS) infection (Gao et al., 2009) as well as asthmatic symptoms (Smedje and Norbäck, 2000). It has also been found that the area of outdoor space affects the quality of the indoor environment (Chan and Liu, 2018; Niachou et al., 2008). Chan and Liu (2018) proposed that the neighborhood environment has a direct influence on people's health, which is also related to the relationship between the indoor environment and people's health. They found that lower building density and height are most beneficial for human health, especially for people with respiratory conditions (e.g., bronchoconstriction, asthma symptoms). Although these studies did not focus on infectious disease, they related their findings to the SARS outbreak in Hong Kong.
In addition to building geometry, the road network is another essential factor of city design, as it is closely related to traffic flow and the connectivity between different places. It has also been found that this factor could facilitate the spread of serious diseases (e.g., malaria and diarrheal pathogens) in developing countries (Coimbra, 1988; Eisenberg et al., 2006) as well as respiratory diseases in developed countries (Vu et al., 2013). Furthermore, Chan and Liu (2018) found that a higher proportion of greenspace around buildings is associated with lower levels of air pollution, and that such an environment is good for people's health, especially for those with respiratory disease. Recent research in China (Y. Zhu et al., 2020) and Western Europe (Ogen, 2020) also found that air pollution was associated with COVID-19 transmission. Therefore, it is important to investigate the role that greenspace plays in COVID-19 transmission (Qu et al., 2020). These findings raise questions about how urban geometry may affect COVID-19 cases.
Moore and Carpenter (1999) suggested that, with advances in Geographical Information System (GIS) technologies, the relationship between environmental and socio-economic factors and infectious disease has potential research value. GIS makes it possible to link different types of data spatially, e.g., residential addresses, environmental exposure, building geometry, and demographic information. Indeed, spatial clustering in GIS was adopted for epidemiologic investigations after the outbreak of SARS in Hong Kong in 2003. Using spatial clustering methods, it was found that SARS was a highly clustered disease in Hong Kong, as geospatial clusters were observed in the case locations (Leung et al., 2004), and that the urban population faced a higher risk (Lai et al., 2004). During the COVID-19 pandemic, more than 63 scientific studies focusing on the spatial analysis of COVID-19 have been published, covering spatiotemporal analysis, health and social geography, environmental variables, data mining and web-based mapping (Franch-Pardo et al., 2020).
Spatial variables have long been found to be connected to infectious diseases, and COVID-19 appears to be no exception. However, to date, most studies of COVID-19 have focused on socio-demographic factors, while few have touched upon urban geometry. Therefore, it is necessary to analyze the influence of urban geometry, including building configurations, the road network and greenspace. This study aims to investigate the importance of spatial context, including urban geometry and socio-demographic factors, in the COVID-19 epidemic in Hong Kong.

Study area and data source

Study area and COVID-19 cases

Hong Kong was selected as the study area for this research (Fig. 1). It is a highly urbanized city with a population of more than seven million, covering an area of 1111 km². Due to its mountainous terrain, the population is concentrated in densely built high-rise urban areas, which occupy only approximately 20% of the total land area. The first COVID-19 case in Hong Kong, an imported case from Wuhan, China, was reported on January 23, 2020. Up to June 12, 2020, 1109 COVID-19 cases had been recorded, of which 1061 had been discharged, 4 patients were deceased, and the remainder were still hospitalized (Centre for Health Protection (HKSAR), 2020). For this study, the details of COVID-19 cases in Hong Kong were retrieved from the Centre for Health Protection (hereafter government dataset) and the Internet source “covid19.vote4.hk - COVID-19 in HK” (hereafter Internet dataset) for the period January 23 to April 30, 2020, covering 1038 cases.
Fig. 1

Study area and the residential spatial distribution of the confirmed cases from government dataset.

The government dataset comprised two types of information about the daily cases in Hong Kong. The first type concerns the COVID-19 infection cases in Hong Kong, with the details of individual patients: (1) case number, (2) date of report, (3) date of onset, (4) gender, (5) age, (6) hospital admitted, (7) current status (i.e., hospitalized, discharged, or deceased), (8) citizenship (i.e., Hong Kong or non-Hong Kong resident), (9) case classification (i.e., imported case, close contact of imported case, possibly local case, close contact of possibly local case, local case, or close contact of local case), and (10) confirmation of case (i.e., confirmed or probable). The second type lists the residential buildings in which infected patients resided and the non-residential buildings with two or more cases in the past 14 days, together with details such as (1) district, (2) building name, (3) last date of stay of the case(s), and (4) related probable/confirmed cases. Since this dataset provided building names only, the coordinates of the addresses were retrieved using the Google Geocoding API for further spatial analysis. Fig. 1 shows the spatial distribution of the places the patients visited in Hong Kong, with their case classification.
The Internet dataset summarizes reports from the government, the Internet, and news media, and also provides patient details and high-risk areas. The patient information was similar to that of the government dataset. The second set of data provides more specific information, such as where the patients stayed before hospitalization, the type of activity (i.e., residence, working, gathering, stay, medical, arrival, departure or transportation) and the coordinates (i.e., latitude and longitude).

Spatial context

Socio-demographic characteristics

The 2016 census data were extracted from the Census and Statistics Department (Census and Statistics Department (HKSAR), 2018) at the Tertiary Planning Unit (TPU) level. The Hong Kong Planning Department uses these regional units for fine-scale regional planning. The 291 TPUs in Hong Kong were aggregated by the Census and Statistics Department into 154 TPU groups (as shown in Fig. 2) to protect the privacy of personal census data. These data include demographic, educational, economic, household, and housing characteristics. Specifically, they provide statistics on age, ethnicity, marital status, usual spoken language, reading and writing ability in Chinese and English, educational attainment, economic activity status, monthly income, occupation, industry, working hours per week, place of work, household size, household composition, monthly domestic household income, type of housing, tenure of accommodation and monthly domestic household rent. All the data within the TPUs were transformed from numbers of persons into ratio variables, which indicate the percentage of the population within the TPU with a given socio-demographic characteristic. In addition, the total population and the population density of each TPU group were included in this study to evaluate the influence of these factors on COVID-19 cases. Detailed information on the independent variables adopted in this study is provided in Table A.1 in the appendix. Individual COVID-19 cases were combined with the socio-demographic characteristics of each TPU by matching the coordinates of the places visited by patients in the 14 days before going to hospital.
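As a minimal sketch of the count-to-ratio transformation described above, the following Python snippet converts person counts for one TPU group into percentages of that group's population and derives the population density. The function name, the variable names and all numbers are hypothetical, and using the total population as the common denominator is an assumption; some census variables may be reported against a different base (e.g., the working population).

```python
def to_ratio_variables(counts, total_population, area_km2):
    """Convert person counts for one TPU group into percentage variables.

    counts: dict mapping a characteristic name to a number of persons.
    total_population: assumed common denominator for every characteristic.
    area_km2: land area of the TPU group, used for population density.
    """
    if total_population <= 0 or area_km2 <= 0:
        raise ValueError("total_population and area_km2 must be positive")
    ratios = {name: 100.0 * n / total_population for name, n in counts.items()}
    ratios["population_density"] = total_population / area_km2
    return ratios


# Hypothetical TPU group; the numbers are made up for illustration only.
example_tpu = to_ratio_variables(
    {"age_65_plus": 1500, "work_same_district": 2500},
    total_population=10000,
    area_km2=2.0,
)
print(example_tpu)
```

Expressing every characteristic as a share of the population keeps TPU groups of very different sizes comparable before they are matched to case locations.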
Fig. 2

TPU boundaries and building data used.

Table A.1

Variables used in this study.

Main category | Sub-category | Variable

Urban geometry
  Building geometry
    Building height (Sum/Standard deviation)
    Building density (Sum/Standard deviation)
    Sky view factor (Sum/Standard deviation)
  Road network
    Number of nodes in network
    Number of edges in network
    Average node degree
    Intersection count
    Average streets per node
    Counts of streets per node
    Total edge length
    Average edge length
    Total street length
    Average street length
    Count of street segments
    Average circuity
    Self-loop proportion
    Mean average neighbourhood degree
    Mean average weighted neighbourhood degree
    Average degree centrality
    Average weighted clustering coefficient
    Average betweenness centrality
  Greenspace
    Normalized difference vegetation index (Sum/Standard deviation)

Socio-demographic characteristics
  Demographic characteristics
    Total number of populations
    Population density
    Age
      0–19 (Male/Female/Both sex)
      10–64 (Male/Female/Both sex)
      65+ (Male/Female/Both sex)
    Median age (Male/Female/Both sex)
    Ethnicity
      Chinese
      Filipino
      Indonesian
      White
      Others
    Marital Status
      Never married
      Married
      Widowed
      Divorced
      Separated
    Usual spoken language
      Cantonese
      Putonghua
      Other Chinese dialects
      English
      Other languages
    Whether able to read Chinese
      Able to read
      Not able to read
    Whether able to read English
      Able to read
      Not able to read
    Whether able to write Chinese
      Able to write
      Not able to write
    Whether able to write English
      Able to write
      Not able to write
  Educational characteristics
    Educational attainment (highest level attended)
      No schooling/Pre-primary
      Primary
      Lower secondary
      Upper secondary
      Post-secondary: Diploma/Certificate
      Post-secondary: Sub-degree course
      Post-secondary: Degree course
  Economic characteristics
    Economic activity status
      Employees
      Employers
      Self-employed
      Unpaid family workers
      Home-makers
      Students
      Retired
      Others
    Place of Work
      Work in the same district
      Work in another district on Hong Kong Island
      Work in another district in Kowloon
      Work in another district in New Towns
      Work in another district in other areas in the New Territories
      No fixed place/Marine
      Work at home
      Places outside Hong Kong
    Monthly income from main employment
      <HK$10,000
      HK$10,000–HK$19,999
      HK$20,000–HK$39,999
      ≥HK$40,000
    Median monthly income from main employment (Male/Female/Both sex)
    Occupation
      Managers and administrators
      Professionals
      Associate professionals
      Clerical support workers
      Service and sales workers
      Craft and related workers
      Plant and machine operators and assemblers
      Elementary occupations
      Skilled agricultural and fishery workers; and occupations not classifiable
    Industry
      Manufacturing
      Construction
      Import/export, wholesale and retail trades
      Transportation, storage, postal and courier services
      Accommodation and food services
      Information and communications
      Financing and insurance
      Real estate, professional and business services
      Public administration, education, human health and social work activities
      Miscellaneous social and personal services
      Others: including “Agriculture; forestry and fishing”; “Mining and quarrying”; “Electricity and gas supply”; “Water supply; sewerage, waste management and remediation activities” and industrial activities unidentifiable or inadequately described
    Weekly usual hours of work of all employment
      <18
      18–34
      35–44
      45–54
      55–64
      65+
  Household characteristics
    Household size
      1
      2
      3
      4
      5
      6+
    Average domestic household size
    Household composition
      Composed of couple
      Composed of couple and unmarried children
      Composed of lone parent and unmarried children
      Composed of couple and at least one of their parents
      Composed of couple, at least one of their parents and their unmarried children
      Composed of other relationship combinations
      One-person households
      Non-relative households
    Monthly domestic household income
      <HK$10,000
      HK$10,000–HK$19,999
      HK$20,000–HK$39,999
      HK$40,000–HK$79,999
      ≥HK$80,000
    Median monthly domestic household income
    Median monthly household income of economically active households
  Housing characteristics
    Type of Housing
      Public rental housing
      Subsidised home ownership housing
      Private permanent housing
      Non-domestic housing
      Temporary housing
    Tenure of Accommodation
      Owner-occupier – With mortgage or loan
      Owner-occupier – Without mortgage and loan
      Sole tenant
      Co-tenant/Main tenant/Sub-tenant
      Rent free
      Provided by employer
    Median monthly domestic household rent
    Median rent to income ratio

Building geometry

To understand the role that urban geometry plays in facilitating the spread of COVID-19, three building-related settings were included in this study: building height, building density and Sky View Factor (SVF). Building height was obtained from the Lands Department of Hong Kong in 2019 (Fig. 2), based on the 1:1000 scale building polygons that it provides, and the data were then resampled to 5-meter resolution. Building density was derived from the same building layer by calculating the percentage of each 100 m × 100 m area occupied by buildings, as a measure of the crowdedness between buildings (Fig. 3). An SVF map at 10-meter resolution (Fig. 4), derived by Yang et al. (2015) from airborne LiDAR data captured by the Civil Engineering and Development Department of Hong Kong in 2011, was adopted in this study. SVF values range from zero (a totally obstructed sky) to one (an unobstructed sky). A 500-meter buffer zone was created around each confirmed case to study these building-related variables. Within this buffer, the sum and standard deviation of building height, building density and SVF were estimated; these parameters are commonly used to represent urban morphology. The sums of building height, building density and SVF usually represent the urbanization level: a highly urbanized area typically has higher sums of building height and building density and a lower SVF. The standard deviations of these factors represent the variation of the urban morphology and the building geometry. The standard deviations of building density and building height reflect, respectively, the variation in crowdedness between buildings and the roughness of the buildings in the urban morphology. These building-related attributes have been extensively utilized in other studies. For example, Hang et al. (2012) evaluated pollutant dispersion and pedestrian ventilation using different standard deviations of building height with the same average building height to simulate different urban morphologies. Building height, building density and SVF were also used to study the urban heat island effect caused by urban geometry heterogeneity (Yang and Li, 2015).
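The buffer statistics described above can be sketched as follows. This is an illustrative Python example only: it assumes raster cells are given as projected (x, y) cell centres in metres with a value (e.g., building height), that a cell belongs to the buffer when its centre lies within 500 m of the case location, and that the population standard deviation is used. The function name and the sample cells are hypothetical; the source does not state these details.

```python
import math
import statistics


def buffer_stats(cells, center, radius_m=500.0):
    """Sum and standard deviation of raster values inside a circular buffer.

    cells: iterable of (x, y, value) with projected coordinates in metres.
    center: (x, y) of the case location in the same coordinate system.
    A cell counts as inside when its centre is within radius_m (assumption).
    """
    cx, cy = center
    inside = [v for x, y, v in cells if math.hypot(x - cx, y - cy) <= radius_m]
    if not inside:
        return 0.0, 0.0
    # Population standard deviation is an assumption, not taken from the source.
    return sum(inside), statistics.pstdev(inside)


# Hypothetical cells: two fall inside the 500 m buffer, one lies outside.
sample_cells = [(0.0, 0.0, 10.0), (300.0, 400.0, 30.0), (600.0, 0.0, 50.0)]
height_sum, height_sd = buffer_stats(sample_cells, center=(0.0, 0.0))
print(height_sum, height_sd)
```

The same function could be applied unchanged to the building density and SVF rasters, since only the cell values differ.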
Fig. 3

Map of building density.

Fig. 4

Map of sky view factor.

Road network

In addition to building geometry, road networks were also considered, as they are related to vehicular and pedestrian flows and play an important role in urban design and planning (Penn et al., 1998). The pedestrian road networks within a 500 m walkable buffer zone of each case were collected with the tool “OSMnx” (Boeing, 2017), which allows road networks to be collected and analyzed easily from OpenStreetMap data using graph theory. The characteristics of the road network and the walkable connectivity were considered based on information about nodes, streets, and connectivity. The full list of parameters adopted is shown in Table A.1.

Greenspace exposure

The neighborhood greenspace within a 500-meter-radius buffer zone of the residential address of each case was measured using a Normalized Difference Vegetation Index (NDVI) image derived from a Satellite Pour l'Observation de la Terre (SPOT) 7 image captured on February 29, 2016. The NDVI is a normalized ratio of the infrared and red bands, ranging from −1 (no vegetation) to 1 (dense vegetation) (Goward et al., 1991) (Fig. 6). The sum and standard deviation of the NDVI within the 500-meter buffer of each case were retrieved to represent the total greenspace exposure and the variation of greenspace.
Fig. 6

NDVI map for the calculation of greenspace exposure.
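For reference, the NDVI mentioned above is conventionally computed per pixel as (NIR − Red) / (NIR + Red). The short Python sketch below illustrates that formula; the function name and the sample reflectance values are hypothetical, and returning 0 for a pixel whose bands sum to zero is a convention assumed here, not something stated in the source.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    nir, red: reflectance (or digital number) of the near-infrared and red bands.
    Returns a value in [-1, 1] for non-negative inputs.
    """
    denom = nir + red
    if denom == 0:
        # Assumed convention for an empty pixel; avoids division by zero.
        return 0.0
    return (nir - red) / denom


# Hypothetical pixels: vegetated, bare and water-like surfaces.
vegetated = ndvi(3.0, 1.0)
bare = ndvi(1.0, 1.0)
water_like = ndvi(1.0, 3.0)
print(vegetated, bare, water_like)
```

Summing these per-pixel values over the 500-meter buffer gives the total greenspace exposure used in the study, and their standard deviation gives the variation of greenspace.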

Methodology

To further investigate the influence of spatial context on COVID-19 transmission, stepwise logistic regression, logistic regression with case-control of time, and least absolute shrinkage and selection operator (Lasso) regression were performed to identify the significant factors. The dependent variable for the regression was the class of case, that is, imported or local. Local confirmed cases of COVID-19 in Hong Kong (including local, possibly local, close contact of local, and close contact of possibly local cases) were selected as target cases, and non-local (imported) cases were selected as controls, since the non-local cases were not subject to local factors when they were infected. Close contacts of imported cases were not considered in this study, as they depended on both local factors and imported patients. A total of 125 independent variables were used, as shown in Table A.1 in the appendix, covering building geometry, road network, greenspace and socio-demographic characteristics. These variables were extracted for the locations listed in the government and Internet datasets. Hospitals, ports of entry and quarantine facilities were excluded from the list because the patients were already infected with COVID-19 before they visited these sites. The neighborhood greenspace, building geometry, road network and census information for each location retrieved from the government and Internet datasets were normalized to a range of 0 to 1 in order to determine the critical risk factors for COVID-19 cases. Spearman's correlation analysis was conducted to evaluate the relationships between variables, and variables with a correlation higher than 0.8 were removed.
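The normalization and correlation screening described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' procedure: the source does not say which variable of a highly correlated pair was removed, so the sketch keeps the first one encountered, and it compares the absolute value of Spearman's coefficient with 0.8. All function names and sample values are hypothetical.

```python
import math


def min_max(values):
    """Rescale a list of numbers to the range 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def drop_correlated(columns, threshold=0.8):
    """Greedy filter (assumption): keep a variable only if its absolute
    correlation with every variable kept so far does not exceed threshold."""
    kept = []
    for name in columns:
        if all(abs(spearman(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical variables observed at five locations.
columns = {
    "height_sum": [1, 2, 3, 4, 5],
    "density_sum": [2, 4, 5, 8, 10],  # monotone in height_sum, so rho = 1
    "svf_sum": [5, 1, 4, 2, 3],
}
kept_variables = drop_correlated(columns)
print(min_max([2, 4, 6]), kept_variables)
```

A greedy, order-dependent filter is only one of several reasonable ways to resolve correlated pairs; the choice affects which variables survive into the regression models.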

Evaluation of the association between spatial context and the COVID-19 infection

Logistic regression was performed to describe the relationships between the independent variables and the dependent variable (Kleinbaum and Klein, 2002), and a stepwise approach was adopted to select the parameters to be included in the model (Steyerberg et al., 1999). However, plain logistic regression has limitations in this setting. The date of onset is a crucial factor for understanding the transmission of an infectious disease, especially for diseases that are contagious before onset (Fraser et al., 2004). Therefore, the dataset was further analyzed using conditional logistic regression to control for the time of case confirmation. To tackle the unequal numbers of cases and controls that commonly occur in epidemiological studies, as well as to control for time-related variables, logistic regression with case-control analysis was performed to account for the 1-n matched design. The imported cases were treated as controls and the local cases as target cases in the case-control study. Since the ratio of local cases to imported cases was approximately 1/2 in Hong Kong, two imported case records were matched with each local case record by the closest dates of confirmation for case-control comparison. In addition, to evaluate temporally the influence of urban geometry and socio-demographic characteristics on local COVID-19 cases, the difference in confirmation date within each matched case-control group was limited to at most three days, so that the method evaluated the changes in COVID-19 cases over time among the groups. The commercial software SPSS was used to perform the stepwise logistic regression and the logistic regression with case-control of time. To avoid the multicollinearity problem caused by a large number of highly correlated parameters (Shen and Gao, 2008), logistic regression with Lasso regularization was also performed in the current study.
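The date-based matching described above (two imported records per local record, confirmation dates at most three days apart) might be sketched as below. This is a hypothetical illustration: the source does not state whether controls were reused, how ties were broken, or what happened to local records without two eligible controls, so the sketch matches greedily without replacement and drops unmatched records. Record identifiers and dates are made up.

```python
from datetime import date


def match_controls(cases, controls, n_controls=2, max_gap_days=3):
    """Greedy 1:n matching of cases to controls by closest confirmation date.

    cases, controls: dicts mapping a record id to its confirmation date.
    Returns {case_id: [control_id, ...]}. Controls are used at most once and
    cases with fewer than n_controls eligible controls are dropped; both
    rules are assumptions made for this sketch.
    """
    available = dict(controls)
    matched = {}
    for case_id, case_date in sorted(cases.items(), key=lambda kv: kv[1]):
        candidates = sorted(
            (abs((ctrl_date - case_date).days), ctrl_id)
            for ctrl_id, ctrl_date in available.items()
            if abs((ctrl_date - case_date).days) <= max_gap_days
        )
        if len(candidates) < n_controls:
            continue
        chosen = [ctrl_id for _, ctrl_id in candidates[:n_controls]]
        for ctrl_id in chosen:
            del available[ctrl_id]
        matched[case_id] = chosen
    return matched


# Hypothetical local (case) and imported (control) records.
local_records = {"L1": date(2020, 3, 10), "L2": date(2020, 3, 20)}
imported_records = {
    "I1": date(2020, 3, 9),
    "I2": date(2020, 3, 12),
    "I3": date(2020, 3, 16),
    "I4": date(2020, 3, 21),
}
matched_sets = match_controls(local_records, imported_records)
print(matched_sets)
```

In this toy input, the second local record finds only one imported record within three days and is therefore left out, which shows how a date window can shrink the matched sample.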
Tibshirani (1996) developed Lasso regularization, which minimizes the sum of squared residuals subject to a penalty that shrinks smaller coefficients toward zero, leaving only the most predictive variables in the model:

$$\hat{\beta} = \underset{\beta}{\arg\min} \left\{ \sum_{i=1}^{N} \left( y_i - \sum_{j=1}^{p} x_{ij} \beta_j \right)^2 + \lambda \sum_{j=1}^{p} \left| \beta_j \right| \right\}$$

where $y_i$ denotes the dependent variable for the $i$th data point, $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})$ denotes the predictor variables for the $i$th data point, $\beta_j$ denotes the regression coefficient of the $j$th predictor variable, $N$ denotes the number of data points, $p$ denotes the number of independent variables, and $\lambda$ denotes the penalization parameter, which was determined by 10-fold cross-validation in this study. This method is useful for identifying the specific factors most associated with the confirmed cases by eliminating unassociated variables, so that the predictive performance can be improved (Tibshirani, 1996, 2011). In this study, the logistic regression with Lasso regularization was conducted in R with the “glmnet” package (Friedman et al., 2010; Simon et al., 2011).
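To make the shrinkage behaviour concrete, the sketch below evaluates the squared-error Lasso criterion described above (sum of squared residuals plus λ times the sum of absolute coefficients) on a tiny made-up example. It is an illustration only: the study fits a penalized logistic model with glmnet, which uses a different loss and scaling, and the closed-form soft-thresholding solution shown here holds only under the assumption of an orthonormal design matrix. All names and numbers are hypothetical.

```python
def lasso_objective(X, y, beta, lam):
    """Sum of squared residuals plus lam times the L1 norm of beta."""
    rss = sum(
        (yi - sum(xij * bj for xij, bj in zip(xi, beta))) ** 2
        for xi, yi in zip(X, y)
    )
    return rss + lam * sum(abs(bj) for bj in beta)


def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma, returning exactly 0 inside [-gamma, gamma]."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


# Orthonormal (identity) design, so the least-squares coefficients equal y.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1]
lam = 1.0

unpenalized = [2.0, 0.1]
# For this criterion and an orthonormal design, each coefficient is the
# least-squares value soft-thresholded by lam / 2.
shrunk = [soft_threshold(v, lam / 2.0) for v in y]

print(shrunk)
print(lasso_objective(X, y, unpenalized, lam), lasso_objective(X, y, shrunk, lam))
```

The weakly supported second coefficient is set exactly to zero while the first is only shrunk, which is the variable-selection effect the text refers to.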

Results

From January 23 to April 30, 2020, 1038 COVID-19 cases were confirmed in Hong Kong, comprising 616 imported cases, 25 close contacts of imported cases, 67 local cases, 165 close contacts of local cases, 103 possibly local cases and 62 close contacts of possibly local cases. The temporal distribution of confirmed cases is shown in Fig. 7. The age of the patients ranged from 40 days to 96 years; 559 patients were male and 479 were female. Since the close contacts of imported cases were excluded, 1013 cases were analyzed. Based on the selection criteria described in the methodology, 1893 records were retrieved from the government dataset and 1880 records from the Internet dataset for logistic regression and Lasso regression. After applying the three-day interval for the case-control study, only 1053 and 1005 records were retained from the government and Internet datasets, respectively. There are more location records than COVID-19 cases because patients reported visiting a number of locations during their incubation period. Some cases reported several locations, and some reported one or none, especially imported cases that were diagnosed at entry or during the home confinement period. For each reported location, 125 variables were input to the three proposed models after the highly correlated variables (correlation higher than 0.8) were filtered out, including six building geometry variables, eight road network variables, two greenspace variables and 109 socio-demographic characteristics.
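The case counts quoted above can be cross-checked with a few lines of Python. The dictionary keys are shorthand labels chosen here for the six case classes; the numbers are those reported in the text.

```python
# Confirmed cases by class, January 23 to April 30, 2020 (from the text above).
case_counts = {
    "imported": 616,
    "close contact of imported": 25,
    "local": 67,
    "close contact of local": 165,
    "possibly local": 103,
    "close contact of possibly local": 62,
}

total_cases = sum(case_counts.values())
# Close contacts of imported cases were excluded from the analysis.
analysed_cases = total_cases - case_counts["close contact of imported"]
# Target (local) cases are the four locally acquired classes.
local_cases = analysed_cases - case_counts["imported"]

print(total_cases, analysed_cases, local_cases)
```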
Fig. 7

Temporal distribution of the case class of the confirmed cases.

Table 1 presents the coefficients of the significant spatial factors obtained from logistic regression, case-control analysis and Lasso regression for both the government and Internet datasets. Insignificant variables are not presented because of the large number of variables. The independent variables used in the logistic regression and case-control analysis were tested using the Chi-square test, and Table 1 shows the variables with a p-value smaller than 0.05. For Lasso regression, the coefficients of the variables selected by 10-fold cross-validation are also listed in Table 1. Among the 125 independent variables, 29 were found to have a significant relationship with COVID-19 cases in Hong Kong in at least one of the models or datasets. Of these, 13 variables displayed a positive relationship and 16 a negative relationship with the confirmed cases. Among the significant factors, five variables are associated with building geometry, one with the road network, two with demographic characteristics, one with educational characteristics, 14 with economic characteristics, one with household characteristics and five with housing characteristics. However, no significant relationship was found with greenspace exposure.
Table 1

Coefficient of significant spatial variables from logistic regression analysis, case-control analysis and Lasso regression analysis. The bold text indicates the significant variables from at least three out of six models.

Spatial variable (Main category / Sub-category / Variable): coefficients
Column order: Government dataset (Logistic regression; Case-control; Lasso regression), then Internet dataset (Logistic regression; Case-control; Lasso regression). Rows with fewer than six values list only the models in which the variable was retained.

Urban geometry / Building geometry
Building height (sum): 2.132⁎⁎; 2.710⁎⁎; 1.159; 3.210⁎⁎; 2.240⁎⁎; 0.951
Building height (standard deviation): −2.302⁎⁎
Building density (sum): 0.273
Building density (standard deviation): −1.590⁎⁎; −1.797⁎⁎; −0.030; −1.536⁎⁎; −1.924⁎⁎; −0.140
Sky view factor (sum): −0.419; −0.234

Urban geometry / Road network
Street length (average): −2.304⁎⁎; −0.675; −2.870⁎⁎; −0.226

Socio-demographic characteristics / Demographic characteristics
Population density: 0.849⁎⁎; 0.213
Age group: 65+ (male): 1.007⁎⁎; 0.290

Socio-demographic characteristics / Educational characteristics
Highest educational attainment: Sub-degree course: 1.316⁎⁎; 0.100

Socio-demographic characteristics / Economic characteristics
Economic status: Others: 0.298
Occupation: Professionals: −2.543⁎⁎; −1.669⁎⁎
Occupation: Service and sales workers: 0.093; 1.278; 0.325
Occupation: Craft and related workers: −3.459⁎⁎
Occupation: Skilled agricultural and fishery workers; and occupations not classifiable: −2.123⁎⁎; −3.475⁎⁎; −0.719
Industry: Accommodation and food services: 0.415
Industry: Manufacturing: −0.049
Industry: Public administration, education, human health and social work activities: −0.183
Working location: Another district on Hong Kong Island#: 0.153; 0.813⁎⁎; 1.058⁎⁎; 0.638
Working location: Outside Hong Kong: −1.116⁎⁎
Weekly working hours: 18–34: 0.108
Weekly working hours: 65 and over: 1.458
Median monthly income from main employment (male): −1.723⁎⁎
Median monthly income from main employment (female): −0.073

Socio-demographic characteristics / Household characteristics
Median monthly domestic household income: −0.732

Socio-demographic characteristics / Housing characteristics
Tenure of accommodation: Owner-occupier (with mortgage and loan): −0.018
Tenure of accommodation: Owner-occupier (without mortgage and loan): −1.421⁎⁎
Tenure of accommodation: Sole tenant: 0.309
Tenure of accommodation: Co-tenant: −0.642⁎⁎; −0.705
Tenure of accommodation: Provided by employer: −1.229

Significant result with p-value < 0.01.

Significant result with p-value < 0.05.

# “Working location: Another district on Hong Kong Island” means the number of persons working on Hong Kong Island, excluding the persons living and working in the same district on Hong Kong Island.

Because the results from the three models using the two datasets are not identical, we selected for further investigation those factors that were significant in at least three models; there are six such factors in total (Table 2). Three are associated with urban geometry and three with socio-demographic characteristics. These factors are the sum of building height (positive in six models), the standard deviation of building density (negative in six models), average street length (negative in four models), working location in another district on Hong Kong Island (i.e., the number of persons working on Hong Kong Island, excluding those living and working in the same district on Hong Kong Island) (positive in four models), service and sales workers (positive in three models), and skilled agricultural and fishery workers and occupations not classifiable (negative in three models).
Table 2

Summary of important spatial variables from at least three models.

Main category | Sub-category | Spatial variable | Number of models indicated as significant factors | Sign of the coefficient in the model
Urban geometry | Building geometry | Building height (sum) | 6 | (+)
Urban geometry | Building geometry | Building density (standard deviation) | 6 | (−)
Urban geometry | Road network | Street length (average) | 4 | (−)
Socio-demographic characteristics | Economic characteristics | Working location: Another district on Hong Kong Island | 4 | (+)
Socio-demographic characteristics | Economic characteristics | Occupation: Service and sales workers | 3 | (+)
Socio-demographic characteristics | Economic characteristics | Occupation: Skilled agricultural and fishery workers; and occupations not classifiable | 3 | (−)
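
The selection rule behind Table 2 (keep a variable if it is significant in at least three of the six models) can be reproduced with a short sketch. The coefficient values are copied from Table 1 for a subset of its 29 rows; the dictionary layout and the helper name `summarize` are illustrative assumptions.

```python
# Coefficients per variable as listed in Table 1 (subset of the 29 rows);
# each listed value corresponds to one model that retained the variable.
table1_subset = {
    "Building height (sum)": [2.132, 2.710, 1.159, 3.210, 2.240, 0.951],
    "Building density (standard deviation)": [-1.590, -1.797, -0.030, -1.536, -1.924, -0.140],
    "Street length (average)": [-2.304, -0.675, -2.870, -0.226],
    "Population density": [0.849, 0.213],
    "Occupation: Professionals": [-2.543, -1.669],
    "Occupation: Service and sales workers": [0.093, 1.278, 0.325],
    "Occupation: Skilled agricultural and fishery workers; and occupations not classifiable": [-2.123, -3.475, -0.719],
    "Working location: Another district on Hong Kong Island": [0.153, 0.813, 1.058, 0.638],
}

def summarize(coefficients, min_models=3):
    """Return {variable: (number of models, sign)} for variables in >= min_models models."""
    summary = {}
    for name, values in coefficients.items():
        if len(values) >= min_models:
            signs = {"+" if v > 0 else "-" for v in values}
            sign = signs.pop() if len(signs) == 1 else "mixed"
            summary[name] = (len(values), sign)
    return summary

important = summarize(table1_subset)
for name, (count, sign) in important.items():
    print(f"{name}: {count} ({sign})")
```

Applied to these rows, the rule returns the six variables of Table 2 with the same model counts and signs, and excludes variables such as population density that appear in only two models.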
Table 3 summarizes the absolute contribution of the spatial factors (i.e., the sum of the absolute coefficients of the factors listed in Table 2) to each of the models. Since the values were normalized during pre-processing, these weightings can be compared directly. Comparing the coefficients of urban geometry with those of the socio-demographic characteristics shows that the weighting of urban geometry is higher in all models. The number of important factors in each category, shown in brackets in Table 3, indicates that the number of variables related to urban geometry is usually greater than or equal to the number related to socio-demographic characteristics. These two findings indicate the importance of urban geometry to COVID-19 cases in Hong Kong.
Table 3

Weighting of the important spatial variables to COVID-19 cases by main category. The number in brackets indicates the number of important factors in the model.

Spatial variable | Government dataset: Logistic regression | Government dataset: Case-control | Government dataset: Lasso regression | Internet dataset: Logistic regression | Internet dataset: Case-control | Internet dataset: Lasso regression
Urban geometry | 6.026 (3) | 4.507 (2) | 1.863 (3) | 7.616 (3) | 4.164 (2) | 1.317 (3)
Socio-demographic characteristics | 2.123 (1) | 3.475 (1) | 0.965 (3) | 2.091 (2) | 1.058 (1) | 0.963 (2)
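
The weightings in Table 3 can be checked as sums of absolute Table 1 coefficients over the six important variables. In the sketch below, the model keys (`gov_logit`, `net_cc`, and so on) are hypothetical labels, and the assignment of each coefficient to a model is an assumption inferred from the Table 3 totals and bracketed counts rather than read from an explicit table layout.

```python
# Coefficients of the six important variables, keyed by model.
# Column placement is an assumption inferred from the Table 3 totals.
MODELS = ["gov_logit", "gov_cc", "gov_lasso", "net_logit", "net_cc", "net_lasso"]
urban = {
    "Building height (sum)": dict(zip(MODELS, [2.132, 2.710, 1.159, 3.210, 2.240, 0.951])),
    "Building density (standard deviation)": dict(zip(MODELS, [-1.590, -1.797, -0.030, -1.536, -1.924, -0.140])),
    "Street length (average)": {"gov_logit": -2.304, "gov_lasso": -0.675,
                                "net_logit": -2.870, "net_lasso": -0.226},
}
socio = {
    "Working location: Another district on Hong Kong Island": {
        "gov_lasso": 0.153, "net_logit": 0.813, "net_cc": 1.058, "net_lasso": 0.638},
    "Occupation: Service and sales workers": {
        "gov_lasso": 0.093, "net_logit": 1.278, "net_lasso": 0.325},
    "Occupation: Skilled agricultural and fishery workers; and occupations not classifiable": {
        "gov_logit": -2.123, "gov_cc": -3.475, "gov_lasso": -0.719},
}

def weighting(group, model):
    """Sum of absolute coefficients and number of contributing variables."""
    values = [coefs[model] for coefs in group.values() if model in coefs]
    return round(sum(abs(v) for v in values), 3), len(values)

for model in MODELS:
    print(model, weighting(urban, model), weighting(socio, model))
```

Under this assignment the sums agree with Table 3 to within 0.001 (the Lasso total for urban geometry in the government dataset comes out at 1.864 against the tabulated 1.863, presumably a rounding difference), and the urban geometry weighting exceeds the socio-demographic weighting in every model.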

Discussion

This study compared local with imported COVID-19 cases in Hong Kong using logistic, case-control and Lasso regressions to identify the spatial factors relevant to the transmission and spread of the disease. A total of 125 variables were analyzed: six related to building geometry, eight to road networks, two to greenspace and 109 to socio-demographic characteristics. Six important factors were identified across the six models to explain the relationship between urban geometry, socio-demographic characteristics and the incidence of COVID-19.

Of the socio-demographic characteristics investigated, those found to be associated with COVID-19 patients were related to working locations and occupations, which is consistent with the findings of other studies (Almagro and Orane-Hutchinson, 2020). Positively related factors included working on Hong Kong Island and employment as service and sales workers. Hong Kong Island is the central business district of Hong Kong; it occupies 7% of the land area, but 23% of the working population resides on the island. High population density has been discussed in several studies as one of the factors leading to the transmission of, or mortality from, COVID-19 (Rocklöv and Sjödin, 2020; Wu et al., 2020). The densely populated Hong Kong Island provides many opportunities for social contact among its large working population. In addition, workers in the service and sales industry have a greater chance of coming into close contact with customers and clients, so they face greater risks because of the nature of their jobs. This finding is in line with reports from the first outbreak of COVID-19 in Wuhan, China, where the earliest confirmed cases were salesmen or saleswomen at the Huanan Seafood Wholesale Market (Rothan and Byrareddy, 2020).
Based on the experience of Singapore, Koh (2020) also suggested that occupational exposure was the main reason why the earliest confirmed cases were people engaged in the tourism, retail and transportation industries. The results of this study further support these previous findings, indicating a high exposure risk in certain occupations. In contrast, the occupation group “skilled agricultural, fishery workers and occupations not classifiable” was negatively correlated with infection. These workers either work outdoors or tend to work at one location away from urban areas (e.g., cultivated field, mariculture raft, fishing boat or home), leading to less contact with others.

Some other studies have highlighted the importance of socio-demographic characteristics for COVID-19 cases, especially the effect of population density on the number of cases and the spread of the disease. For example, a direct relationship was found between the COVID-19 outbreak and population density in Iran (Ahmadi et al., 2020). Similarly, a study in Japan found positive correlations between morbidity and mortality rates and population density (Kodera et al., 2020). A study in Turkey emphasized the importance of population density and wind, reporting that these two factors can explain 94% of the variance in the spread of COVID-19 in the country (Coşkun et al., 2021). However, most of these studies have not considered urban geometry as a factor in their analysis. In this study, population density was found to be less important than urban geometry to COVID-19 cases. In Table 1, population density is a significant factor in only two of the six models, while the sum of building height and the standard deviation of building density appear in all six.
In the logistic regression and Lasso regression using the government dataset, the coefficients of population density are 0.849 and 0.213, respectively, while those of the sum of building height are 2.132 and 1.159, those of the standard deviation of building density are −1.590 and −0.030, and those of average street length are −2.304 and −0.675. Comparing the magnitudes of these coefficients shows that the contribution of the urban geometry characteristics is, overall, higher than that of population density in the models.

To our knowledge, no previous research has examined the relationship between COVID-19 and urban geometry. The results of this study reveal the importance of urban geometry to COVID-19. The most important spatial factors found in this study were building height and building density, the two variables that were significant in all six models. There was a positive association between COVID-19 cases and the sum of building height, and a negative association with the standard deviation of building density. Building height and building density are key components of urban geometry and surface roughness, and they largely determine the magnitude of wind ventilation (Kubota et al., 2008; Rafailidis, 1997; Wong et al., 2010). The total building height reflects the level of urbanization of an area, as tall buildings are usually found in urban areas with densely built high-rises. In Hong Kong, most urban buildings have 20 to 50 floors (Wong et al., 2010). High-rise and dense buildings are generally considered unfavorable for wind ventilation, and people's health is adversely affected in such an environment (Wargocki, 2013). Building density indicates how densely buildings are packed in an area; wind ventilation is generally poor in areas of high building density because the wind is obstructed by the buildings (Yang et al., 2019).
The standard deviation of building density represents the distribution of buildings in an area; a higher standard deviation means more diversified building blockage. More diversified building blockage improves ventilation because rough surfaces enhance turbulence generation.

A typical case of the built environment affecting the spread of infectious disease occurred during the SARS outbreak in 2003, when more than 300 people living in the Amoy Gardens housing complex in Hong Kong were infected. The consensus document of the World Health Organization attributed the large number of infections in Block E to dry U-traps in bathrooms, which allowed contaminated sewage droplets to enter households, and to the increasing amount of virus in the sewer system, which was aerosolized in bathrooms (World Health Organization, 2003); contaminated droplets were then transported by running exhaust fans to the air shaft and thus to the upper part of the building (World Health Organization Regional Office for the Western Pacific, 2003). Since some of the patients had no person-to-person contact with others, Yu et al. (2004) conducted epidemiologic analyses, experimental studies and airflow simulations to further examine and confirm the airborne spread of SARS between building blocks. Since the outbreaks of SARS and avian flu in 2003, researchers have been investigating the relationship between urban environments and infectious diseases, and how different urban settings can affect the health of inhabitants and the spread of disease (Wolf, 2016). This study further emphasizes the relationship between building geometry and COVID-19 cases.

The average street length, a road network component of urban geometry, was also significant: four models suggested a negative correlation between this factor and the infection risk within a 500 m walkable zone. Longer lengths are generally represented by main roads, and shorter lengths by short streets and lanes.
As shown in Fig. 5a, the pedestrian networks in Hong Kong's urban areas usually consist of several main roads and a number of short streets connecting to them. A long average street length results from low connectivity, where only main roads exist in the target zone (Fig. 5b) and there are fewer connections between roads. Walking mobility is thus restricted by the limited connectivity, which also reduces social contact. Increased social distancing has been demonstrated as a means of reducing the spread of COVID-19, and the low walking mobility that results from low connectivity may reduce the infection risk, probably through increased social distancing.
Fig. 5

Examples of road network collected from “OSMnx” (a) urban area; (b) rural area.
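
The link between average street length and connectivity described in the discussion can be made concrete with a small numerical sketch. The segment lengths below are hypothetical values chosen for illustration, not measurements from the study or from OSMnx, and `average_street_length` is an illustrative helper rather than the authors' code.

```python
def average_street_length(segment_lengths_m):
    """Mean length (in metres) of the street segments inside a zone."""
    return sum(segment_lengths_m) / len(segment_lengths_m)

# Hypothetical segment lengths within a 500 m zone; not measured data.
urban_zone = [400.0, 350.0] + [60.0] * 12   # two main roads plus many short streets
rural_zone = [450.0, 380.0, 300.0]          # main roads only

urban_avg = average_street_length(urban_zone)
rural_avg = average_street_length(rural_zone)
print(f"urban: {urban_avg:.1f} m over {len(urban_zone)} segments")
print(f"rural: {rural_avg:.1f} m over {len(rural_zone)} segments")
```

With many short connecting streets the urban zone has a low average length but many segments, whereas the zone with only main roads has a high average length and few segments, which is the pattern the text associates with low connectivity and restricted walking mobility.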

In addition to the analysis of individual factors, a major conclusion of this study is that urban geometry is more important than socio-demographic characteristics in COVID-19 risk. Most other studies of the spatial-statistical factors related to COVID-19 cases have focused on socio-demographic characteristics (e.g., Drefahl et al., 2020; Mogi and Spijker, 2020; Raifman and Raifman, 2020), while urban geometry is rarely considered. In this study, although there were fewer independent variables for urban geometry than for socio-demographic characteristics, the results indicate that urban geometry is more important than socio-demography in affecting COVID-19 cases in Hong Kong. Although this significant association has not been mentioned in the literature, urban geometry has been found to influence other diseases and conditions. For example, building properties were found to be correlated with tuberculosis (Lai et al., 2013), respiratory conditions (McCarthy et al., 1985), visual and acoustic comfort (Chan and Liu, 2018), thermal comfort (Ali-Toudert and Mayer, 2006) and excess mortality (Wong et al., 2017). The results of this study confirm the association between urban geometry and disease in the case of COVID-19. The findings on the importance of urban geometry may be extended to other infectious diseases, e.g., influenza, tuberculosis and dengue fever. Since only a few studies have evaluated the association between infectious diseases and urban geometry, this study could be extended further once the data are available.
Limitations of the study include the use of imported cases as a control group for comparison with local cases, which rests on the assumption that the imported cases were not exposed to the high-risk areas when they became infected. Healthy people were not selected as the control group in order to prevent bias caused by differences between the control group and the target group, as more than seven million people did not catch the disease. Regarding the data, all the urban geometry and socio-demographic factors considered were collected at a fixed time, as spatially and temporally dynamic data were unavailable. In addition, the socio-demographic data were retrieved from the census at the TPU level rather than being based on the characteristics of individual patients. The method used in this study is therefore well suited to determining the influence of the social environment on individuals, rather than that of individual personal characteristics (Kosatsky et al., 2012). If the personal characteristics of individual patients could be obtained, individual vulnerability could be identified using a similar approach. Although this study could not exhaust all the potential factors, it provides a useful approach to their identification and highlights the importance of urban geometry to COVID-19 cases.

Conclusions

In this study, a combination of logistic, case-control and Lasso regressions was performed to evaluate the importance of urban geometry and socio-demographic factors in the transmission and spread of COVID-19 in Hong Kong. The main factors contributing to increased risk were building height, working in another district on Hong Kong Island, and service and sales occupations. Factors associated with lower risk included large variation in building density, low walkability, and the occupation group of skilled agricultural and fishery workers and occupations not classifiable. The results suggest that urban design and geometry, including building geometry and road network settings, make an important contribution to COVID-19 risk when compared with socio-demographic characteristics. These results can help citizens to understand and avoid risk, and help the government to establish planning and design policies that minimize disease transmission in the short term and improve urban planning in the long term.

CRediT authorship contribution statement

Coco Yin Tung Kwok: Conceptualization, Methodology, Formal analysis, Investigation, Data curation, Writing - original draft, Visualization. Man Sing Wong: Conceptualization, Methodology, Formal analysis, Investigation, Writing - original draft, Supervision, Funding acquisition. Ka Long Chan: Writing - original draft. Mei-Po Kwan: Methodology, Investigation, Writing - review & editing. Janet Elizabeth Nichol: Writing - review & editing. Chun Ho Liu: Writing - review & editing, Funding acquisition. Janet Yuen Ha Wong: Writing - review & editing. Abraham Ka Chung Wai: Writing - review & editing. Lawrence Wing Chi Chan: Writing - review & editing. Yang Xu: Writing - review & editing. Hon Li: Software, Data curation. Jianwei Huang: Writing - review & editing. Zihan Kan: Writing - review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (10 of 34 shown)

1.  Factors that make an infectious disease outbreak controllable.

Authors:  Christophe Fraser; Steven Riley; Roy M Anderson; Neil M Ferguson
Journal:  Proc Natl Acad Sci U S A       Date:  2004-04-07       Impact factor: 11.205

2.  Environmental change and infectious disease: how new roads affect the transmission of diarrheal pathogens in rural Ecuador.

Authors:  Joseph N S Eisenberg; William Cevallos; Karina Ponce; Karen Levy; Sarah J Bates; James C Scott; Alan Hubbard; Nadia Vieira; Pablo Endara; Mauricio Espinel; Gabriel Trueba; Lee W Riley; James Trostle
Journal:  Proc Natl Acad Sci U S A       Date:  2006-12-07       Impact factor: 11.205

3.  Shifts in mortality during a hot weather event in Vancouver, British Columbia: rapid assessment with case-only analysis.

Authors:  Tom Kosatsky; Sarah B Henderson; Sue L Pollock
Journal:  Am J Public Health       Date:  2012-10-18       Impact factor: 9.308

4.  Evidence of airborne transmission of the severe acute respiratory syndrome virus.

Authors:  Ignatius T S Yu; Yuguo Li; Tze Wai Wong; Wilson Tam; Andy T Chan; Joseph H W Lee; Dennis Y C Leung; Tommy Ho
Journal:  N Engl J Med       Date:  2004-04-22       Impact factor: 91.245

5.  Urbanisation and health in China.

Authors:  Peng Gong; Song Liang; Elizabeth J Carlton; Qingwu Jiang; Jianyong Wu; Lei Wang; Justin V Remais
Journal:  Lancet       Date:  2012-03-03       Impact factor: 79.321

6.  Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China.

Authors:  Chaolin Huang; Yeming Wang; Xingwang Li; Lili Ren; Jianping Zhao; Yi Hu; Li Zhang; Guohui Fan; Jiuyang Xu; Xiaoying Gu; Zhenshun Cheng; Ting Yu; Jiaan Xia; Yuan Wei; Wenjuan Wu; Xuelei Xie; Wen Yin; Hui Li; Min Liu; Yan Xiao; Hong Gao; Li Guo; Jungang Xie; Guangfa Wang; Rongmeng Jiang; Zhancheng Gao; Qi Jin; Jianwei Wang; Bin Cao
Journal:  Lancet       Date:  2020-01-24       Impact factor: 79.321

7.  Occupational risks for COVID-19 infection.

Authors:  David Koh
Journal:  Occup Med (Lond)       Date:  2020-03-12       Impact factor: 1.611

8.  Rethinking Urban Epidemiology: Natures, Networks and Materialities.

Authors:  Meike Wolf
Journal:  Int J Urban Reg Res       Date:  2016-11-01

9.  A Novel Coronavirus from Patients with Pneumonia in China, 2019.

Authors:  Na Zhu; Dingyu Zhang; Wenling Wang; Xingwang Li; Bo Yang; Jingdong Song; Xiang Zhao; Baoying Huang; Weifeng Shi; Roujian Lu; Peihua Niu; Faxian Zhan; Xuejun Ma; Dayan Wang; Wenbo Xu; Guizhen Wu; George F Gao; Wenjie Tan
Journal:  N Engl J Med       Date:  2020-01-24       Impact factor: 91.245

Review 10.  Spatial analysis and GIS in the study of COVID-19. A review.

Authors:  Ivan Franch-Pardo; Brian M Napoletano; Fernando Rosete-Verges; Lawal Billa
Journal:  Sci Total Environ       Date:  2020-06-08       Impact factor: 7.963
