
Defining heatwave thresholds using an inductive machine learning approach.

Juhyeon Park, Jeongseob Kim

Abstract

Establishing appropriate heatwave thresholds is important in reducing adverse human health consequences as it enables a more effective heatwave warning system and response plan. This paper defined such thresholds by focusing on the non-linear relationship between heatwave outcomes and meteorological variables as part of an inductive approach. Daily data on emergency department visitors who were diagnosed with heat illnesses and information on 19 meteorological variables were obtained for the years 2011 to 2016 from relevant government agencies. A Multivariate Adaptive Regression Splines (MARS) analysis was performed to explore points (referred to as "knots") where the behaviour of the variables rapidly changed. For all emergency department visitors, two thresholds (a maximum daily temperature ≥ 32.58°C for 2 consecutive days and a heat index ≥ 79.64) were selected based on the dramatic rise of morbidity at these points. Nonetheless, visitors, who included children and outside workers diagnosed in the early summer season, were reported as being sensitive to heatwaves at lower thresholds. The average daytime temperature (from noon to 6 PM) was determined to represent an alternative threshold for heatwaves. The findings have implications for exploring complex heatwave-morbidity relationships and for developing appropriate intervention strategies to prevent and mitigate the health impact of heatwaves.

Year:  2018        PMID: 30403743      PMCID: PMC6221332          DOI: 10.1371/journal.pone.0206872

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

An extended period of abnormally hot weather (commonly referred to as a heatwave) can cause adverse human health effects. As the frequency, duration and intensity of extreme heat events are predicted to increase due to climate change [1], many countries have implemented heatwave warning systems and response plans to reduce the human health consequences. Defining a “heatwave” is one key factor in effectively mitigating the impacts of extreme heat events. Certain meteorological thresholds (i.e., two or more consecutive days at a maximum temperature above a certain value) are used for evaluating heatwave extremes and triggering warning systems [2, 3]. Some action plans to protect vulnerable groups have also been designed based on such thresholds. To establish appropriate heatwave thresholds, many studies have focused on statistically significant increases in relative risk and/or odds ratios under different definitions related primarily to intensity (i.e., maximum daily temperature) and/or duration (i.e., how many days exceed a certain temperature) of temperature [4-12]. A new heat index combining temperature and humidity with apparent temperature is also under consideration to replace existing definitions by comparing odds ratios with heatwave outcomes [13-15]. Previous approaches use deductive reasoning, making assumptions first and then seeking validation using heatwave outcomes. This study focuses inductively on the non-linear relationship between meteorological variables and heatwave outcomes such as those that take a J, U, or V-shape [16-18]. While a few studies have given attention to the non-linear curve [19, 20], none have investigated a tipping point where human health effects rapidly change, which should be closely related to the definition of a heatwave. 
This concept has been neglected mainly because classical statistical techniques such as Ordinary Least Squares (OLS) cannot capture such a tipping point, as their linearity and normality assumptions may not be satisfied. In this paper, we suggest an alternative approach that directly reveals thresholds using Multivariate Adaptive Regression Splines (MARS). This machine learning technique is suited to capturing curvature in predicted outcomes and thus allows for non-linearity. We used this approach to develop a heatwave definition by focusing on the relationship between 1) the frequency of emergency department visits identified as heat related and 2) several meteorological factors.

Materials and methods

Study area and dataset

We focused on the summer season in the Seoul metropolitan area of Korea, which includes Seoul, Incheon, and Gyeonggi-do, with a population of 25 million in 2015, representing 49.5% of the country’s population. As the Korean Peninsula lies within the East Asian monsoon belt, summer generally falls between June and August, with the hottest month being August, when the mean temperature is about 24–26°C. Our dataset included summertime heat-related morbidity rates from 2011–2016 (totaling 521 days from June to August), recorded at 74 public health centers across the region. In other words, 38,554 observations were considered (74 centers × 521 days). Daily meteorological factors were obtained from the nearest weather station, determined according to the latitude and longitude of each public health center and each weather station (Fig 1). The 105 weather stations covered the whole Seoul metropolitan area, with an average inter-station distance of 8.7 km and an average distance of 3.26 km (SD = 2.51 km) from public health centers.
Fig 1

Study area and dataset [Data Source: [21–23]].
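The station-matching step can be sketched as a nearest-neighbour search on great-circle distance. This is only an illustration: the text does not say which distance measure was used, so the haversine formula, the function names, and the coordinates in the usage note are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_station(center, stations):
    """Return (station_id, distance_km) of the station closest to `center`.

    center: (lat, lon) of a public health center.
    stations: dict mapping a station id to its (lat, lon).
    """
    return min(
        ((sid, haversine_km(center[0], center[1], lat, lon))
         for sid, (lat, lon) in stations.items()),
        key=lambda pair: pair[1],
    )
```

For example, `nearest_station((37.5, 127.0), {"A": (37.57, 126.97), "B": (37.0, 127.5)})` selects the hypothetical station "A"; each health center's daily outcomes would then be joined to that station's daily weather record.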

Data

Daily data on emergency department visitors diagnosed with heat illnesses were obtained from a heat-related illness surveillance system operated by the Korea Centers for Disease Control and Prevention (KCDC) for a total of 586 days during the summer periods from 2011 to 2016: 65 days in 2011 (7/1-9/9), 98 days in 2012 (6/1-9/6), 98 days in 2013 (6/1-9/7), 98 days in 2014 (6/1-9/6), 105 days in 2015 (5/24-9/5) and 122 days in 2016 (5/23-9/21). All cases were initially reported by local emergency medical centers and then collected by the 74 public health centers. Heat-related illnesses are defined according to the International Classification of Diseases, 10th Revision, code T67, “Effects of Heat and Light,” which includes such categories as heat stroke, heat cramps, heat syncope, and heat exhaustion. We retained only the data from June to August because a significant amount of information was missing for visitors who arrived during the early (May) and late (September) stages of surveillance. The data included date of visit, sex, age, place of occurrence and diagnosis results. This information was categorized by month, place of occurrence, and heatwave vulnerability profile of the visitor (age, gender and diagnosis results) (Table 1). To standardize the variables, we divided them by the population of the county where each public health center was located (mean = 303,011, SD = 143,257 for the whole population). The population was calculated as the average of the 2010 and 2015 census data obtained from Statistics Korea.
Table 1

Description of heatwave morbidity.

| Subcategory | Name | Description | Count (%) |
|---|---|---|---|
| | HWwhole | Number of heat-related emergency department visits each day | 1,468 (100.0) |
| Age | HWyoung | HWwhole under 18 years old | 87 (5.9) |
| | HWadult | HWwhole 18 to 65 years old | 1,014 (69.1) |
| | HWolder | HWwhole over 65 years old | 367 (25.0) |
| Gender | HWmale | Male HWwhole | 1,161 (79.1) |
| | HWfemale | Female HWwhole | 307 (20.9) |
| Diagnosis* | HWdisch | HWwhole discharged from a hospital | 489 (33.3) |
| | HWhospital | HWwhole entered a hospital | 875 (59.6) |
| Month of occurrence* | HWjunjul | HWwhole occurred in June or July | 551 (37.5) |
| | HWaug | HWwhole occurred in August | 917 (62.5) |
| Place of occurrence* | HWin | HWwhole occurred indoors | 344 (23.4) |
| | HWout | HWwhole occurred outdoors | 1,102 (75.1) |

Note: The variables were divided by the population of the county where each public health center was located.

*Some information on diagnosis, month and place of occurrence is missing.

Meteorological data (considering 19 factors) from the same time periods were obtained from the Korea Meteorological Administration (Table 2). We adopted the common definition of a heatwave as a heat event during which a certain temperature threshold is surpassed on a given day (Tmin, Tavg, and Tmax) or over consecutive days (AvgTmaxLag1, AvgTmaxLag2, and AvgTmaxLag3) [24, 25]. Daytime temperature (from noon to 6 PM) and its lagged values were included as alternative heatwave thresholds (Tavg1218 along with AvgTavg1218Lag1, AvgTavg1218Lag2, and AvgTavg1218Lag3).
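The lagged variables are running means over day i-k to day i, as described in Table 2. A minimal sketch follows, assuming one value per consecutive calendar day with no gaps; the function name `lag_mean` is illustrative, and how the first days of each season were handled in the study is not stated, so they are left undefined here.

```python
def lag_mean(values, k):
    """Mean of values[i-k .. i] for each day i.

    Returns None for the first k days, where the window is incomplete.
    With k=1 this mirrors AvgTmaxLag1 (mean of day i-1 and day i).
    """
    out = []
    for i in range(len(values)):
        if i < k:
            out.append(None)  # not enough preceding days
        else:
            window = values[i - k:i + 1]
            out.append(sum(window) / len(window))
    return out
```

Applied to a daily Tmax series, `lag_mean(tmax, 1)`, `lag_mean(tmax, 2)` and `lag_mean(tmax, 3)` would correspond to AvgTmaxLag1 to AvgTmaxLag3; the same function applied to Tavg1218 gives the daytime variants.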
Table 2

Description of meteorological factors.

| Variable | Description | Mean ± std. dev. | Min | Max |
|---|---|---|---|---|
| Tmin_i | Minimum temperature of day i | 21.13 ± 2.96 | 5.00 | 29.80 |
| Tavg_i | Average temperature of day i | 24.83 ± 2.51 | 14.40 | 32.90 |
| Tavg1218_i | Average daytime (noon to 6 PM) temperature of day i | 27.68 ± 2.99 | 13.77 | 36.44 |
| AvgTavg1218Lag1_i | Average Tavg1218 of day i-1 to i | 27.67 ± 2.65 | 16.22 | 36.22 |
| AvgTavg1218Lag2_i | Average Tavg1218 of day i-2 to i | 27.67 ± 2.47 | 17.51 | 35.95 |
| AvgTavg1218Lag3_i | Average Tavg1218 of day i-3 to i | 27.66 ± 2.36 | 17.34 | 35.76 |
| Tmax_i | Maximum temperature of day i | 29.5 ± 3.03 | 16.30 | 38.90 |
| AvgTmaxLag1_i | Average Tmax of day i-1 to i | 29.5 ± 2.7 | 18.15 | 38.80 |
| AvgTmaxLag2_i | Average Tmax of day i-2 to i | 29.49 ± 2.52 | 19.33 | 38.47 |
| AvgTmaxLag3_i | Average Tmax of day i-3 to i | 29.48 ± 2.41 | 19.25 | 38.13 |
| TmaxGap_i | Tmax_i − Tmax_(i-1) | 0.42 ± 0.21 | 0.00 | 4.10 |
| TavgGap_i | Tavg_i − Tavg_(i-1) | 0.19 ± 0.07 | -0.10 | 0.55 |
| Whum_i | Relative humidity of day i | 75.23 ± 13.07 | 28.90 | 100.00 |
| Wpc_i | Precipitation of day i | 8.09 ± 23.38 | 0.00 | 449.50 |
| Wwind_i | Average wind speed of day i | 1.57 ± 0.84 | 0.00 | 38.70 |
| WsolMAX_i | Maximum amount of solar radiation of day i | 2.09 ± 0.82 | 0.00 | 3.98 |
| WsolVOL_i | Total amount of solar radiation of day i | 14.26 ± 6.63 | 0.00 | 28.94 |
| Nindex1_i | Heat index of day i [26] | 74.1 ± 3.9 | 57.94 | 85.10 |
| Nindex2_i | Discomfort index of day i [27] | 79.1 ± 6.86 | 56.86 | 107.88 |
In addition to these simple, uncombined meteorological variables, we employed two indices that combine daily mean temperature and relative humidity. The first index was suggested by the US National Weather Service (NWS) and was originally developed by Steadman [26] (Eq 1), where Nindex1 is the heat index in °F, T is the temperature in °F and RH is relative humidity. Two adjustments were considered when calculating Nindex1 (https://www.wpc.ncep.noaa.gov/heat_index/details_hi.html). Second, a discomfort index (Nindex2) was calculated using the formula by Thom [27], where Tc is the temperature in °C and RH is relative humidity. The maximum value from hourly measurements of the indices was also considered but not adopted because of multicollinearity with the maximum temperature variables.
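For orientation, the NWS heat index described on the page linked above is the Rothfusz regression with two humidity adjustments. The sketch below follows that published form; it is not taken from the authors' code, the function names are illustrative, and the simpler formula that NWS applies at lower temperatures is omitted, so the result is only meaningful for warm conditions (roughly 80°F and above).

```python
import math


def c_to_f(t_c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return t_c * 9 / 5 + 32


def rothfusz(t_f, rh):
    """Rothfusz regression: t_f in degrees F, rh as relative humidity in percent."""
    return (
        -42.379
        + 2.04901523 * t_f
        + 10.14333127 * rh
        - 0.22475541 * t_f * rh
        - 6.83783e-3 * t_f ** 2
        - 5.481717e-2 * rh ** 2
        + 1.22874e-3 * t_f ** 2 * rh
        + 8.5282e-4 * t_f * rh ** 2
        - 1.99e-6 * t_f ** 2 * rh ** 2
    )


def heat_index_f(t_f, rh):
    """Regression value with the two NWS adjustments applied."""
    hi = rothfusz(t_f, rh)
    if rh < 13 and 80 <= t_f <= 112:
        # Very dry air: subtract the low-humidity adjustment.
        hi -= ((13 - rh) / 4) * math.sqrt((17 - abs(t_f - 95)) / 17)
    elif rh > 85 and 80 <= t_f <= 87:
        # Very humid air at moderate heat: add the high-humidity adjustment.
        hi += ((rh - 85) / 10) * ((87 - t_f) / 5)
    return hi
```

Because the station data are in °C, a daily value would be computed as `heat_index_f(c_to_f(tavg_c), whum)`, assuming daily mean temperature and relative humidity as inputs, as the text describes.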

Modelling technique

The MARS method implemented in the “earth” package for R 3.4.1 was used to investigate the non-linear relationship between heatwave outcomes and meteorological variables. MARS is a spline regression model introduced in 1991 [28] that focuses on specific sub-regions of the relationship between covariates and response variables. A knot point t, where the behavior of the function changes, marks the end of one region and the beginning of another, forming a pair of basis functions: (x − t)+ and (t − x)+, where (·)+ denotes the positive part. MARS first generates a model with an excessive number of knots and then eliminates those that contribute least to the overall fit. Basis functions are used to search for the number of knots and their locations, representing the relationship between the predictor variables (x) and the outcome variable (y):
$$\hat{y} = \beta_0 + \sum_{m=1}^{M} \beta_m h_m(x)$$
where β0 is an intercept, βm are the coefficients estimated by minimizing the sum of squares, and hm(x) are the basis functions, so that the model is a weighted sum of basis functions. In the forward phase, MARS searches all possible basis functions and their corresponding knots. Starting with a model consisting only of the intercept term, β0, basis functions are added progressively, each reducing the residual sum of squares as much as possible. However, the resulting model can overfit because of the large number of basis functions. To mitigate this problem, a backward phase improves the model by iteratively deleting the least significant terms until the version with the lowest generalized cross validation (GCV) score is reached. In the model building process, predictors and knot locations that contribute significantly are automatically selected. Additionally, the response variable (y) is defined as a continuous variable because count data cannot be a response variable for MARS [29, 30].
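To make the role of a knot concrete, the sketch below fits a single right-hinge term max(x − t, 0) by least squares for each candidate knot and keeps the one with the smallest residual sum of squares. This is a deliberately reduced illustration for one predictor and one knot, not the full forward and backward procedure of the earth package; the function names are hypothetical. The `gcv` helper uses the general form of the criterion, with the effective number of parameters supplied by the caller.

```python
def hinge(x, t):
    """Right hinge basis function (x - t)+ = max(x - t, 0)."""
    return max(x - t, 0.0)


def fit_single_hinge(xs, ys, candidate_knots):
    """Find the knot t minimising RSS for y ~ b0 + b1 * max(x - t, 0).

    Returns (knot, b0, b1, rss), or None if no candidate is usable.
    """
    n = len(xs)
    y_mean = sum(ys) / n
    best = None
    for t in candidate_knots:
        h = [hinge(x, t) for x in xs]
        h_mean = sum(h) / n
        sxx = sum((hi - h_mean) ** 2 for hi in h)
        if sxx == 0.0:
            continue  # hinge is constant on the data, so no slope can be fitted
        sxy = sum((hi - h_mean) * (yi - y_mean) for hi, yi in zip(h, ys))
        b1 = sxy / sxx
        b0 = y_mean - b1 * h_mean
        rss = sum((yi - (b0 + b1 * hi)) ** 2 for hi, yi in zip(h, ys))
        if best is None or rss < best[3]:
            best = (t, b0, b1, rss)
    return best


def gcv(rss, n, effective_params):
    """Generalized cross validation: mean squared residual, penalised for model size."""
    return (rss / n) / (1.0 - effective_params / n) ** 2
```

On synthetic data that are flat up to 33 and then rise linearly, the search returns a knot at 33, which is the sense in which a knot marks the point where morbidity starts to respond to temperature.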

Results

Initially, we focused on developing a MARS model based on common meteorological factors (not including Tavg1218 variables) to explore the definition of a heatwave. The mathematical equation resulting from the MARS model for all emergency department visitors diagnosed with heat illnesses is given as Eq 4. Among all the meteorological variables, AvgTmaxLag1 and Nindex1 were included in the equation, and the others were removed during model fitting. The knots for AvgTmaxLag1 and Nindex1 were 32.95 and 79.65, respectively. The term “max” is defined as follows: max(j, k) equals j if j is larger than k; otherwise it equals k. A positive sign for a function indicates that the relevant meteorological variable increases the probability of a patient being diagnosed with heat illness, while a negative sign indicates that it decreases this probability. Eq 4 can be explained as follows: AvgTmaxLag1 has little impact on HWwhole when it is lower than 32.95, while its effect rapidly increases above 32.95. Nindex1 likewise influences HWwhole once its value exceeds 79.65. Fig 2 plots the predicted HWwhole as AvgTmaxLag1 and Nindex1 vary.
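The fitted coefficients of Eq 4 are not reproduced in this text, but the description above fixes its general shape. As a sketch only, with β0, β1 and β2 standing in for the unreported estimates, and assuming the model contains just the two hinge terms that the text discusses:

```latex
\widehat{\mathrm{HW}}_{\mathrm{whole}}
  = \beta_0
  + \beta_1 \max(0,\ \mathrm{AvgTmaxLag1} - 32.95)
  + \beta_2 \max(0,\ \mathrm{Nindex1} - 79.65),
  \qquad \beta_1 > 0,\ \beta_2 > 0
```

Each term is zero below its knot and grows linearly above it, which is why the predicted morbidity stays flat until AvgTmaxLag1 reaches 32.95 or Nindex1 reaches 79.65.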
Fig 2

Graphical representation of the MARS model.

Moreover, we allowed second-order interaction terms, excluding the two combined indices (Nindex1 and Nindex2). The resulting model (Eq 5) includes the combinations i) Tavg-Whum and ii) AvgTmaxLag1-Whum (Fig 3). This indicates that the impact of temperature emerges through its interaction with humidity on a given day.
Fig 3

Graphical representation of interaction terms from the MARS.
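In MARS, a second-order interaction term is a product of two hinge functions, one per variable. The knots and coefficients of Eq 5 are not given in this text, so the following shows only the generic form of such a term for the Tavg-Whum pair, with t_1, t_2 and β as placeholders and the hinge directions left open:

```latex
\beta \cdot \max\bigl(0,\ \pm(\mathrm{Tavg} - t_1)\bigr)
      \cdot \max\bigl(0,\ \pm(\mathrm{Whum} - t_2)\bigr)
```

A term of this kind is non-zero only when both variables lie on the active side of their knots, so temperature contributes only within a particular humidity range.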

Table 3 shows the first and second most important factors, with knots from other MARS models for each subcategorized heatwave outcome and daytime temperature variables. To focus on identifying the locations of knots where the function value was found to vary, we reported each knot and skipped coefficient values. The interpretation for other models was similar to the prior MARS model, which targeted all emergency department visitors diagnosed with heat illnesses, because we verified that the coefficients of all functions in the form “max(x-t, 0)” were positive, as for AvgTmaxLag1 and Nindex1 in Eq 4. For example, there is a knot “30.68” of AvgTmaxLag3 in the model for young heat-related visitors (HWyoung). It can be said that the effect of AvgTmaxLag3 played a role in increasing the frequency of visits by young people when it was larger than the knot (30.68).
Table 3

Importance of the MARS model and related knots.

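| Subcategory | Outcome | Importance #1: Variable | Importance #1: Knot | Importance #2: Variable | Importance #2: Knot |
|---|---|---|---|---|---|
| (Not including Tavg1218 variables) | | | | | |
| | HWwhole | AvgTmaxLag1 | 32.95 | Nindex1 | 79.65 |
| Age | HWyoung | AvgTmaxLag3 | 30.68 | - | - |
| | HWadult | AvgTmaxLag1 | 31.80 | Nindex2 | 80.48 |
| | HWolder | AvgTmaxLag1 | 33.40 | - | - |
| Gender | HWmale | AvgTmaxLag1 | 32.15 | Nindex1 | 78.43 |
| | HWfemale | AvgTmaxLag1 | 33.10 | Tavg | 30.5 |
| Diagnosis | HWnothosp | AvgTmaxLag1 | 32.80 | Nindex2 | 76.75 |
| | HWhosp | AvgTmaxLag1 | 33.45 | Nindex2 | 79.42 |
| Month of occurrence | HWjunjul | Tmax | 29.80 | Nindex2 | 80.65 |
| | HWaug | AvgTmaxLag1 | 33.35 | Nindex1 | |
| Place of occurrence | HWin | Tavg | 30.00 | Nindex2 | 78.40 |
| | HWout | AvgTmaxLag1 | 31.25 | Nindex2 | 80.66 |
| (Including Tavg1218 variables) | | | | | |
| | HWwhole | AvgTavg1218Lag1 | 30.59 | Nindex2 | 80.22 |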

Discussion

This study developed a definition of a heatwave using a machine learning technique, MARS, to describe the fundamental relationship between i) the daily frequency of emergency department visits associated with heat illness and ii) 19 meteorological factors. MARS enabled non-linear relationships to be rendered and automatically identified breaking points (knots) among separate sub-groups. Knots where the behaviour of functions changed dramatically were used to define a heatwave. For all emergency department visitors diagnosed with heat illnesses, two thresholds were selected based on the knots where heatwave morbidity started to rise dramatically: an average maximum temperature over 2 consecutive days above 32.58°C and a heat index above 79.64. These thresholds are similar to existing criteria: 1) the Korea Meteorological Administration issues a heatwave warning when the maximum temperature exceeds 33°C for two straight days, while 2) the National Weather Service (NWS) gives a “Caution” notice for possible fatigue from prolonged exposure and/or physical activity when the heat index exceeds 80. We calculated the number of summer days that meet the existing and alternative criteria based on Station 108, which is representative of Seoul (Table 4). Compared to the current criteria, the number of heatwave days increased three to four times when the two criteria in this study were adopted. When we allowed second-order interactions, combinations of temperature and humidity were included in the model as important variables, even though South Korea uses only maximum temperature to initiate a heat warning. This suggests that revised criteria and/or additional warning stages (i.e., attention, alarm, emergency) should be considered for the early warning system.
Table 4

Number of summer days that meet existing and alternative criteria based on Station 108 in Seoul.

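| Year | Number of summer days^a | Existing criteria^b | Alternative criteria 1^c | Alternative criteria 2^d | Alternative criteria 3^e |
|---|---|---|---|---|---|
| 2011 | 92 | 0 | 5 | 27 | 28 |
| 2012 | 92 | 10 | 13 | 46 | 46 |
| 2013 | 92 | 0 | 4 | 50 | 51 |
| 2014 | 92 | 4 | 6 | 37 | 37 |
| 2015 | 92 | 3 | 7 | 40 | 41 |
| 2016 | 92 | 19 | 29 | 47 | 47 |
| Total | 552 | 36 | 64 | 247 | 250 |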

aJune to August

bMaximum temperature exceeds 33°C for two consecutive days

cAverage maximum temperature for two straight days (AvgTmaxLag1) ≥ 32.58

dNWS heat index (Nindex) ≥ 79.64

eAvgTmaxLag1 ≥ 32.58 OR Nindex ≥ 79.64
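The day counts in Table 4 amount to applying each criterion to a daily series and counting the days that qualify. A minimal sketch follows, under stated assumptions: the inputs are consecutive daily values for one station, a day is counted under the existing criterion when it and the previous day both exceed 33°C, and the first day of the series has no two-day mean. The function and key names are illustrative, and the exact counting convention used in the study is not stated.

```python
def count_criteria_days(tmax_c, heat_index, temp_knot=32.58, hi_knot=79.64):
    """Count days meeting the existing and the three alternative criteria.

    tmax_c: daily maximum temperatures in degrees C, consecutive days.
    heat_index: NWS heat index values for the same days.
    """
    counts = {"existing": 0, "alt1": 0, "alt2": 0, "alt3": 0}
    for i in range(len(tmax_c)):
        has_prev = i > 0
        # Existing criterion: Tmax above 33 C on two consecutive days.
        existing = has_prev and tmax_c[i - 1] > 33.0 and tmax_c[i] > 33.0
        # Alternative 1: two-day mean of Tmax (AvgTmaxLag1) at or above the knot.
        alt1 = has_prev and (tmax_c[i - 1] + tmax_c[i]) / 2 >= temp_knot
        # Alternative 2: heat index at or above its knot.
        alt2 = heat_index[i] >= hi_knot
        # Alternative 3: either alternative criterion holds.
        alt3 = alt1 or alt2
        for name, flag in (("existing", existing), ("alt1", alt1),
                           ("alt2", alt2), ("alt3", alt3)):
            counts[name] += int(bool(flag))
    return counts
```

Because alternative 3 is the union of the other two alternatives, its count can never be smaller than either of them, which matches the pattern in the table.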

Heat-related outcomes varied depending on visitors’ profiles, similar to findings from previous studies on heat-related mortality and morbidity. Children were reported as being more sensitive to heatwaves [31, 32], as the number of emergency room visitors under 18 years of age with heat illness rose dramatically at a lower threshold than for other age brackets. While the elderly are also generally considered vulnerable to heatwaves [4, 33–35], an opposite pattern was observed (relatively fewer emergency room visitors even at a higher threshold). However, two-thirds of deaths from heatwaves occur among the elderly, and heat-related mortality was not included in this model; it will be included in future work, where more in-depth consideration can be given.
In gender-specific results, males were found to be more likely to visit the emergency room due to heat-related morbidity at a lower threshold compared with females. At first glance, this does not seem consistent with previous studies, which have identified women as facing higher risk or shown no difference by gender [36-39]. The results of this study might be explained by males having greater exposure to heat through male-dominated occupations, such as construction work, that place individuals in positions vulnerable to heatwaves. For diagnosis, it could be helpful to design multi-stage heatwave warning systems according to the threshold for each group, giving sensitivity to differences in demographic occurrences of severe illness and hospital admission. Our results suggest that heatwave prevention systems and response plans should be designed according to time and place.
There is a difference of about 2–3°C in the trigger point for early summer (June to July) for outside workers, a conclusion supported in previous work [40, 41]. In other words, a lower threshold should be set to improve response plans for specific demographics; for example, outside workers should take a 10-minute break every hour, as the Korea Occupational Safety and Health Agency recommends. Additionally, in contrast to the basis for existing thresholds (i.e., daily maximum temperature), this study found that average daytime (noon to 6 PM) temperature can serve as an alternative threshold for heatwaves, as it reflects heat exposure during the hottest part of the day. With this in mind, other meteorological factors should be explored as further heatwave thresholds. Some limitations were encountered during this study. Relying on fixed monitoring stations may misrepresent true individual-level exposures, so additional analyses such as an object analysis of monitoring should be conducted for the meteorological variables considered here. Other heatwave outcomes such as heat-related mortality should also be considered.

Conclusions

This study investigated heatwave thresholds based on daily data for heat-related morbidity and meteorological variables. Thresholds were inductively determined using MARS models to explore non-linear relationships. For all emergency department visitors diagnosed with heat illnesses, the thresholds identified in this research were similar to existing values used to trigger heatwave warning systems. These thresholds varied depending on visitors’ profiles and the place and time of each occurrence. Average daytime temperature was selected as an alternative factor informing heatwave thresholds. Our findings can help improve understanding of the effect of heatwaves on human health and be used to design more effective heatwave warning systems.