Advancing our Understanding of Heat Wave Criteria and Associated Health Impacts to Improve Heat Wave Alerts in Developing Country Settings

Health effects of heat waves with high baseline temperatures in areas such as India remain a critical research gap. In these regions, extreme temperatures may affect the underlying population’s adaptive capacity; heat wave alerts should be optimized to avoid continuous high alert status and enhance constrained resources, especially under a changing climate. Data from registrars and meteorological departments were collected for four communities in Northwestern India. Propensity Score Matching (PSM) was used to obtain the relative risk of mortality and number of attributable deaths (i.e., absolute risk which incorporates the number of heat wave days) under a variety of heat wave definitions (n = 13) incorporating duration and intensity. Heat waves’ timing in season was also assessed for potential effect modification. Relative risk of heat waves (risk of mortality comparing heat wave days to matched non-heat wave days) varied by heat wave definition and ranged from 1.28 [95% Confidence Interval: 1.11–1.46] in Churu (utilizing the 95th percentile of temperature for at least two consecutive days) to 1.03 [95% CI: 0.87–1.23] in Idar and Himmatnagar (utilizing the 95th percentile of temperature for at least four consecutive days). The data trended towards a higher risk for heat waves later in the season. Some heat wave definitions displayed similar attributable mortalities despite differences in the number of identified heat wave days. These findings provide opportunities to assess the “efficiency” (or number of days versus potential attributable health impacts) associated with alternative heat wave definitions. Findings on both effect modification and trade-offs between number of days identified as “heat wave” versus health effects provide tools for policy makers to determine the most important criteria for defining thresholds to trigger heat wave alerts.


Introduction
Interest in heat waves and extreme heat events has increased globally over the past few years, following high profile heat events in the US and Europe that significantly impacted human health. Previous analysis of the 1995 Chicago heat wave illustrated gaps in the methodologies for the comparison of health effects of heat waves across regions [1], which led to application of time-series Specifically, we assess to what extent the timing of the HW within the corresponding HW season modifies the relationship between HW and mortality. Hot seasons intersperse with the Northeastern and Southwestern monsoon seasons in some coastal areas in India; for example, within our dataset, Mumbai falls into this category. Associated meteorological differences in these communities could lead to different seasonal effects of heat wave periods. Some earlier studies in other study areas found that the first HW of the hot season had higher health risks than later HWs [6,19], which could relate to adaptation of the population or vulnerable populations succumbing early in the season (mortality displacement effects).
In addition, we propose an innovative analytical approach to assess the causal effect of HW on mortality through the use of propensity score matching (PSM). Propensity score techniques aim to achieve covariate balance in regards to measured time-varying covariates (i.e., confounders) [20]. This approach also allows one to clearly formulate the causal question of interest. In this paper, we apply an approach based on propensity score matching to estimate the effect of heat waves on mortality.
We examined the effect of the following parameters on the risk of mortality among five cities in northwestern India: (1) HW Definition, incorporating both percentile thresholds and heat wave duration; and (2) HW Timing in Season, assessing whether the heat wave was the first, second, third, fourth, fifth, or later in season.

Data
Daily temperature data and other meteorological data were collected from five cities in India: Jaipur (Rajasthan), Churu (Rajasthan), Idar (Gujarat), Himmatnagar (Gujarat), and Mumbai (Maharashtra). These cities represent a variety of climate conditions experienced throughout the states of Maharashtra, Gujarat, and Rajasthan in northwestern India; the state of Rajasthan records some of the hottest temperatures within cities in India [21]. Meteorological data were obtained from the Indian Meteorological Department (IMD) through their Metadata center in Pune and supplemented with data from the National Oceanic and Atmospheric Administration (NOAA) Global Summary of the Day (GSOD) project. Because meteorological stations were not available for Idar and because the cities of Idar and Himmatnagar are close in proximity (approximately 40 km from city center to city center), Idar and Himmatnagar were combined into a single community for subsequent analysis.
Although the Civil Registration Act of 1952 requires municipality-level birth and death registration for the purpose of property transfers, no central registry for mortality data for India exists for research purposes. We collected detailed registry data on daily mortalities from the relevant city-level registry offices through 2012. The data were available over different study periods for each community depending on the availability of data within the relevant local registry. The four communities and associated time frames included in our study period were Jaipur (2005-2012) and Churu (2003-2012) in Rajasthan; Idar and Himmatnagar (2008-2012) in Gujarat; and the Island City of Mumbai (2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012) in Maharashtra. Data from the Island City of Mumbai were collected from the local registry for persons ≥35 years of age; for other cities, data were collected for persons of all ages. These data, which do not provide details on cause of death, were collated into all-cause aggregated daily mortality data for each of the communities of interest. To our knowledge, this is the first such compilation of detailed registry date for multiple Indian communities for the purpose of epidemiologic analysis and public health [14]. Data collection was approved by the Yale University Human Subjects Committee (HSC# 1405014004).

Heat Wave Definitions
To understand the impact of different heat wave definitions on the association between HWs and mortality, a variety of heat wave definitions by temperature threshold (intensity) and number of consecutive days of exceeding the threshold (duration) were applied to each community, as seen in Table 1. Thresholds were based on each community separately. Under these heat wave definitions, Mumbai is highlighted as a coastal city which experiences a mid-season monsoon period. Other cities in the data set do not experience a mid-season monsoon period, and the resulting hot season is shorter in duration. A notable exception in the list of heat wave definitions is the "India Meteorological Department (IMD)" definition. Meteorologists within the IMD use community-specific parameters to define conditions for when a "heat wave" is declared based on the historical temperature record of a given location, with forecasts limited to the temperature today and tomorrow as opposed to predicting heat waves multiple days in advance. The IMD definition of a heat wave allows for a single day of high temperature to be declared as a "heat wave", in addition to multiple consecutive days of high temperature [22]. This contrasts with more common definitions of a "heat wave" in which at least two consecutive days of high temperature are necessary for the days to be considered a heat wave [23]. We included the IMD definition of a heat wave in our analysis to improve the policy relevance, and illustrate how the IMD definition of a heat wave compares to other standards in the literature.
Another methodological issue in assessing the relationship between heat waves and health is whether the heat metric used is an absolute heat metric (i.e., the same heat metric across all cities, such as a given temperature or threshold for heatwaves) or a relative metric (i.e., a percentile of temperature based on each community's temperature distribution). Both have been used in the literature [6,7]. These metrics (absolute versus relative) have different interpretations and implications with respect to adaptation, which is an important consideration for climate change studies and public health policy. Acclimatization can occur through physical adaptation, housing characteristics, or behavioral patterns (e.g., staying indoors, clothing choices, water consumption). With a high degree of acclimatization to weather, results of heat wave-health assessments would be similar across communities for relative temperature effects and different for absolute effects. Without a high degree of acclimatization, communities would have similar absolute effects and dissimilar relative effects. In our models, we utilized a relative threshold, based on a community's long-term weather, to allow for regional acclimatization to temperatures normal for these communities and to increase the comparability across different communities with different long-term meteorological trends.

Propensity Score Matching
PSM was used to control for confounding factors including longer-term patterns (time trends), seasonal and cyclical variations, short-term systematic effects (calendar effects including day of the week, weekend versus weekday) and other short-term time varying confounders (meteorological variables, in this case, adjusted dew point temperature) [24,25].
In our context, the underlying causal question focuses on the estimated health outcome rate on heat waves days, had the heat wave days not occurred. In other words, the outcome rate of interest was compared using heat wave days and non-heat wave days that were similar for all measured background covariates (except the daily temperature that was used to define a heat wave day). In addition, this approach allowed an intermediate stage in which it was possible to check covariate balance and eventually exclude heat wave days that could not be matched to any non-heat wave (i.e., control) days and vice versa.
Ensuring that the distribution of observed baseline covariates is similar and balanced between exposed and unexposed subjects (referred to as exchangeability) is fundamental when making causal inferences [26]. To reduce the potential for confounding within this observational study, PSM was used to compare heat wave days according to the heat wave definitions provided in Table 1 with the most similar non-heat wave days based on available data on measured confounders. We estimated the propensity score for each day and community in the study period using a parametric method, logistic regression, to predict the probability for a day to be a HW. This prediction was based on covariates included in the PSM model to generate a model to identify matched non-HW days. In the final model, which was run separately for each community and HW definition, we included the following covariates: day of the week (DOW) (weekend versus weekday), month, year, and lag temperatures (lag 1 to lag 3) as follows: We estimated for each day and community a propensity score that was used to match HW days to similar non-HW days. Visual comparisons of kernel density plots of the resulting propensity scores as well as the standardized mean differences of each covariate were used to judge whether the matching improved covariate balance and to select the number of "nearest neighbors" and caliper size. We thus used "nearest-neighbor" matching [27] to match HW days to similar non-HW days with a ratio of 1:1 and a caliper of 0.1. Using this approach, we matched HW days to their nearest non-HW day neighbor (with regards to the propensity score). The number of days that were unmatched and thus excluded from the analysis varied between 1.22% in Jaipur to 5.91% in Mumbai.

Association between Heat Waves and Mortality
Utilizing these balanced heat wave and non-heat wave days, we estimated the causal effect of HWs on mortality using a Quasi-Poisson regression model, which allowed for overdispersion of mortality count data. In this model, a dummy variable for heat wave day (with non-heat wave day as the control group) served as the predictor variable. This model yielded the average ratio in observed mortality between HW days and non-HW days. We obtained a Risk Ratio (RR) for each HW definition. In the model, we considered lags 0-14 days as well as cumulative lags; results presented are for lag 0.
We repeated the same analysis separately for each community and included the population by year as an offset in the model. We also conducted sensitivity analysis in regards to the propensity score estimation to assess whether covariates in the model should be specified as a quadratic or linear function using likelihood ratio tests. Finally, we kept covariates modeled as a linear function in our PSM models.

Timing in Season
Different areas of India experience different seasons depending on latitude as well as the presence of northeastern and southwestern monsoons. For this reason, a standard "summer" has not been established for the continent. We analyzed the impact of timing in season on the health effects of all heat wave definitions for each community. Using the distribution of HWs across the year, we categorized these heat waves based on whether they were the first through fifth (or higher order) heat wave of the season.

Attributable Deaths versus Relative Risk
For each heat wave definition and community, attributable deaths were calculated by estimating the attributable risk percent (the percent of mortalities attributable to heat waves in relation to all mortalities) from the relative risk estimates, and multiplying by the daily average expected mortalities for the community and the number of heat wave days to obtain the total number of attributable number of deaths per heat wave definition [28,29]. This is similar to some previous papers which have sought to characterize the aggregate mortalities that are attributable to heat waves to better understand current health impacts of heat waves as well as potential future impacts of extreme heat under alternative climate scenarios [30,31].
where N c HW /year = Number of heat wave days per year, varies by HW definition HW and community c; D c e = average number of expected mortalities for community c ((annual mortality rate * community population)/365.25 days); RR c HW = community-specific heat wave relative risk. The estimated attributable risk includes the amplitude of the effect of the HW (i.e., through the RR c HW ) and its prevalence (number of days on which a HW occurred). Table 2 shows the descriptive statistics for the data that were used to assess the relationship between heat waves (by HW definition) and health outcomes, for all communities in our data set. We further illustrate the underlying differences in the populations of each of the communities as well as the different climatic regions that are represented in the analysis. We standardized all analyses by population, number of years of data available, and generated unique heat wave thresholds by community to ensure that our analyses were locally relevant.  Table 3 provides a further detailed breakdown of characteristics of different heat waves, by definition, for each community. We provide information on the average daily maximum temperature by heat wave definition, the number of days that were declared as heat waves under each definition, and the timing of the earliest heat wave over the entire record for each heat wave definition. Table 3 illustrates that, although heat wave definitions become more stringent across temperature and duration and, as a result, fewer days per year are declared as heat waves, the timing in season of the earliest heat wave does not change by more than a month across the communities. Figure 1 shows the historical attributable deaths per year and relative risk of mortality associated with different heat wave definition criteria. Relative risk of mortality ranged from 1. 28  There is a wide spread in both the relative risk of heat waves and the attributable deaths associated with heat waves, indicating that both are largely dependent on the criteria used for declaring a heat wave. Further, Figure 1 shows the importance of going beyond the use of RR and examining the possibility of using health events to represent the impact of heat waves; the number of heat wave attributable deaths is an important driver of the variability of the HW-related burden in addition to relative risk. Some heat wave definitions were associated with a high relative risk; however, the absolute health burden is diminished because few days on record match those criteria.

Results
To further investigate the relationship between aggregate mortalities and the prevalence of heat wave days, we applied a population weighting to the definition-specific attributable deaths estimates presented in Figure 1. Figure 2 shows the relationship between the number of days that fall into a particular heat wave definition and the annual population-weighted mortalities attributable to those heat wave days according to community. This shows that there are some heat wave definitions that are less stringent and occur often during the heat wave period; however, those definitions do not consistently correspond with the greatest number of attributable deaths annually. This may be an important "tradeoff" tool that can inform policy on heat wave alerts to maximize the public health impact while minimizing the resources expended during heat wave alerts; this may avoid saturating the public with heat wave warnings, which may lower responses to such alerts.

23] in Idar and
Himmatnagar under the 95%_4d definition. There is a wide spread in both the relative risk of heat waves and the attributable deaths associated with heat waves, indicating that both are largely dependent on the criteria used for declaring a heat wave. Further, Figure 1 shows the importance of going beyond the use of RR and examining the possibility of using health events to represent the impact of heat waves; the number of heat wave attributable deaths is an important driver of the variability of the HW-related burden in addition to relative risk. Some heat wave definitions were associated with a high relative risk; however, the absolute health burden is diminished because few days on record match those criteria. Figure 1. Annual average heat wave attributable deaths* and relative risk of mortality during a heat wave**, for all communities, by heat wave definitions (Where distinct heat wave periods occur). * Over whole study period for that community; see Table 1. ** A comparison of HW days to matched non-HW days. Note-Scale for attributable deaths is different by community.
To further investigate the relationship between aggregate mortalities and the prevalence of heat wave days, we applied a population weighting to the definition-specific attributable deaths estimates presented in Figure 1. Figure 2 shows the relationship between the number of days that fall into a particular heat wave definition and the annual population-weighted mortalities attributable to those heat wave days according to community. This shows that there are some heat wave definitions that Figure 1. Annual average heat wave attributable deaths* and relative risk of mortality during a heat wave**, for all communities, by heat wave definitions (Where distinct heat wave periods occur). * Over whole study period for that community; see Table 1. ** A comparison of HW days to matched non-HW days. Note-Scale for attributable deaths is different by community.
impact while minimizing the resources expended during heat wave alerts; this may avoid saturating the public with heat wave warnings, which may lower responses to such alerts.  Figure 3 shows the within-season trend in the estimated relationship between the timing of heat waves and health. We assessed heat waves by order within the hot season: first, second, third, fourth, or greater than or equal to fifth. An important note is that this dataset occurs in a region where we do have several years of data with at least five or more heat wave periods, which is not common in the existing literature. Within this dataset, heat waves that occur later in the hot season tend to be more impactful on health. This holds true across different heat wave definitions, as well as in different communities.   Figure 3 shows the within-season trend in the estimated relationship between the timing of heat waves and health. We assessed heat waves by order within the hot season: first, second, third, fourth, or greater than or equal to fifth. An important note is that this dataset occurs in a region where we do have several years of data with at least five or more heat wave periods, which is not common in the existing literature. Within this dataset, heat waves that occur later in the hot season tend to be more impactful on health. This holds true across different heat wave definitions, as well as in different communities.

Discussion
Because there is no standard definition of what constitutes a heat wave, implementation of heat wave early warning systems to minimize predicted health effects caused by heat waves can rely on data that examine the impact of a range of heat wave criteria (including potential effect modifiers) on past health effect estimates. This can help to generate locally relevant heat wave alert criteria that are tailored to maximize the reduction of health impacts in a particular region. For example, people may be less likely to respond to heat wave alerts depending on the frequency of those alerts; by contrast, the actual risk of an average HW may be lower if HWs are declared more frequently (i.e., if policy makers use a less stringent definition of a heat wave). Even if the same HW definition were to be used in the design of an alert program across multiple cities (e.g., a percentile of the city's temperature distribution, which would be converted to a specific temperature value for implementation), health responses would differ depending on population characteristics. Further, some metrics or definitions of HWs may be more accurately predicted than others for HW warning systems. From a policy perspective, the actual choice of HW definition is a balance that involves tradeoffs of minimizing the health risks of heat waves and maximizing the efficient use of resources. This balance must come with an understanding that a lesser number of declared heat wave days will require less implementation of HAPs.
The relationship between heat waves and health outcomes in this region is comparable to previous studies in other regions. Our results indicate that, even in areas with high baseline temperatures, adaptation to those high baseline temperatures is limited and the burden associated with extreme heat can be substantial. Future research focused on assessing the relationship between

Discussion
Because there is no standard definition of what constitutes a heat wave, implementation of heat wave early warning systems to minimize predicted health effects caused by heat waves can rely on data that examine the impact of a range of heat wave criteria (including potential effect modifiers) on past health effect estimates. This can help to generate locally relevant heat wave alert criteria that are tailored to maximize the reduction of health impacts in a particular region. For example, people may be less likely to respond to heat wave alerts depending on the frequency of those alerts; by contrast, the actual risk of an average HW may be lower if HWs are declared more frequently (i.e., if policy makers use a less stringent definition of a heat wave). Even if the same HW definition were to be used in the design of an alert program across multiple cities (e.g., a percentile of the city's temperature distribution, which would be converted to a specific temperature value for implementation), health responses would differ depending on population characteristics. Further, some metrics or definitions of HWs may be more accurately predicted than others for HW warning systems. From a policy perspective, the actual choice of HW definition is a balance that involves tradeoffs of minimizing the health risks of heat waves and maximizing the efficient use of resources. This balance must come with an understanding that a lesser number of declared heat wave days will require less implementation of HAPs.
The relationship between heat waves and health outcomes in this region is comparable to previous studies in other regions. Our results indicate that, even in areas with high baseline temperatures, adaptation to those high baseline temperatures is limited and the burden associated with extreme heat can be substantial. Future research focused on assessing the relationship between heat waves and health outcomes may identify a threshold temperature beyond which health effects of heat waves are consistent across geographic areas.
Utilizing data on heat wave order, we found that heat waves that occur later in the season have a higher impact on health than those that occur earlier in the season. Previous studies have indicated that the first heat wave in the season tends to have the greatest health impacts [6,16], for several reasons: (i) The most vulnerable populations succumb early in the hot season; (ii) As the season progresses, the population implements adaptive behaviors (such as air conditioning usage) that offset the potential risk from heat waves. One study found that heat waves in the second half of the hot season had a greater health impact than heat waves in the first half of the hot season [21]; we provide evidence of higher heat wave effects in later order heat waves across multiple cities within a region that commonly experiences five or more heat wave periods during a single year. There are a few potential explanations for the later-in-season, high-impact heat waves that are specific to India: (i) Heat waves that occur after the monsoon season are likely to be more humid heat waves. It is clear and biologically plausibility that high-temperature, high-humidity periods are more deleterious to health than high-temperature, low-humidity periods; and (ii) There may be some other co-varying factor in areas of India that do not have mid-season monsoons, which might lead to more dangerous heat wave periods later in the season. It would be interesting to test such hypotheses in future studies.
Our study is not without limitations. In our analysis, we used daily maximum temperature; however, the analysis could have used other meteorological characteristics, such as diurnal variation or daily minimum temperature. Future work could examine the relationships between other health-relevant meteorological characteristics and mortalities in the region, such as minimum and diurnal temperatures. We also utilized relative temperature thresholds for heat wave criteria, as opposed to absolute temperature metrics, to allow for acclimatization to high temperatures in these communities as well as for comparison across communities. Future studies could assess the relationship between absolute temperature metrics and heat wave-health outcomes to understand the level of acclimatization within the region.
In terms of health data, we utilized data on aggregate, daily, all-cause mortality. For several of the cities in our study area (Himmatnagar, Jaipur), we do not have individual-level information and cannot assess important effect measure modifiers of the heat-mortality relationship [32,33], including demographic factors like age, gender, ethnicity, socioeconomic status (SES), behavior factors like smoking status and alcohol consumption, or cause of death such as chronic disease, cardiovascular disease, respiratory disease, among others. Future studies may utilize similar methods as presented above to assess these individual-level health characteristics as well as to understand sensitive subpopulations within these communities.
A further limitation is that we estimated the IMD Heat Wave definition using publicly available data, which does not include the detailed datasets used by the IMD when issuing heat wave alerts. Although we believe that our estimation closely matches the criteria the IMD states in their handbook for issuing heat wave alerts, validation is difficult without access to IMD's internal procedures. One major limitation to research in this region is the availability of data for use in public health studies. The quality of these data may be affected by different standards in reporting for registry data across different communities.
Finally, we proposed a novel approach for time-series epidemiological studies that rely on the use of PSM. Advantages of such an approach include that it allows us to first check exchangeability between exposed (HW) and unexposed (non-HW) days before calculating the difference in a given health outcome using only one-dimensional parameter (the estimated PS). However, such an approach may still be prone to unmeasured confounding that can only be controlled to the extent that those unmeasured confounders are correlated with measured confounders, but such correlation is, by definition, impossible to check.

Conclusions
Our study demonstrates the feasibility of utilizing locally relevant data from registries for public health-relevant work in India, an understudied region that currently experiences extreme temperatures. We further provide guidance for potential heat wave alert systems, which, by definition, must use a specific heat metric. As shown here, there are tradeoffs in the choice of an appropriate metric, including the resources expended. Longer periods of time are labeled as heat waves versus the numbers of mortalities which are attributable to these different types of heat waves, which could potentially be averted by implementation of an alert system.
India has begun the work of implementing heat wave alert systems to reduce the health impacts of heat waves [34]. Understanding the risk of heat waves to continue implementing heat action plans under a changing climate scenario may rely on critical examination of those high-risk, rare heat wave days that are highlighted in our results. One study conducted in Pakistan showed that knowledge and perception play a significant role in adaptation of populations to the health threat presented by heat waves; it also suggests that education on the health effects of heat waves will be a key component of successful implementation of health-protective measures during heat waves [35]. Climate change is expected to lead to more frequent heat waves of higher intensity and duration, which may lead to increases in the attributable health burden during those heat wave periods [11,15]. Health effect estimates from places such as India where high-intensity heat waves already occur may serve to inform the literature on health effects in other parts of the world that are anticipated to experience similar increases in temperature due to climate change.
Future studies should aim to further develop our understanding of the local causal relationship between heat wave characteristics and health effects. Our study provides a highly useful framework for the value of registry data to build an understanding of the local heat wave-health relationships. Future studies could expand to other similar cities in India or other developing country settings to perform analogous research in resource-limited settings.