An Empirical Analysis of Association between Socioeconomic Factors and Communities’ Exposure to Natural Hazards

In today’s urban environments with complex design and configurations, heterogeneous spatial clusters of communities with different socioeconomic characteristics may result in disproportionate exposure of some groups of citizens to natural hazards. The objective of this study was to compare the associations between communities’ socioeconomic characteristics and exposure to different types of natural hazards in New York City (NYC) to examine whether commonly accepted indicators of social vulnerability are associated with similar levels of exposure across various natural hazards. First, we collected socioeconomic data (e.g., population, median income, unemployment rate) at a zip code level of granularity provided by the United States Census Bureau. Next, we identified and gathered spatial data for coastal storms, flooding, extreme heat, and pandemic disease in NYC. We then conducted a pairwise Kendall’s τ-b test to compare the associations. The outcomes showed that the significance and direction of the associations depend on the type of natural hazard. Particularly, the results indicated that zip codes with lower socioeconomic factors and greater percentage of minority ethnicities are exposed disproportionately to extreme heat and COVID-19. On the other hand, zip codes with higher percentage of areas prone to flooding have relatively higher socioeconomic factors. Furthermore, the results did not show any statistically significant association between socioeconomic factors and exposure to coastal storm inundations. The outcomes of this study will help decision makers design and implement better optimized and effective emergency preparedness plans by prioritizing their target areas based on socioeconomic factors in order to enhance social justice.


Introduction
Natural hazards are getting worse in intensity and frequency as a result of climate change [1]. A natural hazard may become a disaster when it interacts with a dense, populated, and unprepared urban environment. Previous studies including Chan [2], Wisner et al. [3], and Rodriguez-Oreggia et al. [4] have argued that socioeconomic characteristics of urban communities might affect their residents' vulnerability to natural hazards. In typical urban areas and major cities, socioeconomic characteristics are spatially structured. That structuring creates heterogeneous geographical clusters of communities consisting of households with relatively similar socioeconomic attributes. Heterogeneous spatial clusters of communities with different socioeconomic characteristics may result in racial and economic divisions and consequently inequality in an urban environment. Exposure to perturbations and external stresses is a main component of a vulnerability analysis [5,6]. If the socioeconomic characteristics of households and communities are associated with their exposure to natural hazards, some communities and their people may suffer from a natural hazard disproportionately. Therefore, to create a more resilient and fair urban society, it is critical to consider spatial socioeconomic characteristics when designing city development strategies and community hazard preparedness plans.
Many experts and previous studies consider the association between socioeconomic factors and exposure to natural hazards an intuitive and obvious fact. For example, reports published by the World Bank [7] and the U.S. Department of Health and Human Services (HHS) [8] noted that people in poverty are more likely to live in high-risk areas due to less expensive properties. Although these types of general statements might be valid in some cases, they contradict many other observations. For example, in almost all major coastal cities in the U.S., living in shoreline areas is actually more expensive and attractive for people with higher income, despite the exposure of those areas to floods and coastal storms [9]. These commonly accepted associations need to be analyzed from different perspectives and for different natural hazards to be a reliable basis for designing preparedness plans and developing resilient urban communities.
Previous studies that aimed to investigate the impacts of socioeconomic factors on vulnerability to natural hazards can be categorized into two groups. The first group aims to construct social vulnerability indices to describe how the intersection of socioeconomic indicators contributes to exposure and sensitivity to natural hazards. This first group of studies combines indicators to describe the multiple dimensions of vulnerability to all hazards. For example, de Loyola Hummell et al. [10] used principal component analysis (PCA) to propose a social vulnerability index calibrated for Brazil based on 45 city-level indicators such as median age, percentage of females, and percentage of households with no electricity service. Their outcomes confirmed disparities among Brazilian regions with regards to vulnerability to natural hazards. Roncancio and Nardocci [11] used the Brazilian social vulnerability index to identify vulnerable areas and populations in the city of Sao Paulo. Ge et al. [12] defined social vulnerability index and natural hazard exposure index in Chinese coastal cities and observed similar spatial patterns. Di Girasole et al. [13] used a GIS to create a social vulnerability map for the Dominican Republic. Ge et al. [14] proposed a conceptual framework for social vulnerability assessment based on connectivity using network theory. Ogie and Pradhan [15] used sociopsychological theories of how people act in natural hazards to define an index to assess the social vulnerability of different suburbs in the Wollongong area of New South Wales, Australia, to natural hazards. Aksha et al. [16] adopted the most widely used measure of social vulnerability, the social vulnerability index (SoVI), to develop a social vulnerability index tailored for Nepal. Frigerio et al. [17] defined a social vulnerability index for natural hazards in Italy and monitored its changes from 1991 to 2011 using a bivariate spatial correlation analysis. 
This group of studies proposes vulnerability indices that represent the associations between socioeconomic factors and exposure to all natural hazards. However, the level and direction of the associations between socioeconomic factors and natural hazards may also depend on the type of hazard. In other words, different types of natural hazards may affect various groups of people differently. This concern was also observed in the study by Spielman et al. [18]. They evaluated the applicability of the SoVI and concluded that this index has some critical shortcomings regarding theoretical consistency and might be misleading in many cases.
The second group of previous studies aims to deconstruct social vulnerability to examine associations between individual socioeconomic factors and exposure to particular hazards. For example, Masozera et al. [19] investigated whether specific socioeconomic groups of people in New Orleans were more exposed to flooding from Hurricane Katrina. They assessed the correlation between flood levels and household incomes. Their results indicated that low-income households did not experience more flooding than higher-income households. Kirby et al. [20] aimed to identify indicators of social vulnerability and assessed their associations with risk of flood hazard in 147 districts of the Dutch province of Zeeland to help policymakers plan for collective actions to protect people against natural hazards. Henry et al. [21] conducted a survey to collect data about the impact of income disparity on vulnerability during the 2011 Thai flood. The results of their regression analysis indicated that lower-income respondents were more likely to suffer from flooding than higher-income respondents. Ge et al. [22] conducted a case study to map social vulnerability against air pollution in the Yangtze River Delta region in China and observed an association between them. Silva and Kawasaki [23] surveyed 517 household residents in the North Central province of Sri Lanka to collect demographic data and households' experience of floods and droughts. They used multiple linear regression analysis to investigate the relationship between poverty and exposure to hazard risks. They observed that people with lower income faced more significant losses during floods and droughts. They mentioned that one of the main reasons that poor people in Sri Lanka suffer more from flood and drought is the fact that they mostly have occupations related to agriculture. Alizadeh et al. [24] used an artificial neural network (ANN) to develop a model that explains earthquake seismic risks based on social vulnerability indices in Tabriz, Iran. Török [25] aimed to identify the demographic group most vulnerable to flooding in Romania using a geographically weighted regression. Jeong and Yoon [26] examined associations between socioeconomic factors and economic losses due to typhoons, heavy rains, and heavy snows in South Korea using spatial regression analysis. Ilbeigi [27] analyzed the association between socioeconomic factors and physical damage to housing units during Hurricane Harvey in Texas and Hurricane Irma in Florida, both in 2017. The outcomes did not indicate any statistically significant association between the socioeconomic factors and housing damage. This group of previous studies focuses on a particular natural hazard and does not offer a more comprehensive investigation that compares the level of associations between socioeconomic factors and different types of natural hazards.
In this study, we empirically analyzed and compared the associations between socioeconomic factors and exposure to four major natural hazards in New York City (NYC), namely coastal storms, flooding, extreme heat, and pandemic disease. The outcomes of our analysis help examine the null hypothesis that commonly accepted indicators of social vulnerability are associated with similar levels of exposure across various natural hazards. If the outcomes of this study reject this hypothesis, we can conclude that the level of association between socioeconomic factors and exposure to natural hazards depends on the type of hazard. First, we gathered socioeconomic data including population, income, poverty, unemployment rate, age, education, and race collected by the United States Census Bureau at a zip code level. Next, we identified NYC's spatial natural hazards data collected by various federal and local agencies. We then conducted a Kendall's τ-b (i.e., the Kendall rank correlation) analysis for each socioeconomic factor and exposure to each natural hazard to statistically examine the associations. The outcomes of this study will help city planners and emergency management agencies better understand the relationships between various socioeconomic factors and exposure of communities to different types of natural hazards. This study sets the stage for future systematic approaches to incorporate communities' socioeconomic factors in developing city planning strategies for building more resilient, sustainable, and fair urban environments.
The remainder of this paper is structured as follows: First, we briefly review the four types of natural hazards that we considered in this study and introduce the data sets used to perform the analysis. We then describe the research methodology and discuss the results. Finally, we present the conclusions and discuss future works.

Data: Natural Hazard and Socioeconomic Factors
With a population of more than 8.3 million, a significant number of complex infrastructure systems, unique natural resources, and a long list of notable corporations and cultural centers headquartered, NYC is one of the greatest cities in the world. Throughout its history, this city has faced many natural extreme events including the severe heat waves of 2006 and Hurricane Sandy in 2012. The NYC Emergency Management Department in partnership with the NYC Department of City Planning and the NYC Mayor's Office of Recovery and Resiliency developed "NYC's Risk Landscape: A Guide to Hazard Mitigation" [28]. This guidebook assesses a wide range of natural hazards and outlines key features of the city's vulnerability to these hazards. The guidebook indicates that some of the natural hazards that pose a risk to NYC including coastal storms, flooding, extreme heat, and pandemic disease are spatially heterogeneous, indicating that not all parts of the city may be exposed to the same level of risks due to those natural hazards.

Natural Hazards
In this section, we briefly review the four major spatially heterogeneous natural hazards in NYC and their corresponding publicly available data.

Coastal Storms
Coastal storms including hurricanes, tropical cyclones, and nor'easters are among the most dangerous natural hazards in NYC. Climate change and rising sea levels are making these hazards stronger and more frequent [29]. In 2011, the U.S. Army Corps of Engineers developed hurricane surge inundation maps using the National Hurricane Center's 2010 NY3 basin Sea, Lake, and Overland Surge from Hurricanes (SLOSH) Maximum of Maximum (MOM) Envelope of Water at high tide [30]. The SLOSH MOM analysis considered a Category 4 hurricane as the most severe possible scenario. Figure 1a shows the storm surge inundation zones based on the SLOSH MOM model for a Category 4 hurricane in NYC created using ArcGIS. The geographic information system (GIS) data of this map in shapefile format are publicly available and were used in this study [31].

Flooding
In addition to coastal flooding caused by storm surge reviewed in the previous section, NYC is prone to other types of flooding including tidal, inland, and riverine flooding. Replacing vegetation with impervious surfaces (e.g., paved roads) in urbanized watersheds is a main contributor to the likelihood and severity of flooding. Figure 1b shows NYC's 2020s 100-year flood plain developed by the Federal Emergency Management Agency (FEMA) [32]. The shaded areas in the map have a 1 percent chance of being flooded annually. The GIS shapefiles of NYC's flood plain are publicly available [33] and were used to generate Figure 1 using ArcGIS.

Extreme Heat
Each year, extreme heat causes more fatalities than any other extreme weather event in the U.S. [28]. NYC's dense urban environment that absorbs and traps the heat makes it particularly susceptible to this hazard. During the heat waves, the city's massive and dense built environment creates urban heat island effects. Figure 2a shows NYC's surface temperature map created using ArcGIS based on a Landsat image taken on August 24, 2017 [34].

Pandemic Disease
A pandemic is an epidemic of an infectious disease that has spread across a large region and sometimes worldwide. The first case of the novel coronavirus disease 2019 (COVID-19) was reported in Wuhan, China, in December 2019. It has since spread globally, becoming a pandemic. At the time of writing this paper (May 27, 2020), more than 5.68 million cases have been reported across 188 countries and territories, resulting in more than 354,000 deaths. With more than 199,000 confirmed cases, NYC has been the epicenter of this disease in the United States. The city's high population density, crowded mass transit systems, and role as an international travel hub magnified the spread of this pandemic. Since March 30, 2020, the NYC Department of Health and Mental Hygiene (DOHMH) has been publishing COVID-19 data at a zip code level of granularity [35]. Figure 2b shows the spatial distribution of confirmed COVID-19 cases in NYC by zip code by March 30, 2020.

Communities' Socioeconomic Characteristics
The United States Census Bureau collects and provides national socioeconomic data in different formats and levels of granularity. In this study, we used population, median income, family poverty percentage, unemployment rate, median age, percentage of people with higher education, and percentage of minority ethnicities data at a zip code level of granularity [36]. Table 1 shows the correlations between each pair of socioeconomic factors. The table indicates that there is a relatively high correlation between median income and per capita income in each zip code. Further, family poverty percentage has a considerable negative correlation with median income and per capita income, as expected. Another noticeably correlated pair of variables is the percentage of people with higher education and per capita income, which have a positive association. This correlation analysis helped validate consistency in the final outcomes of this study, which are presented in the following sections.

Research Methodology
To empirically compare the associations between the presented socioeconomic factors and exposure to natural hazards reviewed in the previous section, we used ArcGIS to superimpose GIS files of NYC zip codes with the spatial data of natural hazards to extract quantified information for the natural hazards at a zip code level of granularity. Particularly, for coastal storm inundation and flooding, we calculated the percentage of vulnerable areas in each zip code. For extreme heat, we quantified shades of colors in the satellite image (Figure 2a) and calculated a weighted measure that is indicative of relative temperature for each zip code. For COVID-19 confirmed cases, the original data were already presented in zip codes.
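The quantification step described above can be illustrated with a minimal sketch. It assumes the overlay has already produced, for each zip code, the total area, the area overlapping a hazard zone, and pixel counts per color shade. The function names, the shade-rank scale (1 = coolest), and the sample values are hypothetical; the source does not specify the weighting scheme used for the heat measure, so a pixel-count-weighted mean of shade ranks is used here only as one plausible reading.

```python
def percent_area_exposed(zone_area, hazard_overlap_area):
    """Percentage of a zip code's area that falls inside a hazard zone.

    Both arguments must be in the same units (e.g., square kilometers).
    """
    if zone_area <= 0:
        raise ValueError("zone_area must be positive")
    if not 0 <= hazard_overlap_area <= zone_area:
        raise ValueError("overlap must lie between 0 and zone_area")
    return 100.0 * hazard_overlap_area / zone_area


def weighted_heat_index(pixel_counts_by_shade):
    """Pixel-count-weighted mean of shade ranks for one zip code.

    pixel_counts_by_shade maps a shade rank (assumed: 1 = coolest shade)
    to the number of image pixels of that shade inside the zip code.
    This weighting is an illustrative assumption, not the paper's formula.
    """
    total_pixels = sum(pixel_counts_by_shade.values())
    if total_pixels == 0:
        raise ValueError("zip code contains no pixels")
    weighted_sum = sum(rank * count for rank, count in pixel_counts_by_shade.items())
    return weighted_sum / total_pixels


# Hypothetical zip code: 2.0 km^2 in total, 0.5 km^2 inside the flood plain.
flood_share = percent_area_exposed(2.0, 0.5)

# Hypothetical shade counts: mostly warm pixels, so the index sits near the top rank.
heat_index = weighted_heat_index({1: 10, 2: 30, 3: 60})
```

Because the later analysis is rank-based, any monotone rescaling of such a heat index would leave the resulting associations unchanged.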
Next, we used Kendall's τ-b test, also known as the Kendall rank correlation, to statistically examine and compare the hypothetical associations. Kendall's τ-b correlation coefficient is a nonparametric statistic that measures the strength of association between two variables based on rank correlation by calculating concordance and discordance. Concordance and discordance measure the agreement between pairs of values from each of the two variables. Kendall's τ-b test does not carry any assumptions about the distribution of the data and is a reliable statistical tool when required assumptions in Pearson correlation and linear regression analysis including linearity, normality, and homoscedasticity are not valid. These assumptions are not valid for many of the socioeconomic and natural hazards data. For example, see the histogram of median income (Figure 4a).
To conduct Kendall's τ-b test, we first needed to calculate its statistic, τ, which is defined as follows:
$$\tau = \frac{C - D}{n(n-1)/2}$$
where C is the total number of concordant pairs, D is the total number of discordant pairs, and n is the number of observations. To check if the calculated τ is statistically significant, we calculated the z-score as follows:
$$z = \frac{3(C - D)}{\sqrt{n(n-1)(2n+5)/2}}$$
Finally, we calculated the p-value using the z-score and compared that with the desired significance level. More detailed information about Kendall's τ-b test and its applications is available in several statistics books including Meyers et al. [37].
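The test procedure can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes no tied values (so the tie corrections that distinguish τ-b are omitted and the denominator reduces to the total number of pairs), it uses the standard normal approximation for the z-score, and the function name and sample data are hypothetical.

```python
import math
from itertools import combinations


def kendall_tau_test(x, y):
    """Return (tau, z, p_value) for paired observations, assuming no ties.

    tau = (C - D) / (n(n-1)/2)
    z   = 3(C - D) / sqrt(n(n-1)(2n+5)/2)   (normal approximation)
    p   = two-sided p-value from the standard normal distribution
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("at least two observations are required")

    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move the same way
        elif direction < 0:
            discordant += 1      # the variables move in opposite ways

    tau = (concordant - discordant) / (n * (n - 1) / 2)
    z = 3 * (concordant - discordant) / math.sqrt(n * (n - 1) * (2 * n + 5) / 2)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return tau, z, p_value


# Hypothetical data: one socioeconomic factor and one exposure measure.
median_income = [41, 55, 62, 78, 90, 103]
heat_index = [2.9, 2.7, 2.8, 2.2, 2.0, 1.6]
tau, z, p_value = kendall_tau_test(median_income, heat_index)
is_significant = p_value < 0.05
```

For real data with ties, a library routine such as `scipy.stats.kendalltau`, which computes the τ-b variant by default, would be the more appropriate choice.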

Results
After analyzing correlations between socioeconomic factors, we conducted pairwise Kendall's τ-b tests between socioeconomic factors and exposure to natural hazards. Table 2 shows the calculated τ statistics and their corresponding p-values. The p-values are in parentheses. The associations that are statistically significant with a 5% significance level are bolded.
The outcomes for extreme heat hazard indicate that there are statistically significant associations between the surface temperature and median age, median income, per capita income, family poverty percentage, unemployment rate, percentage of people with higher education, and percentage of minority ethnicities. The signs of the calculated τ statistics indicate that the greater the median age, median income, per capita income, and percentage of educated people, the lower the surface temperature. On the other hand, the greater the family poverty percentage, unemployment rate, and percentage of minority ethnicities, the higher the surface temperature. In other words, the outcomes provide statistical evidence that zip codes with a higher percentage of minority ethnicities, greater percentage of younger people, lower income, lower education, higher poverty, and higher unemployment rate are disproportionately exposed to extreme heat.
The major contributor to the urban heat island effect is impermeable surfaces covered by asphalt pavements, buildings, and other structures. Natural covers including trees and other vegetation can partially offset the impacts of extreme heat and keep the surrounding environment cooler. Rosenzweig et al. [38] extensively analyzed the impacts of urban forestry on heat island effects in NYC and confirmed the relationship. To briefly review their findings, Figure 6a shows a satellite thermal image captured by the National Aeronautics and Space Administration (NASA). Figure 6b displays NYC's vegetation density map. These figures indicate that where vegetation is dense, surface temperatures are lower. Urban heat island effects are created where there is little or no vegetation. Therefore, we can conclude that NYC zip codes with lower socioeconomic factors need more green areas in order to provide more protection against extreme heat.
Sustainability 2020, 5, x FOR PEER REVIEW 9 of 13
The results of Kendall's τ-b test for COVID-19 show statistically significant associations between the number of confirmed cases and population, median age, median income, per capita income, family poverty percentage, unemployment rate, percentage of people with higher education, and percentage of minority ethnicities in each zip code. The outcomes indicate that the expected number of COVID-19 cases is greater in zip codes with greater population, lower median income, lower per capita income, higher family poverty percentage, higher unemployment rate, lower percentage of people with higher education, and higher percentage of minority ethnicities. This data-driven empirical evidence indicates that while the coronavirus may infect people regardless of their wealth, education, and ethnicity, people in communities with lower socioeconomic status suffer disproportionately from the outbreak due to the longstanding segregation by socioeconomic factors that may be linked to their underlying medical conditions, types of job, access to resources, and transportation modes.
While zip codes with lower socioeconomic status and higher percentage of minority ethnicities have a greater exposure to extreme heat and pandemic disease in NYC, flooding has a greater coverage in areas with relatively higher socioeconomic factors and lower percentage of minority ethnicities. The results of Kendall's τ-b test for NYC's 2020s 100-year flood plain show statistically significant associations between the percentage of inundation zones due to flooding and six socioeconomic factors: the τ statistics are positive for median income, per capita income, and percentage of people with higher education, and negative for population, unemployment rate, and percentage of minority ethnicities. This indicates that zip codes with greater flooding inundation areas have higher median income, per capita income, and percentage of people with higher education, and lower population, unemployment rate, and percentage of minority ethnicities. As Figure 2 shows, flooding inundation zones are mostly located along the shoreline, where property values are higher. That might be the reason that zip codes prone to flooding have relatively higher economic factors. This observation contradicts the statements by the World Bank [7] and the HHS [8] that poor people live in flood-prone areas.
In contrast to flooding inundations, the associations between percentage of inundation zones due to coastal storms and most of the socioeconomic factors are not statistically significant. The only social factor that is statistically linked to this natural hazard is population, with a negative τ indicating that zip codes with a higher percentage of coastal storm inundation have a relatively lower population. We should also note that although this association is statistically significant at the 5% significance level, it is weak (i.e., τ = −0.12) and may not reflect a considerable relationship between these two variables. A possible reason for this weak association is that densely populated areas, including lower Manhattan, have coastal storm exposure relatively similar to that of much less densely populated areas, including those in Staten Island or the Rockaways.
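To make the distinction between statistical significance and strength of association concrete, the following is a minimal Python sketch of how a τ-b statistic and an approximate two-sided p-value can be computed for a single hazard-factor pair. It is an illustration under stated assumptions, not the code used in this study: the function name `kendall_tau_b` and the sample values are hypothetical, and the p-value relies on the tie-corrected normal approximation, which is only reasonable for moderately large samples.

```python
import math
from collections import Counter
from itertools import combinations


def kendall_tau_b(x, y):
    """Return (tau_b, two-sided p-value from a tie-corrected normal approximation)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length, at least 3")

    # Count concordant and discordant pairs; pairs tied in x or y count as neither.
    concordant = discordant = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        s = (xi - xj) * (yi - yj)
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1

    # Sizes of the groups of tied values in each variable.
    t = [c for c in Counter(x).values() if c > 1]
    u = [c for c in Counter(y).values() if c > 1]

    # tau-b rescales (C - D) by the number of pairs not tied in x and in y.
    n0 = n * (n - 1) / 2
    n1 = sum(c * (c - 1) / 2 for c in t)
    n2 = sum(c * (c - 1) / 2 for c in u)
    denom = math.sqrt((n0 - n1) * (n0 - n2))
    if denom == 0:
        raise ValueError("tau-b is undefined when a variable is constant")
    tau_b = (concordant - discordant) / denom

    # Variance of (C - D) under the null hypothesis of no association, with ties.
    v0 = n * (n - 1) * (2 * n + 5)
    vt = sum(c * (c - 1) * (2 * c + 5) for c in t)
    vu = sum(c * (c - 1) * (2 * c + 5) for c in u)
    v1 = (sum(c * (c - 1) for c in t) * sum(c * (c - 1) for c in u)
          / (2 * n * (n - 1)))
    v2 = (sum(c * (c - 1) * (c - 2) for c in t)
          * sum(c * (c - 1) * (c - 2) for c in u)
          / (9 * n * (n - 1) * (n - 2)))
    variance = (v0 - vt - vu) / 18 + v1 + v2

    z = (concordant - discordant) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return tau_b, p_value


if __name__ == "__main__":
    # Hypothetical values for eight zip codes (NOT data from this study).
    population = [12000, 45000, 30000, 80000, 52000, 23000, 67000, 41000]
    pct_inundated = [35.0, 10.0, 22.0, 4.0, 15.0, 28.0, 0.0, 15.0]
    tau, p = kendall_tau_b(population, pct_inundated)
    print(f"tau-b = {tau:.3f}, p = {p:.4f}")
```

Because the variance of the statistic under the null hypothesis shrinks as the number of zip codes grows, even a weak τ can pass a 5% significance test in a large sample, which is why the magnitude of τ should be read alongside the p-value. In practice, a library routine such as `scipy.stats.kendalltau`, which computes τ-b by default, would serve the same purpose.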
In summary, the results of our empirical analysis confirm that the direction and significance of the associations between socioeconomic factors and exposure to natural hazards depend on the type of hazard. Particularly, in NYC, while minorities and people with lower socioeconomic status have a disproportionately greater exposure to extreme heat and pandemic disease, flooding is a concern in areas with relatively higher socioeconomic factors, and there is no evidence of a considerable association between exposure to coastal storm inundation and socioeconomic factors.

Conclusions
Heterogeneous spatial clusters of communities with different socioeconomic characteristics may lead to social and economic divisions that may result in disproportionate exposure to natural hazards for different groups of citizens. In this study, we empirically examined associations between exposure to four types of natural hazards in NYC (i.e., coastal storm, flooding, extreme heat, and pandemic disease) and a wide range of socioeconomic factors such as population, median income, family poverty percentage, unemployment rate, median age, percentage of people with higher education, and percentage of minority ethnicities. First, we collected socioeconomic data at a zip code level of granularity from the U.S. Census Bureau. Next, we identified and gathered spatial data for natural hazards in NYC collected by various federal and local agencies including the U.S. Army Corps of Engineers, FEMA, USGS, and the NYC DOHMH. We then conducted a Kendall's τ-b test to compare the associations between exposure to natural hazards and the socioeconomic factors in each zip code.
The results indicate that residents of zip codes with lower socioeconomic factors are disproportionately exposed to extreme heat and pandemic disease. However, for flooding hazard, where the vulnerable areas are mostly located along the shoreline, the zip codes with a greater percentage of inundation area have higher socioeconomic factors. For inundation zones due to coastal storms, the results did not show any statistically significant or considerable association with the socioeconomic factors.
The outcomes of this study demonstrate that the local context (i.e., NYC in this case) largely shapes the extent to which the general principle that people with lower socioeconomic status are more exposed to all natural hazards holds. They suggest that scholars and decision makers should carefully consider local context before making broad-brush assumptions about the level of exposure. Furthermore, the pattern of exposure can vary greatly depending on the type of natural hazard in question. These differences in exposure patterns across hazards are worth considering when decision makers aim to design and implement optimized and effective emergency preparedness plans. Particularly, this study can contribute to the ongoing initiatives in NYC that aim to mitigate the effects of natural hazards, including the guidelines for urban forest restoration by the NYC Department of Parks and Recreation [39] and the comprehensive waterfront plan [40] and zoning for coastal flood resiliency [41] by the NYC Department of City Planning. It does so by providing inputs that help these agencies systematically prioritize their target areas based on socioeconomic factors in order to enhance social justice.
This study focused on NYC and its natural hazards. Further investigations in other urban and rural areas with different socioeconomic structures, city planning settings, and types of natural hazards are potential directions for future studies to examine the transferability of the identified exposure patterns in this study and reveal exposure patterns in other regional contexts. Analyzing the links between level of exposure to natural hazards for different groups of people and their sensitivity to hazards based on different aspects of vulnerability including physical, economic, social, and emotional can also be topics for future studies. Finally, considering that this study aimed to set the stage for systematic community hazard preparedness plans that take into account social and economic inequalities, we used socioeconomic data at a zip code level. Further analyses at household or individual granularity are a potential basis for future research in this domain.