A Geographical Analysis of Socioeconomic and Environmental Drivers of Physical Inactivity in Post Pandemic Cities: The Case Study of Chicago, IL, USA

The pandemic’s lockdown has made physical inactivity unavoidable, forcing many people to work from home and increasing the sedentary nature of their lifestyle. The link between spatial and socio-environmental dynamics and people’s levels of physical activity is critical for promoting healthy lifestyles and improving population health. Most studies on physical activity or sedentary behaviors have focused on the built environment, with less attention to social and natural environments. We illustrate the spatial distribution of physical inactivity using the space scan statistic to supplement choropleth maps of physical inactivity prevalence in Chicago, IL, USA. In addition, we employ geographically weighted regression (GWR) to address spatial non-stationarity of physical inactivity prevalence in Chicago per census tract. Lastly, we compare GWR to the traditional ordinary least squares (OLS) model to assess the effect of spatial dependency in the data. The findings indicate that, while access to green space, bike lanes, and living in a diverse environment, as well as poverty, unsafety, and disability, are associated with a lack of interest in physical activities, limited language proficiency is not a predictor of an inactive lifestyle. Our findings suggest that physical activity is related to socioeconomic and environmental factors, which may help guide future physical activity behavior research and intervention decisions, particularly in identifying vulnerable areas and people.


Introduction
Responsible for nearly 5 million deaths worldwide and often associated with physical and psychological disorders, an inactive lifestyle is a critical public health challenge [1][2][3][4]. Insufficient physical activity caused $67.5 billion in health costs worldwide in 2013 and is a severe public health concern in the United States of America [5,6]. Regular physical activity helps prevent hypertension, overweight, and obesity and improves mental health, quality of life, and well-being. In addition to the multiple health benefits of physical activity, more active societies reap additional benefits, including reduced use of fossil fuels, cleaner air, and less congested, safer roads. These outcomes are interconnected with achieving the shared goals, political priorities, and ambition of the Sustainable Development Agenda 2030 [3].
Whether we engage in a physically active lifestyle or not is influenced by personal, interpersonal, sociocultural, and community factors [7][8][9][10] in mostly non-linear ways [11,12]. Macfarlane et al. (2021) [13] and Sallis et al. (2012) [14] described a social model of health as an outcome of socioeconomic status, culture, environmental conditions, housing, employment, and community influences. However, the current literature on physical activity has rarely included these factors, as most studies have focused on the built environment instead. This is a dreadful omission when identifying and guiding spatial-physical interventions and environmental factors associated with physical inactivity; (4) use publicly available data to suggest priority areas for interventions.

Materials and Methods
We obtained model-based estimates of current physical inactivity among the population for all 796 census tracts in Chicago ("Physical inactivity" variable). The physical inactivity data, among many other health-related measures, are provided by the PLACES Project of the Centers for Disease Control and Prevention [49] and stem from responses to the Behavioral Risk Factor Surveillance System survey [48]. Physical inactivity is defined as the proportion of respondents who indicated not engaging in leisure-time physical activity. Specifically, these are respondents aged ≥18 years who answered "no" to the following question: "During the past month, other than your regular job, did you participate in any physical activities or exercises such as running, calisthenics, golf, gardening, or walking for exercise?" In addition, we obtained census tract-level predictor variables from various sources. First, the share of people with a disability ("Disability" variable), in poverty ("Poverty" variable), without a high school diploma ("No high school" variable), and with a language barrier ("Limited English" variable), as well as age ("Age" variable), gender ("Gender" variable), minority status ("Minority" variable), and ethnic diversity ("Ethnic diversity" variable) for 2018, are provided by the American Community Survey [50]. Second, we quantified mixed land uses ("Mixed land use" variable), spatial access to bike lanes ("Bike ratio" variable), and spatial access to parks ("Parks ratio" variable) by calculating the density of diverse land uses, bike lanes, and parks using data provided by the 2015 Land Use Inventory of the Chicago Metropolitan Agency for Planning [51]. Third, the vacant housing percentage ("Vacant housing" variable), the traffic intensity percentage ("Traffic intensity" variable), and concentration of PM 2.5 ("PM 2.5" variable) for 2018 are provided by Chicago Health Atlas [52].
Lastly, we computed the census tract-level proportion of tree canopy area ("Tree ratio" variable) using the High-Resolution Land Cover, NE Illinois, and NW Indiana, 2010 dataset provided by CMAP. To conduct spatial analysis and mapping using geographic information systems (GIS), we obtained census tract polygon geometries as TIGER/Line Shapefiles from the United States Census Bureau. We joined all our census tract-level variables to the geometries through their 11-digit FIPS codes.
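The join on 11-digit FIPS codes described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the workflow used in the study. The field names (`GEOID`, `fips`, `physical_inactivity`) and the toy records are hypothetical placeholders. The sketch normalizes codes to zero-padded strings first, because codes read from spreadsheets are often stored as numbers and lose leading zeros.

```python
def normalize_fips(code):
    """Return an 11-digit tract code (2 state + 3 county + 6 tract digits)."""
    text = str(code).strip()
    if not text.isdigit() or len(text) > 11:
        raise ValueError(f"not a valid tract FIPS code: {code!r}")
    return text.zfill(11)


def join_by_fips(geometries, attributes):
    """Left-join attribute rows onto geometry rows via the normalized code.

    Returns the joined rows and the codes of geometries without a match.
    """
    lookup = {normalize_fips(row["fips"]): row for row in attributes}
    joined, unmatched = [], []
    for geom in geometries:
        key = normalize_fips(geom["GEOID"])
        match = lookup.get(key)
        if match is None:
            unmatched.append(key)
            continue
        merged = dict(geom)
        merged.update({k: v for k, v in match.items() if k != "fips"})
        merged["GEOID"] = key
        joined.append(merged)
    return joined, unmatched


# Hypothetical toy records; the values are placeholders, not study data.
geometries = [{"GEOID": "17031010100"}, {"GEOID": "17031010201"}]
attributes = [{"fips": 17031010100, "physical_inactivity": 21.4}]

joined, unmatched = join_by_fips(geometries, attributes)
print(joined)
print(unmatched)
```

Reporting unmatched codes separately makes it easy to see which tracts would drop out of the analysis because of missing data.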

Spatial Distribution of Physical Inactivity
We utilized the elliptic spatial scan statistic with a normal probability model [45,53,54] to find significant clusters of high and low physical inactivity in Chicago. The spatial scan statistic identifies the most likely clusters of either high or low values of a given spatial variable. We chose ellipses over the more established circular form of the spatial scan statistic to address the linear features of our study area, e.g., the shore of Lake Michigan. Each cluster z is an ellipse of radius r, centered on the centroid of a census tract, and multiple candidate ellipses of different radii and angles are assessed per tract. We evaluated a set of N = 796 census tracts in Chicago, eliminating 5 tracts due to lack of data or contiguity. If a given centroid lies within the ellipse centered on a neighboring tract, its tract becomes part of the respective cluster. To find the most likely clusters of physical inactivity prevalence, the spatial scan statistic tests, for each candidate ellipse, the null hypothesis ($H_0$) that the mean prevalence of physical inactivity inside the cluster is equal to that outside. Conversely, the alternative hypothesis ($H_a$) states that physical inactivity prevalence inside the cluster is higher or lower than outside. Since we are looking for areas of both increased and decreased physical inactivity, we evaluate $H_0$ and $H_a$ by choosing z to either minimize or maximize the log of the likelihood ratio (LLR) in Equation (1), where $x_i$ is the physical inactivity prevalence value at census tract i, $\mu$ the global mean, and $\sigma^2$ the variance. We use Monte Carlo simulation to evaluate the statistical significance of the most likely clusters through random permutation of physical inactivity prevalence values and their corresponding census tracts 999 times. Therefore, the spatial scan statistic is computed 999 times for simulated data, allowing for calculating cluster p-values [55].
We restricted ellipses to contain a maximum of 20% of Chicago's population to avoid excessively large clusters that may be better represented as multiple smaller and disconnected clusters.
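The scan procedure can be illustrated with a small sketch. It is not SaTScan and it makes several simplifying assumptions: candidate zones are passed in as precomputed membership masks instead of ellipses, the population cap is omitted, and the LLR is written in the commonly used form for a normal model with a shared variance, LLR(z) = (N/2) ln(σ²/σ_z²), where σ_z² is the pooled within-group variance under the alternative. All function names and the toy values are hypothetical.

```python
import math
import random


def normal_llr(values, in_zone):
    """Log likelihood ratio of one candidate zone under a normal model.

    Null: one common mean. Alternative: separate means inside and outside
    the zone with a common variance (maximum likelihood estimates).
    """
    n = len(values)
    inside = [x for x, flag in zip(values, in_zone) if flag]
    outside = [x for x, flag in zip(values, in_zone) if not flag]
    if not inside or not outside:
        return 0.0
    mu = sum(values) / n
    var_null = sum((x - mu) ** 2 for x in values) / n
    mu_in = sum(inside) / len(inside)
    mu_out = sum(outside) / len(outside)
    var_alt = (sum((x - mu_in) ** 2 for x in inside)
               + sum((x - mu_out) ** 2 for x in outside)) / n
    if var_null == 0:
        return 0.0
    if var_alt == 0:
        return math.inf
    return 0.5 * n * math.log(var_null / var_alt)


def scan(values, zones):
    """Return (largest LLR, index of the zone that attains it)."""
    scores = [normal_llr(values, zone) for zone in zones]
    best = max(range(len(zones)), key=scores.__getitem__)
    return scores[best], best


def monte_carlo_p(values, zones, n_sim=999, seed=0):
    """Permutation p-value of the most likely cluster."""
    rng = random.Random(seed)
    observed, best = scan(values, zones)
    shuffled = list(values)
    exceed = 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)  # reassign values to tracts at random
        if scan(shuffled, zones)[0] >= observed:
            exceed += 1
    return observed, best, (exceed + 1) / (n_sim + 1)


# Toy prevalence values for 8 tracts and three candidate zones (placeholders).
values = [30, 32, 31, 12, 14, 13, 15, 11]
zones = [
    [True, True, True, False, False, False, False, False],
    [False, False, True, True, True, False, False, False],
    [False, False, False, False, False, False, True, True],
]
observed, best, p_value = monte_carlo_p(values, zones)
print(best, round(observed, 2), p_value)
```

The rank-based p-value counts the observed statistic among the simulated ones, which is why 999 replications yield p-values in steps of 0.001.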
While the spatial scan statistic determines the presence, location, and strength of statistically significant clusters in the data, the Moran's I statistic measures spatial autocorrelation [56]. The global form of Moran's I tests whether physical inactivity values are correlated among adjacent tracts, and its local form [57] allows for illustrating where this is the case geographically. Local Moran's I belongs to the group of Local Indicators of Spatial Autocorrelation (LISA) and is calculated as follows [58], Equation (2):
$$I_i = \frac{z_i - \bar{z}}{\sigma^2} \sum_{j=1,\, j \neq i}^{n} W_{ij} \left(z_j - \bar{z}\right) \qquad (2)$$
where $I_i$ denotes the Moran's I at location i; $z_i$ represents physical inactivity prevalence at location i; $\bar{z}$ is the mean physical inactivity value; $z_j$ is physical inactivity at other locations (where $j \neq i$); $\sigma^2$ is the variance of z; and $W_{ij}$ is the spatial weight based on proximity between $z_i$ and $z_j$. If local Moran's I is positive for a given census tract, it has a similarly high or low physical inactivity prevalence value as its adjacent neighbors, which is referred to as a "spatial cluster". Spatial clusters can be clusters of high values ("high-high" cluster, a.k.a. "hot spot") or of low values ("low-low" cluster, a.k.a. "cold spot"). Conversely, if local Moran's I is negative, the corresponding census tract exhibits different values than its neighbors and is therefore referred to as either a "high-low" or "low-high" outlier. Here, we apply local Moran's I to identify hot and cold spots of physical inactivity based on 9999 permutations at the significance level of p < 0.05.
The spatial scan statistic and local Moran's I can be used in tandem, as they answer slightly different questions: The spatial scan statistic outputs clusters of significantly high or low physical inactivity prevalence in elliptical form. The values are high or low compared to an expected value, which is based on the global physical inactivity prevalence (average physical inactivity for Chicago, in our case). Clusters identified by the spatial scan statistic are not necessarily coherent, as they may include areas that do not significantly deviate from the expectation. Local Moran's I is suited to assess whether clusters are coherent, as it shows whether census tracts are similar to their neighbors. Therefore, while the spatial scan statistic compares a given region to a global expectation, local Moran's I compares them to their immediate neighbors.
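To make the local Moran's I computation concrete, the following sketch evaluates it for a toy chain of tracts. It assumes row-standardized binary contiguity weights (each neighbor of tract i receives weight 1 divided by the number of neighbors), which is a common choice but is not stated in the text. The permutation-based significance test is omitted, and all names and values are hypothetical.

```python
def local_morans_i(values, neighbors):
    """Local Moran's I per location with row-standardized binary weights.

    neighbors[i] lists the indices adjacent to location i (never i itself).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    variance = sum(d * d for d in dev) / n
    result = []
    for i, adj in enumerate(neighbors):
        if not adj:
            result.append(0.0)  # isolated location: no spatial lag
            continue
        lag = sum(dev[j] for j in adj) / len(adj)  # weighted neighbor deviation
        result.append(dev[i] / variance * lag)
    return result


def cluster_labels(values, neighbors):
    """Label each location by its own value and its neighbors' average."""
    mean = sum(values) / len(values)
    labels = []
    for i, adj in enumerate(neighbors):
        own = "high" if values[i] > mean else "low"
        lag = sum(values[j] for j in adj) / len(adj)
        around = "high" if lag > mean else "low"
        labels.append(f"{own}-{around}")
    return labels


# Toy chain of six tracts: three high values followed by three low values.
values = [30, 28, 27, 12, 10, 11]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]

local_i = local_morans_i(values, neighbors)
labels = cluster_labels(values, neighbors)
# With row-standardized weights and no isolated locations, the global
# Moran's I equals the average of the local values.
global_i = sum(local_i) / len(local_i)
print([round(v, 3) for v in local_i])
print(labels)
print(round(global_i, 3))
```

In this toy example every local value is positive, so each tract resembles its neighbors, which corresponds to the high-high and low-low clusters described above.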

Spatial Correlates of Physical Inactivity
We determined significant predictors of physical inactivity by employing Ordinary Least Squares (OLS) regression. We avoided multicollinearity among predictor variables by computing the variable correlation matrix and ensuring that variance inflation factors were below a recommended threshold of 2.5 [59], indicating that collinearity among predictors did not lead to an inflation of variance. This led to excluding the variables Age, Gender, Minority, Ethnic diversity, Parks ratio, Traffic intensity, and PM 2.5. Therefore, our final regression model included the predictor variables Limited English, Disability, Poverty, No high school, Mixed land use, Bike ratio, Tree ratio, and Vacant housing. Our regression diagnostics included checking for heteroskedasticity by plotting residuals versus fitted values and checking for normality by the histogram of standardized residuals. We further analyzed our OLS regression model to check for spatial autocorrelation of residuals, which constitutes a violation of OLS assumptions [46]. We tested for the presence of residual spatial autocorrelation using global Moran's I [56].
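The variance inflation factor screening can be sketched in plain Python. The sketch relies on the identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictors' correlation matrix, which is equivalent to 1 / (1 - R_j²). It is an illustration under that assumption and not the code used in the study. The function names and toy values are hypothetical.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([x - mean for x in col])
    norms = [sum(x * x for x in col) ** 0.5 for col in centered]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centered, norms)]
            for ci, ni in zip(centered, norms)]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """VIF per predictor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Toy predictors (placeholder values, not study data).
predictors = {
    "poverty": [1, 2, 3, 4, 5],
    "no_high_school": [2, 1, 4, 3, 5],
}
vifs = variance_inflation_factors(list(predictors.values()))
threshold = 2.5  # threshold named in the text
flagged = [name for name, v in zip(predictors, vifs) if v >= threshold]
print([round(v, 3) for v in vifs])
print(flagged)
```

With only two predictors, both VIFs equal 1 / (1 - r²), where r is their correlation, so the toy result can be checked by hand.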
In addition, we used Geographically Weighted Regression (GWR), which extends traditional OLS regression by allowing local rather than global parameters to be estimated [60]. It assumes spatial heterogeneity of predictor variable effects and therefore allows for measuring the spatial variation of regression model results. Spatial non-stationarity might be lost when using simple global fitting methods [61]. GWR has been used broadly to explain relationships in various fields, including but not limited to urban health [62], urban mobility [63], spatial epidemiology, and land use planning [64]. GWR models produce a set of local parameter estimates showing how a relationship varies over space. For each data point, the GWR model produces the local $R^2$ and local residual, as well as local coefficients, allowing for analyzing the spatial variation of relationships between target and predictor variables. We used an adaptive kernel to account for the nonuniform spatial distribution of the data. An adaptive kernel allows the GWR model to determine the optimum bandwidth by iterating over the number of nearest neighbors that should be considered for the local regression model [65]. The optimal bandwidth is identified by minimizing the corrected Akaike Information Criterion (AICc) score or local information loss [61]. We used Python 3.10.1 [66], R 4.1.0 with RStudio 1.4.1717 [67,68], as well as SaTScan™ software [53] for statistical computing, and ArcGIS Pro [69] and the tmap package [70] for cartography.
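The idea of local estimation with an adaptive kernel can be illustrated with a deliberately reduced sketch. It assumes a single predictor, a bisquare kernel, and a fixed number of nearest neighbors k, whereas the analysis described above uses several predictors and selects the bandwidth by AICc. It does not reproduce the interface of any GWR package. All names and toy values are hypothetical.

```python
import math


def adaptive_bisquare_weights(coords, i, k):
    """Bisquare weights around location i.

    The bandwidth adapts to local data density: it is the distance from
    location i to its k-th nearest neighbor.
    """
    xi, yi = coords[i]
    dists = [math.hypot(x - xi, y - yi) for x, y in coords]
    bandwidth = sorted(dists)[k]  # position 0 is location i itself
    if bandwidth == 0:
        raise ValueError("k-th nearest neighbor coincides with the location")
    return [(1 - (d / bandwidth) ** 2) ** 2 if d < bandwidth else 0.0
            for d in dists]


def weighted_fit(x, y, w):
    """Weighted least squares for y = intercept + slope * x."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    if sxx == 0:
        raise ValueError("no local variation in the predictor")
    slope = sxy / sxx
    return my - slope * mx, slope


def local_coefficients(coords, x, y, k):
    """One (intercept, slope) pair per location."""
    return [weighted_fit(x, y, adaptive_bisquare_weights(coords, i, k))
            for i in range(len(coords))]


# Toy transect of ten locations: the relationship between x and y is
# positive in the west (first five) and negative in the east (last five).
coords = [(float(i), 0.0) for i in range(10)]
x = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1]
y = [xi if i < 5 else -xi for i, xi in enumerate(x)]

coefficients = local_coefficients(coords, x, y, k=4)
slopes = [slope for _, slope in coefficients]
print([round(s, 2) for s in slopes])
```

A single global regression would average the two regimes, while the local slopes change sign along the transect, which is the kind of non-stationarity GWR is meant to reveal.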

Spatial Distribution of Physical Inactivity
The spatial distribution of physical inactivity among adults can be characterized by high prevalence in Chicago's southern, southwestern, and western parts (Figure 1a). Using the spatial scan statistic, we identified one cluster of statistically significant high physical inactivity prevalence (Figure 1b), located in Chicago's west and southwest side. It has a mean physical inactivity prevalence of 36.35% (Table 1), has an LLR of 444,515.03, and encompasses 180 census tracts. In addition, we identified one cluster of low physical inactivity located in the northeastern part of the city. It has a mean prevalence of 14.10%, an LLR of 823,303.73, and encompasses 152 census tracts. Table 1 shows important characteristics of the corresponding clusters in Figure 2. Lastly, both clusters are significant at the p < 0.05 level.
The spatial clusters of physical inactivity identified by local Moran's I largely follow the ones identified by the spatial scan statistic (Figure 1b). The high-high (Moran's I) cluster in the western part of Chicago overlaps with the (spatial scan statistic) cluster of high prevalence. Therefore, this area exhibits significantly higher physical inactivity levels than the rest of the study area, while census tracts within the area exhibit positive spatial autocorrelation, meaning values are similarly high among neighbors. Conversely, the low-low (Moran's I) cluster in the northeastern part of Chicago overlaps with the (spatial scan statistic) cluster of low physical inactivity prevalence. This area has significantly lower physical inactivity levels than the rest of the study area, while census tracts within exhibit positive spatial autocorrelation, meaning values are similarly low among neighbors. The local Moran's I analysis confirms that the clusters identified by the spatial scan statistic are indeed clusters of extreme (high/low) physical inactivity values and that these clusters are compact and coherent.
Spatial clusters identified by local Moran's I that lie outside of clusters identified by the spatial scan statistic can be considered outliers, such as the group of high-high census tracts in the southeastern part of the city. These areas may exhibit positive spatial autocorrelation of high physical inactivity prevalence values but grouping them together in an ellipse shape to form a significant cluster by spatial scan statistic was not possible.

Spatial Correlates of Physical Inactivity
We found high-poverty census tracts ("Poverty" variable) in the southern part of Chicago and west of downtown (Figure 2a). Similarly, we found pockets of language limitation ("Limited English" variable) scattered throughout the city, with larger clusters west and northwest (Figure 2b). People with disabilities ("Disability" variable) have a different pattern, with higher prevalence in the southern parts of the city (Figure 2c). The distribution of the percentage of people with no high school diploma ("No high school" variable) shows the highest prevalence in the west of the city (Figure 2d). The distribution of mixed land use ("Mixed land use" variable) shows that regions near downtown have a higher diversity of activities (Figure 2e). Expectedly, the access to bike lanes ("Bike ratio" variable) is high around downtown and some communities in the south (Figure 2f). Also expectedly, areas with higher levels of tree cover ("Tree ratio" variable) are found at the city's fringe (Figure 2g). Conversely, census tracts in the central parts of the city have lower levels of tree cover. We found vacant housing ("Vacant housing" variable) in the southwest and south of the city (Figure 2h).
The OLS regression model revealed a positive association of the proportion of people in poverty ("Poverty" variable) with physical inactivity (Table 2). This indicates that poorer census tracts are home to a less physically active population. In addition, higher proportions of people with disabilities ("Disability" variable) were associated with higher levels of physical inactivity. Then, the proportion of the population without a high school degree ("No high school" variable) was positively associated with physical inactivity, indicating that limited educational attainment is associated with inactive lifestyles. Further, limited English language capability ("Limited English" variable) was also positively associated with physical inactivity.
The proportion of tree cover ("Tree ratio" variable) was positively associated with physical inactivity, meaning census tracts with higher tree coverage exhibited higher physical inactivity. Then, the vacant housing percentage ("Vacant housing" variable) showed a positive association with physical inactivity, suggesting that physical inactivity is higher where the share of vacant houses is high. The mixed land use ratio ("Mixed land use" variable) was inversely associated with physical inactivity, meaning physical inactivity is higher where the diversity of urban activities and land uses is low. Lastly, the bike lane ratio ("Bike ratio" variable) was inversely associated with physical inactivity. Therefore, physical inactivity is high where spatial access to bike lanes is low. Overall, the model fit was high, with an $R^2$ of 0.88. The AIC of the linear model was 4131, and the Jarque-Bera statistic was 0.23, indicating a normal distribution of OLS regression residuals. The spatial analysis of residuals revealed significant spatial autocorrelation in the model: Moran's I test (I = 0.17, p = 0.00) confirms the presence of spatial autocorrelation of residuals. The GWR mostly confirmed the result of the OLS model while describing nonstationary spatial relationships. As expected, the GWR coefficients indicated the presence of spatial variation. The model fit of the GWR (AIC = 3780, $R^2$ = 0.93) was better than that of the OLS model, indicating that incorporation of spatial structure accounts for some of the previously unexplained variation. Our study identified spatial variation in the physical inactivity prevalence of the population in Chicago. The findings indicate that the positive association between poverty and physical inactivity is strongest in the communities west and south of downtown, such as the Englewood neighborhood (Figure 3a). In addition, the association between limited English language capability and physical inactivity is predominantly negative.
However, positive associations are found in the northeast and southwest, such as Edgewater, Lincoln Park, and Rogers Park, with predominantly European and Middle Eastern immigrants (Figure 3b). Moreover, model coefficients for the share of people with disabilities are positive for most of our study area, where the highest values are found along the shore of Lake Michigan. In contrast, the relationship is reversed in neighborhoods far west of the city, where negative associations between disability and physical inactivity are found (Figure 3c). Also, the relationship between lacking a high school diploma and physical inactivity is positive throughout the city but strongest in Chicago's western and far northwestern parts (Figure 3d). Further, the predominantly negative relationship between mixed land use and physical inactivity is reversed in the city's south, where strong positive associations are found (Figure 3e). Furthermore, the association of bike lanes with physical inactivity is negative around downtown and in the northwestern part of the city.
Conversely, values are positive in the southern, southwestern, and some parts north of downtown (Figure 3f). As expected, the relationship between trees and physical inactivity is positive around downtown and negative in the south (Figure 3g). Lastly, the association between vacant housing and physical inactivity is positive and is strongest in the west and southwest (Figure 3h).

Discussion
Using one spatial clustering technique and two different regression models, our study examined the spatial distribution and varying relationships between physical inactivity prevalence and social and environmental factors. We found that physical inactivity prevalence varies across Chicago, with higher levels in the city's west side, such as Englewood and Little Village communities, which are surrounded by industrial areas and exhibit high proportions of Black and Hispanic populations, respectively (see Figure 1a,b).
Our results indicate variation in physical inactivity prevalence by poverty and language proficiency. The highest share of physical inactivity was found west and southwest of downtown, affecting low-income people. However, limited language proficiency is not a barrier to physical activity. In addition, physical inactivity was elevated along Lake Michigan among the population with disabilities, which is especially burdensome for people with disabilities who are also economically vulnerable. In contrast, higher levels of physical activity were found among well-educated people in the northern part of the downtown.

Besides the socioeconomic and demographic factors, we found associations of the built environment and environmental factors with physical inactivity. For instance, spatial access to diverse urban activities shapes human behavior, as less-connected urban facilities and "obesogenic" environments offer fewer opportunities for physical activity like walking and biking. Even if cities offer connected pedestrian and street networks, they may not provide destinations for walking and biking, thereby hampering the development of an active lifestyle. Human-scale cities should be well connected with diverse urban activities and build exercise-oriented urban spaces. However, our results show that physical inactivity is higher despite economic diversities in the southern part of the city. Similarly, physical inactivity is elevated in the southern part of the city with a higher share of bike lanes. The quality of urban space is measured through qualitative and quantitative metrics, such as access to local facilities, mixed land uses, public transport density, and environmental perception measures (safety, tidiness, transparency, imageability, enclosure, and human scale) [71,72]. Hence, mixed land use per se is not a determinant factor for physical activities.
In addition, we found associations of environmental factors with physical inactivity prevalence, including urban tree cover. However, the findings show that physical inactivity increased when the tree cover was elevated. Canopies are mainly isolated and spatially segregated in the low-density residential districts in Chicago [73]. Trees as urban features should be planted, designed, and maintained purposefully and consciously in synergy with the socioeconomic and physical dynamics of the built environment to become a contributor to physical activities. For instance, urban trees in vacant and abandoned properties cannot promote physical activity behavior, as these properties create a sense of fear and invite criminal activities. The results found that physical inactivity increased as vacant housing increased in the south and southwest parts of the city.
Physical activity behavior is non-linearly associated with urban dynamics and is related to people's environmental perceptions and cultural dynamics. For instance, personal security concerns influence the perceived safety of the environment for physical activity in daily life [72,74]. Hence, the multifactorial relational aspects of physical activity behavior require built environment interventions, community programs (e.g., family and social support programs), and policies (e.g., Complete Street Policies, New Urbanism Policies [75]) that contribute to minimizing hindrances to physical activity. The healthy behavior of populations is the outcome of multi-institutional policies, and it requires interdisciplinary and inter-scale surveys.
To build healthy cities, urban planners and decision-makers should include measures to improve the connectivity of urban components, such as trees, particularly in areas with high physical inactivity prevalence, low income, a high share of the population with disabilities, and communication barriers. Furthermore, the findings may inform urban planning decisions to reduce disparities in urban health, as well as strategies to reduce the burden of diseases associated with sedentary lifestyles.
There are five main limitations to this study. First, the physical inactivity variable is based on survey data, which may introduce response bias. Second, this study does not include the impacts of micro-community environments such as street connectivity. Future research can capture these impacts to represent the spatial features of this variable. Third, we employed the spatial scan statistic, which imposes the assumption of elliptical clusters, which may not hold true. Fourth, our study may inform future research by identifying neighborhoods that exhibit elevated physical inactivity levels, as well as their associations with socioeconomic and environmental factors, but due to the retrospective study design, our ability to identify causal relationships is limited. Fifth, our data are aggregated to census tracts and, therefore, subject to the modifiable areal unit problem (MAUP, [76]). Methods exist to at least partially address the issue by circumventing the use of predefined and imprecisely measured neighborhood definitions [77].

Conclusions
This study used spatial clustering and geographically weighted regression to assess the spatially varying relationships between social and environmental factors and physical inactivity in Chicago, IL, USA. Our main finding is that the spatial distribution of physical inactivity prevalence varies based on income, land use planning, tree canopy, education, disability, and language barriers. Accordingly, suggestions for planning interventions for Post Pandemic Cities include (1) To perceive cities as a system of interconnected infrastructures; (2) To integrate dynamics of leisure and entertainment in renewal and urban development practices to promote healthy behavior and health equity; (3) To promote socio-economic and spatial diversity of urban spaces; (4) To understand barriers of people with different types of disabilities (e.g., intellectual, mobility, hearing, vision, and psychological disabilities) and include their spatial and environmental needs in urban development projects; (5) To design urban projects by considering people's socio-environmental perception because physical activity behavior is driven by personal perception; and (6) To identify tree canopies as urban features. Tree canopies encompass the layer of leaves, branches, and stems that shelter the ground when viewed from above. Canopies in synergy with other urban features on a human scale contribute to physical activity and health equity.
This cross-sectional study used public data to explore the socio-environmental variables associated with physical inactivity prevalence. Further study is needed to discover the causality between perceptions of the environment and physical inactivity prevalence. Lastly, longitudinal studies should be conducted to identify causal relationships between socio-environmental factors and physical inactivity outcomes.