The Geography of Mental Health, Urbanicity, and Affluence

Residential location has been shown to significantly impact mental health, with individuals in rural communities experiencing poorer mental health compared to those in urban areas. However, the influence of an individual’s social group on the relationship between residential location and mental health outcomes remains unclear. This study disaggregates the rural-urban binary and investigates how geography and social groupings interact to shape mental health outcomes. Merging data from PLACES and Claritas PRIZM, we conducted a hotspot analysis, generated bivariate choropleth maps, and applied multiscale geographically weighted regressions to examine the spatial distribution of mental health and social groupings. Our findings reveal that mental health is influenced by complex interactions, with social groups playing a critical role. Our study highlights that not all rural and urban areas are alike, and the extent to which social groups influence mental health outcomes varies within and across these areas. These results underscore the need for policies that are tailored to meet the unique mental health needs of individuals from different social groups in specific geographic locations to inform policy interventions that more effectively address mental health disparities across diverse communities.


Introduction
A recent article published in the Journal of the American Medical Association emphasizes the need for further research to comprehend the relationship between a person's place of residence and their health outcomes [1]. While the call for research may appear obvious to some, the reality is quite complex. Where we reside, work, and spend our leisure time significantly influences the quantity and quality of opportunities available to us [2] as well as our behaviors [3]. Therefore, the question of whether geography impacts health outcomes lies at the heart of population health science, which aims to understand the root causes of health disparities across various populations to enhance health outcomes effectively [4]. These disparities vary significantly based on geography.
The existing literature suggests that an individual's mental health outcomes are significantly influenced by the interplay between their characteristics and their place of residence. Factors such as socioeconomic status, physical environment, social support networks, and healthcare access have been identified as crucial determinants of mental health status [5][6][7][8][9][10][11][12]. These determinants are especially pronounced in urban versus rural settings. For instance, 59% of the decrease in community hospitals between 2015 and 2019 were rural hospitals [13], which limits the quantity and quality of opportunities for rural residents to focus on their health-related behaviors.
Prior research indicates that rural residents experience significantly worse mental health outcomes [14][15][16][17][18]. However, rural areas are not homogeneous, and their populations are complex and diverse. Although a significant portion of the U.S. population resides in rural areas, these individuals exhibit different demographic characteristics, social networks, and healthcare access. One study suggests that mental health disorders may be more prevalent in semi-rural areas than in rural areas [19], highlighting the importance of

Data
We created a dataset by merging pre-existing data from PLACES and PRIZM. No Institutional Review Board (IRB) approval was required for the present study as the data used were pre-existing, de-identified, and publicly available. The final dataset contains 32,092 Zip Code Tabulation Areas (ZCTA). In total, we deleted 310 observations from the states of Alaska and Hawaii to better account for the neighboring spatial distribution of mental health and social groups. Keeping such states would produce unreliable results since such states do not border with those in the contiguous 48 adjoining U.S. states or those ZCTAs that lacked sufficient population to reliably estimate social groups [22,23].

PLACES
The PLACES database provides model-based estimates of health measures. The modelbased estimates were produced by the CDC using data sources such as the Behavioral Risk Factor Surveillance System (BRFSS) 2020, Census Bureau 2010 population estimates, and American Community Survey (ACS) 2015-2019 estimates. For this project, we focused on mental health as our dependent variable. The crude prevalence of the lack of health insurance, physical inactivity, and frequent physical health distress served as part of our explanatory variables (Table 1). We included the explanatory variables listed in Table 1 as extant research determines they significantly impact an individual's physical and mental health. First, lack of health insurance decreases access to mental health care, leading individuals with limited access to health insurance to worse mental health outcomes [24,25]. Next, current research has established a link between physical and mental health, showing that improvements in physical health and physical activity lead to improvements in mental health [26][27][28][29].
Although existing studies emphasize the importance of access to green and blue spaces for improving mental health [11,12,30], we excluded these variables from our analyses for two reasons. First, data on access to green and blue spaces are not readily available at the ZCTA level. The land cover data provided by the United States Environmental Protection Agency (EPA) specifically state the data should not be utilized for local-level analyses [31]. Second, due to the nature of the geographic data for separating different rural and urban areas based on social groupings, as a group becomes less urbanized, access to green and blue spaces is predicted to increase, leading to these variables being highly correlated. Therefore, we did not introduce a variable set to measure access to green or blue spaces, as this will result in multicollinearity.

Claritas PRIZM Premier Social Groups
The 14 social groups of Claritas PRIZM Premier are constructed based on each place's urbanicity class and affluence. First, each segment is placed in one of four urbanicity class categories (i.e., urban, suburban, second city, and town and rural, see Table 2 below). Second, within these urban classes, all the segments are classified based on affluence. Finally, all segments are grouped into one of the fourteen social groups. At the top of the affluence and density scales is group U1: Urban Uptown, in which residents live in urban areas and are very affluent. At the opposite extreme is group T4: Rustic living, where residents live in rural areas with a more downscale lifestyle. Table 2 summarizes the definitions for each of the 14 social groups, beginning with the highest affluence and density. Residents are middle to lower-class individuals who prefer fishing, hunting, and meeting at civic clubs; in these remote areas, high school football is a main source of entertainment Rustic Living (T4) Residents live in the most remote towns and have modest incomes; these individuals spend their leisure time participating in small-town activities, such as social groups at local churches, veterans' clubs, and car racing In addition, we extracted the average household expenditures on drugs, percent poverty status, and percent unemployed at the ZCTA level from Simply Analytics to serve as additional explanatory variables. First, since increases in drug prices can lead individuals to be unable to afford necessary medications to treat mental health issues, we included a measure of household expenditures on drugs [32][33][34][35]. Next, we utilized poverty status since those living in poverty experience worse mental health outcomes [36,37]. Lastly, unemployed individuals are more likely to struggle with mental health [38][39][40]. Table 3 summarizes the definitions.

Methods
The merged tabular cross-sectional database from PLACES and PIRZM was imported into Esri ArcGIS Pro version 3.1.0 to produce the spatial statistics described below to learn about the spatial distribution of mental health given our control variables in each of the fourteen social groups, which is akin to fitting an interaction model between the controls and the social groups, with the advantage that the interpretation of the results is more straightforward [41].
Although there is no single cause for mental health distress, the explanatory variables included in this paper attempt to measure the most common associated predic -tors [24][25][26][27][28][29][32][33][34][35][36][37][38][39][40]42,43]. Since our level of analysis is at the ZCTA level, we solely focused our study on the spatial distribution and impact of such variables on mental health's geographic patterns and made no inferences at the individual level to avoid ecological fallacy issues.
First, we estimated the Getis-Ord Gi* statistic to analyze if frequent mental health distress clusters spatially via the identification of hotspots. A hotspot compares an area (i.e., ZCTA) with a high concentration of poor mental health days with the expected number given a random distribution. The Gi* statistic compares the density within a ZCTA with a random spatial model and measures the interaction with other areas to understand the occurrence of spatial patterns [44]. In other words, it analyzes if the crude prevalence of frequent mental health distress in a particular ZCTA is high or low, given the crude prevalence of neighboring ZCTAs. A statistically significant hot spot is one in which there is a high value of frequent mental health distress in a particular zip code surrounded by other zip codes with high values (i.e., larger z-scores); in contrast, a cold spot would have smaller z-scores suggesting a significant clustering of a low prevalence of frequent mental health distress.
Second, we presented a series of bivariate choropleth maps to show the spatial distribution between the crude prevalence of frequent mental health distress and social groups. The bivariate coding scheme represents the product of each variable with three discrete classes (i.e., low, medium, and high) to create a grid of nine unique colors [44]. Using Stata 17, we performed a Kruskal-Wallist test to determine if the median crude prevalence of frequent mental health distress was the same across social groups.
Finally, we estimated a series of multiscale geographically weighted regressions (MGWR) by social group to model the spatial correlations between mental health and the macro social determinants of health to determine which of our explanatory variables has the most robust association with mental health while controlling for the other predictors. MGWR is a derivation of geographically weighted regression (GWR) that constructs a linear regression between a dependent variable and a set of explanatory variables within a neighborhood. The main difference between GWR and MGWR is that in the latter, the scale of analysis can vary between variables in contrast with the former, which assumes that each explanatory variable's scale is identical [45]. Figure 1 shows significant clustering of the crude prevalence of frequent mental health distress in the South (e.g., parts of Louisiana, Mississippi, Alabama, Georgia, and South Carolina), Coal Country (e.g., parts of Tennessee, West Virginia, and Pennsylvania), and the Rust Belt (e.g., parts of Ohio, Pennsylvania, Michigan, and Indiana). In addition, certain parts of Oklahoma, Arkansas, and East Texas; cities such as Dallas-Fort Worth and Houston; and California's San Joaquin Valley also experience significant spatial clustering-indicated by the red dots. In contrast, in the Great Plains (e.g., the Dakotas, Minnesota, and Iowa), around the Great Lakes (e.g., Michigan and Wisconsin), the North East, northern New Mexico, Colorado, and parts of Washington and California, among other areas in the country, the prevalence of frequent mental health distress seems to be lower-indicated by the blue dots that illustrate ZCTAs where there is no high crude prevalence. In the rest of the country, there is no significant geographic concentration of the crude prevalence of frequent mental health distress-as seen with the gray dots (All scale of analysis can vary between variables in contrast with the former, which assumes that each explanatory variable's scale is identical [45]. Figure 1 shows significant clustering of the crude prevalence of frequent mental health distress in the South (e.g., parts of Louisiana, Mississippi, Alabama, Georgia, and South Carolina), Coal Country (e.g., parts of Tennessee, West Virginia, and Pennsylvania), and the Rust Belt (e.g., parts of Ohio, Pennsylvania, Michigan, and Indiana). In addition, certain parts of Oklahoma, Arkansas, and East Texas; cities such as Dallas-Fort Worth and Houston; and California's San Joaquin Valley also experience significant spatial clustering-indicated by the red dots. In contrast, in the Great Plains (e.g., the Dakotas, Minnesota, and Iowa), around the Great Lakes (e.g., Michigan and Wisconsin), the North East, northern New Mexico, Colorado, and parts of Washington and California, among other areas in the country, the prevalence of frequent mental health distress seems to be lowerindicated by the blue dots that illustrate ZCTAs where there is no high crude prevalence. In the rest of the country, there is no significant geographic concentration of the crude prevalence of frequent mental health distress-as seen with the gray dots (All Figures are available in the Supplementary Materials).

Hotspot Analysis
To account for the Modifiable Areal Unit Problem (MAUP), which concerns the impact of spatial unit selection on the statistical analysis of spatial data, we conducted a comparison of the spatial distribution of mental health using county and ZCTA-level data. Specifically, we calculated Moran's I Statistic and performed a hotspot analysis. The results of both analyses indicated a statistically significant clustered pattern, demonstrating that despite different levels of aggregation, mental health exhibits similar spatial patterns between counties and ZCTAs (See Table S1 for complete results).

Bivariate Choropleth Maps
The bivariate choropleth maps (Figure 2) reveal various geospatial relationships. We observed negative and small correlations between frequent mental health distress and the distribution of the most urban and/or affluent ZCTAs (pMH-U1 = −0.11, pMH-S1 = −0.26, pMH-C1 To account for the Modifiable Areal Unit Problem (MAUP), which concerns the impact of spatial unit selection on the statistical analysis of spatial data, we conducted a comparison of the spatial distribution of mental health using county and ZCTA-level data. Specifically, we calculated Moran's I Statistic and performed a hotspot analysis. The results of both analyses indicated a statistically significant clustered pattern, demonstrating that despite different levels of aggregation, mental health exhibits similar spatial patterns between counties and ZCTAs (See File S1 for complete results).

Bivariate Choropleth Maps
The bivariate choropleth maps (Figure 2) reveal various geospatial relationships. We observed negative and small correlations between frequent mental health distress and the distribution of the most urban and/or affluent ZCTAs (p MH-U1 = −0.11, p MH-S1 = −0.26, p MH-C1 = −0.05, p MH-T1 = −0.30). In contrast, the correlation between mental health distress and the distribution of less urban and less affluent ZCTAs is mixed. Specifically, we find positive and somewhat stronger correlations in C2, C3, and T4 social groups (p MH-C2 = 0.19, p MH-C3 = 0.20, p MH-T4 = 0.39). Despite the apparent spatial consistency of frequent mental health distress across social groups, the Kruskal-Wallist test revealed that the median crude prevalence of frequent mental health distress was not uniform across the fourteen social groupings (χ2 = 10,866.37, p = 0.0001). = −0.05, pMH-T1 = −0.30). In contrast, the correlation between mental health distress and the distribution of less urban and less affluent ZCTAs is mixed. Specifically, we find positive and somewhat stronger correlations in C2, C3, and T4 social groups (pMH-C2 = 0.19, pMH-C3 = 0.20, pMH-T4 = 0.39). Despite the apparent spatial consistency of frequent mental health distress across social groups, the Kruskal-Wallist test revealed that the median crude prevalence of frequent mental health distress was not uniform across the fourteen social groupings (χ2 = 10,866.37, p = 0.0001).  Table 4 shows the scaled mean value of each coefficient by social grouping with its standard deviation in parenthesis (results available in Table S1). The mean value reflects the association between each explanatory variable and the crude prevalence of mental health distress. Positive mean values indicate that an increase in the explanatory variable is associated with an increase in the crude prevalence of frequent mental health distress. In contrast, negative mean values indicate that a rise in one of the explanatory variables is related to a decrease in the frequency of mental health distress. The standard deviation  Table 4 shows the scaled mean value of each coefficient by social grouping with its standard deviation in parenthesis (results available in File S1). The mean value reflects the association between each explanatory variable and the crude prevalence of mental health distress. Positive mean values indicate that an increase in the explanatory variable is associated with an increase in the crude prevalence of frequent mental health distress. In contrast, negative mean values indicate that a rise in one of the explanatory variables is related to a decrease in the frequency of mental health distress. The standard deviation indicates each explanatory variable's spatial variation. For example, a low standard deviation suggests low spatial variability and vice versa. In most social groups (except U2, C1, C2, and C3), the lack of health insurance among adults aged 18-64 years has the strongest association with frequent mental health distress. Regular physical health distress is the second explanatory variable with a powerful effect on mental health in most social groups (with exceptions in U2, S3, S4, C2, and C3). Physical inactivity is negatively related to mental health in all social groups, except for S3, S4, and C3, where the relationship is positive.

Multiscale Geographically Weighted Regressions
When comparing coefficients by urbanicity and affluence, the results are mixed. In the most affluent and urban areas (U1 and S1), health insurance, frequent physical distress, and unemployment have the highest mean coefficients. However, in urban but less affluent ZCTAs (U3 and S4), the lack of health insurance remains the explanatory variable with the strongest relationship with mental health distress. For U3, frequent mental health distress is closely followed by poverty, while poverty is the second explanatory variable for inner suburbs ZCTAs (S4). Lack of physical activity has the third most substantial impact on mental health for S4.
In the most rural and less affluent ZCTAs (T3 and T4), lack of health insurance, frequent physical distress, and poverty have the highest mean values, following similar patterns as urban core ZCTAs (U3). Among the explanatory variables, lack of health insurance has the highest spatial variation across our study area, followed by physical inactivity, frequent physical health distress, average household drug expenditures, poverty, and unemployment. In U1, physical inactivity has the highest spatial variation, while in T1, lack of health insurance displays the most extensive spatial variation. Lack of health insurance has the highest spatial variation in T4, while U3 has the lowest.
From the results presented in Table 4, we can conclude that the lack of health insurance is the variable with the most robust relationship with frequent mental health distress at the ZCTA level. To further explore their spatial relationship, we investigated the variation in the local parameter estimates of the lack of health insurance for each social group's mental health distress. Figure 3 displays the coefficient estimate with a divergent color scheme centered at zero to identify where the lack of health insurance has a positive or negative relationship with frequent mental health distress. The green halos indicate statistically significant associations with 95 percent confidence. From the results presented in Table 4, we can conclude that the lack of health insurance is the variable with the most robust relationship with frequent mental health distress at the ZCTA level. To further explore their spatial relationship, we investigated the variation in the local parameter estimates of the lack of health insurance for each social group's mental health distress. Figure 3 displays the coefficient estimate with a divergent color scheme centered at zero to identify where the lack of health insurance has a positive or negative relationship with frequent mental health distress. The green halos indicate statistically significant associations with 95 percent confidence.
Overall, we can observe significant spatial variation within and between social groups, most notably in less urban areas. For example, the lack of health insurance in T2 ZCTAs exhibit a positive relationship (i.e., brown dots) in some locations and a negative relationship in others (i.e., purple dots). In terms of statistical significance, there is also crucial spatial variation; however, in rural areas, the impact of the lack of health insurance on frequent mental health distress seems to be the strongest, as well as in the upper rungs of suburban areas.

Discussion
We note four important findings in our study. First, not all rural areas and not all urban areas are alike. The bivariate maps illustrated a discordance in geographic distributions between social groups and mental health. In the most rural and less affluent areas, the correlation between individuals' residence and the prevalence of mental health distress is strong. However, it is also here where we recognize that some suburban areas also experience significant mental health distress, not solely the most remote and least affluent areas; therefore, future studies need to explore why suburban areas, even with increased mental health resources, still experience poorer mental health outcomes.
Second, our study finds that the strength of the impact of each explanatory variable on the crude prevalence of frequent mental health distress behaves differently in each social group. As the previous literature has found [24,25], the lack of health insurance is positively related to an increase in frequent mental health distress; however, its impact is more substantial in less urban and less affluent areas (i.e., T3 and T4), followed by rural and affluent areas (i.e., T1 and T2), urban and affluent ZCTAs (i.e., U1, S1, and C1), and the urban and less affluent ZCTAs (i.e., U3, S4, and C3). In addition, the impact of other factors such as physical health, unemployment, poverty, and physical inactivity changes significantly depending on the social group and location.
At first glance, the negative relationship between physical inactivity and mental health may seem counterintuitive [46][47][48][49]; however, the impact of physical activity on mental health cannot be understood in a vacuum but as another explanatory variable in the context of the general model during a global pandemic, in which, everyday routines were significantly altered. In addition, the effects of physical activity on mental health may be specific to the characteristics of physical activity, such as intensity, duration, and type of exercise, rather than just physiological activation [50] as well as the location where these activities occur. PLACES aggregates data from individual survey questions, which may lead to effects on the wording of the questions. In this case, the question used to estimate the lack of physical inactivity was based on a range of activities that may have different effects on mental health, namely running, calisthenics, golf, gardening, or walking. Finally, researchers must be careful not to commit the atomistic fallacy that occurs when incorrect inferences are made about relationships in aggregate data based on relationships observed in individual data.
Third, our geospatial results suggest that although levels of frequent mental health distress are commonly found in the most remote areas, mental health distress is also found in other communities with greater affluence; therefore, rural communities should not be considered one group since the levels of mental health distress diverge. This finding underscores the need to increase our understanding of geography and social determinants of health. In addition, as existing studies contend that mental health outcomes are worse in rural areas [14][15][16][17][18][19], these studies do not account for the fact that not all affluent individuals are concentrated in urban areas. These individuals also reside in rural areas and have better access to mental health services than those who live in less affluent areas; Overall, we can observe significant spatial variation within and between social groups, most notably in less urban areas. For example, the lack of health insurance in T2 ZCTAs exhibit a positive relationship (i.e., brown dots) in some locations and a negative relationship in others (i.e., purple dots). In terms of statistical significance, there is also crucial spatial variation; however, in rural areas, the impact of the lack of health insurance on frequent mental health distress seems to be the strongest, as well as in the upper rungs of suburban areas.

Discussion
We note four important findings in our study. First, not all rural areas and not all urban areas are alike. The bivariate maps illustrated a discordance in geographic distributions between social groups and mental health. In the most rural and less affluent areas, the correlation between individuals' residence and the prevalence of mental health distress is strong. However, it is also here where we recognize that some suburban areas also experience significant mental health distress, not solely the most remote and least affluent areas; therefore, future studies need to explore why suburban areas, even with increased mental health resources, still experience poorer mental health outcomes.
Second, our study finds that the strength of the impact of each explanatory variable on the crude prevalence of frequent mental health distress behaves differently in each social group. As the previous literature has found [24,25], the lack of health insurance is positively related to an increase in frequent mental health distress; however, its impact is more substantial in less urban and less affluent areas (i.e., T3 and T4), followed by rural and affluent areas (i.e., T1 and T2), urban and affluent ZCTAs (i.e., U1, S1, and C1), and the urban and less affluent ZCTAs (i.e., U3, S4, and C3). In addition, the impact of other factors such as physical health, unemployment, poverty, and physical inactivity changes significantly depending on the social group and location.
At first glance, the negative relationship between physical inactivity and mental health may seem counterintuitive [46][47][48][49]; however, the impact of physical activity on mental health cannot be understood in a vacuum but as another explanatory variable in the context of the general model during a global pandemic, in which, everyday routines were significantly altered. In addition, the effects of physical activity on mental health may be specific to the characteristics of physical activity, such as intensity, duration, and type of exercise, rather than just physiological activation [50] as well as the location where these activities occur. PLACES aggregates data from individual survey questions, which may lead to effects on the wording of the questions. In this case, the question used to estimate the lack of physical inactivity was based on a range of activities that may have different effects on mental health, namely running, calisthenics, golf, gardening, or walking. Finally, researchers must be careful not to commit the atomistic fallacy that occurs when incorrect inferences are made about relationships in aggregate data based on relationships observed in individual data.
Third, our geospatial results suggest that although levels of frequent mental health distress are commonly found in the most remote areas, mental health distress is also found in other communities with greater affluence; therefore, rural communities should not be considered one group since the levels of mental health distress diverge. This finding underscores the need to increase our understanding of geography and social determinants of health. In addition, as existing studies contend that mental health outcomes are worse in rural areas [14][15][16][17][18][19], these studies do not account for the fact that not all affluent individuals are concentrated in urban areas. These individuals also reside in rural areas and have better access to mental health services than those who live in less affluent areas; therefore, it is necessary to break up rural areas to assess mental health treatments and services.
Finally, the divergent spatial patterns between frequent mental health distress and the explanatory variables used in this study highlight a need for policy to tailor better mental health prevention that emphasizes meeting the needs of target populations in specific geographies.
While our study provides valuable insights, there are several limitations that should be acknowledged. First, the dynamic nature of the COVID-19 pandemic means that mental health outcomes are subject to change over time as new interventions are implemented. Therefore, our findings should not be generalizable to the entire duration of the pandemic. Second, the mapping analyses relied on some model-based estimates, which inherently contain some uncertainty. As such, our findings cannot be interpreted as causal or determinant. Lastly, our analysis utilized PLACES data collected at different time points, which may introduce temporal bias into our results.
Despite these limitations, our study offers a "big picture" framework that may help to guide future population and mental health services planning. The maps presented in this study provide important geospatial input data for public policy planning, especially as they relate to non-health factors-such as social activities, spending habits, and lifestyle preferences-that can influence health outcomes.
Overall, the results from this study provide implications for future research and how to address mental health crises. First, future research should use more comprehensive coding schemes when analyzing rural and urban populations. Additionally, evaluations should include affluence as wealth, which can lead to better healthcare access. Next, given these results, existing catch-all methods utilized to address mental health problems should be expanded to include more appropriate measures for rural and urban communities. Diverse populations need diverse treatments, and existing studies are missing part of the picture.

Conclusions
Extant research has improved our understanding of geographic location's role in mental health outcomes. Results from this study conclude that urban and rural areas are complex and diverse, leading to different mental health outcomes for individuals within these areas that are dependent upon one's social group. Given these findings, policies attempting to alleviate mental health issues within urban and rural communities should further separate these groups and consider how other factors account for persistent mental health problems.
In sum, these findings highlight a need for breaking up large, classified groups before determining ways to address mental health and insurance access on a larger scale. Future studies should continue to utilize a multifaceted urban and rural coding scheme to construct policies targeting these complex groups and their mental health needs.