1. Introduction
There is a need to characterize patterns of risk factors within communities at a resolution that is meaningful to community-level public health efforts. Geographic information systems (GIS) have become a popular and powerful tool in community health [1], especially when publicly available databases have been leveraged. Public health and population health studies have been undertaken to address this need, including through the development of various spatial microsimulation methodologies in the US, UK, New Zealand, and Australia on topics such as poverty, obesity, smoking, and mental health [2,3,4,5,6]. In this context, we define spatial microsimulation as the use of simulation methods to generate individual-level data at a higher geographic resolution than is available in public administrative datasets. While these studies provided important insights at a high geographic resolution, these spatial microsimulation methods have not typically been developed or applied to respond directly to the needs of community members or community-serving organizations.
Developing meaningful models within a community context requires a shift from conducting research on communities to working collaboratively on topics of mutual interest [7]. ZIP code-level community-engaged plans were developed in Florida to address infant mortality [8] and in Michigan [9] to tackle diabetes prevalence. While these studies provided important insights relevant to public health policy and practice, their low geographic resolution makes the data less nuanced than data from studies utilizing spatial microsimulation techniques. When attempting to determine optimal intervention strategies or plans within a city, data with greater geographic and demographic resolution are required, but studies have not previously generated this information following a process that would most meaningfully inform community decision-making.
In this study, we focused on New Bedford, Massachusetts (MA), USA, a low-income city with multiple environmental and public health challenges. Given limited resources and a diverse population, within-city comparisons are crucial to local government and community-based organizational needs assessments and planning. We leveraged our previously developed spatial microsimulation methodology to predict high-resolution patterns of multiple risk factors by: (1) constructing a simulated population with multivariable demographic attributes at high spatial resolution (synthetic microdata), and (2) building regression models using public databases to predict risk factors of concern to community partners as a function of available demographic information [10,11]. To our knowledge, this is the first application that has directly connected spatial microsimulation methods with high-priority needs articulated by the community in a fully collaborative context.
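To make the link between steps (1) and (2) concrete, the following minimal Python sketch applies a logistic regression model to a toy synthetic population and averages the predicted probabilities within each census tract. It is an illustration under stated assumptions, not the study's implementation: the coefficient values, the predictor names (`age_65_plus`, `low_income`, `no_college`), the tract labels, and the helper functions are all hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical log-odds coefficients for a binary outcome (e.g., a diagnosis).
# Illustrative values only; they are NOT estimates from the study.
COEFS = {
    "intercept": -2.0,
    "age_65_plus": 0.9,
    "low_income": 0.5,
    "no_college": 0.4,
}


def predict_probability(person, coefs=COEFS):
    """Apply the logistic model to one synthetic individual's 0/1 demographic flags."""
    log_odds = coefs["intercept"] + sum(
        coefs[name] * person.get(name, 0)
        for name in coefs
        if name != "intercept"
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


def tract_prevalence(population, coefs=COEFS):
    """Average the individual predicted probabilities within each census tract."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for person in population:
        totals[person["tract"]] += predict_probability(person, coefs)
        counts[person["tract"]] += 1
    return {tract: totals[tract] / counts[tract] for tract in totals}


# Toy synthetic microdata: one record per simulated resident (hypothetical).
synthetic_population = [
    {"tract": "A", "age_65_plus": 1, "low_income": 1, "no_college": 1},
    {"tract": "A", "age_65_plus": 0, "low_income": 1, "no_college": 0},
    {"tract": "B", "age_65_plus": 0, "low_income": 0, "no_college": 0},
    {"tract": "B", "age_65_plus": 0, "low_income": 0, "no_college": 1},
]

prevalence = tract_prevalence(synthetic_population)
for tract, value in sorted(prevalence.items()):
    print(f"Tract {tract}: predicted prevalence {value:.3f}")
```

In this sketch, differences between tracts arise only from differences in the demographic composition of their simulated residents, because the same coefficients are applied everywhere.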
4. Discussion
As expected, significant predictors in our models matched those found in the literature. For example, Boutelle et al. (2004) found that exercise was associated with ethnicity, income, and smoking [18]. Trudeau et al. (2003) found that predictors of fruit and vegetable consumption differed by gender and included age, education, exercise, and smoking [19]. In the literature, employment [20] as well as income and education were found to be significant predictors of BMI [21]. Lastly, predictors of diabetes included demographics, BMI, diet, and lifestyle indicators [22]. In spite of the number of highly significant predictors, only a small fraction of variability in these behaviors and outcomes can be explained by basic sociodemographic predictors. For example, the majority of the exercise literature examines predictors such as family support, attitudes, and environmental factors [23,24], and a systematic review of the literature on fruit and vegetable consumption found that the majority of the literature examines predictors such as intentions, attitudes, perceived barriers, and autonomy [25], which are more specific and not readily available in public datasets. That said, our ability to determine sociodemographic predictors of key behaviors and outcomes allowed us to connect with a population dataset developed through spatial microsimulation, thereby yielding predictions of within-city variability not otherwise available.
In conversations between university and community partners, community partners indicated value in determining factors that show the needs of specific New Bedford census tracts in comparison to elsewhere in New Bedford, the rest of Massachusetts, or the nation. The results can be used to help target planning efforts on and within particular neighborhoods. Specifically, annual applications for brownfield grants to further the city’s environmental justice efforts are augmented by data-driven, demographically and spatially resolved insights. Similarly, grant applications to expand the farmers market program are greatly informed by spatially resolved insight about high-risk subpopulations. The modeled data provide needed information to focus health promotion efforts at a geographically resolved level that was not previously or publicly available to government agencies, non-profits, or community groups. The novelty of this application is most apparent in Figure 1, which conveys variability in need across New Bedford, along with the marked differences from the state as a whole (Table 2). Additionally, the regression models (Table 1) point toward specific demographic groups that could be targeted, either independently of or in conjunction with spatially oriented interventions. For example, lower income individuals with lower educational attainment in New Bedford had lower rates of multiple health-promoting behaviors and higher rates of diabetes.
The spatial microsimulation and linked regression modeling methodology used in New Bedford can be applied to any community across the US, given that we relied on public databases available for any location, and could generalize to other countries given analogous census data and representative population surveys. While needs vary across communities, data-driven insights about geographic and demographic patterns of modifiable health behaviors within communities can be used to create or enhance current community programming and interventions, obtain funding for interventions, or to support policy and advocacy efforts.
While the analytic methods in our study are generalizable and have demonstrated utility in the public health literature, another key dimension of our project involved strong community partner engagement, which allowed us to design and implement statistical models that could directly inform community-scale public health programs. This type of engaged research involves a shift from research on communities (one-way) to bi-directional community engagement in which researchers share power and conduct studies with communities [7]. The benefits of bi-directional community engagement are clear: the researchers develop a better understanding of, communication with, and connection to the community, and the community benefits from the research, tools, and technical expertise, enhancing social capital as well as community empowerment [26]. While there are clear challenges, including initiating, maintaining, and developing a relationship with a community that includes diverse stakeholders, and balancing competing priorities and expectations [27], community-engaged research ultimately leads to improved research quality, community-relevant research, and impactful research that addresses health disparities [28]. The demonstrated tools and data can lead to more targeted programming, better allocation of resources, enhanced community decision-making, and overall improved health of communities.
One limitation of the study is the number of predictors available for the regression models, shown in Table S1. We were constrained to using only variables available in the synthetic population, which was derived from the ACS and does not capture specific behaviors that contribute to the risk of the modeled behaviors and outcomes. There are subpopulations that this methodology does not fully capture, including certain immigrant groups and some English language learners. In 2015, community organizations estimated undocumented immigrants to number 10,000 in New Bedford [29]. These subpopulations will not be represented in administrative databases and may have distinct behavioral patterns, as newly arrived immigrants often reflect the health behaviors and outcomes of their home country, also known as the “healthy immigrant effect” [30]. Because BRFSS is a phone survey conducted in English, barriers to participation include lack of access to a phone and limited English language proficiency. Children are also not captured because there is a lack of information on those under 18 years of age in administrative datasets despite the high interest in and need for such databases.
Another limitation derives from constructing regression models using populations outside of New Bedford. While models built from New Bedford data alone would represent New Bedford more accurately, given potential differences in demographic patterns in other communities in Bristol County, the smaller sample size would contribute to greater statistical uncertainty. We tested our models with both datasets and found very similar results (Table S2). Many communities interested in this method may have much smaller populations, and the insights our models provided for community-level decision-making were robust to this choice, but this issue should be considered carefully when developing models in other locations, as demographic predictors may or may not generalize.
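The trade-off between local representativeness and statistical uncertainty can be illustrated with the standard error of an estimated proportion, which shrinks in proportion to the square root of the sample size. The short Python sketch below is illustrative only: the prevalence and the two sample sizes are hypothetical and are not taken from BRFSS or from Table S2, and a simple Wald interval stands in for whatever uncertainty measure a given analysis would use.

```python
import math


def proportion_standard_error(p, n):
    """Standard error of a sample proportion p estimated from n respondents."""
    return math.sqrt(p * (1.0 - p) / n)


def wald_interval(p, n, z=1.96):
    """Approximate 95% Wald confidence interval, clipped to the range [0, 1]."""
    se = proportion_standard_error(p, n)
    return max(0.0, p - z * se), min(1.0, p + z * se)


# Hypothetical values chosen for illustration (NOT from the study).
assumed_prevalence = 0.10
sample_sizes = {"city-only sample": 400, "county-wide sample": 3600}

for label, n in sample_sizes.items():
    low, high = wald_interval(assumed_prevalence, n)
    print(f"{label} (n={n}): 95% CI approximately {low:.3f} to {high:.3f}")
```

Under these assumed numbers, a ninefold larger sample narrows the interval by a factor of three, which is why pooling respondents from a wider area can be attractive when local samples are small, provided the demographic relationships are similar across areas.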
Our spatial microsimulation methods provide key data to inform decisions, but even setting aside the limited set of available predictors, they do not address the complexity of community issues that contribute to exercise, fruit or vegetable consumption, and diabetes rates. For example, identifying census tracts with low fruit and vegetable consumption to inform community action (such as adding a farmers market) may not lead to increased fruit and vegetable consumption. Improvement of diets is multi-faceted and complex: the produce being sold needs to be culturally relevant, affordable, and accessible [31]. Lastly, a limitation of the maps as a community information tool is that they may be misinterpreted by residents, because the patterns in outcomes are related to sociodemographic risk factor patterns and not necessarily to the geography of where residents live.
Nonetheless, our modeling approach adds value relative to the literature by leveraging spatial microsimulation and regression modeling methods to assist communities with decision-making using publicly available data, which tend to be basic rather than nuanced. Our locally informed models therefore enable researchers to generate data from public databases for communities in a cost-effective, less resource-intensive, and realistic manner, which could be followed up with more extensive surveys or local data collection if needed.