The prevalence of obesity and diabetes has increased drastically across the US over the last two decades [1]. Among the various causes of obesity and diabetes, several public health researchers have found that physical inactivity is one of the common factors [2]. In response, urban planning studies have tried to identify the relationship between physical activity and the built environment. Numerous studies have found that certain urban form features (level of land use mix, accessibility to open space, and well-connected sidewalks) have positive associations with physical activity levels [4]. In particular, several scholars found that urban sprawl, an urban form pattern characterized by low-density housing development and automobile dependency on the fringe of an existing urban area, is positively associated with the risk of being obese and physically inactive [9]. Although a positive association between the built environment and physical activity does not necessarily imply causation, studies have suggested that an active built environment, meaning an urban form that could encourage physical activity, is essential to creating and maintaining a healthier society [13].
Against this background, the main research question is: which locations have more active living potential than others? To answer this question, a comprehensive active built environment index is developed from multiple operationalized variables that may affect the physical activity of residents. Three additional questions follow: first, which areas have more potential for active living; second, which residential types (e.g., single-family housing, multifamily housing) have higher active living potential; third, which residential types are highly clustered with higher opportunities for active living. While the main question simply identifies areas with active living potential, the additional questions demonstrate the capability of the method as a potential toolset that policymakers, urban planners, and public health professionals can use to support the physical activity of residents.
Thus, the objectives are as follows: first, operationalize Geographic Information System (GIS) variables that represent the geospatial dimensions of the active built environment; second, develop a GIS model to assess urban forms, so that the model output indicates which locations have more active living potential than others; third, apply the modeling results to parcel-level data to answer the research questions.
Although the relationships between the built environment and human behavior, especially physical activity, vary by context and encompass multiple aspects, several researchers have tried to identify urban form variables that affect physical activity by reviewing about 50 archival research papers [16]. They found several common variables that affect potential physical activity, including level of land use mix, accessibility to destinations, and street patterns.
Since land use directly defines urban form, numerous studies have tried to identify relationships between land use and physical activity. Specifically, several studies showed that mixed-use development has a robust and positive relationship with physical activity; people are more likely to walk if they live in mixed-use neighborhoods with parks, schools, and commercial destinations nearby [4].
Other studies suggested that the distance to destinations is a crucial part of active living. First, the presence of full-size grocery stores in a neighborhood correlates positively with healthier diet and weight among residents [18]. Second, proximity to schools significantly enhances children’s activity levels; one study reported that youth who live within half a mile of a school had a greater likelihood of walking or biking to school [20]. Third, researchers have argued for the significance of proximity to recreational spaces such as parks, public spaces, and recreational facilities with regard to active living. The optimal distance varies, but people who live closer to recreational spaces are more likely to visit them and to exercise more often [6].
Active transportation modes such as walking and bicycling also play a significant role in providing physical activity opportunities [26]. Public transit is an important variable that encourages walking for both adults and children [27]. Moreover, many studies have stressed the significance of streets as walking space [30]. Regarding bicycling, a well-connected network of bicycle infrastructure is positively associated with bicycle-based physical activity [39]. Another study showed that obesity rates among frequent bicycle users are lower in countries that have accessible bicycle infrastructure [40].
Alongside work identifying the relationship between geospatial variables and physical activity, recent studies have tried to assess active built environments, especially their walkability. Since walkability is a crucial part of physical activity, there have been various attempts to develop new indicators that address issues with conventional walkability measures. A typical walkability measure is Walk Score, which calculates a score on a scale of 0 to 100 by considering walking distance to amenities, population density, block length, and intersection density. This score has been shown to be a valid measurement of walkability in multiple geographic locations [41]. Another method, which measures destination accessibility, is the Neighbourhood Destination Accessibility Index (NDAI), a measure of pedestrian access to neighborhood destinations [43]. The NDAI is derived by computing a score for each destination (a binary score of 1 or 0 according to whether the destination is present or absent within the 800 m buffer) and then combining the scores into a single domain score. While these measures mainly considered destinations, the Pedestrian Environment Index (PEI), which consists of four components (land-use diversity based on an entropy index, population density, commercial density, and intersection density), was developed to measure the pedestrian friendliness of urban neighborhoods [44]. The PEI combines the four components by multiplication because the factors affecting the pedestrian environment are interdependent (i.e., a change in one factor could result in changes in the others). The PEI falls between 0 and 1, with higher values indicating a built environment that better supports pedestrian activity. Another study concluded that considering both residential density and walkable destinations yields good measures of urban walkability that can be recommended for use by policymakers, planners, and public health officials [45]. However, a further study suggested that the Space Syntax Walkability (SSW) method, which employs gross population density and a space syntax measure of street integration, could provide a toolset for further understanding how urban design affects walking behavior [46]. The authors argued that, compared with conventional walkability measures, the SSW method appears to be an effective alternative to full walkability measures in terms of data availability and processing.
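The NDAI and PEI scoring rules described in this paragraph can be illustrated with a minimal Python sketch. The function names, the destination categories, and the sample inputs are hypothetical, and summing the binary scores is an assumption made here for illustration; the published indices define their own domains, weighting, and normalization, which are not reproduced.

```python
from math import prod


def ndai_style_score(distances_m, buffer_m=800.0):
    """Sum of binary presence scores, one per destination type.

    distances_m maps a destination type to a list of distances (in metres)
    from the location being scored to each instance of that type.
    A type scores 1 if any instance lies within the buffer, otherwise 0.
    Summing the binary scores is an illustrative assumption.
    """
    return sum(
        1 if any(d <= buffer_m for d in dists) else 0
        for dists in distances_m.values()
    )


def pei_style_score(land_use_diversity, population_density,
                    commercial_density, intersection_density):
    """Product of four components, each already normalized to [0, 1]."""
    components = [land_use_diversity, population_density,
                  commercial_density, intersection_density]
    for value in components:
        if not 0.0 <= value <= 1.0:
            raise ValueError("each component must lie in [0, 1]")
    return prod(components)
```

Because the components are multiplied, a value of 0 in any one component drives the whole PEI-style score to 0, which is the practical consequence of treating the factors as interdependent rather than additive.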
Both strands of research, those identifying variables of the active built environment and those objectively measuring the built environment for active living, have recognized the lack of objective measurement of the active built environment; they suggested that objective and reliable measures would benefit both urban planning and public health studies [16]. Using the capability of GIS is considered an especially feasible way to generate objective measures for studies of individuals and neighborhoods.
2. Materials and Methods
The study was undertaken in Jacksonville, Florida, which was selected because it is the largest city in Florida and has one of the state's highest diabetes rates [49]. As noted in the previous section, diabetes is positively associated with physical inactivity. Therefore, Jacksonville might have a lower physical activity level among its residents than other cities in Florida. Another reason for choosing Jacksonville is that the detailed GIS data required are available for the city.
Despite the complexities inherent in urban form and physical activity, a considerable number of studies have shown positive relationships between physical activity and certain geospatial dimensions such as mixed land use, proximity to destinations, and active transportation modes. Specifically, the destinations include grocery stores, schools, and recreational spaces (e.g., parks, public spaces, public facilities, and recreational facilities). The active transportation modes comprise walking on sidewalks, bicycling on bike lanes, and walking to public transit.
The data used and their sources are described below. To compute the level of land use mix, land use and census data are used. I obtain land use data from the Florida Department of Transportation (FDOT), which maintains generalized land use derived from parcels. For census data, 2010 census block boundaries from the US Census Bureau are used. For the identified destinations, parcel-level data are employed, which include the specific use of each parcel based on the Florida Department of Revenue (FDOR) land use classification. I obtain the parcel data from the Florida Geographic Data Library (FGDL). The detailed descriptions of each destination are as follows:
Grocery stores: supermarkets, neighborhood or community shopping centers
Schools: public schools or colleges
Public facilities: facilities operated by municipalities other than public schools, colleges, military, or correctional facilities
Recreational facilities: theaters, auditoriums, or sport facilities
Parks: public parks
Public spaces: outdoor recreational spaces other than parks
It should be noted that the aforementioned destinations could be combined into one variable or kept as several. However, since each destination could represent unique walking or physical activity opportunities, detailed parcel data are employed instead of a single variable. Public transit information from the Florida Transit Information System (FTIS) and bike lane features from the Roadway Characteristics Inventory (RCI) of FDOT are also used. Regarding sidewalks, Florida’s unified street data are employed, which contain detailed street networks and a functional classification of roads.
Since the factors affecting physical activity do not form a single, unified dimension, the proposed methodology must be able to handle complex datasets. Thus, to analyze multivariable geospatial data, a suitability model is employed because it works well with multipart hierarchical variables and large datasets using rasterized layers [50]. Raster layers, which consist of cells arranged in rows and columns, are a commonly used data type in GIS; they can represent geographic information and be used in raster-based modeling and analysis. Suitability modeling provides several additional benefits. First, since studies have sometimes failed to investigate regional contexts, suitability modeling helps visualize spatial patterns of the active built environment within a regional context. Second, the suitability model not only handles large and complex data effectively but also has the flexibility to add or remove relevant dimensions that constitute an active built environment.
Before starting the modeling, determining the proper cell size for analysis is critical because the cell size affects the resolution of results, model performance, and disk storage requirements. Therefore, a 5-m cell size is used to capture the details of urban forms while maintaining acceptable workstation performance in ArcGIS.
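The trade-off between cell size and processing cost can be made concrete with a rough cell count. The sketch below is illustrative only: the function name and the 10 km by 10 km extent are hypothetical and do not describe the actual study area or any ArcGIS setting.

```python
from math import ceil


def raster_cell_count(width_m, height_m, cell_size_m):
    """Number of cells needed to cover a rectangular extent.

    Partial cells at the edges are rounded up to whole cells.
    """
    if cell_size_m <= 0:
        raise ValueError("cell_size_m must be positive")
    return ceil(width_m / cell_size_m) * ceil(height_m / cell_size_m)


# Hypothetical 10 km x 10 km extent at two cell sizes.
cells_at_5m = raster_cell_count(10_000, 10_000, 5)
cells_at_30m = raster_cell_count(10_000, 10_000, 30)
```

For this hypothetical extent, a 5-m grid needs 4,000,000 cells per layer while a 30-m grid needs 111,556, so halving the cell size roughly quadruples the storage and processing load of every layer in the model.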
The accompanying figure illustrates the conceptual process used in this study to develop a method for measuring active living potential from variables identified in previous research. First, I identify the geospatial variables that are positively associated with physical activity. Second, the model converts each identified variable into a suitability layer according to its type of measurement. Based on the characteristics of each variable, three types of measurement (entropy index, walking catchment areas using the street network, and proximity to routes) are used in the model. After generating the individual layers, the model merges them into a single raster layer, the active built environment index, which represents the active living potential of the study area. Last, using the parcel boundaries as zones, the model computes the average active built environment index score for each parcel. As a result, each parcel has a score indicating its active living potential.
Based on the nature of the variables, the first step establishes three types of measurement. The first is the entropy index, which is used to represent the level of land use mix because it is one of the most common methods in geography and urban planning [51]. Using the census block boundaries and land use data, the entropy index is calculated from the proportion of each land use and the count of unique land uses. As a result, each census block has a value between 0 and 1: the index is 1 when land use is maximally diverse and 0 when land use is maximally homogeneous. Finally, the vector layer is converted into a raster layer.
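As an illustration of the entropy calculation described above, the following Python sketch uses the common normalized form E = -sum(p_i * ln p_i) / ln k, where p_i is the proportion of the block's area in land use i and k is the count of unique land uses. The function name and the sample categories are hypothetical, and taking k as the number of land uses actually present in the block is an assumption; the source does not state whether k counts present uses or all classified uses.

```python
from math import log


def entropy_index(areas_by_land_use):
    """Normalized land use entropy for one census block.

    areas_by_land_use maps a land use category to its area in the block.
    Returns 0.0 for a single-use block and 1.0 when all present uses
    occupy equal shares. k is assumed to be the number of uses present.
    """
    areas = [a for a in areas_by_land_use.values() if a > 0]
    k = len(areas)
    if k <= 1:
        return 0.0  # one use (or none): maximally homogeneous
    total = sum(areas)
    return -sum((a / total) * log(a / total) for a in areas) / log(k)
```

A block split evenly between two uses scores 1, a single-use block scores 0, and uneven mixes fall in between, which matches the 0 to 1 range described for each census block.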
The second measurement is walking catchment areas around destinations. This measurement determines pedestrian proximity to the identified destinations (i.e., grocery stores, schools, parks, public spaces, public facilities, and recreational facilities). For example, the measurement for grocery stores generates network walking catchment areas at a 5-min walking distance from grocery stores, since walking is the most efficient and common way to achieve a recommended level of physical activity in everyday life. However, it should be noted that the likelihood of walking does not have a simple linear relationship with walking distance. That is, defining a specific range, such as only within a 5-min walking distance, as the most suitable value might be difficult because some residents who live more than a 5-min walk from a grocery store might still walk there. To avoid missing such walking opportunities, a fuzzy membership is employed during the rasterizing process. The fuzzy membership tool allows for specifying the likelihood that a given value is a member of a set rather than just deciding whether the value is inside or outside the set [52]. In other words, as the distance from a grocery store decreases, a person is more likely to walk to it, so the locations are more likely to be members of the suitable set. Thus, a new layer is created with corresponding fuzzy membership values ranging from 0 to 1, where 1 represents the highest opportunity to walk. Another benefit of using fuzzy membership is that the output value is on the same scale as the entropy index, which also ranges from 0 to 1. This common scale simplifies the overlay process and helps avoid the need to reclassify data. Since this study looks at active living potential, representing the likelihood of walking instead of simple distance from each destination is appropriate. Thus, for the destination variables, fuzzy membership is used to create suitability layers with values from 0 to 1.
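The idea of grading membership by distance, rather than cutting it off at a fixed threshold, can be sketched as follows. This is a simplified illustration, not the membership function used in the study, which the text does not specify: the function name, the piecewise-linear shape, and the default distances (400 m as a stand-in for a 5-min walk, 800 m as the point where membership reaches 0) are all assumptions.

```python
def fuzzy_distance_membership(distance_m, full_m=400.0, zero_m=800.0):
    """Membership in the 'walkable' set as a function of network distance.

    Returns 1.0 up to full_m, decreases linearly between full_m and
    zero_m, and returns 0.0 beyond zero_m. Both thresholds are
    hypothetical defaults chosen only for illustration.
    """
    if zero_m <= full_m:
        raise ValueError("zero_m must be greater than full_m")
    if distance_m <= full_m:
        return 1.0
    if distance_m >= zero_m:
        return 0.0
    return (zero_m - distance_m) / (zero_m - full_m)
```

A cell 600 m from a grocery store receives a membership of 0.5 under these assumed thresholds instead of being excluded outright, which is the behavior the fuzzy approach is meant to capture. Other membership shapes could be substituted without changing the 0 to 1 output scale.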
The last measurement determines proximity to the available transportation options: public transit, sidewalks, and bikeways. Public transit routes and pedestrian pathways are used for assessing potential walkability, and bikeways are used for identifying potential physical activity via bicycle use. For example, to create the public transit suitability layer, the model creates walking catchment areas at a 5-min walking distance from bus routes. For bike lanes, however, this study uses ½-mile and 1-mile buffers around bikeways, since biking is another way to achieve physical activity. At the end of each process, a suitability layer with values between 0 and 1 is created using the fuzzy membership.
After creating the individual suitability layers, the last step in the model combines the outputs of the previous processes to generate the active built environment index. Cells with high combined scores represent locations with higher suitability for physical activity. Because 10 variables are used and the maximum score of each variable is 1, a perfect score representing the maximum potential for physical activity would be 10. Although this study does not weight the variables in the composite scoring process, weighting layers differently based on their importance is generally possible.
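The overlay step amounts to a cell-by-cell sum of the suitability layers. The sketch below shows this with small nested lists standing in for rasters; the function name and the optional weights argument are hypothetical, and the weighted variant is included only to show how weighting could work, since the study itself uses equal weights.

```python
def combine_layers(layers, weights=None):
    """Cell-by-cell weighted sum of equally sized suitability grids.

    layers is a list of 2D grids (lists of rows) with values in [0, 1].
    With weights=None every layer counts equally, so the maximum
    possible cell score equals the number of layers.
    """
    if not layers:
        raise ValueError("at least one layer is required")
    if weights is None:
        weights = [1.0] * len(layers)
    if len(weights) != len(layers):
        raise ValueError("one weight per layer is required")
    rows, cols = len(layers[0]), len(layers[0][0])
    for layer in layers:
        if len(layer) != rows or any(len(row) != cols for row in layer):
            raise ValueError("all layers must have the same shape")
    return [
        [sum(w * layer[r][c] for w, layer in zip(weights, layers))
         for c in range(cols)]
        for r in range(rows)
    ]
```

With ten unweighted layers, a cell that scores 1 in every layer reaches the maximum index value of 10, as described above.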
Finally, the active built environment index enables another type of spatial analysis. Since the cell size of the index map is 5 m, the cells can be aggregated to each parcel. Using the zonal statistics function, the average value of the scored cells that fall within a parcel is assigned to that parcel. The average value of each parcel is used to answer the research questions: which residential types have more active living potential, and how is active living potential clustered by residential type?
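Conceptually, the zonal averaging step groups index cells by the parcel they fall in and takes the mean of each group. The following sketch shows that logic on small grids; it is not the ArcGIS tool itself, and the function name, the parcel identifiers, and the use of None for cells outside any parcel are assumptions made for illustration.

```python
from collections import defaultdict


def zonal_mean(index_grid, zone_grid):
    """Mean index value per zone (parcel).

    index_grid holds the index score of each cell; zone_grid holds the
    parcel ID of the same cell, or None where no parcel covers it.
    Both grids must have the same shape.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for index_row, zone_row in zip(index_grid, zone_grid):
        for value, zone in zip(index_row, zone_row):
            if zone is None:
                continue  # cell lies outside every parcel
            totals[zone] += value
            counts[zone] += 1
    return {zone: totals[zone] / counts[zone] for zone in totals}
```

Each parcel thus receives one averaged score that can be joined back to the parcel attributes, such as residential type, for the comparisons posed in the research questions.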
My a priori assumption was that higher active built environment index values would be found mostly in the central area of the city because of the available destinations and street network. As expected, areas with more active living potential are found inside I-295. However, other high-index areas are located along several streets outside the central area, since these streets are close to the identified destinations (e.g., grocery stores, schools, and recreational spaces) and active transportation modes (e.g., bus routes and sidewalks). Although these extended areas along the streets turn out to be mostly spatial outliers (see Figure 5), this could also imply that urban planning interventions, such as adding new parks, might help residents live in active built environments.
To objectively measure the ways in which the built environment affects physical activity, this paper describes the development of a comprehensive active built environment index that includes relevant GIS variables representing the geospatial dimensions of active living potential. The cell-based results of suitability modeling can not only identify areas in which residents are most likely to engage in physical activity but also show urban planners and public health professionals which other areas need to be developed to increase a region’s conduciveness to physical activity. The Jacksonville case study demonstrates both GIS-based visualization results and a potential toolset that policymakers, urban planners, and public health professionals can use to support the physical activity of residents. That is, urban planners or public health professionals could use the visualized results of this study as a broad resource for enhancing physical activity opportunities in the study area. In addition to visualizing existing conditions, the model could simulate and visualize the effect of a proposed planning or design intervention. Also, although this study employed a static model with only ten equally weighted variables, increasing the GIS capacity could support a more dynamic model that is more responsive to changes in variables or weights over time.
This study employs parcel-level data to represent the variables of the active built environment identified in a considerable number of related studies. In the modeling process, however, the destinations derived from parcel data do not perfectly match those in the existing literature. For example, to identify grocery stores, parcels containing supermarkets and neighborhood or community shopping centers are used. However, small retail shops might serve as a neighborhood grocery store in certain areas. Although the FDOR data provide a certain level of detail, they might not reflect the actual function of each parcel, so this standardized database may miss some information.
Regarding the entropy index, although the entropy calculation is a generally accepted method for calculating land use diversity, several criticisms of the method should be noted. First, it has been shown that associations between walking and land use mix depend on which land uses are included in the calculation of entropy scores [55]. Second, land use mix might not be a strong predictor of walking: a review has shown that 2 out of 4 studies on land use mix (entropy) and walking for transport found non-significant associations between them [56]. Third, entropy scores may not be calculated correctly with respect to the denominator, which is the count of land uses [57]. Despite these debates over how land use diversity is calculated, considerable research has concluded that land use diversity has a positive association with physical activity [4].
Another measurement used in the GIS model is a proximity measure based on raster surfaces. This measurement is close to an accessibility measurement, which typically computes the ease of access to available destinations using the distances between origins and destinations as its core frame [58]. This approach is convenient and valuable because it can easily be associated with zone-based datasets such as census boundaries or transportation analysis zones. However, it has been recognized that zone-based analysis is subject to the modifiable areal unit problem (MAUP), in which the choice of boundaries, such as census block, census block group, or zip code, affects the accessibility results [58]. In other words, accessibility might not be identical even within the same parcel or census block. Since the aforementioned methodologies used zone-based analysis, they might be subject to the MAUP [41]. However, the cell-based method used in this study mitigates the MAUP through a controllable cell size. Since a 5-m cell was used as the unit of analysis, the cell values can be applied to any geographic unit, even down to the parcel level. The results section demonstrates how the raster layer is applicable to other geographic units by using the average cell scores within each parcel instead of a single value per parcel.
However, this study has several limitations. First, it uses only quantitative data. Researchers have argued that qualitative variables such as safety, comfort (which could include traffic speed or speed limits), and pleasurability might affect walkability [62]. However, applying these qualitative variables requires examining the local context of the study area, since their significance might differ by context. In future research, after thorough study of local conditions, these qualitative variables could be considered in the modeling process. Second, this study uses equal weights for the combined layers because it focuses on developing a GIS tool to evaluate the active built environment. However, typical suitability modeling uses weighted layers to allow each variable to affect the model differently. Thus, future research may consider establishing an appropriate weight for each variable. Finally, this study assesses the built environment in terms of physical activity potential. However, recent developments in smartphone and smartwatch technology are enabling researchers to analyze the actual movements of people [63]. Therefore, in future research, tracking the movement patterns of residents via wearable smart devices would enable examination of the relationships between actual physical activity and the active living potential measures developed in this study. Such data would greatly enhance the methodology adopted in this study.