Article

From Perception to Behavior: Exploring the Impact Mechanism of Street Built Environment on Mobile Physical Activity Using Multi-Source Data and Explainable Machine Learning

1
School of Architecture, Southwest Jiaotong University, Chengdu 610031, China
2
School of Design, Southwest Jiaotong University, Chengdu 610031, China
3
Information and Network Management Center, Xihua University, Chengdu 610031, China
4
SWJTU-LEEDS Joint School, Southwest Jiaotong University, Chengdu 610031, China
*
Author to whom correspondence should be addressed.
Land 2025, 14(12), 2315; https://doi.org/10.3390/land14122315
Submission received: 15 October 2025 / Revised: 15 November 2025 / Accepted: 17 November 2025 / Published: 25 November 2025

Abstract

This study explores the mechanisms through which the street built environment (BE) influences mobile physical activity (MPA) using multi-source data and explainable machine learning methods. The research combines Geographically Weighted Regression (GWR) and Random Forest (RF) models to reveal the complex spatial heterogeneity between BE factors and MPA, and enhances the interpretability of results through the SHAP model, providing theoretical support for future targeted urban planning and MPA interventions. The study finds that the “density” dimension of BE plays a crucial role in MPA, particularly population density and building density. Additionally, accessibility and safety also significantly influence MPA, while design factors such as greening rates, water landscapes, and building façade design promote MPA. The study emphasizes that the influence of BE factors on MPA is nonlinear, with significant interaction effects between different variables, indicating that improving a single variable alone cannot fully explain changes in MPA. This research provides a new theoretical perspective for understanding the impact of BE factors on MPA and offers empirical evidence for precise interventions. In areas with low MPA participation, improving street design, enhancing traffic safety, and increasing green and water-friendly spaces can significantly promote residents’ MPA, thereby improving public health.

1. Introduction

Since the beginning of the 21st century, the global health landscape has been undergoing profound changes, with chronic non-communicable diseases becoming the primary threat to human physical and mental health [1]. Physical activity (PA) is widely regarded as the cornerstone of maintaining and promoting health [2]. Regular PA can reduce the risk of chronic diseases and offers numerous benefits for both physical and mental health. Among the various types of PA, mobile physical activities (MPAs) such as running and cycling have become key targets for promoting public PA because they require little infrastructure investment and cost little to practice. Numerous studies have also shown that optimizing the built environment (BE) can significantly enhance the public’s behavioral adherence to MPA [3,4,5,6,7].
Because the BE can effectively promote MPA, many scholars have studied the relationship between BE factors and MPA in order to develop scientifically effective MPA interventions. Previous work on the measurement of BE factors mainly focused on the classic “5Ds” model [8], which includes factors such as the accessibility of service facilities [9,10], population density [6,11,12], intersection density [13,14], landscape features [15,16,17], and visual quality [4,5,18]. Existing research indicates that the BE affects MPA in three main ways. First, BE factors such as road intersections, traffic lights, and road network density significantly affect the frequency of MPA. These traffic-related features disrupt the continuity of MPA, thereby reducing its frequency and the attractiveness of the environment. Second, the greening level of the neighborhood environment, including green spaces, green coverage, and water-friendly spaces, is closely linked to MPA. Third, the BE also influences MPA in terms of functionality, with mixed land use, residential density, and open space density showing certain correlations with MPA. Regarding mixed land use, some studies have shown a significant effect on commuting-related MPA, but its effect on recreational MPA remains uncertain [6].
Although many scholars have explored the impact of different BE variables on MPA, most existing studies focus on objective environmental measurements and lack a systematic consideration of human perception. In reality, human behavior is not determined solely by the physical environment; perception plays a crucial role [18]. Environmental psychology and social cognitive theory hold that the individual, behavior, and environment constitute an interconnected and interactive whole. The impact of the environment on individual behavior is usually mediated by cognitive mechanisms [19]. Among these mediating mechanisms, perception serves as the “input end” of the cognitive process: environmental perception provides actors with raw information and determines the scope of content that the cognitive system can process. Cognition acts as the “mediating end” of behavior: perceived information is transformed into behavior only after undergoing cognitive processes (which generally include psychological processes such as value evaluation and meaning construction). In short, in the interactive process of “environment-behavior”, individuals first “perceive” the environment through their sensory systems, then form “cognition” of the environment based on this “perception”, and finally, this mediating “cognition” influences their specific behavioral decisions [20]. An individual’s perception of the environment thus influences their behavioral choices. For example, certain streets in urban areas may be more likely to attract people to walk or cycle due to greenery, a sense of openness, or perceived safety. These perceptual factors often have a more direct impact on MPA behaviors than objective spatial structural factors.
Therefore, from a perceptual perspective, exploring the mediating role of perception in behavior can help reveal the complex relationship between the BE and MPA, fill gaps in existing research, and provide theoretical support for the development of precise intervention strategies.
On the other hand, the impact of BE on MPA is spatially heterogeneous [21], meaning that the influence of BE on residents’ MPA may vary significantly across different geographic regions. Existing statistical methods often assume that the influence of BE on MPA is homogeneous across space, an assumption that overlooks the differences in BE characteristics and residents’ MPA behaviors in different areas [5,6,13,22]. Therefore, it is crucial to adopt analytical methods that account for spatial heterogeneity [21]. This study employs a hybrid model that combines Geographically Weighted Regression (GWR) and Random Forest (RF), incorporating spatial weight matrices and machine learning algorithms to handle spatial heterogeneity and non-linear effects. The GWR model assigns different weights to each sample based on geographic location, capturing the variations in local environmental factors, while the RF model efficiently handles complex non-linear relationships, helping to uncover the hidden patterns between BE and MPA. By combining spatial heterogeneity analysis and machine learning methods, we can more accurately reveal the mechanisms through which BE influences MPA.
The innovations of this study are mainly reflected in the following aspects: First, it uses micro-scale street-level data combined with a perceptual perspective to analyze how individuals’ perceptions of the street BE influence their MPA behavior. Second, it introduces spatial heterogeneity and nonlinear analysis methods, enhancing the model’s ability to capture the complex relationship between BE and MPA. Finally, by incorporating the SHAP model, the study provides a clearer framework for the interpretability of the results, further increasing the practical and policy guidance value of the research.
By combining the perception perspective with spatial heterogeneity machine learning methods, the research framework proposed in this study provides greater explanatory power for street-level MPA research. This framework not only reveals the impact mechanisms of various factors in the BE but also provides data support for urban planners and policymakers to implement targeted interventions. Ultimately, this research offers new theoretical foundations and practical pathways for the construction of healthy cities and the promotion of MPA.

2. Materials and Methods

2.1. Framework

The technical framework and workflow of this study are shown in Figure 1. It mainly consists of the following three parts:
(1)
Data processing and variable calculation, which includes improving the traditional “5Ds” built environment indicator system from a perceptual perspective to form the built environment variable set used in this study;
(2)
Model training and result interpretation, which involves the construction and interpretation of the GW-RF machine learning regression model, balancing the “perception-behavior” logical framework with the “spatial heterogeneity processing ability”;
(3)
Result analysis, which includes variable importance, direction of effects, nonlinear effects, variable interactions, and spatial interpretation of the effects of some variables.

2.2. Study Area

Chengdu is located in Sichuan Province, China, and has a subtropical monsoon climate with an average annual temperature of 16 °C, making the climate pleasant. The region has a flat terrain, crisscrossing river networks, and beautiful scenery, and is known as the “City of Leisure”. As the capital of Sichuan Province and a core city in the Chengdu-Chongqing Twin-City Economic Circle, Chengdu has a permanent population of 21.192 million. The roads in the main urban area extend radially from the center to the surrounding areas, with flat roads and numerous sidewalks. Chengdu’s superior natural conditions such as climate and terrain, along with the guidance of leisure and fitness culture, have created a favorable environment for residents in this region to engage in outdoor MPA [23].
This study takes the main urban area of Chengdu as the research object, covering Jinniu District, Qingyang District, Jinjiang District, Wuhou District, and Chenghua District. As the core area of Chengdu’s population and economy, the main urban area is not only the political, cultural, and commercial center of Chengdu, but also the main place where residents engage in PA. Because the public does not treat administrative boundaries as rigid activity boundaries when engaging in PA, this study defines the research area by expanding the boundary of Chengdu’s main urban area outward with a 1500 m buffer zone; the scope is shown in Figure 2. Details of the administrative districts and population within the research scope are shown in Table 1.

2.3. Datasets

2.3.1. PA Data

Users’ PA data were recorded by the Keep App (www.keep.com) (accessed on 20 June 2024) and do not involve personal privacy. The Keep App, along with the Codoon App and the Yuepaoquan App, is one of the three most popular outdoor fitness tracking apps in China and has a large user base. The popular routes on Keep are spontaneously uploaded by users to share high-quality routes in urban spaces that they consider safe and open to the public. They represent a user-filtered collection of locations in the city that are suitable for MPAs such as running, walking, and cycling [7]. Each popular route on the Keep App includes the following information: route name, route ID, route location, venue type, route length, check-in count, proportion of PA types, route creation date, and route shape (see Table 2 for detailed data structure). The study area includes a total of 631 popular routes, of which 212 are labeled as street routes. Since the route data is volunteered geographic information (VGI), some users mistakenly marked non-street routes (such as park routes and playground routes) as “street routes” during creation. After manual verification and exclusion of these mislabeled routes, 200 popular routes were retained as the foundational data for this study.
Finally, existing studies [7] have shown that the user group of the Keep App is mainly concentrated in the 25–40 age group. This user structure makes the conclusions of this paper more applicable to young and middle-aged urban residents, and their generalizability to other age groups should be treated with caution.
This study completed the vectorization of popular routes using the ArcGIS platform. First, the geographic coordinates of the routes were used to determine their spatial location. Next, the vectorization of the routes was manually completed within the study area’s single street network. For each street (segment) passed by a popular route N times, the “Route-Count” attribute in the street’s attribute table is recorded as N, ensuring that the PA indicators of streets passed by multiple routes or multiple times by a single route are accurately recorded and reasonably calculated. The vectorization results are shown in Figure 3.
Given that the basic unit of study is the “street,” specifically referring to the segments in the single-line network vector map formed by processing the study area’s road network into a single centerline, the 200 popular routes are marked onto the study area’s single-line network vector map. This involves 2019 street segments, with the longest segment measuring 1364.27 m, the shortest measuring 18.71 m, the average length being 202.94 m, and a standard deviation of 154.15 m. A summary of the vectorized data for the popular routes within the study area is shown in Figure 4a.
Since the GW-RF model in the later sections requires spatial coordinates for each sample, and “street segments” as linear vector features are difficult to incorporate directly into the calculation, they need to be abstracted as individual “points” in space, with the “point coordinates” then used as input for the model. Therefore, based on the ArcGIS platform, this study uses the “Feature to Point” tool to convert the line vector features of each street segment into point vector features. It should be noted that in the ArcGIS platform, when converting line features to point features, the resulting points are not necessarily the “midpoints” of the line features. For a straight line feature, the generated point is located at the midpoint of the line segment; for a polyline or curved line feature, however, the resulting point is located at the weighted average of the x and y coordinates of all the segment midpoints. Specifically, assuming a polyline consists of n segments, the coordinates of the midpoint of each segment are weighted (with the weight being the length of each segment) and averaged to obtain the coordinates of the generated “point feature.” The spatial relationship after converting the polyline is shown in Figure 4b. Although the generated points are not strict midpoints, manual verification confirms they are close to the centroids of the segments and can represent the core locations of the streets. Furthermore, the maximum point offset (≤30 m) is far smaller than the optimal bandwidth of 1800 m for the GW-RF model (the bandwidth is detailed in Section 3.1, “Model Performance”), so its impact on the spatial weights of the GW-RF model is negligible.
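The length-weighted averaging described above can be illustrated with a short sketch. It is not the ArcGIS implementation; the function name `representative_point` and the example coordinates are hypothetical, and the vertices are assumed to be in a projected (metric) coordinate system.

```python
import math


def representative_point(vertices):
    """Length-weighted mean of the segment midpoints of a polyline.

    vertices: list of (x, y) tuples, assumed to be projected coordinates in metres.
    """
    if len(vertices) < 2:
        raise ValueError("a polyline needs at least two vertices")
    total_len = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        # Each segment midpoint is weighted by the length of its segment.
        sum_x += seg_len * (x0 + x1) / 2.0
        sum_y += seg_len * (y0 + y1) / 2.0
        total_len += seg_len
    if total_len == 0.0:
        # Degenerate polyline: all vertices coincide.
        return vertices[0]
    return (sum_x / total_len, sum_y / total_len)


# Hypothetical examples: a straight segment and an L-shaped polyline.
straight = representative_point([(0.0, 0.0), (10.0, 0.0)])
l_shaped = representative_point([(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)])
```

For the straight segment the result is the ordinary midpoint, while for the L-shaped polyline the point falls off the line itself, which is why the offset check against the bandwidth matters.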

2.3.2. Multi-Source Urban Data

The multi-source urban data used in this study includes road network data, water body data, population raster data, street view image data, point of interest data, and building footprint data.
The road network data is sourced from OpenStreetMap. After single-centerline processing and manual verification, a total of 25,224 street (segment) samples were obtained. Water body data also comes from OpenStreetMap, including both linear and polygonal water body data. After manual verification, the polygonal water body data covers the wider river sections of the Fu River, Nan River, Jinjiang, Jiang’an River, Sha River, and Dongfeng Canal that flow through the study area, as well as natural and artificial lakes within the study area. The linear water body data primarily covers the narrower rivers and canals in the remaining parts of the study area. Population raster data is sourced from the WorldPop open data platform. Street view image data is from Baidu Map. Based on the ArcGIS platform, this study generated a total of 51,127 street view sampling points at 50 m intervals using the study area’s single-line network vector map as the base. For road segments shorter than 50 m, the midpoint was used as the sampling point. Through the Baidu Street View API, a total of 43,271 unique street view images were collected. Building data is also sourced from Baidu Map. By calling the Baidu Map API and comparing the results with manually sampled satellite imagery, vector data for a total of 133,239 individual buildings (including building footprints and number of floors) within the study area were obtained. The POI data is from Amap. By calling the Amap API and filtering within a 50 m buffer zone on both sides of the street using the ArcGIS platform, a total of 196,291 POI points were identified. The data sources and detailed descriptions are provided in Table 3.
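The street view sampling rule (one point every 50 m, with the midpoint used for segments shorter than 50 m) can be sketched as follows. This is an illustration, not the ArcGIS workflow used in the study: the function names are hypothetical, and placing the first point 50 m from the start of the segment is an assumption, since the text does not state where the first point falls.

```python
import math


def point_at_distance(vertices, dist):
    """Interpolate the (x, y) position `dist` metres along a polyline."""
    travelled = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        if seg_len > 0.0 and travelled + seg_len >= dist:
            offset = dist - travelled
            return (x0 + (x1 - x0) * offset / seg_len,
                    y0 + (y1 - y0) * offset / seg_len)
        travelled += seg_len
    return vertices[-1]  # dist is at or beyond the end of the line


def sampling_points(vertices, interval=50.0):
    """Sampling points every `interval` metres; midpoint for short segments."""
    length = sum(math.hypot(x1 - x0, y1 - y0)
                 for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]))
    if length < interval:
        return [point_at_distance(vertices, length / 2.0)]
    count = int(length // interval)
    # Assumption: the k-th point sits k * interval metres from the start.
    return [point_at_distance(vertices, k * interval) for k in range(1, count + 1)]
```

Under these assumptions, a straight 120 m segment yields points at 50 m and 100 m, and a 30 m segment yields a single point at 15 m.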

2.4. Variables

2.4.1. Dependent Variables

Existing research has indicated that BE factors are the main determinants of the intensity of PA rather than its frequency [7]. Therefore, in this study, when extracting PA indicators, the per capita PA intensity is used as the outcome variable for the model, with the calculation logic as follows:
a. Based on the original data of each popular route, calculate the annual average check-ins W1.
Assuming that the cumulative check-ins for a certain popular route are W0, and the route has been active for N months up to the data collection time (June 2024), the annual average check-ins are W1 = W0 × 12 / N.
b. Based on the annual average check-ins, calculate the annual PA intensity I0 for the street segments passed by the popular route.
Assuming that the popular route passes through street segment A n times and that the length of segment A is L, the total annual PA intensity for segment A is I0 = W1 × n × L.
c. Based on the total population near the street segment, calculate the annual per capita PA intensity I1 for the segment.
Considering the proximity characteristic of the public’s daily PA, a buffer zone with a 250 m radius (block scale) is defined around street segment A. The total population P within this buffer zone is extracted from the population raster data, and the annual per capita PA intensity for the segment is I1 = I0 / P.
In summary, the formula for calculating physical activity intensity is:
$$
I_1 = \frac{I_0}{P} = \frac{W_1 \times n \times L}{P} = \frac{W_0 \times \frac{12}{N} \times n \times L}{P}
$$
Finally, the descriptive statistical indicators of PA intensity for all street segment [7] samples in this study are as follows: the number of sample segments is 2019, with a maximum value of 1698.04 m, a minimum value of 0.04 m, an average of 27.12 m, and a standard deviation of 104.94 m. The PA intensity indicator exhibits a distinct concentric distribution pattern, with a general “inner ring low, outer ring high” distribution around the Third Ring Road of Chengdu, as shown in Figure 5.
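Steps a to c of the intensity calculation can be written as one small function. This is a minimal sketch; the function name, argument names, and example values are hypothetical and are not taken from the study’s data.

```python
def per_capita_pa_intensity(cumulative_checkins, active_months, passes,
                            segment_length_m, population):
    """Annual per capita PA intensity I1 for one street segment and one route.

    cumulative_checkins: W0, total check-ins on the route
    active_months:       N, months the route has been active
    passes:              n, times the route passes through the segment
    segment_length_m:    L, segment length in metres
    population:          P, population within the buffer around the segment
    """
    if active_months <= 0:
        raise ValueError("active_months must be positive")
    if population <= 0:
        raise ValueError("population must be positive")
    annual_checkins = cumulative_checkins * 12.0 / active_months     # W1
    annual_intensity = annual_checkins * passes * segment_length_m   # I0
    return annual_intensity / population                             # I1


# Hypothetical example: 600 check-ins over 24 months, one pass over a
# 200 m segment, with 5000 residents in the buffer.
example_i1 = per_capita_pa_intensity(600, 24, 1, 200.0, 5000)
```

With these made-up inputs, W1 is 300 check-ins per year, I0 is 60,000 m, and I1 is 12 m per person per year, which matches the unit (m) of the descriptive statistics.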

2.4.2. Environmental Variables

This study is based on the classic “5Ds” framework for describing BE indicators. On this foundation, on one hand, it takes public perception as the starting point and uses the “visually perceptible maximum scale” as the spatial scale basis for statistical indicators (psychologically, it is believed that the maximum distance for social interaction on streets is 100 m [24,25,26,27,28]). The study selects BE elements that the public can directly or indirectly observe as environmental variables. On the other hand, considering the significant impact of safety perception on public behavior, the “Safety” dimension is added to the “5Ds” framework as a supplement. The definition and description of environmental indicators are detailed in Table 4, and the descriptive statistics of the indicators are shown in Table 5.
Chengdu is a popular tourist city in China. Theoretically, tourist-preferred routes may interfere with the data’s reflection of local residents’ route selection tendencies. However, through the analysis of the spatial distribution characteristics of the selected popular routes, it is found that these routes do not show abnormal aggregation in core tourist attractions such as Chunxi Road and Kuanzhai Alleys. Instead, they avoid these tourist-dense areas to a certain extent in space. Based on this empirical result, we believe that the dataset can still effectively reflect the environmental selection preferences of local residents, and the interference of tourist routes on the research conclusions is relatively small.
Density
In the “5D model” framework, density refers to the concentration of population, buildings, or facilities within a unit area, reflecting the degree of resource concentration in urban space. Common indicators include population density, building density, and POI density. This study adopts the same 3 indicators, but with a difference in scale compared to previous research. For population density and building density, the street edge is taken as the baseline, and a bidirectional buffer zone is applied within a 100 m range, which is the maximum scale of human visual perception. For POI density, only street-facing POIs are included in the statistics to calculate their density. The rationale behind this is that for individuals engaging in regular PA on the street (regular PA is defined as exercise for fitness purposes, rather than for leisure, entertainment, or commuting), the functional attributes provided by commercial facilities (POIs) along the street are not the main focus. Instead, the perception of such individuals is influenced by the decorative storefront designs, such as signs and windows, and the customer flow entering and exiting the stores. Therefore, non-street-facing POIs should have no perceptible effect on this group.
Diversity
The diversity in the built environment reflects the degree of mixing of different functional spaces within a region, aiming to demonstrate the balance and complexity of regional functions. Land use function mix and POI diversity are commonly used to measure the diversity of facilities. Land use function mix is typically calculated by computing the area proportion of different land use types within a plot to determine the entropy value. On the one hand, this approach is more suitable for research scenarios at the scale of blocks or administrative units. On the other hand, since land use data is usually based on land use planning maps, there is often a certain discrepancy between the data and the actual situation.
POI data is a point-based dataset and therefore cannot fully reflect size differences between facilities (e.g., a convenience store and a large supermarket are both categorized as commercial facilities but offer significantly different shopping opportunities and attractions). However, it is open-source, and its precise geographic coordinates facilitate research across different spatial scales. POI data can also accurately mark the physical locations of facilities serving different functions, such as commercial, residential, and public services, allowing the density and diversity of different functional facilities to be measured.
Therefore, this study uses POI data that reflects different types of functional facilities (including living and education facilities, transportation facilities, and recreational and sports facilities) to calculate their density, reflecting the diversity of street functions. The spatial scale is consistent with the “density” section above, and only POIs along the streets are included in the calculation.
Design
The design dimension of the built environment focuses on the physical characteristics of spatial form and street networks, involving aspects such as architectural design, street design, and public space design. Reasonable architectural design can improve space utilization and environmental comfort, street design affects traffic flow, pedestrian safety, and accessibility, and high-quality public space design can provide residents with good recreational venues, increase community cohesion, and enhance the quality of the built environment. Common design dimension indicators primarily include the following three aspects:
  • Street network structure: Including road intersection density, street network density, etc.
  • Environmental design: Such as building interface continuity, street width-to-height ratio, blue-green landscapes, etc.
  • Facility dimension: Including pedestrian facilities, accessibility facilities, etc.
In this study, at the street network structure level, since street network density is not an environmental indicator directly perceivable by the public, its representation of street network accessibility characteristics will be reflected in the subsequent “Accessibility Dimension” through corresponding indicators. At the environmental design level, this study uses building interface continuity to reflect the regularity of street-facing building facades, and selects the visual exposure of linear water bodies (such as small rivers, ditches, etc.) and polygonal water bodies (such as wide river surfaces and lakes) to reflect the perception of blue landscapes. Furthermore, through the analysis and extraction of street view image data, sky view factor is selected to reflect the openness of the street, and green view factor is chosen to reflect the perception of green landscapes in the street space. At the facility dimension level, the proportion of pedestrian walkways in the street space is extracted from street view images to reflect the distribution of walking facilities in the street environment.
Distance to Transit
The convenience of the public transportation system measures the proximity between the built environment and public transportation facilities (such as subway stations, bus stops, etc.). The closer the public transportation is to residential areas, the more convenient it is for residents to use public transport, which helps reduce the use of private vehicles, alleviate traffic congestion, lower carbon emissions, and improve the accessibility and convenience of residents’ travel, thus expanding their range of activities. Based on previous research [4,5], this study ultimately selects the number of bus stops and metro stations along the street as indicators for calculation.
Accessibility
Accessibility, as defined in the “5D” framework, refers to the “spatiotemporal convenience of reaching specific functional locations (such as parks, supermarkets, bus stops, etc.).” Its core focus is on the “functional value of the street as a path,” which includes the physical distance from the current street to external destinations, functional compatibility, and psychological accessibility (such as the coverage of parks within a 500 m buffer zone, or the network distance to a bus stop, etc.). This definition emphasizes the connection between streets and external places, essentially serving the analysis of “travel behavior from A to B.” Existing studies have shown that the explanatory power of accessibility for commuting-related PA is much higher than for recreational PA [6,21]. This is partly because recreational PA is inherently harder to predict than commuting activities, and partly because of the definition of accessibility itself: “from A to B” travel behavior differs from recreational PA aimed purely at exercise, so the definition of accessibility in the “5D” framework fits the latter only to a limited extent.
Since this study focuses on “MPAs aimed at exercise occurring on streets,” it emphasizes whether streets can attract pedestrian flow for physical exercise, rather than the “convenience of traveling from the street to external places.” Therefore, this study, referencing existing research [29,30,31], adopts the “global normalized angular accessibility (NAch_Global)” indicator from space syntax as a substitute variable to describe the accessibility of the street network. Existing studies have shown that the global NAch indicator effectively captures the overall structure of the street network. By quantifying the spatial connectivity depth in the network’s topological structure, it can effectively reflect the convenience of reaching a target location from any point in the network [32,33].
Safety
Existing research has shown that safety perception is a commonly used indicator in studies of the built environment’s impact on PA [3,22,34]. Safety perception mainly refers to residents’ perceptions of neighborhood crime and road safety, with measurements typically including perceived safety of walking paths, crime rates, and traffic hazards. It can be said that safety perception is a fundamental influencing factor for PA and one of the basic needs for residents to engage in MPA on the streets. Among these factors, increased motor vehicle traffic significantly reduces residents’ willingness and frequency of walking, making it the primary factor influencing the perception of safety in PA.
Meanwhile, the NAch indicator has been proven to be scientifically effective in the quantitative analysis of vehicle flow in urban traffic networks. Several empirical studies have shown that the NAch indicator has a good fit with actual traffic data [29,35]. It is important to note that in the space syntax calculation process, the NAch indicator’s description of urban street network structural features and its predictive performance for traffic flow are closely related to its scale. An empirical study on ground traffic flow in Chongqing has shown that the local angular accessibility indicator at a specific scale fits actual traffic flow better than the global integration and global angular accessibility indicators [29].
Given the difficulty in obtaining real traffic flow data for validation, and considering the excellent ability of the machine learning models used in the subsequent regression analysis to handle complex multicollinearity factors, this study, based on empirical parameters from previous research [29], incorporates the local NAch indicator (with a radius of 7 km, NAch_7k) as a proxy for road motor vehicle flow into the set of indicators.

2.5. Methods

2.5.1. GW-RF Model

This study introduces a hybrid model based on Geographically Weighted Regression (GWR) and Random Forest (RF), called the “Geographic Weighted-Random Forest” (GW-RF) model, to explore and analyze the relationship between the built environment and PA. The GW-RF model combines the spatial heterogeneity modeling capability of GWR with the nonlinear processing ability of RF. In the GW-RF model, the GWR component handles local differences in the built environment at different geographic locations. It introduces a spatial weighting coefficient to account for the geographic location of each sample, ultimately generating a Spatial Weight Matrix (SWM) based on the spatial position of the samples. Then, in the RF model, the SWM is used to assign different weights to each sample, enhancing the model’s ability to learn from important regional samples. The GW-RF model has proven effective in research areas such as BE and PA [21], street vitality [4], poverty analysis [36], and water resource management [37]. Finally, considering the “black-box” nature of the RF model as a machine learning method, this study applies the Shapley Additive exPlanations (SHAP) model on top of the GW-RF model to enhance the interpretability of the model’s output.
In the GW-RF model, the SWM is a key component that describes the spatial interactions between samples. It is a symmetric matrix in which each element $w_{ij}$ represents the spatial similarity between sample i and sample j, typically measured by the distance or adjacency between them. The most common approach defines the spatial weight from the geographic (Euclidean) distance between samples, with shorter distances indicating higher similarity. The bandwidth h controls how quickly the weights decay with distance and thereby limits the range over which samples influence one another. The spatial weight $w_{ij}$ between sample i and sample j is expressed as follows, where $d_{ij}$ is the Euclidean distance between the two samples:
$$w_{ij} = \frac{1}{1 + d_{ij}/h}$$
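As a minimal sketch of the weighting rule above, the following Python snippet builds the full weight matrix from sample coordinates. The function name, the three sample coordinates, and the bandwidth value used here are illustrative assumptions, not the authors' implementation; coordinates are assumed to be in a projected system with units of metres.

```python
import numpy as np

def spatial_weight_matrix(coords, bandwidth):
    """Return the n x n matrix with entries w_ij = 1 / (1 + d_ij / h)."""
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances d_ij between all sample locations.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return 1.0 / (1.0 + dist / bandwidth)

# Hypothetical sample locations (metres) and an illustrative bandwidth h.
coords = [(0.0, 0.0), (900.0, 0.0), (0.0, 1800.0)]
W = spatial_weight_matrix(coords, bandwidth=1800.0)
print(np.round(W, 3))
```

Under this rule every sample has weight 1 with itself, and the weight falls to 0.5 at a distance equal to the bandwidth, so h can be read as the distance at which a neighbor's influence is halved.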

2.5.2. SHAP Model

The SHAP model is based on the concept of Shapley values from cooperative game theory, which quantifies the importance of each feature by calculating its marginal contribution to the model’s prediction. Due to its seamless integration with common machine learning models (such as Random Forest, XGBoost, etc.) and its ability to effectively address the “black-box” problem inherent in machine learning, the SHAP model has become a commonly used interpretability tool in empirical research based on machine learning [38,39]. The SHAP model provides not only global explanations but also local ones [40]. The formula for the Shapley value of feature i is as follows (3):
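$$\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n - |S| - 1)!}{n!} \left[ f(S \cup \{i\}) - f(S) \right]$$
Here, $\phi_i$ represents the contribution of feature i, N denotes the set of features, n is the number of features in N, S is a subset of N that excludes feature i, and $f(S \cup \{i\})$ and $f(S)$ represent the model results with and without feature i, respectively.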
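To make the Shapley value definition concrete, the sketch below evaluates it by brute-force enumeration of all subsets for a three-feature toy set function. The feature names and the toy function `toy_f` are hypothetical and chosen only for illustration; practical SHAP implementations for tree models use much faster algorithms than this enumeration.

```python
from itertools import combinations
from math import factorial

def shapley_value(i, features, f):
    """Exact Shapley value of feature i for a set function f (brute force)."""
    n = len(features)
    others = [k for k in features if k != i]
    total = 0.0
    for size in range(n):  # |S| runs from 0 to n - 1
        for subset in combinations(others, size):
            S = frozenset(subset)
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            total += weight * (f(S | {i}) - f(S))
    return total

def toy_f(S):
    """Hypothetical 'model output' when only the features in S are known."""
    value = 0.0
    if "density" in S:
        value += 2.0
    if "water" in S:
        value += 1.0
    if "density" in S and "water" in S:
        value += 0.5  # joint term, shared equally between the two features
    return value

features = ["density", "water", "transit"]
phi = {k: shapley_value(k, features, toy_f) for k in features}
print(phi)
```

In this toy case the contributions sum to the difference between the output with all features and the output with none, which is the additivity property that makes Shapley values convenient for attributing a prediction to individual features.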
It should be noted that existing studies have shown that the explanatory power of the spatial distribution of mobile physical activity is higher when different built environment variables interact, compared to a single factor [5,23,41]. Based on this, this study further captures the synergistic or suppressive effects between variables by calculating the interaction values using the SHAP model. The formula for calculating the SHAP interaction effect is as follows:
$$\phi_{ij} = \sum_{S \subseteq N \setminus \{i,j\}} \frac{|S|!\,(n - |S| - 2)!}{(n - 1)!} \left[ f(S \cup \{i,j\}) - f(S \cup \{i\}) - f(S \cup \{j\}) + f(S) \right]$$
In this case, $\phi_{ij}$ represents the interaction effect between feature i and feature j.
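The interaction formula can be checked in the same brute-force manner. The sketch below follows the weighting as written in this section; the toy function and feature names are hypothetical. Note, as a general remark rather than a statement about this study, that some formulations (including the SHAP interaction values of Lundberg et al.) carry an additional factor of 1/2 so that the effect is split evenly between the (i, j) and (j, i) entries.

```python
from itertools import combinations
from math import factorial

def shapley_interaction(i, j, features, f):
    """Pairwise interaction index of features i and j (brute force)."""
    n = len(features)
    others = [k for k in features if k not in (i, j)]
    total = 0.0
    for size in range(n - 1):  # |S| runs from 0 to n - 2
        for subset in combinations(others, size):
            S = frozenset(subset)
            weight = factorial(size) * factorial(n - size - 2) / factorial(n - 1)
            # Joint effect of i and j minus their separate effects.
            delta = f(S | {i, j}) - f(S | {i}) - f(S | {j}) + f(S)
            total += weight * delta
    return total

def toy_f(S):
    """Hypothetical set function with one joint term between two features."""
    value = 2.0 * ("density" in S) + 1.0 * ("water" in S)
    if "density" in S and "water" in S:
        value += 0.5
    return value

features = ["density", "water", "transit"]
inter_dw = shapley_interaction("density", "water", features, toy_f)
inter_dt = shapley_interaction("density", "transit", features, toy_f)
print(inter_dw, inter_dt)
```

The pair with a joint term recovers that term in full, while the pair whose effects are purely additive has an interaction of zero.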

3. Results

3.1. Model Performance

Before constructing the GW-RF model, we tested for multicollinearity among the variables and found no variables with a Variance Inflation Factor (VIF) greater than 10. Therefore, all the variables mentioned earlier were included in the model for calculation.
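For readers unfamiliar with the VIF screen, the following sketch computes it directly from its definition, VIF_j = 1 / (1 − R²_j), where R²_j comes from regressing variable j on all the others. The data are synthetic and the function name is an assumption; this is not the authors' code.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF of each column of X, from an OLS fit on the remaining columns."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept + other columns
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs

# Synthetic example: column c is nearly collinear with column a.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + 0.1 * rng.normal(size=500)
vifs = variance_inflation_factors(np.column_stack([a, b, c]))
print(np.round(vifs, 2))
```

With the threshold of 10 used in this study, the two nearly collinear columns in the synthetic example would be flagged, while the independent column would pass.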
As mentioned earlier, bandwidth is a core parameter of the GWR model: it defines the range of data points (or the number of neighboring elements) included in each local regression equation and thereby controls the degree of smoothing. Multiple studies using different selection criteria (such as AIC, AICc, and cross-validation (CV)) have verified that an optimal bandwidth exists for a given sample set. Empirically, when the initial bandwidth is small, model fit improves as the bandwidth increases; beyond a certain value, however, further increases no longer improve the fit significantly and may even cause it to deteriorate gradually [42,43,44]. Based on these conclusions, this study tested a range of bandwidth values, and the results show that the optimal bandwidth for this sample set is 1800 m, as detailed in Table 6.
After confirming the optimal bandwidth, this study used 80% of the sample set as the training set and 20% as the test set for the GW-RF model. Parameters were optimized with the Optuna module in Python, taking the highest R² on the test set as the optimization objective and employing 5-fold cross-validation to limit overfitting. After 20,000 tuning iterations, the optimal model parameters were: {'n_estimators': 238, 'max_depth': 19, 'min_samples_split': 3, 'min_samples_leaf': 1, 'max_features': 0.5993533983005411}. Model performance was evaluated using R², Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE).
Finally, the model achieved an R² of 0.6079, an MAE of 0.2059, and an RMSE of 0.5105. It should be noted that the aim of this study and similar research is not absolute predictive performance, but rather the mechanisms and principles revealed by the model. Given the complexity of factors influencing PA behavior, the diversity of urban environments, and the variability in indicator selection across studies, it is difficult to establish a unified standard for judging model validity, such as deeming a model highly reliable simply because its R² exceeds a certain threshold. Therefore, this study evaluates the reliability of the model by referencing the results reported in similar research.
Because the GW-RF model is a relatively new method that bridges Geographically Weighted Regression with traditional machine learning models, only one study has so far applied it to physical activity research. In that study, the R² values of the eight models ranged from 0.48 to 0.64 [5]. Additionally, research has shown that, for machine learning studies at the block or street level with buffer zones ranging from 20 m to 500 m, model fit indicators generally fall within the 0.4 to 0.6 range [7]. Based on this, and on the interpretability of the subsequent model results, this study considers the model results valid and reliable.
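The three evaluation metrics used in this section follow standard definitions, sketched below on a tiny hand-made example. The observed and predicted values are hypothetical and unrelated to the study's data.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, and RMSE for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    ss_res = np.sum(err ** 2)                        # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": np.mean(np.abs(err)),
        "rmse": np.sqrt(np.mean(err ** 2)),
    }

# Hypothetical values: three exact predictions and one large miss.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Because RMSE squares the errors before averaging, a single large miss raises it well above the MAE, as in this example. A gap between the two metrics is therefore generally a sign that a small number of samples carry large errors.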

3.2. Model Results

3.2.1. Variables’ Importance and SHAP Value

The contribution values of the 16 BE variables in the model are detailed in Figure 6. The model’s variable importance results extracted from Figure 6 are shown in Table 7. The SHAP values and influence directions of the various built environment variables in the model are shown in Figure 7.
In the GW-RF model, the contributions of the 16 BE variables are generated by calculating the average SHAP value for each variable across all samples. As shown in Figure 6 and Table 7, the eight most important variables are: D_Pop (23.51%), D_BLD (12.77%), Expo_Lake (9.82%), NAch_7k (8.70%), D_POI (7.81%), NAch_Global (7.78%), N_Bus (6.08%), and R_Ped (5.88%).
When the contribution averages are calculated based on the selected dimensions of the indicators, the ranking of each dimension’s contribution to PA is as follows: Density (14.70%), Safety (8.70%), Accessibility (7.78%), Design (4.87%), Distance to Transit (3.06%), Diversity (1.36%).
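The dimension-level figures are simple means of the variable-level shares. As a worked check, and assuming the Density dimension consists of D_Pop, D_BLD, and D_POI (an assumption based on how these variables are described in this paper), the reported 14.70% can be reproduced from the values listed above:

```python
# Importance shares (%) quoted in the text for the three variables that are
# assumed here to make up the Density dimension.
density_importance = {"D_Pop": 23.51, "D_BLD": 12.77, "D_POI": 7.81}

density_mean = sum(density_importance.values()) / len(density_importance)
print(f"{density_mean:.2f}")  # 14.70
```

Safety (8.70%) and Accessibility (7.78%) coincide with the shares of NAch_7k and NAch_Global, which is consistent with each of those dimensions being represented by a single variable.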
Figure 7 shows the SHAP summary plot for each sample, as well as the SHAP summary plot after extreme value removal using the quartile method. This plot illustrates the direction of influence that each variable has on MPA for each sample. In the figure, the color of the points represents whether the BE variable is high or low, and the direction indicates whether a particular variable has a positive or negative influence on MPA.
The variables that clearly show a positive contribution include Expo_Lake, NAch_Global, R_Sky, N_Bus, and D_Trans, while those showing a negative contribution include D_Pop, D_BLD, D_POI, NAch_7k, and R_BLD, among others.
It should be noted that the positive or negative influence refers to the overall contribution of the variable to MPA, not necessarily implying that an increase in the value of that variable will always result in an increase in MPA. In fact, the influence of most variables on MPA is complex and nonlinear, which will be further discussed in the following sections.
Based on the contribution importance and SHAP value of each variable, the following conclusions can be drawn:
The spatial openness and crowd density represented by the density dimension (with POI density being a reflection of crowd density to some extent) are the most important BE indicators influencing regular MPA in street scenes. The former has a positive contribution, while the latter has a negative contribution.
Accessibility and safety are of equal importance, with a significant gap between them and the density dimension. Road network accessibility has a generally positive contribution to MPA; similarly, the negative impact of vehicle flow simulated by local axial passage (NAch_7k) on MPA proves that improvements in safety have a positive effect on MPA. These results align with expectations and indirectly confirm that distinguishing between axial passage indicators at different scales can effectively reflect different characteristics of road network structures.
The design dimension is ranked next, with Expo_Lake (ranked 4th in importance among all indicators) being significantly more important than other indicators in the same dimension, highlighting the strong supporting role of water-adjacent spaces for MPA. The sharp decline in the importance of Expo_Lake reflects the impact of water landscape quality on this supporting role. Indicators like R_BLD and R_Sky further corroborate the importance of low-density, open spaces for street-level MPA. It is worth noting that the importance ranking of R_Vege is much lower than expected, and from Figure 7b, it can be observed that the impact of R_Vege on MPA is more complex (with no clear direction). This study suggests that this may be due to the conflict between the high spatial enclosure provided by street trees (high green view ratio) and the public’s high demand for open spaces.
N_Bus in the public transportation accessibility dimension has a significant positive influence on MPA (ranked 7th), while the impact of metro stations is negligible. This aspect currently lacks a strong explanation, and it is hard to imagine people taking the bus to a specific target street for MPA or choosing to take the bus back home after intense running. Perhaps the label of “popular routes” may trigger a herd mentality, encouraging people to travel long distances to experience the activity, but this cannot be further validated in this study.
The impact of different types of facilities on MPA aligns with the expectations of this study. Most of them have a positive influence, but the degree of influence is relatively low and can be considered negligible. This aligns with the logic of “regular MPA for exercise” represented by “popular routes,” meaning that the behavior is primarily motivated by the exercise itself, returning to space preference, rather than the pursuit of a specific functional “destination.”

3.2.2. Nonlinear and Threshold Effect of BE Variables

Based on the previous analysis, the SHAP summary chart in Figure 7b, which removes outliers, shows that the impact of most of the indicators in the model is complex and nonlinear. Additionally, a large number of empirical studies have demonstrated that BE variables have a typical nonlinear impact and threshold effect on outdoor MPA such as jogging and cycling [5,10,21,40,41]. Based on this, the paper further observes the model results to clarify the influence of various indicators within the indicator set on the outcomes.
In this study, the SHAP values for all samples for each indicator are output, and scatter plots are drawn combining the actual values of these indicators to observe their influence on MPA outcome. Furthermore, a local polynomial smoothing function (LOESS) is applied to fit the SHAP values and the actual values of the samples to explore the nonlinear effects of the influencing factors on the outcomes.
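The smoothing step can be sketched as a local linear regression with tricube weights, which is the core idea of LOESS. The version below is a simplified illustration (no robustness iterations, a fixed span `frac`) on synthetic data; the function name and parameters are assumptions and this is not the implementation used in the study.

```python
import numpy as np

def local_linear_smooth(x, y, frac=0.5):
    """Simplified LOESS: a weighted local line is fitted at every x value."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    k = max(2, int(np.ceil(frac * n)))   # number of points in each window
    fitted = np.empty(n)
    for idx in range(n):
        dist = np.abs(x - x[idx])
        # Window half-width: distance to the k-th nearest point.
        h = max(np.sort(dist)[k - 1], np.finfo(float).eps)
        w = np.clip(1.0 - (dist / h) ** 3, 0.0, None) ** 3  # tricube weights
        sw = np.sqrt(w)
        # Weighted least squares for a local line a + b * (x - x[idx]).
        A = np.column_stack([np.ones(n), x - x[idx]])
        coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
        fitted[idx] = coef[0]            # value of the local line at x[idx]
    return fitted

# Synthetic stand-ins for a variable's actual values and its SHAP values.
x_demo = np.linspace(0.0, 1.0, 40)
shap_demo = 0.8 * x_demo - 0.3
curve = local_linear_smooth(x_demo, shap_demo)
```

A local linear smoother reproduces an exactly linear relationship without distortion, so any curvature in the fitted line for real data reflects structure in the SHAP values rather than an artifact of the method. The span `frac` controls how much detail is retained.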
Figure 8 presents scatter plots of the influence of selected variables. The x-axis represents the actual values of the variables, and the y-axis denotes their contribution values. The influence of each variable can be interpreted from 2 perspectives:
(1)
Determine whether the variable’s contribution to MPA is “Promotion” or “Inhibition” based on the y-axis values: a y-value of 0 indicates no contribution to MPA; y > 0 indicates a promotional contribution to MPA; and y < 0 indicates an inhibitory effect on MPA.
(2)
Further judge the changing trend of this influence by observing the slope variation in the fitted curve (e.g., “Positive”, “Negative”, and “Neutral”).
Taking Figure 8d as an example: when the NAch_Global index ranges from 0.75 to 1.42, the variable exerts an overall inhibitory effect on MPA; when it is greater than 1.42 or less than 0.75, the overall effect is promotional. Meanwhile, it can be further observed that these “Promotion” and “Inhibition” effects are in a continuous state of change. When the NAch_Global index exceeds 0.18, the fitted curve shows a downward trend (Negative), indicating that the promotional effect on MPA gradually weakens as the index increases (until crossing the critical line of y = 0, where the effect shifts to inhibition and continues to strengthen).
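The two-step reading described above (sign of the fitted curve, then location of the points where it crosses y = 0) can be automated as follows. The V-shaped curve here is synthetic: it is constructed so that its zero crossings fall at the 0.75 and 1.42 thresholds quoted in the text, and it is not the fitted curve from Figure 8d.

```python
import numpy as np

def zero_crossings(x, y_fit):
    """x positions where the fitted curve changes sign (linear interpolation)."""
    x = np.asarray(x, dtype=float)
    y_fit = np.asarray(y_fit, dtype=float)
    crossings = []
    for k in range(len(x) - 1):
        y0, y1 = y_fit[k], y_fit[k + 1]
        if y0 * y1 < 0:  # sign change between two neighbouring grid points
            crossings.append(x[k] - y0 * (x[k + 1] - x[k]) / (y1 - y0))
    return crossings

# Synthetic V-shaped curve with crossings placed at 0.75 and 1.42.
x_grid = np.linspace(0.0, 2.0, 333)
y_curve = np.abs(x_grid - 1.085) - 0.335

thresholds = zero_crossings(x_grid, y_curve)
effect = np.where(y_curve > 0, "Promotion", "Inhibition")
print(np.round(thresholds, 2))
```

Between the two thresholds the curve lies below zero and is labelled "Inhibition"; outside them it is labelled "Promotion". The slope of the curve within each interval then gives the "Positive", "Negative", or "Neutral" trend.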
Furthermore, this study observed the fitting results of the 16 variables (results of the fitting curves of the remaining 12 variables are shown in Appendix A). The influence patterns of each variable on MPA can be categorized into 4 types:
(1)
A nearly linear influence pattern, as shown in Figure 8a;
(2)
An “L-shaped” pattern characterized by “initial promotion followed by inhibition, with the inhibitory effect stabilizing after exceeding a certain threshold”, as shown in Figure 8b;
(3)
An “inverted L-shaped” pattern featured by “initial inhibition followed by promotion, with the promotional effect stabilizing or slowly enhancing after exceeding a certain threshold”, as shown in Figure 8c;
(4)
A “V-shaped” pattern where “the slope of the curve shifts from Negative to Positive”, as shown in Figure 8d.
For the top 8 indicators ranked by variable importance, density-related indicators such as D_Pop, D_POI, and D_BLD all exhibit the characteristic of “promotion at low densities and inhibition at high densities, with the inhibitory effect saturating beyond a certain threshold”. The NAch_7k indicator, used to simulate road motor vehicle flow, also shows the above feature: low traffic flow promotes MPA, and as traffic flow increases, the effect shifts from promotion to inhibition. Expo_Lake has a nearly linear effect. N_Bus presents the following pattern: when N_Bus < 1, the overall effect fluctuates around y = 0; when N_Bus > 1, the effect is promotional and shows a nearly linear increase. The slope of the R_Ped indicator shifts from “nearly Neutral” to Positive, indicating that insufficient pedestrian facilities have a significant inhibitory effect on MPA, and as pedestrian facilities improve, the inhibitory effect weakens continuously and gradually transitions to promotion.
Finally, to facilitate a clearer understanding of the functional patterns of each variable on MPA and further provide a basis for planning and design practices, this study summarizes the effect variation intervals of all 16 global variables as follows (see Table 8).

3.2.3. Main Effects and Interaction Effects of BE Variables

In the model, the SHAP values of explanatory variables can be decomposed into main effects (SHAP main effect values) and interaction effects with other variables (SHAP interaction values) using the improved TreeExplainer algorithm. This study visualizes the main effects and interaction effects of the top 7 most important variables by calculating the SHAP interaction values, as shown in Figure 9. In the figure, the SHAP result of a variable interacting with itself represents its main effect, while its results with other variables represent interaction effects. The figure shows that, for the most important variables, the main effect is generally greater than the interaction effects. However, there are also cases where an interaction effect exceeds the main effect, such as D_BLD & D_Pop for D_BLD, and NAch_Global & NAch_7k for NAch_Global.
Furthermore, this study calculated the main effects and interaction effects across all variables. Overall, main effects and interaction effects accounted for 58.4% and 41.6% of the total importance, respectively. This means that analyzing interaction effects is as essential as analyzing individual variables when explaining MPA.
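The text does not specify how the overall shares are aggregated. One plausible approach, shown below as a hedged sketch, treats the diagonal of the SHAP interaction array as main effects and the off-diagonal entries as interaction effects, then compares their mean absolute magnitudes. The array layout (samples × features × features) and the tiny input are assumptions for illustration only.

```python
import numpy as np

def main_interaction_shares(interaction_values):
    """Share of total mean |value| on the diagonal vs. off the diagonal."""
    inter = np.asarray(interaction_values, dtype=float)
    mean_abs = np.abs(inter).mean(axis=0)   # p x p matrix averaged over samples
    main = np.trace(mean_abs)               # diagonal entries: main effects
    interaction = mean_abs.sum() - main     # off-diagonal: interaction effects
    total = main + interaction
    return main / total, interaction / total

# Hypothetical interaction values for one sample and two features.
toy = np.array([[[3.0, 1.0],
                 [1.0, 5.0]]])
main_share, inter_share = main_interaction_shares(toy)
print(main_share, inter_share)
```

In this toy input the diagonal carries 8 of the 10 units of total magnitude, giving shares of 0.8 and 0.2. Other aggregation choices (for example, signed sums) would give different figures.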
This study calculated and ranked the importance of the main effects of each variable, as well as the interaction effects between variables on the outcome. It further compared the ranking changes in the top 8 variables based on their main effects with the global variable importance ranking (see Section 3.2.1 of this paper), as shown in Table 9.
Upon comparison, it was found that after refining the main effects, the importance rankings of almost all variables were adjusted to some extent compared to the global indicators. For instance, N_Bus, which was previously difficult to interpret, dropped out of the top 8 and fell to the 11th position. By contrast, R_BLD, an indicator that reflects the degree of openness of street space and directly affects the public’s psychological feelings, rose from 9th to 6th place in terms of importance.
Overall, compared to the global importance ranking, the main effect importance ranking seems more intuitive. For example, it further highlights the role of urban features such as large water bodies and high-quality landscapes. Additionally, Figure 10 shows that importance is now distributed more evenly among the top-ranked variables: D_Pop no longer holds the pronounced lead it had in the global ranking, where its 23.51% importance exceeded that of the second-ranked variable by roughly 10 percentage points.

3.2.4. Interaction Effects Among BE Variables

Figure 10 shows the top 10 variable combinations by ranking. This study further examines the interaction effects of these 10 variable combinations by analyzing their sample distribution scatter plots, aiming to gain a deeper understanding of the underlying interaction mechanisms.
Figure 11 presents the visualization results of the interaction between the variable group D_BLD and NAch_Global. The x-axis represents the actual values of D_BLD in the interaction variables. The color bar on the right denotes the actual values of NAch_Global in this group (the redder the color, the higher the value). The y-axis stands for the SHAP interaction value of the variable group: a value of 0 indicates no interaction effect, whereas a value > 0 or < 0 signifies a positive (synergistic) or negative interaction effect, respectively.
As shown in Figure 11, when D_BLD < 0.1 and NAch_Global is at an extremely high value (approximately > 1.37), a “low + high” promoting effect of “low building density + high accessibility” is exhibited; otherwise, a “low + low” inhibiting effect of “low building density + low accessibility” is observed. When D_BLD > 0.1, a “high + low” promoting effect of “high building density + low accessibility” and a corresponding “high + high” inhibiting effect are presented.
The visualization results of the interaction effects for the remaining variable groups among the top 10 ranked groups are provided in Appendix B. Overall, the interaction effects of several key variable pairs yield the following results:
(1)
The 4 interaction variable pairs formed by Expo_Lake and the 4 variables (NAch_7k, R_Ped, NAch_Global, and D_BLD) rank 1st, 3rd, 5th, and 9th in importance, respectively. When Expo_Lake < 80%, the other 4 variables all exhibit an interaction pattern of “low + high” promoting effect and “low + low” inhibiting effect with Expo_Lake; when Expo_Lake ≥ 80%, the form of the interaction effect transforms, showing a state of “high + low” promoting effect and “high + high” inhibiting effect.
(2)
The interaction effects between D_BLD and NAch_7k, NAch_Global, and D_POI rank 2nd, 7th, and 8th in importance, respectively. For the above 3 variable pairs, D_BLD = 10% serves as the critical point for the change in interaction effects.
When D_BLD < 10%, low-density spaces, combined with low traffic volume (NAch_7k) or low accessibility (NAch_Global), jointly promote MPA; otherwise, they exert an inhibiting effect on MPA. When D_BLD > 10%, in relatively high-density urban spaces, both “high density + low accessibility” and “high density + high traffic volume” contribute to promoting MPA.
The interaction between D_BLD and D_POI exhibits an inhibiting effect in the “low-low” combination and a promoting effect in the “high-high” combination. When considered individually, both “low density” (D_BLD) and “low POI density” (D_POI) show a promoting effect on MPA; however, when combined, the sense of insecurity brought by the “open space + lack of vitality” environment leads to an inhibiting effect on MPA. This finding cannot be simply identified through research on the impact of individual variables on MPA alone.
(3)
The interaction patterns between D_Pop and D_BLD, as well as between D_Pop and D_POI, differ from the above. When D_Pop falls within the range of 0–60,000 people/km² (the interval where the vast majority of samples are concentrated), both pairs of variables exhibit a certain degree of interaction; when D_Pop exceeds this range, the interaction effect increasingly approaches 0. In fact, apart from the scenario where D_Pop is extremely low (close to 0), in which some outliers of D_BLD and D_POI inflate the contribution of the interaction effect (i.e., the absolute value on the y-axis), the interaction effects of the vast majority of samples are very close to 0.
A further observation reveals that when D_Pop is extremely low, road segments with high building density significantly enhance the contribution to MPA. This may be because “high-density environments in sparsely populated areas” (e.g., areas near suburban industrial parks or architectural facilities within large parks) can provide a certain sense of security for MPA, whereas “uninhabited and low-density built areas” are obviously unsuitable for MPA. Similarly, in sparsely populated streets, the presence of an appropriate number of POIs (points of interest), approximately within the range of 0.02–0.03 (i.e., 2–3 shops per 100 m), greatly boosts the positive contribution of streets to MPA.
(4)
This section focuses on the interaction between NAch_Global (representing accessibility) and NAch_7k (representing traffic volume). Their overall interaction effect is negative, ranking 2nd in importance. Given that the majority of global samples are concentrated in the NAch_Global range of 0.8–1.5, this study designates the first half of this interval as the low-value range and the second half as the high-value range. The key conclusions are as follows:
First, both “low accessibility + low traffic volume” and “high accessibility + high traffic volume” exert an inhibiting effect on MPA, with the distinction that the inhibiting degree of the “high + high” combination is significantly stronger than that of the “low + low” combination.
Second, “high accessibility” combined with a small number of samples with “medium-low traffic volume” exhibits a strong joint-promoting effect. The reason behind this result is easy to understand, but such samples seem scarce in urban settings.
Finally, “medium accessibility + medium-high intensity traffic volume” has a weak promoting effect on MPA. This appears to be another “compelled” choice due to the extreme scarcity of the aforementioned favorable scenario.

4. Discussion

This study explores the impact mechanism of the street built environment on residents’ MPA using multi-source data and explainable machine learning methods. By analyzing BE variables across different dimensions (such as density, design, diversity, traffic safety, etc.), we found that BE has significant differences in its promoting or inhibiting effects on MPA at multiple levels, and this impact exhibits complex nonlinear and threshold effects.

4.1. Comparison with Existing Research Results

The findings of this study are consistent with previous research results to some extent. Existing studies generally agree that street density and functional mix have a significant impact on MPA, especially in low-density areas, where open spaces and good public facilities can promote outdoor activities. Consistent with the findings of previous research [6,7,22,23], we found that building density and population density on streets promote MPA at low levels. However, when density becomes too high, it can inhibit MPA due to increased feelings of crowding and safety risks.
However, this study also questions some of the common assumptions in previous literature [15,17,23]. For example, POI density did not show the expected significant impact in this study, which may be due to the fact that the sample primarily focused on MPA for exercise purposes, rather than leisure or recreational activities. Compared to shopping, entertainment, and other functional facilities, sports facilities and recreational spaces (such as parks, gyms, etc.) may have a more significant promoting effect on MPA.

4.1.1. Nonlinear and Threshold Effect

A significant finding of this study is that the impact of BE variables on MPA is not linear. Many variables, such as building density, population density, and traffic flow, exhibit significant nonlinear effects across different value ranges. For example, building density shows a clear “inverted U-shaped” relationship, with a promotion effect in low-density areas and a suppression effect in high-density areas. As density increases, the promotion effect on MPA gradually diminishes until a certain threshold is reached, after which the suppression effect begins to increase.
Similarly, the impact of population density also exhibits similar nonlinear characteristics. In low-density areas, an increase in population density can promote MPA, but when density exceeds a certain threshold, excessive crowding leads to a decrease in the frequency and intensity of MPA. This finding is consistent with the nonlinear effects observed in existing literature, such as those by Yang et al. (2021) [10] and Liu et al. (2023) [45], who both pointed out that density indicators in the built environment have complex nonlinear effects on physical activity.
The impact of traffic flow on MPA also exhibits a significant threshold effect. On streets with lower traffic flow, the frequency and intensity of MPA are higher, but as traffic flow increases, particularly when it exceeds a certain level, the suppression effect on MPA becomes significant. This finding provides important insights for traffic safety management and urban planning, indicating that reducing traffic flow or improving traffic safety are effective ways to promote MPA.

4.1.2. Interaction Effect of Variables

This study further reveals the interaction effects between various variables in the built environment, particularly the interactions between density, design, accessibility, and safety. The study shows that the factors in the built environment do not act in isolation; their interaction effects play a crucial role in promoting MPA, which aligns with the findings of Yang et al. (2024) [21]. For example, the interaction effect between water exposure and traffic flow or building density indicates that the positive promoting effect of water landscapes is more significant on streets with low traffic flow and low building density, while this positive effect of water exposure significantly weakens on streets with high traffic flow and high building density.
Another key interaction effect is the interplay between building density and population density. This study found that in low-density areas, an increase in building density can effectively promote MPA, while in high-density areas, excessively high building density can suppress residents’ MPA due to the sense of crowding and unsuitable environmental design. This finding aligns with the results of Yang et al. (2024) [23], who also pointed out that the interaction between population density and land-use density has a significant nonlinear effect on outdoor jogging flow.

4.2. Policy Recommendations and Practical Implications

4.2.1. Guiding Planning Recommendations

Based on the main findings of this study, we propose the following guiding policy and planning recommendations:
(1)
Optimize Street Space Density: When planning new communities or renovating existing neighborhoods, building density should be controlled to avoid overcrowded street environments. Appropriately reducing building and population density can increase residents’ willingness to engage in MPA and the intensity of that activity, especially in the renovation of core areas or old neighborhoods.
(2)
Improve Traffic Safety: Reducing traffic flow, especially around residential and commercial areas, can effectively minimize the negative impact on residents’ MPA. Increasing green transportation infrastructure (such as sidewalks, pedestrian streets, slow-moving lanes) and traffic safety measures (such as speed bumps, traffic light settings) will help improve the suitability of streets for MPA.
(3)
Enhance Public Transportation Accessibility: Increasing the density of bus and subway stations and improving the quality of public transportation services can effectively promote outdoor MPA among residents. In areas with limited public transportation, providing more convenient transportation connections can encourage residents to participate in more outdoor exercise.
(4)
Optimize Street Landscape Design: Improving water exposure and green space design, especially on streets frequented by residents, can enhance the attractiveness of the street and encourage MPA. In addition, appropriately increasing the openness of streets and reducing the continuity of building facades can provide a more open and comfortable walking environment.
(5)
Strengthen Community Health Interventions: Specific strategies for promoting MPA should be developed for different urban areas, especially low-income or traffic-congested areas. By implementing targeted interventions to improve the built environment in specific regions, resources can be allocated more precisely, and MPA can be more effectively promoted.

4.2.2. Indicator-Based Planning Recommendations

Building on the guiding policy recommendations, more specific suggestions for planning and design practice can be proposed in light of the quantitative results of this paper. Because a city is a complex, giant system in which a minor adjustment in one aspect may have far-reaching impacts on the whole, efforts to promote MPA and, more importantly, public health should not unduly affect the normal operation of the city. For example, although the public prefers open environments with low population density and low building density, it is obviously unrealistic to keep the building density of the entire urban area below 10%. Therefore, based on practical realities, we divide the indicators involved in this paper into the following two categories:
(1)
Indicators that can be directly intervened in based on quantitative results:
For indicators such as R_Sky, R_Ped, and R_Vege that can be directly improved through street-scale environmental optimization, the planning recommendations proposed in this paper are shown in Table 10.
(2)
Indicators that can be intervened in through the synergistic effects among variables.
For indicators such as D_Pop, D_BLD, and Expo_Lake that are difficult to directly control in the short term, this paper further proposes planning recommendations based on their interaction effects with other variables, which are shown in Table 11.
Finally, because NAch_7k and NAch_Global are simulated values calculated with space syntax, their results are used in this paper only to discuss mechanisms; specific planning recommendations are difficult to formulate from them.

4.3. Research Limitations and Future Outlook

Although this study provides an in-depth analysis of the relationship between BE and MPA, it has several limitations. First, it relies primarily on spatial and perceptual data and does not explore socioeconomic factors, cultural backgrounds, or individual differences in depth, which may limit the generalizability and comprehensiveness of the results. Second, although advanced models were used for data analysis, their assumptions and parameter choices may affect the accuracy of the results; in complex urban environments where multiple BE factors interact, the models may not capture all underlying interaction effects. Finally, this study focuses on the core urban area of Chengdu. With a pleasant subtropical monsoon climate, a strong culture of leisure and fitness, flat terrain, and well-developed transportation facilities, this region provides a sound foundation for residents’ outdoor MPA. These regional characteristics, however, may prevent the conclusions from being fully applicable to areas with drastically different climatic conditions, distinct fitness cultures, or significant disparities in urban infrastructure. The findings should therefore not be overgeneralized to other cities or regional contexts.
Future research can be expanded in the following directions. First, more socioeconomic variables and cultural factors should be incorporated into the analytical framework to explore how they, in conjunction with BE, influence residents’ MPA behavior, especially among low-income groups and special populations. Second, as big data and artificial intelligence technologies continue to develop, more precise models and broader datasets can be used to further explore the spatial heterogeneity and nonlinear effects of BE on MPA, particularly at different times of day and under varying weather conditions. Finally, cross-regional comparative studies can be conducted in different urban and cultural contexts to explore how cities and regions of various types can promote residents’ MPA through their own BE designs and policies, thus providing broader theoretical support and practical experience for the global development of healthy cities.

5. Conclusions

This study explores the mechanism through which the street built environment (BE) affects mobile physical activity (MPA), using multi-source data and interpretable machine learning methods and focusing on the importance, nonlinear effects, and interaction effects of BE variables. By integrating Geographically Weighted Regression (GWR) and Random Forest (RF) into a hybrid GW-RF model, the research reveals the complex, spatially heterogeneous relationship between BE factors and physical activity, and it uses the SHAP model to make the results more interpretable, providing theoretical support for precise urban planning and physical activity interventions.
The main conclusions indicate that different BE characteristics differ markedly in their impact on MPA. In the density dimension, population density and building density have particularly notable effects. Moderate density provides more exercise opportunities and promotes MPA, while excessively high density brings a sense of spatial crowding and potential safety hazards that inhibit residents’ willingness to exercise. Specifically, an increase in building density may promote MPA in low-density areas, but once density passes a certain threshold its inhibitory effect begins to emerge; similarly, lower levels of population density facilitate MPA, whereas very high population density tends to suppress it.
Traffic safety and accessibility are also key factors affecting MPA. The negative correlation between simulated traffic flow (NAch_7k) and MPA shows that higher traffic flow significantly reduces residents’ willingness to engage in physical activity and the frequency with which they do so, especially in areas with dense street traffic. In contrast, good public transportation accessibility (such as the density of bus and subway stations) promotes MPA, particularly in neighborhoods with convenient transportation facilities. Low traffic flow and a sense of safety enhance residents’ participation in physical activity, while high traffic flow and a low perception of safety inhibit it.
In the design dimension, several factors play important roles in promoting MPA. Exposure to water bodies (Expo_Lake) is particularly prominent: water landscapes enhance the visual attractiveness and spatial openness of streets, and high water body exposure can effectively stimulate residents’ willingness to exercise. Greening rate and sky view factor also contribute significantly; high green coverage in particular can notably increase the frequency of walking, running, and other activities. The impact of building facade continuity (R_BLD) on MPA, however, is complex and nonlinear, with higher continuity tending to suppress residents’ willingness to exercise to a certain extent.
The study emphasizes that the impact of BE on MPA does not increase linearly but exhibits threshold effects. For example, the spaciousness of low-density areas is conducive to physical activity, but as density increases, spatial crowding and traffic pressure gradually inhibit it. The relationship between traffic accessibility and safety perception is likewise nonlinear: highly accessible areas still struggle to promote MPA without a sense of safety.
Furthermore, significant interaction effects exist between different dimensions of the built environment. An improvement in a single variable is often insufficient to explain changes in MPA, because the interaction between variables exerts an important influence. For instance, the interaction between greening rate and the distribution of walking facilities, and between traffic accessibility and safety perception, further affects residents’ willingness to engage in physical activity and the intensity of that activity. In high-density areas in particular, good traffic accessibility and low traffic flow can alleviate the negative effects of high density, thereby promoting MPA.
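To make the distinction between a variable's own contribution and an interaction effect concrete, the following is a minimal sketch that computes exact Shapley values and a pairwise interaction term for a two-variable toy model. The function `toy_mpa`, its coefficients, the feature names `access` and `safety`, and the baseline values are all illustrative assumptions; this is not the fitted GW-RF model or its SHAP output.

```python
from itertools import combinations
from math import factorial

def toy_mpa(x):
    """Hypothetical response: two additive terms plus one interaction term."""
    return 2.0 * x["access"] + 1.0 * x["safety"] + 3.0 * x["access"] * x["safety"]

def value(model, x, baseline, subset):
    """Model output when only the features in `subset` take their actual values."""
    mixed = {k: (x[k] if k in subset else baseline[k]) for k in x}
    return model(mixed)

def shapley_values(model, x, baseline):
    """Exact Shapley values obtained by enumerating every feature subset."""
    names = list(x)
    n = len(names)
    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for r in range(n):
            for s in combinations(others, r):
                weight = factorial(r) * factorial(n - r - 1) / factorial(n)
                with_i = value(model, x, baseline, set(s) | {i})
                without_i = value(model, x, baseline, set(s))
                total += weight * (with_i - without_i)
        phi[i] = total
    return phi

def pair_interaction(model, x, baseline, i, j):
    """Joint effect of i and j beyond their separate effects (two-feature case)."""
    both = value(model, x, baseline, {i, j})
    only_i = value(model, x, baseline, {i})
    only_j = value(model, x, baseline, {j})
    neither = value(model, x, baseline, set())
    return both - only_i - only_j + neither

x = {"access": 1.0, "safety": 1.0}
baseline = {"access": 0.0, "safety": 0.0}
phi = shapley_values(toy_mpa, x, baseline)
interaction = pair_interaction(toy_mpa, x, baseline, "access", "safety")
print(phi, interaction)
```

In this toy case the two Shapley values sum to the difference between the prediction and the baseline, while the interaction term isolates the part of that difference that neither variable produces alone, which is the sense in which a single-variable improvement cannot account for the full change.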
Overall, this study provides a new theoretical perspective for understanding the impact of the built environment on physical activity and offers empirical evidence for promoting residents’ physical activity by optimizing street environments. Urban planners and policymakers can combine the model framework proposed here with their specific urban contexts to design more precise and effective interventions. Especially in areas with low physical activity participation, optimizing street design, improving traffic safety, and increasing green spaces and water-friendly areas can enhance residents’ exercise participation and ultimately improve public health.

Author Contributions

Conceptualization, H.S. and A.L.; methodology, H.S. and J.Z.; software, H.S. and A.L.; validation, H.S. and Y.L.; formal analysis, J.Z.; investigation, H.S. and Y.L.; resources, H.S. and J.Z.; data curation, H.S.; writing—original draft preparation, H.S. and Y.L.; writing—review and editing, H.S. and J.Z.; visualization, J.Z. and A.L.; supervision, H.S. and J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Sichuan Provincial Natural Science Foundation Project, 2022NSFC1152.

Data Availability Statement

Data are contained within the article. The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Conflicts of Interest

All authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Appendix A

Figure A1. Influence Effects of the Remaining 12 Variables.

Appendix B

Figure A2. The Interaction Effects of the Top 10 Variable Pairs.

References

  1. Strain, T.; Flaxman, S.; Guthold, R.; Semenova, E.; Cowan, M.; Riley, L.M.; Bull, F.C.; Stevens, G.A. National, Regional, and Global Trends in Insufficient Physical Activity Among Adults from 2000 to 2022: A Pooled Analysis of 507 Population-Based Surveys with 5·7 Million Participants. Lancet Glob. Health 2024, 12, e1232–e1243. [Google Scholar] [CrossRef]
  2. Rhodes, R.E.; Janssen, I.; Bredin, S.S.D.; Warburton, D.E.R.; Bauman, A. Physical Activity: Health Impact, Prevalence, Correlates and Interventions. Psychol. Health 2017, 32, 942–975. [Google Scholar] [CrossRef]
  3. Roberts, I.; Norton, R.; Jackson, R.; Dunn, R.; Hassall, I. Effect of Environmental Factors on Risk of Injury of Child Pedestrians by Motor Vehicles: A Case-Control Study. BMJ 1995, 310, 91–94. [Google Scholar] [CrossRef]
  4. Yang, D.; Wang, X.; Han, R. Nonlinear and Synergistic Effects of the Built Environment on Street Vitality: The Case of Shenyang. Urban Plan. Forum 2023, 5, 93–102. [Google Scholar] [CrossRef]
  5. Yang, W.; Fei, J.; Li, Y.; Chen, H.; Liu, Y. Unraveling Nonlinear and Interaction Effects of Multilevel Built Environment Features on Outdoor Jogging with Explainable Machine Learning. Cities 2024, 147, 104813. [Google Scholar] [CrossRef]
  6. Yang, L.; Yu, B.; Liang, P.; Tang, X.; Li, J. Crowdsourced Data for Physical Activity-Built Environment Research: Applying Strava Data in Chengdu, China. Front. Public Health 2022, 10, 883177. [Google Scholar] [CrossRef] [PubMed]
  7. Shen, H.; Shu, B.; Zhang, J.; Liu, Y.; Li, A. What Factors Influence the Willingness and Intensity of Regular Mobile Physical Activity?—A Machine Learning Analysis Based on a Sample of 290 Cities in China. Front. Public Health 2025, 13, 1511129. [Google Scholar] [CrossRef] [PubMed]
  8. Ewing, R.; Cervero, R. Travel and the Built Environment. J. Am. Plann. Assoc. 2010, 76, 265–294. [Google Scholar] [CrossRef]
  9. Schnohr, P.; O’Keefe, J.H.; Marott, J.L.; Lange, P.; Jensen, G.B. Dose of Jogging and Long-Term Mortality: The Copenhagen City Heart Study. J. Am. Coll. Cardiol. 2015, 65, 411–419. [Google Scholar] [CrossRef]
  10. Yang, L.; Ao, Y.; Ke, J.; Lu, Y.; Liang, Y. To Walk or Not to Walk? Examining Non-Linear Effects of Streetscape Greenery on Walking Propensity of Older Adults. J. Transp. Geogr. 2021, 94, 103099. [Google Scholar] [CrossRef]
  11. Cheng, L.; De Vos, J.; Zhao, P.; Yang, M.; Witlox, F. Examining Non-Linear Built Environment Effects on Elderly’s Walking: A Random Forest Approach. Transp. Res. Part Transp. Environ. 2020, 88, 102552. [Google Scholar] [CrossRef]
  12. Smith, R.A.; Schneider, P.P.; Cosulich, R.; Quirk, H.; Bullas, A.M.; Haake, S.J.; Goyder, E. Socioeconomic Inequalities in Distance to and Participation in a Community-Based Running and Walking Activity: A Longitudinal Ecological Study of Parkrun 2010 to 2019. Health Place 2021, 71, 102626. [Google Scholar] [CrossRef]
  13. Karusisi, N.; Bean, K.; Oppert, J.-M.; Pannier, B.; Chaix, B. Multiple Dimensions of Residential Environments, Neighborhood Experiences, and Jogging Behavior in the RECORD Study. Prev. Med. 2012, 55, 50–55. [Google Scholar] [CrossRef]
  14. Chen, E.; Ye, Z.; Wu, H. Nonlinear Effects of Built Environment on Intermodal Transit Trips Considering Spatial Heterogeneity. Transp. Res. Part Transp. Environ. 2021, 90, 102677. [Google Scholar] [CrossRef]
  15. Jiang, H.; Dong, L.; Qiu, B. How Are Macro-Scale and Micro-Scale Built Environments Associated with Running Activity? The Application of Strava Data and Deep Learning in Inner London. ISPRS Int. J. Geo-Inf. 2022, 11, 504. [Google Scholar] [CrossRef]
  16. Javanmard, R.; Lee, J.; Kim, J.; Liu, L.; Diab, E. The Impacts of the Modifiable Areal Unit Problem (MAUP) on Social Equity Analysis of Public Transit Reliability. J. Transp. Geogr. 2023, 106, 103500. [Google Scholar] [CrossRef]
  17. Lu, Y. Using Google Street View to Investigate the Association Between Street Greenery and Physical Activity. Landsc. Urban Plan. 2019, 191, 103435. [Google Scholar] [CrossRef]
  18. Huang, D.; Liu, Y.; Zhou, P. Meta-analysis on Associations Between the Built Environment and Mobile Physical Activity Using Volunteered Geographic Information. Landsc. Archit. 2024, 31, 12–20. [Google Scholar] [CrossRef]
  19. Biddle, S.J.H.; Nigg, C.R. Theories of Exercise Behavior. Int. J. Sport Psychol. 2000, 31, 290–304. [Google Scholar]
  20. Li, K.; Yang, D.; Jiang, L. Analysis of the Relationship between Built Environment and Physical Activity Willingness of the Elderly: From the Perspective of Perception and Mediation Effect. J. Hum. Settl. West China 2025, 40, 73–80. [Google Scholar]
  21. Yang, W.; Li, Y.; Liu, Y.; Fan, P.; Yue, W. Environmental Factors for Outdoor Jogging in Beijing: Insights from Using Explainable Spatial Machine Learning and Massive Trajectory Data. Landsc. Urban Plan. 2024, 243, 104969. [Google Scholar] [CrossRef]
  22. Alshahrani, N.Z. Predictors of Physical Activity and Public Safety Perception Regarding Technology Adoption for Promoting Physical Activity in Jeddah, Saudi Arabia. Prev. Med. Rep. 2024, 43, 102753. [Google Scholar] [CrossRef]
  23. Yang, W.; Hu, J.; Liu, Y. Association and Interaction Between Built Environment and Outdoor Jogging Based on Crowdsourced Geographic Information. Landsc. Archit. 2024, 31, 44–52. [Google Scholar] [CrossRef]
  24. Swapan, A.Y.; Bay, J.H.; Marinova, D. Built Form and Community Building in Residential Neighbourhoods: A Case Study of Physical Distance in Subiaco, Western Australia. Sustainability 2018, 10, 1703. [Google Scholar] [CrossRef]
  25. Friesen, A. The Importance of Place A Role for the Built Environment in the Etiology and Treatment of Problematic Substance Use. Master’s Thesis, University of Waterloo, Waterloo, ON, Canada, 2018. [Google Scholar]
  26. Ishikawa, A.; Suzuki, H. Relationship Between Distance of Visible Road on Road Network and Occurrence of Snatch—A Study in Redential Area in Osaka City. J. Environ. Eng. Trans. Archit. Inst. Jpn. 2008, 73, 101–106. [Google Scholar] [CrossRef]
  27. Yamanoto, H.; Marumo, H.; Takahashi, A.; Saitou, K. On the Effect of Unobstructed View Distance in Residential Street on Landscape Evaluation. J. City Plan. Inst. Jpn. 1991, 26, 817–822. [Google Scholar] [CrossRef]
  28. Gehl, J. Life Between Buildings—Using Public Space, 6th ed.; Island Press: Washington, DC, USA, 2011; ISBN 978-1-59726-827-1. [Google Scholar]
  29. Sheng, Q.; Yang, T.; Hou, J. Continuous Movement and Hyper-Link Spatial Mechanisms—A Large-Scale Space Syntax Analysis on Chongqing’s Vehicle and Metro Flow Data. J. Hum. Settl. West China 2015, 30, 16–21. [Google Scholar] [CrossRef]
  30. Hillier, W.; Yang, T.; Turner, A. Advancing DepthMap to Advance Our Understanding of Cities: Comparing Streets and Cities and Streets with Cities. In Proceedings of the 8th International Space Syntax Symposium, Santiago, Chile, 3–6 January 2012. [Google Scholar]
  31. Chiaradia, A.; Moreau, E.; Raford, N. Configurational Exploration of Public Transport Movement Networks: A Case Study, the London Underground. In Proceedings of the 5th International Space Syntax Symposium, Delft, The Netherlands, 13–17 June 2005. [Google Scholar]
  32. Yamu, C.; Van Nes, A.; Garau, C. Bill Hillier’s Legacy: Space Syntax—A Synopsis of Basic Concepts, Measures, and Empirical Application. Sustainability 2021, 13, 3394. [Google Scholar] [CrossRef]
  33. Hillier, W.R.G.; Yang, T.; Turner, A. Normalising Least Angle Choice in Depthmap—And How It Opens up New Perspectives on the Global and Local Analysis of City Space. J. Space Syntax 2012, 3, 155–193. [Google Scholar]
  34. Bringolf-Isler, B.; Hänggi, J.; Kayser, B.; Suggs, L.S.; De Hoogh, K.; Dössegger, A.; Probst-Hensch, N. Does Growing up in a Physical Activity-Friendly Neighborhood Increase the Likelihood of Remaining Active during Adolescence and Early Adulthood? BMC Public Health 2024, 24, 2883. [Google Scholar] [CrossRef] [PubMed]
  35. Tao, W.; Gu, H.; Zhang, L.; Shen, M.; Huang, M. Study on the Prediction of Urban Road Traffic from the Perspective of Syntax: A Case Study on Renmin Viaduct Demolition in Guangzhou. J. South China Norm. Univ. Nat. Sci. Ed. 2017, 49, 80–86. [Google Scholar] [CrossRef]
  36. Luo, Y.; Yan, J.; McClure, S.C.; Li, F. Socioeconomic and Environmental Factors of Poverty in China Using Geographically Weighted Random Forest Regression Model. Environ. Sci. Pollut. Res. 2022, 29, 33205–33217. [Google Scholar] [CrossRef]
  37. Su, Z.; Lin, L.; Xu, Z.; Chen, Y.; Yang, L.; Hu, H.; Lin, Z.; Wei, S.; Luo, S. Modeling the Effects of Drivers on PM2.5 in the Yangtze River Delta with Geographically Weighted Random Forest. Remote Sens. 2023, 15, 3826. [Google Scholar] [CrossRef]
  38. Lundberg, S.M.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Curran Associates, Inc.: New York, NY, USA, 2017; Volume 30. [Google Scholar]
  39. Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; Association for Computing Machinery: New York, NY, USA, 2016; pp. 1135–1144. [Google Scholar]
  40. Li, Z. Extracting Spatial Effects from Machine Learning Model Using Local Interpretation Method: An Example of SHAP and XGBoost. Comput. Environ. Urban Syst. 2022, 96, 101845. [Google Scholar] [CrossRef]
  41. Wei, D.; Yang, L. Non-Linear and Synergistic Effects of Built Environment Factors on Older People’s Walking Frequency in Chengdu: A Shapley Additive Explanations Analysis. J. Hum. Settl. West China 2024, 39, 75–82. [Google Scholar] [CrossRef]
  42. Koç, T. Bandwidth Selection in Geographically Weighted Regression Models via Information Complexity Criteria. J. Math. 2022, 2022, 1527407. [Google Scholar] [CrossRef]
  43. Li, B.; Cao, J.; Guan, L.; Mazur, M.; Chen, Y.; Wahle, R.A. Estimating Spatial Non-Stationary Environmental Effects on the Distribution of Species: A Case Study from American Lobster in the Gulf of Maine. ICES J. Mar. Sci. 2018, 75, 1473–1482. [Google Scholar] [CrossRef]
  44. Wang, J.; Du, H.; Li, X.; Mao, F.; Zhang, M.; Liu, E.; Ji, J.; Kang, F. Remote Sensing Estimation of Bamboo Forest Aboveground Biomass Based on Geographically Weighted Regression. Remote Sens. 2021, 13, 2962. [Google Scholar] [CrossRef]
  45. Liu, Y.; Li, Y.; Yang, W.; Hu, J. Exploring Nonlinear Effects of Built Environment on Jogging Behavior Using Random Forest. Appl. Geogr. 2023, 156, 102990. [Google Scholar] [CrossRef]
Figure 1. The technical framework and workflow in this study.
Figure 2. The study area: main districts of Chengdu with a 1500 m buffer.
Figure 3. Illustration of the vectorization results of popular routes: (a) Route shapefile from the popular route data (because the images are screenshots from the app, the Chinese characters in the figures are the names of streets and facilities on the online map); (b) Vectorization results of the popular routes on the ArcGIS platform (red indicates a street segment passed by routes twice; green indicates a street segment passed once).
Figure 4. Summary of the vectorization results of popular routes: (a) Spatial distribution of popular routes within the study area; (b) Illustration of the spatial relationship between polyline features and the converted point features. (The image is sourced from the “Help” module of the ArcGIS 10.8.2 platform).
Figure 5. Spatial distribution map of MPA intensity.
Figure 6. Variables’ importance of the model.
Figure 7. SHAP summary plot: (a) SHAP summary plot for all samples; (b) SHAP summary plot after extreme value removal.
Figure 8. Influence Effects of 4 Typical Variables (the X-axis represents the actual values of the variables, and the Y-axis represents the SHAP values of the variables; a SHAP value greater than 0 indicates a positive contribution, while a value less than 0 indicates a negative contribution).
Figure 9. SHAP summary plot of main effects and interaction effects. (In this SHAP interaction plot, the colors represent the magnitude and direction of the interaction effect between features. Red indicates a positive interaction effect, while blue indicates a negative interaction effect).
Figure 10. Top 10 pairs of interaction variables by importance.
Figure 11. D_BLD with NAch_Global.
Table 1. Area and Population of Involved Districts.
| District | Included Spatial Scope | Area | Population |
| --- | --- | --- | --- |
| Qingyang | Entire | 64.49 km² | 1,373,897 |
| Wuhou | Entire | 123.67 km² | 1,571,181 |
| Jinniu | Entire | 106.67 km² | 1,634,905 |
| Jinjiang | Entire | 60.37 km² | 767,034 |
| Chenghua | Entire | 109.39 km² | 1,373,897 |
| Longquanyi | Partial | 50.54 km² | 178,789 |
| Pidu | Partial | 31.54 km² | 208,118 |
| Shuangliu | Partial | 103.46 km² | 598,735 |
| Xindu | Partial | 79.94 km² | 163,640 |
| Wenjiang | Partial | 23.14 km² | 54,392 |
| Total | - | 753.21 km² | 7,924,588 |
Table 2. PA data descriptions.
| Information | Description | Sample |
| --- | --- | --- |
| Route name | The custom name provided by the user when creating a route. | Vanke Loop Line |
| Route ID | Assigned by the system upon creation. | 5f145f0a88d6fe70e739556f |
| Route location | Geographical coordinates of the starting point of the route. | Longitude: 104.1723; Latitude: 30.5908 |
| Venue type | Including: Park, Street, Playground, Field and Others. | Street |
| Route length | - | 1671.2 m |
| Check-in count | Cumulative check-ins since the creation of the route. | 814 times |
| Proportion of PA types | Proportion of running, walking, and cycling activities. | Running: 75%; Walking: 9%; Cycling: 16% |
| Route creation date | - | 19 July 2020, 22:56:10 |
| Route shape | The shape of the route on the online map. (Because the images are screenshots from the app, the Chinese characters in the figures are the names of streets and facilities on the online map.) | [image] |
Table 3. Data source descriptions.
| Data | Source | Recency | Accuracy | Data Acquisition Time |
| --- | --- | --- | --- | --- |
| Road Network Data | OpenStreetMap (www.openstreetmap.org) | June 2024 | - | June 2024 |
| Water Body Data | OpenStreetMap | | - | |
| Population Raster Data | WorldPop (www.worldpop.org) | 2018 | 100 m | |
| Street View Image | Baidu Map (https://lbsyun.baidu.com) | June 2024 | - | |
| Point of Interest | Gaode Map (http://lbs.amap.com) | | - | |
| Building Footprint | Baidu Map | | - | |
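Table 4. Definition and description of BE indicators.
| Dimensions | Variables | Abbr. | Description or Calculation | Scale | Units |
| --- | --- | --- | --- | --- | --- |
| Density | Building Density | D_BLD | Building/Population/POI density within the street visual perception range. | 100 m | % |
| Density | Population Density | D_Pop | Building/Population/POI density within the street visual perception range. | 100 m | Persons/km² |
| Density | POI Density | D_POI | Building/Population/POI density within the street visual perception range. | Along the Street | Nums/m |
| Diversity | Density of Life and Education-related POIs | D_Life | Types of POI density within the street visual perception range. | Along the Street | Nums/m |
| Diversity | Density of Sports and Leisure-related POIs | D_Sport | Types of POI density within the street visual perception range. | Along the Street | Nums/m |
| Diversity | Density of Transportation-related POIs | D_Trans | Types of POI density within the street visual perception range. | Along the Street | Nums/m |
| Design | Building Façade Enclosure | R_BLD | The ratio of total building façade width to street length. | Along the Street | % |
| Design | Green View Ratio | R_Vege | The proportion of sky, greenery, and pedestrian path extracted from street view images. | - | % |
| Design | Sky View Ratio | R_Sky | The proportion of sky, greenery, and pedestrian path extracted from street view images. | - | % |
| Design | Pedestrian Path Coverage Ratio | R_Ped | The proportion of sky, greenery, and pedestrian path extracted from street view images. | - | % |
| Design | Polygonal Water Exposure Ratio | Expo_Lake | The ratio of the area visible to pedestrians, where the view through buildings reveals water bodies, to the total length of the street. | 80 m | % |
| Design | Linear Water Exposure Ratio | Expo_River | The ratio of the area visible to pedestrians, where the view through buildings reveals water bodies, to the total length of the street. | 50 m | % |
| Distance to Transit | Number of Bus Stops | N_Bus | The number of public transportation stops along the street. | Along the Street | Nums |
| Distance to Transit | Number of Metro Stations | N_Metro | The number of public transportation stops along the street. | Along the Street | Nums |
| Accessibility | Road Network Accessibility | NAch_Global | Use the global and local NAch indicators from space syntax to express accessibility and simulate street motor vehicle flow. | Global | - |
| Safety | Simulated Traffic Flow in the Road Network | NAch_7k | Use the global and local NAch indicators from space syntax to express accessibility and simulate street motor vehicle flow. | 7000 m | - |
Note: For Distance to Transit, the number of public transport and subway stations in this paper includes those located on opposite streets.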
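Table 5. The descriptive statistics of the BE indicators.
| Dimensions | Indicators | Max | Min | Mean | Std. Dev | Units |
| --- | --- | --- | --- | --- | --- | --- |
| Density | D_BLD | 100.00% | 0.00% | 17.11% | 0.14 | % |
| Density | D_Pop | 186,191.09 | 25.87 | 14,987.16 | 13,711.98 | Persons/km² |
| Density | D_POI | 6.92 | 0 | 0.05 | 0.09 | Nums/m |
| Diversity | D_Life | 0.29 | 0 | 0.01 | 0.02 | Nums/m |
| Diversity | D_Sport | 0.20 | 0 | 0.005 | 0.01 | Nums/m |
| Diversity | D_Trans | 0.10 | 0 | 0.004 | 0.003 | Nums/m |
| Design | R_BLD | 99.55% | 0.00% | 20.26% | 0.20 | % |
| Design | R_Vege | 91.12% | 0.00% | 27.02% | 0.12 | % |
| Design | R_Sky | 44.50% | 0.00% | 8.69% | 0.07 | % |
| Design | R_Ped | 17.06% | 0.00% | 3.31% | 0.02 | % |
| Design | Expo_Lake | 100.00% | 0.00% | 5.28% | 0.20 | % |
| Design | Expo_River | 100.00% | 0.00% | 10.27% | 0.28 | % |
| Distance to Transit | N_Bus | 5 | 0 | 0.15 | 0.42 | Nums |
| Distance to Transit | N_Metro | 8 | 0 | 0.04 | 0.31 | Nums |
| Accessibility | NAch_Global | 1.48 | 0 | 0.90 | 0.28 | - |
| Safety | NAch_7k | 3,907,158 | 0 | 174,492.87 | 377,905.64 | - |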
Table 6. Optimal bandwidth test results for the GW-RF model.
| Bandwidth | Model Performance: Max | Model Performance: Min | Model Performance: Std. Dev |
| --- | --- | --- | --- |
| 1500 m | 0.5605 | 20.65 | 56.74 |
| 1600 m | 0.5668 | 20.15 | 56.34 |
| 1700 m | 0.5546 | 20.79 | 57.12 |
| 1800 m | 0.5852 | 20.12 | 55.13 |
| 1900 m | 0.5690 | 21.12 | 56.19 |
| 2000 m | 0.5643 | 20.26 | 56.50 |
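The bandwidth comparison in Table 6 can also be read programmatically. The sketch below copies the six rows from the table and looks for a bandwidth whose row is at least as good as every other row. It assumes that the first performance column is a fit score (higher is better) and that the other two are error-type measures (lower is better); the table headers do not confirm this reading, and the helper names `dominates` and `best_bandwidth` are hypothetical.

```python
# Rows copied from Table 6: bandwidth in metres -> the three reported columns.
# Assumption (not confirmed by the table headers): column 1 is a fit score
# where higher is better; columns 2 and 3 are errors where lower is better.
TABLE6 = {
    1500: (0.5605, 20.65, 56.74),
    1600: (0.5668, 20.15, 56.34),
    1700: (0.5546, 20.79, 57.12),
    1800: (0.5852, 20.12, 55.13),
    1900: (0.5690, 21.12, 56.19),
    2000: (0.5643, 20.26, 56.50),
}

def dominates(a, b):
    """True if row a is at least as good as row b on every column."""
    return a[0] >= b[0] and a[1] <= b[1] and a[2] <= b[2]

def best_bandwidth(rows):
    """Return the bandwidth whose row dominates every row, or None if there is none."""
    for bandwidth, row in rows.items():
        if all(dominates(row, other) for other in rows.values()):
            return bandwidth
    return None

print(best_bandwidth(TABLE6))  # prints 1800
```

Under that assumption the 1800 m row is best on all three columns at once, so the choice does not depend on how the three measures are weighted against each other.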
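Table 7. Variables’ importance and ranking results.
| Dimensions | Variables | Importance Value | Ranking |
| --- | --- | --- | --- |
| Density | D_BLD | 0.1277 | 2 |
| Density | D_Pop | 0.2351 | 1 |
| Density | D_POI | 0.0782 | 5 |
| Diversity | D_Life | 0.0295 | 12 |
| Diversity | D_Sport | 0.0008 | 15 |
| Diversity | D_Trans | 0.0105 | 13 |
| Design | R_BLD | 0.0494 | 9 |
| Design | R_Vege | 0.0405 | 11 |
| Design | R_Sky | 0.0423 | 10 |
| Design | R_Ped | 0.0513 | 8 |
| Design | Expo_Lake | 0.0982 | 3 |
| Design | Expo_River | 0.0030 | 14 |
| Distance to Transit | N_Bus | 0.0608 | 7 |
| Distance to Transit | N_Metro | 0.0004 | 16 |
| Accessibility | NAch_Global | 0.0778 | 6 |
| Safety | NAch_7k | 0.0870 | 4 |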
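As a quick way to read Table 7 at the level of dimensions rather than single variables, the following sketch sums the importance values within each dimension and derives a rank for each variable from its value. The numbers are copied from the table; the function names `dimension_totals` and `rank_variables` are hypothetical, and summing importances by dimension is an illustrative aggregation rather than a step reported in the paper.

```python
# Importance values copied from Table 7, grouped by dimension.
IMPORTANCE = {
    "Density": {"D_BLD": 0.1277, "D_Pop": 0.2351, "D_POI": 0.0782},
    "Diversity": {"D_Life": 0.0295, "D_Sport": 0.0008, "D_Trans": 0.0105},
    "Design": {"R_BLD": 0.0494, "R_Vege": 0.0405, "R_Sky": 0.0423,
               "R_Ped": 0.0513, "Expo_Lake": 0.0982, "Expo_River": 0.0030},
    "Distance to Transit": {"N_Bus": 0.0608, "N_Metro": 0.0004},
    "Accessibility": {"NAch_Global": 0.0778},
    "Safety": {"NAch_7k": 0.0870},
}

def dimension_totals(table):
    """Sum the importance values of all variables in each dimension."""
    return {dim: round(sum(vals.values()), 4) for dim, vals in table.items()}

def rank_variables(table):
    """Rank variables from most to least important (1 = most important)."""
    flat = {name: v for vals in table.values() for name, v in vals.items()}
    ordered = sorted(flat, key=flat.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}

totals = dimension_totals(IMPORTANCE)
ranks = rank_variables(IMPORTANCE)
for dim, total in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(f"{dim}: {total:.4f}")
```

Aggregated this way, the three density variables together account for the largest share of importance, followed by the design variables, which matches the emphasis the conclusions place on the density dimension.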
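Table 8. The effect variation intervals of 16 BE variables.
| Dimensions | Variables | Effect Type | Units | Promotion Interval | Inhibition Interval | Positive Slope | Negative Slope | Neutral Slope |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Density | D_BLD | Nonlinear | - | <10% | >10% | - | 0–70% | >70% |
| Density | D_Pop | Nonlinear | Persons/km² | <8000 | >8000 | - | 0–15,000 | >15,000 |
| Density | D_POI | Nonlinear | Nums/m | <0.03 | >0.03 | - | 0–0.1 | >0.1 |
| Diversity | D_Life | Nonlinear | Nums/m | 0.002–0.01 | <0.001; >0.01 | 0–0.004 | 0.004–0.012 | >0.012 |
| Diversity | D_Sport | Nonlinear | Nums/m | >0.002 | <0.002 | >0.001 | <0.001 | None |
| Diversity | D_Trans | Nonlinear | Nums/m | >0.002 | <0.002 | All | - | - |
| Design | R_BLD | Nonlinear | - | <4% | >4% | - | <20% | >20% |
| Design | R_Vege | Nonlinear | - | >50% | <50% | >30% | <30% | - |
| Design | R_Sky | Nonlinear | - | >15% | <15% | >5% | <5% | - |
| Design | R_Ped | Nonlinear | - | >10% | <10% | >8% | - | 0–8% |
| Design | Expo_Lake | Linear | - | >50% | <50% | All | - | - |
| Design | Expo_River | Linear | - | >20% | <20% | >40% | - | <40% |
| Distance to Transit | N_Bus | Nonlinear | Nums | >1 | <1 | >1 | - | <1 |
| Distance to Transit | N_Metro | Linear | Nums | >0.4 | <0.4 | All | - | - |
| Accessibility | NAch_Global | Nonlinear | - | <0.75; >1.42 | 0.75–1.42 | >1.08 | <1.08 | - |
| Safety | NAch_7k | Nonlinear | - | <0.1 × 10⁶ | >0.1 × 10⁶ | - | 0–0.4 × 10⁶ | >0.4 × 10⁶ |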
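The promotion and inhibition intervals in Table 8 amount to a threshold rule for each variable. The sketch below encodes that rule for four variables that have a single threshold, with the thresholds copied from the table. The function name `effect_zone`, the dictionary layout, and the handling of a value that falls exactly on a threshold are assumptions made for illustration.

```python
# Promotion thresholds copied from Table 8 for four single-threshold variables.
# "below" means values under the threshold fall in the promotion interval;
# "above" means values over the threshold do.
THRESHOLDS = {
    "D_BLD": (10.0, "below"),    # percent
    "D_Pop": (8000.0, "below"),  # persons per square kilometre
    "R_Vege": (50.0, "above"),   # percent
    "R_Sky": (15.0, "above"),    # percent
}

def effect_zone(variable, value):
    """Classify a value as lying in the promotion or inhibition interval."""
    threshold, promote_side = THRESHOLDS[variable]
    if value == threshold:
        # The table gives open intervals only, so the boundary is left unclassified.
        return "boundary"
    is_below = value < threshold
    promotes = is_below if promote_side == "below" else not is_below
    return "promotion" if promotes else "inhibition"

# Sample means from Table 5 for building density and green view ratio.
print(effect_zone("D_BLD", 17.11))   # prints inhibition
print(effect_zone("R_Vege", 27.02))  # prints inhibition
```

Evaluated at the sample means reported in Table 5, both building density and the green view ratio fall on the inhibition side of their thresholds.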
Table 9. Comparison between the main effects of certain variables and the global variable importance ranking.
| Variables | Ranking of Main Effects | Ranking of Global Importance |
| --- | --- | --- |
| Expo_Lake | 1 | 3 |
| D_BLD | 2 | 2 |
| D_Pop | 3 | 1 |
| D_POI | 4 | 5 |
| NAch_7k | 5 | 4 |
| R_BLD | 6 | 9 |
| R_Ped | 7 | 8 |
| NAch_Global | 8 | 6 |
Table 10. Planning Recommendations for Directly Intervenable Indicators.
| Indicators | Basic Requirements | Optimal Interval | Planning Recommendations |
| --- | --- | --- | --- |
| R_Vege | - | 50–85% | Considering that tall trees may affect the sky view ratio by shading the sky, it is recommended to add diverse low-growing plants to enrich the landscape experience of the street. |
| R_Sky | - | 15–50% | Newly built blocks must pay attention to building height and the continuity of building facades (with reasonable openings set in continuous building facades to ensure the openness of space); at the same time, optimize the proportion of facilities such as tall street trees and billboards. |
| R_Ped | 10% | - | Add non-motorized traffic facilities such as dedicated street walkways, with their length accounting for more than 10% of the street width. |
| D_POI | | >0.01 | Ensure that there is at least one daily service facility per 100 m of street to provide basic supplies (such as drinking water) for MPA activities through on-site purchases, while also maintaining a basic level of human activity on the street. |
Note: R_Vege and R_Sky conflict with each other to a certain extent, and the optimal interval for R_Vege takes this into account. Specifically, the maximum value of the optimal interval for the vegetation view ratio should not affect the minimum value of the optimal interval for the sky view ratio.
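The targets in Table 10 can be turned into a simple screening check for a street segment. The following is a minimal sketch under stated assumptions: the function name `check_street` is hypothetical, the interval endpoints are treated as inclusive, and the 10% value for R_Ped is treated as a minimum, none of which the table specifies.

```python
# Targets copied from Table 10. View ratios are in percent; D_POI is in Nums/m.
def check_street(r_vege, r_sky, r_ped, d_poi):
    """Return the indicators of a street segment that miss the Table 10 targets."""
    unmet = []
    if not 50.0 <= r_vege <= 85.0:   # optimal interval for the green view ratio
        unmet.append("R_Vege")
    if not 15.0 <= r_sky <= 50.0:    # optimal interval for the sky view ratio
        unmet.append("R_Sky")
    if r_ped < 10.0:                 # 10% requirement, read here as a minimum
        unmet.append("R_Ped")
    if d_poi <= 0.01:                # more than one facility per 100 m of street
        unmet.append("D_POI")
    return unmet

# Sample means from Table 5, used as an illustrative "average" street.
print(check_street(r_vege=27.02, r_sky=8.69, r_ped=3.31, d_poi=0.05))
# prints ['R_Vege', 'R_Sky', 'R_Ped']
```

Applied to the sample means in Table 5, the check flags the three view-ratio indicators and passes POI density, which suggests where street-scale interventions would have the most room to act on a typical segment.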
Table 11. Planning Recommendations for Indirectly Intervenable Indicators.
| Variables | Indicator Interval | Planning Recommendations |
| --- | --- | --- |
| D_Pop | <2000 p/km² | Maintain the basic pedestrian flow on the street by setting an appropriate number of public facilities that can gather people (such as street plazas with good environments), so as to provide basic human presence and a sense of security for MPA activities. |
| D_Pop | 0–60,000 p/km² | More than 30% of BLD and a D_POI greater than 0.13 can promote MPA through synergistic effects. |
| D_BLD | <10% | Improve road network accessibility or add public transportation facilities to facilitate public access to such low-density spaces. |
| D_BLD | >10% | Increase D_POI to above 0.12, while enhancing the richness of POI types. |
| Expo_Lake | 0–80% | Ensure basic pedestrian walkway facilities (R_Ped > 3.58). |
| Expo_Lake | 80–100% | Attempt to reduce the traffic flow in the area through certain traffic control measures. |
Note: The indicator interval refers to the actual situation rather than the intervention target.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Shen, H.; Zhang, J.; Li, A.; Liu, Y. From Perception to Behavior: Exploring the Impact Mechanism of Street Built Environment on Mobile Physical Activity Using Multi-Source Data and Explainable Machine Learning. Land 2025, 14, 2315. https://doi.org/10.3390/land14122315

AMA Style

Shen H, Zhang J, Li A, Liu Y. From Perception to Behavior: Exploring the Impact Mechanism of Street Built Environment on Mobile Physical Activity Using Multi-Source Data and Explainable Machine Learning. Land. 2025; 14(12):2315. https://doi.org/10.3390/land14122315

Chicago/Turabian Style

Shen, Hao, Jian Zhang, Ali Li, and Yaoqian Liu. 2025. "From Perception to Behavior: Exploring the Impact Mechanism of Street Built Environment on Mobile Physical Activity Using Multi-Source Data and Explainable Machine Learning" Land 14, no. 12: 2315. https://doi.org/10.3390/land14122315

APA Style

Shen, H., Zhang, J., Li, A., & Liu, Y. (2025). From Perception to Behavior: Exploring the Impact Mechanism of Street Built Environment on Mobile Physical Activity Using Multi-Source Data and Explainable Machine Learning. Land, 14(12), 2315. https://doi.org/10.3390/land14122315
