GeoShapley-Based Explainable GeoAI for Sustainable Community Satisfaction Assessment: Evidence from Chengdu, China

Zhang, Wennan; Zhang, Li; Li, Jinyi; Guo, Sui; Hu, Qixuan; Zhou, Rui

doi:10.3390/su172210261

by Wennan Zhang ¹, Li Zhang ², Jinyi Li ¹, Sui Guo ¹, Qixuan Hu ¹ and Rui Zhou ¹,*

¹ College of Architecture and Urban-Rural Planning, Sichuan Agricultural University, Chengdu 611800, China

² School of Architecture and Urban Planning, Shenzhen University, No. 3688 Nanhai Avenue, Nanshan District, Shenzhen 518060, China

* Author to whom correspondence should be addressed.

Sustainability 2025, 17(22), 10261; https://doi.org/10.3390/su172210261

Submission received: 14 October 2025 / Revised: 5 November 2025 / Accepted: 12 November 2025 / Published: 17 November 2025

(This article belongs to the Section Sustainable Urban and Rural Development)

Abstract

Understanding the spatial drivers of community satisfaction is crucial for achieving inclusive and sustainable urban development. However, traditional spatial regression models often assume linearity and fail to capture complex, spatially heterogeneous relationships. This study integrates a GeoShapley-based explainable GeoAI framework with the XGBoost algorithm to identify and quantify spatially varying factors influencing community satisfaction in Chengdu, China. By incorporating geographic coordinates as explicit spatial features, the GeoShapley method decomposes model outputs into intrinsic spatial effects and feature-specific interaction effects, enabling the interpretation of how and where each factor matters. Results show significant spatial clustering (Moran’s I = 0.60, p < 0.01) and a distinct south–north gradient in satisfaction. Built environment indicators—including building coverage ratio (BCR), walkability index (WI), and distance to green space (DGS)—exhibit nonlinear relationships and clear thresholds (e.g., BCR > 0.15, DGS > 590 m). Social vitality (Weibo check-ins) emerges as a key local differentiator, while education and healthcare accessibility remain spatially uniform. These findings reveal a dual structure of public service homogenization and spatial-quality heterogeneity, highlighting the need for place-specific, precision-oriented community renewal. The proposed GeoXAI framework provides a transferable pathway for integrating explainable AI into spatial sustainability research and urban governance.

Keywords:

GeoXAI; GeoShapley; community satisfaction; spatial heterogeneity; explainable AI; urban sustainability; XGBoost

1. Introduction

The accelerating pace of global urbanization calls for urban renewal initiatives that continuously enhance residents’ sense of fulfillment, happiness, and security [1,2]. As awareness of social participation grows, urban renewal has evolved from mere physical reconstruction into a comprehensive process encompassing economic restructuring, the reshaping of social relationships, and the adjustment of spatial functions. In this process, ensuring community quality and improving residents’ quality of life have become core concerns in urban renewal practice [3]. Communities are the building blocks of cities, and community satisfaction (CS) directly reflects residents’ quality of life and serves as a crucial metric for urban sustainability. However, the contradiction between current one-size-fits-all community renewal policies and the spatial heterogeneity of resident needs is becoming increasingly apparent.

Research indicates that CS is a nonlinear, spatially heterogeneous phenomenon influenced by multiple factors including the built environment, supporting facilities, and geographical location. First, community satisfaction is affected by diverse factors, and these influences may vary across regions [4,5,6,7,8,9,10,11]. Socially sustainable communities are characterized by creating livable, inclusive, and equitable environments that meet all residents’ needs while upholding social justice principles [12]. To achieve effective resource allocation and enhance resident satisfaction during renewal processes, it is essential to assess renewal initiatives based on residents’ needs [13]. However, residents’ demands for community services and environmental quality often exhibit diverse characteristics and spatial variations. The direction and intensity of how different environmental elements affect residents’ satisfaction vary across regions. For instance, natural environmental comfort enhances satisfaction in western China, while environmental health and healthcare levels have a more pronounced impact in the southeast [14]. Refined community renewal and governance require not only identifying overarching issues but also addressing specific needs across different geographic locations. Second, the relationship between CS and its drivers is not always linear; nonlinear relationships and threshold effects exist. Research demonstrates that green spaces exhibit nonlinear and threshold effects on the well-being of vulnerable groups [15]. Most environmental factors in public spaces of older residential areas show nonlinear relationships with overall resident satisfaction, consistent with prior findings [16]. This aspect warrants significant attention. Therefore, this study focuses on a critical question: How do spatial heterogeneity and nonlinear interactions jointly shape CS patterns?

However, traditional research methods face limitations in explaining the influence of community satisfaction factors: First, spatial regression models like GWR and MGWR, while accounting for geographic effects, typically rely on linear assumptions, making it difficult to capture complex nonlinear relationships among high-dimensional variables [17,18]. Second, addressing this issue, the recent literature has extensively combined machine learning with interpretable techniques like SHAP and partial dependence plots to explore causal mechanisms and threshold effects among variables [19,20,21,22,23,24]. While this offers a powerful alternative, it neglects spatial heterogeneity, explaining only globally dominant features and failing to provide context-specific interpretations.

The GeoXAI (Geospatial Explainable AI) approach, integrating machine learning and GeoShapley, offers a solution to this challenge [25]. Its core lies in explicitly incorporating geographic information—such as latitude and longitude—as features alongside spatial proximity or other geographically derived variables into machine learning models, thereby implicitly capturing spatial dependencies and heterogeneity. GeoShapley simultaneously accounts for joint features and interaction effects, enabling model-agnostic measurement of spatial and non-spatial effects [26]. It decomposes and calculates Shapley values at geographic units to quantify the differential contributions of each factor across locations. For instance, park greenery may be critically important for satisfaction in downtown communities but less crucial for neighborhoods closer to peripheral areas—a distinction only GeoShapley can capture, overcoming limitations of traditional methods. Thus, GeoXAI effectively addresses shortcomings in conventional spatial regression and general interpretability approaches.

Most existing studies fail to adequately account for the spatial heterogeneity of CS, resulting in policy recommendations that lack specificity. This research makes three key contributions:

Methodological: Uses XGBoost to capture nonlinear relationships between community satisfaction and influencing factors and integrates GeoShapley to quantify the spatial heterogeneity of these relationships—filling the gap of “neglecting spatial effects in interpretable machine learning” in existing studies.
Mechanism level: Identifies thresholds and interactions of variables on CS from both global and local perspectives.
Application level: Combines GeoShapley decomposition with the quantified differential contributions of spatial units to propose more targeted community governance optimization strategies.

2. Literature Review

2.1. Theoretical Foundations

This study integrates theories of life satisfaction, livability, spatial justice, and sustainable development. Quality of life (QoL) research is closely tied to urban livability, which is assessed primarily across three tiers: top-level and bottom-level evaluations focus on macro- and micro-scale assessments, while mid-level evaluations (e.g., at the community level) provide detailed livability insights that directly support urban planning and governance [27]. SDG 11 calls on all countries to strengthen inclusive and sustainable urbanization by 2030 and to enhance capacities for participatory, integrated, and sustainable human settlement planning and management. Social sustainability requires developing inclusive, cohesive, and participatory communities in which residents’ life satisfaction plays a pivotal role. This satisfaction influences individual perceptions and behaviors, thereby strengthening social cohesion and sustainable urban governance [28]. Urban geography research indicates that spatial heterogeneity inevitably exists within communities. The spatial distribution of urban functions and services significantly affects livability, and disparities in access to essential infrastructure profoundly affect residents’ quality of life. Research indicates that the uneven distribution of healthcare, recreational spaces, and retail services ultimately reduces overall urban satisfaction [29]. Such unevenness contradicts the widely held concept of spatial justice, hinders resource fairness and shared development, and underscores the urgent need for an equitable distribution of amenities. Spatial justice integrates spatial and social dimensions and examines cross-regional inequalities in environmental justice, education, healthcare, transportation, and parks [30]. It is therefore highly relevant to community sustainability and calls for urban planners to engage proactively in spatial justice initiatives [28].

2.2. Research Progress and Methodological Limitations in Factors Influencing Community Satisfaction

Exploring the mechanisms influencing community satisfaction has long been a research focus in urban geography, planning, and sociology. Existing studies primarily follow two approaches: identifying influencing factors and analyzing causal mechanisms. However, significant limitations persist in integrating spatial heterogeneity and nonlinear relationships within these analyses.

Early studies predominantly relied on linear models such as ordinary least squares (OLS) and multiple linear regression [31], focusing on the average effects of built environment factors (e.g., green space accessibility, transportation convenience) and socioeconomic attributes (e.g., income levels, educational resources) on community satisfaction. For instance, OLS regression revealed that cultural asset participation exhibits a sustained positive correlation with subsequent life satisfaction and psychological well-being, while negatively correlating with psychological distress [32]. While such studies provide foundational evidence for identifying key determinants, their assumptions of spatially homogeneous and linear relationships struggle to capture the complexity of urban spaces.

To overcome the assumption of spatial homogeneity, spatial regression models such as Geographically Weighted Regression (GWR) and Multiscale Geographically Weighted Regression (MGWR) have been widely applied [33]. These methods reveal spatial heterogeneity in influencing factors by allowing regression coefficients to vary with location. For instance, GWR has uncovered varying degrees of association between the perceived recreational benefits of community green spaces and visual green environments across different locations. MGWR has revealed spatial heterogeneity in the effects of vegetation, roads, fences, and other street elements in studies of residents’ spatial perceptions and quality of life. Geographically and Temporally Weighted Regression (GTWR) additionally captures spatiotemporal heterogeneity, and related studies have found significant spatiotemporal heterogeneity in the impact of land use on community vitality [34]. Subsequently, Generalized Geographically and Temporally Weighted Regression (GGTWR) was proposed to extend GTWR by integrating generalized linear models and enabling more flexible bandwidth selection [35,36]. However, the core limitation of this model family remains its reliance on linear relationship assumptions, which makes it difficult to handle complex interactions among high-dimensional variables and nonlinear thresholds. For social sustainability research, this limitation may lead to misjudgments of spatial equity and thereby reduce the precision of public service equalization policies.

In recent years, machine learning models such as Random Forest and Gradient Boosting Trees (e.g., XGBoost) have emerged as vital tools for community satisfaction research due to their superior handling of nonlinear relationships and high-dimensional interactions. Existing research comparing linear regression and gradient-boosted decision trees has concluded that nonlinear associations between park attributes and satisfaction are prevalent in Chengdu. Ignoring nonlinearity risks underestimating the adverse impact of certain park attributes (e.g., sanitation and recreational facilities) on satisfaction when they perform poorly. Future studies should relax linearity assumptions, ideally through machine learning approaches [37]. The introduction of interpretable techniques like SHAP and PDP further reveals the global importance of variables and nonlinear mechanisms. For instance, using XGBoost and SHAP, it was proposed that expanding green spaces in low-income communities is a feasible and cost-effective strategy to address green space access inequality [38]. However, existing machine learning research generally neglects explicit modeling of spatial heterogeneity. Even when models incorporate spatial features like latitude and longitude, their interpretations often remain confined to global average effects. This makes it difficult to answer critical questions such as: in which specific areas does a given factor exert stronger or weaker influence? Do core drivers differ across regions? This limitation starkly contradicts demands for spatial equity and sustainable community governance. Merely knowing that green space proximity matters for satisfaction, without understanding its differential impact in older versus newer neighborhoods, prevents the development of targeted ecological space optimization strategies. 
Although recent advancements like Geographically Weighted Regression Gradient Boosting (GWRBoost) leverage iterative gradient boosting to capture nonlinear relationships and complex interactions between variables, while allowing model coefficients to dynamically vary with spatial location through geographically weighted mechanisms, they often fail to provide a unified method that is both highly interpretable and broadly applicable across diverse geographic contexts [31].

In summary, existing methods face dual challenges in analyzing the mechanisms influencing community satisfaction: the difficulty of simultaneously addressing “spatial heterogeneity” and “nonlinear relationships.” Traditional spatial regression models (e.g., GWR) can capture spatial heterogeneity but are constrained by linear assumptions, while machine learning models can handle nonlinearity but lack nuanced explanations for spatial heterogeneity. This gap hinders research from precisely supporting sustainable community planning practices, thereby constraining the formulation of context-specific renewal strategies. Addressing this gap, this study integrates the GeoXAI framework combining XGBoost and GeoShapley to respond to the urgent demands for spatial justice and precision governance, providing scientific grounds for the equitable allocation of public resources and differentiated community renewal strategies.

3. Materials and Methods

3.1. Study Area

As one of China’s four megacities with populations exceeding 20 million, Chengdu’s rapid economic growth has been accompanied by various urban challenges stemming from rapid population concentration, including traffic congestion and ecological pressures. To address these challenges, Chengdu was selected in 2021 as one of China’s first pilot cities for urban renewal. This initiative promotes improvements in living environments, infrastructure development, renovation of aging residential areas, and enhancement of public services within renewal units. This reflects Chengdu’s significance within China’s urban development landscape, making the enhancement of residents’ well-being an urgent priority. This study focuses on 839 community units within Chengdu’s central urban district, as illustrated in Figure 1. As the city’s core area, the central district features diverse community types where new and old neighborhoods coexist, making it an ideal subject for examining spatial heterogeneity.

3.2. Research Framework

This study constructs an evaluation indicator system supported by multimodal data, grounded in theoretical analysis, policy literature review, and systematic literature review. Subsequently, through machine learning model comparison, XGBoost is ultimately selected as the optimal explanatory model. Finally, we employ machine learning methods to uncover the interactive relationships between evaluation indicators and CS, proposing recommendations for community renewal efforts. The framework is illustrated in Figure 2.

3.3. Data Sources and Processing

3.3.1. Data and Preprocessing

This study utilized multiple data sources (Table 1), including OSM road network data, POI data, housing price data, NPP-VIIRS-like nighttime light data, Weibo check-in data, street view image data, building vector data, and park and green space data. Housing price data were scraped from the Anjuke website using Python 3.11.13, capturing price and address information for new and resale properties (listings and residential complexes) in Chengdu. The Geocoding API of the Baidu Maps Web Service was used to derive latitude and longitude coordinates from the address of each price point. After averaging the price data by corresponding coordinates and addresses, 13,938 price records were obtained. Inverse distance weighting was then applied to transform the scattered points into a continuous surface, and the interpolated values were used to calculate the average housing price for each community, since spatially adjacent residents tend to have similar housing prices [39]. After collection, Weibo check-in data underwent the following cleaning steps: (1) character standardization, including converting full-width to half-width characters and traditional to simplified Chinese; (2) removal of HTML tags, URL links, @mentions and hashtags, emoticons, and redundant punctuation; (3) filtering of invalid content (e.g., marker texts like “[video]” or “[image]”), deletion of short texts of fewer than 3 characters, and filtering of meaningless content consisting solely of numbers or pure time formats.
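The three cleaning steps can be illustrated with a minimal Python sketch. The regular expressions, the helper name `clean_post`, and the use of NFKC normalization for the full-width to half-width conversion are assumptions made for illustration and are not the authors’ actual rules. Traditional-to-simplified conversion is omitted because it requires an external converter such as OpenCC.

```python
import re
import unicodedata

# All patterns below are illustrative assumptions, not the authors' exact rules.
HTML_TAG = re.compile(r"<[^>]+>")
URL = re.compile(r"https?://\S+")
MENTION = re.compile(r"@[\w\-]+")
HASHTAG = re.compile(r"#[^#\s]+#?")
MARKER = re.compile(r"\[(?:video|image)\]", re.IGNORECASE)
EMOTICON = re.compile(r"\[[^\[\]\s]{1,8}\]")  # bracketed emoticon codes, e.g. [doge]
REPEATED_PUNCT = re.compile(r"([!?.,;:~])\1+")
DIGITS_OR_TIME = re.compile(r"[\d\s:/.\-]+")
MIN_LENGTH = 3


def clean_post(text):
    """Return the cleaned post, or None if it should be filtered out."""
    # Step 1: NFKC folds full-width letters, digits and punctuation to half-width.
    text = unicodedata.normalize("NFKC", text)
    # Step 2: strip markup, links, mentions, hashtags, markers and emoticons.
    for pattern in (HTML_TAG, URL, MENTION, HASHTAG, MARKER, EMOTICON):
        text = pattern.sub(" ", text)
    text = REPEATED_PUNCT.sub(r"\1", text)       # collapse runs like "!!!" to "!"
    text = re.sub(r"\s+", " ", text).strip()
    # Step 3: drop texts that are too short or contain only numbers or times.
    if len(text) < MIN_LENGTH or DIGITS_OR_TIME.fullmatch(text):
        return None
    return text
```

Posts for which `clean_post` returns `None` would be discarded before sentiment scoring.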

3.3.2. Community Satisfaction

CS is a core metric for assessing residents’ quality of life and community development levels. Research in this field has long followed two distinct approaches: subjective perceptions and objective environmental factors. Existing literature indicates that a comprehensive understanding of CS requires integrating residents’ subjective evaluations with the objective physical environment that shapes their lived experiences [40]. Objective measures reveal underlying structural factors influencing satisfaction through quantifiable spatial and socioeconomic indicators, while subjective measures directly reflect residents’ psychological states. Together, they form a comprehensive framework for assessing community well-being.

Within the objective dimension, existing research typically focuses on two aspects: the built environment and the socioeconomic environment [39]. The former examines factors like accessibility to public services and green spaces that affect daily convenience and comfort [41], while the latter reflects community resource levels and development status through indicators such as housing prices and economic vitality [9]. Building upon this theoretical framework and drawing from prior work, this study selects three representative indicators to quantify objective satisfaction. Recreational facilities serve as vital carriers of community vitality. Their abundance and accessibility influence residents’ leisure quality of life and social interaction opportunities [42,43], making them key indicators in built environment studies for measuring convenience and livability. The Proximity to Recreations (PR) metric directly reflects higher-level leisure and social needs within Maslow’s Hierarchy of Needs. Housing Value (HV) characterizes a community’s socioeconomic status and environmental quality, serving as a composite proxy for resource endowment. Higher housing prices typically indicate superior resource endowment and higher socioeconomic status among residents [44]. Nighttime Light (NTL) indices have been extensively validated across studies as highly correlated with regional economic activity intensity, population density, and urbanization levels [45]. Higher nighttime light intensity typically indicates a more vibrant nighttime economy, better public services, and a more prosperous urban lifestyle.

Regarding subjective satisfaction, the study draws on previous sentiment analysis methods based on social media data [9]. It employs the Chinese pre-trained language model “Chinese-lert-base,” released by the Joint Laboratory of Harbin Institute of Technology and iFLYTEK Research (HFL), to analyze sentiment in Weibo texts. LERT uses a linguistically-informed pre-training (LIP) strategy that trains on three types of linguistic features alongside the original masked language modeling task, which is reported to improve performance over comparable pre-trained language models. This study achieves transfer learning through fine-tuning, randomly dividing the training and auxiliary data into training and validation sets at an 8:2 ratio for phased adaptive training. After 23 rounds of training with cross-entropy loss, the model converged, yielding a fine-tuned Weibo sentiment model. This model was used to assign positive or negative sentiment labels (values of 1 and −1) and to calculate confidence levels for community Weibo texts from 2023 to present. Validation on the test set achieved an F1 score of 87.98%. Additionally, drawing on prior methodologies for public well-being studies [46,47], we integrated the Satisfaction Probability Index (SPI) and the Satisfaction Intensity Index (SII) through weighted geometric averaging to establish the Weibo Sentiment Index (WSI).

SPI: Measures the prevalence of positive sentiment expression within a community.

\mathrm{SPI} = \frac{\sum_{i=1}^{n} I(\mathrm{label}_{i} = 1) \times \mathrm{confidence}_{i}}{n},

(1)

SII: Measures the intensity level of positive emotional expression.

\mathrm{SII} = \frac{\sum_{i=1}^{n} I(\mathrm{label}_{i} = 1) \times \mathrm{confidence}_{i}}{\sum_{i=1}^{n} |\mathrm{label}_{i}| \times \mathrm{confidence}_{i}},

(2)

Here, n denotes the total number of microblogs within the community, label_i represents the sentiment label of the ith microblog (1 for positive, −1 for negative), and confidence_i indicates the model’s predicted confidence level.
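As an illustration of Equations (1) and (2), the following sketch computes SPI and SII for one community and combines them with a weighted geometric mean. The weight `w_spi` is a hypothetical placeholder, since the weights used for the WSI are not stated in the text.

```python
import math


def spi(labels, confidences):
    """Equation (1): confidence-weighted share of positive posts among all n posts."""
    positive = sum(c for l, c in zip(labels, confidences) if l == 1)
    return positive / len(labels)


def sii(labels, confidences):
    """Equation (2): positive confidence mass divided by total confidence mass."""
    positive = sum(c for l, c in zip(labels, confidences) if l == 1)
    total = sum(abs(l) * c for l, c in zip(labels, confidences))
    return positive / total


def wsi(spi_value, sii_value, w_spi=0.5):
    """Weighted geometric mean of SPI and SII; w_spi = 0.5 is an assumed weight."""
    return spi_value ** w_spi * sii_value ** (1.0 - w_spi)


# Four posts in one community: labels are 1 (positive) or -1 (negative).
labels = [1, 1, -1, 1]
confidences = [0.9, 0.8, 0.7, 0.6]
community_wsi = wsi(spi(labels, confidences), sii(labels, confidences))
```

In this toy case SPI is 2.3/4 and SII is 2.3/3.0, so SII exceeds SPI whenever negative posts carry less confidence mass than their share of the post count.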

The four indicators were standardized using minimum-maximum normalization, followed by entropy weighting to calculate their respective weights (see Table 2). Finally, CS was computed via weighted summation.
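The normalization and weighting step can be sketched as follows, using the standard entropy weight method. The function names and the toy data are illustrative assumptions; the actual weights are those reported in Table 2.

```python
import math


def min_max(column):
    """Rescale one indicator to [0, 1]; assumes the column is not constant."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]


def entropy_weights(columns):
    """Entropy weight method: indicators with more dispersion get larger weights."""
    n = len(columns[0])
    k = 1.0 / math.log(n)
    divergences = []
    for column in columns:
        total = sum(column)
        shares = [v / total for v in column]
        # Terms with a zero share contribute nothing to the entropy.
        entropy = -k * sum(p * math.log(p) for p in shares if p > 0)
        divergences.append(1.0 - entropy)
    norm = sum(divergences)
    return [d / norm for d in divergences]


def weighted_sum(columns, weights):
    """Composite score per community as the weighted sum of its indicators."""
    n = len(columns[0])
    return [sum(w * col[i] for w, col in zip(weights, columns)) for i in range(n)]


# Toy data: two indicators observed in four communities.
raw = [[1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 10.0, 40.0]]
normalized = [min_max(col) for col in raw]
weights = entropy_weights(normalized)
scores = weighted_sum(normalized, weights)
```

The second toy indicator is concentrated in a single community, so it has lower entropy and receives the larger weight.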

3.3.3. Influencing Factor Selection

To explore the mechanisms influencing CS, this study synthesizes classic indicators widely used in built environment research [9,48,49,50]. It preliminarily constructs an indicator system comprising 32 potential influencing factors across multiple dimensions, including Social Vitality, Locational Attributes, and Public Service Accessibility. In urban studies, the “5D” framework (density, diversity, design, destination accessibility, and distance to transit) has gained widespread recognition for its broad coverage [51]. Some scholars argue that urban form elements such as development intensity, high density, mixed land use, permeable small-scale blocks, and connected streets constitute a “supportive urban environment” that is positively correlated with improved quality of life [52]. Within our research context, the Locational Attributes dimension represents the distance from neighborhoods to central business districts (CBDs), a factor previously shown to significantly influence property values [31]. The Transport and Infrastructure dimension considers road density, where moderately high road density typically correlates with greater walkability, improved access to local services, and more vibrant public spaces [53]. The Building Morphology and Land Use dimension primarily describes the physical spatial attributes of the community itself, such as development intensity (floor area ratio, building density) and functional mix (land use diversity), which directly affect residents’ access to sunlight, ventilation, visual openness, and the availability of ground-level public spaces. Social vitality reflects the intensity and attractiveness of actual social and economic activities occurring within the community; a strong community atmosphere enhances identity and satisfaction.
The Public Service Accessibility dimension addresses geospatial inequities by measuring the spatial distribution and availability of service resources, including facility density and distance to the nearest facilities. Research indicates that spatial density and proximity (the concentration of facilities such as schools, parks, and healthcare centers within a specific area and their proximity to the center of the study zone) correlate with greater spatial justice [54]. In addition, this study introduces street perception indicators based on street view imagery (SVI). SVI supports the analysis of human perceptions (e.g., whether a place feels safe, affluent, or vibrant) and objective factors (e.g., property appreciation, urban crime, and poverty) at a finer granularity [54], reflecting the order of urban organization and form with greater precision. This study employs the Mask2Former image segmentation model to calculate the Green View Index (GVI), Street Equipment Efficiency Index (SEI), and Walkability Index (WI) from each neighborhood’s street view images. These metrics measure the smart efficiency of street facilities and the development level of pedestrian-friendly spaces.
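To make the street view indicators concrete, the sketch below computes a Green View Index from a segmentation label map, under the common assumption that GVI is the share of pixels classified as vegetation. The class ids, the function names, and the averaging over images per neighborhood are hypothetical; the text does not specify how the Mask2Former outputs were aggregated.

```python
# Hypothetical class ids for vegetation (e.g., tree and grass) in the label map.
VEGETATION_IDS = {8, 9}


def green_view_index(label_map, vegetation_ids=VEGETATION_IDS):
    """Share of pixels in one segmented image whose class id is vegetation."""
    total = sum(len(row) for row in label_map)
    green = sum(1 for row in label_map for pixel in row if pixel in vegetation_ids)
    return green / total


def neighborhood_gvi(label_maps, vegetation_ids=VEGETATION_IDS):
    """Mean GVI over all street view images sampled in one neighborhood."""
    scores = [green_view_index(m, vegetation_ids) for m in label_maps]
    return sum(scores) / len(scores)


# Two tiny label maps standing in for per-pixel segmentation output.
image_a = [[8, 8, 0, 0], [9, 1, 0, 0]]
image_b = [[0, 0], [0, 8]]
gvi = neighborhood_gvi([image_a, image_b])
```

The same pattern would apply to other pixel-share indicators by swapping in the relevant class ids.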

Although the selected machine learning algorithms can handle multicollinearity to some extent, we preprocessed all variables before modeling to ensure stability and interpretability of results. Through Spearman’s correlation and variance inflation factor (VIF) tests, we excluded variables with VIF ≥ 10. Ultimately, 15 predictor variables were retained for model construction, as shown in Table 3.
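A minimal sketch of the VIF screening is given below. It relies on the identity that the VIF of feature j equals the j-th diagonal element of the inverse correlation matrix. Dropping the worst feature iteratively is an assumption for illustration; the text only states that variables with VIF ≥ 10 were excluded.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), read off the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))


def drop_high_vif(X, names, threshold=10.0):
    """Repeatedly remove the feature with the largest VIF until all fall below threshold."""
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        vif = variance_inflation_factors(X)
        worst = int(np.argmax(vif))
        if vif[worst] < threshold:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return names


# Synthetic example: "c" is almost a copy of "a", so one of the two is removed.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.01 * rng.normal(size=200)
kept = drop_high_vif(np.column_stack([a, b, c]), ["a", "b", "c"])
```

Recomputing the VIFs after each removal avoids discarding both members of a collinear pair when removing one is enough.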

3.4. Machine Learning Models

We compared nine machine learning models, including linear regression, K-Nearest Neighbors (KNN), decision trees, Random Forest, XGBoost, CatBoost, LightGBM, and AdaBoost. To prevent overfitting and ensure model robustness, we combined Bayesian optimization with 5-fold cross-validation for hyperparameter tuning. The optimization process ran 100 iterations of parameter search, with the objective of maximizing the cross-validated R². Five metrics (MAE, RMSE, MSE, MAPE, and R²) were used to evaluate model performance.
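For reference, the five evaluation metrics can be computed as in the sketch below. It uses the standard textbook definitions, with MAPE expressed in percent; the function name is illustrative and the code is not taken from the authors’ pipeline.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, MAPE (in percent) and R² for paired observations."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE is undefined when a true value is zero; this sketch assumes none are.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_total = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - sum(e * e for e in errors) / ss_total
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape, "R2": r2}


metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

In a cross-validation setting, these values would be computed on each held-out fold and then averaged.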

Results indicate that ensemble tree models such as XGBoost, CatBoost, and LightGBM perform best on this dataset, clearly outperforming traditional models such as linear regression and KNN. Hyperparameter tuning improved performance for most ensemble tree models. Notably, when XGBoost’s hyperparameters were adjusted from the defaults to learning_rate = 0.0189, max_depth = 3, colsample_bytree = 0.6, gamma = 0.0, min_child_weight = 1, reg_alpha = 0.00868, reg_lambda = 1.0, and subsample = 0.7001, R² increased from 0.7678 to 0.8200, the highest value among all models (standard deviation of R² across the 5 folds = 0.0079). This configuration combined the highest fitting accuracy with the best stability and was therefore selected for subsequent analysis. Details are presented in Table 4.

XGBoost, as an enhanced version of the GBDT model, is a powerful machine learning technique for large-scale datasets, well-suited for modeling complex nonlinear relationships [55]. As a gradient boosting decision tree model, XGBoost can automatically capture complex nonlinear relationships and interaction effects between features and community satisfaction through the combination of tree structures, without requiring the predefined functional forms typical of linear models. Simultaneously, it incorporates a regularization term into its objective function and supports row and column sampling, enabling it to maintain high accuracy while achieving stronger generalization capabilities. This is crucial for extracting robust patterns from multi-source data. However, its “black box” nature makes it difficult to directly interpret the marginal contribution of individual features to predictions [56], particularly when considering geographic locations. GeoShapley, an extension of the Shapley framework, can reveal the differential contributions of influencing factors across different geographic locations. This approach treats location features as collaborative participants in the model interpretation process. This study incorporates the geographic coordinates LAT and LON of community centroids as composite GEO features within the model to capture spatial heterogeneity. In the community satisfaction model, GeoShapley treats the community’s center coordinates LON/LAT as a single joint feature (GEO) rather than two independent features. This ensures compliance with the Shapley value’s theoretical assumption of “joint players” and enables the decomposition of two key spatial effects. The first is the Intrinsic Spatial Effect, referring to “the inherent contribution of location itself.” This denotes satisfaction differences attributable solely to geographic position, independent of any non-spatial features. 
Its essence lies in making “spatial fixed effects” quantifiable and interpretable, reflecting spatial values inherent to a location that cannot be substituted, such as resource endowments and planning positioning. The second is the Feature-Space Interaction Effect, denoting the “synergistic interaction between non-spatial feature j and geographic location GEO.” This captures variations in the intensity of a single non-spatial feature’s impact on satisfaction across different locations, revealing how non-spatial influences are modulated by geographic contexts (e.g., development intensity, resident demand). The method is implemented with the open-source Python package GeoShapley under Python 3.11.13. The primary formula is given below; for a detailed explanation of the GeoShapley algorithm, please refer to the literature [26].

For a feature Xj, its GeoShapley value is formulated as

\phi_{j} = \sum_{S \subseteq M \setminus \{j\}} \frac{s! \, (p - s - g)!}{(p - g + 1)!} \left( f(S \cup \{j\}) - f(S) \right),

(3)

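where M is the set of players, in which the location features GEO enter as a single joint player; S is a subset of M that excludes j, with size s; p is the total number of features; and GEO is a set of location features of size g. If g = 1, the equation reduces to the individual-player case of the classic Shapley value; g = 2 when the geographical coordinates (u, v) are included as features in the model.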
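The role of the joint GEO player in Equation (3) can be illustrated with a small exact computation. The sketch below is a classic Shapley calculation against a single baseline in which longitude and latitude are switched on together as one player. It is not the GeoShapley package, which averages over a background dataset and additionally reports the interaction terms separately. The toy model, the feature names, and the baseline are assumptions.

```python
from itertools import combinations
from math import factorial, isclose


def shapley_with_joint_geo(model, instance, baseline, players):
    """Exact Shapley values where each player may control several features.

    `players` maps a player name to the tuple of feature names it switches
    from the baseline value to the instance value.
    """
    names = list(players)
    m = len(names)  # number of players, i.e. p - g + 1 in Equation (3)

    def value(coalition):
        x = dict(baseline)
        for player in coalition:
            for feature in players[player]:
                x[feature] = instance[feature]
        return model(x)

    phi = {}
    for j in names:
        others = [k for k in names if k != j]
        total = 0.0
        for s in range(len(others) + 1):
            # s! (p - s - g)! / (p - g + 1)!  with  m = p - g + 1
            weight = factorial(s) * factorial(m - s - 1) / factorial(m)
            for subset in combinations(others, s):
                total += weight * (value(subset + (j,)) - value(subset))
        phi[j] = total
    return phi


# Toy model: an additive term, a purely spatial term and a location-feature interaction.
def toy_model(x):
    return 2.0 * x["bcr"] + x["lon"] * x["lat"] + x["lat"] * x["wi"]


instance = {"lon": 1.0, "lat": 2.0, "bcr": 0.5, "wi": 3.0}
baseline = {"lon": 0.0, "lat": 0.0, "bcr": 0.0, "wi": 0.0}
players = {"GEO": ("lon", "lat"), "bcr": ("bcr",), "wi": ("wi",)}
phi = shapley_with_joint_geo(toy_model, instance, baseline, players)
```

In this toy case the contributions sum to the difference between the prediction for the instance and for the baseline, and the `lat * wi` interaction is split evenly between GEO and `wi`. GeoShapley instead reports such location-feature interactions as a separate term.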

We can calculate GeoShapley values for each individual observation and then average them over the background dataset. The final output of GeoShapley has four components that add up to the model prediction:

$$\hat{y} = \phi_{0} + \phi_{GEO} + \sum_{j = 1}^{p} \phi_{j} + \sum_{j = 1}^{p} \phi_{(GEO, j)}, \tag{4}$$

where $\phi_{0}$ is a constant base value; $\phi_{GEO}$ is a vector with size n measuring the intrinsic location effect in the model; $\phi_{j}$ is a vector with size n for each non-location feature j giving the location-invariant effect to the model; and $\phi_{(GEO, j)}$ is a vector with size n for each non-location feature j giving the spatially varying interaction effect to the model.
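The additive structure of Equation (4) can be checked numerically. The sketch below rebuilds predictions from the four components for three communities and two non-location features. All numbers are hypothetical and serve only to illustrate the bookkeeping; they are not results from this study.

```python
# Hypothetical GeoShapley components for n = 3 communities.
phi_0 = 0.50                                  # constant base value
phi_geo = [0.10, -0.05, 0.20]                 # intrinsic location effect
phi_main = {                                  # location-invariant effects
    "BCR": [-0.03, 0.02, -0.01],
    "WC": [0.04, 0.01, 0.06],
}
phi_inter = {                                 # spatially varying interaction effects
    "BCR": [0.01, -0.02, 0.00],
    "WC": [0.02, 0.00, -0.03],
}


def reconstruct_prediction(i):
    """Sum the four components of Equation (4) for community i."""
    main = sum(values[i] for values in phi_main.values())
    inter = sum(values[i] for values in phi_inter.values())
    return phi_0 + phi_geo[i] + main + inter


def total_feature_effect(name):
    """Main effect plus interaction effect of one feature at every location."""
    return [m + g for m, g in zip(phi_main[name], phi_inter[name])]


y_hat = [reconstruct_prediction(i) for i in range(len(phi_geo))]
wc_total = total_feature_effect("WC")
print(y_hat, wc_total)
```

Adding a feature's main effect to its interaction effect, as `total_feature_effect` does, is one way to read how strongly that feature acts at each location.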

4. Results

4.1. Distribution Characteristics of CS in Chengdu’s Central Urban Area

Through spatial autocorrelation analysis of CS data, we obtained a Moran’s I value of 0.5975, a p-value of 0.001, and a Z-score of 36.7607. This indicates significant spatial clustering in CS levels across the 839 studied communities. Jenks Natural Breaks is a data clustering method that determines classification thresholds by maximizing similarity within each group while maximizing dissimilarity between groups [57]. Previous studies have applied this method to classify community quality levels [3]. We used Jenks Natural Breaks to categorize CS into five natural levels: poor, fair, moderate, good, and excellent. This yielded the resident satisfaction status for 839 communities in Chengdu’s central urban area, as shown in Figure 3. Spatially, CS levels predominantly fall into the moderate, good, and excellent categories, with fair and poor levels accounting for the smallest proportions. Specifically, a “higher in the south, lower in the north” spatial pattern emerges, and communities in central areas generally exhibit higher satisfaction, which decreases outward. Notably, higher-satisfaction communities are highly concentrated in the southern region. These areas represent high-quality new urban districts focused on developing high-tech industries and the new economy, aligning with Chengdu’s southward expansion planning direction. In contrast, communities in the northwest and northeast regions predominantly received average or poor ratings. These areas often suffer from issues such as concentrated aging residential zones and inadequate public facilities, warranting particular attention. This distinct spatial differentiation pattern reveals the unevenness of urban development at the macro level and directly identifies priority target areas for policy resource allocation. The precise identification of low-satisfaction clusters is the primary prerequisite for implementing differentiated urban renewal strategies and holds significant management value in itself.
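The two statistics used above can be sketched in a few lines of Python. This is a minimal illustration on toy data under two assumptions that the text does not specify: a binary contiguity weights matrix for Moran’s I, and an exhaustive dynamic-programming search for the Jenks class boundaries. The significance test behind the reported p-value and Z-score is not reproduced.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / w_sum) * cross / sum(d * d for d in dev)


def jenks_upper_bounds(values, k):
    """Upper bound of each of k classes minimising within-class squared error."""
    xs = sorted(values)
    n = len(xs)
    ps, ps2 = [0.0], [0.0]              # prefix sums of x and x squared
    for x in xs:
        ps.append(ps[-1] + x)
        ps2.append(ps2[-1] + x * x)

    def sse(i, j):                      # squared error of the slice xs[i:j]
        total = ps[j] - ps[i]
        return ps2[j] - ps2[i] - total * total / (j - i)

    inf = float("inf")
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = cost[c - 1][i] + sse(i, j)
                if candidate < cost[c][j]:
                    cost[c][j], cut[c][j] = candidate, i
    bounds, j = [], n
    for c in range(k, 0, -1):           # walk back through the stored cuts
        bounds.append(xs[j - 1])
        j = cut[c][j]
    return sorted(bounds)


# Toy example: four units on a line, each adjacent to its neighbours.
toy_values = [1.0, 2.0, 3.0, 4.0]
toy_weights = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
moran = morans_i(toy_values, toy_weights)
breaks = jenks_upper_bounds([1, 2, 3, 10, 11, 12, 20, 21, 22], 3)
print(moran, breaks)
```

A positive Moran’s I, as in the toy chain, means that neighbouring units tend to have similar values. For the 839 communities, a dedicated spatial statistics library would be the more practical choice than this cubic-time sketch.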

4.2. Global Feature Importance and Nonlinear Effects

To examine the relative importance of the factors affecting CS and their spatial effects, a GeoShapley swarm plot was constructed (Figure 4). The x-axis represents the GeoShapley value, quantifying the contribution of each effect term to the model output and its direction (positive or negative). The y-axis ranks all effect terms in descending order of their mean absolute GeoShapley value, illustrating their global importance. Each point in the plot represents a sample, with its color reflecting either the raw value of the corresponding feature in that sample or the spatial location implied by its geographic coordinates. This visually reveals the monotonic or nonlinear relationship between feature values and model predictions, along with their spatial distribution patterns.
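The ordering used on the y-axis of such a plot can be reproduced with a short calculation. The sketch below ranks effect terms by their mean absolute GeoShapley value; the effect names and all numbers are hypothetical placeholders rather than values from Figure 4.

```python
# Hypothetical per-sample GeoShapley values for four effect terms.
effects = {
    "GEO": [0.30, -0.25, 0.40, -0.35],
    "WC": [0.10, 0.12, -0.08, 0.15],
    "WC x GEO": [0.05, -0.09, 0.07, -0.06],
    "DS": [0.01, -0.01, 0.02, -0.01],
}


def mean_abs(values):
    """Mean absolute contribution, the global importance of one effect term."""
    return sum(abs(v) for v in values) / len(values)


importance = {name: mean_abs(values) for name, values in effects.items()}
ranking = sorted(importance, key=importance.get, reverse=True)
print(ranking)
```

Because the absolute value is taken before averaging, a term whose contributions are large but change sign across locations still ranks high, which is why a widely spread effect such as GEO can lead the list.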

Our analysis reveals that CS is primarily driven by global effects, with local spatial heterogeneity playing a supplementary role. The pure geographic effect GEO exhibits the widest GeoShapley value distribution and the highest mean absolute value, indicating that geographic location itself is the core predictor of CS and that its direct impact varies substantially across space. Among the 15 non-spatial features, however, only a few demonstrate significant spatial non-stationarity. Specifically, WC and its interaction with GEO rank among the top factors, indicating not only high global importance but also significant variation in impact intensity across geographic locations. The main effects of BCR and WI, along with their interactions with GEO, also show high importance, confirming that the influence of the built environment and pedestrian-friendly space is clearly context-dependent. Although DGS does not exert as strong an influence as some other factors, it still contributes to the model’s predictive power, and its effect also varies across space. The interaction effects of the remaining 11 features, such as DS and DM, are of low importance. This indicates that the influence of these factors on CS is relatively spatially homogeneous, meaning that their direction and intensity of impact remain relatively stable across the entire study area.

This finding carries core policy implications and managerial value: it addresses the critical question of where tailored policies are needed and where standardized policies can be implemented. For spatially heterogeneous factors such as BCR and DGS, localized renewal strategies tailored to specific areas should be developed. Conversely, for factors exhibiting largely homogeneous influence patterns (e.g., DS, DM), existing planning approaches can be maintained. This avoids the inefficiency of a one-size-fits-all approach and provides a scientific basis for targeted community governance. The following analysis focuses on the four features exhibiting spatial interaction effects, examining their nonlinear influence patterns and spatial differentiation mechanisms to inform differentiated and precision-targeted community renewal strategies.

To further reveal the complex nonlinear dependencies and potential threshold effects between key influencing factors and CS, this study constructed SHAP dependency plots for four core characteristics: WC, BCR, WI, and DGS (Figure 5). Results indicate that the influence mechanism of CS exhibits significant nonlinear characteristics and critical threshold points.

WC exhibits a pronounced positive nonlinear relationship with CS. As the number of microblog check-ins increases, its contribution to satisfaction steadily rises, turning positive when check-ins exceed approximately 1311 entries. This indicates that heightened social media activity, as a proxy for community social vitality, delivers sustained positive impacts on resident satisfaction. This finding underscores the enduring importance of enhancing community social interaction and vitality for improving resident well-being. The impact of BCR on CS exhibits a complex nonlinear pattern with an overall declining trend. When BCR < 0.05, an increase in BCR has a slight negative effect on satisfaction, suggesting that this range of building coverage often corresponds to reasonable supporting facilities, moderate population density, and a convenient, livable community environment. As BCR increases beyond 0.15, its effect turns negative, suggesting that high building density may trigger negative perceptions such as crowding and insufficient daylight, indicating a clear threshold for intensification. The relationship between WI and CS exhibits distinct phased changes. When WI < 0.01, it yields positive effects with a fluctuating upward trend followed by rapid attenuation. When WI > 0.01, it consistently delivers stable negative effects, indicating that the benefits of improving walkability may reach a plateau beyond a certain level, with diminishing marginal returns from further investment. DGS exhibits a clear threshold effect: at short distances from parks its contribution to satisfaction is positive, but this contribution diminishes as distance increases. Once DGS exceeds 589.77 m, negative impacts emerge, and the enhancement effect of parks beyond approximately 1200 m declines sharply. This indicates a significant effective boundary in the service radius of park green spaces, providing clear planning guidance for the spatial allocation of public service facilities.
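Thresholds of this kind correspond to points where a dependence curve crosses zero. The text does not state how the reported values were read from Figure 5, so the following Python sketch shows only one plausible approach: linear interpolation between neighbouring points of a smoothed or binned curve. The function name `zero_crossings` and the toy distance and contribution values are assumptions for illustration, not data from this study.

```python
def zero_crossings(xs, phis):
    """Feature values at which a dependence curve changes sign.

    Points are sorted by feature value, and each sign change between
    neighbouring points is located by linear interpolation.
    """
    pairs = sorted(zip(xs, phis))
    crossings = []
    for (x0, p0), (x1, p1) in zip(pairs, pairs[1:]):
        if p0 == 0.0:
            crossings.append(x0)
        elif p0 * p1 < 0:
            crossings.append(x0 + (x1 - x0) * p0 / (p0 - p1))
    return crossings


# Hypothetical binned curve: distance to green space (m) vs. mean contribution.
toy_dgs = [100, 300, 500, 700, 900, 1300]
toy_phi = [0.08, 0.05, 0.02, -0.02, -0.04, -0.10]
thresholds = zero_crossings(toy_dgs, toy_phi)
print(thresholds)
```

On raw, unsmoothed scatter a curve may cross zero many times, so in practice the points would typically be binned or smoothed before a threshold is reported.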

4.3. Heterogeneity in the Importance of Key Features

To inform place-specific community renewal strategies, we visualized the spatial distribution of GeoShapley values for the top four contributing factors (Figure 6). The results reveal significant variations in dominant influencing factors across different areas, reflecting pronounced spatial heterogeneity.

In the four low-CS areas of Qingbaijiang District, Xindu District, Pidu District, and Wenjiang District in northern Chengdu—primarily traditional industrial zones or areas undergoing rapid urbanization with relatively lagging community quality—GeoShapley analysis indicates that BCR and DGS are the two primary determinants, followed by WC. This suggests these regions face challenges of mismatched development intensity and ecological leisure space allocation during urbanization. Therefore, future urban renewal should prioritize controlling building density and optimizing park and green space layouts to enhance environmental livability and improve residents’ quality of life.

In central core districts like Jinjiang, Qingyang, and Wuhou, despite high population density, mature development, and overall high CS, the direction of influence factors differs from the northern areas. Here, BCR and WI positively contribute to CS, indicating that building form and walkability have reached relatively reasonable levels. However, DGS still exhibits some negative impact, indicating room for improvement in the accessibility and quality of park green spaces. Consequently, the renewal focus in central areas should shift from basic functional improvements to quality enhancement, particularly increasing green space coverage and elevating park service capacity.

Communities with high CS are primarily concentrated along both sides of Tianfu Avenue in Longquanyi District and Shuangliu District. Longquanyi District is not only a significant automotive industry base but has also markedly enhanced its environmental quality in recent years through urban renewal and ecological projects such as Qinglong Lake Wetland Park. Shuangliu District, leveraging the national strategic positioning of Tianfu New Area, has developed an excellent urban fabric under high-standard urban planning. In these areas, WC exhibits a significant positive impact, while BCR shows a notably high negative contribution, suggesting that excessive development may suppress comfort levels. Future planning in these regions should balance maintaining vitality with controlling development intensity. Optimizing pedestrian system design could also alleviate the psychological pressure caused by high-density development.

Our findings reveal that within complex urban systems, geographic context significantly modulates the intensity and direction of various factors influencing CS. The aforementioned spatial heterogeneity analysis lays a solid foundation for formulating “targeted development strategies.” It clearly answers the question of “where to focus efforts and what actions to prioritize”: the northern area requires “addressing deficiencies” (increasing green spaces and controlling density), the central area needs “enhancing quality” (optimizing pedestrian access and green spaces), while the southern area demands “mitigating risks” (balancing development intensity and vitality). This data-driven, heterogeneous governance strategy not only deepens our understanding of the spatial differentiation mechanisms underlying community satisfaction in Chengdu but also marks a significant shift from extensive management toward precise, sustainable urban governance.

5. Discussion

5.1. The Dual Pattern of Homogeneous Public Services and Heterogeneous Quality Experiences

The spatial autocorrelation and global feature importance analyses in Section 4.1 and Section 4.2 reveal that within Chengdu’s central urban area, the driving mechanisms of CS exhibit a dual pattern: the influence of public services shows spatial homogeneity, while the impact of spatial qualities demonstrates spatial heterogeneity. GeoShapley analysis indicates that most infrastructure factors, such as distance to schools and hospitals, exert relatively homogeneous spatial influence, whereas a few factors related to spatial quality and spontaneous vitality (WC, BCR, WI, DGS) exhibit significant spatial heterogeneity.

This pattern reflects the effectiveness of Chengdu’s urban planning in promoting spatial fairness and justice. Through the effective implementation of policies like the 15 min neighborhood concept, the central urban area has achieved a high level of spatial equity in fundamental public services such as basic education, healthcare, and public transportation. The greater the diversity and abundance of infrastructure and services, the easier it is to meet the daily needs of urban residents [9]. This implies that residents enjoy roughly equivalent access to basic services regardless of their location within the central urban area, thereby diminishing the role of these factors as key drivers of spatial variation in satisfaction. This study provides robust empirical support for the effectiveness of Chengdu’s urban infrastructure development over the past two decades.

However, this homogenization also signals a transition in the city’s development phase. Currently, the determinants of CS disparities are no longer the presence or absence of facilities, but their quality. WI, BCR, and DGS measure spatial experiences and qualities that are difficult to achieve through standardized allocation, while WC reflects organic social vitality. This finding resonates with the material environment-dominated perspective in traditional urban renewal research [58,59]. These higher-order attributes are profoundly constrained by deep-seated factors such as historical built environments, land economics, and community social capital, inevitably resulting in strong contextual influences. This transformation holds significant importance for advancing sustainable urban development—by ensuring equitable access to public services while enhancing spatial quality, it optimizes resource efficiency, strengthens community resilience, and promotes social inclusion, thereby achieving synergistic economic, social, and environmental benefits. Specifically, in terms of environmental sustainability, optimizing green space accessibility and WI can enhance carbon sequestration capacity and ecological services. Regarding social sustainability, balancing BCR and WC promotes community integration and social equity. For governance sustainability, the identified key thresholds provide scientific grounds for formulating differentiated policies, thereby improving the precision of urban governance and the efficiency of resource allocation.

5.2. Methodological Implications

The swarm plots in Section 4.2 and spatial distribution maps in Section 4.3 demonstrate that the GeoShapley framework retains robust explanatory power even in highly built-up, mature megacity centers.

Compared to traditional GWR models, GeoShapley’s advantage lies in its ability to simultaneously handle complex nonlinear effects and spatial effects without requiring a linear spatial relationship assumption. Compared to the classical SHAP global interpretability method, it further decomposes the independent contribution of geography and its interaction effects with various features. This provides a new interpretable machine learning paradigm for geography and urban studies, bridging the gap between spatial statistics and explainable AI.

A key methodological insight is that, within regions of public service homogenization, most factors exhibit uniformly homogeneous influence. This indicates GeoShapley’s capability for precise diagnostics, providing quantitative and visual decision-making support for distinguishing between citywide universal policies and localized precision policies when prioritizing urban renewal. The approach is not limited to Chengdu: by collecting local data, cities elsewhere can adopt this methodology to identify and address spatial injustices within their communities, gaining an analytical tool to advance sustainable development.

5.3. Community Optimization Strategy

The dependence analysis in Section 4.2 reveals key thresholds of BCR > 0.15 and DGS > 589 m. The spatial distribution analysis in Section 4.3 indicates that BCR and DGS are the dominant factors in the northern region; in the central region, WI and BCR are positive factors and DGS is a negative factor; and in the southern region, WC is a positive factor and BCR is a negative factor.

These findings indicate that satisfaction disparities across regions stem from distinct dominant contradictions: the north grapples with the conflict between development intensity and ecological space, the central region requires enhanced green space accessibility, while the south must balance maintaining vitality with controlling development intensity. Based on these discoveries, we propose the following differentiated community optimization strategies:

For heterogeneous factors, implement localized strategies: Precise interventions tailored to each district and community are required for the four identified key heterogeneous factors. For homogeneous factors, implement comprehensive strategies: For factors such as DS, DM, and PB, whose influence is spatially homogeneous, continue to uphold and optimize existing 15 min neighborhood standards to ensure baseline fairness in basic public services and implement protective policies.

Northern Region (Qingbaijiang, Xindu, etc.): The core contradiction is development intensity versus insufficient ecological and recreational space. Prioritize controlling the building coverage ratio (BCR) of new developments, and enhance living environments by building pocket parks in underutilized spaces and improving accessibility to existing parks.

Central Main Urban Area: The core task is quality enhancement and precision governance. While maintaining current development intensity and pedestrian-friendly advantages, focus on addressing the last-mile issues in the quality and distribution of parks and green spaces, for example by opening ancillary green areas and upgrading street greening.

Southern High-End Areas (Tianfu New Area, etc.): The core challenge lies in balancing vitality maintenance with comfort assurance. While reaping the positive effects of high vitality, mitigate the negative impacts of high-density development through urban design measures like optimizing WI. This prevents excessive crowding from undermining long-term resident satisfaction.

5.4. Research Limitations and Future Prospects

The analysis in Section 4.1, Section 4.2 and Section 4.3 is based on static cross-sectional data, with subjective satisfaction primarily derived from social media texts, potentially introducing group bias and lacking a temporal dimension. First, although the comprehensive community satisfaction index integrates both subjective and objective data, and Weibo data are widely adopted in academic research for their vast user base and multidimensional insights, social media data may suffer from user group bias. For instance, older demographics are less likely to use Weibo, so not all age groups are covered. Future research could incorporate questionnaire survey data to achieve a more comprehensive satisfaction measurement. Second, the study’s scope is limited to central urban areas, and conclusions may not apply to suburban or emerging towns. Future research should expand the scope to explore differences in satisfaction drivers in urban fringe areas. Finally, this study focuses on static cross-sectional analysis and cannot reveal how these influencing factors change over time. Subsequent research could incorporate time-series data to investigate changes in satisfaction drivers before and after urban renewal projects, thereby evaluating policy effectiveness.

Furthermore, the GeoShapley-based framework applied in this research demonstrates strong methodological versatility and scalability. Future studies may explore its application potential in other urban contexts and regions at different developmental stages, such as evaluating the dynamic mechanisms linking public services and spatial quality within urban sustainability processes. In addition, the model’s indicator selection could incorporate additional dimensions of socio-spatial variables (e.g., community demographic structure, housing types, income levels, resident mobility) to enhance its explanatory power for complex urban systems.

However, the core value of this approach lies in analyzing “spatial non-stationarity.” Consequently, it is most suitable for research questions where influencing factors exhibit significant variation across geographic locations. If a phenomenon is highly spatially homogeneous, traditional global models may suffice, and the advantages of this method would not be fully realized. The results of this method are sensitive to the scale of analysis. Conclusions valid at the neighborhood level may not apply to the street or district level. When applying this framework to other studies, spatial units must be carefully defined to ensure alignment with the geographic context of the research question, and the potential for differing conclusions due to scale changes must be understood. Given the substantial variations across different urban areas, future research may consider techniques such as geographically weighted principal component analysis to better account for spatial and geographic differences between distinct functional zones.

For urban planning departments and government agencies, the analytical framework provided in this study can be used to assess spatial quality conditions across regions and help prioritize location-based interventions. Urban planners can pinpoint the specific spatial distribution of spatial quality deficiencies within their communities. This approach also enables quantitative confirmation of spatial inequities within jurisdictions, moving beyond reliance on subjective community perceptions. Furthermore, by continuously monitoring changes in key indicators, the effectiveness of spatial quality improvement interventions can be evaluated over time. Finally, based on the key factor analysis results generated for each community, it is possible to scientifically determine which communities should be prioritized for quality enhancement, thereby maximizing benefits under constrained urban renewal resources.

6. Conclusions

This study systematically analyzed the spatially heterogeneous factors influencing CS in Chengdu’s central districts by applying XGBoost-GeoShapley under the GeoXAI framework. Key findings include the following:

First, CS in Chengdu’s central urban area exhibits significant positive spatial autocorrelation and clustering. High-satisfaction communities are concentrated along the southern Tianfu New Area corridor, while low-satisfaction communities are primarily located in the northern traditional industrial zones and rapid urbanization transition zones, forming an overall pattern of higher satisfaction in the south and lower satisfaction in the north. This finding suggests that local governments should adopt differentiated strategies in resource allocation and renewal sequencing, prioritizing the northern districts with preferential policies.

Second, the driving mechanisms of CS demonstrate coexisting global homogeneity and local heterogeneity. Most basic public service factors (e.g., education, healthcare) exhibit spatially homogeneous effects, reflecting Chengdu’s significant achievements in equalizing fundamental public services. Conversely, a few factors measuring spatial quality and social vitality (building coverage ratio, distance to parks, walkability index, Weibo check-in density) demonstrate strong spatial non-stationarity, serving as key differentiators in CS disparities. This indicates that while maintaining equitable access to public services, local governments should shift their policy focus toward enhancing spatial quality and social vitality. Community governance, meanwhile, should prioritize localized improvements to these heterogeneous factors.

Third, complex nonlinear relationships and explicit threshold effects exist between these four key heterogeneous factors and satisfaction levels. Identifying thresholds such as the building coverage ratio (BCR > 0.15) and distance to parks (DGS > 589.77 m) provides concrete quantitative guidance for precision urban planning.

Fourth, dominant challenges and optimization pathways vary across different regions. The northern region requires balancing development intensity with ecological space preservation, the central region must prioritize quality enhancement and green space optimization, while the southern new district needs to harmonize spatial vitality with development density.

This study demonstrates that the GeoShapley-based spatial interpretable machine learning framework effectively overcomes limitations of traditional methods. It precisely diagnoses core issues within complex urban systems, thereby providing quantitative and interpretable decision support for implementing differentiated urban renewal strategies and building people-centered, high-quality cities.

Based on the above conclusions, the specific implications for local governments and community governance are as follows:

For local governments: It is recommended to incorporate spatially interpretable machine learning into urban monitoring and evaluation systems, promoting a new governance model characterized by “data-driven decision-making and mechanism-based explanations.” Use the key thresholds identified in the research as a basis for control and management: advance ecological restoration and address deficiencies in public services in the northern region; implement quality enhancement and green space patching in the central region; and strengthen coordinated regulation of vitality and density in the southern new district.

For community governance: Advocate targeted renewal based on locally dominant factors—promote “demolishing walls to reveal greenery” in densely built areas, add pocket parks in zones with weak park coverage, and introduce convenient commercial services in undervitalized neighborhoods. Visualize analysis results to foster resident participation in joint deliberation, establishing a “diagnosis-consultation-action” governance loop.

This study demonstrates that the GeoXAI framework delivers precise, explainable decision support for urban governance, advancing people-centered high-quality urban development.

Author Contributions

Conceptualization, W.Z.; methodology, W.Z., J.L. and L.Z.; software, L.Z. and W.Z.; validation, W.Z. and L.Z.; formal analysis, W.Z., L.Z., J.L. and Q.H.; investigation, W.Z., J.L., S.G. and Q.H.; data curation, W.Z.; writing—original draft preparation, W.Z., L.Z., J.L., S.G. and Q.H.; writing—review and editing, R.Z. and W.Z.; visualization, W.Z., L.Z., J.L., S.G. and Q.H.; supervision, L.Z. and R.Z.; project administration, W.Z.; funding acquisition, R.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Sichuan Provincial College Student Innovation Training Project, grant number S202410626005; Shenzhen Higher Education Institutions Stable Support Plan General Program project, grant number 20231123102915001; Launch fee for scientific research of newly introduced high-precision and scarce talents in Shenzhen, grant number 827-000906.

Institutional Review Board Statement

Not applicable. This study did not involve human or animal experiments and used only publicly available, anonymized data sources.

Informed Consent Statement

Not applicable. This study did not involve human participants. All social media data used (Weibo check-in data) were publicly available and fully anonymized prior to analysis, ensuring no personally identifiable information was collected or processed.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

During the preparation of this manuscript, the authors used ChatGPT 5 and DeepSeek-V3.2 for the purposes of English readability, grammar consistency, and structural clarity of the manuscript. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

References

The 14th Five-Year Plan for National Economic and Social Development of the People’s Republic of China. Available online: http://www.gov.cn/xinwen/2021-03/13/content_5592681.htm (accessed on 25 April 2024).
The Plan for the Construction of Urban-Rural Community Service Systems. Available online: https://www.gov.cn/zhengce/content/2022-01/21/content_5669663.htm (accessed on 25 April 2024).
Chen, J.; Gan, W.; Liu, N.; Li, P.; Wang, H.; Zhao, X.; Yang, D. Community Quality Evaluation for Socially Sustainable Regeneration: A Study Using Multi-Sourced Geospatial Data and AI-Based Image Semantic Segmentation. ISPRS Int. J. Geo-Inf. 2024, 13, 167. [Google Scholar] [CrossRef]
Lian, X.; Li, D.; Di, W.; Oubibi, M.; Zhang, X.; Zhang, S.; Xu, C.; Lu, H. Research on Influential Factors of Satisfaction for Residents in Unit Communities—Taking Ningbo City as an Example. Sustainability 2022, 14, 6687. [Google Scholar] [CrossRef]
Chen, T.; Luh, D.; Hu, L.; Shan, Q. Exploring Factors Affecting Residential Satisfaction in Old Neighborhoods and Sustainable Design Strategies Based on Post-Occupancy Evaluation. Sustainability 2023, 15, 15213.
Zheng, W.; Shen, G.Q.; Wang, H.; Hong, J.; Li, Z. Decision support for sustainable urban renewal: A multi-scale model. Land Use Policy 2017, 69, 361–371.
Chen, M.; Chen, C.; Jin, C.; Li, B.; Zhang, Y.; Zhu, P. Evaluation and obstacle analysis of sustainable development in small towns based on multi-source big data: A case study of 782 top small towns in China. J. Environ. Manag. 2024, 366, 121847.
Cui, Y.; Zha, G.; Wang, Q.; Dang, Y.; Shi, K.; Duan, X.; Xu, D.; Huang, B. Evaluating the community commercial vitality using multi-source data: A case study of Hangzhou, China. GIScience Remote Sens. 2025, 62, 2451335.
He, X.; Zhou, Y.; Yuan, X.; Zhu, M. The coordination relationship between urban development and urban life satisfaction in Chinese cities—An empirical analysis based on multi-source data. Cities 2024, 150, 105016.
Zhang, Y.; Li, Q.; Wang, H.; Du, X.; Huang, H. Community scale livability evaluation integrating remote sensing, surface observation and geospatial big data. Int. J. Appl. Earth Obs. Geoinf. 2019, 80, 173–186.
Yin, J.; Wang, J.; Wang, C.; Wang, L.; Chang, Z. CRITIC-TOPSIS Based Evaluation of Smart Community Governance: A Case Study in China. Sustainability 2023, 15, 1923.
Bramley, G.; Dempsey, N.; Power, S.; Brown, C.; Watkins, D. Social sustainability and urban form: Evidence from five British cities. Environ. Plan. A 2009, 41, 2125–2142.
Wang, Z.; Jie, H.; Fu, H.; Wang, L.; Jiang, H.; Ding, L.; Chen, Y. A social-media-based improvement index for urban renewal. Ecol. Indic. 2022, 137, 108775.
Li, X.; Liu, H. The Influence of Subjective and Objective Characteristics of Urban Human Settlements on Residents’ Life Satisfaction in China. Land 2021, 10, 1400.
Rui, J. Green disparities, happiness elusive: Decoding the spatial mismatch between green equity and the happiness from vulnerable perspectives. Cities 2025, 163, 106063.
Chen, N.; Fang, D. Exploring Public Space Satisfaction in Old Residential Areas Based on Impact-Asymmetry Analysis. Sustainability 2024, 16, 2557.
Dissart, J.-C.; Kuentz-Simonet, V. Spatial Determinants of Life Satisfaction on the Aquitaine Coast: A Geographically-Weighted Regression Approach. J. Happiness Stud. 2025, 26, 21.
Hong, X.-C.; Zhang, D.-Y.; Hu, F.-B.; Guo, L.-H.; Liu, J.; Guo, H. Does urban green space form influence the spatial pattern of noise complaints? Sustain. Cities Soc. 2025, 130, 106506.
Xia, J.; Zhang, G.; Ma, S.; Pan, Y. Spatial Heterogeneity of Driving Factors in Multi-Vegetation Indices RSEI Based on the XGBoost-SHAP Model: A Case Study of the Jinsha River Basin, Yunnan. Land 2025, 14, 925.
Liu, J.; Cai, Y.; Shen, X. Integrating Machine Learning, SHAP Interpretability, and Deep Learning Approaches in the Study of Environmental and Economic Factors: A Case Study of Residential Segregation in Las Vegas. Land 2025, 14, 957.
Li, M.; Zhu, Z.; Deng, J.; Zhang, J.; Li, Y. Geospatial Explainable AI Uncovers Eco-Environmental Effects and Its Driving Mechanisms-Evidence from the Poyang Lake Region, China. Land 2025, 14, 1361.
Li, F.; Nan, T.; Zhang, H.; Luo, K.; Xiang, K.; Peng, Y. Evaluating Ecological Vulnerability and Its Driving Mechanisms in the Dongting Lake Region from a Multi-Method Integrated Perspective: Based on Geodetector and Explainable Machine Learning. Land 2025, 14, 1435.
Li, C.; Zhou, Y.; Wu, M.; Xu, J.; Fu, X. Exploring Nonlinear Threshold Effects and Interactions Between Built Environment and Urban Vitality at the Block Level Using Machine Learning. Land 2025, 14, 1323.
Le, T.T.; Sharma, P.; Osman, S.M.; Dzida, M.; Nguyen, P.Q.P.; Tran, M.H.; Cao, D.N.; Tran, V.D. Forecasting energy consumption and carbon dioxide emission of Vietnam by prognostic models based on explainable machine learning and time series. Clean Technol. Environ. Policy 2024, 26, 4405–4431.
Liu, L. An ensemble framework for explainable geospatial machine learning models. Int. J. Appl. Earth Obs. Geoinf. 2024, 132, 104036.
Li, Z. GeoShapley: A Game Theory Approach to Measuring Spatial Effects in Machine Learning Models. Ann. Am. Assoc. Geogr. 2024, 114, 1365–1385.
Zhu, L.; Guo, Y.; Zhang, C.; Meng, J.; Ju, L.; Zhang, Y.; Tang, W. Assessing Community-Level Livability Using Combined Remote Sensing and Internet-Based Big Geospatial Data. Remote Sens. 2020, 12, 4026.
Wang, Y.; Yan, Y.; Yu, S.; Bai, D. Integrated Environmental Perception and Civic Engagement: The Mediating Role of Residential Satisfaction in Urban Migrants’ Community Participation Intention. Sustainability 2025, 17, 8639.
Bove, A.; Ghiraldelli, M. Smart but Unlivable? Rethinking Smart City Rankings Through Livability and Urban Sustainability: A Comparative Perspective Between Athens and Zurich. Sustainability 2025, 17, 8901.
Smith, R.M.; Deb, D.; Blizard, Z.; Midgett, R. A Planner’s quest for identifying spatial (in)justice in local communities: A case study of urban census tracts in North Carolina, USA. Appl. Geogr. 2023, 158, 103030.
Chen, Y.; Ye, Y.; Liu, X.; Yin, C.; Jones, C.A. Examining the nonlinear and spatial heterogeneity of housing prices in urban Beijing: An application of GeoShapley. Habitat Int. 2025, 162, 103439.
Mak, H.W.; Coulter, R.; Fancourt, D. Associations between community cultural engagement and life satisfaction, mental distress and mental health functioning using data from the UK Household Longitudinal Study (UKHLS): Are associations moderated by area deprivation? BMJ Open 2021, 11, e045512.
Tang, F.; Zeng, P.; Wang, L.; Zhang, L.; Xu, W. Urban Perception Evaluation and Street Refinement Governance Supported by Street View Visual Elements Analysis. Remote Sens. 2024, 16, 3661.
Hao, H.; Yao, E.; Yang, Y.; Liu, S.; Pan, L.; Wang, Y. How to build vibrant communities by utilizing functional zones? A community-detection-based approach for revealing the association between land use and community vibrancy. Cities 2024, 155, 105431.
Yu, H. Generalized geographically and temporally weighted regression. Comput. Environ. Urban Syst. 2025, 117, 102244.
Wang, D.; Li, V.J.; Yu, H. Mass Appraisal Modeling of Real Estate in Urban Centers by Geographically and Temporally Weighted Regression: A Case Study of Beijing’s Core Area. Land 2020, 9, 143.
Tang, Q.; Cao, J.; Yin, C.; Cheng, J. Examining the nonlinear relationships between park attributes and satisfaction with pocket parks in Chengdu. Urban For. Urban Green. 2024, 101, 128548.
Li, C.; Managi, S. Impacts of community attachment and community livability on environmental activity according to XGBoost and SHAP. Cities 2025, 156, 105559.
Ma, S.; Wang, B.; Liu, W.; Zhou, H.; Wang, Y.; Li, S. Assessment of street space quality and subjective well-being mismatch and its impact, using multi-source big data. Cities 2024, 147, 104797.
Roychowdhury, S.; Mazumdar, S.; Thakker, D.; Checco, A.; Lanfranchi, V.; Goodchild, B. Integrating Virtual Walkthroughs for Subjective Urban Evaluations: A Case Study of Neighbourhoods in Sheffield, England. Land 2024, 13, 831.
Jin, A.; Ge, Y.; Zhang, S. Spatial Characteristics of Multidimensional Urban Vitality and Its Impact Mechanisms by the Built Environment. Land 2024, 13, 991.
Mitchell, R.J.; Richardson, E.A.; Shortt, N.K.; Pearce, J.R. Neighborhood Environments and Socioeconomic Inequalities in Mental Well-Being. Am. J. Prev. Med. 2015, 49, 80–84.
Lin, L.; Liu, Q.; Xiao, X.; Luo, Q. Perceived Constraints on Active Recreational Sport Participation among Residents in Urban China. Int. J. Environ. Res. Public Health 2022, 19, 14884.
Liu, Z.; Ye, J.; Ren, G.; Feng, S. The Effect of School Quality on House Prices: Evidence from Shanghai, China. Land 2022, 11, 1894.
Elvidge, C.D.; Sutton, P.C.; Ghosh, T.; Tuttle, B.T.; Baugh, K.E.; Bhaduri, B.; Bright, E. A global poverty map derived from satellite data. Comput. Geosci. 2009, 35, 1652–1660.
Gao, M.; Fang, C. Ripples of blue: Unveiling the influence of urban blue spaces on public happiness through social networking sites. Appl. Geogr. 2025, 179, 103632.
Huang, Y.; Li, J.; Wu, G.; Fei, T. Quantifying the bias in place emotion extracted from photos on social networking sites: A case study on a university campus. Cities 2020, 102, 102719.
Zhao, X.; Chen, J.; Li, J.; Wang, H.; Zhang, X.; Yu, F. Unraveling the renewal priority of urban heritage communities via macro-micro dimensional assessment—A case study of Nanjing City, China. Sustain. Cities Soc. 2025, 124, 106317.
Sheng, J.; He, Y.; Lu, T.; Wang, F.; Huang, Y.; Leng, B.; Zhang, X.; Chen, Y. Unveiling urban vitality and its interactions in mountainous cities: A human behaviour perspective on community-level dynamics. Cities 2025, 159, 105780.
Peng, Y.; Liu, Q. Plot-scale population estimation modeling based on residential plot form clustering and locational attractiveness analysis. Comput. Environ. Urban Syst. 2025, 118, 102257.
Pan, Z.; Liu, Y.; Liu, Y.; Huo, Z.; Han, W. Age-friendly neighbourhood environment, functional abilities and life satisfaction: A longitudinal analysis of older adults in urban China. Soc. Sci. Med. 2024, 340, 116403.
Parola, A.; Marcionetti, J. Career Decision-Making Difficulties and Life Satisfaction: The Role of Career-Related Parental Behaviors and Career Adaptability. J. Career Dev. 2022, 49, 831–845.
Cervero, R.; Kockelman, K. Travel demand and the 3Ds: Density, diversity, and design. Transp. Res. Part D Transp. Environ. 1997, 2, 199–219.
Wu, C.; Ye, Y.; Gao, F.Z.; Ye, X.Y. Using street view images to examine the association between human perceptions of locale and urban vitality in Shenzhen, China. Sustain. Cities Soc. 2023, 88, 104291.
Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
Chen, Y.; Jiao, S.; Gu, X.; Li, S. Decoding the spatiotemporal effects of industrial clusters on carbon emissions in a Chinese river basin. J. Clean. Prod. 2025, 516, 145851.
Liu, Y.; Li, T.; Zhao, W.; Wang, S.; Fu, B. Landscape functional zoning at a county level based on ecosystem services bundle: Methods comparison and management indication. J. Environ. Manag. 2019, 249, 109315.
Zhang, X.; Du, L.; Song, X. Identification of Urban Renewal Potential Areas and Analysis of Influential Factors from the Perspective of Vitality Enhancement: A Case Study of Harbin City’s Core Area. Land 2024, 13, 1934.
Huang, W.; Hu, L.; Xing, Y. Sustainable Renewal Strategies for Older Communities from the Perspective of Living Experience. Sustainability 2022, 14, 2813.

Figure 1. Location of the study area. (a) Chengdu’s location in China. (b) The study area within Chengdu. (c) Chengdu’s central urban district. (d) Community delineation within Chengdu’s central urban district. The yellow polygons represent the communities included in the study area.

Figure 2. Framework of the study. The figure illustrates the four-step analytical process: (1) evaluation of influencing factors and measurement of community satisfaction; (2) construction and selection of predictive models; (3) contribution analysis and nonlinear effect detection using GeoShapley and SHAP/POP; and (4) formulation of community optimization strategies. Detailed explanations for each panel are provided in the subsequent sections.

Figure 3. Spatial Distribution of CS.

Figure 4. XGBoost’s GeoShapley Swarm Plot.

Figure 5. Threshold Effects of Predictive Indicators for CS.

Figure 6. Spatial distribution of GeoShapley values for the top four influencing factors in the XGBoost model. (a) WC; (b) BCR; (c) WI; (d) DGS.

Table 1. Description of the dataset.

Type of Data	Data Structure	Source	Description
OSM Road Network	Vector line data	https://www.openstreetmap.org/, accessed on 15 October 2024.	Including spatial information (e.g., latitude and longitude) and road-level information (e.g., motorway, primary, secondary, trunk).
POIs	Vector point data	https://lbsyun.baidu.com/index.php?title=webapi/guide/webservice-placeapi, accessed on 23 June 2025.	2025 data showing geographic locations and categories of public service facilities.
Housing Price	Vector point data	https://chengdu.anjuke.com/, accessed on 1 February 2025.	2025 geolocated point data of housing prices.
NPP-VIIRS-Like Nighttime Light	Raster data	https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/YGIVCD, accessed on 18 August 2025.	2024 Annual Average Nighttime Light Image, Spatial Resolution: 500 m.
Weibo check-in	Vector point data	https://weibo.com, accessed on 10 June 2025.	486,865 Weibo check-in entries from 2023 to present.
Street view images	Images	https://lbs.baidu.com/, accessed on 18 June 2025.	Based on OSM road network data, 168,392 street panoramas were sampled at 50 m intervals.
Population	Raster data	https://hub.worldpop.org/geodata/listing?id=95, accessed on 18 August 2025.	The 2024 total population estimate with no constraints, featuring a spatial resolution of 100 m.
Building Vector	Vector polygon data	https://doi.org/10.1038/s41597-025-04730-5, accessed on 19 August 2025.	Building data for 2024 including attributes such as building height and number of floors.
Parks and Green Space	Vector polygon data	https://doi.org/10.1038/s41597-025-04730-5, accessed on 18 December 2024.	2024 Chengdu Park and Green Space Area Data.

Table 2. Entropy Weight Method Results.

Major	Dimensions	Variable	Description	Weight
CS	Subjective	WSI	Average of Weibo check-in data sentiment rating within a 1000 m range, WSI = 0.6 × SPI + 0.4 × SII	0.0223
	Objective	PR	Number of recreational facilities within a 1000 m range	0.3398
		HV	Average housing price	0.1484
		NTL	Total lighting within the community	0.4894
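
The weights in Table 2 are attributed to the entropy weight method, but the computation itself is not shown with the table. The Python sketch below follows the standard textbook procedure and is not necessarily the authors' exact implementation: each indicator is min-max normalised, its information entropy E is computed, and the divergence 1 − E is rescaled so that the weights sum to one. The function name `entropy_weights`, the assumption that every indicator is benefit-type (larger is better), and the toy matrix are all illustrative assumptions.

```python
import numpy as np

def entropy_weights(X):
    """Entropy weight method for an (n_samples, n_indicators) matrix.

    Assumes (hypothetically) that all indicators are benefit-type
    and that no indicator is constant across samples.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Min-max normalise each indicator to [0, 1].
    Z = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
    # Share of each sample in the indicator total.
    P = Z / Z.sum(axis=0)
    # Information entropy per indicator, treating 0 * ln(0) as 0.
    logP = np.log(P, out=np.zeros_like(P), where=P > 0)
    E = -(P * logP).sum(axis=0) / np.log(n)
    # Lower entropy means more dispersion, hence a larger weight.
    d = 1.0 - E
    return d / d.sum()

# Toy data (not from the study): 4 samples, 2 indicators.
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 1.0]])
w = entropy_weights(X)
print(w)
```

In the toy matrix the second indicator is concentrated in a single sample, so its entropy is lowest and it receives the larger weight. A composite score would then be the weighted sum of the normalised indicators.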

Table 3. Influencing Factor Selection.

Dimensions	Variable	Abbr.	Description	Type of Data
Transport and Infrastructure	Road Density	RD	The ratio of total road length within the community to the community area.	OSM Road Network
Building Morphology and Land Use	Building Coverage Ratio	BCR	The ratio of the total building footprint area within the community to the community area.	Building Vector
Social Vitality	Weibo Check-in	WC	Density of Weibo check-ins within a 1000 m range.	Weibo check-in
Locational Attributes	Distance to the CBD	DCBD	Euclidean Distance in meters from the community’s centroid to the Central Business District.	POIs
Public Service Accessibility	Facility Density	FD	The ratio of the number of facilities within a community to the community’s area.	POIs
	Distance to Amenities	DA	Euclidean Distance in meters from the community’s centroid to the nearest amenity.	POIs
	Distance to Employment	DE	Euclidean Distance in meters from the community’s centroid to the nearest employment center.	POIs
	Distance to Science and Culture	DSC	Euclidean Distance in meters from the community’s centroid to the nearest science and culture facility.	POIs
	Distance to School	DS	Euclidean Distance in meters from the community’s centroid to the nearest school.	POIs
	Distance to Medical	DM	Euclidean Distance in meters from the community’s centroid to the nearest medical facility.	POIs
	Distance to Green Space	DGS	Euclidean Distance in meters from the community’s centroid to the nearest green space.	POIs
	Proximity to Bus	PB	Number of bus stops within a 1000 m range.	POIs
Street Space Quality	Green View Index	GVI	The ratio of the sum of the areas occupied by vegetation and terrain in the image to the total image area.	Street view images
	Street Equipment Efficiency Index	SEI	The ratio of the sum of the areas occupied by various street facilities (e.g., barrier, fence, pole, utility pole, traffic sign frame, traffic sign (back), traffic sign (front), traffic light, streetlight) in the image to the total image area.	Street view images
	Walkability Index	WI	The ratio of the area occupied by the sidewalk in the image to the total image area.	Street view images
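
Several indicators in Table 3 are either Euclidean distances from a community centroid to the nearest facility (e.g., DS, DM, DGS) or counts within a 1000 m range (e.g., PB). The following Python sketch shows one way such values could be derived. It assumes that all coordinates are already in a projected coordinate system with units of meters; the function names `nearest_distance` and `count_within` and the sample coordinates are hypothetical and do not come from the study.

```python
import numpy as np

def _pairwise_distances(a, b):
    """Euclidean distance matrix between two sets of (x, y) points in meters."""
    a = np.asarray(a, dtype=float)[:, None, :]  # shape (n, 1, 2)
    b = np.asarray(b, dtype=float)[None, :, :]  # shape (1, m, 2)
    return np.sqrt(((a - b) ** 2).sum(axis=2))  # shape (n, m)

def nearest_distance(centroids, facilities):
    """Distance from each centroid to its nearest facility."""
    return _pairwise_distances(centroids, facilities).min(axis=1)

def count_within(centroids, points, radius=1000.0):
    """Number of points within `radius` meters of each centroid."""
    return (_pairwise_distances(centroids, points) <= radius).sum(axis=1)

# Hypothetical projected coordinates (meters), not real Chengdu data.
centroids = [(0.0, 0.0), (3000.0, 0.0)]
facilities = [(300.0, 400.0), (3000.0, 1200.0)]

print(nearest_distance(centroids, facilities))
print(count_within(centroids, facilities))
```

The brute-force distance matrix is adequate for a few thousand communities and POIs; for larger point sets a spatial index would be the usual choice.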

Table 4. Model performance metrics.

Model	R²	MSE	RMSE	MAE	MAPE (%)
XGBoost	0.8200	0.1782	0.4222	0.3235	243.9861
CatBoost	0.8194	0.1794	0.4236	0.3261	200.2926
LightGBM	0.8118	0.1865	0.4318	0.3312	231.4794
Random Forest	0.7877	0.2128	0.4614	0.3449	189.7591
Extra Trees	0.7649	0.2358	0.4856	0.3618	190.9237
AdaBoost	0.7552	0.2447	0.4947	0.3843	179.6520
Linear Regression	0.7201	0.2771	0.5264	0.4112	230.5850
Decision Tree	0.6391	0.3597	0.5997	0.4542	258.6609
KNN	0.5265	0.4720	0.6870	0.5183	461.6426
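
The columns of Table 4 follow the usual regression metric definitions; for example, RMSE is the square root of MSE (0.4222² ≈ 0.1782 for XGBoost). The Python sketch below computes all five metrics from their standard formulas. The function name `regression_metrics` and the toy values are illustrative assumptions. The toy data also shows one possible reason, which the source does not confirm, why MAPE can exceed 100% while R² stays high: a single target value close to zero inflates the percentage error.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Standard R², MSE, RMSE, MAE and MAPE (%) for a regression model."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(err)),
        # MAPE is undefined for y_true == 0 and unstable near zero.
        "MAPE": 100.0 * np.mean(np.abs(err / y_true)),
    }

# Toy values (not from the study): every absolute error is about 0.1,
# but the first target is close to zero.
y_true = [0.01, 1.0, -1.0, 2.0]
y_pred = [0.11, 0.9, -0.9, 1.9]
m = regression_metrics(y_true, y_pred)
print(m)
```

Because MAPE is sensitive to small targets in this way, R², RMSE and MAE are generally the more informative columns when the target variable is centered near zero.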

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
