Unraveling the Non-Linear Impact of the Built Environment on Population-Based Residential Vitality at the Block Scale: An Explainable AI Approach Using Multi-Source Open Data in Zhengzhou, China
Abstract
1. Introduction
1.1. Urban Vitality and the Paradigm Shift in Urban Renewal
1.2. The Spatial Unit Dilemma: From Arbitrary Grids to Functional Patches
1.3. The Data Divide: Democratizing Urban Analytics with Open Data
1.4. The Linearity Trap: Decoding Non-Linear Thresholds via Explainable AI
1.5. Research Objectives and Contributions
- How can a high-precision, AOI-based urban vitality assessment model be constructed using exclusively free, multi-source open data?
- What is the relative importance of different built environment dimensions (5D+S) in driving urban vitality?
- What are the specific non-linear threshold effects (optimal ranges) of key morphological and functional indicators on vitality, and how can they inform localized urban renewal policies?
2. Literature Review
2.1. Built Environment and Urban Vitality
2.2. From Grids to Functional Patches: The AOI Approach
2.3. Explainable AI in Urban Studies
2.4. Research Gaps and Present Contribution
3. Materials and Methods
3.1. Study Area
3.2. Data Sources
3.3. Spatial Analysis Unit: AOI Delineation
3.4. Variable Measurement
3.4.1. Dependent Variable: Urban Vitality Index
3.4.2. Independent Variables: Multidimensional Built Environment (5D+S)
3.5. Data Preprocessing
3.6. Explainable Machine Learning Framework
3.6.1. XGBoost
3.6.2. SHAP Interpretation
4. Results
4.1. Descriptive Statistics
4.2. Model Performance Comparison
4.3. Feature Importance Analysis
- Tier 2 (mean |SHAP| ≈ 0.04–0.05): Bus Station Density (BusDen500, 0.047) is the second most important feature, demonstrating that fine-grained transit service provision (within walking distance) outweighs even the proximity to a single bus station (DistBus).
- Tier 3 (mean |SHAP| ≈ 0.02–0.04): Green Coverage Ratio (GreenRatio, 0.035) and Building Density (BD, 0.023) form the third tier, reflecting the importance of ecological context and physical morphology.
- Tier 4 (mean |SHAP| ≈ 0.01–0.02): Distance to Bus Station (DistBus), POI Density (PD), Floor Area Ratio (FAR), and Functional Mix (ENT) contribute moderately.
- Tier 5 (mean |SHAP| < 0.012): Building Age (BldgAge) and Average Building Height (AH) contribute the least, suggesting that vertical density and historical maturity are weak predictors of population-based vitality once location and other density factors are accounted for.
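The tiers above are read off mean absolute SHAP values per feature. As a minimal sketch of how such a ranking could be derived from a matrix of per-sample SHAP values: the toy numbers, the `mean_abs_shap` helper, and the `tier_of` cut-offs (inferred from the tier labels in the list, including an assumed 0.05 lower bound for Tier 1) are illustrative assumptions, not the authors' code.

```python
from statistics import mean

def mean_abs_shap(shap_rows, feature_names):
    """Average |SHAP| per feature across all samples (one row per sample)."""
    return {
        name: mean(abs(row[j]) for row in shap_rows)
        for j, name in enumerate(feature_names)
    }

def tier_of(value):
    """Assign a tier from a mean |SHAP| value; cut-offs are assumed, not the authors'."""
    if value >= 0.05:
        return 1
    if value >= 0.04:
        return 2
    if value >= 0.02:
        return 3
    if value >= 0.012:
        return 4
    return 5

# Toy SHAP matrix (3 samples x 2 features); values are made up for illustration.
feature_names = ["DistCBD", "BusDen500"]
shap_rows = [
    [0.20, -0.050],
    [-0.10, 0.040],
    [0.06, -0.045],
]

importance = mean_abs_shap(shap_rows, feature_names)
ranking = sorted(importance, key=importance.get, reverse=True)
tiers = {name: tier_of(importance[name]) for name in ranking}
```

In practice the rows would come from a SHAP explainer applied to the fitted model; applied to the reported values, `tier_of(0.047)` and `tier_of(0.035)` reproduce Tiers 2 and 3 under these assumed cut-offs.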
4.4. Non-Linear Threshold Effects
4.4.1. Distance to CBD (DistCBD)
4.4.2. Bus Station Density Within 500 m (BusDen500)
4.4.3. Green Coverage Ratio (GreenRatio)
4.4.4. Building Density (BD)
4.5. Visual Representativeness of Vitality Classes
5. Discussion
5.1. Implications
5.1.1. Theoretical Implications
5.1.2. Practical Implications for Urban Design
5.1.3. Policy and Managerial Implications
5.2. Comparison with Existing Studies
5.3. Methodological Contributions
5.4. Limitations and Future Work
5.5. Ethical Scope and Operational Boundaries of the Vitality Index
6. Conclusions
- Non-linear superiority confirmed. XGBoost (test R² = 0.846; 5-fold CV R² = 0.713 ± 0.115) substantially outperforms OLS (R² = 0.634), indicating that built-environment–vitality relationships are non-linear and threshold-driven rather than linear.
- Location dominates. Distance to the commercial core is the single most important predictor (mean |SHAP| = 0.134), with a critical vitality radius of approximately 4.3 km. Urban renewal within this radius offers the highest vitality return.
- Threshold-based design guidelines. Specific thresholds were identified for precision renewal: (i) at least 4 bus stations should be accessible within 500 m of a block; (ii) green-land coverage should not exceed approximately 8.5% within 500 m if residential vitality is the planning objective; and (iii) building density delivers positive returns within an inverted-U range of approximately 2–50%, with peak effects at 10–30%.
- Open data viability. The framework demonstrates that reproducible non-linear vitality analysis with high predictive accuracy (test R² > 0.8) is achievable using exclusively free, open data, lowering the barrier for evidence-based urban renewal planning across diverse urban contexts.
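Thresholds such as the vitality radius or the bus-station cut-off correspond to points where a feature's SHAP contribution changes sign. One possible way to locate such a point, sketched below, is to bin the feature, average SHAP values per bin, and interpolate the zero crossing. This is an assumed procedure for illustration only; the function names and the synthetic data are hypothetical and do not reproduce the reported thresholds.

```python
def binned_means(xs, shap_vals, edges):
    """Mean SHAP value per bin [edges[i], edges[i+1]); returns (bin centre, mean) pairs."""
    n_bins = len(edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, s in zip(xs, shap_vals):
        for i in range(n_bins):
            if edges[i] <= x < edges[i + 1]:
                sums[i] += s
                counts[i] += 1
                break
    # Skip empty bins so the curve only contains observed ranges.
    return [
        ((edges[i] + edges[i + 1]) / 2, sums[i] / counts[i])
        for i in range(n_bins)
        if counts[i] > 0
    ]

def zero_crossings(curve):
    """Linearly interpolate x where the binned mean SHAP strictly changes sign."""
    crossings = []
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if y0 * y1 < 0:
            crossings.append(x0 + (x1 - x0) * y0 / (y0 - y1))
    return crossings

# Synthetic example: distance (km) versus SHAP value, made up for illustration.
dist_km = [1.0, 1.5, 3.0, 3.5, 5.0, 5.5, 7.0, 7.5]
shap_vals = [0.30, 0.20, 0.10, 0.05, -0.05, -0.10, -0.20, -0.30]
edges = [0, 2, 4, 6, 8]

curve = binned_means(dist_km, shap_vals, edges)
crossings = zero_crossings(curve)
```

Bin width is a design choice: wider bins smooth out noise but blur the crossing, so a sensitivity check over several bin widths would be advisable before quoting a threshold.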
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Wu, C.; Ye, X.; Ren, F.; Du, Q. Check-in behaviour and spatio-temporal vibrancy: An exploratory analysis in Shenzhen, China. Cities 2018, 77, 104–116.
2. Lan, F.; Gong, X.; Da, H.; Wen, H. How do population inflow and social infrastructure affect urban vitality? Evidence from 35 large- and medium-sized cities in China. Cities 2020, 100, 102454.
3. Montgomery, J. Making a city: Urbanity, vitality and urban design. J. Urban Des. 1998, 3, 93–116.
4. Glaeser, E.L.; Kolko, J.; Saiz, A. Consumer city. J. Econ. Geogr. 2001, 1, 27–50.
5. Zhang, A.; Li, W.; Wu, J.; Lin, J.; Chu, J.; Xia, C. How can the urban landscape affect urban vitality at the street block level? A case study of 15 metropolises in China. Environ. Plan. B Urban Anal. City Sci. 2021, 48, 1245–1262.
6. Yue, W.; Chen, Y.; Thy, P.T.M.; Fan, P.; Liu, Y.; Zhang, W. Identifying urban vitality in metropolitan areas of developing countries from a comparative perspective: Ho Chi Minh City versus Shanghai. Sustain. Cities Soc. 2021, 65, 102573.
7. Openshaw, S. The Modifiable Areal Unit Problem; Geobooks: Norwich, UK, 1984.
8. Wu, J.; Lu, Y.; Gao, H.; Wang, M. Cultivating historical heritage area vitality using urban morphology approach based on big data and machine learning. Comput. Environ. Urban Syst. 2022, 91, 101716.
9. Li, M.; Liu, J.; Lin, Y.; Xiao, L.; Zhou, J. Revitalizing historic districts: Identifying built environment predictors for street vibrancy based on urban sensor data. Cities 2021, 117, 103305.
10. Chen, Y.; Yu, B.; Shu, B.; Yang, L.; Wang, R. Exploring the spatiotemporal patterns of residential electricity consumption in Nanjing, China. Sustain. Cities Soc. 2023, 96, 104629.
11. Xia, C.; Yeh, A.G.-O.; Zhang, A. Analyzing spatial relationships between urban land use intensity and urban vitality at street block level: A case study of five Chinese megacities. Landsc. Urban Plan. 2020, 193, 103669.
12. WorldPop. Open Spatial Demographic Data and Research. Available online: https://www.worldpop.org/ (accessed on 15 April 2026).
13. Amap (Gaode Map). POI Data Service. Available online: https://lbs.amap.com/ (accessed on 15 April 2026).
14. Ewing, R.; Cervero, R. Travel and the built environment: A meta-analysis. J. Am. Plan. Assoc. 2010, 76, 265–294.
15. Zhang, J.; Tan, P.Y.; Zeng, H.; Zhang, Y. Walkability assessment in a rapidly urbanizing city and its relationship with residential estate value. Sustainability 2019, 11, 2205.
16. Wu, W.; Niu, X. Influence of built environment on urban vitality: Case study of Shanghai using mobile phone location data. J. Urban Plan. Dev. 2019, 145, 04019007.
17. Delclòs-Alió, X.; Miralles-Guasch, C. Looking at Barcelona through Jane Jacobs’s eyes: Mapping the basic conditions for urban vitality in a Mediterranean conurbation. Land Use Policy 2018, 75, 505–517.
18. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
19. Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems 30; Curran Associates, Inc.: Red Hook, NY, USA, 2017; pp. 4765–4774.
20. Jacobs, J. The Death and Life of Great American Cities; Vintage Books: New York, NY, USA, 1961.
21. Kang, C.; Fan, D.; Jiao, H. Validating activity, time, and space diversity as essential components of urban vitality. Environ. Plan. B Urban Anal. City Sci. 2021, 48, 1180–1197.
22. Cervero, R.; Kockelman, K. Travel demand and the 3Ds: Density, diversity, and design. Transp. Res. Part D Transp. Environ. 1997, 2, 199–219.
23. Zeng, C.; Song, Y.; He, Q.; Shen, F. Spatially explicit assessment on urban vitality: Case study in Chicago. Sustainability 2018, 10, 4861.
24. Li, X.; Li, Y.; Jia, T.; Zhou, L.; Hijazi, I.H. The six dimensions of built environment on urban vitality: Fusion evidence from multi-source data. Cities 2022, 121, 103482.
25. Lu, S.; Shi, C.; Yang, X. Impacts of built environment on urban vitality: Regression analyses of Beijing and Chengdu, China. Int. J. Environ. Res. Public Health 2019, 16, 4592.
26. Tu, W.; Zhu, T.; Zhong, C.; Zhang, X.; Xu, Y.; Li, Q. Exploring urban vitality and its driving mechanism through multi-source data: A case study of Shanghai. Sustain. Cities Soc. 2024, 100, 105050.
27. Yue, H.; Zhu, X. Exploring the relationship between urban vitality and street centrality based on social network review data in Wuhan, China. Sustainability 2019, 11, 4356.
28. Huang, B.; Zhou, Y.; Li, Z.; Song, Y.; Cai, J.; Tu, W. Evaluating and characterizing urban vibrancy using spatial big data: Shanghai as a case study. Environ. Plan. B Urban Anal. City Sci. 2020, 47, 1543–1559.
29. Lyu, F.; Zhang, L. Using multi-source big data to understand the factors affecting urban park use in Wuhan. Urban For. Urban Green. 2019, 43, 126367.
30. Wang, Z.; Jiao, L.; Xu, G.; Luo, X.; Wang, C. Unraveling the Impact Mechanisms of Built Environment on Urban Vitality: Integrating Scale, Heterogeneity, and Interaction Effects. Buildings 2026, 16, 29.
31. Kim, Y.-L. Seoul’s Wi-Fi hotspots: Wi-Fi access points as an indicator of urban vitality. Comput. Environ. Urban Syst. 2018, 72, 13–24.
32. Xu, X.; Xu, X.; Guan, P.; Ren, Y.; Wang, W.; Xu, N. The cause and evolution of urban street vitality under the time dimension: Nine cases of streets in Nanjing City, China. Sustainability 2018, 10, 2797.
33. Ma, S.; Long, Y. Functional urban area delineations of cities on the Chinese mainland using massive Didi ride-hailing records. Cities 2020, 97, 102532.
34. Chen, Y.; Liu, X.; Li, X.; Liu, X.; Yao, Y.; Hu, G.; Xu, X.; Pei, F. Delineating urban functional areas with building-level social media data: A dynamic time warping (DTW) distance based k-medoids method. Landsc. Urban Plan. 2017, 160, 48–60.
35. Yang, L.; Liu, J.; Liang, Y.; Lu, Y.; Yang, H. Spatially varying effects of street greenery on walking time of older adults. ISPRS Int. J. Geo-Inf. 2021, 10, 596.
36. Liu, S.; Zhang, L.; Long, Y.; Long, Y.; Xu, M. A new urban vitality analysis and evaluation framework based on human activity modeling using multi-source big data. ISPRS Int. J. Geo-Inf. 2020, 9, 617.
37. Ding, C.; Cao, X.; Næss, P. Applying gradient boosting decision trees to examine non-linear effects of the built environment on driving distance in Oslo. Transp. Res. Part A Policy Pract. 2018, 110, 107–117.
38. Li, Y.; Pan, Y.; Ning, C.; Ding, C. Examining the effect of the built environment on housing price at the macro and micro levels using a recursive approach. Sustainability 2019, 11, 3629.
39. Gehl, J. Cities for People; Island Press: Washington, DC, USA, 2010.
40. Mouratidis, K. Urban planning and quality of life: A review of pathways linking the built environment to subjective well-being. Cities 2021, 115, 103229.
41. Mehaffy, M.W.; Haas, T. New Urbanism in the New Urban Agenda: Threads of an unfinished reformation. Urban Plan. 2020, 5, 441–452.
42. Christaller, W. Central Places in Southern Germany; Prentice-Hall: Englewood Cliffs, NJ, USA, 1966.
43. Ye, Y.; Li, D.; Liu, X. How block density and typology affect urban vitality: An exploratory analysis in Shenzhen, China. Urban Geogr. 2018, 39, 631–652.
44. Chen, L.; Zhao, L.; Xiao, Y.; Lu, Y. Investigating the spatiotemporal pattern between the built environment and urban vibrancy using big data in Shenzhen, China. Comput. Environ. Urban Syst. 2022, 95, 101827.
45. UN-Habitat. New Urban Agenda; United Nations: Quito, Ecuador, 2017.
46. Sung, H.; Lee, S. Residential built environment and walking activity: Empirical evidence of Jane Jacobs’ urban vitality. Transp. Res. Part D Transp. Environ. 2015, 41, 318–329.








| Study | City | Spatial Unit | Vitality Proxy | Method | Key Limitation |
|---|---|---|---|---|---|
| Wu and Niu, 2019 [16] | Shanghai | Grid (1 km) | Mobile phone | RF + GBT | Proprietary data; MAUP; no thresholds |
| Li et al., 2021 [9] | Shenzhen | Street segment | Pedestrian count | OLS regression | Linear assumption; single proxy |
| Xia et al., 2020 [11] | Beijing etc. | Street block | POI + social media | Spatial regression | Linear; proprietary data |
| Wu et al., 2022 [8] | Multiple | Grid (500 m) | Mobile phone | ML + morphology | Proprietary data; no SHAP |
| Wang et al., 2026 [30] | Wuhan | Grid (500 m) | Mobile phone | XGBoost + SHAP | Proprietary data; grid MAUP |
| Li et al., 2022 [24] | Wuhan | Grid (500 m) | Multi-source | GWR | Linear; arbitrary grid |
| Lu et al., 2019 [25] | Beijing/Chengdu | TAZ | Population | OLS | Linear; admin. boundary |
| This study | Zhengzhou | AOI (functional) | WorldPop 100 m | XGBoost + SHAP | Open data only; see Section 5.4 |

| Data Type | Source | Resolution/Scale | Year | Records |
|---|---|---|---|---|
| Population density | WorldPop (constrained, R2025A) | 100 m | 2026 | 1.06 M valid pixels in study area |
| AOI polygons | OpenStreetMap | Vector | 2024 | 4084 polygons (15 land-use classes) |
| Building footprints | Open Building Data | Vector | 2024 | 201,584 buildings (height, function, age, quality) |
| Points of Interest | Amap (Gaode) | Point | 2024 | 630,150 POIs (22 major categories) |
| Bus stations | Amap (Gaode) | Point | 2024 | 8612 records |
| Main Business Districts | Urban planning data | Polygon | 2024 | 5 CBD locations |
| Study boundary | Administrative boundary | Polygon | 2024 | Zhengzhou Municipality |

| Dimension | Variable | Abbrev. | Formula | Source |
|---|---|---|---|---|
| D1: Density | Building Density | BD | Total building footprint area/AOI area | Building.shp |
| D1: Density | Average Building Height | AH | Area-weighted mean height (m) | Building.shp |
| D1: Density | Floor Area Ratio | FAR | Total floor area/AOI area | Building.shp |
| D2: Diversity | Functional Mix | ENT | Shannon entropy of POI categories within AOI + 100 m buffer | POI data |
| D2: Diversity | POI Density | PD | POI count/AOI area (per km²) | POI data |
| D3: Design | Building Age | BldgAge | Area-weighted mean building age (years since construction) | Building.shp |
| D4: Transit | Distance to Bus Station | DistBus | Euclidean distance from AOI centroid to nearest bus station (m) | Bus Station.shp |
| D4: Transit | Bus Station Density | BusDen500 | Count of bus stations within 500 m buffer of AOI centroid | Bus Station.shp |
| D5: Destination | Distance to CBD | DistCBD | Euclidean distance from AOI centroid to nearest CBD centroid (m) | CBD.shp |
| S: Surroundings | Green Coverage Ratio | GreenRatio | Proportion of green-class AOI area within 500 m buffer | AOI.shp |
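
Several of the formulas in the table above reduce to simple ratios and an entropy sum. The sketch below shows one way they could be computed per AOI, assuming natural-log Shannon entropy (the table does not state the log base), areas in m², and floor areas supplied directly; all function names and sample values are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(categories):
    """Shannon entropy of a list of POI category labels (natural log assumed)."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(-(c / total) * math.log(c / total) for c in counts.values())

def building_density(footprint_areas_m2, aoi_area_m2):
    """BD: total building footprint area divided by AOI area."""
    return sum(footprint_areas_m2) / aoi_area_m2

def floor_area_ratio(floor_areas_m2, aoi_area_m2):
    """FAR: total floor area divided by AOI area."""
    return sum(floor_areas_m2) / aoi_area_m2

def poi_density(poi_count, aoi_area_m2):
    """PD: POI count per square kilometre of AOI area."""
    return poi_count / (aoi_area_m2 / 1_000_000)

# Hypothetical AOI of 10,000 m² with two buildings and four POIs.
aoi_area = 10_000.0
pois = ["food", "food", "retail", "school"]

ent = shannon_entropy(pois)
bd = building_density([500.0, 300.0], aoi_area)
far = floor_area_ratio([2500.0, 900.0], aoi_area)
pd_per_km2 = poi_density(len(pois), aoi_area)
```

With 22 POI categories, natural-log entropy is bounded by ln 22 ≈ 3.09, which is at least compatible with the reported ENT maximum of 2.64, although the source does not confirm the base.
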
| Variable | Mean | Std. Dev. | Min | Median | Max |
|---|---|---|---|---|---|
| Vitality (Y) | 0.265 | 0.266 | 0.000 | 0.159 | 1.000 |
| BD | 0.067 | 0.113 | 0.000 | 0.000 | 0.720 |
| AH (m) | 8.09 | 11.08 | 0.00 | 0.00 | 86.49 |
| FAR | 0.39 | 0.73 | 0.00 | 0.00 | 7.90 |
| ENT | 1.51 | 0.85 | 0.00 | 1.85 | 2.64 |
| PD (per km²) | 629 | 1258 | 0 | 82 | 14,289 |
| BldgAge (years) | 26.9 | 7.69 | 6.0 | 27.5 | 39.8 |
| DistBus (m) | 622 | 1419 | 6 | 263 | 15,605 |
| BusDen500 | 5.79 | 6.32 | 0 | 4 | 42 |
| DistCBD (m) | 16,017 | 15,518 | 50 | 10,710 | 88,461 |
| GreenRatio | 0.104 | 0.139 | 0.000 | 0.053 | 1.292 |

| Model | R² (Test) | RMSE | MAE | 5-Fold CV R² |
|---|---|---|---|---|
| OLS | 0.634 | 0.161 | 0.124 | – |
| Random Forest | 0.833 | 0.108 | 0.077 | – |
| XGBoost | 0.846 | 0.104 | 0.073 | 0.713 ± 0.115 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Lu, X.; Zhang, H.; Li, W.; Li, Y.; Xu, Z.; Niu, S. Unraveling the Non-Linear Impact of the Built Environment on Population-Based Residential Vitality at the Block Scale: An Explainable AI Approach Using Multi-Source Open Data in Zhengzhou, China. Buildings 2026, 16, 2229. https://doi.org/10.3390/buildings16112229

