1. Introduction
With the rapid development of population aging, the equitable allocation of elderly care facilities has become a critical social concern. As early as 2007, the World Health Organization (WHO) proposed the WHO Age-friendly Cities Framework, which identifies eight core domains for constructing age-friendly cities: outdoor spaces and buildings, transportation, housing, social participation, respect and social inclusion, civic participation and employment, communication and information, and community support and health services. This framework emphasizes the significance of aging issues in global sustainable development [
1]. In China, actively addressing population aging and building age-friendly cities has been elevated to a national strategy. To mitigate the challenges of population aging and enhance the quality of life for older adults, the government has implemented a series of policies emphasizing the strengthening of elderly care service facilities and optimizing resource allocation within community living circles. Jointly issued by multiple Chinese government departments, the “Notice on Promoting the Construction of Elderly Care Service Facilities in Urban Residential Areas” sets a target to achieve 100% compliance in supporting elderly care facilities for new residential areas by 2025. This initiative aims to establish a “15-min elderly care service circle” and implement hierarchical classification systems to refine community-based care facility configurations [
2]. China’s urban community elderly care service planning has developed a distinctive pathway characterized by “government leadership, social participation, and community embeddedness”.
From a theoretical perspective, optimizing the spatial planning of elderly care facilities requires comprehensive consideration of older adults’ needs, particularly facility location, scale, and accessibility. In practice, however, traditional siting methods predominantly emphasize static spatial attributes of facilities and urban community planning. These approaches increasingly lack scientific rigor and precision when applied to complex urban layouts and dynamically evolving population structures [3]. Consequently, providing scientific support for elderly care facility siting is both necessary and urgent if the growing demands of aging populations are to be met.
In recent years, scholars worldwide have conducted multidimensional research on elderly care facility planning and site selection, yielding notable achievements. Their studies provide valuable references in indicator evaluation and methodological frameworks. For instance, Mingyang Li et al. enhanced the potential model for spatial accessibility calculation by incorporating accessibility of age-friendly medical institutions, weighted built environment factors, and residents’ maximum travel time preferences, thereby investigating care facility accessibility in megacities [
4]. Zhonghui Jiang et al. evaluated 15 min walking accessibility for older adults through streetscape semantic segmentation and deep learning techniques, exploring the alignment between urban community service facility configurations and elderly pedestrian behavior [
5]. Hu H. et al. employed Geographic Information Systems (GIS) and multi-source data analysis to examine spatial distribution patterns of elderly care facilities by assessing the consistency between facility supply and population demand [
6]. In investigating nonlinear relationships between urban built environments and resident behaviors, some scholars have utilized XGBoost models to process large-scale datasets and complex interactions. Compared to other mainstream machine learning models, XGBoost demonstrates superior accuracy, computational efficiency, and scalability [
7].
Despite these advancements, existing research exhibits several limitations. First, although declining physical capacity reduces older adults’ ability to reach facilities, many studies neglect the impact of the built environment on care facility layouts and fail to account adequately for the distinctive travel behaviors and service requirements of elderly populations [8]. Second, most research assumes linear relationships between facilities, resident mobility, and built environments. However, empirical evidence suggests complex nonlinear interactions and threshold effects among influencing factors. For example, in elderly care facility distribution, the relationship between population density and facility demand may follow an “S-shaped” curve: demand grows slowly below a density threshold, accelerates beyond this threshold, and plateaus at excessive densities. Such nonlinear characteristics can be visualized through SHAP dependence plots [9]. Furthermore, built environments exhibit synergistic effects on both facility accessibility and density when analyzing factors influencing elderly walking duration [10]. Prevailing linear assumptions in current studies therefore lead to biased interpretations of care facility layouts [11].
Additionally, insufficient attention has been paid to functional distinctions among elderly care facility types. Different categories of care facilities require distinct spatial planning considerations, yet existing research predominantly treats heterogeneous facilities as homogeneous POI (Point of Interest) coordinates. This oversimplification overlooks functional variations and corresponding spatial demand characteristics, resulting in non-targeted planning practices that inadequately address the diverse needs of aging populations [
12].
Finally, machine learning models employed in care facility siting predictions suffer from interpretability limitations. While capable of generating predictive outcomes, their complex algorithmic logic obscures explicit visualization of variable-specific weights or interaction mechanisms. This opacity hinders policymakers’ ability to comprehend prediction processes and outcomes, thereby undermining the credibility and scientific validity of spatial planning decisions. Although spatial statistical models demonstrate utility in analyzing complex spatial and non-spatial effects (e.g., nonlinearity and synergies), they exhibit temporal lag constraints. In contrast, locally interpretable machine learning models show enhanced performance in geographically oriented data interpretation, modeling, and visualization.
Building upon identified gaps in the existing scholarship, this study aims to address the following research questions:
(1) How to holistically integrate built environment characteristics and supply–demand dynamics for categorical prediction of elderly care facilities, thereby accurately reflecting site-specific requirements across facility typologies?
(2) Which specific built environment factors (e.g., population density, road network density, land use patterns) exert significant nonlinear or threshold effects on the spatial distribution of heterogeneous elderly care facility categories?
(3) Through what interactive mechanisms do supply–demand environmental elements, such as co-located medical, commercial, and recreational facilities, shape the spatial configuration of elderly care service networks? Specifically, this study quantifies the synergistic effects among healthcare resource allocation (e.g., proportion of medical facilities within grid cells), commercial diversity (e.g., density of retail facilities), and recreational accessibility (e.g., distance to parks), and visualizes these nonlinear impacts through SHAP interaction plots. Anticipated outcomes include identifying key interactive factors (such as a marked increase in care facility placement probability when the medical facility proportion exceeds 0.2) and providing mechanistic explanations for policy formulation.
(4) How to establish a precision-oriented indicator system anchored in functional typology and operational thresholds, enabling facility-type-specific predictive modeling for optimized planning outcomes?
2. Literature Review
Building upon the aforementioned analysis of supply–demand environmental quality within diverse living circles, the composite indicators of significance, diversity, and accessibility derived from supply–demand dynamics effectively reflect the practical needs of elderly populations and exert substantial influence on elderly care facility siting. Existing studies demonstrate that elderly care facilities closely associated with daily living constitute critical factors in spatial planning. As essential urban infrastructure, analyzing their current spatial distribution provides valuable references for rational planning optimization [
13]. These facilities encompass medical services, commercial amenities, dining options, transportation hubs, recreational spaces, and fitness venues. Comprehensive integration of their spatial patterns enhances service efficiency in facility siting [
14], establishing these indicator systems as fundamental components of scientific decision-making. Furthermore, facility planning necessitates consideration of urban built environment characteristics and elderly mobility behaviors. Parameters including population density and road network density must be incorporated to provide comprehensive data support for evidence-based siting strategies.
The built environment has long been recognized as a critical determinant of residential convenience [
15]. Scholars have proposed the “5D” evaluation framework (Density, Diversity, Design, Distance, and Destination accessibility) to quantify its impacts on facility planning [
16]. Among these, density metrics—particularly population and road network densities—demonstrate significant correlations with elderly public transit utilization patterns [
17]. Empirical studies confirm a positive correlation between urban density and elderly public transit adoption rates [
18], though excessive density may induce negative externalities that suppress mobility willingness [
19]. This dual nature necessitates the incorporation of nonlinear threshold effects when evaluating density impacts. Beyond density considerations, safety-oriented pedestrian infrastructure design—including optimized road lengths, neighborhood greening, and public facility configurations—positively influences elderly quality of life [
20]. Compared to younger demographics, elderly populations exhibit heightened dependence on accessibility [
21], with mobility behaviors significantly correlated to the quantity and density of medical, commercial, and environmental amenities within community peripheries [
22]. Current research inadequately differentiates built environment impacts across demographic groups. This study addresses this gap through a “demand-environment” compatibility analysis, integrating supply–demand indicators with built environment parameters to establish elderly-centric intermediate variables, thereby laying a data foundation for spatial planning of elderly care service circles.
In facility planning research, diverse methodologies exhibit distinct characteristics and applicability domains (Table 1). Qualitative approaches primarily rely on expert knowledge, case studies, and field investigations. Traditional algorithms, including linear programming, Geographic Information Systems (GIS), and Multi-Criteria Decision Analysis (MCDA), are valued for their structural clarity but face limitations in processing large-scale data and complex models. The advent of artificial intelligence has popularized machine learning algorithms such as Support Vector Machines (SVM), Decision Trees, Random Forests, K-Means Clustering, and Q-Learning [23], which excel in modeling nonlinear relationships and handling extensive datasets, albeit requiring substantial training data and meticulous parameter tuning. Deep learning subfields like Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) demonstrate superiority in high-dimensional data processing but suffer from poor interpretability and high computational costs [24]. Comparatively, machine learning algorithms balance processing capability, interpretability, and adaptability, positioning them as ideal solutions for urban facility siting challenges.
In summary, scholars globally have conducted multidimensional research on elderly care facility planning and siting, yielding significant advancements in indicator systems and methodological frameworks. However, this study identifies three critical research gaps warranting further investigation:
First, existing studies predominantly derive from general service facility paradigms, inadequately addressing the unique impacts of built environments on elderly care facility distribution. The distinct mobility patterns and service needs of aging populations necessitate demand-driven evaluation criteria tailored to gerontological characteristics. Furthermore, while prevailing research assumes linear relationships between facilities and environmental factors, empirical evidence suggests dynamic nonlinear interactions and threshold effects. Specific built environment attributes may exert pronounced influences within particular value ranges yet exhibit substantial variability in synergistic impacts across different parameters.
Second, current facility planning research largely overlooks typological differentiation among elderly care facilities—specifically activity-oriented, care-oriented, and service-oriented categories—each requiring distinct spatial configuration principles. The homogenization of facilities as generic POIs (Points of Interest) coordinates obscures critical functional and operational variations, resulting in non-stratified planning strategies that fail to address heterogeneous elderly needs.
Third, machine learning siting models remain difficult to interpret, which limits their usefulness for planning decisions. To address this gap, this study integrates the SHAP (SHapley Additive exPlanations) framework with XGBoost modeling to make predictions interpretable. This approach elucidates feature-specific contribution mechanisms and quantifies threshold effects of environmental variables, enabling transparent visualization of how parametric variations influence siting outcomes. The methodology not only generates actionable planning recommendations for Xi’an City but also establishes a transferable paradigm for similar urban contexts confronting aging population challenges.
3. Data and Method
3.1. Study Area and Data Source
Xi’an is located in the hinterland of the Guanzhong Plain and is known as “the ancient capital of 13 dynasties” and “the starting point of the Silk Road”. It was among the first batch of national historical and cultural cities, and its street space has considerable research value owing to this historical and cultural significance. As the sole National Central City in Northwest China, Xi’an exhibits a dual acceleration pattern in its demographic structure: while its permanent population reached 13.0411 million in 2023 (with an urbanization rate of 81.2%), the proportion of residents aged 60+ rose to 17.01% (2.2186 million), and the proportion aged 65+ reached 11.75% (1.5329 million), surpassing the international threshold for an aging society (7% aged 65+). Central urban areas of Xi’an exhibit a high concentration of elderly care facilities, while their aging rates all exceed the international threshold of 10%. This results in severe inequity in service accessibility across regions. Examining the spatial allocation and optimization of elderly care facilities in Xi’an’s central urban areas can therefore provide empirical references for other Chinese cities. This study takes Xi’an’s central urban area as the study site. The scope covers six administrative divisions (Beilin, Weiyang, Xincheng, Yanta, Baqiao, and Lianhu Districts) with a total area of 843.56 square kilometers, accounting for 16.17% of the entire area of Xi’an City. The study scope includes 16,784 streets. An overview of the study area is shown in Figure 1.
This paper mainly involves three types of data: POI data, administrative division data, and road network data (Table 2). AutoNavi Maps is one of the largest map content and location service providers in China. In this study, a Python (3.9.1) program was written to obtain the 2022 POI data of six districts of Xi’an (Xincheng, Beilin, Lianhu, Yanta, Weiyang, and Baqiao) from the AutoNavi Open Map platform. To address missing data and biases, the following cleaning procedures were adopted:
(1) Manual Validation: Removal of duplicate or invalid entries (e.g., erroneous coordinates), retaining a total of 263,920 valid POIs;
(2) Bias Correction: Spatial consistency verification via cross-referencing with OpenStreetMap road network data, supplemented by Kriging interpolation for sparsely populated areas (e.g., suburban grids);
(3) Completeness Verification: Random sampling of 500 grids compared against field survey data, achieving an accuracy rate of 95.2%.
Each POI record includes six attributes, defined according to the functions of city streets and the AutoNavi Maps POI analysis system: name, type, address, longitude, latitude, and area name.
3.2. Study Framework
Integrating the impacts of built environments, supply–demand dynamics, and public service facilities on elderly care facility planning, this study establishes 500 m square spatial grid units based on the pedestrian accessibility principle for seniors, using a 15 min walking distance (at a walking speed of 4 km/h) as the service radius benchmark for standard elderly care facilities. The research framework (Figure 2) comprises three phases:
Phase 1: Model Development and Validation: Drawing upon the spatial distribution patterns of existing elderly care facilities in Xi’an and prior methodological frameworks, we train a machine learning-based siting prediction model. The model’s simulation outcomes are spatially overlaid with current facility distributions to validate its scientific rigor. A spatial congruence threshold exceeding 80% between simulated and actual distributions is established as the reliability criterion. If unmet, iterative parameter optimization is implemented until the threshold is achieved.
Phase 2: Grid-Level Suitability Prediction: The validated model executes predictive simulations across all grid units to assess site suitability for new elderly care facilities. Multi-criteria evaluation incorporates density gradients (population/road network), accessibility metrics, and functional complementarity with co-located amenities (medical/commercial/recreational facilities).
Phase 3: Priority Site Selection: By synthesizing living circle delineations and predictive suitability scores, priority construction sites are identified through a weighted scoring matrix. This matrix hierarchically prioritizes the following: grids exhibiting both high predicted suitability and underserved service coverage; areas with optimal integration of accessibility and functional diversity; and locations demonstrating synergistic alignment with urban renewal initiatives.
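The grid construction described at the start of this subsection can be illustrated with a short sketch. Only the 4 km/h walking speed, the 15 min walk, and the 500 m cell size come from the text; the use of projected coordinates in metres, the grid origin, and the function names are assumptions made for illustration.

```python
import math

WALK_SPEED_KMH = 4.0   # walking speed stated in the text
WALK_MINUTES = 15      # 15 min elderly care service circle
GRID_SIZE_M = 500      # edge length of one square grid unit


def service_radius_m(speed_kmh=WALK_SPEED_KMH, minutes=WALK_MINUTES):
    """Distance in metres covered at the given speed within the given time."""
    return speed_kmh * 1000 * minutes / 60


def grid_index(x_m, y_m, origin_x=0.0, origin_y=0.0, size=GRID_SIZE_M):
    """Map a projected coordinate (metres) to a (row, col) grid cell.

    The origin is hypothetical; in practice it would be the lower-left
    corner of the study-area bounding box.
    """
    col = math.floor((x_m - origin_x) / size)
    row = math.floor((y_m - origin_y) / size)
    return row, col


radius = service_radius_m()
cells_per_radius = radius / GRID_SIZE_M  # how many cell widths one radius spans
```

Under these assumptions the service radius is 1000 m, so one walking radius spans two grid cells in each direction.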
3.3. Machine Learning
This study employs XGBoost (XGB), CatBoost (CB), LightGBM (LGB), and Random Forest (RF) models for their balanced strengths in computational efficiency, interpretability, and nonlinear processing within spatial planning applications. These ensemble methods adapt better to medium-scale geospatial data than deep learning alternatives (e.g., CNN/RNN): XGBoost mitigates overfitting via regularization and can capture threshold effects; CatBoost efficiently processes categorical variables such as facility types; LightGBM optimizes computational speed; and Random Forest enhances overall model robustness. Although deep learning models excel in high-dimensional feature extraction, their demand for extensive training data and limited interpretability render them less suitable for policy-driven threshold analysis in geospatial contexts. Each of the four models is trained on the same data and evaluated, and the model with the best performance and applicability is selected for the spatial planning of elderly care service facilities. In recent years, machine learning has been extensively applied to urban spatial analysis, demonstrating significant advantages in computational efficiency and predictive accuracy.
The XGBoost model, proposed by Tianqi Chen of the University of Washington in 2016, is a gradient-boosted tree framework that supports parallel computing [25]. The model is characterized by fast parallel computation, high prediction accuracy, and operational flexibility [26]. In recent research practice, it has been applied across diverse domains such as financial analytics and energy consumption forecasting [27,28,29]. The fundamental building blocks of this model are Classification and Regression Trees (CARTs). The mathematical representation of the model can be expressed as follows:
Here, G_j denotes the sum of first-order partial derivatives of the loss over the samples contained in leaf node j, H_j represents the sum of second-order partial derivatives over the samples in leaf node j, W_j corresponds to the weight (output score) of leaf node j, and T indicates the total number of leaf nodes in the tree.
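The symbols defined here match the standard second-order objective of XGBoost for a single tree, with T read as the number of leaf nodes of that tree. As a reference sketch, and assuming the usual regularization coefficients λ and γ (which are not defined in this text), that objective and its optimal leaf weight can be written as:

```latex
\[
\mathrm{Obj} = \sum_{j=1}^{T}\left[ G_j W_j + \frac{1}{2}\left(H_j + \lambda\right) W_j^{2} \right] + \gamma T,
\qquad
W_j^{*} = -\frac{G_j}{H_j + \lambda},
\qquad
\mathrm{Obj}^{*} = -\frac{1}{2}\sum_{j=1}^{T}\frac{G_j^{2}}{H_j + \lambda} + \gamma T .
\]
```

The term γT penalizes the number of leaves and λ shrinks the leaf weights, which is the regularization referred to earlier in this subsection.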
The CatBoost method, since its introduction by Prokhorenkova et al., has gained widespread recognition among researchers and has been extensively applied across diverse domains such as healthcare, geology, and urban development [30]. CatBoost is a machine learning algorithm rooted in Gradient Boosting Decision Trees (GBDT). Its primary advantage lies in its efficient and systematic handling of categorical features, complemented by robust resistance to overfitting and strong generalization capabilities. As demonstrated by Prokhorenkova et al. [31], the functional representation of its decision tree h can be formulated as follows:
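In the CatBoost formulation of Prokhorenkova et al., a decision tree is a piecewise-constant function over disjoint regions of the feature space. A sketch of that representation, where the symbols J, R_j, and b_j follow the usual notation and are assumptions relative to this text, is:

```latex
\[
h(x) = \sum_{j=1}^{J} b_j \, \mathbb{1}\{x \in R_j\}
\]
```

Here R_j are the disjoint regions corresponding to the leaves of the tree, b_j is the value predicted in region R_j, and the indicator equals 1 when x falls in R_j and 0 otherwise.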
Following the paradigm of XGBoost and CatBoost, LightGBM (LGB) constitutes another gradient-boosting decision tree (GBDT)-based algorithm. This method significantly enhances computational efficiency through three key advantages: accelerated training speed, reduced memory consumption, and improved predictive accuracy compared to conventional GBDT implementations. The algorithmic superiority of LightGBM in both efficiency and precision has been widely acknowledged in scholarly investigations.
As Sun et al. [32] elucidate, LightGBM integrates multiple regression trees through an ensemble learning framework, whose mathematical formulation can be expressed as follows:
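A gradient-boosted ensemble of regression trees is conventionally written as an additive model. The following is a sketch in generic notation (M trees f_m drawn from the space of regression trees); it is an assumed standard form rather than the exact expression given by Sun et al.:

```latex
\[
F_M(x) = \sum_{m=1}^{M} f_m(x)
\]
```

Each f_m is fitted to the negative gradient of the loss with respect to the current prediction F_{m-1}(x), so every new tree corrects the residual error of the trees before it.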
The Random Forest (RF) model, pioneered by Breiman [33], operates as an ensemble of multiple decision tree models. By aggregating numerous weak learners (individual trees), it forms a strong learner with enhanced predictive capabilities. The method uses randomized sampling to construct diverse decision trees, and the randomness in feature and data selection during model construction enhances its robustness against overfitting and noise. As a well-established machine learning technique, the Random Forest model has been extensively validated through long-term applications across diverse research domains, including ecological modeling and biomedical informatics [34,35,36]. The final prediction is determined by majority voting among the constituent decision trees in classification tasks or by averaging their outputs in regression tasks.
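The aggregation rule just described can be illustrated with a minimal sketch. The per-tree predictions are made-up inputs, and the function names are hypothetical; a real Random Forest library performs this step internally.

```python
from collections import Counter
from statistics import mean


def majority_vote(tree_predictions):
    """Classification: return the class label predicted by the most trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


def average_output(tree_predictions):
    """Regression: return the mean of the per-tree outputs."""
    return mean(tree_predictions)


# Five hypothetical trees voting on whether a grid suits a facility (1) or not (0).
votes = [1, 0, 1, 1, 0]
label = majority_vote(votes)

# The same five trees producing suitability scores in a regression setting.
scores = [0.5, 0.25, 0.75, 1.0, 0.0]
score = average_output(scores)
```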
3.4. Built Environment
This study comprehensively considers the integrated impacts of supply–demand dynamics and built environmental characteristics on facility layout optimization by synthesizing three service quality metrics (importance, diversity, accessibility) with two built environment parameters (road network density and population density). The elderly population density, serving as a crucial indicator for evaluating built environment quality, is derived from China’s Seventh National Population Census data. Road network density quantifies regional transportation infrastructure through road connectivity measurement, reflecting a fundamental design parameter in urban planning. Importance assesses the environmental quality of elderly care services, while diversity measures the richness of facility types within spatial grid units through entropy-based calculations.
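The entropy-based diversity measure can be illustrated as follows. The text does not give the formula, so this sketch assumes the common Shannon entropy over the counts of each facility type in a grid; the function name and the example counts are hypothetical.

```python
import math


def shannon_diversity(counts):
    """Shannon entropy (natural log) of facility-type counts within one grid.

    Returns 0.0 for an empty grid or a grid with a single facility type,
    and increases as facility types become more numerous and more evenly mixed.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts:
        if count > 0:
            share = count / total
            entropy -= share * math.log(share)
    return entropy


# Hypothetical grids: one evenly mixed, one dominated by a single type.
mixed_grid = shannon_diversity([5, 5])
single_type_grid = shannon_diversity([10, 0, 0])
```

With two equally represented types the value equals ln 2, while a grid containing only one type scores zero.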
To evaluate importance more objectively, 20 experts in urban planning, architecture, and related disciplines were invited to score the indicators using the Analytic Hierarchy Process (AHP). A judgment matrix was constructed from each participant’s importance ratings, and the consistency of each matrix was checked; the final weights of the six street environmental quality indicators are shown in Table 3.
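The AHP weighting and consistency check can be sketched as follows. The paper does not state which weight-derivation method was used, so this sketch assumes the row geometric-mean approximation and Saaty's published random index values; the example judgment matrix is hypothetical and is not the experts' data.

```python
import math

# Saaty's random consistency index by matrix order (standard published values).
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Approximate priority weights by normalizing the row geometric means."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """CR = CI / RI, with CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    if n <= 2:
        return 0.0  # matrices of order 1 or 2 are always consistent
    lambda_max = sum(
        sum(matrix[i][j] * weights[j] for j in range(n)) / weights[i]
        for i in range(n)
    ) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Hypothetical 3 x 3 pairwise comparison matrix (reciprocal by construction).
judgments = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
weights = ahp_weights(judgments)
cr = consistency_ratio(judgments, weights)
```

A judgment matrix is conventionally accepted when CR is below 0.1; otherwise the expert is asked to revise the comparisons.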
Accessibility evaluates the convenience of elderly residents’ access to service facilities within grid units using the two-step floating catchment area (2SFCA) method.
Step 1: For an elderly care facility, j, search for all populations, k, within a threshold travel distance, $d_0$, from facility j and calculate the supply–population ratio, $R_j$, within the search area:
$$R_j = \frac{S_j}{\sum_{k \in \{d_{kj} \le d_0\}} P_k}$$
where $P_k$ is the population whose center of mass at location k falls within search area j; $S_j$ is the capacity of the elderly living facility j; and $d_{kj}$ is the travel distance from k to j.
The level (quality indicator) and competition among elderly care facilities are considered in the improved model. Specifically, the Huff model was used to calculate the probability of selecting an elderly care facility for each township/street population. The expression is as follows:
$$\mathrm{prob}_{ij} = \frac{S_j \, f(d_{ij})}{\sum_{k} S_k \, f(d_{ik})}$$
where $\mathrm{prob}_{ij}$ is the probability of i choosing j; $S_j$ is the level of elderly care facility supply, determined by the combination of area and rank; and $f(d_{ik})$ is the decay function of the distance between i and k, with the sum taken over the candidate facilities k. Introducing the Huff model into 2SFCA accounts for both the time cost between residential points and elderly care facilities and the attractiveness of the facilities to the elderly, thus incorporating the choice behavior of residents into the accessibility calculation.
The model absorbs the advantages of previous studies and takes the influence of the road network characteristics on the accuracy of accessibility measurement into account; the calculation process is as follows:
Step 1, for each elderly care facility,
Step 2, for each township (or street) (i),
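The two steps can be sketched in code. The following is a minimal illustration of a Huff-weighted 2SFCA under assumptions that are not taken from the paper: a Gaussian distance-decay function, precomputed travel distances, and no explicit road-network correction. All names and the toy data are hypothetical.

```python
import math


def gaussian_decay(d, d0):
    """Assumed decay function: 1 at d = 0, falling to 0 at d >= d0."""
    if d > d0:
        return 0.0
    edge = math.exp(-0.5)
    return (math.exp(-0.5 * (d / d0) ** 2) - edge) / (1 - edge)


def huff_2sfca(pop, supply, dist, d0):
    """Accessibility score for each demand unit i.

    pop[i]     elderly population of unit i
    supply[j]  supply level of facility j
    dist[i][j] travel distance from unit i to facility j
    """
    n, m = len(pop), len(supply)
    f = [[gaussian_decay(dist[i][j], d0) for j in range(m)] for i in range(n)]

    # Huff probability that unit i chooses facility j.
    prob = []
    for i in range(n):
        denom = sum(supply[k] * f[i][k] for k in range(m))
        prob.append([supply[j] * f[i][j] / denom if denom > 0 else 0.0
                     for j in range(m)])

    # Step 1: supply-to-demand ratio of each facility.
    ratio = []
    for j in range(m):
        demand = sum(prob[i][j] * pop[i] * f[i][j] for i in range(n))
        ratio.append(supply[j] / demand if demand > 0 else 0.0)

    # Step 2: sum the reachable ratios for each demand unit.
    return [sum(prob[i][j] * ratio[j] * f[i][j] for j in range(m))
            for i in range(n)]


# Toy case: one facility, one unit next to it and one unit out of range.
access = huff_2sfca(pop=[100, 100], supply=[50], dist=[[0], [2000]], d0=1000)
```

In the toy case the nearby unit receives the full ratio of 50 places per 100 residents, and the unit beyond the threshold distance receives zero accessibility.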
Furthermore, the quantity and spatial distribution of medical, commercial, catering, and transportation facilities, all environmental determinants that significantly influence elderly living quality, constitute essential components of the built environment. Detailed specifications and operational definitions of these indicators are presented in Table 4.
3.5. Site Prediction Model Construction
To perform the predictive analysis of elderly care facility locations in Xi’an, the classification of POI data for both elderly care facilities and related infrastructure is essential. First, urban facilities in Xi’an were categorized into 10 primary classes: transportation, dining, shopping, daily services, sports, medical, cultural/educational, financial, environmental, and elderly care facilities, totaling 263,920 POI data points. Subsequently, based on the mobility patterns of older adults and the 15 min walking distance criterion, service coverage areas for elderly care facilities were demarcated. The central urban region was divided into 3559 grid cells (500 × 500 m) to analyze spatial distribution patterns. GIS spatial analysis tools were then applied to identify facility distributions within each grid. As illustrated in
Figure 3, Grid 156, for instance, may contain sports/recreational facilities, corporate enterprises, cultural/educational institutions, dining establishments, and elderly care facilities. By spatially comparing all grids with facility categories, binary indicators (1 for presence, 0 for absence) of facility existence in each grid were generated to support subsequent machine learning model training and predictive analysis.
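The construction of binary presence indicators can be sketched as follows. The sketch assumes each POI has already been assigned to a grid identifier by the GIS overlay; the category names are shortened labels and the records are made up for illustration.

```python
CATEGORIES = ["transport", "dining", "shopping", "daily_services", "sports",
              "medical", "culture_education", "finance", "environment",
              "elderly_care"]


def presence_matrix(poi_records, categories=CATEGORIES):
    """Return {grid_id: {category: 0 or 1}} from (grid_id, category) pairs."""
    rows = {}
    for grid_id, category in poi_records:
        row = rows.setdefault(grid_id, {c: 0 for c in categories})
        if category in row:
            row[category] = 1  # presence only; repeated POIs do not add up
    return rows


# Hypothetical POI records for two grids.
records = [
    (156, "sports"), (156, "dining"), (156, "elderly_care"), (156, "dining"),
    (157, "medical"),
]
indicators = presence_matrix(records)
```

Each row then serves as one sample: the elderly care indicator is the dependent variable and the remaining nine indicators are candidate predictors.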
Through the systematic classification and spatial analysis of POI data, this study elucidates the distribution characteristics of diverse facilities and their spatial correlations with elderly care infrastructure. The methodology provides a scientific foundation for optimizing the planning and layout of elderly care facilities, enhancing service accessibility and quality to better meet the needs of older populations. Furthermore, this integrated analytical framework offers a transferable model for other cities to refine their elderly care facility planning strategies, thereby advancing equitable resource allocation and age-friendly urban development.
To prevent overlap between the training and prediction datasets, this study implemented rigorous data screening and sampling procedures. From the 3559 valid grid cells, balanced random sampling with equal numbers of positive and negative samples was applied to select 400 grid cells as the training dataset. Using the presence of elderly care facilities within each grid as the dependent variable and nine categories of elderly care-related facilities combined with built environment factors as independent variables, a machine learning predictive model was trained to assess location suitability in Xi’an’s central urban area. The initial prediction identified 1208 candidate grid cells deemed suitable for elderly care facility placement, encompassing both existing facility locations and newly predicted sites. Separating the datasets in this way supports methodological robustness in spatial suitability modeling and reduces the risk of overfitting.
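The balanced sampling step can be sketched as follows. The 400-cell sample size and the equal positive and negative ratio come from the text; the label dictionary, the random seed, and the function name are assumptions for illustration.

```python
import random


def balanced_sample(labels, n_total, seed=0):
    """Draw n_total grid ids with equal numbers of positives and negatives.

    labels maps grid_id -> 1 (contains an elderly care facility) or 0.
    """
    rng = random.Random(seed)
    positives = [g for g, y in labels.items() if y == 1]
    negatives = [g for g, y in labels.items() if y == 0]
    half = n_total // 2
    if len(positives) < half or len(negatives) < half:
        raise ValueError("not enough cells in one class for a balanced sample")
    return rng.sample(positives, half) + rng.sample(negatives, half)


# Hypothetical labels for 1000 grids, 300 of which contain a facility.
labels = {grid_id: 1 if grid_id < 300 else 0 for grid_id in range(1000)}
training_ids = balanced_sample(labels, n_total=400)
```

Grids outside `training_ids` remain available for prediction, which keeps the two datasets disjoint.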
As illustrated in
Figure 4a, the existing 582 elderly care facilities were utilized as a test set to evaluate the model’s goodness of fit. The validation results demonstrated that 90.7% of these facilities spatially aligned with the predicted suitability grids, indicating reasonable predictive accuracy. Building upon this, the grids containing existing elderly care facilities were filtered out during the initial screening phase (
Figure 4b), yielding the first-round suitability prediction. However, this preliminary outcome solely relied on data-driven spatial correlations, neglecting the practical impacts of living circle delineations (e.g., basic, neighborhood, and shared living circles) on older adults’ actual access to care services. To address this limitation, a secondary screening was conducted to refine the predictions by incorporating living circle boundaries. As shown in
Figure 4c, grids falling outside the service coverage of all three-tier living circles (basic, neighborhood, and shared) were excluded due to their low accessibility in real-world scenarios. This process ultimately identified 543 high-suitability grids for elderly care facility siting, balancing statistical robustness with operational feasibility.
The spatial distribution of 582 existing elderly care facilities in Xi’an serves as an independent test dataset, with selection criteria encompassing the following: (1) comprehensive coverage across all administrative divisions within the study area; (2) elimination of service radius overlap areas through >500 m buffering analysis to ensure spatial independence; and (3) inclusion of balanced samples representing three facility types (service-oriented, activity-oriented, and care-oriented). The 3559 spatial grids were partitioned into training (70%) and validation (30%) sets using stratified sampling to maintain proportional representation of facility categories. An XGBoost model was trained on the training set to generate facility placement probability distributions for each grid. Predictive results (grids with binarized probability > 0.5) were spatially overlaid with actual facility distributions for comparative analysis. The Spatial Match Ratio (SMR) was calculated as follows:
\[
\mathrm{SMR} = \frac{N_{\mathrm{match}}}{N_{\mathrm{total}}} \times 100\%
\]
where $N_{\mathrm{match}}$ denotes the count of predicted grids coinciding with actual facility locations, and $N_{\mathrm{total}}$ represents the total number of actual facilities (582). This validation framework quantifies predictive accuracy through spatial coupling metrics, providing empirical evidence for the scientific robustness of the siting methodology.
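A minimal sketch of the match ratio follows. It assumes each facility has already been assigned to one grid cell identifier and that the prediction is a set of cell identifiers; the function name and the toy cell ids are hypothetical. For orientation only, 528 matches out of 582 facilities would give about 90.7%; the actual match count is not reported in the text.

```python
def spatial_match_ratio(facility_cells, predicted_cells):
    """Share of facilities whose grid cell is among the predicted cells."""
    predicted = set(predicted_cells)
    n_total = len(facility_cells)
    if n_total == 0:
        raise ValueError("at least one facility is required")
    n_match = sum(1 for cell in facility_cells if cell in predicted)
    return n_match / n_total

# Toy example with (row, col) cell ids; the values are illustrative only.
facility_cells = [(0, 0), (0, 1), (2, 3), (5, 5)]
predicted_cells = [(0, 0), (2, 3), (5, 5), (9, 9)]
smr = spatial_match_ratio(facility_cells, predicted_cells)
```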
4. Results and Discussion
4.1. Results
To objectively evaluate the predictive performance of each model, this study employed six evaluation metrics: Accuracy, Precision, Recall, F1-score, ROC curve (Receiver Operating Characteristic curve), and AUC (Area Under the ROC Curve). The ROC curve is plotted with the False Positive Rate (FPR) as the x-axis and the True Positive Rate (TPR, equivalent to Recall) as the y-axis, while the AUC represents the area under this curve.
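The threshold-based metrics and the ROC construction described above can be sketched in plain Python. The labels and scores below are toy values, and the 0.5 cut-off is used only for illustration; none of these numbers come from the study.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 from the confusion counts."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / denom if denom else 0.0,
    }

def roc_points(y_true, scores):
    """(FPR, TPR) pairs obtained by lowering the threshold score by score."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        y_pred = [1 if s >= thr else 0 for s in scores]
        tp, tn, fp, fn = confusion_counts(y_true, y_pred)
        points.append((fp / n_neg, tp / n_pos))
    return points

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
metrics = classification_metrics(y_true, y_pred)
curve = roc_points(y_true, scores)
```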
The preprocessed data were divided into a training dataset (70% of the data) and a testing dataset (30% of the data). The training dataset was input into four machine learning models—CatBoost (CB), XGBoost (XGB), LightGBM (LGB), and Random Forest (RF)—for model training. In total, 12 models were developed, with each category of elderly care facilities corresponding to four distinct models. Grid search and cross-validation were applied to optimize hyperparameters and prevent overfitting. The optimal values of Accuracy, Precision, Recall, mIoU (mean Intersection over Union), and F1-score for the 12 models are detailed in
Table 5,
Table 6 and
Table 7. Finally, the testing dataset was input into the models to generate prediction results. Based on these results, ROC curves for the four models were plotted, as shown in
Figure 5.
All four machine learning models demonstrated strong predictive performance, indicating their effectiveness in forecasting the spatial layout of elderly care facilities. Among them, XGBoost achieved the best results, highlighting its superior capability in this context.
This study compared the three XGBoost models against three alternative algorithms, LightGBM (LGB), CatBoost (CB), and Random Forest (RF), on six evaluation metrics: Accuracy, Precision, Recall, mean Intersection over Union (mIoU), F1-score, and Area Under the ROC Curve (AUC).
Accuracy, measuring the overall proportion of correct predictions, demonstrated XGBoost’s superior performance. In training for service-oriented, activity-oriented, and care-oriented elderly care facilities, XGBoost achieved accuracies of 97%, 95%, and 97%, respectively, indicating its robust capability to identify suitable locations. While CatBoost, LightGBM, and Random Forest all attained accuracies above 90%, XGBoost consistently outperformed them.
Precision, reflecting the ratio of true positives among predicted positives, is critical for minimizing resource misallocation in facility siting. For service-oriented facilities, XGBoost achieved a precision of 95%, significantly surpassing Random Forest (91%), LightGBM (91%), and CatBoost (90%). Similar superiority was observed for activity- and care-oriented facilities, underscoring XGBoost’s cautious yet accurate decision-making in site prioritization.
Recall, which quantifies the model’s ability to identify true positives, revealed XGBoost’s exceptional performance in service-oriented facility prediction (85% recall), markedly higher than Random Forest (62%), CatBoost (65%), and LightGBM (68%). This highlights XGBoost’s balanced capacity to capture suitable sites while maintaining high precision.
mIoU, evaluating spatial segmentation accuracy between suitable and unsuitable zones, further validated XGBoost’s efficacy with an mIoU of 89%, compared to 76% (Random Forest), 77% (CatBoost), and 78% (LightGBM).
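The text does not spell out how mIoU is computed for the two-class case. A minimal sketch under the usual definition (per-class intersection over union, averaged over the suitable and unsuitable classes) is given below; treating this as the study's exact formula is an assumption, and the labels are toy values.

```python
def binary_miou(y_true, y_pred):
    """Mean IoU over the two classes (1 = suitable, 0 = unsuitable)."""
    ious = []
    for cls in (0, 1):
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        union = sum(1 for t, p in zip(y_true, y_pred) if t == cls or p == cls)
        if union:  # skip a class that occurs in neither vector
            ious.append(inter / union)
    if not ious:
        raise ValueError("empty input")
    return sum(ious) / len(ious)

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0, 1, 0]
miou = binary_miou(y_true, y_pred)
```

Unlike accuracy, each misclassified grid enlarges the union of both classes, so mIoU penalizes errors more heavily.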
F1-score, the harmonic mean of precision and recall, demonstrated XGBoost’s optimal balance with an F1-score of 90%, outperforming Random Forest (74%), CatBoost (75%), and LightGBM (77%).
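As a quick consistency check on the harmonic-mean definition, the precision (95%) and recall (85%) reported above for XGBoost on service-oriented facilities can be substituted into the F1 formula. This assumes the 90% F1-score refers to the same facility type, which the text does not state explicitly.

```latex
\[
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.95 \times 0.85}{0.95 + 0.85}
    = \frac{1.615}{1.80}
    \approx 0.897 \approx 90\%
\]
```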
AUC serves as a robust metric for macroscopically evaluating the overall classification performance of a model. It is not influenced by the classification threshold. Rather than measuring absolute predictive accuracy, AUC assesses the model’s ranking ability or discriminative ability—that is, how effectively it distinguishes between positive and negative instances.
All models perform notably well in terms of AUC, with XGBoost (represented in orange) particularly standing out, achieving an AUC value exceedingly close to 1.0 (
Figure 6). This indicates that XGBoost demonstrates exceptional and reliable capability in the core task of discriminating positive instances from negative ones.
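The ranking interpretation of AUC can be made concrete with a small sketch: AUC equals the probability that a randomly chosen positive instance receives a higher score than a randomly chosen negative one, with ties counted as one half. The scores are toy values and the function name is hypothetical.

```python
def auc_by_ranking(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.1]
auc = auc_by_ranking(y_true, scores)

# Any strictly increasing rescaling of the scores leaves the ranking,
# and therefore the AUC, unchanged (no classification threshold is involved).
auc_rescaled = auc_by_ranking(y_true, [s ** 2 for s in scores])
```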
4.2. Interpretive Analysis of Site Selection Models
To make the model results directly usable for decisions on siting senior living facilities, this paper uses the SHAP (SHapley Additive exPlanations) method to explore how each feature influences the layout of the three major types of facilities. SHAP interprets a model prediction as the sum of the Shapley values of the input features, so it shows both the magnitude of each feature's influence on the prediction and whether that influence is positive or negative. It therefore has advantages over traditional feature-importance measures and is widely used by scholars in feature analysis [
37,
38,
39,
40].
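The additive reading of SHAP described above can be checked on a toy model with a brute-force Shapley computation. This is only a sketch: the scoring function, its coefficients, and the all-zero baseline are hypothetical. Practical SHAP implementations for tree models use faster algorithms and a background dataset rather than a single baseline point.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x; absent features take baseline values."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # size of the coalition that excludes feature i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                phi[i] += weight * (value(set(s) | {i}) - value(set(s)))
    return phi

def toy_score(z):
    """Hypothetical suitability score of [accessibility, road_density, landuse]."""
    accessibility, road_density, landuse = z
    return (0.2 + 0.5 * accessibility + 0.3 * road_density
            - 0.4 * landuse + 0.2 * accessibility * road_density)

x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, x, baseline)
# Additivity: baseline prediction + sum of contributions = prediction at x.
reconstructed = toy_score(baseline) + sum(phi)
```

In this toy case the interaction term is split evenly between accessibility and road_density, while landuse receives a negative contribution. The sign of a SHAP value conveys the same kind of information in the summary plots discussed below.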
4.2.1. Care-Oriented SHAP Feature Analysis
As shown in
Figure 7, the six features with the greatest influence on the predicted layout of care-oriented facilities are, in order of importance, accessibility, road_density, medical, landuse, transportation, and shopping (accessibility, road network density, medical facilities, land use, transportation facilities, and shopping facilities). For accessibility, most of the red sample points lie in the region of positive SHAP values, indicating that care facilities are more likely to be located in areas with higher accessibility. The second most relevant variable is road network density (road_density): red sample points appear in both the positive and negative SHAP regions, but high density values (red) generally fall in the SHAP-positive region, whereas average density values (tending toward purple) fall more often in the SHAP-negative region.
This result indicates that care facilities tend to be located in areas with higher road network density: the transportation convenience that comes with a denser road network gives older adults better access to care services, whereas areas with average road network density are less likely to be planned for care facilities. The third most relevant variable is medical; its red samples lie in the region of positive SHAP values, showing that medical facilities promote the layout of care facilities. The fourth variable is landuse; its red samples lie in the region of negative SHAP values, indicating that land use has a negative effect on the layout of care facilities. Areas with more intensive land use tend to have higher building densities, which may cause psychological insecurity among older adults, so care facilities are less likely to be located there. Transportation facilities appear in both the positive and negative SHAP regions, but high values (red sample points) generally fall in the positive region, indicating that better transportation facilities nearby promote the layout of care facilities. Shopping facilities are distributed more in the SHAP-negative region, indicating a negative effect on the layout of care facilities.
In order to further explore the influence of the six most important features on the prediction results of the entire care facility layout, and the interaction effect of this feature with other features, this paper plots the SHAP partial dependency diagram, as shown in
Figure 8.
Accessibility shows a nonlinear positive relationship with the layout of care facilities in
Figure 8a. Because public service facilities are concentrated in the central urban area, care-type facilities should be laid out so that residents within the service area can fully use these resources. Older adults living in peripheral areas must travel long distances to obtain care-type services, so the probability of locating a care-type facility increases with accessibility. The partial dependence plot shows a strong nonlinear increase when accessibility lies in the interval [1, 6]; beyond 6, the effect of accessibility becomes insignificant. Accessibility therefore has a marked threshold effect on the layout of care facilities. Road network density (b) is clearly positively related to the layout of care facilities: the higher the density, the more likely a care facility is to be located there. This is mainly because the transit network in Xi’an’s city center is dense and its stations are numerous and concentrated, which makes the clustering of care facilities there more pronounced. The partial dependence plot for medical facilities (c) shows a nonlinear positive correlation, with the strongest influence on the siting prediction in the interval [0, 0.2]. In addition, in grids that contain care facilities, a larger proportion of medical facilities coincides with a smaller proportion of shopping facilities, indicating a competitive relationship between these two types of elderly-related facilities.
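A threshold effect of this kind can be read off a dependence curve. The sketch below computes a classical partial dependence curve, which is related to but not the same as the SHAP dependence plots used in the study, for a hypothetical model whose response to accessibility saturates at 6. The model, its coefficients, and the sample rows are all illustrative assumptions.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction over rows with `feature` forced to each grid value."""
    curve = []
    for v in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)
            modified[feature] = v
            total += predict(modified)
        curve.append(total / len(rows))
    return curve

def toy_model(row):
    """Hypothetical score: the accessibility effect stops growing beyond 6."""
    return 0.1 + 0.12 * min(row["accessibility"], 6.0) + 0.02 * row["road_density"]

rows = [
    {"accessibility": 2.0, "road_density": 3.0},
    {"accessibility": 5.0, "road_density": 6.0},
    {"accessibility": 8.0, "road_density": 4.0},
]
grid = [1.0, 2.0, 4.0, 6.0, 8.0, 10.0]
curve = partial_dependence(toy_model, rows, "accessibility", grid)
```

The curve rises across [1, 6] and is flat afterwards, which is the pattern the text describes as a threshold effect.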
Land use and the layout of care facilities are nonlinearly and negatively correlated in
Figure 8d, which suggests that moderate land use can promote the layout of care facilities to some extent by providing comprehensive supporting facilities nearby. Areas with more intensive land use tend to have higher building densities, which may cause psychological insecurity among older adults, so the probability of locating care facilities decreases as land use increases. The partial dependence plot for transportation facilities (e) shows first a nonlinear positive and then a negative correlation: transportation facilities in the interval [0, 0.3] have a stronger positive influence on the layout of care-type facilities, whereas above 0.3 the relationship becomes negative. Too many transportation facilities imply heavy weekday traffic flow nearby, which causes congestion, may create psychological insecurity among older adults, and thus hinders their travel. In addition, because of the construction sequence of the rail network, rail density in the central area is much higher than in the periphery, and central-area stations draw fewer passengers from their hinterland than suburban stations do; transfers are therefore more convenient at central-area stations.
The partial dependence curve shows a strong nonlinear growth trend when the proportion of transportation facilities is in the interval [0, 0.3]. Shopping facilities are first negatively and then positively correlated with the layout of care facilities, in both cases nonlinearly. Shopping facilities are generally located in areas with high road network density and compete to some extent with healthcare facilities; the partial dependence curve rises steeply in the interval [0.1, 0.4], indicating that the probability of locating a care-type senior living facility increases greatly when the proportion of shopping facilities in the grid falls in this interval.
4.2.2. Activity-Oriented SHAP Feature Analysis
As shown in
Figure 9, the six most important features for predicting activity-based senior living facilities are, in order of importance, landuse, restaurant, road_density, traffic, medical, and shopping (land use, restaurant facilities, road network density, transportation facilities, medical facilities, and shopping facilities).
For land use (landuse) around activity-based senior living facilities, the red sample points lie in the region of positive SHAP values, indicating that activity-based facilities are more likely to be located in areas with more intensive land use. The second most relevant variable for predicting the siting of activity-based facilities is restaurant: the red sample points lie mainly in the region of negative SHAP values, which suggests that a high proportion of restaurants inhibits the layout of activity-based senior living facilities. For road_density, red sample points appear in both the positive and negative SHAP regions, but high density values (red) generally fall in the positive region, whereas average density values (tending toward purple) fall more often in the negative region. This suggests that activity-based senior living facilities tend to be located in areas with higher road network density, where the resulting transportation convenience gives older adults better access to activity-based services. The fourth most relevant variable is transportation facilities (traffic); its red samples lie in the region of positive SHAP values, showing that transportation facilities promote the layout of activity-based senior living facilities.
The fifth most relevant variable is medical; its red samples lie in the region of positive SHAP values, indicating that medical facilities promote the layout of activity-based facilities, i.e., a moderate number of medical facilities is generally found near activity-based facilities. The red samples of shopping appear at both positive and negative SHAP values, indicating that a moderate share of shopping facilities nearby promotes the layout of activity-based facilities, whereas a share above a certain level inhibits it.
To further explore the influence of the six most important features on the overall activity-based facility layout prediction results and the interaction effect of this feature with other features, a SHAP partial dependency plot was developed, as shown in
Figure 10.
The partial dependence plot for land use and the layout of activity-based facilities (a) shows a nonlinear positive correlation. This is mainly because the locational advantage of the central city significantly raises the degree of land use, and a higher degree of land use means greater business vitality, which makes an area more attractive for activity-based senior living facilities. When the value of land use is in the interval [0.75, 1.5], the partial dependence curve shows a strong nonlinear growth trend, indicating that land use has a marked threshold effect on the layout of activity-based facilities. Dining facilities (b) have a significant negative relationship with the layout of activity-based senior living facilities, because dining facilities tend to be located in places with high population density, and an overly high population density is unfavorable to such facilities. The plot for road network density (c) shows a complex nonlinear relationship that first increases, then decreases, and then increases again: density in the interval [0, 4] has a stronger influence on the predicted siting of activity-based facilities, whereas between 4 and 8 the relationship is clearly negative. This suggests that the convenient transportation brought by a moderate road network density promotes the layout of activity-based facilities, whereas when both diversity and road network density are high, congested traffic and higher building density pose a potential physiological and psychological threat to older adults, which greatly reduces the probability of locating an activity-based senior living facility in such a grid.
Previous studies have shown that the effect of road network density on travel is two-sided. On the one hand, a lower road network density is associated with smaller neighborhood size, which can improve accessibility for cyclists and pedestrians. On the other hand, a higher road network density brings more intersections, which raises safety problems and hinders the travel of older adults. Xi’an’s city center is an old city with a long history of urban development, rich facilities, a dense population, and a large number of living-type roads, so improving road efficiency in this area makes it easier for pedestrians to reach their destinations on foot and is more conducive to walking trips.
This makes it easier for pedestrians to walk to their destinations and thus to reach the targeted activity-based senior living facilities. The partial dependence plot for transportation facilities and the layout of activity-based facilities in
Figure 10d shows a nonlinear negative correlation; a moderate level of transportation facilities, by providing convenient transportation, can still promote the layout of activity-based elderly facilities to some extent. The plot for medical facilities (e) first shows a nonlinear positive correlation, indicating that a moderate number of medical facilities tend to be built near activity-based senior living facilities. The plot for shopping facilities (f) shows first a nonlinear negative and then a nonlinear positive correlation. Shopping facilities are generally located in areas with high road network density and compete to some extent with medical facilities; the partial dependence curve shows a strong nonlinear growth trend in the interval [0.3, 0.6].
4.2.3. Service-Oriented SHAP Feature Analysis
As shown in
Figure 11, most of the red sample points for the surrounding accessibility value (accessibility) of service-type elderly facilities lie in the region of positive SHAP values, indicating that service-type facilities are more likely to be located in areas with higher accessibility. The second most relevant variable for predicting the location of service-type facilities is land use; its red sample points lie mainly in the region of positive SHAP values, which suggests that service-type elderly facilities tend to be located in areas with more intensive land use. The third most relevant variable is diversity; its red samples lie in the region of negative SHAP values, showing that diversity has a negative effect on the layout of service-type facilities. The fourth variable is shopping; its red samples lie in the region of negative SHAP values, indicating that a high share of shopping facilities has a negative effect on the layout of service-type facilities. High values of transportation facilities (traffic) lie in the region of negative SHAP values. High values of restaurant lie more in the SHAP-positive region, indicating a positive effect on the layout of service-type senior living facilities.
In order to further explore the influence of the six most important features on the prediction results of the whole service category facility layout, and the interaction effect of this feature with other features, this paper plots the SHAP partial dependency diagram, as shown in
Figure 12.
The partial dependence plot for accessibility (a) shows a nonlinear positive correlation with the layout of service-type facilities. As with care facilities, public service facilities are concentrated in the central city of Xi’an, and service-type senior living facilities must be laid out so that residents within the service area can fully use these resources; some residents living on the periphery need to travel long distances toward the center to obtain such services, so the probability of locating a service-type facility increases with accessibility. Land use (b) shows a significantly positive relationship with the layout of service-type senior living facilities: the higher the degree of land use, the higher the probability that a service-type facility is located there. According to previous research, a certain degree of compact land use development promotes the travel of the elderly [
40]. It thereby promotes the layout of related service-type facilities for the elderly. The partial dependence plot for diversity (c) shows a complex nonlinear relationship. When diversity is in the interval [0, 0.4], it has a positive but not significant effect on service-type senior living facilities; when diversity is between 0.4 and 0.6, it has a significant inhibitory effect; and when diversity is higher than 0.6, it has a strong positive effect on their layout. In addition, service-type elderly facilities located in grids of higher diversity are surrounded by a more complete set of supporting service facilities.
The plot for shopping facilities (d) also shows a complex nonlinear relationship. When the share of shopping facilities is between 0 and 0.3, it has a strong positive influence on the layout of service facilities; above this threshold, too many shopping facilities significantly inhibit it. The plot for transportation facilities (e) shows a nonlinear negative correlation: when the share of transportation facilities is between 0 and 0.5, changes in this feature have almost no effect on the layout of service-type senior living facilities, whereas above 0.5 they have a significant inhibitory effect. This is because, in the central urban area of Xi’an, where the pedestrian road network is denser and transportation facilities are more developed, simply adding transportation stops cannot effectively improve the accessibility of service facilities and thereby promote their layout; above the threshold, additional transportation facilities may even reduce accessibility and thus the probability of locating service facilities for the elderly. The plot for food and beverage facilities (f) shows a nonlinear positive correlation: when the share of food and beverage facilities is between 0 and 0.3, changes in this feature have almost no effect on the layout of service-type senior living facilities, whereas above 0.3 they significantly promote it.
In addition to this, it can also be seen from the figure that when there are service-type senior living facilities in the grid, the share of service facilities decreases as the share of food and beverage facilities exceeds 0.3. This is because there is a certain degree of functional overlap between service senior living facilities and service facilities, so when the grid meets the above five characteristics, service facilities will be prioritized to be located in the grid where the proportion of food and beverage facilities is higher than that of service facilities.
For care-oriented facilities, accessibility within the range [1, 6] demonstrates nonlinear growth, with the SHAP value increasing by up to 0.8; values exceeding 6 lead to saturation (
Figure 12a). For activity-oriented facilities, land use intensity between [0.75, 1.5] enhances placement probability, evidenced by significant slope changes. A synergistic effect occurs between medical facility proportion (>0.2) and road density (>4/km²), boosting care-facility placement probability (interaction SHAP value: 0.6). Conversely, excessive retail density (>0.4) suppresses activity-oriented facilities (
Figure 10f). For service-oriented facilities, the interaction between accessibility and commercial diversity reveals that when diversity exceeds 0.6, accessibility effects intensify (slope in SHAP interaction plots doubles). These mechanisms are visualized through dependence plots, confirming nonlinear interdependencies among built environment factors.
4.3. Discussion
Subdividing the grids retained by the preceding prediction according to facility type yielded 241 sites for activity-based senior living facilities, 85 for care-based facilities, and 217 for service-based senior living facilities (543 in total). As can be seen from
Figure 6,
Figure 7,
Figure 8,
Figure 9 and
Figure 10a, all the streets with more serious aging have grids predicted to be suitable for the construction of senior care facilities, and the prediction results are more reasonable. From a general point of view, there are a large number of relatively dense and severely aging streets within the city walls of the central city of Xi’an, and the distribution of existing senior care facilities is relatively concentrated, while the number of planned and predicted locations is still large. The available open space within the city wall is relatively scarce, but the transportation is convenient, so it is possible to set up some activity-type senior care facilities to save land. At the same time, it is necessary to make full use of the existing public facilities in the city and build different types of senior care facilities in different areas, such as making full use of the rich educational resources to build a university for the elderly, and making full use of the superior healthcare resources to establish a number of medical care centers for the elderly, and so on. From the point of view of spatial distribution, the projected elderly facilities grid shows a general “overall concentration, localized” spatial distribution pattern. Three primary centers have been formed in the central old city (Lianhu District, Beilin District, and the new city), with secondary centers spreading around the primary centers and showing a trend of decreasing grade by grade.
As shown in Figure 13a, all the streets with a high degree of aging have grids that are predicted to be suitable for the construction of senior care facilities, so the prediction result is reasonable. In general, there are many densely populated streets with serious aging within the city walls of Xi’an’s city center; the existing senior care facilities there are relatively concentrated, and the number of predicted sites is still high. Available open space within the city walls is scarce, but transportation is convenient, so some activity-based senior care facilities can be laid out to save land. At the same time, the existing public facility resources in the city should be fully utilized to build different types of elderly facilities suited to different regional conditions, for example, by using the rich educational resources to build a university for the elderly, or by using the superior medical resources to set up a number of medical care centers for the elderly. From the perspective of spatial distribution, the predicted grid of senior care facilities generally shows a spatial distribution pattern of “overall concentration and local dispersion”. Three primary centers are formed in the central old city (Lianhu District, Beilin District, and Xincheng District), with secondary centers spreading around the primary centers and decreasing grade by grade.
Specifically, as shown in Figure 13b, service-type senior living facilities are mostly distributed within the 10 min living circle. The new service-type senior care facilities are mainly concentrated in Zhanbaigou, Changyanbao, and Xiwang Streets. The activity-based facilities on Zhanbaigou Street and Changyanbao Street in Yanta District should focus on construction quality and the improvement of internal facilities, so as to attract the elderly and ensure accessibility. Xiwang Street has a concentrated population due to industrial clustering, which determines the layout of senior care facilities in the area. This suggests that the activity-based facilities built in this area should make full use of the convenient transportation network to help the elderly overcome the effects of distance decay and access the relevant elderly services. The old urban areas within the city walls have a sufficient number of facilities, good infrastructure, and convenient transportation, which ensure the normal operation of elderly facilities; they already contain a large number of relatively well-developed service-type elderly facilities with good accessibility, so there is no need to add new service-type elderly facilities there.
As shown in Figure 13c, there are relatively more locations within the central city that are suitable for the construction of activity-based senior living facilities; they are most numerous in Electronic City Street, Sanqiao Street, and Zhangjiabu Street. Xincheng District, Beilin District, and Lianhu District, which form the core of Xi’an’s central city, have relatively small administrative areas, more concentrated land, and a relatively high proportion of elderly residents in their streets. As a result, the spatial allocation of activity-based senior living facilities can more easily cover all settlements, and the services reach a wider share of the elderly population. Therefore, both the level of supply of activity-based senior living facilities and the level of access by the elderly are relatively high in these three districts.
As shown in Figure 13d, the overall number of care-type senior living facilities is relatively small. In terms of spatial distribution, the areas with a higher density of care-type senior living facilities are mainly concentrated in Sanqiao Street and Electronic City Street, while the density is low in most areas of Yanta District and Weiyang District. In terms of the living circle level, care-type facilities are mainly concentrated between the 15 min living circle and the 30 min living circle, with a radius of about 900 m to 1800 m. Unlike activity-based and service-based senior living facilities, care-type facilities mainly serve the elderly in multiple neighboring communities within a shared living circle and therefore require sufficient building scale to achieve their service level. However, land constraints in some streets in the center of Xi’an, such as Sanqiao Street, make it impossible to provide appropriate sites for care facilities, so an appropriate increase in the number of facilities can be considered in order to maintain a certain level of supply.
The SHAP-derived thresholds for care-oriented facilities (accessibility > 6 saturates attractiveness; Figure 8a) reveal a fundamental misalignment with elderly mobility constraints. At the 500 m grid scale, calibrated to 15 min walking distance (4 km/h), this threshold corresponds to service catchment areas exceeding 3 km, which is beyond the WHO-recommended maximum walk distance for seniors with gait impairments (≤1 km for the 80+ population). This explains the spatial mismatch observed in Figure 12d: while 72% of care facilities cluster in high-accessibility zones (>6), their utilization rate by the target population (semi-disabled elderly) remains below 40% in Weiyang District (Xi’an Civil Affairs Bureau, 2023). The paradox stems from two gerontological realities: physiological decay, in that 18.3% of Xi’an’s elderly suffer mobility impairments, reducing the effective service radius to ≤800 m; and psychological barriers, in that high land use intensity (SHAP value < −0.4 when >1.5; Figure 8d) induces safety concerns, deterring facility usage even within nominal catchment areas.
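The distance comparison in this paragraph can be written out as a short calculation. The sketch below assumes that one accessibility unit spans one 500 m grid cell, which is an interpretation of the text rather than a stated definition; the two walking limits are the figures quoted above.

```python
GRID_SIZE_M = 500          # grid resolution stated in the text
SATURATION_THRESHOLD = 6   # accessibility value at which attractiveness saturates
WHO_MAX_WALK_M = 1000      # walking limit for the 80+ population quoted in the text
IMPAIRED_RADIUS_M = 800    # effective radius for mobility-impaired elderly quoted in the text


def catchment_m(accessibility_units, grid_size_m=GRID_SIZE_M):
    """Assumption: one accessibility unit corresponds to one grid cell."""
    return accessibility_units * grid_size_m


catchment = catchment_m(SATURATION_THRESHOLD)
print(catchment)                      # 3000
print(catchment / WHO_MAX_WALK_M)     # 3.0
print(catchment / IMPAIRED_RADIUS_M)  # 3.75
```

Under this assumption, the catchment implied by the saturation threshold is three times the quoted walking limit and 3.75 times the effective radius for mobility-impaired elderly.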
Our spatial predictions (Figure 13) uncover categorical inequities in service provision: care-oriented facilities concentrate in central districts (Beilin/Lianhu; 68% of total), yet 52% of demand originates from peri-urban zones (Table 3: population density = 7.01 vs. central 42.1). This violates the WHO principle of “proximity-based care for vulnerable groups”. Activity-oriented facilities exhibit suppressed distribution in retail-dense zones (>0.4 commercial ratio; SHAP < −0.8 in Figure 10f), disproportionately excluding low-income seniors reliant on community amenities. Field surveys confirm commercialized areas have 23% lower elderly visitation despite 95% spatial coverage.
The root cause lies in facility-type agnosticism in planning: treating all elderly facilities as homogeneous POIs ignores that care-oriented services require proximity, while activity-oriented spaces thrive in centrally accessible locations. Our typology-specific thresholds enable precision equity interventions: mandate care facilities within 800 m of >300 disabled elderly clusters (derived from accessibility threshold decay analysis), and incentivize mixed-use developments capping commercial density at 0.3 in aging hotspots (per SHAP inhibition threshold).
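The proximity rule proposed above (a care facility within 800 m of every cluster of more than 300 disabled elderly) can be checked with a simple distance test. This is a minimal sketch under stated assumptions: the cluster and facility records, the function name `underserved_clusters`, and the use of straight-line distance in a projected coordinate system are all hypothetical, and network walking distance would be a stricter test.

```python
from math import hypot

MAX_DISTANCE_M = 800    # proximity requirement proposed in the text
MIN_CLUSTER_SIZE = 300  # cluster size proposed in the text


def underserved_clusters(clusters, facilities):
    """Return names of qualifying clusters with no care facility within 800 m.

    clusters: list of (name, x_m, y_m, disabled_elderly_count)
    facilities: list of (x_m, y_m) in the same projected coordinate system
    """
    flagged = []
    for name, cx, cy, count in clusters:
        if count <= MIN_CLUSTER_SIZE:
            continue  # the rule applies only to clusters of more than 300
        covered = any(
            hypot(cx - fx, cy - fy) <= MAX_DISTANCE_M for fx, fy in facilities
        )
        if not covered:
            flagged.append(name)
    return flagged


# Hypothetical coordinates in metres
clusters = [("A", 0, 0, 450), ("B", 2000, 0, 320), ("C", 5000, 5000, 120)]
facilities = [(300, 400), (3500, 0)]
print(underserved_clusters(clusters, facilities))  # ['B']
```

In this toy input, cluster A is covered by a facility 500 m away, cluster C is below the size cut-off, and cluster B is flagged because its nearest facility is 1500 m away.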
This study exhibits four key limitations:
(1) Spatiotemporal Dynamics and Data Gaps:
Current models rely on static Point of Interest (POI) cross-sectional data and fail to capture the dynamic characteristics of elderly population movement trajectories, such as actual facility usage frequency and congestion effects during peak hours. Future research needs to integrate multi-source spatiotemporal data: constructing spatiotemporal behavior patterns of the elderly through mobile phone signaling data; quantifying time-varying facility accessibility by integrating public transit IC card records; and developing time–geography-based dynamic accessibility models calibrated for diurnal activity variations. As shown in Figure 13a, the existing facility distribution exhibits significant tidal patterns (e.g., usage fluctuations between day and night), validating the necessity of integrating dynamic data.
(2) Neglect of Socioeconomic Dimensions:
Current analyses focus primarily on the physical attributes of the built environment, overlooking critical socioeconomic factors. Stratification of Payment Capacity: Disparities in regional Gini coefficients (e.g., 0.38 in Beilin District vs. 0.29 in Baqiao District) lead to inequalities in facility usage; Digital Divide: Low smartphone usage among the elderly (only 42% for the 65+ age group) undermines the effectiveness of smart elderly care facilities; Cultural Preferences: A traditional preference for family-based elderly care suppresses occupancy rates in institutional facilities (surveys show a 61% resistance rate among the 70+ group). Future studies should consider constructing an Elderly Disadvantage Index (EDI), with weights (α, β, γ) determined through Principal Component Analysis (PCA), to achieve coupling optimization between facility layout and social equity.
(3) Omission of Policy Impacts:
Existing facilities are shaped by long-term operational filtering and historical policy influences. Key factors include the following. Land Cost: Floor area ratio prices in core zones impose hard constraints on facility scale; Health Insurance Policies: Reimbursement rates for nursing-type facilities (currently only covering 35%) significantly affect accessibility. A Policy Sensitivity Index (PSI) could be established to quantify the marginal effects of policy combinations on facility layout.
(4) Insufficient Consideration of Social Equity:
Prevailing optimization models inadequately assess the equity of resource allocation: the Gini coefficient for facility accessibility among low-income groups reaches 0.53 (exceeding the alert threshold of 0.4); facility coverage in ethnic minority enclaves is only 68% of the baseline value; and the spatial mismatch rate for specialized services targeting disabled elders is as high as 41%. Consequently, future frameworks should incorporate the accessibility Gini coefficient into the SHAP model feature set; develop demand-sensitive facility allocation algorithms (Figure 13d reveals a significant shortage of nursing facilities in suburban areas); and explore intergenerational shared facility models (e.g., kindergarten–nursing home complexes).
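The accessibility Gini coefficient mentioned in limitation (4) can be computed with the standard rank-based formula. The sketch below is illustrative only: the accessibility scores are hypothetical, and the function names `gini` and `exceeds_alert` are not taken from this study. The 0.4 alert level is the one quoted above.

```python
ALERT_THRESHOLD = 0.4  # alert level quoted in the text


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    data = sorted(values)
    n = len(data)
    total = sum(data)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values, with ranks starting at 1
    weighted = sum((i + 1) * v for i, v in enumerate(data))
    return (2 * weighted) / (n * total) - (n + 1) / n


def exceeds_alert(values, threshold=ALERT_THRESHOLD):
    """True when inequality in the given scores is above the alert level."""
    return gini(values) > threshold


# Hypothetical accessibility scores for five residential grids
scores = [0.2, 0.5, 1.0, 3.0, 6.0]
print(round(gini(scores), 3))
print(exceeds_alert(scores))
```

A value computed per income group or per sub-district in this way could then be appended to the feature set as one additional column, as the limitation suggests.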
5. Conclusions
This paper carries out spatial planning for the living circle of Xi’an’s urban community elderly service facilities on the basis of the theory of living circle construction. In order to solve the problem of matching the demand side and the supply side, the supply environment and built environment are integrated into the indicator system to provide comprehensive data for scientific and fair site selection, which integrates the different functional needs of the elderly, the variety of community types, and the content of service facilities. The nonlinear analysis of the above key built environment factors and elderly-related facilities reveals their influence on the layout of the three types of elderly facilities, which will contribute to the scientific and fair layout of elderly service facilities, improve the convenience of life for the elderly, and further improve the construction system of the living circle for the elderly in urban communities. The main conclusions are as follows:
(1) The spatial distribution of the three types of elderly facilities in Xi’an decreases from the city center to the peripheral areas. The XGBoost model performs better than Random Forest, CatBoost, and LightGBM in modeling the layout of elderly facilities in Xi’an. Compared with a model that considers only the nine types of associated facilities, the model that also includes built environment factors has better explanatory power, which indicates that it is necessary to consider built environment factors when exploring the characteristics that affect the layout of senior living facilities. In addition, the threshold effects of the influencing factors and the corresponding nonlinear characteristics differ across types of senior living facilities, indicating that the three types need to be categorized and discussed separately.
(2) The influences of built environment factors and the nine categories of elderly-related facilities on the layout of service-type, activity-type, and care-type elderly facilities show both differences and similarities. In particular, accessibility has an abrupt threshold effect on the layout of care-type elderly facilities: when accessibility is between 1 and 6, it promotes the layout of care facilities, and when it exceeds this threshold, the attractiveness of care facilities reaches saturation. Accessibility is therefore a priority factor in the layout of care facilities. The medical and shopping facilities in the surrounding area and the land use layout also play an important role. At a scale of 500 m, residential high-density areas in Xi’an have relatively high travel volumes during the weekday evening peak, as do shopping high-density areas during the morning peak. Abundant facilities can promote the mobility of the elderly. Planners need to scientifically plan the layout of service facilities at all levels, create a multi-functional and compact service facility system, and optimize the combination of the transportation network, cycling infrastructure, and service facility elements at all levels.
(3) The impact of built environment variables on the layout of different types of senior living facilities is spatially heterogeneous, which can provide decision support for the layout of different types of senior living facilities through the development of sub-district-level intervention strategies. For example, the effect of the POI ratio of shopping facilities first increases and then decreases for all three types of elderly facilities, indicating that priority can be given to the layout of shopping facilities in the station areas of the central city.
Based on the research findings, this study proposes the following optimization recommendations for the layout of elderly care facilities in Xi’an:
In central urban districts such as Lianhu and Beilin, priority should be given to adopting a stock optimization strategy by renovating existing buildings to integrate composite elderly care facilities. Land use intensity must be controlled below 1.5 to maximize efficiency through a “medical-care-commercial integration” model, maintaining medical facility ratios within the 0.2–0.3 range and commercial density at 0.3–0.4, while adding barrier-free passageways within a 500 m radius of subway stations (accessibility threshold zone: 5–6) to alleviate service pressure in high road-density areas (>6, where attractiveness of care facilities saturates).
Peri-urban areas should focus on bridging the gap in activity-oriented facilities (current coverage: 41%). New senior activity centers should be established in areas with a road density of 4/km², complemented by supporting medical facilities at ≥0.2 ratio, while strictly controlling commercial density below 0.4 to avoid inhibitory effects. Through floor area ratio incentives, new residential developments should allocate activity spaces at 0.5 m² per elderly resident, with particular emphasis on optimizing land use intensity within the optimal [0.75, 1.5] interval to enhance facility attractiveness.
In suburban regions (e.g., Weiyang District), construct a three-tier care network: establish comprehensive nursing homes at sub-district centers (accessibility ≥ 4), community-based day care centers (medical ratio ≥ 0.15) in residential areas, and emphasize activating commercial diversity (service facility placement probability increases 2.3× when >0.6). Mandate 10% of space in new commercial complexes for elderly care functions. Cross-district coordination requires establishing a dynamic monitoring platform to trigger automatic alerts when road density exceeds 8 or medical ratios fall below 0.15, reserving strategic land in accessibility zones [4, 6]. Ultimately, form a gradient system characterized by “central upgrading, peri-urban expansion, and suburban coverage”.
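The alert rules for the proposed monitoring platform can be expressed as simple threshold checks. The sketch below is hypothetical: the record layout, key names, and function names are assumptions for illustration, and only the numeric thresholds (road density above 8, medical ratio below 0.15, accessibility zone [4, 6]) come from the recommendation above.

```python
ROAD_DENSITY_MAX = 8.0     # alert when road density exceeds this value
MEDICAL_RATIO_MIN = 0.15   # alert when the medical facility ratio falls below this value
RESERVE_BAND = (4.0, 6.0)  # accessibility zone in which land is to be reserved


def check_alerts(zone):
    """Return alert messages for one zone record (a dict with hypothetical keys)."""
    alerts = []
    if zone["road_density"] > ROAD_DENSITY_MAX:
        alerts.append("road density above 8")
    if zone["medical_ratio"] < MEDICAL_RATIO_MIN:
        alerts.append("medical ratio below 0.15")
    return alerts


def in_reserve_band(accessibility, band=RESERVE_BAND):
    """True when accessibility lies inside the closed interval [4, 6]."""
    low, high = band
    return low <= accessibility <= high


# Hypothetical zone records
zones = [
    {"name": "Z1", "road_density": 9.2, "medical_ratio": 0.22, "accessibility": 5.0},
    {"name": "Z2", "road_density": 6.5, "medical_ratio": 0.10, "accessibility": 3.0},
]
for zone in zones:
    print(zone["name"], check_alerts(zone), in_reserve_band(zone["accessibility"]))
```

In practice, such checks would run on periodically refreshed sub-district indicators, so that alerts track changes in the built environment rather than a single snapshot.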