Estimation of Urban Above-Ground Vegetation Carbon Density and Analysis of Topography-Modulated Spectral Responses in Shenzhen, China

Qie, Guangping; Wang, Minzi; Wang, Guangxing

doi:10.3390/rs18050807

by Guangping Qie ¹,²,³, Minzi Wang ⁴ and Guangxing Wang ³,*

¹ School of Business Administration, Moutai Institute, Renhuai 551801, China
² Department of Geography and Environmental Resources, Southern Illinois University at Carbondale, Carbondale, IL 62901, USA
³ School of Earth Systems and Sustainability, Southern Illinois University at Carbondale, Carbondale, IL 62901, USA
⁴ Department of Resource and Environment, Moutai Institute, Renhuai 551801, China
* Author to whom correspondence should be addressed.

Remote Sens. 2026, 18(5), 807; https://doi.org/10.3390/rs18050807

Submission received: 4 February 2026 / Revised: 4 March 2026 / Accepted: 4 March 2026 / Published: 6 March 2026

Highlights

What are the main findings?

We developed an explainable framework that uses remote sensing and machine learning to estimate urban above-ground vegetation carbon density.
This framework combines Landsat 8 spectral features, terrain variables, and spatial block cross-validation. The results show that urban vegetation carbon density is influenced by complex, topography-driven spectral responses, which we identified using SHAP-based interaction analysis.

What is the implication of the main finding?

The approach offers a strong and understandable method for mapping urban vegetation carbon in diverse and complex cities.
Identifying topography-driven spectral responses creates a useful model for increasing the reliability and ecological understanding of machine-learning carbon assessments in urban remote sensing.

Abstract

Accurately estimating urban above-ground vegetation carbon density (UAGVCD) is crucial for assessing urban carbon sinks, but it is difficult due to varying spatial patterns, complex land covers, and differences caused by terrain. This study estimates UAGVCD in Shenzhen, China, using an explainable remote sensing and machine-learning approach. We combined Landsat 8 spectral bands, vegetation indices, texture metrics, and terrain-based variables with 195 field measurements of carbon density to develop an Extreme Gradient Boosting (XGBoost) model. We evaluated model performance with spatial block cross-validation, using block sizes of 2 km, 5 km, and 10 km to account for spatial autocorrelation. The results show that the XGBoost model performed reliably during spatially independent validation, with the 5 km block showing the best accuracy (training R² = 0.917 ± 0.086, RMSE = 5.53 ± 3.97 Mg ha⁻¹; validation R² = 0.617 ± 0.055, RMSE = 10.25 ± 1.39 Mg ha⁻¹). Smaller blocks gave more varied results, while larger blocks led to a significant drop in accuracy (validation R² = 0.380 ± 0.297 at 10 km). Predictions showed clear spatial differences in UAGVCD, with higher values in mountainous and green areas and lower values in highly developed regions. SHapley Additive exPlanations (SHAP) analyses suggested that both spectral and topographic factors play a significant role in UAGVCD. Additionally, the relationships between spectral data and carbon density showed strong nonlinear responses affected by terrain. These findings highlight the importance of spatially explicit validation and explainable machine learning for reliable urban vegetation carbon mapping.

Keywords: vegetation carbon density; XGBoost; machine learning; explainable artificial intelligence; topography; remote sensing

1. Introduction

Urban vegetation plays a critical role in mitigating climate change by sequestering atmospheric carbon dioxide and regulating urban microclimates [1,2]. As cities continue to expand globally, accurately quantifying urban above-ground vegetation carbon density (UAGVCD) has become increasingly important for assessing urban carbon sinks [3], supporting climate-resilient urban planning [4], and informing ecosystem-based management strategies. Unlike natural or rural ecosystems, urban landscapes are characterized by pronounced spatial heterogeneity, complex land-cover mosaics, and strong anthropogenic disturbances, which together pose significant challenges to the reliable estimation of vegetation carbon stocks [3].

In recent years, increasing attention has been paid to the remote sensing–based estimation of vegetation carbon density at the urban scale, particularly for urban trees and heterogeneous green spaces. Several studies have demonstrated the feasibility of mapping urban tree and vegetation carbon stocks using remote sensing data and allometric relationships, highlighting the distinct methodological requirements of urban environments [5,6]. However, compared with extensive research on forest ecosystems, systematic investigations of vegetation carbon density estimation in urban contexts remain relatively limited.

Remote sensing provides an efficient and scalable means for monitoring vegetation carbon over large spatial extents [7,8,9]. Multispectral satellite data, particularly from the Landsat series, have been widely used to estimate above-ground vegetation biomass and carbon density due to their long temporal coverage, moderate spatial resolution, and consistent radiometric quality [9,10,11]. Numerous studies have demonstrated the potential of Landsat-derived spectral bands, vegetation indices, and texture features for estimating aboveground biomass and carbon stocks across diverse ecosystems [3,12,13,14]. Nevertheless, most existing applications primarily focus on natural forests or regional-scale ecosystems, while urban-scale vegetation carbon density estimation remains comparatively underexplored, despite its importance for urban carbon accounting and climate mitigation. In recent years, machine learning algorithms have further enhanced the capacity of remote sensing to model complex, nonlinear relationships between spectral information and above-ground vegetation carbon density [3,15,16].

Among machine learning approaches, ensemble tree-based models such as Random Forest and Extreme Gradient Boosting (XGBoost) have shown superior performance in vegetation parameter estimation [17,18,19]. XGBoost, in particular, has gained increasing attention due to its strong predictive power, robustness to multicollinearity, and flexibility in handling high-dimensional feature spaces [20]. Previous studies have successfully applied XGBoost to estimate forest biomass and carbon density at regional and national scales, often outperforming traditional regression-based models [21,22,23]. For example, Guangping Qie and colleagues demonstrated that machine learning models integrating spectral and environmental variables substantially improved forest biomass estimation accuracy [3]. Similarly, Hua Sun et al. highlighted the effectiveness of gradient boosting algorithms in capturing nonlinear spectral–biomass relationships [24], while Guangxing Wang and co-authors emphasized the importance of incorporating ancillary variables to improve carbon mapping in heterogeneous landscapes [25]. Despite these advances, a comprehensive synthesis of modeling approaches specifically targeting urban vegetation carbon density is still lacking. Existing studies often transfer methods developed for forest ecosystems directly to urban environments without fully accounting for urban-specific characteristics, such as fine-scale vegetation fragmentation and strong spectral mixing. Urban vegetation is typically fragmented into small patches such as parks, street trees, and residential green spaces, which are often mixed spectrally with impervious surfaces [3]. This spectral mixing, combined with complex illumination conditions caused by urban morphology and topographic variation, can lead to large uncertainties in carbon estimation. 
Moreover, many existing studies rely on random cross-validation strategies that ignore spatial autocorrelation, potentially resulting in overly optimistic model performance and limited spatial generalization [26,27,28,29,30].

Topography is a fundamental environmental factor influencing vegetation growth, microclimate, and spectral reflectance [31]. Elevation, slope, and aspect affect solar radiation, soil moisture distribution, and thermal regimes, which in turn regulate vegetation structure and carbon accumulation. Recent studies in subtropical urban agglomerations have shown that topography and climate jointly influence spatial patterns of vegetation carbon, suggesting that terrain-related effects cannot be neglected even in highly urbanized regions [32]. In mountainous and hilly regions, topographic effects can strongly modulate spectral responses captured by optical sensors, leading to spatially variable relationships between vegetation indices and carbon density. Although topographic variables have been increasingly incorporated as predictors in biomass estimation models, their interactive effects with spectral features are rarely examined explicitly, particularly in urban contexts. As a result, the mechanisms through which topography modulates spectral–carbon relationships remain poorly understood.

Recent developments in explainable machine learning provide new opportunities to address this gap. SHapley Additive exPlanations (SHAP) offer a theoretically grounded framework for interpreting complex machine learning models by quantifying the contribution of each predictor to individual predictions [33]. Beyond main effects, SHAP interaction values enable the identification of pairwise interactions between predictors, revealing how environmental factors such as topography can condition or modulate the influence of spectral variables on model outputs. This capability is particularly valuable for urban carbon studies, where understanding the drivers of spatial heterogeneity is as important as achieving high predictive accuracy.

In addition to interpretability, ensuring the spatial robustness of machine learning models is essential for reliable urban carbon mapping. Spatial block cross-validation has been increasingly advocated as an effective strategy to mitigate the effects of spatial autocorrelation by separating training and validation samples in geographic space rather than randomly [34]. By evaluating model performance under spatially independent conditions, spatial block cross-validation provides a more realistic assessment of model generalization ability, which is critical for spatial prediction using remote sensing data.

Against this background, the present study aims to estimate UAGVCD in Shenzhen, China, using Landsat 8 imagery and an explainable machine learning framework. Shenzhen is one of the most rapidly urbanizing megacities in China and is characterized by complex terrain, ranging from coastal plains to hilly and mountainous areas. This combination of intense urbanization and pronounced topographic variation makes Shenzhen an ideal case for investigating topography-modulated spectral responses of UAGVCD.

Specifically, this study integrates multispectral bands, vegetation indices, texture measures, and topographic variables derived from a digital elevation model to construct an XGBoost-based carbon density estimation model. A total of 195 field plots are used for model training and validation. To ensure spatial robustness, spatial block cross-validation is employed to evaluate model performance. Furthermore, SHAP values and SHAP interaction analysis are applied to interpret model behavior and to explicitly reveal how topographic factors modulate the spectral response of UAGVCD.

The objectives of this study are threefold: (1) to assess the capability of Landsat 8 data combined with XGBoost to estimate UAGVCD in a complex urban environment; (2) to evaluate the spatial robustness of the model using spatial block cross-validation; and (3) to elucidate the topography-modulated spectral responses of UAGVCD through SHAP-based explainable machine learning. By linking predictive performance with mechanistic interpretation, this study provides new insights into the complexity of urban vegetation carbon dynamics and contributes to more reliable remote sensing-based carbon assessment in rapidly urbanizing regions.

2. Materials and Methods

2.1. Study Area

Shenzhen is located in southern China, along the eastern coast of the Pearl River Estuary (22°27′–22°52′N, 113°46′–114°37′E), and borders Hong Kong to the south (Figure 1). As one of the most rapidly urbanized megacities in China, Shenzhen has experienced dramatic land-use transformation over the past four decades, evolving from a small coastal town into a highly urbanized metropolitan area. The city covers approximately 1997 km² and represents a typical example of a subtropical coastal city characterized by intense urban development combined with pronounced topographic heterogeneity. Shenzhen has a subtropical climate that supports year-round vegetation growth, providing favorable conditions for the development of diverse urban vegetation types.

Topographically, Shenzhen exhibits a complex terrain gradient from coastal plains in the southwest to mountainous and hilly landscapes in the northeast. Elevation ranges from sea level to approximately 944 m at the peak of Wutong Mountain, the highest point in the city. The central and eastern parts of Shenzhen are dominated by low- to mid-elevation hills and mountains, with slopes and aspects varying considerably over short distances. This rugged terrain introduces strong spatial heterogeneity in solar radiation, soil moisture, and microclimatic conditions, which in turn influence vegetation growth patterns and carbon accumulation. Moreover, complex topography leads to terrain-induced illumination differences and mountain shadows in optical satellite imagery, complicating the relationship between spectral reflectance and above-ground vegetation carbon density.

Shenzhen is also characterized by exceptionally high urbanization intensity. Built-up land occupies a substantial proportion of the city, particularly in the western and southern regions, where dense residential, commercial, and industrial areas dominate. Urban expansion has produced a highly fragmented landscape, in which vegetation is interspersed with impervious surfaces at fine spatial scales. Urban vegetation in Shenzhen includes a wide range of types, such as evergreen broadleaf forests in mountainous areas, secondary forests and shrub lands on hillslopes, urban parks, roadside trees, residential green spaces, and landscaped vegetation. Evergreen tree species are prevalent due to the subtropical climate, resulting in relatively stable canopy cover throughout the year.

The coexistence of intensive urban development, diverse vegetation types, and complex terrain makes Shenzhen a particularly challenging yet representative study area for UAGVCD estimation. Spectral signals recorded by satellite sensors are strongly influenced by both urban background materials and topographic conditions, resulting in nonlinear and spatially variable relationships between vegetation indices and carbon density. These characteristics render Shenzhen an ideal natural laboratory for investigating topography-modulated spectral responses and for evaluating the performance of explainable machine learning models in urban carbon estimation.

2.2. Field Data and Above-Ground Vegetation Carbon Density Calculation

Field survey data were collected during annual field campaigns conducted between August and December of each year from 2014 to 2016. Repeating the surveys in the same seasonal window across years ensured adequate spatial coverage and representation of diverse vegetation types, and this window corresponds to the availability of cloud-free Landsat 8 OLI imagery acquired during the 2016 growing season. In total, 195 field plots were measured across the study area. Plot locations were recorded using a Global Positioning System (GPS) with a positional accuracy of approximately ±5 m, ensuring reliable spatial alignment with remotely sensed data. Although the number of field plots is limited by the logistical constraints of urban surveys, the sample size is adequate for model development when combined with spatially explicit validation and multi-source remote sensing predictors.

Because field measurements were collected over different years, a temporal normalization procedure was applied to minimize potential inconsistencies between field observations and remotely sensed data. Specifically, vegetation carbon densities measured in 2014 and 2015 were adjusted to a common reference year (2016) using species-specific growth rates, thereby enabling direct comparison with Landsat imagery acquired in 2016. The remote sensing data used for model development consisted of Landsat 8 OLI/TIRS imagery acquired on 18 September 2016 (Path/Row: 122/044) and 27 September 2016 (Path/Row: 121/044), which represent cloud-free scenes during the 2016 growing season over the study area. Growth rates for major tree species were obtained from regional forestry inventories and published growth models for subtropical forests in Guangdong Province, accounting for differences in species composition and growth dynamics. This temporal harmonization ensured that all plot-level carbon density estimates represented a consistent vegetation state corresponding to the 2016 growing season.

For a plot-level carbon density measured in year y (C_y), the adjustment to the 2016 reference year was implemented using a growth-factor approach:

$$C_{2016} = C_{y} \times (1 + r_{s})^{(2016 - y)},$$

where r_s represents the species-specific (or species-group) annual relative growth rate. This growth-factor-based temporal adjustment follows widely used carbon stock change and increment methodologies in forest carbon accounting [35,36].
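The growth-factor adjustment can be illustrated with a minimal Python sketch. This is not the authors' implementation (their analysis was carried out in R); the function name and the example values for the carbon density and growth rate are hypothetical and chosen only to show the arithmetic.

```python
def adjust_to_reference_year(carbon_density, survey_year, growth_rate,
                             reference_year=2016):
    """Project a plot-level carbon density to the reference year.

    Implements C_ref = C_y * (1 + r_s) ** (reference_year - y), where
    growth_rate is the annual relative growth rate r_s (e.g. 0.03 for 3%).
    """
    if growth_rate <= -1.0:
        raise ValueError("growth_rate must be greater than -1")
    years_elapsed = reference_year - survey_year
    return carbon_density * (1.0 + growth_rate) ** years_elapsed


# Hypothetical example: a plot surveyed in 2014 with 20 Mg/ha and r_s = 3%.
c_2016 = adjust_to_reference_year(20.0, 2014, 0.03)
```

A plot surveyed in 2016 has an exponent of zero and is therefore left unchanged, which is the behavior the formula implies for the reference year itself.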

The field sampling design followed the specifications of the National Forest Inventory (NFI) of China. Within each sample plot, vegetation structure was systematically surveyed across tree, shrub, and grass layers. For the tree layer, all trees within each plot were measured, and key attributes including species, diameter at breast height (DBH), and total height were recorded. For the shrub and grass layers, nested subplots of 2 m × 2 m were established, within which shrub coverage, basal diameter, height, individual density, and species composition, as well as grass coverage, height, and species, were measured (Figure 2).

To support representative sampling across heterogeneous urban landscapes, Pleiades 1A and 1B high-resolution imagery was first employed to visually interpret and classify the study area into five land use/land cover (LULC) types, including forests, grasslands, built-up areas, bare lands, and water bodies. Based on this classification, a stratified random sampling strategy was implemented, with plot numbers allocated proportionally to the areal extent of each LULC type and plot locations randomly distributed within each stratum. This design ensured adequate representation of vegetation conditions across both urban and peri-urban environments.

Above-ground biomass and carbon density were estimated separately for trees, shrubs, and grass, and subsequently aggregated at the plot level. For each plot, total above-ground vegetation carbon was calculated as the sum of carbon stocks from the three vegetation components, and carbon density (Mg·ha⁻¹) was derived by normalizing total carbon stock by plot area. Tree biomass estimation was based on species-specific tree volume equations developed for Guangdong Province, incorporating both DBH and tree height measurements (Table A1). Estimated tree volume per hectare (M) was converted to biomass using the biomass expansion factor (BEF) approach:

$$\mathrm{BEF} = a + \frac{b}{M} \tag{1}$$

$$B = \mathrm{BEF} \times M \tag{2}$$

where BEF represents the biomass expansion factor, a and b are species-specific coefficients derived from the Biomass and Volume Relationship Parameter Values Table (Table A2), M denotes volume stock per hectare, and B is above-ground biomass. Biomass was subsequently converted to carbon stock using species-specific carbon conversion coefficients obtained from the Carbon Ratio Table for Major Tree Species in China (Table A3), following the guidelines of the National Continuous Forest Carbon Stock Monitoring and Evaluation System of China. Forest carbon density was finally calculated by dividing total tree carbon stock by plot area.
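The chain from volume to biomass to carbon described by Equations (1) and (2) can be sketched as follows. This is an illustrative Python sketch, not the authors' code; the function names are hypothetical, and the coefficient values in the example are placeholders rather than entries from Tables A2 or A3.

```python
def biomass_from_volume(volume_per_ha, a, b):
    """Above-ground biomass from stand volume via the BEF approach.

    Eq. (1): BEF = a + b / M
    Eq. (2): B   = BEF * M
    """
    if volume_per_ha <= 0:
        # Assumption of this sketch: plots without tree volume are handled
        # elsewhere, because Eq. (1) is undefined at M = 0.
        raise ValueError("volume_per_ha must be positive")
    bef = a + b / volume_per_ha
    return bef * volume_per_ha


def carbon_from_biomass(biomass, carbon_fraction):
    """Convert biomass to carbon with a species-specific carbon ratio."""
    if not 0.0 < carbon_fraction <= 1.0:
        raise ValueError("carbon_fraction must lie in (0, 1]")
    return biomass * carbon_fraction


# Placeholder coefficients for illustration only (not taken from Table A2/A3).
biomass = biomass_from_volume(volume_per_ha=100.0, a=0.5, b=10.0)
carbon = carbon_from_biomass(biomass, carbon_fraction=0.5)
```

Note that substituting Equation (1) into Equation (2) gives B = a × M + b, so for a fixed species the BEF approach amounts to a linear relationship between volume and biomass.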

Due to the limited availability of biomass models for shrubs and grass, empirical equations proposed by Fan (2011) [37] were adopted to estimate biomass based on vegetation height:

$$\text{Shrub biomass} = 0.0398 \times h_{1} - 0.3326 \tag{3}$$

$$\text{Grass biomass} = 0.0175 \times h_{2} - 0.2888 \tag{4}$$

where h₁ and h₂ represent the mean heights (m) of shrubs and grass, respectively. Estimated biomass values were converted to carbon stocks using standard carbon conversion factors. Shrub and grass carbon stocks were then added to tree carbon stocks to obtain total above-ground vegetation carbon density for each plot, representing vegetation conditions consistent with the 2016 growing season.

Descriptive statistics of the carbon density values were calculated to summarize their overall distribution characteristics. As shown in Table 1, the dataset consists of 195 plots, with carbon density ranging from 0.00 to 100.67 Mg ha⁻¹ and an average value of 21.23 Mg ha⁻¹. The relatively large standard deviation (22.96 Mg ha⁻¹) and high coefficient of variation (108.16%) indicate substantial spatial heterogeneity of carbon density across the study area.
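As a quick check on how these summary statistics relate to one another, the coefficient of variation is the standard deviation expressed as a percentage of the mean. Using the rounded values quoted above (with s the standard deviation and x̄ the mean, symbols introduced here for illustration):

```latex
\mathrm{CV} = \frac{s}{\bar{x}} \times 100\%
            = \frac{22.96}{21.23} \times 100\%
            \approx 108.1\%
```

This agrees with the reported 108.16% to within the rounding of the inputs, and a value above 100% means that the spread of plot-level carbon density exceeds its mean.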

2.3. Remote Sensing Data

2.3.1. Pleiades-1A/1B Imagery and Land Use/Land Cover Classification

Very high spatial resolution satellite imagery from the Pleiades-1A and Pleiades-1B sensors was employed to derive detailed land use/land cover (LULC) information for the study area. Pleiades imagery provides multispectral data with a spatial resolution of 2 m and panchromatic data at 0.5 m resolution, enabling accurate discrimination of fine-scale urban land-cover types. The high spatial detail offered by Pleiades imagery is particularly suitable for urban environments, where vegetation is highly fragmented and interspersed with built-up surfaces.

Pleiades images covering the entire study area were acquired during the same growing season as the field survey to ensure consistency between land-cover information and ground observations. Standard preprocessing procedures, including radiometric calibration, atmospheric correction, and geometric correction, were applied to the Pleiades imagery. All images were orthorectified to a common coordinate system to ensure spatial consistency with other datasets used in this study. Based on the preprocessed Pleiades imagery, a high-accuracy LULC classification was performed to distinguish major urban land-cover categories, including forest, shrubland, grassland, impervious surfaces, water bodies, and bare land. The classification was conducted using a supervised machine learning approach, supported by extensive visual interpretation and reference data. Owing to the fine spatial resolution of the Pleiades data, the resulting LULC map effectively captured small vegetation patches such as urban parks, roadside trees, and residential green spaces that are often missed or misclassified in moderate-resolution imagery.

To support subsequent model training and validation, the high-resolution LULC map was used as a basis for stratified random sampling. Sample plots were randomly selected within each vegetation-related land-cover class to ensure representative coverage of different vegetation types and urban contexts. This stratified sampling strategy reduced sampling bias caused by uneven land-cover distribution and ensured that the field plots adequately reflected the diversity of urban vegetation conditions across Shenzhen.

2.3.2. Landsat 8 OLI Imagery

Landsat 8 Operational Land Imager (OLI) data were used as the primary source of spectral information for UAGVCD estimation. Landsat 8 provides multispectral observations with a spatial resolution of 30 m and a revisit cycle of 16 days, making it well suited for regional-scale vegetation monitoring. Compared with very high-resolution imagery, Landsat data offer a favorable balance between spatial coverage, spectral richness, and long-term data availability, and have been widely applied in biomass and carbon estimation studies.

To minimize phenological and seasonal inconsistencies, Landsat 8 images were selected to closely match the timing of the field survey and the acquisition period of the Pleiades imagery. Only cloud-free or minimally cloud-contaminated scenes acquired during the peak growing season were used. All Landsat images underwent standard preprocessing, including radiometric calibration and atmospheric correction, to convert digital numbers to surface reflectance. Cloud and cloud-shadow pixels were identified and masked using the quality assessment band and visual inspection.

From the preprocessed Landsat 8 imagery, spectral bands, vegetation indices, and texture features relevant to vegetation structure and biomass were derived. These variables served as the primary predictors in the machine learning model. The Landsat data were spatially aligned with the Pleiades-derived LULC map and field plot locations to ensure pixel-level correspondence between spectral information and reference carbon density measurements.

2.3.3. Temporal Consistency and Data Integration

Ensuring temporal consistency among remote sensing data and field observations is critical for reliable vegetation carbon estimation. In this study, the acquisition dates of the Pleiades imagery, Landsat 8 imagery, and field measurements were carefully coordinated to fall within the same growing season. This approach minimized discrepancies caused by seasonal vegetation dynamics and ensured that spectral signals captured by the satellite sensors accurately represented the vegetation conditions observed in the field.

All datasets were resampled or aggregated as necessary to a common spatial framework centered on the Landsat pixel scale. Field plots were spatially linked to Landsat pixels using geographic coordinates, and plot-level carbon density measurements were associated with corresponding Landsat-derived predictors. The integration of high-resolution LULC information from Pleiades imagery with moderate-resolution Landsat data enabled more reliable sample stratification and model calibration, while preserving the spatial coverage required for city-scale carbon mapping.
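Linking a plot to its Landsat pixel reduces to converting map coordinates into a row and column index on the raster grid. The sketch below assumes a north-up grid in a projected coordinate system with 30 m pixels; the function name, origin, and coordinates are hypothetical and are not taken from the study.

```python
import math


def pixel_index(x, y, origin_x, origin_y, pixel_size=30.0):
    """Return (row, col) of the pixel containing map coordinate (x, y).

    origin_x, origin_y are the map coordinates of the raster's upper-left
    corner. Columns increase eastward and rows increase southward, as in a
    north-up raster.
    """
    col = math.floor((x - origin_x) / pixel_size)
    row = math.floor((origin_y - y) / pixel_size)
    return row, col


# Hypothetical plot located 45 m east and 95 m south of the raster origin.
row, col = pixel_index(800045.0, 2499905.0, 800000.0, 2500000.0)
```

Because the stated GPS accuracy (about ±5 m) is small relative to a 30 m pixel, most plots fall unambiguously inside one pixel, although plots close to a pixel edge could be assigned to a neighboring cell.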

2.4. Topographic Variables

Topographic conditions exert a fundamental influence on vegetation growth, microclimate, and spectral responses captured by optical remote sensing sensors. In urban environments characterized by complex terrain, such effects can substantially modulate the relationship between spectral features and above-ground vegetation carbon density. To explicitly account for terrain-related influences, a set of topographic variables derived from a digital elevation model (DEM) was incorporated into the modeling framework.

2.4.1. Digital Elevation Model

The DEM used in this study was obtained from the Geospatial Data Cloud (https://www.gscloud.cn (accessed on 5 June 2025)), with a spatial resolution of 30 m, consistent with the spatial resolution of the Landsat 8 imagery. The DEM was resampled and co-registered to the Landsat grid to ensure pixel-level correspondence among spectral, topographic, and field-derived variables. Terrain derivatives, including slope, aspect, eastness, and northness, were calculated using ArcGIS software (ArcGIS 10.8). Elevation represents a primary terrain attribute that controls temperature gradients, moisture availability, and vegetation distribution, and thus provides essential contextual information for interpreting spatial variability in above-ground vegetation carbon density.

2.4.2. Terrain Derivatives: Slope and Aspect

Slope and aspect were derived from the DEM using standard terrain analysis algorithms. Slope describes the steepness of the terrain surface and is closely related to soil depth, drainage conditions, and erosion processes, all of which can affect vegetation structure and biomass accumulation. Steep slopes often limit vegetation growth due to shallow soils and increased runoff, whereas gentle slopes tend to support higher above-ground biomass under similar climatic conditions. Aspect represents the orientation of terrain relative to incoming solar radiation and is a key factor governing illumination conditions, evapotranspiration, and microclimatic variability. In subtropical regions such as Shenzhen, aspect strongly influences vegetation productivity by controlling the amount and timing of solar energy received by the canopy. However, aspect is a circular variable and may introduce discontinuities in statistical modeling when treated directly.

2.4.3. Eastness and Northness

To overcome the limitations associated with circular aspect values, eastness and northness were derived as continuous transformations of aspect. Eastness and northness were calculated as the sine and cosine of aspect, respectively, thereby representing the east–west and north–south components of terrain orientation. These transformations allow aspect-related effects to be incorporated into machine learning models in a continuous and numerically stable manner. Eastness reflects the degree to which terrain surfaces are oriented toward the east or west, which influences morning versus afternoon solar exposure. Northness indicates the degree of north–south orientation, controlling overall solar radiation intensity and thermal conditions. Together, eastness and northness provide a more ecologically meaningful and model-friendly representation of terrain orientation effects on vegetation growth and spectral reflectance than raw aspect values.
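The sine and cosine transformation can be sketched in a few lines of Python. The sketch assumes aspect is given in degrees clockwise from north and that flat cells carry a negative aspect value (ArcGIS uses -1); mapping flat cells to (0, 0) is an assumption of this example, not a step reported in the study.

```python
import math


def eastness_northness(aspect_deg):
    """Convert aspect (degrees clockwise from north) to (eastness, northness).

    eastness  = sin(aspect): +1 for east-facing, -1 for west-facing slopes
    northness = cos(aspect): +1 for north-facing, -1 for south-facing slopes
    """
    if aspect_deg < 0:
        # Assumption: negative aspect marks flat terrain with no orientation.
        return 0.0, 0.0
    aspect_rad = math.radians(aspect_deg)
    return math.sin(aspect_rad), math.cos(aspect_rad)


east_facing = eastness_northness(90.0)    # approximately (1, 0)
south_facing = eastness_northness(180.0)  # approximately (0, -1)
```

The transformation removes the discontinuity at north: aspects of 1° and 359° differ by 358 as raw numbers, but their eastness and northness values are nearly identical.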

2.4.4. Ecological Meaning of Terrain-Derived Variables

The inclusion of elevation, slope, and the aspect-derived variables eastness and northness enables a more explicit representation of terrain-driven ecological processes that influence above-ground vegetation carbon density. These topographic variables jointly regulate solar radiation, temperature, moisture availability, and illumination geometry, which in turn affect vegetation structure, productivity, and carbon accumulation. Moreover, terrain-induced variations in illumination and shadowing can alter spectral responses observed by Landsat sensors, particularly in mountainous and hilly urban environments.

By integrating topographic variables with spectral and texture features, the modeling framework is better equipped to capture topography-modulated spectral responses, thereby reducing estimation bias associated with terrain effects and improving the robustness of above-ground vegetation carbon density prediction across complex urban landscapes.

2.5. Machine Learning Model and Feature Preparation (XGBoost)

To model the complex and nonlinear relationships between remotely sensed variables, topographic factors, and above-ground vegetation carbon density, the Extreme Gradient Boosting (XGBoost) algorithm was employed. All statistical analyses and machine learning modeling were implemented in R (version 4.4.1). XGBoost is a tree-based ensemble learning method that builds boosted regression trees using gradient descent optimization and regularization [38], and has demonstrated strong predictive performance and robustness in vegetation biomass and carbon estimation studies [22,39,40,41].

2.5.1. Feature Preparation

Predictor variables were derived from Landsat 8 OLI imagery and terrain data, including spectral bands, vegetation indices, texture metrics, and topographic variables (see Table A4 for a comprehensive summary of predictor variable names, categories, and derivation sources). Prior to model training, all predictor variables were spatially aligned to the Landsat pixel grid to ensure consistency with plot-level carbon density observations. Field plot coordinates were used to extract corresponding pixel values for each predictor.

Categorical variables, where applicable, were encoded using one-hot encoding, while continuous variables were retained in their original numeric form. To minimize the influence of missing or invalid values, only samples with complete observations across all predictor variables were retained for model training and validation. No explicit feature scaling was applied, because the tree splits in XGBoost depend only on the ordering of predictor values and are therefore insensitive to monotonic transformations of the input variables.
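The scaling argument can be illustrated with a minimal Python sketch of a single regression split. The function below is a hypothetical, exact greedy split search on one predictor, not the XGBoost implementation, and the data values are invented for illustration. It shows that a strictly monotonic transformation (here, a logarithm) leaves the resulting partition of samples unchanged. Approximate, histogram-based split finding may place thresholds slightly differently, so the invariance should be read as holding for the sample ordering rather than for every implementation detail.

```python
import math

def best_split_partition(x, y):
    """Return the indices sent left by the squared-error-optimal split on x."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best_sse, best_left = float("inf"), None
    for k in range(1, len(order)):
        if x[order[k - 1]] == x[order[k]]:
            continue  # no threshold can separate tied values
        left, right = order[:k], order[k:]
        sse = 0.0
        for part in (left, right):
            mean = sum(y[i] for i in part) / len(part)
            sse += sum((y[i] - mean) ** 2 for i in part)
        if sse < best_sse:
            best_sse, best_left = sse, frozenset(left)
    return best_left

# Invented predictor values and carbon densities, for illustration only.
x = [0.5, 2.0, 8.0, 1.0, 4.0, 16.0]
y = [3.0, 9.0, 30.0, 5.0, 28.0, 33.0]

raw_partition = best_split_partition(x, y)
log_partition = best_split_partition([math.log(v) for v in x], y)
same = raw_partition == log_partition
print(same)
```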

2.5.2. XGBoost Model Configuration

The XGBoost model was implemented using the gradient boosting decision tree (GBDT) framework with a squared error loss function, appropriate for continuous regression targets. Model hyperparameters, including learning rate, tree depth, subsampling ratio, and column sampling ratio, were configured to balance model complexity and generalization ability. Regularization terms were incorporated to reduce overfitting and improve model stability.

XGBoost was selected for its ability to:

(1) capture nonlinear relationships between predictors and above-ground vegetation carbon density;
(2) model high-order interactions among spectral and topographic variables; and
(3) provide a foundation for post hoc model interpretability through SHAP-based analysis.

2.6. Spatial Block Cross-Validation Strategy

2.6.1. Rationale for Spatial Cross-Validation

Conventional random cross-validation can lead to overly optimistic accuracy estimates in spatial modeling due to spatial autocorrelation between training and validation samples. To address this issue and provide a more realistic assessment of model generalization performance, a spatial block cross-validation (Spatial Block CV) strategy was adopted. Spatial Block CV explicitly separates training and validation samples in geographic space, thereby reducing spatial dependence between folds and better reflecting the predictive performance of the model when applied to unseen areas.

2.6.2. Spatial Block Design

Field plots were partitioned into spatial blocks using regular grids with block sizes of 2 km, 5 km, and 10 km. These block sizes were selected to represent different spatial dependency scales commonly encountered in urban landscapes with heterogeneous terrain and land-cover patterns. For each block size, the study area was divided into non-overlapping square blocks, with the grid origin fixed at the upper-left corner of the study area extent to ensure a deterministic and reproducible blocking configuration. A 10-fold cross-validation scheme was then implemented by randomly assigning spatial blocks, rather than individual plots, to folds, so that all plots within the same block were assigned to the same fold. To ensure reproducibility, a fixed random seed was used for the block-to-fold assignment, and the same assignment was applied consistently across all model runs for a given block size. This ensured that training and validation datasets were spatially disjoint at the block level.
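The blocking procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the R code used in this study: plot coordinates are assumed to be in a projected system with metre units, the function name, plot identifiers, coordinates, and seed value are hypothetical, and blocks are distributed to folds in shuffled round-robin order, which is only one possible way of assigning blocks randomly.

```python
import math
import random

def assign_block_folds(plots, x_min, y_max, block_size_m, n_folds=10, seed=42):
    """Assign plots to folds so that plots in the same grid block share a fold.

    plots: dict mapping plot id -> (x, y) in metres.
    (x_min, y_max): upper-left corner of the study-area extent (grid origin).
    seed: hypothetical value; the seed used in the study is not reported here.
    """
    block_of = {}
    for pid, (x, y) in plots.items():
        col = math.floor((x - x_min) / block_size_m)  # columns count eastward
        row = math.floor((y_max - y) / block_size_m)  # rows count southward
        block_of[pid] = (row, col)

    blocks = sorted(set(block_of.values()))  # occupied blocks, deterministic order
    random.Random(seed).shuffle(blocks)      # fixed seed -> reproducible shuffle
    fold_of_block = {b: i % n_folds for i, b in enumerate(blocks)}
    folds = {pid: fold_of_block[b] for pid, b in block_of.items()}
    return folds, block_of

# Hypothetical plots in a 10 km x 10 km extent, partitioned into 5 km blocks.
plots = {
    "p1": (1200.0, 9800.0),
    "p2": (1900.0, 9100.0),
    "p3": (7400.0, 2600.0),
    "p4": (5100.0, 9900.0),
}
folds, blocks = assign_block_folds(plots, x_min=0.0, y_max=10000.0, block_size_m=5000.0)
print(blocks)
print(folds)
```

When fewer occupied blocks than folds are available, some folds remain empty under this scheme, which is one reason why coarse blocks can yield unstable validation estimates.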

2.6.3. Model Evaluation

For each block size (2 km, 5 km, and 10 km), the XGBoost model was trained and validated independently using the corresponding spatial folds. Model performance was evaluated using multiple accuracy metrics, including the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE), calculated separately for training and validation datasets.
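For reference, the three accuracy metrics can be computed as in the following Python sketch. It assumes that R² is defined as one minus the ratio of the residual to the total sum of squares (the text does not state whether this or the squared correlation was used); the function name and the example values are hypothetical.

```python
import math

def regression_metrics(observed, predicted):
    """Return R2, RMSE, and MAE for paired observed and predicted values."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "r2": 1.0 - ss_res / ss_tot,               # assumed definition of R2
        "rmse": math.sqrt(ss_res / n),             # same units as the target
        "mae": sum(abs(r) for r in residuals) / n,
    }

# Invented carbon densities (Mg ha^-1) for one validation fold.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Applying such a function to each fold and then averaging gives fold-level means and standard deviations of the kind reported in the Results.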

By comparing model performance across different block sizes, the sensitivity of prediction accuracy to spatial dependency was systematically assessed. Because the spatial blocking configuration and block–fold assignment were fixed and fully reproducible, differences in model performance across block sizes can be attributed to changes in spatial dependency scale rather than stochastic variation in cross-validation splits. This multi-scale spatial validation framework provides a more robust and transparent evaluation of the model’s ability to generalize across complex urban environments.

2.7. SHAP and SHAP Interaction Analysis

2.7.1. SHAP-Based Model Interpretation

To interpret the contributions of individual predictor variables to the XGBoost model, SHapley Additive exPlanations (SHAP) were employed. SHAP is a game-theoretic approach that attributes a model prediction to individual features by averaging each feature’s marginal contribution over all possible feature combinations. SHAP values provide both the magnitude and direction of each feature’s influence on above-ground vegetation carbon density predictions. Global feature importance was assessed by computing the mean absolute SHAP value for each predictor across all samples. This allowed identification of the most influential spectral, texture, and topographic variables driving the model’s predictions.
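The attribution principle can be made concrete with a small Python sketch that computes exact Shapley values by enumerating all feature combinations for a toy three-feature model. Several simplifications are assumed: features outside a combination are replaced by a fixed baseline value (SHAP implementations for tree models instead use the tree structure and background data), and the toy model, feature roles, and numbers are hypothetical. The sketch is intended only to show what a SHAP value and a mean absolute SHAP importance represent.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one sample by coalition enumeration."""
    m = len(x)

    def value(coalition):
        # Features outside the coalition are set to their baseline value.
        return predict([x[i] if i in coalition else baseline[i] for i in range(m)])

    phi = [0.0] * m
    for i in range(m):
        others = [j for j in range(m) if j != i]
        for size in range(m):
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def mean_abs_shap(predict, samples, baseline):
    """Global importance: mean absolute Shapley value per feature."""
    per_sample = [shapley_values(predict, x, baseline) for x in samples]
    return [sum(abs(p[i]) for p in per_sample) / len(per_sample)
            for i in range(len(baseline))]

# Toy model: v[0] ~ spectral index, v[1] ~ elevation, v[2] ~ texture (all invented).
def toy_model(v):
    return 2.0 * v[0] + v[0] * v[1] + 3.0 * v[2]

baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, [1.0, 2.0, 1.0], baseline)
importance = mean_abs_shap(toy_model, [[1.0, 2.0, 1.0], [2.0, 0.0, 1.0]], baseline)
print(phi, importance)
```

In this toy case the attributions sum to the difference between the prediction and the baseline prediction, and the product term is shared equally between the two features involved.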

2.7.2. SHAP Dependence and Interaction Analysis

Beyond main effects, SHAP dependence plots were used to examine how variations in individual predictor variables affect above-ground vegetation carbon density, while accounting for interactions with other features. To further quantify pairwise interactions, SHAP interaction values were calculated, enabling explicit assessment of how the combined effects of two variables influence model outputs. In particular, interactions between spectral variables and topographic factors (e.g., elevation, slope, eastness, and northness) were analyzed to reveal topography-modulated spectral responses. These interactions provide mechanistic insights into how terrain conditions regulate vegetation growth and modify the spectral signals captured by Landsat imagery.
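A pairwise SHAP interaction value can be illustrated in the same spirit. The Python sketch below evaluates the Shapley interaction index for a pair of features by enumerating all combinations of the remaining features, again replacing absent features with a fixed baseline. The toy model (a spectral index multiplied by elevation, plus additive terms), the function name, and the values are hypothetical, and the enumeration is a didactic substitute for the tree-based algorithm used in practice.

```python
from itertools import combinations
from math import factorial

def shap_interaction(predict, x, baseline, i, j):
    """Shapley interaction index between features i and j for one sample."""
    m = len(x)

    def value(coalition):
        # Features outside the coalition are set to their baseline value.
        return predict([x[k] if k in coalition else baseline[k] for k in range(m)])

    rest = [k for k in range(m) if k not in (i, j)]
    total = 0.0
    for size in range(len(rest) + 1):
        weight = factorial(size) * factorial(m - size - 2) / (2 * factorial(m - 1))
        for subset in combinations(rest, size):
            s = set(subset)
            # Joint effect of i and j beyond the sum of their separate effects.
            delta = value(s | {i, j}) - value(s | {i}) - value(s | {j}) + value(s)
            total += weight * delta
    return total

# Toy model: v[0] ~ spectral index, v[1] ~ elevation, v[2] ~ texture (all invented).
def toy_model(v):
    return 2.0 * v[0] + v[0] * v[1] + 3.0 * v[2]

x, baseline = [1.0, 2.0, 1.0], [0.0, 0.0, 0.0]
spectral_elevation = shap_interaction(toy_model, x, baseline, 0, 1)
spectral_texture = shap_interaction(toy_model, x, baseline, 0, 2)
print(spectral_elevation, spectral_texture)
```

Only the pair that enters the toy model as a product receives a non-zero interaction value, which is the kind of signal examined here for spectral and topographic variables.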

2.7.3. Linking SHAP Results to Ecological Interpretation

By integrating SHAP main effects and interaction analyses, the study moves beyond black-box prediction to a more interpretable modeling framework. SHAP-based interpretation enables identification of dominant environmental controls on above-ground vegetation carbon density and clarifies how terrain-induced ecological gradients shape spectral–carbon relationships in complex urban landscapes.

2.8. Spatial Prediction of UAGVCD

Following model training and validation, the optimized XGBoost model was applied to generate a wall-to-wall spatial prediction of above-ground vegetation carbon density across the entire study area. The model configuration corresponding to the 5 km spatial block cross-validation, which provided a balanced trade-off between predictive accuracy and spatial generalization, was selected for spatial mapping. All predictor variables, including Landsat 8 spectral features, vegetation indices, texture metrics, and topographic variables, were prepared as raster layers with a spatial resolution of 30 m, consistent with the Landsat imagery. These raster layers were spatially aligned and stacked to form a continuous predictor dataset covering the full extent of Shenzhen.

The trained XGBoost model was then applied to the predictor stack on a pixel-by-pixel basis to estimate above-ground vegetation carbon density for each grid cell. Areas classified as non-vegetated land covers (e.g., built-up areas and water bodies) based on the high-resolution land use/land cover map were masked to avoid spurious predictions.
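The pixel-wise prediction and masking step can be sketched in Python as follows. The sketch assumes that the predictor layers are already aligned, equally sized two-dimensional grids and uses a simple stand-in function in place of the trained model; the layer names, coefficients, and values are hypothetical and serve only to show the control flow.

```python
def predict_raster(layers, vegetated, predict, nodata=None):
    """Apply predict() to every vegetated pixel of an aligned layer stack.

    layers: dict mapping predictor name -> 2-D list of pixel values.
    vegetated: 2-D list of booleans (False = built-up, water, etc.).
    """
    n_rows, n_cols = len(vegetated), len(vegetated[0])
    out = [[nodata] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            if not vegetated[r][c]:
                continue  # masked land cover: leave as nodata
            features = {name: grid[r][c] for name, grid in layers.items()}
            if any(v is None for v in features.values()):
                continue  # incomplete predictors: leave as nodata
            out[r][c] = predict(features)
    return out

# Stand-in for the trained model; the coefficients are invented.
def stand_in_model(f):
    return 20.0 * f["arvi"] + 0.01 * f["dem"]

layers = {
    "arvi": [[0.5, 0.2], [None, 0.8]],
    "dem": [[100.0, 5.0], [300.0, 200.0]],
}
vegetated = [[True, False], [True, True]]
carbon_map = predict_raster(layers, vegetated, stand_in_model)
print(carbon_map)
```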

The resulting spatially continuous carbon density map represents the distribution of above-ground vegetation carbon density during the 2016 growing season, corresponding to the period of Landsat image acquisition. This spatial product provides a quantitative basis for analyzing the spatial heterogeneity of urban vegetation carbon density and for supporting further interpretation of terrain-modulated spectral responses.

3. Results

3.1. Performance of XGBoost Under Spatial Block Cross-Validation

Under the spatial block cross-validation framework, the XGBoost model exhibited robust predictive performance for UAGVCD across Shenzhen. Using 10-fold spatial block cross-validation with a block size of 5 km, the model achieved a mean validation coefficient of determination (R²) of 0.617, with a standard deviation of 0.055 (Table 2). Correspondingly, the mean validation RMSE and MAE were 10.25 ± 1.39 Mg ha⁻¹ and 9.01 ± 1.21 Mg ha⁻¹, respectively, indicating a satisfactory balance between predictive accuracy and spatial generalization capability.

In contrast, the training performance was substantially higher, with a mean training R² of 0.917 ± 0.086, accompanied by lower RMSE (5.53 ± 3.97 Mg ha⁻¹) and MAE (3.44 ± 2.41 Mg ha⁻¹). The consistent gap between training and validation accuracy reflects the stringent nature of spatially independent validation and highlights the influence of spatial autocorrelation and environmental heterogeneity in urban landscapes. Importantly, the moderate rather than excessively high validation accuracy suggests that the validation estimates were not inflated by spatial autocorrelation between folds and therefore reflect realistic generalization performance.

Model performance varied across individual spatial folds (Table 3). Validation R² values ranged from 0.549 to 0.696, while validation RMSE spanned 8.78–18.83 Mg ha⁻¹ across folds. Such variability indicates spatial heterogeneity in vegetation structure, terrain complexity, and urban development intensity within the study area. Folds characterized by more complex terrain or highly fragmented urban–vegetation mosaics generally exhibited lower predictive accuracy, underscoring the challenges of carbon density estimation in heterogeneous urban environments.

Overall, the spatial block cross-validation results demonstrate that the proposed XGBoost framework is capable of producing stable and spatially transferable estimates of UAGVCD. These results provide a solid foundation for subsequent analyses aimed at interpreting model behavior and identifying the key spectral–topographic drivers underlying spatial variations in carbon density, which are further explored using SHAP and SHAP interaction analyses in the following sections.

3.2. Comparison of Spatial Block Sizes on Model Performance

To evaluate the sensitivity of model performance to spatial autocorrelation control, we compared XGBoost models trained and validated using three spatial block sizes (2 km, 5 km, and 10 km) under a spatial block cross-validation framework (Figure 3; Table 4).

Overall, as shown in Table 4 and Figure 3, validation performance was sensitive to the spatial block size: it was highest at 5 km, slightly lower and more variable at 2 km, and clearly degraded at 10 km, where variability across folds was also largest. The model using a 5 km block size achieved the best balance between predictive accuracy and robustness, yielding the highest validation R² (0.617 ± 0.055) and the lowest validation RMSE (10.25 ± 1.39 Mg ha⁻¹) and MAE (9.01 ± 1.21 Mg ha⁻¹). In comparison, the 2 km block configuration produced slightly lower validation accuracy (R² = 0.604 ± 0.109) and a higher RMSE (13.66 ± 2.26 Mg ha⁻¹), although its MAE was comparable (9.27 ± 2.08 Mg ha⁻¹). The larger standard deviation in validation R² for the 2 km blocks indicates increased sensitivity to spatially clustered samples and residual spatial dependence.

In contrast, the 10 km block size led to a substantial decline in validation performance, with validation R² decreasing to 0.380 ± 0.297 and both RMSE (15.41 ± 7.35 Mg ha⁻¹) and MAE (11.24 ± 5.66 Mg ha⁻¹) increasing markedly. Moreover, the wide dispersion of validation metrics across folds suggests unstable model generalization when overly coarse spatial partitioning is applied, likely due to reduced training sample diversity and insufficient representation of local environmental gradients.

The comparison between training and validation metrics further highlights the effect of block size on model generalization. While training R² remained consistently high across all block sizes (≥0.84), the widening gap between training and validation performance at larger block sizes indicates increasing model uncertainty and reduced spatial transferability. In particular, the pronounced divergence observed under the 10 km configuration reflects a trade-off between strict spatial independence and effective model learning.

3.3. Spatial Distribution of UAGVCD

Figure 4 illustrates the spatially explicit predictions of UAGVCD generated using XGBoost models trained under different spatial block sizes (2 km, 5 km, and 10 km). All predictions were produced at the pixel level based on Landsat-derived spectral variables and topographic predictors, and subsequently mapped across the entire Shenzhen metropolitan area.

Across all three block sizes, the predicted spatial patterns exhibit a high degree of consistency at the regional scale. Areas characterized by mountainous terrain and continuous vegetation cover—primarily located in the eastern and southeastern parts of Shenzhen—show markedly higher vegetation carbon density, whereas densely built-up urban cores and coastal lowlands present substantially lower values. This spatial contrast reflects the combined influence of vegetation structure, land cover composition, and topographic conditions on carbon storage capacity.

Despite the overall agreement in large-scale spatial patterns, notable differences emerge among the three block-size scenarios. The 2 km block configuration produces the most spatially heterogeneous predictions, characterized by pronounced local variability and sharper transitions between high- and low-carbon-density areas (Figure 4(1)). Although this finer block size yields relatively high validation performance, its accuracy is slightly lower and less stable than that obtained with the 5 km configuration (Section 3.2). The resulting fragmented prediction surface may therefore reflect residual spatial autocorrelation effects and localized overfitting associated with small spatial blocks.

In contrast, the 10 km block size yields considerably smoother spatial patterns, with reduced local variability and attenuated extremes in predicted carbon density (Figure 4(3)). Although this configuration enforces stronger spatial independence during model training, its lower validation performance and larger uncertainty (Section 3.2) suggest a loss of fine-scale ecological information relevant to urban vegetation carbon dynamics.

The predictions generated using the 5 km spatial block size represent a balance between these two extremes. The resulting maps preserve meaningful spatial heterogeneity while maintaining spatial continuity and ecological plausibility. Combined with the highest and most stable validation performance observed in Section 3.2, the 5 km block size provides the most reliable spatial representation of UAGVCD for the Shenzhen study area (Figure 4(2)).

Overall, these results demonstrate that spatial block size plays a critical role not only in model validation metrics but also in the spatial realism of carbon density predictions. The integration of spatial block cross-validation with spatial prediction therefore offers a robust framework for urban carbon mapping, particularly in heterogeneous metropolitan environments.

3.4. SHAP-Based Interpretation of Spectral–Topographic Controls on Urban Vegetation Carbon Density

Figure 5 presents a comprehensive SHAP-based interpretation of the XGBoost model, integrating global feature importance with variable-specific dependence patterns and topography-conditioned responses. Overall, both spectral and topographic variables contribute substantially to the prediction of UAGVCD, indicating that carbon estimation in complex urban landscapes is governed by coupled spectral–terrain controls rather than by spectral information alone.

The global SHAP importance analysis (Figure 5(1)) shows that spectral variables derived from Landsat imagery, particularly the blue band reflectance (B3), red-edge related indices (TR435), the vegetation index ARVI, and the texture metric B4_mean, dominate the model contributions. Among these, B3 exhibits the largest mean absolute SHAP value, highlighting its strong sensitivity to variations in vegetation structure and background conditions within the urban environment. In parallel, several terrain-related variables, including elevation (DEM), Eastness, Curvature, and Northness, rank comparably high in global importance. The concurrent prominence of spectral and topographic predictors suggests that model predictions are shaped by their combined and context-dependent effects rather than by purely additive contributions. This result indicates that topographic context exerts a non-negligible influence on vegetation carbon density estimation, even within a highly urbanized coastal city such as Shenzhen.

Beyond global importance, SHAP dependence plots (Figure 5(2)) reveal pronounced nonlinear and threshold-like relationships between key predictors and model outputs. For instance, the SHAP values of B3 decrease sharply with increasing reflectance up to a critical range, after which the contribution stabilizes, indicating a saturation effect commonly associated with sparse or stressed vegetation conditions. In contrast, ARVI exhibits a generally positive relationship with SHAP values, with a clear inflection point beyond which increases in ARVI lead to disproportionately larger positive contributions to predicted carbon density. Similar nonlinear responses are observed for B4_mean and Curvature, underscoring that the relationships between spectral or terrain variables and vegetation carbon density are highly non-linear and cannot be adequately captured by linear models.

Importantly, these dependence relationships exhibit strong conditional dependencies on topographic context, as evidenced by the consistent coloring of SHAP values by DEM. For multiple spectral predictors (e.g., B3, ARVI, and B4_mean), identical spectral values correspond to markedly different SHAP contributions under different elevation contexts. This elevation-dependent divergence of SHAP responses provides direct evidence that topography modulates the effective contribution of spectral variables to UAGVCD predictions, consistent with a conditional interaction framework. Higher-elevation pixels generally exhibit more positive SHAP responses for vegetation-related spectral indices, whereas lower-elevation areas tend to show attenuated or even negative contributions. This pattern indicates that elevation does not merely act as an independent explanatory variable but functions as a key environmental modulator that alters the effective spectral response of urban vegetation and reshapes the spectral–carbon relationships captured by the model.

A similar modulation effect is observed for terrain orientation variables. Eastness and Curvature display sign reversals in SHAP values across their ranges, with the magnitude and direction of their contributions varying systematically with elevation. These elevation-conditioned sign changes further support the presence of topography-dependent modulation rather than simple additive effects among predictors. These results suggest that slope orientation and local terrain geometry influence microclimatic conditions, illumination regimes, and vegetation growth environments, thereby indirectly shaping the spectral–carbon relationship captured by the model.

Taken together, the SHAP results provide quantitative evidence that urban vegetation carbon density is controlled by topography-modulated spectral responses. Spectral signals alone are insufficient to explain carbon variability across heterogeneous urban landscapes; instead, their predictive meaning depends strongly on the surrounding terrain context. This finding highlights the importance of explicitly incorporating topographic variables into machine-learning-based carbon estimation frameworks and offers new insights into the coupled spectral–topographic mechanisms governing urban vegetation carbon dynamics.

4. Discussion

4.1. Performance of Machine Learning Models Under Spatially Explicit Validation

Accurate estimation of above-ground vegetation carbon density in urban environments remains challenging due to strong spatial heterogeneity, complex land-cover mosaics, and pronounced spatial autocorrelation [3]. In this study, using spatial block cross-validation offers a more realistic evaluation of model generalization compared to traditional random cross-validation methods, which tend to exaggerate predictive performance in spatial datasets [42,43].

The XGBoost model showed strong predictive ability under spatial block cross-validation, achieving a validation R² of 0.617 with a 5 km block size. This accuracy level is similar to or better than values reported in earlier remote sensing studies on biomass and carbon in mixed or mountainous regions [16,27,44,45]. The clear difference between training and validation results emphasizes the need for spatially independent validation in urban carbon mapping, because nearby samples often share similar spectral and environmental features and can therefore inflate accuracy estimates when they are split between training and validation sets.

The comparison across block sizes further highlights the trade-off between spatial independence and effective model learning. Smaller blocks (2 km) achieved relatively high validation accuracy but showed greater variability across folds, indicating remaining spatial dependence. In contrast, larger blocks (10 km) significantly lowered validation performance and increased uncertainty, because local environmental gradients were not adequately represented in the training data. These findings agree with recent studies that stress the importance of choosing an intermediate block size fitting the spatial scale of ecological processes and landscape differences [46,47,48,49,50]. For Shenzhen, among the tested sizes, a 5 km block appears to strike the best balance between managing spatial autocorrelation and keeping enough training data.

It should be noted that the spatial block cross-validation results are also influenced by the density and spatial representativeness of field samples. In this study, 195 plots distributed over approximately 2000 km² result in a relatively low average sampling density in a highly heterogeneous urban environment. Uneven coverage across elevation gradients and urbanization intensity zones may affect local model performance and the interpretation of spatial generalization, particularly for larger block sizes. While spatial block cross-validation provides a conservative evaluation under limited sampling, denser and more stratified plot designs would further strengthen the robustness of spatial validation in future applications.
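The sampling density noted above can be translated into rough per-block figures with the following Python sketch. It is a back-of-the-envelope calculation under strong simplifying assumptions: the approximately 2000 km² area is treated as perfectly tiled by square blocks, and the 195 plots are treated as evenly spread, neither of which holds exactly for the actual study area. The function name is hypothetical.

```python
def block_summary(n_plots, area_km2, block_km, n_folds=10):
    """Idealized block counts, assuming perfect tiling and evenly spread plots."""
    n_blocks = area_km2 / block_km ** 2
    return {
        "blocks": n_blocks,
        "plots_per_block": n_plots / n_blocks,
        "blocks_per_fold": n_blocks / n_folds,
    }

density = 195 / 2000.0  # about 0.1 plots per km2, i.e., roughly one plot per 10 km2
summaries = {size: block_summary(195, 2000.0, size) for size in (2, 5, 10)}
for size, s in summaries.items():
    print(size, s)
```

Under these idealized assumptions, a 10 km grid yields only about 20 blocks, or about two blocks per fold, so each validation fold depends on very few locations. This is in line with the larger fold-to-fold variability observed at 10 km.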

4.2. Spatial Patterns of Urban Vegetation Carbon Density and Scale Effects

The spatial prediction results show consistent large-scale patterns of urban vegetation carbon density across different block sizes. Higher carbon densities are mainly found in the mountainous and green areas of eastern and southeastern Shenzhen, while lower values occur in highly urbanized centers and coastal lowlands. These patterns are consistent with earlier findings that topography, land cover, and urban development shape the distribution of urban vegetation carbon stocks [51,52,53,54]. However, significant differences appear in the spatial texture of predictions between block sizes. The 2 km block arrangement produces highly heterogeneous and fragmented carbon density maps with sharp local changes. While this detail may capture fine-scale ecological differences, it may also indicate localized overfitting due to spatially clustered samples.

On the other hand, the 10 km arrangement generates overly smooth spatial patterns, diminishing local variations and possibly hiding important ecological gradients. Similar scale-dependent smoothing effects have been noted in spatial modeling studies using coarse validation units [55]. The 5 km block arrangement offers a compromise between these extremes, maintaining spatial differences while keeping clear regional patterns and better validation performance. This balance shows that spatial validation design not only impacts accuracy metrics but also affects the ecological realism of spatial prediction products—a factor often overlooked in urban carbon mapping studies.

It should also be emphasized that the spatial patterns discussed above are derived from Landsat imagery acquired in 2016 and field measurements collected between 2014 and 2016, representing a historical snapshot of Shenzhen’s urban vegetation carbon conditions. Given the rapid urban renewal and large-scale greening initiatives implemented in recent years, including extensive park development and green space expansion, the current distribution and magnitude of urban vegetation carbon density may have changed. Therefore, the spatial patterns identified here should be interpreted in a temporal context, while the methodological framework remains directly transferable to more recent satellite and field data.

4.3. Spectral and Topographic Controls on Urban Vegetation Carbon Density

The SHAP-based global importance analysis shows that both spectral and topographic factors are key to estimating urban vegetation carbon density. Spectral features from Landsat images, such as visible bands, vegetation indices, and texture metrics, play a major role, confirming extensive research that highlights their sensitivity to vegetation structure and biomass [3,24,56,57]. Notably, terrain-related factors like elevation, eastness, curvature, and northness also rank highly in importance. This finding highlights that even in a highly urban coastal city, topographic context has a significant impact on vegetation carbon storage. Past studies have shown that elevation and terrain orientation affect vegetation growth through microclimates, soil moisture, and sunlight [58,59,60,61,62]. Our findings expand this understanding by showing that topography also influences how spectral signals relate to vegetation carbon density in urban areas.

4.4. Topography-Modulated Spectral Responses Revealed by SHAP Dependence Analysis

Beyond overall feature rankings, SHAP dependence and interaction analyses give valuable insights into the nonlinear and context-dependent nature of spectral–carbon relationships. Several spectral variables, like B3, ARVI, and B4_mean, show clear threshold-like and nonlinear responses, reflecting saturation effects and background influences often seen in optical remote sensing of vegetation [63,64,65].

Importantly, these spectral responses are consistently affected by elevation, as shown by systematic changes in SHAP values across DEM gradients. The same spectral values lead to significantly different contributions to predicted carbon density in different topographic contexts. Higher elevations typically strengthen positive spectral contributions, while lower elevations weaken them. This suggests that elevation not only serves as a standalone predictor but also acts as an environmental factor that changes the ecological interpretation of spectral data. Terrain orientation variables further support this view. Eastness and curvature exhibit sign changes and varying contribution levels across their ranges, showing that light angles, slope exposure, and microclimatic conditions affect vegetation growth and spectral reflectance. Such interactions are hard to capture with traditional linear models but are effectively shown through SHAP interpretation of tree-based machine learning models [66,67].

Together, these results provide evidence for topography-modulated spectral responses, where the link between spectral features and vegetation carbon density explicitly depends on terrain conditions. This insight advances existing remote sensing studies by moving beyond simple variable importance to a better understanding of how spectral signals interact with topographic changes in complex urban environments.

4.5. Uncertainty Sources and Limitations

Several sources of uncertainty may affect the estimation of UAGVCD in this study. First, plot-level carbon density calculations rely on regional tree volume equations, BEFs, and carbon conversion coefficients (Table A1, Table A2 and Table A3), all of which contain inherent fitting and transferability uncertainties. Such parameters are commonly developed at provincial or national scales and may not fully represent the species composition and management conditions of highly urbanized forests in Shenzhen, as widely acknowledged in biomass and carbon accounting studies and IPCC guidelines [68,69,70]. Second, shrub and herbaceous biomass was estimated using height-based empirical equations (Equations (3) and (4)), without explicitly considering vegetation cover or structural density. Previous studies have shown that shrub biomass models based on a single predictor typically achieve moderate explanatory power (R² ≈ 0.3–0.6), implying relatively higher uncertainty for plots dominated by shrubs and grasses, such as urban parks and landscaped green spaces [71,72]. Third, temporal mismatches between field measurements and Landsat imagery may introduce uncertainty despite temporal normalization, particularly for fast-changing herbaceous and shrub vegetation [73]. In addition, Landsat’s 30 m spatial resolution inevitably leads to mixed-pixel effects in complex urban environments, where vegetation frequently coexists with impervious surfaces, potentially reducing estimation precision at fine spatial scales [74,75].

Despite these limitations, the spatial block cross-validation framework provides a conservative assessment of model generalization, indicating that the main spatial patterns and the relative importance of spectral and topographic drivers identified in this study are robust.

4.6. Implications for Urban Carbon Mapping and Future Research Directions

Combining spatial block cross-validation, XGBoost modeling, and SHAP-based analysis creates a strong framework for estimating urban vegetation carbon density. By considering spatial dependence and terrain-modulated spectral effects, this method enhances both reliability and clarity, addressing two long-standing challenges in urban remote sensing. Future research can expand this framework in several ways. First, adding multi-temporal or phenological data from dense time series may improve sensitivity to changes in vegetation and reduce saturation effects. Second, including LiDAR-derived structural metrics could enhance the representation of vertical vegetation complexity, especially in mixed urban forests. Third, further quantitative analysis of uncertainty and spatial error propagation could provide deeper insights into the dependability of carbon density maps at various spatial scales, building upon the uncertainty sources discussed above.

5. Conclusions

This study presented a remote sensing and machine learning framework for estimating UAGVCD in Shenzhen, China. We combined Landsat 8 spectral information, terrain-derived variables, and field-based carbon density observations. Our work shows that accurate urban carbon estimation needs a clear approach to spatial dependence and topographic effects.

Using XGBoost and a spatial block cross-validation method, we examined how block size affects model performance and prediction behavior. The findings reveal that validation accuracy and stability are sensitive to the scale of spatial partitioning. Among the configurations we tested, a moderate block size of 5 km provided the best balance between predictive accuracy, robustness across folds, and spatial realism. Smaller blocks of 2 km often resulted in fragmented predictions and heightened sensitivity to local spatial dependence. On the other hand, larger blocks of 10 km led to decreased accuracy and overly smooth patterns, which caused a loss of detailed ecological information.

SHAP-based analysis showed that urban vegetation carbon density depends on both spectral and topographic influences, not just spectral data. Spectral variables, such as Landsat bands, vegetation indices, and texture metrics, along with terrain features like elevation, eastness, curvature, and northness, played vital roles in model predictions. The SHAP dependence and interaction analyses highlighted significant nonlinear and threshold-like relationships. This means that similar spectral signals can correspond to very different predicted carbon densities in different topographic settings. These results offer evidence of how topography affects spectral responses in estimating urban vegetation carbon.

Overall, this study emphasizes the importance of using spatially explicit validation strategies and interpretable machine learning for urban carbon mapping. The proposed framework enhances the reliability and clarity of estimating urban vegetation carbon density. It can be adapted to other cities with varied terrain and land-cover patterns. Future research could build on this approach by adding multi-temporal observations, more structural data from LiDAR or SAR, and deeper exploration of scale-dependent urban ecological processes.

Author Contributions

Conceptualization, G.Q., G.W. and M.W.; methodology, G.Q.; software, G.Q.; validation, G.W. and M.W.; formal analysis, G.Q.; investigation, G.W.; resources, G.Q.; data curation, G.Q.; writing—original draft preparation, G.Q.; writing—review and editing, G.W. and M.W.; visualization, G.Q. and G.W.; supervision, G.W.; project administration, G.W., G.Q. and M.W.; funding acquisition, G.W., G.Q. and M.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the 2025 Key Project of Zunyi Philosophy and Social Science Planning Program (Grant No. 25ZYZD03), the Guizhou Provincial Science and Technology Program (Grant No. QN (2025) 293), and Shenzhen Xianhu Botanic Garden (Grant No. 8851).

Data Availability Statement

The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgments

The authors appreciate the assistance of Yifan Tan and Hua Sun with the field data collection, and the funding from Shenzhen Xianhu Botanic Garden, Shenzhen, China, and the School of Business Administration, Moutai Institute, China. We also thank the anonymous reviewers of this article for their valuable revision comments.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

UAGVCD	Urban Above-Ground Vegetation Carbon Density
XGBoost	Extreme Gradient Boosting
SHAP	SHapley Additive exPlanations
DEM	Digital Elevation Model
NFI	National Forest Inventory
OLI	Operational Land Imager
BEF	Biomass Expansion Factor

Appendix A

Table A1. Tree volume calculation equations.

Tree Species	Volume Calculation Equation
Eucalyptus	V = 8.71419 × 10⁻⁵ × D^1.94801 × H^0.74929
Pinus elliottii	V = 7.81515 × 10⁻⁵ × D^1.79967 × H^0.98178
Acacia mangium	V = 7.32715 × 10⁻⁵ × D^1.65483 × H^1.08069
Pinus tabuliformis	V = 7.98524 × 10⁻⁵ × D^1.74220 × H^1.01198
Castanopsis fissa	V = 6.29692 × 10⁻⁵ × D^1.81296 × H^1.01545
Broad-leaved species	V = 6.74286 × 10⁻⁵ × D^1.87657 × H^0.92888
Cunninghamia lanceolata	V = 6.97483 × 10⁻⁵ × D^1.81583 × H^0.99610
Hardwood	V = 6.01228 × 10⁻⁵ × D^1.87550 × H^0.98496

Note: V—tree volume, D—diameter at breast height (1.3 m), H—height of tree.
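
As a worked illustration of Table A1, the following minimal Python sketch evaluates the two-variable volume equations for a single tree. The names `VOLUME_PARAMS` and `tree_volume` and the example tree are hypothetical. The units (D in cm, H in m, V in m³) are an assumption, since the table note does not state them.

```python
# Coefficients (c, p, q) of V = c * D**p * H**q, copied from Table A1.
VOLUME_PARAMS = {
    "Eucalyptus": (8.71419e-5, 1.94801, 0.74929),
    "Pinus elliottii": (7.81515e-5, 1.79967, 0.98178),
    "Acacia mangium": (7.32715e-5, 1.65483, 1.08069),
    "Pinus tabuliformis": (7.98524e-5, 1.74220, 1.01198),
    "Castanopsis fissa": (6.29692e-5, 1.81296, 1.01545),
    "Broad-leaved species": (6.74286e-5, 1.87657, 0.92888),
    "Cunninghamia lanceolata": (6.97483e-5, 1.81583, 0.99610),
    "Hardwood": (6.01228e-5, 1.87550, 0.98496),
}


def tree_volume(species: str, dbh: float, height: float) -> float:
    """Return the volume of one tree (assumed units: D in cm, H in m, V in m^3)."""
    c, p, q = VOLUME_PARAMS[species]
    return c * dbh ** p * height ** q


# Hypothetical tree, not a field record: a Eucalyptus with D = 20 and H = 15.
example_volume = tree_volume("Eucalyptus", dbh=20.0, height=15.0)
print(f"Estimated volume: {example_volume:.4f}")
```

Keeping the coefficients in a lookup table keyed by species mirrors the layout of Table A1 and makes it straightforward to apply the equations to a full tree list.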

Table A2. Parameters of the biomass-volume relationship for different forest types.

Forest Types	a (Mg/m³)	b (Mg)	N	R²
Picea asperata/Abies alba	0.5519	48.861	24	0.78
Betula	1.0687	10.237	9	0.70
Casuarina equisetifolia	0.7441	3.2377	10	0.95
Cunninghamia lanceolata	0.4652	19.141	90	0.94
Cedrus spp.	0.8893	7.3965	19	0.87
Cupressus funebris	1.1453	8.5473	12	0.98
Quercus subg. Quercus sect. Quercus	0.8873	4.5539	20	0.80
Eucalyptus robusta Sm.	0.6096	33.806	34	0.82
Larix principis-rupprechtii	0.9292	6.494	24	0.83
Subtropical evergreen broad-leaved forest	0.8136	18.466	10	0.99
Theropencedrymion	0.9788	5.3764	35	0.93
Broadleaf mixed plantations	0.5856	18.744	9	0.91
Pinus armandi	0.5723	16.489	22	0.93
Pinus massoniana	0.5034	20.547	52	0.87
Pinus sylvestris	1.112	2.6951	15	0.85
Pinus tabuliformis	0.869	9.1212	112	0.91
Other conifer species	0.5292	25.087	19	0.86
Populus tremula	0.4969	26.973	13	0.92
Tsuga chinensis/Cryptomeria fortunei	0.3491	39.816	30	0.79
Tropical forests	0.7975	0.4204	18	0.87
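
To show how the parameters in Table A2 might be applied, the sketch below converts a stand volume into biomass. It assumes a linear relationship B = a × V + b, with V in m³/ha and B in Mg/ha. That form is suggested by the units of a and b but is not restated in this appendix, so it should be read as an assumption. The names `BIOMASS_PARAMS` and `biomass_from_volume` and the example volume are hypothetical.

```python
# (a, b) pairs copied from Table A2 for three of the listed forest types.
BIOMASS_PARAMS = {
    "Subtropical evergreen broad-leaved forest": (0.8136, 18.466),
    "Eucalyptus robusta Sm.": (0.6096, 33.806),
    "Pinus massoniana": (0.5034, 20.547),
}


def biomass_from_volume(forest_type: str, volume: float) -> float:
    """Assumed linear form B = a * V + b (V in m^3/ha, B in Mg/ha)."""
    a, b = BIOMASS_PARAMS[forest_type]
    return a * volume + b


# Hypothetical stand volume of 100 m^3/ha, chosen only for illustration.
example_biomass = biomass_from_volume(
    "Subtropical evergreen broad-leaved forest", 100.0
)
print(f"Estimated biomass: {example_biomass:.3f}")
```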

Table A3. Carbon ratio table for different tree species in China.

Tree Species	Ratio	Tree Species	Ratio
Picea asperata Mast.	0.4994	Schima superba	0.5115
Tsuga chinensis	0.5022	Other broad-leaved hardwood species	0.4901
Larix gmelinii	0.5137	Populus spp.	0.4502
Pinus koraiensis Siebold & Zucc.	0.5113	Eucalyptus spp.	0.4748
Pinus thunbergii Parl.	0.5146	Acacia spp.	0.4666
Pinus tabuliformis	0.5184	Other broad-leaved softwood species	0.4502
Pinus armandi Franch.	0.5177	Broadleaf mixed trees	0.4796
Pinus massoniana Lamb.	0.5271	Economic tree species	0.4700
Pinus elliottii	0.5311	Cupressus funebris	0.5088
Other Pinus species	0.4963	Coniferous mixed forest	0.5168
Cunninghamia lanceolata	0.5127	* Bush	0.4672
Conifer-broadleaf forest	0.4893	† Herbal	0.3270

Note: * Bush is a collective term for all shrub species; † Herbal is a collective term for all grass species.
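
The carbon ratios in Table A3 convert biomass to carbon by simple multiplication. The sketch below applies them to the tree, shrub, and grass components of one plot and sums the results. Adding the components in this way is an assumption about the aggregation step, and the names `CARBON_RATIO` and `plot_carbon_density` as well as the biomass values are hypothetical.

```python
# Carbon ratios copied from Table A3 for three vegetation components.
CARBON_RATIO = {
    "Eucalyptus spp.": 0.4748,
    "Bush": 0.4672,
    "Herbal": 0.3270,
}


def plot_carbon_density(biomass_by_component: dict) -> float:
    """Sum component biomass (Mg/ha), each weighted by its carbon ratio."""
    return sum(
        CARBON_RATIO[name] * biomass
        for name, biomass in biomass_by_component.items()
    )


# Hypothetical plot: biomass densities in Mg/ha, not measured values.
example_plot = {"Eucalyptus spp.": 40.0, "Bush": 5.0, "Herbal": 1.0}
example_carbon = plot_carbon_density(example_plot)
print(f"Carbon density: {example_carbon:.3f} Mg/ha")
```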

Table A4. Summary of predictor variables used in the analysis. The table lists all predictor variables considered, including spectral bands, vegetation indices, texture features, band ratios, and topographic variables. For each variable, the category, correlation coefficient (r), significance level (p), adjusted p-value (p_adj), and selection status are reported. Vegetation indices (e.g., NDVI, SAVI, EVI, NBR) were derived from multispectral bands following standard formulations reported in the literature. Texture features (e.g., B4_mean, B4_variance, B4_contrast) were calculated from the gray-level co-occurrence matrix (GLCM). Topographic variables (e.g., elevation, slope, TWI, HLI) were generated from the digital elevation model (DEM). Band ratio variables (e.g., TR435, TR517, TR536) were constructed as ratios between specific spectral bands to enhance spectral contrast. This table serves as a reference for the variable definitions and derivation sources mentioned in Sections 2 and 3.

Variable	r	p	p_adj	Category	Selected
NDMI	0.622	3.13 × 10⁻²²	5.54 × 10⁻²¹	Vegetation indices	No
NDWI	−0.622	3.1 × 10⁻²²	5.54 × 10⁻²¹	Vegetation indices	No
ARVI	0.626	1.23 × 10⁻²²	5.54 × 10⁻²¹	Vegetation indices	Yes
NDVI	0.617	7.24 × 10⁻²²	7.84 × 10⁻²¹	Vegetation indices	No
SAVI	0.617	7.40 × 10⁻²²	7.84 × 10⁻²¹	Vegetation indices	No
NBR	0.616	1.01 × 10⁻²¹	8.89 × 10⁻²¹	Vegetation indices	No
MNDVI	−0.133	0.063667	0.071794	Vegetation indices	Yes
EVI	0.015	0.834674	0.834674	Vegetation indices	Yes
Slope	0.389	1.92 × 10⁻⁸	3.09 × 10⁻⁸	Topography	Yes
Relief	0.385	2.78 × 10⁻⁸	4.33 × 10⁻⁸	Topography	No
DEM	0.301	1.92 × 10⁻⁵	2.76 × 10⁻⁵	Topography	Yes
Elevation	0.301	1.92 × 10⁻⁵	2.76 × 10⁻⁵	Topography	No
TWI	−0.289	4.13 × 10⁻⁵	5.77 × 10⁻⁵	Topography	Yes
PotentialSolarRadiation_proxy	−0.286	5.18 × 10⁻⁵	7.04 × 10⁻⁵	Topography	Yes
HLI	−0.260	0.000238	0.000315	Topography	No
Aspect	0.163	0.022986	0.026484	Topography	Yes
Eastness	−0.132	0.065075	0.071854	Topography	Yes
Hillshade	0.098	0.171615	0.185625	Topography	Yes
Curvature	0.068	0.348219	0.369112	Topography	Yes
TPI	−0.060	0.406045	0.421968	Topography	No
Northness	0.027	0.705956	0.719532	Topography	Yes
B4_mean	−0.591	9.66 × 10⁻²⁰	3.2 × 10⁻¹⁹	Texture (GLCM)	Yes
B4_homogeneity	0.250	0.000432	0.000558	Texture (GLCM)	Yes
B4_contrast	−0.222	0.001806	0.002279	Texture (GLCM)	Yes
B4_variance	−0.179	0.01231	0.014498	Texture (GLCM)	Yes
B3	−0.613	1.75 × 10⁻²¹	1.16 × 10⁻²⁰	Spectral bands	No
B4	−0.606	5.83 × 10⁻²¹	3.09 × 10⁻²⁰	Spectral bands	No
B7	−0.598	2.87 × 10⁻²⁰	1.17 × 10⁻¹⁹	Spectral bands	No
B2	−0.594	5.21 × 10⁻²⁰	1.84 × 10⁻¹⁹	Spectral bands	No
B1	−0.589	1.25 × 10⁻¹⁹	3.68 × 10⁻¹⁹	Spectral bands	No
B6	−0.516	1.18 × 10⁻¹⁴	2.31 × 10⁻¹⁴	Spectral bands	No
B5	0.189	0.008207	0.009886	Spectral bands	No
TR435	−0.604	8.55 × 10⁻²¹	4.12 × 10⁻²⁰	Band ratios (TR)	Yes
TR425	−0.586	2.24 × 10⁻¹⁹	6.24 × 10⁻¹⁹	Band ratios (TR)	No
TR415	−0.579	7.5 × 10⁻¹⁹	1.99 × 10⁻¹⁸	Band ratios (TR)	No
TR517	0.576	1.18 × 10⁻¹⁸	2.98 × 10⁻¹⁸	Band ratios (TR)	Yes
TR527	0.574	1.7 × 10⁻¹⁸	4.08 × 10⁻¹⁸	Band ratios (TR)	No
TR537	0.565	8.02 × 10⁻¹⁸	1.85 × 10⁻¹⁷	Band ratios (TR)	No
TR436	−0.554	4.12 × 10⁻¹⁷	9.1 × 10⁻¹⁷	Band ratios (TR)	No
TR426	−0.521	5.6 × 10⁻¹⁵	1.19 × 10⁻¹⁴	Band ratios (TR)	No
TR534	0.504	5.72 × 10⁻¹⁴	1.08 × 10⁻¹³	Band ratios (TR)	Yes
TR416	−0.497	1.54 × 10⁻¹³	2.81 × 10⁻¹³	Band ratios (TR)	No
TR536	0.410	2.57 × 10⁻⁹	4.54 × 10⁻⁹	Band ratios (TR)	Yes
TR546	0.401	6.33 × 10⁻⁹	1.08 × 10⁻⁸	Band ratios (TR)	No
TR516	0.390	1.7 × 10⁻⁸	2.81 × 10⁻⁸	Band ratios (TR)	No
TR526	0.370	1.01 × 10⁻⁷	1.52 × 10⁻⁷	Band ratios (TR)	No
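
Table A4 reports adjusted p-values without naming the correction method in this appendix. One common choice for screening many predictors is the Benjamini-Hochberg procedure, sketched below as an illustration only. Whether it matches the authors' adjustment is an assumption, although the fact that the largest p-value in the table (EVI) is left unchanged is consistent with a step-up procedure of this kind. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values, not taken from Table A4.
example_p = [0.01, 0.04, 0.03, 0.20]
example_adj = benjamini_hochberg(example_p)
print([round(value, 4) for value in example_adj])
```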

Figure 1. Geographic location of Shenzhen and distribution of vegetation sampling plots over high-resolution land use/land cover map.

Figure 2. Schematic design of the National Forest Inventory–based field plot with nested subplots for tree, shrub, and grass measurements.

Figure 3. Model performance under spatial block cross-validation across different block sizes. Panels (1–3) show the distributions of validation performance across the 10 spatial folds for (1) coefficient of determination (R²), (2) root mean square error (RMSE), and (3) mean absolute error (MAE) using block sizes of 2 km, 5 km, and 10 km. Points represent fold-level results, while violin and box plots illustrate the variability across folds. Panel (4) compares mean training and validation R² (± standard deviation) across block sizes, highlighting differences in model generalization performance under increasing spatial separation. Taken together, these results demonstrate that spatial block size exerts a significant influence on both model accuracy and stability. For the heterogeneous urban landscape of Shenzhen, a moderate spatial block size of 5 km provides the most reliable assessment of model performance, effectively mitigating spatial autocorrelation while preserving sufficient training information for robust estimation of UAGVCD. Consequently, the 5 km spatial block configuration was adopted for subsequent model interpretation.

Figure 4. Spatial predictions of UAGVCD under different spatial block sizes. Predicted spatial distribution of UAGVCD in Shenzhen based on XGBoost models trained using spatial block cross-validation with block sizes of (1) 2 km, (2) 5 km, and (3) 10 km.

Figure 5. SHAP-based interpretation of spectral and topographic influences on urban vegetation carbon density. (1) Global feature importance ranked by mean absolute SHAP values. Each point represents the contribution of an individual sample, with features ordered by their overall influence on the XGBoost model. (2) SHAP dependence plots for selected key predictors, colored by elevation (DEM). Black dashed vertical lines mark the median of each predictor as a reference for its central tendency; red dotted lines mark threshold points identified by segmented regression, where the influence of the predictor on model output shifts noticeably. The color gradient shows how the contribution of spectral variables to UAGVCD varies systematically across topographic gradients, highlighting elevation-modulated nonlinear effects.
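
The ranking in panel (1) of Figure 5 rests on a simple statistic: the mean absolute SHAP value of each feature across samples. The sketch below computes that ranking from a small matrix of invented SHAP values. The numbers and the function name are hypothetical; in practice the matrix would come from a SHAP implementation for tree models applied to the fitted XGBoost model.

```python
def rank_by_mean_abs_shap(shap_values, feature_names):
    """Rank features by mean absolute SHAP value across samples."""
    n_samples = len(shap_values)
    scores = {
        name: sum(abs(row[j]) for row in shap_values) / n_samples
        for j, name in enumerate(feature_names)
    }
    # Largest mean |SHAP| first, i.e. most influential feature first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical SHAP matrix (3 samples x 3 features); not the study's values.
features = ["ARVI", "DEM", "Slope"]
shap_matrix = [
    [4.0, -1.0, 0.5],
    [-6.0, 2.0, -0.5],
    [5.0, -3.0, 1.0],
]
ranking = rank_by_mean_abs_shap(shap_matrix, features)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking absolute values before averaging matters because positive and negative contributions would otherwise cancel, understating the influence of features that push predictions in both directions.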

Table 1. Statistical summary of sample plot data utilized for urban vegetation carbon density mapping.

No. of Plots	Minimum (Mg/ha)	Maximum (Mg/ha)	Sample Mean (Mg/ha)	Standard Deviation (Mg/ha)	Coefficient of Variation (%)
195	0	100.67	21.23	22.96	108.16

Table 2. Overall model performance under 10-fold spatial block cross-validation (5 km blocks). Mean and standard deviation (SD) of training and validation metrics, including R², RMSE, and MAE.

Metric	Train_Mean	Train_sd	Valid_Mean	Valid_sd	Train_Mean_sd	Valid_Mean_sd
R²	0.917	0.086	0.617	0.055	0.917 ± 0.086	0.617 ± 0.055
RMSE	5.534	3.967	10.254	1.387	5.534 ± 3.967	10.254 ± 1.387
MAE	3.443	2.406	9.011	1.212	3.443 ± 2.406	9.011 ± 1.212

Table 3. Fold-level training and validation performance of the XGBoost model under spatial block cross-validation. Performance metrics (R², RMSE, MAE) reported for each spatial fold.

Fold	Best_Nrounds	Train_R²	Train_RMSE	Train_MAE	Valid_R²	Valid_RMSE	Valid_MAE
1	28	0.719	12.397	7.581	0.607	10.130	9.496
2	54	0.928	6.168	3.955	0.647	9.826	8.797
3	47	0.913	6.828	4.019	0.696	8.779	8.479
4	41	0.858	8.736	5.368	0.633	9.245	8.736
5	69	0.968	4.152	2.736	0.549	11.849	10.606
6	260	1.000	0.311	0.221	0.544	12.691	10.833
7	587	1.000	0.004	0.003	0.692	8.614	7.095
8	121	0.994	1.732	1.235	0.615	9.779	8.085
9	43	0.883	7.825	4.897	0.636	9.836	7.949
10	46	0.904	7.188	4.422	0.553	11.790	10.037
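
The validation summaries in Table 2 can be recomputed from the fold-level values in Table 3. The short sketch below does so with the Python standard library, using the sample standard deviation (n − 1 denominator). That choice is an assumption about how the SD was computed, although it agrees with the Table 2 validation values to the reported precision. The helper name `summarize` is hypothetical.

```python
from statistics import mean, stdev

# Validation metrics per fold, copied from Table 3 (folds 1 to 10).
valid_r2 = [0.607, 0.647, 0.696, 0.633, 0.549, 0.544, 0.692, 0.615, 0.636, 0.553]
valid_rmse = [10.130, 9.826, 8.779, 9.245, 11.849, 12.691, 8.614, 9.779, 9.836, 11.790]
valid_mae = [9.496, 8.797, 8.479, 8.736, 10.606, 10.833, 7.095, 8.085, 7.949, 10.037]


def summarize(values):
    """Return (mean, sample standard deviation) of fold-level values."""
    return mean(values), stdev(values)


r2_mean, r2_sd = summarize(valid_r2)
rmse_mean, rmse_sd = summarize(valid_rmse)
mae_mean, mae_sd = summarize(valid_mae)

print(f"R2:   {r2_mean:.3f} ± {r2_sd:.3f}")
print(f"RMSE: {rmse_mean:.3f} ± {rmse_sd:.3f}")
print(f"MAE:  {mae_mean:.3f} ± {mae_sd:.3f}")
```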

Table 4. Summary of XGBoost model performance under different spatial block sizes using spatial block cross-validation. Values are reported as mean ± standard deviation across 10 spatial folds. Performance metrics include the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE) for both training and validation datasets.

Block_km	n_Folds	Train_R²	Valid_R²	Train_RMSE	Valid_RMSE	Train_MAE	Valid_MAE
2 km	10	0.902 ± 0.137	0.604 ± 0.109	5.02 ± 5.55	13.66 ± 2.26	3.12 ± 3.36	9.27 ± 2.08
5 km	10	0.917 ± 0.086	0.617 ± 0.055	5.53 ± 3.97	10.25 ± 1.39	3.44 ± 2.41	9.01 ± 1.21
10 km	10	0.837 ± 0.195	0.380 ± 0.297	7.69 ± 5.65	15.41 ± 7.35	4.84 ± 3.54	11.24 ± 5.66
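
The block sizes compared in Table 4 control how sample plots are grouped before folds are formed. The sketch below shows one simple way to do this: each plot is assigned to a square grid cell of the chosen size, and whole cells are then dealt into folds so that plots in the same block never straddle training and validation. This is an illustration under stated assumptions, not the authors' implementation. The function name, the random plot coordinates, the 80 km × 40 km extent, and the round-robin assignment are all hypothetical; only the plot count (195) and the 5 km block size come from the study.

```python
import random


def assign_spatial_folds(coords, block_size_m, n_folds=10, seed=0):
    """Assign each (x, y) point to a fold so that a block never spans folds."""
    # Block id = integer grid cell that contains the point.
    block_ids = [
        (int(x // block_size_m), int(y // block_size_m)) for x, y in coords
    ]
    unique_blocks = sorted(set(block_ids))
    random.Random(seed).shuffle(unique_blocks)
    # Deal the shuffled blocks round-robin into folds (an assumed scheme).
    block_to_fold = {block: i % n_folds for i, block in enumerate(unique_blocks)}
    folds = [block_to_fold[block] for block in block_ids]
    return block_ids, folds


# Hypothetical projected coordinates in metres for 195 plots.
rng = random.Random(42)
plots = [(rng.uniform(0, 80_000), rng.uniform(0, 40_000)) for _ in range(195)]
blocks, folds = assign_spatial_folds(plots, block_size_m=5_000)
print(f"{len(set(blocks))} occupied blocks dealt into {len(set(folds))} folds")
```

Larger blocks leave fewer distinct groups to distribute, so individual folds can differ more in size and composition. This is one plausible reason why fold-to-fold variability increases at 10 km in Table 4.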

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Qie, G.; Wang, M.; Wang, G. Estimation of Urban Above-Ground Vegetation Carbon Density and Analysis of Topography-Modulated Spectral Responses in Shenzhen, China. Remote Sens. 2026, 18, 807. https://doi.org/10.3390/rs18050807

