Article

High-Resolution Mapping and Impact Assessment of Forest Aboveground Carbon Stock in the Pinglu Canal Basin: A Multi-Sensor and Multi-Model Machine Learning Approach

1 School of Resources, Environment and Materials, Guangxi University, Nanning 530004, China
2 School of Marine Sciences, Guangxi University, Nanning 530004, China
3 Institute of Green and Low Carbon Technology, Guangxi Institute of Industrial Technology, Nanning 530200, China
* Author to whom correspondence should be addressed.
Forests 2025, 16(7), 1130; https://doi.org/10.3390/f16071130
Submission received: 12 June 2025 / Revised: 2 July 2025 / Accepted: 7 July 2025 / Published: 9 July 2025
(This article belongs to the Special Issue Forest Inventory: The Monitoring of Biomass and Carbon Stocks)

Abstract

Accurate estimation of forest aboveground carbon stock (AGC) is critical for climate change mitigation and ecological management. This study develops a high-resolution AGC estimation workflow for the Pinglu Canal basin, integrating Sentinel-2, Sentinel-1, ALOS PALSAR, and SRTM data with field survey measurements. Feature selection via Recursive Feature Elimination and modeling with a Random Forest algorithm, optimized through hyperparameter tuning, yielded high predictive accuracy under the ALL data combination (R² = 0.818, RMSE = 11.126 tC/ha), enabling the generation of a 10 m-resolution AGC map. The total AGC in 2024 was estimated at 2.26 × 10⁶ tC. To evaluate human-induced changes, we established a baseline scenario based on historical AGC trends (2002–2021) and climate data. Comparisons revealed that afforestation and vegetation restoration during canal construction led to higher AGC values than projected under natural conditions. This positive deviation highlights the effectiveness of targeted ecological interventions in mitigating carbon loss and promoting forest recovery. Our results demonstrate a cost-effective, scalable method for AGC mapping using freely accessible remote sensing data and machine learning. The findings also provide insights into balancing large-scale infrastructure development with ecosystem conservation.

1. Introduction

The current climate and biodiversity crises pose significant risks to ecosystems and human societies [1]. As highlighted by the IPCC and the Paris Agreement, forests are vital for maintaining ecosystem stability and mitigating climate change [2]. Forests, the primary carbon sinks within terrestrial ecosystems, are pivotal to the global carbon cycle due to their ability to absorb and retain carbon, thereby contributing significantly to climate regulation [3,4]. Therefore, precise monitoring of forests is crucial for developing informed climate policies, safeguarding biodiversity, and ensuring sustainable development [5,6]. In particular, understanding the spatial dynamics of forest carbon stock and evaluating its responses to large-scale anthropogenic disturbances—such as major infrastructure development—has become increasingly important in both ecological research and environmental management.
China, recognizing the urgency of climate change, has set dual carbon goals: peaking carbon emissions by 2030 and achieving carbon neutrality by 2060. To meet these targets, China has implemented comprehensive strategies, including the promotion of renewable energy, emissions reduction, and enhanced carbon sequestration through forest conservation and afforestation programs [7]. China’s forests, covering approximately 23% of the country’s land area, are fundamental to carbon sequestration and ecosystem stability, contributing significantly to the nation’s dual carbon goals [8]. In particular, Guangxi province, with its diverse and abundant forest resources, is a key region in supporting these national objectives through sustainable forest management and carbon storage efforts [9]. At the same time, the region is undertaking the construction of the Pinglu Canal, a major infrastructure project designed to enhance water transportation efficiency and stimulate regional economic growth. However, despite the anticipated economic and logistical benefits, the project is likely to exert considerable pressure on the ecosystems along its route. The Environmental Impact Assessment results indicate that the canal construction project will occupy 3276.7 ha of forest, with a permanently occupied area of 987.12 ha. Consequently, it is essential to carefully evaluate and mitigate the environmental impacts, particularly on the forests within the canal’s watershed. Integrating conservation efforts into the planning and implementation phases will be crucial to minimizing the ecological footprint of this development [10].
Currently, research on the canal basin ecosystems primarily focuses on the Panama Canal [11,12,13], which is in operation. Monitoring projects conducted by the Smithsonian Tropical Research Institute and the Panama Environmental Authority indicate that the forests in the Panama Canal watershed play a vital role in biodiversity and water resource protection. However, these ecosystems are facing increasing pressures due to human activities [14]. Moreover, the forests in the Panama Canal watershed are closely linked to canal operations, and reforestation affects ecosystem services in complex ways. Under different conditions, reforestation has both positive impacts and trade-offs concerning water flow, carbon sequestration, and timber production [15]. Additionally, the Panama Canal provides valuable reference points for forest ecosystem protection measures during the construction phase of large-scale infrastructure projects. During the early stages of the canal’s construction, the United States, for security and water supply reasons, protected the forests along a 660 km corridor on both sides of the canal. Today, after the Panama Canal was handed over to Panama, the surrounding forests remain intact and have been incorporated into national parks and nature reserves [16]. This experience provides valuable lessons for the development of new canals, demonstrating the potential for achieving a balance between environmental and social benefits. Despite the valuable insights from existing studies, research on the ecological impacts during the construction phase of canal projects remains limited. This gap is particularly evident in understanding how large-scale infrastructure affects forest ecosystems in both the short and the long term. As such, targeted research is urgently needed to fill this knowledge gap and support more informed environmental management in future canal projects.
Reliable quantification of aboveground carbon stock (AGC) in forests is critical for evaluating ecosystem functioning and informing strategies for climate change mitigation [17]. Traditional methods based on field inventories and biomass models offer high precision but are constrained by limited spatial coverage and high labor costs [18]. With advances in remote sensing, satellite-based AGC estimation has become a practical solution for large-scale monitoring [19]. Optical and Synthetic Aperture Radar (SAR) data provide complementary information: optical sensors capture canopy structure in clear conditions, while SAR enables observations under clouds and in complex terrain [20,21,22,23]. However, single-source data face limitations. Optical sensors suffer from cloud contamination and canopy saturation, while SAR is prone to signal saturation and environmental sensitivity to factors like precipitation and soil moisture [24]. Integrating data from multiple sources has proven to be an effective approach for addressing these challenges. Integrating spectral, structural, and topographic features from multiple sensors significantly improves AGC estimation accuracy and robustness, especially in heterogeneous landscapes [25,26]. This approach has shown strong applicability in complex ecological settings [27,28]. Modeling methods have also evolved. Linear regression is widely used but often fails to capture nonlinear relationships between variables and AGC [29,30]. Machine learning models—such as Support Vector Regression (SVR), Random Forest (RF), and Extreme Gradient Boosting (XGBoost)—offer enhanced performance, particularly when combined with feature selection and hyperparameter tuning [31,32,33]. Accurate spatial alignment between remote sensing pixels and field plots is also critical to reduce uncertainty and improve model reliability [34]. 
In summary, combining multi-sensor remote sensing data with robust machine learning models provides a scalable and reliable framework for accurate AGC estimation. Recent research has confirmed that combining field inventory data with machine learning algorithms is a feasible approach for estimating AGC across different spatial scales [35,36,37].
To address the research gap concerning the effects of canal construction on forest ecosystems during the construction period, this study employs multi-source remote sensing data combined with machine learning methods to estimate forest AGC in the Pinglu Canal basin for the year 2024 at a 10 m resolution. Additionally, historical data are utilized to assess the construction-related impacts on forest dynamics within the basin. The objectives of this study are (1) to combine tree-level measurements with remote sensing data at a fine scale to obtain accurate pixel-level AGC datasets; (2) to use Recursive Feature Elimination (RFE) for feature selection and assess the effectiveness of different data types or combinations of data in AGC modeling predictions; (3) to optimize the parameters of RF and XGBoost models through grid search and evaluate their estimation capabilities for AGC in comparison with Multiple Linear Regression (MLR) and SVR, and then use the best-performing model to predict AGC and produce a spatial distribution map of AGC at a 10 m resolution for the Pinglu Canal basin; and (4) to analyze the impact of the canal construction on the forest ecosystem by examining the interannual variations in forest area and AGC. This study represents the first attempt to estimate forest AGC in the Pinglu Canal basin using high-resolution remote sensing data, while also examining the ecological impacts of the canal by analyzing temporal changes in forest area and AGC. It provides valuable insights into the relationship between large-scale infrastructure projects and forest ecosystems, contributing to the understanding of how canal construction affects forest AGC.

2. Materials and Methods

2.1. Study Area

The Pinglu Canal, spanning a total length of 135 km, is located in Guangxi Province, China (Figure 1). The canal basin experiences an annual average rainfall exceeding 1800 mm and a mean temperature of 21.7 °C, reflecting the characteristics of a subtropical monsoon climate. The topography is highly diverse, ranging from low-lying plains to mountainous regions, with elevations spanning from −2 m to 371 m above sea level. This varied terrain supports a range of ecosystems, including forests that play a critical role in storing carbon and maintaining ecological stability and that consist primarily of evergreen broadleaf forests and mixed coniferous forests. Notably, plantations are the predominant forest type in the basin, featuring key species such as Chinese fir (Cunninghamia lanceolata), eucalyptus (Eucalyptus spp.), masson pine (Pinus massoniana), and slash pine (Pinus elliottii).
The Pinglu Canal project, a significant waterway construction initiative in southern China, has substantially transformed local land use and ecological patterns. This study aims to assess the potential impacts of canal construction on forest ecosystems, with a particular focus on the dynamics of forest AGC.

2.2. Data and Methodology

To obtain an accurate spatial distribution of forest carbon stock in the Pinglu Canal basin for 2024, a systematic workflow was developed, integrating field survey data, remote sensing data preprocessing, feature selection, and machine learning-based modeling. This workflow combines multi-source data, including SAR, optical data, and terrain data, to extract essential variables such as backscatter coefficients, vegetation indices, texture features, and topographic parameters. The field survey data were transformed into pixel-level AGC values, which were then combined with remote sensing-derived features for model training. RFE was employed to select the most relevant predictors, which were subsequently used in a Random Forest regression framework. The final model was validated against field survey data to ensure accuracy, leading to the generation of a comprehensive forest AGC map for the Pinglu Canal basin in 2024. The workflow is illustrated in Figure 2, providing a detailed representation of the methodological approach adopted in this study.

2.2.1. Field Sampling and Plot Inventory

To obtain field data on forest carbon stock within the study area, we conducted a plot survey in four regions of the watershed, covering the area from the upstream of the canal to the estuary, from September to October 2024. Prior to the field survey, we selected representative vegetation types based on imagery and forestry reports. A survey plan was then formulated to ensure that the selected sample plots represented the full spectrum of vegetation diversity across the study area. Ultimately, due to practical considerations such as road accessibility, we established 37 square sample plots, each measuring 25 m × 25 m. Trees with a diameter at breast height (DBH) ≥ 5 cm were measured, and the surveyed characteristics included the DBH, tree height, and species of all trees within the plots. In addition, the coordinates of each individual tree within the sample plots were recorded using a handheld Real-Time Kinematic (RTK) device. These high-precision spatial data were used to accurately locate each tree, enabling detailed analysis and integration with remote sensing data for subsequent AGC estimation.
Based on the field survey, a total of 4038 trees were recorded across all sample plots. Among these, the dominant tree species was eucalyptus, with 3069 individuals, followed by slash pine (Pinus elliottii), with 610 individuals; masson pine (Pinus massoniana), with 149 individuals; and other species, accounting for 210 individuals. This species composition reflects the diversity of vegetation within the study area and serves as a critical foundation for constructing accurate AGC models. These data were further processed and integrated with remote sensing features to derive pixel-level AGC estimates for the study region.

2.2.2. Pixel-Level AGC Dataset Development

Accurately estimating forest AGC at the pixel level requires the precise integration of field data and remote sensing imagery. To better capture spatial heterogeneity within sample plots, we adopted and refined a method that integrates high-precision tree-level coordinate data with the Sentinel-2 pixel grid. While similar methods have been applied in forest structure inversion and individual tree segmentation studies based on UAV data [38,39], this study advances the approach by incorporating tree-level RTK data, handling missing pixels, and ensuring alignment with the spatial resolution of Sentinel-2 imagery. These improvements enhance the application of this method in integrating satellite pixel data with field plot measurements.
This methodology provides a more accurate and robust framework for generating pixel-level AGC datasets. The precise tree-level coordinates collected during the field survey were used to delineate the boundaries of each sample plot, which were then mapped onto the Sentinel-2 pixel grid (10 m × 10 m). Subsequently, pixels within the boundaries of each sample plot were extracted, while pixels with missing data were excluded [40] to ensure the completeness and accuracy of the training dataset (Figure 3).
Building upon this framework, the next step was to calculate the AGC for each selected pixel. Using species-specific allometric equations, the aboveground biomass (AGB) of each tree was computed as follows:
$$\mathrm{AGB} = a \times \mathrm{DBH}^{b} \times H^{c}$$
where $a$, $b$, and $c$ are species-specific parameters derived from allometric models; $\mathrm{DBH}$ is the diameter at breast height (cm); and $H$ is the tree height (m).
The AGC of each individual tree was then calculated by applying a carbon conversion factor (CF), which quantifies the proportion of biomass attributable to carbon:
$$\mathrm{AGC(tree)} = \mathrm{AGB} \times \mathrm{CF}$$
Typically, CF values range between 0.47 and 0.50. In this study, to achieve more accurate forest AGC distribution estimates, the CF for all sampled tree species was obtained from national standards. Tree species without specified CF values in the standards were categorized into soft broadleaf and hard broadleaf forest types, and the corresponding CF values were applied.
Finally, the total AGC for each pixel was derived by aggregating the AGC of all individual trees within the corresponding plot boundary:
$$\mathrm{Total\ AGC(pixel)} = \sum_{i=1}^{n} \mathrm{AGC}_{i}$$
where $n$ is the number of trees within the plot, and $\mathrm{AGC}_{i}$ is the AGC of the $i$-th tree.
This methodological approach enabled the integration of fine-scale, tree-level measurements with remote sensing data, ensuring that the resulting pixel-level AGC estimates were accurate and compatible with the spatial resolution of multi-source remote sensing data. However, it should be noted that this approach may still be subject to certain limitations, such as data loss caused by mixed pixels or imperfect alignment between ground plots and remote sensing grids, which are further discussed in Section 4.
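The three equations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the allometric coefficients and carbon fractions in `ALLOMETRY` and `CARBON_FRACTION` are hypothetical placeholders (not the values from the national standards used in the study), the helper names are invented for this example, and assigning each tree to a 10 m pixel from its RTK coordinates is one plausible reading of how the tree-level data map onto the Sentinel-2 grid.

```python
import math
from collections import defaultdict

# Hypothetical (a, b, c) coefficients and carbon fractions; placeholders only.
ALLOMETRY = {
    "eucalyptus": (0.05, 2.0, 0.9),
    "slash_pine": (0.04, 2.0, 1.0),
}
CARBON_FRACTION = {"eucalyptus": 0.47, "slash_pine": 0.50}

PIXEL_SIZE = 10.0  # Sentinel-2 pixel edge length in metres


def tree_agb(species, dbh_cm, height_m):
    """AGB = a * DBH^b * H^c for one tree."""
    a, b, c = ALLOMETRY[species]
    return a * dbh_cm ** b * height_m ** c


def tree_agc(species, dbh_cm, height_m):
    """AGC(tree) = AGB * CF for one tree."""
    return tree_agb(species, dbh_cm, height_m) * CARBON_FRACTION[species]


def pixel_index(x, y, origin_x, origin_y):
    """Return (row, col) of the pixel containing (x, y).

    The origin is assumed to be the upper-left corner of the pixel grid,
    in the same projected coordinate system (metres) as the tree positions.
    """
    col = math.floor((x - origin_x) / PIXEL_SIZE)
    row = math.floor((origin_y - y) / PIXEL_SIZE)
    return row, col


def agc_per_pixel(trees, origin_x, origin_y):
    """Sum tree-level AGC over all trees that fall in the same pixel."""
    totals = defaultdict(float)
    for t in trees:
        key = pixel_index(t["x"], t["y"], origin_x, origin_y)
        totals[key] += tree_agc(t["species"], t["dbh"], t["height"])
    return dict(totals)
```

Converting the per-pixel sums to the tC/ha values reported elsewhere in the paper would additionally require the biomass unit of the allometric model and the pixel area (100 m², i.e., 0.01 ha); that step is left out here because the source does not state the units of the allometric output.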

2.2.3. Remote Sensing Data Preprocessing

1. Optical Data
Optical data can be used to estimate forest AGC because they capture key spectral information related to vegetation structure, canopy characteristics, and biochemical properties, which are closely correlated with biomass. The free accessibility, global coverage, high temporal and spatial resolution, and long temporal scale of Sentinel-2 data make them a valuable resource for monitoring forest carbon dynamics over time. The dataset used in this study is the Sentinel-2 L2A surface reflectance product, which has undergone atmospheric correction and orthorectification to provide high-quality, geometrically aligned surface reflectance data. To account for the effects of cloud cover, we used imagery acquired on 31 October 2024, which closely aligns with the timing of field surveys. In this study, we utilized bands with a 10 m resolution (blue, green, red, and near-infrared 1) and a 20 m resolution (red edge 1, red edge 2, red edge 3, near-infrared 2, shortwave infrared 1, and shortwave infrared 2). All bands were resampled to a 10 m resolution using the Sen2Res processor in SNAP 11.0.0. Furthermore, previous studies have demonstrated that spectral vegetation indices carry significant information for biomass estimation. Based on the processed Sentinel-2 imagery, we calculated 18 commonly used vegetation indices for biomass estimation (Table 1).
2. SAR Data
SAR data, renowned for their capability to penetrate vegetation and capture structural information under all-weather conditions, represent a critical resource for forest AGC estimation. In this study, SAR datasets from Sentinel-1 and ALOS-2 PALSAR-2 were employed. To ensure consistency with Sentinel-2 data, the Sentinel-1 and PALSAR data were resampled to a 10 m spatial resolution. Sentinel-1 data, consisting of two scenes corresponding to the field sampling period, were acquired through the European Space Agency data portal. The data underwent preprocessing in SNAP 11.0.0 to derive VV and VH polarized backscatter coefficients, expressed in decibels (dB). Subsequently, five dual-polarization indices were computed based on these backscatter coefficients. Additionally, a suite of texture features, including mean, variance, homogeneity, contrast, dissimilarity, entropy, second moment, and correlation, was extracted using the gray-level co-occurrence matrix method with a 7 × 7 pixel window. Due to maintenance on the official platform, PALSAR data were obtained from Google Earth Engine (GEE). The imagery acquisition date was selected as the closest available scene covering the Pinglu Canal Basin to the field sampling period (30 August). The dataset provided HH and HV polarized backscatter coefficients directly. Utilizing these coefficients, the same five dual-polarization indices were calculated, and the same set of texture features was extracted. The integration of these polarization indices and texture features offered detailed insights into vegetation structure and heterogeneity, significantly enhancing the precision of forest AGC estimation across the study area.
3. Topographic Data
Topographic factors, including elevation and terrain variability, are essential in forest AGC estimation due to their impact on vegetation distribution, growth conditions, and biomass accumulation. In this study, we utilized the Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM), which provides global elevation data at a spatial resolution of 30 m. The SRTM DEM has been widely validated for its high accuracy and reliability in representing topographic features. To ensure consistency with other datasets, the SRTM data were resampled to a 10 m resolution. Furthermore, topographic variables such as slope, aspect, and curvature were derived from the resampled DEM and incorporated into the analysis to account for the effects of terrain on forest biomass distribution.
4. Forest Mask
The forest mask is a critical element for understanding the spatial distribution and dynamic changes of forest cover, playing a key role in AGC estimation, biodiversity assessment, and the monitoring of land use changes. In this study, the forest mask was generated using the deep learning-based Dual-Net framework proposed by Liu et al. [41]. This algorithm integrates dual-date Sentinel-2 imagery, multimodal information fusion (which combines spatial and temporal features), and low-level attention constraints to enhance classification accuracy. By effectively capturing spectral, textural, and temporal characteristics, it enables precise differentiation of land cover types. The classification results produced by this algorithm outperform those of commonly used land cover products, achieving superior accuracy and enabling the generation of more refined and precise forest masks.
In this study, land cover classification for the Pinglu Canal basin from 2021 to 2024 was conducted using the Dual-Net framework. This framework, based on Sentinel-2 imagery, was implemented on the AI Earth platform developed by Alibaba DAMO Academy [42]. The classification outputs were further processed to extract a forest mask. Additionally, the mangrove areas within the mask were removed using the mangrove identification algorithm proposed by Zhao et al. [43] in their study, with the corresponding code made available as open-source on GEE. This approach accurately delineated forested and non-forested areas, providing a reliable dataset for subsequent analyses of forest carbon stock and the spatiotemporal dynamics of forest cover.
5. Historical Forest AGC Data
This study utilizes the forest AGC data for China from 2002 to 2021, provided by Chen et al. [44]. This dataset was generated at a 1 km resolution by integrating multiple remote sensing observations, such as SAR backscatter, optical data, and passive microwave data, with extensive field measurements and machine learning models. The dataset includes both aboveground and belowground carbon pools, offering comprehensive and spatially explicit information on forest carbon stock changes over the past two decades.
The AGC data were calibrated against field measurements to ensure high accuracy and extended temporally using regression models incorporating vegetation cover and other biophysical predictors. This dataset provides a reliable foundation for understanding long-term trends in forest carbon dynamics, including regional differences in forest biomass allocation and carbon sequestration. It, together with historical temperature and precipitation data obtained from the ERA5_Land dataset on GEE, was instrumental in establishing a historical baseline for analyzing the forest AGC in the Pinglu Canal basin.

2.2.4. Feature Selection

Selecting appropriate variables is crucial for constructing AGC models. Most initial variables are chosen based on previous research findings and relevant academic experience, followed by a summary of variables that may be effective for modeling. However, due to differences in geographical and vegetation characteristics, the variables used for AGC modeling often vary between different study areas. Meanwhile, in the context of machine learning, the variable selection process is expected to improve the efficiency of the modeling process and enhance the robustness of prediction accuracy [45]. Feature selection not only helps reduce the complexity of the model by eliminating irrelevant or redundant variables but also mitigates the risk of overfitting, particularly when working with high-dimensional datasets.
In this study, a feature-level data fusion strategy was applied before feature selection to integrate multi-source remote sensing inputs. Spectral indices (e.g., NDVI, EVI) from Sentinel-2, radar backscatter metrics (e.g., VV, VH, entropy, second moment) from Sentinel-1 and ALOS PALSAR-2, and topographic features (e.g., elevation, slope, curvature) from SRTM were extracted as primary variables.
Prior to fusion, all layers underwent radiometric correction, geometric alignment, and spatial resampling to a unified 10 m resolution using nearest-neighbor interpolation, ensuring pixel-level co-registration across data sources. After preprocessing, variables were min–max normalized to ensure comparability and minimize scaling bias [46]. These standardized features were then stacked at the pixel level to form a unified multivariate feature set. This approach allows for the synergistic use of spectral, structural, and topographic information, capturing the complementary characteristics of each sensor type.
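As a rough illustration of the normalization and stacking step, the sketch below min–max scales each co-registered layer to [0, 1] and stacks the results into a single feature cube. It assumes the layers have already been resampled to the same 10 m grid and are available as NumPy arrays; the function names and the layer names in the usage note are hypothetical.

```python
import numpy as np


def min_max_normalize(layer):
    """Scale a 2-D layer to [0, 1], ignoring NaN (no-data) pixels."""
    lo = np.nanmin(layer)
    hi = np.nanmax(layer)
    if hi == lo:
        # A constant layer carries no information; return zeros.
        return np.zeros_like(layer, dtype=float)
    return (layer - lo) / (hi - lo)


def stack_features(layers):
    """Normalize each layer and stack into shape (rows, cols, n_features).

    `layers` maps a feature name to a 2-D array; all arrays must share
    the same shape, i.e., be co-registered on one pixel grid.
    """
    shapes = {arr.shape for arr in layers.values()}
    if len(shapes) != 1:
        raise ValueError("all layers must share one grid shape")
    names = list(layers)
    cube = np.stack(
        [min_max_normalize(layers[name].astype(float)) for name in names],
        axis=-1,
    )
    return names, cube
```

For example, `stack_features({"ndvi": ndvi, "vv_db": vv_db, "slope": slope})` would return the feature names together with a cube whose last axis indexes the three normalized variables.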
Unlike raw stacking, the fusion process in this study emphasizes cross-source consistency and physical interpretability of features. Similar feature-level fusion strategies have been widely applied and validated in previous studies for enhancing forest biomass and carbon estimation accuracy [28,40,47,48].
To achieve these benefits, this study employed the RFE method to perform variable selection. In the RFE process, an external estimator is provided to assign weights to the features [49]. The estimator is first trained on the initial feature set to determine the importance of each feature. Subsequently, the least important feature is removed from the current feature set. This process is recursively repeated on the pruned feature set to perform feature selection. In this study, RF was used as the feature estimator [50]. RF is highly appropriate for this application, given its capacity to model complex non-linear relationships and variable interactions, along with its resistance to multicollinearity issues. The number of selected features was determined through automated tuning via cross-validation. The optimization and comparison of feature subsets were based on the coefficient of determination (R²) and root mean square error (RMSE) obtained from ten-fold cross-validation.
By systematically identifying the most important features, this approach ensures that the final model is not only accurate but also computationally efficient. Additionally, focusing on the most relevant variables allows for better generalization to unseen data, further enhancing the applicability and reliability of the AGC estimation model across different study areas.
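The RFE procedure described above can be approximated with scikit-learn's `RFECV`, which removes the least important feature at each step and uses cross-validation to choose how many features to keep. The sketch below is an assumption about tooling, not the study's actual code: the source does not name the library used, and the forest size, random seed, and synthetic demonstration data are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFECV
from sklearn.model_selection import KFold


def select_features(X, y, feature_names, n_splits=10, seed=0):
    """Recursive feature elimination with an RF estimator and k-fold CV."""
    estimator = RandomForestRegressor(n_estimators=50, random_state=seed)
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    selector = RFECV(
        estimator,
        step=1,                    # drop one feature per iteration
        cv=cv,
        scoring="r2",              # subsets are compared by cross-validated R^2
        min_features_to_select=1,
    )
    selector.fit(X, y)
    # support_ is a boolean mask over the input features.
    return [name for name, keep in zip(feature_names, selector.support_) if keep]


# Synthetic demonstration: only the first two of six features drive the target.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(80, 6))
y_demo = 3.0 * X_demo[:, 0] + 2.0 * X_demo[:, 1] + rng.normal(scale=0.1, size=80)
names_demo = [f"f{i}" for i in range(6)]
selected_demo = select_features(X_demo, y_demo, names_demo)
```

Passing `scoring="neg_root_mean_squared_error"` instead would rank subsets by RMSE, the second criterion mentioned in the text.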

2.2.5. Machine Learning Algorithms

To evaluate the suitability and predictive performance of different modeling approaches for AGC estimation, four representative machine learning algorithms were tested:
MLR is a fundamental statistical technique used to characterize the relationship between a dependent variable and several independent predictors [51]. It assumes that the dependent variable is a linear combination of the predictors. MLR is computationally efficient and interpretable, making it useful for initial analysis when relationships are expected to be linear. However, it may not capture complex nonlinearities, which limits its performance in more intricate datasets.
SVR extends the concept of Support Vector Machines to regression tasks [52]. SVR aims to find a function that fits the data while allowing for some error within a specified margin. By using kernel functions, such as the radial basis function kernel, SVR can effectively capture nonlinear relationships. This makes SVR particularly suitable for modeling complex patterns in high-dimensional data, as is common in remote sensing applications.
RF is an ensemble learning method that constructs multiple decision trees and aggregates their outputs to improve prediction accuracy and reduce overfitting. It handles high-dimensional and nonlinear data effectively and is robust to noise and multicollinearity. RF has been widely applied in remote sensing studies due to its interpretability, resistance to overfitting, and ability to estimate variable importance.
XGBoost (Extreme Gradient Boosting) is an ensemble algorithm based on gradient boosting decision trees [53]. It builds models sequentially, with each new tree correcting errors made by the previous ones. XGBoost is known for its high predictive performance, regularization capabilities, and computational efficiency. It has gained popularity in environmental modeling tasks for its ability to handle complex feature interactions and large datasets.
These methods represent both linear and nonlinear regression techniques commonly used in remote sensing and environmental modeling. Each algorithm was selected and handled based on its characteristics and typical usage in similar tasks [54].
MLR, as a classical linear baseline, does not involve hyperparameters, but RFE was applied beforehand to address multicollinearity and improve model robustness. SVR was implemented with a radial basis function kernel, commonly used for capturing nonlinear patterns [55]. Grid search was performed to select the penalty parameter C ∈ {0.1, 1, 10, 100} and kernel coefficient γ ∈ {0.01, 0.1, 1}, based on prior studies and empirical testing.
For the tree-based ensemble methods, RF and XGBoost, hyperparameter tuning was essential due to their flexible structure and potential to overfit [36]. For RF, the following parameters were tuned via grid search: number of trees (n_estimators ∈ {100, 200, 300}), maximum tree depth (max_depth ∈ {3, 6, 9}), tree depth (max_depth ∈ {10, 20, None}), minimum samples required at a leaf node (min_samples_leaf ∈ {1, 2, 3, 4, 5}), minimum samples to split a node (min_samples_split ∈ {2, 4, 6, 8, 10}), and number of features considered at each split (max_features ∈ {‘auto’, ‘sqrt’, ‘log2’}). Similarly, XGBoost was tuned over key parameters: learning rate (eta ∈ {0.01, 0.05, 0.1}), number of boosting rounds (n_estimators ∈ {100, 200, 300}), tree depth (max_depth ∈ {1, 2, 3, 4, 5, 6, 7, 8, 9}), and subsampling ratio (subsample ∈ {0.6, 0.8, 1.0}). These configurations allowed for a comprehensive search of the model space, enhancing both prediction accuracy and generalizability [56].
All models were trained and validated using ten-fold cross-validation, which partitions the dataset into ten equal parts and iteratively uses one for validation and the rest for training [57]. This approach helps reduce overfitting and ensures a fair and robust comparison across models. Performance was evaluated using R2 and RMSE.
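The partitioning step can be sketched as follows. This is a generic shuffled k-fold split written for illustration; the function name, the seed, and the sample count of 158 are assumptions and do not reproduce the authors' implementation.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        start += size
        yield train_idx, val_idx


# Illustrative sample count; every sample is used for validation exactly once.
folds = list(k_fold_indices(n_samples=158, k=10))
```

When the sample count is not divisible by ten, the folds cannot be exactly equal, so the sketch lets fold sizes differ by one sample.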
Among all tested algorithms, the RF model—combined with RFE and hyperparameter tuning—demonstrated the best predictive performance and was therefore selected for high-resolution, pixel-level AGC mapping. This selection is supported by the quantitative comparison presented in Section 3.2, where RF achieved the highest R2 and lowest RMSE among all models.

2.2.6. Model Evaluation

To evaluate the performance of the selected features and machine learning models, we split the dataset into a training set (70%) and a test set (30%). Ten-fold cross-validation was used to tune model hyperparameters and to guard against overfitting. The models were assessed using two primary metrics: RMSE and R2. RMSE measures the typical magnitude of the prediction error, with lower values indicating better model performance. R2 quantifies the proportion of variance in the target variable explained by the model, with higher values indicating a better fit. RMSE and R2 are defined as follows:
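$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$
where $n$ is the observed sample size, $y_i$ is the observed value for observation $i$, $\hat{y}_i$ is the predicted value for observation $i$, and $\bar{y}$ is the mean of the observed values.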
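The two metrics defined above can be computed directly from paired observed and predicted values. The sketch below uses only the standard library; the four sample values are made-up placeholders (in tC/ha), not data from this study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


# Placeholder values for illustration only.
observed = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 28.0, 41.0, 49.0]
print(rmse(observed, predicted), r_squared(observed, predicted))
```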

3. Results

3.1. Data Collection Comparison and Feature Selection

This study employed RFE to identify the most relevant input variables and utilized ten-fold cross-validation to evaluate the prediction performance across different feature set combinations. The input variables were derived from three main sources: optical data (Sentinel-2), Synthetic Aperture Radar imagery (Sentinel-1 and PALSAR), and topographic attributes (SRTM). Random Forest was used as the modeling framework to assess the effectiveness of individual and fused datasets, both prior to and following feature selection.
For models trained on individual datasets, Sentinel-1 (S1) outperformed the others in estimation accuracy, followed by Sentinel-2 (S2). PALSAR and SRTM yielded comparatively lower accuracy, which may be attributed to their coarser spatial resolution and the mismatch in acquisition time between the remote sensing imagery and field data collection. Among the pairwise data combinations, the S1 and SRTM duo produced the best results, followed by the S1 and S2 combination. When three datasets were used, the integration of S2, S1, and SRTM showed the most reliable performance. Finally, when all four datasets were combined (ALL), the accuracy was slightly higher than the S2 + S1 + SRTM configuration, making it the best-performing combination overall. These findings indicate that incorporating a greater diversity of data sources generally enhances estimation accuracy.
Table 2 reveals that removing S1 data from any two- or three-source combination led to a substantial decline in accuracy. This emphasizes the unique contribution of S1, which offers the ability to penetrate cloud cover and forest canopies, capturing critical structural details such as canopy density and vegetation type. Such structural insights are vital for AGC estimation, particularly in ecologically complex or topographically varied regions. Furthermore, S1 complements other datasets by supplying information that cannot be obtained through optical or topographic sources alone, thus strengthening model robustness.
Post-RFE, the number of retained features differed significantly across data combinations. However, there was no clear linear relationship between the number of features and the overall model performance. Although feature reduction varied, the relative ranking of each combination in terms of prediction accuracy remained stable. As detailed in Table 2, the combinations of S2 + S1 + PALSAR, S2 + S1 + SRTM, and ALL yielded favorable results, with the ALL set producing the highest accuracy and the fewest features. When model accuracy was comparable, the dataset with fewer selected features offered greater computational efficiency in prediction tasks.
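The elimination loop behind RFE can be illustrated with a small, self-contained sketch. To stay dependency-free it ranks features by the absolute coefficients of an ordinary least-squares fit, which is a swapped-in stand-in for the Random Forest importances used in this study; the feature names `f0` to `f3` and the synthetic data are hypothetical.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_coefficients(rows, y):
    """Least-squares slopes via the normal equations on centred data."""
    n, p = len(rows), len(rows[0])
    col_means = [sum(r[j] for r in rows) / n for j in range(p)]
    y_mean = sum(y) / n
    xc = [[r[j] - col_means[j] for j in range(p)] for r in rows]
    yc = [v - y_mean for v in y]
    xtx = [[sum(xc[i][j] * xc[i][k] for i in range(n)) for k in range(p)] for j in range(p)]
    xty = [sum(xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    return solve_linear_system(xtx, xty)


def rfe(rows, y, names, n_keep):
    """Refit on the surviving features and drop the weakest until n_keep remain."""
    kept = list(range(len(names)))
    while len(kept) > n_keep:
        subset = [[r[j] for j in kept] for r in rows]
        coefs = ols_coefficients(subset, y)
        weakest = min(range(len(kept)), key=lambda k: abs(coefs[k]))
        kept.pop(weakest)
    return [names[j] for j in kept]


# Synthetic data: only f0 and f2 drive the target; all features share one scale.
rng = random.Random(0)
names = ["f0", "f1", "f2", "f3"]
rows = [[rng.gauss(0, 1) for _ in names] for _ in range(200)]
y = [3.0 * r[0] + 2.0 * r[2] + rng.gauss(0, 0.1) for r in rows]
selected = rfe(rows, y, names, n_keep=2)
```

Coefficient magnitudes are only comparable when features share a scale, which is why the synthetic features are all drawn from the same distribution; a tree-based importance measure avoids that requirement.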

3.2. Model Validation

Using the optimized subsets of selected features, we developed models based on the MLR, SVR, RF, and XGBoost algorithms, with hyperparameters fine-tuned via grid search where applicable. The generalization performance of each model was then assessed using the reserved test dataset.
To facilitate comparison, three data combinations were chosen—S2 + S1 + PALSAR, S2 + S1 + SRTM, and the comprehensive ALL dataset—as they had shown similar potential during the feature selection stage. Figure 4 illustrates the model accuracies across all four algorithms for these combinations.
Validation outcomes demonstrate that model performance varied notably depending on the data inputs, with the ALL dataset (integrating S1, S2, PALSAR, and SRTM) yielding the highest accuracy overall. Among the modeling approaches, the grid-search-optimized RF and XGBoost algorithms consistently outperformed the others. In contrast, MLR produced the poorest results regardless of the input combination, likely due to its inability to effectively model the complex nonlinear patterns associated with forest AGC. Although SVR performed better than MLR, it still struggled to accommodate the intricate nature of the multi-source dataset. Comparatively, RF and XGBoost, with their enhanced nonlinear learning capabilities, achieved superior performance, with RF delivering the highest accuracy under the ALL configuration (R2 = 0.818, RMSE = 11.126 tC/ha).
These results underscore the advantages of integrating diverse remote sensing datasets for improved AGC estimation. The RF model in particular emerged as the most effective method due to its robustness and adaptability to nonlinear relationships. Consequently, the RF model optimized with grid search under the ALL data configuration was selected as the final estimation model for forest AGC in the Pinglu Canal basin. This model not only achieved the best predictive performance but also benefited from using a minimal subset of features, thus enhancing computational efficiency while maintaining high estimation reliability. Its effectiveness makes it particularly well suited for the spatiotemporal assessment of AGC within the study area.

3.3. Spatial Distribution of Forest AGC in the Pinglu Canal Basin

Based on the optimized Random Forest model under the ALL data combination, the estimated forest AGC values are shown in Figure 5. The figure reveals significant spatial variability in the AGC across the Pinglu Canal basin. Overall, high-carbon-stock areas are primarily distributed in the northern upstream and central regions, particularly in areas with dense forest coverage, where the AGC values can reach up to 66.85 tC/ha. In contrast, the downstream regions, especially near river corridors, exhibit noticeably lower carbon stock, with most values ranging from 6.37 to 25.58 tC/ha, represented by lighter colors. The trend of decreasing carbon stock from north to south is attributed to the dense vegetation and extensive forest coverage in the northern and central regions, whereas the downstream areas have reduced vegetation coverage due to agricultural activities, urban expansion, or infrastructure development. Statistical analysis of all pixels indicates that the average aboveground carbon density in the study area is 37.38 tC/ha. The statistical histogram of the pixel values shows that the majority of aboveground carbon density values in the study area fall within the range of 28 to 52 tC/ha. Furthermore, based on the forest area of the study region (6.07 × 10⁴ ha), the total aboveground carbon stock is calculated to be 2.26 × 10⁶ tC.
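As a rough consistency check, and assuming the total is simply the mean carbon density multiplied by the forest area, the reported figures give:

$$
37.38\ \mathrm{tC\,ha^{-1}} \times 6.07 \times 10^{4}\ \mathrm{ha} \approx 2.27 \times 10^{6}\ \mathrm{tC}
$$

This is within about 0.4% of the reported total; the small gap may come from rounding of the inputs or from summing pixel values directly rather than multiplying the mean by the area.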

4. Discussion

4.1. Addressing Spatial Heterogeneity in AGC Estimation with Advanced Sampling Methods

Accurate estimation of forest AGC largely depends on the precise alignment between field measurements and remote sensing data. Traditional methods typically rely on matching sample plots to satellite pixels by recording the plot’s central coordinates or corner coordinates and aggregating plot-level attributes [58,59]. However, these methods assume uniform distribution of attributes within plots, neglecting spatial heterogeneity, and often result in misalignment with pixel grids, leading to inaccuracies in AGC estimation.
To address these limitations, we adopted a sampling method inspired by UAV studies, utilizing handheld RTK devices to record precise coordinates for each tree, enabling accurate alignment of tree point data with Sentinel-2 pixel grids. Unlike traditional methods, this approach accurately maps individual tree positions to corresponding pixel grids, capturing the spatial heterogeneity within sample plots. Pixels with missing or incomplete data were excluded, ensuring that only fully representative pixels were used for AGC estimation. This preserves the variability of forest attributes while avoiding the assumption of uniform distribution. Additionally, the alignment with Sentinel-2 grids minimizes errors caused by misregistration between field plots and remote sensing data, significantly enhancing the accuracy of pixel-level AGC estimation [60]. By accounting for within-plot variability, excluding incomplete pixels, and ensuring precise correspondence between field data and satellite imagery, our method establishes a more reliable foundation for AGC modeling and spatial analysis. To further validate this, we conducted a global spatial autocorrelation analysis using Moran’s I [61]. Such an analysis provides insight into the spatial structure of carbon storage and can help evaluate how well different sampling or modeling strategies represent spatial variability [62]. The analysis yielded a Moran’s I of 0.77 (p < 0.01), indicating strong and statistically significant positive spatial autocorrelation. In addition, a local spatial clustering analysis was conducted to further examine the distribution pattern of AGC values (Figure 6).
As shown in Figure 6, significant high–high clusters were mainly concentrated in the central and northern areas of the forested region, where vegetation is denser and more continuous. In contrast, low–low clusters appeared along the southern and peripheral zones, which may reflect more fragmented or sparse forest stands. These spatial patterns emphasize the non-random distribution of forest carbon and further demonstrate the capacity of our pixel-level modeling framework to capture ecological spatial variability.
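For readers unfamiliar with the statistic, global Moran's I can be sketched on a small raster as below. The binary rook (4-neighbor) weights are an assumption, since the weighting scheme used in this study is not stated here, and the sketch omits the permutation test needed for a p-value. Both grids are toy inputs.

```python
def morans_i(grid):
    """Global Moran's I for a 2-D grid using binary rook (4-neighbor) weights."""
    n_rows, n_cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    dev = [[v - mean for v in row] for row in grid]
    cross_sum = 0.0  # sum over i, j of w_ij * dev_i * dev_j
    weight_sum = 0   # sum over i, j of w_ij
    for r in range(n_rows):
        for c in range(n_cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    cross_sum += dev[r][c] * dev[rr][cc]
                    weight_sum += 1
    variance_sum = sum(d * d for row in dev for d in row)
    return (n / weight_sum) * (cross_sum / variance_sum)


# Toy grids: two homogeneous blocks (clustered) versus an alternating pattern.
clustered = [[10, 10, 50, 50] for _ in range(4)]
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]
print(morans_i(clustered), morans_i(checkerboard))
```

Values near +1 indicate that similar values sit next to each other, values near -1 indicate alternation, and values near 0 indicate no spatial structure.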
An additional advantage of our approach is its ability to significantly increase the number of modeling-ready AGC pixels derived from field data. Using 37 sample plots, we were able to obtain 158 forest AGC pixels for modeling, compared to traditional survey methods, where typically only one AGC pixel can be generated per sample plot [63]. This substantial increase in pixel-level data density enhances the robustness of AGC modeling by providing a more comprehensive representation of spatial variability, ultimately improving model performance and reliability.
A major limitation of our approach is the reduced utilization of field data due to the exclusion of incomplete pixels. As a result, many individual tree data points were not incorporated into the current analysis. Specifically, approximately 43% of individual tree data were excluded from this study. For 10 sample plots, the utilization rate of individual tree data was even below 45%, highlighting the limitations of the current approach in fully leveraging available data.
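The tree-to-pixel aggregation and the resulting utilization rate can be sketched as follows. Everything specific here is a hypothetical assumption for illustration: the grid origin, the tree coordinates and carbon values, and the rule that a pixel counts as complete when it appears in a precomputed set of pixel indices.

```python
import math
from collections import defaultdict

PIXEL_SIZE_M = 10.0                          # Sentinel-2 10 m grid
PIXEL_AREA_HA = PIXEL_SIZE_M ** 2 / 10_000   # 0.01 ha per pixel


def pixel_index(x, y, origin_x, origin_y):
    """Column/row of the grid cell containing a projected coordinate (meters)."""
    col = math.floor((x - origin_x) / PIXEL_SIZE_M)
    row = math.floor((origin_y - y) / PIXEL_SIZE_M)  # rows count down from the top edge
    return col, row


def aggregate_trees(trees, origin_x, origin_y, complete_pixels):
    """Sum tree carbon per complete pixel and report the share of trees used."""
    carbon_by_pixel = defaultdict(float)
    used = 0
    for x, y, carbon_t in trees:
        key = pixel_index(x, y, origin_x, origin_y)
        if key in complete_pixels:
            carbon_by_pixel[key] += carbon_t
            used += 1
    density = {k: v / PIXEL_AREA_HA for k, v in carbon_by_pixel.items()}  # tC/ha
    return density, used / len(trees)


# Hypothetical trees: (easting, northing, carbon in tC).
trees = [
    (500003.0, 2399996.0, 0.12),
    (500007.0, 2399992.0, 0.20),
    (500014.0, 2399995.0, 0.15),
    (500025.0, 2399988.0, 0.10),  # falls in a pixel not fully covered by the plot
]
density, utilization = aggregate_trees(
    trees, origin_x=500000.0, origin_y=2400000.0, complete_pixels={(0, 0), (1, 0)}
)
```

In this toy case three of four trees fall in complete pixels, so the utilization rate is 0.75; trees in partially covered pixels are the ones lost to the analysis.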
To address this issue in future work, we plan to integrate UAV LiDAR data for individual tree biomass estimation, allowing for the full utilization of all individual tree data [64,65]. This integration would enable the generation of additional forest AGC pixels, further enhancing data density and improving the overall robustness of AGC modeling.

4.2. Relationship Between Multiple Source Factors and AGC

Analyzing the interplay between various data sources and AGC underscores the critical value of integrating multi-source remote sensing inputs for improving AGC prediction accuracy [66]. This study offers an in-depth evaluation of feature importance rankings alongside correlation analyses between selected variables and AGC, highlighting how inputs from diverse datasets contribute to model performance.
As shown in Figure 7, features derived from S1, S2, PALSAR, and SRTM exhibit varying degrees of influence in the AGC estimation model. Among them, S1 features—particularly VH-based entropy and second moment—emerge as the most influential, primarily due to their effectiveness in capturing vegetation structural complexity and texture [67]. S2-derived indices, such as the Normalized Burn Ratio (NBR) and Red-Edge Inflection Point (REIP), also rank highly, underscoring the role of spectral metrics in reflecting canopy condition and health [68]. Although variables from PALSAR and SRTM like variance (HH) and profile curvature demonstrate lower importance scores, they provide supplementary information related to terrain and surface structure. These outcomes reveal the complementary benefits of combining datasets: S1 and S2 serve as the primary contributors to AGC prediction, while PALSAR and SRTM reinforce model stability by offering structural and topographic context.
Further insights are provided in Figure 8, which displays the relationships between AGC and selected variables across different forest categories. In the aggregated dataset (ALL), moderate positive correlations are observed between AGC and S1 variables such as VH-VV and VH entropy, highlighting the relevance of SAR-derived structural features. S2 metrics consistently show absolute correlation coefficients exceeding 0.4, reflecting their importance in representing vegetation properties and spatial carbon distribution, especially in regions with lower to medium biomass levels. Conversely, terrain attributes (e.g., SRTM-derived profile curvature) and specific PALSAR variables (e.g., HV contrast and HH variance) display weaker correlations with AGC.
Distinct forest types exhibit unique patterns in feature relevance. For eucalyptus stands, S2 variables like NBR and B12 are strongly associated with AGC, indicating spectral indices’ sensitivity to eucalyptus canopy and biomass dynamics [69]. Slash pine, on the other hand, responds more to structural inputs from S1 and PALSAR, as reflected by stronger correlations with VH correlation and HV mean. Masson pine AGC correlates robustly with all S1-derived metrics, suggesting its structural characteristics are well captured by radar data. In mixed forests, a notable positive relationship is observed between AGC and the HV mean from PALSAR, signifying the value of SAR data in detecting structural heterogeneity. Interestingly, AGC in these mixed stands is negatively correlated with terrain variables from SRTM, indicating that complex topography, such as steep slopes, may inhibit forest growth and reduce carbon accumulation [70,71]. A summary comparison of the key variables from each data source and their relevance to AGC modeling across forest types is provided in Table 3.
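A per-forest-type correlation of the kind summarized above can be sketched with a plain Pearson coefficient grouped by a category label. The record fields (`forest_type`, `nbr`, `agc`) and all values are made-up placeholders, not data or results from this study.

```python
import math
from collections import defaultdict


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_by_group(records, feature):
    """Correlate one feature with AGC separately within each forest type."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["forest_type"]].append(rec)
    return {
        name: pearson_r([r[feature] for r in rows], [r["agc"] for r in rows])
        for name, rows in groups.items()
    }


# Placeholder records for illustration only.
records = [
    {"forest_type": "eucalyptus", "nbr": 0.2, "agc": 20.0},
    {"forest_type": "eucalyptus", "nbr": 0.3, "agc": 30.0},
    {"forest_type": "eucalyptus", "nbr": 0.4, "agc": 40.0},
    {"forest_type": "mixed", "nbr": 0.5, "agc": 30.0},
    {"forest_type": "mixed", "nbr": 0.4, "agc": 35.0},
    {"forest_type": "mixed", "nbr": 0.3, "agc": 45.0},
]
by_type = correlation_by_group(records, "nbr")
```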
Collectively, these results emphasize the significance of fusing multi-source datasets in AGC modeling. Sentinel-1 plays a crucial role by providing structural and textural details; Sentinel-2 offers rich spectral indicators of canopy health; and both PALSAR and SRTM add contextual information related to elevation and landscape complexity. The integration of these complementary data sources enables comprehensive and accurate carbon stock estimation across varied forest ecosystems [72].

4.3. Analysis of Interannual Variation of AGC and Area Change in the Pinglu Canal Basin Forest

To better evaluate the impact of canal construction on the forest ecosystem within the basin, analyses were conducted at two spatial scales [73]: the entire basin and a 1 km buffer zone around the river channels, which represents the Environmental Impact Assessment (EIA) area. Figure 9 illustrates the forest area dynamics in the Pinglu Canal basin from 2021 to 2024, based on forest masks generated in this study. The figure reveals significant interannual variations in forest area, driven by both canal construction activities and subsequent ecological restoration measures.
In the EIA area, where construction activities were most concentrated, the forest area experienced a significant decline in 2022 due to land clearing for canal construction. Although reforestation measures improved forest cover in 2023 and 2024, the forest area in the EIA zone had not yet returned to 2021 levels by 2024. This is primarily due to the ongoing construction of the Pinglu Canal, as several areas within the 1 km buffer zone remain occupied by engineering projects, limiting the extent of forest recovery. In contrast, the entire watershed showed a more balanced trend. The forest area initially decreased from 59,683.73 ha in 2021 to 57,880.31 ha in 2022, reflecting the broader deforestation impact of construction activities. However, from 2023 onwards, forest recovery efforts led to a steady increase, culminating in 60,669.97 ha by 2024, demonstrating the effectiveness of large-scale conservation and reforestation initiatives [74]. These findings indicate that, although ongoing construction activities within the EIA area continue to hinder the complete recovery of forests to pre-construction levels, proactive conservation and reforestation measures have significantly increased forest area across the entire watershed.
Furthermore, a historical baseline scenario for the Pinglu Canal basin was established using historical AGC data from 2002 to 2021 and precipitation and temperature data obtained from the ERA5_Land dataset on GEE. A linear model was employed for fitting and prediction, yielding the predicted 2024 forest AGC with R2 = 0.76 and RMSE = 184,137.79 tC for the entire basin, and R2 = 0.70 and RMSE = 34,188.66 tC for the EIA area. The predicted values were then compared with the AGC estimates for 2024 derived in this study using multi-source remote sensing data. For the entire basin, the baseline scenario predicts a carbon stock of 2,079,631.86 tC, while the estimate based on multi-source remote sensing data is higher, at 2,260,779.61 tC, reflecting the positive impact of reforestation and conservation measures implemented during the canal construction period. Similarly, in the EIA area, the baseline scenario predicts 359,349.23 tC, whereas the estimated value reaches 370,767.55 tC, despite ongoing construction activities. These findings highlight the effectiveness of targeted vegetation restoration and protection efforts in enhancing carbon sequestration, even under the constraints of large-scale infrastructure projects [75].
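The comparison above reduces to simple arithmetic on the four reported totals. The short sketch below computes the absolute and relative deviation of the remote-sensing estimate from the baseline prediction; the dictionary layout and function name are illustrative.

```python
# Totals in tC, as reported in the text.
reported = {
    "entire basin": {"baseline": 2_079_631.86, "estimated": 2_260_779.61},
    "EIA area": {"baseline": 359_349.23, "estimated": 370_767.55},
}


def deviation(baseline, estimated):
    """Return (absolute difference in tC, difference as a percentage of the baseline)."""
    diff = estimated - baseline
    return diff, 100.0 * diff / baseline


results = {name: deviation(**vals) for name, vals in reported.items()}
for name, (diff, pct) in results.items():
    print(f"{name}: {diff:+,.2f} tC ({pct:+.2f}%)")
```

The estimate exceeds the baseline by roughly 181,148 tC (about 8.7%) for the entire basin and by roughly 11,418 tC (about 3.2%) for the EIA area.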
Since the construction of the Pinglu Canal began in July 2022, afforestation efforts along the canal’s route, as reported by the Guangxi Forestry Bureau, have resulted in the completion of 19,063 ha of new forest plantations. These efforts have significantly strengthened ecological restoration in the counties and districts traversed by the canal. While the construction activities have inevitably impacted the forest ecosystems along the canal, proactive forest conservation and vegetation restoration measures have substantially mitigated these effects. The overall forest area in the basin has increased, and carbon stock estimates for both the entire basin and the EIA zone in 2024 exceed baseline scenario predictions. These efforts have not only enhanced carbon sequestration capacity but also improved the resilience and stability of the forest ecosystems. Such findings highlight the importance of embedding ecological compensation and vegetation recovery into infrastructure planning frameworks. Their demonstrated effectiveness suggests that large-scale projects like the Pinglu Canal can serve as models for integrating development with environmental goals, including carbon neutrality targets and land-use policy. Additionally, the use of multi-source remote sensing offers a scalable, cost-effective approach for tracking ecological impacts and restoration effectiveness, thereby supporting data-driven environmental governance. This case demonstrates that, with well-planned restoration initiatives, development and ecological protection are not mutually exclusive, but can be jointly pursued within a sustainable planning framework [76].

4.4. Method Comparison and Applicability

Recent advances in AGC estimation have focused on multi-source remote sensing integration and machine learning techniques. Current mainstream approaches can be broadly categorized as follows:
(1) Spaceborne LiDAR-based modeling using GEDI or ICESat-2
These methods utilize LiDAR observations to extract forest vertical structure features such as canopy height and layering [77]. They are effective in alleviating spectral saturation in high-biomass areas and are widely used for global-scale AGC estimation. However, due to sparse sampling density and discontinuous spatial distribution, they are not suitable for pixel-level wall-to-wall mapping. This limits their applicability in local-scale ecological engineering assessments, where a trade-off exists between accuracy and spatial coverage.
(2) High-resolution remote sensing modeling based on deep learning (e.g., CNN, Transformer)
Deep learning has been increasingly applied in remote sensing, enabling end-to-end AGC estimation using high-resolution imagery. These models can achieve excellent prediction accuracy in regions with abundant training data and stable land cover [78]. However, they require large volumes of high-quality labeled samples, lack interpretability, and are sensitive to data imbalance or domain shifts. These limitations restrict their generalizability in ecologically heterogeneous regions.
(3) Empirical regression models based on vegetation indices or topographic variables
Traditional regression models use vegetation indices such as NDVI or EVI to establish empirical relationships with AGC [79]. These methods are simple to implement and scalable, making them suitable for broad regional assessments. However, they are vulnerable to saturation effects and terrain-induced distortion, and are limited in estimating AGC in structurally complex or data-sparse environments.
This study proposes a practical alternative that combines open-access data from Sentinel-2, Sentinel-1, ALOS PALSAR, and SRTM with a Random Forest model optimized by RFE and hyperparameter tuning. The resulting model achieves 10 m-resolution AGC mapping with high accuracy. Unlike approaches that rely on LiDAR or complex neural networks, this method balances accuracy, interpretability, and cost-efficiency, making it suitable for regions with complex ecosystems and limited ground data.
All remote sensing data used in this study are freely available through the GEE and Copernicus platforms, enabling scalable applications. Importantly, although the model requires representative ground samples for training, it does not rely on extensive or updated forest inventories. Once trained, the model can be applied across larger areas using only remote sensing features, offering a practical solution for regions with limited field data but adequate satellite coverage.
This approach is particularly applicable in developing regions, coastal ecosystems, and infrastructure corridors, where field surveys are often constrained. To further improve model automation and accuracy, future work may integrate UAV-based LiDAR data with remote sensing variables to obtain detailed structural information while reducing the labor cost of field campaigns.
Nevertheless, three limitations remain. First, the model lacks explicit structural metrics such as canopy height, which may lead to underestimation in dense forests. Second, the absence of time-series features limits the capacity to capture seasonal and interannual dynamics. Third, the model still depends on ground plot data for training, and performance may decline if plot distribution or quality is insufficient.
To establish a baseline scenario, we used a simple linear regression model based on AGC trends from 2002 to 2021. While straightforward, this approach may oversimplify complex ecological processes. Future studies could explore more robust time-series models such as ARIMA [80] or LSTM [81] to better account for nonlinear drivers of AGC change.
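A trend-only baseline of this kind can be sketched with a closed-form least-squares line extrapolated to 2024. The annual series below is a synthetic, exactly linear placeholder rather than the historical AGC data, and the sketch uses year as the only predictor, leaving out the precipitation and temperature covariates.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return slope, intercept


years = list(range(2002, 2022))  # 2002 to 2021 inclusive
# Synthetic placeholder series in tC: starts at 1.5e6 and gains 2.0e4 per year.
agc = [1.50e6 + 2.0e4 * (year - 2002) for year in years]

slope, intercept = fit_line(years, agc)
baseline_2024 = slope * 2024 + intercept
print(round(slope), round(baseline_2024))
```

Because the fit extrapolates three years beyond the last observation, any curvature in the real series would carry straight into the baseline, which is the oversimplification noted above.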

5. Conclusions

The accurate estimation of forest AGC is critical for addressing climate change and achieving sustainable development. In this study, we integrated high-resolution remote sensing data with machine learning techniques to explore the application of multi-source remote sensing data and model optimization in AGC estimation, focusing on evaluating the impact of the Pinglu Canal construction on the forest ecosystem. The main contributions of this study are as follows:
By integrating high-precision tree-level coordinate data obtained from RTK with Sentinel-2 imagery, this study developed a high-quality, pixel-level AGC dataset for the Pinglu Canal basin in 2024. This approach effectively addressed key challenges, including aligning field measurements with satellite data resolution and capturing spatial heterogeneity, thereby providing a robust foundation for accurate carbon stock estimation.
Using RFE, we conducted effective feature selection and compared the estimation capabilities of different machine learning models. Among all tested data combinations and models, the hyperparameter-optimized RF model demonstrated the best performance in AGC estimation. Based on this, we generated a 10 m-resolution AGC spatial distribution map for the Pinglu Canal basin, offering a visual tool for carbon stock monitoring.
By comparing the predicted 2024 AGC under a baseline scenario with the estimated AGC derived from multi-source remote sensing data and field survey data under actual conditions, we found that active forest conservation and vegetation restoration measures during the Pinglu Canal construction period resulted in better forest growth than natural growth scenarios. This highlights the potential for large-scale infrastructure projects to achieve a certain degree of harmony between construction and forest ecosystem protection.
Our methodology, relying on freely accessible remote sensing data as predictors for machine learning models, combined with high spatial and temporal resolution satellite imagery, proved to be cost-effective, scalable, and highly reliable for AGC estimation in the Pinglu Canal basin. To further enhance data utility and improve AGC estimation accuracy, future work will integrate UAV-based LiDAR data with multi-source remote sensing datasets.

Author Contributions

Conceptualization, methodology, software, validation, W.X.; formal analysis, W.X. and X.M.; investigation, W.X., W.W. (Wenqian Wu), X.M., S.D. and W.W. (Wenhuan Wang); data curation, W.X., S.D. and X.M.; writing—original draft preparation, W.X.; writing—review and editing, W.Z.; visualization, S.D., W.W. (Wenqian Wu) and W.W. (Wenhuan Wang); supervision, W.Z.; funding acquisition, Y.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Guangxi Science and Technology Major Program, grant number GUIKEAA23062054; the Science and Technology Research and Development Projects of Guangxi Institute of Industrial Technology, grant number CYY-HT2023-JSJJ-0037; and the Guangxi Key Research and Development Program, grant number GUIKEAB24010248.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

During the preparation of this manuscript, the authors used OpenAI’s ChatGPT (GPT-4, March 2025 version) for the purposes of improving language clarity, refining logical flow, and assisting in information and literature retrieval. The authors have reviewed and edited all AI-generated content and take full responsibility for the integrity and accuracy of the final manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AGC: Aboveground carbon stock
SAR: Synthetic Aperture Radar
SVR: Support Vector Regression
RF: Random Forest
RFE: Recursive Feature Elimination
MLR: Multiple Linear Regression
RTK: Real-Time Kinematic
GEE: Google Earth Engine
SRTM: Shuttle Radar Topography Mission
DEM: Digital Elevation Model
AGB: Aboveground biomass
R2: Coefficient of determination
RMSE: Root mean square error
SVM: Support Vector Machine
EIA: Environmental Impact Assessment

References

  1. Malhi, Y.; Franklin, J.; Seddon, N.; Solan, M.; Turner, M.G.; Field, C.B.; Knowlton, N. Climate change and ecosystems: Threats, opportunities and solutions. Philos. Trans. R. Soc. B Biol. Sci. 2020, 375, 20190104. [Google Scholar] [CrossRef] [PubMed]
  2. Pörtner, H.-O.; Scholes, R.J.; Agard, J.; Archer, E.; Arneth, A.; Bai, X.; Barnes, D.; Burrows, M.; Chan, L.; Cheung, W.L.W. Scientific Outcome of the IPBES-IPCC Co-Sponsored Workshop on Biodiversity and Climate Change; Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES): Bonn, Germany, 2021. [Google Scholar]
  3. Falkowski, P.; Scholes, R.J.; Boyle, E.; Canadell, J.; Canfield, D.; Elser, J.; Gruber, N.; Hibbard, K.; Högberg, P.; Linder, S. The global carbon cycle: A test of our knowledge of earth as a system. Science 2000, 290, 291–296. [Google Scholar] [CrossRef] [PubMed]
  4. Lorenz, K. Carbon Sequestration in Forest Ecosystems; Springer: Berlin/Heidelberg, Germany, 2010. [Google Scholar]
  5. Goetz, S.J.; Hansen, M.; Houghton, R.A.; Walker, W.; Laporte, N.; Busch, J. Measurement and monitoring needs, capabilities and potential for addressing reduced emissions from deforestation and forest degradation under REDD+. Environ. Res. Lett. 2015, 10, 123001.
  6. Jenkins, M.; Schaap, B. Forest ecosystem services. In Background Analytical Study; United Nations Forum on Forests: New York, NY, USA, 2018; Volume 1.
  7. Wang, Y.; Guo, C.-H.; Chen, X.-J.; Jia, L.-Q.; Guo, X.-N.; Chen, R.-S.; Zhang, M.-S.; Chen, Z.-Y.; Wang, H.-D. Carbon peak and carbon neutrality in China: Goals, implementation path and prospects. China Geol. 2021, 4, 720–746.
  8. FAO. Global Forest Resources Assessment 2020: Main Report; Food and Agriculture Organization of the United Nations: Rome, Italy, 2020.
  9. Zhou, Q.; Wang, L.; Tang, F.; Zhao, S.; Huang, N.; Zheng, K. Mapping spatial and temporal distribution information of plantations in Guangxi from 2000 to 2020. Front. Ecol. Evol. 2023, 11, 1201161.
  10. Carse, A. Nature as infrastructure: Making and managing the Panama Canal watershed. Soc. Stud. Sci. 2012, 42, 539–563.
  11. Condit, R.; Robinson, W.D.; Ibáñez, R.; Aguilar, S.; Sanjur, A.; Martínez, R.; Stallard, R.F.; García, T.; Angehr, G.R.; Petit, L. The Status of the Panama Canal Watershed and Its Biodiversity at the Beginning of the 21st Century: Long-term ecological studies reveal a diverse flora and fauna near the Panama Canal, harbored within a corridor of forest stretching from the Caribbean to the Pacific, but deforestation, land degradation, erosion, and overhunting remain threats. Bioscience 2001, 51, 389–398.
  12. Moreno, S.H. Impact of development on the Panama Canal environment. J. Interam. Stud. World Aff. 1993, 35, 129–150.
  13. Wallander, S.; Lauterbach, S.; Anderson, K.; Chou, F.; Grossman, J.M.; Schloegel, C. Existing markets for ecosystem services in the Panama canal watershed. In Emerging Markets for Ecosystem Services; CRC Press: Boca Raton, FL, USA, 2021; pp. 311–336.
  14. Ibáñez, R.; Condit, R.; Angehr, G.; Aguilar, S.; García, T.; Martínez, R.; Sanjur, A.; Stallard, R.; Wright, S.J.; Rand, A.S. An ecosystem report on the Panama Canal: Monitoring the status of the forest communities and the watershed. Environ. Monit. Assess. 2002, 80, 65–95.
  15. Simonit, S.; Perrings, C. Bundling ecosystem services in the Panama Canal watershed. Proc. Natl. Acad. Sci. USA 2013, 110, 9326–9331.
  16. Condit, R. Extracting environmental benefits from a new canal in Nicaragua: Lessons from Panama. PLoS Biol. 2015, 13, e1002208.
  17. Gibbs, H.K.; Brown, S.; Niles, J.O.; Foley, J.A. Monitoring and estimating tropical forest carbon stocks: Making REDD a reality. Environ. Res. Lett. 2007, 2, 045023.
  18. Segura, M.; Kanninen, M. Allometric models for tree volume and total aboveground biomass in a tropical humid forest in Costa Rica. Biotropica 2005, 37, 2–8.
  19. Lu, D.; Chen, Q.; Wang, G.; Liu, L.; Li, G.; Moran, E. A survey of remote sensing-based aboveground biomass estimation methods in forest ecosystems. Int. J. Digit. Earth 2016, 9, 63–105.
  20. Halme, E.; Pellikka, P.; Mottus, M. Utility of hyperspectral compared to multispectral remote sensing data in estimating forest biomass and structure variables in Finnish boreal forest. Int. J. Appl. Earth Obs. Geoinf. 2019, 83, 101942.
  21. Li, W.; Zhang, Y.; Zhang, J.; Chen, H.; Chen, E.; Zhao, L.; Zhao, D. Tropical forest AGB estimation based on structure parameters extracted by TomoSAR. Int. J. Appl. Earth Obs. Geoinf. 2023, 121, 103369.
  22. Puliti, S.; Breidenbach, J.; Schumacher, J.; Hauglin, M.; Klingenberg, T.F.; Astrup, R. Above-ground biomass change estimation using national forest inventory data with Sentinel-2 and Landsat. Remote Sens. Environ. 2021, 265, 112644.
  23. Verly, O.M.; Leite, R.V.; da Silva Tavares-Junior, I.; da Rocha, S.J.S.S.; Leite, H.G.; Gleriani, J.M.; Rufino, M.P.M.X.; de Fatima Silva, V.; Torres, C.M.M.E.; Plata-Rueda, A. Atlantic forest woody carbon stock estimation for different successional stages using Sentinel-2 data. Ecol. Indic. 2023, 146, 109870.
  24. Schmitt, M.; Zhu, X.X. Data fusion and remote sensing: An ever-growing relationship. IEEE Geosci. Remote Sens. Mag. 2016, 4, 6–23.
  25. Sinha, S.; Mohan, S.; Das, A.; Sharma, L.; Jeganathan, C.; Santra, A.; Santra Mitra, S.; Nathawat, M. Multi-sensor approach integrating optical and multi-frequency synthetic aperture radar for carbon stock estimation over a tropical deciduous forest in India. Carbon Manag. 2020, 11, 39–55.
  26. Yang, Q.; Niu, C.; Liu, X.; Feng, Y.; Ma, Q.; Wang, X.; Tang, H.; Guo, Q. Mapping high-resolution forest aboveground biomass of China using multisource remote sensing data. GISci. Remote Sens. 2023, 60, 2203303.
  27. Kanga, S. Advancements in remote sensing tools for forestry analysis. Sustain. For. 2023, 6, 2269.
  28. Li, J.; Bao, W.; Wang, X.; Song, Y.; Liao, T.; Xu, X.; Guo, M. Estimating Aboveground Biomass of Boreal Forests in Northern China Using Multiple Data sets. IEEE Trans. Geosci. Remote Sens. 2024, 62, 4408410.
  29. Cosenza, D.N.; Korhonen, L.; Maltamo, M.; Packalen, P.; Strunk, J.L.; Næsset, E.; Gobakken, T.; Soares, P.; Tomé, M. Comparison of linear regression, k-nearest neighbour and random forest methods in airborne laser-scanning-based prediction of growing stock. Forestry 2021, 94, 311–323.
  30. Mohammadi, J.; Shataee, S.; Babanezhad, M. Estimation of forest stand volume, tree density and biodiversity using Landsat ETM+ Data, comparison of linear and regression tree analyses. Procedia Environ. Sci. 2011, 7, 299–304.
  31. Georgopoulos, N.; Gitas, I.Z.; Stefanidou, A.; Korhonen, L.; Stavrakoudis, D. Estimation of Individual Tree Stem Biomass in an Uneven-Aged Structured Coniferous Forest Using Multispectral LiDAR Data. Remote Sens. 2021, 13, 4827.
  32. Singh, C.; Karan, S.K.; Sardar, P.; Samadder, S.R. Remote sensing-based biomass estimation of dry deciduous tropical forest using machine learning and ensemble analysis. J. Environ. Manag. 2022, 308, 114639.
  33. Bergstra, J.; Bardenet, R.; Bengio, Y.; Kégl, B. Algorithms for hyper-parameter optimization. In Proceedings of the 25th International Conference on Neural Information Processing Systems, Granada, Spain, 12–15 December 2011; Volume 24.
  34. Jung, J.; Kim, S.; Hong, S.; Kim, K.; Kim, E.; Im, J.; Heo, J. Effects of national forest inventory plot location error on forest carbon stock estimation using k-nearest neighbor algorithm. ISPRS J. Photogramm. Remote Sens. 2013, 81, 82–92.
  35. Urbazaev, M.; Thiel, C.; Cremer, F.; Dubayah, R.; Migliavacca, M.; Reichstein, M.; Schmullius, C. Estimation of forest aboveground biomass and uncertainties by integration of field measurements, airborne LiDAR, and SAR and optical satellite data in Mexico. Carbon Balance Manag. 2018, 13, 5.
  36. Huang, L.; Huang, Z.; Zhou, W.; Wu, S.; Li, X.; Mao, F.; Song, M.; Zhao, Y.; Lv, L.; Yu, J. Landsat-based spatiotemporal estimation of subtropical forest aboveground carbon storage using machine learning algorithms with hyperparameter tuning. Front. Plant Sci. 2024, 15, 1421567.
  37. Zhang, Y.; He, B.; Chen, R.; Zhang, H.; Fan, C.; Yin, J.; Li, Y. The potential of optical and SAR time-series data for the improvement of aboveground biomass carbon estimation in Southwestern China’s evergreen coniferous forests. GISci. Remote Sens. 2024, 61, 2345438.
  38. Ecke, S.; Stehr, F.; Frey, J.; Tiede, D.; Dempewolf, J.; Klemmt, H.-J.; Endres, E.; Seifert, T. Towards operational UAV-based forest health monitoring: Species identification and crown condition assessment by means of deep learning. Comput. Electron. Agric. 2024, 219, 108785.
  39. Qin, H.; Zhou, W.; Yao, Y.; Wang, W. Individual tree segmentation and tree species classification in subtropical broadleaf forests using UAV-based LiDAR, hyperspectral, and ultrahigh-resolution RGB data. Remote Sens. Environ. 2022, 280, 113143.
  40. Li, H.; Hiroshima, T.; Li, X.; Hayashi, M.; Kato, T. High-resolution mapping of forest structure and carbon stock using multi-source remote sensing data in Japan. Remote Sens. Environ. 2024, 312, 114322.
  41. Liu, S.; Wang, H.; Hu, Y.; Zhang, M.; Zhu, Y.; Wang, Z.; Li, D.; Yang, M.; Wang, F. Land use and land cover mapping in China using multimodal fine-grained dual network. IEEE Trans. Geosci. Remote Sens. 2023, 61, 4405219.
  42. Xu, H.; Man, Y.; Yang, M.; Wu, J.; Zhang, Q.; Wang, J. Analytical Insight of Earth: A Cloud-Platform of Intelligent Computing for Geospatial Big Data. arXiv 2023, arXiv:2312.16385.
  43. Zhao, C.; Qin, C.-Z. 10-m-resolution mangrove maps of China derived from multi-source and multi-temporal satellite observations. ISPRS J. Photogramm. Remote Sens. 2020, 169, 389–405.
  44. Chen, Y.; Feng, X.; Fu, B.; Ma, H.; Zohner, C.M.; Crowther, T.W.; Huang, Y.; Wu, X.; Wei, F. Maps with 1 km resolution reveal increases in above- and belowground forest biomass carbon pools in China over the past 20 years. Earth Syst. Sci. Data 2023, 15, 897–910.
  45. Li, Y.; Li, C.; Li, M.; Liu, Z. Influence of variable selection and forest type on forest aboveground biomass estimation using machine learning algorithms. Forests 2019, 10, 1073.
  46. Cabello-Solorzano, K.; Ortigosa de Araujo, I.; Peña, M.; Correia, L.; Tallón-Ballesteros, A.J. The impact of data normalization on the accuracy of machine learning algorithms: A comparative analysis. In Proceedings of the International Conference on Soft Computing Models in Industrial and Environmental Applications, Salamanca, Spain, 5–7 September 2023; pp. 344–353.
  47. Wang, C.; Zhang, W.; Ji, Y.; Marino, A.; Li, C.; Wang, L.; Zhao, H.; Wang, M. Estimation of Aboveground Biomass for Different Forest Types Using Data from Sentinel-1, Sentinel-2, ALOS PALSAR-2, and GEDI. Forests 2024, 15, 215.
  48. Wang, Z.; Zhang, Y.; Li, F.; Gao, W.; Guo, F.; Li, Z.; Yang, Z. Regional mangrove vegetation carbon stocks predicted integrating UAV-LiDAR and satellite data. J. Environ. Manag. 2024, 368, 122101.
  49. Guyon, I.; Weston, J.; Barnhill, S.; Vapnik, V. Gene selection for cancer classification using support vector machines. Mach. Learn. 2002, 46, 389–422.
  50. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
  51. Uyanık, G.K.; Güler, N. A study on multiple linear regression analysis. Procedia Soc. Behav. Sci. 2013, 106, 234–240.
  52. Awad, M.; Khanna, R. Support vector regression. In Efficient Learning Machines: Theories, Concepts, and Applications for Engineers and System Designers; Apress: New York, NY, USA, 2015; pp. 67–80.
  53. Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
  54. Cheng, F.; Ou, G.; Wang, M.; Liu, C. Remote sensing estimation of forest carbon stock based on machine learning algorithms. Forests 2024, 15, 681.
  55. Ding, X.; Liu, J.; Yang, F.; Cao, J. Random radial basis function kernel-based support vector machine. J. Frankl. Inst. 2021, 358, 10121–10140.
  56. Jiang, W.; Zhang, L.; Zhang, X.; Gao, S.; Gao, H.; Sun, L.; Yan, G. Multi-Decision Vector Fusion Model for Enhanced Mapping of Aboveground Biomass in Subtropical Forests Integrating Sentinel-1, Sentinel-2, and Airborne LiDAR Data. Remote Sens. 2025, 17, 1285.
  57. Kohavi, R. A study of cross-validation and bootstrap for accuracy estimation and model selection. In Proceedings of the International Joint Conference on Artificial Intelligence, Montreal, QC, Canada, 20–25 August 1995; pp. 1137–1145.
  58. Wang, X.; Jiao, H. Spatial scaling of forest aboveground biomass using multi-source remote sensing data. IEEE Access 2020, 8, 178870–178885.
  59. Yan, X.; Li, J.; Smith, A.R.; Yang, D.; Ma, T.; Su, Y.; Shao, J. Evaluation of machine learning methods and multi-source remote sensing data combinations to construct forest above-ground biomass models. Int. J. Digit. Earth 2023, 16, 4471–4491.
  60. Yan, L.; Roy, D.; Li, Z.; Zhang, H.; Huang, H. Sentinel-2A multi-temporal misregistration characterization and an orbit-based sub-pixel registration methodology. Remote Sens. Environ. 2018, 215, 495–506.
  61. Dai, W.; Fu, W.; Jiang, P.; Zhao, K.; Li, Y.; Tao, J. Spatial pattern of carbon stocks in forest ecosystems of a typical subtropical region of southeastern China. For. Ecol. Manag. 2018, 409, 288–297.
  62. Lin, Z.; Chao, L.; Wu, C.; Hong, W.; Hong, T.; Hu, X. Spatial analysis of carbon storage density of mid-subtropical forests using geostatistics: A case study in Jiangle County, southeast China. Acta Geochim. 2018, 37, 90–101.
  63. Fuchs, H.; Magdon, P.; Kleinn, C.; Flessa, H. Estimating aboveground carbon in a catchment of the Siberian forest tundra: Combining satellite imagery and field inventory. Remote Sens. Environ. 2009, 113, 518–531.
  64. Xu, D.; Wang, H.; Xu, W.; Luan, Z.; Xu, X. LiDAR applications to estimate forest biomass at individual tree scale: Opportunities, challenges and future perspectives. Forests 2021, 12, 550.
  65. Liao, K.; Li, Y.; Zou, B.; Li, D.; Lu, D. Examining the role of UAV Lidar data in improving tree volume calculation accuracy. Remote Sens. 2022, 14, 4410.
  66. Mohammed, K.; Kpienbaareh, D.; Wang, J.; Goldblum, D.; Luginaah, I.; Lupafya, E.; Dakishoni, L. Synthesizing Local Capacities, Multi-Source Remote Sensing and Meta-Learning to Optimize Forest Carbon Assessment in Data-Poor Regions. Remote Sens. 2025, 17, 289.
  67. Cheekhooree, K. Canopy Height Assessment in South Australian Pinus Radiata Plantations Using Sentinel-1: A Comparative Analysis Between INSAR and Machine Learning Algorithms. Master’s Thesis, Flinders University, College of Science and Engineering, Bedford Park, Australia, 2024.
  68. Lausch, A.; Erasmi, S.; King, D.J.; Magdon, P.; Heurich, M. Understanding forest health with remote sensing-part I—A review of spectral traits, processes and remote-sensing characteristics. Remote Sens. 2016, 8, 1029.
  69. Datt, B. Remote Sensing of Foliar Biochemistry and Biophysical Properties in Eucalyptus Species: Application of High Spectral Resolution Reflectance Measurements. Ph.D. Thesis, UNSW Sydney, Sydney, Australia, 1999.
  70. Sharma, C.; Gairola, S.; Baduni, N.; Ghildiyal, S.; Suyal, S. Variation in carbon stocks on different slope aspects in seven major forest types of temperate region of Garhwal Himalaya, India. J. Biosci. 2011, 36, 701–708.
  71. Singh, S. Understanding the role of slope aspect in shaping the vegetation attributes and soil properties in Montane ecosystems. Trop. Ecol. 2018, 59, 417–430.
  72. Xu, W.; Cheng, Y.; Luo, M.; Mai, X.; Wang, W.; Zhang, W.; Wang, Y. Progress and Limitations in Forest Carbon Stock Estimation Using Remote Sensing Technologies: A Comprehensive Review. Forests 2025, 16, 449.
  73. Zhang, X.; Shen, J.; Sun, F.; Wang, S. Spatial-temporal evolution and influencing factors analysis of ecosystem services value: A case study in Sunan Canal basin of Jiangsu Province, eastern China. Remote Sens. 2022, 15, 112.
  74. Zhao, H.; Wu, C.; Wang, X. Large-scale forest conservation and restoration programs significantly contributed to land surface greening in China. Environ. Res. Lett. 2022, 17, 024023.
  75. Lu, F.; Hu, H.; Sun, W.; Zhu, J.; Liu, G.; Zhou, W.; Zhang, Q.; Shi, P.; Liu, X.; Wu, X. Effects of national ecological restoration projects on carbon sequestration in China from 2001 to 2010. Proc. Natl. Acad. Sci. USA 2018, 115, 4039–4044.
  76. Mitra, S.; Madhuvanthi, S.; Sabumon, P. Integrating Green Infrastructure. In Nature-Based Solutions in Achieving Sustainable Development Goals; Springer: Cham, Switzerland, 2024; p. 167.
  77. Neuenschwander, A.; Duncanson, L.; Montesano, P.; Minor, D.; Guenther, E.; Hancock, S.; Wulder, M.; White, J.C.; Purslow, M.; Thomas, N. Towards global spaceborne lidar biomass: Developing and applying boreal forest biomass models for ICESat-2 laser altimetry data. Sci. Remote Sens. 2024, 10, 100150.
  78. Zhang, F.; Tian, X.; Zhang, H.; Jiang, M. Estimation of aboveground carbon density of forests using deep learning and multisource remote sensing. Remote Sens. 2022, 14, 3022.
  79. Poudel, A.; Shrestha, H.L.; Mahat, N.; Sharma, G.; Aryal, S.; Kalakheti, R.; Lamsal, B. Modeling and Mapping of Aboveground Biomass and Carbon Stock Using Sentinel-2 Imagery in Chure Region, Nepal. Int. J. For. Res. 2023, 2023, 5553957.
  80. Shumway, R.H.; Stoffer, D.S. ARIMA models. In Time Series Analysis and Its Applications: With R Examples; Springer: Berlin/Heidelberg, Germany, 2017; pp. 75–163.
  81. Greff, K.; Srivastava, R.K.; Koutník, J.; Steunebrink, B.R.; Schmidhuber, J. LSTM: A search space odyssey. IEEE Trans. Neural Netw. Learn. Syst. 2016, 28, 2222–2232.
Figure 1. Study area and sample plot distribution.
Figure 2. Flowchart showing the overall workflow for data preprocessing, modeling, and forest AGC mapping in the Pinglu Canal basin in 2024.
Figure 3. The flowchart illustrates the process of converting field-measured individual tree data into Sentinel-2 pixel-level carbon storage for subsequent training. Ultimately, only pixels with no missing data were included in the dataset.
Figure 4. Comparison of forest AGC estimation using different algorithms and data combinations.
Figure 5. Spatial distribution and profile statistics of AGC in the Pinglu Canal basin. Panels a–c show representative high- and low-carbon-density areas corresponding to real land cover conditions.
Figure 6. Local Moran’s I cluster map of forest aboveground carbon stock in the Pinglu Canal basin.
Figure 7. Relative importance ranking of multi-source remote sensing features in the random forest model optimized by grid search.
Figure 8. Correlation heatmap of AGC and modeling features across different tree species.
Figure 9. Annual forest area dynamics: increase and decrease.
Table 1. Bands, indices, and features extracted from Sentinel-2, Sentinel-1, PALSAR, and SRTM data.
| Sensor | Category | Bands, Indices or Texture Features | Description |
|---|---|---|---|
| Sentinel-2 | Multispectral bands | B2 | 490 nm, blue |
| | | B3 | 560 nm, green |
| | | B4 | 665 nm, red |
| | | B5 | 705 nm, red edge 1 |
| | | B6 | 740 nm, red edge 2 |
| | | B7 | 783 nm, red edge 3 |
| | | B8 | 842 nm, NIR |
| | | B8A | 865 nm, red edge 4 |
| | | B11 | 1610 nm, SWIR |
| | | B12 | 2190 nm, SWIR |
| | Vegetation indices | ARVI, CVI, DVI, EVI, GNDVI, GRVI, IPVI, LAI, MSAVI, MTCI, NBR, NBR2, NDI45, NDVI, NDWI, REIP, SAVI, WDVI | All 18 vegetation indices are calculated based on multispectral bands. |
| Sentinel-1 | Polarization | VV | Vertical–vertical polarization |
| | | VH | Vertical–horizontal polarization |
| | Indices | VV + VH | All 5 indices are calculated based on the dual-polarization backscatter coefficient (VV and VH). |
| | | VV − VH | |
| | | VV/VH | |
| | | VH/VV | |
| | | (VV − VH)/(VV + VH) | |
| | Texture feature | Mean | Texture features are extracted from dual-polarization backscatter coefficients (VV and VH) using GLCM with a 7 × 7 pixel window. |
| | | Variance | |
| | | Homogeneity | |
| | | Contrast | |
| | | Dissimilarity | |
| | | Entropy | |
| | | Second moment | |
| | | Correlation | |
| PALSAR | Polarization | HH | Horizontal–horizontal polarization |
| | | HV | Horizontal–vertical polarization |
| | Indices | HH + HV | All 5 indices are calculated based on the dual-polarization backscatter coefficient (HH and HV). |
| | | HH − HV | |
| | | HH/HV | |
| | | HV/HH | |
| | | (HH − HV)/(HH + HV) | |
| | Texture feature | Same as Sentinel-1 | Texture features are extracted from dual-polarization backscatter coefficients (HH and HV) using GLCM with a 7 × 7 pixel window. |
| SRTM | DEM | Elevation | Height above sea level |
| | Indices | Slope | All 5 indices are calculated using standard terrain analysis algorithms in GIS software (ArcGIS Pro, version 3.0.1). |
| | | Aspect | |
| | | Curvature | |
| | | Plan curvature | |
| | | Profile curvature | |
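The five dual-polarization indices listed in Table 1 for Sentinel-1 (VV, VH) and PALSAR (HH, HV) are simple arithmetic combinations of the two backscatter channels. The following is a minimal sketch of that arithmetic for a single pixel, not the authors' processing code. The function name `dual_pol_indices` and the dictionary keys are hypothetical, and the sketch assumes backscatter in linear power units, which the table does not state.

```python
from typing import Dict


def dual_pol_indices(co: float, cross: float) -> Dict[str, float]:
    """Compute the five dual-polarization indices of Table 1 for one pixel.

    `co` is the co-polarized backscatter (VV for Sentinel-1, HH for PALSAR)
    and `cross` is the cross-polarized backscatter (VH or HV).
    Assumption: both values are in linear power units and strictly positive.
    """
    if co <= 0 or cross <= 0:
        raise ValueError("linear backscatter values must be positive")
    return {
        "sum": co + cross,                                 # VV + VH
        "difference": co - cross,                          # VV - VH
        "ratio_co_cross": co / cross,                      # VV/VH
        "ratio_cross_co": cross / co,                      # VH/VV
        "normalized_difference": (co - cross) / (co + cross),
    }


# Illustrative values only; they are not taken from the study data.
indices = dual_pol_indices(co=0.20, cross=0.05)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

The same function covers both sensors because only the channel pairing changes: pass (VV, VH) for Sentinel-1 and (HH, HV) for PALSAR.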
Table 2. Comparison of forest AGC estimation models using different remote sensing data combinations.
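| Sensor combination | Number of features | R² | RMSE (tC/ha) | Selected features | R² (selected) | RMSE (selected, tC/ha) |
|---|---|---|---|---|---|---|
| S2 | 28 | 0.441 | 19.370 | 24 | 0.446 | 19.281 |
| S1 | 23 | 0.577 | 17.723 | 22 | 0.580 | 17.662 |
| PALSAR | 23 | 0.356 | 19.113 | 11 | 0.360 | 19.053 |
| SRTM | 6 | 0.342 | 24.017 | 4 | 0.361 | 21.696 |
| S2 + S1 | 51 | 0.601 | 16.361 | 42 | 0.610 | 16.190 |
| S2 + PALSAR | 51 | 0.494 | 19.217 | 28 | 0.503 | 19.063 |
| S2 + SRTM | 34 | 0.495 | 18.160 | 14 | 0.506 | 17.949 |
| S1 + PALSAR | 46 | 0.601 | 17.482 | 45 | 0.608 | 17.555 |
| S1 + SRTM | 29 | 0.615 | 16.897 | 22 | 0.630 | 16.569 |
| PALSAR + SRTM | 29 | 0.422 | 20.130 | 28 | 0.431 | 19.979 |
| S2 + S1 + PALSAR | 74 | 0.621 | 12.634 | 50 | 0.658 | 11.999 |
| S2 + S1 + SRTM | 57 | 0.640 | 13.373 | 37 | 0.655 | 13.090 |
| S2 + PALSAR + SRTM | 57 | 0.499 | 19.298 | 41 | 0.520 | 18.887 |
| S1 + PALSAR + SRTM | 52 | 0.616 | 14.067 | 51 | 0.632 | 13.613 |
| ALL | 80 | 0.647 | 12.191 | 21 | 0.659 | 11.980 |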
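To make the effect of feature selection in Table 2 easier to read, the relative change in RMSE between the two halves of each row can be computed directly. The sketch below does this for a few rows copied from the table. It assumes that the first R2/RMSE pair refers to models built on all features and the second pair to models built on the selected features, and that RMSE is in tC/ha as in the abstract. The helper name `rmse_change_percent` is hypothetical.

```python
# (combination, RMSE before selection, RMSE after selection), in tC/ha,
# copied from Table 2. The before/after reading of the columns is an
# assumption of this sketch.
rows = [
    ("S2", 19.370, 19.281),
    ("S1", 17.723, 17.662),
    ("S1 + PALSAR", 17.482, 17.555),
    ("S2 + S1 + PALSAR", 12.634, 11.999),
    ("ALL", 12.191, 11.980),
]


def rmse_change_percent(before: float, after: float) -> float:
    """Relative RMSE change in percent; negative means the error went down."""
    return (after - before) / before * 100.0


changes = {name: round(rmse_change_percent(b, a), 2) for name, b, a in rows}

# Combinations where RMSE increased after feature selection.
worse = [name for name, pct in changes.items() if pct > 0]

for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}%")
print("RMSE increased for:", worse)
```

Among the rows included here, S1 + PALSAR is the only combination whose RMSE rises after selection (17.482 to 17.555 tC/ha), even though its R² improves slightly.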
Table 3. Summary of Key Variables from Multi-Source Datasets and Their Relationship with AGC.
| Data Source | Key Variables | Importance/Feature Role | Typical Forest Types | Correlation with AGC | Notes/Interpretation |
|---|---|---|---|---|---|
| Sentinel-1 | Entropy (VH), VH-VV, second moment (VH) | High: captures structure and texture | Masson pine | Moderate to high | Key radar variables for structure; performs well in high-biomass and coniferous areas |
| Sentinel-2 | NBR, REIP, B12 | High: spectral condition indicators | Eucalyptus | Moderate to high | Sensitive to canopy health and chlorophyll; good in low–medium biomass zones |
| PALSAR | Variance (HH), HH, mean (HV) | Medium: supplementary structure | Mixed forests | Low to moderate | Adds heterogeneity info, useful in mixed stands |
| SRTM | Profile_curvature, aspect | Low: topographic background | Mixed forests | Weak | Reflects terrain influence; steep areas may limit carbon accumulation |

Share and Cite

MDPI and ACS Style

Xu, W.; Mai, X.; Deng, S.; Wang, W.; Wu, W.; Zhang, W.; Wang, Y. High-Resolution Mapping and Impact Assessment of Forest Aboveground Carbon Stock in the Pinglu Canal Basin: A Multi-Sensor and Multi-Model Machine Learning Approach. Forests 2025, 16, 1130. https://doi.org/10.3390/f16071130
