1. Introduction
Forests are essential components of global ecosystems, providing critical functions such as carbon sequestration, biodiversity maintenance, and climate regulation. They support the survival and evolution of both human societies and other species. First, forests are characterized by high biodiversity, which sustains complex ecosystem structures and efficient nutrient cycles. Second, forests are regarded as the most valuable comprehensive resource reserve in nature, with key functions including carbon storage, gene preservation, resource reserve, water conservation, and energy accumulation. Finally, forests play an irreplaceable role in environmental protection: they improve the quality of the ecological environment and help maintain the dynamic balance of the global ecosystem, providing a fundamental environmental guarantee for the sustainable development of human society [1,2]. Forests are the main component of the terrestrial biosphere, playing a crucial role both in maintaining regional ecological environments and in contributing to the global carbon balance [3,4,5]. Forest stock volume (FSV) is the total stemwood volume of all living standing trees within a delineated forest area, usually expressed in volume units. It is an important indicator of the scale of forest resources and of timber production potential, as well as a key basis for evaluating the ecological functions of forests [6].
In the past, forest resource surveys relied on time-consuming and labor-intensive manual inventory methods. For example, the traditional measurement of forest stock volume involved measuring tree height, diameter at breast height (DBH), and other stand parameters, and then establishing a corresponding volume equation to calculate the forest stock volume. Common methods include the standard tree method and the volume table method. The standard tree method estimates forest stock volume from trees selected as representative of the stand. If the standard trees do not represent the entire forest, estimation errors result; in addition, a certain number of trees need to be felled, which damages the forest. Manual inventory is especially difficult in complex mountainous environments. Although optical remote sensing can achieve large-scale forestry monitoring, it captures only the features of the forest canopy surface and struggles to acquire vertical structure characteristics. Furthermore, mountainous terrain often experiences variable weather with frequent cloud cover, and data loss or degradation is especially likely during the rainy season. Compared to optical remote sensing, LiDAR technology directly measures the three-dimensional structure of forests, offering significant potential for forest resource management and ecosystem research. However, large-scale continuous observation is challenging with airborne or near-ground LiDAR platforms (e.g., stationary, mobile, and unmanned aerial vehicle (UAV) platforms). Airborne LiDAR offers high precision and efficiency, but complex mountainous terrain (steep slopes and deep valleys) poses high flight safety risks for UAVs, making some areas difficult to access.
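As a rough illustration of the traditional calculation described above, the following minimal sketch computes a plot-level stock volume with the standard tree method. The form-factor volume equation, the default form factor of 0.5, and the function names `tree_volume_m3` and `stock_volume_per_ha` are assumptions made for illustration only; operational inventories use species-specific volume tables or equations.

```python
import math


def tree_volume_m3(dbh_cm, height_m, form_factor=0.5):
    """Stem volume from DBH and height via a generic form-factor equation.

    The form factor (assumed 0.5 here) scales the cylinder defined by the
    basal area and tree height down to an approximate stem volume.
    """
    basal_area_m2 = math.pi * (dbh_cm / 100.0) ** 2 / 4.0
    return form_factor * basal_area_m2 * height_m


def stock_volume_per_ha(standard_trees, n_trees, plot_area_ha):
    """Standard tree method: mean standard-tree volume x tree count / area."""
    volumes = [tree_volume_m3(dbh, h) for dbh, h in standard_trees]
    mean_volume = sum(volumes) / len(volumes)
    return mean_volume * n_trees / plot_area_ha


# Hypothetical plot: two standard trees given as (DBH in cm, height in m),
# 60 trees in total on a 0.06 ha plot.
standard_trees = [(20.0, 15.0), (24.0, 17.0)]
fsv = stock_volume_per_ha(standard_trees, n_trees=60, plot_area_ha=0.06)
print(f"Estimated stock volume: {fsv:.1f} m3/hm2")
```

The sketch makes the weakness noted above visible: the whole estimate scales with the mean volume of a few standard trees, so unrepresentative standard trees bias the result directly.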
Synthetic Aperture Radar (SAR) enables all-weather, high-resolution observations with superior penetration capabilities, making it indispensable for resource exploration, environmental monitoring, and related geoscientific applications [7]. However, SAR produces pixel layover in steep slope areas, resulting in severe geometric distortion. Furthermore, topographic relief causes drastic variations in local incidence angles, so backscatter from identical land-cover types varies with slope. This necessitates DEM-based terrain correction, and residual errors after correction still affect interpretation accuracy. Spaceborne LiDAR refers to a LiDAR system mounted on a satellite or spacecraft. It offers advantages such as low operational costs, high spatiotemporal data resolution, and strong resistance to interference, and it can efficiently perform continuous and accurate monitoring of forest resources across regions at various spatiotemporal scales. It detects characteristic information of the ground or atmosphere by emitting and receiving laser pulses, generating three-dimensional point cloud data and high-precision terrain/surface elevation data. Its laser ranging is based on the time-of-flight principle and is unaffected by SAR-type geometric distortions such as layover and foreshortening, so accurate surface elevation data can still be acquired in steep slope areas. Compared with traditional ground surveys, spaceborne LiDAR delivers data of higher spatial resolution and accuracy, which improves the precision of forest stock volume quantification by overcoming key methodological constraints inherent to conventional techniques.
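The time-of-flight principle mentioned above can be written as a one-line relation: the range equals the speed of light multiplied by half the round-trip travel time of the pulse. The sketch below is a minimal illustration only; the 500 km altitude is an assumed round number rather than a mission specification, and atmospheric delay and pointing corrections are ignored.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0  # speed of light in vacuum


def range_from_round_trip(t_seconds):
    """Range to the target: the pulse travels out and back, hence the factor 1/2."""
    return SPEED_OF_LIGHT_M_S * t_seconds / 2.0


def round_trip_time(range_m):
    """Round-trip travel time of a pulse for a given one-way range."""
    return 2.0 * range_m / SPEED_OF_LIGHT_M_S


# Assumed illustrative altitude of 500 km (not a mission specification).
t = round_trip_time(500_000.0)
print(f"Round-trip time: {t * 1e3:.3f} ms")

# A timing uncertainty of 1 ns corresponds to about 0.15 m in range.
print(f"Range per nanosecond: {range_from_round_trip(1e-9):.3f} m")
```

Because the range depends only on travel time, the measurement is not subject to the side-looking geometry that causes layover and foreshortening in SAR imagery.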
Currently, there are four Earth observation missions globally equipped with LiDAR. The National Aeronautics and Space Administration (NASA) has launched three of them: ICESat-1 (Ice, Cloud, and Elevation Satellite-1), launched in 2003; ICESat-2 (Ice, Cloud and land Elevation Satellite-2), launched in 2018; and GEDI (Global Ecosystem Dynamics Investigation), launched in 2018. Additionally, China launched Gaofen-7 (GF-7) in 2019. ATLAS, the LiDAR payload aboard ICESat-2, employs micropulse multibeam photon-counting technology rather than full-waveform recording. This provides a reliable, high-precision data foundation for extracting forest structure. Its global coverage and short repeat observation cycle make it suitable for large-scale dynamic monitoring of forests, especially in areas lacking ground survey data. These advantages make it an important tool for forest resource investigation and ecological research, particularly for monitoring forests in large-scale and complex terrain regions, where it plays an irreplaceable role. Before the public release of ICESat-2 data, Neuenschwander et al. used simulated ICESat-2 data to explore the accuracy of the ATL08 algorithm in retrieving forest canopy height, which showed a significant correlation with airborne point cloud data. The root-mean-square error (RMSE) in the Alaska permafrost/taiga transition zone and Sonoma County, California, was less than 2 m [8]. Since NASA officially released the ICESat-2/ATLAS satellite data on 30 May 2019, researchers have systematically investigated the retrieval of forest structural parameters using this dataset [9]. Zhu effectively reduced the discrepancies between forest heights obtained from GEDI and ATLAS by constructing a forest height consistency model. By processing densified ICESat-2 point clouds alongside Sentinel-2 multispectral data, researchers created a continental-scale forest height product (30 m grid) with an RMSE of 2.67 m relative to ground truth plots [10]. The inversion performance of ICESat-2/ATLAS and GEDI differs in certain respects. Liu et al. demonstrated that ATLAS has higher accuracy than GEDI in topographic mapping, while GEDI performs better in canopy height mapping [11]. ATLAS and GEDI now serve as primary data sources for most forest canopy height inversion research. Concurrently, multi-source data integration techniques have been widely applied to produce high-resolution forest canopy height maps.
The use of ICESat-2/ATLAS data for forest structure parameter inversion holds significant application potential. However, how to improve the accuracy of forest stock volume inversion under large-scale, complex mountainous terrain conditions remains under-researched and requires in-depth exploration. LiDAR, since its inception in the 1960s, has been classified as an active remote sensing technology because it actively emits probing beams towards objects. It effectively acquires 3D information within the canopy and is increasingly used in disciplines such as surveying and forestry [12]. From the perspective of data sources, LiDAR addresses the saturation issues to which traditional optical and SAR remote sensing data are prone, demonstrating unique advantages in reflecting forest structure. However, LiDAR is susceptible to slope effects in areas with complex terrain. On steep slopes, the angle of incidence of the laser pulse varies greatly, resulting in an uneven distribution of the point cloud [13], and the steeper the slope, the more pronounced the laser footprint distortion [14]. Steeply sloping terrain may also cause multiple reflections of the laser beam, making the point cloud data noisy, and in extreme terrain (e.g., cliffs and deep valleys), the elevation error of LiDAR may increase significantly. The accuracies reported in the studies reviewed below refer to the degree of consistency between the predicted results and the ground survey data. Holmgren et al. used two regression models to estimate stock volume in sample plots with a radius of 10 m. The first model used the LiDAR-estimated tree height and canopy area as variables (coefficient of determination (R²) = 0.9, RMSE = 22% of the mean forest stock volume), while the second used the LiDAR-estimated tree height and number of trees as variables (R² = 0.82, RMSE = 26% of the mean forest stock volume). The study concluded that tree height and forest stock volume can be estimated at a high spatial resolution (10 m radius plots) in forested areas using LiDAR data in combination with ground sample surveys. The research site is located in southern Sweden, with flat terrain and elevations between 120 and 145 m above sea level, and the inversion model achieved high accuracy [15]. Pang Yong and colleagues established a model based on the upper quartile height (H75) of the height-normalized point cloud and field survey data, and then inverted the average stand height in the study area. A density-gradient experimental design was used to assess the effects of point cloud sparsity, with paired t-tests showing statistically significant differences (p < 0.01) in canopy height and DBH estimation across density regimes. High-density point clouds were found to enable individual tree segmentation based on the canopy height model (CHM), thus reducing the reliance on measured data. The study area comprised forest farms in Shandong and Chongqing, where the terrain is relatively gentle. The average accuracy in the Shandong study area reached 90.59%, indicating excellent results [16]. Dong et al. used ICESat-1/GLAS data to invert the maximum height of the forest canopy and developed an extrapolation model to estimate aboveground biomass at the regional scale. The results indicated that forest canopy top height and aboveground biomass could be estimated with reasonable accuracy, with an R² of 0.69 for the canopy top height. The study area was the Three Gorges Reservoir Area, situated at the junction of the second and third topographic steps, and the complex terrain in this area limited the inversion accuracy [17]. Wang et al. analyzed ICESat-1/GLAS full-waveform data to investigate the relationships among LiDAR-derived canopy height metrics, in situ measured tree heights, and terrain slope. Their study revealed significant slope-dependent biases (p < 0.01) in spaceborne LiDAR height measurements and established robust relationships (R² = 0.78) between median canopy height attributes and forest stand volume. Through waveform processing and statistical validation, the researchers quantified the impact of topographic variation on height estimation accuracy and developed predictive models linking remote sensing metrics to forest biomass parameters. The results showed that slope was well correlated with both the measured average tree height and the forest stock volume, that ICESat-1/GLAS data had potential for estimating forest stock volume, and that the error in LiDAR-estimated tree height increased with slope [18]. In 2013, Neigh et al. estimated the aboveground biomass and carbon storage of boreal forests using ICESat-1/GLAS data. The estimated total aboveground carbon was 38 ± 3.1 Pg. The study indicated that the slope calculation could affect the accuracy of the estimation [19]. In 2023, Liu et al. combined ICESat-2 and Sentinel-2A data to invert forest stock volume in Shangri-La. The results showed that, when the Sentinel-2A and ICESat-2/ATLAS variable sets were combined, the Random Forest method achieved the highest inversion accuracy, with an R² of 0.7034. The study area features typical complex terrain with significant undulations, and such a complex geographical environment reduces inversion accuracy [20].
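The slope sensitivity described in the studies above follows from simple geometry: on a uniform slope, the ground inside one laser footprint spans a vertical range of roughly the footprint diameter multiplied by the tangent of the slope angle, and this relief mixes with the canopy signal. The sketch below illustrates that relation under stated assumptions; the footprint diameters of 70 m and 12 m are illustrative values for a large and a small footprint, not sensor specifications, and the function name is hypothetical.

```python
import math


def ground_relief_within_footprint(footprint_diameter_m, slope_deg):
    """Vertical spread of the terrain inside a footprint on a uniform slope."""
    return footprint_diameter_m * math.tan(math.radians(slope_deg))


# Illustrative diameters only: a large footprint (70 m) and a small one (12 m).
for diameter in (70.0, 12.0):
    for slope in (5, 15, 30):
        relief = ground_relief_within_footprint(diameter, slope)
        print(f"diameter {diameter:>4.0f} m, slope {slope:>2d} deg: "
              f"ground relief {relief:5.1f} m")
```

Under this simplification, a smaller footprint reduces the terrain-induced spread at any given slope, which is one reason slope remains a key consideration when canopy heights are retrieved in mountainous areas.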
This study took ICESat-2/ATL08 as the primary data source and Jingdong Yi Autonomous County, Pu’er City, Yunnan Province, as the study area, and investigated forest stock volume inversion under complex terrain conditions using spaceborne lidar. We generated a spatial distribution map of forest stock volume in the study area to underpin forest resource management and monitoring, biomass and carbon storage quantification, ecosystem health assessment, and biodiversity evaluation. The core innovation of this study lies in its systematic evaluation of the performance and methodological applicability of spaceborne lidar (ICESat-2/ATLAS) for forest stock volume inversion in topographically complex mountainous regions. In response to the challenges posed by rugged terrain and limited remote sensing detection conditions in mountainous environments, this research adopts inversion algorithms suited to complex landscapes. The approach improves, to a certain degree, the monitoring accuracy of spaceborne lidar in highly heterogeneous forest areas. Furthermore, it offers a technical pathway for addressing the long-standing issues of “data scarcity” and “insufficient accuracy” in mountain forest resource surveys.
5. Discussion and Conclusions
5.1. Discussion
This paper focuses on the feasibility of using ICESat-2/ATLAS data for forest stock volume inversion in mountain forests. It adopts the traditional, relatively mature Random Forest model together with the widely used and fast LightGBM and XGBoost models for mountain forest stock volume modeling, and compares the modeling accuracy and inversion results of the three models. The findings demonstrate that the proposed methodology copes with the challenges posed by the study area’s rugged topography and extreme elevation variations and achieves accurate forest stock volume estimation. Specifically, effective forest stock volume inversion can be carried out for the forested areas in the study area based on ICESat-2/ATLAS data alone: the inversion model is constructed by extracting multidimensional feature parameters from the data, and the feasibility of single-data inversion of stock volume is systematically evaluated. The three machine learning regression methods all have high inversion accuracy and strong nonlinear fitting ability; the optimal model, XGBoost, reaches R² = 0.89 and rRMSE = 10.5912, which indicates that the method is clearly applicable to estimating and inverting forest stock volume and confirms the practical value of single-data inversion.
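For readers who wish to reproduce the two accuracy measures quoted above, the following minimal sketch computes R² and a relative RMSE from paired observed and predicted stock volumes. It assumes that rRMSE is defined as the RMSE divided by the mean observed value and expressed as a percentage, which is a common convention but is not stated explicitly in the text; the sample values are made up for illustration.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rrmse_percent(observed, predicted):
    """Relative RMSE: RMSE as a percentage of the mean observed value (assumed definition)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100.0 * rmse / (sum(observed) / n)


# Hypothetical plot-level stock volumes in m3/hm2 (not data from this study).
observed = [100.0, 150.0, 200.0]
predicted = [110.0, 140.0, 205.0]
print(f"R2 = {r_squared(observed, predicted):.3f}")
print(f"rRMSE = {rrmse_percent(observed, predicted):.2f}%")
```

Applying the same two functions to the validation predictions of each model gives a like-for-like basis for comparing Random Forest, LightGBM, and XGBoost.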
However, this study has not yet systematically analyzed how different terrain conditions and forest types affect the forest stock volume inversion model. In addition, the current inversion results are still biased by the temporal mismatch between datasets and by shadow interference in mountain imagery. The following issues merit discussion and will be the focus of future research:
(1) Reliability and uncertainty analysis of the results: The average forest stock volume obtained from the inversion with the optimal model, XGBoost, is 141.00 m³/hm². The mean stock volume of the 212 angle-gauge control plots from the 2016 National Forest Resources Planning and Design Survey is 139.3 m³/hm², with a standard deviation (SD) of 69.5 m³/hm². The absolute deviation of the inverted mean from the plot mean (1.7 m³/hm²) is therefore far smaller than the SD of the plot data, and the coefficient of determination of the model (R²) reached 0.89, indicating that 89% of the spatial variability of the stock volume is explained and that the model precision is acceptable within the natural variability of the sample plots. These quantitative indicators show that the complex mountain forest stock volume inversion model constructed in this study has high inversion accuracy and that the results are credible. However, errors remain: ① The overall stock volume obtained from the inversion is overestimated to a certain extent. This is strongly affected by the complex terrain, for example through the inclusion of bare soil and rocky patches within forest areas on fragmented terrain and through the larger canopy height prediction errors caused by photon signals returned from slopes. ② The forest inventory data used in this study were measured in the field during the 2016 survey campaign, whereas the spaceborne lidar data were collected in 2019. Over this three-year gap, the natural growth of the dominant tree species in the study area continuously increased the forest stock volume, which to some extent made inversion more difficult and reduced model accuracy.
③ The small number of sample plots may not fully cover the environmental gradients in the study area, making it difficult for the model to learn the true ecological relationships; with small samples, the variance of parameter estimates increases and model stability decreases.
(2) Influence of feature variable selection on the inversion results: This research compares three ensemble learning algorithms, XGBoost, LightGBM, and Random Forest (RF), under the same validation protocol. Each model is built using its own feature importance ranking, so that differences in how the algorithms assess feature importance reveal different aspects of the relationship between the features and the target. In addition, before modeling, the footprint parameters that the existing literature suggests contribute most to forest stock volume were screened for spatial interpolation by sequential Gaussian conditional simulation. Eliminating parameters with low spatial interpolation accuracy and retaining variables that truly reflect geospatial autocorrelation can further improve modeling accuracy.
(3) Effects of spatial interpolation on footprint parameters: Sequential Gaussian conditional simulation (SGCS), as a geostatistical method, shows unique advantages in the spatial inversion of forest stock volume, but it also raises several issues worth exploring. In this study, we used SGCS to integrate spaceborne LiDAR data with auxiliary variables for forest stock volume mapping. Through spatial correlation modeling driven by the variogram, SGCS is suitable for simulating forest stock volume in mountainous areas with complex topography. Ordinary kriging was tried in the pre-experimental stage, but its interpolation results were unsatisfactory, so SGCS was adopted for the spatial interpolation of spaceborne LiDAR footprints in the complex mountainous environment. This choice offers advantages in accuracy, efficiency, and reliability for the spatial analysis of forest resources; it avoids or minimizes the banding effect introduced by spaceborne LiDAR data and improves spatial interpolation accuracy. Future research will adopt higher-precision spatial interpolation methods that better match complex terrain in order to optimize the forest stock volume model.
(4) Impact of a single data source on the inversion results: In contrast to existing studies that combine multi-source data, this research relies on a single spaceborne LiDAR data source. Multi-source fusion methods can improve accuracy, but they face inherent limitations such as optical data saturation and the propagation of prediction errors. This study instead bases its analysis only on the 2019 ICESat-2/ATLAS lidar photon data, avoiding the accumulation of errors caused by multi-source data fusion. Future studies will explore how to control the errors that may arise from introducing optical data while maintaining the advantages of a single data source.
(5) The contribution of spaceborne lidar to reducing the workload of ground data collection is not yet clear: Because this study uses angle-gauge control plot data as the reference values, quantitative comparisons with different numbers of plots have not yet been conducted. Only the performance of different machine learning models was compared, using the same 212 plots, and this fixed sample makes it impossible to quantify the potential for sample reduction or to show whether the same performance can be achieved with less data. Future research will explore how to establish a forest stock volume model with fewer plots to reduce the workload of ground surveys.
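The variogram that drives the sequential Gaussian conditional simulation discussed in item (3) can be illustrated with a minimal sketch of the empirical semivariogram, which measures how dissimilar a footprint parameter becomes as the separation between footprints grows. This is not the SGCS procedure itself and not the implementation used in this study; the function name, the binning scheme, and the sample values are assumptions for illustration.

```python
import math


def empirical_semivariogram(points, values, bin_width, n_bins):
    """Semivariance per distance bin: gamma(h) = sum((z_i - z_j)^2) / (2 * N(h)).

    Bin k collects point pairs whose separation lies in
    [k * bin_width, (k + 1) * bin_width). Empty bins return None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            distance = math.dist(points[i], points[j])
            k = int(distance // bin_width)
            if k < n_bins:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]


# Hypothetical footprint locations (m) and a canopy height parameter (m).
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
values = [1.0, 2.0, 4.0]
print(empirical_semivariogram(points, values, bin_width=1.0, n_bins=3))
```

A semivariance that rises with distance and then levels off indicates spatial autocorrelation, which is the property that item (2) uses as a criterion for retaining a parameter for interpolation.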
5.2. Conclusions
In this study, forest stock volume was estimated through remote sensing inversion in a topographically complex mountainous region characterized by significant elevation variations and rugged terrain, with the aim of addressing the insufficient accuracy of stock volume inversion in highly heterogeneous surface environments. Taking Jingdong Yi Autonomous County in Yunnan Province, which has a large elevation gradient and high terrain fragmentation, as the study area, 13 parameters were extracted from the ICESat-2 footprints at 212 angle-gauge control plots within the forested part of the study area, and 10 parameters were selected for spatial interpolation by sequential Gaussian conditional simulation. Random Forest, LightGBM, and XGBoost models of forest stock volume were constructed. The XGBoost model, built with parameters extracted from ICESat-2/ATLAS data after photon point cloud denoising and classification, had the highest inversion accuracy (R² = 0.89) and was used to invert the forest stock volume in the study area. The average forest stock volume was 141.00 m³/hm², the maximum was 338.22 m³/hm², and the minimum was 30.41 m³/hm²; a complete spatial distribution map of stock volume was produced for the forested landscape. This study demonstrates that lidar technology provides forest parameter estimates with accuracy comparable to ground surveys. Systematic validation revealed high consistency (R² = 0.89, rRMSE = 10.5912) between ICESat-2/ATLAS-derived forest stock volume and field measurements. The proposed methodology offers a practical solution for forest stock volume estimation in complex mountainous terrain, particularly in areas with steep topography and limited access.
These findings confirm the reliability of spaceborne lidar for large-scale forest monitoring and support the development of GIS-based forest management systems. The combined advantages of operational efficiency and measurement accuracy suggest strong potential for forestry applications.