1. Introduction
Forest height is a key parameter characterizing forest vertical structure and ecosystem function, playing a vital role in sustainable forest management, carbon stock estimation, stand structure analysis, and global carbon cycle research. Precise measurement and monitoring of forest height provide essential support for evaluating forest ecosystem services, advancing sustainable development, and addressing climate change [1,2,3,4]. Traditional forest height measurement primarily relies on ground plot surveys, which, despite high accuracy, suffer from low efficiency, high costs, and limited coverage, making them unsuitable for large-scale, high-timeliness monitoring requirements [5]. Therefore, achieving accurate large-scale forest height inversion using remote sensing technology has become a major research focus in forestry remote sensing [6,7].
Lidar (Light Detection and Ranging) can measure the canopy directly and reveal its vertical structure, which makes it well suited to accurate forest height estimation [8,9,10]. Early studies such as Lefsky et al. [11] demonstrated that forest height can be retrieved from Geoscience Laser Altimeter System (GLAS) data. Simard et al. [12] produced a global forest height map by combining GLAS, Shuttle Radar Topography Mission (SRTM), and Moderate Resolution Imaging Spectroradiometer (MODIS) data. More recently, the photon-counting lidar aboard the Ice, Cloud, and land Elevation Satellite-2 (ICESat-2) has greatly benefited forest height retrieval owing to its superior spatial resolution and signal-to-noise ratio [13,14,15,16]. Nevertheless, because lidar samples the surface only sparsely, it must still be combined with other remote sensing data for use at the regional scale [17,18].
Multi-source remote sensing data fusion provides a robust foundation for forest height modeling by leveraging the complementary strengths of different sensors [19,20]. For instance, synthetic aperture radar (SAR) data (e.g., Sentinel-1) are sensitive to the three-dimensional structure of forests and provide all-weather observation capabilities [21]; optical data (e.g., Sentinel-2) offer rich spectral information for identifying vegetation types and canopy conditions; and SRTM terrain data help explain how topography affects height distribution. Studies have shown that using optical and radar data together greatly improves the accuracy of forest parameter estimation [22,23], and adding lidar samples further increases the generalizability of the model to other kinds of forests [24]. As a result, the feature basis for forest height modeling is more extensive and robust when multi-source remote sensing fusion is used [25,26,27].
Multi-source fusion nevertheless faces several obstacles. Differences in spatial and temporal resolution among remote sensing datasets complicate data registration and fusion [28,29]. In addition, optical, radar, and lidar data each require different processing approaches, which makes it technically challenging to integrate heterogeneous data efficiently and to exploit their full information potential [30,31]. Data quality issues such as cloud cover, atmospheric conditions, and noise interference may also affect model performance and stability, especially in complex environments [32].
Regarding modeling approaches, traditional linear regression struggles to capture the complex nonlinear relationships among remote sensing features. In recent years, machine learning methods have gained prominence due to their advantage in handling high-dimensional data. Random Forest (RF) is widely applied in forest parameter estimation for its stability and interpretability [33,34]; Extreme Gradient Boosting (XGBoost) efficiently processes large-scale data through its gradient boosting framework [35]; Light Gradient Boosting Machine (LightGBM) further optimizes training speed and memory usage, making it suitable for modeling high-dimensional remote sensing features [36]. Research consistently demonstrates that these ensemble learning methods outperform traditional models in forest biomass and height inversion. It should be noted that relying solely on remote sensing data without effective machine learning methods makes it difficult to fully explore the complex relationships between features. Conversely, relying solely on machine learning models without multi-source data support makes it difficult for the model to accurately capture the spatial heterogeneity of forest structure.
Compared to previous studies, this work makes several distinct contributions: First, it systematically benchmarks three distinct multi-source data fusion schemes (incorporating Sentinel-1, Sentinel-2, ICESat-2, and SRTM) for forest height inversion, explicitly delineating their complementary characteristics in complex terrain. Second, under a unified geographical framework, it provides an empirical comparison of three advanced ensemble learning algorithms (LightGBM, XGBoost, and RF), offering practical guidance for model selection. Furthermore, and of significant practical value, this study pioneers the exploration of feasible inversion pathways in the absence of ICESat-2 data, thereby proposing a viable and scalable alternative for large-scale operational monitoring.
2. Materials and Methods
2.1. Technical Approach
The research pipeline comprises five primary stages, as shown in Figure 1. First, forest inventory data and multi-source remote sensing data from Sentinel-1, Sentinel-2, ICESat-2, and SRTM were collected and organized. Second, all remote sensing datasets were preprocessed using noise reduction, atmospheric correction, geometric correction, and radiometric calibration. Third, feature variables were systematically extracted from the preprocessed data: spectral bands, vegetation indices, and texture features from Sentinel-2; altimetry variables from ICESat-2; elevation and terrain metrics from SRTM; and backscatter coefficients and interferometric coherence from Sentinel-1. Fourth, forest height inversion experiments with various data combinations were carried out using three machine learning models: Random Forest, XGBoost, and LightGBM. Finally, the performance of each model and data combination was assessed by calculating the correlation coefficient (r), the root mean square error (RMSE), and the mean absolute error (MAE).
2.2. Overview of the Study Area
The study area is located in Shangri-La City, Diqing Tibetan Autonomous Prefecture, Yunnan Province, China (26°52′–28°52′ N, 99°20′–100°19′ E), near the southern border of the Qinghai–Tibet Plateau and within the core region of the Hengduan Mountains (Figure 2). The region is characterized by an alpine canyon landscape with steep slopes and elevations ranging from 1800 to 5500 m. The climate follows a plateau monsoon pattern shaped by the South Asian monsoon and the complex topography. Annual precipitation ranges from 600 to 800 mm, most of it falling between June and September, and temperatures range from 5 to 12 °C.
The region boasts high forest coverage, serving as a crucial natural forest distribution area in Southwest China. Primary vegetation types include Abies spp., Picea spp., Pinus yunnanensis, and Quercus spp. Distinct vertical forest zonation spans from sparse Yunnan pine forests at lower elevations to pristine fir forests at higher altitudes, forming a complex and diverse forest spatial structure. Due to its typical terrain–vegetation combinations and complete vertical zonation spectrum, this region has become an ideal area for forest remote sensing monitoring and ecosystem research.
2.3. Research Data
2.3.1. Spatial–Temporal Data Fusion Strategy
A systematic spatiotemporal fusion strategy was employed to harmonize the multi-source remote sensing data. All datasets were first resampled to a consistent 10-m spatial resolution and unified to the WGS84 global coordinate system to establish a common spatial baseline. Precise geometric correction was then applied to achieve sub-pixel co-registration (targeting registration error < 0.5 pixels). To mitigate phenological influences, the acquisition window for all remote sensing data was confined to the period of October through December 2018. Finally, to integrate the discrete ICESat-2 LiDAR data with the continuous satellite imagery, the median canopy height value of all ICESat-2 footprints within a 10-m radius buffer was assigned to each corresponding Sentinel pixel.
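The final step of this strategy, assigning each pixel the median canopy height of nearby ICESat-2 footprints, can be illustrated with a minimal sketch. It assumes that coordinates have already been converted to a projected system in metres so that a 10 m radius is meaningful; the function name, the brute-force search, and the sample values are hypothetical and are not taken from the study.

```python
from math import hypot
from statistics import median


def median_height_in_buffer(pixel_xy, footprints, radius_m=10.0):
    """Median canopy height of the footprints within radius_m of a pixel centre.

    pixel_xy   -- (x, y) pixel centre in metres (assumed projected coordinates)
    footprints -- iterable of (x, y, canopy_height_m) tuples
    Returns None when no footprint falls inside the buffer.
    """
    px, py = pixel_xy
    heights = [h for x, y, h in footprints if hypot(x - px, y - py) <= radius_m]
    return median(heights) if heights else None


# Hypothetical footprints: three lie inside the buffer, one lies far away.
footprints = [
    (0.0, 0.0, 12.0),
    (3.0, 4.0, 18.0),
    (6.0, 6.0, 15.0),
    (50.0, 50.0, 30.0),
]
pixel_height = median_height_in_buffer((0.0, 0.0), footprints)
```

For a full scene, a spatial index would normally replace the linear scan; the scan is kept here only to make the buffer logic explicit.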
2.3.2. Sentinel-1 Data
This study used Sentinel-1 (S-1) data supplied by the European Space Agency (ESA). The C-band SAR data comprise Ground Range Detected (GRD) and Single Look Complex (SLC) products acquired in Interferometric Wide Swath (IW) mode. The GRD products were used to extract backscatter coefficients and to compute derived radar indices. The SLC products were used to characterize the temporal stability of surface targets through interferometric coherence features computed from bi-temporal acquisitions.
In total, eight features were extracted from the S-1 data, including two backscatter coefficients, two interferometric coherence features, and four derived indices (see Table 1). The entire processing workflow was performed using ESA SNAP software (version 12.0.0) and corresponding plugins. The sampling time is shown in Table 2.
2.3.3. Sentinel-2 Data
This study used the European Space Agency’s Sentinel-2 (S-2) Level-2A surface reflectance products. Six images acquired between October and December 2018 were selected. After all bands were resampled to a spatial resolution of 10 m, 28 predictor variables were derived and organized into three groups. First, 12 spectral bands were used as the primary spectral features: B1, B2, B3, B4, B5, B6, B7, B8, B8A, B9, B11, and B12. Second, six vegetation indices were computed using the biophysical processor of the SNAP software: the Normalized Difference Vegetation Index (NDVI), defined as (B8 − B4)/(B8 + B4), the Fraction of Absorbed Photosynthetically Active Radiation (FAPAR), the fraction of vegetation cover (FCOVER), the leaf area index (LAI), the chlorophyll content (CAB), and the canopy water content (CW). Third, ten texture features were generated from the NDVI layer using a gray-level co-occurrence matrix (GLCM) with a 5 × 5 window (see Table 3).
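As a small illustration of the NDVI definition given above, the following sketch evaluates (B8 − B4)/(B8 + B4) pixel by pixel. The reflectance values are hypothetical, and returning None for a zero denominator is an assumption made for this example rather than a documented behaviour of the SNAP software.

```python
def ndvi(b8, b4):
    """NDVI = (B8 - B4) / (B8 + B4) for one pixel.

    b8 -- near-infrared reflectance (Sentinel-2 band B8)
    b4 -- red reflectance (Sentinel-2 band B4)
    Returns None when the denominator is zero (assumed no-data handling).
    """
    denom = b8 + b4
    return (b8 - b4) / denom if denom != 0 else None


# Hypothetical reflectances: dense vegetation, bare surface, no-data pixel.
nir = [0.40, 0.30, 0.0]
red = [0.10, 0.30, 0.0]
ndvi_values = [ndvi(n, r) for n, r in zip(nir, red)]
```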
2.3.4. ICESat-2 Data
Forest height data were derived from ICESat-2 ATL08 Version 5 data (hereafter ICE-2) using the PhoREAL_v3.26 program [37]. The ATL03 global geolocated photon data and the ATL08 land and vegetation height data, acquired at the same time, were input together during processing. The parameters “Terrain Best Fit (m)” and “Max Canopy (m)” were extracted, and the forest canopy height observed directly by ICESat-2 was calculated as the difference between the two. Using this height value as an independent variable yielded 12,047 valid laser footprints. These footprints were evenly distributed across the study area and provided reliable vertical structural references for forest height inversion. Figure 3 shows the spatial distribution of the footprints within the study area, and Table 4 shows the sampling time.
2.3.5. SRTM Data
This study used the digital elevation model (DEM) produced by the Shuttle Radar Topography Mission (SRTM), a joint project of NASA and the National Geospatial-Intelligence Agency (NGA). With a spatial resolution of approximately 30 m, the dataset describes the topographic relief of the study area [38]. During data processing, two key topographic factors, slope and aspect, were derived from the DEM using GIS terrain analysis tools. SRTM elevation (DEM), slope, and aspect were then incorporated into the forest height inversion model as independent variables describing terrain characteristics. This approach mitigates terrain-induced interference in remote sensing signals, thereby enhancing inversion accuracy.
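The derivation of slope and aspect from a DEM can be sketched as follows. This is only an illustration using plain central differences; GIS packages often use other estimators (for example Horn's method), and the study does not state which one was applied. The grid orientation (row 0 is north, columns run west to east), the aspect convention (downslope direction in degrees clockwise from north), and the function name are assumptions.

```python
from math import atan, atan2, degrees, hypot


def slope_aspect(dem, row, col, cell_size_m=30.0):
    """Slope and aspect (both in degrees) at an interior DEM cell.

    dem is a list of rows ordered north to south; values are elevations in metres.
    Aspect is the downslope direction, clockwise from north; None on flat cells.
    """
    # Elevation change per metre towards the east and towards the north.
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2 * cell_size_m)
    dz_dy = (dem[row - 1][col] - dem[row + 1][col]) / (2 * cell_size_m)
    slope = degrees(atan(hypot(dz_dx, dz_dy)))
    if dz_dx == 0 and dz_dy == 0:
        return slope, None
    # The downslope vector is the negative gradient (east, north components).
    aspect = degrees(atan2(-dz_dx, -dz_dy)) % 360.0
    return slope, aspect


# Hypothetical plane rising 30 m per 30 m cell towards the east.
dem = [[0.0, 30.0, 60.0] for _ in range(3)]
slope_deg, aspect_deg = slope_aspect(dem, 1, 1)
```

On this synthetic plane the surface descends towards the west, so the sketch reports a west-facing aspect and a 45 degree slope.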
2.3.6. Ground Sample Data
This study uses the 2016 Shangri-La Second-class Category Resource Survey data (hereafter SL-SCRS) as ground truth to assess the performance of forest canopy height estimates generated by integrating ICE-2, S-1, S-2, and SRTM remote sensing data. The dataset contains 12,047 valid sample points, which cover the primary forest types and topographic gradients in the region and are well-distributed spatially, thus providing a good representation of the regional forest structural variation. All samples were randomly split into a training set (8433 samples) and a test set (3614 samples) using a 7/3 ratio, ensuring no statistically significant differences in canopy height distribution or spatial location between them. Professionally collected through field plot measurements, the SL-SCRS data encompass key forest characteristics—including forest type, stand structure, and mean top height—equipping it to reliably benchmark the results derived from remote sensing.
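The random 7/3 partition described above can be reproduced in outline with the sketch below. The seed value and the function name are assumptions introduced for reproducibility of the example; the study does not report how the split was seeded, and the sketch does not include the check for distributional differences between the two sets.

```python
import random


def split_indices(n_samples, train_fraction=0.7, seed=42):
    """Randomly split sample indices into training and test subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is a hypothetical choice
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


# 12,047 samples at a 7/3 ratio give the set sizes reported in the text.
train_idx, test_idx = split_indices(12047)
```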
To enable a systematic evaluation of inversion accuracy, the validated canopy heights from the SL-SCRS dataset were used as a reference. Statistical metrics were then employed to compare this ground truth against the ICESat-2 ATL08 product as well as the heights estimated from the integrated ICE-2, S-1, S-2, and SRTM data.
2.4. Modeling Approach
This study models forest height retrieval from multi-source remote sensing data using three state-of-the-art ensemble learning algorithms: LightGBM, XGBoost, and Random Forest (RF). All three use decision trees as base learners, so they can capture the intricate nonlinear relationships between the features and forest height, but they differ in their tree growth strategies and optimization objectives.
LightGBM improves training efficiency by combining a histogram-based algorithm and a leaf-wise growth strategy with Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB). The model parameters were tuned to balance accuracy and efficiency; the main settings were a feature fraction of 0.8, 31 leaves, and a minimum of 20 samples per leaf node (min_data_in_leaf). Feature importance is determined from the information gain and from the number of times a feature is used as a split point across all trees. The model is well suited to high-dimensional features and large datasets.
XGBoost builds decision trees within a gradient boosting framework by iteratively minimizing a regularized objective function. The learning rate, maximum depth, and subsample ratio were set to 0.1, 6, and 0.8, respectively. Feature importance takes into account both how often a feature is used across all trees and the reduction in the loss function that it provides. The method is effective at modeling complex nonlinear interactions.
Random Forest builds many decision trees with a bagging ensemble technique based on bootstrap sampling and random feature selection. The parameters controlling model complexity and diversity were n_estimators = 100, max_features = sqrt, and random_state = 42. Feature importance is obtained by summing the decrease in node impurity that each feature contributes across all trees. The method trains stably and is robust to overfitting.
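For reference, the hyperparameters reported in this section can be collected as keyword dictionaries, as in the sketch below. The study does not name the software packages used, so the mapping to the Python `lightgbm`, `xgboost`, and `scikit-learn` regressors is an assumption, as are the dictionary and function names. The imports are deferred so that the settings can be inspected without those packages installed.

```python
# Settings as reported in the text; all other parameters are left at library defaults.
LIGHTGBM_PARAMS = {"feature_fraction": 0.8, "num_leaves": 31, "min_data_in_leaf": 20}
XGBOOST_PARAMS = {"learning_rate": 0.1, "max_depth": 6, "subsample": 0.8}
RF_PARAMS = {"n_estimators": 100, "max_features": "sqrt", "random_state": 42}


def build_models():
    """Instantiate the three regressors (assumes lightgbm, xgboost, scikit-learn)."""
    from lightgbm import LGBMRegressor
    from sklearn.ensemble import RandomForestRegressor
    from xgboost import XGBRegressor

    return {
        "LightGBM": LGBMRegressor(**LIGHTGBM_PARAMS),
        "XGBoost": XGBRegressor(**XGBOOST_PARAMS),
        "RF": RandomForestRegressor(**RF_PARAMS),
    }
```

Each returned estimator follows the usual `fit(X, y)` and `predict(X)` interface, so the same training loop could serve all three models.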
2.5. Evaluation of Model Accuracy
To systematically evaluate the accuracy of the forest height inversion results, this study employs three widely used quantitative metrics: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Pearson’s Correlation Coefficient (r). In the definitions below, $y_i$ is the observed forest height of sample $i$, $\hat{y}_i$ is the predicted height, $\bar{y}$ and $\bar{\hat{y}}$ are the respective means, and $n$ is the number of samples.
- 1.
Root Mean Square Error (RMSE)
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}$$
The RMSE evaluates the overall deviation between the predicted and observed values; a lower RMSE indicates a more accurate model.
- 2.
Mean Absolute Error (MAE)
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|$$
The MAE measures the average absolute discrepancy between the predicted and observed values and gives a direct indication of how far off the model is. Performance improves as the value decreases.
- 3.
Correlation Coefficient (r)
$$r = \frac{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)\left(\hat{y}_i - \bar{\hat{y}}\right)}{\sqrt{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}\,\sqrt{\sum_{i=1}^{n}\left(\hat{y}_i - \bar{\hat{y}}\right)^2}}$$
This statistic, which takes values between −1 and 1, represents the linear relationship between the predicted and observed values. Values closer to 1 indicate a stronger positive correlation.
Each of the three metrics assesses model inversion performance from a different angle: r primarily indicates model fit, while RMSE and MAE reflect error magnitude. Using these metrics collectively enables a more comprehensive comparison of different model-data combinations in forest height inversion.
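The three metrics can be computed with a few lines of standard-library Python, as in the sketch below. It follows the usual textbook definitions of RMSE, MAE, and Pearson's r; the function names and the sample heights are hypothetical and serve only to show the calculation.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    n = len(observed)
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / n


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_o * var_p)


# Hypothetical forest heights in metres.
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
scores = {
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
    "r": pearson_r(observed, predicted),
}
```

Note that `pearson_r` is undefined when either series is constant, because the denominator becomes zero; a production version would need to handle that case.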
3. Results
3.1. Evaluation of Forest Height Retrieval Results Under Different Data Combinations
To systematically evaluate the performance of integrated satellite data in estimating forest height, this study used SL-SCRS ground survey data as the dependent variable and established three data combination schemes: DC1 (S-1 + S-2 + ICE-2 + SRTM), DC2 (S-2 + ICE-2 + SRTM), and DC3 (S-1 + ICE-2 + SRTM). Three ensemble algorithms—LightGBM, XGBoost, and Random Forest (RF)—were employed for modeling to comprehensively evaluate each data combination’s performance in forest height inversion.
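The experimental design above, three data combinations crossed with three algorithms, can be written down as a small configuration sketch. The column names are hypothetical and abbreviated (the full feature lists are given in Tables 1 and 3), and the helper function is an illustration rather than the authors' code.

```python
from itertools import product

DATA_COMBINATIONS = {
    "DC1": ("S-1", "S-2", "ICE-2", "SRTM"),
    "DC2": ("S-2", "ICE-2", "SRTM"),
    "DC3": ("S-1", "ICE-2", "SRTM"),
}
MODELS = ("LightGBM", "XGBoost", "RF")


def columns_for(combination, columns_by_source):
    """Concatenate the feature columns of every source in a data combination."""
    return [
        column
        for source in DATA_COMBINATIONS[combination]
        for column in columns_by_source[source]
    ]


# Hypothetical, abbreviated column names for each data source.
columns_by_source = {
    "S-1": ["VV", "VH", "coh_VV", "coh_VH"],
    "S-2": ["B11", "NDVI", "CW"],
    "ICE-2": ["canopy_height"],
    "SRTM": ["DEM", "slope", "aspect"],
}

# Every (data combination, model) pair to be trained and evaluated.
experiments = list(product(DATA_COMBINATIONS, MODELS))
dc3_columns = columns_for("DC3", columns_by_source)
```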
Figure 4 displays the scatter point density distributions corresponding to different data combinations and algorithms. Overall, scatter points for all combinations clustered near the 1:1 line, indicating that every data combination possesses a certain level of forest height inversion capability. Notably, the DC1 combination achieved the optimal accuracy metrics under the LightGBM model (r = 0.72, RMSE = 5.52 m). Its scatter points exhibited good fitting consistency across both high and low value ranges, with the smallest systematic bias. This superior performance can be attributed to the comprehensive feature set provided by the full integration of radar, optical, lidar, and topographic data, which collectively capture the multi-faceted characteristics of the forest canopy. Although the scatter point distributions of DC2 and DC3 are similar to DC1 in range, quantitative accuracy assessments confirm that the DC1 combination, which integrates multi-source features, holds an advantage in inversion performance. These findings attest to the value of a multi-sensor fusion approach for achieving superior accuracy in forest canopy height retrieval.
Table 5 presents the accuracy evaluation results for the different data combinations and models. The DC1 combination yielded the best inversion performance across all three machine learning models, particularly with the LightGBM algorithm, which delivered the highest accuracy (r = 0.72, RMSE = 5.52 m, MAE = 4.082 m). XGBoost and Random Forest performed slightly worse with the DC1 combination, though their accuracy metrics (XGBoost: r = 0.716, RMSE = 5.551 m; RF: r = 0.706, RMSE = 5.629 m) remained marginally superior to those of the DC2 and DC3 combinations.
In a cross-comparison of the data combinations, DC2 (S-2 + ICE-2 + SRTM) outperformed DC3 but fell short of DC1. Within DC2, the LightGBM model likewise attained the greatest accuracy, with an r of 0.714 and an RMSE of 5.559 m. DC3 (S-1 + ICE-2 + SRTM) had the worst accuracy metrics of the three, which underlines the importance of multi-source data fusion for improving inversion performance.
LightGBM performs better in the DC1 combination because its Gradient-based One-Side Sampling technique and leaf-wise growth strategy allow it to process high-dimensional features efficiently and to make full use of the complementary information among optical, radar, and laser altimetry features. The synergistic use of data from multiple remote sensing sources enhances the accuracy of forest height inversion. The best performance is achieved when the DC1 dataset is used with the LightGBM model, which offers a reliable technical approach for accurate monitoring of forest characteristics at the regional scale.
3.2. Feature Importance Analysis
Based on the feature importance analysis results across different models using three data combinations (black bars in Figure 5 represent the top 90% cumulative contribution rate of feature variables), the following findings emerge: The ICE-2 canopy height feature and DEM maintain high importance across all data combinations and models, highlighting the core role and irreplaceability of laser altimetry and terrain data in forest height inversion.
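The 90% cumulative contribution threshold mentioned above can be applied as in the following sketch, which ranks features by importance and keeps the smallest leading set whose normalized importances sum to at least the threshold. The importance values and feature names are hypothetical, and the exact rule used to draw Figure 5 is not stated in the text, so this is one plausible reading.

```python
def top_cumulative_features(importances, threshold=0.90):
    """Smallest set of top-ranked features reaching the cumulative threshold.

    importances -- dict mapping feature name to a non-negative importance score
    threshold   -- cumulative share of total importance to reach (0 to 1)
    """
    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, value in ranked:
        selected.append(name)
        cumulative += value / total
        if cumulative >= threshold:
            break
    return selected


# Hypothetical importance scores (arbitrary units).
importances = {
    "ICE2_height": 40,
    "DEM": 30,
    "B11": 15,
    "coh_VV": 10,
    "aspect": 5,
}
selected = top_cumulative_features(importances)
```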
The DC1 combination exhibits the most balanced feature importance distribution. Beyond ICE-2 and SRTM features, S-2’s shortwave infrared band (B11) ranked fourth in average importance, while vegetation texture (Variance) ranked seventh. S-1’s VV polarization coherence (coh_VV) also placed sixth. This multi-feature contribution pattern fully reflects the synergistic and complementary advantages of multi-source data fusion.
Comparative analysis reveals distinct complementary relationships among different data types. In the DC2 combination, the absence of radar data significantly elevates the importance of optical features, resulting in higher overall contribution from S-2 features compared to DC1. Concurrently, the importance of slope among terrain factors rises from fifth to fourth place, while DEM and aspect maintain their first and second positions respectively. This indicates terrain factors play a more critical compensatory role under data-constrained conditions.
The DC3 combination exhibits a unique pattern of feature importance distribution. In the absence of optical data, the contribution of S-1 radar features significantly increases. Among terrain factors, DEM, slope, and aspect rank as the top three in importance, reflecting a significant synergistic effect between radar data and terrain features.
Feature importance rankings vary across different machine learning models. LightGBM exhibits a more balanced feature importance distribution, achieving equilibrium across different data sources and avoiding over-reliance on any single feature. XGBoost and Random Forest (RF) show similar feature importance distributions, revealing comparable feature dependency patterns, particularly with high consistency in core feature rankings. This difference indicates that different models exhibit distinct sensitivities and utilization strategies for features when processing multi-source data.
Based on the results of feature importance analysis, this study identified five key feature categories that contribute significantly to forest height inversion: vegetation indices, S-2 bands, vegetation biophysical parameters, gray-level co-occurrence matrix (GLCM) textures, and S-1 interferometry. Spatial visualization and comparative analysis were conducted using SL-SCRS measured canopy height data (Figure 6a). The selected features include: the S-2 Normalized Difference Vegetation Index (NDVI) (Figure 6b), the shortwave infrared band B11 (Figure 6c), canopy water content (CW) (Figure 6d), the GLCMMean texture feature calculated from NDVI using GLCM (Figure 6e), and the S-1 VH polarization interferometric coherence (coh_VH) (Figure 6f).
Comparative analysis of representative areas revealed significant correlations between the spatial distributions of these features and forest height. Specifically, NDVI and its derived texture feature GLCMMean showed positive correlations with forest height, indicating enhanced vegetation greenness and canopy spatial heterogeneity with increasing tree height. CW similarly exhibited a positive correlation, reflecting rising canopy moisture content with increasing forest height. Conversely, B11 reflectance and the VH polarization interferometric coherence coh_VH showed negative correlations with forest height, reflecting the response of the shortwave infrared band to vegetation biomass and the reduction of radar coherence in tall canopies due to enhanced volume scattering. These features reveal the spatial distribution patterns of forest height across multiple dimensions (spectral, textural, moisture, and structural), providing multi-perspective remote sensing evidence for understanding regional forest vertical structure.
3.3. Forest Height Retrieval Without ICE-2 Data
Although ICE-2 data provide high-accuracy photon point cloud measurements of forest canopies, their spatial coverage is relatively sparse, which significantly limits large-scale forest height estimation. This experiment therefore excludes ICE-2 data and performs forest height inversion using only S-1, S-2, and SRTM data, in order to explore estimation schemes better suited to large-area applications.
We adjusted the data combinations to construct the following three configurations: DC4 (S-1 + S-2 + SRTM), DC5 (S-2 + SRTM), and DC6 (S-1 + SRTM). Comparing the accuracy of the new combinations in Table 6 with the original combinations (including ICE-2) in Table 5 reveals that, except for a slight improvement in DC5 under the Random Forest (RF) model, the remaining combinations show a minor decrease in inversion accuracy across the different models, though the overall decline is not significant.
To visually demonstrate the inversion performance of different data combinations and models, Figure 7 presents spatial distribution maps of forest height for the DC4, DC5, and DC6 combinations under LightGBM, XGBoost, and Random Forest (RF) models within a representative area. The figures reveal that while numerical accuracy is slightly lower compared to combinations including ICE-2, all combinations effectively capture the spatial distribution trends of forest height, demonstrating particular stability in areas with continuous vegetation structures.
The results demonstrate that effective forest height inversion capability is maintained even without ICE-2 data, significantly expanding the spatial scope and feasibility of inversion operations. Therefore, in large-scale applications balancing efficiency and coverage requirements, multi-source remote sensing inversion strategies independent of ICE-2 data hold practical potential and application value.
4. Discussion
4.1. Feature-Importance-Based Path Optimization
Based on the above findings, a multi-level optimization strategy is recommended for feature selection: First, prioritize the inclusion of SRTM terrain factors to supplement spatial distribution and terrain-related details. Second, integrate ICE-2 canopy height products to ensure effective utilization of key forest stand structural information, thereby further enhancing overall model performance.
This data screening strategy, grounded in feature importance analysis, provides a scientifically sound and flexible optimization pathway for forest height inversion under varying data conditions. It not only enhances model accuracy but also strengthens the reliability and applicability of inversion results across diverse application scenarios.
4.2. Performance Comparison and Mechanism Analysis of Different Models
The LightGBM model achieved the highest accuracy across most data combinations. This performance advantage stems from its distinctive leaf-wise growth strategy and efficient handling of high-dimensional features through Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB). These characteristics allow LightGBM to more effectively leverage the complementary information from the heterogeneous remote sensing sources in our fused feature set, leading to a more robust and accurate model. In contrast, while Random Forest exhibits high stability, it lacks flexibility, whereas XGBoost strikes a balance between computational efficiency and accuracy. Differences in feature sensitivity across models also reflect their distinct learning mechanisms: LightGBM utilizes information from various data sources more evenly, while XGBoost and RF rely more heavily on a few core features (such as ICE-2 elevation and DEM).
4.3. The Key Role of S-1 and S-2 in Forest Height Estimation
In this study, S-1 and S-2 satellite data played a pivotal role in forest height estimation. As a synthetic aperture radar (SAR) system, S-1 possesses all-weather observation capabilities, enabling continuous acquisition of surface information under complex meteorological conditions such as cloud cover and precipitation. Its VV and VH polarization data exhibit strong sensitivity to forest canopy structure and vertical distribution, offering unique advantages for extracting structural parameters like forest density and height.
S-2 multispectral imagery provides rich spectral and textural information. With its high spatial resolution, it accurately describes vegetation cover, canopy biochemical properties, and vegetation growth condition. The vegetation indices and texture features derived from S-2 data are vital for inverting forest biophysical parameters such as canopy water content, leaf area index, and canopy cover.
By combining structure-sensitive features from S-1 with spectral and textural information from S-2, this work improves the accuracy and robustness of forest height estimation and makes full use of the complementary capabilities of multi-source remote sensing data. To further clarify the contribution of each data type in the inversion process, this study generated feature importance heatmaps for both S-1 and S-2 data (Figure 8). By averaging the feature importance across each feature set, these heatmaps visually demonstrate the relative contribution of each feature in the prediction process, thereby identifying the variables most influential for forest height estimation. This analysis not only validated the synergistic effect of S-1 and S-2 in forest parameter inversion but also provided a basis for subsequent feature selection and model optimization.
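One possible reading of the averaging step behind these heatmaps is sketched below: the importances of the features in each feature set are averaged to give one value per set. The grouping, feature names, and importance values are hypothetical, and the study may instead have averaged across models or data combinations.

```python
from statistics import mean


def mean_importance_by_group(importances, groups):
    """Average the importance of the features belonging to each feature set.

    importances -- dict mapping feature name to importance score
    groups      -- dict mapping feature-set name to a list of feature names
    """
    return {
        group: mean(importances[feature] for feature in features)
        for group, features in groups.items()
    }


# Hypothetical normalized importances and feature sets.
importances = {"VV": 0.02, "VH": 0.04, "B11": 0.10, "NDVI": 0.06}
groups = {"S-1": ["VV", "VH"], "S-2": ["B11", "NDVI"]}
group_means = mean_importance_by_group(importances, groups)
```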
4.4. Outlook
This study achieved high accuracy in forest height retrieval through multi-source remote sensing data fusion and machine learning methods. The model performance improved after incorporating ICESat-2 data, yet certain limitations remain. The primary issue lies in ICESat-2’s discrete point sampling nature. While it provides a high-precision forest height benchmark, it struggles to support large-scale spatially continuous retrieval. This results in low computational efficiency and limited scalability when applying the model at regional scales. Simultaneously, the preprocessing and fusion of massive multi-source remote sensing data impose significant demands on computational resources and algorithmic efficiency [39]. Finally, the heavy reliance on Sentinel and ICESat-2 data may restrict the application in areas with poor data coverage or persistent cloud contamination.
To address these challenges, future research will focus on enhancing the efficiency and feasibility of large-scale forest height retrieval while exploring ways to reduce dependence on ICESat-2 data in appropriate scenarios. Key research directions include: developing efficient data dimensionality reduction and feature compression methods to reduce big data processing burdens; systematically investigating fusion mechanisms between ICESat-2 and continuous coverage data like Sentinel-1/2, while developing large-area inversion modeling methods independent of ICESat-2 to expand application scope while maintaining accuracy; building lightweight machine learning models with strong generalization capabilities to enhance adaptability across diverse environments and data conditions; and integrating multi-source validation data from sources like UAV remote sensing and ground observations to improve the reliability and stability of satellite inversion results [40].
Through these improvements, future work can significantly enhance the spatial coverage and computational efficiency of forest height estimation while maintaining high inversion accuracy, providing more practical and scalable technical support for regional and global forest resource monitoring.
5. Conclusions
This research established a machine learning-driven framework to retrieve forest height in the Shangri-La area of Yunnan Province by integrating data from multiple remote sensing platforms. The forest height retrieval model was built by combining Sentinel-1 radar, Sentinel-2 multispectral, ICESat-2 altimetry, and SRTM terrain data with three machine learning algorithms: Random Forest, XGBoost, and LightGBM. The findings show that the greatest inversion accuracy was achieved by combining Sentinel-1, Sentinel-2, ICESat-2, and SRTM data, which illustrates the synergistic benefits of multi-source data in terms of feature complementarity.
LightGBM attained the best inversion performance among the machine learning models examined. Under the DC1 data combination, its predictions achieved a correlation coefficient r of 0.72 with the observed values and an RMSE of 5.52 m, showing the model’s capacity to capture patterns of spatial variation in forest height. Feature importance analysis further demonstrated substantial contributions from ICESat-2 altimetry data. Sentinel-1 backscatter features characterized the vertical structure of the forest, while Sentinel-2 spectral features and vegetation indices supplied vital canopy spectral information. In addition, the SRTM terrain component improved the model’s stability and adaptability in difficult mountainous areas. Beyond the specific case study, this research validates a scalable methodological framework that synergizes multi-source remote sensing fusion with ensemble learning for forest parameter retrieval in topographically complex regions. The demonstrated feasibility of achieving satisfactory estimation accuracy without ICESat-2 data significantly broadens the potential for large-scale, operational application of this approach.
This research confirms that combining machine learning algorithms with data from multiple remote sensing sources is a viable way to retrieve forest parameters and to overcome the difficulties of estimating forest height in areas with complex terrain. The findings provide empirical evidence and practical guidance for future studies on forest carbon stock assessment, sustainable forest management, regional resource surveys, and dynamic monitoring.