Review Reports - Multi-Source Data Fusion and Ensemble Learning for Canopy Height Estimation: Application of PolInSAR-Derived Labels in Tropical Forests

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The manuscript employs continuous canopy height results derived from airborne PolInSAR instead of LiDAR footprints and proposes a multi-source ensemble learning framework for canopy height prediction. The results indicate that PolInSAR inversion can improve label quality and that it can be applied to spaceborne canopy height mapping to improve estimation accuracy. The manuscript is well written. However, some details still need attention, as follows:

1. The theoretical background of the RVoG-VTDs model is well introduced; however, the mathematical expression of the model is missing. Since the proposed framework directly builds upon this formulation, it would be beneficial to include at least the core RVoG-VTDs coherence equation, together with a brief explanation of the temporal decorrelation term.

2. The manuscript lists multiple input variables, including Landsat-8 optical bands (B1–B7) and topographic features such as DEM, yet their physical relevance to canopy height inversion is not clearly described. Providing a concise explanation of the role of each feature in height estimation (for example, visible bands related to chlorophyll absorption, and SWIR reflecting leaf moisture and shadow effects) would enhance interpretability and demonstrate that the model design is physically grounded rather than purely data-driven.

3. The scale of canopy height estimation using airborne PolInSAR differs from that of spaceborne data. How did the authors reconcile the two scales?

4. Figure 13 could be improved by grouping the channels into categories such as spectral and topographic to better illustrate their functional roles. Such grouping would help readers intuitively understand the contributions of different data sources. Figure 3 is not clear; please revise it.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This is a well-written and technically solid study addressing an important topic in remote sensing—the accurate estimation of forest canopy height through the integration of PolInSAR-derived labels and ensemble learning methods. The manuscript presents a clear workflow, from data preprocessing to model fusion, and provides comprehensive experimental validation using datasets from Gabon’s tropical forests. The idea of using continuous canopy height labels derived from airborne PolInSAR to overcome the limitations of sparse LiDAR footprints is novel and significant for large-scale forest monitoring and carbon stock estimation.

Although the abstract lists several accuracy metrics, which effectively demonstrate the quantitative performance of the results, it lacks a clear emphasis on methodological innovation. It is recommended to retain a few key numerical results to highlight the improvement in performance, while explicitly stating the core methodological contribution in the abstract.

The introduction provides a comprehensive literature review, but it is overly lengthy, particularly the sections discussing LiDAR and GEDI, which occupy too much space and obscure the study’s own innovation. It is recommended to condense these background discussions and focus more on the advantages of PolInSAR and the existing research gaps. A separate subsection or paragraph should be added at the end of the introduction to clearly state the research objectives and innovations, explicitly outlining how this study improves upon and advances previous methods. In addition, the introduction should strengthen the argument for the current research gap, not only by noting issues such as sparse LiDAR sampling and large interpolation errors, but also by citing specific studies or quantitative evidence that demonstrate how these limitations affect canopy height estimation in tropical forests, thereby emphasizing the necessity of this work.

The methodology section is generally well structured, but it lacks sufficient detail, which affects the reproducibility of the study. The description of the study area should go beyond basic geographic location and vegetation type, and further explain why Gabon’s Pongara and Akanda National Parks were selected as the experimental sites, highlighting their representativeness in terms of ecological characteristics and data availability. The PolInSAR data processing procedure should be described in greater detail, including the software platforms used (e.g., PolSARpro or SNAP), the methods for polarimetric calibration and interferometric co-registration, as well as key processing parameters such as multilooking settings and filtering window sizes. Additionally, the LiDAR data matching procedure should specify the spatial matching radius or the method used to align LiDAR and PolInSAR observations.

For the core methodological innovation, namely the hybrid baseline selection and dual-layer ensemble learning framework, the logical coherence and interpretability should be further strengthened. The research workflow presented in Figure 2 is rather complex and could be divided into two separate diagrams: one illustrating the PolInSAR inversion process and the other depicting the modeling and prediction framework, to improve reader comprehension. Although the hybrid baseline (PROD+ECC) strategy is presented with equations, the explanation remains conceptual; the physical meanings of each parameter and the rationale for threshold selection (e.g., the choice of 10 m and 30 m divisions, along with relevant references) should be clearly specified. In addition, the optimization procedure of the adaptive weight α needs to be explicitly described: whether it is based on validation error minimization, local heterogeneity metrics, or grid search. Moreover, the parameters of all models should be listed in an appendix or summarized in a table, including the number and depth of trees for RF, the learning rate for XGBoost, and the structural parameters for U-Net and ResNetU-Net such as the number of layers, patch size, batch size, and number of training epochs.

The discussion section mainly reiterates the results and lacks mechanistic interpretation and academic depth. It is recommended to expand the discussion of the physical characteristics of PolInSAR, such as how its volume scattering mechanism and vertical decorrelation features enhance the reliability of canopy height inversion. Additionally, the authors should analyze the complementarity between PolInSAR-derived variables and other multi-source features (e.g., DEM, kNDVI, and Landsat spectral bands) in model training. A quantitative comparison with previous studies should also be included to explain why the proposed method achieves significantly lower RMSE than the global product by Lang et al. (2022) under comparable conditions.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The revisions to this paper have been completed as required, and there are no further issues.