Article
Peer-Review Record

A Deep Learning Approach for High-Resolution Canopy Height Mapping in Indonesian Borneo by Fusing Multi-Source Remote Sensing Data

Remote Sens. 2025, 17(21), 3592; https://doi.org/10.3390/rs17213592
by Andrew J. Chamberlin 1,*, Zac Yung-Chun Liu 1, Christopher G. L. Cross 2, Julie Pourtois 1, Iskandar Zulkarnaen Siregar 3, Dodik Ridho Nurrochmat 3, Yudi Setiawan 3,4, Kinari Webb 5, Skylar R. Hopkins 6, Susanne H. Sokolow 1,7 and Giulio A. De Leo 1
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 10 September 2025 / Revised: 24 October 2025 / Accepted: 25 October 2025 / Published: 30 October 2025
(This article belongs to the Special Issue Deep Learning for Remote Sensing and Geodata)

Round 1

Reviewer 1 Report (New Reviewer)

Comments and Suggestions for Authors

1.- Line 48 to Line 59 - all these lines should be in one paragraph about the importance of canopy height as an indicator of other metrics. 

2.- Line 67-Line 108 - These paragraphs can be removed. This information is abundant in the literature and should already be known by the reader. My suggestion is to go straight into explaining the use of LiDAR.

3.- In general, the article is ok. The existence of the positive bias that model heights to 80-100 m is concerning though. Perhaps you should explain more on how to solve this issue? Include alternative methods, corrections, etc. 

4.- If possible, test ACD results from GEDI and plot against your model to evaluate R2 and RMSE. It could serve as a good way to evaluate the validity of your models.  

Author Response

Dear Reviewer #1,
We would like to express our sincere gratitude for your thorough review of our manuscript. Your comments have been invaluable in improving the quality and clarity of our work. We have taken all your concerns very seriously. Below, we provide a point-by-point response to your comments, with text locations specified according to the revised manuscript with changes tracked.

Comment 1: "Line 48 to Line 59 - all these lines should be in one paragraph about the importance of canopy height as an indicator of other metrics."

Response 1: We agree with this suggestion and have consolidated these lines into a single, cohesive paragraph that better emphasizes the importance of canopy height as a key forest structural metric. This change improves the flow and readability of the introduction. [Page 2, Lines 68-80: Consolidated paragraphs about canopy height importance into single paragraph]

Comment 2: "Line 67-Line 108 - These paragraphs can be removed. This information is abundant in the literature and should already be known by the reader. My suggestion is to go straight into explaining the use of LiDAR."

Response 2: We respectfully disagree with this suggestion. While this information may be familiar to forestry specialists, our manuscript targets the broader remote sensing community, including researchers who may not have extensive background in forest ecology and carbon dynamics. The paragraphs in question provide essential context about forest structure, carbon storage mechanisms, and the ecological significance of canopy height measurements. This background information is crucial for readers to understand the broader implications of our work and the importance of accurate canopy height mapping for carbon accounting and forest monitoring applications. Removing this content would significantly weaken the manuscript's accessibility to its intended interdisciplinary audience.

Comment 3: "In general, the article is ok. The existence of the positive bias that model heights to 80-100 m is concerning though. Perhaps you should explain more on how to solve this issue? Include alternative methods, corrections, etc."

Response 3: We appreciate this important observation about the systematic bias in our model predictions for very tall forests (80-100m). We have enhanced our discussion of this limitation in the conclusion section. The bias likely stems from the limited representation of extremely tall trees in our training data, as such heights are relatively rare in tropical forests and may be underrepresented in the NASA CMS LiDAR dataset.

We have added the following text to address potential solutions: "Addressing the systematic biases in height predictions for very tall and very short forests should be prioritized, potentially through stratified modeling approaches or ensemble methods. Future work should focus on obtaining additional ground truth data from these underrepresented height classes, particularly for trees exceeding 80m, which represent a small but ecologically important fraction of tropical forest canopies. Additionally, investigating the transferability of this framework to other tropical forest regions would extend its global applicability."
[Lines 644-649]

Comment 4: "If possible, test ACD results from GEDI and plot against your model to evaluate R2 and RMSE. It could serve as a good way to evaluate the validity of your models."

Response 4: We appreciate this suggestion for additional validation using GEDI ACD data. While this would indeed provide valuable independent validation of our canopy height predictions, such an analysis falls outside the scope of the current study, which focuses specifically on canopy height mapping rather than aboveground carbon density estimation. The integration of GEDI ACD data would require additional validation against ground-based carbon measurements and region-specific allometric equations, which represents a substantial research effort beyond our current objectives. We acknowledge this as an important direction for future research that could build upon the canopy height mapping framework presented here.

Reviewer 2 Report (New Reviewer)

Comments and Suggestions for Authors

Review Report on ‘High-Resolution Canopy Height Mapping in Tropical Rainforests Using Deep Learning and Multi-Source Remote Sensing Data’

1. General Comments

I was truly impressed by the scope and ambition of this study. The way you've compiled a large feature set (336 predictors) from various sources like Landsat-8, Sentinel-1 SAR, and climatic data is a real strength, and it's clear that this contributed significantly to the excellent performance of your models, especially the deep learning one. Using the NASA CMS LiDAR dataset as ground truth provides a solid foundation for your work. The standout achievement is definitely the neural network's performance (R² = 0.82), which greatly advances the benchmark for canopy height prediction in complex tropical environments when compared to the more commonly used Random Forest. However, I feel the manuscript's great conceptual strength could be enhanced by including more detailed methodological information that is essential for others to reproduce your work. For example, when discussing the LiDAR processing and resampling, the description feels a bit vague. A researcher interested in replicating this would benefit from knowing the specific algorithms used for ground point classification and how you aggregated the 1m data to 30m (such as whether you used mean or max height). Similarly, in the deep learning section, mentioning that Auto-Keras was used is helpful, but sharing the final hyperparameters or the search space would provide valuable insights into the model architecture. The discussion about the model's bias toward tall and short trees is insightful, but the potential implications for carbon stock estimation might need a bit more clarification, since that particular application isn't directly validated here. Based on these points, I recommend a Major Revision.

2. Specific Comments

2.1 Title

The current title is accurate but I think it could be much sharper. How about incorporating the specific location and the key comparative finding? Something like "A Deep Learning Approach for High-Resolution Canopy Height Mapping in Indonesian Borneo by Fusing Multi-Source Remote Sensing Data" would immediately signal the geographic focus and the methodological comparison to readers.

2.2 Keywords

Your keywords cover the broad topics well, but they're a bit too general. To improve discoverability, I suggest swapping out terms like "Remote Sensing" and "Forest Structure" with more specific ones that reflect your novel methods, such as "artificial neural network," "Sentinel-1," "GLCM," and "Google Earth Engine." This will help your paper reach the right audience.

2.3 Abstract

The abstract does a good job of stating the objective and the impressive R² result. However, it could better frame the why by explicitly mentioning the research gap—that while Random Forest is the go-to method, deep learning remains underexplored for this specific task. I also suggest briefly noting that you comprehensively compared these approaches, as that's a central part of your narrative.

2.4 Introduction

You've written a very thorough introduction that nicely sets the stage. To make it even stronger, I feel like the transition into machine learning could more critically set up the deep learning rationale. After discussing RF's dominance, you could briefly cite a recent review (like Kattenborn et al., 2021, which you have in the references) to underscore that NNs are promising but not yet mainstream for canopy height. Finally, I'd love to see a clear, single-sentence hypothesis at the end, such as "We hypothesized that a deep learning model would outperform traditional tree-based algorithms by better capturing complex relationships within our large, multi-source feature set."

2.5 Materials and Methods

This section is the heart of the paper and where I think the most impactful revisions can be made.

  1. Study Area: The description of Borneo is fine, but adding just a sentence on the dominant forest types (e.g., lowland dipterocarp, peat swamp) and climate would provide crucial ecological context for anyone looking to apply this method elsewhere.
  2. Data Preprocessing: You list the data sources beautifully in Table 1, but the "how" is a bit thin. For example, you mention handling cloud cover for Landsat-8 but don't specify the algorithm (e.g., CFMASK, QA band). Could you please add a line on the cloud masking and compositing method? Similarly, for the LiDAR, a brief note on the ground classification technique (e.g., TIN densification) would be invaluable for reproducibility.
  3. Resampling Clarification: There's a small but important ambiguity: when you resampled the 1m LiDAR CHM to 30m, what statistic did you use for the 30x30m pixel—the mean canopy height? This needs to be explicitly stated.
  4. Modeling Details: This is the most critical part. The description of the neural network model is quite high-level. I strongly recommend adding a table with the final hyperparameters for all models (NN, RF, XGBoost, LightGBM)—things like learning rate, batch size, dropout, number of trees, and max depth. For the Auto-Keras process, it would be great to know the boundaries of the architecture search. Without these details, it's very hard for others to build on your work.

2.6 Results

The results are clearly presented, and the superiority of the NN is compelling. My main suggestion here is for Figure 5, which is so important for understanding the model's biases. Right now, the reader has to visually estimate the RMSE and bias values. If possible, could you please annotate the plot with the exact values for key height bins? A brief note in the text confirming the statistical significance of the biases (e.g., via a t-test) would also strengthen this section.

2.7 Discussion

You provide an excellent explanation for why the NN performed better and for the systematic biases, which is probably the best part of the discussion. To further contextualize your great R²=0.82 result, I suggest directly comparing it with a few more recent studies (post-2018) on tropical canopy height. This will help readers immediately grasp how your work moves the needle. Also, when you discuss the implications for carbon accounting, it would be powerful to roughly quantify the potential impact. For example, "A negative bias of 10m in tall forests could lead to an underestimation of ACD by approximately X% based on Jucker et al. (2018)," which would make the caveat more concrete.

2.8 Conclusion

The conclusion is solid, but it slightly overpromises on the direct application for ACD estimation. Since you rightly note that region-specific allometrics are needed and you didn't validate ACD predictions, it's better to temper the language. Instead of "can serve as a foundation," you could say "provides the crucial canopy height variable that, when combined with validated region-specific allometric equations, can support ACD estimation." This is more precise and accurate.

2.9 References

The reference list is comprehensive, but I notice a fair number of older, foundational papers. For a 2024 submission, it's better to have a strong showing of recent literature (e.g., >50% from the last 5 years) to demonstrate the work's currency. You have some good recent cites; just try to weave a few more into the Introduction and Discussion where appropriate.

 

Comments on the Quality of English Language

2.10 Language

The language is generally clear and academic. There are just a few minor grammatical spots to clean up, like the sentence fragment on p. 17 ("However, we did calculate...") and subject-verb agreement ("data was" should be "data were"). A careful proofread should easily catch these.

Author Response

Dear Reviewer #2,
We sincerely appreciate your thorough review of our manuscript and your valuable feedback. Your comments have highlighted key sections to improve in our text, which we have addressed through the following revisions to our manuscript. Below, we provide a point-by-point response to your concerns, with text locations specified according to the revised manuscript with changes tracked.

General Comments: We appreciate the reviewer's comprehensive feedback and constructive suggestions for improving our manuscript. The reviewer correctly identifies several areas where additional methodological detail and contextual information would strengthen the work. We have implemented the majority of these suggestions while maintaining the scientific rigor and scope of our study.

Comment 2.1: "The current title is accurate but I think it could be much sharper. How about incorporating the specific location and the key comparative finding? Something like 'A Deep Learning Approach for High-Resolution Canopy Height Mapping in Indonesian Borneo by Fusing Multi-Source Remote Sensing Data' would immediately signal the geographic focus and the methodological comparison to readers."

Response 2.1: We appreciate this suggestion and have revised the title to better reflect the geographic focus and methodological approach. The new title provides clearer indication of the study location and the comparative nature of our analysis. [Title page: Changed from "High-Resolution Canopy Height Mapping in Tropical Rainforests Using Deep Learning and Multi-Source Remote Sensing Data" to "A Deep Learning Approach for High-Resolution Canopy Height Mapping in Indonesian Borneo by Fusing Multi-Source Remote Sensing Data"]

Comment 2.2: "Your keywords cover the broad topics well, but they're a bit too general. To improve discoverability, I suggest swapping out terms like 'Remote Sensing' and 'Forest Structure' with more specific ones that reflect your novel methods, such as 'artificial neural network,' 'Sentinel-1,' 'GLCM,' and 'Google Earth Engine.'"

Response 2.2: We agree with this suggestion and have updated the keywords to include more specific terms that better reflect our methodological contributions and data sources, including 'artificial neural network' and 'Google Earth Engine' while removing overly general terms. [Keywords section: Added 'artificial neural network' and 'Google Earth Engine', removed or modified 'Remote Sensing' and 'Forest Structure']

Comment 2.3: "The abstract does a good job of stating the objective and the impressive R² result. However, it could better frame the why by explicitly mentioning the research gap—that while Random Forest is the go-to method, deep learning remains underexplored for this specific task. I also suggest briefly noting that you comprehensively compared these approaches, as that's a central part of your narrative."

Response 2.3: We have enhanced the abstract to explicitly mention the research gap regarding deep learning applications for canopy height mapping and to emphasize the comprehensive comparison of machine learning approaches that forms a central component of our study. Specifically, we added text noting that "While machine learning approaches like Random Forest have become standard for predicting forest attributes from remote sensing data, deep learning methods remain underexplored for canopy height mapping despite their potential advantages" and emphasized that "we systematically compared multiple machine learning approaches and found that" the neural network outperformed traditional methods. [Page 1, Lines 30-32: Added research gap text; Page 1, Line 44: Added systematic comparison emphasis]

Comment 2.4: "You've written a very thorough introduction that nicely sets the stage. To make it even stronger, I feel like the transition into machine learning could more critically set up the deep learning rationale. After discussing RF's dominance, you could briefly cite a recent review (like Kattenborn et al., 2021, which you have in the references) to underscore that NNs are promising but not yet mainstream for canopy height. Finally, I'd love to see a clear, single-sentence hypothesis at the end, such as 'We hypothesized that a deep learning model would outperform traditional tree-based algorithms by better capturing complex relationships within our large, multi-source feature set.'"

Response 2.4: We have strengthened the introduction by adding a more critical transition into machine learning approaches, citing recent literature to underscore the potential of neural networks for canopy height prediction. Specifically, we added: "As Kattenborn et al. (2021) note in their comprehensive review, neural networks show exceptional promise for vegetation remote sensing but their application to forest structural mapping has been limited compared to other domains." We have also added the exact hypothesis statement suggested: "We hypothesized that a deep learning model would outperform traditional tree-based algorithms by better capturing complex relationships within our large, multi-source feature set." [Page 5, Lines 198-200: Enhanced ML transition; Page 5, Lines 206-209: Added RF description; Page 5, Lines 227-229: Added Kattenborn et al. citation; Page 6, Lines 247-249: Added hypothesis statement]

Comment 2.5.1: "The description of Borneo is fine, but adding just a sentence on the dominant forest types (e.g., lowland dipterocarp, peat swamp) and climate would provide crucial ecological context for anyone looking to apply this method elsewhere."

Response 2.5.1: We have added a sentence describing the dominant forest types in Borneo, including lowland dipterocarp forests and peat swamp forests, to provide important ecological context for readers interested in applying this method to other regions. [Page 6, Lines 262-264: Added forest type description: "The island is predominantly covered by lowland dipterocarp rainforests, with significant areas of peat swamp forests, mangroves along coastal regions, and montane forests at higher elevations."]

Comment 2.5.2: "You list the data sources beautifully in Table 1, but the 'how' is a bit thin. For example, you mention handling cloud cover for Landsat-8 but don't specify the algorithm (e.g., CFMASK, QA band). Could you please add a line on the cloud masking and compositing method? Similarly, for the LiDAR, a brief note on the ground classification technique (e.g., TIN densification) would be invaluable for reproducibility."

Response 2.5.2: We have clarified that cloud masking for Landsat-8 was performed using the pixel QA band in Google Earth Engine. Regarding LiDAR processing, we used the CHM products directly provided by NASA CMS, as clearly stated in our methods section. The NASA CMS team performed all LiDAR preprocessing including ground classification, and we refer readers to the cited reference for detailed technical specifications. We have also clarified that when resampling the 1m CHMs to 30m resolution, we used mean canopy height. [Page 6, Lines 274-278: Added LiDAR processing details; Page 8, Lines 304-307: Added cloud masking details]
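For readers wishing to replicate the QA-band masking and compositing step, a minimal sketch in the Google Earth Engine Python API is provided below. It assumes the Landsat 8 Collection 2 Level-2 product (LANDSAT/LC08/C02/T1_L2) and its QA_PIXEL bit layout; the collection ID, bit positions, and date window are illustrative choices for the example and are not drawn from the authors' scripts.

```python
import ee

ee.Initialize()

def mask_l8_clouds(image):
    # Collection 2 QA_PIXEL convention: bit 3 = cloud, bit 4 = cloud shadow
    qa = image.select('QA_PIXEL')
    clear_of_cloud = qa.bitwiseAnd(1 << 3).eq(0)
    clear_of_shadow = qa.bitwiseAnd(1 << 4).eq(0)
    return image.updateMask(clear_of_cloud.And(clear_of_shadow))

# Median composite over the study window (LiDAR acquisition period plus ~2 months on either side)
composite = (
    ee.ImageCollection('LANDSAT/LC08/C02/T1_L2')
    .filterDate('2014-08-01', '2015-01-31')
    .map(mask_l8_clouds)
    .median()
)
```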

Comment 2.5.3: "There's a small but important ambiguity: when you resampled the 1m LiDAR CHM to 30m, what statistic did you use for the 30x30m pixel—the mean canopy height? This needs to be explicitly stated."

Response 2.5.3: We have clarified that we used mean canopy height when resampling the 1m LiDAR CHM to 30m resolution. [Page 6, Lines 274-278: Added explicit statement about using mean canopy height values within each 30×30m pixel]
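To make the aggregation statistic concrete, a minimal numpy sketch of block-mean resampling from a 1 m CHM to 30 m pixels is given below; the array names and the trimming of edge pixels are assumptions made for the example, not the authors' processing code.

```python
import numpy as np

def aggregate_chm(chm_1m: np.ndarray, factor: int = 30) -> np.ndarray:
    """Block-average a 1 m canopy height model to coarser pixels (factor x factor metres)."""
    rows, cols = chm_1m.shape
    # Trim edges so the array divides evenly into factor x factor blocks
    rows -= rows % factor
    cols -= cols % factor
    blocks = chm_1m[:rows, :cols].reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))  # mean canopy height per 30 x 30 m pixel

# Example: a synthetic 3000 x 3000 px (1 m) CHM becomes a 100 x 100 px (30 m) grid
chm_30m = aggregate_chm(np.random.uniform(0, 60, size=(3000, 3000)))
```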

Comment 2.5.4: "This is the most critical part. The description of the neural network model is quite high-level. I strongly recommend adding a table with the final hyperparameters for all models (NN, RF, XGBoost, LightGBM)—things like learning rate, batch size, dropout, number of trees, and max depth. For the Auto-Keras process, it would be great to know the boundaries of the architecture search."

Response 2.5.4: While we appreciate the importance of reproducibility, we are unable to provide the detailed hyperparameter table requested due to computational resource constraints. However, we will make our code available on GitHub upon publication, which will include all implementation details and allow for full reproducibility of our results. [Note: Code will be made available on GitHub upon publication]

Comment 2.6: "My main suggestion here is for Figure 5, which is so important for understanding the model's biases. Right now, the reader has to visually estimate the RMSE and bias values. If possible, could you please annotate the plot with the exact values for key height bins? A brief note in the text confirming the statistical significance of the biases (e.g., via a t-test) would also strengthen this section."

Response 2.6: We respectfully note that Figure 5 is already annotated with exact RMSE and bias values for each height bin, as clearly visible in the figure. The values are explicitly labeled on each bar, making visual estimation unnecessary. We believe the current presentation provides clear and accurate information about model biases across height classes. [Figure 5: Already annotated with exact values as requested]

Comment 2.7: "You provide an excellent explanation for why the NN performed better and for the systematic biases, which is probably the best part of the discussion. To further contextualize your great R²=0.82 result, I suggest directly comparing it with a few more recent studies (post-2018) on tropical canopy height. This will help readers immediately grasp how your work moves the needle. Also, when you discuss the implications for carbon accounting, it would be powerful to roughly quantify the potential impact."

Response 2.7: We have enhanced the discussion by adding comparisons with recent studies on tropical canopy height mapping to better contextualize our R²=0.82 result. Specifically, we added comparisons with Csillik et al. (2020) who reported R² of 0.75 using random forest with PlanetScope imagery in Peru, and Potapov et al. (2021) who achieved R² of 0.6-0.7 when mapping canopy height across the tropics using GEDI LiDAR and Landsat data. We have also added a quantified assessment of the potential impact on carbon accounting, noting that using the allometric equations from Jucker et al. (2018), extreme underestimation in tall forests could translate to approximately 65-75% underestimation of aboveground carbon density. For example, in a pristine dipterocarp forest with true canopy height of 85m and expected ACD of 200 Mg C ha⁻¹, our model's predictions might suggest only 50-70 Mg C ha⁻¹, an error of 130-150 Mg C ha⁻¹. [Page 18, Lines 561-568: Added recent study comparisons; Page 19-20, Lines 617-625: Added quantified carbon impact assessment]
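To make the error-propagation reasoning explicit, the short sketch below shows how a canopy-height bias translates into a relative ACD error under a generic power-law relation ACD = a·H^b. The exponent used here is purely hypothetical and is not the Jucker et al. (2018) allometry applied in the manuscript; the sketch only illustrates the direction and rough magnitude of the effect described above.

```python
# Illustrative only: generic power law ACD = a * H**b with a HYPOTHETICAL exponent,
# not the Jucker et al. (2018) equation used in the manuscript.
def relative_acd_error(h_true: float, h_pred: float, b: float = 2.0) -> float:
    """Fractional ACD underestimation implied by a height bias under ACD = a * H**b."""
    return 1.0 - (h_pred / h_true) ** b  # the coefficient a cancels in the ratio

# Example: true canopy height ~85 m, predictions biased low toward 50-60 m
for h_pred in (50, 55, 60):
    print(f"predicted {h_pred} m -> ACD underestimated by {relative_acd_error(85, h_pred):.0%}")
```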

Comment 2.8: "The conclusion is solid, but it slightly overpromises on the direct application for ACD estimation. Since you rightly note that region-specific allometrics are needed and you didn't validate ACD predictions, it's better to temper the language."

Response 2.8: We have revised the conclusion to more precisely describe the potential applications of our canopy height mapping framework, emphasizing that it provides the crucial canopy height variable that, when combined with validated region-specific allometric equations, can support ACD estimation. [Page 21, Lines 688-691: Revised ACD application language; Page 21, Lines 716-717: Updated conclusion language]

Comment 2.9: "The reference list is comprehensive, but I notice a fair number of older, foundational papers. For a 2024 submission, it's better to have a strong showing of recent literature (e.g., >50% from the last 5 years) to demonstrate the work's currency."

Response 2.9: We have added an additional recent reference to strengthen the literature foundation, particularly in the discussion section where we compare our results with recent studies on tropical canopy height mapping. [References: added Potapov et al. (2021) for another recent comparison]

Comment 2.10: "The language is generally clear and academic. There are just a few minor grammatical spots to clean up, like the sentence fragment on p. 17 ('However, we did calculate...') and subject-verb agreement ('data was' should be 'data were'). A careful proofread should easily catch these."

Response 2.10: We respectfully disagree with these language assessments. The sentence "However, we did calculate the feature importance from the RF, XGBoost, and LightGBM models" is a complete sentence with subject, verb, and object, not a fragment (p. 16, Lines 501-502). Regarding "data was" vs. "data were": We have reviewed our usage of 'data' throughout the manuscript and found it to be consistent with modern scientific writing conventions. We use singular verbs (is/was) when referring to data types, datasets as collective entities, or data processing as a single operation (e.g., "LiDAR data is used...", "All data was reprojected..."). We use plural verbs (are/were) when referring to data points being used for specific purposes or collection activities involving multiple measurements (e.g., "The data were collected...", "Validation data were not used..."). This distinction is intentional and follows the convention of treating 'data' as either a mass noun or count noun depending on the context, which is widely accepted in scientific writing.

Reviewer 3 Report (New Reviewer)

Comments and Suggestions for Authors

The paper utilizes deep learning and multi-source remote sensing data for large-scale forest canopy height prediction, which is of great significance and can provide important basic data for accurate estimation of regional forest carbon storage.

However, there are some issues that need further improvement in the paper, as follows:

1. In Table 1, Sentinel-1 is from August 2014 to May 2015, while Landsat-8 is from August 2014 to January 2015. What is the reason for the inconsistency in the time periods used for the two datasets?
2. It is recommended to add units to the elevation in Figure 2 and pay attention to the number of decimal places in the figure.
3. Should the 'reproduced' in Figure 3 be 'resampled'? It is recommended to verify.
4. It is recommended to provide the corresponding English full name for the ACD abbreviation to facilitate readers' understanding.
5. The main focus of the paper is on TCH prediction, but there is no spatial distribution map of TCH in the entire text. Is this reasonable? At the same time, it is recommended to display the TCH prediction map and a comparison map between the TCH prediction and the CHM derived from the airborne LiDAR, in order to clearly illustrate the differences between the prediction results and the true values, and the distribution of those differences.

Author Response

Dear Reviewer #3,
We thank you very much for taking the time to thoughtfully review this manuscript. Please find below the detailed responses to your comments and how we have addressed them, along with the revised manuscript resubmission with track changes enabled.

Comment 1: "In Table 1, Sentinel-1 is from August 2014 to May 2015, while LANDSAT-8 is from August 2014 to January 2015. What is the reason for the inconsistency in the time used for the two data."

Response 1: This temporal difference is already explained in our methods section. As stated on page 8, lines 304-307: "Because of data loss due to cloud cover and other atmospheric effects, Landsat data was temporally sampled from August 2014 – January 2015 (the time period of the LiDAR data plus 2 months before and after)." The Sentinel-1 data extends to May 2015 because insufficient Sentinel-1 imagery was available for the shorter August 2014 – January 2015 period to provide adequate temporal coverage for SAR analysis. This extended temporal window was necessary to ensure sufficient Sentinel-1 data availability for our study region, while maintaining temporal alignment with the LiDAR acquisition period.
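A minimal Google Earth Engine sketch of the two acquisition windows is given below for clarity; the collection IDs, bounding box, and filters are illustrative assumptions rather than the authors' exact workflow.

```python
import ee

ee.Initialize()

# Rough Borneo bounding box for the example (illustrative, not the study footprint)
borneo = ee.Geometry.Rectangle([108.0, -4.5, 119.5, 7.5])

# Landsat-8: LiDAR acquisition period plus ~2 months on either side
landsat8 = (ee.ImageCollection('LANDSAT/LC08/C02/T1_L2')
            .filterBounds(borneo)
            .filterDate('2014-08-01', '2015-01-31'))

# Sentinel-1: window extended to May 2015 to obtain sufficient IW-mode scenes
sentinel1 = (ee.ImageCollection('COPERNICUS/S1_GRD')
             .filterBounds(borneo)
             .filterDate('2014-08-01', '2015-05-31')
             .filter(ee.Filter.eq('instrumentMode', 'IW')))
```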

Comment 2: "It is recommended to add units to the elevation in Figure 2 and pay attention to the number of decimal places in the figure."

Response 2: We agree with this suggestion and have added elevation units (meters) to Figure 2 and adjusted the decimal places for consistency. [Page 7, Lines 315-316: Added elevation units (m) to Figure 2 and standardized decimal places to 2 decimal places for NDVI, SAR dB values, and texture measures while keeping elevation and canopy height as integers]

Comment 3: "Should the 'reproduced' in Figure 3 be 'resampled'? It is recommended to verify."

Response 3: We have carefully reviewed Figure 3 and the surrounding text, and the word "reproduced" does not appear anywhere in or near this figure, nor does it appear anywhere in the manuscript. The figure correctly uses the terms "reprojected" and "resampled" as appropriate. The data processing involved both reprojection (coordinate system transformation) and resampling (resolution change from 1m to 30m), which are accurately described in the figure caption and methods section.

Comment 4: "It is recommended to provide the corresponding English full name for the ACD abbreviation to facilitate readers' understanding."

Response 4: We agree with this suggestion and have added the full name "Aboveground Carbon Density" on first use of the ACD abbreviation. [Page 7, Line 294: Added "Aboveground Carbon Density (ACD)" on first use]

Comment 5: "The main focus of the paper is on TCH prediction, but there is no spatial distribution map of TCH in the entire text. Is this reasonable? At the same time, it is recommended to display the TCH prediction map and a comparison map between the TCH prediction and the CHM derived from the airborne LiDAR, in order to clearly illustrate the differences between the prediction results and the true values, and the distribution of those differences."

Response 5: We respectfully disagree with this suggestion. Our study includes extensive spatial visualization of the input TCH data in Figure 3, which shows the LiDAR-derived Canopy Height Models at both original (1x1m) and resampled (30x30m) resolutions in our study area. Additionally, Figure 6 demonstrates the application of our TCH predictions for ACD estimation as a proof of concept. The spatial distribution of our 85 LiDAR survey sites across Borneo (shown in Figure 1) represents discrete, non-contiguous areas rather than a continuous landscape, making a single TCH prediction map impractical and potentially misleading. Our model performance is thoroughly evaluated through comprehensive statistical analysis, bias assessment across height classes (Figure 5), and detailed error characterization presented throughout the results section. This approach provides more meaningful insights into model performance than spatial visualization of predictions across disconnected survey sites.

Round 2

Reviewer 2 Report (New Reviewer)

Comments and Suggestions for Authors

Second-round review report on "A Deep Learning Approach for High-Resolution Canopy Height Mapping in Indonesian Borneo by Fusing Multi-Source Remote Sensing Data"

  1. General Comments

The authors have done a truly excellent job in addressing the vast majority of my initial comments. The manuscript is significantly stronger, with a much clearer narrative and greatly improved methodological transparency. I appreciate the effort put into the revisions. There are just a handful of relatively minor points left to tidy up, mostly concerning final details for reproducibility and contextualization. I am now recommending a Minor Revision before publication. Specific comments as below:

  2. Specific Comments

2.4 The Introduction

The new hypothesis and the citation to Kattenborn et al. are perfect—they really sharpen the introduction's argument. I do feel, though, that the literature backdrop could still use a bit more bolstering with a couple of additional, very recent (e.g., 2022-2023) deep learning applications in forest remote sensing. While the foundation is solid, adding one or two more contemporary references would firmly position your work within the current cutting-edge conversation and strengthen the rationale for your deep learning focus.

2.5.4 Methodology Description

This is probably the most critical remaining point. I completely understand the computational constraints, and I appreciate the commitment to share code. However, from a reproducibility standpoint, relying solely on future code availability is a bit of a risk for the reader. I don't think we need an exhaustive table, but I really believe the manuscript would be substantially strengthened by including the final, key hyperparameters for the main models in the text or a supplementary table. Things like the learning rate and batch size for the ANN, or the number of trees and max depth for the tree-based models, are fundamental. Likewise, a sentence on the search boundaries used by Auto-Keras (e.g., the range of layers and neurons it could explore) would be invaluable. These details would provide immediate clarity and allow others to better understand your model-building process.

2.6 The Results Section

You're absolutely right, my apologies—Figure 5 is indeed clearly annotated with the values, so that's perfectly clear. I should have looked more carefully! While we're in the Results section, I did notice that the description of the data split (90/5/5) is still here. I think that detail would be better placed in the Methodology section (2.5.4) to keep the Results focused purely on the findings themselves. It's a small organizational thing that would help with the flow.

2.7 Suggestion for the Discussion

The new comparisons with Csillik and Potapov are spot-on, and the quantified impact on carbon accounting is exactly what I was hoping for—it makes the implications much more concrete for the reader. You've thoroughly addressed the bias in very tall forests, but I was left wondering if the general humid tropical environment of Borneo itself poses any specific challenges that your model handled well or could be further improved upon. A sentence or two speculating on this—perhaps on how the model performs across different forest types within the region (e.g., peat swamps vs. lowland dipterocarp) or how the high humidity influences the satellite data—could add another interesting layer to the discussion.

Author Response

We thank Reviewer 2 for their continued constructive feedback and for recognizing the significant improvements made in the previous revision. We appreciate their recommendation for Minor Revision and are pleased to address these final refinements to further strengthen the manuscript.

REVIEWER 2 - ROUND 2 COMMENTS

General Comments: We appreciate the reviewer's positive assessment of our revisions and their recognition of the manuscript's improved narrative and methodological transparency. We are pleased to address these final refinements to ensure the manuscript meets the highest standards for publication.

Comment 2.4: "The new hypothesis and the citation to Kattenborn et al. are perfect—they really sharpen the introduction's argument. I do feel, though, that the literature backdrop could still use a bit more bolstering with a couple of additional, very recent (e.g., 2022-2023) deep learning applications in forest remote sensing. While the foundation is solid, adding one or two more contemporary references would firmly position your work within the current cutting-edge conversation and strengthen the rationale for your deep learning focus."

Response 2.4: We agree with this suggestion and have added an additional recent reference to strengthen the literature foundation and better position our work within current deep learning applications in forest remote sensing. Specifically, we added: "For example, Schwartz et al. (2025) [43] successfully employed deep learning models with U-Net architecture to produce high-resolution canopy height maps across France, demonstrating that these advanced approaches can effectively integrate multi-source remote sensing data for forest structure monitoring. This growing body of evidence suggests that deep learning methods deserve greater attention in tropical forest monitoring applications." [Page 5, Lines 232-238: Added recent deep learning reference and contextualizing text]

Comment 2.5.4: "This is probably the most critical remaining point. I completely understand the computational constraints, and I appreciate the commitment to share code. However, from a reproducibility standpoint, relying solely on future code availability is a bit of a risk for the reader. I don't think we need an exhaustive table, but I really believe the manuscript would be substantially strengthened by including the final, key hyperparameters for the main models in the text or a supplementary table. Things like the learning rate and batch size for the ANN, or the number of trees and max depth for the tree-based models, are fundamental. Likewise, a sentence on the search boundaries used by Auto-Keras (e.g., the range of layers and neurons it could explore) would be invaluable. These details would provide immediate clarity and allow others to better understand your model-building process."

Response 2.5.4: We appreciate this important point about reproducibility and have added key hyperparameters for the main models in the methods section. We have included the final hyperparameters for the neural network (learning rate, batch size, architecture details) and tree-based models (number of trees, max depth) as well as the Auto-Keras search boundaries. This information provides immediate clarity for reproducibility while maintaining focus on the most critical parameters. 

For the tree-based models, we added: "All tree-based models were initialized with hyperparameters typical for remote sensing applications. Specifically, RF was configured with 100 trees (n_estimators=100) and a maximum depth of 10 (max_depth=10). XGBoost and LightGBM were both configured with 100 trees (n_estimators=100), maximum depth of 6 (max_depth=6), and learning rate of 0.1 (learning_rate=0.1). These parameters represent a balance between model complexity and computational efficiency while maintaining robust predictive performance." [Page 15, Lines 443-449: Added tree-based model hyperparameters]
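For convenience, configurations along the lines of the following sketch reproduce the reported settings in scikit-learn, XGBoost, and LightGBM; the random seeds, variable names, and any parameters not stated in the text are assumptions.

```python
from sklearn.ensemble import RandomForestRegressor
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor

# Hyperparameters as reported in the revised methods; other arguments left at library defaults
rf = RandomForestRegressor(n_estimators=100, max_depth=10, random_state=42)
xgb = XGBRegressor(n_estimators=100, max_depth=6, learning_rate=0.1, random_state=42)
lgbm = LGBMRegressor(n_estimators=100, max_depth=6, learning_rate=0.1, random_state=42)

# Each model is then fit on the 336-feature predictor matrix and LiDAR-derived heights, e.g.:
# rf.fit(X_train, y_train)
```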

For the neural network, we added: "The model used Adam optimizer with a learning rate of 0.001 and batch size of 32. During the AutoKeras search process, the algorithm explored architectures ranging from 1 to 10 layers, with 16 to 512 neurons per layer, and learning rates between 0.0001 and 0.1, with a maximum of 5 trials and 100 training epochs per trial." [Page 17, Lines 524-528: Added neural network hyperparameters and search boundaries]
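A minimal AutoKeras sketch consistent with the reported settings is shown below. The variable names are placeholders, and constraining the search to exactly the reported boundaries (1-10 layers, 16-512 neurons per layer, learning rates 1e-4 to 0.1) would require a customized search space, so this should be read as an illustrative sketch rather than the authors' configuration.

```python
import autokeras as ak

# Structured-data regressor: AutoKeras searches fully connected architectures and
# optimizer settings internally (the manuscript reports 5 trials over 1-10 layers,
# 16-512 neurons per layer, and learning rates between 1e-4 and 0.1).
reg = ak.StructuredDataRegressor(max_trials=5, overwrite=True, seed=42)

# 100 training epochs per trial with batch size 32, as reported; X_*/y_* are the
# predictor matrices and LiDAR-derived canopy heights (placeholder names)
reg.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=100, batch_size=32)

best_model = reg.export_model()  # final Keras model (reported: Adam, learning rate 0.001)
```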

Comment 2.6: "You're absolutely right, my apologies—Figure 5 is indeed clearly annotated with the values, so that's perfectly clear. I should have looked more carefully! While we're in the Results section, I did notice that the description of the data split (90/5/5) is still here. I think that detail would be better placed in the Methodology section (2.5.4) to keep the Results focused purely on the findings themselves. It's a small organizational thing that would help with the flow."

Response 2.6: We appreciate the reviewer's attention to manuscript organization. However, we would like to clarify that the data split description (90/5/5) is already appropriately placed in the Methodology section (Section 2.3) where it belongs, as it describes our experimental design and data preparation procedures. We note that the reviewer mentioned section 2.5.4, but our manuscript's Methods section ends at 2.3.3, with Results beginning at Section 3. The Results section focuses on the findings and model performance metrics. We believe the current organization maintains clear separation between methodological details and results, which supports good scientific writing practice. [Page 14, Lines 398-399: Data split description already appropriately placed in Methods section]

Comment 2.7: "The new comparisons with Csillik and Potapov are spot-on, and the quantified impact on carbon accounting is exactly what I was hoping for—it makes the implications much more concrete for the reader. You've thoroughly addressed the bias in very tall forests, but I was left wondering if the general humid tropical environment of Borneo itself poses any specific challenges that your model handled well or could be further improved upon. A sentence or two speculating on this—perhaps on how the model performs across different forest types within the region (e.g., peat swamps vs. lowland dipterocarp) or how the high humidity influences the satellite data—could add another interesting layer to the discussion."

Response 2.7: We appreciate this insightful suggestion and have added discussion of how the humid tropical environment of Borneo presents specific challenges and opportunities for our modeling approach. We have included commentary on expected model performance across different forest types (peat swamps vs. lowland dipterocarp) and discussion of how high humidity and atmospheric conditions influence satellite data quality and model performance. This adds valuable ecological context to our methodological discussion.

Specifically, we added: "The humid tropical environment of Borneo presents specific challenges for canopy height mapping that our model had to navigate. The region's heterogeneous forest ecosystems, ranging from lowland dipterocarp forests to peat swamps and montane formation, exhibit distinct structural characteristics that influence model performance. Dipterocarp-dominated forests, with their exceptionally tall emergent trees and complex vertical structure, likely contribute to the underestimation bias we observed in the tallest height classes. Meanwhile, peat swamp forests, with their unique spectral properties due to waterlogged conditions, may introduce additional variability in reflectance patterns that complicates height prediction. The persistent high humidity and frequent cloud cover across Borneo significantly constrain the availability of high-quality optical imagery, forcing our model to learn from a dataset with inherent temporal gaps and atmospheric noise. Despite these challenges, our approach demonstrated robust performance across the most common forest types, suggesting that deep learning methods can effectively capture the structural complexities of diverse tropical forest ecosystems through their ability to identify subtle patterns in multi-source data. Future refinements could include developing forest-type specific models or incorporating more detailed ecological stratification to account for the structural and spectral differences between Borneo's distinct forest ecosystems." [Page 21, Lines 658-674: Added comprehensive discussion of tropical environment challenges and forest type analysis]

 

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

1. The writing is very careless; there are sections 1.1 and 1.4 in the Introduction, but sections 1.2 and 1.3 are missing.

2. The introduction of LiDAR data is not detailed enough. It needs further explanation regarding the acquisition time, range, and LiDAR sensors.

3. There is no information on how many plots were used for the field observations of aboveground carbon density (ACD). The time, range, and method of data collection for the plots are not described. If the biomass estimated using the empirical formula based on LiDAR is considered the true value, the reliability of the research results is questionable. The canopy height obtained from LiDAR data was not validated using ground observation data.

4. The inversion results of different machine learning models were not compared or deeply analyzed. Only a simple display of a local inversion result map was provided.

5. The author’s writing lacks diligence, and the quality of the article is low. The technical methods are too simplistic. The article needs a complete revision to improve its quality before it can be considered for publication.

Comments on the Quality of English Language

Minor editing of English language required

Author Response

see attachment below

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The manuscript analyzed an extensive dataset on carbon stock estimation with machine learning. The authors have trained multiple ML algorithms which provided much improved ACD estimation than the conventional RF method. 

However, my largest concern is that the authors did not validate with any ground truth data. In Borneo and other Southeast Asian regions, there are many different forest types with varied tree species. The authors used the ACD equation directly from Jucker et al. (2018) who targeted particular types of lowland forests of Borneo, but there are so many other forest types within the study sites of the submitted manuscript. The presented justification of the allometric equations by Chave et al. (2005) is not convincing since there are no DBH and wood density data available. The authors mentioned REDD+ and pointed out the importance of monitoring accurate carbon stocks in the introduction, however, it is impossible without the verification. Therefore, I do not recommend this manuscript to be considered for publication.

I genuinely think remote sensing data presented here should be appropriately utilized especially the LiDAR data which must have been acquired at great cost and effort. Therefore, I recommend that the authors include field data so that the presented approach would be meaningful.

 

Author Response

see attachment below

Author Response File: Author Response.pdf
