Open Access Article

Peer-Review Record

Advancing Forest Inventory and Fuel Monitoring with Multi-Sensor Hybrid Models: A Comparative Framework for Basal Area Estimation

Remote Sens. 2026, 18(6), 852; https://doi.org/10.3390/rs18060852

by Nasrin Salehnia¹, Peter Wolter¹,*, Brian R. Sturtevant² and Dalia Abbas Iossifov³

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Reviewer 3: Anonymous

Submission received: 8 January 2026 / Revised: 18 February 2026 / Accepted: 27 February 2026 / Published: 10 March 2026

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

It is a good and well-written manuscript overall. Please find my specific comments below:

Line 99: Please define the indices before introducing them in the text.

Line 100: Consider removing the duplicate word “that.”

Line 134: It may help to briefly introduce this method earlier to improve the flow of the technical description.

Lines 150–151: Please define GA-xPLS, RF-xPLS, and SVR-xPLS when they are first mentioned to improve readability.

Lines 152–154: Please revise this sentence for clarity.

Line 160: Consider referencing the four approaches here as well for consistency.

Line 180: Scientific names should be italicized throughout the manuscript.

Line 182: Balsam fir has already been defined earlier; repeating the full name may not be necessary.

Section 2.1.1: Why was a single-date Landsat image used instead of multi-temporal Sentinel-2 imagery? Please clarify the rationale for the date selection.

Table 1: Notations such as B2_b4 and b2_b7 are nonstandard and difficult to interpret. Please use consistent, standard band notation for both Landsat-9 and Sentinel-2. Additionally, LiDAR-derived predictor variables are not defined; please include descriptions.

Table 2: Please explain the criteria for selecting only the listed spectral vegetation indices (SVIs).

Table 3: The NDVI formula appears to differ from the standard definition. Please verify and provide references for all listed equations.

Table 4: The term “Whole name” does not appear to be standard; please clarify or revise.

Figure 5: Image quality should be improved. Please ensure consistent, high-resolution figures throughout the manuscript.

Lines 690–693: I recommend considering the following study, particularly Section 4.2:
Bhattarai, R., Rahimzadeh-Bajgiran, P., & Mech, A. (2023). Estimating nutritive, non-nutritive and defense foliar traits in spruce-fir stands using remote sensing and site data. Forest Ecology and Management, 549, 121461.
It may be valuable to discuss how their findings compare with yours, especially since they did not observe a relationship between SWIR reflectance and canopy water or dry matter content. Given the methodological similarities and overlap in species, integrating this comparison could strengthen your discussion.

Author Response

Response to Reviewer #1

We would like to thank the reviewer for the positive assessment of our manuscript and for their insightful, constructive feedback. We have carefully addressed each of the points raised, and we believe these revisions have significantly strengthened the quality and clarity of the paper.

-----------------------------------------------------------------------------------------------------------------

It is a good and well-written manuscript overall. Please find my specific comments below:

Comment 1 (Line 99): Define the indices before introducing them in the text.

Response: Done. We expanded acronyms at first mention. Revision made (Introduction): We now write “normalized difference vegetation index (NDVI)” and “moisture stress index (MSI)” before using the abbreviations (line 102 in the revised version).

Comment 2 (Line 100): Consider removing the duplicate word “that.”

Response: Done. We removed the duplicated word to improve readability.

Comment 3 (Line 134): Briefly introduce this method earlier to improve flow.

Response: Done. We added a brief introduction of genetic algorithms (GA)—a population-based evolutionary approach for feature subset selection using selection, crossover, and mutation—in the Introduction to improve the flow into the later technical GA-xPLS description (line 136 in the revised version). The full GA procedure used in this study is described in the Materials and Methods section (2.2.2. GA-xPLS).

Comment 4 (Lines 150–151): Define GA-xPLS, RF-xPLS, and SVR-xPLS at first mention.

Response: Done. We expanded each hybrid acronym at first mention. Thanks!

Comment 5 (Lines 152–154): Please revise this sentence for clarity.

Response: Done. We rewrote the sentence to reduce length and improve clarity while preserving the original meaning.

Comment 6 (Line 160): Reference the four approaches here for consistency.

Response: Done. We explicitly list the four approaches in the study objectives to match the terminology used earlier.

Comment 7 (Line 180): Scientific names should be italicized throughout.

Response: Done. We italicized genus–species names consistently throughout the manuscript (authors of species names remain non-italic).

Comment 8 (Line 182): Balsam fir already defined earlier; repeating full name may not be necessary.

Response: Done. Abies balsamea is defined at first mention; we removed the repeated full scientific name later in the manuscript and now refer to it as balsam fir for consistency.

Comment 9 (Section 2.1.1): Why single-date Landsat instead of multi-temporal Sentinel-2? Clarify rationale for date selection.

Response: Clarified. Sentinel-2 predictors are multi-temporal to capture phenology, while Landsat-9 was included as a complementary, independent late-season snapshot selected based on cloud-free availability in the senescence window; we added the rationale and date-selection criteria (cloud/haze minimization and phenological timing). We also note continuity with prior regional work that leverages late-season imagery for strong senescence contrast.

We therefore added the following paragraph to Section 2.1.1 (lines 203-210 in the revised version):

“Sentinel-2 predictors were assembled as multi-temporal observations to capture intra-seasonal phenology and senescence dynamics. In contrast, Landsat-9 was included as a complementary single-date snapshot to (i) provide an independent optical sensor perspective and (ii) maintain continuity with regional forest-structure studies that leverage late-season imagery. The Landsat-9 acquisition date was selected because it provided the most suitable cloud-free/low-haze coverage within the late-season senescence window, when contrast among forest types is often enhanced in visible and SWIR wavelengths.”

Comment 10 (Table 1): Nonstandard band notations; use standard notation for Landsat-9 and Sentinel-2; define LiDAR predictors.

Response: Done. We revised Table 1 to use standard, explicit band notation for Sentinel-2 and Landsat-9 (listing bands individually rather than using nonstandard range shorthand). We also added a note in Table 1 directing readers to Table 4, where all LiDAR predictor abbreviations and definitions are provided (“LiDAR predictor abbreviations and definitions are provided in Table 4.”).

Comment 11 (Table 2): Explain criteria for selecting only the listed SVIs.

Response: Done. We added a brief selection rationale: indices were chosen to represent distinct biophysical sensitivities (greenness/chlorophyll, moisture, senescence/structure) and to limit redundancy/multicollinearity while aligning with indices used in related forest-structure studies (lines 214-219 in the revised version).

Comment 12 (Table 3): NDVI formula differs from standard; verify and provide references for all equations.

Response: Corrected. We updated the NDVI equation to the standard definition and verified all listed indices. References for the indices used are given in the Introduction. We also added the following note below Table 3: “Prior to modeling, NDVI, MSI, and NDAI were linearly rescaled as (VI+1) ×100, and SVR was multiplied by 1.5, to harmonize predictor ranges; these monotonic transformations do not change observation rankings or Pearson correlations”.
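The invariance claim in the table note can be checked numerically. The sketch below is illustrative only: it uses the standard NDVI definition, (NIR − Red) / (NIR + Red), and the (VI + 1) × 100 rescaling quoted above, but the reflectance and basal-area values are randomly generated placeholders, and the function names are hypothetical rather than taken from the authors' code.

```python
import numpy as np

def ndvi(nir, red):
    """Standard NDVI: (NIR - Red) / (NIR + Red)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red)

def rescale_index(vi):
    """Linear rescaling quoted in the response: (VI + 1) * 100."""
    return (np.asarray(vi, dtype=float) + 1.0) * 100.0

# Hypothetical reflectances and basal-area values, for illustration only.
rng = np.random.default_rng(0)
red = rng.uniform(0.02, 0.10, size=50)
nir = rng.uniform(0.20, 0.50, size=50)
basal_area = rng.uniform(0.0, 60.0, size=50)

raw = ndvi(nir, red)
scaled = rescale_index(raw)

# Pearson correlation with the response, before and after rescaling.
r_raw = np.corrcoef(raw, basal_area)[0, 1]
r_scaled = np.corrcoef(scaled, basal_area)[0, 1]

# Observation ranking, before and after rescaling.
same_order = np.array_equal(np.argsort(raw), np.argsort(scaled))
```

Any transformation of the form a·x + b with a > 0 leaves both the Pearson correlation and the ordering of observations unchanged, so the same reasoning covers multiplying the SWIR-to-visible ratio by 1.5.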

Comment 13 (Table 4): “Whole name” is not standard; please clarify or revise.

Response: Done. We renamed the column to “Metric description”.

Comment 14 (Figure 5): Improve image quality; ensure consistent high-resolution figures.

Response: Addressed. We checked Fig. 5 and found that Fig. 5d was of insufficient quality. We re-exported the figure at publication quality (600 dpi; consistent fonts/line weights). Thanks!

Comment 15 (Lines 690–693): Consider Bhattarai et al. (2023) and compare findings (esp. SWIR relationships).

Response: Good idea! Added. We incorporated this study in the Discussion and contrasted their focus on foliar traits (with strong emphasis on red-edge indices and site variables) with our structural target (basal area) and our multi-sensor + LiDAR framework; we discuss why SWIR predictors may behave differently across trait-vs-structure modeling (lines 692-703 in the revised version).

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This paper offers a meaningful contribution to forest inventory and fuel monitoring by developing a comparative framework for basal area (BA) estimation, blending multi-sensor data fusion with hybrid feature-selection models. The study’s strength lies in its systematic benchmarking of four pipelines—xPLS, GA-xPLS, RF-xPLS, and SVR-xPLS—using identical predictors and field data, which delivers clear, actionable insights for fire risk management. Below are detailed observations and suggestions to strengthen the work:

Key Strengths

The integration of Sentinel-2, Landsat-9, and LiDAR data effectively combines spectral, phenological, and structural information, addressing the longstanding challenge of handling high-dimensional, collinear data in forest remote sensing.
The head-to-head model comparison, backed by bootstrap confidence intervals and residual analysis, provides an unambiguous performance ranking (RF-xPLS outperforming GA-xPLS, xPLS, and SVR-xPLS) and ensures the results are robust.
Focusing on conifer BA—critical for assessing ladder fuel and crown fire risks—ties the research to real-world forest management needs. The 27-predictor parsimonious set is particularly valuable, as it enables practical, large-scale wall-to-wall mapping.
The standardized preprocessing of 175 predictors, thoughtful field plot design (five subplots per cluster), and rigorous cross-validation (LOOCV, OOB) reflect careful attention to methodological rigor.

Recommendations for Revision

Clarify model hyperparameters: While the hybrid pipelines are described, specific hyperparameter settings (e.g., number of trees in Random Forest, population size and iterations for Genetic Algorithm, penalty parameter C and ε for SVR) are not fully disclosed. Adding these details—perhaps in a supplementary table—would make the work easier to replicate.
Address sample representativeness: The paper notes mild under-prediction in high-BA stands due to limited samples. It would help to specify how many of the 141 plots fall into the high-BA category (e.g., >60 m²·ha⁻¹) and discuss practical steps to mitigate this limitation, such as targeted field sampling in underrepresented areas or synthetic data augmentation.
Deepen ecological interpretation of predictors: The 27 selected predictors (SWIR bands, red-edge/NIR features, LiDAR HQUAD) are linked to biophysical signals, but more context on their ecological relevance would enhance the paper. For example, why do March and August SWIR bands perform better than other seasons? How does HQUAD capture vertical structure differently across conifer species like Pinus resinosa and Abies balsamea?
Discuss model transferability: The study is confined to Minnesota’s hemiboreal forests. Adding a brief section on how RF-xPLS might perform in other temperate–boreal regions—accounting for differences in forest composition, climate, and sensor data availability—would broaden the research’s impact.
Add sensor contribution analysis: To highlight the unique value of each sensor (Sentinel-2, Landsat-9, LiDAR), a supplementary ablation experiment (e.g., testing RF-xPLS with only spectral data, only LiDAR, or combinations) would quantify how each data source improves BA estimation.
Improve figure clarity: Appendix figures like A1-A2 and B1-B3 lack detailed captions and clear axis labels (e.g., “z-scored spectral-band predictors” do not specify which bands are included). Simplifying these figures and adding concise legends would make them more accessible to readers.

Minor Notes

Ensure consistency with terminology: “SVR” is used for both Short Wave Infrared to Visible Ratio and Support Vector Regression. A brief footnote when the term first appears would help distinguish the two.
In Section 3.5, clarify whether nonnegativity clipping of predictions affected the results—for example, how many samples were clipped and whether this impacted RMSE or R² values.

Comments on the Quality of English Language

The manuscript’s English is generally clear, academic, and consistent with remote sensing research standards. Minor improvements include splitting a few overly long sentences in the Discussion section for readability and double-checking the consistency of abbreviations (e.g., ensuring “VI” is defined on first use in all sections). No major grammatical errors were identified.

Author Response

Response to Reviewer #2

Thank you for the positive feedback regarding our methodology and the significance of our findings for forest fire risk management. We have carefully incorporated all of your suggestions into the revised manuscript. We believe these changes have strengthened the paper and addressed the specific observations you noted.

-----------------------------------------------------------------------------------------------------------------

Comments and Suggestions for Authors

Key Strengths

The integration of Sentinel-2, Landsat-9, and LiDAR data effectively combines spectral, phenological, and structural information, addressing the longstanding challenge of handling high-dimensional, collinear data in forest remote sensing.
The head-to-head model comparison, backed by bootstrap confidence intervals and residual analysis, provides an unambiguous performance ranking (RF-xPLS outperforming GA-xPLS, xPLS, and SVR-xPLS) and ensures the results are robust.
Focusing on conifer BA—critical for assessing ladder fuel and crown fire risks—ties the research to real-world forest management needs. The 27-predictor parsimonious set is particularly valuable, as it enables practical, large-scale wall-to-wall mapping.
The standardized preprocessing of 175 predictors, thoughtful field plot design (five subplots per cluster), and rigorous cross-validation (LOOCV, OOB) reflect careful attention to methodological rigor.

Recommendations for Revision

Comment 1: Clarify model hyperparameters: While the hybrid pipelines are described, specific hyperparameter settings (e.g., number of trees in Random Forest, population size and iterations for Genetic Algorithm, penalty parameter C and ε for SVR) are not fully disclosed. Adding these details—perhaps in a supplementary table—would make the work easier to replicate.

Response: Thank you for this suggestion. To improve reproducibility, we added a new Appendix Table A1 that reports the key hyperparameter settings and implementation details for all hybrid pipelines, including RF-xPLS (e.g., number of trees, minimum leaf size, mtry rule, random seed), GA-xPLS (e.g., population size, maximum generations, crossover/mutation settings, subset-size constraints, CV scheme), and SVR-xPLS (kernel choice, C, and the response-specific ε rule, with LOOCV). Please see Table A1 in Appendix A.

Comment 2: Address sample representativeness: The paper notes mild under-prediction in high-BA stands due to limited samples. It would help to specify how many of the 141 plots fall into the high-BA category (e.g., >60 m²·ha⁻¹) and discuss practical steps to mitigate this limitation, such as targeted field sampling in underrepresented areas or synthetic data augmentation.

Response: Thank you for this suggestion. We now quantify plot representativeness in the upper tail of total basal area (TOTBA). Among the 141 plots, 107 (75.9%) fall in <40 m²·ha⁻¹, 27 (19.1%) in 40–60 m²·ha⁻¹, and only 7 (5.0%) exceed 60 m²·ha⁻¹. This limited sample support at high TOTBA likely contributes to the mild under-prediction observed at the upper tail. We have added these counts and the associated interpretation directly in the Discussion section (lines 590-595 in the revised version), and we expanded the limitation statement with practical mitigation steps, including targeted/stratified field sampling to increase coverage of high-stocking stands and using weighting or stratified resampling during model training to reduce imbalance effects. We also note that synthetic augmentation could be explored, but should be applied cautiously to preserve ecological realism and the joint distribution of predictors.
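The stratum shares reported above, and one possible form of the weighting mentioned as a mitigation step, can be sketched as follows. The counts are those given in the response; the inverse-frequency weighting rule is an assumption made for illustration and is not described in the manuscript.

```python
import numpy as np

# Plot counts per total-basal-area stratum (m^2/ha), as reported in the response.
strata = ["<40", "40-60", ">60"]
counts = np.array([107, 27, 7])
n_plots = int(counts.sum())

# Percentage of plots in each stratum.
shares = 100.0 * counts / n_plots

# Assumed rule (not from the manuscript): inverse-frequency weights,
# normalised so that the average weight over all plots equals 1.
weights = n_plots / (len(counts) * counts)
```

Under this assumed rule, each plot above 60 m²·ha⁻¹ would carry roughly 15 times the weight of a plot below 40 m²·ha⁻¹ (107 / 7), which illustrates how strongly a model fit would have to lean on only seven observations.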

Comment 3: Deepen ecological interpretation of predictors: The 27 selected predictors (SWIR bands, red-edge/NIR features, LiDAR HQUAD) are linked to biophysical signals, but more context on their ecological relevance would enhance the paper. For example, why do March and August SWIR bands perform better than other seasons? How does HQUAD capture vertical structure differently across conifer species like Pinus resinosa and Abies balsamea?

Response: Thank you for this helpful suggestion. We agree that additional ecological context strengthens interpretation of the retained predictors. Accordingly, we expanded the Discussion (lines 677-685 in the revised version) to (i) clarify why March and August SWIR features can be especially informative for basal-area mapping and (ii) explain the ecological meaning of the LiDAR HQUAD metric (lines 707-716 in the revised version) across canopy positions. Specifically, we note that early-season (March) conditions (leaf-off for deciduous components and reduced understory activity) increase the relative influence of evergreen crowns and background/illumination contrasts, while SWIR wavelengths remain sensitive to canopy moisture and dry-matter absorption. In late summer (August), forests are near peak leaf area and may experience stronger moisture limitation; SWIR-based predictors can therefore add contrast among low-, moderate-, and high-stocking stands, improving discrimination toward the upper end of structural gradients. We also added text explaining that HQUAD is a quadratic-height summary that disproportionately weights taller returns, making it sensitive to overstory dominance and canopy stratification; thus, it tends to track tall, overstory-dominant conifers such as Pinus resinosa more directly, while variation in subcanopy-associated species such as Abies balsamea may be captured more indirectly in mixed stands where the overstory controls the height distribution. These additions provide clearer ecological interpretation of why SWIR seasonality and HQUAD were retained among the 27 predictors.

Comment 4: Discuss model transferability: The study is confined to Minnesota’s hemiboreal forests. Adding a brief section on how RF-xPLS might perform in other temperate–boreal regions—accounting for differences in forest composition, climate, and sensor data availability—would broaden the research’s impact.

Response: Thank you for this valuable suggestion. We agree that discussing transferability broadens the impact of the study. We have added a short model transferability paragraph in the Discussion describing how RF-xPLS may generalize to other temperate–boreal regions, and what factors could limit or require recalibration (differences in species composition, disturbance history, phenology/climate, and sensor/LiDAR availability). We also outline practical steps for applying the approach elsewhere, including predictor harmonization across sensors, seasonal timing alignment, and local calibration/validation with a modest number of field plots to ensure robust performance under new conditions.

Comment 5: Add sensor contribution analysis: To highlight the unique value of each sensor (Sentinel-2, Landsat-9, LiDAR), a supplementary ablation experiment (e.g., testing RF-xPLS with only spectral data, only LiDAR, or combinations) would quantify how each data source improves BA estimation.

Response: Thank you for this helpful suggestion. We added a sensor-ablation experiment to quantify the incremental contribution of Sentinel-2, Landsat-9, and LiDAR within the RF-xPLS predictor set. Specifically, we refit RF models using (i) Sentinel-2 only, (ii) Landsat-9 only, (iii) LiDAR only, and (iv) sensor combinations (Sentinel-2+Landsat-9, Sentinel-2+LiDAR, Landsat-9+LiDAR), and compared them to the full multi-sensor model. Results are reported in Table E1 (Appendix E). Overall, Sentinel-2 explains most of the predictive skill (S2-only: RMSE=3.70, R²=0.84), while the full multi-sensor configuration provides the best performance (All sensors: RMSE=3.48, R²=0.89). Landsat-9 and LiDAR alone are substantially less informative, but contribute modest gains when combined with Sentinel-2, consistent with complementary late-season optical information (Landsat-9) and direct canopy-structure sensitivity (LiDAR).

Please see the Supplementary file for details.

We have also added the below sentences in the “Discussion” (Lines 601-606): “A supplementary sensor-ablation analysis (Appendix E: Table E1) shows that Sentinel-2 satellite sensor data explains most of the predictive skill (S2-only: RMSE=3.70, R²=0.84), while the full multi-sensor model performs best overall (S2+L9+LiDAR: RMSE=3.48, R²=0.88), indicating modest but consistent gains from combining Landsat-9 and LiDAR sensor data with Sentinel-2 satellite sensor data.”

Comment 6: Improve figure clarity: Appendix figures like A1-A2 and B1-B3 lack detailed captions and clear axis labels (e.g., “z-scored spectral-band predictors” do not specify which bands are included). Simplifying these figures and adding concise legends would make them more accessible to readers.

Response: Thank you for noting this. We revised the captions of the Appendix figures mentioned (Appendix B, Figures B1 and B2) to improve readability and reproducibility. Specifically, we expanded the captions to clearly define each panel, metric, and unit. The changes simplify interpretation and make the figures self-contained without requiring readers to infer which predictors or band groups are included.

Minor Notes

Ensure consistency with terminology: “SVR” is used for both Short Wave Infrared to Visible Ratio and Support Vector Regression. A brief footnote when the term first appears would help distinguish the two.

Response: Done! Thank you for noting this potential ambiguity. We revised the manuscript to avoid confusion between SVR (Short-Wave Infrared to Visible Ratio) and SVR (Support Vector Regression). We also added the following note to Tables 3 and 5.

“‘SVR’ in predictor names denotes the Shortwave Infrared–to–Visible Ratio (spectral index), whereas ‘SVR-xPLS’ refers to Support Vector Regression in the modeling pipeline.”

In Section 3.5, clarify whether nonnegativity clipping of predictions affected the results—for example, how many samples were clipped and whether this impacted RMSE or R² values.

Response: Thank you for this important suggestion. We explicitly evaluated the impact of non-negativity clipping by recomputing RMSE and R² for every response and method before clipping (raw predictions) and after clipping using the same evaluation set (n = 141 observations). Clipping was implemented as ŷ = max(0, ŷ), consistent with the physical constraint that basal area cannot be negative. Across all response × method combinations, the performance metrics were unchanged at the reported precision, with maximum absolute differences of ΔR² = 0.00 and ΔRMSE = 0.00, indicating that the clipping procedure had negligible influence on model evaluation. Notably, RF-xPLS produced no negative predictions (0% in all responses), so clipping had no effect for that method.

We also added a clarifying statement in the manuscript addressing this point (Lines 564-567):
“Nonnegativity clipping (ŷ = max(0, ŷ)) was applied only to a small number of negative predictions and had negligible impact on RMSE and R².”
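The before/after comparison described in this response can be sketched as below. The clipping rule ŷ = max(0, ŷ) is taken from the response; the helper names and the small arrays of observed and predicted values are hypothetical and serve only to make the example runnable.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def clipping_report(y_true, y_pred):
    """Compare metrics on raw predictions and on predictions clipped at zero."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    clipped = np.maximum(0.0, y_pred)  # y_hat = max(0, y_hat)
    return {
        "n_clipped": int(np.sum(y_pred < 0)),
        "delta_rmse": rmse(y_true, clipped) - rmse(y_true, y_pred),
        "delta_r2": r2(y_true, clipped) - r2(y_true, y_pred),
    }

# Hypothetical observed and predicted basal areas (m^2/ha), illustration only.
y_obs = np.array([0.0, 2.0, 10.0, 25.0, 40.0])
y_hat = np.array([-0.5, 1.5, 11.0, 24.0, 38.0])
report = clipping_report(y_obs, y_hat)
```

Because observed basal area is never negative, moving a negative prediction up to zero can only shrink its absolute error, so clipping cannot increase RMSE or decrease R² when both are computed on the same observations.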

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

General Comments
This manuscript presents a comparative analysis of variable selection–based machine learning pipelines for predicting species-specific basal area under high-dimensional and collinear data environments by integrating multi-source satellite imagery and LiDAR data. The study addresses a topic with clear practical relevance to wildfire risk management and forest management applications. The research objectives are well defined, and the analytical scale and methodological framework are generally aligned with the scope of the journal.
However, the current version of the manuscript requires substantial improvement in terms of the rigor of the validation design, external validity, and reproducibility of data and code. In particular, the insufficient separation between the variable selection process and performance evaluation may compromise the reliability of the reported results. If these major issues are not adequately addressed, the generalizability and scientific contribution of the study may be limited.
Therefore, the reviewer recommends Major Revision prior to reconsideration for publication.

Major Comments
1. Validation Design and Potential Overfitting
The manuscript reports both in-sample and validation-path performance metrics. However, for some methods (notably xPLS and GA-xPLS), substantial discrepancies are observed between these two measures. This raises concerns regarding the insufficient separation between the variable selection process and model evaluation.
The procedure used to derive validation-path RMSE should be described more explicitly.
If possible, the authors are encouraged to present additional results based on nested cross-validation or an independent validation dataset, in which the variable selection and performance evaluation stages are clearly separated.
Additional justification is required to demonstrate that the reported results are not affected by optimistic bias.
2. Lack of External or Spatial Validation
The present analysis is based on data collected from a single study area (Kawishiwi Ranger District), and no explicit assessment is provided regarding the transferability of the proposed models across different regions or time periods.
The inclusion of external validation results, such as those based on neighboring regions, different years, or spatial block cross-validation, is strongly recommended.
If external validation is not feasible, the authors should more clearly delimit the practical applicability of their findings.
3. Data Availability and Reproducibility
Some inconsistencies are observed between the data description in the Data Availability Statement and the datasets reported in the main text. In addition, the analytical code and key model configurations are not sufficiently documented.
The authors should clarify the sources and access procedures for the satellite imagery, LiDAR data, and field measurements used in the analysis.
To enhance reproducibility, the authors are encouraged to provide the main scripts or detailed processing workflows used for variable selection and model training.
4. Interpretation of Structural and Understory Limitations
Although the manuscript acknowledges the limitations of remote sensing data in capturing understory vegetation and ladder fuels, the implications of these limitations for result interpretation and practical application are not sufficiently discussed.
The predictive outputs should be interpreted more cautiously, emphasizing their role in identifying relative risk patterns rather than providing absolute basal area estimates.
The conclusions and management implications should be revised accordingly to reflect these structural limitations.
5. Effects of Non-Negativity Clipping on Model Performance
The manuscript applies non-negativity clipping to negative predictions; however, the influence of this procedure on RMSE and coefficient of determination values is not explicitly evaluated.
The authors are encouraged to report and compare model performance before and after clipping in order to ensure fair and transparent evaluation.

Minor Comments
Several sentences are overly long and complex, which may reduce readability. Simplification of sentence structures is recommended.
The interpretation and discussion of some figures and tables should be expanded for greater clarity.
The Data Availability Statement should be revised to ensure consistency with the datasets actually used in the study.
The consistency of abbreviations and variable notations should be carefully checked throughout the manuscript.

Author Response

Response to Reviewer #3

Thank you for your valuable comments on our work. We have taken your suggestions seriously and have applied the necessary revisions to improve the rigor of our validation process and the reproducibility of our findings. We are confident that the paper is now much stronger and more scientifically sound thanks to your guidance.

General Comments

This manuscript presents a comparative analysis of variable selection–based machine learning pipelines for predicting species-specific basal area under high-dimensional and collinear data environments by integrating multi-source satellite imagery and LiDAR data. The study addresses a topic with clear practical relevance to wildfire risk management and forest management applications. The research objectives are well defined, and the analytical scale and methodological framework are generally aligned with the scope of the journal.
However, the current version of the manuscript requires substantial improvement in terms of the rigor of the validation design, external validity, and reproducibility of data and code. In particular, the insufficient separation between the variable selection process and performance evaluation may compromise the reliability of the reported results. If these major issues are not adequately addressed, the generalizability and scientific contribution of the study may be limited. Therefore, the reviewer recommends Major Revision prior to reconsideration for publication.

Major Comments

Comment 1: Validation Design and Potential Overfitting

The manuscript reports both in-sample and validation-path performance metrics. However, for some methods (notably xPLS and GA-xPLS), substantial discrepancies are observed between these two measures. This raises concerns regarding the insufficient separation between the variable selection process and model evaluation.
The procedure used to derive validation-path RMSE should be described more explicitly.
If possible, the authors are encouraged to present additional results based on nested cross-validation or an independent validation dataset, in which the variable selection and performance evaluation stages are clearly separated. Additional justification is required to demonstrate that the reported results are not affected by optimistic bias.

Response:

We agree that clearer separation between (i) refit diagnostics and (ii) cross-validated selection-path performance is needed. We revised the Methods to explicitly define how “validation-path RMSE” is computed for xPLS and GA-xPLS and added Table A1 (Appendix A) summarizing the key settings (CV scheme, selection criterion, and hyperparameters) for each pipeline.

For xPLS, at each elimination step we evaluate each candidate variable removal using leave-one-out cross-validation (LOOCV) within plsregress(...,'CV',C1), with the number of PLS components selected by our xPLS rule. The validation-path RMSE reported in Fig. 5b is the minimum pooled LOOCV RMSE along the elimination path, and the selected subset is the one that minimizes this pooled LOOCV error.
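
The elimination path described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code: it replaces plsregress with a hand-written single-component PLS1 model, and the fixed one-component choice, the function names, and the greedy "drop the variable whose removal gives the lowest LOOCV RMSE" rule are all assumptions made for illustration.

```python
import math

def fit_pls1(X, y):
    """Fit a one-component PLS1 regression (assumed stand-in for plsregress)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximum covariance between X and y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w)) or 1.0
    w = [v / norm for v in w]
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    tt = sum(v * v for v in t)
    q = sum(t[i] * yc[i] for i in range(n)) / tt if tt else 0.0
    return x_mean, y_mean, w, q

def predict_pls1(model, x):
    x_mean, y_mean, w, q = model
    return y_mean + q * sum((x[j] - x_mean[j]) * w[j] for j in range(len(x)))

def loocv_rmse(X, y):
    """Pooled LOOCV RMSE: every plot is predicted by a model fitted without it."""
    sq_err = 0.0
    for i in range(len(y)):
        model = fit_pls1(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        sq_err += (predict_pls1(model, X[i]) - y[i]) ** 2
    return math.sqrt(sq_err / len(y))

def backward_elimination(X, y, min_vars=1):
    """Greedy elimination; returns the subset with minimum LOOCV RMSE on the path."""
    def score(cols):
        return loocv_rmse([[row[j] for j in cols] for row in X], y)

    active = list(range(len(X[0])))
    path = [(tuple(active), score(active))]
    while len(active) > min_vars:
        # Evaluate every candidate removal and keep the best one.
        best_rmse, drop = min((score([c for c in active if c != j]), j) for j in active)
        active.remove(drop)
        path.append((tuple(active), best_rmse))
    best_subset, best_rmse = min(path, key=lambda step: step[1])
    return best_subset, best_rmse, path
```

Because the same LOOCV scores both guide the elimination and are reported as the minimum along the path, the returned RMSE is a selection-path error rather than an independent test error, which is the distinction the reviewer raises.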

For GA-xPLS, the subset search is guided by a repeated K-fold CV fitness (K = 10, repeats = 3) computed on training data only for each candidate mask (with a hard constraint of ≤50 predictors and mild size penalties). After the GA converges, we compute the final selection-path RMSE using the same PLS component rule and CV metric. Refit (in-sample) diagnostics are reported separately, for interpretability and for consistency with the observation–prediction plots (e.g., Fig. 10).
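
To make the fitness definition concrete, the following Python sketch shows one way the repeated K-fold splits and the size constraint could be wired together. The multiplicative penalty, the `size_weight` value, and both function names are hypothetical; the response states only that a hard limit of 50 predictors and "mild size penalties" were used.

```python
import math
import random

def repeated_kfold(n, k=10, repeats=3, seed=0):
    """Yield (train_idx, test_idx) pairs for `repeats` reshuffled K-fold partitions."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        for fold in range(k):
            test = idx[fold::k]  # every k-th shuffled index forms one fold
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            yield train, test

def penalized_fitness(mask, cv_rmse, max_vars=50, size_weight=0.001):
    """Fitness to minimize: CV RMSE inflated slightly per selected predictor.

    `mask` is a 0/1 list over candidate predictors and `cv_rmse` is the
    repeated K-fold RMSE already computed for that mask (assumed inputs).
    """
    n_selected = sum(mask)
    if n_selected == 0 or n_selected > max_vars:
        return math.inf  # hard constraint: empty or oversized masks are rejected
    return cv_rmse * (1.0 + size_weight * n_selected)
```

In this sketch the CV RMSE for a mask would be obtained by fitting the model on each `train` index set and pooling squared errors over the matching `test` sets, so no held-out plot contributes to the fit that predicts it.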

We also added a short limitation statement noting that, as with any data-driven model selection, selection-path CV error can still be mildly optimistic, and thus the most reliable conclusions are the relative method ranking and stability patterns (supported by the cross-validated/bootstrapped summaries in Figs. 7–9), rather than claiming universal out-of-region generalization.

Comment 2: Lack of External or Spatial Validation
The present analysis is based on data collected from a single study area (Kawishiwi Ranger District), and no explicit assessment is provided regarding the transferability of the proposed models across different regions or time periods.
The inclusion of external validation results, such as those based on neighboring regions, different years, or spatial block cross-validation, is strongly recommended.
If external validation is not feasible, the authors should more clearly delimit the practical applicability of their findings.
Response: We appreciate the reviewer’s suggestion regarding external validation. While we agree that transferability is a key goal in remote sensing, we believe the current study provides a robust assessment of model stability and applicability for several reasons:

Significant Geographical and Ecological Scale: Although the KRD is a single administrative unit, it encompasses approximately 2,908 km². This area is larger than some small nations and contains a highly diverse hemiboreal forest mosaic. By sampling 141 plots across this vast landscape, our models are already tested against a wide range of stand structures, species compositions, and successional stages.
Robust Sampling Design: To ensure spatial representativeness and minimize "stand edge effects," each of our 141 field plots was established as a cluster of five variable-radius subplots within homogeneous forest associations of at least 5 ha. This design captures local spatial variability more effectively than single-point measurements.
Internal Validation and Uncertainty Quantification: We utilized rigorous internal validation protocols to guard against overfitting. This includes the use of nonparametric bootstrap resampling (B = 500) to provide sampling-robust uncertainty for our RMSE and R² results. Furthermore, all selection pipelines (including RF-xPLS) utilized leave-one-out cross-validation (LOOCV) or repeated 10-fold CV to score candidate subsets, ensuring the models generalize well to unseen data within the district.
Mechanistic Portability of Predictors: The 27-predictor subset identified by the RF-xPLS workflow was not selected by chance; these features are mechanistically coherent for BA. They include SWIR moisture sensitivity (linked to canopy closure), red-edge/NIR features (linked to chlorophyll and crown density), and a robust vertical-structure LiDAR metric (HQUAD). These are fundamental biophysical controls on forest structure that we expect to remain relevant across other northern temperate–boreal regions.
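
The bootstrap uncertainty mentioned above (B = 500) can be sketched as follows. This is an illustrative Python version that assumes the bootstrap resamples paired (observed, predicted) values and uses percentile intervals; the response does not say whether models were refitted within each resample, and the function names are not taken from the manuscript.

```python
import math
import random

def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def r_squared(obs, pred):
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def bootstrap_ci(obs, pred, metric, n_boot=500, alpha=0.05, seed=0):
    """Percentile interval for `metric` from resampling plots with replacement."""
    rng = random.Random(seed)
    n = len(obs)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # one bootstrap sample of plots
        stats.append(metric([obs[i] for i in idx], [pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

An interval obtained this way describes sampling variability among plots within the district. It does not by itself measure transferability to other regions, which is the concern raised in the comment.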

To address the reviewer’s point about delimiting applicability, we have added a statement to the Discussion section (Section 4). We clarify that while the models are robust for the 2,908 km² KRD, regional recalibration and external validation with a modest representative plot sample are recommended when transferring these models to regions with significantly different species compositions or disturbance histories (Lines 739-758).
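
For readers who want to try the spatial block cross-validation the reviewer suggests, a minimal Python sketch is given below. It was not part of the reported analysis. The square-grid blocking, the `block_size` parameter, and the largest-block-first fold assignment are assumptions chosen for simplicity, and plot coordinates are assumed to be in a projected system with metre units.

```python
from collections import defaultdict

def spatial_block_folds(coords, block_size, n_folds=5):
    """Assign plots to CV folds by square block so that neighbours share a fold.

    coords: list of (x, y) plot coordinates; block_size: block edge length in
    the same units. Returns a list of folds, each a list of plot indices.
    """
    blocks = defaultdict(list)
    for i, (x, y) in enumerate(coords):
        key = (int(x // block_size), int(y // block_size))
        blocks[key].append(i)

    folds = [[] for _ in range(n_folds)]
    # Place whole blocks, largest first, into whichever fold is currently smallest.
    for key in sorted(blocks, key=lambda k: (-len(blocks[k]), k)):
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(blocks[key])
    return folds
```

Holding out entire blocks keeps spatially autocorrelated neighbours out of the training set for the plots being predicted, which usually gives a more conservative error estimate than random folds.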

Comment 3: Data Availability and Reproducibility

Some inconsistencies are observed between the data description in the Data Availability Statement and the datasets reported in the main text. In addition, the analytical code and key model configurations are not sufficiently documented. The authors should clarify the sources and access procedures for the satellite imagery, LiDAR data, and field measurements used in the analysis. To enhance reproducibility, the authors are encouraged to provide the main scripts or detailed processing workflows used for variable selection and model training.

Response: We have expanded the discussion of key figures and tables throughout the manuscript to provide greater clarity. Specifically, we added detailed interpretations in the Results section for Figures 3-5 and Tables 2-3, and included supplementary explanations in the revised Supplementary Materials. These additions clarify the patterns observed and their implications.

Data Availability Statement: We have revised this section to accurately reflect the datasets used in the study. The updated statement now reads:

"Requests for the codes and scripts used in this study may be directed to the first author, Nasrin Salehnia (salehnia@iastate.edu), or the corresponding author, Peter Wolter (ptwolter@iastate.edu)."

Comment 4: Interpretation of Structural and Understory Limitations

Although the manuscript acknowledges the limitations of remote sensing data in capturing understory vegetation and ladder fuels, the implications of these limitations for result interpretation and practical application are not sufficiently discussed. The predictive outputs should be interpreted more cautiously, emphasizing their role in identifying relative risk patterns rather than providing absolute basal area estimates. The conclusions and management implications should be revised accordingly to reflect these structural limitations.
Response: Following this suggestion and related comments from the other reviewers, we have revised the Discussion to clarify the ecological and biophysical signals involved. Regarding the limits of remote sensing in capturing understory vegetation and ladder fuels specifically, we have added discussion in Section 4 (Lines 677-685).

Comment 5: Effects of Non-Negativity Clipping on Model Performance

The manuscript applies non-negativity clipping to negative predictions; however, the influence of this procedure on RMSE and coefficient of determination values is not explicitly evaluated. The authors are encouraged to report and compare model performance before and after clipping in order to ensure fair and transparent evaluation.

Response: To provide full transparency, we quantified how often clipping was triggered. The table below reports, for each response variable and method, the percentage of predictions that were negative and therefore set to zero. Notably, RF-xPLS produced no negative predictions (0% for all responses), so clipping had no effect for that method.

Response	xPLS (%)	GA-xPLS (%)	RF-xPLS (%)	SVR-xPLS (%)
ABBA	0	0	0	0
LALA	0.71	14.29	0	1.43
PIBA	15.71	10.53	0	1.43
PICEA	22.14	8.27	0	2.86
PIRE	9.29	16.54	0	7.86
PIST	20.14	12.03	0	2.14
THOC	2.86	12.78	0	4.29
TOTBA	0	0	0	0

We also added a clarifying statement in the manuscript addressing this point:
“Nonnegativity clipping (ŷ = max(0, ŷ)) was applied only to a small number of negative predictions and had negligible impact on RMSE and R².”
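
The before-and-after comparison the reviewer asks for can be computed in a few lines. The Python sketch below is illustrative only; the function names and the returned dictionary keys are hypothetical, and the numbers in the usage note are toy values, not results from the manuscript.

```python
import math

def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def clipping_report(obs, pred):
    """Compare RMSE before and after applying y_hat = max(0, y_hat)."""
    clipped = [max(0.0, p) for p in pred]
    n_negative = sum(1 for p in pred if p < 0)
    return {
        "pct_negative": 100.0 * n_negative / len(pred),
        "rmse_raw": rmse(obs, pred),
        "rmse_clipped": rmse(obs, clipped),
    }
```

For example, `clipping_report([0, 2, 4, 6], [-1, 2, 5, 6])` reports 25% negative predictions, a raw RMSE of about 0.707, and a clipped RMSE of 0.5. Because observed basal area cannot be negative, clipping can only leave the error unchanged or reduce it, so reporting both values shows how much of the apparent accuracy comes from the clipping step.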

Minor Comments

- Several sentences are overly long and complex, which may reduce readability. Simplification of sentence structures is recommended.

Response: Thank you for this valuable feedback. We have carefully reviewed the manuscript and simplified overly long and complex sentences throughout to improve readability. Specific attention was paid to restructuring lengthy sentences in the Introduction and Discussion sections. These revisions have made the text more concise and accessible while preserving scientific accuracy. All changes are marked in the revised manuscript.

- The interpretation and discussion of some figures and tables should be expanded for greater clarity.
- The Data Availability Statement should be revised to ensure consistency with the datasets actually used in the study.

Response: As noted in our response to Comment 3, we have expanded the discussion of key figures and tables throughout the manuscript to provide greater clarity. Specifically, we added two more tables and expanded the captions in the supplementary file. We have also revised the Data Availability Statement to accurately reflect the datasets used in the study. The updated statement now reads:

"Requests for the codes and scripts used in this study may be directed to the first author, Nasrin Salehnia (salehnia@iastate.edu), or the corresponding author, Peter Wolter (ptwolter@iastate.edu)."

- The consistency of abbreviations and variable notations should be carefully checked throughout the manuscript.

Response: Done! Thanks.

Author Response File: Author Response.pdf

Round 2

Reviewer 3 Report

Comments and Suggestions for Authors

The manuscript has been significantly improved after revision. The authors have satisfactorily addressed my comments, and I have no further suggestions.


Comment 12 (Table 3): NDVI formula differs from standard; verify and provide references for all equations.

Comment 13 (Table 4): “Whole name” is not standard; please clarify or revise.

