Multi-Dimensional Estimation of Leaf Loss Rate from Larch Caterpillar Under Insect Pest Stress Using UAV-Based Multi-Source Remote Sensing

Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This paper presents a study on 3D modeling and vertical distribution analysis of leaf loss rate (LLR) under pest stress using multi-source UAV remote sensing data combined with machine learning techniques. The study’s strengths lie in its effective fusion of features from multiple sensors, rigorous evaluation, and strong model performance, all clearly communicated through detailed analysis. While the manuscript demonstrates high quality and strong potential, several important revisions are necessary:
Major comments
The Discussion section is unclear. To improve clarity, I suggest:
- Move most of the background and literature review content to the Introduction.
- Refocus the Discussion to:
- Interpret the study’s findings in depth,
- Compare them with existing literature,
- Highlight the unique contributions of this approach,
- Clearly discuss limitations and future directions.
Minor comments
- Abstract: Terms such as W11P3, W5P2, and MPI are introduced without explanation.
- S-G Smoothing Algorithm: Although the general concept and mechanism are explained, the manuscript would benefit from a clearer description of the specific steps—especially how polynomial fitting is applied to multi-source features.
- Formal references should be included for all software tools used, such as ENVI and ArcGIS.
- Lines 185–186: The symbol M used alongside R² and RMSE is unclear and nonstandard; please clarify its meaning and significance.
- Lines 199–208: The listed objectives overly emphasize methodological details (e.g., S-G smoothing, RFE). Objectives should instead articulate the broader aims or intended outcomes of the research. Consider rephrasing this section to focus on the scientific questions or hypotheses.
- Line 377: While MPI is a key metric, the cited source [90] is difficult to access. Please consider providing a publicly accessible link or other references.
- Line 384: The description of the hierarchical analysis method (AHP) needs more detail and should include appropriate references.
- Line 390: an extraneous "S" at the beginning of the sentence.
- Lines 390–405: The explanation of how optimal W and P parameter combinations are chosen is currently limited to MSE values. A more in-depth rationale is recommended—such as how these combinations affect the physical meaning or behavior of the smoothed features.
Author Response
Response to Reviewer 1 Comments
Major comment
The Discussion section is unclear. To improve clarity, I suggest:
- Move most of the background and literature review content to the Introduction.
- Refocus the Discussion to: Interpret the study’s findings in depth, Compare them with existing literature, Highlight the unique contributions of this approach, Clearly discuss limitations and future directions.
The discussion section was modified to read:
4. Discussion
4.1. Performance of multi-source remote sensing features for LLR estimation
To estimate the LLR of forest trees under larch caterpillar infestation, this study extracted multi-source features and combined them with machine learning algorithms to construct an accurate estimation model in both the horizontal and vertical directions, which provides a powerful research method for pest control. S-G smoothing was introduced to reduce noise in the data and so provide a more accurate data source for the extraction of sensitive features. S-G smoothing, introduced by Savitzky and Golay, removes noise while preserving the shape and width of the signal [35]. Existing studies have focused on S-G smoothing as a preprocessing step for spectral data, including UAV spectral noise removal [47-49] and satellite remote sensing time series [50], in order to enhance the accuracy of vegetation index calculation. The innovation of this study is to apply S-G smoothing to optical features and LiDAR features with different combinations of window sizes (5, 7, 9, 11) and polynomial orders (2, 3) and to explore the optimal combination. Compared with traditional methods, this approach can better fit the characteristics of UAV remote sensing data, effectively reduce the influence of environmental noise and improve the stability of features. It also fills a gap in the application of S-G smoothing to the optimization of multi-source remote sensing features from UAVs, thus enhancing the reliability of the model estimation performance and providing higher-quality data support for applications such as forest health monitoring. In a study applying S-G smoothing to UAV hyperspectral data, an overall accuracy of 93.47% was obtained for the classification of rice leaf blight [51].
In contrast, this study extracted sensitive features to construct the LLR estimation model after preprocessing the remote sensing features with S-G smoothing under different parameter combinations, and obtained an accuracy of 93.83%, which indicates that parameter tuning can enhance data quality and thereby improve the accuracy of the estimation model. The extraction of sensitive features can effectively improve the prediction accuracy of the model, optimize computational efficiency and enhance the generalization ability of the model [52]. The Recursive Feature Elimination (RFE) algorithm chosen in this paper is a commonly used feature selection method that removes unimportant features recursively and evaluates the performance of different feature subsets to ultimately determine the optimal feature combination [53]. The histogram of the feature importance scores of the 13 HSIs and 16 MSIs extracted by this algorithm is shown in Figure 9. The difference in importance scores between the HSI and MSI features in the model can be seen in Figure 9-a. For the HSIs, features such as the NDVI, ARI and MCARI exhibit high importance scores, especially the NDVI, with a score of more than 0.7. The NDVI is a common vegetation index in remote sensing; it is calculated from the reflectance of the near-infrared (800 nm) and red (670 nm) bands and displays strong sensitivity to tree health status, biomass, etc. [54, 55]. The ARI and MCARI are highly sensitive to changes in leaf carotenoids and chlorophyll [56] and can be used to distinguish vegetation with different health levels; these features play a key role in model performance. In addition, other common HSIs, such as the NPCI, PRI, and LCI, also have high scores, reflecting the advantages of hyperspectral data, as they effectively capture information on vegetation status and health conditions [57].
However, some features, such as the NVI, PBI, and NGRDI, had low importance scores, indicating that they contributed comparatively less to the estimation of LLR. The MSIs behaved slightly differently, with high importance scores for GOSAVI and GNDVI, especially GOSAVI, whose value was close to 0.6, indicating its importance in multispectral data processing. Other common vegetation indices, such as the LCI, NDGI, and CIGREEN, also enhance modeling performance when multispectral features are used. Overall, however, the importance scores for MSIs were generally lower than those for HSIs, which may be related to the number of bands considered. The hundreds of bands of the hyperspectral data provide more fine-scale information, and the corresponding indices dominate the feature importance scores, whereas the multispectral features, although also of some importance, display lower importance scores overall. This suggests that hyperspectral data have an advantage over multispectral data in tasks that require fine-scale differentiation between vegetation conditions or health states. As shown in Figure 9-b, PER90 in layer I had the highest importance score, exceeding 0.8, indicating that the height variable at the 90th percentile position of the stand was consistent with the layer I characteristics of the stand [58]. PER80, on the other hand, had a lower importance score, but an MPI of 0.8956 was obtained for the LLR estimation model constructed for layer I, suggesting that the use of two feature types enhanced the estimation ability of the model through synergistic effects. The differences in the importance scores of most features in layer II are small, with PER1 having the highest contribution. PER5 and PER10 had the smallest importance scores, which may have affected the accuracy of the model.
The importance scores of features other than PER10 in layer III were small compared with those in layers I and II, whereas PER10 had the second-highest importance score among all features in the three layers and contributed strongly to the model estimates. In summary, a drawback of this paper is that features with low importance scores were not eliminated, which may introduce redundant information or noise into the model and thus affect its training efficiency and estimation accuracy. Although the model has some predictive ability at different vertical levels, the unoptimized feature set may make the model complex and increase the risk of overfitting. In future research, features of low importance should be eliminated to improve the generalization ability of the model, reduce the number of computations, and further optimize model performance.
Figure 9. Histograms of importance scores for LLR-sensitive features (a) HSI and MSI; (b) LI sensitive to different levels of the vertical
4.2. Application of multidimensional estimation of LLR to forest management
Compared with the traditional time-consuming and labor-intensive methods of obtaining forest LLR, the remote sensing approach with UAV-based hyperspectral and LiDAR data collection shortens the data acquisition cycle, improves data accuracy, and provides a technical basis for the large-scale nondestructive inversion of forest biochemical parameters. In this study, the model with optimal accuracy in the horizontal direction was combined with the LI sensitive to three levels to realize LLR estimation in the vertical direction. The results were then fused with LiDAR point cloud data to achieve 3D visualization. Numerous articles have been published on multi-source data fusion for the inversion of forest parameters, among which the DSM-based fusion method proposed by Xin Shen and other researchers from Nanjing Forestry University in 2020 is the most advanced. The team accomplished the fusion of hyperspectral and LiDAR data in three steps: generating a gridded LiDAR point cloud, extracting the highest-quality LiDAR point cloud data, and matching the hyperspectral pixels to those points. A regression model was then implemented to predict forest biochemical traits at three vertical canopy levels, and an R² of 0.85-0.91 was obtained [22]. Recently, the method has also been applied to multispectral and LiDAR data fusion and to hyperspectral, thermal infrared and LiDAR data fusion to investigate the 3D photosynthetic shape of winter wheat and the vertical profile of plant shape under pine wood nematode stress [18, 23]. Inspired by this work, we demonstrated the vertical spatial distribution of the LLR of sample trees under stress due to larch caterpillar infestation by fusing the pixel-level estimated LLR with LiDAR point cloud data via Python. The estimated MPI reaches 0.93 in the horizontal direction and between 0.83 and 0.89 in the vertical direction, an accuracy close to that of Xin Shen, Qinan Lin et al. [18, 22].
The LLR multidimensional estimation method proposed in this paper achieves high accuracy and innovatively realizes 3D visualization, but it still has several limitations at the operational level. For example, it requires high-precision ground survey data, and canopy-top data in particular are still difficult to obtain; the method also relies on high-resolution LiDAR data for 3D visualization, which limits its application to at most the stand scale. Applying the method at the stand scale in subsequent studies would allow the overall response under pest stress to be explored further. In forest management applications, the method has demonstrated important value for accurate pest management and disaster assessment. Its three-dimensional visualization capability at the single-tree scale can not only provide decision support for targeted application and selective harvesting but also, through comprehensive analysis of the vertical spatial distribution across different forest stands, reveal the impact of pests on the growth and health of the entire stand and accurately identify the areas of most significant infestation, thus providing data support for the precise application of control measures. This multi-dimensional assessment breaks through the limitations of traditional two-dimensional monitoring, enabling managers to formulate differentiated management strategies based on the actual damage in each layer of the canopy and significantly improving the pertinence and effectiveness of control measures.
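Section 4.2 above states that the pixel-level LLR estimates were fused with the LiDAR point cloud in Python, but the response does not show how. The following is a minimal sketch of one plausible way to do this, not the authors' implementation: each point is given the LLR of the raster pixel it falls in. The function name, the north-up raster layout, and the parameters `x_min`, `y_max` and `pixel_size` are all assumptions made for illustration.

```python
import numpy as np

def attach_llr_to_points(points_xyz, llr_raster, x_min, y_max, pixel_size):
    """Append to each LiDAR point the LLR of the raster pixel it falls in.

    Assumes (hypothetically) a north-up raster whose top-left corner is at
    (x_min, y_max) and whose pixels are square with side pixel_size.
    """
    pts = np.asarray(points_xyz, dtype=float)
    raster = np.asarray(llr_raster, dtype=float)
    # Column index grows with x; row index grows as y decreases (north-up).
    cols = np.floor((pts[:, 0] - x_min) / pixel_size).astype(int)
    rows = np.floor((y_max - pts[:, 1]) / pixel_size).astype(int)
    n_rows, n_cols = raster.shape
    inside = (rows >= 0) & (rows < n_rows) & (cols >= 0) & (cols < n_cols)
    # Points outside the raster footprint keep NaN instead of a guessed value.
    llr = np.full(len(pts), np.nan)
    llr[inside] = raster[rows[inside], cols[inside]]
    return np.column_stack([pts, llr])

# Hypothetical 2 x 2 LLR raster (percent) and three points (x, y, z).
raster = np.array([[10.0, 20.0],
                   [30.0, 40.0]])
points = np.array([[0.5, 1.5, 5.0],
                   [1.5, 0.5, 3.0],
                   [5.0, 5.0, 1.0]])
fused = attach_llr_to_points(points, raster, x_min=0.0, y_max=2.0, pixel_size=1.0)
```

The fourth column of `fused` can then drive the colour of each point in a 3D viewer. In the layered setting described above, a separate LLR raster per vertical layer would presumably be used, with points assigned to layers by height; that step is omitted here.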
Minor comment
- Abstract: Terms such as W11P3, W5P2, and MPI are introduced without explanation.
I've reworked the abstract to explain the various terms:
Leaf loss caused by pest infestations poses a serious threat to forest health. The leaf loss rate (LLR) refers to the percentage of the overall tree crown leaf loss per unit area and is an important indicator for evaluating forest health. Therefore, rapid and accurate acquisition of the LLR via remote sensing monitoring is crucial. This study is based on drone hyperspectral and LiDAR data as well as ground survey data, from which hyperspectral indices (HSI), multispectral indices (MSI), and LiDAR indices (LI) were calculated. Savitzky-Golay (S-G) smoothing with different window sizes (W) and polynomial orders (P) was combined with recursive feature elimination (RFE) to select sensitive features. Random Forest Regression (RFR) and Convolutional Neural Network Regression (CNNR) were used to construct a multidimensional (horizontal and vertical) estimation model for LLR, which, combined with LiDAR point cloud data, achieved a three-dimensional visualization of the leaf loss rate of trees. The results of the study showed: (1) The optimal S-G parameter combination was W11P3 (window size 11, polynomial order 3) for the HSI and MSI, and W5P2 (window size 5, polynomial order 2) for the LI. (2) The optimal numbers of sensitive features extracted by the RFE algorithm were 13 HSI, 16 MSI and hierarchical LI (2 in layer I, 9 in layer II, and 11 in layer III). (3) In terms of the horizontal estimation of the LLR, the model performance index (MPI) of the CNNRHSI model (MPI=0.9383) was significantly better than that of RFRMSI (MPI=0.8817), indicating that the continuous bands of hyperspectral data could better monitor subtle changes in LLR. (4) The I-CNNRHSI+LI, II-CNNRHSI+LI and III-CNNRHSI+LI vertical estimation models were constructed by combining the CNNRHSI model with the best accuracy and the LI sensitive to different vertical levels, respectively, and the MPI reached more than 0.8, indicating that the LLR estimation at different vertical levels had high accuracy.
According to the model, the pixel-level LLR of the sample tree was estimated, and the three-dimensional display of the LLR of the forest trees under the pest stress of larch caterpillars was realized, which provided a high-precision research scheme for the estimation of the LLR under pest stress.
- S-G Smoothing Algorithm: Although the general concept and mechanism are explained, the manuscript would benefit from a clearer description of the specific steps—especially how polynomial fitting is applied to multi-source features.
I have described the implementation of the S-G method in more detail in the article, as follows:
The method is implemented as follows. First, local polynomial fitting is carried out with a sliding window for each feature column, with window sizes of 5, 7, 9, and 11; within each window, 2nd- and 3rd-order polynomials are fitted by least squares. Second, 5-fold cross-validation is used to evaluate the smoothing performance of the different parameter combinations: each combination is trained and validated, with 80% of the data in each fold used for training and 20% for validation, and the mean squared error (MSE) is calculated for evaluation. Finally, the smoothed combination with the smallest MSE is selected for model construction.
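The steps above can be sketched in code. This is an illustration under stated assumptions rather than the authors' implementation: the response does not say which model is trained inside the cross-validation, so an ordinary least-squares regression stands in for it, and edge samples are padded by repetition. All function names are hypothetical.

```python
import itertools
import numpy as np

def sg_coefficients(window, order):
    """Savitzky-Golay weights for the centre point of an odd-length window."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Design matrix of the local polynomial: columns are offsets**0..order.
    design = np.vander(offsets, order + 1, increasing=True)
    # Row 0 of the pseudo-inverse gives the least-squares fitted value at offset 0.
    return np.linalg.pinv(design)[0]

def sg_smooth(column, window, order):
    """Smooth one feature column; edges are padded by repetition (assumption)."""
    half = window // 2
    padded = np.pad(np.asarray(column, dtype=float), half, mode="edge")
    weights = sg_coefficients(window, order)
    # np.convolve flips its second argument, so reverse the weights first.
    return np.convolve(padded, weights[::-1], mode="valid")

def cv_mse(features, target, n_splits=5, seed=0):
    """Mean validation MSE over k folds; with 5 folds each split is 80%/20%."""
    order = np.random.default_rng(seed).permutation(len(target))
    folds = np.array_split(order, n_splits)
    errors = []
    for k in range(n_splits):
        val = folds[k]
        train = np.concatenate([folds[j] for j in range(n_splits) if j != k])
        x_train = np.column_stack([np.ones(len(train)), features[train]])
        x_val = np.column_stack([np.ones(len(val)), features[val]])
        # Stand-in model (assumption): ordinary least squares with an intercept.
        beta, *_ = np.linalg.lstsq(x_train, target[train], rcond=None)
        errors.append(np.mean((x_val @ beta - target[val]) ** 2))
    return float(np.mean(errors))

def select_best(features, target, windows=(5, 7, 9, 11), orders=(2, 3)):
    """Score every (W, P) combination and return the one with the lowest MSE."""
    scores = {}
    for w, p in itertools.product(windows, orders):
        smoothed = np.column_stack(
            [sg_smooth(features[:, j], w, p) for j in range(features.shape[1])]
        )
        scores[(w, p)] = cv_mse(smoothed, target)
    return min(scores, key=scores.get), scores
```

In practice `scipy.signal.savgol_filter` would normally replace the two smoothing helpers. One property worth knowing when reading the results: for a symmetric window the centre-point weights for polynomial orders 2 and 3 coincide, so in this sketch each (W, 2) and (W, 3) pair yields essentially the same score; implementations that handle the window edges by polynomial extrapolation can still differ between the two orders.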
- Formal references should be included for all software tools used, such as ENVI and ArcGIS
All software is marked with the name of the country and company in which it was developed.
- Lines 185–186: The symbol M used alongside R² and RMSE is unclear and nonstandard; please clarify its meaning and significance.
In the original text, the symbol M refers to the average of the R² and RMSE of the validation and training sets.
- Lines 199–208: The listed objectives overly emphasize methodological details (e.g., S-G smoothing, RFE). Objectives should instead articulate the broader aims or intended outcomes of the research. Consider rephrasing this section to focus on the scientific questions or hypotheses.
I've revised the writing of this goal to read:
To make up for this shortcoming, this study fuses optical and LiDAR indices, combines S-G smoothing and recursive feature elimination to extract sensitive features, and uses machine learning algorithms to construct a high-precision LLR estimation model. In addition, the three-dimensional visualization of LLR at the single-tree scale under insect stress is realized, which provides an innovative solution for forest health assessment. The following issues are to be addressed: (1) To explore the optimization ability of S-G smoothing with different window and polynomial combinations for multi-source remote sensing features. (2) To extract, with Recursive Feature Elimination (RFE), the horizontal optical features sensitive to LLR and the LiDAR features sensitive to different vertical layers. (3) To construct the LLR horizontal estimation model with the best accuracy and to realize pixel-level estimation for a single tree by combining it with the LiDAR features at different vertical levels. (4) To fuse the estimation results with the LiDAR point cloud to realize the three-dimensional visualization of LLR and to analyze in detail the feeding situation of larch caterpillars at the single-tree scale, which provides more accurate location information for later control work.
- Line 377: While MPI is a key metric, the cited source [90] is difficult to access. Please consider providing a publicly accessible link or other references.
I have added the DOI of the document to the references. DOI: 10.27204/d.cnki.glzhu.2019.000053.
- Line 384: The description of the hierarchical analysis method (AHP) needs more detail and should include appropriate references.
The Hierarchical Analysis Method (AHP) assigns weights based on expert opinion; the appropriate reference is DOI: 10.27204/d.cnki.glzhu.2019.000053.
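For readers unfamiliar with AHP, the sketch below shows the standard calculation in general form: criterion weights are taken from the principal eigenvector of a pairwise comparison matrix, and a consistency index checks how coherent the judgements are. It is not the weighting used for the MPI, which the response does not spell out. The 3 x 3 matrix is hypothetical, as are the function name and the unnamed criteria.

```python
import numpy as np

def ahp_weights(pairwise):
    """Return (weights, consistency_index) for a reciprocal comparison matrix."""
    matrix = np.asarray(pairwise, dtype=float)
    n = matrix.shape[0]
    eigenvalues, eigenvectors = np.linalg.eig(matrix)
    principal = np.argmax(eigenvalues.real)
    # The principal eigenvector, normalised to sum to 1, gives the weights.
    weights = np.abs(eigenvectors[:, principal].real)
    weights = weights / weights.sum()
    # Consistency index: 0 when the judgements are perfectly consistent.
    lambda_max = eigenvalues[principal].real
    consistency_index = (lambda_max - n) / (n - 1)
    return weights, consistency_index

# Hypothetical expert judgements for three unnamed criteria:
# criterion 1 is judged twice as important as criterion 2 and four times
# as important as criterion 3; entry (j, i) is the reciprocal of entry (i, j).
judgements = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights, ci = ahp_weights(judgements)
```

Because these example judgements are perfectly consistent, the weights come out in the ratio 4:2:1 and the consistency index is zero up to rounding. Real expert matrices are rarely this clean, which is why the consistency index is usually reported alongside the weights.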
- Line 390: an extraneous "S" at the beginning of the sentence.
This was my negligence and I have deleted it.
- Lines 390–405: The explanation of how optimal W and P parameter combinations are chosen is currently limited to MSE values. A more in-depth rationale is recommended—such as how these combinations affect the physical meaning or behavior of the smoothed features.
Thank you for your insightful question regarding the selection of the optimal W and P parameter combination. To verify the validity of smoothing with the optimal parameter combinations (HSI/MSI: W11P3; LI: W5P2), the significance of the between-group differences of the smoothed features was assessed by one-way analysis of variance (ANOVA, α=0.05), and the results are shown in the figure below. As the figure shows, the smoothed features of all three types showed significant enhancement. In Figure (a), 26 of the 32 HSIs were significantly enhanced (p<0.05), accounting for 81.25%; the enhancement was most pronounced for GI, LCI, MCARI, NDVI, NGRDI, NPCI, CARI, RCI1, REP, MTCI, MTCI2, NDRSR, and MRENDVI. In Figure (b), 23 of the 32 MSIs were significantly enhanced (p<0.05), accounting for 71.87%; the enhancement was most pronounced for ARI, CI, LCI, NDGI, GNDVI and RVI. In Figure (c), 17 of the 18 LIs were significantly enhanced (p<0.05), accounting for 94.44%; the enhancement was most pronounced for CC and LAI. Although the percentage of significantly enhanced features is higher for the LIs than for the other two feature types, their overall significance enhancement is weaker. This result shows that the best combinations with the lowest MSE all improve the significance of the features. This process ensures not only mathematical goodness of fit but also that the features have the statistical validity to distinguish between different health states.
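As an illustration of the one-way ANOVA referred to above, the sketch below computes the F statistic for a single feature across groups. The response does not state how the groups were defined, so the three damage classes and their values here are hypothetical, and the function name is an assumption.

```python
import numpy as np

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of 1-D samples."""
    samples = [np.asarray(g, dtype=float) for g in groups]
    pooled = np.concatenate(samples)
    grand_mean = pooled.mean()
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in samples)
    # Variation of the observations around their own group mean.
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in samples)
    df_between = len(samples) - 1
    df_within = len(pooled) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical values of one smoothed feature in three damage classes.
light = [1.0, 2.0, 3.0]
moderate = [2.0, 3.0, 4.0]
severe = [5.0, 6.0, 7.0]
f_stat, df_between, df_within = one_way_anova_f([light, moderate, severe])
```

The p-value compared against α=0.05 is the upper-tail probability of the F distribution with `(df_between, df_within)` degrees of freedom; `scipy.stats.f_oneway` returns both the statistic and that p-value directly. Running the same test on a feature before and after smoothing is one way to read the "enhancement" described above, though that reading is an assumption.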
Author Response File: Author Response.docx
Reviewer 2 Report
Comments and Suggestions for Authors
Dear Authors,
The title seems too long to me, and the abstract is also overly extensive — I suggest making it more concise. In the introduction, the authors present some redundancies regarding the advantages of UAVs; I believe this section could be made more concise. It would be valuable to establish a clearer connection between the existing studies and the current research being developed.
Regarding the results, I consider it appropriate to improve the terminology of several sentences, as many are too long and may cause the reader to lose focus. As for the discussion, it might be interesting to explore the potential application of this study in forest management, highlighting its practical features and benefits.
Table 2 appears to be overly extensive, which may cause the reader to lose focus due to the amount of dispersed information. Would it be possible to improve its presentation or consider relocating part of the content to an appendix or supplementary material?
Regarding the figures, I would generally recommend improving the captions to ensure they are clearer and more informative. Specifically, for figure 4, I suggest reconsidering its design to make it more visually appealing and easier to interpret. In Figure 7, I suggest adding labels or explanations for zones I, II, and III, as well as defining any acronyms used. This would enhance clarity and ensure the figure is fully understandable without the need to refer back to the main text.
Finally, I would suggest adding a proper conclusion, summarizing the main contributions of the study and offering a perspective on future research directions.
Congrats
Comments on the Quality of English Language
N.A.
Author Response
Response to Reviewer 2 Comments
- The title seems too long to me, and the abstract is also overly extensive — I suggest making it more concise. In the introduction, the authors present some redundancies regarding the advantages of UAVs; I believe this section could be made more concise. It would be valuable to establish a clearer connection between the existing studies and the current research being developed.
The title has been revised to read: "Analysis of multi-dimensional estimation of Leaf Loss rate of larch caterpillar under insect pest stress based on UAV multi-source remote sensing". Redundant information about the advantages of UAVs has been removed.
- Regarding the results, I consider it appropriate to improve the terminology of several sentences, as many are too long and may cause the reader to lose focus. As for the discussion, it might be interesting to explore the potential application of this study in forest management, highlighting its practical features and benefits.
Regarding some excessively long sentences in the results, I've revised them. The discussion section was modified to read:
- Discussion
4.1. Performance of multi-source remote sensing features for LLR estimation
In this study, in order to estimate the LLR of forest trees under larch caterpillar infestation, multi-source features combined with machine learning algorithms were extracted to construct an accurate estimation model in both horizontal and vertical directions, which provides a powerful research method for pest control. In this study, S-G smoothing is introduced to reduce the noise of the data to provide a more accurate data source for the extraction of sensitive features. Savitzky and Golay invented SG smoothing, which removes noise while preserving the signal form and broadband [35]. Existing studies have focused on S-G smoothing preprocessing of spectral data, including UAV spectral noise removal [47-49], and satellite remote sensing time series [50], in order to enhance the accuracy of vegetation index calculation. However, the innovation of this study is the S-G smoothing of optical features and LiDAR features with different combinations of windows (5, 7, 9, 11) and polynomial orders (2, 3), and the optimal combination pattern is explored. Compared with traditional methods, this approach can better fit the characteristics of UAV remote sensing data, effectively reduce the influence of environmental noise and improve the stability of features. It also compensates for the application of S-G smoothing in the optimization of multi-source remote sensing features from UAVs, thus enhancing the reliability of the model estimation performance and providing higher quality data support for applications such as forest health monitoring. In a study applying S-G smoothing to UAV hyperspectral data, an overall accuracy of 93.47% was obtained for the classification of rice leaf blight [51]. 
In contrast, this study extracted sensitive features to construct the LLR estimation model after preprocessing remote sensing features based on S-G smoothing with different parameter combinations, and obtained an accuracy of 93.83%, which indicates that parameter tuning can enhance the data quality in order to improve the estimation model accuracy. The extraction of sensitive features can effectively improve the prediction accuracy of the model, and also optimize the computational efficiency and enhance the generalization ability of the model [52]. The Recursive Feature Elimination (RFE) algorithm chosen in this paper is a commonly used feature selection method, which removes unimportant features recursively and evaluates the performance of different feature subsets to ultimately determine the optimal feature combination [53]. The histogram of the feature importance scores of the 13 HSIs and 16 MSIs extracted according to this algorithm is shown in Figure 9. The difference in importance scores of HIS and MSI features in the model can be seen in Figure 9-a. For the HSIs, features such as the NDVI, ARI and MCARI exhibit high importance scores, especially the NDVI, with a score of more than 0.7. This index is a common vegetation index used in remote sensing; it is calculated from the reflectance of the near infrared (800 nm) and red light (670 nm) bands and displays strong sensitivity to the health status of trees, biomass, etc. [54, 55]. The ARI and MCARI are highly sensitive to changes in leaf carotenoids and chlorophyll [56], which can be used to distinguish vegetation with different health levels; these features play a key role in model performance. In addition, other common HSIs, such as the NPCI, PRI, and LCI, also have high scores, reflecting their advantages when using hyperspectral data, as they effectively reflect information on vegetation status and health conditions [57]. 
However, some features, such as the NVI, PBI, and NGRDI, had low importance scores, indicating that these features contributed comparatively less to the estimation of LLR. In contrast, the performance of the MSIs was slightly different, with high importance scores for GOSAVI and GNDVI, especially for GOSAVI, with a value close to 0.6, indicating its importance in multispectral data processing. Other common vegetation indices, such as the LCI, NDGI, and CIGREEN, also enhance modeling performance in cases with multispectral features. Overall, however, the importance scores for MSIs were generally lower than those for HSIs, which may be related to the number of bands considered. The hundreds of bands of the hyperspectral data provide more fine scale in-formation, and the corresponding indices dominate the feature importance scores, whereas the features of multispectral data, although also of some importance, display lower importance scores overall. This suggests that hyperspectral data have a greater advantage over multispectral data in tasks that require fine scale differentiation between different vegetation conditions or health states. As shown in Figure 9-b, PER90 in stratum I had the highest importance score, exceeding 0.8, indicating that the height variable at the 90th percentile position of the stand was consistent with the layer I characteristics of the stand [58]. PER80, on the other hand, had a lower importance score, but an MPI accuracy of 0.8956 was obtained for the LLR estimation model constructed for layer I, suggesting that the use of two feature types enhanced the estimation ability of the model through synergistic effects. The differences in the importance scores of most features in layer II are small, with PER1 having the highest contribution. PER5 and PER10 had the smallest importance scores, which may have affected the accuracy of the model. 
The importance scores of features other than PER10 in layer III were small compared with those in layers I and II, whereas PER10 had the second-highest importance score among all features in the three layers and highly contributed to the estimates of the model. In summary, the drawback of this paper is that features with low importance scores are not eliminated, which may lead to the introduction of redundant information or noise into the model, thus affecting the training efficiency and estimation accuracy of the model. Although the model has some predictive ability at different vertical levels, the unoptimized feature set may make the model complex and increase the risk of overfitting. In future research, features of low importance should be eliminated to improve the generalization ability of the model, reduce the number of computations, and further optimize model
performance.
Figure 9. Histograms of importance scores for LLR-sensitive features (a) HSI and MSI; (b) LI sensitive to different levels of the vertical
4.2. Application of multidimensional estimation of LLR to forest management
Compared with the traditional time-consuming and labor-intensive methods of obtaining forest LLR, the remote sensing approach with UAV-based hyperspectral and LiDAR data collection shortens the cycle of data acquisition, improves the data accuracy, and provides a technical basis for the large-scale nondestructive inversion of forest biochemical parameters. In this study, a model with optimal accuracy in the horizontal direction combined with LI sensitive to three levels was used to realize LLR estimation in the vertical direction. Then, the results were fused with LiDAR point cloud data to achieve 3D visualization. Numerous articles have been published on multi-source data fusion for the inversion of forest parameters, among which the DSM-based fusion method proposed by Xin Shen and other researchers from Nanjing Forestry University in 2020 is the most advanced. The team accomplished the fusion of hyperspectral and LiDAR data via three steps: generating a gridded LiDAR point cloud, extracting the highest-quality LiDAR point cloud data, and matching the hyperspectral pixels to those points. A regression model was then implemented to predict forest biochemical traits at three vertical canopy levels, and an R² of 0.85-0.91 was obtained [22]. Recently, the method has also been applied to multispectral and LiDAR data fusion and to hyperspectral, thermal infrared and LiDAR data fusion to investigate the 3D photosynthetic shape of winter wheat and the vertical profile of plant shape under pine wood nematode stress [18, 23]. Inspired by this work, we demonstrated the vertical spatial distribution of the LLR of sample trees under stress due to larch caterpillar infestation by fusing the pixel-level estimated LLR with LiDAR point cloud data via Python. The estimated MPI reaches 0.93 in the horizontal direction and between 0.83 and 0.89 in the vertical direction, an accuracy close to that reported by Xin Shen, Qinan Lin et al. [18, 22].
The LLR multidimensional estimation method proposed in this paper achieves high accuracy and realizes 3D visualization, but it still has several limitations at the operational level. For example, it requires high-precision ground survey data, and canopy-top data in particular remain difficult to obtain; the method also relies on high-resolution LiDAR data for 3D visualization, so it can be applied at most at the stand scale. Applying the method at the stand scale in subsequent studies would allow the overall response under pest stress to be explored further. In forest management applications, the method has demonstrated important value for accurate pest management and disaster assessment. Its three-dimensional visualization capability at the single-tree scale can provide decision support for targeted application and selective harvesting, and comprehensive analysis of the vertical spatial distribution across different forest stands can reveal the impact of pests on the growth and health of the entire stand and accurately identify the areas of most severe infestation, thus providing data support for the precise application of control measures. This multi-dimensional assessment breaks through the limitations of traditional two-dimensional monitoring, enabling managers to formulate differentiated management strategies based on the actual damage of each layer of the canopy, and significantly improving the pertinence and effectiveness of control measures.
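The response states that the pixel-level LLR was fused with the LiDAR point cloud via Python but gives no implementation details. The following is only a minimal sketch of one plausible approach, a nearest-cell lookup that tags each point with the LLR of the raster cell containing it. The function name `attach_llr_to_points`, the north-up grid layout, and the 0.7 m cell size are assumptions made for illustration; this is not the authors' code and not the DSM-based method of [22].

```python
import math

def attach_llr_to_points(points, llr_grid, x_min, y_max, cell_size):
    """Give each (x, y, z) point the LLR of the raster cell that contains it.

    Assumed layout: llr_grid[row][col] is a north-up raster, row 0 lies on the
    northern edge (y_max) and column 0 on the western edge (x_min).
    Points that fall outside the raster are skipped.
    """
    n_rows, n_cols = len(llr_grid), len(llr_grid[0])
    fused = []
    for x, y, z in points:
        col = math.floor((x - x_min) / cell_size)
        row = math.floor((y_max - y) / cell_size)
        if 0 <= row < n_rows and 0 <= col < n_cols:
            fused.append((x, y, z, llr_grid[row][col]))
    return fused

# Hypothetical 2 x 2 LLR raster (percent) and three LiDAR points.
llr_grid = [[10.0, 20.0],
            [30.0, 40.0]]
points = [(0.2, 1.3, 5.0), (1.1, 0.4, 8.0), (3.0, 3.0, 2.0)]
fused = attach_llr_to_points(points, llr_grid, x_min=0.0, y_max=1.4, cell_size=0.7)
print(fused)
```

The third point lies outside the raster and is dropped; the remaining points carry an LLR value that a point cloud viewer could map to color.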
- Table 2 appears to be overly extensive, which may cause the reader to lose focus due to the amount of dispersed information. Would it be possible to improve its presentation or consider relocating part of the content to an appendix or supplementary material?
Table 2 lists the 64 HSIs and MSIs calculated for this study, together with their formulas and references. I have moved it to Table A1 in the Appendix.
- Regarding the figures, I would generally recommend improving the captions to ensure they are clearer and more informative. Specifically, for figure 4, I suggest reconsidering its design to make it more visually appealing and easier to interpret. In Figure 7, I suggest adding labels or explanations for zones I, II, and III, as well as defining any acronyms used. This would enhance clarity and ensure the figure is fully understandable without the need to refer back to the main text.
I have redrawn Figure 4 as shown in the figure below. Zones I, II and III of Figure 7 are now explained in its caption, and several other figure captions have been improved.
- Finally, I would suggest adding a proper conclusion, summarizing the main contributions of the study and offering a perspective on future research directions.
The conclusion should be changed to read:
In this study, the optical and lidar features sensitive to LLR were extracted by combining UAV multi-source remote sensing data with S-G smoothing and RFE algorithms. Random Forest Regression (RFR) and Convolutional Neural Network Regression (CNNR) models were constructed to estimate the horizontal and vertical LLR under larch caterpillar pest stress. The study concluded as follows:
(1) The S-G smoothing of different parameter combinations shows completely opposite conclusions on the optical features (W11P3) and the lidar features (W5P2), which is due to the characteristics of the data.
(2) Combined with the RFE algorithm, 13 HSIs and 16 MSIs were extracted horizontally, and the analysis found that the HSI had higher importance scores than MSI, especially NDVI and ARI. 6 LI were extracted from layer I, 9 from layer II, and 11 from layer III at different vertical levels, and the same analysis found that PER90, PER1, and PER10 had the highest importance scores respectively.
(3) The MPI ranking of the horizontal models constructed from sensitive optical features is CNNRHSI > RFRHSI > RFRMSI > CNNRMSI, with CNNRHSI achieving the best accuracy (MPI = 0.9383).
(4) Combining CNNRHSI with the sensitive LI yields LLR estimation models for the different vertical levels. The MPI of the CNNRHSI+LI models exceeded 0.8 (layer I: MPI = 0.8956, layer II: MPI = 0.8424, layer III: MPI = 0.8346), indicating reliable accuracy. Finally, the single-tree-scale estimation results were fused with the LiDAR point cloud to realize three-dimensional visualization of LLR. The results showed that mildly damaged sample trees were damaged gradually starting from layers II and III, severely damaged sample trees were severely damaged in layer I, and moderately damaged sample trees showed no obvious pattern.
The method proposed in this study not only addresses the shortcomings of the traditional two-dimensional model, but also provides intuitive results for the tree leaf loss rate after insect feeding and improves the accuracy of LLR estimation.
Author Response File: Author Response.docx
Reviewer 3 Report
Comments and Suggestions for AuthorsThe manuscript “Three -Dimensional modeling and vertical distribution analysis of the Leaf Loss Rate due to pest stress from larch caterpillars based on multisource remote sensing via unmanned aerial vehicles” contains a very interesting comparative analysis of drone-taken optical (hyperspectral and multispectral) and LiDAR data fed and optimized into machine learning Random Forest (RFR) and deep learning Convolutional Neural Network Regression (CNNR) algorithms. It is a valuable, well-organized, and in a general sense, a scientifically sound report. It presents a novel approach to combine current drone technology and sensors to obtain data and convert them into information useful for assessing pest damage or severity at the local level. Its findings represent an advancement towards operationalization of pests monitoring at local and regional levels. However, the manuscript is not absent of improvement opportunities, particularly in the methodology description and results sections. To avoid repeating here my comments, please refer to the PDF document where I directly wrote my comments.
Comments for author File: Comments.pdf
Author Response
3 Response to Reviewer 3 Comments
- (W11P3) What is this, a predictive variable, a combination of predictive variables? Rephrase the sentence to clarify the idea -do the same to the rest of the paragraph
I have revised the entire abstract as follows:
Leaf loss caused by pest infestations poses a serious threat to forest health. The leaf loss rate (LLR) refers to the percentage of the overall tree crown leaf loss per unit area and is an important indicator for evaluating forest health. Therefore, rapid and accurate acquisition of the LLR via remote sensing monitoring is crucial. This study is based on drone hyperspectral and LiDAR data as well as ground survey data, from which hyperspectral indices (HSI), multispectral indices (MSI), and LiDAR indices (LI) were calculated. Savitzky-Golay (S-G) smoothing with different window sizes (W) and polynomial orders (P) was combined with recursive feature elimination (RFE) to select sensitive features. Random Forest Regression (RFR) and Convolutional Neural Network Regression (CNNR) were used to construct a multidimensional (horizontal and vertical) estimation model for LLR, which, combined with LiDAR point cloud data, achieved a three-dimensional visualization of the leaf loss rate of trees. The results of the study showed: (1) The optimal S-G combination was W11P3 for HSI and MSI and W5P2 for LI. (2) The optimal numbers of sensitive features extracted by the RFE algorithm were 13 HSI, 16 MSI and hierarchical LI (2 in layer I, 9 in layer II, and 11 in layer III). (3) In terms of the horizontal estimation of the defoliation rate, the model performance index of the CNNRHSI model (MPI = 0.9383) was significantly better than that of RFRMSI (MPI = 0.8817), indicating that the continuous bands of hyperspectral data could better monitor subtle changes in LLR. (4) The I-CNNRHSI+LI, II-CNNRHSI+LI and III-CNNRHSI+LI vertical estimation models were constructed by combining the CNNRHSI model, which had the best accuracy, with the LI sensitive to each vertical level; their MPI exceeded 0.8, indicating that the LLR estimation at different vertical levels had high accuracy.
According to the model, the pixel-level LLR of the sample tree was estimated, and the three-dimensional display of the LLR of the forest trees under the pest stress of larch caterpillars was realized, which provided a high-precision research scheme for the estimation of the LLR under pest stress.
- 200,000?? please be consistent with the number format/ How the dead forest area is larger than the area where the caterpillars were present? It does not make sense, correct it.
Regarding the statistics in this paragraph, I copied the numbers incorrectly. I have corrected the text in the article to:
The larch caterpillar infestation first occurred in 1981 in the Daxing'an Mountains of Inner Mongolia, with an area of 3333 hm². From 1992 to 1994, the insect infestation was rampant, with a cumulative area of more than 166,700 hm² and 13.33 million hm² of pine forest death.
3. (damage level of oil trees in Jianping County) Damage due to what? Explain it. / (ISIC-SPA) What's the meaning of this acronym?
Ning Zhang et al., based on UAV hyperspectral imaging technology, identified the degree of damage to oil trees caused by Dendrolimus tabulaeformis Tsai et Liu (D. tabulaeformis). A joint algorithm of the successive projection algorithm (SPA) and the instability index between classes (ISIC), with the best band selection efficiency and cross-validation accuracy, was proposed, a partial least squares regression model was established, and an accuracy of 95.23% was finally achieved.
5. State here the size of the study area.
The area is 15.36 ha.
- Clearly state what tools were used to measure what variables.
The article reports only the data used in the field data collection and the measured parameters of the sample trees, nothing else. The original wording was a writing error, and I have corrected it.
- This sentence is ambiguous. Provide precise units here. % of leafs lost per ... m2? It is necessary to improve the writing from this part on -of this subsection- to clarify the methodology to calculate LLR.
I have added a description and explanation of the LLR calculation to this section:
When calculating the LLR according to Equation 1, a typical standard shoot in each of the east, south, west and north directions was selected at the upper (Ⅰ), middle (Ⅱ) and lower (Ⅲ) levels of the sample tree using a tall pruning tool, and the numbers of damaged and healthy needles were recorded. The average defoliation rate of all branches was then used as the horizontal LLR of the sample tree. During this period, the LLR of 38 typical trees with mild, moderate and severe stress was recorded in detail, giving a total of 114 samples at the upper, middle and lower levels, which provided measured data for vertical monitoring.
1
- Clearly explain how this LLR was calculated for each level or section of the tree?
For the vertical LLR, each sample tree is divided equally by height into upper, middle, and lower layers. In each layer, typical branches in the east, south, west and north directions are sampled, and the LLR is calculated according to the formula.
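The equal three-way split by tree height can be sketched as follows. This is only an illustration under stated assumptions: the function name `canopy_layer` is hypothetical, and the split is taken over the full range from an assumed base height to the tree top (the response does not say whether ground level or crown base is used).

```python
def canopy_layer(z, z_base, z_top):
    """Return 'I' (upper), 'II' (middle) or 'III' (lower) for a height z.

    Assumption: the tree is split into three equal parts between z_base
    and z_top, as an illustration of the equal division by tree height.
    """
    if z_top <= z_base:
        raise ValueError("z_top must be greater than z_base")
    # Relative height, clamped to [0, 1] so noisy points stay inside the tree.
    rel = min(max((z - z_base) / (z_top - z_base), 0.0), 1.0)
    if rel >= 2.0 / 3.0:
        return "I"
    if rel >= 1.0 / 3.0:
        return "II"
    return "III"

# Hypothetical 12 m sample tree with four sampled heights.
heights = [1.0, 5.0, 9.5, 12.0]
layers = [canopy_layer(z, z_base=0.0, z_top=12.0) for z in heights]
print(layers)  # ['III', 'II', 'I', 'I']
```

The same function could be applied to LiDAR point heights so that point-level features and measured LLR share one layer definition.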
- In Table 1 define the spatial resolution of the optical images (m2) and the points density of the LiDAR cloud (pts/ m2).
Spatial resolution of the optical images: 0.49 m². Point density of the LiDAR cloud: 1266 pts/m². I have added this to the text.
- What type of model? Was it fitted with all initial potential predictive variables -HSI´s, MSI´s, LI´s, etc.? Please clarify the idea.
First, RFE trains a Bagged Regression Ensembles model on all initial features and calculates the weight (importance) of each feature from the model.
For HSI and MSI, the RFE algorithm is used to extract the sensitive optical features, which are the estimation feature data in the horizontal direction. The sensitive features of layer I., layer II, and layer III were extracted for LI, respectively, to provide the best sensitive features for vertical estimation.
- How did you define (specify) a given number of "features"? or do you mean "a specified" number of features? If by feature you mean variable, I suggest using the latter term/word.
The number of features to retain is specified as a percentage of all features, and the RMSE is calculated for each setting. The features retained at the minimum RMSE are taken as the sensitive features and used to construct the regression model.
- Explain how these layers are defined. Are they a portion of the tree canopy height? or are they fixed, no matter the tree canopy height?
Explain also, above, how the variables/sensitive features were estimates for each laye
For the vertical LLR, each sample tree is divided equally by height into upper, middle, and lower layers. In each layer, typical branches in the east, south, west and north directions are sampled, and the LLR is calculated according to the formula.
- as explained in the former section?
The sensitive features mentioned here are the HSI, MSI and LI sensitive to LLR extracted according to the RFE algorithm.
- Based on what criteria? On experts opinion? Pairwise comparison?
The weights are given based on expert opinion.
- 15. What about the operational limitations of the method reported to estimate LLR? Is it ready for evaluating forests of hundreds or thousands of hectares? Please add a brief discussion on the operative use of models such as the one described in this report, and on what is needed for its operationalization, if possible.
This section has been modified to
4.2. Application of multidimensional estimation of LLR to forest management
Compared with the traditional time-consuming and labor-intensive methods of obtaining forest LLR, the remote sensing approach with UAV-based hyperspectral and LiDAR data collection shortens the cycle of data acquisition, improves the data accuracy, and provides a technical basis for the large-scale nondestructive inversion of forest biochemical parameters. In this study, a model with optimal accuracy in the horizontal direction combined with LI sensitive to three levels was used to realize LLR estimation in the vertical direction. Then, the results were fused with LiDAR point cloud data to achieve 3D visualization. Numerous articles have been published on multi-source data fusion for the inversion of forest parameters, among which the DSM-based fusion method proposed by Xin Shen and other researchers from Nanjing Forestry University in 2020 is the most advanced. The team accomplished the fusion of hyperspectral and LiDAR data via three steps: generating a gridded LiDAR point cloud, extracting the highest-quality LiDAR point cloud data, and matching the hyperspectral pixels to those points. A regression model was then implemented to predict forest biochemical traits at three vertical canopy levels, and an R² of 0.85-0.91 was obtained [22]. Recently, the method has also been applied to multispectral and LiDAR data fusion and to hyperspectral, thermal infrared and LiDAR data fusion to investigate the 3D photosynthetic shape of winter wheat and the vertical profile of plant shape under pine wood nematode stress [18, 23]. Inspired by this work, we demonstrated the vertical spatial distribution of the LLR of sample trees under stress due to larch caterpillar infestation by fusing the pixel-level estimated LLR with LiDAR point cloud data via Python. The estimated MPI reaches 0.93 in the horizontal direction and between 0.83 and 0.89 in the vertical direction, an accuracy close to that reported by Xin Shen, Qinan Lin et al. [18, 22].
The LLR multidimensional estimation method proposed in this paper achieves high accuracy and realizes 3D visualization, but it still has several limitations at the operational level. For example, it requires high-precision ground survey data, and canopy-top data in particular remain difficult to obtain; the method also relies on high-resolution LiDAR data for 3D visualization, so it can be applied at most at the stand scale. Applying the method at the stand scale in subsequent studies would allow the overall response under pest stress to be explored further. In forest management applications, the method has demonstrated important value for accurate pest management and disaster assessment. Its three-dimensional visualization capability at the single-tree scale can provide decision support for targeted application and selective harvesting, and comprehensive analysis of the vertical spatial distribution across different forest stands can reveal the impact of pests on the growth and health of the entire stand and accurately identify the areas of most severe infestation, thus providing data support for the precise application of control measures. This multi-dimensional assessment breaks through the limitations of traditional two-dimensional monitoring, enabling managers to formulate differentiated management strategies based on the actual damage of each layer of the canopy, and significantly improving the pertinence and effectiveness of control measures.
16.About Conclusions
The conclusion should be changed to read:
In this study, the optical and lidar features sensitive to LLR were extracted by combining UAV multi-source remote sensing data with S-G smoothing and RFE algorithms. Random Forest Regression (RFR) and Convolutional Neural Network Regression (CNNR) models were constructed to estimate the horizontal and vertical LLR under larch caterpillar pest stress. The study concluded as follows:
(1) The S-G smoothing of different parameter combinations shows completely opposite conclusions on the optical features (W11P3) and the lidar features (W5P2), which is due to the characteristics of the data.
(2) Combined with the RFE algorithm, 13 HSIs and 16 MSIs were extracted horizontally, and the analysis found that the HSI had higher importance scores than MSI, especially NDVI and ARI. 6 LI were extracted from layer I, 9 from layer II, and 11 from layer III at different vertical levels, and the same analysis found that PER90, PER1, and PER10 had the highest importance scores respectively.
(3) The MPI ranking of the horizontal models constructed from sensitive optical features is CNNRHSI > RFRHSI > RFRMSI > CNNRMSI, with CNNRHSI achieving the best accuracy (MPI = 0.9383).
(4) Combining CNNRHSI with the sensitive LI yields LLR estimation models for the different vertical levels. The MPI of the CNNRHSI+LI models exceeded 0.8 (layer I: MPI = 0.8956, layer II: MPI = 0.8424, layer III: MPI = 0.8346), indicating reliable accuracy. Finally, the single-tree-scale estimation results were fused with the LiDAR point cloud to realize three-dimensional visualization of LLR. The results showed that mildly damaged sample trees were damaged gradually starting from layers II and III, severely damaged sample trees were severely damaged in layer I, and moderately damaged sample trees showed no obvious pattern.
The method proposed in this study not only addresses the shortcomings of the traditional two-dimensional model, but also provides intuitive results for the tree leaf loss rate after insect feeding and improves the accuracy of LLR estimation.
Author Response File: Author Response.docx
Round 2
Reviewer 1 Report
Comments and Suggestions for AuthorsThe authors have made efforts to respond to most of my earlier comments. However, the following issues remain insufficiently addressed:
"
- Line 377: While MPI is a key metric, the cited source [90] is difficult to access. Please consider providing a publicly accessible link or other references.
I have added the DOI number of the document to the references.DOI:10.27204/d.cnki.glzhu.2019.000053.
- Line 384: The description of the hierarchical analysis method (AHP) needs more detail and should include appropriate references.
The Hierarchical Analysis Method (AHP) is weighted based on expert opinion, with appropriate references being:DOI:10.27204/d.cnki.glzhu.2019.000053.
"
Please consider adding more details about the Hierarchical Analysis Method (AHP) (or HAP?), as it is foundational for the non-traditional evaluation metric—MPI—used in this study. A brief and proper validation is also critical for readers to trust the use of MPI, which is not a widely adopted metric. Additionally, the DOI provided still does not resolve the accessibility issue for international readers, as it cannot be found via Google Scholar. Please provide a more accessible link or alternative reference.
Based on the above items, the reviewer humbly suggests a major revision before a further decision can be made.
Author Response
- Following your request for more detail on the analytic hierarchy process, I list the detailed weight-calculation steps below. References for the AHP method are:
- Thomas L. Saaty. 2008. Decision making with the analytic hierarchy process. International Journal of Services Sciences (IJSSCI), 1(1): 83-98. DOI: 10.1504/IJSSCI.2008.017590
- R. W. Saaty. 1987. The analytic hierarchy process: what it is and how it is used. Mathematical Modelling, 9(3-5): 161-176. https://doi.org/10.1016/0270-0255(87)90473-8.
- Omkarprasad S. Vaidya, Sushil Kumar. 2006. Analytic hierarchy process: An overview of applications. European Journal of Operational Research, 169(1): 1-29. https://doi.org/10.1016/j.ejor.2004.04.028.
- Pairwise Comparison Matrix
The 4✕4 matrix listed assesses the relative importance of the four evaluation indicators.
Indicator | 1 | 2 | 3 | 4
1 | 1 | 2 | 1 | 2
2 | 1/2 | 1 | 1/2 | 1
3 | 1 | 2 | 1 | 2
4 | 1/2 | 1 | 1/2 | 1
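- Weight calculation (Geometric Mean Method)
- Calculate the geometric mean for each row
Rows 1 and 3: $(1 \times 2 \times 1 \times 2)^{1/4} \approx 1.4142$
Rows 2 and 4: $(0.5 \times 1 \times 0.5 \times 1)^{1/4} \approx 0.7071$
- Normalized weights (K)
$K_1 = K_3 = 1.4142 / 4.2426 \approx 0.3333$
$K_2 = K_4 = 0.7071 / 4.2426 \approx 0.1667$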
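The geometric mean weighting above can be checked with a few lines of Python. This is a minimal sketch using only the 4✕4 pairwise comparison matrix given in the response; the function name `geometric_mean_weights` is hypothetical.

```python
import math

# Pairwise comparison matrix from the response (rows and columns = indicators 1 to 4).
A = [
    [1.0, 2.0, 1.0, 2.0],
    [0.5, 1.0, 0.5, 1.0],
    [1.0, 2.0, 1.0, 2.0],
    [0.5, 1.0, 0.5, 1.0],
]

def geometric_mean_weights(matrix):
    """Return (row geometric means, normalized weights) for an AHP matrix."""
    n = len(matrix)
    row_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(row_means)
    return row_means, [g / total for g in row_means]

row_means, weights = geometric_mean_weights(A)
print([round(g, 4) for g in row_means])  # [1.4142, 0.7071, 1.4142, 0.7071]
print([round(k, 4) for k in weights])    # [0.3333, 0.1667, 0.3333, 0.1667]
```

The row means sum to about 4.2426, which is the denominator used in the normalization step.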
- Consistency Ratio Test (CR Test)
1) Calculation of the maximum eigenvalue (λₘₐₓ)
Construct the weighted vector AK by multiplying the judgment matrix A by the weight vector K, then calculate λₘₐₓ:
$\lambda_{\max} = \frac{1}{n} \sum_{i=1}^{n} \frac{(AK)_i}{K_i} = 4.0000$
where n is the order of the judgment matrix and $K_i$ is the weight of indicator i.
2) Calculate the Consistency Ratio
$CI = \frac{\lambda_{\max} - n}{n - 1} = 0$, $CR = \frac{CI}{RI} = 0$
where RI is the random consistency index for a matrix of order n.
From the above calculation results, CR = 0 < 0.1, which shows that the matrix is consistent. The weight allocation determined by AHP shows that the accuracy indicators and the stability indicators each carry a total weight of 50%, indicating the equal importance of accuracy and stability in performance evaluation. To simplify the calculation of MPI, the weights K are rounded to two decimal places.
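The consistency test can be reproduced with the standard AHP formulas, $\lambda_{\max} = \frac{1}{n}\sum_i (AK)_i / K_i$, $CI = (\lambda_{\max} - n)/(n - 1)$ and $CR = CI/RI$. The sketch below is an illustration only. The random index RI = 0.90 for n = 4 is the commonly tabulated Saaty value and is an assumption here, since the response does not state which RI was used; the function name `consistency_ratio` is hypothetical.

```python
# Pairwise comparison matrix and exact (unrounded) weights from the steps above.
A = [
    [1.0, 2.0, 1.0, 2.0],
    [0.5, 1.0, 0.5, 1.0],
    [1.0, 2.0, 1.0, 2.0],
    [0.5, 1.0, 0.5, 1.0],
]
K = [1 / 3, 1 / 6, 1 / 3, 1 / 6]

def consistency_ratio(matrix, weights, random_index):
    """Return (lambda_max, CI, CR) for an AHP judgment matrix."""
    n = len(matrix)
    # (A K)_i: each row of the judgment matrix times the weight vector.
    weighted = [sum(a * k for a, k in zip(row, weights)) for row in matrix]
    lambda_max = sum(w / k for w, k in zip(weighted, weights)) / n
    ci = (lambda_max - n) / (n - 1)
    return lambda_max, ci, ci / random_index

# RI = 0.90 for n = 4 is an assumed, commonly tabulated value.
lambda_max, ci, cr = consistency_ratio(A, K, random_index=0.90)
print(round(lambda_max, 4), cr < 0.1)
```

Because every row of this matrix is a multiple of the first row, the matrix is perfectly consistent, so λₘₐₓ equals n and CR is zero up to floating-point error.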
- Regarding the brief validation of MPI, I verified its stability by calculating the MPI value in different scenarios with the weights perturbed by ±10%.
- Model metrics
Model | Indicator 1 (accuracy) | Indicator 2 (error) | Indicator 3 | Indicator 4
RFRHSI | 0.9421 | 0.1616 | 0.0146 | 0.1733
CNNRHSI | 0.9314 | 0.0788 | 0.0058 | 0.0267
RFRMSI | 0.9049 | 0.1567 | 0.0317 | 0.1858
CNNRMSI | 0.9055 | 0.0924 | 0.0451 | 0.3117
- Scenarios with weights perturbed by ±10%
Scenario | k₁ | k₂ | k₃ | k₄ | Adjustment logic
Original | 0.33 | 0.17 | 0.33 | 0.17 | Original weights
S1 | 0.363 | 0.17 | 0.33 | 0.137 | k₁ +10%, k₄ adjusted
S2 | 0.297 | 0.17 | 0.33 | 0.203 | k₁ -10%, k₄ adjusted
S3 | 0.33 | 0.187 | 0.33 | 0.153 | k₂ +10%, k₄ adjusted
S4 | 0.33 | 0.153 | 0.33 | 0.187 | k₂ -10%, k₄ adjusted
S5 | 0.33 | 0.17 | 0.363 | 0.137 | k₃ +10%, k₄ adjusted
S6 | 0.33 | 0.17 | 0.297 | 0.203 | k₃ -10%, k₄ adjusted
S7 | 0.33 | 0.153 | 0.33 | 0.187 | k₄ +10%, k₂ adjusted
S8 | 0.33 | 0.187 | 0.33 | 0.153 | k₄ -10%, k₂ adjusted
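The eight scenarios follow one rule: one weight is scaled by ±10% and a second weight absorbs the difference so that the weights still sum to 1. A minimal sketch that regenerates the scenario weights, with the hypothetical helper `perturb` and the compensation pairs read from the scenario list:

```python
BASE = {"k1": 0.33, "k2": 0.17, "k3": 0.33, "k4": 0.17}

def perturb(base, target, factor, compensate):
    """Scale one weight by `factor`; another weight absorbs the difference."""
    weights = dict(base)
    delta = base[target] * factor - base[target]
    weights[target] = round(base[target] + delta, 3)
    weights[compensate] = round(base[compensate] - delta, 3)
    return weights

scenarios = {}
index = 1
for target in ("k1", "k2", "k3", "k4"):
    # k4 compensates for k1, k2 and k3; k2 compensates when k4 itself is perturbed.
    compensate = "k2" if target == "k4" else "k4"
    for factor in (1.1, 0.9):
        scenarios[f"S{index}"] = perturb(BASE, target, factor, compensate)
        index += 1

for name, weights in scenarios.items():
    print(name, weights)
```

Note that S3 and S8 end up with identical weights, as do S4 and S7, so the eight scenarios contain six distinct weight sets.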
- MPI in different scenarios
Scenario | RFRHSI | CNNRHSI | RFRMSI | CNNRMSI
Original | 0.9052 | 0.9383 | 0.8817 | 0.8712
S1 | 0.9073 | 0.9348 | 0.8818 | 0.8756
S2 | 0.9031 | 0.9418 | 0.8815 | 0.8669
S3 | 0.9055 | 0.9374 | 0.8822 | 0.8749
S4 | 0.9049 | 0.9392 | 0.8812 | 0.8675
S5 | 0.9111 | 0.9392 | 0.8878 | 0.8814
S6 | 0.8994 | 0.9375 | 0.8756 | 0.8610
S7 | 0.9049 | 0.9392 | 0.8812 | 0.8675
S8 | 0.9055 | 0.9374 | 0.8822 | 0.8713
With the weights determined by the analytic hierarchy process (k₁ = 0.33, k₂ = 0.17, k₃ = 0.33, k₄ = 0.17), CNNRHSI exhibits the best overall performance with an MPI value of 0.9383, attributed to its high accuracy (0.9314) and low error (0.07875). To verify the stability of MPI, a sensitivity analysis over the 8 scenarios with ±10% weight perturbations shows that the model ranking (CNNRHSI > RFRHSI > RFRMSI > CNNRMSI) is highly stable, with the dynamic range of the MPI being less than 0.5%. Among these, the weight change of has the most significant impact. This result confirms the robustness of the MPI evaluation system and provides a reliable basis for model selection for estimating defoliation rates under larch caterpillar pest stress.
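The ranking-stability claim can be checked directly against the tabulated MPI values. The sketch below only re-sorts the numbers reported in the response and does not recompute MPI itself, since the MPI formula is not given here; the helper name `ranking` is hypothetical.

```python
MODELS = ("RFRHSI", "CNNRHSI", "RFRMSI", "CNNRMSI")

# MPI values per scenario, copied from the response table (same column order as MODELS).
MPI = {
    "Original": (0.9052, 0.9383, 0.8817, 0.8712),
    "S1": (0.9073, 0.9348, 0.8818, 0.8756),
    "S2": (0.9031, 0.9418, 0.8815, 0.8669),
    "S3": (0.9055, 0.9374, 0.8822, 0.8749),
    "S4": (0.9049, 0.9392, 0.8812, 0.8675),
    "S5": (0.9111, 0.9392, 0.8878, 0.8814),
    "S6": (0.8994, 0.9375, 0.8756, 0.8610),
    "S7": (0.9049, 0.9392, 0.8812, 0.8675),
    "S8": (0.9055, 0.9374, 0.8822, 0.8713),
}

def ranking(values):
    """Return model names sorted from highest to lowest MPI."""
    pairs = sorted(zip(MODELS, values), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in pairs]

rankings = {scenario: ranking(values) for scenario, values in MPI.items()}
stable = all(order == rankings["Original"] for order in rankings.values())
print(rankings["Original"], stable)
# ['CNNRHSI', 'RFRHSI', 'RFRMSI', 'CNNRMSI'] True
```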
The reference on MPI (DOI: 10.27204/d.cnki.glzhu.2019.000053) is available on CNKI (Https://www.cnki.net), from which it is cited. I am sorry that I cannot find or provide another citation format.
- In the article, I give the calculation steps and the results of the consistency test and the calculated weights about the method. I chose open access for the review process during the submission process, so these calculations were not included in the paper.
Author Response File: Author Response.docx
Round 3
Reviewer 1 Report
Comments and Suggestions for AuthorsThe authors have addressed my previous comments with sufficient detail. For consistency, I recommend using the term “Analytic Hierarchy Process (AHP)” instead of “hierarchical analysis method (AHP)” throughout the manuscript.
Please reorganize the response content and attach it as supplemental material.
Author Response
Dear reviewers
I have modified it to Analytic Hierarchy Process (AHP).