by
  • Wojciech Gruszczyński*,
  • Edyta Puniach and
  • Paweł Ćwiąkała
  • et al.

Reviewer 1: Jakub Chromčák
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The paper deals with a very current topic: the combination of UAV and U-Net in the processing of geospatial data, with the potential to reveal landscape transformations when monitoring areas affected by mining. Instead of classic generic classification, it uses heteroskedastic regression, predicting height corrections together with their uncertainty. The methodology for processing the data and creating the U-Net seems time-consuming and relatively complex. Three metrics were used for comparison. I would ask the authors to explain why they chose these metrics and did not use other standard geodetic/photogrammetric metrics (mean absolute error, normalized median absolute deviation, or mean bias error). Even the ASPRS uses NMAD statistical measures for accuracy. Please explain this choice or extend the comparison metrics.
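To make the metrics named in this comment concrete, the following is a minimal sketch of how RMSE, MAE, MBE, and NMAD could be computed from a set of elevation or displacement errors. The function name `accuracy_metrics` and the sample error values are hypothetical and are not taken from the manuscript; NMAD is computed with the usual 1.4826 scaling of the median absolute deviation.

```python
import numpy as np

def accuracy_metrics(errors):
    """Return RMSE, MAE, MBE and NMAD for an array of errors (same unit as input)."""
    e = np.asarray(errors, dtype=float)
    med = np.median(e)
    return {
        "rmse": float(np.sqrt(np.mean(e ** 2))),               # root mean square error
        "mae": float(np.mean(np.abs(e))),                      # mean absolute error
        "mbe": float(np.mean(e)),                              # mean bias error (signed)
        "nmad": float(1.4826 * np.median(np.abs(e - med))),    # robust spread estimate
    }

# Hypothetical errors in metres; the last value plays the role of an outlier.
errors = np.array([0.02, -0.01, 0.03, 0.00, -0.02, 0.50])
metrics = accuracy_metrics(errors)
for name, value in metrics.items():
    print(f"{name}: {value:.4f} m")
```

With a single large outlier in the sample, NMAD stays close to the spread of the bulk of the errors while RMSE is dominated by the outlier, which is the usual argument for reporting both.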

An explanation is missing of why the absolute difference between the corresponding cell elevations in uni and the provisional DEM was required to exceed 10 meters (rows 166/174). The value seems unrealistically high. Please also explain the 5 m absolute value of corrections in row 216.

Figure 4: Add units to the scale and explain the blue color in the caption, even though it is explained later in the text.

Explain why you used such a range of absolute errors in Fig. 5 (and others). Are the limits somehow tied to local accuracy classes, a state standard, or are they otherwise justified?

Please add a figure showing the position of the control points in every locality, for better visualization. It is also not clear to me from the text whether the reference points were set up separately for each series, or whether a single network of points was created and used systematically for every measurement series. Please add this to the paper.

In conclusion, please add a statement about the effectiveness of the method. Since machine learning seems to be relatively lengthy and inefficient, I would like to ask you to clearly explain the prevailing benefits.

For the future, I would perhaps recommend increasing the photogrammetry overlap, as these are highly specialized measurements; it is better to increase quality at the expense of quantity. Since the measurements were carried out in southern Poland, it might be worth trying to select more mountainous areas in the southern border region for higher diversity. 😊

Author Response

"Please see the attachment."

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

Please find my comments below:

  1. Model Choice: I was curious about the use of a heteroscedastic deep learning model instead of a homoscedastic one. It would be helpful if the authors could provide a brief explanation of this choice in the Introduction or Methods section to clarify the rationale behind the selected approach.
  2. Introduction (Lines 89–109): The authors describe their previous work involving a U-Net neural network for ground surface determination across three paragraphs. I recommend condensing this into a single paragraph. Additionally, it would strengthen the Introduction to first summarize commonly used ground filtering algorithms, followed by a brief overview of other machine learning methods applied to similar tasks, and then introduce the method from the authors’ previous paper.
  3. Figures 2 and 3: Consider merging Figures 2 and 3 to streamline the presentation. Focus on key inputs, outputs, and processes. Detailed information such as “patches of 204×204 cells” may be omitted from the figure if already described in the text.
  4. Line 160 – Inconsistency Between Text and Figure: The sentence “In hsparse grid cells where data is missing…” describes filling missing values, but Figure 3 shows outlier removal following hsparse. This creates a mismatch between the text and the figure. Additionally, variables such as dem and olddem are not shown in the figure. Please revise for consistency.
  5. Why would there be outliers between olddem and dem? Is this solely a result of applying morphological opening?
  6. Variable Naming in Figures: Figures 2 and 3 use multiple variable names (e.g., uni, hgauss, hdgauss, hsparse, dem, olddem, provisional DEM) for outputs derived from uni. This makes the workflow appear overly complex. A simplified naming convention or consistent abbreviations would improve readability.
  7. Does the provisional DEM refer to points within a 1-meter grid cell or a raster format? Please clarify.
  8. Why is the difference between uni and hgauss used as input to the model? Wouldn’t this difference be close to zero, rather than representing the cleaned point cloud data?
  9. Validation images were cropped using a 15 m grid. Why was this size chosen instead of the 5 cm × 5 cm grid used for training/input?
  10. When downsampling from 1 m to 10 m resolution, which specific method was used?
  11. Section 2.2: For the ground filtering algorithms, was the input data the original point cloud or the outlier-removed version?
  12. Section 2.4 – Dataset Introduction: It would be more effective to introduce the dataset before describing the methods and data processing steps.
  13. Figure 4: Adding an inset map showing the geographic or spatial distribution of the four study sites would enhance the reader’s understanding of the dataset.
  14. Does the established U-Net architecture work on the original UAV point cloud data? Have you tested the final U-Net architecture directly on the original point cloud (without cleaning or outlier removal), together with the cleaned ALS point cloud?
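As background to comment 1, the difference between a homoscedastic and a heteroscedastic model can be illustrated with the Gaussian negative log-likelihood, a common loss for regression with predicted uncertainty. This is only a sketch under that assumption: the loss actually used in the manuscript is not stated in this report, and the function name `gaussian_nll` and all numeric values are hypothetical.

```python
import numpy as np

def gaussian_nll(target, mean, log_var):
    """Per-cell Gaussian negative log-likelihood given a predicted mean and log-variance."""
    target, mean, log_var = (np.asarray(a, dtype=float) for a in (target, mean, log_var))
    return 0.5 * (log_var + (target - mean) ** 2 / np.exp(log_var) + np.log(2.0 * np.pi))

# Two hypothetical cells: the first is predicted well, the second is off by 0.5 m.
target = np.array([0.10, 0.10])
mean = np.array([0.10, 0.60])

# Homoscedastic assumption: one shared variance (0.01 m^2) for every cell.
homo = gaussian_nll(target, mean, np.log([0.01, 0.01])).mean()

# Heteroscedastic assumption: the model may report a larger variance (0.25 m^2)
# for the cell it predicts poorly.
hetero = gaussian_nll(target, mean, np.log([0.01, 0.25])).mean()

print(f"homoscedastic NLL:   {homo:.3f}")
print(f"heteroscedastic NLL: {hetero:.3f}")
```

In this toy case the heteroscedastic variant attains a lower loss because it can flag the poorly predicted cell as uncertain, and that per-cell variance is what would later allow unreliable cells to be identified.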

Author Response

"Please see the attachment."

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

This article proposes an algorithm based on heteroscedastic regression using a U-Net network, which demonstrated high effectiveness in determining vertical ground surface displacements in agricultural areas of the USCB and regions with similar characteristics.

However, there are still some parts of the article that need to be modified. Here are detailed comments:

  1. Abstract, lines 19 to 21: The text mentions "the root mean square error (RMSE) of vertical displacement, the percentage of nodes with determined displacement values, and the percentage of outliers among those values". Please add specific numerical values.
  2. I suggest that the authors move Section 2.4 to Section 2.1, describing data acquisition first and the research methods afterwards.
  3. Section 2.1.1, line 147, states that "In the first step, the area covered by the point cloud is divided into 5 cm x 5 cm grid cells". How was the 5 cm value determined? Does the extent of the research area need to be considered? The same question applies to the 1 m × 1 m value in line 157, the 2-meter radius in line 164, the values in lines 178-179, and other places in the text. Please explain the basis for these values one by one.
  4. Section 2.1.2, lines 203 to 209, compares the measurement values obtained using ALS elevation and GNSS RTK, as well as the digital surface model generated by unmanned aerial vehicle photogrammetry. When were these three types of data obtained?
  5. The blue area in Figure 4 is not explained in the figure caption.
  6. In Section 2.4, lines 427 to 431, it is explained that there was no vertical displacement in Figures 4a to c. Therefore, is it appropriate to use a-b as the training area? At the same time, the training area includes buildings and vegetation, while there are no buildings in the validation and testing areas. Is this appropriate? Please provide an explanation.
  7. I suggest placing Table 1 after Section 2.4 and before Section 2.4.1.
  8. According to the analysis in the article, the results of ATIN and the proposed U-Net method do not differ much on the experimental data. However, U-Net requires a large amount of data, and its steps are more complex and computationally intensive. In addition, the experimental data cover low-slope agricultural areas, so the advantages of U-Net are demonstrated only for this setting. ATIN has been shown in other literature to have advantages in areas with steep slopes. Is it therefore necessary to study the proposed method in this situation?
  9. The logical flow of the text is weak: results presented later are already referred to in earlier parts of the text. This makes the paper difficult to read. Please revise.
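Regarding comment 3, the link between cell size and the extent of the study area can be illustrated with simple arithmetic. The sketch below counts grid cells for a hypothetical 500 m × 300 m site at 5 cm and at 1 m resolution; the extent and the helper `grid_shape` are assumptions made for illustration, and only the two cell sizes come from the comment above.

```python
def grid_shape(width_cm: int, height_cm: int, cell_cm: int) -> tuple[int, int]:
    """Rows and columns needed to cover an extent, using integer ceiling division."""
    cols = -(-width_cm // cell_cm)
    rows = -(-height_cm // cell_cm)
    return rows, cols

# Hypothetical site extent: 500 m x 300 m, expressed in centimetres.
extent_cm = (50_000, 30_000)

fine_rows, fine_cols = grid_shape(*extent_cm, cell_cm=5)        # 5 cm cells
coarse_rows, coarse_cols = grid_shape(*extent_cm, cell_cm=100)  # 1 m cells

fine_cells = fine_rows * fine_cols
coarse_cells = coarse_rows * coarse_cols
print(f"5 cm grid: {fine_rows} x {fine_cols} = {fine_cells:,} cells")
print(f"1 m grid:  {coarse_rows} x {coarse_cols} = {coarse_cells:,} cells")
```

The cell count grows with the square of the resolution ratio, so memory and processing time depend on both the chosen cell size and the size of the surveyed area.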

Author Response

"Please see the attachment."

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

This paper proposes a heteroskedastic regression model based on the U-Net architecture for estimating vertical surface displacement in agricultural areas from UAV photogrammetry point clouds. The model is trained, validated, and tested using four real datasets (Jerzmanowice, Karniowice, Lędziny, Jaworzno) and compared with several traditional ground filtering algorithms (ATIN, SMRF, CSF). The research design is rigorous and has some innovative and practical value, but the article can be improved on the following points:

Although the method is compared with classical filtering methods, it is not compared with other deep learning semantic segmentation methods (such as PointNet, RandLA-Net, etc.). It is recommended to introduce 1–2 mainstream point cloud deep learning models as a comparison group, test their elevation correction accuracy and uncertainty quantification ability on the same dataset, and clarify the advantages of U-Net.

The time gap between the ALS reference data and the UAV data used in this paper is large (e.g., Karniowice's ALS data is from 2013, while the UAV data is from 2021–2022), and the influence of this time difference on the target value of elevation correction is not quantified. A sub-area with a small time difference (such as within 1 year) could be selected for comparison, so that the influence of the time difference on the target value of elevation correction can be evaluated.

It is recommended to further elaborate on the potential application value of the proposed method beyond agricultural areas. For example, the approach could be extended to mining subsidence monitoring, urban ground stability analysis, forest floor modeling, and infrastructure deformation detection, thereby highlighting the generalizability and broader academic and engineering significance of the model.

Author Response

"Please see the attachment."

Author Response File: Author Response.pdf

Round 2

Reviewer 3 Report

Comments and Suggestions for Authors

This paper is readable.