A Two-Stage Semiempirical Model for Satellite-Derived Bathymetry Based on Log-Ratio Reflectance Indices
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
See attached .docx file.
Comments for author File: Comments.pdf
The English is generally good, but review by a native English speaker before submission of the revision would be useful.
Author Response
Dear Reviewer,
We sincerely thank the reviewer for the thoughtful and detailed feedback provided. In addition to addressing the explicit recommendations, it is important to note that the focus and title of the manuscript have been revised to emphasize the contributions rather than presenting a complete methodology, since some of the strategies in the originally proposed workflow were not novel. The article now centers on the proposed semiempirical model. The structure has also been adapted to the classical format of Introduction, Materials and Methods, Results, Discussion, and Conclusion. Furthermore, more robust statistical comparisons have been included, along with an extensive revision of the language and overall writing. We have also carefully revised the manuscript to address each of the points raised. Below we provide our responses to the major comments.
(1) Abstract: A log ratio method of green/blue and red/blue is employed and said to be better than Stumpf’s (or Lyzenga’s) SDB method. But I think this is Stumpf’s method...? Also, they mention that ACOLITE is used. Unless ACOLITE is used in a new way, atmospheric correction is simply standard operating procedure.
Response: We appreciate this important observation. In the revised version, we changed the focus of the manuscript. Since ACOLITE is not used in a novel way, its role is now presented as a standard preprocessing step rather than as a central component of the contribution. The focus of the paper has been shifted to emphasize the modeling strategy itself. We also now explicitly clarify the differences between our proposed formulation and Stumpf’s approach, underscoring that while both use logarithmic transformations, the structure of our log-ratio indices differs from Stumpf’s ratio of logarithms.
(2) Page 3, Starting around L88. It would be useful to provide context on the general approaches used in SDB modeling and then refer to these in the specific examples cited. I am referring to physics-based, quasi-empirical (Stumpf, Lyzenga), and purely empirical (almost all machine learning methods) approaches.
Response: Thank you for this suggestion. We revised this part of the manuscript to provide broader context on the main approaches used in SDB. Specifically, we now use the term semiempirical to describe Stumpf, Lyzenga, and Hashim’s methods, and we explicitly describe our proposed model in the same category. We also clarify that machine learning approaches are purely empirical, in line with the reviewer’s recommendation.
(3) Page 3, Lines 96-7. In fact, the Stumpf model is a “two zone” model. Blue and green bands are usually used, but in fact Stumpf noted that the use of blue and red is more appropriate for “shallow water” – although I have never seen documentation defining the depth threshold where blue/red should be used instead of blue/green. Nonetheless, this point seems particularly important since the authors are proposing the use of blue/green and blue/red. An example of where this is important is Page 3, Line 118, where traditional methods are referred to as “empirical” when, in fact, they are generally quasi-empirical.
Response: We thank the reviewer for this clarification. In the revised manuscript, the differences with Stumpf’s model are explicitly highlighted in several sections. We emphasize that while Stumpf’s method relies on the ratio of logarithms, our formulation applies the logarithm directly to the band ratio, which is conceptually different. We also followed the reviewer’s advice to revise terminology, consistently using semiempirical instead of empirical or analytical when referring to these approaches.
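For clarity in this response, the structural difference can be written out. The first line is Stumpf's published form (with tunable constants m1 and m0 and a fixed scaling constant n, typically 1000); the second is only a schematic of the proposed log-ratio indices, with f standing in for the two-stage regression whose exact form is given in the manuscript:

```latex
% Stumpf et al. (2003): a ratio of logarithms
Z = m_1 \,\frac{\ln\!\big(n\,R_w(\lambda_i)\big)}{\ln\!\big(n\,R_w(\lambda_j)\big)} - m_0

% Proposed indices (schematic): the logarithm of the band ratio itself
X_{g/b} = \ln\!\left(\frac{R_g}{R_b}\right), \qquad
X_{r/b} = \ln\!\left(\frac{R_r}{R_b}\right), \qquad
Z = f\!\left(X_{g/b},\, X_{r/b}\right)
```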
(4) Page 4, L125-126. Similar to my previous comments, the meaning of “analytical methods” is not clear. Page 5, L172-176. Agreed that LiDAR and multi-beam can be used as ground-truth to train SDB models. But those generally cover an entire area, and the goal of SDB is generally to estimate shallow water bathymetry for which one does not have complete coverage. This operational/real-world consideration must be acknowledged and/or discussed – perhaps in the Discussion section.
Response: Following the reviewer’s guidance, the term analytical methods has been replaced throughout the manuscript with semiempirical methods, which better reflects the nature of these approaches. Additionally, we added a discussion in the revised manuscript acknowledging the operational limitations of using LiDAR and multi-beam data, noting that while they provide complete coverage in some cases, SDB is particularly valuable where such coverage is not available. This addition has been placed in the Discussion section, as recommended.
(5) Page 10, L331-332. It is fairly common in SDB work to use multiple images to derive a single “composite image” that is then used for model training and for application across the study area. Articles on this should be cited. (These can often be found by using key words like “multi-temporal images” and “composite image for SDB.”) The most common compositing method is using the maximum pixel value, based on the justification that the maximum reflectance indicates the least atmospheric “contamination.” In principle, I have no objection to using a mean value composite (although the mean is heavily impacted by aberrant values – especially with a small sample size). However, the authors must justify the use of a mean composite by citing articles in which a mean composite was used. (And as a side note, in my work I have found that if I am using, for example, four images, there is always at least one image that outperforms any composite. The trick, of course, is to figure out which image will perform best.)
Response: We appreciate this insightful comment. In the revised version, we included a paragraph in Section 2.2 (Satellite Data and Preprocessing) providing more detail on the selection and averaging of images, along with citations to studies that also applied mean compositing. Our rationale is that selecting a single “best” image may introduce bias, as it can be akin to “cherry-picking” and does not necessarily reflect performance in more challenging cases. By averaging multiple acquisitions, we aim to reduce variability and improve robustness in the worst-case scenarios, even if the mean is not always superior to the best single image. We clarified this reasoning and cited relevant literature to justify the choice.
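For illustration, the compositing step amounts to the following (a minimal sketch, assuming co-registered scenes whose contaminated pixels have already been set to NaN; function and variable names are ours, not the manuscript's):

```python
import numpy as np

def mean_composite(scenes):
    """Per-pixel mean of co-registered reflectance scenes.
    scenes: list of 2-D arrays of equal shape, with cloud/shadow/glint
    pixels already masked as NaN by earlier quality filtering."""
    stack = np.stack(scenes, axis=0)   # (n_scenes, rows, cols)
    return np.nanmean(stack, axis=0)   # mean over valid pixels only

def max_composite(scenes):
    """The alternative the reviewer mentions: per-pixel maximum,
    on the premise that higher reflectance implies less atmospheric
    contamination."""
    return np.nanmax(np.stack(scenes, axis=0), axis=0)
```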
(6) Figures 6 through 9. Please verify that the RGB model uses the same color bar as the proposed model and the ANN. Something seems consistently wrong that the RGB model has notably deeper values than the other two.
Response: We would like to clarify that the RGB composites are not displayed with the same color bar as the proposed model or the ANN, because they are not quantitative bathymetric outputs. They are shown only as qualitative visual references to highlight areas of potential discrepancies where no in situ depth data are available. While we agree that such qualitative comparisons are less rigorous, we believe they still provide valuable insights—for example, identifying unrealistic deep channels or shoreline changes that clearly contradict reality. Importantly, these qualitative analyses do not replace quantitative evaluation, which remains the primary basis of our conclusions. This distinction has been clarified in the revised manuscript to avoid misinterpretation.
(7) Two very distinct types of ground-“truth” in situ data are employed. I think the use of a data set that is considered highly accurate – lidar – and one considered of unknown accuracy – the chart data – is really interesting. I think a paragraph should be included (in the Discussion section?) about any relevant observations – even if it is just “No conclusion could be drawn about SDB accuracy based on whether lidar or chart data are used as in situ ground truth.” Incidentally, related to this is that it would have been interesting if it had been possible to compare the accuracy of chart-based SDB with lidar data – although that would have required an additional data set that I imagine does not exist.
Response: We fully agree with the reviewer that this is an important point. In the revised version, we added a paragraph in the Discussion section addressing the nature of the two types of in situ ground-truth datasets. We discuss the high accuracy of LiDAR-derived data compared with the more uncertain chart-derived points, and we note that the comparison of these two reference sources is an interesting aspect of our study. While a direct accuracy comparison between chart-based SDB and LiDAR-based SDB was not possible given the available datasets, we highlight the relevance of considering these differences and their implications when interpreting the results.
(1) Page 2, L71-81. “Chapter” is appropriate for books, but not articles. Use “Section.”
Response: Thank you for pointing this out. All instances of “chapter” have been replaced with “section” to conform to academic writing standards for journal articles.
(2) Page 6, Figure 1. Figure 1 is never cited or referred to in the text. Remove it if it is not necessary.
Response: Following the reviewer’s suggestion and after changing the focus of the paper toward the proposed model, Figure 1 was removed from the manuscript.
(3) Page 6, Near L211. It would be useful to present Figure 2 before the specifics of the study areas are discussed.
Response: We appreciate this observation. Figure 2 is now presented before the description of the study areas, improving the logical flow of the text.
(4) Page 8, L250-253. I assume these reflectance values are normalised? Landsat-8 provides 16-bit integer data. What was done to convert reflectance values to floating point numbers with four decimal places?
Response: Thank you for raising this important clarification. We now explicitly state in Section 2.2 (Satellite Data and Preprocessing) that reflectance values are converted to floating point numbers by the ACOLITE tool during atmospheric correction. This ensures the appropriate scaling and normalization of Landsat-8 data.
(5) Page 8, 258-259. The sentence that starts “original data...” makes no sense.
Response: We appreciate this comment. The manuscript was substantially revised, and the sentence in question no longer appears in the revised version.
(6) Page 8, L270-272. Reword the explanation of using the pixel closest to in situ data points. I presume that it was not the closest pixel that was used, but instead the pixel in which a point fell. (And if this is not the case, the authors need to justify using a “nearby” pixel whose reflectance can be completely different from that of an adjacent pixel.)
Response: We have clarified this point in the revised manuscript. For each in situ point, we calculate the coordinates of the center of each pixel and assign the in situ point to the pixel whose center is closest. We also verified that no pixel center is assigned to more than one in situ point, avoiding duplication and ensuring that reflectance values are matched consistently to reference depths.
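A minimal sketch of this matching rule (the helper is hypothetical and assumes projected coordinates for both soundings and pixel centers):

```python
import numpy as np
from scipy.spatial import cKDTree

def match_soundings_to_pixels(soundings_xy, pixel_centers_xy):
    """Assign each in situ sounding to the pixel whose center is
    nearest, then verify that no pixel center is used twice.
    soundings_xy:      (n, 2) projected coordinates of soundings
    pixel_centers_xy:  (m, 2) projected coordinates of pixel centers
    Returns one pixel index per sounding."""
    tree = cKDTree(pixel_centers_xy)
    _, idx = tree.query(soundings_xy)      # nearest center per point
    if len(np.unique(idx)) != len(idx):    # duplication check
        raise ValueError("a pixel was matched to more than one sounding")
    return idx
```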
(7) Page 9, L282-283. The sentence that starts “A total of 8 images” should be broken into two sentences.
Response: The text has been restructured during revision, and this specific sentence no longer appears in the manuscript.
(8) Page 13, L429. “...to calculate Lyzenga’s constant.” Table 1 indicates Lyzenga’s method has two coefficients. How is aᵢ determined? More (brief) explanation is required.
Response: We have expanded the explanation in Section 2.3, clarifying how the coefficients of Lyzenga’s and other methods are determined. Specifically, they are estimated by minimizing the squared error between predicted and observed depths using the available in situ reference data. We also note that this same optimization approach is consistently applied to all benchmark algorithms for fairness.
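For Lyzenga's linear model this step reduces to ordinary least squares (a sketch under the usual formulation z = a0 + Σ aᵢ·ln(Rw,i − R∞,i); the deep-water reflectance R_inf and all names here are illustrative):

```python
import numpy as np

def fit_lyzenga(Rw, R_inf, z_obs):
    """Least-squares estimate of Lyzenga's coefficients.
    Rw:    (n_points, n_bands) water reflectance at the soundings
    R_inf: (n_bands,) deep-water reflectance per band
    z_obs: (n_points,) reference depths"""
    X = np.log(Rw - R_inf)                        # linearised predictors
    A = np.column_stack([np.ones(len(z_obs)), X]) # intercept a0 + slopes ai
    coeffs, *_ = np.linalg.lstsq(A, z_obs, rcond=None)
    return coeffs                                 # [a0, a1, ..., an]
```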
(9) Page 13, L442. For the M-A study area, only 128 points were available. I doubt this is sufficient to fit a generalized ANN. This must be discussed (probably after the Results section). Later... This comment is sufficiently addressed by the analysis mentioned on Page 14, L469-470.
Response: We agree with this point and have included a discussion of the limitations of using an ANN with the limited Magdalena-Almejas dataset. As the reviewer observed, the Results section already addressed this issue, and we have ensured that this discussion is more explicit in the revised manuscript.
(10) Page 15. Because of the figure resolution, Figures 4 and 5 are of little value for a reader except to show the locations of Zones A, B, and C. I suggest presenting only one, just to show the zone locations.
Response: We agree with the reviewer’s suggestion. In the revised manuscript, a single figure is now used to show the locations of Zones A, B, and C, improving clarity and avoiding redundancy.
(11) Page 16, L481. Please provide a better indication of where the island is. Is it the spot in the centre of the image?
Response: Thank you for this useful observation. We revised the caption and figure description to clearly indicate where the island is located. It is indeed the spot in the center of the image, and this has now been explicitly noted and labeled for clarity.
We would like to sincerely thank the reviewer(s) and the editor once again for their thoughtful and constructive feedback. The comments have been extremely valuable in helping us improve the manuscript. We believe that the revised version now presents a clearer, more rigorous, and more balanced study, with a stronger emphasis on the main contribution: the proposed two-stage semiempirical model for satellite-derived bathymetry.
We hope that the changes made adequately address all concerns raised, and we respectfully submit the revised manuscript for your consideration.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
Multiple papers have been published on this topic recently, which makes it relatively competitive. It is therefore of major importance that the methodology of the work is explained clearly, in order to convince the reader of the novelty of the method. While the strategy of using two band ratios to improve the regression can be considered innovative, a large number of major issues do not allow the reader to gain confidence in the work undertaken. The major flaws are the following:
- analytical vs. parametrical approach: while your method is clearly based on an empirical approach (log ratios), you cannot really speak of an analytical method, which basically seeks to rely purely on the physical laws of optics all the way through the SDB process (this is actually the strategy followed by some SDB estimation methods). Minimally, I would suggest replacing all references to the term analytical with the term parametrical.
- Generalization: two sites are not enough to claim a generalized method. Moreover, the nature of the seafloor looks similar for your two areas. Testing your method on areas that are highly variable in nature would show whether or not the method is robust.
- Calibration vs. validation: it seems to the reader that all the measured soundings are used for both the calibration and validation steps. If this is the case, this does not test the performance of the estimation. If you want to undertake a proper calibration (regression) and validation procedure, you should split-sample your measured datasets, that is, isolate data for the regression phase solely, leaving another part for the validation only. One common strategy is to randomly split 75% for calibration and 25% for validation, but other strategies exist as long as the split groups are "sufficiently representative" of the whole group.
- Your evaluation is lacking 2D plots of estimated vs. in situ depth, which are useful to detect some defects of the estimation with respect to water depth. You should also better describe your source data (vertical reference, confidence, origin). Finally, you should also provide difference maps in order to locate the areas of good fit and those outside it. Your interpretation will be easier to perform.
- Bathymetric data are normally characterized using the International Hydrographic Organization Special Publication S44. Although all the S44 parameters are difficult to get when handling SDB, you should at least try to characterize areas where your method fits one of the orders (for SDB, Order 1 vertically is often sought).
- The preprocessing phase is poorly described when it comes to the averaging process. If you want to average multiple images (which is a valuable strategy), you have to be very cautious with the level of corrections applied (atmospheric mostly in your case, as you deal with log ratios of bottom-of-atmosphere reflectance) so that the seafloor reflectance values are not contaminated with residual effects. ACOLITE is mentioned in the title of your paper, but little insight is given into the way you have used it.
- The ANN seems not really useful to your argumentation: little is known about the way you have conceptualized, implemented, and trained/validated it (note that you should also have a strategy here to sample your datasets). The reader is a bit puzzled that the metrics indicate good performance while your qualitative description discards it.
- The style of the paper: there is a fair amount of redundant information, as the workflow is split between the methodology and the results, while new elements (the ANN) appear later in the paper that should be introduced earlier. This makes the reading a bit confusing. Also, I suggest that you provide a discussion section that criticizes the benefits/limits of your method (Section 4.4 might be an attempt to do so). I suggest following the classical path of Materials & Methods, Results, Discussion, Conclusion.
- Figures should be bigger and should provide more details (bear in mind that, following the abstract, readers often jump to the figures before starting to read the paper).
Following are specific comments provided throughout your paper:
l56-58: This sentence seems to indicate that you have deliberately chosen the log-ratio methodology and not the physics-based (purely analytical) route. You should explain why you have made this choice.
l76-77: Why not try to generalize further, that is, on different data sites? For example, Hudson Bay (Canada), African coasts, Europe, or Pacific islands (or at least areas where in situ data are easy to access).
l105: SPEAR Relative Water Depth tool from ENVI. What is the methodological background of this tool? If you are using dedicated software, you should provide elements on the methodology/parameters.
l114-117: It has nothing to do with the methodology; it has only to do with the way we fine-tune the regression. In that sense, one cannot say that ML/AI techniques have proven effective in estimating water depth.
l122: One has to provide valuable metrics to say that one method outperforms another; "greater accuracy" is too vague. Stating what the target is, and how different targets relate to this objective, would be a better way to handle this.
l172-176: You might also stress the need for data collected within a reasonable timeframe of the date of the satellite imagery collection (that is, to consider potential morphological dynamics in the area).
You suggest using nautical charts. Bear in mind that data on nautical charts are characterized by a level of confidence (CATZOC), which indirectly takes the age of the datasets into consideration.
l177-l182: What is the final output of this step? A masked area and/or an adequate selection of images? Correction of the reflectance values (this is of high importance, as you are considering averaging images)?
l191-192: How does your method generalize?
l194-198: A correlation plot is highly valuable (estimated bathymetry against measured, for soundings that have not been exposed to the regression tuning process).
l199-l204: We suggest that the bathymetry be interpreted against international standards such as the International Hydrographic Organization Special Publication 44 (edition 6).
Note also that you should discuss the way you select your data when it comes to training/validation. You should clearly separate both; otherwise you will lose credibility.
Fig1: Be aware that Fig. 1 is not fully in accordance with the text. For example, you mention ACOLITE corrections, which are barely touched upon in your text (note that you also mention ACOLITE in the title of your paper). It is the view of the reviewer that you should further stress this point.
l210: Two sites are not enough to prove the generalization of your method. Five or more, covering different seafloor natures, would have better tested the robustness of your method.
l244: At least for the sake of reproducibility of your method, you should indicate the chart references from which you digitized the soundings.
Also, as mentioned above, the hydrographic chart should either indicate a CATZOC classification or, minimally, a source diagram (indicating the type of sensor and the period in which the survey was acquired).
l245: If you have digitized the points from a paper nautical chart, you should also have digitized control points; if you want to describe the precision of the digitization process, you can provide the uncertainty.
l257: "were by": a word is missing.
l257-260: Please provide a bit of detail on these data (NOAA), at least to show that you have looked at them before using them. Important features the reader wants to be informed of, I believe, are the date of the survey with respect to the satellite image, average density, vertical precision, horizontal precision, and level of processing (raw/processed). Also, although I understand that you are working on log ratios, you should look carefully at how these LiDAR data have been tide-corrected. Please mention this.
RGB values: please provide the range extent. At which processing stage are these values provided?
l265 chart-derived: Which charts? Please be explicit (reference them); you haven't introduced these data before.
How did you compare the two datasets (as the soundings are not co-located)?
You should be more explicit.
l266: How have you tested the generalization of your method? What kind of validation strategy have you used (typically, one isolates part of the soundings, not exposed to the regression step, to be used subsequently for the validation)?
Fig3: As you have the bathymetry from these two datasets, you should illustrate the global morphology here. A 2D map of the bathymetry (from in situ measurements) should be provided rather than the distribution of the point cloud.
l279 Landsat: Sentinel-2 also has a wide archive of data and is free to use.
You could test your method on this sensor as well, which would improve the "generalization" capabilities.
l281-285: Long sentence; you should rephrase it.
l300: You should indicate the level of vertical uncertainty that you are expecting. Without this, it is difficult to judge what is representative in terms of tidal effects.
l307: are you considering coastal aerosols here?
l313: "calibration parameters" you should provide a technical reference
3.4.2: You should be more explicit and provide details proving that you can average, considering, as mentioned earlier, that BOA corrections should have been handled with extra care for each image and that there are no (or minimal) residual atmospheric effects (a comparison of co-located reflectance values should illustrate this).
l385 "estimated": considered to be the center of the pixel the nearest to the observed depth (no linear/bilinear interpolation?)
MAE: this is one metric that can easily be referred to the S44 bounding limits (vertical component), corresponding to Order 1 or better, for example.
l422 ANN: This is a new element in the paper. I do not understand what you are expecting from this. Moreover, there is a wide diversity of ANN architectures, which should be minimally described (number of connections, hidden layers, activation function).
l425: Here again, if you want to train a model, you should explain how the training strategy was performed (selection of the soundings for the training).
l442: Default parameters: you should be more explicit, at least briefly stating what these default parameters are and what their sensitivity is.
l452: "independently calibrated" counter intuitive when it comes to the generalization of the method.
Also you should provide details on the way you have selected points to "calibrate" the models. These calibration points should be separated from the validation points.
Figures, p. 15: At least for Figs. 6, 7, and 8, if not Figs. 4/5, you should also provide a difference map (between the proposed and ANN models) to help the reader identify where issues may arise.
Validation: you should provide a 2D plot of estimated vs. measured soundings.
l478: If you indicate geographic locations in the text, make sure they are also displayed on the maps (the reader is not expected to know the geography of your area of interest).
l481: "The ANN incorrectly extrapolated". How do you explain that statistically (Table 6/7) the ANN provides better metrics than for your model ? while your quantitative explanation seems to indicate that it provides unrealistic features in both shallow and deep areas. -> this must have something to do with the representativity of the measured soundings/strategy train/validation
l485: Here again, the reader does not know how the training was performed (it should be done so that the training experiences all types of water-depth situations).
l489: "without training data" (now recurrent comment): it seems to me that you are using all the available measured data for the training. You might want for example to randomly sample 75 % of the data for the training 25% for the validation (other strategies may exist as long as the training split and the validation split are representative of the overall dataset).
4.4: See my previous comment with respect to difference maps. In addition, this section does not have really strong arguments. The black-box nature of the neural network is an a priori choice; you knew it before using this technique, and I don't think invoking this argument is scientifically valid. I believe that you are jumping too quickly to the conclusion; you might want to discuss further work to make this study more robust (see the comments at the beginning of this review).
4.5: It is nice to share your data; this should facilitate reproducibility and comparison. Note that the header has a mistake in the last field: you have swapped letters in "depth".
l512-515: Could you identify areas from this dataset by the contribution of each data type (i.e., do you have an associated map of (1) pure estimation and (2) estimation with controlling measured data)?
l539-541: Did you provide the estimated depth computed from your method for this second area, as you did for the previous one?
Conclusion/Abstract: I suggest you review both in light of the comments provided by the reviewers, to make them more robust.
Author Response
Dear Reviewer,
We sincerely thank you for your thorough review and constructive comments on our manuscript. In addition to addressing the explicit recommendations, it is important to note that the focus and title of the manuscript have been revised to emphasize the contributions rather than presenting a complete methodology, since some of the strategies in the originally proposed workflow were not novel. The article now centers on the proposed semiempirical model. The structure has also been adapted to the classical format of Introduction, Materials and Methods, Results, Discussion, and Conclusion. Furthermore, more robust statistical comparisons have been included, along with an extensive revision of the language and overall writing. Your feedback has been extremely valuable for improving both the clarity and the quality of our work. Below, we provide a point-by-point response to each of the major concerns raised:
- analytical vs. parametrical approach: while your method is clearly based on an empirical approach (log ratios), you cannot really speak of an analytical method, which basically seeks to rely purely on the physical laws of optics all the way through the SDB process (this is actually the strategy followed by some SDB estimation methods). Minimally, I would suggest replacing all references to the term analytical with the term parametrical.
Following your advice and the recommendations of other reviewers, we have replaced the term analytical with semiempirical throughout the manuscript. This terminology better reflects the nature of our approach based on log-ratio indices and avoids possible misinterpretation.
- Generalization: two sites are not enough to claim a generalized method. Moreover, the nature of the seafloor looks similar for your two areas. Testing your method on areas that are highly variable in nature would show whether or not the method is robust.
While the previous version of the manuscript did not make this sufficiently clear, we emphasize now that the proposed strategy was tested in two study areas with markedly different seafloor conditions. The clear reef-dominated environment of Buck Island is fundamentally different from the turbid sandy bottoms of Isla Magdalena, as reinforced in the revised text. We acknowledge, however, that the number of sites is limited. Due to constraints of time, resources, and field conditions, it was not possible to test additional areas at this stage. Nevertheless, the contrasting nature of the two selected regions, both in terms of seafloor composition and data availability, provides valuable insight into the robustness of the method. We have clarified that extending the evaluation to further sites with diverse conditions remains an important direction for future work.
- Calibration vs. validation: it seems to the reader that all the measured soundings are used for both the calibration and validation steps. If this is the case, this does not test the performance of the estimation. If you want to undertake a proper calibration (regression) and validation procedure, you should split-sample your measured datasets, that is, isolate data for the regression phase solely, leaving another part for the validation only. One common strategy is to randomly split 75% for calibration and 25% for validation, but other strategies exist as long as the split groups are "sufficiently representative" of the whole group.
One of the most important changes in this new version is the adoption of a k-fold cross-validation strategy. Unlike a simple split-sample approach, k-folds ensure that the dataset is systematically partitioned into multiple subsets, allowing the model to be trained and validated repeatedly on different data segments. This reduces the risk of biased sampling, improves representativeness, and provides more robust statistical measures of performance. The revised manuscript highlights this change and discusses its advantages in greater detail.
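The evaluation loop is standard; a minimal sketch (scikit-learn names are real; fit_fn and predict_fn are placeholders for whichever model is being benchmarked):

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.metrics import mean_squared_error

def kfold_rmse(X, z, fit_fn, predict_fn, k=5, seed=0):
    """k-fold evaluation: every sounding is validated exactly once
    and is never part of its own training fold."""
    rmses = []
    for tr, va in KFold(n_splits=k, shuffle=True,
                        random_state=seed).split(X):
        model = fit_fn(X[tr], z[tr])
        z_hat = predict_fn(model, X[va])
        rmses.append(np.sqrt(mean_squared_error(z[va], z_hat)))
    return np.mean(rmses), np.std(rmses)
```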
- Your evaluation is lacking 2D plots of estimated vs. in situ depth, which are useful to detect some defects of the estimation with respect to water depth. You should also better describe your source data (vertical reference, confidence, origin). Finally, you should also provide difference maps in order to locate the areas of good fit and those outside it. Your interpretation will be easier to perform.
We have included new 2D plots comparing measured versus estimated depths, as well as residual distribution plots, which help visualize potential estimation biases. In addition, we now provide detailed descriptions of both the satellite imagery (including quality and confidence criteria) and the in situ sounding data (vertical reference, confidence level, and origin). Furthermore, we added difference maps to illustrate spatial patterns of accuracy and misfit, which facilitates a more intuitive interpretation of the results.
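Both diagnostic views can be produced along these lines (a sketch with illustrative names; the published figures remain the authoritative versions):

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_diagnostics(z_true, z_pred, grid_a, grid_b):
    """Left: estimated vs. in situ depths with the 1:1 line.
    Right: difference map between two gridded SDB products
    (grid_a and grid_b are 2-D arrays on the same grid)."""
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.scatter(z_true, z_pred, s=8, alpha=0.5)
    lim = [0, max(z_true.max(), z_pred.max())]
    ax1.plot(lim, lim, "k--", lw=1)                  # 1:1 reference
    ax1.set(xlabel="In situ depth (m)", ylabel="Estimated depth (m)")
    im = ax2.imshow(grid_a - grid_b, cmap="RdBu_r")
    fig.colorbar(im, ax=ax2, label="Depth difference (m)")
    ax2.set_title("Difference map")
    fig.tight_layout()
    return fig
```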
- Bathymetric data are normally characterized using the International Hydrographic Organization Special Publication S44. Although all the S44 parameters are difficult to get when handling SDB, you should at least try to characterize areas where your method fits one of the orders (for SDB, Order 1 vertically is often sought).
We must acknowledge that we were not previously aware of the International Hydrographic Organization Special Publication S44. In this revised version, we have incorporated this standard into the analysis by characterizing the confidence level of the data used against S44 criteria and providing a comparison of how our data align with the expected vertical accuracy for SDB applications.
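The S-44 check can be summarised as follows (a sketch: the a and b values for Order 1a/1b are taken from edition 6 of the standard, and comparing each absolute error to the depth-dependent limit is a simplification, since S-44 defines the limit at the 95% confidence level):

```python
import numpy as np

def s44_tvu(depth, a=0.5, b=0.013):
    """Maximum allowable Total Vertical Uncertainty, IHO S-44 ed. 6.
    a=0.5 m, b=0.013 correspond to Order 1a/1b; Special Order would
    use a=0.25, b=0.0075."""
    return np.sqrt(a**2 + (b * depth)**2)

def fraction_within_order1(z_true, z_pred):
    """Share of validation soundings whose absolute error lies within
    the depth-dependent Order 1 envelope (simplified point-wise test)."""
    return np.mean(np.abs(z_pred - z_true) <= s44_tvu(z_true))
```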
- The preprocessing phase is poorly described when it comes to the averaging process. If you want to average multiple images (which is a valuable strategy), you have to be very cautious with the level of corrections applied (atmospheric mostly in your case, as you deal with log ratios of bottom-of-atmosphere reflectance) so that the seafloor reflectance values are not contaminated with residual effects. ACOLITE is mentioned in the title of your paper, but little insight is given into the way you have used it.
The revised manuscript provides a clearer description of the preprocessing steps. Although ACOLITE is still mentioned, we reduced its centrality as suggested, since our use of this tool is not novel. Instead, we focus on explaining the quality assessment of the satellite data and justifying why image averaging was applied under our specific conditions. We explicitly discuss the caution exercised in handling atmospheric corrections so that bottom reflectance values were not contaminated by residual effects.
- The ANN seems not really useful to your argumentation: little is known about the way you have conceptualized, implemented, and trained/validated it (note that you should also have a strategy here to sample your datasets). The reader is a bit puzzled that the metrics indicate good performance while your qualitative description discards it.
In the previous version, the description of the ANN experiment was not sufficiently explicit. We have now clarified that the dataset was split into training and validation sets, and we also explain the rationale for including a qualitative description of its performance. Although we recognize that the ANN is not central to the proposed semiempirical approach, we decided to retain it because it provides a useful benchmark: it highlights both the potential of purely empirical methods (in terms of high statistical fit) and their weaknesses (such as poor generalization, overfitting, and reliance on large datasets). This comparison strengthens the argument for semiempirical methods.
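Conceptually, the benchmark looks like the sketch below; it is purely illustrative, as the layer sizes, activation, and split ratio are placeholders rather than the architecture reported in the manuscript:

```python
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def fit_ann_benchmark(X, z, seed=0):
    """Small MLP baseline with a held-out validation split."""
    X_tr, X_va, z_tr, z_va = train_test_split(
        X, z, test_size=0.25, random_state=seed)
    ann = make_pipeline(
        StandardScaler(),                       # ANNs need scaled inputs
        MLPRegressor(hidden_layer_sizes=(32, 16), activation="relu",
                     max_iter=5000, random_state=seed))
    ann.fit(X_tr, z_tr)
    return ann, ann.score(X_va, z_va)           # R^2 on held-out points
```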
- The style of the paper: there is a fair amount of redundant information, as the workflow is split between the methodology and the results, while new elements (the ANN) appear later in the paper that should be introduced earlier. This makes the reading a bit confusing. Also, I suggest that you provide a discussion section that criticizes the benefits/limits of your method (Section 4.4 might be an attempt to do so). I suggest following the classical path of Materials & Methods, Results, Discussion, Conclusion.
We followed your valuable suggestion to restructure the paper according to the classical structure: Introduction, Materials and Methods, Results, Discussion, and Conclusion. This reorganization reduces redundancy, avoids introducing new elements too late in the manuscript, and provides a clearer narrative flow. The revised discussion section explicitly addresses both the benefits and the limitations of the method.
- Figures should be bigger and should provide more details (bear in mind that, following the abstract, readers often jump to the figures before starting to read the paper).
We have improved the quality, resolution, and size of all figures to ensure better readability. Additional figures have also been included to enhance the visual interpretation of the results. In particular, we made sure that the figures are sufficiently detailed to serve as a stand-alone entry point for readers who, as you noted, often review figures immediately after reading the abstract.
Regarding the numerous specific comments on the text, we would like to note that most of them referred to sentences or sections that are no longer present, as the manuscript has been extensively rewritten and reorganized. Several others are directly addressed within our responses to the major comments. Concerning the remaining points, we have taken care to avoid ambiguous expressions such as “greater accuracy”; we have introduced information related to CATZOC; added a correlation plot; incorporated the IHO S-44 standard; described the data partitioning strategy with the k-fold approach; removed ACOLITE from the title to focus exclusively on the bathymetric model; provided more detail on data sources, image averaging, and quality control of soundings; explicitly stated the parameters of the neural network; moved the data to the Data Availability section; and incorporated additional recommendations from the other reviewers, among other improvements such as the quality of English.
We sincerely thank the reviewer once again for the constructive comments, which have helped us significantly improve the clarity, rigor, and overall quality of the manuscript.
We hope that the changes made adequately address all concerns raised, and we respectfully submit the revised manuscript for your consideration.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
The authors proposed a two-step regression framework for satellite-derived bathymetry and have done sufficient work to demonstrate the validity of the method. The paper is interesting, but the current version does not meet the standards for publication in an academic journal, primarily due to significant issues with its formatting and presentation. The comments and suggestions are listed below.
Major comments:
(1) The paper lacks professionalism, both in its irregular formatting and in its over-inclusion of basic, fundamental knowledge, for example in lines 167-204, line 316, line 322, lines 364-370, Section 4, etc. The current version more closely resembles an engineering report than an academic paper, so the authors are encouraged to seek assistance from experts to refine the manuscript's formatting and adherence to academic conventions. Moreover, in Subsections 3.1-3.3 there is much irrelevant information; the authors should merge these subsections and remove the redundant information.
(2) The authors devote significant space to the description of Lyzenga’s method, Stumpf’s method, and Hashim’s method, so I strongly suggest that the authors compare their proposed method with these three methods, rather than only with the ANN method.
(3) I cannot find anything new or special in the discussion part. Perhaps the authors could discuss why the proposed method is superior to the traditional analytical models or the ANN model, especially in areas where less training data were acquired.
Minor comments:
(1) In lines 88-113, many citations employ both the (Year) and superscript numbering formats simultaneously. This dual citation style is redundant and should be avoided.
(2) In lines 139-140, why is there a dedicated section introducing the log-ratio of reflectivity over the exponent?
(3) In lines 313-314, “raw data to DN” should be “DN to remote sensing reflectance at the top of the atmosphere”.
(4) In subsection 3.4.2, the authors mentioned that mean stacking was applied to multiple remote sensing images to improve the signal-to-noise ratio. In this process, did the authors account for the adverse effects of clouds and shadows, and/or waves or ripples? Were these contaminated areas masked or interpolated prior to performing the averaging?
(5) In Figures 6-8, it is inappropriate to draw a conclusion by just comparing SDB results with an RGB composite image. For example, I cannot find a 10-meter overestimation by the ANN in Fig. 8. The authors should compare SDB results with DDM, or use a scatter plot of predicted depth vs. the actual depth, to better represent their findings. The same applies to Fig. 9.
(6) Subsection 4.5 should be moved to the Data Availability Statement section.
Author Response
Dear Reviewer,
We sincerely thank the reviewer for the careful reading of our manuscript and for the constructive comments and suggestions. We have carefully revised the paper according to the feedback, and we believe these changes have substantially improved the quality and clarity of the work. Below, we provide a point-by-point response to each comment.
In addition to addressing the explicit recommendations, it is important to note that the focus and title of the manuscript have been revised to emphasize the contributions rather than presenting a complete methodology, since some of the strategies in the originally proposed workflow were not novel. The article now centers on the proposed semiempirical model. The structure has also been adapted to the classical format of Introduction, Materials and Methods, Results, Discussion, and Conclusion. Furthermore, more robust statistical comparisons have been included, along with an extensive revision of the language and overall writing.
Major comments
(1) The paper lacks professionalism, both in its irregular formatting and in its over-inclusion of basic, fundamental knowledge, for example in lines 167-204, line 316, line 322, lines 364-370, Section 4, etc. The current version more closely resembles an engineering report than an academic paper, so the authors are encouraged to seek assistance from experts to refine the manuscript's formatting and adherence to academic conventions. Moreover, in Subsections 3.1-3.3 there is much irrelevant information; the authors should merge these subsections and remove the redundant information.
Response: We agree with this valuable observation. The manuscript has been completely restructured to follow the standard academic format (Introduction, Materials and Methods, Results, Discussion, and Conclusions). Redundant and overly basic information was removed, and subsections 3.1–3.3 were merged and rewritten more concisely. Furthermore, we shifted the focus of the article to highlight the principal contribution of this work: the proposed two-stage semiempirical model.
(2) The authors devote significant space to the description of Lyzenga’s method, Stumpf’s method, and Hashim’s method, so I strongly suggest that the authors compare their proposed method with these three methods, rather than only with the ANN method.
Response: We appreciate this important clarification. In the previous version, our method was already compared with Lyzenga, Stumpf, and Hashim, but as the reviewers pointed out, the presentation was confusing. In the revised manuscript, the comparisons are now explicitly presented and much more clearly organized, ensuring that the performance of the proposed method can be directly contrasted with these traditional approaches, in addition to ANN.
(3) I cannot find anything new or special in the discussion part. Perhaps the authors could discuss why the proposed method is superior to the traditional analytical models or the ANN model, especially in areas where less training data were acquired.
Response: Thank you for this valuable suggestion. We have restructured the entire manuscript to emphasize the main contribution of the work. In particular, the Discussion and Conclusions sections were expanded to explicitly address why the proposed two-stage semiempirical model outperforms traditional approaches in our case studies and why it offers advantages over ANN in data-scarce environments. These sections now provide a clearer interpretation of our findings and their implications.
Minor comments
(1) In lines 88-113, many citations employ both the (Year) and superscript numbering formats simultaneously. This dual citation style is redundant and should be avoided.
Response: Corrected. All in-text citations have been homogenized to use only the numbered citation style, avoiding redundancy.
(2) In lines 139-140, why is there a dedicated section introducing the log-ratio of reflectivity over the exponent?
Response: This section has been removed, and the content was integrated into the methodological description to avoid redundancy.
(3) In lines 313-314, “raw data to DN” should be “DN to remote sensing reflectance at the top of the atmosphere”.
Response: We thank the reviewer for pointing out this mistake. The terminology has been revised throughout the manuscript, and “raw data” has been replaced with more technically accurate wording.
(4) In subsection 3.4.2, the authors mentioned that mean stacking was applied to multiple remote sensing images to improve the signal-to-noise ratio. In this process, did the authors account for the adverse effects of clouds and shadows, and/or waves or ripples? Were these contaminated areas masked or interpolated prior to performing the averaging?
Response: We appreciate this important question. In the revised version, we added a paragraph in Section 2.2 (Satellite Data and Preprocessing) describing in more detail how image selection and filtering were carried out. Specifically, images contaminated by clouds, shadows, or strong surface disturbances were excluded from the averaging process, ensuring that only valid, representative, high-quality pixels were included.
(5) In Figures 6-8, it is inappropriate to draw a conclusion by just comparing SDB results with an RGB composite image. For example, I cannot find a 10-meter overestimation by the ANN in Fig. 8. The authors should compare SDB results with DDM, or use a scatter plot of predicted depth vs. the actual depth, to better represent their findings. The same applies to Fig. 9.
Response: We understand the reviewer’s concern and acknowledge the limitation. Quantitative comparisons were performed wherever in situ reference depth data were available, and in those cases scatter plots and statistical metrics (RMSE, MAE, R², Bias) are presented in the manuscript. In the revised version, we have also added explicit scatter plots of predicted depth versus actual depth, together with several additional numerical and statistical comparisons, to strengthen the quantitative evaluation. However, for regions without ground-truth data, only qualitative assessments were possible. While we agree that these qualitative comparisons are less rigorous, we believe they still provide useful insights—for example, highlighting unrealistic deep channels or shoreline changes that clearly contradict reality. Importantly, these qualitative analyses do not replace the quantitative evaluation, which remains the primary basis of our conclusions. We clarified this distinction in the revised manuscript to avoid misinterpretation.
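For reference, the four reported metrics reduce to the following (a minimal sketch; Bias is taken as predicted minus observed):

```python
import numpy as np

def sdb_metrics(z_true, z_pred):
    """RMSE, MAE, Bias, and R^2 for a set of validation soundings."""
    err = z_pred - z_true
    ss_res = np.sum(err**2)
    ss_tot = np.sum((z_true - z_true.mean())**2)
    return {"RMSE": float(np.sqrt(np.mean(err**2))),
            "MAE":  float(np.mean(np.abs(err))),
            "Bias": float(np.mean(err)),
            "R2":   float(1 - ss_res / ss_tot)}
```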
(6) Subsection 4.5 should be moved to the Data Availability Statement section.
Response: This change has been implemented as suggested.
We sincerely thank the reviewer once again for the constructive comments, which have helped us significantly improve the clarity, rigor, and overall quality of the manuscript.
We hope that the changes made adequately address all concerns raised, and we respectfully submit the revised manuscript for your consideration.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The authors have satisfactorily addressed all comments in my original review. This is a useful paper that is also well-written and well-structured.
Author Response
Dear Reviewer,
Thank you very much for your comments, which were very helpful in improving our work and meeting Geomatics' high standards.
Sincerely,
Dr. Leonardo Tenorio Fernández
(on behalf of all co-authors)
Reviewer 3 Report
Comments and Suggestions for Authors
Dear Editor,
I am satisfied with the authors' responses and I have no more comments.
Author Response
Dear Reviewer,
Thank you very much for your comments, which were very helpful in improving our work and meeting Geomatics' high standards.
Sincerely,
Dr. Leonardo Tenorio Fernández
(on behalf of all co-authors)

