Article
Peer-Review Record

Generalized Methodology for Two-Dimensional Flood Depth Prediction Using ML-Based Models

Hydrology 2025, 12(9), 223; https://doi.org/10.3390/hydrology12090223
by Mohamed Soliman 1,2,*, Mohamed M. Morsy 2 and Hany G. Radwan 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 15 July 2025 / Revised: 20 August 2025 / Accepted: 21 August 2025 / Published: 24 August 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Summary
The manuscript proposes a machine learning-based methodology to rapidly predict two-dimensional flood depth, testing multiple ML models and combining the best-performing approach with unsupervised clustering to improve generalization across global catchments. This study addresses an important challenge in hydrology: fast and accurate flood depth mapping for early warning systems. The methodology is innovative and potentially impactful for operational flood management and global applications. However, the paper is long and somewhat scattered, with methodological and results sections that are very detailed but not always clearly organized. Several aspects require clarification, reorganization, and further discussion to strengthen the scientific contribution and reproducibility. 

General Comments

  1. Introduction

The introduction includes an extensive literature review, but the key information about the research gap and the specific objective is diluted among many citations. A concise statement of the problem, the gap, and the novel contribution should appear earlier.

  2. Materials and Methods
  • The section describing Regression Model Trials 1, 2, and 3 is repetitive and overly detailed. Consider consolidating the results into a single summary table and comparative plots to improve readability and highlight the key differences between the models.
  • Very long tables (e.g., the list of 45 catchments with coordinates) interrupt the flow and are better suited for appendices or supplementary material.
  • Some statistical analyses (e.g., skewness, kurtosis, K-S test) could be moved to an appendix.
  • The K-means clustering step is central to the approach, but the number of clusters (K) is not fully justified. A sensitivity analysis of K would strengthen the model’s generalization claims.
  3. Results and Discussion
  • It is unclear whether the HEC-RAS flood maps used to train and validate the machine learning models were themselves validated against real flood observations (e.g., gauged data, or satellite imagery). If they were not, the reported predictive accuracy primarily reflects the model’s ability to reproduce HEC-RAS outputs rather than real-world flood behavior. This point should be clarified and explicitly discussed as a potential limitation.
  • While six ML models are tested, the manuscript provides limited analysis of why Random Forest performs best. A discussion of overfitting, computational cost, and model selection rationale would add rigor.

Specific Comments

  1. Key figures (e.g., global basin maps, feature importance, elbow method) should have self-explanatory captions.
  2. Global catchment maps (Figures 2 and 5) could include a scale bar and country names for context.
  3. Minor grammatical and typographical corrections are needed (e.g., “confusion within the model” vs “model confusion”; ensure consistent spelling of “real-time applicability”).
  4. Ensure consistent use of units (e.g., meters vs m).
  5. Include details on software versions (e.g., Scikit-learn, TensorFlow) in the Methods section to ensure reproducibility.

Author Response

For research article

Generalized Methodology for Two-dimensional Flood Depth Prediction Using ML-Based Models

Response to Reviewer 1 Comments

 

Open Review

(x) I would not like to sign my review report
( ) I would like to sign my review report

Quality of English Language

(x) The English could be improved to more clearly express the research.
( ) The English is fine and does not require any improvement.

 

 

 

Ratings (scale: Yes / Can be improved / Must be improved / Not applicable)

  • Does the introduction provide sufficient background and include all relevant references? Can be improved
  • Is the research design appropriate? Yes
  • Are the methods adequately described? Can be improved
  • Are the results clearly presented? Can be improved
  • Are the conclusions supported by the results? Can be improved
  • Are all figures and tables clear and well-presented? Can be improved

Comments and Suggestions for Authors

 

Summary
The manuscript proposes a machine learning-based methodology to rapidly predict two-dimensional flood depth, testing multiple ML models and combining the best-performing approach with unsupervised clustering to improve generalization across global catchments. This study addresses an important challenge in hydrology: fast and accurate flood depth mapping for early warning systems. The methodology is innovative and potentially impactful for operational flood management and global applications. However, the paper is long and somewhat scattered, with methodological and results sections that are very detailed but not always clearly organized. Several aspects require clarification, reorganization, and further discussion to strengthen the scientific contribution and reproducibility. 

The authors wish to thank the reviewers for the constructive and encouraging comments. In the revised manuscript, we have incorporated enhancements to both the analysis and the presentation, ensuring that these revisions align closely with the reviewer’s comments and suggestions.

 

General Comments

 

  1. Introduction

The introduction includes an extensive literature review, but the key information about the research gap and the specific objective is diluted among many citations. A concise statement of the problem, the gap, and the novel contribution should appear earlier.

Response 1:

We understand the importance of clearly highlighting the research gap and the main objective early in the introduction to improve readability and focus.

Action:

We have revised the introduction to:

  • Move the statements presenting the research gap and study objective to the beginning of the section.
  • Add a concise summary of our novel contribution.
  • Streamline some citations to reduce redundancy and improve clarity.

These changes are now reflected in Section 1, Paragraphs 2, 3, and 4 of the revised manuscript.

  • The last paragraph has also been enhanced to present the research gap and study objective clearly.

 

  2. Materials and Methods
  • The section describing Regression Model Trials 1, 2, and 3 is repetitive and overly detailed. Consider consolidating the results into a single summary table and comparative plots to improve readability and highlight the key differences between the models.

Response 2:

We agree that clarity and readability are essential. To address this, we have added a new section and a summary table to clearly present the key differences among the three regression model trials. However, as this section represents the core contribution of the study, we have retained a detailed narrative to ensure reproducibility and reusability of the workflow, particularly given the varying combinations of models, hyperparameters, and validation logic applied in each trial. Excessive simplification could reduce the level of detail needed for accurate replication by future researchers or practitioners. We have therefore organized the descriptions more clearly, removed redundancies, and maintained sufficient technical detail for full transparency. (See section 3.6.4. Summary of Regression Model Improvement Path)

 

  • Very long tables (e.g., the list of 45 catchments with coordinates) interrupt the flow and are better suited for appendices or supplementary material.

Response 3:

We agree that long tables may disrupt the readability of the main text. Therefore, we have moved the table listing the 45 catchments with coordinates to the Appendix to maintain a smooth narrative flow. A reference to the table's new location has been added in the relevant section of the main text for easy access by interested readers.

 

  • Some statistical analyses (e.g., skewness, kurtosis, K-S test) could be moved to an appendix.

Response 4:

 

We have moved the section and relevant tables containing the statistical analyses (e.g., skewness, kurtosis, K-S test) to the Appendix to maintain a smooth narrative flow in the main text.

 

  • The K-means clustering step is central to the approach, but the number of clusters (K) is not fully justified. A sensitivity analysis of K would strengthen the model’s generalization claims.

Response 5:

 

While we initially relied on the well-established Elbow and Silhouette score methods to justify the number of clusters (K), we have expanded this section to include a sensitivity analysis. The newly added table presents the effect of varying K on the models’ training and testing performance. The results show that increasing the number of clusters improves training accuracy but reduces testing and validation performance for unseen catchments, indicating potential overfitting. Based on this analysis, the selected K provides an optimal balance between model performance and generalization capability. (See Table 15: Sensitivity analysis of the number of clusters (K) using silhouette score, R², and RMSE for training and testing. - Trial 03, and the commentary.)
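The kind of sweep over K described in this response can be sketched as follows. This is a minimal illustration, not the authors' code: the two-feature `catchments` points are invented, the k-means routine is a plain standard-library implementation with a deterministic start, and only the within-cluster sum of squares (the Elbow quantity) and the mean silhouette score are computed. The training and testing R² and RMSE columns of the sensitivity table would need the regression models themselves. In practice, `sklearn.cluster.KMeans` and `sklearn.metrics.silhouette_score` provide the same quantities.

```python
import math

def assign(points, centers):
    """Index of the nearest centre for every point."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]

def kmeans(points, k, iters=100):
    """Lloyd's algorithm with a deterministic start (evenly spaced sorted points)."""
    ordered = sorted(points)
    n = len(ordered)
    centers = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iters):
        labels = assign(points, centers)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centers[c])  # keep the old centre if a cluster is empty
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)

def inertia(points, centers, labels):
    """Within-cluster sum of squared distances (the Elbow-method quantity)."""
    return sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))

def silhouette(points, labels):
    """Mean silhouette score; points in singleton clusters score 0."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                                  # mean intra-cluster distance
        b = min(sum(d) / len(d) for d in by_cluster.values())    # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical, already-scaled descriptors for 12 catchments (three obvious groups).
catchments = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
              (5.0, 5.1), (5.2, 4.9), (4.9, 5.3), (5.3, 5.2),
              (10.0, 0.1), (10.2, 0.0), (9.9, 0.3), (10.3, 0.2)]

results = {}
for k in range(1, 6):
    centers, labels = kmeans(catchments, k)
    results[k] = (inertia(catchments, centers, labels), silhouette(catchments, labels))

best_k = max(range(2, 6), key=lambda k: results[k][1])
for k, (sse, sil) in results.items():
    print(f"K={k}: inertia={sse:.3f}, silhouette={sil:.3f}")
print("K with the highest silhouette:", best_k)
```

On this toy data the silhouette peaks at the number of groups that were built in, while the inertia keeps shrinking as K grows. This is why inertia alone cannot select K, and why a held-out performance check like the one the authors describe is a useful complement.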

 

  3. Results and Discussion
  • It is unclear whether the HEC-RAS flood maps used to train and validate the machine learning models were themselves validated against real flood observations (e.g., gauged data, or satellite imagery). If they were not, the reported predictive accuracy primarily reflects the model’s ability to reproduce HEC-RAS outputs rather than real-world flood behavior. This point should be clarified and explicitly discussed as a potential limitation.

 

Response 6:

Our study is built upon well-calibrated baseline scenarios, as established in our previously published research (Soliman et al., 2022). In that study, Land Use/Land Cover (LULC) maps were calibrated to minimize errors and ensure reliable HEC-RAS flood simulations. The main contribution of the present research is to replace the hydrodynamic models with a machine learning surrogate that predicts flood depth in significantly reduced computational time, achieving a speed-up of more than 225 times. (Last paragraph in the discussion part)

 

As recommended, we have also added a statement to the Limitations section emphasizing that the machine learning models are trained using HEC-RAS outputs. Therefore, their predictive accuracy may not fully represent real-world flood behavior. Future work will incorporate validation using observed flood data or satellite-derived flood extents to improve model robustness.

 

  • While six ML models are tested, the manuscript provides limited analysis of why Random Forest performs best. A discussion of overfitting, computational cost, and model selection rationale would add rigor.

 

Response 7:

We have expanded the Discussion section to provide a detailed explanation of why Random Forest (RF) was selected as the best-performing model. Specifically, RF demonstrated a moderate gap between training (R² = 0.913) and testing (R² = 0.690) results, indicating good generalization while minimizing overfitting. We also compared RF’s computational cost and performance with other models, including XGBoost and deep learning approaches, to justify our selection. Additionally, we outlined planned future work, such as further hyperparameter optimization and additional trials, to enhance RF’s predictive accuracy and generalization capability.
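For readers who want to reproduce the comparison, the coefficient of determination and the train/test gap quoted above can be computed as in the sketch below. The two R² values are taken from this response. The small `y_true`/`y_pred` lists are made-up numbers used only to exercise the function.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Illustrative values only (not from the manuscript).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
print(f"R2 on the toy data: {r2_score(y_true, y_pred):.2f}")

# Values reported for Random Forest in the response.
train_r2, test_r2 = 0.913, 0.690
gap = train_r2 - test_r2
print(f"train/test R2 gap: {gap:.3f}")  # prints 0.223
```

A smaller gap for a comparable test score is the usual sign of better generalization, so the gap is most informative when reported next to the same figure for the competing models.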

 

Specific Comments

  1. Key figures (e.g., global basin maps, feature importance, elbow method) should have self-explanatory captions.

Response 8: We have revised the captions for all key figures, including the global basin map, feature importance plot, and elbow method visualization, to ensure they are self-explanatory.

  2. Global catchment maps (Figures 2 and 5) could include a scale bar and country names for context.

Response 9: For Figures 2 and 5, we have added a scale bar and country labels to provide clearer geographic context and aid reader interpretation.

For Figure 2:

As the catchments are distributed across the globe, we have included continental names to enhance interpretability. We welcome any further suggestions to improve clarity and visual presentation. 

 

  3. Minor grammatical and typographical corrections are needed (e.g., “confusion within the model” vs “model confusion”; ensure consistent spelling of “real-time applicability”).

Response 10:

We have revised the manuscript to improve grammar, ensure consistent terminology, and correct typographical errors. Specifically, “confusion within the model” has been replaced with “model confusion” for clarity, and all instances of “real-time applicability” have been updated to “near-real-time flood forecasting” for greater precision. In addition, we conducted a thorough proofreading to address other minor grammatical and typographical issues throughout the manuscript.

  4. Ensure consistent use of units (e.g., meters vs m).

Response 11:

We have ensured the consistent use of units throughout the manuscript, including standardizing measurements (e.g., using “m” for meters) in both the text and tables.

  5. Include details on software versions (e.g., Scikit-learn, TensorFlow) in the Methods section to ensure reproducibility.

Response 12:

We agree that specifying the software versions is essential for reproducibility. The Methods section has been updated to include the Python version and the exact versions of the libraries used in our experiments, including Scikit-learn, TensorFlow, and other relevant packages. (See section Stage 6: Flood Depth Regression Modeling Approaches.)

 

Submission Date

15 July 2025

Date of this review

02 Aug 2025 08:11:46

comments reply: 08-09-2025

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The article proposes a generalised methodology for two-dimensional flood depth prediction using ML-based models. The topic is interesting, and the article offers valuable insights into the data-driven modelling approaches. There are a few points that need to be addressed: 

1- Introduction: It is a bit lengthy, and it would be great if it could be enhanced by trimming. Besides, the research gaps that the study wants to fill are not explicitly discussed. It is suggested to improve the last paragraph of the introduction to cover that. Also, the novelty of the work needs to be explicitly highlighted there as well. 

2- The complete name of ANFIS is Adaptive Network-based Fuzzy Inference System, based on the original article by Jang (1993) introducing ANFIS. Please correct it. 
Jang, J.-S.R., 1993. ANFIS: adaptive-network-based fuzzy inference system. IEEE Trans. Syst., Man Cybernet. 23 (3), 665–685.

3- Table 1: Manning's coefficient has a unit. Please add the unit to the Table caption. 

4- Section 2.2.6: Justifications are needed on why ANN, CNN, RNN, LSTM, Random Forest, and XGBoost are selected. As a suggestion, you could discuss selecting 1-2 models from different families of ML-based techniques due to their unique capabilities. 

5- Tables 3 & 4: The authors considered Manning's roughness as a dimensionless parameter, which is not correct. The SI unit of this parameter is s/m^(1/3). Please revise Table 4.

6- Table 6 is a bit confusing for the readers. Perhaps it could be improved by using a better naming system. 

7- Table 9: RMSE has a unit. It will hold the unit of the parameter it assesses. For example, if you use it for error in predicting depth, its unit will be in depth (e.g., m, mm, etc.)

8- Section 4: Discussion: 

Please add a few lines explaining the study's limitations. Also, it would be great if your discussion could address the following questions:

Does the rainfall event characteristic (duration, total rain, etc.) have an impact on the model's performance?

Did you find any correlation between catchment land use and model performance? In other words, was there any specific land use that led to better or worse performance? 

 

Author Response

For research article

Generalized Methodology for Two-dimensional Flood Depth Prediction Using ML-Based Models

Response to Reviewer 2 Comments

 

 

Open Review

(x) I would not like to sign my review report
( ) I would like to sign my review report

Quality of English Language

( ) The English could be improved to more clearly express the research.
(x) The English is fine and does not require any improvement.

 

 

 

Ratings (scale: Yes / Can be improved / Must be improved / Not applicable)

  • Does the introduction provide sufficient background and include all relevant references? Can be improved
  • Is the research design appropriate? Can be improved
  • Are the methods adequately described? Can be improved
  • Are the results clearly presented? Can be improved
  • Are the conclusions supported by the results? Can be improved
  • Are all figures and tables clear and well-presented? Can be improved

Comments and Suggestions for Authors

The article proposes a generalised methodology for two-dimensional flood depth prediction using ML-based models. The topic is interesting, and the article offers valuable insights into the data-driven modelling approaches. There are a few points that need to be addressed: 

We thank the reviewer for the constructive and encouraging comments. In the revised manuscript, we have strengthened both the analyses and the presentation, aligning the changes closely with the reviewer’s suggestions throughout.

1- Introduction: It is a bit lengthy, and it would be great if it could be enhanced by trimming. Besides, the research gaps that the study wants to fill are not explicitly discussed. It is suggested to improve the last paragraph of the introduction to cover that. Also, the novelty of the work needs to be explicitly highlighted there as well. 

Response 1:

We have revised the Introduction section by trimming unnecessary details to improve clarity and focus. In particular, the final paragraph has been rewritten to explicitly identify the research gap, state the study objectives, and highlight the novelty of the proposed approach. These changes clarify how our method differs from existing work and emphasize its value for near-real-time flood depth prediction using machine learning. The revised paragraph also integrates a comparison with HEC-RAS 2D, highlighting the significant computational speed advantage and the operational relevance for ungauged and flood-prone areas. (See last paragraph Introduction part)

 

2- The complete name of ANFIS is Adaptive Network-based Fuzzy Inference System, based on the original article by Jang (1993) introducing ANFIS. Please correct it. 
Jang, J.-S.R., 1993. ANFIS: adaptive-network-based fuzzy inference system. IEEE Trans. Syst., Man Cybernet. 23 (3), 665–685.

Response 2:

We have updated the manuscript to use the full name “Adaptive Network-based Fuzzy Inference System (ANFIS)” in accordance with the original reference by Jang (1993). We also added the original reference where ANFIS is first introduced: Jang, J.-S.R. (1993). ANFIS: Adaptive-network-based fuzzy inference system. IEEE Transactions on Systems, Man, and Cybernetics, 23(3), 665–685.

3- Table 1: Manning's coefficient has a unit. Please add the unit to the Table caption. 

Response 3:

We have added the unit for Manning’s roughness to the Table 1 caption and standardized the corresponding column header. We also checked the manuscript to ensure consistent use of this unit wherever Manning’s coefficient appears.

4- Section 2.2.6: Justifications are needed on why ANN, CNN, RNN, LSTM, Random Forest, and XGBoost are selected. As a suggestion, you could discuss selecting 1-2 models from different families of ML-based techniques due to their unique capabilities.

Response 5:

 We have clarified the rationale for selecting the six models and organized them by family to emphasize complementary capabilities. Two tree-based ensemble methods, Random Forest and XGBoost, were included for their robustness to nonlinearity, strong performance on tabular predictors, moderate hyperparameter sensitivity, and interpretability via feature importance. Four neural network architectures were chosen to address different learning needs: (i) ANN for general nonlinear relationships among aggregated/tabular inputs; (ii) CNN to leverage spatial context in gridded/geospatial predictors; (iii) RNN and (iv) LSTM to capture dependencies in varying inputs. This addresses the suggestion to select 1–2 models per family while ensuring both spatial and temporal dependencies are represented. We also report training time and key hyperparameters to make the comparison transparent. (See Section 2.2.6.)

5- Tables 3 & 4: The authors considered Manning's roughness as a dimensionless parameter, which is not correct. The SI unit of this parameter is s/m^(1/3). Please revise Table 4.

Response 6:

We have updated Tables 3 and 4 to include the unit of Manning’s roughness coefficient.

6- Table 6 is a bit confusing for the readers. Perhaps it could be improved by using a better naming system. 

Response 7:

We have improved Table 6 by adopting a clearer naming convention and explicitly labeling the outcomes as true and false predictions (True Positive, False Positive, False Negative, and True Negative), which is common in similar confusion matrices. We also separated the results into two subtables (a- all samples and b- testing samples) to enhance readability and interpretation. These changes make the confusion matrix more self-explanatory and improve the overall clarity of the results.
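The four outcome labels can be illustrated with a short sketch. It assumes, purely for illustration, that the confusion matrix compares flooded and non-flooded cells and that a cell counts as flooded above a hypothetical 0.1 m depth threshold. The response does not state the class definition or the threshold, and the depth values below are invented.

```python
def confusion_counts(reference_wet, predicted_wet):
    """Count TP, FP, FN and TN for paired boolean labels."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for ref, pred in zip(reference_wet, predicted_wet):
        if pred and ref:
            counts["TP"] += 1   # predicted flooded, reference flooded
        elif pred and not ref:
            counts["FP"] += 1   # predicted flooded, reference dry
        elif ref:
            counts["FN"] += 1   # predicted dry, reference flooded
        else:
            counts["TN"] += 1   # predicted dry, reference dry
    return counts

THRESHOLD_M = 0.1  # hypothetical wet/dry threshold in metres

# Invented depths (m): reference simulation vs. ML prediction for six cells.
reference_depth = [0.0, 0.5, 0.05, 1.2, 0.3, 0.0]
predicted_depth = [0.2, 0.4, 0.0, 0.9, 0.05, 0.0]

reference_wet = [d > THRESHOLD_M for d in reference_depth]
predicted_wet = [d > THRESHOLD_M for d in predicted_depth]

counts = confusion_counts(reference_wet, predicted_wet)
print(counts)  # {'TP': 2, 'FP': 1, 'FN': 1, 'TN': 2}
```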

7- Table 9: RMSE has a unit. It will hold the unit of the parameter it assesses. For example, if you use it for error in predicting depth, its unit will be in depth (e.g., m, mm, etc.)

Response 8:

We have added the appropriate unit to RMSE, reflecting the unit of the predicted variable (flood depth). Table 9 now reports RMSE (m) for both training and testing datasets, and we confirmed consistency with any other RMSE references in the manuscript.
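The reviewer's point about units follows directly from the definition of RMSE, as the small sketch below shows. The depth values are illustrative and not taken from the manuscript.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error; the result has the same unit as the inputs."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Illustrative flood depths in metres.
depth_true_m = [0.0, 0.5, 1.0, 2.0]
depth_pred_m = [0.1, 0.4, 1.2, 1.8]

rmse_m = rmse(depth_true_m, depth_pred_m)

# The same data expressed in millimetres gives an RMSE that is 1000 times larger.
rmse_mm = rmse([1000 * d for d in depth_true_m], [1000 * d for d in depth_pred_m])

print(f"RMSE = {rmse_m:.3f} m = {rmse_mm:.1f} mm")
```

Because the value scales with the unit of the target, an RMSE reported without a unit cannot be compared across studies.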

 

8- Section 4: Discussion: 

Please add a few lines explaining the study's limitations. Also, it would be great if your discussion could address the following questions:

Does the rainfall event characteristic (duration, total rain, etc.) have an impact on the model's performance?

Response 9:

We now include a concise Limitations subsection noting that (i) the models are trained on HEC-RAS simulation outputs, so accuracy reflects consistency with the baseline hydraulics rather than direct observation; (ii) performance depends on input data quality (e.g., DEM resolution and parameter maps), which may vary across regions; (iii) generalization across unseen catchments is influenced by the clustering scheme and sample coverage; and (iv) rainfall representation is simplified by the current synthetic hyetograph, which does not span the full range of event types. These points frame appropriate use and guide future extensions.

We quantified the effect of total rainfall depth in the Random Forest sensitivity analysis and report its contribution to predictive performance. In line with the limitation above, duration and within-storm temporal pattern were not varied in this study. We have clarified that future work will evaluate multiple real events, covering short-duration, high-intensity convective storms and longer stratiform events, to more fully characterize how event properties affect model skill and operational applicability. This statement has been added to the Limitations and referenced in the Discussion.

The impact of total rainfall has been addressed in the sensitivity analysis of the Random Forest model, where key hydrological parameters were evaluated. Additionally, we have added a statement in the Limitations section clarifying that future research should incorporate multiple rainfall events, including short-duration and real-time events, to improve the model’s applicability beyond the current synthetic hyetograph approximation.

 

Did you find any correlation between catchment land use and model performance? In other words, was there any specific land use that led to better or worse performance?

Response 10:

The relationship between land use and model performance was examined through the sensitivity analysis of geospatial parameters using the Random Forest (RF) model. RF was used to quantify the relative importance of each input parameter in predicting flood depth. The results (Figure 3) show that Distance to Stream (DTS) and SINK are the most influential parameters, with importance scores of 24% and 14.5%, respectively, while land-use-related parameters—represented by Manning’s n and Curve Number (CN)—were found to have the lowest importance scores (2.3% and 1.7%).

These results, presented in Section 3.4.1 (Parameter Sensitivity Analysis), indicate that while land use has some influence on flood depth prediction, it is considerably less significant compared to topographic parameters. This finding has been explicitly clarified in the revised manuscript. (Section 3.4.1. Parameter sensitivity analysis, 2nd paragraph)
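A ranking of this kind can be illustrated without a trained forest. The sketch below uses permutation importance, a different and model-agnostic technique from the Random Forest importance scores reported by the authors. It is applied to a hypothetical two-input `surrogate` function in which the first input (standing in for distance to stream) dominates and the second (standing in for Curve Number) contributes little. All names, coefficients and data are assumptions made for the example.

```python
import random

def surrogate(row):
    """Hypothetical stand-in for a trained regressor (not the authors' model)."""
    dts, cn = row
    return 2.0 * dts + 0.1 * cn

def mse(rows, targets, model):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(rows, targets, model, n_repeats=5, seed=0):
    """Mean increase in MSE when one input column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(rows, targets, model)
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            # Rebuild each row with only the chosen column replaced.
            permuted = [r[:col] + (v,) + r[col + 1:] for r, v in zip(rows, shuffled)]
            increases.append(mse(permuted, targets, model) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Invented inputs: both columns take the values 0..7, in different orders.
rows = [(float(i), float((3 * i) % 8)) for i in range(8)]
targets = [surrogate(r) for r in rows]

importances = permutation_importance(rows, targets, surrogate)
for name, value in zip(["DTS (hypothetical)", "CN (hypothetical)"], importances):
    print(f"{name}: {value:.3f}")
```

Shuffling an input the model relies on degrades the prediction much more than shuffling one it barely uses, which is the same intuition behind the low scores reported for Manning’s n and CN.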

 

 

Submission Date

15 July 2025

Date of this review

01 Aug 2025 04:33:4

 

Date of update

10 Aug 2025

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

The authors developed a machine learning-based global flood depth prediction model to address the low computational efficiency of traditional hydraulic models (e.g., HEC-RAS). They utilized data from 45 river basins worldwide, considering geographically diverse regions. The data underwent meticulous preprocessing, including outlier handling and normalization. The methodology section is described in detail and the references are quite comprehensive, citing recent studies including those from 2024. The research direction indeed holds significant practical value, as flood warnings require rapid response.

In my view, the main innovation of the paper lies in integrating publicly available data such as ALOS DEM and ESRI LULC to construct the first global river basin-oriented flood depth prediction framework. It achieves a breakthrough in computational efficiency, providing a technical groundwork for real-time flood warnings. Overall, it demonstrates a solid scientific foundation, possesses distinctive innovative features, and exhibits significant practical potential.

I think the paper can be considered for publication if the following questions and concerns are reasonably addressed.

1. During the validation phase, the CA_01 and CA_02 basins exhibited notably poorer performance. The authors attribute this to flat terrain but did not conduct an in-depth analysis of the specific underlying causes. The current basin clustering relies solely on statistical values (Section 3.6.3, Tables 17-19), overlooking the spatial structure of the topography (such as drainage density, slope distribution, etc.). This leaves an insufficient basis for the basin cluster grouping. It is recommended that the authors incorporate geomorphological indicators into the clustering features to enhance the physical meaningfulness of the groupings.

2. Although the dataset has global coverage, the number of samples from Africa is limited. This may adversely affect the model's generalization capability.

Author Response

For research article

Generalized Methodology for Two-dimensional Flood Depth Prediction Using ML-Based Models

Response to Reviewer 3 Comments

 

Open Review

( ) I would not like to sign my review report
(x) I would like to sign my review report

Quality of English Language

( ) The English could be improved to more clearly express the research.
(x) The English is fine and does not require any improvement.

 

 

 

Ratings (scale: Yes / Can be improved / Must be improved / Not applicable)

  • Does the introduction provide sufficient background and include all relevant references? Yes
  • Is the research design appropriate? Yes
  • Are the methods adequately described? Yes
  • Are the results clearly presented? Can be improved
  • Are the conclusions supported by the results? Yes
  • Are all figures and tables clear and well-presented? Yes

Comments and Suggestions for Authors

The authors developed a machine learning-based global flood depth prediction model to address the low computational efficiency of traditional hydraulic models (e.g., HEC-RAS). They utilized data from 45 river basins worldwide, considering geographically diverse regions. The data underwent meticulous preprocessing, including outlier handling and normalization. The methodology section is described in detail and the references are quite comprehensive, citing recent studies including those from 2024. The research direction indeed holds significant practical value, as flood warnings require rapid response.

In my view, the main innovation of the paper lies in integrating publicly available data such as ALOS DEM and ESRI LULC to construct the first global river basin-oriented flood depth prediction framework. It achieves a breakthrough in computational efficiency, providing a technical groundwork for real-time flood warnings. Overall, it demonstrates a solid scientific foundation, possesses distinctive innovative features, and exhibits significant practical potential.

We thank the reviewer for the constructive and encouraging comments. In the revised manuscript, we have strengthened both the analyses and the presentation, aligning the changes closely with the reviewer’s suggestions throughout.

I think the paper can be considered for publication if the following questions and concerns are reasonably addressed.

1. During the validation phase, the CA_01 and CA_02 basins exhibited notably poorer performance. The authors attribute this to flat terrain but did not conduct an in-depth analysis of the specific underlying causes.

Response 1:

Regarding the notably poorer performance of CA-01 and CA-02 during the validation phase: based on the descriptive statistics (Appendix A, Table A.3), CA-01 is characterized by low elevation variability (mean 873.96 m, std. 286.78 m), a relatively flat slope (mean 0.286 rad), and a high maximum sink depth (30.66 m). These features can impede model generalization, as identified in our sensitivity analysis, where sink depth and slope emerged as influential parameters.

In contrast, CA-02 exhibits a consistently steep slope (mean 1.569 rad, std. 0.044 rad) and a similarly high maximum sink depth (24.01 m). Such physiographic settings tend to produce rapid runoff and increase the model’s sensitivity to small errors in the input data, potentially reducing predictive accuracy in these cases.
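For readers more used to slopes in degrees, the two mean values quoted above convert as in the short sketch below. Only the unit conversion is shown; how the slope rasters were derived is not described in this response.

```python
import math

# Mean slopes in radians, as quoted in the response for the two catchments.
mean_slope_rad = {"CA-01": 0.286, "CA-02": 1.569}

mean_slope_deg = {name: math.degrees(value) for name, value in mean_slope_rad.items()}

for name, degrees in mean_slope_deg.items():
    # CA-01 is about 16.4 degrees; CA-02 is about 89.9 degrees.
    print(f"{name}: {degrees:.1f} degrees")
```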

These results are consistent with our earlier findings in the sensitivity analysis and now align with the observed model behavior. They indicate that catchments with either very flat terrain or very steep slopes, when combined with high sink depth, may pose challenges for model transferability and generalization.

We have added a statement in the limitations section to explicitly note that extreme physiographic characteristics, such as those seen in CA-01 and CA-02, may limit model performance and should be considered in future applications and model development.

(Appendix A, Table A. 3, discussion lines at section 3.6.3, Last statement in the limitations)

 

The current basin clustering relies solely on statistical values (Section 3.6.3, Tables 17-19), overlooking the spatial structure of the topography (such as drainage density, slope distribution, etc.). This results in insufficient basis for basin cluster grouping. It is recommended that the authors incorporate geomorphological indicators into the clustering features to enhance the physical meaningfulness of the groupings.

Response 2:

We address the reviewer's comment regarding the inclusion of geomorphological indicators in the basin clustering process. In the current study, the clustering relied on statistical descriptors (mean, median, and mode) of key topographic parameters, as detailed in Section 3.6.3 and Tables 17–19. This choice was motivated by the need to standardize clustering inputs across diverse global catchments and maintain computational efficiency for large-scale application.
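The descriptor-based grouping described in this paragraph can be sketched in a few lines. This is only an illustration under stated assumptions: the response does not name the clustering algorithm here, so plain k-means stands in for it, and the basin names, slope samples, two-decimal rounding before taking the mode, and the function names `descriptors`, `standardize`, and `kmeans` are all hypothetical.

```python
import math
import statistics


def descriptors(values):
    """Return (mean, median, mode) of one topographic parameter for one basin."""
    # Assumption: round continuous values so that the mode is meaningful.
    rounded = [round(v, 2) for v in values]
    return (statistics.mean(values), statistics.median(values),
            statistics.mode(rounded))


def standardize(rows):
    """Z-score each column so that no single descriptor dominates the distance."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k, iterations=50):
    """Plain Lloyd's k-means; the first k points serve as initial centers."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each basin joins its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [statistics.mean(c) for c in zip(*members)]
    return labels


# Hypothetical slope samples (radians) for four made-up basins.
basins = {
    "flat_a": [0.10, 0.12, 0.12, 0.15],
    "flat_b": [0.11, 0.13, 0.13, 0.14],
    "steep_a": [1.20, 1.25, 1.25, 1.30],
    "steep_b": [1.22, 1.24, 1.24, 1.31],
}
features = standardize([descriptors(v) for v in basins.values()])
labels = kmeans(features, k=2)
print(dict(zip(basins, labels)))
```

Standardizing before clustering matters because descriptors of different parameters (for example elevation in metres and slope in radians) sit on very different scales.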

We agree that incorporating explicit geomorphological indicators—such as drainage density, slope distribution, etc.—would likely enhance the physical interpretability and robustness of the cluster groupings. These indicators can capture spatial structure and topographic form more effectively, improving the model’s generalization to diverse physiographic conditions.
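As an illustration of the kind of indicators meant here, the sketch below computes drainage density using its standard definition (total channel length divided by basin area) and a simple slope-class distribution. The function names, slope class edges, and sample numbers are assumptions for illustration, not values from the manuscript.

```python
def drainage_density(stream_lengths_km, basin_area_km2):
    """Total channel length divided by basin area, in km per square km."""
    if basin_area_km2 <= 0:
        raise ValueError("basin area must be positive")
    return sum(stream_lengths_km) / basin_area_km2


def slope_distribution(slopes_rad, edges=(0.1, 0.3, 0.6)):
    """Fraction of cells in each slope class bounded by `edges` (radians)."""
    if not slopes_rad:
        raise ValueError("at least one slope value is required")
    counts = [0] * (len(edges) + 1)
    for s in slopes_rad:
        # The class index equals the number of edges the value reaches.
        counts[sum(s >= e for e in edges)] += 1
    return [c / len(slopes_rad) for c in counts]


# Hypothetical basin: three channel segments in a 50 square km catchment.
density = drainage_density([12.0, 8.5, 4.5], 50.0)
classes = slope_distribution([0.05, 0.2, 0.2, 0.7])
print(density, classes)
```

Values like these could be appended to each basin's feature vector alongside the statistical descriptors before clustering.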

To address this, we have added a statement in the Limitations subsection under the Discussions noting that future work will extend the clustering feature set to include such geomorphological metrics, thereby strengthening the physical basis for basin grouping and improving predictive performance in varied terrain contexts. (See Limitations Section)

 

3. Although the dataset has global coverage, the number of samples from Africa is limited. This may adversely affect the model's generalization capability.

Response 3:

Our main target during the sample selection process was to ensure maximum variability in key geospatial and hydrological parameters, such as Land Use/Land Cover (LULC), average elevation, soil type, and climatic context, rather than to achieve an even geographic distribution. This approach aimed to provide the model with diverse input conditions to enhance learning robustness across different flood-generating environments.

To address the reviewer’s concern, we have added a statement in the Limitations section explicitly noting that the lower density of African samples may reduce predictive robustness in underrepresented geospatial contexts. Future work will prioritize expanding the dataset with more African catchments to improve spatial coverage and ensure more balanced global representation. (See Limitations Section)

 

 

Submission Date

15 July 2025

Date of this review

12 Aug 2025 03:10:24

 

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The authors have provided thorough and constructive responses to my comments and have made substantial improvements to the manuscript. However, further refinement in visual presentation and concise writing would significantly improve readability and make the manuscript easier to follow. For these reasons, I recommend a major revision to address the structural and editorial issues before the manuscript can be considered for publication.

1) Despite efforts to restructure the methods section, the manuscript remains significantly too long for the standard length of a scientific article, and its current form compromises both readability and the overall effectiveness of communication. For example, Sections 3.6.2 and 3.6.3 are still overly verbose and contain repeated procedural descriptions. These sections should be condensed to eliminate redundancy and enhance clarity.

2) Several figures are central to the manuscript but are not well integrated into the text. In particular, some figures (e.g., Figures 5, 9, 10, and 11) appear before they are first mentioned or discussed in the text, disrupting the logical flow. Moreover, figure callouts in the text are sometimes missing or vague, making it difficult for readers to connect visual information with the corresponding analysis (e.g., Figure 11). Additionally, the same figure number is used more than once (e.g., Figure 9 appears both before and after Figure 10). Please ensure that each figure number is unique and follows a clear sequential order throughout the manuscript.

 

Author Response

For research article

Generalized Methodology for Two-dimensional Flood Depth Prediction Using ML-Based Models

Open Review

(x) I would not like to sign my review report
( ) I would like to sign my review report

Quality of English Language

(x) The English could be improved to more clearly express the research.
( ) The English is fine and does not require any improvement.

 

Comments and Suggestions for Authors

The authors have provided thorough and constructive responses to my comments and have made substantial improvements to the manuscript. However, further refinement in visual presentation and concise writing would significantly improve readability and make the manuscript easier to follow. For these reasons, I recommend a major revision to address the structural and editorial issues before the manuscript can be considered for publication.

We sincerely thank the reviewer for their constructive feedback and recognition of the improvements made in the revised manuscript. We appreciate the detailed suggestions for further enhancing clarity, structure, and visual integration. Below, we provide our responses to each point raised:

1) Despite efforts to restructure the methods section, the manuscript remains significantly too long for the standard length of a scientific article, and its current form compromises both readability and the overall effectiveness of communication. For example, Sections 3.6.2 and 3.6.3 are still overly verbose and contain repeated procedural descriptions. These sections should be condensed to eliminate redundancy and enhance clarity.

Response 1:

Manuscript Length and Redundancy:

We fully agree with the reviewer’s observation that certain sections remained verbose in the earlier version. In response, we carefully restructured and condensed Sections 3.6.2 and 3.6.3 by removing repetitive procedural descriptions and merging overlapping content. This refinement significantly improved the readability and flow of the Methods and Discussion sections.

Furthermore, to address the overall length issue, we summarized and streamlined content throughout the manuscript. The revised version is now 27 pages including references and annexes, and 22 pages excluding annexes, compared to 34 pages in the original submission. This reduction enhances readability without compromising scientific depth.

2) Several figures are central to the manuscript but are not well integrated into the text. In particular, some figures (e.g., Figures 5, 9, 10, and 11) appear before they are first mentioned or discussed in the text, disrupting the logical flow. Moreover, figure callouts in the text are sometimes missing or vague, making it difficult for readers to connect visual information with the corresponding analysis (e.g., Figure 11). Additionally, the same figure number is used more than once (e.g., Figure 9 appears both before and after Figure 10). Please ensure that each figure number is unique and follows a clear sequential order throughout the manuscript.

Response 2:

Figure Integration and Numbering:
We carefully revised the figure placement and numbering to ensure logical flow and avoid confusion:

  • All figures now appear after their first mention in the text, with clear and explicit callouts integrated into the narrative.
  • Missing or vague figure callouts have been corrected, and each figure is now tied directly to its analysis in the text.
  • The numbering issue has been resolved; each figure now has a unique sequential number consistent across the entire manuscript.

This restructuring ensures a smoother connection between visuals and text, helping readers follow the analysis more effectively.

 

 

Submission Date

15 July 2025

Date of this review

14 Aug 2025 23:29:47

Revised version Submission

17 Aug 2025

Author Response File: Author Response.docx
