Quantification of Suspended Sediment Concentration Using Laboratory Experimental Data and Machine Learning Model
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Dear authors,
please find enclosed my comments.
Comments for author File: Comments.pdf
Author Response
Comments 1 (Title): The provided title is too general and does not reflect the methods employed in the study. The authors should revise it to specifically highlight the methods and data used.
Response 1: Thanks for the suggestion. We have revised the title to "Quantification of Suspended Sediment Concentration Using Laboratory Experimental Data and Machine Learning Model". A unique aspect of this study is that only laboratory experimental data were used.
Comments 2 (Abstract):
- What do "accuracy" and "error" mean in the context of this study? Which specific metrics have been used to report model accuracy and error? The authors should also provide details about the benchmark dataset. Do the collected images include associated SSC values?
- The architecture and structure of the model, including hyperparameter tuning, should be explained before presenting the results.
- The abstract is very confusing and lacks clarity. It should be rewritten in a clear and well-organized manner.
Response 2: First, the modeling results were evaluated using Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), the coefficient of determination (R²), and Kling–Gupta Efficiency (KGE) (see Table 3 in the revised manuscript). The dataset was published by the University of Arizona Data Library, and we anticipate other researchers will use this dataset.
Second, regarding the architecture and structure of the model, the manuscript already includes a thorough explanation of the selected machine learning models (Random Forest Regression and Gradient Boosting Regression) and clearly outlines the hyperparameter tuning strategy (e.g., using GridSearchCV with defined parameter ranges and cross-validation methods) in the methodology section (specifically in Sections 3.2.2 and 3.2.3). These sections follow a logical flow from model setup to tuning and evaluation, improving clarity and reproducibility.
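As an illustration of the tuning strategy named in the response, a minimal GridSearchCV sketch is shown below. The synthetic feature table, parameter grid, and scoring choice are placeholders for illustration only, not the values used in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the image-feature table: six features per image
# (mean R, G, B, red reflectance, time of capture, temperature) and one SSC value.
rng = np.random.default_rng(0)
X = rng.random((60, 6))
y = 1000 + 149000 * X[:, 3]  # hypothetical SSC in ppm, driven here by red reflectance

# Hypothetical parameter grid; the actual ranges used in the study may differ.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [3, None],
}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,  # 3-fold cross-validation over each parameter combination
    scoring="neg_root_mean_squared_error",
)
search.fit(X, y)
best_rf = search.best_estimator_  # refit on all data with the best parameters
```

The same pattern applies to GradientBoostingRegressor by swapping the estimator and grid.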
The abstract has been rewritten in this revision to improve clarity.
Comment 3 (Lines 53–54): There is a sudden and unclear transition from methodology to data. The authors should first present an overview of the available conventional methodologies, discussing their limitations. This would provide a logical foundation and motivation for the use of machine learning techniques. Several studies have highlighted the deficiencies of physical and process-based models in capturing complex engineering phenomena, thereby justifying the adoption of ML models. Including such a discussion along with appropriate transition sentences would significantly improve the flow and coherence of the manuscript. https://doi.org/10.3389/fenve.2023.1235557
Response 3: We appreciate the suggestion. In response, we have revised lines 44–54 to maintain the structure and to describe the conventional approaches, and we have provided an intensive literature review on the application of ML in environmental studies (Lines 122–175). We focused on the literature on using ML models for image analysis, excluding studies such as hydrological time-series data analysis using ML models. The objective is to narrow our scope to image-based ML models in water engineering applications.
Comment 4 (Paragraph 3, Lines 66–83): The provided literature review is very weak. The authors should first give an overview of ML models used in environmental prediction, including neural networks, kernel-based, ensemble, and hybrid methods (see: https://doi.org/10.1111/jfr3.70042), and then focus on ML applications in SSC and sediment transport, including recent hybrid and ensemble model studies.
Response 4: We respectfully disagree. We provided an intensive literature review on the application of ML models in environmental studies (Lines 111–167) and then focused on SSC measurements using remote sensing methods and ML models (Lines 167–199). We then stated that this study aims to develop a series of laboratory experimental data and use ML models to predict SSC based on these data. The literature review is comprehensive and thorough.
Comment 5 (Experimental Runs and Observation Data):
- Lines 197–198: The authors should clarify the preprocessing steps for natural light images.
- How were the RGB values and GLCM features extracted?
Response 5: First, we added Section 2.1, explaining the experimental procedure.
Second, regarding RGB and GLCM features, we have clarified the preprocessing steps for natural light images in Lines 259–275. Specifically, all images were cropped to remove visual disturbances and to ensure consistent framing and resolution across samples. Each image was processed as a 3-channel RGB array, from which the mean intensity of each channel (red, green, and blue) was calculated to capture overall color composition. In addition, a normalized red reflectance metric (R / (R + G + B)) was computed to quantify the relative dominance of red in the image.
Lines 262–272 explain how the RGB values were extracted, and Lines 276–291 explain the preprocessing and how the GLCM features were extracted.
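As a rough sketch of the feature-extraction steps described above (per-channel means, normalized red reflectance R / (R + G + B), and a GLCM-derived texture metric), assuming a NumPy image array. The co-occurrence computation below is a simplified stand-in for a full GLCM pipeline such as scikit-image's graycomatrix/graycoprops, not the study's exact code.

```python
import numpy as np

def extract_color_features(img):
    """Mean R, G, B intensities plus normalized red reflectance R / (R + G + B)."""
    img = img.astype(float)
    r, g, b = (img[..., c].mean() for c in range(3))
    red_reflectance = r / (r + g + b)
    return r, g, b, red_reflectance

def glcm_contrast(gray, levels=8):
    """Contrast of a gray-level co-occurrence matrix at offset (0, 1).

    A minimal stand-in for a library GLCM: count horizontal neighbor pairs
    of quantized gray levels, normalize, then weight by squared level distance."""
    q = (gray.astype(float) / 256 * levels).astype(int)  # quantize to `levels` bins
    glcm = np.zeros((levels, levels))
    for i, j in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):  # horizontal neighbors
        glcm[i, j] += 1
    glcm /= glcm.sum()
    di, dj = np.meshgrid(np.arange(levels), np.arange(levels), indexing="ij")
    return float(((di - dj) ** 2 * glcm).sum())

# Example on a synthetic 3-channel image array with values 0-255
rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(64, 64, 3))
r_mean, g_mean, b_mean, red_ref = extract_color_features(img)
contrast = glcm_contrast(img.mean(axis=2))
```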
Comment 6 (Methodology):
- The authors should clarify that their experiments do not capture the vertical distribution of SSC.
- Why was sediment size not included as an input parameter for SSC estimation?
- The dataset is entirely based on a controlled laboratory environment, where the color of the water-sediment mixture is used as a proxy for SSC. However, in real-world conditions, water color can be influenced by other factors, such as vegetation or organic matter, rather than SSC alone. The authors should discuss how their method would perform under such conditions and whether it can be generalized beyond the lab setting.
Response 6: First, the vertical distribution of SSC was not measured; only the surface sediment was sampled using a container at the same time the images were taken. The study results therefore apply only to surface-measured suspended sediment concentration. Second, only one sediment size was used because this set of experimental data is for calibrating a camera installed in the field, and the sediment was taken from the field site. Third, we did not consider other environmental factors in this study. To avoid these environmental factors, we took the sediment and conducted this series of laboratory experiments to obtain SSC over a large range, from 1,000 to 150,000 ppm; it is hard to obtain such a large range of SSC in the field.
Comment 7 (Line 342, Section 3.2.2, Model Set-up – Fine Tuning of Selected Models): The authors should clearly specify the type of modeling used (regression or classification) and explicitly define the input and output parameters. This information would be best presented in a table for clarity.
Response 7: Thank you for this suggestion. In response, we have explicitly stated the modeling type as regression and clearly defined the input and output parameters in Table 2 on Pages 11–12, which summarizes the input parameters used for each model and the output parameter (suspended sediment concentration), to improve clarity and transparency of the model setup.
Comment 8 (Results): Kling–Gupta Efficiency (KGE) should be clearly defined in the manuscript.
Response 8: The Kling-Gupta Efficiency (KGE) is a statistical metric used to evaluate the performance of hydrological models by comparing simulated and observed time series data. It was introduced by Gupta et al. (2009) to overcome some limitations of traditional metrics like the Nash-Sutcliffe Efficiency (NSE).
Kling–Gupta Efficiency (KGE) combines three components into a single metric:

KGE = 1 − √[(r − 1)² + (β − 1)² + (γ − 1)²]

in which r is the Pearson correlation coefficient between observed and simulated values; β = μs/μo is the bias ratio, where μs is the mean of simulated values and μo is the mean of observed values; and γ = (σs/μs)/(σo/μo) is the variability ratio, defined as the coefficient of variation of the simulated values over that of the observed values, where σ is the standard deviation. KGE = 1 means perfect agreement between observed and simulated values, KGE < 1 means deviation from perfect agreement, and KGE < 0 means the model performs worse than the mean of observations. See revision on Lines 769–783.

Comment 9: RGB mean and red reflectance may be correlated. Similarly, time of capture and temperature could be co-dependent. Conduct multicollinearity analysis (e.g., VIF) or PCA to justify the final feature set (see https://doi.org/10.1016/j.jag.2025.104357). This may also improve interpretability and generalization.
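As a minimal sketch, the KGE defined in Response 8 above (with bias ratio β = μs/μo and coefficient-of-variation ratio γ, following Gupta et al., 2009, as modified by Kling et al., 2012) could be computed as:

```python
import numpy as np

def kge(obs, sim):
    """Kling-Gupta Efficiency: 1 - sqrt((r-1)^2 + (beta-1)^2 + (gamma-1)^2),
    where r is the Pearson correlation, beta the ratio of means, and gamma
    the ratio of coefficients of variation (std/mean) of simulated vs observed."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]
    beta = sim.mean() / obs.mean()
    gamma = (sim.std() / sim.mean()) / (obs.std() / obs.mean())
    return 1 - np.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)
```

A perfect simulation (sim equal to obs) gives r = β = γ = 1 and hence KGE = 1.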
Response 9: In this study, each image is associated with a single suspended sediment concentration (SSC) value and six extracted features: mean R, G, B values, red reflectance, time of capture, and temperature. Rather than applying dimensionality reduction techniques such as PCA or excluding features based on VIF analysis, we used all the features to evaluate their full predictive utility. Post-training results indicated that all six features contributed meaningfully to model performance. We acknowledge that multicollinearity analysis (e.g., VIF or PCA) can improve model interpretability and generalization, particularly in studies with more complex feature sets. In future work, especially when working with a larger number of variables—as demonstrated by Tousi et al. (2021)—we plan to incorporate such techniques.
Comment 10: In Table 3, the metric "% Within 30% Relative Error" is unclear and must be clearly explained. The same clarification is needed for Table 4.

Response 10: We added "The modeling results were evaluated using the percentage of results that have a normalized error of less than 30%. The normalized error is defined as the ratio of the absolute error (i.e., simulation – observation) to the observation." on Lines 804–806.
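The "% Within 30% Relative Error" metric described in Response 10 can be sketched as follows; the sample values are illustrative only:

```python
import numpy as np

def pct_within_relative_error(obs, sim, threshold=0.30):
    """Percentage of predictions whose normalized error |sim - obs| / obs
    is below the given threshold (30% by default)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    within = np.abs(sim - obs) / obs < threshold
    return 100.0 * within.mean()

# Example: three of four hypothetical SSC predictions fall within 30% of the
# observation (normalized errors 0.10, 0.25, 0.025, 1.00), so the metric is 75%.
obs = np.array([1000.0, 2000.0, 4000.0, 8000.0])
sim = np.array([1100.0, 1500.0, 3900.0, 16000.0])
```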
Comment 11: CNN is poorly optimized and underperforms but the reason is only briefly mentioned. Provide architectural tuning details, training plots, and validation strategies. Otherwise, its inclusion appears superficial.
Response 11: CNN models have proven effective for identifying spatial features; however, in this study each image corresponds to only one SSC value. There are no measurements of the spatial distribution of SSC within an image, so the CNN model is not well suited and, as shown, performs poorly. In addition to ResNet50, multiple CNN architectures with different numbers of layers were tested, and the results were not as good as those from RFR and GBR. This is also an important finding of this study. Since our data have been released publicly, we anticipate further discussion of CNN models using our data.
Comment 12: Performance metrics are presented, but their uncertainty is not. Include error bars (e.g., standard deviation across cross-validation folds) for RMSE/R² values to better reflect model robustness.
Response 12: We appreciate this important suggestion. Lines 657–677 have been added to explain the inclusion of uncertainty metrics, and Table 5 now presents the ranges of RMSE and R² values to illustrate performance variability across cross-validation folds or repeated train–test splits. These additions provide a clearer picture of model robustness and generalization. In addition, we plotted the prediction errors of all the data using natural and NIR images in Figures 7 and 8, which clearly show that the RGB model with NIR images produced the best matches.
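A minimal sketch of how per-fold RMSE and R² ranges such as those now reported in Table 5 could be obtained, assuming scikit-learn and a placeholder feature table (not the study's data):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_validate

# Placeholder feature table: six image features and one SSC target per sample.
rng = np.random.default_rng(2)
X = rng.random((60, 6))
y = 1000 + 149000 * X[:, 3]

# Multi-metric 5-fold cross-validation; sklearn reports RMSE as a negative score.
scores = cross_validate(
    GradientBoostingRegressor(random_state=0),
    X, y, cv=5,
    scoring={"rmse": "neg_root_mean_squared_error", "r2": "r2"},
)
rmse_per_fold = -scores["test_rmse"]          # flip sign back to positive RMSE
r2_per_fold = scores["test_r2"]
rmse_range = (rmse_per_fold.min(), rmse_per_fold.max())  # spread across folds
```

The min/max (or standard deviation) of these per-fold arrays is what an error-bar or range column would report.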
Comment 13: The authors should explain why normalized input parameters were not used.
Response 13: Thank you for the comment. For models trained using natural light images (e.g., RGB and red reflectance features), we applied feature normalization using z-score standardization (StandardScaler) prior to model training. For near-infrared (NIR) models using GLCM texture features, we did not apply normalization, as these features are derived from statistical texture metrics that inherently reflect local image structure. Additionally, ensemble-based models such as Random Forest and Gradient Boosting are generally robust to differences in feature scale due to their tree-based architecture. While normalization had limited impact on model performance in our case, we recognize its potential value, particularly in scenarios involving a wider range of feature types or non-tree-based models. We intend to further explore normalization strategies, including those for image-derived color features and sediment concentration values, in future studies.
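As a small illustration of the z-score standardization mentioned above (equivalent to scikit-learn's StandardScaler with default settings); the feature values are hypothetical:

```python
import numpy as np

def zscore(X):
    """Z-score standardization: subtract the per-feature mean and divide by
    the per-feature standard deviation, so each column has mean 0 and std 1."""
    X = np.asarray(X, float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

# Hypothetical rows of (mean red intensity, water temperature in degC)
features = np.array([[120.0, 15.0],
                     [180.0, 25.0],
                     [150.0, 20.0]])
scaled = zscore(features)
```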
Comment 14: A discussion of the limitations and strengths of the proposed models is necessary to evaluate their practical applicability.
Response 14: A discussion section was added on Line 928-966.
4. Response to Comments on the Quality of English Language
Point 1:
Response: The manuscript has been revised thoroughly.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
General comments
Although the structure of the document could be improved, the manuscript is well-written, and the subject is current and relevant.
The setup that is the source for image acquisition is not described in the manuscript, nor is the preparation of the SSC samples; this must be addressed in a revised version.
Specific comments
Abstract
The abstract is concise but presents the essential information.
Keywords
Too many keywords are presented. Five to six keywords are enough to represent main topics. Moreover, abbreviations must be avoided.
Introduction
The statements in lines 45 to 48 must be supported by reference(s).
Please check and remove duplicates of abbreviations' definitions, such as SSC in lines 56 and 111. With the exception of the Abstract section, all abbreviations must be defined only once.
Although more details on the relevance of SSC measurements can be helpful to readers, the introduction contains the relevant concepts to provide a smooth framework for the developed work.
Experimental Runs and Observation Data
Section 2 contains only one subsection. So, its numbering is unnecessary. However, more details on the experimental setup, image acquisition hardware, and sediment suspensions preparation must be provided.
I recommend including a section describing the setup and suspension preparation and concentration analysis, and a section describing the images acquisition.
Methodology
This section is too descriptive concerning theoretical concepts. General concepts can be moved to the introduction section, and this section must be focused on the work carried out.
Results
The results are clearly presented.
The use of “flying” titles such as in line 496 should be avoided.
Conclusions
I suggest that “Key Findings” and subsection 4.3 (Summary) can be merged into the conclusion’s section, after the required adjustment. On the other hand, part of the statements can be a basis for the elaboration of a Discussion subsection.
Author Response
Comments 1 (Abstract): The abstract is concise but presents the essential information.
Response 1: Thank you for pointing this out. We agree with this comment. Therefore, we have revised the abstract and made it more concise. See Lines 18–33 in the revised manuscript.
Comments 2 (Keywords): Too many keywords are presented. Five to six keywords are enough to represent main topics. Moreover, abbreviations must be avoided.
Response 2: We appreciate the reviewer's feedback regarding the keywords. We have revised the keywords to include only six main topics without abbreviations, to improve clarity and align with journal guidelines. The revised keywords are: suspended sediment concentration, water quality monitoring, machine learning, random forest regression, gradient boosting regression, near-infrared imaging. We trust this addresses the reviewer's concern.
Comments 3 (Introduction): The statements in lines 45 to 48 must be supported by reference(s).
Response 3: Agree. We have, accordingly, revised the Introduction to provide supporting references for the statements in lines 45 to 48, as suggested by the reviewer. Specifically, we added citations to Bilotta & Brazier (2008), Rai & Kumar (2015), and Owens et al. (2005) to substantiate the importance of monitoring sediment concentration in relation to water quality, ecosystem health, turbidity, nutrient distribution, and pollutant transport.
This change can be found on lines 37–40 (after updating the keywords, the line count has decreased) of the revised manuscript. The revision is below:
“Monitoring sediment concentration in water bodies is essential for managing water quality, protecting aquatic ecosystems, and ensuring public safety (Bilotta & Brazier, 2008). Sediment concentration influences turbidity, nutrient distribution, and pollutant transport, impacting water quality and ecosystem health (Rai & Kumar, 2015; Owens et al., 2005).”
Comment 4: Please check and remove duplicates of abbreviations' definitions, such as SSC in lines 56 and 111. With the exception of the Abstract section, all abbreviations must be defined only once.
Response 4: Agree. We have carefully checked the manuscript and removed duplicate definitions of abbreviations, including the repeated definition of suspended sediment concentration (SSC) at line 114 and elsewhere. The abbreviation SSC is now defined only once at its first occurrence in the main text, in accordance with the reviewer's suggestion.
Many changes were made to make sure each abbreviation is defined only once in the text.
Comments 5 (Experimental Runs and Observation Data): Section 2 contains only one subsection, so its numbering is unnecessary. However, more details on the experimental setup, image acquisition hardware, and sediment suspension preparation must be provided.
Response 5: The section was revised and broken into three subsections: 2.1. Experimental Set-up, 2.2. Image Collection, and 2.3. Image Processing. Please see Lines 235–288 in the revised manuscript.
Comment 6: I recommend including a section describing the setup and suspension preparation and concentration analysis, and a section describing the images acquisition.
Response 6: Section 2.1. was added to describe the experimental set-up and concentration measurements. Line 235-253.
Comments 7 (Methodology): This section is too descriptive concerning theoretical concepts. General concepts can be moved to the introduction section, and this section must be focused on the work carried out.
Response 7: Agree. Section 3.1. Theoretical Basis, which states why the features of images are associated with SSC, remained the same, while Section 3.2. Selection of Machine Learning Models was revised considerably by removing excessive literature evaluation and focusing only on the models we used and their parameters. The methodology section has been revised (see Lines 476–586).
Comments 8 (Results): The results are clearly presented. The use of "flying" titles such as in line 496 should be avoided.
Response 8: Thank you for pointing this out. We have revised the formatting to ensure that section titles, including 4.1 Preliminary Model Evaluation, are no longer separated from their corresponding text in the revised manuscript.
Comments 9 (Conclusions): I suggest that "Key Findings" and subsection 4.3 (Summary) can be merged into the conclusions section, after the required adjustment. On the other hand, part of the statements can be a basis for the elaboration of a Discussion subsection.
Response 9: Thank you for this valuable suggestion. We have carefully merged the Key Findings and Summary (Section 4.3) content into the Conclusion section to reduce redundancy and improve clarity. Additionally, we have elaborated a Discussion subsection (4.3) that interprets the results in relation to previous studies, highlights the implications, and acknowledges limitations. These changes have strengthened the structure and narrative of the manuscript as recommended.
This revision can be found on Line 928-982 of the revised manuscript.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
Dear Authors,
Thank you.
The Article quality has been significantly improved.