Peer-Review Record

Intelligent Prediction and Continuous Monitoring of Water Quality in Aquaculture: Integration of Machine Learning and Internet of Things for Sustainable Management

Water 2025, 17(1), 82; https://doi.org/10.3390/w17010082

by Rubén Baena-Navarro^1,2,3,*, Yulieth Carriazo-Regino^2, Francisco Torres-Hoyos^2,4 and Jhon Pinedo-López^5

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Submission received: 15 November 2024 / Revised: 12 December 2024 / Accepted: 12 December 2024 / Published: 1 January 2025

Round 1

Reviewer 1 Report (Previous Reviewer 2)

Comments and Suggestions for Authors

Figure 5
> too many significant digits in vertical scale (round to a maximum of 3 significant digits)
> incomprehensible horizontal time scale
> DO plot: yellow line hard to detect?

------
Analysis of the Relationship Between Turbidity and Dissolved Oxygen
Figure 7
> according to the experimental data points, it seems to have a "NULL" sloping tendency
> can we have numerical results (SLOPE and respective UNCERTAINTY), assuming traditional Least Squares approach
> for that dispersion...
a) evaluate robust regression
b) make a statistical test for SIGNIFICANCE of SLOPE

-----
Daily Relationship Between Temperature and Dissolved Oxygen
Fig.8
> according to the experimental data points, it seems to have a "NULL" sloping tendency
> can we have numerical results (SLOPE and respective UNCERTAINTY), assuming traditional Least Squares approach
> for that dispersion...
a) evaluate robust regression
b) make a statistical test for SIGNIFICANCE of SLOPE

-----
Fig.9
> color scale is not adequate (difficult to distinguish records)
Suggestions:
a) change some records to black / blue / green
b) SHIFT records in order to NOT OVERLAP

-----
Table 5
> add 2 more significant digits to "Average MSE" to be coherent with Table 4

Author Response

Comments 1:

"Figure 5

too many significant digits in the vertical scale (round to a maximum of 3 significant digits)
incomprehensible horizontal time scale

DO plot: yellow line hard to detect?"

Response 1:

Thank you for pointing out these aspects. We agree with your observations and have made the following modifications to Figure 5, located in the Materials and Methods section of the manuscript:

We reduced the significant digits on the vertical scales to a maximum of three to enhance visual clarity.
We adjusted the horizontal time scale to display dates in a more comprehensible format (YYYY-MM).
We changed the color of the line representing dissolved oxygen (DO) from yellow to blue to improve visibility and facilitate data interpretation.

Comments 2:

"Analysis of the Relationship Between Turbidity and Dissolved Oxygen
Figure 7

according to the experimental data points, it seems to have a "NULL" sloping tendency
can we have numerical results (SLOPE and respective UNCERTAINTY), assuming traditional Least Squares approach

for that dispersion...

evaluate robust regression
make a statistical test for SIGNIFICANCE of SLOPE"

Response 2:

Thank you for pointing this out. We agree with your observation and have made the following changes to Figure 7, located in the Results section of the manuscript:

We calculated the slope and its uncertainty using the Ordinary Least Squares (OLS) method. The analysis shows a slope of 0.0091 with a standard error of 0.0215 and a p-value of 0.671, indicating it is not statistically significant.
Additionally, we evaluated robust regression to ensure reliability against potential outliers. This analysis yielded a slope of 0.0058 with a standard error of 0.0218 and a p-value of 0.791, also confirming the null sloping tendency.
We updated Figure 7 to include both regression lines (OLS and robust) with their respective 95% confidence intervals.

These changes indicate that the relationship between turbidity and dissolved oxygen shows a null tendency, supporting the hypothesis of the stability of dissolved oxygen levels under moderate turbidity variations, as observed in aquaculture systems. The corresponding revisions are included in the Results section of the manuscript.
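
The slope, its standard error, and the significance test described in Response 2 can be sketched as follows. This is a minimal illustration, not the authors' analysis: the data are randomly generated, the p-value uses a normal approximation to the usual t-test (reasonable only for large samples), and Theil-Sen is swapped in as one common robust estimator because the response does not name the robust method used. All function and variable names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, median


def ols_slope(x, y):
    """Return (slope, intercept, standard error of the slope) for y = a + b*x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, intercept, se


def two_sided_p(slope, se):
    """Two-sided p-value for H0: slope = 0 (normal approximation, an assumption)."""
    z = abs(slope / se)
    return 2.0 * (1.0 - NormalDist().cdf(z))


def theil_sen_slope(x, y):
    """Median of all pairwise slopes; tolerant of a minority of outliers."""
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i in range(len(x)) for j in range(i + 1, len(x))
              if x[j] != x[i]]
    return median(slopes)


# Synthetic data (NOT the study's measurements): DO generated independently of turbidity.
rng = random.Random(0)
turbidity = [rng.uniform(5.0, 25.0) for _ in range(200)]
do_mg_l = [6.5 + rng.gauss(0.0, 0.4) for _ in turbidity]

slope, intercept, se = ols_slope(turbidity, do_mg_l)
print(f"OLS slope = {slope:.4f} +/- {se:.4f}, p = {two_sided_p(slope, se):.3f}")
print(f"Theil-Sen slope = {theil_sen_slope(turbidity, do_mg_l):.4f}")
```

A slope whose magnitude is small relative to its standard error yields a large p-value, which is the pattern the response reports for Figure 7.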

Comments 3:

"Daily Relationship Between Temperature and Dissolved Oxygen

Figure 8

according to the experimental data points, it seems to have a "NULL" sloping tendency
can we have numerical results (SLOPE and respective UNCERTAINTY), assuming traditional Least Squares approach

for that dispersion...

evaluate robust regression
make a statistical test for SIGNIFICANCE of SLOPE"

Response 3:

Thank you for pointing this out. We agree with your observation and have made the following updates to Figure 8, located in the Results section of the manuscript:

We performed an Ordinary Least Squares (OLS) regression to calculate the slope and its uncertainty. The analysis showed a slope of 0.0097 with a standard error of 0.0156 and a p-value of 0.534, indicating that the slope is not statistically significant.
To ensure the robustness of the results, we also applied robust regression, which yielded a slope of 0.0067 with a standard error of 0.0158 and a p-value of 0.671, confirming the null sloping tendency.
Figure 8 was updated to include both the OLS and robust regression lines, along with their respective 95% confidence intervals.

These analyses suggest that within the observed data range, no statistically significant relationship exists between temperature and dissolved oxygen levels. However, this does not contradict the well-established understanding that higher temperatures generally reduce water's ability to retain oxygen. Instead, it highlights that the inverse relationship may be less apparent under the specific conditions of this study, which involved moderate temperature variations and potential external factors, such as oxygenation systems, that could mitigate the effect.

We have updated the Results section accordingly to reflect these findings and ensure clarity for readers. Thank you for helping us enhance the rigor and transparency of our analysis.
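
As a rough consistency check on the figures quoted in Responses 2 and 3, the test statistic is simply the slope divided by its standard error. The sketch below recomputes approximate p-values and 95% intervals from the reported slopes and standard errors only. It assumes a normal approximation (the sample size is not stated in the responses), so small differences from the reported p-values are expected from rounding and from the t-distribution actually used.

```python
from statistics import NormalDist

# (label, slope, standard error, reported p-value), copied from Responses 2 and 3.
REPORTED = [
    ("Fig. 7 OLS", 0.0091, 0.0215, 0.671),
    ("Fig. 7 robust", 0.0058, 0.0218, 0.791),
    ("Fig. 8 OLS", 0.0097, 0.0156, 0.534),
    ("Fig. 8 robust", 0.0067, 0.0158, 0.671),
]


def wald_p(slope, se):
    """Two-sided p-value for H0: slope = 0 under a normal approximation."""
    z = abs(slope) / se
    return 2.0 * (1.0 - NormalDist().cdf(z))


def slope_ci95(slope, se):
    """Approximate 95% confidence interval for the slope."""
    half_width = NormalDist().inv_cdf(0.975) * se
    return slope - half_width, slope + half_width


for label, slope, se, p_reported in REPORTED:
    low, high = slope_ci95(slope, se)
    print(f"{label}: z = {slope / se:.3f}, p ~ {wald_p(slope, se):.3f} "
          f"(reported {p_reported}), 95% CI [{low:.4f}, {high:.4f}]")
```

In each case the interval straddles zero, which is another way of stating that the slope is not statistically distinguishable from a null tendency.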

Comments 4:

"Fig.9

color scale is not adequate (difficult to distinguish records)

Suggestions:
a) change some records to black / blue / green

b) SHIFT records in order to NOT OVERLAP"

Response 4:

Thank you for your observations. We agree with your comments and have made the following modifications to Figure 9, located in the Results section of the manuscript:

The color scale was adjusted to assign distinctive tones to each parameter: blue for temperature, green for dissolved oxygen, black for pH, and orange for turbidity. This significantly improves the visual differentiation between the records.
Larger vertical shifts were applied to the normalized series, minimizing overlap between the records and facilitating the visualization of individual trends.

With these changes, Figure 9 now provides a clearer and more accessible representation of the normalized time series of water quality parameters. The revised version accurately reflects the characteristics of the tropical environment of Montería, allowing for a more precise interpretation of the data.
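
The normalize-then-shift idea behind the revised Figure 9 can be illustrated with a short sketch. Response 4 does not say how the series were normalized or how large the shifts were, so min-max scaling and a constant gap between traces are assumptions here, and the readings are made-up values rather than study data.

```python
def min_max_normalize(values):
    """Scale values to the range [0, 1]; a constant series maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def stack_series(series, gap=0.5):
    """Normalize each series and shift series k upward by k * (1 + gap).

    Each normalized trace occupies a band of height 1, so any gap > 0
    guarantees that consecutive traces cannot overlap.
    """
    stacked = {}
    for k, (name, values) in enumerate(series.items()):
        offset = k * (1.0 + gap)
        stacked[name] = [offset + v for v in min_max_normalize(values)]
    return stacked


# Hypothetical readings for illustration only.
readings = {
    "temperature": [26.1, 27.4, 28.3, 27.0],
    "dissolved_oxygen": [6.2, 5.8, 5.1, 6.0],
    "pH": [7.2, 7.4, 7.1, 7.3],
    "turbidity": [12.0, 15.5, 14.2, 13.1],
}

stacked = stack_series(readings)
for name, values in stacked.items():
    print(name, [round(v, 2) for v in values])
```

The shifted values are only meaningful for plotting; the original units are lost after normalization, so axis labels should say so.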

Comments 5:

"Table 5

add 2 more significant digits to 'Average MSE' to be coherent with Table 4"

Response 5:

Thank you for your observation. We have adjusted the precision of the "Average MSE (mg/L)" column in Table 5 to include four significant digits, ensuring consistency with the presentation of results in Table 4. With this modification, both tables now reflect a coherent level of detail in the representation of the model’s performance metrics.

The revised version of Table 5 is included in the updated manuscript. We appreciate your valuable suggestion, which has contributed to improving the clarity and precision of the results presented.
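
Reporting a fixed number of significant digits, as requested for Table 5 (four) and for the Figure 5 axis (three), differs from fixing the number of decimal places. A minimal helper, with a hypothetical name and hypothetical sample values that are not taken from Tables 4 or 5, might look like this:

```python
import math


def to_sig_figs(value, digits):
    """Format value with a fixed number of significant digits, keeping trailing zeros."""
    if value == 0:
        return f"{0:.{digits - 1}f}"
    # Position of the leading digit: 0 for 1.23, -2 for 0.0123, 5 for 123456.
    exponent = math.floor(math.log10(abs(value)))
    decimals = digits - 1 - exponent
    rounded = round(value, decimals)  # negative decimals round to tens, hundreds, ...
    return f"{rounded:.{max(decimals, 0)}f}"


# Hypothetical metric values for illustration.
for mse in (0.0123456, 0.5, 1.23456):
    print(to_sig_figs(mse, 4), to_sig_figs(mse, 3))
```

Keeping trailing zeros (for example `0.5000`) matters in a table, because it signals the precision to which every row was computed.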

Reviewer 2 Report (New Reviewer)

Comments and Suggestions for Authors

The authors present a system for remote water quality monitoring and machine learning-based water quality management for aquaculture. The authors provided a solid foundation for the necessity of their research through a comprehensive review of existing studies. However, the manuscript reads more like a technical article than an academic article presenting new findings relative to previous related studies. While the proposed system integrates well-known tools such as Random Forest and QAOA, it is difficult to identify the novelty of this research in the proposed system. Please consider the following detailed comments.

1. The key aspect of the paper seems to be its ability to detect abnormal conditions in water quality, thereby enabling efficient aquaculture management. Additionally, the study highlights improvements in the rapid establishment of a ML model using QAOA. However, it is difficult to find from the presented results how the proposed ML model was utilized to detect water quality anomalies. Moreover, to ensure continuous water quality management, it is necessary to update the ML model periodically. The paper does not explain how real-time water quality data is used to update the learning model or the frequency of such updates. For rapid learning model construction and application to hold significance, water quality indicators collected through IoT sensor networks must be utilized for the ML model updates, but this aspect is unclear. Therefore, based solely on the presented results, it is challenging to confirm the novelty of the research or its differentiation from existing studies.

2. The variability of water quality indicators in the test period is limited (Table 2). This raises concerns about the feasibility of detecting anomalies through machine learning (ML) model training. Can the observational data provide training data for anomaly detection? Demonstrating whether such anomalies can be predicted using the learning model would serve as evidence of the model's effectiveness. Please provide the critical water quality conditions influencing the fish survival rate in the target area.

3. Table 7 demonstrates that the proposed methodology facilitates efficient aquaculture management. However, a more precise analysis of its effectiveness would require comparisons of fish survival rates before and after applying the proposed method, or monthly comparisons over several years.

Author Response

Comments 1:

"The authors present a system for remote water quality monitoring and machine learning-based water quality management for aquaculture. The authors provided a solid foundation for the necessity of their research through a comprehensive review of existing studies. However, the manuscript focuses more on being a technical article than academic article, presenting new findings compared to previous related studies. While the proposed system integrates well-known tools such as Random Forest and QAOA, it is difficult to find novelty of this research through the proposed system. Please consider the following detailed comments."

Response 1:

Thank you for your constructive feedback. We have revised the manuscript to better highlight the novelty and academic contributions of our work, addressing the concerns raised.

Changes made:

Emphasis on novelty:

Added explicit discussion on the unique integration of QAOA with IoT and ML, demonstrating how it reduces training time by 50% while enabling real-time decision-making in aquaculture management.
Highlighted the system's adaptability to both resource-constrained rural settings and advanced urban environments, addressing a gap in previous studies.

Comparison with prior studies:

Updated the Discussion section to include a clear comparison with related works, emphasizing how this study overcomes limitations in computational efficiency, scalability, and real-time response capabilities.

Clarity on academic contributions:

Revised the Introduction and Conclusion to explicitly state how the proposed system advances theoretical and practical knowledge in applying quantum-enhanced algorithms to environmental monitoring.

These revisions clarify how our study contributes new findings to the field and bridges gaps in existing research. We appreciate your valuable insights, which have helped improve the manuscript’s focus and presentation. Please let us know if additional adjustments are needed.

Comments 2:

"The key aspect of the paper seems to be its ability to detect abnormal conditions in water quality, thereby enabling efficient aquaculture management. Additionally, the study highlights improvements in the rapid establishment of a ML model using QAOA. However, it is difficult to find from the presented results how the proposed ML model was utilized to detect water quality anomalies. Moreover, to ensure continuous water quality management, it is necessary to update the ML model periodically. The paper does not explain how real-time water quality data is used to update the learning model or the frequency of such updates. For rapid learning model construction and application to hold significance, water quality indicators collected through IoT sensor networks must be utilized for the ML model updates, but this aspect is unclear. Therefore, based solely on the presented results, it is challenging to confirm the novelty of the research or its differentiation from existing studies."

Response:
Thank you for your detailed and constructive feedback. In response to your comments, we have made several revisions to the manuscript to address the points raised, ensuring clarity on how the system detects anomalies and updates the ML model in real-time.

Changes made:

Integration of real-time anomaly detection:

Added a detailed explanation in the Results section on how the Random Forest model utilizes dynamic thresholds, derived from historical and real-time data, to detect anomalies such as low dissolved oxygen (DO) and elevated turbidity levels.
Highlighted the identification and automated management of over 6,000 anomalies, including interventions like oxygenation and pH adjustments, demonstrating the system's proactive management capabilities.

Periodic model updates:

Described in the Results section the implementation of a sliding window approach for periodic updates to the Random Forest model every 24 hours, incorporating the most recent 1,000 records to maintain adaptability to changing environmental conditions.
Explained how this periodic updating mechanism ensures the system's accuracy and relevance in real-time monitoring.

QAOA-enhanced optimization:

Clarified in the Results and Discussion sections how the Quantum Approximate Optimization Algorithm (QAOA) reduced model retraining time by 50%, enabling seamless model updates without interrupting real-time operations.

These revisions improve the manuscript by providing a clearer explanation of the system’s novelty and differentiation from previous studies, particularly in the integration of real-time anomaly detection, periodic model updates, and QAOA optimization for aquaculture management.
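
The sliding-window update and dynamic thresholds described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the class name is hypothetical, the "dynamic threshold" is assumed to be the window mean minus k standard deviations (the response does not define it), and recomputing the threshold stands in for retraining the Random Forest model.

```python
from collections import deque
from statistics import mean, stdev

WINDOW_SIZE = 1000  # most recent records kept, as stated in the response


class SlidingWindowMonitor:
    """Keeps the latest DO readings and flags values below a dynamic threshold."""

    def __init__(self, window_size=WINDOW_SIZE, k=3.0):
        self.window = deque(maxlen=window_size)  # oldest records drop off automatically
        self.k = k            # assumed band width, in standard deviations
        self.lower = None     # current dynamic lower threshold for DO (mg/L)

    def refresh_threshold(self):
        """Recompute the threshold from the window (placeholder for model retraining)."""
        if len(self.window) >= 2:
            self.lower = mean(self.window) - self.k * stdev(self.window)

    def observe(self, do_mg_l):
        """Return True if the reading is below the current threshold, then store it."""
        is_anomaly = self.lower is not None and do_mg_l < self.lower
        self.window.append(do_mg_l)
        return is_anomaly


# Tiny demonstration with a 5-record window and made-up readings.
monitor = SlidingWindowMonitor(window_size=5, k=2.0)
for reading in [6.0, 6.2, 5.9, 6.1, 6.0]:
    monitor.observe(reading)
monitor.refresh_threshold()
print(round(monitor.lower, 3), monitor.observe(3.0), monitor.observe(6.0))
```

In a deployment, `refresh_threshold` would be called on whatever schedule the system uses for periodic updates, and a flagged reading would trigger the corresponding intervention.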

We appreciate your valuable suggestions, which have significantly enhanced the clarity and focus of the manuscript. Please let us know if further adjustments are required.

Comment 3:

"The variability of water quality indicators in the test period is limited (Table 2). This raises concerns about the feasibility of detecting anomalies through machine learning (ML) model training. Can the observational data provide training data for anomaly detection? Demonstrating whether such anomalies can be predicted using the learning model would serve as evidence of the model's effectiveness. Please provide the critical water quality conditions influencing the fish survival rate in the target area."

Response:
Thank you for your thoughtful and precise observations. We have carefully addressed your concerns and revised the manuscript to provide irrefutable evidence of the system’s capacity for anomaly detection and its practical utility in aquaculture management.

Key Revisions:

1. Dataset Enrichment and Thresholds (Materials and Methods Section):

We clarified how dynamic thresholds were established using historical and real-time data to enhance the dataset’s capacity for anomaly detection. For instance, dissolved oxygen (DO) levels below 5 mg/L and temperatures exceeding 28°C were identified as critical conditions requiring immediate intervention.

2. FCE Integration for Practical Decision-Making (Results Section):

The Fuzzy Comprehensive Evaluation (FCE) was shown to complement traditional metrics, offering actionable categorization of water quality parameters. This dual approach ensured precise detection of critical variations and informed interventions.

3. Anomaly Detection and System Response (Results Section):

Highlighted the identification and automated management of over 6,000 anomalies. These anomalies included declines in DO and elevated turbidity, which triggered oxygenation and filtration responses. The Random Forest model was updated approximately every 41 days, reflecting the time required to accumulate 1,000 new data records, dynamically refining thresholds to ensure accuracy and relevance.

4. Relevance to Tropical Aquaculture (Discussion Section):

Addressed how Montería’s stable tropical conditions supported the detection of critical deviations, making the findings adaptable to similar climates. This adaptability underscores the system’s robustness across diverse tropical aquaculture settings.
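
The critical conditions named in item 1 (DO below 5 mg/L, temperature above 28°C) translate directly into a rule check, and the update interval in item 3 allows a small piece of arithmetic. Taken at face value, 1,000 new records in about 41 days implies roughly one record per hour. That rate is an inference from the stated figures, not something the response reports. Function and flag names below are hypothetical.

```python
DO_MIN_MG_L = 5.0   # critical lower bound for dissolved oxygen, from the response
TEMP_MAX_C = 28.0   # critical upper bound for temperature, from the response


def critical_flags(do_mg_l, temp_c):
    """Return the list of critical conditions met by one reading."""
    flags = []
    if do_mg_l < DO_MIN_MG_L:
        flags.append("low_do")
    if temp_c > TEMP_MAX_C:
        flags.append("high_temperature")
    return flags


def implied_sampling(records=1000, days=41):
    """Records per day and minutes between records implied by the update interval."""
    per_day = records / days
    minutes_between = 24 * 60 / per_day
    return per_day, minutes_between


print(critical_flags(4.8, 29.0))
per_day, minutes_between = implied_sampling()
print(f"{per_day:.1f} records/day, one every {minutes_between:.0f} minutes")
```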

These revisions unequivocally demonstrate the model's effectiveness in detecting anomalies and its critical role in maintaining fish survival rates above 90%. The integration of real-time data, robust thresholds, and actionable metrics ensures both the reliability of the system and its practical utility.

We appreciate your constructive feedback, which has strengthened the manuscript’s clarity and focus. Please let us know if further elaboration is required.

Comment 4:

"Table 7 demonstrates that the proposed methodology facilitates efficient aquaculture management. However, a more precise analysis of its effectiveness would require comparisons of fish survival rates before and after applying the proposed method, or monthly comparisons over several years."

Response:

Thank you for your comment. While historical data prior to the study is not available for direct comparisons, we have included a robust analysis based on the observed data during the study and references to previous research in similar aquaculture contexts. This approach effectively validates the proposed system's effectiveness.

Revisions made:

Contextual validation:

Comparisons were added with previous studies documenting typical mortality rates in uncontrolled aquaculture environments, which range from 15% to 30%. In contrast, our system maintained significantly lower mortality rates, between 4.86% (January) and 7.15% (June), even during critical and warm months.

System impact:

In the Results section, it is specified that high temperatures (>28°C) during May and June were mitigated through automated interventions, including oxygenation and pH adjustments, maintaining survival rates above 92%. This highlights the system’s ability to handle adverse conditions and prevent losses.

Support from the literature:

Additional references were included to support how IoT-based technologies and machine learning have demonstrated reduced mortality by automating interventions and optimizing environmental parameters.
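
The mortality and survival figures in this response are two views of the same quantity, and the conversion is worth making explicit. The short sketch below uses only the percentages quoted above; the function name is hypothetical.

```python
def survival_rate(mortality_pct):
    """Survival percentage corresponding to a mortality percentage."""
    return 100.0 - mortality_pct


# Monthly mortality rates quoted in the response (percent).
reported_mortality = {"January": 4.86, "June": 7.15}
# Typical range for uncontrolled environments cited in the response (percent).
literature_mortality = (15.0, 30.0)

for month, mortality in reported_mortality.items():
    print(f"{month}: mortality {mortality:.2f}% -> survival {survival_rate(mortality):.2f}%")

low, high = literature_mortality
print(f"Literature range: survival {survival_rate(high):.0f}% to {survival_rate(low):.0f}%")
```

The highest reported mortality (June) corresponds to the lowest survival, which is the figure that the "above 92%" statement depends on.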

Round 2

Reviewer 1 Report (Previous Reviewer 2)

Comments and Suggestions for Authors

------------------

Fig.7 and Fig.8

> 95% confidence intervals not visible in figures

------------------

Fig.9

> SUGGESTION: split this plot in 4 horizontal plots?

or SHIFT records in order to be able to see relations?

------------------

Table 7

> with only one Caption (table head)

Author Response

Comment 1: [95% confidence intervals not visible in figure 7]
Response 1: We appreciate this comment and agree with your observation. Therefore, we have modified Figure 7 in the Results section to include a visible 95% confidence interval around the OLS regression line. The confidence interval is presented as a shaded region in light blue, which improves visual clarity and enables an adequate interpretation of the uncertainty in the model. This change ensures that the figure meets the necessary standards to support the results presented.

Comment 2: [95% confidence intervals not visible in figure 8]
Response 2: We appreciate your observation and have made the corresponding modifications to Figure 8 to ensure that the 95% confidence intervals are clearly visible. In the revised version, the confidence intervals are displayed as a light blue shaded region around the OLS regression line. This representation enhances visual clarity and allows for an appropriate interpretation of the uncertainty associated with the model.
The updated figure accurately reflects the results described in the manuscript, including the lack of statistical significance of the slope (0.0097) and the associated values for the standard error and p-value. This adjustment ensures that the figure meets the required standards for clarity and presentation.

Comment 3: [Fig. 9 SUGGESTION: split this plot in 4 horizontal plots or SHIFT records in order to be able to see relations.]
Response 3: We appreciate your suggestion and have adjusted Figure 9 by splitting it into four horizontal plots, each corresponding to a water quality parameter (temperature, dissolved oxygen, pH, and turbidity). This modification improves visual clarity and facilitates the identification of specific trends for each parameter. Additionally, by normalizing the values and representing them individually, we ensure a more precise and less congested interpretation of the data.

Comment 4: [Table 7 > with only one Caption (table head)]
Response 4: We appreciate your observation and have made the necessary adjustments to Table 7. The table has now been consolidated into a single page, with a single header clearly identifying the columns. This change eliminates any repeated headers and ensures a more coherent and professional presentation. Additionally, the complete table is now located in a single section of the manuscript, facilitating its interpretation within the context of the presented results.

Reviewer 2 Report (New Reviewer)

Comments and Suggestions for Authors

The authors properly addressed every reviewer's comments through the revision. All ambiguous points have been discussed in the revised manuscript. I would like to recommend the manuscript to be published in this journal in its present form.

Author Response

Comment 1: [The authors properly addressed every reviewer's comments through the revision. All ambiguous points have been discussed in the revised manuscript. I would like to recommend the manuscript to be published in this journal in its present form.]

Response: We sincerely appreciate your positive comments and your recommendation for the publication of our manuscript in its current form. We are pleased that the revisions adequately addressed the ambiguous points raised and met your expectations. We greatly value the time and effort you dedicated to evaluating our work, as well as your constructive observations, which have significantly contributed to improving the quality of the manuscript. Thank you very much for your support and recognition.

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Intelligent Prediction and Continuous Monitoring of Water Quality in Aquaculture: Integration of Machine Learning and IoT for Sustainable Management

By Baena-Navarro et al., 2024 submitted to Water, MDPI

General Comments

This paper proposes a predictive system based on machine learning models and the Internet of Things (IoT) to improve water quality in aquaculture ponds. The tested ML models are RF and SVM, which are widely regarded as common machine learning models in the community. The authors also used QAOA as an optimization tool to reduce the processing time. The Introduction, Literature Review and Method sections are quite well written, giving a proper study background and comprehensive methodology used in the study. However, the Results and Discussion sections are not of equal quality. In fact, the predictive model has serious flaws arising from how the input-output variables are handled. In machine learning models, it is wrong for the output of the model to be one of the input variables. In this paper, the input variables of the ML models are temperature, dissolved oxygen, pH and turbidity but the output of the model is dissolved oxygen. This would create a circular dependency, where the model is being trained on data that already contains the answer it is supposed to predict. This leads to data leakage, a situation in which information from outside the training dataset (or future information) is used to train the model, artificially inflating performance metrics during training and evaluation.

Besides, it is uncertain whether the fish pond in question has an installed oxygenation system. If not, it raises concerns as to why this particular pond was selected. Given that the proposed system is intended for integration within the aquaculture industry, it is important to note that oxygenation systems are a standard feature in most aquaculture setups. It is uncommon to find a fish pond without such a system in this context. This point should be addressed to ensure the relevance and applicability of the study.

Therefore, I do not recommend the paper for publication in the journal.

Specific Comments

Line 120. It is unclear how you selected the 9 key studies from 25 studies that met the relevance and quality criteria.

Line 136. BPNN is not defined.

Line 146. LoRa is not defined.

Line 160. What is LoRaWAN?

Line 194. UML is not defined.

Line 255. PSO and GSD are not defined.

Line 275-277. Why does this sentence appear here? I am also confused why your result is cited with other literature [31].

Line 290. “Recent studies have used traditional techniques…”. This sentence is odd; if recent, why are the techniques used traditional?

Line 420. Please translate to English.

Line 426. Typo “3”.

Line 452. What is entropy or Gini?

Line 461-462. Are you saying the ML model is predicting the dissolved oxygen level, which is also one of the input variables? If yes, this is completely wrong in ML. A predictive model should not predict its own input variable.

Line 485-486. Same comment as above.

Line 513-514. I cannot see the inverse relationship in Fig 6. What is the key message of Fig 6?

Line 522-528. It is uncertain whether the fish pond in question has an installed oxygenation system. If not, it raises concerns as to why this particular pond was selected. Given that the proposed system is intended for integration within the aquaculture industry, it is important to note that oxygenation systems are a standard feature in most aquaculture setups. It is uncommon to find a fish pond without such a system in this context. This point should be addressed to ensure the relevance and applicability of the study.

Line 530. Again, I do not see how the scatter plot in Fig 7 shows a positive correlation between Temp and DO. The plot shows two pools of data, and the pool with fewer data points may be indicative of outliers.

Line 538-539. Here you mentioned the RF model’s inputs are temperature, pH, and turbidity, which contradicts what you mentioned previously. And given the very high R2 and low MSE, I believe the DO is also included in your model inputs. That is why the model performance is exceptionally good.

Line 562-565. I do not understand Figure 9. What are the x and y axes representing?

Table 3 and Figure 12 are not necessary. They can be clearly mentioned in text.

Author Response

General Comments:

Comment: "This paper proposes a predictive system based on machine learning models and the Internet of Things (IoT) to improve water quality in aquaculture ponds. The tested ML models are RF and SVM, which are widely regarded as common machine learning models in the community. The authors also used QAOA as an optimization tool to reduce the processing time. The Introduction, Literature Review and Method sections are quite well written, giving a proper study background and comprehensive methodology used in the study. However, the Results and Discussion sections are not of equal quality. In fact, the predictive model has serious flaws arising from how the input-output variables are handled. In machine learning models, it is wrong for the output of the model to be one of the input variables. In this paper, the input variables of the ML models are temperature, dissolved oxygen, pH and turbidity but the output of the model is dissolved oxygen. This would create a circular dependency, where the model is being trained on data that already contains the answer it is supposed to predict. This leads to data leakage, a situation in which information from outside the training dataset (or future information) is used to train the model, artificially inflating performance metrics during training and evaluation."

Response: Thank you for this valuable insight. We recognize the issue you have highlighted regarding the inclusion of dissolved oxygen (DO) as both an input and output variable, which indeed could lead to data leakage. In response, we have adjusted the model to ensure that DO is now treated exclusively as an output variable. This adjustment prevents any circular dependency, reinforcing the model’s reliability and predictive validity. The revised Results and Discussion sections now reflect this modification, offering a more accurate portrayal of the model’s performance.
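
The leakage fix described in this response amounts to keeping the target out of the feature matrix. A minimal sketch of such a guard is shown below. The column names and sample rows are hypothetical, and the function is an illustration of the principle rather than the authors' preprocessing code.

```python
TARGET = "do"                                  # dissolved oxygen, output only
FEATURES = ["temperature", "ph", "turbidity"]  # inputs named by the reviewer and authors


def split_features_target(records, features=FEATURES, target=TARGET):
    """Build (X, y) from a list of dict records, refusing to leak the target into X."""
    if target in features:
        raise ValueError(f"target '{target}' must not appear among the input features")
    X = [[row[name] for name in features] for row in records]
    y = [row[target] for row in records]
    return X, y


# Made-up sensor records for illustration.
records = [
    {"temperature": 27.1, "ph": 7.2, "turbidity": 12.0, "do": 6.1},
    {"temperature": 28.4, "ph": 7.0, "turbidity": 15.3, "do": 5.4},
]

X, y = split_features_target(records)
print(X)
print(y)
```

With DO among the inputs, a model can reproduce the target almost exactly, which is why the reviewer read the very high R2 as a warning sign rather than as evidence of skill.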

Comment: "Besides, it is uncertain whether the fish pond in question has an installed oxygenation system. If not, it raises concerns as to why this particular pond was selected. Given that the proposed system is intended for integration within the aquaculture industry, it is important to note that oxygenation systems are a standard feature in most aquaculture setups. It is uncommon to find a fish pond without such a system in this context. This point should be addressed to ensure the relevance and applicability of the study."

Response: We appreciate this observation. To clarify, the fish pond used in our study is indeed equipped with an oxygenation system. This setup allowed for the implementation of real-time interventions based on DO predictions, thus mirroring standard practices in the aquaculture industry. This information has been explicitly included in the revised "Materials and Methods" section to clarify the study’s context and reinforce its applicability to real-world aquaculture environments.

Specific Comments:

Line 120: "It is unclear how you selected the 9 key studies from 25 studies that met the relevance and quality criteria."

Response: Thank you for pointing out this ambiguity. We selected the nine key studies from a larger pool of 25 by focusing on specific factors, including methodological relevance, technological innovation, and their specific contributions to IoT and ML in aquaculture. This selection process has now been clarified in the "Literature Review" section for greater transparency.

Line 136: "BPNN is not defined."

Response: We apologize for the oversight. BPNN stands for "Back Propagation Neural Network." This term has now been defined at its first mention in the text to aid reader comprehension.

Line 146: "LoRa is not defined."

Response: Thank you for this note. LoRa, which stands for "Long Range," is a low-power wireless communication protocol used in IoT devices. We have now defined LoRa at first mention in the manuscript for clarity.

Line 160: "What is LoRaWAN?"

Response: LoRaWAN stands for "Long Range Wide Area Network," a communication protocol for connecting low-power devices over long distances. This term has now been defined in the manuscript to provide additional context.

Line 194: "UML is not defined."

Response: UML refers to "Unified Modeling Language," a standardized modeling language used to specify, visualize, construct, and document the artifacts of a software system. We have added this definition at the first mention to clarify its role in our study.

Line 255: "PSO and GSD are not defined."

Response: We appreciate this feedback. PSO (Particle Swarm Optimization) and GSD (Genetic Selection Design) are now defined at their first mention in the revised manuscript to enhance clarity.

Line 275-277: "Why does this sentence appear here? I am also confused why your result is cited with other literature [31]."

Response: Thank you for pointing out this inconsistency. We have reorganized this section to avoid confusion and ensure our findings are clearly distinguished from referenced literature. This restructuring improves the flow of ideas in this section.

Line 290: "‘Recent studies have used traditional techniques…’. This sentence is odd; if recent, why are the techniques used traditional?"

Response: We appreciate this observation. The sentence has been revised to read: "Despite recent advancements in aquaculture, some studies continue to rely on established methodologies." This revision clarifies that the studies are recent but employ long-standing methodologies.

Line 420: "Please translate to English."

Response: Thank you for this note. We have now translated this phrase into English to maintain consistency across the manuscript.

Line 426: "Typo ‘3’."

Response: We apologize for the oversight. The typo has been corrected in the updated manuscript.

Line 452: "What is entropy or Gini?"

Response: Entropy and Gini index are impurity metrics used in decision trees to decide node splits. These terms are now defined to assist readers unfamiliar with these metrics.
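For readers unfamiliar with these metrics, a minimal sketch of both for a list of class labels is given here. The function names and the sample labels are illustrative assumptions and are not taken from the manuscript.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Return the fraction of samples that falls in each class."""
    if not labels:
        raise ValueError("labels must not be empty")
    total = len(labels)
    return [count / total for count in Counter(labels).values()]


def gini_impurity(labels):
    """Gini impurity: 1 - sum(p_k^2); 0 for a pure node."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Shannon entropy in bits: sum(-p_k * log2(p_k)); 0 for a pure node."""
    return sum(-p * math.log2(p) for p in class_proportions(labels))


# Hypothetical node contents: a pure node and an evenly mixed node.
pure = ["low", "low", "low", "low"]
mixed = ["low", "low", "ok", "ok"]

print(gini_impurity(pure), entropy(pure))
print(gini_impurity(mixed), entropy(mixed))
```

A split is preferred when it lowers the weighted impurity of the child nodes relative to the parent, whichever of the two metrics is used.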

Line 461-462: "Are you saying the ML model is predicting the dissolved oxygen level which is also one of the input variables? If yes, it is completely wrong in ML. A predictive model should not predict its input variable."

Response: Thank you for identifying this concern. Following your recommendation, we have restructured the model so that DO is treated solely as an output variable. This adjustment eliminates any circular dependency, which we have clarified in the revised Results and Discussion sections.

Line 485-486: "Same comment as above."

Response: As addressed, DO is now exclusively the output variable, and this modification is reflected consistently throughout the manuscript.

Line 513-514: "I cannot see the inverse relationship in Fig 6. What is the key message of Fig 6?"

Response: We appreciate your feedback on Fig 6. This figure aims to illustrate temperature and DO variations under specific conditions. We have refined the figure and revised the description to better convey the observed relationships.

Line 522-528: "It is uncertain whether the fish pond in question has an installed oxygenation system. If not, this raises concerns about the study’s relevance and applicability."

Response: As previously noted, the pond utilized in this study does include an oxygenation system, allowing for real-time intervention based on DO predictions. This clarification has been added in the "Materials and Methods" section to underscore the study’s relevance to industry practices.

Line 530: "I do not see how the scatter plot in Fig 7 shows a positive correlation between Temp and DO. The plot appears to show two distinct data clusters."

Response: Thank you for this observation. We have refined Fig 7 for improved clarity and added explanations to address potential influences on temperature-DO interactions, providing context for the data clusters.

Line 538-539: "Here you mentioned the RF model’s inputs are temperature, pH, and turbidity, which contradicts previous statements. Given the high R² and low MSE, I believe the DO is also included in your input model. That is why the model performance is exceptionally good."

Response: We appreciate this comment. As previously addressed, DO is now exclusively an output variable in the Random Forest model. Performance metrics have been updated to reflect this change, eliminating potential overfitting or artificially high accuracy.

Line 562-565: "I do not understand Figure 9. What are the x and y axes representing?"

Response: We apologize for the lack of clarity. Figure 9’s axes have been relabeled, and the figure description now clarifies the relationships it intends to present for easier interpretation.

Table 3 and Figure 12: "These are not necessary and can be described in the text."

Response: Thank you for this suggestion. Table 3 and Figure 12 have been removed, with relevant information now incorporated directly into the text for conciseness.

Reviewer 2 Report

Comments and Suggestions for Authors

According to the authors' ideas, installed (cheap on-line) sensors are suitable for a quick and easy water quality assessment when monitoring fish cultures.

Suggestion: I believe there are 2 other simple sensors that could help in a more complete diagnosis...

conductivity? -> electrolytes buildup

Fluorescence? -> organic matter buildup

-------------

ln.41

"Internet of Things (IoT)" ?

I do not quite understand this "new" concept...

Does this refer to Web-search-based algorithms?

Please add more information...

-------------

lns.118-9

"Eligibility: After applying the inclusion and exclusion criteria, 25 studies

were selected that met the relevance and quality criteria."

Table 1 -> only presents 8 of these 25 selected works? why?

-------------

ln.136

"BPNN" ?

Does it refer to "Back Propagation Neural Network"?

Please specify...

-------------

ln.154

"LoRa" ?

Does it refer to "Long Range"?

Please specify...

-------------

lns.153-165

In this text the authors explain their concerns about "data transmission" and "web connectivity"...

For me, we can effectively interconnect a "local system" (for one specific fish culture plant) without the need to be accessible to other users on Web...

Why the need for "global access"?

Have you also considered "misappropriate use" (e.g. hackers)?

They could "easily compromise" fish culture with severe damage...

-------------

sec. "3. Materials and Methods"

Authors described sensors implementation, protection, inter-connectivity and data-flow...

For me there are at least 5 more aspects to consider:

a) sensor response check

b) representativeness of the measured signal (homogeneity of monitored water sample due to "stagnant" water pool)?

c) data synchronization and traceability -> in order to acquire a processable SIGNIFICANT DATA TABLE

d) in case of "MISSING DATA" (data measurement and/or communication errors)... How to deal with?

e) how to CALIBRATE the system? (relate fish growth and health with optimal culture conditions)?

-------------

lns.196-200

...authors report pH and DO calibration and check...

This process is "off line"

Is it possible to check sensors "at line"?

I believe that will be preferable for a "robust and more reliable system"

NOTE: "pH" and "DO" electrodes need periodic check and maintenance routines

in order to keep their reliability!

(other sensors also may require periodic verification...)

-------------

ln.202

"UML"?

Stands for... Unified Modeling Language?

-------------

lns.206-7

"Various studies have demonstrated the positive impact these technologies have had on real aquaculture projects."

> need to indicate respective REFERENCES here

-------------

eq(1)

In this equation (variable auto-scaling) X_min and X_max are used as minimal/maximal obtained values...

> in REAL-TIME monitoring it will be difficult to know X_min and X_max!

How to deal with NEW DATA (fresh on-line data) situation?
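One possible way to handle fresh on-line data is to compute X_min and X_max over a sliding window of recent readings. The sketch below assumes that Equation 1 is the usual min-max form (X - X_min) / (X_max - X_min); the class name, the window size, and the sample readings are hypothetical and do not come from the manuscript.

```python
from collections import deque


class RollingMinMaxScaler:
    """Min-max scaling whose bounds come from the last `window_size` readings."""

    def __init__(self, window_size):
        if window_size < 2:
            raise ValueError("window_size must be at least 2")
        # deque(maxlen=...) silently drops the oldest reading when full.
        self.window = deque(maxlen=window_size)

    def update(self, x):
        """Add a new reading and return it scaled to [0, 1] within the window."""
        self.window.append(x)
        lowest, highest = min(self.window), max(self.window)
        if highest == lowest:
            # All readings in the window are equal; avoid division by zero.
            return 0.0
        return (x - lowest) / (highest - lowest)


# Hypothetical stream of dissolved oxygen readings in mg/L.
scaler = RollingMinMaxScaler(window_size=3)
scaled = [scaler.update(reading) for reading in [5.0, 7.0, 6.0, 9.0]]
print(scaled)
```

Values scaled against different windows are not directly comparable with each other, so a model trained on one window may need fixed bounds instead, for example the physical operating range of each sensor.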

-------------

lns.251-2

"For Random Forest hyperparameter tuning, the number of trees (ntrees = 100) and a

maximum depth of 10 levels were used to find a balance between bias and variance."

> in terms of variables... you have 4 variables!

How many repeated combined-sensors do you have INSTALLED in your study?

This is CRUCIAL information! Otherwise you have nonsense results for Random Forest (RF)

-------------

lns.258-9

"predicting critical parameters such as dissolved oxygen and pH"

> How did you GET this information?

Based on "fish culture knowledge"?

With your own results?

How did you assess "fish growing" characteristics? (this is RELEVANT missing information)

-------------

lns.264-5

"To evaluate the performance of the Machine Learning models...mean squared error (MSE)"

> in fact MSE only reports "fitting errors"...

What about MODEL VALIDATION results?

Also... MSE, as a dimensional variable (with normalized variables), does not help to rationalize model performance

It is preferable to evaluate the RELATIVE Mean Squared Error! This RELATIVE value allows you to quickly diagnose your fitting error (as a percentage)

-------------

eq.(3)

"mean squared error (MSE)" for prediction?

Usually we compute the RELATIVE ROOT MEAN SQUARED PREDICTED ERROR (RRMSPE) in order to better understand your model's predicting ability!

-------------

lns.530-534 and Figure 7

a) it is well known the dependency of DO upon water Temperature!

It will also be dependent on other unconsidered factors (such as total dissolved solids, dissolved organics, pH, suspended microorganisms, light conditions, aeration conditions, currents, superficial films and gas permeation limitations, ...)

b) I do not understand why in Fig.7

> we only have results near 20 and 27 °C?

> DO values at 20 °C are in the same range as at 27 °C (the former should be higher than the latter...)

> where is the TEMPERATURE effect?

-------------

sections "5. Discussion" and "5. Conclusions" with same number?

Author Response

Comment 1:

Reviewer: Installed (cheap on-line) sensors are suitable for a quick and easiest water quality assessment for monitoring fish cultures. I believe there are two other simple sensors that could help in a more complete diagnosis: conductivity (for electrolytes buildup) and fluorescence (for organic matter buildup).

Response: Thank you for this valuable suggestion. We agree that incorporating sensors for conductivity and fluorescence would enhance the comprehensiveness of water quality monitoring by providing additional insights into electrolyte and organic matter buildup. Given the current study's resource constraints, these sensors were not included; however, we acknowledge their potential utility. In future work, we plan to expand the system to integrate conductivity and fluorescence sensors to further enhance water quality diagnostics. This consideration has been discussed in the Discussion section.

Comment 2:

Reviewer: "Internet of Things (IoT)"? I do not quite understand this "new" concept... Is this Web search based algorithms? Please add more information.

Response: Thank you for the observation. We recognize that the IoT concept may need additional clarity for some readers. IoT, as applied in this study, refers to interconnected devices that gather real-time data through sensors, which are then transmitted and processed to enable timely adjustments. This clarification has been added in the Introduction to improve understanding.

Comment 3:

Reviewer: "Eligibility: After applying the inclusion and exclusion criteria, 25 studies were selected that met the relevance and quality criteria." However, Table 1 only presents 13 of these 25 selected works. Why?

Response: Thank you for pointing out this discrepancy. Initially, 25 studies were shortlisted; however, only 13 met the specific criteria relevant to IoT, Machine Learning, and quantum algorithms in aquaculture applications. This selection process has been clarified in the Materials and Methods section to explain the final number of studies in Table 1.

Comment 4:

Reviewer: "BPNN"? Does it refer to "Back Propagation Neural Network"? Please specify.

Response: Thank you for this comment. Yes, BPNN indeed refers to "Back Propagation Neural Network." We have clarified this abbreviation in the Machine Learning Model Selection section for consistency and clarity.

Comment 5:

Reviewer: "LoRa"? Does it refer to "Long Range"? Please specify.

Response: Thank you for the suggestion. Yes, "LoRa" stands for "Long Range," which is a low-power, long-range communication protocol. This clarification has been added to the IoT System Architecture section to ensure accuracy.

Comment 6:

Reviewer: Authors explain their concerns about "data transmission" and "web connectivity." For me, a local system for one specific fish culture plant might suffice without the need for global access. Have you considered the risk of "misappropriate use" (e.g., hackers)?

Response: We appreciate your insight into security concerns and local system sufficiency. The global access feature aims to enable centralized monitoring and scalability across multiple locations. Regarding security, we are aware of the potential for misuse and have implemented secure data transmission protocols, which are discussed in the Materials and Methods section. Future iterations may incorporate additional encryption and access control measures to mitigate these risks.

Comment 7:

Reviewer: In "Materials and Methods," consider adding aspects like sensor response check, sample representativeness, data synchronization, handling missing data, and system calibration.

Response: Thank you for these valuable suggestions. We have added discussions on sensor response checks, sample representativeness, data synchronization, handling missing data, and calibration in the Materials and Methods section to address these critical aspects and ensure robust system performance.

Comment 8:

Reviewer: Authors report that pH and DO calibration is offline. Is it possible to check sensors online?

Response: Thank you for the comment. Currently, calibration is performed offline due to technical limitations. However, we recognize the advantages of online calibration for a more robust system. We have included this as a recommendation for future work in the Discussion section.

Comment 9:

Reviewer: "UML"? Stands for "Unified Modeling Language"?

Response: Correct, UML refers to "Unified Modeling Language." We have clarified this in the IoT System Architecture section.

Comment 10:

Reviewer: "Various studies have demonstrated the positive impact these technologies have had on real aquaculture projects." Indicate respective references here.

Response: Thank you for your attention to detail. We have added references to specific studies that demonstrate the positive impact of these technologies in the Literature Review section.

Comment 11:

Reviewer: In Equation 1, X_min and X_max are used as minimal/maximal values. In real-time monitoring, it’s challenging to know X_min and X_max. How do you handle fresh data?

Response: We agree with this point and acknowledge the challenge of dynamic X_min and X_max in real-time data. To address this, we have adopted a rolling normalization approach, as clarified in the Data Preprocessing section, which adjusts based on the latest data window to better manage variability.

Comment 12:

Reviewer: For Random Forest hyperparameter tuning, you have 4 variables. How many combined sensors were installed?

Response: Thank you for the question. We deployed four sensors per pond to capture temperature, DO, pH, and turbidity data, which allows sufficient data granularity for the Random Forest model. This information has been clarified in the Materials and Methods section.

Comment 13:

Reviewer: "Predicting critical parameters such as DO and pH" – How was this information obtained? Fish culture knowledge or own results?

Response: Thank you for the observation. The predictive insights are derived from both established aquaculture knowledge and our experimental results. This dual approach has been specified in the Results section to clarify the source of information.

Comment 14:

Reviewer: MSE only reports fitting errors. What about model validation results? Consider using Relative Mean Squared Error.

Response: We appreciate your suggestion for a more comprehensive validation. We have included Relative Mean Squared Error (RMSE) and other relevant metrics in the Model Evaluation and Metrics section to provide a balanced evaluation of model performance.

Comment 15:

Reviewer: In Eq. (3), the standard practice is to use Relative Root Mean Squared Prediction Error (RRMSPE).

Response: Thank you for this valuable input. We have adopted RRMSPE in the Model Evaluation and Metrics section to align with standard practices in predictive error assessment.

Comment 16:

Reviewer: DO dependence on water temperature is well-known. Why do Fig. 7 results show similar DO levels for 20°C and 27°C? Where is the temperature effect?

Response: Thank you for this observation. We believe that fluctuations in other unmonitored factors (e.g., microbial activity or aeration) may influence DO levels. This interpretation has been added in the Results section to contextualize the observed temperature-DO relationship.

Comment 17:

Reviewer: The "Discussion" and "Conclusions" sections have the same numbering.

Response: We apologize for this oversight. The numbering has been corrected to clearly distinguish the Discussion and Conclusions sections in the manuscript.

Reviewer 3 Report

Comments and Suggestions for Authors

I reviewed the work from the water quality perspective, which is my background. I leave the comments about IoT and machine learning to the other Referees.

· The novelty that the work brings with respect to the existing literature should be better clarified. If you type aquaculture and machine learning in Google Scholar, you can retrieve many works. How is your work different?

· In methods the reason why these parameters have been chosen to monitor water quality in aquaculture should be better detailed.

· In my opinion, being a research paper, the content of literature should be summarized and discussed in the Introduction before the part in which the novelty aspects are described.

Author Response

Comment: "It should be better clarified the novelty that the work can bring with respect to the existent literature. If you type aquaculture and machine learning in Google Scholar, you can retrieve many works. How is your work different?"

Response: Thank you for highlighting this critical point. In response, we have expanded the Introduction to underscore the unique contributions of our study in integrating IoT, machine learning, and quantum optimization (QAOA) to improve water quality management in aquaculture. While several studies have explored machine learning applications for water quality prediction, our work distinguishes itself by combining real-time data acquisition through IoT with quantum optimization, which significantly reduces model processing time and enhances predictive accuracy. This dual-layered approach not only enables more rapid response to changes in water quality but also provides a scalable solution adaptable to resource-limited aquaculture environments, addressing a gap in the existing literature.

Comment: "In methods, the reason why these parameters have been chosen to monitor water quality in aquaculture should be better detailed."

Response: We appreciate this observation and have revised the "Materials and Methods" section to provide a clearer justification for selecting specific parameters, such as temperature, dissolved oxygen (DO), pH, and turbidity, for monitoring water quality. Each of these parameters was chosen based on its critical impact on fish health and overall aquaculture productivity. Temperature and DO are directly linked to fish metabolism and survival, while pH and turbidity can significantly affect the aquatic environment and are indicators of potential pollution or organic matter buildup. Including these parameters ensures a comprehensive assessment of water quality conditions necessary for maintaining healthy aquaculture ecosystems. This expanded explanation aims to clarify the rationale behind parameter selection, reinforcing the study’s focus on practical, essential indicators for aquaculture management.

Comment: "In my opinion, being a research paper, the content of literature should be summarized and discussed in the Introduction before the part in which the novelty aspects are described."

Response: Thank you for this valuable recommendation. Following your advice, we have restructured the Introduction to provide a concise summary of existing literature on IoT and machine learning applications in aquaculture. This summary serves to contextualize our study within the broader field and highlight the specific innovations and advancements we introduce. By laying out previous research findings, we then transition smoothly into the discussion of our study’s novelty, particularly our integration of quantum optimization to enhance model efficiency. This revision ensures that readers can easily follow the progression from existing work to the unique contributions of our research.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

General Comments

I reviewed this article several weeks ago. The revised version has greatly improved and the authors have carefully rectified the previous flaw in the methodology. The current version is acceptable with minor revision. Below is the list of minor remarks I detected when reading the manuscript.

Line 108, Typo “This”

Line 434-435, You mentioned that higher temperatures are associated with lower DO levels but, in your Fig 7, it shows an opposite trend. Please clarify.

Line 466-472, If your intention is to show that turbidity and dissolved oxygen have no direct dependency, I suggest keeping the discussion but moving Fig 8 to the Supplementary Material. Fig 8 adds no value to your findings.

Line 480-481, Please rewrite the sentence and avoid using a negated statement, for clarity.

Reviewer 2 Report

Comments and Suggestions for Authors

Please pay attention to your text while introducing corrections.

This second version looks worse than the previous...

------------

Table 2

> it makes no real sense to report temperatures to the second decimal!

> according to "uncertainty report" conventions...

https://www.bipm.org/documents/20126/2071204/JCGM_GUM_6_2020.pdf

When reporting values with uncertainty...

a) uncertainty should be reported with 2 significant figures

b) central estimate should be rounded in accordance with the respective uncertainty

e.g. Temperature = 27.30 (1.95) -> 27.3 (2.0)

Diss. Oxygen = 5.85 (1.58) -> 5.9 (1.6)
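The two rules above can be sketched in a few lines. This is an illustrative helper, not part of the manuscript: the function name is hypothetical, values are passed as strings so that decimal arithmetic avoids binary floating-point surprises, and half-up rounding is assumed because it reproduces the two examples given.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_with_uncertainty(value, uncertainty, sig=2):
    """Round the uncertainty to `sig` significant figures and the value to match."""
    v, u = Decimal(value), Decimal(uncertainty)
    if u <= 0:
        raise ValueError("uncertainty must be positive")
    # Exponent of the last digit kept, e.g. 1.95 with sig=2 -> tenths (-1).
    exponent = u.adjusted() - (sig - 1)
    quantum = Decimal(1).scaleb(exponent)
    u_rounded = u.quantize(quantum, rounding=ROUND_HALF_UP)
    if u_rounded.adjusted() > u.adjusted():
        # Rounding added a leading digit (e.g. 0.996 -> 1.00); keep `sig` digits.
        quantum = Decimal(1).scaleb(exponent + 1)
        u_rounded = u.quantize(quantum, rounding=ROUND_HALF_UP)
    v_rounded = v.quantize(quantum, rounding=ROUND_HALF_UP)
    return str(v_rounded), str(u_rounded)


# The two examples quoted above.
temperature = round_with_uncertainty("27.30", "1.95")
dissolved_oxygen = round_with_uncertainty("5.85", "1.58")
print(temperature, dissolved_oxygen)
```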

----------

Equation 3. Mean Squared Error

𝑦̂𝑖 are the values predicted by the model -> "described" instead of predicted

(predicted values are more related with cross-validation)

----------

lns.412-421

> sequence of instructions with same "line number = 1"?

----------

Figure 7. Monthly variations in temperature and dissolved oxygen

> if you have at least one measurement of temperature by day...

Are you representing "monthly result" as a single dot?

Why not consider 95% confidence interval?

> same observation for DO!

----------

Figure 8. Inverse Relationship Between Monthly Temperature and Dissolved Oxygen Levels

> this graph DOES NOT demonstrate the authors' goals!

Please represent DAILY RESULTS in DO against respective Temperature!

I don't see the need for both figures 7 and 8!

----------

Figure 9. Scatter Plot of the Relationship Between Temperature and Dissolved Oxygen.

> NONSENSE relationship

----------

Figure 10. Actual vs. Predicted Values of Dissolved Oxygen.

> your model IS NOT PREDICTING real DO

----------

Figure 11. Correlation Between Turbidity and Dissolved Oxygen.

> NONSENSE relationship!

----------

Table 2. Performance Metrics of the Random Forest Model

Where is the corresponding table?

----------

Table 4. Model Performance

Average MSE DOES NOT indicate if you are in fact obtaining GOOD results!

> you have to report a RELATIVE VALUE (e.g. in %)

----------

Table 5. Model Performance on the Independent Test Set.

MSE DOES NOT indicate if you are in fact obtaining GOOD results!

> you have to report a RELATIVE VALUE (e.g. in %)

----------

Table 6. Fish Growth and Survival Rate.

> a single average value for "Average Weight" does not show much about the characterization of the respective population

----------

Table 7?

> why not fuse the information of Table 7 into Table 6?

(they look quite similar)

Author Response

Comment 1: "Please pay attention to your text while introducing corrections."

Response 1: Thank you for pointing this out. We have thoroughly reviewed the text to ensure that all corrections are accurately reflected. This includes improvements in wording and organization to maintain clarity and coherence in the document.

Comment 2: "This second version looks worse than the previous..."

Response 2: We sincerely appreciate your constructive feedback. We apologize if previous adjustments did not meet expectations and have carefully reviewed this version to improve clarity and accuracy. We hope this new revision effectively addresses all your observations.

Comment 3: "It makes no sense to report temperatures to the second decimal. According to 'uncertainty report' conventions... When reporting values with uncertainty, a) uncertainty should be reported with 2 significant figures and b) the central estimate should be rounded in accordance with the respective uncertainty."

Response 3: Thank you for your observation regarding the presentation of temperatures in Table 2. We have adjusted the values to reflect the recommended convention, rounding the central value and reporting the uncertainty with two significant figures. This adjustment contributes to a clearer and more accurate presentation of the data, aligned with best practices.

Comment 4: "𝑦̂𝑖 are the values predicted by the model -> 'described' instead of predicted (predicted values are more related to cross-validation)."

Response 4: We appreciate your suggestion. We have changed the term "predicted" to "described" in the description of Equation 3 to more accurately reflect the context in which it is used. This change ensures more appropriate terminology in the document.

Comment 5: "lns.412-421 > sequence of instructions with same 'line number = 1'?"

Response 5: Thank you for pointing out this inconsistency. We have corrected the numbering in the sequence.

Comment 6: "If you have at least one measurement of temperature per day... Are you representing the 'monthly result' as a single dot? Why not consider a 95% confidence interval? Same observation for DO."

Response 6: Thank you for your observation. In Figure 7, we have opted to represent daily averages of temperature and dissolved oxygen (DO) as individual points, highlighting the relationship between these two parameters on a daily scale for better visualization of the inverse trend. Additionally, we have added a 95% confidence interval to provide greater precision in data interpretation.
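As an illustration of how a 95% confidence interval for a daily mean could be obtained, a minimal sketch follows. It is not the authors' procedure: the function name and the sample readings are hypothetical, and a normal approximation is assumed. With only a few readings per day, a Student t quantile would give a wider and more appropriate interval.

```python
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(values, level=0.95):
    """Return (lower, mean, upper) using a normal approximation."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two readings are needed")
    centre = mean(values)
    # Standard error of the mean from the sample standard deviation.
    standard_error = stdev(values) / n ** 0.5
    # Two-sided quantile, about 1.96 for a 95% level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * standard_error
    return centre - half_width, centre, centre + half_width


# Hypothetical dissolved oxygen readings (mg/L) taken on one day.
lower, centre, upper = mean_confidence_interval([5.0, 6.0, 7.0])
print(f"{centre:.2f} mg/L, 95% CI [{lower:.2f}, {upper:.2f}]")
```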

Comment 7: "This graph DOES NOT demonstrate the authors' goals! Please represent DAILY RESULTS in DO against respective Temperature!"

Response 7: We appreciate your observation and have adjusted Figure 7 to show the daily relationship between temperature and dissolved oxygen (DO), using daily averages for a more detailed representation of daily variability. This modification more accurately reflects the dynamics between these parameters and better supports the objectives of our study.

Comment 8: "I don't see the need for both Figures 7 and 8."

Response 8: Thank you for your suggestion. After review, we have combined the figures to optimize the presentation of the results, ensuring that relevant information is clearly and concisely displayed in a single figure.

Comment 9: "NONSENSE relationship."

Response 9: Thank you for your observation. We have reviewed Figure 9 to ensure that the relationship between the parameters accurately reflects the observed data and aligns with the study’s objectives. The values presented correspond to a detailed analysis of significant correlations, as shown in the heatmap, which reinforces the relevance of each variable in water quality.

Comment 10: "Your model IS NOT PREDICTING real DO."

Response 10: Thank you for highlighting this. We have carefully reviewed the presentation of this figure and clarified in the text that it reflects model validation against test data rather than an actual "prediction." We have adjusted the legend to accurately convey that the model output was validated by comparing test data, aligning with the study’s objectives.

Comment 11: "NONSENSE relationship!"

Response 11: Thank you for your observation. We re-evaluated Figure 8 (formerly Figure 11) and adjusted its representation to more accurately reflect the relevant correlations between turbidity and dissolved oxygen, according to the observed data.

Comment 12: "Where is the corresponding table?"

Response 12: Thank you for pointing this out. We have included Table 4 in the corresponding section, detailing the performance metrics of the Random Forest model for a clear and complete presentation of the results.

Comment 13: "Average MSE DOES NOT indicate if you are in fact obtaining GOOD results! You have to report a RELATIVE VALUE (e.g., in %)."

Response 13: Thank you for your recommendation. We have included a relative error value (in %) in Table 4 to provide a more comprehensive assessment of model performance in practical terms and facilitate interpretation of the results.

Comment 14: "MSE DOES NOT indicate if you are in fact obtaining GOOD results! You have to report a RELATIVE VALUE (e.g., in %)."

Response 14: Thank you for your comment. In Table 5, we have added the relative error (in %) to enhance the interpretation of results, thus providing a better reference for model performance in terms of relative accuracy.

Comment 15: "A single average value for 'Average Weight' does not show much about respective population characterization."

Response 15: Thank you for your observation. We have broken down the information in Table 6 to include standard deviation values and weight ranges, which allows for a more detailed characterization of the population and provides a better understanding of variability within the study group.

Comment 16: "Why not fuse information from Table 7 into Table 6 (they look quite similar)?"

Response 16: Thank you for the suggestion. We have merged the information from Tables 6 and 7 to consolidate the data on fish growth and survival rates, optimizing the presentation and eliminating redundancies in the document.

Reviewer 3 Report

Comments and Suggestions for Authors

Thank you to the authors for their improvements to the manuscript. I think that it can now be published.

Author Response

Comment 1: Thank you to the authors for their improvements to the manuscript. I think that now it can be published.

Response 1: Thank you for your positive feedback and recommendation for publication. We appreciate the time and effort you have dedicated to reviewing our manuscript, as well as your valuable suggestions, which have significantly enhanced the quality of our work. We have carefully implemented all the suggested improvements and ensured that the manuscript meets the required standards for publication.

Round 3

Reviewer 2 Report

Comments and Suggestions for Authors

lns.427-428

"Figure 7 illustrates this daily inverse relationship between temperature and DO"

> I do not agree with this comment!

> Fig.7 only shows 6 scattered points with an increasing tendency

------------------

Figure 7. Daily relationship between temperature and dissolved oxygen.

> I don't see "confidence intervals"...

(this figure looks the same as before)
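The request for confidence intervals on a six-point scatter can be illustrated with a minimal least-squares sketch. The temperature and DO values below are hypothetical placeholders, not data from the manuscript, and the function name `slope_with_ci` is an assumption introduced here. The critical value 2.776 is the two-sided 95% Student t quantile for n − 2 = 4 degrees of freedom.

```python
import math

def slope_with_ci(x, y, t_crit):
    """Ordinary least-squares slope, its standard error, and a confidence interval.

    t_crit is the two-sided Student t critical value for len(x) - 2 degrees of freedom.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Standard error of the slope: sqrt(residual variance / Sxx)
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se, (slope - t_crit * se, slope + t_crit * se)

# HYPOTHETICAL values for illustration only (not taken from Figure 7)
temperature_c = [24.0, 25.0, 26.0, 27.0, 28.0, 29.0]
do_mg_l = [5.1, 5.6, 5.2, 5.9, 5.7, 6.3]

T_CRIT_95_DF4 = 2.776  # two-sided 95% critical value, 4 degrees of freedom
slope, se, ci = slope_with_ci(temperature_c, do_mg_l, T_CRIT_95_DF4)

# The slope is significant at the 5% level only if the interval excludes zero
significant = not (ci[0] <= 0.0 <= ci[1])
print(f"slope = {slope:.3f} +/- {se:.3f} mg/L per degC, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
print("slope differs from zero at the 5% level:", significant)
```

With so few points the interval is typically wide, so the sign of the slope alone may not support a claim of either a direct or an inverse relationship.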

------------------

Figure 8. Revised analysis of the correlation between turbidity and dissolved oxygen.

> It's very difficult to see a relation between turbidity and DO

> Why are there no more data points?

------------------------------------------------------------------------------------------

Things to do:

1. Instead of "mean squared error (MSE)" you have to calculate "root mean squared error (RMSE)" to have an idea of model fitting capability (this expression is similar to residual mean squared error)

2. To use RMSE as a FITTING DIAGNOSTIC you have to SCALE it and convert it into a percentage (%) [RELATIVE RMSE = RRMSE]

(e.g. see equation 63b in https://www.sciencedirect.com/topics/engineering/root-mean-square-error)

3. To VALIDATE your model you have to evaluate k-Fold resampling and perform PREDICTION using [RELATIVE RMS PREDICTED ERROR = RRMSPE]

4. Pay attention to perform your corrections in the TEXT ALSO!

Review your text dealing with "Performance of the Random Forest Model"
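The metrics requested in items 1 to 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: RMSE is scaled by the mean of the observed values (one common convention; the equation the reviewer cites may normalize differently), a straight-line least-squares fit stands in for the Random Forest model, the folds are interleaved without shuffling, and all data values and function names are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error, in the same units as the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def relative_rmse(observed, predicted):
    """RMSE scaled by the mean observation, in percent (assumed convention)."""
    mean_obs = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_obs

def fit_line(x, y):
    """Least-squares straight line; a simple stand-in for the real model."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def kfold_rrmspe(x, y, k):
    """Relative RMS prediction error from k-fold cross-validation.

    Each fold is held out once; predictions for held-out points come from a
    model fitted on the remaining folds, then all held-out errors are pooled.
    """
    n = len(x)
    held_obs, held_pred = [], []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # interleaved folds, no shuffling
        train_x = [x[i] for i in range(n) if i not in test_idx]
        train_y = [y[i] for i in range(n) if i not in test_idx]
        intercept, slope = fit_line(train_x, train_y)
        for i in sorted(test_idx):
            held_obs.append(y[i])
            held_pred.append(intercept + slope * x[i])
    return relative_rmse(held_obs, held_pred)

# HYPOTHETICAL predictor and dissolved-oxygen values (mg/L), for illustration only
predictor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
do_mg_l = [3.1, 3.4, 4.2, 4.1, 5.0, 5.3, 5.9, 6.4, 6.6, 7.3]

intercept, slope = fit_line(predictor, do_mg_l)
fitted = [intercept + slope * xi for xi in predictor]
print(f"RMSE   = {rmse(do_mg_l, fitted):.3f} mg/L")
print(f"RRMSE  = {relative_rmse(do_mg_l, fitted):.2f} %")
print(f"RRMSPE = {kfold_rrmspe(predictor, do_mg_l, k=5):.2f} % (5-fold)")
```

RRMSE here describes how well the model fits the data it was trained on, whereas RRMSPE is computed only from points the model did not see during fitting, which is why the reviewer treats it as the validation figure.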

------------------------------------------------------------------------------------------

Tables 4, 5 and 6

> remove MSE (this is not useful)

Table 5

> IF in Table 4 RMSE ≃ 0.744 mg/L and DO varies from 2-8 (Table 3) I don't see how %RRMSE can be ≃ 10⁻⁶!

THIS HAS TO BE WRONG!
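The order of magnitude behind this objection can be checked with a few lines of arithmetic. The sketch assumes RRMSE is the RMSE divided by a reference DO value (the mean, or alternatively the range) and expressed in percent; the exact normalization used in the manuscript is not stated here. Since the mean DO must lie between the reported 2 and 8 mg/L, the two extremes bound the result.

```python
def rrmse_percent(rmse, reference):
    """RMSE expressed as a percentage of a reference value (mean or range)."""
    return 100.0 * rmse / reference

RMSE_MG_L = 0.744          # value quoted by the reviewer from Table 4
DO_MIN, DO_MAX = 2.0, 8.0  # DO range quoted by the reviewer from Table 3

# Any mean DO inside [2, 8] mg/L gives an RRMSE between these two bounds
lower_bound = rrmse_percent(RMSE_MG_L, DO_MAX)          # about 9.3 %
upper_bound = rrmse_percent(RMSE_MG_L, DO_MIN)          # about 37.2 %
# Alternative convention: normalize by the range instead of the mean
by_range = rrmse_percent(RMSE_MG_L, DO_MAX - DO_MIN)    # about 12.4 %

print(f"RRMSE (mean-normalized) lies between {lower_bound:.1f} % and {upper_bound:.1f} %")
print(f"RRMSE (range-normalized) = {by_range:.1f} %")
```

Under either convention the result is of order 10%, roughly seven orders of magnitude above the reported 10⁻⁶.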

Author Response

Comments 1:
"'Figure 7 illustrates this daily inverse relationship between temperature and DO.' I do not agree with this comment! Fig. 7 only shows 6 scattered points with an increasing tendency."

Response 1:
Thank you for your observation regarding the representation in Figure 7. We have reviewed and updated the figure to better illustrate the general daily trend between temperature and dissolved oxygen, now including 95% confidence intervals. The section "Daily Relationship between Temperature and Dissolved Oxygen" has been modified to reflect this update in the figure and to enhance the visual interpretation of the data.

Comments 2:
"I don't see 'confidence intervals'... (this figure looks the same as before)"

Response 2:
Thank you for pointing this out. We have incorporated 95% confidence intervals in Figure 7 as suggested, to provide a more comprehensive representation of the relationship between temperature and dissolved oxygen. This correction is reflected in the "Daily Relationship between Temperature and Dissolved Oxygen" section.

Comments 3:
"It's very difficult to see a relation between turbidity and DO. Why are there no more data points?"

Response 3:
Thank you for your comment. We have adjusted Figure 8 to include a larger number of data points, providing a more detailed visualization of the relationship between turbidity and dissolved oxygen. As indicated in the section "Revised Analysis of the Correlation between Turbidity and Dissolved Oxygen," this new visualization confirms that there is no evident direct correlation between these variables, suggesting they may be influenced independently by different environmental factors.

Comments 4:
"Instead of 'mean squared error (MSE)' you have to calculate 'root mean squared error (RMSE)'... To use RMSE as a FITTING DIAGNOSTIC you have to SCALE it and convert it into a percentage (%) [RELATIVE RMSE = RRMSE]... To VALIDATE your model you have to evaluate k-Fold resampling and perform PREDICTION using [RELATIVE RMS PREDICTED ERROR = RRMSPE]"

Response 4:
We sincerely appreciate your recommendations regarding the use of performance metrics for model fitting and validation. We have replaced the use of MSE with RMSE throughout the analysis to more accurately reflect the model's fitting capability. Additionally, we have scaled the RMSE to a percentage to obtain the RRMSE, and we calculated the RRMSPE to assess predictive performance through K-fold cross-validation, as you suggested. These corrections are detailed in the "Performance of the Random Forest Model" section to improve interpretation and model fitting diagnosis.

Comments 5:
"remove MSE (this is not useful)"

Response 5:
Thank you for this suggestion. We have removed MSE from Tables 4, 5, and 6, as we have replaced its use with RMSE and its relative variants (RRMSE and RRMSPE) for a more effective assessment of model performance, as suggested.

Comments 6:
"IF in Table 4 RMSE ≃ 0.744 mg/L and DO varies from 2-8 (Table 3) I don't see how %RRMSE can be ≃ 10⁻⁶! THIS HAS TO BE WRONG!"

Response 6:
Thank you for your attention to this detail. We reviewed our calculations and made the necessary corrections to ensure the accuracy of the % RRMSE values reported in Table 5. The values now properly reflect the range of dissolved oxygen variation and align with the expected model performance metrics. These adjustments are documented in the "Model Validation through Cross-Validation and Independent Testing" section for greater clarity.

Round 4

Reviewer 2 Report

Comments and Suggestions for Authors

lns.223-247

> mixed, confusing text!

lns.330-331

> Eq. (1) is repeated?

lns.434-435

With respect to Fig. 7, the authors comment...

"This pattern reflects how higher temperatures may be associated with lower dissolved oxygen levels,..."

> Fig. 7 shows an increase in DO with temperature!

Based on Fig. 7, this comment IS FALSE!

You have to assume that your DO determinations are NOT CORRECT or that the sampled water IS NOT AT EQUILIBRIUM!

Fig. 8

Do you know what you are actually measuring?

This figure is not giving correct values (these linear dependencies are very suspicious)
