Article
Peer-Review Record

Case Study for Predicting Failures in Water Supply Networks Using Neural Networks

Water 2024, 16(10), 1455; https://doi.org/10.3390/w16101455
by Viviano de Sousa Medeiros 1, Moisés Dantas dos Santos 2,* and Alisson Vasconcelos Brito 3
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 23 March 2024 / Revised: 1 May 2024 / Accepted: 9 May 2024 / Published: 20 May 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

1. Avoid using "we" wherever possible. Use the impersonal third person instead.

2. The target and objectives of the research are mentioned several times in different parts of the manuscript. 

3. Source of data and the selected model are mentioned several times.

4. Necessary equations for error indices and their range of variation are not given.

5. In the References section, title of the references and journal names should be typed uniformly.

 

Comments for author File: Comments.pdf

Comments on the Quality of English Language

There are a few places where the difference between Brazilian composition and English writing is visible. However, this is natural and not a problem.

Author Response

Reviewer suggestions:

  1. Avoid using "we" wherever possible. Use the impersonal third person instead.
    1. Response: Following the Reviewer's guidance, the entire text was revised and all requested improvements were met. Special attention was paid to those parts of the manuscript where the use of third-person mode was requested. All text has been corrected.
  2. The target and objectives of the research are mentioned several times in different parts of the manuscript. 
    1. Response:  As in the previous item, all of the reviewer's suggestions regarding this topic were implemented in the text.
  3. Source of data and the selected model are mentioned several times.
    1. Response: As in the previous item, all of the reviewer's suggestions regarding this topic were implemented in the text.
  4. Necessary equations for error indices and their range of variation are not given.
    1. Response: The equations for all error metrics used in the manuscript are now presented at the end of the Materials and Methods section, starting at line 255.
  5. In the References section, title of the references and journal names should be typed uniformly.
    1. Response: All bibliographic references were reviewed and those highlighted by the Reviewer were corrected according to what was suggested.
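
The error indices requested in item 4 can be illustrated with a short sketch. This record does not name the metrics the manuscript uses, so the three below (MAE, RMSE and R²) are assumed here as common choices for a regression target such as days between failures. The function names and sample values are hypothetical.

```python
import math

# Assumed metrics: this record does not list the ones used in the manuscript.

def mae(y_true, y_pred):
    """Mean absolute error. Range: [0, inf); 0 is a perfect fit."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error. Range: [0, inf); 0 is a perfect fit."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination. Range: (-inf, 1]; 1 is a perfect fit."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical days between failures: observed vs. predicted.
observed = [10, 20, 30, 40]
predicted = [12, 18, 33, 37]

print(mae(observed, predicted))
print(rmse(observed, predicted))
print(r2(observed, predicted))
```

MAE and RMSE are expressed in the units of the target (days), while R² is dimensionless, which is why stating the range of each index alongside its equation helps readers interpret reported values.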

Reviewer 2 Report

Comments and Suggestions for Authors

 

This paper presents a case study on predicting failures in water supply networks using a machine learning framework. Despite the significance of the addressed topic, I believe there are some important flaws in the current work. If I understand correctly, this work is limited to a specific location and focuses solely on predicting the number of days between two consecutive failures based on location and variables associated with failure frequency. Could the author justify these limitations? Given that all points along water distribution networks (WDNs) are susceptible to failures, it seems plausible that incorporating other variables such as water flow, demand, and time of day and year could enhance the performance of the framework.

In the discussion, I would suggest adding comments on the potential integration of machine learning approaches with physical models, introducing new features relevant to WDNs simulation (see, for example, “The extension of EPANET source code to simulate unsteady flow in water distribution networks with variable head tanks”). This approach could address the issues mentioned in lines 41-42-43 regarding “high-dimensional and highly representative data.”

I recommend adopting a more formal style. For instance, replacing “we begin with a brief introduction” with “Section 1 is devoted to…” and “Observing this figure” with “Observing Figure 4.”

Please check the reference in line 242 (see?).

It would be beneficial to list the variables utilized in the presented machine learning framework in a table.

Lines 263-265 in the “Results and Discussion” section appear to repeat information already stated in the introduction.

The comments and descriptions regarding Figure 7, which is the primary outcome of this work, are unclear. It's not evident whether an increase in the number of records results in a decrease in errors. Could the author provide a clearer explanation?

Could the author better clarify how the error is computed?

 

Comments on the Quality of English Language

The style adopted sometimes seems informal.

Author Response

Reviewer suggestions:

  1. This paper presents a case study on predicting failures in water supply networks using a machine learning framework. Despite the significance of the addressed topic, I believe there are some important flaws in the current work. If I understand correctly, this work is limited to a specific location and focuses solely on predicting the number of days between two consecutive failures based on location and variables associated with failure frequency. Could the author justify these limitations?

    1. Response: These limitations occur due to the nature of this work: this research is applied to the specific scenario provided by CAGEPA, that is, the company indicated the objective of the research and provided the possible data to be used. Therefore, the Company needs to predict the occurrence of future failures, and provides all the data used by the predictive model.
  2. Given that all points along water distribution networks (WDNs) are susceptible to failures, it seems plausible that incorporating other variables such as water flow, demand, and time of day and year could enhance the performance of the framework.

     

    1. Response: The data used in the research is that provided by the company. Water flow data and the time of day at which each failure occurred are not currently available. On the other hand, the time of year in which the failure occurred is relevant information and will be taken into account in future work.
  3. In the discussion, I would suggest adding comments on the potential integration of machine learning approaches with physical models, introducing new features relevant to WDNs simulation (see, for example, “The extension of EPANET source code to simulate unsteady flow in water distribution networks with variable head tanks”). This approach could address the issues mentioned in lines 41-42-43 regarding “high-dimensional and highly representative data.”

    1. Response: The use of flow simulation in an EPANET environment is a very interesting approach to be considered. This approach is present in the planning of future work, however, to date this type of information has not been provided by the company.
  4. I recommend adopting a more formal style. For instance, replacing “we begin with a brief introduction” with “Section 1 is devoted to…” and “Observing this figure” with “Observing Figure 4.”

     

    1. Response: Following the Reviewer's guidance, the entire text was revised and all requested improvements were met. Special attention was paid to those parts of the manuscript where the use of third-person mode was requested. All text has been corrected.
  5. Please check the reference in line 242 (see?).

     

    1. Response: This suggestion was implemented. The reference was corrected and can be viewed in the text.
  6. It would be beneficial to list the variables utilized in the presented machine learning framework in a table.

     

    1. Response: A table containing the list of variables used by the predictive model was created and can be found in the text starting at line 199.
  7. Lines 263-265 in the “Results and Discussion” section appear to repeat information already stated in the introduction.

     

    1. Response: The indicated text was removed from the manuscript.
  8. The comments and descriptions regarding Figure 7, which is the primary outcome of this work, are unclear.

    1. Response: The main result of this work is the construction of the predictive model itself. The information present in the aforementioned description aims to show the performance of the proposed model.
  9. It's not evident whether an increase in the number of records results in a decrease in errors. Could the author provide a clearer explanation?

     

    1. Response: The author's intention was to indicate that an increase in the number of records has the potential to make the model more accurate, since a greater number of samples would be used in the model training and validation process. To date, no further data (more samples and variables) has been provided by the company.
  10. Could the author better clarify how the error is computed?

    1. Response: The equations for all error metrics used in the manuscript are now detailed at the end of the Materials and Methods section, starting at line 255.

Reviewer 3 Report

Comments and Suggestions for Authors

Comments on “Case study for predicting failures in water supply networks using neural networks.” My comments may reflect my ignorance, but hopefully they may help the authors address issues that will improve the quality of this paper.

This is a very well-written article reporting on the use of Neural Networks for predicting the days between failures at specified locations on a water supply network. The methodology presented considers many more factors than many other approaches and claims to accurately predict when and where the next failure will occur. For example, the first sentence in the conclusions (line 341) states: “Given the results presented, we can conclude that it is possible to predict in advance the occurrence of failures in water supply networks using data from the network itself, focusing on the historical record of past failures.” The conclusion I draw from this paper is that the methodology has the potential to make better predictions of the frequency and location of failures, but that they are still random. I would think it would be of value to any utility to have this method help them define the probability of a failure within any specified number of days at any specified site on the network.
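
The idea of reporting a probability of failure within a specified number of days can be sketched empirically from historical intervals between failures at a single site. This is only an illustration of the concept, not the authors' neural network model. The function name and the interval values are hypothetical.

```python
def prob_failure_within(intervals_days, horizon_days):
    """Empirical probability that the next failure occurs within horizon_days,
    estimated as the fraction of past inter-failure intervals that were
    no longer than the horizon."""
    if not intervals_days:
        raise ValueError("at least one observed interval is required")
    hits = sum(1 for d in intervals_days if d <= horizon_days)
    return hits / len(intervals_days)

# Hypothetical days between consecutive failures at one network location.
site_intervals = [5, 12, 30, 45, 60, 90, 120, 7]

print(prob_failure_within(site_intervals, 30))
```

With only a handful of intervals per site, such an estimate is coarse. A predictive model could in principle refine it by conditioning on location and other recorded variables.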

Other questions I have include:  a) Why linear regression?   Would non-linear regression be better?   b)  What is a failure?   Does the magnitude of failure influence the results you obtained? Will or could your method predict days before failure depending on its magnitude?    c)  Is this proposed method practical?   Is it an approach utility staff can understand and implement?  Is it being implemented by CAGEPA?    

I find the legends on the axes of your Figures 4, 5, and 6 hard to read and it is hard to see what is in the figure.   I strongly suggest making these figures easier to understand but commend you on their descriptions in the text.  

 

Again, nice work.  

 

Author Response

Reviewer suggestions:

Comments on “Case study for predicting failures in water supply networks using neural networks.”     My comments may reflect my ignorance, but hopefully they may help the authors address issues that will improve the quality of this paper.    

This is a very well-written article reporting on the use of Neural Networks for predicting the days between failures at specified locations on a water supply network. The methodology presented considers many more factors than many other approaches and claims to accurately predict when and where the next failure will occur. For example, the first sentence in the conclusions (line 341) states: “Given the results presented, we can conclude that it is possible to predict in advance the occurrence of failures in water supply networks using data from the network itself, focusing on the historical record of past failures.” The conclusion I draw from this paper is that the methodology has the potential of making better predictions of the frequency and location of failures but that they are still random. I would think it would be of value to any utility to have this method help them define the probability of a failure within any specified number of days at any specified site on the network.

Other questions I have include:

  1. a) Why linear regression?   Would non-linear regression be better?  

    1. Response: Linear regression was used only as a comparative criterion, in relation to the MLP model, which is the core of this work. For the current research period, a non-linear regression model was not used, but the implementation of this model is part of the planning for future work.
  2. b) What is a failure?   Does the magnitude of failure influence the results you obtained? Will or could your method predict days before failure depending on its magnitude?  

    1. Response: The definition of a failure is given in the first paragraph of the Problem Definition section, which states: "A failure is understood as any occurrence recorded in the water supply network related to leaks, indicating a potential risk of temporary water shortage (water outage)". Regarding the influence of the magnitude of the failure, it is important to highlight that the results of this research do not take the magnitude of the failure into account. Failures of different magnitudes result in the same problem (supply interruption). Thus, this work is independent of the magnitude of the failures, because any fault, large or small, must be repaired, and the repair process involves interrupting the supply.
  3. c) Is this proposed method practical?   Is it an approach utility staff can understand and implement? Is it being implemented by CAGEPA?  

     

    1. Response: Yes, the proposed method is practical. This project was developed in the context of an institutional partnership between the Federal University of Paraíba and CAGEPA. The research problem was therefore defined by the company, which also provided the data. The results obtained will be used by the company, which will apply them to its particular problems with the purpose of optimizing processes and costs.
  4. I find the legends on the axes of your Figures 4, 5, and 6 hard to read and it is hard to see what is in the figure.   I strongly suggest making these figures easier to understand but commend you on their descriptions in the text.  

    1. Response: The figures cited were updated and their captions corrected according to the Reviewer's suggestions, except Figure 6 which was removed from the manuscript.

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

 

The authors should ensure the reproducibility of the results. I therefore recommend integrating some of the authors' responses into the paper, with the necessary rephrasing:

For instance:

Introduction or conclusion:
“These limitations occur due to the nature of this work: this research is applied to the specific scenario provided by CAGEPA, that is, the company indicated the objective of the research and provided the possible data to be used. Therefore, the Company needs to predict the occurrence of future failures, and provides all the data used by the predictive model.”

Conclusion or discussion:
“The use of flow simulation in an EPANET environment is a very interesting approach to be considered. This approach is present in the planning of future work, however, to date this type of information has not been provided by the company.”

Conclusion or discussion

“The author's intention was to indicate that an increase in the number of records has the potential to make the model more accurate, since a greater number of samples would be used in the model training and validation process. To date, no further data has been provided by the company”

Author Response

Suggestions:

Introduction or conclusion:
“These limitations occur due to the nature of this work: this research is applied to the specific scenario provided by CAGEPA, that is, the company indicated the objective of the research and provided the possible data to be used. Therefore, the Company needs to predict the occurrence of future failures, and provides all the data used by the predictive model.”

RESPONSE: This suggestion was implemented. The updated text is highlighted in bold on lines 132 to 137.

Conclusion or discussion:
“The use of flow simulation in an EPANET environment is a very interesting approach to be considered. This approach is present in the planning of future work, however, to date this type of information has not been provided by the company.”

RESPONSE: This suggestion was implemented. The updated text is highlighted in bold on lines 368 to 372.

Conclusion or discussion

“The author's intention was to indicate that an increase in the number of records has the potential to make the model more accurate, since a greater number of samples would be used in the model training and validation process. To date, no further data has been provided by the company”

RESPONSE: This suggestion was implemented. The updated text is highlighted in bold on lines 374 to 378.

Round 3

Reviewer 2 Report

Comments and Suggestions for Authors

The paper is now ready for publication in Water.
