Article
Peer-Review Record

Uncertainty Quantification in Machine Learning Modeling for Multi-Step Time Series Forecasting: Example of Recurrent Neural Networks in Discharge Simulations

Water 2020, 12(3), 912; https://doi.org/10.3390/w12030912
by Tianyu Song, Wei Ding *, Haixing Liu, Jian Wu, Huicheng Zhou and Jinggang Chu
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 8 February 2020 / Revised: 13 March 2020 / Accepted: 22 March 2020 / Published: 23 March 2020
(This article belongs to the Section Hydrology)

Round 1

Reviewer 1 Report

This paper proposes a framework, based on analysis of variance (ANOVA) theory, to quantify the uncertainty contributions of sample sets, the machine learning (ML) approach, and the ML architecture when ML is used for multi-step time series forecasting. The authors use five sample sets, two ML approaches (recurrent neural network (RNN) and long short-term memory network (LSTM)), and two ML architectures, applied to discharge simulations.

You must explain the subsampling approach much more clearly; it is not clear at all in the text.

You must explain much more clearly the indices used to quantify the contribution of each element in your study, i.e., Equations 14-18. Define each equation explicitly, or at least one of them, and especially Equation (18).
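For reference, the kind of explicit definition requested here can be illustrated with a generic ANOVA variance decomposition. This is only a minimal sketch and not the manuscript's Equations 14-18, which are not reproduced in this record. It assumes a full factorial design with exactly one score per combination of sample set, ML approach and architecture, and it lumps all interaction terms into one remainder. The function name `anova_contributions`, the factor names and the demo effects are hypothetical.

```python
from itertools import product
from statistics import mean


def anova_contributions(values, levels, names):
    """Share of the total sum of squares explained by each factor.

    values: dict mapping a cell index tuple, e.g. (i, j, k), to one score.
    levels: number of levels per factor, e.g. (5, 2, 2).
    names:  one label per factor (hypothetical labels in the demo below).
    Assumes the scores are not all identical (total sum of squares > 0).
    """
    cells = list(product(*(range(n) for n in levels)))
    grand = mean(values[c] for c in cells)
    total_ss = sum((values[c] - grand) ** 2 for c in cells)

    shares = {}
    main_ss = 0.0
    for axis, n_levels in enumerate(levels):
        cells_per_level = len(cells) // n_levels
        ss = 0.0
        for level in range(n_levels):
            level_mean = mean(values[c] for c in cells if c[axis] == level)
            ss += cells_per_level * (level_mean - grand) ** 2
        shares[names[axis]] = ss / total_ss
        main_ss += ss

    # Everything not explained by main effects is treated as interaction.
    shares["interactions"] = (total_ss - main_ss) / total_ss
    return shares


# Hypothetical, purely additive effects: 5 sample sets, 2 approaches, 2 architectures.
set_effect = [0.0, 1.0, 2.0, 3.0, 4.0]
approach_effect = [0.0, 2.0]
architecture_effect = [0.0, 1.0]
demo_values = {
    (i, j, k): set_effect[i] + approach_effect[j] + architecture_effect[k]
    for i, j, k in product(range(5), range(2), range(2))
}
demo_shares = anova_contributions(
    demo_values, (5, 2, 2), ("sample_set", "approach", "architecture")
)
```

Because the demo effects are additive, the interaction share is zero up to rounding and the three main-effect shares sum to one, which makes the meaning of each index easy to check by hand.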

I understand that when you define the architectures used in the study you explain only the recurrent layers, but the paper must describe the actual ML architecture in full for both architectures, i.e., how many layers, what types of layers, and so on. You must also specify the networks when you define the case study, i.e., in Section 3.2.

You must explain the results much more clearly, especially Section 4.4 and the graphics presented there; that section is very dense and difficult to follow.

Author Response

Please kindly see the attached.

Author Response File: Author Response.pdf

Reviewer 2 Report

1. Distinction among the machine learning architectures 

   The authors present two LSTM architectures in Section 2.3. The reviewer could not tell which one, if either, is the authors’ new contribution for improving machine learning performance. It would be better to indicate whether both architectures are the authors’ own ideas.

 

2. Uncertainty budgets

   In every uncertainty estimation, uncertainty budgets are presented to indicate which uncertainty factors are important in evaluating uncertainties. However, the uncertainty budgets are not explained, so the reviewer could not identify the main cause of uncertainty in the machine learning algorithms. A guideline for evaluating measurement uncertainty (ISO/IEC Guide 98-3 Part 3) might be helpful for understanding the meaning of uncertainty budgets.

Author Response

Please kindly see the attached.

Author Response File: Author Response.pdf

Reviewer 3 Report

The manuscript, “Uncertainty Quantification in Machine Learning Modeling for Multi-step Time Series Forecasting: Example of Recurrent Neural Networks in Discharge Simulations”, presents several ML methods for forecasting flood events and compares their uncertainty.

 

A good study, but it lacks any detail on the data used and on how the models were implemented, e.g., time steps, data vector lengths, etc.

 

The paper needs to delete most of the general discussion of ML algorithms and models; these descriptions are available elsewhere.

 

Needs a good English language editor.

 

While the study is interesting, the majority of the paper describes commonly used ML algorithms and methods. This information is available in numerous other places, both papers and textbooks. There is no need to supply that much detail on these algorithms in this paper. The paper lacks details on the actual implementation. For example, there is almost no description of the input data. What are the time steps, where do the data come from, how do you align rainfall data and flow data, what resolution are the rainfall data, did you do any spatial interpolation on the rainfall data, what do the data look like (plots), etc.? The detail needs to be on what YOU did, not on general ML algorithms (Lines 157 – 180 or so).

 

Figure 4 Need to discuss why and how you are forecasting rainfall. This is not discussed in the paper.

 

Figure 4 Need details, what are the time steps, what is the length of the feature vector, etc. We need details on what YOU did, not general discussions.

 

L188 – 191 Cite an MDPI or other paper on error metrics, e.g., https://doi.org/10.3390/hydrology5040066

 

L264 – 267 Need details on your models. How many stations, how many data points, etc.

 

Eq 23 Normalizing by the min-max can be an issue if you have large outliers, as is common in hydrology. Did you consider using a z-score transform instead (i.e., normalizing by the standard deviation)? This is usually a better transform for hydrologic data, but it doesn’t provide a 0-1 data range.
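The contrast between the two transforms can be sketched on a short series with one flood peak. This is a minimal illustration only: the exact form of the manuscript's Equation 23 is not reproduced in this record, so the textbook min-max formula is assumed, and the function names and the `flows` values are hypothetical.

```python
from statistics import mean, pstdev


def min_max_scale(xs):
    """Map values linearly onto [0, 1]; assumes max(xs) > min(xs)."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def z_score(xs):
    """Center on the mean and divide by the (population) standard deviation."""
    mu, sigma = mean(xs), pstdev(xs)
    return [(x - mu) / sigma for x in xs]


# Hypothetical discharge series: four ordinary values and one flood peak.
flows = [10.0, 12.0, 11.0, 13.0, 500.0]

scaled = min_max_scale(flows)   # ordinary flows are squeezed close to 0
standardized = z_score(flows)   # zero mean, unit variance, no fixed range
```

With min-max scaling the single peak takes the value 1 and every ordinary flow falls below 0.01, which is the compression the comment warns about. The z-score output has zero mean and unit standard deviation but is not confined to the 0-1 range.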

L268 – 273 Need more information on the data you used. How many points, what is the time step, how many flow data, how many rainfall data, how well does the forecast rainfall match the actual rainfall, etc.? We need details on your model.

 

Figure 11 – 14. All these figures are very difficult to read because of the y-axis scale. Because all the outliers are included in the plots, it is very difficult to see the difference in the mean and how the 1st std dev compares. You need to limit the y-axis on these plots to highlight the box, not the outliers.
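One way to choose such y-axis limits is to take them from the box-plot whiskers rather than from the full data range. The sketch below is an assumption about how this could be done, not a description of how the authors produced their figures. It uses the common 1.5 × IQR whisker rule, and the function name `whisker_limits`, the padding value and the `errors` sample are hypothetical.

```python
from statistics import quantiles


def whisker_limits(data, whis=1.5, pad=0.05):
    """Y-axis limits that span the whiskers and exclude the outliers.

    Uses the 1.5 x IQR rule: the whiskers end at the most extreme data
    points that still lie inside [Q1 - whis*IQR, Q3 + whis*IQR].
    Assumes at least two data points.
    """
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - whis * iqr, q3 + whis * iqr
    inside = [x for x in data if low_fence <= x <= high_fence]
    lo, hi = min(inside), max(inside)
    margin = pad * (hi - lo)  # small visual padding around the whiskers
    return lo - margin, hi + margin


# Hypothetical error scores with a single large outlier.
errors = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 9.5]
y_min, y_max = whisker_limits(errors)
```

The returned pair could then be passed to a plotting call such as matplotlib's `Axes.set_ylim`, so that the box and whiskers fill the panel and the outlier at 9.5 falls off the axis.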

 

Figure 11 – NSE is only valid if you are comparing models of the same location over the same time period. You can’t compare across locations or times. That is, it is good for comparing two models of the same event, but I don’t think it is the best metric for this paper. Just MAE or MSE is probably better.
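The concern about comparing NSE across locations can be illustrated with the standard definitions of the three metrics. This is a minimal sketch: the two "sites" and their values are hypothetical, and both simulations are given the same constant error of 2 units so that only the variability of the observations differs.

```python
from statistics import mean


def mae(obs, sim):
    """Mean absolute error."""
    return mean(abs(o - s) for o, s in zip(obs, sim))


def mse(obs, sim):
    """Mean squared error."""
    return mean((o - s) ** 2 for o, s in zip(obs, sim))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squares about the observed mean."""
    obs_mean = mean(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - obs_mean) ** 2 for o in obs)
    return 1.0 - sse / spread


# Hypothetical site A: highly variable flow. Hypothetical site B: nearly steady flow.
obs_a = [10.0, 20.0, 30.0, 40.0, 50.0]
obs_b = [28.0, 29.0, 30.0, 31.0, 32.0]
sim_a = [o + 2.0 for o in obs_a]  # same absolute error at both sites
sim_b = [o + 2.0 for o in obs_b]

nse_a, nse_b = nse(obs_a, sim_a), nse(obs_b, sim_b)
```

Both sites have MAE = 2 and MSE = 4, yet NSE is about 0.98 at site A and -1 at site B, because NSE scales the error by the variance of the observations at each site.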

 

L178 – 186 You note that you tried a number of different architectures and selected one, but there are no details on this work. This choice could change the uncertainty as much as selecting a different model.

 

What is the difference between “discharge data” and “discharge feature”?

Author Response

Please kindly see the attached.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

This paper proposes a framework, based on analysis of variance (ANOVA) theory, to quantify the uncertainty contributions of sample sets, the machine learning (ML) approach, and the ML architecture when ML is used for multi-step time series forecasting. The authors use five sample sets, two ML approaches (recurrent neural network (RNN) and long short-term memory network (LSTM)), and two ML architectures, applied to discharge simulations.

The authors have adequately addressed my previous comments.

However, they must improve the English and perhaps divide the explanation of the results into several paragraphs, because the explanation is very dense and somewhat difficult to follow.

Author Response

Please kindly see the attached.

Author Response File: Author Response.pdf

Reviewer 2 Report

1. Minor comments

   1) The uncertainty issues have been supplemented in lines 79-81, 269-271, 389-391, and the subsequent figures (Fig. 16-17). As far as the reviewer knows, variance is not exactly the same as uncertainty, because the uncertainty is the square root of {the variance divided by the degrees of freedom}. However, Figures 16 and 17 show the variance according to Equations 19-22, not the uncertainty contribution. This could be left for a future study by metrologists.

   2) Since the two model architectures are not the authors’ own, some discussion of the future development of the ML algorithm should be included as future work.

Otherwise, the reviewer has no further comments.

Author Response

Please kindly see the attached.

Author Response File: Author Response.pdf
