Issuing useful hydrological predictions (e.g., river flow predictions) is one of the most important challenges in hydrology. Meeting this challenge involves answering numerous research questions, but also putting research into practice by exploiting research advancements in operational contexts. This additional consideration introduces extra requirements for prediction methodologies, mostly related to their appropriateness for what we call prediction “at scale”. Issuing hydrological predictions “at scale” is a major theme of the present study. The term “at scale” is used here according to Taylor and Letham [1], i.e., to imply several notions of scale, mainly (i) a large number of required predictions, and (ii) a large variety of prediction problems to be solved. The latter arise, e.g., under different climate and catchment conditions.
Also importantly, the present study is principally founded upon the premise that (operational) hydrological predictions can be most useful when expressed in probabilistic terms (see, e.g., references [2]), i.e., in terms of probability distribution functions (PDFs) [4] (see also references [7]) or in terms of prediction intervals (or predictive quantiles). Delivering probabilistic hydrological predictions is a relatively new practice [6] considering the much longer history of hydrological modelling, comprehensively summarized by Todini [4]. This practice is also referred to in the related literature as “global uncertainty” quantification (see, e.g., reference [9]) or “predictive uncertainty” quantification (see, e.g., reference [4]), and its technical implications are still under consideration and discussion (see, e.g., references [3]).
The background of the present study lies in the tremendous and growing progress made in two distinct research fields, the advancements of which can be exploited in hydrological contexts for predictive modelling (as contrasted with explanatory and descriptive modelling in Shmueli [14]). These are the field of “process-based” hydrological modelling (term used here as defined in Montanari and Koutsoyiannis [6]; see, e.g., references [15]) and the field of machine learning (see, e.g., references [23]). The former includes various modelling approaches spanning from distributed to lumped conceptual approaches, which also aim (besides prediction) at supporting some sort of “physical interpretation” of the catchment-scale hydrological phenomena [4] and describing the catchment’s behavior as a whole [16], respectively. The machine-learning field, in turn, includes a large variety of multi-purpose algorithmic techniques that are potentially useful in various applied fields, such as hydrology. Among its latest advancements are ensemble learning methods (e.g., bagging by Breiman [27] and random forests by Breiman [28]), i.e., methods that combine the results of individual learning algorithms [29]. Machine-learning algorithms are often referred to in the hydrological literature under the more general term “data-driven models”.
Process-based hydrological models and data-driven algorithmic approaches are regarded as two different “streams of thought” in predictive hydrological modelling which need to be harmonized “for the sake of hydrology” [4]. In fact, machine-learning techniques can be perceived as manifestations of the algorithmic modelling culture, a statistical modelling culture grounded on the premise that the mechanism behind the data generation is completely unknown and that, therefore, obtaining predictions from the data does not require a prior description of this mechanism through an analytical model [30]. This culture fundamentally deviates from what is called “process-based modelling”.
Often perceived to represent tradition, experience and lessons-learnt knowledge (from a “physical process-oriented” modeller’s point of view) [4], process-based models are mostly preferred by hydrological modellers and hydro-meteorological forecasters [31]. Among the plethora of currently available process-based hydrological models, a few exemplary ones are more trustworthy than others (e.g., the Génie Rural (GR) hydrological models by Perrin et al. [16], Mouelhi et al. [17], and others, which are also available as open source from Coron et al. [32]), as it is evident from the literature that they are the result of decades of continuous and labor-intensive hydrological research focusing on better overall prediction, better prediction of low and high flows, and model parsimony, among others (see, e.g., the related comments in Perrin et al. [16]).
On the other hand, “engineering-oriented” modellers report on (unexploited) opportunities for high predictive performance stemming from the use of data-driven hydrological models [4]. Machine-learning regression algorithms are regularly implemented in the data-driven hydrological literature for solving a vast number of technical problems, and for building confidence in predictive and explanatory modelling (see, e.g., references [34]). Yet, their potential has been realized and exploited only to a limited extent, and mostly for obtaining “point” predictions (term used here as opposed to “probabilistic”). Nonetheless, this potential includes the possibility of delivering probabilistic hydrological predictions (including forecasts; see, e.g., the relevant practical suggestions for using random forests in water-related applications by Tyralis et al. [41]), in spite of the widespread misconception among hydrologists that machine-learning algorithms are by nature deterministic (i.e., not statistical). Actually, machine-learning methods are all statistical (therefore, “machine learning” and “statistical learning” are terms used interchangeably beyond hydrology), and some of them (e.g., the quantile regression ones, on which this study focuses) are ideal for predictive uncertainty quantification.
Advancing the implementation of machine-learning regression algorithms by conducting large-sample (and in-depth) hydrological investigations has been gaining prominence recently (see, e.g., references [42]), perhaps following a more general tendency towards embracing large-scale hydrological analyses and model evaluations (see, e.g., references [47]). The key significance of such studies in improving the modelling of hydrological phenomena, especially when the modelling is data-driven, has been emphasized by several experts in the field (see, e.g., references [16]).
In the present study, we exploit a large dataset for advancing the use of machine-learning algorithms within broader methodological approaches for quantifying predictive uncertainty in hydrology. The hydrological modelling and hydro-meteorological forecasting literatures include a large variety of such methodologies (see, e.g., references [45]), reviewed in detail by Montanari [9] and Li et al. [70]. Deterministic “process-based” hydrological models are usually and preferably a core ingredient of probabilistic approaches of this family. In this context, statistical models are applied to convert the point predictions provided by hydrological models into probabilistic predictions. Such methodologies are hereafter referred to as “probabilistic hydrological post-processing” methodologies.
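The conversion of point predictions into probabilistic ones can be illustrated with a minimal Python sketch. It is only a sketch under stated assumptions: the hydrological model is represented by a ready-made list of simulated flows, and the statistical model is a deliberately simple stand-in (constant empirical quantiles of the hydrological model's errors) rather than any of the quantile regression algorithms assessed in this study. All function names and numbers are hypothetical.

```python
from typing import Dict, List, Sequence


def empirical_quantile(values: Sequence[float], level: float) -> float:
    """Linearly interpolated empirical quantile for a level in [0, 1]."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= level <= 1.0:
        raise ValueError("level must lie in [0, 1]")
    ordered = sorted(values)
    position = level * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] * (1.0 - weight) + ordered[upper] * weight


def fit_error_quantiles(
    observed: Sequence[float],
    simulated: Sequence[float],
    levels: Sequence[float],
) -> Dict[float, float]:
    """Fit the stand-in statistical model on a calibration period.

    The model is the set of empirical quantiles of the errors
    (observed minus simulated) of the hydrological model.
    """
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have equal length")
    errors = [o - s for o, s in zip(observed, simulated)]
    return {level: empirical_quantile(errors, level) for level in levels}


def post_process(
    point_predictions: Sequence[float],
    error_quantiles: Dict[float, float],
) -> List[Dict[float, float]]:
    """Turn each point prediction into a set of predictive quantiles."""
    return [
        {level: point + shift for level, shift in error_quantiles.items()}
        for point in point_predictions
    ]


# Hypothetical calibration data: observed flows and hydrological model output.
observed_flows = [10.0, 12.0, 9.0, 15.0, 20.0]
simulated_flows = [11.0, 11.0, 10.0, 13.0, 18.0]

error_quantiles = fit_error_quantiles(
    observed_flows, simulated_flows, levels=[0.1, 0.5, 0.9]
)
predictive_quantiles = post_process([14.0], error_quantiles)
```

A quantile regression algorithm would replace `fit_error_quantiles` and let the predictive quantiles depend on predictor variables (for instance, on the point prediction itself), which this constant-shift stand-in cannot do.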
We are explicitly interested in probabilistic hydrological post-processing methodologies whose models are estimated sequentially in more than one stage (see also Section 2.1; hereafter referred to as “multi-stage probabilistic hydrological post-processing methodologies”) and in machine-learning quantile regression algorithms, since the former can accommodate the latter naturally and effectively. The effectiveness of this accommodation has already been demonstrated, for example, by the large-scale results of Papacharalampous et al. [45] and Tyralis et al. [46] for the monthly and daily timescales, respectively. Aiming at combining the advantages of both the above-outlined “streams of thought” in predictive hydrological modelling, these studies and a few earlier ones (to the best of our knowledge, those mentioned in Table 1) have integrated process-based hydrological models and data-driven algorithmic approaches (spanning from conditional distribution modelling approaches to regression algorithms) within multi-stage probabilistic hydrological post-processing methodologies for predictive uncertainty quantification purposes.
As summarized in Table 1, multi-stage (mostly two-stage) probabilistic hydrological post-processing has been implemented using both parametric and non-parametric statistical models. Machine-learning quantile regression algorithms do not make assumptions about the PDF of the predictand; therefore, they fall into the broader class of non-parametric techniques. Their output is a set of predictive quantiles of selected levels (e.g., the predictive quantiles of levels α/2 and 1 − α/2, which form the (1 − α)100% central prediction interval), instead of predictive PDFs of the hydrological processes of interest. While (three) algorithms from this category have already been incorporated into multi-stage probabilistic hydrological post-processing methodologies (mostly for solving technical problems within case studies; see Table 1), there is no extensive study focusing on formalizing and framing this incorporation. We aspire to fill this gap by conducting the largest and most systematic assessment of machine-learning algorithms for probabilistic post-processing in hydrology.
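The relation between α and the two quantile levels can be made concrete with a short Python sketch. It computes the levels α/2 and 1 − α/2 and the share of observations that fall inside the resulting central prediction interval. The function names and the toy values are illustrative assumptions and are not taken from the study.

```python
from typing import Sequence, Tuple


def central_interval_levels(alpha: float) -> Tuple[float, float]:
    """Quantile levels bounding the (1 - alpha) 100% central prediction interval."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return alpha / 2.0, 1.0 - alpha / 2.0


def empirical_coverage(
    observed: Sequence[float],
    lower: Sequence[float],
    upper: Sequence[float],
) -> float:
    """Fraction of observations lying inside their prediction intervals."""
    if not observed or not (len(observed) == len(lower) == len(upper)):
        raise ValueError("inputs must be non-empty and of equal length")
    inside = sum(
        1 for y, low, high in zip(observed, lower, upper) if low <= y <= high
    )
    return inside / len(observed)


# alpha = 0.2 corresponds to the 80% central prediction interval.
lower_level, upper_level = central_interval_levels(0.2)

# Hypothetical observations and predictive quantiles at the two levels.
coverage = empirical_coverage(
    observed=[5.0, 7.0, 9.0, 11.0],
    lower=[4.0, 6.0, 10.0, 8.0],
    upper=[6.0, 8.0, 12.0, 10.0],
)
```

For reliable predictive quantiles, the empirical coverage over a long test period would be expected to lie close to 1 − α (here 0.8). The four toy values above are far too few to judge this.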
We aim at answering the following research question: Why and how to apply machine-learning quantile regression algorithms for probabilistic hydrological post-processing? As implied by our aim, our contribution to the literature includes the inspection and appraisal of both quantitative and qualitative aspects of the application of the algorithms. Although our benchmark experiment holds a prominent position in this study, the theoretical and practical information on the proposed methodologies and framework, also provided herein, is equally important for answering the above-stated research question. Specifically, we:
(1) Explore, through benchmark tests, the modelling possibilities provided by the integration of process-based models and machine-learning quantile regression algorithms for probabilistic hydrological modelling. This exploration encompasses the:
(a) comparative assessment of a representative sample set of machine-learning quantile regression algorithms in two-stage probabilistic hydrological post-processing with emphasis on delivering probabilistic predictions “at scale” (an important aspect within operational settings);
(b) identification of the properties of these algorithms, as well as the properties of the broader algorithmic approaches, by investigating their performance in delivering predictive quantiles and central prediction intervals of various levels; and
(c) exploration of the performance of these algorithms for different flow magnitudes, i.e., in conditions characterized by different levels of predictability.
(2) Explore, through benchmark tests, the modelling possibilities provided by simple quantile averaging. Simple quantile averaging is the simplest way to combine multiple quantile predictions (by averaging them), but is also “hard to beat in practice” [76].
(3) Formulate practical recommendations and technical advice on the implementation of the algorithms for solving the problem of interest (and other problems of technical nature). An important remark is that these recommendations are in no case meant to be limited to selecting a single algorithm for all tasks and under all conditions. Each algorithm has its strengths and limitations, which have to be identified so that it finds its place within a broader framework (provided that the algorithm is a good fit for solving the problem of interest). This point of view is in accordance with the “no free lunch theorem” by Wolpert [78].
(4) Justify and interpret key aspects of the developed methodological framework and its high appropriateness for progressing our understanding of how machine-learning quantile regression algorithms should be used to maximize benefits and minimize risks from their implementation.
Preliminary works on the above can be found in Papacharalampous et al. [79], while Papacharalampous et al. [45] (studies built on the work by Montanari and Koutsoyiannis [6]) and Tyralis et al. [46] focus on ensemble learning probabilistic hydrological post-processing methodologies that can accommodate the algorithms assessed herein. Ensemble learning methods combining the predictions obtained by multiple learning algorithms (e.g., the equal-weight combiner tested herein) are increasingly adopted in many engineering and applied science fields, since they frequently provide improved predictive performance with respect to each of the individual learning algorithms (see, e.g., the review by Sagi and Rokach [29]). The results of the present study also support the value of ensemble learning for probabilistic hydrological post-processing.
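Simple quantile averaging, i.e., the equal-weight combination of quantile predictions, can be sketched in a few lines of Python. The sketch assumes that every algorithm issues predictive quantiles at the same quantile level and for the same time steps. The algorithm names and values are hypothetical and serve only as an illustration.

```python
from typing import Dict, List, Sequence


def average_quantile_predictions(
    per_algorithm: Dict[str, Sequence[float]]
) -> List[float]:
    """Equal-weight average of predictive quantiles at one quantile level.

    per_algorithm maps an algorithm name to its predictive quantiles,
    one value per time step, all issued at the same quantile level.
    """
    series = list(per_algorithm.values())
    if not series:
        raise ValueError("at least one algorithm is required")
    length = len(series[0])
    if any(len(values) != length for values in series):
        raise ValueError("all algorithms must cover the same time steps")
    # Average across algorithms separately for every time step.
    return [sum(values) / len(series) for values in zip(*series)]


# Hypothetical predictive quantiles of level 0.9 from three algorithms.
predictions_at_level_0_9 = {
    "algorithm_a": [12.0, 20.0, 31.0],
    "algorithm_b": [14.0, 18.0, 35.0],
    "algorithm_c": [13.0, 22.0, 30.0],
}
combined = average_quantile_predictions(predictions_at_level_0_9)
```

Because the average is taken separately at each quantile level, the combined quantiles remain ordered across levels whenever the quantiles of every individual algorithm are ordered.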