Review: Sources of Hydrological Model Uncertainties and Advances in Their Analysis

: Despite progresses in representing different processes, hydrological models remain uncertain. Their uncertainty stems from input and calibration data, model structure, and parameters. In characterizing these sources, their causes, interactions and different uncertainty analysis (UA) methods are reviewed. The commonly used UA methods are categorized into six broad classes: (i) Monte Carlo analysis, (ii) Bayesian statistics, (iii) multi-objective analysis, (iv) least-squares-based inverse modeling, (v) response-surface-based techniques, and (vi) multi-modeling analysis. For each source of uncertainty, the status-quo and applications of these methods are critiqued in gauged catchments where UA is common and in ungauged catchments where both UA and its review are lacking. Compared to parameter uncertainty, UA application for structural uncertainty is limited while input and calibration data uncertainties are mostly unaccounted. Further research is needed to improve the computational efﬁciency of UA, disentangle and propagate the different sources of uncertainty, improve UA applications to environmental changes and coupled human–natural-hydrologic systems, and ease UA’s applications for practitioners.


Introduction
Hydrological models are developed to understand process, test hypothesis, and support decision-making. These models solve empirical and governing equations at different complexities. Their complexities vary depending on how the governing equations are solved over different spatial configurations such as lumped [1], semi-distributed [2], or fully distributed area [3], as well as the extent to which variables and processes are coupled [4,5]. Although model development has progressed in accounting for different processes and complexities, they remain as simplifications of the actual hydrological processes.
Process simplification is a consequence of limited knowledge and data, imprecise measurements, and involvement of multiple scales and interactive processes [6][7][8][9]. These simplifications introduce uncertainty and make it an intrinsic property of any model [10][11][12]. Analyzing this uncertainty requires separate methodological treatments beyond model development. However, uncertainty analysis (UA) benefits hydrological modeling in (i) identifying model limitations and improvement strategies [13,14], (ii) guiding further data collection [15,16], and (iii) quantifying the uncertainty associated with model predictions.
Various UA techniques have been developed. A few of the methods include generalized likelihood uncertainty estimation (GLUE) [17], differential evolution adaptive metropolis (DREAM) [18], parameter estimation code (PEST) [19], Bayesian total error analysis (BATEA) [20], and multi-objective analysis (Borg) [21]. These techniques have been applied in different water resources decision-making such as water supply design [22], flood mapping [23,24], and hydropower plant evaluation [25]. However, there is a lack of a systematic review and categorization of these techniques to provide a much-needed UA guideline for selecting a suitable method.
This review intends to summarize, categorize, and review UA methods for each source of model uncertainty. The conceptual basis of the methods, their applications, advantages, and challenges are critiqued along with an outlook of future research. This work builds upon and goes beyond other UA reviews in the literature. Among these reviews, Walker et al. [26] focused on the conceptual basis of model uncertainty, while Refsgaard et al. [27] focused on the role, terminology and assessment of uncertainty. Pathak et al. [28] emphasized the application of selected UA methods for practicing engineers. Meanwhile, Gupta et al. [6] and Mishra [29] focused on reviewing specific UA methods, calibration techniques and sensitivity analysis, and Warmink et al. [30] focused on identification and classification of uncertainty sources. None of these reviews adopt a comprehensive categorization and critiquing of the UA methods for each uncertainty source. Here, we fill that gap by organizing and critiquing the UA methods for each source of uncertainty. Further, we newly reviewed response surface methods, uncertainty source interactions and provided a much-needed review of UA applications for hydrologic prediction in ungauged basins. These make our work a novel contribution that critiqued UA in both gauged and ungauged catchments.

Sources of Hydrological Model Uncertainties
Hydrological model uncertainties stem from parameters, model structure, calibration (observation) and input data ( Figure 1). In addition to these sources, uncertainties can stem from model initial and boundary conditions; however, these sources are not considered in this review. Hydrological models often contain parameterizations that are a result of conceptual simplifications, known as effective parameters [31]. Parameter uncertainty can be a result of the inability to estimate or measure these effective parameters that integrate and conceptualize processes [32]. In addition, parameter uncertainty can be a result of the natural process variability and observation errors. Although some model parameters, such as hydraulic conductivity, are measurable at the point scale, their values at the catchment scale vary significantly. Hence, the practical difficulty in measuring this variability can also lead to parameter uncertainty. On the other hand, due to errors in the calibration data, parameter uncertainty can be encountered even if a model is an exact representation of the hydrologic system. Therefore, inability to accurately estimate effective parameters, the challenge of measuring natural variability, and the existence of observation errors lead to parameter uncertainty. This uncertainty often manifests itself in model calibration through the lack of a single optimal set of parameters [33].
An exact representation of a hydrological system is challenging due to the absence of a unifying theory, limited knowledge and numerical and process simplifications. Such limitations constitute model structural uncertainty. Model structural uncertainty can also refer to alternative conceptualizations, such as the hydro-stratigraphy of the subsurface or the discretization of surface and process features [34][35][36]. Model performance is strongly dependent on model structures [37]. As a result, structural uncertainty is critical as it can render the model and quantification of other uncertainties useless [37][38][39]. Comparing structural and parameter uncertainty, Højberg et al. [40] showed that structural uncertainty is dominant, particularly when the model is used beyond its calibration sphere. Moreover, Rojas et al. [41] concluded that structural uncertainty may contribute up to 30% of the predictive uncertainty. Using the variance decomposition of streamflow estimates, Troin et al. [42] showed that model structure is the highest predictive uncertainty contributor. Input forcing to hydrological models includes different hydro-meteorological, catchment, and subsurface data. Although there have been improvements in data acquisition and processing, the data used for model forcing are sparse, and subjected to gaps, imprecisions, and uncertainties [43]. In most cases, input data involve interpolations, scaling, and derivation from other measurements that result in an uncertainty range of 10-40% [43]. These inaccuracies in input data constitute input uncertainty. Failure to account for input uncertainty can lead to biased parameter estimation [20,44] and mislead water balance calculations [45]. Consequently, input uncertainty can interfere with the quantification of predictive uncertainty. Comparing input and model parameter uncertainty particularly in a data sparse region, Bárdossy et al. [46] showed the severity of input uncertainty over parameter uncertainty and suggested a simultaneous analysis of both uncertainties to have a meaningful result.
Hydrological modeling often involves calibrating the model parameters and evaluating model predictions using observed state and/or flux variables, such as streamflow and groundwater levels. These observations are subjected to similar measurement uncertainties as input data. Streamflow observations, for instance, are derived from rating curves that translate river stage measurements to discharge estimates. This translation not only propagates the random and systematic stage measurement errors but also passes the structural (rating curve equation) and parameter uncertainties involved in the calibration of the rating curve. In addition, discharge estimates suffer from interpolation and extrapolation errors of the rating curve, hysteresis, change in site conditions (e.g., bed movement), and seasonal variations in measurement and flow conditions [47][48][49][50]. Extrapolation contributes the highest uncertainty [51]. Overall, discharge observation uncertainty ranges from 5% [52,53] to 25% when extrapolation is involved [54].
Predictive uncertainty is the result of the above uncertainties exhibited on the model output. This uncertainty can be biased or variance-dominated depending on the level of model complexity and the information content of the data, bias being dominant in simple models and variance being dominant in complex models. Predictive uncertainty is usually heteroscedastic, the magnitude of uncertainty varying with the magnitude of the model output (non-Gaussian residuals). In streamflow modeling, this is partly due to the limited number of high flow observations (low frequency events) to constrain model calibration. Different data transformation schemes are used to address heteroscedasticity including the Box-Cox transformation and simple log transformation [55,56].

Hydrological Model Uncertainty Analysis
We have found six broad classes of UA methods: (i) Monte Carlo sampling, (ii) response surface-based schemes including polynomial chaos expansion and machine learning, (iii) multi-modeling approaches, (iv) Bayesian statistics, (v) multi-objective analysis, and (vi) least square-based inverse modeling ( Figure 2). Furthermore, approximation and analytical solutions based on Taylor series are also used in UA. However, since such approaches are not widely applied, they are not included in this review. The number of papers reviewed in this section (Section 3) across the different sources and methods are shown in Figure 2. This numbers indicate pioneering articles, articles that discuss limitations and advantages of the methods, and articles that introduce method improvements. Hence, the numbers do not indicate articles that are application oriented. As rainfall multipliers are embedded as part of model parameters they are not included as a unique method to be reviewed. In general, although it could be a limited sample, the numbers may reflect an approximation of how different methods are being developed for each source and how these methods are being critiqued and advanced as a focus of different studies.
The bellow six broad classes are a summary of different UA methods ( Table 1). The common thread across these methods is their involvement of many actual/surrogate model executions. This can translate into high computational time depending on (1) the model's execution time, (2) whether the surrogate training requires multiple runs, and (3) whether the model execution is parallel.

Parameter Uncertainty
Compared to the other sources of uncertainty, several techniques are employed to address parameter uncertainty ( Figure 2). One of the widely used parameter UA methods is generalized likelihood uncertainty estimation (GLUE) [17], which accounts for the equifinality hypothesis [33]. The equifinality hypothesis highlights the existence of multiple parameter sets that describe hydrological processes indiscriminately or producing the same final result. The approach augments Monte Carlo simulations with a behavioral threshold measure that distinguishes hydrologically tenable and untenable parameters (and structures). Although GLUE's straightforward conceptualization and implementation allow it to achieve a widespread use, GLUE has been criticized for its subjectivity in choosing a behavioral threshold and its lack of a formal statistical foundation [75][76][77]. Extending GLUE, Beven [33] suggested the limits of acceptability approach where pre-defined uncertainties that objectively reflect both input and output observation uncertainties are used as a measure of acceptability than the subjective behavioral thresholds. Although its use is not as widespread as GLUE, this approach has been applied, e.g., [78,79]. The limitation of its application is partly due to the lack of measurements that reflect the level of input and output uncertainties and their interactions [80][81][82].
Following GLUE, numerous contributions have been made to formally quantify parameter uncertainty. These approaches primarily rely on Bayesian statistics [83][84][85][86]. The main advantage of the formal Bayesian statistics is that parameter uncertainty is not only quantified, but can also be reduced through the inclusion of prior knowledge. Although GLUE's behavioral thresholds can achieve this goal, their formulation is subjective. The latest addition to the formal approach is DREAM [18], which has been applied widely. DREAM's widespread application is derived from its unique capability of merging the differential evolution algorithm [87] with the adaptive Markov Chain Monte Carlo (MCMC) approach [88,89]. The differential evolution allows DREAM to efficiently explore non-linear and discontinuous parameter spaces, while the adaptive MCMC component keeps it within the parameter posterior space. Contrary to other MCMC schemes [90], DREAM uses multiple chains to exchange sampling space information rather than confirming the attainment of the stationary state. The primary challenge of DREAM and other Bayesian methods is identifying a likelihood function that results in homoscedastic residuals, justifying the use of a strong assumption for a formal likelihood function. This challenge is particularly difficult in disproportionately low flow dominated observations of streamflow coupled with a few high flows [33,55,[91][92][93].
Least squares-based parameter inversions incorporated in PEST [19] and US Geological Survey computer program (UCODE) [66] are also used to quantify parameter uncertainty. These approaches rely on linear approximation of models and Gaussian residuals. Besides estimating parameter uncertainty, these methods are also useful in determining the number of model parameters that can be estimated using the available data [94,95]. This is valuable in highly parameterized models to prioritize data collection for model improvement. Furthermore, PEST expedites Monte Carlo based UA through its null space Monte Carlo analysis and the use of singular value decomposition (SVD). This further allows PEST to be used in its SVD-assist mode with Tikhonov regularization. The SVD-assist approach coupled with parallelized Jacobian matrix calculation makes parameter estimation close to observations (because of Tikhonov regularization) and faster (due to parallelization plus SVD reducing the parameter space) when calibrating highly parameterized and ill-posed problems. The main limitations of PEST and other leastsquared based UA are their assumption of model linearity and Gaussian residuals. These assumptions have limited their use in surface water models riddled with threshold-based processes, discontinuities and integer variables.
Multi-objective optimization (MOO) is another UA approach for parameter uncertainty. MOO-based UA has received substantial interest due to the absence of a single global optimal solution due to equifinality, and because of the need for improved constraint of flux and store outputs in hydrological models [60,96,97]. MOO allows retrieving information from the increasingly available data against which model predictions can be compared [61,98]. It handles parameter uncertainty by conditioning model calibration through multiple complementary and competing objective functions. The functions can be defined using (1) multiple responses such as, metrics that measure the matching of different segments of a hydrograph, (2) same-variable output but at multiple sites, and (3) multivariable outputs such as streamflow, soil moisture, and evapotranspiration [61,99,100]. The result of MOO provides trade-off solutions referred to as non-dominated (Pareto) solutions in which each member solution matches some aspect of the objective better than every other member.
Parameter uncertainty is considered well identified when the optimized parameter uncertainty substantially decreased from the prior uncertainty, leading to a reduced model prediction uncertainty. In MOO, the uncertainty range can further be narrowed by rejecting solutions that failed the model validation test. MOO also provides unique parameter un-certainty information following the shape of the Pareto front. For instance, extended Pareto front in each objective indicates high uncertainty in reproducing the parameterization of the processes [61,100,101]. Furthermore, MOO provides functionality beyond parameter UA as the shape of the tradeoff allows us to understand model structural limitations [102]. However, this advantage might not be straightforward because meaningful multi-objective tradeoffs in hydrological modeling are less frequent [102].
One of the challenges in MOO is the lack of consistent criteria for choosing the number and type of the objective functions and inability to formally emphasize one criterion over the other. Like GLUE, MOO is also statistically informal as it does not use Bayesian probability theory. Theoretically, the Pareto solutions and GLUE's behavioral solutions may overlap; however, they are not necessarily equivalent. The difference between the parameter uncertainty sets defined by the Pareto front and the GLUE approach is that GLUE's equifinality set consists of both the dominated and non-dominated parameter whereas the Pareto set contains only the non-dominated sets [103]. The difference between Bayesian UA and multi-objective analysis is that Bayesian schemes usually have a single objective (likelihood functions) or an aggregated multi-objective. In cases where multi-objective configuration is employed, the Bayesian scheme results in a compromise solution [104] while MOO results in a full Pareto solution [103]. Detailed review of MOO can be found in [100].
Response surface methods are used to approximate predictive uncertainty that is primarily caused by parameter uncertainty. Response surface methods employ a proxy function emulating the original model's parameter and output relationships, making the method blind to model internal workings or the nature of outputs that are not emulated. The common proxy functions used to approximate models are truncated Polynomial Chaos Expansion (PCE) [2,105,106] and machine learning schemes [107][108][109][110]. Since these methodologies are computationally inexpensive, the approximation is suited for UA in complex models that demand long run time or applied for real time forecasts. Response surface methods train the approximating functions based on a subset of the original model output and parameter relationships. Consequently, one of the major challenges of these schemes is ensuring the accuracy of the approximating function. If the subset is not representative, the function can lead to biased approximation. Identifying the right size and representative subset are an ongoing research. Furthermore, the accuracy of the response surface methods depends on the smoothness of the original model parameter and output surface relationship. Thus, the approximation is better for problems that are not highly non-linear and response surfaces that are not riddled with thresholds and multiple local optimums. Commonly, a validation test on a split sample is considered to confirm the approximating function's accuracy [2,110].
The typical approach of uncertainty quantification using response surfaces is to conduct Monte Carlo simulations using the approximated function. The resultant Monte Carlo based distributions represent the predictive uncertainty. Compared to the machine learning alternative, PCE offers an analytical approximation of the uncertainty through the estimate of variance without the need for simulations [111]. Further review of PCE can be found in [112,113], while review of machine learning approaches are provided by [114].
Dotto et al. [115] compared Monte Carlo (GLUE), multi-objective (A Multi-Algorithm, Genetically Adaptive Multiobjective-AMALGAM [62]) and Bayesian (Model-Independent Markov Chain Monte Carlo Analysis-MICA [116]) techniques to estimate parameter uncertainty. They found that while all the techniques performed similarly; GLUE was slow but easiest to implement. Although the Bayesian technique was less suitable because it required homoscedastic residuals, it was the only one avoiding subjectivity. Finally, they indicated that method choice depends on the modelers' experience, priority, and problem complexity. Further, Keating et al. [117] found equivalent performance between the Bayesian DREAM and the least square null-space Monte Carlo. Comparing GLUE and formal Bayesian techniques, Thi et al. [118] indicated formal Bayesian method's efficiency in resulting to a better identified parameter. However, Jin et al. [119] indicated that a stricter behavioral threshold choice can achieve similar identifiability with an increased computation [120]. A detailed comparison of formal Bayesian techniques against GLUE and its behavioral threshold choices can be found in [120].

Input Uncertainty
Hydrological models require different hydroclimatic input data. For this review, we focused on precipitation and its uncertainty. Traditionally, input uncertainty is addressed using a factor that multiplies input precipitation to correct measurement uncertainty. This approach is relatively fast and easy as the multiplication factor is expert judgement-based or estimated with model parameters. However, the method is ad hoc and does not have formal procedures to determine the multiplying factor. Beyond this approach, precipitation uncertainties can be studied using Monte Carlo (GLUE) and Bayesian statistics. The Monte Carlo approach estimates the rainfall multiplying along with the model parameters. The Bayesian techniques can be classified into two classes, model dependent and model independent, depending on whether the input uncertainty is estimated dependently or independently of the model parameters. Balin et al. [121] demonstrated how input uncertainty can be estimated as a standalone quantity, separate from model parameters. They decoupled input uncertainty from the model by assuming a distribution that represents the "true" precipitation. This approach is advantageous, because it relieves input uncertainty from the remaining uncertainty sources. However, the approach is limited because of the difficulty in determining the "true" input distribution.
On the other hand, Bayesian total error analysis (BATEA [74]) and integrated Bayesian uncertainty estimator (IBUNE [73]) estimate input uncertainty along with model parameters. Hence, these approaches recover input uncertainty with parameter and structural uncertainties. Thereby, the estimated input uncertainty is dependent on the specific model structure and can vary with changes in model structures. Consequently, the input uncertainty might not only be a result of input measurements, but also structural and parameter uncertainties. These approaches can also be computationally expensive in distributed models, which require spatially and temporally varying inputs.

Structural Uncertainty
Multi-model averaging and ensemble techniques, which pool together alternative model structures, have been applied to quantify and reduce structural uncertainty. Hagedorn et al. [122] noted that the success of multi-model averaging is a result of error compensation, consistency, and reliability. Consistency refers to the lack of a single model that performs best regardless of circumstances, and reliability refers to averages performing better than the worst component an ensemble. Moreover, Moges et al. [123] showed why a multi-model average performs better than a single model using statistical proof and empirical large-sample data.
Predictions from ensembles are combined using either equal or performance-based weights. Velázquez et al. [70] presented a study that benefits from equal weighting while performance based weighted multi-model predictions are discussed in [37,71,124]. A comparison of different alternative model averaging techniques was presented in [125]. Equal weighting is advantageous in the absence of model weighting/discrimination criteria and when component models have similar performances. However, when the models' performances are significantly different and each model is specialized in simulating certain components of the hydrological processes, using variable weights provide better predictions.
One of the widely used weighted multi-model averaging is Bayesian model averaging (BMA) [126][127][128]. BMA and similar Bayesian based averaging techniques are advantageous since their weights are interpretable and derived from model posterior performance that combines the model's ability to fit observations and experts' prior knowledge. Despite its widespread application, BMA has its limitations in addressing structural uncertainty. For instance, even though model performance varies across hydrograph segments [129], runoff seasons [14], or catchment circumstances [72], BMA assigns constant weights to component models. Furthermore, improvement in BMA's performance was noted when component models are weighted differently for different quantiles of a hydrograph [129]. In contrast, hierarchical mixture of experts (HME) [72], an extension of BMA, has the capacity to assign different weights to different models depending on the dominant processes governing different segments of a hydrograph. This flexibility allows HME to be useful in dealing with structural uncertainty when model performances vary.
Hydrology involves governing physical principles and established process signatures that need to be respected. As a result, any statistically based multi-model averaging techniques need to be critically streamlined so that they respect hydrological principles. In this regard, Gupta et al. [130] stressed the application of diagnostic modeling. Similarly, Sivapalan et al. [131] argued the importance of parsimony and incorporation of dominant processes in model structural development. To this end, Moges et al. [132] streamlined HME by coupling it with hydrological signatures that diagnose model inadequacy. Such approaches can be explored to characterize catchment hydrology while avoiding disparities between statistical success and process description.
Structural uncertainty has also been addressed in the framework for understanding structural errors (FUSE [59]), which explores structural uncertainty through shuffling and combination of parts of alternative hydrological models. The parts constitute alternative formulations of processes. FUSE has similarity with Monte Carlo (or Bootstrap) methods. However, the samples in FUSE are components of a model representing the different alternative process conceptualizations. Further, given the ease of parallelization, FUSE is practical to address structural uncertainty.
The chief criticism of all multi-modeling UA is the limitation of identifying and incorporating the entire span of feasible model structures. Although the above methods can handle any number of structures, the practical inability to identify and explore the entire structural space constrains the scope of the investigation. As a result, full examination of structural uncertainty is limited.

Calibration Data Uncertainty
Although different calibration data are used to constrain hydrological models, streamflow observation uncertainty is common in the literature. Hence, this subsection focuses on streamflow observation uncertainty. Some of the methodologies used to address observation uncertainty include Monte Carlo simulations to estimate the rating curve uncertainty, use of the upper and lower bounds of the rating curve estimates, and Bayesian techniques. The first method propagates observation uncertainty to parameter and/or predictive uncertainty estimates through a repeated calibration of the hydrologic model [45], making the method computationally intensive. In the second approach, only the upper and lower rating curve uncertainty bounds are considered to come up with the corresponding upper and lower bounds of predictive uncertainty [133]. This approach is computationally effective, but it does not maximize exploring the full information contained in the rating curve uncertainty. To improve the rating curve uncertainty utilization, McMillan et al. [134] suggested derivation of an informal likelihood function that accounts for the entire rating curve uncertainty. This technique allowed them to apply Bayesian statistics and use the rating curve information to analyze observation uncertainty. In contrast, Sirakosa and Renard [135] used a formal Bayesian technique where the rating curve parameters and the hydrological model parameters are inferred simultaneously by directly using the stage measurement instead of the streamflow estimate.
Observation uncertainty influences predictive uncertainty. McMillan et al. [134] showed improvement in model predictive uncertainty by considering discharge observation uncertainty compared to a model that does not consider discharge uncertainty while Beven and Smith [82] demonstrated the disinformation emanating from observation uncertainties. Aronica et al. [133] showed how different rating curve realizations lead to different predictive uncertainties, whereas Engeland et al. [45] concluded that considering or not considering observation uncertainty leads to different parameter uncertainties and different interactions with structure, parameter, and predictive uncertainties. These studies indicate the significance of observation uncertainty and the need to account it.

Predictive Uncertainty
The resultant uncertainty estimate of a model output is predictive uncertainty. Predictive uncertainty can be quantified by propagating parameter, structure and input uncertainties to the model output. This propagation is mostly performed by Monte Carlo simulation of parameter, structure, and input uncertainties. Predictive uncertainty can be reduced using different machine learning schemes [63,136]. Although their physical process realism needs validation, these schemes reduce predictive uncertainty as they can learn and identify patterns in the model residuals. Besides, to reduce predictive uncertainty, it is important to disentangle and understand the propagation of the different sources of uncertainties.

Uncertainty Source Interaction and Reduction
Understanding the causes and interactions of uncertainty sources is critical in disentangling and reducing predictive uncertainty. The different causes and interactions of the four sources are summarized in Figure 3. Some of the causes can be common across the different sources ( Figure 3). Structural uncertainty is at the core and interacts with all sources of uncertainty. The interactions between input and structural uncertainty are not direct. They interact through the parameters of the model. On the other hand, although input and calibration data uncertainties are independent, they both interact with parameter uncertainty. The interaction between input and parameter uncertainties may lead to biased parameter estimates [74], which can dominate the overall model uncertainty [137,138]. Similarly, input errors can be compensated by parameters [139][140][141]. However, this compensation can be misleading since the causes of input uncertainty are independent of any model and its parameters. This is particularly a concern when the model is used for conditions that are different from the ones used for model development. Thus, it is better to independently address the causes of input errors instead of relying on compensation.
Although observation and model structural uncertainty do not necessarily interact directly, Montanari and Baldassarre [142] showed how different model structures can limit propagation of observation uncertainty. For a given model structure, observations are used for (1) model performance evaluation, (2) deriving hydrological signatures, and (3) deriving likelihood measures in model parameter estimation. With these uses, observation uncertainty interacts with parameter uncertainty when parameters are estimated. Consequently, failure to account for observation uncertainty can lead to biased parameter and predictive uncertainty [135,143,144]. Due to the interaction between parameter and observation uncertainty, acquiring more data can reduce parameter/predictive uncertainties. However, if the information content of the acquired data cannot support the parameterization of the model, these uncertainties may not be reduced by simple data acquisitions. Different techniques are employed to better extract the information content of calibration data, including (i) extracting different hydrological signatures from streamflow [145][146][147], (ii) increasing the spatial coverage of observations beyond the catchment outlet [148][149][150], (iii) having different variables beyond streamflow [151,152], (iv) incorporating soft data from expert-based diagnostics [153,154] and (v) including remote sensing data [155][156][157]. These examples indicate that not only data acquisition, but also the diversity and the information content of the data are essential. The bias-variance trade-off curve describes the interaction between parameter and structural uncertainty [158] (Figure 4). This theoretical curve shows that the optimal model lies between a biased assumption-ridden simple model and a highly variable complex model, that is, model biases lead to inaccurate process representations, while building a complex model without the detailed supporting data results in highly variable outputs and ambiguities. Thus, model building involves trade-offs between the two sources. Consequently, Clark et al. [159] states how the boundary between structure and parameter is not always clearly defined. As different parameter values can translate into structural uncertainty, Clark et al. [159] indicated that it is not straightforward to distinguish parameter from structural uncertainty. Structural uncertainty can also be manifested by parameters residing at the edge of the permissible parameter range [160]. Indeed, calibrated parameter values lying outside their physical ranges serve as an indicator of structural uncertainty [161]. Similarly, different studies showed that the tradeoff between these two sources [162,163].
On the bias-variance trade-off curve, there are two potential ways of reducing uncertainty. From the right, parameter uncertainty can be lessened through reducing model complexity by multi-objective and multi-variable analysis, dimensional reduction of complex and coupled models, or approximating the complex model with surrogate proxies. Via any of these approaches, reduction of parameter uncertainty demands computational efficiency, different flux and storage data, and using effective UA methods (Section 3 and the literature therein). From the left, structural uncertainty can be lessened by removing simplifying and relaxing assumptions using theory development, coupling sub-models and process understanding [14,164,165]. However, due to the uniqueness of place and process, reduction of structural uncertainty can be watershed dependent [166]. Consequently, the use of hydrological signatures can be critical in identifying and diagnosing the dominant processes and incorporating the proper level of model complexity.

Model Inadequacy and Structural Uncertainty
Model adequacy is interlinked with structural uncertainty. Model inadequacy results from missing one or more processes, inadequate numerical schemes and discretization, adopting a deterministic approach for a stochastic phenomenon, and misrepresenting process/physical phenomenon [167]. Conversely, structural uncertainty results from the inability to discriminate alternative models due to limitations in process understanding (Section 2). As such, ideally, the ensemble techniques in Section 3.3 shall contain models that are difficult to discriminate, thereby linking model inadequacy evaluations and structural UA. However, in most ensemble based structural UA, it is not common to address structural uncertainty following model inadequacy evaluations. Both hydrological and statistical approaches are used to evaluate model inadequacy. By using a statistical technique, Pande [168] proved that the crossing of any two quantiles of a model structure indicates the structure's inadequacy. Pande [169] applied this quantile-based approach to rank several structures. This ranking can be used to identify alternative models for structural UA. Although other statistical model ranking methods such as Bayesian model selection criteria exists, Pande's [168] approach is preferable as it is proven to diagnose structural inadequacy. However, as the approach involves model optimization for different quantiles, it demands high computation. Further, as this approach is statistical, it cannot always guarantee hydrological adequacy. Alternatively, in the hy-drological approach, process-based signatures are employed to evaluate model structural adequacy. For instance, the use of recession curves may highlight deficiency in subsurface flow characterization while the behavior of flow duration curves can diagnose process nonlinearities [56,[170][171][172]. Further, McMillan [173] provided a review of hydrological signatures and their corresponding processes. These links can be used as hydrological instruments to identify inadequacy and enhance structural UA.

UA in Ungauged Basins
Most watersheds across different parts of the world are ungauged. Although such watersheds are common, model development and UA are limited as they demand hydrological observations [174,175]. In these watersheds, there are two approaches of UA-model dependent and model independent ( Figure 5). In both approaches, we have identified five distinct stages that may incur uncertainty. The model dependent approach [176,177] involves (i) choosing a model, (ii) estimating and quantifying parameter uncertainty for several gauged watersheds which can be achieved using Bayesian, Monte Carlo or multiobjective techniques as discussed in Section 3, (iii) determining how many parameters (the full or a subset of the parameter distribution) to transfer to the ungauged catchment, (iv) developing a regional relationship between the model parameters and watershed attributes, and (v) estimating the parameter distribution of the ungauged catchment by using the regional relationship (Figure 5a). At its core, this approach directly transfers model parameters from gauged to ungauged watersheds. Hence, it can propagate both structural and parameter uncertainties from gauged to ungauged catchments. Additionally, UA becomes challenging when the model parameters and regional relationships at stage four are poorly identified. Further, at stage three, determining the transferable parameters does not only lack a guideline, but also incurs a high computational cost when the full parameter distribution is transferred. As an alternative to selecting the parameter subset, Hundecha and Bárdossy [178] directly estimated the regression coefficients instead of the model parameters. Their approach requires predefining a linear regression between a set of sensitive model parameters and catchment characteristics. Although the approach alleviates the subset problem, it assumes a linear relationship and increases the number of parameters (regression coefficients) to be estimated.
The model independent approach involves the following stages ( Figure 5b): (i) select watershed attributes and hydrological signatures, (ii) regionalize the attributes and hydrological signatures (e.g., baseflow index, curve number, mean flows), (iii) estimate the value of the hydrological signatures using the ungauged watershed attributes, (iv) choose a model, and (v) apply the model and quantify parameter uncertainty by targeting the estimated signatures at the ungauged watershed. Stage five may employ Monte Carlo, Bayesian and multi-objective methods. As any model uses the information at stage four, this approach allows a direct quantification of structural uncertainty and is less prone to propagating structural and parameter uncertainty from gauged to ungauged watersheds. However, the approach is limited when the relationship between hydrological signatures and catchment attributes is poor. Additionally, since most of the hydrological signatures employed at stage four are streamflow oriented, the approach might be poor in predicting other flux or storage variables.
At stage five, the Monte Carlo method discriminates the behavioral and non-behavioral parameters by capitalizing on the confidence interval determined at stage three [179]. The multi-objective approach uses the estimated signatures (stage three) as its objectives [180]. The Bayesian approach employs any distribution (e.g., based on expert knowledge) as its parameter prior, while it relates the model simulated and estimated hydrological signatures for its likelihood [175,181,182]. Moreover, the fundamental pros and cons of the Monte Carlo, Bayesian and multi-objective techniques in Section 3 apply here as well. UA in both approaches benefits from having (i) many gauged basins in the regionalization, (ii) a discernible regional relationship and (iii) non-redundant complementary hydrological signatures. Similarly, both approaches implicitly assume the dominant process in the ungauged catchment is comparable to or contained within the set of dominant processes in the gauged catchments and is a function of watershed attributes. Hence, model choice is critical. Comparing a physically based versus a conceptual model, a physically based model is theoretically advantageous as it relates most of its parameters to catchment attributes. However, since its parameters are often poorly identified, using such models for regionalization may lead to high uncertainty. In contrast, although there is a lack of direct physical relationships between a conceptual model parameter and catchment attributes, due to model parsimony, conceptual models are better suited for ungauged catchment UA. Parsimony often leads to better identified parameter which is relatively easy to regionalize and transfer to ungauged basins. Better identification may also reduce the challenges of subsetting parameters for regionalization.
The studies reviewed in this subsection did not account for input and observation uncertainties. However, we provided a perspective in addressing them in ungauged basins. A prior Accounting of observation uncertainty in gauged catchments may reduce propagating the observational uncertainty from gauged to ungauged watersheds. This applies only for the model dependent ungauged UA. However, since there are no observations in ungauged basins, this uncertainty is not a concern by its own. Similarly, accounting for input uncertainty in the gauged watersheds before regionalizing them can reduce propagating input uncertainty to the ungauged watershed. However, it is important for future studies to assess and disentangle all sources of uncertainty in ungauged basins.

Other Approaches in UA
This review has heretofore focused on the widely used probabilistic and optimizationbased UA (Section 3). Theoretically, accurate estimates through these methods require large amounts of data. Additionally, these methods are not straightforward in accommodating ambiguous information (e.g., natural language and qualitative information). These qualities may limit their success in data scarce conditions and regions. Considering these limitations, alternative philosophies are in use. Although these alternatives are not the focus of this review, we briefly highlight fuzzy and information theoretic based approaches. Fuzzy based approaches rely on subjectively defined membership functions instead of formal probabilities [183]. Hence, this approach is well suited in accounting ambiguous natural language, qualitative and other additional information into UA. Liu et al. [184] showcased the application of the method in addressing parameter uncertainty. The challenges of this approach include subjectivity in defining the membership function and how to account for probabilistic quantities. A general review of fuzzy theory to hydrological problems can be found in [185].
On the other hand, information theoretic philosophies question the fundamental notion of quantifying uncertainty. Their central argument rests on the application and limitation of probabilistic methods in the absence of a 'true' model and limited data. That is, lack of a "true" model and limited data may not guarantee making the true probabilistic inference, hence uncertainty is inherently unknown [186]. Instead, this approach proposes a different question that is "to what extent is the available observed data useful in model improvement?" than quantifying the unknown uncertainty [187]. The application of this approach is demonstrated in addressing both structural and parameter uncertainties [188]. Although Nearing and Gupta [188] argued that information theory is independent of probabilistic theory, its technical application involves probabilistic terms [187].

Summary and Outlook
Across the four sources of hydrological model uncertainty, six broad classes of UA methods are identified. Each method has its own skills and limitations, and none is universally superior. In terms of coverage, parameter uncertainty has benefited from a variety of techniques, while recently input uncertainty has gained attention. In contrast, although there are a few works that deal with structural uncertainty, it is not addressed extensively provided that it is at the core of the other sources [41]. Meanwhile, calibration data uncertainty analysis beyond streamflow observation is limited.
UA that treat all sources of uncertainties in an integrated manner or disentangle their contributions are sparse. More work in this area will help source attribution and interpretation of predictive uncertainty. In line with source attribution, it is also important to employ uncertainty propagation in integrated hydrological models. By coupling different complementary sub-models these models have demonstrated their value in reducing assumptions and capturing feedback process [189,190]. However, such studies have not yet widely utilized UA to attribute their predictive uncertainty and illuminate which sub-model needs improvement or what additional data is required. In this regard, Moges et al. [191] showed how a framework based on the winding stairs algorithm and a concurrent computational framework can be utilized.
Uncertainty characterizations are limited in their ability to provide reliable information for applications involving environmental changes and coupled human-natural-hydrologic systems. This is because these systems involve many intricate and highly uncertain factors, such as future forcing input variables and decision-making under changing environments. Hence, to treat the propagation of uncertainty due to the feedback between climate models and human water management, an advanced UA approach involving an integrated transdisciplinary framework is needed. This challenge is recognized within the modeling community [192,193].
Model development, particularly in transboundary rivers, involves not only the above four sources of uncertainty but also geopolitical uncertainty related to water resources governance [194]. This uncertainty is characterized by data secrecy, ambiguous treaties, lack of mutual trust and cooperation, resulting in difficulties of data sharing and accessibility. Consequently, geopolitical uncertainty may confound the above sources of uncertainty sources and make hydrological models less reliable, limit their reproducibility and usefulness in decision-making [194,195]. As this source of uncertainty is poorly addressed in the existing hydrological literature, it needs more studies involving social scientists and different stakeholders promoting transparency, trust building, and multidisciplinary cooperation, which is in line with hydro-social thinking [196].
Another research frontier in UA is its application in large data sets [197,198]. Such analysis can be useful to generalize, regionalize, and identify patterns of model uncertainty in catchments of different climate, topography, and nature. Hence, it is important to apply different techniques to diverse datasets beyond single-catchment focused studies. Such explorations can reveal organizing principles and patterns of model uncertainty [199].
UA applications are still limited in practical engineering and management operations. One approach to bridge this limitation is to better communicate UA and link it with other concerns such as economics. In this regard, McMillan [25] demonstrated how UA reduces cost in the hydropower industry. Similarly, investigating the implications of UA on the flood insurance sector can be a path for coupling research and practice [200]. Future investigations in these areas can help UA gain a widespread foothold for the practicing community.
Furthermore, although there is a continued inclusion of UA in ungauged watersheds, it remains to be at an early stage [174]. Given that most watersheds across the world are ungauged, future research in this avenue is also critical. To this end, this review provides the first critique of the steps, the alternatives, and the uncertainty sources in ungauged watersheds. Funding: This work was conducted as a part of the "Improved hydrologic forecasting through synthesis of critical storage components and timescales across watersheds worldwide" Working Group supported by the John Wesley Powell Center for Analysis and Synthesis, funded by the USGS.

Data Availability Statement:
No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest:
The authors declare no conflict of interest.