We tested four likelihood measures including two limits of acceptability and two absolute model residual methods within the generalized likelihood uncertainty estimation (GLUE) framework using the topography model (TOPMODEL). All these methods take the worst performance of all time steps as the likelihood of a model and none of these methods were successful in finding any behavioral models. We believe that reporting this failure is important because it shifted our attention from which likelihood measure to choose to why these four methods failed and how to improve these methods. We also observed how large parameter samples impact the performance of a hybrid uncertainty estimation method, isolated-speciation-based particle swarm optimization (ISPSO)-GLUE using the Nash–Sutcliffe (NS) coefficient. Unlike GLUE with random sampling, ISPSO-GLUE provides traditional calibrated parameters as well as uncertainty analysis, so over-conditioning the model parameters on the calibration data can affect its uncertainty analysis results. ISPSO-GLUE showed similar performance to GLUE with a lot less model runs, but its uncertainty bounds enclosed less observed flows. However, both methods failed in validation. These findings suggest that ISPSO-GLUE can be affected by over-calibration after a long evolution of samples and imply that there is a need for a likelihood measure that can better explain uncertainties from different sources without making statistical assumptions.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited