Quantifying Uncertainty in the Modelling Process; Future Extreme Flood Event Projections Across the UK

With evidence suggesting that climate change is resulting in changes within the hydrologic cycle, the ability to robustly model hydroclimatic response is critical. This paper assesses how extreme runoff, specifically 1:2- and 1:30-year return period (RP) events, may change at a regional level across the UK by the 2080s (2069–2098). Capturing uncertainty in the hydroclimatic modelling chain, flow projections were extracted from the EDgE (End-to-end Demonstrator for improved decision-making in the water sector in Europe) multi-model ensemble: five Coupled Model Intercomparison Project (CMIP5) General Circulation Models and four hydrological models forced under emissions scenarios Representative Concentration Pathway (RCP) 2.6 and RCP 8.5 (5 × 4 × 2 chains). Uncertainty in extreme value parameterisation was captured through consideration of two methods: generalised extreme value (GEV) and generalised logistic (GL). The method was applied across 192 catchments and aggregated to eight regions. The results suggest that, by the 2080s, many regions could experience large increases in extreme runoff, with a maximum mean change signal of +34% exhibited in East Scotland (1:2-year RP). Combined with increasing urbanisation, these estimates paint a concerning picture for the future UK flood landscape. Model chain uncertainty was found to increase by the 2080s, though extreme value (EV) parameter uncertainty becomes dominant at the 1:30-year RP (exceeding 60% in some regions), highlighting the importance of capturing both the associated EV parameter and ensemble uncertainty.


Introduction
Floods are a common hydrological hazard that pose widespread threat to lives and infrastructure [1,2]. According to UN estimates, in the period of 1995-2015, 2.3 billion people were affected (either in terms of health or socioeconomics) by floods globally, resulting in 157,000 fatalities and a total cost of USD 662 billion in economic damage [3]. In the UK, millions of people are affected by flooding every year, with annual flood damage costs estimated to be in the region of GBP 1.1 billion [4]. In response, the UK government spent GBP 808.2 million on flood risk management in 2018/2019, a GBP 144.9 million increase in expenditure in just over ten years (2005/2006, [5]).
Climate change is projected to exacerbate these pressures. Studies indicate human activity as the leading cause of climate change (e.g., [6,7]), with the Intergovernmental Panel on Climate Change (IPCC) concluding in its Fifth Assessment Report (AR5) that anthropogenic sources are "extremely likely to have been the dominant cause of the observed warming since the mid-20th century" [8]. Evidence is building that this warming has led to increases in precipitation intensity and melt rate (snow and ice) [8-11], resulting in changes within the hydrologic cycle.
Coinciding with increasing urbanisation, the increase in the frequency and severity of floods worldwide [8] raises concerns for the future. UK climatological records support these concerns; according to Met Office records, there have been 17 record-breaking rainfall months or seasons since 1910, with nine of them occurring since 2000 [12]. This extreme rainfall has, in turn, led to a pattern of severe flooding in recent times; notably, the 2007 summer floods, the 2013-2014 winter floods and the 2015-2016 winter floods, with the latter being the most extreme on record in terms of rainfall intensity [13].
To ensure future flood preparedness, the ability to robustly model the hydroclimatic response is critical. Given the dynamic and stochastic nature of climate, coupled with the complexity of hydrological processes, eliciting future flood projections with any degree of confidence is highly challenging. These projections are the product of a multi-step modelling chain (Figure 1). General Circulation Models (GCMs) are forced under emissions scenarios to produce outputs that represent physical processes in the atmosphere, ocean, cryosphere, and land surface (e.g., precipitation, temperature) [14]. GCMs are run at a coarse spatial resolution; to make them suitable for hydrological modelling, the outputs are downscaled using statistical or dynamical approaches. Hydrological models (HMs) are used to propagate the downscaled climate signal to hydrological outputs, generating simulations of flow. Flood return periods (RPs) are determined by fitting extreme value distributions to flood peak series (annual maxima or peak-over-threshold) based on daily flow projections. With each step in the chain, uncertainty cascades, propagating (or constraining) the uncertainty through the modelling chain [15,16]. These uncertainties can arise from a number of sources, including model structures, inputs, and parameters. In order to provide meaningful projections, the characterisation and quantification of this uncertainty is essential [15,16].
Figure 1. The multi-step modelling chain: emissions scenario, GCM, downscaling, HM, and extreme value distribution (EVD).
Consideration of an ensemble of models supports the characterisation and quantification of uncertainty. There are two types of ensembles, each accounting for a different source of uncertainty: the multi-model ensemble (MME) and the perturbed physics ensemble (PPE; also known as a perturbed parameter ensemble).
Providing an estimate of structural uncertainty, MMEs consider multiple model structures, whilst the PPE explores parametric uncertainty in a given model through systematic variation of uncertain model parameters [17]. At present, computational demand is a limiting factor in the consideration of multi-model perturbed physics ensembles.
Extreme value (EV) theory is used to extract large-return-period events from the outputs of this hydroclimatological modelling chain [18]. In the UK, flood event return period analysis is carried out using the methods detailed in the Flood Estimation Handbook (FEH) [19]. The FEH recommends fitting either generalised extreme value (GEV) or generalised logistic (GL) distributions to annual maximum (AMAX) flow series to produce estimates for high-return-period events. As with climate and hydrological models, these methods have associated uncertainty relating to the different EV model parameters (fitting error) and the model structure.
Collet et al. [20] investigated future flood events and climate model parameter uncertainty across Scotland by deriving flood projections from the Future Flows Hydrology (FFH) database, a PPE from the third Coupled Model Intercomparison Project (CMIP3). The study suggested that future extreme runoff may increase significantly across Scotland by the 2080s, particularly in eastern catchments at high RPs (i.e., 1:100, 1:200 year). The EV parameter uncertainty associated with these projections was shown to become increasingly dominant for higher RPs, highlighting the need to account for this uncertainty.
However, as climate change science and earth system modelling advance through the incorporation of more processes and the improved parameterisation of physical processes [21], new climate models and projections are made available. The CMIP3 models and Special Report on Emissions Scenarios (SRES) scenarios considered by Collet et al. [20] have been superseded by CMIP5 and Representative Concentration Pathways (RCPs), introducing the need for more up-to-date flood projections. Furthermore, Visser-Quinn et al. [22] highlighted the importance of accounting for different sources of uncertainty in the hydroclimatological modelling chain through consideration of a range of models, finding large variance associated with HMs. As Collet et al. considered a PPE (11 members; 11 different realisations of the GCM HadRM3, and a single hydrological model), multiple models were not considered, and hence, the influence of HM uncertainty could not be accounted for.
This paper derives flood projections from the EDgE (End-to-end Demonstrator for improved decision-making in the water sector in Europe) project [23], which utilised an MME combining five CMIP5-GCMs and four HMs forced under emissions scenarios RCP 2.6 and RCP 8.5 (5 × 4 × 2 chains). These data represent the state of the art and afford the ability to quantify the associated structural and EV parameter uncertainty. Through the fitting of the GEV and GL distributions, the 1:2- and 1:30-year RP events are extracted for 192 UK catchments. The focus of this paper is on the change across two 30-year periods: the baseline (1971-2000) and the 2080s (2069-2098). The 2080s were chosen in alignment with the IPCC [8] to explore the far-future impacts of climate change on flood events. The method is applied at the catchment scale and aggregated to the regional level, with the coarser resolution allowing for possible large-scale trends to be identified. In addition, the analysis is replicated for the 44 Scottish catchments featured in Collet et al. [20], enabling comparison of the change in projections and uncertainties between studies. The overarching aim of this paper is to investigate how extreme runoff may change on a regional level across the UK by the 2080s, as well as the uncertainty associated with these projections.

Methods
The methodology is presented in two stages (Figure 2): data preparation, followed by analysis. In Stage 1, daily annual maximum (AMAX) flows were extracted for 192 catchments across the UK from: (1) observed data maintained by the National River Flow Archive (NRFA; 1976-2005); (2) baseline projections from the EDgE project; and (3) future projections from the EDgE project (2069-2098). The baseline period of 1976-2005 was selected to maximise the observed data available for bias correcting the flow projections; thereafter, the standard baseline period of 1971-2000 was used. After bias correction, two EV methods, GEV and GL, were fitted to the AMAX flows and the 1:2-year and 1:30-year RP flood events identified. These catchment-level data were then aggregated into eight geographic regions in order to facilitate discussion.
In the second stage, the following analysis was applied at the catchment and regional levels: (1) the mean change signal, a quantification of the change in extreme runoff estimates; (2) relative standard deviation (RSD), which captures the spread of model chain outcomes and structural uncertainty; and (3) the probability distribution uncertainty (PDU), the standardised difference between the upper and lower 95% confidence limits across the modelling chains, thus capturing the EV parameter uncertainty. These outputs were subsequently compared with the Collet et al. [20] CMIP3-PPE flood projections. [25]. The HMs are: mHM [26,27], Noah-MP [28], VIC [29], and PCR-GLOBWB2 [30]. Noah-MP and VIC can be further classified as land-surface models (LSMs), simulating land-atmosphere fluxes, whereas mHM and PCR-GLOBWB2 simulate water balance components, i.e., precipitation and evapotranspiration.
Flows forced under both RCP 2.6 and RCP 8.5 were extracted, allowing the implications of the most conservative pathway to be compared to that of the most severe. The focus of this paper is on the change across two 30-year periods: the baseline, 1971-2000, and the future, 2069-2098, referred to as the 2080s. The 2080s is the long-term future period used by the IPCC [8] and has become the standard within climate change impact studies. Furthermore, the future period aligns with that of Collet et al. [20], ensuring consistency in this regard when drawing comparisons between both studies.
AMAX flows were extracted (per calendar year) from both the observed and projected data for all 192 catchments. A threshold of 0.8 was set so that any year that had less than 80% completeness of daily flows was removed. A total of 41 observations in the entire record did not meet this threshold (0.7% of total AMAX flows; a negligible loss of data).
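The completeness filter described above can be illustrated with a minimal Python sketch. It is not the authors' code (the study worked in R); the function name `annual_maxima`, its arguments, and the input layout of `(date, flow)` pairs with `None` for gaps are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import date


def annual_maxima(daily_flows, min_completeness=0.8):
    """Return {year: annual maximum flow} for sufficiently complete calendar years.

    daily_flows: iterable of (datetime.date, flow) pairs; flow is None for a gap.
    min_completeness: minimum fraction of days with a valid flow (0.8 in the text).
    """
    by_year = defaultdict(list)
    for day, flow in daily_flows:
        if flow is not None:
            by_year[day.year].append(flow)

    amax = {}
    for year, flows in by_year.items():
        # Number of days in this calendar year (365 or 366).
        days_in_year = (date(year + 1, 1, 1) - date(year, 1, 1)).days
        # Years with less than the required completeness are dropped.
        if len(flows) / days_in_year >= min_completeness:
            amax[year] = max(flows)
    return amax
```

A year with, say, 200 valid days out of 366 (about 55% complete) would be dropped, whereas a year with 300 valid days (about 82%) would be retained.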
The projected AMAX flows were then bias corrected with reference to the observed AMAX flows. The bias correction used the baseline period 1976-2005 to maximise the use of the gauged NRFA data. However, as 1971-2000 is standard (as of AR5) as a baseline, once the bias correction was complete, this period was used as the baseline in the analysis. The difference in time periods is possible because the purpose of the observed baseline is to correct the magnitude of the simulated flows, rather than to align the temporal occurrence of events.
Figure 3 (caption, partially recovered). [20]. Catchment fill colour is by region; these regions are the focus of the regional analysis.

Bias Correction
Bias in projections represents errors in the statistical properties relative to the observations. Using the R package 'qmap' (version 1.0-4; [31]), the projected AMAX flows were bias corrected per model chain and per catchment using a quantile mapping approach. The approach, summarised in Figure 2, looks to improve the fit for extreme flows by grouping the observed and projected AMAX flows into three equal-sized groups: low, average, and high flows. A linear transformation function is determined (using the function 'fitQmap'), which is then applied to each quantile (using the function 'doQmap'), rescaling the statistical properties of the projections to those observed.
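To make the idea concrete, the following Python sketch fits one linear transfer function per flow group (low, average, high) and applies it to a projected value. It is a simplified illustration under stated assumptions and not the 'qmap' implementation: the function names are hypothetical, the observed and simulated series are assumed to have equal length, and values beyond the fitted range are extrapolated with the nearest group's line.

```python
def fit_linear(x, y):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0.0:
        # Degenerate group (all x equal): fall back to a pure shift.
        return mean_y - mean_x, 1.0
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return mean_y - b * mean_x, b


def fit_tercile_mapping(observed, simulated):
    """Fit a linear transfer function for each third of the sorted AMAX series."""
    obs, sim = sorted(observed), sorted(simulated)
    if len(obs) != len(sim) or len(sim) < 6:
        raise ValueError("need two series of equal length with at least 6 values")
    n = len(sim)
    cuts = [0, n // 3, 2 * n // 3, n]
    groups = []
    for lo, hi in zip(cuts[:-1], cuts[1:]):
        a, b = fit_linear(sim[lo:hi], obs[lo:hi])
        # Store the upper simulated bound of the group with its coefficients.
        groups.append((sim[hi - 1], a, b))
    return groups


def apply_mapping(groups, value):
    """Correct one simulated flow using the transfer function of its group."""
    for upper, a, b in groups[:-1]:
        if value <= upper:
            return a + b * value
    _, a, b = groups[-1]  # high-flow group, also used above the fitted range
    return a + b * value
```

In practice the mapping would be fitted on the 1976-2005 baseline pair (observed, simulated) for each model chain and catchment, then applied to both baseline and future AMAX flows from the same chain.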
Validation of the quantile mapping outputs is provided in Supplementary Materials (Figures S3 and S4). The bias-corrected simulated extreme flows over the baseline period were compared with observed flows to check for consistency. In all cases, observations were within the lower and upper quartiles of the EDgE projections (Figure S3). Four catchments were identified as having an observed-projected median ratio of less than 0.9 or greater than 1.1 (Figure S4), though these were retained for the analysis. Differences in medians were greatest for the 1:30-year RP; this may be attributed to the greater EV uncertainty associated with a higher return period.

Extreme Value (EV) Theory
The objective of EV theory is to quantify the stochastic behaviour of a process at extreme levels [18]; hence, its common application in hydrology to model extremely high or low flows (e.g., [32,33]). The theory involves fitting an EV distribution to AMAX or peak-over-threshold (POT) series, depending on the distribution, to extract the desired RP.
For the purposes of this study, two EV distributions, generalised extreme value (GEV) and generalised logistic (GL), were fitted to the observed and projected AMAX flows. The methods detailed in the FEH are standard practice in the UK for flood magnitude RP event analysis [19]. Whilst the GEV distribution is commonplace, the GL distribution represents the standard in the UK, as it tends to provide more conservative results [20].
The GEV distribution is a three-parameter distribution, and has the following quantile function [19]:
$$Q(F) = \xi + \frac{\alpha}{k}\left[1 - (-\ln F)^{k}\right], \quad k \neq 0 \tag{1}$$
where ξ is the location parameter, α is the scale parameter, and k is the shape parameter. F is the non-exceedance probability, so for an event with an RP of T years, F = 1 − (1/T). The GL distribution uses the same three parameters, and has the following quantile function [19]:
$$Q(F) = \xi + \frac{\alpha}{k}\left[1 - \left(\frac{1 - F}{F}\right)^{k}\right], \quad k \neq 0 \tag{2}$$
Expressions (1) and (2) therefore provide the flow value Q corresponding to a specified non-exceedance probability.
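As an illustration of how a return-period flow is read off either distribution, the sketch below evaluates the two quantile functions in Python, assuming the standard FEH/Hosking parameterisation (location ξ, scale α, shape k). The function names and the parameter values in the usage note are hypothetical.

```python
import math


def non_exceedance(return_period):
    """Non-exceedance probability F = 1 - 1/T for a T-year event."""
    return 1.0 - 1.0 / return_period


def gev_quantile(F, xi, alpha, k):
    """GEV flow for non-exceedance probability F (Gumbel limit when k == 0)."""
    y = -math.log(F)
    if k == 0.0:
        return xi - alpha * math.log(y)
    return xi + alpha / k * (1.0 - y ** k)


def gl_quantile(F, xi, alpha, k):
    """GL flow for non-exceedance probability F (logistic limit when k == 0)."""
    y = (1.0 - F) / F
    if k == 0.0:
        return xi - alpha * math.log(y)
    return xi + alpha / k * (1.0 - y ** k)
```

For example, with illustrative parameters ξ = 100, α = 20, k = −0.1, `gl_quantile(non_exceedance(2), 100.0, 20.0, -0.1)` returns the location parameter itself, because (1 − F)/F = 1 at F = 0.5; under the GL distribution the 1:2-year flow is therefore the median.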
Both distributions were fitted to the observed and projected AMAX using the R packages "lmomco" (version 2.3.2; [34]) for GL and "extRemes" (version 2.0-11; [35]) for GEV. The GEV and GL distributions were fitted using the L-moments approach [19] to obtain the distribution parameters. This approach, which is based on L-moments instead of conventional moments, is recommended by the FEH due to its robustness in the presence of unusually small or large data [19].
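For readers unfamiliar with the L-moments approach, the following Python sketch shows the general procedure: compute sample L-moments from the ordered AMAX series, then convert them to distribution parameters. It uses the widely published Hosking estimators (including Hosking's polynomial approximation for the GEV shape), which is an assumption; the R packages used in the study may differ in detail, and all function names here are hypothetical.

```python
import math


def sample_lmoments(data):
    """Return (l1, l2, t3): L-location, L-scale and L-skewness of the sample."""
    x = sorted(data)
    n = len(x)
    if n < 3:
        raise ValueError("at least three values are required")
    # Unbiased probability-weighted moments b0, b1, b2 (0-based rank i).
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    l1 = b0
    l2 = 2.0 * b1 - b0
    l3 = 6.0 * b2 - 6.0 * b1 + b0
    return l1, l2, l3 / l2


def fit_gev(data):
    """GEV (xi, alpha, k) from L-moments, using Hosking's shape approximation."""
    l1, l2, t3 = sample_lmoments(data)
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c
    if abs(k) < 1e-9:  # Gumbel limit
        alpha = l2 / math.log(2.0)
        return l1 - 0.5772156649 * alpha, alpha, 0.0
    g = math.gamma(1.0 + k)
    alpha = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    xi = l1 - alpha * (1.0 - g) / k
    return xi, alpha, k


def fit_gl(data):
    """GL (xi, alpha, k) from L-moments."""
    l1, l2, t3 = sample_lmoments(data)
    k = -t3
    if abs(k) < 1e-9:  # symmetric (logistic) limit
        return l1, l2, 0.0
    s = math.sin(k * math.pi)
    alpha = l2 * s / (k * math.pi)
    xi = l1 - alpha * (1.0 / k - math.pi / s)
    return xi, alpha, k
```

Because L-moments are linear combinations of the ordered data, a single very large or very small annual maximum influences the fitted parameters less than it would under conventional moments, which is the robustness property noted above.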
The 95% EV confidence intervals (CIs) were obtained using a parametric bootstrapping method, in which a sample of size n, equal to the length of the original data, was simulated from the fitted distribution. The EV distributions were then fitted to the simulated sample, and the resulting parameter estimates were stored. This was repeated for 1000 bootstrap iterations in order to obtain a sample from the population of the stored parameters; the number of iterations was determined through trial and error, weighing computational time against the difference in parameter estimates. The percentile method was then used to determine the 95% confidence interval estimates. The 'ci.fevd' function within the "extRemes" package and the 'qua2ci.simple' function within the "lmomco" package were used to determine the CIs for the GEV and GL distributions, respectively.
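The bootstrap loop can be sketched in Python as follows for the GL distribution. This is an illustrative sketch rather than a reproduction of 'ci.fevd' or 'qua2ci.simple': the function names are hypothetical, the GL fit uses the standard Hosking L-moment estimators, and, as a simplifying assumption, the return-level estimate (not the parameter set) is stored for each replicate before taking percentiles.

```python
import math
import random


def gl_quantile(F, xi, alpha, k):
    """GL flow for non-exceedance probability F."""
    y = (1.0 - F) / F
    if k == 0.0:
        return xi - alpha * math.log(y)
    return xi + alpha / k * (1.0 - y ** k)


def fit_gl(data):
    """GL (xi, alpha, k) from sample L-moments."""
    x = sorted(data)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    l1, l2 = b0, 2.0 * b1 - b0
    k = -(6.0 * b2 - 6.0 * b1 + b0) / l2
    if abs(k) < 1e-9:
        return l1, l2, 0.0
    s = math.sin(k * math.pi)
    alpha = l2 * s / (k * math.pi)
    return l1 - alpha * (1.0 / k - math.pi / s), alpha, k


def bootstrap_ci(data, return_period, n_boot=1000, level=0.95, seed=1):
    """Return (estimate, ci_low, ci_up) for the T-year flow via parametric bootstrap."""
    rng = random.Random(seed)
    params = fit_gl(data)
    F = 1.0 - 1.0 / return_period
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Simulate n flows from the fitted GL by inverse-transform sampling.
        sample = []
        for _ in range(n):
            u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
            sample.append(gl_quantile(u, *params))
        # Refit to the simulated sample and store the return-level estimate.
        replicates.append(gl_quantile(F, *fit_gl(sample)))
    replicates.sort()
    # Percentile method with nearest-rank indices.
    low = replicates[int((1.0 - level) / 2.0 * (n_boot - 1))]
    up = replicates[int(round((1.0 + level) / 2.0 * (n_boot - 1)))]
    return gl_quantile(F, *params), low, up
```

Fixing the random seed makes the interval reproducible; in a real analysis the seed choice and the number of iterations would both be checked for their effect on the interval width.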
To validate the EV distributions, both GEV and GL were manually fitted for an arbitrary station (NRFA station ID 2001: Helmsdale at Kilphedir) using the L-moments approach. The manual fitting procedure and cumulative distribution function plots (Figures S1 and S2) are provided in the Supplementary Materials. The location, scale, and shape parameters from manual fitting were found to be in agreement with those fit using the R packages.

Regionalisation
The catchment-level data were aggregated into eight geographic regions (Figure 3) based on measuring authority (provided by the NRFA). Examining the results at a coarser resolution aided interpretation and facilitated discussion by allowing potential regional trends to be identified. The subsequent analysis in stage 2 was applied at both the catchment and regional levels.

Change in Extreme Runoff
In order to quantify the projected change in extreme runoff at the catchment and regional levels, the mean change signal was calculated. The mean change signal is the mean of the percentage difference between future and baseline extreme runoff estimates across the modelling chains (Equation (3)):
$$\text{Mean change signal} = \frac{1}{N}\sum_{i=1}^{N}\frac{Q_{F,i} - Q_{BL,i}}{Q_{BL,i}} \times 100 \tag{3}$$
where $Q_{F,i}$ is the future runoff estimate and $Q_{BL,i}$ is the baseline runoff estimate for model chain i, and N is the number of model chains. This was calculated per scenario, per EV method, per catchment and region, and per RP.
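A minimal Python sketch of this calculation is given below, assuming the percentage change is computed for each model chain first and then averaged; the function name and the example values are hypothetical.

```python
def mean_change_signal(future, baseline):
    """Mean percentage change between paired future and baseline estimates.

    future, baseline: sequences of runoff estimates, one pair per model chain,
    for a single scenario, EV method, location and return period.
    """
    if len(future) != len(baseline) or len(future) == 0:
        raise ValueError("future and baseline must be non-empty and of equal length")
    changes = [(q_f - q_bl) / q_bl * 100.0 for q_f, q_bl in zip(future, baseline)]
    return sum(changes) / len(changes)
```

For two illustrative chains with baseline estimates of 100 and 200 and future estimates of 110 and 260, the per-chain changes are +10% and +30%, giving a mean change signal of +20%.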

Uncertainty Characterisation
In order to capture the EV parameter uncertainty and the model chain (GCM-HM) structural uncertainty associated with the projections, each source was analysed separately. These uncertainties were calculated per scenario, per time period, per EV method, per catchment and region, and per RP.
The model chain structural uncertainty was captured by determining the relative standard deviation (RSD), which represents the spread of model chain outcomes, a traditional proxy measure of uncertainty [36]:
$$\mathrm{RSD} = \frac{\sigma}{\mu} \times 100 \tag{4}$$
where σ is the sample standard deviation and µ is the sample mean of the distribution. This is a standardised measure, allowing comparison between data with different means and spread. It is based on the premise that the larger the dispersion between the peak flow projections provided by each model chain, the greater the associated uncertainty. A catchment/region with a high RSD indicates low agreement between model chains and, hence, high model chain structural uncertainty; a catchment/region with a small RSD indicates good agreement between model chains and, hence, low model chain structural uncertainty.
The uncertainty associated with the EV model parameters was assessed by firstly determining the relative coefficient of uncertainty, RCu, for each model chain:
$$\mathrm{RCu} = \frac{CI_{up} - CI_{low}}{E} \times 100 \tag{5}$$
where E is the runoff estimate, $CI_{up}$ is the upper 95% confidence limit, and $CI_{low}$ is the lower 95% confidence limit. Dividing by the runoff estimate standardises the value, allowing comparison of CI ranges for locations with different runoff estimates at a given RP. This is necessary as larger peak flows generally have larger CIs, which leads to higher absolute uncertainty [37]. The mean RCu over all N model chains was then taken to determine the probability distribution uncertainty, PDU:
$$\mathrm{PDU} = \frac{1}{N}\sum_{i=1}^{N}\mathrm{RCu}_{i} \tag{6}$$
A catchment/region with a high PDU indicates a wide 95% CI relative to the runoff estimate and, hence, high EV parameter uncertainty; a catchment/region with a small PDU indicates a narrow 95% CI relative to the runoff estimate and, hence, low EV parameter uncertainty.
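The two uncertainty measures can be sketched in Python as follows. The sketch assumes both are expressed as percentages (consistent with how the results are reported) and that the sample standard deviation uses the n − 1 denominator; the function names and example values are hypothetical.

```python
import statistics


def relative_std_dev(estimates):
    """RSD (%) of the runoff estimates from all model chains."""
    return statistics.stdev(estimates) / statistics.mean(estimates) * 100.0


def relative_coefficient_of_uncertainty(estimate, ci_low, ci_up):
    """RCu (%): width of the 95% CI relative to the runoff estimate."""
    return (ci_up - ci_low) / estimate * 100.0


def probability_distribution_uncertainty(chains):
    """PDU (%): mean RCu over model chains.

    chains: sequence of (estimate, ci_low, ci_up) tuples, one per model chain.
    """
    values = [relative_coefficient_of_uncertainty(e, lo, up) for e, lo, up in chains]
    return sum(values) / len(values)
```

For three illustrative chains with estimates of 90, 100 and 110, the RSD is 10%. For two chains with (estimate, lower, upper) of (100, 80, 120) and (200, 180, 240), the RCu values are 40% and 30%, so the PDU is 35%.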

Comparison with CMIP3-PPE
The analysis was repeated on data from Collet et al. [20] to explore how projections and the associated uncertainties have changed with the current generation of models and scenarios. The study derived flood projections from the FFH database, which utilised an 11-member CMIP3-PPE (11 different realisations of HadRM3, and a single hydrological model) to simulate projections of daily flow [38]. It is derived from Future Flows Climate, an 11-member ensemble based on the SRES A1B emissions scenario outlined in AR4.
A total of 95 catchments across Scotland were considered by Collet et al. [20]; of these, 44 catchments matched with those in this study (Figure 3b).
The analysis detailed in Sections 2.2.1 and 2.2.2 was performed on their raw data at the catchment and regional levels. It is important to note that, as Collet et al. [20] was based on a PPE, computation of RSD quantifies the climate model parameter uncertainty, rather than the structural uncertainty quantified in this work (an MME).

Results
The following results are presented at the regional level to aid interpretation and discussion; see the Supplementary Materials (Figures S7-S9) for raw results at the catchment level. Section 3.1 presents the mean change signal results, which quantify the projected change in extreme runoff across the UK by the 2080s. Section 3.2 presents the uncertainty associated with the extreme runoff projections; firstly, the model chain structural uncertainty (captured through the determination of RSD), and secondly, the EV parameter uncertainty (captured through the determination of PDU).

Change in Extreme Runoff by the 2080s
Figure 4 presents the regional mean change signal for both emissions scenarios (RCPs 2.6 and 8.5) and RPs (1:2- and 1:30-year) using the GEV distribution. The GL distribution is not shown; across all modelling chains and RPs, it was found that both GEV and GL produced very similar change signal estimates. In the Supplementary Materials (Figure S5), it is shown that the absolute difference between the distributions is always ≤3%, with a median difference <1%. Both distributions are special cases of the Kappa distribution, with the same three parameters [19], which is the probable reason for this similarity.
Beginning with RCP 2.6 and the 1:2-year RP (Figure 4a), the largest and smallest mean change signals are projected in South West England (+19%) and North Scotland (+3%), respectively. For the 1:30-year RP (Figure 4c), the mean change signal is lower. The highest and lowest projected changes occur in the same regions, having fallen to +15% and −3%, respectively. The largest reductions are projected in the northern regions of the UK. Here, North Scotland, East Scotland, and Northern Ireland are all projected to see a reduction in the 1:30-year RP flood magnitude (negative mean change signal, shown in yellow).
Looking to RCP 8.5 and the 1:2-year RP (Figure 4b), East Scotland is projected to see the highest mean change signal (34%), with Central and South East England seeing the lowest (18%). As with RCP 2.6, the change signal is lower for the 1:30-year RP (Figure 4d). Generally, the largest reductions are once more projected in the northern regions. Here, the minimum change signal is projected in East Scotland (8%), whilst the highest occurs in the South West of England (20%).
Comparing between emissions scenarios, the mean change signal estimates are significantly higher under RCP 8.5 than RCP 2.6 for all regions, with a median increase of 20% at the 1:2-year RP and 8% at the 1:30-year RP. The difference in mean change signal between RPs is more noticeable under RCP 8.5; the median difference under RCP 2.6 is 5%, whereas it is 19% under RCP 8.5. This is largely a result of the high change signal estimates exhibited in Figure 4b.

Model Chain Structural Uncertainty
The regional RSD is shown in Figure 5 for both time periods (baseline and 2080s), emissions scenarios, and RPs using the GEV distribution. As before, differences between the GEV and GL distributions were minimal (Figure S6); consequently, only the GEV results are presented.
Beginning with the baseline period and the 1:2-year RP (Figure 5a), RSD is lowest in West Scotland (3%), and does not exceed 6% in any region. For the 1:30-year RP (Figure 5d), RSD estimates are slightly larger, with the lowest exhibited in North Scotland (5%), and highest in South West England (9%).
Neither emissions scenario consistently exhibits larger RSD estimates than the other, and the differences between the two remain low, never exceeding 6%. From the baseline period to the 2080s, RSD is estimated to increase across all regions by a median value of 10%, implying greater agreement between the multi-model chains over the baseline than over the 2080s.

Extreme Value Parameter Uncertainty
Figure 6 shows the regional PDU associated with the GEV and GL distributions for both time periods (baseline and 2080s), emissions scenarios, and RPs.
The GL distribution exhibits lower parameter uncertainty than GEV for the 1:2-year RP for all regions across both time periods and scenarios, with a maximum PDU difference between distributions of 3% exhibited in Central and South East England and a median difference of approximately 2%. The lowest PDU is found in North Scotland under RCP 8.5 (Figure 6f) at 18%, and the highest in Central and South East England over the baseline (Figure 6a) at 30%.
For the 1:30-year RP, the opposite is true; the GEV distribution exhibits lower parameter uncertainty for all regions across both time periods and scenarios, with a maximum PDU difference between distributions of 8% in East Scotland, and a median difference of 6%. There is a large increase in EV parameter uncertainty from the 1:2-year to the 1:30-year RP, as shown by the PDU estimates. North Scotland under RCP 8.5 (Figure 6i) again exhibits the lowest PDU (29%), and West Scotland over the baseline (Figure 6j) exhibits the highest (63%).
The lower EV uncertainty associated with GEV for the 1:30-year RP is more notable than the lower uncertainty of GL for the 1:2-year RP. This is, in part, due to the higher EV uncertainty associated with larger RPs.

Comparison with a CMIP3-PPE
Tables S1-S3 in the Supplementary Materials compare the mean change signals and associated uncertainties of this work (a CMIP5-MME) with those of Collet et al. [20] (a CMIP3-PPE). Only results for the 1:2-year RP are compared, since it is statistically more robust than the 1:30-year RP.
Table S1 compares the regional mean change signal estimates across Scotland at the 1:2-year RP using GEV and GL. The change signal estimates under SRES A1B fall in between RCP 2.6 and RCP 8.5, with a median change signal under SRES A1B of 16%, compared with 9% under RCP 2.6 and 32% under RCP 8.5. The differences in change signal estimates between GEV and GL are negligible for both this work and Collet et al. [20].
Table S2 compares the regional RSD estimates across Scotland at the 1:2-year RP using GEV and GL for the baseline and 2080s. The RSD estimates over the baseline are extremely similar between all three emissions scenarios, but are slightly larger under SRES A1B. By the 2080s, the RSD is notably lower for Collet et al. [20], with a median RSD of 7%, compared to 17% under RCP 2.6 and 20% under RCP 8.5. As with the mean change signal, the differences in RSD between GEV and GL are negligible in both studies.
Table S3 compares the regional PDU estimates across Scotland at the 1:2-year RP using GEV and GL for the baseline and 2080s. From the baseline to the future, all regions exhibit an increase in PDU under SRES A1B, whereas under RCP 2.6 and RCP 8.5, PDU decreases. However, for both studies, the estimates fall within a similar range, and GL exhibits lower EV parameter uncertainty than GEV.

Discussion

Change in Extreme Runoff by the 2080s
The results indicate that by the 2080s, flood events may become significantly greater in magnitude across many of the UK regions. For the 1:2-year RP, extreme runoff is projected to increase across every region; this increase is much greater under RCP 8.5 than RCP 2.6. East Scotland is projected to experience the largest change, exhibiting a 34% increase under RCP 8.5 by the 2080s. Though the return period is small, with a 50% chance of flows equalling or exceeding the event in any given year, these significant increases in runoff, combined with increasing urbanisation in flood-prone areas across the UK [39], could still result in considerable flood impacts for this event by the 2080s without the implementation of sufficient flood risk management strategies.
The change signal decreases across the UK at the higher 1:30-year return period, with North and East Scotland and Northern Ireland even experiencing small, negative change signals under RCP 2.6. However, the remaining five regions still exhibit significant increases in extreme runoff, with a maximum increase of 20% found in South West England under  Table S4) suggests that a 20% increase to 1:30-year runoff could be approximately equivalent to a present day 1:100-year event in English regions. This could result in an additional 764,996 people and 453,729 properties being at flood risk in England [40], thus highlighting the greater flood exposure that could result from these projections.
Across emissions scenarios, it can be observed that RCP 2.6 and RCP 8.5 generally experience the same regional change signal trends, but at a lower magnitude under RCP 2.6. This is to be expected; RCP 8.5 is a much more severe scenario than the "stringent" pathway of RCP 2.6. However, in the South of England, large increases in extreme runoff are exhibited irrespective of emissions scenario, raising concerns that, even under the best case, this region could experience flood events that are 15% larger in magnitude than present-day 1:30-year runoff events.

Associated Uncertainties
By comparing the findings shown in Figures 5 and 6, it can be seen that EV parameter uncertainty is the dominant uncertainty. For the 1:2-year RP, the EV uncertainty is moderate, with a PDU range of 18-30%; however, for the higher 1:30-year RP, uncertainties range from 29-63%. Uncertainty this high limits the usefulness of the change signal findings; hence, it is critical to quantify this uncertainty.
The biggest source of EV parameter uncertainty may be related to time period length; the larger the amount of data considered, the better the estimation of the EV parameters and, hence, the better the distribution fit. Thirty-year time periods were used for nonstationary analysis of river flow time series; however, the likelihood of a high-return-period event occurring during an interval of this length is relatively low. The associated EV uncertainties reflect this, with the 1:30-year RP exhibiting, by some margin, the highest PDU. To reduce EV uncertainty (aside from only considering the smallest RPs), a pooling-group method [19] can be used to increase the flow series length. However, this method relies on pooling hydrologically similar catchments (initially in terms of size, wetness, and soil properties), which would reduce the overall number of catchments. Furthermore, a time series five times the length of the RP is required (e.g., 5 × 30 = 150 years or 5 × 100 = 500 years), which becomes increasingly difficult to obtain for high return periods [20]. Therefore, such an approach would be unsuitable for this work.
The GL distribution exhibits lower parameter uncertainty than the GEV distribution for the 1:2-year RP, whereas for the 1:30-year RP, the uncertainty associated with GEV is lower. The GL distribution is recommended for UK flood event analysis, as it tends to provide more conservative results [19]; however, as found by Ul Hassan et al. [41], there is often no single best distribution for a range of spatially and characteristically different catchments, and, as this study has found, across RPs. In addition, studies that have found better performance with a particular distribution on one level (i.e., catchment or regional) might find that another performs better when the analysis is repeated on a finer or coarser scale. In this study, this was not the case; as with the regional results, GL tends to exhibit lower parameter uncertainty at the catchment level for the 1:2-year RP, and GEV tends to exhibit lower uncertainty for the 1:30-year RP (see catchment results in Supplementary Materials, Figure S9).
With regard to the model chain structural uncertainty, low uncertainty was found over the baseline period, with a median RSD of 8% across all regions. However, by the 2080s, the median RSD across all regions increased to 18%. This marked increase likely occurs because the present is more similar to the conditions the models were trained under, leading to greater agreement between the models over the baseline. The 2080s are more uncertain, and each model is structurally different, causing the models to diverge over time.
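As a reminder of what the RSD measures, the following minimal sketch computes the relative standard deviation (standard deviation divided by the mean, as a percentage) across the members of an ensemble. The use of the population standard deviation and the example flows are assumptions made for illustration; they are not values from the EDgE ensemble.

```python
from statistics import mean, pstdev


def relative_std_dev(values):
    """Relative standard deviation (%) of a set of ensemble estimates.

    Uses the population standard deviation; a sample standard deviation
    could be substituted if that matches the intended definition.
    """
    m = mean(values)
    if m == 0:
        raise ValueError("RSD is undefined when the ensemble mean is zero")
    return 100.0 * pstdev(values) / abs(m)


# Hypothetical 1:2-year flows (m^3/s) from four model chains for one catchment.
baseline_flows = [98.0, 102.0, 100.0, 100.0]
future_flows = [95.0, 140.0, 110.0, 125.0]

print(f"Baseline RSD: {relative_std_dev(baseline_flows):.1f}%")
print(f"2080s RSD:    {relative_std_dev(future_flows):.1f}%")
```

Chains that agree closely over the baseline give a small RSD, while chains that diverge in the future period give a larger one, which is the pattern described above.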

CMIP3-PPE Comparison
Before discussing the findings of the comparison with Collet et al. [20], there are a few important details to note. Collet et al. analysed flow projections across Scotland; consequently, only Scottish regions were subject to comparison. As their study is based upon AR4, their work considered a CMIP3 climate model, an SRES emissions scenario (SRES A1B), and a baseline period of 1961-1990. The term 'baseline' therefore encompasses two different periods, depending on which study is the subject. Furthermore, as they analysed flow projections from a PPE, computation of the RSD yields the climate model parameter uncertainty, rather than the model chain structural uncertainty.
Despite these differences, it can be seen that both studies are generally in agreement. The projected change in extreme runoff for Collet et al. [20] falls in between the change signals from this work, with a median change signal of 16% under SRES A1B compared to 9% and 32% under RCP 2.6 and RCP 8.5, respectively. At a superficial level, at least, this finding may be expected. There is no RCP scenario that directly corresponds to SRES A1B, though SRES A1B has been described as a "middle-ground" scenario relative to RCP 2.6 and RCP 8.5 [8,42]. This work also agrees with the finding that the largest change in runoff is exhibited in the East of Scotland, a trend that Collet et al. attribute to the size of the eastern catchments and larger attenuation resulting from lakes and reservoirs.
Both studies exhibit similarly low RSD estimates over the baseline period. Though this similarity is slightly surprising (given the differences in models and baseline periods, and the disparity in what RSD represents), low uncertainty over the baseline is to be expected; the present is more similar to the model training conditions. By the 2080s, greater RSD is exhibited in this work, with a median RSD of 17% and 20% under RCP 2.6 and RCP 8.5, respectively, compared with 7% under SRES A1B. This could indicate that structural uncertainty has a greater influence on models than parameterisation uncertainty, as the former results in the largest RSD by the 2080s. However, due to the identified differences between studies, and the fact that Collet et al. did not assess HM parameter uncertainty (which would require an HM-PPE), a dedicated study (e.g., see [43,44]) would be required to investigate this further.
Both studies agree that GL exhibits lower associated uncertainty than GEV across all three regions for the 1:2-year RP, suggesting that the GL model may be the more suitable of the two for low RP analysis. The EV uncertainties themselves are generally consistent between studies, with the PDU for both GEV and GL falling in the range of 21-29% under all scenarios. However, Collet et al. show an increase in PDU from the baseline to the 2080s, suggesting greater EV parameter uncertainty by the 2080s, whereas this study found the uncertainty to decrease over the same period. It is possible that this observation is not of significance, and is rather a result of the identified differences between the studies.

Limitations
The main limitation of this work is the extent of the uncertainties associated with the extreme runoff projections. For the 1:2-year RP, the EV parameter uncertainty (PDU) and the model chain structural uncertainty (RSD) are moderate, not exceeding 30% and 21%, respectively. However, for the 1:30-year RP, the EV parameter uncertainty alone exceeds 60% in a number of regions, indicating a wide 95% confidence interval for the flood projections. This limits the conclusions that can be drawn from the change signal findings, as it is unknown which results may be reflective of reality, and which are more a product of the associated uncertainty. However, by capturing and quantifying these uncertainties as opposed to neglecting them, the value of the projections can be said to increase [15,16].
The spatial coverage of catchments was sparse in certain regions, most notably Northern Ireland and parts of North and West Scotland (Figure 3a). Regional analysis is heavily influenced by data availability, since the mean change signal and uncertainty estimates for each region are aggregations of the catchments found within. This limitation of data availability is common in similar studies (e.g., [20,37]). Although not performed in this work due to the large number of catchments, recent innovations in spatiotemporal extreme value analysis offer methods to improve the estimation of model parameters for catchments with limited data by "pooling" together information from neighbouring locations (e.g., [45,46]).
The current version of the EDgE modelling chain has not considered artificial influences that alter natural runoff. Flow records for sixty percent of all UK gauging stations are significantly affected by flow manipulations [47]. These flow manipulations can come in the form of river regulation, water abstractions for public water supply and industrial use, groundwater recharge, and outflows from sewage treatment works. With a growing population and urbanisation, the presence of artificial influences that affect runoff is projected to increase [48]. Therefore, it is important to consider these influences and their increasing presence when analysing future extreme flows.

Scope for Future Research
To take this work further, the model chain structural uncertainty could be partitioned into GCM and HM structural uncertainty to allow for more thorough analysis of individual model uncertainty. This may be possible through the application of a quasi-ergodic (QE) analysis of variance (ANOVA) approach to the hydroclimatological model chain (e.g., [22]). However, it has not been determined if using such an approach would be possible, as the EV is not technically a time series (as is required for QE-ANOVA).
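As a simplified illustration of such a partition, the sketch below applies a plain two-way ANOVA (without replication) to a matrix of extreme-flow estimates indexed by GCM and HM. This is not the QE-ANOVA method referred to above, and the input values are hypothetical; it only shows how the total spread across chains can be split into GCM, HM, and interaction contributions.

```python
def partition_variance(x):
    """Split the spread of x[g][h] (GCM g, HM h) into fractional contributions.

    Returns the share of the total sum of squares explained by GCM main
    effects, HM main effects, and the GCM-HM interaction (residual).
    """
    n_g, n_h = len(x), len(x[0])
    grand = sum(sum(row) for row in x) / (n_g * n_h)
    gcm_means = [sum(row) / n_h for row in x]
    hm_means = [sum(x[g][h] for g in range(n_g)) / n_g for h in range(n_h)]

    ss_total = sum((x[g][h] - grand) ** 2 for g in range(n_g) for h in range(n_h))
    ss_gcm = n_h * sum((m - grand) ** 2 for m in gcm_means)
    ss_hm = n_g * sum((m - grand) ** 2 for m in hm_means)
    ss_inter = ss_total - ss_gcm - ss_hm

    if ss_total == 0:
        return {"gcm": 0.0, "hm": 0.0, "interaction": 0.0}
    return {
        "gcm": ss_gcm / ss_total,
        "hm": ss_hm / ss_total,
        "interaction": ss_inter / ss_total,
    }


# Hypothetical 1:30-year flows (m^3/s): rows are 5 GCMs, columns are 4 HMs.
flows = [
    [210.0, 225.0, 198.0, 240.0],
    [250.0, 262.0, 241.0, 280.0],
    [190.0, 205.0, 185.0, 215.0],
    [230.0, 238.0, 220.0, 255.0],
    [270.0, 290.0, 260.0, 300.0],
]
for source, share in partition_variance(flows).items():
    print(f"{source}: {100 * share:.1f}% of total variance")
```

Applied to a single EV estimate per chain, this decomposition avoids the time series requirement, though it does not separate forced change from internal variability in the way a QE-ANOVA would.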
Parameter uncertainty in models, which has been shown to be comparable to or greater than structural uncertainty [43], could be considered for a more comprehensive analysis of model uncertainty. To assess both uncertainty types, MMEs and PPEs could be used in combination (a multi-model PPE). The current issue with this is computational demand, but with continued technological advancement, this may become a more feasible proposition in the near future.
The methodology is readily transferable, as evidenced by replicating the method from Collet et al. [20] and using it in this study. It could be further adapted into a framework to establish a consistent method for future climate change impact studies, which would allow for intercomparison between works. As the EDgE project utilised the ISI-MIP subset of CMIP5 GCMs, this study can be compared on a cross-sectoral basis. For example, Rosenzweig et al. [49] used the ISI-MIP subset of models in a study of the impact of climate change on the agricultural sector. Other studies have used the ISI-MIP subset to investigate the impact of climate change on biomes, fisheries, and coastal systems, amongst others [25]. Through comparison, a consistent and comprehensive picture of the cross-sectoral impacts of climate change projections can be built up.
The Sixth Assessment Report (AR6) will introduce the next generation of models, CMIP6, and a new set of forcing scenarios combining RCPs with Shared Socioeconomic Pathways (SSPs) [50]. SSPs are scenarios of projected global socioeconomic changes up to 2100, which are used to derive greenhouse gas emissions scenarios with different climate policies [51]. These scenarios grant an understanding of the extent to which societal developments can affect the severity of climate change risks and response options, something that was not possible with the scenarios of AR5. Just as this study did with Collet et al. [20], there will be a need to repeat and compare with this study for the upcoming generation of models and scenarios to understand how projections differ and how the associated uncertainties might have changed.

Conclusions
With evidence suggesting that climate change is resulting in changes within the hydrologic cycle, there is concern that in the future, flood events will become more frequent and severe in magnitude. The ability to robustly model the climate and hydrological response under climate change is therefore critical.
Collet et al. [20] investigated the projected change in future flood events across Scotland and the uncertainty associated with these projections. The study, which was based upon AR4, suggests that much of Scotland may experience large increases in extreme runoff by the 2080s. However, two areas of development are identified: (1) The release of AR5 introduced a new generation of climate models (CMIP5) and emissions scenarios (RCPs), rendering the findings out of date; and (2) the work considered a PPE, yet, as Visser et al. [52] highlighted, there is a need to account for different sources of uncertainty through a range of models. This paper addresses these areas by deriving flood projections from the EDgE MME (five CMIP5-GCMs, four HMs, two RCPs) with the aim of investigating how extreme runoff may change on a regional level across the UK by the 2080s and the uncertainty associated with these projections.
The results suggest that by the 2080s, many UK regions could experience large increases in extreme runoff, particularly under RCP 8.5. Under this scenario, for the 1:2-year RP, the largest increases were exhibited in the north (North, East, and West Scotland). For the 1:30-year RP, irrespective of emissions scenario, the largest increases were found in the south (South West England, Central and South East England, Wales, and North West England). These projected increases in extreme runoff, coupled with increasing urbanisation in flood-prone areas across the UK, could result in severe flood impacts by the 2080s without the implementation of sufficient flood risk management strategies.
The model chain structural and EV parameter uncertainties associated with these projections were evaluated. RSD estimates indicated low model chain structural uncertainty over the baseline, with higher uncertainty by the 2080s, a result of the present being more similar to the conditions the models were trained under. The EV parameter uncertainty increased significantly from the 1:2-year return period to the 1:30-year return period, with the PDU exceeding 60% in some regions. This was considered the main limitation of this work; high uncertainty limits the conclusions that can be drawn from the extreme flow projections. However, by capturing these uncertainties as opposed to neglecting them, the value of the projections increases.
The findings of this study were generally consistent with Collet et al. [20], despite the different models, scenarios, and baseline periods considered. The mean change signal and PDU estimates all fell within a similar range. The similarity in RSD estimates over the baseline was particularly surprising, given that it represents the climate model parameter uncertainty in Collet et al. (as a PPE) and the model chain structural uncertainty in this work (as an MME). However, the increase in RSD by the 2080s was larger in this work. This raises the question of whether model structure uncertainty may be more dominant than model parameter uncertainty, though a dedicated study would be required to investigate this further.
To build upon this work, the easily transferable methodology could be further adapted into a framework to establish a consistent method for future climate change impact studies, which, in turn, would allow for intercomparison between studies. Additionally, with AR6 scheduled for release in 2021, there will be a need to repeat and compare with this study under the upcoming generation of models and scenarios to understand how projections differ and how the associated uncertainties have changed.
Supplementary Materials: The following are available online at https://www.mdpi.com/2076-3263/11/1/33/s1, Figure S1: CDF of manually fitted GL distribution to observed AMAX values at station ID 2001, Figure S2: CDF of manually fitted GEV distribution to observed AMAX values at station ID 2001, Figure S3: Comparison of observed and simulated 1:30-year RP flows over the baseline period using the GEV distribution, Figure S4: Catchments where the ratio of observed median flow to simulated median flow was less than 0.9 or greater than 1.1, Figure S5: Comparison of mean change signal estimates for GEV and GL, Figure S6: Comparison of relative standard deviation estimates for GEV and GL, Figure S7: Catchment level mean change signal results using GEV, Figure S8: Catchment level relative standard deviation results using GEV, Figure S9: Catchment level probability distribution uncertainty results, Table S1: Comparison with Collet et al. [20] of regional mean change signal in Scotland, Table S2: Comparison with Collet et al. [20] of regional relative standard deviation in Scotland, Table S3: Comparison with Collet et al. [20] of probability distribution uncertainty in Scotland, Table S4: Mean extreme flow estimates for southerly regions.
Data Availability Statement: Publicly available datasets were analyzed in this study. This data can be found here: edge.climate.copernicus.eu.

Acknowledgments:
The EDgE dataset was created under contract for the Copernicus Climate Change Service (http://edge.climate.copernicus.eu/). The ECMWF implements this service and the Copernicus Atmosphere Monitoring Service on behalf of the European Commission.

Conflicts of Interest:
The authors declare no conflict of interest.
Sample Availability: Not applicable.

Abbreviations
The following abbreviations are used in this manuscript: