## 1. Introduction

Central Europe recently experienced a number of severe drought events (e.g., 2000, 2003, 2015, 2018; see, e.g., [1, 2, 3]). These events attracted public, media and scientific attention and stimulated drought research as well as the development of drought legislation and adaptation strategies. Many of these activities require an assessment of drought characteristics (such as severity, intensity, duration and frequency). While these characteristics are routinely estimated for heavy precipitation events and floods (e.g., [4, 5, 6, 7]), applications in the drought context are less common.

This could be at least partly attributed to the vague definition of drought, which can lead to contradictory assessments. A common definition of drought is a deficit of water with respect to a variable of interest or a specific water use.

Drought can be identified and quantified using various drought indices, which rely on different input variables. The most widely used are the Palmer Drought Severity Index (PDSI) [8, 9], with temperature and precipitation as inputs; the Standardized Precipitation Index (SPI) [10], using only precipitation; the Standardized Precipitation Evapotranspiration Index (SPEI) [11], combining the sensitivity of the PDSI with the simplicity of the SPI calculation; and the Reconnaissance Drought Index (RDI) [12], directly incorporating potential evapotranspiration. Ref. [13] compared the ten most widely used meteorological drought indices and tracked the indicated effect of drought on streamflow.

Here we focus on hydrological drought. Ref. [14] distinguished streamflow droughts from low flows; low flows are normally experienced during a drought, but they capture only one element of the drought, i.e., drought intensity. However, other characteristics of a drought event, such as the cumulative deficit volume or the event duration, are also of interest for water management.

Another modification of the SPI was made by [15], introducing an analogous approach for streamflow and thus capturing hydrological droughts. One of the first studies modelling hydrological droughts via deficit volumes was made by [16]. Ref. [17] offered a review of multiple climatological and hydrological parameters concerning drought and summarized drought modelling methods in [18]. Ref. [19] performed streamflow and hydrological drought trend analysis using the Streamflow Drought Index (SDI) [15]. Ref. [20] used the Generalized Extreme Value, Generalized Pareto, three-parameter lognormal and Pearson type III distributions to describe drought durations and deficit volumes. Although that study does not address the different drought indices discussed here, it is relevant to the choice of the distribution function used to estimate large quantiles of these variables.

Since drought is an intermittent process that needs a long period of time to evolve, another limitation is the usually short length of the available observed data [21]. Due to the rarity of extreme events, the modelling of drought extremes is subject to large uncertainties. One possible way to extend the study period is to use reconstructed climate fields or climate models; this may, however, introduce new uncertainty. To cope with the uncertainty issue, it is desirable to employ methods such as regional frequency analysis [22].

Regional frequency analysis (RFA) uses spatial pooling of data from a homogeneous region to reduce the standard error of the estimates, i.e., it trades space for time. The vast majority of its applications are for runoff (e.g., [23, 24]), precipitation (e.g., [25, 26, 27]) or temperature maxima (e.g., [28]), while applications in the drought context are rare. Some noteworthy exceptions are [29], who carried out an RFA of deficit runoff volumes, and [30], who presented a regional analysis of low flows over South China. The RFA method based on L-moments was applied by [31] and [32].

Here we show the application of an RFA model based on L-moments for the estimation of drought characteristics, more specifically the distribution of maximum deficit volumes for the period 1900–2015 over the Czech Republic. The model aims at reducing the uncertainty in the estimated return levels, in the periods of drought events and in the parameters of the extremal model. The goodness-of-fit of the model is evaluated through discordance analysis, as well as the Anderson–Darling test, with the critical values estimated by a bootstrap procedure.

## 2. Study Area—Czech Republic

Although the Czech Republic is a small country in central Europe, weather conditions differ markedly among its various regions. The variability of the weather is strongly driven by the unstable location and magnitude of two main pressure centres. In particular, during the warm period of the year, the expansion of the high-pressure projection into the Czech Republic causes warmer temperatures and dry weather, whereas the Icelandic Low manifests itself through a greater number of atmospheric fronts bringing more clouds and precipitation.

The average air temperature is strongly dependent on altitude and ranges from 0.4 °C at the highest point (Mount Sněžka; 1603 m) to almost 10 °C in the lowlands of southeast Moravia. The annual rainfall is also strongly dependent on altitude and orography. The wettest areas are the mountain ranges with steep slopes facing northwest in Jizerské hory (Jizera Mountains), with average total rainfall exceeding 1700 mm. On the other hand, the driest regions are the lowlands in southeast Moravia and northwest Bohemia, receiving approximately 400 mm on average (the latter is influenced by the rain shadow east of the Krušné hory (Ore Mountains); see Figure 1, left).

For the purpose of the study we considered all of the 133 catchments defined by [33] covering the entirety of the Czech Republic (Figure 1, right), with respective areas ranging from 154 to 1928 km$^{2}$. The catchments are based on the hydrological division of the Czech Republic as provided by the Czech Hydrometeorological Institute, which is also considered in the application of water management policies.

## 4. Results and Discussion

The characteristics of simulated deficit events in four successive 30-year (climatic) periods starting in 1901 are given in Table 1. The average values of event severity ($D$), intensity ($I$), length ($L$), relative severity ($rD$) and relative intensity ($rI$) vary over the periods, with the largest values of event severity in the periods 1931–1960 and 1961–1990. These periods are in good agreement with the extreme droughts that occurred in 1947, 1953–1954, 1959, 1963–1964, 1973–1974 and 1983 [77, 78, 79, 80, 81].

The relatively lower values of all variables in the last period might be linked with the rather wet conditions that prevailed in Central Europe [82]. In addition, the current dry period over the Czech Republic spans the years 2014–2018, so a considerable part of it is not considered here. A steady decrease in soil moisture has been reported for the same period [83], due to increasing temperature and, consequently, rising evapotranspiration. The latter can also be seen in the drought representation by the SPEI [84].

In the 53 gauged catchments, the properties of simulated deficit volumes for the period 1980–2010 were compared with the observational records. The validation showed that the characteristics of simulated deficit volumes correspond well to those based on observed data, as shown in Figure 2 and Table 2. In Figure 2, the simulated event severities and lengths are well represented through the median, as well as through the confidence interval, over the whole range. The simulated low event intensities correspond quite well to the observed ones, although the model overestimates the high intensities. The simulated relative severity and intensity are slightly overestimated over the whole range, due to the cumulative effect in their computation. This overestimation pattern is also visible in Table 2 in the averages of the individual variables.

#### 4.1. Spatial Pooling

The input to the K-means algorithm was the mean runoff and mean potential evapotranspiration for each catchment, and the catchments were grouped into three clusters. The algorithm was run ten times, each time starting with cluster centres in different random positions. Each run converged to a locally optimal solution within fifty iterations. Cluster 1 represents the catchments at high elevations with abundant precipitation (see Table 3 for the average precipitation in individual clusters), dry lowland catchments with limited precipitation form cluster 3, and cluster 2 is a transition between the low drought risk cluster 1 and the severe drought risk cluster 3.
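The clustering step can be illustrated with a minimal sketch of K-means (Lloyd's algorithm) with ten random starts and at most fifty iterations per run, keeping the run with the smallest within-cluster sum of squares. This is not the implementation used in the study: the function names, the initialisation from randomly chosen data points, the absence of feature standardisation and the example values are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centres):
    """Index of the nearest centre for every point."""
    return [min(range(len(centres)), key=lambda j: sq_dist(p, centres[j]))
            for p in points]


def kmeans(points, k=3, n_starts=10, max_iter=50, seed=0):
    """Lloyd's K-means with several random starts; keeps the best run."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        # Assumption: initial centres are k distinct data points chosen at random.
        centres = rng.sample(points, k)
        for _ in range(max_iter):
            labels = assign(points, centres)
            new_centres = []
            for j in range(k):
                members = [p for p, lab in zip(points, labels) if lab == j]
                # An empty cluster keeps its previous centre.
                new_centres.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                    if members else centres[j])
            if new_centres == centres:  # converged to a local optimum
                break
            centres = new_centres
        labels = assign(points, centres)
        inertia = sum(sq_dist(p, centres[lab]) for p, lab in zip(points, labels))
        if best is None or inertia < best[2]:
            best = (labels, centres, inertia)
    return best


# Hypothetical (mean runoff, mean PET) pairs in mm/year, for illustration only.
catchments = [(620.0, 480.0), (580.0, 500.0), (300.0, 600.0),
              (280.0, 610.0), (120.0, 690.0), (100.0, 700.0)]
labels, centres, inertia = kmeans(catchments, k=3)
print(labels)
```

Because each run only reaches a local optimum, the number of starts matters; repeating the procedure and keeping the lowest within-cluster sum of squares is the usual safeguard.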

Table 3 also reports the probability of a year without drought. It may be surprising that the low-risk cluster 1 has the lowest probability of a year without drought (0.3), while this probability is 0.49 for the severe drought risk cluster 3. However, it should be noted (and is demonstrated further) that the tail of the distribution of deficit volume is much heavier in cluster 3 than in cluster 1 (see, e.g., the $\kappa$ parameter in Table 4 or the quantile functions in Figure 3).

At-site distributions were chosen on the basis of L-moment ratio diagrams and at-site Anderson–Darling ($A^{2}$) tests. The diagrams were constructed by plotting the estimated sample L-moment ratios against the theoretical L-moment ratio curves for the candidate distributions (Figure 4). Among the considered distributions, the estimated L-moment ratios for deficit volumes correspond best to those of the Generalized Pareto Distribution (GPD). In addition, the Anderson–Darling test at the significance level ${\alpha}_{LOC}$ = 0.05 rejected the GPD at only six out of 133 catchments (about 4.5%), which is very close to the nominal level of the test.
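To make the L-moment ratio diagram concrete, the sketch below computes sample L-moment ratios from unbiased probability-weighted moments and evaluates the theoretical GPD curve, for which L-kurtosis is a function of L-skewness, $\tau_4 = \tau_3 (1 + 5\tau_3)/(5 + \tau_3)$. The function names and the example deficit volumes are hypothetical; the study's own code is not shown in the source.

```python
def sample_lmoments(data):
    """Return (l1, l2, t3, t4) from unbiased probability-weighted moments."""
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("at least four observations are needed")
    b0 = b1 = b2 = b3 = 0.0
    for i, v in enumerate(x):  # i = 0 for the smallest value
        b0 += v
        b1 += v * i / (n - 1)
        b2 += v * i * (i - 1) / ((n - 1) * (n - 2))
        b3 += v * i * (i - 1) * (i - 2) / ((n - 1) * (n - 2) * (n - 3))
    b0, b1, b2, b3 = b0 / n, b1 / n, b2 / n, b3 / n
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return l1, l2, l3 / l2, l4 / l2


def gpd_tau4(tau3):
    """Theoretical L-kurtosis of the GPD for a given L-skewness."""
    return tau3 * (1 + 5 * tau3) / (5 + tau3)


# Hypothetical annual maximum deficit volumes (mm) at one catchment.
volumes = [0.4, 1.1, 2.3, 0.8, 5.6, 0.2, 3.1, 1.7, 9.4, 0.6]
l1, l2, t3, t4 = sample_lmoments(volumes)
# Vertical distance of the sample point from the GPD curve in the diagram.
print(t3, t4, t4 - gpd_tau4(t3))
```

In a diagram such as Figure 4, each catchment contributes one point $(t_3, t_4)$, and the candidate whose curve lies closest to the cloud of points is preferred.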

For each cluster, a stationary index flood model for scaled deficit volumes was developed. The scaling was performed by the at-site first L-moment, with the scaling factors varying between 1.94 and 23.5 mm. The fitted regional parameters of the model are presented in Table 4. It is evident that cluster 3 (dry catchments) exhibits quite different behaviour from the other two clusters. In particular, the low value of the shape parameter indicates a heavy tail. In addition, the smaller scale parameter also points towards a dry regime prone to heavy extremes.
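For orientation, the index flood assumption can be written as follows. The notation is assumed here for illustration, since the symbols are not defined in this section: the quantile function at catchment $i$ is the product of an at-site scaling factor and a dimensionless regional quantile function shared by all catchments in the cluster,

$$Q_i(F) = \mu_i \, q(F), \qquad i = 1, \dots, N,$$

where $\mu_i$ is the at-site first L-moment of the deficit volumes (the scaling factor), $q(F)$ is the regional quantile function of the scaled deficit volumes, $F$ is the non-exceedance probability and $N$ is the number of catchments in the cluster.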

The goodness-of-fit was assessed using Gumbel plots, the discordance measure and the regional Anderson–Darling test (Figure 3). It is clear that the regional model fits the deficit volumes scaled by the first L-moment well. The same figure highlights one or two catchments in every cluster that behave differently from the rest of the cluster (discordant catchments). Regions were checked for within-cluster discordance based on a critical value set at 3 with a 10% significance level as defined by [54], and five catchments in total were found discordant. In the Anderson–Darling test, the regional critical values were estimated using the methods described above with 3000 bootstrap samples for each region (Table 4). All clusters passed the regional Anderson–Darling test at the significance level ${\alpha}_{GLOB}$ = 0.10.

#### 4.2. Choice of the At-Site Distribution

The annual maximum deficit volumes analyzed in the present paper cannot be regarded as standard block maxima, since there is often only one drought event (and only seldom more than two) in an individual year and catchment. Therefore, the annual maximum deficit volumes are not theoretically expected to follow the Generalized Extreme Value distribution. Indeed, the results suggest that for most stations the Generalized Pareto Distribution is appropriate for describing the distribution of the annual maximum deficit volume (though the generalized normal and generalized extreme value distributions could also be good candidates for stations that did not pass the Anderson–Darling test or are outliers in Figure 4).

Similar results can be seen in [20], where annual deficit volumes were fitted to various distributions and the GPD gave the best results. However, no spatial pooling was employed in that study. Another study that employed RFA for deficit volumes [29] used the Generalized Exponential Distribution, which is a reparameterization of the Generalized Pareto Distribution.

In addition, the analyses conducted while searching for the optimal at-site distribution revealed that the Generalized Extreme Value distribution cannot be used to characterize the distribution of deficit volumes, although it is very often found appropriate for maximum discharges or heavy precipitation indices. This result may be, at least partly, region-specific; therefore, the at-site distribution should always be checked prior to the regional frequency analysis.

#### 4.3. Drought Definition

In contrast to extreme precipitation or runoff, the definition of drought is not straightforward and various definitions exist. In the present paper, we considered the deficit volume, due to its clear physical interpretation. On the other hand, one may also consider drought indices based on the cumulative deviation from the mean, e.g., the Drought Severity Index [85], or indices inspired by the Standardized Precipitation Index (SPI). The use of the latter within regional frequency analysis is complex, however, since the temporal dimension of drought is often characterized by the different time scales for which the SPI is calculated.

Moreover, even the definition of deficit volume allows for several subjective choices, such as the threshold level, the form of the threshold (variable or fixed within a year) and the number of days or months for which the discharge must remain above the threshold to end the drought event. This increases the uncertainty in the estimation of drought characteristics.
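The effect of these choices can be seen in a minimal sketch of a threshold-level event extraction. It assumes a fixed threshold and a series at regular time steps; the function name, the `end_after` parameter and the convention that an event's length runs to its last below-threshold step are illustrative assumptions, not the definition used in this study.

```python
def deficit_events(flow, threshold, end_after=1):
    """Extract deficit events from a discharge series.

    An event starts when flow drops below `threshold` and ends once flow has
    stayed at or above it for `end_after` consecutive steps. Shorter
    interruptions are pooled into the same event.
    """
    events = []
    start = None        # index of the first step of the current event
    last_below = None   # index of the most recent below-threshold step
    volume = 0.0        # accumulated deficit (threshold minus flow)
    above_run = 0       # consecutive steps at or above the threshold
    for t, q in enumerate(flow):
        if q < threshold:
            if start is None:
                start, volume = t, 0.0
            volume += threshold - q
            last_below = t
            above_run = 0
        elif start is not None:
            above_run += 1
            if above_run >= end_after:
                events.append({"start": start,
                               "length": last_below - start + 1,
                               "volume": volume})
                start = None
    if start is not None:  # series ended during an event
        events.append({"start": start,
                       "length": last_below - start + 1,
                       "volume": volume})
    return events


# Hypothetical discharge series and threshold, for illustration only.
flow = [5, 3, 2, 5, 1, 5, 5, 4]
print(deficit_events(flow, threshold=4, end_after=1))
print(deficit_events(flow, threshold=4, end_after=2))
```

With the same series and threshold, changing only the termination rule merges two short events into one longer event with a larger deficit volume, which is the kind of sensitivity described above.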

#### 4.4. Reduction of Uncertainty

Statistical modelling of extremes is subject to large uncertainties due to the rarity of extreme events or problems with their measurement. This especially applies to droughts, since they do not occur every year and thus the length of series typically available for hydrological analysis provides only limited information. This can be, at least partly, overcome by “trading space for time”, i.e., combining data from several sites over homogeneous regions. The effect of adding sites/catchments is maximal when the data are independent. This is seldom true, however; thus the real reduction of uncertainty depends not only on the number of data but also on the dependence structure of the analyzed data.

To assess the increase in precision of the parameter estimates owing to spatial pooling, GPD parameters were fitted for each individual catchment and the 25th and 75th percentiles of the parameter estimates were calculated using 500 bootstrap samples. Then, for each region and each parameter, the average interquartile range was obtained as the difference between the average 75th and 25th percentiles of the estimates. These average interquartile ranges were compared with those of the regional model. Results are shown in Table 5.
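The at-site part of this comparison can be sketched as below. The sketch assumes that the GPD is fitted by the method of L-moments in Hosking's parameterization (location $\xi$, scale $\alpha$, shape $\kappa$) and that the bootstrap resamples years with replacement; the source does not spell out these details here, and the function names and data are hypothetical.

```python
import random
import statistics


def lmom3(data):
    """Return (l1, l2, t3) from unbiased probability-weighted moments."""
    x = sorted(data)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(v * i for i, v in enumerate(x)) / (n * (n - 1))
    b2 = sum(v * i * (i - 1) for i, v in enumerate(x)) / (n * (n - 1) * (n - 2))
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    return b0, l2, l3 / l2


def fit_gpd(data):
    """L-moment estimates (xi, alpha, kappa) of the three-parameter GPD."""
    l1, l2, t3 = lmom3(data)
    kappa = (1 - 3 * t3) / (1 + t3)
    alpha = (1 + kappa) * (2 + kappa) * l2
    xi = l1 - (2 + kappa) * l2
    return xi, alpha, kappa


def bootstrap_iqr(data, n_boot=500, seed=0):
    """Interquartile range of each GPD parameter over bootstrap resamples."""
    rng = random.Random(seed)
    fits = []
    for _ in range(n_boot):
        resample = rng.choices(data, k=len(data))
        if len(set(resample)) < 3:  # degenerate resample, skip it
            continue
        fits.append(fit_gpd(resample))
    iqr = {}
    for name, values in zip(("xi", "alpha", "kappa"), zip(*fits)):
        q1, _, q3 = statistics.quantiles(values, n=4)
        iqr[name] = q3 - q1
    return iqr


# Hypothetical scaled deficit volumes at one catchment.
site = [0.5, 1.2, 0.3, 2.8, 0.9, 4.1, 0.7, 1.6, 0.2, 3.3, 1.1, 0.4]
print(bootstrap_iqr(site))
```

Averaging such at-site interquartile ranges over the catchments of a cluster and comparing them with the range obtained from the pooled regional fit gives the kind of contrast reported in Table 5.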

The increase in precision for the return levels was calculated by substituting the estimated parameters of each bootstrap sample into the GPD quantile function with the corresponding probability $p = 1 - 1/T$, where $T$ is the return period in years. The estimated return levels for each cluster, together with the calculated confidence intervals, are shown in Figure 5.
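As an illustration of this step, the sketch below evaluates the GPD quantile function at $p = 1 - 1/T$. It assumes Hosking's parameterization, $x(p) = \xi + \alpha\,[1 - (1 - p)^{\kappa}]/\kappa$, in which lower (negative) $\kappa$ means a heavier tail, consistent with the interpretation of the shape parameter given above. It deliberately ignores the probability of a year without drought and the back-scaling by the at-site factor, so it is not the full model of this study; the function names are hypothetical.

```python
import math


def gpd_quantile(p, xi, alpha, kappa):
    """Quantile of the GPD (Hosking's parameterization) at probability p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    if abs(kappa) < 1e-12:  # exponential limit as kappa tends to zero
        return xi - alpha * math.log(1.0 - p)
    return xi + alpha * (1.0 - (1.0 - p) ** kappa) / kappa


def return_level(T, xi, alpha, kappa):
    """Return level for a return period of T years, using p = 1 - 1/T."""
    if T <= 1:
        raise ValueError("T must be greater than 1")
    return gpd_quantile(1.0 - 1.0 / T, xi, alpha, kappa)


# Hypothetical regional parameters; the heavier tail gives the larger level.
for kappa in (0.2, -0.2):
    print(kappa, return_level(100, xi=0.0, alpha=1.0, kappa=kappa))
```

Applying `return_level` to the parameters fitted to every bootstrap sample and taking percentiles of the results yields confidence bounds of the kind plotted in Figure 5.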

Another option for increasing the sample size is to consider reconstructed climate data (e.g., [86]) in combination with a hydrological model. This introduces additional sources of uncertainty, though, through the reconstructed climate fields and the parameterization of the hydrological model. On the other hand, the spatial and temporal scales relevant for drought may make it possible to obtain reliable information even from data with limited spatial coverage.

#### 4.5. Identification of Homogeneous Regions

Identification of the homogeneous regions requires the greatest amount of subjective judgment of all stages of regional frequency analysis. When using K-means clustering, uncertainties stem from the choice of the number of clusters, which can be mitigated by using methods such as the gap statistic [87, 88]. However, in this study we had a predefined number of clusters from the very beginning, since the initial idea was to classify the catchments into three groups based on the level of threat by drought. Other ways to proceed with spatial pooling would be to use self-organizing maps [89, 90], dimensionality reduction techniques [91], or the pooling methods first suggested by [92] and [93], with the subsequent implementation of the method referred to as the region of influence approach by [5].

Although we used an unsupervised clustering algorithm, it is worth noting that the resulting regions used for regional frequency analysis, shown in Figure 1, correspond well with the distribution of hydroclimatic variables relevant to drought, such as the aridity index [44], which supports the relevance of the clustering algorithm.