Extreme precipitation events are often associated with hazards such as flooding and the resulting damage. In Germany, many destructive floods have occurred in recent decades, e.g., the Elbe floods in 2002 and 2013. Weather conditions favoring the occurrence of heavy rainfall events are likely to become more frequent with global warming [1
] in Germany as well as in many places worldwide [2
]. Therefore, it becomes even more important to adequately estimate occurrence probabilities of precipitation amounts and intensities, as this information is needed for the design of water management systems. These range from urban drainage systems to river and creek design and retention basins, so different stakeholders need information on the occurrence of extreme precipitation for different durations. Consequently, it is necessary to understand the relationship between precipitation intensity, duration, and exceedance probability.
For a single location, this relationship can be represented graphically in intensity-duration-frequency (IDF) curves, a commonly used tool for the design of hydrological structures [3
]. However, there is no uniform procedure for estimating IDF curves, and different countries have different regulations on which method to use. In Germany, IDF curves for the entire country are currently provided by KOSTRA-DWD [4
], a project of the German Meteorological Service. The KOSTRA-DWD IDF curves are the results of a multi-step procedure and a set of different strategies for different ranges of durations [5
]. In the USA, the National Weather Service provides estimates of precipitation frequency via an online portal [6
]. These estimates are based on a regional frequency analysis [7
]. The model used by the Swiss Weather Service is based on a seasonal Bayesian approach [9
]. The results are also made available online [10
]. Recent developments also suggest a wide range of methods, such as the use of radar data [11
], cluster analysis to group stations [12
] or support-vector machines to estimate extreme events based on reanalysis data [13].
In statistics, the definition of extreme events is based on their rare occurrence. Their statistical analysis is therefore based on small samples and it is necessary to use those efficiently in order to extrapolate from observed to unobserved levels of intensity. Extreme value theory provides several approaches to this problem (for an introduction, see Coles [14
]). In geosciences, the block-maxima approach is popular. This approach is based on modeling the probability distribution of block maxima (e.g., monthly or annual maxima) with a generalized extreme value (GEV) distribution. The longer the time series, the more reliable the estimates. Although relatively long time series of 50 years or more exist at many places in Germany for daily precipitation sums, similarly long time series for observations at shorter durations are still an exception, since recording at such high frequencies relies on relatively new technology. Therefore, pooling existing information across durations can be beneficial.
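To make the block-maxima approach concrete, the following minimal sketch evaluates the GEV distribution function and the return level that follows from it. The function names and the parameter values are hypothetical and chosen only for illustration; in practice the parameters would be estimated from observed block maxima.

```python
import math

def gev_cdf(z, mu, sigma, xi):
    """Non-exceedance probability G(z) of a GEV(mu, sigma, xi) distribution."""
    if abs(xi) < 1e-12:  # Gumbel limit for xi -> 0
        return math.exp(-math.exp(-(z - mu) / sigma))
    t = 1.0 + xi * (z - mu) / sigma
    if t <= 0.0:
        # z lies outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0)
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

def gev_return_level(T, mu, sigma, xi):
    """Level exceeded on average once every T blocks (e.g., years)."""
    p = 1.0 - 1.0 / T          # non-exceedance probability
    y = -math.log(p)
    if abs(xi) < 1e-12:        # Gumbel limit
        return mu - sigma * math.log(y)
    return mu + sigma / xi * (y ** (-xi) - 1.0)

# Hypothetical parameters for annual maxima of daily precipitation (mm)
mu, sigma, xi = 30.0, 8.0, 0.1
z100 = gev_return_level(100.0, mu, sigma, xi)
# Consistency check: the 100-year level has non-exceedance probability 0.99
print(round(gev_cdf(z100, mu, sigma, xi), 6))
```

The return level is simply the quantile of the fitted distribution, which is why short series are a problem: the estimate for a rare level rests on very few observed maxima.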
In this study, we model both the spatial variation of the probability distribution and its dependence on the accumulation duration in a consistent way. This approach allows us to include data from several gauge stations and a range of durations simultaneously in our estimation and hence makes efficient use of the available data. Instead of modeling the probability distribution individually for different precipitation durations, Koutsoyiannis et al. [15
] proposed a duration-dependent distribution based on empirical dependencies of distribution parameters on duration. This approach provides the advantages of parameter parsimony and the direct availability of estimates for all durations within the interval considered. It has already been employed in previous studies [16
]. Similar to the studies of Lehmann et al. [17
], Stephenson et al. [19
], and Blanchet et al. [20
], who used a single model for a wide range of durations, we model a duration range spanning from one minute to five days. Thereby, we aim to transfer knowledge from the long durations, for which long time series exist, to the short durations.
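The idea of a duration-dependent GEV can be sketched as follows. The parameterization used here, a scale parameter that decays with duration as sigma0 * (d + theta)^(-eta) and a location parameter proportional to that scale, is an assumption of this sketch in the spirit of Koutsoyiannis et al.; it is not spelled out in the text above, and all parameter values are hypothetical.

```python
import math

def dgev_return_level(p, d, mu_tilde, sigma0, xi, theta, eta):
    """p-quantile of precipitation intensity for duration d under a d-GEV.

    Assumed parameterization (not given in the text above):
        sigma(d) = sigma0 * (d + theta) ** (-eta)
        mu(d)    = mu_tilde * sigma(d)
    with a duration-independent shape parameter xi.
    """
    sigma_d = sigma0 * (d + theta) ** (-eta)
    y = -math.log(p)
    if abs(xi) < 1e-12:                      # Gumbel limit
        growth = -math.log(y)
    else:
        growth = (y ** (-xi) - 1.0) / xi
    return sigma_d * (mu_tilde + growth)

# Hypothetical parameter values; durations in hours from 1 min to 5 days
params = dict(mu_tilde=3.0, sigma0=5.0, xi=0.1, theta=0.05, eta=0.7)
durations_h = [1 / 60, 0.5, 1, 6, 24, 120]

# One IDF curve: intensity of the 100-year event as a function of duration
idf_curve = [dgev_return_level(0.99, d, **params) for d in durations_h]
for d, z in zip(durations_h, idf_curve):
    print(f"{d:8.3f} h  {z:8.2f} mm/h")
```

Under this assumption, five parameters describe all durations at once, instead of three GEV parameters per duration, which is what allows long daily records to inform the short durations.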
Extending the model to include spatial variations not only provides the opportunity to estimate the IDF relationship for several locations simultaneously; we also expect that pooling information from several stations will reduce the uncertainties of parameter estimation, especially for stations with short observation time series. Many different statistical methods are used to model the spatial variation of the IDF relationship. The most straightforward way is the spatial interpolation of the estimated distribution parameters, as done in [20
]. A commonly used approach is regional frequency analysis, which combines data from stations with similar characteristics [8
]. In contrast, spatial variations can be modeled in a single step, using Bayesian Hierarchical Models (BHM) [17
], Vector Generalized Linear Models (VGLM) [23
], or Vector Generalized Additive Models (VGAM) [25
], which simplifies the estimation of uncertainties. BHMs provide the uncertainty estimates directly, while for VGLMs and VGAMs, they can be obtained using, for example, the bootstrap method [26
]. Fischer et al. [23
] used a GEV to model daily precipitation sums and showed that including Legendre polynomials of longitude, latitude, and altitude as covariates for the location, scale, and shape parameters considerably improved the model compared to station-wise modeling.
Here we use the idea of Koutsoyiannis et al. [15
] in the framework of VGLMs to combine the modeling of multiple durations and spatial variations by integrating orthogonal polynomials of longitude and latitude as covariates to describe the spatial variability of the parameters of a duration-dependent GEV (d-GEV).
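How spatial covariates enter a distribution parameter can be illustrated with a short sketch. It assumes Legendre polynomials as the orthogonal basis, a linear predictor without interaction terms, and hypothetical bounding-box and coefficient values; the polynomial family, orders, and link functions actually selected for the model are not specified in this passage.

```python
def legendre(n, x):
    """Legendre polynomial P_n(x) via the three-term (Bonnet) recurrence."""
    if n == 0:
        return 1.0
    p_prev, p_curr = 1.0, x
    for k in range(1, n):
        p_prev, p_curr = p_curr, ((2 * k + 1) * x * p_curr - k * p_prev) / (k + 1)
    return p_curr

def rescale(value, lo, hi):
    """Map a coordinate from [lo, hi] to [-1, 1], where P_n are orthogonal."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0

def spatial_parameter(lon, lat, bounds, coef_lon, coef_lat, intercept):
    """Linear predictor: intercept + sum_j b_j P_j(lon) + sum_k c_k P_k(lat)."""
    x = rescale(lon, *bounds["lon"])
    y = rescale(lat, *bounds["lat"])
    value = intercept
    for j, b in enumerate(coef_lon, start=1):
        value += b * legendre(j, x)
    for k, c in enumerate(coef_lat, start=1):
        value += c * legendre(k, y)
    return value

# Hypothetical bounding box and coefficients, for illustration only
bounds = {"lon": (6.8, 7.8), "lat": (50.9, 51.4)}
sigma0_at_site = spatial_parameter(
    lon=7.3, lat=51.15, bounds=bounds,
    coef_lon=[0.4, 0.2], coef_lat=[0.1], intercept=1.0,
)
print(sigma0_at_site)
```

Because the predictor is a smooth function of the coordinates, it can be evaluated at any location inside the bounding box, including sites without a gauge.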
We expect that this will allow us to provide estimates for all durations within the range that is used for parameter estimation and, to a certain extent, also to extrapolate beyond it. Furthermore, we expect to obtain IDF relations at ungauged sites and to improve the estimates for locations and durations with existing but short time series. To verify these assumptions, we test the approach in the study area of the Wupper catchment in the west of Germany and use the Quantile Skill Score [27
] in a cross-validation setting [28
] to evaluate the model performance for a range of return periods and individually resolved for all durations. We focus on two research questions:
Under which conditions is the spatial d-GEV approach an improvement compared to the separate application of the GEV for each duration and station?
Does the spatial d-GEV approach provide reliable estimates at ungauged sites?
In Section 2
, we describe the data on which the study is based and the methods used for modeling, i.e., parameter estimation, model selection, estimation of confidence intervals and verification. The verification results are presented in Section 3.1
. Return level maps and IDF curves are provided in Section 3.2
. The results are discussed in Section 4
, and the last section summarizes methods, results, and conclusions.
Since the QSI varies strongly between individual stations and individual durations, the assessment of the model performance and its presentation is challenging. However, the averaged QSI presented in Figure 3
allows for some conclusions. We find that the average QSI for the spatial d-GEV is strongly positive for upper quantiles (large non-exceedance probabilities
). From this, we conclude that the spatial d-GEV approach is an improvement for modeling rare events since it benefits from the increased data availability at neighboring sites. However, for a range of smaller durations
min, the skill decreases for both the station-wise d-GEV and the spatial d-GEV model, compared to the reference which is based on an individual GEV for all stations and durations. This suggests that the d-GEV does not describe the variations in this range of small durations sufficiently well. Figure 8
presents the QSI together with the IDF estimates for gauge Solingen-Hohenscheid; this figure suggests that the negative QSI values in the range of
h are related to an underestimation of the quantiles in this range. This supports the assumption that the d-GEV is not sufficiently flexible in this range and a more complex model might be necessary, as suggested in (e.g., [40
]). Nevertheless, we only used time series with a maximum of 14 years to investigate the model performance for sub-hourly durations and thus cannot exclude the possibility that the effect may occur due to insufficient data.
Moreover, the average QSI is lower for durations
h. We believe this is due to a larger data availability for these durations and thus longer time series are available to train the reference model. This is supported by investigating the influence of time series length, where a fixed number of years
is used to train the model for each duration and at each station (cf. Figure 4
). We observe a gradual decrease in the average QSI with increasing length of the training time series. We conclude that the advantage of the spatial d-GEV model over the station-wise GEV model is reduced for longer time series; the pooling of information becomes less important. We find that for a length of about
years, there are about as many stations with the spatial d-GEV being superior as stations where it is inferior to the reference model. This implies that in case of a single gauge with a long time series, the spatial d-GEV approach cannot outperform single site estimates for individual durations. However, due to the lack of data, we are unable to make any statements about the behavior of the estimates for
. This information would be particularly helpful for these durations, as here often only short time series are available. Moreover, even with a strongly positive average QSI, negative values of the QSI for individual stations, and durations, occur. Yet, Fischer et al. [23
] showed that even for long time series with more than 50 years of observations, a GEV model with spatial and seasonal covariates performed better than separate models for each month and station at almost all stations investigated. The major improvement achieved there by adding spatial and seasonal covariates is therefore not directly transferable to the spatial d-GEV model.
However, a large advantage of the d-GEV model is its ability to interpolate between durations and stations at the level of distribution parameters. It is therefore possible to obtain, in a consistent way, estimates for durations and sites for which no measurements exist. This advantage has been disregarded in the verification process, where we used a separate model for individual durations and stations as a reference model. From the results presented in Figure 6
, we infer that the model performance at ungauged sites is comparable to that of a separately applied GEV for an available time series of 30–35 years, at least for high quantiles
. Therefore, we conclude that the spatial d-GEV model provides reliable estimates at ungauged sites. However, the available time series for
h are again not long enough to investigate the model performance at ungauged sites for this range of small durations in this way.
In this study, we model annual precipitation maxima simultaneously in space and across durations. To this end, we integrate orthogonal polynomials of longitude and latitude as spatial covariates into the duration-dependent GEV proposed by Koutsoyiannis et al. [15
]. This allows for a parameter-parsimonious description compared to modeling individual stations and durations, efficient use of existing data, and the pooling of information between stations and durations. We specifically model a wide range of durations from one minute to five days in order to investigate to what extent knowledge can be transferred from long observation time series and whether estimates for stations or durations with fewer observations benefit from this. We investigate this model in the Wupper catchment with the main focus on evaluating the model performance. Model validation is based on techniques from forecast verification: we use a variant of the Quantile Skill Score, the Quantile Skill Index (QSI). In the presentation used here, this score allows a detailed analysis of the model performance for different non-exceedance probabilities and durations simultaneously. As a reference model that is not based on any empirical relationship between intensity and duration, the GEV is used to model precipitation maxima independently at individual stations and durations.
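The verification measure can be sketched as follows. The quantile score and the Quantile Skill Score below are the standard definitions from forecast verification. The exact definition of the QSI is not given in this section, so the symmetric variant shown here, bounded between -1 and 1, is an assumption of this sketch, as are all function names.

```python
def check_loss(p, u):
    """Asymmetric (pinball) loss for non-exceedance probability p."""
    return p * u if u >= 0 else (p - 1.0) * u

def quantile_score(observations, predicted_quantile, p):
    """Mean check loss of a predicted p-quantile; smaller is better."""
    losses = [check_loss(p, o - predicted_quantile) for o in observations]
    return sum(losses) / len(losses)

def quantile_skill_score(qs_model, qs_reference):
    """QSS = 1 - QS_model / QS_reference; positive if the model is better.

    Undefined if the reference score is exactly zero.
    """
    return 1.0 - qs_model / qs_reference

def quantile_skill_index(qs_model, qs_reference):
    """Assumed symmetric variant of the QSS with values in [-1, 1]."""
    if qs_model < qs_reference:
        return 1.0 - qs_model / qs_reference
    return -(1.0 - qs_reference / qs_model)

# Hypothetical held-out annual maxima and two competing 0.9-quantile estimates
held_out = [12.0, 18.5, 9.7, 25.1, 14.3]
qs_m = quantile_score(held_out, predicted_quantile=22.0, p=0.9)
qs_r = quantile_score(held_out, predicted_quantile=30.0, p=0.9)
print(quantile_skill_score(qs_m, qs_r), quantile_skill_index(qs_m, qs_r))
```

In a cross-validation setting, the scores would be computed on observations withheld from fitting, separately for each station, duration, and non-exceedance probability.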
We find that using the spatial d-GEV improves the modeling of rare events of all durations, as it benefits from greater data availability. Accordingly, this model is advantageous for stations with short time series and does not necessarily improve the estimation if a longer time series is available. We also find that the d-GEV model is most likely not flexible enough to model the whole range of durations sufficiently and that a model with additional parameters (e.g., [40
]) might be necessary. Therefore, we recommend reducing the duration range in cases where the aim is exclusively the description of short durations
h. Future studies will also explore the use of more flexible models to describe the whole range of durations. We expect that the estimation of further additional parameters for the duration dependence in these more flexible models will benefit from a spatial covariates setting.
Since this approach allows us to interpolate between stations and durations, spatial maps of return levels can be readily obtained for any duration, as well as IDF curves for any location in the research area. The bootstrap method provides 95% confidence intervals representing the sampling uncertainty. For the d-GEV with spatial covariates, these uncertainties are smaller than for the station-wise d-GEV, since the spatial model can draw information from both neighboring sites and durations. Uncertainties from the model selection are not considered. For a reliable estimation of the uncertainties, Mélese et al. [26
] suggest a Bayesian Hierarchical Model. In the return level maps we observe that the spatial patterns change from a minimum in the center of the catchment for short durations to a west-east gradient for long durations. This is likely related to the main north-west direction of advective weather conditions in the study area and might also be linked to the orography.
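The bootstrap confidence intervals mentioned above can be illustrated with a generic percentile bootstrap. This is a minimal sketch under simplifying assumptions: it resamples individual values with replacement and uses the sample mean as a stand-in estimator, whereas the study would refit the d-GEV model to each resample; the resampling unit used there is not stated in this section, and the data below are hypothetical.

```python
import math
import random

def bootstrap_ci(sample, estimator, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for estimator(sample).

    Resamples with replacement, re-applies the estimator, and returns the
    empirical alpha/2 and 1 - alpha/2 quantiles of the bootstrap estimates.
    """
    rng = random.Random(seed)
    n = len(sample)
    estimates = sorted(
        estimator([sample[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[math.floor(alpha / 2 * (n_boot - 1))]
    upper = estimates[math.ceil((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper

def mean(values):
    return sum(values) / len(values)

# Hypothetical annual maxima (mm); the mean stands in for a fitted return level
annual_maxima = [21.4, 35.0, 18.2, 27.9, 44.1, 30.3, 25.6, 19.8, 33.7, 29.0]
print(bootstrap_ci(annual_maxima, mean))
```

With a short sample the interval is wide; pooling information across stations and durations effectively enlarges the sample behind each estimate, which is consistent with the narrower intervals reported for the spatial model.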
In this work, we assume that there is no dependence between observations of different durations. This seems to be reasonably well justified as reported in another study which investigates the effect of including this dependence explicitly using a max-stable process for six stations in the same research area ([35
] in this issue). We also assume that there is no dependence between observations at neighboring stations; this dependence could also be modeled using a max-stable process [19
]. We further assume that the IDF relationship does not vary in time. However, we plan to account for temporal variations of the IDF relationship in future studies by a straightforward extension of the spatial d-GEV model with further covariates. Nevertheless, we demonstrated that the approach presented here yields reasonable estimates of return levels for any duration or location within the study domain and performs particularly well for rare events.