Statistical Analysis and Stochastic Modelling of Hydrological Extremes

Abstract: Analysis of hydrological extremes is challenging because of their rarity, their small sample sizes and the interconnections between different types of extremes. It is further complicated by the untrustworthy representation of the meso-scale processes involved in extreme events by models with coarse spatial and temporal scales, and by biased or missing observations caused by technical difficulties during extreme conditions. The special issue “Statistical Analysis and Stochastic Modelling of Hydrological Extremes”, motivated by the need to apply and develop innovative stochastic and statistical approaches to analyze hydrological extremes under current and future climate conditions, encompasses 13 research papers. Case studies presented in the papers exploit a wide range of innovative techniques for hydrological extremes analyses. The papers focus on six topics: historical changes in hydrological extremes, projected changes in hydrological extremes, downscaling of hydrological extremes, early warning and forecasting systems for drought and flood, interconnections of hydrological extremes and applicability of satellite data for hydrological studies. This Editorial provides an overview of the covered topics and reviews the case studies relevant for each topic.


Introduction
Assessment of hydrological extremes is of paramount importance, as they have the potential to affect society in terms of human health and mortality, the ecosystem and the economy (e.g., infrastructure and agriculture) [1,2]. In the last decades, millions of people have been affected by hydrological extremes. The risk of these hazards will increase in the future as a result of climate change and as population and infrastructure continue to increase and occupy areas exposed to higher risks [3-6].
Analyzing extremes in a complex interacting hydro-climatological system is challenging on the following grounds. First, extremes are rare events in the tails of a distribution, characterized by either very small or very large values, and therefore have a different statistical behavior from the bulk of the data. Being rare, extreme events have a small sample size, which adds a large uncertainty to the results of statistical analyses. In addition, observations of extreme events might be biased or missing altogether due to technical difficulties during the events [7].
For future hydrological extremes, the projections are mainly derived from global climate models (GCMs), which provide data at coarse spatial and temporal scales that cannot be used directly in the hydrological impact analysis of climate change. For instance, design applications in urban hydrology need precipitation data with a temporal resolution of a few minutes and a spatial resolution of 1-10 km², especially when simulating flood events for small urban catchments with fast runoff processes and short response times [8]. To meet the needs of end users for hydrological applications, the information provided by climate models needs to be downscaled to much finer spatial and temporal scales [9]. This adds an extra tier of complexity to the analysis of hydrological extremes.
The statistical analysis and stochastic modelling of hydrological extremes in developing countries is further hampered by the unavailability or scarcity of high-quality, fine-scale ground observations. The analysis of hydrological extremes is particularly important for vulnerable developing countries because they will shoulder a far greater burden of loss and damage due to inadequately built infrastructure, weaker preparedness and a low capacity to respond to disasters [10]. Where rain-gauge observations are not available, satellite estimates, which provide spatially and temporally homogeneous precipitation data at a quasi-global to global scale, are excellent alternatives [11]. Before satellite data are used, however, their quality must be validated, and the data must be spatially downscaled to make them suitable for applications that require fine-resolution data, such as urban hydrology.
The complexity of analyzing hydrological extremes calls for robust statistical methods for the treatment of such events. This Special Issue is motivated by the need to apply and develop innovative stochastic and statistical approaches to analyze hydrological extremes under current and future climate conditions.

Overview of the Special Issue Contributions
The Special Issue includes 13 papers exploiting a broad range of innovative statistical methods for hydrological extremes analyses. The papers were published between 31 March 2018 and 14 June 2019, with an average time of 45 days from initial submission to online publication. Within only a few months of online publication, the papers had received 6 (7) citations in the literature indexed in Scopus (Google Scholar) (Table 1), indicating the significance and immediate impact of the published studies. The abstracts and full texts of the papers published in the Special Issue have been viewed on average 33 and 44 times per day, respectively, showing the broad reach of the published research.

[22] 5 April 2019 3 3 0 0
Zhang and Wang [23] 4 June 2019 2 5 0 0
Mehmood et al. [24] 14 June 2019 3 5 0 0

The papers of this Special Issue focus on six topics associated with hydrological extremes, which are reviewed in the following sections.

Historical Changes in Hydrological Extremes

Background
Trend analysis of hydrological extremes provides essential information for regional water resource planning, risk assessment of hydrological hazards, and adaptation and mitigation strategies for climate change [25,26]. To this end, various parametric and non-parametric methods have been developed and used over time. Non-parametric tests have gained popularity for the trend analysis of hydrological variables owing to their insensitivity to skewed (non-normally distributed) data, which is likely the case for extreme events, and their greater resilience to outliers in time series [27,28]. These methods nonetheless have some shortcomings. One of them is the sensitivity of trend test results to serial correlation in time series. A positive serial correlation, which is common in hydroclimatological data, increases the probability of rejecting the null hypothesis of no trend when it is in fact true [28]. Similarly, a cross-correlation among the time series of neighboring stations or gridcells increases the false rejection of the null hypothesis of no field significance of trends [29,30]. These limitations need to be thoroughly addressed to avoid biased and misleading trend analysis results, e.g., by using the effective sample size (ESS) for the former [31] and a bootstrapping method for the latter [32].
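The effect of serial correlation on a non-parametric trend test can be illustrated with a minimal sketch of the Mann-Kendall test whose variance is inflated by an effective sample size. The function names are hypothetical, the ESS formula n(1 - r1)/(1 + r1) is one common lag-1 (AR(1)) approximation and not necessarily the formulation of [31], ties are ignored, and in practice the lag-1 autocorrelation would normally be estimated from a detrended series.

```python
import math


def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation coefficient of a series."""
    n = len(x)
    mean = sum(x) / n
    den = sum((v - mean) ** 2 for v in x)
    if den == 0:
        return 0.0
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    return num / den


def effective_sample_size(x):
    """Assumed AR(1) approximation: n_eff = n (1 - r1) / (1 + r1).

    Only positive serial correlation is corrected, so n_eff <= n.
    """
    r1 = max(lag1_autocorrelation(x), 0.0)
    return len(x) * (1.0 - r1) / (1.0 + r1)


def mann_kendall(x, correct_serial=False):
    """Return (S, Z, two-sided p-value); no correction for ties."""
    n = len(x)
    # S counts concordant minus discordant pairs.
    s = sum((x[j] > x[i]) - (x[j] < x[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if correct_serial:
        # Inflate the variance by n / n_eff to account for persistence.
        var_s *= n / effective_sample_size(x)
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return s, z, p
```

With the correction switched on, the same S statistic yields a smaller |Z| and a larger p-value, which is the mechanism by which the false rejection rate is reduced.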

Case Studies
In a case study for the High Basin of the Cauca River in Southwestern Colombia, Ávila et al. [19] examined temporal trends in eight extreme precipitation indicators using the Mann-Kendall (MK) test and the Sen slope estimator for the period 1970-2013. The results showed a decreasing trend in the precipitation intensity indices of annual maximum 1-day precipitation amount (Rx1day) and annual maximum 5-day precipitation amount (Rx5day). An increasing trend was also observed in the September-October-November period for the consecutive dry days index and in the December-January-February period for total precipitation and the number of wet days. The extreme precipitation indices were found to have a concurrent correlation with sea surface temperature in the equatorial Pacific and a lagged correlation (a lag of 2-3 months) with ENSO.
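The Sen slope estimator mentioned above is the median of all pairwise slopes, which makes the trend magnitude robust to outliers. A minimal sketch follows; the function name and the optional `times` argument are illustrative choices, not taken from the cited study.

```python
from statistics import median


def sens_slope(values, times=None):
    """Median of pairwise slopes (values[j] - values[i]) / (times[j] - times[i])."""
    if times is None:
        times = list(range(len(values)))  # assume equally spaced observations
    slopes = [(values[j] - values[i]) / (times[j] - times[i])
              for i in range(len(values) - 1)
              for j in range(i + 1, len(values))
              if times[j] != times[i]]
    return median(slopes)
```

For example, `sens_slope([1, 2, 3, 100, 5])` still returns a slope of 1 per time step, because the single outlier affects only a minority of the pairwise slopes.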
In another study, Mehmood et al. [24] analyzed temporal trends in annual maximum streamflow using the MK test at 29 stations in the Kabul River Basin, Pakistan, with record lengths varying between 30 (1987-2016) and 55 (1962-2016) years. Stationary and non-stationary Bayesian models were then used for flood frequency estimation, and their results were compared using the corresponding flood frequency curves (FFCs). The results revealed a mixture of increasing and decreasing trends across different stations, implying a clear signal of non-stationarity in the flood regime. The non-stationarity was also confirmed by the findings of the Bayesian models: the non-stationary Bayesian model gave reliable results, while the stationary Bayesian model either overestimated or underestimated flood frequencies.

Projected Changes in Hydrological Extremes

Background
Hydrological extremes can change due to natural climate variability and/or anthropogenic climate change. Climate variability, which refers to natural processes influencing the atmosphere, is a periodic variation (yearly, decadal or multidecadal) in the average or range of weather conditions, either above or below a long-term average value, that does not cause the long-term average itself to change. Climate change, in contrast, is a long-term continuous change (increase or decrease) in these statistics. As for anthropogenic climate change, whose hydrological impact was investigated in three papers in this Special Issue, extreme precipitation is expected to increase because the atmospheric water-holding capacity increases under warmer climates at the rate of about 7% °C⁻¹ dictated by the Clausius-Clapeyron (CC) relation [33,34]. The intensification of extreme precipitation has, therefore, been projected under future climate change [35,36]. At the regional scale, the extreme precipitation-temperature scaling rate varies with regional climatic settings (available water vapor and ranges of air temperature variation) [18], and different scaling rates have been reported, such as sub-CC (~3% °C⁻¹), close-CC (~7% °C⁻¹), super-CC (~14% °C⁻¹), peak-like CC (positive and negative), and negative CC [37-40]. The scaling rate also changes with the intensity and duration of extreme precipitation, being higher for more extreme and shorter-duration precipitation [37,39,41].
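To make the scaling rates concrete, the sketch below applies a fractional rate per degree of warming to a reference intensity, assuming the usual compound (exponential) form P(T + ΔT) = P(T)(1 + α)^ΔT. The function names and the example values are illustrative assumptions.

```python
def scale_precipitation(p_ref, delta_t, rate=0.07):
    """Intensity after warming by delta_t (°C) at a fractional rate per °C."""
    return p_ref * (1.0 + rate) ** delta_t


def implied_rate(p_ref, p_new, delta_t):
    """Fractional scaling rate per °C implied by two intensities delta_t apart."""
    return (p_new / p_ref) ** (1.0 / delta_t) - 1.0
```

Under these assumptions, a 100 mm event scaled over 2 °C gives about 114.5 mm at the CC rate (7% °C⁻¹), about 106.1 mm at a sub-CC rate of 3% °C⁻¹, and about 130.0 mm at a super-CC rate of 14% °C⁻¹.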

Case Studies
Pan et al. [18] investigated the hourly extreme precipitation-temperature scaling rate in South China with regard to storm characteristics (types) and event process-based temperature variations. They found that the magnitude of air temperature fluctuations prior to and after storms differs between storm types, and that the 24-h mean air temperature prior to storms yields more reliable scaling rates than the natural daily mean air temperature. A peak-like scaling relation with a break at a temperature of 28 °C was reported between precipitation extremes and the 24-h mean air temperature: the scaling rate was positive below 28 °C and negative above 28 °C. The former was attributed to high relative humidity (80-100%), and the latter to a lack of moisture in the atmosphere rather than to the atmospheric water vapor-holding capacity. A comparison of the weather systems producing heavy storms in South China (e.g., warm-front storms, cold-front storms, monsoon storms, convective storms, and typhoon storms) indicated only a small influence of storm type on the scaling rates.
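A peak-like scaling relation of this kind can be quantified by fitting ln(P) = a + bT separately on each side of the break, with the scaling rate given by exp(b) - 1. The sketch below is only an illustration of that idea, not the procedure of Pan et al. [18]: the break temperature is assumed to be known in advance, and the inputs are assumed to be extreme-precipitation intensities (e.g., a high percentile per temperature bin) paired with bin temperatures.

```python
import math


def scaling_rate(temps, intensities):
    """Least-squares fit of ln(P) = a + b*T; returns exp(b) - 1 per °C."""
    logs = [math.log(p) for p in intensities]
    n = len(temps)
    t_mean = sum(temps) / n
    l_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in temps)
    sxy = sum((t - t_mean) * (l - l_mean) for t, l in zip(temps, logs))
    return math.exp(sxy / sxx) - 1.0


def rates_around_break(temps, intensities, break_temp):
    """Scaling rates below and above an assumed, pre-specified break temperature."""
    below = [(t, p) for t, p in zip(temps, intensities) if t <= break_temp]
    above = [(t, p) for t, p in zip(temps, intensities) if t >= break_temp]
    rate_below = scaling_rate([t for t, _ in below], [p for _, p in below])
    rate_above = scaling_rate([t for t, _ in above], [p for _, p in above])
    return rate_below, rate_above
```

A positive first value and a negative second value correspond to the peak-like structure described above.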
Zhang and Wang [23] investigated the response of China's extreme precipitation over the next few decades to emission reductions based on the implementation of the intended nationally determined contributions (INDCs) under the Paris Agreement, using an ensemble of GCMs from Phase 5 of the Coupled Model Intercomparison Project (CMIP5). The maximum consecutive five-day precipitation and the number of heavy precipitation days over China are projected to increase by 16% and up to 20%, respectively. The population exposure to heavy precipitation events will also increase in almost all Chinese regions, e.g., by 10% for extreme precipitation exceeding the 20-year return level.
The climate change impact on the hydro-climatology and the hydropower generation potential of the Dez Dam Basin in Iran was also investigated [15]. The results showed a remarkable reduction of up to 33% in future streamflows. The climate change impacts on the electricity generation potential differed between the two hydropower plants considered: a 3% decrease and a 33% increase.

Downscaling of Hydrological Extremes

Background
The primary tool for future hydroclimatic projections is the GCM. However, the resolution of the current generation of Earth System Models (see CMIP5) is still coarse and unable to capture sub-grid scale processes [6,42-44]. The processes that cannot be resolved at the horizontal grid spacing of GCMs are parameterized, which is a source of large bias and uncertainty in the simulations [45-49]. A more trustworthy representation of these processes and features is provided at the finer spatial resolutions of regional climate models (RCMs), which are typically between 50 and 12 km: for instance, 50 km for the RCMs implemented and simulated in the projects PRUDENCE [50] and NARCCAP [51], 25 km in ENSEMBLES [52] and 12 km in EURO-CORDEX [53]. Even though the spatial resolution of RCMs is much higher than that of GCMs, the grid size is still too large to adequately represent convective rain, which is of primary importance for flood risk analysis [54]. To explicitly represent deep convection, very high resolution climate models (<4 km), termed convection-permitting models (CPMs), are needed.
A major practical limitation of CPMs is their high computational cost, which restricts them to limited domains, short simulation periods and small ensembles.
An alternative way to circumvent the intrinsic deficiency of GCMs and RCMs in representing fine-scale physical processes is statistical downscaling [55,56]. Nevertheless, the results of downscaling methods are often compromised by biases and limitations [57,58] arising from the assumptions and approximations made within each method. Some of these assumptions cast doubt on the reliability of downscaled projections and may limit the suitability of downscaling methods for some applications [59]. As there is no single best downscaling method, the assumptions underlying the final results of different methods require evaluation, so that end users can select an appropriate method based on its strengths and limitations.

Case Study
To address the need for statistical downscaling of extreme precipitation to a finer scale (e.g., a catchment), Pham et al. [20] proposed a two-step statistical downscaling approach. Precipitation was first classified into either two states (wet day and dry day) or three states (dry day, non-extreme precipitation day, and extreme precipitation day) using linear discriminant analysis (LDA), random forest (RF), and support vector classification (SVC). Afterwards, the precipitation amount for each precipitation state was predicted using least square support vector regression (LS-SVR). The predictors for classification and prediction were obtained from the large-scale climate variables of the NCEP reanalysis data, with 1964-1999 used for calibration and 2000-2013 for validation. The results showed that RF outperformed LDA and SVC for precipitation classification. The downscaling of extreme precipitation was found to improve when RF was used for the three-state classification and LS-SVR for the prediction of the precipitation amount.
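The classify-then-regress structure of such a two-step approach can be sketched in a few lines. The sketch deliberately swaps in much simpler techniques than those of Pham et al. [20]: a nearest-centroid classifier stands in for LDA/RF/SVC and an ordinary least-squares line stands in for LS-SVR, with a single hypothetical large-scale predictor. All names and the threshold-based state definition are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with one predictor."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        return y_mean, 0.0
    b = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
    return y_mean - b * x_mean, b


def precipitation_state(amount, extreme_threshold):
    """Assumed three-state labelling: dry, non-extreme, extreme."""
    if amount <= 0.0:
        return "dry"
    return "extreme" if amount >= extreme_threshold else "non-extreme"


def train(predictor, precipitation, extreme_threshold):
    """Per state, store the predictor centroid (step 1) and a regression line (step 2)."""
    groups = {}
    for x, y in zip(predictor, precipitation):
        state = precipitation_state(y, extreme_threshold)
        groups.setdefault(state, []).append((x, y))
    model = {}
    for state, pairs in groups.items():
        xs = [x for x, _ in pairs]
        ys = [y for _, y in pairs]
        model[state] = {"centroid": sum(xs) / len(xs), "line": fit_line(xs, ys)}
    return model


def predict(model, x):
    """Step 1: pick the nearest state centroid. Step 2: predict the amount for that state."""
    state = min(model, key=lambda s: abs(model[s]["centroid"] - x))
    if state == "dry":
        return state, 0.0
    a, b = model[state]["line"]
    return state, max(a + b * x, 0.0)
```

Fitting a separate amount model per state is what allows extreme days to follow a different predictor-precipitation relation from ordinary wet days.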

Early Warning and Forecasting Systems for Drought and Flood

Background
Hydrological hazards (flood and drought) are manageable by implementing appropriate emergency preparedness and mitigation strategies. Early warning and forecasting systems are among the effective measures for mitigating the negative impacts of drought and flood. Forecasting of hydrological events is performed using conceptual or data-driven models [60]. Each group of forecasting models has its own pros and cons. Conceptual models usually incorporate simplified forms of physical laws and are generally nonlinear, deterministic, and time-invariant, with parameters that characterize watershed features [61]. The main limitation of these models is that, once calibrated to a set of time series, they may not provide accurate predictions for values beyond the range of the calibration or validation values [62]. Data-driven models are numerical models that represent causal relationships or patterns between sets of input and output time series data, independent of the physics of the real-world situation. Their main advantages are that they require little prior knowledge of the internal functions of the system being modeled [63] and are well able to represent non-linear processes [64] and time-space variability [60]; their main limitation is that predictions are based entirely on mathematics without explicit physical consideration.

Case Studies
Rhee and Yang [13] developed a hybrid model for predicting meteorological drought, expressed by the 6-month Standardized Precipitation Index (SPI), in areas with a sparse gauge network, using the APEC Climate Center Multi-Model Ensemble seasonal climate forecast and the machine learning models Extra-Trees and AdaBoost. To overcome the limitation of the sparse network, dynamically downscaled historical climate data from the Weather Research and Forecasting (WRF) model, rather than in-situ data, were used as the reference to train the machine learning models. In another study, Khan et al. [14] developed two artificial neural network (ANN)-based models and two wavelet-based artificial neural network (W-ANN) models for meteorological and hydrological droughts characterized by the Standard Index of Annual Precipitation (SIAP) and the Standardized Water Storage Index (SWSI), respectively.
Owing to the importance of reservoir inflow forecasting for appropriate reservoir management, especially in the flood season, Amnatsan et al. [16] used the variation analogue method (VAM), the W-ANN, and the weighted mean analogue method (WMAM) to forecast reservoir inflows. In another study, Bafitlhile and Li [17] applied the ε-Support Vector Machine (ε-SVM) and ANN to simulate and forecast the streamflow of three catchments with humid, semi-humid and semi-arid climates. The Evolutionary Strategy (ES) optimization method was used to optimize the sensitive parameters of the ANN and SVM.
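In wavelet-based ANN models, the input series is typically decomposed into smoother and more detailed sub-series before being fed to the network. The sketch below shows one level of the Haar transform as the simplest example of such a decomposition; the wavelet family and decomposition level used in the cited studies are not stated here, so the choice of Haar is an assumption, and the function names are illustrative.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the original signal from one level of Haar coefficients."""
    root2 = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / root2, (a - d) / root2])
    return signal
```

The approximation coefficients carry the slowly varying part of the series and the detail coefficients the rapid fluctuations; both can be supplied to a forecasting model as separate inputs.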

Interconnections of Hydrological Extremes

Background
Hydrological extremes are often investigated in isolation, while in reality the hydrological processes in the water cycle are interconnected. Alternatively, complex interconnected water systems are oversimplified, as in the case of the relation between precipitation and groundwater table fluctuations. An isolated analysis of hydrological extremes, or an oversimplification of their complex interactions, results in an underestimation of the impact associated with extreme conditions. Hydrological extremes must, therefore, be analyzed in a compound manner to obtain a more realistic estimate of the overall impact. This supports better decision-making to curb the growing impacts of the extremes and to plan and build more resilient water systems [10].

Case Studies
The interconnection between hydrological extremes was addressed by Dawley et al. [22], who correlated surface and subsurface hydrological extreme events by investigating the possible effects of extreme storm events with different properties on the fluctuations in surface and subsurface water systems. They applied three probability density functions (PDFs), namely the Gumbel, stable, and stretched Gaussian distributions, to capture the distribution of the extremes and of the full time series of storm properties (storm duration, intensity, total precipitation, and inter-storm period), stream discharge, lake stage, and groundwater head values. The potentially non-stationary statistics of hydrological extremes were quantified by computing the time-scale local Hurst exponent (TSLHE) for the time series recording both the surface and subsurface hydrological processes. The results indicated that groundwater recharge has a strong relationship with storm duration and intensity and a weak one with total precipitation. The surface water and groundwater series were found to be persistent because of their relatively slowly evolving nature, while storm properties were anti-persistent because of their rapid temporal evolution. They also showed that no single distribution can effectively capture all hydrological extremes and that different distributions should be used depending on the variable under study.
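Of the three distributions named above, the Gumbel distribution is the simplest to illustrate. The sketch below fits it by the method of moments and computes a return level; this fitting method is an assumption chosen for brevity and is not necessarily the one used by Dawley et al. [22], and the function names are illustrative.

```python
import math
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(sample):
    """Method-of-moments estimates (location, scale) of a Gumbel distribution."""
    scale = stdev(sample) * math.sqrt(6.0) / math.pi
    location = mean(sample) - EULER_GAMMA * scale
    return location, scale


def gumbel_cdf(x, location, scale):
    """Non-exceedance probability F(x) = exp(-exp(-(x - location) / scale))."""
    return math.exp(-math.exp(-(x - location) / scale))


def return_level(period_years, location, scale):
    """Value exceeded on average once every period_years (annual maxima assumed)."""
    return location - scale * math.log(-math.log(1.0 - 1.0 / period_years))
```

By construction, `gumbel_cdf(return_level(T, location, scale), location, scale)` equals 1 - 1/T, so the 100-year level has an annual non-exceedance probability of 0.99.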
The difficulty of establishing a joint distribution function for multiple correlated random variables with a mixture of non-normal marginal distributions, which affects the design and safety evaluation of hydro-infrastructural systems, was tackled by Tung et al. [21]. They presented a framework for a practical normal transform based on a third-order polynomial, with explicit consideration of the sampling errors in sample L-moments and correlation coefficients. The modeling framework was then applied to establish an at-site precipitation intensity-duration-frequency (IDF) relationship from 27 years of annual maximum precipitation records for seven durations (1-72 h). The results showed that the proposed framework is able to deal with multivariate data having a mixture of non-normal distributions.
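The sample L-moments on which such a transform is built can be computed from probability-weighted moments of the ordered sample. The sketch below covers only this first ingredient, using the standard unbiased estimators; the polynomial coefficients and the sampling-error treatment of Tung et al. [21] are not reproduced, and the function name is illustrative.

```python
def sample_l_moments(data):
    """Return (l1, l2, t3, t4): L-location, L-scale, L-skewness, L-kurtosis."""
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("at least four observations are required")
    # Unbiased probability-weighted moments b0..b3 (i is the 0-based rank).
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    b3 = (sum(i * (i - 1) * (i - 2) * x[i] for i in range(n))
          / (n * (n - 1) * (n - 2) * (n - 3)))
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    # l2 is zero only for a constant sample, for which the ratios are undefined.
    return l1, l2, l3 / l2, l4 / l2
```

For a short record such as 27 annual maxima, these estimators are less sensitive to individual large values than conventional moments, which is one reason L-moments are popular in frequency analysis.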

Applicability of Satellite Data for Hydrological Studies

Background
Precipitation measurements can be obtained from different sources such as surface networks, weather radars and satellite estimates. Among them, the most common practice is to derive precipitation estimates over land areas from surface rain-gauge observations at automated or human-operated sites [65]. Despite the different types of errors in surface network measurements, such as instrument or human errors, changes of instrument or observer, changes of observing technique and changes in station surroundings like urbanization, they provide the most accurate measurements. Although they provide the most reliable and longest records [66], they are limited in sampling precipitation for continental and global applications [67]. In most regions of the world, rain-gauges do not provide a reliable spatial representation of precipitation [68], especially over oceans, deserts and mountainous areas. Other possible problems of surface rain-gauge observations are their inhomogeneous spatial distribution and the existence of missing data, which result in inadequate temporal and spatial sampling [65], especially in developing countries.
Satellite estimates, which provide spatially and temporally homogeneous precipitation information at a quasi-global to global scale, are excellent alternatives wherever and whenever rain-gauge observations are not available [11]. Nevertheless, their quality has to be validated before any application. The typical sources of errors in satellite precipitation data are sensor-related errors, retrieval errors, and spatial and temporal sampling errors [69]. The most common practice for verifying satellite data is to compare the satellite estimates with local station-based observations while accounting for the spatial scale mismatch between point station observations and gridded outputs, as the latter represent area averages rather than point values [54]. Specifically, extreme precipitation values obtained from station observations are expected to be more intense than those from gridded outputs [70] because of the smoothing associated with the spatial averaging of precipitation characteristics over gridcells [71]. Another limitation of satellite data is their coarse spatial resolution, which is not suitable for practical applications in hydrology and calls for spatial downscaling of the data.

Case Study
Addressing the need for the spatial downscaling of satellite data, Mahmud et al. [12] developed a spatial downscaling algorithm to produce finer-scale satellite precipitation data in the humid tropics. They used the low precipitation variability in Peninsular Malaysia and the monsoon characteristics (period, location, and intensity) at the local scale as a proxy to spatially downscale TRMM (Tropical Rainfall Measuring Mission) satellite precipitation data. To this end, a site-specific coefficient (SSC) was first derived for each individual pixel by comparing the high-resolution areal precipitation (0.05°) from a dense gauge network with re-gridded TRMM satellite precipitation data (from the initial resolution of 0.25° to 0.05°), and the SSC was then validated to produce high-resolution precipitation maps.
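The idea of a per-pixel coefficient can be illustrated with a minimal sketch. It assumes, purely for illustration, that the SSC is the ratio of gauge-based to re-gridded satellite precipitation at each fine pixel and that re-gridding simply replicates each coarse cell; the actual definition and re-gridding scheme of Mahmud et al. [12] are not given here, and all function names are hypothetical. A factor of 5 would correspond to going from 0.25° to 0.05°.

```python
def regrid_by_replication(coarse, factor):
    """Copy each coarse cell into a factor x factor block of fine cells."""
    fine = []
    for row in coarse:
        fine_row = [value for value in row for _ in range(factor)]
        fine.extend([list(fine_row) for _ in range(factor)])
    return fine


def site_specific_coefficients(gauge_fine, satellite_fine):
    """Assumed SSC: per-pixel ratio gauge / satellite (1.0 where satellite is zero)."""
    return [[g / s if s > 0 else 1.0 for g, s in zip(g_row, s_row)]
            for g_row, s_row in zip(gauge_fine, satellite_fine)]


def downscale(coarse, coefficients, factor):
    """Re-grid a coarse field and rescale every fine pixel by its coefficient."""
    fine = regrid_by_replication(coarse, factor)
    return [[c * v for c, v in zip(c_row, v_row)]
            for c_row, v_row in zip(coefficients, fine)]
```

In use, the coefficients would be derived once from a calibration period with dense gauge coverage and then applied to satellite fields from an independent period for validation.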

Conclusions
The research published in this Special Issue applied or developed a broad range of novel methods for the statistical analysis and stochastic modelling of hydrological extremes. The case studies presented in the 13 published papers have touched on six research areas: (1) historical changes in hydrological extremes; (2) projected changes in hydrological extremes; (3) downscaling of hydrological extremes; (4) early warning and forecasting systems for drought and flood; (5) interconnections of hydrological extremes; and (6) applicability of satellite data for hydrological studies. Contributions to this Special Issue are expected to be greatly beneficial for researchers, policy-makers and risk managers dealing with hydrological hazards. Yet, innovative statistical methods have to be developed to keep up with the accelerating pace of socio-environmental changes. Hence, topics concerning future hydrological extremes are of particular interest for further research, for instance:

• Assessment of the decadal natural oscillations of hydrological extremes and their concurrent and lagged relationships with large-scale atmospheric circulation patterns as done in Tabari and Willems [1];
• Attribution analysis of changes in the intensity, duration and frequency of hydrological extremes to anthropogenic influences;
• Dynamical downscaling of hydrological extremes and exploring the added value of CPMs and RCMs;
• Climate-proof hydraulic designs based on projected IDF curves;
• Assessment of uncertainties in hydrological projections and observations;
• Socioeconomic risk analysis of future hydrological hazards;
• Hydrological hazard mitigation and adaptation strategies.

Table 1. Metrics of the papers published in this Special Issue (average full-text and abstract views per day were calculated until 30 August 2019).