Flood Hazard Estimation under Nonstationarity Using the Particle Filter

: The presence of the nonstationarity in ﬂow datasets has challenged the ﬂood hazard assessment. Nonstationary tools and evaluation metrics have been proposed to deal with the nonstationarity and guide the infrastructure design and mitigation measures. To date, the examination of how the ﬂood hazards are affected by the nonstationarity is still very limited. This paper thus examined the association between the ﬂood hazards and the nonstationary patterns and degrees of the underlying datasets. The Particle Filter, which allows for assessing the uncertainty of the point estimates, was adopted to conduct the nonstationary ﬂood frequency analysis (NS-FFA) for subsequently estimating the ﬂood hazards in three real study cases. The results suggested that the optimal and top NS-FFA models selected according to the ﬁtting efﬁciency in general align with the pattern of nonstationarity, although they might not always be superior in terms of uncertainty. Moreover, the results demonstrated the association and the sensitivity of the ﬂood hazards to the perceived patterns and degrees of nonstationarity. In particular, the variations of the ﬂood hazards intensiﬁed with the increase in the degree of nonstationarity, which should be assessed in a more elaborate manner, i.e., considering multiple statistical moments. These advocate the potential of using the nonstationarity characteristics as a proxy for evaluating the evolutions of the ﬂood hazards.


Introduction
The flood risk assessment is critical for designing infrastructure and planning mitigation measures and policies to improve the water resources management. The flood risk is assessed through the consideration of three principal aspects, namely the flood hazard (i.e., the flood probability), the exposure, and the vulnerability [1][2][3]. The flood frequency analysis (FFA), which estimates the flood quantiles associated with certain exceedance probabilities (or return periods) and vice versa, is the fundamental tool for assessing the flood hazards. Conventionally, the FFA has been conducted by fitting a probability distribution to a set of observations (e.g., annual maximum series (AMS)) under the assumption that they are independent and identically distributed, and consequently stationary. However, the conventional FFA, and, thus, the flood hazard assessment, have been challenged in the last decades in the face of climate change and other anthropogenic changes in the watersheds (e.g., land use and land cover changes) [4], which lead to the violation of the assumption of stationarity.
As a result, the development and implementation of the nonstationary FFA (NS-FFA), in which the distribution is temporally variant, has received more attention [5][6][7]. In particular, the distribution evolution in the NS-FFA has been commonly modeled by varying the distribution parameters. The distribution function can be determined based on the theoretical foundations, such as the extreme value theory [8]. However, the distribution has been deemed as a secondary aspect in the NS-FFA and, thus, it is a common practice to employ a practically acceptable distribution in the study region [9]. Among a variety of quantity uncertainty in estimates, assumptions (such as the statistical characteristics of the model errors, or the uncertainty in the input parameters, e.g. in [36,37]) have often be made. However, these assumptions, in general, cannot be properly justified, and, thus, might become an additional source of error and uncertainty. Unlike other conventional methods that might heavily rely on the assumptions adopted, the PF is practically assumptioninsensitive, and consequently is more reliable in uncertainty estimation [38,39].
In the face of the constant and sometimes intensive changes in the hydroclimate systems, there is a necessity of updating the strategies used for the flood hazard assessment and water resources management [4]. Although the limited understanding of the driving process(es) poses a major challenge in characterizing the nonstationarity, simply ignoring it in the planning process is not acceptable [40]. Broadly speaking, the nonstationarity can be intuitively translated into changes in the flood hazards, such as upward/downward trends in the annual maximum series (AMS) leading to the increase/decrease in the flood hazards. A recent study by Read and Vogel [26] illustrated how the nonstationarity can produce substantial impacts on the flood hazards using a hypothetical streamflow record, and concluded that the degree of nonstationarity is a critical parameter determining the impact severity. However, the need for assessing to what extent the nonstationarity affects the planning and management decisions for water resources management has been underscored [41,42]. The examination of the connection between the different observed patterns of nonstationarity and their degrees in the underlying datasets with the flood hazards is still lacking, especially using real datasets. Such an investigation could elucidate the response of the flood hazards to the nonstationarity, and, thus, provide insights into managing floods under nonstationarity.
In the view of the above, this paper employs the most recently proposed PF to conduct the NS-FFA for assessing the flood hazards under nonstationarity of different patterns. The temporal evolutions of the selected flood hazard indexes as well as their uncertainties are examined for three flow AMSs that exhibit different patterns and degrees of nonstationarity. In particular, the temporal evolution patterns of the flood hazards indexes are contrasted with the perceived patterns of nonstationarity of the underlying datasets to advance the knowledge on how the nonstationarity affects the flood hazards.

Nonstationary Datasets
Strictly speaking, the stationarity refers to the temporal invariance of the probability distribution; while the weak/second-order stationarity, which is defined based upon the temporal invariance of the first two statistical moments of the underlying stochastic process [43], has been often adopted in the field of hydrology. In the context of the FFA, the limited sample size constrains the exploration of higher-order moments [41] and subsequently the estimation of some distribution parameters, such as the shape parameter of three-parameters distributions. Thus, the definition of weak stationarity was adopted herein when examining the nonstationarity and its patterns. The Mann-Kendall (MK) test, which has been commonly adopted in the literature, was employed to detect the presence of monotonic temporal trends in both the mean and the standard deviation of the datasets. The MK test was coupled with the moving window approach, in which both the window length and constant shift are 10 years. These values were used to skip potential decadal variability and avoid data points to be counted more than once because of window overlapping. Moreover, the Mann-Kendall-Sneyers and Mann-Whitney-Pettitt tests were employed to detect the presence of the change point(s) in the datasets. The significance level of 0.10 was used in these tests.
In the literature, the nonstationarity of AMSs has been commonly reported in terms of the monotonic trends in the mean and/or the standard deviation, and the non-monotonic trends (e.g., [5,25,[44][45][46]). Unlike the monotonic trends, the non-monotonic trends refer to the non-uniform temporal changes, including abrupt shifts and different trends (direction and/or degree) in different subperiods. In this paper, three datasets, which exhibit three different patterns of nonstationarity, were selected for illustrating the temporal evolutions of the flood hazards under different nonstationary scenarios. The three AMSs were collected on Chilliwack River (ECCC-08MH016) from the Environment and Climate Change Canada (ECCC), and Aberjona River (USGS-01102500) and Elizabeth River (USGS-01393450) from the United States Geological Survey (USGS). These datasets were named as the D1, D2, and D3, respectively throughout this paper, and have sample sizes of 89, 78, and 98, respectively. The percentages of missing data in the D1 and D2 are 6% and 3%, respectively. Among these datasets, D1 exhibits a significant monotonic upward trend in the mean (Figure 1a), while D2 shows significant monotonic upward trends in both the mean and standard deviation (Figure 1b). A significant change point was detected around 1951 in the D3, and a significant monotonic upward trend in the mean was identified in the sub-dataset after the change point ( Figure 1c). These datasets exhibiting upward trends in the mean and/or the standard deviation were selected, as this type of nonstationary (compared to the downward trend) would pose more challenges to the decision-makers, engineers, and urban planners, for managing increasing flood hazards.
In the literature, the nonstationarity of AMSs has been commonly reported in terms of the monotonic trends in the mean and/or the standard deviation, and the non-monotonic trends (e.g., [5,25,[44][45][46]). Unlike the monotonic trends, the non-monotonic trends refer to the non-uniform temporal changes, including abrupt shifts and different trends (direction and/or degree) in different subperiods. In this paper, three datasets, which exhibit three different patterns of nonstationarity, were selected for illustrating the temporal evolutions of the flood hazards under different nonstationary scenarios. The three AMSs were collected on Chilliwack River (ECCC-08MH016) from the Environment and Climate Change Canada (ECCC), and Aberjona River (USGS-01102500) and Elizabeth River (USGS-01393450) from the United States Geological Survey (USGS). These datasets were named as the D1, D2, and D3, respectively throughout this paper, and have sample sizes of 89, 78, and 98, respectively. The percentages of missing data in the D1 and D2 are 6% and 3%, respectively. Among these datasets, D1 exhibits a significant monotonic upward trend in the mean (Figure 1a), while D2 shows significant monotonic upward trends in both the mean and standard deviation (Figure 1b). A significant change point was detected around 1951 in the D3, and a significant monotonic upward trend in the mean was identified in the sub-dataset after the change point ( Figure 1c). These datasets exhibiting upward trends in the mean and/or the standard deviation were selected, as this type of nonstationary (compared to the downward trend) would pose more challenges to the decision-makers, engineers, and urban planners, for managing increasing flood hazards.

Distribution and Nonstationary Structure in the Nonstationary Frequency Analysis
In the NS-FFA, the focus has been primarily on the nonstationary structures, as they are used to model the temporal evolution of the distribution; whereas the distribution is not envisioned to significantly affect the evolution of the flood hazard estimates under nonstationarity. In this paper, the distribution type was thus fixed for all datasets, while seven different nonstationary structures were nested in the selected distribution. The optimal nonstationary structure was then selected from the competing structures for each dataset according to their performance.

Distribution and Nonstationary Structure in the Nonstationary Frequency Analysis
In the NS-FFA, the focus has been primarily on the nonstationary structures, as they are used to model the temporal evolution of the distribution; whereas the distribution is not envisioned to significantly affect the evolution of the flood hazard estimates under nonstationarity. In this paper, the distribution type was thus fixed for all datasets, while seven different nonstationary structures were nested in the selected distribution. The optimal nonstationary structure was then selected from the competing structures for each dataset according to their performance.
The GEV developed within the extreme value theory provides a rigorous theoretical framework for the analysis of hydroclimatic extreme events [8,24]. It has been commonly used in the NS-FFA [46,47], and, thus, was used herein. Given a set of observations, which are independent realizations of the random variable, Y, the GEV cumulative function (Y (F Y )) and the quantile function (y p e ) corresponding to an exceedance probability p e (p e = 1 − F Y (y; θ)) are given by: where the distribution parameter vector (θ) is composed of the shape, scale, and location parameters, denoted by κ, α, and ξ, respectively. The return period (T) is the reciprocal of the exceedance probability, i.e., T = 1/p e . In the NS-FFA, it is common to adopt time-varying distribution parameters, which are expressed as a function of a selected covariate to depict the nonstationarity. The temporal covariate was adopted herein due to its practical convenience, especially for fitting purposes as discussed previously. The candidate nonstationary structures investigated in this paper have been commonly employed in the literature (e.g., [17,48,49]). The candidate nonstationary structures, in which the distribution parameters (except the shape parameter) are either linear or nonlinear functions of the covariate, are shown in Table 1. The shape parameter was treated as a constant, as it is particularly unrealistic to allow it to vary and difficult to estimate [8]. More complicated nonstationary structures (e.g., parameters expressed as a higher-order polynomial function of the covariate) can be used theoretically; however, they were not included as their parametrization could be highly uncertain [50,51]. Table 1. Candidate Nonstationary Structures in the nonstationary flood frequency analysis (NS-FFA). GEV = generalized extreme values distribution.

Particle Filtering for the Nonstationary Frequency Analysis
The idea behind the Bayesian filtering is to estimate the hidden/unobservable state vector of a dynamic system, x 0:i = {x 0 , x 1 , . . . , x i }, which is difficult if not impossible to be measured in the field. The hidden state vector is estimated from the related observations, y 1:i = {y 1 , y 2 , . . . , y i }. The general form of the state-space equation system can be expressed as where g(·) is the dynamic model that describes the stochastic dynamics of the system (i.e., the state function); f (·) is the measurement model that allows for mapping the states into the observational space (also referred to as the mapping function); ν i and ϕ i are the process noise in x i and measurement noise in y i , respectively; N(·) denotes the normal distribution; and V j and Φ j are the variance of ν j and ϕ j , respectively. In Bayesian filtering techniques, the prior belief propagation from step i-1 to i is conducted using the Chapman-Kolmogorov equation, which requires an analytical integration over the state space. Such integration is challenging in non-linear and non-Gaussian problems. The PF method, which replaces this integration with an approximation using a cluster of discrete weighted particles, can solve this issue [52]. When estimating the distribution parameters in the context of FFA, the general structure of the PF can be formulated by taking θ i as the state variable. Under nonstationarity, it varies over time to depict the distribution evolution. In this paper, ϕ i was not considered. Since the parameters are estimated for a given time period, the hidden states evolve in pseudo-time, replacing time by iterations in order to incorporate all observations unitized as a batch at each step i, i.e., Y i = {y 1 , y 2 , . . . , y n } ∀ i. This estimation can be expressed as the search for the joint posterior distribution of the states given the related observations in the Bayesian framework.
The PF approximates the distribution parameters and their inherent uncertainty through a set of particles, which are initially sampled from an arbitrary prior distribution. In this paper, a non-informative prior, namely a uniform distribution, was adopted. The parameter estimation is achieved by the particle cluster evolving over pseudo-time, which is controlled by both particle weights updating and white noise perturbation (v i ) around their positions. At each pseudo-time step, the particle weights are estimated based on the particle likelihood. The particle likelihood is obtained by the error between the model outputs (quantile estimates) which are computed using the particle set through the quantile function of the probabilistic model f (·), and the measurements Y i by assigning them exceedance probabilities using an empirical plotting position formula. Then, the particles are resampled to guarantee particle diversity and avert particle impoverishment. With the progress of the PF over pseudo-time steps, θ i converges to a stable estimation. The quantification of the uncertainty in the model parameters is then carried out employing the stabilized set of parameter particles.
In the PF, several hyperparameters need to be pre-determined and are case-sensitive. The hyperparameters are dependent on each other and govern the different aspects of the PF including the estimation precision and the computational burden. For instance, the selected number of particles governs the accuracy of the estimates to some extent, and the number of pseudo-time steps to achieve convergence in the estimates depends on the underlying probabilistic model, the number of particles employed, and the dataset itself. In this paper, the number of particles and the number of pseudo-time steps to reach a single estimation were set to 5000 and 500, respectively. The number of simulations to stabilize the uncertainty estimates was 100. A more detailed description of the PF and the setup of the hyperparameters of the PF for the FFA can be found in Sen et al. [32] and Vidrio-Sahagún et al. [33].

Flood Hazard Metrics
Three hazard metrics, namely the ERL, ERP, and R, were adopted to examine the temporal evolution of the flood hazards under nonstationarity. These three metrics and their stationary counterparts (namely the conventional return levels/quantiles, return period, and R, respectively) are commonly used in flood hazard assessment studies [49,53,54]. The ERL and ERP are used in both the NS-FFA and the nonstationary flood hazard assessment, while the R is designated for the latter.
The ERL, which denotes the time-variant return level/quantile corresponding to a particular exceedance probability p e (or return period T), is given by: where θ t is the time-varying distribution parameter vector. Similarly, the ERP, which denotes the reciprocal of the time-variant exceedance probability associated with a given assessment threshold (quantile), is given by: where y p e ,t 0 is the assessment threshold, which is associated with the exceedance probability p e at a reference time denoted as t 0 .
The R explicitly incorporates the design life of an infrastructure or the assessment horizon of interest, M (in years), for assessing the flood hazard. The R denotes the probability of observing at least one event exceeding the design threshold y p e ,t 0 during the time period M. The R is estimated by:

Model Evaluation and Uncertainty Metrics
The candidate models were evaluated based on the fitting efficiency using the Akaike Information Criteria (AIC) and the Bayesian Information Criteria (BIC), both of which deal with the trade-off between the goodness-of-fit offered by a model and its complexity. A model yielding a smaller AIC/BIC is more efficient in the fitting. These two assessment criteria were used to select the optimal models and they are calculated by [55,56]: where N par is the number of model parameters, and the root mean square error (RMSE), which is calculated by: where O j are the empirical quantiles obtained using a plotting position formula (p 1:n = (r − 0.35)/n (where r is the rank of the j th observation and n is the sample size of Y i ), and M j are the corresponding modeled quantiles by the PF. The empirical plotting position formula was selected due to its better performance for estimating the parameters and quantiles compared to other formulas for extreme value distributions [57]. In addition, two other commonly used accuracy metrics, Bias, and R 2 , were included and they are calculated by: Regarding the uncertainty in the estimations, it is often desired that a narrow uncertainty band contains as many as possible observations. Thus, to assess the level of uncertainty, two uncertainty metrics, namely the average bandwidth (AW) and the percentage of coverage (POC) of the uncertainty bands, have often been employed. It is worth to mention that these two uncertainty metrics conflict with each other, as the band coverage is in a trade-off with its width. These uncertainty metrics are computed by [37,58]: where M U j and M L j are the upper and lower uncertainty bounds of the estimates of the j th observation, respectively.

Optimal Nonstationary Model
The optimal nonstationary model, namely the optimal nonstationary structure given the selected distribution, is determined based on the efficiency of fitting of the ERL estimates. Table 2 summarizes the calculated fitting efficiency metrics along with the accuracy metrics for the ERL estimates of all candidate nonstationary models for each dataset. For the D1, the three candidate models, M 1,0,0 , M 2,0,0 , and M E1,0,0 , all of which adopt a time-variant location parameter, but constant scale and shape parameters, are in general superior to other candidate models in terms of both the fitting efficiency (AIC and BIC) and accuracy (RMSE, Bias, and R 2 ). Among these top models, the M 1,0,0 , which models the nonstationarity using the linear function of time, slightly outperforms the other models in terms of the fitting efficiency. Recall that a significant upward trend was detected in the mean of this dataset, which is consistent with the nonstationary structure of the selected optimal model (M 1,0,0 ) of the D1. For the D2, it is apparent that the M E1,1,0 is superior to all other competing models in terms of both the fitting efficiency (the lowest AIC and BIC) and accuracy (the second lowest Bias, the highest R 2 , and the lowest RMSE). In the M E1,1,0 , the location and scale parameters are exponential and linear functions of time, respectively. These reflect the detected trends in both the mean and standard deviation in the D2. For the D3, the optimal model selected according to both the fitting efficiency and accuracy metrics is the M 1,1,0 , followed by M 1,E1,0 , and M E1,1,0 . These models (M 1,1,0 , M 1,E1,0 , and M E1,1,0 ) allow both the location and scale parameters to vary over time. Note that differing from the D1 and D2, a change point was detected in the D3 and that the candidate models are not formulated to capture change points. Yet, if ignoring the change point, significant trends in the mean and standard deviation were detected over the whole observation period. Therefore, the nonstationary structure of the optimal and top models of the D3 also aligns with the perceived pattern of nonstationarity. As illustrated by the three datasets, irrespective of their different patterns of nonstationarity, the nonstationary structure of their optimal models (and often, other top models) appears to be related to the patterns of nonstationarity detected in the underlying datasets at a great degree. Moreover, Figure 2 displays the uncertainty metrics of all candidate nonstationary models for each dataset. For the D1, it appears that the optimal and top candidate models (M 1,0,0 , M 2,0,0 , and M E1,0,0 ) selected in terms of the fitting efficiency (AIC and BIC) result in a relatively higher AW compared to other candidate models, while the high POCs (>95%) are reported in most of the models (except M 1,E1,0 ) (Figure 2a). Recall that the AW and POC are two uncertainty metrics conflicting with each other, as a smaller AW can be achieved at the expense of decreasing the POC, and vice versa. In contrast, for the D2, the M E1,1,0 is not only optimal according to the fitting efficiency metrics but also superior in uncertainty, as it yields the lowest AW and the highest POC among the candidate models. Similarly, for the D3, the top models (M 1,1,0 , M 1,E1,0 , and M E1,1,0 ) in general also offer a lower level of uncertainty than that of other competing models. Besides, the optimal model (M 1,1,0 ) selected in terms of fitting efficiency has the lowest AW and its POC is only 1% below the maximum POC reported in the M E1,0,0 , M 2,0,0 , and M 1,E1,0 . Thus, the selected optimal models based on the fitting efficiency (AIC/BIC) do not always yield the lowest level of uncertainty (e.g., for the D1). These results argue that the optimal models selected in terms of the fitting efficiency might not necessarily guarantee to be optimal in terms of different assessment criteria, such as the uncertainty. Therefore, developing more elaborate criteria for incorporating the uncertainty into the selection of the optimal model might be beneficial for improving the NS-FFA, and, thus, the flood hazard assessment, as the reliability is also an important practical aspect of the analysis and assessment.

Flood Hazard Assessment under Nonstationarity
The selected optimal NS-FFA models, namely M 1,0,0 , M E1,1,0 , and M 1,1,0 for the D1, D2, and D3, respectively, are used to exemplify the temporal evolution of the flood hazard indexes. The nonstationarity in the three datasets is shown to be acceptably captured in the NS-FFA, as the nonstationary structures of the optimal models showcase coherence with the perceived patterns of nonstationarity of the underlying datasets as discussed previously. Among the three datasets, the D1 has the lowest degree of nonstationarity, as its estimated trend (Sen's) slope in the mean is 0.02 µ m 3 /s per decade (i.e., an upward trend with a slope of 2% per decade with respect to the reference statistic, in this case, the mean of the dataset). The degree of nonstationarity of the D3 is considered moderate among these datasets, as it shows a significant trend slope in the mean of 0.05 µ m 3 /s per decade in the subperiod after the detected change point; whereas the slopes of the trends in the mean and standard deviation over the whole observation period (ignoring the change point) are 0.08 µ m 3 /s per decade and 0.06 σ m 3 /s per decade, respectively. The D2 is considered to have the highest degree of nonstationarity among the datasets, as the slopes of its upward trends in the mean and standard deviation are 0.07 µ m 3 /s per decade and 0.15 σ m 3 /s per decade, respectively.
The frequency surfaces (Figure 3) present the temporal evolution of the point estimates of the ERLs. As expected, the ERLs increase with time at any given T for all three datasets. This behavior is consistent with the perceived upward trends in the mean and/or standard deviation in these datasets. Comparing the frequency surfaces of the datasets, the curvature of the frequency surface of the D2 is at a higher degree. Quantitatively, the ERLs increase at a constant rate of 0.03 ERL m 3 /s per decade independently of the T and time for the D1. Here, ERL refers to the mean of the ERLs taken along the T-axis at every 5-yrs interval every year over the observation period. For the D3, the ERLs increase at an average rate of 0.09 ERL m 3 /s per decade, while the trend slope varies along with the contour lines of Ts. For example, the ERLs increase at the rates of 0.07 ERL m 3 /s and 0.10 ERL m 3 /s per decade along with the contour lines of T = 5-yrs and 75-yrs, respectively. For the D2, the ERLs increase at the highest average rate of 0.16 ERL m 3 /s per decade among the datasets, while the trend slopes of ERLs also vary with both the T and time. Thus, the variation pattern and degree of the ERLs appear to be consistent with the perceived pattern and degree of nonstationarity in the datasets.   Figure 4 further depicts the point estimates of the ERLs corresponding to three selected Ts (10-, 50-, and 100-yr). As shown in Figure 4, the ERL point estimates of the three datasets tend to increase, indicating the rise of the expected quantiles associated with a given T. The most pronounced variations in the ERLs (and frequency surfaces) can be noticed in the D2, followed by the D3 and D1, which coincides with the most and least prominent degree of nonstationarity perceived in the D2 and D1, respectively. For instance, at the given T = 50-yr, the ERL point estimates for the D2 display an upward trend of 0.14 / per decade, whereas the trends are 0.09 / per decade and 0.03 / per decade for the D3 and D1, respectively. Note that significant upward trends were detected in both the mean and standard deviation in the D2 and D3  Figure 4 further depicts the point estimates of the ERLs corresponding to three selected Ts (10-, 50-, and 100-yr). As shown in Figure 4, the ERL point estimates of the three datasets tend to increase, indicating the rise of the expected quantiles associated with a given T. The most pronounced variations in the ERLs (and frequency surfaces) can be noticed in the D2, followed by the D3 and D1, which coincides with the most and least prominent degree of nonstationarity perceived in the D2 and D1, respectively. For instance, at the given T = 50-yr, the ERL point estimates for the D2 display an upward trend of 0.14ERL 50−yr m 3 /s per decade, whereas the trends are 0.09ERL 50−yr m 3 /s per decade and 0.03ERL 50−yr m 3 /s per decade for the D3 and D1, respectively. Note that significant upward trends were detected in both the mean and standard deviation in the D2 and D3 (over the entire observation period).  per decade and −421 per decade, respectively, whereas the trend for the D1 is −42 per decade. As illustrated in Figure 5, the extremely high point estimates of the ERP for the D2 and D3 (ERP > 10 5 years for the reference assessment thresholds , and , ) might be unrealistic. Pertaining to the unrealistic estimates, Ouarda et al. [51] and Serinaldi and Kilsby [25] pointed out that the estimates of the flood quantiles and their exceedance probabilities can be sometimes unrealistic/improbable under nonstationarity. Thus, the estimates of the flood hazards might be unrealistic as well. A possible solution to avoid the unrealistic estimates might be to refine the nonstationary structure [25,51], such as through advancing the understanding of the cause-effect mechanism behind the nonstationarity.  Figure 5 shows the point estimates of the ERPs corresponding to three assessment thresholds y p e ,t 0 for D1, D2, and D3, respectively. These three y p e ,t 0 are the 10-, 50-and 100-yr quantiles derived from the optimal nonstationary model at the reference time point (t 0 ). The selection of t 0 is problem specific. It can be the start time of the operation of the infrastructure under assessment or the start of the period of interest for policy evaluation. For illustration purposes, t 0 = 30 years prior to the end of the observation period of the datasets is used here. Differing from the point estimates of the ERL, the ERP point estimates tend to decrease temporally, suggesting the increase of the expected recurrence of y p e ,t 0 over time for all three datasets. Furthermore, similar to the ERLs, a larger change in the ERP point estimates is resulted in when the degree of nonstationarity is higher. For example, at the given y T=50−yr,t 0 , the ERP point estimates for the D2 and D3 have (downward) average trends of −656 yrs per decade and −421 yrs per decade, respectively, whereas the trend for the D1 is −42 yrs per decade. As illustrated in Figure 5, the extremely high point estimates of the ERP for the D2 and D3 (ERP > 10 5 years for the reference assessment thresholds y T=50yr,t 0 and y T=100yr,t 0 ) might be unrealistic. Pertaining to the unrealistic estimates, Ouarda et al. [51] and Serinaldi and Kilsby [25] pointed out that the estimates of the flood quantiles and their exceedance probabilities can be sometimes unrealistic/improbable under nonstationarity. Thus, the estimates of the flood hazards might be unrealistic as well. A possible solution to avoid the unrealistic estimates might be to refine the nonstationary structure [25,51], such as through advancing the understanding of the cause-effect mechanism behind the nonstationarity. Geosciences 2021, 11, x FOR PEER REVIEW 13 of 18 To exemplify the temporal evolution of the R, the is set to the last 30 years of the observation period. Thus, the is evaluated over the assessment horizon from the to the end of the observation period. The point estimates of the R at the previously defined three , and their uncertainties are shown in Figure 6 for the three datasets. Comparing the R point estimates of these three datasets, the D2 always has the highest R (and the largest increase rate in R), while the lowest R is reported for the D1. For example, at the given , , the R for the D1 increases at the average rate of 0.21 per decade, whereas the R for the D3 and D2 increases at the rates of 0.26 per decade and 0.35 per decade, respectively. The results on the R along with the previous results on the ERL and ERP support that the flood hazards change more (increase or decrease) when the degree of nonstationarity of the underlying datasets is more pronounced. This implies that the impacts of climate change and other anthropogenic forces leading to the nonstationarity of flow AMSs on the flood hazards might be quantified according to the degree of nonstationarity. In addition, the trends in both the mean and standard deviation are shown to affect the flood hazards (e.g., in the D2 and D3). This calls for examining the temporal trends in the mean and standard deviation, and even higher order-moments if possible considering the sample size, to model and assess the effect of the nonstationarity on flood quantiles and hazards. To exemplify the temporal evolution of the R, the M is set to the last 30 years of the observation period. Thus, the R is evaluated over the assessment horizon from the t 0 to the end of the observation period. The point estimates of the R at the previously defined three y p e ,t 0 and their uncertainties are shown in Figure 6 for the three datasets. Comparing the R point estimates of these three datasets, the D2 always has the highest R (and the largest increase rate in R), while the lowest R is reported for the D1. For example, at the given y T=50−yr,t 0 , the R for the D1 increases at the average rate of 0.21 per decade, whereas the R for the D3 and D2 increases at the rates of 0.26 per decade and 0.35 per decade, respectively. The results on the R along with the previous results on the ERL and ERP support that the flood hazards change more (increase or decrease) when the degree of nonstationarity of the underlying datasets is more pronounced. This implies that the impacts of climate change and other anthropogenic forces leading to the nonstationarity of flow AMSs on the flood hazards might be quantified according to the degree of nonstationarity. In addition, the trends in both the mean and standard deviation are shown to affect the flood hazards (e.g., in the D2 and D3). This calls for examining the temporal trends in the mean and standard deviation, and even higher order-moments if possible considering the sample size, to model and assess the effect of the nonstationarity on flood quantiles and hazards.

Uncertainty in the Flood Hazard Assessment
The application of the PF allows for the uncertainty estimation simultaneously with the point estimates of flood quantiles and hazards. As shown in Figures 4 and 5, the uncertainty bandwidths of ERLs and ERPs, in general, increase with the increments of the T and y p e ,t 0 , respectively. These results illustrate that the hazard estimation is more uncertain in the tail region of the distribution compared to those events that are more recurrent. Moreover, the bandwidth of the ERL increases along with time at all three Ts for the three datasets. For instance, at the given T = 50-yr, the upward trend slopes of the ERL bandwidths are 0.01 AW, 0.12 AW, and 0.06 AW per decade for the D1, D2, and D3, respectively. Thus, the bandwidth of the ERL of the D2 increases at the highest rate among the datasets. Similar results were obtained at the given T = 10-yr and 100-yr. Therefore, the level of uncertainty in the ERL appears to elevate over time and its change rate is related to the degree of nonstationarity as well. This suggests that the elevation of the uncertainty in the ERLs would be more prominent when the degree of nonstationarity is higher. Similar to the trend in their point estimates, the bandwidth of the ERP decreases along with the time. It is worth it to mention that, at a certain time subperiod (i.e., the beginning of the evaluation period of the ERP), the uncertainties of the ERPs of the D2 and D3 are very large and range over a few orders of magnitude at the reference assessment thresholds of y T=50yr,t 0 and y T=100yr,t 0 ( Figure 5).

Uncertainty in the Flood Hazard Assessment
The application of the PF allows for the uncertainty estimation simultaneously with the point estimates of flood quantiles and hazards. As shown in Figures 4 and 5, the uncertainty bandwidths of ERLs and ERPs, in general, increase with the increments of the T and , , respectively. These results illustrate that the hazard estimation is more uncertain in the tail region of the distribution compared to those events that are more recurrent. Moreover, the bandwidth of the ERL increases along with time at all three Ts for the three datasets. For instance, at the given T = 50-yr, the upward trend slopes of the ERL bandwidths are 0.01 , 0.12 , and 0.06 per decade for the D1, D2, and D3, respectively. Thus, the bandwidth of the ERL of the D2 increases at the highest rate among the datasets. Similar results were obtained at the given T = 10-yr and 100-yr. Therefore, the level of uncertainty in the ERL appears to elevate over time and its change rate is related to the degree of nonstationarity as well. This suggests that the elevation of the uncertainty in the ERLs would be more prominent when the degree of nonstationarity is higher. Similar to the trend in their point estimates, the bandwidth of the ERP decreases along with the time. It is worth it to mention that, at a certain time subperiod (i.e., the beginning of the evaluation period of the ERP), the uncertainties of the ERPs of the D2 and D3 are very large and range over a few orders of magnitude at the reference assessment thresholds of , and , ( Figure 5). As for the uncertainty in the R of the three datasets (Figure 6), the D1, which shows the lowest degree of nonstationarity, has the lowest R but the largest uncertainty; whereas the uncertainty in the R of the D2 is the lowest. For instance, at the given , , the As for the uncertainty in the R of the three datasets ( Figure 6), the D1, which shows the lowest degree of nonstationarity, has the lowest R but the largest uncertainty; whereas the uncertainty in the R of the D2 is the lowest. For instance, at the given y T=50−yr,t 0 , the AW is 0.45, 0.34, and 0.35 for the D1, D2, and D3, respectively. Thus, different from its point estimates, the uncertainty in the R appears to be inversely correlated with the degree of nonstationarity of the datasets. From all the results of the ERL, ERP, and R, it can be concluded that despite the uncertainty in the flood hazard indexes respond differently to the nonstationarity, it is evidently associated with the nonstationary degree. The different responses of uncertainty in the flood hazard indexes to the nonstationarity would be ascribed to their different formulations, and, thus, their natures.

Conclusions
To advance the understanding of the impact of the nonstationarity on the flood hazards, this paper conducted the flood hazard assessment through the NS-FFA using the PF for three selected flow AMSs, which exhibit different nonstationary behavior and degree. In particular, the response of the flood hazards to the perceived pattern and degree of nonstationarity was investigated. In the NS-FFA, the nonstationary structure of the optimal models according to the fitting efficiency appeared to align with the perceived patterns of nonstationarity in these three underlying datasets, irrespective of their different patterns of nonstationarity. However, the optimal model selected in such a way did not always guarantee its superiority in terms of other assessment criteria, such as the uncertainty level in terms of the AW and/or POC, as illustrated in the D1. Therefore, to further improve the NS-FFA, and, thus, flood hazard assessment, it is desirable to incorporate the uncertainty into the selection of the optimal model, as the model reliability is the other important practical aspect of the frequency analysis. As for the flood hazard indexes, the responses of the three selected indexes, namely the ERL, ERP, and R (in both their point estimates and uncertainty levels) were also associated with the patterns and degrees of nonstationarity. Among the three datasets, the D2 having the most pronounced degree of nonstationarity was found to have the largest increases in the indexes; whereas the lowest increases in the indexes were calculated for the D1, which has the lowest degree of nonstationarity. Thus, the variations of the point estimates of the flood indexes intensified with the degree of nonstationarity, which should be characterized considering the temporal evolutions of the mean and standard deviation, and possibly the high-order moments. The uncertainties in the ERP and ERL tended to increase with the increase in the degree of nonstationarity. In contrast, the uncertainty in the R decreased with the increase in the degree of nonstationarity. As a result, the uncertainty in these flood hazard indexes showed different responses to the nonstationarity, depending on their formulations. All of these advocate the sensitivity of the flood hazards to the perceived patterns and degrees of nonstationarity of the underlying datasets. Therefore, the impacts of climate change and other anthropogenic factors on the flood hazards might be depicted by the nonstationarity characteristics of the datasets. On the other hand, further research on the effect of the PF on the NS-FFA, in particular in the context of the uncertainty in estimates, is recommended through comparing with other available approaches. Funding: The first author of this paper is funded by a doctoral Scholarship from the National Council for Science and Technology of Mexico (CONACYT) and the Universidad de Guadalajara. This work is also partially funded by the Discovery Grant of Natural Sciences and Engineering Research Council (RGPIN/04550-2020) held by the second author.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
The data presented in this study are openly available in the HYDAT database of the Environment and Climate Change Canada (ECCC) (https://collaboration.cmc. ec.gc.ca/cmc/hydrometrics/www/) and in the United States Geological Survey (USGS) website (https://nwis.waterdata.usgs.gov/usa/nwis/peak).