Multivariate Hydrologic Risk Analysis for River Thames

This study analyzed the multivariate flood risk for the river Thames at Kingston based on historical flood data from the National River Flow Archive (NRFA) website. The bivariate risk anal‐ ysis framework was prepared from the joint return periods of the peak flow (m3/s) and 3‐day annual maximum  flow (m3/s)  flood pair. A  total of  137  samples of  flood pairs  from  1883  to 2019 were adopted  for  risk analysis. The multivariate  return periods were characterized depending on  the quantification of the bivariate flood frequency analysis of the pair through copulas methods. The unknown parameter of each copula was estimated using the method‐of‐moment (MOM) estimator based on Kendall’s tau inversion, in which the Clayton copula performed best to model the depend‐ ence of the two flood variables. Then, the bivariate hydrologic risk was characterized based on the joint return period in AND, established from the Clayton copula method. The results reveal that the flood pair would keep a constant hydrologic risk value for some time then moderately decrease as the 3‐day AMAX flow increases from 700 m3/s. This hydrologic risk indicator was analyzed under four service time scenarios and three peak flows whose return periods were positioned at 50, 100, and 150 years. The outcomes from the bivariate risk analysis of the flood pairs can be used as deci‐ sion support during the design of flood defenses and hydraulic facilities.


Introduction
Some of the major natural catastrophe events are floods, caused by the falling of extreme rainfall exceeding the capacity of a particular catchment. These events are likely to be more frequent and severe considering the ongoing radical changes in climate [1]. The calamity caused by these continuous, heavy inundations poses a significant effect on human lives and the surrounding environment both socially and economically [2]. In fact, they are the costliest hydrological events, and account for 31% of the world's economic loss [3]. In the last 20 years, there have been 951 weather-related events among the 999 natural disaster events in Europe, of which 41% of them were floods [3]. Specifically, flash floods, caused by intense rainfall, need to be comprehensively addressed since they leave little time to warn the population and cause more deaths than normal floods [4].
Even with the ongoing threat associated with floods, settlements are sited along the floodplain zones due to the benefits that come from rivers [5]. Such settlements have served a major contribution to the current flood's complexity requiring the need for extensive analysis in predicting and managing the future probability of flood re-occurrence. This is possible by adopting various flood risk analysis measures [6]. The designing of flood defenses has been widely applied as a measure to manage and control future flood risk. So long as a flood magnitude is known, decision-makers are in a position to characterize the possible location for a flood to occur by predicting flood duration, rising rate, flood depths, velocities, and damage. For instance, Dottori [7] produced a new dataset for river flood hazard maps for the European and Mediterranean Basin regions, which can be used for a range of applications at a continental scale, from evaluating present and future river flood risk scenarios to the cost-benefit assessment of different adaptation strategies to reduce flood impacts. Diakakis [8] developed a database of flood-related fatalities to investigate qualitative changes in flood mortality in Greece between 1960 and 2010. Chatzichristaki et al. [9] analyzed the flash flood in Rhodes Island (South Greece) on 22 November 2013 through the curve number method and the soil conservation service runoff estimation method. Rehan [10] developed an innovative micro-scale approach for vulnerability and flood risk assessment with applications in property-level protection for an urban area in Teddington, London.
The quantification of a flood risk involves the characterization of different hydrological factors that need to be taken into consideration during analysis. More clarity is seen by considering more than one variable in analyses through a multivariate approach [10]. This kind of approach has been adopted in many research works [11][12][13] due to its proficiency in delivering a full screen for a flood event. For example, Karatzetzou [14] developed a unified seismic-flood-hazard model for the risk assessment of roadway networks in Greece, which supports the risk assessment of critical elements of transportation infrastructure in a multi-hazard environment. Moreover, an occurrence of a flood event is a combination of several attributes, such as peak flow, flood duration, and flood volume. However, flood factors are dependent on each other, which limits directly combining their marginal distributions. Recent studies employed copulas in modeling such multivariate hydrologic events [1,2,[10][11][12][15][16][17][18][19][20]. The copula function models the marginal distribution and multivariate dependency separately, offering flexibility in making the needful adjustment of the marginal and joint probability functions.
River Thames is the second-longest river in the UK, with a total catchment area of 12,935.77 km 2 . There have been several reported flood incidences reflected as early as the Anglo-Saxon Chronicle of 1099 [21], making it a flood-prone river. At Teddington lock (Kingston), the River Thames changes its regime from being tidal to non-tidal [5]. Recently, there have been climatic suggestions of an increase in magnitude and intensity of rainfall as well as a rise in sea level that will lead to an increase in floods. In addition, there are new developments planned that will interfere with the effectiveness of the measures already in place to combat floods at the Thames flood plain [21]. Although several mitigation measures are in place to manage the reoccurring events, the risk of flood is still a threat. Most flood defenses constructed on the Thames, such as the tidal surge barrier, walls, and embankments, are coming to an end of their design life. At the moment, there are plans in place to manage tidal flood risk for the next 100 years. However, implementations of engineering and managerial measures to combat floods require a properly structured flood risk analysis among the inputs. Owing to this, most studies conducted on the Thames have focused on the design and management of tidal floods at the Thames estuary [11,[21][22][23]. Considering the ongoing threat of fluvial floods due to climate change, there are limited studies that analyze its risk. Furthermore, there are no research works undertaken that analyze multivariate risk resulting from floods in the Thames River. Multivariate flood risk analysis aims at finding the best probability estimate of flood re-occurrence on the river Thames in a specified return period, since it compares different variables using the copula function.
Therefore, this study aims to analyze the multivariate flood risk in the River Thames under consideration of multiple flood variables. In detail, this analysis involves quantifying the joint probabilities of flood peak and three-day maximum flow through different copula functions. Moreover, the unknown parameters of copulas will be estimated using method-of-moments (MOM). The performances of the copulas are evaluated through the goodness-of-fit statistical tests in which the best copula is employed for further flood risk analyses. In this study, the risk indices under consideration would be the return periods for critical thresholds of the flood variables, which include the primary, joint, and secondary return periods. Moreover, the risk of failure will be further characterized based on interactive analysis of two flood variables aiming at revealing significant effects for persisting elevated risk levels due to impacts from multiple interactive flood variables.

Overview of the Study Area
The River Thames, as shown in Figure 1 below, is the longest river in England with a longitude and latitude of 51°35′8.07″ N and 0°36′57.87″ W, respectively. It has a length of 346 km flowing from the springs on Gloucestershire east towards the North Sea with a total catchment area of 12,935.77 km 2 . The Thames is divided into five sections with three non-tidal and two tidal sections. It is fed by fifty-one named tributaries having eighty islands and forty-five weirs and locks throughout its course [24]. Figure 1 shows the largest basin of the river Thames with a total catchment of 9942 km 2 above the Teddington weir (Kingston). The Teddington weir is located at the furthest lower west end of the catchment located at 51°24′54.0″ N and 0°18′31.9″ W. The basin receives precipitation at an average of 710 mm, evenly distributed throughout the year with the exception of autumn or the early winter maximia. During summer, 460 mm of the precipitation is lost due to evaporation, accounting for 65% of the total rainfall. The average river flow ranges from 1.5 m 3 /day to 65.8 m 3 /day from the upper side of the river to the tidal limit located at Teddington [25]. The annual runoff of the river ranges between 99 (in 1934) and 418 mm (in 1951), with an average of 250 mm [26]. The catchment has a temperate climate with oceanic influence, with temperatures of 16.4 °C and 4.6 °C during summer and winter, respectively, and a mean yearly temperature of 11 °C [24].
The basin has undergone historical changes in its topography, geology, and land use; its land cover is comprised of arable agriculture and pasture throughout the catchment, while forests and urban areas are mainly located at the lower side of the basin [25]. Despite several measures placed in the lower Thames to support resilience to climate variability, there is still future fluvial flood risk due to an increase in rainfall, especially during the winter season [22]. Flooding has always been one of the major challenges for the River Thames in the 20th/21st century, such as the 1947 and 2007 floods [27]. The flood-generating mechanisms have changed in which snowmelt (sometimes over frozen ground) was a more common mechanism in major flood events (e.g., the 1947 flood) prior to the 1960s, while rising winter temperatures have seen snowmelt decline as an aggravating factor in relation to flood risk [28] Furthermore, the sustained heavy rainfall on a saturated catchment caused the 1894 flood [28].

Methodology
This study used an inductive research approach rather than a deductive one for analysis. Such an approach was chosen with an aim to formulate a hypothesis and develop a hydrological risk indicator from the joint distributions of two flood variables, i.e., annual peak flow and the corresponding 3-day maximum flow. This study is based on the quantitative approach of data collection using existing data from the NRFA. Before further analysis, the collected data was gathered in Microsoft Excel to check for missing data in any years-no missing data were observed. The data gathered from the gauged daily flow (GDF) was then calculated further to obtain the three-day annual maxima in each year from 1 January 1883 to 1 October 2019. The peak flow and three-day maxima flow, each containing 137 samples, were presented for further analysis in the software MATLAB. These variables are dependent on each other; hence, the copula function was used to bring together their marginals to form a joint distribution. The Gumbel, Frank, and Clayton copula methods, which belong to the Archimedean family, were applied to join the marginal distributions of the variables. The best copula method was then used to model the hydrological risk based on the joint return period in AND.
Various tools and materials were used in the analysis of the flood risk for the River Thames at Kingston. The main software used for the analysis of flood risk estimation were MATLAB and Microsoft Excel. MATLAB was extensively used for the construction and selection of marginal distributions and copulas, as well as the calculation of the return period of the flood variables. Several charts were extracted for analysis in this study. Microsoft Excel was used for the identification of the annual peak flow and 3-day maximum flow to be used in MATLAB.

Flood Risk Assessment through the Copula Function
The copula function is seen as a cornerstone in the field of hydrological and meteorological risk analysis. Its uniqueness is stipulated in its ability to incorporate the joint probabilities of more than one dependent variable to anticipate the future risk of an event, as shown in Figure 2. It is applied in water resource engineering in analyzing flood frequency, precipitation, storm, and drought characteristics [29]. The joint probabilities of multiple attributes provide a wider span in analysis, thus improving the accuracy of the estimated risk [12]. A d-dimensional copula, C: [0; 1] d → [0;1] is a cumulative distribution function (CDF) of a random vector (U1,Ud) with uniform marginals (0,1). The generic copulas C(u) is shown in Equation (1) below as: where p [Uj ≤ uj] = uj for ∈ and 0 1.

Sklar's Theorem
The copulas theorem was first introduced by Sklar, who considered the existence of copulas when d-dimensional CDF and F have marginals F1, Fd such that [29]: where F , … , F are marginal distributions; if these are continuous, then the copula function C can be written as follows:   Details on copula function properties are described in detail by Zhang and Singh [29] and Salvadori and De Michele [30].

Copula Density
If the copula has a probability distribution function (PDF), then it is obtained in the usual manner as [29]: and the joint probability density function (pdf) of the two random variables is given as: Subsequently, the conditional pdf of X1, given the value of X2, can be formulated as: and the conditional pdf of X2, given the value of X1, can be expressed as:

Types of Copulas
There are five main types of copulas that are applied in different fields; the Gaussian and student-t copulas, which belong to the elliptical family, while the Gumbel, Clayton, and Frank copulas are from the Archimedean copulas family.
The Archimedean family is quite popular and attractive in hydrological flood frequency analysis. This is because of their capability of being easily generated and encapsulating a wide range of dependence structures with several desirable properties, such as symmetry and associativity [31]. Furthermore, it has no limitation on the marginal distribution for constructing the joint distribution, making it a perfect fit in the analysis of the joint probability distribution in the flood analysis [12].
The Archimedean copula is a simplified measure of dependence and is defined as [18]: Each copula family's persistence is constricted by the involvement of the flood variables (e.g., by using the Kendall′s rank correlation (t) dependence measure). To begin with, the Clayton copula and Gumbel-Hougaard copula can only model positive dependence between random variables (i.e., tθ ≥ 0), while the Frank copula can model both negative and positive dependence structures (i.e., tθ ∈ [−1, 1]). Despite the Ali-Mikhail-Haq copula being applicable for both negative and positive dependence, the dependence parameter is restricted for Kendall's tau, tθ ∈ [−0.1817, 0.3333], and thus its copula parameter θ does not cover the entire range [−1, 1] of association measures [13]. Their formulas are shown in Table 1 below.

Marginal Distribution of Flood Variables
There are several marginal distributions used for estimating the hydrological variables, such as normal distribution, lognormal, extreme value type-1 (Gumbel or EVI), and generalized extreme value distribution (GEV) [12]. Two of the distributions that will be used in this study are elaborated in Table 2 below.

Name PDF Denotes
µL: mean of logarithm of x σL: standard deviation of logarithm of x

Estimation of Parameters
Several methods are applicable for estimating copula parameters, such as method-of moments (MOM), maximum likelihood estimators, and matching moments methods [32].
The MOM is a simple technique that matches the first amount of data with the theoretical moments of the distribution. This method is achieved by solving the sample (k) equations; hence, reducing the theoretical moment of a function of the parameters and parameter estimation problem. Given the k-th population moment of a random variable Y as , and the k-th sample moment of a sample … . to be ∑ for k = 1,2,n, the method of moment of the distribution parameters … are attained by solving the set of p equations expressed as [32]: Even though the method-of-moment estimators are not generally efficient, they are asymptotically normal and unbiased

Maximum Likelihood Estimation (MLE)
Although this method is considered difficult since it involves too many parameters, especially for large values of d, it is widely applied in flood risk. It takes the probability distribution of the flood variable, then makes the comparison of the distributed dataset to acquire a suitable match [32].
The joint pdf of , … , is the same as the likelihood given as: Assuming as the parameter value at which ‫|‬ attains its maxima as the function of θ for each sample x, is the maximum likelihood estimator of the parameter , based on the sample data. Given the likelihood function as C2, the probable MLE parameter values of θ are solved as [32]:

Goodness-of-Fit Statistical Tests
The goodness-of-fit is a hypothesis test used to determine the correlation of variables and whether the collected data fit a particular distribution. They evaluate performance for both marginal and joint probability functions. The measures of the goodness-of-fit are used in hypothesis testing to test the normality of residuals as well as to check whether two samples (observed and from the marginal distribution) are drawn from identical distributions. The tests commonly used are: the chi-square, Kolmogorov-Smirnov, Anderson-Darling, root mean square error (RMSE), and Shapiro-Wilk. The estimation of empirical non-exceedance probabilities of the flood variables for the observed values is as follows [2]: where is the probability of non-exceedance; N signifies the sample size; k is the kth smallest observation in the data set.
1. The root mean square error (RMSE) The root mean square error (RMSE) is one of the most applied measures of the goodness-of-test. RMSE calculates the variations between the hypothetical model and the observed values of the probability distribution. It is known that the smaller the RMSE, the better the estimation. It is given by [11]: where Femp and C are the empirical and theoretical cumulative probability, respectively; n is the sample size. The smaller the RMSE value, the better the fitting [11].

Akaike Information Criterion (AIC)
The AIC value can be obtained as follows, established on the basis of RMSE [11]: where N is the sample size and k is the number of unknown parameters in the probability distribution The Kolmogorov-Smirnov (K-S) test is preferred since it does not make any assumptions about the distribution of data [33]. This method makes a comparison of the maximum gap of the experimental cumulative distribution function from the theoretical cumulative distribution function. The K-S test ( , n n D  ) is used to determine whether the parameters are acceptable or not and is given by [33]: where F1,n = experimental distribution; . = theoretical distribution; sup = supremum function.
To perform the goodness-of-fit test, the null hypothesis is applied; it only gets accepted when the gap between the theoretical is smaller than the expected for the given sample.

Primary and Secondary Return Period
Primary and secondary return periods can be obtained if the appropriate copula functions are specified to reflect the joint probabilistic features among flood variables. The joint return periods can be described in OR and AND, formulated as follows [31]: where µ is the mean inter-arrival time of the two consecutive flooding events. The secondary (Kendall′s) return period plays a vital role in identifying the average time between two supercritical flood events, thus analyzing the potential risk. The probability of a supercritical event decreases upon an increase in the primary return period. It is defined as [31]: where KC is the Kendall′s distribution, associated with multivariate copula function Cθ. For Archimedean copulas, KC can be expressed as [31]: where φ t is the right derivative of the copula generator function φ t .

Flood Hydrologic Risk
In terms of the engineering design of hydrologic infrastructure, the risk is the probability of downstream flooding due to uncontrolled water release from upstream facilities associated with flooding [31]. The intensity of flood risk is usually determined by the period of future re-occurrence of heavy inundation, commonly known as the return period. Under consideration of the service time (i.e., n) of a hydrological infrastructure, the risk of failure associated with the return period of a flooding event can be expressed as [31]: where R is the risk of failure, p and q are the probability of an event exceeding and not exceeding the set threshold, and T is the return period The failure probability (R) is the measure of flood risk ranging between 0 and 1, with a greater value indicating a greater risk magnitude. At most, the failure of hydrologic structures is due to high peak flow, making it an important attribute for risk analysis. Nevertheless, other flood variables, such as multi-day flood volume and duration, are also important for flood control. A risk framework that considers more than one variable may provide more support for actual flood control than the conventional analysis [31]. It is essential to characterize floods through multiple variable aspects by a process called bivariate risk analysis. Bivariate risk analysis plays a significant role in taking non-structural safety measures and developing flood mitigation strategies. Bivariate risk analysis in relation to joint return period in AND is defined as [31]:

Marginal Probability Distribution of the Flood Pair
The flow chart of the proposed study is shown in Figure 3 below. The flood variables were obtained from historical records from the National River Flow Achieve (NRFA) website. The daily flow data and annual maximum (AMAX) flow data were obtained from the NRFA website. The daily flow data were then further computed to obtain the three-day AMAX flow. The flood pairs of peak flow and 3-day AMAX were then used for the analysis of the bivariate hydrologic risk at the Thames. Lognormal and Gumbel distributions were used to model the distribution of each flood variable by using the method of moment (MOM) and maximum likelihood estimation (MLE). The obtained distributions were tested for their statistical acceptance using the Kolmogorov-Smirnov (K-S) test, while the best distribution was tested using the root mean square error (RMSE) and the Akaike information criterion (AIC) methods. The marginal distribution that best fits the flood data was then used for multivariate risk analysis through copula methods. Three copula methods were used to model the dependence of flood variables: Gumbel-Hougaard, Cook-Johnson (Clayton), and Frank copulas. The best one from the three copula models was then used to calculate the primary, joint, and secondary return periods to characterize the bivariate flood risk.     1883  1887  1891  1895  1899  1903  1907  1911  1915  1919  1923  1927  1931  1935  1939  1943  1947  1951  1955  1959  1963  1967  1971  1975  1979  1983  1987  1991  1995  1999  2003  2007  2011  2015

Statistical Characteristics of Flood Variables
Flood analysis was undertaken after a thorough characterization of flood data obtained from the NRFA. Table 3 shows some descriptive statistic values of the considered peak and 3-day annual maximum flow variables. The percentile, mean, standard deviation, skewness, and kurtosis of each flood variable were calculated. The peak and 3-day AMAX datasets had skewness coefficients of 1 and 1.05, respectively. The high degree of positive skewness from both flood variables implies that the variables were right-tailed, having a high range of values less than the mean. Both data samples had a platykurtic type of kurtosis since they had coefficients of less than three. The kurtosis values imply that both data samples were light-tailed with short distributions and tails thinner than the normal distribution. Moreover, the Mann-Kendall trend was adopted to further detect the variation trend for the historical flood data series. The results indicate that the peak flows had a z-statistic value of 1.5389 and a p-value of 0.1238. This implies that there is no trend present in the historical annual peak flows. Similarly, the historical data of the three-day maxima had a z-statistic of 0.2438 and a p-value of 0.8074, which also indicates no increasing or decreasing trend in the data series. Consequently, both the annual peak flows and three-day maxima flows were stationary, which implies that the flood risk analyses based on those historical data can still be helpful in dealing with future flooding events.
Moreover, to further detect the impacts of climate change and anthropogenic activities on the discharges of the River Thames, the variation trends for both the annual peak and 3-day AMAX series were characterized by the Mann-Kendall test in a moving window with a width of 40 years. Table S1 in the Supplementary Materials presents the statistics and corresponding p-values for the chosen flood variables. The results indicate that no significant trends were detected for the flood peaks and the associated 3-day flows, even for the recent time periods. This implied that, even though intensive anthropogenic activities and climate change have occurred in the Thames Basin, those external inferences have not led to statistically significant changes in the flood series for the River Thames.

Marginal Distribution
In this study, two parametric distribution functions-extreme value type I (Gumbel or EVl) and lognormal-were used to fit the flooding data obtained from the NRFA. These two parametric distributions were estimated through maximum likelihood estimation (MLE) and methods of moments (MOM), as shown in Table 4. Results show that there was a slight difference in values between the two-parameter estimators. For example, the values of the location parameter (µ) in peak flow of 272.8900 and 272.1989 established from the MOM and MLE estimators, respectively, showed a slight difference of 0.6911.

Comparison between Empirical and Expected Probability
The fitted marginal distributions for the two flood variables through EVI and lognormal distribution functions are illustrated in Figures 5 and 6 below. The observations denote the empirical probabilities corresponding to the distributions in Table 2. The distributions were obtained from the method-of-moment (MOM) and maximum likelihood estimation (MLE) estimators. The obtained CDFs for the marginal distributions from the two parametric models showed good agreement with the empirical distributions.

Goodness-of-Fit Statistical Test
The performances of the marginal and joint probabilities of the flood variables were achieved using the goodness-of-fit test. Three tests were employed-RMSE, AIC, and K-S tests-as expressed in Equations (13)- (15) and presented in Table 5 below. The results denote that the quantification of marginal distributions of flood peak and three-day maxima perform well from all the parametric probabilistic models. The associated p-values of the K-S tests for both flood variables were larger than 0.05 from both MOM and MLE estimates, indicating the satisfactory performance of the proposed methods. Furthermore, the RMSE and AIC methods were further adopted to identify the most appropriate methods for quantifying the marginal distributions of flood variables. As presented in Table 5, the Gumbel (EVI) method obtained using the MOM distribution performed best in modeling both flood variables, with the minimum RMSE and AIC values of 0.0188 and −1086.8 for the peak and 0.0138 and −1,171.6 for the three-day AMAX flow, respectively. The Gumbel distribution from the MOM estimator would be used to model the joint distributions of the flood variables through the copula methods.

Dependencies of Flood Variables
The dependency between peak flow and three-day AMAX was evaluated through Pearson's linear correlation (r), and one non-parametric dependence measure, Kendall′s tau. Table 6 below shows the values of Kendall′s tau and Pearson′s r among flood variables, which were 0.5286 and 0.5905, respectively. The results indicate that the structures of the two flood variables were positively correlated and highly dependent on each other since they were above zero and approached a value of one.

Flood Variables
Kendall Pearson Peak three-day AMAX 0.5286 0.5905 Three copulas from the Archimedean family, including Cook-Johnson (Clayton), Frank, and Gumbel-Hougaard, were applied to model the dependence among flood variables. The unknown parameters in the three copulas were estimated by the method-ofmoments (MOM) estimator based on inversion of Kendall's tau, as shown in Table 7. Table 7. The values of unknown parameter in copulas.

Selection of the Joint Probability Distribution
The joint distribution functions for flood peak and 3-day maximum flow, obtained through the three above-mentioned copulas, are shown in Figure 7. The RMSE and AIC (expressed by Equations (13) and (14), respectively) were used to identify the best copula by testing the goodness-of-fit of sample data to the theoretical joint distribution. Table 8 below shows the comparison of RMSE and AIC values for joint distribution obtained through copula functions for peak and three-day AMAX flow. The results showed that all three copulas performed well in modeling, due to their slight difference in values. Among the three, the Clayton copula was the best-fitted model for quantifying the joint distribution of flood variables, with the lowest value of 0.1727 and −479.2032 for RSME and AIC, respectively.

Return Periods of Flood Characteristics
The combination of more than one flood variable in flood risk analysis provides more insightful information for a better understanding of flood behavior. The return periods were calculated based on the Clayton copula to reflect the historical flooding characteristics. Table 9 below shows the primary return periods of the peak and three-day AMAX flood pairs obtained by univariate marginal distribution, as well as joint and secondary return periods. The AND, OR joint return periods, and the secondary return periods were calculated based on Equations (16)- (19). The OR return period was much less than the univariate and AND-joint return period. Both the 'AND' and 'OR' joint return periods were not far in values from the univariate return period, showing the strong correlation between the flood variables. The secondary return period was often higher in values than the joint return period in OR, but less than the return period in AND, as clearly seen in Table 9 below. Table 9. The joint and secondary return for peak-three-day AMAX flood pair. Note: T and , the joint return period in AND; T or , the joint return period in OR; T kendall , the joint return period in Kendall. Figure 8 below shows the contour plots of the joint return periods and secondary return periods for the peak and 3-day AMAX flood pairs. The curved lines show that the flood pairs were highly correlated.

Bivariate Hydrologic Risk Analysis for River Thames
The reflection of bivariate risk analysis was based on the ability to incorporate peak flow with other flooding variables. For this study, a 3-day AMAX flow was adopted. Since flood peak was the key factor for consideration in flood risk analysis, three peak scenarios were considered, which were 625.9, 689.1, and 725.9 m 3 /s with return periods of 50, 100, and 150 years, respectively. The values of peaks from the three return periods were estimated based on the marginal distribution of the flooding peak flow. Moreover, four service times for the hydraulic facilities are assumed to be 10, 30, 50, 70 years. Figure 9 shows the bivariate hydrologic risk analysis for the River Thames at Kingston under different flooding peak-3-day AMAX flow scenarios. Results show that at the same service time, the hydrologic risk would decrease with the increase in the design flow. For example, in the 10-year service time, the risk was estimated to decrease from 18.2 to 6.5% as the design of peak return period increased from 50 years to 150 years, with design flow increasing from 625.9 m 3 /s to 725.9 m 3 /s, respectively. Similarly, the bivariate hydrologic risk would increase with the rise of service time of the hydraulic facilities. Upon assuming the service time and the designated standard, the bivariate hydrologic risk values would remain constant and start decreasing as the 3day AMAX got larger than the 700 m 3 /s, as shown in Table 10 below. This is due to a high correlation between flood variables. This raises the concern that a flood with a high peak will be accompanied by a large amount of a 3-day AMAX flow. For example, as shown in Figure 9 above and Table 10 below, if the service time of the River Thames was estimated to be 10 years, then the risk of failure of the three designed flows at 400 m 3 /s 3-day annual maximum flow would be 18.3, 9.6, and 6.5%. Moreover, if the service time is assumed to be 10 years at a peak of 625.90 m 3 /s, the risk will be 18.3% from 400 m 3 /s and begin to decrease at 700 m 3 /s. This applies to all service times and peaks. Thus, for this study, the hydraulic risk indicator may change when the 3-day AMAX flow exceeds 700 m 3 /s at different peak scenarios and service times. However, increasing the design standards of the hydraulic facilities would decrease the hydrological risk.   The implication for the bivariate risk of flooding peak flow and 3-day AMAX flow is to provide decision support for hydrologic facility design and the establishment of flooding diversion areas. The actual flooding control practices and the excess water of floods can be redirected temporarily to holding ponds or other bodies of water with a lower risk or impact to flooding. In flood diversion practice, the bivariate risk for flood peak and 3day AMAX flow would be an important reference for the design of flooding diversion areas. For example, as shown in Figure 9 and Table 10, for the river levee with a designed flow of 626 m 3 /s and 10-year service period, the flooding risk value would be 18.3%, 18.3%, 18.3%, 18.1, 17.4, 15.8% with a 3-day AMAX being 400, 500, 600, 700, 800, and 900 m 3 /s, respectively. These risk values suggest that the flood diversion area should be designed based on a 3-day flow of 700 m 3 /s, such as the bivariate risk, which would not decrease significantly for the 3-day flow less than 700 m 3 /s.

Summary of Current Study
The novelty of this study was to analyze the flood risks for the River Thames within a multivariate context and further reveal the failure probabilities under consideration of the joint return period with different service time scenarios. Initially, a framework for bivariate hydrologic risk analysis for peak flow and 3-day annual maximum flow was prepared. This analysis was done based on historical data from 1883 to 2019, obtained from the National River flow regime website (NRFA). The marginal distribution of the obtained flood variables was conducted using Gumbel and lognormal methods by using two estimation methods-method of moment (MOM) and maximum likelihood estimation (MLE). The flood variables were hypothetically tested for statistical satisfaction. The Kolmogorov-Smirnov test was adopted to check for the acceptability of the variables, while the root mean square error (RMSE) was used to decide the best choice among the four options. The best fit from both flood variables was the Gumbel method obtained from MOM. The distributions of the two flood variables were joined using different copula methods-the Clayton, Frank, and Gumbel-Hougaard methods. The estimation of the unknown parameter of copula was conducted using the MOM method. The obtained copulas were then tested for statistical correlation using the Kolmogorov-Smirnov test. The results showed that all the methods were statistically correlated. The root mean square error (RMSE) and Akaike information criterion (AIC) methods were then used for the identification of the best copula among the three copula methods. Based on the selected copula, the primary, joint, and secondary return periods were developed. The bivariate hydrologic risk (i.e., failure probability) of the peak 3-day AMAX flow was then defined from the derived joint return periods

Main Findings and Recommendation
The Clayton copula was used to model the joint return period of the two flood variables to characterize the bivariate hydrological risk. Three return periods were estimated: primary, joint, and secondary return periods. The essence of calculating the secondary return period was to show the correspondence of the supercritical flood event. The secondary return period was located between T OR and T AND , whilst T OR was the shortest return period, less than the return period of individual flood variables, and T AND was the longest one among the joint and secondary return periods. Therefore, the hydrological risk analysis of the flood pair was analyzed based on the T AND joint return period. Results showed that during a flood event, the bivariate hydrologic risk for the flood pair would have a constant 3-day flow up to 700 m 3 /s. However, there will be a decrease in flood risk as the 3-day flow rose above the stated value. Such bivariate values could be useful in the design of various engineering facilities that are to be placed on the Thames for flood control. Analysis of this study outlines the need to consider more variables apart from peak flow during infrastructure design to produce a more standardized design of the hydrologic facilities that can withstand extreme conditions of floods.

Suggestions for Future Research
This study only considered one flood pair, i.e., peak flow and 3-day AMAX flow, for risk characterization. During heavy inundation, peak was the major flooding factor for the failure of hydraulic facilities, making it an important factor for consideration of flood risk analysis. However, other factors have a role to play in the control and alleviation of river flooding. Such factors include flooding volume, duration, and annual maximum days flow. Analyzing flooding duration assists in the understanding of flood pressure, enabling its control while results from the volume factor airs out information on diversion of floods. Thus, further studies are required that consider other flood pairs (peak-volume, peak-duration, and peak-5-day AMAX) in order to come up with a robust multivariate flood analysis for the Thames.
Moreover, the developed model for multivariate risk analysis is based on stationary data series for both flood peaks and 3-day AMAX flows, since no significant variation trends are detected based on the historical records. Nevertheless, these stationary features may not be valid in the future due to the increasing effects of climate change. It is widely accepted that assuming stationarity in the hydrologic variables used for long-lived engineering designs is no longer tenable [34,35]. For the Thames Basin, it may hardly introduce covariates into the estimated parameters in the established copula model based on stationary historical records. One way to address the non-stationarity that may occur in future flood series is to generate flood series through reliable hydrological models based on climate projections as done by Bell et al. [22]. The copula model can then be developed based on the projected flood series, in which covariates such as time can be used to reflect the nonstationary features in the flood series. Such a model can be applied for supporting flood resilience under climate change. Nevertheless, the results from the current study are still useful for flood control even though these results are based on stationary data series. These results can be considered as the reference baselines for future flood protection under climate change.