Investigation of Attributes for Identifying Homogeneous Flood Regions for Regional Flood Frequency Analysis in Canada

Abstract: The identification of homogeneous flood regions is essential for regional flood frequency analysis. Regardless of the type of regionalization framework considered (e.g., region of influence or hierarchical clustering), selecting flood-related attributes that reflect flood generating mechanisms is required to discriminate flood regimes among catchments. To understand how different attributes perform across Canada for identifying homogeneous regions, this study examines five distinctive attributes (i.e., geographical proximity, flood seasonality, physiographic variables, monthly precipitation pattern, and monthly temperature pattern) for their ability to identify homogeneous regions at 186 gauging sites with their annual maximum flow data. We propose a novel region revision procedure to complement the well-known region of influence and L-Moments techniques that automates the identification of homogeneous regions across continental domains. Results are presented spatially for Canada to assess patterning of homogeneous regions. Memberships of two selected regions are investigated to provide insight into membership characteristics. Sites in eastern Canada are highly likely to form homogeneous flood regions, while sites in the western prairie and mountainous regions are not. Overall, it is revealed that the success of identifying homogeneous regions depends on local hydrological complexities, on whether the considered attribute(s) reflect primary flooding mechanism(s), and on whether catchment sites are clustered in a small geographic region. Formation of effective pooling groups affords the extension of record lengths across the Canadian domain (where gauges typically have < 50 years of record), facilitating more comprehensive analysis of the higher return period floods needed for climate change assessment.


Introduction
Designing future infrastructure for flood resiliency is crucial for emerging design standards. Flood frequency analysis (FFA) is often used to estimate flood quantiles for river infrastructure design to prevent structural failure or inadequacy during extreme flood events. Given its importance, a growing number of countries have carried out nation-wide studies of advanced FFA methods to improve design flood estimation [1][2][3]. Outcomes from these studies can be generalized into published guidelines, which benefit domestic end-users in terms of simplicity and consistency and reduce the element of subjectivity within the design process [4,5].
In Canada, flooding has been recognized as the most frequent and costliest of natural disasters over the past 100 years, claiming considerable economic and social losses for cities, urban clusters, considered in their quantile regression model as flood influential attributes: latitude and longitude of gauging stations, CV, QDA, mean annual precipitation, and basin slope.
The geographic extent of Canada means that water resources engineering practice is generally governed at the provincial level and within provincial boundaries, as opposed to federal jurisdiction, which is more common in other countries [35,36]. As a result, methods of RFFA have been inconsistently applied among government agencies, academic communities, and industrial partners [8,11,37]. To tackle this problem, the FloodNet Strategic Network project, funded by the Natural Sciences and Engineering Research Council of Canada (NSERC), united researchers across Canada to develop nation-wide flood forecasting and water resources management strategies. An important mandate was to research standardized FFA methods and techniques tailored for Canadian hydrological environments [38]. Within this network, Sandink et al. and Zahmatkesh et al. [23,24] examined FFA using a quantile regression model that considered ungauged catchments across Canada. Zhang et al. [39] demonstrated that the generalized extreme value (GEV) distribution fits Canadian annual maximum flow data considerably better than other well-known distributions, including the generalized logistic, Pearson type III, and log Pearson type III distributions. Others [40][41][42][43][44] focused on developing regionalization techniques using peaks-over-threshold (POT) flood data, which is advantageous for gauging sites where annual maximum flood records are limited. Little attention has been paid to the examination of different flood-related attributes and their characteristics for identifying homogeneous flood regions.
Here, we consider five distinct categories of frequently used attributes (i.e., geographical proximity, physiographic variables, flood seasonality, monthly precipitation pattern, and monthly temperature pattern) and investigate their relevance in identifying homogeneous flood regions for RFFA applications on a continental, Canada-wide scale. Their abilities to identify homogeneous regions are investigated across major hydrological sub-regions of Canada. Regional hydrological characteristics are used to interpret the homogeneous region identification results. To increase the efficiency of our analysis and minimize the element of subjectivity, a novel automated regionalization process that combines the well-known ROI [45,46] approach with a proposed automatic region revision algorithm (ARRA) is introduced and demonstrated for applicability to continental domains. Memberships of two regions are selected as a case study to provide insight into membership characteristics. Findings of this study are deemed to be an important contribution toward the Canadian statistical flood estimation guideline under the FloodNet project.

Rationale for Attribute Selection
Geographical proximity is selected based on the rationale that catchments closer to each other generally encompass similar hydrological and physiographical characteristics; catchments separated by smaller geographical distances are therefore likely to exhibit a similar flood regime and to form a homogeneous region. The presence of large spatial variability in flood characteristics may call the use of geographical proximity into question. Directly using physiographic variables that exert key influence on the dominant flood generating mechanisms therefore provides another way to group sites with similar flood behavior. Geographical proximity and physiographic variables are the most common flood-related attributes for catchment regionalization and thus are included in this study.
As previously noted, flood seasonality has the advantage of convenience in attribute extraction. In addition, it has been previously applied and was found to be beneficial for flood studies in Canada for catchment classification [42,47,48] and in the formation of homogeneous regions [49,50].
Monthly precipitation and temperature patterns consider monthly average precipitation and temperature for the location of the catchment site. These values are provided by Environment and Climate Change Canada (ECCC) [51], computed for each catchment site in this study using historical monthly climate grids for North America [52,53]. Flood generating mechanisms in Canada are generally dominated by rainfall (pluvial), snowmelt (nival), or rain-on-snow (mixed) events [42,54]. The monthly patterning of precipitation and temperature is considered to contain key information concerning the dominant flood generation process. For example, precipitation accumulation during winter months dominates the magnitude of the spring melt event. Large precipitation values in summer and fall suggest rainfall-driven peak floods. Temperature values in the melt season influence the timing and the magnitude of spring peak floods. Therefore, we explore these attributes given their potential usefulness in mapping regional flood characteristics.

Datasets
Annual maximum flood samples are taken from the Canadian Reference Hydrometric Basin Network (RHBN). Developed by Water Survey of Canada, the RHBN comprised 223 gauging sites at the start of this study and is only a small subset of the Canadian hydrometric gauging network (6379 gauging sites with flow records in total) [10]. RHBN sites are identified as near-pristine catchments with high-quality flow measurements and an absence of anthropogenic control [55,56]. These merits make their flood data ideal for RFFA. Of the 223 RHBN sites, only 186 have corresponding physiographic variables available from ECCC [51]. Therefore, we consider a total of 186 gauging sites in this study, generating a total of 186 annual maximum flood samples. Although RHBN stations generally have flow records that are greater than 20 years in length, some sites are seasonally operated, which means that an annual maximum flood cannot be derived for every calendar year. The average station record length among our samples is 48 years, with a maximum of 103 years and a minimum of eight years. More than 80% of samples have station record lengths greater than 30 years.
The geographical distribution of the 186 sites is presented in Figure 1, with the corresponding record length distribution presented in Figure 2 (the x-axis corresponds to longitude, from west to east, noted by province or territory). Figures 1 and 2 indicate that most study sites in British Columbia and the Atlantic provinces have relatively longer records compared to other regions. The prairie provinces, particularly Saskatchewan and Manitoba, have relatively fewer stations and relatively shorter record lengths. The three northern territories have the fewest gauging sites and an average record length of 40 years.

Geographical Proximity
The latitude and the longitude of the gauging stations are used to calculate the geographical distance between two catchments. The similarity distance between catchments m and n is defined as:
$$D_{mn} = \sqrt{(Lat_m - Lat_n)^2 + (Long_m - Long_n)^2} \quad (1)$$
where $Lat_m$ and $Long_m$ are the latitude and the longitude coordinates for the gauging site of catchment m. We use geographical coordinates in the above equation, which can cause minor discrepancies because the ground distance spanned by one degree of longitude shrinks toward the polar region.
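As a minimal sketch of the discrepancy noted above, the snippet below compares a Euclidean distance computed directly on coordinates in degrees with a great-circle (haversine) distance. The haversine function, the Earth radius constant, and the two site pairs are illustrative assumptions and are not part of the study's method.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, used only for the comparison


def degree_distance(lat_m, lon_m, lat_n, lon_n):
    """Euclidean distance in latitude/longitude degree space."""
    return math.hypot(lat_m - lat_n, lon_m - lon_n)


def haversine_km(lat_m, lon_m, lat_n, lon_n):
    """Great-circle distance in km (not used in the study; for comparison only)."""
    phi_m, phi_n = math.radians(lat_m), math.radians(lat_n)
    d_phi = phi_n - phi_m
    d_lam = math.radians(lon_n - lon_m)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi_m) * math.cos(phi_n) * math.sin(d_lam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Two hypothetical site pairs, each one degree of longitude apart.
south = (50.0, -100.0, 50.0, -99.0)
north = (65.0, -100.0, 65.0, -99.0)

print(degree_distance(*south), degree_distance(*north))  # identical in degree space
print(haversine_km(*south), haversine_km(*north))        # ground distance is smaller in the north
```

Both pairs are equally distant in degree space, while the northern pair is closer on the ground, which is the source of the minor discrepancy.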

Physiographic Variables
The selection of physiographic variables is based on the stepwise regression method, which has been used to select flood-related attributes in previous studies [57][58][59]. The stepwise regression method is an automatic procedure used to select explanatory variables based on the development of a multilinear regression model. Candidate variables are iteratively added and removed based on a statistical t-test until the predictive power of the regression model is optimized. In this study, 66 sets of different physiographic variables at each site are obtained from ECCC [51]. Because different variables have different units and scales, variables are normalized by their standard deviation prior to the regression. The dependent variable for the stepwise regression is the median value of each flood sample, which corresponds to a 2-year return period flood. The median value is considered a robust indicator of flood characteristics and is meant to reduce the impact of outlier flood values [4,34]. The stepwise method recognizes the following variables as sufficiently explanatory of flood characteristics: (1) catchment area, (2) waterbody area in the catchment, (3) standard deviation of elevation across the catchment, (4) average annual air temperature for the catchment, and (5) average annual precipitation for the catchment. Variables (2) and (3) are derived from the ECCC National Hydrology Network database. Variables (4) and (5) are computed based on 10 km historical gridded climate data representing a 30-year period of record from 1981 to 2010. Data provided by ECCC are computed using historical monthly climate grids for North America [52,53].
The similarity distance between catchments m and n is calculated based on a weighted Euclidean distance formula defined as:
$$D_{mn} = \sqrt{\sum_{j=1}^{k} w_j \left(x_{mj} - x_{nj}\right)^2} \quad (2)$$
where k is the number of physiographic variables, $w_j$ is the weighting factor for physiographic variable j, and $x_{mj}$ is the standardized value of physiographic variable j for catchment m. $w_j$ controls the relative importance of variable j. Here, a weight of 0.4 was assigned to the basin area and weights of 0.15 to each of the remaining four variables. These weights correspond to the variable coefficients in the computed stepwise model, rounded to the nearest 0.05.
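The weighted distance described above can be sketched as follows. The weights are those reported in the text; the attribute values, the function names, and the use of the sample standard deviation (ddof=1) for standardization are assumptions made for illustration only.

```python
import numpy as np

# Weights reported in the text: 0.4 for catchment area, 0.15 for each other variable.
WEIGHTS = np.array([0.40, 0.15, 0.15, 0.15, 0.15])


def standardize(raw):
    """Scale each column (variable) by its standard deviation across sites.

    Using the sample standard deviation (ddof=1) is an assumption.
    """
    raw = np.asarray(raw, dtype=float)
    return raw / raw.std(axis=0, ddof=1)


def physiographic_distance(x_m, x_n, weights=WEIGHTS):
    """Weighted Euclidean distance between two standardized attribute vectors."""
    diff = np.asarray(x_m, dtype=float) - np.asarray(x_n, dtype=float)
    return float(np.sqrt(np.sum(weights * diff ** 2)))


# Hypothetical attribute values for three sites. Columns: area, waterbody area,
# elevation standard deviation, mean annual temperature, mean annual precipitation.
raw = np.array([
    [1200.0, 35.0, 150.0, 2.1, 900.0],
    [300.0, 10.0, 60.0, 4.5, 1100.0],
    [5000.0, 400.0, 320.0, -1.0, 600.0],
])
x = standardize(raw)
print(physiographic_distance(x[0], x[1]), physiographic_distance(x[0], x[2]))
```

Because the weights sum to one, two sites that differ by exactly one standard deviation in every variable are separated by a distance of one.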

Flood Seasonality
Similarity between catchments is measured using a unit polar coordinate system. A catchment is presented as a point in the polar coordinate space and can be positioned by angular and radial values. The angular value reflects the average date of flood occurrence, whereas the radial value reflects the variability in the occurrence date of floods. A larger radial value indicates smaller variability in occurrence date; a radial value of one indicates no variability in occurrence date, implying that all floods occur on the same day of each year.
Based on Burn [49], for a single flooding event i, the date of occurrence of the event is transformed from Julian day to an angular value, where Julian day one is 1 January and day 365 is 31 December, using:
$$\theta_i = (\text{Julian day})_i \, \frac{2\pi}{365} \quad (3)$$
For a given catchment flood sample composed of k flooding events, its Cartesian coordinates $\bar{x}$ and $\bar{y}$ in the unit circle are calculated as:
$$\bar{x} = \frac{1}{k}\sum_{i=1}^{k}\cos\theta_i \quad (4)$$
$$\bar{y} = \frac{1}{k}\sum_{i=1}^{k}\sin\theta_i \quad (5)$$
Therefore, the similarity distance between catchments m and n is calculated as:
$$D_{mn} = \sqrt{(\bar{x}_m - \bar{x}_n)^2 + (\bar{y}_m - \bar{y}_n)^2} \quad (6)$$
Following the Durocher et al. [42] classification, sites used in this study are further classified into nival, pluvial, and mixed regimes based on their flood seasonality statistics and localized geographic and climatic environments (i.e., classifications noted on Figures 1 and 3, respectively). Nival sites are subject to regular flood occurrence dates in the spring snowmelt period. These sites are generally located in cold regions of Canada such as the continental interior, mountainous British Columbia, and northern Canada. A smaller number of sites are exclusively pluvial-driven, with average annual flood occurrence from November to February. These sites are in the warmest regions of Canada, which are coastal southwest British Columbia and Vancouver Island. A substantial number of study sites are classified as mixed response. These sites experience warm to mild winters and are predominantly located in southeastern Ontario, southern Québec, and the Atlantic provinces. Peak floods for these sites can result from spring snowmelt, rain-on-snow, or single heavy rainfall events. Their wide range of regularity in the flood seasonality space provides an effective indication of annual peak floods driven by multiple flood responses.
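The seasonality measure can be illustrated with a short sketch. It assumes the standard directional-statistics form, in which a catchment's position is the mean of the unit vectors of its flood dates; the function names and the sample flood dates are hypothetical.

```python
import math


def seasonality_point(julian_days):
    """Mean Cartesian coordinates (x, y) of flood dates placed on the unit circle."""
    thetas = [day * 2 * math.pi / 365 for day in julian_days]
    k = len(thetas)
    x = sum(math.cos(t) for t in thetas) / k
    y = sum(math.sin(t) for t in thetas) / k
    return x, y


def regularity(point):
    """Radial value: 1 means every flood occurs on the same day of the year."""
    return math.hypot(point[0], point[1])


def mean_flood_day(point):
    """Angular value converted back to a day of the year."""
    angle = math.atan2(point[1], point[0]) % (2 * math.pi)
    return angle * 365 / (2 * math.pi)


def seasonality_distance(point_m, point_n):
    """Euclidean distance between two catchments in the seasonality space."""
    return math.hypot(point_m[0] - point_n[0], point_m[1] - point_n[1])


# Hypothetical samples: a regular snowmelt-type site and an irregular mixed-type site.
nival = seasonality_point([118, 122, 125, 120, 130])
mixed = seasonality_point([15, 95, 110, 300, 340])
print(regularity(nival), regularity(mixed), seasonality_distance(nival, mixed))
```

Working with Cartesian coordinates rather than raw day numbers keeps late-December and early-January floods close together, which a simple difference of Julian days would not.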

Monthly Precipitation Pattern
Similarity measures based on precipitation patterns use the values of monthly average precipitation from January to December for each catchment site. The correlation coefficient is selected to assess the similarity between two catchments. In contrast to Euclidean distance, the correlation coefficient is considered more effective when characterizing the pattern of two datasets, as it measures the degree of linearity between the datasets, while the Euclidean distance measures the distance between two points in a metric space. The correlation coefficient between catchments n and m is described as:
$$r_{nm} = \frac{\sum_{i=1}^{12}\left(x_{ni} - \bar{x}_n\right)\left(x_{mi} - \bar{x}_m\right)}{\sqrt{\sum_{i=1}^{12}\left(x_{ni} - \bar{x}_n\right)^2 \sum_{i=1}^{12}\left(x_{mi} - \bar{x}_m\right)^2}} \quad (7)$$
where $x_{ni}$ is the monthly average precipitation value for month i of catchment n, and $\bar{x}_n$ is the average of the 12 monthly average precipitation values for catchment n, expressed as:
$$\bar{x}_n = \frac{1}{12}\sum_{i=1}^{12} x_{ni} \quad (8)$$
$r_{nm}$ ranges from −1 to 1, with values exactly equal to 1 (−1) indicating a perfect positive (negative) linear relationship between two datasets, and values exactly equal to 0 indicating no linear relationship. For catchments m and n, $r_{nm}$ closer to 1 indicates a stronger positive linear relationship; therefore, the similarity distance based on the correlation coefficient is computed as:
$$D_{nm} = 1 - r_{nm} \quad (9)$$
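A minimal sketch of the correlation-based distance follows, assuming the Pearson correlation coefficient computed over the 12 monthly averages and a distance of one minus that coefficient. The monthly values and the function name are hypothetical.

```python
import numpy as np


def pattern_distance(x_n, x_m):
    """Similarity distance 1 - r between two 12-month climate patterns."""
    x_n = np.asarray(x_n, dtype=float)
    x_m = np.asarray(x_m, dtype=float)
    dev_n = x_n - x_n.mean()
    dev_m = x_m - x_m.mean()
    r = np.sum(dev_n * dev_m) / np.sqrt(np.sum(dev_n ** 2) * np.sum(dev_m ** 2))
    return float(1.0 - r)


# Hypothetical monthly average precipitation (mm), January to December.
site_a = np.array([30, 25, 28, 35, 50, 70, 85, 80, 60, 45, 40, 32], dtype=float)
site_b = 2.0 * site_a + 5.0   # same seasonal shape, wetter overall
site_c = site_a[::-1]         # a differently shaped seasonal cycle

print(pattern_distance(site_a, site_b))  # close to 0: identical pattern
print(pattern_distance(site_a, site_c))  # larger: different pattern
```

The example shows why correlation suits pattern matching: site_b is much wetter than site_a, yet its distance is near zero because the seasonal shape is the same.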

Monthly Temperature Pattern
In common with the similarity measure for precipitation patterning, temperature patterning is computed from monthly average temperature for each catchment. Monthly average temperature data for catchment n and m are then input into Equations (7) and (8); Equation (9) is used to calculate the similarity distance between the two catchments.

Region of Influence Approach
The ROI approach [45,46] is used given its flexibility in identifying a flood region for each study site. The ROI approach treats each target site as having its own unique flood region. Other sites are added to the region in order from the shortest similarity distance to the greatest. Determining the number of sites in a region requires a trade-off between the size of the region and the quality of the region. A larger region benefits flood estimation at larger return periods (i.e., it generates longer records); however, the quality of the region (i.e., homogeneity in flood characteristics) generally decreases as more sites are added. For RFFA, the 5T rule for region size (i.e., total station-years of record of the region) states that the total record length of a region should be at least five times the return period of interest (T), and it has been widely accepted as a guideline for an optimal trade-off [4,12]. The 5T rule was adopted in this study.
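The pooling step can be sketched as below, assuming the similarity distances to the target site have already been computed for one attribute. The function name, site labels, distances, and record lengths are hypothetical.

```python
def build_roi_region(distances, record_lengths, return_period):
    """Pool sites in order of increasing similarity distance until 5T is reached.

    distances: dict mapping site id -> similarity distance to the target site
    record_lengths: dict mapping site id -> years of record
    Returns the pooled site ids and the total station-years of record.
    """
    required = 5 * return_period  # 5T rule
    region, total = [], 0
    for site in sorted(distances, key=distances.get):
        if total >= required:
            break
        region.append(site)
        total += record_lengths[site]
    # If the candidate pool is exhausted first, total stays below the 5T target.
    return region, total


# Hypothetical candidate sites for one target site.
distances = {"A": 0.10, "B": 0.20, "C": 0.30, "D": 0.40}
record_lengths = {"A": 40, "B": 30, "C": 50, "D": 60}

region, total = build_roi_region(distances, record_lengths, return_period=20)
print(region, total)  # ['A', 'B', 'C'] 120
```

For a 20-year return period the target is 100 station-years, so pooling stops after the third site is added.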

Generalized Extreme Value (GEV) Distribution and L-Moment Estimation Method
The GEV distribution is used to estimate flood quantiles. The GEV distribution has been determined to be more robust for fitting annual maximum flow at RHBN stations than other commonly used three parameter distributions [39]. The index flood L-Moment parameter estimation method is recommended by many studies for its simplicity, robustness, nearly unbiased estimation, and convenient integration with the GEV and the L-Moment homogeneity test [16,60,61].
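As a rough illustration of L-Moment estimation for the GEV, the sketch below computes sample L-Moments from probability-weighted moments and applies Hosking's widely used approximation for the GEV shape parameter. It fits a single sample only and omits the index flood scaling of the regional procedure; the function names and the flow values are assumptions for illustration.

```python
import math


def sample_l_moments(flows):
    """Return (l1, l2, t3) from unbiased probability-weighted moments."""
    x = sorted(flows)
    n = len(x)
    if n < 3:
        raise ValueError("at least three observations are needed")
    b0 = sum(x) / n
    # j is the zero-based rank, so j / (n - 1) equals (rank - 1) / (n - 1).
    b1 = sum(j / (n - 1) * x[j] for j in range(n)) / n
    b2 = sum(j * (j - 1) / ((n - 1) * (n - 2)) * x[j] for j in range(n)) / n
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    return l1, l2, l3 / l2


def fit_gev(l1, l2, t3):
    """GEV location, scale, and shape (Hosking's sign convention for the shape)."""
    c = 2 / (3 + t3) - math.log(2) / math.log(3)
    k = 7.8590 * c + 2.9554 * c ** 2
    if abs(k) < 1e-6:
        raise ValueError("shape is near zero; the Gumbel limit is not handled here")
    g = math.gamma(1 + k)
    alpha = l2 * k / ((1 - 2 ** (-k)) * g)
    xi = l1 - alpha * (1 - g) / k
    return xi, alpha, k


def gev_quantile(xi, alpha, k, return_period):
    """Flood quantile with non-exceedance probability 1 - 1/T."""
    f = 1 - 1 / return_period
    return xi + alpha * (1 - (-math.log(f)) ** k) / k


# Hypothetical annual maximum flows (m^3/s).
flows = [210, 340, 180, 450, 290, 520, 260, 310, 390, 230, 610, 275]
params = fit_gev(*sample_l_moments(flows))
print(gev_quantile(*params, 20), gev_quantile(*params, 100))
```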

L-Moment Homogeneity Test
The homogeneity test aims to verify whether sites in the flood region exhibit similar flood characteristics at an acceptable level of statistical significance. Since L-Moments are considered unbiased statistics of flood data, the L-Moment homogeneity test has received much attention in RFFA applications [4,12,19,30,49,62]. Based on Hosking and Wallis [16], the first step of the homogeneity test is to determine the regional L-Moment ratios $t^R$, $t_3^R$, and $t_4^R$, denoted as the regional L-CV, L-skewness, and L-kurtosis, respectively. For a region comprising N sites, the regional L-Moment ratio $t^R$ (and similarly $t_3^R$ and $t_4^R$) is calculated as:
$$t^R = \frac{\sum_{i=1}^{N} n_i \, t^{(i)}}{\sum_{i=1}^{N} n_i} \quad (10)$$
where $t^{(i)}$ is the at-site L-Moment ratio for site i, and $n_i$ is the record length for site i. Dispersion can then be expressed as:
$$V = \sqrt{\frac{\sum_{i=1}^{N} n_i \left(t^{(i)} - t^R\right)^2}{\sum_{i=1}^{N} n_i}} \quad (11)$$
To assess whether the dispersion, V, is within the limit of region homogeneity, two variables are required: $\mu_V$, the expected mean of V; and $\sigma_V$, the expected standard deviation of V. $\mu_V$ and $\sigma_V$ are estimated through many reproductions of the original region. To do this, a Kappa distribution fitted to the L-Moment ratios 1, $t^R$, $t_3^R$, and $t_4^R$ is used to simulate $N_{sim}$ reproductions of the original region ($N_{sim}$ = 1000 in this study). Each reproduced region has the same region size (N sites in a region) and the same record length, $n_i$ for site i, as the original region.
For each reproduced region, the dispersion, V, is calculated using Equations (10) and (11). Based on the $N_{sim}$ values of V, the expected mean $\mu_V$ and the expected standard deviation $\sigma_V$ can be obtained.
Lastly, the homogeneity statistic is defined as:
$$H = \frac{V - \mu_V}{\sigma_V} \quad (12)$$
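The arithmetic of the test can be sketched as follows, assuming the at-site L-CV values are already available and that the dispersion values of the simulated regions are supplied by a separate Kappa simulation, which is not shown. The function names, the input values, and the use of the sample standard deviation are illustrative assumptions.

```python
import math
import statistics


def regional_ratio(t_site, n_site):
    """Record-length-weighted regional average of an at-site L-Moment ratio."""
    return sum(n * t for n, t in zip(n_site, t_site)) / sum(n_site)


def dispersion(t_site, n_site):
    """Record-length-weighted standard deviation of the at-site L-CV values."""
    t_r = regional_ratio(t_site, n_site)
    weighted = sum(n * (t - t_r) ** 2 for n, t in zip(n_site, t_site))
    return math.sqrt(weighted / sum(n_site))


def heterogeneity(v_observed, v_simulated):
    """H statistic: observed dispersion standardized by the simulated dispersions."""
    mu_v = statistics.mean(v_simulated)
    sigma_v = statistics.stdev(v_simulated)  # sample standard deviation (assumption)
    return (v_observed - mu_v) / sigma_v


# Hypothetical two-site region and a tiny stand-in for the simulated V values.
t_site = [0.2, 0.3]
n_site = [50, 50]
v_obs = dispersion(t_site, n_site)
h = heterogeneity(v_obs, [0.03, 0.04, 0.05])
print(v_obs, h)
```

In practice the simulated list would hold one V value per reproduced region rather than the three placeholder values used here.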

Automatic Region Revision Algorithm (ARRA)
For a given target site and any attribute, the initial flood region formed by the ROI approach often still tests heterogeneous. Many studies have reported this situation, and, subsequently, a region revision process is needed to reduce region heterogeneity by editing the initial group membership [12,28,29,44]. The revision process includes steps such as adding, deleting, and replacing site(s) within the initially formed region, with homogeneity tested after each progressive change. In past studies, this has largely been carried out through a heuristic process, meaning there is no set procedure regarding the order of steps or the methodology of revision [12,17,29,63]. For our large-scale study, however, it is impractical to proceed via a heuristic process for each region. Therefore, an automatic region revision algorithm (ARRA) was designed with the intent of reducing region heterogeneity through an automatic and non-subjective modification of the region membership.
A heterogeneous region is input into the ARRA, and a revised region with improved homogeneity is output from the first iteration. If the output region does not meet the homogeneity criterion (i.e., H < 1), the ARRA can be reapplied to the region to further reduce heterogeneity. Each time the region membership is modified, the homogeneity of the membership increases, but the attribute similarity decreases because the newly added site(s) have larger attribute distance(s) than the replaced site. As a region should be formed primarily based on attribute similarity, the number of ARRA iterations needs to be constrained to ensure an appropriate trade-off between region homogeneity and attribute similarity. We perform a sensitivity analysis on the number of ARRA iterations used to revise 186 randomly formed initial pooling regions by counting the number of homogeneous regions produced after each ARRA iteration. From this sensitivity analysis, we determine that a maximum of five iterations of the ARRA should be applied (see Section 3.1. ARRA performance). If a region still tests heterogeneous after five iterations of the ARRA, it is regarded as unable to form a homogeneous region. Figure 4 illustrates the methodological procedure followed by the ARRA. The L-Moment homogeneity test is embedded in the ARRA and used to identify sites that should be removed and new sites that should be added to achieve the greatest improvement in region homogeneity. The order of searching for a newly added site depends on attribute similarity, such that sites with shorter attribute distances are tested first. The process terminates once an improved region is formed and the 5T region size rule is satisfied.

Flood Region Identification Process
For each of the five considered attributes, the process of identifying flood regions is demonstrated below. First is the identification of the initial flood region for a study site, which uses the ROI approach to group regions based on attribute similarity alone. To be specific, catchment sites having the shortest attribute distance to the target site are pooled into the initial region. The region size is set to 500 station years of record, which allows for accurate estimation up to the 100-year flood according to the 5T rule. Next, the homogeneity of the initial region is assessed using the L-Moment homogeneity test. If the initial region is heterogeneous, the ARRA is applied to revise region membership, up to a maximum of five iterations. The homogeneity of the revised region is re-evaluated using the homogeneity test. This process was repeated for all 186 study sites, and the total number of homogeneous regions identified for each attribute was determined.
Annual maximum flows for all region members are typically used for the homogeneity test and the subsequent flood quantile estimation. In this study, however, we purposely exclude annual maximum flows at the target site to afford more robust and rigorous evaluations of homogeneity and flood quantiles (i.e., a leave-one-out analysis), and therefore, our methodology can later be applied for ungauged regional analyses.
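The overall identification loop for one study site can be sketched as below. The homogeneity test and a single ARRA iteration are passed in as hypothetical callables, since their internals are described elsewhere; the stub functions and site labels exist only to make the sketch runnable and do not represent the actual ARRA.

```python
def identify_homogeneous_region(initial_region, is_homogeneous, revise_once,
                                max_iterations=5):
    """Test the initial ROI region, then revise it up to max_iterations times.

    Returns (region, iterations_used), or (None, max_iterations) when the site
    is regarded as unable to form a homogeneous region.
    """
    region = list(initial_region)
    if is_homogeneous(region):
        return region, 0
    for iteration in range(1, max_iterations + 1):
        region = revise_once(region)
        if is_homogeneous(region):
            return region, iteration
    return None, max_iterations


# Stand-in callables for illustration only (not the real test or the real ARRA).
def stub_is_homogeneous(region):
    return "discordant" not in region


def stub_revise_once(region):
    revised = list(region)
    revised[revised.index("discordant")] = "replacement"
    return revised


result, used = identify_homogeneous_region(
    ["site_1", "discordant", "discordant"], stub_is_homogeneous, stub_revise_once
)
print(result, used)  # ['site_1', 'replacement', 'replacement'] 2
```

Keeping the test and the revision step as parameters means the same loop can be run once per attribute, with only the initial region changing.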

Assessing the Accuracy of Regional Flood Quantiles
Estimated regional flood quantiles are compared to "true" flood quantiles determined from at-site samples. It is common, in practice, to determine "true" quantiles from at-site FFA when the return period of interest is below half the at-site record length (i.e., a 2T rule) [4]. Comparison of regional and at-site quantiles provides a means to assess the accuracy of regional estimates relative to standard practice.
Only 11 sites with record lengths greater than 90 years were included in this study; therefore, for the purpose of reliable at-site estimation, the return periods selected for comparison could not be extreme quantiles, and we selected a range of 20 to 45 years. For each return period, T, the selected sites were those that were able to identify 5T homogeneous regions across all attributes and those having record lengths greater than 2T for reliable at-site estimation. A homogeneous region is easier to form for smaller region sizes; therefore, the number of sites available for analysis differed among return periods, with more sites meeting our criteria at smaller return periods. Table 1 lists the return periods considered for comparison, the number of sites considered at each return period, and the required record lengths for adequate at-site and regional quantile estimates. It is noteworthy that flood estimation for both at-site and regional methods was subject to sampling uncertainty, with the uncertainty bound decreasing with decreasing return period. Thus, the smaller return periods provided improved reliability for assessing results.

Table 1. Required record length for at-site and regional estimates at the different return periods used in the analysis.

| Return Period, T (Years) | Record Length for At-Site Estimate, 2T (Years) | Number of Sites | Station-Years of Record for Regional Estimate, 5T |
|---|---|---|---|
| 20 | 40 | 88 | 100 |
| 25 | 50 | 47 | 125 |
| 30 | 60 | 29 | 150 |
| 35 | 70 | 15 | 175 |
| 40 | 80 | 14 | 200 |
| 45 | 90 | 11 | 225 |

$$\text{Relative bias} = \frac{1}{n}\sum_{i=1}^{n}\frac{Q_i - q_i}{q_i}$$

$$\text{Relative RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\frac{Q_i - q_i}{q_i}\right)^{2}}$$
where $Q_i$ is the regional quantile estimate for site $i$, $q_i$ is the at-site quantile estimate for site $i$, and $n$ is the number of sites available for analysis at each return period.

ARRA Performance
Table 2 shows the number of homogeneous regions produced by attribute and number of ARRA iterations. When the ARRA is not applied and regions are formed based on shortest attribute distance alone, only five to ten of the 186 sites form homogeneous regions across all attributes. Forming homogeneous regions based on attribute similarity alone is, therefore, found to be unproductive, and the use of the region revision process (i.e., the ARRA) to revise initial regions is deemed necessary. Once implemented, the number of homogeneous regions the ARRA identifies increases non-linearly with the number of ARRA iterations for all attributes. In general, the number of homogeneous regions increases significantly for one to three iterations of the ARRA and increases less from four to eight iterations. Two to four iterations of the ARRA result in the identification of relatively more homogeneous regions when considering geographical proximity, precipitation, and temperature patterning than for flood seasonality and physiographic attributes. For five or more iterations, monthly precipitation pattern produces the most homogeneous regions.
To determine a suitable threshold for the number of ARRA iterations, an alternative series composed of 186 regions, for which membership was randomly formed (i.e., without the use of attribute similarity), is used, and one to eight iterations of the ARRA are applied (last column, Table 2). Comparing results between the five attributes and the alternative series, we find that attribute similarity is largely irrelevant to the identification of homogeneous regions after eight iterations. At five iterations, approximately half of the sites form homogeneous regions across all attributes, and the number of regions associated with each attribute remains greater than that of the alternative series. This suggests reasonable preservation of attribute similarity as a selection criterion. We therefore find a maximum of five iterations of the ARRA to be a suitable balance between maintaining attribute similarity for a region and leveraging the revision power of the ARRA.
With appropriate use of the ARRA (i.e., five iterations), approximately 79 to 99 of the 186 sites identify homogeneous regions across all attributes. This is significantly higher than the five to ten sites identified prior to the use of the ARRA.
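One simple way to formalize the comparison against the randomly formed series is sketched below in Python. It is an illustration only: the study selected five iterations by inspecting Table 2, the rule coded here (keep iterating while every attribute outperforms the random baseline) is an assumption, and all counts in the example are hypothetical placeholders rather than values from Table 2.

```python
def max_useful_iterations(attribute_counts, random_counts):
    """Largest iteration count k such that, for every iteration up to k,
    each attribute identifies more homogeneous regions than the
    randomly formed baseline series."""
    best = 0
    for k, baseline in enumerate(random_counts, start=1):
        counts_at_k = [counts[k - 1] for counts in attribute_counts.values()]
        if not all(c > baseline for c in counts_at_k):
            break  # attribute similarity no longer adds value beyond chance
        best = k
    return best


# Hypothetical counts of homogeneous regions for iterations 1..8.
example_attributes = {
    "attribute_a": [30, 55, 70, 80, 90, 95, 97, 98],
    "attribute_b": [25, 50, 65, 75, 85, 90, 93, 96],
}
example_random = [5, 15, 30, 50, 70, 92, 96, 99]

if __name__ == "__main__":
    print(max_useful_iterations(example_attributes, example_random))
```

With these placeholder counts the rule stops at five iterations, because one attribute falls below the random baseline at the sixth.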

Identification of Homogeneous Regions
When the ARRA is applied five times (or fewer), monthly precipitation pattern identifies the largest number of homogeneous regions among all attributes, followed by temperature pattern, geographical proximity, and flood seasonality. Physiographic variables produce the fewest homogeneous flood regions. Differences among the attribute results are relatively small, where the total difference between the two most extreme results (flood seasonality and monthly precipitation pattern) is 21 sites, which is ~11% of the 186 study sites. Figure 5 shows homogeneous region identification across Canada for each attribute. Note that sites that could not identify a homogeneous region, but may be members of another site's homogeneous region, are also indicated in blue. Catchment sites are non-uniformly distributed across Canada, with clusters in southern Canada aligned with urban development and large populations, while remote and sparsely gauged regions are often found in the continental interior and mid to high latitudes of the continental landmass.
Results are generally similar across all attributes at the national scale, with regionalized discrepancies identified. In general, all attributes readily identify homogeneous regions in eastern Canada, while in western Canada (particularly the interior and the northern regions), the identification of homogeneous regions is more problematic. Catchment sites in eastern Canada are generally clustered in small geographical areas and therefore experience more similar flooding behavior. Site clusters are also found on Vancouver Island and in southeast British Columbia, where a considerable number of homogeneous regions were also identified across all attributes.
Figure 5. Sites achieving homogeneous regions (red) relative to those that did not (blue), shown by geographic location for each attribute. Note that the ARRA was applied up to a maximum of five iterations.
As catchment sites in eastern Canada are more tightly clustered, less variability in flood attributes is expected. Figure 6 presents three boxplots comparing catchment physiographic variables between eastern and western sites with respect to catchment area, water body area in the catchment, and standard deviation of elevation across the catchment. The variability in physiographic attributes for the eastern sites is noticeably less than that for the western sites, particularly for the standard deviation of elevation across the catchment.
Some site clusters are found across the southern Canadian prairies in Alberta, Saskatchewan, and Manitoba, where annual peak floods occur predominantly during the spring snowmelt period. The geographical proximity attribute is typically effective when sites are clustered. Important nival regime influences, such as snowpack accumulation and timing and rate of snowmelt, are reflected in attributes such as monthly precipitation pattern, monthly temperature pattern, and flood seasonality. The regional pooling results, however, show that not many catchment sites within the cluster groups identified homogeneous regions across all attributes. Though site clusters are found in both eastern Canada and the southern prairie region, homogeneity often occurs across large regions, indicating that geographical contiguity cannot always guarantee effective homogeneous region identification.
The literature indicates that the Canadian Prairies are known for their hydrological complexity, mainly attributed to the presence of potholes and hummocks, which results in a highly variable fill and spill runoff process and a dynamic effective drainage area, leading to highly non-linear flood generating mechanisms [26,27,64,65]. Zhang et al. [39] provided statistical evidence that annual maximum flows from prairie RHBN stations are also difficult to fit adequately with robust distributions, including GEV, log Pearson type III, and generalized logistic distributions. This is a strong indication of multiple flood responses occurring at a single catchment site. Ehsanzadeh et al. [66] studied prairie flood response based on nine prairie sites and revealed noteworthy non-linear flood frequency curves.
In addition, flood record length across the Prairies is generally limited (Figure 2). The average record length over 28 prairie sites is 25 years, which is substantially lower than that of the remaining 158 sites examined across Canada (having an average record length of 52 years). In order to develop a flood region with 500 station-years, more catchment sites must be pooled into these flood regions. This adds an additional challenge for developing homogeneous regions, since more sites lead to more variable flood responses within the flood region.
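The effect of short records on region size can be illustrated with simple arithmetic. The Python sketch below assumes, for illustration only, that every pooled site has exactly the average record length quoted above; the function name is hypothetical.

```python
import math


def sites_needed(target_station_years, mean_record_length):
    """Minimum number of sites whose pooled records reach the target,
    assuming every site has the same (mean) record length."""
    return math.ceil(target_station_years / mean_record_length)


prairie_sites = sites_needed(500, 25)  # prairie average record length
other_sites = sites_needed(500, 52)    # average for the remaining sites

if __name__ == "__main__":
    print(prairie_sites, other_sites)
```

Under this simplification, a prairie region needs roughly twice as many member sites (20 versus 10) to reach 500 station-years, which leaves more room for heterogeneous flood responses within the region.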
In Burn [28,29], wherein homogeneous regions were successfully identified for southern Manitoba, region identification did not simply rely on attribute similarity measures. A heuristic membership revision process was applied with subjective trial and error to improve region homogeneity. Such region revision approaches are more statistically rigorous than our proposed ARRA; however, they require practitioners to have sound knowledge of local hydrology and are more statistically sophisticated [67]. Our method, on the other hand, is designed to be accessible to all water resource practitioners seeking to perform flood frequency analysis.
For the mountainous western Canada region, annual peak floods are predominantly generated by snowmelt and rain-on-snow regimes, though rare heavy rainstorms can also trigger annual peak floods in smaller basins [54]. Homogeneous region identification maps are noisy along the cordillera mountain chain for all attributes; that is, it is difficult to discern a distinctive spatial pattern. In southern British Columbia and Alberta, some sites identify homogeneous regions; however, the locations of these sites differ amongst attributes. In central British Columbia and Alberta, and the south of the Northwest Territories, only flood seasonality and monthly precipitation patterns identify homogeneous regions, and just for a few sites. The western mountain chain is subject to highly variable climate and basin characteristics. Flood generation mechanisms are influenced by combined basin features including catchment size, drainage topography (e.g., channel slope, floodplains, alluvial fans, canyons), localized snow accumulation and distribution, as well as glaciation and avalanches [8]. These basin features, as well as temperature and precipitation, are highly variable spatially and temporally in mountainous environments [54]. Attributes selected in this study capture flood behavior from a limited set of physiographic characteristics and are likely not rigorous enough for catchment regionalization in the mountains.
In northern Manitoba, the Northwest Territories, and Nunavut, catchment sites are characterized by a cold subarctic climate, a rolling barren and tundra landscape, and long-lasting (five or six months of the year) snow and ice cover underlain by permafrost [54]. Annual peak floods are primarily snowmelt driven; therefore, the duration and the rate of snowmelt are key characteristics for grouping catchment sites. Homogeneous region identification shows that monthly temperature pattern is more effective than other attributes because it captures the timing, rate, and duration of snowmelt-driven flood behavior. Some sites also identify homogeneous regions using flood seasonality, possibly because the duration and rate of snowmelt are inherently correlated with the average and the variation of peak flood dates.
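The average and the variation of peak flood dates can be summarized with directional statistics, a common way to describe dates of occurrence on an annual cycle. The Python sketch below is a minimal illustration under the assumption that each flood date is converted to an angle on a 365-day circle; it is not necessarily the exact formulation used in this study, and the function name and input dates are hypothetical.

```python
import math


def flood_seasonality(julian_days, days_in_year=365):
    """Return (mean day of occurrence, regularity) for a set of flood dates.

    Regularity is the mean resultant length: close to 1 when floods occur
    on nearly the same day every year, close to 0 when dates are spread
    evenly through the year.
    """
    angles = [2 * math.pi * day / days_in_year for day in julian_days]
    x_bar = sum(math.cos(a) for a in angles) / len(angles)
    y_bar = sum(math.sin(a) for a in angles) / len(angles)
    mean_angle = math.atan2(y_bar, x_bar) % (2 * math.pi)
    mean_day = mean_angle * days_in_year / (2 * math.pi)
    regularity = math.hypot(x_bar, y_bar)
    return mean_day, regularity


if __name__ == "__main__":
    # Hypothetical spring snowmelt peaks clustered around early May.
    print(flood_seasonality([118, 122, 125, 120, 130]))
```

Sites whose mean day and regularity plot close together in this space would be treated as similar in terms of flood seasonality.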
We find two general and probable causes that account for the inability to identify homogeneous flood regions in Canada. First, the clustering or the proximity of gauge sites has considerable impact on the outcome of identifying homogeneous regions, regardless of the attribute considered. The tendency for attributes to be more similar within smaller geographical regions is significant, despite the fact that regions (for attributes other than geographical proximity) can also include sites that are non-proximal. Second, each attribute selected in this study measures a distinctive hydrological feature; however, across large spatial domains (e.g., the Canadian landmass), there exist significant local-scale hydrological complexities that influence flood generation mechanisms. For sites that are influenced by multiple hydrological characteristics, our attribute selection is not rigorous enough to capture the particulars of flood behavior and is thus unable to group catchment sites with similar flood frequency regimes. Related to this, Table 2 indicates that most sites identify homogeneous regions as an outcome of ARRA iterations; the ARRA revises region membership based on a specified attribute. If the specified attribute does not capture primary flood characteristics, the subsequent ARRA enhancement becomes ineffective.

Analyzing Membership Characteristics
To gain insight into membership characteristics, two catchment sites along with their region memberships were selected for more detailed case studies. Flood regions for both sites are identified based on flood seasonality and five ARRA iterations, with only one of the two regions being homogeneous.
Target catchment site Water Survey of Canada (WSC) gauge 03MB002 (Whale River at 40.2 km from the Mouth), in northern Québec, cannot identify a homogeneous flood region with the flood seasonality attribute and ARRA iterations. This site and its 12 members are plotted geographically (Figure 7a) and in flood seasonality space (Figure 7b) and are summarized in Table 3 based on physiography. Based on geographical proximity (Figure 7a), site membership is supported from a climatic perspective. The map shows that target and member sites broadly span Canada, from the Pacific to the Atlantic and from southern British Columbia to the northern edge of the Northwest Territories. All member sites are, however, situated near an ocean or a coastal region and receive substantial annual precipitation (see Table 3). Since member sites span a broad range of latitude, there is variation in annual temperature range that alters the amount and the temporal distribution of rain and snowfall, thus affecting the dominant runoff mechanism during the annual peak flood season. Expected differences in flood behavior are also reflected in the varying physiography among member sites (Table 3).
Four member sites in southeastern British Columbia have high mean annual precipitation and noticeably higher mean annual temperatures compared to other members further north. These four member sites are exposed to more pluvial or mixed rain-on-snow floods. Basin area also varies substantially among the membership; five member sites are small basins (i.e., <500 km²), whereas the seven others and the target site have basin areas ranging between 3500 km² and 49,000 km². Four of the five small basins are located in southeastern British Columbia. The basin compactness ratio (BasinArea/Perimeter²) is a surrogate measure for time to peak flow and is significantly greater (as expected) for the smaller basins, indicating much shorter routing times than what is seen for the larger basins. A wide spectrum of mean basin slope also exists, ranging from 3.3% to 45.4% across member sites. Smaller basins in the British Columbia mountains are remarkably steeper than member sites from other areas of Canada. Mean basin slope affects time to peak flow, as well as runoff ratios. Member sites that are highly variable in such physiographic characteristics are less likely to exhibit similar flood behavior.
In contrast, target catchment site WSC gauge 06DA004 (Geikie River below Wheeler River) identifies a homogeneous region consisting of 12 catchment sites, excluding the target site (i.e., leave-one-out analysis). The target site is in northern Saskatchewan with very few other sites nearby. The climatology is described as sub-arctic, with cold temperatures, and the physiography consists of flat to rolling topography with numerous surface water bodies present in the catchment. The sub-arctic, cold climate causes annual peak flooding that is predominantly snowmelt driven; the amount of accumulated winter snowpack, as well as the timing and rate of snowmelt, are influential to flood generation.
The geographical extent of the 12-site membership is shown (Figure 7a), along with flood seasonality (Figure 7b), physiographic values (Table 3), and boxplots of physiographic values (Figure 8). Flood seasonality space indicates the membership has good consistency in the regularity of date of occurrence, suggesting these 12 member sites likely have similar flood type and characteristics. Geographically, member sites are situated in the interior of Canada, with most located in mid-to-northern Alberta, Saskatchewan, and Manitoba. This area is subject to a prolonged, colder, sub-arctic climate; hence, the annual peak flood is a nival flood regime. Although catchment area and perimeter span a large range (Table 3 and Figure 8), catchment compactness ratio, mean basin slope, mean annual precipitation, and mean annual temperature are within the same order of magnitude. The spread of the 06DA004 box plots is noticeably smaller than the spread observed for the 03MB002 region among all physiographic variables, which reflects their contrasting results in terms of homogeneity.
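The compactness ratio compared across the two regions is straightforward to compute from basin area and perimeter. The Python sketch below is a minimal illustration using idealized shapes rather than study catchments; the function name and the example geometries are hypothetical.

```python
import math


def compactness_ratio(area_km2, perimeter_km):
    """Basin compactness ratio: area divided by perimeter squared (dimensionless)."""
    return area_km2 / perimeter_km ** 2


# Idealized shapes for comparison (not study catchments).
radius_km = 10.0
circular_basin = compactness_ratio(math.pi * radius_km ** 2, 2 * math.pi * radius_km)
elongated_basin = compactness_ratio(100.0 * 1.0, 2 * (100.0 + 1.0))

if __name__ == "__main__":
    print(circular_basin, elongated_basin)
```

A circle gives the largest possible value, 1/(4π), so any real basin falls below that bound, and elongated or irregular outlines score much lower. Because the ratio is dimensionless, it can be compared between basins of very different size.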
The above case studies provide examples of the application of catchment physiographic variables to further investigate membership characteristics, which can potentially diagnose causes for region homogeneity or heterogeneity. We conducted a similar member physiographic analysis for the other considered attributes, and for prairie and mountainous sites that cannot identify homogeneous regions. Member sites in heterogeneous regions often displayed large physiographic variability. Therefore, it is generally found that our selected attributes and ARRA regionalization approach are not rigorous enough to identify homogeneous flood regions for catchments with significant hydrological complexity, which, in Canada, are those primarily located in prairie and mountainous regions.

Predictive Measures for Regional Quantile Estimation
Predictive measures for regional quantile estimation are presented as relative bias and relative RMSE for return periods ranging from 20 to 45 years (Table 4 and Figure 9). In general, relative bias across all considered attributes is small for all return periods (ranging from −0.6% to 3.7%). As biases are within ±5% deviation, regional estimation accuracy is considered satisfactory. Bias is generally positive, suggesting that regional estimates tend to overestimate "true" flood quantiles, but bias is uncorrelated with the magnitude of the flood quantile. Comparing bias among attributes, flood seasonality and physiographic variables generally exhibit larger bias than geographical proximity, monthly precipitation pattern, and monthly temperature pattern.
RMSE generally increases with increasing return period across all attributes. RMSE is similar among attributes for each return period of 35 years or less. At larger return periods (i.e., 40 and 45 years), flood seasonality and physiographic variables show noticeably larger RMSE than geographical proximity, monthly precipitation pattern, and monthly temperature pattern. Though the "true" quantile is modeled by at-site estimates using accepted methods, estimation uncertainties caused by statistical extrapolation increase with increasing quantiles for both at-site and regional estimates. Therefore, higher relative RMSE at larger return periods is anticipated.
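The two predictive measures can be sketched in Python as follows. This is a minimal illustration, assuming relative bias is the mean of the relative errors (Q_i − q_i)/q_i between regional (Q_i) and at-site (q_i) quantiles, and relative RMSE is the root mean square of the same relative errors; the function names and the example quantiles are hypothetical, not study results.

```python
import math


def relative_bias(regional, at_site):
    """Mean relative error of regional quantiles against at-site quantiles."""
    errors = [(Q - q) / q for Q, q in zip(regional, at_site)]
    return sum(errors) / len(errors)


def relative_rmse(regional, at_site):
    """Root mean square of the relative errors."""
    errors = [(Q - q) / q for Q, q in zip(regional, at_site)]
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))


# Hypothetical quantiles (m^3/s) for three sites at one return period.
example_regional = [110.0, 190.0, 300.0]
example_at_site = [100.0, 200.0, 300.0]

if __name__ == "__main__":
    print(relative_bias(example_regional, example_at_site))
    print(relative_rmse(example_regional, example_at_site))
```

A positive relative bias indicates that regional estimates exceed the at-site values on average, while relative RMSE also penalizes errors that cancel in the bias.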
Geographical proximity, monthly precipitation pattern, and monthly temperature pattern perform better across both metrics than flood seasonality and physiographic variables, possibly because regions identified based on the first three attributes often have a higher degree of geographic proximity. Flood seasonality and physiographic measures end up grouping sites across a wider geographical extent; therefore, the degree of hydrological similarity between sites may be lower, resulting in slightly poorer (but acceptable) regional flood estimation results.
Overall, all considered attributes produced satisfactory regional flood quantile estimates for Canada based on an acceptable range of bias and a reasonable range of estimation uncertainty. The success in regional quantile estimation demonstrates the applicability of the proposed regionalization process based on ROI and ARRA.

Conclusions
This study provides insight into five distinctive flood-related attributes and their behavior in identifying homogeneous flood regions across Canada. All considered attributes show similar results regarding the number of homogeneous regions identified and the locations where homogeneous regions could be identified. In general, the success of homogeneous region identification depends on local hydrological complexities, on whether the considered attribute reflects primary flood generation mechanisms, and on the geographic clustering of the sites.
Through combinations of these factors, results of homogeneous region identification are highly distinctive when mapped for Canada. Catchment sites in eastern Canada are generally clustered in small geographic regions and are more likely to exist within similar hydrological environments. Annual peak floods in northern Canada are predominantly snowmelt driven and thus sensitive to temperature variation, making monthly temperature pattern important for homogeneous region identification. The Prairie region and the western mountains are subject to highly variable physiographic characteristics, resulting in difficulty in identifying homogeneous regions, regardless of the attribute considered.
Use of a regionalization revision process to revise initial group membership was found to be important. We proposed an automated process, the ARRA, to efficiently revise group membership across large domains and showed it successfully increased the number of homogeneous regions. Flood quantiles obtained from the identified homogeneous regions were reasonably close to estimated at-site "true" quantiles, further demonstrating success of the regionalization process. The ARRA can be readily adopted for other types of regionalization frameworks (e.g., clustering) when subsequent region revision is required.
Findings of this study, on the basis of 186 catchment sites across Canada, provide valuable input on the identification of homogeneous flood regions as well as their attribute behaviors and spatial characteristics. The success of identifying homogeneous flood regions is essential for RFFA and thus for reliable flood quantile estimation. Within the FloodNet project, work on refining RFFA techniques will aid in the appropriate sizing of flood-resilient infrastructure, which is crucial to the proactive protection of lives and property against flood risk.