Exploring the Capability of Natural Flood Management Approaches in Groundwater-Dominated Chalk Streams

Abstract: This study aims to address the gap in the Natural Flood Management (NFM) evidence base concerning its implementation potential in groundwater-dominated catchments. We generated a typology of 198 chalk catchments using redundancy analysis and hierarchical clustering. Three catchment typologies were identified: (1) large catchments, (2) headwater catchments with permeable soils, and (3) catchments with impermeable soils and surfaces (urban and suburban land uses). The literature suggests that natural flood management application is most effective for catchments <20 km², reducing the likelihood of significant flood mitigation in large catchments. The relatively lower proportion of surface runoff and higher recharge in permeable catchments diminishes natural flood management's likely efficacy. Impermeable catchments are most suited to natural flood management due to a wide variety of flow pathways, making the full suite of natural flood management interventions applicable. Detailed groundwater flood maps and hydrological models are required to identify catchments where NFM can be used in a targeted manner to de-synchronise sub-catchment flood waves or to intercept runoff generated via groundwater emergence. Whilst our analysis suggests that most chalk groundwater-dominated catchments in this sample are unlikely to benefit from significant flood reductions due to natural flood management, the positive impact on ecosystem services and biodiversity makes it an attractive proposition.
[...] environmental benefits, such as improved water quality, aesthetic improvements, reduced soil erosion, increased biodiversity, and collaboration across multiple river-management bodies. This paper provides a first-order triage of the potential for NFM runoff-management methods in chalk catchments.
Further work in this field will need to focus on hydrological models that represent the permeability of the soils and the influence of groundwater on stream flows, detailed groundwater emergence, and flood-risk mapping as well as NFM-implementation studies specifically on chalk streams.


Introduction
Natural Flood Management (NFM) represents a paradigm of management strategies that aim to improve a catchment's resilience to prolonged and/or heavy rainfall by restoring, enhancing, or altering a catchment's natural hydrological and morphological characteristics [1]. Strategies aim to increase interception and infiltration by reducing rapid runoff generation, increasing catchment water storage, and slowing overland channel flows [2,3]. Flood peak reductions of up to 30-40% have been attributed to NFM methods, such as storage pond networks, tree shelter belts, channel realignment, leaky dams, and winter cover crops in rural and urban settings [2,4–8]. Moreover, NFM can offer co-benefits, such as improvements to water quality, reductions in soil degradation, and enhanced biodiversity [1,9,10]. Consequently, the demand for NFM implementation has increased due to its perception as a relatively low-cost, low-maintenance flood mitigation solution that protects and maintains the hydrological and biological function of the rivers in which it is implemented [1,11,12]. Despite this growing demand for NFM interventions, the NFM evidence base consistently cites groundwater-dominated river systems, like chalk streams, as a gap in the knowledge due to hydrological differences, meaning the current evidence may not be directly applicable [11,13,14].
A small number of chalk streams can be found in Northern France and Russia, but the 224 chalk streams found in the UK account for the majority of this river type found globally [15]. Catchments underlain with chalk bedrock are predominantly groundwater dominated [16]. This means that large quantities of water are stored and transmitted through the chalk bedrock before being released into river channels as baseflow, generating a stable annual river regime with a lack of spate conditions and relatively small variations between high and low flows [17]. It can take months for changes in rainfall inputs to manifest as changes in the river regime [18], giving chalk streams a strong seasonal flow oscillation in conjunction with annual rainfall patterns in the UK. Discharge rises slowly over the winter, peaking in late April after heavier winter rainfall, and gradually recedes over the summer, with lowest flows in autumn after generally drier summers. Groundwater flooding occurs when rainfall recharge causes the groundwater level to rise, in turn increasing groundwater inputs into river systems and causing groundwater emergence in topographic low points, winterbournes, and activating springs [19,20].
NFM functions by enhancing a catchment's natural ability to absorb shocks from storms by storing water in the catchment and then slowly releasing it [21]. A successful NFM scheme reduces discharge at at-risk locations by extending flood duration and reducing the peak discharge (and water level) at a point at any given time [13,22]. This is achieved by increasing interception and infiltration, slowing overland channel flows, and manipulating channel and catchment surface roughness [2,3]. NFM is a process-based approach, meaning that each NFM scheme is designed by matching flow pathways, landscape features, and sources of flood waters that contribute to peak flows to specific NFM interventions that tackle those issues. Because of this, NFM interventions can be categorised according to the processes that they manipulate: (1) the reduction of rapid runoff generation, (2) increasing catchment water storage, and (3) strategies to reduce the conveyance of water downstream [11,13,14]. This is an important distinction because it allows specific sources of flood waters to be linked to specific solutions in the process of NFM design. For example, many arable fields suffer from sediment loss due to excess rapid overland flow. In this case, NFM interventions would focus on reducing runoff inputs to the river channel by intercepting and storing water. Sediment can be trapped and stored and surface roughness increased (slowing the flow of surface runoff) by using winter cover crops or across-slope tillage. The choice of method would be dictated by the specific soil properties of the site as well as management preferences. As such, NFM schemes are often tailor-made to each application scenario, and it is important that the features and water-transfer processes of the catchment where implementation is proposed are first established.
However, the vast majority of the NFM evidence base is founded on research conducted in surface water-dominated catchments where floods are caused by the convergence of multiple surface-runoff inputs to the river channel [11,13,23]. As a result, the application of NFM in groundwater-dominated catchments has been highlighted as one of the key knowledge gaps in the NFM evidence base, in acknowledgement that the flow pathways and processes present in groundwater-dominated catchments are significantly different to those found in catchments with fluvial floods. This can be demonstrated by the difference in Base Flow Index (BFI) values for the River Lambourne (a typical chalk stream) and the Belford Burn catchment in Northumberland, which has been used as an NFM experimentation catchment for many years. BFI measures the proportion of total channel flow contributed by groundwater sources. Belford Burn has a BFI of 0.313 [24], and the River Lambourne has a BFI of 0.98 [25]. Therefore, groundwater catchments typically transfer a large proportion of water below the ground surface. Of the 12 main NFM techniques summarised by Lane (2017) [13], eight rely on managing surface water. However, if most of the water transfer throughout the catchment occurs below the ground surface, these measures may have limited effect (Figure 1). Because NFM interventions are tailored to fit the specific sources of flood water, it follows that NFM schemes in groundwater catchments will focus on different combinations of interventions than those found in surface-runoff-dominated catchments (Figure 1) depending on the key properties of chalk catchments. It is therefore important to identify the morphological and hydrological features that affect groundwater recharge and the production of overland flows (increasing the probability of channel bank exceedance) to guide future NFM strategies in catchments dominated by groundwater processes.
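As a rough illustration of this contrast, the share of flow not attributed to baseflow can be approximated as 1 − BFI. The short Python sketch below uses only the two BFI values quoted above; the function name is illustrative, and treating 1 − BFI as the quick-flow share is a simplifying assumption rather than part of the method used in this study.

```python
def non_baseflow_share(bfi):
    """Share of streamflow not attributed to baseflow, taken here as 1 - BFI."""
    if not 0.0 <= bfi <= 1.0:
        raise ValueError("BFI must lie between 0 and 1")
    return 1.0 - bfi


# BFI values as quoted in the text for the two example catchments.
bfi_values = {"Belford Burn": 0.313, "River Lambourne": 0.98}

shares = {name: round(non_baseflow_share(b), 3) for name, b in bfi_values.items()}
for name, share in shares.items():
    print(f"{name}: about {share:.1%} of flow is not baseflow")
```

On these figures, roughly two thirds of flow in Belford Burn would be available to surface-focused measures, against a few percent in the chalk stream.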
Water 2021, 13, x FOR PEER REVIEW 3 of 23
This study is intended as a screening process for NFM in groundwater catchments by grouping catchments according to hydrological properties and matching them to NFM interventions that specifically tackle these flow pathways. To do this, we quantify the relationships between hydrological variability and key morphological characteristics that are amenable to NFM for 198 catchments with chalk bedrock in the Southeast of England. We use these results to classify the catchments and infer flow pathways and to make suggestions for the most appropriate NFM strategies for these river basins. Because NFM schemes must be designed on a case-by-case basis for the greatest effectiveness, this is intended as a broad-scale screening process to narrow down options and not as a comprehensive guide for choosing NFM interventions in all chalk stream catchments.

Study Area and Catchment Selection
Chalk groundwater-dominated catchments were first identified using topographic and geological datasets. Topographic catchment boundaries were provided by the Centre for Ecology and Hydrology National River Flow Archive (NRFA) [26]. A bedrock map (BGS 625k) from the British Geological Survey [27] was used to identify catchments that were underlain by chalk bedrock. To reduce uncertainty due to differing groundwater transfer processes [28] and because of the global rarity of chalk stream river systems, we focused on chalk groundwater-dominated river systems, excluding limestones or Permo-Triassic sandstones. We selected catchments with at least 70% of the catchment within the chalk bedrock and with gauging stations within 5 km downstream of the chalk bedrock (determined via the buffer tool in ArcGIS and visual inspection). Using these criteria, a total of 198 catchments were available for analysis, located predominantly in the Southeast of England (Figure 2).
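The two selection criteria can be expressed as a simple filter. The Python sketch below is illustrative only: the record structure, field names, and the four candidate catchments are hypothetical, and in the study the criteria were applied in ArcGIS with visual inspection rather than in code.

```python
from dataclasses import dataclass


@dataclass
class Catchment:
    name: str
    chalk_fraction: float     # share of catchment area on chalk bedrock (0 to 1)
    gauge_distance_km: float  # distance of the gauge downstream of the chalk bedrock


def is_selected(c, min_chalk=0.70, max_gauge_km=5.0):
    """Apply the two criteria: at least 70% chalk and a gauge within 5 km."""
    return c.chalk_fraction >= min_chalk and c.gauge_distance_km <= max_gauge_km


# Hypothetical candidates, not catchments from the study sample.
candidates = [
    Catchment("A", 0.92, 0.0),
    Catchment("B", 0.65, 1.2),  # fails the chalk criterion
    Catchment("C", 0.71, 7.5),  # fails the gauge-distance criterion
    Catchment("D", 0.70, 5.0),  # sits exactly on both thresholds
]

selected = [c.name for c in candidates if is_selected(c)]
print(selected)
```

Both thresholds are treated as inclusive here, which is an assumption about how boundary cases were handled.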

Data Analysis
To relate hydrological variability to catchment morphological characteristics, we compiled and analysed the covariation among four key hydrological variables and twenty-one variables quantifying the physical catchment properties. Details and data sources are found in Table 1.

Hydrological Variables
All hydrological variables were derived from the Centre for Ecology and Hydrology UK National River Flow Archive (NRFA) [26]. Average discharge (Qmean) and maximum recorded discharge (Qmax) are among the most basic hydrological metrics used to characterise the hydrological regime. The base flow index (BFI) [25] measures the proportion of streamflow from groundwater contributions. This is relevant to NFM implementation because high BFI values are related to stable flows and limited surface-runoff generation and contributions to streamflow. Conversely, the Richards-Baker flashiness index (RBFI) [29] measures the sensitivity of streamflow to rainfall inputs. Therefore, high RBFI values indicate significant sources of runoff resulting in spate conditions. As mentioned previously, understanding the processes by which water is transferred throughout the catchments is essential to NFM design. Chalk streams are often characterised and defined by features such as long times to peak, high base flow domination, and a lack of spate conditions [15]. However, large ranges in BFI and RBFI values within the chalk stream sample demonstrate that there is major spatial variation in chalk stream properties. These metrics will help characterise chalk catchments according to flow variability, average flows, the short-term response of flow to sharp bursts of rainfall, groundwater contributions to the flow regime, and (indirectly) the rate of recharge and runoff production. This helps inform which NFM interventions will be most suitable based on whether water transfer occurs predominantly above or below the ground surface. Details of index calculations are provided in Table 1.

Physical Catchment Properties
We selected 21 physical variables that directly influence hydrology and that have explained variations in catchment hydrology in other studies [29–31]. This included the percentage cover of 9 land uses (arable land, broadleaved and coniferous woodland, grassland, heathland, urban, suburban, marshland, and inland rock) from the 2015 Land Cover Map (LCM 2015) [32]. Heathland, grassland, and marsh land covers are made up of sub-categories of each classification, respectively (i.e., all separate grassland types are combined into a new, homogenous grassland classification). Sub-categories were combined into the three new classes as specified in Appendix A of the LCM2015 documentation to generate more suitable data distributions. The hydrological impact of land cover is well documented in the literature (Table 1) and is relevant to NFM because land cover directly impacts flow pathways and, consequently, the choices of NFM interventions that may be suitable. Additionally, land use indicates the amount of space available for NFM application and therefore what is possible. Large quantities of storage ponds, controlled flood plain zones, and afforestation may not be suitable NFM interventions in regions of arable land due to the need to preserve economically productive space, whereas other interventions, like winter cover crops, hedge edges, and no-till farming, may be more suitable. The percentage cover of 6 soil types (bypass flow common, bypass flow uncommon, bypass flow very uncommon, bypass flow variable, slowly permeable, and impermeable) was classified according to the hydraulic conductivity in the Hydrology of Soil Types (HOST) [24]. This is because hydraulic conductivity can directly dictate the efficiency of aquifer recharge and surface-runoff generation. Soil-saturated hydraulic conductivity has been identified as one of the key features that dictates chalk stream river regime and flood response [31].
It is therefore important to understand the key hydrological processes in the catchment that will in turn be useful to inform the choice of NFM interventions. Topographic catchment shape and drainage density were included because they influence the rate of propagation of water through the catchment and river channel network [33,34] (p. 304). Bedrock transmissivity and a proxy for aquifer abstraction rates were included because they provide information about how water propagates through the subsurface of the catchment, which in turn influences streamflow response [35,36]. Transmissivity values of the chalk bedrock vary from 230 m²/day to 2600 m²/day [36]. Catchments where transmissivity is low have reduced rates of recharge, subsurface water transfer, and reduced groundwater contributions to channel flows relative to regions with high-transmissivity bedrock [36]. The hydrological influence of each physical catchment variable is included in Table 1 as well as details of their calculation.
Table 1. Equations and methodology of variables compiled for the response and explanatory databases for the redundancy analysis.

Variable | Equation or units | Description and influence on flow
Mean Discharge (Qmean) | $Q_{mean} = \frac{\sum Q}{n}$, n = days in record | The average quantity of water in the river channel. Gives an indication of the discharge under normal conditions. From the NRFA [26].
Maximum Recorded Discharge (Qmax) | $Q_{max}$ | Maximum discharge capacity of catchments at the gauging site during high flows. From the NRFA [26].
Richards-Baker Flashiness | | Measures the absolute daily fluctuations in streamflow, divided by the sum of all stream flow for the time series length [37]. Values range between 0 and 1. Values near 0 represent stable flow and those close to 1 represent highly changeable flows and spate conditions [29].
Base Flow Index | | BFI measures the proportion of river runoff derived from stored sources [25,26].
Drainage Density | | High drainage densities are linked to drainage efficiency and high peak flows [33].
Channel Slope | m/m | Calculated from a raster layer of the SRTM 30 m Digital Elevation Model [40] in ArcGIS using the zonal statistics function.
Arable Land and Horticulture | | Reduced peak flows [41].
Coniferous Woodland | | Reduced peak flows due to large leaf surface area [42].
Heathland | | No overall influence: the effects of vegetation are counteracted by shallow soils and low storage capacity [44].
Urban | | Increased peak flows: impervious surfaces and drainage systems [45].
Suburban | | Increased peak flows: impervious surfaces and drainage systems [45].
Peat | | Semi-permeable (vertical saturated hydraulic conductivity 0.1–10 cm day⁻¹) [46].
Transmissivity (m²/day) | $\text{Transmissivity} = \frac{g a^{3}}{12 v}$, where g = acceleration due to gravity, a = area, v = kinematic viscosity of the fluid | The rate at which water passes through the chalk bedrock [35,36].
Abstraction Score | Subjectively assigned an arbitrary abstraction score based on the flow regime description in the gauging station information in the NRFA. Scores: flow added; natural flow; minor, moderate, and major reduction of flows due to abstraction. | Quantifies influence of abstraction on river regime [26].
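The discharge-based metrics in Table 1 can be sketched in a few lines. The Python example below follows the verbal definition of the flashiness index given in the table (the sum of absolute day-to-day changes in discharge divided by the sum of discharge); the two daily flow series are invented for illustration, and the function name is not taken from the study, which derived its indices from NRFA records.

```python
def flashiness_index(daily_flow):
    """Sum of absolute day-to-day flow changes divided by total flow."""
    if len(daily_flow) < 2:
        raise ValueError("at least two daily values are needed")
    total = sum(daily_flow)
    if total <= 0:
        raise ValueError("total flow must be positive")
    changes = sum(abs(daily_flow[i] - daily_flow[i - 1])
                  for i in range(1, len(daily_flow)))
    return changes / total


# Invented daily mean flows (m3/s): a stable, baseflow-like series and a flashy one.
stable = [4.0, 4.0, 5.0, 4.0, 4.0]
flashy = [2.0, 4.0, 2.0, 5.0, 2.0]

rbfi_stable = flashiness_index(stable)
rbfi_flashy = flashiness_index(flashy)
q_mean = sum(flashy) / len(flashy)  # mean discharge over the record
q_max = max(flashy)                 # maximum recorded discharge

print(f"stable: {rbfi_stable:.3f}, flashy: {rbfi_flashy:.3f}")
print(f"Qmean = {q_mean:.1f}, Qmax = {q_max:.1f}")
```

The stable series gives a value close to 0 and the flashy series a much higher one, matching the interpretation given in the table.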

Statistical Analysis
We used redundancy analysis (RDA) [47] to characterise and explain the variation in hydrological properties in relation to physical properties. RDA is an asymmetric ordination method whereby the variation in one set of (explanatory) variables is used to directly explain the variation in another set of (response) variables. Essentially a combination of multiple regression and principal component analysis, RDA generates a matrix of the fitted values of all response variables, which are then subjected to a principal component analysis. Linear combinations of explanatory variables (physical catchment properties) that best explain the variation in the response variables (hydrological properties) are sought by the model in successive order. The main advantages of using RDA over principal component analysis in this context are that variance in the response variables is attributed to the explanatory variables, and presence or absence of a relationship between specific x and y variables can be tested [47]. It can therefore be used to directly link hydrological traits of chalk streams to the presence or absence of specific physical features, enabling more targeted NFM scheme design.
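The regression half of RDA can be illustrated with a small, dependency-free Python sketch. It regresses each centred response column on the centred explanatory columns and reports the fraction of response variance reproduced by the fitted values (the unadjusted R² of the constrained model). This is a simplified illustration under stated assumptions, not the vegan implementation used in the study: the subsequent principal component analysis of the fitted values is omitted, the responses are centred but not standardised, and all data values and variable names are invented.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def centre(column):
    mu = sum(column) / len(column)
    return [v - mu for v in column]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def constrained_fraction(explanatory, response):
    """Fraction of response variance reproduced by regression on the explanatory columns."""
    X = [centre(c) for c in explanatory]
    Y = [centre(c) for c in response]
    xtx = [[dot(xi, xj) for xj in X] for xi in X]
    fitted_ss = total_ss = 0.0
    for y in Y:
        beta = solve(xtx, [dot(xi, y) for xi in X])
        fitted = [sum(b * xi[i] for b, xi in zip(beta, X)) for i in range(len(y))]
        fitted_ss += dot(fitted, fitted)
        total_ss += dot(y, y)
    return fitted_ss / total_ss


# Invented values for five catchments: two explanatory and two response columns.
area = [1.0, 2.0, 3.0, 4.0, 5.0]
urban = [2.0, 1.0, 4.0, 3.0, 6.0]
q_mean = [4.1, 4.8, 10.2, 10.9, 16.0]
flashiness = [0.30, 0.10, 0.35, 0.20, 0.40]

r2 = constrained_fraction([area, urban], [q_mean, flashiness])
print(f"constrained fraction of variance: {r2:.3f}")
```

In a full RDA the fitted values would then be passed to a principal component analysis to obtain the canonical axes plotted on the biplot.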
All analyses were undertaken in the R environment (version 3.6.1) [48]. After compiling the variables as outlined in Table 1, all variables were transformed using a Box-Cox transformation and were centred and scaled to reduce the influence of outliers and place variables on a common scale for the RDA model [49]. To identify the explanatory variables that best describe the co-variation of the hydrological variables, a preliminary redundancy analysis (RDA) model was initially run using the rda function of the vegan package in R [50]. This global model, using all 21 catchment variables, was subjected to model validation. To achieve a parsimonious model containing important physical explanatory variables, the global model was subjected to a forward stepwise procedure using the forward.sel function of the Packfor R package [51] (p. 48). This process selects the combination of variables with the highest R² among those with significant p values [52] (p. 178). The remaining physical catchment variables after the stepwise procedure are those that directly correlate with and explain the variation in hydrological regime in chalk streams whilst maintaining the highest explanatory power (R²). The resulting forward-selected RDA model was then subjected to a validation process, including ANOVA tests of the significance of the relationships identified in the parsimonious model and the number of significant axes in the model using 1000 permutations. The collinearity of the selected physical catchment variables in the model was ascertained using the variance inflation factor (VIF). Catchments were then plotted as points on a biplot according to their RDA coordinates.
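The pre-processing step (Box-Cox transformation followed by centring and scaling) can be sketched as follows. This Python example is a minimal illustration rather than the R code used in the study: the transformation parameter lambda is supplied by hand (the values used by the authors are not reported here, and in practice lambda is usually estimated from the data), and the catchment areas are illustrative, with only the smallest and largest group 1 areas taken from the text.

```python
import math
from statistics import mean, stdev


def box_cox(values, lam):
    """Box-Cox transform for strictly positive data; lam = 0 gives the natural log."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def centre_and_scale(values):
    """Subtract the mean and divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Illustrative catchment areas in km2 (strongly right-skewed).
areas = [12.0, 35.0, 108.0, 420.0, 1459.0]

# lam = 0 (log transform) is an assumption made for this sketch.
z = centre_and_scale(box_cox(areas, lam=0))
print([round(v, 2) for v in z])
```

After this step every variable has zero mean and unit variance, so no single variable dominates the ordination because of its units.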

Catchment Classification
Cluster analysis was used to group catchments with similar river regimes according to catchment RDA coordinates. Catchments with similar combinations of physical catchment property variables are located near each other on an RDA biplot (have similar RDA coordinates). Before subjecting the data to cluster analysis, the data's tendency for clustering was established using the Hopkins Statistic with the get_clust_tendency function of the factoextra package in R [53]. Four hierarchical clustering methods were compared for their suitability to the dataset according to two distance measures (cophenetic correlation and the Gower statistic). Cophenetic correlation measures the degree of agreement between the original unmodeled pairwise distances and the pairwise distances in the dendrogram. A high positive correlation indicates that pairwise distances have been preserved [54]. The Gower statistic is the sum of squared differences between the dissimilarity matrix and the cophenetic distance and is calculated using dendrogram hierarchical partition [55,56]. The clustering algorithms compared were single, complete, and average linkage agglomerative clustering and Ward's minimum variance clustering [55]. The optimum number of clusters was established via a Mantel correlation. Here, the original distance matrix is compared to binary matrices computed from the dendrogram being cut at multiple different levels for different numbers of cluster allocations [55]. The optimum number of clusters is where the Mantel correlation is highest.
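The Mantel-type comparison used to choose the number of clusters can be sketched as a Pearson correlation between the pairwise distances and a binary vector that is 1 when two catchments fall in different groups and 0 otherwise. The Python example below is illustrative only: the one-dimensional scores and the three candidate partitions are invented, and the partitions are written out by hand rather than obtained by cutting a dendrogram as in the study.

```python
from itertools import combinations
from math import sqrt


def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / sqrt(var_u * var_v)


def partition_correlation(scores, labels):
    """Correlate pairwise distances with a 0/1 'different group' indicator."""
    pairs = list(combinations(range(len(scores)), 2))
    distances = [abs(scores[i] - scores[j]) for i, j in pairs]
    different = [0.0 if labels[i] == labels[j] else 1.0 for i, j in pairs]
    return pearson(distances, different)


# Invented scores on a single ordination axis for seven catchments.
scores = [0.0, 1.0, 2.0, 10.0, 11.0, 20.0, 21.0]

# Hypothetical partitions standing in for a dendrogram cut at 2, 3 and 4 groups.
cuts = {
    2: [1, 1, 1, 2, 2, 2, 2],
    3: [1, 1, 1, 2, 2, 3, 3],
    4: [1, 1, 2, 3, 3, 4, 4],
}

correlations = {k: partition_correlation(scores, lab) for k, lab in cuts.items()}
best_k = max(correlations, key=correlations.get)
print({k: round(r, 3) for k, r in correlations.items()}, "best:", best_k)
```

The three-group partition gives the highest correlation for these invented scores because it separates the three visibly distinct clumps without splitting any of them.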
The uncertainty in the allocated clusters was estimated using silhouette widths. This measures the degree of membership of an object to its allocated cluster by comparing the average distance between the object and all other objects in the same cluster with the average distance between the object and all the objects in the next closest cluster [57]. Accordingly, a high silhouette width indicates that a catchment has a high degree of membership in its group. Catchments with negative silhouette width values can be assumed to have been misclassified.
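The silhouette width described above can be computed directly from coordinates and cluster labels. The Python sketch below is a minimal illustration under stated assumptions: the coordinates and labels are invented, Euclidean distance is assumed, and members of single-object clusters are given a width of 0 by convention. The last point is deliberately placed in the wrong group to show how a negative width arises.

```python
from math import dist


def silhouette_widths(points, labels):
    """Silhouette width (b - a) / max(a, b) for every object."""
    widths = []
    clusters = set(labels)
    for i, p in enumerate(points):
        mean_dist = {}
        for c in clusters:
            members = [q for j, q in enumerate(points) if labels[j] == c and j != i]
            if members:
                mean_dist[c] = sum(dist(p, q) for q in members) / len(members)
        own = labels[i]
        if own not in mean_dist:  # object is alone in its cluster
            widths.append(0.0)
            continue
        a = mean_dist[own]                                       # own cluster
        b = min(d for c, d in mean_dist.items() if c != own)     # next closest cluster
        widths.append((b - a) / max(a, b))
    return widths


# Invented ordination coordinates; the last point sits beside group 2 but is labelled 1.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0), (9.0, 0.0)]
labels = [1, 1, 1, 2, 2, 1]

widths = silhouette_widths(points, labels)
print([round(w, 3) for w in widths])
```

The mislabelled point receives a strongly negative width, which is the signal used to flag misclassified catchments.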
Once the clustering allocations were validated using the pairwise distance measures, the catchment groupings were mapped and used to describe the variation within chalk stream hydrology and flow pathways, and the potential for the application of NFM for each group was assessed according to its specific physical and hydrological qualities.

Results
The stepwise redundancy analysis revealed that the combinations of the following catchment properties significantly explained the co-variation of hydrological variables (p ≤ 0.001): area, uncommon bypass flow, impermeable soils, slowly permeable soils, station elevation, form factor, and urban land use (Table 2). The adjusted R² values, representing the proportion of hydrological variance that is explained by the physical catchment properties, were 0.682 for the global model and 0.675 for the reduced model. Only the first two axes were used for analysis (axis 1: 77.4%, axis 2: 35.9%), as the third axis and beyond were statistically insignificant. Variance inflation factor (VIF) values (Table 3) revealed that the remaining 9 variables have very little collinearity (VIF values < 5). Although the variables of uncommon bypass flow and slowly permeable soils were mildly collinear, they were both retained because removal of either caused a substantial drop in the explanatory power of the model. The 15 catchment variables removed via the forward selection process did little to reduce the adjusted R² value, suggesting that the discarded variables mostly generated noise.
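The effect of the adjustment can be illustrated with the commonly used Ezekiel formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − m − 1), where n is the number of catchments and m the number of explanatory variables. The Python sketch below assumes this formula and uses a hypothetical unadjusted R² of 0.70 for both models; only n = 198 and the 21-variable global model are taken from the text, and the seven-variable count corresponds to the properties listed above.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Ezekiel's adjusted R2: penalises R2 for the number of explanatory variables."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


n_catchments = 198
raw_r2 = 0.70  # hypothetical unadjusted value, used for both models

adj_global = adjusted_r2(raw_r2, n_catchments, 21)
adj_reduced = adjusted_r2(raw_r2, n_catchments, 7)
print(f"21 predictors: {adj_global:.3f}, 7 predictors: {adj_reduced:.3f}")
```

For the same unadjusted fit, the model with fewer predictors is penalised less, which is why a reduced model can lose variables while its adjusted R² barely changes.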

Clustering and Cluster Validation
Catchments were grouped into three clusters using the average linkage hierarchical clustering algorithm. The data's tendency for clustering was established prior to this via the Hopkins statistic (0.752; values between 0.5 and 1 indicate a clustering tendency). Average linkage hierarchical clustering was selected out of the four algorithms because it returned the highest cophenetic correlation and smallest value for the Gower distance (Table 4). Although four clusters were identified as the optimum number for this dataset according to the Mantel correlation (Figure 3a), three clusters were used because the fourth was not associated with any vectors on the biplot, making it difficult to interpret. Silhouette widths were used to quantify and map the uncertainty in catchment group allocations (Figure 4c). An average silhouette width of 0.50 indicates that catchments were generally classified correctly, with all misclassifications occurring in group 1 (Figure 3b). Silhouette widths below 0.31 were considered uncertain due to being close to the decision boundary, and catchments with silhouette widths below 0 were misclassified [52] (p. 70). A drop in average silhouette width from 0.53 with four clusters to 0.50 with three clusters indicates that group cohesion is slightly reduced by this decision. To mitigate against this, the nine misclassified catchments (Figure 3b) were reclassified and displayed as their nearest neighbour alternative classification in Figure 4b.
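The two silhouette thresholds translate into a simple three-way labelling. The Python sketch below applies the cut-offs stated above (below 0 misclassified, below 0.31 uncertain); the list of widths is invented, and the status names and the handling of values exactly on a threshold are assumptions made for illustration.

```python
from collections import Counter


def allocation_status(width, uncertain_below=0.31):
    """Label a catchment's cluster allocation from its silhouette width."""
    if width < 0:
        return "misclassified"
    if width < uncertain_below:
        return "uncertain"
    return "confident"


# Invented silhouette widths for seven catchments.
example_widths = [0.72, 0.55, 0.31, 0.30, 0.12, 0.0, -0.04]

statuses = [allocation_status(w) for w in example_widths]
summary = Counter(statuses)
print(dict(summary))
```

Counting the labels in this way gives the proportion of a sample with uncertain or misclassified allocations.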

Discussion
The explanatory power of the parsimonious RDA model is within the acceptable range for RDA models that characterise hydrological variation [58–60]. The model has been used to group catchments with similar physical catchment properties and river regime properties using their coordinates from the RDA plot. Thus, the cluster classifications were used to infer catchment typology and hydrological conditions, including rates of aquifer recharge and rapid runoff generation. The cluster classifications are used as a screening process to identify which NFM interventions can theoretically be applied or ruled out of each catchment typology based on the dominant physical catchment properties identified by that group as shown on the RDA plot (Figure 4).
It is acknowledged that this approach relies heavily on the partitioning of catchments into groups. Hence, care was taken to reduce uncertainty in cluster allocation by checking the data's tendency to cluster and by selecting the number of clusters and the clustering algorithm that made the most statistical sense. However, uncertainty in cluster allocation cannot be removed entirely, and 13.6% of the sample had uncertain group allocations (Figure 4c). These catchments share traits that would allow them to be comfortably placed in two of the three groups and are therefore understood to be intermediate catchments that do not easily fit into a single classification. In these cases, the combined traits of the two catchment classifications should be taken into account when selecting NFM options for these catchments (alternative groupings for highly uncertain catchments are provided in Appendix A). Catchments that have been reclassified can be confidently allocated to their group.

Group 1: Large Catchments
Group 1 catchments have high Qmean and Qmax values that are explained by their large size (Figure 4). These are the largest catchments within the sample, with topographic catchment areas ranging from 108 km² to 1459 km². Qmax has a strong negative correlation to station elevation, demonstrating that catchments with higher discharges tend to be those closer to sea level and further downstream, resulting in large water accumulation. Qmean, Qmax, and catchment area have weak correlations (shown by orthogonality) with all other variables in the model (Figure 4a), rendering it difficult to comment further.
Previous research suggests that NFM becomes less effective at reducing flooding in catchments greater than 20 km² [11,61]. Furthermore, evidence shows that increasing the area impacted by NFM measures does not always increase the gains in flow attenuation [13,41]. The lack of substantial evidence for NFM implementation at a larger catchment scale has become a barrier to the general uptake of NFM [62]; however, it must be acknowledged that very few catchment-scale NFM schemes have been implemented [13,41]. The NFM evidence base at this scale is mostly provided by risk-based predictive models [63], which introduces computational limitations to catchment-scale NFM research and its subsequent findings [64]. Nonetheless, it is generally accepted that NFM is most effective for (and possibly limited to) catchments <20 km².
Metcalfe et al. (2018) [64] argued that a large proportion of the benefits of NFM come from its ability to de-synchronise sub-catchment flood waves, which theoretically works at every catchment scale [65]. De-synchronisation is achieved by implementing small-scale NFM schemes in carefully chosen sub-catchments, designed to attenuate individual sub-catchment flood waves and so reduce the likelihood of flood wave synchronisation. Whilst it is recognised that the flood peak reductions from this process are often small, they can be enough to prevent bank overtopping in events up to the 1:100 year scale [66]. Additionally, this method is only suitable in situations where the configuration of the sub-catchments currently causes synchronised flood waves. Dixon et al. (2016) [67] illustrate that, in some cases, NFM placement can synchronise flood waves that were previously staggered, resulting in an overall increase in flood risk. Where de-synchronisation of flood waves is the goal, experimental modelling is required in the design process and prior to implementation of an NFM scheme to mitigate such risks [64,68].
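The timing argument above can be illustrated with a minimal sketch. It is not drawn from the cited studies: the triangular hydrographs, peak flows, and 4 h delay are hypothetical values chosen only to show how shifting one tributary's flood wave can either lower or raise the combined peak. For simplicity, the NFM effect is represented purely as a delay, with no attenuation of the tributary peak.

```python
def triangular_hydrograph(peak_time, peak_flow, half_width, n_steps):
    """Return hourly flows (m3/s) for a symmetric triangular flood wave."""
    return [
        max(0.0, peak_flow * (1 - abs(t - peak_time) / half_width))
        for t in range(n_steps)
    ]


def combined_peak(wave_a, wave_b):
    """Peak flow downstream of the confluence (simple superposition)."""
    return max(a + b for a, b in zip(wave_a, wave_b))


N_STEPS = 48  # hourly time steps (hypothetical event length)
DELAY = 4     # hypothetical delay (h) introduced by NFM on tributary B

# Tributary A is left untreated and peaks at hour 20.
trib_a = triangular_hydrograph(20, 10.0, 8, N_STEPS)

# Case 1: B currently peaks with A; delaying B staggers the waves.
b_before = triangular_hydrograph(20, 10.0, 8, N_STEPS)
b_after = triangular_hydrograph(20 + DELAY, 10.0, 8, N_STEPS)
peak_sync_before = combined_peak(trib_a, b_before)
peak_sync_after = combined_peak(trib_a, b_after)

# Case 2: B currently peaks 4 h before A; the same delay synchronises them.
b_early_before = triangular_hydrograph(16, 10.0, 8, N_STEPS)
b_early_after = triangular_hydrograph(16 + DELAY, 10.0, 8, N_STEPS)
peak_stag_before = combined_peak(trib_a, b_early_before)
peak_stag_after = combined_peak(trib_a, b_early_after)

print(f"Synchronised tributaries: {peak_sync_before:.1f} -> {peak_sync_after:.1f} m3/s")
print(f"Staggered tributaries:    {peak_stag_before:.1f} -> {peak_stag_after:.1f} m3/s")
```

In the first case the delay lowers the combined peak; in the second, the identical intervention raises it. This is the risk described by Dixon et al. (2016) [67] and the reason modelling is needed before design.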
As none of the catchments found in group 1 are <20 km², the evidence is lacking to support the effective use of NFM within these river basins. However, this does not prevent these larger catchments from being broken down into multiple sub-catchments or NFM opportunities from being identified and applied locally for point-source flooding. An example of this could be to implement bunds on a sloped field to prevent field runoff from flooding an adjacent road. In specific situations, it may be suitable to investigate de-synchronising flood waves to reduce the incidence of bank overtopping. Such areas would be downstream river sections where multiple sub-catchments contribute to flood waves, and relatively small changes in river levels alter the risk of bank overtopping [64,66]. Extensive research and considerable time are required to design and implement effective NFM strategies for de-synchronisation [68], so it is only recommended where combined flood waves are known to be a problem. The low drainage density of permeable chalk catchments, however, limits the opportunities for de-synchronisation.

Group 2: Permeable Catchments
These catchments are the stereotypical chalk streams: small catchments in which groundwater recharge is high and groundwater inflows dominate the river regime via baseflow. Group 2 catchments have river regime variability closely related to high BFI values and are associated with a large proportion of uncommon bypass flow soil types and higher station elevations (interpreted here as a higher proportion of headwater catchments). Uncommon bypass flow is a class from the HOST soil classification that describes thin (aquifer within 2 m), permeable, and unconsolidated soils with micro and macro pores [24]. These soils have a vertical saturated hydraulic conductivity of >10 cm day⁻¹ [69], allowing highly efficient recharge via vertical drainage into the chalk aquifer. This is linked to reduced runoff generation [70], limiting the potential for NFM implementation because a large proportion of the water transfer throughout the catchment occurs below the ground surface. Due to the lack of significant runoff, NFM treatments that aim to reduce runoff generation or store surface water within the catchment are highly unlikely to have a significant effect on streamflow. Examples of such interventions include winter cover crops, changes in tillage practices, lengthening drainage pathways, planting across slopes, online and offline storage ponds, wetlands, and controlled flood zones [13]. What remains are NFM techniques that focus on in-channel interventions to reduce downstream conveyance [13], such as channel realignment, sustainable urban drainage systems (SuDS), de-culverting covered river channels, and increasing in-channel, riparian, and marginal vegetation [6,71]. Whilst in-channel river restoration schemes such as these can significantly reduce flood peaks in small catchments [8,72], it must be emphasised that restoration schemes and in-channel interventions deliver the best results as part of a suite of other NFM measures [73]. Therefore, whilst beneficial, in-channel measures alone are unlikely to deliver the optimum impact of NFM interventions.
Despite a lack of surface-runoff generation in highly groundwater-dominated catchments, such as those in group 2, surface water can occur due to groundwater emergence. In this case, the water table rises to intersect with the ground surface, forming static pools of water in topographic low points called turloughs [74] and intermittently flowing river channels called winterbournes [75], and can activate springs at weak points and fractures in the chalk [20,76]. These phenomena generally occur after large quantities of prolonged rainfall. Previous groundwater-emergence floods have been recorded after rainfall events that double and triple the long-term averages [74,77–79]. Under such conditions, NFM has been shown to be far less effective because engineered and natural stores of water, such as soil storage, storage ponds, log jams, and groundwater stores, become full and are overwhelmed [11,13]. The presence of turloughs, winterbournes, and springs demonstrates that water transfer throughout the catchment is not uniform across space and is dictated by the location, size, and activation of hydrogeological features such as fractures [19,80–82]. As a result, NFM schemes in these catchments will require in-depth local knowledge to design them, with sensitivity to the local hydrogeological features. We suggest that NFM interventions in this group of catchments be focused in areas where groundwater emergence occurs as winterbournes and springs to intercept or store flood flows, particularly where the resulting surface water causes disruption or damage to property. For example, in-channel interventions, such as log jams and online storage ponds, could be installed in the path of known winterbournes and springs to intercept flows when the water table is high. The potential benefits of this kind of NFM installation have not yet been tested.
For these catchments, it is suggested that NFM will have diminished effectiveness due to a lack of significant surface runoff. This limits suitable NFM interventions to mostly instream channel modifications to reduce rapid conveyance, which will be less effective compared to a full suite of NFM interventions [13,73]. In many cases, it may be more economical to map and maintain local knowledge of groundwater emergence for flood-risk mapping. Previously, major flood damages were caused by urbanisation of forgotten, dormant winterbournes and springs, which then activate under heavy rainfall [78]. These mapping efforts would also be instrumental in designing and locating potential NFM schemes near hydrogeological features, such as springs and winterbournes. Therefore, effective flood mitigation measures are associated with tracking water table heights, flood warnings, flood mapping, and making a concerted effort to understand spatial changes in hydrogeology. Long-term groundwater emergence and flood-risk maps are required for appropriate flood planning, for identifying locations for NFM schemes, and for reducing flood risk by restricting building and development on areas at risk of groundwater emergence.

Group 3: Less Permeable Catchments
Group 3 catchments are associated with high RBFI values (Figure 4), which are best explained by a higher incidence of impermeable and slowly permeable soils, the presence of urban land use, and higher values of form factor (indicating more circular catchment shapes). These are the chalk catchments with the largest proportions of surface runoff.
Higher RBFI values describe greater flow variability and steeper, higher-magnitude discharge peaks. The relationship between soil permeability and rapid runoff generation is well documented: impermeable soils and surfaces generate greater quantities of surface runoff and quick-flow catchment pathways [24,31,70,83,84]. Work previously performed on chalk stream catchments by Ascott et al. (2017) [31] concluded that impermeable superficial deposits, such as those found in group 3 catchments, slow the vertical conveyance of water into the aquifers, reducing recharge and groundwater dominance in the river regime and flood response. Conversely, a reduced rate of recharge and sub-surface absorption of rainfall will increase surface-runoff generation.
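As an illustration of why flashier regimes score higher, the sketch below assumes that RBFI denotes the Richards-Baker flashiness index, taken here as the sum of absolute changes between consecutive flows divided by the total flow. The acronym is not expanded in this section, so both that reading and the exact denominator convention are assumptions. The two flow series are hypothetical and are not data from the study sample.

```python
def flashiness_index(flows):
    """Sum of absolute step-to-step flow changes divided by total flow.

    Assumed form of the Richards-Baker flashiness index; the denominator
    here sums every value in the series, which is one common convention.
    """
    if len(flows) < 2:
        raise ValueError("need at least two flow values")
    total = sum(flows)
    if total <= 0:
        raise ValueError("total flow must be positive")
    path_length = sum(abs(b - a) for a, b in zip(flows, flows[1:]))
    return path_length / total


# Hypothetical daily mean flows (m3/s), for illustration only.
baseflow_dominated = [5, 5, 6, 6, 5, 5]   # damped response
runoff_dominated = [1, 1, 10, 2, 1, 1]    # sharp, short-lived peak

steady_score = flashiness_index(baseflow_dominated)
flashy_score = flashiness_index(runoff_dominated)

print(f"baseflow-dominated: {steady_score:.4f}")
print(f"runoff-dominated:   {flashy_score:.4f}")
```

The runoff-dominated series scores far higher because most of its flow arrives as a single steep rise and fall, which matches the behaviour described for group 3 catchments.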
As demonstrated by Lane (2017) [13], a large proportion of NFM interventions work by manipulating and intercepting surface-runoff pathways [61]. By virtue of a greater quantity of surface runoff, the full suite of NFM strategies is viable in group 3 catchments, including reduction of rapid runoff generation through soil management and increased catchment roughness, increasing catchment water storage using storage ponds, and reducing the conveyance of water downstream with in-channel and river restoration strategies. This allows many different potential combinations of NFM intervention for optimising results [73], meaning that this group of chalk catchments is the most suitable for the application of NFM in the study sample.
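The screening logic across the three groups can be summarised as a simple lookup. The sketch below is a hypothetical condensation of the discussion, not a tool used in the study: the category labels and the `screen` function are illustrative names. Intermediate catchments with an alternative grouping receive the combined options of both groups, following the recommendation made for uncertain allocations.

```python
# Hypothetical screening table distilled from the group discussions;
# the labels are illustrative shorthand, not terms defined by the study.
OPTIONS_BY_GROUP = {
    1: {  # large catchments
        "local point-source measures",
        "de-synchronisation (prior modelling required)",
    },
    2: {  # permeable headwater catchments
        "in-channel measures",
        "targeted measures at groundwater emergence",
    },
    3: {  # less permeable catchments: full suite
        "runoff reduction",
        "catchment storage",
        "in-channel measures",
    },
}


def screen(primary, alternative=None):
    """Return the sorted NFM options for a catchment.

    `alternative` is the second group of an intermediate catchment
    whose cluster allocation is uncertain; its options are merged in.
    """
    groups = {primary} if alternative is None else {primary, alternative}
    options = set()
    for group in groups:
        options |= OPTIONS_BY_GROUP[group]
    return sorted(options)


print(screen(3))
print(screen(2, alternative=3))
```

A lookup of this kind only rules options in or out; it says nothing about effectiveness at a given site, which still depends on local analysis.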

Applications of NFM in Chalk Catchments
According to the typology of catchments generated via redundancy analysis, three chalk catchments in the UK (Yeading Brook West at North Hillingdon, Catchwater at Withernwick, and the River Dour at Crabble Mill; Appendix A) inherently have the physical and hydrological features best suited to the current range of NFM measures. The study sample accounts for 198 of the estimated 224 chalk streams in the UK. In the study sample, only 25 chalk catchments are <20 km², and of these, 22 are classified as permeable catchments, limiting them to mostly in-channel NFM interventions and highly targeted NFM schemes downstream of hydrogeological features. Overall, this suggests that implementing NFM in chalk groundwater-dominated catchments is likely to have sub-optimal results compared to other catchment types [3,12,85].
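The counts quoted above can be cross-checked with simple arithmetic. The sketch below uses only the figures stated in the text; the variable names are illustrative. The remainder of three small, non-permeable catchments appears consistent with the three catchments named as best suited, although the text does not state this derivation explicitly.

```python
# Figures as stated in the text.
uk_chalk_streams = 224   # estimated total in the UK
sample_size = 198        # catchments in the study sample
small_catchments = 25    # sample catchments < 20 km2
small_permeable = 22     # of those, classified as permeable (group 2)

sample_coverage_pct = 100 * sample_size / uk_chalk_streams
small_share_pct = 100 * small_catchments / sample_size
permeable_share_pct = 100 * small_permeable / small_catchments
small_not_permeable = small_catchments - small_permeable
best_suited_pct = 100 * small_not_permeable / sample_size

print(f"Sample coverage of UK chalk streams: {sample_coverage_pct:.1f}%")
print(f"Sample catchments < 20 km2:          {small_share_pct:.1f}%")
print(f"Small catchments that are permeable: {permeable_share_pct:.1f}%")
print(f"Small, non-permeable catchments:     {small_not_permeable}")
print(f"Share of the sample:                 {best_suited_pct:.1f}%")
```

On these figures, roughly one in eight sampled catchments is below the 20 km² threshold, and only about 1.5% of the sample is both small and not classed as permeable.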
The findings of this analysis do not necessarily negate the use of NFM methods in chalk catchments. Where appropriate, de-synchronisation of sub-catchment flood waves can be pursued in larger catchments classed as impermeable by applying NFM at the local scale. NFM should also be considered on the local scale in areas where groundwater emergence as springs and winterbournes causes disruption or for any other known source of surface water or runoff. Additionally, the environmental benefits of NFM for water quality, aesthetics, soil erosion, and biodiversity are uncontested [62]. There are proven benefits for river managers and catchment partnerships in chalk stream regions to implement NFM for river restoration, water quality, and biodiversity improvements. The application of NFM is a cross-disciplinary, collaborative process, so increased communication and the formation of catchment partnerships have been argued to be further co-benefits [61]. Wingfield et al. (2019) [62] argued that building the evidence base for NFM to irrefutability could take decades and that by hesitating to implement NFM due to a lack of current evidence, potential benefits are lost. Minor flood benefits gained through river restoration will likely have minimal effect for large-scale storms but may reduce the incidence of small-scale nuisance floods [66,67]. Therefore, river restoration is advantageous to flood reduction, but flood reduction should not be considered its main objective unless supported by local, detailed analysis. Additionally, even in group 1 and group 2 catchments, there are likely small-scale point-source locations of pluvial flooding, like flooding of fields or roads, and groundwater emergence that could be combatted with small-scale NFM schemes. However, implementation of such schemes will require better reporting and mapping of local sources of flooding.

Conclusions
The results of a redundancy analysis model were used to generate a typology of chalk catchments, resulting in three groupings according to broadly similar river regimes and physical catchment properties. Using these classifications as a screening tool, the likely effectiveness of applying NFM in each of these groups was discussed. The first class (group 1) is defined by larger catchment size; these catchments are likely unsuited to NFM because it is most effective for catchments <20 km². It is acknowledged that these catchments can be broken down into smaller sub-catchments, possibly to de-synchronise sub-catchment flood waves. De-synchronisation is only recommended in conjunction with hydrological modelling prior to NFM design. It should also be considered that flood reductions from de-synchronising flood waves are most effective for smaller nuisance floods rather than larger flood events. Permeable catchments (group 2) are associated with smaller headwater catchments and high-permeability soils. NFM interventions are less suited due to a low proportion of surface-runoff processes. Large quantities of surface water can, however, be generated due to groundwater emergence at hydrogeological features, such as winterbournes or springs, which activate when the water table is high. Effective flood planning in these catchments is more likely to come in the form of in-channel NFM interventions, hydrogeological mapping, building and planning restrictions, and the development of groundwater emergence early warning systems. We suggest that NFM schemes in these catchments be small-scale and highly targeted to deal with runoff from activated hydrogeological features that would otherwise cause small-scale disruptions (e.g., flooded roads). Catchments classed as impermeable (group 3) are related to the presence of impermeable and slowly permeable soils as well as other less-permeable surfaces, such as urban land use. As a result, a larger proportion of surface runoff is generated, meaning that this category of catchments is the most suited to NFM in the chalk stream sample.
Overall, this study suggests that implementing NFM in chalk groundwater-dominated catchments is likely to have sub-optimal results compared to other catchment types. However, NFM implementation may be justifiable purely on the merit of the multiple environmental benefits, such as improved water quality, aesthetic improvements, reduced soil erosion, increased biodiversity, and collaboration across multiple river-management bodies. This paper provides a first-order triage of the potential for NFM runoff-management methods in chalk catchments. Further work in this field will need to focus on hydrological models that represent the permeability of the soils and the influence of groundwater on stream flows, detailed groundwater emergence and flood-risk mapping, as well as NFM-implementation studies specifically on chalk streams.