Deriving a Bayesian Network to Assess the Retention Efficacy of Riparian Buffer Zones

: Bayesian networks (BN) have increasingly been applied in water management but not to estimate the efficacy of riparian buffer zones (RBZ). Our methodical study aims at evaluating the first BN to predict the RBZ efficacy to retain sediment and nutrients (dissolved, total, and particulate nitrogen and phosphorus) from widely available variables (width, vegetation, slope, soil texture, flow pathway, nutrient form). To evaluate the influence of parent nodes and how the number of states affects prediction errors, we used a predefined general BN structure, collected 580 published datasets from North America and Europe, and performed classification tree analyses and multiple 10 ‐ fold cross ‐ validations of different BNs. These errors ranged from 0.31 (two output states) to 0.66 (five states). The outcome remained unchanged without the least influential nodes (flow pathway, vegetation). Lower errors were achieved when parent nodes had more than two states. The number of efficacy states influenced most strongly the prediction error as its lowest and highest states were better predicted than intermediate states. While the derived BNs could support or replace simple design guidelines, they are limited for more detailed predictions. More representative data on vegetation or additional nodes like preferential flow will probably improve the predictive power.


Introduction
The eutrophication of lakes and rivers remain a global concern despite considerable efforts in the past to reduce the input of nutrients from terrestrial environments [1]. After targeting the industrial and domestic sources, the diffuse agricultural inputs of nitrogen (N), phosphorus (P), sediments, and also pesticides gained importance [2][3][4][5]. Particularly, the legacy of long-term excess applications of fertilizers and manure became eminent [6,7]. In the European Union (EU), for instance, diffuse sources affect now more water bodies than point sources [8].
Many processes and factors control the mobilization and the diffuse transport of agricultural pollutants. Accordingly, various "best management practices" (BMP) can contribute to reduce the risk of pollutants entering water bodies. Riparian buffers zones (RBZ), also termed vegetated buffer strips, are vegetated areas along water bodies which are either too difficult to work or intentionally set aside [9]. RBZ stand among the most widely applied BMP to reduce the diffuse pollution of surface waters but also to improve their hydro-morphological and thermal conditions [9][10][11]. This is reflected by the rich literature on the efficacy of RBZ to retain substances from adjacent upslope areas (henceforth catchments).
The RBZ efficacy is expressed as the ratio of the amount of nutrients, pesticides, or sediment leaving the buffer towards the water body and the amount of material entering the RBZ from the catchment. It depends on RBZ properties and external factors like land management which, in turn, determine the runoff and incoming nutrients [12,13]. The large variability in RBZ efficacy reported in the literature reflects the variability of those factors.
Therefore, modelling approaches to predict RBZ efficacy in large-scale applications tend to be (too) data-demanding, e.g., process-based models like VFSMOD [14] and REMM [15], or (too) simplistic [16]. For instance, Weissteiner et al. [17] used RBZ width as single variable to estimate the efficacy of N and P retention in their model for the EU. This approach is in line with regulations and recommendations on RBZ design which specify adequate widths of RBZ as simple, easily administrable rules to reduce diffuse emissions of nutrients and sediments [18][19][20]. However, the use of RBZ width as the only design variable is not commonly considered sufficient [21,22].
Some studies used sub-datasets and applied multivariate regression to improve the model performance. For example, Zhang et al. [13] derived separate models for different vegetation types (grass and mixed, only trees), again only with RBZ width as the single explanatory variable for N and P retention. For sediment retention, they considered RBZ slope as the second variable which was supported by a previous study that identified RBZ width and slope to be best correlated with sediment retention [23]. In contrast, Yuan et al. [24] found only a weak relationship between sediment retention and slope. In their meta-analysis on nitrate retention, Mayer et al. [25] also tested RBZ width as the only predictor in regression models for different width classes, vegetation types, and flow pathways. The explained variability (r²) in these subsets was partly higher than the r² values for the full dataset. They concluded that more factors need to be considered. In accordance, Hassanzadeh et al. [16] achieved even higher r² values by considering various RBZ and external variables in their regression model.
Nonetheless, such regression models have limitations which can be addressed by Bayesian networks (BN, also termed Bayesian belief networks, [26]). Firstly, BN explicitly deal with the uncertainty in data or conditions because they represent the relationships (links or edges) between system variables (nodes) probabilistically rather than deterministically [27]. Secondly, they can integrate even incomplete quantitative and qualitative information (e.g., "sandy soil") from different domains such as expert knowledge, field data, and other BN. Thirdly, BN can provide information on causes from evidence of observed effects.
Although BN have increasingly been applied for water resource management, decision-making, predictions, and assessments of ecosystem services during the last decades [26,[28][29][30], they have only implicitly considered RBZ retention so far [31]. Therefore, the aims of this study are (i) to develop the first BN to predict RBZ efficacy to retain N and P from RBZ and external factors, (ii) to train and cross-validate the BN based on empirical data compiled from an extensive literature review, and (iii) to evaluate how node selection and the number of their states, i.e., data discretization, affect model accuracy. As our BN aims at medium-to large-scale applications, we focused on commonly available input data and grouped nutrient fractions of N and P.

Data Collection and Preprocessing
We compiled the measured RBZ efficacy for N, P, and sediment and information on generally available RBZ and external variables (first column in Table 1) based on 23 reviews published between 1994 and 2019 [12,13,[22][23][24][25][32][33][34][35][36][37][38][39][40][41][42][43][44][45][46][47][48] (Supplementary Table S1). These variables were selected based on key processes of nutrient and sediment retention discussed in the literature, e.g.,  RBZ width and nutrient form: after entering the RBZ, the velocity of surface runoff decreases due to surface roughness and soil infiltration [22,49,50]. This results in a lower transport capacity and a deposition of sediment and particulate nutrients, which is an important process for their retention [40]. Positive yet variable relationships of RBZ width and efficacy have widely been acknowledged in reviews and guidelines (e.g., [19,33,48]) with broader RBZ being needed to efficiently retain dissolved nutrients compared to sediment [46].  RBZ vegetation, flow pathway, and nutrient form: dense aboveground vegetation increases the hydraulic roughness and reduces the transport capacity of water for particles, while dense root systems favor soil permeability, porosity, infiltration, and, thus, the sorption of dissolved nutrients (DN, DP) to soil particles as well as their uptake by plants [45,51]. Previous studies reported contrasting results in respect to the efficacy of grassed and woody RBZ [12,52,53]. Most probably, this is due the large number of processes and complex interactions among nutrients, plants, soil, and micro-organisms [54].  Soil texture, flow pathway, and nutrient form: the larger the soil particles, the faster they settle under given flow conditions. While most of the coarse sediment can be retained even in narrow RBZ, the retention of fine sediment requires soil infiltration in RBZ which are more than 15-20 m wide in order to be effective [12,23,55]. In addition, soil composition (e.g., texture) is decisive for P sorption, hydraulic conductivity and, thus, for surface runoff, soil moisture storage, as well as residence time of nutrients in the root zone [40,47,51,56,57]. While sandy soils favor infiltration, surface flow predominates on clayey soils.  RBZ slope and width: RBZ are expected to be less effective in reducing the transport capacity of water flow as well as in trapping sediments and nutrients in steep terrain [13,23,40] thus requiring broader buffers on steeper slopes [48]. Due to the experimental design and data availability, the influence of slope was not significant in other studies (e.g., [47,58]). RBZ slope was used in some regression models as an independent variable (e.g., [13,48]). As far as possible, the measured RBZ efficacy values-preferably from mass (load or yield)-were extracted from the original studies to minimize compilation errors and to avoid double A total of 269 of the collected datasets were complete, i.e., with values for all variables shown in Table 1 ( Table 2). The others missed RBZ slope (n = 52) and soil texture (n = 61). Slope was missing for 39 groundwater studies, i.e., 27% of all, thus shifting the dataset towards studies on surface flow. To increase the sample size, we calculated missing average values from ranges, e.g., the minimum and maximum efficacy given for RBZ of 10-15 m width was converted to an average value of 12.5 m width, and applied no quality criteria apart from data completeness.
Two thirds of the 580 efficacy values with complete data originated from plot studies and one third from field and watershed studies, with different experimental designs and time scales. Rainfall data was particularly inconsistently reported, e.g., as average values or intensities, and was therefore excluded from the data collection.  13 12 Variables (nodes) All Slope, soil texture 1 One study with partly incomplete soil data, 2 three countries without complete data.

Aggregating the Nutrient Forms
We collected the efficacies for five nutrient forms and sediment. In order to reduce the number of states and increase the number of data sets for each state, we applied preliminary regression analyses to quantify the similarity of RBZ efficacy for particulates (mostly sediment) and total N and P (TN, TP) as well as the dissolved forms (DN, DP) before setting up the BN.

BN Design-Nodes and States
BN are directed acyclic graphs which link variables (called parent and child nodes) without feedback loops. The nodes and links of the BN were based on the literature listed above: the RBZ and external variables in the first column of Table 1 were considered as independent parent nodes directly influencing RBZ efficacy as the final child (output) node of the BN (Figure 2).
In a BN, discrete states have to be defined for each node. They have to be exhaustive and mutually exclusive. Apart from flow pathway, the states of the nominal variables were groups of the reported values (see columns three and four in Table 1). The original nutrient forms were aggregated to the three states "particulate", "mixed", and "dissolved" based on the preliminary regression analyses. The various values of soil texture were grouped into (i) the three states "fine" (clayey soils), "medium" (loamy and silty soils), and "coarse" (sandy soils), and (ii) only two states "fine" and "coarse" (comprising loamy, silty, and sandy soils). The plant species composition, structure, and management of RBZ varied widely in the literature and were simplified to "grass" (herbaceous) and "woody" (with shrubs and trees).
In contrast to the nominal variables, the three continuous variables had to be discretized. Different discretization approaches exist but seemingly no optimal one [59,60]. Given the uneven value distribution and to acknowledge the limited data availability for parent nodes, we used the common equal-frequency approach. For RBZ width, however, we slightly adjusted the quantiles to match legal regulations. Since one aim of this study was to assess the effect of state definition on model accuracy, different numbers of quantiles were used (see Table 1).

BN Training and Evaluation-Importance of Nodes and State Definition
BNs with different number of states were trained using efficacy values with data for all parent nodes (n = 580) and the Maximum Likelihood parameter estimation. The training filled the conditional probability tables (CPT) which describe the probability of each state of the child node considering all possible states of the parent nodes, based on Bayes' theorem. The different state definitions of four nodes (see Table 1) resulted in 72 different BNs. A 10-fold cross-validation-i.e., 90% of the data used for training and 10% for validation-was repeated 10 times for each of these BNs to compare prediction (classification) errors [59,61], i.e., the share of wrongly predicted output states. Ten different probabilities and prediction errors were thus obtained for each BN, ranging from 0 (all predictions correct) to 1 (all predictions wrong).
In order to assess how the state definition (state number) of the parent and output nodes affected the model performance, we used the non-parametric Wilcoxon Signed Rank test to determine whether the prediction errors differed significantly for parent nodes with different numbers of states (p < 0.05).
We also compared the complete BNs with all parent nodes to simplified BNs where the least important nodes were removed. The node importance was evaluated in two ways. Firstly, we applied the non-parametric Kruskal-Wallis test to each parent node to assess whether the RBZ efficacy for at least one state significantly differs from the other states (p < 0.05). Following Moe et al. [62], we also performed a classification tree analysis for each of the 72 possible state combinations. For all parent nodes we compared their relative importance and their occurrence in the classification trees to a random binary variable. In the simplified BN, we did not consider nodes with an influence close to the random variable and with a low occurrence in the classification trees (≤33%). As the values of the random variable vary with each repetition, we averaged the outcomes of 10 analyses to avoid misclassifications. The cross-validation and state evaluation were repeated with the 72 simplified BNs. We discarded undefined output from validation data outside the training data.

Implementation
All statistical analyses were conducted with the software R [63]. For the training and cross-validation of the BNs we used the R package bnlearn [64], and rpart [65] for the classification tree analysis. To exemplarily show the influence of individual parent nodes on the output, we trained an "optimal" BN-based on the prediction errors and the CPT size-with the full dataset. For the node of interest, we determined the BN output for each possible state by setting its probability to 1 ("hard evidence") while assuming an equal probability of all states of the other parent nodes ("soft evidence"). These calculations were conducted using the R package gRain [66].

Aggregating the Nutrient Forms
The retention of TN and TP were well correlated (r² = 0.64). Both were also more significantly correlated to sediment retention values (Figure 3, r² = 0.40-0.48, p < 0.0001) than the DN and DP retentions (r² = 0.09, p = 0.003). Based on these results, we considered TN and TP together as a separate "mixed" state of the node "nutrient form" and distinguished it from the "particulate" state (mainly sediment) and the "dissolved" nutrient forms, for which sedimentation is less important.

Variability of RBZ Efficacy and Importance of Nodes
The reported RBZ efficacy ranged from complete retention to negative values, i.e., where the RBZ acted as sources. The distribution of all values was left-skewed with a median of 68% and a mode of 87%. This variability also resulted in a strong overlap between the states of the parent nodes ( Figure 4). Therefore, even wide buffers can have a rather low efficacy (Figure 4a). However, the Kruskal-Wallis test revealed significant differences in RBZ efficacy among the states of all parent nodes (p ≤ 0.0003) except the flow pathway (p = 0.07) and RBZ vegetation (p = 0.8). For instance, RBZ are generally more effective with coarse soils than with fine and medium soils (Figure 4c) except for DP (not shown).
The classification tree analysis confirmed the strong influence of nutrient form, RBZ width, and soil texture on the RBZ efficacy ( Figure 5). Despite its comparatively low importance (9%), RBZ slope occurred in 60% of the classification trees. The least important variables were RBZ vegetation and flow pathway (6%). Both variables rarely occurred in the trees, although they were more influential than a random variable (1.5%). This was especially the case for the flow pathway.

Cross-Validation and Evaluation of BNs
Fewer output states resulted in lower prediction errors, with average values over all BNs ranging from 0.31 (two states) to 0.66 (five states). However, the BNs predicted the lowest and highest output states significantly better than the intermediate states ( Figure 6). These errors correspond to 62% of the expected error of a random predictor (E) for the two-state output (E = 0.5) up to 82% for the five-state output (E = 0.8). The values for the minimum and maximum states equaled on average 63% of E.
The prediction errors for RBZ efficacy were significantly lower when the nutrient form was "particulate" or "mixed" rather than "dissolved", on average by 0.05 (−0.05-0.15, Figure 7). The impact of the flow pathway and the RBZ vegetation on the prediction error was negligible. The error even decreased on average by 0.01 after the BNs were simplified, i.e., by excluding the RBZ vegetation and the flow pathway. Limiting the database to the respective dominant state of these two nodes, i.e., to grassed RBZ and surface flow (n = 391), had no additional effect on the error.
The prediction errors were generally lowest when the parent nodes had three states (Table 3). However, for some BNs with all six parent nodes ("complete BN" in Table 3) choosing four states for the node "RBZ width" resulted in better predictions. Choosing two states, on the other hand, increased the predictions errors by up to 0.05 for RBZ width.

BN Cross-Validation and Evaluation
We tested the influence of nodes and states on RBZ efficacy in order to identify the minimum data requirement while ensuring no significant decrease in model performance and reliability. In this sense, the simplified BN with four out of the six considered parent nodes and with three (slope, soil texture) or four states (width) for the parent nodes as well as two to three output states can be considered as "optimal" among the tested alternatives ( Figure 8A). The CPT size, i.e., the product of the state numbers, is 216 or 324 elements-which is manageable given the available input data [59]. Nonetheless, 0.3% or 0.9% of the predicted values during the cross-validation were undefined (compared to 0.2% for the smallest CPT size).
In accordance to the reviewed literature (cf. Methods), the BN reflects that a high RBZ efficacy was less probable for narrow than for broad RBZ ( Figure 8B, a), for fine sediments than for coarse sediment ( Figure 8B, b), and for dissolved nutrients than for particulate nutrients ( Figure 8B, c). The more complex pattern for RBZ slope ( Figure 8B, d) depended on the nutrient form (not shown). The higher probability of low RBZ efficacies for medium slopes occurred only for TN and TP. For particulate and dissolved nutrients, the efficacy of RBZ declined with increasing slope. Considering the dominance of studies on surface flow, this pattern can be explained by higher flow velocity and transport capacity on steeper slopes (cf. Methods). Nonetheless, it differs from findings of Zhang et al. [13], based on a smaller dataset, that sediment retention decreases with slopes steeper than 10% but increases with slopes below this threshold. At the plot scale-which dominates the collected data-flow resistance and other flow characteristics varies with slope gradient but also depends on the fragmentation of riparian vegetation [67] which is usually not available from RBZ studies. Following Zhang et al., more research on slope effects on RBZ efficacy is recommended to revise the inconsistent probabilities.
The use of these commonly available variables as nodes facilitates the envisaged application and validation of this first BN to estimate the retention efficacy of RBZ. This contributes to a better evaluation at medium to large scales of how RBZ can help to improve the water quality. The BN was significantly better in distinguishing effective (high state) from ineffective RBZ (low state) than between intermediate states ( Figure 6). With two output states, the BN can for instance support or replace guidelines and design aids for effective RBZ based on single factors. For quantitative assessments, the discrete output states could be replaced by the average values of 45% and 90% efficacy in our database, or 30%, 70%, and 90% if a third state is needed. The high prediction errors hamper the use of more intermediate efficacy states at least in combination with our heterogeneous empirical dataset. Note that the BN is not intended to replace but rather complement other modelling approaches. For instance, Figure 8b illustrates that the BN can deal with uncertain evidence and provides probabilities of RBZ efficacy. Nonetheless, the high influence of RBZ width, soil texture, and slope emphasize the need of multivariate approaches in RBZ analyses. Although the loss of information is a potential limitation of discrete BN [59], defining more than two states for parent nodes was only slightly helpful to improve the model accuracy. The prediction errors as well as the variability and the significant inconsistency among the reported efficacy values may indicate the need for supplementing BN nodes. Even with more input data than in previous studies, our analyses indicate that additional parent nodes are not necessarily useful to improve predictions if the data is biased and does not cover the variability of site conditions. For instance, the low influence of RBZ vegetation is in line with the similarity of woody and grassed RBZ as observed in previous studies (cf. Methods). However, the dominant grassed RBZ were on average narrower (9 m) than the woody RBZ (22.6 m). Moreover, the common distinction between grassed and woody RBZ does not consider the influence of different plant groups and species on the nutrient retention in RBZ [68,69]. Vegetation and root density seem to be promising to refine the state definition and to improve model predictions (cf. Methods), but such data is unavailable in existing RBZ studies.

Further BN Development
The current BN can be improved in different ways. The collection of training and validation data could be limited to specific experimental designs, scales, or site conditions. This simple approach would minimize the impact of some factors not explicitly considered in the BN at the expense of restricting our intended applicability at large scales. More nodes and states, on the other hand, require enough representative training data to fill the rapidly growing CPT. Expert knowledge can reduce the data demand (e.g., [70]) but its validation can be challenging [71]. We exemplarily discuss below constraints of the available data regarding the design and application of the BN (and other empirical models).
Firstly, only less than 20% of the datasets were on subsurface flow. The underrepresentation was most striking for DP (n = 10). Although limiting the dataset to surface flow did not reduce the prediction errors, the observed low influence of the flow pathway on RBZ efficacy and the (slightly) higher errors for dissolved nutrients (Figure 7) should be scrutinized with more (DP) data. However, revising the role of flow pathway may also require considering the contrasting effects of external factors like redox conditions on the retention of P (oxidation/reduction of iron oxides) and N (denitrification) either as separate node or as additional DN and DP states of the nutrient node.
Secondly, the current BN may overestimate the efficacy of "real-world" RBZ because most literature values were measured in rather short-term studies. However, aging buffer strips may turn from sinks to sources [34,53,72] and foster eutrophication if poorly maintained [72][73][74]. Negative values occurred in 26 datasets and comprised mostly DP (n = 16) and DN (n = 9), followed by TN (n = 3), TP (n = 2), and PP (n = 1). Given the variety of underlying processes and the small number of cases, they could not be considered as a separate state in the output node. Apart from the remobilization of P from saturated RBZ soils as DP [34,45], P-rich colluvium in RBZ can be remobilized during heavy rainfall or flood events. Detailed maps on water-soluble P, degree of P saturation, and residual soil P in agricultural areas (e.g., [7]) might be helpful to overcome this limitation in future BN. The data availability is, however, more challenging for other processes such as the decay of plants [74,75].
Thirdly, other below-average retention efficacies e.g., for sediment retention in our data collection were explained by conservation tillage (CT) [76,77] and preferential flow [78] in the catchment of RBZ. CT, i.e., minimum-and no-till, has increasingly been used worldwide during the last decade [79], and might do so in the future [80], to reduce sediment and PP losses from agricultural land. However, the accumulation of labile P pools in top soils and the development of preferential flow paths belowground can unintendedly impair the RBZ efficacy to retain soluble, immediately bioavailable P [34,81]. No-till can also increase the water and nitrate flux entering RBZ from adjacent fields [82]. While more water flux can reduce the nitrate retention in subsurface flow [46], higher (initial) concentrations favor its retention [47].
In the few available quantitative studies on CT, no-till reduced significantly the RBZ efficacy compared to conventional till, notably except for P (Figure 9a). However, most of the values already belonged to the lowest efficacy state (RBZ efficacy < 55%). Therefore, the influence of CT on the BN output would currently be rather low. It might also not be representative for other output states. By separating the soil texture in RBZ and their catchments, the BN could consider some of the complex impacts of CT together with nutrient form.
As a side effect, these two BN nodes can generally integrate the effects of hydrology [46,87], land use, and land management [16,21]. These variables affect the concentration [47,88], timing, amount, and composition of nutrient and sediment entering RBZ and have been used in other empirical RBZ models (e.g., [16,46]).
Fourthly, most published RBZ studies were plot studies which are typically constructed with uniform slopes to favor sheet-flow conditions under which RBZ work best [48,89]. However, they neglect the tendency of water to converge and develop preferential flow paths at larger scales and within natural terrain [90][91][92]. According to the very few available quantitative studies, preferential flow can reduce the efficiency of RBZ [78,85,86,93] (Figure 9b, left). Nonetheless, RBZ may perform equally well depending on RBZ width and structure as well as the runoff conditions [86,91] (Figure  9b, right). According to our data collection, plot studies reported on average less effective RBZ than field-and watershed studies which can be explained by the narrower RBZ in these plot studies. Therefore, more field studies are needed to reliably quantify how preferential flow affects the efficacy of RBZ.
In an extended BN, the current output state could be reduced based on the probability of preferential flow. The likelihood of preferential surface flow is often estimated from digital elevation models by applying area thresholds or variants thereof [94] or from linear anthropogenic features [95]. Similar slope-area relationships are known for e.g., the topographic index [96,97] and soil erosion [98] which also influence the variability and composition of nutrients.

Conclusions
The discrete, multivariate BNs to predict RBZ efficacy developed and evaluated in our study are flexible tools which e.g., provide probabilities instead of exact outcomes. They predict sufficiently well the RBZ efficacies collected from previous studies. The "optimal" BNs have four parent nodes (width, slope, nutrient form, and soil texture) with more than two states and up to three states for RBZ efficacy. The resulting CPTs are manageable with the available input data. RBZ vegetation and flow pathway are found to be not relevant.
The model performance is better for the highest and lowest states of RBZ efficacy compared to the intermediate states. The BNs could thus complement or replace existing design guidelines for RBZ. The expected limitations for more detailed predictions arise from missing data for other variables as discussed in the literature as well as the heterogeneous database which does not adequately represent all site conditions. This is relevant for the vegetation type which typical states "grass" and "woody" comprise different species. Additionally, most data are obtained from plot-scale studies which typically do not consider preferential flow. More field-scale studies are needed to better represent preferential flow in BNs. Future research should focus on the database to better cover the different state combinations and to provide new data to train and validate the relationships.
Supplementary Materials: The following are available online at www.mdpi.com/xxx/s1, Table S1: Data table,  File