Probabilistic characterization of the vegetated hydrodynamic system using non-parametric bayesian networks

: The increasing risk of ﬂooding requires obtaining generalized knowledge for the implementation of distinct and innovative intervention strategies, such as nature-based solutions. Inclusion of ecosystems in ﬂood risk management has proven to be an adaptive strategy that achieves multiple beneﬁts. However, obtaining generalizable quantitative information to increase the reliability of such interventions through experiments or numerical models can be expensive, laborious, or computationally demanding. This paper presents a probabilistic model that represents interconnected elements of vegetated hydrodynamic systems using a nonparametric Bayesian network (NPBN) for seagrasses, salt marshes, and mangroves. NPBNs allow for a system-level probabilistic description of vegetated hydrodynamic systems, generate physically realistic varied boundary conditions for physical or numerical modeling, provide missing information in data-scarce environments, and reduce the amount of numerical simulations required to obtain generalized results—all of which are critically useful to pave the way for successful implementation of nature-based solutions.

Despite many studies focused on hydraulic load reduction by vegetation, there is no general consensus or coherent set of guidelines yet that enable uniform integration of vegetated coastal ecosystems in design for vegetation-levee combinations. This maybe due to the varying estimates of wave attenuation resulting from the case studies. For instance, in Table 1, a few flume and field experimental studies have been synthesized showing a variety of hydrodynamic and vegetation characteristics that were tested to determine drag coefficient expressions. It can be observed that the resulting drag coefficient expressions are considerably different and yield distinct values for the same Reynolds number, see [13]. Since the drag coefficient is a critical variable for the calibration of numerical models, estimates of wave attenuation potential of vegetation also differ considerably, which undermines the reliability of vegetation as an integrated element of a flood defense systems. Instead of case-specific varying point estimates, a wide range of results in multiple biohydrophysical conditions are required which could be generalized to develop the implementation guidelines for nature-based solutions. However, not only the generalized results are missing but the boundary conditions required to measure or model those results are also missing. A potential reason could be the lack of holistic system-level description of vegetated hydrodynamic systems which means that the vegetated foreshores have not been quantified as a system with interacting and correlated constituting elements. Therefore, this study provides a first attempt to develop a probabilistic model of vegetated hydrodynamic system which would help in obtaining the necessary boundary conditions to describe and numerically model the system for a variety of situations more efficiently. We synthesize prior studies parameterizing individual variables in the system [21][22][23][24][25][26][27][28][29] and build on that to study complete system dynamics and its response to variations.
Field (Salt marsh) [40] Irregular waves Statistical [41] Statistical evaluation of 35 studies Statistical evaluation of kelp, seagrass, saltmarsh and mangroves    Modeling complex systems requires the (probabilistic) dependence among constituting system components to be taken into account. As a possibility, Bayesian networks (BNs) are being used in different coastal engineering studies [47][48][49][50][51][52][53][54]. A Bayesian network is a directed acyclic graph (DAG) whose nodes represent random variables and whose arcs represent probabilistic dependence. Nonparametric Bayesian networks (NPBN) are hybrid BNs that model the dependence structure using Gaussian copulas and can have both discrete (as long as variables are defined in at least an ordinal scale) or continuous marginal distributions in the same graph [55][56][57][58][59]. A bi-variate copula is a two dimensional distribution of the "ranks" of the data whose marginals are uniformly distributed on [0,1] [60][61][62][63]. Spearman's (conditional) rank correlations, which parameterize the Gaussian copula, are used to define probabilistic influence between parent and child nodes [57,59]. In the case of conditional dependence among variables, recursive formula is used to calculate partial rank correlations [56,64].

Spartina anglica
Similar studies to this work have mostly applied discrete BNs with conditional probability tables [31,51,65] which are typically quantified with synthetic data sets requiring an enormous number of numerical simulations. This makes the quantification of such models dubious or indefensible. For example, consider a model with 10 discrete variables with 4 states each where 9 variables are connected through an arc to the remaining variable. The number of simulations required to quantify the probability table for the node influenced by the 9 variables would be on the order of one million. Therefore, the number of boundary conditions generated by discrete BNs are so extensive that it would make both physical and numerical modeling costly, labor-intensive, and computationally demanding.
This study introduces dependence modeling using nonparametric Bayesian networks for vegetated coastal systems. The system has been parameterized using continuous distributions and likely (conditional) correlations among variables. The model represents a consistent joint probability distribution and hence can be used to generate conditions that are physically realistic. It adds value to numerical modeling by reducing the number of simulations required to get meaningful generalized results. The paper has been structured as follows: method for parameterization and stochastic modeling in Section 2, resulting NPBNs with dependence information in Section 3, value of dependence modeling in Section 4, and conclusions in Section 5.

Methodology
Vegetated hydrodynamic systems were schematized, parameterized, and probabilistically modeled in the user-defined random variable mode of UNINET [57,59], see overall methodology in Figure 2. The parameterization was conducted for benthic, submerged, and emergent vegetation, and the system was categorized into three variable families: (i) hydrodynamic, (ii) vegetation, and (iii) hybrid (dike) variables. Each variable was described with a continuous marginal distribution along with (conditional) rank correlations among variables determined from literature, data, or "in-house" expert judgment. Gaussian-copula based nonparametric Bayesian networks were created for submerged (salt marshes) and emergent (mangroves) vegetation, both combined with a seagrass meadow in the lower intertidal. Monte Carlo sampling was carried out from the NPBNs which resulted in realistic physical conditions sampled from a stochastic model that takes multivariate dependence into account.

System Schematization and Parameterization
To simplify the system hydrophysically, a 1-dimensional cross-shore profile and a dike without a berm were schematized. Vegetation was schematized as rigid cylinders and seagrasses, saltmarshes, and mangroves were modeled as benthic, submerged, and emergent vegetation respectively.

Monte Carlo Sampling
Vegetated hydrodynamic conditions are sampled from NPBNs with correct dependence structure Step 5

Interpretation
Results are generated, interpreted and communicated.

Figure 2.
Overall methodology and steps for stochastic modeling in UNINET.

Schematization
The schematization in Figure 3 was idealized and divided into 6 segments based on the dominant physical and hydrodynamic processes. Hydrodynamic boundary conditions were defined at the offshore boundary. After nearshore transformation on the ramp, waves reached till the point where they started to "feel" the bottom. At this point on the foreshore, bottom friction started playing a role, mimicking both bed roughness and the benthic vegetation. Waves then encountered a hybrid flood defense system, consisting of mature vegetation (a saltmarsh or mangrove forest) and a dike at the landward extent of vegetation, see Figure 3. The toe of the dike was fixed at z toe = 0 m and the rest of the depths were calculated as positive upwards from this reference.  Vegetation was schematized based on the relative depth between water and vegetation height. Seagrasses were idealized as benthic vegetation, salt marshes as fully submerged with the water depth approaching vegetation height, and mangroves as emergent vegetation with multiple vertical layers, see Figure 4. Seagrasses were modeled as bed friction, salt marshes and mangroves as rigid cylinders parameterized by stem height, drag coefficient, frontal width, and vegetation density.

Variable Families
The minimum required number of primary variables which could sufficiently describe the hydrodynamic forcing, a vegetation field, and a dike were used ( Figure 5). Variations of these variables could reproduce a variety of biohydrophysical conditions observed across the globe. All variables used are listed hereunder: • Hydrodynamic: Offshore wave height (H m0 ), peak wave period (T p ), water depth (h), and offshore slope (S 0 ) were grouped as hydrodynamic variables. • Vegetation: Vegetation forest length (L v ) and vegetation slope (S v ) were general vegetation variables which represented forest characteristics for both saltmarshes and mangroves. - Emergent vegetation had three sub categories for each of the vertical layers.
* Stem height (h v,s ), stem frontal width (b v,s ), stem density (N v,s ), and stem drag coefficient (C d,s ) were put in place for the top layer of mangroves.
Hybrid: Lastly, dike slope (S d ) and crest level (h c ) were labeled as hybrid variables.

Hydrodynamic Variables
Generally, wave heights are Rayleigh distributed for a random sea-state but they are Weibull distributed for long time scales [21,66]. Hence, global trends and distribution of wave height and corresponding peak wave periods [67,68] were adopted for H m0 and T p . Water depth accounts for the mean sea level, storm surges, tidal fluctuations, and sea level rise due to climate change. Water depth was kept uniformly distributed within a range that was obtained using a breaker index-based (γ b = H rms /h = 0.8) back calculation so that the waves break near the toe of the dike.

Vegetation Variables
The values for the vegetation variables were synthesized from more than 25 studies to parameterize the vegetation, see Table 1. In addition, similar synthesis results from [31] for salt marshes and from [69] for mangroves were augmented into a literature meta-analysis and were used as a basis to parameterize most of the vegetation variables. However, individual studies that specifically covered global distribution of a certain parameter were preferred over the meta-analysis of this study, e.g., see [27] for mangrove canopy heights.
Moreover, some data was modified during the synthesizing process to optimally fit the study scope. For example, large forest lengths (up to 30 km) [40,65], were curtailed to 1.5 km assuming that the vulnerability of communities protected by such large vegetation forest is negligible. Or dimensionless friction coefficient was calculated using c f = gn 2 h 1/3 from the range of Manning's coefficients (n) produced by [70,71] for various vegetation.

Hybrid Variables
Dike crest levels were defined as the maximum vertical distances relative to the dike toe and were critical for determining the run-up extent. The maximum crest level was estimated by adding maximum water depth, maximum wave height, and minimum freeboard. The dike slope was used to determine the horizontal distance between the toe of the dike and the crest. Typical ranges of dike slopes and minimum free-board were determined by asking design experts who practice dike design in accordance with criteria mentioned by [72].

Stochastic Model Setup
The stochastic model was made in the user-defined mode of UNINET in order to sample varied vegetated-hydrodynamic conditions. UNINET models nonparametric Bayesian networks which are from the family of hybrid BNs characterized by Gaussian copulas. Setting-up the directed acylcic graph of a NPBN requires marginal distributions of random variables as nodes and (conditional) rank correlations as influences on arcs. Continuous marginal distributions of the random variables (nodes) were defined from a range of parametric distributions available in UNINET, see Table 2. For influences (arcs), bivariate rank correlations were determined through data or expert judgment. Once the DAG was setup, analytical conditioning of the NPBN was performed by UNINET and Monte Carlo samples of variables were generated which established the varied vegetated hydrodynamic conditions.  Table 2, preference was given to published studies that explicitly provide parameterized distributions for variables, e.g., offshore wave height [21], peak wave period [22], and mangrove trunk height [27]. Secondly, when explicit estimates of distributions were not available, data over multiple spatial domains and long temporal scales was used. Only hydrodynamic data of such sort was available which was used to cross-check the marginal distribution and was instrumental in determining rank correlations. However, when multispatiotemporal data was not available, local datasets at specific locations were used to derive the first best assessment of the distributions, e.g., salt marsh height, frontal width, and density. If no data was available, literature was consulted to carry out a meta-analysis aimed to identify general trends about variability of a variable which could be used to infer distributions, e.g., drag coefficient, vegetation, and offshore slope. Except for the variables from published studies, beta distribution was preferred for most of other variables due to its ability to give control of both the bounds (range) and the variability within the bounds (density function).

. Correlations and Copulas
Bivariate rank correlations were calculated from various sources of data and in-house expert judgment as presented in Table 2 and elaborated in following sections. Moreover, in case of conditional dependence, where a child node is dependent on more than one parent node, recursive formula [56,64] in Equation (1) where ρ 12|3 is the conditional rank correlation coefficient of random variables X 1 and X 2 given X 3 .

Data Processing
Raw field data was processed to acquire correlation coefficients and synthetic data sampled from an estimated Gaussian copula. An example of data processing is presented for two vegetation variables: vegetation height (h v ) and frontal width (b v ), see Figure 6. The field vegetation data for Chesapeake Bay in the north-eastern part of the US was used from four different stations along a transect containing Spartina alterniflora and Spartina patens species. The raw data in Figure 6a was transformed to the unit square based on its cumulative distribution function (Figure 6b) and the Gaussian copula was estimated (Figure 6c). This copula together with the marginal distributions may be used to sample synthetic data. In Figure 6d, random sampling for both variables was transformed back to original units through the inverse of the cumulative distributions function. Similar analysis of both h v and b v was done in relation to vegetation density and, overall, the rank correlations obtained from this data source were of (h For the hydrodynamic variables, time-series data was obtained for 10 distinct locations over 25 years with 3-hour resolution from waveclimate.com and treated in a similar manner as explained for the vegetation variables. The 10 major deltas with vegetation on foreshores were selected from [74] and representative average values for wave conditions are presented in Table 3. The rank correlation obtained from this data source, as presented in Figure 7, were ρ(H m0 , T p ), ρ(H m0 , h), and ρ(T p , h).

Expert Judgment
In case of unavailability of data or literature, the classical model of expert judgment is an option [55,[75][76][77][78][79] to determine rank correlations between two or more variables. In this case, the classical model could not be fully implemented due to limited resource. However, in-house expert judgment was incorporated through verbal communication with experts and elicitation from a combination of general trends from literature. For example, a higher water level would require a higher dike crest level hence the rank correlation for (h, h c ) was defined positive. Overall, individual estimates of experts about the nature (positive or negative) and strength (value) of correlations were collated together to derive estimates of dependence among variables. The correlations defined based on expert judgment, as mentioned in Table 4, were mainly for vegetation and hybrid variables.  The resulting ranges, distributions, and correlations from literature meta-analysis, data processing, and expert judgment are presented in Tables 2 and 4 along with their main source. These quantification results formed the basis for setting up the nonparametric Bayesian networks for salt marshes and mangroves as presented in Figure 8 and Figure 9 respectively. Table 4. Rank correlations among variables in salt marsh system. See Figure 9 for correlations of mangrove system variables.

Correlations
Nature Coefficient Source Color indicates different components in the system: blue for hydrodynamics, gray for dike, light green for seagrass-saltmarsh variables.

Results
This work aimed to obtain a better understanding of mangroves and salt marshes and parameters influencing wave attenuation by these ecosystems. This was approached through constructing a stochastic model for salt marshes and mangroves in the form of nonparametric Bayesian networks which capture the dependence among variables of interest. The NPBNs could be used to sample varied vegetated hydrodynamic conditions that tend to coexist in physically realistic windows.

Vegetated Hydrodynamic System
The nonparametric Bayesian network representing salt marsh environment is presented in Figure 8. Overall, the NPBN quantifies the influences between the vegetation and the hydrodynamic variables which eventually captures the underlying dependence of the system. Interactions among groups of variables (vegetation and hybrid) are linked through hydrodynamic variables as the incoming energy determines both dike crest levels and foreshore slopes. Stem width shows conditional dependence on other variables as it has two parent nodes, i.e., stem height and vegetation density. Variables near the offshore boundary, such as offshore slope and seagrass bed roughness, do not interact actively with the rest of the system and therefore operate as independent variables.
Similarly, for mangrove environments the NPBN is presented in Figure 9. The mangrove variables were divided into three layers (roots, trunk, and stem) due to the differences in their phytomorphological characteristics. Root density plays an important role in this NPBN as it links vegetation slope and forest length to the root layer and further links all mangrove layers to each other. This allows for the maintenance of the harmony between individual plant characteristics and the overall forest characteristics in sampled conditions. Similarly, using the relative emergent property of mangroves, the root height is linked to water depth that conceptually connects hydrodynamic variables to vegetation variables.
Accounting for the multivariate-dependence enables the generation of Monte Carlo samples of variables, which constitute varied boundary conditions. This, for instance, in the case of relatively more energetic hydrodynamic conditions generates samples of reflective beaches, higher and steeper dikes, and higher mangrove roots. Similarly, root variables influence trunk and stem variables as root structure categorizes mangroves into juvenile or mature mangrove types.

Dependence
The NPBNs are based on a consistent joint probability distribution of all the variables in the network correlated to one another. Due to conditional dependence, more correlations were revealed apart from the ones that were already defined to setup the model, e.g., stem height and drag coefficient. This resulted in a symmetric correlation matrix which indicates the dependence among all variables, see Table 5 for the correlation matrix of the salt marsh system. A value close to 1.0 represents the strongest correlation, e.g., self-correlation on the diagonal, and a value close to 0.0 represents the weakest correlation, e.g., c f and S d . Table 5. Correlation matrix of stochastic model for salt marshes, also see Figure 10. Highest correlation is 1.00, lowest is 0.00. Positive and negative values show positive and negative correlations. The correlations other than the ones in Figure 8 have been calculated using Equation (1) The correlation matrices generated by UNINET were transformed to adjacency matrices (which use absolute values of correlations) and have been plotted in Figure 10 for both the salt marsh and mangroves NPBNs. Figure 10a,c show all possible correlations among variables. The extensive amount of influences show the inherent complexity of the vege-tated hydrodynamic system which is undermined when it is studied using deterministic methods. Dominant correlations were filtered out by fixing the correlation threshold to 0.1, see Figure 10b for salt marshes and Figure 10d for mangroves. The strength of copulas and dependence modeling allows us to demonstrate the influences which are not initially perceived. For salt marshes an example could be the correlations of wave height with drag coefficient, vegetation height, frontal width, dike slope, and crest level. The strongest conditional dependence is between drag coefficient and vegetation height, followed by wave height and vegetation density, drag coefficient and vegetation density, and vegetation slope and frontal width in salt marshes. For mangroves, the strongest of such "non parent-child" correlations is between trunk height and trunk drag coefficient while other examples include wave height and root height, stem density and vegetation slope, and trunk frontal width and trunk density. This means that the effects on the system of all such correlated variables shown in Figure 10b,d should not be studied in isolation from the other variables, otherwise the true system response (e.g., wave damping) would not be revealed.

Realistic Boundary Conditions
Analytical conditioning of the stochastic model and Monte Carlo sampling (a process of picking up random values from a probabilistically interpreted system) resulted in logical combinations of variables. Due to the correlations and the joint distribution, two variables are "tied-up" together such that the range of a variable only coexists within a certain range of another variable, e.g., high waves with long periods. Table 6 shows part of 300 physically realistic combinations of variables in saltmarsh environments representing boundary conditions which could potentially be used for hydrovegetation modeling. The sampled conditions represent combinations of hydrodynamic forcing, physical conditions, sea states, vegetation types, vegetation characteristics, and flood defense extents to cover the entire space of variations that would yield generalized results when analyzed further. An example of the sampling result could be seen in Figure 11 which shows combinations of foreshore slope, forest length, crest level, and dike slope. Idealized profiles from 10 out of 300 models Figure 11. Sampled profiles (10/300).
A comparison of the conditions sampled from the salt marsh NPBN as the 'dependent case' with the conditions sampled from an independent DAG with uniformly distributed variables as the "independent case" is presented in Figure 12. For the independent case, sampled combinations cover almost the complete sampling space whereas for the dependent case a variable-specific pattern exists. For example, wave heights and wave periods are not sampled in any recognizable trend for the independent case but the conditions sampled from the stochastic model show positive dependence such that there are almost no conditions sampled for high wave heights with low wave periods or vice versa. When compared to the data analyzed in this study and the published literature [23,[80][81][82][83][84], the independent case deems to be unrealistic and the dependent case is in harmony for the variables whose dependence information is published. Hence, Figure 12 not only shows the pair-wise dependence of variables that emerged beyond the correlations initially defined, but also shows the realistic "windows" of variables where they coexist.
Moreover, the correlation for ρ(H m0 , T p ) was explicitly introduced while setting up the model but the results go beyond the specified correlations and reveal unspecified dependence. For instance, ρ(H m0 , N v ) was not specified initially but the corresponding plot in Figure 12 shows that for higher wave heights the vegetation density seems to have negative correlation and shows signs of tail dependence. This means that less vegetation is expected in high energetic environments. One possible explanation for this could be breakage or uprooting during storms as shown by [85].  Table 2) and correlations (Table 4). Yellow panels shows the distinct sampling of the variable pair with positive H m0 − T p and negative dependence N v − S v .

Discussion
Vegetated hydrodynamic conditions sampled from the salt marsh and mangroves NPBNs could be further used for hydrovegetation numerical modeling, probabilistic modeling, or system-based sensitivity analysis. These conditions help to reduce the number of numerical simulations required to produce generalized results, aid probabilistic analysis in data-scarce environments, and provide a basis for a holistic system-based sensitivity analysis.

Value of Dependence Modeling Using Nonparametric Bayesian Networks
The value of dependence modeling of vegetated coastal systems for generating varied boundary conditions lies in obtaining physically realistic boundary conditions, which, when simulated, reduces bulk simulation time and increases the authenticity of overall system response. Using this approach will advance current modeling practice as conditions and values for sensitivity analysis are too often randomly or quickly selected instead of being informed by realistic situation. This asks for a good understanding of the biogeomorphological system and goes beyond standard modeling efforts by engineers solely. Hence, even in modeling studies inclusion of biophysical disciplines is essential to achieve realistic results that can be translated to situations encountered in real life.

Methodological Value
Many Bayesian network based studies for coastal applications have been published but nearly all of them use discrete Bayesian networks [47][48][49][50][51][52][53][54] except the very few recent ones which use NPBNs [86][87][88][89]. In discrete BNs, variables have to be discretized into bins where the number and sizes of the bins need to be correctly fine-tuned to ensure that each bin represents homogeneous information and has nearly similar impact on the child node. This introduces subjectivity as the decisions about bins could vary from person to person which adds another source of uncertainty in modeling discrete BNs. Belonging to the hybrid BNs family, NPBNs provide flexibility to incorporate both discrete and continuous marginal distributions in the same graph, thereby reducing the subjectivity and uncertainty.

Reduced Numerical Simulations
Using the sampled boundary conditions from the NPBNs can greatly reduce the number of simulations required to produce generalized results by orders of magnitude. The number of simulations (N s ) required to quantify conditional probabilities of all variables in the system were in the order of (O) 4 to (O) 5 for [31,51,65]. The reason for such large N s is that it is a (sum-product) function of the number of variables and the number of permutations per variable, see Equation (2). If any of the two factors increase, the required number of simulations increases geometrically. With 13 variables of seagrass-saltmarshes system even if only 13 permutations per variable are numerically modeled, the number of simulations would go to a gigantic amount of 3.2 · 10 14 which is computationally near to impossible. However, the number of simulations for the method presented in this study are only a function of the number of permutations due to the use of copulas. NPBNs can extract the dependence information even if all variables are varied at once; hence variables can be modeled as many times as the computational capacity permits.
where, N s is total number of simulations, n is number of variables, i is a given variable, and v is the number of variations of a given variable [90].

Filling Information in Data-Scarce Environments
Measured data, especially for vegetation, still remains scarce, however this should not stop the attempts to understand biophysical systems and quantification of the value of nature-based solutions. While modeling such systems, these NPBNs could be used to extract missing information in data-scarce scenarios if all the boundary conditions are not available. The dependence modeling makes sure that the rest of the boundary conditions fall into physically realistic ranges. As an application, this has been illustrated in Figure 13b using the conditions measured for Posidonia oceanica species in the Mediterranean Sea by [39]. First, in Figure 13a only one variable h was conditionalized to observe other dependent variables in a highly data-scarce case. While some variables like N v = 610 stems/m 2 and h v = 0.826 m were within less than 3% error range when compared to the measured ones (N v = 615 and h v = 0.8), other variables provided fairly acceptable (but not very accurate) first-order estimates. Subsequently, more known variables were conditionalized to make the NPBN case specific. The NPBN was updated after conditionalizing H m0 , h, C d , h v , and L v , and the mean values of N v = 736 stems/m 2 and b v = 0.0096 m were observed. When compared to measured values of N v = 615 and b v = 0.0086 m by [39], the error was found to be within acceptable range (19% and 11% respectively). It is noteworthy that NPBNs provide an advantage to assimilate new information from newer sources in the existing model [91]. On the other hand, as argued by [92], for the optimal use of probabilistic models subject knowledge and physical understanding about the context of the modeler plays an important role.

Better System-Based Response
Thinking in terms of traditional sensitivity analysis, where one variable is varied while keeping all others fixed, the proposed method of conducting sensitivity analysis using correlated boundary conditions that vary all at once has an added advantage. As we believe natural systems are highly dynamic and often show nonlinear responses, the system response to boundary conditions that are varied individually would be comparatively different to the response generated by changing all boundary conditions simultaneously. Therefore, contrarily to the proposed method, traditional sensitivity analysis does not suffice in determining the real system response to varied conditions. Using dependence modeling to generate boundary conditions that vary all together but in a correlated way not only provides physically-realistic conditions but also adds value to any further analysis done using those conditions.

Limitations and Way forward
The limitations of this study are based on simplification choices and the restrictions of the tools used for modeling.

•
Rigid Cylindrical Vegetation: Vegetation was modeled as rigid cylinders which do not bend, undergo uprooting, or break in storm conditions which are critical for the system response [85].
• Data Availability: Ideally the statistical description should be derived from the data but due to limited data and responses, stochastic modeling was performed using in-house expert judgment for some of the variables. • Gaussian Copula: Dependence analysis throughout the study was based on Gaussian copula. However, there is evidence of other dependence structures [80] for the variables in this study.
Recommendations for improvement and future research direction are listed below.
• New field data or data from global models could be obtained to expand the domain and scale of hydrodynamic and vegetation variables both spatially and temporally. • Better dependence structure, e.g., vines, could be explored since the Gaussian copula is not the most accurate dependence description for some of the variables.
As a way forward, the results from this study in the form of physically realistic vegetated hydrodynamic conditions could be utilized to carry out hydrovegetation numerical modeling. The aim here would be to observe generalized system response in terms of flood hazard reduction potential of vegetation under a variety of conditions. Variables such as the wave attenuation coefficient or run-up could be added to the existing NPBNs from this study to relate physical boundary conditions to the resulting effect of vegetation on hydrodynamics.

Conclusions
The vegetated hydrodynamic system was probabilistically parameterized using a range, distribution, and correlations among variables. The system was stochastically modeled using nonparametric Bayesian networks for sea grasses, salt marshes, and mangroves. The resulting NPBNs revealed dominant correlations among variables which derive the dynamics of the vegetated hydrodynamic system. NPBNs also revealed dependence information among variables beyond the pair-wise correlations which were initially quantified to set up the model. Modeling salt marsh and mangrove systems stochastically has enabled assessment of holistic system-based response to variations in individual variables. The NPBNs are capable of generating physically-realistic varied boundary conditions that are useful for further analysis such as reducing hydrovegetation numerical simulations required to acquire generalized information, improving traditional randomly-selected sensitivity analysis with a more cross-disciplinary system-based sensitivity analysis, and filling information in data-scarce environments. The main findings, which were derived by using a NPBN, help to pave the way for generating both generalized and predictive knowledge required for implementation of nature-based solutions in a range of realistic conditions that can be found across global coastal foreshores.

Conflicts of Interest:
The authors declare no conflict of interest.

Nomenclature
The following nomenclature has been used in this article: