Conditions of Hydraulic Heterogeneity under Which Bayesian Estimation is More Reliable

: Natural heterogeneity of soil hydraulic properties is signiﬁcant for the design and construction of geotechnical structures, and should be adequately characterized. Accurate measurements of hydraulic properties remain a di ﬃ cult job and do not always work well for further design and analysis. Field hydraulic monitoring data reﬂects the overall slope performance and provide a more representative estimation of in-situ soil hydraulic properties for back analysis. The objective of this study is to explore the conditions under which monitoring data can provide reliable estimates of hydraulic parameters. Di ﬀ erent distributions of soil heterogeneity generate a total number of 500 sets of synesthetic monitoring data. Bayesian inversion with the integration of Karhunen-Lo è ve (K-L) and polynomial chaos expansion (PCE) is chosen to estimate the spatially varied saturated coe ﬃ cient of permeability k s . The results show that the method is accurate and reliable, with less than 3% percentage error and 0.08 coe ﬃ cient of variation (COV) around the monitoring points. There are two characteristics of the best-estimated ﬁelds. First, the ranges of k s for best-estimated ﬁelds are much narrower than the worst estimated ﬁelds. Second, when the larger k s values are distributed in the unsaturated zone of slope crest, it will lead to the best estimation. It is suggested that monitoring data can provide a reliable estimation of heterogeneous k s when the ratio of ground surface ﬂux to k s in the unsaturated zone of slope crest is less than 1 / 150. Small values of k s in the slope crest result in the response of pressure head far from the responses of homogenous k s in the unsaturated zone. This complex response of the pressure head further causes the ill identiﬁcation of k s by Bayesian estimation.


Introduction
Water-induced slope failure is a common geological disaster threatening infrastructures and lives. Severe rainfall events, intense storms, changes in groundwater level, snowmelt and roaring waves substantially affect the mechanical and infiltration processes and consequently lead to the instability of slopes [1][2][3][4]. Thus, proper characterization of hydraulic properties in slopes can better assess the stability and mitigate the water-induced slope failures. Up to the present, theoretical studies of the influence of the heterogeneity of hydraulic parameters on the field responses of slopes (i.e., forward problems) are quite developed [19,20]. However, for the inverse problem, the related studies are still very limited [13,16]. Previous studies only considered one specific distribution of material heterogeneity to verify and illustrate the proposed various methods [32][33][34][35][36]. In reality, the natural heterogeneity of soils is extremely diverse. The performance of inverse analysis methods for different ground situations has not been fully investigated. The conditions of soil heterogeneity under which the monitoring data can provide a In general, the entire ground can be divided into strata or layers. Soil parameters can be viewed as statistically homogenous in a given stratum or layer (Figure 1b). However, the soils are heterogeneous in the vertical direction, due to natural pedological processes. In this sense, inter-layer heterogeneity Water 2020, 12, 160 3 of 16 means that different strata or layers have different properties. Field investigations, experimental studies, and numerical simulations have been carried out to study the influence of inter-layer heterogeneity on rainfall infiltration characteristics and slope stability [22][23][24][25]. If the hydraulic properties are heterogeneously distributed, a sudden change in pore pressure and water content will be observed at the interface of the formation-this can affect the advancement of the wetting front. The failure modes and safety factor of the slope also change [26,27].
Intra-layer heterogeneity means that soil properties vary spatially even within a given soil layer. The values of soil properties are different from point to point, but are spatially correlated (Figure 1c). The correlation structure can be used to describe spatial variations of soil properties within the framework of random fields effectively [28]. This inherent heterogeneity is a major source of geotechnical uncertainty and has significant impacts on the reliability of slope stability [29][30][31]. For the rainfall-induced slope failures, the intra-layer heterogeneity leads to the propagation of the uncertainty of soil hydraulic properties to variation of water pressure and fluctuation of the groundwater table. The factor of safety of slope stability and the probability of slope failure are also influenced.
Up to the present, theoretical studies of the influence of the heterogeneity of hydraulic parameters on the field responses of slopes (i.e., forward problems) are quite developed [19,20]. However, for the inverse problem, the related studies are still very limited [13,16]. Previous studies only considered one specific distribution of material heterogeneity to verify and illustrate the proposed various methods [32][33][34][35][36]. In reality, the natural heterogeneity of soils is extremely diverse. The performance of inverse analysis methods for different ground situations has not been fully investigated. The conditions of soil heterogeneity under which the monitoring data can provide a more reliable estimation of soil hydraulic properties remains an open question.
In previous studies, our research team has demonstrated that the Bayesian method can be used to estimate the heterogeneity of hydraulic properties in unsaturated soil slopes [16]. The combination of Karhunen-Loève (K-L) expansion and the polynomial chaos expansion (PCE) can successfully reduce the computation load with acceptable accuracy. Suggestions for planning slope monitoring schemes were provided [36].
The objective of this study is to investigate the conditions under which monitoring data can provide a more reliable estimation of heterogeneous hydraulic properties. The performance of Bayesian estimation for different realizations of the saturated coefficient of permeability k s is assessed based on the criteria of mutual information (I), and the root mean square error (RMSE). The effect of heterogeneity is studied with sensitivity analysis. The relationship between estimated error and monitoring response is measured by the metric similarity. Instead of introducing a new inverse method, the methodology in our previous study [16] is adopted, due to its high computational efficiency. Other inverse methods can also be used.

Bayesian Estimation of Heterogeneity with Polynomial Chaos Expansion and Karhunen-Loève Expansion
Bayesian estimation with integration of the polynomial chaos expansion (PCE) surrogate model and the Karhunen-Loève (K-L) expansion is emerged to estimate material heterogeneity in recent years [32][33][34][35][36]. To solve high-dimensional inverse problems, K-L expansion is adopted to transform the inverse problem to inference on a truncated sequence of weights of the K-L expansion modes. To accelerate the computational efficiency in Bayesian inference, PCE is constructed as a meta-model to surrogate the original model. As a result, the calculation of system responses using a simple equation consisting of polynomials is allowed. The purpose of this study focuses on the effect of soil heterogeneity instead of the methodology. Therefore, a brief introduction of the method [16] is described here. Detailed descriptions of the method can be found in Yang et al. (2018) [16].
The computational cost of the numerical prediction would be extremely large, due to the increase of nonlinearity of the boundary value problem [37][38][39]. Therefore, the first step is to establish the Water 2020, 12, 160 4 of 16 surrogate model for responses of the numerical model and random variables in K-L expansion by polynomial chaos expansion (PCE) [40,41]: where j denotes the index of polynomials; m denotes the total number of polynomials; c j is coefficient; θ is the random variables in K-L expansion; ψ j (θ) are polynomials; P(θ) is the response of the numerical model. Second, the random variables θ are estimated by Markov chain Monte Carlo (MCMC) in conjunction with corresponding monitoring data in a Bayesian framework. The posterior probability density function of θ is expressed as follows [42]: where const is a normalizing constant; f (θ) is the prior distribution of θ; σ 2 e is the variance of the residual error;P = [P 1 , . . . ,P t , . . . ,P T ] is a vector of monitoring data;P t is the monitoring data from the tth measurement point and T is the total number of the measurement points.
Finally, the estimated random variables are transferred to the estimated field by K-L expansion. the spatially varied soil properties U(x) is expressed as: where x denotes a vector of spatial coordinates; µ(x) is the mean of U(x); θ i is the ith uncorrelated Gaussian random variable; n is the truncated level; λ i and ϕ i (x) are the ith eigenvalues and eigenvectors of the covariance function, respectively. In practice, covariance function is estimated from sample data or fitted to parametric variogram models.

Modeling of Unsaturated Infiltration in a Soil Slope with Spatially Heterogeneous k s Field
A typical unsaturated soil slope subject to rainfall infiltration with heterogeneous k s field is modeled. The slope angle is 45 • , and the slope is 20 m high ( Figure 2). The mean value and standard deviation of k s are 2 × 10 −5 m/s and 1.6 × 10 −5 m/s, respectively. The horizontal correlation length l x and vertical correlation length l z are assumed to be 50 m and 10 m, respectively. Six K-L terms are sufficient to preserve 95% of the total field variance. Therefore, the truncation level of the K-L expansion is 6.
The numerical model of unsaturated infiltration in the heterogeneous soil slope is established to model the infiltration and pore pressure responses. According to Darcy's law and the conservation law of water mass in soil pores, the steady-state flow in an unsaturated soil slope can be described as: where H is the total hydraulic head, which is the sum of elevation head z and pore-water pressure head h; k is the unsaturated coefficient of permeability. Here, the permeability of the soil slope is assumed to be isotropic with the unsaturated coefficient of permeability of x-direction and z-direction are the same.
Laboratory studies have shown that there is a relationship between water content and soil suction (negative pressure) for the unsaturated soil defined as soil-water characteristic curve (SWCC). Direct measurements of the unsaturated coefficient of permeability in the laboratory can be time-consuming, especially for low water content conditions. Indirect measurements of permeability are commonly performed by establishing permeability functions (PF) through SWCC. Several models of hydraulic properties are proposed for unsaturated soil [43][44][45][46]. Concerning geotechnical applications, the model proposed by Fredlund et al. (2012) [47] is adopted. An exponential PF is used to define the unsaturated coefficient of permeability function [48]: where k s is the saturated coefficient of permeability, θ w is the volumetric water content, θ s is the saturated water content, and b is a constant depending on the soil type. The SWCC is described as [49]: where u a and u w represent the pore-air pressure and the pore-water pressure, respectively; a f , n f and m f are fitting parameters. In this study, we assume that the saturated water content θ s is 0.4, b is equal to 3 and the SWCC parameters for a f , n f and m f are 5, 2 and 1, respectively. The numerical model of an unsaturated soil slope under the steady-state rainfall infiltration is established using a Multiphysics PDE (partial differential equation) solver COMSOL. The initial pore-water pressure distribution is hydrostatic with the initial groundwater level, as shown in Figure 2. The rain flux, q = 2 × 10 −7 m/s, is applied along the slope surface. Along the left and right boundaries beneath the groundwater table, the hydraulic heads are constant. A zero-flux boundary is applied along the left and right boundaries above the groundwater table. The bottom boundary is impermeable.
Water 2020, 12, x FOR PEER REVIEW 5 of 16 where ks is the saturated coefficient of permeability, θw is the volumetric water content, θs is the saturated water content, and b is a constant depending on the soil type. The SWCC is described as [49]: where ua and uw represent the pore-air pressure and the pore-water pressure, respectively; af, nf and mf are fitting parameters. In this study, we assume that the saturated water content θs is 0.4, b is equal to 3 and the SWCC parameters for af, nf and mf are 5, 2 and 1, respectively. The numerical model of an unsaturated soil slope under the steady-state rainfall infiltration is established using a Multiphysics PDE (partial differential equation) solver COMSOL. The initial porewater pressure distribution is hydrostatic with the initial groundwater level, as shown in Figure 2. The rain flux, q = 2 × 10 −7 m/s, is applied along the slope surface. Along the left and right boundaries beneath the groundwater table, the hydraulic heads are constant. A zero-flux boundary is applied along the left and right boundaries above the groundwater table. The bottom boundary is impermeable.

Artificial Monitoring Scheme for Inverse Estimation
The rainfall-induced slope failures are typically shallow above the groundwater table [50,51]. It is mainly due to a loss in shear strength when soil suctions (i.e., negative pressure head) is decreased or dissipated [47,52]. In engineering practice, shallow piezometers are generally installed to monitor the pressure heads in the slope. For example, the automatic recording piezometers are installed from 1.00 m to 16.6 m below ground level in a natural terrain above the North Lantau Expressway in Hong Kong reported from Evans and Lam (2003) [53]. It is, therefore, reasonable to place measurement points in shallow depth for property estimation. In this study, a monitoring scheme of eight boreholes with a horizontal spacing of 5 m is established within the shallow depths of the slope ( Figure 2). Two piezometers are installed in each borehole. The shallower piezometer is 2.5 m below the ground, and the deeper one is 7.5 m depth. The corresponding pore pressure heads calculated by the numerical model, which are corrupted with 2% noises are regarded as synthetic monitoring data.
To determine the characteristics of heterogeneity for a robust back estimation, 500 random realizations of the k s field are generated. Figure 3 presents the mean value of the pressure head at 16 different measurement points with the increasing number of realizations. As shown in the graph, there is a small fluctuation of pressure head when the number of realizations reaches 500. This result shows that 500 realizations are enough for covering the uncertainties of pressure head responses for heterogeneity estimation.

Artificial Monitoring Scheme for Inverse Estimation
The rainfall-induced slope failures are typically shallow above the groundwater table [50,51]. It is mainly due to a loss in shear strength when soil suctions (i.e., negative pressure head) is decreased or dissipated [47,52]. In engineering practice, shallow piezometers are generally installed to monitor the pressure heads in the slope. For example, the automatic recording piezometers are installed from 1.00 m to 16.6 m below ground level in a natural terrain above the North Lantau Expressway in Hong Kong reported from Evans and Lam (2003) [53]. It is, therefore, reasonable to place measurement points in shallow depth for property estimation. In this study, a monitoring scheme of eight boreholes with a horizontal spacing of 5 m is established within the shallow depths of the slope ( Figure 2). Two piezometers are installed in each borehole. The shallower piezometer is 2.5 m below the ground, and the deeper one is 7.5 m depth. The corresponding pore pressure heads calculated by the numerical model, which are corrupted with 2% noises are regarded as synthetic monitoring data.
To determine the characteristics of heterogeneity for a robust back estimation, 500 random realizations of the ks field are generated. Figure 3 presents the mean value of the pressure head at 16 different measurement points with the increasing number of realizations. As shown in the graph, there is a small fluctuation of pressure head when the number of realizations reaches 500. This result shows that 500 realizations are enough for covering the uncertainties of pressure head responses for heterogeneity estimation.

Quantification of Parametric Uncertainty Reduction
The outcomes of the Bayesian inference are the posterior distributions of basic random variables θ. To quantify the effects of inverse estimation, the mutual information (I) based on Shannon entropy [54] is used. The Shannon entropy for a continuous random variable with probability density function f(θ) within [a, b] is called differential entropy:

Quantification of Parametric Uncertainty Reduction
The outcomes of the Bayesian inference are the posterior distributions of basic random variables θ.
To quantify the effects of inverse estimation, the mutual information (I) based on Shannon entropy [54] is used. The Shannon entropy for a continuous random variable with probability density function f (θ) within [a, b] is called differential entropy: Water 2020, 12, 160 For continuous variables, the differential entropy itself cannot signify the effects of inverse estimation. The mutual information I is proposed to quantify the amount of information obtained through the observed dataP: where H(θ) is the differential entropy of prior probability density distribution of θ, which is regarded as a measure of uncertainty about θ; H(θ|P) denotes the differential entropy of the posterior probability density distribution of θ, which means the amount of uncertainty remaining about θ after the data is observed.

Quantification of Estimated Field
The estimated fields can be simulated via the K-L expansion with the statistics of posterior distributions. In this study, the maximum a posteriori density (MAP) value denoting the best solution from Bayesian inverse estimation is utilized to represent the optimal estimated field. The root mean square error (RMSE) is calculated to evaluate the difference between the true field and the best-estimated field: wherek s x j is the optimal estimated value with MAP at coordinate x i = (x j , z j ) corresponding to k s (x j ); L is the total number of the discrete coordinates of the spatial varied field. RMSE can be regarded as an integrated index to rank the results of the estimated fields. Figure 4 shows the mean I value of random variables θ 1 -θ 6 of 500 sets of synthetic monitoring data. The mean I value of each random variable is around two nats. It indicates that a significant difference exists between prior and posterior distributions of random variables. All the random variables are well identified, and the uncertainties are significantly reduced after inverse estimation. The I value of θ 1 , θ 4 , and θ 5 are 2.48, 2.30, and 2.14 nat, respectively, which are much larger than θ 2 and θ 3 . This means θ 1 , θ 4 and θ 5 are better estimated than θ 2 and θ 3 . This finding is consistent with the sensitivity analysis results, according to Yang et al. (2018).

Performance Evaluations Based on Synthetic Monitoring Data
For continuous variables, the differential entropy itself cannot signify the effects of inverse estimation. The mutual information I is proposed to quantify the amount of information obtained through the observed data ̂: where H(θ) is the differential entropy of prior probability density distribution of θ, which is regarded as a measure of uncertainty about θ; H(θ|̂) denotes the differential entropy of the posterior probability density distribution of θ, which means the amount of uncertainty remaining about θ after the data is observed.

Quantification of Estimated Field
The estimated fields can be simulated via the K-L expansion with the statistics of posterior distributions. In this study, the maximum a posteriori density (MAP) value denoting the best solution from Bayesian inverse estimation is utilized to represent the optimal estimated field. The root mean square error (RMSE) is calculated to evaluate the difference between the true field and the bestestimated field: where k s (x j ) is the optimal estimated value with MAP at coordinate xi = (xj, zj) corresponding to ks(xj); L is the total number of the discrete coordinates of the spatial varied field. RMSE can be regarded as an integrated index to rank the results of the estimated fields. Figure 4 shows the mean I value of random variables θ1-θ6 of 500 sets of synthetic monitoring data. The mean I value of each random variable is around two nats. It indicates that a significant difference exists between prior and posterior distributions of random variables. All the random variables are well identified, and the uncertainties are significantly reduced after inverse estimation. The I value of θ1, θ4, and θ5 are 2.48, 2.30, and 2.14 nat, respectively, which are much larger than θ2 and θ3. This means θ1, θ4 and θ5 are better estimated than θ2 and θ3. This finding is consistent with the sensitivity analysis results, according to Yang et al. (2018).   3.14 × 10 −6 m/s, which can represent the typical error of the estimated fields. The interquartile range (IQR, upper quartile, Q 3 -lower quartile, Q 1 ) is 4.12 × 10 −6 m/s, which is narrow compared to the full range. It indicates the major RMSE values are allocated within a narrow range of RMSE. The error of the estimated fields is centralized in a narrow range from 1.55 × 10 −6 m/s to 5.67 × 10 −6 m/s. Water 2020, 12, x FOR PEER REVIEW 8 of 16 Figure 5 displays the probability density function of the RMSE based on 500 estimated fields. The RMSE corresponding to the maximum probability density is 1.77 × 10 −6 m/s. The median (Q2) value is 3.14 × 10 −6 m/s, which can represent the typical error of the estimated fields. The interquartile range (IQR, upper quartile, Q3-lower quartile, Q1) is 4.12 × 10 −6 m/s, which is narrow compared to the full range. It indicates the major RMSE values are allocated within a narrow range of RMSE. The error of the estimated fields is centralized in a narrow range from 1.55 × 10 −6 m/s to 5.67 × 10 −6 m/s. Figure 5. The probability density function (kernel smooth) of the RMSE based on 500 sets of synthetic monitoring data. Figure 6 shows the contour map of the mean percentage error with 500 random fields. It is the average value of the percentage error of 500 random fields with the simulation of MAP value in posterior distributions. The percentage error around the measurement points is less than 3%, whereas, the maximum is 20% near the bottom boundary. The significant difference between the real and estimated field occurs at locations that are far from the measurement points.    Figure 6 shows the contour map of the mean percentage error with 500 random fields. It is the average value of the percentage error of 500 random fields with the simulation of MAP value in posterior distributions. The percentage error around the measurement points is less than 3%, whereas, the maximum is 20% near the bottom boundary. The significant difference between the real and estimated field occurs at locations that are far from the measurement points.

Performance Evaluations Based on Synthetic Monitoring Data
Water 2020, 12, x FOR PEER REVIEW 8 of 16 Figure 5 displays the probability density function of the RMSE based on 500 estimated fields. The RMSE corresponding to the maximum probability density is 1.77 × 10 −6 m/s. The median (Q2) value is 3.14 × 10 −6 m/s, which can represent the typical error of the estimated fields. The interquartile range (IQR, upper quartile, Q3-lower quartile, Q1) is 4.12 × 10 −6 m/s, which is narrow compared to the full range. It indicates the major RMSE values are allocated within a narrow range of RMSE. The error of the estimated fields is centralized in a narrow range from 1.55 × 10 −6 m/s to 5.67 × 10 −6 m/s. Figure 5. The probability density function (kernel smooth) of the RMSE based on 500 sets of synthetic monitoring data. Figure 6 shows the contour map of the mean percentage error with 500 random fields. It is the average value of the percentage error of 500 random fields with the simulation of MAP value in posterior distributions. The percentage error around the measurement points is less than 3%, whereas, the maximum is 20% near the bottom boundary. The significant difference between the real and estimated field occurs at locations that are far from the measurement points.    Figure 7 shows the mean value of Coefficient of variation (COV) with 500 random fields. The COV in each field is defined as the ratio of the standard deviation to the mean value based on the last 20% samples of the posterior distribution. It can be observed that the COV around the measurement points is very small, which is less than 0.08. This means that the uncertainty of estimation around measurement points is low. The uncertainty of estimation is significant near the bottom boundary, and the COV is approximate to 0.22. The above results show that the measurement points can reduce the error and uncertainty of the inverse estimation of heterogeneity.
Water 2020, 12, x FOR PEER REVIEW 9 of 16 points is very small, which is less than 0.08. This means that the uncertainty of estimation around measurement points is low. The uncertainty of estimation is significant near the bottom boundary, and the COV is approximate to 0.22. The above results show that the measurement points can reduce the error and uncertainty of the inverse estimation of heterogeneity.

Figure 7.
Coefficient of variation (COV) of the estimated field with 500 sets of synthetic monitoring data.

Analysis of the Effect of Heterogeneity with Extreme RMSE Values
Sensitivity analysis of the pressure head is conducted at measurement points with respect to ks. Taking the partial derivative of the output pressure head with respect to the input parameter ks, small perturbations of ks with 2% are exanimated at each location in the field. The average sensitivity of pressure head at measurement points is shown in Figure 8. The sensitivity away from the slope and below the water table is positive, indicating the average pressure head of sixteen measurement points will increase along with the increasing ks and vice versa. Opposite effects occur in the crest, slope part, and toe. Sensitivity in the crest is extremely large. It means that the ks in the crest zone has a noteworthy influence on the average pressure head responses. The change of ks in the crest can cause significant uncertainty in the average pressure head responses. Therefore, the slope can be divided into two parts bounded by the water table. The relationship between the characteristics of distributed ks on both sides and the estimated effects are worth studying. Moreover, the responses of pressure head in the crest, according to the ks, should be the focus of attention.

Analysis of the Effect of Heterogeneity with Extreme RMSE Values
Sensitivity analysis of the pressure head is conducted at measurement points with respect to k s . Taking the partial derivative of the output pressure head with respect to the input parameter k s , small perturbations of k s with 2% are exanimated at each location in the field. The average sensitivity of pressure head at measurement points is shown in Figure 8. The sensitivity away from the slope and below the water table is positive, indicating the average pressure head of sixteen measurement points will increase along with the increasing k s and vice versa. Opposite effects occur in the crest, slope part, and toe. Sensitivity in the crest is extremely large. It means that the k s in the crest zone has a noteworthy influence on the average pressure head responses. The change of k s in the crest can cause significant uncertainty in the average pressure head responses. Therefore, the slope can be divided into two parts bounded by the water table. The relationship between the characteristics of distributed k s on both sides and the estimated effects are worth studying. Moreover, the responses of pressure head in the crest, according to the k s , should be the focus of attention.
Water 2020, 12, x FOR PEER REVIEW 9 of 16 points is very small, which is less than 0.08. This means that the uncertainty of estimation around measurement points is low. The uncertainty of estimation is significant near the bottom boundary, and the COV is approximate to 0.22. The above results show that the measurement points can reduce the error and uncertainty of the inverse estimation of heterogeneity.

Figure 7.
Coefficient of variation (COV) of the estimated field with 500 sets of synthetic monitoring data.

Analysis of the Effect of Heterogeneity with Extreme RMSE Values
Sensitivity analysis of the pressure head is conducted at measurement points with respect to ks. Taking the partial derivative of the output pressure head with respect to the input parameter ks, small perturbations of ks with 2% are exanimated at each location in the field. The average sensitivity of pressure head at measurement points is shown in Figure 8. The sensitivity away from the slope and below the water table is positive, indicating the average pressure head of sixteen measurement points will increase along with the increasing ks and vice versa. Opposite effects occur in the crest, slope part, and toe. Sensitivity in the crest is extremely large. It means that the ks in the crest zone has a noteworthy influence on the average pressure head responses. The change of ks in the crest can cause significant uncertainty in the average pressure head responses. Therefore, the slope can be divided into two parts bounded by the water table. The relationship between the characteristics of distributed ks on both sides and the estimated effects are worth studying. Moreover, the responses of pressure head in the crest, according to the ks, should be the focus of attention.  To analyze the characteristics of the heterogeneous fields, the 500 realizations of heterogeneous fields are ranked by the RMSE values of estimation. Figure 9a shows the first five fields, which are best-estimated. The mean RMSE of these fields is 2.75 × 10 −7 m/s. Figure 9b shows the first five worst-estimated fields. The mean RMSE is 3.21 × 10 −5 m/s. The RMSEs of these ten fields are listed in Table 1. There are nearly two orders of magnitude differences in terms of the RMSEs. The distribution of heterogeneity can significantly influence the effect of inverse estimation. Compared to the first five best and worst real fields, there are two characteristics of the range and the distribution of k s . First, the k s of the best-estimated fields range from 0.35 × 10 −5 m/s to 3.92 × 10 −5 m/s, which is much narrower than the worst estimated fields ranging from 0.55 × 10 −5 m/s to 12.99 × 10 −5 m/s. Second, when the larger k s values (around 3 × 10 −5 m/s) are distributed in the unsaturated zone of slope crest, it will lead to the best estimation. However, the distribution of k s for the worst estimated fields is reversed. For these fields, the crest part of the slope is generally less permeable compared with the main body part of the slope. According to Zhang et al. (2004) [55], the steady-state matric suction in the unsaturated zone can be determined by the ratio of ground surface flux to the saturated coefficient of permeability, q/k s . Therefore, a reliable estimation of the heterogeneous field of k s can be obtained when the q/k s in the slope crest is less than 1/150. To analyze the characteristics of the heterogeneous fields, the 500 realizations of heterogeneous fields are ranked by the RMSE values of estimation. Figure 9a shows the first five fields, which are best-estimated. The mean RMSE of these fields is 2.75 × 10 −7 m/s. Figure 9b shows the first five worstestimated fields. The mean RMSE is 3.21 × 10 −5 m/s. The RMSEs of these ten fields are listed in Table  1. There are nearly two orders of magnitude differences in terms of the RMSEs. The distribution of heterogeneity can significantly influence the effect of inverse estimation. Compared to the first five best and worst real fields, there are two characteristics of the range and the distribution of ks. First, the ks of the best-estimated fields range from 0.35 × 10 −5 m/s to 3.92 × 10 −5 m/s, which is much narrower than the worst estimated fields ranging from 0.55 × 10 −5 m/s to 12.99 × 10 −5 m/s. Second, when the larger ks values (around 3 × 10 −5 m/s) are distributed in the unsaturated zone of slope crest, it will lead to the best estimation. However, the distribution of ks for the worst estimated fields is reversed. For these fields, the crest part of the slope is generally less permeable compared with the main body part of the slope. According to Zhang et al. (2004) [55], the steady-state matric suction in the unsaturated zone can be determined by the ratio of ground surface flux to the saturated coefficient of permeability, q/ks. Therefore, a reliable estimation of the heterogeneous field of ks can be obtained when the q/ks in the slope crest is less than 1/150.  z (m)  To further analyze the characteristics of the pressure head responses in these fields, Figure 10a,b show the pressure head profiles corresponding to the ten fields and the homogeneous case in section A-A of the slope (Figure 2). It can be observed that only a small deviation of pressure head profiles exists between the five best-estimated fields and the homogeneous case. This is because the k s values in the crest for the five best-estimated fields are close to the mean value (i.e., 2 × 10 −5 m/s). For the worst estimated fields, the pressure head profiles in the unsaturated zones are significantly different from that of the homogeneous case. The complex responses of the pressure head may result from the less permeable zone in the slope crest. To further analyze the characteristics of the pressure head responses in these fields, Figure 10a,b show the pressure head profiles corresponding to the ten fields and the homogeneous case in section A-A of the slope (Figure 2). It can be observed that only a small deviation of pressure head profiles exists between the five best-estimated fields and the homogeneous case. This is because the ks values in the crest for the five best-estimated fields are close to the mean value (i.e., 2 × 10 −5 m/s). For the worst estimated fields, the pressure head profiles in the unsaturated zones are significantly different from that of the homogeneous case. The complex responses of the pressure head may result from the less permeable zone in the slope crest.
To investigate the uncertainty of random variables for Bayesian estimation, Figure 11 shows the prior and posterior distributions corresponding to the first five best and worst estimated fields of random variable θ1. Compared with the prior distribution, the posterior distributions of θ1 are within a narrower region. This indicates that θ1 corresponding to all the fields is well identified, but it is better for the first five best fields. In Table 1, the average values of the mutual information (I) for the random variables are also presented. The average I values of the best-estimated fields are nearly doubled compared with those of the worst estimated fields.  To investigate the uncertainty of random variables for Bayesian estimation, Figure 11 shows the prior and posterior distributions corresponding to the first five best and worst estimated fields of random variable θ 1 . Compared with the prior distribution, the posterior distributions of θ 1 are within a narrower region. This indicates that θ 1 corresponding to all the fields is well identified, but it is better for the first five best fields. In Table 1, the average values of the mutual information (I) for the random variables are also presented. The average I values of the best-estimated fields are nearly doubled compared with those of the worst estimated fields.
Water 2020, 12, x FOR PEER REVIEW 12 of 16 Figure 11. Prior and posterior distributions of the first five best and worst estimated fields of random variable θ1.

Relations between Estimation Error and Monitoring Response
As shown in Figure 10, the performance of inverse estimation may depend on the shape of the pressure head profiles. To quantify this complexity, the metric similarity of Manhattan distance (Li et al., 2004) is introduced to qualify the shape of the pressure head profile in the A-A section ( Figure  12). The pressure head profile of the homogeneous case is used as a reference. The similarity between the pressure head profile in the A-A section of 500 fields and the homogeneous case is quantified. A scatter plot of RMSE values of the 500 fields and the Manhattan distance is displayed in Figure 12. The logarithm of the RMSE values and the Manhattan distance have a positive correlation coefficient of 0.4. It implies that the pressure head profiles will cause an inaccurate estimation of soil heterogeneity. An empirical correlation equation between the Manhattan distance and the RMSE value is also given in Figure 12. This equation can be used to roughly assess the potential estimation error of the spatial heterogeneity once the pressure head monitoring data is obtained along a crosssection of the slope.

Relations between Estimation Error and Monitoring Response
As shown in Figure 10, the performance of inverse estimation may depend on the shape of the pressure head profiles. To quantify this complexity, the metric similarity of Manhattan distance (Li et al., 2004) is introduced to qualify the shape of the pressure head profile in the A-A section ( Figure 12). The pressure head profile of the homogeneous case is used as a reference. The similarity between the pressure head profile in the A-A section of 500 fields and the homogeneous case is quantified. A scatter plot of RMSE values of the 500 fields and the Manhattan distance is displayed in Figure 12. The logarithm of the RMSE values and the Manhattan distance have a positive correlation coefficient of 0.4. It implies that the pressure head profiles will cause an inaccurate estimation of soil heterogeneity. An empirical correlation equation between the Manhattan distance and the RMSE value is also given in Figure 12. This equation can be used to roughly assess the potential estimation error of the spatial heterogeneity once the pressure head monitoring data is obtained along a cross-section of the slope.
Water 2020, 12, x FOR PEER REVIEW 12 of 16 Figure 11. Prior and posterior distributions of the first five best and worst estimated fields of random variable θ1.

Relations between Estimation Error and Monitoring Response
As shown in Figure 10, the performance of inverse estimation may depend on the shape of the pressure head profiles. To quantify this complexity, the metric similarity of Manhattan distance (Li et al., 2004) is introduced to qualify the shape of the pressure head profile in the A-A section ( Figure  12). The pressure head profile of the homogeneous case is used as a reference. The similarity between the pressure head profile in the A-A section of 500 fields and the homogeneous case is quantified. A scatter plot of RMSE values of the 500 fields and the Manhattan distance is displayed in Figure 12. The logarithm of the RMSE values and the Manhattan distance have a positive correlation coefficient of 0.4. It implies that the pressure head profiles will cause an inaccurate estimation of soil heterogeneity. An empirical correlation equation between the Manhattan distance and the RMSE value is also given in Figure 12. This equation can be used to roughly assess the potential estimation error of the spatial heterogeneity once the pressure head monitoring data is obtained along a crosssection of the slope.

Conclusions
In this study, the conditions under which monitoring data can provide a more reliable estimation of hydraulic heterogeneity are investigated. The performance of the estimation is assessed based on mutual information (I) of random variables, and the root mean square error (RMSE) of the estimated field. The effect of the characteristics of heterogeneity on inverse estimation is investigated with sensitivity analysis and best-estimated and worst-estimated fields. The major conclusions of the study are summarized below.
1. The mean I value of each random variable is around two nats. All the random variables are appropriately identified, and the uncertainties are significantly reduced after inverse estimation. The RMSE corresponding to the maximum probability density is 1.77 × 10 −6 m/s. The interquartile range is narrow centralized from 1.55 × 10 −6 m/s to 5.67 × 10 −6 m/s. The percentage error around the measurement points is less than 3%, and the COV around the measurement points is less than 0.08. The field measurements can reduce the errors and uncertainties around the locations of the monitoring points.
2. The characteristics of heterogeneity can significantly influence the effect of inverse estimation. The ranges of k s values for the best-estimated fields are much narrower than those of the worst estimated fields. For the best-estimated fields, the larger values of k s are distributed in the crest of the unsaturated zone. A less permeable crest zone could result in the complex responses of pressure head and further cause the ill identification of heterogeneity.
3. The similarity between the pressure head profiles of 500 random fields and the homogeneous case is measured. The RMSE value and the Manhattan distance have a positive correlation with a correlation coefficient r equals 0.4. It implies that the complexity of the pressure head profiles will cause an inaccurate estimation of soil heterogeneity.
One of the limitations of the present study is that the conclusions are based on numerical analysis without comparing the results to field measurements or laboratory experiments. Exploring the conditions of soil heterogeneity with real cases of field monitoring projects has several obstacles, including limited data of site investigation and measurement uncertainties. Well-controlled laboratory experiments are possible to validate the conclusions. Relevant laboratory validation is ongoing with reduced-scale slope models.