Soil Moisture Mapping Based on Multi-Source Fusion of Optical, Near-Infrared, Thermal Infrared, and Digital Elevation Model Data via the Bayesian Maximum Entropy Framework

: This paper proposes a combined approach wherein the optical, near-infrared, and thermal infrared data from the Landsat 8 satellite and the Advanced Spaceborne Thermal Emission and Reﬂection Radiometer (ASTER) global digital elevation model (GDEM) data are fused for soil moisture mapping under sparse sampling conditions, based on the Bayesian maximum entropy (BME) framework. The study was conducted in three stages. First, based on the maximum entropy principle of the information theory, a Lagrange multiplier was introduced to construct general knowledge, representing prior knowledge. Second, a principal component analysis (PCA) was conducted to extract three principal components from the multi-source data mentioned above, and an innovative and operable discrete probability method based on a fuzzy probability matrix was used to approximate the probability relationship. Thereafter, soft data were generated on the basis of the weight coe ﬃ cients and coordinates of the soft data points. Finally, by combining the general knowledge with the prior information, hard data (HD), and soft data (SD), we completed the soil moisture mapping based on the Bayesian conditioning rule. To verify the feasibility of the combined approach, the ordinary kriging (OK) method was taken as a comparison. The results conﬁrmed the superiority of the soil moisture map obtained using the BME framework. The map revealed more detailed information, and the accuracies of the quantitative indicators were higher compared with that for the OK method (the root mean squared error (RMSE) = 0.0423 cm 3 / cm 3 , mean absolute error (MAE) = 0.0399 cm 3 / cm 3 , and Pearson correlation coe ﬃ cient (PCC) = 0.7846), while largely overcoming the overestimation issue in the range of low values and the underestimation issue in the range of high values. The proposed approach e ﬀ ectively fused inexpensive and easily available multi-source data with uncertainties and obtained a satisfactory mapping accuracy, thus demonstrating the potential of the BME framework for soil moisture mapping using multi-source data.


Introduction
The soil moisture in the first 0-5 cm of a soil layer plays an important role in the exchange of energy and substances between the atmosphere and the land. The surface soil moisture is a fundamental component in the fields of hydrology, meteorology, and agriculture [1][2][3][4][5]. Although conventional in

Auxiliary Data
To obtain the auxiliary environmental variables, including the land surface temperature (LST), Albedo, and vegetation indices, the Landsat 8 satellite data were used in this study. The Landsat 8 satellite was launched in 2013, carrying two main sensors, namely, the Operational Land Imager (OLI) and the Thermal Infrared Sensor (TIRS), which has a total of 11 bands and, except for the panchromatic band (15 m) and the thermal infrared band (100 m), the spatial resolution is 30 m. To unify the resolution of the auxiliary environmental variables, b10 was resampled to a spatial resolution of 30 m using the nearest neighbor method [53]. Table 1 lists the spectral characteristics of Landsat 8. An image acquired on 12 October 2018, which was close to the dates of the in situ experiments, was downloaded from the USGS (United States Geological Survey) data center (https://earthexplorer.usgs.gov/). Reprocessing of the Landsat 8 data included radiation correction, atmospheric correction, and geometrical correction. The ENVI 5.3 software was used to convert the digital number (DN) of the images to the surface spectral reflectance in the process of radiation correction, and the FLAASH Atmospheric Correction toolbox was used for the atmospheric correction [53][54][55][56][57][58]. Subsequently, the image was geo-referenced based on 25 ground control points, which were obtained during the in situ experiments [32].

Auxiliary Data
To obtain the auxiliary environmental variables, including the land surface temperature (LST), Albedo, and vegetation indices, the Landsat 8 satellite data were used in this study. The Landsat 8 satellite was launched in 2013, carrying two main sensors, namely, the Operational Land Imager (OLI) and the Thermal Infrared Sensor (TIRS), which has a total of 11 bands and, except for the panchromatic band (15 m) and the thermal infrared band (100 m), the spatial resolution is 30 m. To unify the resolution of the auxiliary environmental variables, b10 was resampled to a spatial resolution of 30 m using the nearest neighbor method [53]. Table 1 lists the spectral characteristics of Landsat 8. An image acquired on 12 October 2018, which was close to the dates of the in situ experiments, was downloaded from the USGS (United States Geological Survey) data center (https://earthexplorer. usgs.gov/). Reprocessing of the Landsat 8 data included radiation correction, atmospheric correction, and geometrical correction. The ENVI 5.3 software was used to convert the digital number (DN) of the images to the surface spectral reflectance in the process of radiation correction, and the FLAASH Atmospheric Correction toolbox was used for the atmospheric correction [53][54][55][56][57][58]. Subsequently, the image was geo-referenced based on 25 ground control points, which were obtained during the in situ experiments [32]. The terrain indices used in this study were derived on the basis of the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) global digital elevation model (GDEM) data with a resolution of 30 m, which are among the most complete digital elevation data available [59]. Two primary terrain indices, namely, the slope and surface roughness, were produced using ASTER GDEM data in ArcGIS 10.2 using the Raster Calculator tool [51,60].

Ground Experiment Dataset
Ground experiments were conducted from 17-18 October 2018. The soil moisture samples were collected from 17-18 October 2018, and the processes of oven-drying method were carried out on 19 October 2018. During the ground experiments, the weather in the study area was cloudy and there was no precipitation. The average temperature was 17 • C, 17 • C, and 18 • C, respectively. A total of 100 soil moisture samples were acquired mainly using the time-domain reflectometry (TDR) probes for the top 5 cm of the land surface, and the Global Positioning System (GPS) was used to identify and record the positions of the samples. The probes were placed vertically on each soil moisture sample, and three close-distance measurements were made and averaged. Among them, 29 soil moisture samples from the top 5 cm of the surface were collected simultaneously using the TDR method and placed in cutting rings and aluminum specimen boxes using the oven-drying method. These samples were used to calibrate the TDR measurements based on a linear model. The calibrated values of the soil moisture ranged from 0.06 to 0.36 cm 3 /cm 3 , averaging at 0.19 cm 3 /cm 3 . A total of 70 soil moisture samples were randomly selected for the training dataset, and the remaining 30 samples served as an independent validation dataset [32].

Methodology
In this study, a set of combined approaches based on the BME framework was used to integrate multi-source data with uncertainties to map the soil moisture under sparse sampling conditions. We selected the OK method instead of the regression kriging (RK) and co-kriging (CK) methods, which have the means for using auxiliary data, because of the limitations of these methods in the application of auxiliary data. Studies have shown that these methods are affected by the characteristics of the linear estimator, because of which auxiliary data with a low linear correlation cannot be used effectively, and the obtained accuracy is similar to that of the OK method [12,23,48,61]. Although there is a potential nonlinear relationship between the nine auxiliary environmental variables analyzed in this study, as discussed in Section 1, the linear correlation is weak (Pearson correlation coefficient (PCC) < 0.5). Therefore, only the OK method was implemented in this study as a comparison method. Figure 2 shows the procedure of the methodology used for soil moisture mapping.

Ordinary Kriging (OK) Method
The semi-variogram is a function of the separating distance and is the main component in geostatistical methods. Under the intrinsic hypothesis, a calculation formula for the semi-variogram is given to express the correlation between the variability of the spatial variables and the separating distance [15,30,45]: where γ(h) denotes the semi-variance with respect to the spatial variable Z at a separating distance of h; x is the spatial location of the sample; and N(h) indicates the number of pairs in the given separating distance h. The experimental semi-variogram can be calculated from the point pairs at specific separating distances and then modeled using the theoretical models, including exponential, Gaussian, and spherical functions [62,63]. In this study, a nugget-spherical model was used as the semi-variogram function [23,64,65]:

Ordinary Kriging (OK) Method
The semi-variogram is a function of the separating distance and is the main component in geostatistical methods. Under the intrinsic hypothesis, a calculation formula for the semi-variogram is given to express the correlation between the variability of the spatial variables and the separating distance [15,30,45]: where γ(h) denotes the semi-variance with respect to the spatial variable Z at a separating distance of h; x i is the spatial location of the sample; and N(h) indicates the number of pairs in the given separating distance h. The experimental semi-variogram can be calculated from the point pairs at specific separating distances and then modeled using the theoretical models, including exponential, Gaussian, and spherical functions [62,63]. In this study, a nugget-spherical model was used as the semi-variogram function [23,64,65]: Remote Sens. 2020, 12, 3916 7 of 25 where C 0 represents the nugget variance, i.e., the minimum variability observed or the "noise" at a distance of 0; C is the partial sill; C 0 + C denotes the sill variance of the spherical model; and a is the range, which is the distance parameter representing the correlation length. The OK method aims to provide an optimal linear unbiased prediction of the spatial variable under the second-order stationarity assumption. The unmeasured location x i is predicted using the estimated value Z * (x 0 ), i.e., the line sum of the known measured variable values. The prediction formula can be described as follows [13,[66][67][68][69]: where Z * (x 0 ) indicates the prediction value at the unmeasured location of x i ; Z(x i ) is the known measured value at the location of x i ; n is the number of points within the searched neighborhood; and λ i represents a weight coefficient, which is related to the distance between the unmeasured location and the known measured locations.
To ensure an optimal and unbiased estimation of the variable, the variance of the errors should be minimized and the sum of the weight coefficients should be equal to 1. Therefore, the following conditions should be satisfied: where σ is the predicted error variance.

Bayesian Maximum Entropy (BME) Framework
The BME theoretical framework provides a set of theoretically sound and technically operational methods for integrating various related auxiliary data with uncertainties for the spatiotemporal mapping of target parameters. This framework provides a theoretical and technical support for fusing easily available and inexpensive multisource data for mapping soil moisture under sparse sampling conditions.

General Principle of the BME Framework
The general principle of the BME framework is briefly presented in this section, and more detailed discussions can be found in a previous study (Christakos, 2000). Figure 3 shows the general scheme of the BME framework, which is divided into three epistemological stages.
In the prior stage, the general knowledge is processed using the maximum entropy principle of the information theory in the form of a prior probability distribution function f G X map . We consider that the vector of variables X map consists of X hard , X soft , and X k , which denote the values of the hard and soft data and the unknown value at the estimation position, respectively. Based on the concept of the Shannon entropy, the entropy in f G X map can be expressed as follows [12,51,52,69]: The entropy was maximized by introducing the Lagrange multipliers (µ α ), while setting the partial function (Z p ) to zero. Solving the system of equations (Equation (7)) with respect to the Lagrange multipliers (µ α ) yields the maximum entropy solution [45,46].
Remote Sens. 2020, 12, 3916 8 of 25 where G α X map is a set of functions allowing the incorporation of the general knowledge. In this study, the statistical indexes and covariance moment of the soil moisture samples were introduced in the BME framework as the general knowledge.  In the prior stage, the general knowledge is processed using the maximum entropy principle of the information theory in the form of a prior probability distribution function f . We consider that the vector of variables consists of , , and , which denote the values of the hard and soft data and the unknown value at the estimation position, respectively. Based on the concept of the Shannon entropy, the entropy in f can be expressed as follows [12,51,52,69]: The entropy was maximized by introducing the Lagrange multipliers (μ ), while setting the partial function (Z ) to zero. Solving the system of equations (Equation (7)) with respect to the Lagrange multipliers (μ ) yields the maximum entropy solution [45,46].
where ℊ is a set of functions allowing the incorporation of the general knowledge. In this study, the statistical indexes and covariance moment of the soil moisture samples were introduced in the BME framework as the general knowledge.
In the meta-prior stage, two types of specific knowledge were gathered and organized: hard data (HD) with a reliable accuracy and soft data (SD) with uncertainties. The sparse samples of the soil moisture collected during the in situ experiments were considered reliable in terms of the accuracy and were used as hard data in the BME framework. The quantitative real nonlinear relationship between the multi-source auxiliary environmental variables and the soil moisture was approximated using a discrete probability and was used as soft data with uncertainties [31,45]. Since the construction of the soft data plays an important role in the operation of the BME framework, the construction process of the discrete probabilistic soft data is introduced in detail in Section 3.2.3.
In the posterior stage, the prior probability distribution function (prior PDF) obtained in the prior stage and the hard data and soft data f ( ) in the probability form obtained in the meta-prior stage were integrated to generate the posterior probability distribution function (posterior PDF), according to the Bayesian conditioning rule (Christakos, 2000). The posterior PDF at an unknown estimation position f can be expressed as [46][47][48][49]: In the meta-prior stage, two types of specific knowledge were gathered and organized: hard data (HD) with a reliable accuracy and soft data (SD) with uncertainties. The sparse samples of the soil moisture collected during the in situ experiments were considered reliable in terms of the accuracy and were used as hard data in the BME framework. The quantitative real nonlinear relationship between the multi-source auxiliary environmental variables and the soil moisture was approximated using a discrete probability and was used as soft data with uncertainties [31,45]. Since the construction of the soft data plays an important role in the operation of the BME framework, the construction process of the discrete probabilistic soft data is introduced in detail in Section 3.2.3.
In the posterior stage, the prior probability distribution function (prior PDF) obtained in the prior stage and the hard data X hard and soft data f S (X soft ) in the probability form obtained in the meta-prior stage were integrated to generate the posterior probability distribution function (posterior PDF), according to the Bayesian conditioning rule (Christakos, 2000). The posterior PDF at an unknown estimation position f K X map can be expressed as [46][47][48][49]: Several estimator modes can be used to form the posterior PDF [45,46]. In this study, the mode estimation, which maximizes the posterior PDF, was employed. The estimated value χ K at an unknown position can be expressed as:

Preparation and Dimension Reduction of Auxiliary Data
The soil moisture content is affected by multiple environmental factors, such as the land surface temperature, topography, and vegetation indices, and their effects on the soil moisture are complex and nonlinear [12,59]. Making full use of the auxiliary environmental variables that have a potential correlation with the soil moisture is significant for soil moisture mapping under sparse sampling conditions [23,59]. A total of nine auxiliary environmental variables were adopted in this study, considering their potential nonlinear effect on the soil moisture.
It should be noted that auxiliary environmental variables inevitably introduce various errors in their preparation processes and the differences between the time of in situ experiments and the transition time of Landsat 8 satellite would also introduce errors. Hence, the approach presented in this study is a bold attempt because these uncertainties in the process of multi-source data fusion would affect the mapping results. Nevertheless, this attempt was adopted to validate the potential of the BME framework in effectively incorporating uncertain data, such as soft data, to improve the accuracy of the soil moisture mapping.
The data sources and processing methods of the multi-source auxiliary environmental variables are described in detail in this section. Considering that the collinearity between them may introduce redundant information in the construction process of the BME framework, we conducted a varimax-rotated principal component analysis (PCA) on the nine multi-source auxiliary environmental variables to extract the principal components with eigenvalues greater than 1 as the materials for subsequent probabilistic soft data. The details of the PCA are discussed in Section 4.3.1.

(1) Vegetation Indices
Vegetation index can indicate the growth status of vegetation and a variety of vegetation growth conditions, including the canopy water content, affecting the soil moisture through transpiration or other hydrological processes [32,[70][71][72][73]. Therefore, five commonly used vegetation indices were selected as auxiliary environmental variables for the soil moisture mapping: the ratio vegetation index (RVI), normalized difference vegetation index (NDVI), enhanced vegetation index (EVI), normalized difference water index based on SWIR 1 band (NDWI_SWIR1), and SWIR 2 band (NDWI_SWIR2). All the vegetation indices were estimated on the basis of the pre-processed OLI data. The formulae are as follows [44,[74][75][76][77]: where NIR is the reflectivity at the near-infrared band (b5), red is the reflectivity at the red band (b4); blue is the reflectivity at the blue band (b2), and NDWI_SWIR1 and NDWI_SWIR2 are the reflectivity values at the SWIR 1 band (b6) and SWIR 2 band (b7), respectively.
(2) Albedo As an important parameter for controlling the surface energy budget, the land surface albedo (Albedo) is widely used in climatology and hydrology and has been successfully applied in soil moisture mapping as a related factor [59,78,79]. Since there is no widely accepted high-resolution land surface albedo products, it was described on the basis of Landsat 8 data using an empirical formula that has been successfully applied in published studies [80][81][82]. The formula is as follows: Studies have shown that there exists a correlation between the surface temperature and the soil moisture; therefore, the surface temperature was selected as one of the auxiliary environmental variables for the soil moisture mapping in this study [23,59]. With the development of remote sensing technology, related studies on land surface temperature inversion have yielded promising results. Currently, three methods are widely used for retrieving the land surface temperature through thermal infrared remote sensing: the single-channel algorithm, split-window algorithm, and multiband algorithm [83][84][85][86][87]. The split-window algorithm inverts the surface temperature from two thermal infrared bands, whereas the multiband algorithm requires multiple thermal infrared bands for the inversion process. Compared with Chinese Huanjing-1B (HJ-1B), the Moderate Resolution Imaging Spectroradiometer (MODIS), and other remote sensing data, Landsat 8 data are more widely being used in surface temperature studies in recent years owing to their high resolution. Although the TIRS sensor of Landsat 8 has two thermal infrared bands, the calibration parameters of b11 were officially found to be unstable several times [88,89]. Therefore, only b10 was used to estimate the surface temperature based on the single-channel algorithm.
The general principles of the single-channel algorithm are briefly presented, and a more detailed discussion of the theory and techniques can be found in previous papers [90,91]. First, the DN value of Landsat 8 b10 data was converted to the radiant brightness temperature based on the offset and gain values in the header file, and then the onboard brightness temperature of each pixel of the image was estimated using the Planck's inverse radiation function. Finally, the brightness temperature was converted to the real surface temperature using the single-window algorithm [90]. Based on the above steps, the 30-m surface temperature data of the study area were obtained and then applied to the BME framework as soft data with uncertainties.

(4) Terrain Indices
Terrain is an important factor affecting the surface water transport and has a potential impact on the soil moisture. Therefore, two commonly used and representative terrain indices, namely, the slope and surface roughness (m), were selected as environmental auxiliary variables for the soil moisture mapping. The slope is the second derivative of the change in the DEM, and the surface roughness (m) is the ratio of the earth surface area to the projection area [92,93]. The two terrain indices used in this study were obtained using the ArcGIS software.

Construction Process of Soft Data
The BME framework can fuse multi-source auxiliary environmental variables to fully mine the potential correlation between these and the target variable and improve the accuracy of soil parameter mapping under sparse sampling conditions [12,36]. In the BME framework, the potential relationship between the auxiliary environmental variables and the target variable can be described as a type of probabilistic soft data with uncertainty. The probability can be expressed in multiple distribution forms, such as linear, histogram, and Gaussian probability distributions [59].
In this study, a bold attempt was made in that the nine auxiliary environmental variables, including the vegetation indices, land surface temperature, land surface albedo, and terrain indices, were used to construct the fuzzy probabilistic soft data, which helped approximate the real probability with discrete probability for the soil moisture mapping under the BME framework. The distribution characteristics of the auxiliary environmental variables were fully considered in the process of soft data construction.
The equal interval algorithm used in previous studies was improved, and the quantile breakpoint method was applied to obtain the discrete intervals [12,59]. The construction process of the soft data can be divided into the following eight steps.
Step 1: The varimax-rotated PCA method was applied as a data dimensionality reduction algorithm to obtain the principal components representing most of the information of the auxiliary environmental variables, considering that the collinear correlation between the nine auxiliary environmental variables may lead to information redundancy in the soil moisture mapping process. The detailed process of the PCA is described in Section 4.3.1. The extracted principal component images were recorded as F k , and the total number is p.
Step 2: Each principal component, which was derived in Step 1, was classified using the quantile breakpoint method. The quantiles were calculated by conducting a statistical analysis of all the pixels in a particular principal component image. The principal component category was recorded as G F_i , and the number of categories for each principal component was N f .
Step 3: Seventy soil moisture samples were divided into different groups based on the equidistant breakpoint method. The operation process requires obtaining the maximum and minimum values of all the samples, and then the range of the values was divided on the basis of equal intervals, which matched the equidistant histogram soft data format in the BME framework. The soil moisture group was recorded as G SM_i , and the group number was Nsm.
Step 4: The soil moisture samples belonging to a specific soil moisture group were identified and counted, and the total number of samples in the group was recorded as Count SM_i , as shown in In this study, a bold attempt was made in that the nine auxiliary environmental variables, including the vegetation indices, land surface temperature, land surface albedo, and terrain indices, were used to construct the fuzzy probabilistic soft data, which helped approximate the real probability with discrete probability for the soil moisture mapping under the BME framework. The distribution characteristics of the auxiliary environmental variables were fully considered in the process of soft data construction. The equal interval algorithm used in previous studies was improved, and the quantile breakpoint method was applied to obtain the discrete intervals [12,59]. The construction process of the soft data can be divided into the following eight steps.
Step 1: The varimax-rotated PCA method was applied as a data dimensionality reduction algorithm to obtain the principal components representing most of the information of the auxiliary environmental variables, considering that the collinear correlation between the nine auxiliary environmental variables may lead to information redundancy in the soil moisture mapping process. The detailed process of the PCA is described in Section 4.3.1. The extracted principal component images were recorded as F , and the total number is p.
Step 2: Each principal component, which was derived in Step 1, was classified using the quantile breakpoint method. The quantiles were calculated by conducting a statistical analysis of all the pixels in a particular principal component image. The principal component category was recorded as G _ , and the number of categories for each principal component was N .
Step 3: Seventy soil moisture samples were divided into different groups based on the equidistant breakpoint method. The operation process requires obtaining the maximum and minimum values of all the samples, and then the range of the values was divided on the basis of equal intervals, which matched the equidistant histogram soft data format in the BME framework. The soil moisture group was recorded as G _ , and the group number was Nsm.
Step 4: The soil moisture samples belonging to a specific soil moisture group were identified and counted, and the total number of samples in the group was recorded as Count _ , as shown in Figure  4. The pixel values in the same position as these samples were extracted from a specific principal component image and then classified into different principal component categories obtained in Step 2. Thereafter, the number of elements in each principal component category was counted and recorded as Count _ .   Step 5: For a specific soil moisture group, all the principal component images obtained in Step 1 were counted and recorded according to Step 4. Subsequently, all the soil moisture groups obtained in Step 3 were counted and recorded according to the above steps.
Step 6: The fuzzy probability matrix (M P ) could be obtained for a specific principal component image, using Equation (16). The value at any position represents the probability of occurrence of a principal component category under the premise of a known soil moisture group, corresponding to P(F|SM) above.
Step 7: Considering the difference in the abilities of the principal components in explaining the total variance, weight factors were established in terms of the normalization coefficient of the explanatory ability of each principal component. The formula is as follows: where γ Fk is the weight factor of the Fk principal component and P Fk is the explanatory percentage of the Fk component with respect to the total variance.
Step 8: The soft data were constructed on the basis of the fuzzy probability matrix obtained in Step 6 and the weight coefficients of the principal components obtained in Step 7. The specific composition of the soft data table is described in Section 4.3.2.

Methods of Validation
To evaluate and verify the ability of the BME framework in fusing multi-source data with uncertainties and in improving the accuracy of soil moisture mapping, the OK method was used to predict the soil moisture at the same set of test positions as a comparative experiment.
Three quantitative indicators, namely, the root mean squared error (RMSE), mean absolute error (MAE), and Pearson correlation coefficient (PCC), were estimated to confirm the agreement between the estimated soil moisture obtained by the BME framework and the OK method, and the measured soil moisture during the field experiments [48,94]. The differences between the estimated and measured soil moisture values are reflected in the RMSE and MAE, a low value of which represents a high prediction accuracy. The linear correlation between the estimated and measured soil moisture values are reflected in the PCC, where a value close to 1 represents a strong linear correlation. Figure 5 shows the locations of 100 soil moisture samples and their descriptive statistics. The soil moisture content varied from 0.04 to 0.36 cm 3 /cm 3 , with a standard deviation of 0.07 cm 3 /cm 3 . A total of 70 samples were randomly selected as modeling samples, and the remaining 30 samples were used as validation samples to assess the performance of the different mapping algorithms. The Kolmogorov-Smirnov method was used to test the normalization of the modeling samples, and the Kolmogorov-Smirnov (K-S) test value was 0.195 (sig > 0.05), which indicated that the modeling data can be considered to follow an approximately normal distribution. Notably, this approximate assumption will affect the accuracy of the prediction results of the OK method, which is highlighted in the analysis of the results below. Remote Sens. 2018, 10, x FOR PEER REVIEW 13 of 25 In the field experimental dataset, a total of 30 soil moisture samples were randomly selected as independent validation dataset, and the remaining 70 samples were served as training dataset for OK method and BME framework. Inspired by the idea of Monte Carlo method, multiple repeat sampling was performed to reduce the contingency risk associated with a single random sample [95]. Considering the heavy workload of BME method of soil moisture mapping, a total of six random samples were manipulated, and a total of six soil moisture mapping experiments based on both OK method and BME framework were carried out. The results of these repeated tests show that the quantitative accuracy indexes of each repeated test fluctuated around its mean values (OK: RMSE = 0.0667 cm 3 /cm 3 , MAE = 0.0567 cm 3 /cm 3 , and PCC = 0.5674; BME: RMSE = 0.0433 cm 3 /cm 3 , MAE = 0.0400 cm 3 /cm 3 , and PCC = 0.7732) and the amplitude was relatively narrow. The results of repeated tests were relatively stable, which reflected the consistency of the conclusions of this study. Based on one of the sampling tests, the soil moisture mapping results are described in detail below.

Variogram Model and OK Prediction
The OK method requires estimating the variogram model for the soil moisture samples. An isotropic experimental variogram was estimated from the soil moisture sample data, while neglecting the influence of the anisotropy on the model parameters. The trial-and-error method with different lag intervals was used in the variogram model fitting based on the minimum error indicator values. Similarly, the optimal variogram model parameters were determined. The GS+ geostatistical software was used to fit the variogram model [15]. Figure 6 shows the experimental variogram and the fitted variogram model with the optimal parameters. The best fitted model for the soil moisture sampling data was the spherical model with the lowest residual sum of squares (RSS) value. Table 2 lists the optimal variogram parameters. As listed, the R 2 of the fitted variogram model is at a moderate level, which also affects the accuracy of the soil moisture mapping based on the OK method. The geostatistical analyst extension of ArcGIS was used for the soil moisture mapping based on the OK method; the prediction map (a grid size of 30 m by 30 m) is shown in Figure 7a   In the field experimental dataset, a total of 30 soil moisture samples were randomly selected as independent validation dataset, and the remaining 70 samples were served as training dataset for OK method and BME framework. Inspired by the idea of Monte Carlo method, multiple repeat sampling was performed to reduce the contingency risk associated with a single random sample [95]. Considering the heavy workload of BME method of soil moisture mapping, a total of six random samples were manipulated, and a total of six soil moisture mapping experiments based on both OK method and BME framework were carried out. The results of these repeated tests show that the quantitative accuracy indexes of each repeated test fluctuated around its mean values (OK: RMSE = 0.0667 cm 3 /cm 3 , MAE = 0.0567 cm 3 /cm 3 , and PCC = 0.5674; BME: RMSE = 0.0433 cm 3 /cm 3 , MAE = 0.0400 cm 3 /cm 3 , and PCC = 0.7732) and the amplitude was relatively narrow. The results of repeated tests were relatively stable, which reflected the consistency of the conclusions of this study. Based on one of the sampling tests, the soil moisture mapping results are described in detail below.

Variogram Model and OK Prediction
The OK method requires estimating the variogram model for the soil moisture samples. An isotropic experimental variogram was estimated from the soil moisture sample data, while neglecting the influence of the anisotropy on the model parameters. The trial-and-error method with different lag intervals was used in the variogram model fitting based on the minimum error indicator values. Similarly, the optimal variogram model parameters were determined. The GS+ geostatistical software was used to fit the variogram model [15]. Figure 6 shows the experimental variogram and the fitted variogram model with the optimal parameters. The best fitted model for the soil moisture sampling data was the spherical model with the lowest residual sum of squares (RSS) value. Table 2 lists the optimal variogram parameters. As listed, the R 2 of the fitted variogram model is at a moderate level, which also affects the accuracy of the soil moisture mapping based on the OK method. The geostatistical analyst extension of ArcGIS was used for the soil moisture mapping based on the OK method; the prediction map (a grid size of 30 m by 30 m) is shown in Figure 7a Figure 7. Maps of soil moisture based on the ordinary kriging (OK) method and BME framework: (a) OK method, (b) BME framework.

Dimension Reduction of Environmental Auxiliary Variables
The varimax-rotated PCA was used in this study as the principal component factor extraction method, considering that there were nine auxiliary environmental variables with a collinear relationship. The objective of the PCA was to explore and analyze the relationship between the auxiliary environmental variables and then calculate the core components representing most of the information on all the variables. The SPSS Statistical Package (V. 24.0) was used to perform the PCA [97].
The analysis initially extracted nine components ( Table 3). The first three components were extracted with eigenvalues greater than 1 as the principal components for the subsequent operations. The first component (F1) was the dominant factor, accounting for 50.67% of the total variance, and all three principal components cumulatively explained 84.29% of the total variance. The regression coefficients of the principal component equations were obtained on the basis of the SPSS Statistical Package, and the equations were as follows:   Figure 7. Maps of soil moisture based on the ordinary kriging (OK) method and BME framework: (a) OK method, (b) BME framework.

Dimension Reduction of Environmental Auxiliary Variables
The varimax-rotated PCA was used in this study as the principal component factor extraction method, considering that there were nine auxiliary environmental variables with a collinear relationship. The objective of the PCA was to explore and analyze the relationship between the auxiliary environmental variables and then calculate the core components representing most of the information on all the variables. The SPSS Statistical Package (V. 24.0) was used to perform the PCA [97].
The analysis initially extracted nine components ( Table 3). The first three components were extracted with eigenvalues greater than 1 as the principal components for the subsequent operations. The first component (F1) was the dominant factor, accounting for 50.67% of the total variance, and all three principal components cumulatively explained 84.29% of the total variance. The regression coefficients of the principal component equations were obtained on the basis of the SPSS Statistical Package, and the equations were as follows:

Dimension Reduction of Environmental Auxiliary Variables
The varimax-rotated PCA was used in this study as the principal component factor extraction method, considering that there were nine auxiliary environmental variables with a collinear relationship. The objective of the PCA was to explore and analyze the relationship between the auxiliary environmental variables and then calculate the core components representing most of the information on all the variables. The SPSS Statistical Package (V. 24.0) was used to perform the PCA [97].
The analysis initially extracted nine components ( Table 3). The first three components were extracted with eigenvalues greater than 1 as the principal components for the subsequent operations. The first component (F1) was the dominant factor, accounting for 50.67% of the total variance, and all three principal components cumulatively explained 84.29% of the total variance. The regression coefficients of the principal component equations were obtained on the basis of the SPSS Statistical Package, and the equations were as follows: Based on Equation (17), the weight factors of the three principal components were found to be 0.60, 0.22, and 0.18, respectively.

Soft Data Construction Based on Fuzzy Probability Matrix
In this study, three principal components representing most of the information were extracted from nine auxiliary environmental variables having a nonlinear correlation with the soil moisture, based on Landsat 8 satellite data and DEM data. The soft data in the BME framework represent the probability of expressing the correlation between the auxiliary and target variables [45]. In this study, the real probability relationship was approximated using a discrete form of the probability. The construction of the soft data used in the BME framework was based on a fuzzy probability matrix (Step 4 of Section 3.2.3) between each principal component and the soil moisture, and was then merged using the weight coefficients (Section 4.3.1). Because of the importance of the fuzzy probability matrices in the soft data construction process, we list the fuzzy probability matrix between the first principal component (F1) and the soil moisture in Table 4 as an example. The number of soil moisture groups (Nsm) and principal component groups (N f ) were both defined as 10. The soil moisture was grouped according to the equal intervals (Step 3 of Section 3.2.3), and the first principal component was grouped according to the quantile breakpoint method (Step 2 of Section 3.2.3).
In this section, two types of specific knowledge, namely, hard data with a reliable accuracy and soft data with uncertainties, both prepared in the meta-prior stage, were integrated to realize the spatial mapping of the soil moisture. Seventy soil moisture samples obtained during the ground experiments were considered accurate and reliable and were used as the hard dataset. We evenly selected 10,000 points in the study area and constructed a soft dataset based on the soft data obtained in the previous section [59]. Thus, we not only ensured a sufficient data volume in the prediction algorithm, but also ensured that there were enough soft data points around each hard data point. The prediction process was conducted using SEKS-GUI v1.0.9 [98] in the programming environment of MATLAB R2018a, and the results of the soil moisture mapping (a grid size of 30 m by 30 m) were exported on the basis of ArcGIS [60], and shown in Figure 7b.

Performance of OK Method and BME Framework
The mapping of the soil moisture in the study area was conducted using the BME framework based on multi-source data, including the vegetation indices, land surface albedo, land surface temperature, and terrain indices at a resolution of 30 m. To verify the feasibility and superiority of the BME framework, the OK method was also applied to map the soil moisture as a comparative experiment. Figure 7 shows the results of the soil moisture mapping obtained using the OK method (a) and BME framework (b).
As shown, a mask file was applied and marked in white, which was based on the 30-m resolution land use data downloaded from the website of Finer Resolution Observation and Monitoring Global Land Cover products (http://data.ess.tsinghua.edu.cn/) [99]. Two types of land use, namely, water bodies and impervious surfaces, were extracted as mask areas that were not involved in the mapping process, and then the results of the soil moisture mapping were masked based on these areas.
A visual comparison of the results reveals that the OK method produces smoother mapping results (Figure 7a), the soil moisture content range of which is narrower. The predicted values for most pixels were at a moderate level (0.15-0.25 cm 3 /cm 3 ), and the OK method could hardly cover the lowest (<0.1 cm 3 /cm 3 ) and highest levels (>0.3 cm 3 /cm 3 ). This is consistent with published studies, and the reason for this could be the smoothing effect of the OK method and the use of the linear unbiased estimation algorithm [14,61]. In contrast, the soil moisture map derived from the BME framework reveals more detailed information (Figure 7b), and the predictive range covers the entire range of the ground experiment soil moisture samples (0.06-0.36 cm 3 /cm 3 ). Compared with the DEM of the study area, the BME map is more in line with the topographic trends; for example, a mountain from the northwest to the middle of the study area corresponds to a lower soil moisture band. These characteristics of the BME map are consistent with published studies in that, with the addition of multi-source auxiliary data, more abundant auxiliary materials can be applied in the mapping process, thus yielding more detailed information [69,100,101].
In addition to visual comparison, a quantitative verification is important for objectively evaluating the mapping results of the two algorithms. Three quantitative indicators, including the root mean squared error (RMSE), mean absolute error (MAE), and Pearson correlation coefficient (PCC), were estimated on the basis of the independent validation dataset obtained during the ground experiments. A low index value of the RMSE and MAE represents a high accuracy, whereas a PCC value close to zero indicates a very weak linear correlation between the estimated and measured soil moisture values. Table 5, the index value for the OK method (0.0670 cm 3 /cm 3 ) is significantly higher than that for the BME framework (0.0423 cm 3 /cm 3 ), which means that the RMSE accuracy of the BME framework is evidently higher. The MAE index also shows a similar trend: The index value of the soil moisture map produced using the OK method (0.0559 cm 3 /cm 3 ) is significantly higher than that of the map produced using the BME framework (0.0399 cm 3 /cm 3 ), indicating that the MAE precision of the OK method is lower. Both the RMSE and MAE indicators could reflect the overall differences between the soil moisture values estimated by the two algorithms at the verification locations. Therefore, from Table 5, the BME prediction results are more accurate. As listed in Table 5, the soil moisture estimated using the BME framework has a relatively strong linear relationship with the validation dataset, and its PCC value is greater than 0.75. In contrast, the correlation between the soil moisture obtained using the OK method and the validation dataset is weak, with the PCC value reaching only 0.5794. The PCC indicator could reflect the model efficiency in terms of the linear correlation. So, a comparison of the PCC index values shows that the linear correlation between the estimated soil moisture based on the BME framework and the measured validation dataset is higher than that in the case of using the OK method, which means that the BME is more accurate.

From the RMSE indicator listed in
The scatter plots could directly reflect the differences between the predicted soil moisture values based on the algorithms and the soil moisture measured during the ground experiments. Figure 8 shows the scatter plots of the two algorithms. A 1:1 line is marked as a blue, dotted line, and the points close to it represent a small difference between the estimated and measured soil moisture values. Based on the overall distribution of the scatter plots shown in Figure 8, both the OK method and BME framework provide generally reasonable estimations for partial of the validation dataset. However, most of the scatter points in the two plots deviate from the 1:1 line, and their distribution characteristics are different, as described in detail below. From the RMSE indicator listed in Table 5, the index value for the OK method (0.0670 cm 3 /cm 3 ) is significantly higher than that for the BME framework (0.0423 cm 3 /cm 3 ), which means that the RMSE accuracy of the BME framework is evidently higher. The MAE index also shows a similar trend: The index value of the soil moisture map produced using the OK method (0.0559 cm 3 /cm 3 ) is significantly higher than that of the map produced using the BME framework (0.0399 cm 3 /cm 3 ), indicating that the MAE precision of the OK method is lower. Both the RMSE and MAE indicators could reflect the overall differences between the soil moisture values estimated by the two algorithms at the verification locations. Therefore, from Table 5, the BME prediction results are more accurate. As listed in Table 5, the soil moisture estimated using the BME framework has a relatively strong linear relationship with the validation dataset, and its PCC value is greater than 0.75. In contrast, the correlation between the soil moisture obtained using the OK method and the validation dataset is weak, with the PCC value reaching only 0.5794. The PCC indicator could reflect the model efficiency in terms of the linear correlation. So, a comparison of the PCC index values shows that the linear correlation between the estimated soil moisture based on the BME framework and the measured validation dataset is higher than that in the case of using the OK method, which means that the BME is more accurate.
The scatter plots could directly reflect the differences between the predicted soil moisture values based on the algorithms and the soil moisture measured during the ground experiments. Figure 8 shows the scatter plots of the two algorithms. A 1:1 line is marked as a blue, dotted line, and the points close to it represent a small difference between the estimated and measured soil moisture values. Based on the overall distribution of the scatter plots shown in Figure 8, both the OK method and BME framework provide generally reasonable estimations for partial of the validation dataset. However, most of the scatter points in the two plots deviate from the 1:1 line, and their distribution characteristics are different, as described in detail below.  From Figure 8, the following points can be summarized: (1) The majority of the scatter points of the BME framework are distributed around the 1:1 line, and the estimated values have a strong linear correlation with the measured value (PCC = 0.7846). On the contrary, the scatter points of the OK method are not well aggregated around the 1:1 line; in particular, there exist several scatter points with a large deviation. This means that the BME framework provides a relatively more reasonable prediction based on the independent validation dataset.
(2) The issues of overestimation in the range of low values and underestimation in the range of high values are more evident when using the OK method, because most of the scatter points on the left side of the scatter plot are above the 1:1 line, whereas most of the scatter points on the right side are below the 1:1 line. The OK method is a linear estimator, and this inherent disadvantage may cause this typical issue, consistent with previous studies [23,102].
(3) The BME prediction range is wider than that of the OK method, covering the entire range of the ground experiment soil moisture samples (0.06-0.36 cm 3 /cm 3 ). The OK method provides a narrower range (0.10-0.30 cm 3 /cm 3 ), and more than 85% of the predictions are at the moderate level (0.15-0.25 cm 3 /cm 3 ), which may be due to its smoothing effect [12]. The wide prediction range of the BME may be related to the fact that the discrete grouping in the fuzzy matrix covers the entire range of the measured soil moisture (Section 3.2.3). In terms of the prediction range, the BME framework provides more reasonable results.

Limitations and Extensions
This study is a case study to explore the feasibility of high-spatial-resolution soil moisture mapping at agricultural/watershed scale through multi-source data fusion based on the BME framework. The experimental results in the study area show that this method is feasible and promising, compared with the OK method of traditional geostatistics.
In this study, compared with the OK method, the soil moisture map obtained using the BME framework has better features; however, it has some unsatisfactory errors, which may be due to the following reasons. (1) Although we assumed that the distribution characteristics of the auxiliary variables are generally stable over a short time period, the issue of asynchrony between the ground experiments and the satellite transit may lead to greater changes in the distribution characteristics of some of the parameters, thus affecting the accuracy of the results [32,103]. (2) The BME is an improvement based on conventional geostatistics. Although it has been significantly improved, its accuracy is still affected by the sampling density (of the hard data points). The regional spatial correlation of the soil moisture could not be perfectly represented by the sparse samples, which may lead to errors in the prediction processes [61,104,105]. (3) We made a bold attempt by applying a simple and feasible construction method for the soft data, i.e., the discrete probabilities obtained from a fuzzy matrix were used to approximate the real probability between the soil moisture and the auxiliary variables. It is a discrete approximate expression of the real probability. Although this is an innovative simplification compared with the previous complex construction processes for the soft data, there remain some differences [12,59]. (4) The inversion algorithms of the nine auxiliary variables were based on previous published studies, and we did not study their accuracy and applicability in-depth. Therefore, the inversion processes of the auxiliary variables may introduce large errors, which may have a certain impact on the mapping results [23].
Some extended analysis was performed. These analyses are related to this study, but they are not the core issues concerned, which are only for discussion, and in-depth study is still needed in future work. (1) In the same study area, we conducted higher-resolution (8 m) soil moisture mapping based on multi-model coupling and fusion of optical and radar data of GaoFen-3 (GF-3) and GaoFen-1 (GF-1) satellite [32]. Comparing the results of the two studies, it can be found that the overall accuracy of the methods used in this study is lower than those of the previous study. The accuracy of OK method is lower than that of other two methods, which may be explained by the fact that it does not effectively make full use of multi-source auxiliary data, but only relies on the autocorrelation characteristics of variables in geostatistics. The accuracy of the method based on BME framework proposed in this study is slightly lower than that of the previous study. This may be explained by the different types of multi-source data used by the two studies and the time difference between satellite transit and field experiments. The brief comparison shows that different data fusion methods and different multi-source data may have certain effects on the accuracy of mapping results. However, the response mechanism needs to be further studied in the future. (2) The soil moisture mapping methods used in this study were conducted in another study area, which is located in the central part of Jilin Province, Nong'an county. From the results of new case study, BME framework also shows obvious advantages compared with OK method. The BME method provides more detailed information and the quantitative accuracy indexes (RMSE = 0.0438 cm 3 /cm 3 , MAE = 0.0392 cm 3 /cm 3 , and PCC = 0.8124) were significantly higher than that of the OK method (RMSE = 0.0592 cm 3 /cm 3 , MAE = 0.0519 cm 3 /cm 3 , and PCC = 0.5830), which was consistent with the case study in this paper. The overall accuracy of the new study was slightly higher than that of the case study in this paper, which may be due to the more uniform underlying surface types in the new study area.

Conclusions
Although the BME framework has a demonstrated performance in effectively integrating multi-source data for the spatial mapping of the land surface parameter and has been successfully applied to soil properties, its applications to soil moisture mapping are limited, with studies often focusing on only one type of auxiliary data [12,23]. Based on the above analysis, the purpose of this study was to verify the feasibility of a set of methods for the spatial mapping of the soil moisture under sparse sampling conditions by fusing multi-source data including optical, near-infrared, thermal infrared remote sensing, and DEM data.
To investigate the feasibility of the BME framework for the soil moisture mapping at agricultural watershed scale through multi-source data fusion, this study made a bold attempt by integrating nine environmental auxiliary variables with uncertainties and having a nonlinear correlation with the soil moisture. The sparse samples obtained during the ground experiments in the study area were used as hard data, and the fuzzy probability relationships between the soil moisture and the three principal components extracted using the PCA from the nine environmental auxiliary variables were used to construct the soft data. Subsequently, combined with the general knowledge under the constraint of maximum entropy principle generated in the prior stage, the soil moisture mapping was carried out based on the Bayesian conditioning rule. The OK method was also applied to the soil moisture mapping in the study area as a comparative experiment. The OK method was used instead of the RK and CK methods, which can also use auxiliary data, because of the limitations of the RK and CK methods in the application of auxiliary data. Previous studies have proven that these methods are ineffective when it comes to fusing multi-source data that do not have a strong linear correlation with the target parameters (correlation index below 0.7); moreover, their accuracy is similar to that of the OK method [12,23,47,61]. The results showed that the soil moisture mapping results based on the BME framework have significant advantages as validated in terms of the visual contrast, quantitative index comparison, and scatter plot comparison over the OK method. This confirms the feasibility of using the BME framework to fuse multi-source, uncertain data, including optical, near-infrared, thermal infrared remote sensing, and DEM data, for soil moisture mapping with an acceptable accuracy under the condition of sparse ground samples.
The specific evaluation and analysis results based on the independent validation dataset are as follows. (1) A visual comparison of the soil moisture maps showed that the OK method produces smoother mapping results, whereas the soil moisture map obtained using the BME framework revealed more detailed information, which may be due to two reasons: the smoothing effect of the OK method itself and the presence of more detailed information with the application of multi-source auxiliary data [14,15,17]. (2) In terms of the quantitative validation indicators, the RMSE and MAE index values of the BME results were significantly lower than those of the OK method, indicating a smaller Remote Sens. 2020, 12, 3916 20 of 25 difference between the BME-predicted soil moisture and the measured dataset, and a higher accuracy. The PCC index of the BME was significantly higher, indicating a stronger linear correlation between the BME-predicted values and the validation dataset, and more precise results. (3) Based on the scatter plots, most of the scatter points in the case of using the BME framework were distributed around the 1:1 line, and the prediction range of the BME was much wider than that of the OK method. Although the issues of overestimation in the low-value range and underestimation in the high-value range remain in the BME results, these issues are significantly alleviated compared with the OK method.
In this study, various efforts were made by fusing multi-source data, including optical, near-infrared, thermal infrared remote sensing, and DEM data, based on the BME framework for soil moisture mapping with an acceptable accuracy. However, as with any study, there are some shortcomings, as discussed in Section 5.1. Future research could be carried out along the following lines. The multi-source data should be enriched, and the conditions of variable selection could be further studied to fuse more abundant and closely related auxiliary variables. The construction of soft data should be further studied to realize more feasible and simpler soft data. The time interval between ground experiments and satellite transit should be short, and it is better to achieve quasi-synchronization, which may be more conducive to the mapping accuracy.