Using of Remote Sensing-Based Auxiliary Variables for Soil Moisture Scaling and Mapping

Soil moisture is one of the core hydrological and climate variables that crucially influences water and energy budgets. The spatial resolution of available soil moisture products is generally coarser than 25 km, which limits their hydro-meteorological and eco-hydrological applications and the management of water resources at watershed and agricultural scales. A feasible solution to overcome these limitations is to downscale coarse soil moisture products with the support of higher-resolution spatial information. Although many auxiliary variables have been used for this purpose, few studies have analyzed their applicability and effectiveness in arid regions. To this end, we comprehensively evaluated four commonly used auxiliary variables, including NDVI (Normalized Difference Vegetation Index), LST (Land Surface Temperature), TVDI (Temperature Vegetation Dryness Index), and SEE (Soil Evaporative Efficiency), against ground-based soil moisture observations during the vegetation growing season in the Heihe River Basin, China. Performance metrics indicated that SEE is most sensitive (R2 ≥ 0.67) to soil moisture because it is controlled by soil evaporation limited by the available soil moisture. The similarity of spatial patterns also showed that SEE best captures soil moisture changes, with the STD (standard deviation) of the HD (Hausdorff Distance) less than 0.058 when compared with PLMR (Polarimetric L-band Multi-beam Radiometer) soil moisture products. In addition, soil moisture was mapped by RF (Random Forests) using both single auxiliary variables and 11 types of multiple auxiliary variable combinations. SEE was found to be the best auxiliary variable for scaling and mapping soil moisture with accuracy of 0.035 cm3/cm3. Among the multiple auxiliary variables, the combination of LST, NDVI, and SEE was found to best enhance the scaling and mapping accuracy of soil moisture with 0.034 cm3/cm3.


Introduction
Soil moisture plays a key role in the water cycle and heat exchange of surfacevegetation-atmosphere columns [1][2][3]. Information about the spatiotemporal distribution of soil moisture is essential for drought monitoring [4], evapotranspiration forecasting [3,5], water resource management [6][7][8], and crop yield estimation [9]. In recent years, advances in active and passive microwave remote sensing have made this the main technique for measuring soil moisture distribution at regional and global scales [10,11].
Over the past years, several global soil moisture products based on satellite-based passive microwave sensors have become available [12]. Among these, the Advanced Microwave Scanning Radiometer Earth Observing System (AMSR-E) has a finer spatial resolution of 25 km [13][14][15], while Soil Moisture and Ocean Salinity (SMOS) [16], launched in 2009, has a resolution of 40 km, and the Soil Moisture Active Passive (SMAP) [17,18], launched in 2015, has a resolution of 36 km. These passive microwave remote sensing products aim to map global soil moisture within an accuracy of 0.04 cm 3 /cm 3 to satisfy hydro-climatological and hydro-meteorological requirements and the management of water resources at watershed and agricultural scales.
For practical applications, however, soil moisture maps with spatial resolutions higher than 10 km are urgently required at watershed scales because soil texture and structure, vegetation coverage, and topography lead to high soil moisture spatial variability [19]. For example, Pan and Wood (2010) found that a higher spatial resolution of soil moisture can improve hydrology assimilations [20]. Agricultural applications often require a finer spatial resolution than 1 km [21], and weather forecasting and hydrological applications also need a higher spatial resolution than 10 km. In recent years, the downscaling of soil moisture has thus received wide attention to satisfy watershed-scale research and application needs [22][23][24][25]. To obtain a finer distribution of soil moisture, auxiliary variables at a higher spatial resolution are indispensable for capturing the spatial pattern of soil moisture in remote sensing pixels. For example, a combination of high-resolution Normalized Difference Vegetation Index (NDVI), albedo, and Land Surface Temperature (LST) data can be used to downscale soil moisture from 25 km to 1 km [26].
Auxiliary variables are also beneficial for the upscaling of multi-point ground-based soil moisture observations. Generally, the footprint of a remote sensing pixel is much larger than the representative area of in situ soil moisture measurement, which leads to a huge scale gap when validating soil moisture remote sensing products, especially when the spatial heterogeneity of soil moisture is increased by the compound influences of precipitation, soil texture, vegetation cover, and topography. At present, Wireless Sensor Network (WSN) [27] and COsmic-ray Soil Moisture Observing System (COSMOS) [28] are two promising methods capable of overcoming the scale mismatch between point observations and remote sensing pixels. Various upscaling methods have been developed to upscale the multi-points soil moisture observations, including Bayesian linear regression [29], ridge regression [30], and Kriging [30][31][32]. Introducing auxiliary variables related to soil moisture into the upscaling process does not only compensate the weak spatial representation of the sparse in situ soil moisture measurements, but also takes into account the trend of change of soil moisture with time, improving upscaling accuracy [11].
Until now, multispectral remote sensing data ranging from visible, near-infrared, thermal-infrared to microwave bands have been used directly or indirectly as auxiliary variables relative to soil moisture. Because of the different remote sensing principles (reflection, radiation, and scattering properties) in different wavelengths, each type of auxiliary variable reflects soil moisture at different depths and under different vegetation cover conditions. Optical remote sensing indirectly takes advantage of the strong absorption in the visible bands and strong reflection in the near-infrared bands of vegetation; the relationship between soil moisture and spectral reflectance in the visible/near-infrared bands can be used to determine the soil moisture at the surface or in the top millimeters of soil. In general, the absorption in the visible bands increases with soil moisture. Recently, a series of vegetation indices based on optical remote sensing was developed to scale soil moisture via vegetation health conditions and water stress. During the entire vegetation growing season, there is an obvious positive correlation between NDVI [33] and soil moisture. Based on NDVI, the Vegetation Condition Index (VCI) [34,35], Anomaly Vegetation Index (AVI) [36], Temperature Condition Index (TCI) [35], and Temperature Vegetation Index (TVI) [37] were developed to comprehensively reflect soil water deficiencies in different years. However, soil moisture retrieval is limited in these bands due to their limited capability to penetrate clouds and vegetation, and the careful correction required to eliminate atmospheric effects [38].
Monitoring soil moisture with LST observed by a thermal-infrared remote sensor can be traced back to the 1970s. LST was utilized as a feasible proxy to infer soil moisture in drought-affected areas [39] thanks to soil thermal properties. Researchers demonstrated that, in cloudless conditions, LST reflects soil moisture when the land surface is bare [40]. Both the Crop Water Stress Index (CWSI) [6] and the Apparent Thermal Inertia (ATI) [41] are based on land surface emissivities and LSTs for estimating the variability of soil moisture under bare soil and low vegetation coverage conditions. The Moderate Resolution Imaging Spectroradiometer (MODIS)-derived ATI has been successfully used to map soil moisture (1 km) with an accuracy of 0.031 cm 3 /cm 3 in the Babao River Basin, China, where the land cover type is primarily alpine meadow [31]. Because of the limited sensitivity of LST to soil moisture over vegetated areas, auxiliary variables such as the Temperature Vegetation Dryness Index (TVDI) [42] and the Vegetation Temperature Condition Index (VTCI) [43] were proposed to estimate soil moisture and monitor drought. For example, Kang et al. (2015) found that the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) TVDI describes the heterogeneity of soil moisture within a microwave pixel [44]. Peng et al. (2015) utilized the VTCI calculated from MODIS to downscale the Climate Change Initiative (CCI) soil moisture from 0.25 • to 5.6 km [45], while the result showed reasonable agreement with soil moisture observations. However, building the TVDI or VTCI feature space requires not only a large number of samples to cover the extremely dry to humid conditions, but also stable and uniform meteorological conditions in the study area. Additionally, there is uncertainty in the determination of wet and dry edges. Due to the close coupling relationship between soil moisture and evapotranspiration in a water limited region, the Soil Evaporative Efficiency (SEE) [46], defined as the ratio of the actual to potential soil evaporation, was employed to reflect soil moisture in last decades. Merlin et al. (2009) utilized the SEE to downscale the SMOS soil moisture product from a 40 km resolution to 10 km/4 km resolution [22]. However, calculation of soil evaporation directly from remote sensing is difficult, and its uncertainty increases with precipitation and irrigation.
Microwave remote sensing, which utilizes radiative and scattering signals to monitor hydrological variables, maps soil moisture based on the microwave dielectric sensitivity. The combination of active and passive L-band microwave remote sensing holds strong potential for improving the spatial resolution of soil moisture. Active microwave remote sensing can provide much higher spatial resolution, and passive sensors can provide frequent observations, albeit with coarser spatial resolution. With the support of high-resolution microwave backscattering as an auxiliary variable, higher resolution microwave brightness temperature data can be easily obtained by decomposing the passive microwave brightness temperature of the L-band; thus, higher resolution soil moisture can be retrieved [47]. However, both soil surface roughness and soil structure are the main limitations for soil moisture estimation [48] with these methods.
Recently, researchers combined land surface variables estimated from optical, thermalinfrared, and microwave sensors to scale soil moisture with satisfactory accuracy and resolution [49][50][51][52]. For example, many researchers have combined land surface albedo and brightness temperature to retrieve soil moisture with a high accuracy (RMSE ≤ 0.048 cm 3 /cm 3 ) [53,54]. Knipper et al. (2017) compared the roles of albedo and onboard brightness temperature in improving the resolution of soil moisture products (1 km) and found that brightness temperature provides optimal precision for the spatial variability of soil moisture [55].
A variety of auxiliary variables used to scale soil moisture have been explored during the past decades. However, a systematic evaluation of their applicability and contribution to the scaling transformation of soil moisture, especially in croplands, is still missing. Therefore, in this study, four auxiliary variables quantified from remote sensing images, including NDVI, LST, TVDI, and SEE, were analyzed to show their applicable conditions, the sensitivity to soil moisture, correlations, and sensing depth to soil moisture, and to provide useful information for the scale conversion of soil moisture. Random Forest (RF) was then applied to identify the auxiliary variable that is most suitable for mapping soil moisture, eventually trying to use RF to solve the problems of nonlinearity between soil moisture and auxiliary variables, and mutual correlation between variables, and hoping to use optimal auxiliary variable to guide RF to obtain high-precision of soil moisture.
The paper is organized as follows. Section 2 briefly describes the study area and the key datasets supporting the analysis, including remote sensing images and the Hydro-meteorological Observation Network and Ecohydrological Wireless Sensor Network (EHWSN). Section 3 describes in detail four auxiliary variables closely related to soil moisture and evaluates their performance with in situ soil moisture, the spatial consistency evaluation method between the auxiliary variables and the Polarimetric L-band Multi-beam Radiometer (PLMR)retrieved soil moisture products are introduced to thoroughly demonstrate the selected optimal auxiliary variable, and Random Forests (RF) is introduced to map soil moisture. Section 4 presents the results and discussions, and is followed by conclusions in Section 5.

Study Area
The middle reach of the Heihe River Basin (HRB) in the arid region of northwest China has been selected as the experimental study area thanks to its rich observation infrastructure maintained by the Heihe Watershed Allied Telemetry Experimental Research (HiWATER) project, which includes 21 Automatic Meteorological Stations (AMSs) and 50 WATERNET nodes located in an observation matrix (5.5 km × 5.5 km) [56]. Additionally, HiWATER provides synchronous space-sky-ground observation datasets that satisfy the experimental requirements.
The extent of the study area ranges from latitude 38.75 • N to 39.00 • N and from longitude 100.20 • E to 100.55 • E with elevations of 850-2000 m ( Figure 1). The land cover types mainly include farmland, orchard, village, road, irrigation channel, bare land, and desert. The major crops are seed maize, wheat, and vegetables [56]. The climate of the study region is dry with long and cold winters, while the summer is hot and short; occasional sandstorms occur in the period from March to May. The annual mean air temperature is 6 • C, and the mean maximum and minimum temperatures generally occur in July and January, respectively. The annual precipitation is about 100-250 mm, and 70% occurs from June to September. The potential evapotranspiration is as high as 1200-1800 mm per year. Within the research area, the irrigation districts of Daman and Yingke have a complete irrigation infrastructure with densely distributed main canals, branch canals, lateral canals, field ditches, and sublateral canals. The crops are irrigated approximately once every twenty days during the growing season with water from rivers and supplemented by wells. The significant spatiotemporal variations of soil moisture and evapotranspiration in this region result in both the fractured landscape and a rotational irrigation system.

Materials
All data in this study were provided freely by the HiWATER project (http://westdc. westgis.ac.cn/hiwater, accessed on 1 June 2017), which is a multi-scale integrated observation experiment relying on satellites, aerial sensors, and ground observations started in the HRB in 2012 [56]. The data used in this study include remote sensing images and the Hydro-meteorological Observation Network data and Ecohydrological Wireless Sensor Network (EHWSN) data.

Remote Sensing Images
ASTER reflectivity and its LST products [57,58] were utilized to estimate the auxiliary variables. Nine ASTER L1B images (15 June, 24 June, 10 July, 2 August, 11 August, 18 August, 27 August, 3 September, and 12 September of 2012) were applied after a geometric and radiative calibration. The visible and near-infrared bands (band 2 and band 3) of ASTER were used to calculate the Vegetation Index (VI). Other auxiliary indices, such as TVDI and SEE, were derived based on VI and ASTER LST products.
The soil moisture products (700 m) of PLMR [59] were used as a reference to evaluate the spatial distribution patterns of the auxiliary variables. The PLMR soil moisture products on 30 June, 7 July, 10 July, 26 July, and 2 August of 2012 were retrieved from six L-band microwave channels (three incident angles at 7 • , 21.5 • , 38.5 • , and dual-polarizations) by the Levenberg-Marquardt (LM) optimal algorithm. The product's accuracy is better than 0.04 cm 3 /cm 3 compared with WATERNET observations, which is suitable for the study of soil moisture scale conversion and validation of remote sensing products.

Hydro-Meteorological Observation Network
The hydro-meteorological network in the middle reach of HRB, located in the southwest of the Zhangye City, includes a Daman superstation (AMS_15) and 20 ordinary AMSs ( Figure 1). The Daman superstation includes a multiscale observation system for surface fluxes and soil moisture profiles equipped with a 40 m boundary layer tower, one lysimeter, two Eddy Covariance (EC) systems (4.5 m and 17 m), four groups of Large Aperture Scintillometer (LAS), one cosmic-ray probe, and one stable isotopic observation system. The difference is that the ordinary AMSs lack a layer of EC (one EC system) compared with the Daman superstation [60].

Ecohydrological Wireless Sensor Network (EHWSN)
The EHWSN was installed in the intensive observation matrix of 5.5 km × 5.5 km extent (Figure 1), located in the hydro-meteorological network [27]. The EHWSN includes 50 soil moisture/temperature WATERNET nodes. Their distributions were spatially optimized with the stratified non-homogeneity method [61,62] to capture the spatiotemporal variations of soil moisture, soil temperature, and land surface temperature in the heterogeneous surface.
The soil moisture at the depths of 4 cm and 10 cm measured by each WATERNET node and AMS (Daman superstation and 20 ordinary AMSs) were combined to evaluate the applicability of remote sensing auxiliary variables to construct the RF training samples and to verify the RF mapped soil moisture. The 0 cm soil temperature observations were utilized to verify the component temperature of ASTER LST.

Methodology
The ASTER images were first introduced to calculate four frequently used auxiliary variables, including NDVI, LST, TVDI, and SEE. The performance metrics between time series of these auxiliary variables and soil moisture observations at 4 cm and 10 cm were evaluated to quantitatively determine their representativeness. The Hausdorff Distance (HD) method was introduced to evaluate the spatial consistency between these auxiliary variables and PLMR-retrieved soil moisture products to validate the feasibility of the auxiliary variables at large scales. Eventually, the machine learning method RF was used to map soil moisture using both single auxiliary variables and multiple auxiliary variables combinations, respectively.

Normalized Difference Vegetation Index (NDVI)
NDVI synthetically reflects the differences in chlorophyll absorption features between visible and near-infrared bands [33]. This auxiliary variable is sensitive to changes in chlorophyll in plant leaves, and is an indirect indicator of soil water content. Due to the influence of vegetation, soil moisture, and atmosphere, NDVI shows large seasonal and regional variations.

Land Surface Temperature (LST)
Due to its key role in surface-atmosphere interactions, LST is one of the essential variables in hydrology, meteorology, climatology and ecology [39]. Generally, there is a negative correlation between LST and soil moisture due to the soil heat capacity changing with soil moisture. In this study, the LST remote sensing products (15 m) were from the dataset named "HiWATER: ASTER LST dataset in 2012 in the middle reaches of the HRB". The products were obtained by the temperature and emissivity separation algorithm, after the atmospheric calibration of ASTER L1B data using the MODIS atmospheric profile product (MOD07) and the radiative transfer model MODTRAN (MODerate spectral resolution atmospheric TRANsmittance algorithm and computer model). These products demonstrated a reasonable accuracy, with an average bias of 0.5 K and average RMSE less than 2 K [57].

Temperature Vegetation Dryness Index (TVDI)
The estimation of TVDI [40,42,63] is based on a two-dimensional feature space constructed by NDVI and LST ( Figure 2). The construction of the feature space requires that the study area is large enough to cover canopy coverages ranging from bare soil to dense vegetation, and surface soil moisture varying from dry to saturated. The scatter plots of NDVI and LST generally form a triangle or trapezoid space ( Figure 3) in which the location of a pixel represents a certain soil moisture and evapotranspiration. Therefore, TVDI is more suitable for the estimation of soil moisture in arid regions. The Formula (1) to calculate TVDI is: where LST NDV I (K) is the land surface temperature at a pixel; LST NDV I,max and LST NDV I,min are the maximum and minimum LST (2) in the study area, respectively, under the same NDVI conditions of LST NDV I . LST NDV I,max under differing vegetation cover is assumed to represent pixels with unavailable soil moisture content, and forms a "dry edge" at the upper boundary of the feature space. LST NDV I,min slowly increases with vegetation cover and describes pixels close to potential evapotranspiration, which form a lower, nearly horizontal, "wet edge" [64][65][66][67]. Generally, linear regression is performed to fit the dry and wet edge: where a 1 and b 1 are the intercept and slope for the wet edge, while a 2 and b 2 are those for the dry edge, respectively. At the dry edge (TVDI = 1), soil moisture is close to the wilting point; latent heat flux is assumed to be 0 W/m 2 and the sensible heat flux reaches its maximum. The wet edge is usually simplified as a line parallel to the NDVI axis ( Figure 2) when the slope of the wet edge is close to 0, and the soil water content is near to field capacity. For a pixel in Figure 2, the ratio of section A and B is defined as TVDI, where A is the vertical distance between the pixel and the wet edge, whereas B is the vertical distance between the dry and wet edge. If the pixel (NDVI, LST) is closer to the dry edge, the TVDI is higher and the soil moisture is lower. Otherwise, the situation is reversed. The algorithm described in this section was applied to all nine ASTER data acquired over our study area. Figure 3 shows the plots of LST against NDVI in a two-dimensional space for the data acquired on 15 June, 24 June, 10 July, 2 August, 11 August, 18 August, 27 August, 3 September, and 12 September of 2012, and the corresponding dry and wet edges determined automatically. Each feature space constructed by the LST changed under the NDVI with a step of 0.005. Furthermore, the dry and wet edges were fitted by maximum and minimum LST under corresponding NDVI. The maximum LST gradually decreased, whereas the minimum LST slowly increased with the NDVI. This figure confirms that the pixels in the study area form a triangle or trapezoid space in the two-dimensional feature space NDVI/LST, and the dry and wet edges are determined on the basis of the triangle or trapezoid space. The negative slope associated with the wet edge might result from the high rate of evapotranspiration from canopies compared with bare soil surfaces under a non-limiting water situation.

Soil Evaporation Efficiency (SEE)
SEE is defined as the ratio of actual to potential soil evaporation. The rationale for choosing SEE as fine-scale information is based on its strong correlation with surface soil moisture [22]. However, there are large difficulties and uncertainties in estimating soil evaporation directly from remote sensing due to the influence of atmospheric conditions, surface soil moisture, vegetation coverage, and soil texture. Therefore, Nishida et al. (2003) put forward a method to indirectly estimate SEE using surface soil temperature and vegetation fraction over heterogeneous surfaces [46]. SEE is a function of surface soil temperature (3): where T s,max and T s,min are the maximum and minimum surface soil temperature, respectively. T s is the surface soil temperature obtained by decomposing the ASTER LST based on NDVI [68] as following: (I) When NDVI < 0.2, the pixel mainly consists of bare soil and the ASTER LST is considered as the surface soil temperature (4): (II) When 0.2 ≤ NDVI < 0.5, the land surface is regarded as a compound of bare soil and vegetation, where the LST is a combination of vegetation canopy temperature and surface soil temperature. The surface soil temperature T s (5) was obtained based on the vegetation fraction and vegetation canopy temperature: where T c is the vegetation canopy temperature (6), f c is the vegetation fraction (7), ε is the surface emissivity (8), ε s and ε c are the emissivities of the bare soil and full vegetation covered surface, and NDV I min and NDV I max are the minimum and maximum NDVI, respectively. (III) When NDVI ≥ 0.5, the pixels are regarded as fully vegetated, the canopy temperature is substituted by LST (9), and the surface soil temperature is obtained with Equation (6).
The components temperature decomposed from ASTER LST are verified with the observed 0 cm soil temperature by the Hydro-meteorological Network. The result demonstrates that there is a good consistency (R 2 = 0.91, RMSE ≤ 3.10 K) between the decomposed surface soil temperature and observed 0 cm soil temperature (Figure 4), which is superior to the retrieved surface soil temperature with accuracy of 4 K [69,70]. Kustas and Norman (1999) developed a simple two-source model for modeling surface soil temperature with 4 K lower than the radiometric temperature observations [69]. Zhao et al. (2014) utilized the radiative transfer theory to estimate the RMSEs within 4 K between the retrieved and observed surface soil temperature [70]. Therefore, it is considered that the component temperature decomposition method in this paper is reliable and the decomposed 0 cm soil temperature can be used for SEE calculation.

Performance Metrics
Correlation analysis was employed to compare the sensibilities of the auxiliary variables to soil moisture at depths of 4 cm and 10 cm during the entire vegetation growing season from 15 June to 12 September, 2012. The coefficient of determination (R 2 ) was used to evaluate the strength of the relationship. According to the overpass time of ASTER, the soil moisture observations at GMT 04:10 were selected as reference to quantitatively analyze the applicability of each auxiliary variable.

Hausdorff Distance (HD)
The HD method was introduced to evaluate the consistency of the spatial distribution between the auxiliary variables and the PLMR-retrieved soil moisture products. The HD is a type of maximum and minimum distance that compares two finite point sets A and B (10). HD has higher tolerance to perturbations in the locations of the points than the binary correlation technique because it measures their proximity rather than their exact correspondence [71]. Unlike most shape comparison methods that require constructing point-to-point correspondences between two images, HD can be calculated without the explicit pairing of points and only requires calculating the maximum distance between two datasets [72,73]. If the values of HD are floating in a small range of a certain median, there is a good spatial consistency between the A and B.
HD is defined as: where A = {a 1 , a 2 , · · · , a m } and B = {b 1 , b 2 , · · · , b n } are the point datasets of two images, and a i and b i are the pixels value of A and B. h(A, B) is the directed HD from A to B (11), and h(B, A) is that from B to A (12). · is the Euclidean norm of a vector. In this study, A and B are the auxiliary variables and the PLMR soil moisture product, respectively, both covering the extent between latitudes 38.75 • N to 39.00 • N and longitudes 100.20 • E to 100.55 • E in the middle reach of HRB. Two representative images (10 July and 2 August of 2012) of PLMR soil moisture products were selected to compare data accessibility and consistency. The auxiliary variables were normalized to the PLMR soil moisture product based on (13) to make their units and range of values comparable: where X is the normalized value, x is the auxiliary variable's value, x max and x min are the maximum and minimum values of the auxiliary variable, and the PLMR max and PLMR min are the maximum and minimum values of PLMR product, respectively.

Random Forests (RF)
The RF [74,75] is a machine learning algorithm based on the bagging integrated learning theory [76] and the random subspace method [77]. RF has high prediction accuracy and tolerance for anomalies and noise in the data, with rare over-fitting in practical application. RF has been widely used in various applications, such as data mining [78][79][80], bioinformatics research [81,82], and information classification [82][83][84]. RF's application accuracy can be enhanced further by fusing multi-source information.
Similar to other classifier models, such as deep learning and artificial neural networks, RF is based on a dataset of decision trees that is determined by more than one variable. Assuming that there is a multi-decision tree classification model {h(X, Θ k ), k = 1, 2, 3, · · ·}, {Θ k } is an independent identically distributed random vector and k is the count of decision tree models included. Hence, for any given variables X, each tree casts a unit vote for the most popular class and the final RF classifier is determined.
The RF method was introduced to map soil moisture by the recommended auxiliary variable. Multiple auxiliary variables combinations were also used to map soil moisture by RF and compare them with a single auxiliary variable. Four single auxiliary variables and 11 types (C 2 4 + C The procedure of RF in this study was as follows: (I) The training dataset p was constituted by the 4 cm-depth soil moisture observed by AMS and WATERNET. The bootstrap sampling method was used to extract the samples from the auxiliary variables with the condition that the volume of the samples dataset was same as that of the training dataset. (II) Decision trees were constructed by the sample dataset and then p types of corresponding classification results were achieved. (III) Each tree in the RF casts a unit vote for the most popular class. The final soil moisture products were predicted by the mean of ballot vote according to the p types of classification results. (IV) Correlation analysis and RMSE between the predicted and observed soil moisture were carried out to quantitatively evaluate the RF mapping accuracy.

Correlation between Auxiliary Variables and Soil Moisture Observations
Four auxiliary variables (NDVI, LST, TVDI, and SEE) were correlated with observed soil moisture measured over heterogeneous surfaces in the arid area.
Based on the R 2 values, the strengths of the correlations between the auxiliary variables and soil moisture at the superficial zone and the root zone were in the order of SEE > TVDI > LST > NDVI ( Figure 5). The auxiliary variable with a higher R 2 value with soil moisture performed better in the scaling transformation of soil moisture. There was an expected but moderate relationship between NDVI and soil moisture during the vegetation growing season (Figure 5a). In contrast, there was a negative but slightly stronger correlation (R 2 = 0.49) between LST and soil moisture (Figure 5b) than with NDVI due to the rapid rise of LST with water stress. There was also a negative correlation between TVDI and soil moisture (Figure 5c), as this auxiliary variable can better reflect the soil moisture status at a depth of 10 cm (R 2 = 0.68) than that at a depth of 4 cm (R 2 = 0.63). The higher the TVDI, the lower the soil moisture, and thus TVDI can indicate a water deficiency in agricultural production. Compared with these three auxiliary variables, the data points of SEE and soil moisture were more concentrated (Figure 5d). The correlations between SEE and soil moisture were better than for other auxiliary variables. The R 2 between the SEE and soil moisture was 0.67 at 4 cm depth, and was 0.74 at 10 cm depth.
In an arid environment, soil moisture links surface phenology with subsurface water storage, and strongly influences the surface water cycle and energy partitioning due to the strong coupling effect between soil moisture and evapotranspiration. Moreover, soil moisture in the root zone also controls vegetation health and percent coverage [85]. Vegetation health is closely related to transpiration, which is limited by soil moisture in the arid region of HRB. As an important characteristic of vegetation health, NDVI reflects vegetation transpiration by reflectivity, and is mainly used to represent the growth conditions of vegetation in the ecosystem. Therefore, NDVI always indirectly represents soil moisture changes. Nevertheless, because of the heterogeneous surface, the distribution of soil moisture and NDVI varied widely in HRB, leading to a weak correlation (R 2 = 0.41) between NDVI and soil moisture (Figure 5a). As a key factor of vegetation growth, stomatal resistance to transpiration is significantly affected by LST and is partly controlled by soil moisture availability. Suitable LST and soil moisture ensure vegetation health transpiration. LST is always influenced by the soil heat capacity and conductivity. These two thermal properties are functions of soil type and change with soil moisture. As a consequence, soil moisture largely controls LST through the energy balance of the land surface. The higher the soil moisture, the less energy is available for sensible heat flux of the surface [42]. Generally, soil heat capacity decreases with soil moisture increase, and this leads to a decrease in LST. Therefore, LST always indirectly reflected the extent to which vegetation absorbs root zone soil moisture (Figure 5b). However, the correlation between LST and soil moisture (R 2 = 0.49) was also influenced by the impact of land surface heterogeneity, soil types, climate conditions, and vegetation coverage. TVDI, derived based on the two-dimensional feature space formed by LST and NDVI, has the advantages of both NDVI and LST. Therefore, TVDI could reflect soil moisture more accurately (R 2 ≥ 0.63) than LST (R 2 = 0.41) or NDVI (R 2 = 0.49) (Figure 5c). However, the TVDI method requires a dry limit in the NDVI/LST triangle space [64] and unsaturated soil moisture.
Evaporation is controlled by soil temperature, vapor pressure deficit, soil moisture content, and other meteorological factors. To accurately simulate soil moisture content, it is essential to estimate the evaporation rate from the land surface. In wet conditions, soil evaporation approaches potential evaporation because it occurs at the surface of the soil and the deep moisture rises up through the capillaries and reaches the soil surface. In general, soil moisture content changes below field capacity and a dry layer exists at the soil surface. When soil evaporation occurs, soil moisture diffuses to the surface soil through the soil pores to replenish it. Therefore, soil evaporation is mainly controlled by the soil moisture content. In contrast, the SEE is a direct function of soil evaporation, so it was more sensitive (R 2 ≥ 0.67) to soil moisture (Figure 5d). In the NDVI/LST triangle space, the surface soil temperature corresponds to soil evaporation whereas the LST corresponds to the land surface evapotranspiration with changes of NDVI. Notably, the LST is closer to the surface soil temperature when the NDVI is smaller. Therefore, the SEE is more sensitive to soil moisture than the TVDI by considering the parameterization of SEE (Equation (3)).
Having described all of above, the SEE is presented as a more sensitive agency variable to soil moisture than other three variables ( Figure 5).

Consistency of Spatial Pattern between Auxiliary Variables and PLMR Soil Moisture Products
The consistency in spatial patterns of the auxiliary variables and PLMR soil moisture products was measured with HD. Box and whisker plots ( Figure 6) showed that the HDs between four auxiliary variables and PLMR soil moisture products were stable (normally distributed). However, the HD of NDVI and SEE converged more (maximum and minimum intervals) than the LST and TVDI. In contrast, the outliers of SEE were minimal and onesided by comparison with NDVI. At the same time, the minimum HDs of SEE were less than for the other three auxiliary variables. Besides, the standard deviations (STD) of HD (Table 1) also demonstrated that the HD of SEE to soil moisture were closer (STD ≤ 0.058) to the median values than in the NDVI, LST, and TVDI. These features comprehensively indicated that there was a better spatial consistency between the SEE and the PLMR soil moisture than that with other auxiliary variables. The spatial pattern distribution also showed that the variation ranges of SEE were basically consistent with that of PLMR soil moisture (Figure 7). In summary, the SEE is strongly recommended as the most optimal auxiliary variable for representing the distribution of soil moisture based on the HDs (Figure 6), the correlation coefficients ( Figure 5), and visual interpretation (Figure 7).

Mapping Soil Moisture by Random Forests (RF)
To evaluate the feasibility of using the selected auxiliary variable for soil moisture inversion, we mapped soil moisture by RF with each single auxiliary variable including NDVI, LST, TVDI, and SEE separately (Figure 8). The soil moisture mapping results accuracies were relatively poorer for NDVI (R 2 = 0.58, RMSE = 0.058 cm 3 /cm 3 ), LST (R 2 = 0.72, RMSE = 0.048 cm 3 /cm 3 ), and TVDI (R 2 = 0.77, RMSE = 0.043 cm 3 /cm 3 ). The performance of SEE was optimal (R 2 = 0.86, RMSE = 0.035 cm 3 /cm 3 ) compared with observations. This result indicates that the recommended auxiliary variable SEE, which is controlled by soil surface evaporation and is limited by the available soil moisture, can be used to accurately reflect soil moisture.  Eleven types of multiple auxiliary variables combinations were also used to map soil moisture. The Taylor diagram (Figure 9) shows the R, RMSE, and STD of the RF mapping results compared with observed soil moisture at 4 cm by Daman superstation, ordinary AMSs, and WATERNETs. The results showed that multiple auxiliary variable combinations all performed well for mapping soil moisture, with 0.90 < R < 0.93, 0.033 cm 3 /cm 3 < RMSE < 0.038 cm 3 /cm 3 , and 0.068 cm 3 /cm 3 < STD < 0.074 cm 3 /cm 3 between mapped and observed soil moisture ( Table 2). This performance can be explained by the fact that all auxiliary variables combinations contain both NDVI, which reflects the variability of the land cover, and LST, which reflects the complicated soil properties. However, combinations C, H, and K, which contain the auxiliary variable SEE, were better (R 2 = 0.86, RMSE = 0.034 cm 3 /cm 3 ) than other combinations. The main reason is that the SEE is closely related to soil evaporation and better reflects soil moisture. Therefore, considering the limitations of accuracy, sample calculation and complexity of the feature space construction of TVDI, combination H was found to be optimal to map soil moisture by RF. Additionally, comparing the mapping of soil moisture with four single auxiliary variables and their combinations demonstrated that SEE better reflects soil moisture than most multi-variable combinations (e.g., combination A). However, the combination H (R 2 = 0.86, RMSE = 0.034 cm 3 /cm 3 ) performed similarly to SEE (R 2 = 0.86, RMSE = 0.035 cm 3 /cm 3 ). Whether the SEE or the combination H was used for soil moisture mapping, the accuracies were less than the SMAP and SMOS soil moisture products' accuracy of 0.04 cm 3 /cm 3 . Therefore, we can conclude that SEE is an optimal auxiliary variable for soil moisture mapping and the multiple auxiliary variables combination H (LST, NDVI, and SEE) is expected to enhance mapping accuracy.  The performance of using the combination H to map soil moisture in the middle reach of HRB was evaluated. The scatter plots of in situ observed (except for 11 August because of lack of observations) and RF-mapped soil moisture ( Figure 10) showed that the correlations between the two datasets were 0.80 ≤ R 2 ≤ 0.91, 0.021 cm 3 /cm 3 ≤ RMSE ≤ 0.046 cm 3 /cm 3 . The mean RMSE (0.034 cm 3 /cm 3 ) was smaller than 0.04 cm 3 /cm 3 , which is the SMAP/SMOS accuracy and satisfies the research of drought monitoring and water resource management. The RF-mapped soil moisture agrees well with the observations at 4 cm of Daman Superstation ( Figure 11). The statistical analysis showed that soil moisture mapped by RF using the combinations C, H, and K compared to the observations with an RMSE of 0.034 cm 3 /cm 3 . The soil moisture mapped by SEE alone had the lowest RMSE of 0.035 cm 3 /cm 3 . Precipitation observations at Daman Superstation recorded significant rainfall on 5 June, 17 June, 27 June, 6 July, 16 July, 20 July, 29 July, 6 August, 12 August, 17 August, 31 August, and 23 September. Irrigation events took place on 7 June, 3 July, 28 July, and 25 August. Considering both the region's size and the uniform crop distribution, we concluded that precipitation or irrigation occurred in the whole study area. RF-mapped soil moisture based on combination H with a resolution of 15 m (Figure 12) was consistent with both precipitation and irrigation events in the study area. For example, soil moisture mapping results indicated that there was higher soil moisture on 2 August due to the residual moisture from the irrigation event of 28 July. Soil moisture on 18 August and 3 September was relatively higher due to the strong rainfall that occurred one or three days before. Whereas the values were comparatively lower on 10 July because land surface evapotranspiration removed most of the water from the irrigation and precipitation that occurred on 3 July and 6 July, respectively. In addition, on 15 June and 12 September, the soil moisture values were lowest due to the lack of precipitation or irrigation. However, due to the uneven distribution of the sampling points in the study area, some biases existed in the RF mapping results. For example, the RF samples lacked points in both the river and desert zones, and thus the RF mapping results do not reflect soil moisture changes in these zones. Therefore, an even distribution of representative in situ observations and multi-source information, such as other high-precision remote sensing data, will be needed for future RF samples.

Discussion
A variety of auxiliary variables have been used to scale soil moisture during the past decades. However, a systematic evaluation of their applicability is still missing. We analyzed four auxiliary variables quantified from ASTER images, including NDVI, LST, TVDI, and SEE from the trend consistency and spatial pattern. The auxiliary variables that reflected soil moisture at the depths of either 4 cm or 10 cm were, in order of performance, SEE > TVDI > LST > NDVI. Using only NDVI or LST did not reflect overall soil moisture because of the strong heterogeneity of the land surface, saturated soil hydraulic conductivity, soil texture, and climate conditions. NDVI saturates at higher vegetation density, not reflecting changes in soil moisture. NDVI often indicates the vegetation amount and chlorophyll content rather than water status and it is a rather conservative indicator of water stress [10]. LST often closely follows vegetation transpiration and can rise rapidly with water stress. TVDI is assumed to reflect soil moisture as it incorporates the advantages of both NDVI and LST. However, the construction of the NDVI/LST feature space requires that the study area is large enough to capture canopy coverages ranging from bare soil to dense vegetation and surface soil moisture varying from dry to saturated. Based on the performance metrics between these auxiliary variables and observed soil moisture and by virtue of the HD maximum and minimum distances between the auxiliary variables and PLMR soil moisture, the SEE, which is controlled by soil surface evaporation and contained more information than the other auxiliary variables, was found to be an optimal auxiliary variable for scaling and mapping of soil moisture. The RF machine learning method has high prediction accuracy and tolerance for anomalies and noise in the data, with rare over-fitting in practical application. RF has the advantage of learning the relationships between soil moisture and auxiliary variables, especially nonlinear, indirect relationships, based on a large number of samples. Physics-based relationships between the variables and soil moisture and machine learning gradually showed a trend of integration. Physics-based relationships constraining machine learning can improve the training efficiency. Physics-based relationships may guide machine learning methods to obtain high-precision of soil moisture. We introduced the RF both to evaluate the feasibility of the selected optimal auxiliary variable and to map soil moisture. Soil moisture mapped by each single auxiliary variable indicated that an optimal accuracy (R 2 = 0.86, RMSE = 0.035 cm 3 /cm 3 ) existed in the SEE-based mapping soil moisture and in situ observations due to the SEE's close relation to soil evaporation. Soil moisture mapped by multiple auxiliary variables combinations indicated that each combination can reflect soil moisture, with 0.90 < R < 0.93, 0.033 cm 3 /cm 3 < RMSE < 0.038 cm 3 /cm 3 , and 0.068 cm 3 /cm 3 < STD < 0.074 cm 3 /cm 3 because multiple auxiliary variables combinations contain more information about soil moisture than a single auxiliary variable. However, combination H (LST, NDVI, SEE) was relatively better (R 2 = 0.86, RMSE = 0.034 cm 3 /cm 3 ) than the others (R 2 < 0.85, RMSE > 0.035 cm 3 /cm 3 ) at reflecting soil moisture with an accuracy of 0.034 cm 3 /cm 3 , which is less than the SMAP/SMOS required soil moisture accuracy (0.04 cm 3 /cm 3 ). Soil moisture (15 m) mapped by combination H in the middle reach of the HRB was consistent with the spatiotemporal changes of irrigation or precipitation.
The optimal auxiliary variable (SEE) and the multiple auxiliary variables combination H (LST, NDVI, and SEE) screened in this study are based on the middle reach of HRB, which has complex surface heterogeneity, variable climate, and significant water and heat exchange. The study spans the entire vegetation growing season proved the feasibility of the method by using the optimal auxiliary variable and combination of auxiliary variables selected for soil moisture mapping. Therefore, for other surfaces, the selected optimal auxiliary variable and combination of auxiliary variables are also applicable under the ground data constraints.

Conclusions
To support scaling and mapping of soil moisture for producing mid-and highresolution soil moisture estimates and validating satellite remote sensing products, the applicability of four auxiliary variables (NDVI, LST, TVDI, and SEE) was quantitatively evaluated in the arid region of China.
The auxiliary variables that reflected soil moisture at the depths of either 4 cm or 10 cm were, in order of performance, SEE > TVDI > LST > NDVI. Based on the performance metrics between these auxiliary variables and observed soil moisture and by virtue of the HD maximum and minimum distances between the auxiliary variables and PLMR soil moisture, the SEE was found to be an optimal auxiliary variable for scaling and mapping of soil moisture.
The RF machine learning method was introduced both to evaluate the feasibility of the selected optimal auxiliary variable and to map soil moisture. Soil moisture mapped by SEE and auxiliary variables combination H (LST, NDVI, and SEE) are respectively more optimal than others at reflecting soil moisture with accuracies of 0.035 cm 3 /cm 3 and 0.034 cm 3 /cm 3 , which are less than the SMAP/SMOS required soil moisture accuracy (0.04 cm 3 /cm 3 ). In addition, soil moisture (15 m) mapped by combination H in the middle reach of the HRB was consistent with the spatiotemporal changes of irrigation or precipitation.
SEE is recommended as an optimal auxiliary variable for scaling and mapping of soil moisture. The multiple auxiliary variables combination H (LST, NDVI, and SEE) is recommended for enhancing the scaling and mapping accuracy of soil moisture. Future studies should evaluate the use of SEE for scaling soil moisture. However, two factors may lead to inaccurate SEE calculations. First, from the definition of SEE, directly estimating evaporation using remote sensing observations presents substantial difficulties and uncertainties. Second, the accuracy of surface soil temperature is influenced by vegetation cover. These problems will need to be solved in future research. In addition, multi-source data, such as high-precision remote sensing data as well as evenly distributed and representative in situ observations, should be fused into the RF sample construction to obtain an accurate, high-resolution soil moisture product.

Data Availability Statement:
The data presented in this study are available upon request from the corresponding author.