A Neural-Network Based Spatial Resolution Downscaling Method for Soil Moisture: Case Study of Qinghai Province

Currently, soil-moisture data extracted from microwave data suffer from poor spatial resolution. To overcome this problem, this study proposes a method to downscale the spatial resolution of soil moisture. The proposed method uses a neural network (NN) to establish a statistical relationship between low-spatial-resolution input data and soil-moisture data from a land-surface model. This statistical relationship is then applied to high-spatial-resolution input data to obtain high-spatial-resolution soil-moisture data. The input data include passive microwave data (SMAP, AMSR2), active microwave data (ASCAT), MODIS data, and terrain data. The target soil-moisture data were taken from the CLDAS dataset. The results show that adding the land-surface temperature (LST), the normalized difference vegetation index (NDVI), the normalized shortwave-infrared difference bare soil moisture index (NSDSI), the digital elevation model (DEM), and calculated slope data (SLOPE) to the active and passive microwave data improves the retrieval accuracy of the model. Taking the CLDAS soil moisture data as a benchmark, the spatial correlation increases from 0.597 to 0.669, the temporal correlation increases from 0.401 to 0.475, the root mean square error decreases from 0.051 to 0.046, and the mean absolute error decreases from 0.041 to 0.036. Triple collocation was applied in the form [NN, FY3C, GEOS-5] to the retrieved soil-moisture data to obtain the error variance and correlation coefficient between each product and the actual soil-moisture data. The NN data have the lowest error variance (0.00003) and the highest correlation coefficient (0.811), so we conclude that they are the most applicable to Qinghai Province. The high-spatial-resolution data obtained from the NN, the CLDAS data, the SMAP data, and the AMSR2 data were each correlated with the ground-station data, and the NN data were found to be of better quality.
This analysis demonstrates that the NN-based method is a promising approach for obtaining high-spatial-resolution soil-moisture data.


Introduction
Moisture stored in surface soil accounts for less than 0.001% of total global freshwater by volume but plays an important role in connecting global terrestrial water, energy, and carbon cycling processes [1]. By influencing soil evaporation and transpiration, soil moisture (SM) strongly affects the interaction between the land surface and the atmosphere [2]. Thus, a thorough understanding of SM can contribute to efficient monitoring of climate and environmental changes and provide valuable guidance for drought monitoring and flood forecasting in agriculture and forestry [3]. In addition, SM determines the partitioning of precipitation into infiltration and surface runoff, which controls plant growth [4]. Therefore, high-quality SM data are crucial in multiple fields, such as hydrology, meteorology, climatology, and water-resources management.
Traditional methods to monitor SM usually rely on automatic or manual collection methods, which have the advantages of temporal continuity and guaranteed accuracy. However, these methods are unsatisfactory because observation stations are too few, which is especially serious because SM results are representative only of the soil near a given station. In addition to poor spatial representation, these methods are time-consuming and labor-intensive [5].
However, recent developments in remote-sensing methods have created the possibility to obtain large-scale, long-term soil-moisture data. In this field, microwave radiometers have become the most important source of global SM data due to their better temporal sampling features. In particular, microwave bands such as the L (0.5-1.5 GHz), C (4-8 GHz), and X (8-12 GHz) bands have been widely used to measure SM [6]. Currently, four passive microwave sensors and one active microwave sensor monitor SM globally. The four passive sensors are the microwave radiation imager (MWRI), which operates in the X band, onboard the Fengyun-3 (FY3) satellite launched by the China National Space Administration (2008-present) [7]; the Advanced Microwave Scanning Radiometer 2 (AMSR2), which operates in the X and C bands, onboard the Global Change Observation Mission-Water (GCOM-W) satellite launched by the Japan Aerospace Exploration Agency (JAXA) (2012-present) [8]; and two dedicated L-band radiometers: the Soil Moisture and Ocean Salinity (SMOS) instrument launched by the European Space Agency (ESA) (2009-present) [9] and the Soil Moisture Active Passive (SMAP) instrument launched by the National Aeronautics and Space Administration (NASA) (2015-present) [10]. The active sensor is the ASCAT instrument (2007-present), which measures active backscatter in the C band from the MetOp satellite launched by the ESA and is an important source of active microwave data [11]. These microwave sensors have the advantages of providing a complete observation of the global land surface within two to three days and providing surface soil-moisture information on a large scale. Their major disadvantage, however, is their poor spatial resolution, which is typically about 25-40 km.
However, SM is subject to complex interactions between topography, soil, vegetation, and other meteorological factors, which leads to high spatial variability. Therefore, many regional hydrological and agricultural applications require SM data with a spatial resolution of several kilometers or even tens of meters. It is thus vital to develop techniques to obtain accurate, high-precision, soil-moisture data with high coverage.
The low spatial resolution of soil-moisture data extracted from passive microwave data is typically downscaled by combining it with other high-spatial-resolution data. Based on the combined data type, the following two categories emerge: (i) combinations of active and passive microwave data and (ii) combinations of visible, infrared, and microwave data. In previous work, Njoku et al. combined radar (active) and radiometer (passive) data to study SM under vegetated-terrain cover and analyzed the sensitivity with which multichannel low-frequency passive and active measurements can detect SM under different vegetation conditions [12]. In other work, Das et al. obtained a linear relationship between radar backscatter and soil-moisture data by merging coarse-scale radiometer SMAP SM data with the fine-scale backscatter coefficient to produce high-spatial-resolution (9 km) SM data [13]. Zhan et al. used a Bayesian method to merge relatively accurate 36-km radiometer brightness temperature with the relatively noisy 3-km radar backscatter coefficient and explored the potential for retrieving SM from these results. Their results show that the Bayesian method produces better data than direct extraction of either the brightness temperature or radar backscatter [14]. To combine visible and infrared remote sensing with passive microwave data, Wilson et al. combined and weighted terrain maps and other spatial attributes according to the correlations to generate SM data [15]. Srivastava et al. used artificial neural networks (NN), support vector machines, relevance vector machines, and generalized linear models to combine MODIS surface temperature with SM retrieved by SMOS and concluded that the artificial NN produced better results than the other methods [16]. Yang et al. estimated soil parameters by assimilating the brightness temperature data simulated by the land surface model and the radiative transfer model. By minimizing the brightness temperature errors of AMSR2, they estimated the SM [17]. In researching SM downscaling, Chen et al. used dual Kalman filters to assimilate the brightness temperature of AMSR-E with the MODIS surface temperature [18]. Finally, Chauhan et al. used the universal triangle approach to link the high-resolution normalized difference vegetation index (NDVI), surface albedo, and land-surface temperature to SM data, thereby disaggregating low-spatial-resolution microwave SM into high-spatial-resolution SM [19]. The common idea behind these methods is to establish a statistical correlation or physical model between SM and auxiliary variables.
Qinghai Province is in the northeastern part of the Qinghai-Tibet Plateau, which is the source of the Yangtze River, the Yellow River, and the Lancang-Mekong River, and is an important water-conserving area in China and Asia [20]. In recent years, under the influence of global warming, the climate of Qinghai has become warmer and more humid; glaciers and snowfields are shrinking year by year; rivers, lakes, and wetlands are shrinking; soil erosion is expanding; and the water-conserving function is deteriorating seriously. Soil moisture is an important surface parameter and plays an irreplaceable role in monitoring land degradation, drought, and water conservation [21]. Therefore, an urgent need exists to systematically monitor soil moisture in Qinghai Province. Because the area seriously lacks ground-truth data, retrieving SM there from remote-sensing data is of significant potential value.
The quality of remote sensing data largely determines the results of remote-sensing retrieval of soil moisture. Existing studies, such as those mentioned above, usually involve only a single passive microwave radiometer, or a single passive microwave radiometer combined with a single active microwave radar, for SM retrieval and do not use the L, C, and X bands simultaneously [22]. In this paper, we select multiple microwave sensors (SMAP, AMSR2, FY3C, ASCAT) as data sources and use the powerful multivariate and nonlinear fitting capability of the NN to analyze single-band and multi-band synergy across the L, C, and X bands in detecting soil moisture in Qinghai Province. Detecting SM information through multi-band synergy compensates for the insufficient information available from a single sensor. At the same time, elevation and slope data are introduced to account for the complex topography of Qinghai Province and to make the algorithm more universal [23]. Finally, we perform SM downscaling experiments by applying the NN model trained with low-resolution data to high-spatial-resolution MODIS and topographic data.
The paper is organized as follows: Section 2 explains the data and methods used in this study. Section 3 presents and discusses the main findings. Finally, Section 4 gives the main conclusions of the study.
See Table 1 for details of the microwave data. In general, the L, C, and X bands are sensitive to SM, whereas bands in other frequency ranges are not. Therefore, this study uses the brightness temperature from the 1.41 GHz channel of SMAP, from the 6.9, 7.3, and 10.65 GHz channels of AMSR2, and from the 10.65 GHz channel of FY3C. Specifically, the SMAP radiometer uses a 24 MHz bandwidth centered at 1.41 GHz. The AMSR2 radiometer uses bandwidths of 0.35, 0.35, and 0.10 GHz centered at 6.9, 7.3, and 10.65 GHz, respectively. The FY3C radiometer uses a 180 MHz ± 10% bandwidth centered at 10.65 GHz. In addition, we use the backscatter coefficient (σ40) from the 5.3 GHz channel of the ASCAT active microwave data as an input variable for the NN. The specific spatial distribution of soil moisture is shown in Figure 1.

Data from Land Surface Model
As target data, we use the CLDAS soil volumetric water content analysis product, which is published by the China Meteorological Data Service Centre. Comparing the quality-controlled SM observation data from automated monitoring stations in China with the CLDAS-V2.0 soil volumetric water content data shows that the CLDAS product fits the actual ground observation data [24], with a national regional average correlation coefficient of 0.89, a root mean square error (RMSE) of 0.02 m³/m³, and a deviation of 0.01 m³/m³. Therefore, CLDAS SM products are considered of higher quality than similar international products (such as the GLDAS and NLDAS products), and they also offer better spatial and temporal resolution. In addition, we use the SM data from the ground model GEOS-5 as an input product for triple collocation (TC). The GEOS-5 model provides hourly products, and this study uses the surface SM dataset at 05:30, which represents the SM in the 0-7 cm surface layer [25].

MODIS Data and Terrain Data
The auxiliary input data included land surface temperature (LST) and vegetation index (VI) data; the VI data comprised the enhanced VI (EVI) and the NDVI [26]. The daily LST data were provided by the MODIS-Terra LST product (MOD11A1) with a spatial resolution of 1 km. The EVI and NDVI data were provided by the MODIS-Terra VI product (MOD13A2) and have a temporal resolution of 16 days and a spatial resolution of 1 km. This study also uses data from the MODIS-Terra surface reflectance product (MOD09A1), with a temporal resolution of 8 days and a spatial resolution of 500 m. The annual land cover data (MCD12Q1) were also used here. The specific spatial distribution of vegetation cover is shown in Figure 2. In addition, we use the SRTM 90-m digital elevation model (DEM) data and slope data (SLOPE) calculated from the DEM for the study area. Table 2 summarizes the datasets used. The above data were obtained from the land processes distributed active archive center (https://lpdaac.usgs.gov/, accessed on 29 March 2021).
The measured data used in this paper comprise SM data and precipitation data from stations in the Qinghai area. SM data are obtained at a soil depth of 10 cm at hourly intervals from six automated SM stations (Delingha, Dulan, Golmud, Nomuhong, Tianjun, and Wulan). The precipitation data are daily and come from seven stations: Yeniugou, Xiaozaohuo, Dachaidan, Chaka, Wudaoliang, Xinghai, and Qumarai.

Data Preprocessing
Active and passive microwave data were not available for the study area from December to March because snow cover and frozen soil typically cover the land surface in winter, which might introduce large biases in satellite-retrieved products such as SM [27]. Therefore, this study uses the satellite data and ground station data for Qinghai Province (31°-40° N, 89°-103° E) from 1 April 2017 to 30 November 2017 and from 1 April 2018 to 30 November 2018. To apply satellite observation data and SM data in the NN model, all data were resampled to a grid of 0.25° × 0.3125°. SMAP and AMSR2 data were resampled by bilinear interpolation, ASCAT and FY3C data by the inverse distance weighted method, and CLDAS, MODIS, and terrain data by simple average aggregation. Since the passive microwave datasets (SMAP, AMSR2, and FY3C) include both ascending and descending orbits, we processed the data from these orbits separately. In addition, the brightness temperature data of the passive microwave data are divided into vertical and horizontal polarization channels, from which the microwave polarization difference index (MPDI) is calculated as [28]
$$\mathrm{MPDI} = \frac{T_{bv} - T_{bh}}{T_{bv} + T_{bh}}$$
where Tbv and Tbh are the brightness temperatures of the vertical and horizontal polarizations, respectively. Based on the assumption that the microwave channel is not subject to strong atmospheric attenuation, the MPDI is designed to eliminate the influence of surface temperature on microwave signals. In addition, it is a normalized polarization difference, which can serve as an indicator of SM status as a function of incident angle. The MPDI is also sensitive to the dielectric properties of soil, and even more so to the surface roughness. Therefore, the MPDI is high for flat surfaces but relatively low for rough surfaces, such as areas with vegetation cover.
In the preprocessing of the ASCAT data, the σ40 time series of each grid cell was renormalized to the range [0, 1], which means that the highest (lowest) backscatter value measured at that grid cell in this study is assigned the value 1 (0). The backscatter time index obtained from this preprocessed ASCAT data is abbreviated "BTI" [29]. This processing method emphasizes the time mode of the ASCAT signal and has been shown to reduce the retrieval time. However, since the processing is performed grid cell by grid cell, it might reduce the spatial information provided by the radar.
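The two preprocessing steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the (time, rows, cols) array layout, and the use of NaN for missing observations are all hypothetical choices.

```python
import numpy as np

def mpdi(tbv, tbh):
    """MPDI from V- and H-polarized brightness temperatures (K)."""
    tbv = np.asarray(tbv, dtype=float)
    tbh = np.asarray(tbh, dtype=float)
    return (tbv - tbh) / (tbv + tbh)

def backscatter_time_index(sigma40):
    """Rescale each grid cell's sigma40 time series to [0, 1].

    Assumed layout: sigma40 has shape (time, rows, cols), NaN = missing.
    """
    sigma40 = np.asarray(sigma40, dtype=float)
    lo = np.nanmin(sigma40, axis=0)   # per-cell minimum over time
    hi = np.nanmax(sigma40, axis=0)   # per-cell maximum over time
    # A constant series has no range; return NaN rather than divide by zero.
    span = np.where(hi > lo, hi - lo, np.nan)
    return (sigma40 - lo) / span

# Illustrative values only (not taken from the study).
tbv = np.array([270.0, 265.0])
tbh = np.array([250.0, 262.0])
index = mpdi(tbv, tbh)

sigma40 = np.array([-12.0, -10.0, -8.0]).reshape(3, 1, 1)  # one grid cell, three dates
bti = backscatter_time_index(sigma40)
```

Because the normalization is done per grid cell, two cells with very different absolute backscatter both span [0, 1], which is why the spatial information can be reduced.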

Triple Collocation Method
The traditional error estimation method typically compares retrieved SM data with actual observation data obtained from ground stations. However, such comparisons are usually limited in the number and location of instrument verification points, which makes it difficult to ensure robust datasets. In addition, the spatial mismatch between the ground data and the remote-sensing satellite data, as well as the heterogeneity of the ground surface, leads to representativeness errors and scale-conversion errors. Therefore, we used TC analysis [30][31][32] to estimate SM error. Compared with the traditional method to estimate SM error, TC (1) does not require a high-quality reference dataset, which means that it can verify three different SM datasets in the study area without ground measurement data; (2) simultaneously obtains the error variances of the three SM datasets; and (3) avoids the representativeness error caused by the spatial mismatch between the ground measurement data and the remote-sensing satellite data in the traditional estimation method. Finally, (4) the improved extended TC method [33] gives the correlation between each retrieved SM dataset and the actual surface-layer SM. For three collocated datasets X, Y, and Z, the error variance is expressed by [33]
$$\sigma_{\varepsilon_X}^2 = \sigma_X^2 - \frac{\sigma_{XY}\,\sigma_{XZ}}{\sigma_{YZ}}, \qquad \sigma_{\varepsilon_Y}^2 = \sigma_Y^2 - \frac{\sigma_{XY}\,\sigma_{YZ}}{\sigma_{XZ}}, \qquad \sigma_{\varepsilon_Z}^2 = \sigma_Z^2 - \frac{\sigma_{XZ}\,\sigma_{YZ}}{\sigma_{XY}}$$
where σ²X is the variance of X, and σXY is the covariance of X and Y. The correlation coefficient is [33]
$$R_X = \sqrt{\frac{\sigma_{XY}\,\sigma_{XZ}}{\sigma_X^2\,\sigma_{YZ}}}, \qquad R_Y = \sqrt{\frac{\sigma_{XY}\,\sigma_{YZ}}{\sigma_Y^2\,\sigma_{XZ}}}, \qquad R_Z = \sqrt{\frac{\sigma_{XZ}\,\sigma_{YZ}}{\sigma_Z^2\,\sigma_{XY}}}$$
where RX is the correlation between X and the unknown true SM state. In this study, the SM retrieved by the NN method, the SM collected from satellite data, and the SM obtained from the ground model GEOS-5 are triple-matched in the form [NN data, satellite data, reanalysis data] to estimate the error variance and correlation coefficient by TC. The SM obtained from the NN method is evaluated based on these results.
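As an illustration of the covariance-based TC estimates described above, the following sketch computes the error variance and the correlation with the unknown truth for three collocated series. It assumes the standard covariance formulation of extended TC with mutually independent errors; the function name and the synthetic data (including all noise levels) are hypothetical and are not results from this study.

```python
import numpy as np

def extended_triple_collocation(x, y, z):
    """Return (error variances, correlations with the truth) for x, y, z."""
    c = np.cov(np.vstack([x, y, z]).astype(float))  # 3 x 3 covariance matrix
    err_var = np.array([
        c[0, 0] - c[0, 1] * c[0, 2] / c[1, 2],
        c[1, 1] - c[0, 1] * c[1, 2] / c[0, 2],
        c[2, 2] - c[0, 2] * c[1, 2] / c[0, 1],
    ])
    rho_sq = np.array([
        c[0, 1] * c[0, 2] / (c[0, 0] * c[1, 2]),
        c[0, 1] * c[1, 2] / (c[1, 1] * c[0, 2]),
        c[0, 2] * c[1, 2] / (c[2, 2] * c[0, 1]),
    ])
    return err_var, np.sqrt(rho_sq)

# Synthetic check: one "true" SM series observed by three noisy products.
rng = np.random.default_rng(0)
n = 20000
truth = rng.normal(0.25, 0.05, n)
x = truth + rng.normal(0.0, 0.02, n)               # least noisy product
y = 0.02 + 0.9 * truth + rng.normal(0.0, 0.04, n)  # biased, noisiest product
z = 1.1 * truth + rng.normal(0.0, 0.03, n)

err_var, corr = extended_triple_collocation(x, y, z)
print("error variances:", err_var)
print("correlations with truth:", corr)
```

Note that the estimates do not depend on the additive bias or scaling of each product, which is why no reference dataset is required. With real data, sampling noise can make an estimated error variance slightly negative, so a masking rule for weakly correlated areas is a common safeguard.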

Evaluation Index
This study uses spatial correlation, temporal correlation, root mean square error, and mean absolute error to quantitatively analyze the aspects that differentiate SM retrieved by the NN and SM obtained from a model, as well as aspects that differentiate SM retrieved by the NN and SM collected from ground stations.
(1) Spatial correlation: ρspatial
Spatial correlation serves to evaluate how accurately the retrieval reproduces the spatial pattern of SM. It is obtained by calculating the Pearson correlation coefficient between the retrieved SM map and the modeled SM map over the whole area, which produces one correlation value per day. For a better comparison, the average spatial correlation is calculated as the average of all daily spatial correlations with a significance greater than 95%.
(2) Temporal correlation: ρtemporal
Temporal correlation is used to evaluate how well the retrieved SM matches the temporal variations in the SM. It is a location-related metric calculated at the pixel level. The Pearson correlation between the retrieved SM time series and the modeled SM time series is calculated for each pixel, which gives a correlation map. The mean temporal correlation is the mean value of all the pixels in the temporal correlation map.
(3) Root mean square error: RMSE
The RMSE reflects both the bias of the retrieved SM and its deviation about that bias. Therefore, it provides a comprehensive assessment of the retrieval, including its accuracy and precision. The RMSE is calculated at the pixel level by using the original SM time series, and a map of the RMSE is obtained for each retrieval. The mean RMSE is the mean value of all the pixels in the RMSE map.
An NN [34][35][36] is essentially a system for nonlinear mathematical calculation and can represent any complex nonlinear process. The multivariable nature and nonlinear ability of the NN fully exploit the synergy between different data. The NN used in this study has three parts: (1) an input layer, which receives the satellite observation data and auxiliary variable inputs; (2) a hidden layer; and (3) an output layer, which provides the SM. This structure suffices to fit any continuous function.
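The evaluation indices defined in this section (spatial correlation, temporal correlation, RMSE, and MAE) can be sketched as follows. This is a simplified illustration under assumptions: arrays are laid out as (time, rows, cols), the MAE is computed per pixel in the same way as the RMSE, and the 95% significance filter on daily spatial correlations is omitted for brevity. The function names and demo data are hypothetical.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation of two 1-D arrays, skipping non-finite pairs."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    ok = np.isfinite(a) & np.isfinite(b)
    if ok.sum() < 3:
        return np.nan
    a = a[ok] - a[ok].mean()
    b = b[ok] - b[ok].mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom) if denom > 0 else np.nan

def evaluate(retrieved, reference):
    """Mean spatial/temporal correlation, RMSE, and MAE for (time, rows, cols) arrays."""
    n_time = retrieved.shape[0]
    ret = retrieved.reshape(n_time, -1)
    ref = reference.reshape(n_time, -1)
    # One spatial correlation per day, one temporal correlation per pixel.
    spatial = [pearson(ret[k], ref[k]) for k in range(n_time)]
    temporal = [pearson(ret[:, j], ref[:, j]) for j in range(ret.shape[1])]
    diff = ret - ref
    rmse_map = np.sqrt(np.nanmean(diff ** 2, axis=0))
    mae_map = np.nanmean(np.abs(diff), axis=0)
    return {
        "spatial": float(np.nanmean(spatial)),
        "temporal": float(np.nanmean(temporal)),
        "rmse": float(np.nanmean(rmse_map)),
        "mae": float(np.nanmean(mae_map)),
    }

# Demo: a retrieval that equals the reference plus a constant 0.01 offset.
rng = np.random.default_rng(1)
reference = rng.uniform(0.05, 0.40, size=(10, 4, 5))
retrieved = reference + 0.01
scores = evaluate(retrieved, reference)
print(scores)
```

The demo shows why correlation and error metrics are reported together: a constant offset leaves both correlations at 1 while the RMSE and MAE equal the offset.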
The NN was trained with satellite observation data as input data and the corresponding ground model data as target data. The training dataset must represent the entire range of expected scenarios, which means that it must include all climate regimes and seasons. If the training data are well selected, the NN's performance when applied to the training data should differ little from its performance when applied to the entire dataset. Similarly, a NN should perform in the same way when applied to two sufficiently representative but completely different target datasets, meaning that any potential local or regional bias in the target data is corrected. These characteristics can be traced to the fact that the estimated spatial-temporal structure of the NN is determined by satellite observations instead of by target data [37]. In addition, the NN correlates the satellite observations with the most common SM among the input values in the target data, regardless of the location or acquisition time of the data [29].
The NN constructed in this study uses the Levenberg-Marquardt (LM) [38,39] training algorithm and applies error backpropagation [40] to update the weights. Since the LM algorithm stops when it finds a local minimum, the error surface is not fully explored. Therefore, in this study, the NN training was repeated four times, each time using random initial NN weights to ensure different starting points on the error surface; the optimal NN was selected for retrieving SM products.
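The restart strategy described above can be illustrated with a small single-hidden-layer network fitted by a basic Levenberg-Marquardt loop. This is only a sketch under assumptions: the network size, the tanh activation, the damping schedule, and the synthetic data are hypothetical, and the Jacobian is approximated by finite differences for brevity instead of the error backpropagation used in the study.

```python
import numpy as np

N_IN, N_HID = 2, 4                       # hypothetical layer sizes
N_PARAMS = N_IN * N_HID + 2 * N_HID + 1  # weights and biases of both layers

def predict(p, x):
    """Single hidden layer (tanh) followed by a linear output unit."""
    i = N_IN * N_HID
    w1 = p[:i].reshape(N_IN, N_HID)
    b1 = p[i:i + N_HID]
    w2 = p[i + N_HID:i + 2 * N_HID]
    b2 = p[-1]
    return np.tanh(x @ w1 + b1) @ w2 + b2

def residuals(p, x, y):
    return predict(p, x) - y

def jacobian(p, x, y, eps=1e-6):
    """Forward-difference Jacobian of the residual vector."""
    r0 = residuals(p, x, y)
    jac = np.empty((r0.size, p.size))
    for k in range(p.size):
        step = np.zeros_like(p)
        step[k] = eps
        jac[:, k] = (residuals(p + step, x, y) - r0) / eps
    return jac

def train_lm(x, y, p0, n_iter=50, lam=1e-2):
    """Basic Levenberg-Marquardt: accept a step only if it lowers the loss."""
    p = p0.copy()
    loss = np.sum(residuals(p, x, y) ** 2)
    for _ in range(n_iter):
        r = residuals(p, x, y)
        jac = jacobian(p, x, y)
        delta = np.linalg.solve(jac.T @ jac + lam * np.eye(p.size), -jac.T @ r)
        new_loss = np.sum(residuals(p + delta, x, y) ** 2)
        if new_loss < loss:
            p, loss, lam = p + delta, new_loss, max(lam / 10, 1e-10)
        else:
            lam *= 10
    return p, loss

# Synthetic data standing in for (satellite inputs, target SM).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(200, N_IN))
y = np.sin(2 * x[:, 0]) + 0.5 * x[:, 1] ** 2

# Four restarts from random initial weights; keep the lowest-loss network.
runs = []
for _ in range(4):
    p0 = rng.normal(scale=0.5, size=N_PARAMS)
    init_loss = np.sum(residuals(p0, x, y) ** 2)
    p_fit, final_loss = train_lm(x, y, p0)
    runs.append((final_loss, init_loss, p_fit))
best_loss, _, best_p = min(runs, key=lambda run: run[0])
print("final losses:", [round(run[0], 4) for run in runs])
```

Different starting points generally end at different local minima, so the final losses of the four runs usually differ; selecting the minimum is the simple model-selection rule sketched here.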
The key step in downscaling SM in this study is to build a statistical relationship using low-spatial-resolution data and then feed high-spatial-resolution data into this relationship to obtain the downscaled SM [26,41,42]. The spatial scales of the different datasets are unified by different resampling methods, as shown in Section 2.1.5. In particular, the low-spatial-resolution microwave data are resampled to 1 km spatial resolution by replication expansion, without changing the specific values, to make them consistent with the spatial resolution of MODIS and other auxiliary data. Figure 3 shows a flow chart of this process.
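The downscaling step can be sketched as follows: coarse microwave values are expanded by replication, stacked with fine-resolution auxiliary data, and passed to a model trained at coarse resolution. This is a minimal sketch under assumptions: the integer expansion factor, the single auxiliary variable, and the stand-in linear `model` are hypothetical and only illustrate the data flow, not the trained NN.

```python
import numpy as np

def replicate(coarse, factor):
    """Expand each coarse cell into factor x factor fine cells, values unchanged."""
    return np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)

def block_mean(fine, factor):
    """Aggregate a fine grid to the coarse grid by simple block averaging."""
    rows, cols = fine.shape
    return fine.reshape(rows // factor, factor, cols // factor, factor).mean(axis=(1, 3))

def downscale(model, coarse_microwave, fine_aux, factor):
    """Apply a coarse-trained model to replicated microwave + fine auxiliary inputs.

    model: callable mapping an (n_samples, n_features) array to (n_samples,).
    """
    fine_mw = replicate(coarse_microwave, factor)
    features = np.column_stack([fine_mw.ravel(), fine_aux.ravel()])
    return model(features).reshape(fine_aux.shape)

# Toy example: a 2 x 2 coarse grid expanded to 4 x 4.
coarse = np.array([[1.0, 2.0], [3.0, 4.0]])
fine_aux = np.arange(16.0).reshape(4, 4)

def model(features):
    # Stand-in for the trained NN (hypothetical linear relationship).
    return 0.1 * features[:, 0] + 0.01 * features[:, 1]

fine_sm = downscale(model, coarse, fine_aux, factor=2)
print(fine_sm)
```

Within one coarse cell, all spatial detail in the output comes from the fine-resolution auxiliary data, since the replicated microwave value is constant there.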

Selection of Microwave Band
By comparing the quality of the SM retrieved from various satellites and in different bands, we identified the most suitable microwave inputs from the different sensors. This exercise was done using the CLDAS SM dataset as reference data. Although the NN model was trained on a small subset of the available dataset, the entire dataset was used for retrieval and evaluation. Table 3 summarizes the average quality index of the SM calculated by comparing a single microwave input dataset with the target SM dataset (CLDAS). Table 4 summarizes the average quality index of the SM calculated by comparing a combination of microwave input datasets with the target SM dataset (CLDAS). Below, we discuss in detail the results of using different satellites and different bands. As shown in Table 3, in the 1.41 GHz (SMAP) band, Tbv has a higher spatial sensitivity to SM than Tbh, and the quality of Tbv in the ascending orbit exceeds that of Tbv in the descending orbit: the average spatial correlation increases by 0.014, the average temporal correlation increases by 0.087, and the RMSE and MAE decrease by 0.001 and 0.002, respectively. In the 6.9, 7.3, and 10.7 GHz bands, Tbh is more sensitive to SM than is Tbv. In addition, the MPDI obtained from preprocessing in these bands also has greater spatial sensitivity to SM. Among the AMSR2 microwave data, the 10.7 GHz band produces greater spatial and temporal correlation and lower RMSE and MAE in Qinghai Province than the 6.9 and 7.3 GHz bands and FY3C's 10.7 GHz band, which indicates a higher sensitivity to SM. In addition, the experiments show that Tbh and MPDI are highly similar in terms of spatial distribution in the 6.9, 7.3, and 10.7 GHz bands but differ significantly from Tbv in the 1.41 GHz band, which leads to the assumption that complementary relationships exist between them.
The processed BTI data were also more sensitive to soil moisture than the original σ40 data, with an increase of 0.01 in the average spatial correlation and an increase of 0.002 in the average temporal correlation, whereas the RMSE and MAE decreased by 0.002 and 0.001, respectively. Finally, the best microwave band combination in Qinghai Province was selected by joint retrieval from Tbv in the ascending orbit of the 1.41 GHz (SMAP) band, Tbh and MPDI in the descending orbit of the 10.7 GHz (AMSR2) band, and the BTI and σ40 data of ASCAT. For each of these five single inputs, Figure 4 shows the raw image, a map of the daily average SM obtained by the NN, and a map of the temporal correlation between the NN SM and the CLDAS SM. It illustrates the SM monitoring capability of the different wavebands in Qinghai Province. The first row of Figure 4 shows that the original Tbv data at 1.41 GHz indicate higher temperatures in bare land and forest areas than in grassland areas, and both bare land and forest areas have lower daily average SM, which means that more information is needed to distinguish forest areas from bare lands when retrieving SM. Meanwhile, lake areas, such as the Qinghai Lake area in the northeast, are coolest. The temporal correlation map indicates a weak negative correlation in hotter areas (higher Tbv), such as the bare lands in the northwest (Qaidam Basin) and the bare lands in the northeast corner, and a strong positive correlation in cooler areas. The original Tbh maps in the second row reveal a high sensitivity to vegetation; combining these with the maps of average daily SM shows that higher vegetation coverage and higher temperature result in greater SM. However, the poor distinction between the bare lands in the northwest and the mixture of bare lands and grassland in the southwest indicates that more information is needed to distinguish between these two areas.
The MPDI image in the third row shows a spatial distribution highly similar to that of Tbh in the second row. The BTI maps in the fourth row show higher BTIs in the grassland in the southeast and in the hinterland of the Qaidam Basin in the northwest, which corresponds to greater SM in the SM map, whereas the mixture of bare lands and grassland around the Qaidam Basin has a lower BTI, which corresponds to a lower SM. Accordingly, the correlation map shows a negative correlation for BTI in the northwest corner of the Qaidam Basin and a strong positive correlation for the mixtures of bare lands and grassland in the southwest and in the northeast. The σ40 image in the fifth row shows a spatial distribution highly similar to that of the temporal correlation map between the NN SM and the CLDAS SM for BTI in the fourth row. However, the differentiation between different areas of vegetation cover is worse in the SM map.
Figure 5 shows that the spatial distributions of SM obtained by NN retrieval from the four different combinations of data are highly similar to each other, and the overall SM increases from northwest to southeast, which clearly distinguishes bare soil areas, mixed bare soil and grassland areas, grassland areas, and forest areas, making up for the lack of detection capability of the single microwave bands in Figure 4. Figure 5a,b show greater positive correlation in the northeast region than do Figure 5c,d, which confirms that the descending-orbit Tbh of the 10.7 GHz (AMSR2) band is more capable of detecting SM information in Qinghai Province than is the descending-orbit MPDI of the same band.
The results in Table 4 show that the combination of SMAP_TBV_A and AMSR2_TBH_D produces higher-quality SM data than the combination of SMAP_TBV_A and AMSR2_MPDI_D. Taking the CLDAS soil moisture data as a benchmark, the combination of SMAP_TBV_A and AMSR2_TBH_D reaches a spatial correlation of 0.621 and a temporal correlation of 0.393, whereas the combination of SMAP_TBV_A and AMSR2_MPDI_D reaches 0.600 and 0.354, respectively. Given also the high similarity between Tbh and MPDI (see Figure 4), AMSR2_TBH_D is used as a final input variable. Furthermore, the addition of σ40 and BTI to the above NN reduces the RMSE and MAE. Compared with the original σ40 data, the BTI data obtained after preprocessing yield a greater temporal correlation between the SM obtained by the NN model and the CLDAS data. In addition, experience shows that active and passive microwaves have different sensitivities to SM, vegetation, and surface roughness. The 5.3 GHz observation frequency of ASCAT also differs significantly from that of SMAP (1.41 GHz) and AMSR2 (10.7 GHz). Therefore, the ASCAT dataset is considered a potentially useful dataset that could complement the combination of passive microwave data in the NN, and BTI is used as a final input variable. In summary, the final microwave input variables are ascending Tbv in the 1.41 GHz (SMAP) band, descending Tbh in the 10.7 GHz (AMSR2) band, and BTI data from ASCAT.

Selection of Auxiliary Data
This section discusses the results of a collaborative analysis of microwave data and auxiliary input data for SM retrieval. The purpose is to determine the content and type of information that can be extracted from microwave data and auxiliary observation data and determine how to combine these data to provide maximal information for SM retrieval. Experimental trials were conducted to add and combine various auxiliary input data based on microwave data and to retrieve SM from different combinations of datasets using the NN model. These results are compared with the CLDAS data to determine the optimal combination (see detailed results in Table 5). In addition, for completeness, the SM products retrieved from all available data are compared among themselves. The brightness temperature and backscattering coefficient obtained by active and passive microwave data are all affected by the opacity of vegetation cover, which reduces the radiation from the soil surface. Therefore, information about the vegetation strongly affects SM retrieval. Table 5 also shows that adding the VI data to the microwave data improves spatial and temporal correlations and reduces MAE and RMSE. Compared with EVI, NDVI improves the spatial correlation to 0.623. Given the high correlation between NDVI and EVI, NDVI is used as a final input variable.
Terrain data such as DEM and SLOPE also play an important role in the retrieval of SM by physical models. The complex mountainous terrain reduces the quality of the microwave data retrieved. In addition, precipitation is mainly concentrated at higher altitudes in many areas of Qinghai Province, leading to relatively lush vegetation cover, which strongly affects the SM. As a result, DEM and SLOPE are also used as final input variables. Table 5 shows that adding DEM to the NN model improves the spatial correlation to 0.634 and the temporal correlation to 0.412. When NDVI, DEM, and SLOPE are all added to the NN model, the spatial correlation reaches 0.676, and the temporal correlation reaches 0.450. The surface temperature information strongly affects the soil surface emissivity, which directly affects the brightness temperature and the backscatter coefficient. Table 5 shows that, when a single auxiliary input variable is added to the NN model, LST produces the greatest improvement in the temporal correlation, which reaches 0.441.
Compared with CLDAS data, the SM retrieved (see Table 5) from the combination of microwave data and auxiliary inputs NSDSI, NDVI, DEM, and SLOPE produces the highest spatial correlation of 0.684, whereas the temporal correlation is only 0.453. The SM retrieved from the combination of microwave data and auxiliary inputs LST, NDVI, DEM, and SLOPE produces the highest temporal correlation of 0.477, whereas the spatial correlation is only 0.663. The SM retrieved from the combination of microwave data and auxiliary inputs NSDSI, LST, NDVI, DEM, and SLOPE produces a spatial correlation of 0.669 and a temporal correlation of 0.475, which is the most balanced combination. Therefore, we use the microwave data and the auxiliary inputs NDVI, DEM, SLOPE, LST, and NSDSI as final input variables to obtain the daily average SM map of Qinghai Province, which is shown on the left side of Figure 6. The right side of Figure 6 shows the correlation map between the SM data retrieved from the NN and the CLDAS SM data. Figure 6 shows that the overall SM in the study area increases from northwest to southeast. Also, the high positive correlation in the grassland areas and the poor correlations in the Qaidam Basin (northwest corner) and in the mixture of forest and grassland (southeast corner) show that the above input variables allow us to retrieve SM from grassland areas but not from bare land and forest areas.
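The four scores used above to compare the NN retrievals with CLDAS can be sketched as follows. The text does not define the spatial and temporal correlations precisely, so this sketch assumes that the spatial correlation is the Pearson correlation across pixels for each day, averaged over days, and that the temporal correlation is the Pearson correlation across days for each pixel, averaged over pixels. The function name and array layout are hypothetical.

```python
import numpy as np

def evaluation_metrics(pred, ref):
    """Compare two SM cubes of shape (time, rows, cols); NaN marks missing data.

    The averaging conventions for the spatial and temporal correlations are
    assumptions of this sketch, not definitions taken from the study.
    """
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    valid = np.isfinite(pred) & np.isfinite(ref)

    def pearson(a, b):
        a = a - a.mean()
        b = b - b.mean()
        denom = np.sqrt((a * a).sum() * (b * b).sum())
        return (a * b).sum() / denom if denom > 0 else np.nan

    n_time = pred.shape[0]
    p2 = pred.reshape(n_time, -1)
    r2 = ref.reshape(n_time, -1)
    v2 = valid.reshape(n_time, -1)

    # One correlation per day (across pixels) and one per pixel (across days).
    spatial = [pearson(p2[t, v2[t]], r2[t, v2[t]])
               for t in range(n_time) if v2[t].sum() > 2]
    temporal = [pearson(p2[v2[:, i], i], r2[v2[:, i], i])
                for i in range(p2.shape[1]) if v2[:, i].sum() > 2]

    diff = pred[valid] - ref[valid]
    return {
        "spatial_r": float(np.nanmean(spatial)),
        "temporal_r": float(np.nanmean(temporal)),
        "rmse": float(np.sqrt(np.mean(diff ** 2))),
        "mae": float(np.mean(np.abs(diff))),
    }
```

Keeping the per-pixel temporal correlations as a 2-D array, rather than averaging them, would give a correlation map of the kind described for Figure 6.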

Triple Collocation Method to Verify Soil Moisture as Determined by Neural Network
To estimate how accurately the NN model determines the SM on a large scale, we apply triple collocation (TC) to analyze the SM from the NN. TC estimates the spatial distribution of the error for each dataset by locally solving the linear relationships between the three SM datasets. One of its assumptions is that the errors in the three datasets are mutually independent. Therefore, the FY3C SM data, which were not used to train the NN model, are combined with the GEOS-5 ground-model SM data and the SM data retrieved by the NN model in the form [NN SM, FY3C SM, GEOS-5 SM] for TC. Furthermore, to ensure the accuracy of the TC results, the areas where the correlation coefficient between the three datasets is less than 0.2 are masked and excluded from the final TC calculation. Finally, the error variance and the correlation coefficient between each dataset and the actual SM data are estimated.
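The per-pixel TC calculation described above can be sketched with the widely used covariance-based formulation, in which the error variances and the correlations with the unknown truth follow from the 3 x 3 covariance matrix of the triplet. Whether the study used this exact variant is not stated, so it is an assumption here, as is the reading of the 0.2 threshold as a requirement on every pairwise correlation. The function name is hypothetical.

```python
import numpy as np

def triple_collocation(x, y, z, min_corr=0.2):
    """Covariance-based TC for one pixel; x, y, z are co-located time series.

    Returns (error_variances, correlations_with_truth), each of length 3 and
    ordered as (x, y, z). Both are NaN if any pairwise correlation is below
    `min_corr` (assumed masking rule). Error variances are expressed in each
    dataset's own units.
    """
    data = np.vstack([x, y, z]).astype(float)
    data = data[:, np.all(np.isfinite(data), axis=0)]  # keep common valid days
    nan3 = np.full(3, np.nan)
    if data.shape[1] < 3:
        return nan3, nan3.copy()

    q = np.cov(data)       # rows are the three datasets
    r = np.corrcoef(data)
    pairs = np.array([r[0, 1], r[0, 2], r[1, 2]])
    if not np.all(pairs >= min_corr):  # also rejects NaN correlations
        return nan3, nan3.copy()

    err_var = np.array([
        q[0, 0] - q[0, 1] * q[0, 2] / q[1, 2],
        q[1, 1] - q[0, 1] * q[1, 2] / q[0, 2],
        q[2, 2] - q[0, 2] * q[1, 2] / q[0, 1],
    ])
    rho_sq = np.array([
        q[0, 1] * q[0, 2] / (q[0, 0] * q[1, 2]),
        q[0, 1] * q[1, 2] / (q[1, 1] * q[0, 2]),
        q[0, 2] * q[1, 2] / (q[2, 2] * q[0, 1]),
    ])
    return err_var, np.sqrt(np.clip(rho_sq, 0.0, 1.0))
```

Applied to every pixel of the [NN SM, FY3C SM, GEOS-5 SM] triplet, this would yield maps of error variance and correlation coefficient for each of the three products.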
The spatial distribution of the TC error variance shown in Figure 7 indicates that the error variance of the SM retrieved from NN and FY3C relative to the actual SM is lowest in the Qaidam Basin in the northwest, whereas the error variance of the GEOS-5 SM relative to the actual SM is significantly greater in the Qaidam Basin. Combining these results with the maps of the spatial distribution of the TC correlation coefficients shows that the spatial distribution of the correlation coefficient between the SM retrieved from NN and FY3C and the actual SM correlates with the error variance, meaning that the areas with greater error variance correlate more strongly with the actual SM data. Figure 8 shows that the error variance of NN SM relative to the actual SM is much lower than that of FY3C SM and GEOS-5 SM, with median error variances of 0.00003 (NN) < 0.00017 (FY3C) < 0.00030 (GEOS-5). The correlation coefficients of NN SM and FY3C SM with the actual SM are much greater than that of GEOS-5 SM, with median correlation coefficients of 0.811 (NN) > 0.792 (FY3C) > 0.516 (GEOS-5). NN and FY3C have similar median correlation coefficients, but NN has Q1 = 0.681 and a lower-limit outlier of 0.338, which are much greater than the corresponding values for FY3C (Q1 = 0.594 and a lower-limit outlier of 0.115). Therefore, after comprehensive analysis and comparison, the SM data retrieved by the NN model are of better quality for Qinghai Province.

Verification of Downscaled Soil Moisture from Neural Network
In this study, the downscaled SM dataset for Qinghai Province and the map of the daily average SM (Figure 9) were obtained by inputting high-spatial-resolution MODIS data and resampled microwave data into the NN model verified by TC. To verify the suitability of the downscaled SM data for Qinghai Province, we apply a correlation analysis in which we compare the downscaled 1 km SM data, the original SMAP SM data, the original AMSR2 SM data, and the CLDAS SM data with the SM data collected from six ground stations in Qinghai Province. In terms of data selection, for each time series we use all data available from the ground stations and from CLDAS, SMAP, AMSR2, and NN. In addition, each time series extends over at least 30 days to obtain good statistics. Furthermore, to determine whether the downscaled SM data capture the actual ground SM dynamics, we verify the variations over time of the downscaled SM by studying the time series of the seven ground precipitation stations (see Figure 10).
Table 6 shows that the correlation of the downscaled SM results of the NN model at the Dulan, Tianjun, and Wulan sites exceeds 0.6, and that the average correlation is larger than for CLDAS, SMAP, and AMSR2, thereby demonstrating that the NN model properly downscales the SM. Table 6 also reveals negative correlations with CLDAS SM at both the Golmud and Nuomuhong sites, whereas SMAP, AMSR2, and NN produce negative correlations at the Golmud site but positive correlations at the Nuomuhong site. This indicates that the temporal and spatial structures produced by the NN model are driven by the satellite observations rather than by the target data. Figure 9 shows a map of the daily average SM in Qinghai Province after downscaling; these results provide much more SM information than do large-scale maps of SM. Figure 10 shows that the downscaled SM strongly correlates with the precipitation data: the SM increases significantly after precipitation and decreases significantly during drought.
Furthermore, the downscaled SM data at Xiaozaohuo, Chaka, Wudaoliang, and other sites differ significantly in absolute value from the CLDAS data, although the two datasets remain consistent over time. The results demonstrate that the downscaled SM better captures the variations in precipitation over time, which indicates that it better reflects the actual variations in SM over time.
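The station comparison described in this section can be sketched as a Pearson correlation over the days on which both the product and the station report a value, with the 30-day minimum taken from the text. The function name and the NaN convention for missing days are assumptions of this sketch.

```python
import numpy as np

def station_correlation(product, station, min_days=30):
    """Pearson correlation between a product time series and station SM.

    Both inputs are daily series aligned on the same dates, with NaN for
    missing days (assumed convention). Returns NaN if fewer than `min_days`
    paired observations are available.
    """
    product = np.asarray(product, dtype=float)
    station = np.asarray(station, dtype=float)
    paired = np.isfinite(product) & np.isfinite(station)
    if paired.sum() < min_days:
        return float("nan")
    return float(np.corrcoef(product[paired], station[paired])[0, 1])
```

Calling this once per product (NN, CLDAS, SMAP, AMSR2) and per station would produce a table with the same layout as Table 6.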

Conclusions
This paper presents a method to retrieve soil moisture (SM) by combining multi-instrument observation data. The method uses a neural network (NN) to retrieve SM information from the passive microwave sensors SMAP and AMSR2 and the active microwave sensor ASCAT, as well as from MODIS data (LST, NSDSI, NDVI) and topographic data (DEM, SLOPE). The greatest advantage of this method is that it fully exploits the potential of joint SM retrieval from the different microwave sensors while also making full use of the segmentation capability of the high-spatial-resolution MODIS and topographic data.
In terms of microwave band selection, the best retrieval with the neural network was achieved by combining Tbv in the ascending orbit for the 1.41 GHz (SMAP) band, Tbh in the descending orbit for the 10.7 GHz (AMSR2) band, and the BTI data of ASCAT. The final NN SM dataset is obtained by combining the auxiliary data LST, NDVI, NSDSI, DEM, and SLOPE with these three sets of microwave data. The microwave-only model and the model with auxiliary data were both compared with the CLDAS SM dataset. The results show that the spatial correlation increases from 0.597 to 0.669, the temporal correlation increases from 0.401 to 0.475, the root mean square error decreases from 0.051 to 0.046, and the mean absolute error decreases from 0.041 to 0.036. All indicators improve, which confirms that the use of the auxiliary data improves the performance of the NN model.
In the triple collocation analysis, the low-resolution SM products obtained from the NN retrieval are of higher quality than the SM products from the FY3C satellite and the ground model GEOS-5 in Qinghai Province (i.e., the NN low-resolution products have the highest median correlation of 0.811, the highest correlation Q1 value of 0.681, and the lowest error variance of 0.00003).
Based on the comparison with the ground-station data, the downscaled NN SM dataset is also of better quality than the CLDAS product. Its correlation with the station SM exceeds 0.6 at three stations, namely Dulan (0.768), Tianjun (0.620), and Wulan (0.616), showing strong correlation. The correlation of the CLDAS SM product with the station SM exceeds 0.6 only at Dulan (0.759) and Wulan (0.670). In addition, the comparison with the rainfall station data shows that the downscaled NN SM data also better capture the dynamic changes of SM in the study area, producing higher SM values when there is more rainfall and lower SM values during the long dry season. Comparing the images before and after downscaling also shows that the downscaled SM provides more detailed SM information. One shortcoming of the downscaling process is that the downscaled SM is susceptible to interference from clouds and rain, which leads to a significant quantity of missing data, so future work will focus on data completion.
The results of this study confirm that the NN method can be used to obtain SM with high spatial resolution and can be applied to the Qinghai Province area. The data used herein can be downloaded for free from the official websites of the National Aeronautics and Space Administration (NASA), the Japan Aerospace Exploration Agency (JAXA), the European Centre for Medium-Range Weather Forecasts (ECMWF), and the China Meteorological Information Sharing Platform (CIMISS) without regional restrictions and can be used to produce stable 1 km SM data in the Qinghai Province area.