Upscaling of Surface Soil Moisture Using a Deep Learning Model with VIIRS RDR

In current practice for upscaling in situ surface soil moisture, commonly used statistical or machine learning-based regression models combined with remote sensing data show some advantages in accurately capturing surface soil moisture at the satellite footprint scale for specific local or regional areas. However, the performance of most of these models is largely determined by the size of the training data, and their limited ability to generalize when extracting correlations makes them unsuitable for larger-scale practice. In this paper, a deep learning model is proposed to estimate soil moisture on a national scale. The deep learning model has the advantage of representing nonlinearities and modeling complex relationships from large-scale data. To illustrate the deep learning model for soil moisture estimation, the croplands of China were selected as the study area, and four years of Visible Infrared Imaging Radiometer Suite (VIIRS) raw data records (RDR) were used as input parameters; the models were then trained and soil moisture estimates were obtained. Results demonstrate that the estimated models captured the complex relationship between the remote sensing variables and in situ surface soil moisture, with an adjusted coefficient of determination of R² = 0.9875 and a root mean square error (RMSE) of 0.0084 in China. These results were more accurate than the Soil Moisture Active Passive (SMAP) active radar soil moisture products and the Global Land Data Assimilation System (GLDAS) 0–10 cm depth soil moisture data. Our study suggests that deep learning models have potential for operational applications of upscaling in situ surface soil moisture data at the national scale.


Introduction
Soil moisture is a crucial variable in controlling the hydrologic cycle between the land surface and the atmosphere through vegetation evaporation and transpiration [1-4]. Accurate soil moisture estimation at a site is of great importance in modeling the surface hydrologic cycle and climate change. Direct ground measurements provide surface soil moisture with high accuracy and scalable frequency at the points measured. However, the most obvious limitation of ground soil moisture measurements is their spatial discontinuity [5]; point-based or ground station measurements therefore do not represent the spatial distribution, since soil moisture varies spatiotemporally [6]. For soil moisture estimation at a large scale (i.e., national scale), satellite remote sensing (RS) measurements are the preferred operational option. RS has shown great promise in providing improved spatial and temporal coverage of soil moisture measurements [7]. The main difficulty lies in how to accurately estimate the surface soil moisture. An effective method to enhance the accuracy of soil moisture estimates is to upscale in situ soil moisture measurements using RS measurements via statistical regression models. However, conventional statistical regression models have difficulties in extracting complex correlations between large-scale RS data and in situ soil moisture.
There are three types of upscaling methods. The first type is the land surface model-based upscaling approach, which merges in situ soil moisture measurements with predictions of a land surface model. Cai et al. [8] used a hyper-resolution land surface model (HydroBlocks) to upscale in situ soil moisture measurements for the SMAP Validation Experiment 2015. Such models can accurately upscale in situ soil moisture measurements at the field scale, but they depend on abundant ground data for parameterization [9]. Therefore, the model-based method might not be suitable for large-scale applications of soil moisture upscaling with few ground-based parameters.
The second type uses traditional statistical regression methods to upscale in situ measurements with optical/IR RS indices or active/passive RS land surface parameters. Using polynomial regression statistics based on Universal Triangle methods, Wang et al. [10,11] upscaled in situ soil moisture measurements using MODIS-based land surface temperature (LST) and normalized difference vegetation index (NDVI) data to map daily soil moisture products at 1 km resolution. By implementing a geostatistical algorithm, such as block kriging, researchers can compute the spatial semivariogram of surface soil moisture measurements at different stations in a local area and compute the surface soil moisture across the whole area [12,13]. Qin et al. [9,14] found that a Bayesian linear regression-based model performs better than ordinary least squares linear regression-based or block kriging-based models when upscaling in situ soil moisture data with MODIS-based apparent thermal inertia (ATI) data. Based on the strategy of temporal stability and the high frequency of in situ soil moisture observations [15], stations with temporally continuous measurements have been selected to build a linear regression model based on in situ soil moisture data. However, the complexity and nonlinearity of the relationships make it impossible to obtain large-scale estimates using traditional statistical regression methods.
The third type of approach involves using machine learning methods, such as support vector regression (SVR) and artificial neural networks (ANN). These methods can usually achieve more accurate soil moisture estimates than traditional statistical regression methods because they can better model nonlinear relationships, based on larger-scale soil samples, without special mathematical equations or assumptions about the data distribution. Sajjad et al. [16] found that an SVR-based model performed better than an ANN-based model when upscaling in situ soil moisture at 12 km resolution with resampled TRMM backscatter and resampled AVHRR NDVI data. However, constructing a stable regression relationship between these RS variables and in situ data remains challenging because it is difficult to extract the complicated nonlinearities from large training datasets. Recently, deep learning networks have been introduced to learn useful representations from large unlabeled datasets and have been applied to classification and regression in many fields [17,18]. When applied to a well-known benchmark, i.e., the recognition of handwritten digits in the Modified National Institute of Standards and Technology (MNIST) database, the best reported error rates are 1.6% for a shallow neural network using randomly initialized backpropagation and 1.4% for support vector machines (SVMs) [17]. Multicolumn deep convolutional neural networks (CNN) were the first to achieve near-human performance, with a best reported error rate of 0.23% [19]. RS applications, such as land cover classification [20], feature selection [21], and climatology prediction [22], use deep learning networks, but the use of these methods in soil moisture estimation at the national scale using RS data remains untested. The purpose of the present study is to investigate the possibility of using a deep neural network for upscaling soil moisture with large-scale datasets sampled from in situ soil moisture and RS measurements.
In this paper, we propose a deep learning method based on a deep feedforward neural network (DFNN) to upscale in situ surface soil moisture data for cropland in China using VIIRS raw data records (RDR). VIIRS RDR refers to the raw data transmitted to the earth from the satellite (the Suomi National Polar-Orbiting Partnership spacecraft, launched on 28 October 2011), which can be calibrated to radiance/reflectance and brightness temperatures with geolocation, namely VIIRS sensor data records (SDR) [23]. Since 2012, real-time daily VIIRS RDR have been received by the meteorological satellite direct broadcasting service system in the Ministry of Water Resources of China. In addition, the Ministry of Water Resources of China has the unique advantage of 1997 soil moisture ground stations and 10-day observations. Based on abundant satellite remote sensing and ground measurements and the compelling advantages of deep learning techniques, we upscaled soil moisture ground measurements using VIIRS RDR to achieve a national surface soil moisture product at 750 m resolution.
The remainder of this paper is organized into the following four sections. Section 2 describes the data and the study area. Then, we describe the deep learning models with the input parameters and building procedure in Section 3. Section 4 discusses the calibrated models and the adjustments in the model training parameters and assesses the results of the model validation and the residual charts, which are used to discuss the model variance. The paper concludes with a summary of our findings in Section 5.

Study Area
The focus of our study was China's cropland, which was delineated from the GlobeLand30 [24,25] land use land cover (LULC) map produced in 2010. The data comprise 10 land cover types, including cultivated land, forest, grassland, and others. We used the cultivated land class of GlobeLand30 as the cropland mask for 2012 to 2015. The dataset can be found at the following link: http://www.globallandcover.com (last access date: 2 October 2016). The cropland mask was extracted by identifying cultivated land as one class and all nine other land cover types as a single non-cropland class. In Figure 1, the green regions denote cropland, and the white regions represent non-cropland areas.
ISPRS Int. J. Geo-Inf. 2017, 6, 130

Data
In addition to the GlobeLand30 2010 dataset, VIIRS SDR, in situ soil moisture measurements, precipitation ground observations, SMAP and GLDAS data were used in this study as follows.

(1) VIIRS RDR
VIIRS RDR has five imagery bands, sixteen moderate resolution bands and one day-night band. In this study, thirteen moderate resolution bands were selected (see Table 1): M1, M2, M3, M4, M5, M6, M7, M8, M10 and M11 were used to calculate the top of atmosphere (TOA) reflectance, and M12, M15 and M16 were used to calculate the TOA brightness temperature. These data were collected on the 1st, 11th, and 21st day of each month from May to October during each year from 2012 to 2015. M9, M13 and M14 were not chosen for this study because most of the band radiances would be absorbed by water vapor according to the atmospheric transmission profile [26].
(2) In situ soil moisture measurements
The croplands of China contain 1875 distributed ground stations for soil moisture observations (see the red points in Figure 1). The stations are sparse in western China and dense in the other regions. The Chinese Ministry of Water Resources collects soil samples from all of the ground stations every 10 days to measure the soil moisture content in the top 10-cm soil layer by the gravimetric method [28]. We present the gravimetric soil moisture in percent in this article. These in situ soil moisture measurements were performed at 08:00 a.m. Beijing Time on the 1st, 11th, and 21st day of each month from May to October during each year from 2012 to 2015.

(4) SMAP and GLDAS data
We compare the upscaling estimates over China's cropland with the Soil Moisture Active Passive (SMAP) [30] active radar and Global Land Data Assimilation System (GLDAS) [31] soil moisture products to demonstrate the advantages of our model. SMAP is an orbiting observatory that measures the amount of water in the top 5 cm (2 inches) of soil everywhere on Earth's surface; it is designed to measure soil moisture every 2-3 days over a three-year period. SMAP's radar started transmitting data in January 2015 and stopped on 7 July 2015 when the radar sensor failed. Therefore, only three months (May to July 2015) of the SMAP radar daily soil moisture level 3 product (∼3 km resolution) corresponding to in situ soil moisture were obtained [30].
GLDAS produces global soil moisture (kg/m²) for vertical layers (0-100 cm) every three hours at 25 km resolution [31]. Because the GLDAS soil moisture at 00:00 UTC matches the time of the ground measurements (08:00 a.m. Beijing Time), we use the upper layer (0-10 cm) of the GLDAS soil moisture at 00:00 UTC every day over the same time range as SMAP for the research area. Readers can find these datasets by using the following link: https://search.earthdata.nasa.gov (last access date: 2 October 2016).

Input Parameters of Soil Moisture Estimation Models
Short Wave Infrared (SWIR) reflectance of soil has been shown to be sensitive to surface soil moisture. Lobell and Asner [32] describe the relationship between soil moisture and soil reflectance using in situ soil moisture. For a given value of soil porosity, soil moisture has a strong absorbing effect in all soil reflectance bands, especially in the SWIR regions where the wavelength is approximately 1300 nm, 1900 nm and 2200 nm.
Corresponding to these regions, M7, M8, M10 and M11 are the four VIIRS solar reflectance bands (SRB) defined as input parameters of MODEL I. Because visible/near infrared reflectance between 350 nm and 1000 nm is also correlated with surface soil moisture, these bands are affected by soil moisture absorption as well. M1, M2, M3, M4, M5, M6, M7, M8, M10 and M11 are therefore used as input parameters of MODEL II. In addition, land surface temperature (LST), which is a function of the Thermal Emissive Bands (TEB), has a linear relationship with surface soil moisture [33]. Therefore, M12, M15 and M16 are additionally employed as input parameters of MODEL III.
The model input calculation procedure includes the following steps (1)-(6), as shown in Figure 2.
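The three input configurations can be summarized in a short Python sketch. The band lists come from the text; the dictionary layout and the names `SWIR_GROUP`, `VNIR_GROUP`, `TEB_GROUP` and `input_dimension` are illustrative. It is an assumption here that MODEL III combines the thermal bands with the MODEL II bands, consistent with the sample description in the data grouping step.

```python
# Band groups named as in the text; the structure itself is illustrative.
SWIR_GROUP = ["M7", "M8", "M10", "M11"]            # MODEL I inputs
VNIR_GROUP = ["M1", "M2", "M3", "M4", "M5", "M6"]  # visible/near infrared
TEB_GROUP = ["M12", "M15", "M16"]                  # thermal emissive bands

MODEL_INPUTS = {
    "MODEL I": SWIR_GROUP,
    "MODEL II": VNIR_GROUP + SWIR_GROUP,
    # Assumption: MODEL III adds the thermal bands to the MODEL II inputs.
    "MODEL III": VNIR_GROUP + SWIR_GROUP + TEB_GROUP,
}


def input_dimension(model_name):
    """Return the number of explanatory variables for a model."""
    return len(MODEL_INPUTS[model_name])
```

Under this reading, the input dimension grows from 4 to 10 to 13 bands across the three models.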
(2) Convert SRB radiance and TEB radiance to TOA reflectance and brightness temperature, respectively.
To remove the bow-tie effect at the edge of each single scene of VIIRS SDR, we performed geometric correction to derive the correct radiance using the geolocation file. The TOA reflectance was then calculated by Equation (1), where L represents SRB radiance and EarthSun_dist denotes the distance between the earth and the sun, which can be calculated from the current date. Band mean solar irradiance (BMSI) and solar zenith angle (SZA) are SRB parameters that can be obtained from the header file of VIIRS SDR. The brightness temperature was calculated by Equation (2), where k denotes the Boltzmann constant, which is equal to 1.3806 × 10^-23 J/K; h refers to Planck's constant, which is equal to 6.6262 × 10^-34 J·s; c represents the speed of light, which is equal to 3 × 10^8 m/s; λ refers to the wavelength of a given VIIRS TEB; and L denotes the pixel value of VIIRS TEB radiance.
(3) Cloud removal of single scenes of VIIRS image
To remove cloud, we employed the Fmask method [34] using both VIIRS TOA reflectance and brightness temperature, since Fmask version 3.3 has been shown to detect cloud pixels efficiently and accurately in the Landsat series, the VIIRS SDR product and the Sentinel 2 level 1 product [35].
(4) Daily mosaic of single scenes of TOA reflectance or brightness temperature
We mosaicked single scenes of TOA reflectance or brightness temperature with non-cloud pixels over the whole area of China. During the process, the pixel values of the overlapping regions were averaged for each pixel.
(5) Removal of non-cropland pixels
Since soil moisture is meaningful for cropland sites in the research area, we removed non-cropland pixels such as water and urban areas. We used GlobeLand30 to exclude pixels belonging to the nine land cover types other than cropland.
(6) Data grouping for model calibration and validation
First, soil moisture measurements at the 1875 ground stations were confined to the valid range of 0 to 100, and invalid measurements were screened out. Then, a total of 68,342 soil moisture measurements were retained by a no-precipitation filter, which ensured that no precipitation occurred during the measuring period of the selected measurements. In situ soil moisture measurements were linked with the calculated TOA reflectance or brightness temperature at the same location and time, and a sample was kept only if all three measurements were valid. A total of 9789 pairs of samples were obtained by overlapping in situ soil moisture measurements and TOA reflectance of the VIIRS SWIR bands (M7, M8, M10, and M11); a total of 8096 pairs of samples were obtained by overlapping in situ soil moisture measurements and TOA reflectance of the VIIRS SWIR and visible bands (M1, M2, M3, M4, M5, M6, M7, M8, M10, and M11); and a total of 6428 pairs of samples were obtained by overlapping in situ soil moisture measurements and the TOA reflectance and brightness temperature of the VIIRS SWIR, visible and thermal bands (M1, M2, M3, M4, M5, M6, M7, M8, M10, M11, M12, M15, and M16). Finally, the valid soil samples were randomly split into model calibration (nine-tenths of the samples) and validation datasets (one-tenth of the samples). The datasets of each year (2012-2015) were split in the same way. The model calibration datasets were used as the input of the deep learning model, while the model validation datasets were used to validate the calibrated models. The soil moisture samples taken for training were independent of those taken for validation.
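The input calculation steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation. Equations (1) and (2) are not reproduced in this text, so the sketch assumes the standard TOA reflectance formula and the inverse Planck function, with CODATA values for the physical constants. The function names, the use of `None` for cloud or no-data pixels, and the `calibration_fraction` parameter are all hypothetical.

```python
import math
import random

H = 6.62607e-34   # Planck constant, J*s (assumed CODATA value)
K = 1.380649e-23  # Boltzmann constant, J/K (assumed CODATA value)
C = 2.99792458e8  # speed of light, m/s (assumed CODATA value)


def toa_reflectance(radiance, bmsi, sza_deg, earth_sun_dist_au):
    """Assumed standard form: pi * L * d^2 / (BMSI * cos(SZA))."""
    cos_sza = math.cos(math.radians(sza_deg))
    return math.pi * radiance * earth_sun_dist_au ** 2 / (bmsi * cos_sza)


def brightness_temperature(radiance, wavelength_m):
    """Inverse Planck function; radiance in W m^-2 sr^-1 m^-1."""
    a = 2.0 * H * C ** 2 / (wavelength_m ** 5 * radiance)
    return H * C / (K * wavelength_m * math.log(a + 1.0))


def mosaic_mean(scenes):
    """Average valid values per pixel; None marks cloud or no data."""
    mosaic = []
    for values in zip(*scenes):
        valid = [v for v in values if v is not None]
        mosaic.append(sum(valid) / len(valid) if valid else None)
    return mosaic


def split_samples(samples, calibration_fraction, seed=0):
    """Randomly split samples into calibration and validation sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_cal = round(len(shuffled) * calibration_fraction)
    return shuffled[:n_cal], shuffled[n_cal:]
```

For example, `mosaic_mean([[1.0, None, 3.0], [3.0, None, None]])` averages the first pixel over both scenes, leaves the second pixel empty, and takes the third pixel from the only cloud-free scene. The split fraction is left as a parameter so that the same helper can be applied to each year's dataset.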

Model Building Using Deep Learning
A deep feedforward neural network (DFNN) was used because it is an established supervised tool that can perform regression tasks, extract deep features among a large number of variables and enable high predictive accuracy. The H2O implementation of deep learning was used in this study, which is based on a DFNN that is trained with gradient descent using error backpropagation. Readers can find the H2O R package 3.0 edition by using the following link: https://github.com/h2oai/h2o-3 (last access date: 22 December 2016).
We built deep learning-based models including one input layer with two types of explanatory variables (VIIRS TOA reflectance and brightness temperature), one response variable (in situ soil moisture), one output layer (estimated soil moisture), and multiple hidden layers. The procedure of model building, as shown in Figure 3, includes the following steps.
The first step determines three training parameters: (1) the number of hidden layers; (2) the number of hidden units in each hidden layer, i.e., neurons; and (3) the number of iterations over the training samples, i.e., epochs. To find the optimal values, a grid search algorithm [37,38] was employed. For each model, the number of hidden layers started at 3 and ended at 11, neurons started at 100 and ended at 500, and epochs started at 1000 and ended at 3500. After each iteration, the number of hidden layers, neurons and epochs was increased by 1, 100 and 100, respectively. The optimal parameters are those for which the minimum MSE value and maximum R² are obtained.
(2) Model calibration using DFNN
By tuning hidden layers, neurons and epochs, three groups of models can be calibrated when the highest R² and lowest MSE values occur. The output of the calibration includes the calibrated models (model category, weights, biases, statuses of neuron layers, etc.), the estimated soil moisture, and the optimized training parameters (hidden layers, neurons and epochs).
(3) Model validation using in situ soil moisture
The number of validation samples was approximately one tenth of the total samples. All validation samples were independent of the training samples. To measure model variance and to further explore model performance, we performed residual analysis on estimated soil moisture and in situ soil moisture using residual scatter plots, residual frequency histograms, and residual cumulative percent plots. The residual scatter plots were employed to identify probable outliers under a conditional probability of 99.7%. Based on this confidence level, Jarque-Bera tests [39] were performed on the residual data to evaluate the normality of the models. Accompanied by residual frequency histograms and residual cumulative percent plots, the Jarque-Bera test uses the null hypothesis that the data are normally distributed, at a significance level of 0.3%. When the probability is lower than 0.003, the null hypothesis of normality is rejected; otherwise, it cannot be rejected. Additionally, the Jarque-Bera test results can be used to determine whether the residual data have the skewness and kurtosis patterns that match those of a normal distribution.

Model Calibration Using DFNN
By tuning the number of hidden layers, neurons and epochs, we obtained R² and MSE for the three models, as shown in Tables 2-4, respectively. For Model I, we first used 10 hidden layers and 500 neurons. When the number of epochs was less than 1000, low R² values occurred; when the number of epochs was increased to 2000, the results showed strong agreement, with R² = 0.9746 and MSE = 0.0003. This finding demonstrates that a large number of epochs enables the model to learn complex patterns and yields better prediction results in this situation. For nine hidden layers and 500 neurons, when the number of epochs was not equal to 1000, the results showed extreme instability and a poor correlation. When the number of epochs was equal to 3500, the results showed strong agreement, with R² = 0.9786 and MSE = 0.0003. Hence, even relatively shallow layers led to good estimations. For Model II, we first used 10 hidden layers and 500 neurons. When the number of epochs was less than or equal to 1000, low R² values occurred; when the number of epochs was increased to 1300, the results showed a good fit, with R² = 0.9169 and MSE = 0.0005.
For nine hidden layers and 400 neurons, when the number of epochs varied from 1000 to 1300, the results showed relatively poor correlation. For eight hidden layers, the results showed that a shallower DFNN can achieve better agreement, with a minimum MSE = 0.00009 and a maximum R² = 0.9851, than relatively deeper networks in this situation. For Model III, when we used nine hidden layers and 400 neurons, the coefficients of determination of the model were higher than those obtained with 10 hidden layers and 500 neurons. When the number of epochs was equal to 1300, the results showed a strong correlation, with R² = 0.9215 and MSE = 0.0005. For eight hidden layers and 500 neurons, the result reached the highest goodness of fit, with a minimum MSE = 0.00009 and a maximum R² = 0.9851, when the number of epochs was equal to 3500.
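The tuning procedure behind these results can be illustrated with a generic grid search. The sketch below is not the H2O API: `evaluate` is a hypothetical callable that trains one model and returns `(mse, r2)`. The step sizes (1 hidden layer, 100 neurons, 100 epochs) are an assumption based on the ranges given in the model building procedure.

```python
from itertools import product


def build_grid(layers=range(3, 12),
               neurons=range(100, 501, 100),
               epochs=range(1000, 3501, 100)):
    """Enumerate every (hidden layers, neurons, epochs) combination."""
    return list(product(layers, neurons, epochs))


def grid_search(grid, evaluate):
    """Return the parameters with the lowest MSE, breaking ties by highest R2.

    evaluate(layers, neurons, epochs) -> (mse, r2) is a hypothetical
    callable standing in for training and scoring one DFNN.
    """
    best_params, best_score = None, None
    for params in grid:
        mse, r2 = evaluate(*params)
        score = (mse, -r2)  # smaller MSE first, then larger R2
        if best_score is None or score < best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

With these assumed step sizes the grid holds 9 × 5 × 26 = 1170 combinations, which suggests why training was computationally intensive for the models with more input bands.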

Model Validation Using In Situ Soil Moisture
The optimal versions of Model I (highlighted in Table 2) were validated using 1000 pairs of validation samples. As shown in Figure 4, the adjusted coefficients of determination were greater than 0.95, demonstrating that Model I was able to capture more than 95% of the variability in the measured soil moisture data. The model using 10 hidden layers was more stable than that using nine hidden layers, because the RMSE was reduced from 0.0282 to 0.0131. The optimal versions of Model II (highlighted in
The models using eight hidden layers generated better results than those using more hidden layers. By comparing the three groups of models, we observed that deep neural hidden layers are not suitable for relatively small-scale datasets containing 5900 to 7300 samples. At the same time, the iterative operations in the model training of Models II and III were more computationally intensive than those of Model I. This is mainly because Models II and III have higher-dimensional input parameters than Model I. In addition, using Model III is better if the size of the training data is relatively small, owing to the multi-sensor remote sensing data. In contrast, we suggest using Model I to estimate soil moisture when the size of the soil samples is relatively large with only visible remote sensing data.
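The validation metrics reported in this section can be computed as in the following sketch. It uses the textbook definitions of RMSE, R² and adjusted R²; it is an assumption that the paper's adjusted coefficient of determination follows the same formula, with the number of input bands as the number of predictors. All function names are illustrative.

```python
import math


def rmse(observed, estimated):
    """Root mean square error between two equally long sequences."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)


def r_squared(observed, estimated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Penalize R2 for the number of predictors (textbook definition)."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)
```

Because the validation sets contain on the order of a thousand samples and at most 13 predictors, the adjustment changes R² only slightly under this definition.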

Residual Analysis
Figure 5 shows the residual scatter plots of the models; the x-axis represents the in situ soil moisture value of the samples and the y-axis represents the residual between model-estimated and in situ measured soil moisture. The red points are probable outliers that were identified using confidence intervals with a conditional probability of 99.7%. Based on this confidence level, the Jarque-Bera test was performed on the residual data to evaluate the normality of the models.

As shown in Figure 6, the statistical significance of each model is less than 0.003, and the skewness is close to 0. The near-zero skewness indicates that the residuals of all models are distributed approximately symmetrically. Specifically, for the optimal versions of Model I, the kurtosis of the model using 10 hidden layers is larger than that of the model using nine hidden layers, while the skewness shows the opposite ordering. The results indicated that the models with more hidden layers have greater statistical significance and more stable performance compared to those with fewer hidden layers for Model I. This conclusion was further confirmed by the probable outliers shown in Figure 5: the number of probable outliers of the model using 10 hidden layers was greater than that of the model using nine hidden layers. The same conclusion can be drawn for the optimal versions of Model III. Conversely, for the optimal versions of Model II, the kurtosis of the model using eight hidden layers is larger than that of the model using 10 hidden layers, while the skewness shows the opposite ordering. One possible reason is that the large number of epochs (3500) could have resulted in overlearning problems and instability in the network's generalization capability [40].
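The residual diagnostics used in this section can be sketched as follows. The sketch assumes that the 99.7% interval corresponds to three standard deviations around the mean residual, and it uses the textbook Jarque-Bera statistic with its asymptotic chi-square distribution (two degrees of freedom), whose survival function is exp(-JB/2). Function names are illustrative and do not come from the paper.

```python
import math


def flag_outliers(residuals, n_sigma=3.0):
    """Flag residuals farther than n_sigma standard deviations from the mean."""
    n = len(residuals)
    mean = sum(residuals) / n
    sd = math.sqrt(sum((r - mean) ** 2 for r in residuals) / (n - 1))
    return [abs(r - mean) > n_sigma * sd for r in residuals]


def jarque_bera(residuals):
    """Return (JB statistic, asymptotic p-value, skewness, kurtosis)."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a normal distribution
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # chi-square survival function, 2 dof
    return jb, p_value, skewness, kurtosis


def reject_null(p_value, alpha=0.003):
    """True when the null hypothesis is rejected at significance level alpha."""
    return p_value < alpha
```

The asymptotic p-value is only an approximation for small samples, which is one reason the histograms and cumulative percent plots remain useful alongside the test.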

Comparison with SMAP and GLDAS Soil Moisture Product
To further investigate the performance of the H2O model estimates, we compared the upscaling results of Model III (eight hidden layers) with the SMAP and GLDAS soil moisture products. Based on the dates of the in situ soil moisture measurements, we selected the three types of soil moisture estimates on six individual days (11 May 2015, 21 May 2015, 1 June 2015, 11 June 2015, 21 June 2015 and 1 July 2015). The 2–3-day revisit period causes large data gaps in the SMAP daily soil moisture product. Therefore, we mosaicked three SMAP soil moisture images for each revisit period to represent the six days. In detail, images on 10–12 May were mosaicked to represent 11 May; 19–21 May to represent 21 May; 30 and 31 May and 1 June to represent 1 June; 10–12 June to represent 11 June; 19–21 June to represent 21 June; and 29 and 30 June and 1 July to represent 1 July. We obtained 594, 629, 552, 605, 584, and 553 pairs of soil samples, respectively, for the six days; each pair includes the H2O model-based soil moisture estimate, the SMAP radar-based estimate, the GLDAS estimate and the in situ soil moisture for the same day and location. We then used R², RMSE and the p-value to validate the three types of soil moisture estimates against the in situ soil moisture measurements, as shown in Figure 7 and Table 5.

In Figure 7, the panels on the left show the 1:1 scatter plots of the three types of soil moisture estimates against the in situ soil moisture measurements, and the panels on the right show the corresponding distribution histograms of the soil samples on the six days. The H2O models performed better for soil moisture estimation than the SMAP radar product and the GLDAS product. The distribution pattern of their soil samples displayed the same trend. At the same time, the majority of the SMAP soil moisture values fall in the range of 0.15–0.25, indicating a systematic over-estimation compared to the in situ soil moisture measurements. One possible reason is that the SMAP soil moisture is expressed in cm³/cm³ (water volume divided by water and soil volume), whereas the in situ soil moisture is expressed in g/g (water weight divided by water and soil weight). Theoretically, the volumetric water content equals the gravimetric soil moisture multiplied by the density of the water and soil. Strictly speaking, the density of the water and soil is greater than 1; as a result, the majority of the SMAP soil moisture results are greater than the in situ soil moisture measurements. For GLDAS, the points plot closer to the 1:1 line than those of SMAP but farther than those of H2O, in accordance with the correlation results shown in Table 5. The H2O model fitted the in situ soil moisture best at the 0.05 level (p < 0.05) and estimated soil moisture most accurately (its RMSE is the smallest) among the three types of soil moisture products.

For GLDAS, the scaling effect created a more serious spatial mismatch problem: the point-based ground soil sample cannot represent all of the land cover in a GLDAS soil moisture pixel. The GLDAS soil moisture product has a resolution of 25 km, far coarser than the VIIRS resolution of 750 m and the SMAP radar resolution of 3 km. In addition, the GLDAS soil moisture product averages the soil moisture values from 00:00 to 03:00 UTC, a 3-h window that lags the ground measuring time (00:00 UTC). This difference could lead to uncertainty and error in the soil moisture estimation due to possible evapotranspiration by corn or irrigation occurring during the lag period. Similarly, the larger time difference between the SMAP overpass (06:00 UTC) and the in situ soil moisture measurement could lead to greater uncertainty and error in soil moisture estimation than for GLDAS.
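The three-day SMAP mosaicking described in this section can be illustrated with a small sketch. The text does not say how overlapping pixels were combined, so the per-pixel mean of the valid values used here is an assumption; small nested lists stand in for rasters, and the function name `mosaic_three_days` and the sample grids are hypothetical.

```python
import math

def mosaic_three_days(daily_grids):
    """Per-pixel mean of the valid (non-NaN) values across the daily grids.

    Pixels with no valid value on any day remain NaN (a data gap).
    Assumption: averaging is used where swaths overlap.
    """
    rows, cols = len(daily_grids[0]), len(daily_grids[0][0])
    mosaic = [[math.nan] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            valid = [g[i][j] for g in daily_grids if not math.isnan(g[i][j])]
            if valid:
                mosaic[i][j] = sum(valid) / len(valid)
    return mosaic

nan = math.nan
# Hypothetical 2 x 2 grids for 10, 11 and 12 May; NaN marks pixels outside the swath.
day_10 = [[0.20, nan], [nan, nan]]
day_11 = [[nan, 0.18], [nan, nan]]
day_12 = [[0.24, nan], [0.15, nan]]
mosaic_11_may = mosaic_three_days([day_10, day_11, day_12])
print(mosaic_11_may)
```

A pixel covered on several days is averaged, a pixel covered once keeps its single value, and a pixel never covered stays empty, which is why some gaps can persist even after mosaicking.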
The SMAP radar appears to be worse than GLDAS at capturing the variance in ground-measured soil moisture. This is mainly due to the different value ranges of SMAP volumetric soil moisture and in situ gravimetric soil moisture. In addition, the SMAP radar-based soil moisture estimates are less accurate (their RMSE is the largest) than the GLDAS soil moisture estimates. This may be caused by two factors. One is the low signal-to-noise ratio of the SMAP radar due to radio frequency interference [41,42]. The other is that SMAP can only measure the top 5 cm of the soil surface layer, rather than sampling 0–100 cm in depth like GLDAS. The H2O model-based soil moisture estimation maps have a resolution of 750 m, finer than the GLDAS resolution of 25 km and the SMAP radar resolution of 3 km. As Figure 8 shows, the H2O model-based soil moisture estimates represent more spatial detail than the other two products. Nonetheless, they are not perfect, owing to missing data associated with cloud removal.
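The difference in value ranges between volumetric and gravimetric soil moisture can be made concrete with a short conversion sketch. It assumes the definitions given in this section (water weight over water-plus-soil weight, and water volume over water-plus-soil volume), under which the volumetric value equals the gravimetric value multiplied by the density of the wet soil and divided by the density of water. The density of 1.4 g/cm³ is an illustrative value, not one reported in the paper, and the function name is hypothetical.

```python
WATER_DENSITY = 1.0  # g/cm^3

def gravimetric_to_volumetric(theta_g, wet_soil_density, water_density=WATER_DENSITY):
    """Convert gravimetric soil moisture (g/g) to volumetric (cm^3/cm^3).

    theta_g          : water weight / (water + soil weight)
    wet_soil_density : density of the water-soil mixture in g/cm^3 (assumed input)
    """
    return theta_g * wet_soil_density / water_density

# Illustrative values only: a gravimetric reading of 0.15 g/g and an
# assumed wet soil density of 1.4 g/cm^3.
theta_v = gravimetric_to_volumetric(0.15, 1.4)
print(f"volumetric soil moisture: {theta_v:.3f} cm^3/cm^3")
```

Because the density of the water-soil mixture exceeds that of water, the volumetric value is larger than the gravimetric one for the same sample, which is the direction of the offset discussed in the text.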

Conclusions
This paper proposed a deep learning model-based method for upscaling in situ soil moisture using VIIRS RDR to address the challenging problem of surface soil moisture estimation at the national scale. Three groups of models were built using VIIRS SWIR, Visible/SWIR, and Visible/SWIR/TIR data. The results showed high accuracy, with 0.8917 < R² < 0.9813 and 0.0118 < RMSE < 0.0294, thereby confirming the effectiveness of the deep learning model. The results also showed that models using eight hidden layers should be employed when the training sample is relatively small and the dimension of the input variables is relatively large (multi-sensor). For upscaling in situ soil moisture to the national scale, we suggest using a model with eight hidden layers, 500 neurons per hidden layer, and 3500 epochs. We observed that the effectiveness of these models depended on the model parameters and the size of the input training dataset. To make the models more operational, further work is necessary. Our approach produces surface soil moisture estimates for cropland in China that are more accurate than the SMAP radar soil moisture products and the GLDAS 0–10 cm depth soil moisture products. However, the soil moisture map has data gaps due to clouds in the image data. Still, the deep learning models provided a practical way to upscale soil moisture to a national scale. We believe that they can be applied more comprehensively when more remote sensing sensors, such as radar, are involved.

Figure 1. Ground stations for cropland soil moisture measurements in China.

Figure 2. The procedure of the model input calculation.

Figure 3. The procedure of model building.

Figure 4. Comparison between in situ soil moisture and estimated soil moisture as obtained by the optimal versions of Model I, Model II and Model III.

Figure 5. The residual scatter plots of the three groups of optimal models.

Figure 6. The residual frequency histograms and residual cumulative percent plots of the three groups of optimal models.



Figure 7. (a) Comparison of the performance of three types of soil moisture estimates and in situ soil moisture measurements on 11 May, 21 May and 1 June 2015. (b) Comparison of the performance of three types of soil moisture estimates and in situ soil moisture measurements on 11 June, 21 June and 1 July 2015.

Figure 8. H2O model estimated soil moisture at 0–10 cm depth, SMAP radar estimated soil moisture at 0–5 cm depth and GLDAS estimated soil moisture at 0–10 cm depth maps for cropland in China on 21 June and 1 July 2015.

Table 2. Parameters and Results of Model I.

Table 3. Parameters and Results of Model II.

Table 4. Parameters and Results of Model III.

The optimal versions of Model II (highlighted in Table 3) were validated using 700 pairs of validation samples. The validation results were R² > 0.89 and RMSE < 0.03. The adjusted coefficients of determination of the validation results were lower than those of the calibration results; one possible reason is that the relatively small sample size may have resulted in more instability for the DFNN in the model calibration process. The optimal versions of Model III (highlighted in Table 4) were validated using 500 pairs of samples. Similar to Models I and II, the results demonstrated that Model III could explain most of the variability in measured soil moisture.
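The validation statistics quoted above can be computed from paired observed and estimated values. The sketch below is an illustration under stated assumptions, not the paper's own code: it takes R² as 1 minus the ratio of residual to total sum of squares, uses the standard adjustment for the number of predictors, and runs on made-up numbers. The function name `validation_metrics` and the choice of `n_predictors=2` are hypothetical.

```python
import math

def validation_metrics(observed, estimated, n_predictors):
    """Return (R^2, adjusted R^2, RMSE) for paired observed/estimated values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    # Standard adjustment for sample size n and number of predictors p.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    rmse = math.sqrt(ss_res / n)
    return r2, adj_r2, rmse

# Made-up soil moisture values (g/g), NOT taken from the paper.
observed = [0.10, 0.15, 0.20, 0.25, 0.30]
estimated = [0.11, 0.14, 0.21, 0.24, 0.30]
r2, adj_r2, rmse = validation_metrics(observed, estimated, n_predictors=2)
print(f"R2={r2:.3f}, adjusted R2={adj_r2:.3f}, RMSE={rmse:.4f}")
```

The adjustment term grows as the sample shrinks relative to the number of predictors, so adjusted R² penalizes small validation sets with many input variables more heavily than plain R² does.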

Table 5. Comparison of the performance of three types of soil moisture estimates.