A New Approach for Soil Moisture Downscaling in the Presence of Seasonal Difference

Abstract: The variation of soil moisture (SM) is a complex and synthetic process, which is impacted by numerous factors. The effects of these factors on soil moisture are dynamic. As a result, the relationship between soil moisture and explanatory variables varies with time and season. This kind of change should be considered in obtaining fine spatial resolution soil moisture products. We chose a study area with four distinct seasons in the temperate monsoon region. In this research, we established seasonal downscaling models to avoid the influence of seasonal differences. Precipitation, land surface temperature, evapotranspiration, vegetation index, land cover, elevation, slope, aspect and soil texture were taken as explanatory variables to produce fine spatial resolution SM. SM products derived from the Advanced Microwave Scanning Radiometer–Earth Observing System (AMSR-E) and the Advanced Microwave Scanning Radiometer 2 (AMSR2) were downscaled with the help of machine learning algorithms. We compared three machine learning algorithms, random forest (RF), support vector machine (SVM), and K-nearest neighbors (KNN), to determine the most suitable algorithm for this study. The results show that season-based downscaling performs better than downscaling based on the continuous time series. In the analysis of seasonal differences, precipitation plays a dominant role, but its contribution rate is different in each season. Moreover, the influence of vegetation is more prominent in winter, while the influence of terrain is more important in the other three seasons. The accuracy of the RF model is the best among the three machine learning algorithms, and the RF-downscaled products match both the AMSR (AMSR-E and AMSR2) SM products and the in-situ measurements well. The analysis indicates that considering seasonal differences and applying machine learning have high potential for spatial downscaling in remote sensing applications.


Introduction
Soil moisture (SM) is an essential climate variable influencing land-atmosphere interactions, an essential hydrologic variable impacting rainfall-runoff processes, an essential ecological variable regulating net ecosystem exchange, and an essential agricultural variable [1][2][3][4]. Many ground stations have been set up to monitor soil moisture, but because the stations are few and unevenly distributed, in-situ measurements are insufficient for representing soil moisture over large areas [5]. With the development of remote sensing technology, spatio-temporally continuous datasets can be obtained at large scales. Passive microwave satellites have been widely used to estimate SM over the past decades because they are not limited by weather. However, passive microwave satellites have a coarse spatial resolution [6], which limits the use of their soil moisture products in applications such as crop management, drought monitoring, fine-scale water budget assessment, and ecological modelling [7]. Therefore, it is necessary to improve the spatial resolution for the effective application of passive microwave derived SM products [8].
In light of the above insights, three machine learning algorithms were chosen to construct season-specific regression relationships between soil moisture and auxiliary factors. Most factors related to soil moisture, including precipitation, land surface temperature, evapotranspiration (ET), land cover (LC), vegetation index, surface elevation, slope and aspect, were selected as explanatory variables. In particular, we showed the differences in the accuracy of the downscaling models with and without dividing the data by season. In addition, we analyzed how the influence of the explanatory factors on soil moisture differs between seasons, and used this difference analysis to illustrate the importance of the seasons. According to the accuracy of the downscaling models obtained by the three machine learning methods, we determined the most suitable model for downscaling. Then, we compared the downscaled SM products with the original microwave products, and verified the downscaled SM products with the in-situ measurements.

Study Area and Ground Data
The study area selected in this paper is located in Shaanxi Province in the interior of China, ranging from 105°29′-111°15′ E and 31°42′-39°35′ N (Figure 1). The terrain of Shaanxi Province is high in the north and south and low in the middle, with a variety of landforms such as plateaus, mountains, plains and basins, among which the Loess Plateau accounts for 40% of the land area of the province. Northern and central Shaanxi have a temperate monsoon climate, but due to the difference in rainfall, the northern part is a semi-arid region, while the central part is a semi-humid region. Southern Shaanxi has a humid subtropical monsoon climate. The average annual precipitation is 340-1240 mm, with more precipitation in summer and autumn and less in winter and spring. Under the influence of climate and other conditions, the soil moisture in Shaanxi Province is higher in the south and lower in the north, higher in summer and lower in winter.
The ground measured data are from the China meteorological data network (available online: http://www.nmic.cn/). There are 25 ground data stations, mostly distributed in low-elevation plains and valley areas, containing soil moisture data at various depths (10, 20, 40 cm). The ground data stations record soil relative humidity. Due to the lack of SM measurements at 5 cm, we chose the soil relative humidity data at a depth of 10 cm, as in past research [16,19,40]. Although the measurement depths are inconsistent, there is a strong correlation between the SM values of two adjacent soil layers [16,41]. The time range was from 2002 to 2013. Since the units of the measured data and the remote sensing data are different, and we lack the measured variables needed to convert between the two units, we can only compare the two qualitatively, not quantitatively.

AMSR-E and AMSR2 Soil Moisture Products
AMSR-E is a passive microwave radiometer mounted on the Aqua satellite launched by the National Aeronautics and Space Administration (NASA). It has six frequency bands: 6.925, 10.65, 18.7, 23.8, 36.5, and 89.0 GHz. The satellite, with ascending (1:30 p.m.) and descending (1:30 a.m.) overpasses, has a revisit period of 1-2 days. At present, AMSR-E soil moisture products are derived by three algorithms: the United States National Snow and Ice Data Center (NSIDC) algorithm, the Japan Aerospace Exploration Agency (JAXA) algorithm, and the Land Parameter Retrieval Model (LPRM) developed by VU University Amsterdam with NASA. The JAXA product shows better performance under dry conditions [42]. Considering that most of the study area is non-humid, we chose the L3 soil moisture product from JAXA (http://gcom-wl.jaxa.jp/product-download.html). The spatial resolution of these data is 25 × 25 km, and the data are stored in the EASE-Grid format. The time coverage is from June 2002 to September 2011, and the value range is 0-0.6 m³/m³.
AMSR2 is a passive microwave radiometer carried on the GCOM-W1 satellite launched by JAXA. Launched in July 2012, AMSR2 has continued AMSR-E's earth observation work well until now. Compared with AMSR-E, AMSR2 adds a channel with a frequency of 7.3 GHz, which reduces the impact of radio frequency interference (RFI). In order to maintain the continuity of the data, this article also chose the AMSR2 L3 level of soil moisture product with a spatial resolution of 25 × 25 km provided by JAXA, and the time coverage is from July 2012 to August 2017.
Considering that the two sensors AMSR-E and AMSR2 are slightly different, we conducted regression analysis on the two sets of soil moisture data and obtained R² of 0.703. The significance test proved that the two sets of product data had a good correlation within the Shaanxi Province.

Multi-Source Remote Sensing Data
Land surface temperature (LST), vegetation index (VI), evapotranspiration (ET), and land cover (LC) data were collected from the NASA Moderate Resolution Imaging Spectroradiometer (MODIS). For VI, we selected two indexes, the normalized difference vegetation index (NDVI) and the enhanced vegetation index (EVI). The data can be downloaded from NASA's website (http://ladsweb.nodaps.eosdis.nasa.gov/). The data information is shown in Table 1, and the time range is from June 2012 to August 2017. The precipitation data were obtained from the rainfall products of the Tropical Rainfall Measuring Mission (TRMM), a satellite jointly launched by NASA and JAXA in 1997 to monitor rainfall over the tropics. The satellite precipitation data used in this paper were the 3B43 monthly precipitation data set provided by NASA, which can be downloaded from the official website of NASA (http://mirador.gsfc.nasa.gov/). The spatial resolution of the data is 25 × 25 km, and the accuracy is better than that of similar satellite precipitation data. The time range is from June 2002 to August 2017.
Digital elevation model (DEM) data were acquired from the Shuttle Radar Topography Mission (SRTM) data products (http://srtm.csi.cgiar.org/SELECTION/inputCoord.asp) [43]. DEM data with a spatial resolution of 1 km were selected. Slope data and aspect data with a spatial resolution of 1 km were generated through the slope and aspect toolbars in ArcGIS 10.7 based on DEM data.

Soil Texture
Soil texture data were downloaded from the Chinese Resource and Environment Data Cloud Platform (http://www.resdc.cn/data.aspx?DATAID=260). The soil texture is divided into sand, silt, and clay. Soil texture data contains the percentage content of each category, with a spatial resolution of 1 km.

Random Forest
RF is an ensemble of decision trees built with bootstrap sampling. The decision ability of the RF model depends on its individual classification and regression trees (CART). Each tree is composed of a root node, child nodes, and leaf nodes, and each child node contains a decision rule. For classification, the result is determined by a vote among the trees; for regression, the predictions of the trees are averaged. When RF is used for regression, the mean square error (MSE) is used to measure the regression accuracy. For classification, the Gini index is used to measure the classification accuracy; a smaller Gini index means less uncertainty in the model and better classification results. RF has the advantages of high learning efficiency, insensitivity to outliers, and resistance to over-fitting. RF also has an importance analysis function for input features. The importance ranking of different factors can be obtained through the indicators %IncMSE and IncNodePurity. The calculation criteria of the two indicators are slightly different, but the results are roughly the same. %IncMSE refers to the increase in the mean square error of the model after an impact factor is removed; a higher %IncMSE means that the removed impact factor is more important. IncNodePurity refers to the increase in node purity after an impact factor is added; an increase in node purity means a decrease in the Gini index and a better model. Compared with IncNodePurity, %IncMSE is more suitable for regression models, so this paper mainly refers to this index.
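The idea behind an MSE-increase importance score can be illustrated with a short sketch. It is not the authors' implementation and not the exact statistic reported by any particular random forest package: here the factor is "removed" by shuffling its column (a common stand-in for refitting without it), and `predict` is a hypothetical placeholder for any fitted model, such as a trained random forest.

```python
import random

def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def percent_increase_mse(predict, X, y, column, seed=0):
    """Percentage increase in MSE after shuffling one feature column.

    predict: hypothetical fitted model, a function mapping a feature row
             (list of numbers) to a predicted soil moisture value.
    X:       list of feature rows (lists); y: list of observed values.
    Assumes the baseline MSE is non-zero.
    """
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    # Break the link between this feature and the target by shuffling it.
    shuffled = [row[column] for row in X]
    rng.shuffle(shuffled)
    X_perm = [row[:column] + [v] + row[column + 1:] for row, v in zip(X, shuffled)]
    permuted = mse(y, [predict(row) for row in X_perm])
    return 100.0 * (permuted - baseline) / baseline
```

A factor the model does not rely on yields an increase of about zero, while an influential factor yields a large increase, which is the ordering the importance ranking is based on.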

Support Vector Machine
SVM can be used for classification or regression. When SVM is used for classification, the principle is to find one or a group of hyperplanes based on the training sample space. The longer the distance between the training samples and the hyperplanes is, the higher the classification accuracy is. When SVM is used for regression, it is called support vector regression (SVR). We look for a regression model f(x) which is as close as possible to the actual y value when applying SVR. In contrast to the traditional regression model, SVR allows f(x) to have a certain difference from the actual y value. A threshold is set for the deviation, and the predictions within the threshold are considered correct.
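The tolerance described above is usually expressed as an epsilon-insensitive loss. The following minimal sketch shows only that loss, not a full SVR solver; the function name and the default `epsilon` are illustrative assumptions rather than values taken from this study.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Mean SVR-style loss: deviations up to epsilon cost nothing.

    epsilon is the deviation threshold; the default here is an
    arbitrary illustrative value.
    """
    losses = [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]
    return sum(losses) / len(losses)
```

Predictions that fall within `epsilon` of the observed value contribute zero loss, so only larger deviations influence the fitted function.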

K-Nearest Neighbor
KNN is a simple and effective machine learning algorithm. If the k nearest samples in the feature space belong to a certain group, then the unclassified sample is also assigned to that group and is assumed to share its characteristics. The results depend on the value of k, because the category of a sample is determined only by the categories of its nearest one or several neighbors. Compared with other algorithms, the KNN method has no explicit training process.
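For regression tasks such as soil moisture prediction, the same neighbor search is typically followed by averaging instead of voting. The sketch below assumes Euclidean distance and an unweighted mean; the study does not state its distance measure or weighting, so both are assumptions, and the function name is illustrative.

```python
import math

def knn_predict(train_X, train_y, query, k=3):
    """Predict a value as the mean target of the k nearest training samples.

    Distance is Euclidean (an assumption); there is no training step,
    the stored samples are searched at prediction time.
    """
    distances = sorted(
        (math.dist(row, query), target) for row, target in zip(train_X, train_y)
    )
    nearest = distances[:k]
    return sum(target for _, target in nearest) / len(nearest)
```

Because each prediction depends only on a handful of stored samples, neighboring pixels can receive quite different values when their nearest samples differ.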

SM Product Machine Learning Downscaling
According to the climatic conditions in the study area, the time span is divided into spring (March, April, May), summer (June, July, August), autumn (September, October, November), and winter (December to February of the following year). In this study, we assume that the functional relationships between soil moisture and the auxiliary factors do not change with spatial scale; that is, within one season the functional relationship expressed in Equations (1) and (2) is the same at both scales.
Therefore, the functional relation obtained from soil moisture data and auxiliary factors at the 25 km scale can be applied to the data at the 1 km scale. The 1 km soil moisture data can be predicted by feeding the 1 km auxiliary factors into the functional relationship. The specific downscaling steps are as follows (Figure 2).
First step: The auxiliary factor data were resampled to a spatial resolution of 25 km. The temporal resolutions of the soil moisture data and the auxiliary factor data differ. Soil moisture, meteorological, and vegetation factors change continually from month to month within a season. In view of this, soil moisture, precipitation, LST, ET, EVI, and NDVI were unified to a monthly rather than a seasonal scale. To maintain the same temporal resolution, the LC, terrain, and soil texture data were also assigned on a monthly scale, although LC remains unchanged within a year and the terrain and soil texture values are constant over the time range of this study. Finally, the sample for one seasonal model includes three months of monthly data. Taking spring as an example, the sample is composed of the data from March to May, which benefits the accuracy of the functional relationship.
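The first step can be sketched as follows. The resampling method is not specified in the text, so block averaging is an assumption here, and the function names and data layout (nested lists, `(month, sample)` pairs) are hypothetical.

```python
# Month-to-season mapping as defined for the study area.
SEASON_BY_MONTH = {
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
    12: "winter", 1: "winter", 2: "winter",
}

def block_mean(grid, factor):
    """Aggregate a 2-D grid by averaging non-overlapping factor x factor blocks.

    Assumed resampling method; e.g. factor=25 would turn a 1 km grid
    into a 25 km grid. Incomplete edge blocks are dropped.
    """
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, rows - rows % factor, factor):
        coarse_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [grid[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse

def pool_by_season(monthly_samples):
    """Group (month, sample) pairs into one sample list per season."""
    pooled = {"spring": [], "summer": [], "autumn": [], "winter": []}
    for month, sample in monthly_samples:
        pooled[SEASON_BY_MONTH[month]].append(sample)
    return pooled
```

Pooling the three monthly samples of a season keeps the month-to-month variation in the training data while still producing one model per season.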
Second step: The total sample size of each season was slightly different. The sample data were randomly divided into training set and verification set, and the ratio of training sample to verification sample was about 7:3 (Table 2). The regression relationship between soil moisture and auxiliary factors under the spatial resolution of 25 km was obtained by three machine learning methods in different seasons, and the model accuracy was calculated. Then, the importance function in the random forest algorithm can be used to get the importance ranking of explanatory variables in different seasonal models. The corresponding %IncMSE value of each variable can be obtained through this function.
Third step: The auxiliary factors data were resampled to the spatial resolution of 1 km, and all the data were processed on a seasonal scale. The seasonal data of 1 km auxiliary factors were put into different seasonal models to obtain the seasonal SM products of 1 km.
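The random 7:3 division into training and verification samples can be sketched as below. The function name, the fixed seed, and the rounding of the cut point are illustrative assumptions; the text only states that the ratio was about 7:3.

```python
import random

def split_samples(samples, train_fraction=0.7, seed=42):
    """Randomly split samples into training and verification subsets.

    train_fraction=0.7 reproduces the approximate 7:3 ratio; the seed
    only makes this sketch repeatable.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Applying the split separately to each season's pooled samples gives one training set and one verification set per seasonal model.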
Remote Sens. 2020, 12, x FOR PEER REVIEW

Validation Methodology
The downscaling results were evaluated and verified in three ways. First, the precision of the three downscaling models was evaluated to determine the best-performing algorithm. Second, the downscaled products were compared with the original 25 km soil moisture products. Finally, the in-situ measurements were used to verify the downscaled products and the original 25 km soil moisture products.
The evaluation metrics are the mean absolute percentage error (MAPE), BIAS, root mean square error (RMSE), coefficient of determination (R²), correlation coefficient (R), and Kullback-Leibler divergence (K-L). MAPE, BIAS, RMSE, and R² were used to compare the downscaling models obtained by the different machine learning algorithms. MAPE, BIAS, and RMSE were used for the comparisons between the 1 km downscaled products and the original 25 km soil moisture products. For direct validation, the downscaled results were verified against the in-situ measurements using R and K-L.
$$\mathrm{K\text{-}L}(SM_T \,\|\, SM_P) = \sum SM_T \log \frac{SM_T}{SM_P}$$
where $SM_P$ and $SM_T$ take different meanings in the different evaluation parts: $SM_P$ denotes the predicted values or downscaled data, and $SM_T$ denotes the validation data or true values. $\mathrm{Cov}(SM_P, SM_T)$ is the covariance of $SM_P$ and $SM_T$, $\mathrm{Var}[SM_P]$ is the variance of $SM_P$, and $\mathrm{Var}[SM_T]$ is the variance of $SM_T$.
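As an illustration, the metrics can be computed as in the sketch below. Only the K-L expression and the covariance and variance terms of R are stated in the text; the forms used here for MAPE, BIAS, RMSE, and R² are the common textbook definitions and are assumptions, in particular the sign convention of BIAS (predicted minus true) and R² taken as the squared correlation.

```python
import math

def mape(pred, true):
    """Mean absolute percentage error in percent (assumed standard form)."""
    return 100.0 * sum(abs(p - t) / abs(t) for p, t in zip(pred, true)) / len(true)

def bias(pred, true):
    """Mean of (predicted - true); positive values indicate overestimation."""
    return sum(p - t for p, t in zip(pred, true)) / len(true)

def rmse(pred, true):
    """Root mean square error."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

def pearson_r(pred, true):
    """Correlation coefficient R = Cov(P, T) / sqrt(Var[P] * Var[T])."""
    n = len(true)
    mean_p, mean_t = sum(pred) / n, sum(true) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, true))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in true)
    return cov / math.sqrt(var_p * var_t)

def r_squared(pred, true):
    """R^2, assumed here to be the squared correlation coefficient."""
    return pearson_r(pred, true) ** 2

def kl_divergence(true, pred):
    """K-L(SM_T || SM_P) = sum of SM_T * log(SM_T / SM_P).

    Terms with SM_T == 0 are skipped; SM_P must be positive elsewhere.
    """
    return sum(t * math.log(t / p) for t, p in zip(true, pred) if t > 0)
```

A higher R or a smaller K-L value indicates closer agreement between the two series, and K-L is zero when they are identical.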

Comparative Analysis of Machine Learning Algorithms
We evaluated the accuracy of the downscaling models, and the results are shown in Table 3. In general, the model accuracy of the three algorithms is acceptable: the MAPE is less than 24%, the BIAS is less than 0.065 m³/m³, and the RMSE does not exceed 0.027 m³/m³. According to R², the correlation between the RF-predicted values and the true values is the highest, reaching 0.609 in autumn. The range of SM values in different seasons may affect R². RF performs well in terms of MAPE, RMSE, and R², but poorly in BIAS. Taking the autumn results for further analysis, the BIAS of RF is 0.041 m³/m³, which is much higher than that of SVM and KNN, whereas the MAPE and RMSE of RF are the smallest and its R² is the highest. We conclude that the RF predictions tend to overestimate rather than underestimate compared with the other algorithms, but their deviations are relatively small. The overestimations and underestimations of the other two algorithms are more evenly distributed, but their deviations are larger.
Combined with Figure 3, we find that the spatial distributions of the downscaled results obtained by the three algorithms are generally consistent. The RF result in summer is higher than those of the other algorithms, which is consistent with the BIAS index. In addition, in the spring and summer maps, the soil moisture obtained by RF in the middle of the study area is less smooth than that obtained by SVM. In general, SVM performs moderately, slightly worse than RF. The KNN results show interleaved high and low values without a smooth transition, especially in the central and southern regions in summer and autumn. This may be related to how the algorithm works: KNN only takes the values of the nearest samples into account when making predictions, which may also explain its poorer performance. Since RF is the best among the three algorithms, we chose RF for downscaling in the following process.

Effectiveness of Seasonal Downscaling Model
In order to prove the effectiveness of the seasonal downscaling model, we built a comprehensive downscaling model without distinguishing seasons for a comparison experiment. The training samples of the four seasons in Table 2 were mixed, and 9979 samples were randomly selected to train a comprehensive downscaling model. Then, the same validation samples as those in Table 2 were selected to verify the accuracy of the comprehensive model in each season. The results are shown in Table 4.
The comparison results included MAPE, BIAS, RMSE, and R². The MAPE, BIAS, and RMSE of the seasonal models are lower than those of the comprehensive model, especially in spring and winter. According to R², the predicted results of the seasonal models also fit the verification samples better. We calculated the magnitude of the improvement in model accuracy after considering seasonal differences. On average, the MAPE, BIAS, and RMSE decreased by 17%, 28%, and 10%, respectively, using the season-based method, and R² improved by an average of 13%. The BIAS of the season-based model decreased considerably in winter, so the average reduction rate of BIAS was relatively high. Overall, the accuracy of the downscaling model improved by about 10% when seasonal differences were considered. The results demonstrate that the seasonal model performs better than the comprehensive model.
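One way such average improvements could be computed is as the mean of the per-season relative reductions. The text does not state the exact averaging procedure, so this is an assumption, and the function name is hypothetical.

```python
def mean_relative_reduction(seasonal, comprehensive):
    """Average relative reduction (%) of an error metric across seasons.

    seasonal, comprehensive: metric values per season for the seasonal
    models and for the comprehensive model, in the same season order.
    Averaging per-season relative changes is an assumed procedure.
    """
    reductions = [100.0 * (c - s) / c for s, c in zip(seasonal, comprehensive)]
    return sum(reductions) / len(reductions)
```

With this form, a single season with a large relative change, such as the winter BIAS mentioned above, can raise the average noticeably.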

Comparison with Original AMSR-E/AMSR2 Soil Moisture
As shown in Figure 3, taking 2010 for example, the RF-downscaled results are roughly consistent with the spatial distribution of the original products of 25 km. The soil moisture values show a decreasing trend from south to north. In addition, the soil moisture downscaled result represents more detailed spatial distribution and smoother spatial variation. Especially in Qinling Mountains, it is easier to identify mountains and the plains between mountains, and the variation of soil moisture with the topography is also well represented. Although there are some deviations in some areas, the downscaled results are acceptable considering the complex and varied topography and climate of the study area.
We resampled the original SM products to a spatial resolution of 1 km. The 1 km SM data after resampling retain the SM values of the 25 km scale. Then, we compared the downscaled products with the resampled products to obtain the difference between them at the 1 km scale (Table 5). The RMSE is highest in summer (0.024 m³/m³), followed by autumn (0.013 m³/m³), and is lower than 0.010 m³/m³ in spring and winter. Except for autumn, all seasons have negative BIAS, which indicates that the soil moisture values after downscaling are lower than the original 25 km soil moisture.
The MAPE is highest in summer (34%), which is somewhat high; it is between 10% and 20% in spring and autumn, and within 10% in winter. Figure 4 shows the spatial distribution of RMSE by season. To illustrate the difference between the original data and the downscaled result, we selected regions A and B for analysis. Compared with the original SM value, the downscaled SM value of region A is higher, while that of region B is lower. Most of region A consists of urban and built-up areas. The NDVI value in this region is lower than in the surrounding areas, but the meteorological and topographical differences between this region and its surroundings are not significant. Therefore, overestimation may occur because the prediction is driven by topographic and meteorological factors. In addition, few urban ground stations are used for sensor calibration when SM products are generated [44], and urban areas are rarely considered in the comparative validation of SM products [45,46]. These may affect the accuracy of remote sensing SM products in urban areas, so the possibility that the original 25 km SM products have low accuracy there cannot be ruled out. Region B is located in the transition from mountains to valley plains. The climate and vegetation coverage in region B are very similar to those of the surrounding areas. In the original 25 km SM products, region B has the highest soil moisture value in the study area, but not after downscaling; the high value in the original data may be related to other factors such as proximity to rivers and reservoirs.
It can be seen from Figure 5 that the downscaled soil moisture values maintain a good consistency with the original 25 km data. However, we also noted that the SM value of AMSR2 is slightly lower than the AMSR-E value. Therefore, the time continuity of soil moisture products was also improved after downscaling.
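The 1 km comparison described earlier in this section, in which the 25 km product is resampled without changing its values and then differenced with the downscaled product, can be sketched as follows. Repeating each coarse cell is an assumption consistent with the statement that the resampled data keep the 25 km values; the function names are hypothetical.

```python
def replicate(grid, factor):
    """Resample a coarse grid to a finer one by repeating each cell value.

    Every fine cell inherits the value of the coarse cell that contains
    it, so the 25 km values are preserved (factor=25 for 25 km -> 1 km).
    """
    fine = []
    for row in grid:
        fine_row = [value for value in row for _ in range(factor)]
        fine.extend([list(fine_row) for _ in range(factor)])
    return fine

def difference_map(downscaled, resampled):
    """Cell-by-cell difference: downscaled minus resampled original."""
    return [[d - r for d, r in zip(d_row, r_row)]
            for d_row, r_row in zip(downscaled, resampled)]
```

Negative cells in the difference map correspond to locations where the downscaled value is lower than the original product, matching the sign of BIAS used in the comparison.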

Comparison with In-Situ Measurements
We unified the temporal resolution of the in-situ measurements data to the seasonal scale. Since the units of downscaled data and measured data are different, the data were normalized according to the maximum and minimum values before the comparative analysis. We verified original 25 km SM products and downscaled products by R and K-L indexes. The higher R value or the smaller K-L means a better result. After excluding the outliers in the measured data, the number of samples used for data analysis in spring, summer, autumn, and winter were 167, 180, 137, and 89, respectively.
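The normalization by maximum and minimum values can be sketched as a standard min-max rescaling to the range 0 to 1. Whether the extremes were taken per station, per season, or over the whole record is not stated, so applying it to one series at a time is an assumption.

```python
def min_max_normalize(values):
    """Rescale a series to [0, 1] using its own minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant series")
    return [(v - lo) / (hi - lo) for v in values]
```

After this rescaling, relative humidity measurements and volumetric soil moisture values share a unitless scale and can be compared with R and K-L.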
According to Table 6, we can see that the R values of the downscaled data and measured data are increased to 0.570, 0.458, 0.764, and 0.288, respectively for the four seasons, which are higher than that of the original 25 km data and measured data. The small correlation between remote sensing data and ground measurements may be due to their differences in the spatial support [21]. The K-L values of the original 25 km data and the measured data for the four seasons are 0.143, 0.305, 0.222, and 0.080, while the K-L values of the downscaled data and the measured data are reduced to 0.084, 0.236, 0.221, and 0.075. The correlation or fitting degree between the downscaled data and the measured data is better and the difference is smaller. These results prove the effectiveness of the downscaling. In winter, the difference between the downscaled data and the measured data is the smallest according to the K-L value, but the R value is the lowest, which may be caused by the small range of soil moisture variation.

Seasonal Difference Analysis of Explanatory Variables
We got the ranking of explanatory variables with seasons at 25 km spatial resolution using the importance function of random forest ( Table 7). The ranking results are verified by correlation analysis. Explanatory factors all passed the significance tests, and the correlations with soil moisture were strong. The R 2 value between the factor with the largest %IncMSE and SM was also the highest. Since the variables contained in the downscaling models of the four seasons are the same, we can assume a unified threshold to judge the importance of factors. When the value of %IncMSE exceeds 100, we assume that this factor has a greater impact on soil moisture. The main factors affecting soil moisture are different with seasons. Precipitation, elevation, land surface temperature, and slope all have an impact on soil moisture in spring. Soil moisture is mainly affected by elevation, precipitation, aspect, and land surface temperature in summer. Precipitation and aspect have a great influence on soil moisture in autumn while the influence of precipitation, NDVI, and LST is significant in winter. These results show the seasonal effect between soil moisture and auxiliary factors, and proves the necessity of establishing seasonal models.
We analyzed the seasonal differences of the main influencing factors according to their categories. Summer is mainly influenced by meteorological and topographic factors. There is the southeast monsoon prevailing in summer, brining heavy rainfall. Affected by topographical factors, the land surface temperature in Shaanxi is high in summer and the evapotranspiration is strong. Topography usually affects the spatial distribution of SM in wet conditions [35], especially during and immediately after a rainfall event [47]. In addition, elevation also plays a role in divisions of study area because of an obvious height difference between them. The central plains and the southern valley plains receive more precipitation than other regions, resulting in higher soil moisture values. Therefore, topographic factors not only directly affect the spatial distribution of soil moisture content, but also further affect soil moisture by influencing meteorological factors.
In spring and autumn, the main factors affecting soil moisture are both meteorological and topographic, but the %IncMSE values of these factors differ greatly between the two seasons. In spring, the %IncMSE values of the different factors are close to one another, whereas in autumn the %IncMSE values of the meteorological and topographic factors are markedly higher. Compared with autumn, meteorological conditions in spring are less variable, with lower precipitation and less frequent rainfall. In addition, precipitation in September remains high and the monsoon still has an effect.
Soil moisture in winter is mainly affected by meteorological factors and vegetation. In winter, meteorological factors become less active due to low temperature and little rain. Thus, the properties of the soil itself become more important. The level of soil moisture depends less on external supplies and more on the ability of the soil to retain moisture.
Judging by the performance of the auxiliary factors in each season, precipitation is the key factor affecting soil moisture. Previous studies have compared the continuous temporal trends of precipitation and soil moisture, but precipitation has rarely been used directly as an explanatory variable in modeling. One reason is that the spatial resolution of precipitation products is relatively coarse. Another is that there may be a time lag (hysteresis) between soil moisture and precipitation. Rainfall, however, occurs over large areas, which implies that the coarse resolution of precipitation products has a relatively small impact. The soil depth sensed by remote sensing is generally around 5 cm, because microwaves penetrate the soil to about 5 cm. At this depth and at the seasonal scale, there is no time lag between soil moisture and precipitation. According to Figure 5, the trend of soil moisture is roughly the same as that of precipitation, and some extreme soil moisture values are accompanied by extreme rainfall values.
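One way to check for a time lag of the kind discussed above is to correlate the precipitation series with the soil moisture series shifted by 0, 1, 2, ... time steps and see which shift gives the highest correlation. The sketch below uses a plain Pearson correlation; the function names and the toy series are hypothetical and are not the data behind Figure 5.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def lagged_correlations(precip, sm, max_lag):
    """Correlation between precip[t] and sm[t + lag] for lag = 0..max_lag."""
    return {
        lag: pearson(precip[: len(precip) - lag], sm[lag:])
        for lag in range(max_lag + 1)
    }


# Hypothetical seasonal series in which soil moisture responds without delay.
precip = [5.0, 40.0, 10.0, 80.0, 20.0, 60.0, 15.0, 90.0]
sm = [0.10 + 0.002 * p for p in precip]

corrs = lagged_correlations(precip, sm, max_lag=2)
best_lag = max(corrs, key=corrs.get)
print(best_lag, corrs)
```

A best lag of zero at the seasonal step would be consistent with using same-season precipitation as an explanatory variable; a non-zero best lag would suggest shifting the precipitation input instead.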
Among the topographic factors, elevation and slope are negatively correlated with soil moisture. At high altitude, temperature is lower and precipitation is less. Moreover, direct sunlight leads to higher evapotranspiration, and sandy soil and sparse vegetation aggravate the situation. Soil moisture is therefore not easily retained under these conditions. Water flows downhill under gravity, and steep slopes speed up the flow, leaving steeper areas with less water to retain. Although the influence of aspect differs between seasons, the pattern is similar in all of them: soil moisture is higher on south-facing and north-facing slopes. The north slope is a shaded slope with low solar intensity, low surface temperature, and low evaporation, so the soil retains water well. The south slope is a sunny slope, but it also faces the southeast monsoon and receives a sufficient supply of precipitation.
According to the %IncMSE values, the soil texture variables (sand, clay, and silt) have little influence on the model. We tested removing soil texture from the downscaling models for summer and winter. The comparison suggested that, although soil texture is a less important predictor, the models performed better when it was included (Table 8).

Discussion
Soil moisture is a key factor in agricultural research, but the spatial resolution of current products is insufficient to meet research needs. The variation of SM is a complex and synthetic process that is affected by numerous factors [19]. In recent years, research on improving the spatial resolution of soil moisture remote sensing products has paid more attention to spatial heterogeneity than to temporal heterogeneity, which is often overlooked in downscaling studies even though the explanatory factors change dynamically over time. The main contribution of this study is to consider the influence of temporal heterogeneity on soil moisture downscaling. In a study area with four distinct seasons, we divided the time periods according to the seasons. A variety of explanatory variables covering meteorology, topography, land cover, and soil properties were selected to establish the downscaling model. In addition, machine learning methods allow this model to be built more accurately.
We verified whether the accuracy of the downscaling model improves when the seasonal difference is considered. The results in Table 4 support our argument. Table 7 also shows that the degree of influence of the explanatory variables on soil moisture differs between seasons. That is, the relationship between soil moisture and the auxiliary factors also has a large seasonal difference. This means that seasonal differences do affect the results and should be considered in related studies. Compared with the original remote sensing data, the summer downscaled results have the largest deviation of the four seasons (MAPE = 34%, RMSE = 0.024 m³/m³, BIAS = −0.019 m³/m³). The downscaled product correlates better with the in-situ measurements than the original remote sensing product does. According to the evaluation of the downscaled SM products, the winter products perform best, while the summer products are slightly inadequate. Reducing the deviation in summer is therefore a focus of future research. The performance of the downscaled products in different seasons is positively correlated with the stability of soil moisture: the more stable the soil moisture values, the better the downscaled result. In summer, the meteorological factors change drastically. Precipitation, land surface temperature, and evapotranspiration intensity all reach their annual maxima, bringing a high moisture supply to the soil and strong evapotranspiration at the same time. The vegetation cover types in the study area are mostly deciduous, and summer is the period when the difference in vegetation coverage between land cover types is largest. High vegetation coverage reduces the accuracy with which the sensor retrieves soil moisture information, and the spatial heterogeneity of mixed pixels is greatly enhanced.
Thus, soil moisture values in summer are very unstable and are affected by both meteorological and vegetation factors. We also note that the deviation of the original remote sensing soil moisture products is likely to be largest in summer, and that this deviation is transferred through the downscaling process. In contrast, soil moisture values in winter are relatively low and stable because of low temperature, little rain, and low vegetation coverage. In this study area, rainfall in autumn is greater than in spring, and soil moisture values in autumn are also higher and more unstable. Therefore, the downscaling results in spring are better than those in autumn.
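The three deviation measures quoted above can be computed as in the following sketch. It uses the conventional definitions of MAPE, RMSE, and BIAS; taking BIAS as the mean of (downscaled minus reference), so that a negative value indicates underestimation, is an assumption about the sign convention, and the sample values are hypothetical.

```python
import math


def mape(reference, estimate):
    """Mean absolute percentage error, in percent (reference must be non-zero)."""
    n = len(reference)
    return 100.0 / n * sum(abs(e - r) / abs(r) for r, e in zip(reference, estimate))


def rmse(reference, estimate):
    """Root mean squared error, in the unit of the inputs (e.g. m^3/m^3)."""
    n = len(reference)
    return math.sqrt(sum((e - r) ** 2 for r, e in zip(reference, estimate)) / n)


def bias(reference, estimate):
    """Mean signed error (estimate minus reference); assumed sign convention."""
    n = len(reference)
    return sum(e - r for r, e in zip(reference, estimate)) / n


# Hypothetical soil moisture values in m^3/m^3.
coarse_sm = [0.20, 0.25, 0.40]      # reference, e.g. original coarse product
downscaled_sm = [0.22, 0.20, 0.40]  # downscaled values aggregated to match

print(mape(coarse_sm, downscaled_sm))
print(rmse(coarse_sm, downscaled_sm))
print(bias(coarse_sm, downscaled_sm))
```

Computing the three measures separately for each season makes the seasonal contrast explicit, because RMSE and BIAS keep the unit of soil moisture while MAPE is relative to its magnitude.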
In this study, the same input explanatory variables were used to establish each seasonal downscaling model. In future research, the most suitable downscaling model for each season could be determined according to the importance of the explanatory variables. In addition to the auxiliary data selected in this study, infrared data [24,48] and SAR data [7,16,18] could also be considered. We chose the seasonal scale because the climate in the study area has four distinct seasons. However, this time division is not applicable to tropical or cold regions. The time scale can be divided differently according to the conditions of the research area and the research emphasis. For example, dry and wet seasons can be distinguished in tropical monsoon regions, and growing and non-growing seasons can be distinguished in research on cultivated land.
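The idea of swapping one time division for another can be sketched as a grouping step that runs before model fitting, with one downscaling model then trained per group. The month-to-season mapping below follows the common meteorological convention, and the wet-season months are purely hypothetical; neither is taken from this study, and all names are illustrative.

```python
from collections import defaultdict
from datetime import date

# Assumed meteorological seasons for a temperate region.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}

# Hypothetical wet-season months for a tropical monsoon region.
WET_MONTHS = {5, 6, 7, 8, 9, 10}


def season_of(day):
    """Label a date with its (assumed) meteorological season."""
    return SEASON_BY_MONTH[day.month]


def wet_dry_of(day):
    """Label a date as belonging to the (hypothetical) wet or dry season."""
    return "wet" if day.month in WET_MONTHS else "dry"


def split_by_period(samples, label_of):
    """Group (date, value) samples by the label that `label_of` assigns."""
    groups = defaultdict(list)
    for day, value in samples:
        groups[label_of(day)].append(value)
    return dict(groups)


# Hypothetical (date, soil moisture) samples.
samples = [
    (date(2010, 1, 15), 0.12),
    (date(2010, 4, 15), 0.18),
    (date(2010, 7, 15), 0.27),
    (date(2010, 9, 15), 0.24),
    (date(2010, 12, 15), 0.11),
]

by_season = split_by_period(samples, season_of)
by_wet_dry = split_by_period(samples, wet_dry_of)
print(by_season)
print(by_wet_dry)
```

Because the labelling function is passed in as an argument, a growing or non-growing season division would only require another small function of the same shape.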
Because the measured data needed for unit conversion were lacking, the in-situ measurements used for verification in this paper are not in the same unit as the remote sensing data. In addition, the ground monitoring stations provide no soil moisture data at a depth of 5 cm, so values at a depth of 10 cm were mostly used in the experiments. Moreover, the in-situ measurements are at point scale, while the remote sensing products are at the 1 km pixel scale. When remote sensing data are verified against measured data, there is always a mismatch in spatial scale [20]. These factors may all affect the accuracy of the validation. Therefore, it is necessary to explore other effective validation methods to evaluate downscaled SM products [19].
We compared three machine learning methods to find the most suitable one for this research. Random forest had the best overall performance. The RF-downscaled products match both the AMSR SM products and the in-situ measurements better than the products of the other methods. However, RF still has some shortcomings, such as overall overestimation. The method could be optimized by tuning its parameters to make it more suitable for our study. In addition to the machine learning methods used in this paper, other algorithms such as neural networks [17,38] and Bayesian methods [49] have also been used in downscaling research. The comparison and integration of different algorithms can also be a focus of future research.

Conclusions
In this paper, we have shown that season-based downscaling performs better than downscaling over the continuous time series. The AMSR SM products were downscaled with machine learning methods. Meteorology, topography, land cover types, and soil properties were considered in the model. We obtained 1 km soil moisture products from 2002 to 2017 on a seasonal scale in Shaanxi province. The downscaled products not only maintain the accuracy of the AMSR-E/AMSR2 SM products, but also improve the time continuity of the SM products. Compared with the original 25 km products, the downscaled results fit the in-situ measurements better and provide richer spatial information on soil moisture. We draw the following conclusions: (1) Three machine learning methods, RF, SVM, and KNN, were selected to construct downscaling models. Comparing the accuracy of the downscaling models, we found that RF performs best. One deficiency of RF is overestimation. (2) Constructing the downscaling model by season improves accuracy; season-based downscaling is effective. The winter downscaled result is the best, because soil moisture in winter is relatively stable, with low content and small variation. (3) The influence of different factors on soil moisture varies with the season, but precipitation is a key factor in all four seasons. In summer and autumn, the influence of meteorological and topographic factors on soil moisture is very prominent, whereas vegetation is more prominent in winter.
In this study, the influence of seasonal differences was considered in the downscaling process. This provides a new perspective for the study of soil moisture and other dynamic variables. For the explanatory variables of the downscaling model, we chose four types of factors and analyzed the seasonal differences in their influence on soil moisture. Our analysis can serve as a reference for subsequent studies on soil moisture.