Multi-Scale Validation of SMAP Soil Moisture Products over Cold and Arid Regions in Northwestern China Using Distributed Ground Observation Data

The Soil Moisture Active Passive (SMAP) mission was designed to provide global mapping of soil moisture (SM) on nested 3, 9, and 36 km earth grids measured by L-band passive and active microwave sensors. The validation of SMAP SM products is crucial for the application of the products and the improvement of the retrieval algorithm. Since the SMAP SM products were released, much effort has been invested in the evaluation of the SMAP radiometer SM product (SMAP_P). However, there has been little validation of the SMAP radar (SMAP_A) and active/passive combined (SMAP_AP) SM products. This paper presents an evaluation of the SMAP_P, SMAP_A and SMAP_AP SM products using distributed ground observation networks in different landscapes in the Heihe River Basin of northwestern China. The standard SMAP error metrics and the relative error are applied to measure the products' performance. The results show that the SMAP SM products exhibit spatial-temporal variation consistent with the ground measurements and typical precipitation events. The performance of the three products varies with sensor type (active, passive and combined), surface cover (bare, vegetated) and climatic region (cold, arid). SMAP_P shows the best performance, while SMAP_A performs the worst. The best performance is observed over bare soils, although with overestimation and the largest relative error; unsatisfactory accuracy, with underestimation, is observed over cold regions and woody vegetated surfaces. The vegetation effect and the freezing-thawing cycle may be the major factors behind the unsatisfactory performance. Efforts to resolve the influence of these factors are expected to improve the accuracy and promote the application of SMAP SM products over these regions. Overall, this evaluation provides an understanding of SMAP SM products over cold and arid regions, and suggestions for the further refinement of the SMAP SM retrieval algorithms.


Introduction
The significance of soil moisture (SM) as a terrestrial hydrology state variable has been well-recognized [1][2][3]. Therefore, estimating SM with high accuracy is crucial to meet the needs of various applications. The Soil Moisture Active Passive (SMAP) [4] satellite, launched in January 2015, is the first of the earth observation satellites being developed by the National Aeronautics and Space Administration (NASA) to provide high resolution global mapping of SM and freeze/thaw state every 2-3 days on nested 3, 9, and 36-km earth grids measured by a host of passive and active microwave sensors.
The SMAP satellite incorporates an L-band (1.26 GHz) radar and an L-band radiometer (1.41 GHz) to provide 3-km spatial resolution backscattering observation, and 36-km resolution brightness temperature observation, respectively. Due to the strong penetration capability, the L-band microwave observations have been recognized as the most promising band for SM estimation [5], especially under vegetation canopies. The SMAP radar sensor stopped working on 7 July 2015 due to a mechanical failure, but it could provide nearly 3 months (from 13 April to 7 July 2015) of radar observations and SM products at 3-km resolution, as well as the 9-km resolution SM products by combining active and passive observations. Although the SMAP active and combined SM products were only available during a very short period, they provided the first high spatial resolution SM products at a global scale. The retrieval algorithms of these two products need to be further refined. In addition, these multi-scale SM products would be very important in scale-related research into SM estimation; thus, the evaluation of SMAP SM products (especially the active and combined products) would provide feedback for the algorithm's refinement and useful information for research related to scale effect and scale transformation issues.
Related calibration and validation activities for SMAP have been ongoing, such as the SMAPEx [6] and SMAPVEX12 [7] experiments. The validation of the SMAP SM products is of crucial significance, not only in view of possible applications of these data, but also to provide useful feedback for the further refinement of the retrieval algorithm. After the release of SMAP products, several authors [1,8,9] conducted validations of the SMAP radiometer and active/passive combined SM products. However, there is very little literature [10] reporting the evaluation of all three SMAP SM products. Thus, one of the major objectives of this paper is to validate all three kinds of SM products.
Furthermore, traditional validation strategies for remote sensing products usually use a single site-based point observation to validate footprint-scale remote sensing products. This scale mismatch undoubtedly introduces large uncertainty into the evaluation [11][12][13][14]. Several recent validation studies [1,11,12] demonstrate that averaging multi-point observations can effectively reduce the uncertainty caused by the scale mismatch. Thus, this evaluation uses the average of the distributed ground observation network data to evaluate the SMAP SM products.
The main objective of the present study is to contribute to the evaluation of the available SMAP SM products, including the active (SMAP_A), passive (SMAP_P) and active/passive combined (SMAP_AP) products, over the cold and arid regions in northwestern China. In this evaluation, we address two key points. The first is the scale-related issue. At the horizontal spatial scale, the three SMAP SM products of different resolution are each evaluated using multi-point averaged ground observations from several nested observation networks [15] distributed in various landscapes. At the vertical spatial scale, the average of the ground SM measurements at two depths (~2 cm and ~4 cm, or ~5 cm and ~10 cm) is used as the ground reference value. At the temporal scale, the average of the multiple observations obtained within 1 h of the SMAP transit is used as the ground reference. The second is the special study area. The evaluation is conducted in three networks with different climatic regions and land surface types, namely: (1) a cold region covered by alpine meadow; (2) an irrigated region covered by man-made oasis farmland; and (3) a natural oasis in the arid Gobi desert covered by sparsely distributed vegetation (Populus euphratica, tamarisk, and others). In addition to the standard error metrics [16] of SMAP SM products, the relative error (RE) is introduced as an extra evaluation index.
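As a minimal sketch of how such a ground reference value could be assembled for one overpass, the following Python function averages over the two depths, then over time, then over sites. The record layout, the function name `ground_reference`, the symmetric +/- 1 h window and the sample values are assumptions made for illustration only; they are not taken from the processing chain used in this study.

```python
from datetime import datetime, timedelta
from statistics import mean


def ground_reference(records, overpass, window=timedelta(hours=1)):
    """Return one ground reference SM value for a single SMAP overpass.

    records:  iterable of (site_id, timestamp, sm_upper, sm_lower) tuples,
              e.g. the ~2 cm and ~4 cm (or ~5 cm and ~10 cm) readings.
    overpass: datetime of the SMAP transit.
    window:   half-width of the matching window (ASSUMED to be +/- 1 h).
    """
    per_site = {}
    for site_id, timestamp, sm_upper, sm_lower in records:
        if abs(timestamp - overpass) <= window:
            # Vertical scale: average the two sensing depths.
            per_site.setdefault(site_id, []).append((sm_upper + sm_lower) / 2.0)
    if not per_site:
        return None  # no observation close enough to the overpass
    # Temporal scale: average each site's readings inside the window.
    site_means = [mean(values) for values in per_site.values()]
    # Horizontal scale: average over the sites.
    return mean(site_means)


# Hypothetical readings around a 06:00 overpass (values in m3/m3).
overpass = datetime(2015, 6, 1, 6, 0)
records = [
    ("site_a", overpass + timedelta(minutes=10), 0.20, 0.24),
    ("site_a", overpass - timedelta(minutes=50), 0.22, 0.26),
    ("site_b", overpass + timedelta(minutes=30), 0.10, 0.14),
    ("site_b", overpass + timedelta(hours=3), 0.90, 0.90),  # outside window
]
reference = ground_reference(records, overpass)
```

Averaging per site before averaging across sites keeps a site with many readings inside the window from dominating the network mean; a different weighting would be equally defensible.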

Study Area
The Heihe River Basin (HRB) (37.5°N-43°N, 97°E-102°E), located in northwestern China (Figure 1a), is used as the study area in this paper. The basin is the second largest inland river basin in China and covers various landscapes. The upstream area of the basin is located in the mountain cryospheric region of the northeastern Tibetan Plateau. Permafrost and seasonally frozen soil coexist in the region, which is mainly covered by alpine meadow, evergreen forest and snow. The annual precipitation, which mostly occurs from May to September, is higher than that in the mid- and downstream areas; thus, the SM is relatively high. The midstream area is located in the center of the Hexi corridor, which is characterized by an arid climate. The terrain is relatively flat and is mainly covered by the Gobi desert, man-made oasis, wetland, and so on. The main vegetation includes crops, grass and salinized meadow. The annual precipitation is small and the SM in the oasis is mainly supplied by irrigation. The downstream area is located in Ejina in Inner Mongolia, where the climate is very arid, with an annual precipitation of less than 50 mm. Most parts of the region are covered by the Gobi desert. Populus euphratica and tamarisk are sparsely distributed near the river bank, forming a natural oasis.

Ground Observation Networks and Datasets
Three SM observation networks [15,17] are respectively established in the up-, middle-and down-stream areas during HiWATER experiments [18]. In particular, the up-stream network consists of multi-scale observations, including automatic weather stations (AWSs) and wireless sensor networks (WSN). The observation networks [15] provide a very rich dataset for the SMAP SM products validation. Thus, the utilization of data over HRB in this evaluation can provide more comprehensive evaluation results over multiple climatic regions and land surface types.
Remote Sens. 2017, 9, 327
The upstream network (Figure 1b) has been set up in the Babaohe sub-basin of the upstream area since 2013. The network is a nested multi-scale network consisting of a hydro-meteorological network (HMN) and a wireless sensor network (WSN), covering an area of around 4500 km², which corresponds to about four SMAP_P pixels. The HMN encompasses five automatic weather stations (AWSs), which provide multi-layer (2 cm, 4 cm, 10 cm, 20 cm, ..., 320 cm) SM measurements, as well as other meteorological elements, e.g., precipitation. The WSN encompasses about 40 measuring nodes that provide profile SM measurements. In this paper, we mainly use the surface SM measurements (2 cm and 4 cm) of the WSN and the precipitation observations of the AWSs to validate the SMAP SM products.
The midstream network was set up inside and around the artificial oasis in the midstream area. Three AWSs were installed in the network, which can provide data for validating the SMAP_P product in two pixels (Figure 1c). The Mid_DM_SS_AWS and Mid_HRB_RSS_AWS stations were located inside the oasis and were covered by maize, and the Mid_HZZ_AWS station was located outside the oasis, in the desert. All the AWSs provide SM at depths of 5, 10, 20 cm, etc. In addition, the precipitation observations of the AWSs are used in this paper to assist the analysis of SM variations. The ~5 cm and ~10 cm SM measurements are used to validate the SMAP SM products.
The downstream observation network (Figure 1d) is deployed to measure the water consumption of the natural oasis ecosystem. The observation network is composed of six AWSs distributed in several landscape types, including bare soils (2), forest (3) and desert (1). All these stations provide time-series SM measurements at ~5 cm and ~10 cm for the validation of the SMAP SM products at the scale of one passive pixel.

SMAP SM Products
The SMAP satellite carries an L-band radar and an L-band radiometer to monitor the Earth's surface at sun-synchronous local times of 06:00 (descending) and 18:00 (ascending) [20]. The radiometer and radar began to provide SM products on 31 March and 13 April 2015, respectively. Because of the mechanical failure of the radar, SMAP stopped providing active microwave products on 7 July 2015. All the SMAP products can be freely downloaded from the website of NSIDC (https://nsidc.org/data/smap/smap-data.html). Three Level 3 (L3) SM products, namely the radar SM product (L3_SM_A), the radiometer SM product (L3_SM_P) and the active/passive combined SM product (L3_SM_AP), are chosen for the evaluation in this research. The spatial resolutions of the three products are 3 km, 36 km and 9 km, respectively. The L3_SM_A and L3_SM_AP products cover 13 April-7 July and the L3_SM_P covers 2 April-31 December 2015. All the L3 products are daily composites of the Level 2 granules.
The baseline retrieval algorithm of the L3_SM_A product inverts a forward scattering model using a time-series data-cube approach [21,22]. The Numerical Maxwell Model in 3 Dimensions [23,24] is adopted as the benchmark model for the bare surface. A "discrete scattering" approximation approach [25,26] is used for the non-woody vegetated surface, and a layered scattering geometry and vegetation model [27] are applied for the woody vegetated surface. Radar backscattering coefficients in three channels (HH, HV and VV), together with auxiliary data (including DEM, land cover class and crop type, etc.), are used to simultaneously estimate soil permittivity (equivalent to SM), surface roughness and vegetation water content (VWC). Furthermore, SMAP adopts several other optional algorithms, including change detection [28,29], a snapshot algorithm [30], etc. In this paper, only the baseline SM product is evaluated.
The L3_SM_P product is a daily global composite of the L2_SM_P SM product, which is produced by a single channel algorithm [31,32]. The algorithm uses the horizontally-polarized brightness temperature (TB) observations to estimate SM based on a zero-order radiative transfer model known as the τ-ω model [33]. The SM retrieval process comprises five basic steps: normalizing TB to emissivity, removing the vegetation effects, accounting for the soil surface roughness effects, converting emissivity to soil permittivity, and converting soil permittivity to SM. The retrieval process is described in detail in the algorithm document of the SMAP SM product [31] and related references.
The L3_SM_AP is a daily global composite of the L2_SM_AP product, which is based on the merger of the SMAP radiometer and radar instrument products at two discrete grid resolutions, 36 km and 3 km, respectively [34][35][36]. The L2_SM_AP baseline algorithm essentially focuses on the disaggregation of the radiometer TB, based on the radar backscattering spatial patterns within the radiometer footprint that are inferred from the radar measurements [35]. Once the TB is disaggregated to 9 km, the L2_SM_P inversion algorithm is applied with ancillary data to produce the L2_SM_AP product.
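The five steps of the single-channel retrieval listed above can be illustrated with a simplified Python sketch. It is not the operational SMAP code: the vegetation and roughness parameters (`b`, `omega`, `h`) are placeholder values, the roughness correction uses one commonly cited exponential form, the permittivity is treated as real-valued, and the Topp et al. polynomial is swapped in here for whatever dielectric mixing model the operational product uses. The function name is hypothetical.

```python
import math


def retrieve_sm_sca_h(tb_h, t_eff, vwc, theta_deg=40.0, b=0.1, omega=0.05, h=0.1):
    """Simplified H-pol single-channel SM retrieval (illustrative only).

    tb_h:  H-pol brightness temperature [K]
    t_eff: effective surface temperature [K]
    vwc:   vegetation water content [kg/m2]
    b, omega, h: ASSUMED vegetation, single-scattering-albedo and roughness
                 parameters; the values are placeholders, not SMAP look-ups.
    """
    theta = math.radians(theta_deg)
    cos_t = math.cos(theta)

    # Step 1: normalize TB to emissivity.
    e_obs = tb_h / t_eff

    # Step 2: remove vegetation effects by inverting the tau-omega model.
    gamma = math.exp(-b * vwc / cos_t)  # canopy transmissivity, tau = b * vwc
    e_rough = (e_obs - 1.0 + gamma**2 + omega - omega * gamma**2) / (
        gamma**2 + omega * gamma - omega * gamma**2
    )

    # Step 3: account for roughness (ASSUMED form: r_rough = r_smooth * exp(-h cos^2)).
    r_smooth = (1.0 - e_rough) * math.exp(h * cos_t**2)
    r_smooth = min(max(r_smooth, 0.0), 0.999)  # keep the inversion well defined

    # Step 4: invert the H-pol Fresnel reflectivity for a real permittivity.
    s = math.sqrt(r_smooth)
    root = cos_t * (1.0 + s) / (1.0 - s)  # equals sqrt(eps - sin^2(theta))
    eps = math.sin(theta) ** 2 + root**2

    # Step 5: permittivity to volumetric SM (Topp et al. polynomial, swapped in).
    return -5.3e-2 + 2.92e-2 * eps - 5.5e-4 * eps**2 + 4.3e-6 * eps**3
```

Each step is an algebraic inversion of the corresponding forward relation, so running a forward τ-ω calculation with the same parameters and feeding the resulting TB back in should return the SM that corresponds to the original permittivity.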

Data Processing and Evaluation Method
The SMAP SM products possess three different spatial resolutions and the three resolution grids are well-nested, as shown in Figure 2. The spatial averaging has been recognized to be able to reduce the noise of site observations [11,12], as well as to enhance the representativeness of the ground observations; thus, the site-average values at different scales and conditions are applied to evaluate the SMAP SM products.
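Because the 3, 9 and 36 km grids are nested, a cell index on a finer grid can be mapped to its enclosing coarser cell by integer division. The sketch below assumes zero-based (row, column) indices and a shared grid origin; both the indexing convention and the function names are assumptions for illustration and are not taken from the SMAP product files.

```python
def parent_cell(row, col, child_km, parent_km):
    """Map a fine-grid cell (row, col) to the coarse-grid cell that contains it."""
    if parent_km % child_km != 0:
        raise ValueError("grids are not nested")
    ratio = parent_km // child_km  # e.g. 36 // 3 = 12
    return row // ratio, col // ratio


def child_cells(row, col, parent_km, child_km):
    """List the fine-grid cells nested inside one coarse-grid cell."""
    if parent_km % child_km != 0:
        raise ValueError("grids are not nested")
    ratio = parent_km // child_km
    return [
        (row * ratio + i, col * ratio + j)
        for i in range(ratio)
        for j in range(ratio)
    ]
```

Under these assumptions one 36 km cell contains 4 x 4 cells of the 9 km grid and 12 x 12 cells of the 3 km grid, which is what allows site averages to be formed consistently at each scale.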
More specifically, for the upstream area of HRB, because of the same land surface coverage (alpine meadow), the evaluation of the SMAP_P product is conducted on the whole network scale and pixel scale, respectively, and the evaluations of SMAP_A and SMAP_AP products are conducted on the whole network scale because the ground sites are rather sparsely distributed (relative to the high resolution of the SMAP_A and SMAP_AP products) and because of the very limited products that are available. An evaluation of the whole network involves comparing the average value of all measurements of the whole network with the average value of the corresponding SMAP SM products, and that on the pixel scale means comparing the average value of measurements that are within a certain SMAP pixel. Taking Figure 2 as an example, the average value of a and b sites is used to validate a certain SMAP_A, and that of a, b, ..., f is used to validate a certain SMAP_P SM product.
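The two comparison strategies described above, pixel scale and whole network scale, can be sketched as follows. The dictionary-based data layout, the function names and the sample values are hypothetical, and the choice to average only the SMAP pixels that contain at least one site is an assumption about what "corresponding" means here.

```python
from collections import defaultdict
from statistics import mean


def pixel_scale_pairs(site_sm, site_pixel, pixel_sm):
    """Pair the mean of the sites inside each SMAP pixel with that pixel's SM.

    site_sm:    {site_id: ground SM}
    site_pixel: {site_id: id of the SMAP pixel containing the site}
    pixel_sm:   {pixel_id: SMAP SM retrieval}
    """
    grouped = defaultdict(list)
    for site_id, sm in site_sm.items():
        grouped[site_pixel[site_id]].append(sm)
    return {
        pixel_id: (mean(values), pixel_sm[pixel_id])
        for pixel_id, values in grouped.items()
        if pixel_id in pixel_sm  # skip pixels without a retrieval
    }


def network_scale_pair(site_sm, site_pixel, pixel_sm):
    """Pair the mean of all sites with the mean of the SMAP pixels covering them."""
    covered = {site_pixel[s] for s in site_sm if site_pixel[s] in pixel_sm}
    if not site_sm or not covered:
        return None
    return mean(site_sm.values()), mean(pixel_sm[p] for p in covered)


# Hypothetical example: three sites in two pixels (values in m3/m3).
site_sm = {"a": 0.30, "b": 0.34, "c": 0.20}
site_pixel = {"a": "P1", "b": "P1", "c": "P2"}
pixel_sm = {"P1": 0.25, "P2": 0.15}
pixel_pairs = pixel_scale_pairs(site_sm, site_pixel, pixel_sm)
network_pair = network_scale_pair(site_sm, site_pixel, pixel_sm)
```

The same functions apply to any of the three products; only the site-to-pixel assignment changes with the grid resolution.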
For the mid- and downstream areas of HRB, the ground surfaces are covered by several kinds of landscapes, including the Gobi desert, non-vegetated cropland, cropland and woodland. The downstream network can provide evaluation data for one SMAP_P pixel and several SMAP_A and SMAP_AP pixels. In particular, there may be only one site within a given SMAP_A pixel over the midstream network. Thus, the evaluations are conducted on the whole network scale for the mid- and downstream areas, respectively, without considering the differences in surface types. Moreover, the sites that fall into different surface types are used to represent the corresponding ground observations. For example, a site that falls into the Gobi desert, e.g., Mid_HZZ_AWS in Figure 1c, is considered as a bare soil observation in the midstream area.
To quantitatively evaluate the SMAP SM products, several validation indices, including the RMSE, the mean bias, the unbiased RMSE (ubRMSE), and the correlation coefficient (R), are used. The definitions of the validation indices are as follows [16]:
$$
\begin{aligned}
\mathrm{RMSE} &= \sqrt{E\left[(\theta_{est} - \theta_{ref})^2\right]} \\
\mathrm{bias} &= E[\theta_{est}] - E[\theta_{ref}] \\
\mathrm{ubRMSE} &= \sqrt{E\left\{\left[(\theta_{est} - E[\theta_{est}]) - (\theta_{ref} - E[\theta_{ref}])\right]^2\right\}} \\
R &= \frac{E\left[(\theta_{est} - E[\theta_{est}])(\theta_{ref} - E[\theta_{ref}])\right]}{\sigma_{est}\,\sigma_{ref}}
\end{aligned}
\tag{1}
$$
Moreover, another commonly used index, the relative error (RE) of the SMAP SM with respect to the ground observed SM, is added to assist the evaluation of the SMAP SM products. In the above definitions, $\theta_{est}$ and $\theta_{ref}$ represent the SMAP and ground observed SM values, respectively; $E[\cdot]$ and abs are the expectation and absolute value operators, respectively; $\sigma_{est}$ and $\sigma_{ref}$ are the standard deviations of the SMAP and ground observed SM, respectively.
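The validation indices can be computed from paired SMAP and ground series as in the following minimal sketch. The first four quantities follow the standard definitions of these metrics, with the expectation taken as the sample mean. The form used for RE, the mean of the absolute difference divided by the ground value, is an assumption, since only its ingredients (expectation and absolute value) are stated in the text. The function name `sm_metrics` is hypothetical.

```python
import math


def sm_metrics(est, ref):
    """Compute RMSE, bias, ubRMSE, R and RE for paired SM series (m3/m3)."""
    n = len(est)
    if n == 0 or n != len(ref):
        raise ValueError("est and ref must be non-empty and of equal length")
    mean_est = sum(est) / n
    mean_ref = sum(ref) / n
    anom_est = [e - mean_est for e in est]
    anom_ref = [g - mean_ref for g in ref]

    rmse = math.sqrt(sum((e - g) ** 2 for e, g in zip(est, ref)) / n)
    bias = mean_est - mean_ref
    ubrmse = math.sqrt(sum((a - c) ** 2 for a, c in zip(anom_est, anom_ref)) / n)

    sd_est = math.sqrt(sum(a * a for a in anom_est) / n)
    sd_ref = math.sqrt(sum(c * c for c in anom_ref) / n)
    cov = sum(a * c for a, c in zip(anom_est, anom_ref)) / n
    # R is undefined when either series is constant.
    r = cov / (sd_est * sd_ref) if sd_est > 0 and sd_ref > 0 else float("nan")

    # ASSUMPTION: RE = E[abs(est - ref) / ref]; requires non-zero ref values.
    re = sum(abs(e - g) / g for e, g in zip(est, ref)) / n

    return {"rmse": rmse, "bias": bias, "ubrmse": ubrmse, "r": r, "re": re}


# Hypothetical series for illustration.
metrics = sm_metrics([0.10, 0.20, 0.30], [0.15, 0.20, 0.35])
```

A useful consistency check is that RMSE² equals bias² plus ubRMSE², which is why a product with a large bias can still meet an ubRMSE target.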

Passive SM Product
Figures 3-5 show the evaluation results of the SMAP_P SM product in the up-, mid- and downstream areas of HRB, respectively. In the upstream area of the HRB, a total of around 45 ground sites (AWSs + WSN nodes) are located in four SMAP radiometer pixels. Because the entire region is covered by the same land surface type (alpine meadow), two evaluation strategies are used, as described in Section 2.4. As can be seen from Figure 3a, the temporal evolutions of the ground measurements and the SMAP_P SM product show significant variation. Referring to the precipitation data, both the ground measurements and the SMAP_P SM product are able to capture the precipitation events and the SM variation trends. However, the ground measurements at different footprints show rather large differences, whereas the SMAP_P SM product shows relatively small differences. Additionally, the SMAP_P SM product underestimates SM relative to the ground measurements (Figure 3a,b).
Figure 4 shows the evaluation results of the SMAP_P SM product over the midstream area of HRB. Two comparisons are conducted: (1) the measurements from the two sites located within the oasis are averaged (Ground_Oasis) and compared to the oasis SMAP_P SM product (SMAP_P_Oasis), and the measurements from the site located in the Gobi desert (Ground_Bare) are compared to the bare surface SMAP_P SM product (SMAP_P_Bare); (2) the average values of all the ground measurements (Ground_All) in the whole network region are compared to the average SMAP_P SM product (SMAP_All). Both the SMAP_P SM product and the ground measurements show variation trends consistent with the precipitation. SMAP underestimates the SM over the oasis and overestimates it over the bare soil. However, the SMAP_P SM product averaged over the network gives a better estimate of SM.
Figure 5 shows the evaluation results for the downstream network, where the land surface is covered by bare soils and woody vegetation. One SMAP radiometer footprint covers the entire downstream network. Large differences between the ground measurements and the SMAP_P SM product are observed. However, the SMAP_P SM product is closer to the SM measurements over bare soil. Both the SMAP and ground measurements show similar temporal evolution.
Table 1 lists the error metrics of the SMAP_P SM product over the entire HRB network. The performance varies with climatic region and land surface type. Throughout the whole river basin, the SMAP_P SM product underestimates the SM over vegetated (woody and non-woody) surfaces, especially over the woody land surface, where the bias is larger than 0.14 m³/m³, and overestimates the SM over bare soil. Due to these biases, the SMAP_P SM product shows a relatively large RMSE over most climatic regions and land surface types. However, after the bias is removed it achieves a favorable accuracy, with the ubRMSE smaller than 0.04 m³/m³ (the SMAP mission target accuracy) over most areas of the HRB, except for the upstream area. The unsatisfactory performance over the upstream area of HRB may be caused by the impact of the soil freezing and thawing process in the cold climate region [1]. Although the product shows relatively small RMSE and ubRMSE values over the midstream and downstream areas, the RE values are rather large, especially over bare soils. This is because the SM values are very small over the bare soils. Overall, according to the error metrics defined in [16], the SMAP_P SM product performs relatively favorably over most parts of HRB, but the RE index indicates that sufficient care should be taken when these products are used.
Table 1. Performance metrics of SMAP_P SM product over HRB, in which N is the sample number. In the surface field, All_Ave means that all the ground measurements and SMAP_P SM products are averaged and compared. Pixel, Oasis, Bare and Woody mean that the ground measurements within a certain pixel with a specific surface type are applied to validate the SMAP_P product at pixel scale.

Active SM Product
Due to the short operating period of the SMAP radar sensor and the lack of SMAP coverage over HRB, the amount of SMAP_A SM product data over HRB is very small, especially in the upstream area of the basin. Overall, the active SM product shows an unsatisfactory estimation of SM. Figure 6 shows the comparison of the SMAP_A SM product and the ground measurements. The SM over the upstream area of HRB is significantly underestimated, and that over bare soils in the mid- and downstream areas is overestimated. These results are similar to those of the SMAP_P SM product. As shown in Table 2, all performance metrics indicate that the active SM product has unsatisfactory accuracy over most climatic regions and surface types in HRB. This observation implies that the present version of the SMAP_A SM product can hardly be applied directly in this region.
Table 2. Performance metrics of SMAP_A SM product over HRB, in which N is the sample number. In the surface field, All_Ave means that all the ground measurements and the SMAP_A SM products are averaged and compared; Oasis, Bare and Woody mean that the ground measurements within a certain SMAP_A pixel with a specific surface type are applied to validate the SMAP_A products at pixel scale.

Active/Passive Combined SM Product
The evaluation results of the SMAP_AP SM product are presented in Figure 7 and Table 3. The overall performance of the SMAP_AP SM product is better than that of the active product and worse than that of the passive product. Except for the bare surface in the downstream area, the SMAP_AP SM product shows a smaller RE than the active and passive products. As is the case with the active and passive products, the SMAP_AP SM product significantly underestimates the SM in the upstream area. Overestimation of SM can be found over bare soil in the mid- and downstream areas. In contrast to the other products, the SMAP_AP SM product shows a slight overestimation over the woody surface in the downstream area.
Table 3. Performance metrics of SMAP_AP SM product over HRB, in which N is the sample number. In the surface field, All_Ave means that all the ground measurements and SMAP_AP SM products are averaged and compared; Oasis, Bare and Woody mean that the ground measurements within a certain SMAP_AP pixel with a specific surface type are applied to validate the SMAP_AP products at pixel scale.

Discussion
In this paper, three different scale SMAP SM products are evaluated using the ground observation networks distributed in HRB, and the corresponding results are shown in Section 3. In this section, an extended discussion is conducted to give remarks on the overall performance of SMAP SM products, as well as to provide suggestions for the algorithm's improvement.
Regarding the evaluation metrics, the performance metrics defined by [16] and the RE are applied to assess the performance of the SMAP products in this research. As shown in Tables 1-3, several error indices, e.g., RMSE and ubRMSE, show very satisfactory accuracies, especially over the bare soil surface, even better than the SMAP mission target accuracy of 0.04 m³/m³. Nevertheless, the RE shows rather large values there: over bare soil in the arid regions of HRB, the SM values are very small, and although the SMAP estimated SM values are also small, they are not small enough compared to the ground reference values. Under this condition, the RMSE and ubRMSE present very small values, but the RE presents large values. Thus, the standard performance metrics alone cannot fully describe the performance of SMAP SM products. Not only the standard performance metrics, but also the RE, should therefore be carefully considered for a comprehensive evaluation of SMAP SM products. This point should also be addressed in the evaluation of many other remote sensing products.
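The contrast between a small RMSE and a large RE over dry soils can be illustrated with a short numerical sketch. The SM values below are hypothetical and chosen only to show the effect, and the RE form (mean absolute difference divided by the reference value) is an assumed definition.

```python
import math


def rmse(est, ref):
    """Root mean square error of paired series."""
    return math.sqrt(sum((e - g) ** 2 for e, g in zip(est, ref)) / len(ref))


def relative_error(est, ref):
    """ASSUMED form of RE: mean of abs(est - ref) / ref."""
    return sum(abs(e - g) / g for e, g in zip(est, ref)) / len(ref)


# Hypothetical ground values (m3/m3) for a dry bare soil and a wet soil,
# each overestimated by the same absolute amount.
offset = 0.03
dry_ref = [0.020, 0.030, 0.025]
wet_ref = [0.300, 0.320, 0.310]
dry_est = [v + offset for v in dry_ref]
wet_est = [v + offset for v in wet_ref]

for label, est, ref in (("dry", dry_est, dry_ref), ("wet", wet_est, wet_ref)):
    print(f"{label}: RMSE = {rmse(est, ref):.3f} m3/m3, "
          f"RE = {100 * relative_error(est, ref):.0f}%")
```

Both cases have the same RMSE of about 0.03 m³/m³, below the 0.04 m³/m³ target, yet the RE is above 100% for the dry soil and roughly 10% for the wet soil, because the same absolute error is divided by a much smaller reference value.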
To account for various climatic regions and land surface types, this research systematically evaluates the SMAP SM products over several landscapes distributed in the cold and arid regions of HRB. The numbers and densities of the ground observation sites vary among the networks, and even within a single network the land surface comprises several different landscapes.
Additionally, the SMAP provides three different spatial scale SM products. Meanwhile, the mixed pixel issue objectively exists in the remote sensing data. This issue also influences the application of SM products in various disciplines. Thus, the choice of evaluation strategy is very important. The authors of [1] utilized the averaged SM values of the whole ground network, without considering the difference in surface covers, to represent the ground truth, and used the averaged values of all SMAP SM products covering the whole network to represent the SMAP estimated values. The two sets of averaged values are compared to evaluate the performance of the SMAP_P SM product. However, the evaluation presented in this paper utilizes several strategies to compare the estimated and ground reference values. The results of this evaluation show significant performance differences in SMAP SM products over various land surfaces, e.g., the performances of SMAP SM products over bare soil are better than those over vegetated soil, and much better than those over the woodland surface. The averaging of all ground measurements without considering surface difference may make the result worse under some conditions, but it also makes the result a little better under other conditions. However, the discrimination of land surface may provide more useful information about the SMAP SM products performance, and extend their application.
Radio frequency interference (RFI) in the L-band microwave signal has been recognized as a major challenge in East Asia and may be an important error source for SMAP products over northwest China. However, this issue has been well controlled and mitigated [37,38]. Here, we mainly discuss possible error sources in the SM retrieval algorithm related to the climatic region and land surface characteristics. The overall performance of SMAP SM products over HRB has the following characteristics:

• The SMAP SM products over most parts of HRB present relatively satisfactory spatial-temporal variation; in particular, they capture the typical precipitation events. However, they present a relatively large RE.

• The performance of the passive SM product is the best. The active SM product is worse than the passive and combined products, which is consistent with the findings of [10].

• All SMAP SM products were found to perform slightly better over the midstream area than over the upstream and downstream areas of HRB.

• Better performance of all SMAP SM products can be observed over bare soils compared with vegetated soils.
As described in Section 3, all three SMAP SM products show overestimation over bare soils and underestimation over vegetated and frozen soils. These observations may be attributed to the climate and land surface characteristics of the study area, which comprises cold and arid regions. Over the cold region, liquid and solid water coexist in the soil and have different permittivity characteristics, but the present algorithm does not model the permittivity of frozen soil adequately, resulting in the underestimation of SM. Over the arid region, the soil is so dry that the L-band microwave signal penetrates to a greater soil depth and senses the volume scattering of the soil, which leads to the overestimation of SM. The underestimation of SM over the vegetated surface is caused by the effects of the vegetation canopy on the microwave signal. This vegetation effect is the key factor that weakens the SMAP estimates over the downstream area.
Additionally, weaknesses in the forward modeling and inversion strategies may be another important factor in the unsatisfactory performance of the present SMAP SM products. For the passive SM product, the results of this evaluation are close to the findings of [1], but the SMAP_P SM product has a larger RMSE and ubRMSE over HRB. Although the algorithm is relatively mature and has been validated previously, it might be improved by adding vertically polarized TB in future SMAP SM product generation. For the active SM product, errors may originate from the backscatter being less sensitive to SM than to surface roughness [39], as well as from the uncertainty in the time series algorithm. Parameterizing and estimating soil surface roughness from radar observations before SM estimation may be a promising approach to improve the SMAP_A SM inversion [40]. The errors in the SMAP_AP SM product may be caused by (1) the uncertainties in the active and passive observations themselves; and (2) the uncertainties caused by the scale issue when merging the radar and radiometer observations [35,36,41,42]. Developing a more robust merging algorithm that accounts for scale effects may improve the accuracy of the SMAP_AP SM product.

Conclusions
The validation of SMAP SM products is crucial for the application of the products and the refinement of the retrieval algorithm. This study presents an evaluation of the SMAP SM products using ground network data over the HRB in the cold and arid regions of northwestern China, which offers insight into the application of the SMAP SM products in the region and also provides suggestions for the refinement of the retrieval algorithm. The SMAP_A and SMAP_AP products are available for only a very limited period, but they can be used in scale-related research, and the present evaluation can provide feedback for the improvement of the current retrieval algorithm.
The results show that the SMAP SM products over most parts of HRB present relatively satisfactory spatial-temporal variation; in particular, they capture the typical precipitation events.
Among the three products, the passive SM product performs the best, and the active SM product performs worse than the other two.
All SMAP SM products perform slightly better over the midstream area than over the cold (upstream) and extremely arid (downstream) areas of HRB. Better performance of all SMAP SM products can be observed over bare soils than over vegetated soils. The vegetation effects on SM inversion are well recognized. The unsatisfactory performance of SMAP SM products over the cold region may be caused by the freezing and thawing cycle. This indicates that future refinement of the SMAP SM retrieval algorithm should focus on the vegetation effects and the freezing and thawing cycle.