Validation of Multiple Soil Moisture Products over an Intensive Agricultural Region: Overall Accuracy and Diverse Responses to Precipitation and Irrigation Events

Abstract: Remote sensing and land surface models promote the understanding of soil moisture dynamics by means of multiple products. These products differ in data sources, algorithms, model structures and forcing datasets, complicating the selection of optimal products, especially in regions with complex land covers. This study compared different products, algorithms and flagging strategies based on in situ observations in Anhui province, China, an intensive agricultural region with diverse landscapes. In general, models outperform remote sensing in terms of valid data coverage, metrics against observations or based on triple collocation analysis, and responsiveness to precipitation. Remote sensing performs poorly in hilly and densely vegetated areas and areas with developed water systems, where the low data volume and poor performance of satellite products (e.g., Soil Moisture Active Passive, SMAP) might constrain the accuracy of data assimilation (e.g., SMAP L4) and downstream products (e.g., Cyclone Global Navigation Satellite System, CYGNSS). Remote sensing has the potential to detect irrigation signals depending on algorithms and products. The single-channel algorithm (SCA) shows a better ability to detect irrigation signals than the Land Parameter Retrieval Model (LPRM). SMAP SCA-H and SCA-V products are the most sensitive to irrigation, whereas the LPRM-based Advanced Microwave Scanning Radiometer 2 (AMSR2) and European Space Agency (ESA) Climate Change Initiative (CCI) passive products cannot reflect irrigation signals. The results offer insight into optimal product selection and algorithm improvement.


Introduction
Soil moisture (SM) is widely recognized as a key parameter in the hydrological cycle and in energy balance [1,2]. Despite the rapid development of in situ observation techniques [3,4], remote sensing, land surface models (LSMs) and their combination (i.e., data assimilation) remain the sources of SM products that are globally continuous in space and time. Remote sensing algorithms and LSMs are developed, validated and improved in densely gauged areas, and are then extended to data-sparse areas. Radiometer-based remote sensing of SM is based on solving microwave transfer equations, supported by ancillary datasets including soil surface temperature, roughness and, if necessary, vegetation optical depth (VOD) [5,6]. Radar-based SM products and their merging with radiometer-based products can better represent SM dynamics in densely vegetated areas [7,8]. LSMs are driven by meteorological

Study Area and In Situ Observations
Anhui Province, China (29°41′-34°38′N, 114°54′-119°37′E), a major agricultural province, is located in a humid to semi-humid transitional zone. The northern part of the province has a semi-humid monsoon climate, and the southern part has a subtropical humid monsoon climate. The annual mean temperature ranges from 14 to 16 °C, and the annual precipitation ranges from 800 to 1600 mm across the province [34]. Cropland is the main land-use land cover (LULC) type in the northern and central flat areas (Figure 1), where irrigation systems are essential to maintain agricultural production [35]. The northern part has no large lakes or rivers, whereas the central part is characterized by highly developed water systems, including the Yangtze River, the Huai River and Chaohu Lake. The southern and southwestern areas are covered by hilly terrain with dense forests (Figure 1). A total of 20 meteorological stations (Table 1) recorded hourly SM data at a depth of 10 cm beneath the soil surface in 2017-2020. The data were collected from the Anhui Meteorological Service Center and were checked for breaks and non-responsive values, following the method in [36]. The temporal consistency of the SM observations was checked against multiple SM products (described below) to find any changes in sensor sensitivity (dynamic range) or obvious inconsistencies in SM trends. The resulting mean SM values and SM dynamics (1σ) are shown in Table 1. The 0.05° × 0.05° monthly Terra Moderate-resolution Imaging Spectroradiometer (MODIS) normalized difference vegetation index (NDVI) data in 2017-2020 were extracted and averaged to show the mean vegetation conditions at each station.

Soil Moisture Products and Data Preprocessing
Multiple radiometer-based products were compared in this study. The Advanced Microwave Scanning Radiometer 2 (AMSR2) products are based on the Land Parameter Retrieval Model (LPRM), JAXA algorithm, Normalized Polarization Difference (NPD) algorithm and Single Channel Algorithm (SCA). LPRM simultaneously retrieves surface temperature, SM and VOD, producing three datasets at 10.7 GHz (X band), 6.9 GHz (C1 band) and 7.3 GHz (C2 band) [38]. JAXA retrieves SM and VOD based on a lookup table approach [39]. NPD uses the polarization difference and a combined vegetation/roughness factor, and SCA uses Advanced Very High Resolution Radiometer (AVHRR) NDVI climatology for vegetation correction [40]. Unlike LPRM, the other algorithms only generate X-band SM products to mitigate radio frequency interference (RFI) effects. These products differ substantially from each other [41], which motivates new retrieval and merging algorithms [42,43]. For all AMSR2 products, SM values within 0-0.6 cm³·cm⁻³ were kept for validation.
Soil Moisture and Ocean Salinity (SMOS) products include SMOS-L3 V300 [44] and SMOS-IC V106 [45], derived from multi-angular brightness temperature (TB) observations by iterating the L-band Microwave Emission of the Biosphere forward model. Soil Moisture Active Passive (SMAP) products are generated based on three algorithms: SCA-H (H-pol), SCA-V (V-pol) and the dual channel algorithm (DCA) [19]. The recent SMAP Version 8 SM products were validated in this study, with DCA as the baseline algorithm. Two flagging strategies were applied to SMOS and SMAP. The rigorous one was the same as that in [46], which is commonly used in validation studies. Specifically, for SMOS-L3, data with a surface temperature < 273 K, quality index (Soil_Moisture_Dqx) > 0.06 or RFI probability (RFI_Prob) > 0.1 were excluded. SMOS-IC SM data with a surface temperature > 273 K and a quality flag = 0 ("data OK") were retained. For SMAP products, data with retrieval quality index = 0 or 8 were retained. The relaxed one is the same as that for AMSR2, retaining all SM values within 0-0.6 cm³·cm⁻³. Data from both the a.m. and p.m. orbits were considered for radiometer-based products (Table 2).
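The two flagging strategies can be sketched as boolean masks. The Python snippet below is only an illustration: the thresholds are the ones quoted above, but the function and argument names are hypothetical, and applying the 0-0.6 cm³·cm⁻³ range check inside the rigorous masks is an assumption rather than something the text states.

```python
import numpy as np

SM_MIN, SM_MAX = 0.0, 0.6  # valid SM range in cm^3/cm^3, as quoted in the text


def relaxed_mask(sm):
    """Relaxed strategy: keep every finite retrieval inside the valid range."""
    sm = np.asarray(sm, dtype=float)
    return np.isfinite(sm) & (sm >= SM_MIN) & (sm <= SM_MAX)


def smos_l3_rigorous_mask(sm, surface_temp_k, dqx, rfi_prob):
    """Rigorous SMOS-L3 screening: drop frozen, low-quality or RFI-prone data."""
    bad = (
        (np.asarray(surface_temp_k, dtype=float) < 273.0)
        | (np.asarray(dqx, dtype=float) > 0.06)
        | (np.asarray(rfi_prob, dtype=float) > 0.1)
    )
    # Assumption: the physical range check is applied on top of the flags.
    return relaxed_mask(sm) & ~bad


def smap_rigorous_mask(sm, retrieval_quality):
    """Rigorous SMAP screening: keep retrieval quality values 0 or 8 only."""
    return relaxed_mask(sm) & np.isin(retrieval_quality, (0, 8))
```

Comparing `relaxed_mask(sm).mean()` with the rigorous masks on the same series gives the kind of data-availability contrast between the two strategies that the validation reports.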
Active-based products were also validated in this study. The National Aeronautics and Space Administration (NASA) Cyclone Global Navigation Satellite System (CYGNSS) product is generated by reference to the SMAP SCA-V product and has the advantages of wide spatial and temporal coverage [47]. A 25-km MetOp-B Advanced Scatterometer (ASCAT) product was generated based on a change detection method [48]. SM values were retained only if wetland flag < 15%, topography flag < 20%, frozen soil probability < 10%, snow cover probability < 10% and SM retrieval error < 10% [26]. The European Space Agency (ESA) Climate Change Initiative (CCI) V06.1 Active, Passive and Combined products were used in this study. The active and passive products are generated by fusing multiple satellite retrievals from scatterometers and radiometers, respectively, and the active and passive combined product is further rescaled to the Global Land Data Assimilation System (GLDAS) Noah SM climatology [49,50]. Data with snow cover, a surface temperature < 273 K, dense vegetation or failed retrievals were discarded. For ASCAT and ESA CCI Active, the degree of saturation was transformed into volumetric water content using the ESA CCI auxiliary porosity data. The nearest CCI porosity data were used to calculate the ASCAT SM.
Remote Sens. 2022, 14, 3339
Five modeling-based products were validated in this study. Forced by a number of analysis- and observation-based products [51], the GLDAS Noah land surface model provides SM products in the 0-10 cm soil layer, which currently serve as the reference for rescaling ESA CCI SM. The Modern-Era Retrospective Analysis for Research and Applications Version 2 (MERRA2) is the latest atmospheric reanalysis of the modern satellite era produced by NASA's Global Modeling and Assimilation Office (GMAO) [52], providing coarse-scale SM products in the 0-5 cm soil layer. ERA5 benefits from a decade of developments in model physics and data assimilation, providing enhanced modeling results compared to its predecessor, ERA-Interim [53]. It has been demonstrated that the direct assimilation of microwave-based SM products (e.g., SMAP SM and, recently, CYGNSS SM) into LSMs improves SM modeling skills [13,54,55]. ERA5-Land shares most of its parameterizations with ERA5 and enhances the description of the hydrological cycle, in particular the soil moisture and lake description [56]. The SMAP L4 product is derived by assimilating SMAP TB observations into the NASA Catchment Land Surface Model [57,58]. The basic properties of the modeling products are shown in Table 2. Note that only the products with the timestamp closest to UTC 0:00 were collected. For each product, SM data in the topmost layer were evaluated, and the data were discarded if the soil temperature was less than 273 K.
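The conversion from degree of saturation to volumetric water content described above for ASCAT and ESA CCI Active amounts to a multiplication by porosity. The sketch below illustrates it in Python under stated assumptions: saturation is taken to be in percent, the porosity grid is assumed to be a regular latitude/longitude grid with 1-D coordinate vectors, and all function and argument names are hypothetical.

```python
import numpy as np


def nearest_porosity(lat, lon, grid_lat, grid_lon, porosity_grid):
    """Porosity of the grid cell whose centre is nearest to (lat, lon).

    Assumes a regular grid: grid_lat and grid_lon are 1-D coordinate
    vectors and porosity_grid has shape (len(grid_lat), len(grid_lon)).
    """
    i = int(np.argmin(np.abs(np.asarray(grid_lat, dtype=float) - lat)))
    j = int(np.argmin(np.abs(np.asarray(grid_lon, dtype=float) - lon)))
    return float(np.asarray(porosity_grid, dtype=float)[i, j])


def saturation_to_volumetric(saturation_pct, porosity):
    """Convert degree of saturation (%) to volumetric SM (cm^3/cm^3)."""
    return np.asarray(saturation_pct, dtype=float) / 100.0 * porosity
```

For example, a retrieval at 50% saturation over a cell with porosity 0.46 cm³·cm⁻³ maps to 0.23 cm³·cm⁻³.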

In Situ and Triple Collocation-Based Validations
A direct comparison of SM products with in situ observations is straightforward. For each station, quality-assured in situ SM observations and gridded SM products were spatially and temporally matched. The correlation coefficient (R), bias, unbiased root mean square error (ubRMSE) and root mean square error (RMSE) were calculated for each product. Taylor diagrams were used to compare R, RMSE and standard deviation (SD, the square root of variance) among these products, and the performance of each product was generalized from these statistics. SD is a measure of the SM dynamic range: low SD values indicate low information content, and exceptionally high SD values indicate noisy retrievals. R measures the overall consistency between SM products and observations. RMSE and ubRMSE measure the overall and bias-corrected SM differences, respectively. These metrics have been widely used in validation studies and are defined as follows:
$$R = \frac{\sum_{i=1}^{N}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{N}(x_i-\bar{x})^{2}}\,\sqrt{\sum_{i=1}^{N}(y_i-\bar{y})^{2}}}$$
$$\mathrm{bias} = \frac{1}{N}\sum_{i=1}^{N}(y_i-x_i)$$
$$\mathrm{ubRMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left[(y_i-\bar{y})-(x_i-\bar{x})\right]^{2}}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(y_i-x_i)^{2}}$$
where x is the in situ SM observation, y is the gridded SM product, the overbar denotes the mean value, and N denotes the number of data pairs.
Triple collocation analysis (TCA) was performed to account for the scale differences among SM products. The TC-based correlation coefficient (TC_R) and RMSE (TC_RMSE) can be calculated from three datasets with mutually independent errors, usually a passive-based, an active-based and a modeling-based dataset [20,28]. TC_R and TC_RMSE show the correlation and overall difference between each member of a triplet and the 'true' SM time series. For TCA, SM data in the a.m. and p.m. orbits were averaged to increase the sample size. The time series of SM anomaly was calculated based on a 31-day moving window, similar to [19], and the minimum length of the time series was 100. A critical hypothesis of TCA is zero error cross-correlation (ECC) among the members of a triplet, which is usually violated even for active- and passive-based products. Consequently, the evaluation results differ among triplets. Here, we calculated TC_R and TC_RMSE for every possible triplet to find the lower and upper bounds of the two metrics. We are aware that optimistic statistics can be obtained by including products with cross-correlated errors in TCA (e.g., SMAP and CYGNSS, or ERA5 and ASCAT, in a triplet), and the statistics might vary greatly among triplets [32]. Therefore, the median TC_R and TC_RMSE values were compared. Although the median values might still be biased, they can be used for a fair comparison among sites and products. All data processing was performed in MATLAB R2015a. TC_R and TC_RMSE are defined as follows (shown for X; the expressions for Y and Z follow by permuting the indices):
$$\mathrm{TC\_R}_X = \sqrt{\frac{\sigma_{XY}\,\sigma_{XZ}}{\sigma_{XX}\,\sigma_{YZ}}}$$
$$\mathrm{TC\_RMSE}_X = \sqrt{\sigma_{XX}-\frac{\sigma_{XY}\,\sigma_{XZ}}{\sigma_{YZ}}}$$
where X, Y and Z denote independent time series of SM anomaly (N > 100), and σ denotes variance (for one dataset) or covariance (for two datasets).
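The in situ metrics and the triplet-wise TC statistics described in this subsection can be sketched as follows. This is an illustrative Python version, not the authors' MATLAB code: it assumes the standard covariance form of the TC estimators, takes the SM anomaly series as already computed, and uses hypothetical function names; only the minimum series length of 100 comes from the text.

```python
from itertools import combinations

import numpy as np


def insitu_metrics(x, y):
    """R, bias, ubRMSE and RMSE of a gridded product y against in situ x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ok = np.isfinite(x) & np.isfinite(y)       # matched, valid pairs only
    x, y = x[ok], y[ok]
    diff = y - x                                # product minus in situ
    bias = diff.mean()
    rmse = np.sqrt(np.mean(diff ** 2))
    ubrmse = np.sqrt(np.mean((diff - bias) ** 2))
    r = np.corrcoef(x, y)[0, 1]
    return {"R": r, "bias": bias, "ubRMSE": ubrmse, "RMSE": rmse}


def tc_metrics(x, y, z, min_len=100):
    """TC_R and TC_RMSE of x for one triplet (x, y, z) of anomaly series."""
    data = np.vstack([x, y, z]).astype(float)
    data = data[:, np.all(np.isfinite(data), axis=0)]   # common valid dates
    if data.shape[1] < min_len:
        return np.nan, np.nan                           # series too short
    c = np.cov(data)                                    # 3 x 3 covariance matrix
    signal_var = c[0, 1] * c[0, 2] / c[1, 2]
    error_var = c[0, 0] - signal_var
    if not (signal_var > 0 and error_var >= 0):
        return np.nan, np.nan                           # TC assumptions violated
    return np.sqrt(signal_var / c[0, 0]), np.sqrt(error_var)


def median_tc(target, companions, min_len=100):
    """Median TC_R and TC_RMSE of target over every pair of companions."""
    results = [tc_metrics(target, y, z, min_len)
               for y, z in combinations(companions, 2)]
    return tuple(np.nanmedian(np.array(results), axis=0))
```

Calling `median_tc(product, other_products)` for each product mirrors the idea of summarizing all possible triplets by their median, so that a single optimistic triplet does not dominate the comparison.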

Evaluating the Capabilities of Precipitation and Irrigation Detection
Precipitation and irrigation are the dominant natural and human factors of SM dynamics. The ability of multiple SM products to respond to precipitation signals was first evaluated. To this end, the 0.1° × 0.1° daily Integrated Multi-satellite Retrievals for Global Precipitation Measurement (IMERG) precipitation product [59] was collected. For each station, the daily precipitation amount (cumulative P in UTC 0:00-UTC 0:00, approximately local time 8 a.m.-8 a.m.) was calculated and correlated with the daily SM change (∆SM) over the same 24 h. To match precipitation and SM products, the in situ observations at 8 a.m., the SMOS/SMAP/ASCAT a.m. products (6 a.m., 9:30 a.m.), and the daily averaged AMSR2 products were used. The other products were composited to 8 a.m. or generated at 8:00-9:30 a.m. (Table 2), with minimal time differences from the precipitation product. Correlation coefficients were calculated between daily P and ∆SM for in situ SM observations and SM products in order to evaluate the diverse responses of SM dynamics to precipitation. Because no in situ precipitation data were available, only IMERG precipitation data were used in this study.
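The precipitation test described above reduces to correlating daily P with the SM change over the same 24 h window. A minimal Python sketch follows; the function name is hypothetical, and the alignment (element i of the precipitation series covers the 24 h that start at the time of SM value i) is an assumption about how the two daily series are indexed.

```python
import numpy as np


def precipitation_response(daily_sm, daily_precip):
    """Correlation between daily precipitation and 24 h SM change.

    daily_sm[i] is the SM value at about 8 a.m. local time on day i, and
    daily_precip[i] is the precipitation accumulated over the following
    24 h (assumed alignment). Missing values are given as NaN.
    """
    sm = np.asarray(daily_sm, dtype=float)
    p = np.asarray(daily_precip, dtype=float)
    d_sm = sm[1:] - sm[:-1]          # SM change over each 24 h window
    p_win = p[:-1]                   # precipitation in the same window
    ok = np.isfinite(d_sm) & np.isfinite(p_win)
    if ok.sum() < 3:
        return np.nan                # too few pairs for a correlation
    return np.corrcoef(p_win[ok], d_sm[ok])[0, 1]
```

Because ∆SM needs valid SM values on two consecutive days, products with sparse temporal sampling lose many pairs in this step.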
The ability of multiple SM products to capture irrigation signals was then evaluated. To this end, the monthly irrigation water use (IWU) product [33] was collected. Recently, Zhang et al. [33] considered multiple irrigation-related processes in the framework of hydrological balance and integrated multiple satellite observations to obtain ensemble IWU estimates from 2011-2018. To match the IWU product, the SM products were aggregated to monthly averages. Because meteorological stations are not distributed in cropland areas, the observations only reflect precipitation signals. The difference between gridded products and in situ observations should therefore reflect irrigation signals, if any. Based on this hypothesis, the monthly SM difference (gridded minus in situ) was correlated with the monthly IWU. The stronger the correlation, the better the capability of an SM product to capture irrigation signals. Although representativeness errors might have an impact, results from the 20 stations provide insight into the selection of irrigation-sensitive SM products.

Overall Performance of SM Products
Figure 2 shows the statistical values for the validation of SM products at the 20 stations. Each boxplot shows the maximum, 75% quartile, median, 25% quartile and minimum of the metrics, including data availability, correlation coefficient, bias, ubRMSE and RMSE, for the 20 stations. Data availability means the proportion of quality-assured SM data in the study period of 2017-2020. Generally, models provide more SM data than remote sensing, demonstrating the advantages of wide spatial and temporal coverage. Only a minor fraction of the data (<5%) is discarded because of frozen soils. ESA CCI Combined and Active have provided more than 90% data in recent years, followed by CYGNSS. All AMSR2 products provide 50-80% data, the most for JAXA and the least for LPRM X. ASCAT provides about 40% data, whereas SMOS and SMAP provide much less data (<10%) due to a strict operational flagging strategy. A relaxed strategy (0-0.60 cm³·cm⁻³) can largely increase the data volume for L-band products (Figure 2). ESA CCI Passive has a wide range of data availability, as it integrates AMSR2, SMOS and SMAP data.
Modeling-based products generally outperform remote sensing-based products. ERA5 and ERA5-Land better capture SM dynamics (R ≈ 0.8) and have lower and more stable ubRMSE values than other products. Despite large differences in spatial resolution, MERRA2 and SMAP L4 provide almost unbiased SM data and have minimal differences from observations. For remote sensing, L-band retrievals outperform C-/X-band retrievals and have consistent data quality in the a.m. and p.m. orbits. In the L-band, SMAP products have lower uncertainties, although SMOS-IC performs better in terms of correlation. Regarding ubRMSE and RMSE, SMAP DCA performs the best, followed by SMAP SCA-V, SCA-H and SMOS-IC (SMOS-L3 is not included due to its low data volume). SMAP DCA is almost unbiased, similar to ASCAT. A relaxed flagging strategy increases the L-band data volume without necessarily causing a large decrease in data quality (especially for R). None of the AMSR2 products is well correlated with the observations (R < 0.4). Among them, NPD performs the best, followed by LPRM C2, whereas JAXA and SCA have the largest ubRMSE values. Positive biases are observed for LPRM products and negative biases for the other AMSR2 products. The differences in bias can be as large as 0.03 cm³·cm⁻³.
ESA CCI and ASCAT are better than other remotely sensed products, considering both data availability and accuracy. ESA CCI Combined performs the best, followed by ESA CCI Active and Passive. ASCAT is the optimal single-satellite-based product, outperforming SMAP and SMOS in data availability and AMSR2 in overall accuracy. Integration of ASCAT and other radar data makes ESA CCI Combined the best remotely sensed product. ESA CCI Combined is positively biased, similar to GLDAS Noah, as the former is rescaled to the latter. Larger positive and negative biases were observed for ESA CCI Active and Passive, respectively. CYGNSS significantly extends the spatial coverage of SMAP data, but at the expense of reduced data quality, which is even lower than that of poorly flagged SMAP data.
Figure 3 shows Taylor diagrams of all SM products for two cropland stations (Figure 3a,b) and one forest land station (Figure 3c). All modeling-based products report large SM dynamics in cropland and low SM dynamics in forest land. ERA5 and ERA5-Land have wider SM dynamic ranges and consistently stronger correlations with observations than other products. Modeling-based products are closer to in situ observations than remote sensing-based products, including ESA CCI Combined, which is rescaled to GLDAS Noah. AMSR2 products differ greatly among algorithms. The JAXA and SCA products are close to each other.

Specific Behaviors of SM Products
Figure 4 shows that most products capture short-term SM dynamics well, except for AMSR2 and CYGNSS. The AMSR2 LPRM products perform similarly and have wide SM dynamic ranges, with unexpectedly large values in the winter season. Only LPRM X-band retrievals are shown in Figure 4 for comparison with other AMSR2 X-band retrievals. AMSR2 JAXA and SCA have abnormally high values in the summer season and consistently low values (~0.01 cm³·cm⁻³) in other seasons. AMSR2 NPD has a very narrow SM dynamic range. CYGNSS also has a narrow SM dynamic range, with large short-term noise. SMAP products are in better agreement with in situ observations than SMOS-IC and SMOS-L3 (not shown due to the low data volume). All ESA CCI products reproduce the in situ observed SM dynamics well. For ESA CCI Combined, data fusion and rescaling to GLDAS Noah SM reduce dry biases in ESA CCI Active and wet biases in ESA CCI Passive. ASCAT performs similarly to ESA CCI Active, but the retrievals are almost unbiased.
Forced by meteorological datasets, all modeling-based products reproduce the temporal pattern of SM variability well. It is interesting to observe that the 0-10 cm ERA5 and ERA5-Land products fit better with in situ observations for dry soils, and the 0-5 cm MERRA2 and SMAP L4 products fit better for wet soils (Figure 4). The depth of the top soil layer and the quality of the forcing datasets might account for the differences. A detailed comparison against in situ observations is shown in Figure 5. Considering data availability and overall accuracy, only six major products are presented here. Regression slope values indicate the dynamic range of SM values. ERA5 has the largest SM dynamic range, followed by SMAP L4, MERRA2, ESA CCI Combined, GLDAS Noah and CYGNSS.

TC-Based Comparison of SM Products
TC-based correlations confirm the advantages of models over remote sensing, especially in the central water-contaminated and southern hilly areas. CYGNSS has a moderate correlation in the northern plain, where SMAP products also perform better. SMAP cannot provide enough quality-assured data for calibrating CYGNSS in the rest of the study area, leading to decreased CYGNSS data quality (R < 0.2, Figure 6a). ESA CCI Combined also has decreased data quality in these areas (Figure 6b), where satellite-based SM retrievals generally have large uncertainties. Especially in forest areas, low SM dynamics and high retrieval uncertainties contribute to low correlations. Model performance is less dependent on land cover. SMAP L4 and ERA5 perform better in the northern plain, and their performance decreases only marginally in the rest of the study area. Compared to GLDAS Noah and MERRA2, the assimilation-based ERA5 and SMAP L4 performed slightly better in the central and southern areas. Together with ESA CCI Combined, all modeling-based products have ubRMSE values below 0.04 cm³·cm⁻³, except for ERA5 (Figure 7). Although ERA5 shows a median RMSE value below 0.04 cm³·cm⁻³, its RMSE values are generally larger than those of other modeling-based products and exceed 0.04 cm³·cm⁻³ for some triplets. It seems that TC-based RMSE depends on the dynamic range of SM products. A wide SM dynamic range (e.g., for ERA5) might also amplify random errors, and a narrow SM dynamic range produces low σ_XX values in the TC_RMSE definition (Equation (7)) and thus low RMSE values.
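The dependence of TC-based RMSE on dynamic range can be made explicit with a short worked relation. It is a sketch under two assumptions that are not taken from the source: the TC estimators are written in their standard covariance form, and a product X' differs from X only by a hypothetical multiplicative factor a > 0 (for example, a compressed or stretched dynamic range).

```latex
\begin{align}
X' = aX \;\Rightarrow\;& \sigma_{X'X'} = a^{2}\sigma_{XX},\quad
  \sigma_{X'Y} = a\,\sigma_{XY},\quad
  \sigma_{X'Z} = a\,\sigma_{XZ} \\
\mathrm{TC\_RMSE}_{X'} &= \sqrt{\sigma_{X'X'} - \frac{\sigma_{X'Y}\,\sigma_{X'Z}}{\sigma_{YZ}}}
  = a\,\sqrt{\sigma_{XX} - \frac{\sigma_{XY}\,\sigma_{XZ}}{\sigma_{YZ}}}
  = a\,\mathrm{TC\_RMSE}_{X} \\
\mathrm{TC\_R}_{X'} &= \sqrt{\frac{\sigma_{X'Y}\,\sigma_{X'Z}}{\sigma_{X'X'}\,\sigma_{YZ}}}
  = \sqrt{\frac{\sigma_{XY}\,\sigma_{XZ}}{\sigma_{XX}\,\sigma_{YZ}}}
  = \mathrm{TC\_R}_{X}
\end{align}
```

Under these assumptions, a product with a narrower dynamic range (a < 1) reports a proportionally smaller TC_RMSE while its TC_R is unchanged, which suggests reading the two metrics together rather than ranking products on TC_RMSE alone.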

Diverse SM Responses to Precipitation Events
In situ observations show stronger SM responses to precipitation in cropland (R = 0.4) than in forest land (R = 0.3) (Figure 8). The responses of SM products to precipitation are shown in Figure 9. Among the AMSR2 products, LPRM X best reproduces the correlation (R > 0.15) and JAXA is the worst (almost uncorrelated). No SMOS results are available because of the low data volume. SMAP products also show weak correlations, and a relaxed flagging strategy does not improve performance. CYGNSS produces slightly better correlations than SMAP, attributable to higher data availability. The single-satellite ASCAT product cannot reflect SM dynamics due to precipitation events. By integrating both MetOp-A and MetOp-B ASCAT data, ESA CCI Active shows much improved correlations (R > 0.3). ESA CCI Passive performs better than any individual radiometer-based product in response to precipitation, and the responsiveness of ESA CCI Combined is further enhanced by blending the Active product. All modeling-based products show responsiveness close to that of in situ observations. ERA5 performs the best, followed by SMAP L4, ERA5-Land, GLDAS Noah and MERRA2. Despite its finer spatial resolution, ERA5-Land does not perform as well as ERA5. These modeling-based products are forced by diverse precipitation datasets. However, the good data quality shared by precipitation datasets in this data-rich area is likely the major reason for the strong SM responses to IMERG-based precipitation.

Diverse SM Responses to Precipitation Events
In situ observations show stronger SM responses to precipitation in cropland (R = 0.4) than in forest land (R = 0.3) (Figure 8). The responses of SM products to precipitation are shown in Figure 9. Among other AMSR2 products, LPRM X can best reproduce the correlation (R > 0.15) and JAXA the worst (almost uncorrelated). No SMOS results are available because of the low data volume. SMAP products also show weak correlations, and a relaxed flagging strategy does not improve performance. CYGNSS produces slightly better correlations than SMAP, attributable to higher data availability. A single satellite-based ASCAT product cannot reflect SM dynamics due to precipitation events. Integrating both MetOp-A and MetOp-B ASCAT data, ESA CCI Active shows much improved correlations (R > 0.3). ESA CCI Passive performs better than any individual

Diverse SM Responses to Precipitation Events
In situ observations show stronger SM responses to precipitation in cropland (R = 0.4) than in forest land (R = 0.3) (Figure 8). The responses of SM products to precipitation are shown in Figure 9. Among the AMSR2 products, LPRM X best reproduces the correlation (R > 0.15) and JAXA performs worst (almost uncorrelated). No SMOS results are available because of the low data volume. SMAP products also show weak correlations, and a relaxed flagging strategy does not improve performance. CYGNSS produces slightly better correlations than SMAP, attributable to its higher data availability. A single-satellite ASCAT product cannot reflect SM dynamics driven by precipitation events. By integrating both MetOp-A and MetOp-B ASCAT data, ESA CCI Active shows much improved correlations (R > 0.3). ESA CCI Passive performs better than any individual passive product. Modeling-based products show responsiveness close to that of in situ observations. ERA5 performs the best, followed by SMAP L4, ERA5 Land, GLDAS Noah and MERRA2. Despite its finer spatial resolution, ERA5 Land does not perform as well as ERA5. These modeling-based products are forced by diverse precipitation datasets. However, the good data quality shared by precipitation datasets in this data-rich area is likely the major reason for the strong SM responses to IMERG-based precipitation.

Figure 9. Correlations between daily precipitation amount and soil moisture change based on multiple products. Each boxplot shows the distribution of correlation coefficient values at the 20 stations, including the maximum, 75th percentile, median, 25th percentile and minimum. The symbol "(+)" means a relaxed flagging strategy for L-band products.
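The responsiveness metric summarized in Figure 9 can be sketched as follows. This is a minimal illustration, not the processing chain used in this study: it assumes that SM change is the day-to-day difference paired with same-day precipitation, and the function names (`precipitation_response`, `box_stats`, `synthetic_station`) and all numerical values are hypothetical.

```python
import numpy as np


def precipitation_response(precip, sm):
    """Pearson R between daily precipitation and the same-day SM change."""
    precip = np.asarray(precip, dtype=float)
    sm = np.asarray(sm, dtype=float)
    d_sm = np.diff(sm)            # SM change: sm[t] - sm[t-1]
    p = precip[1:]                # precipitation on day t (this pairing is an assumption)
    ok = np.isfinite(d_sm) & np.isfinite(p)
    if ok.sum() < 3 or np.std(p[ok]) == 0 or np.std(d_sm[ok]) == 0:
        return float("nan")       # too few valid pairs or no variability
    return float(np.corrcoef(p[ok], d_sm[ok])[0, 1])


def box_stats(r_values):
    """Minimum, 25th percentile, median, 75th percentile and maximum, ignoring NaN."""
    return np.nanpercentile(np.asarray(r_values, dtype=float), [0, 25, 50, 75, 100])


def synthetic_station(seed, n_days=365):
    """Hypothetical station: a bucket that wets with rain and dries exponentially."""
    rng = np.random.default_rng(seed)
    precip = rng.gamma(shape=0.3, scale=10.0, size=n_days)   # mm/day, illustrative only
    sm = np.empty(n_days)
    sm[0] = 0.25
    for t in range(1, n_days):
        sm[t] = sm[t - 1] - 0.1 * (sm[t - 1] - 0.10) + 0.004 * precip[t]
    return precip, sm


# One R value per station, then the boxplot summary across 20 stations.
r_per_station = [precipitation_response(*synthetic_station(s)) for s in range(20)]
summary = box_stats(r_per_station)
```

Missing retrievals (NaN) are dropped pairwise, so products with low data volume yield R values based on few samples, or no value at all, which is consistent with the absence of SMOS results noted above.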

Diverse SM Responses to Irrigation Events
The ability to capture irrigation signals differs among products and algorithms (Figure 10). As expected, modeling-based products can barely capture irrigation signals (R < 0.3), even when satellite data are assimilated (e.g., for ERA5 and SMAP L4). The radar-based ASCAT product is almost insensitive to irrigation signals, and ESA CCI Active is even less sensitive. Despite their low data volume, SMAP products are the most sensitive to irrigation signals. SMAP SCA-H and SCA-V have stronger sensitivities (R > 0.5) than SMAP DCA, although the latter shows better overall accuracy. Notably, the relaxed flagging strategy does not deprive SMAP of its ability to capture irrigation signals. The good performance even transfers to CYGNSS. SMOS products are less effective than SMAP products because of their lower data volume and overall accuracy. Given the contrasting SM values in cropping (high SM) and non-cropping (low SM) seasons (Figure 4), it is not surprising that AMSR2 JAXA and SCA are strongly correlated (R > 0.5) with irrigation water use. Single-channel algorithms appear more capable than LPRM algorithms of detecting irrigation signals. All three AMSR2 LPRM products and the LPRM-based ESA CCI Passive product have negative correlations (R < −0.3) with irrigation water use.

Figure 10. Correlations between soil moisture bias (gridded minus in situ soil moisture) and irrigation water use on monthly scales. Each boxplot shows the distribution of correlation coefficient values at the 20 stations, including the maximum, 75th percentile, median, 25th percentile and minimum. The symbol "(+)" means a relaxed flagging strategy for L-band products.
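The irrigation metric used in this section, i.e., the correlation between monthly SM bias (gridded minus in situ) and irrigation water use, can be sketched as below. The monthly averaging, the 30-day "months", the function names and the synthetic numbers are assumptions made for illustration and are not taken from the study's processing.

```python
import numpy as np


def monthly_mean(daily, month_id, n_months):
    """Mean of finite daily values in each month; NaN where a month has no data."""
    daily = np.asarray(daily, dtype=float)
    month_id = np.asarray(month_id)
    out = np.full(n_months, np.nan)
    for m in range(n_months):
        vals = daily[(month_id == m) & np.isfinite(daily)]
        if vals.size:
            out[m] = vals.mean()
    return out


def irrigation_sensitivity(sm_product, sm_insitu, month_id, irrigation_monthly):
    """Pearson R between monthly SM bias (product minus in situ) and irrigation water use."""
    bias = np.asarray(sm_product, dtype=float) - np.asarray(sm_insitu, dtype=float)
    irrigation_monthly = np.asarray(irrigation_monthly, dtype=float)
    monthly_bias = monthly_mean(bias, month_id, irrigation_monthly.size)
    ok = np.isfinite(monthly_bias) & np.isfinite(irrigation_monthly)
    if ok.sum() < 3:
        return float("nan")
    return float(np.corrcoef(monthly_bias[ok], irrigation_monthly[ok])[0, 1])


# Hypothetical two-year example with 30-day "months" and a summer irrigation peak.
rng = np.random.default_rng(1)
n_months = 24
month_id = np.repeat(np.arange(n_months), 30)
irrigation = np.tile([0, 0, 0, 10, 30, 60, 80, 60, 20, 0, 0, 0], 2).astype(float)  # mm/month

sm_insitu = 0.25 + rng.normal(0.0, 0.01, month_id.size)      # station outside irrigated fields
wetting = 0.001 * irrigation[month_id]                        # assumed irrigation effect on the grid cell
sm_sensitive = sm_insitu + wetting + rng.normal(0.0, 0.01, month_id.size)
sm_blind = sm_insitu + rng.normal(0.0, 0.01, month_id.size)   # product without an irrigation signal

r_sensitive = irrigation_sensitivity(sm_sensitive, sm_insitu, month_id, irrigation)
r_blind = irrigation_sensitivity(sm_blind, sm_insitu, month_id, irrigation)
```

The metric rewards any seasonal bias that happens to co-vary with irrigation, so a strong positive R does not by itself establish that a product detects irrigation.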

Practices for Optimal Product and Algorithm Selection
No single SM product performs consistently better than the others. Modeling-based products have the advantages of continuous spatial and temporal coverage, strong correlations with in situ observations, and timely responses to rainfall events. Despite less accurate forcing data over poorly gauged areas, models still perform better and more consistently across different landscapes than remote sensing. The latter performs poorly or even fails to produce meaningful retrievals in the central and southern parts of Anhui province. The main difficulties include the separation of water emissions and the effects of complex terrain and/or dense vegetation. This is evidenced by the extremely low data volume of operational SMOS and SMAP products. As a result, data assimilation (e.g., SMAP L4) can only marginally improve SM modeling in these areas. Recently, several studies have demonstrated the limited contribution of data assimilation to SM and carbon flux modeling under different circumstances [9,60]. This might also explain the large uncertainties of the CYGNSS product, which uses the SMAP product as a calibration reference. The increased coverage of the CYGNSS product comes at the expense of decreased accuracy. Although AMSR2 products are not acceptable in terms of absolute accuracy, JAXA and SCA detect plausible irrigation signals. This is more likely a coincidence arising from high SM biases in cropping seasons and low biases in non-cropping seasons. For the detection of irrigation events, SMAP products are the optimal choice, especially the SCA products.
High-resolution SM products are necessary for regional-scale drought monitoring [61,62], especially in intensive agricultural regions. ERA5 Land performs better than or at least comparably to ERA5 in terms of multiple metrics (Figure 2), providing a finer-resolution (~9 km) alternative for drought monitoring. This study corroborates the current use of ERA5 Land SM for drought monitoring in Anhui province. Remote sensing offers an objective description of land surfaces. However, remotely sensed products do not correlate as well with in situ observations or respond as well to rainfall events on a daily scale. Short-term random noise might be the major reason. As a result, temporal aggregation can improve the comparability of remotely sensed and modeling-based products. For example, Liu et al. [63] observed a slightly better performance of ESA CCI over GLDAS Noah for global drought monitoring on a monthly scale. Moreover, an ensemble of results from multiple datasets is recommended.
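The effect of temporal aggregation on short-term random noise can be illustrated with a small synthetic experiment. The sketch below assumes purely random, uncorrelated daily retrieval noise and uses 30-day block means as a stand-in for calendar months; the series, noise level and function names are hypothetical.

```python
import numpy as np


def pearson_r(a, b):
    """Pearson correlation coefficient of two equally long series."""
    return float(np.corrcoef(a, b)[0, 1])


def block_mean(x, window):
    """Average consecutive, non-overlapping windows; a trailing partial window is dropped."""
    x = np.asarray(x, dtype=float)
    n_blocks = x.size // window
    return x[: n_blocks * window].reshape(n_blocks, window).mean(axis=1)


def aggregation_effect(reference, product, window=30):
    """Return (R on daily values, R on window means)."""
    r_daily = pearson_r(reference, product)
    r_agg = pearson_r(block_mean(reference, window), block_mean(product, window))
    return r_daily, r_agg


# Hypothetical two-year series: a seasonal cycle plus large day-to-day retrieval noise.
rng = np.random.default_rng(2)
days = np.arange(720)
reference = 0.25 + 0.08 * np.sin(2 * np.pi * days / 360)
product = reference + rng.normal(0.0, 0.08, days.size)

r_daily, r_monthly = aggregation_effect(reference, product, window=30)
```

Averaging suppresses uncorrelated noise but not systematic seasonal biases, so aggregation helps correlation-type metrics more than bias-type metrics.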

Implications for Improving the Retrieval Algorithm
AMSR-E and AMSR2 provide over 20 years of multifrequency global observations. Several retrieval algorithms have been developed, among which LPRM C2 and NPD stand out in this study. More stringent flagging strategies might improve the evaluation metric values, especially considering RFI effects. Lenient flagging partly explains the high percentage of data availability and, at the same time, the low data quality. This applies equally to other AMSR2 products. AMSR2 SCA has high values in wet seasons and low values in dry seasons. Because SCA has proven successful and previously served as a baseline algorithm for SMAP, its soil and vegetation parameters could be refined for better retrieval. More importantly, the parameterization needs to be reconsidered, as the retrieval algorithms show diverse SM biases.
The SMAP SCA products have shown a superior ability to capture irrigation signals. The use of real-time instead of climatological vegetation data might further improve SM retrievals and the detectability of irrigation signals, because cropland phenology experiences substantial inter-annual variability [64,65]. Water correction is critical to SM retrieval in regions with a dense network of rivers and lakes. This is probably the main reason for the decreased SM accuracy in central Anhui province. The distribution of complex terrain and dense forests explains the low SM accuracy in the south. Both effects apply equally to other remotely sensed products but are not decisive for modeling-based products. Recent studies have shown that L-band radiometry is sensitive to SM under temperate forest canopies and at depths greater than a few centimeters [66,67]. This might improve SM retrieval under dense vegetation cover. Biases in effective soil surface temperature data also play a role in SM retrieval [46,68,69], especially under dense vegetation cover that masks a large portion of soil emissions.
The ESA and NASA efforts on GNSS-R for SM retrieval have been well elaborated upon by Pierdicca et al. [70]. This technique is still far from mature for SM retrieval, although more advanced algorithms have recently been developed, e.g., change detection [71], machine learning [72,73] and a semiempirical method [74]. The currently operational CYGNSS SM product is generated using a linear relationship calibrated between reflectivity and SMAP SM. The residual nonlinearities and uneven distribution of calibration samples explain the reduced SM dynamic range, i.e., underestimating high SM values and overestimating low SM values [70]. The same issue is also encountered in remote sensing of soil salinity based on linear regression [75]. Moreover, CYGNSS retrievals are affected by low-quality SMAP data and noisy observations over mountainous regions, making the SM time series noisier than those of SMAP. To address both issues, nonlinear and physically based methods are needed for further improvement. For nonlinear methods, such as machine learning [72,73], high-quality satellite retrievals are still required.
LPRM products (AMSR2 and ESA CCI Passive) show strong negative correlations between irrigation water use and SM bias. The underestimation of high SM values and/or overestimation of low SM values might be responsible. Although spatial scales differ between product grids and site observations, SMAP products still show strong positive correlations. At the least, LPRM products are biased relative to SMAP products. This result underscores the importance of comparing multiple SM climatologies [76] and of further investigation into the LPRM algorithm.
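The narrowing of the SM dynamic range under a linear calibration between reflectivity and a reference SM can be reproduced with a toy example. The sketch below isolates only one generic mechanism, namely observation noise in a least-squares linear fit, and does not model the residual nonlinearities or uneven sampling cited above; the reflectivity model and every coefficient are hypothetical and do not describe the operational CYGNSS algorithm.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000

# Hypothetical truth, reference retrieval and reflectivity (all values illustrative).
sm_true = rng.uniform(0.05, 0.45, n)                            # m3/m3
sm_ref = sm_true + rng.normal(0.0, 0.04, n)                     # noisy calibration reference
reflectivity_db = -20.0 + 25.0 * sm_true + rng.normal(0.0, 2.0, n)

# Linear calibration: least-squares fit of reference SM on reflectivity.
slope, intercept = np.polyfit(reflectivity_db, sm_ref, deg=1)
sm_est = slope * reflectivity_db + intercept

dry = sm_true < 0.15
wet = sm_true > 0.35
range_ratio = sm_est.std() / sm_true.std()       # below 1 means a compressed dynamic range
dry_bias = (sm_est[dry] - sm_true[dry]).mean()   # positive: low SM is overestimated
wet_bias = (sm_est[wet] - sm_true[wet]).mean()   # negative: high SM is underestimated
```

In this setting the compression grows with the noise in the predictor, which is in line with the expectation that noisier observations over mountainous regions degrade the calibrated product further.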

Recommendations for Validation of Soil Moisture Products
Validation practices for satellite soil moisture products have been well documented by Gruber et al. [11]. The representativeness of in situ sites is emphasized in this study. Meteorological stations record long-term soil moisture observations, which are invaluable for validation purposes. However, these stations are sited for ease of management and are generally far from agricultural land. The observations therefore reflect only natural soil drying and wetting processes and are unaffected by irrigation events. This might be one of the reasons why models perform better than remote sensing. Based on MODIS products, we observed a decreasing NDVI trend over 2000-2021 at the 20 meteorological stations. The intensive urbanization processes in China might lower the representativeness of in situ observations. The measurement depth of in situ data is also a critical factor affecting the validation results. The 10-cm measurement depth is closer to the layer depths of modeling-based products, e.g., 0-5 cm, 0-7 cm and 0-10 cm, which yields better validation metric values for these products. Remotely sensed products have a shallower penetration depth of 0-2 cm at the X- and C-bands and 0-5 cm at the L-band. The inconsistencies in soil depth underscore the difficulties in SM product evaluation, especially for biases.
The other recommendation concerns the validation method. TCA relies on several assumptions, and the results might differ among triplets. The basic assumption is that each SM dataset is linearly related to the unknown true SM time series, plus zero-mean random noise. The core assumption is zero error cross-correlation between SM datasets, which does not hold even between passive- and active-based retrievals. Although it is feasible to examine this assumption by introducing a fourth dataset [24], using multiple triplets is more practical. For example, Zheng et al. [27] used multiple triplets and a bootstrapping technique to enhance the TCA results. Similarly, in this study, the method can depict the upper and lower boundaries of TC-based metric values. It becomes more useful with a growing number of products from observations, models and remote sensing. The median correlation and RMSE values are more robust, reducing the risk of over-optimistic evaluation results.
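The multiple-triplet idea can be sketched with the standard covariance-based form of triple collocation. This is a minimal illustration under stated assumptions: the synthetic products have mutually independent errors by construction, no bootstrapping is performed, and the function names and noise levels are hypothetical rather than those of this study.

```python
import itertools
import numpy as np


def tca(x, y, z):
    """Covariance-based triple collocation.

    Returns (rmse, r): error standard deviations in each dataset's own units and
    correlations with the unknown truth, ordered as (x, y, z).
    """
    data = np.vstack([x, y, z]).astype(float)
    data = data[:, np.all(np.isfinite(data), axis=0)]    # keep jointly valid samples
    c = np.cov(data)
    err_var = np.array([
        c[0, 0] - c[0, 1] * c[0, 2] / c[1, 2],
        c[1, 1] - c[0, 1] * c[1, 2] / c[0, 2],
        c[2, 2] - c[0, 2] * c[1, 2] / c[0, 1],
    ])
    r2 = np.array([
        c[0, 1] * c[0, 2] / (c[0, 0] * c[1, 2]),
        c[0, 1] * c[1, 2] / (c[1, 1] * c[0, 2]),
        c[0, 2] * c[1, 2] / (c[2, 2] * c[0, 1]),
    ])
    # Sampling noise can push estimates slightly outside their valid range.
    return np.sqrt(np.clip(err_var, 0.0, None)), np.sqrt(np.clip(r2, 0.0, 1.0))


def tca_over_triplets(products, target):
    """Range and median of TC metrics for `target` over all triplets that contain it."""
    others = [name for name in products if name != target]
    rmse, r = [], []
    for a, b in itertools.combinations(others, 2):
        rmse_abc, r_abc = tca(products[target], products[a], products[b])
        rmse.append(rmse_abc[0])
        r.append(r_abc[0])
    return {
        "rmse": (min(rmse), float(np.median(rmse)), max(rmse)),
        "r": (min(r), float(np.median(r)), max(r)),
        "n_triplets": len(rmse),
    }


# Hypothetical products with mutually independent errors (the TCA ideal case).
rng = np.random.default_rng(4)
truth = 0.25 + rng.normal(0.0, 0.05, 20000)
products = {
    "A": truth + rng.normal(0.0, 0.02, truth.size),
    "B": 0.8 * truth + 0.02 + rng.normal(0.0, 0.03, truth.size),
    "C": 1.2 * truth - 0.05 + rng.normal(0.0, 0.04, truth.size),
    "D": 0.9 * truth + rng.normal(0.0, 0.05, truth.size),
}
result = tca_over_triplets(products, "A")
```

With real products, error cross-correlation within a triplet biases the estimates, so the spread between the minimum and maximum across triplets gives a practical indication of how strongly the assumptions are violated.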

Conclusions
This study compared multiple remotely sensed, modeling- and assimilation-based SM products against in situ observations in a humid to semi-humid transitional region with diverse landscapes. Models generally outperform remote sensing in hilly and densely vegetated areas and areas with developed water systems. Remote sensing has difficulties in these areas, as evidenced by the extremely low data volume of operational SMOS and SMAP products. The limited and noisy SMAP reference data are mainly responsible for the low accuracy and narrow dynamic range of the CYGNSS product. For the same reason, data assimilation can only marginally improve SM modeling. AMSR2 products show diverse but generally low performance depending on the retrieval algorithm, with LPRM C2 and NPD performing better in terms of overall accuracy. ASCAT is the optimal single-satellite product, having both acceptable accuracy and spatial coverage. Models can better reproduce the responses of SM to precipitation events than remote sensing, while by nature they cannot reflect irrigation events. SMAP SCA-H and SCA-V are among the best products for detecting irrigation signals. The plausible irrigation signals revealed by AMSR2 SCA and JAXA are likely caused by retrieval errors. All LPRM products failed to identify irrigation events, probably due to an overestimation of low SM values and/or an underestimation of high SM values. The evaluation results provide guidance for selecting optimal products, improving retrieval algorithms and establishing common practices for SM validation.

Data Availability Statement: The AMSR2 LPRM SM product can be found here: https://disc.gsfc.nasa.gov/datasets?keywords=AMSR2; the AMSR2 JAXA SM product can be found here: ftp://ftp.gportal.jaxa.jp; the AMSR2 LANCE (NPD and SCA) SM product can be found here: https://n5eil01u.ecs.nsidc.org/DP1/AMSA/AU_Land.001/; the SMOS L3 and SMOS IC SM products can be found here: ftp://ftp.ifremer.fr; the SMAP L3 SM product can be found here: https://nsidc.org/data/SPL3SMP/versions/8; the CYGNSS SM product can be found here: https://data.cosmic.ucar.edu/gnss-r/soilMoisture/cygnss/level3/; the ASCAT SM product can be found here: https://navigator.eumetsat.int/product/EO:EUM:DAT:METOP:SOMO25; the ESA CCI SM product can be found here: https://esa-soilmoisture-cci.org/data; the GLDAS Noah SM product can be found here: https://ldas.gsfc.nasa.gov/gldas/; the MERRA2 SM product can be found here: https://disc.gsfc.nasa.gov/datasets?keywords=MERRA2; the ERA5 SM product can be found here: https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era5; the ERA5 Land SM product can be found here: https://www.ecmwf.int/en/forecasts/dataset/ecmwf-reanalysis-v5-land; the SMAP L4 SM product can be found here: https://nsidc.org/data/SPL4SMGP/versions/6; the MODIS NDVI product can be found here: https://ladsweb.modaps.eosdis.nasa.gov/; the GPM IMERG precipitation product can be found here: https://disc.gsfc.nasa.gov/datasets/GPM_3IMERGDF_06/summary?keywords=%22IMERG%20final%22; the irrigation water use product is openly available in the National Tibetan Plateau/Third Pole Environment Data Center (TPDC) at doi.org/10.11888/hydro.tpdc.271220. Registration is generally required to access the data. In situ soil moisture data are not publicly available due to the data privacy policy.

Conflicts of Interest:
The authors declare no conflict of interest.