Remote Sens. 2017, 9(11), 1179; https://doi.org/10.3390/rs9111179
Data Assimilation to Extract Soil Moisture Information from SMAP Observations
Universities Space Research Association, Columbia, MD 21046, USA
Global Modeling and Assimilation Office, NASA Goddard Space Flight Center, Greenbelt, MD 20771, USA
Science Systems and Applications Inc., Lanham, MD 20706, USA
USDA ARS Hydrology and Remote Sensing Laboratory, Beltsville, MD 20705, USA
USDA ARS Southeast Watershed Research Center, Tifton, GA 31793, USA
Bureau of Economic Geology, The University of Texas at Austin, Austin, TX 78712, USA
Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA 91125, USA
USDA ARS Southwest Watershed Research Center, Tucson, AZ 85719, USA
USDA ARS National Soil Erosion Research Laboratory, West Lafayette, IN 47907, USA
Viterbi School of Engineering, University of Southern California, Los Angeles, CA 90089, USA
USDA ARS Grazinglands Research Laboratory, El Reno, OK 73036, USA
Author to whom correspondence should be addressed.
Received: 3 October 2017 / Accepted: 9 November 2017 / Published: 17 November 2017
This study compares different methods to extract soil moisture information through the assimilation of Soil Moisture Active Passive (SMAP) observations. Neural network (NN) and physically-based SMAP soil moisture retrievals were assimilated into the National Aeronautics and Space Administration (NASA) Catchment model over the contiguous United States for April 2015 to March 2017. By construction, the NN retrievals are consistent with the global climatology of the Catchment model soil moisture. Assimilating the NN retrievals without further bias correction improved the surface and root zone correlations against in situ measurements from 14 SMAP core validation sites (CVS) by 0.12 and 0.16, respectively, over the model-only skill, and reduced the surface and root zone unbiased root-mean-square error (ubRMSE) by 0.005 m³ m⁻³ and 0.001 m³ m⁻³, respectively. The assimilation reduced the average absolute surface bias against the CVS measurements by 0.009 m³ m⁻³, but increased the root zone bias by 0.014 m³ m⁻³. Assimilating the NN retrievals after a localized bias correction yielded slightly lower surface correlation and ubRMSE improvements, but generally the skill differences were small. The assimilation of the physically-based SMAP Level-2 passive soil moisture retrievals using a global bias correction yielded similar skill improvements, as did the direct assimilation of locally bias-corrected SMAP brightness temperatures within the SMAP Level-4 soil moisture algorithm. The results show that global bias correction methods may be able to extract more independent information from SMAP observations compared to local bias correction methods, but without accurate quality control and observation error characterization they are also more vulnerable to adverse effects from retrieval errors related to uncertainties in the retrieval inputs and algorithm. Furthermore, the results show that using global bias correction approaches without a simultaneous re-calibration of the land model processes can lead to skill degradation in other land surface variables.
Keywords: data assimilation; SMAP soil moisture; neural networks; bias correction
1. Introduction
The importance of soil moisture in hydrological and land surface boundary layer processes has long been recognized (e.g., [1,2,3,4]), and the need for high quality soil moisture observations to enhance our understanding of these processes has been identified . Direct observations of soil moisture can be obtained with in situ sensors, but these are constrained to point-scale measurements at a limited number of locations.
In contrast, satellite instruments are able to observe soil moisture globally with a local revisit time of 2–3 days. In particular, L-band (1.4 GHz) microwave radiometers have a high soil moisture sensitivity and are able to penetrate the top 5 cm of the soil in sparsely to moderately vegetated areas [6,7]. Two passive L-band satellite missions have been launched in recent years: the European Space Agency’s Soil Moisture and Ocean Salinity (SMOS) mission in 2009 and the National Aeronautics and Space Administration’s Soil Moisture Active Passive (SMAP) mission in 2015. Soil moisture retrieval products from SMOS and SMAP have been shown to have high skill in capturing soil moisture variations [9,10]. However, many applications require observations of the complete soil moisture profile and with finer spatial and temporal resolutions than those of SMOS and SMAP.
Data assimilation (DA) can be used to interpolate and extrapolate the satellite observations by merging them with information from a dynamic land surface model. This generates higher horizontal resolution estimates of the full soil moisture profile with complete spatio-temporal coverage and often with a higher skill than that of the model or satellite observations alone [11,12,13,14,15,16,17,18].
The specific soil moisture skill improvements that can be obtained from an assimilation depend on the quality of the assimilated observations, the amount of complementary/novel information they provide, and the efficiency with which the DA system is able to extract this information. The latter is contingent on the specifics of the DA system, including (but not limited to) the type of observation assimilated (raw brightness temperatures (Tbs) vs. soil moisture retrievals), the assimilation algorithm, the observation and model error estimates, and the bias-correction scheme. Ultimately, the optimal choice for each factor and their combination depends on the specific application, and a simultaneous comparison of all possible options is not trivial. Nevertheless, several studies have explored options for the individual factors in the context of DA for soil moisture estimation. For example, De Lannoy and Reichle  compared the assimilation of SMOS soil moisture retrievals against the assimilation of (two versions of) SMOS Tbs and showed that in each case different information was extracted from the observations, resulting in locally different soil moisture estimates. Crow and Van den Berg  investigated the use of an independent triple collocation (TC) analysis to generate improved estimates of the model and observation errors. Finally, Kumar et al.  explored two methods to correct the observation bias, while De Lannoy et al.  investigated methods to correct the model forecast bias.
One key assumption for most DA algorithms, including the ensemble Kalman filter used here, is that all errors are purely random and thus that the observations are unbiased with respect to the model (e.g., Kalnay  Chapter 5). Realistically, biases in the model forcing data, differences in the soil texture, or biases in the Tbs will generally result in biases between the observations and the model. To comply with the assumption of unbiased observations, DA systems typically rescale the observations to the model climatology (generally referred to as ‘bias correction’). One common approach is to match the cumulative distribution function (CDF) of the observations to that of the model estimates at each location [23,24]. Alternatively, Reichle et al.  rescale the assimilated Tbs such that their seasonally-varying climatology matches that of the simulated Tbs in each location. While such localized bias correction techniques fulfill the requirements of the DA system, they can considerably alter the spatial and temporal patterns of the observation mean and variability, thereby removing some of the independent information provided by the satellite instruments. With the availability of high quality soil moisture retrievals from SMOS and SMAP, it is desirable to retain as much of the independent satellite information as possible.
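The local CDF-matching described above can be sketched for a single grid cell as follows. This is a minimal illustration only: the function name `cdf_match`, the mid-rank quantile estimate, and the linear interpolation between model order statistics are assumptions made for this example and are not taken from the cited implementations.

```python
from bisect import bisect_left, bisect_right


def cdf_match(obs, model):
    """Map each observation to the model value at the same empirical quantile.

    obs, model: soil moisture samples for one location (lengths may differ).
    Illustrative sketch only; operational systems add sampling windows,
    minimum-sample checks, and smoothing of the CDFs.
    """
    obs_sorted = sorted(obs)
    model_sorted = sorted(model)
    n_obs, n_mod = len(obs_sorted), len(model_sorted)
    matched = []
    for value in obs:
        # Mid-rank of the observation within its own sample (0 .. n_obs - 1).
        left = bisect_left(obs_sorted, value)
        right = bisect_right(obs_sorted, value)
        rank = (left + right - 1) / 2.0
        # Fractional position of the same quantile in the sorted model sample.
        pos = rank * (n_mod - 1) / (n_obs - 1) if n_obs > 1 else (n_mod - 1) / 2.0
        lo = int(pos)
        hi = min(lo + 1, n_mod - 1)
        frac = pos - lo
        # Linear interpolation between neighbouring model order statistics.
        matched.append(model_sorted[lo] * (1.0 - frac) + model_sorted[hi] * frac)
    return matched


# Hypothetical values in m3 m-3, chosen only to show the call.
observations = [0.20, 0.05, 0.40, 0.10]
model_estimates = [0.12, 0.18, 0.22, 0.30]
rescaled = cdf_match(observations, model_estimates)
```

After matching, the rescaled observations share the mean, variance, and higher moments of the model sample, which is why local matching also removes any systematic difference in the observed mean and variability at that location.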
Our objective in this study is to compare different methods to rescale the observations and identify which approach results in the most efficient assimilation of SMAP soil moisture observations into the National Aeronautics and Space Administration (NASA) Catchment land surface model (CLSM). Specifically, we are interested in the potential of assimilating neural network (NN)-based retrievals to reduce the need for further bias correction. Recently, Kolassa et al.  trained an NN on SMAP Tbs and CLSM soil moisture estimates to generate soil moisture retrievals that are, by design, consistent with the global climatology of the model. Here, we assimilate these SMAP NN retrievals without further bias correction and compare the skill of the resulting soil moisture estimates against: (1) an assimilation of the SMAP NN retrievals using a standard localized rescaling; and (2) an assimilation of the SMAP Level-2 passive soil moisture retrievals using a global rescaling. We additionally compare the skill of the above soil moisture assimilation estimates against that of the SMAP Level-4 soil moisture product, which is based on the assimilation of locally rescaled Tb observations.
2.1. SMAP Soil Moisture Products
SMAP was launched in January 2015 and is equipped with an L-band (1.4 GHz) radiometer that observes horizontal and vertical polarization Tbs as well as the third and fourth Stokes’ parameters. Its Sun-synchronous, near-circular, polar orbit has equator crossings at 6:00 a.m. and 6:00 p.m. local time and a revisit time of 2–3 days . Level 1 Tbs have been collected since 31 March 2015 and are provided on the 36-km resolution Equal-Area Scalable Earth version 2 (EASEv2) grid  as daily half-orbit files. Here we use the SMAP NN soil moisture retrieval product , the official SMAP Level-2 passive soil moisture retrieval product , and the SMAP Level 4 soil moisture analysis .
2.1.1. SMAP Neural Network (SMAP NN) Retrieval Product
The details of the SMAP NN retrieval algorithm and product are discussed in Kolassa et al. . In this subsection, we briefly summarize the key aspects, following some of their text. The SMAP NN product uses a statistical NN retrieval algorithm to compute surface soil moisture estimates for the 2-year period from April 2015 to March 2017. The data are provided with a resolution of 2–3 days and are posted on the 36-km resolution EASEv2 grid. The inputs to the retrieval algorithm are brightness temperatures and Stokes’ parameters from the SMAP Level-1C product , surface-layer soil temperature estimates from a CLSM simulation (Section 3.1), and vegetation water content (VWC) estimated empirically from a Normalized Difference Vegetation Index (NDVI) climatology based on Moderate Resolution Imaging Spectroradiometer (MODIS) observations. Only observations from the morning (6:00 a.m.) overpass are used in order to minimize observation errors due to Faraday rotation and the difference between the soil and canopy temperatures [7,28]. Soil moisture estimates are computed for times and locations where the soil is unfrozen (Goddard Earth Observing System version 5 (GEOS-5) surface temperature is higher than 1 °C), the VWC is less than 5 kg m⁻², and the water fraction of the grid cell is less than 5% according to the GEOS-5 land mask.
Since the NN algorithm is calibrated using CLSM surface soil moisture estimates as the target data, the resulting SMAP NN soil moisture estimates are consistent with the global CLSM climatology; that is, the retrievals match the global mean, variability, and higher moments of the model estimates. The spatial and temporal patterns of the retrieval product, however, are driven by the satellite input observations (e.g., Jimenez et al. ). The retrieval errors are estimated through a TC analysis using surface soil moisture retrievals from the Advanced Microwave Scanning Radiometer 2 (AMSR2; ) and the Advanced Scatterometer (ASCAT; Wagner et al. ) as additional inputs . Based on the TC results, the observation error standard deviations for the assimilation are specified as a spatially varying, temporally static error standard deviation map with a global mean value of 0.020 m³ m⁻³. For the assimilation experiment with localized bias correction (Section 3.2), the observation error standard deviations are rescaled using the ratio of the local (grid cell) model and retrieval soil moisture time series standard deviations. The purpose of this local rescaling is to preserve the relative magnitude of the observations and their errors before and after the local CDF-matching, which matches the observation mean and standard deviation to those of the model.
2.1.2. SMAP Level-2 Passive Retrieval Product (SMAP L2P)
The SMAP Level-2 Passive (L2P) soil moisture estimates are computed from the SMAP radiometer Level-1C Tbs using the physically-based “tau-omega” model . The ancillary input data include surface temperature estimates provided by the quasi-operational GEOS-5 Forward Processing system  with a 0.25° resolution and VWC estimated from a MODIS-based NDVI climatology using an empirical relationship established from prior observations. No retrieval is performed for frozen soil conditions (fraction of frozen soil based on GEOS-5 surface temperature larger than 5%) and soil moisture estimates are flagged as ‘not recommended’ for dense vegetation (VWC > 5 kg m⁻²) . The soil moisture estimates are provided as daily half-orbit files on the 36-km resolution EASEv2 grid that is also used for the SMAP NN retrieval product.
Here, we used version 4 of the SMAP L2P ‘baseline’ retrieval product that is based on SMAP vertical polarization Tbs . We assimilated only data points from the morning (6:00 a.m.) overpasses and for which the retrieval quality flag was set to ‘recommended’ (indicating unfrozen soils and a VWC below 5 kg m⁻²). Based on a TC analysis , the observation error standard deviations for the assimilation were specified as a static error standard deviation map with a global mean of 0.030 m³ m⁻³. For the assimilation of the L2P retrievals with global bias correction, the error standard deviations were rescaled using the ratio of the global model and observation standard deviations (computed over all times and locations).
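The rescaling of the observation error standard deviations by the ratio of model and observation variability can be illustrated with a short Python sketch. The function name `rescale_error_std` and the sample values are hypothetical. For a global rescaling the two series would pool all times and locations, whereas for a local rescaling they would be the time series of a single grid cell.

```python
from statistics import pstdev


def rescale_error_std(obs_err_std, model_series, obs_series):
    """Scale an observation error standard deviation by std(model) / std(obs).

    Keeps the error magnitude proportionate to the observations after the
    observations themselves have been rescaled to the model variability.
    """
    ratio = pstdev(model_series) / pstdev(obs_series)
    return obs_err_std * ratio


# Hypothetical pooled samples (m3 m-3); the model varies half as much as the
# retrievals, so the error standard deviation is halved as well.
model_sm = [0.10, 0.20, 0.30]
retrieved_sm = [0.10, 0.30, 0.50]
scaled_err = rescale_error_std(0.030, model_sm, retrieved_sm)
```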
2.1.3. SMAP Level-4 Soil Moisture Analysis (SMAP L4_SM)
The SMAP L4_SM data product is generated by assimilating SMAP Level-1C Tb anomalies into the CLSM using the DA system discussed in Section 3.1 combined with a tau-omega radiative transfer model [18,35]. SMAP Tbs with an ‘acceptable’ quality flag—as defined in —are assimilated when the model does not indicate active precipitation, frozen soil, or snow cover. L-band brightness temperatures generally exhibit a spatially and temporally varying bias with respect to the CLSM (see e.g., Figure 2 of De Lannoy and Reichle ). To account for this, the SMAP observations are locally rescaled prior to the assimilation such that their seasonally-varying climatology (i.e., their long-term mean seasonal cycle) matches that of the model. SMAP L4_SM estimates of the volumetric surface (0–5 cm) and root zone (0–100 cm) soil moisture are available globally on the 9-km EASEv2 grid with a 3-hourly resolution.
2.2. In Situ Data
We evaluate the model soil moisture estimates against in situ soil moisture measurements from the SMAP core validation sites and two sparse networks.
2.2.1. Core Validation Site Measurements
The SMAP core validation sites (CVSs) are a diverse collection of calibration and validation sites across different watersheds that use dense arrays of soil moisture sensors distributed over so-called reference pixels at 3 km, 9 km, and 36 km to represent the spatial scales of the different SMAP products . The measurements from sensors within each reference pixel are combined into an area-weighted average using weights based on Voronoi polygons  to yield one in situ soil moisture time series per reference pixel that is representative of a SMAP grid cell. Here we use reference pixels at the 9-km scale (matching the resolution of the DA experiments) from a subset of CVSs located in the contiguous United States. This includes 14 reference pixels (from 8 different watersheds) with surface soil moisture measurements and 8 reference pixels (from 5 different watersheds) that additionally provide root zone soil moisture measurements. These sites span a range of different climatic conditions, land cover and land use types (Table 1).
2.2.2. Sparse Network Measurements
We also evaluate the soil moisture estimates against in situ measurements from the Soil Climate Analysis Network (SCAN; ) and the US Climate Reference Network (USCRN; [40,41]). Unlike the CVSs, these ‘sparse’ networks typically only have one sensor within each 9-km model grid cell and are not necessarily representative of the grid-cell scale soil moisture estimated by the model. However, the sparse networks span a more varied range of climatic conditions and land cover types than the CVSs. We use SCAN and USCRN measurements at a depth of 5 cm to validate the surface soil moisture from the assimilation. The root zone soil moisture estimates are evaluated against an average of the in situ measurements for the 0–100 cm layer with each measurement weighted by the vertical extent of the represented layer. The SCAN and USCRN data were subjected to an extensive quality control process as detailed, for example, in De Lannoy et al.  and Appendix C of Reichle et al. . After quality control, 181 stations were used from the SCAN and 138 from the USCRN.
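The thickness-weighted averaging of the in situ profile measurements can be sketched as follows. The layer thicknesses and measurement values are hypothetical placeholders, since the actual sensor depths and the layer assigned to each sensor depend on the network and are not specified here.

```python
def root_zone_average(measurements, layer_thickness_cm):
    """Average profile measurements, weighting each by its layer thickness."""
    if len(measurements) != len(layer_thickness_cm):
        raise ValueError("one thickness is required per measurement")
    total = sum(layer_thickness_cm)
    weighted = sum(m * t for m, t in zip(measurements, layer_thickness_cm))
    return weighted / total


# Hypothetical profile: two sensors representing 0-25 cm and 25-100 cm.
rz = root_zone_average([0.20, 0.30], [25.0, 75.0])  # 0.275 m3 m-3
```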
3. Data Assimilation System and Experiments
3.1. Model and Data Assimilation System
The data assimilation experiments are performed using the CLSM driven with surface meteorological forcing data at 0.25° resolution provided by the GEOS-5 Forward Processing system . The precipitation forcing data are corrected using global gauge-based observations from the National Oceanic and Atmospheric Administration (NOAA) Climate Prediction Center Unified (CPCU) product, scaled to the Global Precipitation Climatology Project (GPCP) v2.2 pentad precipitation product climatology [44,45]. The GEOS-5 background precipitation is also scaled to the GPCP v2.2 climatology.
The diagnostics used here to analyze the assimilation results are the surface (0–5 cm) and root zone (0–100 cm) soil moisture, as well as the land evaporation and the overland runoff. Two different configurations of the model are used in this study: (1) the Nature Run v4 (NRv4) configuration used to generate the L4_SM product and (2) the Nature Run v5 (NRv5) configuration used for the SMAP soil moisture assimilation experiments presented here. The main differences between the two configurations include an updated correction of the precipitation forcing data and an updated vegetation height dataset as well as revised parameterizations of the heat capacity, the minimum snow water equivalent, and the turbulent roughness length. The DA system was run over the contiguous United States from April 2015 to March 2017 producing 3-hourly analyses on the 9-km resolution EASEv2 grid.
The assimilation was performed using an ensemble Kalman filter including non-zero horizontal correlations in the observation and model errors in order to distribute the observed information to nearby model grid cells (3D ensemble Kalman filter) [16,46]. This setup essentially uses the model information to downscale the 36-km SMAP observations to the 9-km model resolution. To translate the model state into surface soil moisture estimates with the same spatial support as the observations, the observation operator computes the spatial convolution of the model estimates with a two-dimensional Gaussian function that contains 50% of the signal within a circle with a radius of 20 km. Observation error maps were estimated using a TC analysis (Section 2) and the spatial correlation between the observation errors was assumed to follow a Gaussian distribution with a 0.25° length scale in all directions. Following the SMAP L4_SM setup, an ensemble of 24 members was used here. Moreover, model error correlations were localized to a radius of 1.25° (by reducing their value to zero beyond this radius) to avoid spurious spatial correlations as a result of the limited ensemble size [18,47]. The perturbations to the meteorological forcing and model prognostic variables follow the Version 2 L4_SM system  and are summarized in Table 2.
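To make the footprint description concrete, the sketch below converts the "50% of the signal within 20 km" statement into a Gaussian standard deviation. It assumes an isotropic two-dimensional Gaussian and interprets "signal" as the integrated weight, for which the fraction enclosed within radius r is 1 − exp(−r²/(2σ²)). The function names are illustrative and do not come from the DA system.

```python
import math


def sigma_from_enclosed_fraction(radius_km, fraction=0.5):
    """Standard deviation of an isotropic 2-D Gaussian that encloses
    `fraction` of its total weight within `radius_km`."""
    return radius_km / math.sqrt(-2.0 * math.log(1.0 - fraction))


def enclosed_fraction(radius_km, sigma_km):
    """Fraction of the 2-D Gaussian weight inside a circle of `radius_km`."""
    return 1.0 - math.exp(-radius_km ** 2 / (2.0 * sigma_km ** 2))


def gaussian_weights(distances_km, sigma_km):
    """Normalized weights for model grid cells at given distances (km)
    from the observation centre; the weights sum to one."""
    raw = [math.exp(-d ** 2 / (2.0 * sigma_km ** 2)) for d in distances_km]
    total = sum(raw)
    return [w / total for w in raw]


sigma = sigma_from_enclosed_fraction(20.0)  # about 16.99 km
weights = gaussian_weights([0.0, 9.0, 18.0, 27.0], sigma)
```

Under these assumptions, the observation predicted from the model would be the weighted sum of the 9-km surface soil moisture values using such weights, although the operational operator may discretize or truncate the footprint differently.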
3.2. Data Assimilation Experiments
Several data assimilation experiments were performed for April 2015 to March 2017 over the contiguous United States (CONUS), each with a different method to address bias between the observations and corresponding model forecasts. All experiments used the modeling and DA system introduced in Section 3.1. Table 3 summarizes the main characteristics of all assimilation experiments and Section 3.3 discusses limitations associated with each experiment.
3.2.1. Open Loop
For the open loop (OL) experiment, the model is run for the study period without assimilating any SMAP observations. The OL represents a baseline for the model skill against which potential skill improvements from the assimilation of SMAP observations are measured. The OL for the soil moisture assimilation experiments is generated using the NRv5 configuration, whereas the open loop for the L4_SM system (OL-L4) is generated using the NRv4 configuration (Section 3.1).
3.2.2. SMAP NN Retrieval Assimilation without Bias Correction (DA-NN)
In the DA-NN experiment, the NN retrievals are assimilated without further bias correction. By design, the NN retrievals are consistent with the global climatology of the model. The purpose of this experiment is thus to test whether the NN approach is sufficient to account for the systematic bias (related to factors other than disagreements about the soil moisture state) between the model and observations and thus reduce the need for further rescaling that would remove some of the independent satellite information.
3.2.3. SMAP NN Retrieval Assimilation with Local CDF-Matching (DA-NN-lCDF)
In the DA-NN-lCDF experiment, the NN retrievals are assimilated after applying a local CDF-matching that imposes the model’s mean, variability and higher moments on the observations separately for each grid cell. To compute the CDF-matching statistics, we apply a spatial sampling with a 1.25° moving window to mitigate the effect of the relatively short study period . The purpose of this experiment is to compare the assimilation using a local (grid cell level) rescaling in DA-NN-lCDF with the global rescaling implicit in the DA-NN experiment.
3.2.4. SMAP L2P Retrieval Assimilation with Global CDF-Matching (DA-L2P-gCDF)
In the DA-L2P-gCDF experiment, the L2P retrievals are assimilated after applying a global CDF-matching of the satellite soil moisture retrievals to the model estimates. The purpose of this experiment is to: (1) compare the impact of the different retrieval algorithms; and (2) assess whether applying a global CDF-matching to an existing retrieval product results in a different soil moisture skill than assimilating soil moisture estimates that are by design consistent with the global model climatology.
3.2.5. SMAP Level-4 Brightness Temperature Assimilation Product (DA-L4)
The SMAP L4_SM product is generated by assimilating SMAP Tb observations (Section 2.1.3) and is included here to relate the skill of the above soil moisture assimilation experiments to the skill that can be obtained from a Tb assimilation (bearing in mind that a local rescaling of the Tbs is applied (Section 2.1.3)).
3.3. Limitations of the DA Experiments
In the DA-NN and DA-L2P-gCDF experiments, the soil moisture observations are globally matched to the climatology of the modeled soil moisture. However, local biases and differences in the local variability are retained (see e.g., Figure 3 of Kolassa et al. ). These can provide very valuable information on missing processes in the model (for example processes related to agricultural practices) or unrealistic process parameterizations. However, from a DA perspective, the retention of local biases violates the assumptions of the DA system, which is designed to deal with random rather than systematic errors. The experiments conducted here investigate whether—in practice—the benefit of retaining more of the independent satellite information can outweigh the adverse effects of violating the DA assumptions. This includes investigating the effect on the modeled soil moisture skill, but also the impact on related variables, such as evaporation or runoff estimates.
Another concern is the possible non-orthogonality of the observation and model errors as a result of the soil temperature information that is shared between the SMAP retrievals and the model. This issue might be exacerbated for the global bias correction approaches, because the dynamic ranges of the model and observations will not necessarily match locally, which would represent another violation of the DA system assumptions.
Finally, the assimilation and validation periods here include the NN training period (April 2015–March 2016), which violates the DA assumption of uncorrelated model and observation errors. Owing to the relatively short SMAP record to date, further investigation of this issue must be left for future study.
We compare the different soil moisture assimilation experiments in terms of: (1) the statistics of the modeled soil moisture estimates; (2) the soil moisture estimate skill against in situ soil moisture measurements; (3) the consistency of the specified model and observation error statistics with the actual errors; and (4) the impact on model fields related to soil moisture.
3.4.1. Soil Moisture Statistics
To assess the impact of the assimilation on the climatology of the soil moisture estimates, we compare the statistics of the soil moisture fields generated with the assimilation experiments against those generated with the OL. The difference between the mean soil moisture fields highlights areas that experience a general wetting or drying as a result of the assimilation, whereas the difference of the mean soil moisture standard deviations assesses to what extent the assimilation of SMAP observations introduces (or removes) variability in the modeled soil moisture fields. The soil moisture mean values and standard deviations are computed using all model estimates, including times and locations when no SMAP observations were assimilated.
3.4.2. Evaluation Against In Situ Measurements
The soil moisture estimates from each assimilation experiment are evaluated against in situ measurements using the correlation (R), absolute bias (|b|), and unbiased root-mean-square error (ubRMSE). The metrics are computed using all simulated soil moisture estimates, including time instances when no SMAP observations were assimilated. The correlation is computed as the Pearson correlation coefficient of the modeled and in situ soil moisture time series in each location and quantifies the skill in capturing soil moisture temporal variations across all time scales. The absolute bias is computed as the absolute value of the mean difference between the in situ and modeled soil moisture time series in each location. We use the absolute bias to better compare skill improvements and to avoid the effect of bias compensation when computing mean metrics. The ubRMSE is calculated to estimate errors in the soil moisture variability and is computed as the RMSE between the modeled and in situ soil moisture time series after removing their respective long-term mean values.
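The three metrics defined above can be computed for one location as in the following Python sketch. It assumes that the modeled and in situ series are already paired in time and have non-zero variance. The function name `skill_metrics` is illustrative, and the autocorrelation-aware confidence intervals are not included.

```python
import math


def skill_metrics(model, insitu):
    """Return (R, |b|, ubRMSE) for paired model and in situ time series."""
    n = len(model)
    mean_m = sum(model) / n
    mean_o = sum(insitu) / n
    # Anomalies relative to each series' own long-term mean.
    anom_m = [m - mean_m for m in model]
    anom_o = [o - mean_o for o in insitu]
    cov = sum(a * b for a, b in zip(anom_m, anom_o)) / n
    var_m = sum(a * a for a in anom_m) / n
    var_o = sum(b * b for b in anom_o) / n
    r = cov / math.sqrt(var_m * var_o)        # Pearson correlation
    abs_bias = abs(mean_m - mean_o)           # absolute mean difference
    # RMSE of the anomalies, i.e. the RMSE after removing the mean bias.
    ubrmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(anom_m, anom_o)) / n)
    return r, abs_bias, ubrmse


# Hypothetical series (m3 m-3): the model tracks the in situ variations
# perfectly but is 0.05 too dry, so R = 1, |b| = 0.05 and ubRMSE = 0.
r, abs_bias, ubrmse = skill_metrics([0.10, 0.20, 0.30, 0.40],
                                    [0.15, 0.25, 0.35, 0.45])
```

Because the ubRMSE is computed on anomalies, it satisfies ubRMSE² = RMSE² − b², which separates errors in the variability from the mean bias.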
To assess the statistical significance of differences in the experiment evaluation metrics, we also estimate their 95% confidence intervals using the Student’s t-test for the correlation and bias, and a chi-square test for the ubRMSE. All metrics and their confidence intervals are estimated, accounting for auto-correlation in the soil moisture time series.
When computing average metrics (across all reference pixels for the CVSs and across all networks for the sparse networks), we use a k-means clustering approach with a maximum cluster extent of 1° to avoid the dominance of regions with a high sensor density and to ensure realistic confidence intervals .
3.4.3. Assimilation Diagnostics
The relative impact of the model forecasts and observations on the soil moisture estimates depends on the specified model and observation error statistics. To assess the consistency of the error characterizations in our experiments with the actual model and observation errors, we analyze the standard deviation of the normalized observation-minus-forecast residuals (or ‘innovations’), which are computed as $(O - F)/\sqrt{\sigma_O^2 + \sigma_F^2}$, where $O$ and $F$ are the observations and forecast estimates, and $\sigma_O^2$ and $\sigma_F^2$ are the assumed observation and forecast error variances as prescribed ($\sigma_O^2$) or diagnosed from the ensemble ($\sigma_F^2$). In a well-calibrated DA system, with correctly specified model and observation error statistics, this metric should be close to one. Values greater than one indicate that the DA system underestimates the actual errors, and values less than one indicate that the errors are overestimated. The standard deviation of the normalized observation-forecast differences is computed for times and locations when SMAP observations were assimilated.
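A minimal Python sketch of this diagnostic is given below. It assumes that each observation-minus-forecast residual is divided by the square root of the sum of the observation and forecast error variances, which is the standard normalization for innovations. The function name and sample values are hypothetical.

```python
import math
from statistics import pstdev


def normalized_innovation_std(obs, fcst, obs_err_var, fcst_err_var):
    """Standard deviation of (O - F) / sqrt(obs_err_var + fcst_err_var).

    All arguments are equal-length sequences covering the times and
    locations at which observations were assimilated.
    """
    normalized = [
        (o - f) / math.sqrt(r + p)
        for o, f, r, p in zip(obs, fcst, obs_err_var, fcst_err_var)
    ]
    return pstdev(normalized)


# Hypothetical case: residuals of +/-0.03 m3 m-3 with a total assumed error
# variance of 0.0009 (standard deviation 0.03) give a value close to one.
diag = normalized_innovation_std(
    obs=[0.28, 0.22],
    fcst=[0.25, 0.25],
    obs_err_var=[0.0004, 0.0004],
    fcst_err_var=[0.0005, 0.0005],
)
```

A result well above one would suggest that the assumed error variances are too small relative to the actual residuals, and a result well below one that they are too large.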
3.4.4. Impact on Related Model Fields
The assimilation of soil moisture estimates with a local bias could adversely affect model fields related to soil moisture, despite a potential improvement of the soil moisture estimates themselves (Section 3.3). To investigate this possibility, we also analyze changes in the mean land evaporation and overland runoff resulting from the assimilation of SMAP observations. The analysis of the evaporation and runoff is qualitative, since no reliable reference data were available for our study period.
4. Results and Discussion
4.1. Assimilation with Global vs. Local Bias Correction
First, we compare the assimilation of the NN retrievals without further bias correction (DA-NN) to the assimilation of the same retrievals using standard local CDF-matching bias correction (DA-NN-lCDF).
4.1.1. Mean Soil Moisture Statistics
In the DA-NN experiment, the retention of local biases between the model and observations results in 2-year mean soil moisture estimates that show distinct spatial differences (defined as DA-NN minus OL) with respect to the model (Figure 1a). For example, DA-NN exhibits drier conditions in the predominantly agricultural areas of the Midwest and parts of the Northwest (eastern Montana, eastern Oregon and the Dakotas). In these regions, SMAP observes the effects of agricultural practices (e.g., tile drainage or tillage) that are not represented in the model (see e.g., He et al. ). For the agricultural areas subject to irrigation, these somewhat counter-intuitive results reflect the dry bias of the SMAP retrievals relative to the model (see e.g., Figure 4d in Kolassa et al. ) . In areas with extensive tile drainage, such as large parts of Iowa, the results reflect the expected behavior. Additionally, the spatial patterns of the DA-NN soil moisture estimates depend on the SMAP brightness temperatures as well as the ancillary retrieval inputs and are thus not purely observational features. The local bias correction applied in the DA-NN-lCDF experiment removes systematic differences between the model and the observations prior to the assimilation and—by design—results in mean soil moisture differences without strong spatial features (Figure 1b).
Differences in the soil moisture variability between DA-NN and OL (Figure 1d) appear to be related to the soil moisture mean state and seasonal variability in a region. In humid regions with a more pronounced seasonal cycle, such as parts of the Eastern US, Northern Mexico or the California Central Valley, DA-NN decreases the soil moisture variability with respect to the OL. The reduced variability is possibly an artifact of the retrievals’ reduced soil moisture sensitivity in regions that are more humid and more densely vegetated. One exception to this behavior is the corn belt, where DA-NN increases the soil moisture variability with respect to the OL. Here the NN retrievals capture the effects of agricultural practices that are not represented in the model and that tend to increase the soil moisture variability. The variability differences between the DA-NN-lCDF and the OL (Figure 1e) are generally small and have less distinct spatial features than those observed for the DA-NN experiment, as expected given the local scaling applied to the observations in the DA-NN-lCDF.
4.1.2. Evaluation Against In Situ Measurements
Evaluated against the surface CVS measurements (Figure 2), DA-NN and DA-NN-lCDF are able to improve the model skill over the OL. Both experiments yield comparable correlation (Figure 2a) and ubRMSE (Figure 2c) improvements at most reference pixels (exceptions are Little River (LR) and South Fork (SF1 and SF3)), resulting in similar average correlation increases of 0.12 and 0.10, and ubRMSE reductions of 0.005 m³ m⁻³ and 0.004 m³ m⁻³ for DA-NN and DA-NN-lCDF, respectively. In terms of the bias (Figure 2b), DA-NN generally yields the larger skill changes at individual pixels, including a bias degradation at four pixels, whereas DA-NN-lCDF yields smaller but consistent improvements. On average, this results in a similar bias reduction of 0.009 m³ m⁻³ and 0.007 m³ m⁻³ for DA-NN and DA-NN-lCDF, respectively. The small (albeit not statistically significant at the 5% level) bias reduction for the DA-NN-lCDF estimates with respect to their OL contradicts the intended behavior of the system and might point to issues with the DA system calibration.
Against the CVS root zone measurements (Figure 3), DA-NN-lCDF yields more consistent improvements than DA-NN in terms of the ubRMSE and correlations, but their magnitude is smaller than the less frequent improvements from DA-NN. As a result, the average correlation is improved by 0.16 for both experiments (Figure 3a), but DA-NN-lCDF results in a larger ubRMSE reduction of 0.006 m³ m⁻³ compared to 0.003 m³ m⁻³ for DA-NN (Figure 3c). In terms of the root zone bias (Figure 3b), both experiments are only able to improve the model skill at approximately half of the reference pixels. The bias degradation at the remaining locations is smaller for the DA-NN-lCDF estimates, resulting in a slight bias reduction of 0.001 m³ m⁻³ on average compared to the average bias increase of 0.015 m³ m⁻³ for DA-NN.
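For readers unfamiliar with the three skill metrics, the following sketch shows how correlation, bias and ubRMSE could be computed for one reference pixel. It assumes the commonly used definitions (bias as the mean difference, and ubRMSE² = RMSE² − bias²); the function name `skill_metrics` and the sample values are hypothetical, and the study's own processing (e.g., temporal matching and quality screening) is not reproduced here.

```python
import numpy as np

def skill_metrics(estimate, reference):
    """Return (correlation, bias, ubRMSE) for two time series, skipping NaN pairs."""
    est = np.asarray(estimate, dtype=float)
    ref = np.asarray(reference, dtype=float)
    valid = np.isfinite(est) & np.isfinite(ref)
    est, ref = est[valid], ref[valid]
    diff = est - ref
    bias = diff.mean()
    rmse = np.sqrt(np.mean(diff ** 2))
    # Removing the mean difference from the RMSE leaves the unbiased RMSE
    ubrmse = np.sqrt(max(rmse ** 2 - bias ** 2, 0.0))
    corr = np.corrcoef(est, ref)[0, 1]
    return corr, bias, ubrmse

# Illustrative values in m3 m-3: the estimate is the reference plus a constant offset
reference = np.array([0.10, 0.15, 0.22, 0.30, 0.18, np.nan])
estimate = np.array([0.15, 0.20, 0.27, 0.35, 0.23, 0.40])
corr, bias, ubrmse = skill_metrics(estimate, reference)
```

Because the offset is constant in this example, the correlation is essentially 1 and the ubRMSE essentially 0 even though the bias is 0.05 m³ m⁻³, which is why the three metrics are reported separately.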
At many stations, the skill changes with respect to the OL and skill differences between DA-NN and DA-NN-lCDF are small. Notable exceptions are the Little River (LR) and South Fork (SF) watersheds, both of which have previously been identified as sites with large discrepancies between the SMAP retrievals and the in situ measurements [10,49]. At LR, DA-NN consistently degrades the model skill in both soil layers and across all metrics, whereas DA-NN-lCDF yields small or no skill changes. Bearing in mind that the NN retrievals and the OL model estimates have a comparable correlation and ubRMSE skill at LR, the results suggest that assimilating the NN retrievals only provides a small amount of novel information to the model, but likely introduces noise that degrades the model skill. For DA-NN-lCDF, the observations appear to have a smaller impact and the soil moisture estimates are less sensitive to retrieval product noise.
At the SF reference pixels, DA-NN improves the soil moisture dynamics, as evident from the significantly (at the 95% confidence level) larger correlation increases and larger (but not statistically significant) ubRMSE reductions in both soil layers compared to DA-NN-lCDF. Figure 1d showed that DA-NN slightly increases the soil moisture variability at SF, likely by introducing the effects of agricultural processes not represented in the model. In contrast, the strong drying in DA-NN at SF (Figure 1a) strongly increases the bias at one surface pixel and at both root zone pixels. Experiment DA-NN-lCDF—by design—only leads to small changes of the bias. This suggests that the observations have a stronger impact in the DA-NN experiment, because more independent satellite information is retained. Therefore, the (reliable) observation information on soil moisture dynamics is used more efficiently in DA-NN. However, the higher impact and the retention of local biases also make the soil moisture estimates more vulnerable to the adverse effects of bias in the retrievals.
When evaluated against sparse network in situ measurements (Figure 4), differences in the average metrics of both experiments are less pronounced than for the CVS evaluation. In the surface layer, both assimilation experiments increase the correlation and reduce the ubRMSE over the OL. For the root zone, both assimilation experiments slightly degrade the model skill compared to the OL for all metrics. However, compared to the error bars, the skill changes observed in the sparse network evaluation are nearly negligible.
4.1.3. Model and Observation Errors
The impact of the assimilated soil moisture observations on the model estimates is driven by: (1) the difference between the rescaled observations and the forecast; and (2) the relative weight given to the observations and the model during the assimilation. The latter depends on the specified model and observation errors through the Kalman gain. The standard deviation of the normalized observation–forecast differences (Figure 5) shows how accurately the DA system reflects the actual model and observation errors. For both experiments, the DA system tends to overestimate the actual errors (as indicated by values smaller than 1), which is also reflected by the domain average values of 0.89 for DA-NN and 0.68 for DA-NN-lCDF. This more pronounced overestimation for DA-NN-lCDF could be one reason for the apparently smaller observation impact noted above. The inaccurate error characterization could be caused by: (1) inaccurate observation errors estimated from the TC analysis (Section 2); (2) uncertainties in the model or observation temporal standard deviations used to rescale the observation errors for DA-NN-lCDF; or (3) inaccurate model errors—represented by the ensemble spread and driven by the forcing and prognostic perturbations. Points (1) and (3) would affect both assimilation experiments and are thus likely causes for the general error overestimation. Point (2) affects only DA-NN-lCDF and could explain the stronger error overestimation. The model perturbations used here were initially developed for the L4_SM Tb assimilation and yield model standard deviations that might not be appropriate for the soil moisture assimilation conducted here.
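The diagnostic discussed above can be sketched as follows. The example assumes that each observation-minus-forecast difference is normalized by the square root of the sum of the specified forecast error variance (in observation space) and the specified observation error variance; the function name, variable names and synthetic error levels are hypothetical and are not taken from the study.

```python
import numpy as np

def normalized_innovation_std(obs, forecast, forecast_err_var, obs_err_var):
    """Standard deviation of innovations normalized by the specified total error."""
    innovations = np.asarray(obs) - np.asarray(forecast)
    normalized = innovations / np.sqrt(forecast_err_var + obs_err_var)
    return float(np.std(normalized, ddof=1))

# Synthetic truth with known ("actual") forecast and observation errors
rng = np.random.default_rng(42)
n = 20000
truth = 0.25 + 0.05 * rng.standard_normal(n)
true_fcst_std, true_obs_std = 0.03, 0.04
forecast = truth + true_fcst_std * rng.standard_normal(n)
obs = truth + true_obs_std * rng.standard_normal(n)

# Specified errors equal to the actual errors: the diagnostic is close to 1
well_specified = normalized_innovation_std(
    obs, forecast, true_fcst_std ** 2, true_obs_std ** 2)

# Specified error standard deviations 50% too large: the diagnostic drops below 1
overestimated = normalized_innovation_std(
    obs, forecast, (1.5 * true_fcst_std) ** 2, (1.5 * true_obs_std) ** 2)
```

Values below 1 therefore indicate that the specified errors exceed the actual ones, and values above 1 indicate the opposite.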
4.1.4. Impact on Related Model Fields
The soil moisture skill improvement in DA-NN over OL (with the root zone bias as the only exception) suggests that issues with the retention of local biases (see Section 3.3) may in practice be outweighed by the benefit of retaining more of the independent SMAP information. It is important, however, to also assess how the assimilation without local bias correction affects the overland runoff and land evaporation.
The differences in mean land evaporation for DA-NN and DA-NN-lCDF (Figure 6a,b) primarily reflect differences in the mean soil moisture state caused by assimilation of SMAP observations (Figure 1). For DA-NN, this includes a reduced evaporation in the region stretching from southeast of the Great Lakes to Texas, for which a strong drying was observed in Figure 1a, and an increased evaporation corresponding to the increased soil moisture in Florida. Generally, the land evaporation tends to be more sensitive to soil moisture in the Western US. However, owing to the smaller soil moisture changes introduced there, this increased sensitivity is not evident in the evaporation changes. For the DA-NN-lCDF experiment, the mean soil moisture state is, by design, not changed relative to the OL and as a result no notable changes in the mean land evaporation are introduced by the assimilation.
In terms of the runoff (Figure 6d,e), the assimilation mostly introduces changes in regions where the runoff is large, such as the Eastern US and along the West Coast. For DA-NN, these changes mirror the spatial features of the mean soil moisture changes, resulting in a runoff increase in areas with increased soil moisture and vice versa. For DA-NN-lCDF, no notable spatial features were introduced in the mean soil moisture state and thus no spatial features are discernible in the changes to the runoff.
A quantitative validation of the evaporation and runoff changes introduced by DA-NN is difficult due to a lack of reliable reference data. The DA-NN experiment is able to reduce the known evaporation overestimation of the model, but the very large changes of ~1 mm/day are likely unrealistic. Furthermore, the runoff reductions introduced by DA-NN intensify the known runoff underestimation of the model. Thus, the soil moisture skill improvements observed for DA-NN do not readily translate into improvements in related water cycle variables. For applications aiming to obtain a comprehensive set of land surface estimates (rather than only improving soil moisture estimates), an additional re-calibration of the soil-moisture-dependent processes in the land model would be required in order to make the DA-NN approach fully viable.
4.1.5. Discussion of DA-NN and DA-NN-lCDF Results
Generally, the DA-NN and DA-NN-lCDF experiments are able to improve the model soil moisture skill over the OL. This is notable because over CONUS, where the validation data are dense and where the model generally has a high skill, improving the model through data assimilation is more difficult than in data-sparse regions. Additionally, using corrected precipitation forcing data (Section 3.1) further limits the skill improvements that can be obtained from an assimilation. The consistent assimilation skill improvements are thus encouraging and demonstrate the great potential of SMAP observations to improve land surface model estimates, in particular in data-sparse regions. Remaining differences between the modeled estimates and the in situ measurements are related to uncertainties in the assimilated observations and the model forcing data as well as differences in the ancillary data (for example the soil texture) used in the model and at the ground stations.
In the DA-NN experiment, which retains more of the independent satellite information, the observations have a larger impact on the soil moisture estimates than in the DA-NN-lCDF experiment. When the observations are of high quality and contain novel information, this can lead to larger improvements in the model soil moisture skill than is possible with a local bias correction. However, the larger observation impact also makes the DA-NN more vulnerable to the adverse effects of low-quality satellite observations. This means that the NN assimilation without bias correction can use the observation information more efficiently, but is also less reliable than an assimilation using a localized bias correction. To use the DA-NN approach it is thus crucial to accurately characterize the model and observation errors and to apply a rigorous quality control to the observations. Additionally, to better isolate the reliable retrieval information, it might be beneficial to separately assimilate the different temporal components of the retrievals (i.e., the long-term mean, seasonal, sub-seasonal and interannual signatures) with the DA-NN approach.
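To make the idea of temporal components concrete, the sketch below splits a daily time series into a long-term mean, a mean seasonal cycle, a slowly varying anomaly and a fast residual. The specific choices are assumptions made for illustration only (a 365-day period, a day-of-year climatology, and a 91-day moving average standing in for the interannual signal); they are not the decomposition used in the cited work, and the function name `decompose` is hypothetical.

```python
import numpy as np

def decompose(series, period=365, window=91):
    """Split a gap-free daily series into mean, seasonal, interannual and sub-seasonal parts."""
    x = np.asarray(series, dtype=float)
    long_term_mean = x.mean()
    # Mean seasonal cycle: average over all samples that share a day of year
    day = np.arange(x.size) % period
    climatology = np.array([x[day == d].mean() for d in range(period)])
    seasonal = climatology[day] - long_term_mean
    anomaly = x - long_term_mean - seasonal
    # Slow anomaly variations (moving average) stand in for the interannual signal
    kernel = np.ones(window) / window
    interannual = np.convolve(anomaly, kernel, mode="same")
    # Whatever remains is treated as the sub-seasonal signal
    sub_seasonal = anomaly - interannual
    return long_term_mean, seasonal, interannual, sub_seasonal

# Synthetic three-year daily soil moisture series (m3 m-3)
rng = np.random.default_rng(1)
t = np.arange(3 * 365)
x = (0.25 + 0.05 * np.sin(2 * np.pi * t / 365)
     + 0.02 * t / t.size + 0.01 * rng.standard_normal(t.size))

mean_part, seasonal_part, interannual_part, sub_seasonal_part = decompose(x)
reconstructed = mean_part + seasonal_part + interannual_part + sub_seasonal_part
```

By construction the four parts sum back to the original series, so each could in principle be rescaled or assimilated separately. With a record as short as the two years used here, the interannual part would be poorly constrained.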
4.2. Assimilation of NN vs. L2P Retrievals
In this section, we compare the assimilation of the NN retrievals (DA-NN) to that of the L2P retrievals (DA-L2P-gCDF) to determine the impact of the different retrieval approaches. In both cases, the global climatology of the observations matches that of the corresponding model estimates.
4.2.1. Mean Soil Moisture Statistics
The spatial patterns of the mean soil moisture differences between DA-L2P-gCDF and OL (Figure 1c) are similar to those observed for the DA-NN experiment (Figure 1a), but generally have a smaller magnitude. Notable discrepancies in the different spatial patterns of the DA-NN and DA-L2P-gCDF experiments occur along parts of the Rocky Mountains (in Colorado, Wyoming and Idaho), where DA-NN causes a wetting relative to OL, whereas DA-L2P-gCDF introduces mostly small mean soil moisture changes relative to OL. As for DA-NN, the spatial patterns in the mean soil moisture difference between DA-L2P-gCDF and OL reflect the local biases between the L2P retrievals and the model.
The spatial patterns of the standard deviation difference between the DA-L2P-gCDF and OL experiments (Figure 1f) are also very similar to those observed for the DA-NN experiment, but with a slightly smaller magnitude. In addition to the SMAP observations and the ancillary retrieval inputs (VWC and surface temperature), the differences between the L2P retrievals (and corresponding assimilation estimates) and the model are also driven by the ancillary parameter inputs, such as the soil texture. The L2P retrieval algorithm relies on more of these ancillary data than the NN retrievals, and as such the spatial features of the DA-L2P-gCDF estimates correspond less to SMAP observational features than those of the DA-NN estimates.
4.2.2. Evaluation against In Situ Measurements
Evaluated against the surface CVS measurements (Figure 2), the DA-NN and DA-L2P-gCDF experiments have a very similar skill at most reference pixels and across all metrics. This results in nearly identical average skill improvements for both experiments, with correlation increases of 0.12 and 0.13, bias reductions of 0.009 m³ m⁻³ and 0.008 m³ m⁻³, and ubRMSE reductions of 0.005 m³ m⁻³ and 0.006 m³ m⁻³ for DA-NN and DA-L2P-gCDF, respectively.
Similarly, the skill of the DA-NN and DA-L2P-gCDF estimates against the root zone CVS measurements (Figure 3) is nearly identical at most reference pixels. This is also reflected in the average correlation improvements of 0.16 and ubRMSE reductions of 0.003 m³ m⁻³ for both experiments. Both assimilations are only able to reduce the root zone bias at about half of the reference pixels and the relatively large bias degradation at the remaining pixels results in an average bias increase of 0.015 m³ m⁻³ and 0.016 m³ m⁻³ for DA-NN and DA-L2P-gCDF, respectively.
As before, the LR and SF watersheds show more pronounced differences between the two assimilation experiments. At SF, DA-NN generally obtains larger correlation improvements than DA-L2P-gCDF in both soil layers, but DA-L2P-gCDF leads to smaller bias degradations (or larger bias reductions). Given that the NN and L2P retrievals have a similar skill at the SF pixels, the results suggest that the observations have a larger impact on the analysis for DA-NN than for DA-L2P-gCDF.
At LR, DA-L2P-gCDF shows the same consistent skill degradation as DA-NN, but the magnitude of the degradation is larger. Previously, Kolassa et al. found that the L2P retrievals had a significantly better (at the 95% confidence level) correlation skill than the NN retrievals and the model at LR, indicating that at LR the L2P retrievals capture soil moisture information that is not represented in the other products. The DA-L2P-gCDF skill degradations thus suggest that at LR, the DA system is either not able to extract this independent information or is too sensitive to potential noise in the retrievals.
Against the sparse network measurements (Figure 4), the DA-NN and DA-L2P-gCDF experiments have nearly identical correlation and ubRMSE skill in both soil layers. In terms of the bias, DA-L2P-gCDF is able to slightly reduce the bias in the surface and root zone layers, whereas DA-NN slightly increases the surface bias against the sparse network measurements.
4.2.3. Model and Observation Errors
The specified model and observation errors of the DA-L2P-gCDF experiment (Figure 5c) underestimate the actual errors in some regions, particularly in the central US. This is reflected in the higher domain average value of 1.01 for DA-L2P-gCDF, compared to 0.89 for DA-NN. These differences can be caused by: (1) different errors for the L2P retrievals compared to the NN retrievals generated with the TC analysis (Section 2.1.2); and (2) the rescaling of the L2P errors in the DA-L2P-gCDF experiment with the ratio of the global standard deviations of the model and observations (Section 2.1.2).
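The effect of rescaling the observation errors can be sketched with a scalar example. It assumes a linear rescaling of the observation error standard deviation by the ratio of model to observation standard deviations and a scalar Kalman gain; both helper functions and all numbers are hypothetical and serve only to show how the rescaling changes the weight given to the observations.

```python
def rescale_obs_error_std(obs_err_std, model_std, obs_std):
    """Express an observation error std in the model climatology (assumed linear scaling)."""
    return obs_err_std * model_std / obs_std

def scalar_kalman_gain(forecast_err_std, obs_err_std):
    """Weight given to the observation in a scalar Kalman update."""
    return forecast_err_std ** 2 / (forecast_err_std ** 2 + obs_err_std ** 2)

# Illustrative values in m3 m-3 (not taken from the study)
raw_obs_err_std = 0.04   # observation error in retrieval climatology
global_model_std = 0.05  # global standard deviation of the model
global_obs_std = 0.08    # global standard deviation of the retrievals
forecast_err_std = 0.03  # forecast error, e.g., from the ensemble spread

scaled_obs_err_std = rescale_obs_error_std(
    raw_obs_err_std, global_model_std, global_obs_std)
gain_raw = scalar_kalman_gain(forecast_err_std, raw_obs_err_std)
gain_scaled = scalar_kalman_gain(forecast_err_std, scaled_obs_err_std)
```

In this example the rescaled observation error is smaller than the raw one, so the gain rises from 0.36 to about 0.59. Any inaccuracy in the standard deviation ratio therefore translates directly into the observation weight.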
4.2.4. Impact on Related Model Fields
The impact of the DA-L2P-gCDF assimilation on the modeled land evaporation (Figure 6c) has similar spatial patterns as the impact of the DA-NN assimilation and primarily reflects the changes in the mean soil moisture state. Generally, the magnitude of the evaporation changes is smaller for the DA-L2P-gCDF estimates because of the smaller impact of DA-L2P-gCDF on the mean soil moisture state compared to DA-NN.
Similarly, the spatial patterns of the overland runoff changes introduced by DA-L2P-gCDF (Figure 6f) are very similar to those introduced by DA-NN, but have a smaller magnitude as a result of the smaller soil moisture impact in DA-L2P-gCDF compared to DA-NN. The larger differences between the mean soil moisture state of DA-NN and DA-L2P-gCDF near the Rocky Mountains are not propagated into the runoff, as a result of the reduced runoff sensitivity to soil moisture in areas where the runoff magnitude is small (see also Section 4.1.4).
4.2.5. Discussion of DA-NN and DA-L2P-gCDF Results
Overall, the skill of the DA-NN and DA-L2P-gCDF experiments is very similar, suggesting that a global CDF-matching of an existing soil moisture retrieval product can yield comparable soil moisture skill when a retrieval in the model climatology is not possible. Additionally, the skill differences between the DA-NN and DA-L2P-gCDF experiments are related to: (1) differences in the retrieval product skill; and (2) differences in the amount of novel information that each retrieval product provides to the model. The skill of both retrieval products was extensively evaluated against in situ measurements in , where the authors found them to be comparable with somewhat better correlations for the L2P retrievals and a lower ubRMSE for the NN retrievals. Our findings suggest that the impact of these retrieval skill differences on modeled soil moisture estimates generated here is negligible.
The amount of novel information that each data product provides to the model is more difficult to quantify. As a proxy, we compared the model skill from the DA-NN-lCDF experiment to the model skill from an assimilation of the L2P retrievals with a local CDF-matching (DA-L2P-CDF; not shown here). Since the bias correction and assimilation setup in both experiments are the same, differences in the resulting model skill are related to differences in the retrieval skill (which are small, see above) and differences in the independent information provided by both products. The DA-NN-lCDF and DA-L2P-CDF experiments were found to have a nearly identical average surface correlation against the core site measurements of 0.69 for both experiments, and a similar average absolute bias of 0.052 m³ m⁻³ and 0.051 m³ m⁻³ for DA-NN-lCDF and DA-L2P-CDF, respectively. This suggests that the amount of independent information provided by each retrieval product is comparable.
4.3. Assimilation of Soil Moisture vs. Brightness Temperatures
Finally, we evaluate the skill improvements from the soil moisture assimilation experiments presented in the previous sections against those obtained from the brightness temperature assimilation implemented in the SMAP L4_SM system. The L4_SM system has been extensively tested and validated [18,52] and thus the skill of the L4_SM estimates can be considered as somewhat of a baseline for the amount of information that a DA system can extract from the SMAP observations. To some extent, the comparison with the L4_SM estimates also assesses the feasibility of the NN as a tool to project SMAP Tb into the modeled soil moisture space, which is similar to the projection of modeled soil moisture estimates into the SMAP Tb space by the L4_SM radiative transfer model (RTM) (while bearing in mind that the Tb observations are locally rescaled in the L4_SM system). As before, we focus on comparing skill improvements to account for the fact that the soil moisture assimilation experiments and the L4_SM estimates have a slightly different OL (Section 2.1.3).
Evaluated against the surface CVS measurements (Figure 2), the L4_SM estimates are able to yield higher correlation improvements than the soil moisture assimilation experiments at most stations, resulting in the largest average correlation improvement of 0.15. In terms of the ubRMSE, the skill improvements of the Tb and soil moisture assimilations are similar with an average ubRMSE reduction of 0.006 m³ m⁻³ for DA-L4. Like the DA-NN and DA-L2P-gCDF experiments, DA-L4 leads to a surface bias degradation at several stations. However, these are smaller in magnitude than for DA-NN and DA-L2P-gCDF and are balanced by bias improvements, for example at the SF reference pixels. As a result, DA-L4 behaves as designed and does not significantly change the average bias with respect to its OL.
Against the root zone CVS measurements (Figure 3), DA-L4 has the lowest average correlation skill improvement of 0.14. The average ubRMSE reduction for DA-L4 of 0.003 m³ m⁻³ is similar to the reductions obtained from the soil moisture assimilation experiments. In terms of the bias, the DA-L4 estimates behave as intended and only slightly change the bias relative to the OL at most reference pixels (an exception is the SF1 pixel). The resulting average bias increase of 0.006 m³ m⁻³ is small compared to the values for DA-NN and DA-L2P-gCDF, but slightly larger than the bias reduction of 0.001 m³ m⁻³ obtained with DA-NN-lCDF.
The DA-L4 and DA-NN-lCDF experiments are the most similar in terms of the observation rescaling applied prior to the assimilation (although different moments are rescaled locally in each case), but this is not necessarily reflected in a more comparable skill of both experiments. This is partly due to differences introduced by the retrieval algorithm, but De Lannoy and Reichle  also showed that the assimilation of locally rescaled SMOS Tbs or soil moisture estimates extracted very different information from the observations locally. Thus, it is not surprising that DA-NN-lCDF and DA-L4 have different skills at individual reference pixels.
Similarities between the DA-L4 and DA-NN-lCDF experiments exist at the LR reference pixel, where both experiments generally improve the model skill, whereas the two experiments using global observation rescaling (DA-NN and DA-L2P-gCDF) consistently degrade the model skill. These differences are not related to the retrieval product skill and could thus be related to: (1) a higher level of retained observation noise; or (2) an uncertain error characterization in the DA-NN and DA-L2P-gCDF experiments.
Evaluated against the sparse network measurements (Figure 4), the skill differences of DA-L4 relative to its OL are very small and consistently within error bars. The Tb assimilation slightly improves the correlation and ubRMSE skill in the surface layer, but slightly degrades the skill in the root zone. For the bias, the behavior is inverted, with a slight bias improvement in the root zone.
Overall, the soil moisture assimilation experiments and DA-L4 are able to achieve very similar skill improvements over their respective open loops. This supports the finding of De Lannoy and Reichle that the assimilation of SMOS Tbs and soil moisture estimates, while locally different, resulted in model estimates with a comparable average skill against in situ measurements. Taken together, the results suggest that the NN method could be a viable assimilation alternative when a Tb assimilation is not possible (e.g., due to issues with the RTM calibration or excessive RTM complexity).
Furthermore, the assimilation configuration used in the experiments here is very close to that of the SMAP L4_SM system and as such might not represent the optimal configuration for soil moisture retrieval assimilation. A better calibration of the model perturbations might further improve the observation impact and increase the skill improvements from the soil moisture assimilation experiments.
5. Conclusions and Perspectives
In this study we compared different methods to extract soil moisture information from SMAP observations through data assimilation. In particular, we focused on the potential of NN techniques to reduce the need for bias correction prior to an assimilation in order to maximize the amount of independent satellite information that is used to inform the model. We conducted three experiments to assimilate SMAP soil moisture retrievals into the NASA CLSM and evaluated the resulting soil moisture estimates against in situ measurements from the SMAP core validation sites as well as two sparse networks. For reference, we also compared our soil moisture assimilation experiments against the skill of the SMAP L4_SM estimates generated through a SMAP Tb assimilation.
All of the SMAP data assimilation experiments included in our study were generally able to improve the surface and root zone soil moisture model skill over the respective open loop (model run without data assimilation) when evaluated against the CVS in situ measurements (with the exception of the root zone bias). This demonstrates the general potential of the SMAP observations to inform the model, irrespective of the data assimilation approach chosen, and confirms previous findings [18,52]. For most reference pixels, the improvements over the OL were small and differences in the average metrics were mostly driven by a few pixels with large improvements. However, the improvement over the model skill in data-rich regions such as the US is limited because the model skill is generally high. Larger improvements in the model skill can be expected in data-sparse regions. Measurements at the sparse network sites are less representative of the grid-cell scale estimates from the model and retrievals. Moreover, the sparse networks include many stations where microwave-based soil moisture retrievals are not reliable. Therefore, the skill improvements over the open loop were generally smaller or sometimes negative for the sparse networks.
Comparing the three soil moisture assimilation experiments showed that using global observation rescaling (DA-NN and DA-L2P-gCDF) better retained the independent soil moisture information provided by the SMAP retrievals and led to a larger impact of the observations during the assimilation. This resulted in larger soil moisture skill improvements at many reference pixels compared to the improvements obtained when using local rescaling (DA-NN-lCDF). However, it also made the soil moisture estimates more sensitive to a skill degradation in locations where the observations were uncertain. On average, the assimilation resulted in slightly higher skill improvements against the surface in situ measurements for DA-NN and DA-L2P-gCDF and slightly higher skill improvements in the root zone for DA-NN-lCDF. Overall, the results suggest that the global rescaling approaches could potentially be very beneficial for soil moisture estimation under the conditions of: (1) a good observation error characterization; (2) rigorous observation quality control; and (3) potential component-wise assimilation  to better isolate the reliable satellite information.
The experiments using global observation rescaling introduced large changes in the land evaporation and runoff that were likely unrealistic in magnitude. This showed that using the NN assimilation method for purposes other than improving soil moisture estimates is not recommended without a careful re-calibration of the model processes translating soil moisture changes into changes of other model variables.
Instead of assimilating the NN retrievals without further bias correction (in DA-NN), similar results were obtained when assimilating physically-based retrievals after a global bias correction (in DA-L2P-gCDF). We previously showed that the retrieval product skill and amount of independent information provided by the NN and L2P retrievals are comparable and thus the similar skill of the DA-NN and DA-L2P-gCDF estimates indicates that the two rescaling methods are approximately equivalent. However, the relatively short record length of the SMAP observations implies that sampling errors impact both the NN method and the two CDF-matching approaches used here and that the results might change as longer data records become available.
Finally, compared to the skill improvements obtained from the SMAP Tb assimilation implemented in the SMAP L4_SM system, the soil moisture assimilation experiments had comparable average correlation and ubRMSE skill. Differences in the average bias changes between the L4_SM estimates and the experiments using global observation rescaling exist as a result of the local Tb rescaling implemented in the L4_SM system. Overall, the results suggest that on average there is no particular advantage to assimilating either Tbs or soil moisture estimates, although locally the choice could result in statistically significant skill differences.
Acknowledgments
J. Kolassa was supported by an appointment to the NASA Postdoctoral Program at the Goddard Space Flight Center, administered by Universities Space Research Association under contract with NASA. Additional funding was provided by the NASA Soil Moisture Active Passive mission. Computational resources for this study were provided by the NASA High-End Computing (HEC) Program through the NASA Center for Climate Simulation (NCCS) at the Goddard Space Flight Center. The USDA is an equal opportunity provider and employer.
Author Contributions
J.K. and R.R. conceived, designed and conducted the neural network retrievals and the data assimilation experiments and wrote the manuscript. Q.L. processed the evaluation data, contributed to the setup of the data assimilation system, and provided comments on the manuscript. All other authors provided the ground station evaluation data and provided comments on the manuscript.
Conflicts of Interest
The authors declare no conflicts of interest.
- Seneviratne, S.I.; Lüthi, D.; Litschi, M.; Schär, C. Land-atmosphere coupling and climate change in Europe. Nature 2006, 443, 205–209.
- Bateni, S.M.; Entekhabi, D. Relative efficiency of land surface energy balance components. Water Resour. Res. 2012, 48.
- Assouline, S. Infiltration into soils: Conceptual approaches and solutions. Water Resour. Res. 2013, 49, 1755–1772.
- Jung, M.; Reichstein, M.; Schwalm, C.R.; Huntingford, C.; Sitch, S.; Ahlström, A.; Arneth, A.; Camps-Valls, G.; Ciais, P.; Friedlingstein, P.; et al. Compensatory water effects link yearly global land CO2 sink changes to temperature. Nature 2017, 541, 516–520.
- World Meteorological Organization-Global Climate Observing System. Guideline for the Generation of Satellite-Based Datasets and Products Meeting GCOS Requirements. WMO Technical Document 1488. 2009. Available online: https://public.wmo.int/en/programmes/global-climate-observing-system/Publications/gcos-143.pdf (accessed on 4 June 2017).
- Jackson, T.J.; Hsu, A.Y.; Van de Griend, A.; Eagleman, J.R. Skylab L-band microwave radiometer observations of soil moisture revisited. Int. J. Remote Sens. 2004, 25, 2585–2606.
- Entekhabi, D.; Njoku, E.G.; O’Neill, P.E.; Kellogg, K.H.; Crow, W.T.; Edelstein, W.N. The Soil Moisture Active Passive (SMAP) Mission. Proc. IEEE 2010, 98, 704–716.
- Kerr, Y.H.; Waldteufel, P.; Wigneron, J.-P.; Delwart, S.; Cabot, F.; Boutin, J.; Escorihuela, M.-J.; Font, J.; Reul, N.; Gruhier, C.; et al. The SMOS Mission: New Tool for Monitoring Key Elements of the Global Water Cycle. Proc. IEEE 2010, 98, 666–687.
- Al Bitar, A.; Leroux, D.; Kerr, Y.H.; Merlin, O.; Richaume, P.; Sahoo, A.; Wood, E.F. Evaluation of SMOS soil moisture products over continental US using the SCAN/SNOTEL network. IEEE Trans. Geosci. Remote Sens. 2012, 50, 1572–1586.
- Chan, S.K.; Bindlish, R.; O’Neill, P.E.; Njoku, E.; Jackson, T.; Colliander, A.; Chen, F.; Burgin, M.; Dunbar, S.; Piepmeier, J.; et al. Assessment of the SMAP passive soil moisture product. IEEE Trans. Geosci. Remote Sens. 2016, 54, 4994–5007.
- Reichle, R.H. Data assimilation methods in the Earth sciences. Adv. Water Resour. 2008, 31, 1411–1418.
- Bolten, J.D.; Crow, W.T.; Zhan, X.; Jackson, T.J.; Reynolds, C.A. Evaluating the utility of remotely sensed soil moisture retrievals for operational agricultural drought monitoring. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2010, 3, 57–66.
- Liu, Q.; Reichle, R.H.; Bindlish, R.; Cosh, M.H.; Crow, W.T.; de Jeu, R.; De Lannoy, G.J.; Huffman, G.J.; Jackson, T.J. The contributions of precipitation and soil moisture observations to the skill of soil moisture estimates in a land data assimilation system. J. Hydrometeorol. 2011, 12, 750–765.
- Draper, C.S.; Reichle, R.H.; De Lannoy, G.J.M.; Liu, Q. Assimilation of passive and active microwave soil moisture retrievals. Geophys. Res. Lett. 2012, 39.
- De Rosnay, P.; Drusch, M.; Vasiljevic, D.; Balsamo, G.; Albergel, C.; Isaksen, L. A simplified Extended Kalman Filter for the global operational soil moisture analysis at ECMWF. Q. J. R. Meteorol. Soc. 2013, 139, 1199–1213.
- De Lannoy, G.J.; Reichle, R.H. Global assimilation of multiangle and multipolarization SMOS brightness temperature observations into the GEOS-5 catchment land surface model for soil moisture estimation. J. Hydrometeorol. 2016, 17, 669–691.
- De Lannoy, G.J.; Reichle, R.H. Assimilation of SMOS brightness temperatures or soil moisture retrievals into a land surface model. Hydrol. Earth Syst. Sci. 2016, 20, 4895.
- Reichle, R.; De Lannoy, G.; Liu, Q.; Ardizzone, J.V.; Colliander, A.; Conaty, A.; Crow, W.; Jackson, T.J.; Jones, L.A.; Kimball, J.S.; et al. Assessment of the SMAP Level-4 Surface and Root-Zone Soil Moisture Product using in situ measurements. J. Hydrometeorol. 2017, 18, 2621–2645.
- Crow, W.T.; Van den Berg, M.J. An improved approach for estimating observation and model error parameters in soil moisture data assimilation. Water Resour. Res. 2010, 46.
- Kumar, S.V.; Reichle, R.H.; Harrison, K.W.; Peters Lidard, C.D.; Yatheendradas, S.; Santanello, J.A. A comparison of methods for a priori bias correction in soil moisture data assimilation. Water Resour Res. 2012, 48. [Google Scholar] [CrossRef]
- De Lannoy, G.J.; Reichle, R.H.; Houser, P.R.; Pauwels, V.; Verhoest, N.E. Correcting for forecast bias in soil moisture assimilation with the ensemble Kalman filter. Water Resour Res. 2007, 43. [Google Scholar] [CrossRef][Green Version]
- Kalnay, E. Atmospheric Modeling, Data Assimilation and Predictability; Cambridge University Press: Cambridge, UK, 2003; ISBN 978-0521796293. [Google Scholar]
- Reichle, R.H.; Koster, R.D. Bias reduction in short records of satellite soil moisture. Geophys. Res. Lett. 2004, 31, L19501. [Google Scholar] [CrossRef]
- Drusch, M.; Wood, E.F.; Gao, H. Observation operators for the direct assimilation of TRMM microwave imager retrieved soil moisture. Geophys. Res. Lett. 2005, 32. [Google Scholar] [CrossRef]
- Kolassa, J.; Reichle, R.H.; Liu, Q.; Alemohammad, S.H.; Gentine, P.; Aida, K.; Asanuma, J.; Bircher, S.; Caldwell, T.; Colliander, A.; et al. Estimating surface soil moisture from SMAP observations using a Neural Network technique. Remote Sens. Environ. 2017. [Google Scholar] [CrossRef]
- Brodzik, M.J.; Billingsley, B.; Haran, T.; Raup, B.; Savoie, M.H. EASE-grid 2.0: Incremental but significant improvements for Earth-gridded data sets. ISPRS Int. J. Geo-Inf. 2012, 1, 32–45. [Google Scholar] [CrossRef]
- Chan, S.; Njoku, E.G.; Colliander, A. SMAP L1C Radiometer Half-Orbit 36 km EASE-Grid Brightness Temperatures, Version 3; NASA National Snow and Ice Data Center Distributed Active Archive Center: Boulder, CO, USA, 2016.
- O’Neill, P.; Chan, S.; Njoku, E.; Jackson, T.; Bindlish, R. SMAP Algorithm Theoretical Basis Document: L2 & L3 Radiometer Soil Moisture (Passive) Products; Jet Propulsion Laboratory: Pasadena, CA, USA, 2015; Available online: https://nsidc.org/sites/nsidc.org/files/technical-references/L2_SM_P_ATBD_v7_Sep2015-po-en.pdf (accessed on 1 May 2017).
- Jimenez, C.; Clark, D.B.; Kolassa, J.; Aires, F.; Prigent, C. A joint analysis of modeled soil moisture fields and satellite observations. J. Geophys. Res. 2013, 118, 6771–6782. [Google Scholar] [CrossRef]
- Maeda, T.; Taniguchi, Y. Descriptions of GCOM-W1 AMSR2 Level 1R and Level 2 Algorithms; Japan Aerospace Exploration Agency Earth Observation Research Center: Ibaraki, Japan, 2013; Available online: suzaku.eorc.jaxa.jp/GCOM_W/data/doc/NDX-120015A.pdf (accessed on 3 June 2017).
- Wagner, W.; Hahn, S.; Kidd, R.; Melzer, T.; Bartalis, Z.; Hasenauer, S.; Rubel, F. The ASCAT soil moisture product: A review of its specifications, validation results, and emerging applications. Meteorol. Z. 2013, 22, 5–33. [Google Scholar] [CrossRef]
- Wigneron, J.P.; Chanzy, A.; Calvet, J.C.; Bruguier, N. A simple algorithm to retrieve soil moisture and vegetation biomass using passive microwave measurements over crop fields. Remote Sens. Environ. 1995, 51, 331–341. [Google Scholar] [CrossRef]
- Lucchesi, R. File Specification for GEOS-5 FP. NASA Global Modeling and Assimilation Office (GMAO) Office Note No. 4 (Version 1.0). p. 63. Available online: http://gmao.gsfc.nasa.gov/pubs/office_notes (accessed on 1 May 2017).
- O’Neill, P.E.; Chan, S.; Njoku, E.G.; Jackson, T.; Bindlish, R. SMAP L2 Radiometer Half-Orbit 36 km EASE-Grid Soil Moisture; Version 4; NASA National Snow and Ice Data Center Distributed Active Archive Center: Boulder, CO, USA, 2016.
- Reichle, R.H.; Koster, R.; De Lannoy, G.; Crow, W.; Kimball, J. SMAP Level 4 Surface and Root Zone Soil Moisture Data Product: L4_SM Algorithm Theoretical Basis Document (Revision A), Soil Moisture Active Passive (SMAP) Mission Science Document; Jet Propulsion Laboratory: Pasadena, CA, USA, 2014; Available online: https://nsidc.org/sites/nsidc.org/files/technical-references/272_L4_SM_RevA_web.pdf (accessed on 1 May 2017).
- Mohammed-Tano, P.; Piepmeier, J.; Weiss, B.; Hanna, M.; Yueh, S.; Cuddy, D. Soil Moisture Active Passive (SMAP) Project: Level 1B_TB Product Specification Document; SMAP Project, JPL D-92339; Jet Propulsion Laboratory: Pasadena, CA, USA, 2015; Available online: https://nsidc.org/data/SPL1BTB/versions/3 (accessed on 15 May 2017).
- Reichle, R.; De Lannoy, G.; Koster, R.D.; Crow, W.T.; Kimball, J.S. SMAP L4 9 km EASE-Grid Surface and Root Zone Soil Moisture Analysis Update; Version 2; NASA National Snow and Ice Data Center Distributed Active Archive Center: Boulder, CO, USA, 2016.
- Colliander, A.; Jackson, T.J.; Bindlish, R.; Chan, S.; Das, N.; Kim, S.B.; Cosh, M.H.; Dunbar, R.S.; Dang, L.; Pashaian, L. Validation of SMAP surface soil moisture products with core validation sites. Remote Sens. Environ. 2017, 191, 215–231. [Google Scholar] [CrossRef]
- Schaefer, G.L.; Cosh, M.H.; Jackson, T.J. The USDA natural resources conservation service soil climate analysis network (SCAN). J. Atmos. Ocean. Technol. 2007, 24, 2073–2077. [Google Scholar] [CrossRef]
- Diamond, H.J.; Karl, T.R.; Palecki, M.A.; Baker, C.B.; Bell, J.E.; Leeper, R.D.; Easterling, D.R.; Lawrimore, J.H.; Meyers, T.P.; Helfert, M.R.; et al. US Climate Reference Network after one decade of operations: Status and assessment. Bull. Am. Meteorol. Soc. 2013, 94, 485–498. [Google Scholar] [CrossRef]
- Palecki, M.A.; Bell, J.E. US Climate Reference Network soil moisture observations with triple redundancy: Measurement variability. Vadose Zone J. 2013, 12. [Google Scholar] [CrossRef]
- De Lannoy, G.J.; Koster, R.D.; Reichle, R.H.; Mahanama, S.P.; Liu, Q. An updated treatment of soil texture and associated hydraulic properties in a global land modeling system. J. Adv. Model. Earth Syst. 2014, 6, 957–979. [Google Scholar] [CrossRef][Green Version]
- Reichle, R.H.; De Lannoy, G.J.; Liu, Q.; Colliander, A.; Conaty, A.; Jackson, T.; Kimball, J.; Koster, R.D. Soil Moisture Active Passive (SMAP) Project Assessment Report for the Beta-Release L4_SM Data Product. NASA Technical Report Series on Global Modeling and Data Assimilation. NASA/TM-2015-104606. 2015, pp. 40–63. Available online: https://nsidc.org/sites/nsidc.org/files/technical-references/Reichle788.pdf (accessed on 1 May 2017).
- Reichle, R.H.; Liu, Q. Observation-Corrected Precipitation Estimates in GEOS-5; NASA/TM 2014-104606; NASA Global Modeling and Assimilation Office: Greenbelt, MD, USA, 2014; Volume 35.
- Reichle, R.H.; Liu, Q.; Koster, R.D.; Draper, C.S.; Mahanama, S.P.; Partyka, G.S. Land surface precipitation in MERRA-2. J. Clim. 2017, 30, 1643–1664. [Google Scholar] [CrossRef]
- Reichle, R.H.; Koster, R.D. Assessing the impact of horizontal error correlations in background fields on soil moisture estimation. J. Hydrometeorol. 2003, 4, 1229–1242. [Google Scholar] [CrossRef]
- Gaspari, G.; Cohn, S.E. Construction of correlation functions in two and three dimensions. Q. J. R. Meteorol. Soc. 1999, 125, 723–757. [Google Scholar] [CrossRef]
- He, L.; Chen, J.M.; Liu, J.; Bélair, S.; Luo, X. Assessment of SMAP soil moisture for global simulation of gross primary production. J. Geophys. Res. Biogeosci. 2017. [Google Scholar] [CrossRef]
- Chan, S.K.; Bindlish, R.; O’Neill, P.; Jackson, T.; Njoku, E.; Dunbar, S.; Chaubell, J.; Piepmeier, J.; Yueh, S.; Entekhabi, D.; et al. Development and assessment of the SMAP enhanced passive soil moisture product. Remote Sens. Environ. 2017. [CrossRef]
- Reichle, R.H.; Draper, C.S.; Liu, Q.; Girotto, M.; Mahanama, S.P.; Koster, R.D.; De Lannoy, G.J. Assessment of MERRA-2 land surface hydrology estimates. J. Clim. 2017, 30, 2937–2960. [Google Scholar] [CrossRef]
- Draper, C.; Reichle, R. The impact of near-surface soil moisture assimilation at subseasonal, seasonal, and inter-annual timescales. Hydrol. Earth Syst. Sci. 2015, 19, 4831. [Google Scholar] [CrossRef]
- Reichle, R.H.; De Lannoy, G.J.M.; Liu, Q.; Koster, R.D.; Kimball, J.S.; Crow, W.T.; Ardizzone, J.V.; Chakraborty, P.; CollinS, D.W.; Conaty, A.L.; et al. Global Assessment of the SMAP Level-4 Surface and Root-Zone Soil Moisture Product Using Assimilation Diagnostics. J. Hydrometeorol. 2017. [Google Scholar] [CrossRef]
Figure 1. Average soil moisture difference, computed as data assimilation (DA) minus open loop (OL) for the period April 2015 to March 2017, for the (a) DA-NN, (b) DA-NN-lCDF, and (c) DA-L2P-gCDF experiments. Red colors indicate that the assimilation decreases the mean soil moisture with respect to the OL. Panels (d–f) are the same, but for the difference in the standard deviation with respect to the OL; red colors indicate that the assimilation decreases the variability relative to the OL. Panels (a) and (d) also show the locations of the South Fork (triangle) and Little River (circle) watersheds discussed in the text.
Figure 2. Change in surface soil moisture (a) correlation, (b) absolute bias, and (c) unbiased root-mean-square error (ubRMSE) versus core validation site (CVS) measurements for the DA-NN (red squares), DA-NN-lCDF (blue diamonds), and DA-L2P-gCDF (green circles) experiments and for the DA-L4 product (orange triangles). Skill changes were computed as DA minus OL, using the OL that corresponds to each experiment. Error bars denote the 95% confidence interval. Reference pixel abbreviations are listed in Table 1.
Figure 3. Same as Figure 2, but for the root zone.
Figure 4. Average metrics for all experiments against core site and sparse network in situ measurements. Shown are (a) the surface correlation, (b) root zone correlation, (c) surface absolute bias, (d) root zone absolute bias, (e) surface ubRMSE, and (f) root zone ubRMSE. The error bars indicate the 95% confidence interval.
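The skill metrics shown in Figures 2 to 4 can be illustrated with a minimal sketch. It assumes a Pearson correlation on the raw (not anomaly) time series and the common identity ubRMSE² = RMSE² − bias²; the function names `skill_metrics` and `skill_change` are hypothetical and are not taken from the study's code.

```python
import numpy as np

def skill_metrics(estimate, in_situ):
    """Return (correlation, bias, ubRMSE) for two paired time series."""
    est = np.asarray(estimate, dtype=float)
    obs = np.asarray(in_situ, dtype=float)
    # Use only time steps where both series have valid values.
    valid = np.isfinite(est) & np.isfinite(obs)
    est, obs = est[valid], obs[valid]
    diff = est - obs
    bias = diff.mean()
    rmse = np.sqrt(np.mean(diff ** 2))
    # Removing the mean difference gives the unbiased RMSE.
    ubrmse = np.sqrt(max(rmse ** 2 - bias ** 2, 0.0))
    corr = np.corrcoef(est, obs)[0, 1]
    return corr, bias, ubrmse

def skill_change(da, ol, in_situ):
    """Skill change computed as DA minus OL (absolute bias is compared)."""
    r_da, b_da, u_da = skill_metrics(da, in_situ)
    r_ol, b_ol, u_ol = skill_metrics(ol, in_situ)
    return r_da - r_ol, abs(b_da) - abs(b_ol), u_da - u_ol
```

With this sign convention, a positive correlation change and negative changes in absolute bias and ubRMSE all indicate that the assimilation improved on the open loop.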
Figure 5. Standard deviation of the normalized innovations (observation minus forecast, O minus F) for the (a) DA-NN, (b) DA-NN-lCDF, and (c) DA-L2P-gCDF experiments. Red colors indicate that the assumed errors are overestimated with respect to the actual errors, and blue colors indicate an underestimation. White areas indicate that fewer than 30 observations were assimilated and no metric was computed. Panel (a) also shows the locations of the South Fork (SF; green triangle) and Little River (LR; green circle) sites.
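The diagnostic in Figure 5 can be sketched as follows, under the usual assumption that each innovation is normalized by the square root of the sum of the forecast error variance (in observation space) and the assumed observation error variance. The function name and arguments are hypothetical; only the threshold of 30 observations comes from the caption.

```python
import numpy as np

MIN_OBS = 30  # from the caption: no metric where fewer observations were assimilated

def normalized_innovation_std(obs, forecast, forecast_var, obs_err_var):
    """Standard deviation of (O - F) / sqrt(forecast_var + obs_err_var)."""
    obs = np.asarray(obs, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    total_var = np.asarray(forecast_var, dtype=float) + np.asarray(obs_err_var, dtype=float)
    normalized = (obs - forecast) / np.sqrt(total_var)
    normalized = normalized[np.isfinite(normalized)]
    if normalized.size < MIN_OBS:
        return np.nan  # too few observations for a meaningful estimate
    return normalized.std(ddof=1)
```

A value near 1 suggests that the assumed errors are consistent with the actual innovations, a value below 1 that the assumed errors are overestimated, and a value above 1 that they are underestimated.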
Figure 6. Average land evaporation difference—computed as DA minus OL for the period April 2015 to March 2017—for (a) DA-NN, (b) DA-NN-lCDF, and (c) DA-L2P-gCDF experiments. Panels (d–f) are the same, but for the difference of overland runoff with respect to the OL. Red colors indicate that the assimilation reduces the evaporation and runoff with respect to the OL.
Table 1. Overview of the Soil Moisture Active Passive (SMAP) calibration/validation core sites. Shown are the site name, site key, reference pixel ID (RPID), location, climate, land cover, and the availability of root zone measurements (from left to right). The measurement depth for surface soil moisture is 5 cm at all sites. The measurement depth for root zone soil moisture ranges from 30 cm to 75 cm depending on the station.
|Site||Site Key||RPID||US State||Climate||Land Cover||Root Zone|
|Walnut Gulch||WG1||16010906||Arizona||arid||shrub open||no|
|Little Washita||LW||16020907||Oklahoma||temperate||croplands and pasture||yes|
|Fort Cobb||FC1||16030911||Oklahoma||temperate||croplands and pasture||yes|
|Little River||LR||16040901||Georgia||temperate||croplands/natural mosaic||yes|
|Tonzi Ranch||TR||25010911||California||temperate||woody savannas||no|
Table 2. Ensemble perturbations applied to the forcing variables (precipitation (P), downward shortwave (DSW) radiation, and downward longwave (DLW) radiation) and to the Catchment model prognostic variables (surface excess (srfexc) and catchment deficit (catdef)). Shown are the perturbation types, which are either multiplicative (M), sampled from a log-normal distribution, or additive (A), sampled from a normal distribution, as well as the perturbation standard deviation (std dev), the temporal and spatial correlation lengths, and the cross-correlations of the forcing variables. Perturbations to the prognostic variables are not cross-correlated.
|Variable||Type||Std Dev||Temporal Correlation||Spatial Correlation||Cross Correlation with P||Cross Correlation with DSW||Cross Correlation with DLW|
|P||M||0.5||24 h||0.5 deg||-||−0.8||0.5|
|DSW||M||0.3||24 h||0.5 deg||−0.8||-||−0.5|
|DLW||A||20 W m⁻²||24 h||0.5 deg||0.5||−0.5||-|
|srfexc||A||0.24 kg m⁻² h⁻¹||3 h||0.3 deg||n/a||n/a||n/a|
|catdef||A||0.16 kg m⁻² h⁻¹||3 h||0.3 deg||n/a||n/a||n/a|
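The forcing perturbations in Table 2 can be illustrated for a single grid cell and ensemble member. This is a sketch under stated assumptions, not the system's implementation: the cross-correlations are applied to the underlying standard normal variables through a Cholesky factor, the temporal correlation is modeled as a first-order autoregressive process, the multiplicative factors are log-normal with mean 1 and the tabulated standard deviation, and spatial correlation is ignored. The function name `forcing_perturbations` is hypothetical.

```python
import numpy as np

# Values from Table 2; the order is P, DSW, DLW.
STD = np.array([0.5, 0.3, 20.0])            # DLW value is in W m-2
MULTIPLICATIVE = np.array([True, True, False])
CROSS_CORR = np.array([[1.0, -0.8, 0.5],
                       [-0.8, 1.0, -0.5],
                       [0.5, -0.5, 1.0]])
TAU_H = 24.0                                 # temporal correlation time in hours

def forcing_perturbations(n_steps, dt_h, rng):
    """Return an (n_steps, 3) array of perturbations for P, DSW, and DLW."""
    chol = np.linalg.cholesky(CROSS_CORR)
    phi = np.exp(-dt_h / TAU_H)              # AR(1) coefficient (assumption)
    z = np.zeros((n_steps, 3))
    state = chol @ rng.standard_normal(3)
    for t in range(n_steps):
        z[t] = state
        noise = chol @ rng.standard_normal(3)
        # The scaling keeps unit variance and the cross-correlation over time.
        state = phi * state + np.sqrt(1.0 - phi ** 2) * noise
    pert = np.empty_like(z)
    for j in range(3):
        if MULTIPLICATIVE[j]:
            # Log-normal factor with mean 1 and standard deviation STD[j].
            sigma2 = np.log(1.0 + STD[j] ** 2)
            pert[:, j] = np.exp(np.sqrt(sigma2) * z[:, j] - 0.5 * sigma2)
        else:
            # Additive perturbation with zero mean.
            pert[:, j] = STD[j] * z[:, j]
    return pert
```

The multiplicative factors would scale the unperturbed precipitation and shortwave radiation, and the additive values would be added to the longwave radiation. The prognostic perturbations (srfexc, catdef) could be drawn the same way with additive noise and no cross-correlation.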
Table 3. Overview of the soil moisture (SM) model and data assimilation experiments. CDF: cumulative distribution function; NN: neural network; OL: open loop; DA-NN: SMAP NN retrieval assimilation without bias correction; DA-NN-lCDF: SMAP NN retrieval assimilation with local CDF-matching; DA-L2P-gCDF: SMAP Level-2 passive retrieval assimilation with global CDF-matching; DA-L4: SMAP Level-4 brightness temperature assimilation product; Tb: raw brightness temperature.
|Experiment Name||Observations Assimilated||Bias Correction||Model Configuration|
|OL||none||n/a||Nature Run v5|
|DA-NN||SMAP NN SM||n/a *||Nature Run v5|
|DA-NN-lCDF||SMAP NN SM||local CDF-matching||Nature Run v5|
|DA-L2P-gCDF||SMAP L2P SM||global CDF-matching||Nature Run v5|
|OL-L4||none||n/a||Nature Run v4|
|DA-L4||SMAP Tb||seasonal climatology matching||Nature Run v4|
* A global bias correction is implicit in the NN retrievals.
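The CDF-matching step listed in Table 3 can be sketched as a generic empirical quantile mapping. This is an assumption about the general technique rather than the study's exact procedure; the function name `cdf_match` and its arguments are hypothetical. In a local variant, the reference samples would come from a single grid cell; in a global variant, they would be pooled over all grid cells.

```python
import numpy as np

def cdf_match(obs, obs_ref, model_ref):
    """Rescale obs so that its CDF matches the model climatology.

    obs_ref and model_ref are reference samples of the observations and
    of the model soil moisture, respectively.
    """
    obs_ref = np.sort(np.asarray(obs_ref, dtype=float))
    model_ref = np.sort(np.asarray(model_ref, dtype=float))
    # Empirical non-exceedance probabilities of the sorted samples.
    p_obs = (np.arange(obs_ref.size) + 0.5) / obs_ref.size
    p_model = (np.arange(model_ref.size) + 0.5) / model_ref.size
    # Probability of each observation within the observation climatology.
    p = np.interp(obs, obs_ref, p_obs)
    # Model value at the same probability.
    return np.interp(p, p_model, model_ref)
```

Values outside the range of the reference sample are clamped to its end points by `np.interp`, so extremes not seen in the reference period are not extrapolated.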
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).