Stochastic Bias Correction and Uncertainty Estimation of Satellite-Retrieved Soil Moisture Products

To apply satellite-retrieved soil moisture to short-range weather prediction, we review a stochastic approach for reducing footprint-scale biases and estimating their uncertainties. First, we discuss the challenge of representativeness errors. Before describing retrieval errors in more detail, we clarify the conceptual difference between error and uncertainty in the basic metrological terms of the International Organization for Standardization (ISO), and briefly summarize how current retrieval algorithms deal with land surface heterogeneity. In contrast to relative approaches such as Triple Collocation or cumulative distribution function (CDF) matching, which target stationary climatological errors at time scales of years, the stochastic approach aims to reduce instantaneous retrieval errors at time scales of several hours to days. It has potential as a global scheme for resolving systematic errors introduced by instrumental measurements, geo-physical parameters, and surface heterogeneity, because it does not rely on ground measurements or reference data for comparison.


Scale Issue
Anderson [1] stated that the whole is not only more than but also different from the sum of its parts. For a hierarchical structure, he stated that the behavior of aggregated elements cannot be understood by a simple linear extrapolation of each element, so that a different understanding of the new behavior is required at each higher level of complexity and scale. Similarly, the footprint-scale behavior of satellite soil moisture products is not captured by an extrapolation or other statistical synthesis of several point measurements at local scale [2,3]. Low-resolution spaceborne sensors do not detect the point-scale details of the land surface. Instead, satellite retrievals usually treat a mixed pixel as a single uniform entity (e.g., SMOS retrieval algorithms read sub-pixel land cover information at 4 km by 4 km but finally make the pixels uniform by aggregation). When local measurements are matched with satellite retrievals, a representativeness error arises from the difference in scale between field measurements and satellite observations.
Consider an example of the spatial scale issue. It may sound simple to measure the height of three or more trees in a backyard. However, the same task becomes far more challenging at a global scale. First, we would need an instrument that scans the globe consistently. In practice, however, the instrument contains calibration errors, instrumental errors, or other errors arising from environmental factors. In addition, because the land surface is globally heterogeneous, the interpretation of satellite measurements is very complicated. Retrieval models should be flexible enough to accommodate complexities such as climate conditions, surface heterogeneity at global scale, and any auxiliary information used to convert the raw signal to the geo-physical variables of interest. However, current retrieval algorithms often rely on empirical models derived from a few validation sites at a local scale, due to a lack of ancillary and heterogeneity information. For example, several retrieval algorithms use vegetation models formulated and calibrated at a limited number of validation sites [4,5]. A change detection method assumes that the effect of vegetation on backscattering is minimal at the cross-over angle, based upon empirical relations established for correcting vegetation effects [3,6,7]. Similarly, as in the tree example above, a remotely sensed vegetation index such as the Leaf Area Index (LAI) may be used for globally characterizing the height of vegetation, although vegetation itself reflects the remote sensor's signals. A rationale that is valid at a local scale is no longer valid at footprint scale, as the LAI measured in the field differs from the LAI retrieved from remote sensors. Thus, footprint-scale measurements and retrievals need to be assessed directly, rather than by converting the spatial scale between local point and satellite.
In addition to spatial scale, there is an issue with time scale. Relative comparisons such as Triple Collocation (TC) or cumulative distribution function (CDF) matching, which re-scale or compare satellite products against other types of datasets, target climatological errors at a time scale of years rather than retrieval errors at time scales of several hours to days, as needed in short-range weather prediction of storms or flooding. Thus, we review the instantaneous error dynamics relevant to short-range weather prediction.

Issues with Current Retrieval Goal
Despite the scale issue, an unbiased Root Mean Square Error (RMSE) goal of 0.04 m³/m³ is imposed on the SMOS and SMAP retrieval quality. It is the most commonly used validation metric for measuring the deviation of satellite products from ground measurements. Although upscaling of several local point field measurements is required for validating coarse-resolution satellite soil moisture products at several kilometers, it should be clarified that upscaling errors are independent of retrieval errors in remotely sensed products [8]. The following discussion develops these upscaling issues further.
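As a minimal sketch of how the unbiased RMSE goal is typically checked, the snippet below removes the mean difference (bias) from the RMSE using the identity ubRMSE² = RMSE² − bias². The function name and the soil moisture values are hypothetical and serve only as an illustration; they are not taken from any mission validation dataset.

```python
import math

def ubrmse(retrieved, reference):
    """Return (ubRMSE, bias, RMSE) of retrieved values against a reference."""
    n = len(retrieved)
    diffs = [r - g for r, g in zip(retrieved, reference)]
    bias = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    # ubRMSE^2 = RMSE^2 - bias^2; max() guards against tiny negative rounding
    return math.sqrt(max(rmse ** 2 - bias ** 2, 0.0)), bias, rmse

# Hypothetical volumetric soil moisture values in m3/m3 (illustration only)
satellite = [0.21, 0.25, 0.30, 0.18, 0.27]
in_situ = [0.18, 0.22, 0.29, 0.17, 0.22]

ub, bias, rmse = ubrmse(satellite, in_situ)
meets_goal = ub <= 0.04  # the 0.04 m3/m3 goal quoted in the text
```

Note that the metric is insensitive to a constant offset by construction, which is one reason it cannot by itself reveal footprint-scale biases.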
First, the governing factors of footprint-scale measurements are different from those of point-scale in-situ field measurements [9]. Satellite measurements are influenced by regional-scale meteorological events, topography, or vegetation effects at landscape scale [10,11]. In contrast, ground measurements represent the temporal variation of the soil dielectric constant at the fine scale of soil porosity, soil particles, and water molecules, although they are also affected by meteorological events.
Secondly, ground measurements undersample land surface heterogeneity [4]. Satellite retrievals usually aggregate sub-pixel land surface heterogeneity in a different way from ground measurements upscaled to the same spatial extent [6,12-14]. Thus, there is an essential discrepancy in spatial representativeness. Thoma et al. [15] discussed the limitation of representing satellite surface soil moisture data with a combination of point-scale field data, because spatially distributed land surface features affect satellite measurements in more complicated ways than they affect in-situ point measurements. Verhoest et al. [16] also noted that a direct comparison with field measurements is not possible, because satellite data depend on scale in terms of land surface characteristics such as roughness [17,18]. Talone et al. [7] stated that land surface inhomogeneity ultimately limits the capability to compare single point measurements with satellite measurements, so that the ideal validation site should be spacious and homogeneous. However, in the real world, the land surface is usually spatially heterogeneous.
Accordingly, upscaling increases uncertainty (i.e., standard deviations) with spatial extent and heterogeneity, as shown in Figure 1 [19-21]. Although a high standard deviation may decrease as the number of probes deployed in the field increases, essential limitations remain. Such high-density validation sites are limited to a few locations across the globe [22]. The upscaling function for aggregation is also difficult to determine [8].
Remote Sens. 2017, 9, 847
Thirdly, field measurements are also subject to several errors [23-26]. For example, time-domain reflectometry (TDR) measurement errors arise from the conversion of soil dielectric constant measurements to soil water content, signal noise in saline soils, or the presence of organic matter in soils [27]. Field measurements also need ancillary information or assumptions, which introduce uncertainties. Gravimetric sampling, which determines soil water weight after oven-drying, also involves reading or calibration errors.
Fourthly, there is a difference in sampling depth between ground measurements and satellite instruments. Field measurements or land surface models estimate soil moisture at a fixed depth of a soil layer [8]. In contrast, the satellite penetration depth (the soil depth that the radar or radiometer signal samples) changes. For example, for L-band instruments such as SMOS and SMAP, the theoretical penetration depth is approximately 2-5 cm [17]. However, this usually varies with several factors such as soil texture, soil moisture, and salinity [28,29]. Figure 2 shows an example in which the same microwave signal with a wavelength of 10 cm reaches a depth of 10 m in dry sandy soil but penetrates only a few cm in wet soils [5,12,30,31]. This variable sampling depth of remote sensors makes it difficult to directly compare satellite instrument measurements with field measurements, even when the latter are upscaled.
Finally, it is difficult to perform a global validation with the international soil moisture networks that connect local point measurements at different locations, due to inconsistencies in operation. Because different in-situ sensors are deployed at the various ground stations registered to these networks, inconsistencies arise when satellite retrievals are evaluated at the global scale all together. In addition, there is a mismatch between the sampling time of in-situ sensors and the satellite overpass time [3]. This difference in temporal resolution (e.g., hourly for in-situ sensors and several days for satellites) results in disagreements, particularly when a satellite measurement is taken right before a sudden rain event while in-situ sensors respond instantaneously to such meteorological events. Additionally, the international networks still contain upscaling or representativeness errors and have limited global coverage, as they cover just a few pixels out of a total on the order of a hundred thousand pixels [22].
On the other hand, downscaling, which disaggregates satellite products to a local scale, is also complicated [32,33]. Currently available downscaling techniques often employ machine learning or proxies such as land surface temperature or vegetation index for downscaling low-resolution soil moisture products. However, a linear relationship between a proxy and soil moisture is uncertain, as it is usually based on the empirical interpretation of a limited range of land surface and meteorological conditions, rather than on an actual realization of sub-pixel heterogeneity or a theoretical description [34].
Taken together, a core challenge in assessing and improving satellite soil moisture retrievals lies in differences in scale and heterogeneity, which cannot be fully resolved by upscaling or relative approaches. In addition to the RMSE retrieval goal, a more diverse perspective for defining errors directly at footprint scale is needed. Thus, to reduce footprint-scale instantaneous errors in satellite retrievals and to apply the retrievals to short-range weather prediction, we review a stochastic approach that addresses the instantaneous errors in retrieval products.

Errors and Uncertainties
For the reasons discussed in Section 1.1, it is difficult to acquire "absolute true values" of satellite soil moisture products. It is important to recognize that it is even more difficult to determine them operationally. Observations are traditionally deemed objective information, but they are actually complicated syntheses based upon subjective evaluation, several idealized assumptions, imperfect retrieval models, and erroneous auxiliary information. Thus, satellite products are affected by instrumental errors and by errors coming from operators, environmental factors, or simplified retrieval algorithms. What matters, then, is to be aware of the presence of errors across scales, to keep the quality of systems at an acceptable level, and to monitor their behavior appropriately in operations.

Errors

Systematic Errors
In metrology, measurement errors consist of systematic errors (also called "biases") and random errors (also called "noise"). In the International Organization for Standardization (ISO) International Vocabulary of basic and general terms in Metrology (VIM) 3.14 [35], systematic error is defined as the difference between the true value of the measurand and the mean that would result from an infinite number of replicated measurements of the same measurand carried out under repeatable conditions. Systematic errors can be expressed as follows:

$$\Delta x = \bar{x} - x_{ctv} \qquad (1)$$

where $\Delta x$ is the systematic error, $x_{ctv}$ is the conventional true value, and $\bar{x}$ is the arithmetic mean of $n$ repeated readings $x_i$, as follows:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad (2)$$

Systematic errors include multiplicative and additive errors. Additive errors such as offset errors do not change with the measured values. In contrast, multiplicative errors such as gain errors change linearly with the measured values, depending on the input values [35]. By error source, systematic errors can be classified into instrumental errors (e.g., calibration errors), assumption errors (e.g., model errors), environmental errors (e.g., radio frequency interference, RFI), dynamic errors (e.g., rain events or vegetation effects), or static errors (e.g., soil texture information).
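The distinction between additive and multiplicative systematic errors can be illustrated with a short sketch. It assumes the sign convention that the systematic error is the mean of the repeated readings minus the conventional true value, and it uses a hypothetical linear error model (`gain`, `offset`) with made-up readings; none of these names or numbers come from the cited standard.

```python
from statistics import mean

def systematic_error(readings, conventional_true_value):
    """Mean of repeated readings minus the conventional true value."""
    return mean(readings) - conventional_true_value

def apply_errors(true_value, gain=1.0, offset=0.0):
    """Hypothetical linear error model: gain is multiplicative, offset is additive."""
    return gain * true_value + offset

# Made-up repeated soil moisture readings (m3/m3) of one measurand
readings = [0.26, 0.24, 0.27, 0.25, 0.28]
bias = systematic_error(readings, conventional_true_value=0.22)

# An offset error is the same at every input level ...
offset_errors = [apply_errors(t, offset=0.02) - t for t in (0.1, 0.4)]
# ... while a gain error grows linearly with the input value.
gain_errors = [apply_errors(t, gain=1.1) - t for t in (0.1, 0.4)]
```

Here `offset_errors` stays at 0.02 for both inputs, whereas `gain_errors` grows from about 0.01 to about 0.04, which is why a single constant correction cannot remove a multiplicative bias.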
There are several ways to quantify systematic errors. Based upon an analysis of the measuring or retrieval processes, instrumental information (e.g., calibration error), or operational issues, partial-derivative-based error propagation can be used [36].

Random Errors
Random errors differ from systematic errors in that they have no consistent impact on the measurement. Negative and positive random errors occur equally often (i.e., they sum to zero over the error distribution), so they have no impact on the average [35]. Systematic errors can be predicted, characterized, and even corrected, while random errors cannot. Although it is not possible to eliminate random errors, their stochastic behavior can be described with a probability distribution such as the normal (Gaussian, or bell-shaped), uniform, bimodal, or Laplace distribution.

Uncertainties
Uncertainty is considered the "state of knowledge" about the system quality [36]. According to the Guide to the Expression of Uncertainty in Measurement (GUM), uncertainty is defined as a parameter characterizing the dispersion of the values attributed to the measurand. Thus, uncertainty is conceptually distinct from error, as it does not assume the presence of symbolic and ideal true values. It is evaluated by two approaches: type A is based upon statistical analysis of repeated measurements or multiple model realizations (e.g., standard deviation, variance, covariance, etc.), while type B is based on a priori knowledge (e.g., the observer's personal experience or the literature) [36]. The GUM standard uncertainty in the case of type A is computed as the standard deviation, as follows:

$$u(x) = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2} \qquad (3)$$

where $x_i$ are the $n$ repeated readings and $\bar{x}$ is their arithmetic mean.
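A type A evaluation can be sketched in a few lines. The example assumes a small set of made-up repeated readings and reports both the experimental standard deviation (with the n − 1 denominator) and, as an additional quantity commonly reported alongside it, the standard deviation of the mean; the function name is hypothetical.

```python
import math
from statistics import stdev

def type_a_uncertainty(readings):
    """Return (standard deviation, standard deviation of the mean)."""
    s = stdev(readings)  # sample standard deviation, n - 1 denominator
    return s, s / math.sqrt(len(readings))

# Made-up repeated soil moisture readings (m3/m3), for illustration only
readings = [0.26, 0.24, 0.27, 0.25, 0.28]
u_single, u_mean = type_a_uncertainty(readings)
```

No true value enters this calculation, which reflects the conceptual point made above: uncertainty describes dispersion, not a deviation from a reference.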

Current Soil Moisture Retrieval Algorithms
This section introduces how recent soil moisture retrieval algorithms deal with the land surface heterogeneity briefly discussed in Section 1.1. In contrast to previous satellite soil moisture retrievals, which use global constant values for land surface parameters or assume a linear relationship between the raw signals of the satellite instrument and the soil moisture retrievals, recent retrievals have started to retrieve geo-physical parameters in light of sub-pixel land surface heterogeneity. For example, the SMOS retrieval algorithm employs sub-pixel land cover information to select the retrieval model best suited to the dominant land cover in a given pixel [12], in contrast to previous retrievals that assume time-invariant vegetation dynamics or roughness over the land surface [6]. In addition, recent soil moisture missions such as SMOS and SMAP use sensors operated at L-band frequency, which is considered optimal for soil moisture observations [17] in view of several aspects such as spatial resolution and protected spectrum. In comparison with other frequencies, L-band better penetrates the atmosphere, clouds, and vegetation, and shows a high sensitivity to soil moisture. Thus, it is expected to detect surface soil moisture relatively well in densely vegetated areas with a Vegetation Water Content of 6 kg/m² or less [4,37]. Consequently, the recent soil moisture missions arguably have better potential to handle and characterize time-varying land surface conditions. However, there is an instrumental difference between SMOS and SMAP. In contrast to SMOS, which produces brightness temperature measurements at multiple incidence angles and full polarization over the same ground target, the SMAP radiometer uses brightness temperature at a single incidence angle.
Soil moisture retrieval algorithms are mainly classified into two groups: (1) a change detection method that assumes time-invariant land surface conditions and detects a relative change of backscattering signals between extreme conditions (e.g., European Remote-sensing Satellite (ERS)-2 (1995 to present) and Metop-A Advanced SCATterometer (ASCAT) (2006 to present) at C-band); and (2) an inversion algorithm that inverts geo-physical parameters from measurements (e.g., Advanced Microwave Scanning Radiometer-Earth Observing System (AMSR-E, 2002 to 2011) at C-, X-, and Ku-band, and SMOS (2010 to present) and SMAP (launched in January 2015) at L-band). Both passive and active microwave sensor retrievals are susceptible to Radio Frequency Interference (RFI). The soil moisture products from these instruments have their own merits and demerits. Passive microwave sensors suffer from low resolution, while active microwave sensors are more vulnerable to speckle and to geometric factors such as topography, roughness, or vegetation effects.

Change Detection Method
A change detection method reads the relative change between the highest and lowest values recorded during a limited time span and converts the normalized magnitude to soil moisture. After a change detection method was suggested for time-series SSM/I measurements, Wagner et al. [38] retrieved soil moisture by measuring the relative change of ERS-2 or Metop-A ASCAT backscattering, ranging from 0 (driest) to 100 (wettest). Their relative approach is based on their finding that a quantitative validation of soil moisture products with field measurements is, in practice, infeasible due to land surface heterogeneity, a lack of ground measurements across the globe, the variable penetration depth of remote sensors, temporal differences, and the high cost of field surveys, as discussed in Section 1.1. For this reason, they stated that a quantitative comparison directly with other variables such as rainfall and temperature data is not possible [39].
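The normalization step of a change detection method can be sketched as follows. This is a simplified illustration, not the operational ERS or ASCAT processor: the backscatter series is hypothetical, the dry and wet references are simply the minimum and maximum of that series, and clipping to the 0 to 100 range is an assumption added here.

```python
def relative_soil_moisture(sigma0, sigma0_dry, sigma0_wet):
    """Scale a backscatter value (dB) between dry and wet references to 0-100."""
    if sigma0_wet <= sigma0_dry:
        raise ValueError("wet reference must exceed dry reference")
    ms = 100.0 * (sigma0 - sigma0_dry) / (sigma0_wet - sigma0_dry)
    # Assumption: values outside the historical range are clipped
    return min(max(ms, 0.0), 100.0)

# Hypothetical backscatter time series (dB) at a common reference angle
series = [-14.0, -12.5, -11.0, -9.5, -8.0]
dry, wet = min(series), max(series)  # lowest and highest recorded values
ms = [relative_soil_moisture(s, dry, wet) for s in series]
```

Because `dry` and `wet` are taken from the record itself, a single outlier in the series shifts every retrieved value, and a pixel with no dynamics (wet equal to dry) cannot be scaled at all.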
However, a change detection method may have limitations in retrieving surface soil moisture under extreme conditions such as permanently dry or wet soils with no dynamics. When any single data point is overestimated or underestimated as an outlier, the entire estimation may be affected by the misleading upper or lower limit. This approach may incur errors arising from land surface heterogeneity, because it neither characterizes land surface parameters nor performs a detailed calculation of vegetation dynamics. Instead, it assumes time-invariant roughness and vegetation, and land surface homogeneity within the scatterometer footprint [6]. Based upon the empirical interpretation that the effects of vegetation dynamics are minimized at 40 degrees, all backscattering values are extrapolated to a common incidence angle of 40 degrees. This method also assumes a linear relationship between surface and deep soil moisture, vertical and spatial homogeneity of soil properties such as soil hydraulic conductivity, and no evapotranspiration in the water balance established for calculating the soil water index. A more detailed description is available at http://www.eumetsat.int/website/home/Data/Products/Land/index.html.

Inversion Method
Due to the significant impact of land surface variability on soil moisture retrievals, it was previously suggested that operational models need to be adjusted for time-varying and spatially heterogeneous input parameters [40]. Recently, some passive microwave retrievals have considered land surface heterogeneity using sub-pixel scale land cover information. For example, the AMSR-E Land Parameter Retrieval Model (LPRM) minimizes the mismatch between simulated and measured brightness temperature with two-channel iterations to simultaneously retrieve land surface temperature, vegetation optical depth, and the soil dielectric constant [13]. On the other hand, it uses global constant values for surface roughness, cross polarization, and single scattering albedo [36]. SMOS also uses a nonlinear iterative Bayesian approach. To retrieve soil moisture, it needs a broader range of time-varying geo-physical parameters such as vegetation optical depth, surface roughness, dielectric constant, surface temperature, and scattering albedo [12]. Geo-physical parameters are updated and adjusted by iterations to optimize a cost function. However, the SMOS retrieval algorithm does not operationally retrieve all the geo-physical parameters at the same time. For example, it always retrieves soil moisture, but retrieves optical depth or roughness only if appropriate (e.g., if the quality index is satisfied [14]). On the other hand, the SMAP soil moisture algorithm uses the alternate aggregation procedures of Zhan et al. [41] to account for vegetation heterogeneity through the vegetation water content (VWC), in addition to 1 km land cover information. SMAP takes its input parameters (e.g., surface temperature, surface roughness, vegetation optical depth, and single scattering albedo) from an external ancillary database or a land cover look-up table [37,42], instead of retrieving them.

Relative Approach (Inter-Comparison): Climatology Error
This section introduces relative approaches. Triple Collocation in Section 2.1.1 compares three different datasets for estimating uncertainty. Cumulative Distribution Function matching in Section 2.1.2 addresses bias correction prior to data assimilation. Data assimilation in Section 2.1.3 reduces random errors by comparing model estimates and satellite observations. The final analysis may be used as neutral reference data for estimating errors, which is why data assimilation is included here.

Triple Collocation (TC) Method
Triple collocation determines relative errors by directly comparing three independent estimates of the same variable, assuming that errors from different sources are not correlated [43]. It has often been applied to error characterization of satellite-retrieved soil moisture (Leroux et al., 2013; Su et al., 2014). Scipal et al. [22] first introduced this method to circumvent the limitation of upscaling discussed in Section 1.1, as it was found that upscaling errors are often larger than the satellite retrieval errors per se, and that field measurements are too scarce across the globe for the method to be applied in operational services. Thus, without using ground-based field measurements, they compared the climatology of three independent and spatially distributed soil moisture datasets from the TRMM Microwave Imager (TMI), the active microwave ERS-2 scatterometer, and ERA-Interim re-analysis data.
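The core of triple collocation can be sketched with the widely used covariance formulation, in which the error variance of one dataset is its own variance minus the product of its covariances with the other two datasets divided by the covariance between those two. The code below is an illustration under the stated independence assumption, using synthetic data with known noise levels; the helper names and the synthetic gains, offsets, and noise levels are hypothetical and are not those used in the studies cited above.

```python
import random

def covariance(a, b):
    """Sample covariance of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (n - 1)

def triple_collocation(x, y, z):
    """Error variances of x, y, z from the covariance form of TC.

    Assumes the three error series are mutually uncorrelated and
    uncorrelated with the underlying signal.
    """
    c_xy, c_xz, c_yz = covariance(x, y), covariance(x, z), covariance(y, z)
    var_ex = covariance(x, x) - c_xy * c_xz / c_yz
    var_ey = covariance(y, y) - c_xy * c_yz / c_xz
    var_ez = covariance(z, z) - c_xz * c_yz / c_xy
    return var_ex, var_ey, var_ez

def synthetic_triplet(n=20000, seed=0):
    """Hypothetical data: one shared signal plus independent noise."""
    rng = random.Random(seed)
    truth = [rng.gauss(0.25, 0.08) for _ in range(n)]
    x = [t + rng.gauss(0.0, 0.02) for t in truth]
    y = [0.9 * t + 0.03 + rng.gauss(0.0, 0.03) for t in truth]
    z = [1.1 * t - 0.02 + rng.gauss(0.0, 0.04) for t in truth]
    return x, y, z

x, y, z = synthetic_triplet()
error_std = [max(v, 0.0) ** 0.5 for v in triple_collocation(x, y, z)]
```

With independent noise, `error_std` comes out close to the prescribed 0.02, 0.03, and 0.04. If two of the synthetic datasets were instead given a shared error term, the estimates would no longer match the prescribed values, which is the sensitivity to the independence assumption that matters for real products.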
Unlike the original aim of Scipal et al. [22], some other researchers applied the TC method to the estimation of upscaling or aggregation errors by including local-scale field measurements [24,44]. However, there are critical shortcomings in such applications. To assess upscaling errors, the method should be able to estimate instantaneous (or at least seasonal) satellite retrieval biases arising nonlinearly from several dynamic factors such as vegetation or rainfall conditions. However, the TC method estimates a relative difference in stationary climatology, instead of absolute and instantaneous retrieval errors such as the RMSEs against field measurements. As it defines errors from the inconsistency among datasets, the error estimate may change if different reference data are used. The result is therefore considered a 'relative error' [45]. Thus, if the assumption that the three datasets are independent is violated, the method underestimates the actual retrieval errors. In fact, satellite data and the reference data they are compared with often have a positive error covariance. This may occur because retrieval algorithms often share several input data sources, similar cost function algorithms, or a similar error structure with the reference data (e.g., land surface models). For example, in the event of rain, both the model (if the model is used as reference data) and the satellite data are similarly prone to significant overestimation. Land surface models are affected by error-prone rainfall data, while satellite retrievals are influenced by the water film suddenly formed on the surface and the consequent change in penetration depth. For the estimation of the dielectric constant, both satellite retrievals and models require a similar type of soil property information. For example, if the clay fraction in the soil maps used for both the land surface model and the satellite retrievals is overestimated [5,46], this propagates to an overestimation of wilting point or field capacity in the land surface models and adversely affects the conversion of dielectric constant to soil moisture in the satellite retrievals, resulting in a similar overestimation of soil moisture in both the satellite retrievals and the land surface models [18,47,48].
Recently, Gruber et al. [49] introduced an extension of the existing TC. For example, they added the correlation coefficient to the TC [50], or combined the TC with the statistical analysis of filtering innovations in data assimilation [51].

CDF Matching
CDF matching has been widely used for bias correction of satellite soil moisture data [52]. Fundamentally, it matches the cumulative distribution function of the satellite data with that of a long climatological record of reference data. As it is effective in rescaling satellite data towards the model estimates, it is employed as a bias correction prior to data assimilation. However, it has some limitations as a bias correction [52]. For example, instantaneous or seasonal variations of the rescaling parameters or retrieval errors are ignored [53]. The method is based on soil moisture climatology, which does not take sub-pixel heterogeneity into account [33]. Thus, satellite observational or retrieval errors often remain even after bias correction [54].
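CDF matching can be sketched as empirical quantile mapping: each satellite value is assigned its non-exceedance probability in the satellite record and then replaced by the reference value at the same probability. The implementation below is one simple variant chosen for illustration (rank-based probabilities with linear interpolation and clamping at the record limits); the function name and the short records are hypothetical.

```python
from bisect import bisect_right

def cdf_match(values, sat_record, ref_record):
    """Map values onto the reference distribution via empirical CDFs."""
    sat_sorted = sorted(sat_record)
    ref_sorted = sorted(ref_record)
    n_sat, n_ref = len(sat_sorted), len(ref_sorted)
    matched = []
    for v in values:
        # Empirical non-exceedance probability of v in the satellite record
        p = bisect_right(sat_sorted, v) / n_sat
        # Fractional position of that probability in the reference record,
        # clamped so values outside the record map to its limits
        pos = min(max(p * n_ref - 1.0, 0.0), n_ref - 1.0)
        lo = int(pos)
        hi = min(lo + 1, n_ref - 1)
        frac = pos - lo
        matched.append(ref_sorted[lo] + frac * (ref_sorted[hi] - ref_sorted[lo]))
    return matched

# Hypothetical long-term records (m3/m3); real records span years of data
sat_record = [0.10, 0.20, 0.30, 0.40]
ref_record = [0.15, 0.25, 0.35, 0.45]
rescaled = cdf_match(sat_record, sat_record, ref_record)
```

The mapping is fixed by the two long-term records, so the same correction is applied to a given satellite value regardless of the season or event in which it was observed.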

Data Assimilation Analysis Increments
Data assimilation diagnostics such as the innovation or the analysis increment may be available from the SMAP Level 4 data processing [55]. They may be used as reference data to diagnose uncertainties in satellite retrieval products. The consistency check of Desroziers et al. [56] may be relevant. They suggested estimating the satellite observational errors from the covariance between the "analysis increment" (observation-minus-analysis) and the "background departure" (observation-minus-model). Dee [57] also attempted to attribute the satellite observational error with data assimilation. He suggested considering the "analysis increments" as the reference data to be compared with satellite data.
Data assimilation analyses have both merits and limitations for the diagnosis of uncertainty. Compared to the two relative approaches discussed in Sections 2.1.1 and 2.1.2, the data assimilation approach may be less sensitive to the integrity or choice of reference data, because data assimilation considers both observational and model errors. However, data assimilation aims to mitigate random errors, so that, in principle, it does not perform a bias correction for the satellite observational errors.
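The consistency check described above can be sketched for a scalar state. The example assumes unbiased Gaussian errors and an analysis computed with the optimal scalar gain; all error magnitudes are hypothetical. Under these assumptions, the mean product of observation-minus-analysis and observation-minus-background approximates the observation error variance.

```python
import random

def desroziers_obs_error_variance(obs, background, analysis):
    """Estimate the observation error variance as mean[(o - a) * (o - b)]."""
    products = [(o - a) * (o - b) for o, b, a in zip(obs, background, analysis)]
    return sum(products) / len(products)

rng = random.Random(2)
sigma_b, sigma_o = 0.05, 0.04                        # hypothetical error SDs
gain = sigma_b ** 2 / (sigma_b ** 2 + sigma_o ** 2)  # optimal scalar gain

obs, background, analysis = [], [], []
for _ in range(50000):
    truth = rng.uniform(0.1, 0.4)             # hypothetical true soil moisture
    b = truth + rng.gauss(0.0, sigma_b)       # model background
    o = truth + rng.gauss(0.0, sigma_o)       # satellite observation
    a = b + gain * (o - b)                    # analysis
    obs.append(o)
    background.append(b)
    analysis.append(a)

estimated_variance = desroziers_obs_error_variance(obs, background, analysis)
```

The diagnostic is only as good as the error statistics used to compute the gain, and it characterizes random errors. A systematic offset in the observations would not be separated by this check, in line with the limitation noted above.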

Stochastic Approach: Instantaneous Retrieval Errors
Retrieval error information is often provided by Quality Control (QC) flags. These include the value of the cost function reached at the end of the retrieval process, the retrieval error of each parameter product, the confidence level of the retrieved soil moisture, RFI, and node information on rain, snow, frozen soil, forest, or open water required to improve the quality of brightness temperature data [12,55,58]. In general, QC science flags are very useful and informative, but they are often not sufficient to be interpreted and used as the error information required for data assimilation. QC information only indicates how the system treats a single step among several retrieval processing steps, and does not provide integrated error information in units of soil moisture. Therefore, this section discusses footprint-scale retrieval errors, which are distinct from the upscaling errors in Section 1.1.2 and the relative climatology errors in Section 2.1. At the footprint scale, a partial-derivative (tangent space or Jacobian matrix) method may be employed to quantify retrieval errors deterministically [59]. In practice, however, there are several difficulties in implementing it operationally. First, the actual errors of satellite products are often too complicated to define, predict, or assume deterministically with a priori knowledge. For example, Parinussa et al. [36] employed a deterministically fixed single value for brightness temperature errors. It was derived from a priori knowledge based on a global and long-term average of nominal pixels, rather than on the full range of dynamic, real-world error conditions such as forest, RFI contamination, storm rainfall events, frozen soil, snow cover, flooding, or complex topography. Such trimmed values, which neglect outliers and extreme conditions, are only symbolic. They do not show real-world error dynamics, so that the error propagation may considerably underestimate the actual retrieval errors. More importantly, the error propagation of satellite retrieval algorithms is nonlinear and chaotic, so that a deterministically defined single error value is not representative of, or balanced across, the various error scenarios [60]. A slight change in a retrieval input can produce large outliers in the retrieval outcome if the input perturbation falls outside the optimal range. In contrast, no significant difference in retrieval products occurs when a perturbation of the same magnitude falls within the optimal range of the retrieval input parameters. This chaotic nature of nonlinear systematic error propagation is illustrated in Figure 3, where the outlier (black dot) deviates strongly from the other groups. If the input errors deterministically defined in a perturbation scheme are assumed too optimistically, or remain within the optimal range (for example, schemes #1, 2, and 3 in Figure 3), the retrieval errors may be largely underestimated. In fact, actual errors could be much larger, as in the outlier of scheme #4 in Figure 3.
Due to this nonlinear error propagation of satellite retrieval algorithms and the unpredictability of retrieval input errors, a deterministically defined single error value has very limited representativeness of the whole system. Thus, there is a need to take into account the probability distribution of errors affecting radar backscatter or soil emission, instead of a global constant error determined by a priori knowledge.
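The contrast between a single nominal error value and an error distribution can be sketched with a toy example. The function `toy_retrieval` is a purely hypothetical stand-in, not a real retrieval algorithm: it responds weakly to roughness inside an assumed optimal range and strongly outside it. The log-normal roughness distribution is likewise an assumption made only for illustration.

```python
import math
import random
import statistics

OPTIMAL_MAX = 1.0  # hypothetical upper bound of the optimal roughness range

def toy_retrieval(roughness):
    """Hypothetical nonlinear retrieval: nearly flat inside the optimal range, steep outside."""
    excess = max(0.0, roughness - OPTIMAL_MAX)
    return min(0.60, 0.20 + 0.01 * roughness + 0.30 * excess ** 2)

def response(center, delta=0.3):
    """Change in retrieved soil moisture for the same input perturbation delta."""
    return abs(toy_retrieval(center + delta) - toy_retrieval(center))

inside = response(0.5)    # perturbation stays within the optimal range
outside = response(1.5)   # same perturbation applied outside the optimal range

# A fixed error value taken from the nominal condition...
nominal_error = inside
# ...versus the spread obtained by sampling roughness from a PDF that
# also covers out-of-range conditions.
rng = random.Random(3)
samples = [toy_retrieval(rng.lognormvariate(math.log(0.8), 0.5)) for _ in range(5000)]
spread = statistics.stdev(samples)
```

In this toy setting the same input perturbation produces a much larger output change outside the optimal range, and the sampled spread exceeds the nominal fixed value, which mirrors the argument made above for Figure 3.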

Figure 3. Nonlinear error propagation of roughness to SAR soil moisture [61]: soil moisture retrieved from ASAR backscattering under four tabulated roughness conditions. Only scheme #4 is outside of an optimal roughness range.

The Concept and Type of Retrieval Ensembles
The simplest way to resolve the issues discussed above is to randomly repeat measurements or retrievals. We regard these repetitions as ensembles, which are defined as multiple idealizations of "virtual" copies of the state that consider various possibilities at once [62]. When ensembles are applied to retrieval algorithms, they may be called "retrieval ensembles". The main idea of retrieval ensembles is to integrate various retrievals probabilistically in order to find a more optimal and more certain estimate. The rationale is that a single retrieval model using a single input generally contains biases, and that several constraints of retrieval algorithms may be mitigated by stochastically repeated measurements or retrievals, or by appropriate integration with other retrieval products.
There are two different approaches in retrieval ensembles: (1) the deterministic approach to assemble multiple retrieval models [63]; and (2) the probabilistic approach to process various retrieval input data with the same retrieval algorithm, where the input errors are stochastically defined in a form of PDF (Probability Density Function) [48,52,64].These are also called PDF methods.
In the deterministic method, a single satellite retrieval product is obtained by combining different soil moisture retrieval products generated from various retrieval models. It assembles the different retrieval products into a single dataset with the same temporal and spatial resolution. The advantage of this approach is that the ensemble members come from various instruments and retrieval models, so that they cover a broad and diverse range of possible estimates.
In the probabilistic method, the retrieval ensembles are obtained by stochastically perturbing the inputs of a single retrieval model. For a random perturbation of inputs, it is important to attribute error sources correctly and to identify error ranges. Uncertainty quantification analysis can sometimes help with this [65,66]. This review focuses on the probabilistic approach. Details are given in Section 2.2.2.
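A minimal sketch of the probabilistic method is given below. The function `toy_retrieval` is a hypothetical placeholder for an actual retrieval algorithm, and the additive Gaussian perturbation of brightness temperature is an assumption made for simplicity; the error model and its range would need to follow the error attribution for the sensor and algorithm in question.

```python
import random

def toy_retrieval(tb_kelvin, surface_temp=300.0):
    """Hypothetical stand-in for a retrieval algorithm (not a real one)."""
    emissivity = min(1.0, max(0.0, tb_kelvin / surface_temp))
    return 1.2 * (1.0 - emissivity) ** 1.5

def generate_ensemble(tb_observed, error_sd, size=12, seed=0):
    """Run the same retrieval on stochastically perturbed copies of the input."""
    rng = random.Random(seed)
    return [
        toy_retrieval(tb_observed + rng.gauss(0.0, error_sd))
        for _ in range(size)
    ]

# Twelve members from one observed brightness temperature (values are illustrative).
members = generate_ensemble(tb_observed=250.0, error_sd=5.0)
```

Each member is one "virtual" copy of the retrieval; the input error distribution, rather than the retrieval model, is what varies between members.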
The retrieval ensemble offers several advantages over the other bias correction or error analysis methods in Section 2.1. First, the ensemble method directly assesses footprint-scale errors without involving upscaling errors. Secondly, the ensemble analysis is specific to a selected retrieval algorithm and a particular sensor. Finally, unlike the relative approaches, it is not affected by the selection of, or errors in, reference data.
As a limitation of the stochastic approach, Parinussa et al. [36] argued that the Monte Carlo approach or the use of multiple retrieval models is operationally infeasible due to the high computational cost. However, recent studies showed that a small ensemble size of 12 to 20 is sufficient to provide optimal estimates [52]. When different error sources, including land surface heterogeneity, meteorological events, satellite measurements, and parameter inversion errors, are perturbed with the same ensemble size, the resulting ensembles yield different quantitative measures of retrieval errors. In this context, appropriate error attribution and the optimization of realistic ensembles are much more important than a large ensemble size itself. It is possible to optimize the ensembles so that they function as a bias correction in a non-local approach that does not require RMSEs based on ground measurements. Statistical indices such as the Lyapunov exponent or kurtosis should be used to monitor whether a chaotic system is transformed into a stochastic system that improves structural stability and flexibly reduces nonlinear retrieval errors, and whether the ensembles follow a Gaussian distribution [67,68].
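One of the monitoring indices mentioned above, kurtosis, can be computed as follows. This sketch uses the moment-based excess kurtosis, which is close to zero for a Gaussian sample; the two synthetic samples are hypothetical and serve only to contrast a Gaussian-like ensemble with a heavy-tailed one. The Lyapunov exponent is not covered here.

```python
import random

def excess_kurtosis(values):
    """Moment-based excess kurtosis (approximately 0 for a Gaussian sample)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2) - 3.0

rng = random.Random(4)
gaussian_like = [rng.gauss(0.25, 0.04) for _ in range(20000)]  # Gaussian sample
heavy_tailed = [rng.expovariate(1.0) for _ in range(20000)]    # skewed, heavy-tailed sample

k_gauss = excess_kurtosis(gaussian_like)
k_heavy = excess_kurtosis(heavy_tailed)
```

With ensemble sizes as small as 12 to 20, a kurtosis estimate from a single footprint is noisy, so in practice the index would more plausibly be evaluated over ensembles pooled in time or space.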

Generation of Retrieval Ensembles
The retrieval ensembles depend on the sensor's satellite measurements, the retrieval input parameters, and land cover. The error sources to be considered for generating ensembles are broadly classified into three main categories, as shown in Table 1.

Instrument Measurements
Measurement errors arise from calibration errors, vegetation attenuation, the water film formed by rain events, RFI, radiometric noise, instrument errors, bandwidth, sample integration time, structural uncertainty in surface backscatter or soil emission, incidence angle interpolation errors, and others [59]. They are a very important error source, since several retrieval algorithms employ an inversion that minimizes the mismatch with measurements. If the measurements are incorrect under such a scheme, the retrieved variables are consequently incorrect as well. From the statistics of retrieval ensembles generated by various perturbation schemes, Lee et al. [18] found that satellite measurement errors are multiplicative, in contrast to errors in retrieved geo-physical parameters. Crow et al. [59] and Parinussa et al. [36] suggested 0.3 to 2.5 K for brightness temperature errors across the bandwidth. If external forcing events, complex vegetation conditions, RFI, and vertical soil heterogeneity (such as a high vertical gradient in soil layers) are included more realistically, a much larger magnitude of 20 K is suggested for brightness temperature measurement biases [18,47,54,69,73]. This error range is fundamentally different from globally averaged climatological errors estimated over nominal pixels. For backscattering errors, Mattia et al. [70] and Lee [61] suggested 0.5 to 2 dB.

Geo-Physical Parameters
This category considers the effects of errors in geo-physical input parameters or ancillary data. The key parameters include soil and surface temperature, surface roughness, optical depth, and single scattering albedo. First, surface roughness strongly propagates into soil moisture retrieval errors. It is difficult to measure directly, not only at a local point scale but also at a global scale. Even when roughness can be measured in the field, applying it to satellite data is uncertain due to scale dependency and sun-glint [75]. Thus, this parameter is often estimated by an empirical formula or by an inverse method based on the Bayesian approach [70,76]. However, those approaches are uncertain in several circumstances, such as a large vertical gradient of soil moisture or different soil textures, which propagates errors to the estimation of soil reflectivity and soil moisture [77]. Some retrieval algorithms such as the LPRM use a globally fixed value for surface roughness, neglecting surface heterogeneity and consequently producing uncertainty in soil moisture retrievals. As for the error range, Verhoest et al. [18] found that roughness-induced retrieval errors are approximately 6-10% when an active microwave SAR sensor is used [71].
Land surface temperature is also an important input parameter for soil moisture retrievals [71,78,79]. Although soil moisture retrievals are highly sensitive to it, the single-frequency SMOS and SMAP instruments do not observe surface temperature. Thus, the uncertainty arising from interpolating this variable and from using ancillary data from an external database is unavoidable.
Another error source is the optical depth. De Jeu et al. [47] reported that a sensor's sensitivity to soil moisture variations decreases, and errors increase, with vegetation optical density, as soil emission is attenuated by the canopy. This may be partly because no remotely sensed index yet provides a reasonable estimate of vegetation height. Although several retrieval algorithms relate the optical depth to LAI [5,72], the optical depth is actually more closely related to the VWC [80], which better reflects vertical characteristics. The SMAP retrieval system uses the VWC, but estimates it from the Normalized Difference Vegetation Index (NDVI) derived from visible and near-infrared reflectance of the EOS MODIS and NPP/JPSS VIIRS instruments. The NDVI barely captures the vertical characteristics of the canopy [81], and is easily saturated by low-level vegetation. For these reasons, Holmes et al. [72] found that vegetation models introduce uncertainties of up to 25 K. They also stated that the choice of auxiliary vegetation database results in large variations of 40 K in simulated brightness temperature. Crow et al. [59] also discussed that the spatial pattern in soil moisture retrievals is influenced primarily by vegetation distribution, and found that the presence of vegetation changes the simulated brightness temperature by up to 30 K at H-polarization and up to 20 K at V-polarization.
Finally, there is uncertainty in the single scattering albedo. Several retrieval algorithms use a fixed value of 0.05 to 0.06 across the globe [5,36], due to a scarcity of field measurements. In fact, however, this variable depends on several factors. Davenport et al. [82] reported that it is spatially heterogeneous, as the single scattering albedo is a function of canopy geometry and vegetation species. They discussed that even an error of 0.01 in the single scattering albedo can propagate to soil moisture retrieval errors of 0.02 to 0.1 m³/m³. For this reason, the single scattering albedo is also a potential error source to consider when generating retrieval ensembles.
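How the optical depth and the single scattering albedo enter the simulated brightness temperature can be sketched with the zero-order tau-omega radiative transfer formulation. The text above does not name the forward model, so its use here is an assumption, as are the equal soil and canopy temperatures and all parameter values.

```python
import math

def tau_omega_tb(reflectivity, omega, tau, temperature=300.0, incidence_deg=40.0):
    """Zero-order tau-omega brightness temperature (K), equal soil and canopy temperature.

    reflectivity: soil surface reflectivity
    omega:        single scattering albedo
    tau:          vegetation optical depth at nadir
    """
    # Canopy transmissivity along the slant path.
    gamma = math.exp(-tau / math.cos(math.radians(incidence_deg)))
    # Soil emission attenuated by the canopy.
    soil = temperature * (1.0 - reflectivity) * gamma
    # Canopy emission, direct plus reflected by the soil and attenuated again.
    canopy = temperature * (1.0 - omega) * (1.0 - gamma) * (1.0 + reflectivity * gamma)
    return soil + canopy

# Effect of a 0.01 change in single scattering albedo (illustrative values).
tb_low = tau_omega_tb(reflectivity=0.3, omega=0.05, tau=0.3)
tb_high = tau_omega_tb(reflectivity=0.3, omega=0.06, tau=0.3)
delta_tb = tb_low - tb_high
```

A perturbation scheme for retrieval ensembles could sample `omega` and `tau` from assumed distributions and pass the resulting brightness temperatures through the inversion, rather than holding them at globally fixed values.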

Sub-Pixel Land Cover and Soil Map
Retrieval errors can arise from the assumption of uniform pixels. Vegetation, meteorological activity, topography, and soil properties are spatially heterogeneous at low resolution [33,40,83], although satellite retrievals treat each pixel as a uniform entity (the SMOS algorithm considers sub-pixel land cover heterogeneity at 4 km by 4 km, but eventually aggregates it to a uniform 25 km by 25 km pixel). Crow et al. [59] suggested that aggregation errors arise from the nonlinearity between soil moisture retrievals and the land surface, and from the inability of space-borne sensors to capture the net impact of sub-pixel land surface conditions. Zhang et al. [21] also demonstrated that soil moisture retrieval errors increase with the degree of land surface heterogeneity. Lee et al. [18] demonstrated that sub-pixel land cover misclassification propagates to soil moisture retrieval errors. Leroux et al. [45] also reported, after integrating several years of SMOS radiometer data, that sub-pixel soil texture and land cover information (e.g., the presence of forest in the field of view) are important error sources. Draper et al. [84] suggested that complex topography limits the soil moisture retrieval skill of both passive and active microwave sensors. Panciera et al. [85] also stated that neglecting sub-pixel vegetation heterogeneity leads to significant soil moisture retrieval errors. Thus, sub-pixel heterogeneity may result in retrieval errors if not appropriately accounted for.
In particular, uncertainty in the high-resolution soil map is also a factor to consider. The soil reflectivity required to simulate brightness temperature is a function of the dielectric constant, which is formulated with soil moisture, soil bulk density, soil particle density, or wilting point [86,87]. Most operational systems determine such soil properties from a soil texture map [5,38,55,88]. However, it is difficult to estimate spatially and vertically heterogeneous and dynamic soil properties from a soil texture map, due to the nonlinear relationship between soil texture and soil properties [89-91].
The propagation of the error sources discussed above in Section 2.2.2 depends on the type of remote sensor, wavelength, climatology, and retrieval algorithm used. Every satellite product has its own error characteristics. For example, L-band is designed to be less sensitive to geo-physical conditions such as soil roughness or vegetation, while active microwave sensors at higher frequencies are more sensitive to them [60]. If the retrieval algorithm uses a decision tree to select the retrieval model that represents the dominant sub-pixel land cover, then the quality of sub-pixel land surface heterogeneity information will considerably affect the final retrievals, in contrast to retrieval algorithms that assume time-invariant or uniform sub-pixel land surface conditions. Therefore, it is suggested that the perturbation regimes for generating retrieval ensembles be determined empirically by a sensitivity analysis of the retrieval algorithm to be used.

Bias Correction
For an effective bias correction, it is important to optimize the retrieval ensembles so that they realistically represent systematic errors, through the accurate error attribution described in Section 2.2.2 and statistical monitoring of stochastic evolution [18,66,68]. Based on the assumption of a Gaussian distribution, the empirical mean (hereafter called the "ensemble mean") of the retrieval ensembles is used to determine a single optimal estimate. The arithmetic mean is a very good representative of the whole system (central limit theorem, Laplace 1749-1827). The normal error distribution in fact originates from the astronomical observations of Galileo Galilei in the 17th century. It was later developed into Gauss' normal law of errors, which suggests that the arithmetic mean affords the optimal value to adhere to [92-94]. The universal tendency towards a Gaussian distribution is widely found, from the Ensemble Kalman filter [95] to biological systems and price fluctuations in the stock market. It improves structural stability and chaotic harmony by satisfying universality (e.g., the second law of thermodynamics for entropy [68]), as well as representativeness by embracing various possibilities as a whole [96,97].
In metrological terms, the optimal value is approximated by taking the empirical mean of the retrieval ensembles, as follows:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad (4)$$
where $\bar{x}$ is the ensemble mean, $n$ is the ensemble size, and $x_i$ is the $i$th ensemble member. Equation (4) corresponds to the ISO VIM definition of measurement discussed in Section 1.2.1: the average of repeated measurements in Equation (1) corresponds to the ensemble mean in Equation (4), and the conventional true value in Equation (1) corresponds to the original satellite end product. Therefore, by the ISO VIM definition, the systematic error is the difference between the retrieval ensemble mean and the original satellite end product.
Several studies reported that satellite measurements should be retrieved stochastically due to nonlinear retrieval errors and complexities. Hossain and Anagnostou [98] employed retrieval ensembles for the optimal utilization of satellite rainfall data. They produced a probabilistic (ensemble) representation of satellite rainfall products by specifying the stochastic error structure of rainfall retrievals. Zhao et al. [99] successfully applied this approach to cloud retrieval products, and produced optimal estimates by assembling several cloud retrieval models (available at http://www.arm.gov/data/eval/49). In order to reduce biases in temperature retrievals, Zhang et al. [66] developed an "ensemble retrieval" methodology for atmospheric profiles from the Atmospheric InfraRed Sounder (AIRS). They perturbed the temperature eigenvectors, and successfully reduced retrieval errors.
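The ensemble mean of Equation (4) and the resulting systematic error can be written out directly. The twelve member values and the original product value below are hypothetical, and the sign convention (ensemble mean minus original product) is an assumption that follows the wording above.

```python
def ensemble_mean(members):
    """Arithmetic mean of the retrieval ensemble members."""
    return sum(members) / len(members)

def systematic_error(members, original_product):
    """Difference between the ensemble mean and the original satellite end product."""
    return ensemble_mean(members) - original_product

# Twelve hypothetical soil moisture retrievals (m3/m3) for one footprint.
members = [0.12, 0.15, 0.11, 0.14, 0.13, 0.16,
           0.12, 0.14, 0.13, 0.15, 0.11, 0.16]
original = 0.20  # hypothetical original satellite product with a wet bias

corrected = ensemble_mean(members)
bias = systematic_error(members, original)
```

Here the ensemble mean serves as the bias-corrected estimate, and the negative difference indicates that the original product is wetter than the ensemble mean.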
This stochastic approach has also been applied to reduce errors in soil moisture retrievals. Li et al. [100] employed retrieval ensembles to reduce the uncertainties caused by observation errors, parameter uncertainties, and the inversion method. They increased the number of brightness temperature observations using a multi-angle and dual-polarized radiometer. Lu and Gong [101] also found that ensemble data provide more realistic soil moisture information than a deterministic single product. De Keyser et al. [102] attempted to stochastically reduce roughness errors when retrieving Synthetic Aperture Radar (SAR) soil moisture with the Integral Equation Model (IEM). They applied a Monte Carlo method to estimate the correlation length, and achieved an RMSE of approximately 3.5 vol % from the median of the randomly produced soil moisture products, as shown in Figure 4. Oh et al. [103] used an ensemble-averaged differential Mueller matrix for microwave backscattering, derived from the PDFs of the co-polarized phase angle and backscattering coefficients. Kim et al. [104] found that several model realizations and repeated measurements reduce radar observational errors arising from speckle. Merlin et al. [33] also employed ensembles to downscale SMOS soil moisture products to 1 km resolution.
Remote Sens. 2017, 9, 847 14 of 23
Figure 5 shows that the retrieval ensemble mean corrected the footprint-scale wet biases of SMOS soil moisture in a semi-arid region [52]. Ensembles were generated from a random perturbation of brightness temperature with an ensemble size of 12. The error range was determined so as to fully include all the dynamic and extreme errors arising from rain events, vegetation attenuation, geo-physical parameters, or RFI discussed in Section 2.2.2. Among several ensemble generation schemes, brightness temperature ensembles were selected because the resulting ensemble mean appropriately reduced the errors already known from a priori knowledge of the retrieval error structure [18]. It is also possible to integrate ensembles in time or space in order to enhance their utilization at a reduced computational cost. In addition to wet biases in West Africa, dry biases at the Little Washita Watershed site in Oklahoma were also resolved by a time integration of ensembles [105]. On the other hand, CDF matching in Figure 5 increased the RMSEs of the original SMOS soil moisture, as it shifted the original SMOS soil moisture towards reference data subject to their own intrinsic errors (i.e., the lower limit of soil moisture is set at the wilting point for calculating bare soil evaporation, so that the model did not appropriately simulate soils in extremely dry conditions, drier than the wilting point [106]). The same tendency appears in Figure 6, where CDF matching overestimated the SMOS soil moisture, similarly to Figure 5.

Estimation of Uncertainty
Retrieval ensembles can also be used to estimate uncertainty by analyzing a standard deviation of retrieval ensembles [66, 95,[107][108][109][110][111].Ensemble spread (SD) is expressed as follows: Equation ( 5) follows the same denotation with Equation (4), and is well in line with the definition of GUM uncertainty at Equation (3) in Section 1.2.2,where the standard deviation corresponds with the ensemble spread.
Several previous studies have successfully applied retrieval ensembles for assessing uncertainty in various satellite retrievals such as carbon dioxide, cloud, and rainfall retrievals.Reuter et al. [112] previously generated the retrieval ensembles from seven different retrieval algorithms for carbon dioxide concentrations, and successfully estimated retrieval uncertainty from the ensemble spread.Olson et al. [113] propagated the random errors in passive microwave radiometer observations to measure rainfall errors, and successfully quantified the footprint scale rainfall errors at low resolutions.Zhao et al. [110] also employed the cloud retrieval ensembles by perturbing the influential factors from retrieval inputs, assumptions and regression parameters, and successfully

Estimation of Uncertainty
Retrieval ensembles can also be used to estimate uncertainty by analyzing a standard deviation of retrieval ensembles [66, 95,[107][108][109][110][111].Ensemble spread (SD) is expressed as follows: Equation ( 5) follows the same denotation with Equation (4), and is well in line with the definition of GUM uncertainty at Equation (3) in Section 1.2.2,where the standard deviation corresponds with the ensemble spread.
Several previous studies have successfully applied retrieval ensembles for assessing uncertainty in various satellite retrievals such as carbon dioxide, cloud, and rainfall retrievals.Reuter et al. [112] previously generated the retrieval ensembles from seven different retrieval algorithms for carbon dioxide concentrations, and successfully estimated retrieval uncertainty from the ensemble spread.Olson et al. [113] propagated the random errors in passive microwave radiometer observations to measure rainfall errors, and successfully quantified the footprint scale rainfall errors at low resolutions.

Zhao et al.
[110] also employed the cloud retrieval ensembles by perturbing the influential factors from retrieval inputs, assumptions and regression parameters, and successfully estimated the retrieval uncertainty in cloud retrieval products.By doing so, they suggested the error attribution factors for various retrieval variables, and showed that the ensemble spread well exhibits realistic retrieval errors.
This stochastic method has been also applied to the estimation of uncertainty in soil moisture products.De Keyser et al. [102] provided an estimation of SAR retrieval uncertainty by propagating a probability distribution of roughness parameters via IEM model.Kim et al. [104] evaluated soil moisture retrieval accuracy by estimating the impact of radar measurement noise with Monte Carlo Simulation.They showed that the error propagation is different by a type of vegetation.They considered the noise arising from speckle or geo-physical uncertainties.Verhoest et al. [114] modeled roughness parameters with a probability distribution to assess uncertainty in soil moisture retrievals from ERS synthetic aperture radar backscattering.Lee et al. [18] showed in Figures 3 and 7 that the spread of SAR and SMOS soil moisture retrieval ensembles varies by error attributions and a range of input errors to be defined.Thus, before concluding uncertainty information, there is need to evaluate whether the nature of ensembles becomes stochastic, instead of a chaotic system.

Discussion
Because unbiased Root Mean Square Error (RMSE) goals of 0.04 m³/m³ are imposed on SMOS and SMAP retrieval quality, the error characterization or validation of satellite data is often carried out by upscaling ground measurements. However, reproducing footprint-scale land surface heterogeneity with point measurements is not trivial [8]. In this context, this review discusses the limitations of the RMSE retrieval goal in terms of differing governing factors, the limited global coverage of high-density validation sites, field measurement errors, and variable penetration depth. It also notes that relative approaches such as Triple Collocation (TC) or cumulative distribution function (CDF) matching, which rely on a relative comparison with other datasets, are designed for climatologically stationary errors at a time scale of years [45] rather than for resolving sub-pixel heterogeneity [33].
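For readers less familiar with the unbiased RMSE, the sketch below shows how it relates to the bias and the ordinary RMSE, using the standard decomposition RMSE² = bias² + ubRMSE². The function name `error_metrics` and the sample values are hypothetical, and the reference series stands in for upscaled ground measurements.

```python
import numpy as np

def error_metrics(retrieved, reference):
    """Return (bias, RMSE, unbiased RMSE) of a retrieval against a reference."""
    retrieved = np.asarray(retrieved, dtype=float)
    reference = np.asarray(reference, dtype=float)
    diff = retrieved - reference
    bias = diff.mean()
    rmse = np.sqrt(np.mean(diff ** 2))
    # Removing the mean difference first leaves only the random component.
    ubrmse = np.sqrt(np.mean((diff - bias) ** 2))
    return bias, rmse, ubrmse

# Hypothetical soil moisture values in m3/m3.
reference = np.array([0.10, 0.20, 0.30, 0.25])
retrieved = np.array([0.12, 0.25, 0.28, 0.31])
bias, rmse, ubrmse = error_metrics(retrieved, reference)
print(bias, rmse, ubrmse)
```

A retrieval with a large constant offset can therefore meet an unbiased RMSE goal while still being biased, which is why bias correction is treated as a separate problem.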

For short-range weather prediction, such as storms or floods at hour-to-day time scales and footprint spatial scale, we review a stochastic method for reducing instantaneous retrieval errors at the footprint scale [52]. However, the stochastic approach also has limitations. Although it can mitigate retrieval errors arising from the input parameters, sub-pixel land cover, or satellite measurements listed in Table 1, it does not improve the Radiative Transfer Model itself. In addition, if the error factors listed in Table 1 are defined optimistically, the stochastic approach may underestimate the true retrieval bias. By contrast, overestimating the input errors in Table 1 does not lead to an overestimated retrieval bias: because retrieval algorithms do not propagate such overestimated errors, a stochastic approach may perform better when large errors are assumed. For a successful application, a reasonable error attribution and the stochasticity of the ensembles should be established.
A stochastic approach is promising for global application for several reasons. The random system improves the structural stability of the retrieval algorithm by embracing several possibilities; it is flexible enough to accommodate land surface heterogeneity at a global scale; it transforms nonlinear errors in extreme conditions into stochastic stability; it gains representativeness through repeated measurements or retrievals; and, unlike the international soil moisture networks [3], it offers global coverage and operational consistency.

Summary and Conclusions
This review discusses various error estimation and reduction methods. As a standard method for validating satellite soil moisture products, we first introduce the upscaling of point measurements made at the local scale. However, because of the scarcity of core validation sites at the global scale, the under-sampling of land surface heterogeneity, and the discrepancy in scale, the TC method has been suggested. It is effective for estimating relative errors and various possibilities at a global scale. However, such a relative approach may underestimate uncertainty when the three datasets have positive error covariance. The final analysis of data assimilation finds an intermediate state between model estimates and observations, so that optimal values can be obtained by filtering out random errors in the datasets. However, as data assimilation in theory addresses random errors, satellite observations need to be bias-corrected prior to data assimilation. For this purpose, CDF matching has been suggested. It is effective for rescaling the satellite observations with respect to reference data. However, it does not estimate the instantaneous retrieval errors defined by the SMOS or SMAP RMSE goals, or the retrieval error dynamics required for short-range weather prediction.
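To make the rescaling step concrete, the following is a minimal sketch of empirical CDF matching, assuming that the satellite and reference records cover the same period. The function name `cdf_match` and the plotting-position choice (rank + 0.5)/n are assumptions of this example; operational implementations often fit piecewise-linear or polynomial mappings instead.

```python
import numpy as np

def cdf_match(satellite, reference):
    """Rescale satellite values so their empirical CDF matches the reference.

    Each satellite value is mapped to the reference value that has the same
    non-exceedance probability within its own record.
    """
    sat = np.asarray(satellite, dtype=float)
    ref_sorted = np.sort(np.asarray(reference, dtype=float))
    # 0-based rank of every satellite value within the satellite record.
    ranks = np.argsort(np.argsort(sat))
    sat_prob = (ranks + 0.5) / sat.size
    ref_prob = (np.arange(ref_sorted.size) + 0.5) / ref_sorted.size
    # Interpolate the reference quantile function at the satellite probabilities.
    return np.interp(sat_prob, ref_prob, ref_sorted)

# Hypothetical records: the satellite series is a compressed, offset
# version of the reference series.
rng = np.random.default_rng(0)
reference = rng.uniform(0.05, 0.45, 200)
satellite = 0.6 * reference + 0.1
matched = cdf_match(satellite, reference)
print(matched.mean(), reference.mean())
```

Because the mapping is built from the whole record, it corrects the long-term distribution of the data rather than the error of any individual retrieval, which is consistent with the limitation noted above.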
Thus, we review a stochastic approach that uses the ensemble method to resolve the complexity of retrieval errors arising from the sub-pixel land cover map, geo-physical parameters, and measurements. Because retrieval errors are nonlinear and chaotic in nature, a stochastic approach that repeats measurements or retrieval model realizations under various conditions is shown to be more structurally stable and more flexible to nonlinear errors and outliers than a deterministic approach. It is therefore considered effective for global estimation, where land surface heterogeneity and scale remain a challenge.

Figure 2. Penetration depth of microwaves and radio waves in various types of soil [31].

Figure 3. Nonlinear error propagation of roughness to SAR soil moisture [61]: soil moisture products retrieved from ASAR backscattering differ under the four roughness conditions indicated in Table. Only scheme #4 is outside the optimal roughness range.

The local point-scale comparison in Figure 5 is further extended to a spatial distribution in Figure 6, where CDF matching again overestimated the SMOS soil moisture, as in Figure 5.

Figure 6. Spatial distribution of surface soil moisture, m³/m³ (red is wet, while blue is dry): (a) SMOS product; (b) CDF matching; (c) ensemble method; and (d) difference between before and after bias correction [52].

Figure 7. Different retrieval ensembles by different perturbation regimes: GH stands for a random perturbation of geo-physical parameters; TB for brightness temperature; and FR for land cover fraction (or sub-pixel land surface information) [18].

Table 1. Error sources to consider for ensemble generation.