Comparing the Assimilation of SMOS Brightness Temperatures and Soil Moisture Products on Hydrological Simulation in the Canadian Land Surface Scheme

Abstract: Soil moisture is a key variable used to describe water and energy exchanges at the land surface/atmosphere interface. Therefore, there is widespread interest in the use of soil moisture retrievals from passive microwave satellites. In the assimilation of satellite soil moisture data into land surface models, two approaches are commonly used. In the first approach brightness temperature (TB) data are assimilated, while in the second approach retrieved soil moisture (SM) data from the satellite are assimilated. However, there is not a significant body of literature comparing the differences between these two approaches, and it is not known whether there is any advantage in using a particular approach over the other. In this study, TB and SM L2 retrieval products from the Soil Moisture and Ocean Salinity (SMOS) satellite are assimilated into the Canadian Land Surface Scheme (CLASS), for improved soil moisture estimation over an agricultural region in Saskatchewan. CLASS is the land surface component of the Canadian Earth System Model (CESM), and the Canadian Seasonal and Interannual Prediction System (CanSIPS). Our results indicated that assimilating the SMOS products improved the soil moisture simulation skill of the CLASS. Near surface soil moisture assimilation also resulted in improved forecasts of root zone soil moisture (RZSM) values. Although both techniques resulted in improved forecasts of RZSM, assimilation of TB resulted in the superior estimates.


Introduction
Soil moisture (SM) is one of the most important variables affecting the land surface water and energy budgets. It is one of the major components that impacts the terrestrial water, energy and biogeochemical cycles by constraining the evapotranspiration from land [1][2][3]. It is involved in a series of feedback processes at different scales from local to global and is one of the key initial condition variables for climate and weather prediction models [4]. It also impacts the ground water recharge and river outflow by controlling the division of rainfall into infiltration, percolation and runoff in a basin [5,6]. Soil moisture has further impacts on air temperature and atmospheric stability by in measuring the SM in a wide range of vegetation covers. The study also showed that the skill of the assimilated product is better than the SMOS observation and the open-loop skill of the model alone. Similarly, Zhao et al. [43] showed that assimilating the SMOS level 2 (L2) product into a land surface model over the central Tibetan Plateau resulted in an estimate superior to the cases where open-loop land surface modeling or remote sensing derived SM alone is used. Ridler et al. [5] found that assimilating SMOS retrieved SM product in a fully integrated hydrological and soil-vegetation-atmosphere transfer model in Western Denmark improved the SM correlations in the surface layer and root zone at 25 cm. One of the potential disadvantages of assimilating the retrieved SM product over the TB is the unknown errors arising from the inversion algorithm used for retrieving the SM from the TB. The errors in the input variables and discontinuity in the derivative of order zero or higher in the inversion algorithm can lead to a poor solution [44]. In the case of direct TB assimilation, an RTM or backscatter model is coupled with the land surface model. 
The RTM works as a forward modeling platform, which translates the SM simulated by the land surface model into the observed variable, brightness temperature. In data assimilation terminology, the function that maps the model state variables from the model space to the observation space is known as the observation operator [36,45,46]. Atmospheric RTMs play an important role in satellite data assimilation for NWP applications because satellites measure radiances and do not directly observe geophysical variables such as humidity, temperature or cloud properties [47]. De Lannoy and Reichle (2016) [38] assimilated both the SMOS TB and SM retrievals into the Goddard Earth Observing System Model, version 5 (GEOS-5). They found similar domain-average skill metrics for both assimilation experiments but varying skill levels locally. Recently, another approach has also been investigated: assimilating Neural Network (NN)-based SM retrievals [48][49][50]. In this approach, the NN is trained using the TB observed by the satellite as input and the SM simulated by the land surface model as the reference for the training. One advantage of this method is that no further bias correction is needed. This NN approach has been used for both SMOS [50] and SMAP [48,49] TB data assimilation.
Since the advent of microwave remote sensing, the assimilation and validation of SM data, whether retrieved SM or raw TB, has been an active area of research. However, there is not a significant body of literature comparing the differences between these two approaches. Further, few studies have assimilated SM into the Canadian Land Surface Scheme (CLASS) model, the land surface component of the Canadian Seasonal and Interannual Prediction System (CanSIPS) of Environment and Climate Change Canada (ECCC).
In this paper, we present our investigation into (i) the difference between assimilating the SMOS TB and the SMOS SM retrieval into CLASS, and whether there is any particular advantage in using one method over the other, and (ii) how the assimilation of top layer SM affects the SM estimation at the root zone, which has significant impacts on the evapotranspiration rate, especially in vegetated regions. This paper is organized as follows. Section 2.1 describes the models used in this study. Section 2.2 introduces the SMOS data, the study site and in situ data. Section 2.3 gives a brief presentation of the EnKF. Section 2.4 describes the data assimilation experiment set-up. The results and discussion are presented in Section 3 and Section 4, respectively, and Section 5 holds the conclusion.

The Canadian Land Surface Scheme (CLASS)
The land surface model used in this study was the Canadian Land Surface Scheme, CLASS, the land surface component of the Canadian Global Climate Model or GCM [51,52]. CLASS simulates energy and water balance of the land surface forward in time from an initial condition forced by atmospheric data. CLASS can be run in offline mode or can be coupled with an atmospheric model. The prognostic variables that must be initialized at the beginning of the CLASS simulation include the frozen and liquid water content and the temperature of each of the soil layers; mass, temperature, density and albedo of the snow pack if present; the temperature and amount of intercepted rain and snow on the vegetation canopy; the temperature and depth of ponded water on the soil surface; an empirical vegetation growth index [40]. We also needed to specify the soil and vegetation parameters of the area being simulated. The hydrological and thermal properties as well as the albedo of the soil vary according to the texture and moisture content of the soil [51,52].
In this study, we used version 3.6 of the CLASS standalone driver. The soil column was divided into three layers with bottom boundaries at 0.05, 0.20 and 4.10 m, respectively, from the soil surface. The top layer boundary was set to 0.05 m in order to be consistent with the depth of the SMOS satellite SM measurement. Verseghy [53] provides extensive details regarding CLASS components and model physics. Energy and moisture fluxes were calculated at the top and bottom of each of the three layers in CLASS. The prognostic variables, soil water content and temperature at each layer, were advanced in time in accordance with the calculated energy and moisture fluxes. The moisture fluxes were modeled using the Green-Ampt theory in the case of infiltration, while Darcy's method was used in the case of water transfer between the soil layers and drainage [54,55].
The atmospheric forcing data (Table 1) required to run CLASS include incoming shortwave and longwave radiation, precipitation, air temperature, specific humidity, wind speed and atmospheric pressure [55]. The meteorological forcing data we used were from the North American Regional Reanalysis (NARR) [56]. We interpolated the three-hourly meteorological forcing data to 15 min intervals, matching the 15 min time step at which CLASS was run. The parameters used to run CLASS include maximum leaf area index, roughness length, above ground biomass density, rooting depth and minimum stomatal resistance. Values assigned to these parameters were chosen according to the observations and are summarized in Tables 2 and 3.

The Community Microwave Emission Model (CMEM)
The Community Microwave Emission Model (CMEM) [57,58] was developed by the ECMWF to simulate the low frequency passive microwave brightness temperatures (from 1 to 20 GHz) of the surface as seen by passive microwave sensors included on a number of earth observation satellites. We used CMEM version 5.1 in our study. Inputs to CMEM include SM and temperatures of the different soil layers, surface temperature, leaf area index, and vegetation and soil parameters. The parameters and models used in CMEM are summarized in Table 4. The CMEM model physics follows the parameterization based on the Land Surface Microwave Emission Model [59] and L-Band Microwave Emission of the Biosphere (L-MEB) [60]. The high modularity of CMEM allows for consideration of different parameterizations for soil dielectric constant, effective temperature, soil roughness, vegetation opacity and atmospheric contribution [61]. The vegetation opacity model of Wigneron et al. [60] was used in this study. In the SMOS TB assimilation, CMEM was used as the observation operator or observation model, which maps the model state variables from the model space to the observation space [45,46]. One reason for using an observation operator is that satellite measurements, such as radiance or brightness temperature, are only indirectly related to the land surface variables of interest, and the observation operator facilitates the conversion of model simulated variables to the satellite observed variables [45]. Using the SM inputs simulated by CLASS, CMEM can predict a brightness temperature that can be compared to the satellite observed brightness temperature at the same wavelength.

SMOS
The Soil Moisture and Ocean Salinity (SMOS) satellite was launched by the ESA to collect global surface SM from approximately 0-5 cm depth of soil [21,66]. The Microwave Imaging Radiometer with Aperture Synthesis (MIRAS) onboard SMOS provides multi-angular, bi-polarized brightness temperatures (TB) at L-band. It has an average spatial resolution of 43 km with a repeat cycle of less than three days. The target accuracy of the mission was ±0.04 m³·m⁻³ [67]. For the direct TB assimilation we used the SMOS brightness temperature (MIR-SCLF1C product version 620) at both horizontal (H) and vertical (V) polarization. For SM retrieval assimilation we used the SMOS SM retrieval L2 product (MIR-SMUDP2 product version 620). Both the SMOS TB and L2 products are available to download from the ESA website (https://smos-diss.eo.esa.int/oads/access/collection/SMOS_Open/tree). In our experiment, we used only the SM retrieval data that fell within the realistic range of 0.02-0.6 m³·m⁻³. The brightness temperature observations were retained only when they fell within the range of 100-320 K. The SM product was retrieved from the TB using the L-MEB forward algorithm [60], which is an iterative algorithm that matches the surface emission observed by SMOS to the modeled L-band emission of the surface [68,69]. We used the TB data at an angle of approximately 40°. The L2 SM was obtained by minimizing a cost function between the SMOS observed TB and the simulated L-MEB TB. The forward model requires parameters such as SM, land surface temperature, land cover information, leaf area index and soil properties [5,67].
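The range checks described above amount to a simple mask over the observation arrays. A minimal sketch in Python, assuming the SMOS values have already been read into NumPy arrays; the function name, the array names and the sample values are illustrative and not from the study, and treating the bounds as inclusive is also an assumption.

```python
import numpy as np

# Acceptance ranges quoted in the text
SM_MIN, SM_MAX = 0.02, 0.6      # m3/m3
TB_MIN, TB_MAX = 100.0, 320.0   # K

def mask_valid(values, lower, upper):
    """Boolean mask that is True where lower <= value <= upper.

    NaN (missing) entries compare as False and are therefore rejected.
    """
    values = np.asarray(values, dtype=float)
    return (values >= lower) & (values <= upper)

# Illustrative values only (not SMOS data)
sm_l2 = np.array([0.01, 0.15, 0.32, 0.75, np.nan])
tb_h = np.array([95.0, 230.5, 260.1, 321.0, 250.0])

sm_ok = sm_l2[mask_valid(sm_l2, SM_MIN, SM_MAX)]
tb_ok = tb_h[mask_valid(tb_h, TB_MIN, TB_MAX)]
```

Only observations that pass the mask would then be handed to the bias correction and the filter.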

In Situ Data (Study Sites and Ground Data Measurements)
This study was applied over the Brightwater Creek watershed in Saskatchewan, Canada (Figure 1), using an in situ SM network run by the University of Guelph and Environment and Climate Change Canada. The sensors in the network were distributed over a 40 by 40 km study domain situated within an agricultural region [70]. The region is characterized by agricultural activities, with typical crops including wheat, chick peas and canola. The network was designed to capture SM variability within the footprint of a passive microwave radiometer and has been used previously for the validation of SMOS SM retrievals [71]. In situ SM measurements were recorded using Stevens Hydra Probe II SDI-12 sensors. All soil moisture monitoring stations have a minimum of three Stevens Hydra Probe sensors installed horizontally in the soil profile at 5, 20 and 50 cm depths. The physical measurements obtained by these probes are described in [72]; briefly, the probes use an impedance-based approach to characterize the dielectric properties of the soil using a radio frequency at 50 MHz. The reflected frequency was measured along four 5.7 cm tines extending from a 3.4 cm diameter sensor head [72]. Using previously established calibration equations [73], the measured real dielectric was converted to soil moisture. The Stevens Hydra Probe SM sensors used in this study were calibrated using the infiltration wet-up and dry-down laboratory calibration procedures of Burns et al. [73]. The calibration utilized soil samples collected from the study site at Brightwater Creek watershed, in Saskatchewan, Canada. The soil samples were collected from depths of 5, 20 and 50 cm and were of different textural compositions. The calibration experiments found that the RMSE values of the sensors were <0.019 m³·m⁻³, which is well below the SMOS targeted accuracy of ±0.04 m³·m⁻³. More details about the sensor calibration can be found in Burns et al. [73].
The network contains 36 stations and previous research by Rowlandson et al. [74] demonstrated that the mean of the network was appropriate for estimating the SM average within the passive microwave footprint. More details about the network can be seen in Tetlock et al. [70] and Burns et al. [75].

Ensemble Kalman Filter (EnKF)
Advanced Kalman based filters have been widely used in geophysical data assimilation in recent years [76][77][78][79]. The Kalman filter [80] provides the best linear unbiased estimate of the system considering the past measurements and dynamics of the fluid system under consideration. The nonlinearity and large number of degrees of freedom limit the application of the standard Kalman filter to numerical weather prediction and climate prediction models [78]. In the extended Kalman filter (EKF), the standard Kalman filter equations are applied after linearizing the nonlinear models, but deriving the Jacobian or Tangent Linear Model (TLM) needed for the linearization is a computationally expensive process. The EKF also neglects the contribution from higher order moments in calculating the error statistics [78,81,82]. The Ensemble Kalman filter (EnKF), introduced by Evensen [81], estimates the error statistics from ensembles integrated through the nonlinear models [83], so there is no need to linearize the nonlinear models.
The main concept behind the derivation of the EnKF is forecasting the error statistics using Monte Carlo methods or ensemble integration. Thus, if the forecast model is interpreted as a stochastic differential equation, the forecast-error statistics can be approximated using ensemble integration [78,81]. The EnKF needs no closure approximation for calculating the error covariance, whereas in the EKF a closure approximation is applied by neglecting contributions from higher-order statistical moments in the error covariance evolution equation. Therefore, in the EnKF the error covariance matrices can be calculated by integrating the ensemble of model states in time via the fully nonlinear model equations. Thus, all the statistical information about the predicted model state that is required at the time of analysis, such as the mean and covariances, is obtained from the ensembles [81,82]. The EnKF and its derivatives have been popular in atmospheric and oceanic data assimilation because of their algorithmic simplicity, flow dependent error structure, ease of implementation and comparatively lower computational cost [78].
Consider that $X_t$ is the state of a dynamical model at time $t$ and $Y_t$ is the observation at time $t$. The state space equations of the dynamical system can be expressed as

$$X_t = f(X_{t-1}) + q_{t-1},$$
$$Y_t = h(X_t) + r_t,$$

where $f(\cdot)$ is the nonlinear function which takes state $X_{t-1}$ to $X_t$ and $q_{t-1}$ is the random model error following a Gaussian distribution with zero mean and covariance $Q_{t-1}$, $h(\cdot)$ is the observation or measurement function and $r_t$ is the observation noise, which is also Gaussian with mean zero and covariance $R_t$. In the EnKF, each ensemble member is propagated forward in time through the nonlinear model and corrected whenever new measurements are available [78] and the cycle is repeated. The state update equations are given by,
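As a point of reference, the standard perturbed-observation form of the EnKF update can be written with the symbols defined above. This is the textbook formulation and is given here as an assumption; the variant used in the study may differ in detail. The superscripts $f$ and $a$ (forecast and analysis), the member index $i$, the ensemble size $N$, the gain $K_t$, the perturbed observation $Y_{t,i}$ and the ensemble covariances $C_{xy}$ and $C_{yy}$ are notation introduced for this sketch.

```latex
\begin{align}
X_{t,i}^{f} &= f\!\left(X_{t-1,i}^{a}\right) + q_{t-1,i}, \qquad i = 1, \dots, N, \\
X_{t,i}^{a} &= X_{t,i}^{f} + K_t \left[ Y_{t,i} - h\!\left(X_{t,i}^{f}\right) \right],
\qquad Y_{t,i} = Y_t + r_{t,i}, \quad r_{t,i} \sim \mathcal{N}(0, R_t), \\
K_t &= C_{xy} \left( C_{yy} + R_t \right)^{-1}, \\
C_{xy} &= \frac{1}{N-1} \sum_{i=1}^{N}
\left( X_{t,i}^{f} - \bar{X}_{t}^{f} \right)
\left( h(X_{t,i}^{f}) - \overline{h(X_{t}^{f})} \right)^{\mathsf{T}}, \\
C_{yy} &= \frac{1}{N-1} \sum_{i=1}^{N}
\left( h(X_{t,i}^{f}) - \overline{h(X_{t}^{f})} \right)
\left( h(X_{t,i}^{f}) - \overline{h(X_{t}^{f})} \right)^{\mathsf{T}}.
\end{align}
```

Here the overbars denote ensemble means. When $h$ is linear, $h(X) = HX$, the covariances reduce to $C_{xy} = P^{f} H^{\mathsf{T}}$ and $C_{yy} = H P^{f} H^{\mathsf{T}}$, with $P^{f}$ the ensemble forecast error covariance, which recovers the familiar Kalman gain.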

Bias Correction
Generally there is a large systematic bias between the satellite derived and model derived SM products because of the uncertainties in RTMs, land surface and vegetation parameters, errors in the land surface models and inaccurate atmospheric forcing [58,90,91,92]. One commonly used method for addressing the bias is rescaling the satellite observations to long term model simulations. Data assimilation techniques based on Kalman filters assume unbiased observations and model forecasts [46]. We therefore applied a prior bias correction to the SMOS data based on Cumulative Distribution Function (CDF) matching [90,92] before the assimilation. For the SMOS SM retrieval bias correction, CDF matching was applied using four years (2011-2014, April to September) of CLASS simulation data and SMOS SM retrieval data, and the SMOS SM retrieval was rescaled to the CLASS climatology. For the SMOS TB bias correction, we applied CDF matching between four years (2011-2014, April to September) of CMEM forward simulation data and SMOS observed TB data. The initial conditions for the CMEM forward simulations were provided by CLASS data. In our experiment, we used only data from April to September for the CDF matching, because CDF matching can act as an additional source of error by discarding the seasonal cycles in biases [91,92]. The standard deviation of the observation error for the SM retrieval was set to 0.04 m³·m⁻³, according to the target accuracy of SMOS. The standard deviation of the observation error for the SMOS TB was set to 5 K; this includes the instrumental errors in the radiometer and the uncertainties in the forward simulations using CMEM.
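The CDF matching step described above can be illustrated with a short empirical-quantile sketch in Python. It assumes a plain rank-based (empirical CDF) mapping, which is one common way to implement CDF matching and not necessarily the exact variant used in the study; the function name `cdf_match` and the synthetic samples are hypothetical.

```python
import numpy as np

def cdf_match(obs, model_ref, obs_ref=None):
    """Rescale obs so that its empirical CDF matches that of model_ref.

    obs       : observations to rescale (e.g., SMOS SM or TB)
    model_ref : model climatology sample (e.g., CLASS SM or CMEM TB)
    obs_ref   : observation climatology sample; defaults to obs itself
    """
    obs = np.asarray(obs, dtype=float)
    obs_sorted = np.sort(obs if obs_ref is None else np.asarray(obs_ref, dtype=float))
    model_sorted = np.sort(np.asarray(model_ref, dtype=float))

    # Plotting positions (non-exceedance probabilities) of the sorted samples
    obs_probs = (np.arange(1, obs_sorted.size + 1) - 0.5) / obs_sorted.size
    model_probs = (np.arange(1, model_sorted.size + 1) - 0.5) / model_sorted.size

    # Probability of each observation within the observation climatology ...
    p = np.interp(obs, obs_sorted, obs_probs)
    # ... mapped to the model value that has the same probability
    return np.interp(p, model_probs, model_sorted)

# Synthetic example: a wet-biased, over-dispersed "satellite" sample
rng = np.random.default_rng(0)
model_sm = rng.normal(0.25, 0.05, 2000)
satellite_sm = rng.normal(0.35, 0.10, 2000)
rescaled_sm = cdf_match(satellite_sm, model_sm)
```

After rescaling, the observations keep their temporal ordering but take on the mean and spread of the model climatology, which is what the unbiasedness assumption of the filter requires.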

Experimental Set-Up
It is a challenge to find an optimal method that satisfies all the conditions over a large range of constraints including different land surface models, different data assimilation schemes and different land characteristics. This study was applied over an agricultural region in Saskatchewan, Canada. For the assimilation of TB, CLASS was coupled with CMEM. We employed the EnKF data assimilation scheme of Evensen [76].
Before starting the assimilation and open-loop integration, CLASS was spun up by cycling the model continuously through the period April 2011 to March 2014 so that the model states were in equilibrium with the forcing meteorology. Note that only the CLASS model was spun up and not the entire data assimilation system. This spin-up process was independent of the CLASS simulations performed to build the CDFs (explained in Section 2.3.2). The initial conditions obtained from the CLASS spin-up process were used for the open-loop simulations and data assimilation. First, the CLASS open-loop run (no data assimilation) was performed in deterministic mode, forced by the NARR atmospheric data, for the period April 2014 to September 2014. Two different sets of data assimilation experiments were then conducted using the EnKF method. In the first set of experiments, the SM retrieval (SM-L2 product) from the SMOS satellite was assimilated into the CLASS model, whereas in the second set of experiments TB from SMOS was assimilated into the CLASS model using CMEM as the forward model. The estimation skill of both approaches, as well as that of the CLASS open-loop run, was compared with the in situ SM data.
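From the filter's point of view, the two sets of experiments differ only in the observation operator. The following is a minimal Python sketch of a stochastic (perturbed-observation) EnKF analysis step, not the study's implementation: the function names, the two-layer toy state and, in particular, the linear `toy_tb_operator` standing in for CMEM are assumptions made for illustration, and all numbers are synthetic.

```python
import numpy as np

def enkf_analysis(ensemble, y_obs, obs_err_std, obs_operator, rng):
    """Perturbed-observation EnKF update.

    ensemble     : (n_state, n_members) forecast states
    y_obs        : observation vector of length n_obs
    obs_err_std  : observation error standard deviation (scalar)
    obs_operator : callable mapping one state vector to observation space
    """
    ensemble = np.asarray(ensemble, dtype=float)
    n_members = ensemble.shape[1]
    # Map every member to observation space, shape (n_obs, n_members)
    hx = np.column_stack(
        [np.atleast_1d(obs_operator(ensemble[:, i])) for i in range(n_members)]
    )
    x_anom = ensemble - ensemble.mean(axis=1, keepdims=True)
    hx_anom = hx - hx.mean(axis=1, keepdims=True)
    c_xy = x_anom @ hx_anom.T / (n_members - 1)    # state-observation covariance
    c_yy = hx_anom @ hx_anom.T / (n_members - 1)   # covariance in observation space
    r = obs_err_std ** 2 * np.eye(hx.shape[0])
    gain = c_xy @ np.linalg.inv(c_yy + r)
    # Each member assimilates its own noisy copy of the observation
    y_pert = np.reshape(y_obs, (-1, 1)) + rng.normal(0.0, obs_err_std, size=hx.shape)
    return ensemble + gain @ (y_pert - hx)

def surface_sm_operator(state):
    """SM retrieval case: the observed variable is the surface-layer SM itself."""
    return state[:1]

def toy_tb_operator(state):
    """Hypothetical linear stand-in for CMEM (wetter soil gives lower TB); not physical."""
    return 280.0 - 150.0 * state[0]

rng = np.random.default_rng(42)
n_members = 41
surface = 0.30 + 0.05 * rng.standard_normal(n_members)
root = 0.28 + 0.6 * (surface - 0.30) + 0.01 * rng.standard_normal(n_members)
forecast = np.vstack([surface, root])   # rows: surface SM, root-zone SM

analysis_sm = enkf_analysis(forecast, [0.20], 0.04, surface_sm_operator, rng)
analysis_tb = enkf_analysis(forecast, [250.0], 5.0, toy_tb_operator, rng)
```

Because the gain is built from ensemble covariances, the unobserved root-zone layer is also updated through its sampled correlation with the surface layer.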
In the first state estimation experiment, the retrieved SM (SM-L2 product) data from SMOS were assimilated into CLASS from April 2014 to September 2014 using the EnKF algorithm. The SM data were assimilated whenever they were available; the repeat cycle of SMOS is less than three days. In the SM retrieval assimilation, both the model simulated variable and the observed variable were SM, so there was no need for an RTM and the observation operator was an identity matrix. The second approach was identical to the first approach except that we assimilated SMOS observed TB instead of the SM retrieval. In this state estimation experiment, the CLASS predicted SM was brought into the observation space with the use of an observation model. The observation model used in this study was the CMEM. The parameters used in CMEM are explained in Section 2. The schematic of the TB assimilation is given in Figure 2. In the EnKF, there are different perturbation strategies for initial ensemble generation, from the simple addition of pure random perturbations to advanced singular vectors. Adding purely random perturbations can lead to dynamical imbalance and deterioration of the quality of the ensembles [93]. We perturbed only the CLASS SM variable, using the Random Field (RF) perturbation technique [93], which results in flow balanced ensembles, and did not perturb the meteorological forcing variables. The initial ensembles were generated by adding, to the observed SM values from the in situ data, differences between randomly chosen SM states from a historical run of CLASS for the period April 2011 to April 2014; the differences were scaled to zero mean before the addition. In this study, an additive inflation method [85,88,89] was used to account for the model error. The perturbations were scaled to zero mean and to a standard deviation equal to the climatological (April 2011 to April 2014) standard deviation (0.065) of the CLASS simulated surface SM. We used a random subset of model predictions of SM fields from a historical run of CLASS, scaled them to zero mean and added one member of this subset to each of the background ensemble members, following Li et al. [85].
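The perturbation strategy above can be sketched in a few lines of Python. This is a simplified scalar illustration and not the Random Field technique of [93] in full: the function name, the synthetic "historical" series and the in situ value are hypothetical, while the ensemble size (41) and the target standard deviation (0.065) are taken from the text.

```python
import numpy as np

def historical_difference_perturbations(historical_sm, n_members, rng, target_std=None):
    """Zero-mean perturbations built from differences of randomly chosen historical states.

    If target_std is given, the perturbations are also rescaled to that
    sample standard deviation (as done for the additive inflation).
    """
    historical_sm = np.asarray(historical_sm, dtype=float)
    first = rng.choice(historical_sm, size=n_members)
    second = rng.choice(historical_sm, size=n_members)
    perts = first - second
    perts = perts - perts.mean()                      # scale to zero mean
    if target_std is not None:
        perts = perts * (target_std / perts.std(ddof=1))
    return perts

rng = np.random.default_rng(0)
# Synthetic stand-in for the April 2011 to April 2014 CLASS surface SM series
historical_sm = np.clip(0.25 + 0.06 * rng.standard_normal(1000), 0.02, 0.6)
in_situ_sm = 0.22   # hypothetical observed SM at the start of the experiment

# Initial ensemble: in situ value plus zero-mean historical differences
initial_ensemble = in_situ_sm + historical_difference_perturbations(historical_sm, 41, rng)
# Additive inflation: zero mean, climatological standard deviation
inflation = historical_difference_perturbations(historical_sm, 41, rng, target_std=0.065)
```

In practice the perturbed states would also need to be checked against physical bounds before being passed to the model.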
The assimilation experiments were performed whenever the observation data were available for the period from April 2014 to September 2014. For the same period, an open-loop run of CLASS, forced by the NARR data, was also completed. The background and observation errors were assumed to be uncorrelated [94]. The selection of the ensemble size is very important in the EnKF method. A large ensemble is a computational burden when used with a high dimensional model, and a small ensemble cannot capture all the possible dimensions of the system. We used 41 ensemble members as a balance between accuracy and computational expense. The SM assimilation estimate was evaluated using independent in situ SM data in the near surface zone as well as the root zone. We used the measurements at around 20 cm as a proxy for the root zone SM value. The estimation skill was evaluated in terms of Root Mean Square Error (RMSE), the modified coefficient of efficiency, E1 [95,96], and the absolute error between the analysis and the in situ data.
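The three skill measures named above can be computed as in the following Python sketch. The modified coefficient of efficiency is written in its usual absolute-value form, E1 = 1 - sum|O - P| / sum|O - mean(O)|; the function names and the sample values are illustrative only.

```python
import numpy as np

def rmse(obs, sim):
    """Root mean square error between observed and simulated series."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))

def modified_coefficient_of_efficiency(obs, sim):
    """E1 = 1 - sum|O_i - P_i| / sum|O_i - mean(O)|; 1 is a perfect match."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(1.0 - np.sum(np.abs(obs - sim)) / np.sum(np.abs(obs - obs.mean())))

def absolute_error(obs, sim):
    """Pointwise absolute error, as plotted in a time series comparison."""
    return np.abs(np.asarray(sim, dtype=float) - np.asarray(obs, dtype=float))

# Illustrative values only (m3/m3)
in_situ = [0.20, 0.25, 0.30, 0.35]
analysis = [0.22, 0.24, 0.33, 0.33]

print(rmse(in_situ, analysis))
print(modified_coefficient_of_efficiency(in_situ, analysis))
print(absolute_error(in_situ, analysis))
```

Unlike the squared-error form, E1 weights all errors linearly, so it is less dominated by a few large deviations.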
The modified coefficient of efficiency [96], $E_1$, is defined as

$$E_1 = 1 - \frac{\sum_{i=1}^{N} \left| O_i - P_i \right|}{\sum_{i=1}^{N} \left| O_i - \bar{O} \right|},$$

where $O_i$ and $P_i$ are the observation data and model simulated data, respectively, $\bar{O}$ is the observation mean and $N$ is the number of data pairs.

Results
Figure 3 illustrates the time series of the absolute error in SM at 5 cm for the three sets of experiments compared with the station data. When data assimilation was employed, the SM values were closer to the station data than when there was no assimilation, as can be seen from the reduced absolute error of the TB assimilated case compared to that of the CLASS open-loop run (Figure 3). When the assimilation started in April, part of the SM was still in the frozen state and the values were low, and therefore the absolute difference was small. In both assimilation approaches, the skill decreases at the beginning and then improves through time, because the system takes time to reach a steady state. When the retrieved SM is assimilated, the system takes more time to reach equilibrium than in the TB assimilation. Data assimilation systems take different time periods for stabilization. The assimilation skill was also evaluated in terms of RMSE and the modified coefficient of efficiency, $E_1$. Table 5 shows that the SM estimate at 5 cm from the SMOS TB assimilation has a higher coefficient of efficiency, when compared to the station data, than the case where the SMOS SM retrieval was assimilated. Table 6 shows the RMSE of the different approaches when compared with the station data for the period April to September 2014. The RMSE is much lower for the SMOS TB assimilated case than for the retrieved SM assimilation. The coefficient of efficiency, $E_1$, for the near surface SM in the TB assimilated case is higher than that of the case with no assimilation, indicating that the assimilation improves the model performance.
Studies have shown that near surface SM is correlated with root zone soil moisture (RZSM) [75,97,98]. To further explore the impact of the assimilation of surface SM on the estimate of RZSM, we compared the SM estimates at around 20 cm from both assimilation approaches with the in situ data. Figure 4 displays the time series of the absolute error in SM at 20 cm for the assimilated cases and the open-loop run compared with the in situ data. As in the case of the top layer SM estimate, the simulated RZSM from the TB assimilated case is closer to the in situ data, as seen from the lower absolute error (Figure 4). After assimilating SMOS SM retrievals, the skill of CLASS is improved, as can be seen from the reduced RMSE and reduced absolute error for the RZSM values in Figure 4 and Tables 5 and 6. This reduction in error illustrates the importance of the synergistic use of observation data and data assimilation in hydrological models. The estimation skill was evaluated using the RMSE and coefficient of efficiency metrics. Tables 5 and 6 also show that the simulation started from the TB analysis has a higher coefficient of efficiency and a lower RMSE. There is higher skill in the RZSM than in the near surface in all three cases, as evidenced by the higher correlation skill and lower RMSE. When we assimilated TB in the surface layer, the information was passed on to the root zone and the SM estimate at the root zone was improved. This is a very important result because we do not have direct satellite observations at the root zone, but we can improve the RZSM simulation of the land surface model by assimilating satellite observed SM in the surface layer. Correctly estimating the RZSM is very important for understanding the evapotranspiration rate in vegetated regions. Both data assimilation experiments improve the RZSM estimates compared to the open-loop simulations in terms of the RMSE and coefficient of efficiency metrics. The results may improve further if the data assimilation systems are given more time to evolve toward realistic model and observation error parameters.

Discussion
This study investigated the assimilation of SMOS observations (both the SM retrieval and TBs) into the CLASS model using the EnKF technique. As shown in the results section, both the TB and the SM retrieval assimilation analyses for the near surface layer exhibit higher deviation from the station data at the beginning of the data assimilation and improve as the assimilation progresses. Similar results have been reported in [40,78,99], where the station data agree more closely with the analysis as data assimilation progresses in time. This result can be explained by the fact that some continuous improvement of the analysis should be expected as more data become available. However, the temporal pattern of improvement in the SM trajectories differs between the two assimilation approaches, which might be due to the presence of random error in the SMOS observations [100]. Similar results have been reported by De Lannoy and Reichle [38].
Assimilating the TB improved the CLASS SM estimates in the near surface layer, whereas assimilating the SM retrieval data did not improve the model performance in the near surface layer. This could be because the standard deviation of the SM retrieval observation error used in this study (set according to the target accuracy of SMOS) is relatively small, so the assimilation system gives more weight to the erroneous observation than to the model forecast. Additionally, a sub-optimal configuration of the model error covariance may also result in the degradation of the analysis. Degradation of the SM retrieval data may be caused by inconsistent auxiliary data, such as land cover or soil temperature, used in the retrieval process [38]. Retrieval model parameters that are not properly tuned may deteriorate the quality of the SMOS SM retrieval data and thereby reduce the accuracy of the analysis [100]. Therefore, directly coupling the land surface model with an RTM and directly assimilating the TB is a more natural choice [38,46].
In both assimilation approaches, assimilating the near surface SM observations into CLASS had the greatest influence on the model estimate of the RZSM. This is in agreement with previous studies [101,102], where assimilating the SMOS observations into land surface models resulted in improved estimates of SM in the root zone. Improvement in the estimation of the RZSM is very beneficial for agricultural drought monitoring and flood prediction [102].
One drawback of this study is the relatively short assimilation period. The reason is that a long historical time series of in situ data was not available for the validation of the assimilation schemes, as some of the stations had been moved or were not operational, which limited the in situ comparison to a relatively short time period. Nevertheless, the study period chosen satisfied the requirement of our main objective, which is testing and comparing the performance of the assimilation of brightness temperatures vs the assimilation of soil moisture retrievals into a land surface model. Additionally, the assimilation period includes the growing season of the crops, which is the most important period for an agricultural region such as the Brightwater Creek watershed. Studies have shown that assimilation improves the SM estimates in areas of limited vegetation, where the SMOS SM observations are relatively better [37,38].

Conclusions
Numerical weather prediction and seasonal to interannual hydrometeorological forecasts rely on the accuracy of the land surface initialization, in particular soil moisture and snow. Satellite SM observations can be assimilated into land surface models to improve the land surface and root-zone SM states as well as other land surface variables such as soil temperature. However, relatively few studies compare the assimilation of both SM retrievals and TBs at the same time. In this paper we primarily addressed two issues. First, we examined how assimilating the SMOS L2 SM retrieval product (SM retrieval) differs from assimilating SMOS TB into the Canadian Land Surface Scheme (CLASS) using an Ensemble Kalman Filter (EnKF). Second, we studied the impact of near surface SM assimilation on the simulation of the RZSM in a vegetated region. The assimilation results were verified against the in situ data. The results indicated that assimilating the SMOS brightness temperature improved the SM forecasting skill of CLASS. This study also showed that assimilating the near surface SM data from SMOS improves the simulation of RZSM.
The estimation skill of the two approaches was compared with each other and with the SM simulated by the open-loop run of CLASS. When the near surface SM analysis from both assimilation schemes and the open-loop run were compared with the in situ data, the estimates were closer to the station data when assimilation was employed. Without assimilation, the model simulation deviated from the truth (in this case, quality controlled in situ data). This result demonstrates the potential benefits of data assimilation for land surface modeling.
The analysis skill of the two approaches was compared, and the results showed that direct TB assimilation is more robust than assimilation of the SMOS SM retrieval. Assimilating the retrieved SM product from the satellite is relatively easy compared with direct TB assimilation, because both the model forecast and the observation are SM, which avoids tuning the Radiative Transfer Models (RTMs). To retrieve the SM L2 product from the radiance data seen by the satellite, the data have to be run through an RTM, which requires initial estimates of SM, soil temperature and other parameters. Errors in the initial estimates of SM and the other parameters act as additional sources of error in the SM L2 retrieval process; they may reduce the overall quality of the SM L2 product and thereby the accuracy of the state estimate in SM L2 assimilation.
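The contrast between the two observation types can be sketched with a scalar EnKF analysis step. The sketch below is only an illustration under stated assumptions, not the configuration used in this study: the function names, the ensemble size, the error standard deviations and, in particular, the linear `toy_tb_operator` (a hypothetical stand-in for a real RTM such as CMEM) are all assumed for illustration. For SM retrieval assimilation the observation operator is the identity; for TB assimilation each ensemble member is first mapped to a predicted TB by the forward model.

```python
import random
import statistics


def enkf_update(ensemble, predicted_obs, obs, obs_err_std, rng):
    """Stochastic (perturbed-observation) EnKF step for a scalar state."""
    n = len(ensemble)
    mean_x = statistics.mean(ensemble)
    mean_y = statistics.mean(predicted_obs)
    # Sample covariance between state and predicted observation,
    # and sample variance of the predicted observation.
    cov_xy = sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(ensemble, predicted_obs)) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in predicted_obs) / (n - 1)
    gain = cov_xy / (var_y + obs_err_std ** 2)
    # Each member is nudged towards its own perturbed copy of the observation.
    return [x + gain * (obs + rng.gauss(0.0, obs_err_std) - y)
            for x, y in zip(ensemble, predicted_obs)]


def identity_operator(sm):
    """SM retrieval assimilation: model state and observation are both SM."""
    return sm


def toy_tb_operator(sm):
    """Hypothetical linear forward model (NOT CMEM): wetter soil, lower TB (K)."""
    return 280.0 - 120.0 * sm


rng = random.Random(0)
truth = 0.30                                   # assumed true SM (m3/m3)
forecast = [rng.gauss(0.20, 0.04) for _ in range(200)]  # biased forecast

# Assimilate a (perfect-mean) SM retrieval with an assumed 0.04 m3/m3 error.
sm_analysis = enkf_update(
    forecast, [identity_operator(x) for x in forecast], truth, 0.04, rng)

# Assimilate the corresponding TB with an assumed 3 K error.
tb_analysis = enkf_update(
    forecast, [toy_tb_operator(x) for x in forecast],
    toy_tb_operator(truth), 3.0, rng)
```

In this toy setting both analyses move the ensemble mean towards the assumed truth. Which one moves further depends entirely on the assumed observation errors, so the sketch says nothing about the relative skill of the two approaches in practice.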
For an end user, direct TB assimilation is computationally more expensive because it requires coupling the land surface model with the RTM. In the case of SMOS TB assimilation, we use CLASS model output as the source of the initial values required by the RTM. In our experiment, coupling the land surface model with the forward model improved its ability to capture the observed TB, possibly because of the large amount of auxiliary information, including soil temperature and vegetation, used in the forward model. These data are based on additional observations over the study area. The forward model, CMEM, requires parameters such as soil temperature, land cover, vegetation types, soil layer depth, sand, clay and water fractions and surface height over the study area. Although in principle both assimilation approaches should show a similar skill level, differences in this auxiliary information can create a difference in the performance of the two RTMs. The difference between the SM analyses from SM retrieval assimilation and direct TB assimilation can also come from sub-optimal tuning of the data assimilation parameters.
Our analysis also showed that the SM information acquired by the satellite from the top layer of soil can be propagated into the deeper layers, through downward propagation of the updated surface SM information by the soil heat diffusion schemes, to obtain an estimate of the RZSM. Assimilating the near surface SM from SMOS improved the simulation of RZSM. However, the root zone estimates were better when TB was assimilated than when SM was assimilated. The RZSM significantly impacts the evapotranspiration rate in an agricultural region. This study further demonstrates that, although satellites such as SMOS measure SM only in the top few centimeters of the soil, this information can be propagated deep into the root zone by assimilating the surface SM from the satellites into land surface models. The assimilation improves the surface estimate, and land models initialized from the analysis produce better RZSM predictions.
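The downward propagation of a surface update can be illustrated with a deliberately simple two-layer exchange model. This is a toy sketch under assumed values, not the CLASS soil scheme: the layer structure (two layers of equal thickness), the `exchange` coefficient, the function names and the moisture values are all hypothetical. It only shows that, once the surface layer has been corrected by an analysis, a diffusion-like exchange carries part of that correction into the lower layer over subsequent time steps.

```python
def step(surface, root, exchange=0.1):
    """One step of a toy two-layer exchange (equal layer thickness assumed).

    Moisture moves down the gradient, so total moisture is conserved.
    """
    flux = exchange * (surface - root)
    return surface - flux, root + flux


def run(surface, root, n_steps):
    """Integrate the toy model forward for n_steps."""
    for _ in range(n_steps):
        surface, root = step(surface, root)
    return surface, root


# Open loop: both layers start too dry and nothing corrects them.
open_surface, open_root = run(0.20, 0.20, 10)

# After assimilation: only the surface layer was updated (0.20 -> 0.30);
# the root zone receives the correction through the model exchange.
da_surface, da_root = run(0.30, 0.20, 10)
```

In the open-loop run the root zone stays at its initial value, while in the run started from the updated surface state the root zone moistens gradually as the increment spreads downward.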
This study suggests that the two assimilation approaches vary in their results because of the difference in accuracy of the retrieval algorithm used to derive SM. One of the limitations of this study is the difficulty of generalizing the results, as it is not easy to reach an optimal global solution that works across a wide range of conditions, such as different land surface models, land cover data, soil textures, vegetation characteristics and data assimilation methods. Therefore, to generalize the results, more studies need to be performed as new satellite products, improved land surface models, more in situ data networks and new assimilation schemes become available.