Improving Soil Water Content and Surface Flux Estimation Based on Data Assimilation Technique

: Land surface model is a powerful tool for estimating continuous soil water content (SWC) and surface fluxes. However, simulation error tends to accumulate in the process of model simulation due to the inevitable uncertainties of forcing data and the intrinsic model errors. Data assimilation techniques consider the uncertainty of the model, update model states during the simulation period, and therefore improve the accuracy of SWC and surface fluxes estimation. In this study, an Ensemble Kalman Filter (EnKF) technique was coupled to a Hydrologically Enhanced Land Process (HELP) model to update model states, including SWC and surface temperature (Ts). The remotely sensed latent heat flux (LE) estimated by Surface Energy Balance System (SEBS) was used as the observation value in the data assimilation system to update the model states such as SWC and Ts, etc. The model was validated by the observation data in 2006 at the Weishan flux station, where the open-loop estimation without state updating was treated as the benchmark run. Results showed that the root mean square error (RMSE) of SWC was reduced by 30~50% compared to the benchmark run. Meanwhile, the surface fluxes also had significant improvement to different extents, among which the RMSE of LE estimation from the wheat season and maize season reduced by 33% and 44%, respectively. The application of the data assimilation technique can substantially improve the estimation of surface fluxes and SWC states. It is suggested that the data assimilation system has great potential to be used in the application of land surface models in agriculture and water management.


Introduction
Land surface flux is an important index for evapotranspiration (ET) calculation, which is an important parameter of the hydrological cycle [1]. Land surface flux includes surface temperature (Ts), latent heat flux (LE), net radiation (Rn), soil heat flux (G 0 ), and sensible heat flux (H). Soil water content (SWC) is a key parameter of the land surface process and plays an important role in the exchange of matter and energy between different components in the earth system [2,3]. At the same time, SWC has a strong effect on ET, which is an important index to measure the degree of soil drought. Timely and accurate estimation of surface energy and SWC is of great significance for drought monitoring, agricultural water resources management, and crop yield prediction. The land surface model is an effective tool for continuous simulation of SWC and surface fluxes. However, simulation errors tend to accumulate during the operation of the model caused by the input uncertainty as well as continuous model states estimation, and the heterogeneity of the underlying surface also limits the application of the model on the regional scale. With the rapid development of remote sensing technology, a large number of remote sensing observation data have been widely used in the model research The uncertainty caused by multi-source data fusion is also an important factor affecting the application of remote sensing data in SWC simulation. Many pieces of research focused on reducing the uncertainty of multi-source data, such as data fusion [5], data downscaling [8], sensitivity analysis [9], variability analysis [10][11][12], and data assimilation [13]. Zhao et al. [14] proposed the SWC downscaling method based on random forest, which is helpful in improving the spatial resolution of passive microwave SWC products and promotes its application at the regional scale. Bai et al. [15] used the enhanced spatial and temporal adaptive reflectance fusion model (ESTARFM) and machine learning algorithm to generate surface SWC at 30 m spatial resolution in Haihe Basin from 2015 to 2017, which is valuable for agricultural water resources management at the field scale. There are many pieces of research on surface SWC estimating using remote sensing technology [2,[16][17][18]. However, in addition to surface SWC, root zone and deep SWC information is also very important for agricultural management. Numerous studies have extended the surface SWC estimated by remote sensing to the deep SWC. It is necessary to establish some hypotheses about vertical distribution of SWC in soil profile. Many studies have used the one-dimensional Richards equation [19]. The prediction of root SWC is greatly affected by the covariance error in the framework of data assimilation, so the assimilation can be successful only when the observation error and model error are both small enough. Sabater et al. [20] compared the ability of two different Kalman filtering methods and two different variational methods to predict root SWC from surface SWC. Baldwin et al. [21] used the SMAR-EnKF method to combine the dynamic process model with SMAP products to obtain SWC of the vertical soil profile by assimilation. These studies provided a good idea for a root or deep SWC estimation. However, due to spatial heterogeneity, it is hard to say whether the formula is effective in the whole region from surface to deep. Thus, it is necessary to use the land surface model to simulate the root zone and deep SWC. If the land surface model is used only, the errors will still accumulate over time, because SWC is not only the output of the model, but also the key state variable and one of the initial conditions affecting the next time step. Therefore, it is necessary to use data assimilation to improve SWC, especially in the deep and root layers.
Data assimilation gives us a tool to optimize model states and thus reduce simulation errors. The essence of data assimilation is error estimation and error simulation, that is, to correct the model simulation process by introducing observation data of time series so as to improve model simulation accuracy and reduce model uncertainty [22]. In recent years, data assimilation technology has been developed rapidly and gradually introduced into the research of hydrology, such as SWC assimilation [19,23] and Ts assimilation [24]. These studies proved that data assimilation technology has great potential to improve the accuracy of hydrological prediction. At the same time, due to the extensive application of remote sensing technology, more and more surface parameters can be obtained from remote sensing data, which provides data support for the application of data assimilation in hydrological models. Data assimilation methods mainly include variational method and filtering method ( Table 1). The variational method is widely used in meteorology, oceanography and other fields because it does not need preheating and has good assimilation results in a short computing time. Filtering method can take into account the uncertainties of different source data and models, and the errors can propagate over time, which widely used in the field of hydrology. Kalman filter algorithm was proposed by Kalman [25] in 1960. Its basic idea is that the model state is predicted first, then the observation data are introduced, and the model state is re-analyzed according to the observation data at the end. Komma et al. [26] integrated the EnKF method into a conceptual hydrological model and applied it to small watersheds. The prediction accuracy of the hydrological model is improved by optimizing the initial state variables of the hydrological model. Thirel et al. [27] built a data assimilation system in France. By optimizing the state variables of the previous hydrological model, the optimized state variables are used as the initial state variables of the forecast model. In terms of SWC assimilation, Walker et al. [28] assimilated surface soil moisture, indicating that data assimilation could improve the simulation accuracy of SWC when the initial value of the model deviated greatly from the measured value. Crow et al. [29] showed that assimilation of surface soil moisture could improve the simulation accuracy of soil moisture and surface latent heat flux when precipitation data error is large. Chen et al. [30] conducted a three-layer SWC assimilation in the SWAT model, and the results showed that surface soil moisture assimilation had the best performance. In terms of ET assimilation, Qin et al. [31] used LE simulated by the SEBS model as an observation value to assimilate the WEP-L distributed hydrological model, however, it was difficult to be verified. In addition to the single-objective assimilation mentioned above, there is also multi-objective joint assimilation. Pan et al. [32] used the Constrained EnKF (CoEnKF) method to assimilate surface and root soil moisture, latent heat, and runoff in variable infiltration capacity (VIC) model. According to the correlation between the error level of the water balance items and the non-closure terms, the non-closure terms are allocated. The closure of the hydrological process is realized.
It is found that there is no unified evaluation method for model improvement in existing studies. In the simulation of different hydrological factors, the stability of the assimilation method is different. At the same time, it is also worth discussing whether the model simulation for long time series is effective.
In belongs to the temperate monsoon sub-humid climate area. More than 80% of the land use in the irrigation district is farmland, followed by construction, with a small amount of forest and water. The performance of the data assimilation system was evaluated at Weishan flux station (36 • 38 55.5 N, 116 • 3 15.3 E) located in the irrigation district, which is representative of this area (Figure 1). which is representative of this area ( Figure 1).
The annual pan evaporation (1950 mm measured by 20 cm diameter evaporation pan) significantly exceeds the annual precipitation (532 mm), and the annual mean air temperature (Ta) is 13.3 °C. The dominant crops are winter wheat and summer maize planted in rotation, representing the main cropping system in this region. The rainfall is insufficient to support the winter wheat production, as about two-thirds of the annual rainfall occurs from July to September. Thus, agricultural production of the wheat-growing season relies primarily on surface ditch irrigation, with the irrigation water drawn mainly from the Yellow River [33].

Data
The data used in our study include ground measurements and remote sensing data (MODIS). Ground measurements mainly include observation data of flux station and meteorological station, including flux, Ts, SWC, Ta, air pressure, rainfall, etc. The remote sensing data used included MOD09GA, MOD13Q1, and MOD11A1.

Ground Measurements
For ground measurements, a 10 m flux tower was set, and measurement instruments were equipped on and around the tower. The model input and validation data needed in the study, including flux data, soil data, meteorological data, and vegetation parameters, were observed. The LE and sensible heat flux (H) was measured by an eddy-covariance system at the height of 3.7 m. Downwelling shortwave radiation (Rs_down), downwelling longwave radiation (Rl_down), and upwelling longwave radiation (Rl_up) were measured by The annual pan evaporation (1950 mm measured by 20 cm diameter evaporation pan) significantly exceeds the annual precipitation (532 mm), and the annual mean air temperature (Ta) is 13.3 • C. The dominant crops are winter wheat and summer maize planted in rotation, representing the main cropping system in this region. The rainfall is insufficient to support the winter wheat production, as about two-thirds of the annual rainfall occurs from July to September. Thus, agricultural production of the wheat-growing season relies primarily on surface ditch irrigation, with the irrigation water drawn mainly from the Yellow River [33].

Data
The data used in our study include ground measurements and remote sensing data (MODIS). Ground measurements mainly include observation data of flux station and meteorological station, including flux, Ts, SWC, Ta, air pressure, rainfall, etc. The remote sensing data used included MOD09GA, MOD13Q1, and MOD11A1.

Ground Measurements
For ground measurements, a 10 m flux tower was set, and measurement instruments were equipped on and around the tower. The model input and validation data needed in the study, including flux data, soil data, meteorological data, and vegetation parameters, were observed. The LE and sensible heat flux (H) was measured by an eddycovariance system at the height of 3.7 m. Downwelling shortwave radiation (R s_down ), downwelling longwave radiation (R l_down ), and upwelling longwave radiation (R l_up ) were measured by a radiometer at the height of 3.5 m. Soil heat flux (G) was measured by two heat transducers on opposite side of the tower at a depth of 0.03 m. Soil temperature (T soil ) and SWC were measured at the depth of 0.05, 0.10, 0.20, 0.40, 0.80, and 1.6 m. Meteorological data used in this study were recorded at 10 min intervals, including Ta and relative humidity (RH), wind speed (WS), and precipitation (P). Skin temperature was measured with an infrared thermometer. The canopy height and leaf area index (LAI) were measured twice per month at several locations surrounding the tower by randomly selecting samples within the cropland. Table 2 shows the observation projects of observation station. For remote sensing data, land products of MODIS on-boarded Terra and Aqua satellite were used in this study, and the over-passing time of each satellite is around 10:30 am and 13:30 pm. (http://reverb.echo.nasa.gov/reverb/, accessed on 10 May 2022). Collection 5 data onboard the Terra satellite and corresponding products onboard the Aqua satellite was used in this study [34]; broadband surface albedo [35], Ts and emissivity, and NDVI [36] were thus derived. Detailed introduction of above could refer to the website (http://modisland.gsfc.nasa.gov/, accessed on 10 May 2022). All the data were re-projected from the Sinusoidal (SIN) projection into the Universal Transverse Mercator (UTM) projection and resampled to 250 m.

Methodology
The core component of land surface process data assimilation system is hydrological model or land surface model, data assimilation algorithm, and an independent source of observation to provide extra information. In this study, the Ensemble Kalman Filter (EnKF) algorithm was embodied in the HELP model. The estimated LE based on remote sensing using Surface Energy Balance System (SEBS) was used as an independent observation source. The framework of the scheme is shown in Figure 2.

Hydrologically Enhanced Land Process (HELP) Model
HELP model is based on Simple Biosphere Model 2 (SiB2), which enhances runoff process and other hydrological processes modeling. SiB2 model was introduced by Sellers et al. [37], based on the Simple Biosphere Model (SiB), into the photosynthesis-stomatal conductance sub-module. Thus, the surface energy and water balance can be better simulated. SiB2 consists of one vegetation layer and three soil layers, mainly predicting three layers of SWC and temperature, sensible and latent heat fluxes.
In HELP model, radiation transfer, energy, and carbon exchange between land surface and atmosphere is the same as SiB2, while below is the improvement in HELP model: (1) soil in unsaturated zone was divided into 0.1 m layer, and SWC exchange in each layer was described by one-dimensional Richards equation; (2) relationship between unsaturated and saturated hydraulic conductivity was described using van Genuchten equation instead of Cambell/Clapp-Hornberger equation; (3) interflow and groundwater discharge and water exchange between SWC and river way were added to the surface runoff model, described by mass conservation equation and Darcy's law.
In order to enhance the simulation of SWC, unsaturated soil layer was further refined to 10 layers. Thus, the model has 17 state variables in total as: 0~10 cm layer SWC, canopy interception, land surface precipitation storage, temperature at canopy, land surface and deep soil layer, groundwater table, and river water level.
Except for meteorological input data derived from site measurement, HELP model had a set of parameters to describe the physical characteristics of soil and physiological property. Vegetation parameters included vegetation form, optical parameters, and physiological parameters, which were determined by crop time and invariant with time. The vegetation parameters were derived by field experiment at Weishan flux station. The soil parameters are listed in Table 3.

Hydrologically Enhanced Land Process (HELP) Model
HELP model is based on Simple Biosphere Model 2 (SiB2), which enhances runoff process and other hydrological processes modeling. SiB2 model was introduced by Sellers et al. [37], based on the Simple Biosphere Model (SiB), into the photosynthesis-stomatal conductance sub-module. Thus, the surface energy and water balance can be better simulated. SiB2 consists of one vegetation layer and three soil layers, mainly predicting three layers of SWC and temperature, sensible and latent heat fluxes.
In HELP model, radiation transfer, energy, and carbon exchange between land surface and atmosphere is the same as SiB2, while below is the improvement in HELP model: (1) soil in unsaturated zone was divided into 0.1 m layer, and SWC exchange in each layer was described by one-dimensional Richards equation; (2) relationship between unsaturated and saturated hydraulic conductivity was described using van Genuchten equation instead of Cambell/Clapp-Hornberger equation; (3) interflow and groundwater discharge and water exchange between SWC and river way were added to the surface runoff model, described by mass conservation equation and Darcy's law.
In order to enhance the simulation of SWC, unsaturated soil layer was further refined to 10 layers. Thus, the model has 17 state variables in total as: 0~10 cm layer SWC, canopy interception, land surface precipitation storage, temperature at canopy, land surface and deep soil layer, groundwater table, and river water level.
Except for meteorological input data derived from site measurement, HELP model had a set of parameters to describe the physical characteristics of soil and physiological property. Vegetation parameters included vegetation form, optical parameters, and physiological parameters, which were determined by crop time and invariant with time. The vegetation parameters were derived by field experiment at Weishan flux station. The soil parameters are listed in Table 3. The data assimilation approach used in this study was EnKF, proposed by Evensen [38]. The basic idea of EnKF incorporates an ensemble of forecasts to estimate background error covariance, which is a Monte Carlo approximation to the traditional Kalman filter [39]. Data assimilation using EnKF method consists of two steps: (1) ensemble forecast; (2) model states update. At each time step t, a set of ensemble samples of model state variables is generated using Monte Carlo approach, noted as x i,t b , where subscript i indicates the ith sample, superscript b indicates states before update. The dimension of vector x is m, indicated the number of model states is m. Model forecast at initial time can be expressed as: where x i,0 is the initial states of the model; M indicates the modeling operator; n is the ensemble size. The update equation for each ensemble member of EnKF is: where K is the Kalman gain; X a is the analysis after the update; y i is the vector of the observations with the dimension of p; P b is the background error covariance matrix; H is the operator that converts the model states to the observation space with the dimension of p*n; R is the observation error covariance matrix with the dimension of p*p; x b i is the ensemble anomaly for the ith ensemble member.
At model state update stage, each i of the m ensemble members is updated by Equation (2). For each ensemble member, the p dimensional observation y i is sampled from a distribution with the mean equal to the observation and variance of R.

Surface Energy Balance System (SEBS) Model
Instantaneous LE at satellite over-passing time estimated by Surface Energy Balance System (SEBS) was treated as the observation of the data assimilation system, i.e., the vector y in Equation (2). SEBS is a single layer model based on energy balance theory that uses an excess resistance term which accounts for the difference of the roughness lengths for momentum and heat between vegetation and surface [40], which consists of four components: (1) land surface physical parameters from spectral reflectance and radiance; (2) roughness length for heat transfer by extended model; (3) H iterated from Monin-Obukhov similarity theory; (4) LE determined by evaporative fraction based on energy balance equation at limiting cases [41]. Compared with other single layer models based on energy balance theory, SEBS has an advantage in the independent calculation of each pixel, i.e., even if partial pixels are contaminated because of cloud or rainfall on one day, other pixels exposed under clear day still can be used to calculate energy fluxes. The merit of SEBS can maximize the usage of remote sensing data and will provide observation data to data assimilation system to the greatest extent.
The surface energy balance equation [42] can be written as: where R n is the net radiation, W·m −2 ; G 0 is the soil heat flux, W·m −2 ; H is the sensible heat flux, W·m −2 ; LE is the latent heat flux, W·m −2 . The parameterization scheme of roughness length of momentum transfer (z 0m ) and zero displacement height (d) were derived from empirical relationship given by Shaw et al. [43] from the second-order closure theory [44] to suit the situation in China.
where h is the canopy height; LAI is the leaf area index; C d is the drag coefficient of the foliage elements (set to 0.2); z 0s is the substrate roughness height (set to 0.01 m for bare soil) [45].
The model was already estimated using MODIS data at the study site and validated by in situ measurement of eddy covariance system. The results showed that relative error and root mean square error of LE in both wheat and maize season was within 20%, indicating SEBS model using remote sensing data had good accuracy in estimating LE. Detailed results of validation and discussion could refer to Yang et al. [46].

Implementation of Data Assimilation
A key issue in implementing data assimilation method is quantifying the covariance error matrices. For EnKF, variance between multiple ensemble members is the way to quantify the model error. In this study, Monte Carlo sampling through assigned random noise in the forcing data and observation were derived from emulating the model error, which is proved to be an easy and robust way; moreover, the physical basis and interpretation of individual error sources remain clear [26].
The forcing data of HELP model include P, WS, RH, Ta, and R s_down . The ensembles of forcing data were emulated by perturbing each forcing data at each time step, which were generated as Clark et al. [47]: where γ f is the random noise with a covariance of ω f , i.e., γ f is a realization from a normal distribution ranging from −ω f to ω f , in which the subscript of f indicates different sources of forcing data. Similarly, the data assimilation approach implicates an assumption of knowing or at least accurately estimating observation error. The ensembles of observation were also generated as follows: where the symbols have the same meaning as in Equation (8), in which the subscript of y indicated observation in the data assimilation system, i.e., the LE estimated by SEBS. By generating each ensemble of forcing data and observation, a set of ensemble model states were calculated by HELP model, and the model error covariance and observation error covariance (P b and R in Equation (3)) were thus quantified. Considering the actual error of in situ measurements and SEBS simulation, random distribution of forcing data and observation data in HELP model is listed in Table 4. The HELP model has 17 model states, in which the canopy interception and surface interception are the states only relative to rainfall, which means the simulation error of these two states does not accumulate with time. Moreover, groundwater level and river water level are also influenced by neighboring pixels, which can hardly be accurately simulated by one-dimensional vertical model. Therefore, in the data assimilation system, only 10 layers of SWC and three layers of temperature were updated. According to previous studies, comprehensively considering the computational requirements and data assimilation performance, the recommended ensemble size is 10 times of model states [32,48], i.e., 130 in this study. The soil depth of surface, root, and deep layer was 0-10, 20-30, and 100-110 cm, respectively. For the convenience of description, SWC corresponding to surface, root, and deep layer are represented by SWCs, SWCr, and SWCd, respectively.
The time step of HELP model is 30 min, and the time step of SEBS is two times per day at 10:30 and 13:30 under clear day, which means the observation is not available at each time step of HELP model. In this study, the model states were updated by EnKF when LE estimation was available; otherwise, the model was free running. The HELP model ran from 1 October 2005 to 31 December 2005 freely with initial SWC, and temperature values originated from ground measurements to warm up the model, then the model assembled data assimilation system ran from 1 January 2006 to 31 December 2006, while the performance of data assimilation was tested in wheat growing season (from 1 March to 31 May) and maize growing season (from 1 July to 30 September).

Statistical Metrics
Model assessment consisted of the standard statistical evaluations, including bias, mean absolute error (MAE), root mean square error (RMSE), determination coefficient (R 2 ), and effectiveness criterion (Eff ). The statistical metrics are calculated as follows: where superscript u and b represent the update after data assimilation and benchmark run, respectively; O and S represent the observation and simulation, respectively; S is the average of simulated values. The Eff ranges between negative infinity and 100%, and a value larger than 0 corresponds to a positive impact in skill resulting from the data assimilation. Conversely, a value below 0 corresponds to a negative impact. The greater value of Eff indicates better performance following data assimilation compared to the benchmark run.

Results
The data assimilation approach used in this study was EnKF. It was used to reduce the simulation error of SWC and surface flux. The data assimilation results were compared with the results of the benchmark run, and the observation data of the Weishan flux station were used for verification. In terms of SWC assimilation, we also evaluated the effect of root and deep SWC assimilation. The results showed that the model could simulate SWC effectively, and the data assimilation technology can effectively reduce the error of SWC simulation. At the same time, surface flux assimilation also achieved encouraging results.

Benchmark Run
In order to evaluate the performance of the data assimilation approach, the results from the benchmark run are needed, i.e., the open-loop simulation with the same set of forcing data, parameters, and initial states yet without updating model states. Figure 3 shows the simulation results of SWC in three layers by benchmark run. The bias of SWCs, SWCr, and SWCd is 0.033, 0.047, and −0.009 m 3 ·m −3 , respectively. Although the entire error is relatively small, the simulation does not fit well with the observation value, the MAE of the three layers from top to bottom is 0.171, 0.163, and 0.147 m 3 ·m −3 , respectively, and the RMSE is 0.055, 0.053 and 0.053 m 3 ·m −3 . Figure 2 showed the simulation fluctuated violently a few days before and after the rainfall or irrigation, which indicated that SWC is a very sensitive state to the rainfall and irrigation in the HELP model. The SWCs had a severe fluctuation; moreover, SWCr and SWCd also had an obvious increase after rainfall or irrigation.

Benchmark Run
In order to evaluate the performance of the data assimilation approach, the results from the benchmark run are needed, i.e., the open-loop simulation with the same set of forcing data, parameters, and initial states yet without updating model states. Figure 3 shows the simulation results of SWC in three layers by benchmark run. The bias of SWCs, SWCr, and SWCd is 0.033, 0.047, and −0.009 m 3 ·m −3 , respectively. Although the entire error is relatively small, the simulation does not fit well with the observation value, the MAE of the three layers from top to bottom is 0.171, 0.163, and 0.147 m 3 ·m −3 , respectively, and the RMSE is 0.055, 0.053 and 0.053 m 3 ·m −3 . Figure 2 showed the simulation fluctuated violently a few days before and after the rainfall or irrigation, which indicated that SWC is a very sensitive state to the rainfall and irrigation in the HELP model. The SWCs had a severe fluctuation; moreover, SWCr and SWCd also had an obvious increase after rainfall or irrigation.  Figure 4 shows the simulation and observation of the Ts and LE. The simulation of Ts had a systematic error with a bias of 4.98 °C, the MAE was 6.31 °C, and the RMSE was 7.78 °C, but the simulation fitted well with the observation with the R 2 of 0.75. The simulation of heat fluxes was validated using the observation from the eddy covariance system. The H and LE observed by the eddy covariance system were corrected by the energy bal-  Figure 4 shows the simulation and observation of the Ts and LE. The simulation of Ts had a systematic error with a bias of 4.98 • C, the MAE was 6.31 • C, and the RMSE was 7.78 • C, but the simulation fitted well with the observation with the R 2 of 0.75. The simulation of heat fluxes was validated using the observation from the eddy covariance system. The H and LE observed by the eddy covariance system were corrected by the energy balance closure scheme [49]. The relative errors of Rn, G, H, and LE are −17.8%, 103.9%, −0.5%, and 9.0%, respectively. There was a large error in the simulation of G, but the absolute value of G is small relatively and the ratio of G and Rn is small, thus the simulation of LE does not have an obvious systematic error, but the random error is large, the RMSE is 58.26 W·m −2 with an R 2 of 0.68. −0.5%, and 9.0%, respectively. There was a large error in the simulation of G, but the absolute value of G is small relatively and the ratio of G and Rn is small, thus the simulation of LE does not have an obvious systematic error, but the random error is large, the RMSE is 58.26 W·m −2 with an R 2 of 0.68. The simulation error of the HELP model is the combination of error in input data, error in calibrated parameters, and error from the model structure. The input error is the reflectance of instrument measurement uncertainty and/or tempo-spatial heterogeneous land water cycle; the calibrated parameters error is the compromise of empirical relationship and the complex and inenarrable physical mechanism; model structure error is the simplification of equations describing the hydrological processes. These errors are inevitable and can hardly be divided into different sources, which have different influences on the simulation results [50]. The SWC is sensitive to rainfall and irrigation; thus, the interpolation error of rainfall or non-uniform irrigation may cause large errors in the simulation of SWC. The systematic error of Ts may be caused by over-calibrated to the insensitive parameter, or model structure error, while the random error in the simulation of LE may be due to the difficulty in generalizing the evaporation process.  Table 3 shows the error statistics of the data assimilation run and benchmark run. The results show that compared with the benchmark run, the updated SWC simulation in both wheat season and maize season have different degrees of improvement; the Eff ranges from 9.31% to 74.17%.

Data Assimilation in SWC Estimation
From the statistical results, we can see that in the wheat growing season, the bias and MAE of updated SWCs are slightly larger than the benchmark run. The reason is the simulation of SWCs from benchmark run changes severely before, and after the precipitation/irrigation, i.e., the SWCs rise sharply after the precipitation/irrigation while reducing significantly in a short period of time. Therefore, from the perspective of error statistics, during the whole growing season, the error is averaged out, and the bias and MAE remain at a lower level. However, considering the simulation process during the growing season, the time series of updated results from data assimilation fits well with the observation curve, with a slightly systematic underestimation. Compared with the benchmark run, the RMSE from the data assimilation run reduces from 0.04 to 0.03 m 3 ·m −3 . The Eff of SWCs The simulation error of the HELP model is the combination of error in input data, error in calibrated parameters, and error from the model structure. The input error is the reflectance of instrument measurement uncertainty and/or tempo-spatial heterogeneous land water cycle; the calibrated parameters error is the compromise of empirical relationship and the complex and inenarrable physical mechanism; model structure error is the simplification of equations describing the hydrological processes. These errors are inevitable and can hardly be divided into different sources, which have different influences on the simulation results [50]. The SWC is sensitive to rainfall and irrigation; thus, the interpolation error of rainfall or non-uniform irrigation may cause large errors in the simulation of SWC. The systematic error of Ts may be caused by over-calibrated to the insensitive parameter, or model structure error, while the random error in the simulation of LE may be due to the difficulty in generalizing the evaporation process.  Table 3 shows the error statistics of the data assimilation run and benchmark run. The results show that compared with the benchmark run, the updated SWC simulation in both wheat season and maize season have different degrees of improvement; the Eff ranges from 9.31% to 74.17%.

Data Assimilation in SWC Estimation
From the statistical results, we can see that in the wheat growing season, the bias and MAE of updated SWCs are slightly larger than the benchmark run. The reason is the simulation of SWCs from benchmark run changes severely before, and after the precipitation/irrigation, i.e., the SWCs rise sharply after the precipitation/irrigation while reducing significantly in a short period of time. Therefore, from the perspective of error statistics, during the whole growing season, the error is averaged out, and the bias and MAE remain at a lower level. However, considering the simulation process during the growing season, the time series of updated results from data assimilation fits well with the observation curve, with a slightly systematic underestimation. Compared with the benchmark run, the RMSE from the data assimilation run reduces from 0.04 to 0.03 m 3 ·m −3 . The Eff of SWCs is 9.31%, which means after the data assimilation, the simulation results of SWCs have improved to some extent.
The SWCr and SWCd of the wheat growing season have great improvement after data assimilation compared with the benchmark run, with an Eff of 43.68% and 66.67%, respectively. From the perspective of error statistics from the whole growing season, the bias of SWCr and SWCd reduced by 0.0032 and 0.0782 m 3 ·m −3 , respectively, and the MAE reduced by 0.0103 and 0.0754 m 3 ·m −3 , respectively. The SWCr was also influenced by precipitation/irrigation. The time series after data assimilation has a similar pattern as the surface layer; the simulation from the benchmark run shows a severe changing pattern, which might attribute to the recession process of the HELP model. After the data assimilation, the time series of SWC fits much better with the observation, with an RMSE reduced from 0.03 to 0.02 m 3 ·m −3 . The SWCd did not receive much influence by precipitation/irrigation compared with surface and root layers, but the simulation from the benchmark run shows an obvious underestimation compared with the observation. The data assimilation reduces the systematic error, with an RMSE reduced from 0.07 to 0.04 m 3 ·m −3 , which shows a large improvement compared with the benchmark run. The SWCr and SWCd of the wheat growing season have great improvement after data assimilation compared with the benchmark run, with an Eff of 43.68% and 66.67%, respectively. From the perspective of error statistics from the whole growing season, the bias of SWCr and SWCd reduced by 0.0032 and 0.0782 m 3 ·m −3 , respectively, and the MAE reduced by 0.0103 and 0.0754 m 3 ·m −3 , respectively. The SWCr was also influenced by precipitation/irrigation. The time series after data assimilation has a similar pattern as the surface layer; the simulation from the benchmark run shows a severe changing pattern, which might attribute to the recession process of the HELP model. After the data assimilation, the time series of SWC fits much better with the observation, with an RMSE reduced from 0.03 to 0.02 m 3 ·m −3 . The SWCd did not receive much influence by precipitation/irrigation compared with surface and root layers, but the simulation from the benchmark run shows an obvious underestimation compared with the observation. The data assimilation reduces the systematic error, with an RMSE reduced from 0.07 to 0.04 m 3 ·m −3 , which shows a large improvement compared with the benchmark run.  Compared with the wheat growing season, the data assimilation results in the maize growing season performed better. The MAE of SWCs, SWCr, and SWCd are reduced by 0.082, 0.1082, and 0.0744 m 3 ·m −3 compared with the benchmark run, respectively (Table 5). According to the simulation process, the time series simulation of SWC in three layers from the benchmark run show an overestimation, especially for SWCs and SWCr, with a bias of 0.1873 and 0.1997 m 3 ·m −3 , respectively. After data assimilation, the systematic error reduces to a large extent, and the bias of the top two layers of SWC reduced to 0.037 and 0.0362 m 3 ·m −3 , respectively. However, after data assimilation, the time series of simulated SWCd showed an underestimation compared with the benchmark run; the bias changed from 0.0927 to −0.0545 m 3 ·m −3 . The reason is according to the HELP model, the SWCd tended to have a similar pattern with the surface and root layer since the infiltration rate in the Richards equation is constant, and this led to the results that the simulated SWCd is influenced by precipitation and/or irrigation to a large extent. Compared with the wheat growing season, the observed SWCd in the maize season roughly remained constant regardless of precipitation or irrigation because it has a longer infiltration time in a wheatmaize rotation system [51]. The RMSEs of the three layers were also reduced by 0.03 m 3 ·m −3 , 0.03 m 3 ·m −3 , and 0.02 m 3 ·m −3 , respectively. Overall, the data assimilation Eff of the three layers is above 60%, which means the simulation results from the data assimilation run fit better with the observation.
Results showed that, in the wheat growing season, the improvement of data assimilation was mainly demonstrated in the random error, while in the maize growing season, the improvement was mainly demonstrated in the systematic error. Overall, the degree of improvement increased from the top layer to the bottom layer. Compared with the wheat growing season, the data assimilation results in the maize growing season performed better. The MAE of SWCs, SWCr, and SWCd are reduced by 0.082, 0.1082, and 0.0744 m 3 ·m −3 compared with the benchmark run, respectively (Table 5). According to the simulation process, the time series simulation of SWC in three layers from the benchmark run show an overestimation, especially for SWCs and SWCr, with a bias of 0.1873 and 0.1997 m 3 ·m −3 , respectively. After data assimilation, the systematic error reduces to a large extent, and the bias of the top two layers of SWC reduced to 0.037 and 0.0362 m 3 ·m −3 , respectively. However, after data assimilation, the time series of simulated SWCd showed an underestimation compared with the benchmark run; the bias changed from 0.0927 to −0.0545 m 3 ·m −3 . The reason is according to the HELP model, the SWCd tended to have a similar pattern with the surface and root layer since the infiltration rate in the Richards equation is constant, and this led to the results that the simulated SWCd is influenced by precipitation and/or irrigation to a large extent. Compared with the wheat growing season, the observed SWCd in the maize season roughly remained constant regardless of precipitation or irrigation because it has a longer infiltration time in a wheat-maize rotation system [51]. The RMSEs of the three layers were also reduced by 0.03 m 3 ·m −3 , 0.03 m 3 ·m −3 , and 0.02 m 3 ·m −3 , respectively. Overall, the data assimilation Eff of the three layers is above 60%, which means the simulation results from the data assimilation run fit better with the observation. Results showed that, in the wheat growing season, the improvement of data assimilation was mainly demonstrated in the random error, while in the maize growing season, the improvement was mainly demonstrated in the systematic error. Overall, the degree of improvement increased from the top layer to the bottom layer.

Data Assimilation in Surface Flux Estimation
In this study, the estimated LE based on remote sensing using the SEBS model was used as the observation operator in the data assimilation system. However, further verification of data assimilation performance in the surface flux simulation is also needed. Since the background error covariance estimation of the assimilation system is calculated by the sampling error matrix consisting of model states. Therefore, using LE as an observation operator in the assimilation system can only update the model states at observation time, and the surface flux as the model output will be simulated at the next time step using updated model states as the initial condition. Moreover, the assimilation procedure only happens at remote sensing LE estimation is available, i.e., satellite over passing time on clear days. In other words, using LE as an observation operator does not guarantee the improvement of LE simulation results. Figures 7 and 8 are the scatter plots of surface flux simulation compared with an observation by benchmark run and data assimilation run during wheat and maize growing season, respectively (error statistics see Table 6). In general, compared with the benchmark run, four components of surface heat flux simulation in both wheat and maize season performed better in various degrees. For the wheat growing season, after data assimilation, the relative error of Rn, G, H, and LE simulation was substantially reduced to under 10 W·m −2 , indicating the reduction of systematic error. Moreover, the discrete random error was also reduced to a certain extent according to the RMSE and R 2 statistics; the RMSE of LE reduced from 54.03 to 32.80 W·m −2 after data assimilation. The simulated results of surface heat flux in maize season over benchmark run were not as good as in wheat season due to the higher canopy height [46], but after the data assimilation procedure, the simulation results improved except for G, for which the Eff showed a negative value. The RMSE of LE reduced from 119.78 to 67.44 W·m −2 , and the R 2 increased from 0.56 to 0.72.      To further explore the performance of data assimilation results of surface heat flux, Figures 9 and 10 showed the two-week continuous simulation results for Rn, G, H, and LE from 1 March to 14 March in wheat season and 7 September to 20 September in maize season, respectively. The assimilation curves of the four variables had different correction effects compared with the benchmark run. For Rn, H, and LE simulation, the assimilation curves were closer to the observed values than the simulation results, especially for the peaks of the curve, because in the HELP model, the error accumulation of these three variables is time-related, for example, the Rn at time step t is forced by state variables (Ta, Ts, etc.) calculated at time t − 1, when the data assimilation occurred at t − 1, variables Ta and Ts was updated by introducing the new observation which may lead to improvement or deterioration of the results. On the contrary, the calculation of G is derived by an empirical equation related only to Rn, which means it has weaker relations with the 13 updated state variables. This may explain the bad performance of data assimilation of G.
Remote Sens. 2022, 14, 3183 16 of 25 To further explore the performance of data assimilation results of surface heat flux, Figures 9 and 10 showed the two-week continuous simulation results for Rn, G, H, and LE from 1 March to 14 March in wheat season and 7 September to 20 September in maize season, respectively. The assimilation curves of the four variables had different correction effects compared with the benchmark run. For Rn, H, and LE simulation, the assimilation curves were closer to the observed values than the simulation results, especially for the peaks of the curve, because in the HELP model, the error accumulation of these three variables is time-related, for example, the Rn at time step t is forced by state variables (Ta, Ts, etc.) calculated at time t − 1, when the data assimilation occurred at t − 1, variables Ta and Ts was updated by introducing the new observation which may lead to improvement or deterioration of the results. On the contrary, the calculation of G is derived by an empirical equation related only to Rn, which means it has weaker relations with the 13 updated state variables. This may explain the bad performance of data assimilation of G.   Overall, the verification results of data assimilation compared with observation values showed that updating the state variables of the HELP model could improve model output during the next time steps by making a prediction using the updated states. Noted that in our research, the update of state variables only happened when remote sensing data are available to provide observation value at certain times during clear days, which means by the time the model ran into the data assimilation procedure, the updated state variables were treated as the initial states of the model to support the model running in the following time steps until next available observation value. The results showed that after updating, the improvement of model simulation would last for a certain time, which proves the feasibility of this data assimilation scheme in improving the model performance in practical application.

Analysis of Ensemble Generation
The generation of the ensembles in the data assimilation scheme is a critical issue in ensuring the effective update of state variables. In this study, instead of setting the background covariance error matrix, we generated ensembles by adding random perturbations to the forcing data as described in Section 2.3. 4. We assumed that model simulation Overall, the verification results of data assimilation compared with observation values showed that updating the state variables of the HELP model could improve model output during the next time steps by making a prediction using the updated states. Noted that in our research, the update of state variables only happened when remote sensing data are available to provide observation value at certain times during clear days, which means by the time the model ran into the data assimilation procedure, the updated state variables were treated as the initial states of the model to support the model running in the following time steps until next available observation value. The results showed that after updating, the improvement of model simulation would last for a certain time, which proves the feasibility of this data assimilation scheme in improving the model performance in practical application.

Analysis of Ensemble Generation
The generation of the ensembles in the data assimilation scheme is a critical issue in ensuring the effective update of state variables. In this study, instead of setting the background covariance error matrix, we generated ensembles by adding random perturba-tions to the forcing data as described in Section 2.3. 4. We assumed that model simulation uncertainties will be embodied in the ensembles and will be correctly reflected in the prediction of state variables and outputs. In this case, numerical modeling was tested to make sure the random error setting to the forcing data was transmitted to the key variables by model simulation. Figure 11 shows the ensemble perturbation and its standard deviation (std) of SWCs and LE from 100 ensembles (the time step set to 500). For each ensemble, the initial state variables originated from the ground measurements. The std of SWC showed a wide variety, ranging from 0.0053 to 0.0216 m 3 ·m −3 . The ensemble members were perturbed severely when the precipitation event happened because extra random error from perturbation of precipitation was introduced to the model. After precipitation, the ensembles of SWC quickly converged to the previous level before precipitation. The ensemble of LE showed a similar pattern as SWC, and the std ranged from 1.07 to 54.09 W·m −2 . Different from the SWC ensemble, the influence of precipitation on the LE ensemble lasted a longer time, which reflected that SWC and LE had different response speeds to precipitation. Results showed random noise added to the forcing data could be corrected and conducted to the state variable and thus output value. The dispersion of the ensemble also showed different sensitivity degrees to forcing data.
Remote Sens. 2022, 14,3183 18 of 25 uncertainties will be embodied in the ensembles and will be correctly reflected in the prediction of state variables and outputs. In this case, numerical modeling was tested to make sure the random error setting to the forcing data was transmitted to the key variables by model simulation. Figure 11 shows the ensemble perturbation and its standard deviation (std) of SWCs and LE from 100 ensembles (the time step set to 500). For each ensemble, the initial state variables originated from the ground measurements. The std of SWC showed a wide variety, ranging from 0.0053 to 0.0216 m 3 ·m −3 . The ensemble members were perturbed severely when the precipitation event happened because extra random error from perturbation of precipitation was introduced to the model. After precipitation, the ensembles of SWC quickly converged to the previous level before precipitation. The ensemble of LE showed a similar pattern as SWC, and the std ranged from 1.07 to 54.09 W·m −2 . Different from the SWC ensemble, the influence of precipitation on the LE ensemble lasted a longer time, which reflected that SWC and LE had different response speeds to precipitation. Results showed random noise added to the forcing data could be corrected and conducted to the state variable and thus output value. The dispersion of the ensemble also showed different sensitivity degrees to forcing data. Figure 11. Ensemble perturbation and its std. Figure 12a,b are the scatter plots of benchmark run and ensemble mean for SWCs and LE, respectively. By the benchmark run, the HELP model ran freely without putting perturbation to the forcing data. In this case, the ensemble mean values of SWC and LE are supposed to approach the benchmark run, indicating the ensemble generation does not bring systematic error to the model simulation. The bias of SWCs of the ensemble mean compared with the benchmark run is 0.0019 m 3 ·m −3 , which can be treated as unbiased as a whole, but the time process curve (Figure 12c) showed that the ensemble mean value changed more intense than the benchmark run, especially when precipitation occurred (highlighted with a red square). This indicated that the model error of SWC might be magnified by adding random noise to the forcing data. However, for the LE variable, the ensemble mean value was well fitted with the benchmark run with an R 2 of 0.999. However, a slight systematic error is reflected in the LE ensemble, with a bias of −0.0178 W·m −2 , which is acceptable, though. Basically, the ensemble generation in this study is reasonable and correctly reflects the model error into the state variables and output.  Figure 12a,b are the scatter plots of benchmark run and ensemble mean for SWCs and LE, respectively. By the benchmark run, the HELP model ran freely without putting perturbation to the forcing data. In this case, the ensemble mean values of SWC and LE are supposed to approach the benchmark run, indicating the ensemble generation does not bring systematic error to the model simulation. The bias of SWCs of the ensemble mean compared with the benchmark run is 0.0019 m 3 ·m −3 , which can be treated as unbiased as a whole, but the time process curve (Figure 12c) showed that the ensemble mean value changed more intense than the benchmark run, especially when precipitation occurred (highlighted with a red square). This indicated that the model error of SWC might be magnified by adding random noise to the forcing data. However, for the LE variable, the ensemble mean value was well fitted with the benchmark run with an R 2 of 0.999. However, a slight systematic error is reflected in the LE ensemble, with a bias of −0.0178 W·m −2 , which is acceptable, though. Basically, the ensemble generation in this study is reasonable and correctly reflects the model error into the state variables and output.

Dependency Analysis of State Variables
In the EnKF data assimilation scheme, forcing data, state variables, and output ar related by the Kalman gain matrix and observation operator. Theoretically, if the thre components have a relatively low correlation, the effect of data assimilation may not b significant. Tables 7 and 8 showed the correlation coefficient of HELP model state varia bles and surface flux in wheat and maize season, respectively. Benchmark run results wer used in this analysis to make sure the correlation of state variables was correctly reflecte through the model. The SWC of three layers had a high correlation coefficient, and LE ha a relatively high correlation coefficient with SWC, which approved the positive effect o data assimilation of SWC using LE as an observation operator. Compared with the whea season, the correlation coefficient of LE and SWC in the maize season was higher, whic indicated the results in Section 3.2 that for SWC estimation, the maize season had bette performance than the wheat season.
For both wheat season and maize season, LE had the highest correlation with all th state variables and surface flux, which proved that using remote sensing LE estimation a an observation operator was reasonable and feasible. The results also showed that othe than LE, Ts also had a relatively higher correlation coefficient with surface flux. The cor relation coefficient of Ts and LE in wheat and maize season are 0.63 and 0.66, respectively Figure 13 is the scatter plot of Ts simulation compared with an observation by benchmar run, and data assimilation ran during wheat and maize season, respectively. Result showed data assimilation procedure had a larger effect on Ts simulation compared wit SWC. In the wheat season, the benchmark run had obvious systematic error compare with observation; after data assimilation, the bias reduced to −1.074 °C, RMSE reduced t 1.786 °C, and the Eff of data assimilation was 94.8%, and R 2 increased from 0.661 to 0.955 In maize season, the benchmark run had better results, but after data assimilation, th random error also reduced to a large extent, with an Eff of 73.0%, R 2 increased from 0.94 to 0.968, and RMSE reduced from 2.473 to 1.286 °C. Results showed the assimilation usin LE as an observation operator had an obvious effect in improving the simulation Ts a expected.

Dependency Analysis of State Variables
In the EnKF data assimilation scheme, forcing data, state variables, and output are related by the Kalman gain matrix and observation operator. Theoretically, if the three components have a relatively low correlation, the effect of data assimilation may not be significant. Tables 7 and 8 showed the correlation coefficient of HELP model state variables and surface flux in wheat and maize season, respectively. Benchmark run results were used in this analysis to make sure the correlation of state variables was correctly reflected through the model. The SWC of three layers had a high correlation coefficient, and LE had a relatively high correlation coefficient with SWC, which approved the positive effect of data assimilation of SWC using LE as an observation operator. Compared with the wheat season, the correlation coefficient of LE and SWC in the maize season was higher, which indicated the results in Section 3.2 that for SWC estimation, the maize season had better performance than the wheat season.
For both wheat season and maize season, LE had the highest correlation with all the state variables and surface flux, which proved that using remote sensing LE estimation as an observation operator was reasonable and feasible. The results also showed that other than LE, Ts also had a relatively higher correlation coefficient with surface flux. The correlation coefficient of Ts and LE in wheat and maize season are 0.63 and 0.66, respectively. Figure 13 is the scatter plot of Ts simulation compared with an observation by benchmark run, and data assimilation ran during wheat and maize season, respectively. Results showed data assimilation procedure had a larger effect on Ts simulation compared with SWC. In the wheat season, the benchmark run had obvious systematic error compared with observation; after data assimilation, the bias reduced to −1.074 • C, RMSE reduced to 1.786 • C, and the Eff of data assimilation was 94.8%, and R 2 increased from 0.661 to 0.955. In maize season, the benchmark run had better results, but after data assimilation, the random error also reduced to a large extent, with an Eff of 73.0%, R 2 increased from 0.943 to 0.968, and RMSE reduced from 2.473 to 1.286 • C. Results showed the assimilation using LE as an observation operator had an obvious effect in improving the simulation Ts as expected.   Figure 13. The scatter plot of Ts simulation compared with observation by benchmark run and data assimilation ran. Figure 13. The scatter plot of Ts simulation compared with observation by benchmark run and da-ta assimilation ran. Figure 14a,b showed the two-week continuous simulation results for Ts from 1 March to 14 March in wheat season and 7 September to 20 September in maize season, respectively. Results showed the data assimilation improved the Ts simulation, especially for errors in the peaks of the curve. However, despite the data assimilation having overall great performance, it did not improve the simulation for every single day. For example, on 4 March and 10 September, after data assimilation, the error of simulation increased compared with the benchmark run, which is due to the complexity of the mechanism in the energy transfer process and simplification in the model structure.
Remote Sens. 2022, 14, 3183 21 of 25 Figure 14a,b showed the two-week continuous simulation results for Ts from 1 March to 14 March in wheat season and 7 September to 20 September in maize season, respectively. Results showed the data assimilation improved the Ts simulation, especially for errors in the peaks of the curve. However, despite the data assimilation having overall great performance, it did not improve the simulation for every single day. For example, on 4 March and 10 September, after data assimilation, the error of simulation increased compared with the benchmark run, which is due to the complexity of the mechanism in the energy transfer process and simplification in the model structure.

Limitations and Future Works
In this study, the data assimilation method was used to simulate SWC and surface fluxes continuously, which was verified in the Weishan Irrigation District. However, there is some experience in this method, such as the occurrence time of the maximum Ta was determined by the method of fitting ground observations. In addition, this method was only validated under wheat and maize crop types in the Weishan Irrigation District. The extension of the model also requires ground verification in different climate zones and underlying surface types. At the same time, the applicable conditions and limitations of the model should be analyzed, and the parameter values of the model under different underlying surfaces should be modified and improved.
In addition, this study only conducted a simulation for one year. As the underlying surface is relatively uniform, parameterization schemes for different underlying surfaces are also relatively simple. In the process of SWC assimilation simulation, integration application with existing SWC products such as SMAP, CLDAS, and SMOS was not considered. At present, SWC fusion is a hot topic, and the existing SWC products have the advantage of high temporal resolution. In the future, it is necessary to further refine the parameterization scheme of the model for different underlying surface types. In order to provide a reference for regional water resources management, a long-term series of SWC and surface fluxes simulations was carried out by integrating existing SWC products. This study proved that using remote sensing observations as data assimilation can effectively

Limitations and Future Works
In this study, the data assimilation method was used to simulate SWC and surface fluxes continuously, which was verified in the Weishan Irrigation District. However, there is some experience in this method, such as the occurrence time of the maximum Ta was determined by the method of fitting ground observations. In addition, this method was only validated under wheat and maize crop types in the Weishan Irrigation District. The extension of the model also requires ground verification in different climate zones and underlying surface types. At the same time, the applicable conditions and limitations of the model should be analyzed, and the parameter values of the model under different underlying surfaces should be modified and improved.
In addition, this study only conducted a simulation for one year. As the underlying surface is relatively uniform, parameterization schemes for different underlying surfaces are also relatively simple. In the process of SWC assimilation simulation, integration application with existing SWC products such as SMAP, CLDAS, and SMOS was not considered. At present, SWC fusion is a hot topic, and the existing SWC products have the advantage of high temporal resolution. In the future, it is necessary to further refine the parameterization scheme of the model for different underlying surface types. In order to provide a reference for regional water resources management, a long-term series of SWC and surface fluxes simulations was carried out by integrating existing SWC products. This study proved that using remote sensing observations as data assimilation can effectively improve the continuous simulation of SWC and surface flux in land surface process models. In this study, only one remote sensing data source was used to analyze data assimilation results. In the following research, multi-objective data assimilation of multi-source remote sensing data should also be considered. Remote sensing rainfall and SWC should be introduced into the data assimilation system for multi-objective data assimilation.