Improving Soil Moisture Estimation via Assimilation of Remote Sensing Product into the DSSAT Crop Model and Its Effect on Agricultural Drought Monitoring

Abstract: Accurate knowledge of soil moisture is crucial for agricultural drought monitoring. Data assimilation has proven to be a promising technique for improving soil moisture estimation, and various studies have been conducted on soil moisture data assimilation based on land surface models. However, crop growth models, which are ideal tools for agricultural simulation applications, are rarely used for soil moisture assimilation. Moreover, the role of data assimilation in agricultural drought monitoring is seldom investigated. In the present work, we assimilated the European Space Agency (ESA) Climate Change Initiative (CCI) soil moisture product into the Decision Support System for Agro-technology Transfer (DSSAT) model to estimate surface and root-zone soil moisture, and we evaluated the effect of data assimilation on agricultural drought monitoring. The results demonstrate that the soil moisture estimates were significantly improved after data assimilation. Root-zone soil moisture agreed better with in situ observations. Compared with the drought index based on soil moisture modeled without remotely sensed observations, the drought index based on assimilated data could improve agricultural drought monitoring by at least one drought level and performed better when compared with winter wheat yield. In conclusion, crop growth model-based data assimilation effectively improves soil moisture estimation and further strengthens soil moisture-based drought indices for agricultural drought monitoring.


Introduction
Drought is a complex and destructive climate hazard that can cause tremendous damage to the economy and the environment, especially in the agricultural sector. Typically, meteorological, agricultural, hydrological, and socioeconomic droughts are recognized as the four common types of drought [1]. Compared with other types of drought, agricultural drought has a more immediate and obvious impact [2] and threatens regional and even national food security [3,4]. Under the scenario of global climate change, future drought may be exacerbated [5], and along with other climate hazards, drought may pose a heightened threat to humanity [6]. Thus, accurate and effective agricultural drought monitoring is of great significance for impact assessment and targeted adaptation planning.
Agricultural drought involves insufficient soil moisture due to the imbalance between water supply and crop water demand, which detrimentally affects crop growth and yield [7]. Hence, soil moisture becomes an essential indicator for assessing agricultural drought [2]. crop growth and environmental factors, and drought indices combined with the agronomic parameters from the model output can better monitor agricultural drought. In current studies, crop growth model-based data assimilation is generally used for leaf area index, as well as yield or biomass estimation and prediction [31][32][33][34][35]. The linkage between crop models and remote sensing observations for improving surface and root-zone soil moisture is relatively rare, and evaluations of the impact of DA-based soil moisture estimates on agricultural drought monitoring are still lacking. Consequently, the goals of this study were: (1) to assimilate remote sensing data into a crop model to estimate soil moisture; (2) to investigate the impacts of different strategies (simultaneous state-parameter estimation, ensemble size, and assimilation interval) on soil moisture data assimilation; and (3) to evaluate the extent to which soil moisture assimilation enhances the level of agricultural drought monitoring.

Study Area
The focus of this article is on the Huang-Huai-Hai (HHH) Plain, the second largest plain in China, which is located between 32° and 41° N and 112° and 123° E, covering 390,000 km² (Figure 1). The HHH Plain is an important food-producing base in China, accounting for approximately 16% of the total cultivated land area [36][37][38]. The prevailing cropping system is the rotation of winter wheat and summer corn, and wheat and corn production account for 70% and 30% of the national total production, respectively [39]. The HHH Plain is affected by a warm-temperate monsoon climate, and the average temperatures in winter and summer are approximately 0 °C and 26 °C, respectively. Precipitation varies significantly in both time and space. The average annual precipitation ranges from 400 to 1000 mm, increasing from north to south, but 60-80% of the annual precipitation is concentrated in the summer corn growing season. Since winter wheat is more susceptible to drought than corn, the winter wheat growing season was selected as the experimental period. Considering the differences in topography, climate, hydrology, and agricultural production conditions, the HHH Plain consists of four divisions, as shown in Figure 1. Eight agro-meteorological sites distributed across the four agricultural subregions were selected in this study.

Meteorological Data
Meteorological variables of daily precipitation, air temperature, wind speed, and solar radiation were required to drive the crop growth simulation. Solar radiation was calculated from sunshine duration using the Ångström equation [40]. The other four meteorological variables and sunshine duration were observed by meteorological stations in China (http://data.cma.cn/ (accessed on 1 June 2022)).
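As an illustration of the Ångström relation referred to above, the following minimal sketch converts sunshine duration to solar radiation. The coefficients a_s = 0.25 and b_s = 0.50 are the default values recommended in FAO-56 and are an assumption here, since the coefficients used in this study are not stated; the function and argument names are hypothetical, and the extraterrestrial radiation and day length are taken as given inputs.

```python
def angstrom_solar_radiation(sunshine_hours, daylight_hours,
                             extraterrestrial_rad, a_s=0.25, b_s=0.50):
    """Estimate daily solar radiation Rs = (a_s + b_s * n / N) * Ra.

    sunshine_hours       -- measured sunshine duration n (h)
    daylight_hours       -- maximum possible sunshine duration N (h)
    extraterrestrial_rad -- extraterrestrial radiation Ra (MJ m-2 day-1)
    a_s, b_s             -- ASSUMPTION: FAO-56 default coefficients
    """
    if daylight_hours <= 0:
        raise ValueError("daylight_hours must be positive")
    # Keep the relative sunshine duration within its physical range [0, 1].
    ratio = min(max(sunshine_hours / daylight_hours, 0.0), 1.0)
    return (a_s + b_s * ratio) * extraterrestrial_rad


# Hypothetical day: 7 h of sunshine out of 12 h, Ra = 30 MJ m-2 day-1.
rs_example = angstrom_solar_radiation(7.0, 12.0, 30.0)
```

Site-calibrated coefficients, where available, would replace the defaults without changing the structure of the calculation.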

Agro-Meteorological Data
Agro-meteorological data, including wheat phenology, yield, soil moisture, and soil properties, were also collected in this study; these data were derived from the observations of agro-meteorological sites affiliated with the China Meteorological Administration. Wheat phenology and yield were primarily used for crop growth model calibration. The phenology was measured every 10 days during the growing season, and yield was measured after harvest. Soil moisture was used for model initialization and data assimilation evaluation. We collected soil moisture at 10, 20, 30, 40 and 50 cm depths at the representative HBBZ site, as well as at 10, 20 and 50 cm depths at seven other sites. Soil moisture was measured every 10 days during the growing season. Gravimetric measurement was adopted with four replicates, and the averaged value was used. The soil moisture obtained in this way is gravimetric water content. In order to match the satellite soil moisture data, it was then multiplied by soil bulk density and converted to volumetric water content. The soil moisture used in this study refers to volumetric water content. Soil properties included the soil texture, bulk density, field water-holding capacity, and wilting point at the corresponding layers, and these properties were used for model input and soil moisture index calculation. Soil properties were measured every 5-10 years depending on the actual condition of each site.
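The conversion from gravimetric to volumetric water content described above can be sketched as follows. The replicate values and the bulk density in the example are hypothetical, and the function names are illustrative only; the sketch assumes bulk density in g·cm⁻³ and a water density of 1 g·cm⁻³.

```python
def mean_of_replicates(replicates):
    """Average the replicate gravimetric measurements (g water / g dry soil)."""
    return sum(replicates) / len(replicates)


def gravimetric_to_volumetric(gravimetric, bulk_density, water_density=1.0):
    """Convert gravimetric water content to volumetric water content.

    theta_v = theta_g * rho_b / rho_w, with densities in g cm-3.
    """
    return gravimetric * bulk_density / water_density


# Hypothetical example: four replicates at one depth, bulk density 1.4 g cm-3.
theta_g = mean_of_replicates([0.15, 0.16, 0.14, 0.15])
theta_v = gravimetric_to_volumetric(theta_g, bulk_density=1.4)
```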

Satellite Soil Moisture
Soil moisture from the ESA (European Space Agency) CCI (Climate Change Initiative) was used as the observation for soil moisture data assimilation. ESA CCI soil moisture comprises passive, active and combined products based on various microwave sensors, which have been extensively used in earth system applications, e.g., climate change, land-atmosphere interactions, drought monitoring and assessment, and land surface modeling [18]. The latest version is v07.1, which can be obtained from http://www.esa-soilmoisture-cci.org (accessed on 1 June 2022). This product covers the time range from November 1978 to December 2021, with spatial and temporal resolutions of 0.25° and 1 day, respectively. The combined product during the period from 2002 to 2011 was used in this study. For more details regarding this product, please refer to Liu et al. [41], Dorigo et al. [18] and Gruber et al. [42].

Crop Model Description and Verification
The Decision Support System for Agro-technology Transfer (DSSAT)-Cropping System Model (CSM)-Wheat model (herein referred to as the DSSAT model [43]) was used as the model operator for data assimilation. The DSSAT model was developed by the US Department of Agriculture, and it can simulate the vegetative and reproductive growth, physiological and ecological processes, and soil water balance on a daily basis. Input data fall into four categories: weather; crop; soil; and management data. The weather data should at least include daily maximum and minimum temperature, solar radiation, and precipitation. Crop data comprise the genetic and ecological parameters, which describe crop photosynthesis, respiration, phenological stage and other physiological processes. Soil data represent the soil physical, chemical, and morphological parameters throughout the soil profile. Management data include a number of parameters regarding planting, irrigation, fertilization, etc. The soil moisture simulation in the DSSAT model was implemented using the soil water balance module [44]. Soil water balance involves the following processes: infiltration; evapotranspiration (ET); runoff; and drainage. Soil water infiltration is calculated as the sum of rainfall and irrigation minus surface runoff. Surface runoff is computed based on the SCS-CN model [45] developed by the US Department of Agriculture. The soil-plant-atmosphere module (SPAM) was developed to compute the daily ET, and it provides two methods, the Priestley-Taylor [46] and FAO-56 Penman-Monteith [40] methods, to calculate the potential ET. Actual ET is calculated based on potential ET and soil water stress factors [47].
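To make the infiltration and runoff terms of the water balance concrete, the following minimal sketch uses the textbook form of the SCS curve number method (metric units, initial abstraction equal to 0.2 S). It is not the DSSAT implementation, which may differ in detail; the function names are hypothetical, and applying the runoff calculation to rainfall only is an assumption of this sketch.

```python
def scs_cn_runoff(rain_mm, curve_number):
    """Daily surface runoff (mm) from the textbook SCS-CN relation."""
    if not 0 < curve_number <= 100:
        raise ValueError("curve_number must be in (0, 100]")
    retention = 25400.0 / curve_number - 254.0      # potential retention S (mm)
    initial_abstraction = 0.2 * retention           # Ia = 0.2 S
    if rain_mm <= initial_abstraction:
        return 0.0                                  # all rain is abstracted
    excess = rain_mm - initial_abstraction
    return excess ** 2 / (excess + retention)


def infiltration(rain_mm, irrigation_mm, curve_number):
    """Infiltration = rainfall + irrigation - surface runoff (mm).

    ASSUMPTION: runoff is generated by rainfall only, not by irrigation.
    """
    return rain_mm + irrigation_mm - scs_cn_runoff(rain_mm, curve_number)


# Hypothetical day: 50 mm of rain, 20 mm of irrigation, curve number 75.
runoff_example = scs_cn_runoff(50.0, 75.0)
infiltration_example = infiltration(50.0, 20.0, 75.0)
```

In this form the curve number (SLRO in DSSAT) controls how much of a rain event is lost as runoff, which is why it is a natural candidate for the parameter estimation discussed later in the paper.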
Before using the DSSAT model, the model parameters should be calibrated according to the local climate, soil conditions, management methods, and cropping system, so that the model accurately represents the characteristics of local crops. Crop phenology and yield were used to calibrate seven critical cultivar parameters (P1V, P1D, P5, G1, G2, G3, and PHINT; for a detailed description, see Jones et al. [43]). The parameters suitable for the study area were calibrated and validated in our previous study; for more details about the model verification process, please refer to Zhou et al. [13].

Assimilation Algorithms
The Augmented Ensemble Kalman Filter (AEnKF) [30,48] was used as the assimilation algorithm in the present paper. Building on the Ensemble Kalman Filter (EnKF; [49]), AEnKF also treats model parameters as state variables so that state variables and model parameters are estimated simultaneously. Hence, EnKF is the foundation of AEnKF. The implementation of EnKF involves the following steps: initialization; forecast; and analysis. Let $X = [x_1, x_2, \cdots, x_n]^T$ be the state variable, where $x_i$ represents the soil moisture in the different soil layers and crop model parameters in this study. First, the state ensemble is initialized as follows (Equation (1)):

$$X^a_{i,0} = \bar{X}^a_0 + w_i, \quad w_i \sim N(0, P)$$

where $X^a_{i,0}$ is the i-th initial state member (i = 1, 2, ..., n; n is the ensemble size), $\bar{X}^a_0$ is the expectation of the background field, $w_i$ is the random error, $N$ indicates the Gaussian distribution, and $P$ is the background covariance matrix. Note that the initial state ensemble can also be generated by perturbing model parameters.
Second, each ensemble member is forecasted by the model operator as follows (Equation (2)):

$$X^f_{i,t+1} = M(X^a_{i,t}, \alpha, \beta) + \mu_i, \quad \mu_i \sim N(0, Q)$$

where $X^f_{i,t+1}$ is the forecasted state variable of the i-th ensemble member at the time t + 1; $M(\cdot)$ is the model operator; $\alpha$ represents the model forcing data; $\beta$ is the model parameter; $\mu_i$ is the model error; and $Q$ is the model covariance matrix.
Third, when the observations are available, each ensemble member is analyzed based on the following linear combination of forecasted state and observation (Equation (3)):

$$X^a_{i,t+1} = X^f_{i,t+1} + K_{t+1}\left[Y_{t+1} + v_i - H(X^f_{i,t+1})\right], \quad v_i \sim N(0, R)$$

where $X^a_{i,t+1}$ is the analysis state variable, $Y_{t+1}$ is the observation, $v_i$ is the observation error, $R$ is the observation covariance matrix, $H(\cdot)$ is the observation operator, and $K_{t+1}$ is the Kalman gain matrix defined as follows:

$$K_{t+1} = P^f_{t+1} H^T \left(H P^f_{t+1} H^T + R\right)^{-1}$$

$$P^f_{t+1} = \frac{1}{n-1} \sum_{i=1}^{n} \left(X^f_{i,t+1} - \bar{X}^f_{t+1}\right)\left(X^f_{i,t+1} - \bar{X}^f_{t+1}\right)^T$$

$$\bar{X}^f_{t+1} = \frac{1}{n} \sum_{i=1}^{n} X^f_{i,t+1}$$

where $P^f_{t+1}$ is the forecasted covariance matrix, and $\bar{X}^f_{t+1}$ is the average value of forecasted ensemble members.
Finally, the analysis state variable is calculated by averaging the ensemble members (Equation (9)):

$$\bar{X}^a_{t+1} = \frac{1}{n} \sum_{i=1}^{n} X^a_{i,t+1}$$

When the next observation becomes available, the above process is repeated, and this continues until the end of data assimilation.
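The analysis step described above can be sketched in a few lines of NumPy. This is a minimal illustration of a perturbed-observation EnKF update with a linear observation operator, not the code used in this study; the three-layer state, the forecast values, and the function name `enkf_analysis` are hypothetical.

```python
import numpy as np


def enkf_analysis(forecast, obs, obs_err_std, H, rng):
    """One EnKF analysis step with perturbed observations.

    forecast    -- array (n_state, n_members), forecast ensemble
    obs         -- scalar observation (e.g. surface soil moisture)
    obs_err_std -- observation error standard deviation
    H           -- array (1, n_state), linear observation operator
    """
    n_members = forecast.shape[1]
    mean_f = forecast.mean(axis=1, keepdims=True)
    anomalies = forecast - mean_f
    P_f = anomalies @ anomalies.T / (n_members - 1)       # forecast covariance
    R = np.array([[obs_err_std ** 2]])                    # observation covariance
    K = P_f @ H.T @ np.linalg.inv(H @ P_f @ H.T + R)      # Kalman gain
    # Each member sees the observation plus its own random error v_i.
    perturbed_obs = obs + rng.normal(0.0, obs_err_std, size=(1, n_members))
    analysis = forecast + K @ (perturbed_obs - H @ forecast)
    return analysis, analysis.mean(axis=1)


rng = np.random.default_rng(42)
# Hypothetical forecast: three layers sharing a common error component,
# so that the surface observation can also correct the deeper layers.
shared = 0.03 * rng.standard_normal((1, 20))
forecast = np.array([[0.20], [0.24], [0.28]]) + shared \
    + 0.01 * rng.standard_normal((3, 20))
H = np.array([[1.0, 0.0, 0.0]])                           # observe top layer only
analysis, analysis_mean = enkf_analysis(forecast, 0.30, 0.06, H, rng)
```

Because only the top layer is observed, the deeper layers are updated solely through their ensemble covariance with the surface layer, which is the mechanism by which a surface product can improve root-zone estimates.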

Data Assimilation Implementation
The data assimilation framework includes three key components: model operator; observation; and algorithm. The soil water balance module of the DSSAT model was used as the model operator for soil moisture data assimilation. Soil moisture simulations of nine winter wheat growing seasons from 2002 to 2011 were conducted at the eight agro-meteorological sites in the HHH Plain. Management parameters, such as planting date, density, depth, soil properties, fertilization, and irrigation, were set according to the records of the agro-meteorological sites.
The ESA CCI soil moisture product was used as observation data of the assimilation framework. Given the differences in the climatology and spatial resolution of in situ and satellite observations, bias correction of satellite soil moisture was required prior to data assimilation. In this study, the cumulative distribution function matching approach was applied for bias correction, which has been widely applied in data assimilation studies (e.g., [24,29,50]). Referring to existing studies [18,24,34,51], the observation error was set to 0.06 m³·m⁻³.
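The idea behind cumulative distribution function (CDF) matching can be sketched as an empirical quantile mapping. The version below is one common non-parametric variant and is an assumption, since the exact variant used in this study (e.g., piecewise-linear or polynomial fitting) is not specified; the reference series, the synthetic data, and the function name are hypothetical.

```python
import numpy as np


def cdf_match(satellite, reference):
    """Rescale satellite values so their empirical CDF matches the reference.

    Each satellite value is converted to its non-exceedance probability in
    the satellite series and then mapped to the reference value that has
    the same probability.
    """
    satellite = np.asarray(satellite, dtype=float)
    reference = np.asarray(reference, dtype=float)
    sat_sorted = np.sort(satellite)
    ref_sorted = np.sort(reference)
    # Plotting positions for the two empirical CDFs.
    p_sat = (np.arange(sat_sorted.size) + 0.5) / sat_sorted.size
    p_ref = (np.arange(ref_sorted.size) + 0.5) / ref_sorted.size
    probs = np.interp(satellite, sat_sorted, p_sat)
    return np.interp(probs, p_ref, ref_sorted)


rng = np.random.default_rng(0)
# Synthetic example: the "satellite" series is a biased, compressed copy
# of the reference climatology.
reference = 0.25 + 0.05 * rng.standard_normal(200)
satellite = 0.6 * reference + 0.15
matched = cdf_match(satellite, reference)
```

In this synthetic case the satellite series is a monotonic transform of the reference, so the mapping recovers the reference values; with real data only the distributions, not the individual values, are made to agree.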
The AEnKF was selected as the algorithm for soil moisture data assimilation, which was described in detail in Section 3.2. In this study, the state variable vector consists of soil moisture at different depths and model parameters that were sensitive to soil moisture. Soil parameters such as lower limit (SLLL), drained upper limit (SDUL), saturated limit (SSAT), drainage rate (SLDR) and runoff curve number (SLRO) in the DSSAT model have significant impacts on soil moisture estimation [47]. Thus, these soil parameters were also included as state variables, and the state variable vector was as follows:

$$X_t = [SM_1, SM_2, \cdots, SM_n, SLLL, SDUL, SSAT, SLDR, SLRO]^T$$

where $X_t$ is the state variable vector at the time t, $SM_i$ is the soil moisture at soil layer i, and n is the number of soil layers. Since the ESA CCI data only reflect soil moisture information for the top layer, the observation operator $H$ was set to $[1\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0]^T$. Following the study of Ines et al. [34], an uncertainty level of 10% of the model soil parameters was introduced to perturb the initial state variables. The ensemble size was set to 20 by considering the estimation accuracy and computational time. Additionally, as the assimilation time increases, the standard EnKF algorithm tends to exhibit a "filter divergence" phenomenon; that is, the analysis value gradually approaches the model forecast and ignores the observation. According to existing studies [33,34], when the observation error covariance is greater than four times the model forecast error covariance ($R / P^f_{t+1} > 4$), an inflation factor of 1.05 was used to enlarge the Kalman gain to reduce this phenomenon. In addition, the impacts of ensemble size, model parameter, and assimilation interval on data assimilation are discussed in Section 4.2.
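The augmented state vector and the gain inflation rule described above can be sketched as follows. The soil moisture and parameter values are hypothetical, and the way the 10% uncertainty is applied (Gaussian noise with a relative standard deviation of 0.1 on each parameter) is an assumption of this sketch rather than a detail confirmed by the text. The observation operator is written as a row vector for use in matrix products.

```python
import numpy as np

PARAM_NAMES = ["SLLL", "SDUL", "SSAT", "SLDR", "SLRO"]


def build_augmented_ensemble(soil_moisture, soil_params, n_members, rel_std, rng):
    """Stack layer soil moisture and perturbed soil parameters per member."""
    soil_moisture = np.asarray(soil_moisture, dtype=float)
    soil_params = np.asarray(soil_params, dtype=float)
    noise = rng.standard_normal((soil_params.size, n_members))
    # ASSUMPTION: multiplicative Gaussian perturbation of each parameter.
    params = soil_params[:, None] * (1.0 + rel_std * noise)
    # Initial soil moisture is identical across members; its spread develops
    # once the model is run forward with the perturbed parameters.
    states = np.repeat(soil_moisture[:, None], n_members, axis=1)
    return np.vstack([states, params])


def inflate_gain(gain, obs_var, forecast_var, threshold=4.0, factor=1.05):
    """Enlarge the Kalman gain when R / P_f exceeds the threshold."""
    if forecast_var > 0 and obs_var / forecast_var > threshold:
        return gain * factor
    return gain


rng = np.random.default_rng(0)
soil_moisture = [0.24, 0.25, 0.26, 0.27, 0.28]        # hypothetical, 5 layers
soil_params = [0.12, 0.30, 0.42, 0.40, 76.0]          # hypothetical values
ensemble = build_augmented_ensemble(soil_moisture, soil_params, 20, 0.10, rng)

H = np.zeros((1, ensemble.shape[0]))
H[0, 0] = 1.0                                         # only the top layer is observed
```

With five soil layers and five parameters the state has ten elements, which matches the length of the observation operator given in the text.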

Agricultural Drought Index
The root-weighted soil moisture index (RSMI [13]) was used as an agricultural drought index to evaluate the role of data assimilation. There are two main reasons for choosing the RSMI as an indicator: (a) RSMI considers the influence of soil properties and root distribution on crop water uptake, and (b) RSMI does not require long-term time series data. The RSMI is computed from the following quantities: SMI is the soil moisture index; RW is the weight value; n is the number of soil layers down to the deepest layer reached by the roots, which varies with the crop growth stage; z is the soil thickness; RLD is the root length density; and θ, θ_fc and θ_wp represent soil moisture, field capacity and wilting point, respectively. The drought classifications of the RSMI are shown in Table 1. For the full formulation of the RSMI, please refer to Zhou et al. [13].
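A root-weighted index built from the quantities listed above can be sketched as follows. Both the layer index scaling (a linear map of the available water fraction onto the range from -5 at wilting point to 5 at field capacity) and the weight definition (root length density times layer thickness, normalized over the rooted layers) are assumptions made for illustration; the authoritative definition is the one in Zhou et al. [13]. All numerical values and function names are hypothetical.

```python
import numpy as np


def soil_moisture_index(theta, theta_fc, theta_wp):
    """ASSUMED layer index: -5 at wilting point, +5 at field capacity."""
    return -5.0 + 10.0 * (theta - theta_wp) / (theta_fc - theta_wp)


def root_weights(rld, thickness):
    """ASSUMED weights: root length density times thickness, normalized."""
    amount = np.asarray(rld, dtype=float) * np.asarray(thickness, dtype=float)
    return amount / amount.sum()


def rsmi(theta, theta_fc, theta_wp, rld, thickness):
    """Root-weighted sum of the layer indices over the rooted layers."""
    smi = soil_moisture_index(np.asarray(theta, dtype=float),
                              np.asarray(theta_fc, dtype=float),
                              np.asarray(theta_wp, dtype=float))
    return float(np.sum(root_weights(rld, thickness) * smi))


# Hypothetical three-layer profile (volumetric water content, cm, cm cm-3).
theta_fc = [0.32, 0.33, 0.34]
theta_wp = [0.12, 0.13, 0.14]
rld = [2.0, 1.0, 0.5]
thickness = [10.0, 10.0, 30.0]
rsmi_example = rsmi([0.20, 0.22, 0.26], theta_fc, theta_wp, rld, thickness)
```

Under these assumptions, layers that hold more roots dominate the index, so a dry surface layer matters more early in the season than it does once roots have extended deeper.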

Evaluation Criteria
The root-mean-square error (RMSE, m³·m⁻³), correlation coefficient (R), and effectiveness (Eff, %) [52] were used to quantitatively measure the capability of data assimilation. These indicators were calculated as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (S_i - O_i)^2}$$

$$R = \frac{\sum_{i=1}^{n} (S_i - \bar{S})(O_i - \bar{O})}{\sqrt{\sum_{i=1}^{n} (S_i - \bar{S})^2 \sum_{i=1}^{n} (O_i - \bar{O})^2}}$$

$$\mathrm{Eff} = \left(1 - \frac{\sum_{i=1}^{n} (S^a_i - O_i)^2}{\sum_{i=1}^{n} (S^b_i - O_i)^2}\right) \times 100\%$$

where $S_i$ is the estimated (simulated or assimilated) soil moisture at the time i; $O_i$ is the measured soil moisture; n is the number of measurements; $\bar{S}$ and $\bar{O}$ represent the average value of the estimated and measured soil moisture, respectively; $S^b_i$ and $S^a_i$ represent the soil moisture before and after data assimilation, respectively; and Eff indicates the efficiency of the data assimilation. If Eff is greater than 0, the soil moisture accuracy is improved after data assimilation; otherwise, a negative value denotes a degraded performance.

Figure 2 shows the comparison of the open-loop and assimilated soil moisture at the HBBZ site. Open-loop soil moisture is the direct simulation from the DSSAT model without data assimilation. As shown, relatively large errors were present in the open-loop simulations compared with the measured soil moisture, and open-loop soil moisture was significantly lower than observed data in the late growth stage of winter wheat. In contrast, assimilated results were in good agreement with the observations. Quantitatively, the goodness-of-fit statistics (RMSE, R, and Eff) at the HBBZ site are shown in Table 2.

We also compared the open-loop and assimilated soil moisture with in situ observations at all eight agro-meteorological sites. Except for the HBBZ site, measured soil moisture at the other sites was collected at three depths: 10, 20, and 50 cm. Figure 3 shows the scatter plots and histograms of soil moisture estimates and observations. The results indicated that assimilated soil moisture matched in situ observations better than the open-loop simulations did. Additionally, compared with open-loop simulations, the histograms of assimilated soil moisture were more similar to those of the observations.
For the eight sites, the average RMSEs of open-loop simulations at depths of 10, 20, and 50 cm were 0.088, 0.091, and 0.056 m³·m⁻³, respectively, and the average RMSEs of assimilated soil moisture were 0.047, 0.046, and 0.045 m³·m⁻³, respectively (Table 3). The correlations between the assimilated soil moisture and the observations were improved at each layer. The Eff values of the soil moisture at each layer were greater than zero, indicating that moisture estimates throughout the soil profile were improved via data assimilation.

Root-zone soil moisture plays a significant role in agricultural drought monitoring. For cereal crops, roots are mainly distributed at a depth of 0-50 cm; thus, soil moisture within these depths has often been used to monitor and predict agricultural drought [2,53]. In this study, the average of soil moisture in each layer within 0-50 cm was regarded as the root-zone value. The assimilation results and error statistics of the root-zone soil moisture at the HBBZ site are also shown in Figure 2 and Table 2. The RMSEs of open-loop and assimilated soil moisture were 0.067 and 0.038 m³·m⁻³, respectively. Similarly, Figure 3 and Table 3 display the results of the root-zone soil moisture at the eight sites. In comparison with the open-loop simulations, the RMSE of the root-zone soil moisture was reduced from 0.065 to 0.035 m³·m⁻³, and R was increased from 0.579 to 0.742.
Furthermore, compared with the soil moisture at each layer, root-zone soil moisture also had a better agreement with in situ observations.

Comparisons of Open
During the process of data assimilation, initial conditions, model parameters, external observations and the algorithm are the main factors affecting the accuracy of state variable estimation. Hence, the accuracy of the model parameters is important for improving state variable estimation when other conditions are the same. In the AEnKF algorithm, state variables and model parameters are updated at the same time according to the Kalman gain, which is theoretically expected to enhance the soil moisture estimation. Here we compared the results based on the AEnKF algorithm (simultaneous state-parameter estimation) and EnKF algorithm (state-only estimation) at the HBBZ site during  (0.029), respectively. The errors of assimilated soil moisture at each soil layer based on AEnKF were less than those based on the EnKF algorithm. According to the time series data and error statistics, the performance of the AEnKF-based soil moisture data assimilation was better, which also indicated that the simultaneous state-parameter estimation was effective for improving state variable estimation in data assimilation.


Impact of Ensemble Size
The EnKF algorithm uses a Monte Carlo approach to generate the model error distribution. Theoretically, as ensemble size increases, the model error distribution can be better characterized. However, a significant expansion of ensemble size leads to a dramatic increase in model calculation costs. Therefore, in practical applications, and especially for complex models with many parameters, ensemble size is often selected as a compromise between accuracy and time cost, so that accuracy is ensured while calculation time is reduced. In this section, we compared the accuracy of the assimilated soil moisture with five, ten, twenty, and fifty ensemble members during the growing seasons of 2004-2005 and 2005-2006. Figure 6 shows the time series of soil moisture estimates with different ensemble members. Generally, the fluctuations of the assimilated soil moisture time series under different ensemble members were similar. When the ensemble size was small, the fluctuations in soil moisture were relatively large; conversely, when the ensemble size was large, the change was relatively small. As ensemble size increased, the error of assimilated soil moisture decreased to some extent (Figure 7). However, overall, no obvious difference in the error of assimilation results occurred under the different ensemble members. The error of the root-zone soil moisture was the smallest, with RMSEs of 0.033, 0.029, 0.029, and 0.027 m³·m⁻³ under 5, 10, 20, and 50 ensemble members, respectively. In previous studies, ensemble size was mostly set between 10 and 50 [24,33,34,54]. According to the error statistics, the RMSEs of the assimilated soil moisture under 20 and 50 members were similar. Compared with the observation data and model structure error, the influence of ensemble size on the soil moisture data assimilation was marginal. In view of the high dimensionality of the state variables in this study, the calculation cost of a larger ensemble size was relatively high.
Thus, considering the calculation accuracy and time cost, 20 ensemble members were deemed more appropriate for soil moisture assimilation at the other sites. Remote Sens. 2022, 14, x FOR PEER REVIEW 14 of 23

Impact of Temporal Interval
Although the ESA CCI soil moisture product has a one-day temporal resolution, the acquisition of continuous daily data during the growing season is difficult. This problem can be caused by many reasons, including satellite orbit changes, radio frequency interference (RFI), underlying surface (e.g., snow and ice), and physical limitations of satellite sensors [18,55,56]. Under the circumstances of incomplete observations, an exploration of the influence of observation frequency on soil moisture data assimilation is necessary.

An assimilation experiment was conducted with observation data at intervals of three, five and ten days for two winter wheat growing seasons from 2004 to 2006. The results showed that the fluctuation of soil moisture estimated with observation data at a ten-day interval was more obvious than that at three- and five-day intervals (Figure 8). When the temporal interval of observations was large, the assimilated soil moisture tended to approach the model simulation results. The model simulation values were lower in the late growth season, which led to an increase in soil moisture fluctuation. The detailed RMSEs of soil moisture with different assimilation intervals are shown in Figure 9. Generally, as the frequency of observation increased, the error of assimilated soil moisture decreased, but the difference in soil moisture error at intervals of three and five days was small. Therefore, an observation interval of no more than five days was identified as the ideal frequency for obtaining soil moisture with high accuracy, in agreement with the conclusion of Walker and Houser [57].
Figure 9. Comparison of the RMSEs of assimilated soil moisture with observation intervals of three, five, and ten days at different soil depths.

Comparison with Observed Soil Moisture-Based Drought Index
In this study, the RSMI was used as an indicator for agricultural drought monitoring. We compared the RSMI based on assimilated soil moisture (RSMI_DA) and open-loop simulations (RSMI_OL) with the RSMI based on in situ observations (RSMI_Obs), with the aim of investigating the extent to which soil moisture data assimilation improved the level of agricultural drought monitoring.
Firstly, from the perspective of qualitative assessment, time series graphs and scatter plots were used to show the fluctuations and consistency of the RSMI. Figure 10 displays the RSMI time series during the growing seasons from 2002 to 2011 at eight sites. The results illustrate that the RSMI_DA and RSMI_Obs had similar fluctuations during the growing seasons. In the HHH Plain, due to the large water demand in the late growing season of winter wheat, a decrease in soil moisture often occurred during this period. Both the RSMI_DA and RSMI_Obs showed this characteristic. RSMI_OL showed a good match with RSMI_Obs in the early growth season of winter wheat; however, in the late growth season, the values for RSMI_OL were significantly lower than those for RSMI_Obs. According to the drought levels of RSMI, the low value of RSMI_OL in the late growth season indicated a severe or extreme drought, which might be inconsistent with reality.

Secondly, from the perspective of quantitative assessment, Table 4 displays the error statistics of RSMI based on open-loop and assimilated soil moisture at eight agro-meteorological sites. The average RMSEs of RSMI_OL and RSMI_DA at the eight sites were 2.60 and 1.38, respectively, and the corresponding average R values were 0.62 and 0.68, respectively. Compared with the RSMI_OL, the RSMI_DA of all sites had a higher correlation at the 0.01 significance level, and the RMSE of RSMI_DA was smaller than that of RSMI_OL for all sites. According to the drought category of RSMI (Table 1), a value of less than zero indicated that drought occurred; when the RSMI value was decreased by one, the drought intensity was increased by one level. The difference between the average RMSEs of RSMI_OL and RSMI_DA was 1.22, which implied that, compared with the drought index based on open-loop simulations, the index based on assimilated data could improve agricultural drought monitoring by at least one drought level.

Comparison with Winter Wheat Yield
Crop yield reduction is a major manifestation of crop growth threatened by drought. Here, we discuss the relationship between agricultural drought and crop yield reduction to further evaluate the effect of data assimilation on agricultural drought monitoring. The yield loss ratio (YLR) was used to quantify the degree of crop yield reduction, calculated as $\mathrm{YLR} = (Y - \bar{Y})/\bar{Y}$, where $Y$ represents the annual crop yield and $\bar{Y}$ is the multiyear average yield. According to the disaster record dataset of agro-meteorological sites from 2002 to 2011, six sites (AHSX, HBHH, SDDZ, SDHM, SDJX, and SDTA) experienced agricultural droughts. Figure 11 shows the relationship between the RSMI (RSMI OL and RSMI DA) and the crop yield loss ratio (YLR). Overall, the RSMI matched well with the YLR during the nine growing seasons; that is, when the RSMI value was small (drought), the crop yield was reduced, and when the RSMI value was large (wetness), the crop yield increased.
Because of the small sample size per site, the RSMIs from the six sites were pooled to statistically investigate the correlation with YLR (Figure 11g,h). The correlation coefficients between the RSMI (RSMI OL and RSMI DA) and YLR were 0.41 and 0.52 (p < 0.01), respectively. The results demonstrated that the RSMI was significantly correlated with YLR, and the RSMI based on assimilated soil moisture performed better than that based on open-loop simulations. Therefore, data assimilation could further strengthen the skill of the drought index in reflecting yield losses by improving soil moisture estimation.
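The yield loss ratio and the pooled correlation can be sketched as follows. This is a minimal illustration under stated assumptions: the site names, yields, and RSMI values are hypothetical, and the multiyear average is assumed to be the mean of each site's own yield series over the available seasons.

```python
import numpy as np

def yield_loss_ratio(annual_yield):
    """YLR = (Y - mean(Y)) / mean(Y) for one site's yield series.

    Assumption: the multiyear average is the mean of the series itself.
    Negative values indicate a yield below the multiyear average.
    """
    y = np.asarray(annual_yield, dtype=float)
    y_mean = y.mean()
    return (y - y_mean) / y_mean

# Hypothetical per-site yields (kg/ha) and seasonal RSMI values.
site_yields = {
    "site_a": [5200, 4800, 5600, 4400],
    "site_b": [6100, 6500, 5500, 5900],
}
site_rsmi = {
    "site_a": [0.3, -0.6, 0.9, -1.4],
    "site_b": [0.1, 0.8, -1.1, -0.2],
}

# Pool all sites because each site alone has too few seasons.
ylr_pooled = np.concatenate([yield_loss_ratio(site_yields[s]) for s in site_yields])
rsmi_pooled = np.concatenate([np.asarray(site_rsmi[s], dtype=float) for s in site_yields])

r_pooled = float(np.corrcoef(rsmi_pooled, ylr_pooled)[0, 1])
print(f"Pooled correlation between RSMI and YLR: {r_pooled:.2f}")
```

Computing the YLR per site before pooling removes differences in average yield between sites, so the pooled correlation reflects year-to-year anomalies rather than site-to-site contrasts.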

Discussion
In this study, we established a data assimilation scheme to estimate surface and root-zone soil moisture, and we evaluated the effect of data assimilation on agricultural drought monitoring. The main advantage of this scheme is that, by combining a crop model with remote sensing observations, the process of crop growth and development can be simulated while soil moisture information is obtained at the same time [19,58]. Such results are very helpful for constructing a drought index with agronomic traits [13,19]. The assimilated results demonstrate that soil moisture estimates were significantly improved after data assimilation (Figures 2 and 3; Tables 2 and 3). Although the observations in the data assimilation framework were surface satellite soil moisture, deeper layers also showed improved results after assimilation. Across soil depths, the improvement in surface soil moisture accuracy after data assimilation was greater than that in the deep layers: as the soil depth increased, the ability of surface soil moisture assimilation to improve the accuracy of deep soil moisture decreased. These findings agree with previous studies [20,30]. The results of soil moisture assimilation at depths of 30 and 40 cm were also less accurate. Soil water balance is a very complex physical process, and the possible reasons for the poor performance include the following: (1) the inability of the model structure to perfectly describe the soil water balance process; (2) the uncertainties of the input data (soil attribute data and management data) [58]. Owing to the influence of many factors (models, observations, and forcings), the estimation of soil moisture at a fixed depth may have some uncertainties. Root-zone soil moisture (averaged over the layers within the root zone) was observed to be a better estimate (Table 3). Root-zone soil moisture plays a significant role in agricultural drought monitoring [2,53]. Our study found that, compared with the drought index based on open-loop soil moisture, the index based on assimilated data could improve agricultural drought monitoring by at least one drought level. Therefore, improving the accuracy of root-zone soil moisture through data assimilation is an important and feasible way to enhance the estimation of agricultural drought.
The model and the observations are key components of data assimilation. In this study, we found that the AEnKF-based soil moisture data assimilation performed better than the EnKF-based, state-only scheme, which indicates that simultaneous state-parameter estimation is more effective for improving soil moisture estimation. This conclusion highlights the importance of simultaneous estimation of model parameters and state variables, which is consistent with related research [30,48,59,60]. In addition to model parameters, observations have a crucial impact on data assimilation. Errors caused by observations include the bias between model and observation and the error of the observation itself. Discrepancies between model and observation are caused by a variety of factors, including uncertainty in model parameters and differences in spatial scale. Bias correction is effective in reducing the discrepancies between model and observation [59,60]. In this study, the CDF matching approach was used prior to data assimilation to reduce the impact of the scale difference between the gridded satellite observations and the point-scale model. The error of the observation itself is generally set based on empirical values [24,34,59,60]. In our study, we set it based on relevant studies on soil moisture validation [18,51], and we obtained good estimation results. Recently, several studies have proposed new methods for bias correction and model parameter estimation to further improve the performance of data assimilation. Instead of bias correction prior to data assimilation, Qin et al. [59] proposed on-line bias correction during data assimilation and obtained a good estimate. Tian et al. [60] developed a novel assimilation method to estimate both model and observation errors to improve soil moisture retrievals, in which model state, model parameters, model error, and observation error were estimated simultaneously. Overall, model parameters and observation errors are still worthy of continued research to further improve the estimation accuracy of state variables.
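The idea behind CDF matching can be illustrated with a minimal Python sketch. It uses plain empirical quantile mapping, which is one common variant and is an assumption here; the exact formulation used in this study (for example, piecewise-linear or polynomial fitting of the CDFs) may differ. The function name and the soil moisture values are hypothetical.

```python
import numpy as np

def cdf_match(satellite, model):
    """Rescale satellite values so their empirical CDF matches the model's.

    Each satellite value is replaced by the model value that has the same
    non-exceedance probability, which removes systematic differences in
    mean and variability while keeping the temporal ranking of the data.
    """
    satellite = np.asarray(satellite, dtype=float)
    model = np.asarray(model, dtype=float)
    # Rank of each satellite value within its own series (0 = smallest).
    ranks = np.argsort(np.argsort(satellite))
    # Convert ranks to non-exceedance probabilities in (0, 1).
    probs = (ranks + 0.5) / satellite.size
    # Read the model distribution at the same probabilities.
    return np.quantile(model, probs)

# Hypothetical soil moisture series (m^3/m^3), illustration only.
satellite_sm = np.array([0.10, 0.14, 0.12, 0.20, 0.18])
model_sm = np.array([0.22, 0.30, 0.25, 0.35, 0.28])

matched_sm = cdf_match(satellite_sm, model_sm)
print(matched_sm)
```

After matching, the rescaled satellite series lies within the range of the model climatology, so the residual differences passed to the filter reflect timing and anomalies rather than a systematic offset between the grid-scale product and the point-scale model.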
Through this study, we found that the crop growth model-based data assimilation scheme effectively improved soil moisture estimation and was significantly beneficial for agricultural drought monitoring. This study also had some limitations that need further discussion. On the one hand, the study was conducted at the point scale, because accurate regional crop model parameterization and high-precision gridded data (meteorological forcings, soil data, and field management data) for crop modeling are still challenging to obtain. Some studies have investigated gridded data assimilation schemes [32,33,61,62]; however, these generally use gridded meteorological data, whereas soil data, management data (e.g., irrigation), and crop parameters over large areas are still replaced by empirical values or coarse-resolution data. Therefore, observations at higher spatial and temporal resolution (e.g., the SMAP product [63]) and more accurate model parameterization schemes can further improve the gridded data assimilation framework, help analyze the variations in drought characteristics in different regions, and improve understanding of the mechanism of drought formation. On the other hand, a single remote sensing data source was used in this study. Multi-source remote sensing can provide land surface monitoring from different perspectives. Some scholars have carried out data assimilation based on various remote sensing data sources [24,61,62,64,65]; however, collaborative applications based on multi-source remote sensing data are still rare in agricultural drought monitoring. With the launch of subsequent remote sensing satellites, data quality and spatial-temporal resolution will be further improved. The combination of multi-source remote sensing data and crop models will help to further strengthen the application of drought indicators in agricultural drought monitoring.

Conclusions
In this study, the DSSAT crop model and satellite soil moisture were integrated for data assimilation to improve soil moisture estimation. The assimilated results were validated using in situ observations from agro-meteorological sites. The impacts of different strategies (simultaneous state-parameter estimation, ensemble size, and temporal interval) on data assimilation were also discussed. Furthermore, we evaluated the effect of data assimilation on agricultural drought monitoring through comparisons with the observed soil moisture-based drought index and winter wheat yield. The main conclusions are summarized as follows: (1) Compared with open-loop simulations, crop growth model-based data assimilation effectively improved soil moisture estimation throughout the soil profile. Soil moisture in the root zone was more consistent with in situ observations (RMSE = 0.035 m³·m⁻³, R = 0.742) than that in other soil layers; (2) Simultaneous state-parameter estimation (based on the AEnKF algorithm) performed better than state-only estimation (based on the EnKF algorithm). As the observation frequency and ensemble size increased, the accuracy of soil moisture estimates also increased. Considering both estimation accuracy and computational cost, an observation interval within five days with twenty ensemble members could meet the requirement of soil moisture accuracy;

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.