Assimilating Soil Moisture Retrieved from Sentinel-1 and Sentinel-2 Data into WOFOST Model to Improve Winter Wheat Yield Estimation

Crop yield estimation at a regional scale over a long period of time is of great significance to food security. In past decades, the integration of remote sensing observations and crop growth models has been recognized as a promising approach for crop growth monitoring and yield estimation. Optical remote sensing data are susceptible to cloud and rain, while synthetic aperture radar (SAR) can penetrate through clouds and has all-weather capabilities. This allows for more reliable and consistent crop monitoring and yield estimation in terms of radar sensor data. The aim of this study is to improve the accuracy for winter wheat yield estimation by assimilating time series soil moisture images, which are retrieved by a water cloud model using SAR and optical data as input, into the crop model. In this study, SAR images were acquired by C-band SAR sensors boarded on Sentinel-1 satellites and optical images were obtained from a Sentinel-2 multi-spectral instrument (MSI) for Hengshui city of Hebei province in China. Remote sensing data and ground data were all collected during the main growing season of winter wheat. Both the normalized difference vegetation index (NDVI), derived from Sentinel-2, and backscattering coefficients and polarimetric indicators, computed from Sentinel-1, were used in the water cloud model to derive time series soil moisture (SM) images. To improve the prediction of crop yields at the field scale, we incorporated remotely sensed soil moisture into the World Food Studies (WOFOST) model using the Ensemble Kalman Filter (EnKF) algorithm. In general, the trend of soil moisture inversion was consistent with the ground measurements, with the coefficient of determination (R2) equal to 0.45, 0.53, and 0.49, respectively, and RMSE was 9.16%, 7.43%, and 8.53%, respectively, for three observation dates. The winter wheat yield estimation results showed that the assimilation of remotely sensed soil moisture improved the correlation of observed and simulated yields (R2 = 0.35; RMSE =934 kg/ha) compared to the situation without data assimilation (R2 = 0.21; RMSE = 1330 kg/ha). Consequently, the results of this study demonstrated the potential and usefulness of assimilating SM retrieved from both Sentinel-1 C-band SAR and Sentinel-2 MSI optical remote sensing data into WOFOST model for winter wheat yield estimation and could also provide a reference for crop yield estimation with data assimilation for other crop types.


Introduction
Food production and security have drawn the attention of scientific communities around the world.Crop growth simulation and yield monitoring are essential to inform and develop national food security policies and production management strategies [1,2].Climate change, environmental pollution, water scarcity, and change of cultivated land quality have serious effects on crop production [3].Therefore, timely and accurate information on regional crop growth and yield is of great significance for food security and the sustainable development of agriculture.
Traditional crop yield estimation mainly includes three methods, as follows: Statistical survey, agronomy prediction, and the agrometeorological forecast method.Due to their inherent limitations, it is difficult for these methods to achieve high precision estimations of regional crop yields [4].The crop growth model is a process-oriented and machine-based dynamic model, which has been successfully applied to yield estimation at the field scale, but is not applicable for regional yield estimation.As a near real time technology, remote sensing can provide crop and soil parameters at a large scale and at lower costs than the survey method [5], though it is not as mechanistic as the crop model.Data assimilation is a method that optimally combines crop growth models with remote sensing observations, providing an effective way to improve crop yield estimation at regional scales and it is recognized as the most promising approach for crop growth monitoring and yield estimation [1,6,7].
Remotely sensed data derived from both active and passive sensors are utilized to observe the crop, providing precise and timely information on the phenological status and development of vegetation to monitor crop growth [8].As an important state variable, soil moisture provides linkage to the soil, atmosphere, and plant, and plays a significant role in hydrological, ecological, and biological processes [9,10].With the development of remote sensing technologies, soil moisture monitoring at different spatial and temporal scales has become possible [11,12].Surface roughness and vegetation variables are the two main factors influencing soil moisture retrievals on the basis of SAR (synthetic aperture radar) observations [13].To overcome the difficulty of obtaining the multiple scattering between soil and vegetation in vegetated areas, many widely used vegetation scattering models have been proposed.These models, which can retrieve soil moisture in vegetation covered areas, include the Tor Vergata model [14], the Michigan microwave canopy scattering (MIMICS) model [15], and the water cloud model [16].
The potential of the water cloud model for correcting the vegetation effect in various types of terrain has been previously demonstrated [17][18][19][20][21]. Previous studies have shown that the biophysical parameters needed for the water cloud model to represent the scattering properties of vegetation can be estimated from optical remote sensing data [22].Additionally, the complementary information from SAR and the optical data provide the possibility to retrieve soil moisture with a high level of accuracy [23,24].Several studies have focused on combining optical and remotely sensed SAR data for soil moisture inversion.Bai et al. combined C-band Radarsat-2 and Landsat-8 OLI (Operational Land Imager) data to estimate the soil moisture in prairie areas [10], while Tao et al. developed a modified vegetation backscattering model to retrieve vegetation covered soil moisture based on RADARSAT-2, GF-1, and ground observations [13], and the results were promising.Thus, in this study, we jointly used Sentinel-1 C-band SAR data and Sentinel-2 MSI optical data to retrieve soil moisture by using the water cloud model.
Remote sensing of soil moisture is potentially useful for sequential data assimilation, because it has an obvious influence on crop growth and especially on crop yields.The spatial and temporal coverage of satellite soil moisture allows for data assimilation for crop yield estimation at a regional scale.In the field of crop model applications, several previous studies have used soil moisture as a state variable in crop model data assimilation systems for crop yield estimations.De Wit and Van Diepen utilized the Ensemble Kalman filter (EnKF) method to assimilate soil moisture, derived from a microwave sensor, into the WOFOST (World Food Studies) crop model and the conclusion indicated that the assimilation of soil moisture into the crop growth model has a beneficial effect on the overall performance of the yield forecasting system [4].Ines et al. jointly assimilated the leaf area index (LAI) and soil moisture into the CERES-Maize model with the EnKF algorithm and the results showed that yield estimation was improved substantially [25].Chakrabarti et al. assimilated SMOS (Soil Moisture and Ocean Salinity) soil moisture into the DSSAT (Decision Support System for Agrotechnology Transfer) model to investigate the effects of agricultural drought on crop yield using the EnKF technique.They found that the assimilated crop yield was improved during the growing season [26].However, the soil moisture data used in previous studies were mostly retrieved from SAR data only or with a coarse spatial resolution (e.g., SMOS soil moisture data at 25 km).Thus, we would like to assess the ability of assimilating soil moisture at 10 m scale, retrieved from SAR and optical data together, into the crop model to improve crop yield estimation.
The objectives of this study are twofold, as follows: (1) To evaluate the performance of combining SAR and optical remotely sensed data in the water cloud model for soil moisture inversion; and (2) to assess the accuracy of winter wheat yield estimation by assimilating the retrieved soil moisture into the WOFOST model with the EnKF method.Based on multi-source observations (optical and SAR remotely sensed data, in situ data, and meteorological data), we developed an assimilation framework for time series soil moisture images into the WOFOST model to improve its accuracy for winter wheat yield estimation at a regional scale.The study area and four major datasets are presented in Section 2 and this section also introduces the water cloud model, the WOFOST model, and the EnKF assimilation algorithm in detail.The results Section 3 is divided into three parts, with (1) the water cloud model soil moisture retrieval results, (2) the simulation of soil moisture with the WOFOST model and (3) the assimilation of soil moisture into the WOFOST model with the EnKF method.Uncertainties and perspectives of this study are described in Section 4 and the conclusion is given in Section 5.

Study Area
This study was implemented in a planted area dominated by winter wheat, in Hengshui city of the southern Hebei province, China (Figure 1).It extends from 37 • 03 N to 38 • 23 N and 115 • 10 E to 116 • 34 E, covering 8815 km 2 in total and consisting of 11 counties.The study area is a temperate continental monsoon climate, with the average temperature ranging from 9 °C to 15 °C, and an average annual rainfall ranging from 400 mm to 800 mm.The crop rotation of winter wheat and summer maize is the main crop planting system in this area [27,28].Winter wheat is generally planted in October in the first year and matured in June in the second year.The key phenological stages of winter wheat in this area are the wintering stage before February, the green-up stage in early March, the jointing stage before early April, the heading stage from late April to early May, the grain filling stage before late May, and milk-maturity in early June [29].The climate is seasonally variable, with only 25% of the precipitation occurring from October to May during winter wheat growing season and about 75% occurring in June to September during the maize growing season (Data source: China Meteorological Administration, http://www.cma.gov.cn/).Water is a major limiting factor for winter wheat growth in this area.

Data
Four major datasets were used in this study, as follows: WOFOST model input data, optical remote sensing data, SAR remote sensing data, and field measured data (soil moisture and winter wheat yield).
The WOFOST model input data includes crop parameters, soil parameters, and meteorological data.Some parameters of the crop and soil were calculated from the field measurements (e.g., day of emergence (IDEM), initial total crop dry weight (TDWI), and cumulative temperature from emergence to anthesis and from anthesis to maturity ((TSUM1/TSUM2) et al.).Others were derived from the references or default value in the WOFOST model.Ma et al. and Huang et al. provide details of the sources of the different parameters [30,31].The meteorological data were obtained from the China Meteorological Administration, (http://www.cma.gov.cn/), which includes daily maximum and minimum temperatures, solar radiation, wind speed, actual vapor pressure, and precipitation.The meteorological parameters were interpolated into raster data at 10 m resolution.Each winter wheat assimilation grid has its own meteorological data files.

Data
Four major datasets were used in this study, as follows: WOFOST model input data, optical remote sensing data, SAR remote sensing data, and field measured data (soil moisture and winter wheat yield).
The WOFOST model input data includes crop parameters, soil parameters, and meteorological data.Some parameters of the crop and soil were calculated from the field measurements (e.g., day of emergence (IDEM), initial total crop dry weight (TDWI), and cumulative temperature from emergence to anthesis and from anthesis to maturity ((TSUM1/TSUM2) et al.).Others were derived from the references or default value in the WOFOST model.Ma et al. and Huang et al. provide details of the sources of the different parameters [30,31].The meteorological data were obtained from the China Meteorological Administration, (http://www.cma.gov.cn/), which includes daily maximum and minimum temperatures, solar radiation, wind speed, actual vapor pressure, and precipitation.The meteorological parameters were interpolated into raster data at 10 m resolution.Each winter wheat assimilation grid has its own meteorological data files.
The optical remote sensing data used in this study is Sentinel-2A MSI data.Sentinel-2 is a constellation with two twin satellites Sentinel-2A and Sentinel-2B, launched by the European Space Agency (ESA) in June 2015 and March 2017, respectively.Sentinel-2 carries the multi-spectral instrument (MSI) with 13 spectral channels, including visible, near infrared (NIR), and shortwave infrared spectral range bands.The revisit time of Sentinel-2A/B is 10 days and the spatial resolution of Sentinel-2 is dependent on the particular spectral band.In this study, we used 10 meters for Band 2 (Blue), Band 3 (Green), Band 4 (Red), and Band 8 (NIR) (https://scihub.esa.int).Based on the field experiment dates (Figure 2), three cloud-free Sentinel-2 images were collected for the study area.Data pre-processing, including atmospheric correction and converting top-of-atmosphere reflectance into top-of-canopy reflectance, was accomplished by the Sentinel Application Platform (SNAP) software [32].
Three Sentinel-1A images were acquired during field experiment dates (Figure 2).The twosatellite Sentinel-1 constellation carries a SAR instrument at C-band and has been designed to address medium-and high-resolution applications.The Interferometric Wideswath (IW) mode is the predefined acquisition mode over land.This mode provides dual-polarization (VV and VH) imagery at a resolution of 10 m, with a swath of 250 km (https://scihub.esa.int)[8,21].In this study, we used Level-1 single look complex (SLC) data.Data pre-processing, which includes thermal noise removal, calibration, multi-look, speckle filter, and terrain correction, was accomplished using SNAP to obtain the σ 0 backscatter coefficient.The optical remote sensing data used in this study is Sentinel-2A MSI data.Sentinel-2 is a constellation with two twin satellites Sentinel-2A and Sentinel-2B, launched by the European Space Agency (ESA) in June 2015 and March 2017, respectively.Sentinel-2 carries the multi-spectral instrument (MSI) with 13 spectral channels, including visible, near infrared (NIR), and shortwave infrared spectral range bands.The revisit time of Sentinel-2A/B is 10 days and the spatial resolution of Sentinel-2 is dependent on the particular spectral band.In this study, we used 10 meters for Band 2 (Blue), Band 3 (Green), Band 4 (Red), and Band 8 (NIR) (https://scihub.esa.int).Based on the field experiment dates (Figure 2), three cloud-free Sentinel-2 images were collected for the study area.Data pre-processing, including atmospheric correction and converting top-of-atmosphere reflectance into top-of-canopy reflectance, was accomplished by the Sentinel Application Platform (SNAP) software [32].
For the field experiment data, we selected 2 sites in each county (Figure 1) throughout the study area and monitored the winter wheat growing conditions from March to June in 2017.Each site had five 1 m × 1 m sampling points in an area of 100 m × 100 m.Field experiment dates (Figure 2) were selected during the three main growing seasons of winter wheat (jointing stage, heading stage, and milk-maturity stage).In the field experiment, winter wheat LAI was measured using an LAI-2000 Plant Canopy Analyzer and the soil moisture was measured using a FieldScout TDR 300 Soil Moisture Meter to collect soil moisture data at a depth of 0-10 cm.Winter wheat yields were measured by on-site sampling.First, we collected the grain from 1 m 2 at each sample plot.Second, they were heated to collect the dry weight.Finally, winter wheat yields were calculated as the weight

Water Cloud Model
Agricultural regions are a combination of vegetation and bare soil and the information obtained from optical or radar remote sensing images is influenced by both components.The water cloud model is based on the semi empirical model established by the theory of microwave remote sensing radiation transmission.This model considers that the signal received by the radar sensor is composed Three Sentinel-1A images were acquired during field experiment dates (Figure 2).The two-satellite Sentinel-1 constellation carries a SAR instrument at C-band and has been designed to address mediumand high-resolution applications.The Interferometric Wideswath (IW) mode is the pre-defined acquisition mode over land.This mode provides dual-polarization (VV and VH) imagery at a resolution of 10 m, with a swath of 250 km (https://scihub.esa.int)[8,21].In this study, we used Level-1 single look complex (SLC) data.Data pre-processing, which includes thermal noise removal, calibration, multi-look, speckle filter, and terrain correction, was accomplished using SNAP to obtain the σ 0 backscatter coefficient.
For the field experiment data, we selected 2 sites in each county (Figure 1) throughout the study area and monitored the winter wheat growing conditions from March to June in 2017.Each site had five 1 m × 1 m sampling points in an area of 100 m × 100 m.Field experiment dates (Figure 2) were selected during the three main growing seasons of winter wheat (jointing stage, heading stage, and milk-maturity stage).In the field experiment, winter wheat LAI was measured using an LAI-2000 Plant Canopy Analyzer and the soil moisture was measured using a FieldScout TDR 300 Soil Moisture Meter to collect soil moisture data at a depth of 0-10 cm.Winter wheat yields were measured by on-site sampling.First, we collected the grain from 1 m 2 at each sample plot.Second, they were heated to collect the dry weight.Finally, winter wheat yields were calculated as the weight per area of the sample plot in kg/ha.

Water Cloud Model
Agricultural regions are a combination of vegetation and bare soil and the information obtained from optical or radar remote sensing images is influenced by both components.The water cloud model is based on the semi empirical model established by the theory of microwave remote sensing radiation transmission.This model considers that the signal received by the radar sensor is composed of soil and vegetation and their mutual scattering [17].It was initially proposed by Attema and Ulaby [16] and there are multiple ways of implementing the model [33].Xu's version was used in this study, can be described with the Formulas (1)-(3) [34].
where σ 0 veg and σ 0 soil represent the part of overall backscatter coefficient, σ 0 pq , from vegetation and soil, respectively.The values p and q represent different polarizations of radar.
where Veg represents crop related parameters (in this study, we used NDVI calculated from optical data), θ represents the detection angle of radar, and A and B represent the vegetation parameters.
The value σ 0 pq_soil represents the initial backscatter from the soil and we assumed that it has linear relationship with soil moisture in this study.
where C and D are soil parameters.Combined with the above formula, we can get the formula of soil moisture.
In this study, we used field measured soil moisture as a model input parameter, combined with backscattering and vegetation information, to train the model.As can be seen in Equation ( 5), we used field measured soil moisture data to represent SM and NDVI retrieved from Sentinel-2 to represent Veg and σ 0 pq and θ came from Sentinel-1 backscattering coefficient and the Sentinel-1 incidence angle, respectively.We have 110 field experimental sites, half of which were used for training the water cloud model and half for validating the soil moisture retrieval accuracy.Thus, we will have 55 equations with the form of Equation ( 5) and A, B, C, and D are vegetation and soil parameters to be determined.These are multivariate non-linear overdetermined systems of equations and we solved the equations using MATLAB software.

WOFOST Model
The WOFOST model, developed by the Wageningen University and Research Center (Wageningen, the Netherlands), is a mechanistic model for analyzing the interactions among soil, water, the atmosphere, and plants [35,36].The model's outputs are directly usable for crop-specific yield estimation and can provide estimates of biomass and grain yield at a daily time step for different crop types [1].The WOFOST model can be run in potential mode, water-limited mode, and nutrient-limited mode.Soil moisture is an important crop growth parameter and is estimated from forcing data and soil properties in the WOFOST model.Therefore, SM was adopted as the state variable in the data-assimilation procedure in this study.
We used the water-limited mode of WOFOST model in our research and assumed that the simulation of the soil water balance is performed for freely draining soils.The core of the crop model calibration is fitting the simulated values with the observed data [37].Our previous studies have described the parameterization and calibration of the WOFOST model in detail [30,31].Here, we briefly show the main soil and crop parameters (Tables S1 and S2), which are of great importance to the water-limited mode.Each winter wheat pixel was considered as an assimilation grid.Each assimilation grid has its own unique folder, which contains a soil parameter file, a crop parameter file, and a meteorological parameter file.We used the Chinese soil database (http://www.soil.csdb.cn)and followed the Saxton and Rawls method to derive soil moisture content at the wilting point (SMW), in saturated soil (SM0), and at field capacity (SMFCF) [38].Other soil parameters were used as the default value in WOFOST.As for crop parameters, different assimilation grids have different values of IDEM, TSUM1, and TSUM2, and the remaining parameters are the same.The meteorological parameters are extracted from 10 m meteorological images interpolated from meteorological stations (Section 2.2).

The Ensemble Kalman Filter
The Ensemble Kalman Filter (EnKF) is an optimal recursive data assimilation method introduced by Evensen [39].There are several different versions of the EnKF.In this work, we used the EnKF of Burgers et al. [40].Therefore, the observations also needed to be treated as random variables and it is commonly assumed that observation errors have a Gaussian distribution.
The EnKF performs model forecasting where the model responses (state variables) are propagated forward in time based on the model dynamics and a filter update in which the ensemble of the model state is adjusted through incorporating available observations [30,41,42].The EnKF can be considered as an approximate version of the Kalman Filter, in which the state uncertainty is represented by the ensemble spread [43].If we assume that observations are related to the true data x t , where y is the observation vector, ε and ν are Gaussian random error vectors with a mean of zero, H is the observation operator that relates to y, A represents a linear state-transition model that links x t and x t−1 , and in the crop model data assimilation system it represents the crop model.The forecast of x t at t = k is Gaussian, mean, x f t=k , and error covariance, P f t=k , can be calculated as Equations ( 8)-( 9), and y t=k is observation vector at t = k, as follows: where f and a are indices of the prior and posterior estimates, respectively, I is the identity matrix, and K is the Kalman gain matrix, defined as Equation (10), where R t=k is the error covariance of the observation ensemble.
The EnKF forecast and analysis error covariance come directly from an ensemble of model simulations, as follows: where N e is the number of ensemble members, n is a running index for ensemble member, and x f represents the ensemble mean, calculated as Equation (12).The observation data should be perturbed and the variance used in the perturbation is based on the uncertainty of the observation data.
In this study, the soil moisture retrieved by the water cloud model was the only state variable that was directly assimilated into the WOFOST model.So, H can be taken as an identity matrix and the optimal estimation of state variable x t at the time k (Equation (8)).The Kalman gain matrix (Equation (10)) and the EnKF forecast and analysis error covariance (Equation ( 11)) can be rewritten as Equations ( 13)- (15).
The uncertainties of the crop model parameters and the remotely sensed observations are an important part of the EnKF assimilation system.There are two common ways to account for model uncertainty in the EnKF assimilation system.One is adding a Gaussian perturbation to the model output state variables.Another is to perturb model input parameters to generate ensemble members.In this study, we used the first method and an uncertainty level of 10% was introduced to WOFOST model output soil moisture and perturbed using a Gaussian distribution [25].Additionally, the uncertainty of remotely sensed soil moisture was set to 7%.So, when the WOFOST model starts at the emergence date, WOFOST output soil moisture was perturbed by the method showed above to generate 50 model ensemble members {M SM 1, M SM 2, . . .M SM 50}.When the observed soil moisture is available at the time k, the 50 observation ensemble members {O SM 1, O SM 2, . . .O SM 50} were also generated by adding perturbation to observed soil moisture.Finally, we can calculate the model and observation error covariance based on the ensemble members and then follow Equations ( 14) and ( 13) to calculate the Kalman gain and analysis vector, x a t=k , at the time k.We considered the mean value of x a t=k as the optimal estimation of soil moisture at the time k.

Crop Modeling-Data Assimilation Framework
Figure 3 provides an overall framework for the process of assimilating remotely sensed SM, retrieved by the water cloud model, into the WOFOST model to estimate the regional winter wheat yield with the EnKF algorithm.First, we ran the WOFOST model with the calibrated crop and soil parameters and meteorological data.Meanwhile, we set the ensemble members (in our case, the ensemble number was set as 50) for the output soil moisture (M SM 1−M SM n) from the WOFOST model.Second, Processed Sentinel-1, Sentinel-2, and field measured data are used in the water cloud model to retrieve remotely sensed soil moisture, which are treated as the observation soil moisture (O SM 1−O SM n).Third, the EnKF assimilation algorithm will be run at the date when a new observation soil moisture (O SM 1−O SM n) becomes available and will generate the optimized soil moisture (SM1−SMn) data set.Finally, the optimum soil moisture is input into WOFOST model to simulate winter wheat yield, which will be validated by ground measured yield.

Spatial-Temporal Dynamics of Soil Moisture from Inversion of the Water Cloud Model
In this study, soil moisture content, based on field measurements, was used to calibrate the water cloud model parameters.The values of parameters A, B, C, and D are shown in Table 1.It is noteworthy that, in the water cloud model, the value of the polarization of VV from Sentinel-1 represents the  , while Veg is replaced by NDVI calculated from Sentinel-2.Soil moisture maps retrieved from the water cloud model from three dates and the retrieved soil moisture content of 22 measured sites over the three dates are shown in Figure 4, respectively.From both figures, we can see that the soil moisture is relatively higher on 1 April and 7 May than 1 June

Spatial-Temporal Dynamics of Soil Moisture from Inversion of the Water Cloud Model
In this study, soil moisture content, based on field measurements, was used to calibrate the water cloud model parameters.The values of parameters A, B, C, and D are shown in Table 1.It is noteworthy that, in the water cloud model, the value of the polarization of VV from Sentinel-1 represents the σ 0 pq , while Veg is replaced by NDVI calculated from Sentinel-2.Soil moisture maps retrieved from the water cloud model from three dates and the retrieved soil moisture content of 22 measured sites over the three dates are shown in Figure 4, respectively.From both figures, we can see that the soil moisture is relatively higher on 1 April and 7 May than 1 June in most areas.Soil moisture in most areas was more than 20% on 1 April and 7 May, while soil moisture was mostly less than 20%, or even less than 10%, on 1 June.This is primarily due to the different irrigation on different dates.Based on our field experiment, we found that the farmers in this area usually irrigate the winter wheat 2-4 times during the whole growing season and the irrigation dates are mainly concentrated in the jointing stage, the heading stage, and the grain filling stage, which are generally from early March to mid-May, while 1 April and 7 May generally belong to the jointing stage and heading stage, respectively.Figure 5 shows the validation of the water cloud model soil moisture inversion results.The validation of 7 May has the best precision among the three dates, followed by 1 June and 1 April, and the R 2 is 0.53, 0.49, and 0.45, respectively.RMSE is 7.43%, 8.53%, and 9.16%, respectively.In general, the trend of soil moisture inversion is consistent with the ground measurements.Usually, the accuracy of soil moisture retrieval should be higher with less vegetation, due to less disturbance of the vegetation.However, the precision of the soil moisture retrieved with the water cloud model on 1 April is not the best among the three periods.We think that there are two possible reasons.The first is the effect of vegetation.We can see from Figure 2 that the study area was at the jointing stage of winter wheat on 1 April.We found that the plant height of winter wheat was approximate 17-33 cm, LAI ranged from 0.7-2.0m 2 /m 2 , and the vegetation coverage was higher than 75% when we performed the field experiments.We think the vegetation influence of this period is not much different from the other two periods.Second is the impact of irrigation.From the end of March to early April in Hengshui city is a common irrigation period and the irrigation has a certain impact on the accuracy of field measurements, which brings errors to the retrieval and validation of soil moisture.

Simulation of Soil Moisture with the Water-Limited Mode of the WOFOST Model
For the consideration of irrigation, Wang's strategy that treats irrigation as a part of precipitation with a direct input into WOFOST model was used in our study [44].We assumed that the study area is irrigated three times in DOY (day of year) 95, 110, and 130.The amount of irrigation was set to 100 mm, 60 mm, and 80 mm in DOY 95, 110, and 130, respectively.Moreover, if the SM is greater than 95% of SMFCF (soil moisture field capacity) on irrigation day, irrigation will not proceed.As shown in Figure 6, the green solid line represents the SM curve simulated by the WOFOST model without irrigation.We can see that after 28 February the SM curve dropped quickly and the soil moisture content had almost approached the wilting point since 19 April.So, winter wheat will suffer from water stress if there is no irrigation during the last 2-3 months of the growing season.This period covers the jointing stage and the grain filling stage, which proved to be a period of greater water requirement for winter wheat [45].After we added the irrigation artificially for three dates on 30 March, 20 April, and 12 May, we could see an obvious increase of the SM curve, shown by the blue solid line (Figure 6).This effectively alleviates the reduction in biomass caused by water stress.

Assimilation of Soil Moisture with the WOFOST Model Using the EnKF algorithm
The EnKF assimilation method was implemented for the 10 m grid cells with winter wheat pixels.The winter wheat yield was validated at the 110 field experiment plots.In this study, an ensemble of 50 members was adopted for the simulations of winter wheat yield in the water-limited mode.
the study area was at the jointing stage of winter wheat on 1 April.We found that the plant height of winter wheat was approximate 17-33 cm, LAI ranged from 0.7-2.0m 2 /m 2 , and the vegetation coverage was higher than 75% when we performed the field experiments.We think the vegetation influence of this period is not much different from the other two periods.Second is the impact of irrigation.From the end of March to early April in Hengshui city is a common irrigation period and the irrigation has a certain impact on the accuracy of field measurements, which brings errors to the retrieval and validation of soil moisture.From the assimilation strategy (Figure 3) we know that the soil moisture retrieved from the water cloud model will be assimilated into WOFOST model when the observation is available.As can be seen in Figure 6, the gray solid line encircled by three red circles represents the results of EnKF assimilation and the soil moisture content increased after data assimilation.Figure 7 shows the comparison results of the regional winter wheat yield estimation, with and without EnKF assimilation.The spatial difference of the winter wheat yield map without assimilation is not obvious and most of the estimated yields were around 6500 to 6800 kg/ha.As for the result with EnKF assimilation, more realistic spatial variability throughout the study area was observed.The yield estimation ranged from 5900 to 7700 kg/ha.Generally, the northern part of Hengshui city showed lower winter wheat yield estimation values, while the central and southern part generally had a relatively higher yield estimation.This is likely due to the differences in temperature, solar radiation, and agronomy activities [31].In Figure 8, the WOFOST simulation without SM assimilation could not estimate the winter wheat yield well, with R 2 of 0.21 and RMSE equal to 1330 kg/ha.The winter wheat yield estimation accuracy with EnKF assimilation was improved, with a higher coefficient of determination (R 2 = 0.35) and a much lower RMSE (RMSE = 934 kg/ha).Overall, the correlation of the assimilated winter wheat yield is higher than that without assimilation, although the error of the yield estimation results with assimilation is not negligible (RMSE = 934 kg/ha).

Simulation of Soil Moisture with the Water-Limited Mode of the WOFOST Model
For the consideration of irrigation, Wang's strategy that treats irrigation as a part of precipitation with a direct input into WOFOST model was used in our study [45].We assumed that the study area is irrigated three times in DOY (day of year) 95, 110, and 130.The amount of irrigation was set to 100 mm, 60 mm, and 80 mm in DOY 95, 110, and 130, respectively.Moreover, if the SM is greater than 95% of SMFCF (soil moisture field capacity) on irrigation day, irrigation will not proceed.As shown in Figure 6, the green solid line represents the SM curve simulated by the WOFOST model without irrigation.We can see that after 28 February the SM curve dropped quickly and the soil moisture content had almost approached the wilting point since 19 April.So, winter wheat will suffer from water stress if there is no irrigation during the last 2-3 months of the growing season.This period covers the jointing stage and the grain filling stage, which proved to be a period of greater water requirement for winter wheat [46].After we added the irrigation artificially for three dates on 30 March, 20 April, and 12 May, we could see an obvious increase of the SM curve, shown by the blue solid line (Figure 6).This effectively alleviates the reduction in biomass caused by water stress.

Assimilation of Soil Moisture with the WOFOST Model Using the EnKF algorithm
The EnKF assimilation method was implemented for the 10 m grid cells with winter wheat pixels.The winter wheat yield was validated at the 110 field experiment plots.In this study, an ensemble of 50 members was adopted for the simulations of winter wheat yield in the water-limited

Discussion
The valuable information obtained from optical and SAR remotely sensed data allows us to evaluate the combined ability of the two sources of data by assimilating them into the crop growth model.We considered that there are two strategies to assimilate optical and SAR remotely-sensed data into the crop model.One strategy is to retrieve state variables from the optical data and SAR data individually and then assimilate these parameters into the crop model on the date of observation [47][48][49].The advantage of this method is that the two data sources can complement each other at a temporal scale.Another strategy, which was adopted in this study, is to combine the two data sources

Discussion
The valuable information obtained from optical and SAR remotely sensed data allows us to evaluate the combined ability of the two sources of data by assimilating them into the crop growth model.We considered that there are two strategies to assimilate optical and SAR remotely-sensed data into the crop model.One strategy is to retrieve state variables from the optical data and SAR data individually and then assimilate these parameters into the crop model on the date of observation [47][48][49].The advantage of this method is that the two data sources can complement each other at a temporal scale.Another strategy, which was adopted in this study, is to combine the two data sources

Discussion
The valuable information obtained from optical and SAR remotely sensed data allows us to evaluate the combined ability of the two sources of by assimilating them into the crop growth model.We considered that there are two strategies to assimilate optical and SAR remotely-sensed data into the crop model.One strategy is to retrieve state variables from the optical data and SAR data individually and then assimilate these parameters into the crop model on the date of observation [46][47][48].The advantage of this method is that the two data sources can complement each other at a temporal scale.Another strategy, which was adopted in this study, is to combine the two data sources to retrieve state variables (e.g., soil moisture) through a mechanism model and then assimilate the retrieved parameters into the crop model.An important issue for this strategy is the time consistency of the two data sources.For this study, we assumed that the NDVI will not change dramatically in a short time.In this paper, the SM was assimilated into the WOFOST model as a state variable using the EnKF assimilation algorithm to estimate the winter wheat yield at a regional scale.The SM images were retrieved by the water cloud model using Sentinel-1 C-band SAR data and Sentinel-2 MSI optical data.

Uncertainties of this Study
Soil moisture variability is the main driver of crop yield variability and accurate estimates of soil moisture are important for understanding crop status and estimating crop yields [26,49,50].Our soil moisture retrieval results showed that the accuracy of three SM images was acceptable and the maximum R 2 was 0.53 and minimum RMSE was 7.43%.However, large magnitude soil moisture error can bring a large uncertainty to crop models.The soil moisture retrieval error comes from both field measurements and the characteristics of the satellite data.It has been found that VV polarized radiation, which was used in this study, is effective for monitoring early vegetation, whereas it will cause more errors during the later stages of crop development [51].
In our analysis, the accuracy of the winter wheat yield estimation was improved by the EnKF assimilation method.The R 2 increased from 0.21 to 0.35 and RMSE decreased from 1330 kg/ha to 934 kg/ha and more realistic spatial differences throughout the study area were shown.This indicates the potential of crop yield estimation by combining the WOFOST model and the water cloud model, which jointly used the Sentinel-1 C-band SAR data and Sentinel-2 MSI optical data.However, compared with our previous study about assimilating time series LAI into the WOFOST model using the EnKF method (R 2 = 0.43, RMSE = 439 kg/ha) [1], this study still has a relatively high error (RMSE = 934 kg/ha).The main reasons for the errors are likely from the following points: First, the uncertainty of soil moisture retrieved from the water cloud model is a main source of error.About 7% error for soil moisture is slightly high, because the field capacity of the soil and the wilting point of the crop are around 0.30 m 3 /m 3 and 0.07 m 3 /m 3 in this study area, respectively.Second, the yield breakdown rate, which will cause overestimation of the field measured winter wheat yield, was not considered when calculating the field measured yield.Third, the amount of observation data (there are only three dates of SM images on April 1 st , May 7 th , and June 1 st ) was very limited for the crop model to conduct data assimilation.Sufficient observations during the crop growth period are of great importance for data assimilation to improve crop yield estimation.It is proven that the data assimilation algorithm is sensitive to sampling size and update frequency [44] and the EnKF method is limited by the finite ensemble size, which will bring sampling error into the background covariance [52,53].Fourth, the soil moisture in WOFOST model is an average quantity, which is obtained by dividing the total water content of the root zone by the root length.In this study, we only used the 0-10 cm soil moisture data in the water cloud model.Thus, it is better to use soil moisture data at different depths.

Future Work
In summary, there are still several limitations of this study, although it has been proven to have potential for winter wheat yield estimation at the regional scale.We have summed up three points to help improve the assimilation framework in the future.Firstly, as a semi empirical model, the accuracy of the water cloud model strongly relies on in situ measurements, which causes errors and is difficult to extend to a large regional scale.A more mechanistic model can be imported to replace the water cloud model (e.g., Michigan Microwave Canopy Scattering (MIMICS)).Secondly, owing to the deficiency of the WOFOST model in the irrigation module [4,44], the irrigation and soil water balance only received cursory attention in this study.A soil water atmosphere plant (SWAP) model with more complex soil subroutines could be taken into account.The principal difference between the SWAP and WOFOST models is in the soil water balance component calculations.The SWAP model will have a more accurate modelling of the soil water balance [54].Thirdly, this study was simulated at 10 m scale in Hengshui city, which has more than one million pixels.There is no doubt that this will increase the demand on the computing power of computers if we want to run the model at a larger scale.As such, a higher efficiency assimilation strategy needs to be considered to improve computing efficiency, for example a cluster strategy on the grid can effectively improve the computing efficiency [55,56].Therefore, it would be interesting to try to achieve a better result using the aspects above.

Conclusions
In this study, the SM mapping retrieved from the water cloud model, using Sentinel-1 C-band SAR data and Sentinel-2 MSI optical data, was assimilated into WOFOST model with the EnKF algorithm to improve winter wheat yield estimation at a regional scale in Hengshui city.The following conclusions can be drawn:

•
First, SM retrieval results demonstrated that it is feasible to retrieve soil moisture content with the water cloud model by combining remotely sensed Sentinel-1 and Sentinel-2 data.Results showed an acceptable accuracy on three dates where R 2 was 0.45, 0.53, and 0.49, respectively, and RMSE was 9.16%, 7.43%, and 8.53%, respectively.

•
Second, the assimilation results indicated that, with assimilation of the SM in the WOFOST water-limited mode, the winter wheat yield estimation achieved a higher accuracy, with R 2 = 0.35 and RMSE = 934 kg/ha, than that without assimilation, with R 2 = 0.21 and RMSE = 1330 kg/ha.Consequently, our results highlight the usefulness and ability of assimilating SM, which was retrieved by combining Sentinel-1 C-band SAR and Sentinel-2 MSI optical data into the WOFOST model to improve regional winter wheat yield estimations.

Figure 1 .
Figure 1.Study area.(Every county has 2 measured sites; each site has 5 sample points).

Figure 1 .
Figure 1.Study area.(Every county has 2 measured sites; each site has 5 sample points).

Figure 2 .
Figure 2. Calendar of the field experiment, remote sensing data acquisition dates and general growth stages of winter wheat.

Figure 2 .
Figure 2. Calendar of the field experiment, remote sensing data acquisition dates and general growth stages of winter wheat.

Figure 3 .
Figure 3. Flowchart for the winter wheat yield estimation using the EnKF-based assimilation algorithm.

Figure 3 .
Figure 3. Flowchart for the winter wheat yield estimation using the EnKF-based assimilation algorithm.

Figure 4 .
Figure 4. Maps of soil moisture in the study area retrieved from the water cloud model (a) 1 April, (b) 7 May and (c) 1 Jun.(d) Retrieved soil moisture content of 22 measured sites in Hengshui city over 3 time periods.(Every county has 2 measured sites).

Figure 4 .
Figure 4. Maps of soil moisture in the study area retrieved from the water cloud model (a) 1 April, (b) 7 May and (c) 1 Jun.(d) Retrieved soil moisture content of 22 measured sites in Hengshui city over 3 time periods.(Every county has 2 measured sites).

Figure 5 .
Figure 5. Validation of the retrieved soil moisture from the water cloud model; (a) 1 April, (b) 7 May, and (c) 1 June.Note: 55 sample points were used for validation.

Figure 5 .
Figure 5. Validation of the retrieved soil moisture from the water cloud model; (a) 1 April, (b) 7 May, and (c) 1 June.Note: 55 sample points were used for validation.

Figure 6 .
Figure 6.Comparison of the SM curves simulated by WOFOST model with and without the EnKFbased assimilation method (a sample point in Shenzhou county).

Figure 6 .
Figure 6.Comparison of the SM curves simulated by WOFOST model with and without the EnKF-based assimilation method (a sample point in Shenzhou county).

Figure 7 .
Figure 7. Regional winter wheat yield estimation without and with the EnKF assimilation method ((a) without assimilation; (b) with EnKF assimilation).

Figure 7 .
Figure 7. Regional winter wheat yield estimation without and with the EnKF assimilation method ((a) without assimilation; (b) with EnKF assimilation).

Figure 7 .
Figure 7. Regional winter wheat yield estimation without and with the EnKF assimilation method ((a) without assimilation; (b) with EnKF assimilation).

Table 1 .
Parameter values used in the water cloud model for the study area.

Table 1 .
Parameter values used in the water cloud model for the study area.