Estimation of Winter Wheat Biomass and Yield by Combining the AquaCrop Model and Field Hyperspectral Data

Knowledge of spatial and temporal variations in crop growth is important for crop management and stable crop production for the food security of a country. A combination of crop growth models and remote sensing data is a useful method for monitoring crop growth status and estimating crop yield. The objective of this study was to use spectral-based biomass values generated from spectral indices to calibrate the AquaCrop model using the particle swarm optimization (PSO) algorithm to improve biomass and yield estimations. Spectral reflectance and concurrent biomass and yield were measured at the Xiaotangshan experimental site in Beijing, China, during four winter wheat-growing seasons. The results showed that all of the measured spectral indices were correlated with biomass to varying degrees. The normalized difference matter index (NDMI) was the best spectral index for estimating biomass, with the coefficient of determination (R2), root mean square error (RMSE), and relative RMSE (RRMSE) values of 0.77, 1.80 ton/ha, and 25.75%, respectively. The data assimilation method (R2 = 0.83, RMSE = 1.65 ton/ha, and RRMSE = 23.60%) achieved the most accurate biomass estimations compared with the spectral index method. The estimated yield was in good agreement with the measured yield (R2 = 0.82, RMSE = 0.55 ton/ha, and RRMSE = 8.77%). This study offers a new method for agricultural resource management through consistent assessments of winter wheat biomass and yield based on the AquaCrop model and remote sensing data.


Introduction
Wheat is an important food source for the rapidly increasing population in China [1].The attention paid to national food security and sustainable agricultural development has increased over recent years, with increased concern for the improvement of field wheat management.Therefore, it is important to estimate wheat growth status and predict wheat yield in a timely and accurate way [2].The integration of crop models and remote sensing data has become a useful method for monitoring crop growth status and crop yield based on data assimilation over extensive regions [3,4].
In most cases, researches have developed remote sensing and crop models used in their respective study areas [5].Crop models simulate crop physiological growth status using mathematical formulas [6].They have been used to analyze the influences of climate, soil conditions, and management strategies on agronomic parameters (e.g., canopy aboveground biomass, LAI, and grain yield) [7].The Agricultural Model Intercomparison and Improvement Project (AgMIP) recently reviewed 27 wheat models from around the world [8].This review showed that poor performance may be obtained when a crop model is applied over a large region due to uncertainties in the spatial distribution of soil properties, initial model conditions, crop parameters, and field management practices, resulting in biased simulations [9,10].Large amounts of high-quality data have been used to improve the calibration and parameterization of crop models, thereby increasing the simulation accuracy of crop models on a regional scale.
Rapid development of remote sensing technology has facilitated the acquisition of crop growth information with high temporal and spatial resolutions [5,[11][12][13][14][15][16].Previous studies have indicated that combining crop models and remote sensing data can be used to improve the accuracy of crop yield estimates [17][18][19][20][21][22][23][24][25].Curnel et al. [17] evaluated the feasibility of assimilating wheat leaf area index (LAI) derived from remote sensing data into the World Food Studies' (WOFOST) crop growth model using a recalibration-based assimilation method; the results indicated that remote sensing data can be used to improve yield estimations.Dente et al. [19] assimilated LAI from Environment Satellite (ENVISAT) Advanced Synthetic Aperture Radar (ASAR) and Medium Resolution Imaging Spectrometer (MERIS) data into the Crop Estimation through Resource and Environment Synthesis-Wheat (CERES-Wheat) model at a catchment scale using a variational assimilation algorithm; the results suggested that this approach minimizes the difference between simulated and remotely-sensed LAI and achieves high estimation accuracy.Soil moisture data from the Advanced Microwave Scanning Radiometer-EOS (AMSR-E) and LAI from the Moderate Resolution Imaging Spectroradiometer-LAI (MODIS-LAI) were assimilated into the Decision Support System for Agro-technology Transfer-Cropping System Model (DSSAT-CSM)-Maize using an Ensemble Kalman Filter algorithm, and simulated yield was more accurate when both LAI and soil moisture were used [22].The ensemble-based four-dimensional variational method was used to assimilate HJ-1A/B satellite data into the CERES-Wheat model, and estimates of winter wheat yield in field plots were reported (R 2 = 0.73; RMSE = 319 kg/ha) [23].Huang et al. [25] assimilated time series of LAI data with a 30-m spatial resolution into the WOFOST model with a Kalman Filter (KF) algorithm and reported more accurate estimates of regional winter wheat yield compared with more traditional approaches.
The integration of crop models and remotely sensed data using optimization algorithms (data assimilation methods) is becoming an effective and potential method for monitoring crop growth status and estimating crop yields, as it overcomes certain defects and combines the advantages of individual methods [5,[26][27][28][29][30].The data assimilation method can be used to reduce uncertainty in crop models to ensure that the simulated state variables (e.g., LAI and biomass) are in agreement with the measured state variables from remote sensing data.Several assimilation schemes have been developed [3,12,[31][32][33].Delecolle et al. [34] divided schemes into three categories.The first is the forcing method, in which state variables in crop models are directly substituted by remote sensing variables.This method is easy to use, but it relies on the calibrated parameters of crop models and the accuracy of remote sensing data [5,12].The second is the calibration method, in which the initial parameters of crop models are recalibrated, based on the relationship between the remote sensing state variables and the simulated state variables [5,10,34].In recent years, the calibration method has gained more attention, as it has greatly benefited from several intelligent optimization algorithms.The main shortcoming of the calibration method is that a great deal of computation time is required [3,5,30].The third is the updating method, in which the simulated state variables are continuously renewed whenever remote sensing state variables are available.It is more flexible than the forcing method, but the remote sensing data must be of a higher accuracy than those of the simulated state variables, and this method heavily relies on the selection of the remote sensing data [4,33,35].
The combination of remote sensing data and light-driven or carbon-driven models has been widely studied, but few studies have focused on water-driven models for estimating the biomass and yield of crops.Therefore, in this study we focused on the AquaCrop model, a water-driven crop model, which was recently introduced to optimize crop water management strategies and improve crop yield in irrigation regions [36].Winter wheat is the main crop grown in the North China Plain (NCP), and thus an important food source in China.Increased industrial and domestic water use has resulted in reduced water availability for irrigation of winter wheat crops.Therefore, improving water resource management in this region is crucial for increasing winter wheat yield.The main goal of this study was to improve estimates of winter wheat biomass and yield by assimilating field spectroscopic data into the AquaCrop model with a Particle Swarm Optimization (PSO) algorithm.The biomass and yield of winter wheat were used to optimize field irrigation management strategies and then to increase water use efficiency under different planting dates and irrigation management strategies.The specific objectives of this study were: (1) to select the best spectral indices from hyperspectral data for estimating winter wheat biomass; (2) to calibrate the AquaCrop model with biomass estimates derived from these indices using the PSO algorithm for improving accuracy of biomass and yield estimates; and (3) to evaluate the performance of the data assimilation method in estimating wheat biomass and yield.C in summer, and the minimum temperature is −4.7 • C in winter.For the experimental period, the average annual precipitation was 650 mm and the frost-free period was 180 days on average [37].

Experimental Setup
Table 1 shows the winter wheat planting dates and cultivars.The area of each plot was 100 m 2 , in 2008, 2009, and 2010, and 300 m 2 in 2011.A two-way factorial arrangement of treatments (winter wheat cultivar and planting date) in a randomized complete block design with three replicates was used in this experiment.Weed control, pest management, and fertilizer application were performed according to the local standard practices for wheat production.

Meteorological Data Collection
The local Xiaotangshan meteorological station was used to obtain meteorological data.Daily relative humidity, rainfall, total sunshine hours, wind speed, and maximum, minimum, and mean temperatures were recorded directly at the Xiaotangshan experimental site.The Food and Agriculture Organization Penman-Monteith method was used to calculate the reference evapotranspiration (ET o ) [38].

Measurement of Canopy Reflectance
Spectral measurements of winter wheat were taken at different growth stages.The specific growth stages and dates are presented in Table 2.All canopy spectral measurements were taken at a nadir orientation, 1.0 m above the canopy, under clear sky condition between 10:00 and 14:00 Beijing local time, using an ASD Field Spec Pro Spectrometer (Analytical Spectral Devices, Boulder, CO, USA).The spectrometer was fitted with a 25 • field of view optical fiber, operating in the 350-2500 nm spectral region.The scanned area of the ASD sensor was about 0.70 m 2 .A 40 × 40 cm BaSO 4 calibration panel was used for calculating the black and baseline reflectance.Spectral measurements taken at all four sites in each plot were averaged to represent the canopy reflectance of each plot to reduce the possible effects due to field conditions.Vegetation radiance measurements were averaged from 10 scans at an optimized combination time at each site, and a dark current correction was conducted before each measurement.For each plot, a total of 40 spectra data points were obtained.Panel radiance measurements were taken twice, before and after the canopy spectral measurements.The aboveground biomass, at the measuring positions of canopy spectral reflectance data, were obtained 5-6 times using random sampling of a 0.25-m 2 area, in 2009-2012, with four replicates from each plot.A 4 × 0.25 m 2 area for each plot was deemed sufficient, based on previous results [3].All samples were heated to 105 • C, then oven dried at 70 • C to a constant weight, and their final dry weights were recorded.
The grain yields of each plot with three replicates for each treatment were obtained by randomly sampling a 1.5-m 2 area.Finally, selected grain was dried and weighed on an electronic scale (±0.01 g).

Selection of Spectral Indices and Biomass Estimation from Spectral Indices
Fifteen spectral indices from the literature [13,16,[39][40][41][42][43][44][45][46][47][48][49][50], determined to be good candidates for estimating biomass, were selected for the entire winter wheat growing season, based on 2009, 2010, and 2011 field data (Table 3).To refine the relationships between spectral indices and biomass, linear and nonlinear regression relationships between each of the spectral indices and biomass were determined based on field data from all growth stages during 2009, 2010, and 2011 (n = 135, calibration dataset).Field data taken in 2012 (n = 20, validation dataset) was used to validate the estimation accuracy of the models.Since the four winter wheat cultivars exhibited larger differences in 2012, resulting in greater variation in the biomass of the four winter wheat cultivars, the dataset from 2012 was selected to validate the estimation accuracy of the models.To determine the most sensitive spectral indices, we compared the coefficient of determination (R 2 ), root mean square error (RMSE), and relative RMSE (RRMSE) values of the different models.Best-fitting regression equations were used for estimating winter wheat biomass.In addition, the homoscedasticity values (F) of the estimated and measured biomass were calculated using Levene's test [51].
Note: R i denotes reflectance at band i (nanometer).

Description of the AquaCrop Model
The AquaCrop model was reported by the FAO in 2009, and detailed descriptions are reported in Steduto et al. [36], Raes et al. [52], and Jin et al. [37].It computes daily crop transpiration and soil evaporation.The model subsequently estimates yield based on daily crop transpiration [36].

Description of the ACsaV40 (AquaCrop Plug-In) Model
The AquaCrop plug-in program, ACsaV40, was created to simultaneously run large amounts of data without a user interface [53].It facilitates external and practical applications of AquaCrop.The input parameters of ACsaV40 are sorted in a text file, which can be created using the AquaCrop model, or by manually replacing the values of each variable with new values in the existing text files [54].ACsaV40 runs the successive project files, and the simulated results of each project file are reserved in an output file, which includes the simulation period, stress factors, canopy cover, biomass, crop yield, and so on [53].

Assimilation of the AquaCrop Model and Remote Sensing Data Using the Particle Swarm Optimization (PSO) Algorithm
Particle swarm optimization (PSO) is a comparatively simple principle that can be easily combined into crop models with high calculation efficiency and few input parameters.Compared with various optimization algorithms, PSO is easier to apply in a practical study and has the advantages of a high precision and rapid convergence [55].It has received widespread attention among scientists who have demonstrated its superiority in solving practical problems.In addition, PSO has the capability of parallel computing.Therefore, we used PSO to carry out the assimilation of remote sensing data into the AquaCrop model.PSO is based on the assumption of a group consisting of m (25 groups in this study) particles with certain speeds, without quality and size, in a d-dimensional search space.Each particle can modify its position and velocity based on both the best point in the current generation (p id ) and the best point of all particles in the swarm (p gd ).In this study, estimated biomass was used to optimize the crop parameters used in the AquaCrop model to obtain the optimal simulated biomass based on the fit of the cost function.The corresponding optimal yield is produced when the optimal simulated biomass is achieved.The PSO assimilation method for the The AquaCrop model and remote sensing data are presented in Figure 1.The specific steps to execute the PSO are as follows: (1) The velocity and position (initial value) of each particle are determined.The adjusted parameters include eight crop parameters (cgc, ccx, cdc, eme, num, psen, pstoshp, and rootdep) [56].Specific information and ranges for these parameters are listed in Table 4.The AquaCrop model and remote sensing data are presented in Figure 1.The specific steps to execute the PSO are as follows: (1) The velocity and position (initial value) of each particle are determined.The adjusted parameters include eight crop parameters (cgc, ccx, cdc, eme, num, psen, pstoshp, and rootdep) [56].Specific information and ranges for these parameters are listed in Table 4.

Biomass Estimation
The regression relationships between biomass and spectral indices are provided in Table 5.The lowest and highest R 2 values (0.25 and 0.84) were obtained for DCNI I and NDMI, respectively.The order of the spectral indices was WI I, WI II, NDII, NDMI, TBWI, EVI, TCARI, OSAVI, TCARI/OSAVI, MTCI, CI red edge , NDVI, DCNI I, OSAVI × CI red edge , and WDRVI.Of the R 2 values, two were above 0.70, five were equal to or above 0.60, and eight were below 0.6.All spectral indices were fitted to power regression equations, with the exception of NDMI, DCNI I, and WDRVI, which were fitted to exponential regression equations (Table 5).The result showed that the assumption of homoscedasticity is met, based on the calculated F values between the estimated and measured biomass (Table 5).The correlation between biomass and NDMI was highest compared with the other spectral indices, and the corresponding RMSE and RRMSE values for measured (BIOm) and estimated (BIOe) biomass were 1.80 ton/ha and 25.75%, respectively, which were lower than the values for the other indices (Table 5 and Figure 2).Therefore, NDMI was selected to estimate winter wheat biomass.

Data Assimilation for Biomass Estimation
The value of BIOe derived from the NDMI exponential regression equation was used as a variable to calibrate the AquaCrop model using the PSO algorithm.The results are presented in Figure 3 and Table 6, and the statistical regression equations are shown in Table 6.The BIOs was consistent with the BIOm across four years with different winter wheat cultivars, sowing dates, and irrigation management strategies, and the corresponding R 2 and RMSE values were 0.83 and 1.65 ton/ha, respectively.The estimation accuracies of our experiments varied between years.The R 2 and RMSE values were 0.81 and 1.69 ton/ha for 2008/2009, 0.82 and 1.67 ton/ha for 2009/2010, 0.81 and 1.56 ton/ha for 2010/2011, and 0.87 and 1.72 ton/ha for 2011/2012.The RRMSE values ranged from 23.60% to 30.65%.The deviation between the BIOs and BIOm in 2008/2009 was larger than that for the other years.Strong relationships between BIOs and BIOm were found, although biomass was often overestimated when the measured values exceeded 2 ton/ha (Figure 3).However, biomass was underestimated when the measured values were less than 2 ton/ha.The value of F was from 0.87 to 0.96 between the simulated and measured biomass (Table 6).The results show that the assumption of homoscedasticity was met.

Data Assimilation for Biomass Estimation
The value of BIOe derived from the NDMI exponential regression equation was used as a variable to calibrate the AquaCrop model using the PSO algorithm.The results are presented in Figure 3 and Table 6, and the statistical regression equations are shown in Table 6.The BIOs was consistent with the BIOm across four years with different winter wheat cultivars, sowing dates, and irrigation management strategies, and the corresponding R 2 and RMSE values were 0.83 and 1.65 ton/ha, respectively.The estimation accuracies of our experiments varied between years.The R 2 and RMSE values were 0.81 and 1.69 ton/ha for 2008/2009, 0.82 and 1.67 ton/ha for 2009/2010, 0.81 and 1.56 ton/ha for 2010/2011, and 0.87 and 1.72 ton/ha for 2011/2012.The RRMSE values ranged from 23.60% to 30.65%.The deviation between the BIOs and BIOm in 2008/2009 was larger than that for the other years.Strong relationships between BIOs and BIOm were found, although biomass was often overestimated when the measured values exceeded 2 ton/ha (Figure 3).However, biomass was underestimated when the measured values were less than 2 ton/ha.The value of F was from 0.87 to 0.96 between the simulated and measured biomass (Table 6).The results show that the assumption of homoscedasticity was met.We compared BIOs with BIOe using the spectral index method.The data assimilation method (R 2 = 0.83 and RMSE = 1.65 ton/ha, Table 6) achieved better biomass estimations than the spectral index method (R 2 = 0.77 and RMSE = 1.80 ton/ha, Table 5).

Data Assimilation for Yield
The yield of winter wheat was obtained after the data assimilation.The relationship between the measured and simulated yields is shown in Figure 4 and Table 7.There was a significant relationship between simulated (YIELDs) and measured (YIELDm) yield across all four years (R 2 and RMSE values of 0.82 and 0.55 ton/ha, respectively) (Table 7).YIELDs varied between the four growing seasons.The R 2 and RMSE values for YIELDs and YIELDm were 0.79 and 0.51 ton/ha in 2008/2009, 0.83 and 0.57 ton/ha in 2009/2010, 0.81 and 0.52 ton/ha in 2010/2011, and 0.89 and 0.61 ton/ha in 2011/2012, respectively.The value of RRMSE ranged from 8.77% to 10.69%.There was a wider range of YIELDs values in 2008/2011 than in 2011/2012 because of the different sowing treatments (Figure 4).A good relationship between YIELDs and YIELDm was also found.Yield was often overestimated when the YIELDm was higher than 5 ton/ha (Figure 4) and underestimated when the YIELDm was less than 5 ton/ha.Table 7 shows that the F values ranged from 0.72 to 0.87.The results demonstrated that the assumption of homoscedasticity was met.

Data Assimilation for Yield
The yield of winter wheat was obtained after the data assimilation.The relationship between the measured and simulated yields is shown in Figure 4 and Table 7.There was a significant relationship between simulated (YIELDs) and measured (YIELDm) yield across all four years (R 2 and RMSE values of 0.82 and 0.55 ton/ha, respectively) (Table 7).YIELDs varied between the four growing seasons.The R 2 and RMSE values for YIELDs and YIELDm were 0.79 and 0.51 ton/ha in 2008/2009, 0.83 and 0.57 ton/ha in 2009/2010, 0.81 and 0.52 ton/ha in 2010/2011, and 0.89 and 0.61 ton/ha in 2011/2012, respectively.The value of RRMSE ranged from 8.77% to 10.69%.There was a wider range of YIELDs values in 2008/2011 than in 2011/2012 because of the different sowing treatments (Figure 4).A good relationship between YIELDs and YIELDm was also found.Yield was often overestimated when the YIELDm was higher than 5 ton/ha (Figure 4) and underestimated when the YIELDm was less than 5 ton/ha.Table 7 shows that the F values ranged from 0.72 to 0.87.The results demonstrated that the assumption of homoscedasticity was met.

Discussion
Spectral data and concurrent biomass and yield were acquired during four winter wheat growing seasons.Fifteen spectral indices were related to biomass (Table 5); this is because red edge (670-780 nm) and near infrared (short NIR, 800-1100 nm) data contain useful information regarding vegetation biomass [13,39,44,50].In particular, NDMI was found to be highly correlated with biomass, with R 2 and RMSE values of 0.77 and 1.80 ton/ha, respectively.NDMI does not contain red edge or short NIR data because absorption at these wavelengths is strongly influenced by chlorophyll content and canopy structure, which reduce the signal compared with that of dry matter.However, NDMI contains data at 1649 and 1722 nm, which are more sensitive to changes in dry matter [42].These data were combined to establish NDMI, which includes signals from dry matter.For this reason, NDMI was more highly related with biomass than the other spectral indices and achieved more accurate biomass estimations.In this study, the linear and nonlinear regression between each spectral index and biomass were analyzed to select the best-fitting regression equations.The results show that some models were fitted using power regression, and others fitted using exponential regression (Table 5).The difference between two regressions may have a close relationship with each spectral index and biomass dataset.
The model's initial variables (num and eme) and crop parameters (cgc, ccx, cdc, eme, psen, and rootdep) were calibrated by combining biomass retrieved from spectral indices and the AquaCrop model via the PSO assimilation algorithm, thereby achieving optimal biomass estimations.The simulated biomass values were consistent with the measured values.These findings are consistent with those of Soddu et al. [58].Heng et al. [59] showed that the AquaCrop model is used to better simulate biomass when irrigation is adequate.Our results suggest that the AquaCrop model could be used to simulate winter wheat biomass.The data assimilation method, based on the PSO algorithm, achieved better biomass estimations than the spectral index method (Tables 5 and 6).The main reasons are as follows: (i) The AquaCrop model can be used to simulate dry biomass accumulation on the basis of a plant's physiological processes, and the effects of field management strategies and weather [36,37,59,60]; and (ii) the data assimilation method was used to minimize errors between the observed values from field spectroscopic data and the simulated values from the AquaCrop model, and the errors in the remote sensing data were reduced during data assimilation [10].Typically, biomass simulated with the data assimilation method was overestimated when the measured values exceeded 2 ton/ha, but was underestimated when the measured values were less than 2 ton/ha (Figure 3).This explains why the regression equations between NDMI and biomass

Discussion
Spectral data and concurrent biomass and yield were acquired during four winter wheat growing seasons.Fifteen spectral indices were related to biomass (Table 5); this is because red edge (670-780 nm) and near infrared (short NIR, 800-1100 nm) data contain useful information regarding vegetation biomass [13,39,44,50].In particular, NDMI was found to be highly correlated with biomass, with R 2 and RMSE values of 0.77 and 1.80 ton/ha, respectively.NDMI does not contain red edge or short NIR data because absorption at these wavelengths is strongly influenced by chlorophyll content and canopy structure, which reduce the signal compared with that of dry matter.However, NDMI contains data at 1649 and 1722 nm, which are more sensitive to changes in dry matter [42].These data were combined to establish NDMI, which includes signals from dry matter.For this reason, NDMI was more highly related with biomass than the other spectral indices and achieved more accurate biomass estimations.In this study, the linear and nonlinear regression relationships between each spectral index and biomass were analyzed to select the best-fitting regression equations.The results show that some models were fitted using power regression, and others fitted using exponential regression (Table 5).The difference between two regressions may have a close relationship with each spectral index and biomass dataset.
The model's initial variables (num and eme) and crop parameters (cgc, ccx, cdc, eme, psen, and rootdep) were calibrated by combining biomass retrieved from spectral indices and the AquaCrop model via the PSO assimilation algorithm, thereby achieving optimal biomass estimations.The simulated biomass values were consistent with the measured values.These findings are consistent with those of Soddu et al. [58].Heng et al. [59] showed that the AquaCrop model is used to better simulate biomass when irrigation is adequate.Our results suggest that the AquaCrop model could be used to simulate winter wheat biomass.The data assimilation method, based on the PSO algorithm, achieved better biomass estimations than the spectral index method (Tables 5 and 6).The main reasons are as follows: (i) The AquaCrop model can be used to simulate dry biomass accumulation on the basis of a plant's physiological processes, and the effects of field management strategies and weather [36,37,59,60]; and (ii) the data assimilation method was used to minimize errors between the observed values from field spectroscopic data and the simulated values from the AquaCrop model, and the errors in the remote sensing data were reduced during data assimilation [10].Typically, biomass simulated with the data assimilation method was overestimated when the measured values exceeded 2 ton/ha, but was underestimated when the measured values were less than 2 ton/ha (Figure 3).This explains why the regression equations between NDMI and biomass were similar.Therefore, the integration of spectral indices into the AquaCrop model, using the PSO assimilation algorithm, is a useful tool for winter wheat biomass estimation.
Winter wheat grain yield was simulated according to the optimized values of the initial variables and calibrated crop parameters using the PSO data assimilation algorithm.A good relationship between the measured and simulated yields was found across all four years (R 2 = 0.82 and RMSE = 0.55 ton/ha).However, the RRMSE for yield was lower than that for biomass (Tables 6 and 7), mainly because the latest biomass measurements were taken at the grain filling stage (12 June) rather than at maturity and biomass simulated with the AquaCrop model becomes more accurate with the development of winter wheat [37,61].Our results are in agreement with those of Wang et al. [61] and Jin et al. [37].The AquaCrop model considers the effects of interannual variations in weather and field management strategies, as well as interactions between the two, on wheat growth status; therefore, it was used to analyze the nonlinear interannual variability in crop grain yield [36].The results suggest that the AquaCrop model is an effective tool for deriving crop management strategies, and can be used to simulate biomass and grain yield of winter wheat.First, biomass retrieved from spectral indices is used to calibrate crop biomass simulated with the AquaCrop model.If crop biomass is accurately simulated, it can be used to simulate yield.The simulated yield is finally obtained directly from the AquaCrop model after data assimilation.Simulated grain yield is a useful measurement for informed decision-making regarding national food security issues.However, it is more important to obtain crop growth status information and then to improve field crop management for improving grain yield to ensure national food security, In short, the dynamic simulated biomass of wheat is used to enhance wheat management and decision-making, and then to ensure wheat yield.
The data assimilation accuracy of biomass and grain yield was acceptable according to the R 2 , RSME, and RRMSE values (Tables 6 and 7).The results of Dente et al. [19] and Jiang et al. [23] indicated that assimilating remote sensing data (ENVISAT ASAR, MERIS, and HJ-1A/B satellites images) into the CERES-Wheat model with optimization algorithms (variational assimilation algorithm and Ensemble-Based Four-Dimensional Variational algorithm) can improve the estimation accuracy of wheat yield.Huang et al. [25] recently suggested that combining the WOFOST model and remote sensing data (MODIS and Landsat TM images) with a KF algorithm also increases the estimation accuracy of wheat yield.Our results are in agreement with the results of these studies and demonstrate that the combination of the AquaCrop model and spectral indices with a PSO algorithm can be used to enhance the estimation accuracy of winter wheat yield.A good relationship between the simulated and measured yields was found (Figure 4); however, the relationship between measured and simulated biomass was not reliable during each growth stage (Figure 3).This can be attributed to the influence of a large difference in the biomass measurement date on biomass simulation [37], which then introduced uncertainties into the process of data assimilation.However, the yield simulations were consistent during all crop growth stages.Therefore, the data assimilation method can improve crop yield estimations because the AquaCrop model considers the effects of management strategies and environmental factors on winter wheat growth status, based on a plant's physiological processes.Our results suggest that integrating remote-sensing data into the AquaCrop model is a feasible method for estimating winter wheat biomass and yield.
In this study, the hyperspectral data that were obtained were ground-based data.To improve our model for estimating biomass and yield in winter wheat, and to make it more practical, it is important to estimate the accuracy and stability of the model using hyperspectral satellite data.The current Landsat and Sentinel-2 satellites provide high spatial resolution imagery data (10-60 m) with relatively short revisit periods.Based on this, Landsat and Sentinel-2 sensors have the potential for improved estimates of biomass and yield in winter wheat at regional scales.With the development of unmanned aerial vehicles (UAV), the combination of UAV and hyperspectral imaging data should allow for the timely estimation of the growth status of crops, with high spatial resolution image data at the field and farm scales, in the future.In this study, we only out experiments at a single-site, and obtained good results over four years.The method used in this study is transferrable to other sites.The main insights from this study are as follows: (i) The crop parameters of the AquaCrop model for different crops are parameterized to better-simulate different crop biomass and yields, during all growth stages under different environmental conditions and experimental sites; (ii) different crops should be accurately classified using high temporal and spatial resolution image data when this method is applied to regional scales; (iii) PSO will further enhance the advantages of a parallel algorithm to quickly obtain estimated results at regional scales; (iv) corresponding field crop management strategies (such as water and fertilizer management) can then be carried out, based on the estimated crop biomass, resulting in improved crop yields at regional scales; and (v) in addition, this method can be combined with higher temporal and spatial resolution image data and the AquaCrop model to improve field crop management, and then to enhance crop yield at the sub-field and sub-farm scales in the future.The positive results obtained here were based on single-site experiments over four years, however, further experiments should be carried out to adjust crop parameters of the AquaCrop model under water and fertilizer stress treatments to maintain the stability of the simulated results.The effect of the soil parameter variations on the simulated results in the AquaCrop model should be further investigated to better apply it at regional scales.Further studies are needed to verify these results for different crops, and in different ecological areas, as this study was limited to winter wheat in Beijing, China.

Conclusions
In this study, the PSO data assimilation algorithm was used to assimilate field spectroscopic data into the AquaCrop model to improve the estimation accuracy of winter wheat yield under different planting dates and irrigation management strategies.The conclusions are as follows: (i) Several spectral indices were highly correlated with biomass in winter wheat.The exponential regression equation between the normalized difference matter index (NDMI) and biomass was the best model for estimating biomass, with R 2 and RMSE values of 0.77 and 1.80 ton/ha, respectively; (ii) The data assimilation method (R 2 = 0.83 and RMSE = 1.65 ton/ha) achieved more accurate biomass estimations than the spectral index method; (iii) Yield simulated with the data assimilation method was consistent with measured yield across all four years (R 2 and RMSE values of 0.82 and 0.55 ton/ha, respectively).In summary, the results indicated that the data assimilation method is an effective method for estimating biomass and yield of winter wheat.The results provide a guideline for optimizing irrigation management strategies for winter wheat in this region.

( 2 )
ACsaV40 is executed with the required data using MATLAB (version 2007, MathWorks, Natick, MA, USA), and simulated biomass (BIOs) is obtained.(3) Regression relationships between spectral indices and measured biomass are analyzed, and the best regression model is determined to estimate biomass (BIOe).(4) A cost function is constructed according to the relationship BIOs and BIOe, reflecting the difference between BIOs and BIOe.The fit of the cost function determines whether the optimization algorithm had achieved the optimal input parameters.(5) The values of p id and p gd are searched in each iteration.(6) The position and velocity of each particle are updated based on p id and p gd .The values of C 1 and C 2 are set as 2, and random values between 0 and 1 are assigned to ξ and η [57].(7) If the iteration target (100 generations) is not reached, the updated positions are replaced and the previous step is repeated.(8) If the final iteration is achieved, the values of BIOs and corresponding simulated YIELDs are produced.Remote Sens. 2016, 8, 972 6 of 15

( 2 )
ACsaV40 is executed with the required data using MATLAB (version 2007, MathWorks, Natick, MA, USA), and simulated biomass (BIOs) is obtained.(3) Regression relationships between spectral indices and measured biomass are analyzed, and the best regression model is determined to estimate biomass (BIOe).(4) A cost function is constructed according to the relationship BIOs and BIOe, reflecting the difference between BIOs and BIOe.The fit of the cost function determines whether the optimization algorithm had achieved the optimal input parameters.(5) The values of pid and pgd are searched in each iteration.(6) The position and velocity of each particle are updated based on pid and pgd.The values of C1 and C2 are set as 2, and random values between 0 and 1 are assigned to ξ and η [57].(7) If the iteration target (100 generations) is not reached, the updated positions are replaced and the previous step is repeated.(8) If the final iteration is achieved, the values of BIOs and corresponding simulated YIELDs are produced.

Figure 1 .
Figure 1.Flowchart of the Particle Swarm Optimization (PSO) assimilation method for the AquaCrop model and remote sensing data.Figure 1. Flowchart of the Particle Swarm Optimization (PSO) assimilation method for the AquaCrop model and remote sensing data.

Figure 1 .
Figure 1.Flowchart of the Particle Swarm Optimization (PSO) assimilation method for the AquaCrop model and remote sensing data.Figure 1. Flowchart of the Particle Swarm Optimization (PSO) assimilation method for the AquaCrop model and remote sensing data.

Figure 3 .
Figure 3.Comparison of data assimilation biomass (BIOs) and field measurement biomass (BIOm) values in winter wheat across the four experiments.

Figure 3 .
Figure 3.Comparison of data assimilation biomass (BIOs) and field measurement biomass (BIOm) values in winter wheat across the four experiments.
Note: a C represents the calibration dataset (2009-2011, n = 27), and V represents the validation dataset (2012, n = 4).The calibration dataset was used to refine the linear regression relationships between the data assimilation yield (YIELDs) and field measurement yield (YIELDm) across three years of experiments.The validation dataset taken in 2012 was used to validate the estimation accuracy of the linear regression equation based on data from 2009, 2010, and 2011; b R 2 was calculated from 27 calibration datasets, and RMSE was calculated from 20 validated datasets; x represents simulated biomass; y represents measured biomass; and F represents the homoscedasticity value for the Levene's test.If the associated probability for the F test is larger than 0.05, the assumption of homoscedasticity is met.

Figure 4 .
Figure 4. Comparison of data assimilation yield (YIELDs) and field measurement yield (YIELDm) values in winter wheat across thefour experiments.

Figure 4 .
Figure 4. Comparison of data assimilation yield (YIELDs) and field measurement yield (YIELDm) values in winter wheat across the four experiments.

Table 3 .
Summary of spectral indices studied.

Table 4 .
Initial values and ranges of calibration parameters, or initial data, of the AquaCrop model.

Table 5 .
Correlations between biomass and spectral indices of winter wheat (n = 135).= number of data pairs; x represents the spectral index; and y represents biomass.In addition, x and e represents power and exponential function in regression equations, respectively.Probability levels of 0.05 and 0.01 are indicated by * and **, respectively; F represents the homoscedasticity value in Levene's test.If the associated probability for the F test is larger than 0.05, the assumption of homoscedasticity is met.
Note: n

Table 6 .
Equations for regressions between data assimilation biomass (BIOs) and field measurement biomass (BIOm) of winter wheat for the four experiments.
between the data assimilation biomass (BIOs) and field measurement biomass (BIOm) across three years of experiments.The validation dataset taken in 2012 was used to validate the estimation accuracy of the linear regression equation based on 2009, 2010, and 2011; b R 2 was calculated from 135 calibration datasets, and RMSE was calculated from 20 validation datasets; x represents simulated biomass; y represents measured biomass; and F represents the homoscedasticity value for the Levene's test.If the associated probability for the F test is larger than 0.05, the assumption of homoscedasticity is met.

Table 6 .
Equations for regressions between data assimilation biomass (BIOs) and field measurement biomass (BIOm) of winter wheat for the four experiments.
b R 2 was calculated from 135 calibration datasets, and RMSE was calculated from 20 validation datasets; x represents simulated biomass; y represents measured biomass; and F represents the homoscedasticity value for the Levene's test.If the associated probability for the F test is larger than 0.05, the assumption of homoscedasticity is met.

Table 7 .
Regression equations between data assimilation yield (YIELDs) and field measurement yield (YIELDm) values of winter wheat across the four experiments.
data assimilation yield (YIELDs) and field measurement yield (YIELDm) across three years of experiments.The validation dataset taken in 2012 was used to validate the estimation accuracy of the linear regression equation based on data from 2009, 2010, and 2011; b R 2 was calculated from 27 calibration datasets, and RMSE was calculated from 20 validated datasets; x represents simulated biomass; y represents measured biomass; and F represents the homoscedasticity value for the Levene's test.If the associated probability for the F test is larger than 0.05, the assumption of homoscedasticity is met.

Table 7 .
Regression equations between data assimilation yield (YIELDs) and field measurement yield (YIELDm) values of winter wheat across the four experiments.