Assimilation of Two Variables Derived from Hyperspectral Data into the DSSAT-CERES Model for Grain Yield and Quality Estimation

The combination of remote sensing and crop growth models has become an effective tool for yield estimation and a potential method for grain quality estimation. In this study, two assimilation variables (derived from a hyperspectral sensor), called leaf area index (LAI) and canopy nitrogen accumulation (CNA), were jointly used to calibrate the sensitive parameters and initial states of the DSSAT-CERES crop model, to improve OPEN ACCESS Remote Sens. 2015, 7 12401 simulated output of the grain yield and protein content of winter wheat. The results show that the modified simple ratio (MSR) and normalized difference red edge (NDRE) better estimated LAI and CNA, respectively, compared with the other possible vegetation indices. The integration of both LAI and CNA resulted in a more robust DSSAT-CERES models with than each one alone. The R and RMSE values, respectively, of the regression between the simulated (using the two assimilation variables method) and measured LAI were 0.828 and 0.494, and for CNA were 0.808 and 20.26 kg N∙ha. These two assimilation variables resulted in grain yield and protein content estimates of winter wheat with a high precision and R and RMSE values of 0.698 and 0.726 ton∙ha, and 0.758% and 1.16%, respectively. This study provides a more robust method for estimating the grain yield and protein content of winter wheat based on the integration of the DSSAT-CERES crop model and remote sensing data.


Introduction
Wheat (Triticum aestivum L.) is a staple food in North China, where the population accounts for about 40% of the country's total [1,2].The productivity of wheat, including grain yield and quality, directly determines its market price and related agriculture policies [3,4], and an advanced knowledge of grain yield and quality is important for this purpose [5,6].The combination of remote sensing and crop growth simulation models has provided an effective tool for crop grain yield and quality estimation in regional studies [5,7].
Many studies have been focused on the integration of remote sensing and crop growth simulation models for crop growth monitoring and yield estimation [5,7,8].Maas [9,10] initially described four techniques (input, updating, re-initialization, and re-parameterization) for leaf area indices (LAI) by incorporating remote sensing data into crop growth model estimations.Guerif and Duke [8,11] combined the Simple and Universal Crop Growth Simulator (SUCROS) model and the SAIL reflectance model to calibrate the site specific characteristics of soil and crops using remote sensing data, and they suggested that assimilation methods had a great potential for large-scale crop monitoring (e.g., yield prediction).Jongschaap [12] demonstrated simulation accuracy could be improved by run-time replacement for the Rotask simulation model with remote sensing observations.In a regional winter wheat and maize yield simulation, De Wit and Van Diepen [13] used the Ensemble Kalman filter (EnKF) to assimilate soil moisture into the World Food Study (WOFOST) model, where the results indicated assimilation of soil moisture improved the model's relationship with crop yield statistics for 66% and 56% of the regions for winter wheat and maize, respectively.Fang et al. [7,14] integrated MODIS LAI, vegetation index, and the Crop System Model (CSM)-Crop Environment Resource Synthesis (CERES)-Maize model for corn yield estimation in Indiana, USA.Morel et al. [15] coupled the sugarcane modelling software (MOSICAS) with a remotely sensed time series of the fraction of intercepted photosynthetically active radiation (fIPAR) to optimize the yield estimation by the partial forcing and complete forcing methods.They confirmed that MOSICAS was forced with fIPAR values from SPOT images, which were used to improve the accuracy of the model for the yield estimation.
Previous studies have typically used one agronomic variable (e.g., LAI) as a state variable for combining remote sensing and crop growth models and for yield estimation [8,13,14,16,17].In such studies, the assimilated variable of the model had a reliable accuracy, but other agronomic state variables did not [18].Thorp et al. [19] linked the PROSPECT + SAIL (PROSAIL) radiative transfer model and Decision Support System for Agrotechnology Transfer (DSSAT) crop growth models (DSSAT-PROSAIL) with LAI and plant nitrogen (N) content to estimate the LAI, canopy biomass, plant N content, and yield of durum wheat and demonstrated that inverting the DSSAT-PROSAIL model offered better estimates of crop biophysical properties.Ma [20] selected the Soil-water-atmosphere-plant environment (SWAP) model and the Moderate Resolution Imaging Spectroradiometer (MODIS) products to assimilate evapotranspiration and LAI and suggested that this could achieve a better yield estimation of winter wheat than the use of just one variable.Wang et al. [18] confirmed that the use of LAI together with leaf N accumulation as assimilation variables resulted in a better estimation of winter wheat yield than using each variable alone for model parameter initialization.
According to previous studies, the use of two assimilation variables offers the potential for enhanced agro-ecosystem modeling.However, few studies have used two remotely derived assimilation variables, and especially the DSSAT-CERES crop model for quality estimation has received little attention.Therefore, the objectives of this work were (1) to select the best spectral indices for estimating LAI and canopy N accumulation (CNA) (2) to assimilate LAI and CNA derived from spectral indices into the DSSAT-CERES model using the data assimilation method for obtaining more accurate LAI and CNA simulations and (3) to calibrate the optimal initial values and model parameters for both grain yield and protein content (GPC, an important indicator of grain quality) prediction of winter wheat.

Description of the Study Site
The field experiment was conducted during the 2009/2010 and 2012/2013 growing seasons of winter wheat at the Xiaotangshan experimental site (40.17°N,116.43°E) in Beijing, China.This area is representative of the overall soil and crop management practices in this region.The soil is fine-loamy, with a nitrate N (NO3-N) content of 3.16-14.82mg•kg −1 , an ammonium N (NH3-N) content of 10.20-12.32mg•kg −1 , an Olsen phosphorus content of 3.14-21.18mg•kg −1 , an exchangeable potassium content of 86.83-120.62 mg•kg −1 , and an organic matter content of 15.84-20.24g•kg −1 in the 0-30 cm layer.This area has a typical warm, temperate, semi-humid continental monsoon climate, with hot rainy summers, cold dry winters, and short springs and autumns.Throughout all seasons, the temperature fluctuated daily with significant differences between night and day.
Three experiments were conducted with different wheat cultivars, N application rates, and sowing dates over multiple years (Table 1).Each plot area was 100 m 2 .Experiment 1 was designed in 2009/2010 with a completely randomized design of three wheat cultivars and three sowing dates.Experiment 2 was conducted in 2012/2013 with a completely randomized design of four wheat cultivars and four N application rates.Experiment 3 was a randomized complete block design with two replications of four N application rates.Plot management followed the local standard practices (weed control, pest management and fertilizer application) for wheat production in this region.Note: Exp. is the abbreviation of experiment.There were three winter wheat cultivars and each had three planting dates in 2009.

Fundamental Data Set
The fundamental data set includes meteorological data, soil data, and management data.The meteorological data contain the precipitation and maximum and minimum air temperature, which were acquired from the China Meteorological Data Sharing Service System [21].The solar radiation was calculated with the Angstrom formula [22], using the hours of sunshine recorded by the CMDSSS.
The soil data for each soil horizon, including soil texture, permanent wilting point, field capacity, volumetric water content at saturation, soil organic carbon, inorganic N, PH, and bulk density, were measured through ground investigations and instrumental measurements [23].Crop management data (e.g., seeding, irrigation, and fertilization) were recorded while carrying out the field experiments.

Canopy Hyperspectral Reflectance Data
Spectral measurements were conducted at the following stages of winter wheat (number is the growth stage based on Zadoks' code system [24]): stem elongation (31), booting (47), anthesis (65) and milk development (75) (Table 2).The canopy spectral reflectance data were measured with an ASD FieldSpec Pro FR spectroradiometer (Analytical Spectral Devices, Boulder, Co, USA) with a spectral range of 350-2500 nm and a view angle of 25°.To ensure that the same instrument parameters were used at different growth stages, the instrument was held at a height of 1.0 m above the canopy, under clear sky conditions, between 10:00 a.m. and 2:00 p.m. Beijing time.Vegetation radiance measurement was taken by averaging 20 canopy spectral reflectance curves at an optimized integration time, with a dark current correction for each spectrometry measurement.A white standard panel coated with BaSO4 was used for the spectral reflectance calibration before and after these measurements.

Plant Measurement
The aboveground vegetation corresponding to the spectral sampling positions in a plot was collected and was destructively sampled by randomly cutting plants from an area of 0.25 m 2 , which the number of tillers was counted.Then, 20 representative wheat tillers were randomly sampled from the randomly cutting plants.Leaf area and related dry mass of 20 representative wheat tillers was measured, and specific leaf area (SLA) was calculated.The total dry mass of leaves within the sampled area was determined by the representative dry mass of leaves and the number tillers in sample area.Leaf area index (LAI) was then computed by multiplying the dry mass with the SLA [25].The dry plant material of each sample was then ground to pass through a 240-mesh screen and analyzed for total N using a Carlo-Erba NA 1500 dry combustion analyzer (Carlo Erba, Milan, Italy) [26].Canopy N accumulation (CNAm) was calculated as the aboveground biomass multiplied by the plant N concentration.
Grain yield was measured at the harvest of winter wheat.Three replicate 1 × 1 m areas of each plot for each treatment were obtained.Collected grain was dried and weighed on an electronic scale (±0.01 g).The grain protein content (GPC, %) of each plot was analyzed using a FOSS Infratec TM 1241 Grain Analyzer (Tecator, Hoganas, Sweden).

DSSAT-CERES Model Description
The CERES-Wheat model integrated into DSSAT v4.5 was used in this study [5].The model is a software system that integrates the effects of soil, weather, management, and genetics on daily crop growth, and it can be used to simulate crop phenology, total above-ground biomass, grain yield and quality, water, and N balance [27][28][29].
In the DSSAT-CERES model, the phenology of winter wheat was divided into nine growth stages on the basis of temperature, photoperiod, and genetic characteristics [30].LAI is determined by the growth of leaves on both the main stem and tillers.In the model, the area of the leaves on the main stem, which is dependent on a temperature control factor, was first calculated.The number of tillers per plant depends on the thermal time after emergence [31,32].Carbon assimilation was calculated from daily solar radiation, plant population, canopy extinction coefficient, and LAI.The assimilated carbon was then proportionally partitioned into different plant organs at different growth stages [32,33].Canopy N accumulation was simulated based on the crop N demand and available N in the soil.The crop N demand affects plant growth, target N concentrations, and critical N concentration.The N concentration of different plant organs varies with the plant growth stage.The available N uptake from the soil depends on the soil NH4 + and NO3 − concentrations, soil water, and root growth [34,35].The grain dry matter was derived from photosynthesis during the grain filling stage and re-translocation from the pre-stored dry matter, whereas the N accumulation of grain was derived from the direct root uptake during the grain filling stage and re-translocation from the pre-stored N uptake.The growth rate of grain dry matter and N accumulation were functions of environmental factors such as rainfall, temperature, and solar radiation [32,36].

LAI and CNA Estimation from Spectral Indices
The LAI and CNA estimations were based on several existing spectral indices that are considered to be good candidates for evaluating LAI and N status.The selected spectral indices are listed in Table 3.

Data Assimilation Strategy
Kennedy and Eberhart [52] originally proposed particle swarm optimization (PSO) to simulate the graceful but unpredictable choreography of bird flocks [53].In PSO, a particle is expressed a potential solution.Each particle, without quality or size in a d-dimensional search space, possesses its own position and velocity, and fitness value determined by a cost function.Every particle would modify its position and velocity associated with the optimal point in a current generation (pid) and that of all particles in a swarm (pgd).A flowchart for grain yield and GPC estimates made through the PSO assimilation of remote sensing data into the DSSAT-CERES model is illustrated in Figure 1.
In order to evaluate the assimilation performance of the two state variables, three assimilation schemes were conducted: only LAI as a state variable (SVLAI), only CNA as a state variable (SVCNA), and LAI and CNA used together as state variables (SVLAI + CNA).The detailed steps are as follows.
(1) The initial value (position) and velocity of the particles were determined.For SVLAI, four crop genotype parameters (PHINT, LAIS, SLAS and LSPHS) sensitive to LAI and three management parameters (plant density, irrigation amount, and fertilization amount) were adjusted [54] (Table 4); For SVCNA, four crop genotype parameters (P1D, PHINT, RDGS and SLPF) sensitive to CNA and the same three management parameters were adjusted (Table 4).For SVLAI + CNA, all the above crop genotype and management parameters were considered.The velocity in each dimension was set to ~10% of the dynamic range of the variable [53].It is important to point out that the parameters sensitive to CNA were set to default values (Table 4) in the SVLAI method, and vice versa (i.e., the parameters sensitive to LAI were set to default values (Table 4) in the SVCNA method).(2) The DSSAT executable file "dscsm045.exe"under the installation directory, integrated with the required data, was run in Matlab (version 2007, MathWorks, US), and the simulated LAI and CNA were output.(3) The relationships between the spectral indices and LAI or CNA were analyzed, and the best regression model was selected to estimate LAI and CNA, respectively.(4) The cost function was constructed according to the variables simulated by the DSSAT-CERES model and those retrieved by the spectral index.The fitness value from the cost function determined whether the optimization algorithm reached the optimum input parameters.When one state variable was used in an assimilation scheme (SVLAI or SVCNA), the cost function was based on only one variable (i.e., LAI or CNA) (Figure 1).When two state variables were used in an assimilation scheme (SVLAI + CNA), the cost function was based on both LAI and CNA.(5) The program searched for the pid and pgd at each iteration.(6) The positions and velocities of the particles were updated on the basis of pid and pgd.The c1 and c2 values were set as 2, and ξ and η were random values between 0 and 1 [53].(7) If the iteration (100 generations in this study) was not reached, the updated positions were replaced and the second step was conducted.If the iteration was reached, LAI, CNA, yield and GPC were output.

Statistical Analysis
The coefficient of determination (R 2 ) and the root mean square error (RMSE) were used as metrics to quantify the amount of variation explained by the developed relationships, as well as the accuracy of those relationships.Generally, the performance of the model was estimated by comparing differences in the R 2 and RMSE.A higher R 2 and a lower RMSE indicated higher precision and accuracy of the inversion.
Update the position and velocity of particles：

LAI and CNA Estimation from Spectral Indices
Fifteen spectral indices were used to estimate winter wheat LAI and CNA from the field data from Experiments 1 and 2 (n = 100), and Experiment 3 (n = 32) was used to validate the models' accuracy.Linear and nonlinear (logarithm, exponential, and power) functions were used to fit the models, and the best model had the highest R 2 and lowest RMSE.Table 5 shows the relationship between LAI (or CNA) and the spectral indices.
The results show that all the spectral indices were significantly related to LAI (p-value < 0.001).In particular, the six indices (NDVI, MSR, OSAVI, WDRVI, GI and EVI) with R 2 values greater than 0.8 best estimated the LAI of winter wheat, although large differences in their RMSE values (0.627, 0.598, 0.673, 0.651, 0.758 and 0.761, respectively) were revealed.The power relationship between MSR and LAI (R 2 = 0.829, RMSE = 0.598) had the highest performance compared with the statistical relationships of the other indices, and this relationship was used to estimate the LAI of winter wheat in this study.
Similarly, the regressions between CNA and each spectral index (p-value < 0.001) were highly significant.In comparison, OSAVI, CIred-edge, MCARI/MTVI2, MTCI, NDRE and DCNII were more strongly related with CNA than the other indices, indicating they were very sensitive to changes in CNA (R 2 = 0.701, 0.745, 0.762, 0.742, 0.794 and 0.733, respectively); the corresponding RMSE values used for validating the model accuracy were 39.97 kg N•ha −1 , 43.04 kg N•ha −1 , 42.73 kg N•ha −1 , 44.03 kg N•ha −1 , 37.75 kg N•ha −1 and 52.43 kg N•ha −1 , respectively.NDRE had the highest R 2 and the lowest RMSE, and the results showed that the predicted CNA was very consistent with the measured CNA (Table 5).Therefore, NDRE was selected to estimate the CNA of winter wheat in this study.

Discussion
Spectral indices from hyperspectral data were used to estimate the LAI and CNA of winter wheat, and these spectral indices had highly significantly relationships with LAI and CNA (Table 5).These spectral indices were constructed using red edge (670-780 nm) and shortwave infrared (NIR, 800-1100 nm) wavelengths and they contained useful information about the population size and nutritional status of wheat [41,55,56].The regression between MSR and LAI (R 2 = 0.829 and RMSE = 0.598) was better than those of the other indices and LAI.The MSR was expressed as a function of the simple ratio of NIR to red reflectance, and the simple ratio greatly reduced undesirable noise caused by simultaneous increases or decreases in red and NIR reflectance.The MSR also had the advantage of being less sensitive to canopy optical and geometrical properties [38].Moreover, the MSR may have improved linearity and mitigated the saturation effect when LAI increased [57].NDRE had the strongest relationship with CNA (R 2 = 0.794 and RMSE = 37.75 kg N•ha −1 ), and this index was measured with wavebands centered at 720 and 790 nm, which were more sensitive to estimating canopy N per unit area than canopy N concentration [48].
LAI and CNA were individually used as state variables for integrating remote sensing data into the DSSAT-CERES model with the PSO data assimilation method.The results show that each method (SVCNA and SVLAI) estimated LAI or CNA with great accuracy, in agreement with previous research [5], [7], [8], [16] and [17].However, when only one assimilation variable was used, large errors existed between the simulated and measured values of the other variables; for example, the RMSE of LAI determined with the SVCNA method was 1.207, which was much larger than that determined with the SVLAI method (RMSE = 0.527).Likewise, the error between the simulated and measured values of CNA determined with the SVLAI method was larger than that determined with the SVCNA method.This problem was due to unwanted and inaccurate simulations because only one state variable (i.e.LAI or CNA) was accurately simulated.
In this study, the join usage of LAI and CNA as state variables for assimilating remote sensing information into the DSSAT-CERES model.The simulation with the combined variables was robust, with R 2 and RMSE values of 0.828 and 0.494 for LAI and 0.808 and 30.26 kg N•ha −1 for CNA.The accuracy of the state variable simulations made using two assimilation variables was fairly consistent or even better than those using one variable (LAI or CNA).The primary reason for this is that LAI is a key variable for crop growth monitoring and yield prediction [58], and CNA is an important indicator of the N status of wheat and significantly affects photosynthetic production and grain yield and quality [59].Various crop state variables are independent of each other, though they interact with each other [18,60].Therefore, the SVLAI + CNA method obtained a greater robustness of the DSSAT-CERES model than the single variable methods.However, this study only investigated LAI (a crop population indicator) and CNA (a nutritional status indicator), and the combination of these two state variables may not have been optimal, as the matching of the pattern of interaction between these state variables in a natural system may not yet be fit by the interaction pattern between them within the simulated system.In addition, the cost function was constructed by simple addition (Figure 1), and it did not consider the different deviations between simulated and measured values.Therefore, future studies should pay attention to two points: (1) how to select two optimal state variables that can achieve robust simulations for most model outputs; and (2) how to determine the cost function when two state variables are considered.
According to the results of the grain yield and GPC estimations, the inverted SVCNA method estimated grain yield and GPC better than the SVLAI method (Figure 3a-d).This is because CNA is determined by actual plant N concentration and the corresponding canopy biomass of each plant organ [32,35], thus, nutrient status and plant biomass are simultaneously considered in crop growth.The joint use of LAI and CNA as state variables also obtained good estimations of grain yield and GPC, and the RMSE values between the simulated and measured grain yield and GPC were the lowest.Grain yield is derived from the partition of fixed carbon, which is directly related with LAI [33].Similarly, grain N accumulation is derived from the re-translocation of CNA, and GPC is simply calculated as the N concentration in grain multiplied by a factor [32,36].The assimilation of two state variables derived from a remote sensor mutually promoted consistent results between the simulated and measured values of grain yield and GPC.
Integrating remote sensing data with the DSSAT-CERES model for grain yield and GPC estimation could feasibly be conducted with different sowing dates, cultivars, and N management strategies across different years in this study.Further efforts are required to couple satellite data with the DSSAT-CERES model for estimating grain yield and GPC at a regional scale.

Conclusions
The joint integration of remotely sensed LAI and CNA as state variables into the DSSAT-CERES model using the PSO algorithm was tested in this study.The results suggested that LAI and CNA were accurately estimated with spectral indices, and MSR and NDRE were the best indices for estimating LAI and CNA, respectively.The method of jointly integrating LAI and CNA as state variables was more robust than single variable integration.A good accuracy of winter wheat grain yield and GPC estimations was demonstrated using this kind of data assimilation method.The results demonstrated integrating remote sensing with the DSSAT-CERES model is a potential approach for estimating grain yield and especially GPC.

Figure 1 .
Figure 1.Flowchart of the particle swarm optimization (PSO) assimilation scheme for integrating remote sensing data with the Decision Support System for Agrotechnology Transfer-Crop Environment Resource Synthesis (DSSAT-CERES) model.

Figure 3 .
Figure 3. Relationships between measured and simulated values of (a) yield with leaf area index as a state variable (SVLAI), (b) grain protein content (GPC) with SVLAI, (c) yield with canopy N accumulation as a state variable (SVCNA), (d) GPC with SVCNA, (e) yield with SVLAI + CNA, and (f) GPC with SVLAI + CNA.

Table 1 .
Summary of treatments for the three experiments.

Table 2 .
List of data acquisition in the three wheat experiments.
Note: CNAm and GPC represent measured canopy N accumulation and grain protein content, respectively.-represents not-measured data in this growth stage.Number in each agronomy variable represents the number of samples.

Table 4 .
Initial values and ranges of calibration parameters or initial data of the DSSAT-CERES.
Note:The initial data values were determined with management data, and the genotype parameters were set based on default values in DSSAT-CERES model.