Assimilation of Sentinel-2 Leaf Area Index Data into a Physically-Based Crop Growth Model for Yield Estimation

Remote sensing data, crop growth models, and optimization routines constitute a toolset that can be used together to map crop yield over large areas when access to field data is limited. In this study, Leaf Area Index (LAI) data from the Copernicus Sentinel-2 satellite were combined with the Environmental Policy Integrated Climate (EPIC) model to estimate crop yield using a re-calibration data assimilation approach. The experiment was implemented for a winter wheat crop during two growing seasons (2016 and 2017) under four different fertilization management strategies. A number of field measurements were conducted spanning from LAI to biomass and crop yields. LAI showed a good correlation between the Sentinel-2 estimates and the ground measurements using non-destructive method. A correlating fit between satellite LAI curves and EPIC modelled LAI curves was also observed. The assimilation of LAI in EPIC provided an improvement in yield estimation in both years even though in 2017 strong underestimations were observed. The diverging results obtained in the two years indicated that the assimilation framework has to be tested under different environmental conditions before being applied on a larger scale with limited field data.


Introduction
Agriculture is facing great challenges to increase food production while decreasing global impact [1].In this context, better crop management strategies can help to guarantee sustainable food production and security [2].It is of paramount importance to set up accurate and timely monitoring tools that ensure spatially detailed and accurate information on crop yields and on the required production inputs (e.g., water, fertilizer).Remotely-sensed data can provide spatially and temporally consistent information over large agricultural areas [3] to monitor the seasonal patterns related to the biological lifecycle of the crop [4].Through the use of some variables estimated by remotely-sensed data, it is possible to obtain crop yield predictions.Thus, different estimation approaches are described, based on remote sensing data.Statistical regression methods are the most commonly adopted for quantifying the expected yield [5].For instance, the Normalized Difference Vegetation Index (NDVI) is commonly used as an independent variable in regression models to estimate the crop yield.The main limitations of this technique is the need for sufficiently long and consistent time series of both remote sensing data and agricultural statistics to calibrate the empirical models [6].In order to improve the predictive power of remotely-sensed data, it is possible to add independent meteorological (or bio-climatic) variables into the regression models as shown in several studies [7,8].The interaction of these two different sources of data provides different types of information even if the bio-climatic Agronomy 2019, 9, 255 2 of 18 indicators, especially if derived from satellites as well, are not totally independent from the vegetation indices [6].An alternative approach to obtain crop yield is to consider the crop physiological processes using semi-empirical approaches such as the Monteith's efficiency model [9].The latter computes the daily production of dry matter (DM) integrating incident global solar radiation, usually available from the meteorological stations, and the diagnostic estimation of the status and the activity of the plant provided by remote sensing data [10].The approach uses only the amount of photosynthetically active solar radiation (PAR) adsorbed by the plant, although it is known that the crop yield depends also on the interaction of other variables, such as fertilizer, soil, cultivar diversity, weather conditions, water availability, pests, and diseases.
A more sophisticated approach to account for the interaction of environmental factors and field management strategies is based on dynamic crop growth models (CGMs) [11].The main advantage of CGMs is their capacity to take into account such limiting factors (e.g., soil, weather, water, nitrogen) dynamically [12].CGMs are able to describe the behavior of the real crop by predicting the physiological mechanisms of the crop growth (e.g., phenological development, photosynthesis, dry matter, portioning, and organogenesis) using mechanistic equations [13].Their applicability is, however, complex at field or regional scales because they are designed to describe the plant growth at the point scale, assuming that the field conditions are uniform [14].In this context, the use of satellite data to provide spatially integrated information over large areas is widely recognized to support CGMs [11,12,15,16].The crop phenology, which can be assimilated into the crop growth models, becomes fundamental to control the crop dry matter distribution during the growth process inside the model itself [17].
The integration of remote sensing data and CGM can be achieved in different ways [18,19]: (1) Forcing strategy, where the remote sensing data are used to replace the crop model simulation data.For each time step of the model simulation (e.g., daily), direct ingestion of the variables into the model is done.It requires continuous data and it is subject to frequent instabilities in the overall model setting.
(2) Updating strategy, which consists of continuously updating the model state variables whenever new observations from the satellite are available.This method assumes that better-simulated state variables at day t will also improve the accuracy of the simulated state variables on succeeding days.
(3) Calibration method, which considers adjusting either the model or initial state variables to obtain an optimal agreement between simulated and observed state variables.Key model's state variables are changed according to the driving variable obtained from the satellite.Generally, a minimization algorithm is used to minimize the difference between the model state variable and the driving variable obtained from the satellite [20,21].
Various implementations exist in the literature.For example, the commonly used EPIC model (Environmental Policy Integrated Climate) [14] shows good accuracy in the estimation of yield by assimilating variables derived from remote sensing data with a calibration approach [22].The model is suitable for simulating crop yield over a large range of soils, crops, and management scenarios [23].Satellite-based Leaf Area Index (LAI) is one of the most commonly used variables in model assimilation for yield estimation [21,24,25].LAI is one of the key crop state biophysical variables required by many CGMs and is able to describe the relationship between soil, plant, and atmosphere system [26].LAI can be obtained using freely available data from operational satellites such the USGS Landsat-8 or the Copernicus Sentinel-2 missions.The latter is composed of two identical satellites (Sentinel-2 A and B) providing 5 days of revisit time [27] and this increases the possibility to have cloud-free observations in key phenological periods during the crop growing cycle.Sentinel-2 mission provides improved possibilities for retrieving biophysical parameters such as LAI, thanks to the higher spectral, spatial, and temporal resolutions [28].Recent studies show good correlation between LAI ground measurements and LAI estimates starting from Sentinel-2 reflectance data and using machine learning approaches [29,30].Pre-calculated LAI maps with an accuracy of around 12% at 10 m spatial resolution can be obtained from the service platform for Sentinel-2 implemented by the University of Natural Resources and Life Science, BOKU [29].Building on the BOKU Sentinel-2 LAI data, in this study we apply a data assimilation framework using the EPIC model over a winter wheat cereal crop for two growing seasons (2016 and 2017).A re-calibration of the model parameters was done using a minimization function in order to find the minimum difference between the remote sensing LAI data and the simulated LAI data.The main goal is the evaluation of (1) the data assimilation approach and (2) of the capability of the model to estimate crop yield for possible future application at regional scale.For this reason, the yields estimated by EPIC were validated using measured yield data.The validation of the LAI values obtained from Sentinel-2 was also done through the comparison with LAI ground measurements collected during both the growing seasons.

Study Area
The winter wheat experimental fields used in this study are located in the Marchfeld region, an agricultural area in the east of Vienna in Lower Austria (Lat.48.20 • N, Long.16.72 • E) (Figure 1).This area was designated as a pilot area for the H2020 FATIMA (Farming Tools for external nutrient Inputs and Water Management) project [31] and field data were collected in this framework.The Marchfeld region covers about 75,000 ha and the most important crop types are: winter wheat, vegetables (e.g., carrots, onions, green peas, asparagus, spinach, green salad), sugar beet, grain maize, and potatoes [31].The climate is semi-arid with cold winters characterized by frequently strong frosts and limited snow cover, and hot and intermittently dry summers [32].The average annual temperature is between 9 and 10 • C and the annual precipitation ranges from 500-550 mm with a maximum in early summer, from May to July.During the vegetation period from April to September, the precipitation is between 200 and 440 mm [32].The presence of wind, blowing at an average speed of 3.5 m/s, encourages the dry conditions in the summer and, additionally, leads to high evapotranspirative demand.Different types of soils with high spatial variability can be found [33] and the general soil conditions are characterized by a humus-rich A horizon and a sandy C horizon [34].

Leaf Area Index Measurements
Ground measurements of LAI were undertaken once a week during both years alternatively one week in the experimental plots and the following week in other randomly selected fields in the region.The 2016 field data collection began in the middle of April and finished in the middle of May,

Experimental Plots
Field measurements were conducted in 2016 and 2017 in two experimental fields of winter wheat (cultivar Capo) with total field area of 12 hectares (ha), divided in 12 plots, each of 1 ha (100 m × 100 m).Each plot received different fertilization rates with 3 repetitions of four different levels of fertilization (variant) in a range between 0 and 180 kg N/ha (0, 60, 120, 180 kg N/ha) (Figure 1).
The soil characteristics were measured before, during and after the crop growing cycle for each of the 12 plots in both years, the information was provided by the Austrian Agency for Health and Food Safety (AGES) together with the soil management practice (including date of sowing and harvest).The soil texture in the first 30 cm is silty-clay-loam or loam with small difference amongst the plots.For the experimental plots in the year 2016, the range of sand particles was between 10.9% and 33.7% with a median value of 19.65%, while presence of silt was between 40.5% and 57.4% with a median value of 48.65%.In 2017, the presence of sand was between 15.4% and 28.9% with a median value of 22.8%, while the presence of silt was between 43.9% and 50.9% with a median value of 46.3%.In both the experimental fields, the soil was slightly alkaline, indeed the pH in the first 30 cm was around 7.5-7.6[35].The before seeding management was characterized by a disk harrow soil processing at twelve centimeters depth and a seedbed preparation at 5 cm in depth.After sowing, the mineral fertilization was applied using calcium ammonium nitrate (N = 27%) at three times (March, April, and May).The same procedure was applied in both years.At harvest, the final yield of dry matter in kg/ha was measured and provided by AGES.

Leaf Area Index Measurements
Ground measurements of LAI were undertaken once a week during both years alternatively one week in the experimental plots and the following week in other randomly selected fields in the region.The 2016 field data collection began in the middle of April and finished in the middle of May, while for 2017 the data collection started at the end of April and ended in the middle of June.
In situ LAI measurements were collected with a non-destructive and indirect method using a Li-Cor LAI-2200C Plant Canopy Analyzer (Li-Cor Biosciences, Lincoln, NB, USA) [36].Li-Cor Biosciences is a privately held company based in Lincoln, Nebraska that was founded in 1971 and it is a leader in the design, production, and marketing instruments for biological and environmental research [37].The LAI-2200C has been largely used for LAI measurement; it estimates LAI from the values of canopy transmittance by identifying the attenuation of the radiation as it passes through the canopy [36].Therefore, measurements were taken above-(A) and below-canopy (B).The measures were done in elementary sampling units (ESU) within a radius of about 10 m of a geo-referenced point (accuracy ± 3-5 m using GPS signals); each ESU represents a homogenous area (in terms of crop development conditions).In the experimental fields, the ESUs were located in the center of each of the 12 plots.In the random fields that were additionally sampled, the centers of the ESU were placed in the corner of a square area of about 60 × 60 m, measured from an accessible border of the field.This ensures a representative sampling for the whole field and makes sure that the point also represents the Sentinel-2 pixel at 10 m and 20 m ground sampling distances.In both cases, in order to obtain a representative LAI sample within a radius of about 10 m, each ESU was ideally divided into four sectors; for each sector one measurement was done above the canopy and six below it.The latter were performed at random in each of the four sectors to well represent the ESU.A total of 24 readings were recorded for each ESU and were used to calculate the average representative LAI value used for comparison with the satellite data (Figure 2).
The LAI sampling was always carried out early in the morning or later in the afternoon under diffuse light conditions.Previous studies showed that the best results for LAI were measured in diffuse light conditions to avoid direct sunlight and, therefore, minimizing the underestimation of LAI [38,39].

Leaf Area Index
Satellite-based estimates of LAI were obtained using an artificial neural network (ANN) developed at INRA.An ANN consist of a large number of simple processors, called "neurons", linked by weighted connections where each output depends only on the information that is locally available [40].The ANN is a "black box" approach which has great capacity in predictive modelling but all the characters describing the unknown situation must be presented to the trained ANN and the identification (prediction) is then given [41].The network was trained using PROSPECT and SAIL radiative transfer models and it uses eight Sentinel-2 spectral bands (B3, B4, B5, B6, B7, B8A, B11, and B12) [42].On the basis of this approach, LAI was calculated and available as LAI maps at 10 m resolution at the BOKU data service platform for Sentinel-2 [29].The LAI image data used for the 2016 and 2017 are listed in Table 1.In both years the maximum timespan between satellite acquisitions and ground measurements of LAI was five days.To guarantee that the ground reference points were not covered by clouds or shadows in the image data, the scene classification layer (SCL) was used [43].For the validation of the LAI retrieval, LAI values were extracted for the central points of each LAI ESU measurements with a buffer of 10 m (an equal expansion in each direction from our GPS point) and used for the comparison with the field data as a mean value inside of the buffer for each ESU.This was done to compensate for the location of the center points, as some did not cover exactly one pixel or were located on a pixel border.

Satellite Data
Leaf Area Index Satellite-based estimates of LAI were obtained using an artificial neural network (ANN) developed at INRA.An ANN consist of a large number of simple processors, called "neurons", linked by weighted connections where each output depends only on the information that is locally available [40].The ANN is a "black box" approach which has great capacity in predictive modelling but all the characters describing the unknown situation must be presented to the trained ANN and the identification (prediction) is then given [41].The network was trained using PROSPECT and SAIL radiative transfer models and it uses eight Sentinel-2 spectral bands (B3, B4, B5, B6, B7, B8A, B11, and B12) [42].On the basis of this approach, LAI was calculated and available as LAI maps at 10 m resolution at the BOKU data service platform for Sentinel-2 [29].The LAI image data used for the 2016 and 2017 are listed in Table 1.In both years the maximum timespan between satellite acquisitions and ground measurements of LAI was five days.To guarantee that the ground reference points were not covered by clouds or shadows in the image data, the scene classification layer (SCL) was used [43].For the validation of the LAI retrieval, LAI values were extracted for the central points of each LAI ESU measurements with a buffer of 10 m (an equal expansion in each direction from our GPS point) and used for the comparison with the field data as a mean value inside of the buffer for each ESU.This was done to compensate for the location of the center points, as some did not cover exactly one pixel or were located on a pixel border.For the assimilation of LAI in the EPIC model, LAI values were extracted from the image data using a crop mask of the fields of interest (divided in different plots).The latter involved restricting the analysis to a subset of pixels, rather than using all the pixel of the image [4].For each polygon representing each plot, considered as an independent field, an inner buffer of 20 m (equal reduction in every direction of the space of our spatial analysis) was created to extract the median value of LAI.The inner buffer is useful in this case to exclude possible outliers in the boundaries of the field.This operation was carried out for each LAI image where the fields of interest were cloud-free and available in the Sentinel-2 data service platform.The date of acquisition and cloud cover of the image data are listed in Table 2. 2.5.EPIC Input and Forcing

EPIC Model
EPIC is a physically-based crop growth model that operates on a daily time-step.One of the most important processes simulated by EPIC is the crop growth, and a wide range of crop types can be simulated using a generic growth routine.The crop growth process includes the interception of solar radiation, conversion of intercepted light to the biomass, partition of biomass into roots, above ground biomass and economic yield, and also simulated root growth [44].The annual crop growth from planting to harvest depends on the potential heat unit for the crop.The growing period starts when the average daily temperature exceeds the base temperature of the plant and the phenological development of the crop is based on the daily heat unit accumulation (growing degree day approach) [44].The heat units are used to calculate the heat unit index (ranging between 0 and 1) as a fraction of the potential heat units (PHU) required for the maturation of the crop under study [45].
The potential daily increase of the biomass is estimated using Monteith's approach [9] as a function of intercepted photosynthetic active radiation.The interception of solar radiation depends on the LAI [23], which is described by five model parameters: DMLA, DLAP1, DLAP2, DLAI, and RLAD.The model LAI curve is built considering two simulation stages: from emergence to leaf decline, LAI depends on three input factors DLAP1 and DLAP2, which are two points used for the definition of the optimal leaf development curve and DMLA, namely the maximum potential LAI.The second simulation stage starts from the leaf decline (which depends on DLAI parameter, the fraction of the growing season when leaf area starts to decline) to the end of the growing season.The rate of the LAI is influenced by LAI decline parameter (RLAD).Both simulation stages are also influenced by the heat units and, theoretically, by the estimated stresses caused by water, nutrients, temperature, aeration, and radiation.The potential biomass is also adjusted daily, taking into consideration the minimum of the EPIC plant growth constraints (nutrients, water, temperature, aeration, and radiation) [23].The crop yield is estimated using the harvest index (HI) approach as a nonlinear function of heat units from zero (at the planting stage) to the optimal value (at maturity) [44].The HI could be altered by high temperature, low solar radiation, or water stress during critical crop stages [23].In particular, under water stress conditions, the default parameterization of the model would adjust the HI from the second half of the growing season considering a lower limit of HI.The latter is called water stress yield factor (WSYF) [46], a fraction between 0 and the HI value that represents the lowest HI expected due to the water stress [45].The water stress effect on HI is controlled by PARM (3), namely the fraction of the growing season where water stress starts reducing HI, which also controls the accumulation of actual and potential evapotranspiration until harvest.At harvest, the ratio between potential and actual evapotranspiration (ET) is used with an S-shaped curve to estimate the final HI [45].

EPIC Input Data
EPIC is designed to simulate field, farm or small watershed, homogenous with respect to climate, soil, land use, and topographic management system parameters [45].In this study each plot was considered as an independent field.The model enables the preparation of different parameters, operations, and run files to execute the model.The soil characteristics, such as soil hydrologic group, number of the soil layers and layer depth, soil texture, pH, organic carbon concentration, calcium carbonate content of soil, the sum of bases, cation exchange capacity, and bulk density were used to build the soil parameters file.In the latter file, three soil layers were considered with 30 cm of layer's depth.The soil hydraulic parameters were also calculated by default empirical equations provided by EPIC.These equations give as a result the wilting point, field capacity, saturated soil water content, and saturated hydraulic conductivity based on measured values of bulk density, sand content, and silt content [47].Site files were prepared for every plot while the operation schedule files were prepared for each variant.The climate data (solar radiation, maximum and minimum temperature, precipitation, relative humidity, and wind velocity) were taken from the ZAMG (Zentralanstalt für Meteorologie und Geodynamik) weather station of Gross-Enzersdorf (48 • 19 N 16 • 55 E) located 30 km away from the experimental fields.Weather files were built using the monthly weather statistic generator WXPM p3020 available from EPIC model website [48], using historical data from 1997.

Model Calibration
Four plots (Figure 1; site 1,2,3,4) with different fertilization variants from the 2016 dataset were used to calibrate two unknown parameters of the EPIC model namely the potential heat unit (PHU) and the potential evapotranspiration (PET), which was estimated using the Penman-Monteith approach.For normal operation, knowing the date of sowing and harvest, it is possible to check the output file of the model HUSC index.If this index is higher than 1.2 or smaller than 1, the PHU must be adjusted in order to obtain a value around 1. In this way it is possible to arrive to the right value of PHU from planting to maturity.The other unknown parameters inside of the model, except those involved in the temporal definition of the LAI curve, were left as the default setting.

Data Assimilation
The temporal evolution of the optimal LAI curve is defined by the five parameters described in Section 2.5.1.First we identify the DMLA (maximum potential LAI) as the maximum value (98th percentile) observed in the image data during the full growing season for each field.Subsequently, the re-calibration of the other parameters (DLAP1, DLAP2, DLAI, and RLAD) that define the LAI curve in EPIC was performed on the basis of the minimization of the root mean square error (RMSE) (Equation ( 1)) between the LAI simulated by the model and the LAI observed by satellite at the day of the satellite observation.Default values were used for the parameters that were not measured or not available for our specific experiment.
The minimization of the cost function (Equation ( 1)) was done using a MATLAB optimization toolbox called Fmincon that allows to find the minimum of a constrained nonlinear multivariable function.It uses the interior-point approach for solving nonlinearity constrained optimization problems (Equation ( 2)) where the objective is to find the local minimum of the function x subject to constraints [49]: Starting from the original problem definition in Equation ( 2), the crucial steps of the optimization algorithm are the formulation and solution of the equality constrained barrier sub-problems, with i steps [50] (Equation ( 3)).Each sub-problem is called barrier sub-problem and could be approximated at the form (for each µ > 0): where µ is the barrier parameter and s i is the slack variables that are restricted to be positive by adding ln (s i ) bounded [50].As µ decrease close to zero, the minimum of f µ should approach the minimum of f.The added logarithmic term is called a barrier function [50] and this is easier to solve than the original inequality-constrained problem (Equation ( 2)).For the optimization, we identified the bound constraints (minimum and maximum values for DLAP1, DLAP2, DLAI, and RLAD) by studying the model LAI outputs in correspondence of the four calibration plots (Section 2.5.3, Figure 1; site 1, 2, 3, 4).

Accuracy Assessment
The model performance was estimated using the root mean square error (RMSE), the coefficient of determination (R 2 ) and relative root mean square error (RRMSE).The R 2 (Equation ( 4)) defines the proportion of the dependent variable variance that is predictable by the independent one [51].It represents a normalized measure between 0 and 1 and it is given by the ratio between the "error sum of the square", that quantifies how much the data points Y i vary around the estimated regression line Ŷi , and the "total sum of square", that quantifies how much the data points Y i vary around their mean, Y.In environmental research, a good value of R 2 is above 0.6 [52] even if it depends on the type of estimated variable.For instance, if we consider the range of LAI values, R 2 value larger than 0.9 is considered as excellent performance and between 0.5 and 0.8 a good one [53].The RMSE is a performance indicator of a model and gives a quantity estimates of the deviation of the model's output from measurements [54].The RMSE produces results in the same units as that used for measurements [55] and it is calculated as the root square of the mean of the difference between predicted values (V p ) and observed values (V obs ).If the RMSE is 0, there is no difference between predicted and measured values and the perfect fit is achieved.The relative RMSE (Equation ( 5)) is dimensionless and it is obtained by dividing the RMSE by the mean of the observed variables and thus is less sensitive to the magnitude of the variables [53].The desired result of the RRMSE for an excellent performance of the model is less than 10% and, for a good one, is between 10% and 20% [53]: Agronomy 2019, 9, 255

LAI Validation
Good agreement and low error between Sentinel-2 LAI for the winter wheat and LAI reference measurements was found in both years (Figure 3).Better RMSE and RRMSE values were obtained in 2017 compared to 2016 (RMSE = 0.46 vs. 0.44) (RRMSE = 19% vs. 17%).In the 2016 year, a slightly lower R 2 value was found compared to 2017 (R 2 = 0.72 vs 0.89).Although the comparison between ground and satellite LAI estimations showed good results in both years, there are some cases in which we can find a small disagreement.In 2016, a general small overestimation by the LAI-2200 compared to satellite observations was observed.In 2017, there are three cases for which Sentinel-2 overestimated LAI remarkably.

LAI Validation
Good agreement and low error between Sentinel-2 LAI for the winter wheat and LAI reference measurements was found in both years (Figure 3).Better RMSE and RRMSE values were obtained in 2017 compared to 2016 (RMSE = 0.46 vs. 0.44) (RRMSE = 19% vs. 17%).In the 2016 year, a slightly lower R 2 value was found compared to 2017 (R 2 = 0.72 vs 0.89).Although the comparison between ground and satellite LAI estimations showed good results in both years, there are some cases in which we can find a small disagreement.In 2016, a general small overestimation by the LAI-2200 compared to satellite observations was observed.In 2017, there are three cases for which Sentinel-2 overestimated LAI remarkably.

EPIC Calibration and Assessment
Figure 4 shows the EPIC-LAI curves achieved after the assimilation of satellite data and the LAI curves built using the Sentinel-2 observations for the calibration plots in 2016.In this case, the shape of the LAI curves is similar and the only use of the bound constraints on the parameters to optimize seems sufficient to ensure a good fit between the simulated (EPIC) and the observed (Sentinel-2) LAI curves.However, Figure 4D (180 kg N/ha) shows a temporal delay in the crop development trend simulated by EPIC compared to the satellite observations.

EPIC Calibration and Assessment
Figure 4 shows the EPIC-LAI curves achieved after the assimilation of satellite data and the LAI curves built using the Sentinel-2 observations for the calibration plots in 2016.In this case, the shape of the LAI curves is similar and the only use of the bound constraints on the parameters to optimize seems sufficient to ensure a good fit between the simulated (EPIC) and the observed (Sentinel-2) LAI curves.However, Figure 4D (180 kg N/ha) shows a temporal delay in the crop development trend simulated by EPIC compared to the satellite observations.This temporal displacement between the two LAI curves is larger in 2017 compared to 2016 as shown in Figure 5 for the highest fertilization treatment plots (Figure 5, C-D ), for which an incorrect parameterization appears clear especially for DLAP1, DLAP2, and DLAI.Probably, only use of the bound constraints is not always sufficient to reach a good calibration.It could also be that the model is not able to perfectly represent the satellite curve in all conditions.This temporal displacement between the two LAI curves is larger in 2017 compared to 2016 as shown in Figure 5 for the highest fertilization treatment plots (Figure 5C,D), for which an incorrect parameterization appears clear especially for DLAP1, DLAP2, and DLAI.Probably, only use of the bound constraints is not always sufficient to reach a good calibration.It could also be that the model is not able to perfectly represent the satellite curve in all conditions.

Yield Estimation in 2016
Figure 6A shows a relevant good agreement (R 2 = 0.95) and low error (RMSE = 317 kg/ha and RRMSE = 6%) between the yield estimated by the EPIC model (with assimilation of LAI data from Sentinel-2) and the observed yield for the year 2016.The yield estimation notably improves with LAI assimilation compared to the one obtained using the default EPIC parameterization as shown in Figure 6B (R 2 = 0.93, RMSE = 572 kg/ha and RRMSE = 11%).Using the latter calibration, a model overestimation was found in the lower range of yield values (0 and 60 kg/ha of nitrogen during the growing season).Figure 6B also shows that the high yield values are approximately the same in both the plot with 120 and 180 kg N/ha, while using LAI assimilation from satellite, the yield values are better distributed along the regression line.Thus, the assimilation of LAI data seems to improve the sensitivity to the different amount of nitrogen provided during the growing season.

Yield Estimation in 2016
Figure 6A shows a relevant good agreement (R 2 = 0.95) and low error (RMSE = 317 kg/ha and RRMSE = 6%) between the yield estimated by the EPIC model (with assimilation of LAI data from Sentinel-2) and the observed yield for the year 2016.The yield estimation notably improves with LAI assimilation compared to the one obtained using the default EPIC parameterization as shown in Figure 6B (R 2 = 0.93, RMSE = 572 kg/ha and RRMSE = 11%).Using the latter calibration, a model overestimation was found in the lower range of yield values (0 and 60 kg/ha of nitrogen during the growing season).Figure 6B also shows that the high yield values are approximately the same in both the plot with 120 and 180 kg N/ha, while using LAI assimilation from satellite, the yield values are better distributed along the regression line.Thus, the assimilation of LAI data seems to improve the sensitivity to the different amount of nitrogen provided during the growing season.However, we encountered a major flaw for the assimilation of LAI data in EPIC under nitrogen stress conditions.The model lacks a correction feedback on simulated LAI.Only simulated biomass is corrected for the stress induced by limited nitrogen application.The EPIC-LAI curve is influenced mainly by the temperature and not by the variable amount of N.
Figure 7 shows the simulated LAI and biomass curves obtained using default parameters of the model for the different nitrogen fertilization applications.It appears clear that the LAI curve does not change even if the fertilization management changes.Therefore, the EPIC-LAI curve is referred to the optimal development of the LAI, without considering limiting stress conditions for the crop.However, we encountered a major flaw for the assimilation of LAI data in EPIC under nitrogen stress conditions.The model lacks a correction feedback on simulated LAI.Only simulated biomass is corrected for the stress induced by limited nitrogen application.The EPIC-LAI curve is influenced mainly by the temperature and not by the variable amount of N.
Figure 7 shows the simulated LAI and biomass curves obtained using default parameters of the model for the different nitrogen fertilization applications.It appears clear that the LAI curve does not change even if the fertilization management changes.Therefore, the EPIC-LAI curve is referred to the optimal development of the LAI, without considering limiting stress conditions for the crop.With this approach, a saturation in the biomass is clearly visible for the high fertilization treatments (120-180 kg N/ha) and this causes errors in the estimation of the yield (Figure 7B).However, we encountered a major flaw for the assimilation of LAI data in EPIC under nitrogen stress conditions.The model lacks a correction feedback on simulated LAI.Only simulated biomass is corrected for the stress induced by limited nitrogen application.The EPIC-LAI curve is influenced mainly by the temperature and not by the variable amount of N.
Figure 7 shows the simulated LAI and biomass curves obtained using default parameters of the model for the different nitrogen fertilization applications.It appears clear that the LAI curve does not change even if the fertilization management changes.Therefore, the EPIC-LAI curve is referred to the optimal development of the LAI, without considering limiting stress conditions for the crop.With this approach, a saturation in the biomass is clearly visible for the high fertilization treatments (120-180 kg N/ha) and this causes errors in the estimation of the yield (Figure 7B).Regarding the year 2017, we found a strong underestimation of the model compared to the yield measured in the field with R 2 = 0.97, RMSE = 1961, and RRMSE = 55% (Figure 8).A little Regarding the year 2017, we found a strong underestimation of the model compared to the yield measured in the field with R 2 = 0.97, RMSE = 1961, and RRMSE = 55% (Figure 8).A little improvement can be seen by assimilating LAI from the satellite (Figure 8A) compared to a parameterization using default model values (Figure 8B).improvement can be seen by assimilating LAI from the satellite (Figure 8A) compared to a parameterization using default model values (Figure 8B).The output of the model was further analyzed to understand the strong underestimation.One main reason for the underestimation was found to be due to a major decrease in the harvest index (HI) for the year 2017 (0.214-0.218) compared to 2016 (0.416-0.425).The small intra-year variations in the HI values are probably due to the different phenological development of the crop during the growing season related to the different LAI parameterization for each plot.The strong decrease in the HI in 2017 is probably related to water stress induced by limited rainfall.For comparison, Figure 9 shows the amount of precipitation during the growing season 2016 and 2017.It is known that drought stress at any growth stages decreases grain yield, especially in the stem elongation, flowering, and grain falling stages.
It appeared clear that the impact of water stress excessively reduces the HI index.Therefore, we further tuned the HI by modifying the parameter PARM(3) inside the parameter file.For this parameter a value of 3 was set.Using the letter value, the water stress influences HI only when the heat unit accumulation is equal to three times the potential heat unit accumulation of the crop.In addition, the lower limit of HI, namely the WSYF that was set equal to the HI value.As a result of this adjustment, Figure 10 shows an improvement of the yield estimation (R 2 = 0.97, RMSE = 281, and RRMSE = 8.1%).The output of the model was further analyzed to understand the strong underestimation.One main reason for the underestimation was found to be due to a major decrease in the harvest index (HI) for the year 2017 (0.214-0.218) compared to 2016 (0.416-0.425).The small intra-year variations in the HI values are probably due to the different phenological development of the crop during the growing season related to the different LAI parameterization for each plot.The strong decrease in the HI in 2017 is probably related to water stress induced by limited rainfall.For comparison, Figure 9 shows the amount of precipitation during the growing season 2016 and 2017.It is known that drought stress at any growth stages decreases grain yield, especially in the stem elongation, flowering, and grain falling stages.
It appeared clear that the impact of water stress excessively reduces the HI index.Therefore, we further tuned the HI by modifying the parameter PARM(3) inside of the parameter file.For this parameter a value of 3 was set.Using the letter value, the water stress influences HI only when the heat unit accumulation is equal to three times the potential heat unit accumulation of the crop.In addition, the lower limit of HI, namely the WSYF that was set equal to the HI value.As a result of this adjustment, Figure 10 shows an improvement of the yield estimation (R 2 = 0.97, RMSE = 281, and RRMSE = 8.1%).

Discussion
Regarding the accuracy of LAI estimation with Sentinel-2 data, our study confirmed a previous study [29] that reported an R 2 of 0.83 and a RMSE of 0.32 m 2 /m 2 .When assimilated in a crop model, satellite-based LAI improves yield estimates as reported also in other studies that used a similar assimilation strategy [12,56].The only use of the bound constraints on the model parameters,

Discussion
Regarding the accuracy of LAI estimation with Sentinel-2 data, our study confirmed a previous study [29] that reported an R 2 of 0.83 and a RMSE of 0.32 m 2 /m 2 .When assimilated in a crop model, satellite-based LAI improves yield estimates as reported also in other studies that used a similar assimilation strategy [12,56].The only use of the bound constraints on the model parameters, without including other restriction as nonlinear constraints, seems sufficient to find the minimum of the minimization problem but a good fit between the satellite observations and the LAI curve of the model was found only in the plot where a low amount of fertilizer was provided.However, in both the years, a yield improvement was obtained and this may be due to a forcing of the DMLA parameter using satellite observations.The use of the maximum value of LAI (98th percentile) to set DMLA can be considered as a priori knowledge of canopy conditions.In this study, an ideal maximum LAI reachable by the crop was found for each field.In this context, the combination of Sentinel-2A and Sentinel-2B satellite data can provide timely and dense sets of observations to precisely characterize the crop growth curves and maximum LAI values.
The combination of the EPIC model and satellite-based LAI was found to be not ideal because there is no nitrogen stress feedback on the LAI in the model.In real field conditions, LAI is significantly influenced by nitrogen levels [57], while, in EPIC, the LAI curve is modeled without taking into consideration the nitrogen stress.Only simulated biomass is affected by the nitrogen constraints.
However, under high level of nitrogen application (above 120 kg/ha in our study), the simulated biomass saturates and, in this case, forcing LAI can help the model to achieve an increased precision.Therefore, the coupling of satellite-based LAI observations and potential EPIC LAI curves might improve the description of the spatial variability, for both LAI and biomass estimation, as confirmed by an improvement in yield estimation obtained in both years under study.Regarding the year specific results: in 2016, a good yield estimation was found assimilating remotely-sensed data even if only a slight improvement in R 2 , RMSE, and RRMSE was obtained if compared to the yield estimation without data assimilation.This could be partially attributed to the site-specific parameterization of the model using field observations and, thus, an accurate description of the environment where the crop has been growing up.In 2017, a slight improvement in the yield results was also found especially regarding the correlation (improvement of R 2 ).However, both a data assimilation approach and a standard EPIC parameterization provided a strong underestimation in yield due to water stress.This is probably caused by three reasons: (1) the rainfall data may not be representative of the field condition and they can affect the model estimation, (2) the default parameters of the model are non-representative for the specific year of simulation; or (3) the impact of water stress conditions are erroneously modelled.
One of the problems that we should consider is the capability of the model to estimate yields under different scenarios and in the presence of high variability, especially in the regional scale application.Previous studies have shown promising results considering the EPIC yield estimation at the regional scale in the presence of high variability [58].More examples should be obtained using remote sensing data that allows the improvement in the information about the crop and the environment of interest even if, due to a large amount of parameters required by the model and, therefore, the large rate of uncertainty, the regional scale application of EPIC seems difficult.Furthermore, the definition of the model parameter-bound constraints in the minimization function without a site-specific calibration is not always representative of the different geographical areas, and a multi-location calibration appears fundamental, for the application in different sub-areas (such as at the field scale) or at the regional scale.It seems necessary to add additional information in the minimization function.The application of EPIC model at the sub-field scale seems more suitable than the regional application, but specific field and crop data are needed.

Conclusions
The use of satellite data together with a crop growth model gives a wide range of possibilities to obtain yield estimates.However, the availability of satellite data and the ability of the model to represent both satellite observations and the characteristics of the crop system has to be evaluated.In this study, the fit between the modelled LAI and the observed LAI curves was reached for both years under investigation (2016 and 2017).However, yield was not always well estimated.This problem could be attributed to the non-capability of the model to represent the environment of study or to issues related to the data used for the model calibration.EPIC was chosen after a literature review showing the potential of the model to simulate grain yield under different environmental and management conditions.In the evaluation of our results, the adoption of a specific calibration of the model parameters, taking into consideration four plots from one growing season only (2016), has to be taken into consideration.Therefore, the cost function minimization approach, together with the EPIC capability to assimilate satellite data, need to be tested under different environmental and management conditions, at different spatial scales, and using general constraints, not specific for the study site.Limitations of the re-calibration approach used in this study are the fact that it does not consider uncertainties in the satellite observations, in the model parameterization and in the model variables' estimations.This approach is also expensive in terms of computing time.For future applications, the updating methods could be considered in order to alleviate the shortcoming of the re-calibration methods.Two classes exist: on the one hand, stochastic methods that are generally implemented using ensemble data assimilation (e.g., the ensemble Kalman filter or particle filter).On the other hand, the variational methods used in other disciplines, as for example the 4D-Var approach, seem to be promising also in the crop growth model assimilation.The 4D-Var methods can simultaneously assimilate the observational data at multiple times in an assimilation window starting with the assumption that the model is perfect.The choice of the best possible model according to the available data and the amount of the data required by the model is one of the key points, together with the quality of the satellite observations and applicability of the data assimilation method that should be considered in the data assimilation framework.

Agronomy 2018, 8 , 18 Figure 1 .
Figure 1.Study area of Marchfeld (left), Lower Austria, and the experimental plots in 2016 (top right) and 2017 (bottom right).It is possible to notice the variability in LAI values as a result of different fertilization rates given during the growing cycle.

Figure 1 .
Figure 1.Study area of Marchfeld (left), Lower Austria, and the experimental plots in 2016 (top right) and 2017 (bottom right).It is possible to notice the variability in LAI values as a result of different fertilization rates given during the growing cycle.

Figure 2 .
Figure 2. Workflow for collecting in situ LAI measurements using the Li-Cor LAI 2200C Plant Canopy Analyzer.Step 1 is the selection of the field; in Step 2 the GPS point, representative of the center of the ESU, is taken; and in Step 3 the measurements are performed obtaining an LAI value as a mean value of the 24 measurements below the canopy (six for each square).

Table 1 .
List of Sentinel-2 (S2) acquisitions and dates of LAI ground measurements (three in 2016 and four in 2017) used for the LAI validation.The cloud cover indicates the percentage of the land surface covered by clouds in the image data.

Figure 2 .
Figure 2. Workflow for collecting in situ LAI measurements using the Li-Cor LAI 2200C Plant Canopy Analyzer.Step 1 is the selection of the field; in Step 2 the GPS point, representative of the center of the ESU, is taken; and in Step 3 the measurements are performed obtaining an LAI value as a mean value of the 24 measurements below the canopy (six for each square).

Figure 4 .
Figure 4. Calibration: the graphs show the different LAI values in four plots for 2016 for different levels of fertilization: 0 kg N/ha (A, id: 1-1), 60 kg N/ha (B, id: 2-1), 120 kg N/ha (C, id: 3-1), and 180 kg N/ha (D, id: 4-1).The green line is the daily LAI curve obtained by EPIC model, while the light blue points are the LAI measurements obtained using Sentinel-2 data.

Figure 4 .
Figure 4. Calibration: the graphs show the different LAI values in four plots for 2016 for different levels of fertilization: 0 kg N/ha (A, id: 1-1), 60 kg N/ha (B, id: 2-1), 120 kg N/ha (C, id: 3-1), and 180 kg N/ha (D, id: 4-1).The green line is the daily LAI curve obtained by EPIC model, while the light blue points are the LAI measurements obtained using Sentinel-2 data.

Figure 5 .
Figure 5. Assessment showing the different LAI values in four 2017 plots for different levels of fertilization: 0 kg N/ha (A, id: 1-1), 60 kg N/ha (B, id: 2-1), 120 kg N/ha (C, id: 3-1), and 180 kg N/ha (D, id: 4-1).The green line is the daily LAI curve obtained by EPIC model while the light blue points are the LAI measurements obtained using Sentinel-2 data.

Figure 6 .
Figure 6.Yield estimation with (A) and without (B) the assimilation of LAI from Sentinel-2 data.The orange triangles are the plot used for the calibration of fmincon and the EPIC unknown parameters, and the blue square symbols represent plots used for the assessment only.

Figure 6 .
Figure 6.Yield estimation with (A) and without (B) the assimilation of LAI from Sentinel-2 data.The orange triangles are the plot used for the calibration of fmincon and the EPIC unknown parameters, and the blue square symbols represent plots used for the assessment only.

Figure 6 .
Figure 6.Yield estimation with (A) and without (B) the assimilation of LAI from Sentinel-2 data.The orange triangles are the plot used for the calibration of fmincon and the EPIC unknown parameters, and the blue square symbols represent plots used for the assessment only.

Figure 7 .
Figure 7. LAI (A) and biomass (B) curves obtained with default LAI parameters for the four different fertilization management treatments (0, 60, 120, and 180 Kg N/ha) in 2016.The graphs show a variation in the biomass levels but not in LAI.3.4.Yield Estimation for 2017.

Figure 7 .
Figure 7. LAI (A) and biomass (B) curves obtained with default LAI parameters for the four different fertilization management treatments (0, 60, 120, and 180 Kg N/ha) in 2016.The graphs show a variation in the biomass levels but not in LAI.3.4.Yield Estimation for 2017.

Figure 8 .
Figure 8.Comparison between (A) yield estimated with the assimilation of LAI from Sentinel-2 and (B) yield obtained with default EPIC parameters for 2017.The green line represents the regression line between the values.

Figure 8 .
Figure 8.Comparison between (A) yield estimated with the assimilation of LAI from Sentinel-2 and (B) yield obtained with default EPIC parameters for 2017.The green line represents the regression line between the values.

Figure 8 .
Figure 8.Comparison between (A) yield estimated with the assimilation of LAI from Sentinel-2 and (B) yield obtained with default EPIC parameters for 2017.The green line represents the regression line between the values.

Figure 9 .
Figure 9. Precipitation (blue bars) and simulated water content in the root zone (black line) in 2016 (A) and in 2017 (B) for one of the high fertilizer plots (180 kg N /ha).

Figure 9 .
Figure 9. Precipitation (blue bars) and simulated water content in the root zone (black line) in 2016 (A) and in 2017 (B) for one of the high fertilizer plots (180 kg N /ha).

Figure 10 .
Figure 10.Yield obtained using the modified PARM(3) value that allows to reduce the impact of the water stress on the harvest index.

Figure 10 .
Figure 10.Yield obtained using the modified PARM(3) value that allows to reduce the impact of the water stress on the harvest index.

Table 1 .
List of Sentinel-2 (S2) acquisitions and dates of LAI ground measurements (three in 2016 and four in 2017) used for the LAI validation.The cloud cover indicates the percentage of the land surface covered by clouds in the image data.

Table 2 .
List of Sentinel-2 (S2) LAI acquisitions and cloud cover during both the growing seasons of winter wheat in the area of interest (nine in 2016 and 12 in 2017).