New Approaches for the Assimilation of LAI Measurements into a Crop Model Ensemble to Improve Wheat Biomass Estimations

: The assimilation of LAI measurements, repeatedly taken at sub-ﬁeld level, into dynamic crop simulation models could provide valuable information for precision farming applications. Commonly used updating methods such as the Ensemble Kalman Filter (EnKF) rely on an ensemble of model runs to update a limited set of state variables every time a new observation becomes available. This threatens the model’s integrity, as not the entire table of model states is updated. In this study, we present the Weighted Mean (WM) approach that relies on a model ensemble that runs from simulation start to simulation end without compromising the consistency and integrity of the state variables. We measured LAI on 14 winter wheat ﬁelds across France, Germany and the Netherlands and assimilated these observations into the LINTUL5 crop model using the EnKF and WM approaches, where the ensembles were created using one set of crop component (CC) ensemble generation variables and one set of soil and crop component (SCC) ensemble generation variables. The model predictions for total aboveground biomass and grain yield at harvest were evaluated against measurements collected in the ﬁelds. Our ﬁndings showed that (a) the performance of the WM approach was very similar to the EnKF approach when SCC variables were used for the ensemble generation, but outperformed the EnKF approach when only CC variables were considered, (b) the di ﬀ erence in site-speciﬁc performance largely depended on the choice of the set of ensemble generation variables, with SCC outperforming CC with regard to both biomass and grain yield, and (c) both EnKF and WM improved accuracy of biomass and yield estimates over standard model runs or the ensemble mean. We conclude that the WM data assimilation approach is equally e ﬃ cient to the improvement of model accuracy, compared to the updating methods, but it has the advantage that it does not compromise the integrity and consistency of the state variables.


Introduction
Dynamic crop simulation models are widely used to simulate crop growth, crop yield and soil-plant-atmosphere interactions. Originally developed for point-based applications without consideration of spatial variation of weather, soil and management, crop models have been increasingly used for field-, regional-, national-and global scale purposes [1]. The detailed spatial characteristics of those inputs, however, are often difficult to measure and thus generalized or unknown [2]. Even if available, the resolution of the available data affects the results of the simulation and needs to be taken into consideration when interpreting the results [3][4][5][6][7].
A successful application of dynamic crop models at sub-field level could provide valuable information for precision farming applications, such as detailed yield forecasts, timing of pesticide application and estimation of potential for variable rate application of fertilizers in the field. The technological advancement of remote sensing (RS) platforms and instruments allows for the repeated measurements of heterogeneous biophysical plant canopy variables at small scale in the field [8]. The assimilation of this information into a dynamic crop model could help to account for the changes of spatial characteristics within a field.
The idea of data assimilation for dynamic crop models is to incorporate one or several observations of model state variables during the period of crop growth. Based on these measurements, the model can be modified and used to make predictions about future states of the crop [9]. A range of different observations, either field measurements or derived from remote sensing, have been assimilated into crop models: phenology [10,11], soil moisture content [12][13][14][15][16][17], canopy cover [18,19], and, most-frequently used, leaf area index (LAI) [10,[14][15][16]18,[20][21][22][23][24][25][26][27][28]. Defined as the total one-sided area of leaf tissue per unit of ground surface area (provided in m 2 m −2 ), LAI is one of the key parameters in crop growth analysis due to its influence on light interception, biomass production, plant growth and ultimately on crop yield, and it is critical to understand the functioning of many crop management practices [29,30].
Three different methods have widely been implemented to assimilate field-measured or RS-derived state variables into crop models: (a) crop model calibration, (b) forcing and (c) updating [31][32][33]. The calibration method typically finds optimal agreement between simulated and observed state variables via the variation of one or several parameter values using optimization algorithms such as the Differential Evolution Adaptive Metropolis (DREAM) [34], Particle Swarm Optimization (PSO) [35] or Shuffled Complex Evolution (SCE-UA) [23]. The model is run iteratively to reinitialize or reparametrize state variables, which requires excessive computing time. Errors from observations are typically neglected. The forcing method utilizes the observed data directly to replace the state variables or initial input data of the crop simulation model [36]. However, model and observation uncertainties are ignored, and erroneous observations could be integrated into the model. The updating method comprises the continuous updating of model state variables every time a new observation becomes available ('sequential data assimilation'). Here, the assumption is that a corrected state variable at time t will subsequently improve the simulation output at subsequent time steps t+n [31]. The updating method is computationally inexpensive because the crop simulation model is run only once [32].
The performance of those data assimilation algorithms that rely on a Monte Carlo setup highly depends on the composition of the set of variables that are perturbed to generate the model ensemble. We studied publications dealing with the assimilation of LAI observations into dynamic crop models and found that the majority of studies relied on those variables that influence LAI development directly in the crop component (CC) of the respective model (see Table A1 in the Appendix A for overview of variables used in selected publications).
This CC-based approach is possibly appropriate when assimilating LAI information from greater spatial scales (farm to regional level) into a crop model, as the generated ensemble approximates differences that arise from variation of cultivar specifics, planting dates and phenology.
Within-field heterogeneity of crop growth and yield are, however, caused by variable site characteristics, rather than by strong variation of cultivar-specific crop component variables [44].
A Monte Carlo assimilation approach that incorporates a combination of soil component (e.g., soil water content) and crop component (SCC) ensemble generation variables might therefore be a more appropriate solution when assimilating LAI information at field to sub-field level, because soil-influenced dynamics are incorporated into the generation of the ensemble. To our knowledge, no published study has thoroughly looked at the impacts of the composition of the ensemble generating variables on the performance of data assimilation before.
Updating methods using EnKF or PF commonly update state variables every time a new observation becomes available. Crop models are increasingly implemented modularly to facilitate development, documentation, maintenance, sharing and exchange [45]. Where model complexity rises, elaborate understanding is necessary to understand how the update of only one or few state variables affects other, interdependent variables. Updating sequentially could have unforeseen consequences, ultimately threatening the model's integrity and causing an undefined state of the model (e.g., a threshold value for a state variable is reached and triggers a new module, but the filter updates the state variable to a value < threshold during the next simulation step).
The underlying idea of this study was to investigate on how to improve the biomass yield prediction accuracy of crop simulation models at field level via the assimilation of observational field data, by testing (a) the influence of varying ensemble generation variables, and (b) different assimilation algorithms. The EnKF was selected as it is the most-widely used, well-documented updating method in literature. We furthermore developed the 'Weighted Mean' (WM) approach that assimilates state variable observations into the model without changing the model's internal variables in an effort to avoid the risks mentioned above. Moreover, this method is computationally inexpensive and takes both simulation and observation errors into account. The approach relies on a model ensemble that runs from simulation start to simulation end without sequential updating; the subsequent calculation of the weighted mean accounts for the observational values.
Therefore, we addressed the following research questions in the paper: 1. Does the WM approach outperform the EnKF approach regarding the estimation of total aboveground biomass and grain yield at harvest using a dynamic crop model? 2.
With detailed soil information available, do the ensemble-based assimilation approaches EnKF and WM improve accuracy over standard model runs (SR) and the ensemble mean (EM) with regard to average total aboveground biomass and grain yield per field? 3.
Does the performance of the assimilation approaches depend on the composition of the ensemble generation variables (either crop component-based (CC) or soil and crop component-based (SCC))?

Experimental Sites and LAI Measurements
Embedded in the locally practiced crop rotation, winter wheat was grown on commercial fields in different locations across Germany (4 sites), France (2 sites) and the Netherlands (1 site) during the growing season (GS) 2016/2017, and on seven sites across Germany during the GS 2017/2018 (see Table 1 for overview). All sites were located in the warm, temperate, humid climate of Western and Central Europe with warm summers [46]. The sum of precipitation and average temperature for the period from September 2016 to August 2017 ranged from 490 mm and 12.9 • C in Western France to 647 mm and 10.2 • C in Eastern Germany. For the GS 2016/2017, one commercially available cultivar was planted in each location. Sowing took place on at least two different dates, with site 2 being the only exception (one planting date only). For the GS 2017/2018, two cultivars were planted per field one the same date (with site 12 being an exception, where seeds from both cultivars were planted on two dates). Pesticides, growth regulators and fertilizers were applied based on best practice guidelines. No irrigation was applied. 40 to 60 sampling points were randomly distributed across each field; their location was measured using a differential GPS. LAI measurements were conducted using the LI-COR LAI-2200C Plant Canopy Analyzer (LI-COR Inc., Nebraska, U.S.A.) at each sampling point five times during the growing season, with the earliest measurements starting in April of every year. Growth stages according to the BBCH scale [47] were scouted five times during the growing season at the same dates as the LAI measurements.
Around maturity (BBCH 99), aboveground biomass was sampled at each sampling point on an area of one m 2 and split into the grain, stem and leaf components. Each component was oven dried at 105 • C until no further weight loss occurred, and weighed subsequently.

Soil Samples
Soil samples were collected at each sampling point before planting, and subsequently analyzed for texture and nutrient content in the laboratory. German sites were sampled up to a depth of 90 cm, and up to 60 cm on the French and Dutch sites. Some points in site 5 were sampled to a depth of 30 cm only, due to the presence of solid gypsum in deeper layers. Detailed information on measured soil texture in the fields can be found in [48].

Weather Data
Daily weather data (precipitation, minimum and maximum temperature at 2 m height, solar radiation and average wind speed) were collected from weather stations installed adjacent to the fields (Adcon Telemetry, Klosterneuburg, Austria).

Crop Model
We employed the generic LINTUL5 model implemented in the modeling framework SIMPLACE (Scientific Impact Assessment and Modelling Platform for Advanced Crop and Ecosystem Management, see website at www.simplace.net, accessed 9 December 2019) to simulate daily leaf area and biomass development at all sites. LINTUL5 is a crop growth simulation model developed for potential water-limited, N-limited and NPK-limited conditions [49], and has been used widely for crop response assessments [50][51][52]. SIMPLACE is a model framework that allows the solution of a modeling problem to be modularized into a number of discrete, replaceable and interchangeable software units (so-called SimComponents) [53]. The solution used for this study was a combination of the SimComponents LINTUL5, SlimRoots, SlimWater and STMPsim.
Crop growth in LINTUL5 is a function of intercepted radiation, temperature and radiation use efficiency (RUE). Daily LAI is calculated as the product of the development stage-dependent specific leaf area (SLA) and the weight of the living green leaves (WLVG).
Soil water balance was simulated using SlimWater, where the daily change in soil water content in a multiple layered soil profile is based on the volumes of crop water uptake, soil evaporation, surface run-off and seepage below the root zone [54]. Root growth was simulated using the SimComponent SlimRoots, where the daily increase in biomass of seminal and lateral roots depends on the input of assimilated biomass from the shoot (see [53] for more information). We assumed no occurrence of disease stress and optimal nutrient supply at all times. Thus, water stress was the only growth-inhibiting factor considered in the model.
Soil hydraulic properties based on the derived texture (see Section 2.2) were calculated using the database of hydraulic properties of European soils (HYPRES) [55] for each soil profile. The layered soil profile was extended to 200 cm soil depth, assuming the same in-situ texture present as in the 60-90 cm layer. SIMPLACE <LINTUL5, SLIM> ran in daily time steps. Daily phenology data was provided by a xarvio TM (www.xarvio.com, accessed 9 December 2019) in-house developed, commercial growth stage model that estimated cultivar-specific BBCH stages of winter wheat based on accumulated thermal temperature, vernalisation and photoperiod (see Table A2 in the Appendix A for estimated dates of BBCH stages). The growth stage model has been validated with roughly 30,000 records in Germany. The BBCH stages were transformed into the LINTUL-internal development stages (DVS) based on a lookup-table, and linked to all SimComponents that required DVS information. For all winter wheat-specific variables beyond phenology, we used the generic values (i.e., no calibration of cultivar specifics).
We only considered LAI measurements for assimilation that were conducted before flowering (i.e., < BBCH 65 = DVS 1) due to two reasons: (1) The default LINTUL5 winter wheat configuration did not consider the partitioning of assimilates into the leaves after flowering, and (2) the field-measured LAI was a combination of green and senescent plant material; the simulated LAI however comprised green, living plant material only.

Data Assimilation and Ensemble Generation Approaches
We implemented two approaches to assimilate LAI values into SIMPLACE <LINTUL5, SLIM>: The well-established EnKF and a newly-developed WM method. The main advantage of the new WM approach is the fact that it performs a 'virtual assimilation' of the observations, and therefore does not change any state variable during the model run. Hence, the relations between the state variables are maintained consistent throughout the ensemble simulations. To our knowledge, until present, no other data assimilation updating method has been published that maintains full consistency of all state variables during the model runs.
EnKF has gained popularity in the scientific community due to its simple conceptual formulation and ease of implementation, plus its low computational requirements [56]. It combines an ensemble forecast and the Kalman Filter to calculate the prediction error covariance using the Monte Carlo method [32]. State variables are updated sequentially, taking the uncertainties of the simulation results and observations into account [57]. For detailed information about the EnKF, the reader is kindly referred to other publications [9,56]. We used the implementation of EnKF in R provided by Stefan Gelissen (http://blogs2.datall-analyse.nl/2016/06/08/rcode_ensemble_kalman_filter/, accessed 9 December 2019). The integration of EnKF into SIMPLACE<LINTUL5, SLIM> was done in R [58] using the SIMPLACE R wrapper [59]. The workflow is constituted as follows: First, the ensemble members were randomly generated based on chosen initial values and variance. Secondly, the model runs using the sets of variables as input were invoked. Simulations ran until the first LAI observation became available. Each model run was then interrupted, EnKF updated the LAI value accordingly, and the runs were re-invoked until the next observations became available.
The WM approach assumes that, out of a model ensemble that runs from season start to end without any assimilation (Figure 1a), one or a few ensemble members' simulated LAI values approximate the observed LAI values at each day an observation becomes available (not necessarily the same one at different dates).
Thus, in the subsequent daily weighted mean calculation for the ensemble, the simulated LAI values of the ensemble members closer to the observed value are given a greater weight (Figure 1b-d). Contrary to EnKF or other existing updating methods, no state variables are updated during the simulation runs.
To predict the stateX(t) of the system, we used the weighted mean of the ensemble X i (t): where each weight w of ensemble member i at day t is calculated from the likelihood P that the observation O at time t k approximates the simulated value We assumed that the observation errors on day t k were normally distributed, where O(t k ) is the mean and σ k the standard deviation of the distribution. Thus, we applied the following equation for the calculation of the likelihood P: where h mapped the states to the observational variables.
The weights calculated for the first observation day (t 1 ) were propagated until the next observation (t 2 ) becomes available, and they were also used to calculate the weighted mean of other state variables (aboveground biomass, grain yield). Weights were re-calculated every time an observation became available (i.e., no calculation of running mean, and each observation was used independently from the previous ones). If the value of the LAI observation was way outside the range of simulated values within the ensemble (i.e., higher or lower than the value of the most extreme ensemble member), the entire weights were given to the closest ensemble member. The WM approach can account for reasonable errors (e.g., standard measurement errors). Large errors (e.g., from mishandling the measurement instrument) cannot be accounted for and will eventually results in a low prediction performance.  (2) and (3)). The weights are propagated until (c) the next LAI observation becomes available, and a re-calculation of weights is triggered. (d) A third observation becomes available. No model ensemble member status variable is updated at any point in time.
Both approaches (EnKF and WM) relied on the generation of a model ensemble. We tested two sets of ensemble generating variables, consisting of three variables each, in combination with the assimilation approaches mentioned above: the crop component (CC) set and the combinational set of soil and crop components (SCC).
The CC set was created by varying the three variables ScaleFactorSLA (cScaleFactorSLA), ScaleFactorRUE and the maximal relative increase in LAI (RGRLAI). LINTUL5 accounts for DVS-dependent specific leaf area, which was shown to be a realistic approach [60]. ScaleFactorSLA scales uniformly all predefined DVS-dependent SLA values and ScaleFactorRUE scales all predefined DVS-dependent RUE values accordingly. RGRLAI describes the maximal relative increase in LAI (m 2 m −2 d −1 ) during the juvenile stage of the plant, when the leaf growth is not limited by the available assimilates [49]. This variable is only active during the early growth period (DVS < 0.2 = BBCH < 21) and LAI values < 0.75. We selected those variables because a preceding sensitivity analysis showed high impact on LAI dynamics and biomass accumulation in the model (data not shown). Scaling the SLA automatically scaled the initial LAI.
The SCC set comprised the combination of three parameters that scaled (1) the soil water content at simulation start (SoilWaterInit), (2) the maximal rooting depth that could be reached by the plants (MaximalRootDepth [in meters]), and (3) a scaling factor for the DVS-dependent specific leaf area (ScaleFactorSLA). By perturbing the first two parameters at initialization, the model induced water stress (by reducing the transpiration reduction factor (TRANRF)) at varying points in time during the growing season, thereby expanding the range of LAI values at any given point in time in the ensemble. The parameters SoilWaterInit and MaximalRootDepth were not altered after initialization for reasons of consistency.
The initial soil water content SoilWaterInit in soil layer k was calculated as where VolumetricWaterContent33to1500 describes the volumetric soil water content from field capacity to wilting point (i.e., plant available water), and VolumetricWaterContent1500 the volumetric soil water content at permanent wilting point. Both values were returned by the pedotransfer function (see Section 2.4), based on measured soil texture. We incorporated the ScalingFactorSLA as to cover influences of possible hidden factors (such as nutrient stress). For WM, we chose a uniform distribution of the variables, with defined minimum and maximum values (see Table 2 for values). EnKF creates an ensemble using a randomly generated Gaussian distribution of the parameters (see Table 2 for initial mean values of distribution). For the SCC set, simulations started the day preceding sowing, to achieve a maximum effect of the initial soil conditions on plant development (i.e., no spin up time). For both approaches, the LAI measurement error was assumed to be 0.3.

Evaluation of Model Performance
SIMPLACE<LINTUL5, SLIM> ran in daily time steps for every sampling point in every field that was part of this study, considering both weather and soil texture data as input. The LAI measurements were assimilated into the model using the approaches EnKF and WM in combination with the two sets of ensemble generation variables (CC and SCC). To evaluate the potential benefit of data assimilation, we also included the EM (i.e., no data assimilation). The EM was calculated by averaging aboveground biomass and grain yield at harvest day from all ensemble member simulations, where the ensemble was generated using the same method as for the WM approach and the model's standard runs (SR) (i.e., no data assimilation, no ensemble creation, simulation run with standard configuration) in the analysis.
The evaluation was based on the comparison of measured and simulated total aboveground biomass and grain yield at harvest day. We calculated the three metrics Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE) and the Bias. RMSE indicated the magnitude of error in the unit of measurement with symmetry provided; MAPE showed the average absolute percent difference between measured and predicted values, the Bias computed the amount by which the predicted values were greater (positive Bias value) or smaller (negative Bias value) than the measured ones. Table 3 presents the results for the RMSE analysis of simulated vs. observed biomass yields per site. In six sites, the highest RMSE was produced by the standard run (sites 2, 4, 6, 7, 12, 13, 14), in five sites by either of the two CC assimilation approaches (sites 1, 8,9,10,11) and in one site by the EM SCC approach only (site 5). In five sites, the lowest RMSE was produced by the WM SCC approach (sites 4, 5, 6,8,9) and by the EnKF SCC approach, respectively (sites 2, 10, 11,12,14), in two sites by EM SCC (sites 1, 3) and WM CC approaches respectively (sites 7, 13). Minimum RMSE values among all sites ranged between 0.89 t ha −1 (WM SSC, Site 9) and 3.35 t ha −1 (WM SSC, Site 4), maximum values between 3.77 t ha −1 (Site 9, EnKF CC) and 7.35 t ha −1 (Site 2, Standard Run).

Results of Root Mean Squared Error (RMSE) Analysis
The lowest average RMSE values of all sites were produced by EnKF SCC (2.32 t ha −1 ), WM SCC (2.46 t ha −1 ) and EM SCC (2.65 t ha −1 ). These approaches also showed the lowest standard deviation (0.90 t ha −1 , 0.71 t ha −1 and 1.01 t ha −1 respectively) ( Figure 2). No particular approach showed the best performance across all sites.  Table 4 lists the per-site RMSE results of simulated vs. measured grain yields at harvest. In 10 sites, the highest RMSE was produced by SR (Sites 1, 2, 4, 6,7,8,9,12,13,14), in two sites by WM CC (Sites 3 and 11), and in two sites by EnKF SCC and EM SCC, respectively (Sites 10 and 5). The lowest RMSE was produced in 10 sites by either EnKF SCC or EM SCC. The lowest average RMSE of the 14 sites were produced by EM SCC (1.45 t ha −1 ), EnKF SCC (1.62 t ha −1 ) and WM SCC (1.70 t ha −1 ), with standard deviations of 0.69 t ha −1 , 0.69 t ha −1 and 0.68 t ha −1 respectively (Figure 2).  Table 5 lists the mean absolute percentage error (MAPE) results of simulated vs. measured total aboveground biomass per site. The standard run produced the highest MAPE in seven out of the fourteen sites (2,4,6,7,8,13,14), all of the CC assimilation approaches in five sites (1,3,9,11,12), and the EM CC approach in one site (5). The results for site 10 show an equal performance of the standard run and the WM CC approach. The lowest MAPE was produced by the EnKF SCC approach in five sites (sites 2, 10,11,12,14) and in two sites by the WM SCC approach (sites 5, 6) and by the WM CC approach (sites 7, 13) respectively. In all other sites, several approaches performed equally well (sites 1, 3,4,8,9). The lowest average MAPE across all sites was produced by the EnKF SCC (13%), WM SCC (14%) and EM SCC (15%) approaches with the lowest standard deviations (6%, 6% and 8%), the highest by the EM CC (32% with a standard deviation of 13%) and the standard run (35% with a standard deviation of 14%) (Figure 3). The MAPE results for observed vs. simulated grain yield are listed in Table 6. The three SCC approaches exhibited the lowest MAPE in 11 out of the 14 sites, which also produced the lowest mean values (EM SCC: 16%, EnKF SCC: 18%, WM SSC: 19%, with standard deviations of 10%, 9% and 12% respectively). On average, MAPE values for grain yields were higher than for biomass yields (Figure 3).

Results of Bias Analysis
The calculated bias shows that standard run, EM CC and EnKF CC overestimated the measured biomass in all sites, all other approaches showed a mix of both under-and overestimated values (Table 7). No approach exhibited the best performance across all sites. The EnKF SCC and WM SCC approaches showed the lowest mean bias (0.34 t ha −1 and 0.58 t ha −1 ) with standard deviations of 1.53 t ha −1 and 1.57 t ha −1 , respectively (Figure 4).   Table 8 lists the bias results for simulated vs. observed grain yield. Both the EM CC and EnKF CC approaches overestimated measured grain yield in all 14 sites, whereas the SR and WM CC approaches overestimated in all sites but one (Sites 5 and 7, respectively). The three SCC approaches all showed a mixture of under-and overestimation, where results did not align (i.e., not always either over-or underestimation). On average, the SCC approaches tended to underestimate measured grain yield, contrasting the other approaches that showed overestimation. Standard deviations were similar among all approaches ( Figure 4).

LAI Assimilation
The results demonstrated that, on average across all sites, SR showed the worst performance with respect to RMSE, MAPE and Bias (Figures 2-4). Looking at single results, however, revealed that SR did not always deliver the poorest results. This suggests that the calculation of the EM or the assimilation of LAI observations into the model did not necessarily guarantee a better prediction performance for both total aboveground biomass and grain yield at the end of the growing season.
We relied on a non-destructive, indirect method to determine LAI in the field. The LAI-2200C measures the fraction of transmitted radiation that passes through the plant canopy, and infers LAI by making use of the radiative transfer theory. Indirect methods tend to underestimate LAI when compared to direct measurements [61]. Furthermore, measurements can be prone to errors if the sampling strategy is not followed correctly [61]. The in-field LAI measurements that we used for assimilation were therefore subject to uncertainty. Given this, we rejected, however, to exclude measurements from assimilation because the pattern of crop canopy heterogeneity remained largely unknown. An outlier analysis could have therefore removed values that were measured in areas of extremely dense or extremely thin canopies.
The assimilation of wrongful LAI measurements could be a reason why data assimilation did not always outperform SR (we included a figure showing measured LAI values vs. EnKF-assimilated and Weighted Mean LAI values, respectively, in the Appendix A-see Figure A1). Relying on either EM or data assimilation techniques improved predictions in most cases and averaged over all sites. The lowest mean RMSE value for biomass was 2.32 t ha −1 with a MAPE of 13% using the EnKF SCC approach, and 1.45 t ha −1 with a MAPE of 16% for grain yield using the EM SCC approach. The mean RMSE for SR biomass prediction was 5.32 t ha −1 , with a MAPE of 35%, and 3.16 t ha −1 with a MAPE of 41% for grain yield prediction. The aboveground biomass and grain yield measurements at harvest comprised one measurement per sampling point only (i.e., no repetitive measurements). When interpreting the simulated vs. observed results, the uncertainty in the measured values should be considered.
Differences in phenology were the only cultivar-specific properties we considered in this study. The model was not calibrated for other variables (e.g., SLA, RUE) that could have influenced results positively.
Gilardelli et al. [21] found that the assimilation of remotely sensed LAI into rice model parameters using automatic recalibration increased the accuracy of the simulation results with a mean absolute error (MAE) of 0.66 t ha −1 and a relative root mean square error (rRMSE) of 13.8% contrary to 0.82 t ha −1 and 15.7% without LAI assimilation. However, they concluded that model performance improved only to a moderate extent since the pre-calibrated parameter sets accurately described the characteristics of the cultivars considered. Using the remotely sensed information, spatial variability of yield was well-reproduced. Silvestro et al. [18] reported the lowest rRMSE value to be 18% at field level for winter wheat using EnKF in combination with the SAFY model.

Biomass Yield vs. Grain Yield Simulation Performance
Our results also showed that the model simulation performance was better for biomass yield than for grain yield. This was probably because (a) no data was assimilated after flowering, possibly correcting for environmental influences not considered in the input data or in the model, (b) water stress was not reproduced well in the model (water stress was present especially in sites 2, 5 and 12 as indicated by low HI values, see Table 1), (c) the amount of biomass produced before anthesis that was relocated to grains after anthesis was not accurately defined (we set the maximum value to 15%), and d) the timing of flowering was not accurately predicted by the growth stage model, thereby influencing the timespan of the grain filling period negatively. Differences between observed and predicted flowering date range from 0 to ±14 days, with an average of 1 day difference (data not shown).

Comparison of Ensemble Generation Variables Sets
Looking at the single results revealed that, among all sites, no performance of a distinct approach stood out (i.e., no approach produced the lowest RMSE and MAPE values and bias around 0 consistently in all sites, see Tables 3-8). The mean values of the metrics, however, showed that the SCC approaches outperformed the CC approaches for both total aboveground biomass and grain yield predictions substantially. Lower values of standard deviation signified more robust approaches that seemed more representative of the actual processes in the field. For total aboveground biomass, the comparison of the SCC approaches showed that the differences between the best-performing approaches EnKF and WM were marginal. Concerning the CC approaches, the WM approach outperformed the EnKF algorithm.
The number of studies that relied on Monte Carlo approaches to assimilate LAI measurements into crop models at field or sub-field level is limited. Silvestro et al. [18] relied on a set of eight variables to be updated, all of them part of the model's crop component (see Table A1 in the Appendix A). We encourage future research to also focus on those variables that represent soil conditions and processes in the model.
With respect to grain yield, the EM SCC approach showed the best performance among the SCC approaches, and the WM CC approach among the CC approaches (with respect to average RMSE, MAPE and bias, see . This indicated that, when relying on SCC, the assimilation of LAI did not improve the model's performance with regard to grain yield, in contrast to the CC approach. The reason was probably because no values were assimilated after flowering. We relied on measured soil texture, weather and LAI data for all sampling points in this study. Our conclusions were therefore drawn on the best-case scenario of data availability. We suggest future analysis to focus on data sources with greater level of uncertainty (e.g., soil data from large-scale databases, LAI derived from satellite imagery) to study differences in the performance of the approaches. Focus could also be put on the assimilation of green and/or brown LAI (i.e., living and dead leaf material) to correct for unknown influences after flowering.

Conclusions
The objectives of this study were to investigate if (a) the Weighted Mean (WM) approach outperforms the Ensemble Kalman Filter (EnKF) approach regarding the estimation of total aboveground biomass and grain yield at harvest using a dynamic crop simulation model, (b) the ensemble-based approaches EnKF and WM improve accuracy over standard model runs (SR) and the ensemble mean (EM) with regard to average total aboveground biomass and grain yield per field with detailed soil information available, and (c) the performance of the assimilation approaches depend on the composition of the ensemble generation variables (we tested two sets: crop component-based (CC) and soil and crop component-based (SCC).
We conclude that the assimilation approaches EnKF and WM improved accuracy over standard model runs and EM for average total aboveground biomass and grain yield per-field predictions. The performance, however, differed between sites.
Furthermore, the performance of the WM approach was very similar to the EnKF approach when soil and crop related variables were used for the ensemble generation. When crop related variables were considered only, the WM approach outperformed the EnKF approach. Taking into account that the EnKF approach might violate the integrity of the model runs because only a small part of the states is updated, the WM approach should be preferred when assimilating observational data into crop models.
We furthermore conclude that the difference in site-specific performance largely depended on the choice of the ensemble generation variables set. The combination of soil and crop component-based variables outperformed the crop component set with regard to both biomass and grain yield. For total aboveground biomass, the difference between the assimilation approaches EnKF and WM was marginal when relying on the SCC set, the difference between the assimilation approaches was more pronounced for the CC set, with WM showing the best performance. For grain yield, the assimilation of data using the SCC set did not offer any benefits, in contrast to the CC set.
We are confident that our tested approaches offer great benefit for the scientific crop modeling and precision agriculture community.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results. Table A1. List of LAI data assimilation studies with ensemble generation variables employed and respective scale considered. CM = Crop Model, F = Field, FA = Farm, D = District, R = Region.
Ratio of incoming PAR to global radiation 2.
Temperature sum threshold to start senescence 3.
Optimum temperature for plant development 4.
Day of year of emergence 5.
Effective light-use efficiency 6.
Partition to leaf function parameter 7.
Partition coefficient to grain 8.
Thermal time for seedling emergence 5.
Thermal time from silking to physiological maturity 6.
Maximum number of kernel per plant 7.
Leaf weight at emergence 9.
Plant leaf area at emergence D [28] Wheat-CERES