Modelling Crop Biomass from Synthetic Remote Sensing Time Series: Example for the DEMMIN Test Site, Germany

This study compares the performance of the five widely used crop growth models (CGMs): World Food Studies (WOFOST), Coalition for Environmentally Responsible Economies (CERES)-Wheat, AquaCrop, cropping systems simulation model (CropSyst), and the semi-empiric light use efficiency approach (LUE) for the prediction of winter wheat biomass on the Durable Environmental Multidisciplinary Monitoring Information Network (DEMMIN) test site, Germany. The study focuses on the use of remote sensing (RS) data, acquired in 2015, in CGMs, as they offer spatial information on the actual conditions of the vegetation. Along with this, the study investigates the data fusion of Landsat (30 m) and Moderate Resolution Imaging Spectroradiometer (MODIS) (500 m) data using the spatial and temporal reflectance adaptive reflectance fusion model (STARFM) fusion algorithm. These synthetic RS data offer a 30-m spatial and one-day temporal resolution. The dataset therefore provides the necessary information to run CGMs and it is possible to examine the fine-scale spatial and temporal changes in crop phenology for specific fields, or sub sections of them, and to monitor crop growth daily, considering the impact of daily climate variability. The analysis includes a detailed comparison of the simulated and measured crop biomass. The modelled crop biomass using synthetic RS data is compared to the model outputs using the original MODIS time series as well. On comparison with the MODIS product, the study finds the performance of CGMs more reliable, precise, and significant with synthetic time series. Using synthetic RS data, the models AquaCrop and LUE, in contrast to other models, simulate the winter wheat biomass best, with an output of high R2 (>0.82), low RMSE (<600 g/m2) and significant p-value (<0.05) during the study period. However, inputting MODIS data makes the models underperform, with low R2 (<0.68) and high RMSE (>600 g/m2). The study shows that the models requiring fewer input parameters (AquaCrop and LUE) to simulate crop biomass are highly applicable and precise. At the same time, they are easier to implement than models, which need more input parameters (WOFOST and CERES-Wheat).


Introduction
The contention to achieve food security and the sustainable use of agricultural resources becomes vital due to the growth in the human population [1]. The World Summit on Food Security states that the community expects to reach a margin of 10 billion by 2050, and this will force intensive agriculture demand, along with the biggest challenge of climate change [2,3]. In this context, achieving sustainability in agriculture might slow down the negative impacts on the quantity and quality of soil and water resources, land degradation, greenhouse emissions, or biodiversity [4].
For the implementation of sustainable activities in agriculture, there is a definite requirement of monitoring crop growth and status in different regions and environmental contexts, with high temporal resolutions [5]. Efficient methodologies that are both able to monitor crop conditions and changing weather conditions near real-time are essential [1]. Also, these methodologies can be applied to predict crop production, e.g., for optimizing the management strategies in agriculture and increasing sustainability [6]. These crop predictions are not only essential to obtain the economic returns, but are, at the same time, highly valuable to efficiently evaluate food production insufficiency and to ensure food security in the agricultural regions of the world [7].
In the last two decades, crop growth models (CGMs) have reached a high degree of success in simulating the behavior of real crops (i.e., by predicting its final state of total biomass or harvestable yield). CGMs are defined as a set of mathematical equations which provide quantitative and temporal information on plant growth and development by including the effect of various climatic parameters [8]. These models, working on different spatial scales, are increasingly applied as tools for decision-making and research, e.g., they are used in plant management in order to evaluate climate risks [9,10]. On the contrary, CGMs lack the spatial information to estimate yield at the field scale [11]. Regulating model inputs for the required spatial and temporal scales is complicated, and the frequently assumed spatial homogeneity of crucial input parameters leads to errors in the estimated outputs [12,13]. A synergistic approach, linking remote sensing (RS) data with crop growth modeling, has the potential to overcome these limitations, as satellite RS offers a method to examine the fine-scale spatial and temporal changes in crop phenology for specific fields and to forecast crop yields, considering the impact of daily climate variability [14]. The integration of RS data into CGMs provides the practical information of the crop's lifecycle by adjusting model parameters based on RS observations. Many studies successfully utilized the leaf area index (LAI), or fraction of absorbed photosynthetic active radiation (FPAR) derived from vegetation indices, e.g., the normalized difference vegetation index (NDVI), in combination with CGMs to estimate crop biomass or yield at different study regions around the world [11,[15][16][17][18].
Due to economic accessibility, historical records and increasing resolutions of globally available satellite products now provide a new level of data with high spatial, temporal, and spectral resolutions, which in turn creates new possibilities for generating accurate datasets on several crop types and their respective yields [19][20][21]. Specifically for agricultural related applications, such as precision farming, field level information (a few tens of meters) is needed to guide farmers, while swift and subtle changes in crop conditions due to extreme weather events require the capture of high frequency images (i.e., daily or weekly acquisitions) [22,23]. The Landsat mission, which has provided optical multi-spectral RS data since the 1980s with 30m spatial resolution, is exemplary. However, Landsat satellites have a rather low temporal resolution with a revisit time of 16 days, and a high cloud coverage [24,25], which does not allow for crop phenology estimation on a daily or even weekly basis. Other RS systems, on the other hand, have a daily revisit but suffer from rather coarse spatial resolution, e.g., the Moderate Resolution Imaging Spectroradiometer (MODIS) imagery has provided multi-spectral RS data with a daily revisit since 2001. Due to its high frequency, spatial and temporal filtering methods are used to eliminate cloud-contaminated pixels with a high accuracy [26][27][28][29]; however, the effectiveness for fine-scale environmental applications is rather low and limited by the spatial resolution of 250 to 1000 m [30].
One way to overcome this issue is the application of multi-temporal data fusion. In such a fusion, two data sets are combined to create a synthetic time series, delivering both high spatial and

Data Collection and Pre-processing
The climate data, biophysical data, and satellite data were obtained from multiple sources (Table  1). Firstly, the collected data were pre-processed according to the requirements of the study. After this step, the pre-processed information was integrated into the selected CGMs.

Data Collection and Pre-processing
The climate data, biophysical data, and satellite data were obtained from multiple sources (Table 1). Firstly, the collected data were pre-processed according to the requirements of the study. After this step, the pre-processed information was integrated into the selected CGMs. Table 1. Summary of the collected datasets for winter wheat crop modeling from start of season (SOS) to end of season (EOS) (dd.mm.yy). The climate parameters are maximum temperature ( • C) (Tmax), minimum temperature ( • C) (Tmin), dew point temperature ( • C) (Tdew), relative humidity (%) (RH), clear sky surface photosynthetically active radiation (Wm-2) (PARCS), wind speed at 10 m (ms-1) (WIND), sunshine duration (s) (SUND), volumetric soil water layer 1 to 4 (m3) (SWVL1-4), top net solar radiation clear sky (Wm-2) (TSRC), and total precipitation (mm) (TP); the biophysical parameters are leaf area index (LAI) (m2 m-2), fraction of absorbed photosynthetic active radiation (FPAR), plant height (m) (PH), canopy cover (%) (CC) and biomass (g m-2); the satellite data are Landsat and Moderate Resolution Imaging Spectroradiometer (MODIS) MCD43A4. The climate data are available with a temporal resolution of 3 h for all elements mentioned above from the European Centre for Medium-Range Weather Forecasts's (ECMWF) ERA-Interim for the period 25 Februay to 31 August in 2015. These three-hour based data are aggregated into daily means in order to fulfil the requirements of the CGMs. ECMWF provides global climate forecasts and historical climatic data at different spatial resolutions [56]. The spatial resolution of the climate data used in the study is approx. 80 km.

Biophysical Data
The biophysical parameters LAI, FPAR, plant height (PH), canopy cover (CC), and biomass are gathered on phenology based on the Biologische Bundesanstalt, Bundessortenamt and Chemical Industry (BBCH) characterization scale, such as in initial, development, mid-, and late season, and using various instruments, like a SunScan (FPAR), scanner (LAI), and height table (PH). An oven and a weighing machine were used to dry and weigh the collected wet biomass, respectively. In situ samples are acquired within seven winter wheat field sites on 16 days of the year (DOYs): 83,84,108,109,142,144,148,149,150,151,179,180,181,182,183, and 219. The sampling survey is designed into 18 sample points (also called as environmental sampling units (ESUs)) within 12 sub-plots per ESU, in the form of a rectangular cross with spatial extensions of 20 m by 20 m (Figure 1). A distance of 4 m separates each sub-plot. The measurements of the 12 subplots are averaged and assigned to the respective ESU.
The ESUs are far apart enough to not be displayed by the same Landsat pixel (30 m by 30 m). To measure the spatial auto-correlation between ESUs, the study conducts Moran's I statistical test, which states a correlation of the values of a sample based on the measured locations. The test performs with the null hypothesis (H 0 ) that there is no spatial auto-correlation and the alternative hypothesis (H 1 ) that there is spatial auto-correlation [57][58][59]. The test is performed on samples considered at different time steps during the phenological cycle of winter wheat. For the present study, the results of Moran's I were mostly negative, which indicates that the pattern of biomass sampling is dispersed with a non significant p-value (>0.05), which fails to reject the H 0 .

Satellite Data
For the spatial-temporal analysis, the study makes use of Landsat 8 Surface Reflectance Code (LASRC) products and MODIS datasets. The LASRC Tier 1 has a spatial resolution of 30 m on a Universal Transverse Mercator (UTM) projection and provides seven spectral bands (coastal/aerosol, blue, green, red, near infrared (NIR), shortwave infrared (SWIR) 1, SWIR 2). The data is atmospherically corrected using LASRC. From the given quality assessment band "pixel_qa", which is generated using the C function of mask (CFMask) algorithm, clouds and shadows are removed via the shadow and cloud masks. After preprocessing, the available cloud-free and shadow-free Landsat images were acquired in 2015 at the following DOYs respectively: 84 (25 March), 100 (10 April), 123 (03 May), 155 (04 June), 164 (13 June), 196 (15 July), and 219 (7 August). Additionally, the study uses MODIS MCD43A4 version 6 Nadir Bidirectional Reflectance Distribution Function (BRDF)-Adjusted Reflectance (NBAR) dataset that is produced daily using 16 days of Terra and Aqua MODIS data at 500 m spatial resolution. Both the cloud cover and the noise are removed from the quality index included in the product. The cloud gaps in the MODIS data are filled using the linear interpolation.
Eventually, the STARFM is used to fuse both the Landsat and MODIS datasets in order to generate a spatial temporal time series with higher spatial temporal resolution. Before applying the fusion algorithm, a single band of the NDVI from every time step has been generated from the reflectance bands of the Landsat and MODIS datasets. Prior to the data fusion, the MODIS daily NDVI dataset is reprojected and resampled to Landsat imagery using bilinear interpolation. The fused model is based on the principle that both the MODIS and Landsat products detect the same NDVI values, biased by a constant error due to their differences in data processing, acquisition time, bandwidth, and geolocation Remote Sens. 2020, 12, 1819 6 of 28 errors. The algorithm states that if a Landsat-MODIS image pair is available on the same DOY, this constant error can be calculated for each pixel in the image. After that, these errors can be applied to the available MODIS dataset of a prediction date to obtain a prediction image with the same spatial resolution of Landsat. According to [24], this is done in four different steps. Intially, the MODIS time series is reprojected and resampled to the available Landsat imagery. Next, a moving window is applied to the Landsat image to identify the similar neighboring pixels. After that, the weight W I,j,k is assigned to each similar neighbor; lastly, the NDVI of the central pixel is calculated [24,30]. After obtaining the STARFM time series, the study validates the fused product by dropping a single available Landsat NDVI image during the fusion process and comparing both the real (dropped Landsat NDVI) and fused (STARFM NDVI) images of the same time zone. The STARFM performs the fusion process using Equation (1): where w is the size of the moving window, L (x w/2 , y w/2 , t o ) is the cental pixel of the moving window for the Landsat image prediction at time t o , and x w/2 , y w/2 is the central pixel within the moving window. The spatial weighting function W I,j,k determines how much each neighboring pixel x i , y j in w contributes to the estimated reflectance of the central pixel. (x i , y j , t o ) is the MODIS reflectance at the window location (x i , y j ) observed at t o , while L(x i , y j , t k ) and M(x i , y j , t k ) are the corresponding Landsat and MODIS pixel values observed at the base date t k [24]. The n counts for total number of input pairs of (x i , y j , t k ) and M(x i , y j , t k ), and supposedly each pair is acquired on the same date. The size of the moving window is taken as 1500 m by 1500 m, which is three times the size of the MODIS pixel (500 m) and 50 times that of the Landsat pixel (30 m) [24]. The windows minimize the effect of pixel outliers and are therefore used for predicting changes using the spatially and spectrally weighted mean difference of pixels within the window area [24,40].

Methods
The study uses five different CGMs, namely, WOFOST, CERES-Wheat, AquaCrop, LUE, and CropSyst, to calculate crop biomass, including climate and biophysical variables as inputs ( Figure 2) [44][45][46][47][48][49][50]. The selection of the models is based on their design, accuracy, and robustness, which have been proven in previous studies [60][61][62][63]. Because the CGMs lack the spatial information, fused RS time series and MODIS coarse resolution data are used to bridge this gap. Therefore, this study has modified the design of the models according to the input requirements. The developed approach uses the fused product to calculate the NDVI, the highly used indicator to monitor crop growth [17]. Prior input into CGMs, the STARFM NDVI pixel values that are less than 0 and more than 1 are removed from the time series. The ground-measured biophysical parameters (CC, LAI, and FPAR) and the NDVI generated from the STARFM and MODIS are used to derive the daily time series of their respective biophysical parameters, which is one of the input requirements for CGMs. For the remaining input parameters, the three-hour based climate variables are converted to a single day based on their daily mean values. Eventually, the model comparisons are made by validating the model-generated biomass with the measured biomass that is collected at repeated times of the year, based on both the STARFM and MODIS NDVI, individually (Section 2.2). The complete modeling design was adapted in the software R, with reference to the freely available design documentation of the abovementioned CGMs [64][65][66][67][68][69][70][71][72][73]. The parameters used to calibrate the abovementioned CGMs are taken from different related studies (Table 2).
Remote Sens. 2020, 12, 1819 7 of 28 based on their daily mean values. Eventually, the model comparisons are made by validating the model-generated biomass with the measured biomass that is collected at repeated times of the year, based on both the STARFM and MODIS NDVI, individually (Section 2.2). The complete modeling design was adapted in the software R, with reference to the freely available design documentation of the abovementioned CGMs [64][65][66][67][68][69][70][71][72][73]. The parameters used to calibrate the abovementioned CGMs are taken from different related studies (Table 2).

Figure 2.
Conceptual framework of the study that states the total input requirement of the crop growth models (CGMs), including various climate parameters, and biophysical parameters derived from the spatial and temporal reflectance adaptive reflectance fusion model (STARFM) and MODIS normalized difference vegetation index (NDVI) time series. The simulated biomass obtained from the CGMs is validated with the in situ biomass and CGMs are compared on the basis of simplicity, accuracy, and reliability using the STARFM and MODIS data sets. The end products are obtained as a winter wheat daily biomass time series of 30 m and 500 m spatial resolutions during the study period. Figure 2. Conceptual framework of the study that states the total input requirement of the crop growth models (CGMs), including various climate parameters, and biophysical parameters derived from the spatial and temporal reflectance adaptive reflectance fusion model (STARFM) and MODIS normalized difference vegetation index (NDVI) time series. The simulated biomass obtained from the CGMs is validated with the in situ biomass and CGMs are compared on the basis of simplicity, accuracy, and reliability using the STARFM and MODIS data sets. The end products are obtained as a winter wheat daily biomass time series of 30 m and 500 m spatial resolutions during the study period.

WOFOST
The WOFOST model describes the crop parameters, such as crop biomass and yield, by considering crop genetic properties and climatic parameters [44]. It states crop biomass as a function of solar radiation, temperature, and daily crop characteristics. It simulates the daily crop growth rate, which is the gross CO 2 assimilation rate that depends on the LAI and incoming radiation. The daily gross assimilation rate of the crop is calculated by the daily absorbed radiation and the photosynthetic characteristics of each leaf, and it further calculates the total carbohydrates (CH 2 O) produced. Some Remote Sens. 2020, 12, 1819 8 of 28 fractions of the CH 2 O produced are used to provide energy for respiration (maintenance respiration), and the remaining energy is converted into dry matter. The model calculates the growth rate as where ∆G is the growth rate (kg dry matter ha −1 d −1 ); A is the gross assimilation rate ; C e is the conversion efficiency (kg dry matter kg −1 CH 2 O). Based on Monteith's principle of light use efficiency, the calculation of total dry matter (kg dry matter ha −1 yr −1 ) in the WOFOST model is equivalent to the net primary production (NPP) (kg ha −1 yr −1 ) [48,80].
The complete process to calculate the per day growth of a crop's biomass makes the WOFOST model complex in design. This study adopted the complete working methodology of this model from the WOFOST 6.0 documentation prepared by [81], which is also available online (www.wur.nl). A brief description of the WOFOST model is shown in Figure A1.

CERES-Wheat 2.0
The CERES-Wheat model is designed to simulate the effects of climatic parameters, soil, water stresses, and planting density on crop biomass and yield [82,83]. The study uses the CERES-Wheat 2.0 from the draft provided by Dr Joe T. Ritchie and Dr Doug Godwin. The model includes many climatic parameters, such as minimum and maximum temperature, solar radiation, soil moisture, and evapotranspiration. It estimates the biomass by generating the linear relationship between biomass production and intercepted radiation. This model is based on the principle of radiation use efficiency, demonstrating that the conversion efficiency of intercepted radiation to biomass is higher during periods of low radiation than those of high radiation [84]. The CERES-Wheat model calculates biomass as Biomass = PCARB * Min(SWDF1, PRFT), where PCARB is the potential biomass production (g m −2 d −1 ); SWDF1 is the soil water deficit (0;1); PRFT is the photosynthesis affected by temperature (0;1); Min is the minimum function which returns a minimum value. Similar to the WOFOST model, the biomass calculated by CERES-Wheat is equivalent to the NPP [48,80]. The detailed methodology of the CERES-Wheat model is shown in Figure A2 The AquaCrop model used in this study is proposed by [47], which describes the interactions between the plant and soil. It is used as a planning tool to understand the crop response to environmental changes. It subsequently estimates biomass from daily transpiration values. Moreover, it calculates the crop biomass using normalized water productivity (NCWP) based on the development of the crop canopy and crop transpiration. For this study, the crop transpiration and soil evaporation for the same model is calculated using the Penman-Monteith evapotranspiration equation given by [85]. The model calculates biomass as where WP is the biomass water productivity (kg m −2 d −1 ); T r is the transpiration (mm); ET o is the reference evapotranspiration (mm). The step-wise methodology of the AquaCrop model is shown in Figure A3.

CropSyst
The CropSyst model simulates the effect of climate elements and soils on the total crop productivity [86]. The development of this model started in the early 1990s and simulates crop phenology, biomass, and crop yield. It calculates the daily biomass accumulation by determining potential biomass growth based on two components, potential transpiration and crop intercepted photosynthetically active radiation (PAR). The calculated potential growth is then corrected by water and temperature stress. According to [87], the model calculates the daily biomass production as where B PT is the crop potential transpiration dependent biomass (g m −2 d −1 ), TP is the crop potential transpiration (g m −2 d −1 ), VPD is the mean atmospheric vapor pressure deficit (kPa), and Kbt is a biomass transpiration coefficient (kPa). The flow diagram of the CropSyst model is shown in Figure A4.

LUE
The working of the LUE model is based on the light use efficiency theory of [48,49]. However, few studies have integrated it with RS data. The study follows the same methodology to calculate the crop biomass calculated by using the technology of RS derived using a simple physiological paradigm modeled by [66,88]. The model is based on a semi-empiric approach and calculates the daily aboveground biomass as where PAR is photosynthetically active radiation (MJ m −2 d −1 ), fPAR is the fraction of PAR absorbed by the canopy, and єis the actual light-use efficiency (g C M J −1 ). The total aboveground biomass calculated by the LUE model is equivalent to the NPP (kg ha −1 yr −1 ) [48,80]. The detailed step-wise procedure of the LUE model is shown in Figure A5.

Statistical Analysis
The STARFM NDVI data are validated with the pre-processed, cloud-free, and shadow-free Landsat images acquired during the study period. From the predicted NDVI (STARFM NDVI) and observed NDVI (Landsat NDVI), the determination coefficient (R 2 ) (Equation (7)) and root mean square error (RMSE) (Equation (8) and Equation (9)) are calculated. Addtionally, after obtaining results from the CGMs, the modeled biomass of winter wheat obtained using the STARFM and MODIS NDVI is compared with the collected in situ biomass at different time steps during the study period. A linear regression model (LRM) is performed with an aim to establish a linear relationship between the measured (independent variable) and modeled biomass (dependent variable). The statistical parameters used to validate the accuracy of the CGMs are R 2 , mean error (ME) (Equation (8)), and RMSE.
where P i is the predicted value, O i is the observed value,Ō is the observed mean value, and n is the total number of observations. R 2 is a statistical parameter which assesses how well measured values fit a model (i.e., it is a measure for the model quality) and RMSE is used to measure the quadratic differences between predicted and measured values (i.e., it is a measure for the model precision). In general, a lower RMSE value is better than a higher one. To check the significance of CGMs, the probability value (p-value) is calculated using an LRM with a H 0 that there is no correlation between the measured and predicted biomass, and an H 1 that the correlation exists. To perform this test, a significance level (also called alpha (α)) is set to 0.05. A p-value of less than 0.05 shows that a model is significant and it rejects the H 0 that there is no correlation.

Threshold Values of the Climate Parameters Used by CGMs
Many studies have explained growth model comparison, and in general, they show that less calibration causes more uncertainty for the output variable [89,90]. For a better accuracy, it is a crucial step to calibrate CGMs with their required climate parameters concerning the study site. The study uses a minimum lethal temperature value of −2 • C, which is used by [91] for winter wheat. One study [92] illustrated the minimum values of temperature for winter wheat at a grain growth stage as 12 • C. For VPD, this study follows [93], which has analyzed the atmospheric influences on leaf gas exchange on winter wheat with minimum and maximum values of 1.5 and 4.0 kPa, respectively. Ray et al. [94] considered four VPD values with a minimum of 1.1 kPa and a maximum of 3.6 kPa to measure its effect on the crop's transpiration response to drying soil. Moreover, considering that the soil type of the Demmin region is loamy sand, the optimal soil parameter values of water content at field capacity, water content at the wilting point, maximum root depth (Zr), and the average fraction of total available water (TAW) (p) are 0.15 m 3 m −3 , 0.06 m 3 m −3 , 1.5-1.8 m, and 0.55, respectively (Table 3).

STARFM Data Fusion
The NDVI correlations between the Landsat and synthetic images for 2015 on available pre-processed Landsat dates are made to describe the quality, accuracy, and reliability of the STARFM product (Figure 3a observed for DOY 164 (13 June) with an R 2 of 0.90 and RMSE of 0.06. For the DOYs 196 and 219, the agreement of the reflectance values between the Landsat and the fused product is not as high. The R 2 value for both DOYs 196 and 203 is 0.65, with an RMSE of approx. 0.11. The highest values of RMSE are obtained for DOYs 84, 91, and 123 with the values 0.16, 0.17, and 0.10, respectively. Moreover, the lowest values of R 2 are obtained for the DOY 91, with a value of 0.40. When considering these numbers, it is necessary to note that they indicate the lowest quality of the fused product, as all real existing Landsat time series could be used to create the synthetic time series.  Figure 4 shows the NDVI images produced by Landsat (Figure 4a) and the STARFM (Figure 4b) at the same resolution of 30 meters of seven winter wheat fields on DOY 155. On a few sample points, the NDVI generated from the STARFM shows a slight underestimation of the values as compared to  Figure 4 shows the NDVI images produced by Landsat (Figure 4a) and the STARFM (Figure 4b) at the same resolution of 30 m of seven winter wheat fields on DOY 155. On a few sample points, the NDVI generated from the STARFM shows a slight underestimation of the values as compared to the Landsat; however, an opposite trend has been observed on other plots. A more detailed comparison is displayed in the subset maps of Figure 4, comparing three plots: Plot 1, Plot 2, and Plot 4. Plot 1 and Plot 2 show that the Landsat NDVI image is slightly greener than the one of the STARFM (i.e., NDVI values are higher on average). On the other hand, Plot 4 shows higher values of the STARFM NDVI comapared to the Landsat NDVI values. A plot-wise comparison of the selected three plots during the phenological cycle of winter wheat is shown in Figure 5a-c, which indicates both a slight underestimation and an overestimation of the NDVI values generated by the STARFM precisely during the development stages of winter wheat. Also, an overestimation of the NDVI values is observed during the late stages of the crop. Moreover, a similar trend is seen in the bar plots in Figure 5d-f for the same three plots. Importantly, the overall pattern followed by both NDVIs is very similar, thus, proving a sufficiently high accuracy and reliability of the fused time series for vegetation monitoring (overall R 2 of 0.618 and RMSE of 0.10).

Statistical Comparison of CGMs
With both the STARFM and MODIS NDVI inputs, the five models performed significantly (having a p-value < 0.05); this rejects the H 0 of the LRM that there is no correlation between the modeled and measured crop biomass ( Figure 6). The R 2 obtained from the STARFM product has a higher accuracy as compared to the MODIS. Based on the R 2 of the STARFM, the models are descendingly ordered as AquaCrop, LUE, CERES-Wheat, CropSyst, and WOFOST, with R 2 values of 0.86, 0.83, 0.78, 0.78, and 0.77, respectively. The MODIS R 2 values are: LUE (0.68), AquaCrop (0.65) and CERES-Wheat (0.63), CropSyst (0.55), and WOFOST (0.53). In general, the predicted values for all five models follow a similar pattern, and none of the models can claim to outclass the others. However, the ME and RMSE values give a more complete picture of the model performances (i.e., their quality and precision). The ME from the MODIS is slightly lower than the STARFM; on the contrary, the RMSE of the STARFM is lower than the MODIS. The overall STARFM range of all five models for the crop biomass estimation is from 333.11 to 435.29 g/m 2 (ME) and from 511.35 to 698.58 g/m 2 (RMSE). The CERES-Wheat model has provided the lowest RMSE of 511.35 g/m 2 and an ME of 333.11 g/m 2 . Using MODIS, both WOFOST and AquaCrop have overestimated the biomass values as compared to the STARFM. However, the opposite trend has been observed in the remaining models.
However, the ME and RMSE values give a more complete picture of the model performances (i.e., their quality and precision). The ME from the MODIS is slightly lower than the STARFM; on the contrary, the RMSE of the STARFM is lower than the MODIS. The overall STARFM range of all five models for the crop biomass estimation is from 333.11 to 435.29 g/m 2 (ME) and from 511.35 to 698.58 g/m 2 (RMSE). The CERES-Wheat model has provided the lowest RMSE of 511.35 g/m 2 and an ME of 333.11 g/m 2 . Using MODIS, both WOFOST and AquaCrop have overestimated the biomass values as compared to the STARFM. However, the opposite trend has been observed in the remaining models.

Plotwise Comparison of CGMs using the STARFM
In the following, the models' results obtained for the selected 18 plots are compared on a daily basis. The line graphs are drawn in Figure 7a-e show the increasing trend of biomass by all the CGMs and the biomass measured. The LUE model has shown an overestimation in the biomass values as compared to the other models. Except for CropSyst, the other CGMs show the same trend on a daily basis. In the CropSyst model, all ESUs show a dense pattern; however, the opposite is seen in the CERES-Wheat model, where the ESUs show a high variation. The AquaCrop model obtains almost similar values to the measured biomass at the initial stages of crop growth. However, a considerable variation is seen in almost all the models during the late development and mid-stages. After DOY 175, a stable trend in the increment of crop biomass is seen in all the selected models.
(g m −2 ), ME (g m −2 ) and p-value are displayed at the top of every scatter plot. The legend is displayed at the bottom center, where gray color represents MODIS and dark yellow represents the STARFM.

Plotwise Comparison of CGMs using the STARFM
In the following, the models' results obtained for the selected 18 plots are compared on a daily basis. The line graphs are drawn in Figure 7 (a-e) show the increasing trend of biomass by all the CGMs and the biomass measured. The LUE model has shown an overestimation in the biomass values as compared to the other models. Except for CropSyst, the other CGMs show the same trend on a daily basis. In the CropSyst model, all ESUs show a dense pattern; however, the opposite is seen in the CERES-Wheat model, where the ESUs show a high variation. The AquaCrop model obtains almost similar values to the measured biomass at the initial stages of crop growth. However, a considerable variation is seen in almost all the models during the late development and mid-stages. After DOY 175, a stable trend in the increment of crop biomass is seen in all the selected models.   Figure 8 shows the spatial distribution of simulated crop biomass at 30 m spatial resolution by the two best performing models: AquaCrop ( Figure 8a) and LUE (Figure 8b) on DOY 171 (20 June) (i.e., nearly at the start or middle of the flowering stage of winter wheat). The minimum and maximum winter wheat biomass values vary between 1212.30 and 1554.81 g/m 2 in both the AquaCrop and LUE models. These values are obtained considering the climate stress factors, such as temperature stress, VPD, and soil moisture stress. This study shows AquaCrop and LUE are the best fit models with an R 2 of 0.86 and 0.83, and RMSEs of 522 and 582 g/m 2 , respectively. The subset figures in Figure 8 show that the LUE model slightly overestimates the biomass values as compared to the AquaCrop model.

Quality Assessment of Data Fusion
The study investigates the capability of the STARFM [24] over the northeastern region of Germany to predict seasonal changes in modeled biomass during 2015. The study uses the strategy "index-then-blend" (IB), which generates the NDVI from both Landsat and MODIS images before they are blended for the data fusion [96]. On the contrary, many studies first blend the reflectance of the individual MODIS and Landsat data sets and then generate the NDVI using the "blend-thenindex"(BI) approach [97,98]. To investigate which is the better approach, [96] has conducted a theoretical and experimental analysis that states if the predicted NDVI values are lower than the input Landsat values, IB performs better and vice versa. In the present study, a few plots predict higher NDVI values and the remaining plots predict lower; therefore both BI and IB errors are expected to be small [96]. Additionally, the IB approach has less computation cost than BI, as it blends only one band: the NDVI. Therefore, in the present study it was decided to perform the fusion

Quality Assessment of Data Fusion
The study investigates the capability of the STARFM [24] over the northeastern region of Germany to predict seasonal changes in modeled biomass during 2015. The study uses the strategy "index-then-blend" (IB), which generates the NDVI from both Landsat and MODIS images before they are blended for the data fusion [96]. On the contrary, many studies first blend the reflectance of the individual MODIS and Landsat data sets and then generate the NDVI using the "blend-then-index"(BI) approach [97,98]. To investigate which is the better approach, [96] has conducted a theoretical and experimental analysis that states if the predicted NDVI values are lower than the input Landsat values, IB performs better and vice versa. In the present study, a few plots predict higher NDVI values and the remaining plots predict lower; therefore both BI and IB errors are expected to be small [96]. Additionally, the IB approach has less computation cost than BI, as it blends only one band: the NDVI. Therefore, in the present study it was decided to perform the fusion analysis using the IB approach.
The data fusion results indicate that the STARFM can successfully fuse MODIS and Landsat imagery [30,99]. The low RMSE and high R 2 obtained through the STARFM are comparable to those obtained by other studies [24,42,100]. The higher correlations between the observed and predicted NDVI values indicate the suitability of the algorithm for vegetation monitoring. However, many studies have suggested more improvements in the respective fusion algorithm [25,37,40].
In this study, Landsat and MODIS imagery are fused to create the synthetic NDVI time series with a 30 m spatial and daily temporal resolution. Due to high cloud coverage, which is typical in the study region, only four complete (DOYs: 84, 100, 164, and 219) and five partial (DOYs: 107, 123, 155, 180, and 196) cloud-free Landsat scenes were obtained during the phenological cycle of winter wheat. Due to the few cloud-free images, many gaps have been generated between the available Landsat scenes. Therefore, the accuracy of the produced fusion product has been affected [25,30]. For example, the small nine-day gap between DOYs 155 and 164 improves the quality of the fused product for respective days, having an R 2 of 0.89 and an RMSE of 0.06. However, the large gap between the Landsat images on DOYs 180 and 219 lowered the quality of the fused product obtained on day 219 with 0.65 (R 2 ) and 0.13 (RMSE).
Previous studies have used multi-sensor data fusion to derive the NDVI as a monitoring tool for vegetation health and dynamics, allowing easy comparisons based on a temporal and spatial basis [15,32,33,97]. Assessing the reliability of the fused time series, the study compares the NDVI time series obtained from the fusion algorithm with the respective cloud-free Landsat scenes. Studies comparing multi-sensor NDVI values at the development stage of any vegetation reported that both Landsat and MODIS NDVI values are comparable within very close ranges, although the MODIS NDVI appears to be slightly higher than the Landsat NDVI because of its coarse spatial resolution of 500 m [101][102][103]. Similarly, the fused NDVI time series shows slight changes when compared at different growth stages of winter wheat. For example, a few plots show a small overestimation during the development stages and a high underestimation during the later stages of winter wheat, when compared with the real Landsat scenes on respective dates, and vice versa. For the points that have shown the overestimation, the reason could be because at later stages in the crop phenological cycle, the plants are ripe and turn yellowish due to a lower chlorophyll content [104]. Additionally, the MODIS 500 m coarse resolution covers the heterogeneity of green vegetation classes, such as forests or grasslands, which causes more uncertainty in the fused time series at later stages of a crop than the development or growth stages. Another reason could be the smaller gap of cloud-free Landsat data used for the STARFM fusion algorithm during the development stages than at the later stages of the crop.

Description of Results Obtained from Different Models
This study compares five different CGMs: WOFOST, AquaCrop, CERES-Wheat, LUE, and CropSyst to estimate crop biomass by inputting the MODIS and STARFM NDVI products individually. After obtaining the biomass values from different models, the present research compares the predicted values with the in situ biomass measurements taken from the sample plots at the DEMMIN test site. The obtained biomass values are a cumulative sum of each day of the crop growth cycle. These CGMs mentioned above differ in their complexity and processing time, and they have different requirements regarding the input variables. Overall, both the AquaCrop and LUE models perform well in achieving the desired balance of simplicity, accuracy, and robustness by simulating biomass satisfactorily compared to the observed records. For MODIS and the STARFM, both prove to be more reliable and significant with high R 2 (>0.64, >0.82,), low RMSE (<650 g/m 2 , <600 g/m 2 ), and p-value < 0.05, respectively. The CropSyst model works similar to the AquaCrop mode with regards to simplicity and its requirements for input parameters. However, it produces the crop biomass with the highest RMSE of 699 g/m 2 (STARFM) and 783 g/m 2 (MODIS), and the lowest R 2 of 0.78 (STARFM) and 0.55 (MODIS).
The study proves that the CGMs results obtained using the STARFM NDVI are more precise, accurate, and reliable than the MODIS. Even though the aim of the study is to generate a high spatial temporal time series for monitoring winter wheat, the results obtained from MODIS are still significant and precise, especially for AquaCrop, LUE, and CERES-Wheat. Moreover, it was found that CGMs, which require more input parameters, are not only more complex, but also need more processing time to generate the output. For example, both WOFOST and CERES-Wheat need many input parameters, such as evapotranspiration, relative humidity, PAR, soil moisture, solar radiation, and sunshine hours to calculate crop biomass, while AquaCrop, CropSyst, and LUEare designed to consider only three input variables. Even though WOFOST has a complex design, the biomass generated is less accurate, based on R 2 (0.77) and RMSE (651 g/m 2 ) values when compared with the remaining models using the STARFM NDVI time series. In turn, the reason behind the degradation in the performance of WOFOST is the huge requirements for input parameters. Besides, for this model, there were still some missing crop parameters, like nitrogen uptake, stage-wise crop root density, percolation, and infiltration, that have affected the output. Similar to WOFOST, the CERES-Wheat model is purely based on mathematical equations. Besides its complexity, the model generates the output with the lowest RMSE of 511 g/m 2 using the STARFM and second lowest of 614 g/m 2 using MODIS.

Importance of Climate Parameters' Spatial Resolution and Threshold Values
This study explains the importance of climate parameters, like minimum temperature, soil moisture, and VPD, in improving the accuracy of biomass estimation. As the present study is designed to implement CGMs at a global scale, the use of climate parameters from ECMWF is taken into consideration. However, the coarse spatial resolution of climate variables causes uncertainties in the model outputs. On the other hand, the DEMMIN test site has its own measuring stations that currently provide data, such as wind direction, wind speed, air temperature, air humidity, amount of rain and rain intensity, short-wave and long-wave radiation and counter-radiation, soil temperature and soil moisture at various depths, and air pressureevery 15 min. Conceptually, this study will be elaborated and validated by inputting interpolated data from measuring stations to improve the regional level accuracy of CGMs in the future. Moreover, the selection of suitable field-and crop-specific thresholds for the climate parameters is of high importance, as plant growth and development is strongly affected by climate elements [105]. The study used the threshold values of different climate parameters to improve the accuracy of calculated biomass. Some studies, exemplarily [106], highlighted the importance of the threshold values of climate parameters in crop production. The inclusion of climate threshold values plays an important role to get the impact of climate change on crop growth. However, in many CGMs, the knowledge of crop response to weather conditions is incomplete [107]. Hence, this causes a reduction in the performances of the simulation models and this further causes unreliability in crop yield prediction [108].

Conclusions
This study compares the performance of the five widely used CGMs: WOFOST, CERES-Wheat, AquaCrop, LUE, and CropSyst for the prediction of winter wheat biomass on the JECAM site DEMMIN, Germany. Because CGMs lack the spatial information to estimate the crop biomass at the field scale, the study bridges this gap by linking and comparing CGMs with the NDVI RS time series generated using the STARFM and MODIS. The capability of the STARFM is investigated by fusing both Landsat and MODIS imagery to create the synthetic NDVI time series, having a 30 m spatial and daily temporal resolution with an overall R 2 of 0.618 and RMSE of 0.10. Input satellite images with high cloud cover can deteriorate the quality of fused time series. The study makes use of 18 ESUs from seven winter wheat fields in 2015 to collect in situ biophysical parameters like LAI, FPAR, CC, and biomass with measurements repeated 16 times on a dedicated BBCH. The measured biomass is used to compare the performance of the CGMs based on the accuracy, simplicity, and reliability of the predicted biomass using the STARFM and MODIS NDVI time series. Overall, both the AquaCrop and LUE models perform well in achieving the desired balance, requiring only three input climate parameters and simulating biomass satisfactorily compared to the observed records, with high R 2 (>0.64, >0.82), low RMSE (<650 g/m 2 , <600 g/m 2 ), and significant p-values (<0.05) using MODIS and the STARFM, respectively. For the STARFM, the R 2 values obtained from the models are descendingly ordered as AquaCrop, LUE, CERES-Wheat, CropSyst, and WOFOST, with R 2 values of 0.86, 0.83, 0.78, 0.78, and 0.77, respectively. The study concludes that the CGM results obtained using the STARFM NDVI are more precise, accurate, and reliable than the MODIS.
Additionally, the study shows that the models requiring fewer input parameters (AquaCrop and LUE) to simulate crop biomass are highly applicable and significant. Meanwhile, they are easier to implement than the models WOFOST and CERES-Wheat, which need more input parameters, at least for winter wheat in moderate climate zones, such as the DEMMIN test site. On the contrary, due to the unavailability of some important crop-related parameters, such as crop management practices, the amount of fertiliser used, soil information, and seed type, the performance of some of the models was lowered. The inclusion of these parameters could result in a better performance. Moreover, the coarse spatial resolution of climate parameters (~80 km) likely also negatively affected the results of the CGMs. Thus, the validation of the present study with regional climate parameters from measuring stations of the DEMMIN test site is recommended and targeted for future investigations. Additionally, the present approach can also be extended to the correlation of biomass with the soil organic carbon at the test site, in order to analyze its impact on the crop yields.      Figure A5. Flowchart of to calculate crop biomass using the LUE model. Figure A5. Flowchart of to calculate crop biomass using the LUE model.