Testing the Contribution of Stress Factors to Improve Wheat and Maize Yield Estimations Derived from Remotely-Sensed Dry Matter Productivity Yetkin

According to Monteith’s theory, crop biomass is linearly correlated with the amount of absorbed photosynthetically active radiation (APAR) and a constant radiation use efficiency (RUE) down-regulated by stress factors such as CO2 fertilisation, temperature and water stress. The objective was to investigate the relative importance of these stress factors in relation to regional biomass production and yield. The production efficiency model Copernicus Global Land Service-Dry Matter Productivity (CGLS-DMP), which follows Monteith’s theory, was modified and evaluated for common wheat and silage maize in France, Belgium and Morocco using SPOT VEGETATION for the period 1999–2012. For each study site the stress factor that has the highest correlation with crop yield was retained. The correlation between crop yield data and cumulative modified DMP, CGLS-DMP, fAPAR, and NDVI values were analysed for different crop growth stages. A leave-one-year-out cross validation was used to test the robustness of the model. On average, R2 values increased from 0.49 for CGLS-DMP to 0.68 for modified DMP, RMSE (t/ha) decreased from 0.84–0.61, RRMSE (%) reduced from 13.1–8.9, MBE (t/ha) decreased from 0.05–0.03 and the index of model performance (E1) increased from 0.08–0.28 for the selected sites and crops. The best results were obtained by including combinations of the most appropriate stress factors for each selected region and cumulating the modified DMP during part of the growing season that includes the reproductive stage. Though no single solution to the improvement of a global product could be demonstrated, our findings encourage an extension of the methodology to other regions of the world.


Introduction
Regional to global scale crop monitoring and yield forecasting are important for agricultural management and food security [1,2].Satellite remote sensing enables assessment of agricultural crop growth and yield across large territories [3][4][5][6][7].Various Biomass Proxies (hereafter BPs) have been developed from remote sensing imagery to be used in empirical regressive models that monitor agricultural crop growth and estimate crop yield [8][9][10]. NDVI, fAPAR, LAI (Leaf Area Index), GAI (Green Area Index) and EVI (Enhanced Vegetation Index) are examples of BPs that have been derived from remote sensing and that are used in vegetation monitoring and crop yield forecasting [5,7,[11][12][13][14][15][16].Though relationships between BPs and yield have been established, few studies relate Dry Matter Productivity (DMP) to crop yield [17,18].
Vegetation productivity can be defined in several ways.Gross primary productivity (GPP) is the rate at which plants capture and store atmospheric carbon dioxide (CO 2 ) to generate oxygen and energy as biomass [19].Net primary productivity (NPP) is the difference between GPP and the energy lost during plant autotrophic respiration.NPP is thus the rate of atmospheric carbon uptake through the process of photosynthesis and represents the daily accumulation of standing biomass.DMP is analogous to NPP, but expressed in different units (kgDM/ha/day instead of gC/m 2 /day), for agro-statistical purposes [20].The efficiency of the conversion between carbon and dry matter is on average 0.45 gC/gDM [21].
Three types of models are used to estimate the photosynthetic carbon uptake and understand its spatio-temporal variability [22][23][24].The first group consists of empirical models often with a limited applicability outside the area where they have been calibrated.The second group of models is based on major biophysical and biochemical processes of photosynthesis and respiration measured under laboratory conditions [25].These models have a high computational demand.The third group of models are parametric models driven by remote-sensing-derived variables and weather data, calibrated with flux monitoring.Constant parameters are used to link measurements to biophysical processes rendering these models particularly suitable for the local scale.While such assumptions may be difficult to hold at global scale, parametric models may offer a balance between simplicity and process description [23].
Production efficiency models, such as Monteith parametric models, have been developed to monitor the primary production of vegetation [26,27].Monteith suggested that crop growth under non-stressed conditions linearly correlates with their Radiation Use Efficiency (RUE) times the amount of Absorbed Photosynthetically Active Radiation (APAR) [28,29].According to a review of experimental studies [30], RUE values for C 3 species range from 1.32-3.50gDM/MJ of intercepted PAR and for C 4 species from 1.48-4.32gDM/MJ of intercepted PAR during different crop development stages.The seasonal variability of photosynthetic activity, however, depends on environmental constraints [22].For example, RUE is negatively related to water stress and positively related to temperature for annual crops [22].Water stress has been estimated as a function of soil moisture [31,32], water deficits [33] or satellite-derived land surface water index [34].
Production efficiency models have been widely used to estimate regional or global carbon balances in crops, grasslands and forests due to the simplicity of the RUE concept and the availability of remotely sensed data [17,31,32,[35][36][37][38][39][40][41].However, their performance has been shown to vary in describing the carbon budget [42].A comparison of modelled gross and net primary productivity [26] of six different models, CASA [43], GLO-PEM [44], TURC [45], MOD17 [46], BEAMS [47] and C-Fix [48], illustrates how the various methodologies used to calculate vegetation productivity can be generalized in the following two equations: NPP " GPP ´AR where PAR is Photosynthetically Active Radiation (MJ/m 2 ), fAPAR is the Fraction of Absorbed PAR (dimensionless), RUE MAX is the maximum Radiation Use Efficiency (gC/MJ) (i.e., RUE under no stress), which is downregulated by a coefficient that encompasses the effects of all stress factors such as temperature stress or water stress, and AR is Autotrophic Respiration (gC/m 2 /day).In general, variation among the models is caused by the differences in RUE, the incorporation of stress factors, and AR [26].The initial model conditions, model parameters, model structures and accuracy of input data also play an important role [42,49].Uncertainties in global GPP/NPP monitoring also relate to the determination of RUE, AR and the quality of meteorological and the biophysical data [26,46,50,51].
In this study, we analyse the Copernicus Global Land Service-Dry Matter Productivity (CGLS-DMP) product.This model is part of the operational processing chain of SPOT VEGETATION and the following Proba-V at the Vlaamse Instelling Voor Technologisch Onderzoek (VITO).The parametric CGLS-DMP model estimates carbon mass fluxes at local, regional and global scales [48], and has proved usefulness in vegetation productivity studies such as grasslands and forests [52][53][54].For a given location, carbon fluxes are estimated on a decadal basis using SPOT VEGETATION and resulting in a DMP product.Different DMP versions have evolved since the early 1990s [55].The C-Fix approach (a variant of Monteith's approach) was applied, to monitor the overall carbon balance over Europe in terms of NPP [37].In the late 1990s, an operational chain was established to retrieve dekadal DMP images over Belgium based on C-Fix using SPOT VEGETATION for crop monitoring.Around 2000, a consortium was established for JRC-MARS.There have been different MARSOP projects in operation (MARSOP1 from 2001-2003, MARSOP2 from 2004-2007, MARSOP3 from 2008-2014, MARSOP4 from 2015-2019).The DMP coverage has been global except for MARSOP1 which was the Mediterranean Basin, part of the Commonwealth of Independent States (CIS-former Soviet Union), Mercosur in South-America, and the Horn of Africa.The procedure for MARSOP1 was the same as C-Fix.The RUE in optimal conditions has been taken as 1.10 kg¨C/GJ for NPP and 2.45 kg¨DM/GJ for DMP.It is considered constant for all land cover types, while differences exist between biomes, causing the operational DMP product to over/underestimate reality [22,56].The RUE value is subsequently reduced by normalised temperature and CO 2 fertilisation dependency factors.With the start of MARSOP2, the temperature errors were corrected using T 12 for temperature and CO 2 fertilisation efficiencies, and T 24 for autotrophic respiration instead of simply using mean temperature (Equations ( 3) and ( 4)).
T 12 " 0.75 `pT min `Tmax q T 24 " 0.50 `pT min `Tmax q (4) where T min is the minimum daily temperature and T max is the maximum daily temperature.The grid of the meteorological data has been changed from 1 ˝to 0.25 ˝with MARSOP3.For all projects, while calculating CO 2 fertilisation efficiency, the CO 2 value used was fixed at 355.61ppmv, the global mean level of the year 1994.For MARSOP1 and 2, in order to estimate fAPAR, the linear relationship between NDVI and fAPAR was used.For MARSOP3 and 4, the global fAPAR is computed based on [20].The DMP has been produced in the CGLS since 2014 using the same meteo data, algorithm and constants as the MARSOP4 DMP [57].As mentioned in [57], some remarks should be taken into consideration when interpreting the product.Firstly, since no direct water stress factor is implemented in the DMP algorithm, the retrieved values should be considered as optimal values.The CGLS-DMP model is only partially water limited through the sensitivity of fAPAR to vegetation water stress [37].
For the C-Fix model, Verstraeten et al. [48] therefore proposed to limit RUE by using estimates of soil moisture content and water vapour deficit which required several empirical coefficients tuned to local vegetation conditions [58].Maselli et al. (2009) successfully introduced a water stress index to the C-Fix model, which in turn enabled simulations of the gross and net carbon fluxes of Mediterranean forest ecosystems [54].Secondly, although RUE values differ between biome types, it is considered as a global constant for all land cover types.Thirdly, the temperature stress factor was parameterized for European forests, and reflects neither the difference between C 3 and C 4 plants nor the differences within one plant type.Fourthly, although the CO 2 level increases each year, it is considered as a global constant.Finally, the AR is a linear function of daily mean temperature and is assumed biomass-independent [59].The average value of the original AR fraction is around 0.7 for Europe and is an overestimation compared to other values found in the literature [1,42,[60][61][62].
We hypothesise that the relative importance of the stress factors strongly contributes to explaining regional differences in biomass production and yield.An accordingly modified DMP product could therefore serve as a better proxy for crop yield than the currently available CGLS-DMP, and perhaps also outperform other simpler satellite-derived biophysical proxies such as fAPAR and NDVI.The present study aims to parameterise the dry matter productivity approach specifically for wheat and maize and to compare this modified DMP model with the original CGLS-DMP and with other BPs derived from satellite imagery.The value of this improvement is evaluated in the context of arable productivity at the regional administrative or agro-ecological level.The model performance is assessed using statistical metrics based on the comparison between regional cumulated BPs and regional crop yield statistics of silage maize and common wheat for selected sites in Belgium, France and Morocco during the period 1999-2012.

Study Areas and Crops
The selected study sites are located in Belgium, France and Morocco (Figure 1).Silage maize and common wheat are the most dominant crops in the selected sites.From Northern Europe to Northern Africa, the average daily temperature and potential evapotranspiration increases, and the average cumulative rainfall decreases.The regions were selected to capture this trend in climate regimes.Figure 2 shows the long term averages of temperature, precipitation, Penman-Monteith potential evapotranspiration (PET) and water balance computed per region based on meteorological data obtained from the JRC-MARSOP project [63].The water balance was calculated using the Thornthwaite-Mather method as precipitation minus PET [64].Water and heat stress are the major factors that influence arable yields in Belgium [65,66] and France.The total annual water used for irrigation across France can amount to 80% of the crop water use during dry periods, while on average 10%-20% is used during a typical growing season [67].In Morocco, most arable crops are rain fed, having frequent dry periods with high temperatures and irregular rainfall [68].

Data Description
Daily meteorological data were obtained from JRC-MARSOP [63] and 10-daily fAPAR and NDVI from SPOT VEGETATION (1 km) [69] for the period 10/1998-10/2012.Temperature and solar radiation are at a 25 km grid, while rainfall, PET and AET are at a 0.25 ˝grid.
A cropland mask was retrieved from ESA GlobCover 2009 [70].The classes used were "Post-flooding or irrigated croplands (or aquatic)", "Rainfed croplands", "Mosaic cropland (50%-70%)/vegetation (grassland/shrubland/forest) (20%-50%)" and "Mosaic vegetation (grassland/ shrubland/forest) (50%-70%)/cropland (20%-50%)".The soil water holding capacity was retrieved from the European soil map [71] and Digital Soil Map of the World [72] and used in AgroMetShell [73].AgroMetShell provides a toolbox for agrometeorological crop monitoring and forecasting developed by FAO.This toolbox includes a database with the weather, climate and crop data used to analyse the impact of weather on crops [74].AgroMetShell also contains crop-specific water balance parameters for different crop growth stages.In this study, we used the meteorological data from JRC-MARSOP and used AgroMetShell to compute the actual evapotranspiration (AET) for every season during the period 1999-2012.The soil types and textures of the different regions show a high suitability for arable agriculture due to the presence of silt (Table 1).The information for the planting dekad and the length of the growing season as required in AgroMetShell were extracted from the MARS database [63] for Belgium and France; and obtained from [68] for Morocco (Table 2).Official crop yield statistics for common wheat and silage maize for the period 1999-2012 were obtained from the national statistical services.The coefficient of variation is 54% for Morocco, 9% for France and 7% for Belgium for wheat, and 16% for France and 7% for Belgium for maize.
Annual global MOD17A3 GPP and NPP data for 14 years (2000-2013) with a spatial resolution of 30-arcsec were downloaded from the Numerical Terradynamic Simulation Group at the University of Montana [75].This dataset was used to compute NPP/GPP ratio which was compared with the autotrophic respiration fraction computed by the CGLS-DMP and the modified DMP.

Algorithm Description of the DMP Model
The CGLS-DMP model [37] uses the following equation for calculating Dry Matter Productivity except for the last efficiency, ε H 2 O : where DMP is daily dry matter productivity (kgDM/ha/day).R is total shortwave incoming solar radiation (0.2-3.0 µm) (GJ/ha/day).ε p is the ratio of effective photosynthetically active radiation (PAR) (0.4-0.7 µm) to the total incident radiation.Because the energy in the PAR band at the surface of the earth is approximately 48% of global radiation, a value of 0.48 has been used in the model [37].fAPAR is the fraction of PAR absorbed by green vegetation and is estimated based on remote sensing [76].ε RUE MAX is the maximum Radiation Use Efficiency (RUE).In order to make a distinction between C 3 and C 4 crops, we use 2.75 kgDM/GJ for wheat and 3.5 kgDM/GJ for maize [30].ε T is the normalized temperature effect [77] and indicates the role of air temperature in the photosynthesis efficiency [78].ε CO 2 is the normalized CO 2 fertilization effect and takes into account the thermodynamic properties of the carboxylation/oxygenation reactions during photosynthesis [37].ε AR is the fraction kept after Autotrophic Respiration (AR).We introduced in Equation ( 5) the reduction factor "ε H 2 O " to account for water stress.The details of the changed parameters are explained in the following sections.

Temperature Stress Factor
Plant species show characteristic variations in the way photosynthetic processes respond to temperature [79].For instance, wheat, as a C 3 plant, has an optimum temperature around 20 ˝C and maize, as a C 4 plant, has an optimum temperature around 30 ˝C [80][81][82][83][84][85].Different crop varieties may have different temperature responses to dry matter accumulation (e.g., [86]).The focus in this study is on crop type rather than variety.The temperature stress factor (ε T ) in CGLS-DMP is parameterised for C 3 plants (blue line in Figure 3) [37].The temperature response function by [85] is used for maize (red line in Figure 3).

CO 2 Fertilisation Effect
The CO 2 fertilisation effect is the increase in carbon assimilation due to CO 2 concentrations above the atmospheric reference level [37].The CO 2 concentration level in CGLS-DMP was fixed to a constant value, although the globally averaged records show a tendency to increase from 1980-2012 and beyond (Figure 4).In Figure 5a, the dots represent the CO 2 fertilisation effect of CGDL-DMP at a fixed CO 2 level with variable temperature (from ´20 ˝C-40 ˝C).In order to simulate the measured CO 2 increase, the yearly variable CO 2 values were used in the modified DMP.In the CGLS-DMP model, the CO 2 fertilisation effect was parametrised for C 3 plants.For more details on the equation, we refer to [37].Above-ground dry matter increases with elevated CO 2 for C 3 plants [ [88][89][90].The blue to red lines in Figure 5b show the increase of the CO 2 fertilisation effect for C 3 plants from 1999-2012.
For C 4 plants little evidence of biomass accumulation in response to CO 2 enrichment was observed over a wide range of temperatures [88,91].Based on the CO 2 assimilation rate of C 4 at lower CO 2 partial pressures [92], the evolution of ε CO 2 for C 4 plants can be rewritten (Equation ( 6)).
where F CO 2 is the CO 2 assimilation rate in year x, F ref CO 2 is the CO 2 assimilation rate in the reference year 1833, C CO 2 is the actual CO 2 concentration, V pmax is the maximum PEP (phosphoenolpyruvate) carboxylase activity, C ref CO 2 is the CO 2 concentration in the reference year 1833 (281 ppmv (or µbar)) and Kp is the Michaelis-Menten constant for CO 2 (80 µbar according to [92]).Since the variability in CO 2 assimilation rates of C 4 plants is dependent on CO 2 concentration, the changes from one year to another are very small (dotted green line in Figure 5c).

Water Stress Factor
The water stress factor (ε H 2 O , Equation ( 7)) is included in the algorithm to account for the immediate effect of water stress on the vegetation at each dekad.The model uses a simple water stress factor computed on the basis of PET and AET estimates [58].AgroMetShell was used to compute AET from total evaporation, actual rainfall and crop coefficients between planting and harvesting dates [74].The theoretical base of this index is the same as the water scalar in the CASA model [31,35].According to the CASA model, water availability up to a maximum of 50% can limit the photosynthesis which may be complemented by a subsequent fAPAR decrease to account for long-term water stress [31,58].There is no direct water limitation in the CGLS-DMP and water stress is incorporated indirectly through the impact on fAPAR corresponding to a visible impact on the vegetation [20].Maselli et al. [53,58] demonstrated that this modification is effective in improving the C-Fix model simulations for periods affected by significant water stress in Mediterranean tree ecosystems.
ε H 2 O can vary between 0.5, when strong water shortage reduces photosynthesis to half of its potential value, and 1, when there is no water shortage and photosynthetic reduction [58].
ε AR " 1 ´p´3.049`p0.01145 ˆT24 qq (8) where T 24 is the daily mean air temperature (in ˝K).The model performance is improved by introducing biomass information in the modified DMP model, as in the GLOPEM2 model [32].A semi-empirical relationship accounts for above-ground biomass (Equation ( 9)).
ε AR " 1 ´pp´3.049`p0.01145 ˆTM qq ˆp0.53 ˆp W W `50 qqq where T M is ten daily mean composite of mean air temperature and W is above ground biomass, according to The variable ρ ´2.6 min is the minimum reflectance in the red channel of AVHRR in the GLOPEM2 model.The lower ρ ´2. 6 min , the higher the biomass [93].[94] argue that visible reflectance is positively related to standing biomass and canopy closure.We used the ten daily minimum composite values of the red channel of SPOT VEGETATION assuming both instruments reach comparable minimum values in similar conditions.
The long term average autotrophic respiration fraction computed from both the original and modified formula was compared to the annual MOD17A3 NPP/GPP ratio in order to detect similarity and differences.The MODIS dataset is the only large scale estimate available for the NPP/GPP ratio.The long term average values of MODIS NPP and GPP were calculated separately, whereafter the NPP/GPP ratio was computed.
Table 3 presents a summary of unchanged and changed parameters in the DMP equation for both the CGLS and the modified version.

Regression Analysis
The relative importance of CO 2 fertilisation, temperature effect and water stress was determined through different combinations of these stress factors per site and crop type.Previous studies showed that NDVI and fAPAR were used to define site specific relations with crop yield [4][5][6]13,95,96].Therefore NDVI and fAPAR were compared with both CGLS derived and modified DMP.
The regression analysis was done to relate remote sensing Biomass Proxies (BPs) and crop yields of silage maize and common wheat for the period 1999-2012 in Belgium, France and Morocco.A linear regression was calculated between official yield statistics and a regional BP value cumulated over the different periods during the growing season [6] for each crop type and region.Different dekadal combinations of BPs were explored, e.g., the maximum BPs, maximum BPs plus number of dekads, end of season BPs and sum of BPs between flowering and ripening.Although the optimal temporal window varies with crop type and region, the cumulated period of the BPs in the majority of cases includes the reproductive stage.The model performance was assessed using the coefficient of determination (R 2 ).The p-value, calculated by Pearson correlation, was also given to assess whether the relations between the crop yields and calibrated BPs were statistically significant.
A leave-one-year-out cross validation enabled testing the model robustness and goodness of fit using the best combination of variables in the regression analysis.The root mean square error (RMSE), the relative RMSE (RRMSE), mean bias error (MBE) and the index of model performance (E 1 ) were presented to evaluate the model performance.E 1 is a dimensionless index of model-observation agreement [97].E 1 takes the value of 1 for perfect agreement.Although not negatively bounded, the value of 0 indicates that such a model has no more ability to predict the observed values than the observed mean (i.e., a null model).All values below 0 reflect a model that performs worse than the null model.

Results
An overview of spatio-temporal information containing a comparison of modified DMP for C 3 and C 4 versions with CGLS-DMP is provided for Liège Region (BE) (Figure 6).These absolute values are estimated from the modified DMP for C

Autotrophic Respiration Fraction (ε AR )
The range for the ratio of NPP to GPP (which represents the ε AR term in the DMP model) varies with land use type.The values of this ratio are 0.5 at the global scale [42,60,98]; 0.4 for corn and soybeans [1]; 0.4-0.6 for forest, 0.55 for cropland and 0.6 for grassland [61]; and 0.3-0.6 for maize, rice and wheat [62].
The modified ε AR showed a better agreement with the values found in the literature and with the MODIS NPP/GPP ratio as compared to the original CGLS-DMP (Figure 7).Although it is expected to have higher ε AR values in high altitudes and in places with higher temperatures, the differences between the modified DMP and MODIS NPP/GPP ratio are relatively large, particularly in mountainous and desertified regions.The main reason could be that the standing biomass was parameterised on the growing season reflectance value.The areas with very high ε AR values in Figure 7b are sparsely vegetated [94] and fall outside cropland area (the scope of this study).Figure 7 presents the ε AR versions and the MODIS NPP/GPP ratio for a visual interpretation of difference.In order to support the difference/similarity between the raster data presented (Figure 7), a numerical comparison was performed and difference maps created (see Supplementary Materials).

Linear Regression Analysis
Figure 8 shows the highest R 2 values for the relationship between the official yield statistics and each regional cumulative modified DMP computed with different stress factor combinations.In Belgium common wheat has the highest correlation with DMP, without any stress factors in the Polders and with DMP including CO 2 fertilisation effect in the Sandy Loam Region.In France common wheat has the highest correlation with DMP, with water stress in Eure-et-Loir and DMP with CO 2 fertilisation effect and temperature stress in Somme.In Morocco common wheat has the highest correlation with DMP, with all three stress factors in both El Jadida and El Kelaa des Sraghna.In Belgium silage maize has the highest correlation with DMP, with temperature and water stress factors in both the Loam and Liège regions.In France silage maize has the highest correlation with DMP, with all three stress factors in Ain and Haut-Rhin.
The highest correlations for modified DMP per region and crop type (Figure 8) are compared with CGLS-DMP, fAPAR and NDVI in Figure 9.In each study region the modified DMP has a higher correlation than the CGLS-DMP, and in general performs better than fAPAR or NDVI.The correlations are significant for all study sites and BPs, except for fAPAR in the Sandy Loam Region (BE), NDVI in Somme (FR) and CGLS-DMP in Liège Region (BE).The correlations between observed yield and calibrated BPs were significant at the 0.01 level for modified DMP in all regions, but displayed a mixed picture in the case of CGLS-DMP, fAPAR and NDVI.The best combination of BPs from the regression analysis were calibrated with a leave-one-out cross validation.The RMSE (t/ha), the RRMSE (%), MBE (t/ha) and E 1 values calculated in the validation analysis are presented in Figure 10.The modified DMP has lower RMSE values than the other BPs which range from 0.25 t/ha-0.44t/ha for common wheat and 0.68 t/ha-1.41t/ha for silage maize.In general the RRMSE values lowest for the modified DMP.The RMSE values followed a similar trend as RRMSE.The MBE values are between ´0.05 t/ha-0.05t/ha except for the Liège Region (BE), Ain (FR) and Haut Rhin (FR) for all BPs.Overall, higher E 1 values were recorded for modified DMP compared to the other BPs.In addition, E 1 values show that the model is not performant for maize in Belgium and France (except Ain). Figure 11 presents the temporal trends of the actual crop yield and predicted yield of BPs calibrated by the leave-one-out cross validation throughout the study period from 1999-2012.In general, the modified DMP followed a similar trend as observed yield although there are some discrepancies for maize in the Loam Region (BE), Liège Region (BE) and Haut Rhin (FR).

Discussion
The modified DMP, the empirically identified model, is in general more efficient in describing the yield variability for maize and wheat yield across the study sites when different combinations of stress factors are included in the BPs regression analysis together with official statistics.Overall, the modified DMP correlates better with crop yield statistics than the CGLS-DMP, fAPAR and NDVI.On average, R² values increased from 0.49 for CGLS-DMP to 0.68 for the modified DMP, RMSE (t/ha) decreased from 0.84-0.61,RRMSE (%) reduced from 13.1-8.9,MBE (t/ha) decreased from 0.05-0.03and E 1 increased from 0.08-0.28.In general, the cumulative BPs with the highest correlations coincide with periods that include the reproductive stage, which is in agreement with previous studies [14,[99][100][101].These periods are the most critical stages where any water stress may result in reduced crop yields [14].In Canada, Mkhabela et al. [14] reported RMSE values ranging from 104 kg/ha-697 kg/ha from arid zones to sub-humid zones when relating MODIS-NDVI and spring wheat yield.In China, Ren et al. [102] reported a RMSE value of 214 kg/ha when relating MODIS-NDVI and wheat yield.The values for maize in Figure 10 are similar to those recorded by [103] when relating MODIS WDRVI (Wide Dynamic Range Vegetation Index) and maize yield in the US.However, E 1 values demonstrated that for maize the modified DMP model is performant only in Ain, which is also evident for the other BPs.In general, E 1 values for the modified DMP are higher than the CGLS-DMP.Hansen & Indeje [104] reported RMSE values ranging from 0.962 t/ha-1.195t/ha when predicting field-scale maize yields simulated by CERES-maize in Kenya.In agro-ecological zones of Belgium, Klein [105] reported RMSE values fluctuating from 550 kg/ha-1430 kg/ha when relating the simulations of the B-CGMS crop growth model and winter wheat, and from 2150 kg/ha-7730 kg/ha for maize.
The modified DMP includes plant specific parameterisations for C 3 and C 4 plants, introduces potential water limitation and incorporates the impact of different stress factors during the most sensitive cropping periods.According to the results, the inclusion of different stress factors can improve local empirical models.The results indicate that the stress factors play different roles in different climate regimes and for different C 3 and C 4 plants.For instance, the water stress index in CGLS-DMP proved to be a limiting factor, particularly in Morocco.AgroMetShell was used to generate AET estimations for each crop during different stages of the growth period of 1999-2012; this water balance model requires time for collecting the necessary data and computation.An alternative to AgroMetShell is LSA-SAF MSG AET [106] which has been available at the continental scale since 2009 and provides estimations closer to eddy covariance (EC) measurements [107].Since the available years from MSG derived AET are too few for this study, we used AgroMetShell to compute AET.Another stress factor is the CO 2 fertilisation effect which is highly dependent on plant species, soil properties and soil nutrient status [108][109][110].Study sites in Belgium were more responsive to this stress factor compared to the poor soils in Morocco.Irrigation can alleviate the effect of stress factors such as drought and temperature, while fertilisation alleviates plant nutrition stress [111].However, the model approach used in this study did not take into account either irrigation or fertilisation.The autotrophic respiration fraction is overestimated in the CGLS-DMP model.We therefore proposed a new ε AR which followed a similar trend as the findings in [98].For example, densely vegetated areas in Europe have lower ε AR values.However, the values in sparsely vegetated and high altitude areas are different than the values in [98].These areas are beyond the scope of this study and should be disregarded.
The modified DMP could explain more variation in irradiation conditions, short term environmental stresses and respiration costs compared to NDVI and fAPAR.Thus, it could be a more robust choice, although it has several modelling assumptions.Similar to the findings in [112][113][114], Phillips et al. [115] found that NDVI underestimated productivity due to backscatter effects in the lower values and saturation in the higher values.Additionally, the frequency and intensity of extreme weather events could have significant impacts on biomass production and crop yields [116,117].For example, floods and droughts can harm crops, reduce yields and increase crop prices.The amplitude of extreme events can be much larger at regional scales than at global scales [118].For example, the 2003 heat wave affected the local natural environment, society and economy in many parts of Europe.
Van der Velde et al. reported that 2003 maize yield loss in France equalled 1.5 t/ha compared to the 2000-2006 average.This trend can also be seen in the study sites in France within BPs used in this study.The modified DMP is therefore better suited than CGLS-DMP to the extreme event in 2003 (e.g., for Ain, the predicted yield from the BPs calibrated for maize yields are 10 t/ha for modified DMP, 14 t/ha for CGLS-DMP, 10 t/ha for fAPAR and 9 t/ha for NDVI, while the actual yield is 8 t/ha).
Although the proposed method shows promising results in crop yield forecasting for the study sites, there are a number of limitations.Cereal yields depend on the harvest index which is the fraction of the total aboveground biomass allocated to the grains [13].Different yields for the same amount of aboveground biomass can be obtained because the harvest index varies with crop varieties, management practices, and water and nitrogen availability [13], all of which play at the field scale.For this reason, the harvest index was not in the scope of our regional study.A limitation to predicting yield is the amplitude of the inter-annual crop yield variability.López-Lozano et al. [6] showed that the regression analysis performs better when the inter-annual variability is high, which is confirmed in this study: higher performance was seen in Morocco where the inter-annual variability is higher.In addition, many weeds, pests and fungi flourish under warmer temperatures, wetter climates and increased CO 2 levels [116], and affect yields.
We used products derived from SPOT VEGETATION, a sensor with a coarse 1 km spatial resolution, and we applied a mask to these products using the GlobCover 2009 crop map in order to minimize the influence of non-agricultural land cover types.DMP is subsequently calculated per pixel located in cropland assuming wheat (C 3 ) or maize (C 4 ) cover.The two resulting maps could be combined using weights corresponding to the proportion of each crop, provided that the share per crop is known in a particular year.However, this proportion is not available before the end of the season, and it is beyond the scope of this paper to explore the feasibility of such extension.Furthermore, crop-specific maps could minimise the effects of mixed signals through unmixing techniques or by selecting subsamples of purer pixels [119], thus improving the yield estimations [120].The impact of stress is important for estimating dry matter productivity of cereal crops.The research could be extended to include other agricultural regions and arable crops.The incorporation of new generation remote sensing products with higher spatio-temporal resolutions into the DMP model is likely to improve yield estimates.

Conclusions
A modified DMP model was developed to include water stress based on actual evapotranspiration calculated with AgroMetShell and to adapt the existing factors (CO 2 fertilisation effect, temperature stress and autotrophic respiration).The best results were obtained by including different combinations of stress factors for each selected region and cumulating the modified DMP between the flowering and ripening period.A linear regression between the modified DMP and crop yield statistics showed an increased model performance as compared to the CGLS-DMP.On average, for all sites and crops studied, RMSE (t/ha) decreased from 0.84 for CGLS-DMP to 0.61 for modified DMP, RRMSE (%) reduced from 13.1-8.9,MBE (t/ha) decreased from 0.05-0.03and E 1 increased from 0.08-0.28.Although results did not differ much from what can be obtained using simpler BPs such as fAPAR and NDVI, obtaining similar results remains encouraging as the DMP approach may lend itself better to extrapolation in more extreme conditions.The results showed the potential of using the modified DMP for estimating crop yield at the regional scale.A combination of different stress factors produces better local yield estimates, but no single solution to the improvement of a global product could be demonstrated.The inter-annual variability of official yield statistics is low in Belgium and France, and high in Morocco.The latter enabled a better exploration of stress factors.With more accurate crop masks becoming available than the currently used GlobCover 2009, crop yield estimations might even further improve the results.Including appropriate stress factors and their impact during sensitive stages improves dry matter productivity estimates of cereal crops.For improving real RUE estimations, different remote sensing methods will be available such as chlorophyll-related vegetation

Figure 1 .
Figure 1.Study areas with study crops in Belgium, France and Morocco.

Figure 2 .
Figure 2. Cumulative monthly rainfall, PET (potential evapotranspiration), water balance (accumulated monthly water deficit/surplus) (mm) and average monthly temperature ( ˝C) during the 1999-2012 period for study sites in Belgium (a); France (b) and Morocco (c).Error bars show the standard deviation for the meteorological indicators.

Figure 3 .
Figure 3.The temperature functions of ε T used in this study for wheat and for maize.

Figure 5 .
Figure 5. Evolution of ε CO 2 both for CGLS-DMP (Copernicus Global Land Service-Dry Matter Productivity) at the fixed rate of CO 2 value (dots) (a); modified DMP variable CO 2 rates for C 3 (blue to red lines represent the change in temperature from ´20 ˝C-40 ˝C) (b); and C 4 plants (dashed green line) (c); for the years 1999-2012.

Figure 3 Blue curve in Figure 3
Red curve inFigure 3 ε CO2 Blue to red dots in Figure 5a Blue to red lines in Figure 5b Dashed green line in Figure 5c ε AR Figure 7a, range: 0.65-0.85for study regions Figure 7b, range: 0.5-0.7 for study regions 3 and C 4 , and CGLS-DMP versions for different time periods during the growing season.The figure presents a clear difference between three versions of DMP.

Figure 6 .
Figure 6.Comparison of modified DMP for C 3 and C 4 versions with CGLS-DMP for Liège Region (BE).

Figure 8 .
Figure 8. R 2 for a linear regression between official yield statistics and regional cumulative modified DMP computed with different stress factors or combinations.CO 2 : with CO 2 fertilisation effect; H 2 O: with water stress factor; Temp: with temperature stress factor and combinations of these stress factors.The cumulative periods of modified DMP for each study region are presented in dekads.

9 .
Cumulative period of BPs (modified DMP, CGLS-DMP, fAPAR, NDVI), p-value (calculated by Pearson's correlation for r 2 ) and coefficient of determination (R 2 for the linear model) for wheat and maize per region.

Figure 10 .
Figure 10.RMSE (t/ha), RRMSE (%), MBE (t/ha) and E 1 based on the correlation between the calibrated BP (modified DMP, CGLS-DMP, fAPAR, NDVI) and yield statistics per region for wheat and maize using a leave-one-out cross validation.

Figure 11 .
Figure 11.Temporal trends of predicted yield calibrated by leave-one-out cross validation technique confronted with the actual yield for 1999-2012 period per region.

Table 1 .
The soil types and textures of the study sites.

Table 2 .
Crop growing periods for maize and common wheat in the case study sites.

Table 3 .
Summary of unchanged and changed parameters in the DMP equation from CGLS as compared to the modified version.