Assessing the Performance of MODIS NDVI and EVI for Seasonal Crop Yield Forecasting at the Ecodistrict Scale
Abstract: Crop yield forecasting plays a vital role in coping with the challenges of the impacts of climate change on agriculture. Improvements in the timeliness and accuracy of yield forecasting by incorporating near real-time remote sensing data and the use of sophisticated statistical methods can improve our capacity to respond effectively to these challenges. The objectives of this study were (i) to investigate the use of derived vegetation indices for the yield forecasting of spring wheat (Triticum aestivum L.) from the Moderate resolution Imaging Spectroradiometer (MODIS) at the ecodistrict scale across Western Canada with the Integrated Canadian Crop Yield Forecaster (ICCYF); and (ii) to compare the ICCYF-model based forecasts and their accuracy across two spatial scales-the ecodistrict and Census Agricultural Region (CAR), namely in CAR with previously reported ICCYF weak performance. Ecodistricts are areas with distinct climate, soil, landscape and ecological aspects, whereas CARs are census-based/statistically-delineated areas. Agroclimate variables combined respectively with MODIS-NDVI and MODIS-EVI indices were used as inputs for the in-season yield forecasting of spring wheat during the 2000–2010 period. Regression models were built based on a procedure of a leave-one-year-out. The results showed that both agroclimate + MODIS-NDVI and agroclimate + MODIS-EVI performed equally well predicting spring wheat yield at the ECD scale. The mean absolute error percentages (MAPE) of the models selected from both the two data sets ranged from 2% to 33% over the study period. The model efficiency index (MEI) varied between −1.1 and 0.99 and −1.8 and 0.99, respectively for the agroclimate + MODIS-NDVI and agroclimate + MODIS-EVI data sets. Moreover, significant improvement in forecasting skill (with decreasing MAPE of 40% and 5 times increasing MEI, on average) was obtained at the finer, ecodistrict spatial scale, compared to the coarser CAR scale. Forecast models need to consider the distribution of extreme values of predictor variables to improve the selection of remote sensing indices. Our findings indicate that statistical-based forecasting error could be significantly reduced by making use of MODIS-EVI and NDVI indices at different times in the crop growing season and within different sub-regions.
Satellite remote sensing (RS) data offer significant benefits to assessing agricultural yield and production throughout the cropping season due to their spatial coverage, their temporal and spectral resolutions, their availability to users in timely manner and their affordability (free of charge for some data) [1–6]. High temporal and spatial resolutions with sufficient lead time are often required for objective and near-real time crop production estimates over large areas. Although various vegetation indices (VIs) from several satellite sensors are being explored in crop condition monitoring, crop acreage estimates, crop yield forecasting and modeling at regional and national scales, recent studies are mostly focused on data from moderate spatial resolution satellite sensors, i.e., MODIS (Moderate Resolution Imaging Spectroradiometer) for agricultural production assessments (e.g., [7–15]). The high temporal and moderately low spatial resolution, as well as the free availability of MODIS data could partly explain such an interest . The Normalized Difference Vegetation Index (NDVI), based on the contrast between the maximum absorption in the red portion and the maximum reflection in the near-infrared portion of the electromagnetic spectrum has been widely utilized for crop monitoring and agricultural statistics (e.g., [4,6,16]). Comparisons of the MODIS NDVI with the NOAA-AVHRR (National Oceanographic and Atmospheric Administration-Advanced Very High Resolution Radiometer) NDVI temporal profiles over numerous biome types have shown that the MODIS-based index performed with higher fidelity in terms of characterizing the seasonal phenology [17,18]. By including reflectance in the blue band of the electromagnetic spectrum (in addition to the red and near-infrared bands), the Enhanced Vegetation Index (EVI) helps minimizing soil background and atmosphere influences in reflectance data [4,19–21]. Both MODIS NDVI and EVI are commonly used in crop yield forecasting (e.g., [5,6,22–24]). MODIS-derived VIs are also ideal for crop monitoring over fragmented agricultural landscapes (i.e., size of the field close to the size of the pixel ).
Reliable and timely crop yield forecasting is crucial for policy strategies, trade and market opportunities. It is also important for achieving and sustaining global food security . In Canada, grain crop production (i.e., wheat, barley, corn, canola, and soybean) plays a vital role in the economy (for example a total production of around 63.5 million metric tonnes in 2011 and $17 billion of farm cash receipts were recorded ). A recent modeling tool, the Integrated Canadian Crop Yield Forecaster, ICCYF, has been developed for generating in-season crop yield forecasts at regional scale . The tool utilizes historical climate, near-real time RS data and crop field survey in generating probabilistic yield forecasts that are sequentially-updated within the growing season as data becomes available. The ICCYF combines robust statistical techniques for generating in-season yield forecasts well before the end of the growing season and for providing a probability distribution of the forecasted yields at a given spatial unit. Typically, the basic spatial unit considered in crop yield forecasting at regional scales relies on administrative statistical boundaries [7,13,28–30], even when introducing the notion of ecoregion (e.g., [31,32]). Crop yield forecasting at this unit relies on the availability of historical data at these scales. The current crop yield forecasts based on the ICCYF are performed at the Census Agricultural Regions (CARs), which are composed of groups of adjacent census divisions and are used by the Census of Agriculture for disseminating agricultural statistics . However, given the climate variability and the future challenges of the impacts of climate change on agriculture, exploring crop yield forecasting at the ecodistrict (ECD) scale deserves special attention. An ECD is defined as a subdivision of one ecoregion characterized by relatively homogeneous biophysical and climatic conditions . The differentiating characteristics depends on the regional landform, local surface form, permafrost distribution, soil development, textural group, vegetation cover/land use classes, range of annual precipitation, and mean temperature. The improvement in the timeliness and accuracy of crop yield forecasting through the incorporation of near real-time data (both agroclimate and remote sensing) and the use of sophisticated statistical methods should improve our capacity to respond effectively to the future challenges. Running such tools at spatial scales based on common environmental characteristics (i.e., ecological spatial division) is therefore of great interest both from an ecosystem point of view and for a better crop yield forecasting (i.e., aggregation/upscaling of predicted yields).
Furthermore, though the skillful forecasts or model accuracy of the ICCYF for forecasting spring wheat yields reached 90% across the CARs of the Canadian Prairie provinces [Alberta (AB), Saskatchewan (SK) and Manitoba (MB)] , the weak model performance in some regions is still challenging. Indeed, high forecast uncertainties could occur in regions with high historical yield variability and sparse distribution of climate stations. The forecasting tool may fail to capture the changes linked to the impact of persistent drought and flooding conditions, or extreme growing conditions and associated crop management practices [28,35]. Although the choice of the CAR scale is guided by the availability of historical data for the public, this coarse unit normally contains many different soil and climate zones and thus may have different yield response relationships. Exploring the crop yield modeling at the ECD level through the ICCYF will give more insights on how the forecast may be achieved for oriented user groups, as well as a better understanding of the forecast skill at finer scales. In addition, one future development activity of the ICCYF  aims to use data from moderate resolution satellites (i.e., MODIS) for assessing the crop vegetation status and to derive VIs.
Therefore, the objectives of this study were: (i) to investigate the use of other RS indices (MODIS NDVI and EVI) as alternative predictors for forecasting spring wheat yields within the forecast model framework (ICCYF) at the ECD scale; (ii) to compare the forecast model performance at the ECD and CAR scales (ecological spatial unit versus statistical spatial unit), especially in CARs with poorer ICCYF performance as pointed by Newlands et al. ; and (iii) to understand when to use one RS index over another in a given region and to understand the dynamics during the growing season. Although several studies have related RS data and/or agroclimate indices to spring wheat on the Canadian Prairies, no such studies have been conducted to relate those indices to crop yield at the ECD scale. Thus, our study takes advantage of historical yield data at the ECD scale across the agricultural landscape of western Canada. The findings should help in improving the forecast skill of the ICCYF, namely in regions with high forecast uncertainties and/or high historical yield variability.
2. Materials and Methods
2.1. Overview of the Integrated Canadian Crop Yield Forecaster (ICCYF)
The ICCYF is a modeling tool for crop yield forecasting and risk analysis based on the integration of geospatial and statistical data within a Geographic Information System, developed by Agriculture and Agri-Food Canada (AAFC). The main features [28,36] of the ICCYF are: (1) the integration of a physical based soil moisture model to generate climate based predictors and satellite derived information; (2) an automatic ranking and selection of best predictors using Robust Least Angle Regression Scheme (RLARS) and Leave-One-Out-Cross-Validation (LOOCV) scheme at run time, as well as a spatial correlation analysis among the neighboring spatial units; (3) a Bayesian method for sequential forecasting, i.e., estimation of the prior and posterior distributions of model predictors through a Markov Chain Monte Carlo (MCMC) scheme, and random forest-tree machine learning techniques to select the best predictors of unobserved variables at the time of forecast.
The crop yield modeling process using the ICCYF is depicted in Figure 1. To set up a yield model at the ECD level, the potential predictors (agroclimate and remote sensing vegetation indices) were put into a RLARS [37,38] to evaluate and rank the variables that contribute the majority of the variance in the predicted yield. A maximum number of predictors was set based on the sample size of the model building data (set at two in our analysis). The RLARS was applied to account for heteroscedasticity and outliers in the historical data (i.e., model training/calibration). The selected predictors were then subjected to a Robust Cross Validation (RCV) scheme  to further stabilized the model by eliminating any false predictors selected from contaminated data . Then, the Bayesian statistical approach as described by Bornn and Zidek  for the spatial correlation analysis among the neighboring spatial units was applied. Historical data of both forecasting ECD and statistically selected neighboring ECDs were used to establish the prior distribution of the predictors. The posterior distribution of the predictors was obtained using the MCMC scheme . For the in-season forecasting, the random forest-tree machine learning technique  was used to estimate the required unobserved variables. The estimated variables and the variables observed at near real time were finally used as input into the selected yield model to forecast the yield probability distribution for each ECD. The 10th percentile (worst 10%), the 50th percentile (median) and the 90th percentile (best 10%) were output as the probability measures [28,36]. A detailed description of the modeling methodology can be found in Newlands et al.  and Chipanshi et al. .
The final model at the spatial unit considered is a multivariate regression equation, written as follows:
2.2.1. Agroclimate Data
The study region encompassed the three Canadian Prairie Provinces AB, SK and MB (Figure 2), which contains the majority of the agriculture land in Canada. These three provinces account for about 85% of Canada’s arable land .
The agroclimate variables (i.e., soil water availability, SWA, and crop water deficit index, WDI) were calculated using the Versatile Soil Moisture Budget (VSMB) model at a daily time step  and based on the daily temperatures and precipitation, soil physical parameters (obtained from the Canadian Soil Information Service ), and spring wheat information. Daily temperatures and precipitation from a total of 330 climate stations (Figure 2) were provided by Environment Canada and other partner institutions through the Drought Watch activity  operated at the National Agroclimate Information Service of Agriculture and Agri-Food Canada (NAIS-AAFC). Also included as agroclimate variable was the growing degree days (GDD) above a base temperature of 5 °C (calculated as the mean daily temperature above a certain threshold base temperature accumulated on a daily basis over a period of time). A detailed description of the calculation of SWA and WDI can be found in Newlands et al. . These agroclimate variables were summed (i.e., GDD) or averaged (i.e., SWA and WDI) for each month from May to August during each growing season. They were then spatially averaged across all stations within a given ECD with equal weighting. For ECDs with no climate data, stations from neighboring ECDs were used.
2.2.2. MODIS-Derived NDVI and EVI Indices
The 16-day MODIS NDVI and EVI composites for Canada south of 60 degree (2000–2010 period) come from the NASA Land Processes Distributed Active Archive Center’s MOD13Q1 products . These products are computed from atmospherically corrected bi-directional surface reflectances that have been masked for water, clouds, heavy aerosols, and cloud shadows . A mask of the Canadian agricultural land  was used to process the composites at the ECD level. For our purpose the average value of pixels was kept as the VI value for a given ECD. For all years, the period of MODIS data coverage spanned from the day of year 129 (approximately 9 May) to day of year 273 (approximately 30 September). NDVI is calculated from reflectances in the red and near-infrared (NIR) portions of the spectrum. EVI incorporates reflectance in the blue portion of the spectrum in addition to the red and NIR. The equations are given as follows [19–21]:
The distribution of MODIS-derived vegetation indices when polled across ECDs and years (Figure 3) showed that several values were considered as extreme (outliers) during the mid-season (16-day periods 22–30 corresponding to approximately June, July, and beginning of August) for NDVI values, and early in the growing season (i.e., May) and the mid-season (i.e., July) for the EVI values. This suggests the sensitivity of MODIS VIs during the beginning of the growing season (case of EVI) and during the mid-season (both the VIs).
2.2.3. Crop Yield Data
Historical spring wheat yields (all cultivars included) at the ECD scale were provided by Statistics Canada. This yield data for spring wheat was part of a larger historical crop yield federal government database spanning 1992–2010 for Canada’s 38 crops and agricultural area and was intended for AAFC research on crop production and environmental modeling. The estimated yield data were generated from the Field Crop Reporting Series (FCRS). FCRS is a series of six data collection activities using surveys and carried out in order to get accurate and timely estimates of seeding intentions, seeded and harvested area, production, yield and farm stocks of the principal field crops in Canada at the provincial level. Data collected are quality-controlled and compared to previous estimates and other sources when available in order to reduce errors associated to such sample surveys . The official yield estimates of the year is derived from the the November Survey. Only yield data of the 2000–2010 period were included in our study in order to coincide with the MODIS data period (available from 2000 onwards).
2.3. Crop Yield Modeling and Assessment of Model Performance
The modeling approach is similar to that described by Newlands et al. . The main differences rely on the spatial unit considered (i.e., ECD) and the RS VIs used. The ICCYF was adapted to fit the requirements of the statistical processes involved in the modeling for a 11-year period of input data (i.e., the number of predictors, the relative convergence tolerance for the fully iterated best model candidates, the number of iterations for the model convergence, etc.). Here we tested the EVI in addition to the NDVI. Agroclimate and MODIS-derived VIs were used as predictors of yield forecasting at each ECD level. Therefore, two input data sets were involved: (i) agroclimate plus NDVI indices; and (ii) agroclimate plus EVI indices. The average value of two consecutive 16-day periods of MODIS NDVI/EVI were used. For testing the robustness of the modeling at the ECD scale, the leave-one-year-out cross-validation as performed in Mkhabela et al.  and Chipanshi et al. was achieved in this study. For each year of the study period, historical data excluding this year (considered as the forecast year) were used as training data for the yield model at each spatial unit. The forecast was then performed using the data of the forecast year during the growing season (i.e., July, August, and September). The forecasted yield was generated using the observed data from the start of the growing season until the last day of the previous month and the inputs estimated by the tool for the remainder of the growing season.
In order to compare the forecast skills within CARs with poorest ICCYF performance (i.e., CAR 4607, 4609 and 4610, Figure 2), additional runs were performed at the CAR scale over the same period using the input data set as that of . This input data set included historical spring wheat yields, agroclimate variables (i.e., GDD, SWA, WDI), and NOAA-AVHRR NDVI at CAR scale across the Canadian Prairies. CARs with poorest ICCYF performance were previously determined in Newlands et al. . Although RS VIs are already achieved within the ICCYF tool, the use of MODIS-derived VIs aims to investigate their potential for the future development activities. Note that only ECDs with crop land were used in our analysis (207 in this case).
Statistical indicators (i.e., root mean square error-RMSE, mean absolute percentage error-MAPE, and model efficiency index-MEI) were used to quantify the performance of the regression-based models at the ECD scale. The RMSE gives the weighted variations in errors (residual) between the predicted and observed yields. It was calculated as:
The MAPE is an accuracy measure of the forecast quality and was calculated as:
The MEI could be considered as a measure of model skill . The closer to 1 the MEI is, the more skillful the model. MEI was calculated as:
The ICCYF runs and statistical analyses were performed using R . The geospatial analyses and mapping were performed using ArcGis 10.1 . The mapping of the model performance throughout the paper was based on that crop land extent map.
3. Results and Discussion
3.1. Selected Predictors at the Ecodistrict Scale
The number of predictors selected by ECD differed from one ECD to another across the study region and according to the input data set used. MODIS-derived indices over the period weeks of the year 24 to 28 (NDVI(EVI)_26_28 and NDVI(EVI)_24_26) were among the predominant predictors (Table 1) during the 11-year period. This recurrence of July-related MODIS indices suggests that the crop status in mid-season is determining in the final spring wheat yield across the Western Canadian Prairies. These results are in agreement with previous studies which have reported that the optimal time for obtaining NDVI information to be related with final yield of spring-seeded crops including spring wheat for the Canadian Prairies was July . Other predominant predictors included agroclimate indices GDD_6 (total GDD of June) and P_6 and P_7 (sums of precipitation in June and July, respectively). In addition, the number of precipitation and temperature related predictors, namely in June and July, reveals that these meteorological factors are important for the final yield of spring wheat.
3.2. Overall Comparison of Yield Model Performance at the Ecodistrict Scale
The comparisons of model performance indicators showed that the results were quite similar when using a data set including either agroclimate and NDVI indices or agroclimate and EVI indices (Figure 4). Overall for models based on agroclimate plus MODIS-NDVI indices, RMSE values ranged from 43 to 957 kg·ha−1 over the 2000–2010 period. The ranges of the MEI and MAPE were −1.1 to 0.99, and 2% to 33%, respectively, for the same period. Satisfactory results were obtained with the coefficient of determination of yield models (median values equaled 0.63). The same ranges of performance indicators were obtained in case of models based on agroclimate indices plus MODIS-EVI indices, with differences occurring in the low values of MEI (i.e., MEI = −1.8) and high values of RMSE (i.e., RMSE = 975 kg·ha−1). High ranges of performance indicators suggest that the models did not capture well the extreme input values. Modeling crop yields by taking into account these extreme values could help improving the model forecast skill. Furthermore, it is worth to note that the MODIS VI is representative of all crops in the area and will be highly influenced by the dominant crops. The consideration of crop specific mask to derive VIs will probably result in improved crop yield forecast models.
The overall spatial pattern of model performance for each forecast year (2000–2010) was similar to that presented with the 2010 yield forecast (Figure 5). The spatial distribution of the ICCYF performance showed that the model performance did vary across the agricultural regions of the Canadian Prairies. ECDs with relatively high MAPE (≥15%, Figure 5) were located in central and central Alberta, central Saskatchewan, and western Manitoba, with the best model performance being across Saskatchewan (Figure 5). These agricultural regions are generally located in the semi-arid and arid climatic regions. Mkhabela et al.  found best prediction models for spring wheat in the semi-arid and arid zone when relating the CAR-scale crop yield to MODIS-NDVI data. This result has implications in climate change studies. The province of Saskatchewan encompass the three major climatic regions (i.e., sub-humid, semi-arid, and arid) of the Canadian Prairies . As a continuing effort to integrate future climate scenarios data within the ICCYF, and hence investigate the climate variability impacts on agriculture across Canadian agricultural landscape, future analyses could be focused on this province.
MODIS NDVI and EVI performed equally well predicting spring wheat yield at the ECD scale. Bolton and Friedl  found that for predicting maize yields MODIS-EVI (coupled to other crop phenology metrics) provided relatively better results than MODIS-NDVI (R2 = 0.58 and 0.53 for EVI-based and NDVI-based regression models, respectively); but they had the same performance in predicting soybean yield. No crop specific mask was used in our study for retrieving the MODIS VIs. Using crop specific masks is part of the ongoing research and should provide additional information on crop phenology, useful for deriving potential predictors (i.e., phenology-based metrics) and for improving the model forecast skill. Further research also includes the use of additional RS indices such as the fraction of absorbed photosynthetically active radiation (FAPAR). The ICCYF runs at ECD scale could also be investigated with different crops such as oilseeds, corn and barley.
3.3. In-Season Crop Yield Forecasting Based on the Two Input Data Sets
The yield forecasts of spring wheat were performed for the months of July, August and September for the 11-year period of the study. Generally, high correlations were obtained for each month of forecast for both input data set used for the modeling: R values greater or equal 0.60, except in 2002 (Tables 2 and 3). Regarding the models based on agroclimate plus MODIS-NDVI indices (Table 2), RMSE values ranged from 388 to 940 kg·ha−1, 402 to 911 kg·ha−1, and 403 to 958 kg·ha−1 for July, August, and September forecasts. In case of models based on on agroclimate plus MODIS-NDVI indices (Table 3), the ranges of RMSE values were 361 to 925 kg·ha−1, 390 to 906 kg·ha−1, and 395 to 952 kg·ha−1 for July, August, and September forecasts. Relatively good ICCYF performance was obtained in July, compared with August and September forecasts. This trend was also observed when performing the forecast of spring wheat yield at CAR scale . Ordinarily the forecasted yield gets closer to the actual yield towards the end of the growing season (as more data become available) for a given spatial unit of modeling. However, through the current ICCYF selection algorithm, a yield model may only contain predictors at a few critical stages (e.g., predictors based on observations before July). Thus, predictors based on observations of following months, when becoming available, will have no effect on the forecasted yield. The forecast error and reliability range remain the same for those spatial units. For instance, a great number of models based on July-related predictors and having worst performance will lessen the overall performance of the tool across the study agricultural region. Future research will include assessing how the lead time (time window for which potential predictors, namely RS data, are available before a forecast needs to be/is made) affects the forecast accuracy.
The comparison of July, August and September forecasts for all ecodistricts on the study region showed noticeable biases (expressed through the mean bias error, MBE) between the observed and predicted yields in 2000 (yield underestimation) and 2002 (yield overestimation) for models based on both two data sets (Tables 2 and 3). In 2002, many North American regions, including the Canadian Prairies, experienced severe drought conditions . Based on our historical input data set, the ICCYF tool failed to capture such extreme conditions. In its current version, extreme value distribution is not taken into account in the forecast models, though the shortness of the study period did not enable more extreme conditions neither. Future works related to the fine tuning of statistical algorithms involved in the ICCYF should help integrating the impacts of extreme conditions.
3.4. Ecodistrict-Scale Forecast Results in Census Agricultural Region with Poorest ICCYF Performance
The ICCYF runs at CAR scale based on the 11-year data period showed the general trend as found in Newlands et al. : CARs with poorest model performance were located in south-eastern Manitoba (i.e., CARs 4607, 4609 and 4610). A general trend of the spatial distribution of model performance at CAR scale is depicted in Figure 5C. The coefficient of variation (CV), calculated as the ratio of the RMSE to the mean of the dependent variable (i.e., yield), in these CARs was at least greater than 15% over the entire study period. Comparing the ICCYF performance at a finer scale (here the ECD scale) with the CAR scale was very informative. The variability of the model performance within a given CAR was highlighted at the ecological scale. For example significant range of CVs occurred within some CARs (e.g., CARs in the province of Alberta, Figure 5A and B). However, there were some CARs, i.e., south-eastern of Saskatchewan, where the trend at ECD scale was the same at CAR scale (CV ≤ 10%, Figure 5).
In this study we emphasized the yield forecasting of spring wheat at the ecodistrict scale through an integrated crop yield forecasting approach (i.e., the ICCYF) because of the cohesiveness of the modelling unit in terms of climate and soils. The results related to the three CARs with weak model performance (as mentioned previously) showed that this performance was improved in most cases at the ECD scale: higher average MEI values and low average MAPE values (Figure 6). Overall, in CARs with poorer ICCYF performance as reported in previous study, the model performance was improved at that finer and ecological scale, compared to larger statistical unit scale (MAPEs reduced by 40% and MEI increased by five, on average). Downscaling the yield forecasting approach at such specific ecological scale helps in understanding the yield variability within a given statistical unit and improve the forecast skills. Among the three CARs involved in the study, only the CAR 4607 presented ECDs with model performance that were slightly improved. Indeed, ECD 841, 844, 846, 849, and 851 had weak or similar average MAPE (depending on the input data set considered) than that obtained at CAR. Whereas only ECD 851 had a model skill (MEI) less than the CAR-scale one. Although all the CARs with worst ICCYF selected in our study were located in a sub-humid climatic regions, ECDs with relatively low MAPE values belong mainly to the Lake Manitoba plain ecoregion . This ecoregion is characterized as the warmest and most humid in the Canadian Prairies with annual mean precipitation ranging from 450 to 700 mm . A negative relationship has been reported between agricultural productivity and MODISNDVI for a maximum threshold of 600 mm in Western Australia . The worst ICCYF performance in the above mentioned ECDS could support this negative relationship between spring wheat yield and MODIS-derived VIs (though the predictors also included agroclimate variables in our case). The weak performance of the ICCYF in CARs across south-eastern Manitoba could be attributed to the soil surface water capacity. Taking into account RS indices that capture the spatio-temporal patterns in surface moisture status (e.g., Normalized Difference Water Index, NDWI , or CERES-Photosynthetically Absorbed Radiation ) will be explored in future works. Furthermore, exploring the forecast skill of existing approach at ecological scale gives an opportunity for more insights on the impacts of climate variability on crop yield and production.
Remote sensing vegetation indices coupled to agroclimate variables are increasingly being used for crop monitoring and crop yield forecasting at regional scales. This study aimed at assessing spring wheat yield forecasting at the ecodistrict scale through an integrated crop yield forecasting approach (i.e., the ICCYF) because of the cohesiveness of the modelling unit in terms of climate and soils. The MODIS NDVI and EVI were respectively coupled to the same agroclimate indices and the performance of the models selected on the basis of these two input data sets was compared across three Western Canadian provinces over a 11-year period (2000–2010). Overall, MODIS NDVI and EVI performed equally well predicting spring wheat yield at the ECD scale. Our analysis also showed that in coarser statistical units (i.e., CAR) with poorer ICCYF performance as reported in previous study, the model performance was improved at that finer and ecological scale. Our findings indicate that statistical-based forecasting error could be significantly reduced by making use of MODIS-EVI and NDVI indices at different times in the crop growing season and within different sub-regions. Downscaling the yield forecasting approach at such specific ecological scale helps in understanding the yield variability within a given statistical unit and improve the forecast skills. Furthermore, exploring the forecast skill of existing approach at ecological scale gives an opportunity for more insights on the impacts of climate variability on crop yield and production.
This research was funded by the research project on “Integrated crop yield forecasting across Canada’s agricultural landscape”, co-led by A. Chipanshi and N.K. Newlands. The first author (Louis Kouadio) was funded by a NSERC (Natural Sciences and Engineering Research Council of Canada) Visiting Post-doctoral Fellowship. We thank Zhong Yao for processing the climate data. The authors would also like to thank the many individuals involved in the Integrated Canadian Crop Yield Forecasting Across the Agriculture Landscape of Canada project. The authors acknowledge and thank Statistics Canada for providing the spring wheat yield data at the ecodistrict scale used in our analysis. This yield data for spring wheat is part of a larger historical crop yield federal government database spanning 1992–2010 for Canada’s 38 crops and agricultural area, and was provided/shared to Agriculture and Agri-Food Canada (AAFC). We thank the anonymous reviewers for their constructive comments.
Louis Kouadio and Nathaniel K. Newlands are the principal authors of this manuscript having written the majority of the manuscript and performed the analyses. The remote sensing data at the ecodistrict scale across the agricultural landscape of Canada were processed by Andrew Davidson. Aston Chipanshi, Yinsuo Zhang, and Andrew Davidson have reviewed this manuscript and provided comments and suggestions.
Conflicts of Interest
The authors declare no conflict of interest.
- Maas, S.J. Use of remotely-sensed information in agricultural crop growth models. Ecol. Model 1988, 41, 247–268. [Google Scholar]
- Boken, V.K.; Shaykewich, C.F. Improving an operational wheat yield model using phenological phase-based Normalized Difference Vegetation Index. Int. J. Remote Sens 2002, 23, 4155–4168. [Google Scholar]
- Reichert, G.; Caissy, D. Reliable Crop Condition Assessment Program (CCAP) incorporating NOAA AVHRR data, a geographical information system, and the Internet. In Proceedings of the Environmental Systems Research Institute (ESRI) User Conference, San Diego, CA, USA, 8–12 July 2002.
- Hatfield, J.L.; Prueger, J.H. Value of using different vegetative indices to quantify agricultural crop characteristics at different growth stages under varying management practices. Remote Sens 2010, 2, 562–578. [Google Scholar]
- Atzberger, C. Advances in remote sensing of agriculture: Context description, existing operational monitoring systems and major information needs. Remote Sens 2013, 5, 949–981. [Google Scholar]
- Basso, B.; Cammarano, D.; Carfagna, E. Review of crop yield forecasting methods and early warning systems. In Proceedings of the First Meeting of the Scientific Advisory Committee of the Global Strategy to Improve Agricultural and Rural Statistics, FAO Headquarters, Rome, Italy, 18–19 July 2013.
- Doraiswamy, P.C.; Hatfield, J.L.; Jackson, T.J.; Akhmedov, B.; Prueger, J.; Stern, A. Crop condition and yield simulations using Landsat and MODIS. Remote Sens. Environ 2004, 92, 548–559. [Google Scholar]
- Becker-Reshef, I.; Vermote, E.; Lindeman, M.; Justice, C. A generalized regression-based model for forecasting winter wheat yields in Kansas and Ukraine using MODIS data. Remote Sens. Environ 2010, 114, 1312–1323. [Google Scholar]
- Potgieter, A.; Apan, A.; Hammer, G.; Dunn, P. Estimating winter crop area across seasons and regions using time-sequential MODIS imagery. Int. J. Remote Sens 2011, 32, 4281–4310. [Google Scholar]
- Mkhabela, M.S.; Bullock, P.; Raj, S.; Wang, S.; Yang, Y. Crop yield forecasting on the Canadian Prairies using MODIS NDVI data. Agric. Forest Meteorol 2011, 151, 385–393. [Google Scholar]
- Kouadio, L.; Duveiller, G.; Djaby, B.; El Jarroudi, M.; Defourny, P.; Tychon, B. Estimating regional wheat yield from the shape of decreasing curves of green area index temporal profiles retrieved from MODIS data. Int. J. Appl. Earth Obs. Geoinf 2012, 18, 111–118. [Google Scholar]
- Vintrou, E.; Desbrosse, A.; Bégué, A.; Traoré, S.; Baron, C.; Seen, D.L. Crop area mapping in West Africa using landscape stratification of MODIS time series and comparison with existing global land products. Int. J. Appl. Earth Obs. Geoinf 2012, 14, 83–93. [Google Scholar]
- Johnson, D.M. An assessment of pre- and within-season remotely sensed variables for forecasting corn and soybean yields in the United States. Remote Sens. Environ 2014, 141, 116–128. [Google Scholar]
- Mosleh, M.; Hassan, Q. Development of a remote sensing-based “Boro” rice mapping system. Remote Sens 2014, 6, 1938–1953. [Google Scholar]
- Whitcraft, A.K.; Becker-Reshef, I.; Justice, C.O. Agricultural growing season calendars derived from MODIS surface reflectance. Int. J. Dig. Earth 2014. [Google Scholar] [CrossRef]
- Benedetti, R.; Rossini, P. On the use of NDVI profiles as a tool for agricultural statistics: The case study of wheat yield estimate and forecast in Emilia Romagna. Remote Sens. Environ 1993, 45, 311–326. [Google Scholar]
- Huete, A.; Didan, K.; Miura, T.; Rodriguez, E.P.; Gao, X.; Ferreira, L.G. Overview of the radiometric and biophysical performance of the MODIS vegetation indices. Remote Sens. Environ 2002, 83, 195–213. [Google Scholar]
- Fensholt, R.; Sandholt, I. Evaluation of MODIS and NOAA AVHRR vegetation indices with in situ measurements in a semi-arid environment. Int. J. Remote Sens 2005, 26, 2561–2594. [Google Scholar]
- Huete, A.; Justice, C.; Liu, H. Development of vegetation and soil indices for MODIS-EOS. Remote Sens. Environ 1994, 49, 224–234. [Google Scholar]
- Jiang, Z.; Huete, A.R.; Didan, K.; Miura, T. Development of a two-band enhanced vegetation index without a blue band. Remote Sens. Environ 2008, 112, 3833–3845. [Google Scholar]
- Rocha, A.V.; Shaver, G.R. Advantages of a two band EVI calculated from solar and photosynthetically active radiation fluxes. Agric. For. Meteorol 2009, 149, 1560–1563. [Google Scholar]
- Doraiswamy, P.C.; Sinclair, T.R.; Hollinger, S.; Akhmedov, B.; Stern, A.; Prueger, J. Application of MODIS derived parameters for regional crop yield assessment. Remote Sens. Environ 2005, 97, 192–202. [Google Scholar]
- Potgieter, A.; Apan, A.; Dunn, P.; Hammer, G. Estimating crop area using seasonal time series of Enhanced Vegetation Index from MODIS satellite imagery. Aust. J. Agric. Res 2007, 58, 316–325. [Google Scholar]
- Bernardes, T.; Moreira, M.A.; Adami, M.; Giarolla, A.; Rudorff, B.F.T. Monitoring biennial bearing effect on coffee yield using MODIS remote sensing imagery. Remote Sens 2012, 4, 2492–2509. [Google Scholar]
- Duveiller, G.; Baret, F.; Defourny, P. Remotely sensed green area index for winter wheat crop monitoring: 10-Year assessment at regional scale over a fragmented landscape. Agric. Forest Meteorol 2012, 1, 156–168. [Google Scholar]
- Kouadio, L.; Newlands, N. Data hungry models in a food hungry world—An interdisciplinary challenge bridged by statistics. In Statistics in Action: A Canadian Outlook; Lawless, J.F., Ed.; CRC Press (Taylor & Francis Group): New York, NY, USA, 2014; pp. 371–385. [Google Scholar]
- Estimated Areas, Yield, Production and Average Farm Price of Principal Field Crops, in Metric Units, Annual. Avaliable online: http://www.statcan.gc.ca/pub/22-007-x/2012004/related-connexes-eng.htm (accessed on 19 June 2013).
- Newlands, N.K.; Zamar, D.S.; Kouadio, L.A.; Zhang, Y.; Chipanshi, A.; Potgieter, A.; Toure, S.; Hill, H.S. An integrated, probabilistic model for improved seasonal forecasting of agricultural crop yield under environmental uncertainty. Front. Environ. Sci 2014, 2. [Google Scholar] [CrossRef]
- Supit, I. Predicting national wheat yields using a crop simulation and trend models. Agric. For. Meteorol 1997, 88, 199–214. [Google Scholar]
- Wu, B.; Meng, J.; Li, Q.; Yan, N.; Du, X.; Zhang, M. Remote sensing-based global crop monitoring: Experiences with China’s CropWatch system. Int. J. Dig. Earth 2014, 7, 113–137. [Google Scholar]
- Mkhabela, M.S.; Mkhabela, M.S.; Mashinini, N.N. Early maize yield forecasting in the four agro-ecological regions of Swaziland using NDVI data derived from NOAA’s-AVHRR. Agric. For. Meteorol 2005, 129, 1–9. [Google Scholar]
- Bolton, D.K.; Friedl, M.A. Forecasting crop yield using remotely sensed vegetation indices and crop phenology metrics. Agric. For. Meteorol 2013, 173, 74–84. [Google Scholar]
- Census Agricultural Regions Boundary Files for the 2006 Census of Agriculture-Reference Guide. Available online: http://www5.statcan.gc.ca/olc-cel/olc.action?objId=92-174-G&objType=2&lang=en&limit=0 (accessed on 16 May 2007).
- A National Ecological Framework for Canada. Report and National Map at 1:7,500,000 Scale. Avaliable online: http://sis.agr.gc.ca/cansis/publications/ecostrat/index.html (accessed on 31 May 2013).
- Chipanshi, A.C.; Warren, R.T.; L’Heureux, J.; Waldner, D.; McLean, H.; Qi, D. Use of the National Drought Model (NDM) in monitoring selected agroclimatic risks across the agricultural landscape of Canada. Atmos.–Ocean 2013, 51, 471–488. [Google Scholar]
- Chipanshi, A.; Zhang, Y.; Kouadio, L.; Newlands, N.; Davidson, A.; Hill, H.; Warren, R.; Qian, B.; Daneshfar, B.; Bedard, F.; et al. Evaluation of the Integrated Canadian Crop Yield Forecaster (ICCYF) model for in-season prediction of crop yield across the Canadian agricultural landscape. In Agric. For. Meteorol; 2014; under review. [Google Scholar]
- Efron, B.; Hastie, T.; Johnstone, I.; Tibshirani, R. Least angle regression. Ann. Stat 2004, 32, 407–499. [Google Scholar]
- Khan, J.A.; Van Aelst, S.; Zamar, R.H. Robust linear model selection based on least angle regression. J. Am. Statist. Assoc 2007, 102, 1289–1299. [Google Scholar]
- Khan, J.; A.S.V.; Zamar, R. Fast robust estimation of prediction error based on resampling. Comput. Stat. Data An 2010, 54, 3121–3130. [Google Scholar]
- Bornn, L.; Zidek, J.V. Efficient stabilization of crop yield prediction in the Canadian Prairies. Agric. For. Meteorol 2012, 153, 223–232. [Google Scholar]
- Dowd, M. A sequential Monte Carlo approach for marine ecological prediction. Environmetrics 2006, 17, 435–455. [Google Scholar]
- Liaw, A.; Wiener, M. Classification and regression by randomForest. R. News 2002, 2, 18–22. [Google Scholar]
- Campbell, C.; Zentner, R.; Gameda, S.; Blomert, B.; Wall, D. Production of annual crops on the Canadian prairies: Trends during 1976–1998. Can. J. Soil Sci 2002, 82, 45–57. [Google Scholar]
- Baier, W.; Robertson, G.W. Soil moisture modelling—Conception and evolution of the VSMB. Can. J. Soil Sci 1996, 76, 251–261. [Google Scholar]
- Canadian Soil Information Service. Available online: http://sis.agr.gc.ca/cansis/nsdb/index.html (accessed on 11 July 2014).
- Drought Watch, Agriculture and Agri-Food Canada. Available online: http://www.agr.gc.ca/pfra/drought/index_e.htm (accessed on 11 July 2014).
- NASA Land Processes Distributed Active Archive Center. Available online: https://lpdaac.usgs.gov (accessed on 15 August 2014).
- Land Cover for Agricultural Regions of Canada, Circa 2000. Available online: http://data.gc.ca/data/en/dataset/16d2f828-96bb-468d-9b7d-1307c81e17b8 (accessed on 11 July 2014).
- Definitions, Data Sources and Methods of Field Crop Reporting Series. Available online: http://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&SDDS=3401 (accessed on 3 October 2014).
- Murphy, A.H. What is a good forecast? An essay on the nature of goodness in weather forecasting. Weather Forecast 1993, 8, 281–293. [Google Scholar]
- A Language and Environment for Statistical Computing. Available online: http://cran.case.edu/web/packages/dplR/vignettes/timeseries-dplR.pdf (accessed on 25 April 2014).
- Arcgis Desktop: Release 10. Available online: http://www.esri.com/software/arcgis/arcgis-for-desktop (accessed on 20 October 2013).
- Basnyat, P.; McConkey, B.; Lafond, G.P.; Moulin, A.; Pelcat, Y. Optimal time for remote sensing to relate to crop grain yield on the Canadian prairies. Can. J. Plant Sci 2004, 84, 97–103. [Google Scholar]
- Shorthouse, J.D. Ecoregions of Canada’s prairie grasslands. In Arthropods of Canadian Grasslands: Ecology and Interactions in Grassland Habitats; Shorthouse, J.D., Floate, K.D., Eds.; Biological Survey of Canada: Ottawa, ON, Canada, 2010; pp. 53–81. [Google Scholar]
- Hill, M.J.; Donald, G.E. Estimating spatio-temporal patterns of agricultural productivity in fragmented landscapes using AVHRR NDVI time series. Remote Sens. Environ 2003, 84, 367–384. [Google Scholar]
- Gao, B.C. NDWI—A normalized difference water index for remote sensing of vegetation liquid water from space. Remote Sens. Environ 1996, 58, 257–266. [Google Scholar]
- NASA Clouds and the Earth’s Radiant Energy System (CERES). Available online: http://ceres.larc.nasa.gov/index.php (accessed on 15 August 2014).
|Year||AgMet + NDVI||AgMet + EVI|
|2000||GDD_6; NDVI_26–28; P_8||GDD_6; P_6; EVI_26–28;|
|GDD_5; P_6||EVI_24–26; EVI_28–30|
|2001||GDD_6; NDVI_26–28; P_6;||GDD_6; EVI_26–28; P_6;|
|NDVI_24–26; WDI_7||EVI_24–26; EVI_28–30|
|2002||GDD_8; P_7 ; P_8||GDD_6; P_6; EVI_26–28;|
|NDVI_22–24; SWA_5||P_8; EVI_24–26|
|2003||GDD_6; WDI_7; P_7 ;||GDD_6; EVI_26–28; P_6;|
|NDVI_26–28; SWA_7||EVI_24–26; P_7|
|2004||GDD_6; NDVI_26–28; NDVI_24–26;||GDD_6; P_6; EVI_26–28; EVI_24–26;|
|P_7; WDI_5||P_6; P_7|
|2005||GDD_6; NDVI_26–28; P_7;||GDD_6; P_6; EVI_26–28; EVI_24–26;|
|NDVI_24–26; NDVI_34–36||P_7; P_6|
|2006||GDD_6; NDVI_26–28; NDVI_22–24||GDD_6; EVI_26–28; P_7|
|P_7; P_8||EVI_24–26; P_6|
|2007||GDD_6; NDVI_22–24; NDVI_24–26;||GDD_6; EVI_26–28; EVI_24–26;|
|NDVI_26–28; SWA_8||P_7; P_8|
|2008||GDD_6; NDVI_26–28; P_7||GDD_6; EVI_24–26; EVI_26–28;|
|NDVI_24–26; P_6||P_7; P_8|
|2009||GDD_6; GDD_7; P_7;||GDD_6; P_7; EVI_24–26;|
|NDVI_26–28; NDVI_24–26||EVI_26–28; GDD_7|
|2010||NDVI_26–28; GDD_6; NDVI_24–26;||GDD_6; EVI_26–28; EVI_24–26;|
|P_7; NDVI_28–30||EVI_28–30; P_7|
Agroclimate variables include growing degree days (GDD), soil water availability (SWA), precipitation (P), and crop water deficit index (WDI). Numbers following these indices, i.e., 5, 6, 7, and 8, stand for May, June, July and August, respectively; The average of two consecutive 16-day periods of MODIS NDVI/EVI were considered. The number after NDVI/EVI refer to the periods used for the calculation within the year.
aCoefficient of correlation;bMean bias error (kg·ha−1); MBE is the average difference between the predicted and observed yields;cRoot mean square error (kg·ha−1).
aCoefficient of correlation;bMean bias error (kg·ha−1);cRoot mean square error (kg·ha−1).
© 2014 by the authors; licensee MDPI, Basel, Switzerland This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).
Kouadio, L.; Newlands, N.K.; Davidson, A.; Zhang, Y.; Chipanshi, A. Assessing the Performance of MODIS NDVI and EVI for Seasonal Crop Yield Forecasting at the Ecodistrict Scale. Remote Sens. 2014, 6, 10193-10214. https://doi.org/10.3390/rs61010193
Kouadio L, Newlands NK, Davidson A, Zhang Y, Chipanshi A. Assessing the Performance of MODIS NDVI and EVI for Seasonal Crop Yield Forecasting at the Ecodistrict Scale. Remote Sensing. 2014; 6(10):10193-10214. https://doi.org/10.3390/rs61010193Chicago/Turabian Style
Kouadio, Louis, Nathaniel K. Newlands, Andrew Davidson, Yinsuo Zhang, and Aston Chipanshi. 2014. "Assessing the Performance of MODIS NDVI and EVI for Seasonal Crop Yield Forecasting at the Ecodistrict Scale" Remote Sensing 6, no. 10: 10193-10214. https://doi.org/10.3390/rs61010193