Testing the Temporal Ability of Landsat Imagery and Precision Agriculture Technology to Provide High Resolution Historical Estimates of Wheat Yield at the Farm Scale

The long term archiving of both Landsat imagery and wheat yield mapping datasets sensed by precision agriculture technology has the potential through the development of statistical relationships to predict high resolution estimates of wheat yield over large areas for multiple seasons. Quantifying past yield performance over different growing seasons can inform agricultural management decisions ranging from fertilizer applications at the sub-paddock scale to changes in land use at a landscape scale. However, an understanding of the magnitude of prediction errors is needed. In this study, we examine the predictive wheat yield relationships developed from Normalised Difference Vegetation Index (NDVI) acquired Landsat imagery and combine-mounted yield monitors for three Western Australian farms over different growing seasons. We further analysed their predictive capability when these relationships are used to extrapolate yield from one farm to another. Over all seasons, the best predictions were achieved with imagery acquired in September. Of the five seasons reviewed, three showed very reasonable prediction accuracies, with the low and high rainfall years providing good predictions. Medium rainfall years showed the greatest variation in prediction accuracy with marginal to poor predictions resulting from narrow ranges of measured wheat yield and NDVI values. These results demonstrate the potential benefit of fusing together two high resolution datasets to create robust wheat yield prediction models over different growing seasons, the outputs of which can be used to inform agricultural decision making.


Introduction
Estimating agricultural production is necessary for a wide variety of applications: early warning of food security, famine and drought [1][2][3][4]; supporting validation of biophysical crop models [5,6]; a substitute for yield maps to analyse yield consistency [7], and to optimise spatially explicit fertiliser application [8][9][10].The collection of reliable and timely information of crop performance can also influence pricing policies, marketing and trading decisions [13,14] and is important for government, growers, insurance and agricultural companies [11,12] so that logistical issues can be anticipated [12].
The application of airborne or satellite remotely sensed imagery is critical for the above objectives since on-ground human observations of crops is limited.Past predictions of yield which incorporated a variety of remote sensing techniques have ranged from simple statistical relationships [11,12,[15][16][17][18]; more advanced relationships built on agronomic and metrological data [4,19,20], to models utilising absorbed photosynthetically active radiation [21][22][23][24].The majority of these studies have employed low resolution sensors because of their low cost, availability, extensive spatial coverage and frequent acquisition dates.This choice of imagery is particularly advantageous for broad-scale crop progression, crop canopy emergence, maturation and senescence, but sacrifices higher spatial resolution for greater temporal frequency.
Landsat imagery allows yield predictions at a higher resolution, where the pattern of measurements highlights yield performance differences due to soil type and topographic location and where large variations in yield are evident.Reasonable relationships have been observed between yields collected at the farm, field or geo-referenced hand-sampled scale and spectral vegetation indices [14,18,19,22,25].However, spatial mismatches occur with the matching of high resolution spectral indices with broad scale farm and field estimates, as well with yield estimation at the plant level which is also labour intensive, expensive to collect [26] and usually limited to a small spatial extent.Precision agriculture and more specifically yield mapping provides an alternative method to collect yield measurements at a matching scale.Yield and spatial position are collected via a combine harvester every one to three seconds, allowing maps of yield to be produced at a high spatial resolution that is similar to the resolution of the spectral indices produced by the Landsat sensor.Several studies have reported a strong relationship between yield and Landsat and IKONOS imagery [10,[30][31][32] but most studies are from individual fields and rarely have these relationships been used to predict yields elsewhere.
Models that are able to predict yield patterns based on historic imagery have significant practical uses.Firstly, they could assist growers who have not yet adopted, or those who have recently adopted, precision agriculture technology to more readily access its benefits by extending the length of the dataset that is available for crop management decision making [34,35].This would allow land managers to assess the spatially explicit nature of productivity, fertiliser requirements and profitability of their fields from a longer time series [36] and therefore leap-frog the time consuming process of archiving annual yield maps generally used in this assessment process.Secondly, the availability of yield estimates, at the resolution and extent of the Landsat imagery, may assist regional environmental policy decision making [37].Yield variability over time may help to define spatially explicit scenarios such as a comparison between the returns from traditional cropping with those from an alternative, environmentally friendly land use.The major benefit of this analysis is that yield estimates are at a scale that is relevant for management [38].Information is provided at a scale where growers make decisions about land use allocation, yet at an extent where policy decisions can be made.
The aim of this paper is to understand how wheat yield-NDVI relationships vary in time and space.The majority of past studies are based on a single paddock or farm.Here, we explicitly explore the yield-NDVI relationship across three farms over a period of five years under a range of very different rainfall conditions.The first objective of this study is to evaluate the sensitivity of the regression models created from data from different farms, years and image acquisition dates.The second objective is to evaluate if these models can be reliably extrapolated across space.

Study Area
The study area covers 625 km 2 within the northern wheat belt of Western Australia (Figure 1), incorporating four neighbouring farms that have collected yield data for the 1998 to 2004 growing seasons.The region is characterised by a Mediterranean climate, with cool wet winters and hot dry summers.Long term growing season average is around 320 mm, but this varies considerably within and between years.In this environment, water is the major limitation to agricultural plant productivity [39].The landscape is predominately broad sand plains with very little relief and salty discharges situated in lower areas.Farms in the region are large, typically with over 2,000 ha of cropping area.The broad-acre cropping rotations are dominated by wheat with break crops of lupins, rapeseed (canola) and to a lesser extent barley and oats.Pastures for cattle and sheep grazing are also common, as well as small scattered stands of remnant native vegetation consisting of a mixture of evergreen shrubs and trees.Flowering and grain filling of crops occurs in spring (September) with harvest in late spring and early summer (November-December).

Methods
We developed empirical relationships between wheat yield measured on three farms with precision agriculture yield mapping capability and the Normalised Difference Vegetation Index (NDVI) derived from Landsat 5 TM and 7 ETM+ imagery.The strength of these relationships was tested at different times within the season and over different annual rainfall distributions.The predictive power of the models was tested on adjacent farms and using imagery at different times.

Characterising Annual Growing Seasons by Rainfall Distributions
Wheat production in this study area is extremely dependent on highly fluctuating rainfall.The growing season occurs between March and November and the end of season yield is determined by the plants' ability to get water during specific phases of development corresponding to germination, vegetative and reproductive growth [40] (Table 1).Annual decisions to sow a wheat crop are determined by the amount of rain falling between March and April and the availability of water held in the soil.This variability in the initial sowing date therefore also leads to variable harvesting dates.Table 1.The role of monthly rainfall in the growth of wheat [39][40][41][42].To encapsulate the variations in season break and finish, we characterized each growing season by the total rainfall between March and November.Figure 2 shows that 1999 had the highest annual rainfall while rainfall totals for the other years were fairly consistent, with 2003 having the next highest and the years 1998, 2001 and 2004 the lowest.

Wheat Phenology and Image Acquisition Date
In agricultural areas the phenological cycle of seeding, growth, maturity and harvesting of managed agricultural vegetation is repeated on an annual basis.Several authors [43,44] suggest that vegetation indices such as NDVI recorded at critical times during the growing season can help characterise the spatial variability in crop performance.Others [13,37] suggest that accurate wheat yield predictions are possible using only one image, provided it is acquired towards the middle of the growing season when most wheat crop canopies are fully developed.Selection of the optimal image acquisition date depends on two factors; firstly, the relationship of NDVI to particular wheat development stages, and secondly the relationship between NDVI at this time and the final wheat grain yield.
Laboratory-based spectral studies of wheat crop development have found that NDVI is very sensitive to leaf area index (LAI) between 0 and 2 [45].After this stage (during the ripening process) there is an asymptotic relationship between NDVI, LAI and yield [14,46,47].This indicates that the use of NDVI as a surrogate for biomass and yield prediction is best at crop stages where LAI values are less than 3 [45].Similarly, measurement of spectral reflectance of wheat growth stages from a tractor mounted radiometer [48] showed similar asymptotic relationships of NDVI and LAI values during this time period.These findings have also been supported by satellite based studies which indicated good associations between NDVI and wheat grain yield 50 to 110 days after plant emergence [2,15,19,26,49,50].This wide time period encapsulates critical phenological phases for predicting the final wheat yield, including the formation and organisation of the canopy structure after the emergence of the flag leaf [8,50], at the end of stem elongation and beginning of heading [47] and grain formation [51].Other studies [11,14] and those undertaken in Australia [37,49,52] suggest that for cereal crops, biomass or LAI measures at the flowering or anthesis stage (September to November) are closely related to final yield.These studies show that wheat yield has a positive relationship with the biomass produced by the plant, i.e., low biomass leads to low yield, high biomass high yield.
For this study, cloud free Landsat 5 TM and 7 ETM+ data were acquired (path-row 113/81 and 112/81) between August and October (Table 2).Images were processed to USGS Level 1G and further orthocorrection to the Geocentric Datum of Australia 1994 and systematic radiometric corrections were made by the vendor, Geoscience Australia.Pixel size was re-sampled to 25 m.As no direct comparison was made between the images and the time between the acquisitions was short (usually 16 days) no further radiometric calibration was performed.

Wheat Yield Mapping Data
Wheat yield data was obtained from four different yield monitoring systems used on four commercial farms for different growing seasons (Table 3).Previous research has concentrated on identifying spatial variability of yield within a single or a limited number of fields.In this study, by contrast, we used wheat yield data from four farms, incorporating around 10 to 16 fields, each over 70 hectares in size.The total area cropped to wheat each year was over 1,500 hectares per farm.This yield data was aggregated by farm and year for the analysis.
The extent of data available was limited by when the yield monitoring technology was adopted by each grower (Table 3).For 2004, the last year of analysis, yield mapping data from Farm 3 was unavailable.However, the yield monitor used in Farm 1 was trialled on another neighbouring farm (Farm 4) and this was used as a replacement for the Farm 3 dataset.This farm had monitored wheat yield over approximately 600 ha.The raw wheat yield data derived from the combine monitors was cleaned using algorithms that removed erroneous yield values associated with harvester dynamics and operator error.Data was then interpolated to a 25 m by 25 m grid using VESPER kriging software [53] under a defined protocol [54], filling in missing areas due to error removal and provided a grid of similar resolution to that of the Landsat imagery.
The majority of average yields were similar within and across farms in all seasons with the exception of 1998 where Farm 2 recorded nearly a tonne higher yield than the neighbouring Farm 1.This may be the result of calibration errors in the first year of harvest monitor use.Visual interpretation of the histograms of the wheat yields (broken into 0.25 kg/ha intervals) show that the datasets range between positive and negative skewness (see supplementary section for histograms in annual videos).

Comparison of Wheat Yields to Landsat NDVI
The Normalised Difference Vegetation Index (NDVI = (band4 − band3)/(band4 + band3)) was calculated for all images.Empirical calibration relationships were developed for each farm using interpolated wheat yield estimates and the associated NDVI value for all 25 m by 25 m grid cells.Maps of wheat yield, NDVI and scatter plots of their relationships for each farm are present in Supplementary video section.Second order polynomial regression models were fitted to the relationships (see Appendix 1).The models were forced through the origin to acknowledge that NDVI values close to zero are indicative of an absence of photosynthetic vegetation and hence cannot produce any yield.This approach simplified the models by reducing the number of fitted parameters.Each farm model was used to predict wheat yields from the NDVI values taken from the other farms in the corresponding image date.Predicted yields from the yield-NDVI models were then compared to the actual measured yield on each farm.Two criteria were used to assess the accuracy of the predictions of wheat yield: the Root Mean Square Error (RMSE) and the Nash-Sutcliffe Efficiency Criterion (E).The Nash Sutcliffe Efficiency Criterion (E) is a estimate of model performance that compares the variance around the 1:1 line between predicted (P i ) and observe values (O i ) with the variance of observations around the mean (Ō) (Equation ( 1)).It is defined as one minus the sum of the absolute squared difference between the predicted and observed values normalised by the variance of the observed values [55].Values of E range between 1.0 (perfect fit) and −∞.Here, a negative efficiency indicates that the mean value of the observed yield would have been a better predictor than the model [56].

Accuracy of the Regression Relationships and Model Extrapolations over Time.
We developed a total of 34 regression models for the three farms over five years using a total of 13 cloud free image dates.These models were used to extrapolate yield on adjacent farms for a total of 64 extrapolation tests (Table 2).Overall, models provided a good fit with an average RMSE of 0.59 t/ha and all were below 0.85 t/ha.Most regression analyses show highly significant relationships between predicted and observed yield.R 2 values vary substantially with half the regressions models having a regression coefficient below 0.3 and half above, respectively (Appendix 1).Categorising the models by day of year showed that RMSE was lowest and E was highest around the 250th day of the year (Figure 3).Average RMSE was 0.58 t/ha (Standard Deviation (SD = 0.11) and E was 0.3 (SD = 0.16) in the time period between day 245 and 255.This range would be the optimal time period for selection of imagery for the development of the yield-NDVI relationships.Extrapolation, as expected, showed a reduced performance with an average RMSE of 0.71 t/ha and a maximum of 2.0 t/ha.Efficiency for the extrapolation models varies around −6.29 (not shown in Figure 4) and 0.49.Extrapolation showed mixed results with only 11 models out of the 64 having an efficiency of above 30%.Both RMSE and E however, also have an optimum time period, once again around the 250th day of the year (Figure 4).Average RMSE was 0.64 t/ha (SD = 0.12) and E was 0.13 (SD = 0.24) in the time period between day 245 and 255.

Yearly Yield-NDVI Relationships
Model performance for different farms and years was evaluated to determine if these models can be extrapolated in space and time.Videos of the distributions of NDVI over time, the final yield maps and a scatter plot of their relationships are available in the supplementary sections by year.Model and yield prediction efficiency criteria, Root Mean Squared Error (RMSE) and the Nash-Sutcliffe Efficiency Criteria (E) for each year and each image are summarized in the Tables 4-7.These criteria highlight the magnitude of error and prediction accuracy when the models are extrapolated.A description of the growing season, NDVI progression and a brief interpretation of the results are also included.

Table 4. Model and yield prediction efficiency criteria for 1998 Landsat imagery-Root Mean Square Error (RMSE) and the Nash-Sutcliffe Efficiency Criteria (E). Values in bold italics represent efficiency criteria for the calibration models. Descriptions of 1998 Growing season rainfall, NDVI progression and interpretations of results.
Interpretation: Wet conditions in May through to August suggest cropping area was not in water deficit therefore adequate soil water for crop development.Clustering of mid-September NDVI suggest flowering occurred later with potential reduction in wheat yields due to higher temperatures and water deficit later in the season.Acquisition of imagery in late August or mid-September showed poor regression and extrapolation relationships and were not optimal for model development.Imagery acquired further into the season could have provided a better relationship if it were available.Table 5. Model and yield prediction efficiency criteria for 1999 Landsat imagery-Root Mean Square Error (RMSE) and the Nash-Sutcliffe Efficiency Criteria (E).Values in bold italics represent efficiency criteria for the calibration models.Descriptions of 1999 Growing season rainfall, NDVI progression and interpretations of results.
Interpretation: Rainfall was not a constraining factor.Similar patterns of positive yield-NDVI distributions across all image dates and regression models.Their strength declined over time.Validation of the models developed in August and September showed good and consistent predictive capacity over different farm models and farm data.Models created in October provided poorer prediction accuracy.Interpretation: Limited rainfall will impact on establishment of the crop and initial growth due to prolonged period of water stress.Poor to moderate yield-NDVI relationships with similar prediction capability.October efficiency criteria were not included due to their low results but visualization of the imagery in conjunction with September's image may provide an additional benefit.For example, NDVI pattern within fields of Farm1 show differences in soil constraints or soil-water interactions given the gradual terrain of the fields.

Model
Comparison of the NDVI distribution within specific fields highlights areas which yield better and poorly under water stress.Interestingly, areas of low NDVI do not necessarily correspond to low yield, for example, around the boundary of two fields which have moderate end of season yield values.These patterns are similar on all three farms.This information can be potentially beneficial in future farm management and land use planning because it is expected that Mediterranean farming areas will become drier under climate change [57,58].Consequently, this model performed poorly in September and October when the relationship was validated across the other farm data.In September, the models developed for Farm 3 performed reasonably well on Farms 1 and 2. October imagery shows areas of high NDVI values and corresponding high yield values are still across Farms 1 and 3.This may indicate an extension of crop growth as a result of higher August/September rainfall, markedly different sowing dates or a mislabeling of the crop type (not wheat).In some areas high NDVI values that correspond to low yield values are also seen.This could reflect crop management (e.g., weeds or pests) or weather (e.g., high temperatures related to the later flowering date) impacting on yield later in the season.Interpretation: Yield-NDVI relationships follow a positive trend across all three farms and their strength increased with acquisition date.Validation of the models showed that the image acquired in mid-September was the best yield predictor for that year while Farm 3 was the best predictor across all acquisition dates.The Farm 1 yield and NDVI map has an interesting outlier with higher NDVI values for one field.This may be the result of a very late sowing date or mislabeling of the crop type (not wheat).

Error, Model Accuracy and Uncertainties
The results show a reasonable relationship between NDVI and yield at different growth stages suggesting that the approach is a promising undertaking.An average RMSE of 0.58 t/ha for the models developed in early to mid September (245-255) seems reasonable and the extrapolation result of 0.64 t/ha is not substantially worse.A comparison of RMSE measured within this study compares well with those reported in the literature (Table 9).It is noticeable that yield predictions from the AVHRR or MODIS sensors produces lower prediction errors when compared to the ground-based or high resolution sensors.This indicates that some of the variability is filtered out by smoothing caused by the larger pixel size and the mismatch of data scales used in their validation.
Overall the pattern of yield predicted from NDVI shows moderate prediction accuracy, however, there is still a substantial proportion of variability remaining unexplained and a large number of models that fail to predict yield reliably when applied to other farms.This is mostly due to a bias, where yield is consistently over or under predicted due to variability in the relationships between the yield mapped and NDVI datasets.Although, we have employed an extensive data cleaning process, unbiased noise may in part be due to errors in the yield monitoring (i.e., different calibrations of the flow rates by the yield monitors on different farms).The measurement of NDVI is less error prone than yield mapping but the snapshot approach has its limitations [33] because it provides a static representation of the current variation in plant growth.This variation consists of plants in fields with different sowing dates and represents plant phenological responses within fields to differences in soil-water interactions under water stress [64].The comparison of the snapshot to final yield measured around three months later (December) also is hazardous since the sensitivity of wheat to adverse meteorological conditions at critical growth stages, principally at flowering and grain filling is high.
Conditions may occur where above ground biomass is high which is represented by a high NDVI value in the image, but the actual yield may not be commensurately large [65] due to pest infiltration, disease or if lodging occurs late in the season.While a careful selection of the image acquisition can improve the extrapolation performance of the models, such processes are very difficult to account for in a single image and the improvement of the models is inconsistent.[63] Given these situations, our results showed that reliable relationships are found in three out of the five seasons and mid-September being the optimal time for the image acquisition.However, annual variations in rainfall and soil-water interactions can influence these relationships and the selection of image dates.The assumption that low yields are related to low NDVI values and that high yields are related to high NDVI values at a time where the majority of the wheat plants are at maturity can be said to hold.This research provides further evidence to Lobell's [13] suggestion that accurate wheat yield predictions are possible using only one image, provided the image is acquired towards the middle of the growing season when most wheat crop canopies are fully developed.

Implications in Using the Extrapolated Estimates for Management
Models tended to overestimate yield at low NDVI and underestimate yields at high NDVI values.This has several implications for the use of the datasets in both farm and catchment management situations.For precision agriculture management, depending on the management strategy undertaken, this degree of prediction error may not be as limiting.Growers who prefer a zone management strategy might only require accuracy of the mean yield for a zone [59].Those who adopt variable rate applications will require greater accuracy in the estimation of yield.Further research is needed into how the selection of the model with different overestimations and underestimations of wheat yield will affect the applicability to economic returns and also precision agriculture management decisions.For catchment management, the overestimation of yield within lower ranges therefore poses empirical limitations when trying to spatially target areas where land use alternatives may be economically comparable to traditional wheat cropping.This study has not taken into account other crops types that are present in wheat farming systems as break crops.These crops may not have the same yield-NDVI relationships due to plant physiology differences.The use of crop rotations may also limit predictions, especially where wheat is not the major crop in the agricultural landscape.The major methodological constraint to our approach is the availability of data.The methodology relies on access to historical yield mapping data and the availability of cloud free images at anthesis within the agricultural region.These issues may pose limitations to this application of this methodology elsewhere.

Conclusions
The long term archive of Landsat imagery combined with a growing archive of wheat yield maps shows substantial potential for the prediction of historical wheat yield over large areas at resolution that is needed for farm and environmental management.We have shown the range of prediction accuracies for empirical relationships to predict wheat yield over a variety of seasons using wheat yield data collected by combine-mounted yield monitors and Landsat NDVI.Furthermore, we have tested the sensitivity of predictions to imaging date within the growing season.Overall, imagery acquired in mid-September showed stronger relationships (average R 2 of 0.32 and average RMSE of 0.58 t/ha) and prediction accuracies (average RMSE of 0.64 t/ha) when compared to those derived in late August, late September and early October.Of the five seasons reviewed, three gave moderate to good prediction accuracies, while marginal yield predictive capacity was obtained with the 2003 model.Accuracy of the 1998 model was the poorest with prediction capability highlighted by efficiency criteria showing that the average farm wheat yield estimates provided better yield prediction.
The snapshot approach used in this study has the potential to produce high resolution estimates of wheat yield.This approach assumes that biomass measured within the flowering window of a wheat crop through a remote sensing proxy (NDVI) has a positive relationship with wheat yield.These overall results show that fairly robust wheat yield prediction models can be created over differing rainfall seasons.These models can then be applied to satellite imagery to derive spatially varying estimates of wheat yield and economic performance that are both broad in extent and high in resolution.The physical dimensions of this information will help inform decisions by both growers and policy makers on agronomic management, business viability and regional environmental objectives under the threat of a changing climate.

Figure 1 .
Figure 1.The four farm study area in Western Australia.

Figure 2 .
Figure 2. Monthly and long term average monthly rainfall for the five years of analysis.

Figure 3 .
Figure 3. Root Mean Square Error (RMSE) and the Nash-Sutcliffe Efficiency Criteria (E) values by day of the year for the regression models.

Figure 4 .
Figure 4. Root Mean Square Error (RMSE) and the Nash-Sutcliffe Efficiency Criteria (E) values by day of the year for the extrapolation of the regression models.

relationship for Farm 2
showing a high positive relationship in comparison to the other models.

Role of Rainfall in Crop Development Stages March
to May 1. Pre-sowing soil water accumulates; aids seed germination.

Table 2 .
Landsat Thematic Mapper images acquired for the five years of analysis.

Table 3 .
Availability of yield mapping data by farm and average farm wheat yield (t/ha), the number of image dates, number of regression models and extrapolation tests.

Table 6 .
Model and yield prediction efficiency criteria for 2001 Landsat imagery-Root Mean Square Error (RMSE) and the Nash-Sutcliffe Efficiency Criteria (E).Values in bold italics represent efficiency criteria for the calibration models.Descriptions of 2001 Growing season rainfall, NDVI progression and interpretations of results.

Table 7 .
Model and yield prediction efficiency criteria for 2003 Landsat imagery-Root Mean Square Error (RMSE) and the Nash-Sutcliffe Efficiency Criteria (E).Values in bold italics represent efficiency criteria for the calibration models.Descriptions of 2003 Growing season rainfall, NDVI progression and interpretations of results.
Interpretation: Significant rainfall fell between May and September.The October image for Farms 1 and 2 provided the best models for yield prediction while the relationship developed in September was the best for Farm 3. Farm model differences exist with yield-NDVI

Table 8 .
Model and yield prediction efficiency criteria for 2004 Landsat imagery-Root Mean Square Error (RMSE) and the Nash-Sutcliffe Efficiency Criteria (E).Values in bold italics represent efficiency criteria for the calibration models.Descriptions of 2004 Growing season rainfall, NDVI progression and interpretations of results.

Table 9 .
Sensor and prediction error associated with wheat yield prediction.