Prediction of Yield Productivity Zones from Landsat 8 and Sentinel-2A / B and Their Evaluation Using Farm Machinery Measurements

: Yield is one of the primary concerns for any farmer since it is a key to economic prosperity. Yield productivity zones—that is to say, areas with the same yield level within ﬁelds over the long-term—are a form of derived (predicted) data from periodic remote sensing, in this study according to the Enhanced Vegetation Index (EVI). The delineation of yield productivity zones can (a) increase economic prosperity and (b) reduce the environmental burden by employing site-speciﬁc crop management practices which implement advanced geospatial technologies that respect soil heterogeneity. This paper presents yield productivity zone identiﬁcation and computing based on Sentinel-2A / B and Landsat 8 multispectral satellite data and also quantiﬁes the success rate of yield prediction in comparison to the measured yield data. Yield data on spring barley, winter wheat, corn, and oilseed rape were measured with a spatial resolution of up to several meters directly by a CASE IH harvester in the ﬁeld. The yield data were available from three plots in three years on the Rostˇenice Farm in the Czech Republic, with an overall acreage of 176 hectares. The presented yield productivity zones concept was found to be credible for the prediction of yield, including its geospatial variations.


Introduction
Estimating potential crop yield is a crucial activity performed in the assessment of seasonal production. It has been well established by works such as Auernhammer [1] that a plot is not a homogeneous area from the point of view of soil conditions, climate, or crop yield. The period of the past 20 years may be characterized as a shift from conventional farming to precision farming. Precision farming techniques count and rely on such heterogeneity. All plots have their strong as well as weak zones from the crop yield point of view.
Information on yield is important for two main reasons: it can be used (a) to maximize economic profitability and (b) to reduce the environmental burden caused by agricultural activities. Both objectives require as much accurate information on the yield as possible. However, according to the results of several European research projects [2,3] in particular, detailed geospatial data on yield with a spatial resolution of a few meters are scarce, since information on yield is typically available only for larger aggregated areas-i.e., plot level, farm level, and/or regional level [4]. Therefore, indirect methods for yield prediction have therefore been developed and verified. Such indirect methods aim at estimations beyond the scope of this paper. Long-term high/low yield productivity zones should be understood as one kind of information input for further modelling and/or decision-making, not as a general approach or solution to the problem of yield prediction.
The primary goal of this paper is to present means of yield productivity zone identification and computing based on the Sentinel-2A/B and Landsat 8 multispectral satellite data. An eight-year series (2013-2019) from Sentinel-2A/B and Landsat 8 missions was used for the prediction of yield productivity zones. The secondary goal of this paper is to provide quantitative verifications. A fouryear series (2016-2019) of in situ sensor measurements from the CASE IH harvester in three plots was used as reference data for this evaluation.

Materials and Methods
Methods of periodic satellite remote sensing were used for yield productivity zone identification and computing, since neither operative aerial remote sensing nor meteorological monitoring were available for Rostěnice Farm in the Czech Republic.

Agronomical Characteristics of the Pilot Farm
The farm data on the yield measurements were provided by Rostěnice Farm in the Czech Republic ( Figure 1). Data on the crops, agronomic practices, and yield measurements (see Section 2.3) were provided for the purposes of this paper.
The farm, Rostěnice a.s. (N49.105 E16.882), manages over 10,000 ha of arable land in the South Moravia region of the Czech Republic. The average annual rainfall is 544 mm, and the average annual temperature is 8.8°C. Within the managed land, the prevalence of soil type is Chernozem, Cambisol, haplic Luvisol, Fluvisol near to water bodies, and occasionally also Calcic Leptosols. The main program is plant production, where the main focus is on the cultivation of malting barley (2500 ha), maize for grain and biogas production (2500 ha), winter wheat (2000 ha), oilseed rape (1000 ha), and It should be noted that how such computed long-term high/low yield productivity zones are used in decision-making processes related to farm practice-e.g., the application of fertilizers-is beyond the scope of this paper. Long-term high/low yield productivity zones should be understood as one kind of information input for further modelling and/or decision-making, not as a general approach or solution to the problem of yield prediction.
The primary goal of this paper is to present means of yield productivity zone identification and computing based on the Sentinel-2A/B and Landsat 8 multispectral satellite data. An eight-year series (2013-2019) from Sentinel-2A/B and Landsat 8 missions was used for the prediction of yield productivity zones. The secondary goal of this paper is to provide quantitative verifications. A four-year series (2016-2019) of in situ sensor measurements from the CASE IH harvester in three plots was used as reference data for this evaluation.

Materials and Methods
Methods of periodic satellite remote sensing were used for yield productivity zone identification and computing, since neither operative aerial remote sensing nor meteorological monitoring were available for Rostěnice Farm in the Czech Republic.

Agronomical Characteristics of the Pilot Farm
The farm data on the yield measurements were provided by Rostěnice Farm in the Czech Republic ( Figure 1). Data on the crops, agronomic practices, and yield measurements (see Section 2.3) were provided for the purposes of this paper.
Remote Sens. 2020, 12, 1917 4 of 17 The farm, Rostěnice a.s. (N49.105 E16.882), manages over 10,000 ha of arable land in the South Moravia region of the Czech Republic. The average annual rainfall is 544 mm, and the average annual temperature is 8.8 • C. Within the managed land, the prevalence of soil type is Chernozem, Cambisol, haplic Luvisol, Fluvisol near to water bodies, and occasionally also Calcic Leptosols. The main program is plant production, where the main focus is on the cultivation of malting barley (2500 ha), maize for grain and biogas production (2500 ha), winter wheat (2000 ha), oilseed rape (1000 ha), and other crops and products such as soybean and lamb. The average production intensity is 6 t/ha for malting barley, 7 t/ha for winter wheat, 10 t/ha for grain maize, and 4 t/ha for oilseed rape. The farm has applied long-term soilless cultivation (mostly choppers) on their land, leaving all straw after harvest on the land. The high spatial variability of soil conditions in the southern part of farm has led to the adoption of precision farming practices, such as the variable application of fertilizers (since 2006) and crop yield mapping by harvesters. This farm manages over 10,000 ha under minimum soil tillage practices, including an area with a variable rate application of mineral fertilizers. The main crops are winter wheat, spring barley, oilseed rape, and maize. Table 1 shows the list of crops that were cultivated in the study area from 2013 to 2019-i.e., the time period for which the yield productivity zones were computed (2013-2019) and the years in which the yield was measured directly in the plots (2016-2019).

Yield Productivity Zone Identification and Computation
The conducted study comprised the following methodological steps. Multispectral satellite images were used as the primary source of information for yield productivity zone identification, while yield measurements for three plots at Rostěnice Farm from 2016 to 2019 were used as reference data (see Section 2.1). An eight-year series (2013-2019) was composed from Landsat 8 as well as Sentinel-2A/B [41] imagery, as depicted in the Supplementary Material 1. The yield productivity zones were computed from this eight-year series of satellite data according to the Enhanced Vegetation Index (EVI) as defined through the following equation, Equation (1).
where: EVI: Enhanced Vegetation Index, NIR: near-infrared (band). Huete et al. [42,43], as well as Johnson et al. [44] have shown EVI to exhibit higher sensitivities to crop parameters than the more commonly used NDVI (Normalized Difference Vegetation Index). For each plot, the median and quantile values of yield productivity were computed separately. The yield productivity zones (prediction) were then correlated with yield measurements for three plots in three years, meaning the plots for which actual yield measurements were available (see Section 2.3). Besides the statistical verification, qualitative research was also conducted for farms that do not have yield Remote Sens. 2020, 12, 1917 5 of 17 measurements in the form of a map. A dozen farms from seven countries compared their knowledge on yield with yield productivity zone maps and shared their results.
The concept of yield productivity zones, in contrast to yield crop potential, aims at establishing a more universal model that may be re-used for other crops-in this study, for cereals in a temperate climate. The core idea is based on the assumption that it is possible to identify long-term highly productive and less productive zones within a plot and that there are no significant changes in the applied agronomic practices. In other words, yield productivity focuses on the delineation of zones that are usually significantly below or above average from the yield point of view, no matter what the specific crop.
Yield productivity zones themselves are not designed to explain or to quantify, as they only demonstrate that some areas in a plot are either less productive or more productive over the long-term, in other words for the last couple of years. The identification of yield productivity zones is based on vegetation as the primary source of data. Yield productivity zone maps are then considered by farmers as one of several valuable inputs, such as pH maps, cultivation plans, or yield information from previous seasons, and not as the output of analyses or, indeed, as the most important input.
Yield productivity zones are areas within plots which have the same potential yield level as the maximum yield that could be achieved by a crop in given environment, as stated by Evans and Fischer [8]. The universality of yield productivity zones comes at a price. The model is capable of expressing significant geospatial variations for a given crop yield on a particular plot by distinguishing between three kinds of values: below average, average, and above average. However, the model depicts spatial variations within plots, and therefore it could be misleading when trying to compare the yield zones between plots. In general, we can conclude that some areas in a given plot have significantly lower productivity than others and take such information into the decision-making process. The calculation of yield productivity is based on the relationship between vegetation indices and the crop yield recorded during harvest. Johnson et al. [44] evaluated the relationship of the NDVI and EVI vegetation indices to the crop yield and their usage for yield forecast.
The workflow of the yield productivity zone calculation is depicted schematically through the UML (Unified Modeling Language) activity diagram in Figure 2. Sentinel-2A/B images were obtained as a Level-2A product type, as depicted in Supplementary Material 1. Images from the Sentinel 2-A/B missions were also processed with respect to surface reflectance using the Sen2Core software [45]. The Landsat 8 images were processed with respect to surface reflectance through Landsat Surface Reflectance Collection 1 [46] as a product of the ESPA (Eros Science Processing Architecture) Application Programming Interface [47]. The CFMask [48] cloud detection algorithm was applied in the ESPA Application Programming Interface to remove cloud cover from the Landsat satellite images. The "aggregate time series EVI for plots" step was performed as follows: the EVI values were computed from the Sentinel-2A/B and the Landsat 8 satellite images were aggregated based on a median. Median-based aggregations (1) were not influenced by the outliers within a year because (2) these outliers have been filtered out.
A selection of satellite scenes from the past eight years was made for a particular farm area in order to collect cloud-free data related to the second half of the vegetation period. During the FOODIE project pilots [2], a period of eight years was determined to be sufficient for identifying productivity zones in the Czech Republic. In other words, an eight-year period was used because of the comparability of the performed agronomic activities/practices on the one hand (identical processes used during seeding, fertilizer/pesticide application, harvesting etc.) and the long-term possibility to capture ordinary as well as extraordinary situations (droughts, floods, hot seasons, ground frosts etc.) on the other hand. Only data from the second half of the vegetation period (BBCH 40-50 and higher; for more details see Meier et al. [49]) are needed, since they indicate the highest correlation of vegetation indices to crop yield. The yield productivity zones were calculated as the relation of a pixel in individual scenes to the mean value of the plots and, in the next step, as the aggregation of all the scenes into one layer. Yield productivity zones present values in percentages, identifying places with, for example, 60% or 140% of potential yield in comparison to the average value for the given plot ( Figure 3). Farmers can then adjust their agronomical practices based on the spatial variability of yield in The abovementioned indices need to be computed from the relevant satellite data. Information on farmer's parcels (plots/LPIS farmer's blocks) is the second input, since it defines the geometry of the plot boundary. Any deviation in the plot geometry and crop area (such as two crop species within one plot boundary record) leads to an incorrect calculation of the plot variability type in one vegetation season. If this were not the case, the production zones could not be computed for the whole plot. The abovementioned indices are computed as variabilities within each plot. The potential yield is then computed for each plot from the underlying EVI index (defined above) through Empirical Bayesian Kriging (EBK)-a geospatial interpolation as defined in Pilz and Spöck [50] and Krivoruchko and Gribov [51]-then smoothed to a spatial resolution of 5 m. The EBK enables values according Remote Sens. 2020, 12, 1917 7 of 17 Kriging interpolation to be computed; however, the semivariograms differ from plot to plot thanks to the Bayesian approach to semivariogram construction.
Yield productivity zones present values in percentages, identifying places with, for example, 60% or 140% of potential yield in comparison to the average value for the given plot ( Figure 3). Farmers can then adjust their agronomical practices based on the spatial variability of yield in general: prediction and/or measurements. Note that farmers did not reach a consensus on a strategy for yield productivity zones: "feed the rich" (fertilizers primarily into high productive locations) versus "feed the poor" (fertilizers primarily into low productive locations) remain two opposite approaches.
Remote Sens. 2020, 12, x FOR PEER REVIEW 7 of 19 general: prediction and/or measurements. Note that farmers did not reach a consensus on a strategy for yield productivity zones: "feed the rich" (fertilizers primarily into high productive locations) versus "feed the poor" (fertilizers primarily into low productive locations) remain two opposite approaches. Yield productivity zones are another form of data derived from periodic remote sensing. The yield (potential) is the integrator of landscape and climatic variability and, as such, provides useful information for identifying management zones, as defined by Kleinjan et al. [52]. Yield productivity zones represent the basic delineation of management zones for site-specific crop management, which is usually based on yield maps over the past several years. As mentioned before, the presence of a complete series of yield maps for all plots is rare; thus, remotely sensed data are analyzed to determine the plot variability of crops through vegetation indices.
The proof-of-concept software solution for yield productivity zone identification and computing was based on ArcGIS Desktop 10.3.1, developed by the ESRI (Environmental System Research Institute) corporation.

Yield Productivity Zone Verification against Yield Measurements
Predictions in the form of yield productivity zones need to be confronted with the measured yield data. The yield data are considered to be the most sensitive form of data at a farm [53]. Rostěnice Farm in the Czech Republic provided yield maps of cereals as the main crop for three plots between the years 2016 and 2019, as depicted in Table 2. These yield maps were created by a CASE IH AXIAL FLOW 9120 harvester equipped with an AFS Pro 700 monitoring unit. The measurements are in GNSS-RTK quality (Global Navigation System of Systems-the Real Time Kinematics method provides a positional accuracy of better than 0.1 m; see [54]). The measurement frequency was set to one second (1 Hz). The harvesting width was equal to 9.15 m. The AFS Pro 700 monitoring unit was calibrated at 15 randomly selected locations per field to produce yield maps that were as precise as possible. The monitoring of yield was conducted as on-the-go mapping by recording grain flow and moisture continuously over the whole plot area. The obtained data were later filtered to detect and remove erroneous values, outliers, spatial deviations, etc.; the data were also subsequently interpolated, following the approach described in our previous papers [55,56]). Yield productivity zones are another form of data derived from periodic remote sensing. The yield (potential) is the integrator of landscape and climatic variability and, as such, provides useful information for identifying management zones, as defined by Kleinjan et al. [52]. Yield productivity zones represent the basic delineation of management zones for site-specific crop management, which is usually based on yield maps over the past several years. As mentioned before, the presence of a complete series of yield maps for all plots is rare; thus, remotely sensed data are analyzed to determine the plot variability of crops through vegetation indices.
The proof-of-concept software solution for yield productivity zone identification and computing was based on ArcGIS Desktop 10.3.1, developed by the ESRI (Environmental System Research Institute) corporation.

Yield Productivity Zone Verification against Yield Measurements
Predictions in the form of yield productivity zones need to be confronted with the measured yield data. The yield data are considered to be the most sensitive form of data at a farm [53]. Rostěnice Farm in the Czech Republic provided yield maps of cereals as the main crop for three plots between the years 2016 and 2019, as depicted in Table 2. These yield maps were created by a CASE IH AXIAL FLOW 9120 harvester equipped with an AFS Pro 700 monitoring unit. The measurements are in GNSS-RTK quality (Global Navigation System of Systems-the Real Time Kinematics method provides a positional accuracy of better than 0.1 m; see [54]). The measurement frequency was set to one second (1 Hz). The harvesting width was equal to 9.15 m. The AFS Pro 700 monitoring unit was calibrated at 15 randomly selected locations per field to produce yield maps that were as precise as possible. The monitoring of yield was conducted as on-the-go mapping by recording grain flow and moisture continuously over the whole plot area. The obtained data were later filtered to detect and remove erroneous values, outliers, spatial deviations, etc.; the data were also subsequently interpolated, following the approach described in our previous papers [55,56]). A geospatial correlation between the predicted yield productivity zones on the one hand and the yield measurement data on the other hand was performed using the following geospatial procedures: 1.
Both the yield productivity zones (predictions) and the yield measurements (reference data) were transformed into relative values by means of a linear function. This approach enables the comparability of predictions with reference data.

2.
To visualize and describe the spatial patterns of the differences between the yield productivity zones and yield measurements, map algebra was used, especially the geospatial subtraction. Map algebra provides insights on variances between the values in raster data that are overlapping [57].
The geospatial subtraction was used as defined in Equation (2). 3.
The following steps were applied prior to correlation. The data were transformed from raster to discrete reference points. Reference points were created as centroids of yield measurement pixels (values were stored as attributes), and the values of predicted yield were extracted to another attribute. These reference points on the one hand and the concave hull of the filtered field harvester data on the other hand were intersected [55]. Concave hulls were created by the Aggregate Points function [58], with an aggregation distance of 60 m (based on the analyzed data and field geometry characteristics) and minor shape modifications due to aggregations outside of the area of the plot (northwest of the Lány plot).
where: dy: difference in the predicted and measured yields, yz: yield productivity zones (predictions), ym: yield measurements (reference data).

4.
Intersection between the reference points and concave hull of the filtered field harvester data was performed because of the following reasons: a.
The raster applied in this study did not have equal coverage, and therefore "null" values occurred in attributes. b.
The extrapolated values of yield measurements occur at the edges of a plot, as they are: i. artificially calculated, based more on (settings of) software algorithms than actual data measurements; ii.
influenced by the harvesting strategy (for more information, see [55]).
c. Non-credible values of the yield potential (e.g., the "Pivovárka" plot in 2018) rationale remains an open question-a working theory counts the surrounding vegetation that influences the evapotranspiration conditions. This working theory will be pursued further as a subject for ongoing research.

5.
Finally, geospatial data-in other words, reference points with yield values as attributes-were transformed into tables, and the Pearson correlation coefficient (Pearson's r) [59] was used for calculating the correlation between the predicted yield productivity and the yield measurement data.
For the verification, the following software tools were used: ArcGIS Desktop 10.3.1 (for geospatial subtraction, to create concave hulls and maps) and Statistica (to calculate the correlation coefficients and to create graphs).

Results
Yield productivity zones may be understood as a prediction, while yield measurements are used as the reference data for validation. The geospatial distribution of measured yield values between plots at Rostěnice Farm is depicted in Table 2.
The results of the geospatial subtraction of yield productivity zones (prediction) from yield measurements (reference data) are depicted in Figure 4. i. artificially calculated, based more on (settings of) software algorithms than actual data measurements; ii.
influenced by the harvesting strategy (for more information, see [55]).
c. Non-credible values of the yield potential (e.g., the "Pivovárka" plot in 2018) rationale remains an open question-a working theory counts the surrounding vegetation that influences the evapotranspiration conditions. This working theory will be pursued further as a subject for ongoing research. 5. Finally, geospatial data-in other words, reference points with yield values as attributes-were transformed into tables, and the Pearson correlation coefficient (Pearson's r) [59] was used for calculating the correlation between the predicted yield productivity and the yield measurement data.
For the verification, the following software tools were used: ArcGIS Desktop 10.3.1 (for geospatial subtraction, to create concave hulls and maps) and Statistica (to calculate the correlation coefficients and to create graphs).

Results
Yield productivity zones may be understood as a prediction, while yield measurements are used as the reference data for validation. The geospatial distribution of measured yield values between plots at Rostěnice Farm is depicted in Table 2.
The results of the geospatial subtraction of yield productivity zones (prediction) from yield measurements (reference data) are depicted in Figure 4.  Table 3 and Figure 5 demonstrate variabilities (mean, minimum and maximum values, standard deviation) in yield productivity zones on the one hand and variabilities in the measured yield on the other hand. Differences between the values on yield productivity zones and measured yield are provided with equal statistical characteristics. All the information illustrates the variabilities of three plots for three years. Table 3. Statistical characteristics for the reference points used for the comparison of differences between the yield productivity zones and the measured yield. Note that the "N" column depicts the number of reference points that were used for interpolation and intersected by the concave hulls; this means the number of reference points in Table 3 differs from the number of measured points in Table 2.

Plot
Year and measured yield are provided with equal statistical characteristics. All the information illustrates the variabilities of three plots for three years. The differences between the yield productivity zones and the measured yield confirm that the prediction provided by the yield productivity zones led to an overestimation in five cases out of nine and an underestimation in four cases out of nine. The differences in the yield prediction compared to the measured values were up to 5% in seven cases. The only two differences in the predicted and measured yield were as follows: (1) the Přední Prostřední plot resulted in a 6.66% overestimation in 2016, while (2) the Lány plot resulted in a 10.83% underestimation in 2018.
Standard deviations on the one hand as well as the extent of (minimum and maximum) values on the other hand were, in all cases, higher than the yield values measured by a field harvester. In other words, in all cases the data measured under real conditions showed a higher variation, meaning greater local differences, than the (model of) predictions. Figure 3 shows that phenomena such as the intervals used in the legend are equal for both prediction using the yield productivity zones and the measured yield (in Figure 3 for the Pivovárka plot in 2018). The Pivovárka plot in 2018, the Pivovárka plot in 2017, and the Lány plot in 2017 resulted also in the largest standard deviations: 23.13%, 22.71%, and 20.33%, respectively. All three plots were also characterized by a clear spatial pattern. The agronomical season of 2017 was marked by high local differences for all three plots within this study. Figure 5 presents histograms of the differences between the yield productivity zones and the measured yield for all three plots and three years portrayed in Figure 4. Figure 5. Histograms of differences between the yield productivity zones and the measured yield. Note: the histogram for the Lány field in 2018 was modified to increase readability, meaning the intervals of the differences are equal to 5% instead of 10% as in the other histograms.
The differences between the yield productivity zones and the measured yield confirm that the prediction provided by the yield productivity zones led to an overestimation in five cases out of nine and an underestimation in four cases out of nine. The differences in the yield prediction compared to the measured values were up to 5% in seven cases. The only two differences in the predicted and measured yield were as follows: (1) the Přední Prostřední plot resulted in a 6.66% overestimation in 2016, while (2) the Lány plot resulted in a 10.83% underestimation in 2018.
Standard deviations on the one hand as well as the extent of (minimum and maximum) values on the other hand were, in all cases, higher than the yield values measured by a field harvester. In other words, in all cases the data measured under real conditions showed a higher variation, meaning greater local differences, than the (model of) predictions. Figure 3 shows that phenomena such as the intervals used in the legend are equal for both prediction using the yield productivity zones and the measured yield (in Figure 3 for the Pivovárka plot in 2018). The Pivovárka plot in 2018, the Pivovárka plot in 2017, and the Lány plot in 2017 resulted also in the largest standard deviations: 23.13%, 22.71%, and 20.33%, respectively. All three plots were also characterized by a clear spatial pattern. The agronomical season of 2017 was marked by high local differences for all three plots within this study. Figure 5 presents histograms of the differences between the yield productivity zones and the measured yield for all three plots and three years portrayed in Figure 4. Figure 6 depicts scatter plots in order to demonstrate the extents of all the differences between the predicted yield productivity zones and the measured yield. The purpose of Figure 6 is to bring a more complex point of view in comparison to Table 3 and Figure 5, which operate "only" with mean values.
Remote Sens. 2020, 12, x FOR PEER REVIEW 13 of 19 Figure 6 depicts scatter plots in order to demonstrate the extents of all the differences between the predicted yield productivity zones and the measured yield. The purpose of Figure 6 is to bring a more complex point of view in comparison to Table 3 and Figures 5, which operate "only" with mean values.  Table 4 depicts the Pearson correlation coefficient (Pearson's r) and probability values (p) between the predicted yield productivity zones and the measured yield for all the reference points in all the measured years. The correlation has been identified as statistically significant for all the plots in all the years at a 95% significance level. Table 4. Pearson correlation coefficient (Pearson's r) at a 95% significance level. Note that "N/A" means "not applicable", as field machinery measurements were not available for such a combination of a plot and year.   Table 4 depicts the Pearson correlation coefficient (Pearson's r) and probability values (p) between the predicted yield productivity zones and the measured yield for all the reference points in all the measured years. The correlation has been identified as statistically significant for all the plots in all the years at a 95% significance level.  (Pearson's r) at a 95% significance level. Note that "N/A" means "not applicable", as field machinery measurements were not available for such a combination of a plot and year.

Discussion
Yield productivity zones were evaluated as a feasible way of discovering the areas within a plot with higher-than-average and lower-than-average yields. The identification and computation of yield productivity zones is crucial for increasing the yield while minimizing the environmental pollution from agricultural activities by reducing the amount of fertilizer(s) added to the soil [60,61]. The authors would also like to emphasize that yield productivity zones are not a comprehensive solution but an additional input for further modelling, processes and decision-making.
A qualitative evaluation of the yield productivity zone concept was also conducted, in addition to the geospatial and statistical evaluation described in the results section. The qualitative evaluation was in the form of discussions with farmers, agronomists, and agro-scientists at a dozen farms in Europe (Austria, the Czech Republic, Germany, Latvia, and Spain). The qualitative verification was performed for farms without detailed yield measurements in the form of maps. However, the surveyed farmers, agronomists, and agro-scientists were employees who had worked on the evaluated plots for a considerable amount of time-in many cases for decades-and were therefore well-acquainted with the geospatial variations within, as well as between, the plots. It was concluded that the yield productivity zones correctly identify spatial patterns-in other words, zones with higher-than-average and lower-than-average yields within a plot. At the same time, the yield productivity zones were understood by the surveyed farmers, agronomists, and agro-scientists as one of the inputs for planning operations; other kinds of input varying in importance year to year included the weather, weeds, and pests. In general, the yield productivity zones are recognized as a valuable source of information; however, this information needs to be coordinated with the particularities of the agronomical season, including the kind of crop cultivated.
Yield productivity zones have been computed in the way described in this paper for the last four years (2016-2019). Unfortunately, yield measurements were not available for the same time period. Only partial information is therefore available on the geospatial distribution of yield measurements during various years. We came to the following conclusions when comparing the sums of the actual yields on the one hand and yield productivity zones (predictions) on the other hand for each plot at a dozen monitored farms in Europe from 2014. In terms of yield, the geospatial variations within the plots were minimal during rich years. It was argued by farmers/agronomists/agro-scientists that the monitored cereals would have grown anyway during such rich years (with optimal temperatures, rainfall, etc.). On the contrary, the pattern as depicted by yield productivity zones (prediction) became more clear in terms of yield during the poor years.
Uncertainties lie in areas of spatial and temporal resolutions: • The spatial resolution of satellite data caused data inconsistencies between the Sentinel and Landsat missions. The yield productivity zones were calculated at the spatial resolution of 5 m, which meant a smoothening from the original spatial resolution of 30 m in the case of the Landsat data and from 20 m in the case of the Sentinel data.

•
The spectral resolution of Sentinel-2A, Sentinel-2B, and Landsat 8 varies as these sensors differ in the ranges of recorded radiation. The presented paper deals with eight years' series, comprising combinations of all three sensors. A more detailed analysis is beyond the scope of this paper.

•
The temporal resolution for satellite data causes inconsistencies between the Sentinel-2A/B (5 days) and Landsat-8 (16 days) missions. Between 2016 and 2019, data from both the satellite missions were available; the Landsat mission satellite data are the only sufficient and freely available data older than 6 May 2016 for the study area.

•
The spatial resolution for yield measurements: the positional error was influenced by two factors-the speed of the harvester and the delay between the collection of grain and the computation of the respective yield [55]. Theoretically, the maximum positional error for the yield measurements could be up to 18.6 m, although such a high value would be improbable in practice. The yield measurements from harvesters in our study reached a spatial resolution of up to 9.15 m (operational harvesting width) to 3.1 m (measurements each two seconds for average speed 1.55 m·s −1 ).

•
The temporal resolution for the yield measurements: the time period for the monitoring of farm machinery telemetry should be the same as that for yield productivity zones-i.e., the last 8 years. Such a requirement could not be met for the experiments conducted. Only three consecutive years measurements for three plots were available at Rostěnice Farm.

Computing and expressions including the visualizations of uncertainties remain an open issue.
Another subject of debate is the processing of data from headland areas occurring mainly at the edges of the plots. The concave hulls approach used for yield measurements implies less headland areas and less filtering. It is therefore assumed that the results, too, are generally more accurate. However, this conclusion is closely related to the harvesting strategy [55].
The use of the real-time tracking of farm machinery was mentioned in the introduction is one of the prerequisites for any kind of precision farming application. In the scope of the sensor observation domain, a piece of machinery is equivalent to a set of sensors deployed on a mobile carrier. On the one hand, machinery can be monitored passively in order to collect data about the machinery itself and about the processed plot; on the other hand, machinery can be monitored actively in order to navigate the machinery and to control the batching of fertilizer, pesticides, or other substances at desired locations. Data derived from machinery tracking can supplement the source dataset for yield productivity zone calculations due to the more frequent use of machinery on the same plot.

Conclusions
Studies on yield predictions are becoming increasingly common. In contrast, studies comparing predictions computed from satellite images with yield measurements at fully operational farms are rare since (detailed) yield data are the most sensitive kind of farm data. The non-openness of yield data is even worse with respect to accessing yield maps that depict the geospatial variations in yield within a plot-that is to say, maps with yield measurements from harvesters in our study reaching up to a spatial resolution of 9.15 m (operational harvesting width) to 3.1 m (measurements each two seconds for average speed 1.55 m·s −1 ).
The conducted study proved correlations between the predicted yield productivity zones and the measured yield for all three plots in all three years at a 95% significance level. Standard deviations as well as the extent of (minimum and maximum) values are in all cases higher than the yield values measured by a field harvester. The following statements are valid for the conducted study when taking into account the resulting mean values. In seven cases, the differences in yield predictions compared to the measured values were up to 5%. The prediction overestimated the yield by 6.66% in one case and underestimated it by 10.83% in another case. Moreover, a qualitative evaluation by farmers/agronomists/agro-scientists has also demonstrated the credibility of yield productivity zone identification as well as its geospatial distribution across six years (from 2014 to 2019) at a dozen farms in seven countries (Austria, the Czech Republic, Latvia, Germany, Poland, Spain and Turkey).
As the research conducted in the FOODIE and SIEUSOIL projects has confirmed, farmers are willing to use predictions in the form of yield productivity zones as an input for their decision-making. Geospatial and statistical evaluations demonstrated the validity of yield productivity zones as an instrument for predicting long-term areas of high and low yield, meaning areas where crop yield has for several years been significantly above/below the yield average for the whole plot.
Yield productivity zones are used throughout the whole vegetation season primarily because of fertilizer application planning. Moreover, the presented yield productivity zones concept has already been successfully registered under Phase 8 of the Global Earth Observation System of Systems (GEOSS) Architecture Implementation Pilot in order to support the wide variety of requirements that are primarily aimed at agricultural pollution monitoring, typically pollution by fertilizer/pesticide residues.
Supplementary Materials: The following are available online at http://www.mdpi.com/2072-4292/12/12/1917/s1, Table S1: Overview of satellite imagery from Landsat mission used for the calculation of yield zones; Table S2: Overview of satellite imagery from Sentinel-2 mission used for the calculation of yield zones. Funding: This paper is part of a project that has received funding from the European Union's Horizon 2020 research and innovation program under grant agreement No. 818346, called "Sino-EU Soil Observatory for intelligent Land Use Management" (SIEUSOIL).