Wheat Yield Forecasting for the Tisza River Catchment Using Landsat 8 NDVI and SAVI Time Series and Reported Crop Statistics

: Due to the increasing global demand of food grain, early and reliable information on crop production is important in decision making in agricultural production. Remote sensing (RS)-based forecast models developed from vegetation indices have the potential to give quantitative and timely information on crops for larger regions or even at farm scale. Different vegetation indices are being used for this purpose, however, their efﬁciency in estimating crop yield certainly needs to be tested. In this study, wheat yield was derived by linear regressing reported yield values against a time series of six different peak-seasons (2013–2018) using the Landsat 8-derived Normalized Difference Vegetation Index (NDVI) and Soil Adjusted Vegetation Index (SAVI). NDVI- and SAVI-based forecasting models were validated based on 2018–2019 datasets and compared to evaluate the most appropriate index that performs better in forecasting wheat production in the Tisza river basin. Nash-Sutcliffe efﬁciency index was positive with E 1 = 0.716 for the model from NDVI and for SAVI E 1 = 0.909, which means that the forecasting method developed and performed good forecast efﬁciency. The best time for wheat yield prediction with Landsat 8-SAVI and NDVI was found to be the beginning of full biomass period from the 138th to 167th day of the year (18 May to 16 June; BBCH scale: 41–71) with high regression coefﬁcients between the vegetation indices and the wheat yield. The RMSE of the NDVI-based prediction model was 0.357 t/ha (NRMSE: 7.33%). The RMSE of the SAVI-based prediction model was 0.191 t/ha (NRMSE 3.86%). The validation of the results revealed that the SAVI-based model provided more accurate forecasts compared to NDVI. Overall, probable yield amount is possible to predict far before harvest (six weeks earlier) based on Landsat 8 NDVI and SAVI and generating simple thresholds for yield forecasting, and a potential loss of wheat yield can be mapped.


Introduction
Since global trading prices of agricultural commodities depends largely on their seasonal production levels, the total size of cropping area and crop yields are important for export-import activities, agricultural agencies at national and international levels, government agencies, and other crop marketing agencies. Furthermore, due to the increasing global demand of food grain, early and reliable information on crop production is important from a humanitarian point of view as well to organize emergency response and food aid interventions [1].
In recent years, remote sensing became a widespread technique used in agriculture and agronomy Atzberger [2] and the interest in using satellite remote-sensed data for crop monitoring and crop production forecasting has increased as it produces uniform data at the global scale, and modelling results can potentially be utilized in large regions. Reliable and The study area is part of an international catchment, lowlands of the Tisza river catchment (altitude below 200 m), which is the most important wheat and corn producing region in the Carpathian basin and in Central Eastern Europe (CEE). Based on the annual reports of the Hungarian central statistical offices, approximately 55% of the arable lands are covered with wheat and maize. At the study site, 26 wheat fields were selected throughout the Northern Great Plain region (NUT 2) at the Hungarian Great Plain region of Tisza river basin. The study sites were of different sizes ranging from 5 to 34 ha. The total area involved into the research was 438 ha.
Temperature, light or radiation and soil moisture are three cardinal climatic factors that affects the development of vegetation and flowering during crop production. The plain sites of Tisza catchment have a suitable global radiation. About 4430 MJ/m 2 /year is the mean energy input by radiation onto the surface. This relatively high radiation is because of the long photoperiod which is about 2050 h/year. The average annual daily temperature in Hungary is between 10-11 • C and 17.5 • C during the growing season. For winter wheat to grow, the sum of average daily temperatures during the growing season must reach 2100 • C [25].
Though the Great Plain region has a humid continental climate with average precipitation of 495 mm, the studied region suffers from water management. Problems such as floods, surplus water and drought phenomena occurs regularly. In the same year, surplus water and drought often occur, even in the same period of vegetation. The precipitation is the most variable climate element in this plain site. It is different between years and the seasonal distribution is also extreme. For instance, according to the data of the National Weather Service, between the years 1901 and 2010, the maximum and minimum annual precipitation were 321 mm and 953 mm respectively. This provides an unpredictable water supply for the vegetation and makes crop and fruit production vulnerable. An increasing number of heat days during the ripening of cereals and the lack of precipitation can lead to a significant loss of yield and to a deterioration in quality [26]. It is expected that the Tisza river basin will experience more severe drought events according to the climate change models' predictions, while on the other hand, more extreme precipitation events in the future are foreseen.

Reported Crop
The average wheat yield data (t/ha) from 26 arable lands of the study site for the period of 2013-2019 was provided by Karcag Research Institute, University of Debrecen ( Figure 1). The wheat yield was measured based on the total wheat weight harvested from a certain site, and derived to a unit of hectare. A remarkable amount of winter wheat yield was detected in this area in 2015, 2017 (>5 t/ha) and an average amount in 2019 (~5 t/ha). On the other hand, significant drought periods were observed in several years, therefore decrease in yields were measured in 2013, 2014, 2016 and 2018 (~0.8-1 t/ha loss, compared to average).

Landsat 8 NDVI and SAVI
Compared to high-resolution sensors, low-resolution satellite images have a much better synoptic view and temporal revisit frequency system because of their large swat width [1]. However, the accuracy of yield detection, the interpretation (and validation) of the signal, as well as the reliability of the derived information products have usually been complicated by the spatial resolution. An average farm size is 14-15 ha in Hungary [27] and field sizes are often smaller. Therefore, datasets such as Fraction of Absorbed Photosynthetically Active Radiation (fAPAR) or Advanced Very High-Resolution Radiometer (AVHRR) is not appropriate for the monitoring of yield [28], because one pixel exceeds the average crop farm size in CEE region. In Hungary, MODIS NDVI time-series at 250 m ground resolution had sufficient temporal and radiometric resolution to distinguish major crop types and crop-related land use practices [10,24]. On the other hand, 250 m MODIS pixels could have less than 100% wheat and maize sites and they are partially covered by other types of land cover, which causes inherent uncertainty during measurement [16]. The Landsat 8 produces data with higher spatial resolution (30 m) and free availability [29]. The higher the spatial resolution of remote-sensing data, the better its assistance in the identification of the low-resolution pixels. Although Landsat (or similar sensors such as SPOT) is also the main source of data with sufficient spatial resolution in most agricultural areas, but with a 16-day gap between successive images, it can be difficult to obtain enough clear images within a growing season in very humid years in Hungary due to cloud cover [30]. Besides, studies on the relationship between crop yield and Landsatderived vegetation indices are mostly bound to focus on individual fields [31][32][33]. Since Landsat-derived vegetation indices have been proven to be one of the effective tools for assessing vegetation condition, it is valid to use them to predict crop yield. In the present study, Landsat 8 satellite images for the growing season 2013 to 2019 was downloaded from USGS EarthExplorer website. Landsat 8 data have 11 bands: operational land imager (OLI) and thermal infrared sensors (TIRS) C1 Level-1 images with nine spectral bands, however, the bands 4 (red) and 5 (NIR) with 30 m spatial resolution were used for further processing.

Data Processing and Yield Forecast
Data processing and yield forecasting involves several steps from data collection through processing and calibration to validations, which can be performed in several steps. First, data collection and preprocessing, then vegetation indexing, and smoothing,

Landsat 8 NDVI and SAVI
Compared to high-resolution sensors, low-resolution satellite images have a much better synoptic view and temporal revisit frequency system because of their large swat width [1]. However, the accuracy of yield detection, the interpretation (and validation) of the signal, as well as the reliability of the derived information products have usually been complicated by the spatial resolution. An average farm size is 14-15 ha in Hungary [27] and field sizes are often smaller. Therefore, datasets such as Fraction of Absorbed Photosynthetically Active Radiation (fAPAR) or Advanced Very High-Resolution Radiometer (AVHRR) is not appropriate for the monitoring of yield [28], because one pixel exceeds the average crop farm size in CEE region. In Hungary, MODIS NDVI time-series at 250 m ground resolution had sufficient temporal and radiometric resolution to distinguish major crop types and crop-related land use practices [10,24]. On the other hand, 250 m MODIS pixels could have less than 100% wheat and maize sites and they are partially covered by other types of land cover, which causes inherent uncertainty during measurement [16]. The Landsat 8 produces data with higher spatial resolution (30 m) and free availability [29]. The higher the spatial resolution of remote-sensing data, the better its assistance in the identification of the low-resolution pixels. Although Landsat (or similar sensors such as SPOT) is also the main source of data with sufficient spatial resolution in most agricultural areas, but with a 16-day gap between successive images, it can be difficult to obtain enough clear images within a growing season in very humid years in Hungary due to cloud cover [30]. Besides, studies on the relationship between crop yield and Landsat-derived vegetation indices are mostly bound to focus on individual fields [31][32][33]. Since Landsat-derived vegetation indices have been proven to be one of the effective tools for assessing vegetation condition, it is valid to use them to predict crop yield. In the present study, Landsat 8 satellite images for the growing season 2013 to 2019 was downloaded from USGS EarthExplorer website. Landsat 8 data have 11 bands: operational land imager (OLI) and thermal infrared sensors (TIRS) C1 Level-1 images with nine spectral bands, however, the bands 4 (red) and 5 (NIR) with 30 m spatial resolution were used for further processing.

Data Processing and Yield Forecast
Data processing and yield forecasting involves several steps from data collection through processing and calibration to validations, which can be performed in several steps. First, data collection and preprocessing, then vegetation indexing, and smoothing, masking NDVI and SAVI data based on fields, calibration of masked data with yield data, Agronomy 2021, 11, 652 5 of 13 and finally validation of yield forecasting models were performed. NDVI and SAVI was chosen in this research, since those vegetation indices (VIs) were ranked as VIs with the highest correlations with wheat yield [34,35]. To reduce the noise in the VIs time series, a smoothing process was need. In the present case, penalized spline-based smoother was applied for the smoothing of the data. Normalized Difference Vegetation Index (NDVI) and Soil Adjusted Vegetation Index (SAVI) image data were derived from Landsat 8 satellite images based on the following equations: The value of L varies by the amount or cover of green vegetation. L = 0: in very high vegetation regions, L = 1: in areas with no green vegetation, L = 0.5: works well in most situations; therefore, this default value used in this study.
Beside smoothing another obstacle is the cropland mask. In this study, shapes of 26 wheat growing agricultural fields were selected as crop-specific masks to derive wheat biomass-related time series average VI values separately for each wheat field in peakseasons of 6 years for those days, when Landsat data were available. Wheat yield prediction models related to each Landsat observation dates were developed by regressing observed yield values against time series Landsat 8-based NDVI and SAVI data. According to studies, remote sensing time series of at least 6 years should be used for analyzing crop yield [16,24]. Linear regression models were developed based on six years of VIs data (2013 to 2018). The use of standard linear regression models with standard estimation techniques is subject to a number of conditions regarding the explanatory (x) (which were the VIs) and output (y) variables (which was the yield data) and their relationship. For model validation, 2018 and 2019 data were used. The performance of the models forecast were evaluated based on the data of 26 sites (n = 26) using the accuracy metrics coefficient of determination (R 2 ), root means square error (RMSE) and Normalized Root Mean Square Error (NRMSE).
where: y i is predicted yield data;ý i is the observed yield data; y is the average of the observed yields; n is the number of field samples used for validation. RMSE is a commonly used uncertainty metric for absolute predictions of errors and NRMSE is useful for comparisons between seasons in case of variable yield ranges [36]. Nash-Sutcliffe efficiency 'E 1 was also used to assess the accuracy of the predictions of forecast. Nash and Sutcliffe [37] defined the efficiency E 1 as one minus the sum of the absolute squared differences between the predicted yield and the observed yield data, normalized by the variance of the observed yield data during the period under investigation: where: O i is the observed value or observed yield data, P i is the predicted yield values; O is the mean of observed values. Values of E ranges between 1.0 (which is perfect fit) and −∞. An efficiency that is lower than zero indicates that the average value of the observed yield would have been a better predictor than the model. During the validation process, in order to assess overall forecasting accuracy, the relative deviations and absolute deviations of the predicted values from the observed values were calculated to assess the accuracy of the predictions. In order to highlight those yield ranges in which the forecasting model performs the best, significant differences were assessed between the predicted and observed yield values within different yield ranges.

Results
Wheat yield estimation was derived by regressing reported (observed) yield values against the time series of six different peak seasons NDVI and SAVI derived from Landsat 8. The application of these vegetation indices derived from Landsat 8 were analysed and tested for the production of wheat yield assessment and forecasting in the Tisza river catchment area. The peak-season of wheat is in May and early June, followed by the ripening stage, then the harvesting period is in early July in the Tisza river basin. Therefore, NDVI and SAVI data from day 120 to 190 (30 April to 9 July) were collected and analysed for wheat yield forecast. NDVI and SAVI indices showed the highest peaks (NDVI: 0.46 ± 0.077; SAVI: 0.837 ± 0.338) from day 138 to 150 (18 May to 30 May). After this period, vegetation indices started to decline slightly until the harvest in accordance with several studies [1,[38][39][40][41]. The significance and strength of the correlations between VIs and wheat yield changed accordingly with the changes of NDVI and SAVI values ( Figure 2). The coefficients of determination were the highest (R 2 > 0.6) from day 138 to 167 (18 May to 16 June) in the cases of both VIs. This interval was equal to BBCH from 41 to 71. The difference between NDVI-and SAVI-derived models was that NDVI performed maximum regression coefficients (R 2 = 0.757) in the beginning of the heading stage, whilst SAVI performed maximum regression coefficients (R 2 = 0.943) in the flowering and early ripening stages. Furthermore, the relationship between the wheat yield and the SAVI values were stronger than in the case of NDVI, suggesting SAVI to be a better predictor for wheat yield.
Agronomy 2021, 11, x FOR PEER REVIEW 6 where: is the observed value or observed yield data, is the predicted yield va is the mean of observed values. Values of E ranges between 1.0 (which is perfe and −∞. An efficiency that is lower than zero indicates that the average value of th served yield would have been a better predictor than the model.
During the validation process, in order to assess overall forecasting accuracy, th ative deviations and absolute deviations of the predicted values from the observed v were calculated to assess the accuracy of the predictions. In order to highlight those ranges in which the forecasting model performs the best, significant differences we sessed between the predicted and observed yield values within different yield range

Results
Wheat yield estimation was derived by regressing reported (observed) yield v against the time series of six different peak seasons NDVI and SAVI derived from Lan 8. The application of these vegetation indices derived from Landsat 8 were analysed tested for the production of wheat yield assessment and forecasting in the Tisza catchment area. The peak-season of wheat is in May and early June, followed by th ening stage, then the harvesting period is in early July in the Tisza river basin. There NDVI and SAVI data from day 120 to 190 (30 April to 9 July) were collected and ana for wheat yield forecast. NDVI and SAVI indices showed the highest peaks (NDVI: 0 0.077; SAVI: 0.837 ± 0.338) from day 138 to 150 (18 May to 30 May). After this pe vegetation indices started to decline slightly until the harvest in accordance with se studies [1,[38][39][40][41]. The significance and strength of the correlations between VIs and w yield changed accordingly with the changes of NDVI and SAVI values ( Figure 2) coefficients of determination were the highest (R 2 > 0.6) from day 138 to 167 (18 May June) in the cases of both VIs. This interval was equal to BBCH from 41 to 71. The d ence between NDVI-and SAVI-derived models was that NDVI performed maximu gression coefficients (R 2 = 0.757) in the beginning of the heading stage, whilst SAVI formed maximum regression coefficients (R 2 = 0.943) in the flowering and early ripe stages. Furthermore, the relationship between the wheat yield and the SAVI values stronger than in the case of NDVI, suggesting SAVI to be a better predictor for w yield.  n order to provide better insight to the defined yield prediction algorithms, results of regression analysis were selected to interpret the characteristics of the NDVI-and SAVIbased models in BBCH 41, BBCH 59 and BBCH 71 stages in Figure 3. The models provide rapid early detection of yields in different phenological stages of the wheat. n order to provide better insight to the defined yield prediction algorithms, results of regression analysis were selected to interpret the characteristics of the NDVI-and SAVIbased models in BBCH 41, BBCH 59 and BBCH 71 stages in Figure 3. The models provide rapid early detection of yields in different phenological stages of the wheat. The performances of NDVI and SAVI for wheat yield forecasting were calculated based on six years of the training data over the most sensitive (heading, flowering, ripening) period of wheat (May and early June). The observed average wheat yield for the study area from 2018 to 2019 were used to further validate the forecast models and to calculate the accuracy of the prediction. E1 was used to test the performance of the wheat yield which is a global measure of model efficiency. E1= 0.716 for the model from NDVI and E1 = 0.91 for SAVI. The prediction outcomes were compared with the official reported yield values. RMSE, NRMSE, absolute and the relative deviation (difference in percent) of forecast versus reported yield were also calculated. The coefficients of determination for wheat were more than 60% for NDVI and 70% for SAVI during the phenological peak period with the six training years. Although in the case of NDVI, the accuracy of the forecast varied between 0.25-0.45 t/ha (5.13%-9.30%) based on the RMSE and NRMSE of the prediction models. The best prediction accuracies were based on the models derived from the early ripening periods. In the case of SAVI, the accuracy of the forecast varied between 0.17-0.23 t/ha (3.34-4.64%) showing better performance of the prediction (Figure 4). The performances of NDVI and SAVI for wheat yield forecasting were calculated based on six years of the training data over the most sensitive (heading, flowering, ripening) period of wheat (May and early June). The observed average wheat yield for the study area from 2018 to 2019 were used to further validate the forecast models and to calculate the accuracy of the prediction. E 1 was used to test the performance of the wheat yield which is a global measure of model efficiency. E 1 = 0.716 for the model from NDVI and E 1 = 0.91 for SAVI. The prediction outcomes were compared with the official reported yield values. RMSE, NRMSE, absolute and the relative deviation (difference in percent) of forecast versus reported yield were also calculated. The coefficients of determination for wheat were more than 60% for NDVI and 70% for SAVI during the phenological peak period with the six training years. Although in the case of NDVI, the accuracy of the forecast varied between 0.25-0.45 t/ha (5.13%-9.30%) based on the RMSE and NRMSE of the prediction models. The best prediction accuracies were based on the models derived from the early ripening periods. In the case of SAVI, the accuracy of the forecast varied between 0.17-0.23 t/ha (3.34-4.64%) showing better performance of the prediction (Figure 4).  In order to assess the overall prediction accuracy of the VI based prediction models predicted yield values for the total, most sensitive periods were averaged and compared to observed yield values. The RMSE of NDVI based prediction model was 0.357 t/ha (NRMSE: 7.33%). The RMSE of SAVI based prediction model was 0.191 t/ha (NRMSE 3.86%) ( Figure 5). Based on the mean relative deviation (−1.072%) NDVI slightly underestimated the observed yield values. On the other hand, SAVI-based predictions performed less, but positive differences (0.46) compared to reported values. The mean absolute deviation between the estimated and the reported yield data was also higher in the case of NDVI derived prediction (about 8.5%) compared to the 4.1% of absolute deviation value of SAVIbased yield predictions ( Figure 6). The results showed that only the accuracy of NDVIderived prediction exceeded the 5% threshold, which is generally accepted as good [10]. In order to assess the overall prediction accuracy of the VI based prediction models predicted yield values for the total, most sensitive periods were averaged and compared to observed yield values. The RMSE of NDVI based prediction model was 0.357 t/ha (NRMSE: 7.33%). The RMSE of SAVI based prediction model was 0.191 t/ha (NRMSE 3.86%) ( Figure 5).  In order to assess the overall prediction accuracy of the VI based prediction models predicted yield values for the total, most sensitive periods were averaged and compared to observed yield values. The RMSE of NDVI based prediction model was 0.357 t/ha (NRMSE: 7.33%). The RMSE of SAVI based prediction model was 0.191 t/ha (NRMSE 3.86%) ( Figure 5). Based on the mean relative deviation (−1.072%) NDVI slightly underestimated the observed yield values. On the other hand, SAVI-based predictions performed less, but positive differences (0.46) compared to reported values. The mean absolute deviation between the estimated and the reported yield data was also higher in the case of NDVI derived prediction (about 8.5%) compared to the 4.1% of absolute deviation value of SAVIbased yield predictions ( Figure 6). The results showed that only the accuracy of NDVIderived prediction exceeded the 5% threshold, which is generally accepted as good [10]. Based on the mean relative deviation (−1.072%) NDVI slightly underestimated the observed yield values. On the other hand, SAVI-based predictions performed less, but positive differences (0.46) compared to reported values. The mean absolute deviation between the estimated and the reported yield data was also higher in the case of NDVI derived prediction (about 8.5%) compared to the 4.1% of absolute deviation value of SAVI-based yield predictions ( Figure 6). The results showed that only the accuracy of NDVI-derived prediction exceeded the 5% threshold, which is generally accepted as good [10]. Therefore, after the assessment of the overall yield prediction accuracy, the uncertainties and forecasting precision for different yield ranges were evaluated in order to highlight those yield ranges in which the forecasting model showed the best performance. First, the predicted yield values were associated with the observed yields. Then, four groups with different yield intervals were set up, and the grouping was based on the observed yield values. Thus, the observed and associated predicted yields became comparable within the intervals set based on the observed yield data. Independent T-test analyses were used to assess the significant difference between the related observed and predicted yield values within wheat. As a result, the distribution of the predicted yields was possible to compare to the real, observed data distributions (Figure 7). * yield range, in which there was significant difference between observed and predicted yield data (p < 0.05)

Observed yield (t/ha)
• outstanding value Figure 7. Differences between observed and predicted yield within wheat yield ranges for NDVI and SAVI.
In the case of NDVI, higher yield values were significantly underestimated. The difference between predicted yield values and observed yield values was 0.55 t/ha (in average). In the case of SAVI, there were no significant differences between the observed and predicted values. Therefore, after the assessment of the overall yield prediction accuracy, the uncertainties and forecasting precision for different yield ranges were evaluated in order to highlight those yield ranges in which the forecasting model showed the best performance. First, the predicted yield values were associated with the observed yields. Then, four groups with different yield intervals were set up, and the grouping was based on the observed yield values. Thus, the observed and associated predicted yields became comparable within the intervals set based on the observed yield data. Independent T-test analyses were used to assess the significant difference between the related observed and predicted yield values within wheat. As a result, the distribution of the predicted yields was possible to compare to the real, observed data distributions (Figure 7). Therefore, after the assessment of the overall yield prediction accuracy, the uncertainties and forecasting precision for different yield ranges were evaluated in order to highlight those yield ranges in which the forecasting model showed the best performance. First, the predicted yield values were associated with the observed yields. Then, four groups with different yield intervals were set up, and the grouping was based on the observed yield values. Thus, the observed and associated predicted yields became comparable within the intervals set based on the observed yield data. Independent T-test analyses were used to assess the significant difference between the related observed and predicted yield values within wheat. As a result, the distribution of the predicted yields was possible to compare to the real, observed data distributions (Figure 7). * yield range, in which there was significant difference between observed and predicted yield data (p < 0.05)

Discussion
• outstanding value In the case of NDVI, higher yield values were significantly underestimated. The difference between predicted yield values and observed yield values was 0.55 t/ha (in average). In the case of SAVI, there were no significant differences between the observed and predicted values. In the case of NDVI, higher yield values were significantly underestimated. The difference between predicted yield values and observed yield values was 0.55 t/ha (in average). In the case of SAVI, there were no significant differences between the observed and predicted values.

Discussion
This study was carried out in order to develop a satellite-based system (Landsat 8) for wheat yield forecasting and to determine the uncertainties of the prediction for different yield amounts for a solution applicable to lowlands of the Tisza river catchment. Several studies [1,[42][43][44][45][46] have shown the validity of using satellite-derived vegetation in-dices for wheat yield forecasting. In accordance with these previous studies, current results demonstrated that there are significant relationships between the VIs and final crop yields, and the peak season with the highest NDVI values at heading stage of the wheat was defined to be the best to predict the crop yield.
The validation of the NDVI and SAVI derived models performed well in wheat prediction. Furthermore, wheat yield prediction averaged for the most sensitive period (May and early June) is a better predictor than using prediction models individually derived from VIs at a certain day of the year. The accuracy of the NDVI-based prediction model was 0.357 t/ha (7.33%) and is in correspondence with Rudorff and Batista [47] studies, which estimated wheat yield at the farm level using Landsat in the Brazil with a 0.37-0.44 t/ha prediction accuracy. Final reported values in Pakistan with MODIS NDVI were 10% [16], and Nagy et al. [24] studied the wheat yield forecasting for the Tisza river catchment using MODIS NDVI and found 7% accuracy between the predicted and actual yield.
Furthermore, after investigating the predictions in different yield ranges, in the case of high values, NDVI-based yield forecast has the highest number of uncertainties. This might be because NDVI is known to saturate at high LAI values [48,49], bringing about the decrease in NDVI sensitivity for higher yields. From average to lowest wheat yields, the NDVI-based forecasting model performs the best such that the prediction can be a useful tool for detecting yield losses caused by drought phenomena. It can also be a feasible option in specific crop drought monitoring.
In contrary to NDVI, SAVI-based predictions are homogeneous in all yield ranges and the SAVI-based prediction model had better accuracy compared to NDVI: 0.191 t/ha (3.86%). Muller et al. [50] also found that SAVI achieved slightly higher accuracies than NDVI, suggesting that SAVI reduced some of the effects of soil background reflectance. Liaqat et al. [51] also proved that SAVI-derived models performed better for wheat prediction than NDVI and EVI (Enhanced Vegetation Index) Results also showed that soil adjusted vegetation index (SAVI) was the best VI (out of several, including NDVI) for LAI estimation [35], which can contribute to the better prediction accuracy of SAVI.
Overall, moderate and high spatial resolution remote sensing images such as Landsat 8 images have significant potential in wheat yield prediction and SAVI is a better predictor for wheat yield than NDVI.
The developed model was based on NDVI and SAVI from Landsat 8. Nowadays with the Sentinel's 2 and Proba-V sensors, a new era of Earth observation has begun [1]. With these new sensors, data availability at coarse/medium resolution increases at high revisit frequency, but further studies are needed to ensure a suitable sensor inter-calibration because there are not yet time series datasets for them to be used for accurate yield forecasting. In addition, some studies suggested the multivariable analysis or non-linear methods are more reasonable than the linear method when canopy reflectance is used to establish the yield prediction model method and have equal or better performance than the spectral index method in crop yield prediction, especially when hyperspectral data was used [52][53][54].

Conclusions
Information on long-term yield variability is important for tailoring farming practices to the needs of crops. In particular, remotely sensed vegetation indices (VIs) such as the NDVI and SAVI have been widely utilized for agricultural mapping and monitoring. Wheat yield forecasting techniques based on remote-sensing data with 30 × 30 m spatial resolution were estimated. In this study, a forecasting method was formulated based on vegetation indices derived from multi-spectral remote-sensing data (Landsat 8 NDVI and SAVI) and reported wheat yield data in the Tisza river catchment area. The developed wheat yield forecasting model provided timely information on the production of wheat, as well as its status and yield in a regular and standardized manner at field and catchment level. From the forecasting model that was developed based on six training years, the yield can be predicted six weeks earlier before harvesting. Understanding the applicability and accuracy of yield prediction is also an essential component of forecasting because the ultimate goal is to reduce forecast uncertainties for a particular location and for a specific group of people or agricultural or economic sector. With the forecasting method, moderately good and good estimates were provided based on NDVI and SAVI as early as possible during the growing season and can be updated periodically through the season until harvest. This information can reduce impacts of possible yield losses if delivered to farmers or decision makers in a timely and appropriate format and if mitigation measures and preparedness plans are in place.