A Framework for Accurate Annual Regional Crop Yield Prediction

Li, Hsuan-Yi; Lawrence, James A.; Mason, Philippa J.; Ghail, Richard C.

doi:10.3390/rs18081157


¹ Department of Civil and Environmental Engineering, Skempton Building, Imperial College London, South Kensington, London SW7 2AZ, UK

² Department of Earth Science & Engineering, Imperial College London, Prince Consort Road, London SW7 2AZ, UK

³ Department of Earth Sciences, Royal Holloway, University of London, Egham, Surrey TW20 0EX, UK

* Author to whom correspondence should be addressed.

Remote Sens. 2026, 18(8), 1157; https://doi.org/10.3390/rs18081157

Submission received: 19 March 2026 / Revised: 9 April 2026 / Accepted: 10 April 2026 / Published: 13 April 2026

(This article belongs to the Special Issue Advanced AI and Machine Learning for Monitoring Vegetation Dynamics)


Highlights

What are the main findings?

Annual regional crop yield can be accurately predicted with only opensource data.
EVI is the most important feature in the yield prediction framework.

What are the implications of the main findings?

EVI in April, May and June shows a strong correlation with winter barley yield.
High values of NDMI from April to June reduce annual yields of winter barley.

Abstract

Food insecurity arises from the impact of climate change and intense geopolitical conditions. Understanding crop farming plans and monitoring crop yields have therefore become major tasks for decision makers. Previous work has applied remote sensing techniques and empirical methods to predict yields and to analyse the relationships between spectral indices and historical crop yield data. However, these studies do not extract the values of spectral indices by crop type when the study area is regional, contains multiple farmlands and therefore requires a crop classification step. This can lead to inaccurate results when investigating the correlations between yield and spectral indices. This research develops a yield prediction framework in which historical crop maps, produced from Sentinel-2 imagery by unsupervised classification with zero ground truth, are used to retrieve the spectral index values of winter barley. The extracted spectral indices, together with meteorological and historical yield data for North Norfolk, UK, are used as inputs to a 1D Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM) and CNN-LSTM for winter barley yield prediction. LSTM performs best overall, and its best result achieves a Root Mean Square Error (RMSE) of 0.406 kg/hectare, a Mean Square Error (MSE) of 0.165 kg/hectare and a Mean Absolute Error (MAE) of 10.495 kg/hectare. The Enhanced Vegetation Index (EVI) in April, May and June is the most important feature in the LSTM model and shows a strong positive correlation with the yield of winter barley. The developed framework, combining unsupervised crop classification and LSTM, can be applied to multiple crop types and different regions using open-source datasets, historical yields, spectral indices and meteorological data. Correlations between these datasets indicate that higher EVI, maximum and minimum temperature and sun hours at the germination and seedling growth stages increase the yields of winter barley, whereas excess Water Content (WC) in plants, indicated by a higher Normalised Difference Moisture Index (NDMI) from April to June, leads to a decline in the yields of winter barley.

Keywords:

crop yield; agriculture; barley; remote sensing; meteorology; deep learning

1. Introduction

Climate change and intense geopolitical conditions have made ensuring food security more challenging. Food crop production and distribution are important to maintain food security and sustainability. Consistent annual yields of food crops are essential for stable production and distribution. Historical yield data since the green revolution (1966 to 2010) show that yields have flattened since the 1990s. This is particularly true in the high-yield farming systems for rice in East Asia (China, Republic of Korea and Japan), wheat in Northwest Europe (United Kingdom, France, Germany, the Netherlands and Denmark) and India, and maize in South Europe (Italy and France) [1]. The yield stagnation has been caused by specific policies and management systems such as the reduction of fertilizer in western Europe [2] and water scarcity for irrigation and soil-quality depletion in South Asia [3], and critical weather conditions such as heat stress on cereal crops in France [4]. The increased temperature has reduced the crop duration and decreased the length of the grain-filling phase, causing a yield decline in Denmark [5,6]. Alston et al. (2009) [7] suggest more cost-effective investments and agriculture studies to support a return to higher yields, especially in developing countries. To maintain long-term food security and prevent drops in crop yields, comprehensive crop land management is necessary. Accurate crop maps and crop yield forecasts are important tools for decision makers to project present and future crop land management and crop requirements. This research therefore proposes a framework of accurate crop classification and yield predictions using open-source data without the need for ground truth, which can be more widely used in multiple scenarios. The historical yields of crops can be analysed with the values of spectral indices extracted from specific crop types on the accurate crop maps generated from this framework.

Crop yield monitoring and forecasting rely on meteorological data and crop production statistics with integrated mathematical analysis [8]. Yield forecasting models such as Geospatial and Remote-sensing-based Agro-Meteorological (GRAMI) [9,10] and Simple Algorithm For Yield (SAFY) estimates [11] have been developed with meteorological data, crop parameters measured in the field and remote sensing images. They have been applied to multiple crop types in several research studies [12,13,14,15]. These two models use fixed input parameters for crop growth and yield estimation and depend strongly on remote sensing data. Consequently, inaccurate estimation of the Leaf Area Index (LAI) from Vegetation Indices (VIs) derived from remote sensing images by means of an empirical modelling approach has led to imprecise crop yield predictions [16,17]. To overcome the limitations of the input parameters of the empirical modelling approach and of LAI values, crop yield forecasting models have been constructed with Machine Learning (ML) and Deep Learning (DL) methods.

ML methods such as Random Forest (RF) [18,19], XGBoost [20] and Support Vector Machine (SVM) [21], and DL methods such as Artificial Neural Network (ANN) [22], Convolutional Neural Network (CNN) [20,23], Recurrent Neural Network (RNN) [24] and LSTM [25,26,27], are widely used in crop yield forecasting. Comparing the results of crop yield predictions from ML and DL in previous research studies, DL methods were considered more promising owing to their automatic feature extraction and superior performance, and were recommended for future research in crop yield prediction [28]. An ANN (a feed-forward network) was applied with yield data, average elevation of the region and evapotranspiration indicators calculated from the spectral bands of remote sensing images to predict the yield of spring barley and winter wheat in the Czech Republic [22]. The predicted yields in that study were typically 0.5–1.0 t/ha higher than the observed yields, demonstrating that the accuracy of yield predictions can be improved. Yield predictions with CNN, RNN and LSTM were more accurate in other research studies. A review of crop yield prediction with DL methods indicated that CNN outperformed ANN, and RNN-based LSTM has been considered the most capable DL model with the best accuracy owing to the feedback loops in its processing [29]. A CNN-RNN yield prediction framework using soil properties, weather conditions and management practices achieved a Root Mean Square Error (RMSE) of 9% and 8% on corn and soybean yield prediction, and demonstrated that weather conditions were the most important factor in the model [30]. A phenology-based LSTM model was developed with the meteorological indices, daily growing degree days (GDD), killing degree days (KDD), precipitation and the wide dynamic range vegetation index (WDRVI). This LSTM model achieved highly accurate performance with an RMSE of 1.47 Mg/ha [31].
The yield of soybeans in Brazil was predicted using LSTM with meteorological data and satellite images, including land surface temperature, precipitation, NDVI and EVI. The LSTM model was tested against multiple sets of data from different Days of the Year (DOY) (counting from the beginning of the year) and it showed the lowest MAE at 0.24 Mg/ha with the data until DOY64 [32]. CNN, LSTM, and CNN-LSTM were utilised in soybean yield predictions with U.S. Department of Agriculture (USDA) yield data, MODIS Surface Reflectance, MODIS Land Surface Temperature, precipitation and vapor pressure. CNN-LSTM outperformed the other two models and worked highly efficiently on the Google Earth Engine (GEE) [33]. These research studies demonstrated the capability of DL models with crop data statistics, soil properties, remote sensing and meteorological data. However, the relationships between the yields of the crops and the remote sensing and meteorological data were not analysed. Furthermore, directly applying spectral indices of the selected study area from remote sensing data to predict the yield of a specific crop type can cause inaccuracy. Since the values of spectral indices were extracted at a regional scale area consisting of multiple types of plants, these values cannot represent the specific crop type for its yield prediction.

This research aims to develop a widely applicable, rigorous framework for crop yield prediction, focusing on winter barley, and to analyse the relationships between the spectral indices, meteorological data and the yield of winter barley, with four main objectives:

(1): Constructing a framework from online opensource datasets for broader usage under various conditions in multiple regions.
(2): Automatically producing crop maps with zero ground truth data by means of unsupervised crop classification and extracting spectral indices of winter barley.
(3): Comparing the results of yield estimates by DL models to find out the most suitable DL model to be applied in this framework.
(4): Studying the importance of variables in DL models and analysing the correlations between the spectral indices, meteorological data and the yield of winter barley.

2. Materials and Methods

This research develops a comprehensive yield prediction framework to improve the accuracy of the yield predictions with Earth Observation (EO) data. Crop maps are produced with EO data by a ML model. The values of spectral indices of winter barley are extracted from crop maps and applied in three DL models with meteorological data in North Norfolk to predict the yield of winter barley.

2.1. Yield Prediction Framework

This framework comprises three main stages: crop classification; training and validation; and testing (Figure 1). Crop classification uses Sentinel-2 images from November 2017 to June 2023, resampled to 60 m per pixel and processed into the spectral indices NDVI, SAVI, EVI and NDMI (Table 1). The indices are calculated for each pixel, and distance matrices are generated with FastDTW from the proximities between pixels. Pixels are then clustered by hierarchical clustering on the normalised distance matrices. The pixels are georeferenced with their clusters to produce the final integrated result, a crop map. Training and validation use two main input datasets and one expected output. The values of the spectral indices NDVI, SAVI, EVI and NDMI are extracted from the winter barley pixels on the crop maps from 2018 to 2022. In addition, minimum and maximum temperatures, rainfall and sun hours in the months between November and June from 2018 to 2022 are collected as input parameters. The expected outputs are the historical yield data of winter barley from 2018 to 2022. The three DL models, CNN, LSTM and CNN-LSTM, are trained and validated on these inputs and expected outputs for 300 epochs to generate three yield prediction models. The crop map and meteorological data from 2023 are used to test the yield prediction models. The RMSE, MSE and MAE between the yield estimates and the observed yield of winter barley in 2023 are calculated for model evaluation. The Permutation Feature Importance (PFI) of each variable in the DL models is also calculated.

2.2. Crop Classification

To classify the crop phenology, atmospherically corrected Sentinel-2A images from November 2017 to June 2023 were collected through Copernicus [38]. The images are resampled to 60 m, the most suitable pixel size for the fields in the North Norfolk area [39]. The values of the spectral indices NDVI, SAVI, EVI and NDMI of each pixel are calculated from the resampled spectral bands, Near InfraRed (NIR), Red, Green, Blue and Short-wavelength InfraRed (SWIR), of the Sentinel-2A images. Crop maps from 2018 to 2023 are produced from these spectral indices by FastDTW-HC [40], which combines Fast Dynamic Time Warping (DTW) and Hierarchical Clustering (HC). FastDTW-HC has been tested with multiple sets of variables, including datasets from Sentinel-1 and -2, to produce crop maps. Using the spectral indices NDVI, SAVI, EVI and NDMI as the input of FastDTW-HC produced the most accurate unsupervised crop maps. Thus, in this research study, the combination of NDVI, SAVI, EVI and NDMI is applied to generate the regional unsupervised crop maps with FastDTW-HC. Four distance matrices, one each for NDVI, SAVI, EVI and NDMI, are generated by FastDTW from the proximities among pixel values. After the four distance matrices are normalised and combined, pixels are clustered into classes by HC. The clustered pixels are georeferenced to UTM/WGS84 to produce the final crop maps from 2018 to 2023.
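As a minimal sketch of the index calculation described above, the following Python functions use the commonly published definitions of the four indices. Table 1 is not reproduced in this section, so the coefficient values (soil factor 0.5 for SAVI; 2.5, 6, 7.5 and 1 for EVI) are assumptions based on those standard definitions, and the reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil Adjusted Vegetation Index; soil_factor=0.5 is an assumed default."""
    return (nir - red) / (nir + red + soil_factor) * (1.0 + soil_factor)


def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, canopy=1.0):
    """Enhanced Vegetation Index with the commonly used coefficients (assumed)."""
    return gain * (nir - red) / (nir + c1 * red - c2 * blue + canopy)


def ndmi(nir, swir):
    """Normalised Difference Moisture Index."""
    return (nir - swir) / (nir + swir)


# Hypothetical surface reflectances (0-1 scale) for one vegetated pixel.
pixel = {"nir": 0.45, "red": 0.05, "blue": 0.03, "swir": 0.20}

indices = {
    "NDVI": ndvi(pixel["nir"], pixel["red"]),
    "SAVI": savi(pixel["nir"], pixel["red"]),
    "EVI": evi(pixel["nir"], pixel["red"], pixel["blue"]),
    "NDMI": ndmi(pixel["nir"], pixel["swir"]),
}
print(indices)
```

Applying these functions to every pixel and every acquisition date would give one time series per pixel and index, which is the form of input the clustering step expects.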

FastDTW-HC [40] is an unsupervised ML model that calculates the similarities between pixels with FastDTW and then clusters the pixels by HC using those similarities. FastDTW-HC is suited to time series data because it calculates similarities between pixels from their values at multiple time points (Figure 2), unlike traditional Euclidean methods, which calculate similarities between pixels at a single time point. Agglomerative HC (Figure 3) is then applied to the generated similarities among pixels to form a dendrogram by means of a bottom-up approach and classify the pixels into specific clusters.
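The idea of warping-based similarity followed by bottom-up clustering can be illustrated with a short sketch. It is not the FastDTW-HC implementation: it uses the classical quadratic-time DTW recurrence instead of the FastDTW approximation, a single index instead of four combined matrices, and average linkage, which is an assumption because the linkage criterion is not stated here. All series values are hypothetical.

```python
def dtw_distance(a, b):
    """Classical dynamic time warping distance between two 1D series."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def agglomerative(dist, n_clusters):
    """Bottom-up average-linkage clustering on a full distance matrix."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                pairs = [dist[i][j] for i in clusters[p] for j in clusters[q]]
                avg = sum(pairs) / len(pairs)
                if best is None or avg < best[0]:
                    best = (avg, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]  # merge the closest pair
        del clusters[q]
    return clusters


# Hypothetical index profiles for four pixels over five dates.
series = [
    [0.2, 0.3, 0.6, 0.8, 0.5],  # pixel 0
    [0.2, 0.2, 0.3, 0.6, 0.8],  # pixel 1: similar shape, shifted in time
    [0.7, 0.6, 0.4, 0.3, 0.2],  # pixel 2
    [0.7, 0.7, 0.6, 0.4, 0.3],  # pixel 3: similar shape to pixel 2, shifted
]
dist = [[dtw_distance(s, t) for t in series] for s in series]
clusters = agglomerative(dist, n_clusters=2)
print(clusters)
```

Because DTW aligns the time-shifted profiles before comparing them, pixels 0 and 1 end up closer to each other than a date-by-date comparison would suggest, which is the property the text attributes to time series similarity.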

To verify the capability of the proposed unsupervised classification, Li et al. (2025) [39] tested the models for the years 2020 and 2023 with FastDTW-HC and statistical analysis, scoring accuracies of 77% and 77% for winter barley, 77% and 77% for wheat, 97% and 77% for winter rapeseed and 84% and 95% for spring barley. The F1 scores of winter barley and winter rapeseed in 2020 and 2023 are 0.71 and 0.88, 0.69 and 0.6, respectively, which means the unsupervised classification performed well in recognising crops, especially winter barley and winter rapeseed.

2.3. Yield Prediction Models

The values of spectral indices from 2018 to 2023 are extracted from the crop maps produced by FastDTW-HC and combined with meteorological data in the yield prediction models. The historic meteorological data for North Norfolk, UK, from 2018 to 2023, comprising minimum and maximum temperature, rainfall and sun hours, are obtained from the Met Office record for the Lowestoft Monckton Avenue weather station, located at 01°43′37.20′′E, 52°28′58.80′′N [42]. Following the crop phenology, the values of NDVI, SAVI, EVI and NDMI and the monthly minimum and maximum temperature, rainfall and sun hours from November to June of each year are selected as the input variables of the yield prediction models 1D CNN, LSTM and 1D CNN-LSTM. The input values of spectral indices are calculated for three growth seasons of winter barley, divided by Growth Stages (GS) [43]: Season 1 (November and December), the germination and seedling growth stages (GS00-20); Season 2 (January, February and March), the tillering stage (GS20-29); and Season 3 (April, May and June), the stem elongation, flowering and grain filling stage (GS30-89). All input variables are divided into two sections: 2018 to 2022 for training and validation and 2023 for testing. The expected outputs are the yields of winter barley from 2018 to 2023 collected from the survey of regional yields for England by the Department for Environment, Food and Rural Affairs (DEFRA) [44]. The accuracy of the yield estimates for 2023 in the testing stage is measured by RMSE, MSE and MAE.
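The grouping of monthly values into the three growth seasons can be sketched as follows. The text does not state how the seasonal value is computed, so the use of a simple mean is an assumption, and the monthly EVI values are hypothetical.

```python
# Months assigned to each growth season, following the seasons defined in the text.
SEASONS = {
    "season_1": ("Nov", "Dec"),         # germination and seedling growth
    "season_2": ("Jan", "Feb", "Mar"),  # tillering
    "season_3": ("Apr", "May", "Jun"),  # stem elongation to grain filling
}


def seasonal_means(monthly):
    """Average monthly values within each growth season (mean is an assumption)."""
    return {
        name: sum(monthly[m] for m in months) / len(months)
        for name, months in SEASONS.items()
    }


# Hypothetical monthly EVI of winter barley pixels for one growing year.
monthly_evi = {
    "Nov": 0.20, "Dec": 0.24,
    "Jan": 0.25, "Feb": 0.30, "Mar": 0.35,
    "Apr": 0.50, "May": 0.60, "Jun": 0.55,
}
features = seasonal_means(monthly_evi)
print(features)
```

Repeating this for each index and year would yield one feature per index and season, matching the seasonal variables that appear in the feature importance ranking.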

2.3.1. 1D CNN

Unlike the 2D CNN applied on images and videos, 1D CNN is designed for 1D array data such as time series, sequential and text datasets as shown in Figure 4. At the repeated convolutional filters and max pooling layers, 1D CNN can learn to extract the features of the datasets which are utilised in the classification performed by Multi-Layer Perceptron (MLP), namely fully connected dense layers. The outputs of the feature extraction are flattened to 1D to feed in the 1D dense layers for prediction. The two main stages, feature extraction and prediction, are fused to one step which reduces the computational processing time and complexity [45,46].
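The feature extraction and prediction stages can be illustrated with a dependency-free sketch of one convolutional filter, one max pooling layer and one dense output. The kernel and dense weights are fixed, hypothetical values chosen for readability; in a trained 1D CNN they would be learned, and the study's actual layer configuration is not given in this section.

```python
def conv1d(signal, kernel, bias=0.0):
    """Valid (unpadded) 1D convolution as used in CNNs, followed by ReLU."""
    width = len(kernel)
    out = []
    for start in range(len(signal) - width + 1):
        window = signal[start:start + width]
        activation = sum(w * x for w, x in zip(kernel, window)) + bias
        out.append(max(0.0, activation))
    return out


def max_pool1d(features, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(features[i:i + size]) for i in range(0, len(features) - size + 1, size)]


def dense(features, weights, bias=0.0):
    """Single-output fully connected layer applied to the flattened features."""
    return sum(w * x for w, x in zip(weights, features)) + bias


sequence = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0, 22.0, 29.0]  # hypothetical input series
kernel = [-1.0, 1.0]  # hypothetical filter that responds to increases

feature_map = conv1d(sequence, kernel)        # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
pooled = max_pool1d(feature_map)              # [2.0, 4.0, 6.0]
prediction = dense(pooled, [0.5, 0.5, 0.5])   # 6.0
print(feature_map, pooled, prediction)
```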

2.3.2. LSTM

LSTM is an advanced type of RNN that addresses the vanishing and exploding gradient problems, which cause instability and inaccuracy when a traditional RNN processes long time sequences [47]. The architecture of LSTM shown in Figure 5 contains three gates: the forget gate, the input gate and the output gate. The forget gate determines which information in the memory cell from the previous step is retained and which is discarded, using Equation (1). The input gate controls the new information added to the memory cell, using Equations (2)–(4). The output gate controls the information passed on from the memory cell, which combines the old memory of the previous steps with the updated information, using Equations (5) and (6) [48,49].

We define the elements of LSTM in terms of:

f_{t} = \sigma(W_{f} \cdot [h_{t-1}, x_{t}] + b_{f})    (1)

i_{t} = \sigma(W_{i} \cdot [h_{t-1}, x_{t}] + b_{i})    (2)

\tilde{C}_{t} = \tanh(W_{C} \cdot [h_{t-1}, x_{t}] + b_{C})    (3)

C_{t} = f_{t} \odot C_{t-1} + i_{t} \odot \tilde{C}_{t}    (4)

o_{t} = \sigma(W_{o} \cdot [h_{t-1}, x_{t}] + b_{o})    (5)

h_{t} = o_{t} \odot \tanh(C_{t})    (6)

where W_{f}, W_{i}, W_{C} and W_{o} represent the trained weights and b_{f}, b_{i}, b_{C} and b_{o} are the biases in the forget gate, input gate and output gate; h_{t-1} is the parameter from the previous hidden layer and x_{t} is the new input parameter; \sigma and \tanh are the activation functions; \tilde{C}_{t} and C_{t} are the parameters generated in the input gate; \odot denotes the element-wise multiplication and h_{t} is the parameter from the hidden layer passing to the next cell.
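A single LSTM time step following Equations (1) to (6) can be written out directly. This is a minimal sketch using plain Python lists; the dictionary keys and the zero-valued weights are hypothetical and chosen only so that the result is easy to check by hand.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, vector, bias):
    """Return W . v + b for a row-major weight matrix."""
    return [sum(w * v for w, v in zip(row, vector)) + b for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step; comments give the matching equation numbers."""
    concat = h_prev + x_t  # the concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(z) for z in affine(params["W_f"], concat, params["b_f"])]        # (1)
    i_t = [sigmoid(z) for z in affine(params["W_i"], concat, params["b_i"])]        # (2)
    c_tilde = [math.tanh(z) for z in affine(params["W_C"], concat, params["b_C"])]  # (3)
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]          # (4)
    o_t = [sigmoid(z) for z in affine(params["W_o"], concat, params["b_o"])]        # (5)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]                              # (6)
    return h_t, c_t


# Hypothetical parameters: one hidden unit, one input feature, all weights zero.
params = {name: [[0.0, 0.0]] for name in ("W_f", "W_i", "W_C", "W_o")}
params.update({name: [0.0] for name in ("b_f", "b_i", "b_C", "b_o")})

h_t, c_t = lstm_step(x_t=[1.0], h_prev=[0.0], c_prev=[2.0], params=params)
print(h_t, c_t)
```

With all weights at zero, every gate evaluates to 0.5 and the candidate state to 0, so the cell state halves from 2.0 to 1.0 and the hidden state equals 0.5 times tanh(1.0).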

2.3.3. 1D CNN-LSTM

The architecture of 1D CNN-LSTM starts from the 1D CNN layer which learns to extract the general features in the convolutional filters and max pooling layers. The outputs of the 1D CNN layer are flattened and fed into the LSTM layer which extracts the temporal features in the datasets with the three gates (the forget gate, the input gate and the output gate). The generated outputs from the LSTM layer are fed into the dense layer for prediction to produce the final outputs [50,51].

2.3.4. Permutation Feature Importance

Permutation Feature Importance (PFI) is a model-agnostic technique for measuring the contribution of each input variable in a model. By randomly shuffling the values of each variable in the fitted model, the errors of new estimates are produced and compared with the errors of original estimates to generate the feature importance score [52]. This determines the level of a model relying on a particular feature, where the feature with the largest increased error of new estimates appears to be the most important feature of the model.
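The shuffling procedure can be sketched in a few lines. The choice of MSE as the error measure, the number of repeats and the toy model are assumptions made for illustration; the section does not state which error metric or repeat count was used.

```python
import random


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, targets, n_repeats=20, seed=0):
    """Mean increase in error after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = mean_squared_error(targets, [predict(r) for r in rows])
    scores = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this one column permuted.
            permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
            error = mean_squared_error(targets, [predict(r) for r in permuted])
            increases.append(error - baseline)
        scores.append(sum(increases) / n_repeats)
    return scores


def toy_model(row):
    """Hypothetical fitted model that relies on feature 0 and ignores feature 1."""
    return 3.0 * row[0]


rows = [[float(i), float((i * 7) % 5)] for i in range(10)]
targets = [toy_model(r) for r in rows]
scores = permutation_importance(toy_model, rows, targets)
print(scores)
```

Feature 1 receives a score of exactly zero because the model never reads it. With a model that fits noisy data, shuffling an uninformative feature can lower the error by chance, which is how negative scores arise.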

3. Results

The results are presented as crop maps from 2018 to 2023 in North Norfolk. The accuracies of three constructed crop yield prediction models with different sets of input parameters are evaluated with RMSE, MSE and MAE. The correlations between the input parameters, the values of spectral indices, the meteorological data, and the yields of winter barley are analysed.
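For reference, the three evaluation metrics can be computed as in the sketch below. The observed and predicted values are hypothetical and are not taken from the study.

```python
import math


def mse(y_true, y_pred):
    """Mean Square Error: average of squared residuals (squared units)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of MSE (same units as the yield)."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean Absolute Error: average of absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


observed = [6.0, 7.0, 8.0]    # hypothetical yields
predicted = [6.5, 6.0, 8.5]   # hypothetical model estimates

print(mse(observed, predicted), rmse(observed, predicted), mae(observed, predicted))
```

By construction, RMSE is the square root of MSE, and MAE never exceeds RMSE on the same residuals.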

3.1. Crop Maps from 2018 to 2023

The crop maps from 2018 to 2023 are produced in this research with Sentinel-2A images and FastDTW-HC, as shown in Figure 6. The study area in North Norfolk extends from 52°46′59.9′′N, 0°43′07.1′′E to 52°53′56.7′′N, 1°17′11.1′′E. It is divided into 24 tiles of 6 km by 6 km, with a 25% overlap between adjacent tiles to improve the accuracy of classification at the tile edges. Winter barley is shown in orange, winter wheat in blue, winter rapeseed in pink and spring barley in purple. The crop maps of 2018 to 2023 (Figure 6) show that the farms practise crop rotation, consistent with the 6-to-7-year modern crop rotation system in Norfolk, with winter barley being the dominant crop in 2019, 2021 and 2023, whilst winter wheat dominates in 2018, 2020 and 2022. The crop maps generated from 2018 to 2023 were utilised to extract the values of spectral indices of winter barley between 2018 and 2023.
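One way to lay out overlapping tiles is sketched below. It assumes that a 25% overlap means each 6 km tile starts 4.5 km after the previous one; the extent values are hypothetical and are not intended to reproduce the 24 tiles of the study area.

```python
import math


def tile_origins(extent_km, tile_km=6.0, overlap=0.25):
    """Start positions of tiles along one axis, given a fractional overlap."""
    stride = tile_km * (1.0 - overlap)  # 4.5 km for 6 km tiles with 25% overlap
    count = max(1, math.ceil((extent_km - tile_km) / stride) + 1)
    return [k * stride for k in range(count)]


xs = tile_origins(24.0)  # hypothetical east-west extent in km
ys = tile_origins(15.0)  # hypothetical north-south extent in km
tiles = [(x, y) for y in ys for x in xs]
print(xs, ys, len(tiles))
```

Pixels in the shared strips are classified in more than one tile, which is what allows edge effects to be reduced when the tiles are merged.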

3.2. Yield Monitoring and Estimation Evaluation

The values of spectral indices extracted from the crop maps (Figure 6) and the meteorological data from 2018 to 2023 are the input parameters of the yield prediction models, which are built with CNN, LSTM and CNN-LSTM. The spectral index values and meteorological datasets from 2018 to 2022 are used to train the three DL models. The data from 2023 are used to produce the yield estimate when testing the DL models. The yield predictions are then evaluated statistically with RMSE, MSE and MAE. In Table 2, the CNN model shows the poorest results, whereas the LSTM model, which is suited to time series prediction, performs much better and achieves the best results with the lowest RMSE, MSE and MAE. Although CNN-LSTM combines the two models, extracting general features with the CNN and temporal features with the LSTM, its accuracy is slightly lower than that of LSTM. The finding that LSTM performs best contradicts some previous work such as that of Sun et al., 2019 [33]. The PFI scores of LSTM (Figure 7) show that the EVI in April, May and June is the most important variable in this model, followed by the EVI in November and December. The SAVI in November and December, the SAVI in April, May and June, and the NDVI in January, February and March have negative PFI scores, indicating that these three features do not contribute usefully to the model.

In Table 3 and Table 4, the values of spectral indices of winter barley are analysed with the yield of winter barley in North Norfolk throughout 2018 to 2023. It shows a positive correlation between the yield of winter barley and the average EVI for each year, and a negative correlation between the yield of winter barley and the average NDVI, SAVI and NDMI for each year. When analysing the relationships between the yield and the values of spectral indices throughout the growing seasons in the year, it is shown that there are strong positive correlations in the EVI and a strong negative correlation in NDVI, especially in April, May and June. This demonstrates the same result as the PFI, namely that the EVI in April, May and June is the most important feature, followed by the EVI in November and December.
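A correlation of this kind can be computed as in the sketch below. The section does not name the correlation coefficient used, so Pearson's r is an assumption, and all values are hypothetical rather than taken from Table 3 or Table 4.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical annual values for six years.
yields = [5.8, 6.0, 6.2, 6.4, 6.6, 6.8]
evi_season_3 = [0.40, 0.45, 0.50, 0.55, 0.60, 0.65]   # rises with yield
ndmi_season_3 = [0.30, 0.28, 0.29, 0.25, 0.24, 0.22]  # falls as yield rises

r_evi = pearson_r(evi_season_3, yields)
r_ndmi = pearson_r(ndmi_season_3, yields)
print(r_evi, r_ndmi)
```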

3.3. Correlation Analysis Between the Yield and Meteorological Datasets

Comparing the importance of spectral indices and meteorological data to the DL winter barley yield prediction model, the spectral indices are more important, as shown in the PFI analysis (Figure 7). Most seasonal spectral indices are listed on the PFI score rankings. However, for the meteorological datasets, only rainfall in November is listed and it ranks as eighth most important to the DL winter barley yield prediction. Thus, to further understand the influence of meteorological data on the yield prediction, the correlations between meteorological datasets and the yield of winter barley are analysed in Table 5 and Table 6.

The average annual minimum temperature shows a positive correlation with the yield of winter barley, and the average annual rainfall shows a negative correlation, in Figure 8A. To understand the relationship between the seasonal meteorological datasets and the yield of winter barley, the analysis is separated into the three growth seasons of winter barley defined by Growth Stages (GS).

The result shows that the meteorological datasets in November and December have a strong positive correlation with the yield of winter barley, as shown in Figure 8B and Table 6. November and December represent the germination and seedling growth stages of winter barley; the averages of the minimum and maximum temperatures and sun hours in these months are all positively correlated with the yield of winter barley. This demonstrates that the yields throughout the years are strongly influenced by the minimum temperature, maximum temperature and sun hours in November and December.

The tillering stage in January, February and March shows a negative correlation between average rainfall and the yield of winter barley (Figure 8C and Table 6). This corresponds to the high negative correlation between average and seasonal NDMI and the yield of winter barley throughout the years. Years with higher rainfall in January, February and March tend to have a lower yield of winter barley. The other meteorological factors in January, February and March do not appear to have a significant relationship with the yield of winter barley.

During the stem elongation, flowering and grain filling stage in April, May and June, the average sun hours show a strong negative correlation with the yield of winter barley throughout the years, as shown in Figure 8D and Table 6. Yields of winter barley may therefore be lower when the average sun hours are higher than expected. Higher than expected sun hours do not boost photosynthesis but instead restrain crop growth, since the DNA of plants may be damaged by high daily exposure to harsh ultraviolet radiation [53].

Through the analysis of meteorological datasets, we find the annual averages of minimum temperature and rainfall have a stronger correlation with the annual yield of winter barley. Thus, minimum temperature and rainfall are more important to the winter barley yield prediction compared to other meteorological factors. When diving into the seasonal analysis, the stronger correlations between all meteorological factors (minimum and maximum temperature, rainfall and sun hours) and the yield of winter barley are found in Season 1 (November and December). This means the meteorological datasets in Season 1 are more important to the winter barley yield prediction than the other two seasons. However, the average rainfall in Season 2 and the average sun hours in Season 3, having strong negative correlations with the yield of barley, are considered to be another two key features to be applied to the DL yield prediction model.

4. Discussion

This research extracts the values of four spectral indices, NDVI, SAVI, EVI and NDMI, of winter barley directly from historical crop maps in North Norfolk, UK, and analyses the relationships between the spectral indices and the yields of winter barley. NDVI has been widely used to predict cereal grain yields in research studies. MODIS-NDVI effectively predicted crop yields across the Canadian Prairie, and the relationships between NDVI and the grain yield of several crops were studied [54]. Previous research studies showed a higher correlation between NDVI and crop yields during the flowering and grain filling stages [54,55,56]. In a study of the Czech Republic and Slovakia, the strongest positive relationships between NDVI and cereal yield were again recorded during the flowering and grain filling stages; however, in regions of Germany, a strong negative correlation was observed [57]. Negative relationships between NDVI and crop yield were also found in Australia when the rainfall was above 600 mm [58] and between MODIS-NDVI and the spring wheat yield of agricultural lands in Canada [59]. The inconsistent relationships between NDVI and yield could have two causes: (1) The spectral indices, including NDVI, differ between crop types, which influences the results. (2) The MODIS-NDVI used in previous research studies was calculated under a cropland mask, which identifies crop lands but combines all crop types together [60]. To avoid the effects of mixing crop types, accurate historical crop maps such as those provided in this study are necessary when analysing the correlation between the values of spectral indices and the yield of crops.

This research indicates that the average and the seasonal NDVI of the year show negative correlations with the historical yields of winter barley, and NDVI is not the most important feature. While looking into the PFI analysis in Figure 7, this research ranks NDVI values in April, May and June seventh and the values in November and December ninth. The seasonal EVI, on the other hand, is the most important feature in this model, ranking the values in April, May and June first and the values in November and December second. The EVI is calculated with the adjustment of NDVI by involving the less atmospheric sensitive blue band and a γ function in the calculation. This removes the canopy background contamination, including soil and atmospheric influences, to make the EVI a more accurate VI for vegetation monitoring [61].

NDMI is not commonly used as a parameter in crop yield prediction, but it is one of the most commonly used indices for monitoring and estimating soil moisture and the WC of plants in agricultural research [62,63]. In this study, NDMI shows a negative correlation with the yield, which may reflect the WC of crops and is consistent with the reduced yields observed with high rainfall at the tillering stage. The year with the highest average rainfall during the tillering stage (January, February and March) produces lower yields of winter barley. Excessive water from rainfall may harm crop health, for example through waterlogging or root diseases [64]. Besides rainfall, our analysis demonstrates that temperature and sun hours have a greater impact on the yield of winter barley during the germination and seedling growth stages (November and December) than in other periods. Minimum and maximum temperature and sun hours show relatively high correlation with the yield of winter barley during the germination and seedling growth stages. A similar situation was found in research by Juhász et al. [65], which shared identical phenology of winter barley with Li et al. (2025) [40]. In general, according to the analysis of weather conditions and spectral indices for the yield of winter barley in this research study, germination and seedling growth, and tillering, are the two key growth seasons influencing the yield of winter barley. These findings can be beneficial when considering detailed monitoring and predictions of seasonal crop yields.

5. Conclusions

Soil properties, crop growth information, crop yield, meteorological and remote sensing datasets have been utilised to predict crop yields with DL methods. This research applies the most commonly used parameters, namely temperature, rainfall and spectral indices. These datasets are all freely available and opensource, so the DL frameworks used in this work can be implemented more easily in different regions, especially remote and hard-to-reach areas. A DL framework for the prediction of crop yields has been developed in this study, together with a series of accurate retrospective crop maps. The main outcomes are:

(1) Accurate historical crop maps are generated with zero ground truth by FastDTW-HC, which does not require further local surveying. This unsupervised classification method can be used to investigate large regions (land areas) after being tested in a small region.

(2) The values of spectral indices of winter barley are extracted directly from the winter barley pixels of the historical crop maps, which is considered more accurate. This resolves the inaccuracy of previous research studies, which directly applied regional average values of spectral indices to studies on winter barley. The relationships between spectral indices and the yield of winter barley are studied with more accurate data, which shows that:

The EVI in April, May and June is the most important feature of the DL yield prediction due to its strong correlation with the yield of winter barley from 2018 to 2023.
The analysis of spectral indices indicates that the NDVI, although commonly used in yield predictions, is not the best parameter due to its relatively low PFI and correlation.
A higher NDMI at the tillering stage and at the stem elongation, flowering and grain filling stages has been interpreted as excess Water Content (WC), which could lead to poor plant health and declines in the yields of winter barley.

(3) LSTM outperforms CNN and CNN-LSTM in this research owing to its capability to extract temporal features. The LSTM gives the best results, with an RMSE of 0.406 kg/hectare, an MSE of 0.165 kg/hectare and an MAE of 10.495 kg/hectare.

(4) This analysis indicates that certain weather conditions during specific times of the year can have a significant impact on yield. In particular, it was found that:

Temperature and sun hours have significant impacts during the germination and seedling growth stages (November and December), with higher temperatures and sun hours improving germination, seedling growth and the eventual yield of winter barley.
High rainfall during the tillering stage (January, February and March) and high sun hours during the stem elongation, flowering and grain filling stages (April, May and June) produce lower yields of winter barley.

This research developed a framework for accurate annual crop yield prediction at the regional scale using opensource datasets, which is widely applicable under various conditions in different regions. By applying this framework, food security decision makers can manage arable food crop farming plans, storage and delivery. Crop rotation throughout the years can maintain biodiversity and soil health. Yield monitoring and prediction for food crops can help farmers to secure annual production and profits and stabilise the regional food plan within a community. Decision makers can foresee the annual yields of multiple crops in each region before harvest time and adjust annual food imports and exports accordingly. The analysis of spectral indices and seasonal meteorology can provide insight to support seasonal, annual and long-term agricultural plans, including water, soil and land management. Stockholders can take these crop statistics into account to predict profitable targets on the stock market.

6. Future Work

The framework developed in this research could be applied to multiple crop types in different regions, using meteorological data and historical yield data, in future research studies. Because historical yield data are limited, it is worth categorising food crops with similar characteristics into sections and predicting yields by section. For instance, winter and spring barley and winter and spring wheat share almost identical phenology and characteristics. These crops can be clustered into the same section as cereal crops when predicting yield and production to maintain food sustainability and meet nutrition needs. However, more detailed crop yield data should be collected to comprehensively analyse the influence of spectral indices and meteorological data on specific crop types in multiple regions. More input parameters can also be tested in yield prediction models to determine the critical factors for predicting the yield of a specific crop.

Author Contributions

Conceptualization, H.-Y.L.; methodology, H.-Y.L.; software, H.-Y.L.; validation, H.-Y.L.; formal analysis, H.-Y.L.; investigation, H.-Y.L.; resources, H.-Y.L.; data curation, H.-Y.L.; writing—original draft preparation, H.-Y.L.; writing—review and editing, H.-Y.L.; visualization, H.-Y.L.; supervision, J.A.L., P.J.M. and R.C.G.; project administration, J.A.L. and P.J.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The multispectral image products were downloaded from Sentinel-2—Missions—Sentinel Online—Sentinel Online. (n.d.), Sentinel Online, https://dataspace.copernicus.eu/data-collections/copernicus-sentinel-missions/sentinel-2 (accessed on 10 October 2024). The meteorological data was downloaded from the Met Office, MIDAS: UK Hourly Weather Observation Data. NCAS British Atmospheric Data Centre, https://catalogue.ceda.ac.uk/uuid/916ac4bbc46f7685ae9a5e10451bae7c (accessed on 28 February 2025). The historical yield data was downloaded from Cereal and oilseed production in the United Kingdom 2023, Department of Environment, Food and Rural Affairs (DEFRA), UK, 2025, https://www.gov.uk/government/statistics/cereal-and-oilseed-rape-production#full-publication-update-history (accessed on 10 July 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Grassini, P.; Eskridge, K.; Cassman, K. Distinguishing between yield advances and yield plateaus in historical crop production trends. Nat. Commun. 2013, 4, 2918.
2. Lin, M.; Huybers, P. Reckoning wheat yield trends. Environ. Res. Lett. 2012, 7, 024016.
3. Ray, D.K.; Ramankutty, N.; Mueller, N.D.; West, P.C.; Foley, J.A. Recent patterns of crop yield growth and stagnation. Nat. Commun. 2012, 3, 1293.
4. Brisson, N.; Gate, P.; Gouache, D.; Charmet, G.; Oury, F.-X.; Huard, F. Why are wheat yields stagnating in Europe? A comprehensive data analysis for France. Field Crops Res. 2010, 119, 201–212.
5. Kristensen, K.; Schelde, K.; Olesen, J.E. Winter wheat yield response to climate variability in Denmark. J. Agric. Sci. 2011, 149, 33–47.
6. Børgesen, C.D.; Olesen, J.E. A probabilistic assessment of climate change impacts on yield and nitrogen leaching from winter wheat in Denmark. Nat. Hazards Earth Syst. Sci. 2011, 11, 2541–2553.
7. Alston, J.M.; Beddow, J.M.; Pardey, P.G. Agricultural Research, Productivity, and Food Prices in the Long Run. Science 2009, 325, 1209–1210.
8. Mandal, D.; Rao, Y.S. SASYA: An integrated framework for crop biophysical parameter retrieval and within-season crop yield prediction with SAR remote sensing data. Remote Sens. Appl. Soc. Environ. 2020, 20, 100366.
9. Maas, S.J. Parameterized Model of Gramineous Crop Growth: I. Leaf Area and Dry Mass Simulation. Agron. J. 1993, 85, 348–353.
10. Maas, S.J. Parameterized Model of Gramineous Crop Growth: II. Within-Season Simulation Calibration. Agron. J. 1993, 85, 354–358.
11. Duchemin, B.; Maisongrande, P.; Boulet, G.; Benhadj, I. A simple algorithm for yield estimates: Evaluation for semi-arid irrigated winter wheat monitored with green leaf area index. Environ. Model. Softw. 2008, 23, 876–892.
12. Yeom, J.-M.; Ko, J.; Kim, H.-O. Application of GOCI-derived vegetation index profiles to estimation of paddy rice yield using the GRAMI rice model. Comput. Electron. Agric. 2015, 118, 1–8.
13. Kim, H.; Ko, J.; Jeong, S.; Yeom, J.; Ban, J.-O.; Kim, H.-Y. Simulation and mapping of rice growth and yield based on remote sensing. J. Appl. Remote Sens. 2015, 9, 096067.
14. Battude, M.; Bitar, A.A.; Morin, D.; Cros, J.; Huc, M.; Sicre, C.M.; Le Dantec, V.; Demarez, V. Estimating maize biomass and yield over large areas using high spatial and temporal resolution Sentinel-2 like remote sensing data. Remote Sens. Environ. 2016, 184, 668–681.
15. Ameline, M.; Fieuzal, R.; Betbeder, J.; Berthoumieu, J.-F.; Baup, F. Estimation of Corn Yield by Assimilating SAR and Optical Time Series Into a Simplified Agro-Meteorological Model: From Diagnostic to Forecast. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 4747–4760.
16. Kim, M.; Ko, J.; Jeong, S.; Yeom, J.; Kim, H. Monitoring canopy growth and grain yield of paddy rice in South Korea by using the GRAMI model and high spatial resolution imagery. GIScience Remote Sens. 2017, 54, 534–551.
17. Ma, C.; Liu, M.; Ding, F.; Li, C.; Cui, Y.; Chen, W.; Wang, Y. Wheat growth monitoring and yield estimation based on remote sensing data assimilation into the SAFY crop growth model. Sci. Rep. 2022, 12, 5473.
18. Ed-Daoudi, R.; Alaoui, A.; Ettaki, B.; Zerouaoui, J. Improving Crop Yield Predictions in Morocco Using Machine Learning Algorithms. J. Ecol. Eng. 2023, 24, 392–400.
19. Gumma, M.K.; Thenkabail, P.S.; Panjala, P.; Teluguntla, P.; Yamano, T.; Mohammed, I. Multiple agricultural cropland products of South Asia developed using Landsat-8 30 m and MODIS 250 m data using machine learning on the Google Earth Engine (GEE) cloud and spectral matching techniques (SMTs) in support of food and water security. GIScience Remote Sens. 2022, 59, 1048–1077.
20. Srivastava, A.K.; Safaei, N.; Khaki, S.; Lopez, G.; Zeng, W.; Ewert, F. Winter wheat yield prediction using convolutional neural networks from environmental and phenological data. Sci. Rep. 2022, 12, 3215.
21. Shammi, S.A.; Meng, Q. Modeling crop yield using NDVI-derived VGM metrics across different climatic regions in the USA. Int. J. Biometeorol. 2023, 67, 1051–1062.
22. Jurečka, F.; Fischer, M.; Hlavinka, P.; Balek, J.; Semerádová, D.; Bláhová, M.; Anderson, M.C.; Hain, C.; Žalud, Z.; Trnka, M. Potential of water balance and remote sensing-based evapotranspiration models to predict yields of spring barley and winter wheat in the Czech Republic. Agric. Water Manag. 2021, 256, 107064.
23. Yalcin, H. An approximation for a relative crop yield estimate from field images using deep learning. In 2019 8th International Conference on Agro-Geoinformatics; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2019; p. 8820693.
24. Elavarasan, D.; Vincent, P.M.D. Crop Yield Prediction Using Deep Reinforcement Learning Model for Sustainable Agrarian Applications. IEEE Access 2020, 8, 86886–86901.
25. Kiran Kumar, V.; Ramesh, K.V.; Rakesh, V. Optimizing LSTM and Bi-LSTM models for crop yield prediction and comparison of their performance with traditional machine learning techniques. Appl. Intell. 2023, 53, 28291–28309.
26. Tian, H.; Wang, P.; Tansey, K.; Zhang, J.; Zhang, S.; Li, H. An LSTM neural network for improving wheat yield estimates by integrating remote sensing data and meteorological data in the Guanzhong Plain, PR China. Agric. For. Meteorol. 2021, 310, 108629.
27. Bhimavarapu, U.; Battineni, G.; Chintalapudi, N. Improved Optimization Algorithm in LSTM to Predict Crop Yield. Computers 2023, 12, 10.
28. Klompenburg, T.V.; Kassahun, A.; Catal, C. Crop yield prediction using machine learning: A systematic literature review. Comput. Electron. Agric. 2020, 177, 105709.
29. Dharani, M.K.; Thamilselvan, R.; Natesan, P.; Kalaivaani, P.C.D.; Santhoshkumar, S. Review on Crop Prediction Using Deep Learning Techniques. J. Phys. Conf. Ser. 2021, 1767, 012026.
30. Khaki, S.; Wang, L.; Archontoulis, S.V. A CNN-RNN Framework for Crop Yield Prediction. Front. Plant Sci. 2020, 10, 1750.
31. Jiang, H.; Hu, H.; Zhong, R.; Xu, J.; Xu, J.; Huang, J.; Wang, S.; Ying, Y.; Lin, T. A deep learning approach to conflating heterogeneous geospatial data for corn yield estimation: A case study of the US Corn Belt at the county level. Glob. Change Biol. 2019, 26, 1754–1766.
32. Schwalbert, R.A.; Amado, T.; Corassa, G.; Pott, L.P.; Vara Prasad, P.V.; Ciampitti, I.A. Satellite-based soybean yield forecast: Integrating machine learning and weather data for improving crop yield prediction in southern Brazil. Agric. For. Meteorol. 2020, 284, 107886.
33. Sun, J.; Di, L.; Sun, Z.; Shen, Y.; Lai, Z. County-Level Soybean Yield Prediction Using Deep CNN-LSTM Model. Sensors 2019, 19, 4363.
34. Rouse, J.W.; Haas, R.H.; Schell, J.A.; Deering, D.W.; Harlan, J.C. Monitoring the Vernal Advancement and Retrogradation (Greenwave Effect) of Natural Vegetation; NASA/GSFC Type III Final Report; NASA/GSFC: Greenbelt, MD, USA, 1974. Available online: https://ntrs.nasa.gov/api/citations/19750020419/downloads/19750020419.pdf (accessed on 18 March 2025).
35. Huete, A.R. A soil-adjusted vegetation index (SAVI). Remote Sens. Environ. 1988, 25, 295–309.
36. Liu, H.Q.; Huete, A. A feedback based modification of the NDVI to minimize canopy background and atmospheric noise. IEEE Trans. Geosci. Remote Sens. 1995, 33, 457–465.
37. Gao, B. NDWI—A normalized difference water index for remote sensing of vegetation liquid water from space. Remote Sens. Environ. 1996, 58, 257–266.
38. Sentinel-2—Missions—Sentinel Online—Sentinel Online. (n.d.). Sentinel Online. Available online: https://dataspace.copernicus.eu/data-collections/copernicus-sentinel-missions/sentinel-2 (accessed on 10 October 2024).
39. Li, H.-Y.; Lawrence, J.A.; Mason, P.J.; Ghail, R.C. Assessing the Effect of Spatial Resolution on Crop Classification Success. In Proceedings of the IGARSS 2025—2025 IEEE International Geoscience and Remote Sensing Symposium, Brisbane, Australia, 3–8 August 2025; pp. 2968–2972.
40. Li, H.-Y.; Lawrence, J.A.; Mason, P.J.; Ghail, R.C. Fast Dynamic Time Warping and Hierarchical Clustering with Multispectral and Synthetic Aperture Radar Temporal Analysis for Unsupervised Winter Food Crop Mapping. Agriculture 2025, 15, 82.
41. Li, H.Y.; Lawrence, J.A.; Mason, P.J.; Ghail, R.C. Unsupervised Winter Wheat Mapping Based on Multi-spectral and Synthetic Aperture Radar Observations. Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci. 2023, XLVIII-1/W2-2023, 1411–1416.
42. Met Office. MIDAS: UK Hourly Weather Observation Data. NCAS British Atmospheric Data Centre. 2006. Available online: https://catalogue.ceda.ac.uk/uuid/916ac4bbc46f7685ae9a5e10451bae7c (accessed on 28 February 2025).
43. Key Development Phases and Growth Stages in Barley, Agriculture and Horticulture Development Board (AHDB), 2025. Available online: https://ahdb.org.uk/knowledge-library/key-development-phases-and-growth-stages-in-barley (accessed on 3 September 2025).
44. Cereal and Oilseed Production in the United Kingdom 2023, Department of Environment, Food and Rural Affairs (DEFRA), UK. 2025. Available online: https://www.gov.uk/government/statistics/cereal-and-oilseed-rape-production#full-publication-update-history (accessed on 10 July 2024).
45. Kiranyaz, S.; Ince, T.; Gabbouj, M. Personalized Monitoring and Advance Warning System for Cardiac Arrhythmias. Sci. Rep. 2017, 7, 9270.
46. Kiranyaz, S.; Ince, T.; Gabbouj, M. Real-Time Patient-Specific ECG Classification by 1-D Convolutional Neural Networks. IEEE Trans. Biomed. Eng. 2016, 63, 664–675.
47. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
48. Mirzaei, M.; Yu, H.; Dehghani, A.; Galavi, H.; Shokri, V.; Mohsenzadeh Karimi, S.; Sookhak, M. A Novel Stacked Long Short-Term Memory Approach of Deep Learning for Streamflow Simulation. Sustainability 2021, 13, 13384.
49. Dehghani, A.; Moazam, H.M.Z.H.; Mortazavizadeh, F.; Ranjbar, V.; Mirzaei, M.; Mortezavi, S.; Ng, J.L.; Dehghani, A. Comparative evaluation of LSTM, CNN, and ConvLSTM for hourly short-term streamflow forecasting using deep learning approaches. Ecol. Inform. 2023, 75, 102119.
50. Aksan, F.; Li, Y.; Suresh, V.; Janik, P. CNN-LSTM vs. LSTM-CNN to Predict Power Flow Direction: A Case Study of the High-Voltage Subnet of Northeast Germany. Sensors 2023, 23, 901.
51. Halbouni, A.; Gunawan, T.S.; Habaebi, M.H.; Halbouni, M.; Kartiwi, M.; Ahmad, R. CNN-LSTM: Hybrid Deep Neural Network for Network Intrusion Detection System. IEEE Access 2022, 10, 99837–99849.
52. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
53. Sekiyama, T.; Nagashima, A. Solar Sharing for Both Food and Clean Energy Production: Performance of Agrivoltaic Systems for Corn, A Typical Shade-Intolerant Crop. Environments 2019, 6, 65.
54. Mkhabela, M.S.; Bullock, P.; Raj, S.; Wang, S.; Yang, Y. Crop yield forecasting on the Canadian Prairies using MODIS NDVI data. Agric. For. Meteorol. 2011, 151, 385–393.
55. Marti, J.; Bort, J.; Slafer, G.A.; Araus, J.L. Can wheat yield be assessed by early measurement of normalised difference vegetation index? Ann. Appl. Biol. 2007, 150, 253–257.
56. Salazar, L.; Kogan, F.; Roytman, L. Use of remote sensing data for estimation of winter wheat yield in the United States. Int. J. Remote Sens. 2007, 28, 3795–3811.
57. Panek, E.; Gozdowski, D. Analysis of relationship between cereal yield and NDVI for selected regions of Central Europe based on MODIS satellite data. Remote Sens. Appl. Soc. Environ. 2020, 17, 100286.
58. Hill, M.J.; Donald, G.E. Estimating spatio-temporal patterns of agricultural productivity in fragmented landscapes using AVHRR NDVI time series. Remote Sens. Environ. 2003, 84, 367–384.
59. Kouadio, L.; Newlands, N.K.; Davidson, A.; Zhang, Y.; Chipanshi, A. Assessing the Performance of MODIS NDVI and EVI for Seasonal Crop Yield Forecasting at the Ecodistrict Scale. Remote Sens. 2014, 6, 10193–10214.
60. Johnson, D.M.; Rosales, A.; Mueller, R.; Reynolds, C.; Frantz, R.; Anyamba, A.; Pak, E.; Tucker, C. USA Crop Yield Estimation with MODIS NDVI: Are Remotely Sensed Models Better than Simple Trend Analyses? Remote Sens. 2021, 13, 4227.
61. Huete, A.; Justice, C. MODIS Vegetation Index (MOD 13) Algorithm Theoretical Basis Document. 1999. Available online: https://modis.gsfc.nasa.gov/data/atbd/atbd_mod13.pdf (accessed on 18 March 2025).
62. Imtiaz, F.; Farooque, A.A.; Randhawa, S.G.; Wang, X.; Esau, J.T.; Acharya, B.; Garmdareh, S.E.H. An inclusive approach to crop soil moisture estimation: Leveraging satellite thermal infrared bands and vegetation indices on Google Earth engine. Agric. Water Manag. 2024, 306, 109172.
63. Koohikeradeh, E.; Jose Gumiere, S.; Bonakdari, H. NDMI-Derived Field-Scale Soil Moisture Prediction Using ERA5 and LSTM for Precision Agriculture. Sustainability 2025, 17, 2399.
64. Knight, C.; Khouakhi, A.; Waine, T.W. The impact of weather patterns on inter-annual crop yield variability. Sci. Total Environ. 2024, 955, 177181.
65. Juhász, C.; Gálya, B.; Kovács, E.; Nagy, A.; Tamás, J.; Huzsvai, L. Seasonal predictability of weather and crop yield in regions of Central European continental climate. Comput. Electron. Agric. 2020, 173, 105400.

Figure 1. The framework of food crop yield predictions.

Figure 2. The concept of Fast DTW. Fast DTW can calculate the similarities between the values of X and Y pixels at multiple time points in time series, which is different from the traditional Euclidean method [41].
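
The contrast between time warping and a point-by-point comparison can be illustrated with a minimal sketch. The code below implements classic dynamic time warping by dynamic programming, not the FastDTW approximation used in the framework, and the two short series are hypothetical index profiles chosen so that one is a delayed copy of the other.

```python
import numpy as np

def dtw_distance(x, y):
    """Classic O(n*m) dynamic time warping distance with absolute-difference cost."""
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

# Hypothetical index time series for two pixels: b follows a with a one-step delay.
a = [0.2, 0.3, 0.6, 0.8, 0.5, 0.3]
b = [0.2, 0.2, 0.3, 0.6, 0.8, 0.5]

pointwise = sum(abs(p - q) for p, q in zip(a, b))  # same-time-step comparison
warped = dtw_distance(a, b)                        # comparison after alignment
print(pointwise, warped)
```

Because the warping path may match one time step to several, the delayed profile is scored as much closer to the original than the same-time-step comparison suggests, which is the property that makes this family of measures attractive for crops whose growth curves are shifted in time.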

Figure 3. The conceptualised hierarchical clustering with five individuals, A, B, C, D and E. The dendrogram of A, B, C, D, and E is formed by calculating the similarities and clustering five individuals by layers [41].

Figure 4. The architecture of CNN.

Figure 5. The architecture of LSTM.

Figure 6. Winter food crop maps. (a) The location of the selected study area of this research in Norfolk, UK, with 6 km grids. The map presents the scale of the study and the legend of winter barley, winter wheat, winter rapeseed and spring barley. Winter food crop maps of the selected study area in North Norfolk, UK, for (b) 2018, (c) 2019, (d) 2020, (e) 2021, (f) 2022 and (g) 2023.

Figure 7. The importance score of variables in the LSTM model (only the most and least important ones are listed in this figure).
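
As a rough illustration of how permutation-based importance scores of this kind can be obtained, the sketch below shuffles one input column at a time and records the increase in mean squared error. It is a minimal example under stated assumptions: a least-squares linear model stands in for the LSTM, the data are synthetic, and the function name `permutation_importance` and its parameters are hypothetical rather than taken from the paper.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in MSE when each feature column is shuffled in turn."""
    rng = np.random.default_rng(seed)
    baseline = np.mean((predict(X) - y) ** 2)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Break the link between feature j and the target, keep the rest intact.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(np.mean((predict(X_perm) - y) ** 2) - baseline)
        scores[j] = np.mean(increases)
    return scores

# Synthetic data (assumption): feature 0 matters most, feature 2 not at all.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + data_rng.normal(scale=0.1, size=200)

# Stand-in model (assumption): ordinary least squares instead of an LSTM.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
scores = permutation_importance(lambda A: A @ coef, X, y)
ranking = np.argsort(scores)[::-1]  # feature indices, most important first
print(scores, ranking)
```

The same loop applies to any fitted model that exposes a prediction function, so a trained sequence model could be passed in place of the linear stand-in.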

Figure 8. The meteorological data (orange), including minimum and maximum temperature, rainfall and sun hours, and the yield of winter barley (blue) from 2018 to 2023 with annual (A) and seasonal (B–D) analysis.

Table 1. The formula of spectral indices.

Spectral Indices	Formula
NDVI (Normalised Difference Vegetation Index)	(NIR − Red)/(NIR + Red) [34]
SAVI (Soil-Adjusted Vegetation Index)	(1 + L) × (NIR − Red)/(NIR + Red + L); where L = 0.5 [35]
EVI (Enhanced Vegetation Index)	2.5 × (NIR − Red)/(NIR + 6 × Red − 7.5 × Blue + 1) [36]
NDMI (Normalised Difference Moisture Index)	(NIR − SWIR)/(NIR + SWIR) [37]
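
The formulas in Table 1 can be written directly as array operations. The following is a minimal sketch, assuming surface reflectance inputs scaled to the range 0 to 1; the reflectance values are hypothetical, and the mapping of NIR, Red, Blue and SWIR to specific Sentinel-2 bands is left to the reader because it is not stated in this section.

```python
import numpy as np

def ndvi(nir, red):
    """Normalised Difference Vegetation Index."""
    return (nir - red) / (nir + red)

def savi(nir, red, L=0.5):
    """Soil-Adjusted Vegetation Index with soil adjustment factor L."""
    return (1 + L) * (nir - red) / (nir + red + L)

def evi(nir, red, blue):
    """Enhanced Vegetation Index with the coefficients given in Table 1."""
    return 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1)

def ndmi(nir, swir):
    """Normalised Difference Moisture Index."""
    return (nir - swir) / (nir + swir)

# Hypothetical reflectances for two pixels (not data from the study).
nir = np.array([0.45, 0.30])
red = np.array([0.08, 0.12])
blue = np.array([0.04, 0.06])
swir = np.array([0.20, 0.25])

for name, values in [("NDVI", ndvi(nir, red)), ("SAVI", savi(nir, red)),
                     ("EVI", evi(nir, red, blue)), ("NDMI", ndmi(nir, swir))]:
    print(name, np.round(values, 3))
```

In the framework described here, such functions would be evaluated only on pixels classified as winter barley before averaging by season.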

Table 2. The evaluation of three models, CNN, LSTM, and CNN-LSTM. RMSE: Root Mean Squared Errors; MSE: Mean Squared Errors; MAE: Mean Absolute Errors.

Unit: kg/hectare	CNN	LSTM	CNN-LSTM
RMSE	8.579	0.406	5.013
MSE	73.601	0.165	25.135
MAE	271.294	10.495	156.718
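
For reference, the three error measures in Table 2 can be computed as in the sketch below. This is a generic illustration with hypothetical observed and predicted values, not the evaluation code used in the study; the final loop only checks that the reported MSE values agree with the squares of the reported RMSE values.

```python
import numpy as np

def regression_errors(y_true, y_pred):
    """Return RMSE, MSE and MAE for paired observations and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_pred - y_true
    mse = float(np.mean(residuals ** 2))
    return {"RMSE": float(np.sqrt(mse)), "MSE": mse,
            "MAE": float(np.mean(np.abs(residuals)))}

# Hypothetical values, used only to exercise the function.
example = regression_errors([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(example)

# Reported (RMSE, MSE) pairs from Table 2: MSE should be close to RMSE squared.
table2 = {"CNN": (8.579, 73.601), "LSTM": (0.406, 0.165), "CNN-LSTM": (5.013, 25.135)}
for model, (rmse, mse) in table2.items():
    print(model, round(rmse ** 2, 3), mse)
```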

Table 3. The average values of spectral indices from 2018 to 2023 and the correlation with the yield of winter barley.

	NDVI	SAVI	EVI	NDMI
2018	0.403	0.349	0.429	0.178
2019	0.370	0.299	0.421	0.153
2020	0.559	0.356	0.394	0.201
2021	0.486	0.353	0.433	0.202
2022	0.385	0.313	0.429	0.150
2023	0.428	0.372	0.445	0.206
Correlation with yield	−0.874	−0.424	0.573	−0.523
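
A correlation row of this kind can be computed with a few lines of code. The sketch below assumes that the Pearson coefficient is the measure intended, which this section does not state. Because the yield series is not reproduced here, the usage example correlates two index columns of Table 3 with each other purely to show the call; the yield series would take the place of the second argument.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))

# Annual averages from Table 3 (2018 to 2023).
ndvi_annual = [0.403, 0.370, 0.559, 0.486, 0.385, 0.428]
ndmi_annual = [0.178, 0.153, 0.201, 0.202, 0.150, 0.206]

# Illustration only: index-to-index correlation, not an analysis from the paper.
r = pearson_r(ndvi_annual, ndmi_annual)
print(round(r, 3))
```

With only six annual values per series, any such coefficient is sensitive to a single year, which is worth keeping in mind when comparing the rows of Tables 3 to 6.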

Table 4. The spectral indices from 2018 to 2023 and the correlation with the yield of winter barley for three seasons (November and December, January, February and March and April, May and June).

	November & December
	NDVI	SAVI	EVI	NDMI
2018	0.370	0.308	0.371	0.154
2019	0.339	0.266	0.383	0.135
2020	0.482	0.275	0.318	0.152
2021	0.383	0.300	0.415	0.151
2022	0.368	0.288	0.399	0.146
2023	0.423	0.359	0.438	0.192
Correlation with yield	−0.593	0.278	0.547	0.165
	January & February & March
	NDVI	SAVI	EVI	NDMI
2018	0.323	0.267	0.322	0.114
2019	0.332	0.263	0.361	0.111
2020	0.521	0.308	0.328	0.148
2021	0.325	0.248	0.345	0.108
2022	0.343	0.275	0.369	0.099
2023	0.348	0.288	0.332	0.131
Correlation with yield	−0.602	−0.165	0.400	−0.372
	April & May & June
	NDVI	SAVI	EVI	NDMI
2018	0.445	0.396	0.493	0.210
2019	0.449	0.378	0.535	0.225
2020	0.629	0.434	0.477	0.259
2021	0.628	0.441	0.500	0.283
2022	0.444	0.375	0.518	0.205
2023	0.489	0.438	0.531	0.266
Correlation with yield	−0.853	−0.547	0.919	−0.448

Table 5. The averages of minimum and maximum temperatures, rainfall and sun hours between November and June every year from 2018 to 2023 and their correlations with the annual yield of winter barley.

	Minimum Temperature (°C)	Maximum Temperature (°C)	Rainfall (mm)	Sun Hours (h)
2018	4.99	11.11	55.03	129.80
2019	5.59	12.15	41.46	138.40
2020	5.46	12.41	46.70	155.40
2021	4.70	10.84	57.09	125.99
2022	5.58	12.44	38.24	156.00
2023	5.54	11.94	49.30	135.54
Correlation with yield	0.54	0.26	−0.46	−0.11

Table 6. The seasonal average of minimum and maximum temperatures, rainfall and sun hours between November and June every year from 2018 to 2023 and their correlations with the annual yield of winter barley.

	Minimum temperature (°C)			Rainfall (mm)		
	November & December	January & February & March	April & May & June	November & December	January & February & March	April & May & June
2018	3.45	1.93	9.07	78.00	62.03	32.70
2019	5.25	3.17	8.23	45.60	39.27	40.90
2020	3.90	3.77	8.20	70.85	51.00	26.30
2021	4.45	2.37	7.20	71.90	53.37	50.93
2022	4.60	3.37	8.43	58.95	39.53	23.13
2023	4.10	3.23	8.80	85.95	36.70	37.80
Correlation with yield	0.45	−0.04	0.50	−0.28	−0.65	0.03
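	Maximum temperature (°C)			Sun hours (h)		
	November & December	January & February & March	April & May & June	November & December	January & February & March	April & May & June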
2018	8.85	7.30	16.43	81.10	84.47	207.60
2019	10.50	9.70	15.70	76.20	113.47	204.80
2020	9.45	9.80	17.00	60.15	111.67	262.63
2021	10.10	7.93	14.23	61.45	84.00	211.00
2022	9.85	9.80	16.80	61.45	137.87	237.17
2023	9.75	9.53	15.80	58.45	97.30	225.17
Correlation with yield	0.34	0.28	0.024	0.31	0.25	−0.49
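
The November to June averages in Table 5 appear to be recoverable from the seasonal values in Table 6 as a mean weighted by the number of months in each season (2, 3 and 3). This is an observation from the tabulated numbers rather than a procedure stated by the authors, and the sketch below checks it for minimum temperature only; the helper name `month_weighted_mean` is hypothetical.

```python
# Seasonal minimum temperature (°C) from Table 6: (Nov-Dec, Jan-Mar, Apr-Jun).
seasonal_min_temp = {
    2018: (3.45, 1.93, 9.07),
    2019: (5.25, 3.17, 8.23),
    2020: (3.90, 3.77, 8.20),
    2021: (4.45, 2.37, 7.20),
    2022: (4.60, 3.37, 8.43),
    2023: (4.10, 3.23, 8.80),
}

# November to June minimum temperature (°C) from Table 5.
annual_min_temp = {2018: 4.99, 2019: 5.59, 2020: 5.46, 2021: 4.70, 2022: 5.58, 2023: 5.54}

MONTHS_PER_SEASON = (2, 3, 3)  # assumed weights: months in each seasonal window

def month_weighted_mean(seasonal_values, weights=MONTHS_PER_SEASON):
    """Average seasonal means, weighting each season by its length in months."""
    return sum(v * w for v, w in zip(seasonal_values, weights)) / sum(weights)

for year, values in seasonal_min_temp.items():
    print(year, round(month_weighted_mean(values), 3), annual_min_temp[year])
```

For these six years the weighted means agree with the Table 5 values to within rounding, which suggests that the two tables are internally consistent.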

