Winter Wheat Yield Estimation Based on Optimal Weighted Vegetation Index and BHT-ARIMA Model

: This study aims to use remote sensing (RS) time-series data to explore the intrinsic relationship between crop growth and yield formation at different fertility stages and construct a high-precision winter wheat yield estimation model applicable to short time-series RS data. Sentinel-2 images were acquired in this study at six key phenological stages (rejuvenation stage, rising stage, jointing stage, the best yield prediction accuracy. The proposed method of this study will provide fast and accurate guidance for crop yield estimation and will be of great value for the processing and application of time-series RS data. winter wheat in based on the The data of the plant of winter Figure 11. Most predicted winter wheat yields in Hengshui City were between 7500 and 10,000 kg/ha, which was in line with the actual yield data in Section 2.3. In the west, mid-south, and east of Hengshui City, the predicted yields were higher than those of other areas in Hengshui City. Especially, the middle and south of Shenzhou, the east of Fucheng, and the east and south of Jingxian County were the higher-yield regions. These results were in general agreement with the spatial variation trend of the ﬁeld experiment. The above results prove that the BHT-ARIMA model has strong spatial generalization in winter wheat yield estimation.


Introduction
As one of the most widely planted crops globally, wheat yield prediction in large areas is essential to ensure food security and maintain sustainable agricultural development [1,2]. Traditional yield estimation methods mainly include statistical field surveys or sampling inspections, which are time-consuming and require a lot of manpower and material resources [3,4]. As a non-destructive emerging technology, remote sensing (RS) technology has the characteristics of high timeliness, wide coverage, and low cost [2], and is widely used in crop growth monitoring and yield prediction [5].
In recent decades, the use of assimilation methods to couple crop growth models with RS data has become one of the development trends in yield estimation. The crop growth model integrated multidisciplinary research (including crop physiology, ecology, meteorology, soil science, agronomy, etc.) to quantitatively and dynamically predict crop yield [6][7][8][9][10][11][12][13][14]. However, incomplete acquired localization data in practical applications will lead to serious bias and instability in crop field forecasting. Besides, the complexity of the agricultural system determines the uncertainty of assimilation, and a large number of data calculations also limit the accuracy of large-area yield estimation [15].
To simplify the calculation and improve the application of RS in regional yield estimation, some researchers have tried methods using the characteristic multispectral information and its derivatives, including characteristic vegetation indices (VI) based on one specific phenological period to construct a yield prediction model. However, the limited amount of information for a given growth period makes it difficult to track the entire process of yield formation during crop growth, and thus lacks accuracy and robustness [16][17][18][19][20][21][22][23][24][25][26].
The time-series data were arranged in the order of the size of the time stamp, which contains the dynamic information during a certain period. In recent years, the application of time-series data in agriculture has been more extensive, including crop mapping, yield estimation, etc. [27,28]. In crop yield estimation, the time-series RS data contain dynamic information on the growth and development of the crop throughout the growing season [29][30][31]. Using time-series data to estimate yields has become one of the research hot spots. Among the existing time-series forecasting (TSF) methods, Long Short-Term Memory (LSTM) [32], as a sort of deep learning method, has been widely used for crop yield estimation and prediction by its sequence modeling using a long-time memory function [33]. However, there is a large error in the parallel processing of multiple temporal data. In addition, Convolutional Neural Networks (CNN) [34] have also been increasingly used in yield estimation due to the shared convolutional kernels, which can achieve automatic feature extraction at the convolutional layer and greatly improve the efficiency of yield prediction. However, a large amount of information will be lost in the pooling layer [35]. Besides, the performance of the TSF models related to deep learning suffers badly when the amount of data is not enough, limiting these models' use for yield estimation based on time-series RS data. Focusing on the shortages of the existing TSF models mentioned above, the Autoregressive Integrated Moving Average (ARIMA) [36] method could be used in yield estimation due to its simplicity and good statistical properties. In addition, in order to consider the intrinsic correlation between time-series data and further capture the interactions between nutrient and structural parameters with yield formation during crop growing, multiple delayed embedding transform (MDT) [37], an emerging technique, is used to perform tensor decomposition to obtain higher-order block Hankel tensor (BHT) [38]. The hybrid method combining the ARIMA and BHT can reduce information loss during tenderization, capture the intrinsic correlation of time series, solve insufficient experimental data to a certain extent, and has great potential in predicting winter wheat yield [39].
This study aims to propose a high-precision winter wheat yield estimation method applicable to short time-series RS data. Firstly, the optimal vegetation indices group suitable for yield estimation during six key growth periods of winter wheat was obtained based on the Regressional ReliefF (RReliefF) algorithm. Then, the Absolutely Objective Improved Analytic Hierarchy Process (AOIAHP) algorithm was innovatively proposed to determine the contribution weights of the different growth stages to winter wheat yield accumulation. For short time-series RS data prediction, the BHT-ARIMA was firstly applied in yield estimation to solve the problem of insufficient sampled data. The paper is organized as follows. Section 2 introduces the experiment design/data used in this study and describes the details regarding the proposed Absolutely Objective Improved Analytic Hierarchy Process and BHT-ARIMA model, etc. Results of selected optimal VIs, contribution ratio determination, and the winter wheat yield estimation are all described in Section 3. The comparison of the BHT-ARIMA model with other methods in yield estimation is discussed in Section 4. The paper concludes in Section 5 with a summary of the results.

Study Region
Hengshui City (115 • 10 -116 • 34 E, 37 • 03 -38 • 23 N), Hebei Province, China, was chosen as the study region. The region is part of the North China Plain, an alluvial plain of the Yellow River, Huaihe River, and Haihe River [40]. It is an important food production area in North China. In this region, winter wheat is the primary crop, and the main crop planting pattern is the rotation of winter wheat and summer maize [41]. Winter wheat was selected as a research crop in yield estimation.

Experiment Design
Field data measurement was conducted on the winter wheat only during the harvest time. Twenty-two fields, each of which has an area larger than 1 km 2 , were selected and examined. Five plots were deployed diagonally in each field with distance of 50 m to each other and 50 m from the edge of a field, as shown in Figure 1. Then, 110 plots were selected for the experiment. Besides, the coordinates of the experimental plots were recorded by a Global Positioning System (GPS, Trimble GeoExplorer 6000 Series GeoXH, Trimble Navigation, Ltd., Sunnyvale, CA, USA) to accurately obtain corresponding band information from RS images. and describes the details regarding the proposed Absolutely Objective Improved Analytic Hierarchy Process and BHT-ARIMA model, etc. Results of selected optimal VIs, contribution ratio determination, and the winter wheat yield estimation are all described in Section 3. The comparison of the BHT-ARIMA model with other methods in yield estimation is discussed in Section 4. The paper concludes in Section 5 with a summary of the results.

Study Region
Hengshui City (115°10'-116°34'E, 37°03'-38°23'N), Hebei Province, China, was chosen as the study region. The region is part of the North China Plain, an alluvial plain of the Yellow River, Huaihe River, and Haihe River [40]. It is an important food production area in North China. In this region, winter wheat is the primary crop, and the main crop planting pattern is the rotation of winter wheat and summer maize [41]. Winter wheat was selected as a research crop in yield estimation.

Experiment Design
Field data measurement was conducted on the winter wheat only during the harvest time. Twenty-two fields, each of which has an area larger than 1 km 2 , were selected and examined. Five plots were deployed diagonally in each field with distance of 50 m to each other and 50 m from the edge of a field, as shown in Figure 1. Then, 110 plots were selected for the experiment. Besides, the coordinates of the experimental plots were recorded by a Global Positioning System (GPS, Trimble GeoExplorer 6000 Series GeoXH, Trimble Navigation, Ltd., Sunnyvale, CA, USA) to accurately obtain corresponding band information from RS images.

Winter Wheat Yield Data
The growth of the farmland within the pixel scale of the Sentinel-2 was roughly uniform, so the yield per unit area (1 m 2 ) was chosen as a representative of the point, and the distribution of winter wheat within one plot is shown in Figure 1. From each 1 m 2 plot, all ears of winter wheat were collected. According to the standard method of obtaining the dry weight of the winter wheat ears, the ears were heated to 105 °C for killing to inactivate enzymes to reduce the effect on dry weight and dried at 80 °C until a constant weight was reached in the laboratory. Next, the dried grains were manually collected from the ears, and the final dry weight of the grains was recorded with the unit of g. In addition, the unit was then converted from g/m 2 to kg/ha.

Winter Wheat Yield Data
The growth of the farmland within the pixel scale of the Sentinel-2 was roughly uniform, so the yield per unit area (1 m 2 ) was chosen as a representative of the point, and the distribution of winter wheat within one plot is shown in Figure 1. From each 1 m 2 plot, all ears of winter wheat were collected. According to the standard method of obtaining the dry weight of the winter wheat ears, the ears were heated to 105 • C for killing to inactivate enzymes to reduce the effect on dry weight and dried at 80 • C until a constant weight was reached in the laboratory. Next, the dried grains were manually collected from the ears, and the final dry weight of the grains was recorded with the unit of g. In addition, the unit was then converted from g/m 2 to kg/ha.

Sentinel-2 Image Processing
The whole growth period of winter wheat occurs from early March to late May. The selected periods of this study are from rejuvenation to filling-maturity. Predominantly cloud-free Sentinel-2 images covering the whole territory during the key phenological phases (Level 1C Top-of-Atmosphere reflectance products) were downloaded from the Copernicus Open-Access Hub (https://scihub.copernicus.eu/dhus/#/home, accessed on 1 July 2020), as shown in Table 1. Only one Sentinel-2 image was used for each phenological phase. Details of the bands used in this study can be found in Table 2. All bands were atmospherically corrected using the Sen2Cor processor tool, and the bands at 20 m resolution were rescaled to 10 m before being stacked [18,42]. In order to quickly and effectively select the optimal VIs suitable for winter wheat yield estimation and their importance ranking from the VI candidates, the RReliefF algorithm [43], as an extended variant of Relief [44], was adopted to screen the optimal VIs with high efficiency. Unlike Relief, which is a typical binary classification algorithm, the RReliefF could cope with the feature selection with continuous target variables. The calculation workflow is shown below.
RReliefF first sets ranking factors R dy , R dj , R dy∧dj , and R j equal to 0. Then, the algorithm iteratively selects a random observation x r , finds the k-nearest observations to x r , and updates all the intermediate ranking factors, as follows [45]: where R dy is the ranking factor of having different values for the response y (i.e., the actual yields), R dj is the ranking factor of having different values for the predictor F j , and R dy∧dj is the ranking factor of having different response values and different values for the predictor F j . The i and i−1 superscripts denote the iteration step number. ∆y(x r ,x q ) is the difference in the value of the continuous response y between observations x r and x q , shown as Equation (4). Let y r denote the value of the response for observation x r , and let y q denote the value of the response for observation x q . ∆j(x r ,x q ) is the difference in the value of the predictor F j between observations x r and x q , shown as Equation (5). Let x rj denote the value of the jth predictor for observation x r , and let x qj denote the value of the jth predictor for observation x q .
∆ y x r , x q = y r − y q max(y)−min(y) d rq is a distance function of the form, shown as Equation (6), and the distance is subject to the scaling, d rq , shown as Equation (7): where rank(r,q) is the position of the qth observation among the nearest neighbors of the rth observation, sorted by distance, and k is the number of nearest neighbors. Finally, the RReliefF calculates the predictor ranking factor R j after fully updating all the intermediate ranking factors. m is the number of iterations.
In this study, the RReliefF algorithm evaluated candidate VIs' importance to the measured yields from the 110 sampling plots. Subsequently, the importance ranking of VIs can be obtained, which were taken as a standard of selecting optimal vegetation index combinations, as shown in Figure 2. A higher ranking of the VI obtained by RReliefF means that the VI has a closer relation with the yield formation, which laid a foundation for the contribution weights estimation and yield prediction to be discussed in Section 2.5.3. In this study, the top 10 features were selected for the following weight estimation and yield prediction. as Equation (4). Let yr denote the value of the response for observation xr, and let yq denote the value of the response for observation xq. Δ j (x r ,x q ) is the difference in the value of the predictor Fj between observations xr and xq, shown as Equation (5). Let xrj denote the value of the jth predictor for observation xr, and let xqj denote the value of the jth predictor for observation xq.
∆ y x r ,x q = y r − y q max y -min y drq is a distance function of the form, shown as Equation (6), and the distance is subject to the scaling, , shown as Equation (7): where rank(r,q) is the position of the qth observation among the nearest neighbors of the rth observation, sorted by distance, and k is the number of nearest neighbors. Finally, the RReliefF calculates the predictor ranking factor Rj after fully updating all the intermediate ranking factors. m is the number of iterations.
In this study, the RReliefF algorithm evaluated candidate VIs' importance to the measured yields from the 110 sampling plots. Subsequently, the importance ranking of VIs can be obtained, which were taken as a standard of selecting optimal vegetation index combinations, as shown in Figure 2. A higher ranking of the VI obtained by RReliefF means that the VI has a closer relation with the yield formation, which laid a foundation for the contribution weights estimation and yield prediction to be discussed in Section 2.5.3. In this study, the top 10 features were selected for the following weight estimation and yield prediction.

Improved Analytic Hierarchy Process
Different phenological stages of winter wheat have different effects on the accumulation of winter wheat yield. It is crucial to determine the weights of each phenological stage scientifically and effectively using the results derived from the mathematical analysis of limited data. The improved analytic hierarchy process (IAHP) is a weighting method that is optimized from the analytic hierarchy process (AHP) for solving fuzzy problems with incomplete information. A comparison matrix needs to be constructed at the beginning [3]. Compared with AHP, IAHP does not need a consistency test, avoiding adjusting the comparison matrix multiple times and improving efficiency and accuracy [3,46,47]. More importantly, IAHP uses the "three-scale method" to construct the comparison matrix, which means the elements in the matrix are just integers from 0 to 2, and these integers imply the degree of importance between the elements [47]. The values of the elements are determined by the subjective analysis of experts. In this study, IAHP only needed the ranking of the importance of the phenological periods to construct the comparison matrix, significantly reducing the amount of information needed to determine the weights and leveraging the results of mathematical data analysis.

Absolutely Objective Improved Analytic Hierarchy Process
IAHP makes it easier for experts to judge the importance of the comparison to construct the comparison matrix. However, it still requires experts to compare the importance of two stages affected by subjective factors to a certain extent [3,46,47]. The Absolutely Objective Improved Analytic Hierarchy Process (AOIAHP) is innovatively proposed in this paper to completely eliminate the influence of subjective factors. In AOIAHP, the objective data analysis determined the comparison matrix be constructed in the first step of IAHP. In this case, the AOIAHP improved the accuracy of weight calculation.
As shown in Figure 3, the selected ten VI time-series data points of six key phenological phases were used as the predictor variable vector to form the input matrices (110 × 6), and the yield data measured by different plots were used as a response vector (110 × 1). The importance ranking of six key phenological phases of winter wheat based on a single VI can be obtained after RReliefF analysis. Subsequently, the rankings were averaged among the selected ten VIs to obtain a total ranking vector of the six key phenological phases.
where r takes on different values depending on the subscript, and rmax and rmin are the maximum and minimum value of importance coefficients obtained by Equation (10), respectively.
Finally, the quasi-optimal matrix, Q q ij , can be obtained based on elements in B b ij , shown as Equation (13), obtained by Equation (14): where the value of q ij can be calculated by Equation (14) using the elements of the matrix The weight vector, W, of six key phenological phases can be obtained based on the maximum eigenvalue and the maximum eigenvector, d, of the quasi-optimal matrix Q q ij . The elements of W can be obtained by Equation (15), where ∑ w k k=1 =1, and k refers to 6 key phenological phases: Finally, with the obtained weights of the six phenological phases, the optimal weighted vegetation index matrix (110 × 10 × 6) can be constructed, as shown in Figure 3.

BHT-ARIMA Model
The BHT-ARIMA model could fully exploit the intrinsic linkage of the time series and achieve augmentation of short time-series data, effectively solving the problem of Based on the composite ranking of six key phenological phases, the comparison matrix, A a ij , shown as Equation (9), can be scientifically and objectively constructed: 16 . . . . . . . . .
The value of a ij is the comparison result related to the ranking of the importance of six key phenological stages of winter wheat. The values of i and j are both from 1 to 6, Remote Sens. 2022, 14, 1994 7 of 21 corresponding to the six key phenological wheat periods, as shown in Table 1, and the integer values of a ij are from 0 to 2, corresponding to period i being less important than period j, period i being equally important as period j, and period i being more important than period j, respectively.
Then, the importance coefficients vector, r, of winter wheat growth stages can be calculated using the elements in the comparison matrix, A a ij , shown as Equation (10): where the subscript j for r indicates six key phenological periods of the winter wheat. The judgment matrix, B b ij , can be constructed based on the importance coefficients of winter wheat key phenological stages, obtained by Equation (11): where r takes on different values depending on the subscript, and r max and r min are the maximum and minimum value of importance coefficients obtained by Equation (10), respectively. Finally, the quasi-optimal matrix, Q q ij , can be obtained based on elements in B b ij , shown as Equation (13), obtained by Equation (14): where the value of q ij can be calculated by Equation (14) using the elements of the matrix B b ij . The weight vector, W, of six key phenological phases can be obtained based on the maximum eigenvalue and the maximum eigenvector, d, of the quasi-optimal matrix Q q ij . The elements of W can be obtained by Equation (15), where ∑ 6 k=1 w k = 1, and k refers to 6 key phenological phases: Finally, with the obtained weights of the six phenological phases, the optimal weighted vegetation index matrix (110 × 10 × 6) can be constructed, as shown in Figure 3.

BHT-ARIMA Model
The BHT-ARIMA model could fully exploit the intrinsic linkage of the time series and achieve augmentation of short time-series data, effectively solving the problem of simultaneous prediction of multiple short time-series data points [48]. The specific steps of modeling are as follows [48]: (1) Obtain the high-order multidimensional short time-series data using MDT. As is shown in Figure 4, the ten calculated optimal weighted vegetation index time-series data points of six key phenological phases from different experimental plots, whose total number is I (110 herein), are decomposed in time dimension with the time decomposition number parameter, T, to obtain the decomposed time-series data, {χ} I×T . High-order time-series data, {χ t } I×τ×T = χ 1 ,χ 2 , · · · ,χT , are obtained by stretching the decomposed time-series data, {χ} I×T , with the replication matrix, which is determined by the parameter τ. The relationship between T andT is shown in Equation (16): simultaneous prediction of multiple short time-series data points [48]. The specific steps of modeling are as follows [48]: (1) Obtain the high-order multidimensional short time-series data using MDT. As is shown in Figure 4, the ten calculated optimal weighted vegetation index time-series data points of six key phenological phases from different experimental plots, whose total number is I (110 herein), are decomposed in time dimension with the time decomposition number parameter, T, to obtain the decomposed time-series data, χ I×T . High-order time-series data, χ t I×τ×T = χ 1 ,χ 2 ,⋯,χ T , are obtained by stretching the decomposed time-series data, χ I×T , with the replication matrix, which is determined by the parameter . The relationship between T and T is shown in Equation (16): (2) Obtain core tensors from the BHT by Tucker decomposition [49]. With the BHT, we compute its d-order differences to obtain a M-order tensor, [50]. After Tucker decomposition, Δ d χ t is decomposed into the product of a core tensor, , which are shown as below: where m = 1,2,⋯ , M, and U ( ) are the projection matrices, which could maximally preserve the temporal continuity between core tensors, and Δ d G t are the low-rank core tensors, which could represent the important original information of Hankel and the intrinsic linkage of the time-series data.
(3) Tensorized ARIMA model establishment. To retain the temporal correlations among core tensors, a (p; d; q)-order ARIMA model is constructed in the tensor form, which could be used to connect the current core tensor, Δ d G t , and the previous core tensors, Δ d G t-1 ,Δ d G t-2 ,⋯,Δ d G t-p , as in Equation (19). α i and β i are the coefficients of the autoregressive model (AR) and the moving average model (MA), respectively, which were estimated from the core tensors and the residual time-series data [39,51]. ε t-i are the random errors of past q observations, and ε t is the forecast error at the current time point, which should be minimized to optimal zero. (2) Obtain core tensors from the BHT by Tucker decomposition [49]. With the BHT, we compute its d-order differences to obtain a M-order tensor, ∆ dχ t = ∆ dχ 1 , ∆ dχ 2 , · · · , ∆ dχT [50]. After Tucker decomposition, ∆ dχ t is decomposed into the product of a core tensor, ∆ dĜ t = ∆ dĜ 1 , ∆ dĜ 2 , · · · , ∆ dĜT , and M orthogonal projection matrices, Û (m) = Û (1) ,Û (2) , · · · ,Û (M) , which are shown as below: where m = 1, 2, · · · , M, and Û (m) are the projection matrices, which could maximally preserve the temporal continuity between core tensors, and ∆ dĜ t are the low-rank core tensors, which could represent the important original information of Hankel and the intrinsic linkage of the time-series data.
(3) Tensorized ARIMA model establishment. To retain the temporal correlations among core tensors, a (p; d; q)-order ARIMA model is constructed in the tensor form, which could be used to connect the current core tensor, ∆ dĜ t and the previous core tensors, ∆ dĜ t−1 , ∆ dĜ t−2 , · · · , ∆ dĜ t−p , as in Equation (19). {α i } and {β i } are the coefficients of the autoregressive model (AR) and the moving average model (MA), respectively, which were estimated from the core tensors and the residual time-series data [39,51]. {ε t−i } are the random errors of past q observations, and {ε t } is the forecast error at the current time point, which should be minimized to optimal zero. where ∆ dĜ t is input into the tensorized ARIMA model to forecast a new core tensor, ∆ dĜ t+1 . Back projection used Û (m) to obtain ∆ dχT +1 to get the predicted values at theT + 1 time point for all the time-series data simultaneously, which is shown in Equation (20). Finally, inverse differencing and inverse-MDT were applied to obtain the prediction result {χ T+1 }.
In this study, the BHT-ARIMA model was applied to estimate the yield of the winter wheat based on short RS time-series data. Five key model parameters required special attention when predicting winter wheat yields using the BHT-ARIMA model, as shown in Table 3. The number of time-series data points of single VI (I) and the number of selected optimal VIs were determined by the original dataset. The number of time-series data points of single VI (I) is the amount of time-series data for each single VI. There were 110 sets of time-series data representing 110 experimental plots for each single VI. The meaning of the number of selected optimal VIs is obvious. It represents the number of VIs selected to be input into the BHT-ARIMA. The number of selected optimal VIs was set to 10, as 10 VIs had the best prediction performance and reduced the complexity of the model calculation, which was proven in Section 3.3.2. The time decomposition number parameter (T) and the replication matrix parameter (τ) were determined manually according to the model characteristics. The time decomposition number parameter (T) was related to time-series segmentation, an integer that is limited to an integer multiple of 10. The replication matrix parameter (τ) was an integer less than the time decomposition number parameter (T). The number of TS layers (T) was calculated from the time decomposition number parameter (T) and the replication matrix parameter (τ), as shown in Equation (16). The specific parameter values and notes are summarized in Table 3. Table 3. Setting of BHT-ARIMA parameters.

Parameter Value Note
Number of time-series data points of single VI (I) 110 Determined by the original dataset Number of selected optimal VIs 10 Determined by the original dataset Time decomposition number parameter (T) 10 Determined manually, integer multiples of 10 Replication matrix parameter (τ) 5 Determined manually, integers less than T Number of TS layers (T) 6 Calculated by Equation (16) Although the BHT-ARIMA model utilized complex data processing techniques such as tensor decomposition, this algorithm did not need to copy data or pack existing data, but wrote data directly to the tensor, and no additional copies were created for both in-process and out-of-process execution. Therefore, this algorithm did not require high hardware configuration. The algorithm's arithmetic power requirement was 30 g of semi-precision, which can be run 783 times per second according to the theoretical value, and the peak bandwidth was about 540 GB/s under the common RTX 2070 graphics card environment, so it only took a few minutes to obtain the prediction results of this model.

Prediction Accuracy Evaluation
In this study, the coefficient of determination (R-square, R 2 ), mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE) were used to determine the predictive performance of the model. The R 2 , MAE, MAPE, and RMSE were defined as follows: Remote Sens. 2022, 14,1994 10 of 21 where n represents the number of samples, y i is the observed yield value, y i is the predicted yield, and y is the average observed yield.

Initial Vegetation Index Candidates
VIs shown in Table 3 were initially chosen as candidates for the following selection. All these VIs have been proven to have a close correlation with crop nutritional parameters (such as chlorophyll, nitrogen, etc.) and structural parameters (leaf area index (LAI), etc.). VIs, including NDVI [17,18,20], GNDVI [21], MTCI [22], WDRVI [24,25], etc., have often been used directly to build linear regression models for yield estimation. In addition, some characteristic VIs, including REP [23] and SAVI [52], etc., have proven to be highly correlated with crop nutrition parameters. Besides, VIs including NDVI [53,54], WDRVI [55,56], etc., have been applied to measure structural parameters (LAI, etc.). These indexes can also be used to reflect crop yield from the aspect of nutrition. Three red-edge bands in Sentinel-2 were used in the study, so VIs including NDRE, MTCI, and CI RE with red-edge bands (abbreviated as RE) were calculated using different red-edge bands, including RE1, RE2, and RE3, in order to take full advantage of multiple red-edge bands in Sentinel-2. A total of 18 VIs were obtained for the following selection. The specific calculation equations using the bands shown in Table 2 are shown in the middle column in Table 4.

Optimal VI Selection to Monitor the Growth of Winter Wheat
Among the VI candidates, the optimal VIs for yield estimation during the whole growth period were selected through the RReliefF algorithm and shown in Table 5, which are sensitive to the chemical and biophysical parameters, which determined the photosynthetic capacity and efficiency and further affected yield formation of winter wheat [68]. As is shown in this table, the red-edge band of Sentinel-2 was fully utilized. According to the published literature, MTCI was closely related to the chlorophyll content of crops, which is the basic element for photosynthesis [63,69]. Besides, MTCI is highly sensitive to changes in biomass during the whole growth period of winter wheat [70]. REP strongly correlates with a wide range of foliar N concentrations during the whole growth period, influencing photosynthesis and yield formation [23,71]. These are consistent with the stable and high ranking of MTCI and REP by RReliefF. NDVI is one of the most widely used VIs to evaluate crop growth and monitor the nutrients needed for crop growth, development, and yield estimation. Besides, NDVI(RE), calculated from the red-edge bands, red edge1, red edge2, and red edge3, delays the saturation trend during the late growth period [70]. Additionally, EVI was proposed to handle the saturation problem of NDVI with improved sensitivity in high-biomass regions and improved vegetation monitoring through a de-coupling of the canopy background signal and a reduction in atmosphere influences [68,72]. SAVI and its optimized variants (MSAVI, OSAVI) increased vegetation reflection while suppressing soil background effects in images to improve monitoring accuracy [65,66]. The above reviews verified that the combination of selected VIs could give full play to their respective advantages in crop growth monitoring and yield forecasting during the whole growth period of winter wheat.

Contribution Ratio Determination to Winter Wheat Yield Accumulation
Based on AOIAHP, the contribution ratios of six key growth stages can be obtained. Figure 5 presents the bar chart of the contribution ratios of six key phenological phases, showing that the jointing period had the highest contribution ratio, followed by the rising, heading, filling, rejuvenation, and lastly, filling-maturity period.
Sens. 2022, 14, x FOR PEER REVIEW organic matter produced by photosynthesis are stored in the grain still determining the accumulation of winter wheat.   In the rejuvenation period, yellow leaves of winter wheat turned green as the canopy structure and chemical parameter content of winter wheat increased. From the rising period to jointing, chemical and biophysical parameters influencing photosynthesis continued to increase, increasing contribution ratios to yield. From the jointing period to the heading period, the growth and development of winter wheat dramatically increased, with the root group and the leaf area reaching the maximum, and the wheat ears beginning to differentiate. These stages are the key periods for determining the number of grains per ear, and it is also the consolidation period for the effective number of ears per hectare and the weight. After the heading period, starch, protein, and accumulated organic matter produced by photosynthesis are stored in the grain through assimilation, still determining the accumulation of winter wheat.

Winter Wheat Yield Prediction Based on Optimal Weighted Vegetation Indexes and the BHT-ARIMA Model
This research used six phenological phases weighted VI time series to establish the BHT-ARIMA model to predict winter wheat yield. The 110 samples were divided into 2 parts for model establishment, where 88 samples were selected randomly for use as the calibration subset, and the remaining 22 were used as the independent validation subset. The R 2 , RSME, MAE, and MAPE for the relationship between the observed and predicted yields in the validation dataset were calculated to evaluate the performance of the prediction of the BHT-ARIMA model. The input of the tensorized ARIMA model (i.e., BHT-ARIMA) has to meet the data requirements of the ARIMA model. The final BHT-ARIMA model only outputs ten reasonable data points after eliminating the unreasonable data. As shown in Figure 6, the R-square reached 0.583 and the RSME, MAE, and MAPE of this model were 323.637 kg/ha, 255.954, and 3.186%, respectively. This model performed better when the observed yields were larger than 8000 kg/ha. The good performance shows that the BHT-ARIMA model has capability in short time-series forecasting in agriculture. Figure 6. The R 2 , RSME, MAE, and MAPE for the relationship between th yields based on the weighted optimal VI group. The dashed line is the 1:1

Sens. 2022, 14, x FOR PEER REVIEW
The vegetation index information was constant for each samp trained. To demonstrate whether the model was overfitted on t model with randomly selected yield samples from 80% of all samp for cross-validation. The cross-validation is to verify whether the m the increase in the number of verifications, overfitting can likely between training and validation errors is significant. In this case, Figure 6. The R 2 , RSME, MAE, and MAPE for the relationship between the observed and predicted yields based on the weighted optimal VI group. The dashed line is the 1:1 line.
The vegetation index information was constant for each sample when the model was trained. To demonstrate whether the model was overfitted on the VIs, we trained the model with randomly selected yield samples from 80% of all samples, using other samples for cross-validation. The cross-validation is to verify whether the model is overfitted. With the increase in the number of verifications, overfitting can likely occur if the difference between training and validation errors is significant. In this case, the R 2 was used as the only metric to measure the cross-validation results. This validation was performed ten times. As we can see from Figure 7, R 2 fluctuated around 0.5. The result indicated no significant overfitting, which implies that the model is generalized and can be used to estimate winter wheat yield. However, only ten reasonable results were obtained after data tensorization, the sample pool was reduced, which also had some influence on the validation process, and there were some limitations of this method.
Sens. 2022, 14, x FOR PEER REVIEW In order to analyze the effect of the number of vegetation ind model prediction performance, 18 matrices ranging in size from 11 were used sequentially as inputs to the BHT-ARIMA model. Th MAPE were calculated to assess the performance of the BHT-ARIM Figure 8. Figure 8a shows that R 2 basically had an increasing tren the number of the selected indexes less than 10, and stabilized larger than 10. Figure 8b-d show that the RSME, MAE, and MAPE same trend. They declined with high fluctuations and had larger v the number of the selected indexes was greater than ten, the values gradually leveled off.
However, if too many indexes were selected, for each additi BHT-ARIMA model needed to perform the process of tensor trans tion, and inversion again, which made the calculation cycle longe of the system, and reduced the efficiency of the model and redund Therefore, ten VIs selected by the RReliefF algorithm were u obtain better prediction results while saving computational cycles efficiency. Besides, the results indicated that the ten selected sets o represent valid information related to yield formation and effecti dancy of information from the full vegetation index.

Influence of the Number of Vegetation Indexes on the Performance of the BHT-ARIMA Model
In order to analyze the effect of the number of vegetation indices we selected on the model prediction performance, 18 matrices ranging in size from 110 × 1 × 6 to 110 × 18 × 6 were used sequentially as inputs to the BHT-ARIMA model. The R 2 , RSME, MAE, and MAPE were calculated to assess the performance of the BHT-ARIMA model, as shown in Figure 8. Figure 8a shows that R 2 basically had an increasing trend, with the increase of the number of the selected indexes less than 10, and stabilized when the number was larger than 10. Figure 8b-d show that the RSME, MAE, and MAPE had approximately the same trend. They declined with high fluctuations and had larger values before ten. When the number of the selected indexes was greater than ten, the values of all three parameters gradually leveled off.
However, if too many indexes were selected, for each additional set of indexes, the BHT-ARIMA model needed to perform the process of tensor transformation, decomposition, and inversion again, which made the calculation cycle longer, increased the burden of the system, and reduced the efficiency of the model and redundancy.
Therefore, ten VIs selected by the RReliefF algorithm were used as model inputs to obtain better prediction results while saving computational cycles and maximizing model efficiency. Besides, the results indicated that the ten selected sets of vegetation index can represent valid information related to yield formation and effectively reduce the redundancy of information from the full vegetation index. Remote Sens. 2022, 14, x FOR PEER REVIEW 15 of 22

Comparisons of Performance of Weighted Vegetation Indexes and Unweighted Vegetation Indexes in the BHT-ARIMA Model
In order to verify the effect of the calculated weights on different phenological periods for winter wheat yield accumulation, two datasets of winter wheat VIs were established, including the dataset used in Section 3.3.1, i.e., the weighted optimal vegetation index group, and the dataset of the unweighted optimal vegetation index group. The samples were divided into two groups: 80% were the calibration group and the remaining 20% were the validation group. Figure 8 shows the scattered plots of the predicted yield and real winter wheat yield, which were predicted based on the unweighted and weighted optimal VI groups. Since the tensorized ARIMA model specifies non-zero input values, the model has only 10 sets of output values after automatically eliminating the unreasonable values. Figure 9 shows that the predicted yield was positively correlated with the observed yield. Figure 9a,b are the plots of the relationship between the predicted yield and the observed yield using unweighted and weighed VI groups, respectively. The R-square of the calibrated model with weighted VI reached 0.583, higher than that of the model with unweighted VI, whose R-square reached 0.325. The weighted model's RSME, MAE, and MAPE were 323.637 kg/ha, 255.954, and 3.186%, respectively, smaller than those of the unweighted model, with 492.990 kg/ha, 350.625, and 4.332%, respectively.
The results indicate different degrees of influence on winter wheat yield for each phenological stage of winter wheat. The selected vegetation index combined with the contribution of key phenological stages can predict the yield of winter wheat well, with good robustness and realism.

Comparisons of Performance of Weighted Vegetation Indexes and Unweighted Vegetation Indexes in the BHT-ARIMA Model
In order to verify the effect of the calculated weights on different phenological periods for winter wheat yield accumulation, two datasets of winter wheat VIs were established, including the dataset used in Section 3.3.1, i.e., the weighted optimal vegetation index group, and the dataset of the unweighted optimal vegetation index group. The samples were divided into two groups: 80% were the calibration group and the remaining 20% were the validation group. Figure 8 shows the scattered plots of the predicted yield and real winter wheat yield, which were predicted based on the unweighted and weighted optimal VI groups. Since the tensorized ARIMA model specifies non-zero input values, the model has only 10 sets of output values after automatically eliminating the unreasonable values. Figure 9 shows that the predicted yield was positively correlated with the observed yield. Figure 9a,b are the plots of the relationship between the predicted yield and the observed yield using unweighted and weighed VI groups, respectively. The R-square of the calibrated model with weighted VI reached 0.583, higher than that of the model with unweighted VI, whose R-square reached 0.325. The weighted model's RSME, MAE, and MAPE were 323.637 kg/ha, 255.954, and 3.186%, respectively, smaller than those of the unweighted model, with 492.990 kg/ha, 350.625, and 4.332%, respectively.
The results indicate different degrees of influence on winter wheat yield for each phenological stage of winter wheat. The selected vegetation index combined with the contribution of key phenological stages can predict the yield of winter wheat well, with good robustness and realism. To verify the advantages of the BHT-ARIMA model in predicting winter wheat yi using short time-series RS data, BHT-ARMA and BHT-CNN models were applied to co pare the performance in winter wheat yield estimation among the BHT-ARMA, BH CNN, and BHT-ARIMA models. BHT-ARMA and BHT-CNN models are tensorized fr the original autoregressive moving average (ARMA) and CNN models. The CNN n work is divided into five convolutional layers, four pooling layers, and two fully c nected layers. Optimal weighted VI time-series data consisting of 10 selected VIs w used to train the BHT-ARIMA, BHT-ARMA, and BHT-CNN models and further evalu the yield estimation accuracies of the trained models. The R 2 , RSME, MAE, and MAPE the observed and predicted yields' relationship were calculated. The yield prediction sults using the three methods mentioned above are shown in Figure 10.
As we can see from Figure 9, the overall performance of the BHT-ARIMA model w better than that of the BHT-ARMA and BHT-CNN models. The R 2 of the BHT-ARIM model reached 0.583, higher than that of the BHT-ARMA and BHT-CNN models, w 0.302 and 0.350, respectively. Moreover, the MAE, MAPE, and RMSE of the BHT-ARIM model, which were 284.366, 3.498%, and 327.358 kg/ha, respectively, were all lower th the BHT-ARMA model, with 711.993, 8.562%, and 958.730 kg/ha, and the BHT-CN model with 695.276, 8.503%, and 791.545 kg/ha, respectively. The experimental data, cluding R 2 , RSME, MAE, and MAPE, are summarized in Table 6.
The results of the model performance comparison showed that the BHT-ARIM model outperformed traditional deep learning networks and time-series prediction m els in solving prediction problems for short time series of agricultural remote sensi especially for multiple short time-series data prediction problems. Especially, we can intuitively from Table 6 that, compared with BHT-ARMA and BHT-CNN, the R 2 of BH ARIMA had a dramatic rise from around 0.3 to around 0.6, while RMSE, MAE, and MA had a sharp decline. This may be caused by insufficient time-series data used in this stu Traditional prediction models require large, extensive data, limiting their application yield predictions with fewer data. The BHT-CNN model was trained with a large amo To verify the advantages of the BHT-ARIMA model in predicting winter wheat yield using short time-series RS data, BHT-ARMA and BHT-CNN models were applied to compare the performance in winter wheat yield estimation among the BHT-ARMA, BHT-CNN, and BHT-ARIMA models. BHT-ARMA and BHT-CNN models are tensorized from the original autoregressive moving average (ARMA) and CNN models. The CNN network is divided into five convolutional layers, four pooling layers, and two fully connected layers. Optimal weighted VI time-series data consisting of 10 selected VIs were used to train the BHT-ARIMA, BHT-ARMA, and BHT-CNN models and further evaluate the yield estimation accuracies of the trained models. The R 2 , RSME, MAE, and MAPE for the observed and predicted yields' relationship were calculated. The yield prediction results using the three methods mentioned above are shown in Figure 10.
As we can see from Figure 9, the overall performance of the BHT-ARIMA model was better than that of the BHT-ARMA and BHT-CNN models. The R 2 of the BHT-ARIMA model reached 0.583, higher than that of the BHT-ARMA and BHT-CNN models, with 0.302 and 0.350, respectively. Moreover, the MAE, MAPE, and RMSE of the BHT-ARIMA model, which were 284.366, 3.498%, and 327.358 kg/ha, respectively, were all lower than the BHT-ARMA model, with 711.993, 8.562%, and 958.730 kg/ha, and the BHT-CNN model with 695.276, 8.503%, and 791.545 kg/ha, respectively. The experimental data, including R 2 , RSME, MAE, and MAPE, are summarized in Table 6.
The results of the model performance comparison showed that the BHT-ARIMA model outperformed traditional deep learning networks and time-series prediction models in solving prediction problems for short time series of agricultural remote sensing, especially for multiple short time-series data prediction problems. Especially, we can see intuitively from Table 6 that, compared with BHT-ARMA and BHT-CNN, the R 2 of BHT-ARIMA had a dramatic rise from around 0.3 to around 0.6, while RMSE, MAE, and MAPE had a sharp decline. This may be caused by insufficient time-series data used in this study. Traditional prediction models require large, extensive data, limiting their application to yield predictions with fewer data. The BHT-CNN model was trained with a large amount of information that may be lost in the pooling layer, and the training results tended to converge on local minima, ignoring the local-to-whole correlation. The BHT-ARMA may not perform autoregressive learning, and only used data from the 10 sets of VIs with larger weights to predict. These limitations mentioned above became obvious when the input data were insufficient. In contrast, the BHT-ARIMA model made full use of the intrinsic linkage of the time series, automatically enabling data expansion for short time series.

Regional Winter Wheat Yield Estimation in Hengshui
In Section 3.3.1, the BHT-ARIMA model effectively estimated winter wheat y based on short time-series VIs and actual yields, with R 2 reaching 0.583 from 110 exp mental plots. In this section, the trained BHT-ARIMA model was used to estimate regional winter wheat yields in Hengshui City, Hebei Province, based on the image of

Regional Winter Wheat Yield Estimation in Hengshui
In Section 3.3.1, the BHT-ARIMA model effectively estimated winter wheat yield based on short time-series VIs and actual yields, with R 2 reaching 0.583 from 110 experimental plots. In this section, the trained BHT-ARIMA model was used to estimate the regional winter wheat yields in Hengshui City, Hebei Province, based on the image of the winter wheat plant area. The data of the plant area of winter wheat in Hengshui City are from the published literature [40]. The prediction results are shown in Figure 11. Most predicted winter wheat yields in Hengshui City were between 7500 and 10,000 kg/ha, which was in line with the actual yield data in Section 2.3. In the west, mid-south, and east of Hengshui City, the predicted yields were higher than those of other areas in Hengshui City. Especially, the middle and south of Shenzhou, the east of Fucheng, and the east and south of Jingxian County were the higher-yield regions. These results were in general agreement with the spatial variation trend of the field experiment. The above results prove that the BHT-ARIMA model has strong spatial generalization in winter wheat yield estimation.
Remote Sens. 2022, 14, x FOR PEER REVIEW City. Especially, the middle and south of Shenzhou, the east of Fucheng, and the south of Jingxian County were the higher-yield regions. These results were in agreement with the spatial variation trend of the field experiment. The above resu that the BHT-ARIMA model has strong spatial generalization in winter wheat y mation. Figure 11. Predicted regional winter wheat yields in Hengshui City using the BHT-ARIM

Discussion
Previous studies have explored correlations between multitemporal VIs yields based on Random Forest (RF), Support Vector Machines (SVM), etc. Howev traditional deep learning-based methods require large, extensive data, limiting plication to yield predictions with fewer data [73][74][75][76]. Therefore, the BHT-ARIM applied in yield estimation has greater potential than other methods. In this stu was obtained and combined with the ARIMA model, which can reduce informa during tensorization, capture the intrinsic correlation of time series, and solve the of insufficient experimental data. The BHT-ARIMA model outperformed the BHT and BHT-CNN models in estimating wheat yields, as mentioned in Section 3.3.4. 7, we have summarized several published RS methods for winter wheat yield es which all used multi-temporal time-series data. At the satellite scale, the Agricult Photosynthesis Model (ACPM) was proposed in [40] for the winter wheat abov biomass (DAM) and yield estimation, with R 2 around 0.49. It is worth noting measured yield data used in [40] were the same as the yield data in this study SVM was applied to predict winter wheat yield based on ground-measured tim spectral data. Its R 2 was larger than that of BHT-ARIMA, which may be cause larger scale of satellite data. In [77], based on UAV-scale RS time-series data, Figure 11. Predicted regional winter wheat yields in Hengshui City using the BHT-ARIMA model.

Discussion
Previous studies have explored correlations between multitemporal VIs and crop yields based on Random Forest (RF), Support Vector Machines (SVM), etc. However, these traditional deep learning-based methods require large, extensive data, limiting their application to yield predictions with fewer data [73][74][75][76]. Therefore, the BHT-ARIMA model applied in yield estimation has greater potential than other methods. In this study, BHT was obtained and combined with the ARIMA model, which can reduce information loss during tensorization, capture the intrinsic correlation of time series, and solve the problem of insufficient experimental data. The BHT-ARIMA model outperformed the BHT-ARMA and BHT-CNN models in estimating wheat yields, as mentioned in Section 3.3.4. In Table 7, we have summarized several published RS methods for winter wheat yield estimation, which all used multi-temporal time-series data. At the satellite scale, the Agriculture Crop Photosynthesis Model (ACPM) was proposed in [40] for the winter wheat aboveground biomass (DAM) and yield estimation, with R 2 around 0.49. It is worth noting that the measured yield data used in [40] were the same as the yield data in this study. In [41], SVM was applied to predict winter wheat yield based on ground-measured time-series spectral data. Its R 2 was larger than that of BHT-ARIMA, which may be caused by the larger scale of satellite data. In [77], based on UAV-scale RS time-series data, the PLS, SVM, and RF methods achieved yield prediction with R 2 of 0.45, 0.50, and 0.55, respectively, which were all less than the R 2 value of BHT-ARIMA. The results showed the potential and advantages of BHT-ARIMA in the estimation of yield based on multi-temporal RS time-series data. However, it should be noted that in the tensorization process, the inversion of the tensor generated some outliers that did not meet the data requirements of the ARIMA model. In this case, some unreasonable data were lost automatically before being input into the ARIMA model, which impacts prediction accuracy.

Conclusions
This study aimed to use RS time-series data, explore the intrinsic relationship between crop growth and yield formation at different fertility stages, and construct a high-precision winter wheat yield estimation model applicable to short time-series data.
Furthermore, AOIAHP was proposed to quantify the different effects of winter wheat growth on yield accumulation at different fertility stages. The jointing period had the highest contribution ratio to winter wheat yield accumulation, followed by rising, heading, filling, rejuvenation, and lastly, filling-maturity. The contribution ratios of the six key phenological stages could be used as weights to be combined with the selected VIs to predict winter wheat yields Moreover, the BHT-ARIMA model was first applied to predict winter wheat yield with short time-series data. Compared with the BHT-CNN and BHT-ARMA models, the BHT-ARIMA model achieved the best prediction results, with R 2 reaching 0.583. The MAE, MAPE, and RMSE of the BHT-ARIMA model were lower than those of the BHT-ARMA and BHT-CNN models.
In this study, the BHT-ARIMA model improved the accuracy of winter wheat yield prediction based on weighted short RS time-series data, and its spatial generalization was proven. To further validate the generalization of the BHT-ARIMA model, in future work, we will validate the model for yield estimation in different regions, at different times, and for different crops to explore the universality and robustness of the model. As for the BHT-ARIMA model itself, we will improve the prediction accuracy of BHT-ARIMA, reducing the impact of data reduction due to the automatic elimination of the unreasonable values. This study provides a fast and accurate guide for crop yield prediction and will be of great value for the processing and applying of short RS time-series data.