Estimation of Above-Ground Biomass of Winter Wheat Based on Consumer-Grade Multi-Spectral UAV

: One of the problems of optical remote sensing of crop above-ground biomass (AGB) is that vegetation indices (VIs) often saturate from the middle to late growth stages. This study focuses on combining VIs acquired by a consumer-grade multiple-spectral UAV and machine learning regression techniques to (i) determine the optimal time window for AGB estimation of winter wheat and to (ii) determine the optimal combination of multi-spectral VIs and regression algorithms. UAV-based multi-spectral data and manually measured AGB of winter wheat, under ﬁve nitrogen rates, were obtained from the jointing stage until 25 days after ﬂowering in the growing season 2020/2021. Forty-four multi-spectral VIs were used in the linear regression (LR), partial least squares regression (PLSR), and random forest (RF) models in this study. Results of LR models showed that the heading stage was the most suitable stage for AGB prediction, with R 2 values varying from 0.48 to 0.93. Three PLSR models based on different datasets performed differently in estimating AGB in the training dataset (R 2 = 0.74~0.92, RMSE = 0.95~2.87 t/ha, MAE = 0.75~2.18 t/ha, and RPD = 2.00~3.67) and validation dataset (R 2 = 0.50~0.75, RMSE = 1.56~2.57 t/ha, MAE = 1.44~2.05 t/ha, RPD = 1.45~1.89). Compared with PLSR models, the performance of the RF models was more stable in the prediction of AGB in the training dataset (R 2 = 0.95~0.97, RMSE = 0.58~1.08 t/ha, MAE = 0.46~0.89 t/ha, and RPD = 3.95~6.35) and validation dataset (R 2 = 0.83~0.93, RMSE = 0.93~2.34 t/ha, MAE = 0.72~2.01 t/ha, RPD = 1.36~3.79). Monitoring AGB prior to ﬂowering was found to be more effective than post-ﬂowering. Moreover, this study demonstrates that it is feasible to estimate AGB for multiple growth stages of winter wheat by combining the optimal VIs and PLSR and RF models, which overcomes the saturation problem of using individual VI-based linear regression models.


Introduction
Winter wheat, one of the main cultivated food crops in the North China Plain (NCP), plays a vital role in China's food security [1]. The above-ground biomass (AGB), which is the total mass of dry organic matter per unit area at a given time, is a good indicator reflecting the growth and development status of crops [2][3][4], especially for crop yield [5]. Therefore, timely and accurate monitoring of the AGB of winter wheat is essential, and it is of guiding significance for the field management of winter wheat [6,7]. have been studied using a UAV remote sensing platform for AGB estimation and have achieved good results. The application of a UAV remote sensing system will promote the development of agriculture. Actually, for VIs, machine learning, and UAV-based remote sensing platforms, previous studies have combined one or two of these techniques for AGB prediction for crops. However, there are limited studies using machine learning algorithms combined with VIs obtained from consumer UAVs to estimate the AGB of winter wheat, and our understanding is very limited in terms of the optimal growth period and VIs for estimating the AGB of winter wheat.
The objectives of this study were (i) to determine the optimal time window of using VIs for AGB estimation of winter wheat and (ii) to determine the optimal combination of multi-spectral VIs and regression algorithms for predicting AGB of winter wheat in multiple growth stages.

Study Area and Experimental Design
The experiment was conducted from 2020 to 2021 at the Wuqiao Experimental Station of China Agricultural University, which is located in Wuqiao County (37 • 41 N, 116 • 37 E), Hebei Province, China ( Figure 1). The site is in the North China Plain and has a warm temperate semi-humid continental monsoon climate, average rainfall of 550 mm, an average temperature of 12.5 • C, and an average altitude of 18 m. The soil texture was determined to be light loam, which contains 11.9% clay, 78.1% silt, and 10.0% sand. accuracy [50]. As for AGB prediction, the CSM technology inherited by drones, which can provide three-dimensional point clouds [51], is highly suitable. Major crops including maize [3], wheat [7], and rice [52] have been studied using a UAV remote sensing platform for AGB estimation and have achieved good results. The application of a UAV remote sensing system will promote the development of agriculture. Actually, for VIs, machine learning, and UAV-based remote sensing platforms, previous studies have combined one or two of these techniques for AGB prediction for crops. However, there are limited studies using machine learning algorithms combined with VIs obtained from consumer UAVs to estimate the AGB of winter wheat, and our understanding is very limited in terms of the optimal growth period and VIs for estimating the AGB of winter wheat.
The objectives of this study were (i) to determine the optimal time window of using VIs for AGB estimation of winter wheat and (ii) to determine the optimal combination of multi-spectral VIs and regression algorithms for predicting AGB of winter wheat in multiple growth stages.

Study Area and Experimental Design
The experiment was conducted from 2020 to 2021 at the Wuqiao Experimental Station of China Agricultural University, which is located in Wuqiao County (37°41′N, 116°37′E), Hebei Province, China ( Figure 1). The site is in the North China Plain and has a warm temperate semi-humid continental monsoon climate, average rainfall of 550 mm, an average temperature of 12.5 °C, and an average altitude of 18 m. The soil texture was determined to be light loam, which contains 11.9% clay, 78.1% silt, and 10.0% sand. This experiment followed a split zone test design. We selected JiMai22 variety of winter wheat (Triticum aestivum L.) in this study, which has the largest plant area in China, and was sowed in October 2020 and harvested in June 2021 with a row spacing of 15 cm and a density of 430 × 10 4 ha −1 . Five nitrogen fertilizer treatments were established, including 0 kg N ha −1 (N0), 80 kg N ha −1 (N1), 120 kg N ha −1 (N2), 160 kg N ha −1 (N3), and 200 kg N ha −1 (N4). For each treatment, three replications were conducted, and each plot area was 40 m 2 (10 m × 4 m). For all treatments, we applied 120 kg P2O5 ha −1 and 90 kg K2O ha −1 to the soil as basal dressings, and the rest of the field managements are the same. This experiment followed a split zone test design. We selected JiMai22 variety of winter wheat (Triticum aestivum L.) in this study, which has the largest plant area in China, and was sowed in October 2020 and harvested in June 2021 with a row spacing of 15 cm and a density of 430 × 10 4 ha −1 . Five nitrogen fertilizer treatments were established, including 0 kg N ha −1 (N0), 80 kg N ha −1 (N1), 120 kg N ha −1 (N2), 160 kg N ha −1 (N3), and 200 kg N ha −1 (N4). For each treatment, three replications were conducted, and each plot area was 40 m 2 (10 m × 4 m). For all treatments, we applied 120 kg P 2 O 5 ha −1 and 90 kg K 2 O ha −1 to the soil as basal dressings, and the rest of the field managements are the same. When the wheat grows to the observed growth period (see Table 1), a 20 cm plus 30 cm area of wheat was randomly selected in each plot. After sampling, the samples were placed in plastic bags and taken back to the laboratory immediately. The stems and leaves of wheat were separated and dried at 105 • C for 2 h and then at 75 • C until the samples maintained a constant weight [52]. We used a balance with an accuracy of 0.01 g to obtain the dry weight of the above-ground biomass of wheat. During the wheat season, eight UAV flight missions were carried out between 10:00 a.m. to 14:00 p.m. on sunny days. Meanwhile, the dates of execution of the mission were the same as the dates of ground sampling. The UAV platform we used was a DJI Phantom 4 quadcopter (DJI, Shenzhen, Guangdong, China), which was equipped with a GPS/GNSS satellite positioning system, and with a maximum load capacity of 1.388 kg and a maximum flight time of 27 min. The sensor had six 1/2.9-inch CMOS, including one RGB sensor for visible light imaging and five monochromatic sensors including blue (450 nm ± 16 nm), green (560 nm ± 16 nm), red (650 nm ± 16 nm), red edge (730 nm ± 16 nm), and NIR (840 nm ± 26 nm) for multi-spectral imaging. For every single sensor, there were 2.08 million effective pixels (more details in Table 2). The fight route was made by DJI go pro software (DJI, Shenzhen, Guangdong, China), which provides easy mission planning through different methods such as setting points using the aircraft and importing files. The images were acquired with 80% overlap and 70% side overlap at the height of 25 m. Nine ground control points (GCP) were evenly placed on the field after the wheat emerged. We used a D-RTK 2 high-precision GNSS mobile station (DJI, Shenzhen, Guangdong, China) which has a centimeter-level positioning system with uninterrupted data transmission to record the coordinate information of GCPs. Figure 2 shows the scene of the drone working in the field.  Pix4D software (Pix4D SA, Lausanne, Switzerland), which integrates structur motion (SFM) technology, was used to generate orthomosaic images. The SFM tech can search for the keypoints among the UAV-captured images. Camera parame be calibrated to external parameters, such as the position and scale of the images same time, a point correlation was performed based on the characteristics to similar characteristics between images in common areas or overlapping area computing the 3D position of matched points, the corresponding images were d and textured. Then, by projecting each textured pixel onto a 2D plane, the orth image was obtained. We selected the multi-spectral Ag template as our mo manually punctured points by importing the coordinates of GCPs for orth georeferencing. Finally, five single-band orthophotos were obtained in each o stage.  Pix4D software (Pix4D SA, Lausanne, Switzerland), which integrates structure-frommotion (SFM) technology, was used to generate orthomosaic images. The SFM technology can search for the keypoints among the UAV-captured images. Camera parameters can be calibrated to external parameters, such as the position and scale of the images. At the same time, a point correlation was performed based on the characteristics to identify similar characteristics between images in common areas or overlapping areas. After computing the 3D position of matched points, the corresponding images were densified and textured. Then, by projecting each textured pixel onto a 2D plane, the orthomosaic image was obtained. We selected the multi-spectral Ag template as our model and manually punctured points by importing the coordinates of GCPs for orthomosaic georeferencing. Finally, five single-band orthophotos were obtained in each observed stage.

Selection of VIs
Forty-four VIs were selected to estimate above-ground biomass of winter wheat (see Table 3). All of the VIs were calculated from the original multi-spectral images.
Band math was performed in the QGIS 3.14 open-source software (QGIS Version 3.14). The orthophotos obtained from Pix4D software were imported into the new project of QGIS. We used the function of the raster calculator to perform the band math and generate the original vegetation index maps. In order to reduce the impact of abnormal factors such as soil, we applied image segmentation based on the OTSU algorithm in MATLAB (MathWorks, Natick, MA, USA) ( Figure 3). Finally, QGIS software was used to extract the vegetation index of the region of interest. Table 3. Vegetation indices used in this study.

Modeling Methods
For each growth stage, there were 15 remote sensing data points from regions of interest corresponding to the above-ground biomass data obtained by destructive sampling. In total, there were 120 samples across all stages. Linear regression (LR), partial least squares regression (PLSR), and random forests algorithm (RF) were used in this research. In individual growth stages, linear models based on each vegetation index were established for exploring the relationship between vegetation index and AGB. Meanwhile, models were built by PLSR and RF based on a pre-flowering dataset, post-flowering dataset, and a full dataset for determining the optimal combination of multi-spectral VIs and regression algorithms for predicting AGB of winter wheat in multiple growth stages. All the models were built in R programming language in R Studio (R Version 3.6.1) [83], using the R packages 'pls' [84] and 'randomForest' [85]. For the validation of models, 80% of the samples were selected as a training dataset, and the remaining 20% were used as a validation dataset. Band math was performed in the QGIS 3.14 open-source software (QGIS Version 3.14). The orthophotos obtained from Pix4D software were imported into the new project of QGIS. We used the function of the raster calculator to perform the band math and generate the original vegetation index maps. In order to reduce the impact of abnormal factors such as soil, we applied image segmentation based on the OTSU algorithm in MATLAB (MathWorks, Natick, MA, USA) ( Figure 3). Finally, QGIS software was used to extract the vegetation index of the region of interest.

Modeling Methods
For each growth stage, there were 15 remote sensing data points from regions of interest corresponding to the above-ground biomass data obtained by destructive sampling. In total, there were 120 samples across all stages. Linear regression (LR), partial least squares regression (PLSR), and random forests algorithm (RF) were used in this research. In individual growth stages, linear models based on each vegetation index were established for exploring the relationship between vegetation index and AGB. Meanwhile, models were built by PLSR and RF based on a pre-flowering dataset, post-flowering dataset, and a full dataset for determining the optimal combination of multi-spectral VIs and regression algorithms for predicting AGB of winter wheat in multiple growth stages. All the models were built in R programming language in R Studio (R Version 3.6.1) [83], using the R packages 'pls' [84] and 'randomForest' [85]. For the validation of models, 80% of the samples were selected as a training dataset, and the remaining 20% were used as a validation dataset.
PLSR is a bilinear regression method [86]. It integrates the advantages of principal component analysis, canonical correlation analysis, and linear regression analysis. It is stable, and suitable for small datasets and can handle multi-collinearity. The objective of partial least squares is to predict dependent variables from independent variables by describing the common structure of these two variables. PLSR can be used to identify potential factors, which are linear combinations of explanatory variables (also called latent variables) that best model the response variable. By performing component projection, it reduces noise and the dimensionality and eliminates the multi-collinearity of the input data. In this study, the one-sigma algorithm [87] was conducted to determine the optimal number of principal components. The variable importance in projection (VIP) [88] score PLSR is a bilinear regression method [86]. It integrates the advantages of principal component analysis, canonical correlation analysis, and linear regression analysis. It is stable, and suitable for small datasets and can handle multi-collinearity. The objective of partial least squares is to predict dependent variables from independent variables by describing the common structure of these two variables. PLSR can be used to identify potential factors, which are linear combinations of explanatory variables (also called latent variables) that best model the response variable. By performing component projection, it reduces noise and the dimensionality and eliminates the multi-collinearity of the input data. In this study, the one-sigma algorithm [87] was conducted to determine the optimal number of principal components. The variable importance in projection (VIP) [88] score is often used to assess the importance of variables, and the variables with a VIP score greater than one is generally considered to be more important. Therefore, we used it to evaluate the importance of the VIs in the PLSR model.
Random forest is an bagging-based ensemble algorithm [89]. By combining multiple weak classifiers, the final result is voted or averaged, so that the result of the overall model has higher accuracy and generalization performance. The generation rules of random forest are as follows: first, take N samples randomly from the dataset by the bootstrap way; second, use these data as the training set to train tree models; third, randomly select m feature subsets from M features, and select the best from these m features after each time the tree splits; fourth, the generated decision trees are formed into a random forest to ensure that each tree grows to the maximum extent, and there is no pruning process; finally, the mean value of the tree prediction results is used as the final prediction result. Furthermore, there are two important parameters to RF, which are the number of decision trees (ntree) and input variables per node (mtry). They are the key parameters to ensure the accuracy and complexity of the model. In this paper, to obtain the best RF model, the ntree and mtry were selected based on the RMSE with the RF algorithm. To further optimize the model, the 10-fold cross-validation was conducted and was repeated 5 times. In this study, %var explained by [89] was used for evaluating the performance of RF models. Percentage increase in mean squared error (%IncMSE) [40,90] was used as an indicator for evaluating the importance of variables in RF models. It comes from permuting out-of-bag (OOB) data, and the important variables have the higher %IncMSE after the data have been permutated. The detailed description of %IncMSE can be found in [89]. We used %IncMSE to evaluate the performance of VIs in RF model.

Evaluation of Model Accuracy
We choose the coefficients of determination (R 2 ), root mean square error (RMSE), mean absolute error (MAE), and residual predictive deviation (RPD) to evaluate the performances of the different models. Generally, the higher the R 2 and the lower the RMSE and MAE, the better the precision and accuracy of the models. The model is regarded as robust if the RPD is higher than 2 [91]. Equations (1)-(4) were used to calculate R 2 , RMSE, MAE, and PRD.
where n is the number of samples, i is the ith sample, x i and y i stand for the estimated values and measured values, SD y i stands for the standard deviation of measured values, and x and y stand for the average estimated values and measured values, respectively. In order to understand the experiment more intuitively, the experiment flowchart ( Figure 4) was made based on the experiment design, data collection, and processing.   Table 4 shows the descriptive statistics of the above-ground biomass (AGB) by each growth stage. Across all stages, the AGB varies from 3.96 to 27.08 t/ha, with an SD of 5.51 and a CV of 36%. AGB showed an increasing trend as the growth stage progressed. It  Table 4 shows the descriptive statistics of the above-ground biomass (AGB) by each growth stage. Across all stages, the AGB varies from 3.96 to 27.08 t/ha, with an SD of 5.51 and a CV of 36%. AGB showed an increasing trend as the growth stage progressed. It reached the maximum at the stage AF20 with a mean value of 20.44 t/ha.

AGB Model Based on LR
AGB was regressed on 44 VIs in each growth stage as well as the pre-and postflowering period (BF and AF), and the results showed that most VIs have good capability to predict AGB in all observed stages with positive correlations, such as NDVI ( Figure 5). Only a few VIs have a negative correlation with wheat AGB, such as SIPI (Supplementary Material S1). Decreased linear correlations between VIs and AGB for the BF (pre-flowering) period were found compared to the LR result at each individual pre-flowering stages, and there was little linear correlation between VIs and AGB throughout the AF (post-flowering) period. Moreover, we found a rapid decrease in NDVI at the stage AF5 (Figure 5b).  Comparing spectral indices between stages, the TCARI/OSAVI showed the best prediction of AGB, with an R 2 of 0.93 at the heading stage. Figure 6 shows the top 10 VIs at each growth stage and the periods of BF and AF. From the jointing stage to AF25, the VIs with the best predictive capability were TCARI/OSAVI (R 2 = 0.70), WDRVI (R 2 = 0.86), TCARI/OSAVI (R 2 = 0.93), GNDVI (R 2 = 0.81), OSAVI-REG (R 2 = 0.88), SIPI (R 2 = 0.83), GARI (R 2 = 0.75), and NDVI (R 2 = 0.72), respectively. For the BF and AF, the best VIs were CVI (R 2 = 0.67) and TCARI/OSAVI (R 2 = 0.46). In general, as the growth progressed, the prediction Comparing spectral indices between stages, the TCARI/OSAVI showed the best prediction of AGB, with an R 2 of 0.93 at the heading stage. Figure 6 shows the top 10 VIs at each growth stage and the periods of BF and AF. From the jointing stage to AF25, the VIs with the best predictive capability were TCARI/OSAVI (R 2 = 0.70), WDRVI (R 2 = 0.86), TCARI/OSAVI (R 2 = 0.93), GNDVI (R 2 = 0.81), OSAVI-REG (R 2 = 0.88), SIPI (R 2 = 0.83), GARI (R 2 = 0.75), and NDVI (R 2 = 0.72), respectively. For the BF and AF, the best VIs were CVI (R 2 = 0.67) and TCARI/OSAVI (R 2 = 0.46). In general, as the growth progressed, the prediction capability of the VIs increased and reached the maximum at the heading stage, then gradually decreased. Comparing spectral indices between stages, the TCARI/OSAVI showed the best prediction of AGB, with an R 2 of 0.93 at the heading stage. Figure 6 shows the top 10 VIs at each growth stage and the periods of BF and AF. From the jointing stage to AF25, the VIs with the best predictive capability were TCARI/OSAVI (R 2 = 0.70), WDRVI (R 2 = 0.86), TCARI/OSAVI (R 2 = 0.93), GNDVI (R 2 = 0.81), OSAVI-REG (R 2 = 0.88), SIPI (R 2 = 0.83), GARI (R 2 = 0.75), and NDVI (R 2 = 0.72), respectively. For the BF and AF, the best VIs were CVI (R 2 = 0.67) and TCARI/OSAVI (R 2 = 0.46). In general, as the growth progressed, the prediction capability of the VIs increased and reached the maximum at the heading stage, then gradually decreased.

AGB Model Based on PLSR
Three PLSR models were constructed based on the pre-flowering dataset, post-flowering dataset, and the whole dataset, respectively. Figure 7a-c shows the top VIs (VIP > 1) of the three PLSR models. Based on the VIP analysis ( Figure S1, Supplementary Material S2), the indices of significant contribution (VIP ≥ 1) were determined for the pre-and postflowering models, and the across-stage models, respectively, with nine, ten, and fifteen indices across-stage.
As shown in Figure 8, the pre-flowering PLSR model with three components performed the best in the training set (R 2 = 0.92, RMSE = 0.95 t/ha, MAE = 0.75 t/ha, and RPD = 3.67). The across-stage PLSR model with two components performed the worst in the training set (R 2 = 0.74, RMSE = 2.82 t/ha, MAE = 2.18 t/ha, and RPD = 2.00), but performed the best for the validation set (R 2 = 0.75, RMSE = 2.57 t/ha, MAE = 2.05 t/ha, and RPD = 1.89). The post-flowering PLSR model performed the worst for the validation set, with the R 2 of 0.50, the RMSE of 2.07 t/ha, the MAE of 1.69 t/ha, and RPD of 1.45. Comparing the performances of the three PLSR models in the training and the validation datasets, the pre-flowering PLSR model allowed for the best prediction of AGB.

AGB Model Based on RF
Based on our result, the ntree and mtry were set to 41 and 24 for the pre-flowering RF model, 300 and 19 for the post-flowering RF model, and 737 and 38 for the across-stage RF model ( Figure S2 in Supplementary Material S2).
The top 20 VIs for the three RF models are shown by the order of the %IncMSE of VIs in Figure 7d-f. MCARI2 with a %IncMSE of 6.20, OSAVI with a %IncMSE of 12.47, and TCARI/OSAVI with a %IncMSE of 46.98 showed the largest contributions to the pre-and post-flowering RF models, and the across-stage RF model. By conducting 10-fold crossvalidation ( Figure S3 in Supplementary Material S2), the three RF models were further optimized by setting the mtry to eight, twelve, and twelve, respectively. Four indices of significant contributions (MCARI2, LCI, CVI, TCARI) were the same in the pre-flowering and post-flowering RF models. Five indices (MCARI2, LCI, CVI, TCARI, GRVI) were the same in the pre-flowering and across-stage RF models. Ten indices (OSAVI, TCARI/OSAVI, S-CCCI, CVI, MCARI2, MTCI, SAVI, OSAVI-REG, TCARI, LCI) were the same in the postflowering and across-stage RF models. Overall, MCARI2, LCI, CVI, and TCARI were the same for all the three RF models.

AGB Model Based on PLSR
Three PLSR models were constructed based on the pre-flowering dataset, postflowering dataset, and the whole dataset, respectively. Figure 7a-c shows the top VIs (VIP > 1) of the three PLSR models. Based on the VIP analysis ( Figure S1, Supplementary Material S2), the indices of significant contribution (VIP ≥ 1) were determined for the preand post-flowering models, and the across-stage models, respectively, with nine, ten, and fifteen indices across-stage.

The Optimal Time Window for the AGB Monitoring
It can be found that the correlation between the VIs obtained from the UAV multispectral images and the AGB of winter wheat showed a trend of increasing first and then decreasing from the jointing stage to the stage AF25 in our study, except for the stage AF5 ( Figure 6). After flowering, the correlation between VI and AGB becomes lower and lower. This may be due to the fact that photosynthesis is weakened [42] and the influence of wheat ears. The abrupt decrease in NDVI (Figure 5b) observed at the stage AF5 was deviated from the whole observation period, indicating that there might be some problems in the image acquisition at that time, in mid-May. There were many poplars planted around the farmland near the experimental site, and the stage AF5 was the time when the poplars spat out. There were many poplars in the air at that time, which floated on the wheat field and affected the quality of the images. Moreover, we found that with the advancement of the growth stages, the AGB has been increasing; but most of the VIs decreased and the degree of decrease gradually increased after flowering. This could explain the gradual decrease in the correlations between VIs and AGB after the heading stage.
Among all the stages, we found that the VIs at the heading stage had the best correlations with the AGB. It has also been found that the heading stage was suitable for monitoring the AGB of rice [29]. As a concurrent period between vegetative growth and reproductive growth of winter wheat, the heading stage is an important phenological stage for wheat [92]. Previous studies have demonstrated that the heading stage was the optimal crop growth stage for remote sensing of several agronomic traits such as grain yield and nitrogen uptake [93]. Furthermore, this can be explained from the perspective of radar remote sensing, such as in the study from Ouaadi [94], which showed that the C-band Sentinel-1 time-series data backscatter coefficient and polarization ratio of wheat reached the minimum and maximum at the heading stage, suggesting the heading stage is an optimal period for monitoring AGB.
It has been well known that vegetation indices often saturate in high-density vegetation canopies [95,96]. As the crop grows, canopy vegetation coverage gradually increases until it reaches the maximum or closure in the reproductive growth stage. In the later stages of crop growth, senescence may affect the use of multi-spectral data for crop biomass monitoring [29,97]. These could explain our results that the post-flowering biomass prediction models performed worse than the pre-flowering biomass prediction models. In crop management practices, pre-flowering biomass prediction is of practical value because most of the agronomic measures affecting crop growth are implemented during the vegetative growth period of crops.

The Comparison of Sensitive Bands
In this study, we evaluated the relationship between each vegetation index and the AGB of winter wheat at each specific growth stage using the linear regression model. The Vis correlated highly with the AGB normally contained in the near-infrared and red bands for all the observation stages, although the best indices in each stage were not exactly the same. This is similar to the finding of a previous study [27], which proved the NIR region is the most effective band for AGB prediction in winter wheat. However, inconsistent results have also been reported that the VIs based on the NIR and red bands may fail to predict AGB in the middle and late stages of wheat growth [98].
According to the comparison of the best 10 VIs for different stages, we found that the best correlated VIs in the early stages were based on the red edge, green, and blue bands in addition to the NIR and red bands. In contrast, in the late stages, the best indices were found to be based exclusively on the NIR and red bands. Previous studies showed that plants have high reflectivity in the NIR band, and the light of the NIR band can penetrate deeper into the leaf and canopy than the visible bands [99,100]. Therefore, with the increase of crop biomass, the VIs based on visible bands may be less responsive to variations in biomass than the VIs based on the NIR bands.

The Performances of PLSR and RF Models for AGB Estimation
Preview studies have concluded that it was difficult to use traditional VIs to estimate crop multi-temporal variations in biomass due to the saturation of the VIs and their low sensitivity in the reproductive growth stages [42,101,102]. In this study, however, it was found to be feasible to predict the AGB of winter wheat across all stages by using multiple VIs and machine learning models. Since different VIs are calculated through different wavebands, the difference in their sensitivity to AGB may be significant in different growth stages. Meanwhile, compared to the use of VI-based linear models, the machine learning technique is suitable for tackling the multi-collinearity problem [103,104]. The RF models have achieved better estimation accuracy than the PLSR model in general, which is similar to previous studies [37,105]. Moreover, differently from the performance of the acrossstage PLSR model, the across-stage RF model showed a better ability to predict wheat biomass than that of the pre-flowering RF model. Recent studies on predicting the AGB of various crops have shown the promise of using RF methods, such as in potato crops [21], wheat [106], rice [107], and corn [3,41]. Therefore, combining VIs and machine learning methods such as RF can help to overcome the saturation problem of using individual VIs in the late growth stage of winter wheat.
Among all the VIs used in this study, there were 28 two-band indices, 13 three-band indices, and 3 four-band indices. Our result showed that 24 of 44 VIs were selected by the machining learning models for estimating the AGB of winter wheat, including 13 two-band indices, 9 three-band indices, and 2 four-band indices. The three-band vegetation indices are expected to be less prone to the saturation problems than the two-band vegetation indices [58,108]. Actually, previous research has revealed the important role of the threeband vegetation index in crop monitoring due to its large amount of information and relatively simple structure [58,108]. Many studies on crop nitrogen monitoring used threeband or optimal three-band vegetation indices and obtained great results [109,110]. In this study, nearly all the three-band vegetation indices were selected by our machine learning models, and nearly half or more of the vegetation indices used in each model were threeband vegetation indices. This suggests that three-band indices deserve more attention in crop biomass forecasting.
LCI, CVI, and TCARI were screened for their importance in biomass estimation in both the pre-flowering PLSR model and pre-flowering RF model. These VIs have been reported in previous studies [7,111,112], suggesting that these VIs may be more stable to be used in PLSR and RF models for the forecast of winter wheat biomass before flowering. Our result showed that five important VIs including the TCARI/OSAVI, OSAVI, OSAVI-REG, MTCI, and SAVI-GREEN were always used in the post-flowering machine learning models. Interestingly, most of these indices were related to soil-line vegetation indices, which were developed to minimize soil background influence [113]. Similarly, TCARI/OSAVI also contributed a lot to AGB forecasting in both across-stage machine learning models. Zheng [29] also found that OSAVI exhibited the best relationship with AGB for the whole season and post-heading stages. Meanwhile, it was found that ten important VIs were always selected both in the across-stage PLSR and RF across-stage models. Nevertheless, these indices should be further evaluated in future research, such as whether they should be prioritized to be used in machine learning models for AGB prediction.

The Limitations of the Study and Suggestions for Future AGB Estimation
This study was conducted in an experimental field condition and only had a small sample size of 120, spanning eight growth stages. It has been demonstrated that the accuracy of biomass estimation highly depends on the prediction method and data type, and less on the sample size of its data [114]. Statistically, obtaining more samples in a larger region will improve the generalizability of the model. Based on the results of the LR models, we found that most of the VIs related to the AGB of winter wheat were based on the NIR and red bands. However, our results indicated that the performance of the VIs for the AGB prediction differs in growth stages due to the fact that the top VIs obtained in different stages were different. This confirms the influence of crop growth stages on the sensitivity and performance of VIs for estimating crop biophysical parameters [115][116][117][118]. Namely, it is challenging to determine a unique VI that is suitable for the prediction of the same crop biophysical parameters across different crop growth stages. It has been recommended to use multiple VIs to best capture agricultural crop characteristics due to the variety of VIs at different growth stages [116]. Similarly, in this study, using multiple VIs in PLSR and RF models might be adequate to capture the variations in biomass in different stages. Therefore, determining an optimal combination of VIs is promising to predict the AGB of crops across stages. In future work, we will further investigate the combination of vegetation indices to obtain a generic prediction model for AGB monitoring using multi-spectral UAVs and more ground samples over a larger area.

Conclusions
In this study, the multi-temporal measured AGB of winter wheat obtained by field sampling was associated with the corresponding images obtained by a consumer-grade drone carrying five-band sensors. Linear regression models were built based on individual VIs from each specific growth stage to select the best predicting stage for the AGB of winter wheat. PLSR and RF models based on the pre-flowering dataset, post-flowering dataset, and a full dataset were constructed to assess the feasibility of using multiple VIs to estimate the AGB of winter wheat during multiple growth stages and to explore the optimal period for biomass forecasting of winter wheat. Firstly, results indicate that the NIR and the red bands are important bands for winter wheat AGB monitoring. Secondly, the optimal time window for using individual vegetation indices to predict winter wheat AGB is before wheat flowering. Lastly, compared with the instability of wheat AGB monitoring for different growth stages based on linear regression, this study demonstrates that it is feasible to use multi-VI-based PLSR and RF models to estimate AGB for multiple growth stages, including both the vegetative growth and reproductive growth stages of winter wheat. Our further work will further explore the optimal combination of vegetation indices to obtain a generic prediction model for AGB monitoring using multi-spectral UAVs and more ground samples over a larger area.