Estimating Yield-Related Traits Using UAV-Derived Multispectral Images to Improve Rice Grain Yield Prediction

: Rice grain yield prediction with UAV-driven multispectral images are re-emerging inter-ests in precision agriculture, and an optimal sensing time is an important factor. The aims of this study were to (1) predict rice grain yield by using the estimated aboveground biomass (AGB) and leaf area index (LAI) from vegetation indices (VIs) and (2) determine the optimal sensing time in estimating AGB and LAI using VIs for grain yield prediction. An experimental trial was conducted in 2020 and 2021, involving two fertility conditions and five japonica rice cultivars (Aichinokaori, Asahi, Hatsushimo, Nakate Shinsenbon, and Nikomaru). Multi-temporal VIs were used to estimate AGB and LAI throughout the growth period with the extreme gradient boosting model and Gompertz model. The optimum time windows for predicting yield for each cultivar were determined using a single-day linear regression model. The results show that AGB and LAI could be estimated from VIs (R 2 : 0.56–0.83 and 0.57–0.73), and the optimum time window for UAV flights differed between cultivars, ranging from 4 to 31 days between the tillering stage and the initial heading stage. These findings help researchers to save resources and time for numerous UAV flights to predict rice grain yield.


Introduction
Precision agriculture involves real-time crop monitoring to aid in necessary field management intervention to ensure a good harvest. This is made possible by the innovation of using unmanned aerial vehicles (UAVs) in the field. The images taken from the UAV serve as an alternative parameter to actual plant traits such as plant height [1][2][3], leaf area index (LAI) [4][5][6], and aboveground biomass (AGB) [7][8][9]. In recent years, this type of remote sensing has also been applied in creating prediction models using machine learning algorithms [10][11][12].
Rice grain yield prediction presents several benefits. With the market demand, the basis for the import and export inventories in a district is the grain yield estimates [13]. Yield forecast, to some extent, can be a strategy for crop insurance providers in evaluating if a farm is entitled to an indemnity due to low production. Local agricultural officers can use grain yield prediction to determine if the current crop management program in the area is bridging the yield gap while considering normal year-to-year variation. Inter-annual variations in rice grain yield can be attributed to year-to-year climate variability such as anomalous weather. The increase in nighttime temperature and longer hot days are expected to affect rice productivity. The presence of cloud cover, reduced solar radiation and change in fertility application can influence grain yield variation between years [14,15].
These situations require reliable results yet with a short-term process. The use of UAVs presents an advantage, as it provides higher spatial-temporal resolution than the use of satellites, allowing for more precise crop monitoring than the situations mentioned above.
However, rice grain yield prediction models using UAVs are not without difficulty. The need to acquire data during certain rice growth stages, which are important for yield determination, and differences in observed phenotype trends due to differences in plant architecture among rice cultivars are inevitable problems in yield prediction by remote sensing.
Previous research has shown that the use of vegetation indices (VIs) to predict grain yield resulted in a good coefficient of determination (R 2 ) when yield-VI models were incorporated with other variables such as phenological data [16,17] and other plant parameters such as days to maturity, plant height [9], and ground coverage [18,19]. Better results were also observed when different VIs [20,21] and multi-temporal VIs [22,23] were incorporated into the model. These prediction models work on the hypothesis that grain yield can be predicted using plant parameters and weather data that indicate the plant's physiological efficiency during grain filling. Additionally, increasing the predictors may improve the determination efficiency to a certain number of terms. Combining VIs with texture indices measured at the booting stage had comparable results to VIs measured at the booting and heading stages [24,25].
However, incorporating these parameters did not always result in better prediction. In particular, the inclusion of growing degree-days did not improve the model compared to using the NDVI alone [26]. Rice prediction models use a generalized optimal sensing time and multi-temporal Vis to predict grain yield, and their performances depend on the accuracy of monitoring certain rice growth stages that are known to be important in grain yield determination such as panicle initiation and flowering stages. Sensing beyond panicle differentiation resulted in a lower rice grain yield prediction than during the interval between panicle initiation and differentiation [22,26].
Other research has also shown that the prediction method is more important than the sample size, with the random forest model being the best method in most cases. The sample size can influence the performance of cross-validated prediction models, and a small sample size is used in a high number of k-fold cross-validations, where a high R 2 in the test dataset can be observed, resulting in a biased result [9,17,27].
The target cultivar used in the prediction is also crucial. The importance of VIs is often evaluated on a case-to-case basis using correlation analyses and variable importance plots, meaning that a particular VI may show a linear relationship in one study but not in another study. Using VIs that can represent the morphological differences among cultivars can improve the variability aspect of the prediction model [28]. Grain yield prediction models for different cultivars have attempted to avoid high variance in prediction model research [29].
The results of the above experiments indicate that the current UAV-based grain yield prediction model consists of multi-temporal VIs, other image-derived variables, and the selection of the best prediction model. Little attention has been paid to estimated plant parameters using UAVs and these estimates to predict rice grain yield.
Our study hypothesized that reasonable estimates of individual rice plant AGB and LAI throughout the growth period from UAV-derived data could predict grain yield without incorporating the actual plant parameters measured manually. The extreme gradient boosting algorithm (XGBoost), a robust ensemble machine learning algorithm, was chosen to estimate AGB and LAI. Previous research has shown that XGBoost outperformed other machine learning algorithms in the yield forecasting of several crops [30,31]. It can also determine trends in crop-yield-related parameters such as rainfall and temperature [32]. XGBoost uses a gradient boosting method wherein the model learns from the weak learners and corrects the mistake in the prediction while regularizing the model due to complexity. Compared to the simple random forest and ridge regression, XGBoost has both strengths of regularization (ridge) and decision trees (random forest). The grain yield prediction using the estimated AGB and LAI depends on the estimation accuracy of AGB and LAI first. Using predicted values to predict another variable poses an inherent prediction error but can be mitigated using a model that can bode well.
Once the AGB and LAI are estimated using VIs, the AGB and LAI dynamics throughout the growth duration can be calculated using the developed estimation models and the Gompertz growth curve. A sigmoid curve such as the Gomperz growth curve is often used to describe the plant weight and LAI as a function of time [33,34]. The Gompertz curve has also been applied in determining a relationship between plant dry matter and vegetation cover derived from computer vision techniques [35].
When the dynamics have been estimated, the optimum time for UAV flights can be determined. This is to optimize the UAV data acquisition by avoiding unnecessary and redundant UAV flights for AGB and LAI estimation that can be used for grain yield prediction. The proposed prediction method was developed for each of the cultivars tested in the study. This study attempted to build a two-step grain yield prediction model by estimating the in-season AGB and LAI using correlated VIs and then utilizing these estimated parameters to predict grain yield. This study also aimed to determine the optimum time window for UAV flights.

Materials and Methods
The workflow of the study consisted of the following: (1) data collection and processing of AGB, LAI, grain yield, and UAV spectral data for the years 2020 and 2021, (2) establishment of baseline grain yield prediction, and (3) estimation models for AGB and LAI to be used as predictors for the GY prediction model. The details are described in Figure 1.

Figure 2.
Location of the study site in Aichi, Japan. Orthomosaic image of the two experimental trials: trial with no fertilizer (in blue box) and trial with fertilizer (red box). On the right, from the top to bottom, are the plot images clipped from the orthomosaic image of rice cultivars, namely, Aichinokaori, Hatsushimo, Asahi, Nakate Shinsenbon, and Nikomaru.
Aichi Prefecture belongs to a humid subtropical climate zone. During the growing season, the average monthly total sunshine duration is 165.38 h, the average monthly total precipitation is 223.35, and the average daily temperature is 25.7 °C [36]. The primary soil type is ultisols.

Experimental Design
Two experimental trials, namely, a trial with no fertilizer and a trial with fertilizer, were conducted in 2020 and 2021, involving five japonica cultivars with different heading times and plant architectures, Aichinokaori, Asahi, Hatsushimo, Nakate Shinsenbon, and Nikomaru ( Figure 2). These cultivars were divided into two groups at the study site: the early-heading group (Aichinokaori, Nakate Shinsenbon, and Nikomaru) and the lateheading group (Asahi and Hatsushimo). Nakate Shinsenbon has a large number of smallsized panicles with a short plant stature, Nikomaru has a small number of large-sized panicles with a tall plant stature, and Aichinokaori, Asahi, and Hatsushimo have an intermediate stature. Basal fertilization was carried out according to local practices for the fertilizer trials, and top-dressing was not performed for both trials. The amount of fertilizer applied in the fertilizer plot was 48 kg N, 34 kg P, and 44 kg K per hectare. The plot size was 1.68 m × 5.52 m with a 24 cm plant spacing. The experiments were laid out in a randomized complete block design with three replications.

Field Data Collection
AGB was sampled for each cultivar at different growth stages: (1) tillering stage, (2) stem elongation, (3) booting stage and (4) heading stage. The tillering stage is referred to as the late tillering stage (V8), the stem elongation stage is when the panicle branches have formed (R1), the booting stage is the flag leaf collar formation (R2), and the heading stage is when one or more florets of the main stem panicle achieve anthesis (R4) [37].
Sampling was conducted after the UAV flight for spectral data collection (Table 1). Four adjacent hills of rice plants were sampled in each plot. The number of tillers per sampled hill was counted and separated into leaves and stems. The detached leaves were used to determine the LAI using an area meter, AAM-9 (Hayashi Denko Co., Ltd., Tokyo, Japan). After measuring the leaf area, the detached plant parts were oven-dried at 70 °C for 48 h until a constant weight was attained and weighed. LAI determination was carried out only until the booting stage. Grain yield was measured at physiological maturity. Eight hills per plot were harvested. The grains were cleaned and weighed, and the grain yield was converted to ton/ha.

UAV Image Acquisition and Preprocessing
This study utilized a UAV system: Matrice 210 RTK v2 with a Zenmuse X7 50 mm camera (DJI, Shenzhen, Guangdong, China) and Micasense RedEdge-MX sensors on board. The multispectral bands used in this study were the following: blue (475 ± 32 nm), green (560 ± 27 nm), red (668 ± 14 nm), red edge (717 ± 12 nm), and near-IR (842 ± 57 nm). The flight altitude was 20.0 m with a forward overlap of 80% and a side overlap of 75%. The shooting mode was hovering with a flight speed of 1.2 m/s. The approximate resolution was 0.20 cm/pix.
For each UAV flight, the sensors were calibrated using a calibrated reflectance panel (CRP) provided by the sensor manufacturer according to its instructions. UAV flights were conducted between 10:00 am and 4:00 pm under windless and clear-sky conditions once per week until the heading stage. The number of UAV flights used in this study was 11 in 2020 and 17 in 2021.
The acquired images were orthomosaicked using the Pix4D mapper software (Pix4D SA, Prilly, Switzerland) to generate the spectral reflectance images of the experimental plots and georeferenced using ground control points.
An orthomosaicked true (RGB) image was used to create a region of interest (ROI). A shapefile consisting of circular polygons with a 10 cm radius corresponding to each hill of rice plants served as the ROI.
The reflectance images were then rasterized using the 'raster' [38] package in R. The orthomosaicked reflectance images were clipped into smaller raster images covering the experimental plot and cut for individual hills using the ROI shapefile. The mean values of reflectance from the clipped hill images were calculated. A total of 24 vegetation indices (Table A1) were calculated using different combinations of the mean values of the five spectral reflectance images.

Data Analysis
Data from rice growing seasons (2020 and 2021) were pulled together to comprise the whole dataset for the study-the study was composed of three prediction models for each cultivar. To establish a baseline for predicting grain yield by AGB and LAI, a linear regression model was conducted using repeated k-fold cross-validation (k = 3) provided by the 'caret' package in R [39]. A regression model for each growth stage was conducted to determine which growth stage can best predict grain yield. The growth stage corresponding to the model with the lowest RMSE in the model evaluation using an independent test dataset was considered as the optimum growth stage for predicting grain yield using AGB and LAI.
A correlation analysis was performed between AGB, LAI, and VIs using the 'corrplot' [40] package in R, and VIs were chosen as the feature variable for further XGBoost modeling. Since the XGBoost algorithm cannot handle high-dimensional data, the top ten correlated VIs were used to estimate AGB and LAI.
The 'xgboost' package in R [41] was used to train the XGBoost model to estimate AGB and LAI. A hold-out method was used with 70% as the training dataset and 30% as the independent test dataset. With the training dataset, the model was trained by a repeated k-fold cross-validation method (k = 3, 10 times). The optimum estimation model was determined using the lowest RMSE as the evaluation metric criterion.
The XGBoost algorithm is a boosting ensemble machine learning algorithm combining boosted trees' features and the gradient descent method with regularization functions. In the tree-based learning method, a prediction is made by creating a regression tree based on a decision tree score. Therefore, correlated features do not necessarily influence the performance of XGBoost. Then, the weak learners in the regression tree are given more weight in the next iteration. This occurs by incorporating the prediction of the residuals or errors of the prior regression tree into the new regression tree. Then, a regularization parameter is included in the new regression tree to account for the increase in the complexity of the model. The gradient descent method minimizes the cost function, which measures the error between the predicted and observed values.
To further explain the prediction process, feature importance was summarized using SHAP (SHapley Additive exPlanations) from the 'SHAPforxgboost' package in R [42]. The independent test dataset was evaluated using the normalized RMSE.
To achieve an optimized prediction model using XGBoost, there are several tuning parameters that can be explored as described by [41,43]. This study attempted to optimize the following parameters: (1) max depth, which is the depth of the decision tree, which means a tree must have features equal to the number of depths to create a prediction; (2) colsample by tree, which is the number of features supplied to a tree.
The estimation models for AGB and LAI were then utilized to predict the AGB and LAI across the growth period using the VIs calculated from other UAV flights not used in the training data for the estimation models. A predicted value corresponding to the extracted ROI of one hill was eliminated when it was less than 85% of the preceding predicted value and the records with less than 3 predicted values were removed from the next analyses. The remaining predicted AGB and LAI for each hill were fitted into the Gompertz growth curve using the 'drc' package in R [44] to interpolate AGB and LAI values at 1 to 100 DAT using 0.1 as the starting value (DAT 0) for AGB and LAI. Then, to determine which days were the optimum time window for conducting a UAV flight, a single-day linear regression model between grain yield and the predicted AGB and LAI was conducted using the 'caret' package in R [34] by a repeated k-fold cross-validation method (k = 3, 10 times). In total, 960 records of AGB and grain yield and 720 records of LAI were used in our study, but only 48 were used for each stage per cultivar. Training single-day linear regression models with a limited number of data can lead to overfitting and inaccurate results. To expand the yield data quantitatively and to eliminate the chance success produced by bias in data partitioning, we conducted a trial of 10 repeats of 3 CVs. The Akaike information criteria (AICc) score was computed for each regression model using the 'AICcmodavg' package in R [45]. The AIC weights were then calculated for each of the 100 regression models based on the regression model with the lowest AIC score. The DAT values corresponding to the regression models with an AIC weight of fewer than 1.0 units were considered the optimum time window for the UAV flights. The singleday regression models that fall in this criterion are considered the models that can best estimate the target variable, grain yield. A smoothing method, namely, LOESS (locally weighted scatterplot smoothing), in the 'ggplot2' [46] package in R was used to determine the trend in the single-day regression series. In this smoothing method, the R 2 for a singleday regression was recalculated based on the R 2 values of the nearby R 2 data points. The nearby data points were set at 50% of the total R 2 data points.

Relationship of AGB and LAI with Grain Yield
Five rice cultivars with different architectures and heading timings were used in this study and grown at different fertilizer levels to ensure diversity in the data. The AGB, LAI, and grain yields differed between the five rice cultivars on a hill basis (Figures 3 and  4), suggesting that the grain yield prediction model for each cultivar is reasonable. In 2020, there were no significant varietal differences in AGB as observed from the high AGB variability within a cultivar while in 2021, and varietal differences in AGB were observed with less AGB variability within a cultivar. Conversely, the AGB between cultivars in the fertilizer trial showed similar AGB variability patterns in both years. AGB was also higher in the fertilizer trial compared to the trial with no fertilizer, regardless of years. 2020 2021  The range of LAI values per growth stage was similar among cultivars in the fertilizer trial except with Aichinokaori in 2020 and Nakate Shinsenbon in 2021 (Figure 3e,f). Higher LAI values were observed among cultivars in 2020 than in 2021. LAI measured from the fertilizer trial was also higher than in the trial with no fertilizer. Higher LAI variability within a cultivar was also observed in the fertilizer trial, especially at later growth stages (Figure 3g,h).
Grain yields across cultivars showed slight variation in the trial with no fertilizer compared to grain yields in the fertilizer trial. In 2020, Aichinokaori had a lower grain yield than Hatsushimo and Nikomaru, while in 2021, Hatsushimo had a higher grain yield than Nakate Shinsenbon. Aichinokaori had the lowest grain yield across trials and years (Figure 4a-d).
Simple linear regression modeling between grain yield and the actual AGB and LAI was conducted to determine the varietal difference and to establish a baseline for predicting grain yield by AGB and LAI. Grain yield could be predicted using the actual AGB and LAI by accounting for the 19.0% to 53.0% variation in grain yield ( Figure 5). In this study, grain yield could be optimally predicted from the actual AGB and LAI at the stem elongation stage for all the cultivars used except for Nikomaru in this study. From these results, it may be possible to develop a model to estimate yield using LAI and AGB at a particular timing. and (e) Nikomaru, at the optimum growth stage. The optimum growth stage was determined by selecting the growth stage corresponding to the simple linear model that resulted in the lowest RMSE in the model validation. Cultivars Aichinokaori, Asahi, Hatsushimo, and Nakate Shinsenbon were optimally predicted at the stem elongation stage, while Nikomaru was optimally predicted at the tillering stage.

AGB and LAI Estimation
Based on the results of the yield estimation model using the actual LAI and AGB, it was thought that LAI and AGB could be used for yield estimation. For the estimation of LAI and AGB using UAVs, a model was developed to estimate AGB and LAI from VIs obtained from the flights.
In this study, 24 VIs were calculated from reflectance images, but since there were too many to include all of them in the XGBoost model to estimate LAI and AGB, VIs were selected to be used as explanatory variables. To select the VIs for AGB and LAI estimation modeling, a correlation analysis was performed between AGB, LAI, and VIs (Tables A2  and A3). The AGB of Nakate Shinsenbon had the highest range of the top ten (10) correlated VIs (R 2 : 0.772-0.806), followed by the AGB of cultivar Nikomaru (R 2 : 0.679-0.752). Medium levels of correlation were found between the VIs and AGB of cultivars Asahi (R 2 : 0.538-0.698) and Aichinokaori (R 2 : 0.528-0.621). The AGB of cultivar Hatsushimo showed the lowest correlation (R 2 : 0.447-0.631) ( Table A2).
Using the top ten (10) correlated VIs indicated in Tables A2 and A3, an AGB and LAI estimation model was developed using XGBoost. The different hyperparameters of the XGBoost model were tuned to obtain the optimum estimation for each cultivar (Tables A4  and A5). For some select cultivars, the AGB and LAI estimation models tuned using the lowest RMSE selection method showed decreased prediction performance when evaluated using the test data ( Table 2, Figures 6 and 7). The AGB estimation model for Hatsushimo and the LAI estimation models for Asahi, Hatsushimo, and Nakate Shinsenbon showed a lower R 2 when estimating the target variables in the test dataset. A higher number of feature variables per tree and depth of trees may result in an estimation model that finds patterns particular to the sample used for building the prediction, resulting in overfitting (Tables A4 and A5).  In contrast, LAI estimation models that used 50% of the feature variables and regression tree depths of five and three did not show overfitting AGB estimation models for Aichinokaori, Asahi, Nakate Shinsenbon, and Nikomaru, and LAI estimation models for Aichinokaori and Nikomaru.
The XGBoost model has a variable importance computation that summarizes the feature variables' level of influence on the training process to predict AGB and LAI based on the parameter gain of a feature variable in the prediction process. The parameter gain measures the relative contribution of a feature variable in a tree created in the model. A higher gain value of a feature close to other features indicates that the former feature is more critical in the prediction. The relative importance of the correlated VIs in the AGB estimation models for each cultivar was calculated (Figure 8). All of the feature variables used in the training of the XGBoost model were significantly correlated with AGB, with varying correlations (Table A2). Comparing the levels of correlation of VIs to the levels of importance in the XGBoost model showed a difference in trend. There were VIs with a medium correlation with AGB, but they showed the highest level of importance in the XGBoost model. In particular, the variable importance of the AGB estimation model for Aichinokaori showed that EVI2.2 was the essential feature in the prediction (Figure 8a). Likewise, the variable importance of the AGB estimation model for Nakate Shinsenbon showed that ATSAVI was the most crucial feature in the prediction model (Figure 8d) but was the least correlated VI with AGB among the ten VIs (Table A2) used in the training process. However, this difference in feature ranking was compensated by making the most correlated VIs the second most important feature in XGBoost, as seen in the variable importance of the AGB estimation models for Asahi, Hatsushimo, and Nikomaru. Nevertheless, the variable importance of the AGB estimation models for other cultivars such as Asahi, Hatsushimo, and Nikomaru showed that BWDRVI was the essential feature, which was also the VI most correlated with AGB (Table A2). The SHAP values of the AGB estimation models show that at least four feature variables per estimation model did not influence the prediction.
The variable importance in the LAI estimation model for each cultivar is shown in Figure 9a-e. The topmost vital features in the training process of the LAI estimation models were the VIs in the middle in the order of ascending correlation with LAI. The least correlated VIs also ranked the lowest in variable importance, such as GRNDVI in the estimation model for Hatsushimo and NDVI in the estimation model for Nikomaru. Similar to the AGB estimation models, the most correlated VIs were the second most important feature in the variable importance for the LAI estimation model.

VI-Estimated AGB and LAI Fitted to the Gompertz Growth Curve
Aerial photography by UAV flights was conducted once per week until heading, and by using the reflectance data and the estimation model described above, it is possible to estimate changes in AGB and LAI for each rice cultivar over the growing season. The estimated AGB and LAI of each flight day were fitted to the Gompertz growth curve (Figures 10 and 11). Although the pattern of growth varied by year and fertilizer conditions, the relative growth trends for each cultivar were common. In our study, Hatsushimo was the fastest growing cultivar, with relatively high AGB and LAI at 85 DAT. At the same time, Nikomaru was the slowest growing cultivar, with a low AGB and LAI at the early growth stage. To evaluate the accuracy of the estimation and curve fitting, the estimated AGB and LAI values for each cultivar, year, and trial were compared to the actual values of AGB at the heading stage and LAI at the booting stage. Based on this comparison, the highest accuracy for AGB estimation was observed in Nikomaru, while the lowest accuracy was observed in Asahi. The estimation for LAI across years and trials showed that the highest accuracy was observed in Asahi, and the lowest accuracy was observed in Hatsushimo (data not shown).

Grain Yield Prediction Using VI-Estimated AGB and LAI
To identify the best time of year for UAV yield estimation, the linear regression between grain yield and VI-estimated AGB and LAI per day which fitted into the Gompertz growth curve was established for each cultivar on daily basis. A repeated k-fold crossvalidation was performed to determine the regression function of the VI-estimated AGBand LAI per day for grain yield (Figures 12 and A1a-e). In the case of the linear regression with no parameter tuning, the mean R 2 values were also the R 2 of the final model for each single-day regression. Using the VI-estimated AGB and LAI, the RMSE of grain yield prediction across cultivars varied throughout the growth period, and the lowest RMSE values were observed at same DAT of the highest R 2 peak (Figures 12 and A1a-e). However, the strength of the grain yield prediction differed between cultivars. The VI-estimated AGB and LAI could explain only 10% and 30% of the grain yield variation for cultivars Aichinokaori and Hatsushimo and 40-50% for Asahi, Nakate Shinsenbon, and Nikomaru. The optimum time window for UAV flights was 42-87 DAT without referring to the actual growth stage dates observed in the study. Correspondingly, the RMSE values in these periods (25,35, and 85 DAT) were also relatively higher, but the RMSE values did not fluctuate strongly compared to the R 2 series, particularly in cultivars Aichinokaori and Asahi ( Figure A1a,b). In contrast, Hatsushimo and Nikomaru had a wide range of RMSE values (0.74-1.66 ton/ha and 0.46-1.16 ton/ha, respectively) which makes the determination of the optimum time window necessary for accurate grain yield prediction.
Using the AIC, the optimum single-day regression model can be determined for each cultivar by calculating the AIC weights. The DAT values whose AIC weights are less than 1.0 units are considered the optimum time window for the UAV flight ( Figure 12). Concerning the growth stage dates in the 2020 and 2021 rice seasons, the training process showed that the cultivars had different optimum time windows. For Aichinokaori, the optimum time window was 57-87 DAT, corresponding to the stem elongation to heading stages. The optimum time window was observed at 72-83 DAT for Asahi, coinciding with the stem elongation to booting stages. For Hatsushimo, the optimum time window was 50-56 DAT, corresponding to the stem elongation stage. Nakate Shinsenbon's optimum time window was 42-45 DAT, coinciding with the stem elongation stage. For Nikomaru, the optimum time window was 53-58 DAT, corresponding to the stem elongation stage.
The independent test dataset evaluated the prediction models corresponding to the optimum time windows (Figure 13a-e). The results show that prediction models corresponding to the optimum time window for cultivars Asahi and Nikomaru did not generalize well on the unseen dataset in R 2 . The grain yield prediction model for Nakate Shinsenbon performed better than the previous two cultivars (RMSE = 0.52 ton/ha). Conversely, a prediction model for cultivar Hatsushimo showed a better performance compared to the other cultivars. The prediction model for Aichinokaori at a later optimum time window showed a good prediction performance (RMSE = 0.61 ton/ha). Asahi was found to have a higher RMSE in the independent test dataset, and the prediction error (RMSE = 1.04 ton/ha) was two times higher than the prediction error of the other cultivars.
The difference between the result of the training dataset and the independent test dataset in the single-day regression model was attributed to the following factors: (1) the repeated k-fold cross validation method that we employed; (2) the prediction errors that were carried over by the AGB/LAI estimation models and the Gompertz growth curve to the single-day regression model. The large reduction of R 2 from the training dataset (optimum time graph) to the independent test dataset might be due to the low sampling size of the independent test dataset

Discussion
Based on the research results, we developed a model to estimate AGB and LAI based on the reflectance obtained from UAV flights (Figures 6 and 7) and to provide the optimal time window to fly UAVs for yield prediction using the AGB and LAI for five rice cultivars based on the AIC weight of the single-day regression models (Figures 12 and A1).
Although the relationship between the in-season AGB and grain yield is continuously studied and can be generalized, a higher grain yield has been observed in rice cultivars that produce AGB before the heading stage [47], while the opposite is true for other cultivars [48]. With this, the selection of cultivars and the timing of UAV flights are critical concerns in grain yield prediction. Thus, a grain yield prediction model for each tested cultivar was developed to determine if a similar optimum time window relative to the growth stage for UAV flights can be obtained for different rice cultivars (Figures 5 and  12). In this study, UAVs were used to estimate AGB and LAI, and the estimated values were used to search for suitable flight times for yield estimation.
To estimate the appropriate time for yield estimation using UAVs, we conducted yield estimation on a hill basis. Because the spatial arrangement of the leaves is different from that of the population as a plot, it is assumed that the reflectance obtained from the plant will be different from that obtained from the community. However, it is quite possible that the reflectance obtained from individual plants will show a similar trend to that of a community composed of plants of comparable growth.
In a previous study [23], the grain yield prediction model determined the correlated multi-temporal VIs calculated from the UAV-derived multispectral images and employed a multi-linear regression model to predict grain yield. The same research also observed that VIs correlated with LAI were good predictors of grain yield. However, an LAI estimation model from these correlated VIs was not developed. The results from that study showed that the optimum times for UAV flights were as follows: combined UAV flights at the booting and heading stages and combined UAV flights at the jointing and booting stages. Another study [22] used the field-estimated AGB and leaf chlorophyll content (LCC) to predict grain yield. The results showed that destructive AGB sampling can predict grain yield, with the R 2 ranging from 0.69 to 0.75. The same paper used VIs and UAVderived canopy structural information such as canopy height to predict the rice grain yield.
The previous research above showed that actual plant parameters, such as AGB and LAI, and the VIs derived from UAVs are feasible for grain yield prediction. This study investigated the relationship between AGB, LAI, VIs, and grain yield, and AGB and LAI estimation models were developed (Figures 6 and 7). Then, the AGB and LAI estimates were simulated using the Gompertz growth curve to observe the AGB and LAI dynamics (Figures 10 and 11). These dynamics were then examined for their relationship with grain yield to determine the optimum time where these estimates can predict grain yield.
Considering the linear relationship of the selected VIs with the AGB and LAI of the five cultivars, several remarks about the model establishment were made (Tables A2 and  A3). Using VIs with a smaller range of correlation coefficient values at a high level of correlation with the target variable can result in a better model accuracy than using VIs with a smaller range of correlation coefficient values at a medium level of correlation with the target variable. This was shown by comparing the AGB estimation model for Nakate Shinsenbon with that of Aichinokaori and Asahi, and the LAI estimation for Nakate Shinsenbon with that of Aichinokaori. VIs with a broader range of correlation values at medium to low levels of correlation with the target variable can result in a similar or better model accuracy than VIs with a smaller range of correlation coefficient values. The LAI estimation model for Hatsushimo was better than the LAI estimation model for Nikomaru.
Before the XGBoost model training for AGB and LAI estimation, feature selection was conducted using correlation analysis to determine feature variables-VIs-that linearly correlated with AGB and LAI (Tables A2 and A3). Even though some cultivars had similar correlated VIs, no two had the same set of correlated VIs. Among the VIs used in the AGB estimation models of the different cultivars, BWDRVI was the most common highly correlated VI. In past research, BWDRVI has provided good contrast between soybean soil and plant leaves at the maturity stage [49]. Other correlated VIs observed in this study confirm the findings of previous research in rice AGB estimation, where modified normalized vegetation indices showed a high correlation with AGB [50]. Conversely, the LAI of the five rice cultivars also showed a high correlation with the modified normalized difference vegetation indices. A similar research result has been observed between the LAI of rice and VIs such as GBNDVI and GNDVI [51]. Similar to the previous research, this study also shows that GBNDVI and GNDVI were more correlated with LAI than NDVI. The correlation analysis to select VIs for AGB and LAI estimations seemed reliable and reasonable.
The AGB and LAI can be estimated with good model accuracy using XGBoost ( Table  2). Previous research on the estimation of aboveground forest biomass showed that the random forest algorithm (c = 0.71) works better than the stochastic gradient boosting (R 2 = 0.60) [52]. In this study, the model evaluation using test datasets of the XGBoost models for AGB estimation resulted in an R 2 ranging from 0.565 to 0.827. For the LAI estimation models, this resulted in an R 2 ranging from 0.574 to 0.727. Comparing the model performances between the AGB estimation and LAI estimation in this study shows that AGB estimation models had better accuracy than LAI estimation models. Using the UAV-derived VIs, AGB estimation models showed less overfitting than LAI estimation models ( Table 2).
XGBoost and stochastic gradient boosting utilize the gradient descent method to find the minimum cost function for predicting the target. However, the XGBoost algorithm has regularization as one of its booster parameters, which prevents the overfitting of prediction models (Tables A4 and A5). The AGB estimation model for Hatsushimo used 90% of the feature variables to build regression trees, as represented by the booster parameter colsample_bytree. Conversely, LAI estimation models for Asahi, Hatsushimo, and Nakate Shinsenbon had higher depths of regression trees at 10. A higher number of feature variables per tree and depth of trees may result in an estimation model that finds patterns particular to the sample used for building the prediction, resulting in overfitting. Much earlier research conducted in remote sensing for AGB and LAI estimation demonstrated that AGB estimation is more suited to remote sensing than LAI estimation [53]. Nevertheless, comparing the NRMSE results of the test dataset showed that all the AGB and LAI estimation models of the five cultivars had good levels of model accuracy (Table 2). Moreover, the variable importance shown through the SHAP plots indicates that boosting the ensemble algorithm can be an excellent strategy to predict AGB and LAI when variation in correlations between the VIs and target exists in the training process (Figures 8 and 9).
After the AGB and LAI estimation, each cultivar's grain yield prediction model was developed using the VI-estimated AGB and LAI fitted into the Gompertz growth curve (Figures 10 and 11). Comparing the accurate estimations of the growth curve fitted, AGB and LAI estimates from the NRMSE of the AGB and LAI estimation models showed a difference in varietal performance (data not shown). Cultivars Aichinokaori and Nikomaru had higher accuracy in the test dataset than in the datasets from the growth curve fitting. In contrast, cultivars Asahi and Nakate Shinsenbon had higher accuracy in the dataset from the growth curve fitting than in the test dataset. The cultivar Hatsushimo showed the lowest accuracy for both datasets.
Using the single-day VI-estimated AGB and LAI simulated from the growth curve, a single-day regression was conducted to predict grain yield. The optimum time window for UAV flights was determined, indicating good model accuracy based on the AIC weights (Figures 12 and A1). Cultivars such as Asahi, Hatsushimo, and Nakate Shinsenbon had optimum time windows at the stem elongation stage. Aichinokaori had an optimum time window at the initial heading stage, while Nikomaru had an optimum time window at the tillering stage. A similar optimum growth stage was observed in the grain yield prediction model using the actual AGB and LAI (Figure 5a-e), except for Aichinokaori, which could be predicted at the stem elongation stage using the actual AGB and LAI, or at the later growth stage using the VI-estimated AGB and LAI.
By evaluating the performances of single-day regression models using the VI-estimated AGB and LAI (Figure 13a-e) and the grain yield prediction using the actual AGB and LAI (Figure 5a-e) on an independent test dataset, the following results were observed: (1) the grain yields of cultivars Hatsushimo and Nakate Shinsenbon were better predicted using the VI-estimated AGB and LAI; (2) the grain yield prediction of cultivars Aichinokaori and Nikomaru w better using the actual AGB and LAI, and (3) the grain yield prediction of cultivar Asahi could not be predicted well by either VI-estimated AGB and LAI or the actual AGB and LAI. These results show that using AGB and LAI was not sufficient enough to accurately predict the rice grain yield for all cultivars (e.g., Asahi).
Concerning each cultivar's AGB and LAI estimations ( Table 2), an accurate time window for UAV flights to predict grain yield could be determined. However, the percent variation in grain yield that can be explained still depends on the correlation of the VIs with AGB and LAI and the robustness of the machine learning algorithm. The optimum time windows determined in this study are consistent with previous studies for the preheading to initial heading stages [22,26], and our results suggest that the lengths of the optimal time windows for each cultivar differed from 4 to 31 days (Figures 12 and A1). In practical terms, a longer optimum time window for UAV flights is an advantage, as flight days will be flexible. A shorter time window will make UAV flights a time constraint, especially in cases where unfavorable weather conditions prevent UAV flights.
This study shows that a robust boosting ensemble machine learning algorithm-in this case, XGBoost-can be used to estimate agronomic traits, namely, AGB and LAI. Then, using the developed estimation models, AGB and LAI can be calculated using the UAV-derived reflectance data gathered during the entire growth period. The estimated AGB and LAI throughout the growth period determined the optimum time window for the UAV flights. The simulated VI-estimated AGB and LAI dynamics allowed for grain yield prediction at any point throughout the growth period. This method will enable researchers to save resources and time for numerous UAV flights to predict grain yield from UAV-derived reflectance data.

Conclusions
The aims of this study were to (1) predict rice yield by using the estimated aboveground biomass (AGB) and leaf area index (LAI) until the heading stage from vegetation indices (VIs) and (2) determine the optimal sensing time in estimating AGB and LAI using VIs for yield prediction.
The AGB and LAI estimation models from the VIs using the XGBoost algorithm resulted in an R 2 ranging from 0.565 to 0.827 and from 0.574 to 0.727, respectively, and the VI-estimated AGB and LAI predicted grain yield better than the actual AGB and VI for cultivars Hatsushimo and Nakate Shinsenbon (RMSE: 0.69 ton/ha and 0.52 ton/ha, respectively). The actual AGB and LAI predicted grain yields of Aichinokaori and Nikomaru (RMSE: 0.75 ton/ha and 0.7 ton/ha, respectively) better than the VI-estimated AGB and LAI, while the grain yield of Asahi could not be predicted well by both methods. The optimum time window for UAV flights was at the stem elongation stage for cultivars Asahi, Hatsushimo, and Nakate Shinsenbon and at the initial heading stage for Aichinokaori, ranging from 4 to 31 days.
These results indicate that the accuracy of rice grain yield estimation using the VI-s estimated AGB and LAI and the suitable period for yield estimation are cultivar dependent. Conversely, due to the small number of cultivars tested in this study, it is not clear whether this model specificity is due to differences in plant architecture and panicle traits. Further research should be conducted on multiple cultivars using these UAV-derived features to develop a generalized yield estimation model that is tolerant of differences among rice cultivars.