Comparison of Machine-Learning and CASA Models for Predicting Apple Fruit Yields from Time-Series Planet Imageries

Apple (Malus domestica Borkh. cv. “Fuji”), an important cash crop, is widely consumed around the world. Accurately predicting preharvest apple fruit yields is critical for planting policy making and agricultural management. This study attempted to explore an effective approach for predicting apple fruit yields based on time-series remote sensing data. In this study, time-series vegetation indices (VIs) were derived from Planet images and analyzed to further construct an accumulated VI (∑VIs)-based random forest (RF∑VI) model and a Carnegie–Ames–Stanford approach (CASA) model for predicting apple fruit yields. The results showed that (1) ∑NDVI was the optimal predictor to construct an RF model for apple fruit yield, and the R2, RMSE, and RPD values of the RF∑NDVI model reached 0.71, 16.40 kg/tree, and 1.83, respectively. (2) The maximum light use efficiency was determined to be 0.499 g C/MJ, and the CASASR model (R2 = 0.57, RMSE = 19.61 kg/tree, and RPD = 1.53) performed better than the CASANDVI model and the CASAAverage model (R2, RMSE, and RPD = 0.56, 24.47 kg/tree, 1.22 and 0.57, 20.82 kg/tree, 1.44, respectively). (3) This study compared the yield prediction accuracies obtained by the models using the same dataset, and the RF∑NDVI model (RPD = 1.83) showed a better performance in predicting apple fruit yields than the CASASR model (RPD = 1.53). The results obtained from this study indicated the potential of the RF∑NDVI model based on time-series Planet images to accurately predict apple fruit yields. The models could provide spatial and quantitative information of apple fruit yield, which would be valuable for agronomists to predict regional apple production to inform and develop national planting policies, agricultural management, and export strategies.


Introduction
Apple (Malus domestica Borkh.), an important cash crop, is widely consumed around the world [1]. As the leading apple-producing country, China holds a dominant position in the world apple industry [2]. By 2018, China accounted for 46% of the apple production and 42% of the apple planting area worldwide, with an annual production of approximately 39.24 million tons and a planting area of approximately 2.07 million ha [3]. Given the importance of apple fruit production to the economy of China, accurately predicting apple yields is critical for planting policy making and agricultural management. However, previous research has focused on wheat yield predictions. For apple fruit yield predictions, the parameters and calculations of the CASA model need to be optimized. Therefore, this study aimed to optimize the CASA model parameters and calculations to predict apple fruit yield.
To address these issues, this study derived and analyzed time-series VIs from Planet images and further developed an accumulated VI-based model and an improved CASA model to explore an effective approach for predicting apple fruit yields based on time-series RS data. The objectives of this study were to (1) identify the optimal VI with which to construct an apple yield prediction model based on accumulated VI values and phenological information; (2) optimize the parameters of the CASA model for apple fruit yield predictions to improve the prediction accuracy; and (3) compare the yield prediction performances of the accumulated VI-based model and the improved CASA model.

Study Region and Experimental Design
As the main apple production area in China, Shandong Province was selected as the study region. The study was conducted in six apple orchards in Guanli town, Qixia city, Shandong Province (120.62°-120.76° E, 37.14°-37.27° N) in the 2019 and 2020 apple growing seasons. As the main planting variety, "Red Fuji" apple trees (Malus domestica Borkh. cv. "Fuji") were used as the experimental material. Four apple orchards (O1-O4) were selected in 2019, and orchards O1 and O2 as well as two additional orchards (O5 and O6) were selected in 2020 for the experiments and the sampling validation (Figure 1). The first flowering of the studied apple trees was observed on April 17 in 2019 and on April 12 in 2020. In Guanli town, harvest commences in early October depending on the fruit maturity and continues until November. Field practices included postharvest pruning in winter or early spring, three fertilizer applications (nitrogen fertilizer in April, phosphate fertilizer in June, potash fertilizer in August), monthly irrigation, and weeding (Figure 2).
Remote Sens. 2021, 13, 3073

Field Measurements
In 2019 and 2020, 47 and 57 apple trees in the full fruit period were randomly selected, respectively. Among them, 19 trees were sampled in both years. The geographical coordinates of the sampled trees were collected using a Qianxun positioning SR2 satellite-based RTK receiver mobile device with a centimeter-level positioning accuracy (Qianxun Spatial Intelligence Inc., Hangzhou, China). All fruits were counted on each sampled tree to obtain the total number of fruits per tree. In the apple harvest season, 8 healthy and regularly shaped fruits were collected from each sampled tree and weighed to calculate the mean fruit weight for that tree. The apple yield (kg/tree) was calculated as the total number of fruits per tree multiplied by the mean fruit weight (kg).
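The per-tree yield calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code; the function name and the example fruit count and weights are hypothetical.

```python
def tree_yield_kg(fruit_count, sampled_fruit_weights_kg):
    """Yield per tree = total fruit count x mean weight of the sampled fruits."""
    mean_weight = sum(sampled_fruit_weights_kg) / len(sampled_fruit_weights_kg)
    return fruit_count * mean_weight


# Hypothetical tree: 200 fruits, eight sampled fruits of 0.25 kg each.
example_yield = tree_yield_kg(200, [0.25] * 8)
```

With eight sampled fruits per tree, the mean fruit weight is an estimate, so the resulting yield carries the sampling error of that mean.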

Planet Imagery Data
For this work, two years (2019 and 2020) of multispectral data from the Planet Labs constellation (www.planet.com, accessed on 15 November 2020) were used. The PlanetScope (PS) constellation has approximately 170 small satellites intended to image the Earth's land surface daily. The sensor-corrected, radiation-corrected, and orthorectified data product (PS Analytic Ortho Scene Level 3B) was used in this study. The PS imagery was captured as continuous strips in 4 bands, namely blue (455-515 nm), green (500-590 nm), red (590-670 nm), and NIR (780-860 nm), at a ground spatial resolution of 3 m. For the 2019 apple growing season, 17 PS images were acquired, while an additional 21 PS images were acquired at corresponding times during the 2020 apple growing season. The top-of-atmosphere reflectance was converted to surface reflectance using a quick atmospheric correction model in ENVI 5.3.

Meteorological Data
The meteorological data used in this study included daily mean temperature and downwards surface solar radiation data; these data were downloaded from the "Daily statistics calculated from ERA5 single levels hourly data" dataset obtained from the European Centre for Medium-Range Weather Forecasts (ECMWF) (www.ecmwf.int, accessed on 15 November 2020). In this study, 172 days of data were collected from the first flowering stage to the harvest stage ( Figure 3).

Time Series Vegetation Indices
From the time-series Planet imagery data, six VIs related to fruit tree biomass and yield parameters were selected to predict apple yields (Table 1). The VIs were derived from the pixel at the center of the tree crown; when the center of the tree crown was located at the edge of a pixel, the mean value of the adjacent pixels was taken as the VI value of the selected apple tree. Daily values of the ratio vegetation index (SR), differential vegetation index (DVI), normalized difference vegetation index (NDVI), soil-adjusted vegetation index (SAVI), renormalized difference vegetation index (RDVI), and enhanced vegetation index (EVI) were simulated by linear interpolation to define the shapes and amplitudes of the VI curves.

Table 1. Vegetation indices (SR, DVI, NDVI, SAVI, RDVI, and EVI). Columns: Vegetation Index, Equation, Reference.
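As a sketch of how the six VIs and their daily interpolation could be computed, the following uses the commonly published forms of each index. The equations of Table 1 are not reproduced here, so these forms, the SAVI soil factor of 0.5, the EVI coefficients, and all function names are assumptions rather than the authors' exact definitions.

```python
import math


def vegetation_indices(blue, red, nir, soil_factor=0.5):
    """Compute six VIs from surface reflectance values in the range 0-1.

    Uses commonly published forms (assumed, not taken from Table 1).
    """
    return {
        "SR": nir / red,
        "DVI": nir - red,
        "NDVI": (nir - red) / (nir + red),
        "SAVI": (1 + soil_factor) * (nir - red) / (nir + red + soil_factor),
        "RDVI": (nir - red) / math.sqrt(nir + red),
        "EVI": 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1),
    }


def interpolate_daily(days, values):
    """Linearly interpolate image-date observations to one value per day.

    `days` are integer days after flowering in increasing order.
    """
    daily = {}
    observations = list(zip(days, values))
    for (d0, v0), (d1, v1) in zip(observations, observations[1:]):
        for d in range(d0, d1 + 1):
            daily[d] = v0 + (v1 - v0) * (d - d0) / (d1 - d0)
    return daily


# Hypothetical reflectances for one crown pixel and two image dates.
vis = vegetation_indices(blue=0.04, red=0.05, nir=0.45)
ndvi_daily = interpolate_daily([0, 10], [0.2, 0.7])
```

Linear interpolation keeps the curve simple and free of overshoot between image dates, at the cost of ignoring any curvature between acquisitions.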

Phenological Information Extraction
Previous studies have shown that NDVI is sensitive for detecting plant phenological signals [13,42]. Therefore, this study selected an NDVI time series to extract the phenological stages of apple trees. The NDVI curve was consistent with the ground observations of the apple growth and management measures, which assisted in the definition of the phenological stages. The start day of the flowering stage was identified by site observation. The other phenological stages were identified from the time-series NDVI data and validated by ground observation. The apple growth and development features, the main apple management measures, and the corresponding NDVI curves are shown in Figure 2. The flowering stage (FS), the new-shoot-growing stage (NGS), the new-shoot-stop-growing stage (NSS), the autumn shoot-growing stage (AGS), the autumn shoot-stop-growing stage (ASS), and the harvest stage (HS) were extracted. To describe phenological changes accurately, satellite observations were more frequent from the FS to the AGS (when the trees grew vigorously) than in the ASS and HS. Because the NDVI curve at the HS was affected by human activities, such as reflective mulching films and the removal of leaves, this study focused on the first five phenological stages.

Yield Prediction Model Based on Accumulated VIs
To improve the correlation between the RS data and the apple fruit yield [26], the cumulative VI (∑ VI) values at different phenological stages were calculated; the accumulation area was defined as the area enclosed by the boundaries of the phenological stage and the VI curve plotted against the days after flowering (Equation (1)). Then, an apple fruit yield prediction model based on the random forest (RF) algorithm was constructed from the ∑ VI values at different phenological stages (Equation (2)). As an ensemble learning approach, RF has a faster training speed and a stronger generalization ability than other statistical approaches [43]. It has been reported to perform more accurately for crop yield predictions than other statistical methods in recent studies [30][31][32].
Because apple fruit yield is related to the accumulation of dry matter in fruit, this study used the Carnegie-Ames-Stanford approach (CASA) model to calculate the accumulated net primary productivity (NPP). The apple fruit yield was then calculated from the accumulated NPP (Equation (3)). Because the light absorption and utilization abilities differ among crops, this study optimized the parameters and calculations of the CASA model. The details are as follows.
In Equation (3), Yield is the apple fruit yield (kg/tree), ∑ NPP is the cumulative NPP over the entire apple tree growing season (g C/tree), HI is the harvest index (0.7), C is the carbon content (47.5%), and ω is the water content of apple fruit (84%). The CASA model estimated the NPP as the product of the absorbed photosynthetically active radiation (APAR, MJ/tree) and the light use efficiency (ε, g C/MJ) (Equation (4)).
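The conversion from accumulated NPP to fresh fruit yield can be sketched as below. The exact form of Equation (3) is not reproduced in the text, so this is one dimensionally consistent reading of the stated parameters (harvest index 0.7, carbon content 47.5%, fruit water content 84%); the function names and the example NPP value are hypothetical.

```python
HARVEST_INDEX = 0.7          # HI, from the text
CARBON_CONTENT = 0.475       # C, from the text
FRUIT_WATER_CONTENT = 0.84   # omega, from the text


def daily_npp(apar_mj_per_tree, lue_gc_per_mj):
    """NPP (g C/tree) as the product of APAR and light use efficiency."""
    return apar_mj_per_tree * lue_gc_per_mj


def yield_from_npp(total_npp_gc_per_tree):
    """Assumed reading of Equation (3): carbon -> dry matter -> fruit -> fresh kg."""
    dry_matter_g = total_npp_gc_per_tree / CARBON_CONTENT
    fruit_dry_matter_g = dry_matter_g * HARVEST_INDEX
    fruit_fresh_g = fruit_dry_matter_g / (1 - FRUIT_WATER_CONTENT)
    return fruit_fresh_g / 1000  # g -> kg


# Hypothetical season total of about 5.4 kg C per tree.
example_yield = yield_from_npp(5429.0)
```

Under this reading, a seasonal NPP of roughly 5.4 kg C/tree corresponds to about 50 kg/tree of fresh fruit, which is the order of magnitude of the average yields reported for the sampled trees.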

Determination of Absorbed Photosynthetically Active Radiation
The APAR is related to two factors: photosynthetically active radiation (PAR) and the fraction of absorbed photosynthetically active radiation (FPAR). APAR was calculated as the product of PAR (MJ/tree) and FPAR (Equation (5)). PAR (0.4-0.7 µm) is the part of the shortwave solar radiation (0.3-3.0 µm) that is absorbed by chlorophyll for photosynthesis in plants and is, thus, a fraction (0.48 in the present study) of the incoming solar radiation. PAR was calculated from SSR, k, and P (Equation (6)), where SSR is the surface solar radiation (MJ/m²), k is the ratio of photosynthetically active radiation to the surface solar radiation, and P is the planting density (tree/ha). Linear functions of SR or NDVI are often used to estimate FPAR, and here, FPAR was calculated in three ways (Equations (7)-(9)). In these equations, SR min and SR max and NDVI min and NDVI max represent the 5th and 95th percentiles of SR and NDVI, respectively, for the apple trees analyzed in this study. SR min and SR max were computed for every single date. The FPAR min and FPAR max values were defined as 0.01 and 0.95, respectively.
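The PAR and FPAR steps can be sketched as follows. Equations (6)-(9) are not reproduced in the text, so the per-tree conversion (10,000 m² per ha divided by the planting density), the linear min-max scaling, the clamping to the FPAR bounds, and the simple mean used for the averaged FPAR are assumptions based on the stated definitions; the planting density in the example is hypothetical.

```python
def par_per_tree(ssr_mj_per_m2, planting_density_per_ha, k=0.48):
    """PAR (MJ/tree) from surface solar radiation, assuming each tree
    occupies 10,000 / P square metres."""
    area_per_tree_m2 = 10_000 / planting_density_per_ha
    return ssr_mj_per_m2 * k * area_per_tree_m2


def scale_to_fpar(vi, vi_min, vi_max, fpar_min=0.01, fpar_max=0.95):
    """Linear mapping of SR or NDVI to FPAR between the 5th and 95th
    percentile values, clamped to [fpar_min, fpar_max] (assumed)."""
    fpar = (vi - vi_min) / (vi_max - vi_min) * (fpar_max - fpar_min) + fpar_min
    return min(max(fpar, fpar_min), fpar_max)


def fpar_average(fpar_sr, fpar_ndvi):
    """Mean of the SR-based and NDVI-based FPAR (assumed equal weights)."""
    return 0.5 * (fpar_sr + fpar_ndvi)


def apar_per_tree(par_mj_per_tree, fpar):
    """APAR (MJ/tree) as the product of PAR and FPAR."""
    return par_mj_per_tree * fpar


# Hypothetical day: 20 MJ/m2 of solar radiation, 625 trees/ha.
par = par_per_tree(20.0, 625)
apar = apar_per_tree(par, scale_to_fpar(5.0, 1.0, 9.0))
```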

Determination of Light Use Efficiency
The light use efficiency was calculated using the maximum light use efficiency (ε max) and environmental stress factors (Equation (10)). ε max is the typical light use efficiency when the environmental conditions are optimal. Through repeated analysis and comparisons, ε max was determined to be 0.499 g C/MJ in this study. Because of sufficient irrigation in the experimental orchards, the effects of water stress factors were not considered in this study. The temperature stress factors were calculated by Equations (11) and (12). In Equations (10)-(12), ε max is the maximum light use efficiency (g C/MJ), T 1 and T 2 are dimensionless scalars representing temperature stress factors that reduce light use efficiency under unfavorable conditions, T opt is the mean air temperature during the month of maximum NDVI (°C), and T is the mean daily air temperature (°C).
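A sketch of the light use efficiency calculation is given below. Equations (10)-(12) are not reproduced in the text; the temperature scalars here follow the forms usually given for the original CASA model, and taking ε as the product of ε max, T 1, and T 2 (with no water stress term) is an assumption consistent with the description above. The temperatures in the example are hypothetical.

```python
import math

EPSILON_MAX = 0.499  # g C/MJ, value determined in this study


def temperature_stress_1(t_opt):
    """T1: limitation at very low or very high optimum temperatures."""
    return 0.8 + 0.02 * t_opt - 0.0005 * t_opt ** 2


def temperature_stress_2(t_opt, t):
    """T2: reduction when the daily temperature departs from the optimum."""
    low_side = 1 + math.exp(0.2 * (t_opt - 10 - t))
    high_side = 1 + math.exp(0.3 * (-t_opt - 10 + t))
    return 1.1814 / low_side / high_side


def light_use_efficiency(t_opt, t, epsilon_max=EPSILON_MAX):
    """Assumed Equation (10): epsilon = epsilon_max * T1 * T2."""
    return epsilon_max * temperature_stress_1(t_opt) * temperature_stress_2(t_opt, t)


# Hypothetical optimum of 25 degrees C, evaluated on a warm and a cold day.
warm_day = light_use_efficiency(25.0, 25.0)
cold_day = light_use_efficiency(25.0, 5.0)
```

With these forms, ε stays close to ε max when the daily temperature equals the optimum and drops sharply on cold days.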

Accuracy Evaluation
The sample trees from 2019 and 2020 were divided into two groups by the equidistant sampling method [44]; one group contained 78 samples as the calibration set, and the other group contained 26 samples as the independent validation set. The coefficient of determination (R²), root mean square error (RMSE), and residual predictive deviation (RPD) were calculated and used to evaluate the accuracies of the models (Equations (13)-(16)) [45]. Higher R² values indicate that a model is more stable, and lower RMSE and higher RPD values indicate greater model accuracy. Models were classified in terms of RPD as follows: 1.0 < RPD < 1.4, "poor"; 1.4 < RPD < 1.8, "general"; 1.8 < RPD < 2.0, "good"; and 2.0 < RPD < 2.5, "very good" [46].
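The split and the accuracy metrics can be sketched as follows. Equations (13)-(16) are not reproduced in the text, so the definitions are assumptions: R² is taken as the squared Pearson correlation, RPD as the standard deviation of the observed yields divided by the RMSE, and the equidistant sampling as sorting by yield and sending every fourth sample to validation (which gives the reported 78/26 split for 104 trees). Treating each class boundary as inclusive at its lower end is also an assumption.

```python
import math
import statistics


def equidistant_split(yields, step=4):
    """Sort by yield; every `step`-th sample goes to the validation set."""
    ordered = sorted(yields)
    validation = ordered[step - 1::step]
    calibration = [y for i, y in enumerate(ordered) if (i + 1) % step != 0]
    return calibration, validation


def rmse(observed, predicted):
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    mean_o, mean_p = statistics.fmean(observed), statistics.fmean(predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


def rpd(observed, predicted):
    """Residual predictive deviation: SD of observations over RMSE."""
    return statistics.stdev(observed) / rmse(observed, predicted)


def rpd_class(value):
    """RPD classes from the text; boundary handling is an assumption."""
    for upper, label in ((1.4, "poor"), (1.8, "general"),
                         (2.0, "good"), (2.5, "very good")):
        if 1.0 <= value < upper:
            return label
    return "unclassified"


calibration, validation = equidistant_split(range(104))
```

Sorting before sampling keeps the yield range of the validation set close to that of the calibration set, which matters here because yields vary strongly among trees.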

Yield Mapping
To obtain spatial and quantitative information on apple fruit yield, this study predicted the apple fruit yield at a regional scale. Apple orchards are the main land type in Guanli town, so this study extracted the apple planting area using visual interpretation methods. The model with the best prediction accuracy was used to predict the regional apple fruit yield in MATLAB software (MathWorks, Inc., Natick, MA, USA); the yield maps for the two years of study were produced using ArcGIS software (ESRI Inc., West Redlands, CA, USA).

Statistical Results of Fruit Yield
The statistical indices of apple fruit yield, including the maximum (Max), minimum (Min), average (Avg), standard deviation (SD), and coefficient of variation (CV), are shown in Table 2. The highest, lowest, and average apple fruit yields of the sampled trees in 2019 were 125.13, 8.62, and 49.80 kg/tree, respectively, and the corresponding values of the sampled trees in 2020 were 115.60, 7.29, and 53.01 kg/tree, respectively. The CVs of the apple fruit yields of sampled trees in 2019 and 2020 were both over 50% (57% and 54%, respectively). These results indicated that apple fruit yields are extremely variable among trees.

Correlation Analysis between Apple Fruit Yield and ∑ VIs
The trends of the six VIs and ∑ VIs for apple trees during the whole growing period are shown in Figure 4. The VIs generally increased in the FS, NGS, and AGS and remained stable or slightly decreased in the NSS and ASS; this trend was consistent with the pattern of apple growth and development. These results showed that VIs are sensitive to apple growth. Because the ∑ VIs mainly depended on the VIs and the lengths of the phenological stages, the ∑ VI values were higher in the NSS than in the other stages. The correlation coefficients (r) between the apple fruit yield and ∑ VIs were calculated and are shown in Table 3. ∑ SR and ∑ NDVI produced the best correlations for the total growth stage, with r values of 0.74 and 0.73, respectively. When each phenological stage was analyzed separately, the VI with the highest r value differed among stages. In the FS, ∑ NDVI and ∑ SR were identified as the VIs with the highest correlations (r = 0.60). ∑ NDVI showed the highest correlation in the AGS and ASS (r = 0.47 and 0.78, respectively), and ∑ SR produced the highest correlation in the NGS and NSS (r = 0.67 and 0.66, respectively). Overall, ∑ NDVI and ∑ SR were sensitive to the apple fruit yields.
Table 3 note: The values are the correlation coefficients (r) between the yield and ∑ VIs; ** represents significance at the 0.01 level.
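The accumulation of a VI over a phenological stage and its correlation with yield can be sketched as below. Equation (1) is not reproduced in the text, so approximating the enclosed area with the trapezoidal rule on the daily VI curve is an assumption, and the stage boundaries and values in the example are hypothetical.

```python
import math


def accumulated_vi(daily_vi, start_day, end_day):
    """Trapezoidal area under the daily VI curve between two days
    after flowering (inclusive stage boundaries)."""
    total = 0.0
    for day in range(start_day, end_day):
        total += 0.5 * (daily_vi[day] + daily_vi[day + 1])
    return total


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical tree with a constant NDVI of 0.5 over a 10-day stage.
flat_curve = {day: 0.5 for day in range(0, 11)}
stage_sum = accumulated_vi(flat_curve, 0, 10)
```

Because the accumulated value grows with both the VI level and the stage length, longer stages yield larger ∑ VI values even when the VI itself is stable.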

Calibration Results of the Yield Prediction Model Based on Different ∑ VIs
The ∑ VIs at FS, NGS, NSS, AGS, and ASS were combined to predict the apple fruit yield based on the random forest algorithm ( Figure 5). For the calibration results, the R 2 values of the six models were all above 0.8. Among them, the RF ∑ RDVI model achieved the best calibration results (R 2 = 0.84, RMSE = 11.62 kg/tree, and RPD = 2.42). The R 2 , RMSE and RPD of the RF ∑ NDVI model reached 0.82, 12.12 kg/tree, and 2.32, respectively. This result showed that the random forest algorithm has a good fitting ability. The differences in calibration accuracies among different models were small.

Validation Results of the Yield Prediction Model Based on Different ∑ VIs
The validation set was used to validate the models (Figure 5). The RF ∑ NDVI, RF ∑ SAVI, and RF ∑ RDVI models reached the highest coefficients of determination (R² = 0.71). Moreover, the RMSE and RPD values of the RF ∑ NDVI, RF ∑ SAVI, and RF ∑ RDVI models reached 16.40, 16.47, 16.59 kg/tree and 1.83, 1.82, 1.80, respectively; these values were better than the corresponding values of the RF ∑ EVI, RF ∑ DVI, and RF ∑ SR models (RMSE = 18.34, 17.95, 17.39 kg/tree and RPD = 1.63, 1.67, 1.72, respectively). The RPDs of the RF ∑ NDVI, RF ∑ SAVI, and RF ∑ RDVI models all reached 1.8 or higher, indicating that the models had good performances and could be used to predict apple fruit yield. Among them, the RF ∑ NDVI model reached the highest validation accuracy (R² = 0.71, RMSE = 16.40 kg/tree, and RPD = 1.83). This study therefore selected the RF ∑ NDVI model for a comparison with the CASA model.

Net Primary Production Estimation
Fruit tree growth and meteorological conditions directly affect the formation of NPP. Figure 6 shows the mean daily NPP of apple trees in the study area as a function of time over the entire growing season. The results showed that the daily NPP generally increased first, then decreased, and remained stable after 90 days. The NPP estimated based on FPAR NDVI was higher than the corresponding values based on FPAR SR. This difference may cause the FPAR NDVI-based model to overpredict apple fruit yield.

Calibration Results of the CASA Model
The CASA SR, CASA NDVI, and CASA Average models were developed from FPAR SR, FPAR NDVI, and FPAR Average, respectively, to predict apple fruit yields (Figure 7). According to the calibration results, the best performance was produced by the CASA SR model (R² = 0.57, RMSE = 18.95 kg/tree, and RPD = 1.51), followed by the CASA Average model (R² = 0.57, RMSE = 19.95 kg/tree, and RPD = 1.43), and finally, the CASA NDVI model (R² = 0.55, RMSE = 23.29 kg/tree, and RPD = 1.23). The values predicted by the CASA NDVI model were overestimated compared to the actual values.

Validation Results of the CASA Model
The validation set was used to validate the models (Figure 7). The CASA SR model (R 2 = 0.57, RMSE = 19.61 kg/tree, and RPD = 1.53) predicted apple fruit yield better than the CASA NDVI model (R 2 = 0.56, RMSE = 24.47 kg/tree, and RPD = 1.22) and the CASA Average model (R 2 = 0.57, RMSE = 20.82 kg/tree, and RPD = 1.44); these results were consistent with the calibration results. Because the CASA NDVI and CASA Average models did not improve the accuracy of apple fruit yield predictions, this study selected the CASA SR model for a comparison with the machine-learning model.

Comparison of the RF ∑ NDVI Model and CASA SR Model
To ensure the consistency of the comparison, the validation set was used to analyze the performances of the RF ∑ NDVI model and the CASA SR model (Figure 8). The R² and RMSE values of the RF ∑ NDVI model reached 0.71 and 16.40 kg/tree, respectively; these values were better than the corresponding values of the CASA SR model (R² = 0.56 and RMSE = 19.90 kg/tree). The RPD of the RF ∑ NDVI model was above 1.8, indicating that the model displayed good performance in predicting apple fruit yield. The RPD of the CASA SR model only reached 1.50, indicating that the model had a general performance in predicting apple fruit yield. These results indicated that the accuracy of the RF ∑ NDVI model was higher than that of the CASA SR model when predicting apple fruit yield.

Yield Map
Apple orchards are the main land type in Guanli town, so this study extracted the apple planting area using visual interpretation methods. Using the RF ∑ NDVI model, the apple fruit yield was predicted at the regional scale. The yield maps of Guanli town in 2019 and 2020 are shown in Figure 9.

The Machine-Learning Model for Apple Yield Prediction
At present, the ability to rapidly predict crop yields over large scales based on RS data is an area of active research. There are wide applications for field crop yield predictions, such as those of wheat, barley, potato, maize, and soybean [15,25,[47][48][49]. However, there are few reports describing fruit tree yield predictions based on time-series RS data. The results obtained from this study indicate the potential of using time-series multispectral images to accurately predict apple fruit yields across multiple apple growing seasons and orchards.
As an effective means of monitoring crop growth, vegetation indices show strong correlations with fruit yield and greatly influence apple fruit yield predictions [16][17][18]. ∑ NDVI and ∑ SR consistently produced the strongest relationships with apple fruit yield in this study. Vegetation indices have been identified as being highly sensitive to the canopy chlorophyll content; chlorophyll is a plant constituent that is essential for fruit growth and development in apple trees [11,43]. Apple canopies with low vegetation index values may have low photosynthesis rates and, consequently, low organic matter accumulations [50,51]. Therefore, it follows that the ∑ VIs measured in the apple growing season can be used to assess crop yields. The NDVI has been widely used for phenological characterizations because it is simple to calculate and sensitive to phenological changes [52,53]. In this study, the phenological stages of apple trees were extracted by NDVI because of the sensitivity of this VI to the growth stage. Some studies have demonstrated that the correlations between VIs and yield differ among phenological stages [14,54]. In this study, the ∑ VIs of the autumn shoot-stop-growing stage had the highest correlations with the apple fruit yield. Previously published results have indicated that later growing stages could provide higher prediction accuracies when RS data are used to predict fruit yields [16,18,27]; this is consistent with the results obtained from this study. These results may be caused by fruit growth competing for nutrition with new shoots [1]. The autumn shoot-stop-growing stage represents the peak of fruit growth in an apple tree, when dry matter mainly accumulates in fruit; this dry matter can reflect the yield potential well [55]. Therefore, using a combination of ∑ VIs measured at different phenological stages is very important for accurately predicting apple fruit yields. These results showed that the RF ∑ NDVI model achieved the highest yield prediction accuracy and could be used to predict apple fruit yields.

The CASA Model for Apple Yield Prediction
Using RS time series and meteorological data, apple fruit yields were also predicted by the CASA SR model. Previous studies have demonstrated that the parameters of the CASA model are often affected by several factors, such as the vegetation type, geographical location, and environmental conditions [13,35,36,56,57]. Therefore, this study optimized the parameters of the CASA model to improve its apple fruit yield prediction accuracy. The original CASA model only established a linear relationship between FPAR and SR [13,35]. However, SR values can become saturated, and noise contributes proportionally to the errors in the FPAR calculation and ultimately to the apple fruit yield predictions [13,58]. Although this study tried to combine NDVI and SR to improve the yield prediction accuracy [58,59], the yield prediction results suggested that FPAR NDVI did not improve the prediction accuracy compared with FPAR SR. Light use efficiency is the primary controlling factor in the CASA model for predicting crop yields [56,60,61]. In this study, the light use efficiency was calculated using the maximum light use efficiency and environmental stress factors. The maximum light use efficiency of the original CASA model was assigned a value of 0.389 g C/MJ [35]. However, many studies have demonstrated that the maximum light use efficiency differs greatly among vegetation types and environmental conditions [56,57,60]. Zhu et al. [58] reported that the maximum light use efficiency values ranged from 0.159 to 2.553 g C/MJ for woody vegetation in China. Clearly, the maximum light use efficiency must be adequately estimated when using the CASA model to predict yields [56]. Through repeated analyses and comparisons, the maximum light use efficiency of apple trees was determined to be 0.499 g C/MJ in this study. The environmental stress factors in the original CASA model are mainly divided into temperature and water stress factors [13,36]. Among them, water stress factors represent a physiological reduction in light use efficiency under drought conditions; drought conditions were calculated in this study using the precipitation time series [36]. However, because the apple orchards were sufficiently irrigated, the effects of the water stress factors were not considered in this study.
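The chain of calculations discussed above (SR to FPAR, FPAR to absorbed radiation, and absorbed radiation to NPP through the light use efficiency) can be illustrated with a minimal Python sketch. It is not the implementation used in this study. The function names, the SR bounds (sr_min, sr_max), the FPAR bounds (0.001 and 0.95), and the factor 0.5 for the photosynthetically active fraction of solar radiation are assumptions taken from the commonly used CASA formulation. Only the maximum light use efficiency values (0.499 and 0.389 g C/MJ) and the omission of water stress come from the text.

```python
def simple_ratio(ndvi):
    """SR = (1 + NDVI) / (1 - NDVI); undefined for NDVI = 1."""
    return (1.0 + ndvi) / (1.0 - ndvi)


def fpar_from_sr(sr, sr_min, sr_max, fpar_min=0.001, fpar_max=0.95):
    """Linear FPAR-SR relationship, clamped to [sr_min, sr_max].

    sr_min, sr_max, fpar_min, and fpar_max are assumed values,
    not parameters reported in this study.
    """
    sr = min(max(sr, sr_min), sr_max)
    scale = (sr - sr_min) / (sr_max - sr_min)
    return scale * (fpar_max - fpar_min) + fpar_min


def npp_step(ndvi, solar_mj_m2, temp_stress, eps_max=0.499,
             water_stress=1.0, sr_min=1.05, sr_max=6.0):
    """NPP (g C/m2) for one time step.

    eps_max defaults to the 0.499 g C/MJ determined in this study.
    water_stress = 1.0 reflects the decision to ignore water stress
    in irrigated orchards. temp_stress is supplied by the caller
    (hypothetical input in the range 0-1).
    """
    fpar = fpar_from_sr(simple_ratio(ndvi), sr_min, sr_max)
    apar = solar_mj_m2 * fpar * 0.5          # absorbed PAR, MJ/m2
    eps = eps_max * temp_stress * water_stress  # actual LUE, g C/MJ
    return apar * eps


if __name__ == "__main__":
    # Hypothetical inputs: NDVI 0.6, 20 MJ/m2 solar radiation, mild heat stress.
    print(npp_step(0.6, 20.0, 0.9))
    print(npp_step(0.6, 20.0, 0.9, eps_max=0.389))  # original CASA value
```

Because NPP is proportional to the maximum light use efficiency in this formulation, replacing 0.389 with 0.499 g C/MJ scales every NPP estimate by the same ratio, which is why this parameter must be estimated adequately for the vegetation type.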

Application Prospect of Apple Fruit Yield Predictions
The accuracy comparison of apple fruit yield predictions indicated that the RF ∑ NDVI model performed better than the CASA SR model. This difference may arise because the relationships between vegetation indices and apple fruit yield tend to be nonlinear [16,17]. As a machine-learning-based model, the RF ∑ NDVI model has a stronger ability to fit nonlinear data than the CASA SR model. Therefore, the RF ∑ NDVI model can be applied to predict apple fruit yields on a regional level. However, in large-scale apple fruit yield predictions, the meteorological impacts may be more noticeable [62,63], and the relationships between the yield and VIs may be different. The CASA SR model may perform better in large-scale apple fruit yield predictions, especially in extreme growing seasons. In addition, extreme weather, such as frost, and poor management also affect apple fruit yields. Modifying the models to consider these effects requires further research. This study only compared the prediction accuracies of the RF ∑ NDVI model and the CASA SR model in Guanli town, Shandong, China. Further research is needed to determine the performances of the RF ∑ NDVI model and the CASA SR model in large-scale apple fruit yield predictions.
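The accuracy statistics used for this comparison (R2, RMSE, and RPD) can be computed as in the following minimal Python sketch. It assumes that R2 is the coefficient of determination and that RPD is the sample standard deviation of the observed yields divided by the RMSE; the study may have used slightly different definitions. The function name and the yield values in the usage example are hypothetical and are not data from this study.

```python
import math


def r2_rmse_rpd(observed, predicted):
    """Return (R2, RMSE, RPD) for paired observed and predicted yields.

    Assumed definitions:
      R2   = 1 - SS_res / SS_tot
      RMSE = sqrt(SS_res / n)
      RPD  = SD(observed) / RMSE, with SD using n - 1
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    sd_obs = math.sqrt(ss_tot / (n - 1))
    return 1.0 - ss_res / ss_tot, rmse, sd_obs / rmse


if __name__ == "__main__":
    # Hypothetical yields in kg/tree, for illustration only.
    obs = [10.0, 20.0, 30.0, 40.0]
    pred = [12.0, 18.0, 33.0, 39.0]
    r2, rmse, rpd = r2_rmse_rpd(obs, pred)
    print(f"R2={r2:.3f} RMSE={rmse:.2f} RPD={rpd:.2f}")
```

Under this definition, two models evaluated on the same dataset share the same observed standard deviation, so a higher RPD corresponds directly to a lower RMSE.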
To predict the yield of other species, the models should be modified according to the agronomic characteristics of the species. The RF ∑ NDVI model needs to be modified according to the differences in phenological stages between species. The parameters of the CASA SR model, including the maximum light use efficiency, harvest index, and fruit water content, could be adjusted to accommodate different species. The performances of the two models when applied to other species still need further study.

Conclusions
This study developed two kinds of models using time-series VIs, the RF ∑ NDVI model and the CASA SR model, to explore effective approaches for predicting regional apple fruit yields. The results showed that (1) ∑ NDVI was the optimal predictor to construct an RF model for apple fruit yield, and the R2, RMSE, and RPD values of the RF ∑ NDVI model reached 0.71, 16.40 kg/tree, and 1.83, respectively. (2) The maximum light use efficiency was determined to be 0.499 g C/MJ, and the CASA SR model (R2 = 0.57, RMSE = 19.61 kg/tree, and RPD = 1.53) performed better than the CASA NDVI model and the CASA Average model (R2, RMSE, and RPD = 0.56, 24.47 kg/tree, 1.22 and 0.57, 20.82 kg/tree, 1.44, respectively). (3) This study compared the yield prediction accuracies obtained by the models using the same dataset, and the RF ∑ NDVI model (RPD = 1.83) showed a better performance in predicting apple fruit yields than the CASA SR model (RPD = 1.53). The results obtained from this study indicated the potential of the RF ∑ NDVI model based on time-series Planet images to accurately predict apple fruit yields. The models could provide spatial and quantitative information on apple fruit yield, which would be valuable for agronomists to predict regional apple production and to inform national planting policies, agricultural management, and export strategies.