A Multicultivar Approach for Grape Bunch Weight Estimation Using Image Analysis

Abstract: The determination of bunch features that are relevant for bunch weight estimation is an important step in automatic vineyard yield estimation using image analysis. The conversion of 2D image features into mass can be highly dependent on grapevine cultivar, as the bunch morphology varies greatly. This paper aims to explore the relationships between bunch weight and bunch features obtained from image analysis considering a multicultivar approach. A set of 192 bunches from four cultivars, collected at sites located in Portugal and South Africa, was imaged using a conventional digital RGB camera, followed by image analysis, where several bunch features were extracted, along with physical measurements performed in laboratory conditions. Image features were explored as predictors of bunch weight, individually and in a multiple stepwise regression analysis, and the resulting models were then tested on 37% of the data. The results show that the variables bunch area and visible berries are good predictors of bunch weight (R² ranging from 0.72 to 0.90); however, the simple regression lines fitted between these predictors and the response variable presented significantly different slopes among cultivars, indicating cultivar dependency. The selected multiple regression model used a combination of four variables: bunch area, bunch perimeter, visible berry number, and average berry area. The regression analysis between the actual and estimated bunch weight yielded an R² = 0.91 on the test set. Our results are an important step towards automatic yield estimation in the vineyard, as they increase the possibility of applying image-based approaches using a generalized model, independent of the cultivar.


Introduction
Grapevine yield estimation is a subject of utmost importance for the wine and vine business [1]. Accurately estimating the amount of fruit in the vineyard can provide advantages to the farmer regarding harvest logistics, cellar management, wine stock management, and even marketing strategies [2]. However, spatial and temporal variability in vineyard blocks [3-5] makes this task extremely challenging [6]. Current commercial methods are based on manual bunch sampling, which is destructive, laborious, and time-consuming. In general, all standard methods depend on bunch weight, either measured or historical, which is then extrapolated to the entire vineyard or plot. Conventional methods usually present errors close to 30%, depending on sampling methods and vineyard variability [7].
Recently, sensor-based technology has been developed to address the challenge of bunch weight estimation [8]. In particular, image analysis has shown the most promising results regarding this topic. If used thoroughly, it has the potential of inspecting a great number of bunches or vines in a short period of time, resorting less to the extrapolation of data, thus being less prone to errors caused by spatial and temporal variability.
Victorino et al. [9] showed that several bunch features are highly correlated with bunch weight, highlighting bunch volume, bunch projected area, and visible berry number in a two-dimensional image. These yield components have also been discussed in other works such as Dunn and Martin [2] and Diago et al. [10].
Bunch area (or pixels) has been explored to estimate bunch or vine yield by Lopes et al. [11] and Milella et al. [12]. In the latter case, the authors proposed an automatic bunch segmentation using convolutional neural networks for pixel classification. Three-dimensional data (RGB-D images) have also been used to estimate bunch weight and compared to two-dimensional imaging, on cv. Syrah [13]. The authors concluded that although 3D data outperformed 2D in a controlled lab environment (R² = 0.89 with RGB, compared to 0.95 with RGB-D, observed vs. estimated data), in field conditions, at a vine level, 3D data were not as reliable (R² = 0.88 with RGB, compared to 0.59 with RGB-D, observed vs. estimated data).
An automatic berry counting algorithm was developed by Aquino et al. [14], where the number of visible berries in two-dimensional vine images was used to estimate bunch weight and vine yield on several cultivars, by multiplying the berry number by historical berry weight data. In this case, the images were collected at night using artificial lighting and the method achieved an R² value of 0.78 between the observed and estimated yield. Liu et al. [15] developed a fast, calibration-free algorithm for berry counting. The algorithm uses two-dimensional bunch images taken in field conditions with an artificial background and attempts to reconstruct the bunch in three dimensions, estimating the occluded berries. The authors used this reconstruction to estimate bunch weight, obtaining an accuracy of 92% in two cultivars [15]. Other works used similar approaches, most including machine learning techniques, to achieve the same goal in different cultivars and conditions [1,15-18]. However, the relationship between bunch area or visible berries and bunch weight presented different results in multivariate scenarios (e.g., [1,16]), the first being mentioned to be cultivar-dependent in previous research [10,19]. Furthermore, in a 2D bunch image, only a fraction of the berries is visible, so an intermediate step (estimation of total berry number) is needed before estimating bunch weight. The relationship between visible berries and total berries is also cultivar-dependent, as it depends on the shape of the bunch [20].
OIV [21] describes different ways to characterize grape bunch morphological traits through code descriptors. All these traits can vary among cultivars and even within bunches of the same cultivar, and can have an impact when trying to convert bunch image features into bunch weight. Even though several studies have successfully estimated bunch weight using image analysis [22-24], this task still poses several challenges. The main challenges are linked to the following: (i) image background noise and the need for complex algorithms or artificial backgrounds to avoid or remove it [10]; (ii) 2D imaging of a 3D object, which neglects berries that are occluded by other berries or vegetation [25]; and (iii) cultivar dependency, as most relationships between bunch weight and other bunch features change with the cultivar, mainly due to differences in bunch architecture [26,27]. Bunch features like bunch size or bunch compactness can also vary when subject to different management practices or edaphoclimatic conditions [28].
Considering the above-mentioned challenges, in this study, we do not intend to develop a ready-to-use yield estimation protocol. Our main objective is to explore the relationships between bunch weight and bunch features obtained from 2D image analysis in four different cultivars (multicultivar approach). To address the challenge of cultivar dependency, a simplified protocol is proposed, which focuses on features that have already been shown to be automatically extractable from images, and attempts to reach the best performing model for multicultivar bunch weight estimation.

Site Characterization and Trial Description
The experiment was performed in two vineyards located in different countries from different hemispheres: Portugal (Site PT) and South Africa (Site SA).
The PT site is an experimental vineyard located in the Lisbon winegrowing region, at Instituto Superior de Agronomia campus, Lisbon, Portugal (38°42′24.61″ N, 9°11′05.53″ W). It is located 70 m above sea level and includes two different plots with spur pruned vines trained on a vertical shoot positioning trellis system with two pairs of movable wires. In the PT site, the season presented a dry spring (119 mm of precipitation from March to June) and a very dry and warm summer (17 mm of precipitation and average mean temperature of 20.6 °C, from June to September). The first plot included two drip-irrigated white Portuguese autochthonous cultivars (Encruzado and Arinto) grafted onto 1103 Paulsen rootstock, planted in 2006. Both cultivars were trained on a unilateral Royat cordon and were spaced 1.0 m within and 2.5 m between north-south oriented rows. Water was supplied with a drip irrigation system, which was managed using a soil water probe. Readily available water was maintained until veraison, at which point stress was applied by allowing the soil water to reach values below the refill point. The refill point was defined using predawn leaf water potential thresholds established for white wines at our site (~−0.3 MPa). The second plot was rainfed and consisted of the cultivar Syrah (grafted onto 140 Ruggeri rootstock) planted in 1999. Plants were trained on a bilateral Royat cordon and spaced 1.2 m within and 2.5 m between north-south oriented rows. In both plots, the soil was a clay loam with 1.6% organic matter and a pH of 7.8 [29].
The SA site is a commercial Cabernet Sauvignon vineyard located at Stellenbosch wine region in the Western Cape, South Africa (33°54′10.4″ S, 18°55′12.9″ E), 430 m above sea level. Plants were grafted onto 101-14 MGt (101-14 Millardet et de Grasset) rootstock, spur-pruned, and trained on a vertical shoot positioning trellis system with two pairs of movable wires. Vines were planted in 2003 on a bilateral Royat cordon and spaced 2.0 m within and 2.5 m between north-south oriented rows. Water was supplied with a drip irrigation system and irrigation was managed within commercial standards using the midday stem water potential with a threshold for water stress conditions of −0.9 MPa as an indicator. At the SA site, the season presented a spring with an average rainfall (205 mm of precipitation from September until December 2020) and a warm summer with some precipitation (~100 mm of precipitation and an average temperature of 21 °C from December until April).

Data Set
The PT site data were collected during the 2019 season, near harvest (August 2019). Samples of 48, 71, and 60 bunches of cvs. Arinto, Encruzado, and Syrah, respectively, were collected from four different plants per cultivar, located randomly across the vineyard (Figure 1A-C).
As shown in the example of Figure 1C, the bunches from cultivar Syrah were uncommonly loose compared to the typical Syrah bunch as a result of a lower fruit set caused by unfavorable weather conditions during flowering. A black metal spring was used as scale (5 cm wide) on images collected at the PT site (Figure 1A-C), while a black square (3 cm²) was used at the SA site (Figure 1D). The SA site data were collected during the 2020-2021 season, near harvest (March 2021). A sample of 96 bunches from 16 different vines of the cultivar Cabernet Sauvignon (hereinafter referred to as Cabernet) located across the vineyard was collected and taken to lab conditions. Different numbers of bunches were therefore available for the different cultivars and sites. To prevent possible modelling bias from different sample sizes, each sample was reduced to the lowest number of bunches (48, from cv. Arinto) using a randomized selection process. A complete description of the collected data set is presented in Table 1.
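The randomized reduction to equal sample sizes described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `balance_by_cultivar`, the fixed seed, and the integer bunch identifiers are hypothetical; only the sample sizes (48, 71, 60, and 96) come from the text, and the original selection procedure may have differed in detail.

```python
import random


def balance_by_cultivar(bunch_ids_by_cultivar, seed=0):
    """Randomly reduce every cultivar to the size of the smallest sample."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    target = min(len(ids) for ids in bunch_ids_by_cultivar.values())
    return {
        cultivar: sorted(rng.sample(ids, target))
        for cultivar, ids in bunch_ids_by_cultivar.items()
    }


# Hypothetical bunch identifiers; only the sample sizes come from the text.
samples = {
    "Arinto": list(range(48)),
    "Encruzado": list(range(71)),
    "Syrah": list(range(60)),
    "Cabernet": list(range(96)),
}
balanced = balance_by_cultivar(samples)
```

With these sizes, every cultivar ends up with 48 bunches, giving the 192 bunches used in the analysis.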

Data Collection and Image Analysis
At the PT site, bunch images were collected, in lab conditions, using a commercial camera (Nikon D5200, Nikon Inc., Melville, NY, USA) mounted on a tripod, approximately 50 cm away from the hanging bunches. A blue background was used to prevent noise caused by surrounding objects and to improve color thresholding for bunch segmentation. At the SA site, the image collection methods were similar, except for the background color, which was white, in order to improve the color contrast, particularly with red grape cultivars. In the lab, at both sites, all bunches were weighed (Bw) and imaged (hung by the peduncle, in a random position), and the total number of berries per bunch was counted (Tb).
Image analysis was performed using a customized script written in Matlab (R2020a, The Mathworks, Natick, MA, USA) and using ImageJ (version 1.44p, National Institutes of Health, USA). From each bunch image, the following variables were extracted: bunch projected area (BA; cm²), bunch perimeter (BP; cm), average berry area (bA; cm²), and visible berry number (Vb). BA and BP were estimated using a color threshold tool in the L*a*b* color space, adapted to each image type (e.g., different cultivars, background, or lighting conditions). This process was performed in a loop for all cases, and all images were inspected individually after processing. bA was estimated from a subsample of three fully visible berries per bunch, which were selected manually in the image. This number of berries was used in order to keep the methodology simple, so that it could possibly be automated in the future, and was representative of each bunch. The bA was then computed using Matlab's Hough transform function (hough function), which is designed to detect lines (or curves) in an image [30]. Vb was obtained manually from the images, by clicking each berry in the image and marking the berries already counted. Where applicable, pixel measurements were converted into metric units (cm or cm²) using references in the image with known dimensions (black spring in PT images and black square in SA images; Figure 1). The description of the image analysis sequence is presented in Figure 2. Bunch compactness was visually evaluated from the images by a panel of eight judges, according to OIV descriptor No. 204 [21], which classifies bunches into five categories, based on the mobility of the berries and the visibility of the pedicels, namely: very loose (class 1), loose (class 3), medium (class 5), dense (class 7), and very dense (class 9).
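The pixel-to-metric conversion described above can be illustrated with a short sketch. It is not the authors' Matlab script: the function names and all pixel counts are hypothetical, and only the reference dimensions (a 5 cm wide spring and a 3 cm² square) come from the text.

```python
import math


def cm_per_pixel_from_length(ref_length_cm, ref_length_px):
    """Scale factor from a reference of known length (e.g., the 5 cm spring)."""
    return ref_length_cm / ref_length_px


def cm_per_pixel_from_area(ref_area_cm2, ref_area_px):
    """Scale factor from a reference of known area (e.g., the 3 cm^2 square)."""
    return math.sqrt(ref_area_cm2 / ref_area_px)


def area_to_cm2(area_px, cm_per_px):
    """Areas scale with the square of the linear scale factor."""
    return area_px * cm_per_px ** 2


def length_to_cm(length_px, cm_per_px):
    """Lengths (e.g., bunch perimeter) scale linearly."""
    return length_px * cm_per_px


# All pixel counts below are hypothetical, chosen only for illustration.
scale_pt = cm_per_pixel_from_length(5.0, 250)    # spring spans 250 px
scale_sa = cm_per_pixel_from_area(3.0, 7500)     # square covers 7500 px
bunch_area_cm2 = area_to_cm2(100_000, scale_pt)  # segmented bunch mask
bunch_perimeter_cm = length_to_cm(1_500, scale_pt)
```

The distinction between linear and squared scaling matters here because BA and bA are areas, whereas BP is a length.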

Data Analysis
In addition to the original variables obtained from the bunches and bunch images (BA, BP, bA, and Vb), three other derived variables were calculated: (i) the ratio between bunch area and visible berries (BA_Vb), which represents the average area occupied by each visible berry; (ii) the ratio between bunch area and bunch perimeter (BA_BP), which represents the relationship between the area that berries occupy and their perimeter; and (iii) the ratio between bunch area and the average bA of each cultivar (BA_avg(bA)), which represents the average number of berries per unit of bunch area.
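As a minimal sketch of how the three derived variables could be computed, the following assumes each bunch is stored as a dictionary with the keys `cultivar`, `BA`, `BP`, `bA`, and `Vb`. This data layout, the function name, and the example values are assumptions made for illustration; only the ratio definitions come from the text.

```python
from statistics import mean


def add_derived_features(bunches):
    """Return copies of the bunch records with BA_Vb, BA_BP, and BA_avg_bA added."""
    # Average berry area per cultivar, needed for BA_avg(bA).
    berry_areas = {}
    for bunch in bunches:
        berry_areas.setdefault(bunch["cultivar"], []).append(bunch["bA"])
    avg_berry_area = {cv: mean(values) for cv, values in berry_areas.items()}

    enriched = []
    for bunch in bunches:
        enriched.append({
            **bunch,
            "BA_Vb": bunch["BA"] / bunch["Vb"],
            "BA_BP": bunch["BA"] / bunch["BP"],
            "BA_avg_bA": bunch["BA"] / avg_berry_area[bunch["cultivar"]],
        })
    return enriched


# Hypothetical measurements for two bunches of the same cultivar.
example = add_derived_features([
    {"cultivar": "Arinto", "BA": 100.0, "BP": 50.0, "bA": 2.0, "Vb": 40},
    {"cultivar": "Arinto", "BA": 60.0, "BP": 40.0, "bA": 3.0, "Vb": 30},
])
```

Note that BA_avg(bA) uses the cultivar-level mean of bA rather than the value measured on each bunch, following the definition above.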
The original data set was divided into a training set (63% of the data) and a test set (37% of the data), fully at random within each cultivar, using customized code in RStudio (ver. 1.2.5, RStudio team). This splitting process was compared with a cross-validation procedure, which provided the same results in terms of the selection of variables and similar regression coefficients. The training set was used to establish the relationship between the analyzed variables (described above) and to train the regression model (Figure 2). A simple linear regression analysis was used to evaluate the relationships between bunch weight and the remaining bunch features. This analysis was performed separately per cultivar in order to explore the potential differences among them. A Student's t-test was used to compare the slopes of the regression lines. The general multiple model was obtained by a stepwise regression procedure using the R function stepAIC from the package MASS [31], which selects the best model based on its Akaike information criterion (AIC) value. The coefficient of determination (R²), root mean squared percentage error (RMSPE), bias, and modelling efficiency (EF) were the deviance measures used to evaluate the model performance.

Characterization of Bunch Features
Table 2 presents the average values of the main bunch features obtained during data collection for all of the studied cultivars. All parameters were obtained via image analysis, except for Bw and Tb, which were obtained manually, in lab conditions. Cv. Arinto presented the heaviest bunches, by far, while cv. Syrah presented the lightest ones. Cultivars Cabernet and Encruzado presented similar weights. Similar results were observed for the total berry number (Tb), except for Encruzado, which showed lower Tb values than Cabernet. Bunch area followed the same trend as Tb, also with cv. Encruzado presenting values nearly as low as cv. Syrah. Bunch perimeter did not follow the same trend, with the highest values being observed on cv. Cabernet and the lowest on cv. Encruzado, while cvs. Arinto and Syrah presented similar values. The Arinto cultivar showed the highest number of visible berries, followed by Cabernet, while Encruzado and Syrah presented the lowest values. The largest berries were observed on the cultivar Encruzado, followed by Arinto, while Syrah and Cabernet presented the smallest values. Regarding bunch compactness notations, cvs. Arinto, Cabernet, and Encruzado presented the same mode (7, dense bunch), while cv. Syrah presented a mode of 3 (loose bunch). All of the analyzed variables showed high coefficients of variation within the same cultivar, between 30% and 62%, with the lowest being for BP and the highest for Bw and Tb.

Relationships between Image Features and Bunch Weight
Figure 3 presents the linear regressions between all image-based features (predictors) and bunch weight, while Table 3 presents the respective regression equations, along with R² and RMSPE. The average berry area did not have a significant relationship with bunch weight (R² = 0.02 on the combined data set) and thus is not shown in Table 3. However, bA is still an important feature for the objective of this work, as shown in the Discussion section.
Table 3. Fitted models resulting from the linear regression of bunch weight (Bw; g; response) on bunch area (BA; cm²), visible berry number (Vb), total berry number (Tb), and bunch perimeter (BP; cm), and respective statistical deviance measures for each of the four cultivars (n = 30) and for the combined data (n = 120) on the training set.
The results show that linear regressions between bunch weight and bunch area presented high and significant R² for all cultivars and combined data, with the highest R² being observed for cv. Cabernet. The same cultivar also presented the lowest RMSPE (Table 3). However, the regression lines for this relationship presented significantly different slopes among cultivars (Figure 3A), except between cvs. Encruzado and Arinto. The same trends were true for the visible berries and bunch weight regression models, in this case with cvs. Syrah and Cabernet also presenting an equal slope between their regression lines (Figure 3B). The regression lines for the relationship between Bw and BP presented slope differences among cultivars that mirrored the ones mentioned for visible berries, in this case even more clearly (Figure 3C), with the slopes from cvs. Encruzado and Arinto being approximately six to eight times larger than the slopes from the cvs. Syrah and Cabernet regression lines.
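The slope comparison between cultivars can be illustrated with a small sketch. It uses one common form of the t statistic for two independent regression slopes, with unpooled standard errors and n1 + n2 − 4 degrees of freedom; this particular form, the function names, and the example data are assumptions, as the text does not specify which variant of the test was applied.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit; returns slope, intercept, and slope standard error."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(ss_res / (n - 2) / sxx)
    return slope, intercept, se_slope


def slope_difference_t(x1, y1, x2, y2):
    """t statistic and degrees of freedom for the difference between two slopes."""
    b1, _, se1 = fit_line(x1, y1)
    b2, _, se2 = fit_line(x2, y2)
    t_stat = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    return t_stat, len(x1) + len(x2) - 4


# Hypothetical predictor/response pairs for two cultivars.
x = [1.0, 2.0, 3.0, 4.0]
steep = [2.1, 3.9, 6.2, 7.8]
shallow = [1.1, 1.9, 3.2, 3.8]
t_stat, df = slope_difference_t(x, steep, x, shallow)
```

The resulting statistic would then be compared against the Student's t distribution with the returned degrees of freedom to decide whether the slopes differ significantly.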

Multiple Regression Model
In order to test the hypothesis that bunch weight can be estimated, regardless of cultivar and bunch shape, all the assessed variables were included in a forward stepwise regression analysis. In sum, for this analysis, all image-based features, both original and computed ratios, were considered. Table 4 shows the variable selection summary of the forward stepwise regression analysis. The first variable selected was Vb with a high partial R², followed by BA_BP, which contributed an additional 10% to the model R². The variables BA_avg(bA), BA, and BP were also added to the model, but with a lower contribution, with 3% in total. In the last step, the function excluded BA_BP to avoid collinearity issues, with the final model retaining the same R² and a lower AIC. The final multiple regression model is presented in Equation (1).
Bw = −31.451 + 4.008 × BA − 1.596 × BP + 3.055 × Vb − 3.644 × BA_avg(bA)    (1)
Table 4. Variable selection summary of the forward stepwise regression analysis, on the training set, between bunch weight (response) and bunch area (BA); bunch perimeter (BP); visible berry number (Vb); average berry area (bA); and the ratios between BA and BP (BA_BP), BA and Vb (BA_Vb), and BA and average berry area (BA_avg(bA)).
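For illustration, Equation (1) can be written as a small function. The coefficients are those reported in Equation (1); the function name, argument names, and input values are hypothetical and do not correspond to any measured bunch.

```python
def estimate_bunch_weight(ba_cm2, bp_cm, vb, ba_avg_ba):
    """Estimated bunch weight (g) from Equation (1).

    ba_cm2    -- bunch projected area, BA (cm^2)
    bp_cm     -- bunch perimeter, BP (cm)
    vb        -- visible berry number, Vb
    ba_avg_ba -- ratio BA / cultivar-average berry area, BA_avg(bA)
    """
    return (-31.451
            + 4.008 * ba_cm2
            - 1.596 * bp_cm
            + 3.055 * vb
            - 3.644 * ba_avg_ba)


# Hypothetical input values, chosen only to show the arithmetic.
weight_g = estimate_bunch_weight(ba_cm2=100.0, bp_cm=50.0, vb=60, ba_avg_ba=40.0)
```

For these inputs the terms are −31.451 + 400.8 − 79.8 + 183.3 − 145.76, which gives approximately 327.1 g.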
Figure 4 presents the relationship between the observed and estimated values of bunch weight using the model described in Equation (1) on the training and test sets, with all of the cultivars combined. Visual observation showed a good agreement between the data on both sets, which was corroborated by the statistical measures of validation shown in Table 5.
Table 5 shows the results of the linear regressions between the actual and estimated bunch weight in the test set using the model described in Equation (1), separated for each cultivar and for all of the cultivars combined. All cases presented a high and significant R², with the highest coefficients being shown for cvs. Encruzado and Arinto and for the combined data set. These two cultivars also presented a regression model with an RMSPE below 20%, while the combined data presented a slightly higher RMSPE value, but still below 30%. The Cabernet cultivar regression model achieved a lower R², which was still high and significant, and higher than the one shown by cv. Syrah. The selected model (Equation (1)) showed the lowest absolute bias on the data set for cv. Encruzado, while the highest bias was observed for cv. Syrah, with a tendency to overestimate Bw for cvs. Arinto and Syrah and for the combined data. The data sets of cvs. Arinto and Encruzado and the combined data presented the highest EF values (all above 0.80), followed by cv. Cabernet. All regression lines presented slope and intercept values not significantly different from 1 and 0, respectively, except for cv. Syrah. The cv. Syrah data set presented a high RMSPE, which, along with its lower R², low EF, and high bias (>20%), indicated that this was the data set where the model described in Equation (1) achieved the worst results.
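The validation measures reported in Table 5 can be sketched as below. The exact formulas are not given in the text, so the definitions here are assumptions based on common usage: R² as the squared Pearson correlation between observed and estimated values, RMSPE as the root mean square of the per-bunch relative errors (in %), bias as the mean estimation error relative to the observed mean (in %), and EF as one minus the ratio of the residual to the total sum of squares. The function name and the example values are hypothetical.

```python
import math


def evaluate(observed, estimated):
    """Return R2, RMSPE (%), bias (%), and modelling efficiency (EF)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_est = sum(estimated) / n
    sxx = sum((o - mean_obs) ** 2 for o in observed)
    syy = sum((e - mean_est) ** 2 for e in estimated)
    sxy = sum((o - mean_obs) * (e - mean_est) for o, e in zip(observed, estimated))
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    rel_sq = sum(((e - o) / o) ** 2 for o, e in zip(observed, estimated))
    return {
        "R2": sxy ** 2 / (sxx * syy),
        "RMSPE": 100.0 * math.sqrt(rel_sq / n),
        "bias": 100.0 * (mean_est - mean_obs) / mean_obs,  # positive = overestimation
        "EF": 1.0 - ss_res / sxx,
    }


# Hypothetical observed and estimated bunch weights (g).
metrics = evaluate([100.0, 200.0, 300.0], [110.0, 190.0, 330.0])
```

Under these definitions, an EF of 1 indicates perfect agreement, while an EF at or below 0 means the model performs no better than using the observed mean.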

Discussion
To date, several works have focused on developing accurate and automatic ways to identify grapevine yield components such as berries and grape bunches in 2D images for yield estimation purposes [2,24,32]. However, past research has not focused on optimizing the intermediate step for converting what is identified in the image into mass (bunch weight or total plant weight). This task can be challenging to generalize considering the morphological variability of the fruit within and between different cultivars [26,33].
In our study, when comparing the performance of the predictors Vb and BA for estimating bunch weight, we found that, despite the similar range of values of R² and RMSPE, the highest and lowest R² of the regression models were not always found on the same cultivar data set. Indeed, while the models based on the predictor BA showed the highest R² in the cv. Cabernet data set and the lowest in the combined data set, the models based on the predictor Vb had the highest R² in the cvs. Arinto and Encruzado data sets and the lowest in the cv. Syrah data set. The good agreement between these bunch features and bunch weight was similar to the results presented in previous research [1,34], but, to the best of our knowledge, these features have never been used together in the same estimation model.
A visual observation of the scatter plots (Figure 3) shows that Bw presents a linear trend with BA (Figure 3A) and BP (Figure 3C), where power relationships would be expected. The departure from the geometrical relationship between the weight and the area or perimeter of a sphere might be caused by occluded berries and irregular depth along the bunch (e.g., varying bunch compactness). On a single berry, regression lines with power equations would probably have the best modeling performance. With a whole bunch, however, the same perimeter or area might be associated with a portion of the bunch that has an indefinite number of berries behind the visible ones, resulting in a changing depth. Regarding the use of BP to predict Bw, we observed a lower fit when compared to the models based on BA and Vb. This can be explained by the fact that BP can vary greatly even in cases of similar bunch weight (Table 2), possibly due to different bunch shapes. However, by providing information on bunch morphology, BP can be an important variable to be included in a multiple regression model and, hence, add accuracy to a generalized weight estimation model.
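The expected power relationships can be made explicit with a short derivation for an idealized spherical berry. This is only a sketch: the radius r and the uniform density ρ are symbols introduced here for illustration, and real berries and bunches deviate from this geometry.

```latex
% Idealized sphere of radius r and uniform density \rho (illustrative assumptions)
A = \pi r^{2}, \qquad P = 2\pi r, \qquad m = \rho\,\tfrac{4}{3}\pi r^{3}
% Eliminating r gives the mass as a power function of projected area or perimeter:
m = \frac{4\rho}{3\sqrt{\pi}}\,A^{3/2}, \qquad m = \frac{\rho}{6\pi^{2}}\,P^{3}
```

Under this idealization, mass would grow with the 3/2 power of the projected area and with the cube of the perimeter, which is why the linear trends observed for whole bunches are notable.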
Bunch features assessed in our work (Table 2) present many interactions between each other and among cultivars, which might hint towards indicators of bunch morphology. For example, cv. Encruzado presents a higher average berry area and a much heavier average bunch than cv. Syrah. At the same time, both cvs. had almost the same total berry number, showing the importance of berry size when estimating bunch weight using berry number as an estimator. This has been previously explored in [15], either by using historical data or through direct measurements of the average berry weight. Automatic berry weight estimation was explored in Mirbod et al. [35] and Roscher et al. [36] using the automatically obtained average berry area. Another example that reinforces the need to use more than one bunch feature in the models for Bw estimation is, again, the case of cv. Encruzado, where the average BA was almost as low as for cv. Syrah, even though the cv. Syrah bunches were much lighter. Further examples can be given, like the case of the average BP of cv. Syrah, which presented the second-highest value, even though, again, this cultivar presented much smaller bunches than the other cultivars. A final example of the bunch morphology differences that can be revealed by comparing berry attributes of cv. Encruzado and cv. Syrah is that both cvs. presented a similar average number of visible berries, but cv. Encruzado showed an average berry size that was almost twice that of cv. Syrah. Most of these observations can be corroborated visually, by inspecting the bunch images. These observations indicate that many of these bunch features can help differentiate bunch shapes and architectural traits when analyzed together and, in turn, add accuracy to the bunch weight estimation models, as was briefly mentioned by Di Gennaro et al. [24] and Ivorra et al. [33].
Furthermore, because the mentioned bunch features (bunch area, bunch perimeter, and average berry area) could all be obtained automatically from images, it was considered relevant to use such features as predictors of bunch weight, even though they might not always show high correlations between them. This was the reason for computing the derived features BA_BP, BA_Vb, and BA_avg(bA). In addition, in this way, the shape differences could even be automatically identified on different bunches of the same cultivar in cases of internal variability, and the algorithm would not require any previous context to be effective for different cultivars.
In our results, BA_BP and BA_avg(bA) presented high and significant Pearson correlation coefficients with OIV bunch compactness indexes (code 204) of 0.73 and 0.60, respectively. As this is a complex trait by itself and goes beyond the scope of this work, it was not further explored. However, this correlation shows that the ratios between these variables can indeed be interesting to classify bunch architectural traits, as mentioned above. To illustrate the rationale behind these ratios, consider BA_BP for two bunches with the same area, one loose and the other compact: the loose one would have a higher perimeter, because its outline follows more individual berries rather than clusters of touching berries. It would therefore have a lower BA_BP ratio, just as it would have a lower OIV bunch compactness index (code 204). On the other hand, BA_avg(bA) relates to the average number of berries per unit of area, which also increases or decreases depending on bunch compactness. These relationships, to the best of our knowledge, have never been explored for bunch weight estimation. These variables are key in a multiple regression analysis that attempts to estimate the weight for several bunch shapes. A similar approach was explored in [20], where the authors explored several bunch features and achieved a multicultivar method to estimate bunch compactness. However, most of the features used in that work were difficult to obtain through automatic image analysis.
The stepwise regression analysis computed in this study (Table 4) selected all original features (BA, BP, and Vb), except bA, which was included within the ratio BA_avg(bA). The variable Vb was the first to be selected with a very high partial R², reinforcing what was observed in the simple linear regressions analyzed above, and in accordance with what has previously been reported in other works for different cultivars [1,22]. The ratio BA_BP also presented a high contribution to the stepwise regression model, possibly because it includes the other main bunch trait that explains Bw (BA), and also because of what this ratio represents regarding bunch morphology, as mentioned above. In the end, this ratio was removed from the model (last step of the stepwise regression), but the two variables remained, indicating the importance of their interaction, even when BP showed a low R² with Bw. It is important to highlight that none of the studied variables requires extra labor, as they are all features that can be obtained automatically from images [12,15,33,37].
The selected model (Equation (1)) presented a very good fit on the test set (Table 5) and was effective at estimating the bunch weight of different cultivars with very distinct bunch morphology, presenting an increase in R² of 20%, 14%, and 35% compared to models that used the single predictors BA, Vb, or BP, respectively, on the combined data set (Table 3). In cvs. Arinto and Syrah, the model overestimated Bw, possibly for different reasons. In the case of cv. Arinto, the differences in bunch shape might be the main issue. With bigger and wider bunches, this cultivar presents a higher portion of extremities that have different berry occlusion rates than the center of the bunch, which was not considered by the model. The overestimation of Bw for cv. Syrah was possibly related to the fact that this cultivar presented much lighter bunches than the other cultivars. With the model being fitted with three other cultivars that present an average bunch weight between 161 g and 424 g, an overestimation was expected for bunches with less than half the average weight, probably because of the same reasons mentioned for cv. Arinto. The model did not achieve higher R² values for cvs. Cabernet or Syrah than the ones presented by a cultivar-specific model based only on BA (Table 3). Furthermore, especially in the case of cv. Syrah, the resulting RMSPE was above 60%. Despite the good overall model performance, when working with cv. Syrah, better results could still be obtained with the simpler approaches, using a cultivar-specific model. However, as explained above, in this study, cv. Syrah showed abnormal bunch feature values. For this reason, it is relevant to continue this research and study this approach with more cultivars from different sites and different seasons.
The best bunch weight estimation results were obtained for cultivars that presented higher slopes on all single relationships, as shown in Figure 3 (cvs. Arinto and Encruzado). This behavior might be explained by the fact that cvs. Encruzado and Arinto bunches presented a lower bunch area, visible berry number, and bunch perimeter per unit of bunch weight, which might indicate that there was a higher bunch density and, consequently, more berry-by-berry occlusion. In fact, cv. Arinto showed an average of 37% of visible berries per bunch, while cvs. Cabernet, Encruzado, and Syrah presented 53%, 57%, and 66%, respectively. This makes sense, as the Arinto bunches were very large and had increased dimensions in all axes. On the other hand, even though cv. Encruzado did not present the same trend in berry visibility, this cultivar presented the largest average berry area, which could also cause a higher percentage of berry occlusion by other berries and, in turn, could be the reason behind the higher slopes of the relationships shown in Figure 3. Furthermore, heavier bunches presented a slightly higher weight estimation error on both the training and test sets (Figure 4), which, again, could be caused by the fact that on bigger bunches, there was a higher fraction of berry-by-berry occlusion (on average, 63.2%, 46.8%, 43.5%, and 34.5% of berries were occluded on bunches of cvs. Arinto, Cabernet, Encruzado, and Syrah, respectively). The selected model presented a final RMSPE = 25.9%, which is very promising considering the large difference in average bunch weight between cvs. Arinto (434 g) and Syrah (72 g).
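The visible and occluded berry fractions quoted above are complementary and follow directly from Vb and Tb. A minimal sketch, with hypothetical function names and berry counts, is:

```python
def visible_fraction(visible_berries, total_berries):
    """Share of the berries of a bunch that are visible in the 2D image (Vb / Tb)."""
    if total_berries <= 0:
        raise ValueError("total_berries must be positive")
    return visible_berries / total_berries


def occluded_fraction(visible_berries, total_berries):
    """Share of berries hidden behind other berries (1 - Vb / Tb)."""
    return 1.0 - visible_fraction(visible_berries, total_berries)


# Hypothetical bunch: 92 visible berries out of 250 in total.
visible = visible_fraction(92, 250)
occluded = occluded_fraction(92, 250)
```

For this hypothetical bunch, 36.8% of the berries are visible and 63.2% are occluded, which matches the proportions reported for cv. Arinto.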
In our work, differences among years and sites were not explored, as our data were limited in this regard. However, both spatial and temporal variability are caused by differences in the number and/or weight of the bunches [38]. Hence, bunch weight estimation methods that achieve accurate results on varying bunch shapes and sizes, such as the ones presented in this work (Table 2), are likely to also be accurate across sites and in different years.
As mentioned before, all of the analyzed features could be collected from a vine image in a realistic vineyard scenario [12,15,34,37]. However, we predict that the application of the studied approach in such conditions would be subject to several challenges and adaptations, such as (i) image resolution, which would be particularly important for extracting features such as bunch perimeter and visible berries, as these require more detail and, thus, higher resolution if images are to be taken from a larger distance; (ii) bunch occlusion by leaves, where recent works have explored ways to estimate the occluded bunches [39,40], but the challenge remains unsolved; (iii) extracting features from occluded bunches, as even if occluded bunches are estimated, it will be impossible to obtain their exact area, visible berries, or perimeter, so ratios between these features on the visible portion of the bunches may be a better option; and (iv) robust segmentation methods, as segmentation is the step that precedes weight estimation and is crucial in a vineyard scenario.

Conclusions
To obtain a generalized, accurate model for grape bunch weight, this paper compared the explanatory potential of several image-based bunch features for estimating the bunch weight of four grapevine cultivars. Several simple linear regression models between the studied bunch features and bunch weight presented a strong goodness of fit, showing that bunch weight could be estimated based on single image-based variables, corroborating what has been reported in recent research. However, the resulting regression models were significantly different from each other, depending on the grapevine cultivar. Along with the original image features, other derived features were computed to attempt to explain the bunch morphological differences. The resulting model used a combination of bunch area, bunch perimeter, visible berry number, and average berry area, and presented a strong goodness of fit when estimating bunch weight on the test set, which included cultivars with very different bunch shapes. The generalized model did not achieve satisfactory results for cv. Syrah, which, in this study, showed unusually small bunches. Our results are an important step towards automatic yield estimation in the vineyard, as they increase the possibility of applying image-based approaches in a generalized way that is independent of grape cultivar. Further research needs to confirm these results with other cultivars, sites, and seasons. From a practical point of view, the application of this methodology in field conditions should be explored.