Mid-Season Yield Estimation in High-Productivity Vineyards: A Preliminary Modeling Framework for Free-Canopy Systems

Acevedo-Opazo, César; Cañete-Salinas, Paulo; Araya-Alman, Miguel; Ackerknecht-Espinosa, Cristian; Vásquez, Lucas; Moreno-Simunovic, Yerko

doi:10.3390/agronomy16111106


by César Acevedo-Opazo ¹, Paulo Cañete-Salinas ²,*, Miguel Araya-Alman ², Cristian Ackerknecht-Espinosa ³, Lucas Vásquez ¹ and Yerko Moreno-Simunovic ⁴

¹ Facultad de Ciencias Agrarias, Universidad de Talca, Avenida Lircay s/n, Talca 3460000, Chile

² Centro de Desarrollo del Secano Interior, Departamento de Ciencias Forestales, Facultad de Ciencias Agrarias y Forestales, Universidad Católica del Maule, Talca 3460000, Chile

³ ERDE Technology and Applied Engineering SPA, Talca 3460000, Chile

⁴ Centro Tecnológico de la Vid y el Vino, Facultad de Ciencias Agrarias, Universidad de Talca, Avenida Lircay s/n, Talca 3460000, Chile

* Author to whom correspondence should be addressed.

Agronomy 2026, 16(11), 1106; https://doi.org/10.3390/agronomy16111106

Submission received: 31 March 2026 / Revised: 25 May 2026 / Accepted: 26 May 2026 / Published: 3 June 2026

(This article belongs to the Section Precision and Digital Agriculture)


Abstract

Accurate vineyard yield estimation is essential for harvest planning, resource allocation, and economic decision-making, particularly under conditions of high spatial variability. Traditional sampling-based methods are labor-intensive, destructive, and prone to error, especially in high-productivity free-canopy systems. This study developed and evaluated predictive models for commercial irrigated vineyards of Carménère and Chardonnay in Chile’s Maule Region across two growing seasons (2023–2025). Structural yield components, physiological measurements, and UAV-derived multispectral indices (NDVI, GNDVI, NDRE) were collected from georeferenced sampling grids. Modeling approaches included linear regression, stepwise selection, and machine learning algorithms (Random Forest, Multilayer Perceptron). Validation results showed that cluster number was the primary driver of yield variability, explaining up to 40% of variation. Incorporating physiological and spectral variables improved accuracy, with the best models (least squares and MLP) achieving R² values up to 0.66 and reducing errors to 12–15%. Spatial yield maps reproduced intra-vineyard variability patterns, demonstrating that integrating plant-level and canopy-level data substantially enhances yield prediction. These findings provide a robust framework for precision viticulture applications.

Keywords:

precision viticulture; yield components; plant water status; multispectral remote sensing; machine learning; artificial intelligence

1. Introduction

Mid-season yield estimation represents a critical task in viticulture, as it enables grape growers to adequately prepare for both vineyard harvest and subsequent winery operations [1,2,3,4]. Inaccurate forecasting has been shown to generate adverse consequences across the global wine industry [5]. Several studies have reported errors ranging from 15% to 30% in yield estimates derived from manual field sampling [6,7,8]. Similarly, data from winegrowers in central Chile indicate prediction errors of 20% to 30% across different manual monitoring methodologies, with errors exceeding 40% in years when yield variability is strongly influenced by abiotic factors such as frost or drought [2]. These findings underscore the need for improved yield estimation methodologies capable of reducing error and providing growers with more reliable forecasts.

Estimating vineyard yields is particularly challenging because of high inter-plant variability within blocks, largely associated with differences in vine vigor and soil properties. Current yield prediction practices, which combine historical vineyard yield records with in-field measurements of cluster weight prior to harvest, are labor-intensive, costly, inaccurate, spatially sparse, destructive, and reliant on subjective observations [7,9,10]. Typically, these approaches involve sampling a small portion of the vineyard, usually 1–2% of the vines, and extrapolating the results to the entire field. Most methods advocate random sampling under the assumption that yield variability is random and normally distributed [11]. They also tend to account for operational constraints, such as minimizing sampling time while ensuring accurate measurement of yield components. However, because sample size is generally small relative to vineyard spatial variability, these approaches often produce inaccurate and spatially biased yield predictions [12]. As a result, there remains a substantial gap between yield prediction methods currently available and the practical needs of winegrowers needing to make key decisions for more precise vineyard management.

Analyses of an extensive database of grapevine yields, encompassing a wide range of cultivars and climatic conditions from cold to warm regions, consistently show that the number of clusters per plant explains approximately 60% of the seasonal variability in grapevine yield [13]. Yield fluctuations are less sensitive to the number of berries per cluster (ca. 30%) and even less sensitive to berry size (ca. 10%) [2,3,5,6,7,8,9,10,14,15]. Although the number of clusters per plant is determined by bud fertility and budburst, most of the variation is attributable to the number of clusters per shoot. This finding is particularly relevant because it suggests that yield stabilization depends on understanding and managing the number of clusters produced by vines. In the context of yield prediction, the greatest effort should therefore be directed towards accurately quantifying cluster number. Incorporating supporting information on cluster development (i.e., berries per cluster), together with high-spatial-resolution ancillary information acquired through remote sensing platforms such as drones and satellites, should improve yield estimation accuracy by accounting for vineyard spatial variability [16,17,18,19].

To obtain a rapid and acceptable estimate of yield, it has been proposed to measure the yield components (the number of clusters per plant, the number of berries per cluster, and berry weight) at different stages of vine development [12]. This approach has subsequently been reinforced and refined by recent quantitative and machine learning-based studies [1,15]. In this framework, the average number of clusters is estimated early in the season, at flowering, whereas average cluster weight is measured before harvest. This timing is advantageous because flowers are more visible during this period, which lowers measurement error and allows a greater number of sites to be sampled in less time. The method is particularly appealing because it explicitly considers the phenological stages of the vine when defining yield potential. In addition, it has been proposed to analyze the coefficient of variation of both cluster number per vine and cluster weight in order to determine optimal sample sizes and reduce yield estimation error. The main limitation of this procedure, however, is the large number of sampling sites required to obtain an accurate estimate. Alternatively, a modified approach based on the method of [7,12] incorporates historical yield information, achieving an estimated error of approximately 20%.

In response to operational limitations, other authors have developed alternative methods that achieve better performance in yield estimation. Some of these approaches propose the use of sensors to partially or completely replace field measurements, including remote sensing data [16,20], terahertz-wave images, simple-color imagery [5,21], and visible-light cameras [10]. In addition, ref. [22] proposed a method for real-time yield estimation based on trellis-tension monitors. Collectively, these approaches seek to improve estimation accuracy by exploiting alternative sources of information and/or complementing traditional sampling methods. In Chile, at least one commercial service currently offers yield estimates based exclusively on multispectral airborne imagery (Sociedad agrícola y comercial dayenu limitada), although no published results are available to support such applications. Nevertheless, these methods present important practical limitations at the grower level, mainly because they require high-cost sensors and complex analytical methodologies.

All the methods described above assume that yield components are randomly distributed across the field. This assumption is inconsistent with the evidence provided by yield monitoring systems embedded in grape harvesters [2,15,23,24]. In this regard, refs. [23,24] demonstrated that yield variability is not random, but rather strongly spatially structured at the field scale. Furthermore, according to scientific databases, only a limited number of reports have addressed the use of field-scale yield prediction models based on point measurements of vine yield combined with high-spatial-resolution ancillary information, such as very-high-resolution airborne imagery.

To address this problem, this study evaluates different methods for yield estimation in white and red grapevines cultivated in high-productivity free-canopy systems. To this end, multiple layers of information (structural yield components, physiological measurements of plants, and spectral information obtained through drones) are integrated within a mid-season estimation horizon. The findings demonstrate pronounced spatial variability in yield across the vineyard block and provide a preliminary foundation for proposing estimation models that range from simple approaches to more advanced frameworks. To achieve these objectives, an experiment was conducted over two consecutive seasons on two cultivars, Chardonnay (early-season harvest) and Carménère (late-season harvest), and a single yield estimation model was developed for both cultivars trained in a free-canopy system.

2. Materials and Methods

2.1. Characterization of the Experimental Site

The study was conducted across two consecutive growing seasons (2023–24 and 2024–25) in two commercial vineyards located in the Maule Region of Chile. The first site is situated in the Pencahue Valley (35°19′45.0″ S, 71°45′40.5″ W), while the second site is in the commune of San Clemente (35°27′54″ S, 71°29′42″ W). Table 1 provides a detailed summary of the principal production and management characteristics of both experimental sites.

Climate data on temperature and precipitation, together with information on soil characteristics, were obtained from references [25,26].

A total of 42 and 47 georeferenced monitoring points were established in the Carménère and Chardonnay vineyards, respectively. Each monitoring point comprised two adjacent plants (experimental unit), separated by approximately 15 m from neighboring points. Physiological, multispectral, and yield-related measurements were collected from both vines, and the resulting values were averaged to obtain a single representative value per point. This procedure minimized within-point variability and prevented pseudoreplication, thereby allowing each georeferenced point to be treated as an independent observation in subsequent analyses. The same monitoring points were evaluated during both growing seasons, ensuring consistency in the spatial assessment of vineyard variability. Point coordinates were recorded using a Trimble GeoXH DGPS receiver (Trimble Inc., Sunnyvale, CA, USA) with sub-meter accuracy (20–30 cm), corrected via a CORS/NTRIP RTK base station with Trimble Access, and subsequently refined through post-processing in Trimble Business Center (TBC).

2.2. Plant Measurements

Physiological assessments were conducted to characterize plant water status, canopy temperature, and gas exchange in all plants selected for both trials.

- Plant water status was determined using xylem water potential (Ψx), measured with a Scholander pressure chamber [27]. Leaves were enclosed in plastic film and aluminum foil 90 min prior to measurement to ensure equilibrium between xylem and leaf water potential. Measurements were performed at midday under clear sky conditions, when atmospheric demand is maximal [28], using fully expanded, healthy leaves from the middle third of the canopy.
- Canopy temperature was recorded at three canopy heights (upper, middle, and lower) using a FLIR TG167 infrared thermometer thermal camera (FLIR Systems Inc., Wilsonville, OR, USA). This passive sensor detects infrared radiation emitted by surfaces above absolute zero, generating thermograms that represent surface temperature in degrees Celsius [29].
- Gas exchange parameters, including stomatal conductance and transpiration rate, were measured with a LI-COR LI-600 porometer/fluorometer (LI-COR Biosciences, Lincoln, NE, USA). Measurements were taken at midday on healthy leaves located in the middle third of the canopy, selecting one representative leaf per plant at each monitoring point.

2.3. Yield Structural Components

Yield-related structural components were assessed at two key phenological stages (pea-sized fruit and pre-harvest). Variables measured included total number of clusters per plant, individual and total bunch weight (g), berry weight (g), and rachis weight (g).

2.4. Drone Flights

Complementary drone flights were conducted between December and February at an altitude of 80 m. A DJI Mavic 3 Multispectral unmanned aerial vehicle (UAV) (DJI Technology Co., Ltd., Shenzhen, China) equipped with a multispectral camera (RGB and infrared bands) was employed to capture high-resolution imagery. Images were processed using Pix4Dmapper software (version 4.10.0), which generates 2D and 3D models via photogrammetry (Figure 1). Vegetation indices, including NDVI, GNDVI, and NDRE, were subsequently calculated from the spectral information (Table 2).
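As an illustration of how such indices are obtained from band reflectance, the following minimal sketch uses the standard normalized-difference definitions of NDVI, GNDVI, and NDRE. The function names and the sample reflectance values are hypothetical; the formulations actually applied in the study are those listed in Table 2.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None when the denominator is zero."""
    total = a + b
    if total == 0:
        return None
    return (a - b) / total


def vegetation_indices(nir, red, green, red_edge):
    """Compute NDVI, GNDVI and NDRE from band reflectances (0-1 scale)."""
    return {
        "NDVI": normalized_difference(nir, red),        # NIR vs. red
        "GNDVI": normalized_difference(nir, green),     # NIR vs. green
        "NDRE": normalized_difference(nir, red_edge),   # NIR vs. red edge
    }


# Hypothetical reflectances for a single canopy pixel (not study data).
indices = vegetation_indices(nir=0.50, red=0.05, green=0.10, red_edge=0.30)
print(indices)
```

In practice the same calculation would be applied per pixel to the orthomosaic and then averaged over the canopy of each monitoring point.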

2.5. Statistical Analysis

Following data acquisition, a series of statistical procedures were performed to construct and validate yield estimation models. Analyses were conducted using the XLSTAT add-in for Microsoft Excel (Addinsoft, Paris, France). Initially, dimensionality reduction techniques were applied to simplify the dataset while retaining maximum explanatory power, with the objective of predicting the target variable: the harvest yield. In addition, Pearson’s correlation matrix was employed to evaluate linear relationships among variables, thereby facilitating the identification of patterns, associations, and potential redundancies [30]. Together, these approaches enabled the identification of variables exerting the greatest influence on yield per plant, which provided the basis for subsequent model construction and validation. A principal component analysis (PCA) was then conducted on the selected variables to characterize their behavior and to assess their associations across both evaluated cultivars.
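The screening step based on Pearson's correlation can be sketched as follows. This is a minimal illustration under assumed inputs, not the XLSTAT procedure used in the study: the variable names and the five data points are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rank_by_correlation(predictors, target):
    """Sort predictor names by |r| with the target, strongest first."""
    scores = {name: pearson_r(values, target) for name, values in predictors.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical monitoring-point values (not study data).
yield_kg = [8.0, 9.5, 11.0, 12.5, 10.0]
predictors = {
    "cluster_number": [30, 36, 42, 50, 38],
    "ndvi": [0.70, 0.78, 0.74, 0.80, 0.72],
}
ranking = rank_by_correlation(predictors, yield_kg)
print(ranking)
```

Ranking by the absolute value of r keeps strongly negative associations, such as those expected for water potential, alongside positive ones.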

Figure 2 summarizes the methodology and the execution of the trial.

2.6. Yield Estimation Models

Predictive models estimating final yield per plant were developed using RStudio (version 4.3.1) and XLSTAT (version 2014; Addinsoft, Paris, France). The modeling process comprised four stages: (i) selection of candidate predictor variables, (ii) calibration with data from the first season, (iii) validation with independent data from the second season, and (iv) comparison of algorithmic performance. Initially, variables exhibiting the strongest associations with yield were selected. Subsequently, additional agronomic, physiological, and spectral variables with potential explanatory value were incorporated to enhance predictive accuracy.

The database encompassed both grape varieties. Season 1 data were used for calibration, while Season 2 data served for external validation. The modeling approaches tested included Simple Linear Model (SLM), Simple Nonlinear Model (SNM), Least Squares Model (LSM), Stepwise Model (SM), Random Forest (RF), and Multilayer Perceptron (MLP) neural networks. Model performance was evaluated by comparing observed and predicted values using fit statistics.

2.6.1. Simple Linear Model (SLM)

The SLM was employed to describe the relationship between a single independent variable (X) and final yield per plant (Y). This approach was applied to identify individual predictors with a direct linear association with yield. The model was expressed as:

Y = a + b X

where ‘a’ is the intercept and ‘b’ is the slope coefficient of the line relating predicted yield per plant ‘Y’ to the continuous explanatory variable ‘X’.
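For illustration, the intercept and slope of such a model can be estimated by ordinary least squares as sketched below. The function name and the data are hypothetical and serve only to show the calculation; they are not the calibration data of the study.

```python
def fit_simple_linear(x, y):
    """Ordinary least-squares estimates (a, b) for Y = a + b X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                 # slope
    a = mean_y - b * mean_x       # intercept
    return a, b


# Hypothetical cluster counts and yields per plant (kg), not study data.
clusters = [30, 36, 42, 50, 38]
yield_kg = [8.0, 9.5, 11.0, 12.5, 10.0]
a, b = fit_simple_linear(clusters, yield_kg)

# Predicted yield for a vine carrying 40 clusters.
predicted = a + b * 40
print(a, b, predicted)
```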

2.6.2. Simple Nonlinear Model (SNM)

Because some relationships between yield and predictor variables may not follow a linear pattern, SNMs were also tested. These models allow for curved responses between the dependent variable and one or more predictor variables. The general nonlinear model was expressed as [31]:

y_{t} = f (X_{t}, β) + U_{t}

where y_{t} is the response variable, f(X_{t}, β) is a nonlinear function defined by the predictor vector X_{t} and the parameter vector β, and U_{t} is the error term.

2.6.3. Least Squares Model (LSM)

Multiple linear regression was used to estimate final yield per plant from two or more predictor variables simultaneously. This approach allowed for the evaluation of the combined effect of agronomic, physiological, and spectral variables within a single linear model. The model was expressed as [32]:

y_{i} = β_{0} + β_{1} x_{1 i} + β_{2} x_{2 i} + ε_{i}

where y_{i} is the observed yield per plant, x_{1 i} and x_{2 i} are the explanatory variables, β_{0} is the intercept, β_{1} and β_{2} are regression coefficients, and ε_{i} represents the error in observation i.
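A two-predictor model of this form can be fitted by solving the normal equations, as in the minimal sketch below. It is written in plain Python for transparency; the function names and the test data are hypothetical, and the sketch assumes the two predictors are not collinear.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Move the row with the largest pivot candidate into position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution.
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def fit_two_predictor_model(x1, x2, y):
    """Least-squares estimates (b0, b1, b2) for y = b0 + b1*x1 + b2*x2."""
    design = [[1.0, u, v] for u, v in zip(x1, x2)]
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Hypothetical, noise-free data generated from y = 1 + 2*x1 + 0.5*x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [1 + 2 * u + 0.5 * v for u, v in zip(x1, x2)]
coefficients = fit_two_predictor_model(x1, x2, y)
print(coefficients)
```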

2.6.4. Stepwise Model (SM)

The SM was used as a variable selection procedure to identify the most relevant predictors for yield estimation [33]. Variables were sequentially incorporated into the model using a stepwise selection approach. At each step, the contribution of each candidate variable was evaluated using the F-statistic. Variables that significantly improved the model fit were retained, while those that did not contribute significantly were excluded. This procedure reduced model complexity and minimized the inclusion of redundant or uninformative predictors.

The F-statistic used for variable selection was calculated as follows:

F_{s} = \frac{SS_{k - 1} / (k - 1)}{MSE_{k}}

where F_{s} is the F-statistic of the model, SS_{k - 1} is the regression sum of squares with k - 1 degrees of freedom, and MSE_{k} is the mean squared error of the residuals.
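The calculation can be written directly from the formula above. The sketch below is only a worked instance of that expression; the function name and the numerical values are hypothetical.

```python
def stepwise_f_statistic(ss_regression, k, mse_residual):
    """F_s = (SS_{k-1} / (k - 1)) / MSE_k, following the formula in the text."""
    if k < 2:
        raise ValueError("k must be at least 2 so that k - 1 > 0")
    if mse_residual <= 0:
        raise ValueError("the residual mean squared error must be positive")
    return (ss_regression / (k - 1)) / mse_residual


# Hypothetical values: regression sum of squares 120 with k - 1 = 2 degrees
# of freedom, and a residual mean squared error of 4.
f_value = stepwise_f_statistic(ss_regression=120.0, k=3, mse_residual=4.0)
print(f_value)  # 15.0
```

A candidate variable would be retained when the resulting F value exceeds the critical value for the chosen significance level.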

2.6.5. Random Forest (RF)

An RF algorithm was implemented to model the relationship between predictor variables and final yield per plant. RF is an ensemble learning method that builds multiple regression trees using bootstrap samples from the training dataset and aggregates their predictions to improve the model’s accuracy and robustness.

The model included four predictor variables: number of clusters, cluster weight, NDVI, and GNDVI. Model training was performed using the randomForest and caret packages in R. A total of 500 trees were generated, and the number of variables randomly selected at each split (mtry) was automatically optimized through cross-validation during model training.

To ensure reproducibility, a fixed random seed was used before model fitting. The model’s performance was evaluated using independent data, and the importance of the predictors was assessed using two standard metrics: the increase in mean squared error (%IncMSE), obtained by permuting each variable, and the total decrease in node impurity (IncNodePurity).
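The idea behind the permutation-based metric can be illustrated with a simplified, model-agnostic sketch: each predictor column is shuffled in turn and the resulting percentage increase in mean squared error is recorded. This is a conceptual analogue only, not the calculation performed by the R package, which works on out-of-bag samples and applies its own scaling. The function names, the stand-in predictor, and the data are hypothetical.

```python
import random


def mean_squared_error(observed, predicted):
    """Mean of squared differences between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def permutation_importance(predict, rows, observed, n_features, seed=0):
    """Percent increase in MSE after shuffling each feature column in turn."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    baseline = mean_squared_error(observed, [predict(r) for r in rows])
    importances = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        shuffled_rows = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        permuted = mean_squared_error(observed, [predict(r) for r in shuffled_rows])
        importances.append(100.0 * (permuted - baseline) / baseline)
    return importances


# Hypothetical stand-in model that uses only the first feature.
def toy_model(row):
    return 2.0 * row[0]


rows = [[1.0, 5.0], [2.0, 3.0], [3.0, 9.0], [4.0, 1.0]]
observed = [2.1, 3.9, 6.2, 7.8]
importances = permutation_importance(toy_model, rows, observed, n_features=2)
print(importances)
```

A feature the model ignores, such as the second column here, receives an importance of exactly zero because shuffling it leaves every prediction unchanged.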

The RF model identified cluster number as the most influential predictor variable, showing the highest increase in mean squared error (%IncMSE = 71.33) and the greatest contribution to node purity (IncNodePurity = 510.74). Cluster weight was the second most important variable, with a %IncMSE of 23.14 and an IncNodePurity of 213.33, suggesting a moderate contribution to yield prediction. Both spectral indices showed lower importance compared to structural variables. NDVI presented a %IncMSE of 18.79 and an IncNodePurity of 136.33, while GNDVI exhibited a similar %IncMSE (18.40) but a slightly lower contribution to node purity (101.93) [34,35].

2.6.6. Multilayer Perceptron (MLP)

An MLP neural network was implemented to model the nonlinear relationships between the predictor variables and the final yield per plant. The model consisted of a feedforward architecture with an input layer, a hidden layer, and an output layer.

The input layer included four predictor variables: cluster number, cluster weight, NDVI, and GNDVI. The hidden layer contained two neurons, automatically determined by the algorithm through an internal optimization procedure. A hyperbolic tangent activation function was used in the hidden layer to capture the nonlinear relationships, while a linear activation function was used in the output layer.
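The architecture described above (four inputs, two hidden neurons with a hyperbolic tangent activation, and one linear output) corresponds to the forward pass sketched below. All weights and biases are hypothetical placeholders chosen for illustration; they are not the fitted parameters of the study.

```python
import math


def mlp_forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass: tanh hidden layer followed by a linear output unit."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Hypothetical parameters for a 4-2-1 network (NOT fitted values).
HIDDEN_WEIGHTS = [[0.8, 0.3, 0.1, 0.1], [-0.4, 0.5, 0.2, 0.0]]
HIDDEN_BIASES = [0.0, 0.1]
OUTPUT_WEIGHTS = [1.2, 0.7]
OUTPUT_BIAS = 0.05

# Rescaled inputs in the order: cluster number, cluster weight, NDVI, GNDVI.
scaled_inputs = [0.2, -0.1, 0.4, 0.3]
prediction = mlp_forward(
    scaled_inputs, HIDDEN_WEIGHTS, HIDDEN_BIASES, OUTPUT_WEIGHTS, OUTPUT_BIAS
)
print(prediction)
```

Because the output unit is linear, the returned value is on the scale of the rescaled dependent variable and would need to be transformed back to kg per plant.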

Before model training, the input variables were rescaled using adjusted normalization, and the dependent variable was standardized to improve model convergence and stability. The dataset was randomly divided into training (62.9%), test (30.3%), and validation (6.7%) subsets. Only the first season’s database was considered for this purpose.

The model was trained using a batch learning approach with the scaled conjugate gradient optimization algorithm. Training was stopped according to early stopping criteria, defined as a consecutive step without reduction in the error function, with additional constraints including a minimum error change of 10⁻⁴ and an error ratio threshold of 0.001.

Model performance was evaluated using the sum of squared errors and the relative error in the training and test datasets. The relative importance of the predictor variables was estimated based on their contribution to the network, with the number of clusters identified as the most influential variable, followed by cluster weight, NDVI, and GNDVI.

Once the model was developed, it underwent a second validation process with the second data set [36].

2.7. Model Adjustment

Model adjustment constitutes a critical phase in evaluating predictive quality, wherein observed data are compared against model-generated predictions. Although multiple validation techniques are available, in this study the following statistical indicators were employed [37] (Table 3).
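As a hedged illustration of how observed and predicted yields can be compared, the sketch below computes three commonly used fit statistics: the coefficient of determination, the root mean squared error, and the RMSE relative to the observed mean. The choice of these three indicators is an assumption made for the example; the indicators actually used in the study are those defined in Table 3. The function name and the data are hypothetical.

```python
import math


def fit_statistics(observed, predicted):
    """Return R2, RMSE and relative RMSE (%) for paired observed/predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": rmse,
        "rrmse_pct": 100.0 * rmse / mean_obs,
    }


# Hypothetical yields per plant in kg (not study data).
observed = [8.0, 10.0, 12.0]
predicted = [9.0, 10.0, 11.0]
stats = fit_statistics(observed, predicted)
print(stats)
```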

2.8. Cartography Proposal

After estimating yield at each point within the study grid and identifying the variables most strongly associated with yield, spatial maps were generated using 3DField software (version 2.9.0.0). These yield maps (YM) enable visualization of spatial variability in yield through a color scale. The scale distinctly differentiates georeferenced areas with higher or lower expected yields, thereby facilitating more accurate agronomic decision-making [40].

2.9. Geostatistical Analysis

The real and estimated yield values per plant (kg/pl) for the two vineyards under study were used to assess the spatial structure and variability of yield maps. Geostatistical parameters were calculated for each of the proposed models using variogram analysis, including nugget (C₀), sill (C₀ + C₁), and range (r) [41,42,43]. These parameters were subsequently employed to calculate the Cambardella Index [42,44], which quantifies the proportion of yield variability explained by spatial location.

The index is defined as the ratio between the nugget and the total semivariance of the semivariogram, expressed as a percentage:

SD = \frac{C_{0}}{C_{0} + C_{1}} \times 100

where SD represents the spatial dependence of yield per plant within the field. Accordingly, values of SD ≤ 25% indicate strong spatial dependence, values between 25% and 75% indicate moderate spatial dependence, and values exceeding 75% indicate weak spatial dependence. Semi-variance analysis and parameter estimation were performed using GS+ software (version 9.0; Gamma Design Software, Plainwell, MI, USA, 2008).
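The index and its classification thresholds can be expressed as a short function. This is a minimal sketch of the formula and classes given above; the function name and the variogram parameters in the example are hypothetical.

```python
def spatial_dependence(nugget, partial_sill):
    """Return SD (%) = C0 / (C0 + C1) * 100 and its spatial-dependence class."""
    sill = nugget + partial_sill
    if sill <= 0:
        raise ValueError("the sill (C0 + C1) must be positive")
    sd = 100.0 * nugget / sill
    if sd <= 25.0:
        label = "strong"
    elif sd <= 75.0:
        label = "moderate"
    else:
        label = "weak"
    return sd, label


# Hypothetical variogram parameters: nugget C0 = 1.0, partial sill C1 = 3.0.
sd, label = spatial_dependence(nugget=1.0, partial_sill=3.0)
print(sd, label)  # 25.0 strong
```

A low SD means that little of the total semivariance is nugget (unstructured) variance, which is why small values indicate strong spatial dependence.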

3. Results and Discussion

3.1. Filtering Physiological, Multispectral, and Yield Component Data

For the development of yield estimation models, physiological measurements, vegetation indices, and agronomic variables associated with yield structural components were collected for both cultivars. Data acquisition was conducted from December to April across two study seasons, focusing on Vitis vinifera L. cv. Carménère and Chardonnay. The average yield per plant at harvest was 10.9 kg for Carménère and 8.9 kg for Chardonnay, corresponding to yields per hectare of 29,299 kg and 21,670 kg, respectively.

Principal component analysis (PCA) was initially performed using the 33 variables collected from both vineyards. Subsequently, the dataset was filtered to retain only those variables most strongly correlated with yield per plant (kg/pl), resulting in a final selection of nine variables used in the PCA (Table 4).

The principal component analysis (PCA) (Figure 3) was performed using values recorded at each measurement point for all variables under study to examine correlations across the complete dataset collected during both seasons and vineyards. The vectors in the PCA biplot indicate the direction of increase for each analyzed variable, thereby illustrating their relative contribution and association within the multivariate space.

In both PCA analyses for Season I (left panel) and Season II (right panel), components F1 and F2 explained 84% and 81% of the total variability, respectively. Component F1 was primarily associated with vegetation indices (GNDVI, NDVI), berry weight, and xylem water potential measured in December, whereas F2 was mainly explained by yield per plant (kg), cluster number, and cluster weight (g) in Season II. Furthermore, the Kaiser–Meyer–Olkin (KMO) test yielded values of 0.83 and 0.80 for seasons I and II, respectively, indicating an adequate correlation structure among the selected variables in both seasons.

In both analyses, F1 accounted for 69.2% and 66.0% of the total variance, respectively, reflecting the strong correlations among the variables aligned with this axis. Notably, xylem water potential measured in December exhibited a strong inverse correlation with structural yield components (cluster weight and berry weight) and vegetation indices (GNDVI, NDVI). This indicates that increased water restriction (more negative xylem potential) reduces vegetative expression, thereby decreasing cluster and berry weight. Conversely, cluster weight, berry weight, and vegetation indices were positively correlated, increasing in the same direction.

Component F2, which explained 14.5% and 14.7% of the variance in Seasons I and II, respectively, correlated directly with yield per plant at harvest. This was consistent across both panels, where yield-related variables (cluster weight and cluster number) were positioned. These findings align with previous observations [45], which reported that improvements in plant water status and vegetative expression positively correlate with yield through increased cluster and berry weight.

The PCA biplots also illustrate the distribution of sampling sites for both cultivars across seasons. Chardonnay consistently clustered on the right side of the graph, while Carménère was positioned on the left. This pattern indicates that Carménère exhibited greater vegetative expression (higher vegetation index values), as well as higher berry and cluster weights, compared with Chardonnay. The latter showed less favorable water status (greater restriction), as evidenced by more negative xylem water potential values. In relation to F2, observations aligned with higher yield values were consistently located in the Carménère sector, whereas Chardonnay tended to exhibit lower yields. These differences can be explained by cultivar-specific ripening dynamics: Chardonnay is an early-ripening cultivar, while Carménère ripens approximately 45 days later, depending on vineyard climate. This phenological difference largely accounts for the yield gap observed between cultivars. Furthermore, it is important to note that for both cultivars the variables explaining yield differences are the same (berry weight, cluster weight, GNDVI, NDVI, and xylem water potential), highlighting the possibility of proposing a single yield estimation model for both cultivars in a high-productivity free-canopy system. Finally, from a practical standpoint, having a single yield estimation model is of particular interest to growers, as it facilitates harvest management and decision-making.

The correlation between cluster number and final yield per plant at harvest reflects the expected behavior of vineyard productivity, as yield depends directly on biomass accumulation and vegetative expression. Yield is defined as the combination of cluster number and cluster weight [46]. Consequently, simultaneous increases in both factors result in proportional increases in harvest weight, as illustrated in Figure 2. Cluster number thus represents a quantitative determinant of yield, while cluster weight reflects the qualitative-productive dimension, linked to sugar and phenolic compound accumulation [47]. Together, these variables explain their joint contribution to total harvest weight.

The inverse relationship observed between cluster number and cluster weight may be explained by a compensatory effect among yield components. A higher number of clusters increases assimilate demand, intensifying competition among reproductive organs for carbon and nutrients, thereby reducing resources available per cluster and consequently lowering individual cluster weight [7,13]. The separation between structural and physiological variables in the principal components supports the notion that yield results from the interaction between direct productive factors and physiological processes that modulate crop efficiency throughout the season [48].

To complement the multivariate analysis and evaluate the relationship between vegetative expression and structural yield components, Pearson’s correlation matrix was constructed using the nine selected variables. This analysis identified the factors most strongly correlated with yield per plant.

Pearson’s correlation matrix revealed significant linear associations between vegetation indices and structural yield components (Figure 4). Yield per plant (kg/pl) showed a positive correlation with cluster number (r = 0.64), while a moderate negative correlation was observed with average xylem water potential measured in December (r = −0.49). Vegetation indices, including GNDVI and NDVI, exhibited positive correlations with yield per plant (r = 0.56 and r = 0.55, respectively). Additional positive associations were identified with berry weight (r = 0.53) and cluster weight (r = 0.28).

Negative correlations with xylem water potential measured using the Scholander pressure chamber indicate that increased water stress (more negative values) reduces vegetative expression, a physiological response widely reported in grapevines under water deficit conditions, where stomatal regulation plays a key role in water conservation [48]. The most relevant finding for this study is the strong positive correlation between cluster number and yield per plant (r = 0.64), confirming that fruit load is the primary determinant of yield under productive conditions. Cluster number exerts the greatest influence on yield variation, while cluster and berry weight act as secondary components [13]. This supports the inclusion of cluster number as a central predictor in the yield estimation models developed.

3.2. Yield Prediction Models

The following section presents the validation of the proposed models, ranging from simple linear approaches to more complex algorithms (Figure 5), for yield prediction at the plant level in two commercial vineyards (Vitis vinifera L. cv. Carménère and Chardonnay) managed under free-canopy systems. Although PCA consistently differentiated Chardonnay from Carménère across both evaluated seasons, this separation should be interpreted as reflecting the physiological and phenological differences inherent to each cultivar rather than as evidence of distinct yield-determining mechanisms. Indeed, the variables contributing most strongly to the observed variability were the same in both cultivars and exhibited similar relationships with final yield. This indicates that productivity was governed by common physiological processes associated with water status, vegetative expression, and the structural components of yield. From this perspective, the combined use of both cultivars is not only statistically valid but also agronomically relevant, as it enables the capture of a broader range of yield responses within the same training system. Incorporating this variability enhances the model’s predictive capacity, strengthens its robustness under diverse production scenarios, and facilitates its transferability to commercial growing conditions.

The first prediction model was developed using the variable most strongly and significantly correlated with yield per plant, namely the number of clusters per plant at harvest (Table 5). The resulting model is a simple linear regression designed to predict yield per plant at harvest (kg/pl) (Figure 5, model A).

Model A achieved an R² of 0.41, indicating that it explained approximately 41% of the total variability in fruit weight at harvest (kg/pl). In Figure 5, the dotted black line represents the regression line of Model A (y = 0.3964x + 7.2512), which should be compared with the dotted red line (y = 1.1002x), a fit forced through the origin that represents the 1:1 relationship between observed and estimated yield, with an approximate R² of 0.96. Analysis of the figure shows that within the average production range (7.0–14.0 kg/pl), Model A exhibits systematic errors at the extremes. Specifically, the model underestimates high yields (above 11.0 kg/pl), as the black line falls below the ideal line, while it tends to overestimate low yields (below 10.0 kg/pl).
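The yield at which Model A switches from overestimation to underestimation can be checked directly from the two reported line equations. The sketch below assumes that x is observed yield and y is estimated yield (the axis assignment is not stated in the text) and solves for the point where the fitted line meets a reference line through the origin; the function name is hypothetical.

```python
def crossover(slope, intercept, ref_slope=1.0):
    """Observed yield at which y = slope*x + intercept meets y = ref_slope*x."""
    if slope == ref_slope:
        raise ValueError("lines are parallel; no crossover point")
    # slope*x + intercept = ref_slope*x  ->  x = intercept / (ref_slope - slope)
    return intercept / (ref_slope - slope)


# Coefficients as reported in the text for Model A.
MODEL_A_SLOPE, MODEL_A_INTERCEPT = 0.3964, 7.2512
RED_LINE_SLOPE = 1.1002  # zero-intercept reference line reported in the text

x_vs_red = crossover(MODEL_A_SLOPE, MODEL_A_INTERCEPT, RED_LINE_SLOPE)
x_vs_identity = crossover(MODEL_A_SLOPE, MODEL_A_INTERCEPT, 1.0)

print(f"Crossover against y = 1.1002x: {x_vs_red:.2f} kg/pl")
print(f"Crossover against y = x:       {x_vs_identity:.2f} kg/pl")
```

Under these assumptions the fitted line crosses the red reference line at about 10.3 kg/pl, which agrees with the description of overestimation below roughly 10 kg/pl and underestimation above roughly 11 kg/pl. Against a strict identity line (y = x) the crossover would instead fall near 12.0 kg/pl.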

Based on the results described above, Model A, with an R² of 0.41, can be evaluated as fair. However, considering that only one variable was used (number of clusters per plant), the outcome can be regarded as noteworthy, since it explains about 41% of the yield variability using a single predictor. This finding is consistent with studies emphasizing the importance of numerical components in crop estimation [49]. Despite its simplicity, Model A offers a significant operational advantage, as it relies on an easily measurable and highly consistent variable (the number of clusters per plant). This characteristic makes it a practical tool for rapid yield estimation, particularly in production contexts where the integration of physiological sensors or remote sensing platforms is not feasible, and a straightforward methodology is required for decision-making at the intra-farm level [7,50]. In this regard, the model provides an initial approximation of the field’s production potential and can serve as a foundation for implementing differentiated management strategies, given the substantial variability in yield estimates per plant.

These results indicate that, to increase the coefficient of determination (R²) and enhance the model’s ability to explain yield variability, it is necessary to incorporate a greater number and diversity of variables, particularly those related to physiology, vegetative expression, and structural yield components. The inclusion of physiological variables allows the integration of dynamic processes associated with the plant’s functional state, such as water balance and photosynthetic activity, which directly influence biomass accumulation and the allocation of assimilates to reproductive organs [48,51].

Building upon the variable cluster number at harvest, Model B (Figure 5) was constructed using a simple nonlinear approach to better capture the dispersion pattern of the data and thereby reduce error at the extremes of the proposed model.

This model provides a nonlinear analysis with an R² very similar to that of the previous model. In the scatter plot, the dotted black line represents the regression of Model B, with an R² of 0.42, while the dotted orange line represents the 1:1 linear relationship, with an R² of 0.96. Analysis of this figure shows that Model B maintains the same trend as the previous linear regression (Model A), without eliminating the error at the extreme ranges. Specifically, it continues to underestimate the highest observed yields (close to 14 kg/pl), while showing a slight overestimation at the lowest yields (close to 13 kg/pl).

The second model yielded a coefficient of determination (R²) of 0.42, slightly higher than that of the previous one. Thus, Model B modified the structure of the relationship between the variables analyzed. This model is particularly suitable when biological relationships exhibit nonlinear behavior, as such a transformation tends to linearize them, stabilize variance, and improve the distribution of residuals [32]. In grapevines, the number of clusters is not proportionally associated with yield, since excessive increases in fruit load reduce berry size due to internal competition for photoassimilates, resulting in a physiologically nonlinear relationship [13]. In this context, the transformation applied is consistent with physiological principles, although in this study it was not sufficiently effective to produce a significant improvement in the final fit of the proposed model.

Model C was developed using the variables analyzed in PCA (Figure 3), selecting the six most strongly correlated with yield per plant (cluster number per plant, cluster weight, berry weight, xylem water potential, GNDVI, and NDVI). These variables were identified as significant for accurately estimating yield per plant at harvest. A Least Squares model was applied, incorporating the previously identified robust variables into its algorithm to improve yield-per-plant estimates.

This model improved yield-per-plant estimation, achieving an R² of 0.63, indicating that the combination of the incorporated variables explained more than 60% of the total variability analyzed. In the scatter plot, the dashed black line represents the regression of Model C (y = 0.8723x + 0.8549), which is compared with the dashed red line (y = 0.9553x), representing the 1:1 linear relationship between observed and estimated yield (ideal R²). Analysis of this figure shows that Model C reduced dispersion within the average production range, indicating that the addition of the selected variables improved the model’s precision. However, the model still systematically underestimates the highest observed yields (above 16.0 kg/pl), as the regression line remains below the ideal 1:1 line.

The incorporation of multiple variables related to cluster structural components, vegetation indices, and plant water status as predictors improves yield estimation by capturing variability associated with cluster size, berry number, and density. This variability is not accounted for when only the number of clusters is considered, since plants with the same cluster number may exhibit significantly different cluster weights. Therefore, integrating multiple productivity-related variables provides a more comprehensive representation of the vineyard’s productive structure, reducing both underestimation and overestimation of final yield [52].

A model developed by [53] demonstrated that integrating vegetation indices derived from satellite time series (NDVI, LAI) significantly enhances yield prediction. In their study, linear models achieved an R² of 0.79, while machine learning approaches, particularly neural networks, reached correlations of 0.92–0.95, as these indices reflect canopy vigor, phenology, and environmental stress. Furthermore, ref. [54] reported that incorporating temporal vegetative indices and applying nonlinear methods substantially increase R², thereby improving predictive robustness. Consequently, a clear improvement is achieved by integrating yield-related variables with vineyard NDVI information.

Finally, model D incorporates four production-related variables (number of clusters, cluster weight, average GNDVI, and average NDVI), all of which showed strong correlations with yield per plant, as indicated by the PCA and Pearson’s correlation matrix. The proposed stepwise selection model, which integrated both structural yield components and vegetative expression variables, achieved an R² of 0.60, indicating that 60% of the total variability in harvest weight (kg/pl) was explained by the combination of the four selected variables. This result is slightly lower than that of the previous model, despite using two fewer variables.

Analysis of Figure 5 (model D) reveals a marked reduction in data dispersion compared with the two initial models, and a pattern very similar to that of the third. In this regard, incorporating the four variables reduced the systematic underestimation of high yields. The highest production points (close to 18.0 kg/pl) lie near the ideal line, highlighting the predictive capacity and robustness of Model D across a wide range of yield per plant.

Several authors emphasize that the incorporation of a greater number of explanatory variables (as in models C and D), particularly when they represent complementary dimensions of the production system, tends to improve the fit of predictive models by reducing residual variance and capturing the complexity of the crop [32,55]. When comparing models C and D developed in this study with the one proposed by [8], different approaches and types of variables can be analyzed, each directly influencing the accuracy of yield prediction in grapevines. The models developed in this study integrate a wide range of physiological variables, structural yield components, and multispectral information, such as cluster weight, berry weight, cluster number, xylem water potential, and vegetation indices (NDVI, GNDVI), achieving R² values of 0.60 or higher, which can be classified as good.

3.3. Prediction Model with Artificial Neural Networks

For the development of the machine learning models (RF and MLP), the same database used to construct several of the previously presented models (Table 5) was employed, but with the inclusion of physiological variables that showed the highest correlations in the PCA and Pearson’s correlation matrix: yield (kg), cluster weight (g), cluster number, berry weight (g), xylem water potential in December (Ψ Dec (MPa)), average xylem water potential (Ψ Ave (MPa)), GNDVI in December, average GNDVI, NDVI in December, and average NDVI. These variables were incorporated to enhance the predictive capacity of the RF and MLP models, with the aim of surpassing the previously developed models (Table 6).

To structure the MLP, the input variables were connected to a series of neurons (nodes) arranged in hidden layers, in which each neuron applies an activation function to a weighted combination of its inputs. These layers combined the variables used to estimate actual yield per plant, following the methodology proposed by [56].

Figure 6 shows the yield estimated through model validation using the RF and MLP techniques. The most relevant variables for constructing the models, according to their normalized importance, were the number of clusters, cluster weight, NDVI, and GNDVI. The results are promising, with R² values of 0.62 and 0.66 for the RF and MLP models, respectively. The R² for the 1:1 relationship between observed and estimated yield was 0.97, demonstrating strong predictive capacity. Furthermore, the slopes of the 1:1 relationship were 1.10 and 1.01, respectively, indicating a slight to almost negligible overestimation. Finally, the lowest errors were observed in the MLP model, with RMSE and MAE values of 1.57 and 1.23 kg/pl, respectively. When estimation errors are converted to kg/ha and expressed relative to production, the MLP model shows an error of 12%, which is lower than the error typically observed at the wine producer level.

3.4. Comparison of Fit Level and Spatial Prediction Models

Table 7 summarizes the fit statistics used to evaluate the predictive accuracy and validity of all proposed models, including Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Standard Deviation (SD), Residual Predictive Deviation (RPD), Model Efficiency (EF), and Coefficient of Determination (R²).
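As a minimal sketch of how these statistics are commonly computed, the function below assumes the usual definitions: RPD as the standard deviation of observed values divided by RMSE, EF in the Nash-Sutcliffe form, and R² as the squared Pearson correlation between observed and estimated yield. These definitions are assumptions, since the exact formulas applied in the study are not restated in this section, and the yield values are hypothetical.

```python
import math


def fit_statistics(observed, estimated):
    """Return RMSE, MAE, SD, RPD, EF and R2 for paired yield values (kg/pl)."""
    n = len(observed)
    if n != len(estimated) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    residuals = [o - e for o, e in zip(observed, estimated)]
    sse = sum(r * r for r in residuals)
    rmse = math.sqrt(sse / n)
    mae = sum(abs(r) for r in residuals) / n
    mean_obs = sum(observed) / n
    mean_est = sum(estimated) / n
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_est = sum((e - mean_est) ** 2 for e in estimated)
    sd = math.sqrt(ss_obs / (n - 1))           # sample SD of observed yield
    rpd = sd / rmse if rmse > 0 else math.inf  # assumed definition: SD / RMSE
    ef = 1.0 - sse / ss_obs                    # assumed Nash-Sutcliffe efficiency
    cov = sum((o - mean_obs) * (e - mean_est) for o, e in zip(observed, estimated))
    r2 = cov ** 2 / (ss_obs * ss_est)          # squared Pearson correlation
    return {"RMSE": rmse, "MAE": mae, "SD": sd, "RPD": rpd, "EF": ef, "R2": r2}


# Hypothetical observed and estimated yields (kg/pl), for illustration only.
observed = [8.0, 10.0, 12.0, 14.0]
estimated = [9.0, 10.0, 12.0, 13.0]
stats = fit_statistics(observed, estimated)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Note that EF and R² answer different questions: R² measures linear association and is insensitive to bias, whereas EF penalizes any departure of the estimates from the observed values.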

Comparative analysis of fit statistics was essential for selecting the most reliable models and contrasting their predictive capacity. The simple linear model (SLM) and simple nonlinear model (SNM) showed limited performance, with R² values of 0.41 and 0.42, respectively, and high MAE values (~2.10 and 1.84 kg/pl), translating into substantial errors when expressed at the hectare scale. Model efficiency was also low (0.89 and 0.72), indicating predictions far from ideal yield values.

In contrast, the least squares model (LSM), which incorporated six variables, significantly improved predictive accuracy, achieving a validation R² of 0.64 and reducing MAE to 3714 kg ha⁻¹ (15% error). The stepwise selection model (SM), based on four variables, achieved a validation R² of 0.60. Both models integrating physiological and multispectral variables (Models C and F) demonstrated superior accuracy, with RMSE values of 1.83 and 1.57 kg/pl, MAE values of 1.45 and 1.23 kg/pl (3714 and 3151 kg ha⁻¹), and EF values of 0.47 and 0.41, respectively. These improvements confirm the reliability of Models C and F for yield estimation in the evaluated vineyards.
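The link between the per-plant and per-hectare errors reported in this study (MAE of 1.45 and 1.23 kg/pl for Models C and F, and 3714 and 3151 kg ha⁻¹) can be verified with simple arithmetic. The sketch below back-calculates the planting density implied by each pair of figures; this density is inferred here and is not a value stated in this section, and the function names are hypothetical.

```python
def implied_density(error_kg_per_ha, error_kg_per_plant):
    """Planting density (plants/ha) implied by one error given in both units."""
    return error_kg_per_ha / error_kg_per_plant


def per_hectare(error_kg_per_plant, plants_per_ha):
    """Scale a per-plant error (kg/pl) to the hectare level (kg/ha)."""
    return error_kg_per_plant * plants_per_ha


# MAE values as reported in the text for Models C and F.
density_c = implied_density(3714, 1.45)
density_f = implied_density(3151, 1.23)

print(f"Implied density from Model C figures: {density_c:.0f} plants/ha")
print(f"Implied density from Model F figures: {density_f:.0f} plants/ha")
print(f"Model F MAE rescaled with Model C density: {per_hectare(1.23, density_c):.0f} kg/ha")
```

Both pairs of figures imply nearly the same density, which supports reading the per-hectare values as the per-plant MAE multiplied by a common number of plants per hectare.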

The models developed in this study were designed to predict yield at the plant level (kg/pl), which limits extrapolation and representativeness. By contrast, block- or hectare-level models, such as those proposed by [6], integrate spatial and environmental variability, thereby reducing plant-to-plant differences. Incorporating climatic variables has also been shown to improve statistical validation [57]. For example, predictive models using climate, soil, maturity indices, and historical yield data achieved relative errors of 24–25%, nearly double those reported here, underscoring the importance of integrating multispectral and physiological information.

Previous studies highlight the potential of multispectral imagery for yield estimation. Reference [58] used NDVI-derived canopy cover fraction from UAV imagery, reporting relative errors of 12–15%, comparable to this study, though requiring annual calibration. Similarly, [59] employed RGB imagery and unsupervised segmentation, achieving accuracies of 84–92%, but performance declined in vineyards with high canopy density, particularly under free-canopy systems.

Despite these advances, several limitations remain. Yield estimation requires integrating multiple layers of information, which entails high operational costs in both field and laboratory. Incorporating historical yield data would improve monitoring efficiency, while soil spatial variability (e.g., electrical resistivity) should be considered, as it represents a major source of vineyard heterogeneity affecting yield and grape quality.

Although the MLP model exhibited the best fit among the evaluated methodologies, its improvement over multiple linear regression was relatively modest. This finding suggests that much of the yield variability can be adequately explained by linear relationships among the physiological, multispectral, and structural variables considered. Consequently, machine learning models should not necessarily be regarded as universally superior alternatives, but rather as complementary tools whose utility depends on the complexity of the available data and the specific application objectives. In this context, multiple linear regression retains notable advantages related to its simplicity, interpretability, and ease of implementation in vineyard management programs. Future research should encompass a broader range of cultivars, vineyards, and evaluation seasons to more accurately assess the generalizability of different modeling approaches and to determine the conditions under which machine learning methods provide significant predictive improvements.

3.5. Mapping of Real and Estimated Yield

The following section presents the spatial analysis of yield maps generated for the prediction models evaluated for both cv. Carménère (Figure 7) and Chardonnay (Figure 8). The objective was to visually compare the spatial patterns of the real yield maps with those estimated using the algorithms proposed in this study. This comparison enables the identification of differences in spatial dependence and variability between observed and predicted yields, thereby providing insights into the reliability and practical applicability of the models for vineyard management.

The maps corresponding to the proposed models (A–F) illustrate the high spatial variability observed in yield (kg/pl). Dark gray represents the lowest yield (<9.77 kg/pl in Carménère and <7.6 kg/pl in Chardonnay), light gray indicates intermediate yield (9.7–14.5 kg/pl for Carménère and 7.6–10.6 kg/pl for Chardonnay), and white represents the highest yield (>14.55 kg/pl for Carménère and >10.6 kg/pl for Chardonnay).

Visual analysis shows that the estimated maps adequately reproduced the spatial distribution of observed yield in both cultivars. In Carménère, low productivity zones (dark gray) were concentrated in the northern and southeastern vineyard sectors, while in Chardonnay, low yields were observed in the northern and northeastern sectors. Models A (R² = 0.41) and B (R² = 0.42), despite moderate explanatory power, captured the general spatial distribution, with intermediate yield being the predominant class. Model A identified a small high-yield zone in the southeastern Carménère vineyard, whereas in Chardonnay, the highest yields were in the southern sector.

Model B for Carménère showed greater homogeneity by reducing the extent of high-yield zones. In Chardonnay, Models D and E reduced the high-yield area, while Model C (R² = 0.63) for Carménère and Model D for Chardonnay provided clearer segmentation of low-yield zones. Models D (R² = 0.61) and E (R² = 0.63) tended to overestimate high-yield zones in Carménère, but not in Chardonnay, where they provided accurate estimates. Model F (R² = 0.67) achieved the greatest spatial agreement with observed yield maps, accurately identifying both high- and low-yield zones (southeastern Carménère and southwestern Chardonnay), although it underestimated certain productivity peaks due to unexplained variability.

Overall, all models approximated the vineyard’s spatial structure despite differences in statistical performance, consistent with findings in precision viticulture studies where even models with moderate R² values captured relevant spatial patterns [50,60]. A key factor influencing spatial performance is sampling scale: yield measured at the individual plant level increases variability and reduces predictive accuracy for extreme values. Yield variability within a single plot can exceed 40–60%, depending on management and soil characteristics [61].

Strategies to reduce spatial error include increasing sampling unit size (aggregating plants), spatial data aggregation, and vineyard zoning based on vigor. Vegetation indices derived from multispectral imagery (e.g., NDVI) have proven effective in delineating homogeneous zones and reducing random sampling error [62,63]. Another limitation is the use of single-season data: interannual variability can be as significant as spatial variability, and multi-season datasets improve robustness and predictive capacity over time [64].

3.6. Spatial Analysis of Yield Maps

Table 8 presents the Cambardella index values expressed as percentages (%), calculated using yield-per-plant data [42,44]. In this analysis, the Nugget (C₀) and Sill (C₀ + C₁) values were found to be very low, reflecting the limited variation observed in Carménère and Chardonnay cultivars. In contrast, the Cambardella Index exhibited a wide distribution, ranging from 0 to 100% for both cultivars. This outcome arises because the index is derived from variogram parameters (C₀ and C₁), which, when scaled by 100, yield high values. The index can therefore be used to classify yield maps according to their spatial dependence (SD).
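The calculation described above can be sketched as follows. The index is taken here as the nugget expressed as a percentage of the sill, C₀ / (C₀ + C₁) × 100, and the class boundaries (25% and 75%) follow the thresholds commonly attributed to Cambardella et al.; only the 25% boundary is stated explicitly in this section, so the 75% boundary is an assumption, as are the function names and the variogram parameters used in the example.

```python
def cambardella_index(nugget, partial_sill):
    """Nugget-to-sill ratio in percent: 100 * C0 / (C0 + C1)."""
    sill = nugget + partial_sill
    if nugget < 0 or partial_sill < 0 or sill <= 0:
        raise ValueError("nugget and partial sill must be non-negative, with sill > 0")
    return 100.0 * nugget / sill


def spatial_dependence_class(index_pct):
    """Classify spatial dependence from the index (assumed 25% / 75% thresholds)."""
    if index_pct <= 25.0:
        return "strong"
    if index_pct <= 75.0:
        return "moderate"
    return "weak"


# Hypothetical variogram parameters (C0, C1), not values from Table 8.
examples = {"map_1": (0.02, 0.18), "map_2": (0.10, 0.10), "map_3": (0.19, 0.01)}
for name, (c0, c1) in examples.items():
    index = cambardella_index(c0, c1)
    print(name, f"{index:.0f}%", spatial_dependence_class(index))
```

Because the index is a ratio, very small nugget and sill values can still produce any percentage between 0 and 100, which is why low variogram parameters coexist with a wide range of index values.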

High spatial dependence (SD ≤ 25%) indicates that most of the variability observed in the field—such as soil fertility, soil water availability, or yield per plant—is primarily explained by structural or systematic factors rather than random variation or sampling error. Maps exhibiting strong spatial dependence (well-defined spatial structure) enable more accurate spatial interpolations using ordinary kriging, resulting in reliable and interpretable maps. In summary, strong spatial dependence suggests that field variability is largely controlled by well-defined spatial processes, and that data distribution is continuous rather than random.

Based on the results presented in Table 8, Carménère exhibited a more defined spatial structure than Chardonnay, attributable to greater homogeneity in production factors such as soil conditions and vegetative expression. This outcome is reflected in the Cambardella Index values, where yield estimates for Carménère were classified as moderate to strong, in contrast to Chardonnay, for which models A and B were categorized as weak. Consequently, yield maps for Carménère are more reliable and easier to interpret for the identification of management zones, whereas those for Chardonnay present greater challenges in modeling and interpretation.

Furthermore, nugget values were generally higher for Chardonnay, indicating greater experimental or field error in this vineyard. This suggests that yield maps for Chardonnay (particularly models A and B) are less reliable and more difficult to interpret than those for Carménère, especially from a practical standpoint in vineyard yield-related decision-making.

4. Conclusions

The results of this study demonstrate that spatial variability of yield per plant is directly associated with structural yield components, with cluster number and cluster weight emerging as the primary determinants of final yield per plant. In contrast, plant water status, together with vegetation indices, exhibited significant correlations with yield per unit area.

Simple models based on one or two easily measurable variables produced substantial estimation errors (2.10 kg plant⁻¹), reflecting low predictive accuracy compared with the more complex models evaluated. In this context, the less complex models (A and B), although limited in accuracy, represent a practical alternative for viticulturists, as their estimation errors are comparable to those reported for classical yield estimation methodologies. These models may therefore be suitable for production systems where highly precise predictions are not required.

Incorporation of plant water status and vegetation indices reduced estimation errors to 1.45 and 1.23 kg plant⁻¹ in models C and F, respectively, which achieved the highest predictive accuracy in this study. These findings confirm the importance of methodologies that integrate multiple layers of information. Specifically, combining structural yield components, plant water status, and spectral canopy characteristics allowed for a more accurate representation of the processes regulating vineyard yield, thereby reducing variability compared with traditional estimation methods commonly employed by growers.

Finally, the results suggest that predictive performance could be further enhanced by incorporating additional layers of information, such as interannual climatic variability and vineyard phenological dynamics, to capture variability not accounted for in this study. Moreover, integrating yield data from homogeneous vineyard zones, rather than individual plants, could mitigate the high variability observed. In this context, combining traditional statistical approaches with machine learning algorithms and remote sensing techniques emerges as a promising avenue for future research, offering considerable potential to improve yield estimation in modern viticultural systems.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/agronomy16111106/s1, Figure S1: Construction of yield estimation models using one or more variables. A. Simple linear model; B. Simple nonlinear model; C. Least squares model; D. Stepwise variable selection model. The coefficient of determination (R²) between observed and estimated yield is shown in black. The coefficient of determination (R²) between observed and estimated yield is shown in red, forced to follow a 1:1 ratio; Figure S2: Construction of yield estimation models using random forest (RF) and multilayer perceptron (MLP) models with one or more variables. The coefficient of determination (R²) between observed and estimated yield is shown in black. The coefficient of determination (R²) between observed and estimated yield is shown in red, forced to follow a 1:1 ratio.

Author Contributions

Conceptualization, C.A.-O. and P.C.-S.; methodology, C.A.-O. and Y.M.-S.; software, P.C.-S. and L.V.; validation, M.A.-A. and C.A.-E.; formal analysis, C.A.-O., P.C.-S. and M.A.-A.; investigation, C.A.-O., L.V. and C.A.-E.; resources, C.A.-O.; data curation, C.A.-O.; writing—original draft preparation, C.A.-O.; writing—review and editing, C.A.-O., Y.M.-S. and P.C.-S.; visualization, C.A.-O.; supervision, C.A.-O.; project administration, C.A.-O.; funding acquisition, C.A.-O. All authors have read and agreed to the published version of the manuscript.

Funding

The research that led to this report received financial support from the Chilean ANID-FONDECYT project No. 1231420.

Data Availability Statement

The data presented in this study are not publicly available due to confidentiality agreements with commercial vineyards. Access to the data may be granted upon reasonable request to the corresponding author, subject to authorization from the involved companies.

Acknowledgments

The authors would like to express their gratitude to all the technical staff for their invaluable contribution to the successful conduct of the experiments.

Conflicts of Interest

Author Cristian Ackerknecht-Espinosa is employed by ERDE Technology and Applied Engineering SPA. The authors declare no conflict of interest.

References

Andrade, C.B.; Moura-Bueno, J.M.; Comin, J.J.; Brunetto, G. Grape yield prediction models: Approaching different machine learning algorithms. Horticulturae 2023, 9, 1294. [Google Scholar] [CrossRef]
Canicattì, M.; Ferro, M.V.; Vallone, M.; Orlando, S.; Catania, P. Bayesian yield mapping and uncertainty analysis in vineyards using remote sensing data and grape harvester tracking. Precis. Agric. 2025, 26, 63. [Google Scholar] [CrossRef]
Matese, A.; Di Gennaro, S.F. Technology in precision viticulture: A state of the art review. Int. J. Wine Res. 2015, 7, 69–81. [Google Scholar] [CrossRef]
Carrillo, E.; Matese, A.; Rousseau, J.; Tisseyre, B. Use of multi-spectral airborne imagery to improve yield sampling in viticulture. Precis. Agric. 2016, 17, 74–92. [Google Scholar] [CrossRef]
Dunn, G.M.; Martin, S.R. Yield prediction from digital image analysis: A technique with potential for vineyard assessments prior to harvest. Aust. J. Grape Wine Res. 2004, 10, 196–198. [Google Scholar] [CrossRef]
Barriguinha, A.; de Castro Neto, M.; Gil, A. Vineyard yield estimation, prediction, and forecasting: A systematic literature review. Agronomy 2021, 11, 1789. [Google Scholar] [CrossRef]
Clingeleffer, P.R.; Dunn, G.M.; Martin, S.R. Crop development, crop estimation and crop control to secure quality and production of major wine grape varieties. In Proceedings of the 12th Australian Wine Industry Technical Conference, Melbourne, Australia, 7–11 October 2001. [Google Scholar]
Palacios, F.; Diago, M.P.; Melo-Pinto, P.; Tardaguila, J. Early yield prediction in different grapevine varieties using computer vision and machine learning. Precis. Agric. 2023, 24, 407–435. [Google Scholar] [CrossRef]
Araya-Alman, M.; Leroux, C.; Acevedo-Opazo, C.; Guillaume, S.; Valdés-Gómez, H.; Verdugo-Vásquez, N.; Pañitrur, C.; Tisseyre, B. A new localized sampling method to improve grape yield estimation of the current season using yield historical data. Precis. Agric. 2019, 20, 445–459. [Google Scholar] [CrossRef]
Nuske, S.; Achar, S.; Bates, T.; Narasimhan, S.; Singh, S. Yield estimation in vineyards by visual grape detection. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), San Francisco, CA, USA, 25–30 September 2011; pp. 2352–2358. [Google Scholar] [CrossRef]
Arnó, J.; Martínez-Casasnovas, J.A.; Uribeetxebarria, A.; Escolà, A.; Rosell-Polo, J.R. Comparing efficiency of different sampling schemes to estimate yield and quality parameters in fruit orchards. Adv. Anim. Biosci. 2017, 8, 471–476. [Google Scholar] [CrossRef]
Wolpert, J.A.; Vilas, E.P. Estimating vineyard yields: Introduction to a simple, two-step method. Am. J. Enol. Vitic. 1992, 43, 384–388. [Google Scholar] [CrossRef]
Keller, M. The Science of Grapevines: Anatomy and Physiology, 3rd ed.; Academic Press: London, UK, 2020. [Google Scholar]
Smith, D.E.; Solgaard, H.S.; Beckmann, S.C. Changes and trends in alcohol consumption patterns in Europe. J. Consum. Stud. Home Econ. 1999, 23, 247–260. [Google Scholar] [CrossRef]
Taylor, J.A.; Bates, T.R.; Jakubowski, R.; Jones, H. Machine-learning methods to identify key predictors of site-specific vineyard yield and vine size. Am. J. Enol. Vitic. 2023, 74, 0740013. [Google Scholar] [CrossRef]
Aquino, A.; Diago, M.P.; Millán, B.; Tardáguila, J. A New Methodology for Estimating the Grapevine Berry Number per Cluster Using Image Analysis. Biosyst. Eng. 2017, 156, 80–95. [Google Scholar] [CrossRef]
Taylor, J.A.; Sánchez, L.; Sams, B.; Haggerty, L.; Jakubowski, R.; Djafour, S.; Bates, T.R. Evaluation of a com-mercial grape yield monitor for use mid-season and at-harvest. OENO One 2016, 50, 57–63. [Google Scholar] [CrossRef]
Jewan, S.Y.Y.; Gautam, D.; Sparkes, D.; Singh, A.; Cogato, A.; Murchie, E.; Pagay, V. Integrating hyperspectral, thermal, and ground data with machine learning algorithms enhances the prediction of grapevine yield and berry composition. Remote Sens. 2024, 16, 4539. [Google Scholar] [CrossRef]
Laurent, C.; Oger, B.; Taylor, J.A.; Scholasch, T.; Metay, A.; Tisseyre, B. A Review of the Issues, Methods and Perspectives for Yield Estimation, Prediction and Forecasting in Viticulture. Eur. J. Agron. 2021, 130, 126339. [Google Scholar] [CrossRef]
Martínez-Casasnovas, J.A.; Agelet-Fernández, J.; Arnó, J.; Ramos, M.C. Analysis of Vineyard Differential Management Zones and Relation to Vine Development, Grape Maturity and Quality. Span. J. Agric. Res. 2012, 10, 326–337. [Google Scholar] [CrossRef]
Diago, M.P.; Correa, C.; Millán, B.; Barreiro, P.; Valero, C.; Tardaguila, J. Grapevine Yield and Leaf Area Estimation Using Supervised Classification Methodology on RGB Images Taken under Field Conditions. Sensors 2012, 12, 16988–17006. [Google Scholar] [CrossRef]
Blom, P.E.; Tarara, J.M. Trellis tension monitoring improves yield estimation in vineyards. HortScience 2009, 44, 678–685. [Google Scholar] [CrossRef]
Taylor, J.A.; Tisseyre, B.; Bramley, R.G.V.; Reid, A. A Comparison of the Spatial Variability of Vineyard Yield in European and Australian Production Systems. In Precision Agriculture ‘05; Stafford, J.V., Ed.; Wageningen Academic Publishers: Wageningen, The Netherlands, 2005; pp. 907–914. [Google Scholar] [CrossRef]
Taylor, J.A.; McBratney, A.B.; Whelan, B.M. Establishing management classes for broadacre agricultural production. Agron. J. 2007, 99, 1366–1376. [Google Scholar] [CrossRef]
CIREN; CORFO. Estudio Agrológico Región del Maule; Centro de Información de Recursos Naturales (CIREN): Santiago, Chile, 2023. [Google Scholar]
Gallardo, A.; Mella, C.; Saavedra, N. Estudio de Suelos del Valle del Maule; CIREN: Santiago, Chile, 1994. [Google Scholar]
Scholander, P.F.; Bradstreet, E.D.; Hemmingsen, E.A.; Hammel, H.T. Sap pressure in vascular plants: Negative hydrostatic pressure can be measured in plants. Science 1965, 148, 339–346. [Google Scholar] [CrossRef] [PubMed]
Galvez, D.A.; Landhäusser, S.M.; Tyree, M.T. Root Carbon Reserve Dynamics in Aspen Seedlings: Does Simulated Drought Induce Reserve Limitation? Tree Physiol. 2011, 31, 250–257. [Google Scholar] [CrossRef] [PubMed]
Gade, R.; Moeslund, T.B. Thermal cameras and applications: A survey. Mach. Vis. Appl. 2014, 25, 245–262. [Google Scholar] [CrossRef]
Rodgers, J.L.; Nicewander, W.A. Thirteen Ways to Look at the Correlation Coefficient. Am. Stat. 1988, 42, 59–66. [Google Scholar] [CrossRef]
Seber, G.A.F.; Wild, C.J. Nonlinear Regression; Wiley: New York, NY, USA, 2003. [Google Scholar]
Montgomery, D.C.; Peck, E.A.; Vining, G.G. Introduction to Linear Regression Analysis, 5th ed.; Wiley: Hoboken, NJ, USA, 2012. [Google Scholar]
Thompson, M.L. Selection of variables in multiple regression: Part I. A review and evaluation. Int. Stat. Rev. 1978, 46, 1–19. [Google Scholar] [CrossRef]
Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
Liaw, A.; Wiener, M. Classification and regression by randomForest. R News 2002, 2, 18–22. Available online: https://journal.r-project.org/articles/RN-2002-022/RN-2002-022.pdf (accessed on 28 February 2026).
Cañete-Salinas, P.; Ogass, K.; Saavedra-Pérez, N.; Espinosa-Ackerknecht, C.; Urzua, J.; Guajardo, J.; Errázuriz-Montanares, I.; Garrido-Faúndez, P.; Acevedo-Opazo, C. European Hazelnut Yield Prediction Model for Low and High Production Years, Using a Neural Network and Multispectral Images. Ecol. Conserv. Open Access 2024, 4, 555644. [Google Scholar] [CrossRef]
Mayer, D.G.; Butler, D.G. Statistical validation. Ecol. Model. 1993, 68, 21–32. [Google Scholar] [CrossRef]
Ahmed, S. A software framework for predicting the maize yield using modified multi-layer perceptron. Sustainability 2023, 15, 3017. [Google Scholar] [CrossRef]
Bai, X.; Li, Z.; Li, W.; Zhao, Y.; Li, M.; Chen, H.; Wei, S.; Jiang, Y.; Yang, G.; Zhu, X. Comparison of machine-learning and CASA models for predicting apple fruit yields from time-series Planet imageries. Remote Sens. 2021, 13, 3073. [Google Scholar] [CrossRef]
Bramley, R.G.V.; Hamilton, R.P. Understanding variability in winegrape production systems 1. Within vineyard variation in yield over several vintages. Aust. J. Grape Wine Res. 2004, 10, 32–45. [Google Scholar] [CrossRef]
Rossi, R.E.; Mulla, D.J.; Journel, A.G.; Franz, E.H. Geostatistical tools for modeling and interpreting ecological spatial dependence. Ecol. Monogr. 1992, 62, 277–314. [Google Scholar] [CrossRef]
Silbernagel, J.; Lang, N.S. Spatial distribution of environmental stress indicators in Concord grape vineyards. Ecol. Indic. 2002, 2, 271–286. [Google Scholar] [CrossRef]
Júnior, V.V.; Carvalho, M.P.; Dafonte, J.; Freddi, O.S.; Vázquez, E.V.; Ingaramo, O.E. Spatial variability of soil water content and mechanical resistance of Brazilian ferralsol. Soil Tillage Res. 2006, 85, 166–177. [Google Scholar] [CrossRef]
Cambardella, C.A.; Moorman, T.B.; Novak, J.M.; Parkin, T.B.; Karlen, D.L.; Turco, R.F.; Konopka, A.E. Field-Scale Variability of Soil Properties in Central Iowa Soils. Soil Sci. Soc. Am. J. 1994, 58, 1501–1511. [Google Scholar] [CrossRef]
Medrano, H.; Escalona, J.M.; Cifre, J.; Bota, J.; Flexas, J. A ten-year study on the physiology of two Spanish grapevine cultivars under field conditions: Effects of water availability on gas exchange and yield. Funct. Plant Biol. 2003, 30, 607–619. [Google Scholar] [CrossRef]
Ojeda, H. Influence of water deficits on grapevine growth, yield components and fruit composition. In Proceedings of the International Workshop on Advances in Grapevine and Wine Research, Venosa, Italy, 15–17 September 2005; International Society for Horticultural Science: Orlando, FL, USA, 2007. [Google Scholar]
Santesteban, L.G.; Royo, J.B. Water status, leaf area and fruit load influence on berry weight and sugar accumulation of cv. Tempranillo under semiarid conditions. Sci. Hortic. 2006, 109, 60–65. [Google Scholar] [CrossRef]
Intrigliolo, D.S.; Castel, J.R. Response of grapevine cv. Tempranillo to timing and amount of irrigation: Water relations, vine growth, yield and berry composition. Irrig. Sci. 2010, 28, 113–125. [Google Scholar] [CrossRef]
Dami, I.E. Estimating Grape Yield Components in Vineyards; Ohio State University Extension: Columbus, OH, USA, 2006. [Google Scholar]
Bramley, R.G.V. Understanding Variability in Winegrape Production Systems 2. Within Vineyard Variation in Quality over Several Vintages. Aust. J. Grape Wine Res. 2005, 11, 33–42. [Google Scholar] [CrossRef]
Medrano, H.; Tomás, M.; Martorell, S.; Escalona, J.M.; Pou, A.; Fuentes, S.; Flexas, J.; Bota, J. From leaf to whole-plant water use efficiency (WUE) in complex canopies: Limitations of leaf WUE as a selection target. Crop J. 2015, 3, 220–228. [Google Scholar] [CrossRef]
Towers, P.C.; Roulet, S.E.; Poblete-Echeverría, C. Vine Yield Estimation from Block to Regional Scale Employing Remote Sensing, Weather, and Management Data. Inf. Process. Agric. 2025, 12, 195–208. [Google Scholar] [CrossRef]
Pham, H.T.; Awange, J.; Kuhn, M.; Nguyen, B.V.; Bui, L.K. Enhancing Crop Yield Prediction Utilizing Machine Learning on Satellite-Based Vegetation Health Indices. Sensors 2022, 22, 719. [Google Scholar] [CrossRef] [PubMed]
Sun, J.; Lai, Z.; Di, L.; Sun, Z.; Tao, J.; Shen, Y. Multilevel Deep Learning Network for County-Level Corn Yield Estimation in the US Corn Belt. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 5048–5060. [Google Scholar] [CrossRef]
Kuhn, M.; Johnson, K. Applied Predictive Modeling; Springer: New York, NY, USA, 2013. [Google Scholar]
Oficina de Estudios y Políticas Agrarias (ODEPA). Boletín del vino y Pisco, Enero 2025; Ministerio de Agricultura: Santiago, Chile, 2025; Available online: https://www.odepa.gob.cl/publicaciones/boletines/boletin-del-vino-y-pisco-enero-2025 (accessed on 2 February 2026).
Sirsat, M.S.; Mendes-Moreira, J.; Ferreira, C.; Cunha, M. Machine learning predictive model of grapevine yield based on agroclimatic patterns. Eng. Agric. Environ. Food 2019, 12, 443–450. [Google Scholar] [CrossRef]
Ballesteros, R.; Intrigliolo, D.S.; Ortega, J.F.; Ramírez-Cuesta, J.M.; Buesa, I.; Moreno, M.A. Vineyard yield estimation by combining remote sensing, computer vision and artificial neural network techniques. Precis. Agric. 2020, 21, 1242–1262. [Google Scholar] [CrossRef]
Di Gennaro, S.F.; Toscano, P.; Cinat, P.; Berton, A.; Matese, A. A low-cost and unsupervised image recognition methodology for yield estimation in a vineyard. Front. Plant Sci. 2019, 10, 559. [Google Scholar] [CrossRef]
Tisseyre, B.; Ojeda, H.; Taylor, J.A. New technologies and methodologies for site-specific viticulture. J. Int. Sci. Vigne Vin. 2008, 42, 63–76. [Google Scholar] [CrossRef]
Bramley, R.G.V.; Ouzman, J.; Boss, P.K. Variation in Vine Vigour, Grape Yield and Vineyard Soils and Topography as Indicators of Variation in the Chemical Composition of Grapes, Wine and Wine Sensory Attributes. Aust. J. Grape Wine Res. 2011, 17, 217–229. [Google Scholar] [CrossRef]
Hall, A.; Lamb, D.W.; Holzapfel, B.; Louis, J. Optical remote sensing applications in viticulture—A review. Aust. J. Grape Wine Res. 2011, 17, 113–131. [Google Scholar] [CrossRef]
Johnson, L.F.; Roczen, D.E.; Youkhana, S.K.; Nemani, R.R.; Bosch, D.F. Mapping vineyard leaf area with multispectral satellite imagery. Comput. Electron. Agric. 2003, 38, 33–44. [Google Scholar] [CrossRef]
Bramley, R.G.V.; Ouzman, J.; Trought, M.C.T.; Neal, S.M.; Bennett, J.S. Spatio-temporal variability in vine vigour and yield in a Marlborough Sauvignon Blanc vineyard. Aust. J. Grape Wine Res. 2019, 25, 430–438. [Google Scholar] [CrossRef]

Figure 1. Vegetation Indices calculated from drone-based multispectral imagery.

Figure 2. Graphical summary of the experiment methodology.

Figure 3. Principal component analysis (PCA) by season. Sites and ellipses are color-coded: green represents Chardonnay and blue represents Carménère. Variables included in the analysis were yield (kg), cluster weight (g), cluster number, berry weight (g), xylem water potential in December (Ψ Dec, MPa), average xylem water potential (Ψ Ave, MPa), GNDVI in December, average GNDVI, NDVI in December, and average NDVI. KMO test: Season I (0.83) and Season II (0.80).

Figure 4. Correlation matrix of the nine selected variables across the two study seasons (2023–2024 and 2024–2025). Variables included were yield (kg), cluster weight (g), cluster number, berry weight (g), xylem water potential in December (Ψ Dec, MPa), average xylem water potential (Ψ Ave, MPa), GNDVI in December, average GNDVI, NDVI in December, and average NDVI.

Figure 5. Validation of yield estimation models using one or more predictor variables. (A) Simple linear model; (B) simple nonlinear model; (C) least squares model; (D) stepwise variable selection model. The coefficient of determination (R²) between observed and estimated yield is shown in black. The red line represents the forced 1:1 relationship between observed and estimated yield, with its corresponding R² value.

Figure 6. Validation of yield estimation models using machine learning algorithms: (A) Random Forest (RF) and (B) Multilayer Perceptron (MLP). The coefficient of determination (R²) between observed and estimated yield is shown in black. The red line represents the forced 1:1 relationship between observed and estimated yield, with its corresponding R² value.

Figure 7. Maps of real yield in Vitis vinifera L. cv. Carménère and maps of estimated yield generated using different predictive models in commercial vineyards managed under a free-canopy system. The comparison highlights spatial patterns of observed yield and those predicted by the evaluated algorithms, enabling assessment of model accuracy and spatial reliability.

Figure 8. Maps of real yield in Vitis vinifera L. cv. Chardonnay and maps of estimated yield generated using different predictive models in commercial vineyards managed under a free-canopy system. The comparison highlights spatial patterns of observed yield and those predicted by the evaluated algorithms, enabling assessment of model accuracy and spatial reliability.

Table 1. Main production and management characteristics of the two experimental sites.

Production Characteristics	Experimental Site 1	Experimental Site 2
Cultivar	Carménère	Chardonnay
Training system	Single-wire free-canopy system	Single-wire free-canopy system
Production goal	High productivity–varietal quality wine	High productivity–varietal quality wine
Climate	Temperate Mediterranean; warm-temperate climate with winter precipitation (Csb); winter rainfall and 6-month dry season	Temperate Mediterranean; warm-temperate climate with winter precipitation (Csb); winter rainfall and 6-month dry season
Average annual temperature	16.2 °C	14.5 °C
Average maximum temperature	23 °C (maxima exceeding 33 °C)	21.3 °C (maxima exceeding 30 °C)
Average minimum temperature	8.7 °C	6.9 °C
Average annual rainfall	427 mm	605 mm
Soil	Tutucura series, moderately deep (45–75 cm); silty to silty and clay-silty texture	Talca series, moderately deep (55–90 cm); loamy-clay and clayey subsoil texture
Topography	Slope less than 1.5%	Slope less than 2.5%
Planting density	1.5 m × 2.5 m (2667 plants ha⁻¹)	1.2 m × 3.0 m (2778 plants ha⁻¹)
Rootstock	SO4	SO4
Irrigation system	Drip-irrigated with 2 emitters per plant at 2 L h⁻¹	Drip-irrigated with 2 emitters per plant at 2 L h⁻¹
Field sampling grid	12.0 × 7.5 m (42 measurement points)	9.6 × 9.0 m (48 measurement points)
Experimental unit	2 plants per site	2 plants per site
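The planting densities in Table 1 follow directly from the plant spacings, since one hectare is 10,000 m². The short sketch below reproduces that arithmetic; the function name and argument names are illustrative and do not come from the study.

```python
def plants_per_hectare(spacing_1_m: float, spacing_2_m: float) -> int:
    """Plants per hectare for a rectangular planting pattern.

    Each plant occupies spacing_1_m * spacing_2_m square metres,
    and one hectare is 10,000 square metres.
    """
    area_per_plant_m2 = spacing_1_m * spacing_2_m
    return round(10_000 / area_per_plant_m2)


# Spacings taken from Table 1
carmenere_density = plants_per_hectare(1.5, 2.5)
chardonnay_density = plants_per_hectare(1.2, 3.0)
print(carmenere_density, chardonnay_density)  # prints "2667 2778"
```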

Table 2. Vegetation indices calculated using imagery from the DJI Mavic 3 multispectral sensor. Spectral bands correspond to sensor specifications: NIR = near-infrared band (840 nm), Red = red band (650 nm), and Green = green band (560 nm).

Vegetation Index Name	Code	Equation
Normalized Difference Vegetation Index	NDVI	$\frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}}$
Green Normalized Difference Vegetation Index	GNDVI	$\frac{\mathrm{NIR} - \mathrm{Green}}{\mathrm{NIR} + \mathrm{Green}}$
Normalized Difference Red Edge Index	NDRE	$\frac{\mathrm{NIR} - \mathrm{RedEdge}}{\mathrm{NIR} + \mathrm{RedEdge}}$
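All three indices in Table 2 share the same normalized-difference form, so a single helper can compute them. The sketch below is a minimal illustration for one pixel or plot-level mean. The reflectance values are hypothetical, and returning NaN for a zero denominator is an assumption, not something described in the study.

```python
def normalized_difference(band_a: float, band_b: float) -> float:
    """Return (band_a - band_b) / (band_a + band_b).

    Returns NaN when the denominator is zero (an assumed convention).
    """
    total = band_a + band_b
    if total == 0:
        return float("nan")
    return (band_a - band_b) / total


def vegetation_indices(nir: float, red: float, green: float, red_edge: float) -> dict:
    """Compute the three indices of Table 2 from band reflectances."""
    return {
        "NDVI": normalized_difference(nir, red),
        "GNDVI": normalized_difference(nir, green),
        "NDRE": normalized_difference(nir, red_edge),
    }


# Hypothetical reflectance values, for illustration only
example = vegetation_indices(nir=0.50, red=0.10, green=0.15, red_edge=0.30)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```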

Table 3. Adjustment statistics, codes, and corresponding equations.

Code	Statistic	Equation	Interpretation
RMSE	Root Mean Square Error	$\sqrt{\frac{1}{n} \times \sum_{i = 1}^{n} (\hat{Y}_{i} - Y_{i})^{2}}$	Lower values indicate better model fit.
SD	Standard Deviation	$\sqrt{\frac{1}{n - 1} \times \sum_{i = 1}^{n} (Y_{i} - \bar{Y})^{2}}$	Lower values indicate better model fit.
MAE	Mean Absolute Error	$\frac{1}{n} \times \sum_{i = 1}^{n} |\hat{Y}_{i} - Y_{i}|$	Lower values indicate better model fit.
EF	Model Efficiency	$1 - \frac{\sum_{i = 1}^{n} (Y_{i} - \hat{Y}_{i})^{2}}{\sum_{i = 1}^{n} (Y_{i} - \bar{Y})^{2}}$	Values > 0 indicate acceptable performance.
RPD	Residual Predictive Deviation	$\frac{\mathrm{SD}}{\mathrm{RMSE}}$	RPD < 1.5 = poor; 1.5–2 = fair; 2–2.5 = good; > 2.5 = excellent.
R²	Coefficient of determination	$\frac{\sum_{i = 1}^{n} (Y_{i} - \hat{Y}_{i})^{2}}{\sum_{i = 1}^{n} (Y_{i} - \bar{Y})^{2}}$	Values closer to 1 indicate better fit.

Where $Y_{i}$ is the actual or observed value, $\hat{Y}_{i}$ is the value predicted by the model, $\bar{Y}$ is the average of the actual or observed values, and n is the number of observations ([38,39]).
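As a minimal sketch, the RMSE, MAE, SD, EF, and RPD formulas of Table 3 can be computed as follows. The function name and the four-point data set are hypothetical and serve only to show the arithmetic. R² is left out here because its computation depends on the definition adopted.

```python
import math


def fit_statistics(observed, predicted):
    """Compute RMSE, MAE, SD, EF and RPD following the Table 3 formulas."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long series with at least 2 values")

    mean_obs = sum(observed) / n
    # Sum of squared prediction errors and of absolute prediction errors
    sq_err = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    abs_err = sum(abs(p - o) for o, p in zip(observed, predicted))
    # Sum of squared deviations of the observations from their mean
    sq_dev = sum((o - mean_obs) ** 2 for o in observed)

    rmse = math.sqrt(sq_err / n)
    sd = math.sqrt(sq_dev / (n - 1))
    return {
        "RMSE": rmse,
        "MAE": abs_err / n,
        "SD": sd,
        "EF": 1 - sq_err / sq_dev,
        "RPD": sd / rmse,
    }


# Hypothetical observed and predicted yields (kg per plant)
stats = fit_statistics([10, 12, 14, 16], [11, 11, 15, 15])
print({name: round(value, 3) for name, value in stats.items()})
```

The sketch does not guard against a perfect fit (RMSE of zero) or constant observations, both of which would cause a division by zero.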

Table 4. Summary of variables used in this study. After data filtering, nine variables remained, each demonstrating the strongest correlation with yield across both seasons.

Variable Type	Variable	Description
Yield component	Yield per plant (kg), Cluster weight (g), Cluster number, Berry weight (g) and Rachis weight	Obtained at two stages: pea-size berries (January) and harvest (March–April).
Physiological measurements	Xylem water potential (Ψ; MPa), Stomatal conductance (g_s; mol m⁻² s⁻¹), and Transpiration rate (T_r; mmol m⁻² s⁻¹)	Measured in December, January, and February, during the phenological development of the vine. The season-average value per site was also included.
Vegetation index	Green Normalized Difference Vegetation Index (GNDVI), Normalized Difference Vegetation Index (NDVI) and Normalized Difference Red Edge (NDRE)	Measured in December, January, and February, during the phenological development of the vine. The season-average value per site was also included.

Table 5. Traditional model equations evaluated during construction and validation. The validation process is illustrated in Supplementary Figure S2.

Model	Model Type	Equation
A	Simple Linear Model (SLM)	$\mathrm{Yield} = 4.407 + (0.137 \times \mathrm{Cluster\ number})$
B	Simple Nonlinear Model (SNM)	$\mathrm{Yield} = \frac{1}{0.0022 + \left(\frac{3.45}{\mathrm{Cluster\ number}}\right)}$
C	Least Square Model (LSM)	$\mathrm{Yield} = -15.56 + (0.02 \times \mathrm{Cluster\ weight}) + (0.011 \times \mathrm{Cluster\ number}) - (0.01 \times \mathrm{Berry\ weight}) + (4.84 \times \Psi_{x,\mathrm{ave}}) + (12.23 \times \mathrm{GNDVI}_{\mathrm{ave}}) + (12.33 \times \mathrm{NDVI}_{\mathrm{ave}})$
D	Stepwise Model (SM)	$\mathrm{Yield} = -9.29 + (0.02 \times \mathrm{Cluster\ weight}) + (0.011 \times \mathrm{Cluster\ number}) - (7.23 \times \mathrm{GNDVI}_{\mathrm{ave}}) + (18.48 \times \mathrm{NDVI}_{\mathrm{ave}})$
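The four equations in Table 5 can be transcribed directly into code. The sketch below does so using the published coefficients. The function names and the input values for a single sampling point are hypothetical, and the units are assumed to follow Table 4 (cluster and berry weight in g, water potential in MPa, yield in kg per plant).

```python
def yield_slm(cluster_number: float) -> float:
    """Model A: simple linear model."""
    return 4.407 + 0.137 * cluster_number


def yield_snm(cluster_number: float) -> float:
    """Model B: simple nonlinear model."""
    return 1.0 / (0.0022 + 3.45 / cluster_number)


def yield_lsm(cluster_weight, cluster_number, berry_weight, psi_ave, gndvi_ave, ndvi_ave):
    """Model C: least square model with six predictors."""
    return (-15.56 + 0.02 * cluster_weight + 0.011 * cluster_number
            - 0.01 * berry_weight + 4.84 * psi_ave
            + 12.23 * gndvi_ave + 12.33 * ndvi_ave)


def yield_sm(cluster_weight, cluster_number, gndvi_ave, ndvi_ave):
    """Model D: stepwise model with four predictors."""
    return (-9.29 + 0.02 * cluster_weight + 0.011 * cluster_number
            - 7.23 * gndvi_ave + 18.48 * ndvi_ave)


# Hypothetical sampling point, for illustration only
point = dict(cluster_weight=150.0, cluster_number=60.0, berry_weight=1.5,
             psi_ave=-0.8, gndvi_ave=0.7, ndvi_ave=0.8)

estimates = {
    "SLM": yield_slm(point["cluster_number"]),
    "SNM": yield_snm(point["cluster_number"]),
    "LSM": yield_lsm(**point),
    "SM": yield_sm(point["cluster_weight"], point["cluster_number"],
                   point["gndvi_ave"], point["ndvi_ave"]),
}
for model, value in estimates.items():
    print(f"{model}: {value:.2f} kg per plant")
```

Because the example inputs are invented, the spread between the four estimates says nothing about the relative accuracy of the models. Table 7 reports the validated performance.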

Table 6. Relative importance of predictor variables during the construction and validation processes of the machine learning models (Random Forest and Multilayer Perceptron). The model validation process is illustrated in Supplementary Figure S2.

Model	Model Type	Relative Importance
A	Random Forest (RF)	Cluster number (71.33%), Cluster weight (23.14%), NDVI (18.79%) and GNDVI (18.4%).
B	Multilayer Perceptron (MLP)	Cluster number (100%), Cluster weight (55.7%), NDVI (39.3%) and GNDVI (6.6%).

Table 7. Fit statistics obtained during the model validation process. Metrics include Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Standard Deviation (SD), Residual Predictive Deviation (RPD), Model Efficiency (EF), and Coefficient of Determination (R²).

Model	RMSE (kg/pl)	MAE (kg/pl)	SD (kg/pl)	RPD	EF	Construction R²	Validation R²	Error (kg/ha)	Error (%)
A	2.52	2.10	2.65	1.05	0.89	0.47	0.41	5379	21
B	2.26	1.84	2.65	1.18	0.72	0.49	0.42	4713	18
C	1.83	1.45	2.65	1.45	0.47	0.76	0.64	3714	15
D	2.39	2.06	2.65	1.11	0.80	0.74	0.60	5276	21
E	1.95	1.68	2.65	1.36	0.63	0.95	0.62	4303	17
F	1.57	1.23	2.65	1.68	0.41	0.84	0.66	3151	12
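The RPD column of Table 7 can be checked against the RMSE and SD columns using the definition RPD = SD / RMSE and the classes given in Table 3. The sketch below does this for the six models. How values falling exactly on a class boundary (1.5, 2, 2.5) are assigned is an assumption, since Table 3 does not specify it.

```python
# (RMSE, SD) per model, copied from the validation results in Table 7
table7 = {
    "A": (2.52, 2.65),
    "B": (2.26, 2.65),
    "C": (1.83, 2.65),
    "D": (2.39, 2.65),
    "E": (1.95, 2.65),
    "F": (1.57, 2.65),
}


def rpd_class(rpd: float) -> str:
    """Classify RPD with the Table 3 thresholds (boundary handling assumed)."""
    if rpd < 1.5:
        return "poor"
    if rpd <= 2.0:
        return "fair"
    if rpd <= 2.5:
        return "good"
    return "excellent"


rpd_by_model = {model: sd / rmse for model, (rmse, sd) in table7.items()}
for model, rpd in rpd_by_model.items():
    print(f"Model {model}: RPD = {rpd:.2f} ({rpd_class(rpd)})")
```

The recomputed values agree with the tabulated RPD to within about 0.01, which is consistent with RMSE and SD being reported to two decimals.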

Table 8. Summary of selected models, variogram parameters (nugget, C₀; sill, C₀ + C₁; range, r), Cambardella Index (%), and spatial dependence (SD) of yield per plant for Carménère and Chardonnay cultivars.

Cultivar	Model	Nugget (C₀)	Sill (C₀ + C₁)	Range (r)	Cambardella Index (%)	SD
Carménère	Real data	0.03	6.93	26.3	0.43	strong
Carménère	A	0.96	2.33	13.73	41.20	moderate
Carménère	B	0.92	3.54	13.55	25.99	moderate
Carménère	C	0.83	3.08	12.17	26.95	moderate
Carménère	D	0.82	3.19	15.14	25.71	moderate
Carménère	E	0.01	4.32	25.80	0.23	strong
Carménère	F	0.01	5.37	27.50	0.19	strong
Chardonnay	Real data	0.05	4.08	25.08	1.23	strong
Chardonnay	A	2.67	3.12	9.10	85.58	weak
Chardonnay	B	3.56	4.64	11.65	76.72	weak
Chardonnay	C	1.90	4.18	13.07	45.45	moderate
Chardonnay	D	1.46	3.85	14.20	37.92	moderate
Chardonnay	E	0.06	3.35	21.90	1.79	strong
Chardonnay	F	0.07	4.13	22.30	1.69	strong
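The Cambardella Index values in Table 8 are consistent with the ratio of nugget to sill expressed as a percentage, C₀ / (C₀ + C₁) × 100. The sketch below recomputes a few rows and assigns the spatial dependence class. The thresholds used (at most 25% strong, 25–75% moderate, above 75% weak) are the ones commonly attributed to Cambardella et al. and are assumed here, because this section does not state them.

```python
def cambardella_index(nugget: float, sill: float) -> float:
    """Nugget-to-sill ratio in percent: C0 / (C0 + C1) * 100."""
    return 100.0 * nugget / sill


def spatial_dependence(index_percent: float) -> str:
    """Class of spatial dependence (thresholds assumed, see lead-in)."""
    if index_percent <= 25.0:
        return "strong"
    if index_percent <= 75.0:
        return "moderate"
    return "weak"


# (nugget, sill) pairs copied from Table 8
rows = {
    "Carménère, real data": (0.03, 6.93),
    "Carménère, model A": (0.96, 2.33),
    "Chardonnay, model A": (2.67, 3.12),
    "Chardonnay, model F": (0.07, 4.13),
}

results = {}
for label, (nugget, sill) in rows.items():
    index = cambardella_index(nugget, sill)
    results[label] = (index, spatial_dependence(index))
    print(f"{label}: {index:.2f}% ({results[label][1]})")
```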

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Acevedo-Opazo, C.; Cañete-Salinas, P.; Araya-Alman, M.; Ackerknecht-Espinosa, C.; Vásquez, L.; Moreno-Simunovic, Y. Mid-Season Yield Estimation in High-Productivity Vineyards: A Preliminary Modeling Framework for Free-Canopy Systems. Agronomy 2026, 16, 1106. https://doi.org/10.3390/agronomy16111106
