Predicting Wheat Yield by Spectral Indices and Multivariate Analysis in Direct and Conventional Sowing Systems

Polanía-Montiel, Diana Carolina; Velasquez Rubio, Santiago; Suarez Cardozo, Edna Jeraldy; Ferraz, Gabriel Araújo e Silva; Navas-Gracia, Luis Manuel

doi:10.3390/agronomy15112625

by Diana Carolina Polanía-Montiel ¹,², Santiago Velasquez Rubio ³, Edna Jeraldy Suarez Cardozo ³, Gabriel Araújo e Silva Ferraz ³ and Luis Manuel Navas-Gracia ¹,*

¹ TADRUS Research Group, Department of Agricultural and Forestry Engineering, University of Valladolid, UVa Campus of Palencia, 34004 Palencia, Spain

² Faculty of Engineering, Surcolombiana University, Pastrana Borrero Avenue, Carrera 1, Neiva 410010, Colombia

³ Department of Agricultural Engineering (DEA), School of Engineering (EENG), Federal University of Lavras (UFLA), P.O. Box 3037, Lavras 37200-900, MG, Brazil

* Author to whom correspondence should be addressed.

Agronomy 2025, 15(11), 2625; https://doi.org/10.3390/agronomy15112625

Submission received: 19 October 2025 / Revised: 8 November 2025 / Accepted: 13 November 2025 / Published: 15 November 2025

(This article belongs to the Topic Advances in Smart Agriculture with Remote Sensing as the Core and Its Applications in Crops Field)

Abstract

Wheat (Triticum aestivum L.) is a key crop in Spain, especially in the Castilla and León region. However, few studies have evaluated predictive models based on spectral indices and multivariate analysis to estimate yield in direct seeding (DS) and conventional seeding (CS) systems. This study addresses this need by implementing a split-plot experimental design in the city of Palencia, Spain, analyzing crop physiological data and nine spectral indices derived from multispectral aerial images captured by drones. The analysis included multivariate techniques such as Principal Component Analysis (PCA) and Random Forest (RF), supplemented with statistical tests, ROC curves, and prediction analysis. The results showed that the RF model classified treatments with 93.75% accuracy and a Kappa index of 0.875, highlighting yield, nitrogen, and protein as key variables. Among the vegetation indices, the Soil-Adjusted Vegetation Index (SAVI) and the Advanced Vegetation Index (AVI) were the most relevant at the flowering stage, with areas under the ROC curve of 0.7778 and 0.8025, respectively. Spearman’s correlations confirmed a significant relationship between these indices and key physiological variables, making it possible to distinguish between DS and CS systems. The RF-based yield prediction model showed R² values above 0.91 for the indices with the highest correlation. Predictive capacity was higher in DS, suggesting that conditions inherent in non-mechanized management significantly influence model performance. These results highlight the importance of non-destructive procedures for estimating production: anticipating crop yields before harvest enables adaptive and sustainable strategies that optimize resources such as fertilizers and water.

Keywords:

Principal Component Analysis (PCA); classification of treatments; supervised classification models; spectral indices; key physiological variables; ROC curve; Spearman’s correlation

1. Introduction

Agriculture is currently facing significant challenges due to the need to ensure food security in the context of a growing global population and climate change. Sustainability in production systems is a priority, especially in key crops such as wheat, which accounts for 21% of global food requirements [1,2]. In this context, direct seeding (DS) has emerged as a promising practice for improving agricultural productivity, reducing energy costs, and mitigating environmental impacts compared to conventional seeding (CS) [3].

Wheat (Triticum aestivum L.) is one of the world’s major crops and plays a crucial role in food security. In Spain, this crop covered approximately 1,967,485 hectares and produced 7,116,771 tons in 2024, with Castilla and León being the main producing region [4]. Agricultural management practices, such as DS and CS, significantly influence crop yield, grain quality, and sustainability. Researchers have shown that DS on previous crop residues significantly increases yields compared to seeding on tilled soil where an equivalent amount of residue has been incorporated. This effect results from the benefits of minimal soil disturbance. The structure generated by the root channels of the previous crop, combined with the biological activity of earthworms and other soil fauna, promotes deep rooting while improving rainwater infiltration and percolation. These conditions contribute to creating a more favorable environment for crop development [5].

Previous studies have shown that wheat productivity is sensitive to soil preparation, climatic conditions, and agronomic management. Zero-tillage practices have significantly increased soil organic carbon (SOC) levels, especially in the top 0 to 10 cm layer. This result highlights the effectiveness of reducing soil disturbance and encouraging the incorporation of organic matter to improve SOC reserves [6]. Implementing a no-till system promotes better water retention in the soil, improves its physical and biological properties, and reduces erosion, often resulting in increased yields and net income for producers. This system also strengthens future food security by providing greater resilience to extreme weather events such as prolonged droughts and heat waves, which are projected to increase in the coming years [7].

The integration of remote sensing technologies has revolutionized the study of agricultural systems, enabling non-destructive monitoring of physiological variables and spectral indices. The Normalized Difference Vegetation Index (NDVI) and the Normalized Difference Red Edge index (NDRE) are widely used because they are sensitive to variation in plant vigor, providing key information on the physiological status of crops [8]. In addition, the SAVI minimizes the influence of soil brightness, which is particularly useful in DS systems. Studies have shown that applying unmanned aerial systems (UAS) improves the accuracy of estimating key indicators such as plant height and chlorophyll content [9].

Spectral indices derived from remote sensing can contribute to agricultural yield estimation [10].

Regarding data analysis, Principal Component Analysis (PCA) and machine learning algorithms, such as Random Forest (RF), are practical tools for identifying patterns in multidimensional data and distinguishing between different treatments. Researchers have highlighted the usefulness of PCA for selecting key variables in studies on genetic diversity and crop management [11]. In complex systems such as wheat, multivariate analysis is crucial for interpreting interactions between agronomic, environmental, and management factors. PCA is one of the most widely used statistical procedures for data dimension reduction [12]. In addition, statistical tests such as ANOVA and Levene’s test are robust tools for detecting significant differences between treatments, as reported by Katral et al. [13] in rice experiments.

On the other hand, combining remote sensing data, meteorological data, field observations, and machine learning algorithms such as RF can improve crop yield estimation [14]. Xu et al. [15] used RF to classify crops with high precision and accuracy, highlighting its robustness in agricultural scenarios. Han et al. [16] emphasized the importance of considering structural and spectral information jointly rather than analyzing them independently when estimating crop biophysical parameters combined with machine learning.

This research identifies robust indicators that distinguish between agricultural management systems in order to optimize farming practices. Spearman’s correlations and ROC analysis are key techniques for establishing relationships between physiological variables and spectral indices [17]. These tools validate the selected indicators, ensuring their applicability in different agronomic and climate contexts.

Despite advances in remote sensing and machine learning, significant gaps remain: a scarcity of studies that jointly integrate physiological variables and spectral indices to distinguish management systems; limited evidence under long-term conditions (e.g., fields with decades of DS), where surface residues and soil structure modulate the optical signal; and little consideration of discrimination measures alongside yield prediction in a unified analytical framework.

This study aims to predict wheat yield in DS and CS systems based on spectral indices selected through statistical analysis. In addition, the yield variable was evaluated to determine whether it has a greater ability to discriminate between management systems, using a classification model to develop an appropriate tool for estimating crop productivity.

Unlike studies that typically evaluate these systems separately, this study integrates UAS-derived spectral indices and physiological variables into a single multivariate framework to predict yield and distinguish between DS and CS. It also explicitly evaluates discrimination ability alongside predictive performance, providing operational indicators for early management decisions in precision agriculture.

2. Materials and Methods

2.1. Study Area and Experimental Design

This study was carried out at the Viñalta Integrated Vocational Training Centers, located on the outskirts of Palencia, 2 km west of the city, in the province of Palencia, Autonomous Community of Castilla and León. The site lies at 709 m.a.s.l., at 42°0′16.35″ North latitude and 4°34′8.70″ West longitude. The average annual temperature is 11.6 °C; the coldest month is January, with an average temperature of 3.4 °C, and the warmest month is July, with an average temperature of 20.9 °C. Annual precipitation is 397 mm, average annual solar radiation is 6.07 GJ/m²·year, and annual potential evapotranspiration is 699 mm. According to the Köppen classification, the climate is temperate, of the Atlantic type, and the soil is a Haplic Luvisol [18]. Figure 1 illustrates the location in Spain and the distribution of the plots in the field, comprising an area of one hectare for each treatment.

The experimental design consisted of split plots, each with an approximate area of 0.42 ha, managed under two planting systems. In the first plot, the Viñalta Integrated Vocational Training Centers have practiced a DS system for 30 years, actively applying conservation agriculture techniques that prioritize the protection of ecological integrity. Among the measures adopted, the prohibition of the use of agricultural machinery stands out; weeds were controlled with two applications of a non-selective herbicide (glyphosate) before planting. In the second plot, CS was used, which included activities such as two passes with fast disk stands (Minidisc), a cultivator, and a vibrating cultivator. In both plots, the previous crop was rainfed vetch (veza), and the wheat variety evaluated was Andino, sown at a rate of 200 kg/ha.

2.2. Field Data Collection

2.2.1. Measurements of Physiological Responses

To ensure representativeness in a 1 ha plot, a geolocated grid of points with a 5 m spacing was generated in QGIS version 3.28.15. In each plot, nine sampling points per treatment were selected using spatially balanced sampling based on the conditional Latin hypercube (cLHS) methodology [19], which maximizes covariate space coverage and avoids preferential sampling. The selection was implemented in RStudio version 4.4.1, and the design was chosen because it distributes the locations homogeneously across the entire area and reduces variance compared to simple random sampling with equal effort (Figure 2). In this study, reliability was addressed a priori through a spatially balanced design and the use of blocks, which ensures uniform coverage of within-field gradients across the one-hectare plot. The selected points were exported to the QField version 3.7.9 application to facilitate locating and installing in the field the 18 sampling quadrats, each measuring 0.5 × 0.5 m. At these sites, the corresponding samples were collected during the flowering stage on 13 May 2024.

At each sampling point, information corresponding to the crop, contained in Table 1, was collected.

The harvest was carried out on 6 July 2024, so the pre-harvest sampling was carried out one day earlier. The variables measured are presented in Table 1.

2.2.2. Acquisition of Aerial Images

The images were captured using a DJI Mavic 3 Enterprise UAS, model M3M (DJI, Shenzhen, China) (Figure 3a), a quadcopter that integrates two cameras: a 4/3 CMOS RGB camera (20 MP effective pixels, 84° field of view, 24 mm equivalent focal length, f/2.8–f/11 aperture, and 1 m to ∞ focus with autofocus) and a 1/2.8″ CMOS multispectral camera (5 MP effective pixels, 73.91° field of view, 25 mm equivalent focal length, and f/2.0 aperture). The multispectral sensor collects data in four spectral bands: green (G, 560 ± 16 nm), red (R, 650 ± 16 nm), red edge (RE, 730 ± 16 nm), and near-infrared (NIR, 860 ± 26 nm) (Figure 3b).

The aerial survey was carried out on the same day that the phenological field responses were evaluated, between 10:00 a.m. and 1:00 p.m. The mission was flown at an altitude of 50 m and an approximate speed of 4.4 m/s. The images had a ground sampling distance (GSD) of 2.31 cm per pixel, with 80% frontal and 80% lateral overlap. The flight path was created automatically using DJI Pilot 2 software version 02.01.07.12, integrated into the remote control. In addition, the radiometric sensor was calibrated before and after image capture to compensate for variations in incident light conditions. This procedure made it possible to obtain accurate quantitative data by using a reference plate and adjusting the image capture to the fluctuations in sunlight recorded during the flight.

The UAS has a real-time kinematic (RTK) sensor to ensure the georeferencing of the images. This centimeter-precision positioning sensor connects to the national geodetic network of GNSS reference stations (ERGNSS); for the study area, the PALE3M sensor from the Castilla and León GNSS station network was used.

For radiometric calibration, the images obtained from the UAS were adjusted to convert the recorded signals into physical reflectance values using a reflectance calibration panel (a surface calibrator with known reflectance) that was placed in the field during acquisition. This panel allowed the images to be adjusted to the reflectance values of each spectral band, ensuring that the measurements were comparable to reference conditions [23]. Regarding geometric calibration, a georeferencing process was applied to correct possible geometric distortions caused by UAS movement or camera tilt. To do this, geospatial control points and a GPS coordinate system were used, which allowed the images to be adjusted to a standard cartographic projection. Data acquisition began at 10:00 a.m., with a temperature of 15 °C, wind speed of 16.7 km/h from the southwest, visibility of over 10 km, and clear skies, ensuring optimal conditions for image capture.

2.3. Aerial Image Processing

The images collected were processed using the Pix4Dmapper software (version 4.4.12) to obtain the orthophoto and vegetation indices. During this procedure, additional products, such as elevation point clouds, digital elevation models, digital surface models, and three-dimensional meshes (3D), can also be generated [24].

Processing in Pix4Dmapper was carried out using the standard workflows for RGB and multispectral images, applying the “3D Maps” and “Ag Multispectral” configurations, respectively, to generate the corresponding orthomosaics. Once the orthomosaics were generated, the vegetation indices were calculated and exported in TIFF format, facilitating their subsequent analysis.

The vegetation indices (Table 2) were selected according to their potential relationship with crop biophysical attributes (such as canopy health and sensitivity to chlorophyll content, nitrogen status, leaf structure, and senescence) [25], as well as their agronomic relevance, robustness to illumination and soil-background variability, and high predictive accuracy with the on-board sensor bands and the UAS RGB camera [26]. Specifically, NDVI and RVI were chosen for their long track record as established metrics of vigor and biomass, widely reported in precision agriculture in recent years [27]. GNDVI and the green chlorophyll index (GCL) were included due to their higher sensitivity to chlorophyll and nitrogen status and their lower saturation relative to NDVI, as corroborated by recent reviews and studies showing a strong correlation between GNDVI and chlorophyll [28]. NDRE and the red-edge chlorophyll index (RECL) leverage the red-edge region to estimate chlorophyll and diagnose nitrogen with better performance at high canopy cover; their utility has been reinforced by the availability of red-edge bands [29]. SAVI and AVI were used to mitigate soil and illumination effects (early stages or low cover), with AVI documented in current operational guides [30]. Finally, the RGR ratio (R/G) provides sensitivity to senescence and pigments [31]. To calculate the vegetation index values at the sampling points, a buffer of 2.5 m was generated around each point. All available values within this area were collected, yielding 672,862 data points. Subsequently, the mean and the median were used as representative measures. To differentiate them, they were labeled with the suffixes _P (mean) and _M (median), e.g., NDVI_P and NDVI_M.
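As an illustration of how such indices are derived from band reflectances and then summarized per buffer, the following Python sketch may be useful. Table 2 is not reproduced in this text, so the formulas below are the definitions most commonly given in the literature and are an assumption here, as is the SAVI soil factor of 0.5; only RGR = R/G is stated explicitly above. The function name and the reflectance values are hypothetical, and the authors' own processing was done in Pix4Dmapper and R, not with this code.

```python
from statistics import mean, median

def vegetation_indices(g, r, re, nir, soil_factor=0.5):
    """Compute nine indices from green, red, red-edge and NIR reflectances (0 to 1).

    Formulas are the commonly cited definitions (assumed, not taken from Table 2).
    """
    # AVI takes a cube root; clamp the product at zero so that pixels with
    # NIR < R do not produce a complex number.
    avi_product = max(nir * (1.0 - r) * (nir - r), 0.0)
    return {
        "NDVI": (nir - r) / (nir + r),
        "GNDVI": (nir - g) / (nir + g),
        "NDRE": (nir - re) / (nir + re),
        "RVI": nir / r,
        "GCL": nir / g - 1.0,
        "RECL": nir / re - 1.0,
        "SAVI": (1.0 + soil_factor) * (nir - r) / (nir + r + soil_factor),
        "AVI": avi_product ** (1.0 / 3.0),
        "RGR": r / g,
    }

# Hypothetical reflectances (G, R, RE, NIR) for three pixels inside one 2.5 m buffer.
buffer_pixels = [
    (0.08, 0.05, 0.20, 0.45),
    (0.09, 0.06, 0.22, 0.50),
    (0.10, 0.12, 0.25, 0.40),
]

# Summarize one index per buffer with the mean (_P) and the median (_M).
ndvi_values = [vegetation_indices(*px)["NDVI"] for px in buffer_pixels]
summary = {"NDVI_P": mean(ndvi_values), "NDVI_M": median(ndvi_values)}
print(summary)
```

The median is less affected than the mean by a few soil or shadow pixels inside the buffer, which is one reason both summaries are worth comparing.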

2.4. Statistical Analysis

Identifying the most relevant factors to differentiate the treatments is essential to visualize the contributions of the original and modeled variables. This process involves a workflow that integrates dimension reduction and predictive modeling techniques, enabling the simplification and analysis of complex data while ensuring interpretability and accuracy in plot differentiation.

First, data related to crop physiological responses and vegetation indices were processed. PCA was applied as a dimensionality reduction technique to simplify the large amount of multidimensional information [40]. This procedure made it possible to identify the most relevant variables, facilitating their understanding, evaluation, and interpretation. As part of the analysis, the numerical variables were standardized (centered to a mean of 0 and scaled to a standard deviation of 1), ensuring that all variables had equal weight in the analysis regardless of their units or ranges. PCA was then applied to the numerical variables, combining them linearly to obtain new components. The number of components (k) was defined as the minimum required to achieve ≥90% cumulative variance, calculated from the eigenvalue spectrum (percentage of variance explained by each component and its cumulative total). The data were then projected into the k-dimensional PCA space and split into training (70%) and test (30%) sets using a random partition with a fixed seed (123) to ensure reproducibility.
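The standardization step and the rule for choosing k can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names and the eigenvalues are hypothetical (the eigenvalues were chosen only so that the first two components explain 76% of the variance), and the sample standard deviation is assumed for scaling. It is not the authors' R implementation.

```python
from statistics import mean, stdev

def standardize(column):
    """Center a variable to mean 0 and scale it to (sample) standard deviation 1."""
    m = mean(column)
    s = stdev(column)
    return [(x - m) / s for x in column]

def components_for_variance(eigenvalues, threshold=0.90):
    """Return the smallest k whose components reach the cumulative variance threshold."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for k, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += value / total
        if cumulative >= threshold:
            return k
    return len(eigenvalues)

# Hypothetical eigenvalue spectrum: cumulative shares are 52%, 76%, 86%, 92%, ...
eigenvalues = [5.2, 2.4, 1.0, 0.6, 0.5, 0.3]
k = components_for_variance(eigenvalues)
print(k)  # 4 components are needed to reach at least 90%

print(standardize([1.0, 2.0, 3.0]))  # [-1.0, 0.0, 1.0]
```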

The differentiation model was trained, evaluated, and validated using the RF algorithm to predict the treatment class (DS or CS) from the training data set. The model was tuned by fivefold cross-validation on the training set, exploring values of the hyperparameter mtry ∈ {2, 4, 6} to optimize performance; the number of trees was set to the package default (500), the split rule was Gini, and sampling was performed by bootstrap with replacement (enabling estimation of the out-of-bag, OOB, error), allowing a robust performance evaluation. Once the model was trained, predictions were made on the test data set, and performance was evaluated using a confusion matrix, obtaining metrics such as accuracy, sensitivity, and specificity.

Finally, the algorithm’s performance and differentiation quality were analyzed using the ROC curve and the area under the curve (AUC), fundamental metrics to understand the model’s capacity for predicting and differentiating observations in the categories defined by the treatment variables DS and CS.

In addition, to identify whether there were significant differences between treatments, a comparison-of-means test was performed on the vegetation indices. For this purpose, the normality of the data was first verified graphically and with the Anderson-Darling test [41]. For normally distributed data, ANOVA was applied at the 5% significance level (p < 0.05); for data without a normal distribution, the Levene test based on means was applied at the same significance level [42].

The indices that presented significant differences were selected for Spearman correlation and ROC curve analysis, under the criterion that the lowest correlations and the highest ROC values correspond to the most effective vegetation indices in relation to the physiological variables of interest. With the indices that showed the most significant relationship with yield, RF was used to model yield prediction and compare it with the actual data; 70% of the data was used to train the model, and the remaining 30% was used for validation [43]. The model’s accuracy was assessed by calculating the coefficient of determination R². All procedures were performed using RStudio software (Version 2024.04.2+764 “Chocolate Cosmos”).
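For readers unfamiliar with the Spearman coefficient used here, it is the Pearson correlation computed on ranks (with tied values receiving their average rank). The sketch below is a plain-Python illustration; the function names and the paired index/yield values are hypothetical and are not data from this study.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical index values and yields (t/ha) at five sampling points.
index_values = [0.30, 0.35, 0.42, 0.47, 0.55]
yields = [3.4, 3.1, 4.0, 3.8, 4.6]
print(round(spearman(index_values, yields), 3))  # 0.8
```

Because it depends only on ranks, the coefficient captures monotonic relationships and is robust to outliers, which suits small samples such as nine points per treatment.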

2.4.1. Model Validation

This study uses data from a single season and site. Accordingly, we performed internal validation via a 70/30 train–test split, stratified fivefold cross-validation on the training set, and RF out-of-bag (OOB) error estimation. We report accuracy, Cohen’s kappa, sensitivity/specificity, and AUC, with confidence intervals where appropriate.
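The idea behind stratified fivefold cross-validation is that every fold keeps roughly the same CS/DS proportion as the full training set. A minimal sketch follows, assuming a hypothetical helper name and a balanced label vector; the seed value mirrors the one reported for the train/test partition, but Python's random generator will not reproduce the authors' R partition.

```python
import random

def stratified_folds(labels, n_folds=5, seed=123):
    """Assign sample indices to folds so that each class is spread evenly."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    folds = [[] for _ in range(n_folds)]
    offset = 0  # continue the round-robin across classes to balance fold sizes
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[(position + offset) % n_folds].append(index)
        offset += len(indices)
    return folds

# Hypothetical training labels: 10 CS and 10 DS observations.
labels = ["CS"] * 10 + ["DS"] * 10
folds = stratified_folds(labels)
for fold in folds:
    n_cs = sum(labels[i] == "CS" for i in fold)
    print(len(fold), n_cs)  # each fold holds 4 samples, 2 of them CS
```

In each round, one fold serves as the validation set and the remaining four are used for fitting; the reported cross-validated accuracy is the average over the five rounds.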

ROC/AUC criteria. For the binary discrimination (CS vs. DS), ROC curves were computed from model class probabilities and performance was summarized by AUC. We interpreted AUC using conventional thresholds widely adopted in the diagnostic-test literature: AUC < 0.60 = no/poor, 0.60–0.69 = poor, 0.70–0.79 = acceptable, 0.80–0.89 = good/excellent, ≥0.90 = outstanding. Under this scale, AUC > 0.80 indicates a good classifier.
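The AUC can be computed directly as the fraction of positive/negative pairs that the score ranks correctly (ties counting one half), and the scale above maps it to a verbal label. The sketch below illustrates both steps; the function names and scores are hypothetical. With nine samples per class there are 81 pairs, so an AUC computed this way moves in steps of 1/81 (or 1/162 when ties occur).

```python
def auc_pairwise(positive_scores, negative_scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

def interpret_auc(auc):
    """Verbal label following the thresholds listed in the text."""
    if auc >= 0.90:
        return "outstanding"
    if auc >= 0.80:
        return "good/excellent"
    if auc >= 0.70:
        return "acceptable"
    if auc >= 0.60:
        return "poor"
    return "no/poor"

# Hypothetical class probabilities for four positive and four negative samples.
positives = [0.9, 0.8, 0.7, 0.4]
negatives = [0.6, 0.5, 0.3, 0.2]
auc = auc_pairwise(positives, negatives)
print(auc, interpret_auc(auc))  # 0.875 good/excellent
```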

Note on the use of AI Tools: ChatGPT version 5.0 (OpenAI, San Francisco, CA, USA) and Grammarly were used solely to improve the clarity, grammar, and English language of the manuscript. These tools were not employed to generate, analyze, or interpret any methodological or scientific content. All descriptions, analyses, and conclusions were written, reviewed, and validated by the authors, who take full responsibility for the final content.

3. Results

3.1. Principal Component Analysis and Random Forest

PCA reduces dimensionality by identifying the linear combinations of the original variables that capture most of the variance, facilitating visualization and interpretation. Figure 4 shows the percentage of variance explained by each principal component, which determines how many components are needed to capture most of the information in the original data. The first two components combined account for approximately 76% of the cumulative variance. Figure 5 shows the quality of representation of each variable on the different principal components (dimensions), where dimensions 1 and 2 contribute most to explaining the variability in the data.

Principal components (PCs) were interpreted using factor loadings (sign and magnitude), variable contributions (% contrib), and squared cosines (cos²) to assess the quality of representation. A variable was considered relevant in a principal component when its contribution exceeded the uniform expectation (100/p) or when |loading| > 0.40. Positive factor loadings indicate covarying traits, while opposite signs suggest trade-offs. Indices designed to reduce soil effects (e.g., SAVI, AVI) aligned with canopy cover/soil gradients, while chlorophyll-sensitive metrics (e.g., GNDVI, NDRE, GCL/RECL) correlated with greenness/chlorophyll gradients. This framework guided the biological interpretation of the PCA figures and influenced the RF model inputs.
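The relevance rule described above can be written as a short check. In this sketch, the contribution of a variable to a component is taken as its squared loading divided by the sum of squared loadings on that component (an assumption about how % contrib is computed); the variable names v1 to v5 and their loadings are hypothetical placeholders, not results from this study.

```python
def relevant_variables(loadings, loading_cutoff=0.40):
    """Return {variable: contribution %} for variables that pass either criterion.

    A variable is kept if its contribution exceeds the uniform expectation
    (100 / p) or if the absolute value of its loading exceeds the cutoff.
    """
    p = len(loadings)
    uniform_expectation = 100.0 / p
    total = sum(value ** 2 for value in loadings.values())
    selected = {}
    for name, value in loadings.items():
        contribution = 100.0 * value ** 2 / total
        if contribution > uniform_expectation or abs(value) > loading_cutoff:
            selected[name] = round(contribution, 2)
    return selected

# Hypothetical loadings of five variables on one principal component.
example_loadings = {"v1": 0.55, "v2": 0.50, "v3": 0.48, "v4": 0.20, "v5": -0.10}
print(relevant_variables(example_loadings))
# With p = 5 the uniform expectation is 20%, so only v1, v2 and v3 are retained.
```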

For RF modeling, the minimum number of principal components (k) required to reach ≥90% cumulative variance was retained, and the data were projected into the k-dimensional PC space. On the held-out test set, RF achieved 93.75% accuracy (95% CI: 69.77–99.84%) with Kappa = 0.875; in stratified fivefold cross-validation, accuracy was 94.29% (Kappa = 0.8831). Methodological details are provided in Methods (Section 2.4.1).
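The reported interval is consistent with an exact (Clopper-Pearson) binomial confidence interval for 15 correct predictions out of 16 test samples. The counts are an inference from the 93.75% accuracy, not figures stated in the text, and the function names are hypothetical. A bisection-based sketch using only the standard library:

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k + 1))

def clopper_pearson(successes, n, alpha=0.05):
    """Exact two-sided binomial confidence interval, solved by bisection."""
    def bisect(func, target, increasing):
        lo, hi = 0.0, 1.0
        for _ in range(200):
            mid = (lo + hi) / 2.0
            if (func(mid) < target) == increasing:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0

    # Lower bound: p such that P(X >= successes) = alpha / 2 (increasing in p).
    lower = 0.0 if successes == 0 else bisect(
        lambda p: 1.0 - binom_cdf(successes - 1, n, p), alpha / 2.0, True)
    # Upper bound: p such that P(X <= successes) = alpha / 2 (decreasing in p).
    upper = 1.0 if successes == n else bisect(
        lambda p: binom_cdf(successes, n, p), alpha / 2.0, False)
    return lower, upper

# Assumed counts: 15 of 16 test observations classified correctly.
lower, upper = clopper_pearson(15, 16)
print(f"{100 * lower:.2f}% to {100 * upper:.2f}%")
```

The width of the interval (roughly 30 percentage points) reflects the small test set rather than any weakness of the classifier itself.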

The model shows perfect sensitivity, with a value of 1.0 for the positive class (CS), indicating that it correctly identifies all cases in this category. With a specificity of 0.8750, it correctly identifies most of the DS (negative) cases, largely avoiding their misclassification as CS. The positive predictive value (for CS) was 0.8889, meaning that 88.89% of the observations predicted as CS were truly CS, while the negative predictive value (for DS) reached 1.0, meaning that every observation predicted as DS was truly DS.

Furthermore, the balanced accuracy of 0.9375 reflects balanced performance between both classes, and the low p-value (0.0002594) associated with the significance test indicates that the model significantly outperforms the accuracy expected by chance. The McNemar’s test p-value (1.000) shows no evidence of asymmetry in the classification errors, reinforcing confidence in the model’s robustness.
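These metrics all follow from the four cells of a binary confusion matrix. The text does not list the cell counts, so the sketch below uses one matrix that is consistent with every reported value (8 CS correctly classified, 7 DS correctly classified, 1 DS predicted as CS); this reconstruction, the function names, and the chance rate of 0.5 used for the one-sided binomial test are assumptions.

```python
from math import comb

def binary_metrics(tp, fn, fp, tn):
    """Standard metrics from a 2x2 confusion matrix (positive class = CS here)."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / (n * n)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "balanced_accuracy": (sensitivity + specificity) / 2.0,
        "kappa": (accuracy - expected) / (1.0 - expected),
    }

def p_value_accuracy_above(correct, n, chance_rate):
    """One-sided exact binomial test: P(X >= correct) under the chance rate."""
    return sum(comb(n, i) * chance_rate ** i * (1.0 - chance_rate) ** (n - i)
               for i in range(correct, n + 1))

# Assumed counts, chosen to be consistent with the reported metrics.
metrics = binary_metrics(tp=8, fn=0, fp=1, tn=7)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")

print(f"p-value: {p_value_accuracy_above(15, 16, 0.5):.7f}")
```

Under these assumed counts, the script yields accuracy 0.9375, Kappa 0.875, sensitivity 1.0, specificity 0.875, predictive values 0.8889 and 1.0, and a p-value of 17/65536 ≈ 0.0002594.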

This result is confirmed by the ROC curve analysis, where the AUC reached 0.9375, indicating that, for a randomly chosen pair of one CS and one DS observation, the model ranks the two correctly about 93.75% of the time.

After adequately differentiating the treatments, the importance of the RF model’s variables was analyzed. As shown in Figure 6a, Dim. 2 is the most important dimension within the model, with an importance value close to 100. This indicates that this dimension contributes most to the accuracy of the predictions generated by the model.

Figure 6b, in turn, highlights the variables with the greatest relevance in Dim. 2. Among these, yield, nitrogen, and protein were identified as those with the highest contribution, suggesting that they should receive priority attention, given that they significantly affect the results of the analysis.

3.2. Vegetation Index Analysis

Vegetation indices vary depending on the crop’s phenological status. In this study, wheat monitoring was carried out at two stages of development, and significant differences were concentrated in the flowering phase. After the normality analysis and the means comparison tests, the results were represented by box plots that illustrate the distribution of the values of each index under the different treatments.

Considering mean and median values, eight of the nine indices evaluated showed significant differences according to the means tests. Additionally, Spearman correlation values were included, which allowed the identification of the most relevant vegetation indices (Figure 7). The results indicate that indices such as AVI and SAVI, although they have low or even negative correlation values, show a greater capacity for differentiation between the CS and DS classes, suggesting a stronger relationship between these indices and the variable of interest. Moreover, they outperform other indices because they integrate the canopy signal (high NIR and low R) and correct for soil reflectance, thereby better capturing canopy closure and structure, as well as the greater shadow and residue characteristic of DS. Additionally, Figure 8 shows a visual differentiation between the treatments based on the SAVI and AVI, evidencing distinctive patterns in the spectral response of wheat under DS and CS.

The ROC curve complements and reinforces the results obtained by the Spearman correlation, showing that the vegetation index with the highest predictive capacity is the AVI, presenting the highest value of AUC = 0.8025, followed by the SAVI with AUC = 0.7778. These results highlight that both vegetation indices have the best discriminative capacity within the model, which indicates that they are the most effective in distinguishing between the treatments evaluated (Table 3).

Based on these findings, it was decided to work with the median values since these values present slightly higher performance than the averages, although with a minimal difference. This approach ensures greater consistency and robustness in interpreting the results obtained [44].

The Spearman correlation was performed again with the vegetation indices that obtained the best performance in the differentiation of treatments. This analysis made it possible to compare these indices with the most important variable identified by the RF model, yield, to determine their relationship (Figure 9). The AVI_CS and SAVI_CS indices have significant positive correlations with yield in CS (0.69), indicating a close relationship between these indices and yield in the CS class: as the values of these indices increase, yield also increases. On the other hand, AVI_DS and SAVI_DS show strong negative correlations with yield in DS (−0.79), which implies that these indices are helpful in differentiating the characteristics of the DS class. In this case, yield in the DS class tends to increase as index values decrease.

3.3. Prediction of Yield

Wheat yield was predicted using the vegetation indices AVI and SAVI, which were selected for their ability to capture variability between tillage systems. These indices provide crucial information about the physiological state of the crop and the differences generated by the treatments applied. The RF algorithm was used to predict yield in the CS and DS systems, using these spectral indices as predictor variables. The model was trained with 70% of the available data, reserving the remaining 30% for validation. As a result, only five validation points are represented in the resulting plots.

Additionally, linear regression plots were generated to visualize the relationship between the observed yield and the yield predicted by the model. The model’s accuracy was evaluated by calculating the coefficient of determination (R²) as a key performance metric. This coefficient was estimated independently for the AVI and SAVI spectral indices in both tillage systems, providing a detailed assessment of the model’s efficacy under different crop management conditions.

In Figure 10 and Figure 11, high predictive capability is demonstrated with the AVI and SAVI, respectively. With AVI, the coefficient of determination (R²) values were 0.92504 in CS and 0.93948 in DS, reflecting an excellent fit between predicted and observed values. With SAVI, R² reached 0.91142 in CS and 0.9398 in DS, confirming comparable performance. In both cases, the proximity of the points to the identity line indicates low discrepancy and strong agreement between predictions and actual data. Overall, these results validate the reliability of the approach and support AVI and SAVI as robust predictors of yield, with particularly high performance in DS.
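For reference, the coefficient of determination between observed and predicted values can be computed as one minus the ratio of the residual to the total sum of squares. The text does not state which R² definition was used (this one, or the squared correlation of a fitted regression line), so the sketch below is an assumption; the function name and the five observed/predicted yields are hypothetical and are not the study's data.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_residual / ss_total

# Hypothetical observed and predicted yields (t/ha) for five validation points.
observed = [3.0, 3.5, 4.0, 4.5, 5.0]
predicted = [3.1, 3.4, 4.2, 4.4, 4.9]
print(round(r_squared(observed, predicted), 3))  # 0.968
```

With only a handful of validation points, a single poorly predicted observation can shift R² considerably, so values computed on such small sets should be read with that in mind.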

4. Discussion

The combination of PCA and RF proved effective in modeling wheat yield in DS and CS areas based on physiological variables and spectral indices. In the multivariate analysis, the PCA results revealed that the first two dimensions explained 76% of the cumulative variance. This level is comparable to that reported by Liu et al. [45], indicating that PCA is highly effective in identifying discriminant patterns in complex management systems. It reduces dimensionality without losing relevant information and facilitates data interpretation and key variable selection. In biological terms, the first principal component (PC1) summarizes a gradient of vegetation cover closure relative to soil influence, characterized by positive loadings in NIR and in indices that attenuate soil brightness (SAVI/AVI), as well as negative loadings in the red band (R) and in ratios related to greater soil exposure (e.g., RGR). The second component (PC2) captures a chlorophyll/greenness gradient, with high loadings in GNDVI, NDRE, and GCL/RECL, which are associated with higher pigment content and better nitrogen status. Higher nitrogen availability is linked to an increase in chlorophyll content, which directly influences the health of the plant cover and, consequently, yield [46]. According to the loadings and correlation models, yield, nitrogen, and protein are positively associated with these axes, confirming their agronomic relevance.

In the evaluation of the confusion matrix, the differentiation model achieved an overall accuracy of 93.75% and a Kappa index of 0.875, reflecting a high level of agreement. These results coincide with the findings of Zhao et al. [47], who demonstrated the high effectiveness of the RF method for predicting wheat yield in the North China Plain. Nevertheless, our study focuses on physiological variables and spectral indices in DS treatments (a plot with 30 years of conservation agriculture) compared to CS, while the previous work centered on accumulated biomass and various climatic indices.
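To make the relation between overall accuracy and the Kappa index concrete, the sketch below computes both from a confusion matrix. The 2 × 2 matrix is hypothetical: it was chosen only because it reproduces the reported figures (16 samples, one misclassified), and it is not the authors' actual matrix. The function name `accuracy_and_kappa` is likewise illustrative.

```python
def accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    Rows are the true classes and columns are the predicted classes.
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the main diagonal.
    observed = sum(matrix[i][i] for i in range(k)) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical matrix (rows: true CS, true DS; columns: predicted CS, predicted DS).
hypothetical_matrix = [[8, 0],
                       [1, 7]]
accuracy, kappa = accuracy_and_kappa(hypothetical_matrix)
print(accuracy, kappa)
```

With balanced classes the chance agreement is 0.5, so an accuracy of 0.9375 maps to a Kappa of (0.9375 − 0.5)/(1 − 0.5) = 0.875.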

The RF method identified yield, nitrogen, and protein as the most relevant variables for distinguishing between treatments, in line with previous studies. Esaulko et al. [48] reported that DS can stabilize productivity in arid regions by improving infiltration and reducing evaporation. On the other hand, Colecchia et al. [49] observed higher yields in CS under conditions of good water availability. Our findings reinforce the usefulness of spectral indices for quantifying and modeling crop yield, highlighting the long-term benefits of DS. However, the effectiveness of each system depends mainly on soil and climate conditions and on agronomic management.

Furthermore, Levene’s test and ANOVA confirmed significant differences in the variables evaluated between treatments. These tests are essential for validating the robustness of the results and corroborating that the observed differences are not due to chance. Houšť et al. [50] reported similar findings when applying these tests in wheat experiments, highlighting their relevance in comparative agricultural management studies.
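The two test statistics can be sketched with the standard library alone. The one-way ANOVA F statistic compares between-group and within-group variability, and Levene's W is the same F statistic applied to absolute deviations from each group's mean (using medians instead gives the Brown–Forsythe variant). The function names and sample values are hypothetical. Converting either statistic to a p-value requires the F distribution with (k − 1, N − k) degrees of freedom, which this sketch does not include.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of numeric values."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variability; F is undefined")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def levene_w(groups):
    """Levene's W: ANOVA F on absolute deviations from each group mean."""
    deviations = []
    for g in groups:
        center = sum(g) / len(g)
        deviations.append([abs(x - center) for x in g])
    return anova_f(deviations)


# Hypothetical index values for two treatments (illustrative only).
cs_values = [0.41, 0.44, 0.39, 0.47, 0.43]
ds_values = [0.52, 0.49, 0.55, 0.50, 0.58]
print(round(anova_f([cs_values, ds_values]), 3))
print(round(levene_w([cs_values, ds_values]), 3))
```

A common workflow is to check homogeneity of variances with Levene's test first and then interpret the ANOVA result in that light.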

Vegetation indices showed significant differences between treatments, particularly during the flowering phase. Walsh et al. [51] reported that UAS-based spectral indices accurately capture phenological variations in crops. In our case, ROC analysis and Spearman correlations highlighted AVI and SAVI as the most informative indices for distinguishing treatments at flowering (AUC = 0.8025 and 0.7778, respectively). In CS, the Spearman correlation with yield was positive (ρ = 0.69), indicating that higher index values are associated with greater productivity. Conversely, in DS, correlations with yield were negative (ρ = −0.79). This pattern is consistent with the presence of residue cover and higher soil moisture, which reduce soil brightness/reflectance and increase canopy shade. At the pixel level, this decreases the NIR–R contrast and can reduce SAVI/AVI even when crop condition is good, producing negative correlations with yield (i.e., lower index values associated with higher yield). This coincides with the findings of Cheng et al. [52], who highlighted the usefulness of SAVI in minimizing the effect of soil brightness in management systems such as DS.
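Both quantities discussed above can be computed without fitting a model. In the sketch below, the AUC is the fraction of (positive, negative) pairs in which the positive sample has the higher index value, with ties counted as one half, and Spearman's ρ is the Pearson correlation of the ranks. Function names and sample values are hypothetical and do not reproduce the study's data.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def roc_auc(positive_scores, negative_scores):
    """AUC as the share of positive/negative pairs ranked correctly."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical index values per treatment and yields (illustrative only).
index_ds = [0.50, 0.70, 0.65]
index_cs = [0.40, 0.60, 0.45]
yield_ds = [5900.0, 5200.0, 5400.0]
print(roc_auc(index_ds, index_cs))
print(spearman_rho(index_ds, yield_ds))
```

An AUC near 0.5 means the index separates the treatments no better than chance, and the sign of ρ gives the direction of the relationship with yield.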

SAVI and AVI demonstrated not only their ability to distinguish between treatments but also their ability to model yield in both tillage systems. Using RF, coefficients of determination (R²) greater than 0.9 were obtained in DS and CS, indicating accurate yield prediction. The relationship between yield and AVI highlights its usefulness as a key indicator in agricultural systems, integrating information on plant cover and photosynthetic efficiency. This is consistent with the study by Yang et al. [53], who noted that spectral indices capture yield variation through non-destructive monitoring. In this context, AVI not only distinguishes DS from CS but also provides essential information for estimating yield under different management strategies. This behavior reinforces the importance of spectral indices as essential tools in precision agriculture, facilitating the distinction of treatments and optimizing management practices.

5. Conclusions

The analysis identified the vegetation indices AVI and SAVI as the most relevant for identifying the treatments, standing out for their capacity to discriminate between CS and DS. The ROC analysis confirmed their efficacy, with AVI (AUC = 0.8025) and SAVI (AUC = 0.7778) performing best for classification and discrimination.

The high predictive capacity of AVI and SAVI stands out, with R² above 0.91. Both indices performed well in CS and DS, with the highest R² values in DS (0.93948 for AVI and 0.93980 for SAVI). The closeness of the points to the identity line confirms the reliability of the predictions, consolidating AVI and SAVI as precise tools for estimating crop yield. In practical terms, the results enable anticipatory, zone-specific fertilization, prioritized irrigation and monitoring in low-vigor areas, and postharvest scheduling according to spatial variability, thereby optimizing costs, inputs, and logistics.

This study demonstrates the feasibility of applying remote sensing and machine learning to precision agriculture in wheat. Integrating UAS, spectral indices, and Random Forest enabled us to discriminate between DS and CS and to estimate yield with high accuracy, providing operational indicators for more efficient and sustainable management. However, the models were trained with data from a single site and a single season; broader external validation across multiple seasons and regions with different edaphoclimatic conditions is required to assess generalizability. Likewise, although AVI and SAVI proved robust for wheat under DS and CS, their direct applicability to other crops or climates should be verified before operational use. Future work will expand multi-site, multi-year validation and explore the transferability of both the models and the indices to diverse production contexts.

Since yield, nitrogen, and protein were identified as the most relevant variables, further studies should explore their interaction with other agronomic and climatic factors to refine the understanding of their impact and improve predictive accuracy. Additionally, it is advisable to compare RF with alternatives such as Gradient Boosting, Support Vector Machines, and deep neural networks, evaluating precision, robustness, and efficiency. Finally, validating the models with external datasets is critical to ensure their applicability and generalization in real-world scenarios.

Author Contributions

Conceptualization, D.C.P.-M., G.A.e.S.F. and L.M.N.-G.; methodology, D.C.P.-M., G.A.e.S.F., S.V.R., E.J.S.C. and L.M.N.-G.; validation, G.A.e.S.F. and L.M.N.-G.; formal analysis, D.C.P.-M., S.V.R. and E.J.S.C.; investigation, D.C.P.-M., G.A.e.S.F. and L.M.N.-G.; resources, L.M.N.-G.; data curation, D.C.P.-M., S.V.R. and E.J.S.C.; writing—original draft preparation, D.C.P.-M. and G.A.e.S.F.; writing—review and editing, D.C.P.-M., G.A.e.S.F. and L.M.N.-G.; visualization, D.C.P.-M., S.V.R. and E.J.S.C.; supervision, G.A.e.S.F. and L.M.N.-G.; project administration, L.M.N.-G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the European Union through the CIRAWA Project (HORIZON-CL6-2022-FARM2FORK-01) and the DIGIS3 Project (DIGITAL-2021-EDIH-01).

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to privacy restrictions.

Acknowledgments

The authors used ChatGPT version 5.0 (OpenAI, San Francisco, CA, USA) and Grammarly to assist in improving the English language and readability of this manuscript. These tools were used solely for language editing; no AI tools were used to generate, analyze, or interpret the scientific content. The authors reviewed and verified all content and take full responsibility for the final version of the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

AUC	Area Under the Curve
AVI	Advanced Vegetation Index
CS	Conventional Seeding
DS	Direct Seeding
ERGNSS	National Geodetic Network of GNSS Reference Stations
G	Green
GCL	Green Chlorophyll Index
GNDVI	Green Normalized Difference Vegetation Index
GNSS	Global Navigation Satellite System
GSD	Ground Sampling Distance
NDRE	Normalized Difference Red Edge Index
NDVI	Normalized Difference Vegetation Index
NIR	Near Infrared
PC	Principal Component
PCA	Principal Component Analysis
R	Red
RE	Red Edge
RECL	Red Edge Chlorophyll Index
RF	Random Forest
RGR	Red/Green Ratio
ROC	Receiver Operating Characteristic
RTK	Real-Time Kinematic
RVI	Ratio Vegetation Index
SAVI	Soil-Adjusted Vegetation Index
SOC	Soil Organic Carbon
UAS	Unmanned Aerial System

References

1. United States Department of Agriculture (USDA). World Agricultural Production; USDA: Washington, DC, USA, 2024.
2. Atamanyuk, I.; Havrysh, V.; Nitsenko, V.; Diachenko, O.; Tepliuk, M.; Chebakova, T.; Trofimova, H. Forecasting of Winter Wheat Yield: A Mathematical Model and Field Experiments. Agriculture 2023, 13, 41.
3. Saldukaitė-Sribikė, L.; Šarauskis, E.; Buragienė, S.; Adamavičienė, A.; Velička, R.; Kriaučiūnienė, Z.; Savickas, D. Effect of Tillage and Sowing Technologies Nexus on Winter Wheat Production in Terms of Yield, Energy, and Environment Impact. Agronomy 2022, 12, 2713.
4. Ministerio de Agricultura, Pesca y Alimentación. Avances Mensuales de Superficies y Producciones Agrícolas; Ministerio de Agricultura, Pesca y Alimentación: Madrid, Spain, 2024.
5. FAO. Capítulo 5. Agricultura de Conservación. Available online: https://www.fao.org/4/y4690s/y4690s0a.htm?utm_source=chatgpt.com (accessed on 26 December 2024).
6. Steponavičienė, V.; Žiūraitis, G.; Rudinskienė, A.; Jackevičienė, K.; Bogužas, V. Long-Term Effects of Different Tillage Systems and Their Impact on Soil Properties and Crop Yields. Agronomy 2024, 14, 870.
7. Dang, Y.P.; Dalal, R.C.; Menzies, N.W. No-Till Farming Systems for Sustainable Agriculture: Challenges and Opportunities; Springer International Publishing: Cham, Switzerland, 2020; ISBN 9783030464097.
8. Revelo Luna, D.; Mejía Manzano, J.; Montoya-Bonilla, B.; Hoyos García, J. Analysis of the Vegetation Indices NDVI, GNDVI and NDRE for the Characterization of the Coffee Crop (Coffea arabica). Ing. Desarro. 2020, 38, 2145–9371.
9. Camenzind, M.P.; Yu, K. Multi Temporal Multispectral UAV Remote Sensing Allows for Yield Assessment across European Wheat Varieties Already before Flowering. Front. Plant Sci. 2023, 14, 1214931.
10. Zenteno Cruz, G.A.; Palacios Vélez, E.; Tijerina Chávez, L.; Flores Magdaleno, H. Application of Remote Sensing Technologies for Estimating Yield. Rev. Mex. Cienc. Agric. 2017, 8, 1575–1586.
11. Wu, Q.; Zhang, Y.; Zhao, Z.; Xie, M.; Hou, D. Estimation of Relative Chlorophyll Content in Spring Wheat Based on Multi-Temporal UAV Remote Sensing. Agronomy 2023, 13, 211.
12. Jiang, X.; Gao, J.; Yang, Z. High-Dimensional Robust Principal Component Analysis and Its Applications. J. Comput. Methods Sci. Eng. 2023, 23, 2303–2311.
13. Katral, A.; Biradar, H.; Harijan, Y.; Aruna, Y.R.; Hadimani, J.; Hittalmani, S. Genetic Analysis and Traits Association Study in Marker-Assisted Multi-Drought-Traits Pyramided Genotypes Under Reproductive-Stage Moisture Stress In Rice (Oryza sativa L.). Euphytica 2022, 218, 21.
14. Oficina de Estudios y Políticas Agrarias. Estudio: Inteligencia Artificial Para La Estimación de Superficie de Trigo a Nivel Nacional; Oficina de Estudios y Políticas Agrarias: Santiago, Chile, 2023.
15. Xu, Q.; Jin, M.; Guo, P. A High-Precision Crop Classification Method Based on Time-Series UAV Images. Agriculture 2023, 13, 97.
16. Han, L.; Yang, G.; Dai, H.; Xu, B.; Yang, H.; Feng, H.; Li, Z.; Yang, X. Modeling Maize Above-Ground Biomass Based on Machine Learning Approaches Using UAV Remote-Sensing Data. Plant Methods 2019, 15, 10.
17. Giraldo Betancourt, C. Evaluación Del Potencial de Datos Espectrales Para El Diagnóstico de Marchitez Letal (ML) En Palma de Aceite (Elaeis Guineensis Jacq); Universidad Nacional de Colombia: Bogotá, Colombia, 2021.
18. Nafría García, D.A.; Garrido del Pozo, N.; Álvarez Arias, M.V.; Cubero Jiménez, D.; Fernández Sánchez, M.; Villarino Barrera, I.; Gutiérrez García, A.; Abia Llera, I. Agroclimatic Atlas of Castile and León, 1st ed.; Junta de Castilla y León, Ministerio de Agricultura: Valladolid, Spain, 2013.
19. Pacciorett, P.A.; Kurina, F.G.; Balzarini, M.G. Regional Scale Site Sampling for Digital Mapping Based on Soil Properties. Cienc. Suelo 2020, 38, 310–320.
20. YARA Knowledge Grows. Yara N-Tester™. Available online: https://www.yara.com.ar/nutricion-vegetal/portafolio-de-agricultura-digital/n-tester/ (accessed on 16 December 2024).
21. METOS® by Pessl Instruments. Dualex Plant Health Promotion and Nutrition Monitoring. Available online: https://metos.global/es/dualex/ (accessed on 16 December 2024).
22. UNE-EN ISO 7971-2; Cereals. Determination of Bulk Density, Called Mass per Hectolitre. Part 2: Method of Traceability for Measuring Instruments Through Reference to the International Standard Instrument. Asociación Española de Normalización: Madrid, Spain, 2019.
23. Guo, Y.; Senthilnath, J.; Wu, W.; Zhang, X.; Zeng, Z.; Huang, H. Radiometric Calibration for Multispectral Camera of Different Imaging Conditions Mounted on a UAV Platform. Sustainability 2019, 11, 978.
24. Williams, V.; Unger, D.R.; Kulhavy, D.; Hung, I.-K.; Zhang, Y. Comparing Drone2Map versus Pix4Dmapper When Creating Mosaics over Homogeneous Land Features. Int. J. Geospat. Environ. Res. 2023, 10. Available online: https://paperity.org/p/311233145/comparing-drone2map-versus-pix4dmapper-when-creating-orthophoto-mosaics-over-homogeneous (accessed on 1 November 2025).
25. Ferraz, M.A.J.; Barboza, T.O.C.; Arantes, P.d.S.; Von Pinho, R.G.; Santos, A.F. dos. Integrating Satellite and UAV Technologies for Maize Plant Height Estimation Using Advanced Machine Learning. AgriEngineering 2024, 6, 20–33.
26. Li, X.; Zhu, B.; Li, S.; Liu, L.; Song, K.; Liu, J. A Comprehensive Review of Crop Chlorophyll Mapping Using Remote Sensing Approaches: Achievements, Limitations, and Future Perspectives. Sensors 2025, 25, 2345.
27. Radočaj, D.; Šiljeg, A.; Marinović, R.; Jurišić, M. State of Major Vegetation Indices in Precision Agriculture Studies Indexed in Web of Science: A Review. Agriculture 2023, 13, 707.
28. Li, M.; Wang, W.; Li, H.; Yang, Z.; Li, J. Monitoring of Vegetation Chlorophyll Content in Photovoltaic Areas Using UAV-Mounted Multispectral Imaging. Front. Plant Sci. 2025, 16, 1643945.
29. Duan, J.; Rudnick, D.R.; Proctor, C.A.; Heeren, D.; Nakabuye, H.N.; Katimbo, A.; Shi, Y.; de Sousa Ferreira, V. Estimation of Corn Nitrogen Demand under Different Irrigation Conditions Based on UAV Multispectral Technology. Agric. Water Manag. 2024, 304, 109075.
30. Wiratmoko, D.; Sabrina, T.; Minasny, B.; Nasution, Z. Using the Soil-Adjustment Vegetation Index from Landsat-8 Imagery for Estimating the Nutrient Content of Oil Palm Leaves for Optimized Fertilizer Application. BIO Web Conf. 2025, 192, 01002.
31. Coswosk, G.G.; Gonçalves, V.M.L.; de Lima, V.J.; de Souza, G.A.R.; Teixeira do Amaral Junior, A.; Pereira, M.G.; de Oliveira, E.C.; Leite, J.T.; Kamphorst, S.H.; de Oliveira, U.A.; et al. Utilizing Visible Band Vegetation Indices from Unmanned Aerial Vehicle Images for Maize Phenotyping. Remote Sens. 2024, 16, 3015.
32. Rouse, J.W.; Haas, R.H.; Schell, J.A.; Deering, D.W. Monitoring Vegetation Systems in the Great Plains with ERTS. In ERTS-1 Symposium; NASA: Washington, DC, USA, 1974.
33. Gitelson, A.A.; Merzlyak, M.N. Signature Analysis of Leaf Reflectance Spectra: Algorithm Development for Remote Sensing of Chlorophyll. J. Plant Physiol. 1996, 148, 494–500.
34. Gitelson, A.; Merzlyak, M.N. Quantitative Estimation of Chlorophyll-a Using Reflectance Spectra: Experiments with Autumn Chestnut and Maple Leaves. J. Photochem. Photobiol. B-Biol. 1994, 22, 247.
35. Huete, A.R. A Soil-Adjusted Vegetation Index (SAVI). Remote Sens. Environ. 1988, 25, 295–309.
36. ArcGreek. Lista de Índices Espectrales en Sentinel 2 y Landsat. Available online: https://acolita.com/lista-de-indices-espectrales-en-sentinel-2-y-landsat/ (accessed on 17 December 2024).
37. Gitelson, A.A.; Viña, A.; Ciganda, V.; Rundquist, D.C.; Arkebauer, T.J. Remote Estimation of Canopy Chlorophyll Content in Crops. Geophys. Res. Lett. 2005, 32, L08403.
38. Geoinnova. Los 9 Principales Índices de Vegetación Más Usados en Teledetección. Available online: https://geoinnova.org/blog-territorio/analisis-de-indices-de-vegetacion-en-teledeteccion/?utm_source=chatgpt.com (accessed on 17 December 2024).
39. Pearson, R.L.; Miller, L.D. Remote Spectral Measurements as a Method for Determining Plant Cover; Colorado State University: Fort Collins, CO, USA, 1972.
40. Mohanlal, V.A.; Saravanan, K.; Sabesan, T. Application of Principal Component Analysis (PCA) for Blackgram [Vigna mungo (L.) Hepper] Germplasm Evaluation under Normal and Water Stressed Conditions. Legume Res. 2023, 46, 1134–1140.
41. Scholz, F.W.; Stephens, M.A. K-Sample Anderson–Darling Tests. J. Am. Stat. Assoc. 1987, 82, 918–924.
42. Correa, J.C.; Iral, R.; Rojas, L. A Study of the Power of Tests for Homogeneity of Variance. Rev. Colomb. Estad. 2006, 29, 57–76.
43. Sharma, M.; Goel, S.; Elias, A.A. Predictive Modeling of Soil Profiles for Precision Agriculture: A Case Study in Safflower Cultivation Environments. Sci. Rep. 2025, 15, 44.
44. McGrath, S.; Zhao, X.F.; Qin, Z.Z.; Steele, R.; Benedetti, A. One-Sample Aggregate Data Meta-Analysis of Medians. Stat. Med. 2019, 38, 969–984.
45. Liu, T.; Yang, T.; Zhu, S.; Mou, N.; Zhang, W.; Wu, W.; Zhao, Y.; Yao, Z.; Sun, J.; Chen, C.; et al. Estimation of Wheat Biomass Based on Phenological Identification and Spectral Response. Comput. Electron. Agric. 2024, 222, 109076.
46. Feyisa, D.S.; Jiao, X.; Mojo, D. Wheat Yield Response to Chemical Nitrogen Fertilizer Application in Africa and China: A Meta-Analysis. J. Soil Sci. Plant Nutr. 2024, 24, 102–114.
47. Zhao, Y.; Xiao, D.; Bai, H.; Tang, J.; Liu, D.L.; Qi, Y.; Shen, Y. The Prediction of Wheat Yield in the North China Plain by Coupling Crop Model with Machine Learning Algorithms. Agriculture 2023, 13, 99.
48. Esaulko, A.; Sitnikov, V.; Pismennaya, E.; Vlasova, O.; Golosnoi, E.; Ozheredova, A.; Ivolga, A.; Erokhin, V. Productivity of Winter Wheat Cultivated by Direct Seeding: Measuring the Effect of Hydrothermal Coefficient in the Arid Zone of Central Fore-Caucasus. Agriculture 2023, 13, 55.
49. Colecchia, S.A.; De Vita, P.; Rinaldi, M. Effects of Tillage Systems in Durum Wheat under Rainfed Mediterranean Conditions. Cereal Res. Commun. 2015, 43, 704–716.
50. Houšť, M.; Procházková, B.; Hledík, P. Effect of Different Tillage Intensity on Yields and Yield-Forming Factors in Winter Wheat. Acta Univ. Agric. Silvic. Mendel. Brun. 2012, 60, 89–96.
51. Walsh, O.S.; Marshall, J.M.; Nambi, E.; Jackson, C.A.; Ansah, E.O.; Lamichhane, R.; McClintick-Chess, J.; Bautista, F. Wheat Yield and Protein Estimation with Handheld and Unmanned Aerial Vehicle-Mounted Sensors. Agronomy 2023, 13, 207.
52. Cheng, Z.; Gu, X.; Zhou, Z.; Yin, R.; Zheng, X.; Li, W.; Cai, W.; Chang, T.; Du, Y. Crop Aboveground Biomass Monitoring Model Based on UAV Spectral Index Reconstruction and Bayesian Model Averaging: A Case Study of Film-Mulched Wheat and Maize. Comput. Electron. Agric. 2024, 224, 109190.
53. Yang, Z.; Tian, J.; Zhang, L.; Zan, O.; Yan, X.; Feng, K. Spectral Detection of Leaf Carbon and Nitrogen as a Proxy for Remote Assessment of Photosynthetic Capacity for Wheat and Maize under Nitrogen Stress. Comput. Electron. Agric. 2024, 224, 109174.

Figure 1. Location of the study area and plots in DS (direct seeding) and CS (conventional seeding).

Figure 2. Limit of treatments and location of sampling points per plot in DS (direct seeding) and CS (conventional seeding).

Figure 3. (a) DJI Mavic 3 Enterprise Unmanned Aerial System (UAS) and (b) 1/2.8-inch CMOS multispectral camera.

Figure 4. Percentage of variance explained by each principal component.

Figure 5. Quality of representation of the variables (physiological responses of the crop and vegetation indices) in the principal component analysis.

Figure 6. (a) Importance of Random Forest variables represented in dimensions and (b) Largest contributing dimension in Random Forest. The dotted red line marks the relevance threshold.

Figure 7. Vegetation indices with significant differences and Spearman’s R² correlation between DS (direct seeding) and CS (conventional seeding) treatments in flowering.

Figure 8. Monitoring of wheat by SAVI and AVI at the flowering stage: (a) Location of treatments (RGB), (b) SAVI, and (c) AVI.

Figure 9. Spearman correlation between yield and the vegetation indices showing the most significant differences between DS (direct seeding) and CS (conventional seeding) treatments at flowering.

Figure 10. Real vs. predicted yield with the AVI spectral index in DS (direct seeding) and CS (conventional seeding) treatments at flowering. The dotted line represents the linear fit between actual and predicted values.

Figure 11. Real vs. Predicted yield with SAVI spectral index in DS (direct seeding) and CS (conventional seeding) treatments in flowering. The dotted line represents the linear fit between actual and predicted values.

Table 1. Variables measured at each sampling point.

Variable	Method
Number of plants	All plants within the sampling quadrant were counted.
Number of tillers per plant	Three plants were randomly selected within the sampling quadrant, and the number of tillers per plant was counted.
Height	Three plants were randomly selected within the sampling quadrant, and their height was measured with a flexometer.
Nitrogen nutritional status	Measured with the portable N-Tester BT (Yara, Norway), which estimates the chlorophyll content in leaves [20].
Chlorophyll index	Measured with the Spad DUALEX® SCIENTIFIC sensor, developed by the CNRS (National Centre for Scientific Research) and the University of Paris-Sud Orsay, France [21].
Epidermal flavonol index	DUALEX® SCIENTIFIC sensor, as for the chlorophyll index [21].
Nitrogen Balance Index (NBI) Chl/Flav	DUALEX® SCIENTIFIC sensor, as for the chlorophyll index [21].
Index of epidermal anthocyanins	DUALEX® SCIENTIFIC sensor, as for the chlorophyll index [21].
Grain moisture (%)	Oven-drying method, the standard method approved by organizations such as the AOAC (Association of Official Analytical Chemists).
Specific weight (kg/hL)	Determination of bulk density, called mass per hectolitre (ISO 7971-2:2019) [22].
Yield (kg/ha)	Crop harvested from an area of 0.25 m².
Nitrogen and protein (%)	Kjeldahl method.

Table 2. Vegetation indices for multispectral images captured in the project.

Vegetation Index ¹	Equation	Reference
NDVI	(NIR − R)/(NIR + R)	[32]
GNDVI	(NIR − G)/(NIR + G)	[33]
NDRE	(NIR − RE)/(NIR + RE)	[34]
SAVI	((NIR − R)/(NIR + R + L)) × (1 + L)	[35]
AVI	[NIR × (1 − R) × (NIR − R)]^(1/3)	[36]
GCL	(NIR/G) − 1	[37]
RECL	(NIR/RE) − 1	[37]
RGR	R/G	[38]
RVI	NIR/R	[39]

¹ NDVI: Normalized Difference Vegetation Index; GNDVI: Green Normalized Difference Vegetation Index; NDRE: Normalized Difference Red Edge Index; SAVI: Soil-Adjusted Vegetation Index (L is the soil brightness correction factor); AVI: Advanced Vegetation Index; GCL: Green Chlorophyll Index; RECL: Red Edge Chlorophyll Index; RGR: Red/Green Ratio; RVI: Ratio Vegetation Index.
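The equations in Table 2 can be sketched as one function over per-pixel reflectances in the range 0 to 1. Several choices here are assumptions rather than details from the study: the function and argument names, the soil factor L = 0.5 (a commonly used default for SAVI), clamping the AVI product at zero before taking the cube root, and the use of the red-edge band in RECL, which follows the standard definition of the red-edge chlorophyll index.

```python
def vegetation_indices(nir, red, green, red_edge, soil_factor=0.5):
    """Nine vegetation indices from band reflectances (each in 0..1).

    soil_factor is the SAVI L term; 0.5 is an assumed default.
    """
    # AVI is a cube root; clamp the product at 0 so that pixels with
    # NIR < R do not produce a complex result (an assumption of this sketch).
    avi_product = nir * (1.0 - red) * (nir - red)
    return {
        "NDVI": (nir - red) / (nir + red),
        "GNDVI": (nir - green) / (nir + green),
        "NDRE": (nir - red_edge) / (nir + red_edge),
        "SAVI": (nir - red) / (nir + red + soil_factor) * (1.0 + soil_factor),
        "AVI": max(avi_product, 0.0) ** (1.0 / 3.0),
        "GCL": nir / green - 1.0,
        "RECL": nir / red_edge - 1.0,
        "RGR": red / green,
        "RVI": nir / red,
    }


# Hypothetical reflectances for a single vegetated pixel (illustrative only).
pixel = vegetation_indices(nir=0.5, red=0.1, green=0.2, red_edge=0.25)
for name, value in pixel.items():
    print(f"{name}: {value:.4f}")
```

For whole orthomosaics the same arithmetic would normally be applied band-wise to raster arrays, with zero-valued denominators masked beforehand.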

Table 3. Values of the ROC curve for each vegetation index.

Vegetation Index	Area Under the Curve (AUC)
AVI	0.8025
SAVI	0.7778
RECL	0.5432
NDRE	0.5185
GNDVI	0.5185
GCL	0.5185
RVI	0.5185
NDVI	0.5062

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
