Comparative Evaluation of Machine Learning Algorithms for the Identification and Morphological Classification of Rice Grains

Coronel-Reyes, Julián; Haro-Sarango, Alexander; Delgado-Vera, Carlota; Triviño-Sánchez, Johnny

doi:10.3390/agriengineering8030100


by Julián Coronel-Reyes ¹,*, Alexander Haro-Sarango ², Carlota Delgado-Vera ¹ and Johnny Triviño-Sánchez ¹

¹ Facultad de Ciencias Agrarias, Universidad Agraria del Ecuador (UAE), Av. 25 de Julio, Guayaquil 090104, Ecuador

² Instituto Superior Tecnológico España, Av. Bolívar Entre Castillo y Quito, Ambato 180150, Ecuador

* Author to whom correspondence should be addressed.

AgriEngineering 2026, 8(3), 100; https://doi.org/10.3390/agriengineering8030100

Submission received: 27 November 2025 / Revised: 22 January 2026 / Accepted: 27 January 2026 / Published: 6 March 2026

(This article belongs to the Special Issue The Application of Machine Learning and Deep Learning Techniques in Agriculture)


Abstract

Machine learning has enhanced rice grain classification by enabling accurate, automated, and objective morphological analysis, supporting quality control and varietal selection. This study compared the performance of several algorithms in identifying three Ecuadorian rice varieties (INIAP-11, INIAP-12, and INIAP-20) using a balanced dataset of morphological features. Five models were trained with cross-validation and evaluated using multi-class metrics. Significant differences among varieties, particularly in area, length, and eccentricity, confirmed the discriminative potential of these features. Initially, models were trained using all morphological variables. However, to optimize training time and computational cost, the study also evaluated model performance after applying dimensionality reduction through Principal Component Analysis (PCA). This approach made it possible to assess whether reduced feature spaces could maintain competitive predictive performance while improving efficiency. Overall, all algorithms performed well, but only the Artificial Neural Network (ANN) and Support Vector Classifier (SVC) demonstrated strong generalization without overfitting. In contrast, Random Forest achieved perfect accuracy in training but lower performance in testing. In conclusion, ANN and SVC emerged as the most robust alternatives for rice grain morphological classification, while the PCA results highlight the value of dimensionality reduction as a strategy to enhance computational scalability without substantially compromising accuracy. The objective of the present study is to train, evaluate, and compare different machine learning algorithms for classifying three types of rice grains based on seven morphological characteristics, with and without dimensionality reduction, in order to determine the best model for this task.

Keywords:

machine learning; classification; rice varieties; morphology; quality control

1. Introduction

Rice (Oryza sativa) underpins a substantial share of global food security, and its quality control requires objective, reproducible, and scalable procedures in operational settings. Recent evidence confirms that computer vision applied to rice addresses critical tasks such as foliar disease identification, validating the potential for automation in real-world contexts [1]. Likewise, deep architectures have been developed to diagnose leaf nutrient deficiencies, demonstrating that visual features encode agronomically relevant signals for decision-making [2]. On the phenological dimension, image-trained models have shown the ability to distinguish crop stages, reinforcing AI’s sensitivity to relevant morphological changes [3]. There are even reports of geographic traceability from visual patterns using efficient models, which opens possibilities for origin certification [4].

Beyond foliar analysis, grain morphometrics offers a direct pathway for varietal identification and the selection of quality lines, insofar as the grain phenotype integrates genetic and management signals measurable from images. Resources with specific geometric annotations for japonica grains are available, enabling comparable and reproducible morphometric analyses and providing a solid basis for classification tasks [5]. In parallel, combinatorial image-processing approaches with multivariate criteria have been proposed to prioritize high-quality lines, linking feature extraction with breeding decisions [6]. Non-destructive radiography expands the repertoire of variables by allowing observation of internal and physical grain traits without altering the sample, a key aspect for standardized protocols [7]. In addition, semi-supervised strategies for in situ seed counting and characterization demonstrate robustness to environmental variability, which favors scalability in production systems [8].

At the methodological level, alternatives based on hand-crafted features and deep representations coexist, with complementary advantages in interpretability and discriminative power. Residual networks have shown the capacity to separate complex classes from rich representation spaces, an attractive property in domains with high phenotypic variability such as rice [9]. Hybrid designs that combine classical preprocessing in OpenCV with deep classifiers help balance pipeline control and performance, with potential for industrial adoption [10]. Comparative studies among CNNs, transformers, and non-neural methods for leaf disease detection have also been presented, providing a map of the state of the art and its generalization trade-offs [11]. Notably, efficient topologies for weakly supervised segmentation in field environments, such as PIS-Net for rice weeds, illustrate the usefulness of designs that capture microstructures and fine textures under minimal annotations [12].

Transferring these advances to production environments requires data quality and replicability standards that ensure comparability, auditability, and robustness outside the laboratory. In aerial acquisition of seedlings, pipelines have been proposed to localize, detect, and count seedlings with drone-mounted cameras, explicitly specifying capture conditions and evaluation metrics, which supports large-scale deployments [13]. For real-time diagnosis, systems using CNNs that prioritize low latency and output readability have been reported, an essential requirement on inspection lines [14]. From a synthesis-of-the-field perspective, reviews compile approaches and gaps in rice disease classification, which are useful for aligning methodological decisions with validation needs [15]. Likewise, hybrid models with ResNet and LSTM targeting microbial pathologies show that combining architectures can capture spatial and local dependencies relevant to quality [16].

Although recent computer vision advances in rice research demonstrate that visual signals can be leveraged for multiple agronomic tasks, the practical bottleneck for quality control and seed/variety certification remains the same: grain-level varietal identification must be objective, reproducible, and scalable, yet in operational settings it is still constrained by subjectivity, variability in acquisition conditions, and limited transparency regarding why a model succeeds or fails outside the development dataset. As a result, high reported accuracies in the literature do not always translate into reliable deployment, particularly when the goal is not merely “classification”, but traceable and auditable decision-making that can be integrated into inspection workflows.

From this standpoint, it is important to state the scientific problem separately from its context. The context is the need for scalable grain quality control in rice value chains; the scientific problem is to determine, under standardized and metrically consistent morphometric extraction, what modeling and validation choices yield robust generalization rather than performance inflation. This problem is simultaneously (i) empirical, because it tests whether a compact set of image-derived morphometric descriptors contains enough discriminative signal to separate Ecuadorian INIAP varieties; (ii) methodological, because it requires a rigorous, comparable benchmark across algorithmic families under identical splits and evaluation criteria; and (iii) operational, because it must balance predictive performance with computational efficiency so that the solution remains feasible for routine use (e.g., via feature compression without erasing discriminative structure).

Accordingly, the research gap is not a lack of models, but a lack of decision-grade evidence: many prior contributions are optimized for a single architecture or task setting, and often do not articulate (a) why a given pipeline fails to generalize, (b) how overfitting is ruled out beyond reporting metrics, or (c) whether efficiency-oriented reductions preserve the relevant morphometric signal. What is still needed is a reproducible comparative baseline that clarifies which families of classifiers remain stable, how sensitive they are to dimensionality reduction, and which choices are most defensible for certification-oriented workflows where stability and traceability matter as much as raw accuracy.

In one sentence, this study addresses the practical and scientific problem of reliable grain-level varietal identification by benchmarking logistic regression, support vector machines, random forests, artificial neural networks, and k-nearest neighbors on a balanced morphometric dataset of INIAP-11, INIAP-12, and INIAP-20, both with the full feature space and after PCA-based dimensionality reduction, to identify the model(s) that best preserve accuracy while demonstrating generalization and operational efficiency.

Against this backdrop, we identify a high-impact gap: automated morphometric varietal identification for Ecuadorian INIAP lines from grain images with metric calibration, standardized segmentation, and a strict separation between training and test sets. Our aim is to establish a comparative baseline of performance and generalization across five families of algorithms, logistic regression, support vector machines, random forests, artificial neural networks, and nearest neighbors, using a balanced dataset. The objective is to determine which algorithms offer the most robust balance among accuracy, stability, and traceability for adoption in quality control and seed certification, while also providing a practical guide with hyperparameters, regularization criteria, error analyses, and paired validation procedures that foster reproducibility in INIAP’s Ecuadorian context.

2. Materials and Methods

Figure 1 illustrates the research workflow, which begins with an initial data exploration aimed at examining the structure of the dataset and verifying the correct execution of the data import process. Thereafter, a descriptive analysis is performed to provide a comprehensive characterization of the data, employing measures of central tendency, dispersion, and other relevant statistical indicators. The dataset is then partitioned into training and testing subsets for model development, and finally, the trained models are evaluated to assess their performance. Regarding data acquisition, this study is based on morphological measurements of rice grains extracted through image-based analysis. The images of rice grains were obtained through an external research collaboration. Due to copyright restrictions, the original source code used to extract the numerical measurements from these images cannot be made publicly available. However, in order to ensure the transparency and reproducibility of the study, a GitHub repository (version 3.5.4) has been provided containing the code used for data processing and analysis, as well as a sample of synthetic data that reproduces the statistical characteristics of the original dataset. These materials are available in the Supplementary Materials. All numerical data used in this work are described in detail in the manuscript and were employed exclusively for the analyses presented.

The present study aimed to compare the performance of different machine learning algorithms for the morphological classification of three Ecuadorian rice varieties (INIAP-11, INIAP-12, and INIAP-20), using both the full feature set and a dimensionality reduction approach intended to lessen the models’ processing workload. A dataset comprising 2400 observations was available, with 800 samples for each of the three classes; the same dataset was used for training and testing under both methodologies, with and without dimensionality reduction. The morphological features considered as independent variables were area, perimeter, major axis length, minor axis length, eccentricity, convex area, and extent. These variables were obtained through image-processing techniques and reflect key physical attributes for differentiating among grain varieties.

Table 1 presents the parameters employed for the training of the machine learning models, namely logistic regression, support vector classifier, random forest, artificial neural network, and k-nearest neighbors.

To ensure a robust estimation of model performance and to avoid biases arising from a single partition of the data, we used k-fold cross-validation with k = 5 (five-fold cross-validation). This procedure consisted of randomly dividing the dataset into five subsets of equal size. In each iteration, four of the subsets were used to train the model, while the remaining subset was used to evaluate it. This process was repeated five times so that each subset served once as the test set. The metrics obtained in each iteration were averaged to yield a global estimate of each algorithm’s performance, as recommended by recent studies on agricultural classification using machine learning [17,18].
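The five-fold procedure described above can be sketched with scikit-learn as follows. This is a minimal illustration and not the authors' code: the data are synthetic (generated with `make_classification` as a stand-in for the 2400-row, seven-feature morphometric table), the `StandardScaler` step and the `random_state` values are assumptions, and logistic regression is used only as an example estimator.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the morphometric table (assumption, not the real data):
# 2400 rows, 7 features, 3 roughly balanced classes.
X, y = make_classification(
    n_samples=2400, n_features=7, n_informative=5, n_redundant=2,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Random division into five folds of equal size; each fold is the test set once.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
fold_sizes = [len(test_idx) for _, test_idx in cv.split(X)]

# Scaling inside the pipeline is an assumption; it is refit on each training fold.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(solver="lbfgs", max_iter=1000),
)

scoring = ["accuracy", "precision_macro", "recall_macro", "f1_macro"]
scores = cross_validate(model, X, y, cv=cv, scoring=scoring)

# Average the per-fold metrics to obtain one global estimate per metric.
mean_scores = {name: float(np.mean(scores[f"test_{name}"])) for name in scoring}
print(fold_sizes)
print(mean_scores)
```

With 2400 observations, each of the five folds holds 480 samples. Replacing `KFold` with `StratifiedKFold` would additionally keep the class proportions equal in every fold, which may be preferable for a balanced design.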

The multiclass classification models selected were logistic regression (LR), support vector machines (SVC), random forest (RF), artificial neural network (ANN), and k-nearest neighbors (KNN). All models were trained on the same training sets to ensure comparability and were implemented using the scikit-learn library in Python 3.12.12. In terms of the base configuration, logistic regression was trained with the lbfgs solver, a maximum of 1000 iterations, and a multinomial multiclass setting. The SVC model was configured with a radial basis function (RBF) kernel, a regularization parameter of C = 1.0, automatically scaled gamma (scale), and probability estimates enabled. The random forest was implemented with 100 trees, no restriction on maximum depth, and a minimum split of 2 samples per node. For the artificial neural network, a single hidden layer with 100 neurons was defined, with ReLU activation, the adam optimizer, and a maximum of 500 iterations. Finally, the KNN model used 5 nearest neighbors, uniform weights, and Euclidean distance as the metric. These configurations correspond to standard parameters recommended for comparative studies in supervised classification [19,20].
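As a rough guide, the base configurations listed above map onto scikit-learn estimators as sketched below. The dictionary keys and the `random_state` value are illustrative assumptions; any parameter not mentioned in the text is left at its library default. The multinomial setting is not passed explicitly because recent scikit-learn releases already use a multinomial formulation with the `lbfgs` solver on multiclass targets.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

SEED = 0  # assumption: the source does not report a random seed

models = {
    # lbfgs solver, up to 1000 iterations (multinomial by default for multiclass)
    "LR": LogisticRegression(solver="lbfgs", max_iter=1000),
    # RBF kernel, C = 1.0, gamma="scale", probability estimates enabled
    "SVC": SVC(kernel="rbf", C=1.0, gamma="scale", probability=True,
               random_state=SEED),
    # 100 trees, unrestricted depth, minimum of 2 samples to split a node
    "RF": RandomForestClassifier(n_estimators=100, max_depth=None,
                                 min_samples_split=2, random_state=SEED),
    # one hidden layer of 100 neurons, ReLU activation, adam, 500 iterations
    "ANN": MLPClassifier(hidden_layer_sizes=(100,), activation="relu",
                         solver="adam", max_iter=500, random_state=SEED),
    # 5 neighbors, uniform weights, Euclidean distance
    "KNN": KNeighborsClassifier(n_neighbors=5, weights="uniform",
                                metric="euclidean"),
}

for name, estimator in models.items():
    print(name, type(estimator).__name__)
```

The text does not state whether features were standardized before training; distance-based and gradient-based models (SVC, KNN, ANN) are usually sensitive to feature scale, so wrapping each estimator in a pipeline with a scaler is a common choice.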

The evaluation metrics used to compare model performance were accuracy, macro-averaged precision, macro-averaged recall (sensitivity), and macro-averaged F1-score. These metrics allow assessment of model behavior in multiclass contexts and are particularly useful when balanced performance across classes is sought. The macro-averaging approach assigns equal weight to each class, which is consistent with the balanced design of the dataset.
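To make the macro-averaging explicit, the sketch below computes the four metrics from a confusion matrix. The matrix values are invented for illustration and are not taken from the study's tables; the function name `macro_metrics` is likewise hypothetical.

```python
import numpy as np

def macro_metrics(confusion):
    """Accuracy and macro-averaged precision, recall and F1.

    Rows are true classes, columns are predicted classes. Assumes every
    class is predicted at least once (otherwise precision is undefined).
    """
    cm = np.asarray(confusion, dtype=float)
    true_pos = np.diag(cm)
    precision = true_pos / cm.sum(axis=0)  # per class: TP / predicted count
    recall = true_pos / cm.sum(axis=1)     # per class: TP / true count
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": true_pos.sum() / cm.sum(),
        "precision": precision.mean(),  # unweighted mean = macro average
        "recall": recall.mean(),
        "f1": f1.mean(),
    }

# Illustrative 3-class matrix with 100 true samples per class (made-up numbers).
example_cm = [
    [90, 10, 0],
    [8, 92, 0],
    [0, 0, 100],
]
metrics = macro_metrics(example_cm)
print(metrics)
```

When every class has the same number of true samples, as in this example, macro-averaged recall coincides with overall accuracy, which is one reason the two figures tend to agree on balanced datasets.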

To analyze potential overfitting, we compared accuracy values obtained on the training and test sets. If no significant differences were found, we concluded that the model generalized adequately. Conversely, if a significant difference was detected and the model showed higher performance on training than on testing, it was considered overfitted, which could affect its applicability to new data. This procedure has been recommended in recent literature as good practice to ensure generalization capacity in agricultural and food classification problems [21].

The choice of algorithms, cross-validation, metrics, and base model configurations is grounded in best practices in data science and machine learning, ensuring an objective, reproducible, and statistically sound evaluation of model performance [17,18,20].

3. Results

The results reveal marked morphometric differences among the three varieties (Table 2). INIAP-12 exhibits, on average, the highest values for area (69,699.29) and major axis length (392.70), indicating larger and more elongated grains. INIAP-11 follows closely on both variables, whereas INIAP-20 stands out for having significantly the lowest values in area (26,499.15) and major axis (223.23), reflecting a smaller overall grain surface and length.

Perimeter follows a similar trend, with INIAP-12 and INIAP-11 showing comparable means that are higher than that of INIAP-20 (605.25), reinforcing their morphological differentiation. Regarding minor axis length, which represents grain width, INIAP-20 again shows the lowest values (151.62), indicating a smaller and possibly more rounded morphology.

These quantitative characteristics are key for varietal differentiation and are useful in automatic classification contexts using machine learning, as suggested by previous research in grain morphometrics.

The results (Table 3) show clear morphological differences among the rice varieties analyzed. For eccentricity, INIAP-12 presents the highest average values (0.81), indicating more elongated grains, while INIAP-20 has the lowest values (0.73), suggesting a more rounded shape. INIAP-11 lies in an intermediate position (0.75).

With respect to convex area, INIAP-12 again stands out with the highest values (mean: 70,776.53), followed by INIAP-11 (66,650.87), and far below is INIAP-20 (26,838.75), reflecting a considerably smaller size in the latter variety.

For extent (the ratio between area and its bounding area), the three varieties display similar values (around 0.75–0.76), although INIAP-12 shows a slight predominance.

These differences are consistent with the use of morphometric variables for varietal discrimination in rice, as noted by previous studies on classification based on physical grain characteristics.
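The link between the axis lengths and eccentricity can be checked with a short calculation. The sketch below assumes the usual image-analysis definition, in which eccentricity is that of an ellipse fitted to the grain region, e = sqrt(1 - (b/a)^2), with a and b the major and minor axis lengths; the text does not state which definition was used, so this is an assumption. The function name is hypothetical, and the inputs are the INIAP-20 mean axis lengths quoted above.

```python
import math

def ellipse_eccentricity(major_axis, minor_axis):
    """Eccentricity of an ellipse from its axis lengths (0 = circle, near 1 = elongated)."""
    if major_axis <= 0 or not (0 <= minor_axis <= major_axis):
        raise ValueError("expected 0 <= minor_axis <= major_axis and major_axis > 0")
    return math.sqrt(1.0 - (minor_axis / major_axis) ** 2)

# INIAP-20 mean major and minor axis lengths reported in the text.
ecc_iniap20 = ellipse_eccentricity(223.23, 151.62)
print(round(ecc_iniap20, 3))
```

The result, about 0.734, is close to the mean eccentricity of 0.73 reported for INIAP-20. The agreement is only approximate by construction, since the eccentricity of the mean axes is not the same quantity as the mean of per-grain eccentricities.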

Prior to model training, the correlation analysis among the variables (Figure 2) showed that no relationship among the independent variables can be considered strong, which suggests that the likelihood of significant redundancy between them is low. Likewise, the moderate associations observed between certain variables are attributed to their shared context and to the natural relationships among the characteristics they measure.

In the performance analysis of the models without applying dimensionality reduction (Table 4), all of them achieved relatively high metrics on both the training and test sets. However, a comparative assessment of these metrics reveals differential behaviors in terms of generalization capacity and possible overfitting in some models.

The random forest (RF) model showed perfect performance on the training set (accuracy, precision, recall, and F1-score all equal to 1.0000), whereas on the test set these metrics decreased slightly (all at 0.9400). This difference suggests the presence of overfitting, given that the model fits the training data completely but loses generalization capacity on unseen data. This discrepancy is relevant and can be evaluated with a Z-test for the difference in proportions between the training and test accuracy. If statistical significance is found, the model should be considered overfitted, as cautioned by Ref. [22].

In contrast, the artificial neural network (ANN) not only maintained high performance in training (F1-score = 0.9400) but also improved slightly on the test set (F1-score = 0.9533), which indicates adequate generalization and suggests the model is not overfitted. This robust behavior implies that the network captured meaningful patterns without memorizing the training data.

The support vector classifier (SVC) likewise exhibited balanced behavior between the two sets, with very similar metrics (F1-score of 0.9367 in training and 0.9433 in testing). This stability reflects a good fit with no evidence of overlearning and is consistent with SVC’s strengths in multiclass classification when variables offer high class separation [23].

Logistic regression (LR) and k-nearest neighbors (KNN) showed minimal differences between training and test metrics. For LR, accuracy went from 0.9190 to 0.9156, while for KNN it went from 0.9338 to 0.9244. These slight declines are not indicative of overfitting but rather of normal variability between sets, especially considering the sample size and the cross-validation technique employed.

Therefore, the only model with signs of overfitting is random forest, as it attained perfect performance in training that did not replicate in testing. By contrast, ANN and SVC emerge as the models with the best generalization capacity, while LR and KNN deliver competitive performance without overfitting, albeit with slightly lower metrics.

As shown in Figure 3, the learning curves for the training and testing datasets converge, providing a priori evidence of an adequate fit of the SVC model and the absence of overfitting. The shaded band around the lines corresponds to the 95% confidence interval (CI) for the accuracy.

The performance of the models after applying dimensionality reduction via PCA is shown in Table 5. In general terms, most algorithms maintain acceptable performance across the training and test sets, albeit with variations that reveal strengths and limitations in their generalization capacity.

For logistic regression (LR), the training and test metrics remain practically constant. Accuracy decreased from 0.9110 to 0.9063, while precision, recall, and F1-score showed minimal variations. These results reflect the model’s stability and adequate generalization capacity, although its performance is relatively more modest than that of more complex algorithms.

The support vector classifier (SVC) emerged as the best-performing model after dimensionality reduction. On the training set it achieved an accuracy of 0.9733 and an F1-score of 0.9579, while on the test set it maintained high values (0.9567 and 0.9326, respectively). Although there is a slight decrease in recall and F1-score, the model preserves a notable balance between precision and recall, supporting its robustness in multiclass contexts with transformed data.

In contrast, the random forest (RF) exhibited an overfitting pattern. Despite high training metrics (0.9700 across all measures), it showed a considerable drop on the test set, reaching only 0.9000 in accuracy, precision, recall, and F1-score. This discrepancy suggests the model tends to memorize training patterns, reducing its generalization capacity on unseen data and limiting its practical applicability.

The artificial neural network (ANN), for its part, maintained consistent and stable performance across sets. Accuracy increased slightly from 0.9210 to 0.9363, while the F1-score remained at similar levels (0.9320 in training and 0.9251 in testing). These results indicate that the ANN captures relevant relationships without evidence of overfitting, consolidating it as a robust model after dimensionality reduction.

Finally, the k-nearest neighbors model (KNN) showed a slight decrease in metrics from training to testing. Accuracy fell from 0.9154 to 0.8945, accompanied by proportional reductions in precision, recall, and F1-score. This performance loss, though moderate, is consistent with the nature of KNN, whose behavior depends on the spatial structure of the data, which can be altered after PCA.

Taken together, the results suggest that SVC and ANN are the most suitable algorithms for morphological classification of rice grains under dimensionality reduction, combining high accuracy with adequate generalization. In contrast, RF showed clear signs of overfitting, while LR and KNN delivered acceptable performance with somewhat lower metrics. These findings are consistent with prior studies highlighting the effectiveness of SVC and neural networks in multivariate classification tasks with transformed data.

Regarding the number of components retained for predicting outcomes when applying dimensionality reduction through principal component analysis (PCA), we decided to keep 3 of the 7 total components because they capture nearly all of the total variability (Figure 4). Consequently, a comparison of the results in Table 4 and Table 5 shows that predictive performance is not substantially diminished, allowing very similar results to be obtained with reduced processing.
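A minimal sketch of this component-selection step is given below. It does not use the study's data: the seven features are simulated from three hypothetical latent factors plus small noise so that three components dominate, and standardizing before PCA is an assumption, since the text does not describe the preprocessing.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical data: 7 correlated features driven by 3 latent factors.
latent = rng.normal(size=(2400, 3))
mixing = rng.normal(size=(3, 7))
X = latent @ mixing + 0.05 * rng.normal(size=(2400, 7))

# Standardize so that large-valued features (e.g. areas) do not dominate.
X_scaled = StandardScaler().fit_transform(X)

# Keep 3 of the 7 components and inspect how much variance they retain.
pca = PCA(n_components=3).fit(X_scaled)
X_reduced = pca.transform(X_scaled)
cumulative = np.cumsum(pca.explained_variance_ratio_)

print(X_reduced.shape)
print(cumulative)
```

In a train/test setting, the scaler and the PCA would normally be fitted on the training portion only and then applied to the test portion, so that no information from the test data leaks into the projection.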

To assess potential overfitting in the models employed, we conducted a statistical comparison between the accuracy percentages obtained on the training and test sets for each classifier. The test used was the Z-test for the difference in proportions, which determines whether the observed difference between the two sets is statistically significant (Table 6).
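The test can be written in a few lines. The sketch below uses the standard pooled two-proportion Z statistic with a two-sided p-value; whether the authors used the pooled or unpooled variant is not stated, so this is an assumption. The sample sizes in the example (1920 training and 480 test observations) are placeholders, because the split sizes are not reported here, so the resulting Z value is not expected to match Table 6.

```python
import math

def two_proportion_z_test(p_train, n_train, p_test, n_test):
    """Pooled two-sided Z-test for the difference between two proportions."""
    pooled = (p_train * n_train + p_test * n_test) / (n_train + n_test)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_train + 1.0 / n_test))
    z = (p_train - p_test) / std_err
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Placeholder sample sizes (assumption); accuracies are the RF values from the text.
z_rf, p_rf = two_proportion_z_test(1.00, 1920, 0.94, 480)
print(f"Z = {z_rf:.2f}, p = {p_rf:.3g}")

# Identical accuracies give Z = 0 and p = 1, i.e. no evidence of a gap.
z_same, p_same = two_proportion_z_test(0.93, 1920, 0.93, 480)
print(f"Z = {z_same:.2f}, p = {p_same:.3g}")
```

A positive and significant Z indicates that training accuracy exceeds test accuracy by more than sampling variation would explain, which is the criterion used here to flag overfitting.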

The results show that the logistic regression (LR), support vector classifier (SVC), artificial neural network (ANN), and k-nearest neighbors (KNN) models do not exhibit significant differences between training and test accuracy. For these models, the absolute Z values were below 1.2 and p-values exceeded 0.24, indicating that the observed differences could be attributed to chance (LR: Z = 0.25, p = 0.805; SVC: Z = 0.56, p = 0.578; ANN: Z = 1.15, p = 0.250; KNN: Z = 0.73, p = 0.464). Consequently, these classifiers are concluded to have adequate generalization capacity with no evidence of overlearning.

In contrast, the random forest (RF) model displayed a markedly different behavior. Accuracy on the training set was 100%, whereas on the test set it was 94%, yielding Z = 7.15 with p < 0.001. This statistically significant difference indicates a clear case of overfitting, which is consistent with the fact that ensemble models such as RF can tend to memorize the training data if depth or number-of-trees parameters are not properly controlled.

Table 7 presents the results of the Z-test for the difference in proportions, which enables the statistical comparison of the prediction percentages obtained on the training and test sets for each algorithm after dimensionality reduction. This analysis is key to identifying the existence of overfitting, that is, performance that is significantly higher on training and does not replicate on unseen data.

In the case of logistic regression (LR), training accuracy (0.9110) and test accuracy (0.9063) do not differ significantly (Z = 0.2433; p = 0.8078). This confirms that the model generalizes adequately, with variations attributable to sampling randomness rather than overfitting.

The support vector classifier (SVC) showed a slight reduction between training (0.9733) and test (0.9567), but this difference did not reach statistical significance (Z = 1.2896; p = 0.1972). These results reinforce the model’s stability and robustness, as it maintains high and consistent performance across both sets.

In contrast, random forest (RF) did present statistically significant differences between training (0.9700) and test (0.9000), with Z = 3.8398 and p < 0.001. This finding confirms the presence of overfitting, since the model achieves an almost perfect performance in training that it fails to sustain on test data, limiting its generalization capacity.

The artificial neural network (ANN) showed the opposite situation: test accuracy (0.9363) slightly exceeded training accuracy (0.9210), with a negative Z value (−0.9149) that was not significant (p = 0.3602). This indicates that the network is not overfitted and, on the contrary, maintains a stable and robust behavior when faced with new data.

As for the k-nearest neighbors model (KNN), it showed a slight difference between training (0.9154) and test (0.8945), without reaching statistical significance (Z = 1.0442; p = 0.2964). This reflects stable performance with no signs of overfitting, although with metrics somewhat lower than those of SVC and ANN.

With regard to the ROC curve analysis (Figure 5), all evaluated models achieved area-under-the-curve (AUC) values close to 1, indicating high discriminative capacity. In particular, the Logistic Regression, Random Forest, SVC, and Artificial Neural Network models obtained an AUC of 0.99, while k-nearest neighbors reached a slightly lower value (0.98). Nevertheless, when considering the previous results as a whole, including accuracy, precision, recall, and F1-score in training and testing, as well as the statistical test to detect overfitting, it is confirmed that the SVC and Artificial Neural Network models offer the best balance between performance and generalization capacity. In contrast, despite its excellent AUC, the Random Forest model showed statistically significant evidence of overfitting. Thus, it is concluded that the Artificial Neural Network and SVC are the most robust alternatives for the morphological classification of rice grains.
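For a three-class problem, a single AUC value requires an averaging scheme. The sketch below shows one common option, the macro-averaged one-vs-rest AUC from scikit-learn's `roc_auc_score`; the text does not say which scheme was used for Figure 5, so this choice is an assumption. The data are synthetic, and the 75/25 split and logistic regression estimator are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic 3-class data standing in for the morphometric features (assumption).
X, y = make_classification(
    n_samples=2400, n_features=7, n_informative=5, n_redundant=2,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0,
)

clf = LogisticRegression(solver="lbfgs", max_iter=1000).fit(X_train, y_train)

# Class-membership probabilities, one column per class.
proba = clf.predict_proba(X_test)

# One-vs-rest: compute an AUC per class against the other two, then average.
macro_auc = roc_auc_score(y_test, proba, multi_class="ovr", average="macro")
print(round(macro_auc, 3))
```

Because AUC measures ranking quality on one evaluation set, a value near 1 on test data does not reveal a gap between training and test accuracy, which is why it is read alongside the other metrics here.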

Below, the confusion matrix (Table 8) is shown for what, in this case, turned out to be the best prediction model without dimensionality reduction. Given that this is a multiclass classification case, the matrices are presented as follows.

Likewise, the confusion matrix of the best predictive model for multiclass classification with dimensionality reduction is presented (Table 9).

Dimensionality reduction with PCA proved efficient from an operational standpoint. Retaining three of seven components captured virtually all relevant variability and maintained high levels of accuracy, precision, recall, and F1-score, which suggests structural correlations among morphometric features that can be compressed without an appreciable loss of discriminative signal. This observation is consistent with recent evidence on the ability of image-based models to capture agronomic and phenological signals pertinent to both foliar diagnosis and crop stage monitoring [1,2,3], and even to geographic traceability tasks based on visual patterns [4]. In terms of transferability, the input compression afforded by PCA contributes to the scalability of field systems where computing resources are limited, as reported in counting and detection pipelines using aerial platforms and in real-time diagnostic deployments [13,14].

In the comparison of algorithms, the Support Vector Classifier and the artificial neural network were the most robust models, with balanced and stable metrics between training and testing and no evidence of overfitting according to the Z-test. The stability of SVC is explained by margin maximization and the nonlinear projection induced by the RBF kernel, characteristics that have proven advantageous in domains with high phenotypic variability and moderate-to-high class separability [11,23]. In the case of the ANN, a moderate architecture with ReLU activation and the Adam optimizer captured nonlinear interactions among shape features without memorizing idiosyncrasies of the training set, which is consistent with recent comparisons of deep learning for morphological and plant health (disease) classification in rice [9,10,12].

The overfitting pattern of Random Forest, evidenced by perfect training performance and a significant drop on testing, aligns with methodological cautions regarding ensemble models when complexity is not controlled, particularly tree depth, minimum leaf size, number of estimators, and the fraction of features considered per split [22]. Even with AUC values close to 1 for all classifiers, including RF, the results underscore that a high AUC does not by itself guarantee out-of-sample generalization. Consequently, evaluation should integrate macro-averaged metrics, contrasts between training and testing, and replicable validation frameworks, as recommended by applied studies of agricultural classification using machine learning [17,18,20,21].

The behavior of logistic regression and k-nearest neighbors also offers design lessons. Logistic regression, despite its linearity, maintained stability and interpretability, attributes that are valuable for certification audits and process control. K-nearest neighbors showed a slight degradation after PCA, an expected effect given its dependence on distances and on the local structure of the feature space. This result suggests calibrating k and the distance metric and even considering supervised metric learning prior to projection when KNN is an operational candidate.
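One way to calibrate k and the distance metric after projection is a cross-validated grid search. The sketch below assumes scikit-learn and synthetic data; the grid values are hypothetical, and macro-averaged F1 is used as the selection criterion only as an example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the seven morphological features (assumption).
X, y = make_classification(
    n_samples=600, n_features=7, n_informative=4, n_redundant=3,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Scaling and PCA sit inside the pipeline so they are refit on each fold,
# which keeps the validation folds out of the projection.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=3)),
    ("knn", KNeighborsClassifier()),
])

# Hypothetical search grid over neighborhood size, metric, and weighting.
param_grid = {
    "knn__n_neighbors": [3, 5, 7, 11, 15],
    "knn__metric": ["euclidean", "manhattan"],
    "knn__weights": ["uniform", "distance"],
}

search = GridSearchCV(
    pipe, param_grid, scoring="f1_macro",
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```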

4. Discussion

The results obtained in this study provide a detailed comparison of the performance of morphological classification algorithms for rice grains under two different scenarios: with and without dimensionality reduction through Principal Component Analysis (PCA). A key finding is that dimensionality reduction did not substantially alter the predictive capacity of the models. Across both scenarios, global metrics such as accuracy, precision, recall, and F1-score remained at consistently high levels. This indicates that the three retained principal components successfully concentrated most of the data variability while preserving the information necessary for varietal classification. From a practical standpoint, this finding is of particular relevance since it demonstrates that computational cost can be reduced without significantly compromising classifier performance, a consideration that is essential in large-scale industrial applications where efficiency is paramount [24,25].

When comparing individual algorithms, the Support Vector Classifier (SVC) and Artificial Neural Network (ANN) consistently emerged as the most robust models across both experimental conditions. Both demonstrated balanced and stable metrics between training and test datasets, with no evidence of overfitting according to the Z-test for differences in proportions. The robustness of these models can be attributed to two main aspects. First, SVC is well-suited to problems where classes are morphologically distinct, as it constructs separating hyperplanes that maximize class margins in the feature space. Second, ANN is capable of modeling complex nonlinear relationships between features, enabling it to capture subtle morphological variations in rice grains. The combination of these properties ensures that SVC and ANN maintain a high degree of generalizability, which is crucial when the models are applied to unseen data [26,27].
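The overfitting check can be sketched as a two-sided Z-test for the difference between two proportions. The exact variant used in the study is not stated in this section; the sketch below uses the unpooled standard error, and the sample sizes (800 per group, the total implied by the confusion matrices in Tables 8 and 9) are assumptions.

```python
import math


def two_proportion_z_test(p_train, n_train, p_test, n_test):
    """Two-sided Z-test for p_train - p_test with an unpooled standard error."""
    variance = (p_train * (1.0 - p_train) / n_train
                + p_test * (1.0 - p_test) / n_test)
    if variance == 0.0:
        # Both proportions are exactly 0 or 1: no sampling variability to test.
        return 0.0, 1.0
    z = (p_train - p_test) / math.sqrt(variance)
    return z, p_value_from_z(z)


def p_value_from_z(z):
    """Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|))."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Accuracies from Table 6; n = 800 per group is an assumption.
z_rf, p_rf = two_proportion_z_test(1.0000, 800, 0.9400, 800)
z_lr, p_lr = two_proportion_z_test(0.9190, 800, 0.9156, 800)
print(f"RF: |Z|={abs(z_rf):.4f} p={p_rf:.4f}")
print(f"LR: |Z|={abs(z_lr):.4f} p={p_lr:.4f}")
```

Under these assumed sample sizes the statistics land close to the values reported in Table 6, although that agreement does not by itself confirm the authors' procedure. A large |Z| with a small p-value flags a training/test gap that is unlikely to be due to sampling alone.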

In contrast, Random Forest (RF) exhibited a consistent tendency toward overfitting in both scenarios. The model achieved near-perfect performance on the training dataset but showed a notable decline in accuracy and other metrics when evaluated on the test set. This discrepancy was statistically confirmed, underscoring the limited generalization capacity of RF in its current configuration. While RF is known for its robustness and ability to handle noisy data in many domains, the results here suggest that its high variance, especially when the number of estimators and tree depth are not carefully regulated, makes it less suitable for multiclass classification tasks in rice grain morphology [28,29].

Logistic Regression (LR) and K-Nearest Neighbors (KNN) demonstrated acceptable performance across both scenarios, although their metrics were consistently lower than those of SVC and ANN. LR distinguished itself by its stability and interpretability, qualities that may prove useful in laboratory settings or in contexts where model transparency is prioritized over predictive power. KNN, however, displayed a degree of sensitivity to dimensionality reduction. After applying PCA, the algorithm experienced a slight decline in performance, which is consistent with its distance-based nature. Since KNN relies on geometric relationships in feature space, transformations that alter the underlying structure can influence classification outcomes. Despite these limitations, both LR and KNN avoided overfitting and thus represent viable alternatives in settings with reduced complexity or as complementary models to more advanced approaches [30,31].

The practical implications of these findings are particularly significant for rice milling plants and seed certification laboratories. The fact that PCA did not compromise predictive performance suggests that automated classification systems can be implemented using a reduced number of features, thereby optimizing computational resources while maintaining accuracy. This is especially advantageous in industrial contexts where large volumes of grains must be processed in real time, as computational efficiency directly translates into operational scalability [32,33]. Within this framework, SVC and ANN stand out as the most appropriate models for deployment, given their superior accuracy, consistent stability, and demonstrated absence of overfitting. Their suitability becomes especially evident in contexts where rice varieties display clear morphometric separability, allowing for reliable classification across diverse grain batches [11].

Despite these promising results, several limitations of the study must be acknowledged. First, the dataset was artificially balanced across classes. In real-world scenarios, some varieties are naturally more prevalent than others, which can affect classifier performance. Models trained under balanced conditions may not fully reflect performance in imbalanced settings, underscoring the need for future studies to apply class-weighting strategies or stratified sampling [26]. Second, the images were captured under controlled conditions with standardized lighting and uniform backgrounds. In practical applications, such as processing plants, environmental variability (e.g., shadows, uneven lighting, or heterogeneous backgrounds) could challenge classifier robustness. Finally, the number of rice varieties included in the analysis was limited. While sufficient to demonstrate model feasibility, a broader range of varieties would be necessary to fully validate the generalizability of the findings across diverse agricultural contexts [34].
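As an illustration of the class-weighting strategy mentioned above, the following sketch computes inverse-frequency weights, n_samples / (n_classes * class_count), which is the heuristic scikit-learn documents for class_weight="balanced". The class counts are hypothetical and only mimic an imbalanced grain batch.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical imbalanced batch (assumption, not the study's data).
labels = ["INIAP-11"] * 600 + ["INIAP-12"] * 300 + ["INIAP-20"] * 100
weights = balanced_class_weights(labels)

for cls in sorted(weights):
    print(f"{cls}: weight={weights[cls]:.4f}")
```

Rarer varieties receive proportionally larger weights, so each class contributes equally to the weighted loss; such a dictionary could be passed as the class_weight argument of classifiers that accept one.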

An important insight that emerges from this study is that models incorporating explicit mechanisms of complexity control and regularization, such as SVC and ANN, perform more effectively in morphological classification tasks where classes are well separated. Conversely, RF, despite its high training performance and elevated AUC values, demonstrated a reduced ability to generalize. This highlights the importance of evaluating not only global performance metrics but also the stability of models across different datasets. The integrated analytical approach used here, which combined traditional metrics with statistical tests and AUC evaluation, provided a more comprehensive assessment of model reliability and avoided the risk of drawing conclusions from accuracy scores alone [35].
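To make the macro-averaged metrics concrete, the sketch below derives them from the SVC confusion matrix in Table 8. It assumes rows hold the true class and columns the predicted class; the "Observed"/"Expected" labels in the tables do not settle the orientation, and swapping it would exchange precision and recall while leaving accuracy unchanged.

```python
# Confusion matrix from Table 8 (SVC, without dimensionality reduction).
# Assumption: rows = true class, columns = predicted class.
labels = ["INIAP-11", "INIAP-12", "INIAP-20"]
matrix = [
    [232, 35, 0],
    [11, 256, 0],
    [0, 0, 266],
]
n = len(labels)

total = sum(sum(row) for row in matrix)
accuracy = sum(matrix[i][i] for i in range(n)) / total

row_sums = [sum(row) for row in matrix]
col_sums = [sum(matrix[r][c] for r in range(n)) for c in range(n)]

# Per-class metrics, then an unweighted (macro) average across classes.
recall = [matrix[i][i] / row_sums[i] for i in range(n)]
precision = [matrix[i][i] / col_sums[i] for i in range(n)]
f1 = [2 * p * r / (p + r) for p, r in zip(precision, recall)]

macro_precision = sum(precision) / n
macro_recall = sum(recall) / n
macro_f1 = sum(f1) / n

print(f"accuracy={accuracy:.4f}")
print(f"macro precision={macro_precision:.4f} "
      f"macro recall={macro_recall:.4f} macro F1={macro_f1:.4f}")
```

Under that orientation the matrix gives an accuracy of 754/800 = 0.9425, near the SVC test accuracy in Table 4; small differences from the tabulated metrics may reflect rounding or a different averaging convention.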

Looking forward, the findings open several avenues for future research. One promising direction is the fine-tuning of RF parameters to reduce its tendency to overfit. By optimizing hyperparameters such as maximum tree depth or the number of estimators, it may be possible to strike a balance between model flexibility and generalization. Another critical step involves validating the models on externally collected datasets under uncontrolled imaging conditions. Such validation would provide a more realistic assessment of model robustness and adaptability. Similarly, the exploration of deep learning approaches, particularly Convolutional Neural Networks (CNNs), represents an important opportunity. CNNs, which operate directly on images without requiring manual feature extraction, could potentially outperform classical models like SVC and ANN when applied to sufficiently large and diverse datasets. However, their computational demands remain a practical consideration for industrial implementation [36,37].

Future work should also consider the integration of additional data modalities. For example, combining morphometric data with spectral or hyperspectral information could improve classification performance by incorporating not only shape and size but also chemical composition. This multimodal approach would enhance varietal classification while also enabling the detection of other grain properties such as damage, contamination, or maturity level [38]. From an applied perspective, further research is also needed to explore the scalability of these models in real-time systems. Developing integrated prototypes that combine camera systems with trained classifiers and testing them in operational environments would provide valuable insights into processing speed, cost-effectiveness, and feasibility for large-scale industrial deployment [39].

In summary, the results demonstrate that dimensionality reduction via PCA constitutes an efficient strategy for optimizing computational resources without sacrificing predictive accuracy. SVC and ANN clearly stand out as the most robust and reliable models for morphological classification of rice grains, combining high performance, stability across datasets, and resistance to overfitting. RF, while achieving high AUC scores, displayed practical limitations due to its overfitting tendencies, whereas LR and KNN offered stable but comparatively modest performance. Collectively, these findings provide strong evidence for the adoption of SVC and ANN as primary tools in automated rice classification systems, while also emphasizing the importance of external validation, model tuning, and exploration of advanced methods such as deep learning and multimodal data integration [11]. The path toward more universal and efficient classification systems in the rice industry thus lies in combining computational efficiency with methodological flexibility and enhanced generalization capacity.

5. Conclusions

This study compared the performance of morphological classification algorithms for rice grains with and without dimensionality reduction via PCA, and the following conclusions can be drawn.

First, the results show that dimensionality reduction does not substantially affect the predictive capacity of the models. In both scenarios, the global metrics of accuracy, precision, recall, and F1-score remained at high levels, confirming that the three retained principal components concentrate most of the data variability and preserve the information relevant for varietal classification. This means it is possible to reduce computational cost without significantly compromising classifier performance.

Regarding the comparison of algorithms, the Support Vector Classifier (SVC) and the Artificial Neural Network (ANN) proved to be the most robust models both with and without dimensionality reduction. In both cases, they exhibited balanced and stable metrics between training and test, with no evidence of overfitting according to the Z-test for the difference in proportions. These findings reinforce the suitability of SVC and ANN for multiclass classification tasks in contexts with high morphometric separation among rice varieties.

By contrast, Random Forest (RF) displayed a consistent pattern of overfitting in both scenarios. Although it achieved near-perfect performance in training, its metrics declined significantly on the test set, and this difference was statistically verified. This limitation suggests the need to adjust model complexity, for example by regulating tree depth or the number of estimators, to improve generalization capacity.

Logistic Regression (LR) and K-Nearest Neighbors (KNN) maintained acceptable performance under both approaches, with metrics lower than those of SVC and ANN but without signs of overfitting. LR was characterized by its stability and simplicity, whereas KNN showed some sensitivity to data transformation, with slight performance losses after applying PCA, consistent with its distance-dependent nature.

As a final point, the integrated analysis, including performance metrics, the Z-test, and AUC values, allows us to conclude that dimensionality reduction via PCA is an efficient strategy to optimize processing without sacrificing predictive quality. Within this framework, SVC and ANN stand out as the most advisable models for the morphological classification of rice grains, as they combine high accuracy, stability across datasets, and an absence of overfitting, whereas RF, despite its high AUC, shows practical limitations due to its tendency toward overfitting.

In future work, we expect to implement and deploy the best-performing model in combination with other tools such as precision cameras and real-time image processing. From a comparative perspective, since machine learning models typically exhibit deterministic behavior and converge to the same parameter estimates, their predictions can be expected to be consistent across runs. In contrast, deep learning algorithms incorporate stochastic components that may lead to different outcomes between runs, potentially resulting in improved predictive performance.

It should be noted that the results obtained are based on a specific dataset which, for confidentiality reasons, cannot be made publicly available.

Supplementary Materials

To facilitate the replication of this study, a GitHub repository (version 3.5.4) has been created and is available at https://github.com/Julians30/CLASIFICACION_INIAP (accessed on 5 November 2025). The data are proprietary and therefore not authorized for public dissemination.

Author Contributions

Conceptualization, J.C.-R. and A.H.-S.; methodology, A.H.-S. and C.D.-V.; software, J.T.-S.; validation, J.C.-R., A.H.-S. and C.D.-V.; formal analysis, J.C.-R.; investigation, A.H.-S.; resources, C.D.-V.; data curation, J.T.-S.; writing—original draft preparation, J.C.-R.; writing—review and editing, A.H.-S. and C.D.-V.; visualization, J.T.-S.; supervision, A.H.-S.; project administration, J.C.-R.; funding acquisition, A.H.-S. All authors have read and agreed to the published version of the manuscript.

Funding

The APC was funded by the Universidad Agraria del Ecuador (UAE).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

The authors would like to express their sincere gratitude to the Universidad Agraria del Ecuador (UAE) for its support of this research.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

INIAP	Instituto Nacional de Investigaciones Agropecuarias
LR	Logistic Regression
SVC	Support Vector Classifier
RF	Random Forest
KNN	K-Nearest Neighbors
ANN	Artificial Neural Network
PCA	Principal Component Analysis
ROC	Receiver Operating Characteristic

References

1. Alsakar, Y.M.; Sakr, N.A.; Elmogy, M. An enhanced classification system of various rice plant diseases based on multi-level handcrafted feature extraction technique. Sci. Rep. 2024, 14, 30601.
2. Nikitha, S.; Prabhanjan, S.; Rupa, T.R.; Dinesh, R. Enhancing plant nutritional deficiency analysis: A multi-attention convolutional neural network approach. Multimed. Tools Appl. 2024, 84, 27795–27817.
3. Chaurasia, H.; Arora, A.; Raju, D.; Marwaha, S.; Chinnusamy, V.; Jain, R.; Ray, M.; Sahoo, R.N. Identification of paddy stages from images using deep learning. J. Indian Soc. Agric. Stat. 2024, 78, 69–74.
4. Yu, H.; Chen, Z.; Liu, X.; Song, S.; Chen, M. Improving EfficientNet_b0 for distinguishing rice from different origins: A deep learning method for geographical traceability in precision agriculture. Curr. Plant Biol. 2025, 43, 100501.
5. Xu, J. GrainShape: A landmark-annotated image dataset of japonica rice grains for geometric morphometric analysis. Data Brief 2025, 61, 111781.
6. Feizi, N.; Sabouri, A.; Bakhshipour, A.; Abedi, A. Combinatorial approaches to image processing and MGIDI for the efficient selection of superior rice grain quality lines. Agriculture 2025, 15, 615.
7. Tharanya, M.; Chakraborty, D.; Pandravada, A.; Babu, R.; Gangashetti, M.; Paidi, S.; Choudhary, S.; Sivasakthi, K.; Anbazhagan, K.; Vaditandra, B.; et al. Utilizing X-ray radiography for non-destructive assessment of paddy rice grain quality traits. Plant Methods 2025, 21, 94.
8. Sung, B.-G.; Lee, C.-G.; Kang, Y.-H.; Yu, S.-H.; Lee, D.-H. Semi-supervised density estimation with background-augmented data for in situ seed counting. Agriculture 2025, 15, 1682.
9. Yadav, N. Deep learning-based detection and classification of rice diseases using residual networks (ResNet50). Int. J. Latest Technol. Eng. Manag. Appl. Sci. 2025, 14, 567–573.
10. Hossen, M.K.; Das, P.K.; Roy, R. Detection and classification of rice leaf diseases using OpenCV and deep learning. Curr. Appl. Sci. Technol. 2025, 25, e0260191.
11. Kondaveeti, H.K.; Simhadri, C.G. Evaluation of deep learning models using explainable AI with qualitative and quantitative analysis for rice leaf disease detection. Sci. Rep. 2025, 15, 31850.
12. Chen, H.; Zhang, Y.; He, C.; Chen, C.; Zhang, Y.; Chen, Z.; Jiang, Y.; Lin, C.; Ma, R.; Qi, L. PIS-Net: Efficient weakly supervised instance segmentation network based on annotated points for rice field weed identification. Smart Agric. Technol. 2024, 9, 100557.
13. Luu, T.H.; Phuc, P.N.K.; Ngo, Q.H.; Nguyen, T.T.; Nguyen, H.C. Design a computer vision approach to localize, detect and count rice seedlings captured by a UAV-mounted camera. Comput. Mater. Contin. 2025, 83, 5643–5656.
14. Arcila-Diaz, J.; Altamirano-Chavez, D.; Arcila-Diaz, L.; Valdivia, C. Real-time Identification of Rice Leaf Diseases using Convolutional Neural Networks. Int. J. Comput. 2024, 23, 709–714.
15. Yusuf, H.M.; Yusuf, S.A.; Abubakar, A.H.; Abdullahi, M.; Hassan, I.H. A systematic review of deep learning techniques for rice disease recognition: Current trends and future directions. Frankl. Open 2024, 8, 100154.
16. Yadav, R. A hybrid model for enhanced detection of microbial diseases in rice plants using ResNet50 and vision LSTM. J. Inf. Syst. Eng. Manag. 2025, 10, 675–685.
17. Qadri, S.; Aslam, T.; Nawaz, S.A.; Saher, N.; Razzaq, A.; Ur Rehman, M.; Ahmad, N.; Shahzad, F.; Furqan Qadri, S. Machine vision approach for classification of rice varieties using texture features. Int. J. Food Prop. 2021, 24, 1615–1630.
18. Wijayanto, A.K.; Junaedi, A.; Sujaswara, A.A.; Khamid, M.B.R.; Prasetyo, L.B.; Hongo, C.; Kuze, H. Machine learning for precise rice variety classification in tropical environments using UAV-based multispectral sensing. AgriEngineering 2023, 5, 2000–2019.
19. Khan, M.S.; Nath, T.D.; Hossain, M.M.; Mukherjee, A.; Hasnath, H.B.; Meem, T.M.; Khan, U. Comparison of multiclass classification techniques using dry bean dataset. Int. J. Cogn. Comput. Eng. 2023, 4, 6–20.
20. Rajalakshmi, R.; Faizal, S.; Sivasankaran, S.; Geetha, R. RiceSeedNet: Rice seed variety identification using deep neural network. J. Agric. Food Res. 2024, 16, 101062.
21. Iqbal, M.J.; Aasem, M.; Ahmad, I.; Alassafi, M.O.; Bakhsh, S.T.; Noreen, N.; Alhomoud, A. On Application of Lightweight Models for Rice Variety Classification and Their Potential in Edge Computing. Foods 2023, 12, 3993.
22. Kuhn, M.; Johnson, K. Applied Predictive Modeling, 1st ed.; Springer: New York, NY, USA, 2013.
23. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed.; Springer: New York, NY, USA, 2009.
24. Sharma, N.K.; Anand, A.; Budhlakoti, N.; Mishra, D.C.; Jha, G.K. Artificial intelligence and machine learning for rice improvement. In Climate-Smart Rice Breeding; Springer Nature: Singapore, 2024; pp. 273–300.
25. Rana, M.E.; Hameed, V.A.; Eng, I.K.Y.; Tripathy, H.K.; Mallik, S. Harnessing artificial intelligence for sustainable rice leaf disease classification. Front. Plant Sci. 2025, 16, 1594329.
26. Sampaio, P.S.; Almeida, A.S.; Brites, C.M. Use of artificial neural network model for rice quality prediction based on grain physical parameters. Foods 2021, 10, 3016.
27. Shah, M.; Banker, K.; Patel, J.; Rao, D. Comparative analysis of deep learning architectures for rice crop image classification. In Proceedings of the 4th International Conference on Artificial Intelligence and Smart Energy; Springer Nature: Cham, Switzerland, 2024; pp. 245–259.
28. Ahmed, S.B.; Ali, S.F.; Khan, A. On the frontiers of rice grain analysis, classification and quality grading: A review. IEEE Access 2021, 9, 160779–160796.
29. Sheng, R.T.-C.; Huang, Y.-H.; Chan, P.-C.; Bhat, S.; Wu, Y.-C.; Huang, N. Rice growth stage classification via RF-based machine learning and image processing. Agriculture 2022, 12, 2137.
30. Ali, M.; Hussain, Z.M.G. A Comparative Study Between Traditional Machine Learning and Deep Learning Models to Classify Rice Types. Master’s Thesis, National College of Ireland, Dublin, Ireland, 2023. Available online: https://norma.ncirl.ie/id/eprint/6621 (accessed on 10 November 2025).
31. Hamdikatama, B.; Kusrini, K.; Setyanto, A. Comparison of the performance of SVR, KNN and Decision Tree methods in predicting rice production. J. Tek. Inform. Dan Sist. Inf. 2025, 12.
32. Bouchard, J.D.; Acevedo, B.A.; Díaz, S.F.; Maiocchi, M. Análisis multivariante aplicado al estudio de las propiedades culinarias de arroz (Oryza sativa L.) en variedades largo fino. Rev. Cienc. Tecnol. 2020, 33, 33–37.
33. Ascoli, C.A.; da Silva, A.C. Relação entre condutividade elétrica e desempenho fisiológio de sementes de arroz. Nativa 2021, 9, 182–193.
34. Hassan, M.M.; Habib, M.A.; Nayak, S.; Jewel, Z.A.; Resmi, S.I.; Hasan, M. Evaluating farmers’ satisfaction from high-yielding rice (Oryza sativa) variety cultivation in boro season: Evidence from adaptive trials in Bangladesh. J. Saudi Soc. Agric. Sci. 2025, 24, 40.
35. Sun, J.; Jia, H.; Ren, Z.; Cui, J.; Yang, W.; Song, P. Accurate rice grain counting in natural morphology: A method based on image classification and object detection. Comput. Electron. Agric. 2024, 227, 109490.
36. Hashim, N.; Ali, M.M.; Mahadi, M.R.; Abdullah, A.F.; Wayayok, A.; Kassim, M.S.M.; Jamaluddin, A. Smart farming for sustainable rice production: An insight into application, challenge, and future prospect. Rice Sci. 2024, 31, 47–61.
37. Mansakul, T.; Tang, G.; Webb, P.; Rice, J.; Oakley, D.; Fowler, J. An end-to-end computationally lightweight vision-based grasping system for grocery items. Sensors 2025, 25, 5309.
38. Islam, M.M.; Himel, G.M.S.; Moazzam, M.G.; Uddin, M.S. Artificial intelligence-based rice variety classification: A state-of-the-art review and future directions. Smart Agric. Technol. 2025, 10, 100788.
39. Borah, S.S.; Khanal, A.; Sundaravadivel, P. Emerging technologies for automation in environmental sensing: Review. Appl. Sci. 2024, 14, 3531.

Figure 1. Diagram of moments and steps in experimental methodology.

Figure 2. Correlation matrix among independent variables.

Figure 3. Learning Curve for the best ML model.

Figure 4. Cumulative variance explained by the first n components.

Figure 5. ROC curve and AUC for each prediction model.

Table 1. Parameters of the machine learning models used in the analysis.

Model	Parameters
Logistic Regression (LR)	fit_intercept = True
	copy_X = True
	n_jobs = None
	positive = False
Support Vector Classifier (SVC)	C = 1.0
	kernel = rbf
	degree = 3
	gamma = scale
	coef0 = 0.0
	shrinking = True
	tol = 0.001
	cache_size = 200
	class_weight = None
	max_iter = −1
	decision_function_shape = ovr
Random Forest (RF)	n_estimators = 100
	criterion = gini
	max_depth = None
	min_samples_split = 2
	min_samples_leaf = 1
	min_weight_fraction_leaf = 0.0
	max_features = sqrt
	max_leaf_nodes = None
	min_impurity_decrease = 0.0
Artificial Neural Network (ANN)	Hidden layers size = 100
	activation = relu
	optimizer = Adam
	alpha = 0.0001
	learning rate = 0.01
	output function = linear
K-Nearest Neighbors (KNN)	n_neighbors = 5
	weights = uniform
	leaf_size = 30
	metric = minkowski

Table 2. Descriptives for area, perimeter, major axis length, and minor axis length.

Stat.	Area			Perimeter
Stat.	INIAP-11	INIAP-12	INIAP-20	INIAP-11	INIAP-12	INIAP-20
Average	65,508.53	69,699.29	26,499.15	1011.83	1016.94	605.25
Std. Dev.	7253.64	5400.35	1834.50	69.50	43.92	25.18
Median	66,876.50	70,552.50	26,825.00	1019.74	1023.14	607.76
Q1	61,070.75	66,699.25	25,390.75	969.89	992.17	591.66
Q3	71,205.75	74,045.75	27,955.75	1062.20	1048.20	621.54
Stat.	Major Axis Length			Minor Axis Length
Stat.	INIAP-11	INIAP-12	INIAP-20	INIAP-11	INIAP-12	INIAP-20
Average	358.56	392.70	223.23	233.25	227.93	151.62
Std. Dev.	26.16	20.98	11.58	15.85	9.67	6.92
Median	359.84	396.13	224.16	233.53	228.95	152.08
Q1	343.05	380.82	216.85	223.12	222.70	147.02
Q3	376.14	407.18	230.72	244.13	234.40	156.44

Table 3. Descriptives for eccentricity, convex area, and extent.

Stat.	Eccentricity			Convex Area			Extent
Stat.	INIAP-11	INIAP-12	INIAP-20	INIAP-11	INIAP-12	INIAP-20	INIAP-11	INIAP-12	INIAP-20
Mean	0.75	0.81	0.73	66,650.87	70,776.53	26,838.75	0.75	0.76	0.75
Std. Dev.	0.05	0.02	0.04	7367.38	5508.19	1850.66	0.04	0.04	0.04
Median	0.76	0.81	0.73	68,072.00	71,645.00	27,175.50	0.75	0.77	0.75
Q1	0.73	0.80	0.70	62,116.50	67,754.25	25,719.50	0.72	0.73	0.72
Q3	0.79	0.83	0.76	72,427.50	75,175.50	28,282.25	0.78	0.79	0.79

Table 4. Comparison of training and test metrics without dimensionality reduction.

Model	Train				Test
Model	Accuracy	Precision	Recall	F1-Score	Accuracy	Precision	Recall	F1-Score
LR	0.9190	0.9200	0.9200	0.9200	0.9156	0.9200	0.9133	0.9167
SVC	0.9367	0.9400	0.9367	0.9367	0.9433	0.9467	0.9433	0.9433
RF	1.0000	1.0000	1.0000	1.0000	0.9400	0.9400	0.9400	0.9400
ANN	0.9404	0.9433	0.9400	0.9400	0.9533	0.9567	0.9533	0.9533
KNN	0.9338	0.9367	0.9333	0.9333	0.9244	0.9233	0.9233	0.9233

Table 5. Comparison of training and test metrics with dimensionality reduction.

Model	Train				Test
Model	Accuracy	Precision	Recall	F1-Score	Accuracy	Precision	Recall	F1-Score
LR	0.9110	0.9000	0.9000	0.9100	0.9063	0.9100	0.9036	0.9057
SVC	0.9733	0.9785	0.9381	0.9579	0.9567	0.9677	0.9414	0.9326
RF	0.9700	0.9700	0.9700	0.9700	0.9000	0.9000	0.9000	0.9000
ANN	0.9210	0.9311	0.9200	0.9320	0.9363	0.9272	0.9236	0.9251
KNN	0.9154	0.9157	0.9015	0.9222	0.8945	0.8926	0.8941	0.8985

Table 6. Z-test between training and test prediction percentages by algorithm without dimensionality reduction.

Model	Training	Test	\|Z\|	p-Value
LR	0.9190	0.9156	0.2469	0.8050
SVC	0.9367	0.9433	0.5559	0.5783
RF	1.0000	0.9400	7.1459	0.0000
ANN	0.9404	0.9533	1.1506	0.2499
KNN	0.9338	0.9244	0.7326	0.4638

Table 7. Z-test between the prediction percentage in training and test by algorithm with dimensionality reduction.

Model	Training	Test	\|Z\|	p-Value
LR	0.9110	0.9063	0.2433	0.8078
SVC	0.9733	0.9567	1.2896	0.1972
RF	0.9700	0.9000	3.8398	0.0001
ANN	0.9210	0.9363	0.9149	0.3602
KNN	0.9154	0.8945	1.0442	0.2964

Table 8. Confusion matrix of the Support Vector Classifier model without dimensionality reduction.

		Expected
		INIAP-11	INIAP-12	INIAP-20
Observed	INIAP-11	232	35	0
	INIAP-12	11	256	0
	INIAP-20	0	0	266

Table 9. Confusion matrix of the Support Vector Classifier model with dimensionality reduction.

		Expected
		INIAP-11	INIAP-12	INIAP-20
Observed	INIAP-11	243	14	2
	INIAP-12	4	264	0
	INIAP-20	1	0	272
