Article

Using Machine Learning to Classify Capsicum Genotypes Based on Agronomic Traits

by Ana Izabella Freire 1, Alex Fernandes de Souza 1, Gustavo dos Santos Leal 1, Filipe Bittencourt Machado de Souza 2,*, Filipe Alves Neto Verri 3, Pedro Paulo Balestrassi 1, Anderson Paulo de Paiva 1, João José da Silva Júnior 2, Leonardo França da Silva 2, Fernando Henrique Silva Garcia 4 and Guilherme Godoy Fonseca 2
1 Institute of Industrial Engineering and Management, Federal University of Itajubá (UNIFEI), BPS Avenue, 1303, Pinheirinho, Itajubá CEP 37500-903, MG, Brazil
2 Faculty of Agronomy and Veterinary Medicine, University of Brasília (UnB), Campus Darcy Ribeiro, ICC Centro–Bloco B, Térreo, Asa Norte, Brasília CEP 70910-970, DF, Brazil
3 Division of Computer Science, Aeronautics Institute of Technology (ITA), Marshal Eduardo Gomes Square, 50, Vila das Acacias, São José dos Campos CEP 12228-900, SP, Brazil
4 Department of Biological and Health Sciences, Federal University of Amapá (UNIFAP), Rodovia Josmar Chaves Pinto KM 02, Macapá CEP 68903-419, AP, Brazil
* Author to whom correspondence should be addressed.
Horticulturae 2026, 12(5), 623; https://doi.org/10.3390/horticulturae12050623 (registering DOI)
Submission received: 17 March 2026 / Revised: 1 May 2026 / Accepted: 7 May 2026 / Published: 18 May 2026
(This article belongs to the Section Genetics, Genomics, Breeding, and Biotechnology (G2B2))

Abstract

Peppers of the genus Capsicum are valued worldwide for their culinary, medicinal, and nutritional uses. However, classifying genotypes accurately and developing new varieties that enhance these traits remain challenging because traditional methods are often imprecise and time-consuming. This study aimed to overcome these limitations by applying multivariate statistical techniques and machine learning models (KNN, RF, XGBoost) to characterize and classify Capsicum genotypes based on genetic and phenotypic features. Sixteen Capsicum genotypes were analyzed using MANOVA, PCA, and cluster analysis to explore their variability and similarity. Cluster analysis revealed distinct groups, indicating patterns of phenotypic similarity among specific varieties. The machine learning models were evaluated using Leave-One-Out cross-validation to address the challenges posed by small datasets. Random Forest outperformed the other models in class discrimination, with an AUC of 0.96, while KNN and XGBoost achieved AUC values of 0.95 and 0.85, respectively. Although Random Forest was only slightly better than KNN, both models showed strong predictive performance, whereas XGBoost showed moderate performance. In addition, pericarp thickness, fruit diameter, seeds per fruit, and corolla color were identified as the most relevant traits for classification. Principal component analysis indicated that the first components explained a substantial proportion of the total variance, supporting efficient dimensionality reduction and pattern recognition. Furthermore, the Random Forest model achieved accuracy, precision, recall, and F1-score values close to 0.93, reinforcing its robustness in multiclass classification. This study highlights the effectiveness of machine learning in overcoming the constraints of traditional classification methods, providing a robust approach for the accurate identification and improvement of pepper varieties.
Keywords: KNN; plant breeding; phenotyping; Random Forest; XGBoost
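
The Leave-One-Out evaluation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the class labels (`group_A`, `group_B`, `group_C`), the two traits (pericarp thickness and fruit diameter, in mm), and all function names are hypothetical, and a plain k-nearest-neighbours classifier written with the Python standard library stands in for the models used in the study. Each observation is held out once, the classifier is fitted on the remaining observations, and the held-out predictions are then scored with accuracy and a macro-averaged F1.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_predictions(X, y, k=3):
    """Hold out each observation once and predict it from all the others."""
    preds = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        preds.append(knn_predict(train_X, train_y, X[i], k))
    return preds


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for cls in sorted(set(y_true)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Synthetic, well-separated data (assumption): six plants per hypothetical
# group, described by pericarp thickness (mm) and fruit diameter (mm).
rng = random.Random(0)
centers = {
    "group_A": (1.5, 15.0),
    "group_B": (3.0, 30.0),
    "group_C": (5.0, 60.0),
}
X, y = [], []
for label, (thickness, diameter) in centers.items():
    for _ in range(6):
        X.append((rng.gauss(thickness, 0.2), rng.gauss(diameter, 2.0)))
        y.append(label)

preds = leave_one_out_predictions(X, y, k=3)
accuracy = sum(p == t for p, t in zip(preds, y)) / len(y)
f1 = macro_f1(y, preds)
print(f"LOOCV accuracy: {accuracy:.2f}, macro F1: {f1:.2f}")
```

With real trait measurements the features would normally be standardized inside each fold, since otherwise the trait with the largest numeric range (here the diameter) dominates the Euclidean distance.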