Breeding of Solanaceous Crops Using AI: Machine Learning and Deep Learning Approaches—A Critical Review

Gerakari, Maria; Katsileros, Anastasios; Kleftogianni, Konstantina; Tani, Eleni; Bebeli, Penelope J.; Papasotiropoulos, Vasileios

doi:10.3390/agronomy15030757

by Maria Gerakari †, Anastasios Katsileros †, Konstantina Kleftogianni, Eleni Tani, Penelope J. Bebeli and Vasileios Papasotiropoulos *

Laboratory of Plant Breeding & Biometry, Department of Crop Science, Agricultural University of Athens, Iera Odos 75, 11855 Athens, Greece

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Agronomy 2025, 15(3), 757; https://doi.org/10.3390/agronomy15030757

Submission received: 15 February 2025 / Revised: 12 March 2025 / Accepted: 18 March 2025 / Published: 20 March 2025

(This article belongs to the Special Issue Progress and Innovations in Breeding Objectives and Technologies for Solanaceae Crops Production)

Abstract

This review discusses the potential of artificial intelligence (AI), particularly machine learning (ML) and its subset, deep learning (DL), in advancing the genetic improvement of Solanaceous crops. AI has emerged as a powerful solution to overcome the limitations of traditional breeding techniques, which often involve time-consuming, resource-intensive processes with limited predictive accuracy. Through advanced algorithms and predictive models, ML and DL facilitate the identification and optimization of key traits, including higher yield, improved quality, pest resistance, and tolerance to extreme climatic conditions. By integrating big data analytics and omics, these methods enhance genomic selection (GS), support gene-editing technologies like CRISPR-Cas9, and accelerate crop breeding, thus enabling the development of resilient and adaptable crops. This review highlights the role of ML and DL in improving Solanaceae crops, such as tomato, potato, eggplant, and pepper, with the aim of developing novel varieties with superior agronomic and quality traits. Additionally, this study examines the advantages and limitations of AI-driven breeding compared to traditional methods in Solanaceae, emphasizing its contribution to agricultural resilience, food security, and environmental sustainability.

Keywords:

artificial intelligence; machine learning; big data; deep learning; plant breeding; Solanaceae; tomato; potato; eggplant; pepper

1. Introduction

Artificial intelligence (AI), specifically machine learning (ML) and its subset, deep learning (DL), have become key components of modern technological advancement, with novel applications across diverse fields. Specifically, agriculture has increasingly embraced these technologies to tackle challenges such as food security, environmental sustainability, and resource optimization. By simulating intelligence and learning from data, these technologies enhance efficiency and resilience in agricultural systems [1], enabling the optimization of precision farming, field monitoring, and supply chain management [2].

With the advent of AI algorithms, massively produced datasets can be processed, allowing farmers to make better decisions about the specific needs of their crops. These vast datasets are known as “Big Data” and are characterized by five key attributes, the 5Vs: Volume, Value, Variety, Veracity, and Velocity. Volume refers to the capacity to analyze large datasets, Value to the extraction of meaningful conclusions, and Variety to the integration of diverse data sources. Veracity relates to ensuring data accuracy, and Velocity corresponds to real-time information processing [3]. The utilization of big data analysis in agriculture through AI and ML enables the processing of large-scale, diverse, and real-time datasets to make precise predictions, manage resources efficiently, and support data-driven decision-making [4] (Figure 1).

1.1. ML and DL Applications for Plant Breeding

AI-driven techniques like ML, DL, robotics, and computer vision enhance plant breeding by analyzing vast datasets to identify superior genotypes and predict phenotypic outcomes [5]. ML models, like random forests (RFs), support vector machines (SVMs), and gradient boosting algorithms (GBAs), identify desirable traits more quickly and more accurately than traditional methods [6]. RFs, which are supervised ML algorithms, enhance classification and regression accuracy by combining multiple decision trees, which makes them well-suited for genomic data analysis [7]. SVMs, another type of supervised ML model, are particularly useful for classifying plant varieties based on genetic markers [8]. GBAs are ML techniques mostly used for regression and classification tasks [9].
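The idea of combining multiple decision trees can be illustrated with a minimal sketch. It is not taken from any of the cited studies: it bags one-split trees (decision stumps) on bootstrap resamples and predicts by majority vote, whereas a full random forest grows deeper trees and also samples a random subset of markers at each split. The marker matrix, the labels, and all function names are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    """Return the most frequent label (ties go to the first one seen)."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(X, y):
    """Fit a one-split tree: pick the (marker, threshold) with fewest errors."""
    best = None  # (errors, marker, threshold, left_label, right_label)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not right:
                continue  # this threshold does not split the sample
            ll, rl = majority(left), majority(right)
            errors = sum(lab != ll for lab in left) + sum(lab != rl for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, ll, rl)
    if best is None:  # every marker is constant in this resample
        m = majority(y)
        return (0, float("inf"), m, m)
    return best[1:]


def fit_forest(X, y, n_trees=25, seed=0):
    """Train each stump on a bootstrap resample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest


def predict(forest, row):
    """Classify one genotype by majority vote over all stumps."""
    return majority([ll if row[f] <= t else rl for f, t, ll, rl in forest])


# Hypothetical toy data: 8 lines x 3 SNP markers coded 0/1/2.
# Marker 0 separates the classes; markers 1 and 2 are noise.
X = [[0, 1, 2], [0, 2, 0], [0, 0, 1], [0, 1, 1],
     [1, 2, 0], [2, 0, 2], [1, 1, 1], [2, 2, 0]]
y = ["susceptible"] * 4 + ["resistant"] * 4

forest = fit_forest(X, y)
print(predict(forest, [2, 0, 0]))
```

Averaging many trees trained on different resamples reduces the variance of any single tree, which is the property that makes the ensemble attractive for noisy marker data.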

DL, a subset of ML, has significantly impacted plant breeding by automatically processing high-dimensional and complex data. Some of the most widely used DL tools include convolutional neural networks (CNNs), which are a class of DL models well-suited for tasks involving spatial data, such as images or sequences. CNNs can analyze plant images to identify subtle phenotypic traits that might escape human observation, such as disease symptoms or growth anomalies [10]. On the other hand, recurrent neural networks (RNNs) are designed to process sequential data and analyze temporal patterns, such as changes in plant growth over time under varying environmental conditions [11]. Finally, artificial neural networks (ANNs), which are computational models inspired by the structure and function of biological neural networks in animal and human brains, are used to predict and identify different species by accurately and effectively analyzing complex morphological traits [12]. All these ML and DL tools are valuable for breeders, facilitating the selection and development of superior crop varieties [13,14].

1.2. AI-Driven Crop Breeding to Overcome Traditional Breeding Limitations in Solanaceae

A transformative use of AI and AI-enabled learning methods, like ML and DL in agriculture, lies in the field of Solanaceae crop genetics and breeding. Members of the Solanaceae family are among the world’s most important agricultural species. These crops are not only economically significant but also essential for global diets and nutrition. Several crops of this family, e.g., tomato (Solanum lycopersicum), potato (Solanum tuberosum), eggplant (Solanum melongena), and pepper (Capsicum annuum), have been subjected to intensive breeding for improved agricultural traits that lead to higher yields, resistance to biotic and abiotic stresses, and longer shelf-life, as well as better taste and superior nutritional quality [15]. The application of AI and ML can facilitate the identification of key features associated with important traits, aiding in the development of improved Solanaceae crops and plant varieties in general [16].

Traditional breeding strategies in Solanaceae continue to rely heavily on subjective selection based on breeders’ experience [17]. This method requires significant investment in labor and time, often compromising accuracy and efficiency [18]. Manual phenotyping, commonly applied in Solanaceae crops, typically captures only superficial, observable traits, failing to account for subcellular processes and complex interactions affecting the overall plant phenotype [18]. Although conventional selective breeding has successfully improved several key agronomic and non-agronomic traits, continued phenotypic selection has limited genetic variation, minimizing the gene pool and causing bottleneck phenomena [19]. Traditional breeding techniques face challenges in keeping up with the rapid progress of high-throughput genotyping technologies, making it difficult to thoroughly assess the dynamic nature of plant traits at various growth stages and under different environmental conditions. Moreover, conventional disease detection methods, which primarily rely on visual assessments of multiple genotypes during early breeding stages, are time-consuming and costly. Widely used image processing techniques, such as thresholding, edge detection, region-based segmentation, and clustering, are based on mathematical algorithms for data categorization and analysis. However, conventional methods lack the capability to effectively analyze large datasets or handle intricate details of complex images [20]. Furthermore, traditional genomic prediction (GP) methods also face significant challenges, particularly in high-dimensional datasets where the number of markers exceeds the number of observations, limiting predictive accuracy [21].

The emergence of big data and omics technologies has significantly enhanced the capabilities of ML in plant genetics and breeding. The power of big data analytics lies in its ability to integrate these extensive and varied datasets, creating a comprehensive framework for understanding plant biology. ML algorithms are particularly adept at linking omics data with phenotypic traits, enabling researchers to discover previously obscured relationships and patterns. With the exponential growth of genomic data and advances in AI algorithms, breeders can now leverage ML and DL techniques to revolutionize crop improvement [22].

AI-based learning methods accelerate genomic selection (GS) by allowing all markers (e.g., SNPs) to contribute to the model, even those with weak effects or high correlations [21,23]. ML and DL handle complex interactions, including nonadditive models with dominance and epistatic effects, without assuming any specific distribution for predictor variables. Additionally, they enhance gene editing technologies like CRISPR-Cas9 by identifying and validating target genes, significantly increasing the accuracy and efficiency of genetic modifications [24]. These advancements enable the precise improvement of traits such as salinity tolerance and pest resistance, contributing to long-term agricultural sustainability [25,26,27].

Several AI-driven tools are widely used in modern agricultural practices to overcome traditional breeding limitations and revolutionize modern Solanaceae breeding. The integration of ML applications in Solanaceous crop breeding accelerates trait prediction, disease detection, and stress resilience, enabling breeders to efficiently analyze complex datasets, uncover genetic insights, and develop high-quality, resilient cultivars tailored to meet the challenges of modern agriculture in the context of climate change (Figure 2). ML applications in Solanaceae research are used to estimate the performance of different plants under diverse conditions, analyzing extensive data and genotype-phenotype interactions [28,29]. Multispectral imaging and ML tools offer faster, more efficient, and cost-effective disease monitoring, enhancing genetic gains and improving management strategies [30].

The purpose of this study is to investigate the applications of ML and DL in breeding Solanaceous crops, focusing on major representatives of this family, such as tomatoes, potatoes, peppers, and eggplants, to highlight the beneficial effects of AI methods in enhancing important agricultural traits, while also promoting innovations that support global food security and environmental sustainability.

2. Applications of Machine and Deep Learning in Solanaceous Crop Breeding

2.1. Tomato

2.1.1. Plant Phenotyping for Productivity Monitoring and Yield Prediction

Yield is a complex trait influenced by multiple genetic factors. Modern crop breeding techniques, including ML, enhance efficiency by reducing reliance on high-throughput phenotyping. These methods enable precise phenotyping through predictive models, aiding in the identification of key traits affecting yield.

Researchers utilized ANNs to predict tomato yields in greenhouse production. By conducting a sensitivity analysis, they determined which input variables significantly impact prediction accuracy and showed that combining ANNs with sensitivity analysis can effectively enhance decision-making in greenhouse cultivation. The ANN model outperformed the multiple linear regression (MLR) model in terms of various performance criteria and gave satisfactory predictions of tomato yield in the studied area [31]. ANNs and MLR models were used to analyze segregating populations (F₂–F₅) from three tomato interspecific crosses, identifying key yield-related traits. Gene action was assessed through skewness, kurtosis, and parent–progeny regression, leading to the identification of promising segregates with potential for developing superior tomato lines through targeted interspecific breeding [32].
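The source does not state which sensitivity analysis was used in [31]. One common and simple variant, shown here purely as an illustration, is one-at-a-time perturbation: each input is raised by a small relative amount while the others stay fixed, and the average absolute change in the prediction is recorded. The stand-in model `toy_yield_model`, its coefficients, and the input names are hypothetical and are not the ANN from the study.

```python
def one_at_a_time_sensitivity(model, samples, rel_step=0.10):
    """Mean absolute change in model output when each input is raised by
    rel_step (e.g. 10%) while all other inputs are held fixed."""
    n_inputs = len(samples[0])
    scores = []
    for j in range(n_inputs):
        total = 0.0
        for x in samples:
            perturbed = list(x)
            perturbed[j] = x[j] * (1.0 + rel_step)
            total += abs(model(perturbed) - model(x))
        scores.append(total / len(samples))
    return scores


# Hypothetical stand-in for a trained yield predictor (NOT the ANN of [31]).
# Inputs: [temperature_C, co2_ppm, irrelevant_input]
def toy_yield_model(x):
    return 0.5 * x[0] + 0.01 * x[1] + 0.0 * x[2]


samples = [[20.0, 400.0, 5.0], [25.0, 600.0, 7.0], [30.0, 800.0, 9.0]]
scores = one_at_a_time_sensitivity(toy_yield_model, samples)

# Rank inputs from most to least influential on the prediction.
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(scores, ranking)
```

Inputs whose perturbation barely moves the prediction are candidates for removal from the model, which is how such an analysis can simplify data collection in a greenhouse.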

Advancements in AI, Augmented Reality (AR), and remote sensing technologies are revolutionizing tomato yield prediction, enabling more precise, data-driven decision-making for breeders and farmers. An innovative development in tomato yield estimation is the ARIA (Augmented Reality and Artificial Intelligence) mobile app, designed to detect, count, and classify tomatoes by capturing images through markerless AR technology. This app gives insights into the quality and maturity of preharvest tomatoes and provides reliable yield estimations in breeding programs [33]. Rahnemoonfar et al. [34] developed a simulated deep CNN for automated yield estimation in agriculture. Their model, based on a modified Inception-ResNet architecture, was trained entirely on synthetic data and tested on real images. The system efficiently counted fruits, flowers, and trees, even under challenging conditions such as shadows, occlusions, and overlapping fruits. Their approach offers a more practical and cost-effective alternative to traditional manual counting methods, which are labor-intensive and impractical for large fields. Experimental results demonstrated a test accuracy of 91% on real images and 93% on synthetic images.

Hemming et al. [35] conducted a six-month experiment on cherry tomato cultivation across six high-tech greenhouse compartments, aiming to maximize net profit through AI-driven climate and crop management strategies. They developed hybrid systems that combined expert knowledge with control and predictive algorithms to optimize their growing methods. A variety of algorithms were explored and implemented, including conditional rule-based approaches, data-enabled predictive control (DeePC), long short-term memory (LSTM) networks, bidirectional LSTM, RFs, and imitation learning. The results indicated that the AI-supported strategies outperformed a human-operated greenhouse used as a reference. The study identified challenges and opportunities for implementing remote-control systems in greenhouse production, emphasizing the potential for autonomous agriculture to address labor shortages and improve efficiency.

ANNs and MLR models have also been employed to investigate important yield attributes, including tomato callus formation, anther culture, and the factors that influence these phenomena. By analyzing the effects of multiple parameters, such as plant genotype, concentration of plant growth regulators, cold temperature duration, and flower length, through ANNs and MLR, researchers investigated the callus induction percentage and the number of regenerated calli [36]. Yamamoto et al. [37] discussed a simulation-based breeding strategy that integrates whole-genome prediction to improve complex characteristics like yield and flavor by leveraging genomic information. The main aspects studied were the integration of breeding simulation and GP, the construction of phenotype prediction models, and the simulation for yield improvement and flavor-related traits. Recent research introduced the term Integrated Genomic-Enviromic Prediction (iGEP), extending the traditional GP approach. It combines multi-omics data, big data technologies, and AI, particularly using ML and DL, to increase the precision and credibility of phenotypic prediction. Although these studies’ primary emphasis is on crops like rice, wheat, and maize, the researchers highlight that the same methodologies and techniques can be applied to tomatoes successfully [38,39].

2.1.2. Genomic Selection Based on Morphological Classification and Fruit Quality Traits

Multiple prediction models have been developed in tomato to estimate critical quality traits that must be incorporated into genotypes during breeding processes, such as aspartate content, fruit weight, firmness, ripeness, elasticity, soluble solids, pH, acidity, sugar, and carotene levels [40,41,42]. In addition, AI algorithms have been used to analyze fruit color and automatically classify ripeness stages with high accuracy [43]. Vazquez et al. [44] demonstrated a novel approach incorporating ML techniques to establish a robust fruit shape classification system, which automatically classifies tomato fruits according to their shape and improves the efficiency of visual characterization, highlighting its significance for tomato breeding and genetics. The researchers trained and evaluated seven supervised ML algorithms for recognition and classification of different tomato fruit shapes. They also used image analysis techniques to extract shape-related information and then created a database containing all these features, which served as the training and testing base for the models. The results demonstrated the superiority of the SVM model in terms of accuracy and the usefulness of this computational tool, which could enhance knowledge of the relationship between fruit shape and related genes. This development could facilitate breeding programs, cultivar description, and varietal registration and, at the same time, increase the classification accuracy for consumer-preferred fruit shape characteristics.

Yeon et al. [45] utilized two tomato germplasm collections (TGC1 with 162 accessions and TGC2 with 191 accessions), genotyped with a 51K Axiom™ SNP array, and examined the genomic estimated breeding values (GEBVs) of five quality characteristics (fruit weight, fruit width, fruit height, pericarp thickness, and total soluble solids (TSS) content). The researchers implemented parametric models (RR-BLUP, Bayes A, Bayesian LASSO) and non-parametric models (RKHS, SVM, RF), evaluating the prediction accuracy of the GS models across various cross-validation methods and marker sets. They concluded that selecting the appropriate GS model can help identify preferable fruit traits in tomato breeding programs, potentially accelerating the development of elite cultivars.
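Among the parametric models listed above, RR-BLUP is at its core ridge regression on marker genotypes: all marker effects are estimated jointly and shrunk toward zero, which keeps the problem solvable when markers outnumber accessions. The sketch below is a pure-Python illustration, not the pipeline of [45]. It assumes centered genotypes and phenotypes and a fixed, hand-picked shrinkage parameter `lam` (RR-BLUP software instead derives it from estimated variance components); the function names and toy data are hypothetical.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(A)
    M = [A[i][:] + [b[i]] for i in range(n)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ridge_marker_effects(X, y, lam):
    """Ridge estimates beta = X^T (X X^T + lam*I)^-1 y.

    Only an n x n system is solved (n = accessions), so the number of
    markers p may exceed n as long as lam > 0.
    """
    n, p = len(X), len(X[0])
    K = [[sum(X[i][k] * X[j][k] for k in range(p)) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, y)
    return [sum(X[i][k] * alpha[i] for i in range(n)) for k in range(p)]


def predict_value(beta, genotype):
    """Predicted (centered) breeding value of one genotype."""
    return sum(b * g for b, g in zip(beta, genotype))


# Hypothetical toy data: 2 accessions, 4 centered marker scores each (p > n).
X = [[1.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, -1.0]]
y = [2.0, -1.0]
beta = ridge_marker_effects(X, y, lam=2.0)
print(beta)
print(predict_value(beta, X[0]))
```

With `lam=2.0` the fitted value for the first accession is shrunk to half of its observed phenotype, which illustrates the bias that ridge regression accepts in exchange for stable estimates.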

AI genomic prediction (GP) models, which enable precise selection based on traits such as fruit morphology, color, and yield, have also been used to forecast key visual and size traits in tomatoes, proving that GS has effectively accelerated the breeding process. The implementation of GP in tomato breeding requires the optimization of various factors, including field trial management, agronomic practices, seed production, phenotyping, and sequencing. Furthermore, a careful evaluation of parameters such as inbreeding levels, marker metrics, and the number of individuals to assess is essential. The integration of GP into breeding programs like the single seed descent scheme and backcrossing can reduce the number of generations and streamline the selection process in tomato breeding. Additionally, genotyping platforms can facilitate the identification of desirable and undesirable genotypes, thereby enhancing introgression of favorable traits [46].

Tong et al. [28] applied GS methods to two populations of tomato and pepper to predict morphometric and colorimetric traits. These traits were evaluated using both traditional scoring-based conventional descriptors (CDs) and the Tomato Analyzer (TA) tool, which analyzed images of fruit cut longitudinally and latitudinally. GS performance was tested through cross-validation using classification-based ML models, such as RFs and support vector classification (SVC), for CD traits, and ridge regression best linear unbiased prediction (rrBLUP), the state-of-the-art GS model in plant breeding, for TA traits. The findings demonstrated that, overall, GS approaches can assist the selection of high-performance Solanaceous fruits in crop breeding.

An innovative hybrid tomato AI breeding program has also been proposed to accelerate time to market by combining Seed-X’s advanced computer vision and AI technology with TomaTech’s. The researchers reported that this combination significantly increased the likelihood of desirable market characteristics and enhanced prediction capabilities in the breeding process [47]. Furthermore, from the perspective of breeding for fruit quality, in 2023, Khan and Adem [48] utilized the AI model “Connoisseur”, which combines consumers’ sensory scores for various tomato varieties with their chemical composition to identify whether these compounds are positively or negatively related to flavor.

ML algorithms like RFs and neural networks have also been implemented to classify and discriminate tomato seed cultivars. Images of tomato seeds from various cultivars were captured using a consistent imaging setup, and seed morphology-related traits were extracted. The ML models, after being tested for accuracy and precision, were applied to distinguish tomato seeds belonging to different cultivars based on their physical and morphological traits. The study’s findings highlight the potential of integrating ML with seed imaging to modernize agricultural practices and facilitate the identification and tracking of promising cultivars in breeding experiments, emphasizing that this approach can be extended to crops other than tomato [49].

2.1.3. Breeding Against Environmental Stressors

ML and DL models have been employed to monitor tomato yield under environmental stresses, like drought, and diseases. Recent studies have integrated such ML and DL models to develop predictive frameworks for managing irrigation and improving tomato production. These models have effectively captured the interactions between environmental and plant variables, supporting their application in crop management and breeding programs under limited water availability. Similarly, in their 2023 review study, Zhang et al. [50] highlighted the potential of AI in breeding stress-tolerant tomato cultivars under various abiotic stressors including drought, salinity, cold and heat stress. They emphasized the use of ML algorithms to identify genetic markers, DL models to analyze multi-omic and environmental data, and AI-driven tools for phenotype prediction. Additionally, advanced decision-support systems were proposed to assist breeders in selecting and optimizing genotypes for stress-prone environments. These approaches demonstrate AI’s transformative role in precision breeding and crop management. Moreover, a study by Chowdhury et al. [51] focused on identifying drought-responsive genes in tomato by utilizing ML tools to analyze gene expression data. More specifically, the researchers utilized gene expression values as features to train classification models using the gradient boosting algorithm “XGBoost” in R to unravel critical drought-related genes and pathways, enhancing our understanding of drought tolerance while providing targets for genetic improvement in tomatoes.

In a study conducted by Bupi et al. [52], an integrated ML framework was developed to assess the severity of Tomato Yellow Leaf Curl Virus (TYLCV) infections. By leveraging advanced algorithms to identify patterns and predict infection severity, the framework incorporates tools for data preprocessing, feature selection, and model optimization. The researchers suggest that this ML framework can contribute to the effective and timely management of TYLCV in agricultural systems. Additionally, they highlight its potential applications in precision agriculture and its role in developing resistant cultivars for use in breeding programs. Moreover, Dias et al. [53] employed RF models to predict late blight severity in tomato plants caused by Phytophthora infestans. They utilized multispectral images captured by unmanned aerial vehicles (UAVs) to calculate vegetation indices, which served as input features for the models, integrating remote sensing applications. The researchers emphasize the effectiveness and potential of integrating remote sensing and ML for phenotyping in real-world conditions. They highlight its applications in breeding programs aimed at developing late blight-resistant tomato varieties, as well as its utility for informed decision-making in crop management and accelerating the selection of superior cultivars. Gadade and Kirange [54] explored the use of classical ML techniques like SVMs for identifying tomato leaf disease progress across different developmental plant stages. Although this study focuses on disease identification, it offers tools and methods that are highly beneficial for breeding programs by streamlining the disease evaluation process, providing accurate solutions for disease resistance screening, and supporting the development of tomato cultivars with enhanced disease resistance. Furthermore, Tan et al. [55] compared classical ML techniques, such as SVMs, k-Nearest Neighbors (k-NN), and RFs, with DL methods (CNNs and pre-trained models) for classifying tomato leaf diseases. The results indicated that DL models outperformed classical ML in terms of classification accuracy and robustness and eliminated the need for manual feature extraction by learning features directly from the data. This study can contribute to tomato breeding by highlighting that DL methods can provide a tool to enhance disease resistance screening and improve breeding efficiency.

Johansen et al. [56] conducted a phenotyping experiment using field and UAV-based imaging on 199 accessions of the wild tomato species Solanum pimpinellifolium and one S. lycopersicum germplasm. Their goal was to predict biomass and yield under normal and saline conditions using UAV imagery and random forest algorithms based on shape characteristics. The study demonstrated the feasibility of predicting biomass and yield—two indicators of salt tolerance—up to eight weeks before harvest. This approach facilitates the early identification of salt-tolerant accessions, which could be utilized to introgress this trait into commercial cultivars. Various deep-learning approaches have been employed to identify biotic and abiotic stresses in tomatoes. Fuentes et al. [57] integrated three deep-learning meta-architectures—Faster Region-based CNN (Faster R-CNN), Region-based Fully Convolutional Network (R-FCN), and Single Shot Multibox Detector (SSD)—along with two deep feature extractors, Residual Network (ResNet) and VGGNet, to detect pests and diseases. Karthik et al. [58] developed a model incorporating an attention gating mechanism within a residual CNN. They used the PlantVillage dataset, which includes three tomato diseases: leaf mold, early blight, and late blight. By applying a fivefold cross-validation method, their model achieved 98% accuracy on validation datasets. Another study utilized images from the PlantVillage database, covering bacterial (bacterial spot), viral (tomato mosaic virus and yellow leaf curl virus), and fungal (leaf mold, target spot, early blight, and late blight) diseases, as well as pests (such as spider mites). The proposed framework demonstrated an accuracy of 99.18% with GoogLeNet and 98.66% with AlexNet [59].
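The fivefold cross-validation reported in [58] follows a general scheme that is easy to sketch: the samples are shuffled and split into k folds, each fold is held out once for testing while the model is trained on the rest, and the k test accuracies are averaged. The code below is a generic illustration under that assumption, not the authors' implementation; the majority-class baseline and the label list are hypothetical placeholders for a real classifier and dataset.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists; every sample is tested exactly once.
    Assumes n_samples >= k so that no fold is empty."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]


def cross_validated_accuracy(X, y, fit, predict, k=5, seed=0):
    """Average test accuracy over k folds."""
    accuracies = []
    for train, test in k_fold_indices(len(X), k, seed):
        model = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in test)
        accuracies.append(hits / len(test))
    return sum(accuracies) / k


# Hypothetical baseline "classifier": always predict the most common label.
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, row):
    return model


# Placeholder data: 10 image feature vectors, 8 diseased and 2 healthy.
X = [[float(i)] for i in range(10)]
y = ["late_blight"] * 8 + ["healthy"] * 2
cv_accuracy = cross_validated_accuracy(X, y, fit_majority, predict_majority)
print(cv_accuracy)  # 0.8, the share of the majority class
```

The baseline reaches 80% accuracy without learning anything about the images, which is a reminder that reported accuracies are best read against the class balance of the dataset.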

2.2. Eggplant

2.2.1. Productivity Monitoring and Yield Prediction

Various studies have explored AI-based methods to enhance yield estimation and agricultural management in eggplants. ML models using spectral vegetation indices (SVIs) derived from remote sensing data have significantly improved eggplant yield predictions. SVIs are mathematical combinations of reflectance values from different wavelengths captured by remote sensing devices. They are widely used to quantify vegetation characteristics such as health, biomass, chlorophyll content, and water status [60]. Taşan et al. [61] examined the accuracy of five distinct ML models, namely ANN, k-Nearest Neighbor (kNN), support vector regression (SVR), RF, and Adaptive Boosting (AB), in forecasting eggplant yield at field scale with varying input combinations, highlighting data-driven approaches to optimize precision agriculture. Additionally, a study by Islam et al. [62] applied predictive algorithms, including regression and boosting techniques, for precise yield prediction of 130 locally collected eggplant genotypes. The study’s overall findings demonstrated that combining vegetation index and crop data can significantly improve eggplant production modeling using ANN-based remote sensing, even though the data collected over three growing seasons is insufficient to make definitive judgments. These two studies underscore AI’s utility in aiding breeders with the selection of superior genotypes.
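As a concrete instance of a spectral vegetation index, the widely used Normalized Difference Vegetation Index (NDVI) contrasts near-infrared and red reflectance: NDVI = (NIR - Red) / (NIR + Red). The source does not list which indices were used in [61] or [62], so NDVI is chosen here only as a familiar example, and the reflectance values are invented for illustration.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from two reflectances (0..1)."""
    total = nir + red
    if total == 0:
        return 0.0  # no signal in either band; avoid division by zero
    return (nir - red) / total


def mean_ndvi(pixels):
    """Average NDVI over (nir, red) pixel pairs, e.g. one plot in an image."""
    values = [ndvi(nir, red) for nir, red in pixels]
    return sum(values) / len(values)


# Invented reflectances: dense canopy reflects strongly in NIR, bare soil does not.
canopy_plot = [(0.50, 0.10), (0.45, 0.05)]
bare_soil_plot = [(0.30, 0.25), (0.30, 0.30)]
features = {"canopy": mean_ndvi(canopy_plot), "soil": mean_ndvi(bare_soil_plot)}
print(features)
```

Per-plot values like these can then serve as input features for a yield model, alongside other indices or crop measurements.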

ANNs were also used to model eggplant growth, yield, and quality based on integrated nutrient management strategies, incorporating factors such as macronutrients, organic manure, and soil properties. According to this research, ANNs outperformed traditional statistical methods by capturing complex nonlinear relationships, enabling optimized fertilizer application for improved yield and reduced environmental impact [63]. The aforementioned advancements underscore AI’s potential to improve agricultural practices, particularly in yield prediction, water management, and crop quality breeding for eggplants.

2.2.2. Phenotyping for Key Aspects on Plant Physiology and Development

García-Fortea et al. [64] introduced MicroScan, a DL-based tool for detecting and recognizing the stages of pollen development, which is a critical step in determining the optimal stage of pollen induction for androgenesis. As a result, a more efficient method for producing doubled haploid (DH) lines was developed, providing a valuable tool for research in plant genetics and breeding. Additionally, Sun et al. [65] used multispectral imaging and ML to classify eggplant seeds with greater accuracy, benefiting seed quality assessment through improved classification models. Furthermore, research conducted by Nomura et al. [66] focused on developing a hybrid AI model for canopy photosynthesis rate estimation in eggplants, combining different data-driven techniques. The model combines ML methods and traditional modeling techniques to create an accurate and trustworthy system for predicting canopy photosynthesis rates under various environmental conditions and their impact on fruit quality, while researchers demonstrate that this strategy can be applied effectively for greenhouse management optimizations in eggplants.

2.2.3. Breeding Against Environmental Stressors

Kaniyassery et al. [67] developed an AI-based disease detection system for eggplant, focusing on leaf spot and fruit rot diseases. The research addressed two primary aspects: the impact of meteorological variables on disease incidence and the AI-based classification of diseases using techniques such as image recognition and pattern analysis. The study utilized the YOLOv8 (You Only Look Once v8) model, a state-of-the-art DL algorithm for object detection, to accurately identify and classify disease symptoms from images. The researchers concluded that combining weather-based disease modeling with AI-driven classification offers a comprehensive approach to managing plant diseases, enhancing productivity and decision-making processes in eggplant breeding programs. In another recent study, Lajom et al. [68] employed an SVM model integrated with near-infrared spectroscopy (NIRS) to detect eggplant fruit and shoot borer (EFSB) (Leucinodes orbonalis) infestations accurately at early stages. The results demonstrated a high degree of accuracy in identifying EFSB, marking a significant advancement in the integration of modern technology into agricultural pest management. This approach provides a valuable tool for eggplant farmers and breeders, aiding in the selection of resistant genotypes and improving pest control strategies. Additionally, Zhang et al. [69] detected Verticillium wilt in eggplant leaves using VGG16, a CNN architecture, enhanced with a triplet attention mechanism. The trained VGG16-triplet attention model achieved a precision of 86.73% on the test set, demonstrating its effectiveness in detecting the disease and contributing to eggplant breeding efforts by addressing disease management and resistance traits in breeding programs.

Cemek et al. [70] addressed water management challenges by applying AI techniques to predict crop evapotranspiration (ET_c) for eggplants. Understanding and managing evapotranspiration is essential for selecting water-efficient crops, optimizing yields, adapting to climate conditions, and developing crop varieties with reduced water requirements. Several ML and DL approaches were investigated, including ANNs, deep neural networks (DNNs), RF, SVM, kNN, and AdaBoost (AB). The best performance was obtained by the ANN model, and ET_c estimation improved significantly when leaf area index (LAI) and crop height (h_c) were added to the climate parameters. Models like ANNs and SVMs provided reliable ET_c estimates based on environmental and crop data, supporting water use efficiency. In another study, AlexNet and VGG-16, two pioneering deep CNN architectures designed for image recognition and computer vision tasks, were proposed for the classification of five eggplant diseases (little leaf, epilachna beetle infestation, cercospora leaf spot, tobacco mosaic virus (TMV), and two-spotted spider mite). Healthy and infected plants were analyzed with images acquired from smartphones, and the modified VGG-16 model achieved an accuracy of 93.33% [71,72].

2.3. Potato

2.3.1. Productivity Monitoring and Yield Prediction

The use of ML in potato cultivation has led to significant advancements in yield prediction, with potential implications for plant breeding, as demonstrated by numerous studies integrating satellite imagery, climate data, and agronomic factors. Salvador et al. [73] combined meteorological data, field observations, and satellite imagery with five ML algorithms, namely RF, SVM linear (svmL), SVM polynomial (svmP), SVM radial (svmR), and general linear model (GLM), across six time frames to assess yield prediction models in Mexico. The SVM-polynomial model, when trained with the first five months of data post-sowing, was the most effective for predicting yield during the summer cycle, while the RF model performed best in the winter cycle with only three months of data. The proposed methodology can predict potato yield prior to harvest, making it highly valuable for developing food security strategies.

Similarly, Gómez et al. [74] in Spain developed predictive models using Sentinel satellite imagery to support precision agriculture. By testing nine ML algorithms in their initial study, ranging from GLM to KNN and model-averaged neural networks (avNNet), they identified the models best suited for potato yield forecasting. In a subsequent study, Gómez et al. [75] focused on SVM-radial and RF algorithms and introduced the Potato Productivity Index (PPI), a novel metric for yield prediction. Their findings validated the effectiveness of the PPI, underscoring the potential of ML and remote sensing data to refine yield estimations in regional potato production. Additionally, Kurek et al. [76] utilized agronomic, climatic, soil, and satellite data across five growing seasons on 114 commercial potato fields in Poland. By applying ML techniques such as linear regression, ridge, Lasso, Elastic Net, XGBoost, RF, multilayer perceptron (MLP), stochastic gradient descent (SGD), and SVR, they developed three predictive models: non-satellite, satellite, and hybrid, the latter achieving the lowest mean absolute percentage error (MAPE). El-Kenawy et al. [77] assessed several predictive models, including KNN, gradient boosting, XGBoost, MLP, GNNs, gated recurrent units (GRUs), and long short-term memory networks (LSTMs), using metrics such as mean squared error (MSE), root mean squared error (RMSE), and mean absolute error (MAE) to predict potato yield. Their results indicate that GNNs and LSTMs offer superior accuracy and effectively capture complex spatial and temporal patterns.
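The error metrics named in these yield studies (MSE, RMSE, MAE, and MAPE) can be illustrated with a minimal sketch. The function name `regression_errors` and the yield values below are hypothetical and are not taken from any of the cited works; the formulas are the standard definitions.

```python
import math

def regression_errors(observed, predicted):
    """Return MSE, RMSE, MAE, and MAPE (in percent) for paired values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [o - p for o, p in zip(observed, predicted)]
    n = len(residuals)
    mse = sum(r * r for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    # MAPE divides by the observed value, so it is undefined for zero yields.
    mape = 100.0 * sum(abs(r / o) for r, o in zip(residuals, observed)) / n
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "mape": mape}

# Hypothetical field yields in t/ha, for illustration only.
observed_yield = [40.0, 50.0, 20.0, 25.0]
predicted_yield = [38.0, 54.0, 21.0, 25.0]
errors = regression_errors(observed_yield, predicted_yield)
print(errors)
```

Because MSE and RMSE square the residuals, they penalize a few large misses more heavily than MAE does, whereas MAPE expresses the error relative to the observed yield, which makes fields with different yield levels easier to compare.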

Cultivar-specific data were combined with UAV remote sensing to improve yield predictions in Minnesota. Using RF regression and SVR, researchers found that early-season UAV spectral data—particularly at the tuber initiation stage in late June—correlated strongly with marketable yield. Their results revealed that integrating high-resolution UAV imagery with cultivar data significantly outperformed yield prediction models that lacked cultivar-specific information, highlighting the potential of early detection for yield optimization [78].

Coulibali et al. [79] studied gradients in the elemental composition of potato leaf tissue (i.e., its ionome) that are linked to crop potential and therefore have applications in plant breeding. Because the ionome is a function of genetics and environmental conditions, practitioners aim at fine-tuning fertilization to obtain an optimal ionome based on the needs of potato cultivars. Their objective was to assess the validity of cultivar grouping and predict potato tuber yields using foliar ionomes. Their dataset comprised 3382 observations in Québec (Canada) from 1970 to 2017. The first mature leaves from the top were sampled at the beginning of flowering for total N, P, K, Ca, and Mg analysis. They also used preprocessed ionomes to assess their effects on tuber yield classes (high and low yields) on a cultivar basis using k-Nearest Neighbors, RF, and SVM classification algorithms. Their ML models returned an average accuracy of 70%, indicating fair diagnostic potential for detecting in-season nutrient imbalance in potato cultivars.

Yu et al. [80] highlighted the importance of accurately estimating potato leaf area index (LAI) for optimizing yield prediction and management practices. Using UAV-based remote sensing, their study combined data from RGB images, LiDAR, and hyperspectral imaging (HSI). Four ML models, namely SVR, random forest regression (RFR), histogram-based gradient boosting regression tree (HGBR), and partial least-squares regression (PLSR), analyzed features from these data sources, with HSI showing the highest predictive accuracy due to its rich spectral information. Combining all features across sensors achieved the highest R² (0.782), with RFR excelling in feature integration. This approach not only advances LAI estimation but also has potential applications in breeding programs and precision agriculture.

2.3.2. Varietal Identification and Tuber Quality Assessment

The use of deep learning models for potato variety recognition was explored, employing five state-of-the-art CNN models: VGG16, ResNet50, MobileNet, Inception-v3, and a custom CNN model. These models were trained on images of various potato varieties to differentiate them based on visual traits, such as size, shape, color, texture, and skin pattern. Performance evaluation revealed that the customized CNN model achieved the highest accuracy at 94.84%, demonstrating its superiority for this task [81]. Similarly, a method for identifying and differentiating 10 potato varieties was proposed by Azizi et al. [82], integrating machine vision and artificial neural networks (ANNs). A perfect classification accuracy of 100% was achieved using non-linear ANNs, highlighting the effectiveness of combining machine vision with neural networks for precise potato variety identification.

2.3.3. Breeding Against Environmental Stressors

Potato crops are highly susceptible to fungal diseases like early blight (Alternaria solani) and late blight (Phytophthora infestans), leading to significant yield losses. Gao et al. [83] used high-resolution visual field images as a dataset to train DNNs for automatic and accurate segmentation and recognition of Phytophthora infestans disease lesions. This study demonstrates the feasibility of using deep learning algorithms for disease lesion segmentation and severity evaluation based on proximal imagery, which could aid crop breeding for resistance and benefit precision farming. Additionally, a plethora of ML tools, such as SVMs, RFs, ANNs, and CNNs, have been implemented in various studies for efficient detection of plant diseases, enhancing genotype selection in breeding programs. Aparajita et al. [84] used adaptive thresholding for late blight detection, achieving 96% accuracy under variable conditions. Multiple algorithms were applied to 450 leaf images, achieving 97% accuracy with RF, while CNNs also proved highly effective [85]. A dataset of over 2000 potato leaf images covering seven diseases, including early blight, was created, and a CNN-based model trained on it achieved 99% accuracy. Future development of this model includes a smartphone app for real-time disease diagnosis [86]. Singh and Kaur [87] used K-means clustering, Gray Level Co-occurrence Matrix (GLCM) features, and SVM to achieve 95.99% accuracy in distinguishing early and late blight. Rashid et al. [88] developed a multi-level DL model, achieving 99.75% accuracy in detecting early and late blight, with superior performance compared to other models in both accuracy and computational efficiency. Other studies, such as those by Bhere et al. [89], Pasalkar et al. [90], and Sholihati et al. [91], show that CNNs can exceed 90% accuracy in potato leaf disease classification.

ML applications in disease detection also extend to viral infections, underscoring the critical role of ML in advancing virus detection and supporting healthier crop management practices. Griffel et al. [92] investigated the detection of Potato Virus Y (PVY) using near-infrared and shortwave infrared wavelengths, applying SVMs to achieve an 89.8% accuracy, significantly outperforming traditional red, green, and blue (RGB) wavelength approaches that yielded only 46.9% accuracy. Polder et al. [93] took a hyperspectral imaging approach, employing CNNs to identify viral diseases in seed potatoes, achieving high precision and recall.
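Accuracy, precision, and recall, the metrics reported in these detection studies, are all derived from the same confusion counts. The following is a minimal sketch under stated assumptions: the labels, the function name `classification_metrics`, and the ten example plants are hypothetical and do not reproduce data from the cited works.

```python
def classification_metrics(true_labels, predicted_labels, positive="infected"):
    """Compute accuracy, precision, and recall for a binary classifier."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if pred == positive and truth == positive:
            tp += 1  # infected plant correctly flagged
        elif pred == positive:
            fp += 1  # healthy plant wrongly flagged
        elif truth == positive:
            fn += 1  # infected plant missed
        else:
            tn += 1  # healthy plant correctly passed
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        # Guard against division by zero when no positives are predicted/present.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical labels for ten plants, for illustration only.
truth = ["infected"] * 4 + ["healthy"] * 6
predicted = ["infected", "infected", "infected", "healthy",
             "infected", "healthy", "healthy", "healthy", "healthy", "healthy"]
metrics = classification_metrics(truth, predicted)
print(metrics)
```

When infected plants are rare, as is typical in seed production fields, accuracy alone can look high even if many infections are missed, which is why recall for the infected class is worth reporting alongside it.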

An image classification method was developed to detect virus-infected plants in potato seed production fields in Japan, aiming to enhance the roguing process during selection programs. In this study, RGB images were captured using UAVs from 5 to 10 m above the ground. A CNN achieved 96% accuracy in training and 84% in validation, demonstrating the potential of UAV-based image classification for effective virus detection in potato fields. This method is important for plant breeding as it enables the efficient identification of virus-infected plants, ensuring the production of virus-free seed tubers and contributing to the overall health and productivity of potato crops [94].

ML has also been applied to monitor stress factors and optimize nutrient management in potato crops. Gold et al. [95] analyzed physiological responses in potato cultivars with varied resistance to late blight by examining their spectral reflectance following exposure to Phytophthora infestans. Using ML algorithms, including RF and partial least squares discriminant analysis (PLS-DA), they showed that specific genotypic traits significantly influence disease response, providing insights into the complex host-pathogen interactions and helping identify cultivars with natural disease resistance. These findings highlight the potential of ML to improve understanding of crop resilience and facilitate the selection of stress-resistant varieties.

Boguszewska-Mańkowska et al. [96] investigated drought tolerance variability among 50 potato cultivars by analyzing morphological traits under different water regimes over 11 consecutive years. The study focused on tuber yield, plant tolerance indices, and Climatic Water Balance to assess stability in drought conditions. To enhance the classification of drought tolerance groups, several ML algorithms, including Quadratic Discriminant Analysis, RF, Extra Trees, AdaBoost, and extreme gradient boosting, were evaluated. Extreme gradient boosting emerged as the most effective classifier, achieving an accuracy of 96.7%.

Hyperspectral imaging and attention-based deep learning models were explored to detect drought stress in potato plants. The study involved two potato cultivars exposed to water-deficient conditions and utilized dual-sensor hyperspectral imaging (Visible and Near-Infrared/VNIR and Short-Wave Infrared/SWIR) to identify critical wavelengths associated with drought stress. These applications can aid plant breeding by improving the detection of environmental stress effects in breeding programs, enabling more efficient selection of plants with desirable traits and enhancing both breeding outcomes and crop sustainability [97].

2.4. Pepper

2.4.1. Agronomic Traits and Yield Prediction

Ridge regression and DL-based models were implemented to estimate genomic breeding values for yield and agronomic traits in 204 Capsicum genotypes evaluated across multi-environment trials in New Mexico, USA [98]. This study aimed to assess the accuracy of GP for traits related to yield, morphology, and phenology, examine the impact of marker subsets on prediction accuracy, and evaluate selection responses for various strategies. Using six models, Lozada et al. [98] highlighted the promise of genome-wide selection for chile pepper breeding and underscored the importance of large training datasets to enhance the accuracy of DL models.
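To make the ridge regression approach to genomic prediction more concrete, the sketch below estimates marker effects by solving (XᵀX + λI)β = Xᵀy and then scores unphenotyped candidates. It is an illustration only: the marker coding (-1/0/1), the penalty `lam`, the toy data, and the function names are assumptions, not the settings or software used by Lozada et al. [98].

```python
def ridge_fit(X, y, lam):
    """Solve (X'X + lam*I) beta = X'y by Gaussian elimination."""
    n, p = len(X), len(X[0])
    # Build the p x p penalized normal-equation matrix and right-hand side.
    A = [[sum(X[i][j] * X[i][k] for i in range(n)) + (lam if j == k else 0.0)
          for k in range(p)] for j in range(p)]
    b = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    # Forward elimination with partial pivoting; lam > 0 keeps A invertible.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, p):
            factor = A[row][col] / A[col][col]
            for k in range(col, p):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    beta = [0.0] * p
    for row in range(p - 1, -1, -1):
        tail = sum(A[row][k] * beta[k] for k in range(row + 1, p))
        beta[row] = (b[row] - tail) / A[row][row]
    return beta

def predict(X, beta):
    """Predicted breeding value = sum of marker codes times marker effects."""
    return [sum(x * e for x, e in zip(row, beta)) for row in X]

# Hypothetical training set: 4 genotypes x 3 markers, centered phenotypes.
train_markers = [[1, 0, -1], [0, 1, 1], [-1, 1, 0], [1, -1, 0]]
train_phenotypes = [1.5, -0.5, -1.0, 0.0]
marker_effects = ridge_fit(train_markers, train_phenotypes, lam=1.0)

# Hypothetical selection candidates with genotypes but no phenotypes.
candidates = [[1, 1, 0], [-1, 0, 1]]
gebv = predict(candidates, marker_effects)
print(marker_effects, gebv)
```

The penalty shrinks all marker effects toward zero, which is what allows the model to be fitted when markers far outnumber phenotyped genotypes; real panels would use dedicated linear algebra libraries rather than this hand-written solver.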

2.4.2. Varietal Identification Based on Morphological and Chemical Classification

Sabanci et al. [99] explored the use of computer vision and AI to classify pepper seeds from different cultivars, which is crucial for breeding programs. Images of seeds from green, orange, red, and yellow pepper cultivars were captured using a flatbed scanner. Two approaches were proposed for classification. The first involved training CNN models (ResNet18 and ResNet50), which achieved accuracies of 98.05% and 97.07%, respectively. The second involved fusing features from pre-trained CNN models and applying feature selection before classifying with an SVM. The CNN-SVM-Cubic model achieved up to 99.02% accuracy, offering high precision and efficiency in plant breeding. Moreover, Kurtulmuş et al. [100] developed a classification method to discriminate pepper seed varieties using neural networks and computer vision. The multilayer perceptron model with 30 neurons in the hidden layer, trained using resilient back propagation, achieved an accuracy of 84.94% in classifying eight pepper seed varieties. Additionally, Tu et al. [101] focused on improving the selection of high-quality pepper seeds by automating the recognition of seed features. The study identified several physical traits, such as color, size, and weight, as key indicators of seed vigor. The best predictive model, based on a multilayer perceptron (MLP) neural network using 15 physical traits, achieved a high stability rate of 99.4%. The model significantly improved germination rates and selection efficiency, reaching up to 79.4% germination and 90% selection rate. This automated approach shows potential for reducing costs and labor in seed selection, making it an effective tool for quality control in pepper breeding programs.

Ramírez-Meraz et al. [102] applied 1H NMR-based metabolomics combined with ML, specifically RF, to study the metabolic fingerprinting of ten experimental races of Capsicum annuum cv. Jalapeño. Their analysis classified and evaluated these races based on differential metabolite profiles, commercial traits, and multivariate data analysis. The study revealed variations among the races in carbohydrate, amino acid, nucleotide, and organic acid content. RF identified length, width, weight, and yield as key variables for accurately distinguishing between the races, highlighting critical traits for commercial and breeding applications. As mentioned in Section 2.1, Tong et al. [28] applied genomic selection (GS) in pepper to predict fruit morphometric and colorimetric traits, utilizing random forest (RF), support vector classification (SVC), and rrBLUP models. Their results showed that GS, combined with these models, effectively supported the selection of superior pepper fruits in breeding programs.

2.4.3. Breeding Against Environmental Stressors

AI can improve the efficiency of selecting genotypes that are resilient to environmental stressors. Dissanayake et al. [103] developed an effective method for detecting diseases and nutrient deficiencies in bell peppers, focusing on the rapid spread of powdery mildew and magnesium deficiency. The study integrated CNNs to enhance detection accuracy, achieving a 93% success rate in distinguishing the health status of bell pepper leaves, with 97% accuracy in identifying magnesium deficiency and powdery mildew. The approach also demonstrated 98% accuracy in assessing the progression of powdery mildew and 96% in magnesium deficiency. Haque et al. [104] highlighted the importance of detecting pepper diseases quickly and accurately to prevent significant losses in pepper production. The study utilized several pre-trained DL models, including VGG-19, Xception, NasNet Mobile, MobileNet-V2, ResNet-152-V2, and Inception-ResNet-V2, to extract deep features from pepper plant images for disease identification. The customized CNN models achieved high accuracy, with VGG-19 and ResNet-152-V2 both reaching 96.26%. Additionally, Xception outperformed Inception-ResNet-V2, MobileNet-V2, and NasNet-Mobile, achieving 93.46% accuracy. These results suggest that DL models can be effectively used for early disease detection in pepper crops, helping farmers minimize losses through rapid identification and treatment of diseases and supporting breeding programs in ensuring disease resistance in pepper cultivars.

Fumia et al. [105] conducted a comparative study of genomic and phenomic selection methodologies to identify heat-tolerant genotypes within a core collection of 300 Capsicum annuum accessions, representing 84.1% of the species’ diversity. Initially, anomaly analysis via k-means clustering was utilized to identify individuals exhibiting anomalous behavior under heat stress compared to optimal conditions, based on phenotypic data. This analysis informed the training of an RF ML model capable of classifying heat-tolerant genotypes with near-perfect accuracy using only data from trials under optimal conditions. Subsequently, a genomic-based predictive analysis was performed, leveraging genomic data to predict component traits and generate a weighted rank-sum selection index (WRSSI) to identify heat-tolerant lines. Finally, the selected lines were compared across three selection methodologies: (1) breeder’s intuition, (2) phenomics-based anomaly analysis, and (3) genomics-based predictive modeling and selection index. The study concluded that integrating classical and multispectral phenotyping techniques enhances selection efficiency and outcomes.
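The idea behind a weighted rank-sum selection index can be sketched as follows: genotypes are ranked within each predicted trait, the ranks are multiplied by trait weights, and the weighted ranks are summed so that the lowest total marks the most promising line. This formulation, the trait names, the weights, and the values are assumptions made for illustration; the exact index definition and component traits used by Fumia et al. [105] are not given here.

```python
def rank_values(values, higher_is_better=True):
    """Return 1-based ranks (1 = best); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=higher_is_better)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks

def weighted_rank_sum(trait_table, weights, higher_is_better):
    """Sum weight * rank over traits for each genotype (lower is better)."""
    genotypes = list(trait_table)
    index = {g: 0.0 for g in genotypes}
    for trait, weight in weights.items():
        values = [trait_table[g][trait] for g in genotypes]
        ranks = rank_values(values, higher_is_better[trait])
        for g, r in zip(genotypes, ranks):
            index[g] += weight * r
    return index

# Hypothetical predicted trait values for three accessions.
predicted = {
    "acc_A": {"fruit_set": 0.80, "yield": 1.9, "pollen_loss": 0.10},
    "acc_B": {"fruit_set": 0.60, "yield": 2.4, "pollen_loss": 0.30},
    "acc_C": {"fruit_set": 0.70, "yield": 1.5, "pollen_loss": 0.20},
}
weights = {"fruit_set": 2.0, "yield": 1.0, "pollen_loss": 1.0}
direction = {"fruit_set": True, "yield": True, "pollen_loss": False}

index = weighted_rank_sum(predicted, weights, direction)
selected_order = sorted(index, key=index.get)
print(index, selected_order)
```

Working on ranks rather than raw values keeps traits measured on different scales comparable, while the weights let a breeder state which component traits matter most for the target environment.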

Moreover, Islam et al. [106] used a support vector machine (SVM) to develop an image processing-based method for classifying early-stage stress symptoms in pepper seedlings caused by environmental factors, such as temperature, light intensity, and day-night cycles. By analyzing RGB images, they extracted 18 color features, nine texture features, and one morphological feature. The SVM model achieved an accuracy of 85%, enabling real-time stress monitoring and helping growers identify stress-resistant traits more quickly when developing resilient cultivars. In a separate study, hyperspectral imaging was explored for detecting aflatoxin contamination in chili peppers, offering a rapid, non-destructive alternative to traditional chemical methods. The study utilized both UV and halogen excitations, extracting features from individual spectral bands and their differences. Machine learning classifiers, including multi-layer perceptrons (MLPs) and linear discriminant analysis (LDA), were applied, achieving robust classification performance with fewer spectral bands. This method could be valuable in breeding programs for selecting aflatoxin-resistant cultivars, enhancing food safety [107].

3. Limitations and Future Prospects of AI in Breeding Solanaceous Crops

Despite the impact of AI on plant breeding and its various applications in Solanaceous crops, as summarized in Table 1, several challenges hinder its full potential. A major limitation lies in data availability and quality. ML and DL models rely on large, well-annotated datasets integrating genetic, phenotypic, and environmental factors. However, inconsistencies in data collection methods and limited access to comprehensive datasets impede model accuracy and reliability [19,20]. This is further complicated by the challenge of data integration, where combining genomic, transcriptomic, and phenotypic data can be difficult due to their varying types, scales, and structures. Although AI/ML models improve with larger training datasets, gathering high-quality, multi-omics data remains time-consuming and costly, especially in resource-limited settings [108].

Another challenge is model interpretability. Many ML and DL models function as “black boxes”, making it difficult for breeders to understand how predictions are made [109]. This lack of transparency can reduce trust and limit the practical implementation of AI-driven breeding tools. The complex, non-linear relationships between different biological layers can make the biological interpretation of AI/ML models even more challenging. To address this, there is a need for explainable AI techniques to improve the transparency of predictions, fostering greater trust in AI tools.

Furthermore, integrating AI into traditional breeding programs requires interdisciplinary collaboration among geneticists, data scientists, and agronomists, which can be resource-intensive and complex [110]. Adoption is further hindered by overfitting, where models trained on limited datasets perform well on training data but fail to generalize effectively to new data, compromising predictive accuracy. The technical challenge of model interpretability also plays a crucial role here, as it is difficult to translate AI outcomes into practical breeding decisions without understanding the underlying biological mechanisms.
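The gap between training performance and performance on unseen data, which defines overfitting, can be exposed with k-fold cross-validation. The sketch below uses a 1-nearest-neighbor classifier as a deliberately extreme example of a model that memorizes its training set. The data, the fold assignment, and all function names are hypothetical, and the folds are interleaved only to keep the example deterministic; samples would normally be shuffled first.

```python
def interleaved_folds(n_samples, k):
    """Assign sample i to fold i % k (deterministic stand-in for shuffling)."""
    return [list(range(fold, n_samples, k)) for fold in range(k)]

def predict_1nn(train_X, train_y, query):
    """Return the label of the closest training sample (squared distance)."""
    nearest = min(range(len(train_X)),
                  key=lambda i: sum((a - b) ** 2
                                    for a, b in zip(train_X[i], query)))
    return train_y[nearest]

def accuracy(train_X, train_y, test_X, test_y):
    """Fraction of test samples whose predicted label matches the truth."""
    hits = sum(predict_1nn(train_X, train_y, x) == t
               for x, t in zip(test_X, test_y))
    return hits / len(test_y)

def cross_validated_accuracy(X, y, k):
    """Average accuracy over k folds, each held out once for testing."""
    scores = []
    for test_idx in interleaved_folds(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        scores.append(accuracy([X[i] for i in train_idx],
                               [y[i] for i in train_idx],
                               [X[i] for i in test_idx],
                               [y[i] for i in test_idx]))
    return sum(scores) / len(scores)

# Hypothetical single-feature data with noisy class labels.
X = [[float(v)] for v in range(8)]
y = [0, 0, 1, 0, 1, 0, 1, 1]

train_acc = accuracy(X, y, X, y)              # scored on memorized data
cv_acc = cross_validated_accuracy(X, y, k=4)  # scored on held-out folds
print(train_acc, cv_acc)
```

Reporting the held-out score rather than the training score gives a more honest picture of how a model is likely to behave on new genotypes or environments, particularly when datasets are small.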

Computational costs and infrastructure also pose limitations. High-throughput data analysis, particularly with deep learning approaches, demands significant computational power and storage capacity, which may not be accessible to all breeding programs, especially in developing countries with limited access to high-performance computing infrastructure and reliable internet connectivity [17]. This challenge is compounded by the need for standardized protocols in AI applications for plant breeding, which remains a critical barrier to achieving consistent results across different breeding programs.

To maximize the benefits of AI in Solanaceous crop breeding, future research should focus on enhancing model transparency and interpretability [111]. The development of explainable AI techniques can help bridge the gap between AI predictions and human decision-making, increasing trust and usability. Automated machine learning (AutoML) presents another promising avenue by simplifying model development and deployment, making AI tools more accessible to breeders with limited data science expertise [112]. Additionally, integrating multimodal AI approaches that combine genomic, phenotypic, and environmental data will lead to more holistic breeding strategies and improved predictive accuracy.

The convergence of AI with emerging technologies offers further opportunities. Quantum computing has the potential to revolutionize data processing capabilities, while advancements in high-throughput phenotyping and sensor technologies can provide richer datasets for training AI models. These innovations, combined with robust interdisciplinary collaborations, will drive the next generation of AI-powered Solanaceae breeding programs. Furthermore, AI-assisted disease detection and early intervention strategies, as demonstrated by Ferentinos [113], can enhance the management of diseases in Solanaceous crops, reducing crop losses and improving overall productivity. By addressing these technical and non-technical challenges, including data quality, model interpretability, and computational barriers, AI-driven breeding of Solanaceous crops can unlock significant advances, improving both crop yield and quality while enhancing sustainability in agriculture [16].

4. Conclusions

AI-driven approaches, particularly ML and DL, are reshaping Solanaceous crop breeding by enhancing predictive accuracy, accelerating decision-making, and optimizing resource use. Despite these advancements, challenges related to data quality, model interpretability, data integration, and computational costs must be addressed to ensure the seamless integration of AI into breeding programs. Prioritizing model transparency, automated learning, and multimodal data integration will enhance the reliability and accessibility of AI-powered strategies, supporting more efficient breeding pipelines. As AI technologies continue to advance, they are poised to become integral tools for developing high-yielding, stress-resistant, and climate-adapted Solanaceous crop varieties. Interdisciplinary collaboration among geneticists, agronomists, and data scientists will be pivotal in refining AI models and bridging the gap between technological innovations and practical applications. By overcoming current limitations and harnessing emerging technologies, AI will significantly contribute to building resilient agricultural systems, promoting sustainable food production, and addressing the global challenges of climate change and food security.

Author Contributions

Conceptualization, V.P., E.T. and P.J.B.; writing—original draft preparation, M.G., A.K., K.K. and V.P.; writing—review and editing, V.P., E.T. and P.J.B.; visualization, M.G. and A.K.; supervision, V.P., E.T. and P.J.B.; project administration, V.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Mogili, U.M.R.; Deepak, B.B.V.L. Review on application of drone systems in precision agriculture International Conference on Robotics and Smart Manufacturing. Procedia Comput. Sci. 2018, 133, 502–509.
Monteiro, A.; Santos, S.; Gonçalves, P. Precision Agriculture for Crop and Livestock Farming-Brief Review. Animals 2021, 11, 2345.
Karami, A.; Jafari, F. Leveraging Big Data Characteristics for Enhanced Healthcare Fraud Detection. Cluster Comput. 2024; in press.
Qiang, J.; Zhang, X. Data analytics for crop management: A big data view. J. Big Data 2022, 9, 52.
Balasubramanian, V.; Guo, W.; Chandra, A.; Desai, S.V. Computer Vision with Deep Learning for Plant Phenotyping in Agriculture: A Survey. arXiv 2020, arXiv:2006.11391.
Van Dijk, A.D.J.; Kootstra, G.; Kruijer, W.; de Ridder, D. Machine learning in plant science and plant breeding. iScience 2021, 24, 101890.
Montesinos López, O.A.; Montesinos López, A.; Crossa, J. Random Forest for Genomic Prediction. In Multivariate Statistical Machine Learning Methods for Genomic Prediction; Springer: Cham, Switzerland, 2020.
Gutiérrez, S.; Tardaguila, J.; Fernández-Novales, J.; Diago, M.P. Support Vector Machine and Artificial Neural Network Models for the Classification of Grapevine Varieties Using a Portable NIR Spectrophotometer. PLoS ONE 2015, 10, e0143197.
Yan, J.; Xu, Y.; Cheng, Q.; Jiang, S.; Wang, Q.; Xiao, Y.; Ma, C.; Yan, J.; Wang, X. LightGBM: Accelerated genomically designed crop breeding through ensemble learning. Genome Biol. 2021, 22, 271.
Karahan, T.; Nabiyev, V. Plant identification with convolutional neural networks and transfer learning. Pamukkale Univ. J. Eng. Sci. 2021, 27, 638–645.
Mohmed, G.; Heynes, X.; Sun, W.; Hardy, K.; Grundy, S. Modelling daily plant growth response to environmental conditions in Chinese solar greenhouse using Bayesian neural network. Sci. Rep. 2023, 13, 4379.
Ferreira, M.G.; Azevedo, A.M.; Siman, L.I.; da Silva, G.H.; Carneiro, C.D.S.; Alves, F.M.; Nick, C. Automation in accession classification of Brazilian Capsicum germplasm through artificial neural networks. Sci. Agric. 2017, 74, 203–207.
Singh, A.; Ganapathysubramanian, B.; Singh, A.K.; Sarkar, S. Machine learning for high-throughput stress phenotyping in plants. Trends Plant Sci. 2016, 21, 110–124.
Gill, T.; Gill, S.K.; Saini, D.K.; Chopra, Y.; de Koff, J.P.; Sandhu, K.S. A comprehensive review of high throughput phenotyping and machine learning for plant stress phenotyping. Phenomics 2022, 2, 156–183.
Bai, Y.; Lindhout, P. Domestication and Breeding of Tomatoes: What Have We Gained and What Can We Gain in the Future. Ann. Bot. 2007, 100, 1085–1094.
Cembrowska-Lech, D.; Krzemińska, A.; Miller, T.; Nowakowska, A.; Adamski, C.; Radaczyńska, M.; Mikiciuk, G.; Mikiciuk, M. An Integrated Multi-Omics and Artificial Intelligence Framework for Advance Plant Phenotyping in Horticulture. Biology 2023, 12, 1298.
Jeon, D.; Kang, Y.; Lee, S.; Choi, S.; Sung, Y.; Lee, T.H.; Kim, C. Digitalizing breeding in plants: A new trend of next-generation breeding based on genomic prediction. Front. Plant Sci. 2023, 14, 1092584.
Diaz-Garcia, L.; Covarrubias-Pazaran, G.; Schlautman, B.; Zalapa, J. GiNA, an Efficient and High-Throughput Software for Horticultural Phenotyping. PLoS ONE 2016, 11, e0160439.
Sun, L.; Lai, M.; Ghouri, F.; Nawaz, M.A.; Ali, F.; Baloch, F.S.; Nadeem, M.A.; Aasim, M.; Shahid, M.Q. Modern Plant Breeding Techniques in Crop Improvement and Genetic Diversity: From Molecular Markers and Gene Editing to Artificial Intelligence—A Critical Review. Plants 2024, 13, 2676.
Jafar, A.; Bibi, N.; Naqvi, R.A.; Sadeghi-Niaraki, A.; Jeong, D. Revolutionizing agriculture with artificial intelligence: Plant disease detection methods, applications, and their limitations. Front. Plant Sci. 2024, 15, 1356260.
Barbosa, I.D.P.; da Silva, M.J.; da Costa, W.G.; de Castro Sant’Anna, I.; Nascimento, M.; Cruz, C.D. Genome-enabled prediction through machine learning methods considering different levels of trait complexity. Crop Sci. 2021, 61, 1890–1902.
Esposito, S.; Ruggieri, V.; Tripodi, P. Editorial: Machine Learning for Big Data Analysis: Applications in Plant Breeding and Genomics. Front. Genet. 2022, 13, 916462.
Fernandes, I.K.; Vieira, C.C.; Dias, K.O.G. Using machine learning to combine genetic and environmental data for maize grain yield predictions across multi-environment trials. Theor. Appl. Genet. 2024, 137, 189.
Tang, D.; Zhu, Y.; Hu, X. The AI-driven potato gene editing: A new perspective on target identification. Geogr. Res. Bull. 2024, 3, 370–372.
Amoah, P.; Oumarou Mahamane, A.R.; Byiringiro, M.H.; Mahula, N.J.; Manneh, N.; Oluwasegun, Y.R.; Assfaw, A.T.; Mukiti, H.M.; Garba, A.D.; Chiemeke, F.K.; et al. Genome editing in Sub-Saharan Africa: A game-changing strategy for climate change mitigation and sustainable agriculture. GM Crops Food 2024, 15, 279–302.
Angon, P.B.; Mondal, S.; Akter, S.; Sakil, M.A.; Jalil, M.A. Roles of CRISPR to mitigate drought and salinity stresses on plants. Plant Stress 2023, 8, 100169.
Manoj, K.; Ranjan, P.M.; Manish, K.P.; Kumar, S.P.; Abhishek, B.; Baozhu, G.; Rajeev, K.V. Application of CRISPR/Cas9-mediated gene editing for abiotic stress management in crop plants. Front. Plant Sci. 2023, 14, 1157678.
Tong, H.; Nankar, A.N.; Liu, J.; Todorova, V.; Ganeva, D.; Grozeva, S.; Nikoloski, Z. Genomic prediction of morphometric and colorimetric traits in Solanaceous fruits. Hortic. Res. 2022, 9, uhac072.
Anitha, P.; Kuthkunja, A.; Aditya, G.; Anvitha, V. Analysis of Leaf Disease Detection in the Solanaceae family plants using Machine Learning Algorithms. In Proceedings of the International Conference on Intelligent and Innovative Technologies in Computing, Electrical and Electronics (IITCEE), Bangalore, India, 24–25 January 2024; pp. 1–6.
Taneja, A.; Nair, G.; Joshi, M.; Sharma, S.; Sharma, S.; Jambrak, A.R.; Roselló-Soto, E.; Barba, F.J.; Castagnini, J.M.; Leksawasdi, N.; et al. Artificial Intelligence: Implications for the Agri-Food Sector. Agronomy 2023, 13, 1397.
Belouz, K.; Nourani, A.; Zereg, S.; Bencheikh, A. Prediction of greenhouse tomato yield using artificial neural networks combined with sensitivity analysis. Sci. Hortic. 2022, 293, 110666.
Acharya, B.; Kumar, P.P.; Hazra, S.; Dutta, S.; Saha, S.; Roy, S.; Maji, A.; Chakraborty, I.; Chattopadhyay, A.; Hazra, P. Genetic control of important yield attributing characters predicted through machine learning in segregating generations of interspecific crosses of tomato (Solanum lycopersicum L.). Acta Physiol. Plant. 2024, 46, 78.
Balaji Prabhu, B.V. ARIA: Augmented Reality and Artificial Intelligence enable mobile application for Yield and grade prediction of tomato crops. Procedia Comput. Sci. 2024, 235, 2693–2702.
Rahnemoonfar, M.; Sheppard, C. Deep Count: Fruit Counting Based on Deep Simulated Learning. Sensors 2017, 17, 905.
Hemming, S.; Zwart, F.D.; Elings, A.; Petropoulou, A.; Righini, I. Cherry Tomato Production in Intelligent Greenhouses—Sensors and AI for Control of Climate, Irrigation, Crop Yield, and Quality. Sensors 2020, 20, 6430.
Niazian, M.; Shariatpanahi, M.E.; Abdipour, M.; Oroojloo, M. Modeling callus induction and regeneration in an anther culture of tomato (Lycopersicon esculentum L.) using image processing and artificial neural network method. Protoplasma 2019, 256, 1317–1332. [Google Scholar] [CrossRef] [PubMed]
Yamamoto, E.; Matsunaga, H.; Onogi, A.; Kajiya-Kanegae, H.; Minamikawa, M.; Suzuki, A.; Shirasawa, K.; Hirakawa, H.; Nunome, T.; Yamaguchi, H.; et al. A simulation-based breeding design that uses whole-genome prediction in tomato. Sci. Rep. 2016, 6, 19454. [Google Scholar] [CrossRef]
Xu, Y.; Zhang, X.; Li, H.; Zheng, H.; Zhang, J.; Olsen, M.S.; Varshney, R.K.; Prasanna, B.M.; Qian, Q. Smart breeding driven by big data, artificial intelligence, and integrated genomic-enviromic prediction. Mol Plant 2022, 15, 1664–1695. [Google Scholar] [CrossRef]
Ansari, R.; Manna, A.; Hazra, S.; Bose, S.; Chatterjee, A.; Sen, P. Breeding 4.0 vis-à-vis application of artificial intelligence (AI) in crop improvement: An overview. N. Z. J. Crop Hortic. Sci. 2024, 1–43.
Mollazade, K.; Omid, M.; Akhlaghian Tab, F.; Rezaei Kalaj, Y.; Mohtasebi, S. Data Mining-Based Wavelength Selection for Monitoring Quality of Tomato Fruit by Backscattering and Multispectral Imaging. Int. J. Food Prop. 2015, 18, 880–896.
Zhu, W.; Li, W.; Zhang, H.; Li, L. Big data and artificial intelligence-aided crop breeding: Progress and prospects. J. Integr. Plant Biol. 2024, 1–14.
Duangjit, J.; Causse, M.; Sauvage, C. Efficiency of genomic selection for tomato fruit quality. Mol. Breed. 2016, 36, 1–16.
El-Bendary, N.; El-hariri, E.; Hassanien, A.E.; Badr, A. Using machine learning techniques for evaluating tomato ripeness. Expert Syst. Appl. 2014, 42, 1892–1905.
Vazquez, D.V.; Spetale, F.E.; Nankar, A.N.; Grozeva, S.; Rodríguez, G.R. Machine Learning-Based Tomato Fruit Shape Classification System. Plants 2024, 13, 2357.
Yeon, J.; Nguyen, T.T.; Kim, M.; Sim, S.C. Prediction accuracy of genomic estimated breeding values for fruit traits in cultivated tomato (Solanum lycopersicum L.). BMC Plant Biol. 2024, 24, 222.
Cappetta, E.; Andolfo, G.; Di Matteo, A.; Barone, A.; Frusciante, L.; Ercolano, M.R. Accelerating Tomato Breeding by Exploiting Genomic Selection Approaches. Plants 2020, 9, 1236.
Seed, X. Seed-X and TomaTech Use AI to Speed Up Breeding of Superior Quality Hybrid Tomatoes. 2019. Available online: https://www.prnewswire.com/news-releases/seed-x-and-tomatech-use-ai-to-speed-up-breeding-of-superior-quality-hybrid-tomatoes-300860396.html (accessed on 14 January 2025).
Khan, T.; Adem, I. An AI Taste ‘Connoisseur’ Could be the Future of Crop Breeding. Agriculture Dive. 2023. Available online: https://www.agriculturedive.com/news/an-ai-taste-connoisseur-could-be-thefuture-of-crop-breeding/700086/ (accessed on 16 January 2025).
Ropelewska, E.; Piecko, J. Discrimination of tomato seeds belonging to different cultivars using machine learning. Eur. Food Res. Technol. 2022, 248, 685–705.
Zhang, X.; Ibrahim, Z.; Khaskheli, M.B.; Raza, H.; Zhou, F.; Shamsi, I.H. Integrative Approaches to Abiotic Stress Management in Crops: Combining Bioinformatics Educational Tools and Artificial Intelligence Applications. Sustainability 2023, 16, 7651.
Chowdhury, R.H.; Eti, F.S.; Ahmed, R.; Gupta, S.D.; Jhan, P.K.; Islam, T.; Bhuiyan, M.A.R.; Rubel, M.H.; Khayer, A. Drought-responsive genes in tomato: Meta-analysis of gene expression using machine learning. Sci. Rep. 2023, 13, 19374.
Bupi, N.; Sangaraju, V.K.; Phan, L.T.; Lal, A.; Vo, T.T.B.; Ho, P.T.; Qureshi, M.A.; Tabassum, M.; Lee, S.; Manavalan, B. An Effective Integrated Machine Learning Framework for Identifying Severity of Tomato Yellow Leaf Curl Virus and Their Experimental Validation. Research 2023, 6, 16.
Dias, F.; Valente, D.; Oliveira, C.; Dariva, F.; Copati, M. Remote sensing and machine learning techniques for high throughput phenotyping of late blight-resistant tomato plants in open field trials. Int. J. Remote Sens. 2023, 44, 1900–1921.
Gadade, H.; Kirange, D. Machine Learning Based Identification of Tomato Leaf Diseases at Various Stages of Development; Semantic Scholar: Seattle, WA, USA, 2021; pp. 814–819.
Tan, L.; Lu, J.; Jiang, H. Tomato Leaf Diseases Classification Based on Leaf Images: A Comparison between Classical Machine Learning and Deep Learning Methods. AgriEngineering 2021, 3, 542–558.
Johansen, K.; Morton, M.J.L.; Malbeteau, Y.; Aragon, B.; Al-Mashharawi, S.; Ziliani, M.G.; McCabe, M.F. Predicting biomass and yield in a tomato phenotyping experiment using UAV imagery and random forest. Front. Artif. Intell. 2020, 3, 28.
Fuentes, A.; Yoon, S.; Kim, S.C.; Park, D.S. A robust deep-learning-based detector for real-time tomato plant diseases and pests recognition. Sensors 2017, 17, 2022.
Karthik, R.; Hariharan, M.; Anand, S.; Mathikshara, P.; Johnson, A.; Menaka, R. Attention embedded residual CNN for disease detection in tomato leaves. Appl. Soft Comput. 2020, 86, 105933.
Brahimi, M.; Boukhalfa, K.; Moussaoui, A. Deep learning for tomato diseases: Classification and symptoms visualization. Appl. Artif. Intell. 2017, 31, 299–315.
Bannari, A.; Morin, D.; Bonn, F.; Huete, A.R. A review of vegetation indices. Remote Sens. Rev. 1995, 13, 95–120.
Taşan, S.; Cemek, B.; Taşan, M.; Cantürk, A. Estimation of eggplant yield with machine learning methods using spectral vegetation indices. Comput. Electron. Agric. 2022, 202, 107367.
Islam, A.; Shanto, M.N.I.; Rabby, M.S.M.; Sikder, A.R.; Uddin, M.S.; Arefin, M.N.; Patwary, M.J. Eggplant Yield Prediction Utilizing 130 Locally Collected Genotypes and Machine Learning Model. In Proceedings of the 2023 26th International Conference on Computer and Information Technology (ICCIT), Cox’s Bazar, Bangladesh, 13–15 December 2023; pp. 1–6.
Thingujam, U.; Bhattacharyya, K.; Ray, K.; Phonglosa, A.; Pari, A.; Banerjee, H.; Majumdar, K. Integrated Nutrient Management for Eggplant: Yield and Quality Models through Artificial Neural Network. Commun. Soil Sci. Plant Anal. 2019, 51, 70–85.
García-Fortea, E.; García-Pérez, A.; Gimeno-Páez, E.; Sánchez-Gimeno, A.; Vilanova, S.; Prohens, J.; Pastor-Calle, D. A Deep Learning-Based System (Microscan) for the Identification of Pollen Development Stages and Its Application to Obtaining Doubled Haploid Lines in Eggplant. Biology 2020, 9, 272.
Sun, L.; Fan, X.; Huang, S.; Luo, S.; Zhao, L.; Chen, X.; Suo, X. Research on classification method of eggplant seeds based on machine learning and multispectral imaging classification eggplant seeds. J. Sens. 2021, 2021, 8857931.
Nomura, K.; Kaneko, T.; Iwao, T.; Kitayama, M.; Goto, Y.; Kitano, M. Hybrid AI model for estimating the canopy photosynthesis of eggplants. Photosynth. Res. 2023, 155, 77–92.
Kaniyassery, A.; Goyal, A.; Thorat, S.A.; Rao, M.R.; Chandrashekar, H.K.; Murali, T.S.; Muthusamy, A. Association of meteorological variables with leaf spot and fruit rot disease incidence in eggplant and YOLOv8-based disease classification. Ecol. Inform. 2024, 83, 102809.
Lajom, M.P.; Remigio, J.P.; Arboleda, E.; Sacala, R.J.R. Design and Development of Eggplant Fruit and Shoot Borer (Leucinodes orbonalis) Detector Using Near-Infrared Spectroscopy. J. Eng. Sustain. Dev. 2024, 28, 439–454.
Zhang, Y.; Zhang, D.; Zhang, Y.; Cheng, F.; Zhao, X.; Wang, M.; Fan, X. Early detection of verticillium wilt in eggplant leaves by fusing five image channels: A deep learning approach. Plant Methods 2024, 20, 173.
Cemek, B.; Tasan, S.; Canturk, A. Machine learning techniques in estimation of eggplant crop evapotranspiration. Appl. Water Sci. 2023, 13, 136.
Aravind, K.R.; Raja, P.; Ashiwin, R.; Mukesh, K.V. Disease classification in solanum melongena using deep learning. Spanish J. Agric. Res. 2019, 17, e0204.
Biyun, Y.; Yong, X. Applications of deep-learning approaches in horticultural research: A review. Hortic. Res. 2021, 8, 123.
Salvador, P.; Gómez, D.; Sanz, J.; Casanova, J.L. Estimation of Potato Yield Using Satellite Data at a Municipal Level: A Machine Learning Approach. ISPRS Int. J. Geo-Inf. 2020, 9, 343.
Gómez, D.; Salvador, P.; Sanz, J.; Casanova, J.L. Potato Yield Prediction Using Machine Learning Techniques and Sentinel 2 Data. Remote Sens. 2019, 11, 1745.
Gómez, D.; Salvador, P.; Sanz, J.; Casanova, J.L. New spectral indicator Potato Productivity Index based on Sentinel-2 data to improve potato yield prediction: A machine learning approach. Int. J. Remote Sens. 2021, 42, 3426–3444.
Kurek, J.; Niedbała, G.; Wojciechowski, T.; Świderski, B.; Antoniuk, I.; Piekutowska, M.; Kruk, M.; Bobran, K. Prediction of Potato (Solanum tuberosum L.) Yield Based on Machine Learning Methods. Agriculture 2023, 13, 2259.
El-Kenawy, E.-S.M.; Alhussan, A.A.; Khodadadi, N.; Mirjalili, S.; Eid, M.M. Predicting Potato Crop Yield with Machine Learning and Deep Learning for Sustainable Agriculture. Potato Res. 2024, 1–34.
Li, D.; Miao, Y.; Gupta, S.K.; Rosen, C.J.; Yuan, F.; Wang, C.; Wang, L.; Huang, Y. Improving Potato Yield Prediction by Combining Cultivar Information and UAV Remote Sensing Data Using Machine Learning. Remote Sens. 2021, 13, 3322.
Coulibali, Z.; Cambouris, A.N.; Parent, S.-É. Cultivar-specific nutritional status of potato (Solanum tuberosum L.) crops. PLoS ONE 2020, 15, e0230458.
Yu, T.; Zhou, J.; Fan, J.; Wang, Y.; Zhang, Z. Potato Leaf Area Index Estimation Using Multi-Sensor Unmanned Aerial Vehicle (UAV) Imagery and Machine Learning. Remote Sens. 2023, 15, 4108.
Rahman, M.A.; Khan, A.A.; Hasan, M.M.; Rahman, M.S.; Habib, M.T. Deep Learning Modeling for Potato Breed Recognition. IEEE Trans. AgriFood Electron. 2024, 2, 419–427.
Azizi, A.; Abbaspour-Gilandeh, Y.; Nooshyar, M.; Afkari-Sayah, A. Identifying Potato Varieties Using Machine Vision and Artificial Neural Networks. Int. J. Food Prop. 2015, 19, 618–635.
Gao, J.; Westergaard, J.C.; Sundmark, E.H.R.; Bagge, M.; Liljeroth, E.; Alexandersson, E. Automatic late blight lesion recognition and severity quantification based on field imagery of diverse potato genotypes by deep learning. Knowl. Based Syst. 2021, 214, 106723.
Aparajita Sharma, R.; Singh, A.; Dutta, M.K.; Riha, K.; Kriz, P. Image processing based automated identification of late blight disease from leaf images of potato crops. In Proceedings of the 2017 40th International Conference on Telecommunications and Signal Processing, Barcelona, Spain, 5–7 July 2017; pp. 758–762.
Iqbal, M.A.; Talukder, K.H. Detection of Potato Disease Using Image Segmentation and Machine Learning. In Proceedings of the 2020 International Conference on Wireless Communications Signal Processing and Networking (WiSPNET), Chennai, India, 4–6 August 2020; pp. 43–47.
Tarik, M.I.; Akter, S.; Mamun, A.A.; Sattar, A. Potato Disease Detection Using Machine Learning. In Proceedings of the 2021 Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV), Tirunelveli, India, 4–6 February 2021; pp. 800–803.
Singh, A.; Kaur, H. Potato Plant Leaves Disease Detection and Classification using Machine Learning Methodologies. IOP Conf. Ser. Mater. Sci. Eng. 2021, 1022, 012121.
Rashid, J.; Khan, I.; Ali, G.; Almotiri, S.H.; AlGhamdi, M.A.; Masood, K. Multi-Level Deep Learning Model for Potato Leaf Disease Recognition. Electronics 2021, 10, 2064.
Bhere, H.; Jariwala, V.; Sharma, A.; Nemade, V. Potato Plant Leaf Disease Classification Using Deep CNN. In Proceedings of the Potato Plant Leaf Disease Classification Using Deep CNN, Presented at the 5th International Conference (ESDA 2022), Kolkata, India, 17–18 December 2022; Springer: Singapore, 2024; pp. 367–378.
Pasalkar, J.; Gorde, G.; More, C.; Memane, S.; Gaikwad, V. Potato Leaf Disease Detection using Machine Learning. Curr. Agric. Res. J. 2023, 11, 949–954.
Sholihati, R.A.; Sulistijono, I.A.; Risnumawan, A.; Kusumawati, E. Potato Leaf Disease Classification Using Deep Learning Approach. In Proceedings of the 2020 International Electronics Symposium (IES), Surabaya, Indonesia, 29–30 September 2020; pp. 392–397.
Griffel, L.M.; Delparte, D.; Edwards, J. Using Support Vector Machines classification to differentiate spectral signatures of potato plants infected with Potato Virus Y. Comput. Electron. Agric. 2018, 153, 318–324.
Polder, G.; Blok, P.M.; De Villiers, H.A.C.; Van Der Wolf, J.M.; Kamp, J. Potato Virus Y Detection in Seed Potatoes Using Deep Learning on Hyperspectral Images. Front. Plant Sci. 2019, 10, 209.
Sugiura, R.; Tsuda, S.; Tsuji, H.; Murakami, N. Virus-Infected Plant Detection in Potato Seed Production Field by UAV Imagery. In Proceedings of the 2018 ASABE Annual International Meeting, Detroit, MI, USA, 29 July–1 August 2018; American Society of Agricultural and Biological Engineers: St. Joseph, MI, USA, 2018.
Gold, K.M.; Townsend, P.A.; Herrmann, I.; Gevens, A.J. Investigating potato late blight physiological differences across potato cultivars with spectroscopy and machine learning. Plant Sci. 2020, 295, 110316.
Boguszewska-Mańkowska, D.; Ruszczak, B.; Zarzyńska, K. Classification of Potato Varieties Drought Stress Tolerance Using Supervised Learning. Appl. Sci. 2022, 12, 1939.
Lapajne, J.; Vojnović, A.; Vončina, A.; Žibrat, U. Enhancing Water-Deficient Potato Plant Identification: Assessing Realistic Performance of Attention-Based Deep Neural Networks and Hyperspectral Imaging for Agricultural Applications. Plants 2024, 13, 1918.
Lozada, D.N.; Sandhu, K.S.; Bhatta, M. Ridge regression and deep learning models for genome-wide selection of complex traits in New Mexican Chile peppers. BMC Genom. Data 2023, 24, 80.
Sabanci, K.; Fatih Aslan, M.; Ropelewska, E.; Fahri Unlersen, M. A convolutional neural network-based comparative study for pepper seed classification: Analysis of selected deep features with support vector machine. J. Food Process Eng. 2021, 45, e13955.
Kurtulmuş, F.; Kavdir, I.; Alibaş, İ. Classification of pepper seeds using machine vision based on neural network. Int. J. Agric. Biol. Eng. 2016, 9, 51–62.
Tu, K.; Li, L.; Yang, L.; Wang, J.; Sun, Q. Selection for high quality pepper seeds by machine vision and classifiers. J. Integr. Agric. 2016, 17, 1999–2006.
Ramírez-Meraz, M.; Méndez-Aguilar, R.; Hidalgo-Martínez, D.; Villa-Ruano, N.; Zepeda-Vallejo, L.G.; Vallejo-Contreras, F.; Hernández-Guerrero, C.J.; Becerra-Martínez, E. Experimental races of Capsicum annuum cv. jalapeño: Chemical characterization and classification by 1H NMR/machine learning. Food Res. Int. 2020, 138, 109763.
Dissanayake, A.; Rajapaksha, I.; Gunarathna, R.; Jayasinghe, S.; De Silva, H.; Hettiarachchi, S. Detection of Diseases and Nutrition in Bell Pepper. In Proceedings of the 5th International Conference on Advancements in Computing (ICAC), Colombo, Sri Lanka, 7–8 December 2023; pp. 286–291.
Haque, I.; Islam, M.A.; Roy, K.; Rahaman, M.M.; Shohan, A.A.; Islam, M.S. Classifying Pepper Disease based on Transfer Learning: A DeepLearning Approach. In Proceedings of the 2022 International Conference on Applied Artificial Intelligence and Computing (ICAAIC), Salem, India, 9–11 May 2022; pp. 620–629.
Fumia, N.; Kantar, M.; Lin, Y.; Schafleitner, R.; Lefebvre, V.; Paran, I.; Börner, A.; Diez, M.J.; Prohens, J.; Bovy, A.; et al. Exploration of high-throughput data for heat tolerance selection in Capsicum annuum. Plant Phenome J. 2023, 6, e20071.
Islam, S.; Samsuzzaman Reza, M.N.; Lee, K.-H.; Ahmed, S.; Cho, Y.J.; Noh, D.H.; Chung, S.-O. Image Processing and Support Vector Machine (SVM) for Classifying Environmental Stress Symptoms of Pepper Seedlings Grown in a Plant Factory. Agronomy 2024, 14, 2043.
Ataş, M.; Yardimci, Y.; Temizel, A. A new approach to aflatoxin detection in chili pepper by machine vision. Comput. Electron. Agric. 2012, 87, 129–141.
Whang, S.E.; Roh, Y.; Song, H.; Lee, J.-G. Data Collection and Quality Challenges in Deep Learning: A Data-Centric AI Perspective. VLDB J. 2023, 32, 791–813.
Linardatos, P.; Papastefanopoulos, V.; Kotsiantis, S. Explainable AI: A Review of Machine Learning Interpretability Methods. Entropy 2020, 23, 18.
Piorkowski, D.; Park, S.; Wang, A.Y.; Wang, D.; Muller, M.; Portnoy, F. How AI Developers Overcome Communication Challenges in a Multidisciplinary Team. Proc. ACM Hum. Comput. Interact. 2021, 5, 131.
Graziani, M.; Dutkiewicz, L.; Calvaresi, D.; Amorim, J.P.; Yordanova, K.; Vered, M.; Nair, R.; Abreu, P.H.; Blanke, T.; Pulignano, V.; et al. A Global Taxonomy of Interpretable AI: Unifying the Terminology for the Technical and Social Sciences. Artif. Intell. Rev. 2023, 56, 3473–3504.
Malounas, I.; Paliouras, G.; Nikolopoulos, D.; Liakopoulos, G.; Bresta, P.; Londra, P.; Katsileros, A.; Fountas, S. Early detection of broccoli drought acclimation/stress in agricultural environments utilizing proximal hyperspectral imaging and AutoML. Smart Agric. Technol. 2024, 8, 100463.
Ferentinos, K.P. Deep Learning Models for Plant Disease Detection and Diagnosis. Comput. Electron. Agric. 2018, 145, 311–318.

Figure 1. The five key attributes of Big Data (Volume, Value, Variety, Veracity, and Velocity). Volume denotes the vast agricultural datasets, while Value represents actionable insights that optimize precision agriculture, resource allocation, and crop productivity. Variety covers diverse sources, like sensors and market trends, and Veracity ensures data reliability. Velocity reflects real-time data processing.

Figure 2. An overview of ML tools applied in Solanaceous crops to enhance breeding strategies for important selection targets. Various ML approaches, including SVMs, RFs, Genomic-Environmental Prediction (iGEP), and algorithms that enhance GS efficacy and accuracy, support predictive modeling. DL techniques such as ANNs, CNNs, RNNs, and Graph Neural Networks (GNNs) further enhance forecasting capabilities. Together, these ML tools help forecast and select key breeding targets, including biotic and abiotic stress resistance, genetic diversity, yield, nutritional value, and fruit quality, and they support high-throughput phenotyping, ultimately optimizing crop improvement strategies.

Table 1. Applications of AI-assisted techniques in the genetic improvement of the most common breeding targets in four representative members of the Solanaceae family.

Traits	Tomato	Eggplant	Potato	Pepper
Yield prediction	Artificial Neural Networks [31,32], Deep Convolutional Neural Networks [33,34], Supervised Learning [37]	Artificial Neural Networks [61,62,63], Support Vector Machines [61], k-Nearest Neighbors [61], Random Forests [61], Adaptive Boosting [61], Categorical Boosting Regression [62], Light Gradient Boosting Regression [62], Extreme Gradient Boosting Regression [62]	Random Forests [73,74,75,78,79], Support Vector Machines [73,74,75,78,79], k-Nearest Neighbors [74,77,79], Extreme Gradient Boosting [77], Graph Neural Networks [77], Long Short-Term Memory Networks [77]	Random Forests [98,102], Bayesian Ridge Regression [102]
Varietal identification & classification	Artificial Neural Networks & Random Forests [49]	Support Vector Machines [65], Convolutional Neural Networks [65]	Support Vector Machines [79], k-Nearest Neighbors [79], Random Forests [79], Convolutional Neural Networks [81], Artificial Neural Networks [82]	Convolutional Neural Networks & Support Vector Machines [99], Artificial Neural Networks [100], Multilayer Perceptron Neural Network [101], Random Forests [102]
Fruit/Tuber quality (shape, color, weight, firmness, etc.)	Support Vector Classification & Regression-based models [28], Artificial Neural Networks [40], Support Vector Machines [43,44,45], Random Forests [28,45]	Artificial Neural Networks [63]	Convolutional Neural Networks [81,82], Artificial Neural Networks [82]	Classification and Regression-based models [28], Random Forests [28,102]
Biotic stress (early blight, leaf spot, fruit rot, pests, etc.)	Artificial Neural Networks [52], Random Forests [53,55,57], Support Vector Machines [54,55], k-Nearest Neighbors [54,55], Naive Bayes [54], Decision Trees [54], Convolutional Neural Networks [57,58,59]	Support Vector Machines [68], Deep Convolutional Neural Network (VGG16) [69,71]	Deep Neural Networks [83], Random Forests [85,95], Support Vector Machines [87,92], Convolutional Neural Networks [85,86,88,89,90,91,92,93,94]	Convolutional Neural Networks [103,104]
Abiotic stress (drought, cold stress, heat stress, salinity, etc.)	Extreme Gradient Boosting [51], Random Forests [56]	_	Extreme Gradient Boosting [96], Convolutional Neural Networks [97]	Convolutional Neural Networks [103], Random Forests [105], Support Vector Machines [106]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
