A Deep-Learning-Based Model for the Detection of Diseased Tomato Leaves

Abstract: This study introduces a You Only Look Once (YOLO) model for detecting diseases in tomato leaves, using YOLOV8s as the underlying framework. Images of healthy and diseased tomato leaves were obtained from the Plant Village dataset, enhanced, and used to train YOLOV8s through the Ultralytics Hub, which provides a well-suited setting for training YOLOV8 and YOLOV5 models. The YAML file was carefully written to identify diseased leaves. The detection results demonstrate the robustness and efficiency of the YOLOV8s model in recognizing unhealthy tomato leaves: it attained the highest mean average precision (mAP) of 92.5%, surpassing YOLOV5's 89.1% and Faster R-CNN's 77.5%. In addition, the YOLOV8s model is considerably smaller and has a faster inference speed, reaching 121.5 FPS in contrast to YOLOV5's 102.7 FPS and Faster R-CNN's 11 FPS. This illustrates the lack of real-time detection capability in Faster R-CNN, whereas YOLOV5 is less efficient than YOLOV8s in meeting this need. Overall, the results demonstrate that the YOLOV8s model is more efficient than the other models examined in this study for object detection.


Introduction
The introduction of image detection packages has greatly improved the accuracy and precision of image classification tasks. Several deep-learning frameworks, including Keras, TensorFlow, and PyTorch, have been extensively used for image recognition and classification, producing impressive outcomes [1][2][3][4][5]. Although these packages offer robust precision and detection capabilities, they face several constraints when used to train deep-learning models. These constraints include resource-intensive GPU requirements, slow processing rates that frequently depend on the system, a large number of parameters, and intricate network architectures.
An important breakthrough in this domain was the creation of the You Only Look Once (YOLO) algorithm. YOLO marks a significant advance in object recognition and classification, providing major enhancements over previous frameworks. Multiple versions of YOLO have been used to detect and categorize objects. These versions have consistently demonstrated higher accuracy, precision, and mean average precision (mAP) than the packages described above [6][7][8][9][10][11][12].
YOLO-based detection models have been used in several domains, such as the health sector, agriculture, the automobile industry, geospatial analysis, and other engineering sectors. Researchers regard YOLO highly for its streamlined network architectures, high image detection accuracy, and reduced parameter count, all of which contribute to its strong detection performance.
Guoxu et al. [13] created a robust tomato detection method using the YOLOV3 framework. This deep-learning model incorporates a dense architecture into YOLOV3, greatly improving the model's capacity for learning and its potential for reuse in tomato detection. The model exhibited superior detection capabilities in comparison to many alternative methods.
Yukun et al. [14] introduced an enhanced model based on YOLOV3 for detecting cotton stubble. They first gathered leftover film images and resolved the discrepancy between the datasets and the stubble shape by proposing a segmented dataset labeling. In addition, the model improved the Darknet-53 backbone of the basic YOLOV3 network to better detect small targets in the datasets. The K-means method was employed to cluster the prediction anchor boxes, optimizing their dimensions for the upgraded YOLOV3. The enhanced model demonstrated a markedly higher detection rate than the standard YOLOV3 in several settings.
Rongli et al. [15] devised an algorithm based on YOLOV4 to detect cherry fruit. They proposed augmenting the YOLOV4 CSPDarknet53 backbone with DenseNet to improve its detection accuracy. The efficacy of this approach was validated through a comparative analysis with other deep-learning algorithms, including YOLOV3, YOLOV3 Dense, and the conventional YOLOV4. The results demonstrated that the upgraded YOLOV4 achieved a mean average precision (mAP) that was 15% higher than that of the standard YOLOV4.
Wu et al. [16] developed a streamlined apple blossom identification model using YOLOV4. They enhanced the detection efficiency by optimizing the channels and evaluated the outcomes against YOLOV2, YOLOV3, and Faster R-CNN. The model based on YOLOV4 demonstrated superior detection performance compared to the other assessed approaches. Li et al. [17] performed network optimization on YOLOV5 to improve the detection and recognition of tiny tomatoes. Their method combined a focus and cross-stage network and used an efficient intersection over union (IoU) loss function to improve performance.
Sozzi et al. [18] examined real-time bunch identification using six distinct YOLO algorithms: YOLOV3, YOLOV3-tiny, YOLOV4, YOLOV4-tiny, YOLOV5x, and YOLOV5s. The study covered different white grape cultivars, using a diverse set of photos taken under varying lighting conditions. According to the results, YOLOV4 and YOLOV5x obtained F1 scores of 0.77 and 0.76, respectively. Furthermore, YOLOV4 ran at 32 frames per second (FPS), while YOLOV5x ran at 31 FPS. YOLOV4-tiny achieved the highest rate of 196 FPS, with an F1 score of 0.6, making it well suited for real-time grape yield estimation.
Jialian et al. [19] devised an improved RDE-YOLOV7 model to enhance the precision of detecting dragon fruit. The purpose of this deep-learning model is to improve the efficiency of fruit-picking robots. The dragon-fruit-picking system built on the RDE-YOLOV7 framework detected fruit with high accuracy during picking trials, showing the model's suitability for automated fruit picking.
Yang et al. [20] introduced an enhanced YOLOV7 model designed for the detection of apple fruit. The algorithm was improved to address the reduced precision in identifying apple fruit caused by crowding, occlusion, and overlapping fruit. The improvements comprised the addition of the SPPCSPS module and the transformation of the serial channel into a parallel channel, resulting in a substantial increase in the speed of image feature fusion. Additionally, a supplementary detection component was incorporated into the main structure of the head to further improve the detection output. The enhanced algorithm exhibited a 6.9% increase in mean average precision (mAP) over the standard YOLOV7 model, showing higher recognition accuracy.
YOLOV8, the latest iteration of the YOLO framework, represents a significant advance in object recognition and classification. It surpasses its previous versions [21,22] by providing improved adaptability, precision, and efficiency. The YOLOV8 architecture prioritizes local feature analysis over global image observation. This approach greatly decreases the amount of computation required, allowing detection to be performed in real time.
Convolution, a fundamental step in this process, mathematically combines two functions to create a third function. It is commonly employed in computer vision and signal processing to apply filters to signals or images and thereby detect specific patterns. Convolutional layers are applied repeatedly throughout the algorithm to obtain feature maps. Convolutional neural networks (CNNs) use convolution to extract features from images and other inputs. A convolution is characterized by its padding (P), kernel (K), and stride (S).
Convolution involves moving a kernel (or filter) across the input signal or image. With a stride of one, the kernel moves one location at a time; with a stride of two, it advances by two locations with each movement. The stride directly affects the spatial dimension of the convolution result. Increasing the stride decreases the output dimensionality and discards some spatial information. Conversely, decreasing the stride preserves more detail but requires more computation. Longer strides therefore reduce the computational burden and increase the operating speed while potentially affecting the quality. Padding is the act of adding extra pixels, usually zeros, to the edges of the input image before performing the convolution. It is used to handle the effects that occur at the image boundaries.
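The interaction between kernel size, stride, and padding described above can be made concrete with the standard output-size relation O = floor((W + 2P - K) / S) + 1, where W is the input size along one dimension. The following is a minimal sketch; the function name and the 640-pixel input are illustrative assumptions and are not taken from this study.

```python
def conv_output_size(input_size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output size along one spatial dimension: floor((W + 2P - K) / S) + 1."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    if kernel > input_size + 2 * padding:
        raise ValueError("kernel does not fit inside the padded input")
    return (input_size + 2 * padding - kernel) // stride + 1


# Hypothetical 640-pixel side with a 3x3 kernel.
same_size = conv_output_size(640, kernel=3, stride=1, padding=1)  # 640: padding preserves the size
halved = conv_output_size(640, kernel=3, stride=2, padding=1)     # 320: stride 2 halves the size
unpadded = conv_output_size(640, kernel=3, stride=1, padding=0)   # 638: the border is lost
print(same_size, halved, unpadded)
```

Doubling the stride halves each spatial dimension, so the feature map holds roughly a quarter of the values, which is where the saving in computation comes from.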
YOLOV8 has several benefits compared to prior iterations, such as superior accuracy in detecting objects in low-resolution images, leading to more accurate and dependable identification of objects in different situations. Syed et al. [23] introduced an improved and comprehensive method for identifying and segmenting plant diseases using the end-to-end YOLOV8 model. The Ultralytics YOLOV8 model was trained on the Plant Village and Plant Doc datasets, resulting in a notable improvement in the model's detection and prediction performance. The findings showed an F1 score of 0.99, a precision rate of 99.8%, a mAP50 (mean average precision at an IoU threshold of 50%) of 99.5%, and a mAP50-95 (mean average precision over IoU thresholds from 50% to 95%) of 96.5%. These measurements demonstrate the accuracy and precision of the YOLOV8 model in identifying and isolating diseased regions on plant leaves. The model exhibits high precision in identifying and categorizing diverse crop conditions and diseases on plant leaves, and in distinguishing specific target fruits. Furthermore, it attains this precision while maintaining quick inference speeds. The decrease in the number of parameters in comparison to prior iterations enables the creation of lightweight deep-learning models.
This study used YOLOV8s to identify tomato leaf diseases, taking advantage of the notable improvements in the advanced YOLO framework. The model uses feature extraction to analyze both healthy and diseased tomato leaves and categorizes them based on their characteristics. The YAML configuration was customized to meet the detection requirements. The training was performed with the YOLOV8s version through the Ultralytics Hub web interface, which requires considerably less coding than other training platforms. A comparative analysis was conducted using the results of a YOLOV5 model trained through the same interface.
The aims of this work are as follows:
• Dataset enhancement and augmentation to improve the precision and accuracy of detecting diseased tomato leaves.
• Developing the YAML code for the required detection output.
• Deploying and training the models based on the standard network structures of YOLOV8s and YOLOV5 via the Ultralytics Hub, which is less time-consuming.
• Comparative analysis of the detection performance of the implemented YOLOV8s against other models, such as YOLOV5 and Faster R-CNN, using the same parameters.

Materials

Dataset
The data were acquired from the Plant Village dataset [24], which covers fourteen distinct categories of plant leaves. The leaf images are classified into 39 unique categories, encompassing both healthy and diseased leaves. The diseased leaves are further categorized according to the specific type of disease. The selected images of diseased tomato leaves covered many conditions, including yellow leaf curl virus, late blight, Septoria leaf spot, mosaic virus, target spot, bacterial spot, leaf mold, and early blight (Figure 1). The training dataset comprised 10,000 images of healthy and diseased tomato leaves, with 1000 images representing healthy tomato leaves and 1000 images for each of the categorized disease conditions. For validation and testing, 7000 and 500 images were used, respectively.

Data Enhancement and Augmentation
Accurate and well-annotated data are crucial for deep-learning networks to efficiently extract the required features. Errors and noise in the dataset have a negative impact on the performance of deep-learning models. Furthermore, mislabeled or incorrect data, along with incompatible or inconsistent images, can greatly impede the model's ability to train effectively. These problems can result in overfitting, particularly when training on a small dataset, causing the network to overly emphasize disturbance and noise within the objects that need to be recognized. This leads to a significant reduction in the detection accuracy. Thus, our work used the offline data enhancement technique proposed by Yang et al. [22]. This approach enhanced the dataset by introducing blurred images, randomly rotated images, flipped images, and images with Gaussian noise of distinct mean and variance. As a result, the deep-learning model showed significant improvement in both general performance and robustness, and the impact of overfitting was reduced. The augmentation carried out with these methods multiplied the number of images in the dataset by a factor of 5. Figure 2 displays the original and augmented images.
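The four augmentation operations named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of Yang et al. [22]: images are small grayscale grids held in plain Python lists so that the sketch runs without third-party packages, rotation is restricted to random multiples of 90 degrees, blurring is a 3x3 box filter, and the noise parameters are hypothetical. Keeping the original alongside its four variants is one way to arrive at a fivefold increase in dataset size.

```python
import random
from typing import List

Image = List[List[float]]  # grayscale image: rows of intensities in [0, 255]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def rotate_random_quarter_turns(img: Image, rng: random.Random) -> Image:
    """Rotate clockwise by 90, 180, or 270 degrees (assumed rotation set)."""
    out = img
    for _ in range(rng.choice([1, 2, 3])):
        out = [list(row) for row in zip(*out[::-1])]  # one clockwise quarter turn
    return out


def add_gaussian_noise(img: Image, rng: random.Random,
                       mean: float = 0.0, std: float = 10.0) -> Image:
    """Add Gaussian noise per pixel and clip the result to [0, 255]."""
    return [[min(255.0, max(0.0, p + rng.gauss(mean, std))) for p in row]
            for row in img]


def box_blur(img: Image) -> Image:
    """Blur with a 3x3 mean filter; border pixels average the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def augment(img: Image, rng: random.Random) -> List[Image]:
    """Return the original plus four augmented variants (five images in total)."""
    return [img,
            box_blur(img),
            rotate_random_quarter_turns(img, rng),
            add_gaussian_noise(img, rng),
            flip_horizontal(img)]


sample = [[0.0, 50.0, 100.0], [150.0, 200.0, 250.0]]
variants = augment(sample, random.Random(0))
print(len(variants))  # 5
```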

Method

Standard YOLOV8s
The standard lightweight YOLOV8s is used in this paper to detect diseased tomato leaves by extracting their features. Figure 3 illustrates the detection procedure of YOLOV8. The YOLOV8s model was created and trained using the Ultralytics Hub interface. The model is built on the YOLOV8 deep-learning technique and is designed to be lightweight. The YOLOV8s architecture consists of a backbone, a neck network, and a prediction output head. The backbone network uses convolutions to extract features of varying scales from RGB color images. The extracted features are then fused in the neck network.
A Feature Pyramid Network (FPN) is commonly used to combine features from lower levels into higher-level representations. The head layer predicts the target categories, employing three detectors of different sizes to identify the contents of the image. This multi-scale technique significantly improves the model's capacity to identify and categorize items of different sizes and levels of intricacy. The standard YOLOV8 network structures are presented in Figure 4.

Training Parameters and Setting

Training Environment
The model in this work was trained on the Windows 10 operating system. The Ultralytics Hub interface, a platform for deploying and training deep-learning models, was used. The current configuration of this interface supports training of both the YOLOV5 and YOLOV8 versions.

Dataset Preparation and Model Training
The dataset was formatted following the YOLOV8 specifications. The YAML file describing the dataset was placed in the main directory. The directory was then compressed into a zip file for transfer to the Ultralytics Hub. Care was taken to ensure that the dataset, the YAML file, the directory, and the zip file were all named identically. The YAML specifications followed the YOLOV8 format. Once compressed, the dataset was uploaded to the Ultralytics Hub for assessment before training with the chosen YOLOV8s model. Training began by using the model's API key to establish a connection to the Ultralytics Hub notebook. The configuration of the experiment environment is shown in Table 1.
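To illustrate what such a dataset description can look like, the sketch below generates a YAML file in the layout commonly used for Ultralytics detection datasets (a root `path`, `train`/`val`/`test` image folders, and an indexed `names` map). The authors' actual file is not reproduced in this text, so the dataset name `tomato_leaves`, the folder names, and the class order are assumptions made for illustration; the class labels follow the conditions listed in the Dataset section.

```python
from typing import Sequence


def build_dataset_yaml(root: str, class_names: Sequence[str]) -> str:
    """Render a dataset description: root path, split folders, indexed class names."""
    lines = [
        f"path: {root}",          # dataset root directory (hypothetical name)
        "train: images/train",    # assumed folder layout, relative to the root
        "val: images/val",
        "test: images/test",
        "names:",
    ]
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


# Hypothetical class order; the labels mirror the conditions named in the text.
CLASSES = [
    "healthy",
    "yellow_leaf_curl_virus",
    "late_blight",
    "septoria_leaf_spot",
    "mosaic_virus",
    "target_spot",
    "bacterial_spot",
    "leaf_mold",
    "early_blight",
]

yaml_text = build_dataset_yaml("tomato_leaves", CLASSES)
print(yaml_text)
```

Following the naming precaution described above, this text would be saved as `tomato_leaves.yaml` inside a `tomato_leaves` directory that is then zipped as `tomato_leaves.zip`.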

Parameters for Model Evaluation
To evaluate the performance of the implemented YOLOV8s model, the following metrics are used: mean average precision (mAP), recall (R), and precision (P). The mAP is a standard measure of the accuracy and robustness of the object detection method employed by the model. It is calculated by determining the average precision (AP) for each class under the intersection over union (IoU) protocol, where the AP is the integral of the precision-recall curve. The mAP is computed using the following formula:

$$\mathrm{mAP} = \frac{1}{N}\sum_{i=1}^{N} AP_i \tag{1}$$

where $AP_i$ is the AP of class $i$ and $N$ is the number of classes. The computation of the mAP builds on the following components: the confusion matrix, intersection over union (IoU), recall, and precision. Precision measures the proportion of true positives (TPs) among all the positive predictions. It is calculated using the following formula:

$$P = \frac{TP}{TP + FP} \tag{2}$$

where TP stands for true positives and FP for false positives. Recall measures the proportion of actual positives that are correctly identified. It is determined using the following equation:

$$R = \frac{TP}{TP + FN} \tag{3}$$

where FN represents the false negatives. The four attributes required to create a confusion matrix are the TP, TN, FP, and FN. Figure 5 illustrates the conceptual view of the confusion matrix. The IoU quantifies the degree of overlap between the predicted bounding box and the ground truth box. A higher IoU value implies closer agreement between the coordinates of the predicted bounding box and those of the ground truth box.
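The metrics defined in this section can be sketched in a few lines. The functions below follow the standard definitions of precision, recall, IoU, and mAP; the box format (x_min, y_min, x_max, y_max) and all numeric inputs are hypothetical and are not results from this study.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed format


def precision(tp: int, fp: int) -> float:
    """Share of positive predictions that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of actual positives that are found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def iou(a: Box, b: Box) -> float:
    """Intersection area divided by union area of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def mean_average_precision(ap_per_class: Sequence[float]) -> float:
    """Mean of the per-class AP values."""
    if not ap_per_class:
        raise ValueError("at least one class is required")
    return sum(ap_per_class) / len(ap_per_class)


# Hypothetical counts: 90 true positives, 10 false positives, 30 false negatives.
print(precision(90, 10))  # 0.9
print(recall(90, 30))     # 0.75
# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1) = 1/7.
print(iou((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)))
```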

Results
This section provides a comprehensive analysis of the detection outcomes obtained from the implemented YOLOV8s model. Furthermore, the model's ability to accurately detect damaged tomato leaves from the dataset it was trained on is examined and presented.

Comparative Analysis of the Ablation Experiments
This section presents the outcome of the ablation experiment, in which the detection performance of YOLOV8s was compared with that of YOLOV5, both implemented in the Ultralytics Hub, as shown in Table 2. The performance study in Table 2 shows that YOLOV8s surpasses the YOLOV5 model in terms of precision, recall, and mean average precision. Moreover, the reduced number of parameters in the YOLOV8 algorithm contributes to its lightweight design. The YOLOV8-based model obtains a precision of 93.2%, surpassing YOLOV5 by 3.1%. Additionally, it achieves a mAP of 92.5%, a 2.9% improvement over YOLOV5. Furthermore, its recall rate is 3.2% higher than that of YOLOV5. The findings highlight the enhancements incorporated into the YOLOV8 algorithm, which lead to better detection capabilities than its predecessor, YOLOV5.
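The overall loss in YOLOV8 is calculated by aggregating the classification loss and the regression loss. The classification loss is calculated using the varifocal loss (VFL), whereas the regression loss consists of the distribution focal loss (DFL) and the complete intersection over union (CIoU) loss. This methodology supports precise classification of detected objects, as CIoU enhances the efficiency and robustness of IoU in object detection. Meanwhile, the issue of class imbalance in the object detection process is handled by the DFL. The losses are presented as follows:

$$VFL(p, q) = \begin{cases} -q\,\big(q\log p + (1 - q)\log(1 - p)\big), & q > 0 \\ -\alpha\, p^{\gamma}\log(1 - p), & q = 0 \end{cases} \tag{4}$$

where q and p represent the label and the IoU-aware classification score (IACS), respectively; within Equation (4) only, α and γ denote the weighting and focusing hyperparameters of the VFL. Then, the CIoU loss is computed as follows:

$$L_{CIoU} = 1 - IoU + \frac{\rho^{2}(b, b^{gt})}{c^{2}} + \alpha v \tag{5}$$

where b and b^{gt} are the center points of the two boxes, c is the length of the diagonal of the smallest area enclosing the two boxes, α is the weighting coefficient, and ρ is the Euclidean distance. The consistency of the aspect ratio is measured by the term v, which can be evaluated using the following formula:

$$v = \frac{4}{\pi^{2}}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^{2} \tag{6}$$

where w and h are the width and height of the predicted box and w^{gt} and h^{gt} are those of the ground truth box. As formulated in Equation (5), the priority of regression is decided by the term α, which is the trade-off term. It is determined as follows:

$$\alpha = \frac{v}{(1 - IoU) + v} \tag{7}$$

The DFL is computed using the following equation:

$$DFL(S_i, S_{i+1}) = -\big((y_{i+1} - y)\log S_i + (y - y_i)\log S_{i+1}\big) \tag{8}$$

where y is the general distribution value, i is the index, $S_i = \frac{y_{i+1} - y}{y_{i+1} - y_i}$, and $S_{i+1} = \frac{y - y_i}{y_{i+1} - y_i}$.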
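As a worked illustration of the CIoU term discussed in this section, the sketch below evaluates the loss for a pair of boxes using the commonly published form 1 - IoU + ρ²/c² + αv, with v the aspect-ratio consistency term and α = v / ((1 - IoU) + v). The box format (x_min, y_min, x_max, y_max), the small `eps` stabiliser, and the example boxes are assumptions made for illustration; this is not the training code used by the Ultralytics Hub.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed format


def ciou_loss(box: Box, box_gt: Box, eps: float = 1e-9) -> float:
    """CIoU loss: 1 - IoU + rho^2 / c^2 + alpha * v."""
    w, h = box[2] - box[0], box[3] - box[1]
    w_gt, h_gt = box_gt[2] - box_gt[0], box_gt[3] - box_gt[1]

    # Plain IoU of the two boxes.
    inter_w = max(0.0, min(box[2], box_gt[2]) - max(box[0], box_gt[0]))
    inter_h = max(0.0, min(box[3], box_gt[3]) - max(box[1], box_gt[1]))
    inter = inter_w * inter_h
    union = w * h + w_gt * h_gt - inter
    iou = inter / (union + eps)

    # Squared Euclidean distance between the two box centres (rho^2).
    rho2 = ((box[0] + box[2]) / 2 - (box_gt[0] + box_gt[2]) / 2) ** 2 \
         + ((box[1] + box[3]) / 2 - (box_gt[1] + box_gt[3]) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes (c^2).
    enclose_w = max(box[2], box_gt[2]) - min(box[0], box_gt[0])
    enclose_h = max(box[3], box_gt[3]) - min(box[1], box_gt[1])
    c2 = enclose_w ** 2 + enclose_h ** 2

    # Aspect-ratio consistency v and its trade-off weight alpha.
    v = (4 / math.pi ** 2) * (math.atan(w_gt / (h_gt + eps)) - math.atan(w / (h + eps))) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / (c2 + eps) + alpha * v


# Identical boxes give a loss close to zero.
print(ciou_loss((0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 2.0)))
# Disjoint boxes have IoU = 0, so the centre-distance term pushes the loss above 1.
print(ciou_loss((0.0, 0.0, 1.0, 1.0), (2.0, 2.0, 3.0, 3.0)))
```

Unlike plain IoU, which is zero for every pair of non-overlapping boxes, the centre-distance term still distinguishes near misses from distant ones, which is what gives the regression a useful signal in that case.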

Comparative Analysis of the Detection Output between YOLOV8 and Other Models
A comparative analysis was performed to assess the detection effectiveness of the implemented YOLOV8s model in comparison to other deep-learning models. More specifically, the performance of YOLOV8s was compared to that of YOLOV5 and Faster R-CNN. The YOLOV8s and YOLOV5 models were both trained using the same parameters. In contrast, the Faster R-CNN model was trained using PyTorch (version 1.7.1) with a Python 3.8 interface rather than the Ultralytics Hub web interface used for training YOLOV8s and YOLOV5.
The results show that YOLOV8s outperformed the other models in several facets of object detection, encompassing precision, processing speed measured in frames per second, and the ability to recognize objects in real time. The obtained results are presented in Table 3 below.
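As an illustration of how a frames-per-second figure of this kind can be obtained, the following sketch times an arbitrary inference callable. It is a generic measurement harness under stated assumptions, not the procedure used in this study: `infer`, `frames`, and `warmup` are hypothetical names, and a real benchmark would also need to fix the hardware, batch size, and image resolution.

```python
import time


def measure_fps(infer, frames, warmup=2):
    """Return frames processed per second by `infer` over `frames`.

    `infer` is any callable that takes one frame; `warmup` runs are
    excluded from timing so that one-off setup costs do not distort it.
    """
    for frame in frames[:warmup]:
        infer(frame)
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    # Guard against a zero interval on very fast callables
    elapsed = max(time.perf_counter() - start, 1e-12)
    return len(frames) / elapsed


def latency_ms(fps):
    """Convert a frame rate into the average per-frame latency in milliseconds."""
    return 1000.0 / fps
```

Converting the frame rates reported in the abstract with `latency_ms` gives roughly 8.2 ms per frame for YOLOV8s (121.5 FPS), 9.7 ms for YOLOV5 (102.7 FPS), and 90.9 ms for Faster R-CNN (11 FPS).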

Discussion
The results of this study demonstrate a significant improvement in the performance of the YOLOV8s model compared to YOLOV5 and Faster R-CNN when evaluated under identical parameters and experimental conditions. YOLOV8s had a mean average precision (mAP) of 92.5%, exceeding the mAP of 89.6% for YOLOV5 and 77.3% for Faster R-CNN. In addition, YOLOV8s achieved the highest frame rate (FPS), making it the best suited of the three models to real-time detection in this experiment. The enhanced performance is consistent with the findings of Yang et al. [22], who also observed superior performance metrics for YOLOV8 in tomato recognition experiments.
When compared to other models, such as YOLOV7, YOLOV5, YOLOV4, Faster R-CNN, and SSD, YOLOV8s outperformed them by achieving the highest mean average precision (mAP) of 91.9%, a recall rate of 91.2%, and a precision rate of 92.5%. The results align with a prior investigation [23], which demonstrated that YOLOV8 exhibits exceptional precision and comprehensiveness in detecting and distinguishing plant leaf diseases, achieving a peak accuracy of 99.8% and completeness of 99.3%. The outstanding performance of YOLOV8s can be attributed to its optimized architecture, decreased parameter count, and rapid inference speed, highlighting its robustness in many detection tasks.
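For readers less familiar with the metrics quoted above, the following minimal sketch shows how precision and recall follow from detection counts and how mAP averages per-class average precision (AP). The counts, class names, and AP values used with it are hypothetical and are not results from this study.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def mean_average_precision(ap_per_class):
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)


# Hypothetical example: 90 correct detections, 10 false alarms, 30 misses
p, r = precision_recall(tp=90, fp=10, fn=30)

# Hypothetical per-class AP values for a two-class leaf detector
m = mean_average_precision({"healthy": 0.95, "diseased": 0.90})
```

A model can score high precision with low recall (few false alarms but many missed leaves) or the reverse, which is why the two values are reported alongside mAP.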
Furthermore, Agarwal et al. [25] evaluated convolutional neural network (CNN) models that had more parameters than YOLOV8s. This had a negative impact on their real-time performance, since it led to decreased inference speeds. The present study exclusively employed the Ultralytics Hub interface for training YOLOV5 and YOLOV8s. This interface is configured using YAML and is user-friendly, which improves the operational efficiency of YOLOV8s. Nevertheless, the performance outcomes may change when these models are applied in other settings, such as PyTorch.
The improved effectiveness of YOLOV8s seen in this study can be attributed to its simplified structure and the use of the Ultralytics Hub interface. On the other hand, Liu et al. [26,27] obtained a higher mean average precision (mAP) of 98.56% and a frame rate of 131.41 frames per second (FPS) by using YOLOX with a MobileNetV3 backbone. The researchers employed CycleGAN to normalize the distribution of diseased leaf samples in the dataset, which had a substantial impact on their findings. However, we could not utilize this approach in our research because the Ultralytics Hub interface cannot be modified. This implies that further investigation and enhancement are required in this domain.
These findings emphasize the importance of the model's architecture and the settings under which it is trained in influencing the efficiency of deep-learning models for real-time applications. YOLOV8 demonstrates outstanding performance in terms of the mean average precision (mAP), recall, and precision. Additionally, its real-time object detection capability makes it a highly potent tool for various practical applications [28,29]. This study contributes to the growing body of evidence that confirms the efficacy of YOLOV8s in demanding detection tasks and proposes that it should be taken into account for future implementations.
The results of this study have significant implications for the progress and application of deep-learning models in real-time scenarios [29,30]. The YOLOV8s model sets a superior benchmark in terms of both precision and efficiency, highlighting the importance of streamlined architectures and optimal training settings. Additional research might explore the integration of advanced techniques such as CycleGAN into the Ultralytics Hub framework, or the adaptation of YOLOV8s to different scenarios, to enhance its efficiency and adaptability across diverse datasets and environments.
Moreover, it is crucial to recognize the potential of YOLOV8s in several industries outside of agriculture and the identification of plant diseases. The capacity to promptly detect and recognize objects in real time can be effectively employed in domains such as autonomous driving, surveillance, and medical diagnostics, where rapidity and precision are of paramount importance. The extensive utility of YOLOV8s highlights the importance of continuous innovation and adaptation to evolving technical demands.

Conclusions
This work presents a tomato leaf disease detection model constructed using YOLOV8s, the most advanced and up-to-date version available. The model deployment and training were conducted using the no-code Ultralytics Hub environment. A comparative analysis was performed using YOLOV5 and Faster R-CNN models, which showed that the YOLOV8s-based model displayed a substantial enhancement in detection performance.

1. The ablation experiment shows that the model based on YOLOV8s exhibits significantly improved detection performance in comparison to the model based on YOLOV5. The YOLOV8s model outperformed YOLOV5, with a 2.9% increase in mean average precision (mAP), a 3.2% increase in precision, and a 3.1% increase in recall.
different sizes and levels of intricacy. The standard YOLOV8 network structures are presented in Figure 4.

Figure 5. Conceptual view of the confusion matrix.

Graphical representations of the training and validation losses are shown in Figure 6a and Figure 6b, respectively.

Figure 6. Graphical representation of the (a) training loss value and (b) validation loss value.

Figures 7 and 8 display the detection outcomes of the YOLOV8s model employed in this investigation, showing examples of the identified healthy and diseased tomato leaves, respectively. The delineated regions represent objects identified by the network, with the accompanying text giving the category of the detected tomato leaf and a numerical value showing the model's confidence in the identification. The YOLOV8s model precisely identified both healthy and diseased tomato leaves and showed a high level of confidence in detecting these categories. The high detection precision on the acquired dataset images can be attributed to the lack of overlapping objects.


Table 1. Configuration of the experiment environment.

Table 2. Results of the ablation experiment.

Table 3. Results of the comparative analysis.