1. Introduction
Green tea is a distinctive category of tea in China [1]. Its production process generally includes picking, withering, greening, kneading, drying, and refining [2]. Before harvesting and throughout processing, impurities that degrade tea quality are inevitably mixed in, so sorting them out is essential [3]. With acute labor shortages and rapid advances in automation, demand for automated removal of tea impurities is growing. Color sorting machines currently play a pivotal role in automating impurity removal and have substantially improved sorting efficiency [4]. Nevertheless, for impurities in famous green tea with a unique or fragile shape, the sorting ability of color sorters remains insufficient, and they have several potential disadvantages. First, the tea suffers friction-induced damage and breakage under mechanical stress, which compromises both its appearance and its structural integrity [5]. Second, when impurities resemble tea leaves in shape, a large number of qualified leaves are rejected together with the impurities, which reduces the output of famous green tea. Third, the rapid sorting action of color sorters can affect the volatile compounds responsible for the tea’s aroma and thereby alter its flavor [6]. Typically, color sorting is used to remove impurities from lower-grade tea, whereas the purification of premium green tea still depends on manual sorting, which is expensive and physically demanding [7]. Guo et al. introduced a method for detecting impurities in Pu’er tea using hyperspectral imaging. Images of the tea samples and their impurities were captured over the 400 to 1000 nm range, spectral information was extracted for each sample type, and a Support Vector Machine (SVM) model was built to classify the hyperspectral images at the pixel level [8]. Although this technique is effective at locating impurities in Pu’er tea, its use in practical production remains limited by the high cost of spectrometry equipment and the susceptibility of spectral reflectance data to interference.
Recent years have witnessed remarkable progress in deep learning [9], particularly in target detection algorithms. These algorithms fall into two main types: two-stage detection networks, exemplified by Faster R-CNN, which offer high detection accuracy but low speed [10], and single-stage detection networks, such as SSD and YOLO [11], which offer high detection speed but lower accuracy. In addition, existing target detection models are highly complex, which hinders their deployment in real production [12]. Therefore, it is essential to develop a high-precision and lightweight impurity detection algorithm for famous green tea.
Researchers in China and abroad have studied the application of deep learning to impurity detection extensively. Liu et al. established a CPU Net semantic segmentation framework for corn impurity assessment, which incorporates a convolutional block attention mechanism and a pyramid pooling strategy into the U-Net architecture. The average MIoU, MPA, and ST of the model are 97.31%, 98.71%, and 158.4 ms, respectively, with a relative error of 4.64% compared to the manually calculated average [13]. Qi et al. developed a method for assessing wheat crushing rates and impurity levels using the DeepLab-EDA semantic segmentation framework. This approach yielded a Mean Intersection over Union (MIoU) of 89.41%, a Mean Precision (MP) of 95.97%, and a Mean Recall (MR) of 94.83%, which are improvements of 9.94%, 7.41%, and 7.52% over the baseline model, respectively [14]. Rong et al. devised a two-stage convolutional neural network for segmenting walnut images and detecting foreign materials in them in real time. Their technique accurately segments 99.4% of the target regions in the test images and identifies 96.5% of the contaminants in the validation phase [15]. Huang et al. proposed an impurity detection algorithm for Tieguanyin tea based on an improved YOLOv5 model [16]. The improved model identifies tea impurities with higher confidence than the baseline model, but it reaches only 62 FPS.
The above studies provide a reference for this paper. However, identifying impurities in famous green tea differs significantly from these research subjects and settings, particularly because impurities such as tea stems closely resemble green tea in color and appearance, which makes accurate detection considerably harder. There are currently few research reports on deep learning for impurity detection in premium green tea. To address these challenges, this study presents an enhanced YOLO model with model compression for detecting impurities in high-quality green tea. The key contributions of this study are as follows: (1) To address sample imbalance and scale sensitivity in tea impurity detection, four loss functions were compared experimentally, and Focaler_mpdiou was selected as the final loss function, which improved detection performance without increasing model complexity. (2) To address the high complexity of existing models, which makes them unsuitable for deployment, the L1 regularization pruning method is introduced to reduce model complexity at the cost of some detection accuracy. (3) To handle the dense, small targets typical of tea impurity detection, the BCKD method is introduced for knowledge distillation, which improves the detection accuracy of small impurities without increasing model complexity.
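As an illustration of the loss named in contribution (1), the following is a minimal sketch of a Focaler-style reweighting applied to an MPDIoU loss, written from the commonly published definitions rather than from the authors' code. The function names, the `(x1, y1, x2, y2)` box format, and the interval bounds `d = 0.0` and `u = 0.95` are assumptions made for this example.

```python
def iou_xyxy(a, b):
    """Plain IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mpdiou_loss(pred, gt, img_w, img_h):
    """1 - MPDIoU: IoU penalized by the squared distances between the
    top-left corners and between the bottom-right corners, each
    normalized by the squared diagonal of the input image."""
    norm = img_w ** 2 + img_h ** 2
    d1_sq = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2
    d2_sq = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2
    mpdiou = iou_xyxy(pred, gt) - d1_sq / norm - d2_sq / norm
    return 1.0 - mpdiou


def focaler_iou(iou, d=0.0, u=0.95):
    """Linear remapping of IoU onto [0, 1] over the interval [d, u].
    The default bounds are illustrative assumptions."""
    assert u > d, "the upper bound must exceed the lower bound"
    return min(1.0, max(0.0, (iou - d) / (u - d)))


def focaler_mpdiou_loss(pred, gt, img_w, img_h, d=0.0, u=0.95):
    """Focaler-style loss: base loss + IoU - remapped IoU."""
    iou = iou_xyxy(pred, gt)
    return mpdiou_loss(pred, gt, img_w, img_h) + iou - focaler_iou(iou, d, u)


if __name__ == "__main__":
    box_gt = (100.0, 100.0, 140.0, 120.0)
    box_pred = (104.0, 102.0, 146.0, 121.0)
    print(focaler_mpdiou_loss(box_pred, box_gt, img_w=640, img_h=640))
```

Because the loss only changes how the regression error is scored during training, it adds no parameters or inference cost, which is consistent with the claim that detection performance improved without increasing complexity.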
4. Discussion
To address the high labor intensity and low efficiency of manually sorting impurities in high-quality green tea, this study proposes an enhanced YOLO model with model compression for impurity detection. Compared with the initial model, P, R, mAP, and FPS increased by 0.0051, 0.012, 0.0094, and 72.2, respectively, while GFLOPs and Params decreased by 2.3 and 860350 B, respectively. The model thus improves detection precision while reducing complexity, which meets the requirements for detecting and classifying impurities in famous green tea. The overall P value for impurity identification increased, indicating higher overall detection accuracy. By impurity type, however, the P values for sunflower shells and tea stems decreased slightly, possibly because their color resembles that of high-quality green tea and the improved model does not learn their features well enough. The R and mAP values also increased, indicating that the enhanced model reduces missed detections and improves average detection accuracy. In the generalization experiment, the model appears to have missed some tea stems, possibly because tea stems and black tea have similar morphology. Overall, the detection results correctly reflect the actual impurities in black tea, which indicates that the model can adapt to the image features of black tea to a certain extent and detect impurities in it effectively.
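The reported changes can be cross-checked against the absolute values given for the enhanced model in the Conclusions. The short sketch below recovers the implied values of the initial model and the relative change for each metric. The per-image time is derived as 1000 / FPS, which assumes one image per forward pass and no pipelining; that assumption, like the function and variable names, is not from the source.

```python
# Values reported for the enhanced model in the Conclusions.
enhanced = {"P": 0.9214, "R": 0.8759, "mAP": 0.9317,
            "FPS": 885.2, "GFLOPs": 5.8, "Params": 2146078}

# Signed changes relative to the initial model, as stated in the Discussion.
change = {"P": 0.0051, "R": 0.012, "mAP": 0.0094,
          "FPS": 72.2, "GFLOPs": -2.3, "Params": -860350}


def recover_baseline(enhanced_values, changes):
    """Initial-model value = enhanced value minus the signed change."""
    return {k: enhanced_values[k] - changes[k] for k in enhanced_values}


def relative_change_percent(enhanced_values, changes):
    """Signed change expressed as a percentage of the initial value."""
    base = recover_baseline(enhanced_values, changes)
    return {k: 100.0 * changes[k] / base[k] for k in enhanced_values}


def latency_ms(fps):
    """Per-image time in milliseconds, assuming one image per pass."""
    return 1000.0 / fps


if __name__ == "__main__":
    baseline = recover_baseline(enhanced, change)
    relative = relative_change_percent(enhanced, change)
    for name in enhanced:
        print(f"{name}: initial {baseline[name]:.4f}, "
              f"enhanced {enhanced[name]:.4f}, change {relative[name]:+.2f}%")
    print(f"per-image time: {latency_ms(baseline['FPS']):.3f} ms -> "
          f"{latency_ms(enhanced['FPS']):.3f} ms")
```

Expressing the changes relative to the initial model makes it easier to see that the accuracy gains are small in relative terms while the reduction in GFLOPs and Params is much larger.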
Compared with traditional manual sorting, detection and sorting based on deep learning greatly improves sorting efficiency. Compared with the work of Huang et al. [16], this study greatly improves detection speed and yields a more lightweight model. Compared with the work of Guo et al. [8], it has a lower cost and stronger resistance to interference. Owing to its high detection speed and lightweight design, the improved model can be embedded directly into tea impurity removal pipelines. In addition, although the improved model was trained on green tea samples, it adapted to the image features of black tea in the generalization experiment and detected impurities effectively, which supports the feasibility of cross-category detection.
Of course, the model proposed in this article also has certain limitations. The impurities selected here are representative to some degree, but they cover relatively few of the types found in actual production, and the background is limited to white, so the experiments cannot fully simulate the complex conditions that may occur in real production environments.
5. Conclusions
This study proposes an enhanced YOLO model with model compression for impurity detection in high-quality green tea. On a self-built dataset containing four categories of impurities (tea stems, sunflower shells, stones, and tea fruits), the YOLOv8 models were compared in terms of complexity and experimental results, and YOLOv8n was chosen as the base model. We experimentally compared ShapeIoU, SIoU, MPDIoU, and Focaler_mpdiou and selected Focaler_mpdiou as the final loss function. The loss function of the other YOLOv8 models was also replaced with Focaler_mpdiou, and YOLOv8m-Focaler_mpdiou was selected as the teacher model based on its detection capability. The model was pruned to make it lightweight, at the cost of some detection accuracy, and then underwent knowledge distillation, which further improved detection performance without increasing complexity. Experimental results on the self-built tea impurity dataset show that the GFLOPs, Params, P, R, mAP, and FPS of the enhanced model were 5.8, 2146078 B, 0.9214, 0.8759, 0.9317, and 885.2, respectively. The model improved detection precision while reducing complexity. We then validated the model’s generalization on black tea samples. The results indicate that the model can detect the four types of impurities in black tea, although some detections are still missed. The model has potential for industrial application to specific green tea categories and common impurity detection scenarios.
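The two compression steps summarized above can be illustrated with a toy sketch. It is not the authors' implementation. The first function ranks convolution filters by the L1 norm of their weights, which is one common reading of L1-based pruning; the source does not specify the exact criterion. The second computes a binary cross-entropy between teacher and student sigmoid scores, in the spirit of treating each class map as a binary classification problem; any weighting or localization term used by BCKD is omitted here. All names and the `keep_ratio` parameter are hypothetical.

```python
import math


def l1_norms(filters):
    """L1 norm of each filter, where a filter is a flat list of weights."""
    return [sum(abs(w) for w in f) for f in filters]


def select_filters_by_l1(filters, keep_ratio):
    """Indices of the filters to keep: the largest L1 norms survive.
    keep_ratio is a hypothetical parameter of this sketch."""
    norms = l1_norms(filters)
    n_keep = max(1, int(len(filters) * keep_ratio))
    ranked = sorted(range(len(filters)), key=lambda i: norms[i], reverse=True)
    return sorted(ranked[:n_keep])


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def binary_distill_loss(student_logits, teacher_logits, eps=1e-7):
    """Mean binary cross-entropy with the teacher's sigmoid scores as
    soft targets (simplified; no per-element weighting is applied)."""
    total = 0.0
    for s, t in zip(student_logits, teacher_logits):
        p_s = min(max(sigmoid(s), eps), 1.0 - eps)  # clamp for log stability
        p_t = sigmoid(t)
        total += -(p_t * math.log(p_s) + (1.0 - p_t) * math.log(1.0 - p_s))
    return total / len(student_logits)


if __name__ == "__main__":
    toy_filters = [[0.1, -0.1], [1.0, -2.0], [0.0, 0.05], [-0.5, 0.5]]
    print(select_filters_by_l1(toy_filters, keep_ratio=0.5))
    print(binary_distill_loss([0.3, -1.2], [1.5, -2.0]))
```

In this reading, pruning removes the filters that contribute least by magnitude, which lowers GFLOPs and Params but can cost accuracy, and distillation then pulls the smaller student's scores toward those of the teacher (here, YOLOv8m-Focaler_mpdiou) without changing the student's architecture.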
Our proposed impurity identification model for high-quality green tea improves detection performance while remaining lightweight. It also has certain advantages over some current mainstream models. However, the model has some limitations (e.g., the dataset contains few types of tea impurities), which constrain its ability to generalize. Going forward, we plan to expand the dataset to increase the diversity of tea impurity data, enabling the model to detect multiple types of tea impurities in complex scenarios.