Electronics
  • Article
  • Open Access

17 July 2021

A Real-Time Detection Algorithm for Kiwifruit Defects Based on YOLOv5

1 College of Information Engineering, Sichuan Agricultural University, Ya’an 625000, China
2 Sichuan Key Laboratory of Agricultural Information Engineering, Ya’an 625000, China
3 School of Computing, National University of Singapore, Singapore 119077, Singapore
* Author to whom correspondence should be addressed.
This article belongs to the Collection Electronics for Agriculture

Abstract

Defect detection is the most important step in the postharvest reprocessing of kiwifruit. However, some small defects are difficult to detect, and the accuracy and speed of existing detection algorithms struggle to meet the requirements of real-time detection. To solve these problems, we developed a defect detection model based on YOLOv5 that detects defects accurately and quickly. The main contributions of this research are as follows: (1) a small object detection layer is added to improve the model’s ability to detect small defects; (2) an SELayer is embedded so that the model weighs the importance of different channels; (3) the CIoU loss function is introduced to make the bounding-box regression more accurate; (4) without increasing the training cost, we train the model with transfer learning and use the CosineAnnealing learning-rate schedule to improve its performance. The experimental results show that the overall performance of the improved network, YOLOv5-Ours, is better than both the original and mainstream detection algorithms. The mAP@0.5 of YOLOv5-Ours reaches 94.7%, an improvement of nearly 10 percentage points over the original algorithm. The model takes only 0.1 s to detect a single image, which demonstrates its efficiency. Therefore, YOLOv5-Ours can well meet the requirements of real-time detection and provides a robust strategy for a kiwifruit flaw detection system.

1. Introduction

China is the world’s largest producer of kiwifruit, with an output ranking first in the world [1]. Defect detection plays a significant role in the postharvest reprocessing of kiwifruit. Through defect detection, kiwifruit can be graded and priced according to quality, which helps to overcome the historically low prices of kiwifruit [2]. It also safeguards food safety. However, current detection practice remains traditional and outdated: most manufacturers rely mainly on manual inspection, which consumes considerable labor and is inefficient [3].
In recent years, computer-vision-based object detection technology has gradually matured [4,5]. Shah et al. used Faster R-CNN to identify plants and weeds [6]. Zeze et al. used a CNN to recognize apples [7]. Computer vision offers the clear advantages of high accuracy and fast speed [8], and defect detection based on computer vision is an automatic and nondestructive fruit detection method [9]. It surpasses manual inspection in both precision and efficiency; hence, its application to fruit is an inevitable trend [10].
Existing fruit defect detection algorithms struggle to balance speed and accuracy. Dong et al. [11] used computer vision to detect surface defects on Korla fragrant pears; while accuracy was guaranteed, detecting a single image still took 2.5 s. Wang et al. [12] performed rapid detection of pomegranate leaf diseases, but the accuracy was only 87%. Xing et al. [13] used a BP neural network for mango quality inspection to maximize speed while maintaining accuracy, yet detection still took 0.8 s per image.
The development of deep learning in recent years has led to major breakthroughs in computer vision. In object recognition, deep learning algorithms represented by convolutional neural networks (CNNs) have improved both accuracy and detection speed compared with traditional methods [14]. Current object recognition algorithms fall into two main types. The first are two-stage algorithms based on a detection frame and a classifier, such as the R-CNN [15] series; these achieve higher accuracy, but their deeper network structures make them slower, failing to meet real-time detection requirements. The second are regression-based one-stage algorithms, such as the SSD [16] and YOLO [17] series, which offer faster inference and stronger practicability and can support real-time object recognition and detection.
This paper takes kiwifruit defects as the research object. We collected photos of four common types of flaws to build a kiwifruit flaw dataset and exploited the high detection speed and accuracy of the YOLOv5 [18] algorithm in image detection. We improved the model and compared it with the original one. Using the CosineAnnealing [19] learning-rate decay schedule during training improves the model without increasing the training cost. The results show that the improved model achieves significant progress, which proves its effectiveness.
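The CosineAnnealing decay mentioned above can be sketched directly from the SGDR formula [19]; the lr_max and lr_min values below are illustrative assumptions, not the hyperparameters used in this study.

```python
import math

def cosine_annealing_lr(step, total_steps, lr_max=0.01, lr_min=0.0001):
    """Cosine-annealed learning rate, after SGDR (Loshchilov & Hutter).

    The rate starts at lr_max and follows half a cosine down to lr_min
    over total_steps, giving large early updates and fine late updates
    without any extra training cost.
    """
    cos = math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cos)
```

In a PyTorch training loop, the equivalent schedule is available out of the box as `torch.optim.lr_scheduler.CosineAnnealingLR`.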

3. Results

3.1. Experimental Results

To judge the quality of the detection model accurately, the evaluation in this paper is based on the loss function curve (Loss) and the mean average precision (mAP).
During training, the loss function intuitively reflects whether the network model converges stably as the number of iterations increases. The loss of the model is shown in Figure 8 below.
Figure 8. Training loss curves.
The figure shows that, as the number of iterations increases, the loss curve of the improved YOLOv5 gradually converges and the loss value keeps decreasing. After about 600 iterations, the loss is essentially stable and has dropped to near 0, and the network has converged. Compared with the original YOLOv5, the regression converges faster and more accurately, which demonstrates the validity and effectiveness of the model.
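The CIoU loss that drives this box regression can be sketched in plain Python. The (x1, y1, x2, y2) box format and the epsilon constants below are illustrative; this is a conceptual sketch of the loss from Zheng et al. [33], not the YOLOv5 implementation.

```python
import math

def ciou_loss(box_a, box_b):
    """Complete-IoU loss for two axis-aligned boxes (x1, y1, x2, y2).

    On top of the plain IoU term, CIoU penalizes the distance between
    box centers and the mismatch in aspect ratio, so the regression
    still gives a useful gradient even when the boxes do not overlap.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection over union.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    iou = inter / (area_a + area_b - inter + 1e-9)

    # Squared center distance, normalised by the squared diagonal of
    # the smallest box enclosing both.
    rho2 = ((ax1 + ax2) - (bx1 + bx2)) ** 2 / 4 + \
           ((ay1 + ay2) - (by1 + by2)) ** 2 / 4
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + 1e-9

    # Aspect-ratio consistency term.
    v = (4 / math.pi ** 2) * (
        math.atan((ax2 - ax1) / (ay2 - ay1))
        - math.atan((bx2 - bx1) / (by2 - by1))
    ) ** 2
    alpha = v / (1 - iou + v + 1e-9)

    return 1 - (iou - rho2 / c2 - alpha * v)
```

Identical boxes give a loss near 0, while disjoint boxes are still penalized through the center-distance term.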
The mAP measures the quality of the defect detection model: the higher the value, the higher the average detection precision and the better the performance.
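The per-class average precision underlying the mAP can be computed from ranked detections. This is a simplified rectangle-rule sketch under assumed inputs (a list of confidence-scored detections already matched against ground truth), not the exact evaluation code used in the experiments; mAP is the mean of this value over all defect classes.

```python
def average_precision(scores_and_hits, num_gt):
    """Area under the precision-recall curve for one defect class.

    scores_and_hits: list of (confidence, is_true_positive) pairs,
    one per detection; num_gt: number of ground-truth defects.
    """
    ranked = sorted(scores_and_hits, key=lambda t: t[0], reverse=True)
    tp = fp = 0
    ap, prev_recall = 0.0, 0.0
    for _, hit in ranked:
        if hit:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / num_gt
        # Accumulate area under the curve by the rectangle rule.
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap
```

A detector that ranks every true positive first scores an AP of 1.0; false positives ranked above true positives pull the value down.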
Figure 9 shows that after about 200 iterations, the mAP of the YOLOv5-Ours model reaches about 94% and gradually stabilizes, peaking at 98%, indicating that the improved YOLOv5 model achieves a high average precision for defect detection. The overall performance meets and even exceeds expectations.
Figure 9. mAP during training.

3.2. Analysis

Figure 10 below shows the detection results of the original YOLOv5 network and the YOLOv5-Ours network on part of the kiwifruit dataset, covering different defect categories and defect sizes.
Figure 10. Comparison of detection algorithm before and after improvement.
As the results show, our improved YOLOv5 accurately detects defects in complex environments, including tiny defects, and the returned bounding boxes are more precise. The embedded SELayer suppresses unimportant features, significantly improving the robustness of the model and demonstrating the effectiveness of the network.
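The channel recalibration performed by the SELayer [30] can be illustrated with a minimal NumPy sketch: squeeze each channel to one descriptor, pass it through a small bottleneck MLP, and rescale the channels by the resulting gates. The weight shapes and reduction ratio here are illustrative assumptions, not the trained model's parameters.

```python
import numpy as np

def se_layer(x, w1, b1, w2, b2):
    """Squeeze-and-Excitation recalibration of a (C, H, W) feature map.

    w1/b1 and w2/b2 are the two fully connected layers of the
    excitation MLP (reduce to C/r channels, then expand back to C).
    """
    # Squeeze: global average pooling -> one descriptor per channel.
    z = x.mean(axis=(1, 2))
    # Excitation: bottleneck MLP with ReLU, then sigmoid gating.
    h = np.maximum(0.0, w1 @ z + b1)
    s = 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))
    # Scale: reweight each channel by its learned importance,
    # suppressing uninformative channels.
    return x * s[:, None, None]
```

With all-zero weights the gates sit at sigmoid(0) = 0.5, so every channel is scaled uniformly; training moves the gates toward 1 for informative channels and toward 0 for the rest.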
Under the condition that the IoU threshold is 50%, the mAP@0.5 of the original YOLOv5 is 85%, and the mAP@0.5 of the improved YOLOv5 is 94.7%.
Table 1 below shows the accuracy comparison between the original model and the improved one.
Table 1. Comparison between the original model and the improved model.
According to Table 1, the improved model raises the mAP by nearly 10 percentage points. Testing shows that, despite the increased complexity of the model, the improved network still takes only 0.1 s to detect a single image, which satisfies real-time detection.
It can be inferred from Table 2 that, compared with mainstream detection algorithms, our network has a higher mAP. Although Fast R-CNN performs well on mAP, it takes 0.79 s to detect a single image, which cannot meet the requirements of real-time detection.
Table 2. Comparison between mainstream detection algorithms.

4. Discussion

This paper explores a method for automatic, real-time detection of kiwifruit defects. To meet farmers’ need to monitor the state of kiwifruit in real time, we conducted deeper research with the YOLOv5 model. Adding a small-target detection layer improves the ability to detect small defects; an SELayer was embedded to enhance useful features and suppress less important ones; and CIoU was used as the loss function to make the regression more stable. The feasibility of this method is as follows:
  • In terms of accuracy, the dataset of this study consists of manually captured images; hence, the background information is relatively simple, and accuracy may drop under slightly more complex backgrounds. However, this research targets controlled, industrial scenes, so complex backgrounds will not arise in practical application.
  • In terms of processing speed, meeting farmers’ real-time needs requires fast processing of the images collected by the camera. We therefore chose an object detection model over instance segmentation or semantic segmentation models, which are comparatively slow. For multi-object detection, we adopted YOLOv5, an advanced single-stage object detection model. Compared with two-stage methods, it achieves higher processing speed on the same hardware; its advantages over other one-stage methods (such as YOLOv2) are described in Section 2.1. The optimized network structure of YOLOv5-Ours is more complex, which reduces detection speed compared with the original YOLOv5, but a single image still takes only 0.1 s, which meets the above requirements.
  • In terms of model generalization ability, YOLOv5 uses a mosaic data enhancement strategy to improve the model’s generalization ability and robustness.
Based on the above discussion, we believe that the method we proposed is an effective exploration and can promote the development of postproduction reprocessing of crops.

5. Conclusions and Future Work

In this research, deep learning was applied to kiwifruit flaw detection, and a high-precision detection method based on YOLOv5 was proposed. First, a kiwifruit dataset containing four types of defects was collected; as far as we know, this is the first kiwifruit defect dataset in the world and even the first postproduction defect dataset for an agricultural product. It is also the first application of the YOLOv5 network to crops. Then, YOLOv5 was improved: a small-target detection layer was added to the backbone network, and an SELayer was embedded to strengthen the model’s feature extraction. In addition, we replaced the DIoU loss function with the CIoU loss function to improve the positioning accuracy of the predicted boxes and enhance model convergence. Compared with the original YOLOv5 model, mAP@0.5 increases by nearly 10 percentage points. The model detects a single image in only 0.1 s (on a GTX 1050 Ti GPU) and is more robust to the environment, which proves its effectiveness and provides farmers with more efficient, intelligent postproduction reprocessing strategies.
This paper focuses on kiwifruit defect detection under real-time requirements. However, fast detection still depends on a specific hardware configuration. In the future, we will continue to optimize YOLOv5-Ours and apply pruning techniques to compress the model. We will also extend the research to more kiwifruit varieties to broaden the scope of application.

Author Contributions

Conceptualization, J.Y. (Jia Yao); methodology, J.Y. (Jia Yao); software, J.Y. (Jia Yao) and J.Q.; validation, X.L. and H.S.; formal analysis, J.Y. (Jia Yao) and J.Q.; investigation, J.Y. (Jia Yang) and H.S.; resources, J.Z. and J.Q.; data curation, J.Z. and J.Y. (Jia Yang); writing—original draft preparation, J.Y. (Jia Yao) and J.Y. (Jia Yang); writing—review and editing, J.Z. and J.Y. (Jia Yang); visualization, J.Y. (Jia Yao); supervision, X.L. and H.S.; project administration, J.Z.; funding acquisition, J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Sichuan Provincial Federation of Social Sciences, Youth Project, Grant Number SC19C032, and funded by Sichuan Agricultural University Scientific Research Interest Program, Grant Number 2021663.

Acknowledgments

We would like to thank Ya’an Hongming Farm for its help in collecting the dataset, and Jiaoyang Jiang, Yuan Ou, for providing English language support. We also thank Hongming Shao, Jingyu Pu, and Ying Xiang for their advice on the dataset.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Food Industry Network. China’s kiwifruit production ranks first in the world. Food Saf. Guide 2018, 33, 6. [Google Scholar]
  2. Fayuan, W.; Wenkai, W. Introduction to Frontier Knowledge and Skills of Modern Agricultural Economic Development; Hubei Science and Technology Press: Wuhan, China, 2010. [Google Scholar]
  3. Li, Q. Research on Non-Destructive Testing and Automatic Grading of Kiwifruit Based on Computer Vision; Anhui Agricultural University: Hefei, China, 2020. [Google Scholar]
  4. Jiao, L.; Zhang, F.; Liu, F.; Yang, S.; Li, L.; Feng, Z.; Qu, R. A survey of deep learning-based object detection. IEEE Access 2019, 7, 128837–128868. [Google Scholar] [CrossRef]
  5. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef]
  6. Shah, T.M.; Nasika, D.P.B.; Otterpohl, R. Plant and Weed Identifier Robot as an Agroecological Tool Using Artificial Neural Networks for Image Identification. Agriculture 2021, 11, 222. [Google Scholar] [CrossRef]
  7. Zeze, F.; Qian, L.; Jiewei, C.; Xiaofeng, Y.; Haifang, L. Apple tree fruit detection and grading based on color and fruit diameter characteristics. Comput. Eng. Sci. 2020, 42, 82–90. [Google Scholar]
  8. Pan, Y.; Wei, J.; Zeng, L. Farmland Bird Target Detection Algorithm Based on YOLOv3. Available online: http://kns.cnki.net/kcms/detail/31.1690.TN.20210409.0942.050.html (accessed on 16 July 2021).
  9. Qingzhong, L.; Maohua, W. Development and prospect of real-time fruit grading technology based on computer vision. Trans. Chin. Soc. Agric. Mach. 1999, 6, 1–7. [Google Scholar]
  10. Xu, T. Research on Classification and Recognition of Fruit Surface Grade Based on Machine Vision; Chongqing Jiaotong University: Chongqing, China, 2018. [Google Scholar]
  11. Jianwei, D.; Yuanyuan, L.; Fei, C.; Tongxuan, W.; Shengsheng, D.; Yankun, P. Surface Defect Detection of Korla Fragrant Pear Based on Multispectral Image. J. Agric. Mech. Res. 2021, 43, 41–46. [Google Scholar]
  12. Yanni, W.; Li, H. Detection method of pomegranate leaf diseases based on multi-class SVM. Comput. Meas. Control. 2020, 28, 197–201. [Google Scholar]
  13. Huajian, X. Research on the Application of Computer Vision in Mango Quality Detection. J. Agric. Mech. Res. 2019, 1, 190–193. [Google Scholar]
  14. Du, Z.; Fang, S.; Zhe, L.; Zheng, J. Tomato Leaf Disease Detection Based on Deep Feature Fusion of Convolutional Neural Network; China Sciencepaper: Beijing, China, 2020. [Google Scholar]
  15. Liu, X. Research on Tomato Diseased Leaf Recognition Based on Mask R-CNN and Its Application in Smart Agriculture System; Xidian University: Xian, China, 2020. [Google Scholar]
  16. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Proceedings of the Computer Vision—ECCV 2016 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016. [Google Scholar]
  17. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
  18. Shao, H.; Pu, J.; Mu, J. Pig-Posture Recognition Based on Computer Vision: Dataset and Exploration. Animals 2021, 11, 1295. [Google Scholar] [CrossRef] [PubMed]
  19. Loshchilov, I.; Hutter, F. SGDR: Stochastic Gradient Descent with Warm Restarts. In Proceedings of the ICLR 2017 (5th International Conference on Learning Representations), Toulon, France, 24–26 April 2017. [Google Scholar]
  20. Ruan, J. Design and Implementation of Target Detection Algorithm Based on YOLO; Beijing University of Posts and Telecommunications: Beijing, China, 2019. [Google Scholar]
  21. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–8 December 2012; 25, pp. 1097–1105. [Google Scholar]
  22. Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. In Proceedings of the IEEE Conference on Computer Vision & Pattern Recognition IEEE, Honolulu, HI, USA, 21–26 July 2017; pp. 6517–6525. [Google Scholar]
  23. Redmon, J.; Farhadi, A. YOLOv3: An incremental improvement. In Proceedings of the CVPR 2018: IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake, UT, USA, 18–22 June 2018. [Google Scholar]
  24. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. Available online: https://arxiv.org/abs/1512.03385 (accessed on 16 July 2021).
  25. Lin, T.Y.; Dollar, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125. [Google Scholar]
  26. Bochkovskiy, A.; Wang, C.Y.; Liao, H. YOLOv4: Optimal Speed and Accuracy of Object Detection. Available online: https://arxiv.org/abs/2004.10934 (accessed on 16 July 2021).
  27. Luvizon, D.; Tabia, H.; Picard, D. SSP-Net: Scalable Sequential Pyramid Networks for Real-Time 3D Human Pose Regression. Available online: https://arxiv.org/abs/2009.01998 (accessed on 16 July 2021).
  28. Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path Aggregation Network for Instance Segmentation. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) IEEE, Salt Lake City, UT, USA, 18–23 June 2018. [Google Scholar]
  29. Wang, C.Y.; Liao, H.; Wu, Y.H.; Chen, P.Y.; Hsieh, J.W.; Yeh I, H. CSPNet: A New Backbone that can Enhance Learning Capability of CNN. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, Seattle, WA, USA, 14–19 June 2020. [Google Scholar]
  30. Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 42, 7132–7141. [Google Scholar]
  31. Jiang, B.; Luo, R.; Mao, J.; Xiao, T.; Jiang, Y. Acquisition of Localization Confidence for Accurate Object Detection. In Proceedings of the European conference on computer vision (ECCV), Munich, Germany, 8–14 September 2018. [Google Scholar]
  32. Rezatofighi, H.; Tsoi, N.; Gwak, J.; Sadeghian, A.; Reid, I.; Savarese, S. Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) IEEE, Long Beach, CA, USA, 15–20 June 2019. [Google Scholar]
  33. Zheng, Z.; Wang, P.; Ren, D.; Liu, W.; Ye, R.; Hu, Q.; Zuo, W. Enhancing Geometric Factors in Model Learning and Inference for Object Detection and Instance Segmentation. Available online: https://arxiv.org/abs/2005.03572 (accessed on 16 July 2021).
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
