1. Introduction
Mushrooms are fungi with important nutritional value, and there are about 2000 edible species worldwide [1]. The most cultivated species include Agaricus bisporus, Lentinula edodes, and Pleurotus ostreatus. The global fresh mushroom market was valued at USD 38 billion in 2018 [2]. China is one of the world’s largest producers and exporters of wild edible mushrooms; by value, edible mushroom cultivation ranks fifth after grain, vegetable, fruit, and edible oil cultivation and ahead of sugar, cotton, and tobacco. Mushroom cultivation has had a significant impact on China’s poverty alleviation program, with incomes at least ten times higher than those from rice and maize, and the rapid development of the market and increased demand for edible mushrooms have generally strengthened the domestic edible mushroom economy [3].
However, oyster mushroom cultivation is still mainly performed manually, which is inefficient and costly and makes it difficult to ensure mushroom quality. Edible mushroom production is expected to move from automation toward intelligent, information-driven systems [4]. Bria et al. [5] used a fuzzy logic system to predict the size and quality of oyster mushrooms from temperature, light intensity, and humidity. Cikarge and Arifin also used fuzzy logic to maintain an optimum humidity and thus promote the growth of oyster mushrooms [6]. Meanwhile, Rawidean et al. [7] implemented a smart mushroom farm using IoT technology to provide optimal conditions for mushroom growth. At the other end of the production process, harvesting and grading are equally important. In many rural mushroom farms, growing conditions and economic factors often prevent growers from using industrial automation to harvest mushrooms. Because fresh mushrooms have a short shelf life, untimely harvesting can lead to a decline in quality and significant economic losses [8]. Labor costs are high, and the accuracy of human judgment may decrease as working hours increase, so intelligent picking and grading are of great significance for the development of smart mushroom factories.
Deep learning and target detection algorithms have developed rapidly in recent years, and machine vision is expected to be widely applied in automated, intelligent agricultural growing bases [9,10,11,12]. Several studies have applied deep learning to smart mushroom factories. Alok Mukherjee [13], from IIT Malda, India, used two supervised learning models, combining Support Vector Machines (SVMs) and Artificial Neural Networks (ANNs), to classify the freshness and deterioration of oyster mushrooms. Qian et al. [14] improved the SSD convolutional neural network and combined it with a binocular depth camera to recognize and localize flat oyster mushrooms in three-dimensional space for oyster mushroom picking. Chuan-Pin Lu et al. [15], from Meiwa University, used the YOLOv3 algorithm combined with a self-designed Score-Punishment algorithm for mushroom cap measurements. Bohan Wei et al. [16], from Zhejiang College of Tongji University, used an improved YOLOv5 algorithm to detect edible mushrooms. Wang Leilei et al. [17], from Hebei University of Engineering, used an improved YOLOv5 algorithm to detect the maturity of oyster mushrooms. YOLO has become a mainstream model for target detection because its single-stage design is fast, lightweight, and performs considerably better than traditional models. Current research mainly focuses on maturity detection and picking of oyster mushrooms, and there is no research on using the mainstream YOLO algorithm to grade oyster mushrooms. This paper builds on the above research and adopts YOLOv8 for the grading detection of oyster mushrooms.
Building on YOLOv8, this paper proposes the OMC-YOLO model, which optimizes the network according to the characteristics of oyster mushrooms, improving detection accuracy while making the model more lightweight. The model performs well in addressing missed detections, false detections, and low accuracy when detecting oyster mushrooms.
The main contributions of this paper are as follows. First, images of oyster mushrooms were collected from our own cultivation and from internet resources, then organized and labeled to extend the oyster mushroom dataset. Second, depth-wise separable convolution (DWConv) [18] replaced the regular convolution in part of the backbone network, reducing the number of model parameters and making the model more lightweight. Third, large separable kernel attention (LSKA) [19] was incorporated into the C2F module in the Neck; it focuses on localized regions of the input feature maps through spatial attention convolution, enhancing the model’s ability to understand and capture spatial details. Fourth, Slim-Neck was used in the Neck: GSConv replaced the regular convolution and VoVGSCSP replaced the C2F module, alleviating the resistance of the deep layers to the data flow and reducing inference time. Finally, the Distance-IoU (DIoU) loss function [20] was selected after comparing experiments with different loss functions.
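For reference, DIoU adds a center-distance penalty to the IoU loss: the loss is 1 - IoU + rho^2 / c^2, where rho is the distance between the centers of the predicted and ground-truth boxes and c is the diagonal of the smallest box enclosing both. The following is a minimal Python sketch of that formula for a single pair of axis-aligned boxes. The function name `diou_loss` and the (x1, y1, x2, y2) box layout are illustrative assumptions and are not taken from the OMC-YOLO implementation.

```python
def diou_loss(box_pred, box_gt):
    """DIoU loss for two non-degenerate boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt

    # Intersection area (zero if the boxes do not overlap).
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h

    # Union area and IoU.
    area_pred = (px2 - px1) * (py2 - py1)
    area_gt = (gx2 - gx1) * (gy2 - gy1)
    iou = inter / (area_pred + area_gt - inter)

    # Squared distance between the two box centers (rho^2).
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2

    # Squared diagonal of the smallest enclosing box (c^2).
    enclose_w = max(px2, gx2) - min(px1, gx1)
    enclose_h = max(py2, gy2) - min(py1, gy1)
    c2 = enclose_w ** 2 + enclose_h ** 2

    return 1.0 - iou + rho2 / c2


# Two non-overlapping pairs: plain IoU loss would be 1 for both,
# but the DIoU penalty is larger for the pair that is farther apart.
near = diou_loss((0, 0, 1, 1), (2, 0, 3, 1))
far = diou_loss((0, 0, 1, 1), (4, 0, 5, 1))
print(near < far)  # True
```

Because the penalty term grows with center distance, the loss still distinguishes between non-overlapping predictions, which a plain IoU loss cannot do.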
4. Discussion
OMC-YOLO was evaluated through ablation and comparison experiments, and it performed significantly better than YOLOv8 and other mainstream models. To guard against overfitting, we used separate training, validation, and test sets: the dataset was divided into 60% for training, 20% for validation, and 20% for testing.
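A 60/20/20 split of this kind can be sketched as follows. This is a minimal Python example under stated assumptions: the function name `split_dataset`, the fixed random seed, and the file names in the usage lines are hypothetical, and the paper does not state how images were assigned to each subset.

```python
import random


def split_dataset(items, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle items and split them into train, validation, and test lists.

    Whatever remains after the train and validation portions goes to test.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible

    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical file names, for illustration only.
images = [f"img_{i:04d}.jpg" for i in range(100)]
train, val, test = split_dataset(images)
print(len(train), len(val), len(test))  # 60 20 20
```

Keeping the three subsets disjoint is what allows the test-set results to be read as an estimate of performance on unseen images.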
Table 8 shows the detection performance of OMC-YOLO compared to YOLOv8 and other models on several test sets, demonstrating that the model generalizes well to new data.
We employed early stopping and regularization to prevent overfitting. Training was halted when the validation loss stopped improving, thereby avoiding an overly complex model that might fit the noise in the training data. Additionally, we used Dropout and weight decay to penalize overly complex models, promoting simplicity and generalization. To evaluate bias and variance, we compared the training and validation losses: the small gap between them indicated low variance, while the low validation loss suggested low bias.
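The early-stopping rule described above can be illustrated with a short Python sketch. The class name `EarlyStopping`, the helper `stopping_epoch`, and the `patience` and `min_delta` parameters are assumptions for illustration; the paper does not report the values used, and the loss curve below is made up.

```python
class EarlyStopping:
    """Signal a stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait after the last improvement
        self.min_delta = min_delta    # smallest decrease that counts as improvement
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def stopping_epoch(val_losses, patience=3):
    """Return the 0-based epoch at which training would stop, or None."""
    stopper = EarlyStopping(patience=patience)
    for epoch, loss in enumerate(val_losses):
        if stopper.step(loss):
            return epoch
    return None


# Made-up loss curve: improves until epoch 2, then plateaus.
losses = [1.00, 0.80, 0.70, 0.71, 0.72, 0.73, 0.60]
print(stopping_epoch(losses))  # 5
```

In this example the later improvement at epoch 6 is never reached, which shows the trade-off in choosing the patience: a small value saves training time but can stop before a late improvement.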
Figure 16 shows the bounding-box loss, classification loss, and distribution focal loss (DFL) for the OMC-YOLO training and validation sets. The loss values leveled off at about 100 epochs, indicating that training had converged without overfitting. The step-down in the training loss over the last ten epochs occurred because mosaic augmentation was turned off for those epochs, which improved the model’s stability and reduced unnecessary noise in the later stages of training.
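Switching off the mosaic for the final epochs amounts to a simple schedule, sketched below in Python. The function name `mosaic_enabled` and the total of 150 epochs are illustrative assumptions, not values from the paper. The Ultralytics YOLOv8 trainer exposes a comparable `close_mosaic` setting, although whether that setting was used here is also an assumption.

```python
def mosaic_enabled(epoch, total_epochs, close_last=10):
    """Return True if mosaic augmentation is active in this (0-based) epoch."""
    return epoch < total_epochs - close_last


total_epochs = 150  # illustrative value, not taken from the paper
schedule = [mosaic_enabled(e, total_epochs) for e in range(total_epochs)]
print(schedule.count(False))   # 10
print(schedule.index(False))   # 140
```

Once the mosaic is off, training images look more like the validation images, which is consistent with the drop in training loss seen over the last ten epochs.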
The strong performance of OMC-YOLO stems from the modules added in this paper to suit the features of the oyster mushroom dataset. The Neck integrates the features extracted by the preceding convolutional layers and fuses them across scales for objects of different sizes, so it plays a decisive role in the final detection results. The main improvements in OMC-YOLO focus on the Neck: first, integrating the LSKA attention mechanism, and second, adopting the Slim-Neck network structure.
Figure 17 shows the heat maps obtained after adding the large separable kernel attention (LSKA) mechanism. After the addition, the heat map is darker and the activation is denser on the detected objects, which indicates that LSKA pays more attention to local regions of the input feature map and enhances the model’s ability to understand and capture spatial details.
The Slim-Neck module, by contrast, is designed specifically to improve the Neck section, and its usefulness there has been extensively described. GSConv and VoVGSCSP preserve as many of the hidden connections between channels as possible. For feature maps that have become slender by the time they reach the Neck section, GSConv alleviates the resistance of the deeper layers to the data flow, significantly reducing inference time.
Table 9 shows a comparison of the performance of the Slim-Neck module on the Neck section alone, both overall and by category, and it is clear that the module has a significant impact on the final performance of OMC-YOLO.
5. Conclusions
Mushroom cultivation has been partially automated with intelligent systems, but picking and sorting still rely mainly on manual labor, which is both time-consuming and labor-intensive. OMC-YOLO effectively improves the accuracy and efficiency of oyster mushroom grading detection by adopting the latest YOLOv8 model and optimizing it for the characteristics of oyster mushrooms. The experimental results show that the improved model outperforms current mainstream target detection models, including Faster R-CNN, SSD, and various versions of the YOLO series, and that OMC-YOLO is suitable for sorting oyster mushrooms in automated mushroom factories.
OMC-YOLO optimizes and improves the network model according to the characteristics of oyster mushrooms, which effectively improves the accuracy and efficiency of oyster mushroom grading detection. By introducing the DWConv, LSKA, and Slim-Neck modules and adopting the DIoU loss function, OMC-YOLO significantly reduces the number of parameters and the computational cost, making the model more lightweight, and also enhances the model’s ability to extract oyster mushroom features, which in turn improves detection accuracy. The experimental results show that OMC-YOLO outperforms the current mainstream target detection models in the oyster mushroom grading detection task: it reaches an mAP50 of 94.95%, with improvements over YOLOv8 of 4.16%, 1.33%, and 0.48% for the Special Class, First Class, and Second Class, respectively. The number of parameters, the computation, and the model size are also reduced, verifying the effectiveness of the optimizations.
Despite these results, challenges remain: the diversity of oyster mushroom appearance, the complexity of cultivation environments, and the stability of the model under different light conditions. Future work will focus on improving the performance and reliability of the model in more complex and variable real-world scenarios by further expanding the training dataset, optimizing the model structure, and tuning parameters. In addition, exploring more efficient algorithms and techniques to improve the accuracy of Unripe mushroom detection is an important direction for subsequent research. Through continued technological innovation and optimization, fully automated and intelligent cultivation and management of oyster mushrooms is expected to be realized in the near future, contributing to the development of the edible mushroom industry.