1. Introduction
Lentinula edodes is the second most cultivated mushroom species worldwide, with a history of over 800 years of cultivation. It is not only delicious and unique in flavor but also possesses health benefits such as tumor prevention and immunity enhancement [
1,
2]. In breeding, cultivation, and quality evaluation, the phenotypic features of
Lentinula edodes, such as pileus diameter and stipe thickness, are crucial: they are not only the key basis for optimizing production and screening good varieties but also an important reference for consumers to intuitively judge the appearance and freshness of products [
3]. Since the beginning of the 21st century, the production of
Lentinula edodes has continued to increase. In some years, market demand has exceeded supply, leading to an imbalance between supply and demand [
4]. Accurate prediction of the yield of
Lentinula edodes fruiting bodies can optimize planting plans and resource allocation, stabilize market supply and demand, and ensure industrial profitability. Therefore, real-time detection of phenotypic features and fresh weight of
Lentinula edodes is particularly important in the cultivation process and market regulation.
Traditional methods for measuring the phenotypic features and yield of
Lentinula edodes primarily rely on manual measurement, which is cumbersome and inefficient. As with
Lentinula edodes, the size and length characteristics of various crops can intuitively reflect their growth status and quality, representing a key aspect of crop phenomics research. With the support of machine vision technology, methods for measuring phenotypic parameters of various crops have been successively developed [
5,
6,
7]. He et al. [
8] extracted the edge contours of soybean pods using a Gaussian filter function and Canny edge detector and calculated the length, width, and area of the pods by combining the minimum bounding rectangle method and the maximum inscribed circle method. Lou et al. [
9] used a UNet model with an attention mechanism to segment images of broadleaf tree seedlings, combined with a generative adversarial network to restore occluded parts of branches, and extracted phenotypic data such as seedling height, stem diameter, and canopy width. Xu et al. [
10] used three digital cameras to acquire images of maize plants and reconstructed 3D point clouds to extract detailed phenotypic parameters such as stem length, branch length, number of branches, and branch angle. Based on measured crop size and length data, crop and fruit yield can also be inferred, providing data support for crop breeding and yield prediction [
11,
12,
13]. Fass et al. [
14] used the SAM model to segment tomatoes in images and count the number of pixels in the tomato regions, successfully predicting seven indicators of tomato yield, firmness, and total soluble solids. However, the above methods for phenotypic feature measurement still face issues such as low automation, high requirements for detection environments, and low efficiency, making them unsuitable for direct application in phenotypic feature extraction of
Lentinula edodes fruiting bodies.
In recent years, various object detection and segmentation models have rapidly developed, facilitating the automated identification of crop types, quantities, and characteristics such as pests and diseases, injecting new vitality into crop phenomics research [
15,
16,
17,
18,
19]. Sun et al. [
20] used the YOLOv5 model to successfully detect moldy areas in rice grains contaminated by
Aspergillus niger,
Penicillium citrinum, and
Aspergillus griseus, with accuracy rates of 89.26%, 91.15%, and 90.19%, respectively. Ding et al. [
21] used MobileNetV2 as the backbone of the DeepLabV3+ model to successfully identify and segment lesions on apple surfaces, addressing issues such as blurred edges and large shape variations. Bai et al. [
22] designed the RiceNet network to process RGB images of rice fields collected by unmanned aerial vehicles. RiceNet integrates a feature extractor frontend and a feature decoder module, enabling counting, locating, and sizing of rice plants. Ding et al. [
23] improved the YOLOv8 model to enhance its detection capability for small objects and accurately identify downy mildew and powdery mildew in cucumbers. The above models are all optimized for specific crops and have effectively advanced the corresponding crop phenotype detection technologies. However, they generally suffer from large parameter counts and high computational complexity, imposing strict computing-power requirements on deployment platforms and making efficient deployment on portable devices such as the Raspberry Pi difficult. This limitation greatly restricts the promotion and application of these technologies in production practice, so phenotype detection efficiency struggles to meet practical needs. Therefore, the key to developing a practical detection and yield prediction system for
Lentinula edodes fruiting bodies lies in constructing a lightweight object detection and segmentation model that meets the recognition needs of
Lentinula edodes fruiting bodies.
YOLOv11 is the latest generation of real-time object detection models developed by Ultralytics, known for low computational requirements and strong recognition capability. Within this family, YOLOv11-Seg is an instance segmentation variant with a small model size and strong segmentation performance [
24,
25]. In this study, a lightweight YOLO-SFCB model was constructed on the basis of YOLOv11n-Seg to accurately identify and segment the stipe and pileus of
Lentinula edodes in different growth states. The YOLO-SFCB model, combined with spatial information, can measure the stipe height, stipe diameter, pileus thickness, and pileus width of
Lentinula edodes fruiting bodies and predict fresh and dry weight.
2. Materials and Methods
2.1. Dataset Collection
In this study, a self-built dataset was used to train the segmentation model for
Lentinula edodes fruiting bodies. From March to June 2024, 56 different batches of
Lentinula edodes samples from 40 mushroom sticks were selected as subjects for image-based phenotypic feature collection in Fuping County, Baoding City, Hebei Province. Images were captured using the 12-megapixel rear camera of an iPhone 11 mobile phone, with the image resolution set to 4032 × 3024 pixels and the shooting distance fixed at 15 cm. The shooting angle was kept perpendicular to the long axis of the
Lentinula edodes stick to obtain a standard side view for measurement. After the fruiting bodies of
Lentinula edodes emerge, they mature within 5–7 days. To ensure the dataset covers the complete growth stages of
Lentinula edodes fruiting bodies from early development to maturation and harvest, images of the front and both sides of the
Lentinula edodes were taken daily between 8:00 and 10:00 and between 16:00 and 18:00.
Figure 1 visually illustrates the morphological changes in
Lentinula edodes throughout the growth cycle.
During the image acquisition process, vernier calipers were used to measure and record four types of phenotypic features—stipe height, stipe diameter, pileus thickness, and pileus width—for each sample of
Lentinula edodes fruiting body. After the fruiting bodies reached maturity, 120 samples were selected from the mushroom shed, and their phenotypic data and fresh weight were recorded. These mature
Lentinula edodes were then dried in an oven, and the dry weight was measured using the constant-temperature atmospheric pressure drying method [
26].
2.2. Dataset Processing
The four types of phenotypic features of
Lentinula edodes can be obtained from side-view images of its fruiting bodies. After excluding frontal images from the collected data, a total of 722 side-view images of
Lentinula edodes fruiting bodies were retained. These side-view images were divided into training, validation, and testing sets in an 8:1:1 ratio to ensure independence between the datasets [
27,
In order to improve the generalization ability of the model, data augmentation was applied to the training, validation, and test sets, including brightness adjustment, rotation, and noise addition. During augmentation, the brightness was randomly adjusted by ±20%, the rotation angle ranged from −45° to +45°, and Gaussian noise with a mean of 0 and a standard deviation of 10 was added. After augmentation, the training set contained 2304 images, while the validation and test sets each contained 292 images.
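As an illustration of these settings, a minimal augmentation sketch using OpenCV and NumPy is given below; the parameter values follow the description above, while the function structure and file names are assumptions rather than the exact pipeline used.

```python
import cv2
import numpy as np

def augment(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply brightness, rotation, and Gaussian-noise augmentation to one BGR image."""
    # Brightness: scale intensities by a random factor in [0.8, 1.2] (about ±20%).
    out = np.clip(image.astype(np.float32) * rng.uniform(0.8, 1.2), 0, 255)

    # Rotation: rotate about the image center by a random angle in [-45°, +45°].
    h, w = out.shape[:2]
    m = cv2.getRotationMatrix2D((w / 2, h / 2), rng.uniform(-45, 45), 1.0)
    out = cv2.warpAffine(out, m, (w, h), borderMode=cv2.BORDER_REPLICATE)

    # Gaussian noise: mean 0, standard deviation 10.
    out = np.clip(out + rng.normal(0.0, 10.0, out.shape), 0, 255)
    return out.astype(np.uint8)

# Example usage on one side-view image (file name is illustrative).
img = cv2.imread("shiitake_side_view.jpg")
cv2.imwrite("shiitake_side_view_aug.jpg", augment(img, np.random.default_rng(0)))
```

Note that when rotation is applied, the corresponding polygon annotations must be transformed with the same rotation matrix so that the masks stay aligned with the augmented images.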
Image annotations were created using Labelme, with polygon masks outlining the stipe and pileus of each
Lentinula edodes. The resulting JSON files were converted to the TXT format compatible with YOLOv11-Seg. The data augmentation and annotation process is shown in
Figure 2.
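For reference, a minimal sketch of this conversion is shown below; it assumes the standard Labelme JSON fields (imageWidth, imageHeight, shapes) and writes each polygon as a class index followed by normalized x y pairs, which is the label format expected by YOLOv11-Seg. The class list and file names are assumptions.

```python
import json
from pathlib import Path

CLASSES = ["stipe", "pileus"]  # assumed annotation label names

def labelme_to_yolo_seg(json_path: str, txt_path: str) -> None:
    """Convert one Labelme JSON annotation to a YOLO segmentation TXT label file."""
    data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    w, h = data["imageWidth"], data["imageHeight"]
    lines = []
    for shape in data["shapes"]:
        cls = CLASSES.index(shape["label"])
        # Normalize polygon vertices to [0, 1] and flatten them to "x1 y1 x2 y2 ...".
        coords = " ".join(f"{x / w:.6f} {y / h:.6f}" for x, y in shape["points"])
        lines.append(f"{cls} {coords}")
    Path(txt_path).write_text("\n".join(lines), encoding="utf-8")

labelme_to_yolo_seg("sample_001.json", "sample_001.txt")
```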
2.3. Construction of the YOLO-SFCB Model
2.3.1. Introduction of the ShuffleNetV2 Network into the Backbone
Cross Stage Partial Darknet (CSPDarknet) is the backbone network of YOLOv11-Seg; it improves the efficiency and accuracy of feature extraction by introducing the Cross Stage Partial Network (CSPNet) into the traditional Darknet architecture [
29]. However, the stacking of multiple convolutional layers and C3k2 modules increases model complexity.
Lentinula edodes cultivation is intensive, and the computing resources of portable devices are limited, so a lightweight model is required. ShuffleNetV2 maintains rich feature representation ability while significantly reducing the number of parameters through its channel shuffle operation, making it suitable for extracting the texture and overall morphological features of
Lentinula edodes. The ShuffleNetV2 unit structure is shown in
Figure 3.
ShuffleNetV2 is an improved version of ShuffleNet that introduces a unique channel shuffle operation to enhance inter-channel information exchange and fusion, making it more accurate than other lightweight networks of similar complexity [
30]. By rearranging the channel order of the original feature maps, the channel shuffle operation breaks information isolation between channels, thereby strengthening the model’s representational capacity and generalization ability.
As shown in Figure 3, the unit module divides the feature channels into two parts: one part remains unchanged, while the other undergoes standard and depthwise convolution operations. The two branches are then concatenated, followed by a channel shuffle operation that enhances inter-channel feature communication. In the downsampling module, depthwise and standard convolutions are additionally applied to the branch that remains unchanged in the basic unit structure. Owing to its lightweight network design, ShuffleNetV2 significantly reduces model complexity while maintaining baseline performance.
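The channel shuffle operation itself is compact; a minimal PyTorch sketch of the standard formulation (not necessarily the implementation used in this study) is as follows.

```python
import torch

def channel_shuffle(x: torch.Tensor, groups: int = 2) -> torch.Tensor:
    """Rearrange channels across groups so information can flow between the two branches."""
    n, c, h, w = x.shape
    # (N, C, H, W) -> (N, groups, C/groups, H, W) -> swap the group/channel axes -> flatten back.
    x = x.view(n, groups, c // groups, h, w).transpose(1, 2).contiguous()
    return x.view(n, c, h, w)

# Example: a feature map with 8 channels split into 2 groups.
feat = torch.arange(8, dtype=torch.float32).view(1, 8, 1, 1)
print(channel_shuffle(feat, groups=2).flatten().tolist())  # [0.0, 4.0, 1.0, 5.0, 2.0, 6.0, 3.0, 7.0]
```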
2.3.2. Incorporation of C3k2-FasterBlock Lightweight Feature Extractor into Neck
The C3k2 module of YOLOv11-Seg is responsible for key feature extraction and feature fusion in the network; however, it contains many convolutional layers, and its computational burden remains relatively large. To further lighten the model, the Bottleneck module within the C3k module was replaced with the FasterBlock module; the structure of the resulting C3k2-FasterBlock lightweight feature extraction module is shown in
Figure 4.
The FasterBlock module serves as the core building unit of the FasterNet network. It integrates Partial Convolution (PConv) and Pointwise Convolution (PWConv) to achieve efficient feature extraction and information aggregation [
31]. PConv, a distinctive component within the FasterBlock module, enhances computational efficiency by minimizing redundant computation and memory access. For an input feature map with input and output dimensions of (
c,
h,
w) and a convolution kernel of size
k ×
k, the floating point operations (FLOPs) and memory access cost (MAC) of a standard convolution operation are given in Equation (1), while the FLOPs and MAC for PConv are provided in Equation (2).
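The referenced expressions are reconstructed below from the standard FasterNet formulation, consistent with the ratios discussed next; they should be read as the conventional forms rather than a verbatim reproduction.

$$\text{FLOPs}_{\text{Conv}} = h \times w \times k^{2} \times c^{2}, \qquad \text{MAC}_{\text{Conv}} \approx h \times w \times 2c \quad (1)$$

$$\text{FLOPs}_{\text{PConv}} = h \times w \times k^{2} \times c_{p}^{2}, \qquad \text{MAC}_{\text{PConv}} \approx h \times w \times 2c_{p} \quad (2)$$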
Here, c_p represents the number of channels convolved by PConv. When c_p = c/2, the FLOPs of PConv are only 1/4 of those of standard convolution (Conv), and the MAC is reduced to 1/2. The FasterBlock module connects two PWConv layers after the PConv and employs a residual connection at the end to add the input feature map to the output feature map. This design not only reduces computational load but also enhances training stability. The introduction of the C3k2-FasterBlock module forms a collaborative optimization with the ShuffleNetV2 backbone network: the partial convolution design targets the computational bottleneck of the neck network, further reducing computational overhead while maintaining multi-scale feature fusion capability.
2.3.3. CBAM Attention Mechanism Integration in Front of Small Object Detection Head
After the aforementioned lightweight improvements, the model has achieved reductions in both size and computational requirements. However, the original model exhibits insufficient recognition capability for small-sized
Lentinula edodes fruiting bodies. To address this, a CBAM attention module is incorporated before the small object detection head in the improved model, enhancing the new model’s ability to detect and segment small
Lentinula edodes. The structure of the CBAM attention mechanism is shown in
Figure 5.
The convolutional block attention module (CBAM) is a dual attention structure that integrates a channel attention module (CAM) and a spatial attention module (SAM), enhancing the model's ability to extract salient features [
32]. CAM applies global max pooling and global average pooling to each channel to generate channel attention weights; multiplying these weights with the corresponding input channels dynamically highlights the feature channels most important for the task. SAM generates spatial attention weights by taking the maximum and average of the input feature map along the channel dimension; weighting the original feature map with these spatial weights highlights important image regions and suppresses the influence of unimportant ones.
By sequentially connecting CAM and SAM, the CBAM attention mechanism effectively incorporates both channel-wise and spatial feature representations, optimizing model performance along both the feature-channel and spatial-position dimensions. Channel attention enhances the key features that distinguish Lentinula edodes from complex backgrounds, while spatial attention effectively focuses on the specific regions where small Lentinula edodes are located.
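As an illustrative reference, a minimal PyTorch sketch of the standard CBAM formulation is given below; the reduction ratio and the 7 × 7 spatial kernel are common defaults and are assumptions here, not necessarily the exact settings used in YOLO-SFCB.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Minimal CBAM: channel attention (CAM) followed by spatial attention (SAM)."""

    def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        # CAM: shared MLP applied to the global average- and max-pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        # SAM: 7x7 convolution over the channel-wise mean and max maps.
        self.spatial = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Channel attention weights, broadcast over spatial positions.
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)
        # Spatial attention weights, broadcast over channels.
        s = torch.cat([x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))

# Example: apply CBAM to a feature map before the small object detection head.
print(CBAM(64)(torch.randn(1, 64, 80, 80)).shape)  # torch.Size([1, 64, 80, 80])
```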
In summary, the structure of the
Lentinula edodes instance segmentation model developed in this study, based on YOLOv11-Seg, is illustrated in
Figure 6. The improved model has been named YOLO-SFCB.
2.4. Model Evaluation Metrics
The evaluation of model complexity incorporates three primary metrics: parameter count (Params), FLOPs, and inference time. Lower Params and FLOPs signify a more lightweight model that requires less computational power, while inference time reflects the recognition and segmentation speed of the model. Inference time was measured on an NVIDIA RTX 4060 Ti 8 GB graphics card: after 10 warm-up iterations, 500 test images were inferred with a batch size of 1 to simulate real-time application scenarios, and the average time per image was calculated.
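A sketch of this timing protocol using the Ultralytics API is shown below; the weight file name, image directory, and input size of 640 are illustrative assumptions.

```python
import time
from pathlib import Path
from ultralytics import YOLO

model = YOLO("yolo_sfcb.pt")  # hypothetical weight file for the trained model
image_paths = sorted(Path("test_images").glob("*.jpg"))  # illustrative test image directory

# Warm-up: 10 iterations so model/GPU initialization does not bias the timing.
for path in image_paths[:10]:
    model.predict(str(path), imgsz=640, device=0, verbose=False)

# Timed run: 500 test images, one image per call (batch size 1).
start = time.perf_counter()
for path in image_paths[:500]:
    model.predict(str(path), imgsz=640, device=0, verbose=False)
print(f"average inference time: {(time.perf_counter() - start) / 500 * 1000:.1f} ms/image")
```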
Model performance evaluation includes Precision, Recall, and mean average precision (mAP). Precision and recall represent the detection accuracy and recognition ability of the model, respectively, while mAP comprehensively reflects the overall performance of target recognition and segmentation. Their calculation methods are shown in Equations (3)–(5).
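The referenced metrics take their standard forms, reconstructed here to match the symbol definitions that follow.

$$\text{Precision} = \frac{TP}{TP + FP} \quad (3)$$

$$\text{Recall} = \frac{TP}{TP + FN} \quad (4)$$

$$\text{mAP} = \frac{1}{L} \sum_{i=1}^{L} \text{AP}_{i}, \qquad \text{AP} = \sum_{j=1}^{N} P(j)\,\Delta R(j) \quad (5)$$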
In the formulas, TP (true positive) is the number of correctly detected positive samples; FP (false positive) is the number of negative samples incorrectly labeled as positive; FN (false negative) is the number of positive samples that were missed; N is the total number of samples for a single target class; j is the sample index; and L is the number of target classes.
2.5. The Relationship Between Object Distance and the Pixel-to-Physical Length Ratio
Using a fixed object-distance method to capture images of Lentinula edodes fruiting bodies and calculate phenotypic features is straightforward in process, but it imposes high requirements for imaging conditions, which can also limit measurement efficiency. Calculating the pixel-to-physical length ratio of the camera at different shooting distances can reduce the difficulty of image acquisition and improve measurement efficiency.
When studying the relationship between the pixel-to-physical length ratio and shooting distance, the standard block method was employed to determine the pixel-to-physical length ratio of an industrial camera at distances ranging from 5 cm to 20 cm. The experiment used an industrial camera equipped with a 2.8 mm ultra-wide-angle lens (aperture F1.0–2.6), which integrates a real-time digital distortion correction function to ensure reliable geometric accuracy of the image data under the complex lighting conditions of the
Lentinula edodes cultivation environment. The standard block and imaging setup are illustrated in
Figure 7. First, images of the standard block were captured using the industrial camera at 1 cm intervals from 5 cm to 20 cm. The resolution of the captured images was then reduced to 640 × 640 pixels, and the actual ratio between pixels and physical length was calculated for each distance. Finally, a regression equation was fitted with the shooting distance as the independent variable and the pixel-to-physical length ratio as the dependent variable.
After capturing images of the Lentinula edodes fruiting bodies, this fitted equation can be used to calculate the phenotypic parameters of Lentinula edodes at various shooting distances.
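A minimal sketch of this calibration and its use is given below, assuming the measured (distance, ratio) pairs are stored in a CSV file; the file name and the linear form of Equation (6) are taken from the description above.

```python
import numpy as np

# Calibration data: object distance OD (cm) and physical length per pixel PD (cm/pixel),
# one row per standard-block image; "calibration.csv" is an illustrative file name.
od, pd_ratio = np.loadtxt("calibration.csv", delimiter=",", unpack=True)

# Fit the linear relationship PD = a * OD + b, i.e., the form of Equation (6).
a, b = np.polyfit(od, pd_ratio, deg=1)
r2 = 1 - np.sum((pd_ratio - (a * od + b)) ** 2) / np.sum((pd_ratio - pd_ratio.mean()) ** 2)

def pixels_to_cm(pixel_count: float, object_distance_cm: float) -> float:
    """Convert a pixel length measured in the image to a physical length (cm)."""
    return pixel_count * (a * object_distance_cm + b)
```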
2.6. Extraction Method of the Phenotypic Features of the Fruiting Body of Lentinula edodes
The growth angles of
Lentinula edodes vary significantly, leading to considerable angular differences among the stipe, pileus, and the cultivation stick. This requires a measurement method capable of adapting to diverse angle variations. To address this, the OpenCV library was used to process the segmented images of
Lentinula edodes and calculate the phenotypic parameters of the mushroom bodies [
33]. The procedures for measuring the stipe height, stipe diameter, pileus thickness, and pileus width are as follows:
(1) The YOLO-SFCB model was employed to identify and segment the stipe and pileus of Lentinula edodes in the images.
(2) OpenCV was used to extract the minimum-area rotated bounding rectangle of the stipe and pileus regions to mitigate measurement errors caused by variations in the growth angle of Lentinula edodes.
(3) Direction determination: For the pileus region, the longer side of the minimum rotated bounding rectangle is taken as the pileus width direction, and the shorter side as the thickness direction. For the stipe region, the direction parallel to the longer side of the pileus bounding rectangle is used as the stipe diameter direction, while the direction with the larger angular deviation relative to the stick is considered the stipe height direction.
In terms of measurement location: The pileus thickness is measured as the length across the mask region at the center point of the rotated bounding rectangle along the thickness direction. The stipe diameter is measured similarly at the center location along the diameter direction. The pileus width and stipe height are directly taken as the side lengths of the rotated bounding rectangle in their respective directions, as shown in
Figure 8.
(4) The number of pixels corresponding to the four phenotypic features is calculated, and the actual physical dimensions—stipe height, stipe diameter, pileus thickness, and pileus width—are derived based on the physical length per pixel.
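A simplified OpenCV sketch of steps (2) and (4) is shown below; it takes both dimensions directly from the minimum-area rotated rectangle of a mask, whereas the full procedure additionally measures thickness and diameter across the mask at the rectangle center and applies the direction rules of step (3). The mask shape and the cm-per-pixel value are illustrative.

```python
import cv2
import numpy as np

def measure_region(mask: np.ndarray, cm_per_pixel: float) -> tuple[float, float]:
    """Return (long side, short side) of the region's minimum-area rotated rectangle, in cm."""
    contours, _ = cv2.findContours(mask.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contour = max(contours, key=cv2.contourArea)        # keep the largest connected region
    (_, _), (w, h), _ = cv2.minAreaRect(contour)        # center, side lengths, rotation angle
    return max(w, h) * cm_per_pixel, min(w, h) * cm_per_pixel

# Example: a synthetic pileus mask (an ellipse standing in for a model-predicted mask).
pileus_mask = np.zeros((640, 640), np.uint8)
cv2.ellipse(pileus_mask, (320, 200), (150, 40), 5, 0, 360, 255, -1)
width_cm, thickness_cm = measure_region(pileus_mask, cm_per_pixel=0.01)
print(f"pileus width ~ {width_cm:.2f} cm, thickness ~ {thickness_cm:.2f} cm")
```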
2.7. Prediction Method of Lentinula edodes Fruiting Body Yield
After obtaining the measurements of the four phenotypic features of
Lentinula edodes fruiting bodies, along with fresh weight and dry weight data, the relationship between phenotypic parameters and yield can be investigated, providing support for yield prediction in
Lentinula edodes. The yield prediction dataset consists of a total of 120 samples containing measured phenotypic parameters and yield values of
Lentinula edodes. When constructing the
Lentinula edodes yield dataset, all samples were first sorted by weight from low to high. Subsequently, 2 out of every 10 consecutive samples were randomly selected and assigned to the test set, while the remaining 8 were assigned to the training set, ensuring that both the training and testing sets evenly cover the complete yield prediction interval. The final training set contains 96 samples, and the test set contains 24 samples [
34].
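A small sketch of this split, assuming the 120 fresh-weight values are available in a single-column file (the file name is illustrative):

```python
import numpy as np

def split_by_weight(weights: np.ndarray, rng: np.random.Generator):
    """Sort samples by weight, then move 2 of every 10 consecutive samples to the test set."""
    order = np.argsort(weights)                      # sample indices from lightest to heaviest
    test_idx = []
    for start in range(0, len(order), 10):
        test_idx.extend(rng.choice(order[start:start + 10], size=2, replace=False))
    test_mask = np.isin(np.arange(len(weights)), test_idx)
    return np.where(~test_mask)[0], np.where(test_mask)[0]   # train indices, test indices

train_idx, test_idx = split_by_weight(np.loadtxt("fresh_weights.csv"), np.random.default_rng(42))
print(len(train_idx), len(test_idx))  # 96 24 for the 120-sample dataset
```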
To explore the mathematical relationship between phenotypic parameters and yield, pileus width, stipe diameter, and total height of
Lentinula edodes were used as inputs, while fresh weight and dry weight were set as outputs. Yield simulation experiments were conducted using the least squares method [
35], BP neural network (back propagation neural network) [
36], and random forest [
37], respectively.
3. Results
3.1. Experimental Platform and Model Training Parameters
In the Windows 11 operating system environment, the YOLOv11-Seg model and the improved model were each trained in a single deterministic run without pretrained weights or additional gradient optimization strategies. In terms of hardware configuration, an Intel(R) Core(TM) i5-12600KF @ 3.60 GHz processor with 32 GB RAM and an NVIDIA RTX 4060 Ti 8 GB graphics card provided computational support for model training. The software environment comprised CUDA (compute unified device architecture) 11.7 and the PyTorch 2.0.1 deep learning framework, executed in PyCharm 2024.1.6 with Python 3.10.
Table 1 shows the hyperparameters during the model training process [
38].
3.2. Comparison of the Effects of Different Attention Mechanisms
The attention mechanism mimics the human ability to filter irrelevant information and focus on key features, thereby enhancing the model’s ability to capture important characteristics. To evaluate the impact of different attention mechanisms on the performance of the YOLOv11-Seg model, several attention modules—including SE (squeeze and excitation attention) [
39], ECA (efficient channel attention) [
40], CA (coordinate attention) [
41], EMA (expectation maximization attention) [
42], GAM (global attention mechanism) [
43], and the CBAM mechanism used in this study—were incorporated ahead of the small-object detection head. A comparative analysis of their performance improvements is presented in
Table 2.
From
Table 2, it can be seen that purely channel-based attention improves the model less than hybrid attention. This is because, during image acquisition, the
Lentinula edodes fruiting bodies are always located near the center of the image; the relatively fixed target position allows the spatial component of hybrid attention to take effect. Furthermore, after adding CBAM attention, the model's mAP50-95 reached 82.0%, surpassing that of GAM attention. This indicates that the sequential integration of channel and spatial attention is better suited to the Lentinula edodes fruiting body detection and segmentation task in this study.
3.3. Ablation Experiment of YOLO-SFCB Model
In order to verify the effect of the proposed improvements, ablation experiments were carried out for the three modifications, analyzing how the ShuffleNetV2 lightweight network, the C3k2-FasterBlock feature extraction module, and CBAM attention affect the original model's recognition and segmentation performance for
Lentinula edodes. The results of the ablation experiments are shown in
Table 3.
It can be observed that after incorporating the ShuffleNetV2 network into the Backbone section, the model’s Params and FLOPs decreased by 29% and 23%, respectively, and mAP50-95 decreased to 77.7%. This indicates that the ShuffleNetV2 network can effectively reduce the complexity of the YOLOv11-Seg model, but at the cost of sacrificing some model performance.
Replacing the C3k2 module in the original model’s Neck section with the C3k2-FasterBlock module resulted in a further reduction in Params and FLOPs, along with improvements in both Precision and Recall. This demonstrates that the PConv module can reduce computational load while slightly enhancing model performance.
Introducing the CBAM attention mechanism before the small target detection head led to a 0.7% increase in Precision and a 1.3% improvement in Recall. This suggests that CBAM attention, through its dual mechanism, enhances the model’s focus on important feature channels and critical spatial regions, thereby improving its ability to identify and segment Lentinula edodes fruiting bodies.
Compared with the original model, the improved YOLO-SFCB model shows reductions in Params and FLOPs by 29% and 25%, respectively, a decrease in inference time by 9.8%, and increases in Precision and Recall by 1% and 1.3%, achieving an mAP50-95 of 80.5%. All key performance indicators demonstrate noticeable enhancement.
To more intuitively demonstrate YOLO-SFCB’s improved performance in identifying and segmenting
Lentinula edodes fruiting bodies, Grad-CAM (gradient weighted class activation mapping) was employed for visual evaluation. Attention heatmaps of YOLOv11-Seg and YOLO-SFCB generated by Grad-CAM are shown in
Figure 9. The figure displays four
Lentinula edodes fruiting bodies in different growth stages. Warm-colored regions indicate areas of greater model attention, with darker colors representing higher contribution of the corresponding regions to the model’s decision.
By comparing
Figure 9a and
Figure 9c, it is evident that YOLO-SFCB demonstrates stronger recognition capability for
Lentinula edodes in early growth stages, with higher confidence levels. This indicates that incorporating the CBAM attention mechanism before the small object detection head effectively enhances the model’s ability to recognize small
Lentinula edodes. Notably, while YOLOv11n-Seg encounters misidentification issues in segmentation tasks, YOLO-SFCB effectively reduces such interference while maintaining high recognition accuracy.
Observation of the heatmaps in
Figure 9b,d reveals that YOLOv11-Seg is relatively sensitive to background interference, resulting in lower confidence and misidentification problems. In contrast, the improved YOLO-SFCB model significantly enhances its ability to filter out background disturbances through the dual optimization of PConv and CBAM attention, showing stronger focus on the
Lentinula edodes fruiting bodies.
The combined results from ablation experiments and heatmap analysis indicate that eliminating irrelevant feature channels while strengthening the model’s spatial attention capability contributes to improved segmentation performance in Lentinula edodes recognition. Coupled with the lightweight efficiency of the ShuffleNetV2 architecture, these improvements make YOLO-SFCB particularly suitable for Lentinula edodes recognition and segmentation tasks under low-computation conditions.
3.4. Performance Comparison with Different Segmentation Models
In order to further explore the recognition and segmentation performance of the improved YOLO-SFCB model on the fruiting body of
Lentinula edodes, the classical two-stage instance segmentation model Mask R-CNN (mask region-based convolutional neural network), the single-stage segmentation model YOLACT (you only look at coefficients), and earlier YOLO segmentation models were selected for performance comparison. To ensure fairness, all models were trained and evaluated with identical training data, hardware platforms, and hyperparameter settings. The segmentation results are shown in
Figure 10, and the performance comparison results are shown in
Figure 11.
Figure 10 indicates that all models perform well in segmenting
Lentinula edodes fruiting bodies, accurately capturing their overall contours. However,
Figure 11 reveals considerable differences in computational complexity: Mask R-CNN and YOLACT exhibit FLOPs of 239.3 G and 89.7 G, respectively, together with large model sizes, making them unsuitable for deployment on mobile platforms. In contrast, the YOLO series models show significantly lower FLOPs and model sizes. This advantage stems from their single-stage architecture, which unifies detection and segmentation, along with a lightweight design that effectively reduces computational costs. As a result, the YOLO models achieve notably higher inference speeds.
Furthermore, the volume variation in
Lentinula edodes at different growth stages influences recognition performance: models perform better in identifying larger
Lentinula edodes compared to smaller ones. The pileus is easier to recognize due to its distinct color contrast with the background, whereas the stipe, being similar in color to the substrate, poses greater recognition challenges. The mAP50 values of the pileus and stipe across different growth stages and models in
Figure 11 indicate that the recognition accuracy of
Lentinula edodes in the early growth stage is usually lower than that in the growth and maturation stages. The overall performance of Mask R-CNN and YOLACT in the recognition task of
Lentinula edodes is not as good as the YOLO series models, but the YOLO-SFCB model enhances its feature extraction ability for
Lentinula edodes by introducing the CBAM attention mechanism, thus exhibiting better adaptability than other comparison models in this scenario.
Compared to other models, the YOLO-SFCB model proposed in this study requires the least computation and has the smallest model size, achieving 58.26 FPS (frames per second), higher than the original YOLOv11n-Seg model. In addition, YOLO-SFCB achieves average mAP50 values of 97.5% for the Lentinula edodes stipe and 98.4% for the pileus across different growth stages, surpassing all other segmentation models. These comprehensive results demonstrate that the YOLO-SFCB model offers a more balanced performance, delivering excellent segmentation accuracy for Lentinula edodes identification while maintaining low computational requirements suitable for deployment on lightweight devices.
3.5. Model Deployment Experiment
The lightweight YOLO-SFCB model is capable of recognizing and segmenting
Lentinula edodes fruiting bodies on low-computing-power platforms. To evaluate the detection model’s operational performance, it was deployed on a Raspberry Pi 5. The Raspberry Pi 5 is compact and cost-effective, equipped with a 64-bit quad-core ARM Cortex-A76 processor running at 2.4 GHz and a VideoCore VII GPU operating at 800 MHz, delivering a 2 to 3 times performance improvement over the previous generation. Testing was conducted using the 8 GB memory version of the device, with a rated power of 25 W, as shown in
Figure 12.
To evaluate and compare the inference efficiency of the YOLOv11n-Seg and YOLO-SFCB models on the Raspberry Pi 5 platform, 50 randomly selected images of Lentinula edodes fruiting bodies were used as the test set. Both models were converted to ONNX format, and segmentation inference was performed on each image using the CPU; the average inference time per image was then calculated for each model. The results showed that YOLOv11n-Seg required an average of 308 ms per image, while the YOLO-SFCB model reduced this time to 236 ms, a 23.5% decrease compared with the original model. This experiment demonstrates that the lightweight improvements of the YOLO-SFCB model are highly effective, making it more suitable for deployment on mobile and edge computing devices.
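A sketch of this export-and-timing procedure is given below; the weight file name is illustrative, and the dummy input stands in for a preprocessed test image (the actual experiment also includes image preprocessing and mask post-processing, omitted here).

```python
import time
import numpy as np
import onnxruntime as ort
from ultralytics import YOLO

# One-time export of the trained model to ONNX (weight file name is illustrative).
YOLO("yolo_sfcb.pt").export(format="onnx", imgsz=640)

# CPU-only inference, as on the Raspberry Pi 5.
session = ort.InferenceSession("yolo_sfcb.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

def timed_run(image_chw: np.ndarray) -> float:
    """Run one forward pass on a preprocessed (1, 3, 640, 640) float32 image; return seconds."""
    start = time.perf_counter()
    session.run(None, {input_name: image_chw})
    return time.perf_counter() - start

dummy = np.random.rand(1, 3, 640, 640).astype(np.float32)  # placeholder for a real test image
times = [timed_run(dummy) for _ in range(50)]
print(f"average inference time: {np.mean(times) * 1000:.0f} ms")
```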
3.6. Pixel-Physical Length Ratio Fitting Results
Using the industrial camera to capture images of the standard block and calculating the pixel-to-physical length ratio at distances of 5–20 cm, repeated experiments showed that the actual length represented by a single pixel increases linearly with object distance. The fitted mathematical relationship is shown in Equation (6).
In the equation, OD represents the object distance and PD represents the actual physical length represented by a single pixel, with units in centimeters. The goodness-of-fit indicator R2 reached 0.998. Equation (6) was then used to measure the length and width of blocks placed at random distances within 5–20 cm; the calculated values differed from the actual block dimensions by less than 0.5 mm.
3.7. Effect Analysis of Phenotypic Feature Extraction of Lentinula edodes Fruiting Bodies
In this study, 100 additional samples of
Lentinula edodes fruiting bodies were photographed and measured. The four phenotypic parameters were extracted using the proposed method and compared with the values measured by a vernier caliper. Finally, the measurement errors of each phenotypic parameter were calculated separately, and the results are shown in
Figure 13.
From
Figure 13, it can be seen that the four phenotype parameters have a high overall correlation with the actual values measured by the vernier caliper. The
R2 values of stipe diameter, stipe height, pileus width, and pileus thickness reached 0.81, 0.95, 0.99, and 0.96, respectively, indicating good consistency between the image-based measurements and the manual measurements. The correlation for stipe diameter is slightly lower because its value range is small, so small measurement differences cause a noticeable decrease in
R2.
When using a vernier caliper for manual measurement, two types of errors are mainly introduced: operational error, because force application and reading are difficult to standardize while avoiding damage to the Lentinula edodes body; and morphological adaptability error, caused by the natural bending and irregular shape of the fruiting body, which prevents the measurement angle and position from being kept consistent. These errors affect the measurement of different traits to varying degrees. According to the Bland–Altman analysis, the average residuals of stipe diameter and pileus thickness are 0.12 mm and 0.10 mm, respectively; the reason is that the vernier caliper cannot be pressed against the Lentinula edodes body, resulting in an overall bias in the manual measurement values. The average residuals of stipe height and pileus width are −0.35 mm and −0.10 mm, respectively, meaning the image-based values are higher than the vernier caliper measurements. This indicates that the measurement angle determined by the minimum-area rotated bounding rectangle is more accurate, avoiding the difficulty of determining the measurement angle manually.
Overall, the absolute value of the average residual of the image-based phenotypic measurements of Lentinula edodes fruiting bodies is no more than 0.35 mm, with a standard deviation of less than 1.4 mm. The absolute difference does not increase with physical size, demonstrating that the image-based method can accurately measure the phenotypic parameters of Lentinula edodes fruiting bodies.
3.8. Analysis of Prediction Results for Lentinula edodes Fruiting Bodies
This study used three methods—least squares method, BP neural network, and random forest—to learn the fresh and dry weight training set data of
Lentinula edodes fruiting bodies. The BP neural network architecture consisted of an input layer (3 neurons), two hidden layers (64 and 32 neurons, ReLU activation), and a linear output layer (1 neuron). It was trained using the Adam optimizer with L2 regularization (λ = 0.001). The random forest model was configured with 100 estimators, a maximum tree depth of 10, and a random state of 42. All models were rigorously evaluated using 10-fold cross-validation repeated 5 times. The model performance metrics are summarized in
Table 4, while the training set data distribution and Bland–Altman analysis are presented in
Figure 14.
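As an illustration of this evaluation protocol, the sketch below uses scikit-learn, with MLPRegressor standing in for the BP neural network; the configurations follow the description above, while the data file name, maximum iteration count, and use of scikit-learn are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import RepeatedKFold, cross_val_score
from sklearn.neural_network import MLPRegressor

# Columns: pileus width, stipe diameter, total height, weight (fresh or dry).
data = np.loadtxt("yield_train.csv", delimiter=",")   # illustrative file with the 96 training samples
X, y = data[:, :3], data[:, 3]

models = {
    "least squares": LinearRegression(),
    "BP network": MLPRegressor(hidden_layer_sizes=(64, 32), activation="relu",
                               solver="adam", alpha=0.001, max_iter=2000, random_state=42),
    "random forest": RandomForestRegressor(n_estimators=100, max_depth=10, random_state=42),
}

# 10-fold cross-validation repeated 5 times, scored by R^2.
cv = RepeatedKFold(n_splits=10, n_repeats=5, random_state=42)
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
    print(f"{name}: mean R2 = {scores.mean():.2f} +/- {scores.std():.2f}")
```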
As shown in
Table 4, the least squares method performed the best in predicting the fresh and dry weight of
Lentinula edodes fruiting bodies, with average
R2 values of 0.94 and 0.90, respectively, and standard deviations of only 0.03 and 0.07. The sample distribution in
Figure 14 and the Bland–Altman analysis indicate that the least squares method predicts fresh and dry weight values closer to the true values, with smaller residuals, reflecting the lowest overall error and the best performance. In contrast, the random forest results are relatively poor, with average
R2 values of 0.91 and 0.88, respectively, and relative errors greater than those of least squares and BP neural network.
The learning results of the three models indicate a strong linear relationship between the selected phenotype parameters and the weight of
Lentinula edodes. Among them, pileus width has the closest relationship with weight and is the feature with the highest contribution value. The total height contributes less than 0.17 in all three models, indicating the lowest correlation with weight. The least squares method was used to predict the fresh and dry weight of 24
Lentinula edodes fruiting body test set samples that did not participate in training; the sample distribution is shown in
Figure 15.
As shown in
Figure 15, the fresh weight and dry weight of
Lentinula edodes fruiting bodies predicted using the fitted least squares equations are close to the actual measured values. The average absolute error of fresh weight is 0.66 g, with an average relative error of 3.11%; the average absolute error of dry weight is 0.16 g, with an average relative error of 7.19%. Considering the training and testing sets together, the average measurement errors of fresh weight and dry weight using the least squares method are 1.56 g and 0.39 g, respectively, which can meet the requirements for predicting the yield of
Lentinula edodes fruiting bodies.
4. Discussion
4.1. Segmentation Model
In the development of automated measurement technologies for crop phenotypic parameters, the size and computational complexity of segmentation models remain critical factors limiting their widespread application [
9,
10,
14]. For instance, the CSW-YOLO model proposed by Xu et al. [
44] achieved an mAP50 of 96.7% in bitter melon detection but has a model size of 20.7 MB. Similarly, the YOLOTree model developed by Luo et al. [
45] for estimating crown volume contains 3.0 M parameters. Models of this kind generally suffer from large size and high computational demands, making them difficult to deploy directly in real production environments for phenotypic parameter measurement.
To overcome these challenges, the YOLO-SFCB model proposed in this study substantially reduces model size and computational burden through a series of lightweight design strategies. By incorporating the ShuffleNetV2 backbone and the C3k2-FasterBlock module, the number of Params and FLOPs are significantly reduced. In addition, the CBAM attention mechanism embedded before the detection head enhances the model’s ability to accurately segment Lentinula edodes fruiting bodies of different sizes. In the end, the Params of YOLO-SFCB are only 2.0 M, and FLOPs are only 7.8 G, with the potential for direct deployment on portable devices.
However, this study still has certain limitations. The datasets used are all based on a single strain and culture medium, with limited sample size and scene diversity; in addition, the performance of the model under complex conditions such as low light, water mist interference, or blurred target boundaries has not been systematically validated, and these factors may affect its generalization ability in diverse production environments. In future research, we will further collect images of Lentinula edodes under different substrates, strains, and growth environments to systematically verify the robustness of the model under variable conditions and continuously improve the adaptability and reliability of YOLO-SFCB in practical applications.
4.2. Phenotypic Feature Detection and Yield Prediction of Lentinula edodes
Crop phenotypic feature detection technologies based on visible light imaging can be broadly divided into two types: three-dimensional reconstruction and two-dimensional image analysis. In recent years, 3D reconstruction methods have received widespread attention due to their high measurement accuracy [
46,
47]. For example, Xu et al. [
48] segmented the fruiting body of
Lentinula edodes based on the YOLOv8x model and measured parameters such as stipe width and pileus thickness using 3D reconstruction technology, controlling the average error to around 10%. Although this method has high accuracy, its workflow is complex and its efficiency is low, making it difficult to meet the requirements of high-throughput phenotype detection. In contrast, measurement methods based on two-dimensional images are more convenient; for example, Lu et al. [
49] used the YOLOv3 model to identify the
Lentinula edodes pileus and estimate its diameter, thereby predicting harvest time. However, the phenotype parameters that can be obtained by this method are limited, and its overall measurement accuracy is low. The phenotype detection and yield prediction method for
Lentinula edodes proposed in this study balances measurement accuracy with operational convenience, effectively unifying the advantages of both approaches. The method imposes few restrictions on the image-capture distance and can extract multiple phenotype parameters, including stipe diameter, stipe height, pileus width, and pileus thickness; the average residuals of these indicators are 0.12 mm, −0.35 mm, −0.10 mm, and 0.10 mm, respectively, verifying the feasibility of high-precision phenotype measurement based on two-dimensional images. The fresh weight and dry weight prediction models established based on the least squares method have
R2 values of 0.94 and 0.90, respectively, indicating a strong correlation between the extracted phenotype parameters and yield, providing a reliable basis for yield estimation.
However, the measurement accuracy of this method is still affected by the segmentation effect. If there is a deviation in the segmentation area, it will directly interfere with the parameter estimation results. In addition, current methods still have certain requirements for the shooting angle, which to some extent limits the efficiency of phenotype extraction. In future research, we will further expand the measurable phenotype feature types, develop a more comprehensive Lentinula edodes phenotype collection system, and, based on this, construct an automatic evaluation system for Lentinula edodes grades to support quality grading and intelligent decision-making in actual production.
5. Conclusions
This study proposes a method for measuring the phenotypic characteristics and predicting the yield of Lentinula edodes based on an instance segmentation model, addressing the low efficiency of traditional Lentinula edodes phenotype detection methods and their tendency to damage the mushroom body. A segmentation network, YOLO-SFCB, is proposed for recognizing and segmenting Lentinula edodes fruiting bodies on mobile devices. By introducing the ShuffleNetV2 network, the C3k2-FasterBlock module, and the CBAM attention mechanism, the ability to recognize and segment small targets is enhanced while keeping the model lightweight. The experimental results show that the YOLO-SFCB model achieves an mAP50-95 of 80.5% on a self-built dataset and an inference time of 236 ms on a Raspberry Pi, balancing accuracy and efficiency. The average residual of the phenotype measurement method combined with OpenCV is no more than 0.35 mm. The MAEs of the least squares predictions for fresh weight and dry weight are 1.56 g and 0.39 g, respectively, and the overall error is within an acceptable range.
In summary, the phenotype feature measurement and yield prediction method proposed in this study is a solution with low computing power requirements and high measurement accuracy and has the potential to be promoted in the field of automatic detection and evaluation of Lentinula edodes. Future research will focus on developing phenotype feature measurement methods that integrate multiple angles, reducing dependence on shooting angles and further improving measurement accuracy. In addition, we will continue to expand the image data of different edible fungi to further improve the applicability of the model and assist in the development of automated cultivation and harvesting technology for edible fungi.