Article

Grading Algorithm for Orah Sorting Line Based on Improved ShuffleNet V2

Yifan Bu, Hao Liu, Hongda Li, Bryan Gilbert Murengami, Xingwang Wang and Xueyong Chen

1 College of Mechanical and Electrical Engineering, Fujian Agriculture and Forestry University, Fuzhou 350001, China
2 Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
3 College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling 712100, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(8), 4483; https://doi.org/10.3390/app15084483
Submission received: 19 March 2025 / Revised: 16 April 2025 / Accepted: 16 April 2025 / Published: 18 April 2025

Featured Application

This study is primarily oriented towards the industrialized Orah sorting process.

Abstract

This study proposes a grading algorithm for Orah sorting lines based on machine vision and deep learning. The original ShuffleNet V2 network was modified by replacing the ReLU activation function with the Mish activation function to alleviate the neuron death problem. The ECA attention module was incorporated to enhance the extraction of Orah appearance features, and transfer learning was applied to improve model performance. As a result, the ShuffleNet_wogan model was developed. Based on the operational principles of the sorting line, a time-sequential grading algorithm was designed to improve grading accuracy, along with a multi-sampling diameter algorithm for simultaneous Orah diameter measurement. Experimental results show that the ShuffleNet_wogan model achieved an accuracy of 91.12%, a 3.92% improvement compared to the original ShuffleNet V2 network. The average prediction time for processing 10 input images was 51.44 ms. The sorting line achieved a grading speed of 10 Orahs per second, with an appearance grading accuracy of 92.5% and a diameter measurement compliance rate of 98.3%. The proposed algorithm is characterized by high speed and accuracy, enabling efficient Orah sorting.

1. Introduction

Orah mandarin (Citrus reticulata cv. Orah) is a popular citrus fruit favored by consumers for its rich taste and high nutritional value [1]. In 2021, China ranked fifth in global citrus exports, accounting for nearly 7% of the total volume, with exports reaching 918,000 tons [2].
However, Orah grading in China still primarily relies on manual operations and basic mechanical methods. The former is inefficient and costly, while the latter often causes significant damage to the fruit. This situation is particularly prominent against the backdrop of China’s advancing agricultural modernization and increasing levels of mechanization. Therefore, this study aims to develop a grading algorithm for Orah sorting lines to enhance the efficiency, accuracy, and consistency of Orah grading.
Automatic sorting systems have been widely used in agriculture to improve efficiency and accuracy. Computer vision (CV) and deep learning (DL) offer a new approach to automatic sorting. CV systems use cameras to capture images of objects, and DL algorithms can then be used to identify and classify the objects in the images. This method demonstrates high accuracy and has been successfully applied to various sorting tasks, such as ripeness [3,4], defects [5,6], and size [7].
The application of image processing in fruit sorting machines enables non-destructive fruit inspection. Various mechanical devices equipped with image processing techniques have been developed, including belt, drum, and tray-type conveyors. One of the earliest examples is the system of Bennedsen et al. [8], who described a tray-based sorting system in which apples are driven forward and tilted to lean on a plastic roller, with the calyx perpendicular to the camera. This allows images of the entire apple surface to be acquired. However, handling apples with the calyx oriented vertically is challenging, and tilting the fruit plate makes it impossible to capture the stem part of the fruit. Cheng et al. [9] developed a drum-type conveyor belt with six apple conveyance channels that can rotate the fruit in front of a camera for online image processing. The drive and speed controllers provide timing signals for the electromechanical equipment, ensuring the capture of all apple surface information at 30 frames/s. However, a challenge of drum-type conveyor belts is determining the position of the fruit after it has been inspected. Sofu et al. [10] proposed an automatic apple sorting system with a vision-based conveyor for apple detection and a tray-based conveyor with weighing sensors and a drive system for sorting. However, the detection algorithm employed in this system is simple, using only the K-means algorithm and thresholding, which can compromise detection accuracy when the threshold values for healthy apple regions and defects are close to each other.
In recent years, deep learning [11,12,13], a relatively new research direction in machine vision, has seen increased attention. It establishes deep nonlinear network models, utilizes massive sample data for learning, and improves recognition accuracy. Ismail and Malik [14] used the EfficientNet model on apple and banana test sets, achieving average accuracies of 99.2% and 98.6%, respectively. Bhole and Kumar [15] implemented a deep learning-based non-destructive mango sorting and grading system, achieving classification accuracies of 93.33% and 92.27%. Da Costa et al. [16] proposed a deep learning approach for detecting external defects on tomatoes using computer vision. The study employed deep residual neural network (ResNet) classifiers, with a particular emphasis on feature extraction and fine-tuning. The results indicate that fine-tuning outperformed feature extraction, especially when abundant data samples were available. The best-performing model was a fully fine-tuned ResNet50, achieving an average precision of 94.6% on the test set. Fu et al. [17] applied deep learning to fruit freshness grading and proposed a hierarchical approach in which multiple fruits are detected and classified through real-time object detection. The regions of interest are cropped from the source images and fed into convolutional neural network (CNN) models for regression, ultimately grading the freshness. Cárdenas-Pérez et al. [18] used computer vision to study the ripening stages of Golden Delicious apples and graded their ripeness based on color. Momeny et al. [19] developed a more robust deep convolutional neural network model for detecting the ripeness levels and black spot disease of citrus fruits on trees. Lu et al. [20] proposed a defect detection model, Yolo-FD, which integrates a three-dimensional coordinate attention (TDCA) mechanism into its backbone network, improving the precision of detecting minor defects on citrus peels. They also introduced a fruit morphology detection algorithm combined with the particle swarm optimized extreme learning machine (PSO-ELM) model, achieving a morphology detection accuracy of 91.42%. Kundu et al. [21] proposed a deep learning-based automatic citrus sorting algorithm. In the first stage, the algorithm classifies lemons and oranges. In the second stage, it uses a corresponding model to perform binary classification of fruit quality as either good or bad. Chakraborty et al. [22] designed a small-scale citrus washing and sorting machine suitable for farms, which is capable of weighing and sorting citrus fruits. Additionally, their custom lightweight CNN model “SortNet” classifies fruit quality into “accept” and “reject” categories based on appearance. The machine can sort up to 230 kg of citrus fruits per hour. Moreover, deep learning has been widely applied to grading other fruits and vegetables, including pinto beans, dragon fruit, and various other produce [23,24,25]. These techniques have demonstrated outstanding performance in the fruit grading industry.
To improve the efficiency of Orah sorting, developing industrialized sorting equipment is an effective approach. This study is based on a self-developed sorting line prototype. An Orah grading algorithm was developed according to its operating principles. The aim is to achieve rapid, comprehensive, and accurate Orah grading through improvements in the neural network model and the application of data obtained from multiple sampling.

2. Materials and Methods

2.1. Orah Sorting Line

The Orah sorting line used in this study measures 30 m in length and 4 m in width and integrates four conveyor chains. The mechanical structure of the Orah sorting line is a critical component of the overall design, as it determines the operational principles of the entire system [26,27,28,29]. The mechanical section of the Orah visual sorting system consists of a feed conveyor, a single-line conveyor, an image acquisition chamber, a grading execution device, and grading output channels, as shown in Figure 1.
The feed conveyor is connected to an external Orah washing machine, as washing is a necessary step in the post-harvest processing of Orah to extend its storage life. During this process, most dust and debris are removed, ensuring that clean images of the Orah surface are captured. The single-line conveyor ensures that Orahs enter the sorting process in an orderly and continuous manner. This design prevents Orahs from accumulating and ensures the smoothness of the sorting process. The conveyor mechanism is equipped with fruit cups in which the Orahs are placed. As shown in Figure 2, these fruit cups come into contact with a belt upon entering the image acquisition chamber. The fruit cups are driven to rotate because the belt speed (V2) is slightly higher than the conveyor chain speed (V1). Consequently, the Orah undergoes horizontal transport while continuously rotating with the fruit cups. Through repeated sampling, multiple images of the Orah from different angles can be captured. When defects are detected, the grading results from adjacent angles can be cross-verified. This reduces the likelihood of misgrading due to limitations in model accuracy. These multi-angle sampled images are also utilized for diameter detection.
When the fruit cup blocks the photoelectric switch (model: Panasonic CX-411E, Figure 3a), it triggers the industrial camera (Figure 3b) to capture images. This ensures that each fruit cup occupies a fixed position in the image, and consequently, the position of the Orah within the image is also stable. The Hikvision Robot industrial camera (model: MV-CA032-10GC) features a resolution of 2048 × 1536 and employs a global shutter, avoiding the distortion typically caused by rolling shutters when photographing moving objects [30]. The image acquisition chamber is equipped with high-powered LED lighting, providing sufficient brightness at a shutter speed of 1/1000 s to ensure clarity for moving objects. Together, these components ensure that the Orah’s surface is captured from multiple angles against a clean and uniform background, producing high-quality images suitable for grading and data collection. An overview of the entire sorting line is shown in Figure 4a,b.

2.2. Data Collection and Processing

This study collected 60 kg of Orah samples from Wuming County, Nanning, Guangxi, a major production area for Orah. Before entering the sorting line, the Orah fruits were cleaned and dried using specialized washing machinery to remove surface dust and impurities. To replicate the actual imaging conditions of the visual sorting line, fixed lighting and the integrated cameras were used for image acquisition. This ensured that the captured images accurately reflected the operational environment. The Orah samples were repeatedly fed into the sorting line, and a self-developed program based on the Hikrobot camera SDK was used to automatically transmit and save the images to a PC. From each captured image, 10 smaller images containing individual Orahs were cropped, resulting in a total of 19,982 Orah images. In addition, 5632 images of empty fruit cups were obtained so that the model could recognize empty fruit cups. Skilled sorting workers were invited to classify the individual images into Grades A, B, and C, following the standards shown in Table 1. Representative images of each grade are illustrated in Figure 5.
To improve fitting accuracy, enhance the model’s generalization ability, and reduce overfitting, this experiment employed data augmentation techniques to expand the training dataset. The dataset was initially split, with 80% allocated to the training set and the remaining 20% to the validation set. Data augmentation was applied to the training set using methods such as adjusting brightness and contrast, random rotation, vertical flipping, horizontal flipping, and random scaling transformations. In the training set, data augmentation was applied uniformly across the three grades. The “empty fruit cup” category, representing the black background image, already contained sufficient features and thus was not augmented. The validation set was not augmented, so that it accurately reflects the model’s generalization performance. Table 2 presents the distribution of the expanded dataset, and example images processed using different data augmentation methods are shown in Figure 6.
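As a rough illustration of this pipeline, the augmentation steps can be expressed with torchvision transforms. The specific parameter ranges below (jitter strength, rotation angle, crop scale) are assumptions for the sketch, since the paper does not list them:

```python
# A minimal sketch of the described augmentation pipeline using torchvision;
# parameter ranges are illustrative assumptions, not the authors' settings.
import torchvision.transforms as T

train_transforms = T.Compose([
    T.ColorJitter(brightness=0.2, contrast=0.2),  # adjust brightness and contrast
    T.RandomRotation(degrees=30),                 # random rotation
    T.RandomVerticalFlip(p=0.5),                  # vertical flip
    T.RandomHorizontalFlip(p=0.5),                # horizontal flip
    T.RandomResizedCrop(224, scale=(0.8, 1.0)),   # random scaling transformation
    T.ToTensor(),
])
```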

2.3. Orah Diameter Detection

The background of the image can interfere with diameter measurement. To ensure accurate results, image preprocessing is required before measurement. First, the image is converted to the HSV color space, and the brightness channel (V) is binarized using the Otsu thresholding method. Next, the area of each connected component is calculated, and regions smaller than 50 pixels are removed to eliminate background interference. The specific process is illustrated in Figure 7.
As shown in Figure 8, each Orah in the image capture area of the sorting line is photographed 10 times in succession, yielding 10 images of the Orah from different orientations. These images serve as the input data for the program.
Based on the filtered binary image, the fitEllipse function from the OpenCV library is used to determine the best-fitting ellipse and obtain the length of the major axis. The outcomes of these steps are shown in Figure 9.
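For clarity, the preprocessing and ellipse-fitting steps above can be condensed into a short OpenCV sketch; the function name and the largest-contour selection are illustrative assumptions:

```python
# Sketch: Otsu binarization of the V channel, removal of connected regions
# under 50 pixels, and major-axis extraction via cv2.fitEllipse.
import cv2

def fit_major_axis(image_bgr):
    v = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)[:, :, 2]   # brightness channel
    _, binary = cv2.threshold(v, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Remove connected components smaller than 50 pixels (background noise)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
    for i in range(1, n):
        if stats[i, cv2.CC_STAT_AREA] < 50:
            binary[labels == i] = 0

    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    if len(largest) < 5:                 # fitEllipse needs at least 5 points
        return None
    (_, _), (axis1, axis2), _ = cv2.fitEllipse(largest)
    return max(axis1, axis2)             # major-axis length in pixels
```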
To improve the accuracy of measuring the diameter of Orah, the following methods were adopted:
1.
The image acquisition system is shown in Figure 8. When the camera is positioned at a height of 1.2 m, the field of view width is 1.45 m, corresponding to the range from Pos1 to Pos12. Due to lower image quality and greater distortion at the edges of the frame, measurements were taken at 10 positions within the range of Pos2 to Pos11.
2.
Image distortion correction was applied. An orange plastic ball with a diameter of 85 mm was used for diameter tests across the Pos2 to Pos11 range to calculate distortion coefficients at each position. The corrected diameter value is obtained by multiplying the original diameter value by the distortion coefficient.
3.
Orange plastic balls with diameters of 55 mm, 70 mm, and 85 mm were used to calibrate the system. A conversion function was generated to convert fitted diameter pixel values into the corresponding physical diameters.
Diameter refinement algorithm: Based on the above steps, the major-axis diameters of the fitted ellipse for the same Orah can be obtained from 10 measurements taken at different angles. However, testing shows that 1–3 results among these 10 often deviate significantly. Analysis indicates this is caused by the binarization process using Otsu’s method, which is an automatic thresholding algorithm. If low-grayscale regions appear near the fruit’s contour, such as black spots or scars, they may also be classified as background, leading to underestimated results. Similarly, if leaves or other debris from the background appear in the fruit cup, they can merge with the fruit after binarization, resulting in overestimated diameters. Based on practical usage and experience, the following algorithm (sketched in code after the list) is used to refine the diameter data:
1.
Define a list D = [d1, d2, …, d10] and set two thresholds Dmax and Dmin. For each diameter, check if di < Dmin or di > Dmax. If so, remove di from the list. Dmin and Dmax are determined based on the size range of the fruit. In this paper, Dmin = 40 and Dmax = 200.
2.
Use the Z-score method to identify the most outlying value, as follows: define a threshold Zmax; for each remaining diameter di, calculate its Z-score $z_i = |d_i - \bar{d}| / s$, where $\bar{d}$ and $s$ are the mean and standard deviation of the current list. If the largest zi exceeds Zmax, remove the corresponding di from the list. According to test results, Zmax = 15 was selected.
3.
Repeat step 2 until the list length is less than 6 or all zi < Zmax.
If the list length is less than 6, an error message is output, and the Orah is directed to a special exit, which leads to the front of the image acquisition chamber for re-grading. If the result is valid, output the average of the remaining values in the list.
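Under the stated parameters (Dmin = 40, Dmax = 200, Zmax = 15), the refinement procedure can be sketched as follows; this is one reading of the steps above, not the authors’ exact code:

```python
# Sketch of the diameter refinement: range filter, then iterative removal of
# the most outlying value by Z-score until < 6 values remain or all pass.
import numpy as np

D_MIN, D_MAX, Z_MAX = 40, 200, 15  # thresholds as given in the text

def refine_diameter(diameters):
    d = [x for x in diameters if D_MIN <= x <= D_MAX]     # step 1: range filter
    while len(d) >= 6:
        mean, std = np.mean(d), np.std(d)
        if std == 0:
            break
        z = np.abs((np.array(d) - mean) / std)            # step 2: Z-scores
        worst = int(np.argmax(z))
        if z[worst] <= Z_MAX:                             # step 3: all acceptable
            break
        d.pop(worst)                                      # drop the most outlying value
    if len(d) < 6:
        return None   # error: route the Orah to the special exit for re-grading
    return float(np.mean(d))                              # output the average
```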

2.4. ShuffleNet V2 Model

ShuffleNet V1 is a lightweight convolutional neural network proposed by Megvii Technology. It was designed to enable efficient neural network inference on resource-constrained mobile devices. The network integrates grouped convolution, depthwise separable convolution, and channel shuffling. These techniques enhance feature mixing between channels without increasing computational cost, effectively reducing both computation and parameter counts. Building on these ideas, four practical guidelines for efficient and lightweight network design were subsequently proposed, leading to the development of the ShuffleNet V2 model [31]. ShuffleNet V2 demonstrated high accuracy in several benchmark tests. It also outperformed many earlier lightweight models in terms of speed and efficiency, making it highly suitable for Orah sorting applications with strict real-time requirements. The model architecture is shown in Figure 10.

2.5. Improved ShuffleNet_wogan Model

The images of Orah captured by the sorting line have relatively simple backgrounds, resulting in minimal noise during model training. This environment facilitates good performance for lightweight networks. Therefore, this study selected ShuffleNet V2 ×1.0 as the backbone network and introduced several improvements to enhance its performance.
The architecture of the ShuffleNet_wogan model is shown in Figure 11. In this figure, Conv represents convolution operations, MaxPool denotes max pooling, and AvgPool refers to average pooling. Channel split indicates channel separation, DWConv represents depth-wise convolution, Concat denotes channel concatenation, and Channel Shuffle represents channel shuffling. Stage2, Stage3, and Stage4 correspond to ShuffleNet V2 units, which consist of stacked basic units and down-sampling units. Mish represents the activation function, ECA denotes the attention mechanism module, BN stands for batch normalization, and Grade result indicates the fully connected output layer.

2.5.1. Mish Activation Function

In the original ShuffleNet V2 model, the ReLU activation function is used. ReLU outputs zero for negative inputs and passes positive inputs through unchanged. This adds sparsity to the network and results in high computational efficiency. However, a drawback is that during training, when negative values or large gradients pass through ReLU, its neurons may become permanently inactive (“die”) and cannot be reactivated by any input.
To address this limitation, this study replaced ReLU with the Mish activation function. Mish is a smooth and non-monotonic activation function, with its mathematical expression shown in Equation (1). Compared to ReLU, Mish offers better gradient flow, especially in the negative value region, where the output does not completely drop to zero. This enhances the model’s feature representation capability and allows the model to converge faster. Mish demonstrated superior performance over ReLU and Swish in many deep learning tasks, making it a well-balanced choice in terms of stability and performance.
$\mathrm{Mish}(x) = x \times \tanh\left(\ln\left(1 + e^{x}\right)\right)$ (1)
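For illustration, Equation (1) can be implemented directly, and PyTorch’s built-in nn.Mish makes the swap straightforward; the recursive replacement below is a sketch, not the authors’ published code:

```python
# Sketch: implement Mish per Equation (1) and swap every ReLU in a
# torchvision ShuffleNet V2 for nn.Mish.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import shufflenet_v2_x1_0

def mish(x):
    # Mish(x) = x * tanh(ln(1 + e^x)); softplus(x) = ln(1 + e^x)
    return x * torch.tanh(F.softplus(x))

def replace_relu_with_mish(module):
    # Recursively replace nn.ReLU submodules with nn.Mish
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, nn.Mish(inplace=True))
        else:
            replace_relu_with_mish(child)

model = shufflenet_v2_x1_0()
replace_relu_with_mish(model)
```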

2.5.2. Efficient Channel Attention Module

The structure of the ECA module [32] is shown in Figure 12. The workflow is as follows:
First, the input feature map is as follows:
$X \in \mathbb{R}^{B \times C \times H \times W}$ (2)
where B represents the batch size, C denotes the number of channels, and H and W are the spatial resolution (height and width) of the feature map.
Next, global average pooling is performed on the spatial dimensions of each channel to compute the global response of the channel:
$z_c = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} X_{c,i,j}$ (3)
The result is as follows:
$z \in \mathbb{R}^{B \times C}$ (4)
where zc represents the global average value of channel c, indicating the global response strength of that channel. Xc,i,j is the value of channel c at position (i,j) in the input feature map.
Then, a one-dimensional convolution (Conv1D) with a dynamically determined kernel size is applied to z, completing cross-channel threshold interaction and capturing local relationships between channels:
$w_c = \mathrm{Conv1D}_k(z)$ (5)
Here, wc denotes the weight of channel c, Conv1D represents the one-dimensional convolution operation, and k is the kernel size, which is adaptively determined using the following equation:
$k = \Phi(C) = \left| \frac{\log_2(C)}{\gamma} + \frac{b}{\gamma} \right|_{\mathrm{odd}}$ (6)
In this paper, the parameters γ and b are set to 2 and 1, respectively. C represents the number of channels, and $|t|_{\mathrm{odd}}$ denotes the nearest odd number to t.
The channel weights are then mapped to the range [0, 1] using the sigmoid activation function and applied to weight the original feature map, resulting in the weighted feature map:
$Y_{c,i,j} = \sigma(w_c) \cdot X_{c,i,j}$ (7)
The output is as follows:
$Y \in \mathbb{R}^{B \times C \times H \times W}$ (8)
where σ represents the sigmoid activation function, and Y is the newly generated feature map.
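Putting Equations (2)–(8) together, the ECA module can be sketched in PyTorch as below, with γ = 2 and b = 1 as used in this paper; this follows the widely used reference-style implementation rather than the authors’ exact code:

```python
# Sketch of ECA: per-channel global average pooling, a 1-D convolution with
# adaptively sized kernel k (Equation (6)), and sigmoid channel re-weighting.
import math
import torch.nn as nn

class ECA(nn.Module):
    def __init__(self, channels, gamma=2, b=1):
        super().__init__()
        t = int(abs(math.log2(channels) / gamma + b / gamma))
        k = t if t % 2 == 1 else t + 1           # nearest odd number, |t|_odd
        self.avg_pool = nn.AdaptiveAvgPool2d(1)  # Equation (3)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                        # x: (B, C, H, W)
        z = self.avg_pool(x)                     # (B, C, 1, 1)
        w = self.conv(z.squeeze(-1).transpose(-1, -2))        # Conv1D across channels
        w = self.sigmoid(w.transpose(-1, -2).unsqueeze(-1))   # Equations (5) and (7)
        return x * w.expand_as(x)                # weighted feature map Y
```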

2.5.3. Transfer Learning

Transfer learning [33] is a technique that leverages pretrained models on large-scale datasets. It applies the learned features to new tasks, improving both learning efficiency and model performance. In this study, the pretrained weights of ShuffleNet V2, trained on the large-scale ImageNet dataset, were selected for transfer learning. Since the dataset in this study contains a relatively large number of samples, all feature extraction layers were chosen for training to achieve better training results.
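In torchvision, this setup reduces to a few lines; the sketch below assumes four output classes (Grades A, B, and C plus empty fruit cup) and torchvision ≥ 0.13 for the weights enum:

```python
# Sketch: load ImageNet-pretrained ShuffleNet V2 weights, replace the
# classifier head, and keep all feature-extraction layers trainable.
import torch.nn as nn
from torchvision.models import shufflenet_v2_x1_0, ShuffleNet_V2_X1_0_Weights

model = shufflenet_v2_x1_0(weights=ShuffleNet_V2_X1_0_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 4)  # Grade A / B / C / empty cup

# requires_grad is True by default, so every feature-extraction layer is
# fine-tuned, matching the training choice described above.
```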

2.6. Time Sequence-Based Grading Algorithm

As the Orah rotates while moving on the conveyor chain, it undergoes approximately one full rotation from Pos2 to Pos11. The sorting line captures and grades the same Orah ten times, producing ten grading results. This forms a multi-sampling approach, where each sample reflects a different angle due to continuous rotation. If visible defects exist, up to five images may capture those defects.
To ensure grading accuracy, a judgment mechanism was implemented in the program. If Grade B or Grade C results are detected, the system requires at least three identical grades to appear within any five consecutive images for the result to be considered valid. The corresponding grade is then output. If neither Grade C nor Grade B meets this condition, Grade A is output by default. When both Grade C and Grade B satisfy the condition simultaneously, the system prioritizes Grade C as the final result. By leveraging the correlation between sequential results, the program re-evaluates the neural network’s output. This enhances grading accuracy and reduces the risk of misclassification.
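The judgment mechanism can be read as a sliding-window vote over the ten sequential results; the function below is an illustrative sketch of these rules, not the production code:

```python
# Sketch: a grade of 'B' or 'C' is valid only if it occurs >= 3 times within
# some window of 5 consecutive results; 'C' has priority; default is 'A'.
def final_grade(results):  # results: list of 10 per-image grades, e.g. ['A', 'B', ...]
    def confirmed(grade):
        return any(results[i:i + 5].count(grade) >= 3
                   for i in range(len(results) - 4))
    if confirmed('C'):
        return 'C'   # Grade C takes priority when both grades qualify
    if confirmed('B'):
        return 'B'
    return 'A'       # default when neither B nor C is confirmed

print(final_grade(['A', 'B', 'B', 'A', 'B', 'A', 'A', 'A', 'A', 'A']))  # -> 'B'
```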

2.7. Experimental Setup and Parameters

The experiments in this study were conducted on a 64-bit Windows 11 Professional Edition operating system. The hardware configuration included a NVIDIA GeForce RTX 4070 Super GPU (NVIDIA, Santa Clara, CA, USA), an Intel(R) Core(TM) i7-13700K CPU @ 3.40 GHz (Intel, Santa Clara, CA, USA), and 64 GB of RAM. Model construction, training, and validation were performed using Python 3.9 and the PyTorch 1.13.0 deep learning framework.
During training, the stochastic gradient descent (SGD) optimizer was used. The momentum parameter was set to 0.9, with a weight decay of 0.00004, and the loss function was cross-entropy. The learning rate was 0.01, the total number of training iterations was 200, and the batch size was 120.
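Expressed in PyTorch, this configuration amounts to the sketch below, where `model` is the network under training and `train_loader` is an assumed DataLoader over the augmented training set:

```python
# Sketch of the stated training configuration: SGD with momentum 0.9,
# weight decay 4e-5, cross-entropy loss, learning rate 0.01, 200 epochs.
import torch
import torch.nn as nn

optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.9, weight_decay=0.00004)
criterion = nn.CrossEntropyLoss()

for epoch in range(200):                 # 200 training iterations
    for images, labels in train_loader:  # batch size 120
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```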

3. Results and Discussion

3.1. Evaluation Metrics

This study evaluates the model’s performance using accuracy, precision, recall, and the F1 score. The computational complexity of the model is assessed based on the number of parameters and the number of floating-point operations (FLOPs).
Accuracy refers to the proportion of correctly predicted samples among all samples, as shown in Equation (9).
$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$ (9)
Precision is used to measure how many of the samples predicted as positive are truly positive. Its calculation formula is shown in Equation (10).
$\mathrm{Precision} = \frac{TP}{TP + FP}$ (10)
Recall is used to measure the proportion of actual positive samples that are correctly predicted as positive by the model. Its calculation formula is shown in Equation (11).
$\mathrm{Recall} = \frac{TP}{TP + FN}$ (11)
The F1 score is the harmonic mean of precision and recall, focusing on the balance between precision and recall. Its calculation formula is shown in Equation (12).
$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$ (12)
where TP represents the number of actual positive samples correctly predicted as positive, TN represents the number of actual negative samples correctly predicted as negative, FP represents the number of actual negative samples incorrectly predicted as positive, and FN represents the number of actual positive samples incorrectly predicted as negative.

3.2. Comparison Test Between ReLU and Mish Activation Functions

In this study, the ReLU activation function used in ShuffleNet V2 was replaced with the Mish activation function. The accuracy curves of ReLU and Mish activation functions are shown in Figure 13, where Mish demonstrates a certain advantage over ReLU, enhancing the model’s representational capacity.
The results in Table 3 indicate that the Mish activation function achieved an accuracy, precision, recall, and F1 score of 87.35%, 87.37%, 87.33%, and 87.33%, respectively, which are 0.15, 0.04, 0.13, and 0.08 percentage points higher than those achieved by the ReLU activation function. Although the improvement is small, it is meaningful: the dataset used in this study features clean backgrounds and minimal noise, so features can be extracted effectively even with ReLU. The additional gain provided by Mish therefore represents a genuine advantage.

3.3. Comparison Test of Attention Mechanisms

To investigate the impact of different attention mechanisms on model performance, four lightweight attention mechanisms were selected for comparative experiments under identical conditions. CBAM, SE, FCA, and ECA attention mechanisms were separately integrated into the original ShuffleNet V2 model, and the results are shown in Table 4.
The experimental results indicate that the model integrated with the ECA attention mechanism outperformed the other schemes in terms of accuracy, precision, recall, and F1 score. Furthermore, the ECA mechanism added a lower parameter count and fewer floating-point operations (FLOPs) than the other attention mechanisms. Therefore, integrating ECA is the most suitable choice for this study.

3.4. Ablation Study

Ablation experiments were conducted to verify the effect of adding or removing specific modules on the overall network performance. In this study, improvements were introduced into ShuffleNet V2, and the proposed optimization methods were validated through ablation experiments. These experiments included the introduction of the ECA module, transfer learning, and the Mish activation function, with the results shown in Table 5. In the table, the symbol “√” indicates that the improvement was applied, while “-” indicates that it was not applied.
As shown in Table 5, introducing the ECA attention module into ShuffleNet V2 increased accuracy and F1 score by 0.45% and 0.38%, respectively, while slightly increasing the floating-point operations. Applying transfer learning improved accuracy and F1 score by 2.67% and 2.53%, respectively, without changing the parameter count, floating-point operations, or model size. The Mish activation function resulted in slight improvements in accuracy and F1 score, a slight reduction in floating-point operations, and a minor increase in model size. When the ECA module, transfer learning, and the Mish activation function were all applied simultaneously to ShuffleNet V2, the accuracy and F1 score improved by 3.92% and 3.81%, respectively. This was achieved without increasing the parameter count or model size, and with an increase in floating-point operations of less than 0.1%. These results indicate that ShuffleNet_wogan improves grading accuracy without significantly increasing model complexity.

3.5. Visualization Analysis Results

To further evaluate the model in this study, a confusion matrix visualization was employed. The confusion matrix for the model on the test set is shown in Figure 14. In the confusion matrix, all correct predictions are located on the diagonal, while all incorrect predictions are off the diagonal. The deeper the color, the higher the prediction accuracy for the corresponding category.
As shown in Figure 14, 1196 out of 1265 Grade A samples were correctly identified, achieving an accuracy of 94.5%. Among the 1020 Grade B samples, 95 were misgraded as Grade A, and 141 were misgraded as Grade C, resulting in an accuracy of 76.9%. Out of 1712 Grade C samples, 7 were misgraded as Grade A, and 158 were misgraded as Grade B, achieving an accuracy of 91.2%. All 1127 empty fruit cup samples were correctly identified.
Misclassified images from the validation set were selected for further analysis.
Misclassifications in Grade A primarily occurred for the following reasons:
1.
Fruit stems were not neatly trimmed or exhibited white cores, causing them to resemble defects and be incorrectly classified as Grade B or C, accounting for 45.0% of misclassified images (Figure 15a).
2.
Minor surface defects that still fell within the Grade A category but were incorrectly classified as Grade B or C (Figure 15b), accounting for 55.0%.
Misclassifications in Grade B primarily occurred for the following reasons:
1.
Small scars were not recognized, leading to misclassification as Grade A, accounting for 41.5%. These minor defects would typically be acceptable as Grade A in traditional manual sorting due to slight differences in standards among workers, thus not significantly impacting practical sorting. However, they were still treated as misclassifications in this study (Figure 16a).
2.
Foreign objects appearing in images, causing misclassification as Grade C, accounting for 7.5% (Figure 16b).
3.
Improperly trimmed fruit stems resulting in misclassification as Grade C, accounting for 11.9% (Figure 16c).
4.
Misclassification due to limitations in model accuracy, accounting for 39% (Figure 16d).
The primary reason for misclassification in Grade C is insufficient model accuracy, resulting in ineffective feature extraction (Figure 17).
The error analysis results indicate that various features may appear across all three grades, making classification inherently challenging. Notably, Grade B samples often exhibit diverse and subtle defects, but the number of Grade B images in the dataset is relatively lower compared to Grades A and C, which may have further impacted classification accuracy. Therefore, collecting a larger-scale and more balanced dataset is necessary to enhance the model’s feature extraction capability.
In summary, while the ShuffleNet_wogan model did not perfectly classify all images, it successfully identified the majority of samples in each category, demonstrating good practical application potential. Given that the sorting line in this study samples each Orah 10 times, the output results can undergo secondary evaluation, further reducing the probability of misgrading.
To intuitively understand which areas of the Orah images the model primarily focuses on, heatmap visualization was employed to highlight key attention regions within the network. Grad-CAM was used to visualize the model’s attention on sample images from Grade A, Grade B, and Grade C categories, as shown in Figure 18. Since the recognition accuracy for the empty fruit cup category reached 100%, no heatmap tests were conducted for this category. According to the heatmaps, the ShuffleNet V2 network overly concentrated on the defect areas in the Orah images, and its focus regions showed potential deviations, leading to lower accuracy. After the introduction of the attention mechanism, ShuffleNet_wogan demonstrated more precise focus, allocating greater attention to the overall appearance of the Orah. However, a drawback was the increased attention paid to the background regions. Despite this, after overall weighting, ShuffleNet_wogan still demonstrated better grading accuracy.

3.6. Comparison with Conventional Methods

Conventional image processing methods, such as morphological operations and thresholding, are still applicable in defect detection scenarios. Therefore, it is meaningful to compare them with the model proposed in this study. A Python-based program was developed for black spot detection, and its main workflow is shown in Figure 19.
After input, the image is converted to grayscale. A black hat transformation (cv2.MORPH_BLACKHAT) is applied to extract dark features. The result is then normalized to enhance contrast. Otsu’s method is used to automatically determine a global threshold, which separates black spot regions from the background. Morphological opening (cv2.MORPH_OPEN) is used to remove small noise. Finally, contours are detected to locate the black spots.
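A sketch of this pipeline in OpenCV follows; the structuring element sizes are assumptions, as the text does not specify them:

```python
# Sketch of the conventional pipeline: grayscale -> black hat -> normalize ->
# Otsu threshold -> morphological opening -> contour detection.
import cv2
import numpy as np

def detect_black_spots(image_bgr):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))  # assumed size
    blackhat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, kernel)    # dark features
    norm = cv2.normalize(blackhat, None, 0, 255, cv2.NORM_MINMAX)    # enhance contrast
    _, mask = cv2.threshold(norm, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    opened = cv2.morphologyEx(mask, cv2.MORPH_OPEN,
                              np.ones((3, 3), np.uint8))             # remove small noise
    contours, _ = cv2.findContours(opened, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours]  # candidate black-spot boxes
```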
The results show two main problems. First, the method fails to distinguish stems from black spots (Figure 20). In nearly all images with visible stems, the stems are incorrectly identified as black spots, which severely affects reliability. The visual similarity between stems and black spots also makes it difficult to filter out such interference. Second, the method has a low detection rate for small black spots (Figure 21). Foreign objects in the background are also often misidentified as defects, reducing robustness.
Therefore, compared with the ShuffleNet_wogan model, which achieved an accuracy of 91.12%, the performance of conventional methods shows a clear disadvantage.

3.7. Comparison Test of Different Network Models

To evaluate the performance of the ShuffleNet_wogan model, its performance was compared with various classical classification models. The models selected for the comparative experiments included ResNet50, DenseNet121, AlexNet, EfficientNet B0, MobileNetV2, MobileNetV3-Large, and ShuffleNet V2_×1.0. The results are shown in Table 6.
As observed in Table 6, all models demonstrated good performance in determining the external grades of Orah, with accuracy exceeding 87%. Among the classical classification models, ShuffleNet V2_×1.0 exhibited the lowest accuracy at 87.20%. However, it had the lowest floating-point operations (FLOPs) and parameter count, indicating its suitability for the high-speed grading requirements of Orah sorting lines. ResNet50 achieved a higher accuracy of 90.64%, but its complexity was significantly greater: its FLOPs exceeded those of ShuffleNet V2 by more than 25 times, which could negatively impact grading speed. After 200 iterations of training, the ShuffleNet_wogan model achieved an accuracy of 91.12%, making it the best-performing model. Additionally, it demonstrated advantages in model complexity, with the same parameter count as ShuffleNet V2_×1.0 and FLOPs only marginally higher, far below those of the other networks. These results indicate that the proposed model is an efficient and lightweight network with excellent overall performance, making it well-suited for the requirements of Orah grading.

3.8. Diameter Measurement Test

One hundred Orahs from each of Grades A, B, and C were selected for both visual diameter measurement and caliper diameter measurement. The difference between the two measurements was taken as the deviation, with deviations of less than ±2 mm considered acceptable.
The experimental results are shown in Table 7. Out of 300 Orahs, 295 met the criteria, resulting in a pass rate of 98.3%. The average deviation was +0.09 mm, the standard deviation was 1.14 mm, the maximum positive deviation was +2.07 mm, and the maximum negative deviation was −1.83 mm. The high pass rate indicates that the diameter data measured by the program can serve as a standard for grading Orahs by size. There were no significant performance differences among the three grades, indicating that the method is suitable for Orahs of all grades.

3.9. Actual Grading Testing

First, the classification speed of ShuffleNet_wogan on the testing platform was evaluated. Using Python’s threading module, 10 multi-threaded tasks were created. Each batch loaded 10 Orah images for grading. The next batch was loaded only after all threads completed grading for the current batch. This design simulated the working conditions of a sorting line. It ensured that the grading time for each batch was determined by the longest grading time among the threads. A total of 1000 batches, or 10,000 images, was tested.
The results show an average grading time of 51.44 ms per batch, with a maximum grading time of 87.02 ms. Thus, ShuffleNet_wogan is capable of handling 10 batches per second, equating to 10 Orahs per second on the conveyor chain.
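The timing procedure can be sketched as follows; `model` and the preloaded image batches are assumed to be prepared elsewhere, and each thread performs one forward pass:

```python
# Sketch: 10 threads each grade one image; the batch time is set by the
# slowest thread, mirroring the sorting line's working conditions.
import threading
import time
import torch

def grade(img, model):
    with torch.no_grad():
        model(img.unsqueeze(0))          # single-image forward pass

def time_batch(batch, model):            # batch: list of 10 image tensors
    threads = [threading.Thread(target=grade, args=(img, model)) for img in batch]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()                         # wait for the slowest thread
    return (time.perf_counter() - start) * 1000.0  # batch grading time in ms
```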
To evaluate classification accuracy, 200 Orahs from each of Grades A, B, and C (600 in total) were selected. These samples were mixed and input into the sorting line. The fruits were directed to the corresponding output channels based on their classification, and the proportion of correct classifications was recorded at the output. The conveyor chain was set to transport 10 Orahs per second.
The results shown in Figure 22 indicate that the time-sequential grade judgment algorithm significantly improved the recognition success rate. The sorting accuracy reached 97% for Grade A, 87.5% for Grade B, and 93% for Grade C. The overall sorting accuracy rate was 92.5%. From the output results of 10 classifications for each Orah, defects in appearance were highly likely to be correctly classified within the image range of five adjacent positions, effectively filtering out misclassifications occurring during single recognitions. The lower accuracy for Grade B is attributed to insufficiently distinct features compared to Grade C, leading to defective areas occasionally being misclassified into Grade C. This issue requires further improvement.

4. Conclusions

Deep learning for fruit grading is characterized by high efficiency and accuracy, effectively reducing labor costs. This study collected a high-quality dataset of Orah samples based on an actual prototype. The imaging environment was carefully controlled to minimize noise during model training, ensuring abundant and reliable samples. Data augmentation was employed to further expand the dataset, improving the model’s generalization ability. The ShuffleNet V2 network architecture was enhanced by replacing the activation function and incorporating the ECA attention module. These changes led to performance improvements without introducing additional computational overhead. Furthermore, transfer learning was applied to further enhance performance. According to the results of multi-model comparison experiments, ablation studies, and heatmap visualizations, these improvements significantly boosted model performance. Compared with conventional morphology-based defect detection methods, the model demonstrates significant advantages.
The proposed ShuffleNet_wogan network model achieved a recognition accuracy of 91.12% on the test platform, with a recognition time for processing 10 images of under 100 ms, reaching a throughput of 10 Orahs (100 images) per second. Assuming each Orah weighs 80 g with a loading efficiency of 70%, a single conveyor chain can sort approximately 2000 kg of Orah per hour. With four conveyor chains integrated into the prototype, the sorting capacity can reach 8000 kg per hour, equivalent to the sorting speed of 30–40 skilled manual workers. Over two years of continuous improvement, the sorting line prototype and its algorithm have cumulatively sorted more than 10,000 tons of Orah, demonstrating excellent practical value.
Taking advantage of the continuous rotation of Orahs during image acquisition, a time-sequence-based grading algorithm and a multi-sampling-based diameter algorithm were proposed. These algorithms address issues related to the precision of existing acquisition equipment and the accuracy of deep learning models, thereby improving practical grading accuracy. Overall tests demonstrated that the appearance grading accuracy reached 92.5%, and the diameter recognition pass rate achieved 98.3%, showcasing high usability. The appearance grading was divided into three levels, providing finer segmentation compared to binary classification deep learning models and enabling greater economic value. By combining appearance grading and diameter data, the harvested Orahs can be categorized into multiple quality grades, meeting diverse market demands.
This study has some limitations due to research constraints. First, the model was evaluated only on a single cultivar: Orah. Fruits of different cultivars may have different visual characteristics, which could affect the model’s generalizability. Second, all images in the dataset were collected using the sorting line. The lighting and background conditions remained highly consistent. Therefore, the model’s performance under varying environmental conditions has not been tested.
Future work will focus on further expanding the dataset and adopting advanced technologies to optimize the model. During future dataset collection, more attention will be given to maintaining data balance. This is expected to improve the classification performance for each grade. The model will also be extended to different cultivars and varied environmental conditions to enhance its adaptability and generalization. In addition, deployment on embedded and mobile platforms will be explored to support real-time applications in resource-constrained environments.

Author Contributions

Conceptualization, Y.B. and H.L. (Hao Liu); methodology, Y.B.; software, Y.B. and H.L. (Hao Liu); validation, Y.B., H.L. (Hao Liu) and H.L. (Hongda Li); formal analysis, H.L. (Hongda Li); investigation, H.L. (Hao Liu); resources, B.G.M.; data curation, X.W.; writing—original draft preparation, Y.B. and H.L. (Hao Liu); writing—review and editing, Y.B. and H.L. (Hao Liu); visualization, Y.B. and H.L. (Hongda Li); supervision, X.C.; project administration, X.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Key Research and Industrialization Projects of Science and Technology Innovation in Fujian Province (Grant No. 2023XQ005). The APC was funded by the same project.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

The authors would like to thank Pinghe Huayu Machinery Company for their support in machining and providing the necessary facilities.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. He, Y.; Li, W.; Zhu, P.; Wang, M.; Qiu, J.; Sun, H.; Zhang, R.; Liu, P.; Ling, L.; Fu, X.; et al. Comparison between the vegetative and fruit characteristics of “Orah” (Citrus reticulata Blanco) mandarin under different climatic conditions. Sci. Hortic. 2022, 300, 111064. [Google Scholar] [CrossRef]
  2. Citrus World Statistics 2022 Edition. Available online: https://worldcitrusorganisation.org/activities/citrus-world-statistics/ (accessed on 1 January 2025).
  3. Halstead, M.; McCool, C.; Denman, S.; Perez, T.; Fookes, C. Fruit Quantity and Ripeness Estimation Using a Robotic Vision System. IEEE Robot. Autom. 2018, 3, 2995–3002. [Google Scholar] [CrossRef]
  4. Mazen, F.M.A.; Nashat, A.A. Ripeness Classification of Bananas Using an Artificial Neural Network. Arab. J. Sci. Eng. 2019, 44, 6901–6910. [Google Scholar] [CrossRef]
  5. Weijun, X.; Wei, S.; Zhaohui, Z.; Deyong, Y. A CNN-based lightweight ensemble model for detecting defective carrots. Biosyst. Eng. 2021, 208, 287–299. [Google Scholar] [CrossRef]
  6. Jahanbakhshi, A.; Momeny, M.; Mahmoudi, M.; Zhang, Y. Classification of sour lemons based on apparent defects using stochastic pooling mechanism in deep convolutional neural networks. Sci. Hortic. 2020, 263, 109133. [Google Scholar] [CrossRef]
  7. Liu, L.; Li, Z.; Lan, Y.; Shi, Y.; Cui, Y. Design of a tomato classifier based on machine vision. PLoS ONE 2019, 14, e0219803. [Google Scholar] [CrossRef] [PubMed]
  8. Bennedsen, B.S.; Peterson, D.L.; Tabb, A. Identifying defects in images of rotating apples. Comput. Electron. Agric. 2005, 48, 92–102. [Google Scholar] [CrossRef]
  9. Cheng, X.; Tao, Y.; Chen, Y.R.; Luo, Y. Nir/MIR dual–sensor machine vision system for online apple stem–end/calyx recognition. Trans. ASAE 2003, 46, 551. [Google Scholar] [CrossRef]
  10. Sofu, M.M.; Er, O.; Kayacan, M.C.; Cetişli, B. Design of an automatic apple sorting system using machine vision. Comput. Electron. Agric. 2016, 127, 395–405. [Google Scholar] [CrossRef]
  11. Shelhamer, E.; Long, J.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 640–651. [Google Scholar] [CrossRef]
  12. Cai, Z.; Vasconcelos, N. Cascade R-CNN: Delving Into High Quality Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018. [Google Scholar] [CrossRef]
  13. Xuan, X.; Zhang, X.; Kwon, O.; Ma, K. VAC-CNN: A Visual Analytics System for Comparative Studies of Deep Convolutional Neural Networks. IEEE Trans. Visual. Comput. Graph. 2022, 28, 2326–2337. [Google Scholar] [CrossRef]
  14. Ismail, N.; Malik, O.A. Real-time visual inspection system for grading fruits using computer vision and deep learning techniques. Inf. Process. Agric. 2022, 9, 24–37. [Google Scholar] [CrossRef]
  15. Bhole, V.; Kumar, A. Mango Quality Grading using Deep Learning Technique: Perspectives from Agriculture and Food Industry. In Proceedings of the 21st Annual Conference on Information Technology Education, Online, 7–9 October 2020. [Google Scholar] [CrossRef]
  16. Da Costa, A.Z.; Figueroa, H.E.H.; Fracarolli, J.A. Computer vision based detection of external defects on tomatoes using deep learning. Biosyst. Eng. 2020, 190, 131–144. [Google Scholar] [CrossRef]
  17. Fu, Y.; Nguyen, M.; Yan, W. Grading Methods for Fruit Freshness Based on Deep Learning. SN Comput. Sci. 2022, 3, 264. [Google Scholar] [CrossRef]
  18. Cárdenas-Pérez, S.; Chanona-Pérez, J.; Méndez-Méndez, J.V.; Calderón-Domínguez, G.; López-Santiago, R.; Perea-Flores, M.J.; Arzate-Vázquez, I. Evaluation of the ripening stages of apple (Golden Delicious) by means of computer vision system. Biosyst. Eng. 2017, 159, 46–58. [Google Scholar] [CrossRef]
  19. Momeny, M.; Jahanbakhshi, A.; Neshat, A.A.; Hadipour-Rokni, R.; Zhang, Y.D.; Ampatzidis, Y. Detection of citrus black spot disease and ripeness level in orange fruit using learning-to-augment incorporated deep networks. Ecol. Inform. 2022, 71, 101829. [Google Scholar] [CrossRef]
  20. Lu, J.; Chen, W.; Lan, Y.; Qiu, X.; Huang, J.; Luo, H. Design of citrus peel defect and fruit morphology detection method based on machine vision. Comput. Electron. Agric. 2024, 219, 108721. [Google Scholar] [CrossRef]
  21. Kundu, N.; Rani, G.; Dhaka, V.S. A deep learning based system for automatic sorting and quality grading of citrus fruits. In Proceedings of the 2024 International Conference on Emerging Trends in Networks and Computer Communications (ETNCC), Windhoek, Namibia, 23–25 July 2024. [Google Scholar] [CrossRef]
  22. Chakraborty, S.K.; Subeesh, A.; Potdar, R.; Chandel, N.S.; Jat, D.; Dubey, K.; Shelake, P. Ai-enabled farm-friendly automatic machine for washing, image-based sorting, and weight grading of citrus fruits: Design optimization, performance evaluation, and ergonomic assessment. J. Field Robot. 2023, 40, 1581–1602. [Google Scholar] [CrossRef]
  23. Omid, M.; Salehi, A.; Rashvand, M.; Firouz, M.S. Development and evaluation of an online grading system for pinto beans using machine vision and artificial neural network. Int. J. Postharvest Technol. Innov. 2020, 7, 1–14. [Google Scholar] [CrossRef]
  24. Patil, P.U.; Lande, S.B.; Nagalkar, V.J.; Nikam, S.B.; Wakchaure, G.C. Grading and sorting technique of dragon fruits using machine learning algorithms. J. Agric. Food Res. 2021, 4, 100118. [Google Scholar] [CrossRef]
  25. Abu-Jamie, T.N.; Abu-Naser, S.S.; Alkahlout, M.A.; Aish, M.A. Six Fruits Classification Using Deep Learning. Int. J. Acad. Inf. Syst. Res. 2022, 6, 1–8. [Google Scholar]
  26. Zhou, X. Mechanism Design and Implementation of Citrus Automatic Grading Equipment. Master’s Thesis, Chongqing University of Technology, Chongqing, China, June 2018. (In Chinese). [Google Scholar]
  27. Fan, Y. Research on Sorting and Conveying Device of Mountain Navel Orange Picking Operation Platform. Master’s Thesis, Chongqing Three Gorges University, Chongqing, China, June 2021. (In Chinese). [Google Scholar] [CrossRef]
  28. Xie, M. Research on Compact Visual Grading System of Citrus. Master’s Thesis, Harbin Institute of Technology, Harbin, China, June 2022. (In Chinese). [Google Scholar] [CrossRef]
  29. Wang, X. Study on Key Technologies of Small Citrus Automatic Grading Equipment. Master’s Thesis, Jiangsu University, Zhenjiang, China, June 2022. (In Chinese). [Google Scholar] [CrossRef]
  30. Lauxtermann, S.; Lee, A.; Stevens, J.; Joshi, A. Comparison of global shutter pixels for cmos image sensors. In Proceedings of the 2007 International Image Sensor Workshop, Ogunquit, ME, USA, 7–10 June 2007. [Google Scholar] [CrossRef]
  31. Zhang, X.Y.; Zhou, X.Y.; Lin, M.X.; Sun, J. Shufflenet: An extremely efficient convolutional neural network for mobile devices. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018. [Google Scholar] [CrossRef]
  32. Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Online, 14–19 June 2020. [Google Scholar] [CrossRef]
  33. Ganin, Y.; Lempitsky, V. Unsupervised domain adaptation by backpropagation. In Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 6–11 July 2015. [Google Scholar]
Figure 1. Diagram of the mechanical model for the Orah sorting line.
Figure 2. Movement and rotation of Orah on the conveyor system.
Figure 3. Component diagram of the Orah sorting line. (a) Panasonic photoelectric sensor; (b) Hikvision Robot industrial camera.
Figure 4. Overview of the sorting line. (a) Single-line conveyor and image acquisition chamber; (b) grading execution device.
Figure 5. Samples of three grades of Orah.
Figure 6. Images after data augmentation. (a) Original appearance of Orah; (b) horizontal flip; (c) vertical flip; (d) random rotation; (e) random brightness and contrast; and (f) random scaling.
Figure 7. Image preprocessing procedure. (a) Original image; (b) V channel; (c) Otsu binary image; and (d) filtered binary image.
Figure 8. The image acquisition system. Radiating arrows indicate the field of view of the camera.
Figure 9. Best-fit ellipse.
Figure 10. Module architecture of ShuffleNet V2. (a) Basic unit; (b) down-sampling unit.
Figure 11. ShuffleNet_wogan model architecture.
Figure 12. ECA module architecture.
Figure 13. Comparison of accuracy between ReLU and Mish.
Figure 14. Confusion matrix for ShuffleNet_wogan model.
Figure 15. Misclassifications in Grade A. (a) Untrimmed stem or white core misclassified as defect; (b) minor surface defects misclassified as lower grade.
Figure 16. Misclassifications in Grade B. (a) Unrecognized small scars; (b) background interference from foreign objects; (c) misleading stem appearance; (d) prediction error due to model limitations.
Figure 17. Misclassifications in Grade C.
Figure 18. Heatmap for ShuffleNet V2 and ShuffleNet_wogan model.
Figure 19. Workflow of the black spot detection program. The green squares indicate the black spots detected by the program.
Figure 20. The stem was misidentified as a black spot. The green squares indicate the black spots detected by the program.
Figure 21. Small black spots were not detected. The green squares indicate the black spots detected by the program.
Figure 22. Confusion matrix for the Orah sorting line.
Table 1. Orah grading standards.

Grade | Appearance Features
Grade A | The fruit has a uniform shape and consistent color, smooth skin, and no signs of rot, sunburn, black spots, or mechanical damage.
Grade B | The fruit has a relatively uniform shape; over 75% of the fruit surface exhibits consistent coloring; the fruit skin is relatively smooth, with no mechanical damage. A single fruit is allowed to have sunburn covering no more than 10% of its surface, scars within 1 mm in width, or black spots within 1 mm in diameter.
Grade C | Deformed fruit; uneven coloring; mold or mechanical damage; and relatively severe defects such as sunburn, scars, and black spots.
Table 2. Distribution of the Orah appearance grade dataset.

Category | Original Image Count | Image Count After Data Augmentation
Grade A | 6322 | 31,607
Grade B | 5100 | 25,500
Grade C | 8560 | 42,800
Empty fruit cup | 5632 | 5632
Total | 25,614 | 105,539
Table 3. Activation function comparison results.

Activation Function | Accuracy | Precision | Recall | F1
ReLU | 0.8720 | 0.8733 | 0.8720 | 0.8726
Mish | 0.8735 | 0.8737 | 0.8733 | 0.8733
Table 4. Attention mechanism comparison results.

Attention Mechanism | Accuracy | Precision | Recall | F1 | Parameters/10^6 | FLOPs/10^6
CBAM | 0.8642 | 0.8632 | 0.8642 | 0.8636 | 1.67 | 152.0
SE | 0.8763 | 0.8756 | 0.8763 | 0.8759 | 1.53 | 151.8
FCA | 0.8724 | 0.8716 | 0.8724 | 0.8718 | 1.53 | 151.8
ECA | 0.8765 | 0.8764 | 0.8765 | 0.8764 | 1.26 | 149.8

Note: FLOPs = floating-point operations.
Table 5. Ablation experiment comparison results.

Model | ECA | TL | Mish | Accuracy | F1 Score | Par/10^6 | FLOPs/10^6 | MS/MB
ShuffleNet V2-×1.0 | - | - | - | 0.8720 | 0.8726 | 1.26 | 149.6 | 4.96
ShuffleNet V2-×1.0 | √ | - | - | 0.8765 | 0.8764 | 1.26 | 149.8 | 4.96
ShuffleNet V2-×1.0 | - | √ | - | 0.8987 | 0.8979 | 1.26 | 149.6 | 4.96
ShuffleNet V2-×1.0 | - | - | √ | 0.8735 | 0.8733 | 1.26 | 148.0 | 4.97
ShuffleNet V2-×1.0 | - | √ | √ | 0.9065 | 0.9055 | 1.26 | 148.0 | 4.97
ShuffleNet V2-×1.0 | √ | √ | - | 0.9055 | 0.9058 | 1.26 | 149.8 | 4.96
ShuffleNet_wogan | √ | √ | √ | 0.9112 | 0.9107 | 1.26 | 149.8 | 4.96

Note: TL = transfer learning; Par = parameters; FLOPs = floating-point operations; MS = model size.
Table 6. Model comparison results.

Model | Accuracy | Precision | Recall | F1 | Parameters/10^6 | FLOPs/10^6
ResNet50 | 0.9064 | 0.9045 | 0.9063 | 0.9054 | 23.52 | 4119.8
DenseNet121 | 0.8970 | 0.8973 | 0.8970 | 0.8970 | 6.96 | 2881.6
AlexNet | 0.8862 | 0.8861 | 0.8862 | 0.8861 | 57.02 | 711.5
EfficientNet B0 | 0.9040 | 0.9028 | 0.9040 | 0.9031 | 4.01 | 400.4
MobileNet V2 | 0.9034 | 0.9054 | 0.9033 | 0.9042 | 2.23 | 319.0
MobileNet V3-Large | 0.9040 | 0.9030 | 0.9040 | 0.9034 | 4.21 | 227.0
ShuffleNet V2_×1.0 | 0.8720 | 0.8733 | 0.8720 | 0.8726 | 1.26 | 149.6
ShuffleNet_wogan | 0.9112 | 0.9104 | 0.9112 | 0.9107 | 1.26 | 149.8

Note: F1 = F1 score; FLOPs = floating-point operations.
Table 7. Diameter measurement deviation results.

Grade | Mean/mm | Standard Deviation/mm | Maximum Value/mm | Minimum Value/mm | Pass Rate
Grade A | 0.06 | 1.10 | 2.06 | −1.76 | 98%
Grade B | 0.21 | 1.15 | 2.00 | −1.79 | 99%
Grade C | 0.00 | 1.17 | 2.07 | −1.83 | 98%
Total | 0.09 | 1.14 | 2.07 | −1.83 | 98.3%