1. Introduction
A magnetic ring is a ring-shaped magnet widely used in common electronic circuit components, and its quality directly affects device performance. During production, however, raw material composition, processing technology, and equipment conditions can cause surface defects such as cracks and adhesion, which degrade both the appearance and the performance of the magnetic ring. Currently, most magnetic rings are checked by manual inspection, which is slow and inefficient and makes intelligent production difficult to achieve [1]. Therefore, the industry urgently needs new key technologies that can detect magnetic ring defects quickly and automatically.
In the past, manual visual inspections or non-destructive detection techniques were used to detect magnetic ring defects, which could avoid secondary pollution caused by contact detection. Common non-destructive testing techniques include ultrasonic testing [2], spectrum detection [3], laser detection [4], and X-ray detection [5], which are costly and have low efficiency and accuracy. Because defects are diverse, it is difficult for any single technique to find all defects in a magnetic ring. Machine vision techniques are employed to overcome these drawbacks, since they make it simpler to implement automated production and information technology across a range of industries [6,7,8,9,10,11].
Machine vision uses sensors to capture images, extracts features from them, and interprets those features to obtain information for controlling machinery or processes. Magnetic ring defect detection methods based on machine vision can currently be divided into two classes: traditional machine vision methods and online detection methods based on deep learning. In the conventional approach, researchers design algorithms according to the types and characteristics of magnetic ring defects. For example, defects can be located by analyzing the grey-level difference between defect areas and regular areas in the magnetic ring surface image. Li et al. [1] proposed a magnetic ring defect detection method based on a mask image, which can accurately and quickly extract the defects in each area of the magnetic ring surface image. To address the difficulty that conventional segmentation algorithms have in extracting defects from complex textures, Hou et al. [12] adopted an edge segmentation algorithm based on texture-weakening wavelet analysis and the adaptive Canny algorithm. To enhance defect recognition in circular images, Cui [13] used an adaptive segmentation algorithm based on a multi-scale, multi-direction Gabor filter bank, which enhances the contrast of the defect. Wang [14] developed a machine vision-based magnetic ring surface defect detection system in which defects were extracted according to their features and then classified with a support vector machine (SVM).
Defect detection based on deep learning has become the mainstream approach in recent years, since it extracts defect features automatically and avoids complex hand-crafted algorithm design. Deep learning detectors can be classified into two types: two-stage methods and one-stage methods. A two-stage method first proposes object regions with a deep network and then classifies each object from the features extracted in the proposed region, together with bounding-box regression. A one-stage method predicts bounding boxes over the image directly, without the region proposal step. It is an end-to-end detector, with YOLO and SSD as representative examples, which consumes less time and can be used in real-time applications. Li et al. [15] established a dataset of six types of surface defects on steel strips, improved the YOLO network, and applied it to the production line. Zhang et al. [16] modified the original YOLOv3 by introducing a novel transfer learning method with fully pretrained weights from a geometrically similar dataset, increasing the accuracy of concrete bridge surface damage detection. Chen et al. [17] used DenseNet instead of YOLOv3’s Darknet-53 backbone network to detect SMD LED defects and achieved good results. Guo et al. [18] introduced the MSFT-YOLO model, which adds the TRANS module, to detect steel surface defects. Wang et al. [19] proposed MS-YOLOv5, an improved YOLOv5-based model, to detect the surface defects of aluminum profiles. By replacing the neck of the original algorithm with a PE-Neck structure, the model’s ability to extract and locate defects at different scales was enhanced. Liao et al. [20] replaced the FPN structure of YOLOv5 with a BiFPN structure for the surface defect detection of turbine blades, achieving higher-level feature fusion. However, as networks become deeper, the computing overhead of the system increases, which is a concern given the popularity of edge computing. In addition, because of its network framework, YOLO struggles to detect small defects, which can easily lead to missed detections. Therefore, this paper proposes an improved model based on YOLOv5.
To balance accuracy and speed, the YOLOv5s backbone network is replaced with MobileNetV3 for feature extraction. Some magnetic ring defects are relatively small and are easily missed during inspection. To avoid missed detections, we introduce an SE module into the backbone network, which makes the model pay more attention to the primary information and thus improves detection accuracy. Moreover, we adopt the SIoU loss function to improve the localization accuracy of the model. The main contributions of this paper are as follows:
The paper proposes a lightweight detection algorithm named MR-YOLO (YOLOv5 for Magnetic Ring) by replacing the backbone of the original YOLOv5 with the backbone of the lightweight MobileNetV3.
We add the SE attention mechanism and introduce the SIoU loss function into the model to improve its detection performance and representational ability.
The training dataset is augmented with Mosaic data enhancement, so that a single GPU can achieve better results, lowering the need for large mini-batch sizes.
All of this work is tested and verified on the existing magnetic ring defect dataset, demonstrating the feasibility of the proposed algorithm.
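To make the SE attention mechanism mentioned above more concrete, the following is a minimal sketch of a squeeze-and-excitation step in plain Python. It assumes the standard SE formulation (global average pooling, two fully connected layers, a sigmoid gate); the function name, the weight layout, and the omission of biases are illustrative choices, and the gating function actually used inside the MobileNetV3 blocks of the paper may differ.

```python
import math

def squeeze_excite(feature_map, w_reduce, w_expand):
    """Apply a squeeze-and-excitation gate to a feature map.

    feature_map: list of C channels, each an H x W list of lists of floats.
    w_reduce:    (C // r) x C weight matrix of the first FC layer (hypothetical).
    w_expand:    C x (C // r) weight matrix of the second FC layer (hypothetical).
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # Excitation: FC -> ReLU -> FC -> sigmoid gives one gate in (0, 1) per channel.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w_reduce
    ]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: every value of a channel is multiplied by that channel's gate.
    return [
        [[value * gate for value in row] for row in ch]
        for ch, gate in zip(feature_map, gates)
    ]
```

Channels whose gate is close to 1 pass through almost unchanged, while channels with a small gate are suppressed, which is the sense in which the module lets the network emphasize the more informative channels.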
4. Experimental Setup and Method Validation
The main training steps in the magnetic ring surface defect detection experiment are as follows. First, the labeled magnetic ring dataset is fed into the network model. Second, the initial training parameters are selected and model training is started. After training, a weight file is generated to save the model information. Finally, the weight file is loaded into the network model for image detection.
4.1. Data Preparation
The experimental dataset is constructed from magnetic ring photos taken at a magnetic ring production factory, and the collected images are 2048 × 1536 pixels. The defective magnetic ring pictures were manually screened out, giving an initial data volume of 445 images. To improve the training effect of the model, the dataset is expanded to 1225 pictures using symmetrical operations, and the collected pictures are manually annotated with the LabelImg software. The defects to be detected mainly include top cracks, inner cracks, and adhesion. The expanded dataset is randomly divided into a training set, a validation set, and a test set at a ratio of 8:1:1. The distribution of defects on the surface of the magnetic ring is shown in Figure 3. Most defects are small, which places higher demands on the model’s ability to detect small defects.
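The data preparation steps above can be sketched as follows. This is only an illustration under stated assumptions: a horizontal flip is taken as one example of a "symmetrical operation", labels are assumed to be stored in the normalized YOLO format (class, x_center, y_center, width, height), and the function names and the fixed random seed are hypothetical rather than taken from the paper.

```python
import random

def hflip_yolo_box(box):
    """Mirror one normalized YOLO label horizontally.

    box = (class_id, x_center, y_center, width, height), all in [0, 1].
    Only the x coordinate of the center changes under a horizontal flip.
    """
    cls, xc, yc, w, h = box
    return (cls, 1.0 - xc, yc, w, h)

def split_dataset(items, ratios=(8, 1, 1), seed=0):
    """Randomly split items into train / validation / test subsets.

    The first two subset sizes are rounded down; the test subset
    receives whatever remains, so no item is dropped.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

if __name__ == "__main__":
    train, val, test = split_dataset(range(1225))
    print(len(train), len(val), len(test))
```

With 1225 images and this rounding rule, the 8:1:1 ratio yields 980 training, 122 validation, and 123 test images; the exact counts used in the paper are not stated and may differ by an image or two.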
4.2. Experimental Platform
The experimental platform, a workstation running Windows 10, is shown in Table 1.
4.3. Network Parameter Settings
The default hyperparameter settings used during training are shown in Table 2.
4.4. Evaluation Index
Standard evaluation indices are chosen to assess the model’s performance quantitatively: precision (P), recall (R), mAP@0.5, and mAP@0.5:0.95. We also use other indicators as a basis for evaluation, such as the average detection and processing time, the number of parameters, FLOPs, and model size. Precision is the ratio of the number of correctly predicted positive samples to the number of samples predicted as positive, defined as follows:
$$P = \frac{TP}{TP + FP}$$
Recall is the proportion of all actual targets that are correctly predicted, defined as follows:
$$R = \frac{TP}{TP + FN}$$
where true positives (TP) are the positive samples that the algorithm correctly predicts as positive, false positives (FP) are the negative samples that the algorithm incorrectly predicts as positive, and false negatives (FN) are the positive samples that the algorithm incorrectly predicts as negative. P stands for precision and R for recall.
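As a small worked example of the two definitions, the sketch below computes precision and recall from raw counts. The function name and the convention of returning 0 when a denominator is zero are assumptions made for illustration.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive
    and false-negative counts.

    precision = TP / (TP + FP), recall = TP / (TP + FN).
    A zero denominator yields 0.0 by convention (an assumption here).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

For instance, a detector that reports 10 defects of which 9 are real, while missing 3 real defects, has a precision of 0.9 and a recall of 0.75.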
To validate the performance of the model, the mAP (mean average precision) is adopted as the main evaluation metric. The average precision (AP) of a category is the area under its P-R curve; the mAP computes the AP of each category separately and then averages the AP over all categories. Generally speaking, the better the model, the higher the AP value.
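The relation between the P-R curve, AP, and mAP can be sketched as follows. This is a simplified illustration that uses all-point interpolation (the precision envelope summed over recall steps); the evaluation code used in the paper may interpolate differently, and the function names are hypothetical.

```python
def average_precision(recalls, precisions):
    """Area under a P-R curve using all-point interpolation.

    recalls:    recall values in ascending order.
    precisions: precision values at the same operating points.
    """
    # Add sentinel points at recall 0 and recall 1.
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Build the precision envelope: going right to left, each precision
    # becomes the maximum of itself and everything to its right.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the rectangles between consecutive recall values.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))

def mean_average_precision(ap_per_class):
    """Average the per-category AP values (dict: category -> AP)."""
    return sum(ap_per_class.values()) / len(ap_per_class)
```

In mAP@0.5, a prediction counts as a true positive when its IoU with a ground-truth box is at least 0.5; mAP@0.5:0.95 averages the mAP over IoU thresholds from 0.5 to 0.95 in steps of 0.05.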
In addition to detection accuracy, another key evaluation criterion for defect detection is speed, which plays an important role in real-time scenarios. Speed is generally measured in frames per second (FPS), i.e., the number of images the detector processes per second. A higher FPS value indicates faster detection.
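A minimal sketch of how FPS could be measured is given below. The `detect` callable stands in for the model's inference call and is hypothetical; the injectable `clock` parameter is only there so the function can be checked deterministically.

```python
import time

def measure_fps(detect, images, clock=time.perf_counter):
    """Run detect() on every image and return images processed per second.

    detect: callable taking one image (hypothetical stand-in for inference).
    images: a sized collection of inputs.
    clock:  function returning the current time in seconds.
    """
    start = clock()
    for image in images:
        detect(image)
    elapsed = clock() - start
    # Guard against a zero reading on very coarse clocks.
    return len(images) / elapsed if elapsed > 0 else float("inf")
```

In practice, a few warm-up runs are usually excluded from the timed loop, since the first inferences on a GPU tend to be slower than steady-state ones.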
5. Experimental Results and Analysis
To validate the effectiveness of our model, we introduce a baseline YOLOv5s with a re-parameterization technique. Four ablation experiments were designed in this section: YOLOv5s, YOLOv5s-MobileNetV3, YOLOv5s-MobileNetV3-SE, and YOLOv5s-MobileNetV3-SE-SIoU. The experimental results are shown in Figure 9 and Table 3, respectively. We first test five different models on the same dataset, and the specific results are shown in Table 1. When the backbone is replaced with MobileNetV3, compared with YOLOv5s, the FLOPs are reduced by 60%, the number of model parameters by 49.5%, the mAP by 1.7%, and the model size and inference time by 46.5% and 17.8%, respectively.
To improve the mAP of the model, an SE module was added to the improved YOLOv5s-MV3, and SIoU loss was used as the loss function. The mAP@0.5 of the final model, YOLOv5s-MV3+SE+SIoU+Mosaic, was 1.4% higher than that of YOLOv5s-MV3; the average detection and processing time, the FLOPs, and the number of parameters increased slightly, and the model size was reduced by 2.99%. Compared with the original YOLOv5s, YOLOv5s-MV3+SE+SIoU+Mosaic (MR-YOLO) achieves 59.3% and 47.9% reductions in FLOPs and parameters, respectively, a 16.6% increase in inference speed, and a 48.1% reduction in model size, with only a 0.3% loss in mAP.
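The percentages above are stated relative to two different references (YOLOv5s and YOLOv5s-MV3), so a short consistency check may help. The sketch below, with a hypothetical helper name, converts two reductions measured against the same baseline into the relative change between the two models.

```python
def relative_change(reduction_a, reduction_b):
    """Fractional change of model B relative to model A.

    reduction_a, reduction_b: reductions of A and B with respect to the
    same baseline, as fractions (e.g. 0.465 for 46.5%).
    A negative result means B is smaller than A.
    """
    return (1.0 - reduction_b) / (1.0 - reduction_a) - 1.0

if __name__ == "__main__":
    # Model size: 46.5% (YOLOv5s-MV3) vs 48.1% (MR-YOLO) below YOLOv5s.
    print(f"{relative_change(0.465, 0.481) * 100:.2f}%")
    # Parameters: 49.5% vs 47.9% below YOLOv5s.
    print(f"{relative_change(0.495, 0.479) * 100:.2f}%")
```

The first value comes out at about -2.99%, matching the reported model size reduction between YOLOv5s-MV3 and MR-YOLO, and the second is a small positive number, in line with the slight increase in parameters.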
Table 4 compares the performance of each improved model with and without Mosaic data enhancement. With Mosaic data enhancement, the mAP values increased slightly; for YOLOv5-MV3+SIoU+SE, the mAP increased by 0.3%.
Table 5 shows the performance of different loss functions. Experiments show that SIoU performs best on this dataset.
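The loss functions compared here are all built on the intersection over union (IoU) between a predicted box and a ground-truth box. As a hedged illustration, the sketch below implements only the plain IoU and the basic loss 1 - IoU; it is not the SIoU loss itself, which adds further penalty terms (angle, distance, and shape costs) that are not reproduced here. Boxes are assumed to be given as (x1, y1, x2, y2) corner coordinates.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def iou_loss(pred_box, target_box):
    """Basic IoU loss: 0 for a perfect match, 1 for no overlap."""
    return 1.0 - iou(pred_box, target_box)
```

A known limitation of the plain IoU loss is that it is constant (equal to 1) for all non-overlapping boxes, which is one motivation for the extended variants compared in Table 5.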
Table 6 compares the YOLOv5s-MV3+SIoU+SE+Mosaic network with other classical networks. Among Faster R-CNN, YOLOv3, and YOLOv3-tiny, the YOLOv3-tiny network performs best in terms of accuracy, model size, and inference speed. Compared with YOLOv3-tiny, MR-YOLO has a 3.4% higher mAP and 49.6% fewer parameters, which reflects MR-YOLO’s advantages of high accuracy and a low parameter count. In addition, a comparison of the confusion matrices of the original YOLOv5 and MR-YOLO, shown in Figure 10, indicates that MR-YOLO predicts top cracks better than the original YOLOv5 and is as good as the original YOLOv5 in predicting inner cracks and adhesion. The comparative analysis of the confusion matrices therefore supports the conclusion that MR-YOLO minimizes computation and model size while maintaining appropriate accuracy.
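The per-class comparison read off a confusion matrix can be sketched as follows. The layout (rows are true classes, columns are predicted classes) and the function name are assumptions made for illustration; plotting tools may use the transposed convention, and no values from Figure 10 are reproduced here.

```python
def per_class_recall(matrix, class_names):
    """Recall of each class from a confusion matrix.

    matrix[i][j] = number of objects of true class i predicted as class j
    (assumed layout). The recall of class i is the diagonal entry divided
    by the row total.
    """
    recalls = {}
    for i, name in enumerate(class_names):
        row_total = sum(matrix[i])
        recalls[name] = matrix[i][i] / row_total if row_total else 0.0
    return recalls
```

Computing this for two models on the same test set gives a class-by-class view of where one model detects a defect type more reliably than the other, which is the kind of comparison made above for top cracks, inner cracks, and adhesion.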