Article

Surface Defect Detection of Magnetic Tiles Based on YOLOv8-AHF

1 School of Electronics and Electrical Engineering, Bengbu University, Bengbu 233030, China
2 College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210000, China
3 School of Mathematics and Physics, Bengbu University, Bengbu 233030, China
* Author to whom correspondence should be addressed.
Electronics 2025, 14(14), 2857; https://doi.org/10.3390/electronics14142857
Submission received: 18 June 2025 / Revised: 9 July 2025 / Accepted: 15 July 2025 / Published: 17 July 2025

Abstract

Magnetic tiles are an important component of permanent magnet motors, and their quality directly affects a motor's performance and service life. Defect detection must therefore be performed on magnetic tiles in industrial production so that defective tiles can be removed. The YOLOv8-AHF algorithm is proposed to improve the network's feature extraction ability and to address the missed detections and poor results that arise in surface defect detection because permanent magnet motor tiles are small, while simultaneously reducing the deviation between the predicted box and the ground-truth box. Firstly, a hybrid module combining atrous convolution and depthwise separable convolution (ADConv) is introduced into the backbone of the model to capture global and local features in magnetic tile detection images. In the neck section, a hybrid attention module (HAM) is introduced to focus on the regions of interest in magnetic tile surface defect images, which improves information transmission and fusion. The Focal-Enhanced Intersection over Union (Focal-EIoU) loss function is adopted to improve localization. We conducted comparative experiments, ablation experiments, and corresponding generalization experiments on the magnetic tile surface defect dataset. The experimental results show that the evaluation metrics of YOLOv8-AHF surpass those of mainstream single-stage object detection algorithms. Compared to the You Only Look Once version 8 (YOLOv8) algorithm, YOLOv8-AHF improved mAP@0.5, mAP@0.5:0.95, F1-Score, precision, and recall by 5.9%, 4.1%, 5%, 5%, and 5.8%, respectively. The algorithm thus achieves a significant performance improvement in the task of detecting surface defects on magnetic tiles.

1. Introduction

As a type of motor, permanent magnet motors have the advantages of simple structure, easy maintenance, light weight, small size, and low energy consumption. They are widely manufactured and applied in various aspects of production and daily life. Being a crucial component of permanent magnet motors, magnetic tiles’ quality directly affects the performance and service life of the motors. During the industrial production of magnetic tiles, various defects may arise on the surface of the tiles due to factors such as raw materials, manufacturing processes, and equipment. These defects can not only attenuate the magnetic field effect and shorten the service life but also pose substantial production safety risks.
In summary, considering the importance of magnetic tiles for permanent magnet motors, it is necessary to conduct defect detection on magnetic tiles in industrial production and eliminate those with defects.
Generally, the related work of defect detection can be classified into the following four categories, i.e., manual detection, traditional image detection, traditional machine vision detection, and deep learning detection.
Manual Detection: Manual inspection adapts well to varied and complex defect types, but it has many problems: inspectors cannot accurately identify magnetic tile defects over long periods, inspection accuracy depends on the inspector's subjective judgment and skill level, and manual inspection is far slower than the production line. Because of its low efficiency and high false detection rate, manual defect detection not only increases enterprise costs and restricts detection throughput but also cannot guarantee stable, accurate results.
Traditional Image Detection: Traditional image detection mainly relies on methods such as threshold segmentation, edge detection, and morphological processing to detect defects in magnetic tiles. Threshold segmentation selects a threshold based on the distribution of pixel grayscale values in an image to separate the background from defect targets. Edge detection identifies and locates defect edges based on the drastic grayscale changes at defect boundaries. Morphological processing optimizes the results of threshold segmentation or edge detection to remove pseudo-defects. Li Xueqin et al. [1] used line scanning to threshold the coefficients of the Non-Subsampled Contourlet Transform (NSCT), which extracts more accurate features. Lin Lijun et al. [2] proposed a magnetic tile crack defect edge detection algorithm based on wavelet modulus maxima, ensuring the closure of the defect edge. Gao Qianqian [3] adopted morphological methods for detection, effectively preserving edge details. Li Zehui [4] adopted grayscale mean subtraction correction and pixel response characteristic correction to enhance the grayscale uniformity of the image and quantify the evaluation results. Traditional image detection methods are relatively simple and have clear judgment rules, but they also have notable drawbacks: sensitivity to grayscale differences, strict lighting requirements, no use of spatial context, and the need for fixed inspection regions to reduce pixel-traversal computation and noise interference.
Traditional Machine Vision Detection: Traditional machine vision inspection selects suitable industrial cameras and light sources to capture product images based on the industrial production environment and product characteristics, designs corresponding feature extraction algorithms for the defect targets in the images, and then applies defect detection algorithms for feature recognition and extraction. The rapid development of machine learning has led to the emergence of such defect detection algorithms. Ding Longfei et al. [5] proposed a magnetic-tile-defect detection algorithm that integrates multiple attention mechanisms to establish spatial correlation between network features, which is suitable for defect classification. Hu Hao et al. [6] proposed a machine vision-based method for detecting micro-defects on the surface of small magnetic tiles and designed corresponding defect extraction methods for three types of magnetic tile defects. Lei LJ et al. [7] proposed a fast detection method for the segmented embedding of surface defects, which achieves bidirectional fusion of image processing and defect detection and improves the detection of bearing surface defects. Compared with manual detection, machine vision detection offers fast detection speed, good stability, and high accuracy. However, traditional machine vision methods rely on manually designed feature extraction, whose features are difficult to engineer and whose detection robustness is poor; further improvement is needed to enhance system adaptability and robustness.
Deep Learning Detection: After LeCun et al. proposed LeNet [8], a solid foundation was laid for the development of deep learning. Deep networks such as AlexNet [9], VggNet [10], GoogleNet [11], Resnet [12], MobileNet [13,14], and EfficientNet [15,16] emerged, and deep learning made rapid progress in fields such as computer vision, speech recognition, and natural language processing. With its widespread application in computer vision, defect detection methods based on deep learning have also been developed and applied, and many scholars have successfully built more in-depth and comprehensive networks. For example, a convolutional neural network called Crack Mask R-CNN has been used to detect asphalt pavement cracks at the pixel level [17]; a convolutional neural network image classification framework combined with transfer learning has been used to detect wood surface defects [18]; a residual network combined with active learning has been applied to defects in infrastructure such as pipelines [19]; a weighted bidirectional feature pyramid network based on a YOLOv5s model with embedded residual modules has been used for hot-rolled steel plate surface defect detection [20]; generative adversarial networks have been used for surface defect detection [21]; deep belief networks have been used for solar cell defect detection [22]; and the image pyramid hierarchy idea has been combined with a convolutional denoising autoencoder network to detect texture image defects [23]. A supervised autoencoder with a background suppression strategy has been constructed for detecting defects in small samples [24], and a novel multilevel representation based on a two-stage segmentation framework refines the blurred-edge problem [25].
However, deep learning-based methods still have the following main problems: when detecting whether there are defects on the surface of magnetic tiles, due to the small volume of magnetic tiles, there may be insufficient information about defect areas, low resolution, and high noise in industrial production. Additionally, the small size and uneven distribution of defects or abnormal sample data in the magnetic tile dataset can affect the model’s generalization and prediction ability in test data, resulting in missed detections or poor detection performance.
The key contributions of this paper include the following: In the backbone part of the original YOLOv8 model, a hybrid convolution module ADConv is introduced to replace the ordinary convolution in the Conv-BN-SILU (CBS) module. ADConv can integrate global and local information to more accurately capture the complex features of defects. In the neck part, a hybrid attention module (HAM) is introduced to adjust the model’s attention to high-quality samples. We introduce cost-sensitive factors and optimize the loss function Focal-EIoU. The improved YOLOv8 model is trained using the enhanced sample dataset to obtain a magnetic tile surface defect detection model.
The rest of the paper is organized as follows. Section 2 and Section 3 declare the basic and improvement methods employed in this study. Section 4 details experimental analysis and results. The study concludes in Section 5.

2. Basic Backbone Network

YOLO is a series of single-stage object detection algorithms. Its core idea is to transform object detection into a regression problem, taking the entire image as network input and directly obtaining bounding box positions and categories through the neural network. The series has long been a research hotspot in object detection due to its efficiency and real-time characteristics, and its detection accuracy and speed continue to improve with the development of the technology [26,27,28,29,30,31,32].
The paper uses the YOLOv8 network as the basic backbone network. YOLOv8 mainly consists of three parts: the backbone, neck, and head. The backbone is mainly responsible for feature extraction. It consists of CBS, cross-stage partial (CSP), and spatial pyramid pooling fast (SPPF) modules. The CBS module, the basic convolutional block, chains a convolution (conv), batch normalization (BN), and the sigmoid linear unit (SiLU) in sequence to achieve downsampling and channel expansion. The convolution extracts local features from input and feature maps, BN accelerates training and improves stability, and SiLU introduces nonlinearity. The CSP module implements cross-stage partial aggregation to increase feature diversity. The SPPF module concatenates feature maps of different scales to enhance the backbone's feature expression ability. The neck, located between the backbone and the head, is mainly responsible for multi-scale feature fusion and enhancement. Through a path aggregation network, it achieves bidirectional bottom-up and top-down feature fusion, effectively enhancing feature information at different scales. The head adopts a decoupled, anchor-free design, with three detection heads responsible for object localization and classification, producing the final detection results.
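For concreteness, the SiLU activation used in the CBS module can be sketched in a few lines of plain Python (an illustrative sketch of the mathematical definition, not the YOLOv8 implementation):

```python
import math

def silu(x: float) -> float:
    """SiLU (sigmoid linear unit): x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

# SiLU is smooth and passes through the origin:
print(silu(0.0))              # 0.0
print(round(silu(1.0), 4))    # 1 * sigmoid(1) = 0.7311
```

Unlike ReLU, SiLU is non-monotonic for small negative inputs, which is part of why it is favored as the nonlinearity in modern YOLO backbones.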

3. Algorithm Improvements

YOLOv8, as an efficient object detection algorithm, is widely used in various visual tasks due to its end-to-end training approach and real-time inference capability. However, it still has performance bottlenecks in small object detection. In this study, we propose an optimization scheme for magnetic tile surface defect detection based on the YOLOv8 model as the backbone network, aiming to improve the accuracy and robustness of the model in identifying small magnetic tile targets in industrial production. This section introduces in detail the three main improvements we made to the YOLOv8 algorithm to extract, fuse, and detect feature information from magnetic tile surface defect images. The specific network structure is shown in Figure 1. (1) The model introduces a hybrid convolution module ADConv to replace the ordinary convolution conv(3, 2) in the CBS module of the original backbone; it takes both global and local information into account and thus captures the complex features of defects more accurately. (2) The model introduces a hybrid attention module (HAM) in the neck part, which uses a parallel attention structure over the spatial and channel dimensions. It selectively focuses on the regions of interest in magnetic tile surface defect images, alleviates shallow-feature loss in the network, and improves the network's ability to perceive long-range positional information, learn local features, and transmit and fuse information. (3) To balance the contributions of high-quality and low-quality samples to the loss, we introduce cost-sensitive factors and optimize the Focal-EIoU loss function, adjusting the model's attention toward high-quality samples by allocating them higher costs and low-quality samples lower costs, thereby balancing differences in sample size.
These improvements not only enhance the model’s perception of small targets but also improve the model’s localization accuracy and convergence speed.

3.1. Mixed Convolution Module ADConv

In the backbone part, we replace the ordinary convolution conv(3, 2) of the CBS module in the original YOLOv8 model with the mixed convolution module ADConv. The main idea of the ADConv module is to combine atrous convolution [33] and depthwise separable convolution [34] in series, expanding the receptive field through atrous convolution while retaining the efficiency of depthwise separable convolution. It better captures multi-scale feature information in the image, increases the sampling interval, improves global feature extraction, and reduces computational complexity and parameter count. The ADConv structure is shown in Figure 2. Firstly, we apply atrous convolution with dilation rate K. It has a larger receptive field than ordinary convolution and obtains multi-scale information without changing the size of the output feature map. We then extract image detail features through depthwise separable convolution, which consists of two steps: depthwise convolution and pointwise convolution. Depthwise convolution is applied channel-wise on the input layer: for a three-channel input, three 3 × 3 kernels perform convolution on the three channels separately, and the resulting per-channel feature maps are stacked together. A 1 × 1 pointwise convolution across the three channels then combines them, producing a single-channel output per pointwise filter.
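The efficiency claims above can be checked with a small back-of-the-envelope sketch (the function names are ours, for illustration only): a 3 × 3 convolution dilated at rate 2 covers a 5 × 5 region, and a depthwise separable convolution needs far fewer weights than a standard convolution of the same kernel size.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Depthwise k x k per-channel filters + 1 x 1 pointwise projection."""
    return k * k * c_in + c_in * c_out

def atrous_receptive_field(k: int, d: int) -> int:
    """Effective kernel extent of a k x k convolution with dilation rate d."""
    return k + (k - 1) * (d - 1)

# A 3 x 3 dilated convolution with rate 2 covers a 5 x 5 region:
print(atrous_receptive_field(3, 2))           # 5

# Depthwise separable convolution uses far fewer weights (64 -> 64 channels):
print(standard_conv_params(3, 64, 64))        # 36864
print(depthwise_separable_params(3, 64, 64))  # 4672
```

This roughly 8x reduction in weights for a 64-channel layer is the source of the "reduced computational complexity and parameters" benefit cited for ADConv.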

3.2. Hybrid Attention Module (HAM)

In the neck part, we introduce the mixed attention module HAM [35]. This module combines two dimensions of channel attention and spatial attention, processed in a parallel attention structure. It is a more comprehensive feature attention method. The hybrid attention module (HAM) structure is shown in Figure 3. The following describes the operations on two dimensions of attention. In the spatial attention dimension, the input feature map X is first subjected to global max pooling and global average pooling in the channel dimension, resulting in two H × W × 1 feature maps. Then, the results of global max pooling and global average pooling are concatenated by channels to obtain a feature map dimension of H × W × 2. We perform a 7 × 7 convolution operation on the concatenated result to obtain a feature map dimension of H × W × 1 and use the sigmoid activation function to obtain the spatial attention weight matrix. In the channel attention dimension, we first perform feature learning on the channels, globally average pool each channel of the input feature map X, and transform the feature map of each channel into a single numerical value. The pooled features are compressed into a one-dimensional vector 1 × 1 × C. Then we learn the correlation between feature channels, scale the channels by n times through 1 × 1 convolution, perform ReLU activation, restore the channels to C through 1 × 1 convolution, use the activation function sigmoid to process, and limit the weight values of each channel to the interval of [0, 1]. Finally, the feature map is recalibrated after the above operation. We multiply the weights of the channel and spatial dimensions obtained by processing the input feature map X in sequence with itself and apply them to the corresponding channels and spaces to obtain the final output feature map X’, which focuses more on important features and highlights the most-contributing regions to improve the defect detection performance of the model. 
The mathematical representation is as follows:
$$M_s(X) = \sigma\left(\mathrm{conv}_{7\times 7}\left([\mathrm{GMP}(X);\ \mathrm{GAP}(X)]\right)\right)$$
$$M_c(X) = \sigma\left(\mathrm{conv}_{1\times 1}\left(\mathrm{ReLU}\left(\mathrm{conv}_{1\times 1}(\mathrm{GAP}(X))\right)\right)\right)$$
$$X' = X \otimes M_s(X) \otimes M_c(X)$$
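As a hedged illustration of the channel-attention branch described above, the following NumPy sketch applies global average pooling, two 1 × 1 convolutions (which reduce to dense layers on the pooled 1 × 1 × C vector), ReLU, and a sigmoid; the weights are random stand-ins for learned parameters, and the spatial branch with its 7 × 7 convolution is omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)

def channel_attention(x: np.ndarray, n: int = 2) -> np.ndarray:
    """Channel branch of the HAM (sketch): GAP -> 1x1 conv (expand by n)
    -> ReLU -> 1x1 conv (restore C) -> sigmoid -> rescale channels.
    Random weights here are purely illustrative, not learned ones."""
    c, h, w = x.shape
    z = x.mean(axis=(1, 2))                     # global average pooling: (C,)
    w1 = rng.standard_normal((n * c, c)) * 0.1  # 1x1 conv acts as a dense layer
    w2 = rng.standard_normal((c, n * c)) * 0.1
    a = np.maximum(w1 @ z, 0.0)                 # ReLU
    m_c = 1.0 / (1.0 + np.exp(-(w2 @ a)))       # sigmoid weights in (0, 1)
    return x * m_c[:, None, None]               # recalibrate each channel

x = rng.standard_normal((8, 16, 16))            # toy feature map, C=8
y = channel_attention(x)
print(y.shape)  # (8, 16, 16)
```

Because the sigmoid weights lie strictly in (0, 1), each channel of the output is a damped copy of the input channel, which is exactly the "recalibration" effect the text describes.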

3.3. Loss Function Focal-EIoU

The YOLOv8n model uses complete intersection over union (CIoU) as the bounding box regression loss function [36,37]. Although this function considers the overlapping area, center point distance, and aspect ratio of the bounding boxes, the aspect-ratio term describes only the relative ratio of width to height, not the true differences in width and height, which may lead to inaccurate regression. We therefore introduce the Focal-EIoU loss function, which resolves the ambiguous aspect-ratio definition in CIoU. The efficient intersection over union (EIoU) loss consists of three parts: $L_{IoU}$, $L_{dis}$, and $L_{asp}$. EIoU comprehensively considers the true differences in overlapping area, center point distance, and width and height. Adding Focal Loss addresses the sample imbalance problem in bounding box regression, effectively improving localization and accelerating convergence. The optimized loss function is as follows:
$$\mathrm{IoU} = \frac{|g \cap p| + \alpha}{|g \cup p| + \alpha}$$
$$L_{IoU} = 1 - \mathrm{IoU}$$
$$L_{dis} = \frac{\rho^2(B_g, B_p)}{c^2}$$
$$L_{asp} = \frac{\rho^2(w_g, w_p)}{C_w^2} + \frac{\rho^2(h_g, h_p)}{C_h^2}$$
$$L_{EIoU} = L_{IoU} + L_{dis} + L_{asp} = 1 - \mathrm{IoU} + \frac{\rho^2(B_g, B_p)}{c^2} + \frac{\rho^2(w_g, w_p)}{C_w^2} + \frac{\rho^2(h_g, h_p)}{C_h^2}$$
$$L_{Focal\text{-}EIoU} = \mathrm{IoU}^{\gamma}\, L_{EIoU}$$
where $g$ denotes the ground-truth bounding box, $p$ denotes the predicted bounding box, and $\alpha$ is a small positive number. $c$ represents the diagonal length of the smallest box enclosing the ground-truth and predicted bounding boxes. $B_g$ and $B_p$ represent the center points of the ground-truth and predicted bounding boxes, respectively. $w_g$, $w_p$ and $h_g$, $h_p$ represent the widths and heights of the ground-truth and predicted bounding boxes, respectively. $C_w$ and $C_h$ denote the width and height of the minimum enclosing box. $\gamma$ is a regulatory factor that controls the degree of outlier suppression. Compared with the EIoU loss, the Focal-EIoU loss with $\gamma$ has lower localization error and brings a more stable improvement. $\gamma$ is a hyperparameter: the Focal-EIoU loss works well under extreme foreground–background class imbalance, but a larger $\gamma$ suppresses hard examples more strongly and slows convergence. We set $\gamma = 0.5$, which achieves the best trade-off, and use it as the default value in further experiments.
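The loss terms above can be combined into a short Python sketch for axis-aligned boxes given as (x1, y1, x2, y2); this is our own minimal illustration following the equations (with the smoothing constant $\alpha$ omitted for clarity), not the paper's implementation:

```python
def eiou_terms(g, p):
    """IoU and EIoU loss for axis-aligned boxes (x1, y1, x2, y2)."""
    # overlapping area and union
    iw = max(0.0, min(g[2], p[2]) - max(g[0], p[0]))
    ih = max(0.0, min(g[3], p[3]) - max(g[1], p[1]))
    inter = iw * ih
    area_g = (g[2] - g[0]) * (g[3] - g[1])
    area_p = (p[2] - p[0]) * (p[3] - p[1])
    iou = inter / (area_g + area_p - inter)
    # smallest enclosing box: diagonal c^2, width C_w, height C_h
    cw = max(g[2], p[2]) - min(g[0], p[0])
    ch = max(g[3], p[3]) - min(g[1], p[1])
    c2 = cw ** 2 + ch ** 2
    # center distance term L_dis
    gcx, gcy = (g[0] + g[2]) / 2, (g[1] + g[3]) / 2
    pcx, pcy = (p[0] + p[2]) / 2, (p[1] + p[3]) / 2
    l_dis = ((gcx - pcx) ** 2 + (gcy - pcy) ** 2) / c2
    # width/height difference term L_asp
    l_asp = (((g[2] - g[0]) - (p[2] - p[0])) ** 2 / cw ** 2
             + ((g[3] - g[1]) - (p[3] - p[1])) ** 2 / ch ** 2)
    return iou, (1.0 - iou) + l_dis + l_asp

def focal_eiou(g, p, gamma=0.5):
    """Focal-EIoU: IoU^gamma down-weights low-overlap (outlier) boxes."""
    iou, l_eiou = eiou_terms(g, p)
    return iou ** gamma * l_eiou

box = (0.0, 0.0, 4.0, 2.0)
print(focal_eiou(box, box))  # identical boxes: IoU = 1, loss = 0.0
```

For a perfect prediction, all three terms vanish and the focal weight is 1, so the loss is exactly zero; any shift or resize of the predicted box makes the loss positive.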

4. Experiments and Analysis

4.1. Experimental Environment

The experiments were conducted in the environment listed in Table 1. During training, the RT-DETR algorithm and the YOLO-series algorithms used the AdamW optimizer within the same framework, while the SSD algorithm was trained with its default SGD optimizer at a learning rate of 1 × 10−3. The batch size was 16, the number of iterations was 300, and input images were resized to 1280 × 1280.

4.2. Experimental Data

To assess the performance of the YOLOv8-AHF algorithm, we ran detailed experiments on an open dataset: the magnetic-tile-defect dataset [38] collected by the Chinese Academy of Sciences. The dataset contains images of six types of magnetic tile surface conditions, as shown in Figure 4: blowhole, crack, break, fray, uneven, and defect-free. We divided the dataset into a training set and a validation set in an 8:2 ratio.
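An 8:2 split of this kind can be reproduced with a simple seeded shuffle; the filenames below are hypothetical placeholders, not the dataset's actual naming scheme:

```python
import random

def split_dataset(paths, train_ratio=0.8, seed=42):
    """Shuffle image paths and split into train / validation lists (8:2)."""
    paths = list(paths)
    random.Random(seed).shuffle(paths)  # seeded for reproducibility
    cut = int(len(paths) * train_ratio)
    return paths[:cut], paths[cut:]

images = [f"tile_{i:04d}.jpg" for i in range(100)]  # placeholder filenames
train, val = split_dataset(images)
print(len(train), len(val))  # 80 20
```

Seeding the shuffle keeps the split identical across runs, which matters when comparing algorithms on the same train/validation partition.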

4.3. Evaluation Metrics

In order to analyze the surface defects of magnetic tiles, standard evaluation metrics were employed, such as precision, recall, F1-Score, mean average precision (mAP), parameter count (Parameters), giga floating-point operations (GFLOPs), and frames per second (FPS).
The evaluation metrics are listed as follows:
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
Precision is an important indicator in model evaluation. It measures the proportion of actual positive examples in the predicted positive examples. True positive (TP) indicates that positive examples are recognized as positive examples by the model, while false positive (FP) indicates that negative examples are recognized as positive examples by the model.
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
Recall is an important indicator in model evaluation. It measures the proportion of predicted positive examples in actual positive examples. False negative (FN) indicates that positive examples are recognized as negative examples by the model.
$$\text{F1-Score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
F1-Score is the harmonic average of precision and recall.
$$\mathrm{mAP} = \frac{1}{N} \sum_{k=1}^{N} AP_k$$
Average precision (AP) is the average precision for a single category, and mAP is the mean of AP over all N categories; the mAP index is used for comprehensive evaluation of object detection performance. GFLOPs denotes the billions of floating-point operations required for one forward pass and is typically used to evaluate a model's computational complexity and resource requirements. FPS is the number of images that can be processed per second.
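The precision, recall, and F1 definitions above translate directly into code; a minimal sketch using illustrative counts:

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are actually positive."""
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that the model finds."""
    return tp / (tp + fn)

def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# Illustrative counts: 90 true positives, 10 false positives, 30 false negatives
p, r = precision(90, 10), recall(90, 30)
print(p, r, round(f1_score(p, r), 3))  # 0.9 0.75 0.818
```

Note how the harmonic mean pulls F1 toward the weaker of the two metrics, which is why it is preferred over a simple average when precision and recall diverge.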

4.4. Comparative Experiments

To verify the effectiveness of defect detection, several typical object detection algorithms (SSD, RT-DETR, YOLOv5, YOLOv8, YOLOv11, YOLOv12) were selected for comparative experiments under the same configuration environment and dataset.

4.4.1. Algorithm Comparison Experiment

We evaluated the performance of the seven algorithms using the aforementioned evaluation metrics; Table 2 lists the results. From the comparative data, the YOLOv8-AHF algorithm achieves the best overall detection quality. In particular, it has the highest mAP@0.5, mAP@0.5:0.95, F1-Score, and recall, reflecting its excellent detection performance and stability. Compared to SSD and RT-DETR, YOLOv8-AHF achieves higher detection performance with minimal parameter and computational complexity, striking a good balance between speed and accuracy and demonstrating excellent model design and optimization strategies. Although its GFLOPs and FPS are not as good as those of YOLOv12, they still far outperform traditional single-stage object detectors and can meet the needs of most real-time applications.

4.4.2. mAP Comparison Experiment

We plotted comparison curves of mAP@0.5 and mAP@0.5:0.95 for the above seven algorithms. In Figure 5, the horizontal axis represents training epochs and the vertical axis represents mAP. The training curves show that the YOLOv8-AHF algorithm performs well: it maintains high mAP@0.5 and mAP@0.5:0.95 values, and its final convergence is better than that of the other methods. This indicates that the algorithm not only has higher detection accuracy but also good training stability and convergence speed. In Figure 5a, the mAP@0.5 of the YOLOv8-AHF algorithm rapidly improved to 0.89 in the first 105 epochs and ultimately stabilized at 0.96. In Figure 5b, the mAP@0.5:0.95 of the algorithm converged to 0.66, with performance stabilizing after 185 epochs.

4.4.3. Precision–Recall Comparison Experiment

We used the seven algorithms compared in Section 4.4 (SSD, RT-DETR, YOLOv5, YOLOv8, YOLOv11, YOLOv12, and YOLOv8-AHF) to detect magnetic tile images and drew the precision–recall (PR) curve of each, as shown in Figure 6. The PR curves of the five types of magnetic tile surface defects for each algorithm are shown in Figure 6a–g, sequentially. In a PR curve graph, the closer the area under the curve is to 1, the better the algorithm's performance. The YOLOv8-AHF algorithm reached 0.962 in mAP@0.5, the highest among all algorithms, indicating the strongest overall detection performance. Its PR curves show high precision, high recall, and convergence stability across categories, almost approaching the ideal rectangular area. The PR curves of the five defect types also have clear contours, indicating strong generalization across all categories. From Figure 6g, it can be seen that the area enclosed by our algorithm's blue curve is close to 1. The corresponding PR curves for the five defect types are steep and approach the upper-right corner, indicating that the YOLOv8-AHF algorithm maintains good precision even at high recall rates, the ideal state for object detection tasks.

4.4.4. Detection Effect Comparison Experiment

In order to verify the effectiveness of the algorithm, we conducted a comparative experiment on detection results. As shown in Figure 7, we provide original images and annotation images of magnetic tile surface defects, together with the detection results of the YOLOv8 algorithm and the YOLOv8-AHF algorithm, placed in columns a, b, c, and d, sequentially. Images of three defect types (crack, break, and uneven) occupy the three rows of the figure. Comparing columns c and d of Figure 7, the YOLOv8-AHF algorithm achieves a higher detection rate than YOLOv8.

4.5. Ablation Experiment

4.5.1. Convolutional Module Ablation Experiment

Table 3 reveals the performance comparison of various typical convolution modules. Through our pilot study, we have found that the ADConv module has excellent performance in multiple dimensions, significantly improving detection capabilities while maintaining efficiency. The mAP@0.5:0.95 of the ADConv module is 0.698, which demonstrates that the ADConv module performs the strongest in high-quality prediction. The mAP@0.5 of the ADConv module is 0.955, almost on par with the omni-dimensional dynamic convolution (ODConv) module, reaching a relatively high level. The F1-Score of the ADConv module reached 0.94, which is the highest in Table 3, indicating stable classification and strong recall ability. It can be seen that the ADConv module is an excellent structure that balances precision and recall. Compared to the ODConv module, it maintains better precision and inference efficiency and smaller computational complexity. The ADConv module introduces more computational complexity compared to the GSConv module, but it is still within an acceptable real-time detection range. The ADConv module has achieved improvements in precision and recall. There are also significant advantages in indicators mAP@0.5 and mAP@0.5:0.95. Although the ADConv module is not as fast as the DyConv module, it surpasses the DyConv module in other performance aspects.

4.5.2. Attention Module Ablation Experiment

To verify the effectiveness of the hybrid attention module (HAM) proposed in this article, we used YOLOv8 as the baseline and introduced the CA, GAM, EMA, BAM, and HAM attention modules in turn for comparison. From Table 4, it can be seen that the HAM outperforms the other attention mechanisms in terms of mAP@0.5, F1-Score, and recall. Compared to the GAM module, the overall performance of the HAM is better: although its precision and mAP@0.5:0.95 are slightly lower, its missed-detection rate is lower. The HAM achieves a good balance between precision and recall. The results indicate that the HAM exhibits better performance balance and practicality on the magnetic tile surface defect dataset than the other attention mechanisms and is more reliable in practical applications.

4.5.3. Loss Function Ablation Experiment

In this experiment, we replaced CIoU with multiple typical loss functions based on the YOLOv8 algorithm. As shown in Table 5, our mAP@0.5 is 0.935 and our F1-Score is 0.90. Both of them are the highest among all loss functions. It demonstrates that the Focal-EIoU loss function has excellent comprehensive performance. The precision of our method reached 0.951 and surpasses other loss functions. We can see that it has outstanding detection accuracy. The recall is 0.868, which is relatively high. It indicates that the loss function is more stable in positive and negative sample classification and has good target recognition ability.

4.5.4. Overall Ablation Experiment

To evaluate the contribution of each module in YOLOv8-AHF, the ADConv, HAM, Focal-EIoU, ADConv+Focal-EIoU, and ADConv+HAM+Focal-EIoU configurations were introduced successively into the original YOLOv8 algorithm. The last configuration is the final model adopted in this article. All experiments used the same dataset and the same training strategy. The experimental results are shown in Table 6 and Figure 8. The ADConv+HAM+Focal-EIoU configuration achieved the strongest detection performance on multiple evaluation metrics and improved significantly over the YOLOv8 baseline. Although its F1-Score matches that of the HAM alone, the combination still raises precision and recall. In summary, the ADConv module enhances feature extraction capability; the HAM attention mechanism strengthens key-region perception and improves the F1-Score and mAP; and Focal-EIoU focuses on difficult samples and improves regression stability and high-quality prediction. Their combination achieves a comprehensive improvement in detection performance and generalization ability.

4.6. Generalization Experiment

To further validate the performance of the YOLOv8-AHF algorithm, we compared it against YOLOv8 on both the magnetic-tile-defect dataset and the PKU-Market-PCB dataset; the comparison is shown in Table 7. YOLOv8-AHF significantly improves every evaluation metric on both datasets, confirming that the algorithm can effectively improve defect detection performance and meet the needs of industrial defect detection.

5. Conclusions

To address the low accuracy of surface defect detection on motor magnetic tiles and the difficulty of detecting small targets, this paper proposes the YOLOv8-AHF algorithm. In the backbone, the ADConv module captures structural feature information during feature extraction to obtain richer feature maps; the hybrid attention module (HAM) models both the channel and spatial attention dimensions; and the optimized Focal-EIoU loss function adjusts the model's attention to high-quality samples. We conducted multiple comparative and ablation experiments on the magnetic-tile-defect dataset. YOLOv8-AHF reached an mAP@0.5 of 0.962, an mAP@0.5:0.95 of 0.682, an F1-Score of 0.93, and a recall of 0.943, all higher than the relevant mainstream object detection algorithms, demonstrating its excellent detection performance and stability on small surface defects of magnetic tiles. Compared with SSD and RT-DETR, YOLOv8-AHF achieved higher detection performance with minimal parameters and computational complexity, striking a good balance between speed and accuracy. Generalization experiments on the magnetic-tile-defect and PKU-Market-PCB datasets, using YOLOv8 as the baseline, showed improvements in mAP@0.5, mAP@0.5:0.95, F1-Score, precision, and recall on both datasets to varying degrees. These experiments further confirm that this research has promising application prospects for detecting small surface defects on industrial products.

Author Contributions

All of the authors provided significant contributions to the work. Methodology, C.M.; experiments and analysis, C.M. and J.C.; writing—original draft preparation, review and editing, C.M., Y.P. and J.C.; visualization, C.M.; funding acquisition, C.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Key Project of Natural Science Research in Anhui Province’s Universities under grant agreement No. 2022AH051913. The project was funded by Anhui Province in China.

Data Availability Statement

The Magnetic-Tile-Defect dataset and PKU-Market-PCB dataset were analyzed in the article. They were collected by the Chinese Academy of Sciences and Open Lab on Human Robot Interaction in Peking University, which were obtained from https://github.com/Charmve/Surface-Defect-Detection/tree/master/Magnetic-Tile-Defect (accessed on 18 March 2025) and https://robotics.pkusz.edu.cn/resources/dataset/ (accessed on 18 March 2025).

Acknowledgments

I would like to express my sincere gratitude to the many people who helped bring this paper to fruition. First, I would like to thank my tutor; I am deeply grateful for his help, professionalism, and valuable guidance throughout this work. I would also like to thank my friends, colleagues, and family, who helped me accomplish this paper.

Conflicts of Interest

The authors declare no conflicts of interest.

Correction Statement

This article has been republished with a minor correction to resolve spelling and grammatical errors. This change does not affect the scientific content of the article.

Abbreviations

The following abbreviations are used in this manuscript:
ADConv: Combination of Atrous Convolution and Depthwise Separable Convolution
AlphaIoU: Alpha Intersection over Union
AP: Average Precision
BAM: Bottleneck Attention Module
BN: Batch Normalization
CA: Channel Attention
CBS: Conv-BN-SILU
CIoU: Complete Intersection over Union
CSP: Cross-Stage Partial
DIoU: Distance Intersection over Union
DyConv: Dynamic Convolution
EIoU: Efficient Intersection over Union
EMA: Efficient Multi-Scale Attention
FN: False Negative
Focal-EIoU: Focal-Enhanced Intersection over Union
FP: False Positive
FPS: Frames Per Second
GAM: Global Attention Mechanism
GFLOPS: Giga Floating Point Operations Per Second
GSConv: Group Spatial Convolution
HAM: Hybrid Attention Module
IoU: Intersection over Union
mAP: Mean Average Precision
NSCT: Non-Subsampled Contourlet Transform
ODConv: Omni-Dimensional Dynamic Convolution
PR: Precision-Recall
SILU: Sigmoid Linear Unit
SPDConv: Space-to-Depth Convolution
SPPF: Spatial Pyramid Pooling Fast
TP: True Positive
YOLO: You Only Look Once

References

  1. Li, X.; Jiang, H.; Liu, P.; Yin, G. Non-downsampling Contourlet domain adaptive threshold magnetic tile surface defect detection. J. Comput.-Aided Des. Comput. Graph. 2014, 26, 553–558. [Google Scholar]
  2. Lin, L.J.; Yin, Y.; He, M.G.; Yin, X.Y.; Yin, G.F. Edge detection algorithm for magnetic tile crack defects based on wavelet modulus maximum. J. Univ. Electron. Sci. Technol. China. 2015, 44, 283–288. [Google Scholar]
  3. Gao, Q.Q. Research on Visual Inspection Technology for Surface Quality of Magnetic Tile. Ph.D. Thesis, Shandong University of Technology, Zibo, China, 2018. [Google Scholar]
  4. Li, Z.H. Research on Visual Inspection System and Algorithm for Surface Defects of Polished Bricks. Ph.D. Thesis, Guangdong University of Technology, Guangzhou, China, 2021. [Google Scholar]
  5. Ding, L.F.; Zeng, S.L. Surface defect detection of motor magnetic tiles based on multiple attention mechanisms. Comput. Technol. Development 2022, 32, 194–199. [Google Scholar]
  6. Hu, H.; Li, J.F.; Shen, J.M. Research on micro defect detection method for small magnetic tile surface based on machine vision. Mech. Electr. Engineering 2019, 36, 117–123. [Google Scholar]
  7. Wang, L.F. Research on Modeling and Quantitative Diagnosis Method of Impact Characteristics of Rolling Bearing Defects. Master’s Thesis, Chongqing University, Chongqing, China, 2021. [Google Scholar]
  8. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar]
  9. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105. [Google Scholar] [CrossRef]
  10. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  11. Szegedy, C.; Liu, W.; Jia, Y.Q.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper with Convolutions. arXiv 2014, arXiv:1409.4842. [Google Scholar] [CrossRef]
  12. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. arXiv 2016, arXiv:1512.03385. [Google Scholar]
  13. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv 2017, arXiv:1704.04861. [Google Scholar] [CrossRef]
  14. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. MobileNetV2: Inverted Residuals and Linear Bottlenecks. arXiv 2019, arXiv:1801.04381. [Google Scholar]
  15. Tan, M.; Le, Q. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. arXiv 2020, arXiv:1905.11946. [Google Scholar] [CrossRef]
  16. Tan, M.; Le, Q. EfficientNetV2: Smaller Models and Faster Training. arXiv 2021, arXiv:2104.00298. [Google Scholar]
  17. Liu, J.H.; Dong, J.X.; Wang, N.N.; Fang, H.H. Application analysis of pixel level segmentation and measurement algorithm for pavement cracks based on Crack Mask R-CNN model. Domest. Foreign Highw. 2023, 43, 47–52. [Google Scholar]
  18. Gao, M.Y. Research on Wood Knot Defect Detection Based on ResNet Convolutional Neural Network. Master’s Thesis, Northeast Forestry University, Harbin, China, 2022. [Google Scholar]
  19. Shen, D.M.; Liu, X.; Shang, Y.F.; Tang, X. Intelligent recognition method for underground drainage pipeline defects based on improving ResNet. Intell. Comput. Appl. 2024, 14, 92–98. [Google Scholar]
  20. Wang, W.J. Research on Surface Defect Detection of Steel Billets Based on Cross Scale Cross Weighted Feature Fusion Network. Master’s Thesis, Hefei University of Technology, Hefei, China, 2023. [Google Scholar]
  21. Dai, Y. Research on Surface Defect Detection and Localization of Industrial Images Based on Generative Adversarial Networks. Master’s Thesis, China West Normal University, Nanchong, China, 2024. [Google Scholar]
  22. Yu, S.; Xia, Y.; Guo, P.W.; Hou, R.X.; Zhang, Y.B.; Zhou, Z. A defect detection method for solar cells based on deep convolutional neural networks. J. Sens. Technol. 2023, 36, 1165–1170. [Google Scholar]
  23. He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1904–1916. [Google Scholar] [CrossRef] [PubMed]
  24. Liu, G.; Yang, N.; Guo, L.; Guo, S.; Chen, Z. A one-stage approach for surface anomaly detection with background suppression strategies. Sensors 2020, 20, 01829. [Google Scholar] [CrossRef] [PubMed]
  25. Wang, X.; Wang, Y.; Xu, X.; Yan, F.; Zeng, Z. Two-stage deep neural network with joint loss and multi-level representations for defect detection. J. Electron. Imaging 2022, 31, 063060. [Google Scholar] [CrossRef]
  26. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 779–788. [Google Scholar]
  27. Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 6517–6525. [Google Scholar]
  28. Redmon, J.; Farhadi, A. YOLOv3: Incremental Improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar] [CrossRef]
  29. Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934. [Google Scholar] [CrossRef]
  30. Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. YOLOX: Exceeding YOLO Series in 2021. arXiv 2021, arXiv:2107.08430. [Google Scholar] [CrossRef]
  31. Li, C.; Li, L.; Jiang, H.; Weng, K.; Geng, Y.; Li, L.; Ke, Z.; Li, Q.; Cheng, M.; Nie, W.; et al. YOLOv6: Single-Stage Object Detection Framework for Industrial Applications. arXiv 2022, arXiv:2209.02976. [Google Scholar]
  32. Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv 2022, arXiv:2207.02696. [Google Scholar]
  33. Yu, F.; Koltun, V.; Funkhouser, T. Dilated Residual Networks. arXiv 2017, arXiv:1705.09914. [Google Scholar] [PubMed]
  34. Chollet, F. Xception: Deep learning with depthwise separable convolutions. arXiv 2017, arXiv:1610.02357. [Google Scholar] [CrossRef]
  35. Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. arXiv 2018, arXiv:1807.06521. [Google Scholar] [CrossRef]
  36. Zheng, Z.; Wang, P.; Liu, W.; Li, J.; Ye, R.; Ren, D. Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression. arXiv 2019, arXiv:1911.08287. [Google Scholar] [CrossRef]
  37. Zhang, Y.F.; Ren, W.; Zhang, Z.; Jia, Z.; Wang, L.; Tan, T. Focal and Efficient IOU Loss for Accurate Bounding Box Regression. arXiv 2022, arXiv:2101.08158. [Google Scholar] [CrossRef]
  38. Huang, Y.; Qiu, C.; Yuan, K. Surface Defect Saliency of Magnetic Tile. Vis. Comput. 2020, 36, 85–96. [Google Scholar] [CrossRef]
Figure 1. Network structure of the magnetic tile surface defect detection model.
Figure 2. ADConv structure.
Figure 3. HAM structure.
Figure 4. Six types of magnetic tile surface images in the dataset.
Figure 5. Comparison chart of mAP for the above seven object detection algorithms (SSD, RT-DETR, YOLOv5, YOLOv8, YOLOv11, YOLOv12, YOLOv8-AHF). (a) mAP@0.5 comparison; (b) mAP@0.5:0.95 comparison.
Figure 6. Comparison chart of PR curve of seven algorithms. (a) The PR curve of SSD; (b) the PR curve of RT-DETR; (c) the PR curve of YOLOv5; (d) the PR curve of YOLOv8; (e) the PR curve of YOLOv11; (f) the PR curve of YOLOv12; (g) the PR curve of YOLOv8-AHF.
Figure 7. Comparison chart of detection effect. (a) Original images of defect data; (b) annotation image of defect data; (c) detection results using YOLOv8 algorithm; (d) detection results using YOLOv8-AHF algorithm.
Figure 8. Comparison chart of mAP with different modules added based on YOLOv8. (a) mAP@0.5 comparison; (b) mAP@0.5:0.95 comparison.
Table 1. Experimental operating environment.
| Experimental Environment | Configuration |
|---|---|
| CPU | AMD EPYC 9554 @ 3.00 GHz × 128 |
| GPU | NVIDIA RTX A6000 × 1 |
| Memory | 125 GiB |
| Operating System | Ubuntu 22.04.1 LTS ×64 (5.15.0-67-generic) |
| Deep Learning Computing Platform | CUDA 11.8 |
| Deep Learning Framework | PyTorch 2.1.2 |
| Compiler Language | Python 3.10.8 |
Table 2. Performance comparison of several typical object detection algorithms.
| Algorithm | mAP@0.5 | mAP@0.5:0.95 | F1-Score | Precision | Recall | Parameters | GFLOPs | FPS |
|---|---|---|---|---|---|---|---|---|
| SSD | 0.887 | 0.596 | 0.88 | 0.96 | 0.803 | 26,375,621 | 31.6 | 52.51 |
| RT-DETR | 0.89 | 0.558 | 0.83 | 0.851 | 0.823 | 31,994,015 | 103.5 | 24.77 |
| YOLOv5 | 0.922 | 0.656 | 0.90 | 0.878 | 0.919 | 2,503,919 | 7.1 | 84.37 |
| YOLOv8 | 0.903 | 0.63 | 0.88 | 0.874 | 0.885 | 3,006,623 | 8.1 | 75.68 |
| YOLOv11 | 0.954 | 0.656 | 0.91 | 0.917 | 0.897 | 2,583,127 | 6.3 | 68.32 |
| YOLOv12 | 0.934 | 0.585 | 0.87 | 0.86 | 0.871 | 2,509,319 | 5.8 | 85.28 |
| YOLOv8-AHF (ours) | 0.962 | 0.682 | 0.93 | 0.924 | 0.943 | 3,051,829 | 8.5 | 65.55 |
Table 3. Convolutional module ablation experiment results.
| Module | mAP@0.5 | mAP@0.5:0.95 | F1-Score | Precision | Recall | Parameters | GFLOPs | FPS |
|---|---|---|---|---|---|---|---|---|
| - | 0.903 | 0.630 | 0.88 | 0.874 | 0.885 | 3,006,623 | 8.1 | 75.68 |
| GSConv | 0.934 | 0.672 | 0.89 | 0.907 | 0.890 | 2,816,767 | 7.7 | 77.76 |
| SPDConv | 0.921 | 0.663 | 0.91 | 0.968 | 0.871 | 4,182,959 | 7.6 | 79.52 |
| DyConv | 0.930 | 0.655 | 0.91 | 0.939 | 0.879 | 4,190,431 | 7.2 | 81.60 |
| ODConv | 0.956 | 0.651 | 0.91 | 0.922 | 0.909 | 5,733,850 | 15.7 | 37.44 |
| ADConv (ours) | 0.955 | 0.698 | 0.94 | 0.956 | 0.917 | 3,097,887 | 8.2 | 68.31 |
Table 4. Attention mechanism ablation experiment results.
| Module | mAP@0.5 | mAP@0.5:0.95 | F1-Score | Precision | Recall |
|---|---|---|---|---|---|
| - | 0.903 | 0.63 | 0.88 | 0.874 | 0.885 |
| CA | 0.911 | 0.651 | 0.86 | 0.918 | 0.808 |
| GAM | 0.917 | 0.677 | 0.91 | 0.965 | 0.855 |
| EMA | 0.916 | 0.642 | 0.89 | 0.947 | 0.851 |
| BAM | 0.916 | 0.665 | 0.91 | 0.928 | 0.888 |
| HAM (ours) | 0.931 | 0.658 | 0.93 | 0.912 | 0.894 |
Table 5. Loss function ablation experiment results.
| Loss Function | mAP@0.5 | mAP@0.5:0.95 | F1-Score | Precision | Recall |
|---|---|---|---|---|---|
| - | 0.903 | 0.63 | 0.88 | 0.874 | 0.885 |
| IoU | 0.926 | 0.634 | 0.87 | 0.883 | 0.864 |
| DIoU | 0.888 | 0.629 | 0.85 | 0.876 | 0.824 |
| EIoU | 0.929 | 0.64 | 0.89 | 0.884 | 0.897 |
| AlphaIoU | 0.871 | 0.495 | 0.76 | 0.777 | 0.757 |
| Focal-EIoU (ours) | 0.935 | 0.632 | 0.90 | 0.951 | 0.868 |
Table 6. Overall ablation experiment results.
| Module | mAP@0.5 | mAP@0.5:0.95 | F1-Score | Precision | Recall |
|---|---|---|---|---|---|
| - | 0.903 | 0.63 | 0.88 | 0.874 | 0.885 |
| ADConv | 0.955 | 0.668 | 0.92 | 0.956 | 0.917 |
| HAM | 0.931 | 0.658 | 0.93 | 0.912 | 0.894 |
| Focal-EIoU | 0.935 | 0.632 | 0.87 | 0.883 | 0.864 |
| ADConv+Focal-EIoU | 0.95 | 0.671 | 0.91 | 0.917 | 0.906 |
| ADConv+HAM+Focal-EIoU (YOLOv8-AHF) | 0.962 | 0.682 | 0.93 | 0.924 | 0.943 |
Table 7. Generalization experiment results.
| Dataset | Algorithm | mAP@0.5 | mAP@0.5:0.95 | F1-Score | Precision | Recall |
|---|---|---|---|---|---|---|
| Magnetic-Tile-Defect | YOLOv8 | 0.903 | 0.630 | 0.88 | 0.874 | 0.885 |
| Magnetic-Tile-Defect | YOLOv8-AHF (ours) | 0.962 | 0.682 | 0.93 | 0.924 | 0.943 |
| PKU-Market-PCB | YOLOv8 | 0.908 | 0.458 | 0.88 | 0.916 | 0.841 |
| PKU-Market-PCB | YOLOv8-AHF (ours) | 0.955 | 0.525 | 0.93 | 0.943 | 0.914 |

Share and Cite

MDPI and ACS Style

Ma, C.; Pan, Y.; Chen, J. Surface Defect Detection of Magnetic Tiles Based on YOLOv8-AHF. Electronics 2025, 14, 2857. https://doi.org/10.3390/electronics14142857

