EH-YOLO: Dimensional Transformation and Hierarchical Feature Fusion-Based PCB Surface Defect Detection

Deng, Chengzhi; Zhang, You; Wu, Zhaoming; Wu, Yingbo; Sun, Xiaowei; Wang, Shengqian

doi:10.3390/app152010895


by Chengzhi Deng *, You Zhang, Zhaoming Wu, Yingbo Wu, Xiaowei Sun and Shengqian Wang

School of Information Engineering, Jiangxi University of Water Resources and Electric Power, Nanchang 330099, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(20), 10895; https://doi.org/10.3390/app152010895

Submission received: 17 September 2025 / Revised: 4 October 2025 / Accepted: 6 October 2025 / Published: 10 October 2025


Abstract

Small surface defects in printed circuit boards (PCBs) severely affect the reliability of electronic devices, making PCB surface defect detection crucial for ensuring the quality of electronic products. However, existing detection methods often struggle with insufficient accuracy and the inherent trade-off between detection precision and inference speed. To address these problems, we propose a novel ESDM-HNN-YOLO (EH-YOLO) network based on an improved YOLOv10 for efficient detection of small PCB defects. Firstly, an enhanced spatial-depth module (ESDM) is designed, which transforms spatial-dimensional features into depth-dimensional representations while integrating a spatial attention module (SAM) and a channel attention module (CAM) to highlight critical features. This dual mechanism not only effectively suppresses feature loss in micro-defects but also significantly enhances detection accuracy. Secondly, a hybrid neck network (HNN) is designed, which optimizes the speed–accuracy balance through a hierarchical architecture. The hierarchical structure uses a computationally efficient weighted bidirectional feature pyramid network (BiFPN) to enhance multi-scale feature fusion of small objects in the shallow layers and uses a path aggregation network (PAN) to prevent feature loss in the deeper layers. Comprehensive evaluations on benchmark datasets (PCB_DATASET and DeepPCB) demonstrate the superior performance of EH-YOLO, achieving mAP@50-95 scores of 45.3% and 78.8% with inference speeds of 166.67 FPS and 158.73 FPS, respectively. These results significantly outperform existing approaches in both accuracy and processing efficiency.

Keywords: deep learning; defect detection; printed circuit board (PCB); YOLOv10

1. Introduction

The relentless trend toward electronic miniaturization imposes ever-higher quality demands on printed circuit board (PCB) fabrication processes. As illustrated in Figure 1, defects inevitably occur during PCB fabrication, which underscores the importance of advanced detection methods for maintaining stringent quality and functional standards [1]. Traditional manual inspection is plagued by low efficiency and subjective variability, while conventional machine learning approaches are fundamentally limited by their dependence on handcrafted features and poor generalizability across diverse production scenarios [2]. These challenges have propelled deep learning-based methods to the forefront of PCB defect detection research, offering superior performance through automated feature learning and enhanced adaptability to complex industrial environments.

Existing deep learning-based methods can generally be divided into two-stage and one-stage networks. Two-stage detectors (e.g., Faster R-CNN [3] and Mask R-CNN [4]) first generate candidate regions and then classify and refine them; their high computational complexity makes it difficult to meet the needs of real-time detection. In contrast, one-stage detectors (including the YOLO series [5,6,7,8,9,10,11,12,13] and RetinaNet [14]) utilize an end-to-end detection framework that delivers substantially faster inference suitable for real-time industrial applications. To overcome the limitations of these baseline methods, researchers have proposed various improved versions specifically for PCB defect detection. Ding et al. [15] proposed an improved Faster R-CNN framework incorporating k-means clustering [16] for adaptive anchor box generation, which significantly boosts small-target detection performance. Lian et al. [17] introduced a geometric attention mechanism into the Mask R-CNN architecture to improve segmentation accuracy. Yuan et al. [18] proposed a modified YOLOv5 framework that incorporates a HorNet backbone network and multi-convolution attention modules to improve small-defect feature extraction. Shao et al. [19] proposed a RetinaNet-based detection system utilizing ResNet as the feature extraction backbone. Although these methods have achieved notable progress, PCB defect detection still faces the following challenges:

(1): Difficulty in detecting small defects: The multi-scale convolutional downsampling operations in feature extraction networks inevitably degrade critical feature information for small defects [20], which adversely affects the detection precision for small defects.
(2): Background interference: The relentless miniaturization of electronic components necessitates ultra-dense PCB trace configurations, where the elevated complexity of background patterns and their inherent resemblance to defect features create substantial challenges for accurate defect–background differentiation.
(3): Trade-off between accuracy and speed: Existing PCB defect detection methods face a fundamental trade-off between accuracy and efficiency, where improvements in detection precision typically come at the expense of increased model complexity and computational overhead, which makes it challenging to achieve optimal performance in both dimensions simultaneously.

To address the above issues, we improve and optimize the feature extraction and feature fusion stages of YOLOv10, respectively. The main contributions of this paper are as follows:

(1): A novel EH-YOLO framework based on an improved YOLOv10 is proposed, which keeps the network model lightweight while improving both detection accuracy and speed.
(2): In the backbone of the network, we design the ESDM, which converts the spatial information of the input features into channel information through the dual mechanism of dimension transformation and attention. The model can thus obtain deep semantics while retaining shallow semantic features, so as to improve the detection accuracy of PCB defects.
(3): In the neck of the network, we design the HNN, which balances inference speed and detection accuracy by combining the advantages of the weighted bidirectional feature pyramid network (BiFPN) [21] and the path aggregation network (PAN) [22] to refine the roles of different feature fusion layers.

The rest of the paper is organized as follows. Section 2 briefly describes the advantages of YOLOv10 for PCB defect detection and Section 3 details the proposed EH-YOLO model. Section 4 validates the algorithm of this paper on public datasets and compares its performance with other algorithmic models.

2. Related Work

Owing to its efficient end-to-end processing framework, the YOLO architecture series has become a prominent solution for PCB defect detection, offering both accuracy and computational efficiency. Tang et al. [23] optimized the backbone network of YOLOv5 to enhance feature extraction, while implementing an efficient intersection over union (EIoU) loss function to optimize bounding box regression for enhanced small defect localization. Xiao et al. [24] enhanced the YOLOv7-tiny backbone and neck components by incorporating a coordinate attention (CA) mechanism that refines spatial and channel features to achieve higher PCB detection accuracy. Lou et al. [25] optimized the SPP module in YOLOv7 by replacing the serial channels with concurrent channels to improve the fusion speed of the image features. Xiong et al. [26] employed GhostNet and HGNetV2 as the backbone network of YOLOv8, which reduces the number of model parameters and improves model inference speed. These approaches have achieved relatively good performance, but they do not effectively solve the loss of feature information for small defects or the inherent trade-off between detection precision and computational efficiency.

YOLOv10, as an advanced one-stage detection framework, achieves remarkable breakthroughs in the trade-off between detection accuracy and inference speed through two key innovations. Firstly, it introduces a dual-assignment strategy that eliminates the need for non-maximum suppression (NMS) [27], significantly reducing model inference latency. Secondly, the framework adopts an efficiency–accuracy co-design paradigm, which enhances both detection precision and processing speed. This dual optimization renders YOLOv10 particularly advantageous for PCB inspection applications. Li et al. [28] proposed a YOLOv10-based framework that enhances the ability to capture PCB defect features through a fine-grained feature enhancement approach with a dynamic weighting mechanism. Zheng et al. [29] combined omni-dimensional dynamic convolution with optimized bottleneck structures, leveraging multi-dimensional kernel characteristics to strengthen hierarchical feature extraction. Li et al. [30] proposed a lightweight detection model named ASF-YOLO, which uses the ADown module to recognize defects of different sizes and types while effectively reducing the number of parameters. Liao et al. [31] proposed the YOLOv10n-SFDC model, which incorporates the DualConv module, the SlimFusionCSP module, and the Shape-IoU loss function to improve detection accuracy. The aforementioned research has achieved certain progress in PCB defect detection and offers a theoretical foundation for our work. However, there remains room for improvement in both detection accuracy and efficiency. Therefore, we optimize the backbone and neck networks of YOLOv10.

Feature dimensionality reduction and transformation techniques have a long-standing history in computer vision. Traditional methods such as PCA have primarily focused on data visualization. In recent years, research trends in this field have increasingly emphasized the preservation of feature structure and the interpretability of representations, which aligns closely with the design philosophy of our ESDM: we are similarly committed to retaining as much of the structural information critical for detection tasks as possible during spatial downsampling.

Regarding feature analysis, the MING method proposed by Colange et al. [32] provides interpretable support tools for the visual exploration of multidimensional data, enabling the identification of still-entangled categories or size clusters within the feature space. Furthermore, their neighborhood graph superposition technique [33] offers methodological support for assessing the quality of intermediate feature representations by quantifying distortion during dimensionality reduction. These techniques provide potential analytical tools for verifying whether the ESDM maintains crucial neighborhood relationships between defective and non-defective regions throughout the feature transformation process, thereby offering theoretical grounding for architectural design and supporting decision-making.

3. Methodology

3.1. Overall Framework of EH-YOLO

The framework of the proposed EH-YOLO network is shown in Figure 2, which includes the backbone network responsible for feature extraction, the neck network responsible for feature fusion, and the detection head responsible for detection.

In the backbone network of YOLOv10, downsampling based on multi-stride convolutions results in feature degradation, especially for small PCB defect features in deep layers. To address this limitation, we abandon the traditional multi-stride convolutional downsampling and instead integrate an ESDM with a scaling factor of 2 after the convolutional layer, which enables downsampling to preserve PCB defect feature information through dimensional transformations instead of multi-stride convolutions.

The neck network in YOLOv10 utilizes four PAN-based feature fusion layers. This architecture introduces computational overhead that prevents optimal balance between detection precision and processing speed. To resolve this imbalance, we develop the HNN (as shown in Figure 2), which refines the division of labor among feature fusion layers. This architecture enables shallow networks to better integrate feature information of small defects, while allowing deeper networks to avoid unnecessary computational overhead. As a result, the proposed architecture maintains competitive detection accuracy while achieving significant improvements in inference speed.

3.2. Description of ESDM

The multi-stage convolutional downsampling process leads to progressive loss of fine-grained defect features, particularly affecting the detection accuracy for small PCB defects. To resolve this issue, we propose the ESDM to avoid losing the feature information of small PCB defects. As shown in Figure 3, the ESDM comprises spatial preprocessing, deep integration, and channel processing. These three components cooperate to transform the spatial information of the input data into channel information through a sampling and reorganization method. In this way, the ESDM achieves the same effect as downsampling without losing information.

3.2.1. Spatial Preprocessing Module

This module selectively highlights critical spatial regions in the input feature maps to enhance feature discriminability by a spatial attention module (SAM) [34].

Assuming that the input is $X \in \mathbb{R}^{H \times W \times C}$, the spatial preprocessing can be expressed by the following equations:

$$\omega_{S} = F_{SAM}(X) = \sigma\left(f^{7 \times 7}\left([\mathrm{AvgPool}(X); \mathrm{MaxPool}(X)]\right)\right) \tag{1}$$

$$X_{SP} = \omega_{S} \cdot X \tag{2}$$

where $F_{SAM}(\cdot)$ represents the SAM processing function, $\omega_{S}$ is the weight matrix, $\sigma$ is the sigmoid activation function, $f^{7 \times 7}(\cdot)$ is the $7 \times 7$ convolution, and $X_{SP} \in \mathbb{R}^{H \times W \times C}$ is the output of the spatial preprocessing.
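Equations (1) and (2) can be illustrated with a minimal NumPy sketch. It assumes, as in common spatial attention designs, that the average and max pooling are taken along the channel axis so that the 7 × 7 convolution sees a two-channel map; the function names and the kernel argument are hypothetical, and in a trained network the kernel would be learned rather than supplied by hand.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def spatial_attention(x, kernel):
    """Sketch of Eqs. (1)-(2) for a feature map x of shape (H, W, C).

    kernel has shape (k, k, 2) and stands in for the learned 7x7 convolution
    (hypothetical argument; k = 7 in the text).
    """
    # Assumption: pooling is taken along the channel axis, giving two (H, W) maps.
    pooled = np.stack([x.mean(axis=2), x.max(axis=2)], axis=2)  # (H, W, 2)
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(pooled, ((pad, pad), (pad, pad), (0, 0)))
    # All k x k neighbourhoods, shape (H, W, 2, k, k).
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k), axis=(0, 1))
    logits = np.einsum("hwcij,ijc->hw", windows, kernel)
    omega = sigmoid(logits)              # weight matrix omega_S, shape (H, W)
    return omega[..., None] * x, omega   # X_SP keeps the input shape (H, W, C)
```

With an all-zero kernel every weight equals 0.5, so the output is simply half the input, which makes a convenient sanity check of the shapes and the broadcasting.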

3.2.2. Deep Integration Module

This module replaces multi-stride convolutional downsampling by converting spatial information into depth information through a sampling and recombination mechanism, which prevents the loss of fine-grained feature details while strengthening the capability of feature extraction. Firstly, the input feature map $X_{SP}$ undergoes sampling (SP) with ratio $S$ (as shown in Figure 4), generating $S^{2}$ corresponding feature sub-maps $\{X_{S-N, S-N} \mid N \in [1, S]\}$, where each $X_{S-N, S-N} \in \mathbb{R}^{\frac{H}{S} \times \frac{W}{S} \times C}$. Secondly, the SAM processes each feature sub-map along the spatial dimension to generate a parameter matrix, which is then element-wise multiplied with the original feature sub-map to produce new feature sub-maps $X_{S-N} \in \mathbb{R}^{\frac{H}{S} \times \frac{W}{S} \times C}$. Finally, the new feature sub-maps are concatenated along the depth dimension to obtain $X_{DI} \in \mathbb{R}^{\frac{H}{S} \times \frac{W}{S} \times S^{2} \times C}$.

Assuming that the input is $X_{SP}$, the deep integration can be expressed by the following equations:

$$f_{S}(X_{SP}) = \{X_{S-1, S-1}, X_{S-2, S-1}, \dots, X_{0, 0}\} \tag{3}$$

$$\begin{cases} X_{0} = F_{SAM}(X_{0, 0}) \cdot X_{0, 0} \\ \quad \vdots \\ X_{S^{2}-2} = F_{SAM}(X_{S-2, S-1}) \cdot X_{S-2, S-1} \\ X_{S^{2}-1} = F_{SAM}(X_{S-1, S-1}) \cdot X_{S-1, S-1} \end{cases} \tag{4}$$

$$X_{DI} = \mathrm{cat}(X_{0}, X_{1}, \dots, X_{S^{2}-1}) \tag{5}$$

where $f_{S}(\cdot)$ represents the sampling function, $S$ is the sampling ratio, $X_{S-i, S-i}$ $(i < S)$ is the $i^{2}$-th feature sub-map, and $\mathrm{cat}(\cdot)$ represents the depth-wise concatenation operator.

Downsampling operations generate output through methods such as computing the average or maximum values of surrounding pixels, which often leads to dilution of the original pixel values and loss of detail. In contrast, the SP operation employed by the Deep Integration Module applies a dimensional transformation to the surrounding pixels and then concatenates them as an output, effectively avoiding pixel value dilution. As a result, this approach better preserves the original pixel information and is particularly beneficial in preventing the loss of subtle defects during feature extraction.
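The contrast between pooling and the SP operation can be made concrete with a small NumPy sketch. It assumes that SP takes every $S$-th pixel starting from each of the $S^{2}$ offsets and that the sub-maps are concatenated in row-major offset order; the function names and the optional `attend` hook (standing in for the per-sub-map SAM) are hypothetical.

```python
import numpy as np


def sp_sample(x, s):
    """Split x of shape (H, W, C) into s*s sub-maps of shape (H/s, W/s, C)."""
    h, w, _ = x.shape
    if h % s or w % s:
        raise ValueError("H and W must be divisible by the sampling ratio")
    # Assumed ordering: row offset i first, then column offset j.
    return [x[i::s, j::s, :] for i in range(s) for j in range(s)]


def deep_integration(x, s, attend=None):
    """Sample, optionally reweight each sub-map, then concatenate along depth."""
    subs = sp_sample(x, s)
    if attend is not None:  # hypothetical hook for the per-sub-map SAM
        subs = [attend(sub) for sub in subs]
    return np.concatenate(subs, axis=2)


def undo_sampling(y, s):
    """Inverse of deep_integration without attention: nothing was discarded."""
    h, w, depth = y.shape
    c = depth // (s * s)
    x = np.empty((h * s, w * s, c), dtype=y.dtype)
    for k in range(s * s):
        i, j = divmod(k, s)
        x[i::s, j::s, :] = y[:, :, k * c:(k + 1) * c]
    return x


x = np.arange(16, dtype=float).reshape(4, 4, 1)
y = deep_integration(x, 2)                            # shape (2, 2, 4)
pooled = x.reshape(2, 2, 2, 2, 1).mean(axis=(1, 3))   # 2x2 average pooling
```

In this toy example the top-left 2 × 2 block holds the values 0, 1, 4, and 5. Average pooling collapses them into the single value 2.5, whereas the SP output keeps all four values in the depth dimension and the original map can be recovered exactly.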

3.2.3. Channel Processing Module

This module enhances critical channel features and further improves the model’s performance by employing the channel attention module (CAM) [34] to process the outputs from deep integration.

Assuming that the input is $X_{DI}$, the channel processing can be expressed by the following equations:

$$\omega_{C} = F_{CAM}(X_{DI}) = \sigma\left(\mathrm{MLP}(\mathrm{AvgPool}(X_{DI})) + \mathrm{MLP}(\mathrm{MaxPool}(X_{DI}))\right) \tag{6}$$

$$X_{CP} = F_{CAM}(X_{DI}) \cdot X_{DI} \tag{7}$$

where $F_{CAM}(\cdot)$ represents the CAM processing function, $\mathrm{MLP}(\cdot)$ represents a nonlinear transformation, $\mathrm{AvgPool}(\cdot)$ denotes average pooling, $\mathrm{MaxPool}(\cdot)$ denotes max pooling, $\omega_{C}$ is the weight matrix, and $X_{CP} \in \mathbb{R}^{\frac{H}{S} \times \frac{W}{S} \times S^{2} \times C}$ is the output of the ESDM.
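Equations (6) and (7) can be sketched in NumPy as follows. The text only states that the MLP is a nonlinear transformation, so the shared two-layer perceptron with a ReLU and a reduced hidden width used here is an assumption, and the weight arguments `w1` and `w2` are hypothetical placeholders for learned parameters.

```python
import numpy as np


def channel_attention(x, w1, w2):
    """Sketch of Eqs. (6)-(7) for a feature map x of shape (H, W, C).

    w1 (C, C_hidden) and w2 (C_hidden, C) are the weights of an assumed
    shared two-layer MLP; in practice they would be learned.
    """
    avg = x.mean(axis=(0, 1))  # global average pooling -> (C,)
    mx = x.max(axis=(0, 1))    # global max pooling -> (C,)

    def mlp(v):
        # Assumed form of the nonlinear transformation: linear, ReLU, linear.
        return np.maximum(v @ w1, 0.0) @ w2

    omega = 1.0 / (1.0 + np.exp(-(mlp(avg) + mlp(mx))))  # omega_C, shape (C,)
    return omega * x, omega    # broadcast one weight per channel
```

Because the weights are applied per channel, the output keeps the shape of the input; only the relative emphasis of the channels produced by the deep integration changes.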

3.2.4. SP Operation

This section provides a detailed description of the SP operation. Figure 4 shows a schematic diagram of different sampling ratios. When the sampling ratio is $S$, the operation can be conceptually regarded as applying a square sampling kernel composed of $S^{2}$ sub-regions to the input feature map. The sampled pixels from each sub-region collectively form a new feature sub-map. Notably, by omitting the sampling interval, this design prevents critical feature information from being excluded during the sampling process.

3.3. Description of HNN

PCB defect detection inherently faces a fundamental performance trade-off between detection accuracy and speed. To resolve this challenge, we present HNN, which synergistically integrates the complementary strengths of PAN and BiFPN. The core innovation of HNN lies in its hierarchical feature fusion mechanism, which refines the functional specialization of different fusion layers. This design is grounded in two key observations.

3.3.1. YOLOv10’s Neck Architecture and Feature Degradation

The original four feature fusion layers in YOLOv10’s neck network suffer from progressive loss of defect-related features as the network deepens. Consequently, the shallow feature fusion layers are better suited for integrating fine-grained features of small targets, while the deeper feature fusion layers focus on aggregating coarse-grained features of larger targets.

3.3.2. PCB-Specific Defect Characteristics

In contrast to natural targets that exhibit a wide range of sizes, PCB defects are typically minuscule and require specialized detection approaches.

Based on these principles, the HNN adopts a dual-layer strategy. BiFPN is applied to the first two feature fusion layers; it improves small-target detection accuracy and reduces computational expense through a dynamic feature weighting mechanism, which strengthens the feature representation of small defects, and through cross-scale recursive feature fusion. PAN is employed in the latter two feature fusion layers, where its non-cross-layer network structure reduces the loss of small defect-related features.
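The dynamic feature weighting mentioned above is not spelled out in the text. A minimal sketch, assuming it follows the fast normalized fusion of the original BiFPN design (non-negative learnable weights normalized by their sum), is given below; the function name and the `eps` value are illustrative.

```python
import numpy as np


def weighted_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped feature maps with normalized non-negative weights.

    Assumed form: out = sum_i w_i * F_i / (eps + sum_j w_j), with w_i >= 0.
    """
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)  # keep weights >= 0
    norm = w / (eps + w.sum())
    return sum(n * f for n, f in zip(norm, features))


shallow = np.ones((4, 4, 8))         # toy stand-ins for two fusion inputs
deep = 3.0 * np.ones((4, 4, 8))
fused = weighted_fusion([shallow, deep], [1.0, 1.0])
```

With equal weights the result is close to the plain average of the inputs; during training the weights would shift toward whichever input carries more useful information for small defects.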

4. Experiments

4.1. Experimental Environment and Datasets

All experiments were conducted using the following software and hardware environment: a 64-bit Windows 10 operating system (Microsoft Corporation, Redmond, WA, USA), an NVIDIA GeForce RTX 4060 GPU (NVIDIA Corporation, Santa Clara, CA, USA), and a 12th Gen Intel(R) Core(TM) i5-12600KF CPU (Intel Corporation, Santa Clara, CA, USA). The deep learning framework utilized was PyTorch 1.10.0 with Python 3.9.0 as the programming language. The key training parameters were configured as follows: batch size was set to 32; the AdamW optimizer was employed; training was conducted for 300 epochs; input image size was maintained at 640 × 640; both initial learning rate (lr0) and final learning rate (lrf) were set to 0.01; weight decay was configured at 0.0005; random seed was fixed at 0; and exponential moving average (EMA = 0.9999) was enabled. The training process followed YOLOv10’s dual-label assignment strategy, with Wise-IoU employed as the loss function. For data augmentation, mosaic augmentation incorporating random affine transformations and random horizontal flipping was implemented, though this was disabled during the final 10 training epochs. During inference, preliminary filtering was performed using a confidence threshold of 0.001.

This study employs two public PCB defect detection datasets: PCB_DATASET (open-sourced by Peking University Intelligent Laboratory) and DeepPCB. As shown in Figure 5, these two datasets exhibit significant differences in image characteristics and detection difficulty. PCB_DATASET features complex image composition with strong background interference and generally small defect targets, presenting considerable detection challenges. In contrast, the DeepPCB dataset is characterized by uniform pixel distribution, low background noise, relatively larger defect sizes, and generally easier detection. Given that both datasets predominantly contain small defects, we further categorized the defects based on their relative sizes to enable more granular evaluation: missing hole, spurious copper, and short are classified as relatively medium-sized defects (denoted as AP_M), while open circuit, spur, and mouse bite are categorized as relatively smaller defects (denoted as AP_S). To comprehensively evaluate model generalization ability and prevent evaluation bias caused by circuit board-specific information leakage, we implemented a rigorous “held-out board” partitioning strategy for both datasets. This methodology ensures that all images originating from the same physical PCB appear exclusively in either the training, validation, or test set, thereby more accurately reflecting model performance on unseen PCB instances. Both datasets were divided into training, validation, and test sets following an 8:1:1 ratio. The validation set was used exclusively for hyperparameter tuning, while the test set was solely reserved for final performance assessment, ensuring unbiased and reliable results.
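The held-out board partitioning described above can be sketched in plain Python. The example assumes that each image can be mapped to an identifier of the physical board it came from; the mapping, the function name, and the seed argument are hypothetical, and the 8:1:1 ratio is applied to boards, so the image-level ratio is only approximately 8:1:1 when boards contribute different numbers of images.

```python
import random
from collections import defaultdict


def held_out_board_split(image_to_board, ratios=(0.8, 0.1, 0.1), seed=0):
    """Assign whole boards to train/val/test so no board spans two sets."""
    boards = sorted(set(image_to_board.values()))
    rng = random.Random(seed)
    rng.shuffle(boards)
    n_train = int(round(ratios[0] * len(boards)))
    n_val = int(round(ratios[1] * len(boards)))
    groups = {
        "train": set(boards[:n_train]),
        "val": set(boards[n_train:n_train + n_val]),
        "test": set(boards[n_train + n_val:]),
    }
    split = defaultdict(list)
    for image, board in sorted(image_to_board.items()):
        for name, members in groups.items():
            if board in members:
                split[name].append(image)
    return dict(split)


# Hypothetical data: 10 boards with 5 images each.
mapping = {f"board{b}_img{k}.jpg": f"board{b}" for b in range(10) for k in range(5)}
split = held_out_board_split(mapping)
```

Splitting by board rather than by image is what prevents board-specific information from leaking between the sets.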

4.2. Evaluation Index

This study employs recall (R), precision (P), mean average precision (mAP), and frames per second (FPS) for objective performance assessment. These metrics are derived from a binary classification confusion matrix, as shown in Table 1. T_P represents correctly predicted defective samples; F_N represents defective samples incorrectly predicted as normal; F_P represents normal samples misclassified as defective; and T_N represents correctly identified normal samples.

Recall (R) measures the model’s ability to detect defects, defined as:

$$R = \frac{T_{P}}{T_{P} + F_{N}}$$

Precision (P) measures the accuracy of the model prediction results, defined as:

$$P = \frac{T_{P}}{T_{P} + F_{P}}$$

The mAP computes the average prediction accuracy across all classes, defined as:

$$\mathrm{mAP} = \frac{1}{n} \sum_{i=1}^{n} AP_{i}$$

where $i$ denotes the $i$-th class and $n$ represents the total number of classes.

The FPS represents how many frames of images are processed per second, defined as:

$$\mathrm{FPS} = \frac{F_{n}}{T}$$

where $F_{n}$ represents the number of images to be processed, and $T$ represents the total time.
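The four definitions translate directly into code. The sketch below uses hypothetical counts and timings purely for illustration; the only link to the reported results is arithmetic, namely that 166.67 FPS and 158.73 FPS correspond to roughly 6.0 ms and 6.3 ms per image.

```python
def recall(tp, fn):
    """R = TP / (TP + FN); returns 0.0 when there are no defective samples."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp, fp):
    """P = TP / (TP + FP); returns 0.0 when nothing was predicted defective."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def mean_average_precision(ap_per_class):
    """mAP = mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)


def fps(num_images, total_seconds):
    """FPS = number of processed images / total processing time."""
    return num_images / total_seconds


# Hypothetical timings: 1000 images at 6.0 ms and 6.3 ms per image.
speed_a = round(fps(1000, 6.0), 2)
speed_b = round(fps(1000, 6.3), 2)
```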

4.3. Contrast Experiment

To validate the superior performance of the EH-YOLO model, we conducted comparative experiments with several state-of-the-art models under identical experimental conditions.

As shown in Table 2, our proposed method, EH-YOLO, achieves the highest accuracy among all models, with a mAP@50 of 91.6% and a mAP@50-95 of 45.3%. This represents a significant improvement over other strong models such as YOLOv11 (87.9%/43.1%) and YOLOv10 (84.4%/42.1%). More notably, EH-YOLO maintains balanced model complexity while achieving this performance. With only 3.548 M parameters and 12.3 G FLOPs, it is more efficient than larger models like Faster R-CNN (23.59 M/38.2 G) while delivering superior detection accuracy. These results validate EH-YOLO as a high-precision and efficient solution for PCB defect detection. Furthermore, the ESDM + YOLOv11n model in the table incorporates the ESDM into YOLOv11n and demonstrates superior detection performance compared to YOLOv11n, particularly in mAP@50-95. This finding indicates that the ESDM is transferable across different architectures.

To validate the practical performance of EH-YOLO in PCB defect detection, Figure 6 presents a comparative analysis with YOLOv10. With an input size of 640 × 640, YOLOv10 produces false detections on small defects such as open circuit and mouse bite, and also exhibits numerous missed detections across other defect categories. Under identical input conditions, EH-YOLO avoids both false detections and missed detections while maintaining high detection accuracy for all defect types. Furthermore, when the input size of YOLOv10 is increased to 1024 × 1024, the false-detection issue is partially mitigated, although some missed detections persist, and its overall detection accuracy remains lower than that of EH-YOLO. Based on this evidence, EH-YOLO demonstrates superior performance across the PCB defect detection tasks considered.

4.4. Ablation Experiment

To verify the performance improvements contributed by the ESDM and HNN, we designed an ablation experiment. Firstly, we tested the YOLOv10 baseline; then we added HNN and ESDM individually and verified whether each contributed to detection. Finally, we combined the two structures and checked whether they interfere with each other. As shown in Table 3, compared to the baseline YOLOv10 model, the introduction of HNN and ESDM yields 1.6% and 5.9% increases in mAP@50, respectively. In group 3, the introduction of ESDM increased the number of model parameters and decreased the FPS, but the FPS still reached 169.492, which meets the requirement of real-time detection. The experimental results of group 4 show that the ESDM effectively reduces the loss of small defect features and consequently improves feature fusion in the HNN architecture, which confirms their strong compatibility.

To quantitatively assess the performance advantages of EH-YOLO in detecting minute defects, we provide a detailed precision analysis categorized by defect size and type. As shown in Table 4, the results demonstrate that EH-YOLO achieves marked superiority over the baseline model YOLOv10 across all size categories. Specifically, it attains a 7.8% improvement in AP for small targets (AP_S), while achieving a notably greater enhancement of 13.4% for medium-sized targets (AP_M). These findings confirm that EH-YOLO not only elevates overall PCB detection accuracy but also particularly strengthens the capability to identify minute defects, thereby validating the rationality and effectiveness of our model design.

To demonstrate the generalization ability of the EH-YOLO model, we conducted ablation experiments on DeepPCB. The experimental results are shown in Table 5. Each improvement contributes to an increase in detection performance when YOLOv10 is enhanced following the methodology proposed in this work. Compared with YOLOv10, the mAP@50 and mAP@50-95 of EH-YOLO are increased by 1.2% and 7.6%, respectively. These results indicate that EH-YOLO generalizes well across datasets.

The above experiments demonstrate that HNN and ESDM exhibit excellent synergy, enabling EH-YOLO to significantly enhance detection accuracy while maintaining real-time performance. In addition, to further verify the feasibility and effectiveness of the various improvement schemes, a detailed experimental analysis of each improvement scheme will be conducted.

4.4.1. ESDM-Related Experiments

To more clearly validate the role of the ESDM in retaining fine defect features and suppressing PCB background interference, this study selects three types of PCB defects—open circuit, spur, and mouse bite, which are relatively small in size and exhibit significant background interference—for feature map visualization analysis. The results are presented in Figure 7. Among them, (a) depicts the feature map connected to the first detection head of YOLOv10, while (b) depicts the feature map connected to the corresponding detection head of YOLOv10 after incorporating the ESDM. The comparison reveals that the feature maps generated by YOLOv10 + ESDM exhibit superior overall completeness in retaining defect information compared to the original YOLOv10. Moreover, the edge and texture characteristics of the defect regions are significantly enhanced, facilitating clearer distinction between defects and the background. This demonstrates that the ESDM achieves effective preservation of fine defect features while simultaneously suppressing interference from complex backgrounds.

Finally, it should be noted that the feature maps visualized in this experiment are those connected to the detection heads. This is because the detection heads are responsible for the final detection tasks, and the information in their feature maps most faithfully represents the model’s ultimate capacity for retaining defect-related information.

To validate the synergy among dimensional transformation, SAM, and CAM within the ESDM, we conducted a further ablation experiment. Table 6 presents the results of YOLOv10 integrated with different ESDM configurations. In the table, group 1 represents ESDM without SAM in the spatial preprocessing; group 2 represents ESDM without SAM in the deep integration; group 3 represents ESDM without CAM in the channel processing; and group 4 represents the complete ESDM implementation. Since the area under the precision–confidence (P-C) curve effectively reflects model stability, Figure 8 gives the P-C curves of the above four structures. Groups 1 and 2 exhibit significantly lower values than group 4 across all evaluation metrics. Although group 3 achieves a 0.7% higher R value than group 4, it shows a 1% lower P, a 1.4% lower mAP@50, and a reduced area under the P-C curve. These experimental results confirm the strong synergistic effects among dimensional transformation, SAM, and CAM within the ESDM.
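The text does not specify how the area under the P-C curve is computed. One plausible reading is sketched below: precision is evaluated at a grid of confidence thresholds and integrated with the trapezoidal rule. The detection list is hypothetical, and treating precision as 1.0 when no detection survives a threshold is an assumed convention.

```python
def precision_confidence_curve(detections, thresholds):
    """Precision at each threshold; detections are (confidence, is_true_positive)."""
    curve = []
    for t in thresholds:
        kept = [is_tp for conf, is_tp in detections if conf >= t]
        # Assumed convention: precision is 1.0 when no detection is kept.
        curve.append(sum(kept) / len(kept) if kept else 1.0)
    return curve


def area_under_curve(xs, ys):
    """Trapezoidal integration of ys over xs."""
    return sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2 for i in range(len(xs) - 1))


# Hypothetical detections: (confidence score, matched a ground-truth defect?)
detections = [(0.9, True), (0.8, True), (0.6, False), (0.3, True)]
thresholds = [0.0, 0.25, 0.5, 0.75, 1.0]
curve = precision_confidence_curve(detections, thresholds)
area = area_under_curve(thresholds, curve)
```

Under this reading, a model whose precision stays high across the whole range of thresholds obtains an area close to 1, which is the sense in which a larger area indicates more stable predictions.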

4.4.2. HNN-Related Experiments

To validate the effectiveness of HNN in balancing detection accuracy and speed, we conducted comparative experiments by integrating three feature fusion architectures—PAN, BiFPN, and HNN—into the YOLOv10 baseline model. As shown in Figure 9, PAN produces feature maps that appear blurry with poorly defined defect features while exhibiting substantial background noise. In contrast, BiFPN generates sharper feature maps with more distinct defect characteristics and reduced background interference. Our proposed HNN produces even clearer feature maps. Table 7 reveals that PAN demonstrated the lowest efficiency, while our HNN achieved the highest accuracy of 86.0% mAP, surpassing BiFPN by 0.4%, with only minimal computational and temporal overhead (an increase of 0.2 G FLOPs and 0.3 ms latency). In summary, HNN achieves a superior balance between speed and accuracy.

4.4.3. Error and Robustness Analysis of EH-YOLO

To visually demonstrate the advantage of EH-YOLO in reducing false detections, Figure 10 compares the confusion matrices of YOLOv10 (a) and EH-YOLO (b). EH-YOLO exhibits no false detections for the spur and short categories, whereas YOLOv10 shows noticeable misclassifications in both. Furthermore, although both models occasionally misclassify open circuit as mouse bite and vice versa, such misclassifications are markedly less frequent in EH-YOLO than in YOLOv10. These results indicate that EH-YOLO suppresses false detections more effectively.
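The kind of reading applied to Figure 10 can be expressed as a short routine that lists the non-zero off-diagonal cells of a confusion matrix. The sketch below assumes rows are actual classes and columns are predicted classes (the convention of Table 1); the counts are hypothetical and are NOT taken from Figure 10, and the function name is illustrative.

```python
CLASSES = [
    "missing hole", "mouse bite", "open circuit",
    "short", "spur", "spurious copper",
]


def misclassifications(matrix, classes):
    """Return (actual, predicted, count) for every non-zero off-diagonal
    cell, sorted from most to least frequent."""
    pairs = []
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            if i != j and count > 0:
                pairs.append((classes[i], classes[j], count))
    return sorted(pairs, key=lambda item: -item[2])


# Hypothetical counts for illustration only (rows: actual, columns: predicted).
toy_matrix = [
    [50, 0, 0, 0, 0, 0],
    [0, 46, 3, 0, 1, 0],
    [0, 2, 48, 0, 0, 0],
    [0, 0, 0, 50, 0, 0],
    [0, 0, 0, 0, 49, 1],
    [0, 0, 0, 0, 0, 50],
]

for actual, predicted, count in misclassifications(toy_matrix, CLASSES):
    print(f"{actual} -> {predicted}: {count}")
```

A class with no entries in the returned list, such as short in this toy matrix, has no misclassified samples.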

To systematically evaluate the robustness of EH-YOLO, we subjected PCB images to various perturbations including noise, illumination enhancement, and contrast adjustment. The processed images were then fed into both YOLOv10 and EH-YOLO models at a resolution of 640 × 640 for detection. As illustrated in Figure 11, EH-YOLO maintained a consistent detection rate for missing hole defects in the AP_M category under different disturbance conditions. However, a small number of missed detections and false positives occurred when detecting open circuit defects in the AP_S category. Nevertheless, compared to the detection results of YOLOv10, EH-YOLO demonstrated significantly better overall performance across various perturbations, indicating its superior robustness.
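The three perturbation types named above can be sketched on a grayscale image stored as nested lists of 8-bit values. The paper does not report the perturbation parameters, so `sigma`, `gain`, and `factor` below are hypothetical, as are the function names; this is a minimal illustration rather than the authors' preprocessing code.

```python
import random


def clip(value):
    """Round and clamp a pixel value to the 8-bit range [0, 255]."""
    return max(0, min(255, int(round(value))))


def add_gaussian_noise(img, sigma=10.0, seed=0):
    """Add zero-mean Gaussian noise; sigma is an assumed noise level."""
    rng = random.Random(seed)
    return [[clip(p + rng.gauss(0.0, sigma)) for p in row] for row in img]


def enhance_illumination(img, gain=1.3):
    """Brighten the image by a multiplicative gain (assumed value)."""
    return [[clip(p * gain) for p in row] for row in img]


def adjust_contrast(img, factor=1.5):
    """Scale each pixel's distance from the image mean by `factor`."""
    flat = [p for row in img for p in row]
    mean = sum(flat) / len(flat)
    return [[clip(mean + factor * (p - mean)) for p in row] for row in img]


# Tiny 2 x 2 example image; a real input would be a 640 x 640 PCB image.
image = [[100, 150], [200, 250]]
print(add_gaussian_noise(image))
print(enhance_illumination(image))
print(adjust_contrast(image, factor=2.0))
```

Each function returns a new image and leaves the input unchanged, so the same source image can be passed through every perturbation independently before detection.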

5. Conclusions

In this paper, we proposed EH-YOLO, a novel model based on YOLOv10 for PCB surface defect detection. To solve the problems of low PCB detection accuracy and background interference, we designed the ESDM, which employs dimensional transformation and multi-attention collaboration to prevent the loss of small defect features while enhancing critical feature representation. To balance the accuracy and speed of detection, we designed the HNN, which optimizes task specialization across different feature fusion layers. Comprehensive evaluations on the PCB_DATASET and DeepPCB benchmarks substantiate the performance of EH-YOLO: it achieved mAP@50-95 scores of 45.3% and 78.8% while maintaining inference speeds of 166.67 FPS and 158.73 FPS, respectively.
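The reported speeds can be related to per-image latency with a one-line conversion. The sketch below assumes that FPS is the reciprocal of single-image latency (batch size 1), which the paper does not state explicitly; the function name is illustrative.

```python
def fps_to_latency_ms(fps):
    """Convert frames per second to milliseconds per frame,
    assuming one image is processed at a time."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# FPS values reported for EH-YOLO on the two benchmarks.
reported_fps = {"PCB_DATASET": 166.67, "DeepPCB": 158.73}

for dataset, fps in reported_fps.items():
    print(f"{dataset}: {fps} FPS is about {fps_to_latency_ms(fps):.2f} ms per image")
```

Under this assumption, the two reported speeds correspond to roughly 6.0 ms and 6.3 ms per image.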

Although EH-YOLO performs well on PCB defect detection, the results were obtained only on public datasets; therefore, the model needs to be trained more extensively. In the future, we will collect PCB datasets on site to further train the model and enhance its real-world applicability. Furthermore, we will refine the SP operation by establishing systematic mapping relationships to filter out irrelevant feature information, thereby reducing the computational overhead of the model.

Author Contributions

Resources, Funding acquisition, Methodology, and Writing: C.D.; Methodology, Software, Validation, and Writing: Y.Z., Z.W., and Y.W.; Writing and Validation: X.S. and S.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 61865012, and in part by the Jiangxi Provincial Key Research and Development Program under Grant 20213AAG01012.

Data Availability Statement

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Typical PCB manufacturing defects.

Figure 2. Framework of EH-YOLO.

Figure 3. ESDM structural diagram: SP represents the sampling operation, MLP represents the shared fully connected layer, maxpool represents max pooling, and avgpool represents average pooling.

Figure 4. SP schematic diagram.

Figure 5. Dataset diagram: (a) PCB_DATASET and (b) DeepPCB.

Figure 6. Comparative analysis of PCB defect detection performance. Subfigures show representative detections of: (a) short, (b) missing hole, (c) spurious copper, (d) spur, (e) open circuit, and (f) mouse bite. Performance benchmarking includes YOLOv10n at 640 × 640 and 1024 × 1024 input resolutions alongside the proposed EH-YOLO at 640 × 640 resolution.

Figure 7. Visualization of defect feature maps: (a) depicts the feature map connected to the first detection head of YOLOv10, (b) depicts the feature map connected to the corresponding detection head of YOLOv10 after incorporating the ESDM.

Figure 8. P-C curves: (a–d) correspond to the four different structures 1, 2, 3, and 4 in Table 6, respectively.

Figure 9. Feature map visualization of YOLOv10 with variant neck structures.

Figure 10. Confusion matrices: (a) corresponds to the results of YOLOv10, (b) corresponds to the results of EH-YOLO.

Figure 11. Detection performance comparison on perturbed images.

Table 1. Binary classification confusion matrix.

	Predicted: Defect	Predicted: Non-Defect
Actual: Defect	T_P	F_N
Actual: Non-Defect	F_P	T_N

Table 2. Comparative experimental results.

Model	Input Size (px)	mAP@50 (%)	mAP@50-95 (%)	Parameters (M)	FLOPs (G)
YOLOv3-tiny	640 × 640	80.7	36.2	12.13	19.0
YOLOv6	640 × 640	83.5	40.3	4.234	11.9
YOLOv8-ghost	640 × 640	75.2	34.3	1.715	5.1
Faster-RCNN	640 × 640	84.6	40.2	23.59	38.2
YOLOv10	640 × 640	84.4	42.1	2.473	8.4
YOLOv11	640 × 640	87.9	43.1	2.583	8.9
EH-YOLO	640 × 640	91.6	45.3	3.548	12.3
ESDM + YOLOv11	640 × 640	89.3	43.2	3.830	12.7

Table 3. Results of the EH-YOLO training PCB_DATASET ablation experiments.

	YOLOv10	HNN	ESDM	mAP@50 (%)	mAP@50-95 (%)	FPS
1	√			84.4	42.1	192.308
2	√	√		86.0	42.9	188.680
3	√		√	90.3	44.4	169.492
4	√	√	√	91.6	45.3	166.667

Table 4. Per-class average precision (AP) by defect size.

Size	Defect Types	YOLOv10	EH-YOLO	Increment
AP_M	Missing hole	89.5%	96.7%	7.2%
	Spurious copper	82.5%	89.1%	6.6%
	Short	83.3%	93.0%	9.7%
	Average	85.1%	92.9%	7.8%
AP_S	Open circuit	80.4%	94.0%	13.6%
	Spur	74.6%	90.0%	15.4%
	Mouse bite	79.8%	91.0%	11.2%
	Average	78.3%	91.7%	13.4%

Table 5. Results of the EH-YOLO training DeepPCB ablation experiments.

	YOLOv10n	HNN	ESDM	mAP@50 (%)	mAP@50-95 (%)	FPS
1	√			97.0	71.2	188.679
2	√	√		97.1	75.6	181.818
3	√		√	98.2	77.9	158.730
4	√	√	√	98.2	78.8	158.730

Table 6. Results of YOLOv10 introducing different ESDM structures to train PCB_DATASET.

Group	P (%)	R (%)	mAP@50 (%)
1	89.1	82.6	89.6
2	87.0	81.8	88.4
3	89.1	83.1	88.9
4	90.1	82.4	90.3

Table 7. Comparison of three structural training metrics.

	mAP (%)	FLOPs (G)	Latency (ms)
PAN	84.5	10.0	6.2
BiFPN	85.6	8.3	5.1
HNN	86.0	8.5	5.4

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
