Article

Real-Time Multi-Scale Barcode Image Deblurring Based on Edge Feature Guidance

Chenbo Shi, Xin Jiang, Xiangyu Zhang, Changsheng Zhu, Xiaowei Hu, Guodong Zhang, Yuejia Li and Chun Zhang
1 College of Intelligent Equipment, Shandong University of Science and Technology, Taian 271019, China
2 Department of Artificial Intelligence, Suzhou Lamberv Intelligent Technology, Suzhou 215000, China
* Author to whom correspondence should be addressed.
Electronics 2025, 14(7), 1298; https://doi.org/10.3390/electronics14071298
Submission received: 21 February 2025 / Revised: 19 March 2025 / Accepted: 24 March 2025 / Published: 25 March 2025
(This article belongs to the Special Issue Artificial Intelligence Innovations in Image Processing)

Abstract

Barcode technology plays a crucial role in automatic identification and data acquisition systems, with extensive applications in retail, warehousing, healthcare, and industrial automation. However, barcode images often suffer from blurriness due to lighting conditions, camera quality, motion blur, and noise, adversely affecting their readability and system performance. This paper proposes a multi-scale real-time deblurring method based on edge feature guidance. Our designed multi-scale deblurring network integrates an edge feature fusion module (EFFM) to restore image edges better. Additionally, we introduce a feature filtering mechanism (FFM), which effectively suppresses noise interference by precisely filtering and enhancing critical signal features. Moreover, by incorporating wavelet reconstruction loss, the method significantly improves the restoration of details and textures. Extensive experiments on various barcode datasets demonstrate that our method significantly enhances barcode clarity and scanning accuracy, especially in noisy environments. Furthermore, our algorithm ensures robustness and real-time performance. The research results indicate that our method holds significant promise for enhancing barcode image processing, with potential applications in retail, logistics, inventory management, and industrial automation.

1. Introduction

In automatic identification and data collection systems, barcode technology plays a pivotal role. Barcodes, including one-dimensional and two-dimensional formats, are extensively utilized in fields such as retail [1], warehousing [2], healthcare [3], and industrial automation [4]. These systems enable rapid and accurate information collection, greatly enhancing operational efficiency and data processing capabilities. However, in practical applications, barcode images are frequently compromised by various factors, including ambient lighting, camera quality, camera shake, inaccurate focusing, rapid movement, and noise interference, resulting in blurred images. Such blurriness significantly reduces barcode readability, thus impairing the performance of the entire automatic identification system. This issue is particularly critical in industrial automation environments, where barcode scanning systems, like other vision-based automation technologies used in industrial settings such as structural damage recognition in automated inspection robots [5], must meet stringent real-time and accuracy requirements. Any recognition delays or errors can disrupt production processes, increase costs, and affect output efficiency. Consequently, developing efficient image deblurring techniques is essential for improving the reliability and efficiency of barcode recognition systems.
Image deblurring is a classic problem in image and signal processing. The formation process of image blur is usually modeled as follows:
$$g(x, y) = k(x, y) \otimes f(x, y) + n(x, y),$$
where g ( x , y ) , f ( x , y ) , k ( x , y ) , and n ( x , y ) denote the blurred image, the sharp image, the blur kernel, and the noise, respectively. The symbol ⊗ indicates the convolution operation. Typically, the blur kernel is unknown, making the estimation of both the blur kernel and the latent sharp image from a given blurred image a highly ill-posed problem. Traditional blind deblurring methods start from the blur degradation model and use mathematical modeling to treat it as an optimization problem. They introduce various complex image priors to constrain the solution space of the deblurring process, such as total variation (TV) [6], sparse priors [7], dark channel prior (DCP) [8], and gradient priors [9]. However, handcrafted priors are based on limited observations and statistics of image features. They often fail to describe blur in complex and varied scenarios, leading to insufficient restoration and significant detail loss in traditional methods. Moreover, optimization-based methods typically involve considerable time costs, making them unsuitable for high-speed production lines. In recent years, deep learning technology has been integrated into image processing tasks. Researchers are now primarily focused on using end-to-end methods to construct mapping models from blurred images to latent images [10,11,12,13]. Deep learning approaches demonstrate higher robustness and broader applicability. However, existing deep learning-based deblurring models are predominantly designed for natural scene images, which differ significantly from barcode images in terms of their structural characteristics and feature distribution. Industrial barcode images are highly sensitive to noise, distortions, and edge degradation, as even minor artifacts can lead to decoding failures. Unlike natural images, barcodes rely on precise edge integrity and contrast for accurate recognition. Conventional deblurring methods often introduce unwanted artifacts or fail to recover fine details, significantly impacting decoding performance in high-speed industrial applications. Despite the increasing adoption of deep learning, there is still a lack of specialized approaches tailored for barcode image deblurring. Existing models often have high computational complexity, making real-time industrial deployment challenging. Therefore, further research is needed to develop more efficient and lightweight solutions.
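For intuition, the degradation model above can be simulated in a few lines. The following NumPy/SciPy sketch produces a blurred, noisy observation g from a sharp image f; the kernel, noise level, and image content are illustrative values, not the settings used in this paper.

```python
import numpy as np
from scipy.signal import convolve2d

# Minimal sketch of the degradation model g = k ⊗ f + n.
f = np.random.rand(64, 64)                       # latent sharp image (placeholder content)
k = np.ones((1, 9), dtype=np.float64) / 9.0      # simple horizontal motion-blur kernel
n = np.random.normal(0.0, 0.01, size=f.shape)    # additive Gaussian noise
g = convolve2d(f, k, mode="same", boundary="symm") + n  # blurred, noisy observation
```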
We proposed an edge feature-guided multi-scale deblurring network (EGMSNet) to address the issues above. Unlike conventional methods that primarily focus on global feature restoration, our approach leverages edge information as a crucial prior to better preserve barcode structures. The edge feature fusion module (EFFM) integrates edge features with encoder features to better restore the edges of the image. Considering the differences between clear and blurred image pairs in the spatial and spectral domains, we designed a feature filtering mechanism (FFM) to reduce noise artifacts effectively, amplify important region responses, and recover clean features. Additionally, we implemented a wavelet reconstruction loss to enhance detail and texture recovery. We specifically optimized the algorithm’s real-time performance and robustness, making it more suitable for industrial applications. Our contributions are summarized as follows:
  • We proposed an efficient edge feature-guided multi-scale deblurring network (EGMSNet) that leverages edge features and multi-scale information to enhance structural details, ensuring better preservation of barcode edges and overall image clarity.
  • We introduced a feature filtering mechanism (FFM) that amplifies responses in important regions, recovers clean features, and reduces noise artifacts.
  • We constructed three independent datasets of blurred and clear image pairs for Barcode, QRCode, and Data Matrix codes, aiming to guide the deblurring of barcode images better.
The structure of the rest of this article is organized as follows. Section 2 reviews the related work on image deblurring. Section 3 describes the proposed method and architecture in detail. Section 4 presents extensive experiments and analyses to evaluate the performance of our approach. Finally, the discussion and overall conclusions are given in Section 5 and Section 6, respectively.

2. Related Work

Traditional Image Deblurring: Traditional single-image blind deblurring techniques primarily rely on optimization methods that utilize effective image priors to construct deblurring models, estimate blur kernels, and restore images through deconvolution. For instance, Xu et al. [9] proposed a non-natural L0 sparse representation approach, which improves image recovery quality by optimizing sparsity, particularly excelling in preserving edge sharpness. However, this method demonstrates limited robustness to noise and may result in over-sharpening artifacts when applied to images with complex textures. Michaeli et al. [14] introduced a self-similarity-based approach that exploits patch repetition within the image for blind deblurring. While effective for images with repetitive structures, its performance significantly degrades when processing images with sparse textures or minimal repetitive patterns. Pan et al. [8] proposed a technique using dark channel priors for blind deblurring, which is especially effective under low-light and adverse weather conditions. Nonetheless, its applicability is restricted to scenarios where the dark channel assumption holds, limiting its generalizability. Bai et al. [6] presented a graph-based single-image blind deblurring method that utilizes graph smoothness and sparse priors for image recovery. Although this approach demonstrated promising results, its performance is highly dependent on the accuracy of the graph structure and sparsity parameters. Chen et al. [7] developed an enhanced sparse model combining sparse representations to restore clear images. Despite their effectiveness in specific scenarios, these prior-based optimization methods are inherently limited by their reliance on handcrafted priors and computationally intensive iterative processes, which hinder their practical applicability in real-world settings.
Deep Learning for Image Deblurring: In recent years, with the rapid development of deep learning technologies, Sun et al. [15] employed convolutional neural networks (CNNs) to estimate motion kernels for spatial transformations, effectively eliminating motion blur. Following this, more research has shifted toward end-to-end approaches, directly mapping blurred images to sharp ones. For example, Nah et al. [10] developed a multi-scale convolutional neural network for end-to-end deblurring. Researchers using similar methods have improved multi-scale strategies to achieve better deblurring results [16,17]. The multi-scale approach processes images at different resolutions, enabling the network to capture details from coarse to fine, thereby effectively enhancing image quality. However, this strategy has certain limitations, such as the potential loss of edge information when downscaling blurred images to lower resolutions. Overlaying multi-scale input images onto subnetworks gradually enhances image clarity from lower-level to higher-level subnetworks, inevitably incurring higher computational costs. To address these issues, Cho et al. [11] designed a coarse-to-fine network architecture and proposed a multi-input multi-output U-Net, effectively reducing the spatiotemporal complexity of multi-scale strategy networks. Recent advancements have further enhanced deblurring performance through innovative network designs. MSDI-Net [18] introduces an extra degradation representation encoder to improve image deblurring by explicitly modeling degradation information. Similarly, UFPNet [19] incorporates a pretrained kernel prior module to estimate spatially variant blur kernel information, which is then integrated into the encoder, enabling more accurate blur removal. DeepRFT [20] proposes the Res-FFTReLU-Block, which leverages frequency selection mechanisms to enhance deblurring performance.
Furthermore, although generative adversarial networks (GANs) such as DeblurGAN [21] and DeblurGAN-v2 [22] by Kupyn et al. have advanced image deblurring through innovative architectures, GAN-based techniques typically require large amounts of training data, and challenges in training stability and convergence speed may limit their widespread adoption in practical applications. In the field of image restoration, transformer architectures have shown unique advantages and challenges. Wang et al. proposed the Uformer [23] model, which combines the self-attention mechanism of transformers with the hierarchical characteristics of U-shaped networks, effectively handling long-range dependencies in images and accurately restoring image details. Stripformer [24] applies local self-attention to capture long-range dependencies with low complexity. Zamir et al. introduced the Restormer [25] model, which focuses on high-resolution image restoration, achieving efficient computational performance through a carefully designed deep transformer architecture. Kong et al. [26] proposed a frequency domain-based transformer that replaces spatial self-attention with efficient elementwise operations in the frequency domain. Their approach improves efficiency to a certain extent, but transformer-based methods remain computationally demanding. Although transformer models excel in restoration quality, their complex structures and high computational demands limit their application in resource-constrained environments.
Barcode Image Deblurring: Extensive research has been conducted on the restoration of barcode images. Liu et al. [27] proposed an iterative deconvolution algorithm based on the concept of barcodes as bilayer waveforms. Xu and McCloskey [28] enhanced the robustness of 2D barcodes under motion blur through coded exposure imaging. Sörös et al. [29] introduced a fast recovery and recognition algorithm, which blindly estimates the blur of prominent edges in the barcode using an iterative optimization scheme, alternating between image sharpening, blur estimation, and decoding. Additionally, Van Gennip et al. [30] achieved rapid barcode deblurring through sparse optimization based on L0 regularization. Despite these significant advancements, these methods still face limitations when handling various types of blur in complex scenarios.

3. Methodology

3.1. Overall Architecture

Barcode images typically exhibit high contrast and sharp edges, features that are easily lost in blurred conditions, complicating barcode recognition. Additionally, artifacts and noise in blurred barcode images significantly hinder recognition accuracy. Traditional image deblurring methods often struggle to restore details and edge features in barcode images and fail to reduce artifacts adequately. To address these challenges, we propose an edge-guided multi-scale fusion network architecture, EGMSNet, designed to progressively restore potential sharp images at various scales, enhance edge clarity, and reduce artifact generation through a feature filtering mechanism (FFM).
As shown in Figure 1, EGMSNet consists of three scales: S3, S2, and S1, corresponding to the original scale, one downsampling, and two downsamplings, respectively. The network restores the potential clear image progressively from coarse to fine. We denote the original full-resolution and the blurry images at different scales, obtained through downsampling, as B3, B2, and B1, respectively. Correspondingly, L3, L2, and L1 represent the output images at the same resolutions as B3, B2, and B1. The encoder–decoder architecture of EGMSNet adopts a three-layer hierarchical structure to extract and reconstruct features step by step from coarse to fine. The encoder uses two ResBlocks per layer, and each ResBlock contains two 3 × 3 convolutional layers with ReLU activation, as described in ResNet [31]. Downsampling is performed using dilated convolutions with a stride of 2, which reduces the spatial resolution while maintaining key spatial information. Similar to the multi-scale approach proposed in MSSNet [13], we employ cross-scale information propagation between stages, providing upsampled residual features to facilitate effective information transfer between scales. During this process, residual features at each scale are subjected to bilinear upsampling and a 1 × 1 convolutional layer, ensuring precise feature alignment and compatibility for subsequent stages. This cross-scale feature fusion facilitates seamless transitions between scales and significantly reduces artifacts in the final restored image. Specifically, at the end of the coarse scale, the residual features are bilinearly upsampled and processed through a 1 × 1 convolution layer. For the S3 scale, we adopt the cross-stage feature fusion scheme proposed in MPRNet [32].
Additionally, we introduce an edge branch to extract multi-scale edge features, which are integrated into the edge feature fusion module (EFFM) along with the multi-scale encoder features from the S3 scale to restore sharp edge features. Finally, the feature filtering mechanism (FFM) is employed to amplify the response of important regions and restore clean features, with FFM sharing weights across the S3 scales. EGMSNet employs a residual learning scheme, which is widely used in various restoration tasks [33,34,35,36], to achieve effective restoration. EGMSNet predicts the residual image R, which is added to the input blurry image B to produce the deblurred output L = B + R.
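As a concrete illustration of the cross-scale propagation and residual scheme described above, the following PyTorch sketch shows bilinear upsampling of coarse-scale residual features followed by a 1 × 1 convolution for alignment. The module and variable names are ours, and the sketch omits the encoder–decoder, EFFM, and FFM.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossScaleFusion(nn.Module):
    """Propagates residual features from a coarser scale to a finer one:
    bilinear upsampling followed by a 1x1 convolution for feature alignment."""
    def __init__(self, channels):
        super().__init__()
        self.align = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, coarse_feat, fine_hw):
        up = F.interpolate(coarse_feat, size=fine_hw, mode="bilinear", align_corners=False)
        return self.align(up)

# Residual learning scheme: the network predicts R and the output is L = B + R, e.g.
#   restored = blurry + egmsnet(blurry)   # 'egmsnet' is a hypothetical model instance
```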

3.2. Edge Branch

The clarity of image edges is a crucial metric in the subjective evaluation of image quality. For barcode images, sharp edges are particularly essential, since barcode recognition heavily relies on the distinct boundaries of the patterns. Historically, many approaches have prioritized deblurring at the expense of edge feature restoration, leading to a loss of critical information and a deficiency in rich edge details in the restored images. Additionally, numerous image restoration techniques [37,38] have utilized gradients and high-frequency information as preprocessed inputs, significantly enhancing the critical information in the images.
Inspired by these techniques and considering that barcodes typically exhibit horizontal and vertical edges, and given the Sobel operator’s sensitivity to such edges, this study employs the Sobel operator to extract image edges. These edges are then used as supplementary input to the network and directed to the edge branch network. The edge branch network is designed to model the image’s edge information, integrating the extracted edge features with the encoder features to enhance edge restoration. Specifically, the edge branch network consists of three cascaded edge extract modules (EEMs). Each EEM integrates a 3 × 3 dilated convolution layer to extract local features, a hybrid convolution unit that combines standard convolution, dilated convolution, and depthwise separable convolution to capture finer-grained feature information, as well as a spatial attention mechanism to adaptively recalibrate feature responses. This design enables the model to focus on important edge regions while suppressing irrelevant information, refining edge features at multiple scales and ensuring precise edge representation. As illustrated in Figure 2, the edge branch network generates multi-scale edge features, which are then fused with the multi-scale features from the S3 scale stage encoder through the edge feature fusion module (EFFM). The multi-scale feature fusion process can be described as follows:
$$F_{\mathrm{fused}} = \left[\, \mathrm{EFFM}\!\left(F_e^{\,i}, F_c^{\,i}\right) \;\middle|\; i \in \{1, 2, 3\} \,\right],$$
In the EFFM module, the edge features and encoder features are independent, hindering effective edge fusion. A 3 × 3 convolution is first applied to refine the edge features, resulting in $F_1$. Subsequently, a 1 × 1 convolution followed by a sigmoid activation function is used to generate the edge attention map $F_{\mathrm{map}}$. This attention map is then used to perform pixelwise multiplication and addition with the encoder features, yielding the weighted feature $F_{\mathrm{cl}}$. Finally, $F_1$ is added pixelwise to the weighted content feature $F_{\mathrm{cl}}$ to produce the output feature $F_{\mathrm{out}}$. The detailed computation process is as follows:
$$F_1 = \delta\!\left(\mathrm{Conv}_{3\times3}(F_e)\right), \quad F_{\mathrm{map}} = \sigma\!\left(\mathrm{Conv}_{1\times1}(F_1)\right), \quad F_{\mathrm{cl}} = F_c \times F_{\mathrm{map}} + F_c, \quad F_{\mathrm{out}} = F_1 + F_{\mathrm{cl}},$$
where $\sigma$ denotes the sigmoid activation function, $\delta$ represents the ReLU activation, and $F_e$ and $F_c$ refer to the edge features and encoder features, respectively.
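The following PyTorch sketch illustrates the Sobel edge input and the EFFM computation given by the equations above; the channel count, helper names, and the single-channel edge map are illustrative assumptions rather than the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

SOBEL_X = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
SOBEL_Y = SOBEL_X.transpose(2, 3)

def sobel_edges(gray):
    """Edge map used as the supplementary input to the edge branch; gray is (B, 1, H, W)."""
    gx = F.conv2d(gray, SOBEL_X, padding=1)
    gy = F.conv2d(gray, SOBEL_Y, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)

class EFFM(nn.Module):
    """Edge feature fusion: F1 = ReLU(Conv3x3(Fe)), Fmap = sigmoid(Conv1x1(F1)),
    Fcl = Fc * Fmap + Fc, Fout = F1 + Fcl."""
    def __init__(self, channels):
        super().__init__()
        self.conv3 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, f_e, f_c):
        f1 = F.relu(self.conv3(f_e))            # refined edge features
        f_map = torch.sigmoid(self.conv1(f1))   # edge attention map
        f_cl = f_c * f_map + f_c                # weighted encoder features
        return f1 + f_cl                        # fused output
```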

3.3. Feature Filtering Mechanism (FFM)

The feature filtering mechanism emphasizes critical regions and restores clean features. Figure 3 illustrates the spatial selection module (SSM) and the frequency selection module (FSM). Given the input feature $F \in \mathbb{R}^{H \times W \times C}$, SSM and FSM are applied sequentially and represented as
$$\hat{F} = \mathrm{FSM}\!\left(\mathrm{SSM}(F)\right).$$
Next, we provide a detailed introduction to these two modules.
Spatial Selection Module (SSM): The SSM is designed to focus the network on key areas in the spatial domain, thereby providing clear focal points for the frequency selection module (FSM). The SSM consists of three main branches that integrate channel attention and coordinate attention mechanisms. These mechanisms collaborate to extract both global context and local detail information, significantly enhancing the quality of feature representation. Specifically, the input feature map F is initially processed through the coordinate attention (CA) [39] mechanism, generating a spatially selective attention map that considers each channel’s importance and its corresponding positional information. This step is crucial for producing an attention map with spatial selectivity. Subsequently, based on the CBAM model, max pooling and average pooling are applied along the channel dimension to compress the feature map, followed by a 3 × 3 convolution layer to further refine and highlight the degraded areas that require attention. The formal representation is as follows:
$$F' = \mathrm{CA}(F), \qquad F'' = \mathrm{Conv}_3\!\left(\left[\mathrm{AvgPool}(F'), \mathrm{MaxPool}(F')\right]\right),$$
where $[\cdot\,,\cdot]$ denotes concatenation; AvgPool, MaxPool, and $\mathrm{Conv}_3$ represent average pooling, max pooling, and a 3 × 3 convolution layer, respectively. Through this process, $F'' \in \mathbb{R}^{H \times W \times 1}$ captures the regions of degradation requiring focus. Given the varying degradation patterns across channels, depthwise convolutions are applied to the input feature $F$, enabling channelwise transformations and the generation of channel representations. Subsequently, these representations are modulated with $F''$. The process is expressed as follows:
$$F_s = \mathrm{DConvs}_{5,7}(F) \otimes T(F'', C) + \mathrm{DConv}_3(F).$$
Here, $\mathrm{DConvs}_{5,7}$ indicates cascaded depthwise convolutions with kernel sizes of 5 × 5 and 7 × 7; $\mathrm{DConv}_3$ represents a depthwise convolution with a 3 × 3 kernel; $\otimes$ signifies elementwise multiplication; and $T(F'', C)$ denotes a tiling function that replicates $F''$ across $C$ channels to form a tensor in $\mathbb{R}^{H \times W \times C}$. The spatially selected features $F_s \in \mathbb{R}^{H \times W \times C}$ are then passed to the FSM for frequency selection.
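A minimal PyTorch sketch of the SSM is given below. The coordinate attention block is a simplified stand-in for the design of Hou et al. [39], the reduction ratio is an assumption, and the tiling function T is realized implicitly through broadcasting.

```python
import torch
import torch.nn as nn

class SimpleCoordAttention(nn.Module):
    """Simplified stand-in for coordinate attention [39]; the paper uses the original design."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        mid = max(channels // reduction, 8)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        x_h = x.mean(dim=3, keepdim=True)                        # pool along width  -> (B, C, H, 1)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)    # pool along height -> (B, C, W, 1)
        y = self.act(self.conv1(torch.cat([x_h, x_w], dim=2)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                    # (B, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (B, C, 1, W)
        return x * a_h * a_w

class SpatialSelectionModule(nn.Module):
    """Sketch of the SSM: CA, CBAM-style pooling with a 3x3 conv, and depthwise modulation branches."""
    def __init__(self, channels):
        super().__init__()
        self.ca = SimpleCoordAttention(channels)
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=3, padding=1)
        self.dconv5 = nn.Conv2d(channels, channels, 5, padding=2, groups=channels)
        self.dconv7 = nn.Conv2d(channels, channels, 7, padding=3, groups=channels)
        self.dconv3 = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)

    def forward(self, f):
        f_ca = self.ca(f)                                     # F' = CA(F)
        avg = f_ca.mean(dim=1, keepdim=True)                  # average pooling along channels
        mx, _ = f_ca.max(dim=1, keepdim=True)                 # max pooling along channels
        f_map = self.spatial_conv(torch.cat([avg, mx], 1))    # F'' with a single channel
        mod = self.dconv7(self.dconv5(f))                     # cascaded 5x5 and 7x7 depthwise convs
        return mod * f_map + self.dconv3(f)                   # F_s = DConvs_{5,7}(F) ⊗ T(F'', C) + DConv_3(F)
```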
Frequency Selection Module (FSM): Since degraded and clear image pairs share similar low-frequency components but differ in high-frequency ones, we employ the proposed FSM to remove the lowest frequencies. The FSM further emphasizes the regions containing the actual differences between the input and clear image pairs, thus enhancing the restoration of clean features. We first apply a mean filter to F s to generate the low-frequency features to achieve this. By subtracting the obtained low-frequency signal from the input, we acquire the complementary high-frequency features as
$$F_s^{h} = F_s - \mathrm{Mean}(F_s).$$
In our case, the mean filter is implemented using channelwise global average pooling. We introduce learnable parameters $a$ and $b$ to dynamically adjust the emphasis on low-frequency and high-frequency features, with $a$ and $b$ initially set to 0 and 1, respectively. The final output of the FFM is generated using elementwise multiplication and residual connections between $F_s^{h}$ and $F_s$ as
$$\hat{F} = a \cdot F_s^{h} \otimes F_s + b \cdot F_s.$$
After the FFM, the important areas are emphasized. As shown in the first row of Figure 4, from left to right are the blurred image, the FFM input feature heat map, and the FFM output feature heat map. The FFM further enhances the edge response in QRCode.
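The FSM itself reduces to a few lines. The PyTorch sketch below follows the two equations above, with the mean filter implemented as channelwise global average pooling as stated; the module and parameter names are ours.

```python
import torch
import torch.nn as nn

class FrequencySelectionModule(nn.Module):
    """Sketch of the FSM with learnable low/high-frequency weights a and b (init 0 and 1)."""
    def __init__(self):
        super().__init__()
        self.a = nn.Parameter(torch.zeros(1))
        self.b = nn.Parameter(torch.ones(1))

    def forward(self, f_s):
        low = f_s.mean(dim=(2, 3), keepdim=True)     # Mean(F_s): channelwise global average pooling
        f_h = f_s - low                              # F_s^h = F_s - Mean(F_s)
        return self.a * (f_h * f_s) + self.b * f_s   # F_hat = a * (F_s^h ⊗ F_s) + b * F_s
```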

3.4. Loss Function

To facilitate dual-domain learning, we employ L1 loss functions in both the spatial and frequency domains [11]. Additionally, we introduce a wavelet reconstruction loss to emphasize the recovery of details and textures. We use three types of loss functions to train EGMSNet: content loss ($L_{\mathrm{cont}}$), frequency reconstruction loss ($L_{\mathrm{freq}}$), and wavelet reconstruction loss ($L_{\mathrm{wav}}$). The content loss and frequency reconstruction loss are defined as follows:
$$L_{\mathrm{cont}} = \sum_{k=1}^{K} \frac{1}{t_k} \left\| \hat{S}_k - S_k \right\|_1, \qquad L_{\mathrm{freq}} = \sum_{k=1}^{K} \frac{1}{t_k} \left\| \mathcal{F}(\hat{S}_k) - \mathcal{F}(S_k) \right\|_1,$$
where $k$ indexes the input/output images at the different scales and stages ($K$ in total), and $\hat{S}_k$ and $S_k$ represent the output and ground truth images, respectively. We normalize each term by dividing it by the total number of elements $t_k$. $\mathcal{F}$ denotes the fast Fourier transform (FFT), which transfers the image signal to the frequency domain.
The discrete wavelet transform (DWT) decomposes an image into four sub-bands containing different frequency information for the wavelet reconstruction loss. These sub-bands are the low-frequency component (LL), horizontal high-frequency component (LH), vertical high-frequency component (HL), and diagonal high-frequency component (HH). The wavelet reconstruction loss ( L wav ) is defined as follows:
$$L_{\mathrm{wav}} = \sum_{k=1}^{K} \frac{1}{t_k} \left\| \hat{S}_k^{d} - S_k^{d} \right\|_1, \quad d \in \{LL, HL, LH, HH\}.$$
To this end, the final loss function is a combination of the content loss, frequency reconstruction loss, and wavelet reconstruction loss as follows:
$$L = L_{\mathrm{cont}} + 0.1 \cdot L_{\mathrm{freq}} + \alpha \cdot L_{\mathrm{wav}},$$
where α is a hyperparameter to balance the loss. We set it to 0.1 in our experiments.
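A compact sketch of the combined objective is shown below. The Haar wavelet, the handling of the complex FFT output, and the per-scale normalization are assumptions where the text leaves details open.

```python
import torch
import torch.nn.functional as F

def haar_dwt(x):
    """Single-level 2D Haar DWT (wavelet family is an assumption); expects even H and W."""
    a, b = x[..., 0::2, 0::2], x[..., 0::2, 1::2]
    c, d = x[..., 1::2, 0::2], x[..., 1::2, 1::2]
    return ((a + b + c + d) / 2,   # low-frequency sub-band (LL)
            (a - b + c - d) / 2,   # detail sub-band
            (a + b - c - d) / 2,   # detail sub-band
            (a - b - c + d) / 2)   # detail sub-band

def deblur_loss(preds, targets, alpha=0.1):
    """L = L_cont + 0.1 * L_freq + alpha * L_wav, summed over the multi-scale outputs."""
    l_cont = l_freq = l_wav = 0.0
    for s_hat, s in zip(preds, targets):
        l_cont += F.l1_loss(s_hat, s)                                        # spatial L1
        l_freq += (torch.fft.fft2(s_hat) - torch.fft.fft2(s)).abs().mean()   # frequency L1
        for d_hat, d in zip(haar_dwt(s_hat), haar_dwt(s)):
            l_wav += F.l1_loss(d_hat, d)                                     # wavelet sub-band L1
    return l_cont + 0.1 * l_freq + alpha * l_wav
```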

4. Experiments

4.1. Datasets

Collecting numerous paired blurry barcode images in real-world settings is a challenging task. Therefore, we used a CCD camera to capture three types of clear barcode images: EAN13, QRCode, and Data Matrix. We then generated various types of blur using Python 3.8.18, including motion blur, defocus blur, and Gaussian blur. For motion blur, the motion angle ranged from 0 to 360, and the length ranged from 3 to 21 pixels. For defocus blur, the blur kernel radius ranged from 3 to 15 pixels, with Gaussian sigma weighting ranging from 0.1 to 0.5. For Gaussian blur, the blur kernel radius ranged from 3 to 27 pixels, with Gaussian sigma ranging from 0 to 4. Additionally, we randomly added Gaussian white noise to simulate real-world data. Consequently, we constructed three datasets comprising 12,764 pairs of QRCode images, 3686 pairs of EAN13 images, and 16,175 pairs of Data Matrix images. These datasets were divided into training and testing sets with an 8:2 ratio. Additionally, we collected 1615 real-world images of QRCode and barcodes exhibiting defocus and motion blur. These images were captured in diverse real-world settings to enhance the variability and robustness of our dataset.
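For reference, the degradation pipeline can be sketched as follows. The kernel shapes, the noise standard deviation, and the exact sampling scheme are illustrative assumptions built around the parameter ranges stated above; in particular, the plain disk kernel simplifies the Gaussian-weighted defocus kernel.

```python
import cv2
import numpy as np

def motion_blur_kernel(length, angle_deg):
    """Linear motion-blur kernel with the given length (pixels) and angle (degrees)."""
    k = np.zeros((length, length), dtype=np.float32)
    k[length // 2, :] = 1.0
    center = ((length - 1) / 2.0, (length - 1) / 2.0)
    k = cv2.warpAffine(k, cv2.getRotationMatrix2D(center, angle_deg, 1.0), (length, length))
    return k / max(k.sum(), 1e-8)

def synthesize_blur(sharp, rng=None):
    """Applies one randomly chosen blur type plus Gaussian white noise to a uint8 image."""
    rng = rng or np.random.default_rng()
    choice = rng.integers(0, 3)
    if choice == 0:                                   # motion blur: length 3-21 px, angle 0-360 deg
        k = motion_blur_kernel(int(rng.integers(3, 22)), float(rng.uniform(0, 360)))
        blurred = cv2.filter2D(sharp, -1, k)
    elif choice == 1:                                 # defocus blur: radius 3-15 px (disk kernel)
        r = int(rng.integers(3, 16))
        k = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2 * r + 1, 2 * r + 1)).astype(np.float32)
        blurred = cv2.filter2D(sharp, -1, k / k.sum())
    else:                                             # Gaussian blur: radius 3-27 px, sigma 0-4
        r = int(rng.integers(3, 28))
        blurred = cv2.GaussianBlur(sharp, (2 * r + 1, 2 * r + 1), float(rng.uniform(0, 4)))
    noisy = blurred.astype(np.float32) + rng.normal(0, 2.0, sharp.shape)   # noise std is an assumption
    return np.clip(noisy, 0, 255).astype(np.uint8)
```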

4.2. Implementation Details

All our models were implemented using the OpenMMLab MMagic framework. The comparative methods utilized the models provided by the original papers and were executed on a workstation equipped with two NVIDIA GeForce RTX 3090 24 GB GPUs. We used the Adam optimizer for model training, with parameters $\beta_1 = 0.9$ and $\beta_2 = 0.999$. The initial learning rate was set to $2 \times 10^{-4}$ and gradually decreased to $1 \times 10^{-6}$ using a cosine annealing strategy. Training patches were randomly cropped to 256 × 256, with a batch size of 8, for over 40,000 iterations. Following recent image deblurring methods, we employed vertical flipping and random rotation for data augmentation.
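The optimization setup described above corresponds roughly to the following PyTorch configuration; the placeholder module stands in for EGMSNet, and the training loop, cropping, and augmentation are only summarized in comments.

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # placeholder module standing in for EGMSNet
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4, betas=(0.9, 0.999))
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=40_000, eta_min=1e-6)

# Per iteration: sample a batch of 8 random 256x256 crops (with random flips/rotations),
# compute the combined loss from Section 3.4, call optimizer.step(), then scheduler.step().
```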

4.3. Comparison with State-of-the-Art Methods

We evaluated the performance of the proposed model on our synthetic datasets (QRCode, Data Matrix, and EAN13) as well as on a real dataset. We compared our method with several state-of-the-art deblurring approaches, including three representative optimization-based methods (DCP [8], RGTV [6], and ESM [7]) and five deep learning-based dynamic scene deblurring methods (DeepDeblur [10], MIMO-UNet [11], MIMO-UNet+ [11], NAFNet [12], and MSSNet [13]). Using standard evaluation metrics such as the peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and mean absolute error (MAE), we assessed the generalization performance of these models. All methods were trained on the QRCode training set, and the pretrained models were tested on the QRCode test set, EAN13 test set, and Data Matrix test set. Since code images typically contain rich semantic information and specific encoding structures, traditional metrics often fail to reflect the quality of deblurred images adequately. Therefore, we collected a series of blurred images from real-world scenarios to construct a real blurred dataset as a reference. By comparing the decoding rates of blurred images to clear images, we can more intuitively evaluate the effectiveness of the deblurring algorithms.
Comparison of Synthetic Datasets: Table 1 and Table 2 present the quantitative results for all competitors on the synthetic datasets QRCode, Data Matrix, and EAN13. The proposed method significantly outperformed traditional deblurring techniques. While traditional methods struggled with complex blur patterns and introduce artifacts, our approach leveraged edge-guided multi-scale feature learning, which enhanced the fine structure recovery and robustness across different barcode types. When compared to popular multi-scale deblurring models such as DeepDeblur [10], MIMO-UNet+ [11], and MSSNet [13], our method, EGMSNet, demonstrated superior performance. Specifically, our model’s PSNR and SSIM values on the QRCode dataset showed increases of 0.5728 dB and 0.0057, respectively, over MSSNet. Additionally, it significantly reduced the number of parameters and computational load, resulting in a 1.7-fold increase in inference speed. We have plotted a scatter plot of latency versus performance for various deep learning algorithms in Figure 5, which further illustrates this point. The plot demonstrates that our method, EGMSNet, achieved superior PSNR values while significantly reducing latency compared to other approaches. This reduction in latency highlights the real-time processing capabilities of EGMSNet, making it highly suitable for applications that require fast and efficient image processing. Furthermore, we note that these deep learning-based networks, when trained on the QRCode dataset, achieved relatively lower evaluation metrics on the Data Matrix and EAN13 datasets. This suggests that a model trained on one barcode type may not always generalize well to others, likely due to variations in the structural complexity and noise patterns. As shown in Table 2, our method still achieved superior performance across all datasets, further validating its robustness and efficiency. The strong generalization of EGMSNet can be attributed to its effective multi-scale feature extraction, edge feature fusion, and feature filtering mechanisms, which allow it to adapt to different barcode structures while suppressing noise artifacts.
For qualitative evaluation, we randomly selected three types of blur from the synthetic QRCode dataset: motion blur, defocus blur, and Gaussian blur. Figure 6 shows the zoomed-in details of images processed by different methods. It can be observed that comparison methods such as DCP [8] and DeepDeblur [10] produce images with various degrees of artifacts, detail loss, distortion, and deformation. For instance, in the edge structure and text regions of the QR code in Figure 6, our EGMSNet demonstrated impressive restoration performance, effectively removing artifacts and restoring clear image edge structures. Furthermore, we tested the pretrained weights obtained on the QRCode dataset on the Data Matrix and EAN13 test sets. Randomly selected examples are visualized in Figure 7, showing the comparison results of different methods. Other methods had issues with either failing to remove blur or introducing additional artifact noise, such as MSSNet’s collapse on the EAN13 dataset (bottom of Figure 7). In contrast, our method achieved more apparent structures on both the Data Matrix and EAN13 datasets, demonstrating the superior generalization ability of our model across different types of codes.
Comparison of Real Blur Images: To further evaluate the generalization performance of the proposed model in real-world blurred scenarios, Figure 8 presents a visual comparison of various methods on a dataset of real-world defocus and motion-blurred QR codes and barcodes. For a fair comparison, all deep learning-based methods were trained on the QRCode dataset and tested on this real-world dataset. We selected three QR code scene images with different levels of blur, contrast, and clarity. As shown in the first row of Figure 8, other methods produced various degrees of artifacts and noise, while our method recovered relatively clean structures, which is crucial for decoding. Additionally, for low-contrast and relatively clear images, as seen in the second and third rows of Figure 8, our method restored clear structures and details, particularly at the edges of the codes. Furthermore, we selected real-world images of barcodes with motion blur, as shown in Figure 8, and our method consistently produced satisfactory results.
Since real blurred images lack an actual ground truth due to changes in the field of view, we conducted extensive decoding experiments to demonstrate the effectiveness of our deblurring method in improving the decoding rates of commercial decoders. We prepared a challenging test dataset with various degrees of blur, totaling 815 images, and used two decoders: ZBar and WeChat. ZBar is typically used in research experiments, while WeChat is a commercial decoder. Figure 9 shows the impact of ESM, MIMO-UNet+, NAFNet, MSSNet, and our method on the decoding rates before and after applying deblurring. “Original” represents the decoding rate before deblurring. Our deblurring method improved the decoding rates with ZBar and WeChat by 27.2% and 10.4%, respectively. The results demonstrate that our deblurring method effectively enhances decoding capabilities.
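For reference, a decoding-rate measurement with the ZBar decoder can be sketched as follows, assuming the common pyzbar Python binding; the WeChat decoder and the exact evaluation protocol are not shown.

```python
import cv2
from pyzbar.pyzbar import decode  # Python binding to the ZBar decoder (assumed tooling)

def decoding_rate(image_paths):
    """Fraction of images in which ZBar finds at least one barcode/QR code symbol."""
    decoded = 0
    for path in image_paths:
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        if img is not None and len(decode(img)) > 0:
            decoded += 1
    return decoded / max(len(image_paths), 1)

# Example: compare the rate before and after deblurring.
# rate_blurred   = decoding_rate(blurred_paths)     # hypothetical path lists
# rate_deblurred = decoding_rate(deblurred_paths)
```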

4.4. Ablation Study

In this section, we conducted comprehensive ablation experiments to verify the contribution of each component in our proposed EGMSNet model. All models were trained and tested on the QRCode dataset with 40,000 iterations to ensure accurate evaluation and comparison.
Ablation Study on Network Structure: We designed four experiments to validate the effectiveness of the edge branch and the proposed feature filtering mechanism. We began with a simple baseline by removing both the edge branch and FFM from our EGMSNet and gradually added the edge branch, spatial selection module, and frequency selection module. Table 3 presents the quantitative results of these ablation experiments.
As shown in Table 3a, the baseline model achieved a PSNR of 31.45 dB. This model served as a benchmark for evaluating the contributions of other components. Adding the edge branch increased the PSNR to 31.55 dB, representing an improvement of 0.10 dB over the baseline. The introduction of the edge branch effectively enhanced the model’s ability to capture edge information, thereby improving image restoration performance. The introduction of SSM boosted the PSNR to 31.5804 dB (+0.1304 dB), demonstrating its effectiveness in spatial feature modeling for deblurring. The FSM integration achieved 31.5307 dB (+0.0807 dB), enhancing feature selection for sharper reconstruction. Combining both components yielded 31.6307 dB, highlighting their complementary roles in feature modeling and selection for improved restoration. As illustrated in Figure 4, the use of the FFM significantly reduced the generation of artifacts. By combining the edge branch and FFM, the PSNR and SSIM improved further by 0.3711 dB and 0.0025, respectively, compared to the baseline model.
Ablation Study of the Proposed Strategy: To verify the effectiveness of each component in our proposed EGMSNet, we conducted multiple experiments, focusing on the edge feature fusion module and shared FFM weights. For this purpose, we designed two sets of experiments. The first set explores different feature fusion methods: (1) Addition, using addition to combine edge features and encoder features, (2) Concat+Conv, concatenating features along the channel dimension and using convolution to adjust the channels, and (3) EFFM, using the EFFM module for feature fusion. The second set examines the impact of shared weights in the FFM: (1) FFM*, the FFM without shared weights, and (2) FFM, the FFM module with shared weights.
Table 4 presents the quantitative results of the ablation studies. These results show that the PSNR/SSIM values for the Addition and Concat+Conv methods were lower than those for our EFFM module by 0.123 dB/0.0076 and 0.0786 dB/0.004, respectively. Although Concat+Conv offers a certain degree of feature fusion compared to simple addition, replacing it with EFFM resulted in higher deblurring performance, indicating the effectiveness of the EFFM feature fusion strategy. Regarding the feature filtering mechanism, the shared-weight FFM provided better performance improvement and parameter reduction than the non-shared-weight FFM, demonstrating the effectiveness of our shared-weight approach.
Ablation Study on Loss Function: As shown in the final loss function in Section 3.4, the hyperparameter α was used to balance the weight of the wavelet reconstruction loss. We optimized the deblurring performance by adjusting the value of α. The hyperparameter α was set to 0, 0.02, 0.05, 0.1, 0.5, 1, and 2. As shown in Figure 10, the highest PSNR and SSIM values were obtained when α = 0.1, so we set the hyperparameter to 0.1 for all experiments. Additionally, we randomly selected a QRCode image and obtained restored images with and without the wavelet reconstruction loss, as shown in Figure 11. From the figure, it is evident that the wavelet reconstruction loss is crucial for restoring texture details. Therefore, we included the wavelet reconstruction loss and set its balance parameter α to 0.1.

5. Discussion

This paper introduces an edge-guided multi-scale deblurring network (EGMSNet) tailored for robust barcode image restoration, emphasizing structural integrity preservation and decoding accuracy enhancement in challenging scenarios. EGMSNet exhibits strong robustness against common blur types such as motion and defocus blur while maintaining real-time processing efficiency. However, its performance is constrained under extreme blur or low-contrast conditions, where barcode edges are severely distorted or obscured by noise. This limitation stems from the significant loss of high-frequency details, which complicates fine structure reconstruction. To address this, future work will focus on integrating adaptive attention mechanisms and hybrid frequency domain models to enhance robustness in such demanding scenarios. Beyond barcode deblurring, we aim to extend our research to text image deblurring, which presents unique challenges due to the inherent diversity in font styles, sizes, and orientations. Unlike barcodes, text images exhibit greater structural variability, necessitating more sophisticated edge-aware strategies to ensure readability. Moreover, different types of blur, such as motion blur in scanned documents or defocus blur in camera-captured text, impact recognition accuracy in distinct ways. To tackle these challenges, we plan to incorporate text-specific edge features and advanced multi-scale strategies to improve text clarity while enhancing robustness against noise and artifacts. By addressing these challenges, our work seeks to bridge the gap between specialized and generalized deblurring, paving the way for more versatile and effective image restoration solutions.

6. Conclusions

In this paper, we presented an efficient edge-guided multi-scale deblurring network (EGMSNet), specifically designed for barcode image restoration. Our approach introduces several novel contributions that address the challenges of structural preservation and decoding accuracy in real-world scenarios. The proposed model employs a coarse-to-fine strategy that progressively restores clear images while integrating an edge feature fusion module (EFFM) to enhance structural integrity. Additionally, we introduced a feature filtering mechanism (FFM) to suppress noise artifacts and amplify critical details, thereby ensuring cleaner and more effective feature reconstruction. These innovations enable EGMSNet to adapt robustly to diverse blur types, including motion and defocus blur, while maintaining real-time processing efficiency. Most importantly, our approach significantly improved the decoding accuracy of commercial barcode scanners in complex environments, ensuring practical applicability. Future work will focus on extending this framework to text image deblurring and further optimizing its efficiency for resource-constrained industrial applications.

Author Contributions

Conceptualization, C.S., G.Z. and X.J.; data curation, X.Z.; formal analysis, X.J. and Y.L.; funding acquisition, C.Z. (Changsheng Zhu); investigation, X.J.; methodology, C.Z. (Chun Zhang), X.J. and C.S.; project administration, X.H.; resources, C.Z. (Changsheng Zhu); software, X.J.; supervision, C.Z. (Chun Zhang); validation, X.J.; visualization, X.J.; writing—original draft, X.J. and C.S.; writing—review and editing, X.J. and C.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the Shandong Province Science and Technology SMEs Innovation Capability Enhancement Project (Nos. 2023TSGC0576 and 2023TSGC0605).

Data Availability Statement

The data that support the findings of this study are available from the author, Xin Jiang, via 202283230006@sdust.edu.cn upon reasonable request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Kim, Y.G.; Woo, E. Consumer acceptance of a quick response (QR) code for the food traceability system: Application of an extended technology acceptance model (TAM). Food Res. Int. 2016, 85, 266–272. [Google Scholar] [CrossRef] [PubMed]
  2. M’hand, M.A.; Boulmakoul, A.; Badir, H.; Lbath, A. A scalable real-time tracking and monitoring architecture for logistics and transport in RoRo terminals. Procedia Comput. Sci. 2019, 151, 218–225. [Google Scholar] [CrossRef]
  3. Ji, Y.; Sun, D.; Zhao, Y.; Tang, J.; Tang, J.; Song, J.; Zhang, J.; Wang, X.; Shao, W.; Chen, D.; et al. A high-throughput mass cytometry barcoding platform recapitulating the immune features for HCC detection. Nano Today 2023, 52, 101940. [Google Scholar] [CrossRef]
  4. Lin, P.Y. Distributed Secret Sharing Approach With Cheater Prevention Based on QR Code. IEEE Trans. Ind. Inform. 2016, 12, 384–392. [Google Scholar] [CrossRef]
  5. Hu, K.; Chen, Z.; Kang, H.; Tang, Y. 3D vision technologies for a self-developed structural external crack damage recognition robot. Autom. Constr. 2024, 159, 105262. [Google Scholar] [CrossRef]
  6. Bai, Y.; Cheung, G.; Liu, X.; Gao, W. Graph-Based Blind Image Deblurring From a Single Photograph. IEEE Trans. Image Process. 2019, 28, 1404–1418. [Google Scholar] [CrossRef] [PubMed]
  7. Chen, L.; Fang, F.; Lei, S.; Li, F.; Zhang, G. Enhanced sparse model for blind deblurring. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; Springer: Cham, Switzerland, 2020; pp. 631–646. [Google Scholar]
  8. Pan, J.; Sun, D.; Pfister, H.; Yang, M.H. Blind image deblurring using dark channel prior. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1628–1636. [Google Scholar]
  9. Xu, L.; Zheng, S.; Jia, J. Unnatural l0 sparse representation for natural image deblurring. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA, 23–28 June 2013; pp. 1107–1114. [Google Scholar]
  10. Nah, S.; Hyun Kim, T.; Mu Lee, K. Deep multi-scale convolutional neural network for dynamic scene deblurring. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3883–3891. [Google Scholar]
  11. Cho, S.J.; Ji, S.W.; Hong, J.P.; Jung, S.W.; Ko, S.J. Rethinking coarse-to-fine approach in single image deblurring. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 4641–4650. [Google Scholar]
  12. Chen, L.; Chu, X.; Zhang, X.; Sun, J. Simple baselines for image restoration. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Cham, Switzerland, 2022; pp. 17–33. [Google Scholar]
  13. Kim, K.; Lee, S.; Cho, S. Mssnet: Multi-scale-stage network for single image deblurring. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Cham, Switzerland, 2022; pp. 524–539. [Google Scholar]
  14. Michaeli, T.; Irani, M. Blind deblurring using internal patch recurrence. In Proceedings of the Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, 6–12 September 2014; Proceedings, Part III 13. Springer: Cham, Switzerland, 2014; pp. 783–798. [Google Scholar]
  15. Sun, J.; Cao, W.; Xu, Z.; Ponce, J. Learning a convolutional neural network for non-uniform motion blur removal. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 769–777. [Google Scholar]
  16. Tao, X.; Gao, H.; Shen, X.; Wang, J.; Jia, J. Scale-recurrent network for deep image deblurring. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8174–8182. [Google Scholar]
  17. Purohit, K.; Rajagopalan, A.N. Region-Adaptive Dense Network for Efficient Motion Deblurring. Proc. AAAI Conf. Artif. Intell. 2020, 34, 11882–11889. [Google Scholar] [CrossRef]
  18. Li, D.; Zhang, Y.; Cheung, K.C.; Wang, X.; Qin, H.; Li, H. Learning degradation representations for image deblurring. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Cham, Switzerland, 2022; pp. 736–753. [Google Scholar]
  19. Fang, Z.; Wu, F.; Dong, W.; Li, X.; Wu, J.; Shi, G. Self-supervised non-uniform kernel estimation with flow-based motion prior for blind image deblurring. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 18105–18114. [Google Scholar]
  20. Mao, X.; Liu, Y.; Liu, F.; Li, Q.; Shen, W.; Wang, Y. Intriguing findings of frequency selection for image deblurring. Proc. AAAI Conf. Artif. Intell. 2023, 37, 1905–1913. [Google Scholar] [CrossRef]
  21. Kupyn, O.; Budzan, V.; Mykhailych, M.; Mishkin, D.; Matas, J. Deblurgan: Blind motion deblurring using conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8183–8192. [Google Scholar]
  22. Kupyn, O.; Martyniuk, T.; Wu, J.; Wang, Z. Deblurgan-v2: Deblurring (orders-of-magnitude) faster and better. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 8878–8887. [Google Scholar]
  23. Wang, Z.; Cun, X.; Bao, J.; Zhou, W.; Liu, J.; Li, H. Uformer: A general u-shaped transformer for image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 17683–17693. [Google Scholar]
  24. Tsai, F.J.; Peng, Y.T.; Lin, Y.Y.; Tsai, C.C.; Lin, C.W. Stripformer: Strip transformer for fast image deblurring. In Proceedings of the European Conference on Computer vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Cham, Switzerland, 2022; pp. 146–162. [Google Scholar]
  25. Zamir, S.W.; Arora, A.; Khan, S.; Hayat, M.; Khan, F.S.; Yang, M.H. Restormer: Efficient transformer for high-resolution image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 5728–5739. [Google Scholar]
  26. Kong, L.; Dong, J.; Ge, J.; Li, M.; Pan, J. Efficient frequency domain-based transformers for high-quality image deblurring. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 5886–5895. [Google Scholar]
  27. Liu, N.; Sun, H.; Yang, J. Recognition of the stacked two-dimensional bar code based on iterative deconvolution. Imaging Sci. J. 2010, 58, 81–88. [Google Scholar]
  28. Xu, W.; McCloskey, S. 2D Barcode localization and motion deblurring using a flutter shutter camera. In Proceedings of the 2011 IEEE Workshop on Applications of Computer Vision (WACV), Kona, HI, USA, 5–7 January 2011; pp. 159–165. [Google Scholar] [CrossRef]
  29. Sörös, G.; Semmler, S.; Humair, L.; Hilliges, O. Fast blur removal for wearable QR code scanners. In Proceedings of the 2015 ACM International Symposium on Wearable Computers, Osaka, Japan, 7–11 September 2015; ISWC ’15. pp. 117–124. [Google Scholar] [CrossRef]
  30. van Gennip, Y.; Athavale, P.; Gilles, J.; Choksi, R. A Regularization Approach to Blind Deblurring and Denoising of QR Barcodes. IEEE Trans. Image Process. 2015, 24, 2864–2873. [Google Scholar] [CrossRef] [PubMed]
  31. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  32. Zamir, S.W.; Arora, A.; Khan, S.; Hayat, M.; Khan, F.S.; Yang, M.H.; Shao, L. Multi-stage progressive image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 14821–14831. [Google Scholar]
  33. Kim, J.; Lee, J.K.; Lee, K.M. Accurate image super-resolution using very deep convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1646–1654. [Google Scholar]
  34. Lai, W.S.; Huang, J.B.; Ahuja, N.; Yang, M.H. Fast and Accurate Image Super-Resolution with Deep Laplacian Pyramid Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 41, 2599–2613. [Google Scholar] [CrossRef] [PubMed]
  35. Park, D.; Kang, D.U.; Kim, J.; Chun, S.Y. Multi-temporal recurrent neural networks for progressive non-uniform single image deblurring with incremental temporal training. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; Springer: Cham, Switzerland, 2020; pp. 327–343. [Google Scholar]
  36. Zhang, K.; Zuo, W.; Chen, Y.; Meng, D.; Zhang, L. Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising. IEEE Trans. Image Process. 2017, 26, 3142–3155. [Google Scholar] [CrossRef] [PubMed]
  37. Liu, Y.; Fang, F.; Wang, T.; Li, J.; Sheng, Y.; Zhang, G. Multi-Scale Grid Network for Image Deblurring With High-Frequency Guidance. IEEE Trans. Multimed. 2022, 24, 2890–2901. [Google Scholar] [CrossRef]
  38. Zhang, J.; Cui, G.; Zhao, J.; Chen, Y. High-Frequency Attention Residual GAN Network for Blind Motion Deblurring. IEEE Access 2022, 10, 81390–81405. [Google Scholar] [CrossRef]
  39. Hou, Q.; Zhou, D.; Feng, J. Coordinate attention for efficient mobile network design. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 13713–13722. [Google Scholar]
Figure 1. Network architecture of EGMSNet.
Figure 2. The architecture of the edge branch network with three cascaded edge extract modules (EEMs) and the edge feature fusion module (EFFM).
Figure 3. The architecture of the feature filtering mechanism (FFM) comprising the spatial selection module (SSM) and the frequency selection module (FSM).
Figure 4. Heatmap visualization and the impact of the FFM module on recovery. The first row (from left to right) shows the blurred image, heatmap before FFM, and heatmap after FFM. The second row (from left to right) presents the blurred image, deblurring result without FFM, and deblurring result with FFM. The red box highlights areas where the FFM module significantly enhances recovery quality.
Figure 5. Scatter plot of latency versus performance for various deep learning algorithms, demonstrating the superior PSNR and reduced latency of EGMSNet.
Figure 6. Comparison of the zoomed-in details of images processed by different methods on the synthetic QRCode dataset with motion blur, defocus blur, and Gaussian blur.
Figure 7. Visual comparison of randomly selected examples from the Data Matrix and EAN13 test sets using different methods.
Figure 8. Visual comparison of different restoration methods applied to a real QR code image affected by camera defocus. The figure shows three scene images with varying levels of blur, contrast, and sharpness, along with a barcode image exhibiting real motion blur artifacts.
Figure 9. Decoding rates before and after applying deblurring methods.
Figure 10. PSNR and SSIM values for different α values in the wavelet reconstruction loss. The highest values were obtained when α = 0.1 .
Figure 11. Restored QRCode images with and without wavelet reconstruction loss. The inclusion of wavelet reconstruction loss significantly improved texture detail restoration.
Table 1. Quantitative results for all competitors on the synthetic QRCode dataset.

Method | PSNR | SSIM | MAE | Params (M) | FLOPs (G) | Runtime (s)
DCP [8] | 21.2259 | 0.8721 | 0.0371 | N/A | N/A | 136.659
RGTV [6] | 19.8732 | 0.7911 | 0.0463 | N/A | N/A | 24.339
ESM [7] | 21.9150 | 0.8789 | 0.0347 | N/A | N/A | 13.156
DeepDeblur [10] | 27.3358 | 0.9151 | 0.0327 | 11.694 | 335 | 0.033
MIMO-UNet [11] | 29.8646 | 0.9433 | 0.0248 | 6.802 | 63.639 | 0.011
MIMO-UNet+ [11] | 30.5985 | 0.9519 | 0.0223 | 16.102 | 151.34 | 0.022
NAFNet [12] | 30.3950 | 0.9485 | 0.0230 | 64.887 | 63.236 | 0.023
MSSNet [13] | 31.2483 | 0.9565 | 0.0327 | 15.591 | 145.23 | 0.024
Ours | 31.8211 | 0.9622 | 0.0192 | 6.831 | 39.794 | 0.014
Table 2. Quantitative results for all competitors on the synthetic Data Matrix and EAN13 datasets.

Method | Data Matrix (PSNR / SSIM / MAE) | EAN13 (PSNR / SSIM / MAE)
DCP [8] | 24.3528 / 0.7271 / 0.0479 | 18.6741 / 0.7095 / 0.0755
RGTV [6] | 20.4702 / 0.7201 / 0.0490 | 16.8041 / 0.6855 / 0.0833
ESM [7] | 24.7180 / 0.7376 / 0.0462 | 21.3252 / 0.7435 / 0.0644
DeepDeblur [10] | 24.1110 / 0.7201 / 0.0490 | 25.5796 / 0.8126 / 0.0416
MIMO-UNet [11] | 26.0586 / 0.7762 / 0.0400 | 27.0412 / 0.8306 / 0.0384
MIMO-UNet+ [11] | 26.5528 / 0.7856 / 0.0392 | 28.2813 / 0.8328 / 0.0353
NAFNet [12] | 26.9079 / 0.7940 / 0.0388 | 26.4733 / 0.8096 / 0.0429
MSSNet [13] | 27.1040 / 0.7963 / 0.0382 | 25.3145 / 0.8022 / 0.0447
Ours | 27.5142 / 0.8157 / 0.0346 | 28.6011 / 0.8369 / 0.0340
Table 3. Ablation study on the edge branch and FFM (✓ indicates the included components, reconstructed from the description in the text).

Model | Base | Edge | SSM | FSM | PSNR | SSIM | Params (M)
a | ✓ | | | | 31.45 | 0.9597 | 6.448
b | ✓ | ✓ | | | 31.55 | 0.9602 | 6.821
c | ✓ | | ✓ | | 31.5804 | 0.9610 | 6.454
d | ✓ | | | ✓ | 31.5307 | 0.9604 | 6.452
e | ✓ | | ✓ | ✓ | 31.6307 | 0.9612 | 6.458
f | ✓ | ✓ | ✓ | ✓ | 31.8211 | 0.9622 | 6.831
Table 4. Comparison of different feature fusion strategies and FFM weight-sharing schemes.

Model | PSNR | SSIM | Params (M)
(a) Addition | 31.6981 | 0.9546 | 6.728
(b) Concat+Conv | 31.7425 | 0.9582 | 6.739
(c) EFFM | 31.8211 | 0.9622 | 6.831
(d) FFM* (no shared weights) | 31.5901 | 0.9605 | 6.841
(e) FFM (shared weights) | 31.8211 | 0.9622 | 6.831