4.1. Comparison of Attention Mechanisms
To investigate the impact of different attention mechanisms on detection performance, four mainstream attention modules—SE, CBAM, Coordinate Attention (CA), and Shuffle Attention—were integrated into the YOLOv8n model separately, while keeping the rest of the network architecture unchanged. The detection performance of these variants was then compared on the same dataset. The comparative results are summarized in
Table 4.
As summarized in
Table 4, all four attention modules improve detection to varying degrees, confirming that attention helps highlight subtle defects against low-contrast, texture-rich backgrounds. Compared with the baseline YOLOv8n (P = 93.2%, R = 94.2%, mAP@0.5 = 92.8%, mAP@[0.5:0.95] = 39.1%), SE provides the most balanced gains, increasing P to 94.3% (+1.1%) and R to 94.3% (+0.1%), and raising mAP@0.5 to 93.5% (+0.7%). CBAM and Shuffle Attention also yield stable improvements: CBAM achieves the best result on the strict metric, mAP@[0.5:0.95] = 39.9% (+0.8%), whereas Shuffle Attention slightly improves mAP@0.5 to 93.8% and mAP@[0.5:0.95] to 39.5%.
From a modeling perspective, these behaviors are consistent with the design of the modules and the characteristics of our defects. SE performs lightweight channel-wise re-weighting based on global pooling, which has been reported to enhance subtle object cues on complex backgrounds with very low computational overhead. For the tiny, low-contrast cracks, missing-print areas, and spots embedded in halftone-dot textures, the most discriminative information is mainly encoded in a few weak channels rather than in stable spatial contours, so strengthening channel discriminability after multi-scale fusion is more effective than introducing complex spatial interactions. CBAM adds an extra spatial-attention branch that can further refine boundaries and thus slightly benefits the strict mAP@[0.5:0.95], but this branch is more sensitive to local texture noise and increases the computational cost [
31]. Coordinate Attention injects directional positional cues into channel attention and is well suited for large objects with stable orientation; however, the defects in this work are mostly irregular and orientation-ambiguous, which limits its effectiveness [
32]. Shuffle Attention implements joint channel–spatial modeling via grouped shuffling operations, but in our experiments it brings only limited gains while slightly weakening the global channel dependency compared with SE [
33]. Overall, these results indicate that a lightweight channel re-calibration module such as SE is better matched to the representation characteristics of tiny printing defects and offers a more favorable accuracy–complexity trade-off than the other attention mechanisms.
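For illustration, the following minimal PyTorch sketch shows the channel re-calibration performed by an SE block as described above: global average pooling squeezes each channel to a scalar, and a two-layer bottleneck produces per-channel gates that rescale the feature map. The reduction ratio of 16 is the conventional default and is an assumption here, not necessarily the setting used in our experiments.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation channel re-weighting (minimal sketch)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: B x C x 1 x 1
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),  # per-channel gate in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # channel-wise re-calibration of the feature map
```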
4.2. Comparison of Loss Functions
To evaluate the performance improvement introduced by the EIoU loss function, the original CIoU loss in YOLOv8n was replaced in turn with WIoU-v3, GIoU, and EIoU. Model performance metrics were then compared under a consistent experimental setup, as shown in
Table 5.
Table 5 shows that all three replacement losses improve on CIoU, though to different degrees. Relative to the baseline CIoU (P = 93.2%, R = 94.2%, mAP@0.5 = 92.8%, mAP@[0.5:0.95] = 39.1%), WIoU-v3 yields mild gains (mAP@0.5 = 93.1%, mAP@[0.5:0.95] = 39.3%), and GIoU performs slightly better on the stricter metric (mAP@[0.5:0.95] = 39.5%). EIoU brings the most substantial and uniform improvement, increasing P to 94.1%, R to 95.0%, mAP@0.5 to 93.8%, and mAP@[0.5:0.95] to 39.6%.
From a geometric viewpoint, IoU-based losses such as DIoU/CIoU extend the original IoU by incorporating center-distance and aspect-ratio terms [
34], which improves localization but can still be sub-optimal for low-quality or highly elongated boxes—situations that frequently occur for thin cracks and small missing-print regions in printing inspection. WIoU and its variants (e.g., WIoU-v3 used in this work) introduce quality-aware re-weighting to stabilize regression for hard small targets [
35], but the overall gains on our dataset remain modest. In contrast, EIoU explicitly decomposes width and height errors and directly constrains side-length discrepancies, which better matches the small, thin and weakly bounded geometry of printing defects and explains its consistent improvements on both mAP@0.5 and mAP@[0.5:0.95].
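To make this decomposition concrete, the following sketch implements the standard EIoU formulation—1 − IoU plus center-distance, width, and height penalty terms—for boxes in (x1, y1, x2, y2) format. It is a minimal reference implementation under these assumptions, not the exact code of our training pipeline.

```python
import torch

def eiou_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """EIoU loss for boxes in (x1, y1, x2, y2) format (minimal sketch)."""
    # intersection area
    x1 = torch.max(pred[..., 0], target[..., 0])
    y1 = torch.max(pred[..., 1], target[..., 1])
    x2 = torch.min(pred[..., 2], target[..., 2])
    y2 = torch.min(pred[..., 3], target[..., 3])
    inter = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)

    # union area and IoU
    w1, h1 = pred[..., 2] - pred[..., 0], pred[..., 3] - pred[..., 1]
    w2, h2 = target[..., 2] - target[..., 0], target[..., 3] - target[..., 1]
    union = w1 * h1 + w2 * h2 - inter + eps
    iou = inter / union

    # smallest enclosing box: width, height, squared diagonal
    cw = torch.max(pred[..., 2], target[..., 2]) - torch.min(pred[..., 0], target[..., 0])
    ch = torch.max(pred[..., 3], target[..., 3]) - torch.min(pred[..., 1], target[..., 1])
    c2 = cw ** 2 + ch ** 2 + eps

    # squared distance between box centers
    rho2 = ((pred[..., 0] + pred[..., 2] - target[..., 0] - target[..., 2]) ** 2
            + (pred[..., 1] + pred[..., 3] - target[..., 1] - target[..., 3]) ** 2) / 4

    # explicit width/height penalties: the terms that distinguish EIoU from CIoU
    return (1 - iou + rho2 / c2
            + (w1 - w2) ** 2 / (cw ** 2 + eps)
            + (h1 - h2) ** 2 / (ch ** 2 + eps))
```

Because the side-length errors enter the loss directly rather than through an aspect-ratio surrogate, gradients for thin, elongated boxes remain informative even when the ratio term would saturate.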
4.3. Detection Results by Defect Type: Before and After Optimization
To further validate the detection capability of the proposed optimization strategy across different types of printed defects, three representative defect categories—cracks, missing prints, and spots—were selected. Precision and recall were compared between the original YOLOv8n model and the optimized YOLOv8n model, with the results presented in
Table 6.
As shown in
Table 6, the optimized YOLOv8n model achieved consistent improvements in detection performance across all defect categories:
For cracks, the optimized model achieved a precision of 98.5% and a recall of 96.8%, a 2.6-percentage-point improvement in precision over the original model (P = 95.9%, R = 96.5%). This indicates that the optimized model possesses stronger discriminative capability in modeling boundary details.
For missing-print defects, the optimized model achieved a precision of 97.9% and a recall of 96.4%, showing higher consistency and detection stability than the original YOLOv8n model (P = 95.0%, R = 96.3%). This demonstrates improved robustness when handling low-contrast targets.
For spot defects, which are relatively more challenging, the optimized model still achieved a marginal precision increase from 89.4% to 89.5% and a recall improvement from 89.7% to 90.5%, indicating that the optimization strategy also has potential to enhance detection of small targets with low texture contrast.
To further reveal class-specific performance, we report the per-class PR curves and AP values of the optimized model on the test set (
Figure 5). The AP@0.5 for cracks and missing prints is close to 0.99, whereas spots remain more challenging due to weak contrast and strong texture interference, leading to a relatively lower AP.
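For context, AP@0.5 is the area under the per-class precision–recall curve at an IoU threshold of 0.5. A minimal NumPy sketch of the standard all-point interpolation is shown below, assuming recall values sorted in ascending order; the evaluation toolkit we used may differ in interpolation details.

```python
import numpy as np

def average_precision(recall: np.ndarray, precision: np.ndarray) -> float:
    """All-point interpolated AP: area under the PR curve after enforcing a
    monotonically non-increasing precision envelope."""
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    p = np.maximum.accumulate(p[::-1])[::-1]   # precision envelope
    idx = np.where(r[1:] != r[:-1])[0]         # points where recall changes
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))
```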
Figure 6 presents the normalized confusion matrix, which clarifies the major confusion patterns among defect categories. The dominant diagonal entries indicate reliable classification overall, and most residual errors arise from the confusion between tiny spot defects and background textures, consistent with their imaging characteristics.
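Here, "normalized" means each row of the raw confusion matrix is divided by the number of ground-truth instances of that class, so the diagonal entries read as per-class recall. A minimal sketch of this row normalization, assuming rows index true classes, is given below.

```python
import numpy as np

def normalize_confusion(cm: np.ndarray) -> np.ndarray:
    """Row-normalize a raw confusion matrix (rows = true classes)."""
    row_sums = cm.sum(axis=1, keepdims=True)
    return cm / np.maximum(row_sums, 1)  # guard against empty classes
```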
4.4. Ablation Study
To evaluate the specific performance improvements contributed by the EIoU loss function and SE attention mechanism in the proposed optimization strategy, an ablation study was conducted. Based on the YOLOv8n model, three configurations were tested: incorporation of EIoU loss, integration of SE attention, and the combination of both. The impact of each configuration on detection accuracy was assessed. The experimental results are presented in
Table 7.
As shown in
Table 7, the original YOLOv8n model achieved P of 93.2%, R of 94.2%, and mAP@0.5 of 92.8%. After incorporating the EIoU loss function, the model’s bounding box regression capability was enhanced, with precision increasing to 94.1%, recall to 95.0%, and mAP@0.5 to 93.8%, indicating that EIoU positively optimizes object localization performance. Furthermore, introducing the SE attention mechanism also improved model performance, particularly in the comprehensive metric mAP@[0.5:0.95], which increased from 39.1% to 39.6%, demonstrating the effectiveness of the SE mechanism in enhancing channel feature representation.
When EIoU and the SE mechanism were combined, the model achieved the best overall performance, with precision reaching 95.1%, mAP@0.5 at 94.1%, and mAP@[0.5:0.95] at 39.6%. This confirms the synergy between EIoU loss and SE attention: they improve bounding-box regression accuracy and channel feature responsiveness, respectively, and are thus complementary. Therefore, both the proposed EIoU and SE modules independently improved detection performance, and their combined use produced more stable and pronounced gains, providing an effective pathway for subsequent model optimization. As shown in
Figure 7, the proposed method can detect various defects such as missing prints, cracks, and spots in printed materials, intuitively demonstrating the detection performance of the improved model on real samples.
4.5. Comparison with Other Models and Real-Time Performance Analysis
To further evaluate the effectiveness of the proposed optimization strategy in printing defect detection, this study conducted a comparative analysis with current mainstream object detection models, including YOLOv3-tiny, YOLOv5s, YOLOv6s, YOLOv8s, and YOLOv8m. The analysis focused on comprehensive performance in terms of accuracy and efficiency, with the optimized YOLOv8n model included for evaluation. The experimental results are presented in
Table 8.
As shown in
Table 8, YOLOv3-tiny has an advantage in lightweight design (23.2 MB model size, 18.9 GFLOPs), but its accuracy remains relatively low (mAP@0.5 = 92.4%), which is insufficient for high-precision industrial inspection. YOLOv5s and YOLOv6s strike a balance between model complexity and inference speed. Notably, YOLOv6s reached the highest inference speed among the compared models, 145.6 FPS, with relatively low computation (11.8 GFLOPs), demonstrating strong real-time capability; however, its detection accuracy (mAP@0.5 = 92.1%) remained suboptimal.
Overall, the YOLOv8 series outperformed previous models in accuracy. YOLOv8m achieved an mAP@0.5 of 93.4%, but its computational cost and model size increased substantially (78.7 GFLOPs, 49.6 MB), limiting practical deployment. YOLOv8s reduced model size and computation to some extent, yet its inference speed offered no significant advantage.
In contrast, the proposed optimized YOLOv8n model achieved the best overall performance while remaining lightweight. The model contains only 3.02 M parameters, requires 8.1 GFLOPs, and has a size of 6 MB, yet exhibits significant accuracy improvements, achieving an mAP@0.5 of 94.1% and precision of 95.1%. Its inference speed reaches 100.2 FPS, meeting the dual requirements of real-time processing and high accuracy for high-speed printing production lines. In summary, the optimized YOLOv8n demonstrates advantages in detection accuracy, inference speed, and lightweight design, highlighting its strong potential for practical engineering applications.
To visually illustrate the comprehensive advantages of the optimized YOLOv8n model in terms of accuracy, computational cost, and model size,
Figure 8 compares key detection metrics—precision (P), recall (R), and mAP@0.5—across mainstream models, overlaid with GFLOPs to depict computational complexity. As shown, the optimized YOLOv8n achieves excellent performance with only 8.1 GFLOPs and a 6 MB model size, reaching 95.1% precision, 94.3% recall, and 94.1% mAP@0.5, outperforming YOLOv3-tiny, YOLOv5s, YOLOv6s, and YOLOv8s. These results confirm the effectiveness of the proposed optimization strategy and its strong potential for industrial deployment.
As shown in
Figure 8, the improved YOLOv8n maintains this accuracy advantage while requiring only 8.1 GFLOPs of computation and 6 MB of storage, and it still achieves an inference speed of 100.2 FPS, demonstrating an outstanding balance between accuracy and computational complexity. Based on these results, the following section discusses the engineering and industrial implications of the proposed method from the downstream production perspective of the “forest–wood–pulp–paper–printing” value chain, with a particular focus on wood-based manufacturing processes.
To clarify the practical meaning of the achieved inference speed (100.2 FPS), we quantitatively relate the processing rate to the throughput requirement of the roll-to-roll digital printing line from which the dataset was collected. Using a representative web speed of ≈50 m/min, the corresponding linear speed is v_s = 50/60 ≈ 0.833 m/s. With the measured inference rate, the interval between consecutive frames is Δt = 1/FPS ≈ 0.00998 s, yielding a per-frame covered web length of

Δs = v_s · Δt ≈ 0.833 m/s × 0.00998 s ≈ 8.3 mm.

Equivalently, the spatial sampling density along the moving web is

ρ = FPS / v_s ≈ 100.2 / 0.833 ≈ 120 frames per meter.
This mapping indicates that, at a typical operating speed, the proposed system performs dense continuous inspection (roughly one frame every 8.3 mm of web) without evident spatial gaps, providing sufficient real-time margin and throughput for online quality control.
Beyond sampling density, roll-to-roll online quality control requires the detection latency to satisfy control-loop constraints. Let the web speed be v_s and the distance from the camera station to the downstream action unit (alarm/marking/rejection/stop) be d. The end-to-end latency L_e2e should satisfy L_e2e ≤ d/v_s to ensure that defects can be acted upon before reaching the actuator. Under a representative speed of 50 m/min (v_s ≈ 0.833 m/s) and a typical industrial camera–actuator distance on the order of 0.3–0.5 m, the allowable latency budget is about 0.36–0.60 s. The proposed model yields a single-frame inference interval of Δt ≈ 9.98 ms (100.2 FPS) on the target hardware, which is far below this threshold, demonstrating sufficient low-latency margin for roll-to-roll control.
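These figures can be verified with the short back-of-the-envelope computation below; all inputs are the representative values quoted above rather than additional measurements.

```python
# Verify the sampling-density and latency-budget figures in the text.
web_speed_m_per_min = 50.0        # representative roll-to-roll web speed
fps = 100.2                       # measured inference rate

v_s = web_speed_m_per_min / 60.0  # linear speed [m/s] -> ~0.833
dt = 1.0 / fps                    # inter-frame interval [s] -> ~0.00998
ds_mm = v_s * dt * 1000.0         # web length covered per frame [mm] -> ~8.3
density = fps / v_s               # spatial sampling density [frames/m] -> ~120

print(f"frame interval {dt * 1e3:.2f} ms, {ds_mm:.1f} mm of web per frame, {density:.0f} frames/m")
for d in (0.3, 0.5):              # camera-to-actuator distance [m]
    print(f"d = {d:.1f} m -> latency budget {d / v_s:.2f} s")
```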
In addition, to assess the stability of real-time inference, we report the end-to-end per-frame latency distribution on the validation set, as shown in
Figure 9. Most frames are processed within approximately 8–11 ms, with an average latency of 9.83 ms, and p95 and p99 values of 10.94 ms and 11.80 ms, respectively, indicating no evident long-tail jitter. These results demonstrate that the improved YOLOv8n model provides stable low-latency performance under the current hardware and parameter configuration, which is consistent with the ≈100 FPS throughput analysis above and satisfies the real-time requirements of roll-to-roll online inspection.
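For reproducibility, the reported mean and percentile statistics can be computed from raw per-frame timing logs with a routine such as the following sketch; the array of measured latencies is assumed to come from instrumented validation inference and is not provided here.

```python
import numpy as np

def summarize_latency(latencies_ms: np.ndarray) -> dict:
    """Summarize measured end-to-end per-frame latencies (in milliseconds)."""
    return {
        "mean_ms": float(latencies_ms.mean()),
        "p95_ms": float(np.percentile(latencies_ms, 95)),
        "p99_ms": float(np.percentile(latencies_ms, 99)),
    }

# `latencies_ms` would be collected by timing each validation frame from
# image hand-off to post-processed detections on the target hardware.
```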