Article

TASNet-YOLO: An Identification and Classification Model for Surface Defects of Rough Planed Bamboo Strips

Yitong Zhang, Rui Gao, Min Ji, Wei Zhang, Wenquan Yu and Xiangfeng Wang
1 Fujian Academy of Forestry, Fuzhou 350012, China
2 School of Mechatronic Engineering, Fujian Agriculture and Forestry University, Fuzhou 350002, China
3 Key Laboratory of Forest Cultivation and Forest Products Processing and Utilization of Fujian, Fuzhou 300012, China
4 Research Institute of Wood Industry, Chinese Academy of Forestry, Beijing 100091, China
* Author to whom correspondence should be addressed.
Forests 2025, 16(10), 1595; https://doi.org/10.3390/f16101595
Submission received: 11 September 2025 / Revised: 14 October 2025 / Accepted: 15 October 2025 / Published: 17 October 2025

Abstract

After rough planing, defects such as wormholes and small patches of green bark residue and decay are often overlooked and misclassified. Strip-like defects, including splinters and chipped edges, are easily confused with the natural bamboo grain, and a single elongated defect is frequently fragmented into multiple detection boxes. This study proposes TASNet-YOLO, an improved detector built on YOLO11n. Unlike prior YOLO-based bamboo defect detectors, TASNet-YOLO is a mechanism-guided redesign that jointly targets two persistent failure modes (limited visibility of small, low-contrast defects and fragmentation of elongated defects) while remaining feasible for real-time production settings. In the backbone, a newly designed TriMAD_Conv module is introduced as the core unit, enhancing the detection of wormholes as well as small-area defects such as green bark residue and decay. The additive-gated C3k2_AddCGLU is further integrated at selected C3k2 stages; the combination of additive interaction and CGLU improves channel selection and detail retention, highlighting the differences between splinters or chipped edges and bamboo grain strips, thereby reducing false positives and improving precision. In the neck, nearest-neighbor upsampling and CBS are replaced with SNI-GSNeck to improve cross-scale alignment and fusion. Under an acceptable real-time budget, predictions for splinters and chipped edges become more contiguous and better aligned to edges, while wormhole predictions are more circular and less noisy. Experiments on our in-house dataset (8445 annotated bamboo-strip defect instances) show that, compared with YOLO11n, the proposed model improves mAP by 5.1 percentage points, achieves 106.4 FPS, and reduces computational cost by 0.4 GFLOPs per forward pass. These properties meet the throughput demand of 2 m/s conveyor lines, and the compact model size and compute footprint make edge deployment straightforward for fast online screening and preliminary quality grading in industrial production.

1. Introduction

Bamboo strips are the fundamental feedstock for producing bamboo products, and their quality directly determines product performance. In practice, a variety of surface defects often occur, arising from environmental stress and pest damage during growth as well as from mechanical damage introduced during processing, transport, and storage. Typical defects include wormholes, chipped edges, decay, green bark residue, yellow pith residue, and splinters [1]. These defects diminish mechanical strength and stability, shorten service life, and impair surface quality, ultimately reducing market value. A critical intermediate in bamboo laminated lumber [2] manufacturing is the rough-planed bamboo strip; its surface quality directly affects adhesive bonding, grading, and material yield. During panel manufacture, multiple strips are bonded to form a single panel; however, the presence of just one defective strip can compromise the entire panel, resulting in downgrading at best, or scrappage and production stoppage at worst. By conservative estimates, defects lead to industry-wide profit losses approaching USD 140 million each year [3]. At the global scale, the cross-border trade value of bamboo and rattan reached approximately USD 4.12 billion in 2022 [4], indicating steady international demand and an active supply chain. Among downstream applications directly linked to bamboo strips, the bamboo flooring segment was valued at about USD 1.43 billion in 2024 and is projected to reach USD 2.02 billion by 2034; in a broader scope, the global bamboo industry was on the order of USD 67.1 billion in 2024 [5]. Accordingly, reliable inspection and screening of bamboo strips are critical for improving the quality of bamboo products and reducing manufacturing costs.
Current approaches to bamboo-strip surface defect detection fall into two categories: machine-vision methods and deep-learning–based object detection methods [6]. Traditional machine-vision methods primarily exploit hand-crafted cues, namely color, texture, and edges, for defect detection [7]. Zeng et al. [8] extracted color features via color moments and texture features via the gray-level co-occurrence matrix, applied PCA for dimensionality reduction, and then used an SVM to classify bamboo into eight color grades, achieving promising recognition accuracy. Kuang et al. [9] extracted regions of interest via thresholding and built a dual-modal texture representation by combining LBP and GLCM features, thereby validating the effectiveness of multi-feature fusion for bamboo-strip classification. However, classical machine-vision approaches require labor-intensive feature engineering, generalize poorly across scenes, and incur nontrivial computational overhead [10], falling short of the efficiency and real-time requirements of modern bamboo-strip surface defect inspection.
In recent years, deep-learning–based image classification and object detection have gained significant traction [11]. Detectors are generally divided into two-stage methods [12] and one-stage methods [13]. Compared with their two-stage counterparts, one-stage detectors offer substantially higher speed while preserving accuracy; representative examples include YOLO [14], SSD [15], and RetinaNet [16]. The YOLO family performs object classification and bounding-box regression simultaneously, enabling fast response, non-contact operation, and flexible deployment, and has been widely applied to surface-defect inspection of bamboo and wood [17]. Nonetheless, in the aspects most pertinent to real industrial deployments, common shortcomings persist. First, the continuity of elongated defects and the preservation of their topology are inadequate: high-aspect-ratio splinters and chipped edges are often fragmented into multiple detection boxes during inference, with poor boundary adherence, so that a single physical defect appears as disjoint predictions. Second, publicly available datasets specific to this task are scarce and the literature remains limited; moreover, datasets used in existing studies are typically small in scale, which leads to limited model generalization. Yang et al. [18] introduced a Coordinate Attention (CA) module to enhance feature extraction for five defect types and refined the hierarchy with C2f, improving accuracy while reducing parameters for a lightweight model. In later work, Yang et al. [19] adapted an improved YOLOv8 to bamboo-strip defect detection, with targeted optimizations for small objects and complex textures, yielding a significant accuracy gain. Guo et al. [20] addressed the peculiarity of high-aspect-ratio, slender defects on bamboo surfaces by strengthening horizontal responses with an asymmetric convolution block and introducing a hybrid attention mechanism for joint spatial and channel enhancement, achieving high accuracy across six typical defect classes. Accordingly, YOLO was adopted as the baseline in this study for further improvements.
To address six surface defect types on rough-planed bamboo strips, this study proposes an improved YOLO11n-based detection method. The main contributions are as follows:
(1) To mitigate the under-detection of wormholes and small-area green bark residue and decay, a tri-branch unit was designed to capture fine edges while preserving surrounding texture, with channel attention suppressing interference from bamboo grain bands and knots; a dilated-convolution branch widened the effective field of view, improving small-object visibility and detection stability. (2) C3k2_AddCGLU was integrated at the C3k2 stages, leveraging additive residual interaction and CGLU gating to sharpen channel selection and preserve details; this accentuated the differences between splinters or chipped edges and the bamboo grain band, reducing false positives and improving precision. (3) To mitigate fragmentation and drift when aggregating along-grain elongated defects, SNI was adopted as the upsampling strategy in the neck, and cross-scale features were then fused with GSConvE-I; this combination suppressed the blocky, jagged stair-step artifacts introduced by upsampling. Under an acceptable real-time budget, predictions for splinters and chipped edges became more contiguous and better aligned to edges, and wormhole detections appeared rounder.

2. Materials and Methods

2.1. Materials

Given the absence of public datasets for bamboo-strip surface defects, an in-house laboratory dataset was constructed as the experimental basis. A total of 5000 defect-bearing strips were collected from a manufacturing plant, and a laboratory line-scan acquisition platform was established, consisting of a line-scan camera, a lighting system, and a light-source controller (all from Hangzhou Hikvision Digital Technology Co., Ltd., Hangzhou, China) together with a photoelectric sensor (OMRON Corporation, Osaka, Japan) (Figure 1) [21]. Triggered capture and data management were conducted with MVS (V4.5.1) software. The setup comprised a line-scan camera (pixel size 14 µm, effective resolution 2048 × 2 px), a 40 mm industrial lens, and a bar-type illuminator. Because the roller table ran at constant speed, the photoelectric sensor was paired with internal frame triggering to synchronize conveyor speed and line rate [22]. The exposure time was set to 428 µs and the line rate to 5 kHz, corresponding to a conveyor speed of 700 mm/s. Geometric distortion and illumination non-uniformity were corrected via checkerboard calibration plus flat-field and dark-field compensation, ensuring comparable physical scale and brightness across batches. In total, 38,095 JPG images were collected, covering diverse operating conditions and defect appearances.
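As a quick consistency check on these settings (a back-of-envelope calculation of ours, not reported by the authors), the along-travel sampling pitch follows directly from the conveyor speed and line rate:

$\dfrac{v}{f_{\mathrm{line}}} = \dfrac{700\ \mathrm{mm/s}}{5000\ \mathrm{lines/s}} = 0.14\ \mathrm{mm\ per\ scan\ line}$

so each captured line spans 0.14 mm of strip travel, a pitch fine enough to sample even millimetre-scale wormholes across several scan lines.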
After data cleaning, defects were annotated one by one in LabelImg (V1.8.6), yielding 8445 valid instances. Based on morphology and cause, six categories were defined: Green bark residue 1707 (20.2%), Splinters 2206 (26.1%), Wormholes 748 (8.9%), Decay 1753 (20.8%), Yellow pith residue 1495 (17.7%), and Chipped edges 536 (6.3%). The distribution revealed clear class imbalance. Representative examples are shown in Figure 2.
The dataset was then randomly split into training, validation, and test sets in an 8:1:1 ratio to balance learning, tuning, and unbiased evaluation. The final counts were 6753, 841, and 851 instances, respectively (Table 1).
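A minimal sketch of such an 8:1:1 random split is given below; the directory layout, file naming, and list format are illustrative assumptions, not the authors' tooling.

```python
# Illustrative 8:1:1 random split into train/val/test path lists.
import random
from pathlib import Path

random.seed(0)  # fixed seed so the split is reproducible
images = sorted(Path("dataset/images").glob("*.jpg"))
random.shuffle(images)

n = len(images)
n_train, n_val = int(0.8 * n), int(0.1 * n)
splits = {
    "train": images[:n_train],
    "val": images[n_train:n_train + n_val],
    "test": images[n_train + n_val:],
}
for name, files in splits.items():
    # one image path per line, a list format Ultralytics data configs accept
    Path(f"{name}.txt").write_text("\n".join(str(p) for p in files))
```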
Training samples were drawn according to the dataset’s natural distribution (no rebalancing). In future work, minority-oriented augmentation and rebalancing strategies will be evaluated under the same real-time budget, assessing their effect on minority-class recall and overall generalization and reporting both class-wise and macro-averaged indicators.

2.2. Baseline Model Selection

YOLO11 is a next-generation detector released by Ultralytics (September 2024). It is provided in five scales, balancing accuracy and efficiency via depth/width scaling. The architecture retains the YOLOv8 backbone–neck–head design with several targeted refinements (Figure 3). The backbone is the core of an object detector and is responsible for feature extraction. YOLO11 replaces YOLOv8's C2f with C3k2, making the network lighter while maintaining rich gradient flow. To capture global context with modest overhead, YOLO11 adds C2PSA after SPPF (an evolution of the PSA block from YOLOv10 combined with a CSP design), thereby expanding the effective receptive field and further lowering computation. The neck adopts a PAN–FPN scheme for multi-scale fusion: the top-down FPN path propagates high-level semantics to shallow layers, while the bottom-up PAN path feeds precise localization cues back to deep layers, preserving spatial detail and strengthening localization. This bidirectional flow enables complementary information exchange across scales and improves representation and detection quality. The head consumes the fused features to predict object locations and categories. In YOLO11, the detection head inherits YOLOv8's decoupled classification–regression design and introduces targeted refinements: the first two convolutions in the classification branch are replaced with depthwise-separable convolutions, and the final layer is a pointwise convolution, reducing computational cost while maintaining classification and localization performance. Nevertheless, the default backbone and neck fall short in accuracy for bamboo-strip defect detection, motivating the architectural improvements proposed in this study.

2.3. Improved TASNet-YOLO Model

Considering inference speed, accuracy, and deployability, YOLO11n was adopted as the baseline, and the following architectural modifications were introduced. First, a new TriMAD_Conv block was introduced into the YOLO11n backbone to strengthen the detection of wormholes and small-area green bark residue and decay. Second, for the long, continuous boundaries of splinters and chipped edges, the C3 wiring was retained and an additive-residual pathway with CGLU gating was added, sharpening the contrast between these defects and the bamboo-grain bands. Finally, the neck was restructured by adopting SNI for upsampling and GSConvE-I for cross-scale fusion, which improved accuracy under a real-time inference budget. The resulting model, TASNet-YOLO, is shown in Figure 4.

2.3.1. TriMAD_Conv

To enhance the visibility and stable detection of small targets, a tri-branch convolutional module, TriMAD_Conv (Tri-branch Multi-scale Attention with Dilated Convolution) (Figure 5), was designed and embedded in the YOLO11 backbone to strengthen the representation of minute surface defects under modest compute. The module took a shared input and ran three parallel paths that were merged along the channel dimension, providing complementary multi-scale spatial cues, channel reweighting, and long-range context. (i) Top path: a 3 × 3 standard convolution performed channel integration and local texture extraction, followed by two parallel lightweight depthwise-separable convolutions (3 × 3 and 5 × 5) [23] that captured fine details and a larger receptive field; their outputs were concatenated within the branch to form a multi-scale spatial representation. (ii) Middle path: channel attention was applied through global average pooling, followed by a 1 × 1 convolution and a Sigmoid function to generate channel weights; these weights rescaled the input features channel by channel, amplifying defect-related responses while suppressing background noise. (iii) Bottom path: a 3 × 3 dilated convolution with dilation rate 5 [24], followed by batch normalization, introduced wider contextual cues; the effective receptive field was approximately 11 × 11, which helped handle scale variation in small targets and low-contrast textures. Finally, the three paths were concatenated along the channel dimension to produce the module output (Figure 5). This design kept parameters and computation low while enhancing the separability and detection robustness of tiny defects.
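A minimal PyTorch sketch of this tri-branch layout is given below; the channel widths, the exact attention wiring, and the absence of a final projection are our assumptions for illustration, not the published TriMAD_Conv implementation.

```python
# Sketch of the tri-branch idea: (i) local multi-scale path,
# (ii) channel-attention path, (iii) dilated-context path.
import torch
import torch.nn as nn

class TriMADConvSketch(nn.Module):
    def __init__(self, c_in: int, c_mid: int):
        super().__init__()
        # (i) top path: 3x3 conv, then parallel 3x3/5x5 depthwise-separable convs
        self.local = nn.Conv2d(c_in, c_mid, 3, padding=1)
        self.dw3 = nn.Sequential(
            nn.Conv2d(c_mid, c_mid, 3, padding=1, groups=c_mid),
            nn.Conv2d(c_mid, c_mid, 1))
        self.dw5 = nn.Sequential(
            nn.Conv2d(c_mid, c_mid, 5, padding=2, groups=c_mid),
            nn.Conv2d(c_mid, c_mid, 1))
        # (ii) middle path: GAP -> 1x1 conv -> Sigmoid channel weights
        self.att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(c_in, c_in, 1), nn.Sigmoid())
        # (iii) bottom path: 3x3 conv, dilation 5 (~11x11 receptive field) + BN
        self.dil = nn.Sequential(
            nn.Conv2d(c_in, c_mid, 3, padding=5, dilation=5),
            nn.BatchNorm2d(c_mid))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        t = self.local(x)
        top = torch.cat([self.dw3(t), self.dw5(t)], dim=1)  # multi-scale texture
        mid = x * self.att(x)                               # channel reweighting
        bot = self.dil(x)                                   # long-range context
        return torch.cat([top, mid, bot], dim=1)            # channel-wise merge

x = torch.randn(1, 32, 64, 64)
print(TriMADConvSketch(32, 32)(x).shape)  # torch.Size([1, 128, 64, 64])
```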
Compared with the original CBS, the design integrates parallel multi-scale DWConv to cover both local texture and mid-range context under very low parameter and FLOP budgets, making it well suited to heterogeneous, fine-grained defects such as splinter edges, tiny wormhole openings, and chipped-edge fractures. A lightweight channel-attention pathway (global average pooling, a 1 × 1 convolution, Sigmoid, and per-channel reweighting) performed semantic recalibration under strong bamboo grain and nonuniform illumination, boosting target responses while suppressing noise. In addition, a dilated-convolution branch enlarged the effective receptive field and, together with the non-dilated branches, provided complementary long-range context that mitigated gridding artifacts and improved the separability of low-contrast defects. Taken together, the module remained interface-compatible and computationally efficient, while significantly enhancing the backbone's sensitivity to tiny defects and providing more discriminative features for neck fusion and the detection head.

2.3.2. C3k2_AddCGLU

To suppress interference from bamboo-fiber background textures and enhance responses to fine-grained defects, such as splinter edges and wormhole openings, an additive-gated module, C3k2_AddCGLU, was designed. This module refined AdditiveBlock with a convolutional gated linear unit (CGLU) [25] and embedded the resulting AdditiveBlock_CGLU into the C3k2 unit. The standalone structures of AdditiveBlock [26] and CGLU are shown in Figure 6, the combined AdditiveBlock_CGLU in Figure 7, and the integrated C3k2_AddCGLU in Figure 8.
In bamboo-strip defect detection, the core difficulty lay in strongly periodic fiber textures and small, low-contrast anomalies. Broad textured regions triggered false positives, while changes in illumination and material fragmented boundaries and depressed contrast. In the baseline, the C3k unit inside C3k2 relied on local convolutions, which tended to overfit repetitive grain and miss weak anomalies spanning larger areas. Simply enlarging the receptive field greatly increased computation and undermined real-time deployment. To this end, inspired by additive attention, AdditiveBlock was employed as the basic unit. The block first applied a lightweight convolution to consolidate local cues and strengthen edges and positional information. It then performed convolutional additive token interaction, constructing learnable attention maps in the spatial and channel domains and combining them by summation to form a similarity map. This similarity map was finally used to modulate the value branch elementwise, enhancing weak anomalies while suppressing repetitive grain. The additive-convolution design scales approximately linearly with input resolution, remains numerically stable and memory-bandwidth friendly, and builds cross-region context without sacrificing real-time throughput. For low-contrast defects such as green bark residue, yellow pith residue, and decay, the spatial-attention branch lifted responses over a broader area. For tiny wormholes, splinters, and chipped edges, front-loaded edge consolidation together with channel reweighting made faint yet anomalous shape and texture cues more salient.
Using a conventional MLP [27] for channel mixing at the tail of AdditiveBlock remained limiting: an MLP is essentially per-channel linear recombination, lacks spatial inductive bias, and tends to amplify non-discriminative texture responses under strong grain. Its purely additive modeling also struggles to achieve fine-grained selective suppression across space or channels. Therefore, a CGLU was placed at the block tail, decomposing the intermediate representation into an information branch and a gating branch. Both branches followed a lightweight DWConv + 1 × 1 path, with the gating branch modulating the information branch elementwise. This "convolution plus gating" design kept parameters and compute low while introducing an effective local receptive field, making the gate more sensitive to geometric cues such as splinter edges, small wormhole openings, and chipped-edge fractures. Over large homogeneous fiber textures or under nonuniform lighting, the gate yielded lower responses, suppressing non-discriminative texture and artifacts and focusing attention on truly anomalous fine-grained patterns. Built on this idea, C3k2_AddCGLU provided global and local interaction with controllable texture suppression, addressing small-target and low-contrast challenges in bamboo strips without notable computational overhead.
Mathematical formulation of the C3k2_AddCGLU module:
C3k2 progressive concatenation: given the input feature $x \in \mathbb{R}^{B \times C_1 \times H \times W}$ and the hidden width $C_h = e \cdot C_2$,

$h = \mathrm{Conv}_{1 \times 1}(x) \in \mathbb{R}^{B \times C_h \times H \times W}, \quad z^{(0)} = h, \quad u^{(0)} = h \quad (1)$

$u^{(i)} = U_i\left(u^{(i-1)}\right), \quad i = 1, \ldots, n \quad (2)$

$y = \mathrm{Concat}\left[z^{(0)}, u^{(1)}, \ldots, u^{(n)}\right] \in \mathbb{R}^{B \times (n+1) C_h \times H \times W} \quad (3)$

$\mathrm{Out} = \mathrm{Conv}_{1 \times 1}(y) \in \mathbb{R}^{B \times C_2 \times H \times W} \quad (4)$

Additive interaction in AdditiveBlock: the third relation in Equation (5) applies a pointwise gate to $V$, $O = \Gamma(A) \odot V$, where $\Gamma(\cdot)$ is a pointwise scalar gating function and $\odot$ denotes element-wise multiplication with broadcasting; equivalently, $O_{b,c,h,w} = \Gamma(A_{b,c,h,w})\, V_{b,c,h,w}$:

$Q, K, V = \mathrm{Conv}_{1 \times 1}(x), \quad A = \Phi_s(Q) + \Phi_s(K) + \Phi_c(Q) + \Phi_c(K), \quad O = \Gamma(A) \odot V \quad (5)$

$\mathrm{CGLU}(x) = A(x) \odot \sigma\left(B(x)\right), \quad A(x), B(x) = \mathrm{Conv}_{1 \times 1}\left(\mathrm{DWConv}_{3 \times 3}(x)\right) \quad (6)$
The representation obtained in Equation (5) was normalized and added via a residual path, then fed into Equation (6) and output with a residual connection, thereby forming AdditiveBlock_CGLU.
where B is the batch size; H and W are the spatial dimensions; C1 and C2 are the input and output channel counts; n is the number of units in the working branch; Conv1×1 denotes a pointwise convolution; DWConv3×3 denotes a 3 × 3 depthwise convolution; Concat[·] indicates channel-wise concatenation; ⊙ denotes element-wise multiplication; σ is the Sigmoid function; Φs and Φc are the spatial and channel attention maps; Γ is the pointwise gating projection; and Ui denotes the i-th internal unit.
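A condensed PyTorch sketch of Equations (5) and (6) follows. The concrete forms chosen for Φs, Φc, and Γ (a depthwise convolution, a pooled SE-style projection, and a 1 × 1 convolution) are lightweight stand-ins of ours; the sketch illustrates the additive-gating pattern rather than reproducing the authors' exact block.

```python
# Additive interaction (Eq. 5) followed by a CGLU tail (Eq. 6), with residuals.
import torch
import torch.nn as nn

class AdditiveCGLUSketch(nn.Module):
    def __init__(self, c: int):
        super().__init__()
        self.qkv = nn.Conv2d(c, 3 * c, 1)                     # Q, K, V projections
        self.phi_s = nn.Conv2d(c, c, 3, padding=1, groups=c)  # spatial map (stand-in)
        self.phi_c = nn.Sequential(                           # channel map (stand-in)
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(c, c, 1))
        self.gamma = nn.Conv2d(c, c, 1)                       # pointwise gate
        # CGLU tail: shared DWConv, then 1x1 convs for information/gating branches
        self.dw = nn.Conv2d(c, c, 3, padding=1, groups=c)
        self.info, self.gate = nn.Conv2d(c, c, 1), nn.Conv2d(c, c, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q, k, v = self.qkv(x).chunk(3, dim=1)
        a = self.phi_s(q) + self.phi_s(k) + self.phi_c(q) + self.phi_c(k)
        o = x + self.gamma(a) * v                              # Eq. (5) + residual
        h = self.dw(o)
        return o + self.info(h) * torch.sigmoid(self.gate(h))  # Eq. (6) + residual

print(AdditiveCGLUSketch(32)(torch.randn(1, 32, 16, 16)).shape)
```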

2.3.3. SNI-GSNeck

To preserve fine-grained texture and stabilize cross-scale alignment while improving accuracy under a real-time budget, the neck was restructured: on the upsampling side, SNI (Soft Nearest-Neighbor Interpolation; Figure 9) was adopted as a soft interpolation strategy, and on the fusion side, GSConvE-I (Enhanced GSConv, variant I) [28] (Figure 10) was utilized as a lightweight feature aggregation unit.
In scenes where elongated bamboo grain coexists with small defects, conventional nearest-neighbor upsampling copies high-level semantics to high resolution and overlays them pixelwise with shallow textures, leading to cross-layer misalignment and noise amplification. SNI applied a magnification-aware soft scaling after nearest-neighbor upsampling to gently calibrate high-level responses; it preserved shallow texture without extra parameters or latency and improved recall and boundary quality for small targets such as wormholes and fine splinters. Meanwhile, GSConvE-I performed efficient channel mixing and spatial aggregation to complete top-down and bottom-up information exchange, mitigating feature aliasing and artifacts. Under the same budget, it yielded a more stable multi-scale representation and supplied the head with cleaner, more discriminative features.
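A minimal sketch of SNI as described here (nearest-neighbor upsampling with scale s followed by a soft calibration factor) is shown below; the text does not specify the exact form of the factor, so the 1/s² scaling is our illustrative assumption.

```python
# Nearest-neighbor upsampling followed by a magnification-aware soft scaling.
import torch
import torch.nn.functional as F

def sni_upsample(x: torch.Tensor, s: int = 2) -> torch.Tensor:
    y = F.interpolate(x, scale_factor=s, mode="nearest")
    alpha = 1.0 / (s * s)  # soften responses replicated s*s times (assumed form)
    return alpha * y

print(sni_upsample(torch.randn(1, 8, 20, 20)).shape)  # torch.Size([1, 8, 40, 40])
```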
In bamboo-strip defect detection, the CBS block used in the original neck fusion was limited: channel interaction was weak and expansion of the effective receptive field was inefficient under a comparable compute budget. Therefore, the lightweight aggregation unit GSConvE-I was introduced at the neck fusion stage to organize and fuse information more effectively. The workflow is as follows: a 1 × 1 convolution first performed channel alignment and feature transformation; the features were then split into two parallel paths, an upper path that passed through to preserve the main information flow with minimal loss, and a lower path that applied a 3 × 3 standard convolution, a 3 × 3 depthwise convolution, and GELU activation in sequence to extract richer texture and context. After channel-wise concatenation, a single channel shuffle was applied to enhance cross-channel interaction and fusion. Under a comparable computation budget and real-time constraints, GSConvE-I strengthened both cross-channel communication and the effective receptive field. It better accommodated large-scale tone and shape variation in bamboo strips, such as regional consistency and boundary transitions in green bark residue, yellow pith residue, and decay, while preserving microstructural details, including splinter tip sharpness and wormhole rim integrity. Compared with the original fusion block based on CBS, this design improved discriminability and robustness at the fusion stage without sacrificing inference speed, providing the detection head with cleaner and more informative multi-scale features.
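The workflow above translates into a short PyTorch sketch; the layer widths and the two-group shuffle are our assumptions, and the block mirrors the narrated steps (1 × 1 alignment, pass-through path, 3 × 3 conv + 3 × 3 DWConv + GELU path, concatenation, channel shuffle) rather than the published GSConvE-I code.

```python
# Two-path fusion with a single channel shuffle, as narrated for GSConvE-I.
import torch
import torch.nn as nn

class GSConvEISketch(nn.Module):
    def __init__(self, c_in: int, c_out: int):
        super().__init__()
        c = c_out // 2
        self.align = nn.Conv2d(c_in, c, 1)            # channel alignment
        self.enrich = nn.Sequential(                  # texture/context path
            nn.Conv2d(c, c, 3, padding=1),
            nn.Conv2d(c, c, 3, padding=1, groups=c),  # 3x3 depthwise conv
            nn.GELU())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        t = self.align(x)
        y = torch.cat([t, self.enrich(t)], dim=1)     # pass-through + enriched
        b, c, h, w = y.shape                          # channel shuffle (2 groups)
        return y.view(b, 2, c // 2, h, w).transpose(1, 2).reshape(b, c, h, w)

print(GSConvEISketch(64, 64)(torch.randn(1, 64, 32, 32)).shape)
```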

3. Results

3.1. Experimental Environment

The experimental hardware and software configuration is summarized in Table 2, and the training hyperparameters are listed in Table 3.

3.2. Evaluation Criteria

To evaluate the proposed model, Precision (P), Recall (R), Mean Average Precision (mAP), parameter count (Params), and floating-point operations (FLOPs) were utilized as the primary metrics. The definitions are given below.
$P = \dfrac{TP}{TP + FP} \quad (7)$

$R = \dfrac{TP}{TP + FN} \quad (8)$

$AP = \displaystyle\int_0^1 P(R)\, dR \quad (9)$

$mAP = \dfrac{1}{n} \displaystyle\sum_{i=1}^{n} AP_i \quad (10)$
where TP is the number of true positives, FP is the number of false positives, FN is the number of false negatives, and n is the number of classes in the dataset. Additionally, parameter count (Params) and inference speed (FPS) were reported to provide a more comprehensive assessment of model performance.
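As a toy numeric illustration of Equations (7) and (8) (the counts below are invented for the example, not drawn from our experiments):

```python
# Invented confusion counts, purely to illustrate the P and R definitions.
tp, fp, fn = 80, 15, 20
precision = tp / (tp + fp)  # 80 / 95  = 0.842
recall = tp / (tp + fn)     # 80 / 100 = 0.800
print(f"P = {precision:.3f}, R = {recall:.3f}")
```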
FPS measured the number of image frames processed per unit time, reflecting inference throughput and real-time capability. To ensure fair comparison, all components were benchmarked on the same GPU under identical runtime settings (driver and library versions, batch size, etc.). Params denote the number of trainable parameters, while GFLOPs represents the floating-point operations per forward pass, serving as an approximation of computational cost. For deployment, models with lower Params and GFLOPs are generally preferred, as they reduce memory footprint and computational demand.
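A minimal sketch of such an FPS measurement protocol is given below (warm-up iterations, a fixed batch, and GPU synchronization before reading the clock); the model handle and input size are placeholders, and a CUDA device is assumed.

```python
# Throughput measurement: warm up, synchronize, then time n_iters forward passes.
import time
import torch

def measure_fps(model: torch.nn.Module, imgsz: int = 640,
                n_iters: int = 200, device: str = "cuda") -> float:
    model = model.eval().to(device)
    x = torch.randn(1, 3, imgsz, imgsz, device=device)
    with torch.no_grad():
        for _ in range(20):            # warm-up (kernel selection, caching)
            model(x)
        torch.cuda.synchronize()
        t0 = time.perf_counter()
        for _ in range(n_iters):
            model(x)
        torch.cuda.synchronize()       # wait for all queued GPU work to finish
    return n_iters / (time.perf_counter() - t0)
```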

3.3. Ablation Experiment

To systematically assess the contribution of each module in TASNet-YOLO, comprehensive ablation studies were conducted. All experiments were run under the same environment and hyperparameter settings to ensure fair comparison. The results are summarized in Table 4, where a check mark (✔) indicates the module is enabled.
The results are shown in Table 4: Experiment 1 is the YOLO11n baseline, and Experiment 4 is TASNet-YOLO. Introducing the self-designed TriMAD_Conv into the backbone in place of CBS raised mAP50 from 76.7% to 78.8% and reduced FLOPs, at the cost of a moderate increase in parameters. Building on this, upgrading C3k2 to C3k2_AddCGLU further improved computational efficiency and detection accuracy. Incorporating SNI and GSConvE-I in the neck delivered the best overall performance. Compared with YOLO11n, TASNet-YOLO increases mAP by 5.1 percentage points while reducing GFLOPs by 0.4, with a moderate rise in parameters, indicating higher accuracy at comparable complexity. Figure 11 presents qualitative comparisons on the test set, where different defects are highlighted with colored boxes and corresponding confidence scores.

3.4. Comparative Experiment

3.4.1. Comparative Test of Defect Detection Results

Class-wise results for the six defect types are listed in Table 5. Yellow pith residue performed best, with a precision of 94.7%, recall of 92.9%, and mAP of 96.8%. Chipped edges was the most challenging class, with a precision of 79.1%, recall of 54.5%, and mAP of 64.6%. The superior performance on yellow pith residue stemmed from its distinctive yellow patches with strong color contrast and relatively coherent boundaries, together with a larger sample size that supports learning and generalization. For wormholes, recall lagged precision because small wormholes lack distinctive shape and color cues, making them easily confusable with bamboo grain and mild decay and leading to missed detections and cross-class errors.

3.4.2. Comparative Test of Different Models

To compare TASNet-YOLO with mainstream YOLO detectors on bamboo-strip surface defects, YOLOv5, YOLOv6, YOLOv8, YOLOv10, YOLO11, and YOLO12 [29] were evaluated under identical settings. As shown in Table 6, TASNet-YOLO achieved 84.8% precision, 73.3% recall, and 81.8% mAP. Its mAP exceeded the six counterparts by 7.5, 5.9, 6.7, 5.8, 5.1, and 6.2 percentage points, respectively, indicating higher accuracy and reliability. The model also maintained a small memory footprint and strong efficiency, reflecting effective use of compute. Its inference speed reached 106.4 FPS, the best among the non-baseline models, meeting real-time requirements for production lines operating at roughly 2 m/s. Overall, TASNet-YOLO offered a well-balanced trade-off among accuracy, resource usage, and throughput for bamboo-strip surface defect inspection.

3.5. Visual Analysis

Built on YOLO11n, the proposed method delivered a substantial improvement in bamboo-strip surface defect detection. Figure 11 shows qualitative comparisons between YOLO11n and our model. The baseline exhibited missed detections and imprecise localization, struggled with small targets, and degraded under stacked or partially occluded instances. The proposed model recovered more targets in these cases, demonstrating higher accuracy and stronger robustness.

4. Discussion

Amid global timber scarcity and mounting environmental pressures, bamboo—fast-growing, renewable, and non-wood—has become strategically important. Bamboo strips are the basic units of bamboo laminated lumber; their surface quality governs product grade and application range. Surface defects markedly degrade mechanical performance and bonding, weakening overall strength, stability, and durability, and thus limiting bamboo’s role in “bamboo-for-plastic” and “bamboo-for-wood” initiatives. As sustainability agendas advance, bamboo’s strong carbon sequestration and benign degradation make it a compelling substitute; however, uncontrolled strip-surface defects hinder quality improvement and scale-up, waste resources, and erode bamboo’s position in green supply chains. Consequently, improving strip surface quality and developing high-accuracy defect inspection are essential for high-value utilization of bamboo, alleviating timber shortages, and reducing plastic pollution [30].
A systematic evaluation of bamboo-strip surface defects in an industrial setting was conducted in this study. Visual comparisons showed that the improved model was more stable on small targets and in complex textures: boundaries for wormholes, fine splinters, and chipped edges were more continuous; boxes adhered to edges more consistently; and false or missed detections decreased under strong periodic grain, glare, and local contamination (Figure 11). These observations align with the quantitative gains, indicating better localization and higher recall on low-contrast targets. These gains can be traced to the design: TriMAD_Conv enriches local texture and long-range context through a lightweight multi-branch design with dilated receptive fields, helping preserve fine defects that might otherwise be lost during down-sampling. C3k2_AddCGLU performs channel-wise gated recalibration, suppressing non-discriminative responses over large homogeneous grain and uneven illumination while emphasizing truly anomalous details. In the neck, SNI softens nearest-neighbor upsampling to curb cross-layer misalignment and jagged artifacts, preserving shallow texture, while GSConvE-I strengthens cross-channel communication and the effective receptive field via channel shuffle and lightweight aggregation, improving multi-scale fusion without additional compute. Together, these components deliver a better balance among accuracy, computation, and latency.
Nevertheless, there remains room for improvement. Although the overall parameter count is comparable to YOLO11n, computation and latency can be further reduced. By integrating pruning, distillation, and quantization without sacrificing accuracy, a more favorable balance among accuracy, speed, and model size can be achieved, thereby meeting stricter real-time requirements on edge deployments. Meanwhile, it is necessary to increase the proportion of hard examples in the dataset. For categories such as splinters, decay, and chipped edges, which exhibit relatively low per-class detection accuracy, targeted data augmentation can partially compensate for biases caused by sample scarcity and thereby improve the accuracy and generalization of these minority defect classes. Future application-oriented work will focus on broadening the method's applicability. Specifically, we will investigate domain adaptation on unlabeled target production lines to mitigate cross-plant and cross-season distribution shifts, pursue transfer learning across bamboo species to accommodate interspecies differences in texture and epidermal structure, and extend the model to other natural materials (wood, rattan, fiber composites) and to edge-device deployments, while preserving real-time performance to ensure deployability and operational stability.

5. Conclusions

Rough-planed bamboo strips present several detection challenges. Wormholes and small patches of green bark residue or decay are difficult to detect; splinters and chipped edges are easily masked by the periodic grain; elongated defects are often fragmented into multiple boxes during detection. To address these issues, we propose TASNet-YOLO, an improved detector built on YOLO11n. By incorporating TriMAD_Conv, C3k2_AddCGLU, and SNI-GSNeck, the model enhances sensitivity to tiny defects, improves feature discrimination under complex textures, and produces more coherent results for long, along-grain defects. On our in-house dataset it achieves 81.8% mAP and 106.4 FPS, striking a solid balance between accuracy and efficiency and demonstrating strong practical potential. The study also validates, at a methodological level, the effectiveness of multi-branch design, gated mechanisms, and lightweight convolutional fusion for bamboo-defect detection, offering a transferable recipe for detecting weak targets in textured backgrounds. Remaining gaps include recognition of chipped edges and generalization under extreme conditions; future work will focus on enlarging difficult classes and refining annotation protocols to advance deployment in industrial settings.

Author Contributions

Conceptualization, Y.Z.; methodology, Y.Z., R.G. and M.J.; software, Y.Z. and M.J.; validation, W.Z.; formal analysis, M.J.; investigation, W.Y.; resources, X.W.; data curation, Y.Z.; writing—original draft, Y.Z.; writing—review and editing, R.G. and M.J.; visualization, Y.Z.; supervision, M.J.; project administration, R.G.; funding acquisition, R.G. All authors have read and agreed to the published version of the manuscript.

Funding

Fujian Provincial Science and Technology Major Project: Key technology and equipment for continuous and intelligent production of glued laminated bamboo (Grant No. 2024HZ026011); Fujian Province Forestry Science and Technology Project: "Research and Development of Key Technologies and Equipment for Automated Precision Planing of Bamboo Strips and Raw Material Loading" (Grant No. 2025FKJ37); and Nanping City Science and Technology Major Project: "Research and Development of Continuous Intelligent Processing Equipment System of Glued Laminated Bamboo" (Grant No. N2023A003).

Data Availability Statement

The datasets generated during and/or analyzed during the current study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Abbreviations

The following abbreviations are used in this manuscript:
TASNet-YOLO: An efficient bamboo-strip surface defect detection model based on YOLO
TriMAD_Conv: Tri-branch Multi-scale Attention with Dilated Convolution
SPPF: Spatial Pyramid Pooling Fast
SNI: Soft Nearest-Neighbor Interpolation
GSConvE-I: Enhanced GSConv, variant I
FPS: Frames per second
IoU: Intersection over union
mAP: Mean Average Precision
mAP50: mAP computed at IoU = 0.50 only
mAP50-95: mAP averaged over IoU thresholds 0.50:0.05:0.95
P: Precision
R: Recall
GFLOPs: Giga floating-point operations

References

  1. Lu, Q.F. Research on Bamboo Strip Defects Recognition Technology Based on Computer Vision. Master’s Thesis, Fujian Agriculture and Forestry University, Fuzhou, China, 2019. [Google Scholar]
  2. GB/T 36394-2018; Bamboo Products—Terminology. Standardization Administration of China (SAC): Beijing, China, 2018.
  3. Fujian Daily. Intelligent Sorting, “Industrial Eyes” Target Bamboo Processing—Fuzhou University News Network. Available online: https://news.fzu.edu.cn/info/1014/4017.htm (accessed on 5 August 2025).
  4. International Bamboo and Rattan Organization (INBAR). Trade Overview 2022: Bamboo and Rattan Commodities in the International Market; INBAR: Beijing, China, 2024; Available online: https://www.inbar.int/ (accessed on 5 August 2025).
  5. Grand View Research. Bamboos Market Size, Share, Growth Report, 2030; Grand View Research: San Francisco, CA, USA, 2024; Available online: https://www.grandviewresearch.com/industry-analysis/bamboos-market (accessed on 5 August 2025).
  6. Yang, S.N.; Cao, L.J.; Yang, Y.; Guo, C.D. Review of PCB defect detection algorithms based on machine vision. J. Front. Comput. Sci. Technol. 2025, 19, 901–915. [Google Scholar] [CrossRef]
  7. Chen, Y.; Ding, Y.; Zhao, F.; Zhang, E.; Wu, Z.; Shao, L. Surface defect detection methods for industrial products: A review. Appl. Sci. 2021, 11, 7657. [Google Scholar] [CrossRef]
  8. Zeng, C.H.; Chen, H.; Ding, Y.C.; Gao, Y. Study on bamboo classification method based on color and grain features. For. Mach. Woodwork. Equip. 2010, 38, 37–39. [Google Scholar]
  9. Kuang, H.; Ding, Y.; Li, R.; Liu, X. Defect detection of bamboo strips based on LBP and GLCM features by using SVM classifier. In Proceedings of the 2018 Chinese Control and Decision Conference (CCDC), Shenyang, China, 9–11 June 2018; pp. 3341–3345. [Google Scholar] [CrossRef]
  10. Tout, K.; Meguenani, A.; Urban, J.-P.; Cudel, C. Automated vision system for magnetic particle inspection of crankshafts using convolutional neural networks. Int. J. Adv. Manuf. Technol. 2021, 112, 3307–3326. [Google Scholar] [CrossRef]
  11. Dung, C.V.; Anh, L.D. Autonomous Concrete Crack Detection Using Deep Fully Convolutional Neural Network. Autom. Constr. 2019, 99, 52–58. [Google Scholar] [CrossRef]
  12. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the Advances in Neural Information Processing Systems 28 (NeurIPS 2015), Montréal, QC, Canada, 7–12 December 2015; pp. 91–99. Available online: https://papers.nips.cc/paper_files/paper/2015/file/14bfa6bb14875e45bba028a21ed38046-Paper.pdf (accessed on 24 July 2025).
  13. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar] [CrossRef]
  14. Redmon, J.; Farhadi, A. YOLOv3: An incremental improvement. arXiv 2018, arXiv:1804.02767. Available online: https://arxiv.org/abs/1804.02767 (accessed on 20 July 2025). [CrossRef]
  15. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In European Conference on Computer Vision (ECCV 2016); Springer: Cham, Switzerland, 2016; pp. 21–37. [Google Scholar] [CrossRef]
  16. Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 318–327. [Google Scholar] [CrossRef] [PubMed]
  17. Ji, M.; Zhang, W.; Han, J.-K.; Miao, H.; Diao, X.-L.; Wang, G.-F. A Deep Learning-Based Algorithm for Online Detection of Small Target Defects in Large-Size Sawn Timber. Ind. Crops Prod. 2024, 222, 119671. [Google Scholar] [CrossRef]
  18. Yang, R.X.; Lee, Y.R.; Lee, F.S.; Liang, Z.Y.; Liu, Y. An improved YOLOv5 algorithm for bamboo strip defect detection based on the Ghost module. Forests 2024, 15, 1480. [Google Scholar] [CrossRef]
  19. Yang, R.X.; Lee, Y.R.; Lee, F.S.; Liang, Z.Y.; Chen, N.H.; Liu, Y. Improvement of YOLO detection strategy for detailed defects in bamboo strips. Forests 2025, 16, 595. [Google Scholar] [CrossRef]
  20. Guo, Y.J.; Zeng, Y.X.; Gao, F.Q.; Qiu, Y.; Zhou, X.; Zhong, L. Improved YOLOv4-CSP algorithm for detection of bamboo surface sliver defects with extreme aspect ratio. IEEE Access 2022, 10, 29810–29820. [Google Scholar] [CrossRef]
  21. Ji, M.; Zhang, W.; Diao, X.; Wang, G.; Miao, H. Intelligent Automation Manufacturing for Betula Solid Timber Based on Machine Vision Detection and Optimization Grading System Applied to Building Materials. Forests 2023, 14, 1510. [Google Scholar] [CrossRef]
  22. Ji, M.; Zhang, W.; Wang, G.; Wang, Y.; Miao, H. Online Measurement of Outline Size for Pinus densiflora Dimension Lumber: Maximizing Lumber Recovery by Minimizing Enclosure Rectangle Fitting Area. Forests 2022, 13, 1627. [Google Scholar] [CrossRef]
  23. Chollet, F. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 1251–1258. [Google Scholar] [CrossRef]
  24. Chen, L.-C.; Papandreou, G.; Schroff, F.; Adam, H. Rethinking atrous convolution for semantic image segmentation. arXiv 2017, arXiv:1706.05587. Available online: https://arxiv.org/abs/1706.05587 (accessed on 25 July 2025). [CrossRef]
  25. Dauphin, Y.N.; Fan, A.; Auli, M.; Grangier, D. Language modeling with gated convolutional networks. In Proceedings of the 34th International Conference on Machine Learning (ICML 2017), Sydney, Australia, 6–11 August 2017; Volume 70, pp. 933–941. Available online: https://proceedings.mlr.press/v70/dauphin17a.html (accessed on 25 July 2025).
  26. Liu, C.; Zhen, J.; Shan, W. Time series classification based on convolutional network with a gated linear units kernel. Eng. Appl. Artif. Intell. 2023, 123, 106296. [Google Scholar] [CrossRef]
  27. Tolstikhin, I.; Houlsby, N.; Kolesnikov, A.; Beyer, L.; Zhai, X.; Unterthiner, T.; Yung, J.; Steiner, A.; Keysers, D.; Uszkoreit, J.; et al. MLP-Mixer: An all-MLP architecture for vision. arXiv 2021, arXiv:2105.01601. [Google Scholar] [CrossRef]
  28. Li, H. Rethinking Features-Fused-Pyramid-Neck for Object Detection. In Computer Vision—ECCV 2024; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2024; pp. 74–90. [Google Scholar] [CrossRef]
  29. Tian, Y.; Ye, Q.; Doermann, D. YOLOv12: Attention-Centric Real-Time Object Detectors. arXiv 2025, arXiv:2502.12524. Available online: https://arxiv.org/abs/2502.12524 (accessed on 5 August 2025).
  30. Mousavi, M.; Gandomi, A.H. Wood Hole-Damage Detection and Classification via Contact Ultrasonic Testing. Constr. Build. Mater. 2021, 307, 124999. [Google Scholar] [CrossRef]
Figure 1. Line-scan image acquisition system.
Figure 2. Typical defect images of bamboo strips.
Figure 3. Structure of the YOLO11 framework.
Figure 4. TASNet-YOLO structure diagram.
Figure 5. Structure of the TriMAD_Conv module. A three-path block combining (i) local multi-scale DWConv, (ii) lightweight channel attention, and (iii) a dilated 3 × 3 branch to inject mid-range context. It raises the visibility of tiny or low-contrast defects without inflating compute, serving as a drop-in backbone unit.
Figure 6. AdditiveBlock and CGLU module structure diagram. AdditiveBlock: builds spatial or channel similarity maps via additive token interaction to modulate a value branch, strengthening weak anomalies and suppressing repetitive grain responses under modest compute. CGLU: a convolutional gated linear unit that decomposes features into an information branch and a gating branch, with the gate (Sigmoid) element-wise modulating the information branch. Alone, AdditiveBlock provides global–local interaction; CGLU introduces an inductive, geometry-aware gate that is sensitive to edges and tiny rims (e.g., wormhole openings). Role in our design: these are the two building blocks combined in Figure 7 and embedded into C3k2 in Figure 8.
Figure 7. AdditiveBlock_CGLU module. Integration aligns channels; CATM provides spatial and channel gating; CGLU applies convolutional gating to preserve edges and elongated continuity.
Figure 8. C3k2_AddCGLU module. Input is aligned by a 1 × 1 conv, split into two branches, processed by n AdditiveBlock_CGLU units, concatenated, and projected with a 1 × 1 conv to give the output.
Figure 9. Structure of the SNI Module. Nearest-neighbor upsampling with scale s, followed by a soft calibration factor α; the calibrated feature is the output Y.
Figure 10. Structure of the GSConvE-I Module. A lightweight two-path fusion block (identity path and enrichment path) using linear fusion, channel concatenation, and shuffle (with GELU) to improve cross-scale aggregation and edge adherence in the neck at low cost.
Figure 11. Comparison of YOLO11n and TASNet-YOLO detection results. (a) The original label image of the test image; (b) the detection results of the baseline model for this batch of images; (c) the detection results of the TASNet-YOLO model for this batch of images.
Table 1. Dataset Categories and Annotation Quantities.

Defect Type            Train   Validation   Test   Total
Green bark residue     1365    170          172    1707
Splinter               1764    220          222    2206
Wormhole               598     74           76     748
Decay                  1402    175          176    1753
Yellow pith residue    1196    149          150    1495
Chipped edge           428     53           55     536
Total                  6753    841          851    8445
Table 2. Hardware and software configuration.

Configuration      Version
Operating system   Windows 11
Python             3.9.7
PyTorch            2.7.1
GPU                RTX 5070
CPU                i5-14600KF
CUDA               12.8
Table 3. Training hyperparameters.

Parameter   Configuration
batch       32
epochs      200
imgsz       640
lr0         0.01
workers     8
optimizer   SGD
Table 4. Results of ablation experiment.

Basic   TriMAD_Conv   C3k2_AddCGLU   SNI-GSNeck   mAP50/%   mAP50-95/%   Params/M   FLOPs/G
✔       –             –              –            76.7      44.9         2.62       6.6
✔       ✔             –              –            78.8      44.6         3.93       6.3
✔       ✔             ✔              –            79.8      50.4         3.87       6.3
✔       ✔             ✔              ✔            81.8      50.0         3.82       6.2
Table 5. Comparison of detection results of different defects by TASNet-YOLO model.

Defect Name           P/%    R/%    mAP50/%
Green bark residue    86.3   82.4   88.1
Splinter              82.8   70.8   82.8
Wormhole              85.5   70.2   80.5
Decay                 80.6   69.1   77.7
Yellow pith residue   94.7   92.9   96.8
Chipped edge          79.1   54.5   64.6
Table 6. Comparison of different models.

Models        mAP50/%   mAP50-95/%   P/%    R/%    FPS     Params/M   GFLOPs
YOLOv5        74.3      44.5         72.9   65.7   97.1    2.50       7.1
YOLOv6        75.9      45.0         76.1   69.7   86.3    4.16       11.5
YOLOv8        75.1      44.4         74.9   67.1   97.8    2.69       8.9
YOLOv10       76.0      43.8         78.2   68.8   89.2    2.70       8.2
YOLO11        76.7      44.9         77.8   69.1   172.0   2.62       6.6
YOLO12        75.6      45.6         71.4   72.3   91.4    2.60       6.7
TASNet-YOLO   81.8      50.0         84.8   73.3   106.4   3.82       6.2
