# ODCA-YOLO: An Omni-Dynamic Convolution Coordinate Attention-Based YOLO for Wood Defect Detection


## Abstract


## 1. Introduction

- Introducing ODCA, a novel attention mechanism that enhances the network's capability to detect small targets, thereby improving feature representation within the network.
- An omni-dimensional dynamic convolution coordinate attention-based YOLO model (ODCA-YOLO) for wood defect detection is proposed.
- Designing an efficient feature extraction network block (S-HorBlock) specifically for ODCA-YOLO. S-HorBlock enhances the network's learning capacity and improves its ability to extract diverse types of wood defect features.

## 2. Methodology

#### 2.1. Omni-Dimensional Dynamic Coordinate Attention

#### 2.1.1. Review of Omni-Dimensional Dynamic Convolution

The overhead of computing the attention weights ${\pi}_{k}(x)$ and aggregating the kernels is negligible compared to the convolution itself. Dynamic convolution thus finds a better compromise between network performance and computational load, and improves the expressiveness of the model by fusing multiple convolution kernels and aggregating them in a nonlinear manner via attention. In Figure 1, "+" denotes a linear combination of n convolution kernels, Softmax stands for the Softmax activation function, GAP stands for global average pooling, FC stands for fully connected, and ReLU stands for the ReLU activation function. Here, "*" denotes the convolution operation, ${W}_{i}$ stands for the i-th convolution kernel, and ${\alpha}_{wi}$ represents the attention scalar associated with the convolution kernel ${W}_{i}$.

**U** is the output and can be written as $U=[{u}_{1},{u}_{2},\cdots ,{u}_{C}]$. ${F}_{tr}(\cdot)$ is a primary mapping and can be taken as a convolution operator. ${v}_{i}$ ($i=1,2,\cdots ,C$) refers to the parameters of the i-th filter, and ${v}_{i}=[{v}_{i}^{1},{v}_{i}^{2},\cdots ,{v}_{i}^{{C}^{\prime}}]$. $X=[{x}^{1},{x}^{2},\dots ,{x}^{{C}^{\prime}}]$ is the feature map of the defects. ${v}_{i}^{s}$ ($s=1,2,\cdots ,{C}^{\prime}$) is a 2D spatial kernel representing a single channel of ${v}_{i}$ acting on the corresponding channel of **X**. Here, * denotes the convolution operation.

${z}_{c}$ is the c-th element of $z$, where $z\in {\mathbb{R}}^{C}$. The excitation step produces **s**, which represents the weight ratio of each channel: ${L}_{1}$ and ${L}_{2}$ refer to the two fully connected layers, while δ and σ denote the ReLU and sigmoid activation functions, respectively.

In dynamic convolution, **s** is defined as ${\pi}_{k}$, which represents the attention weight of the k-th convolution kernel. Unlike before, the attention weight is now applied not to individual channels but to the entire convolution kernel.
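The squeeze-and-excitation computation reviewed above (global average pooling, two fully connected layers, ReLU, and sigmoid) can be sketched as follows. This is an illustrative NumPy sketch, not the authors' code; the channel count, reduction ratio, and random weights are assumptions for demonstration.

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def se_attention(U, L1, L2):
    """Squeeze-and-excitation on a (C, H, W) feature map U.

    Squeeze: global average pooling gives z in R^C.
    Excitation: s = sigmoid(L2 @ relu(L1 @ z)), one weight per channel in (0, 1).
    """
    z = U.mean(axis=(1, 2))            # z_c = (1 / (H * W)) * sum_ij u_c(i, j)
    s = sigmoid(L2 @ relu(L1 @ z))     # two fully connected layers L1, L2
    return s

rng = np.random.default_rng(0)
C, r = 8, 2                             # channels and reduction ratio (assumed values)
U = rng.standard_normal((C, 16, 16))
L1 = rng.standard_normal((C // r, C))   # reduction FC (C -> C/r)
L2 = rng.standard_normal((C, C // r))   # expansion FC (C/r -> C)
s = se_attention(U, L1, L2)
print(s.shape)  # (8,): one attention weight per channel
```

In dynamic convolution the same branch is reused, except the sigmoid is replaced by a Softmax over the n kernels so that the output weights the kernels rather than the channels.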

The attention weights ${\pi}_{i}(x)$ and the convolution kernels ${W}_{i}$ are input to the multi-dimensional multiplication operation (MDMul) module (as shown in Figure 2) together with **X** for convolution operations in each dimension. Given n convolution kernels, the corresponding kernel space has four dimensions: the spatial kernel size k × k, the input channel number ${c}_{in}$ and the output channel number ${c}_{out}$ of each convolution kernel, and the number of convolution kernels n. This comprehensive representation characterizes ODConv, which can be described as:

$Y=({\alpha}_{w1}\odot {\alpha}_{f1}\odot {\alpha}_{c1}\odot {\alpha}_{s1}\odot {W}_{1}+\dots +{\alpha}_{wn}\odot {\alpha}_{fn}\odot {\alpha}_{cn}\odot {\alpha}_{sn}\odot {W}_{n})\ast X$

The ODConv model builds upon dynamic convolution by introducing three novel attention scalars in addition to the kernel-wise scalar ${\alpha}_{wi}$, computed along different dimensions within the kernel space of ${W}_{i}$: the spatial dimension ${\alpha}_{si}\in {\mathbb{R}}^{k\times k}$, the input channel dimension ${\alpha}_{ci}\in {\mathbb{R}}^{{c}_{in}}$, and the output channel dimension ${\alpha}_{fi}\in {\mathbb{R}}^{{c}_{out}}$. The symbol ⊙ denotes multiplication operations performed across these various dimensions. To calculate ${\alpha}_{wi}$, ${\alpha}_{fi}$, ${\alpha}_{ci}$, and ${\alpha}_{si}$, the multi-head attention module ${\pi}_{i}(x)$ is employed as a crucial component of the process.

Algorithm 1: ODConv

Input: $X\in {\mathbb{R}}^{H\times W\times C}$; Output: $Y\in {\mathbb{R}}^{H\times W\times C}$
Step 1: $x\leftarrow \mathrm{Input}$
Step 2: ${u}_{i}\leftarrow {v}_{i}\ast X={\sum}_{s=1}^{{C}^{\prime}}{v}_{i}^{s}\ast {x}^{s}$
Step 3: ${z}_{c}\leftarrow {F}_{sq}({u}_{c})=\frac{1}{H\times W}{\sum}_{i=1}^{H}{\sum}_{j=1}^{W}{u}_{c}(i,j)$
Step 4: $z\leftarrow [{z}_{1},{z}_{2},\dots ,{z}_{C}]$
Step 5: $s\leftarrow \sigma ({L}_{2}\,\delta ({L}_{1}z))$, with $\delta =\mathrm{ReLU}$, $\sigma =\mathrm{sigmoid}$
Step 6: ${\pi}_{i}(x)\leftarrow s$
Step 7: ${\alpha}_{w1},{\alpha}_{f1},{\alpha}_{c1},{\alpha}_{s1},\dots ,{\alpha}_{wn},{\alpha}_{fn},{\alpha}_{cn},{\alpha}_{sn}\leftarrow {\pi}_{i}(x)$
Step 8: $Y\leftarrow ({\alpha}_{w1}\odot {\alpha}_{f1}\odot {\alpha}_{c1}\odot {\alpha}_{s1}\odot {W}_{1}+\dots +{\alpha}_{wn}\odot {\alpha}_{fn}\odot {\alpha}_{cn}\odot {\alpha}_{sn}\odot {W}_{n})\ast X$
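The kernel aggregation of Steps 7–8 — four attention factors modulating each candidate kernel before the kernels are summed into one — can be sketched as follows. This is an illustrative NumPy sketch, not the authors' implementation; the tensor shapes and random attention values are assumptions, and the attention branch that produces them (Steps 1–6) is omitted.

```python
import numpy as np

def odconv_aggregate(W, a_w, a_f, a_c, a_s):
    """Aggregate n kernels as in Algorithm 1, Step 8 (illustrative sketch).

    W:   (n, c_out, c_in, k, k) candidate kernels
    a_w: (n,)           scalar attention per whole kernel
    a_f: (n, c_out)     attention per output channel
    a_c: (n, c_in)      attention per input channel
    a_s: (n, k, k)      attention per spatial position
    Returns one (c_out, c_in, k, k) kernel:
        sum_i  a_wi ⊙ a_fi ⊙ a_ci ⊙ a_si ⊙ W_i
    """
    n, c_out, c_in, k, _ = W.shape
    agg = np.zeros((c_out, c_in, k, k))
    for i in range(n):
        # broadcast each attention factor over the remaining kernel dimensions
        agg += (a_w[i]
                * a_f[i][:, None, None, None]
                * a_c[i][None, :, None, None]
                * a_s[i][None, None, :, :]
                * W[i])
    return agg

rng = np.random.default_rng(1)
n, c_out, c_in, k = 4, 2, 3, 3
W = rng.standard_normal((n, c_out, c_in, k, k))
agg = odconv_aggregate(W,
                       rng.random(n), rng.random((n, c_out)),
                       rng.random((n, c_in)), rng.random((n, k, k)))
print(agg.shape)  # (2, 3, 3, 3): a single aggregated kernel, then convolved with X
```

Because the aggregation collapses the n kernels into one before the convolution, the runtime cost stays close to that of a plain convolution, which is the compromise the text describes.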

#### 2.1.2. Design of ODCA

Given the input feature map **X**, a global average pooling operation is performed along both the x-direction and the y-direction of the feature map, yielding two feature vectors of dimensions C × H × 1 and C × 1 × W, respectively. We have

${z}_{c}^{h}(h)=\frac{1}{W}{\sum}_{0\le i<W}{x}_{c}(h,i),\qquad {z}_{c}^{w}(w)=\frac{1}{H}{\sum}_{0\le j<H}{x}_{c}(j,w)$

**f** is the intermediate feature map that encodes spatial information in both the x-direction and the y-direction. It is obtained by the concatenation operation [·, ·] along the spatial dimension, followed by the nonlinear activation function δ and the 1 × 1 convolution ${F}_{1}$. The hyperparameter r is the reduction ratio controlling the block size, as in the SE block; it reduces the number of channels and thereby the model complexity.

The resulting weights are assigned to ${\pi}_{i}(x)$ and ${\pi}_{i}(y)$, respectively. These values represent the attention weights for the x-direction and y-direction, for the i-th convolution kernel and its four dimensions. In other words, ${g}^{h}$ and ${g}^{w}$ signify the attention weights allocated to each dimension within the convolution kernel for the x-direction and y-direction, respectively.

${\pi}_{i}(x)$ and ${\pi}_{i}(y)$ are input to the MDMul module of ODConv together with **X** to perform the convolution operation in each dimension, and the output **Y** is

$Y=({\alpha}_{hw1}\odot {\alpha}_{hf1}\odot {\alpha}_{hc1}\odot {\alpha}_{hs1}\odot {W}_{1}+\dots +{\alpha}_{hwn}\odot {\alpha}_{hfn}\odot {\alpha}_{hcn}\odot {\alpha}_{hsn}\odot {W}_{n})\ast X\times ({\alpha}_{ww1}\odot {\alpha}_{wf1}\odot {\alpha}_{wc1}\odot {\alpha}_{ws1}\odot {W}_{1}+\dots +{\alpha}_{wwn}\odot {\alpha}_{wfn}\odot {\alpha}_{wcn}\odot {\alpha}_{wsn}\odot {W}_{n})\ast X$

${\alpha}_{hwi}$ denotes the attention scalar assigned to the entire convolution kernel in the horizontal attention direction. ${\alpha}_{hfi}$ denotes the attention scalars assigned to the ${c}_{out}$ channels of each convolution kernel. ${\alpha}_{hci}$ denotes the attention scalars assigned to the ${c}_{in}$ channels of each convolution kernel. ${\alpha}_{hsi}$ denotes the attention scalars assigned to the spatial locations of each k × k convolution kernel. These are obtained from ${\pi}_{i}(x)$ in the horizontal attention direction. Correspondingly, ${\alpha}_{wwi}$ denotes the attention scalar assigned to the entire convolution kernel in the vertical attention direction, ${\alpha}_{wfi}$ the scalars assigned to the ${c}_{out}$ channels, ${\alpha}_{wci}$ the scalars assigned to the ${c}_{in}$ channels, and ${\alpha}_{wsi}$ the scalars assigned to the spatial locations of each k × k kernel. These are obtained from ${\pi}_{i}(y)$ in the vertical attention direction.

Algorithm 2: ODCA

Input: $X\in {\mathbb{R}}^{H\times W\times C}$; Output: $Y\in {\mathbb{R}}^{H\times W\times C}$
Step 1: $X\leftarrow \mathrm{Input}$
Step 2: ${z}_{c}^{h}(h)\leftarrow \frac{1}{W}{\sum}_{0\le i<W}{x}_{c}(h,i)$, ${z}_{c}^{w}(w)\leftarrow \frac{1}{H}{\sum}_{0\le j<H}{x}_{c}(j,w)$
Step 3: $f\leftarrow \delta ({F}_{1}[{z}^{h},{z}^{w}])$
Step 4: $[{f}^{h},{f}^{w}]\leftarrow f$
Step 5: ${g}^{h}\leftarrow \sigma ({F}_{h}({f}^{h}))$, ${g}^{w}\leftarrow \sigma ({F}_{w}({f}^{w}))$
Step 6: ${\pi}_{i}(x)\leftarrow {g}^{h}$, ${\pi}_{i}(y)\leftarrow {g}^{w}$
Step 7: ${\alpha}_{hw1},{\alpha}_{hf1},{\alpha}_{hc1},{\alpha}_{hs1},\dots ,{\alpha}_{hwn},{\alpha}_{hfn},{\alpha}_{hcn},{\alpha}_{hsn}\leftarrow {\pi}_{i}(x)$; ${\alpha}_{ww1},{\alpha}_{wf1},{\alpha}_{wc1},{\alpha}_{ws1},\dots ,{\alpha}_{wwn},{\alpha}_{wfn},{\alpha}_{wcn},{\alpha}_{wsn}\leftarrow {\pi}_{i}(y)$
Step 8: $Y\leftarrow ({\alpha}_{hw1}\odot {\alpha}_{hf1}\odot {\alpha}_{hc1}\odot {\alpha}_{hs1}\odot {W}_{1}+\dots +{\alpha}_{hwn}\odot {\alpha}_{hfn}\odot {\alpha}_{hcn}\odot {\alpha}_{hsn}\odot {W}_{n})\ast X\times ({\alpha}_{ww1}\odot {\alpha}_{wf1}\odot {\alpha}_{wc1}\odot {\alpha}_{ws1}\odot {W}_{1}+\dots +{\alpha}_{wwn}\odot {\alpha}_{wfn}\odot {\alpha}_{wcn}\odot {\alpha}_{wsn}\odot {W}_{n})\ast X$
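The direction-aware pooling of Steps 1–2 — the piece that distinguishes coordinate attention from plain global average pooling — can be sketched as follows. This is an illustrative NumPy sketch with assumed shapes, not the authors' code; the remaining steps are summarized in comments.

```python
import numpy as np

def coordinate_pool(X):
    """Steps 1-2 of Algorithm 2: direction-aware global average pooling.

    X: (C, H, W) feature map.
    Returns z_h of shape (C, H) (pooled over the width) and
            z_w of shape (C, W) (pooled over the height),
    i.e. the C x H x 1 and C x 1 x W vectors described in the text.
    """
    z_h = X.mean(axis=2)  # z_c^h(h) = (1/W) * sum_i x_c(h, i)
    z_w = X.mean(axis=1)  # z_c^w(w) = (1/H) * sum_j x_c(j, w)
    return z_h, z_w

rng = np.random.default_rng(2)
C, H, W = 4, 6, 5
X = rng.standard_normal((C, H, W))
z_h, z_w = coordinate_pool(X)
print(z_h.shape, z_w.shape)  # (4, 6) (4, 5)
# Steps 3-6 then concatenate [z_h, z_w] along the spatial axis, apply the 1x1
# convolution F_1 and nonlinearity, split back into f_h and f_w, and obtain the
# direction-wise weights g_h and g_w via sigmoid, which feed pi_i(x) and pi_i(y).
```

Unlike a single global average, each pooled vector keeps the position along one axis, which is what lets the attention localize small defects.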

#### 2.2. S-HorBlock Module

The HorNet network is built around the recursive gated convolution ($g^{n}$Conv), whose structure is shown in Figure 5c. $g^{n}$Conv achieves spatial interaction of arbitrary order, which improves the modeling ability and high-density prediction performance of the network model. Following the same architecture as ViT [33] and Swin Transformer [32], the HorNet network [34] is built; as shown in Figure 5b, it contains a spatial mixing layer (HorBlock) and a feedforward network (FFN). Nonetheless, the HorBlock structure is intricate, and its repeated use slows down the model's inference. To strike a balance between accuracy and inference speed, we adopt the ShuffleNetv2 network structure as the overarching framework of S-HorBlock. Within the proposed ODCA-YOLO model, we apply the S-HorBlock module at the initial and final stages of the backbone, and replace the Efficient Layer Aggregation Networks (ELAN) module in the head with the S-HorBlock module. This adaptation ensures both accuracy and improved inference speed.
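The core idea of the recursive gated convolution — spatially mixed features repeatedly gating the running output, so each pass raises the interaction order — can be sketched minimally. This is an illustrative sketch under stated simplifications (a 3 × 3 box filter stands in for the depthwise convolution, and HorNet's per-order channel splits and projections are omitted); it is not the HorNet implementation.

```python
import numpy as np

def local_mix(x):
    """Stand-in for a depthwise convolution: per-channel 3x3 box filter
    with edge padding (an assumption to keep the sketch dependency-free)."""
    p = np.pad(x, ((0, 0), (1, 1), (1, 1)), mode="edge")
    return sum(p[:, dy:dy + x.shape[1], dx:dx + x.shape[2]]
               for dy in range(3) for dx in range(3)) / 9.0

def gnconv_sketch(X, order=3):
    """Recursive gating: at each step the spatially mixed features multiply
    the running output elementwise, accumulating `order`-th order spatial
    interactions between positions."""
    out = X
    for _ in range(order):
        out = out * local_mix(X)  # elementwise gating with mixed context
    return out

rng = np.random.default_rng(3)
X = rng.standard_normal((2, 8, 8))
Y = gnconv_sketch(X, order=3)
print(Y.shape)  # (2, 8, 8): same shape, higher-order spatial interactions
```

Each multiplication couples a position with its neighborhood once more, which is why stacking the gating n times yields n-th order spatial interaction without attention-style quadratic cost.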

#### 2.3. The Proposed ODCA-YOLO

## 3. Experiment and Results

#### 3.1. Experimental Details and Dataset

#### 3.2. Performance Evaluation

#### 3.3. Ablation Experiments

#### 3.4. Comparisons with Other Methods and Experiments

## 4. Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest

## References

- Longuetaud, F.; Mothe, F.; Kerautret, B.; Krähenbühl, A.; Hory, L.; Leban, J.M.; Debled-Rennesson, I. Automatic knot detection and measurements from X-ray CT images of wood: A review and validation of an improved algorithm on softwood samples. Comput. Electron. Agric.
**2012**, 85, 77–89. [Google Scholar] [CrossRef] - Chen, Y.; Sun, C.; Ren, Z.; Na, B. Review of the current state of application of wood defect recognition technology. BioResources
**2022**, 18, 2288–2302. [Google Scholar] [CrossRef] - Deflorio, G.; Fink, S.; Schwarze, F.W.M.R. Detection of incipient decay in tree stems with sonic tomography after wounding and fungal inoculation. Wood Sci. Technol.
**2008**, 42, 117–132. [Google Scholar] [CrossRef] - Fang, Y.; Lin, L.; Feng, H.; Lu, Z.; Emms, G.W. Review of the use of air-coupled ultrasonic technologies for nondestructive testing of wood and wood products. Comput. Electron. Agric.
**2017**, 137, 79–87. [Google Scholar] [CrossRef] - Yang, H.; Yu, L. Feature extraction of wood-hole defects using wavelet-based ultrasonic testing. J. For. Res.
**2017**, 28, 395–402. [Google Scholar] [CrossRef] - Li, X.; Qian, W.; Cheng, L.; Chang, L. A coupling model based on grey relational analysis and stepwise discriminant analysis for wood defect area identification by stress wave. BioResources
**2020**, 15, 1171–1186. [Google Scholar] [CrossRef] - Du, X.; Li, J.; Feng, H.; Chen, S. Image Reconstruction of Internal Defects in Wood Based on Segmented Propagation Rays of Stress Waves. Appl. Sci.
**2018**, 8, 1778. [Google Scholar] [CrossRef] - Wang, Q.; Liu, X.; Yang, S. Predicting Density and Moisture Content of Populus xiangchengensis and Phyllostachys edulis using the X-Ray Computed Tomography Technique. For. Prod. J.
**2020**, 70, 193–199. [Google Scholar] [CrossRef] - Qiu, Q. Thermal conductivity assessment of wood using micro computed tomography based finite element analysis (μCT-based FEA). NDT E Int.
**2023**, 139, 102921. [Google Scholar] [CrossRef] - Lai, F.; Luo, T.; Ding, R.; Luo, R.; Deng, T.; Wang, W.; Li, M. Application of Image Processing Technology to Wood Surface Defect Detection. For. Mach. Woodwork Equip.
**2021**, 49, 16–21. [Google Scholar] [CrossRef] - Siekański, P.; Magda, K.; Malowany, K.; Rutkiewicz, J.; Styk, A.; Krzesłowski, J.; Kowaluk, T.; Zagórski, A. On-Line Laser Triangulation Scanner for Wood Logs Surface Geometry Measurement. Sensors
**2019**, 19, 1074. [Google Scholar] [CrossRef] [PubMed] - Peng, Z.; Yue, L.; Xiao, N. Simultaneous Wood Defect and Species Detection with 3D Laser Scanning Scheme. Int. J. Opt.
**2016**, 2016, 1–6. [Google Scholar] [CrossRef] - Hu, C.; Tanaka, C.; Ohtani, T. Locating and identifying splits and holes on sugi by the laser displacement sensor. J. Wood Sci.
**2003**, 49, 492–498. [Google Scholar] [CrossRef] - Li, D.; Zhang, Z.; Wang, B.; Yang, C.; Deng, L. Detection method of timber defects based on target detection algorithm. Measurement
**2022**, 203, 111937. [Google Scholar] [CrossRef] - Shi, J.; Li, Z.; Zhu, T.; Wang, D.; Ni, C. Defect Detection of Industry Wood Veneer Based on NAS and Multi-Channel Mask R-CNN. Sensors
**2020**, 20, 4398. [Google Scholar] [CrossRef] - Han, S.; Jiang, X.; Wu, Z. An Improved YOLOv5 Algorithm for Wood Defect Detection Based on Attention. IEEE Access
**2023**, 11, 71800–71810. [Google Scholar] [CrossRef] - Cui, Y.; Lu, S.; Liu, S. Real-time detection of wood defects based on SPP-improved YOLO algorithm. Multimed Tools Appl.
**2023**, 82, 21031–21044. [Google Scholar] [CrossRef] - Gao, M.; Wang, F.; Song, P.; Liu, J.; Qi, D. BLNN: Multiscale Feature Fusion-Based Bilinear Fine-Grained Convolutional Neural Network for Image Classification of Wood Knot Defects. J. Sens.
**2021**, 2021, 1–18. [Google Scholar] [CrossRef] - Wang, C.-Y.; Bochkovskiy, A.; Liao, H.-Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv
**2022**, arXiv:2207.02696. [Google Scholar] - Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. arXiv
**2016**, arXiv:1506.02640. [Google Scholar] - Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. arXiv
**2016**, arXiv:1612.08242. [Google Scholar] - Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv
**2018**, arXiv:1804.02767. [Google Scholar] - Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv
**2020**, arXiv:2004.10934. [Google Scholar] - Li, C.; Li, L.; Jiang, H.; Weng, K.; Geng, Y.; Li, L.; Ke, Z.; Li, Q.; Cheng, M.; Nie, W.; et al. YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications. arXiv
**2022**, arXiv:2209.02976. [Google Scholar] - Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. YOLOX: Exceeding YOLO Series in 2021. arXiv
**2021**, arXiv:2107.08430. [Google Scholar] - Sirisha, U.; Praveen, S.P.; Srinivasu, P.N.; Barsocchi, P.; Bhoi, A.K. Statistical Analysis of Design Aspects of Various YOLO-Based Deep Learning Models for Object Detection. Int. J. Comput. Intell. Syst.
**2023**, 16, 126. [Google Scholar] [CrossRef] - Chen, Y.; Dai, X.; Liu, M.; Chen, D.; Yuan, L.; Liu, Z. Dynamic Convolution: Attention Over Convolution Kernels. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 11027–11036. [Google Scholar] [CrossRef]
- Hu, J.; Shen, L.; Albanie, S.; Sun, G.; Wu, E. Squeeze-and-Excitation Networks. arXiv
**2019**, arXiv:1709.01507. [Google Scholar] - Li, C.; Zhou, A.; Yao, A. Omni-Dimensional Dynamic Convolution. arXiv
**2022**, arXiv:2209.07947. [Google Scholar] - Hou, Q.; Zhou, D.; Feng, J. Coordinate Attention for Efficient Mobile Network Design. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 13708–13717. [Google Scholar] [CrossRef]
- Ma, N.; Zhang, X.; Zheng, H.-T.; Sun, J. ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design. arXiv
**2018**, arXiv:1807.11164. [Google Scholar] - Rao, Y.; Zhao, W.; Tang, Y.; Zhou, J.; Lim, S.-N.; Lu, J. HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions. arXiv
**2022**, arXiv:2207.14284. [Google Scholar] - Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16 × 16 Words: Transformers for Image Recognition at Scale. arXiv
**2021**, arXiv:2010.11929. [Google Scholar] - Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. arXiv
**2021**, arXiv:2103.14030. [Google Scholar] - Kodytek, P.; Bodzas, A.; Bilik, P. A large-scale image dataset of wood surface defects for automated vision-based quality control processes. F1000Research
**2022**, 10, 581. [Google Scholar] [CrossRef] [PubMed]

**Figure 5.** Structure of S-HorBlock: (**a**) S-HorBlock module; (**b**) HorNet network structure; (**c**) $g^{n}$Conv module structure.

| Defect Type | Number of Occurrences | Number of Images with the Defect | Images with the Defect in the Dataset (%) |
|---|---|---|---|
| Live_Knot | 4070 | 2256 | 62.7 |
| Marrow | 206 | 191 | 5.3 |
| Resin | 650 | 523 | 14.5 |
| Dead_Knot | 2934 | 1875 | 52.1 |
| Knot_with_crack | 542 | 398 | 11.1 |
| Knot_missing | 121 | 110 | 3.1 |
| Crack | 517 | 371 | 10.3 |
| Without any defects | — | 7 | 0.2 |

| Model | mAP | Live_Knot | Marrow | Resin | Dead_Knot | Knot_with_Crack | Knot_Missing | Crack |
|---|---|---|---|---|---|---|---|---|
| YOLOv7 | 0.694 | 0.777 | 0.811 | 0.669 | 0.789 | 0.486 | 0.632 | 0.693 |
| YOLOv7+S-HorBlock | 0.745 | 0.830 | 0.747 | 0.698 | 0.832 | 0.543 | 0.868 | 0.694 |
| YOLOv7+ODCA | 0.753 | 0.842 | 0.807 | 0.793 | 0.836 | 0.592 | 0.736 | 0.668 |
| ODCA-YOLO | 0.785 | 0.835 | 0.930 | 0.790 | 0.834 | 0.614 | 0.782 | 0.707 |

Per-class columns report AP for each defect type.

| Model | mAP | Live_Knot | Marrow | Resin | Dead_Knot | Knot_with_Crack | Knot_Missing | Crack |
|---|---|---|---|---|---|---|---|---|
| YOLOv5 | 0.753 | 0.789 | 0.872 | 0.773 | 0.783 | 0.552 | 0.763 | 0.736 |
| YOLOv7 | 0.694 | 0.777 | 0.811 | 0.669 | 0.789 | 0.486 | 0.632 | 0.693 |
| YOLOX | 0.600 | 0.692 | 0.661 | 0.760 | 0.666 | 0.403 | 0.474 | 0.544 |
| SSD | 0.605 | 0.695 | 0.642 | 0.774 | 0.650 | 0.511 | 0.483 | 0.479 |
| RetinaNet | 0.526 | 0.684 | 0.413 | 0.735 | 0.633 | 0.541 | 0.477 | 0.196 |
| ODCA-YOLO | 0.785 | 0.835 | 0.930 | 0.790 | 0.834 | 0.614 | 0.782 | 0.707 |

Per-class columns report AP for each defect type.


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Wang, R.; Liang, F.; Wang, B.; Mou, X.
ODCA-YOLO: An Omni-Dynamic Convolution Coordinate Attention-Based YOLO for Wood Defect Detection. *Forests* **2023**, *14*, 1885.
https://doi.org/10.3390/f14091885
