Wood-YOLOv11: An Optimized YOLOv11-Based Model for Real-Time Pith Detection in Sawn Timber
Abstract
1. Introduction
- Task-Oriented Dataset Construction: We develop a dedicated sawn timber cross-section dataset composed of multiple species, diverse imaging conditions, and high-quality pith annotations. The dataset includes a substantial number of negative samples (pithless boards) to reflect practical industrial conditions.
- Negative-Sample-Aware Training Strategy: Pithless images are explicitly incorporated during training as background-only samples. A weighted binary cross-entropy (WBCE) component is adopted to mitigate severe class imbalance and enhance the model’s capability to suppress false positives.
- Resolution and Loss Optimization for Small-Target Detection: The model uses a high-resolution input configuration (840 × 840) and a composite loss function with tuned weighting coefficients to improve the localization accuracy of small pith targets.
- Comprehensive Evaluation Pipeline: Beyond standard mAP and precision/recall metrics, the evaluation includes false-positive rate analysis on pithless boards, ablation studies on model configurations, and comparisons with mainstream detectors.
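
The negative-sample handling described in the contributions above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `pos_weight` factor, the confidence `threshold`, and the image-level definition of the false-positive rate (a pithless board counts as a false positive if any detection on it reaches the threshold) are assumptions made for illustration, and both function names are hypothetical.

```python
import math
from typing import Sequence


def weighted_bce(probs: Sequence[float], targets: Sequence[float],
                 pos_weight: float = 1.0, eps: float = 1e-7) -> float:
    """Mean binary cross-entropy with the positive term scaled by pos_weight.

    pos_weight is an assumed hyperparameter; the source does not state its value.
    """
    if not probs or len(probs) != len(targets):
        raise ValueError("probs and targets must be non-empty and equal in length")
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total += -(pos_weight * y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(probs)


def pithless_false_positive_rate(max_confidences: Sequence[float],
                                 threshold: float = 0.25) -> float:
    """Fraction of pithless images whose strongest detection reaches threshold.

    max_confidences holds one value per pithless image (0.0 if nothing was detected).
    The threshold default is a placeholder, not a value taken from the source.
    """
    if not max_confidences:
        raise ValueError("at least one pithless image is required")
    flagged = sum(1 for c in max_confidences if c >= threshold)
    return flagged / len(max_confidences)


if __name__ == "__main__":
    # A background-only (pithless) sample contributes only the negative term,
    # so every confident prediction on it is penalized.
    print(weighted_bce([0.1, 0.2], [0.0, 0.0], pos_weight=5.0))
    print(pithless_false_positive_rate([0.1, 0.3, 0.05, 0.6]))
```

In this formulation a pithless image has all-zero targets, so the positive weight never applies to it and the loss only pushes its predicted confidences toward zero, which is the behavior the false-positive-rate check is meant to verify.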
2. Related Work and Theoretical Basis
2.1. Convolutional Neural Networks (CNNs) in Image Recognition
2.2. Object Detection Models
3. Dataset Construction and Preprocessing
3.1. Data Collection
3.2. Image Annotation
3.3. Data Preprocessing and Augmentation
- Image Normalization: All images were resized to a uniform input resolution of 840 × 840 pixels. This larger-than-standard size was chosen specifically to preserve the fine details of the small pith target during training [26];
- Data Augmentation: To prevent overfitting and improve the model’s robustness to variations in appearance, several data augmentation techniques were applied during the training phase, including random rotations, horizontal flips, and adjustments to brightness and contrast. Techniques like Mosaic and Mixup, common in YOLO training pipelines, were also employed [27,28].
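
The geometric side of these steps can be sketched in Python as follows. The text states only that images are resized to 840 × 840 and that horizontal flips are applied; the aspect-preserving resize with centred padding used here is an assumption, and the helper names are hypothetical. Boxes are given as pixel-space `(x_center, y_center, width, height)` tuples.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_center, y_center, width, height) in pixels

TARGET = 840  # input side length stated in the text


def letterbox_params(width: int, height: int,
                     target: int = TARGET) -> Tuple[float, float, float]:
    """Return (scale, pad_x, pad_y) for an aspect-preserving resize with centred padding.

    Padding instead of stretching is an assumption; the source only says "resized".
    """
    scale = min(target / width, target / height)
    new_w, new_h = round(width * scale), round(height * scale)
    pad_x = (target - new_w) / 2
    pad_y = (target - new_h) / 2
    return scale, pad_x, pad_y


def letterbox_box(box: Box, scale: float, pad_x: float, pad_y: float) -> Box:
    """Map a pith box from original image coordinates into the resized canvas."""
    xc, yc, w, h = box
    return (xc * scale + pad_x, yc * scale + pad_y, w * scale, h * scale)


def hflip_box(box: Box, image_width: float) -> Box:
    """Mirror a box horizontally; only the x centre changes."""
    xc, yc, w, h = box
    return (image_width - xc, yc, w, h)


if __name__ == "__main__":
    scale, pad_x, pad_y = letterbox_params(1680, 840)
    resized = letterbox_box((400.0, 420.0, 40.0, 40.0), scale, pad_x, pad_y)
    print(resized)
    print(hflip_box(resized, TARGET))
```

Any geometric augmentation has to be applied to the annotation as well as the pixels; for a target as small as the pith, a box left unflipped or unscaled would point at plain wood.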
3.4. Task Analysis
4. Proposed Method
4.1. YOLOv11 Architecture Overview
4.2. Loss Function Optimization
4.2.1. Localization Loss: Using Enhanced CIoU Loss
4.2.2. Classification Loss: Using Weighted Binary Cross-Entropy
4.2.3. Distribution Focal Loss: Optimizing Boundary Details
4.3. Model Parameters and Training
5. Results and Discussion
5.1. Experimental Setup
5.2. Quantitative Results
5.3. Ablation Studies
5.4. Training Curves and Convergence
5.5. Visualization Results and Analysis of Pith Detection Model
5.5.1. Initial Candidate Box Generation Stage
5.5.2. Feature Extraction and Prediction Stage
5.5.3. Post-Processing and Final Output Stage
5.6. Discussion
6. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Habite, T.; Abdeljaber, O.; Olsson, A. Automatic detection of annual rings and pith location along Norway spruce timber boards using conditional adversarial networks. Wood Sci. Technol. 2021, 55, 461–488.
- Hu, M.; Briggert, A.; Olsson, A.; Johansson, M.; Oscarsson, J.; Säll, H. Growth layer and fibre orientation around knots in Norway spruce: A laboratory investigation. Wood Sci. Technol. 2018, 52, 7–27.
- Zielinski, K.M.; Scabini, L.; Ribas, L.C.; da Silva, N.R.; Beeckman, H.; Verwaeren, J.; Bruno, O.M.; De Baets, B. Advanced wood species identification based on multiple anatomical sections and using deep feature transfer and fusion. Comput. Electron. Agric. 2025, 231, 109867.
- Wang, B.; Wang, R.; Chen, Y.; Yang, C.; Teng, X.; Sun, P. FDD-YOLO: A Novel Detection Model for Detecting Surface Defects in Wood. Forests 2025, 16, 308.
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2961–2969.
- Lee, S.H.; Seo, H.I.; Seong, J.H.; Joo, Y.I.; Seo, D.H. A study on defect detection in X-ray image castings based on unsupervised learning. J. Adv. Mar. Eng. Technol. 2020, 44, 487–493.
- Kisantal, M.; Wojna, Z.; Murawski, J.; Gorban, J.; Naruniec, J.; Cho, K. Augmentation for small object detection. arXiv 2019, arXiv:1902.07296.
- Liu, H.; Wang, M.; Liu, L.; Wu, J.; Huang, H. A survey of small object detection based on deep learning. Comput. Eng. Sci. 2021, 43, 1429–1442.
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125.
- Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path aggregation network for instance segmentation. In Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 8759–8768.
- Zheng, Z.; Wang, P.; Liu, W.; Li, J.; Ye, R.; Ren, D. Distance-IoU loss: Faster and better learning for bounding box regression. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), New York, NY, USA, 7–12 February 2020; pp. 12993–13000.
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2980–2988.
- LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. In Proceedings of the Advances in Neural Information Processing Systems 25 (NIPS 2012), Lake Tahoe, NV, USA, 3–8 December 2012; pp. 1097–1105.
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D. Going deeper with convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015.
- Zou, Z.; Chen, K.; Shi, Z.; Guo, Y.; Ye, J. Object detection in 20 years: A survey. Proc. IEEE 2023, 111, 257–276.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the Advances in Neural Information Processing Systems 28 (NIPS 2015), Montréal, QC, Canada, 7–12 December 2015; pp. 91–99.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016; pp. 779–788.
- Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934.
- Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–22 June 2023; pp. 7464–7475.
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the Computer Vision–ECCV 2016, Amsterdam, The Netherlands, 11–14 October 2016; pp. 21–37.
- Jiao, L.; Zhang, F.; Liu, F.; Yang, S.; Li, L.; Feng, Z.; Qu, R. A survey of deep learning-based object detection. IEEE Access 2019, 7, 128837–128868.
- Wang, C.Y.; Liao, H.Y.M.; Wu, Y.H.; Chen, P.Y.; Hsieh, J.W.; Yeh, I.H. CSPNet: A new backbone that can enhance learning capability of CNN. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Seattle, WA, USA, 13–19 June 2020; pp. 390–391.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1904–1916.
- Singh, B.; Davis, L.S. An analysis of scale invariance in object detection-snip. In Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 3578–3587.
- Zhang, H.; Cisse, M.; Dauphin, Y.N.; Lopez-Paz, D. mixup: Beyond empirical risk minimization. arXiv 2018, arXiv:1710.09412.
- Yun, S.; Han, D.; Oh, S.J.; Chun, S.; Choe, J.; Yoo, Y. CutMix: Regularization strategy to train strong classifiers with localizable features. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6023–6032.
- Esteva, A.; Kuprel, B.; Novoa, R.A.; Ko, J.; Swetter, S.M.; Blau, H.M.; Thrun, S. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017, 542, 115–118.
- Gulshan, V.; Peng, L.; Coram, M.; Stumpe, M.C.; Wu, D.; Narayanaswamy, A.; Venugopalan, S.; Widner, K.; Madams, T.; Cuadros, J.; et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA-J. Am. Med. Assoc. 2016, 316, 2402–2410.
- Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Fei-Fei, L. ImageNet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; pp. 5998–6008.
- Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV) 2018, Munich, Germany, 8–14 September 2018; pp. 3–19.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7132–7141.
- Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. YOLOX: Exceeding YOLO series in 2021. arXiv 2021, arXiv:2107.08430.
- Li, X.; Wang, W.; Wu, L.; Chen, S.; Hu, X.; Li, J.; Tang, J.; Yang, J. Generalized focal loss: Learning qualified and distributed bounding boxes for dense object detection. In Proceedings of the Advances in Neural Information Processing Systems 33 (NeurIPS 2020), Virtual, 6–12 December 2020.
- Loshchilov, I.; Hutter, F. Decoupled weight decay regularization. In Proceedings of the 7th International Conference on Learning Representations (ICLR 2019), New Orleans, LA, USA, 6–9 May 2019.
- Loshchilov, I.; Hutter, F. SGDR: Stochastic gradient descent with warm restarts. In Proceedings of the 5th International Conference on Learning Representations (ICLR 2017), Toulon, France, 24–26 April 2017.
- Yosinski, J.; Clune, J.; Bengio, Y.; Lipson, H. How transferable are features in deep neural networks? In Proceedings of the Advances in Neural Information Processing Systems 27, Montréal, QC, Canada, 8–13 December 2014; pp. 3320–3328.
- Everingham, M.; Van Gool, L.; Williams, C.K.I.; Winn, J.; Zisserman, A. The PASCAL Visual Object Classes (VOC) Challenge. Int. J. Comput. Vis. 2010, 88, 303–338.
- Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149.
- Ciresan, D.; Giusti, A.; Gambardella, L.M.; Schmidhuber, J. Deep neural networks segment neuronal membranes in electron microscopy images. In Proceedings of the Advances in Neural Information Processing Systems 25 (NIPS 2012), Lake Tahoe, NV, USA, 3–8 December 2012; pp. 2843–2851.
- Fink, G.; Kohler, J. Model for the prediction of the tensile strength and tensile stiffness of knot clusters within structural timber. Eur. J. Wood Wood Prod. 2014, 72, 331–341.
- Guindos, P.; Guaita, M. A three-dimensional wood material model to simulate the behavior of wood with any type of knot at the macro-scale. Wood Sci. Technol. 2013, 47, 585–599.
- Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861.
- Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 4510–4520.
- Tan, M.; Le, Q.V. EfficientNet: Rethinking model scaling for convolutional neural networks. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; Volume 15, pp. 6105–6114.
- Mokroš, M.; Liang, X.; Surový, P.; Valent, P.; Čerňava, J.; Chudý, F.; Tunák, D.; Saloň, Š.; Merganič, J. Evaluation of close-range photogrammetry image collection methods for estimating tree diameters. ISPRS Int. J. Geo-Inf. 2018, 7, 93.
- Bhandarkar, S.M.; Luo, X.; Daniels, R.F.; Tollner, E.W. Automated planning and optimization of lumber production using machine vision and computed tomography. IEEE Trans. Autom. Sci. Eng. 2008, 5, 677–695.

| Model | mAP@0.5 | Precision | Recall | FPS |
|---|---|---|---|---|
| SSD | 0.72 | 0.901 | 0.78 | 45 |
| Faster R-CNN | 0.845 | 0.935 | 0.836 | 14 |
| YOLOv7 | 0.89 | 0.94 | 0.857 | 24 |
| YOLOv8 | 0.905 | 0.945 | 0.865 | 28 |
| Wood-YOLOv11 (ours) | 0.921 | 0.952 | 0.877 | 27 |

| Input resolution (px) | mAP@0.5 | Precision | Recall |
|---|---|---|---|
| 640 | 0.909 | 0.948 | 0.865 |
| 840 (ours) | 0.921 | 0.952 | 0.877 |

| Loss configuration | mAP@0.5 | Precision | Recall |
|---|---|---|---|
| w/o WBCE | 0.907 | 0.948 | 0.864 |
| w/ WBCE | 0.912 | 0.949 | 0.871 |
| w/o DFL | 0.906 | 0.947 | 0.863 |
| w/ DFL | 0.914 | 0.951 | 0.868 |
| w/ WBCE + DFL | 0.921 | 0.952 | 0.877 |

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Jia, S.; Kong, F.; Jin, B.; Jin, C.; Que, Z. Wood-YOLOv11: An Optimized YOLOv11-Based Model for Real-Time Pith Detection in Sawn Timber. Appl. Sci. 2025, 15, 13056. https://doi.org/10.3390/app152413056

