DYNet: A Printed Book Detection Model Using Dual Kernel Neural Networks
Abstract
1. Introduction
2. Principles and Methods
3. DYNet Model
3.1. RPM
3.1.1. Network Structure
3.1.2. Double Residual Structure
3.2. STSM
3.2.1. Network Structure
3.2.2. Mish Activation Function
3.2.3. Multi-Target Detection Branch
3.3. NRAM
4. Quality Evaluator
4.1. Quantity Checker
4.2. Fitting Checker
4.3. IoU Checker
4.4. Loss Checker
4.5. Weighted Voting Rights
5. Experimental Results
5.1. Datasets and Assessment Indicators
5.2. Experimental Process
5.3. Ablation Experiments
5.4. Performance Comparison
6. Conclusions
7. Discussion
- Introduce more samples and data augmentation techniques: Adding target samples of more types, sizes, and angles, combined with data augmentation, could improve DYNet’s adaptability to a wider range of detection scenarios.
- Optimize the model architecture and algorithms: Adjusting and improving the model architecture and adopting more advanced object detection algorithms could improve both detection accuracy and processing speed.
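As an illustration of the first direction, the following is a minimal sketch of image augmentation for a detection dataset. It is not the authors' pipeline: the function name `augment`, the parameters `brightness_range` and `noise_std`, and the `[x_min, y_min, x_max, y_max]` box layout are all assumptions made for this example. It combines a random horizontal flip (with the boxes mirrored to match), brightness scaling, and additive Gaussian noise, which loosely correspond to the lighting and noise conditions listed in the dataset table. Rotation to arbitrary angles is left out because it would also require rotating the boxes.

```python
import numpy as np


def augment(image, boxes, rng, brightness_range=(0.6, 1.4), noise_std=10.0):
    """Return an augmented copy of an H x W x C uint8 image and its boxes.

    Hypothetical helper: boxes are assumed to be rows of
    [x_min, y_min, x_max, y_max] in pixel coordinates.
    """
    out = image.astype(np.float32)
    boxes = np.asarray(boxes, dtype=np.float32).copy()
    width = out.shape[1]

    # Random horizontal flip; mirror the box x-coordinates to match.
    if rng.random() < 0.5:
        out = out[:, ::-1, :]
        x_min = width - boxes[:, 2]
        x_max = width - boxes[:, 0]
        boxes[:, 0], boxes[:, 2] = x_min, x_max

    # Brightness scaling to imitate lit and unlit capture conditions.
    out = out * rng.uniform(*brightness_range)

    # Additive Gaussian noise to imitate low- and high-noise backgrounds.
    out = out + rng.normal(0.0, noise_std, size=out.shape)

    return np.clip(out, 0, 255).astype(np.uint8), boxes


rng = np.random.default_rng(0)
# Placeholder data: a random 416 x 416 image and one made-up box.
image = rng.integers(0, 256, size=(416, 416, 3), dtype=np.uint8)
boxes = [[100, 120, 300, 360]]
augmented, new_boxes = augment(image, boxes, rng)
```

Photometric changes leave the labels untouched, while any geometric change has to be applied to the boxes as well, which is why the flip updates both.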
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Sample No. | Image Size | Quantity | Details of the Target and the Background |
---|---|---|---|
1 | 416 × 416 | 1133 | With light, low noise |
2 | 416 × 416 | 764 | With light, high noise |
3 | 416 × 416 | 891 | No light, low noise |
4 | 416 × 416 | 532 | No light, high noise |
The Title of the Book | Quantity |
---|---|
Dictionary of Common-Used Ancient Chinese Words | 1330 |
Xi Jinping: The Governance of China | 1234 |
Modern Chinese Dictionary | 313 |
Les Misérables | 103 |
The Brain Project | 76 |
Notre Dame de Paris | 62 |
Zero to One: Notes on Startups, or How to Build the Future | 60 |
World Order | 55 |
Complete Growth | 50 |
<1942> | 30 |
The Past | 29 |
Total | 3342 |
RPM | NRAM | Sample 1 AP (%) | Sample 1 AR (%) | Sample 2 AP (%) | Sample 2 AR (%) | Sample 3 AP (%) | Sample 3 AR (%) | Sample 4 AP (%) | Sample 4 AR (%) |
---|---|---|---|---|---|---|---|---|---|
✘ | ✘ | 96.29 | 97.18 | 95.81 | 96.07 | 97.31 | 97.53 | 95.86 | 96.80 |
✘ | ✔ | 96.29 | 97.18 | 96.73 | 98.04 | 97.31 | 97.98 | 95.86 | 97.93 |
✔ | ✘ | 98.23 | 99.21 | 97.51 | 98.30 | 98.43 | 99.33 | 97.93 | 98.68 |
✔ | ✔ | 100.00 | 100.00 | 99.87 | 99.87 | 99.89 | 99.89 | 99.81 | 99.81 |
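To make the ablation table above easier to read, the short sketch below computes the per-sample gain of the full model (RPM and NRAM both enabled) over the baseline (both disabled). The AP and AR values are copied from the first and last rows of the table; the variable and function names are chosen for this example only, and the differences are plain subtractions in percentage points rather than figures reported by the source.

```python
# (AP %, AR %) per sample, copied from the ablation table.
SAMPLES = ["Sample 1", "Sample 2", "Sample 3", "Sample 4"]
baseline = [(96.29, 97.18), (95.81, 96.07), (97.31, 97.53), (95.86, 96.80)]
full_model = [(100.00, 100.00), (99.87, 99.87), (99.89, 99.89), (99.81, 99.81)]


def gains(reference, candidate):
    """Return per-sample (AP gain, AR gain) in percentage points."""
    return [
        (round(c_ap - r_ap, 2), round(c_ar - r_ar, 2))
        for (r_ap, r_ar), (c_ap, c_ar) in zip(reference, candidate)
    ]


full_gains = gains(baseline, full_model)
for name, (d_ap, d_ar) in zip(SAMPLES, full_gains):
    print(f"{name}: AP +{d_ap:.2f}, AR +{d_ar:.2f} percentage points")
```

The same `gains` helper can be pointed at the two intermediate rows to see how much each module contributes when enabled alone.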
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, L.; Xie, X.; Huang, P.; Yu, Q. DYNet: A Printed Book Detection Model Using Dual Kernel Neural Networks. Sensors 2023, 23, 9880. https://doi.org/10.3390/s23249880