Feature Sparse Choosing VIT Model for Efficient Concrete Crack Segmentation in Portable Crack Measuring Devices
Abstract
1. Introduction
- (1) We design a crack image segmentation model based on a light-weight VIT module. Because cracks are strongly continuous, cracks at different positions in a crack image are correlated. Our model uses VIT to capture the relationships between different crack positions and thereby process global information more effectively. More importantly, the FSVIT reduces the computational complexity of the VIT module in two ways: the fully connected layers in VIT are replaced by Depthwise Separable Convolution, and the Feature Space Choosing layer selects feature channels, which reduces the number of channels of the crack features.
- (2) A Feature Channel Selecting Module (FCSM) is used to select channel features in the decoder. The key operation of the proposed FCSM is the Channel Sparse Choosing operation, in which each feature channel is assigned a scaling factor α and the channels whose scaling factors approach zero are pruned. As a result, the number of channels of the original features decreases significantly after processing by the FCSM. In addition, the FCSM can suppress the influence of interfering features.
- (3) Current public concrete crack segmentation datasets contain too few samples, and their crack samples are very similar, so they may not fully reflect real-world crack scenarios. This study therefore creates a new dataset named QUCrack, which contains a large number of irregular cracks in a variety of environments.
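The Channel Sparse Choosing operation in the second contribution can be illustrated with a minimal sketch. It assumes that each channel is multiplied by its scaling factor α and that a channel is dropped when |α| falls below a fixed threshold. The threshold value, the function name `channel_sparse_choose`, and the plain-list feature layout are hypothetical choices for illustration and are not taken from the paper, which does not state here how α is learned or how "approaching zero" is decided.

```python
from typing import List, Sequence, Tuple

FeatureMap = List[List[float]]  # one channel stored as an H x W grid


def channel_sparse_choose(
    features: Sequence[FeatureMap],
    alphas: Sequence[float],
    threshold: float = 1e-2,  # assumed cut-off, not specified in the paper
) -> Tuple[List[FeatureMap], List[int]]:
    """Scale each channel by its alpha and drop channels whose |alpha| < threshold."""
    if len(features) != len(alphas):
        raise ValueError("need exactly one scaling factor per channel")
    # Indices of the channels that survive pruning.
    kept_indices = [i for i, a in enumerate(alphas) if abs(a) >= threshold]
    # Apply the scaling factor to every value of each surviving channel.
    kept = [
        [[alphas[i] * v for v in row] for row in features[i]]
        for i in kept_indices
    ]
    return kept, kept_indices


# Four 2 x 2 channels; channels 1 and 3 have scaling factors close to zero.
features = [[[float(c + 1)] * 2 for _ in range(2)] for c in range(4)]
alphas = [0.5, 0.001, 2.0, -0.004]
pruned, kept_indices = channel_sparse_choose(features, alphas)
print(kept_indices)  # [0, 2]
print(len(pruned))   # 2
```

In this toy input, four channels go in and two come out, which is the channel reduction the contribution describes.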
2. Materials and Methods
2.1. The Structure of the LTNet
2.2. The Feature Sparse Choosing VIT Module
2.3. The Structure of the Feature Channel Selecting Module
- (1) Reducing computational complexity: The input feature maps may have a very large number of channels, which increases computational complexity. Selecting specific channels reduces the number of channels that must be processed, which lowers the computational cost and improves the efficiency and speed of the model.
- (2) Improving the generalization ability of the model: Some channels may have little discriminative ability for a specific task. Selecting the more discriminative channels improves the model’s generalization ability, so that it adapts better to new data beyond the training data.
- (3) Reducing the risk of over-fitting: Excessive feature channels increase the complexity of the model, which can easily lead to over-fitting. Selecting the most representative feature channels reduces the complexity of the model, lowers the risk of over-fitting, and improves the generalization ability of the model.
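The first point can be made concrete with a short worked calculation. The sketch below counts the multiply-accumulate operations (MACs) of a standard convolution layer and shows how the count scales with the number of input channels. The 64 × 64 feature map, the 3 × 3 kernel, and the channel counts are illustrative assumptions, not values from the paper, and the helper `conv_macs` is hypothetical.

```python
def conv_macs(height: int, width: int, c_in: int, c_out: int, k: int = 3) -> int:
    """MACs of a standard k x k convolution with stride 1 and 'same' padding.

    Each of the height * width * c_out output values needs k * k * c_in
    multiply-accumulate operations (bias terms are ignored).
    """
    return height * width * c_out * k * k * c_in


# Assumed layer: 64 x 64 feature map, 3 x 3 kernel, 64 output channels.
full = conv_macs(64, 64, c_in=64, c_out=64)
# Same layer after channel selection keeps 16 of the 64 input channels.
pruned = conv_macs(64, 64, c_in=16, c_out=64)

print(full)            # 150994944
print(pruned)          # 37748736
print(full // pruned)  # 4
```

Because the cost is linear in the number of input channels, keeping a quarter of the channels cuts the cost of the following convolution to a quarter under these assumptions.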
2.4. Our Collected Dataset
3. Datasets and Experimental Setup
3.1. Datasets
3.2. Experimental Setup
4. Results
4.1. Comparison with the State-of-the-Art Methods
4.2. Effects of Using Different Numbers of FSVIT
4.3. Effects of Using Different Numbers of FCSM
4.4. Comparison of Different Light-Weight Segmentation Models
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Xiao, Y.; Li, J. Crack Detection Algorithm based on the Fusion of Percolation Theory and Adaptive Canny Operator. In Proceedings of the 2018 37th Chinese Control Conference (CCC), Wuhan, China, 25–27 July 2018; pp. 4295–4299.
- Salman, M.; Mathavan, S.; Kamal, K.; Rahman, M. Pavement crack detection using the Gabor filter. In Proceedings of the 16th International IEEE Conference on Intelligent Transportation Systems (ITSC 2013), The Hague, The Netherlands, 6–9 October 2013; pp. 2039–2044.
- Akbari, J.; Ahmadifarid, M.; Amiri, A.K. Multiple Crack Detection using Wavelet Transforms and Energy Signal Techniques. Frat. Integrità Strutt. 2020, 14, 269–280.
- Ramnivas, K.; Sachin, K.S. Crack detection near the ends of a beam using wavelet transform and high resolution beam deflection measurement. Eur. J. Mech. A/Solids 2021, 88, 104259, ISSN 0997-7538.
- Kaul, V.; Yezzi, A.; Tsai, Y. Detecting Curves with Unknown Endpoints and Arbitrary Topology Using Minimal Paths. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 1952–1965.
- Hu, Y.; Zhao, C.-X. A novel LBP based methods for pavement crack detection. J. Pattern Recognit. Res. 2010, 5, 140–147.
- Mojidra, R.; Li, J.; Mohammadkhorasani, A.; Moreu, F.; Bennett, C.; Collins, W. Vision-based fatigue crack detection using feature tracking. Earthq. Eng. Eng. Vib. 2023, 22, 19–39.
- Aswini, E.; Divya, S.; Kardheepan, S.; Manikandan, T. Mathematical morphology and bottom-hat filtering approach for crack detection on relay surfaces. In Proceedings of the International Conference on Smart Structures and Systems Icsss’13, Chennai, India, 28–29 March 2013; pp. 108–113.
- Nguyen, T.S.; Begot, S.; Duculty, F.; Avila, M. Free-form anisotropy: A new method for crack detection on pavement surface images. In Proceedings of the 2011 18th IEEE International Conference on Image Processing, Brussels, Belgium, 11–14 September 2011; pp. 1069–1072.
- Oliveira, H.; Correia, P.L. Automatic road crack segmentation using entropy and image dynamic thresholding. In Proceedings of the 2009 17th European Signal Processing Conference, Glasgow, UK, 24–28 August 2009; pp. 622–626.
- Pan, Y.; Zhang, X.; Cervone, G.; Yang, L. Detection of Asphalt Pavement Potholes and Cracks Based on the Unmanned Aerial Vehicle Multispectral Imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 3701–3712.
- Shi, Y.; Cui, L.; Qi, Z.; Meng, F.; Chen, Z. Automatic Road Crack Detection Using Random Structured Forests. IEEE Trans. Intell. Transp. Syst. 2016, 17, 3434–3445.
- Sheng, P.; Chen, L.; Tian, J. Learning-based road crack detection using gradient boost decision tree. In Proceedings of the 2018 13th IEEE Conference on Industrial Electronics and Applications (ICIEA), Wuhan, China, 31 May–2 June 2018; pp. 1228–1232.
- Qu, Z.; Mei, J.; Liu, L.; Zhou, D.-Y. Crack Detection of Concrete Pavement With Cross-Entropy Loss Function and Improved VGG16 Network Model. IEEE Access 2020, 8, 54564–54573.
- Chaiyasarn, K.; Buatik, A.; Mohamad, H.; Zhou, M.; Kongsilp, S.; Poovarodom, N. Integrated pixel-level CNN-FCN crack detection via photogrammetric 3D texture mapping of concrete structures. Autom. Constr. 2022, 140, 104388, ISSN 0926-5805.
- Wang, S.; Pan, Y.; Chen, M.; Zhang, Y.; Wu, X. FCN-SFW: Steel Structure Crack Segmentation Using a Fully Convolutional Network and Structured Forests. IEEE Access 2020, 8, 214358–214373.
- He, M.; Lau, T.L. CrackHAM: A Novel Automatic Crack Detection Network Based on U-Net for Asphalt Pavement. IEEE Access 2024, 12, 12655–12666.
- Khan, M.A.-M.; Kee, S.-H.; Nahid, A.-A. Vision-Based Concrete-Crack Detection on Railway Sleepers Using Dense U-Net Model. Algorithms 2023, 16, 568.
- Gao, X.; Tong, B. MRA-UNet: Balancing speed and accuracy in road crack segmentation network. Signal Image Video Process. 2023, 17, 2093–2100.
- Sizyakin, R.; Voronin, V.V.; Gapon, N.; Pižurica, A. A deep learning approach to crack detection on road surfaces. In Proceedings of the Conference on Artificial Intelligence and Machine Learning in Defense Applications, Online, 21–25 September 2020.
- Zhang, X.H.; Huang, H. PSNet: Parallel-Convolution-Based U-Net for Crack Detection with Self-Gated Attention Block. Appl. Sci. 2023, 13, 9875.
- Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In IEEE Transactions on Pattern Analysis and Machine Intelligence; IEEE: New York, NY, USA, 2017.
- Mark, D.J.; Thomas, A.C.; Maria, I.I.; Tom, B.; Gordon, M. A deep convolutional neural network for semantic pixel-wise segmentation of road and pavement surface cracks. In Proceedings of the 2018 26th European Signal Processing Conference (EUSIPCO), Roma, Italy, 3–7 September 2018.
- Nhung, T.H.; Nguyen, T.H.L.; Stuart, P.; Nguyen, T.T. Pavement crack detection using convolutional neural network. In International Symposium on Information and Communication Technology; Association for Computing Machinery: New York, NY, USA, 2018.
- Di Benedetto, A.; Fiani, M.; Gujski, L.M. U-Net-Based CNN Architecture for Road Crack Segmentation. Infrastructures 2023, 8, 90.
- Yang, G.; Geng, P.; Ma, H.; Liu, J.; Luo, J. Dwta-unet: Concrete crack segmentation based on discrete wavelet transform and unet. In Proceedings of the 2021 Chinese Intelligent Automation Conference, Zhanjiang, China, 5–7 November 2021.
- Han, C.; Ma, T.; Ju, H.; Huang, X.; Zhang, Y. Crackw-net: A novel pavement crack image segmentation convolutional neural network. IEEE Trans. Intell. Transp. Syst. 2022, 23, 22135–22144.
- Zhang, C.; Jiang, W.; Zhao, Q. Semantic segmentation of aerial imagery via split-attention networks with disentangled nonlocal and edge supervision. Remote Sens. 2021, 13, 1176.
- Sun, X.; Xie, Y.; Jiang, L.; Cao, Y.; Liu, B. Dma-net: Deeplab with multi-scale attention for pavement crack segmentation. IEEE Trans. Intell. Transp. Syst. 2022, 23, 18392–18403.
- Feng, J.; Li, J.; Shi, Y.; Zhao, Y.; Zhang, C. Acau-net: Atrous convolution and attention u-net model for pavement crack segmentation. In Proceedings of the 2022 International Conference on Computer Engineering and Artificial Intelligence (ICCEAI), Shijiazhuang, China, 22–24 July 2022; pp. 561–565.
- Li, J.; Liu, Y.; Zhang, Y.; Zhang, Y. Cascaded attention denseunet (cadunet) for road extraction from very-high-resolution images. Int. J. Geo-Inf. 2021, 10, 329.
- Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Hu, Q. Eca-net: Efficient channel attention for deep convolutional neural networks. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020.
- Gao, Z.; Peng, B.; Li, T.; Gou, C. Generative adversarial networks for road crack image segmentation. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019; pp. 1–8.
- Nhung, H.T.; Nguyen, A.; Stuart, P.A.; Don, B.A.; Ha, T.L.B.; Thuy, T.N.C. Two-stage convolutional neural network for road crack detection and segmentation. Expert Syst. Appl. 2021, 186, 115718.
- Zhang, X.; Huang, H. PHCNet: Pyramid Hierarchical-Convolution-Based U-Net for Crack Detection with Mixed Global Attention Module and Edge Feature Extractor. Appl. Sci. 2023, 13, 10263.
- Zhang, X.; Huang, H. LCSNet: Light-Weighted Convolution-Based Segmentation Method with Separable Multi-Directional Convolution Module for Concrete Crack Segmentation in Drones. Electronics 2024, 13, 1307.
- Emara, T.; Munim, H.E.A.E.; Abbas, H.M. LiteSeg: A Novel Lightweight ConvNet for Semantic Segmentation. In 2019 Digital Image Computing: Techniques and Applications (DICTA); IEEE: New York, NY, USA, 2020.
- Wang, B.; Li, H.S. Lane detection algorithm based on MoblieNet + UNet lightweight network. In Proceedings of the 2021 3rd International Symposium on Robotics & Intelligent Manufacturing Technology (ISRIMT), Changzhou, China, 24–26 September 2021; pp. 352–356.
- Tsai, T.H.; Tseng, Y.W. BiSeNet V3: Bilateral segmentation network with coordinate attention for real-time semantic segmentation. Neurocomputing 2023, 532, 33–42.
- Ruan, J.; Xie, M.; Gao, J.; Liu, T.; Fu, Y. Ege-unet: An efficient group enhanced unet for skin lesion segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Vancouver, BC, Canada, 8–12 October 2023; Springer Nature: Cham, Switzerland; pp. 481–490.
- Jiang, W.; Xie, Z.; Li, Y.; Liu, C.; Lu, H. Lrnnet: A light-weighted network with efficient reduced non-local operation for real-time semantic segmentation. In 2020 IEEE International Conference on Multimedia & Expo Workshops (ICMEW); IEEE: New York, NY, USA, 2020; pp. 1–6.
Dataset | Image Size | Train | Test |
---|---|---|---|
The Crack500 dataset | 1440 × 2560 or 2560 × 1440 | 1896 | 1124 |
The OAD_CRACK dataset | 1920 × 1080 | 3500 | 1500 |
The QUCrack dataset | 1920 × 1080 | 8000 | 2000 |

(a)

Methods | Crack500 | OAD_CRACK | QUCrack | Model Size |
---|---|---|---|---|
ConvNet [22] | 0.591 | 0.572 | 0.416 | - |
U-Net by Jenkins [23] | 0.681 | 0.681 | 0.475 | - |
U-Net by Nguyen [24] | 0.695 | 0.683 | 0.473 | - |
U-Net proposed by Di [25] | 0.732 | 0.729 | 0.522 | - |
DWTA-U-Net [26] | 0.77 | 0.754 | 0.567 | - |
CrackW-Net [27] | 0.789 | 0.769 | 0.572 | - |
Split-Attention Network [28] | 0.73 | 0.696 | 0.519 | - |
DMA-Net [29] | 0.746 | 0.715 | 0.534 | - |
ACAU-Net [30] | 0.792 | 0.774 | 0.575 | - |
Cascaded Attention DenseU-Net [31] | 0.74 | 0.695 | 0.532 | 137 M |
ECA-Net [32] | 0.753 | 0.711 | 0.546 | 87 M |
FU-Net [33] | 0.795 | 0.769 | 0.582 | 90 M |
Two-stage CNN [34] | 0.79 | 0.765 | 0.585 | 230 M |
PSNet [21] | 0.812 | 0.762 | 0.612 | 185 M |
PHCNet [35] | 0.823 | 0.773 | 0.643 | 167 M |
LCSNet [36] | 0.836 | 0.785 | 0.661 | 2 M |
LTNet | 0.887 | 0.817 | 0.693 | 2 M |

(b)

Methods | Crack500 | OAD_CRACK | QUCrack |
---|---|---|---|
Split-Attention Network [28] | 0.725 | 0.682 | 0.502 |
DMA-Net [29] | 0.775 | 0.762 | 0.56 |
ACAU-Net [30] | 0.776 | 0.753 | 0.551 |
Cascaded Attention DenseU-Net [31] | 0.732 | 0.681 | 0.524 |
ECA-Net [32] | 0.767 | 0.723 | 0.552 |
FU-Net [33] | 0.761 | 0.736 | 0.557 |
Two-stage CNN [34] | 0.773 | 0.742 | 0.563 |
PSNet [21] | 0.829 | 0.771 | 0.599 |
PHCNet [35] | 0.817 | 0.765 | 0.631 |
LCSNet [36] | 0.828 | 0.773 | 0.652 |
LTNet | 0.882 | 0.805 | 0.681 |

(c)

Methods | Crack500 | OAD_CRACK | QUCrack |
---|---|---|---|
Split-Attention Network [28] | 0.73 | 0.69 | 0.51 |
DMA-Net [29] | 0.76 | 0.74 | 0.55 |
ACAU-Net [30] | 0.78 | 0.76 | 0.56 |
Cascaded Attention DenseU-Net [31] | 0.74 | 0.69 | 0.53 |
ECA-Net [32] | 0.76 | 0.72 | 0.55 |
FU-Net [33] | 0.78 | 0.75 | 0.57 |
Two-stage CNN [34] | 0.78 | 0.75 | 0.57 |
PSNet [21] | 0.82 | 0.77 | 0.61 |
PHCNet [35] | 0.82 | 0.77 | 0.64 |
LCSNet [36] | 0.83 | 0.78 | 0.66 |
LTNet | 0.88 | 0.81 | 0.69 |

Methods | Crack500 | OAD_CRACK | QUCrack |
---|---|---|---|
LiteSeg [37] | 0.814 | 0.764 | 0.657 |
MobileNet + UNet [38] | 0.786 | 0.724 | 0.591 |
BiSeNet v3 [39] | 0.792 | 0.733 | 0.618 |
EGE-UNet [40] | 0.803 | 0.747 | 0.621 |
LRNNet [41] | 0.812 | 0.762 | 0.649 |
LCSNet [36] | 0.836 | 0.785 | 0.661 |
LTNet | 0.887 | 0.817 | 0.693 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhang, X.; Huang, H.; Cai, M. Feature Sparse Choosing VIT Model for Efficient Concrete Crack Segmentation in Portable Crack Measuring Devices. Electronics 2024, 13, 1641. https://doi.org/10.3390/electronics13091641