This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

Transformer–CNN Hybrid Framework for Pavement Pothole Segmentation

1 Center for Advanced Infrastructure and Transportation, Rutgers University, Piscataway, NJ 08854, USA
2 Department of Engineering, Boise State University, Boise, ID 83752, USA
3 Institute of Space and Earth Information Science, Fok Ying Tung Remote Sensing Science Building, The Chinese University of Hong Kong, Hong Kong SAR, China
4 School of Transportation, Southeast University, Nanjing 211189, China
* Author to whom correspondence should be addressed.
Sensors 2025, 25(21), 6756; https://doi.org/10.3390/s25216756 (registering DOI)
Submission received: 30 September 2025 / Revised: 26 October 2025 / Accepted: 3 November 2025 / Published: 4 November 2025
(This article belongs to the Special Issue Convolutional Neural Network Technology for 3D Imaging and Sensing)

Abstract

Pavement surface defects such as potholes pose significant safety risks and accelerate infrastructure deterioration. Accurate, automated detection of such defects requires both advanced sensing technologies and robust deep learning models. In this study, we propose PoFormer, a Transformer–CNN hybrid framework designed for precise segmentation of pavement potholes from heterogeneous image datasets. The architecture combines the global feature extraction ability of Transformers with the fine-grained localization capability of CNNs, achieving higher segmentation accuracy than state-of-the-art models. To construct a representative dataset, we combined open-source images with high-resolution field data acquired using a multi-sensor pavement inspection vehicle equipped with a line-scan camera and infrared/laser-assisted lighting. This sensing system provides millimeter-level resolution and continuous 3D surface imaging under diverse environmental conditions, ensuring robust training inputs for deep learning. Experimental results demonstrate that PoFormer achieves a mean IoU of 77.23% and a mean pixel accuracy of 84.48%, outperforming existing CNN-based models. By integrating multi-sensor data acquisition with advanced hybrid neural networks, this work highlights the potential of 3D imaging and sensing technologies for intelligent pavement condition monitoring and automated infrastructure maintenance.
Keywords: transformer; pothole; image segmentation; CNN; deep learning
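
The abstract reports a mean IoU and a mean pixel accuracy but does not define them in this section. The sketch below shows how these two metrics are commonly computed from a per-pixel confusion matrix. It assumes the usual definitions (per-class IoU = TP / (TP + FP + FN), per-class pixel accuracy = TP / (TP + FN), both averaged over classes); the paper's exact evaluation protocol may differ. The function names and the toy masks are hypothetical and are not taken from the article.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(truth: Sequence[int], pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        matrix[t][p] += 1
    return matrix


def mean_iou_and_pixel_accuracy(matrix: List[List[int]]) -> Tuple[float, float]:
    """Return (mIoU, mPA), averaged over classes present in the ground truth.

    Assumed definitions (standard, not confirmed by the article):
      IoU_c = TP / (TP + FP + FN),  PA_c = TP / (TP + FN).
    """
    n = len(matrix)
    ious, accs = [], []
    for c in range(n):
        tp = matrix[c][c]
        truth_total = sum(matrix[c])                       # TP + FN
        pred_total = sum(matrix[r][c] for r in range(n))   # TP + FP
        if truth_total == 0:
            continue  # class never occurs in the ground truth: skip it
        union = truth_total + pred_total - tp              # TP + FP + FN
        ious.append(tp / union)
        accs.append(tp / truth_total)
    return sum(ious) / len(ious), sum(accs) / len(accs)


# Toy flattened masks (hypothetical data): 0 = background, 1 = pothole.
truth = [0, 0, 1, 1, 0, 1, 1, 1]
pred = [0, 1, 1, 1, 0, 0, 1, 1]
miou, mpa = mean_iou_and_pixel_accuracy(confusion_matrix(truth, pred, 2))
print(f"mIoU = {miou:.4f}, mPA = {mpa:.4f}")
```

Averaging per class rather than over all pixels keeps a dominant background class from masking poor pothole segmentation, which is one reason both metrics are often reported together.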

Share and Cite

MDPI and ACS Style

Zhang, T.; Liu, Z.; Cui, B.; Gu, X.; Lu, Y. Transformer–CNN Hybrid Framework for Pavement Pothole Segmentation. Sensors 2025, 25, 6756. https://doi.org/10.3390/s25216756
