Open Access Article
Transformer–CNN Hybrid Framework for Pavement Pothole Segmentation
1 Center for Advanced Infrastructure and Transportation, Rutgers University, Piscataway, NJ 08854, USA
2 Department of Engineering, Boise State University, Boise, ID 83752, USA
3 Institute of Space and Earth Information Science, Fok Ying Tung Remote Sensing Science Building, The Chinese University of Hong Kong, Hong Kong SAR, China
4 School of Transportation, Southeast University, Nanjing 211189, China
* Author to whom correspondence should be addressed.
Sensors 2025, 25(21), 6756; https://doi.org/10.3390/s25216756 (registering DOI)
Submission received: 30 September 2025 / Revised: 26 October 2025 / Accepted: 3 November 2025 / Published: 4 November 2025
Abstract
Pavement surface defects such as potholes pose significant safety risks and accelerate infrastructure deterioration. Accurate, automated detection of such defects requires both advanced sensing technologies and robust deep learning models. In this study, we propose PoFormer, a Transformer–CNN hybrid framework designed for precise segmentation of pavement potholes from heterogeneous image datasets. The architecture combines the global feature extraction ability of Transformers with the fine-grained localization capability of CNNs, achieving higher segmentation accuracy than state-of-the-art models. To construct a representative dataset, we combined open-source images with high-resolution field data acquired using a multi-sensor pavement inspection vehicle equipped with a line-scan camera and infrared/laser-assisted lighting. This sensing system provides millimeter-level resolution and continuous 3D surface imaging under diverse environmental conditions, ensuring robust training inputs for deep learning. Experimental results demonstrate that PoFormer achieves a mean IoU of 77.23% and a mean pixel accuracy of 84.48%, outperforming existing CNN-based models. By integrating multi-sensor data acquisition with advanced hybrid neural networks, this work highlights the potential of 3D imaging and sensing technologies for intelligent pavement condition monitoring and automated infrastructure maintenance.
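The abstract reports results as mean IoU and mean pixel accuracy without defining them. The sketch below shows how these two metrics are commonly computed from a per-pixel confusion matrix. It is an illustration under assumptions, not the authors' evaluation code: the two-class setup (0 = background, 1 = pothole), the function names, and the toy labels are all hypothetical, and the paper may average over classes or images differently.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows index the ground-truth class, columns the prediction."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average of TP / (TP + FP + FN) over classes that appear at all."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


def mean_pixel_accuracy(cm: List[List[int]]) -> float:
    """Average of per-class recall (TP / ground-truth pixels of that class)."""
    accs = []
    for c, row in enumerate(cm):
        total = sum(row)
        if total > 0:
            accs.append(row[c] / total)
    return sum(accs) / len(accs) if accs else 0.0


# Toy example with flattened masks (assumed labels: 0 = background, 1 = pothole).
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=2)
print(cm)                        # [[3, 1], [2, 2]]
print(mean_iou(cm))              # (3/6 + 2/5) / 2 = 0.45
print(mean_pixel_accuracy(cm))   # (3/4 + 2/4) / 2 = 0.625
```

Mean pixel accuracy only penalizes missed pixels of each class, while IoU also penalizes false positives, which is consistent with the reported mean IoU (77.23%) being lower than the reported mean pixel accuracy (84.48%).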
Share and Cite
MDPI and ACS Style
Zhang, T.; Liu, Z.; Cui, B.; Gu, X.; Lu, Y. Transformer–CNN Hybrid Framework for Pavement Pothole Segmentation. Sensors 2025, 25, 6756. https://doi.org/10.3390/s25216756
Chicago/Turabian Style
Zhang, Tianjie, Zhen Liu, Bingyan Cui, Xingyu Gu, and Yang Lu. 2025. "Transformer–CNN Hybrid Framework for Pavement Pothole Segmentation" Sensors 25, no. 21: 6756. https://doi.org/10.3390/s25216756
Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.