Impact of Image Preprocessing and Crack Type Distribution on YOLOv8-Based Road Crack Detection
Abstract
1. Introduction
- Convert the RGB images from all four datasets into grayscale and binarized formats, and run comparative experiments to determine which image type is most suitable for road crack detection.
- Re-annotate all datasets by refining crack classification into longitudinal cracks (LCs), transverse cracks (TCs), alligator cracks (ACs), oblique cracks (OCs), and potholes (PHs).
- Utilize CrackVariety, which maintains a balanced distribution of crack types, to analyze the effects of balanced vs. imbalanced crack type distributions on model accuracy and generalization.
- Comprehensive evaluation of image preprocessing techniques (RGB, grayscale, binarization) in YOLO-based road crack detection, providing insights into the optimal input format.
- Development of a balanced crack dataset (CrackVariety) with detailed crack type classification, ensuring equal representation of various crack types.
- Re-annotation of existing datasets (CFD, Crack500, CrackTree200) with refined crack labels, improving dataset consistency and classification accuracy.
- Empirical analysis of crack type distribution balance, demonstrating its impact on model accuracy and generalization performance.
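The balance of crack types in a dataset can be checked directly from its annotation files. The following is a minimal sketch, assuming YOLO-format label lines (`class x_center y_center width height`) and a hypothetical class-id order of LC, TC, AC, OC, PH; the actual ids depend on how each dataset was annotated.

```python
# Hypothetical class-id order; the real mapping comes from the dataset's YAML file.
CLASS_NAMES = ["LC", "TC", "AC", "OC", "PH"]


def count_crack_types(label_lines):
    """Count bounding boxes per crack type from YOLO-format label lines."""
    counts = {name: 0 for name in CLASS_NAMES}
    for line in label_lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        counts[CLASS_NAMES[int(parts[0])]] += 1
    return counts


def imbalance_ratio(counts):
    """Largest class count divided by the smallest; 1.0 means perfectly balanced."""
    smallest = min(counts.values())
    if smallest == 0:
        return float("inf")  # at least one crack type is missing entirely
    return max(counts.values()) / smallest
```

A ratio close to 1.0 indicates a balanced distribution of the kind CrackVariety is designed to have, while a large or infinite ratio indicates that some crack types dominate or are absent.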
2. Datasets and Methods
2.1. Datasets
2.2. Data Preprocessing
2.2.1. Grayscale Conversion
- (1) Average Method
- (2) Green Channel Method
- (3) Maximum Value Method
- (4) Minimum Value Method
- (5) Weighted Average Method
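The five conversion methods listed above can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the authors' code: the function name `to_grayscale` is hypothetical, the input is assumed to be an H x W x 3 array in RGB channel order, and the weighted average uses the common ITU-R BT.601 luma weights (0.299, 0.587, 0.114), which may differ from the weights used in the study.

```python
import numpy as np


def to_grayscale(rgb, method="weighted"):
    """Convert an H x W x 3 uint8 RGB array to an H x W uint8 grayscale array."""
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    if method == "average":
        gray = (r + g + b) / 3.0          # mean of the three channels
    elif method == "green":
        gray = g                          # green channel only
    elif method == "maximum":
        gray = rgb.max(axis=-1)           # brightest channel per pixel
    elif method == "minimum":
        gray = rgb.min(axis=-1)           # darkest channel per pixel
    elif method == "weighted":
        # Assumed BT.601 weights; the source does not state the coefficients.
        gray = 0.299 * r + 0.587 * g + 0.114 * b
    else:
        raise ValueError(f"unknown method: {method}")
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)
```

Images loaded with OpenCV arrive in BGR order, so the channels would need to be reordered before calling a function like this one.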
2.2.2. Binarization Conversion
1. Image cropping: to ensure a higher proportion of crack pixels, images are cropped to focus on crack regions (Figure 3).
2. Compute grayscale range: the brightness distribution of each cropped image is analyzed to determine its minimum and maximum grayscale values.
3. Grayscale compression: the grayscale range of the image is compressed to improve contrast and enhance crack visibility (Figure 4).
4. Histogram generation: histograms are generated to visualize the grayscale value distribution and provide insight into contrast variations (Figure 5).
5. K-means clustering for threshold estimation:
   - Each image undergoes K-means clustering (k = 2), where pixels are grouped into two clusters representing cracks and the background.
   - The clustering minimizes the sum of squared Euclidean distances between pixel values and their assigned cluster centers.
   - The binarization threshold for each image is determined as the midpoint between the two cluster centers.
6. Adjust grayscale in original image: the grayscale values in the original image are adjusted to match the contrast characteristics of the cropped regions (Figure 6).
7. Thresholding for binarization: the final threshold, computed as the average of all per-image thresholds, is applied to convert grayscale images into a binary format, ensuring a consistent processing standard across the dataset (Figure 7).
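The threshold estimation and final thresholding steps can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the cluster centers are initialized at the minimum and maximum gray values (an assumption), pixels above the threshold are mapped to 255 (also an assumption about polarity), and the cropping and contrast adjustment steps are omitted.

```python
import numpy as np


def kmeans_threshold(gray, max_iter=100):
    """Two-cluster 1-D k-means on pixel intensities; returns the midpoint of the centers."""
    pixels = gray.astype(np.float64).ravel()
    lo, hi = pixels.min(), pixels.max()      # grayscale range of the image
    if lo == hi:
        return float(lo)                     # flat image: nothing to separate
    centers = np.array([lo, hi])             # assumed initialization at the extremes
    for _ in range(max_iter):
        # Assign each pixel to the nearer center (Euclidean distance in 1-D).
        labels = np.abs(pixels[:, None] - centers[None, :]).argmin(axis=1)
        new_centers = np.array([pixels[labels == k].mean() for k in (0, 1)])
        if np.allclose(new_centers, centers):
            break                            # centers stopped moving
        centers = new_centers
    return float(centers.mean())             # midpoint between the two centers


def dataset_threshold(gray_images):
    """Average of the per-image thresholds, used as one global threshold."""
    return float(np.mean([kmeans_threshold(g) for g in gray_images]))


def binarize(gray, threshold):
    """Map pixels above the threshold to 255 and all others to 0."""
    return np.where(gray > threshold, 255, 0).astype(np.uint8)
```

Using one averaged threshold for the whole dataset keeps the processing consistent across images, at the cost of ignoring per-image lighting differences that a per-image threshold would absorb.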
2.3. Experimental Setup
2.4. Training Parameters and Evaluation Metrics
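1. Mean Average Precision (mAP)
   - AP for a single class is computed as the area under its precision-recall curve: $AP = \int_0^1 P(R)\,dR$
   - mAP is then computed as the mean of the AP values over all $N$ classes: $mAP = \frac{1}{N}\sum_{i=1}^{N} AP_i$
2. Precision and Recall
   - Precision (P): the proportion of detected cracks that are correct: $P = \frac{TP}{TP + FP}$
   - Recall (R): the proportion of actual cracks that are successfully detected: $R = \frac{TP}{TP + FN}$
   - Here $TP$, $FP$, and $FN$ are the numbers of true positives, false positives, and false negatives.
3. Inference Time and Frames Per Second (FPS)
   - Inference Time: the time taken to process a single image.
   - FPS is defined as the reciprocal of the inference time per image: $FPS = 1 / t_{\text{inference}}$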
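The metrics above can be illustrated with a short sketch. The function names are hypothetical, and the AP routine uses all-point interpolation of the precision-recall curve, which is one common numerical approximation of the area under the curve and not necessarily the one used by the evaluation tooling in this study.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def average_precision(recalls, precisions):
    """Area under the precision-recall curve (recalls must be sorted ascending)."""
    mrec = [0.0] + list(recalls) + [1.0]
    mpre = [0.0] + list(precisions) + [0.0]
    # Make precision monotonically non-increasing from right to left.
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Sum the rectangles between consecutive recall values.
    return sum((mrec[i + 1] - mrec[i]) * mpre[i + 1] for i in range(len(mrec) - 1))


def mean_average_precision(ap_per_class):
    """Mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)


def fps_from_inference_ms(inference_ms):
    """Frames per second as the reciprocal of per-image inference time in milliseconds."""
    return 1000.0 / inference_ms
```

As a worked example of the last definition, a per-image inference time of 3.3 ms corresponds to about 303.03 FPS.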
3. Results and Discussion
3.1. Results and Discussion of CFD Dataset
3.2. Results and Discussion of Crack500 Dataset
3.3. Results and Discussion of CrackTree200 Dataset
3.4. Results and Discussion of CrackVariety Dataset
4. Conclusions
- RGB images consistently outperformed grayscale and binarized formats, confirming that preserving color-based texture and contrast enhances YOLOv8s’s detection accuracy.
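- Grayscale conversion affected performance differently across datasets and methods: on CrackTree200, the five grayscale variants reached mAP@0.5 values of 0.434 to 0.446 against 0.421 for RGB, whereas on Crack500 the average method dropped to 0.333 against 0.384 for RGB.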
- Binarization generally degraded performance, except in CrackVariety (mAP@0.5 value of 0.406), where balanced crack distribution mitigated the negative effects of contrast loss.
- Dataset size significantly affects model performance, as demonstrated by the CFD dataset, which exhibited the lowest mAP@0.5 value of 0.222 due to its limited number of images (118). This suggests that small datasets may lead to poor generalization and overfitting, emphasizing the need for a sufficient number of samples in training.
- The reported FPS values across all experiments (192.31 to 384.62) far exceed 30 FPS, indicating real-time detection capability.
- In CrackVariety, binarized images achieved relatively high detection performance (mAP@0.5 = 0.406) compared to other datasets. This may be attributed to the dataset’s balanced crack type distribution, which helps the model generalize better despite the reduced feature richness in binarized images.
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
PMS | Pavement Management System |
DL | deep learning |
YOLO | You Only Look Once |
CNN | Convolutional Neural Network |
CFD | Crack Forest Dataset |
LC | longitudinal crack |
TC | transverse crack |
AC | alligator crack |
OC | oblique crack |
PH | pothole |
mAP | Mean Average Precision |
FPS | Frames Per Second |
GISs | geographic information systems |
Hardware | Configurations |
---|---|
CPU | Intel(R) Core(TM) i9-14900HX @ 2.20 GHz |
GPU | NVIDIA GeForce RTX 4080 Laptop |
RAM | 32 GB |
Software | Configurations |
---|---|
Operating System | Windows 11 Home Single Language |
Development Environment | Anaconda 2.6.3, PyCharm Professional 2024.2.1 |
Programming Language | Python 3.9 |
Libraries and Frameworks | OpenCV 4.10.0, PyTorch 1.12.1, CUDA 11.6, cuDNN 8.9.7 |
Parameter | Value |
---|---|
Epochs | 300 |
Input Image Size (imgsz) | 640 pixels |
Batch Size | 4 |
Patience | 300 |
Other Hyperparameters | Default YOLOv8 settings |
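
The training parameters in the table above map onto the Ultralytics training call roughly as follows. This is a sketch under assumptions: the dataset file name `crack.yaml` and the helper `train_crack_detector` are hypothetical, and the `yolov8s.pt` starting weights are assumed from the YOLOv8s model named in the conclusions.

```python
# Values taken from the training parameter table; all other settings stay at defaults.
TRAIN_ARGS = {
    "epochs": 300,    # number of training epochs
    "imgsz": 640,     # input image size in pixels
    "batch": 4,       # batch size
    "patience": 300,  # early-stopping patience in epochs
}


def train_crack_detector(data_yaml="crack.yaml", weights="yolov8s.pt"):
    """Train a YOLOv8 model with the table's parameters (hypothetical helper)."""
    from ultralytics import YOLO  # imported lazily so the config can be inspected without it

    model = YOLO(weights)
    return model.train(data=data_yaml, **TRAIN_ARGS)
```

Because the patience value equals the number of epochs, early stopping is effectively disabled and every run trains for the full 300 epochs.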
Dataset | Metric | RGB | Grayscale: Average | Grayscale: Green Channel | Grayscale: Maximum Value | Grayscale: Minimum Value | Grayscale: Weighted Average | Binarized |
---|---|---|---|---|---|---|---|---|
CFD | Precision | 0.387 | 0.230 | 0.403 | 0.365 | 0.435 | 0.297 | 0.197 |
CFD | Recall | 0.205 | 0.221 | 0.301 | 0.219 | 0.289 | 0.224 | 0.188 |
CFD | mAP@0.5 | 0.222 | 0.221 | 0.259 | 0.216 | 0.269 | 0.229 | 0.165 |
CFD | FPS | 303.03 | 303.03 | 322.58 | 333.33 | 312.50 | 312.50 | 333.33 |
Dataset | Metric | RGB | Grayscale: Average | Grayscale: Green Channel | Grayscale: Maximum Value | Grayscale: Minimum Value | Grayscale: Weighted Average | Binarized |
---|---|---|---|---|---|---|---|---|
Crack500 | Precision | 0.386 | 0.504 | 0.516 | 0.543 | 0.591 | 0.491 | 0.483 |
Crack500 | Recall | 0.475 | 0.370 | 0.434 | 0.430 | 0.371 | 0.397 | 0.460 |
Crack500 | mAP@0.5 | 0.384 | 0.333 | 0.376 | 0.393 | 0.390 | 0.354 | 0.376 |
Crack500 | FPS | 357.14 | 344.82 | 333.33 | 359.14 | 333.33 | 333.33 | 384.62 |
Dataset | Metric | RGB | Grayscale: Average | Grayscale: Green Channel | Grayscale: Maximum Value | Grayscale: Minimum Value | Grayscale: Weighted Average | Binarized |
---|---|---|---|---|---|---|---|---|
CrackTree200 | Precision | 0.593 | 0.455 | 0.608 | 0.455 | 0.608 | 0.612 | 0.504 |
CrackTree200 | Recall | 0.385 | 0.479 | 0.355 | 0.479 | 0.355 | 0.472 | 0.351 |
CrackTree200 | mAP@0.5 | 0.421 | 0.441 | 0.434 | 0.441 | 0.434 | 0.446 | 0.380 |
CrackTree200 | FPS | 384.62 | 370.37 | 333.33 | 333.33 | 370.37 | 370.37 | 384.62 |
Dataset | Metric | RGB | Grayscale: Average | Grayscale: Green Channel | Grayscale: Maximum Value | Grayscale: Minimum Value | Grayscale: Weighted Average | Binarized |
---|---|---|---|---|---|---|---|---|
CrackVariety | Precision | 0.638 | 0.721 | 0.484 | 0.522 | 0.505 | 0.550 | 0.569 |
CrackVariety | Recall | 0.368 | 0.392 | 0.338 | 0.403 | 0.393 | 0.375 | 0.364 |
CrackVariety | mAP@0.5 | 0.404 | 0.451 | 0.345 | 0.418 | 0.378 | 0.400 | 0.406 |
CrackVariety | FPS | 222.22 | 263.16 | 192.31 | 222.22 | 222.22 | 227.27 | 222.22 |
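
The mAP@0.5 rows of the four results tables can be compared programmatically. The sketch below copies those values into a dictionary and reports, for each dataset, the input format with the highest mAP@0.5 and each format's difference from RGB; the names `MAP50`, `best_format`, and `delta_vs_rgb` are hypothetical.

```python
FORMATS = ["RGB", "Average", "Green Channel", "Maximum Value",
           "Minimum Value", "Weighted Average", "Binarized"]

# mAP@0.5 values copied from the results tables, in the column order above.
MAP50 = {
    "CFD":          [0.222, 0.221, 0.259, 0.216, 0.269, 0.229, 0.165],
    "Crack500":     [0.384, 0.333, 0.376, 0.393, 0.390, 0.354, 0.376],
    "CrackTree200": [0.421, 0.441, 0.434, 0.441, 0.434, 0.446, 0.380],
    "CrackVariety": [0.404, 0.451, 0.345, 0.418, 0.378, 0.400, 0.406],
}


def best_format(dataset):
    """Return (format name, mAP@0.5) for the highest-scoring input format."""
    scores = MAP50[dataset]
    best = max(range(len(scores)), key=lambda i: scores[i])
    return FORMATS[best], scores[best]


def delta_vs_rgb(dataset):
    """Difference in mAP@0.5 between each non-RGB format and RGB."""
    scores = MAP50[dataset]
    return {name: round(score - scores[0], 3)
            for name, score in zip(FORMATS[1:], scores[1:])}


if __name__ == "__main__":
    for name in MAP50:
        print(name, best_format(name), delta_vs_rgb(name))
```

These are single-run point values, so small differences between formats should be read with caution.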
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Fan, L.; Tang, S.; Ariffin, M.K.A.b.M.; Ismail, M.I.S.; Wang, X. Impact of Image Preprocessing and Crack Type Distribution on YOLOv8-Based Road Crack Detection. Sensors 2025, 25, 2180. https://doi.org/10.3390/s25072180