Infrastructure is a significant element of society and a large part of the daily lives of many individuals. Roads are among the most important national facilities in the Kingdom of Saudi Arabia. Millions of people use them every day to commute, so their safety is critical. These facilities require regular maintenance and repair; otherwise, damage accumulates until the facility needs complete replacement, which causes further losses and harms the national economy. Damaged roads and potholes pose a risk to drivers and can cause accidents. They cause costly damage to vehicles and are responsible for a large share of highway deaths [1]. Road maintenance is critical to a country’s socio-economic development and the smooth continuation of day-to-day operations. To repair damage as soon as possible, its type and location must be identified. In the Kingdom of Saudi Arabia, road damage and asphalt defects are detected using modern laser technology devices [2]. These devices are often expensive to use. Moreover, visual inspection is performed only periodically, and the process is time-consuming, expensive, prone to errors, and does not provide a high-level overview of the situation. Reporting road damage using smartphone images is safer and faster.
Roads deteriorate gradually, particularly as a result of moisture and traffic. Water can degrade the road surface, roadbed, and shoulder and can also damage the physical structures of the road. Traffic also causes deterioration through the loss of surface material and the deformation of the road surface by vehicle tires, which exposes the road base and leads to ruts, potholes, and grooves [3]. There is an urgent need for more efficient, advanced, and less costly inspection methods. It is therefore necessary to take advantage of modern technologies and advanced artificial intelligence methods to recognize road problems and estimate their impact by developing practical algorithms that detect and classify damage automatically. This can help municipalities repair defects promptly to prevent accidents and ensure the safety of road users. This research aims to develop a suitable algorithm for a road vision system that can detect and classify road damage efficiently and quickly.
The research contributions are as follows:
Create a new dataset of road damage images from the Kingdom of Saudi Arabia. The dataset covers six types of road damage and their severity. The images were captured with a smartphone under a wide variety of weather and lighting conditions and from various angles.
Annotate the newly created dataset and have it reviewed by civil engineers with more than ten years of experience, so that the labels are reliable.
Implement an appropriate classification algorithm for Saudi Arabia’s roads that is able to detect and classify road damage efficiently and quickly.
Implement and evaluate several deep learning algorithms using the proposed dataset, including VGG-16, AlexNet, and ResNet-34.
Evaluate the performance of the proposed method and compare the results with state-of-the-art algorithms and previous works.
2. Related Work
During the last few years, many road damage datasets have been released and made publicly available. Some are used extensively by other researchers, such as the RDD2018 dataset released by Maeda et al. [4], which contained around 9000 images of eight types of road damage in Japan. The RDD2020 dataset by Arya et al. [5] included around 26,000 images of eight classes of damage collected from the roads of India, Japan, and the Czech Republic. In general, road damage datasets can be divided into two types based on their uses. The first type is detection datasets, which are used primarily for binary classification, such as the AigleRN dataset collected in France by Amhaz et al. [6], the CFD dataset collected in China and released by Shi et al. [7], the Temple University dataset collected in the United States by Zhang et al. [8], the Danish Technological Institute dataset collected in Denmark by Silva and Lucena [9], and the METU dataset gathered in Turkey and released by Özgenel [10]. The second type is classification datasets, which usually contain more than two positive classes, such as the Taiwan and Japan datasets released by Chen et al. [11], the Czech and Slovak datasets released by Mraz et al. [12], and many others.
Machine learning (ML) algorithms have been used for more than two decades to detect road damage, as in the work of Hoang et al. [13], Song et al. [14], and Hoang et al. [15]. Numerous studies have detected potholes and cracks using edge detection and image thresholding, for instance, Otsu et al. [16], Ayenu-Prah et al. [17], and Koch et al. [18]. Other studies detected cracks using random structured forests, e.g., Shi et al. [7], or an unsupervised method based on Otsu’s thresholds and photometric information, e.g., Akagic et al. [19]. The ML methods used to detect potholes included unsupervised fuzzy c-means clustering with morphological reconstruction, e.g., Ouma et al. [20], and the Support Vector Machine (SVM), e.g., Marques et al. [21]. Furthermore, Ahmadi et al. [22] and Cubero-Fernandez et al. [23] implemented several ML classification methods to classify four types of road damage, including K-nearest neighbors (KNN), Bagged Trees, SVM, and Decision Tree.
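To make the thresholding idea concrete, the following is a minimal sketch of Otsu’s method in plain Python. It is not taken from any of the cited works; the function name `otsu_threshold` and the toy pixel list are illustrative assumptions. The method picks the gray level that maximizes the between-class variance of the two groups it separates, here dark crack pixels and brighter pavement pixels.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form the dark class, the rest the bright class.
    `pixels` is a flat iterable of integers in [0, levels).
    """
    pixels = list(pixels)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the dark class so far
    sum0 = 0  # sum of gray levels in the dark class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue      # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break         # bright class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mu0 - mu1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Toy data (hypothetical): 20 dark "crack" pixels and 80 bright "pavement" pixels.
sample = [40] * 10 + [50] * 10 + [180] * 40 + [190] * 40
threshold = otsu_threshold(sample)
crack_mask = [p <= threshold for p in sample]
```

On real pavement images, lighting changes and surface texture make a single global threshold unreliable, which is one reason later work moved to learned classifiers.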
In recent years, deep learning (DL) has been widely used to detect road damage, for instance, by Biçici et al. [24], Stricker et al. [25], and Zhang et al. [26]. These algorithms and techniques are now used in self-driving cars to avoid obstacles and ensure road safety while driving. Most research addresses detection tasks that only discover the damage. Zhang et al. [8], Silva and Lucena [9], Rao et al. [27], and Fan et al. [28] presented several Convolutional Neural Network (CNN) models to detect cracks in road images. Despite the good results achieved, a detection method only determines the presence of damage; it does not classify its type. As a result, recent studies have applied DL-based classification methods to input images to assign them to various damage types, which can help municipalities identify and classify damage accurately. For instance, Ebenezer et al. [29] and Elghaish et al. [30] presented methods to detect four types of road damage using several CNN models.
Despite the good results achieved in these studies, the methods detect only four types of road damage. However, a number of methods have been developed to classify more than six types of road damage using different techniques. Maeda et al. [31] and Mraz et al. [12] proposed methods for road damage detection and classification based on Convolutional Neural Networks (CNNs); the models were trained using the SSD MobileNet and SSD Inception V2 frameworks. Singh et al. [32], Vishwakarma et al. [33], Kortmann et al. [34], and Wang et al. [35] proposed automatic road damage detection and classification using R-CNN models. In the last two years, some studies have used single-stage detectors such as YOLO, which use a single CNN to predict the class and location of damage directly: for instance, Doshi and Yilmaz [36], Alfarrarjeh et al. [37], Jeong [38], Hegde et al. [39], Pena-Caballero [40], and Al-Shaghouri [41].
Although many studies have proposed approaches to automate the detection of road damage, several problems remain, and there is still room for improvement. New methods could be proposed to classify more types of road damage more precisely, especially those present in the Kingdom of Saudi Arabia, where the nature of the damage may differ, because of geography and climate, from that in the countries where the available datasets were collected. Table 1 summarizes the studies conducted to detect and classify road damage using ML and DL methods. As shown, SVM was the most used ML method, and CNN was the most commonly used DL method.
6. Results and Discussion
This experiment was performed using the SARD-2022 dataset proposed in this study. The damage types include longitudinal and transverse cracks, alligator cracks, edge cracks, potholes, depressions, and shoving. We first classify the damage into six classes by type and then classify each type by severity into two classes: high damage (D_high) or low damage (D_low). Table 6 shows the performance of the four models in classifying the six classes of damage. Multiple deep learning models were implemented to determine which best fits the dataset and is the most effective and efficient at classifying six types of road damage: RoadNet, AlexNet, ResNet-34, and VGG-16. The RoadNet model achieves the highest accuracy on the test images, 98.6%, while having lower computational complexity and fewer parameters than the other models. AlexNet reaches 98.5% accuracy but has higher computational complexity and more parameters than RoadNet. The lowest accuracy, 80%, was achieved by the ResNet-34 model.
We further evaluate the models on classifying each damage type by severity, high or low. Table 7 shows the performance of the four models in classifying the six classes of damage according to their severity. In this experiment, accuracy decreased slightly for most models because each damage type was split into two categories (D_high and D_low), so the total number of categories used in training became 12. RoadNet again achieved higher accuracy than the other models. The exception to the decrease was ResNet-34, whose accuracy rose to 81% from 80% in the previous experiment. In all these experiments, VGG-16 took the longest to train, followed by AlexNet, while ResNet-34 trained quickly but did not achieve good results. VGG-16 has the most parameters, indicating that it has a higher computational complexity than the other pre-trained models. The RoadNet model has fewer parameters and a shorter training time than the other models, and it also outperformed the pre-trained models in accuracy.
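The step from 6 to 12 categories can be sketched as a simple label encoding. The snippet below is only an illustration under stated assumptions: the identifier spellings, the ordering of the classes, and the helper names `encode` and `decode` are hypothetical and are not taken from the SARD-2022 annotation format.

```python
from itertools import product

# Damage types as listed in the text; the identifier spellings are illustrative.
DAMAGE_TYPES = [
    "longitudinal_transverse_crack",
    "alligator_crack",
    "edge_crack",
    "pothole",
    "depression",
    "shoving",
]
SEVERITIES = ["D_low", "D_high"]

# One combined label per (type, severity) pair: 6 x 2 = 12 training categories.
COMBINED_LABELS = [f"{d}/{s}" for d, s in product(DAMAGE_TYPES, SEVERITIES)]


def encode(damage_type, severity):
    """Map a (type, severity) pair to a single class index in [0, 12)."""
    return DAMAGE_TYPES.index(damage_type) * len(SEVERITIES) + SEVERITIES.index(severity)


def decode(class_index):
    """Recover the (type, severity) pair from a combined class index."""
    type_idx, severity_idx = divmod(class_index, len(SEVERITIES))
    return DAMAGE_TYPES[type_idx], SEVERITIES[severity_idx]
```

With the same number of images spread over twice as many categories, each category has fewer training examples, which is consistent with the slight accuracy drop reported for most models.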
Comparison with Previous Works
Many studies have been conducted to detect and classify road damage using machine learning (ML) and deep learning (DL), but because most of these methods used different datasets and evaluation measures, comparing the results objectively is difficult. The results of the previous work are shown in Table 8. Zhang et al. [8] presented a Deep Convolutional Neural Network (ConvNet) to detect cracks in the road. The ConvNet consists of two convolutional layers and two fully connected layers. The dataset consists of 500 pavement images with and without cracks at a resolution of 3264 × 2448, collected using a smartphone on the Temple University campus in the USA. Their model achieved a precision of 0.86, a recall of 0.92, and an F1 score of 0.89. Our approach, which uses two convolutional layers followed by two fully connected layers and classifies six different damage classes, achieved better results: an accuracy of 0.99, a recall of 0.99, and an F1 score of 0.99. Silva and Lucena [9] applied a crack detection model based on a convolutional neural network (CNN). The model was trained using the VGG16 framework on the Danish Technological Institute dataset. It achieved an accuracy of 92.27%, whereas our VGG-16 model achieved a better accuracy of 95.1%. Rao et al. [27] presented several convolutional neural network (CNN) classification models to detect cracks in images. Their dataset consists of 2173 pavement images with and without cracks at a resolution of 256 × 256. Their AlexNet model achieved 94% accuracy and 88% precision. Our AlexNet model achieved a better accuracy of 98.4% and a precision of 99%.
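The metrics used in this comparison are related in a simple way, which the following sketch illustrates. The function names are illustrative, and the confusion counts in the usage lines are hypothetical values chosen for the example, not data from any of the cited studies. The F1 score is the harmonic mean of precision and recall, so the values reported by Zhang et al. [8] can be checked for internal consistency.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Consistency check on the values reported for Zhang et al. [8]:
# precision 0.86 and recall 0.92.
f1_reported = f1_score(0.86, 0.92)

# Hypothetical confusion counts, for illustration only.
p, r = precision_recall(tp=92, fp=15, fn=8)
f1_from_counts = f1_score(p, r)
```

Rounded to two decimals, `f1_reported` matches the F1 score of 0.89 quoted above. Accuracy, by contrast, also counts true negatives, so an accuracy figure is not directly comparable with a precision figure from another study.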