Article

Analysis and Classification of Distress on Flexible Pavements Using Convolutional Neural Networks: A Case Study in Benin Republic

by Crespin Prudence Yabi 1, Godfree F. Gbehoun 1, Bio Chéissou Koto Tamou 1, Eric Alamou 1, Mohamed Gibigaye 2 and Ehsan Noroozinejad Farsangi 3,*
1 Laboratory of Studies and Tests in Civil Engineering (L2EGC), National University of Sciences, Technologies, Engineering and Mathematics, Abomey BP 486, Benin
2 Laboratory of Applied Energetic and Mechanic (LEMA), University of Abomey-Calavi, Abomey-Calavi 01BP 526, Benin
3 Urban Transformations Research Centre (UTRC), Western Sydney University, Parramatta, NSW 2150, Australia
* Author to whom correspondence should be addressed.
Infrastructures 2025, 10(5), 111; https://doi.org/10.3390/infrastructures10050111
Submission received: 19 February 2025 / Revised: 1 April 2025 / Accepted: 16 April 2025 / Published: 29 April 2025
(This article belongs to the Section Infrastructures Inspection and Maintenance)

Abstract

Roads are critical infrastructure for multi-sectoral development. Any country that aims to expand and stabilize its activities must have a network of paved roads in good condition. However, that is not the case in many countries. Conventional methods of recording and classifying pavement distress require substantial equipment, trained technicians, and time to establish the nature and indices of the damage and thereby estimate the roadway’s quality level. This study proposes pavement distress detection and classification models based on Convolutional Neural Networks, starting from videos taken of any asphalt road. To carry out this work, various routes were filmed to inventory the degradations concerned. Images were extracted from these videos, then resized and annotated, and used to build several databases of road damage, such as longitudinal cracks, alligator cracks, small potholes, and patching. Within an appropriate development environment, three Convolutional Neural Networks were developed and trained on the databases. The accuracy achieved by the different models varies from 94.6% to 97.3%, which is promising compared with models in the literature. This method would make it possible to considerably reduce the financial resources spent on each road data campaign.

1. Introduction

Roads are essential factors in the development of a nation because of their importance in the exchange of goods and services. It is therefore crucial to prioritize the construction of transport infrastructure, especially roads [1]. Ensuring the quality of the roadway in service, by quickly detecting road pathologies and scheduling regular maintenance of the layers concerned, is becoming one of the priority tasks in infrastructure management [2]. Such policies require tools that allow not only the recording of damage and deformation but, above all, the location and estimation of the scale (extent and severity) of the damage in question [3,4]. However, road data campaigns are tedious and costly processes for governments. Indeed, Zhang et al. [5] explain that a significant investment is required, as well as special inspection vehicles fitted with sensors and cameras to capture the data.
Therefore, the question of implementing an intelligent roadway management system becomes relevant. Since the 1980s, many researchers have been interested in developing automatic methods and means to detect and classify pathologies on a paved road. In addition, road monitoring systems allow collection of high-resolution images of roads at normal speed [6]. Interest in using such images to create an inventory of damage has increased and consolidated. Areas of research interest have varied, depending on the evolution of the tools available to them: the intensity thresholding technique, edge detection [7], wavelet transformation, texture analysis, Machine Learning, and Deep Learning [8].
Conventional intensity thresholding methods set a single threshold to divide grey-level pavement images into binary images, on which cracked and non-cracked pixels can then be separated. This approach has been used chiefly for cracks, since crack pixels are darker than the surrounding pixels [9]. Several studies [9,10,11] developed algorithms along these lines, based on iterated clipping and histograms. However, the results were not satisfactory: the authors reported that the variation of image intensity with lighting conditions affects the pixel values too strongly for a fixed threshold to work. To overcome these shortcomings, Zhang et al. [12] proposed a coarse-to-fine methodology introducing new concepts such as the region of aggregation (ROA) and the region of belief (ROB) for the segmentation and classification of cracking.
The edge detection method classifies pathologies by applying different filters to the images to recover precise shapes to be categorized: edge detectors such as Canny and Sobel [13], and also Prewitt, Laplacian, etc. Morphological filters are used to detect cracks by eliminating noise in the images [13]. However, the major shortcoming of this method remains that it produces disjoint crack curves. These processing methods did not remain confined to crack classification for long. Indeed, Huidrom et al. [14] proposed a classification of three pathologies: potholes, cracks, and repairs. The study was carried out in two stages: the first classified video frames into degradation levels using the Distress Frame Selection algorithm, while the second categorized the degradations and estimated the affected zone using the Critical Distress Detection, Measurement and Classification algorithm, which uses image texture, shape, and dimension factors to assign a category to pathologies.
Detection by image processing has proven itself, and many works have demonstrated that the automation of road monitoring processes is convincing, effective, and promising given larger data and more efficient tools [15]. The evolution of Machine Learning offers more encouraging avenues for detecting pathologies. In the work of Bray et al. [16], the histogram and density are computed as features and then fed to a simple neural network to perform crack classification. Hoang [17] proposed an AI model for pothole detection based first on extracting image characteristics with Gaussian and steerable filters, followed by the use of those characteristics to train Machine Learning models such as LS-SVM and ANN. The work of Kaseko et al. [15] demonstrated that classifiers based on artificial neural networks using pictures of the roadway are more efficient than traditional Machine Learning classifiers such as the Bayes classifier or the K-Nearest Neighbor method. In all these Machine Learning-based classification methods, categorization relies on extracted features rather than on the raw images; given the rich diversity of image data, it is therefore impossible to build a precise and generalized classifier this way.
Interest in Deep Learning then emerged and grew exponentially. The use of a Convolutional Neural Network (CNN) in early work [18,19] on a large database of road images, taken from an inspection vehicle equipped with a professional camera, several scanners, and a monitoring system, led the way. Over nine thousand (9712) pictures were collected, reduced, and then fed to a Convolutional Neural Network for the classification of cracks on asphalt roads. The accuracy of 98%, which was 3% higher than that obtained by pixel-by-pixel analysis of images, led to the conclusion that the potential of Deep Learning for classifying pathologies was significant and that the coming years would see a rush of work towards this new approach. In the same way, Maeda et al. [20] proposed a method for detecting eight different pathologies. A general and public database was created: 9053 images containing 15,435 degradations were collected with a smartphone in the streets of Japan. The notion of transfer learning was also introduced in this research, since two already trained detection models (Inception V2 and SSD MobileNet) were applied to the database created.
Recently, Wang et al. [21], again with transfer learning, developed a tool for detecting potholes on a paved road using the YOLO v3 object detection model. To improve performance, image features were extracted with ResNet 101, more precisely the convolutional part of the ResNet architecture. Maslan et al. [22] recommended ResNet 50 for feature extraction, followed by the YOLO v2 model for detecting the pathologies concerned, in their case cracking. Since ResNet 50 is itself an image classification model, Chun et al. [23] chose to use it exclusively to identify six different classes on paved roads, depending on whether the cracks are visible or not; the overall performance confirmed the relevance of their tool while pointing to the improvements needed for better results. In 2021, Zhang et al. drew on the architecture of the VGG 16 network to set up a CNN capable of effectively detecting five different pathologies as well as the healthy state of the roadway. With results that were satisfactory, to say the least, longitudinal and transverse cracks, potholes, repairs, and markings can be spotted on any coated surface.
Developing a CNN architecture to meet the desired objectives is more complex than using pre-trained models, but previous studies have shown custom CNN architectures to achieve better precision than pre-trained models. For example, Hoang et al. [24] compared the performance of an ML algorithm with a CNN developed in Matlab and showed once again the superiority of convolutional networks in classification/detection tasks: the two versions of the algorithm implemented could not exceed 80% accuracy, while the CNN reached 92%. Similarly, Gopalakrishnan et al. [8] obtained their best detection performance with the VGG 16 convolutional model (90%), which outperformed five other Machine Learning models.
Many of the studies cited above focus on cracks or potholes. Few models can detect other types of damage and repairs with acceptable accuracy in addition to cracks and potholes. Furthermore, to our knowledge, none of these models are based on images of pavements in sub-Saharan Africa, where the problem of pavement damage is acute. This research deals with the development of a Deep Learning Convolutional Neural Network suitable for the classification of pathologies in roadways in the Benin Republic.

2. Materials and Methods

This research was carried out on two fronts: first identifying, and then classifying, the different deteriorations on any flexible roadway. The study comprised four stages: data collection, data preprocessing, development of the model architectures, and training and evaluation of the models to identify the best performer.

2.1. Collection of Data

The current condition of paved roads in Benin was taken into consideration to obtain optimal results. Since the national asphalt network has been in service for almost 15–20 years without renovation or regular, appropriate maintenance, the country’s paved roads are in an advanced state of degradation, with several pathologies present simultaneously. The roads selected for this work are the Sèmè–Porto-Novo road in the south of the Benin Republic for cracking and repairs, and the Bohicon–Dassa road in the center of the Benin Republic for deformations and tearing. Figure 1 presents a map of the itinerary surveyed.
To obtain high-quality images, a Canon EOS Rebel T8i camera was used (Figure 2). This device records high-resolution video (1920 × 1080 pixels) through either a fixed 50 mm lens or an 18–55 mm zoom lens, and its frame rate can be adjusted to increase the number of images that can be extracted from the videos. A frame rate of 60 frames per second was selected for the first data collection on the Sèmè–Porto-Novo road, and 25–50 frames per second for the second collection on the Bohicon–Dassa section. These capture rates produced images sufficiently offset from one another to constitute a varied database and to make computer-based image extraction easier.
It was necessary to drive at a speed of about 5 km/h for the images obtained to be sharp enough for use in Deep Learning. Collection lasted almost 2 h, yielding a large batch of sequenced videos covering degraded sections of roadway.

2.2. Data Preprocessing

Improving the models during training requires well-prepared datasets, especially images with appropriate dimensions. To streamline the information collection method, the camera recorded videos of the roadway, from which many pictures of various pathologies could be recovered from different angles.
The videos obtained passed through an image extractor, a code written in the Python language in the PyCharm development environment, which helped us to recover the individual frames constituting each video. In total, nearly 5000 images were extracted from all the videos taken.
The second part of picture processing was filtering the extracted pictures. For a Convolutional Neural Network to extract a significant number of characteristics specific to each category of degradation, the database it uses must contain a substantial number of pictures and, above all, sufficient variety. The extracted pictures were therefore sorted so that only the pictures of interest for this work were retained.
This involved removing pictures considered to be duplicates, i.e., pictures too similar to be counted as two different pictures (pictures separated by 0.017 s, for example). It was also necessary to remove all pictures presenting several degradations simultaneously, because they would be difficult to classify in a single degradation category. Pictures whose sharpness had been affected by the vehicle’s movement were also removed, because they did not provide reliable information to the models. Some examples of the images obtained are shown in Figure 3, Figure 4, Figure 5 and Figure 6.
The third part of preprocessing consisted of resizing the pictures. The larger the pictures fed to a convolutional model, the larger the model’s final size: bigger inputs inflate the number of parameters to be trained in the hidden layers (sets of neurons grouped for the same task), which leads to increased complexity and calculation time.
The size of the pictures was therefore reduced. With a code written in Python in the PyCharm IDE, the dimensions of the extracted pictures were changed from 1920 × 1080 pixels to 480 × 270 pixels, a reduction to one-quarter in both length and width. This resizing gave the developed models better conditions (a sufficiently reduced number of parameters), much more productive training, and results that were easier to interpret. Figure 7, Figure 8, Figure 9 and Figure 10 present examples of the reduced images.
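The resizing step is a one-liner per image; a sketch assuming Pillow as the imaging library (the paper names only Python and PyCharm), with the 480 × 270 target taken from the text:

```python
from pathlib import Path
from PIL import Image

def resize_folder(src_dir: str, dst_dir: str, size=(480, 270)) -> int:
    """Resize every JPEG in src_dir to `size` (width, height), save to dst_dir,
    and return the number of images processed."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(Path(src_dir).glob("*.jpg")):
        with Image.open(path) as im:
            # Lanczos resampling preserves crack edges better than nearest-neighbour
            im.resize(size, Image.LANCZOS).save(dst / path.name)
        count += 1
    return count
```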
The last stage of preprocessing was the annotation of the resized pictures. This step is what allows the models to assign a class to each identified degradation. A class was assigned to each subset of pathologies in the datasets used for training the models: each batch of pictures was placed in a folder named after the pathology highlighted in the pictures. Once a dataset is imported into the development environment, a labelling function assigns to each picture in each sub-folder the name of the degradation concerned. Four types of degradation were handled by the developed tool:
  • Longitudinal cracks;
  • Alligator cracks;
  • Raveling;
  • Patching.
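The folder-per-class layout described above maps directly onto label assignment. A minimal sketch (the folder names are illustrative assumptions; the paper does not give the exact directory names):

```python
from pathlib import Path

# Hypothetical folder names for the four classes listed in the text
CLASSES = ["longitudinal_crack", "alligator_crack", "raveling", "patching"]

def label_images(dataset_root: str) -> list[tuple[str, int]]:
    """Walk a folder-per-class dataset and pair each image path with the
    integer index of its class, taken from the parent folder name."""
    pairs = []
    for idx, name in enumerate(CLASSES):
        for path in sorted((Path(dataset_root) / name).glob("*.jpg")):
            pairs.append((str(path), idx))
    return pairs
```

Deep Learning frameworks offer the same convention built in (e.g. directory-based dataset loaders), which is likely what the “labelling function” in the text refers to.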
The different datasets were obtained as image extraction progressed. In Table 1, Dataset 1 served as a foundation for visualizing the effects of data preprocessing and for training a Convolutional Neural Network (CNN) on a field database. Dataset 2 was used for the largest number of scenarios: all adjustments to find the best possible results were made with this dataset, starting from image preprocessing, through the ideal architecture for the CNN, and finally hyperparameter optimization. Once the best possible CNN was achieved, Datasets 3 and 4 were used for fine-tuning to keep the model in its most optimal condition.
The distribution of the collected data is an important parameter in Deep Learning. Here, the images were partitioned in a 65-15-20 ratio. Table 2 shows how this ratio was applied to the datasets created.
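A minimal sketch of the 65-15-20 partition; the mapping of the three shares to training, validation, and test sets is an assumption, since the paper does not spell it out:

```python
import random

def split_dataset(items, ratios=(0.65, 0.15, 0.20), seed=42):
    """Shuffle and split `items` into three disjoint subsets
    (assumed here to be train / validation / test) per `ratios`."""
    assert abs(sum(ratios) - 1.0) < 1e-9
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed for reproducible splits
    n = len(items)
    n_train = int(ratios[0] * n)
    n_val = int(ratios[1] * n)
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]
```

Splitting per class folder (stratified) rather than over the pooled image list would keep the class proportions equal across the three subsets.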

2.3. Development of Models

This study focused on developing a model for detecting and classifying damage on a flexible road surface. With the advent of Deep Learning, classical Machine Learning models were not considered for this work; attention was directed instead towards tools that are far more powerful once properly developed: Deep Neural Networks (or Deep Learning models). Moreover, for tasks of recognition, classification, object detection, and many others, AI researchers have converged on one type of model: the Convolutional Neural Network (CNN).
A Convolutional Neural Network is a deep neural network with a specific architecture whose purpose is to process all the information in a picture to execute classification, detection, and segmentation tasks [25,26,27]. It is one of the most powerful and efficient tools available for solving problems of object detection, facial recognition, automated driving, etc.
Its architecture is as follows:
A convolution layer is used to extract the characteristics of the input pictures by using several filters chosen randomly by the model [27]. The size of the filters is defined by the designer as well as their number. The principle of the convolution layer is presented in Figure 11.
A pooling layer is used to reduce the size of the extracted features without significant loss of data. Depending on the type chosen, it can retain the maximum (MaxPooling) or average (Average Pooling) values each time [26]. The principle of the max-pooling technique is presented in Figure 12.
An activation layer defines the process of transferring information from one layer to another [25]. The most used is ReLU, followed by its variants. Softmax is used at the end of the network’s architecture to perform class assignments.
A dense layer takes as input all the characteristics extracted from the pictures in the database and analyzes them to establish all possible correlations to draw the best conclusions about class membership [25].
Figure 13 shows the architecture of a Convolutional Neural Network (LeNet). This type of model takes pictures as input and outputs a predefined membership class. We therefore set up a CNN architecture to identify and classify the different cases of pavement distress present in the pictures collected. Based on previous work, the architecture was then refined.
In this research, Convolutional Neural Networks with various architectures were developed and tested using the databases created. The first CNN proposed was inspired by the architecture of the VGG16 model, a very efficient model that was subsequently used for transfer learning. Figure 14 shows the architecture of this model.
Given the small size of this database, a more complex architecture would not have been suitable; it would merely have consumed more time and computing power for mediocre results. The architecture of the CNN proposed at the start of the work was therefore quite simple, like that of VGG16.
The model took as input reduced pictures of size 480 × 270 pixels and included three successive convolutional blocks, followed by a flattening layer, a fully connected (dense) layer for analyzing the characteristics extracted from the pictures, and an output layer assigning a membership class to each picture. Each convolutional block has two convolutional layers (32, 64, and 128 channels for the successive blocks) and a MaxPooling layer with a 2 × 2 window for the extraction and then reduction of picture characteristics. At this stage of the work, the activation function used for the convolutional and dense layers was the ReLU (Rectified Linear Unit) function.
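The description above can be sketched in Keras (an assumption: the paper names only Python and PyCharm, not its Deep Learning framework). Three VGG-style blocks with 32/64/128 filters, each containing two convolutions and a 2 × 2 max-pool, then a dense head; the (270, 480, 3) input shape assumes a height × width × channels ordering of the 480 × 270 images:

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_cnn(input_shape=(270, 480, 3), num_classes=4):
    """Sketch of the paper's initial VGG16-like CNN: three conv blocks
    (two 3x3 convs + 2x2 max-pool each), then a fully connected head."""
    inputs = tf.keras.Input(shape=input_shape)
    x = inputs
    for filters in (32, 64, 128):          # one block per filter count
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D(2)(x)      # halve spatial dimensions
    x = layers.Flatten()(x)
    x = layers.Dense(128, activation="relu")(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

The 128-unit dense layer and the Adam optimizer are illustrative choices, not values stated in the paper.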
A Batch Normalization layer and L2 regularization were then added to this architecture to increase the model’s performance. With a larger database, we could fully explore the avenues available to us, such as hyperparameter optimization. This trial-and-error process led first to the Leaky ReLU function, for better stabilization of the results, and then to the ELU function, which provided superior performance. The number of epochs also increased, from 10–15 to 150–200.
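The later additions (Batch Normalization, L2 weight regularization, and the switch from ReLU to ELU) can be folded into a single reusable block, sketched here in Keras (framework assumed, as before; the weight-decay value is illustrative):

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def conv_block(x, filters, weight_decay=1e-4):
    """One convolutional block with L2-regularized weights, Batch
    Normalization, and ELU activations, followed by 2x2 max-pooling."""
    for _ in range(2):
        x = layers.Conv2D(filters, 3, padding="same",
                          kernel_regularizer=regularizers.l2(weight_decay))(x)
        x = layers.BatchNormalization()(x)   # stabilizes the loss curve
        x = layers.Activation("elu")(x)      # replaces ReLU per the text
    return layers.MaxPooling2D(2)(x)
```

Placing Batch Normalization between the convolution and the activation is one common convention; the paper does not specify the ordering.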
The architecture of the CNN also evolved: the number of layers was adjusted to the volume of input information. More than ten architectures were tried, and the three with the best performance were selected. Table 3 shows a summary of the scenarios implemented.

3. Results and Discussion

3.1. Model Training and Evaluation

Once the models were constructed, they were trained. Training the different models built to classify the pictures in the datasets made it possible to observe the behavior of each model according to the parameters assigned to it. The corrections made to the architecture and hyperparameters led to progressively higher and more stable accuracies.
Overfitting was the first and most serious problem encountered in the process (Figure 15), and the loss was not decreasing well (Figure 16). The parameters had to be changed to control this lack of precision in the training process. For example, adding a Batch Normalization (BN) layer partly fixed the loss problem (Figure 17), and the model’s accuracy improved, even though the overfitting remained (Figure 18).
To address the overfitting, a bigger dataset was used. Even though the accuracy was satisfactory, the overfitting was still there (Figure 19); the loss graph, however, was at least better than before, showing that the loss problem was on its way to being solved (Figure 20). After a new convolutional layer was added, the loss problem was solved (Figure 21), but although the accuracy did not increase, overfitting was still present when classifying pavement distress (Figure 22).
The ELU activation layer was then added to the architecture of the CNN, and the overfitting finally disappeared from the model’s training graphs (Figure 23 and Figure 24). With this problem solved, only the training parameters had to be changed to increase the model’s accuracy and decrease the loss: for example, more training epochs led to better accuracy (Figure 25) and lower loss (Figure 26). The evolution was the same for the remaining scenarios, with an increase in accuracy (Figure 27 and Figure 28) and a decrease in loss (Figure 29 and Figure 30). Eventually, the architecture yielding the best pavement distress classification results was obtained.
Table 4 and Figure 31 show the evolution of accuracy and loss across the eight scenarios.
Evaluation follows training and judges the predictive quality of the models. The most important elements to analyze are the accuracy, the loss, and the confusion matrix. Figure 32, Figure 33, Figure 34, Figure 35, Figure 36, Figure 37, Figure 38, Figure 39 and Figure 40 present the confusion matrices, which were used to see how the models’ predictions were distributed relative to reality. It is also from this powerful tool that the evaluation metrics are determined.
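Given a confusion matrix (rows taken as true classes and columns as predicted classes, which is an assumed orientation; the paper does not state it), the usual metrics follow directly:

```python
import numpy as np

def metrics_from_confusion(cm):
    """Accuracy plus per-class precision and recall from a confusion matrix
    whose rows are true classes and columns are predicted classes."""
    cm = np.asarray(cm, dtype=float)
    accuracy = np.trace(cm) / cm.sum()
    precision = np.diag(cm) / cm.sum(axis=0)   # correct / predicted as class
    recall = np.diag(cm) / cm.sum(axis=1)      # correct / actually in class
    return accuracy, precision, recall
```

For example, with two classes and the matrix [[9, 1], [2, 8]], accuracy is 17/20 = 0.85.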

3.2. Discussion and Model Performance Comparison

The best architecture was selected based on the performances observed during scenario 8. After that, training was conducted on the two other databases to ensure the validity of the model selected. Also, the already existing models were trained and then evaluated on these databases to compare with the performance of the developed CNN. Table 5 compares the model’s performance with other pre-trained models.
It is easy to see that the different models developed in this work achieve genuinely excellent performance, even surpassing existing models such as ResNet 101 and VGG 16. The developed models also have better accuracy than those obtained by Maslan et al. [22] and Hoang [24,28]. This can be explained by the use of the ELU activation function instead of the ReLU activation function used in all of those works. The ReLU function lets positive values pass through to subsequent layers and blocks negative values; this filtering makes the model focus only on certain characteristics of the data while eliminating the others. The main advantage of ELU over ReLU is its ability to avoid the dying neuron problem and to produce activations more centered around zero [25]. Such activations are generally considered beneficial for learning, as they approximate the properties of batch normalization; this weak regularizing effect was, in our case, able to address the overfitting problem and lead to faster and more robust learning.
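For reference, the two activation functions compared here are (with α = 1 the usual default for ELU):

```latex
\mathrm{ReLU}(x) = \max(0, x), \qquad
\mathrm{ELU}(x) =
\begin{cases}
x, & x > 0 \\
\alpha\,(e^{x} - 1), & x \le 0
\end{cases}
```

Unlike ReLU, which is exactly zero for all negative inputs, ELU saturates smoothly to −α, so negative inputs still carry a nonzero gradient and the mean activation is pulled towards zero, which is the “dying neuron” point made above.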

4. Conclusions

This study proposed a method for detecting damage on paved roads using CNNs. Data were collected on several roads in Benin using a field vehicle and a professional Canon EOS camera. Videos were taken at a frame rate of 25–50 frames per second, which made it possible to recover many images of the road. After the necessary processing, many of these pictures were found to be useless, and only the relevant pictures were retained.
Four databases were assembled from the pictures retrieved and processed; these databases were used to construct and train the CNNs and then to optimize the hyperparameters and all other parameters necessary to obtain good results. Numerous CNN architectures were tried, following various scenarios, and this trial-and-error method allowed us to select the best possible architecture for classification. Once this model was retained, training with the remaining databases was carried out to obtain other models that were just as efficient with the same architecture (the difference lying in the weights assigned to the neurons of each model). We ultimately obtained three models with accuracies between 94.6% and 97.3%, which is encouraging and allows us to affirm that the future of road monitoring lies in the use of CNNs.
In terms of performance, these models responded better than most of the other classification models.
For future work, the estimation of degradation indices will be an important focus, because it is essential for complete monitoring. Methods have already been proposed to determine the extent and severity of observed degradation, which relies on the detection of objects in pictures.

Author Contributions

Conceptualization, C.P.Y. and G.F.G.; methodology, C.P.Y. and G.F.G.; software, G.F.G.; validation B.C.K.T., M.G. and E.A.; formal analysis, C.P.Y., B.C.K.T. and E.N.F.; investigation, G.F.G.; resources, C.P.Y. and G.F.G.; data curation, E.A. and E.N.F.; writing—original draft preparation, G.F.G., C.P.Y. and B.C.K.T.; writing—review and editing, C.P.Y. and E.N.F.; visualization, E.N.F.; supervision, E.N.F. and M.G.; project administration, M.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The datasets used in this manuscript are available from the corresponding author upon request.

Acknowledgments

This study acknowledges the generosity of the National School of Public Works of Abomey in Benin, which provided technical and financial support to the first and second authors for data collection.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
CNN	Convolutional Neural Network
VGG	Visual Geometry Group
ROA	Region of Aggregation
ROB	Region of Belief
LS-SVM	Least-Squares Support-Vector Machine
ANN	Artificial Neural Network

References

  1. Ukhwah, E.N.; Yuniarno, E.M.; Suprapto, Y.K. Asphalt Pavement Pothole Detection Using Deep Learning Method Based on YOLO Neural Network. In Proceedings of the 2019 International Seminar on Intelligent Technology and Its Applications (ISITIA), Surabaya, Indonesia, 28–29 August 2019; pp. 35–40. [Google Scholar] [CrossRef]
  2. Hoang, N. Automatic Detection of Asphalt Pavement Raveling Using Image Texture Based Feature Extraction and Stochastic Gradient Descent Logistic Regression. Autom. Constr. 2019, 105, 102843. [Google Scholar] [CrossRef]
  3. CEBTP. LCPC Manuel Pour Le Renforcement Des Chaussées Souples En Pays Tropicaux; Documentation Française: Paris, France, 1985; ISBN 2.11.084817-0. [Google Scholar]
  4. IDRRIM. Diagnostic et Conception Des Renforcements de Chaussées; CEREMA, Ed.; Centre d’études et d’expertise sur les risques, l’environnement, la mobilité et l’aménagement (CEREMA): Paris, France, 2016; ISBN 978-2-37180-132-5. [Google Scholar]
  5. Zhang, C.; Nateghinia, E.; Miranda-Moreno, L.F.; Sun, L. Pavement Distress Detection Using Convolutional Neural Network (CNN): A Case Study in Montreal, Canada. Int. J. Transp. Sci. Technol. 2022, 11, 298–309. [Google Scholar] [CrossRef]
  6. Li, B.; Wang, K.C.P.; Zhang, A.; Yang, E.; Wang, G. Automatic Classification of Pavement Crack Using Deep Convolutional Neural Network. Int. J. Pavement Eng. 2020, 21, 457–463. [Google Scholar] [CrossRef]
  7. Zou, Q.; Cao, Y.; Li, Q.; Mao, Q.; Wang, S. CrackTree: Automatic Crack Detection from Pavement Images. Pattern Recognit. Lett. 2012, 33, 227–238. [Google Scholar] [CrossRef]
  8. Gopalakrishnan, K.; Khaitan, S.K.; Choudhary, A.; Agrawal, A. Deep Convolutional Neural Networks with Transfer Learning for Computer Vision-Based Data-Driven Pavement Distress Detection. Constr. Build. Mater. 2017, 157, 322–330. [Google Scholar] [CrossRef]
  9. Kirschke, K.R.; Velinsky, S.A. Histogram-Based Approach for Automated Pavement-Crack Sensing. J. Transp. Eng. 1992, 118, 700–710. [Google Scholar] [CrossRef]
  10. Oh, H.; Garrick, N.W.; Achenie, L. Segmentation Algorithm Using Iterative Clipping for Processing Noisy Pavement Images. In Imaging Technologies: Techniques and Applications in Civil Engineering; Second International Conference; Engineering Foundation and Imaging Technologies Committee of the Technical Council on Computer Practices, American Society of Civil Engineers: Reston, VA, USA, 1998; pp. 148–159. [Google Scholar]
  11. Li, Q.; Liu, X. Novel Approach to Pavement Image Segmentation Based on Neighboring Difference Histogram Method. Congr. Image Signal Process. 2008, 2, 792–796. [Google Scholar] [CrossRef]
  12. Zhang, D.; Li, Q.; Chen, Y.; Cao, M.; He, L.; Zhang, B. An Efficient and Reliable Coarse-to-Fine Approach for Asphalt Pavement Crack Detection. Image Vis. Comput. 2017, 57, 130–146. [Google Scholar] [CrossRef]
  13. Yan, M.; Bo, S.; Xu, K.; He, Y. Pavement Crack Detection and Analysis for High-Grade Highway. In Proceedings of the 2007 8th International Conference on Electronic Measurement and Instruments, Xi’an, China, 16–18 August 2007; pp. 4548–4552. [Google Scholar] [CrossRef]
  14. Huidrom, L.; Kumar, L.; Sud, S.K. Method for Automated Assessment of Potholes, Cracks and Patches from Road Surface Video Clips. Procedia-Soc. Behav. Sci. 2013, 104, 312–321. [Google Scholar] [CrossRef]
  15. Kaseko, M.S.; Lo, Z.P.; Ritchie, S.G. Comparison of Traditional and Neural Classifiers for Pavement-Crack Detection. J. Transp. Eng. 1994, 120, 552–569. [Google Scholar] [CrossRef]
  16. Bray, J.; Verma, B.; Li, X.; He, W. A Neural Network Based Technique for Automatic Classification of Road Cracks. In Proceedings of the IEEE International Conference on Neural Networks—Conference Proceedings, Vancouver, BC, Canada, 16–21 July 2006; pp. 907–912. [Google Scholar]
  17. Hoang, N. An Artificial Intelligence Method for Asphalt Pavement Pothole Detection Using Least Squares Support Vector Machine and Neural Network with Steerable Filter-Based Feature Extraction. Adv. Civ. Eng. 2018, 2018, 7419058. [Google Scholar] [CrossRef]
  18. Munawar, H.S.; Hammad, A.W.A.; Haddad, A.; Soares, C.A.P.; Waller, S.T. Image-Based Crack Detection Methods: A Review. Infrastructures 2021, 6, 1–20. [Google Scholar] [CrossRef]
  19. Some, L. Automatic Image-Based Road Crack Detection Methods; Kth Royal Institute of Technologie: Stockholm, Sweden, 2016. [Google Scholar]
  20. Maeda, H.; Sekimoto, Y.; Seto, T.; Kashiyama, T.; Omata, H. Road Damage Detection and Classification Using Deep Neural Networks with Smartphone Images. Comput. Civ. Infrastruct. Eng. 2018, 33, 1127–1141. [Google Scholar] [CrossRef]
  21. Wang, D.; Liu, Z.; Gu, X.; Wu, W.; Chen, Y.; Wang, L. Automatic Detection of Pothole Distress in Asphalt Pavement Using Improved Convolutional Neural Networks. Remote Sens. 2022, 14, 3892. [Google Scholar] [CrossRef]
  22. Maslan, J.; Cicmanec, L. A System for the Automatic Detection and Evaluation of the Runway Surface Cracks Obtained by Unmanned Aerial Vehicle Imagery Using Deep Convolutional Neural Networks. Appl. Sci. 2023, 13, 6000. [Google Scholar] [CrossRef]
  23. Chun, P.J.; Yamane, T.; Tsuzuki, Y. Automatic Detection of Cracks in Asphalt Pavement Using Deep Learning to Overcome Weaknesses in Images and Gis Visualization. Appl. Sci. 2021, 11, 892. [Google Scholar] [CrossRef]
  24. Nhat-Duc, H.; Nguyen, Q.L.; Tran, V.D. Automatic Recognition of Asphalt Pavement Cracks Using Metaheuristic Optimized Edge Detection Algorithms and Convolution Neural Network. Autom. Constr. 2018, 94, 203–213. [Google Scholar] [CrossRef]
  25. Aggarwal, C.C. Neural Networks and Deep Learning; Springer International Publishing AG: Cham, Switzerland, 2018; ISBN 9783319944623. [Google Scholar]
  26. Aggarwal, C.C. Teaching Deep Learners to Generalize. In Neural Networks and Deep Learning; Springer International Publishing AG: Cham, Switzerland, 2018; pp. 169–216. ISBN 9783319944630. [Google Scholar]
  27. Aggarwal, C.C. Training Deep Neural Networks. In Neural Networks and Deep Learning; Springer International Publishing AG: Cham, Switzerland, 2018; pp. 105–167. ISBN 9783319944630. [Google Scholar]
  28. Hoang, N. Automatic Recognition of Asphalt Pavement Cracks Based on Image Processing and Machine Learning Approaches: A Comparative Study on Classifier Performance. Math. Probl. Eng. 2018, 2018, 6290498. [Google Scholar] [CrossRef]
Figure 1. Map presenting the roads surveyed.
Figure 2. Canon EOS Rebel T8i camera.
Figure 3. Alligator crack.
Figure 4. Longitudinal crack.
Figure 5. Raveling.
Figure 6. Patching.
Figure 7. Resized picture of alligator cracks.
Figure 8. Resized picture of longitudinal cracks.
Figure 9. Resized picture of raveling.
Figure 10. Resized picture of patching.
Figure 11. Example of convolution.
Figure 12. Example of MaxPooling.
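Figures 11 and 12 illustrate the two elementary CNN operations. A minimal sketch of both, using a toy 4 × 4 image and a hypothetical 2 × 2 kernel (not the values from the figures):

```python
# Toy illustration of the operations shown in Figures 11 and 12:
# a valid (no-padding) 2-D convolution and non-overlapping max pooling.

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation, as used in CNN layers)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)] for i in range(out_h)]

def max_pool(image, size=2):
    """Max pooling with a non-overlapping size x size window."""
    return [[max(image[i + a][j + b]
                 for a in range(size) for b in range(size))
             for j in range(0, len(image[0]) - size + 1, size)]
            for i in range(0, len(image) - size + 1, size)]

img = [[1, 2, 0, 1],
       [3, 1, 2, 0],
       [0, 2, 1, 3],
       [1, 0, 2, 1]]
edge = [[1, 0], [0, -1]]        # illustrative 2x2 kernel

fmap = conv2d(img, edge)        # 3x3 feature map
pooled = max_pool(img)          # 2x2 pooled map: [[3, 2], [2, 3]]
```

A 4 × 4 input convolved with a 2 × 2 kernel yields a 3 × 3 feature map, and 2 × 2 pooling halves each spatial dimension, exactly the size changes the CNN layers in Table 3 rely on.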
Figure 13. The architecture of a Convolutional Neural Network (LeNet-5).
Figure 14. The structure of the VGG 16 model.
Figure 15. Training graph of accuracy for scenario 1.
Figure 16. Training graph of loss for scenario 1.
Figure 17. Training graph of loss for scenario 2.
Figure 18. Training graph of accuracy for scenario 2.
Figure 19. Training graph of accuracy for scenario 3.
Figure 20. Training graph of loss for scenario 3.
Figure 21. Training graph of loss for scenario 4.
Figure 22. Training graph of accuracy for scenario 4.
Figure 23. Training graph of accuracy for scenario 5.
Figure 24. Training graph of loss for scenario 5.
Figure 25. Training graph of accuracy for scenario 6.
Figure 26. Training graph of loss for scenario 6.
Figure 27. Training graph of accuracy for scenario 7.
Figure 28. Training graph of loss for scenario 7.
Figure 29. Training graph of accuracy for scenario 8.
Figure 30. Training graph of loss for scenario 8.
Figure 31. Evolution of accuracy and loss across the scenarios.
Figure 32. Confusion matrix of scenario 1.
Figure 33. Confusion matrix of scenario 2.
Figure 34. Confusion matrix of scenario 3.
Figure 35. Confusion matrix of scenario 4.
Figure 36. Confusion matrix of scenario 5.
Figure 37. Confusion matrix of scenario 6.
Figure 38. Confusion matrix of scenario 7.
Figure 39. Confusion matrix of scenario 8.
Figure 40. Confusion matrix of the best model on the test set (97.3% accuracy).
Table 1. Constitution of the datasets.

| Dataset   | Image Size | Alligator Crack | Longitudinal Crack | Raveling | Patching | Total |
|-----------|------------|-----------------|--------------------|----------|----------|-------|
| Dataset 1 | 150 × 650  | 50              | 50                 | 0        | 0        | 100   |
| Dataset 2 | 480 × 270  | 140             | 121                | 38       | 70       | 369   |
| Dataset 3 | 480 × 270  | 270             | 285                | 35       | 100      | 690   |
| Dataset 4 | 480 × 270  | 323             | 364                | 51       | 129      | 867   |
Table 2. Repartition of the datasets.

| Dataset   | Training Set | Validation Set | Test Set | Total |
|-----------|--------------|----------------|----------|-------|
| Dataset 1 | 65           | 15             | 20       | 100   |
| Dataset 2 | 236          | 59             | 74       | 369   |
| Dataset 3 | 441          | 111            | 128      | 690   |
| Dataset 4 | 554          | 138            | 174      | 867   |
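The repartition in Table 2 corresponds to roughly a 64/16/20% train/validation/test split. A minimal sketch of how such a split could be produced (the function name, file names, and seed are illustrative, not the authors' code):

```python
# Hypothetical reproduction of the roughly 64/16/20 train/validation/test
# split used in Table 2, shown here for Dataset 4 (867 images).
import random

def split_dataset(items, train_frac=0.64, val_frac=0.16, seed=42):
    """Shuffle a list of image paths and cut it into three subsets."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train_frac)
    n_val = int(len(items) * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

images = [f"img_{i:04d}.jpg" for i in range(867)]  # placeholder names
train, val, test = split_dataset(images)
```

With these fractions the split comes out as 554/138/175, essentially the 554/138/174 counts reported for Dataset 4 (the last image falls on one side or the other depending on rounding).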
Table 3. Architecture and training configuration of the eight scenarios (BN = batch normalization; each convolutional block contains two identical convolutional layers, written "Convolutional × 2").

|                              | Scenario 1      | Scenario 2          | Scenario 3          | Scenario 4                | Scenario 5      | Scenario 6      | Scenario 7      | Scenario 8      |
|------------------------------|-----------------|---------------------|---------------------|---------------------------|-----------------|-----------------|-----------------|-----------------|
| Epochs                       | 10              | 10                  | 30                  | 30                        | 50              | 70              | 150             | 200             |
| Input shape                  | 150 × 650       | 150 × 650           | 270 × 480           | 270 × 480                 | 270 × 480       | 270 × 480       | 270 × 480       | 270 × 480       |
| Convolutional × 2            | 32, 3 × 3, ReLU | 32, 3 × 3, ReLU, BN | 32, 3 × 3, ReLU, BN | 32, 3 × 3, Leaky ReLU, BN | 32, 3 × 3, ELU  | 32, 3 × 3, ELU  | 32, 3 × 3, ELU  | 32, 3 × 3, ELU  |
| Regularization (per layer)   | –               | –                   | –                   | 16, L2                    | 16, L2          | 16, L2          | 16, L2          | 16, L2          |
| MaxPooling                   | 2 × 2           | 2 × 2               | 2 × 2               | 2 × 2                     | 2 × 2           | 2 × 2           | 2 × 2           | 2 × 2           |
| Regularization (after pool)  | 16, L2          | 16, L2              | 16, L2              | –                         | –               | –               | –               | –               |
| Convolutional × 2            | 64, 3 × 3, ReLU | 64, 3 × 3, ReLU, BN | 64, 3 × 3, ReLU, BN | 64, 3 × 3, Leaky ReLU, BN | 64, 3 × 3, ELU  | 64, 3 × 3, ELU  | 64, 3 × 3, ELU  | 64, 3 × 3, ELU  |
| Regularization (per layer)   | –               | –                   | –                   | 16, L2                    | 16, L2          | 16, L2          | 16, L2          | 16, L2          |
| MaxPooling                   | 2 × 2           | 2 × 2               | 2 × 2               | 2 × 2                     | 2 × 2           | 2 × 2           | 2 × 2           | 2 × 2           |
| Regularization (after pool)  | 16, L2          | 16, L2              | 16, L2              | –                         | –               | –               | –               | –               |
| Convolutional × 2            | 128, 3 × 3, ReLU | 128, 3 × 3, ReLU, BN | 128, 3 × 3, ReLU, BN | 128, 3 × 3, Leaky ReLU, BN | 128, 3 × 3, ELU | 128, 3 × 3, ELU | 128, 3 × 3, ELU | 128, 3 × 3, ELU |
| Regularization (per layer)   | –               | –                   | –                   | 16, L2                    | 16, L2          | 16, L2          | 16, L2          | 16, L2          |
| MaxPooling                   | 2 × 2           | 2 × 2               | 2 × 2               | 2 × 2                     | 2 × 2           | 2 × 2           | 2 × 2           | 2 × 2           |
| Regularization (after pool)  | 16, L2          | 16, L2              | 16, L2              | –                         | –               | –               | –               | –               |
| Convolutional × 2            | –               | –                   | –                   | –                         | 256, 3 × 3, ELU | 256, 3 × 3, ELU | 256, 3 × 3, ELU | 256, 3 × 3, ELU |
| Regularization (per layer)   | –               | –                   | –                   | –                         | 16, L2          | 16, L2          | 16, L2          | 16, L2          |
| MaxPooling                   | –               | –                   | –                   | –                         | 2 × 2           | 2 × 2           | 2 × 2           | 2 × 2           |
| Convolutional × 2            | –               | –                   | –                   | –                         | –               | –               | 512, 3 × 3, ELU | 512, 3 × 3, ELU |
| Regularization (per layer)   | –               | –                   | –                   | –                         | –               | –               | 16, L2          | 16, L2          |
| MaxPooling                   | –               | –                   | –                   | –                         | –               | –               | 2 × 2           | 2 × 2           |
| Flattening                   | ✓               | ✓                   | ✓                   | ✓                         | ✓               | ✓               | ✓               | ✓               |
| Fully connected              | 256, ReLU       | 256, ReLU           | 512, ReLU           | 512, ReLU                 | 512, ELU        | 512, ELU        | 512, ELU        | 512, ELU        |
| Dropout                      | 0.5             | 0.5                 | 0.5                 | 0.3                       | 0.3             | 0.3             | 0.3             | 0.4             |
| Output                       | 2, Softmax      | 2, Softmax          | 4, Softmax          | 4, Softmax                | 4, Softmax      | 4, Softmax      | 4, Softmax      | 4, Softmax      |
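The deepest configuration in Table 3 (scenarios 7 and 8) stacks five conv + pool blocks on a 270 × 480 input. Assuming 'same'-padded 3 × 3 convolutions (so only the 2 × 2 pooling changes the spatial size, an assumption, since the table does not state the padding), the feature-map sizes can be sanity-checked:

```python
# Back-of-the-envelope feature-map sizes for scenarios 7-8 of Table 3:
# 270x480 input, five blocks of 3x3 convolutions + 2x2 max pooling.

def feature_map_sizes(height, width, block_channels):
    """Return (height, width, channels) after each conv+pool block,
    assuming 'same'-padded convolutions (spatial size unchanged) and
    2x2 pooling (spatial size halved, floor division)."""
    sizes = []
    for channels in block_channels:
        height, width = height // 2, width // 2
        sizes.append((height, width, channels))
    return sizes

sizes = feature_map_sizes(270, 480, [32, 64, 128, 256, 512])
# After the five blocks: (8, 15, 512)
flattened = sizes[-1][0] * sizes[-1][1] * sizes[-1][2]  # 61440 values
```

The flattened vector of 61,440 values is what feeds the 512-unit fully connected layer before the 4-class softmax output.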
Table 4. Values of accuracy and loss during training.

|          | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 | Scenario 5 | Scenario 6 | Scenario 7 | Scenario 8 |
|----------|------------|------------|------------|------------|------------|------------|------------|------------|
| Accuracy | 43.3%      | 36.7%      | 36.0%      | 39.2%      | 62.2%      | 86.5%      | 94.6%      | 95.9%      |
| Loss     | 3.49       | 2.35       | 3.18       | 1.61       | 2.26       | 1.86       | 2.70       | 2.46       |
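The loss values in Table 4 come from the categorical cross-entropy of the softmax output layer. A minimal sketch for a single prediction (the probability vector below is illustrative, not taken from the experiments):

```python
# Categorical cross-entropy of one softmax prediction: the negative
# log-likelihood of the true class, as reported as "Loss" in Table 4.
import math

def cross_entropy(probs, true_index):
    """Loss for one sample given its softmax probabilities."""
    return -math.log(probs[true_index])

# Hypothetical softmax output over the four distress classes:
p = [0.05, 0.85, 0.05, 0.05]   # model favors class 1 (longitudinal crack)
loss = cross_entropy(p, 1)     # -ln(0.85), a small loss
```

A confident correct prediction gives a loss near zero, while probability mass placed on the wrong class drives it up, which is why loss keeps falling in the later scenarios even after accuracy plateaus.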
Table 5. CNN models performance comparison.

| Model             | Accuracy | Loss |
|-------------------|----------|------|
| CNN 1             | 95.6%    | 2.70 |
| CNN 2             | 95.9%    | 2.46 |
| CNN 3             | 97.3%    | 0.49 |
| ResNet 101        | 91.9%    | 0.27 |
| VGG 16            | 77.0%    | 0.71 |
| Maslan et al. [22] | 88.7%   | 0.26 |
| Hoang et al. [24] | 92.1%    | 5.25 |
| Hoang N. [28]     | 95.3%    | 3.46 |
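The accuracies in Table 5 and Figure 40 are read off confusion matrices: overall accuracy is the sum of the diagonal (correct predictions) divided by the total count. A sketch with an illustrative 4 × 4 matrix (not the paper's actual values):

```python
# How a confusion matrix yields the overall accuracy reported in
# Table 5 and Figure 40: trace divided by total sample count.

def accuracy(confusion):
    """Overall accuracy from a square confusion matrix (rows = true class)."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

cm = [[40, 1, 0, 1],   # alligator crack
      [2, 45, 0, 0],   # longitudinal crack
      [0, 0, 12, 1],   # raveling
      [1, 0, 0, 19]]   # patching
acc = accuracy(cm)     # 116 correct out of 122 samples
```

Per-class precision and recall follow the same pattern, dividing each diagonal entry by its column or row sum respectively.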
Share and Cite

Yabi, C.P.; Gbehoun, G.F.; Tamou, B.C.K.; Alamou, E.; Gibigaye, M.; Farsangi, E.N. Analysis and Classification of Distress on Flexible Pavements Using Convolutional Neural Networks: A Case Study in Benin Republic. Infrastructures 2025, 10, 111. https://doi.org/10.3390/infrastructures10050111