Pothole Classification Model Using Edge Detection in Road Image

Abstract: Road damage images contain many kinds of objects, such as potholes, cracks, shadows, and lanes, which makes it difficult to detect one specific object. In this paper, we propose a pothole classification model using edge detection in road images. The proposed method converts RGB (red, green, and blue) image data, including potholes and other objects, to gray-scale to reduce the amount of computation. It then detects all objects other than potholes using an object detection algorithm. The detected objects are removed, and a pixel value of 255 is assigned to them so that they are treated as background. Next, to extract the characteristics of a pothole, its contour is extracted through edge detection. Finally, potholes are detected and classified using the YOLO (you only look once) algorithm. The performance evaluation covers the distortion rate and restoration rate of the image, the validity of the model, and the accuracy of the classification. The results show that the mean square error (MSE) of the distortion rate and restoration rate of the proposed method is 0.2–0.44, the peak signal-to-noise ratio (PSNR) is 50 dB or higher, and the structural similarity index map (SSIM) is 0.71–0.82. In addition, the pothole classification achieves an area under the curve (AUC) of 0.9.


Introduction
Computer vision is a technology that extracts useful information by feeding visual data into a computer and analyzing it [1,2]. As information technology develops, it is being applied, through deep learning on images and videos, in fields such as robotics, medical treatment, security, games, and road transportation. The goal of computer vision is to extract information of interest using pattern recognition, statistical learning, and projective geometry through object detection, segmentation, and recognition in images [3,4]. Object detection is a technology that distinguishes and identifies a specific object from the background in visual data such as images and videos [5]. With a CNN (Convolutional Neural Network), both regression and classification for object detection are possible. Efficient methods for object detection include the region proposal method and the method of finding objects with a predetermined location and size [6]. The region proposal method uses an RPN (Region Proposal Network) that selectively searches for regions that are likely to contain an object. Algorithms of this type include R-CNN, Fast-R-CNN, and Faster-R-CNN; they offer high accuracy but have the disadvantage of very slow processing. The second method predicts, for each region, a fixed number of objects whose shape and size are defined in advance. Algorithms of this type, such as YOLO (you only look once) and SSD (Single Shot Detector), process images quickly and are therefore used in fields where real-time detection is required [7,8]. However, although YOLO is fast, its classification accuracy is low and needs to be supplemented.
Segmentation, on the other hand, is a technique that analyzes the objects in an image pixel by pixel and precisely determines the position of each object [9]. Segmentation is divided into semantic segmentation and instance segmentation [10,11]. Semantic segmentation classifies all objects at the pixel level [10]. Instance segmentation classifies individual objects by adding object detection on top of semantic segmentation [8]. Algorithms used for image segmentation include FCN (Fully Convolutional Network) and Mask-R-CNN. These have the disadvantage of very slow processing.
Potholes are caused by factors such as climate change, traffic volume, road aging, and vehicle weight. In the context of road damage, a pothole refers to a damaged portion of the road surface [12], and it can cause traffic accidents. According to data from the Ministry of Land, Infrastructure and Transport, about 650,000 cases of pothole-related road damage were recorded in Korea in 2019 [13]. These resulted in 654 accidents involving personal injury, 2 deaths, 5,153 accidents involving property damage, and 1 trillion won in road repair costs. A pothole is a serious risk factor for the safety of drivers and passengers and for driving stability. Therefore, there is a need for a method that can detect road conditions at various locations and identify related information for efficient management and maintenance. Current methods for detecting potholes include detection using an acceleration sensor, scanning using a laser, and detection using images [14]. Detection using an acceleration sensor has low accuracy because speed bumps may also be recognized as potholes [15]. The laser scanning method can measure general cracks as well as potholes, but its cost is very high [16]. Image-based detection can cover a large area at a lower cost than the acceleration sensor and laser methods [17]. In images, however, it is difficult to detect potholes because of factors such as lighting conditions and image quality. A pothole detection technology that takes these factors into account is therefore needed.
This study therefore proposes a pothole classification model using edge detection in road images. The proposed method first digitizes the image pixel by pixel. After that, all objects other than the pothole are detected using an object detection algorithm, and the detected objects are removed and filled with the value 255 so that they are treated as background. The features are then extracted by detecting the edge of the pothole. Lastly, potholes are detected, classified, and predicted based on YOLO. The performance evaluation is conducted in three ways. The first evaluates the distortion rate and restoration rate of the image data using MSE, PSNR, and SSIM. The second evaluates the validity of the model and the accuracy of the pothole classification results. The third compares the classification accuracy with that of existing models.
The remainder of this paper is organized as follows. Section 2 describes trends in pothole prediction technology and image object detection and segmentation algorithms. Section 3 describes the pothole classification model using edge detection in road images. Section 4 describes the performance evaluation of the proposed method. Finally, Section 5 gives the conclusions.

Image Object Detection Algorithm
As artificial intelligence and vision technology continue to develop, they are driving innovation in various fields. In particular, artificial intelligence based on visual data such as videos and images is used, through object recognition and detection, for cancer diagnosis, joint diagnosis, and terrain analysis, helping to improve lives. Visual data analysis based on CNN (Convolutional Neural Network) can classify images, detect objects, and generate images. A CNN aims to reduce the complexity of the model and extract significant features by applying a convolution operation [17]. In visual data, object detection is a technique that finds candidate regions for a detection target in order to recognize a specific target and predict the type and location (bounding box) of the object [18,19]. Algorithms for this include R-CNN [20], Fast-R-CNN [21], Faster-R-CNN [22], and YOLO [23].
R-CNN (Regions with Convolutional Neural Networks) detects objects in three stages. The first stage is region proposal, in which regions are extracted from the image without classifying the target class. The second stage is feature vector extraction: each proposed region is cropped from the original image, resized to the same size, and passed through a CNN to extract features. In the last stage, the features are classified. Because R-CNN runs the CNN once for every region proposal in each image, it has the disadvantages of very slow speed and high complexity [20]. Fast-R-CNN improves on the speed problem of R-CNN by using RoI Pooling (Region of Interest Pooling). Instead of performing the convolution operation for every region proposal, it applies convolution only once to the input image and then uses RoI Pooling to extract the features of each region from the computed feature map in order to identify objects [21]. Faster-R-CNN generates region proposals inside the convolutional network itself, which reduces the time required and directly produces accurate region proposals. In other words, Faster-R-CNN is an algorithm that adds a region proposal network to Fast-R-CNN [22].
YOLO is an algorithm that predicts the type and location of an object by looking at the image only once, treating the bounding boxes and class probabilities of the image as a single regression problem. Because the process is simple and fast, it is suitable for analyzing real-time visual data. It also makes it easy to determine what objects exist in the image and where they are located with a single pass over the data. Since YOLO learns the general characteristics of objects, it can make predictions even when new data is entered [23].
S. Lu et al. [24] proposed a real-time object detection algorithm for video. The proposed algorithm trains a Fast-YOLO model, preprocesses the image to remove the background, and obtains object information. In addition, it reduces the number of parameters by improving the existing YOLO with convolution operations based on the GoogLeNet structure, which reduces the time required for object detection. T. Gong et al. [25] proposed a multi-label classification method to improve object detection. To improve object detection using R-FCN, the proposed method uses multi-label classification with end-to-end training and testing. The classification problem is addressed using the approximate location information of the object, obtained from features generated with an attention mechanism. The box-level features and the label features are then combined to improve the accuracy of object detection. Figure 1 shows the image object detection process.

Pothole Prediction Technology Trends in Road Damage
In road damage, potholes are caused by road aging, climate change, and traffic volume. They are a major cause of tire damage, vehicle damage, traffic accidents, and traffic jams. In addition, potholes are hard to see at night, which slows the driver's response and increases the severity of traffic accidents. A method that enables maintenance through continuous monitoring is therefore required. Pothole detection methods include the vibration method [26], the laser scanning method [27], and video and image processing methods [15,28]. The vibration method detects a pothole by analyzing the pattern of shaking that occurs when a vehicle passes over it. This allows the system to be built at a low cost. However, shaking also occurs when passing over speed bumps, manholes, and obstacles, so the accuracy of pothole recognition is low [26]. The laser scanning method finds a pothole by detecting the shape of the road surface using a laser scanner. It is more accurate than the vibration method, but its scalability is low because of its high cost [27]. The video recognition method can quickly cover a wide area at a low cost. Compared with a flat road surface, potholes have cracks, darker regions, and features that stand out from their surroundings. However, this method requires a high-performance algorithm that can accurately recognize potholes in videos affected by noise factors such as video quality and texture [28,29]. As a result, methods and techniques for detecting potholes using deep learning have been developed in recent years.
In Korea, Y. T. Jo et al. [29] proposed a saliency map based algorithm to improve the accuracy of pothole detection. The proposed algorithm is composed of a pothole candidate extraction section and a decision section based on the saliency map. A saliency map finds regions that are more visually prominent than their surroundings in an image [29,30]. The candidate extraction section separates the suspected pothole region from the road surface and marks it as a candidate region. The decision section determines whether the region extracted as a candidate is an actual pothole. In this way, the features of a pothole on the road surface can be extracted and detected. Outside Korea, C. Chen et al. [31] developed a road damage detection system that uses big data analysis of multiple sensors to monitor road damage conditions such as potholes. This method detects road damage using time and frequency information. The data set is captured using accelerometers, cameras, and recording devices on a mobile device. The Morlet wavelet algorithm and the STFT (Short-Time Fourier Transform) algorithm, both commonly used in the field of audio recognition, are applied to the collected data set to find the signal frequency of the damaged road through acceleration signal processing. Road damage can then be detected at the frequency found.

Image Pre-Processing Using Object Detection
Road damage images contain many objects such as people, cars, trees, street lights, and buildings. In addition, processing a multi-channel image such as RGB is computationally expensive. A preprocessing step is therefore necessary for efficient analysis. Methods for image preprocessing include gray-scale conversion, binarization, enlargement and reduction, and rotation and transformation. The gray-scale method reduces the amount of data by converting a multi-channel image such as RGB into a single channel. This conversion does not significantly damage the shapes in the image [32]. The binarization method converts each pixel to black or white according to a threshold, which greatly reduces the amount of data [33]. The enlargement and reduction methods adjust the range by enlarging or reducing the image when it contains too much or too little data [34]. The rotation and transformation methods are used when the object to be detected in the image is not in the expected form [35].
A pothole has no distinctive RGB color. In road surface data, however, there are various objects such as trees and cars around the pothole. If all of them are converted to black or white, it becomes difficult to analyze the features of each one. The pothole images used in this study are therefore converted into a single channel using gray-scale as the preprocessing method to minimize the loss. The pothole images used are data collected directly from the road. The conversion is done using the OpenCV library. OpenCV (Open Source Computer Vision) is a library that provides various algorithms related to computer vision and machine learning [36]. The converted gray-scale image is then represented as a matrix of pixel values. Next, an object detection algorithm, YOLO, is used to detect objects in the image. YOLO can extract features from real-time data in a single pass using a single network, create bounding boxes, and classify classes [23]. The detected objects are removed by replacing their pixel values with 255 to create a background. Because both the road and the pothole are gray, replacing the background with 255 makes the road and the pothole stand out for detection.
Figure 2 shows the preprocessing process using object detection. In Figure 2, the preprocessing proceeds in two stages. In the first stage, the image data for analysis is RGB data with 1,098,000 values (732 × 500 pixels in 3 channels). Gray-scale conversion reduces this to 366,000 pixel values, which is 1/3 of the original size, and represents the image as a two-dimensional matrix of size (732, 500). The second stage detects objects using the YOLO algorithm in the gray-scale data. Among the detected objects, those other than the pothole are removed by changing their pixel values to 255 to create a background. In this way, objects other than the pothole are removed through preprocessing.
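As a rough illustration of the two preprocessing stages, the following pure-Python sketch converts a small RGB image to gray-scale and fills detected boxes with 255. It is not the authors' OpenCV and YOLO pipeline: the luminance weights 0.299, 0.587, and 0.114 are a common convention for RGB-to-gray conversion and are assumed here, and the (x, y, width, height) boxes are hypothetical stand-ins for detector output.

```python
def to_grayscale(rgb):
    """Convert an H x W image of (R, G, B) tuples into an H x W list of ints."""
    # Assumed luminance weights; the paper does not state which weights it uses.
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb]


def mask_boxes(gray, boxes, background=255):
    """Return a copy of gray in which every (x, y, w, h) box is set to background."""
    out = [row[:] for row in gray]
    height, width = len(out), len(out[0])
    for x, y, w, h in boxes:
        # Clip each box to the image so that partial boxes are handled safely.
        for yy in range(max(0, y), min(height, y + h)):
            for xx in range(max(0, x), min(width, x + w)):
                out[yy][xx] = background
    return out


# Tiny 2 x 3 example image; three channels collapse to one value per pixel.
rgb = [[(10, 10, 10), (200, 0, 0), (0, 0, 0)],
       [(90, 90, 90), (0, 200, 0), (255, 255, 255)]]
gray = to_grayscale(rgb)

# Hypothetical detection: the middle column is a non-pothole object.
cleaned = mask_boxes(gray, boxes=[(1, 0, 1, 2)])
```

The same counting explains the figures in the text: 732 × 500 × 3 = 1,098,000 values before conversion and 732 × 500 = 366,000 after it.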

Feature Extraction of Road Damage Using Edge Detection
Road damage has various causes, such as weather, traffic volume, vehicles, and road pavement conditions. There are various types of road damage, such as cracks, potholes, sinkholes, worn-out lane markings and crosswalks, and surface deterioration. In particular, the degree of danger depends on the depth and size of the pothole. Small potholes can often be passed over without consequence. Large potholes, on the other hand, damage the tires of passing vehicles and increase the risk of accidents. It is therefore necessary to extract and handle the characteristics of potholes. To this end, detection is performed along the edge of the pothole and its size is estimated. An edge is the boundary between two areas with different brightness. The types of edges include the roof edge [37], line edge [38], step edge [39], and ramp edge [40]. The roof edge is a place where the brightness of the image changes gradually and then returns at a certain point [37]. The line edge is a place where the brightness changes suddenly at a point but then returns [38]. The step edge is a place where the brightness changes suddenly [39]. The ramp edge is a place where the brightness changes gradually [40]. Edge detection extracts the pixels that correspond to an edge by computing the gradient through partial differentiation [41]. The first derivative identifies the existence of an edge in the image through the magnitude of the gradient. The second derivative identifies which side of the edge is light and which is dark through its sign. In road damage image data, the pothole is darker than its surroundings. It is therefore possible to extract features in real time through edge detection.
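The roles of the first and second derivatives can be seen on a single row of pixels. The sketch below is a minimal illustration using finite differences on a hypothetical row in which a bright road (200) meets a dark pothole (50); the function names and the sample values are assumptions made for this example.

```python
def first_difference(signal):
    """Discrete first derivative: difference between neighboring pixels."""
    return [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]


def second_difference(signal):
    """Discrete second derivative, computed on interior pixels only."""
    return [signal[i - 1] - 2 * signal[i] + signal[i + 1]
            for i in range(1, len(signal) - 1)]


# Hypothetical step edge: bright road surface followed by a darker pothole.
row = [200, 200, 200, 50, 50, 50]

gradient = first_difference(row)    # large magnitude marks where the edge is
curvature = second_difference(row)  # sign flips from the bright to the dark side
```

In this example the first difference is nonzero only at the transition, and the second difference changes sign across it, which is how the bright and dark sides of the edge are told apart.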
To extract the edge, we use a mask that acts as a differential operator. A mask is a filter that meets certain mathematical conditions and has the same effect as a differential operator. Its characteristic is that the sum of all coefficients in the mask is 0. Such filters are used for noise reduction, sharpening, edge detection, and similar tasks in image processing. The masks include the Sobel mask [42], Prewitt mask [43], Robert mask [44], Laplacian mask [45], and Canny mask [46]. The Sobel mask is very sensitive to brightness changes and detects even noise as an edge [42]. The Prewitt mask gives results similar to those of the Sobel mask. Although it computes faster than Sobel, it weights the brightness boundary differently, so the detected edges are relatively weak [43]. The Robert mask is very sensitive to noise and detects only edges with distinct characteristics [44]. The Laplacian mask detects edges with sharp features. In addition, it is fast to compute and can detect edges in all directions [45]. The Canny mask is robust to noise and has the advantage of combining various edge detection masks. However, several masks must be used because only clear edges are detected [46].
To select the filter that extracts the pothole contour most accurately, the Sobel, Prewitt, Robert, Laplacian, and Canny masks are evaluated using PSNR, and the mask with the best average value is used as the filter of the model. Table 1 shows the PSNR results according to the mask type. In Table 1, the average PSNR of each mask is computed over four sample images. The resulting PSNR values are 49.672 for Sobel, 50.156 for Prewitt, 46.852 for Robert, 54.859 for Laplacian, and 52.020 for Canny. The Laplacian mask yields the highest PSNR and is therefore used as the filter to extract the pothole geometry. In the extracted edge, if the position, shape, or size of the object changes, the characteristics of the object are found less accurately. Feature points are therefore extracted from the extracted edge. Feature points can be detected easily even if the shape, size, or position of the object changes, and even when the camera position and lighting change.
Figure 3 shows the process of extracting features of road damage using edge detection. In Figure 3, feature extraction proceeds in two stages. In the first stage, the partial differential operator is used. Since an edge is a part where the brightness changes, differentiation is used to extract the edge according to the change in gradient. In the second stage, second-order differentiation is performed using a mask. The Laplacian mask is used to extract the edge in all directions around the pothole, and the features of the edge are identified from the extracted edge. Filter sizes of 1×1, 2×2, 3×3, 4×4, and 5×5 are evaluated, and the size with the highest accuracy is used. In general, the size of a filter is odd. An odd-sized filter has a center pixel, so the filter coefficients are applied symmetrically and the output is a single value that represents the filter window. With an even-sized filter, on the other hand, the results are asymmetric, so the computational cost of feature extraction is very high and feature learning is difficult. In this paper, however, all sizes from 1×1 to 5×5 are compared in order to observe how accuracy changes as the filter size gradually increases. The 1×1 filter corresponds to using the original image.
Figure 4 shows the accuracy according to the filter size using the Laplacian mask. The vertical axis represents accuracy, and the horizontal axis represents the filter size. According to Figure 4, accuracy is high for the 1×1 and 3×3 filters among the five. The 1×1 filter represents the result when the original image is used, which shows an accuracy of about 75% even without applying a filter. For the 4×4 and 5×5 sizes, however, accuracy decreases. For 5×5, the filter is judged to be too large to capture detailed features. For the even-sized 2×2 and 4×4 filters, asymmetry is judged to have caused errors in learning and in feature extraction. When the filter size is 3×3, accuracy peaks. The pothole features are therefore extracted most effectively with a 3×3 filter, and the model uses this size.
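To make the 3×3 Laplacian step concrete, the sketch below slides a mask over a small gray-scale image in pure Python. The coefficients shown are the common 4-neighbor Laplacian, which is an assumption: the text states only that a 3×3 Laplacian mask is used, not its exact values. The coefficients sum to 0, as required of a mask, so flat regions produce no response.

```python
# Assumed 4-neighbor Laplacian coefficients; they sum to zero.
LAPLACIAN_3X3 = [[0, 1, 0],
                 [1, -4, 1],
                 [0, 1, 0]]


def apply_mask(image, mask):
    """Slide a square mask over the image and return responses for interior pixels."""
    k = len(mask)
    height, width = len(image), len(image[0])
    out = []
    for y in range(height - k + 1):
        row = []
        for x in range(width - k + 1):
            # Weighted sum of the k x k window whose top-left corner is (x, y).
            row.append(sum(mask[dy][dx] * image[y + dy][x + dx]
                           for dy in range(k) for dx in range(k)))
        out.append(row)
    return out


# Hypothetical 5 x 5 road patch (brightness 200) with one dark pixel (50).
image = [[200] * 5 for _ in range(5)]
image[2][2] = 50

edges = apply_mask(image, LAPLACIAN_3X3)
```

The response is zero over the flat road, strongly positive at the dark pixel, and negative at its four neighbors, so the sign pattern outlines the dark region in every direction.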

Pothole Classification Model Using Edge Detection
To classify potholes from road damage image data, we construct a pothole detection model that extracts various features of potholes, such as points, lines, and areas, using YOLO based on CNN. The Ministry of Land, Infrastructure and Transport defines a pothole as a jar-shaped breakage in which part of the asphalt pavement surface is recessed and detached [13]. YOLO searches the entire image at once, so computation is fast and accurate recognition is possible [47]. However, it has the disadvantage of low classification accuracy. To solve this problem, a CNN is additionally used in the network to learn and recognize features accurately and thereby improve the accuracy of pothole classification. Classification produces results based on the presence or absence of potholes in the image data. Figure 5 shows the block diagram of the proposed pothole classification model using edge detection in road images. The road damage image data analyzed by the model in Figure 5 has a size of (486, 353). YOLO consists of 24 convolutional layers and 2 FC (Fully Connected) layers. To extract the edge features of the pothole, the filter used in convolution is constructed using the Laplacian mask, which extracts the edge of the pothole in all directions and is fast to compute. It also approximates the second-order derivative of the pixel values in the image. YOLO is trained with a multi-part loss function [47], which is represented by Equation (1). $S$ represents the size of the grid, which is a matrix, so the total number of grid cells is $S \times S = S^2$. CS stands for confidence score. $\lambda_{coord}$ is a balance parameter that balances the loss for the coordinates (k, l, m, n) against the other losses. $\lambda_{noobj}$ is a balance parameter for the balance between boxes with and without an object. It is needed because, in a general image, the number of cells without objects is relatively larger than the number of cells with objects.
Equation (1) is computed as follows. The first stage computes the loss of k and l for the j-th predictor bounding box of the i-th grid cell in which an object is present. The choice of predictor contributes to improving overall recall. In the second stage, the loss of m and n is calculated for the j-th predictor bounding box of the i-th grid cell in which an object is present; the sum-squared error is calculated after taking the square root so that deviations in large boxes are weighted less than the same deviations in small boxes. In the third stage, the loss of the confidence score is calculated for the j-th predictor bounding box of the i-th grid cell in which an object exists. The fourth stage calculates the loss of the confidence score for the j-th predictor bounding box of the i-th grid cell in which there is no object. In the last stage, the loss of the conditional class probability is calculated for the grid cells in which an object exists [47]. An optimized model is generated in this way. Features are extracted by training the generated model on image data, and classification is performed according to the presence or absence of a pothole.
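For reference, the five stages correspond to the five terms of the multi-part loss in the original YOLO formulation [47]. The sketch below rewrites that loss with this section's symbols. It is a reconstruction under stated assumptions, not a reproduction of Equation (1): it takes (k, l) to be the box center and (m, n) the box width and height, writes $B$ for the number of predictor boxes per grid cell, $p_i(c)$ for the conditional class probability, hats for predicted values, $\lambda_{coord}$ and $\lambda_{noobj}$ for the two balance parameters, and $\mathbf{1}_{ij}^{obj}$ (respectively $\mathbf{1}_{ij}^{noobj}$) for the indicator that the j-th predictor of cell i is (is not) responsible for an object.

```latex
\begin{aligned}
L ={}& \lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbf{1}_{ij}^{obj}
       \left[ (k_i - \hat{k}_i)^2 + (l_i - \hat{l}_i)^2 \right] \\
  &+ \lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbf{1}_{ij}^{obj}
       \left[ \left(\sqrt{m_i} - \sqrt{\hat{m}_i}\right)^2
            + \left(\sqrt{n_i} - \sqrt{\hat{n}_i}\right)^2 \right] \\
  &+ \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbf{1}_{ij}^{obj}
       \left( CS_i - \widehat{CS}_i \right)^2 \\
  &+ \lambda_{noobj} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbf{1}_{ij}^{noobj}
       \left( CS_i - \widehat{CS}_i \right)^2 \\
  &+ \sum_{i=0}^{S^2} \mathbf{1}_{i}^{obj} \sum_{c \in classes}
       \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
```

Read top to bottom, the lines are the center-coordinate loss, the size loss with square roots, the confidence loss for boxes with an object, the down-weighted confidence loss for boxes without an object, and the class probability loss.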
The classification result is output as 1 if there is a pothole and 0 if not. In road damage images, most potholes have a circular shape. The determination of a pothole therefore compares the number of endpoints of the extracted edge with the sizes of the candidate area and the surrounding area of the bounding box. The size used for pothole classification is roughly estimated from the coordinates of the bounding box. Equation (2) is the discriminant for identifying a pothole, where w, h, and a are the thresholds. When the maximum length, height, and area are greater than the thresholds, the region is determined to be a pothole; otherwise, it is determined not to be a pothole. In this paper, the threshold values for the maximum length, height, and area are the averages of the most accurately detected values over repeated experiments: 200, 150, and 300, respectively. Algorithm 1 proceeds in three steps. The first step is the preprocessing of the road damage image data. The second step is road image feature extraction. The third step is pothole classification, in which images are classified according to the presence or absence of a pothole in the road damage image data.
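The decision rule described around Equation (2) can be sketched in Python as follows. The sketch assumes that a candidate counts as a pothole only when its length, height, and area all exceed their thresholds, since the text does not spell out how the three comparisons are combined. It also assumes that the size is estimated from hypothetical bounding-box corner coordinates (x1, y1, x2, y2); the function names are illustrative.

```python
W_THRESHOLD = 200  # threshold for maximum length, as given in the text
H_THRESHOLD = 150  # threshold for height, as given in the text
A_THRESHOLD = 300  # threshold for area, as given in the text


def box_size(box):
    """Estimate (length, height, area) from corner coordinates (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    length = abs(x2 - x1)
    height = abs(y2 - y1)
    return length, height, length * height


def classify(box, w=W_THRESHOLD, h=H_THRESHOLD, a=A_THRESHOLD):
    """Return 1 (pothole) if every measurement exceeds its threshold, else 0."""
    length, height, area = box_size(box)
    return 1 if (length > w and height > h and area > a) else 0


large_candidate = classify((0, 0, 250, 180))   # exceeds all three thresholds
narrow_candidate = classify((0, 0, 100, 180))  # too short in length
```

With strict comparisons, a box that exactly equals a threshold is classified as not a pothole; whether the comparison should be strict is another assumption of this sketch.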

Pothole Classification Using Edge Detection
Road surfaces are damaged in many places by potholes and cracks. If external forces are applied continuously to a crack, it develops into a pothole. However, it is difficult to classify potholes in the road surface images used in the field of image recognition, because there are various shapes on the road, such as lanes, shadows of various objects, and cracks. Potholes also vary in size. Small potholes pose little risk and are easy to repair. The larger the pothole, on the other hand, the greater its impact on traffic accidents, vehicle damage, and so on, and the higher the cost of maintenance. To classify potholes, therefore, edge detection is used to extract features, and classification is carried out based on YOLO. Figure 6 shows the process of the pothole classification model using edge detection in road images. In Figure 6, potholes are classified in three stages. In the first stage, image data is converted to gray-scale to reduce computational cost, and a preprocessing step removes the background other than the potholes using object detection. In the second stage, the features of a pothole are extracted by detecting its edge. In the third stage, images are classified according to the presence or absence of potholes, and the results are output. Figure 7 shows the road traffic pothole classification system, which is implemented in C++. The system in Figure 7 is structured as follows. It displays the current time and date. In addition, road damage is determined by region: Seoul, Incheon, Gyeonggi-do, Jeollanam-bukdo, Chungcheongnam-bukdo, Gyeongsangnam-bukdo, and Jeju-do. The edge of a pothole is detected and classified to determine the presence or absence of the pothole. Because the system shows that a pothole exists in a specific area, information on maintenance, road use, and so on can be identified.

Performance Evaluation
The environment for performance evaluation consists of Windows 10, an Intel Core i5-8400, 16 GB RAM, and a 2070 Super GPU. The Global Road Damage Detection Challenge 2020 data set is used for performance evaluation [48]. It includes images of damaged roads, such as potholes and cracks, in Japan, India, and the Czech Republic. Road damage information is provided through a label indicating the type of damage together with the coordinates of the bounding box. The data set contains 13,376 road damage images. In this paper, to determine the presence or absence of a pothole, damage types other than potholes are classified as non-pothole. The data used to evaluate the pothole classification model is converted to gray-scale to reduce the amount of computation. In addition, objects other than potholes are detected using the YOLO algorithm and replaced with background during preprocessing. To prevent overfitting, the data is divided into a 70% training set, a 20% validation set, and a 10% test set. The data collected directly is also used as a verification data set. The features are then extracted by detecting the edge of the pothole, and the pothole is detected and classified based on YOLO. The performance evaluation covers the distortion rate and restoration rate of the image data, the pothole classification results and validity of the model, and a comparison with existing models. The first evaluation uses MSE, PSNR, and SSIM to evaluate the distortion rate and restoration rate. The classification results and validity of the proposed model are evaluated through accuracy based on the confusion matrix and through the AUC (area under the curve) of the ROC curve. Finally, the classification accuracy is compared with that of existing models.
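A 70/20/10 split of the 13,376 images could be produced as in the sketch below. The random shuffle, the fixed seed, and the use of integer image identifiers are assumptions for illustration; the text does not describe how the images were assigned to each subset.

```python
import random


def split_dataset(items, train=0.7, validation=0.2, seed=0):
    """Shuffle items and split them into train, validation, and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable split
    n_train = int(len(shuffled) * train)
    n_validation = int(len(shuffled) * validation)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_validation],
            shuffled[n_train + n_validation:])  # remainder becomes the test set


# Hypothetical identifiers standing in for the 13,376 images of the data set.
image_ids = range(13376)
train_set, validation_set, test_set = split_dataset(image_ids)
```

Under these assumptions the subsets hold 9,363, 2,675, and 1,338 images, with rounding remainders assigned to the test set.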
The data used for performance evaluation is preprocessed, which introduces distortion relative to the original. Whether the preprocessing was effective is therefore assessed by evaluating the image distortion rate and restoration rate, and the first performance evaluation addresses these two rates. The mean square error (MSE) measures the difference in pixel values between the original image and the output image. Equation (3) represents MSE.
$$\mathrm{MSE} = \frac{1}{AB} \sum_{i=0}^{A-1} \sum_{j=0}^{B-1} \left[ I(i,j) - K(i,j) \right]^2 \qquad (3)$$
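MSE as defined here, together with the PSNR that the following paragraph derives from it, can be sketched in a few lines of Python. The nested-list image representation, the function names, and the choice to return infinity for identical images (where PSNR is undefined) are assumptions made for this example.

```python
import math


def mse(original, processed):
    """Mean square error between two equally sized gray-scale images."""
    rows, cols = len(original), len(original[0])
    total = sum((original[i][j] - processed[i][j]) ** 2
                for i in range(rows) for j in range(cols))
    return total / (rows * cols)


def psnr(original, processed, max_value=255):
    """Peak signal-to-noise ratio in dB; max_value is 255 for 8-bit gray-scale."""
    error = mse(original, processed)
    if error == 0:
        return math.inf  # identical images: PSNR is undefined, reported as infinity
    return 10 * math.log10(max_value ** 2 / error)


# Hypothetical 2 x 2 images that differ by one gray level in two pixels.
original = [[100, 100], [100, 100]]
processed = [[100, 101], [99, 100]]

error = mse(original, processed)
quality = psnr(original, processed)
```

A small MSE gives a large PSNR, which is why values above 50 dB indicate that preprocessing changed the image only slightly.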
In Equation (3), $I$ is a gray-scale image of size A×B, and $K$ is the image obtained when noise is added to $I$. Image and video data lose information during compression and therefore differ from the original data. The PSNR is used to measure this difference. The peak signal-to-noise ratio (PSNR) is the ratio of the maximum possible power of a signal to the power of the noise. It is an index used to assess the loss of image quality in video or lossy compression, and the smaller the loss, the higher the value. For lossless images, the PSNR is not defined because the mean square error is 0. Equation (4) represents the PSNR.
$$\mathrm{PSNR} = 10 \log_{10} \left( \frac{MAX_I^2}{\mathrm{MSE}} \right) \tag{4}$$
In Equation (4), $MAX_I$ represents the maximum pixel value of the corresponding video or image; for a gray-scale video or image, the maximum value is 255. The smaller the MSE, the higher the PSNR. Accordingly, the PSNR is higher for data of good quality than for data of poor quality.
In addition, the SSIM is used to evaluate visual similarity in terms of perceived quality rather than numerical error. The structural similarity index map (SSIM) measures how similar an image distorted by compression or transformation is to the original data. Given an original image X and a distorted image Y, it compares the luminance (l), contrast (c), and structure (s) of the two images. Luminance represents the brightness of light emitted by or reflected from an object, or the brightness of a device display. Contrast refers to the standard deviation of an image. Structure is the value obtained by subtracting the average brightness from the image and dividing the result by the standard deviation. Equation (5) represents the SSIM.
$$\mathrm{SSIM}(X, Y) = \frac{(2\mu_X \mu_Y + C_1)(2\sigma_{XY} + C_2)}{(\mu_X^2 + \mu_Y^2 + C_1)(\sigma_X^2 + \sigma_Y^2 + C_2)} \tag{5}$$
The SSIM takes values from −1 to +1, and a value closer to +1 means a closer match to the original. In Equation (5), $\mu_X$ and $\mu_Y$ are the mean pixel values of each image, $\sigma_X$ and $\sigma_Y$ are the standard deviations of the pixel values of each image, and $\sigma_{XY}$ is the covariance of the pixel values of the two images. $C_1$ and $C_2$ are small constants that stabilize the division.
Image data containing potholes are used for the performance evaluation. The images are taken from pothole-related news provided by KBS (Korea Broadcasting System) [49], YTN [50], and SBS (Seoul Broadcasting System) [51]. Each data set is converted to gray-scale, and the YOLO algorithm is used to detect objects. The detected objects are then removed and processed as background to produce the pre-processed data set. With the proposed method, the edge of the pothole is detected, and the degree of loss when restoring to the original data is evaluated. Table 2 shows the results of the performance evaluation of the distortion rate and the restoration rate for the pothole images. The MSE ranges from 0.2 to 0.44, the PSNR is 52 dB or more, and the SSIM ranges from 0.71 to 0.82. Therefore, it can be seen that the proposed method reduces image loss and allows potholes to be predicted accurately.
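The MSE, PSNR, and SSIM described above can be computed as in the following sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names and toy images are hypothetical, the SSIM is computed once over the whole image in its widely used form with stabilizing constants, and the constants k1 = 0.01 and k2 = 0.03 are conventional defaults that the text does not specify. In practice SSIM is usually averaged over local windows.

```python
import math


def mse(original, distorted):
    """Mean squared error between two gray-scale images of equal size (lists of rows)."""
    rows, cols = len(original), len(original[0])
    total = sum((original[i][j] - distorted[i][j]) ** 2
                for i in range(rows) for j in range(cols))
    return total / (rows * cols)


def psnr_from_mse(mse_value, max_value=255.0):
    """PSNR in dB; it is undefined for an MSE of 0, reported here as infinity."""
    if mse_value == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse_value)


def global_ssim(x, y, max_value=255.0, k1=0.01, k2=0.03):
    """SSIM computed once over the whole image (no sliding window)."""
    xs = [p for row in x for p in row]
    ys = [p for row in y for p in row]
    n = len(xs)
    mu_x, mu_y = sum(xs) / n, sum(ys) / n
    var_x = sum((p - mu_x) ** 2 for p in xs) / n
    var_y = sum((p - mu_y) ** 2 for p in ys) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(xs, ys)) / n
    c1, c2 = (k1 * max_value) ** 2, (k2 * max_value) ** 2  # assumed constants
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))


# Toy 2x2 images (hypothetical pixel values).
original = [[10, 20], [30, 40]]
distorted = [[10, 21], [30, 39]]
print(mse(original, distorted))          # 0.5
print(global_ssim(original, distorted))  # close to, but below, 1

# PSNR implied by the MSE range reported in the text (0.2 to 0.44).
for m in (0.2, 0.44):
    print(m, round(psnr_from_mse(m), 1))
```

With a maximum pixel value of 255, an MSE between 0.2 and 0.44 corresponds to a PSNR of roughly 55.1 dB down to 51.7 dB, which agrees with the lower bound of the reported PSNR.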
The second performance evaluation assesses the validity of the pothole classification model and the accuracy of the classification results. The accuracy of pothole detection in the images is evaluated with the proposed model. Table 3 shows the pothole classification accuracy obtained with the proposed model. The configuration column gives the ratio of potholes, cracks, and others (people, cars, buildings, and so on, excluding cracks and potholes) in each pothole image analyzed. In Table 3, pothole_4 and pothole_5 contain potholes, but their accuracy is low, so they are predicted to contain no pothole. In pothole_4, there are many more cracks than potholes, which makes the pothole difficult to detect. In addition, the pothole is not clearly visible, so the edge extraction did not work properly and the accuracy was low. In pothole_5, potholes, cracks, and shadows are present at the same time. Because the cracks and shadows surround the pothole, their edges were extracted as part of the pothole during edge detection, and the accuracy was low. On the other hand, pothole_9 contains no pothole but is predicted to be a pothole with an accuracy of 75%. This is because a manhole in the road was recognized as a pothole.
Road images may also contain potholes together with other road damage, such as multiple potholes in the same area or both potholes and cracks. To evaluate the accuracy of pothole detection in such cases, the proposed method is evaluated on sample data that contain potholes and other road damage. The classification accuracy is evaluated using six samples of complex pothole data. Complex_1 is an image that includes lines, potholes, and cracks. Complex_2 is an image of multiple potholes. Complex_3 is an image that includes cars, motorcycles, and potholes. Complex_4 is an image containing lines and potholes. Complex_5 is an image that includes a pothole and a line at night. Complex_6 is an image containing cracks, lines, and cars. Table 4 shows the classification results for potholes and other road damage in terms of the composition, accuracy, and classification result of each sample. The composition is divided into cracks, potholes, lines, and other objects. In Table 4, the average accuracy of the pothole classification is about 74%, and all six samples were extracted with an accuracy of 65% or more. Complex_1, Complex_2, and Complex_3 each contain two or more potholes, and the model nevertheless classifies them correctly. On the other hand, Complex_4 and Complex_5 contain potholes, but the features of the lines are more prominent and are extracted instead. In Complex_4, the pothole and the line can be distinguished visually, but the model detects the line because its outline is clearer than that of the pothole. Therefore, the image is classified as non-pothole. Complex_5 is an image of a rainy night. When it is converted to gray-scale, the outline of the pothole is difficult to find because the difference in contrast is small. Accordingly, although it contains a pothole, it is classified as non-pothole. Complex_6 is classified as non-pothole because it contains no pothole, only cracks, lines, and other objects.
The validity of the model is evaluated using the confusion matrix. Table 5 shows the confusion matrix for evaluating the pothole classification model. True indicates a pothole, while false indicates a non-pothole. TP is the case where an actual pothole is predicted to be a pothole, FN is the case where an actual pothole is predicted not to be a pothole, FP is the case where a non-pothole is predicted to be a pothole, and TN is the case where a non-pothole is predicted not to be a pothole. Table 6 shows the classification results using the confusion matrix for the case of one pothole. Table 6 contains 5 results for TP, 2 for FN, 1 for FP, and 2 for TN. This corresponds to a precision of about 83%, that is, TP/(TP + FP) = 5/6. The classification results and the validity of the model are further evaluated through the AUC of the ROC curve. The ROC curve shows the performance of the model over all classification thresholds [52]. The AUC is the area under the ROC curve and expresses performance as a value between 0 and 1. The closer the value is to 1, the better the performance of the classification model [53]. Figure 8 shows the results of pothole classification using AUC. The horizontal axis represents the FPR (False Positive Rate), and the vertical axis represents the TPR (True Positive Rate). In Figure 8, the AUC is 0.9. This means that the pothole images were accurately classified according to the presence or absence of a pothole. Therefore, the proposed model is suitable for classifying potholes.
In addition, Table 7 shows the results of pothole classification in a complex environment. Table 7 contains 3 results for TP, 2 for FN, 1 for FP, and 0 for TN. This corresponds to a precision of about 75%, that is, 3/4. Figure 9 shows the evaluation of the model for pothole classification in a complex environment using AUC. In Figure 9, the AUC is about 0.733. The model is therefore suitable for classifying potholes even in a complex environment.
Finally, the third performance evaluation compares the classification accuracy and precision of the proposed method with those of existing pothole detection methods. The comparison is based on the confusion matrix. Table 8 shows the average classification accuracy and precision of the proposed method and the existing pothole detection methods. According to Table 8, the proposed method performs better than the conventional methods in both the precision and the accuracy of pothole detection and classification. The method of Aparna et al. [54] uses image data that contain only potholes. Therefore, its performance is low because potholes cannot be properly detected in images where other objects are present. The method of M. H. Yousaf et al. [55] is limited to images of asphalt roads and is applicable only when the shapes of the potholes are similar. Therefore, its performance is poor on image data with potholes of different shapes. In contrast, the proposed method performs well on a variety of images because it considers whether several objects are included.
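The precision values quoted for Tables 6 and 7 follow directly from the confusion-matrix counts given in the text, as the sketch below shows. The function names are hypothetical, and the scores passed to `roc_auc` are invented toy values that only illustrate how an AUC is obtained from ranked predictions; they are not the paper's data and do not reproduce the reported AUC of 0.9 or 0.733.

```python
def precision(tp, fp):
    """Fraction of predicted potholes that are actual potholes."""
    return tp / (tp + fp)


def true_positive_rate(tp, fn):
    """Fraction of actual potholes that are predicted as potholes (vertical ROC axis)."""
    return tp / (tp + fn)


def false_positive_rate(fp, tn):
    """Fraction of non-potholes that are predicted as potholes (horizontal ROC axis)."""
    return fp / (fp + tn)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as one half. This rank formulation equals the area under the ROC curve.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Counts reported in the text for Table 6 and Table 7.
table6 = {"tp": 5, "fn": 2, "fp": 1, "tn": 2}
table7 = {"tp": 3, "fn": 2, "fp": 1, "tn": 0}

for name, t in (("Table 6", table6), ("Table 7", table7)):
    print(name,
          "precision=%.3f" % precision(t["tp"], t["fp"]),
          "TPR=%.3f" % true_positive_rate(t["tp"], t["fn"]),
          "FPR=%.3f" % false_positive_rate(t["fp"], t["tn"]))

# Hypothetical labels (1 = pothole) and scores, for illustration only.
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.1]
print(roc_auc(toy_labels, toy_scores))
```

The confusion-matrix counts describe a single operating point, whereas the AUC summarizes all thresholds, so the two kinds of result are complementary.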

Discussion
Potholes and cracks in damaged roads vary in shape and size, and large ones cause road traffic accidents. Therefore, potholes, which are a potential danger, must be detected to ensure the safety of drivers, passengers, and vehicles on the road. In this study, a pothole classification model using edge detection in road images is proposed. The proposed method detects potholes in road damage image data. The RGB pothole image was converted to gray-scale to reduce the amount of computation. To increase the accuracy of pothole detection, objects other than the pothole were detected using the YOLO algorithm and removed. In this pre-processing step, a value of 255 was assigned to the removed regions to turn them into a white background. A pothole is lighter or darker than the surrounding road, so the parts where the brightness changes at the edge are detected. Accordingly, a partial differential operator for edge detection was used to extract the edge according to the gradient change, and the Laplacian mask was used as the filter to extract the edges of the pothole in all directions. By extracting the edge, the characteristics of the edge of the pothole were identified. Classification according to the presence or absence of a pothole is then performed based on YOLO.
The performance evaluation of the proposed method covered the distortion rate and restoration rate of the image data, the classification results and validity of the model, and a comparison with existing models. First, the distortion rate and restoration rate of the image data were evaluated using MSE, PSNR, and SSIM. The MSE was 0.2 to 0.44, the PSNR was 50 dB or more, and the SSIM was 0.71 to 0.82. In the second performance evaluation, the accuracy, precision, and validity of the model were evaluated by classifying potholes with the proposed method. When there is one pothole in the image data, the average accuracy was about 77.86% and the precision was about 83%. When there are multiple potholes, or when potholes and other road damage are present at the same time, the average accuracy is about 74% and the precision is about 75%. In addition, the AUC of the model was 0.9 for a single pothole and 0.733 for pothole classification in a complex environment. Therefore, we showed that the proposed model is useful for classifying potholes both for a single pothole and in a complex environment. In the third performance evaluation, the precision and accuracy of the proposed method were compared with those of existing models, and the proposed method performed better on both measures. Accordingly, the characteristics of the shape and size of potholes can be identified through the proposed method. In addition, image data can be analyzed effectively by reducing the loss rate of the image data.
However, although image data analysis using the proposed method can detect the shape of a pothole, it has difficulty predicting the actual size of a pothole. In addition, classification performance is low for images with shadows or many cracks around the pothole. Therefore, a method is needed that can support continuous road maintenance and management by identifying detailed characteristics and predicting the size of potholes. In the future, we plan to study how to extract detailed features of the pothole in order to predict its actual size from the image data.

Figure 3. The process of extracting features of road damage using edge detection.

Figure 4. The accuracy according to the size of the filter using the Laplacian mask.

Figure 5. The block diagram of the pothole classification model using edge detection in road image.

Figure 6. Process of the pothole classification model using edge detection in road image.

Figure 7. The road traffic pothole classification system.

Figure 8. The results of pothole classification using AUC (area under curve).

Figure 9. Evaluation results of pothole classification model in a complex environment using AUC.

Author Contributions: J.-W.B. and K.C. conceived and designed the framework. J.-W.B. performed experiments and analyzed the results. All authors have contributed to writing and proofreading the paper. All authors have read and agreed to the published version of the manuscript.

Table 1. The peak signal to noise ratio (PSNR) results according to the mask type.

Columns: mask type, data, PSNR (peak signal to noise ratio).
Algorithm 1 shows the algorithms for pothole classification based on YOLO. The input is road damage image data. The output is the classification result.
Algorithm 1. Algorithms for pothole classification based on YOLO (you only look once).
Input: Road Damage Image Data → RD(n)
Output: Classification Result → CR
threshold w, h, a
// Step 1. Road damage image data pre-processing
for n ← 1 to the number of scales in road damage of images do
    Convert n images RGB to Grayscale
    Convert to 255 pixels after object detection
    Create pre-processed RD(n)
// Step 2. Road damage feature extraction using edge detection
for RD(n) ← 1 to number of filter of stage f do
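Steps 1 and 2 of Algorithm 1 can be illustrated with the following sketch. It is not the authors' implementation. The function names are hypothetical, the gray-scale weights are the common ITU-R BT.601 luma weights, the bounding boxes stand in for the output of an object detector such as YOLO (the detector itself is not shown), and the 3×3 eight-neighbor Laplacian mask is an assumed choice, since the mask coefficients are not given in this section.

```python
def to_grayscale(rgb_image):
    """Convert an RGB image (rows of (r, g, b) tuples) to gray-scale (assumed BT.601 weights)."""
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
            for row in rgb_image]


def mask_boxes(gray, boxes, background=255):
    """Overwrite each detected box (x0, y0, x1, y1; end-exclusive) with the background value."""
    out = [row[:] for row in gray]
    for x0, y0, x1, y1 in boxes:
        for y in range(y0, y1):
            for x in range(x0, x1):
                out[y][x] = background
    return out


# Assumed 3x3 Laplacian mask that responds to intensity changes in all eight directions.
LAPLACIAN_8 = [[1, 1, 1],
               [1, -8, 1],
               [1, 1, 1]]


def laplacian_edges(gray, mask=LAPLACIAN_8):
    """Apply the mask to interior pixels; border pixels are left at 0."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(mask[dy + 1][dx + 1] * gray[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
    return out


# Toy 5x5 RGB road patch: uniform gray (200) with one darker center pixel (50).
rgb = [[(200, 200, 200)] * 5 for _ in range(5)]
rgb[2][2] = (50, 50, 50)

gray = to_grayscale(rgb)
# Hypothetical detection of a non-pothole object covering the top-left pixel.
pre = mask_boxes(gray, [(0, 0, 1, 1)])
edges = laplacian_edges(pre)
for row in edges:
    print(row)
```

The response is largest in magnitude at the dark center pixel and zero in flat regions, which is the property the edge-detection step relies on. A production pipeline would typically use an image library for these operations instead of nested lists.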

Table 2. The results of the evaluation of the distortion rate and the restoration rate.

Table 3. The results of pothole classification accuracy using the proposed model.

Table 4. The classification results for potholes and other road damage.

Table 5. Confusion matrix for evaluating the pothole classification.

Table 6. Classification results using the confusion matrix.

Table 7. The results of pothole classification in a complex environment.

Table 8. The average of the classification accuracy and precision comparison results of the proposed method and the existing pothole detection method.