Concrete Bridge Crack Image Classification Using Histograms of Oriented Gradients, Uniform Local Binary Patterns, and Kernel Principal Component Analysis

Abstract: Bridges deteriorate over time, which requires the continuous monitoring of their condition. There are many digital technologies for inspecting and monitoring bridges in real-time. In this context, computer vision has extensively studied cracks to automate their identification in concrete surfaces, overcoming the conventional manual methods that rely on human judgment. The general framework of vision-based techniques consists of feature extraction using different filters and descriptors and classifier training to perform the classification task. However, training can be time-consuming and computationally expensive, depending on the dimension of the features. To address this limitation, dimensionality reduction techniques are applied to extracted features, and a new feature subspace is generated. This work used histograms of oriented gradients (HOGs) and uniform local binary patterns (ULBPs) to extract features from a dataset containing over 3000 uncracked and cracked images covering different patterns of cracks and concrete surface representations. Nonlinear dimensionality reduction was performed using kernel principal component analysis (KPCA), and three machine learning classifiers were implemented to conduct the classification. The experimental results show that the classification scheme based on the support-vector machine (SVM) model and feature-level fusion of the HOG and ULBP features after KPCA application provided the best results, achieving an accuracy of 99.26%.


Introduction
Cracks in concrete bridges are hazardous to their durability and can potentially lead to their structural deficiency. These defects are largely encountered in bridges while performing visual inspections, and inspectors need to carefully identify them and document their different properties to assess the condition of the examined bridge.
Automating the detection of this type of defect has been an active research area in the computer vision field in an attempt to overcome limitations related to traditional manual methods that essentially rely on human judgment.
In handcrafted-features-based frameworks, features are extracted from images using filters and descriptors, and the extracted features are fed to a classifier (e.g., support-vector machine [5], random forests [6], and AdaBoost [7]). Another framework, in which features are automatically learned from a training set of images, has been extensively explored by leveraging convolutional neural networks, transfer learning, and several public concrete crack datasets [8][9][10][11][12]. In the context of handcrafted-features-based frameworks, Choudhary and Dey [13] used a Sobel edge detector to extract features from digital images of concrete surfaces and trained neural network and fuzzy models to detect concrete cracks.
Chen et al. proposed a detection method for highway pavement damage [14]. The authors used a grayscale-weighted histogram of oriented gradient feature patterns and a convolutional neural network classifier.
Jin et al. [15] used the histogram of oriented gradients and the watershed algorithm to detect different pavement cracks and identify their severity.
Another example was proposed by Zalama et al. [16]. The authors developed a methodology utilizing Gabor filters, several binary classifiers, and an AdaBoost algorithm to detect longitudinal and transverse pavement cracks.
Chen et al. [17] presented a texture-based video-processing approach using local binary patterns (LBP), support-vector machine (SVM), and Bayesian decision theory to automate crack detection on metallic surfaces. Finally, Hu et al. [18] proposed a pavement crack detection method based on texture analysis using the gray-level co-occurrence matrix, translation invariant shape descriptors, and an SVM classifier.
Depending on the dimension of the extracted features, classifier training can be time-consuming and computationally expensive. This limitation can be addressed by applying dimensionality reduction techniques.
For example, principal component analysis, linear discriminant analysis, or isometric mapping can be used as a processing step in the general classification framework.
In this sense, Abdelmawla et al. [19] applied image processing techniques, principal component analysis (PCA), and the k-means algorithm to detect and classify different types of cracks in pavement surface images. In addition, Abdel-Qader et al. [20] proposed an algorithm based on PCA, linear features, and local information to detect cracks in concrete bridge images.
Kumar et al. [21] used PCA and a modified LeNet-5 model to detect cracks in roads and bridges working with three different crack datasets. In addition, Endri et al. [22] used PCA as a preprocessing step to extract features from a pavement crack dataset containing 400 images.
Chen et al. [23] extracted features from frames of road videos using LBP, reduced the dimension of the LBP feature space by PCA, and trained an SVM classifier to detect different types of pavement cracks.
Elhariri et al. [24] proposed a crack detection model for historical building images based on feature extraction, a fusion of handcrafted features (i.e., HOG and LBP features), and convolutional-neural-network-learned features, and dimensionality reduction using PCA and machine learning classifiers.
In this paper, several concrete bridge crack classification schemes based on histograms of oriented gradients (HOG), uniform local binary patterns (ULBPs), kernel principal component analysis (KPCA), and machine learning classifiers (i.e., SVM, random forests, and decision trees) are studied and compared. The main contributions of this work are the following:

•	More than 3000 images from a crack dataset constructed by the authors in [25] are preprocessed using a median filter to remove crack-like noise;
•	HOG and ULBP features are extracted from the preprocessed images, and dimensionality reduction by KPCA is applied to the extracted features;
•	Classification schemes based on the reduced HOG and ULBP features and machine learning classifiers are investigated and evaluated using different classification metrics to yield the best model for concrete bridge crack detection.
The remainder of the paper is organized as follows: Section 2 briefly explains the HOG and ULBP descriptors and the KPCA dimensionality reduction technique. Section 3 presents the details of the experimental setup followed to implement the classification schemes studied in this paper. Finally, results are presented and discussed in Section 4, and the conclusions of the present work are provided in Section 5.

Materials
This section presents a brief explanation of the HOG and ULBP descriptors and the KPCA dimensionality reduction technique used in this paper.

Histogram of Oriented Gradients
The histogram of oriented gradients (i.e., HOG) [26] is a feature descriptor used to describe edge gradient distribution and orientation.
The vertical and horizontal gradients of an image are calculated by applying convolution filters that are commonly used to find edges in an image. The magnitude and the direction of the gradients are computed following Equations (1) and (2), respectively:

g = \sqrt{g_x^2 + g_y^2}	(1)

\theta = \arctan(g_y / g_x)	(2)

where g_x represents the gradient in the x direction, and g_y represents the gradient in the y direction. The image is divided into small cells, and the HOG of each cell is calculated. This histogram is split into nine bins corresponding to angles from 0 to 180 degrees and shows the frequency distribution of the gradient orientations. The contribution of each pixel within the cell is weighted by its gradient magnitude. The cells are combined into blocks, and the HOG features of the image are generated by combining all of the resulting block features. Figure 1 presents the HOG features of four concrete images with cracks from the dataset constructed by the authors in [25].
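As an illustration, HOG features can be extracted with scikit-image's `hog` function. The cell and block sizes below are illustrative assumptions rather than the exact hyperparameters of Table 1; only the nine orientation bins over 0-180 degrees follow the description above:

```python
import numpy as np
from skimage.feature import hog

# Placeholder for a 200x200 grayscale crack image.
image = np.random.rand(200, 200)

features = hog(
    image,
    orientations=9,          # nine histogram bins over 0-180 degrees
    pixels_per_cell=(8, 8),  # assumed cell size (not from Table 1)
    cells_per_block=(2, 2),  # assumed block size (not from Table 1)
    block_norm="L2-Hys",     # common block normalization scheme
    feature_vector=True,     # concatenate all block features into one vector
)
print(features.shape)
```

The resulting one-dimensional vector concatenates the histograms of all overlapping blocks, which is what gets fed to the classifier.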

Uniform Local Binary Patterns
Local binary patterns (LBP) is a texture operator used in various computer-vision applications. Each window in an image is processed to extract an LBP code. The processing includes labeling the center pixel of the window by thresholding its neighboring pixels and generating a binary code following Equation (3) [24]:

LBP_{P,R} = \sum_{i=0}^{P-1} s(g_i - g_c) \, 2^i	(3)

where s(x) = 1 if x ≥ 0 and s(x) = 0 otherwise; P represents the number of surrounding pixels, R represents the circle radius, and g_c and g_i represent the gray levels of the center pixel and the neighboring pixels, respectively. The generated binary code is converted to a decimal format, and an LBP-based descriptor is created by computing a histogram over the LBP values.
An LBP pattern is called uniform (ULBP) if it has two or fewer bitwise transitions when traversed circularly. A unique label is assigned to each uniform pattern, while all nonuniform patterns are assigned the same label. Figure 2 illustrates the ULBP texture features of four concrete crack images.
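A ULBP histogram descriptor can be sketched with scikit-image's `local_binary_pattern`, whose `"uniform"` method implements exactly this labeling (one label per uniform pattern, one shared label for the rest). The values of P and R below are illustrative, not necessarily those of Table 1:

```python
import numpy as np
from skimage.feature import local_binary_pattern

# Placeholder for a 200x200 grayscale crack image.
image = (np.random.rand(200, 200) * 255).astype(np.uint8)

P, R = 8, 1  # illustrative neighborhood size and circle radius

# method="uniform" yields P + 2 distinct labels: one per uniform
# pattern plus a single label for all nonuniform patterns.
lbp = local_binary_pattern(image, P, R, method="uniform")

# The descriptor is the normalized histogram over the label map.
hist, _ = np.histogram(lbp, bins=np.arange(P + 3), density=True)
print(hist.shape)  # (P + 2,) = (10,)
```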

Kernel Principal Component Analysis
Principal component analysis (PCA) is an unsupervised decomposition algorithm used for feature extraction and dimensionality reduction. It performs a linear transformation of features from a higher-dimensional space to a lower-dimensional space and retains the linear principal components that maximize the variance of the data in the low-dimensional space. Generally, the eigenvalues and eigenvectors of the covariance matrix of the original data are calculated, and the eigenvectors represent the directions of the new feature space.
Kernel principal component analysis (KPCA) is a nonlinear extension of PCA. The data is first mapped to a nonlinear feature space using nonlinear kernels (e.g., the radial basis function, the sigmoid function, and the polynomial function), and the principal components are extracted in the new space. Figure 3 shows an example of a crack image reconstruction after KPCA application using the radial basis function (RBF) kernel and different numbers of components.
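A minimal KPCA sketch with scikit-learn's `KernelPCA` follows; the feature matrix is a random placeholder, the RBF kernel and 100 components match the paper's setup, and `gamma` is left at the scikit-learn default (an assumption):

```python
import numpy as np
from sklearn.decomposition import KernelPCA

# Placeholder feature matrix: 200 samples, 400 features each.
X = np.random.rand(200, 400)

# RBF-kernel KPCA keeping 100 principal components; enabling
# fit_inverse_transform allows approximate reconstruction, as used
# for the reconstruction examples in Figure 3.
kpca = KernelPCA(n_components=100, kernel="rbf", fit_inverse_transform=True)
X_reduced = kpca.fit_transform(X)
X_back = kpca.inverse_transform(X_reduced)  # approximate reconstruction
print(X_reduced.shape, X_back.shape)
```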


Experimental Setup
The diagram in Figure 4 shows the general framework of the experimental setup followed in this paper.

Dataset
A total of 1304 cracked and 1806 uncracked RGB images at a resolution of 200 × 200 were extracted from the dataset constructed by the authors in [25]. These images cover different representations of cracks and concrete surfaces encountered in real-world bridge inspection. Figure 5 presents examples from the dataset studied in this paper. The images were converted to grayscale and preprocessed using a 3 × 3 median filter to reduce crack-like noise.
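The preprocessing step can be sketched as follows; SciPy's `median_filter` is used here as a stand-in for whichever median implementation the authors used, and the input image is a random placeholder:

```python
import numpy as np
from skimage.color import rgb2gray
from scipy.ndimage import median_filter

# Placeholder for a 200x200 RGB image from the dataset.
rgb = np.random.rand(200, 200, 3)

gray = rgb2gray(rgb)                    # convert to grayscale
denoised = median_filter(gray, size=3)  # 3x3 median filter reduces crack-like noise
print(denoised.shape)
```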


Feature Extraction and Dimensionality Reduction
The HOG and ULBP descriptors were used to generate the feature sets from the images of the studied dataset, and dimensionality reduction was performed by applying KPCA.
The hyperparameters of HOG, ULBP, and KPCA were set according to Table 1.

A standard scaler was applied to the HOG and ULBP features to normalize them (i.e., the transformed data had a mean equal to 0 and a standard deviation equal to 1) before reducing their dimensions using KPCA.
A reconstruction error based on the root-mean-square error was defined following Equation (4) so as to measure the performance of the dimensionality reduction technique:

RMSE = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} (X_i - Y_i)^2 }	(4)

where X represents the features before applying KPCA, Y represents the reconstructed features after applying KPCA, and n represents the number of images.
To further visualize the HOG and ULBP features after the KPCA application, the authors employed the uniform manifold approximation and projection (UMAP) algorithm [27], which is a manifold learning and dimension reduction algorithm effective for cluster visualization.
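The scaling, reduction, and reconstruction-error steps above can be sketched together; the feature matrix is a synthetic placeholder standing in for the HOG or ULBP features:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import KernelPCA

# Placeholder feature matrix, e.g., HOG features of 300 images.
X = np.random.rand(300, 500)

# Standardize to zero mean and unit standard deviation, then reduce
# with RBF-kernel KPCA keeping 100 components, as in the paper.
X_scaled = StandardScaler().fit_transform(X)
kpca = KernelPCA(n_components=100, kernel="rbf", fit_inverse_transform=True)
X_reduced = kpca.fit_transform(X_scaled)

# Reconstruction error in the spirit of Equation (4): RMSE between
# the original and reconstructed feature matrices.
X_back = kpca.inverse_transform(X_reduced)
rmse = np.sqrt(np.mean((X_scaled - X_back) ** 2))
print(rmse)
```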

Image Classification
Image classification was performed using three machine learning classifiers (i.e., SVM, random forests, and decision trees).
The HOG and ULBP features and the reduced feature sets resulting from dimensionality reduction using KPCA were fed to the three classifiers. Five classification schemes were tested in this work:

•	The HOG features were extracted from the filtered grayscale images and were directly fed to the three classifiers;
•	KPCA was applied to the extracted HOG features, and the new feature set was fed to the three classifiers;
•	The ULBP features were extracted from the filtered grayscale images and were directly fed to the three classifiers;
•	KPCA was applied to the extracted ULBP features, and the new feature set was fed to the three classifiers; and
•	KPCA was applied to the extracted HOG and ULBP features, and the new feature sets were fused and fed to the three classifiers.
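The last scheme (feature-level fusion after KPCA) can be sketched as follows; the feature matrices, sample count, component number, and labels are all synthetic placeholders, and the SVM is used with scikit-learn defaults:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import KernelPCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200                              # placeholder sample count
hog_feats = rng.random((n, 400))     # stand-in for HOG features
ulbp_feats = rng.random((n, 300))    # stand-in for ULBP histograms
y = rng.integers(0, 2, n)            # 0 = uncracked, 1 = cracked

def reduce(features, n_components=50):
    """Standardize the features, then apply RBF-kernel KPCA."""
    scaled = StandardScaler().fit_transform(features)
    return KernelPCA(n_components=n_components, kernel="rbf").fit_transform(scaled)

# Feature-level fusion: concatenate the two reduced feature sets.
fused = np.hstack([reduce(hog_feats), reduce(ulbp_feats)])

# Fivefold cross-validation of an SVM on the fused features.
scores = cross_val_score(SVC(), fused, y, cv=5)
print(scores.mean())
```

With random placeholder data the accuracy is near chance; on the real HOG/ULBP features this pipeline corresponds to the best-performing scheme reported below.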
The performance of the classification methods was evaluated using a fivefold cross-validation procedure based on the following metrics:

Accuracy = (TP + TN) / (TP + TN + FP + FN)	(5)

Precision = TP / (TP + FP)	(6)

Recall = TP / (TP + FN)	(7)

F1-Score = 2 / ((1/Precision) + (1/Recall))	(8)

where TP (true positives) refers to the number of cracked images that are correctly classified as cracked; TN (true negatives) refers to the number of noncracked images that are correctly classified as noncracked; FP (false positives) refers to the number of noncracked images that are incorrectly identified as cracked; and FN (false negatives) refers to the number of cracked images that are incorrectly identified as noncracked.
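A minimal helper computing the four evaluation metrics (accuracy, precision, recall, and F1-score) from confusion-matrix counts; the counts passed in at the bottom are made-up illustrative values:

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1-score from counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 / (1 / precision + 1 / recall)  # harmonic mean of precision and recall
    return accuracy, precision, recall, f1

# Illustrative counts, not results from the paper.
acc, prec, rec, f1 = classification_metrics(tp=90, tn=95, fp=5, fn=10)
print(acc, prec, rec, f1)
```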
All of the experiments in this paper were conducted in Google Colaboratory with the 12 GB NVIDIA Tesla K80 GPU provided by the platform. The scikit-learn library [28] and the scikit-image Python package [29] were used to implement the classification methods discussed in this paper.

Results and Discussion
Table 2 shows the reconstruction error of the HOG and ULBP features using KPCA. The reconstruction errors obtained with the nonlinear dimensionality reduction technique were significantly low (i.e., errors of 2.82 × 10⁻⁸ and 1.35 × 10⁻⁷ for the HOG and ULBP features, respectively). This means that KPCA based on the RBF kernel is capable of reconstructing the HOG and ULBP features and can capture maximum information with a low number of principal components (i.e., 100 components).
Tables 3 and 4 display the accuracy, precision, recall, and F1-score results of the classification schemes based on the HOG and ULBP features, respectively. The SVM model yielded the best classification results compared to the random forest and decision tree models (e.g., accuracies of 98.49%, 90.90%, and 72.60% achieved by SVM, random forests, and decision trees, respectively, for the HOG features and accuracies of 93.44%, 90.10%, and 68.58%, respectively, for the ULBP features). It can also be seen that the HOG-features-based classification scheme achieved better results than the ULBP texture-based features.
Figure 6 presents the UMAP visualization maps of the HOG and ULBP features after the KPCA application. It can be noticed that the samples represented by the reduced HOG and ULBP features are separable, which helps the classification models differentiate cracked and uncracked concrete images.
Tables 5 and 6 show the classification results based on the reduced HOG and ULBP features after the KPCA application. The classification metrics were significantly enhanced after the dimensionality reduction of the HOG and ULBP feature sets. This is particularly observable for the random forest and decision tree classifiers (e.g., accuracies of 95.40% and 90.90% achieved by the random forest classifier with and without KPCA, respectively, for the HOG features and accuracies of 93.15% and 90.10% with and without KPCA, respectively, for the ULBP features). The SVM classifier provided the highest accuracies using the reduced HOG and ULBP features (i.e., 98.81% and 93.63% for the HOG and ULBP features, respectively). These experimental results show that, by using only 100 components, the original 4356 HOG features and 40,000 ULBP features are well represented in the new nonlinear feature subspace, enabling better performance of the three classifiers by capturing complex data patterns.
Figure 7 presents the UMAP visualization map of the HOG and ULBP features after KPCA application and feature fusion. The fusion of the reduced HOG and ULBP features resulted in a better separability of the studied samples and would presumably improve the performance of the classification models.
Table 7 summarizes the performance metrics of the three machine learning models based on the fusion of the reduced HOG and ULBP features using KPCA. A gain in performance was achieved for all three classifiers by performing feature-level fusion after the KPCA application (e.g., accuracies of 94.24%, 89.20%, and 90.10% achieved by the decision tree model with feature fusion, with the reduced HOG features, and with the reduced ULBP features, respectively). The SVM classifier yielded the highest classification metrics obtained in this work (e.g., an accuracy of 99.26%) in the classification scheme based on feature fusion after dimensionality reduction of the HOG and ULBP features using KPCA. The fusion of the HOG and ULBP handcrafted features with nonlinear dimensionality reduction provides a strong descriptive capability of complex data patterns. As a result, the proposed classification scheme can accurately separate cracked and uncracked images.

Conclusions
In this work, experiments were carried out on the nonlinear (i.e., KPCA) transformation of feature sets generated by the HOG and ULBP feature descriptors. The experiments were validated using a concrete crack dataset covering different representations of cracks and concrete surfaces encountered in real-world bridge inspection. The images of the studied dataset were preprocessed using a median filter to remove crack-like noise. In addition, this paper evaluated and proposed multiple classification schemes based on HOG and ULBP features, dimensionality reduction using KPCA, feature fusion, and three machine learning classifiers (i.e., SVM, random forests, and decision trees). The performance of the models was evaluated based on standard classification metrics. The experimental results show that applying KPCA improved the performance of the models trained on the HOG and ULBP features, as the nonlinear kernel dimensionality reduction technique captures complex data patterns.
Furthermore, the classification scheme based on the SVM classifier and the fusion of the HOG and ULBP reduced features (i.e., using KPCA) provided the best performance with an accuracy of 99.26%. The proposed classification scheme showcases the strong impact of handcrafted feature fusion and nonlinear dimensionality reduction on the performance of classification models in the context of a challenging and complex concrete crack dataset.