Electronics · Article · Open Access · 22 August 2024

PAN: Improved PointNet++ for Pavement Crack Information Extraction

1 School of Geomatics, Liaoning Technical University, Fuxin 123000, China
2 Collaborative Innovation Institute of Geospatial Information Service, Liaoning Technical University, Fuxin 123000, China
3 State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
4 School of Resources and Civil Engineering, Liaoning Institute of Science and Technology, Benxi 117000, China
This article belongs to the Special Issue Fault Detection Technology Based on Deep Learning

Abstract

Maintenance and repair of expressways are becoming increasingly important due to their growing frequency of use. Accurate pavement crack information extraction helps with routine maintenance and reduces the risk of traffic accidents. Traditional 2D crack image detection methods have limitations and cannot effectively obtain depth information. Three-dimensional crack extraction from 3D point clouds has become a new solution that can capture pavement crack information more comprehensively and accurately. However, existing algorithms struggle to extract crack features because pavement cracks vary in shape and size, are irregular, and are subject to interference from the external environment. To solve this, a new method for detecting pavement cracks in point clouds, namely Point Attention Net (PAN), is herein proposed. It uses a two-branch attention fusion module to focus on spatial and feature information in the point cloud and to capture features of crack points at different scales. It also uses the Poly Loss function to address the imbalance between foreground and background points in pavement point cloud data. Experiments on the LNTU-RDD-LiDAR dataset were carried out to verify the effectiveness of the proposed method. Compared with traditional methods and the latest point cloud segmentation techniques, the mIoU, Acc, F1, and Rec indexes improved significantly, reaching 75.4%, 91.5%, 75.4%, and 67.1%, respectively.

1. Introduction

Highway pavement crack information extraction is very important for road quality assurance and driving safety. Early detection methods relying on manual inspection and sensors suffer from high cost, low efficiency, and strong subjectivity, which makes road crack detection complicated and difficult [1]. Modern machine vision detection methods improve detection efficiency and accuracy to a certain extent. They provide a new solution for the identification and maintenance of highway cracks, which helps to reduce cost, improve efficiency, lessen the impact of road cracks on traffic, and improve the safety and service life of highways.
Traditional methods have been extensively studied for automatic pavement crack detection. Image processing-based techniques primarily include threshold segmentation [2,3], edge detection [4,5], and region growth [6,7]. However, these approaches often rely on prior knowledge or optimal thresholds, limiting their applicability on complex and variable urban roads, particularly in reliably detecting cracks with weak connections or uneven geometric topologies. Some studies have proposed an information extraction method for calculating crack length [8] and used a data acquisition system combining a radar rangefinder and a camera [9]. Image segmentation has been carried out using a gray-level adaptive threshold algorithm, crack contours have been extracted with a Gauss-Laplace algorithm [10], and a video-based software system has been developed to calculate bridge crack widths. However, traditional methods often have low accuracy and can only approximate crack locations in complex backgrounds or when there is minimal contrast. These methods are also easily affected by the background.
With the rise of deep learning, crack detection has ushered in a new stage of development, and the quality of crack detection has been significantly improved by learning advanced features [11]. The use of deep learning in pavement crack information extraction allows for automated feature extraction and learning without human intervention. Additionally, Transformer models based on the seq2seq architecture have achieved significant success in crack detection, offering not only feature extraction but also multimodal fusion, addressing some of the limitations of CNNs [12]. While two-dimensional detection has made some progress, these image-based methods still face challenges, as pavement images are frequently obscured by lighting, shadows, dirt, and noise, which hinders the accurate capture of terrain and texture details.
With the advancement of 3D data acquisition technology, mobile laser scanning (MLS) systems have become widely used for generating precise 3D coordinate data, enabling efficient and flexible point cloud acquisition on road surfaces [13]. Pavement point clouds provide 3D spatial coordinates and intensity information for each point, with crack points typically differing in elevation and intensity from the surrounding pavement points. These features make the pavement point cloud an effective tool for analyzing road conditions and detecting damage. Although irregular, the point cloud offers rich information about the pavement surface, and the normal vector at each point, which indicates the local surface orientation, can be used for crack segmentation. However, road wear and noise complicate distinguishing elevation and intensity changes between cracks and normal pavement, posing challenges for crack detection.
While current methods have shown promising results, point cloud-based crack-detection techniques face three significant challenges that hinder their practical application. (1) Disorder: LiDAR systems generate 3D point cloud data that are inherently unstructured and unordered, creating difficulties for deep learning algorithms. To address this, many existing methods employ dimension reduction techniques to transform 3D point clouds into 2D images, thereby simplifying the processing. However, this conversion often results in information loss [14]. Moreover, these methods often treat point clouds as discrete, uncorrelated sets, overlooking elevation differences between cracked and non-cracked points. (2) Spatial correlation: Existing point cloud-based crack-detection methods tend to ignore the spatial correlation and adjacency of points in complex pavement structures. Since the pavement is relatively flat, neglecting these correlations can lead to insufficient recognition of small crack details, resulting in incomplete detection. (3) Data dependency: Deep learning models based on point clouds typically require large amounts of manually labeled data and involve training a significant number of parameters. This not only increases the required manpower, time, and cost but also drives up computing costs as network complexity grows [15]. These constraints limit the applicability of current point cloud-based crack-detection methods across different scenarios.
Based on the above problems, the following improvements are proposed:
(1) This paper introduces a PAN network built on a U-Net architecture that processes unordered point sets and directly extracts features, avoiding the information loss often caused by dimensionality reduction in traditional methods. This approach effectively preserves the spatial information and geometric characteristics of point cloud data, enhancing the accuracy and efficiency of point cloud processing;
(2) We present a PC-Parallel module featuring a parallel dual-attention branch structure, which flexibly adapts to pavement point cloud data of varying densities and samples. This design enhances the model's ability to understand both overall and local features, especially in extracting crack information from complex structures, and addresses incomplete segmentation results. It not only improves the model's robustness across multi-scale and multi-density data but also enhances its capability to capture crack edges and details, thereby increasing the accuracy of point cloud pavement crack segmentation;
(3) The Poly Loss function is introduced. By adjusting the form of the loss function, the imbalance between crack points and background points is better handled, and sensitivity to edge and detail information is enhanced. This effectively addresses boundary refinement and class imbalance in point cloud pavement crack segmentation, improving segmentation accuracy and model performance;
(4) A large-scale 3D point cloud dataset of pavement cracks suitable for semantic segmentation is established.

3. Materials and Methods

3.1. Point Attention Net Model Overview

As a general point cloud processing framework, PointNet++ excels in tasks like point cloud classification, semantic segmentation, and object detection, demonstrating wide applicability. However, in small-target segmentation tasks such as pavement crack extraction, PointNet++ uses a fixed receptive field. Although this field is gradually expanded through multiple Set Abstraction layers, feature extraction relies solely on the PointNet layer, which feeds the sampled points into fully connected layers, resulting in a relatively simple encoding with low robustness.
To address these issues, this paper proposes a PAN network based on PointNet++, which directly processes unordered point sets as inputs and uses the Set Abstraction module to extract local features at different levels, capturing local structures in the point cloud. Through hierarchical subsampling and aggregation operations, point cloud information is captured at various scales, allowing the network to consider both local structures and global context. This improves the network’s understanding of the overall and local structure of the point cloud, enhancing its ability to process point cloud data with multi-scale characteristics. Using symmetric functions to handle the arrangement of input point clouds makes the network insensitive to point arrangements, ensuring consistent outputs across different input configurations. This approach enhances the model’s generalization and makes it more adaptable to point clouds with various shapes and structures, boosting its robustness in practical applications. Additionally, the proposed PC-Parallel module enlarges the model’s receptive field and strengthens encoding robustness. This module enhances PointNet++’s performance in small target segmentation tasks such as pavement point cloud crack detection, allowing it to better adapt to varying scales and complexities of point cloud data and improving its effectiveness in real-world applications.
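As a minimal illustration of the permutation invariance provided by symmetric functions (not the authors' implementation), the short PyTorch sketch below aggregates point features with channel-wise max pooling and verifies that the result is unchanged under a random reordering of the points; the tensor shapes are arbitrary.

```python
import torch

# Hypothetical illustration of a symmetric (permutation-invariant) aggregation:
# channel-wise max pooling over the point dimension yields the same global feature
# for any ordering of the input points, as in PointNet/PointNet++.
points = torch.randn(1, 64, 1024)            # (batch, channels, points); shapes are arbitrary
perm = torch.randperm(points.shape[-1])      # random reordering of the points
shuffled = points[:, :, perm]

feat_original = points.max(dim=-1).values    # global feature from the original order
feat_shuffled = shuffled.max(dim=-1).values  # global feature from the shuffled order
assert torch.allclose(feat_original, feat_shuffled)  # identical: aggregation is order-insensitive
```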
The PC-Parallel module enhances the network's capability to capture critical features over long distances, increasing adaptability and robustness. As illustrated in Figure 1, the PC-Parallel module is introduced after the SA block, where the PointNet layer features are its input. The module combines spatial and channel attention in parallel, leveraging the strengths of both mechanisms to improve point cloud segmentation performance. By modeling the attention matrix, the spatial attention component effectively captures the spatial relationship between any two points, enhancing the ability to identify local crack features and expanding the model's perception of the crack region. This strengthens the handling of crack shapes, filters out irrelevant and noisy points, reduces redundant information in the point cloud, and improves computational efficiency. Simultaneously, the channel attention component focuses on capturing remote context information in the channel dimension, emphasizing feature channels that strongly differentiate crack features while suppressing those with minor contributions. This reduces noise interference and redundant information, improving the model's overall performance.
Figure 1. PAN model structure, using PointNet++ as the baseline and adding PC-Parallel modules in the early and late stages of the network. Adding the PC-Parallel module in the early stage increases the correlation and encoding between local geometric features; adding it in the later stage enables better feature interaction across all aspects of the object and expands the original receptive field.
Finally, aggregating the outputs of the two attention modules enhances the recognition of crack points across spatial and feature dimensions, achieving more effective multi-scale feature fusion. This process captures crack point characteristics at different sampling levels, allowing the model to better understand the global correlation among crack points. The combined use of spatial and channel attention enables the model to fully perceive and interpret the information within crack points, thereby improving its understanding of the overall structure and pattern. By obtaining better feature representations, these features can be used more accurately to predict crack points and provide more powerful modeling capabilities for crack point cloud segmentation tasks. Moreover, Poly Loss is introduced to adjust the form of the loss function, mitigating the imbalance between crack points and background points and significantly improving the identification accuracy of crack areas.

3.2. PC-Parallel Module

The PC-Parallel module consists of two types of attention branches, as illustrated in Figure 2: the spatial attention branch learns the relationship between different feature points, while the channel attention module captures remote context information along the channel dimension. Finally, the outputs from these two attention modules are aggregated to achieve a more effective point-level feature representation.
Figure 2. PC-Parallel module structure. The two attention modules work together to improve the point cloud model’s understanding of different channel and position combinations and enhance complex structure modeling. Global structure understanding enables more accurate segmentation and better model adaptability, and dynamic weight adjustment promotes feature learning for generalization performance.
Spatial Attention Branch: Spatial attention allows the module to selectively focus on local regions around crack points, enhancing local features and improving the perception of crack areas. This is crucial for crack point segmentation, as crack features are typically small and scattered. By incorporating spatial attention, the model becomes more adaptable to point clouds of varying shapes and structures, enabling a deeper understanding and exploitation of spatial relationships between points to better capture local crack details. This enables the model to focus more on areas with important structure. By guiding the model to focus on the key part of the crack point and filtering irrelevant points and noise points, the computational cost of processing redundant information is effectively reduced, and the overall work efficiency is improved. Furthermore, spatial attention helps the model grasp the global correlations within the point cloud, enhancing its ability to capture overall structures by learning spatial relationships between points. This approach addresses the non-uniformity of point clouds, allowing the model to more accurately process crack points with varying densities and samples, thereby improving its effectiveness in handling pavement crack point cloud data.
We first input local features $A \in \mathbb{R}^{B \times C \times N}$, which are initially processed through two convolutional layers to produce new feature maps $T$ and $P$, with $T \in \mathbb{R}^{B \times \frac{C}{2} \times N}$ and $P \in \mathbb{R}^{B \times \frac{C}{2} \times N}$. Then, we reshape them into $\mathbb{R}^{B \times N}$, where $N = H \times W$. Next, matrix multiplication is performed between the transpose of $T$ and $P$, followed by a softmax layer to compute the spatial attention map $P\_att \in \mathbb{R}^{B \times N \times N}$:
$$\mathrm{softmax}_{btp} = \frac{e^{P\_att_{btp}}}{\sum_{l=1}^{N} e^{P\_att_{btl}}} \quad (1)$$
Here, $P\_att_{btp}$ measures the positional relationship between positions $t$ and $p$, where $b$ is the batch dimension, $t$ the first spatial dimension, and $p$ the second spatial dimension. We then input the feature $A \in \mathbb{R}^{B \times C \times N}$ into another convolutional layer to generate a new feature map $G \in \mathbb{R}^{B \times \frac{C}{2} \times N}$ and reshape it to $\mathbb{R}^{B \times N}$. Next, we perform matrix multiplication between $P\_att$ and $G$ and reshape the result to $\mathbb{R}^{B \times N}$. Finally, the result is passed through a convolutional layer and, after a normalization operation, element-wise summed with feature $A$ to obtain the final output $E \in \mathbb{R}^{B \times C \times N}$:
$$E_{att} = \mathrm{GN}\left(\mathrm{Conv1d}_{out}\left(\mathrm{softmax}\left(T^{T} P\right)_{cn}\, G\right)\right) + \mathrm{residual} \quad (2)$$
As the formula shows, the final spatial attention feature $E_{att}$ at each position is a weighted sum of the features at all positions added to the original features, where $c$ is the number of channels and $n$ is the number of points. It therefore has a global context view and selectively aggregates context according to the spatial attention map, helping the model focus on local areas, enhance local features, and reduce interference from background point clouds. Similar semantic features reinforce each other, thereby improving intra-class compactness and semantic consistency.
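For concreteness, the PyTorch sketch below shows one possible reading of the spatial attention branch described above; the 1 × 1 convolutions, the use of GroupNorm for the "GN" step, and the residual connection to $A$ are assumptions based on Equations (1) and (2), not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialAttentionBranch(nn.Module):
    """Sketch of the spatial attention branch: softmax(T^T P) re-weights features G (Equations (1)-(2))."""

    def __init__(self, channels: int, groups: int = 8):
        super().__init__()
        mid = channels // 2
        self.conv_t = nn.Conv1d(channels, mid, 1)    # produces T
        self.conv_p = nn.Conv1d(channels, mid, 1)    # produces P
        self.conv_g = nn.Conv1d(channels, mid, 1)    # produces G
        self.conv_out = nn.Conv1d(mid, channels, 1)  # maps back to C channels
        self.norm = nn.GroupNorm(groups, channels)   # the "GN" normalization; channels % groups == 0

    def forward(self, a: torch.Tensor) -> torch.Tensor:
        # a: (B, C, N) point features from the preceding Set Abstraction / PointNet layer
        t, p, g = self.conv_t(a), self.conv_p(a), self.conv_g(a)   # each (B, C/2, N)
        attn = F.softmax(torch.bmm(t.transpose(1, 2), p), dim=-1)  # (B, N, N) spatial attention map
        out = torch.bmm(g, attn.transpose(1, 2))                   # (B, C/2, N) weighted sum over positions
        return self.norm(self.conv_out(out)) + a                   # residual connection to the input A
```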
Channel Attention Branch: Channel attention is crucial for dynamically adjusting feature weights across different channels in the point cloud during learning. It emphasizes the importance of various feature channels, enhancing crack point representation by adjusting each channel’s weight. Important feature channels are highlighted by emphasizing those with significant differentiation for crack points. Meanwhile, redundant feature channels that contribute little to the crack point cloud segmentation task are suppressed, reducing noise interference and improving the model’s overall performance and segmentation accuracy for cracks.
By introducing average pooling and max pooling operations, we integrated the spatial information of the feature map, producing two independent spatial context descriptors named AvgPool and MaxPool, representing average and max pooling features, respectively. These descriptors are then fed into a shared network consisting of a multilayer perceptron (MLP) with hidden layers. To balance parameter count and effectiveness, we set the hidden activation size to $\mathbb{R}^{C/r \times 1 \times 1}$, where $r$ is the decay ratio. After applying the shared network to each descriptor and processing the merged features through the sigmoid activation function, we obtain the channel attention map $H_{att}$. This map generation relies on learning a shared network with relatively few parameters, reducing computational load. Channel attention is calculated as follows:
$$H_{att} = \mathrm{sigmoid}\left(\mathrm{MLP}(\mathrm{AvgPool}(A)) + \mathrm{MLP}(\mathrm{MaxPool}(A))\right) \quad (3)$$
This channel attention design enables the model to adaptively focus on each channel’s information based on task requirements, thereby enhancing the network’s sensitivity to point cloud features. This mechanism effectively captures inter-channel relationships in point clouds, offering a more accurate feature representation for segmentation tasks and improving the model’s performance and generalization capabilities.
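A corresponding sketch of the channel attention branch in Equation (3) is given below; the reduction ratio $r$, the two-layer shared MLP, and the choice to return the channel-re-weighted features (so that the two branch outputs can be summed as in Equation (4)) are assumptions rather than the authors' implementation.

```python
import torch
import torch.nn as nn

class ChannelAttentionBranch(nn.Module):
    """Sketch of the channel attention branch in Equation (3)."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # Shared MLP applied to both pooled descriptors; hidden size C / r.
        self.mlp = nn.Sequential(
            nn.Conv1d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv1d(channels // reduction, channels, 1, bias=False),
        )

    def forward(self, a: torch.Tensor) -> torch.Tensor:
        # a: (B, C, N); pool over the point dimension to obtain two channel descriptors
        avg_desc = torch.mean(a, dim=-1, keepdim=True)                  # (B, C, 1) AvgPool descriptor
        max_desc = torch.max(a, dim=-1, keepdim=True).values            # (B, C, 1) MaxPool descriptor
        h_att = torch.sigmoid(self.mlp(avg_desc) + self.mlp(max_desc))  # (B, C, 1) channel attention map
        return a * h_att                                                # channel-re-weighted features of A
```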
Finally, by aggregating the outputs of the channel attention module and the spatial attention module into $F_{att} \in \mathbb{R}^{B \times C \times N}$, the model can learn the relationship between channels and locations more comprehensively, so that it can understand and represent both the details and the global information of pavement cracks. The synergistic effect of these two attention modules allows the model to more effectively comprehend the combination of different channels at various positions within the point cloud. This capability enables the model to adapt more efficiently to cracks of diverse shapes, sizes, and positions, ensuring stable performance across different types of cracks. Additionally, this synergy enhances the model's ability to accurately locate and segment cracks, significantly improving the precision and accuracy of segmentation tasks. By understanding the spatial and channel relationships within the point cloud, the model can deliver more reliable and consistent results in crack detection and analysis. In summary, this fusion strategy provides a more comprehensive and flexible feature learning mechanism for point cloud segmentation tasks.
$$F_{att} = E_{att} + H_{att} \quad (4)$$
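Continuing the two sketches above, the PC-Parallel module could be assembled as follows: both branches are applied to the same input features and their outputs are summed, as in Equation (4). In the structure of Figure 1, such a module would sit after the early and late SA blocks.

```python
import torch
import torch.nn as nn

class PCParallel(nn.Module):
    """Sketch of the PC-Parallel module: both attention branches run on the same input
    in parallel and their outputs are summed, following Equation (4)."""

    def __init__(self, channels: int):
        super().__init__()
        self.spatial = SpatialAttentionBranch(channels)  # sketch defined above
        self.channel = ChannelAttentionBranch(channels)  # sketch defined above

    def forward(self, a: torch.Tensor) -> torch.Tensor:
        e_att = self.spatial(a)   # (B, C, N) spatially re-weighted features
        h_att = self.channel(a)   # (B, C, N) channel re-weighted features
        return e_att + h_att      # F_att = E_att + H_att
```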

3.3. Poly Loss Function

In the original PointNet++ framework, NLL Loss (Negative Log Likelihood Loss) is used as the standard loss function and is calculated from the predicted probability distribution. By computing the negative log-likelihood between the predicted class probability distribution of pixels or points and the true label, the model is guided to optimize. However, for crack point cloud segmentation tasks, NLL Loss overlooks the local structure of cracks and the order of points, leading to a lack of global and local structural information and poor robustness against point cloud rotations or translations.
Given the characteristics of pavement crack point cloud data, there is an imbalance in the distribution of crack sample points and background points. To effectively address this issue in pavement crack segmentation, this study introduced Poly Loss [59]. The core idea of Poly Loss is to enhance the deep learning model's robustness and accuracy by designing specific loss functions tailored to the segmentation task. In the point cloud pavement crack segmentation task, cracks usually occupy a small part of the point cloud, while most of the area is normal pavement, and this data imbalance prevents the original loss function from effectively identifying the crack area. By introducing weighting factors or polynomial terms, Poly Loss effectively addresses the imbalance between crack and background points, significantly enhancing crack region identification accuracy. It makes the model more sensitive to edge and detail information, allowing better preservation and recognition of cracks' fine structure. By adjusting polynomial coefficients, Poly Loss optimizes the model's predictive performance for crack point cloud segmentation, in line with the task requirements for pavement crack point cloud segmentation. Balancing the model's prediction ability for crack points and background points improves overall model performance. Poly Loss is defined as follows:
$$L_{\text{Poly-1}} = -\log(P_t) + \epsilon_1 (1 - P_t) \quad (5)$$
Here, $\epsilon_1$ is an additional hyperparameter used to adjust the first polynomial coefficient, and $P_t$ is the predicted probability of the crack point category. By adding the term $\epsilon_1 (1 - P_t)$ to the original cross-entropy loss $-\log(P_t)$ to adjust the first polynomial coefficient, classification performance is improved. At the same time, the softmax operation is introduced to deal effectively with class imbalance and strengthen the learning of minority categories.
In this paper, we employed the Poly Loss function to train the entire network architecture. For the semantic segmentation of pavement cracks in point cloud data, Poly Loss effectively addresses category imbalance by adjusting polynomial coefficients to weight crack and background points. This approach enhances the model’s robustness against noise and outliers, significantly boosting performance. It also improves the model’s stability and reliability in complex scenarios.
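As a reference, a minimal sketch of the Poly-1 formulation in Equation (5) for per-point crack/background classification is given below; the default value of $\epsilon_1$, the flattened per-point logits layout, and the mean reduction are assumptions, not values taken from the paper.

```python
import torch
import torch.nn.functional as F

def poly1_loss(logits: torch.Tensor, target: torch.Tensor, epsilon: float = 1.0) -> torch.Tensor:
    """Poly-1 loss: cross-entropy plus epsilon * (1 - P_t), averaged over points (Equation (5)).

    logits: (num_points, num_classes) raw scores; target: (num_points,) class indices.
    """
    ce = F.cross_entropy(logits, target, reduction="none")                         # -log(P_t) per point
    p_t = torch.softmax(logits, dim=-1).gather(1, target.unsqueeze(1)).squeeze(1)  # probability of the true class
    return (ce + epsilon * (1.0 - p_t)).mean()
```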

4. Results

4.1. Implementation

This study used the Ubuntu 20.04 operating system, Python 3.8.18, the PyTorch 1.12.1 deep learning framework, and CUDA 11.5.50. The CPU was an Intel(R) Xeon(R) Gold 6133 @ 2.50 GHz, and the GPU was an NVIDIA GeForce RTX 3090. All models were trained for 300 epochs with an initial learning rate of 0.001; the learning rate was updated every epoch, the optimizer was Adam, and the batch size was 4. The hyperparameters of the Point Attention Net model are shown in Table 1. The proposed Point Attention Net model was trained on LNTU-RDD-LiDAR Road-1 and Road-4, validated on Road-3, and tested on Road-2.
Table 1. Point Attention Net model parameter settings.
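For illustration, the training loop below mirrors the stated settings (Adam optimizer, initial learning rate 0.001, per-epoch learning-rate update, batch size 4, 300 epochs) and reuses the poly1_loss sketch from Section 3.3; the tiny stand-in classifier, the random data, and the StepLR decay factor are placeholders, not the authors' configuration.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins: a tiny per-point classifier and random batches replace the PAN
# network and the LNTU-RDD-LiDAR loader. Only the stated settings come from the paper:
# Adam, initial learning rate 0.001, per-epoch learning-rate update, batch size 4, 300 epochs.
model = nn.Sequential(nn.Conv1d(4, 32, 1), nn.ReLU(), nn.Conv1d(32, 2, 1))  # xyz + intensity -> crack/background
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.95)  # assumed decay rule

for epoch in range(300):
    for _ in range(10):                                    # placeholder for the Road-1/Road-4 batches
        points = torch.randn(4, 4, 4096)                   # batch of 4 blocks, 4 features, 4096 points each
        labels = torch.randint(0, 2, (4, 4096))            # per-point crack / background labels
        logits = model(points).permute(0, 2, 1).reshape(-1, 2)
        loss = poly1_loss(logits, labels.reshape(-1))      # Poly-1 loss sketch from Section 3.3
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()                                       # learning rate updated once per epoch
```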

4.2. Quantitative Assessment Measures

To comprehensively evaluate the effectiveness of pavement crack detection, this study used IoU, mIoU, Pre, Rec, and F1 as the main evaluation metrics. Here, Pre is the percentage of predicted crack points that are correct, assessing model precision, as shown in Equation (6). Rec is the ratio of correctly identified crack points to all crack points, evaluating detection completeness, as shown in Equation (7). F1 is the comprehensive evaluation index of Pre and Rec, as shown in Equation (8). TP denotes the number of points in the crack region correctly identified, FP the number of points incorrectly identified as cracks, and FN the number of crack points not identified.
$$Pre = \frac{TP}{TP + FP} \quad (6)$$
$$Rec = \frac{TP}{TP + FN} \quad (7)$$
$$F1 = \frac{2 \times Pre \times Rec}{Pre + Rec} \quad (8)$$
mIoU is used to evaluate the performance of deep learning point cloud segmentation tasks. Its calculation is shown in Equation (10), where $N$ represents the number of categories. IoU is used as an evaluation index for the segmentation of each crack category, as shown in Equation (9).
$$IoU_i = \frac{TP_i}{TP_i + FP_i + FN_i} \quad (9)$$
$$mIoU = \frac{1}{N} \sum_{i=1}^{N} \frac{TP_i}{TP_i + FP_i + FN_i} \quad (10)$$
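A small sketch of how these metrics could be computed from predicted and ground-truth point labels is given below, assuming a binary labeling with cracks as class 1 and background as class 0; the function name and the small epsilon guarding against division by zero are illustrative choices.

```python
import numpy as np

def segmentation_metrics(pred: np.ndarray, gt: np.ndarray, num_classes: int = 2):
    """Compute Pre, Rec, F1 for the crack class (label 1) and mIoU over all classes."""
    ious = []
    for c in range(num_classes):
        tp_c = np.sum((pred == c) & (gt == c))
        fp_c = np.sum((pred == c) & (gt != c))
        fn_c = np.sum((pred != c) & (gt == c))
        ious.append(tp_c / (tp_c + fp_c + fn_c + 1e-12))   # Equation (9), per class
    tp = np.sum((pred == 1) & (gt == 1))                   # crack-class counts
    fp = np.sum((pred == 1) & (gt != 1))
    fn = np.sum((pred != 1) & (gt == 1))
    pre = tp / (tp + fp + 1e-12)                           # Equation (6)
    rec = tp / (tp + fn + 1e-12)                           # Equation (7)
    f1 = 2 * pre * rec / (pre + rec + 1e-12)               # Equation (8)
    return pre, rec, f1, float(np.mean(ious))              # last value is mIoU, Equation (10)
```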

4.3. LNTU-RDD-LiDAR Dataset

In similar studies, most pavement crack information extraction methods focus on a single crack type, leading to incomplete crack type coverage. To address this, we developed a pavement crack point cloud dataset named LNTU-RDD-LiDAR for the segmentation task of pavement cracks on point cloud data. Provincial roads served as the collection source, where varying weather conditions and terrain changes resulted in a wider variety of surface cracks. This diversity makes regular, timely, and comprehensive road maintenance more challenging.
Collection: In the data acquisition stage, a WPL7-808-10W laser was used as the point cloud scanner and, combined with an airborne laser measurement system, the road point cloud was recorded to construct the original point cloud dataset. The airborne laser measurement system consists of two WPL7-808-10W lasers, two 3D AT-C2 cameras, and an inertial navigation unit. The clear distance between the bottom of the laser sensor and the road surface was 1900 mm. The point cloud measurement covers a range of 4 m to 2.1 m, the ranging accuracy reaches 0.5 mm, the absolute accuracy is 10 mm, and the elevation error between lanes is only 3 mm. These scanning parameters and settings were chosen to ensure that the obtained point cloud data reached a resolution of 0.01 mm. The entire experimental dataset contains 188 pavement point cloud segments, each containing an average of 20 million points; each segment is 3.8 m wide and about 16 m long. As summarized in Table 2, the size and high resolution of the dataset provide detailed and comprehensive pavement information for this study, laying a solid foundation for subsequent crack detection and analysis.
Table 2. Parameters of 3D laser road acquisition equipment.
Labeling: After collecting and processing the pavement point cloud data with the airborne laser measurement system, a pavement crack point cloud labeling task was carried out. The point cloud labeling tool CloudCompare v2.13 was used in this paper. CloudCompare is a powerful open-source point cloud data processing package with rich functions, including import, export, and editing, and its intuitive 3D visual interface makes visual analysis and interactive operation convenient. It can efficiently process and annotate point cloud data, providing a good data foundation for subsequent tasks.
Table 3 lists the detailed requirements of the labeling task, including the accurate location, shape, and size of the cracks. The whole labeling process covered 188 crack point cloud pavement segments, with a ratio of crack points to non-crack points of about 0.6:9.4. The LNTU-RDD-LiDAR data were then divided into four subsets: Road-1, Road-2, Road-3, and Road-4, containing 51, 17, 59, and 61 pavement segments, respectively. Background points were also labeled to account for the overall road surface information. This provides a high-quality labeled dataset for semantic segmentation of pavement crack point cloud data, as illustrated in Figure 3. Multi-person verification and quality control ensured the accuracy and consistency of the annotation results, providing a reliable basis for model training and evaluation.
Table 3. LNTU_RDD_PCD dataset description table.
Figure 3. LNTU-RDD-LiDAR pavement crack point cloud dataset. (a) Gray-value image generated from the z values of the point cloud. (b) Depth image generated from the z values of the point cloud. (c) Corresponding manually labeled ground truth of the road cracks.

4.4. Comparative Experiment

To verify the effectiveness of the proposed Point Attention Net model for pavement crack point cloud data segmentation, this study tested it on the LNTU-RDD-LiDAR point cloud dataset and compared the model with existing methods, including PointNet [40], PointNet++ [41], Point Transformer [47], and PointMLP [50]. PointNet, a pioneer of point-by-point classification, uses a symmetric function invariant to permutations but struggles to learn local features in complex road scenarios. Point Transformer introduces a self-attention mechanism to capture the spatial structure of point clouds and establish a global context. PointMLP incorporates a lightweight geometric affine module to enhance performance.
From Table 4, it is clear that the PAN model improves in mIoU, Rec, F1, and Acc compared to the PointNet, PointNet++, Point Transformer, and PointMLP models: mIoU improved by 0.9% relative to the Point Transformer model, Acc improved by 0.1% relative to the PointMLP model, and F1 improved by 1.3% relative to the Point Transformer model. As illustrated in Figure 4, the experimental results show that, compared with most traditional deep learning algorithms, the Point Attention Net model mainly benefits from two-branch attention fusion, which enhances the descriptive power of its feature encoding.
Table 4. Comparison of semantic segmentation results between PAN and PointNet, PointNet++, PointTransformer, and PointMLP on the LNTU-RDD-LiDAR dataset.
Figure 4. Visual comparison results and corresponding detailed observations of pavement crack information extraction between PAN and PointNet, PointNet++, PointTransformer, and PointMLP methods on the LNTU-RDD-LiDAR dataset.

4.5. Ablation Experiments

PC-Parallel: Table 5 shows that pavement crack information extraction performance improves gradually as attention blocks are added. Compared with the original PointNet++ baseline network, introducing the location attention module and the channel attention module significantly improves the segmentation of pavement crack point clouds. In this case, mIoU reached 75.4%, Rec reached 67.1%, F1 reached 75.4%, and Acc reached 91.5%.
Table 5. Ablation experiments of PC-Att-Parallel module test on LNTU-RDD-LiDAR dataset.
As shown in Figure 5, incorporating the PC-Parallel module effectively addresses issues of incomplete segmentation and imprecise boundaries. The location attention module learns spatial interdependencies between features, while the channel attention module captures dependencies between channels. Both attention blocks positively impact pavement crack extraction performance. Introducing these modules and fusing their features significantly enhances performance. Consequently, the two-branch attention block demonstrates remarkable effectiveness in the crack segmentation task of pavement point cloud data.
Figure 5. The PC-Parallel module on LNTU-RDD-LiDAR dataset tested the results of visual comparison and detailed observation of the ablation experimental method.
Poly Loss: Table 6 shows that pavement crack information extraction performance improves when the loss function is replaced. Compared to the original PointNet++ baseline, introducing Poly Loss to handle class imbalance and focus on difficult-to-classify samples effectively improves the performance and robustness of the model. In this case, mIoU reached 74.0%, Rec reached 63.3%, F1 reached 73.5%, and Acc reached 91.1%. Therefore, Poly Loss shows remarkable effectiveness in the pavement point cloud crack segmentation task. As Figure 6 shows, replacing the loss function in the original PointNet++ baseline with Poly Loss effectively alleviates the class imbalance problem: the loss function highlights the features of foreground and background points so that the model can better distinguish crack points from other points.
Table 6. Ablation experiments of Poly Loss test on LNTU-RDD-LiDAR dataset.
Figure 6. Visual comparison and detailed observation of ablation experiments with Poly Loss test on LNTU-RDD-LiDAR dataset.

5. Discussion

The 3D laser point cloud pavement crack data used in this paper are more robust than 2D images under varying illumination conditions and in low-intensity-contrast environments, and they can effectively handle various kinds of rust and oil covering the road surface. Experiments with the proposed model show that the mIoU, Acc, F1, and Rec indexes are significantly improved, which confirms the effectiveness of the proposed method compared with traditional methods. The introduction of the Poly Loss function helps to better capture crack edges and details.
However, the dataset in this paper consists largely of 3D point cloud data of pavement cracks collected under clear-weather conditions and does not fully cover all environmental conditions. On rainy and snowy days, water and snow can mask pavement cracks, making it difficult for laser scans to capture precise crack features. In addition, the point cloud data may contain additional noise, reducing the detection effect and recognition ability of the model.
To overcome these limitations and further explore the application of 3D point cloud datasets in pavement crack segmentation tasks, future research will focus on the following aspects:
1. Expanding datasets and multi-source data integration: We plan to collect more samples under various environmental conditions to enhance the generalization ability of the model. By combining other sensor data such as RGB images and thermal imaging, the deficiency of point cloud data can be supplemented to provide richer environmental information. In addition, we will develop automated data-labeling tools to automatically generate annotated data using deep learning and machine learning algorithms. This will help speed up labeling, reduce reliance on manual labeling, and improve the consistency and accuracy of labeling;
2. Technology integration and practical application: We will study how to seamlessly integrate the technology into the existing monitoring system and apply it to the actual road maintenance scenario. This includes addressing compatibility issues and adding new features without affecting existing system functionality. To ensure the effectiveness of the technology in practical scenarios, we plan to promote the application in cooperation with experts in relevant fields. This will help us identify and address potential implementation issues and optimize the technology based on actual needs. At the same time, we will evaluate implementation and maintenance costs to design solutions that are both efficient and cost-effective to support a wide range of applications.
Through these efforts, we hope to significantly improve the usefulness of the dataset and the applicability of the technology, providing a solid foundation for future road monitoring and maintenance.

6. Conclusions

This paper introduces the Point Attention Net (PAN) network for extracting 3D pavement crack information, aiming to overcome the limitations of previous point cloud-based deep learning approaches. The PAN network incorporates a novel PC-Parallel module specifically designed to learn the spatial interdependencies and channel dependencies of features separately. This design significantly enhances point cloud pavement crack segmentation by allowing the network to more effectively capture and process the intricate details of crack features. Additionally, the boundary refinement and class imbalance issues in point cloud pavement crack segmentation are addressed by introducing the Poly Loss function. The test results on the LNTU-RDD-LiDAR dataset show that the proposed method performs well on mIoU, F1, Rec, and Acc, reaching 75.4%, 75.4%, 67.1%, and 86.8%, respectively. In comparison to existing point cloud segmentation methods, the proposed approach demonstrates superior performance, improving the mIoU and F1 indexes by 1.1% and 1.3%, respectively. The experimental results demonstrate that the proposed method significantly enhances point cloud pavement crack segmentation.
In future studies, we will explore more effective class-balancing strategies to enhance the model’s performance and generalization on unbalanced datasets such as highway pavement crack point cloud data. Additionally, we aim to increase the number of samples and scene categories, expanding the model’s training set to encompass a broader range of practical situations, thereby improving its generalization ability when encountering unknown data.

Author Contributions

Conceptualization, J.F. and S.S.; data curation, J.F.; funding acquisition, W.S. and S.S.; investigation, J.F. and J.Z.; methodology, J.F.; project administration, W.S.; resources, G.J. (Guohui Jia) and G.J. (Guang Jin); software, J.F.; visualization, J.F.; writing—original draft, J.F.; writing—review and editing, S.S., J.Z. and G.J. (Guohui Jia). All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (grant number 42071343).

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

Author Guang Jin was employed by the company Iroadc (Liaoning) Transportation Tech. Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Liu, Z.; Cao, Y.; Wang, Y.; Wang, W. Computer vision-based concrete crack detection using U-net fully convolutional networks. Autom. Constr. 2019, 104, 129–139. [Google Scholar] [CrossRef]
  2. Oliveira, H.; Correia, P.L. Automatic road crack segmentation using entropy and image dynamic thresholding. In Proceedings of the 2009 17th European Signal Processing Conference, Glasgow, UK, 24–28 August 2009; pp. 622–626. [Google Scholar]
  3. Peng, L.; Chao, W.; Shuangmiao, L.; Baocai, F. Research on crack detection method of airport runway based on twice-threshold segmentation. In Proceedings of the 2015 Fifth International Conference on Instrumentation and Measurement, Computer, Communication and Control (IMCCC), Qinhuangdao, China, 18–20 September 2015; pp. 1716–1720. [Google Scholar]
  4. Lim, R.S.; La, H.M.; Shan, Z.; Sheng, W. Developing a crack inspection robot for bridge maintenance. In Proceedings of the 2011 IEEE International Conference on Robotics and Automation, Shanghai, China, 9–13 May 2011; pp. 6288–6293. [Google Scholar]
  5. Lim, R.S.; La, H.M.; Sheng, W. A robotic crack inspection and mapping system for bridge deck maintenance. IEEE Trans. Autom. Sci. Eng. 2014, 11, 367–378. [Google Scholar] [CrossRef]
  6. Gavilán, M.; Balcones, D.; Marcos, O.; Llorca, D.F.; Sotelo, M.A.; Parra, I.; Ocaña, M.; Aliseda, P.; Yarza, P.; Amírola, A. Adaptive road crack detection system by pavement classification. Sensors 2011, 11, 9628–9657. [Google Scholar] [CrossRef] [PubMed]
  7. Zou, Q.; Cao, Y.; Li, Q.; Mao, Q.; Wang, S. CrackTree: Automatic crack detection from pavement images. Pattern Recognit. Lett. 2012, 33, 227–238. [Google Scholar] [CrossRef]
  8. Tian, F.; Zhao, Y.; Che, X.; Zhao, Y.; Xin, D. Concrete crack identification and image mosaic based on image processing. Appl. Sci. 2019, 9, 4826. [Google Scholar] [CrossRef]
  9. Cao, X.; Li, T.; Bai, J.; Wei, Z. Identification and Classification of Surface Cracks on Concrete Members Based on Image Processing. Trait. Du Signal 2020, 37, 519–525. [Google Scholar] [CrossRef]
  10. Kang, Y.; Yu, A.; Zeng, W. Construction of Concrete Surface Crack Recognition Model Based on Digital Image Processing Technology. J. Phys. Conf. Ser. 2021, 2074, 012067. [Google Scholar] [CrossRef]
  11. Tong, Z.; Yuan, D.; Gao, J.; Wei, Y.; Dou, H. Pavement-distress detection using ground-penetrating radar and network in networks. Constr. Build. Mater. 2020, 233, 117352. [Google Scholar] [CrossRef]
  12. Zhang, J.; Pu, J.; Xue, J.; Yang, M.; Xu, X.; Wang, X.; Wang, F.-Y. HiVeGPT: Human-machine-augmented intelligent vehicles with generative pre-trained transformer. IEEE Trans. Intell. Veh. 2023, 8, 2027–2033. [Google Scholar] [CrossRef]
  13. Turkan, Y.; Hong, J.; Laflamme, S.; Puri, N. Adaptive wavelet neural network for terrestrial laser scanner-based crack detection. Autom. Constr. 2018, 94, 191–202. [Google Scholar] [CrossRef]
  14. Wang, L.; Zhang, X.; Song, Z.; Bi, J.; Zhang, G.; Wei, H.; Tang, L.; Yang, L.; Li, J.; Jia, C. Multi-modal 3D Object Detection in Autonomous Driving: A Survey and Taxonomy. IEEE Trans. Intell. Veh. 2023, 8, 3781–3798. [Google Scholar] [CrossRef]
  15. Feng, H.; Ma, L.; Yu, Y.; Chen, Y.; Li, J. SCL-GCN: Stratified Contrastive Learning Graph Convolution Network for pavement crack detection from mobile LiDAR point clouds. Int. J. Appl. Earth Obs. Geoinf. 2023, 118, 103248. [Google Scholar] [CrossRef]
  16. Cheng, H.; Shi, X.; Glazier, C. Real-time image thresholding based on sample space reduction and interpolation approach. J. Comput. Civ. Eng. 2003, 17, 264–272. [Google Scholar] [CrossRef]
  17. Huang, Y.; Xu, B. Automatic inspection of pavement cracking distress. J. Electron. Imaging 2006, 15, 013017. [Google Scholar] [CrossRef]
  18. Ying, L.; Salari, E. Beamlet transform-based technique for pavement crack detection and classification. Comput.-Aided Civ. Infrastruct. Eng. 2010, 25, 572–580. [Google Scholar] [CrossRef]
  19. Zhang, A.; Li, Q.; Wang, K.C.; Qiu, S. Matched filtering algorithm for pavement cracking detection. Transp. Res. Rec. 2013, 2367, 30–42. [Google Scholar] [CrossRef]
  20. Zalama, E.; Gómez-García-Bermejo, J.; Medina, R.; Llamas, J. Road crack detection using visual features extracted by Gabor filters. Comput.-Aided Civ. Infrastruct. Eng. 2014, 29, 342–358. [Google Scholar] [CrossRef]
  21. Zhang, L.; Yang, F.; Zhang, Y.D.; Zhu, Y.J. Road crack detection using deep convolutional neural network. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 3708–3712. [Google Scholar]
  22. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef]
  23. Cha, Y.J.; Choi, W.; Büyüköztürk, O. Deep learning-based crack damage detection using convolutional neural networks. Comput.-Aided Civ. Infrastruct. Eng. 2017, 32, 361–378. [Google Scholar] [CrossRef]
  24. An, Y.-K.; Jang, K.; Kim, B.; Cho, S. Deep learning-based concrete crack detection using hybrid images. In Proceedings of the Sensors and Smart Structures Technologies for Civil, Mechanical, and Aerospace Systems 2018, Denver, CO, USA, 5–8 March 2018; pp. 273–284. [Google Scholar]
  25. Fan, Z.; Wu, Y.; Lu, J.; Li, W. Automatic pavement crack detection based on structured prediction with the convolutional neural network. arXiv 2018, arXiv:1802.02208. [Google Scholar]
  26. Yang, X.; Li, H.; Yu, Y.; Luo, X.; Huang, T.; Yang, X. Automatic pixel-level crack detection and measurement using fully convolutional network. Comput.-Aided Civ. Infrastruct. Eng. 2018, 33, 1090–1109. [Google Scholar] [CrossRef]
  27. Huang, H.-W.; Li, Q.-T.; Zhang, D.-M. Deep learning based image recognition for crack and leakage defects of metro shield tunnel. Tunn. Undergr. Space Technol. 2018, 77, 166–176. [Google Scholar] [CrossRef]
  28. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440. [Google Scholar]
  29. Dorafshan, S.; Thomas, R.J.; Maguire, M. Comparison of deep convolutional neural networks and edge detectors for image-based crack detection in concrete. Constr. Build. Mater. 2018, 186, 1031–1045. [Google Scholar] [CrossRef]
  30. Song, L.; Wang, X. Faster region convolutional neural network for automated pavement distress detection. Road Mater. Pavement Des. 2021, 22, 23–41. [Google Scholar] [CrossRef]
  31. Yang, F.; Zhang, L.; Yu, S.; Prokhorov, D.; Mei, X.; Ling, H. Feature pyramid and hierarchical boosting network for pavement crack detection. IEEE Trans. Intell. Transp. Syst. 2019, 21, 1525–1535. [Google Scholar] [CrossRef]
  32. Jahanshahi, M.R.; Jazizadeh, F.; Masri, S.F.; Becerik-Gerber, B. Unsupervised approach for autonomous pavement-defect detection and quantification using an inexpensive depth sensor. J. Comput. Civ. Eng. 2013, 27, 743–754. [Google Scholar] [CrossRef]
  33. Ouyang, W.; Xu, B. Pavement cracking measurements using 3D laser-scan images. Meas. Sci. Technol. 2013, 24, 105204. [Google Scholar] [CrossRef]
  34. Guan, H.; Li, J.; Yu, Y.; Chapman, M.; Wang, H.; Wang, C.; Zhai, R. Iterative tensor voting for pavement crack extraction using mobile laser scanning data. IEEE Trans. Geosci. Remote Sens. 2014, 53, 1527–1537. [Google Scholar] [CrossRef]
  35. Yu, Y.; Li, J.; Guan, H.; Wang, C. 3D crack skeleton extraction from mobile LiDAR point clouds. In Proceedings of the 2014 IEEE Geoscience and Remote Sensing Symposium, Quebec City, QC, Canada, 13–18 July 2014; pp. 914–917. [Google Scholar]
  36. Jiang, H.; Li, Q.; Jiao, Q.; Wang, X.; Wu, L. Extraction of wall cracks on earthquake-damaged buildings based on TLS point clouds. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 3088–3096. [Google Scholar] [CrossRef]
  37. Xu, X.; Yang, H. Intelligent crack extraction and analysis for tunnel structures with terrestrial laser scanning measurement. Adv. Mech. Eng. 2019, 11, 1687814019872650. [Google Scholar] [CrossRef]
  38. Ma, L.; Li, J. SD-GCN: Saliency-based dilated graph convolution network for pavement crack extraction from 3D point clouds. Int. J. Appl. Earth Obs. Geoinf. 2022, 111, 102836. [Google Scholar] [CrossRef]
  39. Zhong, M.; Sui, L.; Wang, Z.; Hu, D. Pavement Crack Detection from Mobile Laser Scanning Point Clouds Using a Time Grid. Sensors 2020, 20, 4198. [Google Scholar] [CrossRef]
  40. Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 652–660. [Google Scholar]
  41. Qi, C.R.; Yi, L.; Su, H.; Guibas, L.J. Pointnet++: Deep hierarchical feature learning on point sets in a metric space. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
  42. Wang, Y.; Sun, Y.; Liu, Z.; Sarma, S.E.; Bronstein, M.M.; Solomon, J.M. Dynamic graph cnn for learning on point clouds. ACM Trans. Graph. (Tog) 2019, 38, 146. [Google Scholar] [CrossRef]
  43. Fan, S.; Dong, Q.; Zhu, F.; Lv, Y.; Ye, P.; Wang, F.-Y. SCF-Net: Learning spatial contextual features for large-scale point cloud segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 14504–14513. [Google Scholar]
  44. Xu, M.; Ding, R.; Zhao, H.; Qi, X. Paconv: Position adaptive convolution with dynamic kernel assembling on point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 3173–3182. [Google Scholar]
  45. Yu, H.; Huang, H.; Zheng, J.; Zhao, T.; Zhou, X. Non-contact on-line inspection method for surface defects of cross-rolling piercing plugs for seamless steel tubes. China Mech. Eng. 2022, 33, 1717. [Google Scholar]
  46. Guo, M.-H.; Cai, J.-X.; Liu, Z.-N.; Mu, T.-J.; Martin, R.R.; Hu, S.-M. Pct: Point cloud transformer. Comput. Vis. Media 2021, 7, 187–199. [Google Scholar] [CrossRef]
  47. Zhao, H.; Jiang, L.; Jia, J.; Torr, P.H.; Koltun, V. Point transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 16259–16268. [Google Scholar]
  48. Engel, N.; Belagiannis, V.; Dietmayer, K. Point transformer. IEEE Access 2021, 9, 134826–134840. [Google Scholar] [CrossRef]
  49. Zhou, Y.; Ji, A.; Zhang, L. Sewer defect detection from 3D point clouds using a transformer-based deep learning model. Autom. Constr. 2022, 136, 104163. [Google Scholar] [CrossRef]
  50. Ma, X.; Qin, C.; You, H.; Ran, H.; Fu, Y. Rethinking network design and local geometry in point cloud: A simple residual MLP framework. arXiv 2022, arXiv:2202.07123. [Google Scholar]
  51. Ran, H.; Liu, J.; Wang, C. Surface representation for point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 18942–18952. [Google Scholar]
  52. Qian, G.; Li, Y.; Peng, H.; Mai, J.; Hammoud, H.; Elhoseiny, M.; Ghanem, B. Pointnext: Revisiting pointnet++ with improved training and scaling strategies. Adv. Neural Inf. Process. Syst. 2022, 35, 23192–23204. [Google Scholar]
  53. Zhang, A.; Wang, K.C.; Li, B.; Yang, E.; Dai, X.; Peng, Y.; Fei, Y.; Liu, Y.; Li, J.Q.; Chen, C. Automated pixel-level pavement crack detection on 3D asphalt surfaces using a deep-learning network. Comput.-Aided Civ. Infrastruct. Eng. 2017, 32, 805–819. [Google Scholar] [CrossRef]
  54. Zhang, A.; Wang, K.C.; Fei, Y.; Liu, Y.; Tao, S.; Chen, C.; Li, J.Q.; Li, B. Deep learning–based fully automated pavement crack detection on 3D asphalt surfaces with an improved CrackNet. J. Comput. Civ. Eng. 2018, 32, 04018041. [Google Scholar] [CrossRef]
  55. Fei, Y.; Wang, K.C.; Zhang, A.; Chen, C.; Li, J.Q.; Liu, Y.; Yang, G.; Li, B. Pixel-level cracking detection on 3D asphalt pavement images through deep-learning-based CrackNet-V. IEEE Trans. Intell. Transp. Syst. 2019, 21, 273–284. [Google Scholar] [CrossRef]
  56. Zhang, A.; Wang, K.C.; Fei, Y.; Liu, Y.; Chen, C.; Yang, G.; Li, J.Q.; Yang, E.; Qiu, S. Automated pixel-level pavement crack detection on 3D asphalt surfaces with a recurrent neural network. Comput.-Aided Civ. Infrastruct. Eng. 2019, 34, 213–229. [Google Scholar] [CrossRef]
  57. Feng, H.; Li, W.; Luo, Z.; Chen, Y.; Fatholahi, S.N.; Cheng, M.; Wang, C.; Junior, J.M.; Li, J. GCN-based pavement crack detection using mobile LiDAR point clouds. IEEE Trans. Intell. Transp. Syst. 2021, 23, 11052–11061. [Google Scholar] [CrossRef]
  58. Chen, T.-H.; Chang, T.S. RangeSeg: Range-aware real time segmentation of 3D LiDAR point clouds. IEEE Trans. Intell. Veh. 2021, 7, 93–101. [Google Scholar] [CrossRef]
  59. Leng, Z.; Tan, M.; Liu, C.; Cubuk, E.D.; Shi, X.; Cheng, S.; Anguelov, D. Polyloss: A polynomial expansion perspective of classification loss functions. arXiv 2022, arXiv:2204.12511. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
