Review of GPR Activities in Civil Infrastructures: Data Analysis and Applications

Abstract: Ground-penetrating radar (GPR) technology has received in-depth analysis and rapid development in the field of civil engineering. GPR data analysis is one of the fundamental and challenging problems in this field. This review surveys the progress made in GPR scanning tasks from 2015 to the present. More than 130 major publications are cited, covering advanced data processing methods and a wide variety of applications. First, the review briefly introduces data collection with a GPR system and discusses signal complexity in simulated and real scenes. Then, it reviews the main signal processing techniques used to interpret GPR data. Subsequently, the latest GPR surveys are grouped into four application domains, namely bridges, road pavements, underground utilities, and urban subsurface risks. Finally, the open challenges and directions for future research are discussed.


Introduction
Interpreting useful information from existing infrastructure is a main task of civil engineering researchers. Labeling, mapping, and diagnosing subsurface-embedded elements and unpredictable risks are indispensable aspects of this work. As a typical nondestructive testing (NDT) technique, ground-penetrating radar (GPR) has become widespread in this domain [1,2] owing to several advantages: first, easy-to-use data collection and a real-time display quickly provide feedback on subsurface conditions; second, centimeter-level image resolution can be obtained by adjusting the system bandwidth. A GPR transmitter sends electromagnetic (EM) pulses into the subsurface medium, and a receiver collects the reflected signal. By analyzing the reflected signal strength and the differences in arrival time, the instrument can be used to infer underground conditions and gather useful information [3,4].
GPR first appeared in the geosciences in the mid-1950s. About 40 years later, the emergence of many powerful GPR data analysis techniques led to a further increase in civil engineering activities. Subsequently, great progress in digital computing capability led to the spread of GPR applications in recent years. Milestones in GPR data analysis over the past decades are listed in Figure 1. Research on GPR data analysis in the late 1990s and the early 2000s mainly focused on neural networks (NN) [5,6], the multilayer perceptron (MLP) [7], and the Hough transform [8,9]. After that, machine learning (ML) algorithms achieved advanced results in GPR data interpretation, such as the genetic algorithm (GA) [10], Viola-Jones [11], and the support vector machine (SVM) [12]. More recently, integrated systems [13][14][15] have been developed to automatically detect and fit GPR characteristics. Since 2015, deep learning (DL) technology has been explored across disciplines, which has moved GPR analysis from a small-scale manual process toward automatic interpretation, even with large amounts of data. Much research has focused on DL methods for GPR feature representation and classification.
Figure 1. The evolution of GPR data analysis methods over the past decades [5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22].
A thorough review of existing work will help make further progress in GPR data analysis. Our goal is to outline its core tasks and key challenges, to summarize how GPR signal complexity manifests, to define categories of data processing methods, and to provide a review of GPR applications. Based on the different forms of GPR data, the data processing literature is divided into three types: A-scan-based, B-scan-based, and C-scan-based. The transmitting antenna (T antenna) emits EM waves at a fixed position, and the receiving antenna (R antenna) receives the echo and forms a 1-D signal, called an A-scan. When the T/R antenna pair moves in equal steps along a given direction, multiple consecutive A-scans are stacked into a 2-D dataset, called a B-scan. When the GPR scans the target along multiple parallel lines, the resulting series of B-scans is gathered and ordered consecutively to form C-scan data. Figure 2 shows the relationship between the scan data and the measurement position. Furthermore, according to the form of feature extraction, the B-scan-based methods are divided into traditional, ML-based, and DL-based methods.
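The relationship between the three data forms can be made concrete with a small sketch. The function names (`stack_b_scan`, `stack_c_scan`, `time_slice`) and the nested-list layout `[line][trace][sample]` are illustrative assumptions, not part of any GPR vendor format or of the reviewed works.

```python
from typing import List

AScan = List[float]   # one trace: amplitude versus two-way travel time
BScan = List[AScan]   # consecutive traces along one survey line
CScan = List[BScan]   # B-scans from parallel survey lines


def stack_b_scan(traces: List[AScan]) -> BScan:
    """Stack equally long A-scans, in acquisition order, into a B-scan."""
    if not traces:
        raise ValueError("at least one A-scan is required")
    n_samples = len(traces[0])
    if any(len(t) != n_samples for t in traces):
        raise ValueError("all A-scans must have the same number of samples")
    return [list(t) for t in traces]


def stack_c_scan(b_scans: List[BScan]) -> CScan:
    """Stack equally shaped B-scans from parallel lines into a C-scan."""
    if not b_scans or not b_scans[0]:
        raise ValueError("at least one non-empty B-scan is required")
    n_traces, n_samples = len(b_scans[0]), len(b_scans[0][0])
    for b in b_scans:
        if len(b) != n_traces or any(len(t) != n_samples for t in b):
            raise ValueError("all B-scans must share the same shape")
    return [[list(t) for t in b] for b in b_scans]


def time_slice(c_scan: CScan, sample_index: int) -> List[List[float]]:
    """Return the horizontal slice (line x trace) at one time sample."""
    return [[trace[sample_index] for trace in b] for b in c_scan]
```

With this layout, a C-scan built from three lines of two traces with three samples each has shape 3 x 2 x 3, and `time_slice` returns one 3 x 2 map per time sample.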
Although current research in GPR data analysis has made numerous achievements, the automated interpretation of GPR data still faces several challenges that are difficult to solve: (1) there is a mismatch between GPR features and deep models; (2) the construction of DL models and their parametric analysis depend heavily on large datasets; (3) many factors in the on-site environment have unpredictable impacts on data analysis and increase its complexity; (4) integrating multiple NDT technologies for underground target detection can lead to incompatibility among them.
The remainder of this paper is organized as follows. Section 2 discusses the major signal processing techniques and highlights the main results achieved by DL techniques. Section 3 provides an overview of the latest achievements in the following fields: bridges, road pavements, underground utilities, and urban subsurface risks. Section 4 discusses future prospects and draws conclusions.

A-Scan Processing
This section discusses advanced studies on A-scan data processing in civil surveys. A-scan processing is often used to avoid introducing additional noise and to enhance the target signals. For instance, the work in [24] removed the direct current (DC) bias from each A-scan signal to avoid complex noise and thus increase the visibility of buried plastic pipes. Benedetto et al. [25] summarized the main A-scan signal processing methods used in road surveys and provided strategies for selecting and using the proper processing technique based on the data quality and the survey purposes.
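As an illustration of the DC bias removal step, the following minimal sketch subtracts the mean of each trace. Estimating the bias as the mean over the whole trace is an assumption made here for simplicity; the work in [24] may estimate it differently (for example, from a late-time window).

```python
from statistics import fmean
from typing import List


def remove_dc_bias(a_scan: List[float]) -> List[float]:
    """Subtract the trace mean so the A-scan oscillates around zero."""
    offset = fmean(a_scan)  # assumed bias estimate: mean of the full trace
    return [sample - offset for sample in a_scan]


def remove_dc_bias_b_scan(b_scan: List[List[float]]) -> List[List[float]]:
    """Apply the correction independently to every A-scan of a B-scan."""
    return [remove_dc_bias(trace) for trace in b_scan]
```

Processing each trace independently keeps the correction local, so a drifting offset along the survey line does not leak from one A-scan into the next.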
Remote Sens. 2022, 14, 5972
The works in [26,27] evaluated changes in material properties. The essential properties change or evolve over time due to several factors, such as dry brickwork or steel corrosion. The collected A-scan data were recorded and analyzed to learn about the internal changes of the material. To provide rich information about the progress of bridge deck deterioration, the study in [28] interpreted time-series GPR data based on a correlation coefficient between A-scans. The generated map could clearly show the deterioration progression between two consecutive scans.
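The idea of comparing repeated surveys through a correlation coefficient can be sketched as follows. This is a generic Pearson correlation between co-located traces, written under the assumption that the two surveys are already aligned trace by trace; it is not the exact procedure of [28], and the helper names are hypothetical.

```python
from math import sqrt
from typing import List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient between two equally long traces."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("traces must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant trace")
    return sxy / sqrt(sxx * syy)


def change_profile(survey_old: List[List[float]],
                   survey_new: List[List[float]]) -> List[float]:
    """One coefficient per trace position; values near 1 mean little change."""
    if len(survey_old) != len(survey_new):
        raise ValueError("surveys must contain the same number of traces")
    return [pearson(a, b) for a, b in zip(survey_old, survey_new)]
```

Positions where the coefficient drops well below 1 are candidates for closer inspection; the threshold used to call a change significant would have to be chosen per dataset.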
For the object detection task, some research also interprets features extracted from A-scan data, or analyzes their energy, to determine whether an A-scan contains target information, for example by comparing the A-scan energy with and without a buried object [29], or with and without background removal [30]. When the former signal is observed to differ from the latter, it can be inferred that a buried object may be present.
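A minimal sketch of such an energy comparison is given below. The energy definition (sum of squared samples), the decibel ratio, and the 3 dB decision threshold are assumptions chosen for illustration and are not taken from [29] or [30].

```python
from math import log10
from typing import List


def trace_energy(a_scan: List[float]) -> float:
    """Energy of a trace, taken here as the sum of squared samples."""
    return sum(sample * sample for sample in a_scan)


def energy_ratio_db(a_scan: List[float], reference: List[float]) -> float:
    """Energy of a trace relative to a reference trace, in decibels."""
    e_trace, e_ref = trace_energy(a_scan), trace_energy(reference)
    if e_trace <= 0.0 or e_ref <= 0.0:
        raise ValueError("both traces must have non-zero energy")
    return 10.0 * log10(e_trace / e_ref)


def may_contain_target(a_scan: List[float], reference: List[float],
                       threshold_db: float = 3.0) -> bool:
    """Flag a trace whose energy departs from the reference by more than
    the (assumed) threshold, in either direction."""
    return abs(energy_ratio_db(a_scan, reference)) > threshold_db
```

For example, a trace with four times the energy of the object-free reference differs by about 6 dB and would be flagged under the assumed threshold.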

Target Identification from B-Scan Image
Many advanced methods aim to interpret the collected B-scan data. B-scan data processing is mainly divided into three tasks: (1) data preprocessing and denoising; (2) hyperbolic feature extraction for locating targets; and (3) estimation of geometric characteristics (e.g., depth, size, EM wave velocity). Many studies have focused on task 2 to localize and identify underground targets in civil infrastructure. Based on the way features are extracted, the methods for task 2 can be further divided into traditional, ML-based, and DL-based methods. The flow diagram of task 2 is given in Figure 3.

Image-Based Interpretation Methods
Image-based interpretation first extracts features from GPR images and then analyzes the geometrical characteristics of these features. The traditional feature extraction methods mainly include edge detection, image segmentation, least squares (LS) fitting, and the Hough transform.
The echo feature appears as a hyperbolic shape on the GPR image, with alternating bright (high-intensity) and dark (low-intensity) edges. Naturally, edge detection methods can be considered for extracting hyperbolic targets. The authors of [31] applied the Canny edge detection method with the image matrix and the computed threshold as inputs. To effectively segment intersecting hyperbolas, the study in [32] mimicked the falling motion of a raindrop and introduced a drop-flow algorithm to detect GPR signatures and decompose them into feature components in B-scans. The Hough transform is also a typical hyperbola recognition method [11,33]. These methods first locate the potential position of targets and then apply the Hough transform to accurately identify hyperbolic signatures and find the apexes of the targets in the restricted region. To localize the target region, the study in [11] applied a trained cascade of classifiers to all input images to obtain the regions containing hyperbolic targets, whereas the work in [33] calculated the normalized variance of the amplitude component to locate the target reflections. However, the Hough transform has a limited ability to reconstruct hyperbolic signatures because of its high computational cost when facing large amounts of data. Moreover, when multiple intersecting hyperbolic features are encountered, it is difficult to set the threshold in the Hough parameter space. Unlike the Hough transform, the LS method can be used to search for and distinguish quadratic curves in GPR images. Windsor et al. [34] proposed an overlapping target segmentation method that used the traditional LS algorithm to assist in estimating the target parameters (position, depth, radius, or speed). This method is fast, but the matching template depends on a large amount of prior knowledge, which limits its wide application.
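To illustrate how an LS fit recovers target parameters from a hyperbolic signature, the sketch below assumes the standard point-reflector model in a homogeneous medium, t(x)^2 = t0^2 + 4(x - x0)^2 / v^2, which is linear in the coefficients of t^2 = a x^2 + b x + c. This linearized fit is a textbook simplification, not the algorithm of [34]; the function names and the input format (apex-side picks of position in metres and two-way time in nanoseconds) are assumptions.

```python
from math import sqrt
from typing import List, Tuple


def solve_3x3(m: List[List[float]], rhs: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    a = [row[:] + [r] for row, r in zip(m, rhs)]  # augmented matrix
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("picks do not constrain a unique hyperbola")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n + 1):
                a[r][k] -= factor * a[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (a[i][n] - tail) / a[i][i]
    return x


def fit_hyperbola(xs: List[float],
                  ts: List[float]) -> Tuple[float, float, float, float]:
    """Fit t^2 = a*x^2 + b*x + c by least squares (normal equations).

    Returns (x0, t0, v, depth): apex position, apex two-way time,
    wave velocity, and depth = v * t0 / 2.
    """
    basis = [(x * x, x, 1.0) for x in xs]
    ys = [t * t for t in ts]
    gram = [[sum(p[i] * p[j] for p in basis) for j in range(3)]
            for i in range(3)]
    rhs = [sum(p[i] * y for p, y in zip(basis, ys)) for i in range(3)]
    a, b, c = solve_3x3(gram, rhs)
    t0_squared = c - b * b / (4.0 * a) if a > 0.0 else -1.0
    if t0_squared <= 0.0:
        raise ValueError("fitted curve is not a valid reflection hyperbola")
    x0, t0, v = -b / (2.0 * a), sqrt(t0_squared), 2.0 / sqrt(a)
    return x0, t0, v, v * t0 / 2.0
```

Given picks generated from x0 = 1 m, t0 = 10 ns, and v = 0.1 m/ns, the fit returns those values and a depth of 0.5 m. With noisy picks, or with a target of non-negligible radius, a nonlinear fit of a more general model would be needed.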
Image segmentation algorithms are another vital category used to extract GPR target features. The segmented hyperbolic signatures are then integrated with advanced computer vision methods to detect targets automatically. This category does not require prior knowledge and includes two types: traditional threshold segmentation and state-of-the-art cluster segmentation:
i. Threshold segmentation is the most common method. It can enhance image features and shape pattern features and eliminate most of the background interference in GPR images, as shown in [35]. As a result, the amount of calculation for subsequent processing can be greatly reduced.
ii. Recently, clustering algorithms have been applied to GPR images to classify points into different point clusters. The authors of [13] proposed column connection clustering (C3), which scans the binary map column by column to extract coordinate point sets for clustering and splits the GPR image into several parts. Intuitively, a hyperbola in a GPR image has a downward opening, which is a key feature for identifying it. However, the C3 algorithm does not consider this feature. The work in [14] developed the open-scan clustering algorithm (OSCA), which makes use of the downward-opening feature to make up for the deficiencies of the C3 algorithm. OSCA scans pre-processed binary images line by line, clustering not only through pixel connections but also through opening information. However, many complex situations are not considered by OSCA, and certain non-target clusters may not be eliminated. The DCSE algorithm was then proposed in [15] to handle these situations; it collects the downward openings by implementing rule-based searching strategies. The first round searches for all openings and sets thresholds to eliminate irregular areas. Based on this, the openings are re-searched and marked in the second round.
The image-based interpretation method is flexible: the sequence of steps can be adjusted, or restrictions added, according to the application scenario. It can assist the target area positioning method and can adapt to various application scenarios.
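The two segmentation ideas can be illustrated with a deliberately simple sketch: an amplitude threshold produces a binary map, and foreground pixels are then grouped into connected clusters. The clustering shown is plain 8-connected component labelling by breadth-first search; it is a generic stand-in and does not implement C3, OSCA, or DCSE, which add column-wise scanning and downward-opening rules. The function names and the fixed threshold are assumptions.

```python
from collections import deque
from typing import List, Tuple

Pixel = Tuple[int, int]  # (row, column) = (time sample, trace index)


def threshold_segment(image: List[List[float]],
                      threshold: float) -> List[List[int]]:
    """Binary map: 1 where the absolute amplitude reaches the threshold."""
    return [[1 if abs(v) >= threshold else 0 for v in row] for row in image]


def connected_clusters(binary: List[List[int]]) -> List[List[Pixel]]:
    """Group foreground pixels into 8-connected clusters using BFS."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    clusters: List[List[Pixel]] = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, cluster = deque([(r, c)]), []
            while queue:
                i, j = queue.popleft()
                cluster.append((i, j))
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        ni, nj = i + di, j + dj
                        if (0 <= ni < rows and 0 <= nj < cols
                                and binary[ni][nj] and not seen[ni][nj]):
                            seen[ni][nj] = True
                            queue.append((ni, nj))
            clusters.append(cluster)
    return clusters
```

Each returned cluster is a candidate signature; a rule such as "the cluster must open downward" could then be applied per cluster to reject non-target regions, in the spirit of the methods above.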

Machine Learning-Based Techniques
ML is often integrated with the extracted features to predict classification results. The common algorithms include SVM, BP neural network, Naïve Bayes classifier, GA, and so forth.
At present, some scholars have systematically compared the impact of different feature representation methods on discriminant performance [36] and provided guidance on how to effectively extract training and test samples [37]. Maas et al. [11] applied the Viola-Jones face detection framework to locate hyperbolic signatures. This framework works well for simple hyperbolic patterns. Haar-like wavelet features are extracted as the input of an AdaBoost-based cascaded classifier. This method eliminates many preprocessing steps but relies heavily on the reliability and comprehensiveness of the samples. Some studies [38,39] have designed genetic algorithm (GA)-based schemes to identify the linear and hyperbolic features of underground objects in binary images. Based on these identified features, Harkat et al. [39] further classified candidate patches into positive/negative samples with a radial basis function (RBF) neural network classifier. The success of techniques such as [12,40] illustrates the potential of histogram of oriented gradient (HOG) feature extraction combined with SVM for effective and efficient object detection. With additional research, other techniques, such as artificial neural networks (ANN), may also play a vital role in this task. Qing Dou et al. [13] manually selected positive/negative samples after clustering GPR target hyperbolas and then calculated two normalized cross-correlation values of each hyperbola, which were fed into a three-layer perceptron NN to further filter the targets. In [41], the feature vector extracted from GPR training samples was decomposed into principal components (PCs) for training a back-propagation (BP) ANN model. In [42], a multilayer perceptron (MLP) was implemented to classify bridge rebars in a region of interest (RoI).
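The Haar-like features used by Viola-Jones-style detectors are differences of rectangle sums, which an integral image makes cheap to evaluate. The sketch below shows that mechanism only; the two-rectangle feature layout and the function names are illustrative assumptions, and no classifier training is included.

```python
from typing import List


def integral_image(image: List[List[float]]) -> List[List[float]]:
    """Summed-area table with a zero first row and column:
    ii[r][c] is the sum of image[0..r-1][0..c-1]."""
    rows, cols = len(image), len(image[0])
    ii = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0.0
        for c in range(cols):
            row_sum += image[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii: List[List[float]], top: int, left: int,
             height: int, width: int) -> float:
    """Sum of the pixels inside a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect(ii: List[List[float]], top: int, left: int,
                  height: int, width: int) -> float:
    """Upper half minus lower half of a window (height must be even).
    Responds to a bright band lying above a dark band."""
    if height % 2:
        raise ValueError("height must be even")
    half = height // 2
    upper = rect_sum(ii, top, left, half, width)
    lower = rect_sum(ii, top + half, left, half, width)
    return upper - lower
```

Because every rectangle sum costs four lookups regardless of window size, many such features can be evaluated at every position of a B-scan, which is what makes cascaded classifiers practical.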
The ML model can achieve a good trade-off between accuracy and speed on a small-scale dataset. However, large-scale GPR data may limit its classification performance, because the input features of an ML model are manually extracted, which depends on expertise and is error-prone.

Deep Learning-Based Techniques
Lately, DL has demonstrated wide applicability in classification, recognition, and segmentation [43,44], which lays a foundation for its application in the GPR discipline. Unlike ML, which requires pre-designed features, DL networks can directly learn the feature representation from radargrams, even in complex scenes.
The convolutional neural network (CNN) model spares researchers much of the effort of describing hyperbolic characteristics. By constructing positive/negative sample sets, the CNN model is able to distinguish hyperbolic morphology from clutter characteristics. At present, some scholars have proposed CNN-based detection algorithms for analyzing GPR data [16,17,45], and a large number of experiments have confirmed that CNNs can extract and classify complicated features. Although a CNN can suitably distinguish between target and clutter features, the localization of the RoI before the classification step still depends on other techniques.
The latest strategy for learning and distinguishing features is based on deep object detection or segmentation models [15,20,21,46]. The authors of [15] proposed an automatic method based on a trained faster region CNN (Faster R-CNN) [47] to first detect target regions and then used transfer learning to improve stability. Zhang et al. [20] proposed a mixed deep CNN that combines ResNet50 and YOLO v2 networks to assess the deterioration conditions of bridges. Building on the instance segmentation model, Hou et al. [21] enhanced the performance of Mask R-CNN [48] to further segment GPR signatures, wherein a new loss computation is developed and incorporated into Mask R-CNN to minimize the discrepancy between the predicted bounding box (Bbox) and the real Bbox in the training phase. Figure 4 highlights some significant deep models and algorithms from the latest research.
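A common way to quantify the discrepancy between a predicted and a ground-truth bounding box is the intersection over union (IoU). The sketch below computes it for axis-aligned boxes; it is a generic illustration and is not the loss proposed in [21], whose exact form is not given here. The `(x_min, y_min, x_max, y_max)` box convention is an assumption.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def box_iou(box_a: Box, box_b: Box) -> float:
    """Intersection over union of two axis-aligned boxes, in [0, 1]."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    intersection = inter_w * inter_h
    area_a = max(0.0, ax1 - ax0) * max(0.0, ay1 - ay0)
    area_b = max(0.0, bx1 - bx0) * max(0.0, by1 - by0)
    union = area_a + area_b - intersection
    return intersection / union if union > 0.0 else 0.0


def iou_loss(predicted: Box, target: Box) -> float:
    """A simple box discrepancy: 0 for a perfect match, 1 for no overlap."""
    return 1.0 - box_iou(predicted, target)
```

In detection benchmarks the same quantity is typically thresholded (for example at 0.5) to decide whether a predicted hyperbola region counts as a correct detection.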

C-Scan Processing
In most cases, the GPR target detection approaches reported in the literature depend on B-scan data and rarely operate directly on C-scans. As a result, some information might be misinterpreted. To address this issue, it is necessary to perform the task on C-scans, which provide richer information.
Researchers and practitioners have focused on automated schemes using C-scans. The authors of [49] developed a novel concept, the 3-D S-transform, for creating 3-D patterns of sinkholes in geological structures based on C-scans. In 2015, Klęsk et al. [50] proposed an ML-based approach to efficiently analyze C-scans. Boosted decision trees were selected as the detector, with customized 3-D Haar-like features as its input. In 2018, Klęsk et al. [51] extended this research and extracted 3-D variant features from integral images. Taking raw 3-D GPR data as input, the work in [52] designed a 3-D model that highlights underground structures and visualizes the location of buried targets.
DL techniques have also been applied to this task based on 3-D data. A deep CNN framework was exploited in [53], and an integrated model of CNN and recurrent neural networks (RNN) was used in [54], to classify subsurface targets by analyzing both B-scan and C-scan data. In fact, these approaches are executed on B-scan data obtained by processing and converting C-scans, not on the C-scan volume itself. To address this issue, Khudoyarov et al. [55] developed a 3-D CNN model operating directly on 3-D data. Each small 3-D block including the desired target is picked and segmented from the full-sized C-scans. These segmented blocks are then fed into the 3-D deep model for training, and the trained model enables the recognition of new 3-D data. In a further study in 2020, the authors of [56] established a CNN model that uses AlexNet [57] as a baseline and is enhanced by transfer learning. This model directly uses 3-D data for subsurface cavity detection.
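The block extraction step that precedes 3-D model training can be sketched as follows. The volume layout `[line][trace][sample]`, the block size, and the stride are assumptions for illustration; the block dimensions and selection rules used in [55] are not reproduced here.

```python
from typing import Iterator, List, Tuple

Volume = List[List[List[float]]]  # indexed as [line][trace][sample]
Index3 = Tuple[int, int, int]


def extract_block(volume: Volume, origin: Index3, size: Index3) -> Volume:
    """Cut a sub-volume of the given size starting at origin."""
    shape = (len(volume), len(volume[0]), len(volume[0][0]))
    for start, extent, limit in zip(origin, size, shape):
        if start < 0 or extent <= 0 or start + extent > limit:
            raise ValueError("block does not fit inside the volume")
    l0, t0, s0 = origin
    nl, nt, ns = size
    return [[trace[s0:s0 + ns] for trace in b_scan[t0:t0 + nt]]
            for b_scan in volume[l0:l0 + nl]]


def sliding_blocks(volume: Volume, size: Index3,
                   stride: Index3) -> Iterator[Tuple[Index3, Volume]]:
    """Yield (origin, block) for every block fully inside the volume."""
    shape = (len(volume), len(volume[0]), len(volume[0][0]))
    for line in range(0, shape[0] - size[0] + 1, stride[0]):
        for trace in range(0, shape[1] - size[1] + 1, stride[1]):
            for sample in range(0, shape[2] - size[2] + 1, stride[2]):
                origin = (line, trace, sample)
                yield origin, extract_block(volume, origin, size)
```

For training, only blocks that contain an annotated target (plus a sample of background blocks) would normally be kept; at inference time the same sliding scheme can tile a new C-scan.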

Bridge Application
Bridge management mainly concerns condition rating and mapping modeling. First, the condition evaluation of a bridge deck usually focuses on cracks, rebar corrosion, structural integrity, component recognition, and poor compaction. Second, the rebar mapping of a concrete bridge concerns the object detection, position estimation, and shape visualization.

Application in Condition Evaluation
By relating values computed from rebar reflection amplitudes to corrosion quantities, recent works estimate the corrosion degree of a bridge deck [28][58][59][60][61][62]. Table 1 gives some examples of these works. For example, the parameters obtained in [59] included direct coupling amplitude, wave velocity in concrete cover depth, and reflected signal amplitude. The work in [60] measured two parameters: the relative permittivity of concrete and the EM wave attenuation of rebars. However, these methods use only the amplitude information of reflected signals, and this incomplete use of GPR information can limit the visualization of true conditions. Integration with other techniques, such as an SVM classifier combined with image processing [12] or the synthetic aperture focusing technique (SAFT) [63], can be exploited to visualize the corrosion regions. Previous studies in [64,65] introduced an integrated framework for predicting and analyzing deterioration based on visual inspection and GPR evaluation results. A case study in [66] examined the use of GPR surveys in conjunction with unmanned aerial photogrammetry and infrared (IR) thermography analyses to help assess bridge degradation.
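Amplitude-based assessments of this kind usually start by expressing rebar reflection amplitudes on a relative decibel scale. The sketch below normalizes picked amplitudes to the strongest reflection and flags strongly attenuated rebars; the choice of reference (the maximum amplitude) and the -6 dB threshold are assumptions for illustration, not values taken from the cited works, and no depth correction is applied.

```python
from math import log10
from typing import List


def amplitudes_to_db(amplitudes: List[float]) -> List[float]:
    """Rebar reflection amplitudes in dB relative to the strongest one."""
    magnitudes = [abs(a) for a in amplitudes]
    if not magnitudes or min(magnitudes) == 0.0:
        raise ValueError("amplitudes must be non-empty and non-zero")
    reference = max(magnitudes)
    return [20.0 * log10(m / reference) for m in magnitudes]


def flag_attenuated(amplitudes_db: List[float],
                    threshold_db: float = -6.0) -> List[int]:
    """Indices of rebars whose reflection is weaker than the (assumed)
    threshold, as candidates for deteriorated areas."""
    return [i for i, level in enumerate(amplitudes_db) if level < threshold_db]
```

In practice the threshold would be calibrated against ground truth such as cores or half-cell potential measurements, since attenuation also depends on cover depth and moisture.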
Some research has addressed the automated task of crack detection based on ML methods [67,68]. The work in [67] used ML classification first, followed by a curve fitting method to further identify crack areas against noisy backgrounds. The study in [68] applied ML to construct a Gaussian regression model based on the formation of crack damage resulting from different influential factors. In addition, the integration of IR and GPR technologies was explored in [69,70] to enhance defect detection. IR photography was used to measure the temperature changes of the bridge deck, and GPR was used to record signal attenuation.
Recent studies also focus on structural integrity evaluation for bridges. The works in [71,72] both used an integrated model of GPR and interferometric synthetic aperture radar (InSAR) to assess the displacement level of the bridge structure. InSAR can initially identify the RoI, while GPR can further detect attenuation sources. The research in [73] developed an ML algorithm, gradient boosting, for full-depth condition assessment, using the EM properties (permittivity and conductivity) as characteristic parameters to simulate concrete conditions of different quality levels. Narazaki et al. [74] automatically recognized bridge components after earthquakes. This work exploited a DL semantic segmentation framework that consists of 45 convolutional layers.

Application in Mapping Rebar
Mapping the location and dimensions of rebars in a concrete bridge can be critical for assessing the structure and state of reinforced concrete (RC). Figure 5 shows some scene pictures of field applications. The model proposed in [75] demonstrated that the combination of sparse blind deconvolution (SBD) and full-waveform inversion (FWI) can estimate rebar diameters and obtain a sparse representation of the subsurface reflectivity series. In recent works, techniques such as the limited and simplified hyperbolic summation (LSHS) [76], CNN [45], GA [38], MLP [42], and SVM [12] have been developed for automated rebar classification, laying the foundation for subsequent target location mapping. Some works on this topic are listed in Table 1.

Road Pavement Assessment
GPR road applications involve the distress diagnosis of existing roads and quality control surveys in road projects.

Distress Detection
Road pavements are subject to traffic and temperature variations, which cause distresses such as reflection cracks, defects, or potholes and shorten their service life. Early diagnosis of these distresses allows proper maintenance and rehabilitation. Some of the literature is summarized in Table 2.
Many technologies have been used for pavement distress detection [77][78][79][80]. The work in [79] detected cracks of different widths and showed that the shape of the GPR signal is not affected by crack width. The EM wave velocity in underground media was estimated in [80] to detect damaged parts, so that even thin cracks could be found. The work in [81] designed a detector based on the analysis of the reflected signals of potholes and estimated their position and dimensions. Lagüela et al. [82] studied the joint use of GPR, IR thermography, and terrestrial laser scanning (TLS) and tested them on a road next to the sea for a comprehensive assessment of pathologies.
Some studies exploited DL methods to identify distress in GPR data. Tong et al. [83] applied two different CNN frameworks, a multi-stage CNN and a cascaded CNN, to automate the classification of road defects. The results showed that the cascaded CNN outperforms the multi-stage CNN. The network model developed in [84] deepened multilayer perceptrons in a CNN to extract low-level features first, and these features were then grouped into high-level features. The network directly uses GPR signals as input data to identify and classify the type of distress and to evaluate its location and size. Gao et al. [85] developed an object detection model to detect distress. The model was optimized using both new anchor scales and DL tricks such as stochastic pooling.

Quality Control
Effective quality assurance and quality control inspection of road pavements is now a priority. The works in [87,88] evaluated and analyzed the frequency spectrum of GPR signals to provide an accurate judgment of the internal condition of pavement materials. In most studies, the data acquired from the asphalt layer of a road pavement are interpreted into relevant inputs, such as thickness [89][90][91][92][93][94], permittivity [89,95], moduli [96], interface roughness [97], or stiffness [98], for the assessment of flexible pavements. Among them, asphalt layer thickness is the most relevant input. To estimate the thickness, De Coster et al. [90] exploited GPR full-wave inversion and straight-ray methods, and the study in [91] applied the Common Mid-Point (CMP) method. A case study in [92] differentiated the asphalt pavement layers and mapped their variable thickness. In addition, the multiple signal classification (MUSIC) method was applied in [93] to increase the resolution of 3-D GPR signals and improve the evaluation of asphalt overlay thickness, and in [97] to compute the time delay for assessing interface roughness. GPR has also demonstrated its ability to monitor density changes during the compaction of asphalt pavement. To extract density information without the effect of surface moisture, the works in [99][100][101] removed the surface moisture effect and predicted the density profile of asphalt concrete (AC) pavement.
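The elementary relation behind travel-time thickness estimates is h = v * dt / 2 with v = c / sqrt(eps_r), where dt is the two-way travel time between the reflections from the top and bottom of the layer and eps_r is its relative permittivity. The sketch below implements only this relation, assuming a low-loss, non-magnetic, homogeneous layer and normal incidence; it is not the full-wave inversion or the CMP procedure of the cited works, and the function names are hypothetical.

```python
from math import sqrt

SPEED_OF_LIGHT_M_PER_NS = 0.299792458  # speed of light in vacuum, m/ns


def wave_velocity(relative_permittivity: float) -> float:
    """EM wave velocity (m/ns) in a low-loss, non-magnetic medium."""
    if relative_permittivity < 1.0:
        raise ValueError("relative permittivity must be at least 1")
    return SPEED_OF_LIGHT_M_PER_NS / sqrt(relative_permittivity)


def layer_thickness(two_way_time_ns: float,
                    relative_permittivity: float) -> float:
    """Layer thickness (m) from the two-way travel time across the layer."""
    if two_way_time_ns < 0.0:
        raise ValueError("travel time must be non-negative")
    return wave_velocity(relative_permittivity) * two_way_time_ns / 2.0
```

For example, a two-way time of 2 ns in a layer with relative permittivity 4 corresponds to a thickness of about 0.15 m. The accuracy of the result is limited mainly by how well the permittivity is known, which is why permittivity estimation appears alongside thickness in the studies above.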

Underground Utilities Survey
It is vital to clearly survey underground pipelines when the underground space of an old urban area is rebuilt and expanded. GPR is usually used to locate utilities, determine their diameters, and assess water leakage.

Utilities Positioning and Mapping
Underground utility mapping is an important technology for extracting underground information, which can provide an effective man-machine interaction for safe excavation.
Before locating underground utilities, some preprocessing steps are required, such as background removal [102] and pipe visibility enhancement [24]. Several studies [53,103-108] have addressed the positioning and mapping of utilities. In [105], interpreting and comparing raw B-scans acquired over different pipe zones and at three different GPR system frequencies made it possible to detect the hyperbolic signatures of buried pipes. The study [107] visualized urban utilities based on an integrated model of GPR and a robotic terrestrial positioning system (TPS). In [108], a smooth 3-D curve with location and depth information was obtained to intuitively visualize the direction of buried cables. After underground pipelines are located, the interpretation of GPR data is also vital for city planning. Examples include the diameter prediction of utilities filled with lossy media [109], the depth and radius estimation of plastic pipes [110], and the size and condition prediction of drainage pipes [111]. Some examples can be found in Table 3. S. Li et al. contributed to this research from 2015 to 2020 [32,112-114]. In 2015 [112], an integrated system combining GPR, a global positioning system (GPS), and a geographical information system (GIS) was developed for mapping subsurface pipelines. The following year, to estimate the buried depth and radius of utilities, the authors [113] proposed a novel hyperbola equation to model raw GPR data by incorporating the relative angles between buried utilities and GPR scanning trajectories. In 2018, to segment intersecting hyperbolas more effectively, the research [32] mimicked the motion of a falling raindrop and introduced a new drop-flow scheme to identify and segment GPR signatures into feature components in B-scans. In 2020 [114], the state of pipes (presence/absence) was inferred by fusing evidence from heterogeneous sources based on the Dempster-Shafer evidence theory.
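To make the evidence-fusion idea concrete, the following is a minimal sketch of Dempster's rule of combination on the two-element frame {present, absent}. It is not the implementation from [114]; the source names ("GPR", "records") and all mass values are hypothetical.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses is treated as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0 - 1e-12:
        raise ValueError("sources are in total conflict; the rule is undefined")
    norm = 1.0 - conflict
    return {focal: mass / norm for focal, mass in combined.items()}


PRESENT = frozenset({"present"})
ABSENT = frozenset({"absent"})
UNKNOWN = PRESENT | ABSENT  # the whole frame: "cannot tell"

# Hypothetical masses from two heterogeneous sources.
gpr_evidence = {PRESENT: 0.6, ABSENT: 0.1, UNKNOWN: 0.3}
records_evidence = {PRESENT: 0.5, ABSENT: 0.2, UNKNOWN: 0.3}

fused = combine_dempster(gpr_evidence, records_evidence)
for focal, mass in fused.items():
    print(sorted(focal), round(mass, 3))
```

In this example both sources lean toward "present", so the fused belief in pipe presence is higher than either source provides alone, while the mass left on the whole frame shrinks.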

Water Leakage Detection
Leakage detection in buried water pipes is a crucial issue as underground pipes age. Many studies have used GPR as an effective tool because its EM wave is highly sensitive to water in the soil. For example, the work [115] reviewed two measurement techniques, GPR and an infrared (IR) camera, for detecting and locating water leakage in pipeline networks. GPR is first used to locate buried utilities, followed by the IR technique to capture subtle leakage. In addition to these two techniques, the study [116] considered a third technique, acoustic detectors, for environments with high soil moisture, to make up for the inadequate operational capabilities of GPR and IR in such environments. The work [117] detected water leaks from pipes buried in a sandbox, compared the collected data with different patterns, and diagnosed whether a leak was present. In [118], the authors measured the changes in EM wave velocity and wave reverberation to sense an integrated water leakage. The comparative results showed that leakages could be identified most clearly with the 600 MHz GPR.

Urban Subsurface Risks
Urban areas often face road safety problems caused by sudden road cave-ins, which seriously threaten people's lives and property. To prevent road collapse accidents, extensive research has been carried out. Some examples from the literature are listed in Table 4.

Void Risk
Void disease caused by construction quality and external loads contributes to the failure of RC structures. The authors of [119] proposed a void-detection algorithm using target echo data collected from railways. Based on the typical target echo models of rebar and voids, a horizontal filter was constructed to identify void diseases and eliminate the interference of rebar echoes. The study [120] identified voids with an SVM classifier, using feature vectors composed of discrete cosine transform (DCT) coefficients as input. The authors of [121] established a database of void patterns in both C-scans and B-scans, wherein voids were automatically located in C-scans and verified in the corresponding B-scans. A novel framework based on multiple sensors (such as unmanned aerial vehicles (UAVs)) and GPR was proposed. Both works [122,123] exploited GPR data to detect voids in disaster areas, with the purpose of providing information for rescuing potential victims buried or trapped in ruined buildings.
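The DCT-based feature vectors mentioned above can be sketched as follows. This is only an illustration of the feature-extraction step, assuming an unnormalized DCT-II applied to a single A-scan trace and truncation to the first few coefficients; the function names and the toy trace are hypothetical, and the SVM classifier of [120] is deliberately omitted.

```python
import math


def dct2(signal):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(signal))
        for k in range(n)
    ]


def dct_features(trace, num_coeffs):
    """Keep the first num_coeffs DCT coefficients as a compact feature vector."""
    if not 0 < num_coeffs <= len(trace):
        raise ValueError("num_coeffs must be between 1 and len(trace)")
    return dct2(trace)[:num_coeffs]


# Toy A-scan trace (hypothetical amplitudes), reduced to a 4-element feature vector.
toy_trace = [0.0, 0.2, 0.9, 0.4, -0.6, -0.3, 0.1, 0.0]
features = dct_features(toy_trace, num_coeffs=4)
print([round(value, 3) for value in features])
```

Truncating to low-order coefficients keeps the smooth, dominant structure of the trace and discards fine detail, which is why such vectors are a compact input for a downstream classifier.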

Sinkhole Risk
Sinkholes can form beneath streets and have led to the demolition of buildings. This section considers the current literature on sinkhole detection through the deployment of GPR. Several solutions have been explored; preventative planning based on early detection is among the most effective [49,124]. The work [49] first developed the concept of the 3-D S-transform, which enables the analysis of 3-D GPR data, and used it to look for sinkholes in geological structures. Sevil et al. [124] conducted a sinkhole investigation by analyzing and comparing data gathered with several different strategies: trenching, GPR, electrical resistivity tomography (ERT), and high-precision leveling. The combination of common methods can provide key information on specific sinkholes.

Cavity Risk
Ground cavity configurations, including depth, roof shape, and length, are the main factors affecting the risk of ground sinkholes. Much research has focused on urban cavity detection using 3-D GPR data. In a case study based on 3-D GPR mapping, the GPR survey accounted for most of the identified cavities [125]. In addition, tailored CNN-based cavity detection techniques were studied in [56,126]. The authors of [126] established an underground cavity detection network (UcNet) to reduce underground cavity misclassification, and the study [56] developed a CNN framework based on the pre-trained AlexNet and enhanced by transfer learning. Some additional techniques, such as instantaneous phase analysis [127] and time domain reflectometry [128], were also integrated with GPR to estimate the status of the cavity. The study [127] visualized hidden cavities and distinguished them from other underground objects such as buried pipes and manholes. The work [128] developed a novel time domain reflectometry-based penetrometer system to accurately estimate the relative permittivity of the ground at different depths.
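For readers unfamiliar with time domain reflectometry, the sketch below shows the textbook relation commonly used to convert a TDR measurement into an apparent relative permittivity, ε_r = (c·t / (2L))², where t is the round-trip travel time along a probe of length L. It is not the penetrometer system of [128]; the function name and the example values are assumptions.

```python
C_M_PER_NS = 0.299792458  # speed of light in vacuum, metres per nanosecond


def tdr_relative_permittivity(round_trip_time_ns, probe_length_m):
    """Apparent relative permittivity from a TDR round-trip time along a probe."""
    if round_trip_time_ns <= 0.0 or probe_length_m <= 0.0:
        raise ValueError("travel time and probe length must be positive")
    # Distance the pulse would cover in vacuum during half the round-trip time.
    apparent_length_m = C_M_PER_NS * round_trip_time_ns / 2.0
    return (apparent_length_m / probe_length_m) ** 2


# Hypothetical reading: 2.0 ns round trip along a 0.10 m probe.
eps_r = tdr_relative_permittivity(round_trip_time_ns=2.0, probe_length_m=0.10)
print(f"apparent relative permittivity: {eps_r:.2f}")
```

A permittivity profile obtained this way at several depths can then supply the wave velocity needed to convert GPR travel times into depths.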

Discussion and Conclusions
This section includes a comprehensive discussion of GPR data analysis techniques and future perspectives, as shown in Figure 6.

Comprehensive Discussion of GPR Data Analysis Techniques
At present, a large number of data analysis techniques have been applied to the GPR target detection task. However, existing research still relies on manual work, and the challenges posed by the complexity and instability of subsurface scenes have not been fully solved. The main challenges of data analysis techniques in the GPR field are summarized as follows:

Lack of Effective Signature Extraction Strategies in Complex Scenarios
Most existing work assumes that high-quality simulated GPR data are available under ideal scenarios and relies on these data to perform tasks related to subsurface target detection, localization, and parameter evaluation. The effectiveness of these tasks depends on the quality of the data, which largely determines the simplicity of the subsequent processing steps and the correctness of the evaluation results. However, in a complex environment, consistent, high-quality data and high-performance results are not always available, which may lead to serious misjudgment of underground targets. In addition, the large number of underground structures, irregular target distribution, unknown target size and depth, and complex underground media conditions greatly limit the development of automatic, real-time, large-scale target recognition systems. Because the dielectric constant of underground fillings is highly dependent on weather conditions, the surrounding environment, and emergencies, the reflected GPR signal may be too weak or even invisible, which may lead to the omission of target objects under conditions of high surface water content, diverse fillings, and corrosion of buried objects.
It is of great research value to understand the attributes of the target itself in depth, especially the characteristics that distinguish targets from non-targets. It is necessary to design an effective feature extraction strategy for GPR target detection in complex situations. In this way, all target features can be completely extracted from the whole GPR image without interference from adjacent or overlapping targets, so as to obtain effective, high-quality data.

Lack of Customized Deep Models for Different Types of Target Signatures
First, DL techniques can be used for most image-related tasks, but radar images differ from conventional images, and the internal architecture of a deep model is not closely related to the attributes of the radar map and the target features. Therefore, directly applying deep models to GPR data analysis may ignore useful information and lead to redundant processing. In addition, the synthetic aperture length of GPR is much larger than the target size, so the rising and falling edges of the hyperbolic feature of the target in GPR imaging are steep; that is, the ratio of the vertical to the horizontal extent of the rising and falling edges is large. As a result, the area occupied by the hyperbolic features of the target accounts for a very small proportion of the pixels in the whole GPR image, showing typical small-target characteristics. When a conventional deep model is applied to the detection of these small target features, dense small targets are easily lost or missed.
Therefore, it is necessary to design a specific target detection model to match the input features of each target type. Since the features extracted by the deep model are nonintuitive and difficult to interpret, how to design an effective model for specific applications in specific scenarios and focus on exploring and interpreting the internal structure of the deep model is a challenging and meaningful research problem.

Future Perspective
This paper provides an overview of the state of research on employing the GPR technique in civil engineering. As a comprehensive overview of GPR data analysis and processing, this paper has analyzed the complexity of GPR signals, summarized the popular A/B/C-scan processing methods, provided some structural categories according to the feature extraction style, and discussed advanced applications for civil infrastructure. Although significant progress has been made, the following discussion identifies some promising directions for exploratory research.

Matching Consistency between GPR Feature and Deep Model
In practical operation, multiple factors, such as changes in radar parameters, the complexity of the underground medium, or the non-uniform movement of the T/R antenna, distort the target characteristics in the image domain. It is necessary to design a corresponding DL model that matches each type of input feature to complete the target recognition process. Since the features extracted by DL models are non-intuitive and difficult to explain, designing matching models for specific application-oriented problems remains a challenge. Future research can focus on exploring and interpreting the internal structure of DL models so as to expand their application scope.

Reduced Dependence on Large Amounts of Data
DL-based GPR analysis has not kept pace with the rapid progress in other fields, partially due to the unavailability of a large-scale radar database. Therefore, there is an important need for a high-quality, large-scale GPR dataset, which would greatly promote radar data analysis. However, in many applications, only a limited amount of annotated training data is available, or it is too expensive to collect labeled training data. Possible research directions are to develop learnable multi-dimensional descriptors that require only modest training data or to explore effective transfer learning.

Impact of Multiple Factors on Data Analysis
Several factors, such as material, condition, and environment, must be considered when performing GPR data analysis. For bridge applications, these factors include the condition of the asphalt concrete overlay, its material properties and moisture, the deck structure, and the extent of deterioration. For road pavement applications, they include reflection cracks, defect distribution, potholes, moisture, and the existence of cavities. For underground utility surveys, they include materials, the extent of deterioration, corrosion, and water leakage. For subsurface risks, they concern cavity depth, roof shape, and length. Further research is needed to study the impact of each of these factors individually and of combinations of several of them.

Integrated NDT Technologies
Some studies have integrated results from GPR and other high-performance NDT techniques to assess various subsurface conditions. Future research on this integration task needs to address: (1) selecting the technologies to integrate for specific conditions; (2) evaluating the number of technologies to integrate according to the functions that need to be implemented; (3) automatically integrating the results of multiple technologies into one output, instead of separately integrating results collected from multiple technologies. Table 5 discusses several existing literature reviews [1,25,129,130] in terms of reference sources, research fields, research sites, research cycles, applications, and signal processing methods and compares them with this paper.

Paper Selection Strategy
A search strategy for identifying relevant literature must be developed. This review paper follows the PRISMA method (https://www.prisma-statement.org/Default.aspx, accessed on 24 March 2020). This includes selecting search terms and appropriate databases and deciding on inclusion and exclusion criteria. The search lasted from May 2019 to June 2022, and the publications were retrieved from the most recognized international scientific citation indexing services. Two reviewers selected the articles. First, three databases were used to search for references: (1) Web of Science (due date: 06/2022) (https://www.webofscience.com/, accessed on 2 April 2020); (2) Elsevier (due date: 05/2022) (https://www.sciencedirect.com/, accessed on 17 March 2020); (3) IEEE/IET Electronic Library (due date: 03/2022) (https://ieeexplore.ieee.org/, accessed on 29 December 2020). Second, the search terms are words or phrases directly related to the research question of each subsection. Third, a strategy is required to narrow the search range and single out the truly relevant literature. The criteria used are: (1) the year of publication is limited to 2015 to 2022, except for some classical literature; (2) the type of article is limited to scholarly journals, top-level computation conference proceedings, and books; (3) the language of the articles is limited to English.
Signal processing methods covered: A-scan processing (noise removal, resolution enhancement, object detection, material property analysis); B-scan processing (image-, ML-, and DL-based target identification); C-scan processing (3-D reconstruction, target recognition).