Research on an Automatic Detection Method for Response Keypoints of Three-Dimensional Targets in Directional Borehole Radar Profiles

Tang, Xiaosong; Xu, Maoxuan; Yang, Feng; Liu, Jialin; Peng, Suping; Qiao, Xu

doi:10.3390/rs18071102


by Xiaosong Tang ¹, Maoxuan Xu ¹,*, Feng Yang ¹, Jialin Liu ¹, Suping Peng ² and Xu Qiao ¹

¹ School of Artificial Intelligence, China University of Mining and Technology (Beijing), Beijing 100083, China

² State Key Laboratory for Fine Exploration and Intelligent Development of Coal Resources, China University of Mining & Technology (Beijing), Beijing 100083, China

* Author to whom correspondence should be addressed.

Remote Sens. 2026, 18(7), 1102; https://doi.org/10.3390/rs18071102

Submission received: 6 March 2026 / Revised: 31 March 2026 / Accepted: 5 April 2026 / Published: 7 April 2026

(This article belongs to the Special Issue Ground Penetrating Radar (GPR) Applications in Earth, Moon and Planetary Exploration (Second Edition))


Highlights

What are the main findings?

A three-dimensional electromagnetic simulation dataset for directional BHR is constructed, providing reliable data support for model training, evaluation, and validation.
BSS-Pose-BHR is proposed for keypoint detection of three-dimensional geological targets in directional BHR profiles. Built upon YOLOv11n-pose, three enhanced modules are introduced to improve feature extraction and localization capability, thereby significantly improving detection accuracy.

What are the implications of the main findings?

The proposed framework enhances detection accuracy and intelligence in directional BHR interpretation, enabling more accurate and automated azimuth localization of underground targets.
Ablation experiments confirm the effectiveness and rationality of each module, indoor physical experiments validate the performance on measured signals, and robustness analysis demonstrates stability under noise and missing-trace conditions, indicating strong potential for practical engineering deployment.

Abstract

During the interpretation of Borehole Radar (BHR) B-scan profiles, the accurate determination of the azimuth of geological targets in three-dimensional space is a critical issue for achieving precise anomaly localization and spatial structure inversion. However, existing directional BHR anomaly localization methods exhibit limited intelligence, insufficient adaptability to multi-site data, and weak generalization capability, rendering them inadequate for engineering applications under complex geological conditions. To address these challenges, a robust deep learning model, termed BSS-Pose-BHR, is developed based on YOLOv11n-pose for keypoint detection in directional BHR profiles. The model incorporates three key optimizations: Bi-Level Routing Attention (BRA) replaces Multi-Head Self-Attention (MHSA) in the backbone to improve computational efficiency; Conv_SAMWS enhances keypoint-related feature weighting in the backbone and neck; and Spatial and Channel Reconstruction Convolution (SCConv) is integrated into the detection head to reduce redundancy and strengthen local feature extraction, thereby improving suitability for keypoint detection tasks. In addition, a three-dimensional electromagnetic model of limestone containing a certain density of clay particles is established to construct a simulation dataset. On the simulated test set, compared with current mainstream deep learning approaches and conventional directional borehole radar anomaly localization algorithms, BSS-Pose-BHR achieves superior performance, with an mAP50(B) of 0.9686, an mAP50–95(B) of 0.7712, an mAP50(P) of 0.9951, and an mAP50–95(P) of 0.9952. Ablation experiments demonstrate that each proposed module contributes significantly to performance improvement. Compared with the baseline, BSS-Pose-BHR improves mAP50(B) by 5.39% and mAP50(P) by 0.86%, while increasing model weight by only 1.05 MB, thereby achieving a reasonable trade-off between detection accuracy and complexity. 
Furthermore, indoor physical model experiments validate the effectiveness of the method on measured data. Robustness experiments under different Peak Signal-to-Noise Ratio (PSNR) conditions and varying missing-trace rates indicate that BSS-Pose-BHR maintains high detection accuracy under moderate noise and data loss, demonstrating strong engineering applicability and practical value.

Keywords: borehole radar (BHR); BSS-Pose-BHR; YOLOv11n-pose; keypoint detection; ablation experiments; indoor physical model experiments; robustness experiments

1. Introduction

Borehole radar, a specialized application of ground penetrating radar (GPR), is a geophysical technique operated within boreholes [1,2,3]. Due to its flexible deployment capability in three-dimensional geological space, BHR detection offers several advantages, including close-range investigation within boreholes, reduced susceptibility to external electromagnetic interference, and high-resolution detection enabled by high-frequency electromagnetic waves. These features enable relatively refined detection of sub-meter-scale anomalies and have demonstrated promising performance in geological and hydrogeological investigations, mineral exploration, and structural health monitoring [4,5,6,7,8], indicating considerable potential for engineering applications. However, conventional omnidirectional BHR systems are constrained by radar wave propagation characteristics and complex geological conditions, allowing only the determination of the distance between the target and the borehole and making directional detection of surrounding targets difficult [9]. To address this limitation, the present study is based on a rotatable directional detection radar system, as illustrated in Figure 1, which enables precise localization of adverse geological hazard sources. The proposed BHR system adopts an embedded design, in which the acquisition control module is implemented using an embedded chip to enable radar and gyroscope data acquisition, while communication with the host computer is achieved via a photoelectric converter. The gyroscope provides information on antenna orientation and in-borehole mileage, while a camera monitors the surrounding rock conditions near the antenna to assist in radar data interpretation. The radar system performs dense 360° scanning via an electric rotation controller, thereby enabling full-space observation of geological targets.

For B-scan profiles acquired by directional BHR, accurately determining the three-dimensional azimuth of geological targets within the profiles is of critical importance [10]. Traditional anomaly localization algorithms for directional BHR profiles can generally be classified into two categories: the maximum-amplitude-based orientation algorithm and the intermediate-value-based localization algorithm [9]. The fundamental principle of the maximum-amplitude-based orientation algorithm is as follows. When the directional BHR antenna performs rotational detection or uniform angular velocity scanning within the borehole, the transmitting antenna emits a directional electromagnetic beam, whose directivity is governed by the intrinsic radiation pattern of the antenna, thereby enabling directional scanning of the surrounding medium. When the beam axis (i.e., the direction corresponding to the maximum electromagnetic field intensity) is aligned with the anomaly, the reflected response exhibits the maximum amplitude. Therefore, when the echo pulse reaches its peak value, the corresponding beam axis direction is considered to represent the azimuth of the anomaly. In contrast, the intermediate-value-based method determines the anomaly azimuth by first setting a predefined detection threshold and then selecting several strong-amplitude points exceeding this threshold. Since the amplitude variation near the maximum value is relatively flat, the central position of these selected points is taken as the azimuth of the anomaly. As illustrated in Figure 2, the maximum-amplitude-based orientation algorithm identifies the azimuth corresponding to the position with the strongest target response energy (Figure 2a). By comparison, the intermediate-value-based localization algorithm first specifies a threshold (e.g., 0.7 in Figure 2c) and then determines the target azimuth by calculating the average of the maximum and minimum angles within the range exceeding the threshold.
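The two traditional rules described above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in [9]: the function names, the use of a single amplitude-versus-angle curve as input, and the normalization of the amplitude by its maximum (so that a threshold such as 0.7 is a relative value) are all assumptions. Wrap-around of the selected angular range across 0°/360° is not handled.

```python
import numpy as np


def azimuth_max_amplitude(angles_deg, amplitude):
    """Maximum-amplitude rule: azimuth at which the target response is strongest."""
    angles_deg = np.asarray(angles_deg, dtype=float)
    amplitude = np.asarray(amplitude, dtype=float)
    return float(angles_deg[np.argmax(amplitude)])


def azimuth_intermediate_value(angles_deg, amplitude, threshold=0.7):
    """Intermediate-value rule: midpoint of the angular range above a threshold."""
    angles_deg = np.asarray(angles_deg, dtype=float)
    amplitude = np.asarray(amplitude, dtype=float)
    # Assumption: the threshold is relative to the maximum amplitude.
    normalized = amplitude / amplitude.max()
    selected = angles_deg[normalized > threshold]
    # Average of the smallest and largest angles exceeding the threshold.
    return float(0.5 * (selected.min() + selected.max()))


# Hypothetical amplitude curve peaking at 120 degrees (illustrative data only).
angles = np.arange(0.0, 360.0, 1.0)
response = np.exp(-((angles - 120.0) / 30.0) ** 2)
print(azimuth_max_amplitude(angles, response))
print(azimuth_intermediate_value(angles, response))
```

For a symmetric response the two rules agree. They differ when the peak is flat or noisy, which is the situation the intermediate-value rule is meant to handle.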

However, due to the complexity of in situ geological conditions and the large volume of data generated during rapid acquisition, even experienced technicians face significant challenges, as BHR data interpretation is both time-consuming and labor-intensive. In practice, data collected by BHR within a single day may require several weeks for complete interpretation. Moreover, the aforementioned traditional methods inherently involve subjective uncertainty. Therefore, there is an urgent need to develop an efficient and reliable automated BHR interpretation approach [11].

In recent years, advances in artificial intelligence—particularly in computer vision—have enabled deep learning algorithms to be applied to the automatic and efficient detection of structural damage in civil engineering [12,13]. For instance, Pham et al. [14] adopted a Faster R-CNN model pre-trained on the CIFAR-10 dataset and conducted joint training and fine-tuning on both real and simulated GPR profiles to detect underground targets (i.e., hyperbolic reflections) in GPR B-scans. Qin et al. [12] extended deep learning to tunnel lining inspection by developing an automatic recognition framework based on a Mask Region-Based Convolutional Neural Network (Mask R-CNN) for identifying steel ribs, voids, and initial linings from GPR profiles. Their method employed a ResNet-101 backbone combined with a Feature Pyramid Network (FPN) for feature extraction, a Region Proposal Network (RPN) for target detection, and a Fully Convolutional Network (FCN) for segmentation. To address data scarcity, they further enhanced training using synthetic GPR data generated by the finite-difference time-domain (FDTD) method and a deep convolutional generative adversarial network. Results from both synthetic and field experiments demonstrated high recognition accuracy, highlighting the potential of deep learning for intelligent interpretation of GPR data.

However, the aforementioned methods are often computationally intensive and time-consuming. By contrast, lightweight single-stage frameworks, including the Single Shot Multibox Detector (SSD) and You Only Look Once (YOLO), have demonstrated remarkable effectiveness in GPR profile interpretation and are therefore more suitable for efficient and low-cost analysis of large-scale datasets [15]. Illustratively, Luo et al. [16] proposed Multi-Task DL for GPR data (MTGPR), an automatic method for detecting voids and cavities in tunnel linings from GPR radargrams. CAPW-YOLO was employed to enhance feature extraction and fusion under infrastructure interference, and a refined synthetic dataset was used to augment training. Ablation experiments showed 88.5% accuracy and 84.0% precision, with a 46.9% increase in speed and a 10.2% reduction in model size compared with the YOLOv7 baseline. Khedr et al. [17] demonstrated that YOLOv8 outperforms both Faster R-CNN and YOLOv7. They trained the YOLOv8 model using experimental and field data and validated its accuracy and rebar diameter classification capability on real-world building datasets.

The aforementioned deep learning-based automatic GPR profile detection algorithms have primarily focused on data acquired along conventional survey lines. In contrast, research on automatic detection for directional BHR profiles remains limited, particularly regarding the angular localization of geological targets within such profiles, which has not been thoroughly investigated. Pose estimation is a task that involves identifying the locations of specific points in an image, commonly referred to as keypoints. These keypoints can represent various parts of an object, such as joints, landmarks, or other distinctive features. Determining both the azimuth of targets in directional BHR profiles and the distance between targets and the antenna involves keypoint localization within the profiles and can therefore be formulated as a pose estimation task. Motivated by the YOLOv11n-pose model, this study develops a robust deep learning framework, termed BSS-Pose-BHR, for keypoint detection in directional BHR response profiles. The main contributions of this work are as follows:

A three-dimensional electromagnetic model of limestone containing a certain density of clay particles is constructed, and a simulation dataset of directional BHR response profiles is generated by setting rigorously designed and dynamically controlled simulation parameters.
The BSS-Pose-BHR model incorporates three innovative modules: (1) Backbone network optimization: Bi-Level Routing Attention (BRA) is introduced to replace Multi-Head Self-Attention (MHSA) in C2PSA. This query-aware dynamic sparse attention mechanism filters irrelevant features, significantly improving computational efficiency and memory utilization for large-scale datasets. (2) Keypoint extraction enhancement: A lightweight sliced attention module, Conv_SAMWS, is embedded in both the Backbone and Neck networks. By slicing feature maps and applying parameter-free attention weighting, it enhances keypoint feature representation while maintaining a lightweight architecture. (3) Detection head improvement: Spatial and Channel Reconstruction Convolution (SCConv) is adopted to optimize the detection head. By refining spatial and channel information, feature redundancy is reduced and local feature extraction is strengthened, making the model more suitable for keypoint detection tasks.
Using BSS-Pose-BHR for keypoint detection in directional BHR response profiles, the proposed method achieves superior performance compared with current mainstream detection models, including YOLOv8n-pose, YOLOv10n-pose, YOLOv11n-pose, YOLOv12n-pose, and YOLOv11n-SPPF_improved-BSAM-LSCD_LQE. In addition, it provides more accurate angular localization than traditional target azimuth estimation methods for directional BHR profiles.

2. Materials and Methods

2.1. Principle of Directional BHR Detection and Imaging

As shown in Figure 3a, conventional BHR detection methods typically involve laying survey lines on the working face to acquire geological information in previously unexplored regions. Electromagnetic wave propagation within the medium follows Maxwell’s equations [18], as expressed in Equation (1).

\begin{cases} J = σ E \\ D = ε E \\ B = μ H \\ \nabla \times E = - μ \frac{\partial H}{\partial t} \\ \nabla \times H = σ E + ε \frac{\partial E}{\partial t} \\ \nabla \cdot (μ H) = 0 \\ \nabla \cdot (ε E) = ρ \end{cases}

(1)

In Equation (1), J represents the current density (A/m²); σ denotes the electrical conductivity (S/m); E stands for the electric field intensity (V/m); D is the electric displacement (C/m²); ε indicates the permittivity (F/m); B refers to the magnetic flux density (T); μ represents the magnetic permeability (H/m); H is the magnetic field intensity (A/m); and ρ denotes the charge density (C/m³).

However, this conventional approach has several limitations, including limited detection accuracy, restricted detection range, constrained detection depth, and insufficient spatial information regarding geological structures [19].

BHR, as a specialized form of ground-penetrating radar, operates on the same fundamental principles as conventional geological radar [20] while retaining its high-resolution capability. This method enables close-range detection of subsurface targets via boreholes, thereby improving detection accuracy and depth of investigation. BHR can be implemented using three measurement configurations: single-hole detection, cross-hole detection, and borehole-to-surface detection [21]. In this study, the directional BHR configuration adopts the single-hole detection mode. As illustrated in Figure 3b, the BHR antenna consists of an arc-shaped radiating element, a metallic shielding cover, and impedance-absorbing materials, with a borehole antenna diameter of 60 mm. The electric rotation controller shown in Figure 1 enables the directional BHR system to perform high-precision 360° close-range rotational scanning within the borehole.

As the antenna rotates through the full spatial domain, its radiation direction also rotates in three-dimensional space, resulting in a continuous change in the orientation of the main lobe. During this process, the antenna center remains fixed, and its relative position with respect to the detection target does not change [22]. As shown in Figure 2a, within the profiles, the target-reflected waveforms exhibit an approximately horizontal linear distribution, indicating that the distance between the reflection points and the antenna remains nearly constant. Furthermore, due to the strong directivity of the directional BHR antenna, its radiated field is mainly concentrated within the main lobe. When the main lobe is aligned with the target center, the target lies within the region of maximum radiation intensity, and the corresponding echo energy reaches its maximum. As the antenna gradually deviates from the target direction, the target moves out of the main lobe and enters the sidelobe region with lower radiation intensity, resulting in reduced incident energy and a gradual attenuation of the target response.

2.2. Dataset Acquisition

This study is still at the early stage of instrument development, and large-scale field surveys in complex geological environments have not yet been conducted. Therefore, the dataset is primarily generated and augmented based on simulated electromagnetic models. As shown in Figure 4a,b, a three-dimensional electromagnetic model of limestone containing a specified proportion of clay particles is constructed. The main simulation parameters of the model are listed in Table 1.

The simulations are conducted using gprMax (v3.0) [23,24]. The main electromagnetic parameters of the materials used in the model are listed in Table 2. In particular, the rotation of the cavity around the antenna’s central Z-axis is described by Equation (2).

\mathrm{rotate}(i, j, k, \mathrm{angle}, \mathrm{axis} = (x_{0}, y_{0})) = \begin{pmatrix} (i - \frac{x_{0}}{Δ}) \cos θ - (j - \frac{y_{0}}{Δ}) \sin θ + \frac{x_{0}}{Δ} \\ (i - \frac{x_{0}}{Δ}) \sin θ + (j - \frac{y_{0}}{Δ}) \cos θ + \frac{y_{0}}{Δ} \\ k \end{pmatrix}

(2)

Here, i, j, and k represent the three-dimensional coordinates of all points within the cavity anomaly. The function rotate(i, j, k, angle, axis = (x_{0}, y_{0})) gives the three-dimensional coordinates of the cavity after rotation. “angle” denotes the rotation angle of the cavity, with θ = \frac{π}{180} \cdot \mathrm{angle} \% 360 (angle: 0°–360°). Δ represents the model resolution, while x_{0} and y_{0} are the X–Y plane coordinates of the point around which the cavity rotates about the Z-axis. In this study, the default values are x_{0} = 1.535 m, y_{0} = 1.535 m, and Δ = 0.005 m.
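Equation (2) can be sketched as a small Python function. This is an illustration under stated assumptions, not the authors' gprMax script: the `delta` keyword argument is added here for convenience, the modulo is read as applying to the angle in degrees before conversion to radians, and the rotated coordinates are returned as floats without rounding to grid indices.

```python
import math


def rotate(i, j, k, angle, axis=(1.535, 1.535), delta=0.005):
    """Rotate grid point (i, j, k) about a Z-parallel axis through (x0, y0), per Equation (2)."""
    x0, y0 = axis
    # Assumption: the modulo acts on the angle in degrees.
    theta = math.pi / 180.0 * (angle % 360)
    # Rotation centre expressed in grid-cell units.
    ci, cj = x0 / delta, y0 / delta
    di, dj = i - ci, j - cj
    i_rot = di * math.cos(theta) - dj * math.sin(theta) + ci
    j_rot = di * math.sin(theta) + dj * math.cos(theta) + cj
    return i_rot, j_rot, k  # the Z index is unchanged


# A point 10 cells from the centre along +i, rotated by 90 degrees.
print(rotate(317, 307, 10, 90))
```

With the default values, the rotation centre lies at grid cell (307, 307), so the example point moves from 10 cells along the i direction to 10 cells along the j direction, up to floating-point error.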

The dataset annotation in this study is performed using LabelMe (v5.9.1) [25,26]. Figure 4c illustrates the annotation of different BHR acquisition data. Regarding the keypoint information of the anomalies (distance and azimuth), the distance refers to the distance from the center of the anomaly to the antenna center (distance_{target}), while the azimuth refers to the angle when the anomaly is aligned with the antenna (θ_{target}). The corresponding formulas are as follows:

\mathrm{distance}_{\mathrm{target}} = \sqrt{(x_{t} - x_{a})^{2} + (y_{t} - y_{a})^{2} + (z_{t} - z_{a})^{2}}

(3)

θ_{\mathrm{target}} = 360 - [\frac{180}{π} \mathrm{atan2}(- (x_{t} - x_{a}), y_{t} - y_{a}) + 360] \% 360

(4)

Here, x_{t}, y_{t}, and z_{t} denote the geometric center coordinates of the anomaly along the X, Y, and Z axes, respectively, while x_{a}, y_{a}, and z_{a} denote the geometric center coordinates of the antenna along the X, Y, and Z axes, respectively.
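As a worked illustration of Equations (3) and (4), the following Python sketch computes the two keypoint labels from the target and antenna centers. The function name and the tuple-based interface are assumptions made for this example.

```python
import math


def target_keypoint(target_xyz, antenna_xyz):
    """Return (distance, azimuth in degrees) following Equations (3) and (4)."""
    xt, yt, zt = target_xyz
    xa, ya, za = antenna_xyz
    # Equation (3): Euclidean distance between the two centers.
    distance = math.sqrt((xt - xa) ** 2 + (yt - ya) ** 2 + (zt - za) ** 2)
    # Equation (4): bracketed term reduced modulo 360, then subtracted from 360.
    inner = (math.degrees(math.atan2(-(xt - xa), yt - ya)) + 360.0) % 360.0
    azimuth = 360.0 - inner
    return distance, azimuth


# Hypothetical geometry: target offset by (3, 4, 0) from the antenna center.
print(target_keypoint((3.0, 4.0, 0.0), (0.0, 0.0, 0.0)))
```

Evaluated literally, Equation (4) yields values in the range (0, 360], so a target lying exactly along the +Y direction is assigned 360 rather than 0.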

2.3. Design of the BSS-Pose-BHR Model

2.3.1. Overall Model Architecture Design

As shown in Figure 5, the overall architecture of BSS-Pose-BHR is built upon the YOLOv11n framework [27,28], forming an end-to-end pipeline consisting of Input–Backbone–Neck–Pose Head–Output. The numbers (0–22) denote the layer indices in the YOLO network architecture. The simulation dataset constructed in Section 2.2 serves as the input, and all annotated images are uniformly resized to 640 × 640 pixels before being fed into the input layer.

The BSS-Pose-BHR model incorporates three major modifications:

Backbone network reconstruction (C2PSA → C2PSA_BRA): Bi-Level Routing Attention (BRA) replaces the Multi-Head Self-Attention (MHSA) within the Position-Sensitive Attention (PSA) module of C2PSA. BRA is an attention mechanism designed to address the scalability limitations of MHSA. Traditional attention mechanisms require each query to attend to all key–value pairs, which leads to excessive computational cost and memory consumption when processing large-scale data. BRA introduces a dynamic, query-aware sparse attention mechanism [29], which filters out most irrelevant key–value pairs at a coarse region-level granularity while retaining only a small set of routed regions. Fine-grained token-to-token attention is then performed within the union of these routed regions, allowing each query to focus on a limited number of relevant key–value pairs, thereby improving computational efficiency and memory utilization.
Keypoint extraction enhancement (Conv → Conv_SAMWS in Backbone and Neck): The Conv modules in both the Backbone and Neck networks are reconstructed as Conv_SAMWS, which incorporates a Simple Parameter-Free Attention Module with Slicing (SimAMWithSlicing). SimAM is a lightweight and efficient attention mechanism that enhances the model’s ability to capture important features through simple computations [30]. By performing slicing operations on the input feature maps, the module strengthens the attention weights of keypoint-relevant features while maintaining a lightweight architecture.
Detection head optimization (SCConv): The SCConv [31] is introduced to optimize the Pose Head structure, reducing spatial and channel redundancy while enhancing local feature learning. This improvement is particularly effective for keypoint detection tasks.

2.3.2. BRA Module

Figure 6 illustrates the internal structure of the BRA module. First, the input feature map X \in R^{H \times W \times C} is partitioned into S \times S regions, and each region is projected to generate Query (Q), Key (K), and Value (V) tensors. Let X_{r} denote the reshaped regional feature matrix, and W_{q}, W_{k}, and W_{v} represent the corresponding projection matrices. The computation is formalized as follows:

Q = X_{r} W_{q}, \quad K = X_{r} W_{k}, \quad V = X_{r} W_{v}

(5)

Then, the per-region mean values of Q and K are computed to obtain Q_{r} and K_{r}, respectively. Equation (6) is then employed to construct the adjacency matrix A_{r}, which measures the semantic similarity across different regions.

A_{r} = Q_{r} K_{r}^{⊤}

(6)

The matrix A_{r} is filtered using Equation (7), and only the top-k connections are retained for each region to prune the association graph, resulting in the index matrix I_{r}.

I_{r} = \mathrm{topk}(A_{r})

(7)

Finally, for each query region, the key–value pairs of the selected regions are aggregated to perform token-to-token attention computation.

O = \mathrm{Attention}(Q, K_{g}, V_{g}) + \mathrm{LCE}(V)

(8)

where K_{g} and V_{g} are the key–value pairs aggregated from the selected regions, and LCE(V) represents Local Context Enhancement, implemented as a depthwise convolution.
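The routing steps in Equations (5)–(8) can be sketched with NumPy as follows. This is a simplified single-head illustration, not the module used in BSS-Pose-BHR: the function name, the tensor layout (regions × tokens × channels), and the scaled dot-product softmax are assumptions, and the LCE(V) term is omitted.

```python
import numpy as np


def bra_sketch(x_r, w_q, w_k, w_v, top_k):
    """x_r: (n_regions, n_tokens, C) region-partitioned features."""
    n_regions, n_tokens, channels = x_r.shape
    q, k, v = x_r @ w_q, x_r @ w_k, x_r @ w_v        # Equation (5)
    q_r, k_r = q.mean(axis=1), k.mean(axis=1)        # per-region means
    a_r = q_r @ k_r.T                                # Equation (6): region affinity
    i_r = np.argsort(-a_r, axis=1)[:, :top_k]        # Equation (7): top-k regions
    # Gather the keys and values of the routed regions for each query region.
    k_g = k[i_r].reshape(n_regions, top_k * n_tokens, channels)
    v_g = v[i_r].reshape(n_regions, top_k * n_tokens, channels)
    # Token-to-token attention restricted to the routed regions.
    scores = q @ k_g.transpose(0, 2, 1) / np.sqrt(channels)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v_g, i_r                        # Equation (8) without LCE(V)


# Hypothetical sizes: 4 x 4 regions, 9 tokens per region, 8 channels.
rng = np.random.default_rng(0)
x = rng.standard_normal((16, 9, 8))
w = [rng.standard_normal((8, 8)) for _ in range(3)]
out, routed = bra_sketch(x, w[0], w[1], w[2], top_k=4)
print(out.shape, routed.shape)
```

Each query token attends to top_k × n_tokens keys instead of all n_regions × n_tokens keys, which is the source of the efficiency gain described above.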

2.3.3. SimAMWithSlicing Module

SimAM integrates spatial, channel, and feature dimensions to generate 3D weights. Figure 7a illustrates the generation of these 3D weights. The SimAM attention mechanism estimates the importance of each neuron by constructing and optimizing an energy function. By evaluating the linear separability of neurons, the energy function of SimAM can be expressed as follows:

e_{t}^{*} = \frac{4 (\hat{σ}^{2} + λ)}{(t - \hat{μ})^{2} + 2 \hat{σ}^{2} + 2 λ}

(9)

where e_{t}^{*} is the energy function value, representing the degree of difference between the target neuron and the other neurons; t is the input feature of the target neuron in a single channel; \hat{μ} = \frac{1}{M} \sum_{i = 1}^{M} x_{i} and \hat{σ}^{2} = \frac{1}{M} \sum_{i = 1}^{M} (x_{i} - \hat{μ})^{2} are, respectively, the mean and the variance of all neurons in the corresponding channel except the target neuron t; x_{i} is the input feature of the i-th other neuron in the same channel; M is the number of neurons and i is the neuron index; and λ is a regularization factor.

Equation (9) means that the lower the energy, the more the neuron t differs from the surrounding neurons, and the higher its importance. Finally, the output feature \tilde{X} of SimAM is expressed as follows:

\tilde{X} = \mathrm{sigmoid}(\frac{1}{E}) ⊙ X

(10)

where X denotes the input feature tensor of SimAM, E is a tensor composed of the energy function values of all features, and ⊙ is the Hadamard product. The sigmoid function is used to scale the attention weights and suppress relatively large values.

Because SimAM computes the mean and the pixel deviations over the entire feature map, the weighting process may overlook small targets, resulting in weak enhancement for small targets or keypoints and limiting its effectiveness in keypoint detection tasks. To address this issue, a slicing operation is introduced during feature map computation (Figure 7b). The feature map is divided into separate blocks. Within a block, a large target, owing to its prominent texture characteristics, strongly influences the block-wise mean, which reduces the additional weighting it receives. After the blocks are merged, large targets still maintain high recognizability and may even receive further enhancement. In contrast, small targets exhibit larger deviations from the local mean and therefore receive stronger weighting and feature enhancement. This approach improves the precision of keypoint localization, particularly for small or subtle features.
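A minimal NumPy sketch of Equations (9) and (10) combined with block-wise slicing is given below. It is illustrative only: the regular n × n slicing grid, the function names, and the default λ are assumptions, since the exact slicing scheme of SimAMWithSlicing is not specified here. The statistics follow the widely used closed-form SimAM implementation, which computes the channel mean over all positions in the block.

```python
import numpy as np


def simam(x, lam=1e-4):
    """Parameter-free SimAM weighting for a (C, H, W) feature block."""
    n = max(x.shape[1] * x.shape[2] - 1, 1)
    mu = x.mean(axis=(1, 2), keepdims=True)
    d = (x - mu) ** 2
    var = d.sum(axis=(1, 2), keepdims=True) / n
    # Inverse of the minimal energy in Equation (9): 1 / e_t*.
    inv_energy = d / (4.0 * (var + lam)) + 0.5
    # Equation (10): sigmoid(1 / E) applied element-wise to X.
    return x / (1.0 + np.exp(-inv_energy))


def simam_with_slicing(x, n_slices=2, lam=1e-4):
    """Apply SimAM independently on an n_slices x n_slices grid of blocks."""
    _, h, w = x.shape
    hs = np.linspace(0, h, n_slices + 1).astype(int)
    ws = np.linspace(0, w, n_slices + 1).astype(int)
    out = np.empty_like(x, dtype=float)
    for h0, h1 in zip(hs[:-1], hs[1:]):
        for w0, w1 in zip(ws[:-1], ws[1:]):
            # Block-wise statistics replace the global ones.
            out[:, h0:h1, w0:w1] = simam(x[:, h0:h1, w0:w1], lam)
    return out


# Hypothetical feature map with 3 channels and 8 x 8 positions.
features = np.random.default_rng(0).standard_normal((3, 8, 8))
print(simam_with_slicing(features).shape)
```

Because the mean and variance are recomputed inside each block, a small response is compared with its local surroundings and not with the whole map, which is the effect the slicing operation is intended to achieve.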

2.3.4. SCConv Module

The structure of the Spatial and Channel Reconstruction Convolution (SCConv) is shown in Figure 8 and primarily consists of the Spatial Reconstruction Unit (SRU) and the Channel Reconstruction Unit (CRU). The SRU reduces spatial redundancy in the feature maps through separation and reconstruction operations. The CRU reduces channel redundancy through segmentation, transformation, and fusion operations. By combining these two reconstruction units, SCConv effectively captures complex relationships within the input features. This not only mitigates feature redundancy but also reduces the number of model parameters and floating-point operations (FLOPs), while enhancing the model’s feature extraction capability.

2.4. Evaluation Metrics for Keypoint Detection Model

In order to evaluate the effectiveness of the proposed model [32], the metrics used in this study include mean Average Precision (mAP50), mAP50–95, model size (MB), and floating point operations (FLOPs). The mAP metrics are computed based on the precision–recall framework, while different matching criteria are adopted for bounding box detection and keypoint detection tasks.

The mAP is a comprehensive metric that reflects both precision and recall. Precision (P) and Recall (R) are defined as follows:

P = \frac{TP}{TP + FP}

(11)

R = \frac{TP}{TP + FN}

(12)

where TP, FP, and FN denote true positives, false positives, and false negatives, respectively. By varying the confidence threshold, a precision–recall (P–R) curve can be obtained, and the Average Precision (AP) is defined as the area under the P–R curve:

AP = \int_{0}^{1} P(R) \, dR

(13)

mAP = \frac{1}{n} \sum_{i = 1}^{n} AP_{i}

(14)

Here, n is the total number of classes, and AP_{i} denotes the average precision for the i-th class.
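Equations (11)–(14) can be illustrated with the short Python sketch below. It integrates the P–R curve with all-point interpolation over a monotone precision envelope. That is one common convention and is an assumption here, since evaluation toolkits differ in how they sample the curve. The function names and toy inputs are hypothetical.

```python
import numpy as np


def average_precision(scores, is_tp, n_gt):
    """Area under the precision-recall curve for one class (Equation (13))."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    hits = np.asarray(is_tp, dtype=bool)[order]
    tp = np.cumsum(hits)
    fp = np.cumsum(~hits)
    recall = tp / n_gt                  # Equation (12): TP + FN = n_gt
    precision = tp / (tp + fp)          # Equation (11)
    # Pad the curve and make precision non-increasing in recall.
    mrec = np.concatenate(([0.0], recall, [1.0]))
    mpre = np.concatenate(([1.0], precision, [0.0]))
    mpre = np.maximum.accumulate(mpre[::-1])[::-1]
    # Sum rectangle areas wherever recall changes.
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))


def mean_average_precision(ap_per_class):
    """Equation (14): mean of the per-class AP values."""
    return float(np.mean(ap_per_class))


# Toy example: three detections, two ground-truth objects.
ap = average_precision([0.9, 0.8, 0.7], [True, False, True], n_gt=2)
print(ap, mean_average_precision([ap]))
```

Whether a detection counts as a true positive is decided beforehand by the matching criterion (IoU for boxes, OKS for keypoints) at the chosen threshold.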

For bounding box detection, mAP50(B) and mAP50–95(B) are computed based on the Intersection over Union (IoU) between the predicted bounding box and the ground-truth box. IoU is defined as the ratio of the overlap area to the union area of the predicted box B_{p} and the ground-truth box B_{g}:

IoU = \frac{|B_{p} \cap B_{g}|}{|B_{p} \cup B_{g}|}

(15)

A prediction is considered correct when the IoU exceeds a specified threshold. Specifically, mAP50(B) denotes the mAP computed when the IoU threshold is set to 0.5, while mAP50–95(B) represents the average mAP over multiple IoU thresholds from 0.50 to 0.95 with a step of 0.05.
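The IoU of Equation (15) and the ten thresholds used for mAP50–95 can be written out as follows. The corner-format box representation (x_min, y_min, x_max, y_max) and the function names are assumptions made for this sketch.

```python
def box_iou(box_p, box_g):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_p[2], box_g[2]) - max(box_p[0], box_g[0]))
    inter_h = max(0.0, min(box_p[3], box_g[3]) - max(box_p[1], box_g[1]))
    inter = inter_w * inter_h
    area_p = (box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
    area_g = (box_g[2] - box_g[0]) * (box_g[3] - box_g[1])
    union = area_p + area_g - inter
    return inter / union if union > 0 else 0.0


# Thresholds from 0.50 to 0.95 with a step of 0.05.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def correct_at_thresholds(box_p, box_g):
    """Flag, per threshold, whether the prediction counts as correct."""
    value = box_iou(box_p, box_g)
    return {t: value > t for t in IOU_THRESHOLDS}


print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
print(correct_at_thresholds((0, 0, 10, 10), (0, 0, 10, 8)))
```

mAP50–95 is then the mean of the mAP values obtained by repeating the evaluation at each of these thresholds.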

Similarly to bounding box evaluation, the metrics mAP50(P) and mAP50–95(P) are used for keypoint detection, where the Object Keypoint Similarity (OKS) defined in the MS COCO evaluation protocol is adopted instead of IoU. The OKS is calculated as:

OKS = \frac{\sum_{i} \exp(- \frac{d_{i}^{2}}{2 s^{2} k_{i}^{2}}) δ(v_{i} > 0)}{\sum_{i} δ(v_{i} > 0)}

(16)

Here, i represents the index of an annotated keypoint; d_{i}^{2} represents the squared Euclidean distance between the detected keypoint position and the ground-truth keypoint position; s^{2} represents the area occupied by the target in the image; and k_{i} represents the decay constant associated with keypoint i. In the case of multiple keypoints, k_{i} can be calculated as the standard deviation of the corresponding ground-truth positions across the dataset, reflecting the annotation consistency of that point. The value of k_{i} is normalized by the target region size in the OKS calculation. A larger k_{i} indicates lower consistency (higher annotation variability), while a smaller k_{i} indicates higher consistency (more reliable annotations). For a single geological keypoint, k_{i} is typically set to 0.5. In other words, k_{i} can be interpreted as a weight reflecting the importance of each keypoint: more important points can be assigned smaller k_{i} values, requiring higher localization precision and contributing more to the OKS; less important points can be assigned larger k_{i} values, allowing for larger prediction errors without significantly affecting the overall OKS. δ is the indicator function (equal to 1 when its condition holds and 0 otherwise), meaning that the OKS is computed only over keypoints that are annotated in the ground truth (v_{i} > 0). v_{i} represents the visibility of the i-th keypoint, where 0 signifies unannotated, 1 signifies annotated but occluded, and 2 signifies annotated and visible.

In keypoint evaluation, a prediction is considered correct when the OKS exceeds a specified threshold. Therefore, mAP50(P) denotes the mAP computed when the OKS threshold is set to 0.5, while mAP50–95(P) represents the average mAP over multiple OKS thresholds from 0.50 to 0.95 with a step of 0.05.
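Equation (16) translates directly into the following Python sketch. It implements the formula as written above. Evaluation toolkits may apply different constant factors, so the numbers it produces are not guaranteed to match a particular library. The function name and argument layout are assumptions.

```python
import numpy as np


def object_keypoint_similarity(pred_xy, gt_xy, visibility, area, k):
    """OKS per Equation (16); `area` plays the role of s^2."""
    pred_xy = np.asarray(pred_xy, dtype=float)
    gt_xy = np.asarray(gt_xy, dtype=float)
    visibility = np.asarray(visibility)
    # Allow a single decay constant to be shared by all keypoints.
    k = np.broadcast_to(np.asarray(k, dtype=float), visibility.shape)
    labelled = visibility > 0              # indicator delta(v_i > 0)
    if not labelled.any():
        return 0.0
    d2 = np.sum((pred_xy - gt_xy) ** 2, axis=1)
    similarity = np.exp(-d2[labelled] / (2.0 * area * k[labelled] ** 2))
    return float(similarity.sum() / labelled.sum())


# Hypothetical single keypoint, 5 px off in x and y, target area of 100 px^2.
print(object_keypoint_similarity([[5.0, 5.0]], [[0.0, 0.0]], [2], 100.0, 0.5))
```

In this example d² = 50 and 2 s² k² = 50, so the similarity equals exp(−1), which falls below the 0.5 threshold used for mAP50(P).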

Meanwhile, model size is also crucial. Industrial equipment usually has limited resources, and smaller models are easier to deploy on edge devices or embedded systems. In addition, FLOPs are used to describe the number of floating-point computations required during model inference and are commonly adopted to evaluate the overall computational complexity of a model.

3. Results and Discussion

3.1. Implementation Details

All deep learning models are tested on a Windows 10 operating system. The experimental hardware includes an Intel(R) Xeon(R) Gold 6133 @ 2.50 GHz processor and an NVIDIA GeForce RTX 4090 GPU with 24 GB of video memory. Development is carried out using PyTorch 2.0.1 and CUDA 11.7, with Python 3.8 as the programming environment. The specific training parameters of all models used in the experiments are detailed in Table 3.

In this study, for the simulation dataset, GPUs are employed to accelerate gprMax simulations, using a computational platform equipped with two RTX 4090 GPUs (24 GB VRAM each). However, since the simulations are conducted in three-dimensional space and involve detailed antenna modeling, the computational complexity is high. It takes nearly one month to generate 623 data pairs, which are further divided into training, validation, and test sets in an 8:1:1 ratio.
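The 8:1:1 partition can be sketched as follows. The shuffling seed, the rounding rule, and the function name are assumptions made for illustration; the exact per-split counts used in this study are not stated here and may differ by one or two samples depending on rounding.

```python
import random

def split_indices(n_samples, ratios=(8, 1, 1), seed=0):
    """Shuffle sample indices and split them according to integer ratios."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    total = sum(ratios)
    n_train = n_samples * ratios[0] // total   # floor rounding (assumption)
    n_val = n_samples * ratios[1] // total
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]           # remainder goes to the test set
    return train, val, test

train, val, test = split_indices(623)
print(len(train), len(val), len(test))  # 498 62 63 under this rounding rule
```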

3.2. Training Performance of BSS-Pose-BHR

As shown in Figure 9, this study presents the variation curves of different loss functions and evaluation metrics of BSS-Pose-BHR over 400 training epochs. For the training set, the box loss initially exhibits a high value (~4.4) but decreases sharply within the first 50 epochs, indicating that the model rapidly learns the spatial localization and scale of cavity objects by refining bounding box predictions. After 50 epochs, the decreasing trend slows and the curve stabilizes at approximately 0.86, indicating convergence. Further training yields negligible improvements in box regression performance. Similarly, the classification loss, which starts at a relatively high value (~3.9), gradually decreases and stabilizes at 0.44 after 400 epochs. The initially high classification loss indicates difficulty in assigning correct class labels to detected bounding boxes; however, as training progresses, the model achieves improved classification accuracy for cavity targets at different locations. Furthermore, the stabilization of the distribution focal loss after 400 epochs suggests that the model has learned a consistent pattern for refining bounding box predictions and has increased confidence in delineating cavity response boundaries within BHR radargrams. The overall reduction in the three loss functions demonstrates an effective optimization process, with the model progressively converging as training proceeds. The final loss values, all below 1, indicate that the BSS-Pose-BHR model is sufficiently trained to capture the gradual energy response characteristics of cavities in directional BHR profiles. Regarding keypoint estimation losses, both pose loss and keypoint objectness loss (kobj_loss) stabilize after 400 epochs, with values below 0.1. This indicates that BSS-Pose-BHR achieves balanced confidence in keypoint prediction, effectively distinguishing true keypoints from background noise and learning robust spatial keypoint representations. 
In addition, key evaluation metrics such as Precision, Recall, mAP50, and mAP50–95 all reach stable peak values after 400 epochs.

For the validation set, the above loss functions and evaluation metrics exhibit smooth and gradual convergence as training progresses. No curve shows signs of overfitting or severe oscillations during training, indicating that the training configuration of BSS-Pose-BHR is reasonable and stable.

3.3. Comparative Experiments

The optimal weights of the BSS-Pose-BHR model obtained after 400 training epochs are evaluated on the test set, and its performance is compared with several state-of-the-art models. The comparison results are shown in Table 4.

To ensure a fair comparison, all baseline models (YOLOv8n-pose, YOLOv10n-pose, YOLOv11n-pose, and YOLOv12n-pose) are trained on the proposed simulated BHR dataset under the same training configuration as BSS-Pose-BHR, including input resolution, optimizer, learning rate, batch size, and number of epochs. No additional hyperparameter tuning is applied to individual models. Among these keypoint detection models, YOLOv8n-pose and YOLOv11n-pose also achieve relatively strong performance. Specifically, compared with YOLOv10n-pose, BSS-Pose-BHR achieves improvements of 7.54% in mAP50(B), 4.00% in mAP50–95(B), 2.75% in mAP50(P), and 3.00% in mAP50–95(P). This significant performance gap is also reflected in the confusion matrices computed on the validation set during training. As shown in Figure 10, the confusion matrix of BSS-Pose-BHR shows perfectly correct predictions on the validation set, with no misclassifications or missed detections. In contrast, YOLOv10n-pose misclassifies five target samples as background on the same validation set, further demonstrating the superiority of BSS-Pose-BHR over YOLOv10n-pose.

For YOLOv11n-SPPF_improve-BSAM-LSCD_LQE, the model integrates several advanced enhancement strategies. Based on YOLOv11n-pose, three major modifications are introduced. (1) In the backbone, the original SPPF module is enhanced by incorporating global average pooling and global max pooling layers. The resulting features are concatenated to embed global background information, providing a broader contextual representation. (2) The Bi-Level Routing Spatial Attention Module (BSAM) is appended after the C2PSA module. BSAM is an improved variant of the Convolutional Block Attention Module (CBAM), in which the original channel attention mechanism is replaced to enhance feature selection capability. (3) The Local Structure and Context Description–Local Quality Estimation (LSCD_LQE) module replaces the original detection head to improve localization and quality estimation. For the variant “Ours (LSCD),” the detection head of BSS-Pose-BHR is replaced with LSCD for comparison. Compared with the baseline model (YOLOv11n-pose), these two modified models show improvements on only a subset of evaluation metrics. In contrast, BSS-Pose-BHR achieves consistent and significant improvements across all four key metrics, with larger gains than both variants. This further demonstrates the effectiveness and rationality of the proposed module design. In terms of computational complexity, measured in GFLOPs, BSS-Pose-BHR maintains relatively low overhead and ranks third among all compared models, while YOLOv12n-pose achieves the lowest complexity. Although three improvement modules are introduced into YOLOv11n-pose, the computational cost of BSS-Pose-BHR increases by only 0.2 GFLOPs, indicating a marginal increase in complexity and still remaining lower than most competing models. Overall, these results demonstrate that the proposed method achieves a favorable trade-off between detection accuracy and computational efficiency.

To provide a more intuitive comparison between the proposed model and the baseline in terms of both bounding box detection and keypoint localization accuracy, Figure 11 and Figure 12 are presented. As shown in the detection results of eight representative examples in Figure 11, both BSS-Pose-BHR and YOLOv11n-pose successfully detect the expected number of bounding boxes. However, the predicted bounding boxes of BSS-Pose-BHR exhibit higher confidence scores than those of YOLOv11n-pose, indicating that BSS-Pose-BHR provides more reliable predictions for practical BHR profile-based target recognition applications. Figure 12 illustrates the proportional differences along the x–y axes between predicted keypoint positions and ground truth for approximately 450 cases, including both simulated and indoor experimental data. From the figure, YOLOv11n-pose shows a more dispersed distribution of proportional differences, indicating slightly larger deviations from the ground truth compared with the proposed method. Specifically, the mean and standard deviation are 0.0128 and 0.0099 for YOLOv11n-pose, and 0.0118 and 0.0086 for BSS-Pose-BHR, respectively. These results demonstrate that the proposed method achieves higher accuracy and better stability in keypoint position estimation.
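A minimal sketch of how such statistics can be computed is given below. It assumes that the proportional difference is the absolute difference between predicted and ground-truth keypoint coordinates, both normalized to [0, 1] by the image width and height, pooled over the x and y axes. This definition, the use of the population standard deviation, and the toy coordinates are assumptions for illustration and may differ from the procedure behind Figure 12.

```python
import statistics

def proportional_differences(pred, gt):
    """Absolute per-axis differences between normalized keypoint coordinates."""
    diffs = []
    for (px, py), (gx, gy) in zip(pred, gt):
        diffs.append(abs(px - gx))
        diffs.append(abs(py - gy))
    return diffs

# Toy normalized keypoints (not taken from the experiments)
pred = [(0.51, 0.40), (0.25, 0.62)]
gt = [(0.50, 0.42), (0.26, 0.60)]

diffs = proportional_differences(pred, gt)
mean_diff = statistics.mean(diffs)
std_diff = statistics.pstdev(diffs)  # population standard deviation
print(mean_diff, std_diff)
```

A lower mean indicates smaller average deviation from the ground truth, and a lower standard deviation indicates a more concentrated error distribution.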

In addition, Section 1 introduces two representative directional BHR profile target localization algorithms: the maximum amplitude-based directional algorithm and the median-value-based localization algorithm, both of which are used to estimate azimuth angles. Therefore, this study further compares these two methods with BSS-Pose-BHR in terms of azimuth angle prediction accuracy. The experimental data are consistent with the cases analyzed above, and the results are presented in Table 5.

In summary, for advanced deep learning-based methods, both the quantitative metrics of bounding box detection and keypoint localization, as well as the qualitative visualization results, clearly demonstrate the superior performance of BSS-Pose-BHR. Compared with traditional directional BHR profile-based target azimuth estimation algorithms, BSS-Pose-BHR also achieves more accurate azimuth angle prediction.

3.4. Ablation Experiment

To verify the impact of the proposed improvement modules on model performance, ablation experiments are conducted based on YOLOv11n-pose. The results are shown in Table 6.

The experimental results indicate that all improvement strategies positively contribute to overall model performance. Introducing the BRA module alone yields gains of 2.02% in mAP50(B) and 0.23% in mAP50(P) over the baseline YOLOv11n-pose, while increasing the model size by 0.56 MB. This module therefore improves the model’s ability to extract and accurately localize targets in BHR profiles, with the larger gain in bounding box detection, at the cost of a slight increase in model weight size.

After introducing the SimAMWithSlicing module on top of the BRA module, the model achieves improvements of 3.14% in mAP50(B) and 0.69% in mAP50(P) compared with YOLOv11n-pose. Due to the lightweight design of the SimAMWithSlicing attention mechanism, the model size remains almost unchanged relative to the BRA-only model, while still providing noticeable performance gains.

To further improve accuracy, the detection head is enhanced after integrating BRA and SimAMWithSlicing by incorporating SCConv. This module separately optimizes spatial and channel information, reduces redundant features, and further improves mAP50(B) and mAP50(P), thereby significantly enhancing the performance of YOLOv11n-pose in detecting keypoint positions of underground targets in BHR profiles. However, although the integration of these three modules yields substantial improvements in detection accuracy, it also leads to an increase in model size.
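The step-by-step contribution of each module can be read off by differencing the cumulative gains over YOLOv11n-pose. The short sketch below uses the gains quoted in this section for the first two configurations and the full-model gains (5.39% in mAP50(B) and 0.86% in mAP50(P)) reported in the Conclusions. Simple differencing ignores interactions between modules, so the increments are only indicative, and the variable names are illustrative.

```python
# Cumulative gains over YOLOv11n-pose: (configuration, mAP50(B) gain, mAP50(P) gain)
cumulative = [
    ("+BRA", 2.02, 0.23),
    ("+BRA+SimAMWithSlicing", 3.14, 0.69),
    ("+BRA+SimAMWithSlicing+SCConv", 5.39, 0.86),
]

def incremental_gains(rows):
    """Difference each cumulative gain against the previous configuration."""
    out = []
    prev_b, prev_p = 0.0, 0.0
    for name, gain_b, gain_p in rows:
        out.append((name, round(gain_b - prev_b, 2), round(gain_p - prev_p, 2)))
        prev_b, prev_p = gain_b, gain_p
    return out

for name, step_b, step_p in incremental_gains(cumulative):
    print(f"{name}: +{step_b:.2f} mAP50(B), +{step_p:.2f} mAP50(P)")
```

Under this reading, SCConv adds the largest step in mAP50(B) (2.25), while SimAMWithSlicing adds the largest step in mAP50(P) (0.46).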

3.5. Indoor Experimental Testing

3.5.1. Single-Target Detection Experiment

Since the instrument research, supported by a major national project, is still at an early stage, this study conducts signal processing experiments using only indoor tests. In these experiments, the BHR system is positioned approximately 50 cm above the ground and performs a 360° full-space rotational scan. This setup is primarily used to observe the locations of ground response features in the BHR profiles. The antenna operates at 400 MHz and adopts a water-drop-shaped butterfly radiator. The borehole diameter is 60 mm, the load resistance is 200 Ω, and the antenna shielding arc is 180°. The trained optimal model weights of BSS-Pose-BHR and YOLOv11n-pose are then applied to perform target detection and keypoint localization on the acquired profiles. The results are shown in Figure 13.

Due to factors such as the electronic design of the radar antenna, impedance mismatches between the antenna and the ground, and multiple reflections of electromagnetic waves at subsurface interfaces, the raw profiles obtained during indoor ground detection inevitably contain horizontal interference with varying amplitude levels. This interference degrades the quality of the high-resolution images provided by the system. Therefore, as shown in Figure 13b, Robust Principal Component Analysis (RPCA) is applied to remove most low-rank components from the raw profiles. Subsequently, both BSS-Pose-BHR and YOLOv11n-pose are used to perform keypoint detection on the processed data. In terms of detection confidence, BSS-Pose-BHR shows a clear advantage. From the perspective of keypoint azimuth localization, the ground is theoretically located directly beneath the BHR system, and the corresponding angle of the strongest ground response in the profile should be approximately 180°, i.e., at the midpoint (0.5) of the horizontal axis. According to the quantitative results, BSS-Pose-BHR predicts the keypoint position at 0.527522, whereas YOLOv11n-pose predicts it at 0.551346. This indicates that BSS-Pose-BHR produces a keypoint estimation closer to the ground truth.
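The comparison above can be expressed in degrees with a short sketch. It assumes that the horizontal axis of the profile maps linearly onto 0° to 360°, which is consistent with the midpoint (0.5) corresponding to 180°, and it measures the error on the circle so that angles near 0° and 360° are treated as neighbors. The function names are illustrative.

```python
def x_to_azimuth(x_norm):
    """Map a normalized horizontal position in [0, 1] to an azimuth in degrees."""
    return (x_norm * 360.0) % 360.0

def angular_error(a_deg, b_deg):
    """Smallest absolute difference between two angles on a circle, in degrees."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)

ground_truth = 180.0  # ground directly beneath the BHR system
for name, x in [("BSS-Pose-BHR", 0.527522), ("YOLOv11n-pose", 0.551346)]:
    err = angular_error(x_to_azimuth(x), ground_truth)
    print(name, round(err, 2))  # about 9.91 and 18.48 degrees, respectively
```

Under this mapping, the reported positions correspond to azimuth errors of roughly 9.9° for BSS-Pose-BHR and 18.5° for YOLOv11n-pose in this single indoor case.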

3.5.2. Multi-Target Detection Experiment

Based on the experimental setup described in Section 3.5.1, a horizontal piece of aluminum foil is placed directly above the borehole radar. In this case, the BHR scans exhibit additional response features corresponding to the foil. The ground target keypoints are located at approximately 180° (i.e., the center of the scan), while the foil keypoints are located around 0°/360° (i.e., the edges of the scan). Detection is performed using both BSS-Pose-BHR and YOLOv11n-pose, and the comparative results are shown in Figure 14. The results indicate that the two methods achieve similar performance in keypoint localization, while the proposed method produces higher confidence scores for the predicted bounding boxes compared with the baseline.

3.6. Robustness Analysis

To conduct a robustness analysis of the BSS-Pose-BHR model, we systematically evaluate its performance by introducing varying levels of Gaussian noise and different channel drop rates into the BHR profiles.

In practice, BHR data acquisition inevitably involves multiple sources of random interference, such as thermal noise from radar transmit/receive electronics, electromagnetic background noise, and stochastic perturbations in the signal acquisition chain. In addition, subsurface heterogeneity and scattering effects introduce further noise components. Therefore, degraded scenarios are constructed by artificially adding Gaussian noise with different intensities to the original clean profiles, enabling a systematic evaluation of the model’s robustness under complex real-world conditions.

Furthermore, to evaluate the model’s adaptability to incomplete sampling, profiles with varying degrees of missing channels are generated. The rotational directional BHR system acquires circumferential data by controlling the antenna rotation angle interval. In practice, the angular sampling density is sometimes reduced to improve acquisition efficiency, which can lead to missing traces in the profiles. Therefore, degraded scenarios with different missing-channel ratios are constructed to assess the stability and engineering applicability of the BSS-Pose-BHR model under sparse sampling and incomplete observation conditions.
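A minimal sketch of constructing such a degraded profile is given below. It treats each column of the profile as one trace, removes a random subset of traces at a given rate, and stretches the remaining traces back to the original width. The nearest-neighbor resizing, the seed, and the function names are assumptions for illustration and are not necessarily the procedure used in this study.

```python
import random

def drop_traces(profile, drop_rate, seed=0):
    """Remove a random subset of traces (columns) from a row-major profile."""
    n_traces = len(profile[0])
    n_drop = round(n_traces * drop_rate)
    dropped = set(random.Random(seed).sample(range(n_traces), n_drop))
    kept = [j for j in range(n_traces) if j not in dropped]
    return [[row[j] for j in kept] for row in profile]

def resize_width(profile, new_width):
    """Nearest-neighbor resize along the trace axis (assumed interpolation)."""
    old_width = len(profile[0])
    cols = [min(old_width - 1, int((j + 0.5) * old_width / new_width))
            for j in range(new_width)]
    return [[row[c] for c in cols] for row in profile]

# Toy profile: 4 time samples x 100 traces, values encode the trace index
profile = [[float(j) for j in range(100)] for _ in range(4)]
sparse = drop_traces(profile, drop_rate=0.35)
restored = resize_width(sparse, new_width=100)
print(len(sparse[0]), len(restored[0]))  # 65 100
```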

Experiments are conducted on 40 profiles. Gaussian noise is added to achieve PSNR values ranging from 31 dB to 43 dB, representing light to moderate noise levels that preserve the main profile structures while challenging the model’s feature extraction and target recognition capabilities. In practical simulations, when the PSNR falls below approximately 30 dB, the noise becomes sufficiently strong to noticeably distort the morphology of the BHR profiles and may obscure the response characteristics of cavity anomalies. Conversely, when the PSNR exceeds 43 dB, the profiles become overly clean, and the noise interference is insufficient to effectively evaluate the robustness of the model. Therefore, the range of 31–43 dB is adopted as a relatively conservative setting, introducing noticeable but non-destructive noise while ensuring that the main structural features and anomaly responses remain clearly identifiable. It is observed that profiles with PSNR around 30 dB can still preserve the basic structural information; however, a slightly higher range is selected in this study to ensure stable anomaly visibility and consistent evaluation conditions. For missing-channel scenarios, random channel omissions are applied to the original profiles, after which the data are merged and resized to a uniform dimension.
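The noise level needed for a target PSNR follows from PSNR = 10·log10(peak² / MSE): for zero-mean Gaussian noise the expected MSE equals σ², so σ = peak · 10^(−PSNR/20). The sketch below applies this relation. It assumes profiles normalized to a peak value of 1.0 and no clipping after the noise is added; these choices and the function names are illustrative rather than the exact procedure used here.

```python
import math
import random

def noise_sigma_for_psnr(psnr_db, peak=1.0):
    """Standard deviation of zero-mean Gaussian noise giving the target PSNR."""
    return peak / (10.0 ** (psnr_db / 20.0))

def add_gaussian_noise(profile, psnr_db, peak=1.0, seed=0):
    """Add Gaussian noise to a row-major profile (no clipping, by assumption)."""
    rng = random.Random(seed)
    sigma = noise_sigma_for_psnr(psnr_db, peak)
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in profile]

def psnr(clean, noisy, peak=1.0):
    """Measured PSNR in dB between a clean and a degraded profile."""
    n = sum(len(row) for row in clean)
    mse = sum((a - b) ** 2
              for row_c, row_n in zip(clean, noisy)
              for a, b in zip(row_c, row_n)) / n
    return 10.0 * math.log10(peak ** 2 / mse)

# Toy constant profile (not real BHR data), degraded to the two ends of the range
clean = [[0.5] * 100 for _ in range(100)]
for target in (31.0, 43.0):
    noisy = add_gaussian_noise(clean, target)
    print(target, psnr(clean, noisy))
```

The measured PSNR fluctuates slightly around the target because the noise is random, and the deviation shrinks as the number of samples grows.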

As shown in Figure 15 and Figure 16, under different PSNR levels and missing-channel conditions, BSS-Pose-BHR consistently outperforms YOLOv11n-pose in terms of both mAP50–95(B) and mAP50–95(P), with the performance curves remaining consistently above those of YOLOv11n-pose. These quantitative results directly demonstrate the superior performance of BSS-Pose-BHR in both bounding box detection and keypoint estimation tasks. Representative cases further show that even at PSNR = 31 dB or a missing-channel rate of 0.35, BSS-Pose-BHR successfully detects the complete number of bounding boxes and accurately predicts keypoints. As summarized in Table 7, across different degradation scenarios, the average error between predicted and true azimuth angles indicates that BSS-Pose-BHR achieves significantly higher azimuth prediction accuracy than the two traditional methods. Notably, even under severe conditions (missing-channel rate = 0.35, PSNR = 31 dB), its average error remains lower than that of the traditional algorithms under much milder conditions (missing-channel rate = 0.06, PSNR = 43 dB). Moreover, among the traditional methods, the maximum-amplitude-based approach consistently underperforms the median-value-based method.

3.7. Limitations

Although the proposed BSS-Pose-BHR method achieves good performance in the experiments, several limitations should be noted:

The dataset used in this study is mainly generated from numerical simulations, which may not fully represent the complexity of real geological environments. In future work, more realistic geological models and refined simulation settings will be considered, and real measured data or data augmentation techniques such as generative networks will be introduced to further improve the diversity and realism of the dataset.
The simulations are conducted in three-dimensional space with detailed antenna modeling, resulting in high modeling complexity and limiting the efficiency of data generation. Future studies will explore simplified simulation strategies and reduced model complexity to improve computational efficiency and support the construction of larger-scale datasets.
The keypoint definition in this study is limited to the azimuth of the target response, which is a simplified setting suitable for preliminary investigation. Physical validation in real borehole environments is time-consuming and resource-intensive. In addition, experiments involving more complex and non-symmetric geological structures are required in future work to further improve the generalization capability of the proposed method.

4. Conclusions

Based on YOLOv11n-pose, this study proposes BSS-Pose-BHR for keypoint detection of geological targets in directional BHR profiles. To the best of our knowledge, this study represents one of the early attempts to apply deep learning to this task in automated underground utility detection. The main findings are as follows:

Training Stability: For both the training and validation sets, the loss and metric curves of BSS-Pose-BHR over 400 epochs exhibit smooth and stable convergence. All curves gradually decrease or stabilize without showing overfitting or significant oscillations, indicating that the training configuration of BSS-Pose-BHR is reasonable and stable.
Performance Evaluation: Compared with state-of-the-art baseline models, BSS-Pose-BHR demonstrates clear superiority. All accuracy metrics achieve the best performance among the compared methods, demonstrating the effectiveness and rationality of the proposed improvements. To further visualize performance differences, both bounding box detection and keypoint localization results are presented. The confidence scores of predicted bounding boxes from BSS-Pose-BHR consistently exceed those of the baseline. In the proportional error plots of predicted versus ground-truth keypoints along the x–y axis, the baseline shows a mean of 0.0128 with a standard deviation of 0.0099, whereas BSS-Pose-BHR achieves a mean of 0.0118 and a standard deviation of 0.0086, indicating higher keypoint localization accuracy. Furthermore, compared with traditional directional BHR target localization methods, BSS-Pose-BHR achieves superior azimuth angle prediction, with an average error of only 2.7°.
Ablation Study: The incremental contributions of the proposed modules are validated through ablation experiments. Each module positively contributes to the final performance. Compared with the baseline, BSS-Pose-BHR improves mAP50(B) by 5.39% and mAP50(P) by 0.86%, at the cost of an additional 1.05 MB in model size.
Indoor Experimental Test: The model is further validated using real BHR signals in a controlled indoor experiment. For the single-target case, where the ground truth angle corresponds to 0.5 along the horizontal axis of the profile, BSS-Pose-BHR predicts a keypoint at 0.527522, while YOLOv11n-pose predicts 0.551346, demonstrating that BSS-Pose-BHR achieves more accurate keypoint localization. In the indoor multi-target experiments, the proposed method shows comparable keypoint localization performance to the baseline, while producing higher confidence scores for predicted bounding boxes.
Robustness Analysis: To evaluate robustness, datasets with varying PSNR levels and missing-trace rates are constructed. Across all degradation scenarios, BSS-Pose-BHR consistently outperforms YOLOv11n-pose. Its average mAP50–95(B) and mAP50–95(P) remain higher under all conditions, and its azimuth angle predictions are significantly more accurate than those of the two traditional localization methods.

Author Contributions

Conceptualization, X.T. and F.Y.; methodology, X.T. and F.Y.; software, F.Y. and X.Q.; validation, X.T., M.X. and F.Y.; formal analysis, X.T., M.X. and F.Y.; investigation, X.T., M.X. and F.Y.; resources, F.Y. and S.P.; data curation, J.L. and X.Q.; writing—original draft preparation, X.T.; writing—review and editing, X.T., F.Y. and S.P.; visualization, X.T.; supervision, F.Y.; project administration, X.T.; funding acquisition, S.P. and F.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research is funded by the National Natural Science Foundation of China (grant number 52427901).

Data Availability Statement

Restrictions apply to the availability of these data. Data are obtained from the China University of Mining and Technology (Beijing) and are available from the authors with the permission of the China University of Mining and Technology (Beijing).

Conflicts of Interest

The authors declare no conflicts of interest.


Figure 1. Design Schematic and Physical Prototype of the Borehole Electromagnetic (Radar) Acquisition System.

Figure 2. Illustration of target azimuth determination using the maximum-amplitude method and the intermediate-value method. (a) Directional BHR response B-scan profile of a single target (obtained by subtracting the profile without a target from the profile with a target). (b) Azimuth information curve in three-dimensional space corresponding to the target in (a), derived using the maximum-amplitude-based orientation algorithm. (c) Azimuth information curve in three-dimensional space corresponding to the target in (a), derived using the intermediate-value-based anomaly localization algorithm.

Figure 3. Conventional BHR Survey vs. Rotational Detection. (a) Operational diagram of conventional BHR survey. (b) Structure of the single-hole directional BHR antenna and its operational diagram.

Figure 4. Illustration of Directional BHR Dataset Acquisition. (a) Simulation model. (b) Diagram of target rotation angle and positioning rules within the simulation model. (c) Dataset annotation diagram.

Figure 5. Overall Architecture of the BSS-Pose-BHR Model.

Figure 6. Architecture of the BRA Module.

Figure 7. Illustration of the SimAMWithSlicing Module. (a) Schematic of the SimAM attention module. (b) Illustration of the feature map slicing operation.

Figure 8. Architecture of the SCConv Module.

Figure 9. Training curves of BSS-Pose-BHR over 400 Epochs for Different Losses and Metrics.

Figure 10. Confusion Matrices. (a) Confusion matrix of BSS-Pose-BHR. (b) Confusion matrix of YOLOv10n-pose.

Figure 11. Test Comparison Results. (a) YOLOv11n-Pose. (b) BSS-Pose-BHR.

Figure 12. Keypoint Localization Error Distribution: YOLOv11n-Pose vs. BSS-Pose-BHR.

Figure 13. Indoor Single-Target Detection Testing Results. (a) Raw profile of indoor ground detection using directional BHR. (b) Profile in (a) after RPCA processing, suppressing horizontal correlated interference. (c) Keypoint detection results of BSS-Pose-BHR on the profile in (b). (d) Keypoint detection results of YOLOv11n-Pose on the profile in (b).

Figure 14. Indoor Multi-Target Detection Testing Results. (a) Keypoint detection results of BSS-Pose-BHR on the profile. (b) Keypoint detection results of YOLOv11n-Pose on the profile.

Figure 15. Comparison of algorithm robustness under different PSNR levels. (a) Line chart showing average mAP50–95(B) values of BSS-Pose-BHR and YOLOv11n-pose across profiles with varying PSNR. (b) Line chart showing average mAP50–95(P) values of BSS-Pose-BHR and YOLOv11n-pose across profiles with varying PSNR. (c) Example profile at PSNR = 31 dB. (d) Detection results of BSS-Pose-BHR on the PSNR = 31 dB example. (e) Example profile at PSNR = 43 dB. (f) Detection results of BSS-Pose-BHR on the PSNR = 43 dB example.

Figure 16. Comparison of algorithm robustness under different channel loss rates. (a) Line chart showing average mAP50–95(B) values of BSS-Pose-BHR and YOLOv11n-pose across profiles with varying channel loss rates. (b) Line chart showing average mAP50–95(P) values of BSS-Pose-BHR and YOLOv11n-pose across profiles with varying channel loss rates. (c) Example profile at channel loss rate = 0.06. (d) Detection results of BSS-Pose-BHR on the channel loss rate = 0.06 example. (e) Example profile at channel loss rate = 0.35. (f) Detection results of BSS-Pose-BHR on the channel loss rate = 0.35 example.
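
The degraded profiles in Figures 15 and 16 can be imitated with a short script. The following is a minimal sketch, not the authors' procedure: it assumes additive zero-mean Gaussian noise, a PSNR defined against the peak absolute amplitude of the clean profile, and lost channels represented as all-zero traces. The function names (`add_noise_at_psnr`, `drop_traces`, `measured_psnr`) and the toy B-scan are hypothetical.

```python
import math
import random


def add_noise_at_psnr(profile, psnr_db, rng):
    """Add zero-mean Gaussian noise so the expected PSNR equals psnr_db.

    Assumption: PSNR = 10*log10(peak^2 / MSE), with peak taken as the
    largest absolute amplitude in the clean profile.
    """
    peak = max(abs(v) for row in profile for v in row)
    sigma = peak / (10 ** (psnr_db / 20))  # noise std that yields the target MSE
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in profile]


def drop_traces(profile, loss_rate, rng):
    """Zero out a random fraction of traces (columns) of a B-scan.

    Assumption: a lost channel is represented by an all-zero trace.
    Returns the degraded profile and the set of lost column indices.
    """
    n_traces = len(profile[0])
    n_lost = round(loss_rate * n_traces)
    lost = set(rng.sample(range(n_traces), n_lost))
    degraded = [[0.0 if j in lost else v for j, v in enumerate(row)]
                for row in profile]
    return degraded, lost


def measured_psnr(clean, degraded):
    """PSNR in dB between a clean profile and its degraded version."""
    peak = max(abs(v) for row in clean for v in row)
    n = sum(len(row) for row in clean)
    mse = sum((a - b) ** 2
              for ra, rb in zip(clean, degraded)
              for a, b in zip(ra, rb)) / n
    return 10 * math.log10(peak ** 2 / mse)


rng = random.Random(0)
# Toy B-scan: 200 time samples (rows) x 100 traces (columns).
clean = [[math.sin(0.1 * i + 0.05 * j) for j in range(100)] for i in range(200)]
noisy = add_noise_at_psnr(clean, 31.0, rng)   # PSNR level used in Figure 15c
sparse, lost = drop_traces(clean, 0.35, rng)  # loss rate used in Figure 16e
```

If the profiles are stored as 8-bit images, the peak value would instead be 255; the sketch keeps the amplitude-domain definition only for simplicity.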

Table 1. Model parameter list.

Parameter	Value
Model size	3 m × 3 m × 3 m
Model resolution	0.005 m × 0.005 m × 0.005 m
Antenna radiation direction	(from Y = 0 to Y = 3)
Built-in antenna type	“MALA”
Simulated antenna size	0.075 m × 0.075 m × 0.3 m
Rotation around the axis	Z-axis
Coordinates of the rotation center	(1.535, 1.535, $z$)
Rotation angle	0° to 360°, 10° interval
Time window	60 ns, 80 ns, 100 ns
Data processing	Automatic gain function $G(n) = \min(3^{0.01n}, 4000)$, where $n$ is the time sample index
Clay volume fraction	0.0001, 0.0005, 0.001, 0.002
Anomaly shape	Cube, cuboid, sphere, cylinder, and irregular geometry
Spatial extent of the anomaly	0.1 m to 0.3 m
Anomaly type	Cavity
Number of anomalies in a single B-scan	1 to 3
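
The automatic gain listed under "Data processing" in Table 1 grows exponentially with the sample index and is then clipped at 4000. A minimal sketch of applying it to a single trace is given below; the helper names (`gain`, `apply_gain`) are hypothetical, and indexing the first sample as n = 0 is an assumption, since the table does not state the starting index.

```python
import math


def gain(n, base=3.0, rate=0.01, cap=4000.0):
    """Automatic gain G(n) = min(base**(rate*n), cap) for sample index n."""
    return min(base ** (rate * n), cap)


def apply_gain(trace):
    """Multiply each time sample of an A-scan by its gain factor.

    Assumption: the first sample has index n = 0.
    """
    return [gain(n) * v for n, v in enumerate(trace)]


# The cap is reached once 3**(0.01*n) >= 4000,
# i.e. for n >= ln(4000) / (0.01 * ln(3)).
n_cap = math.ceil(math.log(4000.0) / (0.01 * math.log(3.0)))

trace = [1.0] * 1000          # toy trace of constant amplitude
boosted = apply_gain(trace)   # equals the gain curve itself
```

With these parameters the gain saturates from sample index 755 onward, so later samples are all amplified by the same factor of 4000.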

Table 2. Electrical parameters of the main materials.

Materials	Relative Dielectric Constant/-	Conductivity/(mS·m⁻¹)	Relative Magnetic Permeability/-
Air	1	0	1
Limestone	5.5	0.004	1
Clay Particles	7.2	0.000001	1
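
The parameters in Table 2 give a quick estimate of the propagation velocity in the host rock and of the range covered by each time window in Table 1. The sketch below uses the standard low-loss approximation v = c / sqrt(ε_r μ_r) and ignores the small clay fraction; both simplifications are assumptions made here, and the function names are hypothetical.

```python
import math

C_M_PER_NS = 0.299792458  # speed of light in vacuum, m/ns


def wave_velocity(eps_r, mu_r=1.0):
    """Low-loss approximation v = c / sqrt(eps_r * mu_r), in m/ns."""
    return C_M_PER_NS / math.sqrt(eps_r * mu_r)


def max_range(time_window_ns, eps_r):
    """One-way distance reachable within a two-way time window, in m."""
    return wave_velocity(eps_r) * time_window_ns / 2.0


v_limestone = wave_velocity(5.5)  # limestone, eps_r = 5.5 (Table 2)
ranges = {t: max_range(t, 5.5) for t in (60, 80, 100)}  # time windows, Table 1
```

Under these assumptions the velocity in limestone is roughly 0.128 m/ns, so even the shortest 60 ns window corresponds to a one-way range of about 3.8 m, which exceeds the 3 m extent of the simulation model.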

Table 3. Training parameter configuration.

Input Image Size	Optimizer	Batch Size	Epochs	Pre-Training Weights
640 × 640	SGD (Initial learning rate = 0.01)	8	400	YOLOv11n.pt

Table 4. Comparison of model performance.

Model	mAP50(B) (↑)	mAP50–95(B) (↑)	mAP50(P) (↑)	mAP50–95(P) (↑)	GFLOPs (↓)
Ours	0.9686	0.7712	0.9951	0.9952	6.9
YOLOv8n-pose [33]	0.9496	0.7492	0.9742	0.9795	7.2
YOLOv10n-pose [34]	0.9007	0.7416	0.9685	0.9664	8.6
YOLOv11n-pose	0.9191	0.7590	0.9866	0.9823	6.7
YOLOv12n-pose [35]	0.9048	0.7462	0.9678	0.9671	6.2
YOLOv11n-SPPF_improve-BSAM [36]-LSCD_LQE [37]	0.9441	0.7447	0.9874	0.9783	8.5
Ours (LSCD)	0.9502	0.7640	0.9826	0.9817	7.1

Table 5. Target Azimuth Angle Prediction Errors of Different Methods.

	Maximum Amplitude Method	Intermediate-Value Method	BSS-Pose-BHR
Mean Error with Respect to the Actual Azimuth Angle	8.3°	7.2°	2.7°
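
Azimuth angles are periodic, so an error measure such as the one in Table 5 has to treat 355° and 5° as 10° apart rather than 350°. The paper does not spell out its error formula in this section; the sketch below shows one common choice, the mean absolute angular difference with wrap-around. The function names and the sample angles are hypothetical.

```python
def angular_error(pred_deg, true_deg):
    """Smallest absolute difference between two azimuths, in degrees (0 to 180)."""
    diff = abs(pred_deg - true_deg) % 360.0
    return min(diff, 360.0 - diff)


def mean_angular_error(preds, truths):
    """Mean wrap-around error over paired predicted and actual azimuths."""
    return sum(angular_error(p, t) for p, t in zip(preds, truths)) / len(preds)


# Hypothetical predictions and ground-truth azimuths, in degrees.
preds = [355.0, 92.0, 181.0]
truths = [5.0, 90.0, 180.0]
errors = [angular_error(p, t) for p, t in zip(preds, truths)]  # [10.0, 2.0, 1.0]
mean_err = mean_angular_error(preds, truths)
```

A plain absolute difference would report 350° for the first pair and inflate the mean, which matters for targets located near the 0°/360° boundary.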

Table 6. Ablation experiment results.

Model	YOLOv11n-Pose	BRA	SimAMWithSlicing	SCConv	mAP50(B) (↑)	mAP50(P) (↑)	Size/MB (↓)
1	✓				0.9191	0.9866	5.42
2	✓	✓			0.9377	0.9889	5.98
3	✓	✓	✓		0.9480	0.9934	5.98
4	✓	✓	✓	✓	0.9686	0.9951	6.47
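
The contribution of each module is easier to read as a step-by-step difference between consecutive rows of Table 6. The sketch below only restates the table values and subtracts them; the row labels and variable names are chosen here for illustration.

```python
# (configuration, mAP50(B), mAP50(P), model size in MB), copied from Table 6.
rows = [
    ("YOLOv11n-Pose", 0.9191, 0.9866, 5.42),
    ("+ BRA", 0.9377, 0.9889, 5.98),
    ("+ SimAMWithSlicing", 0.9480, 0.9934, 5.98),
    ("+ SCConv", 0.9686, 0.9951, 6.47),
]

# Incremental change contributed by each added module.
gains = [
    (cur[0],
     round(cur[1] - prev[1], 4),   # change in mAP50(B)
     round(cur[2] - prev[2], 4),   # change in mAP50(P)
     round(cur[3] - prev[3], 2))   # change in size, MB
    for prev, cur in zip(rows, rows[1:])
]

# Overall change of the full model relative to the baseline.
total_box_gain = round(rows[-1][1] - rows[0][1], 4)
total_size_gain = round(rows[-1][3] - rows[0][3], 2)
```

Read this way, BRA and SCConv each add about 0.02 to mAP50(B) at a cost of roughly 0.5 MB, whereas SimAMWithSlicing adds about 0.01 with no change in model size.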

Table 7. Average error between predicted and actual azimuth angles of the three methods under different degraded scenarios.

Degraded BHR Scenario	Maximum Amplitude Method (Average Error)	Intermediate Value Method (Average Error)	BSS-Pose-BHR (Average Error)
Missing Rate = 0.06	10.4°	8.8°	4.2°
Missing Rate = 0.35	21.2°	15.4°	8.5°
PSNR = 31 dB	10.7°	8.6°	5.2°
PSNR = 43 dB	7.5°	5.8°	2.4°


Share and Cite

MDPI and ACS Style

Tang, X.; Xu, M.; Yang, F.; Liu, J.; Peng, S.; Qiao, X. Research on an Automatic Detection Method for Response Keypoints of Three-Dimensional Targets in Directional Borehole Radar Profiles. Remote Sens. 2026, 18, 1102. https://doi.org/10.3390/rs18071102

