5.1. Establishment of the Experimental Environment
- (1) System Configuration
Figure 8 illustrates the implemented system configuration, with the core sensing device positioned on the left and the cable external breakage source localization system on the right. The primary experimental apparatus comprises the STM32F103C8T6 microcontroller (STMicroelectronics, Geneva, Switzerland), the MAX9814 acquisition module (Maxim Integrated Products, CA, USA), and the FT232RL serial communication module (Future Technology Devices International, Glasgow, UK), which transmits the captured acoustic data to the computer via the USART serial port.
The hardware design shown in the figure above forms the basis of the engineering deployment plan. To deploy the cable external breakdown sound source localization system in the field, a layered distributed architecture is adopted, structured as follows:
Perception Layer: Intelligent identification posts, each equipped with four MAX9814 microphones, are deployed every 50–80 m.
Edge Layer: An STM32F103C8T6 microcontroller.
Cloud Platform: A distributed data-processing pipeline based on IoT is employed to achieve multi-post data fusion.
To demonstrate the applicability of this approach, Table 2 lists the key engineering parameters of the system.
Figure 9 illustrates a schematic diagram of the cable engineering deployment in an urban area, showcasing the three-dimensional spatial distribution of marking piles within the cable corridor. The black curve represents the primary cable path; the green dots indicate the marking pile nodes (spaced at 80 m intervals); the blue hexagons denote the sensor arrays; and the red dashed area delineates the high-risk construction zone.
- (2) Sound Source Simulation and Test Environment
For the sound source simulation, a segment of boring head noise was recorded on-site, while interference noise was captured randomly within an urban setting.
A realistic test environment was constructed to emulate the urban underground cable context, incorporating potential obstacles and ambient noise sources. The selected test scenarios were developed to address three primary engineering challenges: environmental noise interference (addressed in Scenarios 1 and 3), multipath propagation effects (addressed in Scenarios 2 and 4), and complex spatial constraints (addressed in Scenario 5).
These scenarios systematically delineate the performance boundaries of the proposed method. Specifically, the open-area scenario (Scenario 1) establishes a baseline performance, the industrial noise scenario (Scenario 3) evaluates the dynamic weighting mechanism, the multi-obstacle scenario (Scenario 4) corroborates the adaptive environment compensation, and the fully enclosed scenario (Scenario 5) assesses three-dimensional spatial resolution. Detailed descriptions of the various scenarios are provided in Table 3.
5.2. System Performance Verification
In this section, we evaluate the performance of the proposed sound source localization system under various conditions, covering interference suppression capacity, the dynamic weighting mechanism, multi-obstacle scenarios, and localization accuracy, among others.
- (1) Interference Resilience Test
Experiments were performed in Scenario 3 to compare the conventional GCC algorithm with the improved GCC-PHAT algorithm. The time differences for each of the three sensor groups were recorded, and the data were subsequently transferred to a computer for analysis. The resulting graph, shown in Figure 10, depicts the conventional GCC algorithm in blue and the improved GCC-PHAT algorithm in green.
Figure 10 compares the delay estimation performance of the two algorithms in the industrial noise scenario. The root-mean-square error of the conventional GCC algorithm reaches 0.1051 s, whereas that of the improved GCC-PHAT algorithm is reduced to 0.0056 s, a 94.7% decrease. These results confirm the effectiveness of phase transform weighting for noise suppression, as the frequency-domain whitening process inherent in the PHAT method significantly enhances the signal-to-noise ratio of the delay peaks.
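The phase transform weighting discussed above can be sketched as follows. This is a minimal NumPy illustration of standard GCC-PHAT, not the implementation used in the experiments; the function name `gcc_phat`, the 16 kHz sampling rate, and the synthetic two-channel signal are assumptions introduced for the example.

```python
import numpy as np


def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay (in seconds) of `sig` relative to `ref` with GCC-PHAT."""
    n = len(sig) + len(ref)  # zero-pad so the correlation is linear, not circular
    cross = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
    cross /= np.maximum(np.abs(cross), 1e-12)  # PHAT: keep the phase only (whitening)
    cc = np.fft.irfft(cross, n=n)

    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))
    # Reorder so that index 0 corresponds to a lag of -max_shift samples.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / fs


# Synthetic check: channel B is channel A delayed by 25 samples, plus sensor noise.
fs = 16_000                      # assumed sampling rate in Hz
rng = np.random.default_rng(0)
source = rng.standard_normal(4096)
delay_samples = 25
mic_a = source + 0.1 * rng.standard_normal(4096)
mic_b = (np.concatenate((np.zeros(delay_samples), source[:-delay_samples]))
         + 0.1 * rng.standard_normal(4096))
tau = gcc_phat(mic_b, mic_a, fs)  # estimated delay of B relative to A, in seconds
```

Dividing the cross-spectrum by its magnitude discards amplitude information, so every frequency bin contributes equally to the correlation peak; this is the whitening step that the text credits for the sharper delay peaks.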
- (2) Dynamic Weighting Mechanism Test
Experiments were conducted in Scenario 1 to validate and quantify the impact of the dynamic weighting mechanism on optimizing the performance of the sound source localization system. Equation (17) indicates that the weighting factor is a function of the sound pressure level, which is, in turn, dependent on the distance between the sound source and the sensor. Accordingly, simulation experiments were performed to observe the variations in the weighting coefficients under different decibel conditions.
Figure 11 illustrates the trend of the weighting coefficient a_i relative to the signal’s decibel level.
Figure 11 demonstrates that the coefficient a_i exhibits a characteristic S-shaped curve at θ = 70 dB. Specifically, the weights converge rapidly to values above 0.9 when the sound pressure level (SPL) exceeds 80 dB, while they decline to below 0.1 when the SPL is less than 50 dB. Moreover, when the distance from the sound source exceeds 50 m (corresponding to an SPL below 50 dB), the weights stabilize at values below 0.2. These findings confirm the efficacy of Equation (17) in evaluating transducer reliability.
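The S-shaped behavior described above can be reproduced with a logistic function. The sketch below is an assumption about the general shape only: Equation (17) is not restated in this section, and the function name `reliability_weight` and the slope value of 0.25 per dB are hypothetical choices made so that the curve matches the thresholds quoted in the text (above 0.9 at 80 dB, below 0.1 at 50 dB, with θ = 70 dB).

```python
import math


def reliability_weight(spl_db, theta_db=70.0, slope=0.25):
    """Logistic weight in (0, 1) that rises with the sound pressure level.

    theta_db is the midpoint quoted in the text; slope (per dB) is a
    hypothetical value, not a parameter taken from Equation (17).
    """
    return 1.0 / (1.0 + math.exp(-slope * (spl_db - theta_db)))


# Weights for a few example sound pressure levels (dB).
weights = {spl: reliability_weight(spl) for spl in (40, 50, 60, 70, 80, 90)}
```

With these settings the weight equals 0.5 at θ, so sensors receiving a signal near the threshold contribute half as much as sensors close to the source.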
- (3) Multi-Obstacle Scene Comparative Analysis
Experiments were performed in Scenario 2 with the sound source fixed at coordinates (25, 11). The positions of the obstacles were varied across ten tests to evaluate the performance enhancement of the proposed algorithm relative to traditional TDOA and generalized cross-correlation methods. The results of these tests are presented in the table below.
Table 4 illustrates the performance disparities among the algorithms in complex multipath environments. The traditional time difference of arrival (TDOA) method is notably compromised by secondary reflections, yielding an average error of 5.2 m and a maximum single error of 7.8 m. This outcome suggests that a purely geometric model is insufficient for adapting to underground environments with obstacles. In contrast, the proposed approach, which incorporates an adaptive environment compensation algorithm, reduces the average error to 1.8 m (a 65.4% improvement) and decreases the error’s standard deviation to 0.6 m (a 66.7% reduction). These improvements validate the effectiveness of the sparse constraint described in Equation (18), whereby the algorithm automatically mitigates 83% of non-line-of-sight (NLOS) interference. It is noteworthy, however, that in three experiments the error still exceeded 2.5 m, a discrepancy primarily attributed to multiple reflections (three or more) induced by metal pipes in the test settings. This finding underscores the necessity of enhancing the modeling of higher-order reflection paths in future research.
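As a check on the reported improvement, the relative error reduction follows directly from the two average errors quoted above. The symbols e_TDOA and e_prop are introduced here only for illustration:

```latex
\[
\frac{e_{\mathrm{TDOA}} - e_{\mathrm{prop}}}{e_{\mathrm{TDOA}}}
  = \frac{5.2\ \mathrm{m} - 1.8\ \mathrm{m}}{5.2\ \mathrm{m}}
  \approx 0.654 = 65.4\%
\]
```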
- (4) Adaptive Environment Compensation Algorithm Testing
To comprehensively evaluate the performance of the localization algorithm across various typical scenarios, experiments were conducted in five distinct settings, with the median error recorded for each. The results have been visualized in a box-and-whisker diagram that depicts the localization error distribution across the different scenes.
Figure 12 illustrates that the localization error of the proposed algorithm is reduced relative to that of traditional methods. In a multi-obstacle environment, the error decreases from 5.13 m to 1.89 m—a reduction of 63.2%—demonstrating that the adaptive environment compensation algorithm effectively suppresses multipath interference.
- (5) Positioning Accuracy Test
Experiments conducted in Scenario 2 validate the accuracy of the proposed sound source localization method. The sound source was repositioned repeatedly, and its coordinates were estimated at each position.
Table 5 summarizes the obtained sound source positions at various distances along with the associated time delays between the sound sensors.
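To illustrate how a position follows from a set of inter-sensor time delays such as those in Table 5, the sketch below solves the TDOA equations with a damped Gauss-Newton (Levenberg-Marquardt style) iteration. The sensor layout, the speed of sound (343 m/s), the starting point, and all function names are assumptions made for this example and are not taken from the experimental setup; the delays are synthetic and noise-free.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 °C


def tdoa_residuals(pos, sensors, delays):
    """Predicted minus measured delays, each relative to sensor 0 (seconds)."""
    dists = np.linalg.norm(sensors - pos, axis=1)
    return (dists[1:] - dists[0]) / SPEED_OF_SOUND - delays


def tdoa_jacobian(pos, sensors):
    """Derivative of the predicted delays with respect to the source position."""
    diffs = pos - sensors
    units = diffs / np.linalg.norm(diffs, axis=1, keepdims=True)
    return (units[1:] - units[0]) / SPEED_OF_SOUND


def locate_source(sensors, delays, start, iterations=50, damping=1e-3):
    """Least-squares source position from delays measured relative to sensor 0."""
    pos = np.asarray(start, dtype=float)
    cost = np.sum(tdoa_residuals(pos, sensors, delays) ** 2)
    for _ in range(iterations):
        r = tdoa_residuals(pos, sensors, delays)
        J = tdoa_jacobian(pos, sensors)
        H = J.T @ J
        step = np.linalg.solve(H + damping * np.diag(np.diag(H)), -J.T @ r)
        trial_cost = np.sum(tdoa_residuals(pos + step, sensors, delays) ** 2)
        if trial_cost < cost:   # accept the step and relax the damping
            pos, cost, damping = pos + step, trial_cost, damping * 0.5
        else:                   # reject the step and damp more strongly
            damping *= 10.0
        if np.linalg.norm(step) < 1e-9:
            break
    return pos


# Hypothetical layout: four sensors at the corners of a 50 m x 30 m rectangle.
sensors = np.array([[0.0, 0.0], [50.0, 0.0], [50.0, 30.0], [0.0, 30.0]])
true_source = np.array([25.0, 11.0])
dists = np.linalg.norm(sensors - true_source, axis=1)
delays = (dists[1:] - dists[0]) / SPEED_OF_SOUND  # synthetic delays
estimate = locate_source(sensors, delays, start=sensors.mean(axis=0))
```

Scaling the damping term by the diagonal of JᵀJ keeps the iteration insensitive to the small magnitude of delays expressed in seconds.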
The measured data in Table 5 corroborate the system’s distance-accuracy characteristics. In the near field (<50 m), the localization error is ≤0.7 m, meeting the theoretical expectation of the dynamic weighting design in Equation (15). In the mid- and far field (50–100 m), the error gradually increases to between 1.1 and 2 m, primarily due to air absorption (i.e., the high-frequency attenuation characterized by α(f) in Equation (2)). Notably, a y-axis deviation of 1.1 m at the (100, 0) test point is attributed to the non-ideal symmetry of the sensor array (a mounting error of ±2°). Compared with traditional methods, which yield an error of 4.2 m under similar conditions as reported in [12], the localization accuracy of this system at 80 m is improved by 58%, which is ascribed to the multi-sensor fusion strategy. Specifically, applying the DBSCAN clustering algorithm to the results from the four marker piles reduces the standard deviation of the localization results from 1.8 m to 0.4 m.
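The fusion step can be illustrated with a compact DBSCAN written in plain NumPy. This is a sketch of the general technique rather than the system's implementation: the position estimates are invented, the helper names `dbscan` and `fuse_estimates` are hypothetical, MinPts is assumed to count the point itself, and averaging the largest cluster is one plausible way to turn cluster membership into a single fused position.

```python
import numpy as np


def dbscan(points, eps, min_pts):
    """Label each point with a cluster index; -1 marks noise."""
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    neighbors = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    labels = np.full(n, -1)
    visited = np.zeros(n, dtype=bool)
    cluster = 0
    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        if len(neighbors[i]) < min_pts:
            continue  # not a core point; stays noise unless a cluster reaches it
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster          # core or border point joins the cluster
            if not visited[j]:
                visited[j] = True
                if len(neighbors[j]) >= min_pts:
                    queue.extend(neighbors[j])  # expand only through core points
        cluster += 1
    return labels


def fuse_estimates(points, eps=1.5, min_pts=4):
    """Average the largest cluster; fall back to the plain mean if all is noise."""
    labels = dbscan(points, eps, min_pts)
    kept = labels[labels >= 0]
    if kept.size == 0:
        return points.mean(axis=0)
    largest = np.bincount(kept).argmax()
    return points[labels == largest].mean(axis=0)


# Invented per-post estimates (m): five consistent results and one outlier.
estimates = np.array([[24.8, 11.1], [25.3, 10.7], [25.1, 11.4],
                      [24.6, 10.9], [25.2, 11.0], [31.0, 15.5]])
labels = dbscan(estimates, eps=1.5, min_pts=4)
fused = fuse_estimates(estimates)
```

Discarding the points labeled as noise before averaging is what narrows the spread of the fused result, since a single multipath-corrupted estimate no longer pulls the mean away from the consistent group.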
- (6) Anti-Reverberation Performance Test
To evaluate the robustness of the proposed algorithm in complex reverberant environments, experiments were conducted in Scenario 5 with the introduction of a moderate number of obstacles. This setting simulates an underground concrete environment characterized by strong reverberation. Two comparative experimental conditions were established: one without environmental compensation, in which the adaptive filtering module was deactivated and the original GCC-PHAT delay estimation was applied directly, and one with environmental compensation, where the adaptive filtering module was enabled.
Table 6 summarizes the statistical outcomes derived from 20 independent experiments.
The experimental data indicate that in the absence of compensation, the multipath effect results in an average localization error of 3.8 m with a standard deviation of 1.5 m. With the implementation of adaptive compensation, the average error decreases to 1.5 m—a 60.5% improvement—while the standard deviation is reduced to 0.4 m.
Figure 13 further illustrates the changes in the residual distribution of the delay estimation before and after compensation.
The compensated residuals are predominantly distributed within the interval (−0.15 ms, 0.15 ms), which contains 92% of the data, whereas the uncompensated residuals display a bimodal distribution, with a primary peak at 0.12 ms and a secondary peak at 0.82 ms. This finding substantiates the effective suppression of multipath interference. Consequently, the experimental results confirm that the proposed adaptive compensation algorithm maintains high localization accuracy even in environments with strong reverberation, thereby meeting the requirements of underground cable protection scenarios.
- (7) Regularization Coefficient Optimization and Performance Validation
Simulation experiments were conducted in Scenario 4 to determine the optimal combinations of the step factor and regularization coefficients. In these experiments, the sound source was positioned at (6, 3). The experimental design and results are detailed below (Table 7).
Based on the provided data, setting parameter A to 0.05 yields an optimal trade-off between localization accuracy (RMSE = 0.32 m) and real-time performance (18 ms/frame).
- (8) Comparative Experimental Results Analysis
To comprehensively evaluate the performance of the proposed method, comparative experiments were conducted with mainstream sound source localization techniques in Scenario 4, which represents a multi-obstacle environment. The results of these experiments are presented in Table 8.
Based on the experimental results, the proposed method reduces the average localization error by 14.3% compared with CNN-TDOA, while its computational time is only 10.5% of that required by SRP-PHAT. These improvements are achieved through the enhanced FFT acceleration of GCC-PHAT and the sparsification of the Jacobian matrix in the LM algorithm. Overall, the method presented in this paper demonstrates significant advantages in both localization accuracy and computational efficiency relative to existing approaches. To visualize the results, Figure 14 presents box plots showing the distribution of localization errors for the three algorithms across 200 test samples.
As shown in Figure 14, our method exhibits a median error of 1.8 m, which is lower than that of the other algorithms. The SRP-PHAT method demonstrates a right-skewed distribution with a median error of 3.9 m, while the CNN-TDOA method shows a bimodal distribution, reflecting the algorithm’s sensitivity to different scenarios. Moreover, our method maintains an error of 2.3 m at the 75th percentile, which is still lower than the median values of the other algorithms, thereby verifying the effectiveness of the dynamic weighting mechanism.
- (9) Optimization Test of Spatial Clustering Parameters
Because DBSCAN is a density-based clustering method, this study uses the Silhouette Coefficient (SC), denoted S, as the evaluation index for clustering performance. Simulation experiments were conducted to assess this index under different DBSCAN parameter configurations.
Let a(i) denote the average distance from sample i to the other samples in its own cluster, and let b(i) represent the average distance from sample i to the samples in the nearest other cluster.
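With these two quantities, the Silhouette Coefficient takes its standard form, written out here as an assumption about the expression intended at this point. The per-sample value s(i) and the sample count N are notation introduced for this sketch:

```latex
\[
s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}, \qquad
S = \frac{1}{N} \sum_{i=1}^{N} s(i)
\]
```

S lies between −1 and 1; values closer to 1 indicate that samples sit much nearer to their own cluster than to the neighboring one.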
Table 9 summarizes the clustering performance across various parameter combinations.
Based on the data presented, the highest Silhouette Coefficient of 0.71 is achieved with ε = 1.5 m and MinPts = 4, representing a 14.5% improvement over the baseline parameter setting (ε = 1.0 m, MinPts = 4).