1. Introduction
In the field of underwater acoustic signal processing, the demand for advanced passive sonar methods for target detection and tracking has increased with the development of military underwater unmanned vehicles (UUVs). Traditional sonar systems, which rely on manual judgment and target selection, are gradually being replaced by autonomous detection systems that integrate detection, tracking, and identification processes. The intelligence level of these systems is crucial for the operational capabilities of UUVs, highlighting the need for robust and accurate tracking algorithms.
The traditional detect-before-track (DBT) principle has been the cornerstone of sonar tracking, where the thresholding of raw data precedes the correlation of measurements across frames to establish the target trajectories. However, this approach is susceptible to detection errors, particularly in low signal-to-noise ratio environments and when dealing with crossing targets, leading to tracking losses and association errors. To overcome these challenges, particle filter (PF)-based track-before-detect (TBD) methods have emerged, showing significant advantages in avoiding association challenges. The PF-TBD method calculates the posterior density distribution using the energy accumulation of multiple pings along the particle trajectories, thereby circumventing the association problem between measurements. Therefore, this method is less sensitive to missing measurements but relies on trajectory continuity. However, when a weak target crosses paths with a strong one, it can be submerged by the strong interference for an extended period, leading to discontinuities in the tracking results.
Over the past few decades, research on TBD algorithms has made significant progress. Early work focused on the elucidation and development of principles, such as adaptive nonlinear filtering proposed by Bar-Shalom and Jaffer [1] and tracking methods in complex environments by Bar-Shalom and Tse [2]. Fortmann et al. [3] further developed the joint probabilistic data association method for multi-target tracking. Streit and Luginbuhl [4] proposed a probabilistic multi-hypothesis tracking theory, laying the foundation for subsequent research. Morelande et al. [5] explored multi-target detection and tracking from a Bayesian perspective, whereas Stone et al. [6] systematically summarized the theory and methods of multi-target tracking within the Bayesian framework. Arulampalam et al. [7] provided a tutorial on the application of particle filters in nonlinear/non-Gaussian Bayesian tracking, offering theoretical support for the application of particle filters in TBD.
As the research has deepened, scholars have begun to explore how to apply TBD algorithms to practical problems. Xu et al. [8] proposed a TBD method based on particle filters for towed passive array sonar systems. Orlando et al. [9] studied TBD algorithms for bistatic sonars. Wei et al. [10] and C. Jing et al. [11], respectively, proposed TBD methods based on particle filters for passive array sonar systems, which performed well in target detection and tracking in complex environments. A.-A. Saucan [12], A. Lepoutre [13], and Liang [14] focused on improving the robustness of TBD algorithms in specific applications. Northardt et al. [15,16] examined the performance of TBD algorithms in complex passive sonar scenarios and derived a Cramér-Rao Lower Bound, providing theoretical insights into the limitations and potential of these algorithms. The studies in [17,18,19,20,21,22] were dedicated to improving the performance of TBD algorithms when dealing with multiple targets, maneuvering targets, and weak targets with low signal-to-noise ratios.
Recent work on target detection and tracking has advanced considerably, with innovations such as deep learning-enhanced TBD frameworks [23], space–time adaptive algorithms [24,25], and multi-frame detection methods [26,27]. Xu et al. investigated radar transceiver design for extended targets and robust anti-detection jamming signals [28,29,30], while Zhu et al. focused on improving underwater target detection through azimuth trajectory enhancement, sub-band peak-based broadband detection, delay-Doppler map shaping for high-speed targets, and refined Doppler resolution for Golay waveforms [31,32,33,34,35,36]. These contributions offer key insights that directly inform our proposed sub-band adaptive weighting TBD algorithm.
Building upon these studies, we propose a TBD algorithm that integrates particle states with frequency-band features by combining frequency-band adaptive weighting with particle filter tracking. Specifically, dynamic sub-band weighting is applied during the tracking process to enhance the tracking performance during target crossover events. The method is validated using real-world sea-trial data and compared with existing algorithms.
2. Particle Filter Track-Before-Detect
This section outlines the proposed TBD algorithm, detailing the motion model, particle filter principles, and the novel energy likelihood function that underpins our approach.
2.1. Motion Model
In the Cartesian coordinate system, we assume that the target moves at a uniform speed over a short period, and the constant velocity (CV) model is used as the basis for the state transition of the target.
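Let $\mathbf{x}_k = [x_k, y_k, \dot{x}_k, \dot{y}_k]^{\mathrm{T}}$ be the state of the target at time $k$, where $(x_k, y_k)$ is the position of the target at time $k$, and $(\dot{x}_k, \dot{y}_k)$ are the velocity components of the target along the two coordinate axes at time $k$. The state equation of the target can be expressed as

$$\mathbf{x}_k = \mathbf{F}\,\mathbf{x}_{k-1} + \mathbf{w}_{k-1} \tag{1}$$

where $\mathbf{w}_{k-1}$ is the system process noise in the Gaussian form;

$$\mathbf{F} = \begin{bmatrix} 1 & 0 & T & 0 \\ 0 & 1 & 0 & T \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{2}$$

is the state transition matrix; and $T$ represents the time step. In the passive alert process, the observation of the system is the azimuth, and the observation equation is

$$z_k = \theta_k + v_k, \qquad \theta_k = \arctan\frac{x_k - x_k^{o}}{y_k - y_k^{o}} \tag{3}$$

where $v_k$ is the measurement noise, $(x_k^{o}, y_k^{o})$ is the position of the observer at time $k$, and $\theta_k$ is the azimuth of the target relative to the observer.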
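As an illustration of the constant velocity transition and the azimuth observation described above, the following Python sketch propagates one state and computes its bearing. The function names, the velocity-only noise model, and the convention that azimuth is measured clockwise from the +y axis are assumptions made for this sketch; they are not taken from the original implementation.

```python
import math
import random


def cv_predict(state, T, sigma_v=0.0, rng=random):
    """Advance a state [x, y, vx, vy] by one step of length T (constant velocity)."""
    x, y, vx, vy = state
    # Assumed noise model: independent zero-mean Gaussian perturbation of each velocity.
    nvx = rng.gauss(0.0, sigma_v) if sigma_v > 0.0 else 0.0
    nvy = rng.gauss(0.0, sigma_v) if sigma_v > 0.0 else 0.0
    return [x + vx * T, y + vy * T, vx + nvx, vy + nvy]


def bearing_deg(state, observer):
    """Azimuth of the target seen from the observer, in degrees within [0, 360).

    Assumed convention: measured clockwise from the +y axis.
    """
    dx = state[0] - observer[0]
    dy = state[1] - observer[1]
    return math.degrees(math.atan2(dx, dy)) % 360.0
```

With `sigma_v=0.0` the prediction is deterministic, which is convenient for checking the transition before process noise is added.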
2.2. The Principle of the Particle Filter
Particle filters are Monte Carlo implementations of Bayesian filtering that do not rely on linearization or Gaussian assumptions. They use Monte Carlo sampling to avoid the integral in the analytical solution of the posterior probability, replacing the integral with an average over the particles to obtain an approximate solution.
Filtering within a Bayesian framework is mainly divided into predictions and updates.
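Prediction:

$$p(\mathbf{x}_k \mid \mathbf{z}_{1:k-1}) = \int p(\mathbf{x}_k \mid \mathbf{x}_{k-1})\, p(\mathbf{x}_{k-1} \mid \mathbf{z}_{1:k-1})\, \mathrm{d}\mathbf{x}_{k-1} \tag{4}$$

where $p(\mathbf{x}_k \mid \mathbf{x}_{k-1})$ is determined by the state equation of the system, and its probability distribution depends on the process noise distribution of the system; $\mathbf{x}_k$ represents the state of the system at time $k$; and $\mathbf{z}_{1:k-1}$ represents the measurements from time 1 to $k-1$.

Update:

$$p(\mathbf{x}_k \mid \mathbf{z}_{1:k}) = \frac{p(\mathbf{z}_k \mid \mathbf{x}_k)\, p(\mathbf{x}_k \mid \mathbf{z}_{1:k-1})}{p(\mathbf{z}_k \mid \mathbf{z}_{1:k-1})} \tag{5}$$

where $p(\mathbf{z}_k \mid \mathbf{z}_{1:k-1}) = \int p(\mathbf{z}_k \mid \mathbf{x}_k)\, p(\mathbf{x}_k \mid \mathbf{z}_{1:k-1})\, \mathrm{d}\mathbf{x}_k$ is the normalization constant, and $p(\mathbf{z}_k \mid \mathbf{x}_k)$ is called the likelihood function, which is determined by the measurement equation.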
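The estimate based on the minimum mean square error (MMSE) is

$$E\left[f(\mathbf{x}_k) \mid \mathbf{z}_{1:k}\right] = \int f(\mathbf{x}_k)\, p(\mathbf{x}_k \mid \mathbf{z}_{1:k})\, \mathrm{d}\mathbf{x}_k \tag{6}$$

where $f(\mathbf{x}_k)$ is a function of the state variables. Because an integral operation exists in the process of solving the posterior probability $p(\mathbf{x}_k \mid \mathbf{z}_{1:k})$, it is difficult to obtain an analytical solution for a nonlinear, non-Gaussian system. Therefore, Monte Carlo sampling is used to obtain samples from the posterior probability, and the posterior probability can then be estimated by the sample expectation. When the particles are all equally probable, the posterior is approximated by

$$\hat{p}(\mathbf{x}_k \mid \mathbf{z}_{1:k}) = \frac{1}{N}\sum_{n=1}^{N} \delta\!\left(\mathbf{x}_k - \mathbf{x}_k^{(n)}\right) \tag{7}$$

The ^ symbol denotes an estimate, so $\hat{p}(\mathbf{x}_k \mid \mathbf{z}_{1:k})$ represents the estimated posterior probability density function. Substituting Equation (7) into Equation (6) yields the state variable estimate as shown in Equation (8).

$$\hat{E}\left[f(\mathbf{x}_k) \mid \mathbf{z}_{1:k}\right] = \frac{1}{N}\sum_{n=1}^{N} f\!\left(\mathbf{x}_k^{(n)}\right) \tag{8}$$

where $N$ is the number of particles; superscript $(n)$ represents the $n$-th particle; and $\delta(\cdot)$ is the Dirac function.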
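The particle weights of the standard particle filter are updated iteratively using importance sampling. The importance probability density function $q(\mathbf{x}_k \mid \mathbf{x}_{k-1}^{(n)}, \mathbf{z}_k)$ can generally be taken as the probability density function $p(\mathbf{x}_k \mid \mathbf{x}_{k-1}^{(n)})$ of the state transition; then, the weight can be calculated iteratively using the following formula:

$$w_k^{(n)} \propto w_{k-1}^{(n)}\, \frac{p(\mathbf{z}_k \mid \mathbf{x}_k^{(n)})\, p(\mathbf{x}_k^{(n)} \mid \mathbf{x}_{k-1}^{(n)})}{q(\mathbf{x}_k^{(n)} \mid \mathbf{x}_{k-1}^{(n)}, \mathbf{z}_k)} = w_{k-1}^{(n)}\, p(\mathbf{z}_k \mid \mathbf{x}_k^{(n)}) \tag{9}$$

The state can then be estimated after weight normalization and resampling.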
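The weight recursion and resampling step can be sketched in a few lines of Python. This is a minimal sketch assuming the state transition density is used as the importance density, so that each weight is simply multiplied by the particle likelihood; the function names are illustrative, and systematic resampling is shown because it is the scheme named in the experimental section.

```python
import random


def update_weights(prev_weights, likelihoods):
    """Multiply each previous weight by its likelihood and normalize."""
    raw = [w * l for w, l in zip(prev_weights, likelihoods)]
    total = sum(raw)
    if total <= 0.0:
        # Degenerate case: fall back to uniform weights.
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]


def effective_sample_size(weights):
    """Effective number of particles for normalized weights."""
    return 1.0 / sum(w * w for w in weights)


def systematic_resample(weights, rng=random):
    """Return particle indices drawn by systematic resampling."""
    n = len(weights)
    u0 = rng.random() / n          # single random offset in [0, 1/n)
    indices = []
    cumulative = weights[0]
    i = 0
    for m in range(n):
        u = u0 + m / n             # evenly spaced pointers
        while u > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        indices.append(i)
    return indices
```

A common design choice, assumed here rather than taken from the paper, is to resample only when `effective_sample_size(weights)` falls below a fraction of the particle count, which limits the loss of particle diversity.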
2.3. Energy Likelihood Function
The energy likelihood function, a pivotal component of our particle filter approach, operates by assessing the probability that a particle’s trajectory matches the observed signal energy over time. This is akin to piecing together a puzzle in which each particle’s path offers a potential piece, and the accumulated energy signals indicate the fit.
The flexibility of a particle filter is that it does not require bearing measurements from each frame to update the state estimate. The only requirement is the likelihood function, which can be defined freely for a specific application scenario. In passive tracking, the measured state can be included in the energy trajectory of the azimuth history. For convenience, the azimuth of the $n$-th particle as seen by the observer at time $k$ is

$$\theta_k^{(n)} = \arctan\frac{x_k^{(n)} - x_k^{o}}{y_k^{(n)} - y_k^{o}},$$

just like Equation (3). Here, $(x_k^{o}, y_k^{o})$ is the known position of the observer at time $k$, and $(x_k^{(n)}, y_k^{(n)})$ is the position state of particle $n$. At time $k$, the corresponding beam index is

$$b_k^{(n)} = \left\lfloor \theta_k^{(n)} / \Delta\theta \right\rfloor$$

where $\lfloor \cdot \rfloor$ is the truncation operator for downward rounding; $\Delta\theta$ is the division width of each beam; and the azimuth history of the $n$-th particle in the past $M$ scans is defined as an $M$-dimensional vector:

$$\mathbf{b}_k^{(n)} = \left[ b_{k-M+1}^{(n)}, \ldots, b_{k-1}^{(n)}, b_k^{(n)} \right]$$

As shown in Figure 1, after beamforming, the original array element data form the Bearing Time Record (BTR), which is divided into a grid according to beam and moment, and the energy is represented by the matrix $\mathbf{E}$. $E_k(b)$ is an element of the matrix that represents the energy value at moment $k$ in the beam $b$. Let $e_k^{(n)} = E_k(b_k^{(n)})$ denote the energy value corresponding to the azimuth at which particle $n$ is located at moment $k$. Then the energy intensity sequence of particle $n$ within the time window $k-M+1$ to $k$ is an $M$-dimensional vector:

$$\mathbf{e}_k^{(n)} = \left[ e_{k-M+1}^{(n)}, \ldots, e_{k-1}^{(n)}, e_k^{(n)} \right]$$

The energy sum of the $M$ scanned beams is used to define the likelihood function of the $n$-th particle at moment $k$. It represents the likelihood of the particle passing through the path in $\mathbf{b}_k^{(n)}$ during the time window, which is denoted by $L_k^{(n)}$:

$$L_k^{(n)} = \sum_{m=k-M+1}^{k} e_m^{(n)}$$

where $m$ is the index variable running from the earliest moment in the window, $k-M+1$, to the current moment $k$.
The integration over scans creates trajectories for the particles according to the motion model and accumulates energy through this trajectory. High likelihood values at long integration times indicate that the path is more consistent with the motion model, and vice versa.
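The accumulation of energy along a particle trajectory can be sketched as follows. This is a minimal illustration in which the BTR is assumed to be stored as a list of scans, each scan a list of beam energies; `beam_index`, `energy_likelihood`, and the data layout are hypothetical names chosen for this sketch.

```python
import math


def beam_index(bearing_deg, beam_width_deg):
    """Map a bearing to a beam index by rounding down."""
    return int(math.floor(bearing_deg / beam_width_deg))


def energy_likelihood(btr, beam_history, k):
    """Sum the BTR energy along a particle's beam history ending at scan k.

    btr[m][b] is the energy of beam b at scan m; beam_history lists the
    beam occupied by the particle at each scan of the window, oldest first.
    """
    window = len(beam_history)
    start = k - window + 1
    if start < 0:
        raise ValueError("window extends before the first scan")
    total = 0.0
    for offset, beam in enumerate(beam_history):
        total += btr[start + offset][beam]
    return total
```

A particle whose predicted path follows a ridge of energy in the BTR accumulates a large sum, whereas a path that leaves the ridge for part of the window accumulates less.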
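The posterior probability of the state can be approximated by the weighted particles:

$$p(\mathbf{x}_k \mid \mathbf{z}_{1:k}) \approx \sum_{n=1}^{N} w_k^{(n)}\, \delta\!\left(\mathbf{x}_k - \mathbf{x}_k^{(n)}\right), \qquad w_k^{(n)} = \frac{w_{k-1}^{(n)}\, L_k^{(n)}}{\sum_{j=1}^{N} w_{k-1}^{(j)}\, L_k^{(j)}}$$

where the denominator accumulates the values of all $N$ particles on their respective trajectories in the time window before moment $k$, and $w_k^{(n)}$ is calculated from the previous weight $w_{k-1}^{(n)}$ with the likelihood $L_k^{(n)}$ of the $n$-th particle.

The maximum or mean of the posterior probability density is often selected to obtain an estimate of the state at moment $k$. In this method, the mean value is calculated using the following equation:

$$\hat{\mathbf{x}}_k = \sum_{n=1}^{N} w_k^{(n)}\, \mathbf{x}_k^{(n)}$$

An approximate state mean estimate is obtained from the above equation. In addition, the azimuth tracking output for moment $k$ can be obtained by applying Equation (12) to the state estimate.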
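The mean estimate amounts to a weighted average of the particle states. The short Python sketch below assumes each particle is a list of state components and that the weights need not be normalized in advance; the function name is illustrative.

```python
def weighted_mean_state(particles, weights):
    """Weighted average of particle states, component by component."""
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("weights must have a positive sum")
    dim = len(particles[0])
    return [
        sum(w * p[d] for p, w in zip(particles, weights)) / total
        for d in range(dim)
    ]
```

Averaging is done on the Cartesian state rather than on bearings, which avoids wrap-around problems when particle azimuths straddle 0° and 360°.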
3. Target-Based Band Adaptive Selection
3.1. Target Beam Enhancement
Broadband beamforming converts the received array element data into the frequency domain and then performs narrowband direction-of-arrival (DOA) estimation in several sub-bands to obtain the spatial spectrum of each narrowband. At each moment, its output is a function of the frequency and azimuth, hereafter referred to as frequency–azimuth (FRAZ) data. The FRAZ is divided into K bands and H beams, as shown in Figure 2. Here, the output of the $h$-th beam in the frequency band with center frequency $f_j$ is denoted by $P(f_j, \theta_h)$, with $j = 1, \ldots, K$ and $h = 1, \ldots, H$.
To enhance the target beam, the algorithm selects and acquires narrowband sub-bands based on the spectral characteristics of the signal. Specifically, the selection process involves analyzing the frequency content of the received signal to identify sub-bands that contain significant energy contributions from the target of interest.
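Conventional energy detection directly sums the energy of the two-dimensional FRAZ matrix in the frequency dimension and converts it into a one-dimensional broadband spatial spectrum. The result of the $h$-th beam in the spatial spectrum, $P(\theta_h)$, is given by the following:

$$P(\theta_h) = \sum_{j=1}^{K} P(f_j, \theta_h)$$

where $P(f_j, \theta_h)$ is the output of the $h$-th beam in the $j$-th of the K frequency bands.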
Owing to the influence of ocean background noise and platform self-noise, the noise power is larger at low frequencies and decreases with increasing frequency. To reduce the effect of the low-frequency background on high-frequency signals, the background noise needs to be “flattened”. The background noise is first estimated, and the estimated background noise is then subtracted from the beam output to obtain a power spectrum that is closer to the signal itself. Here, the filtered median of all beams in a certain band is used as the noise background $N(f_j)$ for the current band $j$.
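The flattening step and the conventional broadband sum can be illustrated with a short Python sketch. It assumes the FRAZ data are stored as `fraz[j][h]` (band `j`, beam `h`) and uses a plain median over beams as the background estimate; the paper's "filtered median" may involve additional smoothing that is not reproduced here, and the function names are hypothetical.

```python
from statistics import median


def flatten_background(fraz):
    """Subtract a per-band background estimate (median over beams) from each band."""
    flattened = []
    for band in fraz:
        noise = median(band)  # assumed background estimate for this band
        flattened.append([p - noise for p in band])
    return flattened


def broadband_spectrum(fraz):
    """Sum the FRAZ matrix over frequency to obtain one value per beam."""
    n_beams = len(fraz[0])
    return [sum(band[h] for band in fraz) for h in range(n_beams)]
```

Because a target occupies only a few beams, the median over all beams in a band is dominated by noise-only beams, so the subtraction removes most of the frequency-dependent background while leaving target peaks in place.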
3.2. Adaptive Sub-Band Selection and Weighting
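Target-based adaptive sub-band selection and weighting consists of a selection process and a weighting process. Band selection eliminates the bands in which the target has a low output signal-to-noise ratio and keeps only the “dominant” bands with a high output signal-to-noise ratio, while weighting matches the frequency features of the target and enhances the corresponding target. The background estimate of each band is subtracted from the output of that band. The bands that account for the major energy contribution in each beam are selected, and their energy is “retained”, while the bands whose output is less than zero or whose contribution is minimal are “rejected”; that is, their output energy is set to zero. After subtracting the estimated noise from each band, the new FRAZ matrix is

$$\tilde{P}(f_j, \theta_h) = P(f_j, \theta_h) - N(f_j)$$

where $P(f_j, \theta_h)$ is the output of beam $h$ in band $j$, $N(f_j)$ is the background estimate of band $j$, $j = 1, \ldots, K$, and $h = 1, \ldots, H$. For each column (the same beam), the data are sorted according to the energy magnitude to obtain $\{\tilde{P}(f_{s(1)}, \theta_h), \ldots, \tilde{P}(f_{s(K)}, \theta_h)\}$, which gives

$$\tilde{P}(f_{s(1)}, \theta_h) \le \tilde{P}(f_{s(2)}, \theta_h) \le \cdots \le \tilde{P}(f_{s(K)}, \theta_h),$$

and if $i > 1$, then

$$s(i) = \arg\min_{j \in \{1, \ldots, K\} \setminus \{s(1), \ldots, s(i-1)\}} \tilde{P}(f_j, \theta_h)$$

where $\setminus$ denotes subtraction between sets. $\tilde{P}(f_{s(i)}, \theta_h)$ is the $i$-th value of the series in ascending order. For example, suppose the series is equal to {1, 3, 2, 8, 6}; then the sorted series is equal to {1, 2, 3, 6, 8}, with s(1) = 1, s(2) = 3, s(3) = 2, s(4) = 5, and s(5) = 4.

Define $k_\eta$ as the largest index that satisfies

$$\sum_{i=k_\eta}^{K} \tilde{P}^{+}(f_{s(i)}, \theta_h) \ge \eta \sum_{i=1}^{K} \tilde{P}^{+}(f_{s(i)}, \theta_h)$$

where $\tilde{P}^{+} = \max(\tilde{P}, 0)$ and $\eta$ denotes the proportion of the total energy of the current beam that is carried by the retained bands. $\eta$ was taken as 0.8, which means that the bands contributing 80% of the energy, taken from large to small, were retained.

Based on the energy distribution of the tracking beam $\theta_t$ where a target is located, the target adaptive (TA) weights can be obtained as

$$w_j = \begin{cases} \tilde{P}^{+}(f_j, \theta_t), & j \in \{s(k_\eta), \ldots, s(K)\} \\ 0, & \text{otherwise} \end{cases}$$

After normalization, there is

$$\bar{w}_j = \frac{w_j}{\sum_{j'=1}^{K} w_{j'} + \varepsilon}$$

where $\varepsilon$ is an extremely small value to avoid a zero denominator. The target adaptive selection weighted output for beam $h$ is expressed as

$$P_{\mathrm{TA}}(\theta_h) = \sum_{j=1}^{K} \bar{w}_j\, \tilde{P}(f_j, \theta_h)$$

The algorithm flowchart is shown on the right side of Figure 3.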
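The selection and weighting steps can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the input is the background-subtracted energy of the tracked beam in each band, negative values are clipped to zero, bands are retained from large to small until the retained share of energy reaches `retain`, and the function names are hypothetical.

```python
def select_dominant_bands(beam_column, retain=0.8):
    """Indices of the largest bands whose summed energy reaches `retain` of the total."""
    positive = [max(e, 0.0) for e in beam_column]   # reject bands below the background
    total = sum(positive)
    if total <= 0.0:
        return []
    order = sorted(range(len(positive)), key=lambda j: positive[j], reverse=True)
    kept, accumulated = [], 0.0
    for j in order:
        if positive[j] <= 0.0 or accumulated >= retain * total:
            break
        kept.append(j)
        accumulated += positive[j]
    return sorted(kept)


def target_adaptive_weights(beam_column, retain=0.8, eps=1e-12):
    """Normalized weights: retained bands keep their energy, rejected bands get zero."""
    kept = set(select_dominant_bands(beam_column, retain))
    raw = [max(e, 0.0) if j in kept else 0.0 for j, e in enumerate(beam_column)]
    total = sum(raw) + eps                          # eps avoids a zero denominator
    return [r / total for r in raw]


def weighted_spectrum(fraz, weights):
    """Weighted sum over bands of fraz[j][h], giving one value per beam."""
    n_beams = len(fraz[0])
    return [sum(w * band[h] for w, band in zip(weights, fraz)) for h in range(n_beams)]
```

For the example series {1, 3, 2, 8, 6} with a retained share of 0.8, the bands with energies 8, 6, and 3 are kept (17 of 20 units), and the two weakest bands receive zero weight.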
3.3. Adaptive Nature of the Algorithm
The adaptivity of the algorithm is reflected in its capability to dynamically adjust the frequency-band selection and weight allocation in response to the real-time characteristics of the target. At each time step, the algorithm acquires the azimuth from the tracking results and dynamically identifies the energy distribution of the target across the different frequency bands. Subsequently, it selects the “dominant” frequency bands with a high signal-to-noise ratio and assigns them higher weights, while rejecting frequency bands with a low signal-to-noise ratio. This dynamic adjustment ensures that the algorithm can adapt to changes in the target’s characteristics and environmental noise, thereby enhancing the continuity and accuracy of the target tracking.
Figure 3 shows the processing structure, which combines the target-tracking output with the “matching and weighting” of band features for different targets to obtain the corresponding spatial spectrum to improve the tracking of weak targets. The rectangular blocks represent processing steps; the quadrilateral blocks represent data; and the rounded rectangles indicate the start or end of the process.
The adaptivity of the algorithm is further enhanced by its ability to adjust weights during the target crossovers. When two targets cross each other, the algorithm calculates separate weights for each target based on their respective band features. These weights are then used to perform spatial spectral weighting on each target, allowing the algorithm to distinguish between the two targets even when they overlap in the azimuth. This dynamic weight adjustment during the crossovers further improves the robustness of the tracking process.
When two targets A and B are at crossover, high weights are given to those bands that contribute major energy to target A and little to target B, and zero weights are additionally given to those bands that contribute major energy to B and little to A. In this way, the broadband summation can strengthen target A while suppressing target B and improve the signal-to-interference ratio of target A with respect to target B, and vice versa.
A schematic diagram of cross weighting is given in Figure 4, in which targets A and B will cross in azimuth, and the latest normalized weights $\bar{\mathbf{w}}_A$ and $\bar{\mathbf{w}}_B$ have been obtained before the crossover based on their respective band features, given by Equation (30). Then, during the crossover, the cross weights $\mathbf{w}^{c}_A$ and $\mathbf{w}^{c}_B$ are used to perform spatial spectral weighting on target A and target B, respectively. Here, $w^{c}_A(j)$ and $w^{c}_B(j)$ are the weights used for the two targets during crossover; they are computed band by band from $\bar{w}_A(j)$ and $\bar{w}_B(j)$, the last updated adaptive target weights before target crossover, where $j = 1, \ldots, K$.
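One simple way to realize the behaviour described above (high weight where one target dominates, zero weight where the other dominates) is sketched below in Python. The hard comparison rule used here is an assumption made purely for illustration and is not claimed to be the paper's cross-weight formula; `cross_weights` is a hypothetical name.

```python
def _normalize(raw, eps=1e-12):
    """Scale non-negative values so that they sum to (almost exactly) one."""
    total = sum(raw) + eps
    return [r / total for r in raw]


def cross_weights(w_a, w_b, eps=1e-12):
    """Assumed rule: a band keeps a target's weight only where that target dominates."""
    raw_a = [a if a > b else 0.0 for a, b in zip(w_a, w_b)]
    raw_b = [b if b > a else 0.0 for a, b in zip(w_a, w_b)]
    return _normalize(raw_a, eps), _normalize(raw_b, eps)
```

Under this rule, bands in which both targets have equal weight are discarded for both, which trades some energy for a cleaner separation of the two spatial spectra.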
For more than two targets, the algorithm extends the weight-adjustment rule by iteratively applying the cross-weighting scheme to each pair of targets. Specifically, the algorithm first determines which target’s characteristic sub-band each sub-band belongs to and then calculates the cross weights as described previously, until the weights for all targets are adjusted. The final weights for each target are normalized to ensure that the sum of weights for all targets in each frequency band equals one.
The adaptivity of the algorithm is crucial for maintaining tracking continuity, particularly when weak targets cross paths with strong ones. By dynamically selecting and weighting the frequency bands based on the real-time energy distribution of the target, the algorithm can effectively reduce the impact of strong interference on weak targets, thereby minimizing tracking discontinuities and improving the overall tracking performance. This adaptivity makes the algorithm highly suitable for complex underwater acoustic environments, in which the target characteristics and interference levels can change rapidly.
4. Experiments and Analysis
The proposed joint algorithm was verified using sea-test data. The experimental data were sourced from a linear array mounted on the UUV platform. During the experiment, the UUV platform moved in a straight line at a speed of 4 knots. The array aperture is 8 m, and the data processing utilizes the received data from 32 array elements. Broadband beamforming is performed in the frequency band of 1 kHz to 3 kHz, and the band is divided into 400 sub-bands for narrowband beamforming. This array converts the received signals into FRAZ data via narrowband DOA processing. The FRAZ data were further processed through broadband integration and summation to generate the BTR, which served as raw data for the tracking algorithm.
The angle of incidence of the signal was defined as 0° in the bow direction and 180° in the stern direction. The (0°, 180°] grid was scanned using linear array data with a scan interval of ∆θ = 1° and a frame duration of 1 s. The broadband BTR obtained using conventional beamforming in the working frequency band is shown in Figure 5.
The state transition equation of the target was constructed using the constant velocity (CV) model. The BTR in Figure 5 was tracked using the particle filtering algorithm based on energy likelihood. The initial number of particles was 2000 for each target, and the initial azimuth was randomly distributed within ±0.5° of the target peak azimuth. The range states of the particles were randomly distributed in the range 0.1~10 km. The velocity states of the particles were randomly distributed in the range 0~10 m/s, depending on the specific situation. The heading of each particle was randomly distributed in the range of 0~360°. The tracking results are shown in Figure 6a.
The tracking results show that the tracking of a target with an azimuth of 60° was interrupted when it crossed at 650 s. At 600 s, the algorithm determined that the track had terminated, and after the end of the target crossing, it determined the start of the track again at 750 s. The prolonged disappearance of weak targets leads to discontinuities and changes in trackers. After 400 s, the target with an azimuth of 150° gradually overlaps the right target. The algorithm determines that the track is terminated, and the target is lost at approximately 400 s.
The above results were further analyzed. During the tracking process, targets with azimuth angles varying from 120° to 50° have stronger energy. The two targets with azimuths of 90° and 60° have relatively weak energy, so the peak is affected during the crossover. The target at 60° was affected for a longer period of time because of its weaker intensity. The peak result is shown in Figure 6b, which indicates that the peak of the weak target was interrupted by the interference of the strong target during the crossover from 600 s to 700 s. The peak interruption of the weak target leads to the determination of track termination before entering the crossing state; therefore, the tracking of the crossing target cannot be performed correctly. However, the target crossover at 350 s can correctly predict the target trajectory based on the particle state because of the small energy intensity difference between the targets and the short peak missing time. Because the target with an azimuth angle of 150° overlaps with the right target in azimuth for a long time, there is only one peak after 400 s; therefore, it is determined that the track is terminated.
The frequency-band adaptive selection method based on tracking targets can use the concept of frequency-band matching to strengthen the target being tracked while weakening the target that crosses it. This method enhances the continuity of target peaks and allows for direct distinction of crossing targets by frequency when there are differences in their frequency-band characteristics. The initial FRAZ is shown in Figure 7. The target with an initial azimuth of 60° is defined as target A, and targets with azimuths of 150° and 160° are defined as targets B and C, and the strong interference is initially at 110°. It can be observed that there are significant differences in the intensity distribution of the target in terms of frequency. The strong interference has an equal energy distribution in each frequency band, so it overwhelms target A in the crossover process. Target B has a strong concentration of energy in the band of serial numbers 50~100 and almost no energy in the band of serial numbers 300~400. However, target C, which overlaps with target B for a long time, has a strong SNR in the band of serial numbers 300~400.
The joint processing of band features and particle state features was used to track target A. First, the band features of the tracked target were obtained based on the azimuth tracking results of the particle filtering. Second, the target band adaptive weighting vector is calculated, and the enhanced spatial spectrum of the tracked target is used for the subsequent tracking of target A. Finally, the new weights are calculated again from the tracking results.
Figure 8 illustrates the band adaptive weight assignment results for target A during crossover, revealing the key advantages of the proposed method in handling the intersection of target signals and strong interference. From left to right, the figure displays the energy distribution of target A and the interference, as well as the matching weights calculated based on their respective spectral distributions. The energy distribution across sub-bands shows that target A’s energy is weaker than the interference in all the analyzed bands, resulting in a signal-to-interference ratio (SIR) of less than 0 dB. However, target A exhibits a relatively higher SIR in some sub-bands. The weight distribution of target A and the interference indicates that target A has a certain “dominance” in sub-bands numbered 125–150 and 220–240, where the SIR is relatively higher. By enhancing target A’s relative “dominance” in these sub-bands, the proposed method not only effectively improves the detectability of the target signal but also maintains signal continuity during the crossover period. This dynamic weight allocation strategy is one of the core innovations of the proposed method, enhancing target detection performance in complex interference environments.
Similarly, the results of the joint processing of targets B and C are shown in Figure 9. The cross weights of target B and target C during the overlap period show that the “strong” sub-band of target B is in the top 100 sub-bands, while the “strong” sub-band of target C is between 300 and 400. When the same BTR is used as the tracking input, the two targets overlap in azimuth, and only one target can be tracked. However, by weighting according to the cross weights to obtain the BTR for target B and the BTR for target C, it is easy to distinguish the two targets on the same azimuth and provide correct tracking results during the overlap.
Figure 10a–c show the sub-band adaptive weighting results of the peak history for targets A, B, and C, respectively. The processing results were compared with the original BTR to obtain Figure 10d.
To demonstrate the superiority of our proposed algorithm, we compared it with the traditional detect-before-track (DBT) [7], Standard PF-TBD [8], and recent state-of-the-art methods including deep learning TBD (DL-TBD) [24]. In traditional DBT following Bar-Shalom’s framework [7], the measurements were thresholded at the 3σ noise level before Kalman filtering with nearest-neighbor association. Standard PF-TBD was implemented according to Arulampalam’s particle filtering guidelines [8] with 2000 particles and energy likelihood matching. The DL-TBD adopted Wang et al.’s CNN architecture [24] trained on our dataset with identical input BTRs.
For a fair comparison, all the particle-based methods used identical resampling thresholds (below 50%) and systematic resampling with 2000 particles and energy likelihood matching. The DP-TBD implementation followed [26], trained on our dataset with identical input BTRs with a grid resolution of Δθ = 0.5°. All the timing metrics were obtained on an Intel i7-11800H CPU using a single-threaded MATLAB 2018b implementation.
The results are reported in Table 1. The tracking accuracy and trajectory continuity metrics were calculated as follows. The tracking accuracy was quantified using the root mean square error (RMSE) between the estimated and ground-truth positions (obtained via GPS-augmented surface buoy markers). Trajectory continuity was measured as the percentage of time steps without track fragmentation (>3 s gaps). The computational complexity was evaluated via the processing time/frame (ms) on an Intel i7-11800H CPU.
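The two tracking metrics can be computed along the following lines. This Python sketch assumes one estimate per frame, a per-frame flag indicating whether the track was maintained, and that a gap counts as fragmentation only when it lasts longer than 3 s; the function names and the exact treatment of gaps are assumptions for illustration.

```python
import math


def rmse(estimates, truths):
    """Root mean square error between paired estimates and ground-truth values."""
    n = len(estimates)
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truths)) / n)


def continuity_percent(tracked, frame_s=1.0, max_gap_s=3.0):
    """Percentage of frames that are not inside a gap longer than max_gap_s.

    tracked is a list of booleans, one per frame (True = track maintained).
    """
    n = len(tracked)
    fragmented = 0
    run = 0
    for flag in list(tracked) + [True]:   # sentinel closes a trailing gap
        if flag:
            if run * frame_s > max_gap_s:
                fragmented += run
            run = 0
        else:
            run += 1
    return 100.0 * (n - fragmented) / n
```

With a 1 s frame duration, a track that is lost for 4 consecutive frames out of 10 yields a continuity of 60%, while a 2-frame dropout is tolerated and does not reduce the score.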
To isolate the contribution of sub-band weighting, we conducted ablation tests by removing the adaptive weighting module. This resulted in an RMSE degradation from 0.93° to 1.64°, confirming the critical role of band adaptation.
The proposed method improves trajectory continuity compared with conventional PF-TBD (from 78.1% to 93.2%) while maintaining real-time capability (<150 ms/frame). Frequency-domain feature weighting proved particularly effective during target crossovers, in which we observed a reduction in track swaps. The current implementation requires a predefined target count. Future work will integrate birth/death processes and explore GPU acceleration for >10,000 particles.
5. Conclusions
This paper presents a novel TBD algorithm that integrates particle states and band features to enhance the tracking of weak targets in complex underwater acoustic environments. By introducing frequency-band adaptive matching and particle filter techniques, the algorithm significantly improves the continuity and accuracy of target trajectories, especially in scenarios in which targets cross paths. The experimental results, supported by detailed hardware setup and data processing, demonstrate the robustness and effectiveness of the proposed method.
However, the proposed algorithm also has notable limitations. It does not account for the birth and death of targets, and the added frequency-band adaptation and particle filtering processes increase the computational complexity. Additionally, the algorithm’s performance has only been validated in a single array configuration and a specific marine environment. To address these limitations, future work will introduce dynamic birth and death processes to handle changes in the number of targets. We will continue to explore optimization strategies, such as GPU acceleration and the use of lower-resolution sub-bands, to reduce processing time while maintaining tracking performance. Furthermore, the algorithm will be tested in multiple sonar configurations and environments to assess its robustness and adaptability.
In summary, although the proposed algorithm achieves high tracking accuracy and continuity, addressing its current limitations is crucial for enhancing its practicality and robustness. Future efforts will focus on optimizing the algorithm to reduce computational complexity while preserving its performance advantages, making it more suitable for real-world applications in complex underwater environments.