Robust Blind Algorithm for DOA Estimation Using TDOA Consensus
Abstract
1. Introduction
- A whitening transformation that decorrelates the received signals, reducing the impact of colored noise and reverberation.
- A Lawson norm optimization approach that provides robustness against outliers in the TDOA estimation process.
- A novel TDOA consensus framework based on median and MAD statistics that further improves estimation accuracy by filtering out anomalous measurements.
1. The development of a novel TDOA consensus framework using Median and MAD statistics for robust outlier detection and filtering.
2. The integration of whitening transformation with Lawson norm optimization in a unified DOA estimation algorithm.
3. Comprehensive evaluation of the proposed method against both traditional approaches (standard GCC, GCC-PHAT with L2 norms) and modern alternatives (SRP-PHAT, robust MUSIC) across various acoustic conditions.
4. Analysis of the algorithm’s sensitivity to key parameters, including the Lawson norm parameter and microphone array configuration.
5. Detailed computational complexity analysis demonstrating the algorithm’s feasibility for real-time applications.
2. Methods
2.1. Signal Model
2.2. Whitening Transformation
2.3. TDOA Estimation with GCC-PHAT
2.4. Lawson Norm Optimization
2.5. TDOA Consensus Framework
1. Generate K sets of perturbed signals by adding small random perturbations to the whitened signals:
2. For each set of perturbed signals, estimate the TDOA between each microphone and the reference microphone using GCC-PHAT followed by Lawson norm refinement:
3. For each microphone pair, collect the TDOA estimates across all perturbations:
4. Apply the Median and MAD-based outlier detection to each set :
5. Compute the consensus TDOA estimate as the mean of the inliers:
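The five steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: the input signals are taken to be already whitened, the Lawson norm refinement is omitted (plain GCC-PHAT peak picking stands in for it), and the perturbation level `sigma`, the number of perturbations `K`, the MAD scale factor 1.4826, and the cut-off `threshold` are illustrative choices. All function names are hypothetical.

```python
import numpy as np

def gcc_phat(x, ref, fs):
    """Delay of x relative to ref in seconds (positive: x lags ref)."""
    n = len(x) + len(ref)                    # zero-pad to avoid circular wrap-around
    cross = np.fft.rfft(x, n=n) * np.conj(np.fft.rfft(ref, n=n))
    cross /= np.abs(cross) + 1e-12           # PHAT weighting: keep only the phase
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    # Reorder so that lags run from -max_shift to +max_shift
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / fs

def mad_inliers(values, threshold=3.0):
    """Boolean mask of values within `threshold` robust deviations of the median."""
    values = np.asarray(values, dtype=float)
    med = np.median(values)
    mad = np.median(np.abs(values - med))
    if mad == 0.0:                           # more than half the values coincide
        return values == med
    return np.abs(values - med) / (1.4826 * mad) <= threshold

def consensus_tdoa(signals, fs, ref=0, K=20, sigma=0.01, threshold=3.0, seed=0):
    """Consensus TDOA of every microphone relative to microphone `ref`."""
    rng = np.random.default_rng(seed)
    signals = np.asarray(signals, dtype=float)
    M = signals.shape[0]
    estimates = np.zeros((K, M))
    for k in range(K):
        # Step 1: small random perturbation of the (assumed whitened) signals
        noise = sigma * signals.std() * rng.standard_normal(signals.shape)
        perturbed = signals + noise
        # Step 2: TDOA of each microphone against the reference
        for m in range(M):
            if m != ref:
                estimates[k, m] = gcc_phat(perturbed[m], perturbed[ref], fs)
    tdoa = np.zeros(M)
    for m in range(M):
        col = estimates[:, m]                # Step 3: estimates across perturbations
        keep = mad_inliers(col, threshold)   # Step 4: median/MAD outlier detection
        tdoa[m] = col[keep].mean()           # Step 5: mean of the inliers
    return tdoa

# Toy check: three copies of one noise signal with known integer delays
rng = np.random.default_rng(1)
fs, N = 16000, 4096
s = rng.standard_normal(N + 20)
mics = np.stack([s[10:10 + N], s[5:5 + N], s[13:13 + N]])
tdoa = consensus_tdoa(mics, fs)
print(np.round(tdoa * fs, 2))                # delays in samples
```

In the toy check, the second channel lags the reference by 5 samples and the third leads it by 3, so the printed delays are expected to be close to 0, 5 and -3 samples.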
2.6. DOA Estimation from TDOA Measurements
2.7. Computational Complexity Analysis
1. Whitening transformation: , where M is the number of microphones and N is the number of signal samples.
2. GCC-PHAT computation: for all microphone pairs.
3. Lawson norm refinement: , where S is the size of the search neighborhood.
4. TDOA consensus framework: , where K is the number of perturbations.
5. DOA estimation: , where A is the number of angles in the search grid.
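To make the role of the search grid in the last item concrete, the following sketch scans A candidate angles and keeps the one whose predicted TDOAs best match the measured ones, so the cost grows linearly with A. It is only an illustration under assumptions that are not taken from the paper: a 2-D far-field (plane-wave) model, a speed of sound of 343 m/s, and a plain squared-error cost standing in for the Lawson norm criterion. The function names are hypothetical.

```python
import numpy as np

def farfield_tdoa(mic_xy, theta, ref=0, c=343.0):
    """Predicted TDOAs (s) relative to microphone `ref` for a plane wave from angle theta."""
    mic_xy = np.asarray(mic_xy, dtype=float)
    u = np.array([np.cos(theta), np.sin(theta)])   # unit vector pointing at the source
    # A microphone displaced toward the source receives the wavefront earlier
    return -((mic_xy - mic_xy[ref]) @ u) / c

def doa_grid_search(tdoa, mic_xy, ref=0, num_angles=360, c=343.0):
    """Angle (rad) on a uniform grid that minimizes the squared TDOA mismatch."""
    grid = np.linspace(-np.pi, np.pi, num_angles, endpoint=False)
    costs = [np.sum((tdoa - farfield_tdoa(mic_xy, th, ref, c)) ** 2) for th in grid]
    return grid[int(np.argmin(costs))]

# Toy check: four microphones on a circle of radius 0.5 m, source at 60 degrees
angles = 2 * np.pi * np.arange(4) / 4
mic_xy = 0.5 * np.column_stack((np.cos(angles), np.sin(angles)))
true_theta = np.pi / 3
measured = farfield_tdoa(mic_xy, true_theta)
estimate = doa_grid_search(measured, mic_xy)
print(np.degrees(estimate))
```

With a purely linear array the same search returns one of two mirror-image angles, since a line of microphones cannot distinguish directions that are symmetric about its axis.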
3. Results
3.1. Experimental Setup
3.1.1. Simulation Parameters
- Room dimensions: 10 m × 8 m;
- Sampling frequency: 16 kHz;
- Signal duration: 1 s;
- Source signal: Combination of chirp and speech-like signals;
- Microphone array: Various configurations (detailed below);
- Signal-to-Noise Ratio (SNR): 0 to 20 dB;
- Reverberation time (T60): 0.2 to 0.8 s;
- Number of trials per condition: 10.
3.1.2. Microphone Array Configurations
1. Diverse array: Four microphones at positions (2,1), (6,2), (8,6), and (5,7) meters;
2. Uniform Linear Array (ULA): Four microphones with 0.5 m spacing along the x-axis;
3. Uniform Circular Array (UCA): Four microphones arranged in a circle with 0.5 m radius.
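The three geometries can be written down as coordinate arrays, which is convenient when reproducing the comparison. The sketch below is a hedged illustration: the diverse-array coordinates are those listed above, while the placement of the ULA and UCA inside the room is not specified in this section, so the `origin` and `center` defaults are assumptions, as are the helper names.

```python
import numpy as np

# Diverse array: coordinates in meters, as listed above
DIVERSE = np.array([(2, 1), (6, 2), (8, 6), (5, 7)], dtype=float)

def ula(num_mics=4, spacing=0.5, origin=(0.0, 0.0)):
    """Uniform linear array along the x-axis, starting at `origin` (assumed placement)."""
    x = origin[0] + spacing * np.arange(num_mics)
    y = np.full(num_mics, origin[1], dtype=float)
    return np.column_stack((x, y))

def uca(num_mics=4, radius=0.5, center=(0.0, 0.0)):
    """Uniform circular array around `center` (assumed placement)."""
    ang = 2 * np.pi * np.arange(num_mics) / num_mics
    return np.column_stack((center[0] + radius * np.cos(ang),
                            center[1] + radius * np.sin(ang)))

arrays = {"diverse": DIVERSE, "ula": ula(), "uca": uca()}
for name, xy in arrays.items():
    # Largest inter-microphone distance, a rough measure of array aperture
    aperture = max(np.linalg.norm(a - b) for a in xy for b in xy)
    print(f"{name}: {len(xy)} mics, aperture {aperture:.2f} m")
```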
3.1.3. Compared Methods
1. Standard method: Basic cross-correlation without PHAT weighting, using L2 norm optimization;
2. GCC-PHAT with L2 norm: Standard GCC-PHAT with L2 norm optimization;
3. No whitening: Proposed method without the whitening transformation;
4. No consensus: Proposed method without the TDOA consensus framework;
5. SRP-PHAT: Steered Response Power with Phase Transform;
6. Robust MUSIC: MUSIC algorithm with robust covariance estimation.
3.1.4. Performance Metrics
3.2. Performance vs. SNR
3.3. Performance vs. Reverberation Time
3.4. Error Distribution Analysis
3.5. Performance Under Impulsive Noise
3.6. Ablation Study
1. The full proposed method consistently outperforms all variants, confirming that each component contributes positively to the overall performance.
2. Removing the whitening transformation (“No Whitening”) leads to performance degradation, particularly at high reverberation times, highlighting the importance of decorrelating the signals in reverberant environments.
3. Removing the TDOA consensus framework (“No Consensus”) results in increased RMSE across all conditions, with the effect being more pronounced at low SNR and high reverberation, demonstrating the value of the MAD-based outlier filtering approach.
4. Replacing the Lawson norm with the L2 norm (“L2 Norms (GCC-PHAT)”) significantly reduces performance, especially at low SNR, confirming the robustness benefits of the Lawson norm optimization.
5. The standard method (“Standard (Basic GCC)”) performs worst overall, indicating that the combination of basic cross-correlation without PHAT weighting and L2 norm optimization is particularly vulnerable to noise and reverberation.
3.7. Impact of Microphone Count
3.8. Impact of Array Geometry
3.9. Sensitivity to Lawson Norm Parameter
3.10. Comparison with Modern Algorithms
4. Discussion
4.1. Interpretation of Results
1. The proposed method consistently outperforms both traditional approaches and modern alternatives, with improvements of up to 30% in RMSE under challenging conditions (low SNR, high reverberation).
2. Each component of the proposed method (whitening, Lawson norm optimization, TDOA consensus) contributes significantly to the overall performance, as evidenced by the ablation study. The TDOA consensus framework based on Median and MAD statistics is particularly effective at filtering out outliers in TDOA estimates.
3. The method’s performance advantage is maintained across different microphone array configurations and counts, demonstrating its versatility for various practical applications.
4. The Lawson norm parameter p has a notable impact on performance, with values around 1.7 providing the best results for the tested conditions. This suggests that while robustness to outliers is important (favoring lower p values), some degree of smoothness (higher p values) is beneficial for overall accuracy.
4.2. Practical Implications
1. Robustness to challenging conditions: The method’s superior performance in low SNR and high reverberation makes it particularly suitable for applications in challenging acoustic environments, such as smart home devices, teleconferencing systems, and surveillance systems.
2. Flexibility in array configuration: The method works well with various microphone array geometries, allowing for flexible deployment in different physical settings. The results suggest that a diverse array configuration generally provides the best performance, but the method still maintains its advantage with standard configurations like ULA and UCA.
3. Scalability with microphone count: The performance improves with additional microphones, but with diminishing returns beyond 6–7 microphones. This provides practical guidance for system design, suggesting that a moderate number of microphones (4–6) may offer a good balance between performance and complexity.
4. Computational feasibility: The complexity analysis indicates that the method can be implemented to run in real time on modern hardware with appropriate optimizations. This makes it suitable for applications requiring low-latency DOA estimation.
4.3. Limitations and Future Work
- Single source assumption: The current method assumes a single dominant sound source. Extending the approach to handle multiple simultaneous sources would increase its applicability to more complex acoustic scenarios.
- Fixed Lawson norm parameter: The current implementation uses a fixed value for the Lawson norm parameter p. Developing an adaptive approach that selects the optimal p based on the acoustic conditions could further improve performance.
- Limited frequency analysis: The method operates on the full-band signals without explicit frequency-dependent processing. Incorporating frequency-dependent TDOA estimation and consensus could potentially improve performance, especially in environments with frequency-dependent reverberation characteristics.
- Simulation-based evaluation: While our simulations include realistic modeling of room acoustics, evaluation on real-world recordings would provide additional validation of the method’s practical performance.
1. Extending the method to handle multiple sound sources through clustering of TDOA estimates or subspace-based approaches.
2. Developing an adaptive framework for selecting the Lawson norm parameter based on estimated SNR and reverberation conditions.
3. Incorporating frequency-dependent processing, potentially using a sub-band approach with frequency-dependent consensus.
4. Evaluating the method on real-world recordings in various acoustic environments.
5. Exploring the integration of the proposed method with tracking algorithms for dynamic sound sources.
6. Investigating the potential of machine learning approaches to optimize the consensus mechanism based on acoustic conditions.
4.4. Applications
- Smart home devices: Improving the accuracy of voice command localization in devices like smart speakers, especially in reverberant home environments.
- Teleconferencing systems: Enhancing speaker localization and tracking in meeting rooms, enabling better camera control and audio beamforming.
- Hearing aids: Providing more accurate sound source localization for adaptive beamforming in hearing assistance devices.
- Audio surveillance: Improving the localization of sound events in security and monitoring applications.
- Secure communication in IoT: DOA estimation can play a role in enhancing the security of IoT networks by enabling location-aware security protocols and facilitating privacy-preserving data aggregation schemes [16].
- Target tracking: The algorithm can be used for tracking the movement of targets in various scenarios, including military applications, search and rescue operations, and wildlife monitoring. TDOA measurements from a network of cooperating sensors can be processed using the proposed algorithm to achieve accurate and robust target tracking [17].
- Human-robot interaction: Enabling more natural interaction by allowing robots to accurately locate and track human speakers in noisy environments.
- Vehicle localization: The algorithm can be used to accurately localize vehicles in complex urban environments where GPS signals may be unreliable due to multipath effects. Cooperative localization using multiple base stations and sparse Bayesian learning techniques can further enhance accuracy [18,19].
- Wireless sensor networks: The algorithm can be applied in wireless acoustic sensor networks for sound source localization and tracking, which is useful for surveillance, environmental monitoring, and smart building applications. The algorithm’s efficiency is particularly beneficial in resource-constrained sensor networks [22,23].
5. Conclusions
Funding
Institutional Review Board Statement
Informed Consent Statement
Acknowledgments
Conflicts of Interest
Abbreviations
DOA | Direction of Arrival |
TDOA | Time Difference of Arrival |
GCC | Generalized Cross-Correlation |
PHAT | Phase Transform |
MAD | Median Absolute Deviation |
RMSE | Root Mean Square Error |
SNR | Signal-to-Noise Ratio |
MUSIC | MUltiple SIgnal Classification |
ESPRIT | Estimation of Signal Parameters via Rotational Invariance Techniques |
SRP | Steered Response Power |
ULA | Uniform Linear Array |
UCA | Uniform Circular Array |
References
1. Brandstein, M.; Ward, D. Microphone Arrays: Signal Processing Techniques and Applications; Springer: Berlin, Germany, 2001.
2. Benesty, J.; Chen, J.; Huang, Y. Microphone Array Signal Processing; Springer: Berlin/Heidelberg, Germany, 2008.
3. Krim, H.; Viberg, M. Two decades of array signal processing research: The parametric approach. IEEE Signal Process. Mag. 1996, 13, 67–94.
4. Godara, L.C. Application of antenna arrays to mobile communications, Part II: Beam-forming and direction-of-arrival considerations. Proc. IEEE 1997, 85, 1195–1245.
5. DiBiase, J.H.; Silverman, H.F.; Brandstein, M.S. Robust localization in reverberant rooms. In Microphone Arrays; Springer: Berlin, Germany, 2001; pp. 157–180.
6. Valin, J.M.; Michaud, F.; Rouat, J. Robust localization and tracking of simultaneous moving sound sources using beamforming and particle filtering. Robot. Auton. Syst. 2007, 55, 216–228.
7. Schmidt, R.O. Multiple emitter location and signal parameter estimation. IEEE Trans. Antennas Propag. 1986, 34, 276–280.
8. Roy, R.; Kailath, T. ESPRIT-estimation of signal parameters via rotational invariance techniques. IEEE Trans. Acoust. Speech Signal Process. 1989, 37, 984–995.
9. Stoica, P.; Sharman, K.C. Maximum likelihood methods for direction-of-arrival estimation. IEEE Trans. Acoust. Speech Signal Process. 1990, 38, 1132–1143.
10. Van Veen, B.D.; Buckley, K.M. Beamforming: A versatile approach to spatial filtering. IEEE ASSP Mag. 1988, 5, 4–24.
11. Knapp, C.; Carter, G. The generalized correlation method for estimation of time delay. IEEE Trans. Acoust. Speech Signal Process. 1976, 24, 320–327.
12. Greco, D.; Cavazza, J.; Bue, A.D. Are Multiple Cross-Correlation Identities better than just Two? Improving the Estimate of Time Differences-of-Arrivals from Blind Audio Signals. In Proceedings of the 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 10–15 January 2021; pp. 6592–6599.
13. Zoltowski, M.D.; Kautz, G.M.; Silverstein, S.D. Beamspace root-MUSIC. IEEE Trans. Signal Process. 1996, 44, 1131–1146.
14. Malioutov, D.; Çetin, M.; Willsky, A.S. A sparse signal reconstruction perspective for source localization with sensor arrays. IEEE Trans. Signal Process. 2005, 53, 3010–3022.
15. Chakrabarty, S.; Habets, E.A. Multi-speaker DOA estimation using deep convolutional networks trained with noise signals. IEEE J. Sel. Top. Signal Process. 2019, 13, 8–21.
16. Yu, C.; Li, Y.; Li, L.; Huang, Z.; Wu, Q.; de Lamare, R. Dual Lawson Norm-Based Robust DOA Estimation for RIS-Aided Wireless Communication Systems. IEEE Trans. Aerosp. Electron. Syst. 2025, 61, 582–592.
17. Khan, N.A.; Ali, S. Robust spatial time-frequency distributions for DOA estimation using spatial averaging and directional smoothing. Signal Process. 2021, 180, 107897.
18. Owen, O.; Pan, Z.; Shimamoto, S. Vehicle Localization utilizing a Novel Hybrid TDOA-Based Estimation. In Proceedings of the 2022 IEEE 96th Vehicular Technology Conference (VTC2022-Fall), London, UK, 26–29 September 2022; pp. 1–6.
19. Flores, L.A.; Lomas, I.; Guachalá, L.; Lupera-Morillo, P.; Álvarez, R.; Llugsi, R. Direction-of-Arrival (DOA) Estimation Based on Real Field Measurements and Modified Linear Regression. Eng. Proc. 2024, 77, 11.
20. Xu, Z.; Wu, S.; Yu, Z.; Guang, X. A Robust Direction of Arrival Estimation Method for Uniform Circular Array. Sensors 2019, 19, 4427.
21. Mofeed, M.A.E.; Mofeed, H.A.E. Direction-of-arrival methods (DOA) and time difference of arrival (TDOA) position location technique. In Proceedings of the Twenty-Second National Radio Science Conference, NRSC 2005, Cairo, Egypt, 15–17 March 2005; pp. 173–182.
22. Traa, J.; Smaragdis, P. Multichannel Source Separation and Tracking With RANSAC and Directional Statistics. IEEE/ACM Trans. Audio Speech Lang. Process. 2014, 22, 2233–2243.
23. Lan, X.; Hu, J.; Zhang, Y.; Ma, S.; Tian, Y. A Novel DOA Estimation Algorithm Based on Robust Mixed Fractional Lower-Order Correntropy in Impulsive Noise. Electronics 2024, 13, 2386.
Method | SNR = 0 dB (T60 = 0.6 s) | SNR = 10 dB (T60 = 0.6 s) | SNR = 20 dB (T60 = 0.6 s) | T60 = 0.2 s (SNR = 10 dB) | T60 = 0.8 s (SNR = 10 dB) |
---|---|---|---|---|---|
Proposed Robust | 10.2 | 11.1 | 11.0 | 11.3 | 12.4 |
SRP-PHAT | 12.5 | 13.2 | 13.0 | 12.8 | 14.7 |
Robust MUSIC | 11.8 | 12.5 | 12.3 | 12.1 | 14.1 |
Standard (Basic GCC) | 15.3 | 16.5 | 16.2 | 13.3 | 17.9 |
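The relative gains implied by the table can be recomputed directly from its entries. The short script below is only a convenience sketch: it copies the tabulated values as they appear above and reports, for each condition, the percentage by which the proposed method's value is lower than each baseline's. The variable and function names are illustrative.

```python
# Values copied from the table above, one list per method, in column order
conditions = ["SNR 0 dB", "SNR 10 dB", "SNR 20 dB", "T60 0.2 s", "T60 0.8 s"]
table = {
    "Proposed Robust": [10.2, 11.1, 11.0, 11.3, 12.4],
    "SRP-PHAT": [12.5, 13.2, 13.0, 12.8, 14.7],
    "Robust MUSIC": [11.8, 12.5, 12.3, 12.1, 14.1],
    "Standard (Basic GCC)": [15.3, 16.5, 16.2, 13.3, 17.9],
}

def relative_reduction(baseline, proposed):
    """Percentage by which each proposed value is lower than the baseline value."""
    return [100.0 * (b - p) / b for b, p in zip(baseline, proposed)]

proposed = table["Proposed Robust"]
reductions = {
    name: relative_reduction(values, proposed)
    for name, values in table.items()
    if name != "Proposed Robust"
}

for name, gains in reductions.items():
    cells = ", ".join(f"{c}: {g:.1f}%" for c, g in zip(conditions, gains))
    print(f"vs {name}: {cells}")
```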
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Greco, D. Robust Blind Algorithm for DOA Estimation Using TDOA Consensus. Acoustics 2025, 7, 52. https://doi.org/10.3390/acoustics7030052