Target Speaker Localization Based on the Complex Watson Mixture Model and Time-Frequency Selection Neural Network
Abstract
Common sound source localization algorithms focus on localizing all the active sources in the environment. Because the source identities are generally unknown, retrieving the location of a particular speaker of interest requires extra effort. This paper addresses the problem of localizing a speaker of interest from a novel perspective by performing time-frequency selection before localization. The speaker of interest, namely the target speaker, is assumed to be sparsely active in the signal spectra. The target speaker-dominant time-frequency regions are separated by a speaker-aware Long Short-Term Memory (LSTM) neural network, and these regions are sufficient to determine the Direction of Arrival (DoA) of the target speaker. Speaker-awareness is achieved by using a short target utterance to adapt the hidden layer outputs of the neural network. The instantaneous DoA estimator is based on the probabilistic complex Watson Mixture Model (cWMM), and a weighted maximum likelihood estimation of the model parameters is derived accordingly. Simulation experiments show that the proposed algorithm works well in various noisy conditions and remains robust when the signal-to-noise ratio is low and when a competing speaker is present.
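The idea of weighting a Watson-based DoA likelihood with a time-frequency mask can be illustrated with a minimal sketch. It is not the estimator derived in the paper: it assumes a far-field planar array, a fixed concentration parameter, a grid of candidate directions, and a mask that is already given (in the paper the mask would come from the speaker-aware LSTM). The names `steering_vectors`, `weighted_watson_doa`, `mic_xy`, and `kappa` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def steering_vectors(mic_xy, angles_deg, freqs_hz, c=343.0):
    """Unit-norm far-field steering vectors, shape (n_angles, n_freqs, n_mics).

    Assumption: planar array with microphone coordinates mic_xy in metres.
    """
    angles = np.deg2rad(angles_deg)
    directions = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # (A, 2)
    delays = -(directions @ mic_xy.T) / c                             # (A, M)
    phase = -2j * np.pi * freqs_hz[None, :, None] * delays[:, None, :]
    return np.exp(phase) / np.sqrt(mic_xy.shape[0])

def weighted_watson_doa(stft, mask, steer, kappa=5.0):
    """Grid search for the DoA that maximizes a mask-weighted Watson log-likelihood.

    stft:  (T, F, M) complex multichannel spectrogram
    mask:  (T, F) weights in [0, 1]; high where the target is assumed dominant
    steer: (A, F, M) unit-norm candidate mean directions
    kappa: fixed concentration (assumption); with kappa shared by all candidates
           the normalizing constant is identical for each of them and is dropped,
           so kappa scales the scores without changing the argmax.
    """
    # The complex Watson distribution is defined on unit-norm vectors.
    norms = np.linalg.norm(stft, axis=-1, keepdims=True)
    z = stft / np.maximum(norms, 1e-12)
    # |a^H z|^2 for every candidate angle, frame, and frequency bin.
    proj = np.abs(np.einsum("afm,tfm->atf", steer.conj(), z)) ** 2
    # Weighted log-likelihood up to a constant: sum_tf w(t, f) * kappa * |a^H z|^2.
    scores = kappa * np.sum(mask[None] * proj, axis=(1, 2))
    return int(np.argmax(scores)), scores
```

Calling `weighted_watson_doa(stft, mask, steer)` returns the index of the best candidate angle together with the per-angle scores. Passing an all-ones mask reduces the sketch to an unweighted search over all active sources, which is the case the time-frequency selection is meant to avoid.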
Cite This Article
Wang, Z.; Li, J.; Yan, Y. Target Speaker Localization Based on the Complex Watson Mixture Model and Time-Frequency Selection Neural Network. Appl. Sci. 2018, 8, 2326.
Wang Z, Li J, Yan Y. Target Speaker Localization Based on the Complex Watson Mixture Model and Time-Frequency Selection Neural Network. Applied Sciences. 2018; 8(11):2326.
Wang, Ziteng; Li, Junfeng; Yan, Yonghong. 2018. "Target Speaker Localization Based on the Complex Watson Mixture Model and Time-Frequency Selection Neural Network." Appl. Sci. 8, no. 11: 2326.