# Underdetermined Blind Source Separation Combining Tensor Decomposition and Nonnegative Matrix Factorization

## Abstract

## 1. Introduction

## 2. Problem Formulation

#### 2.1. Linear Instantaneous Mixture Model

#### 2.2. The NMF Source Model

#### 2.3. Objective

## 3. The Proposed Optimization Algorithm

#### 3.1. Mixing Matrix Estimation Using Tensor Decomposition

#### 3.2. Source Separation Using the Baseline Methods

#### 3.2.1. ${l}_{p}$ Norm Minimization Method

#### 3.2.2. Binary Masking Method

#### 3.3. The Optimization EM Algorithm

**E-step:** Conditional Expectations of Natural Statistics

**M-step:** Update of Parameters

**A**, **W**, and **H**.

**Algorithm 1:** Proposed Algorithm for Underdetermined Linear BSS.

Underdetermined Linear Mixture Case ($I>J$)

- Step 1. Estimate the mixing matrix $\mathbf{A}$ by using the time-domain tensor decomposition.
- Step 2. Perform STFT on $\mathbf{x}(t)$ to get ${\mathbf{x}}_{fn}$.
- Step 3. Estimate the sources using (20) and detect the source spectrogram factors employing the NMF method with (7).
- Step 4. Initialize the updated matrix, the spectral basis, and the temporal code, then update these parameters using the EM algorithm, i.e., repeat until convergence:
  - (i) Update $\mathbf{A}$ with (33) in the linear mixture case.
  - (ii) Alternately update ${w}_{fk}$ and ${h}_{kn}$ with (35).
- Step 5. Estimate ${\widehat{\mathbf{s}}}_{fn}$ by using the Wiener filter of (28).
- Step 6. Transform ${\widehat{\mathbf{s}}}_{fn}$ into the time domain to obtain $\mathbf{s}(t)$ through the inverse STFT.
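The NMF factorization in Step 3 can be illustrated with a minimal sketch. Equation (7) is not reproduced here, so the code below is not the paper's update rule: it assumes the widely used multiplicative updates for NMF under the Itakura-Saito divergence [14], applied to a nonnegative power spectrogram `V`. The function names `is_divergence` and `is_nmf`, the random initialization, and the `eps` safeguard are illustrative choices.

```python
import numpy as np

def is_divergence(V, V_hat):
    """Itakura-Saito divergence between V and V_hat, summed over all entries."""
    R = V / V_hat
    return float(np.sum(R - np.log(R) - 1.0))

def is_nmf(V, K, n_iter=200, seed=0, eps=1e-12):
    """Approximate a strictly positive F x N matrix V by W @ H (W: F x K, H: K x N).

    Uses the standard multiplicative updates for the Itakura-Saito divergence.
    Returns W, H and the list of costs (initial cost first, then one per iteration).
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    W = rng.random((F, K)) + eps  # spectral basis (assumed random init)
    H = rng.random((K, N)) + eps  # temporal code (assumed random init)
    costs = [is_divergence(V, W @ H + eps)]
    for _ in range(n_iter):
        V_hat = W @ H + eps
        H *= (W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat) + eps)
        V_hat = W @ H + eps
        W *= ((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T + eps)
        costs.append(is_divergence(V, W @ H + eps))
    return W, H, costs

# Toy example: a synthetic rank-3 "power spectrogram" (not data from the paper).
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 30)) + 1e-3
W, H, costs = is_nmf(V, K=3, n_iter=100)
print("initial cost:", costs[0], "final cost:", costs[-1])
```

Multiplicative updates keep `W` and `H` nonnegative without an explicit projection, which is why they are a common choice for initializing the spectral basis and temporal code before an EM refinement.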

#### 3.4. Convolutive Mixed Sources Case

**Algorithm 2:** Proposed Algorithm for Underdetermined Convolutive BSS.

Underdetermined Convolutive Mixture Case ($I>J$)

- Step 1. Perform STFT on $\mathbf{x}(t)$ to get ${\mathbf{x}}_{fn}$.
- Step 2. Estimate the mixing matrix ${\mathbf{A}}_{f}$ by using frequency-domain tensor decomposition.
- Step 3. Estimate the sources using (22) and detect the source spectrogram factors employing the NMF method with (7).
- Step 4. Initialize the updated matrix, the spectral basis, and the temporal code, then update these parameters using the EM algorithm, i.e., repeat until convergence:
  - (i) Update ${\mathbf{A}}_{f}$ with (41) in the convolutive mixture case.
  - (ii) Alternately update ${w}_{fk}$ and ${h}_{kn}$ with (35).
- Step 5. Estimate ${\widehat{\mathbf{s}}}_{fn}$ by using the Wiener filter of (28).
- Step 6. Transform ${\widehat{\mathbf{s}}}_{fn}$ into the time domain to obtain $\mathbf{s}(t)$ through the inverse STFT.
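Step 5 can be sketched as a per-bin multichannel Wiener filter. Equation (28) is not reproduced here, so the code below assumes the form commonly used with multichannel NMF [16]: each source has variance $v_{j,fn}=\sum_k w_{fk}h_{kn}$, the mixture covariance is ${\mathbf{A}}_{f}\,\mathrm{diag}(v)\,{\mathbf{A}}_{f}^{H}$ plus a small isotropic noise term, and the estimate is $\mathrm{diag}(v)\,{\mathbf{A}}_{f}^{H}\,{\mathbf{\Sigma}}_{x}^{-1}\,{\mathbf{x}}_{fn}$. The function name `wiener_separate`, the array layout, the names `n_ch` and `n_src`, and the `noise_var` parameter are assumptions made for illustration.

```python
import numpy as np

def wiener_separate(X, A, W_list, H_list, noise_var=1e-6):
    """Per-bin Wiener estimate of the source STFT coefficients.

    X:         (F, N, n_ch) complex mixture STFT
    A:         (F, n_ch, n_src) complex mixing matrix for each frequency bin
    W_list[j]: (F, K_j) nonnegative spectral basis of source j
    H_list[j]: (K_j, N) nonnegative temporal code of source j
    noise_var: assumed isotropic noise variance (keeps the covariance invertible)
    Returns S_hat with shape (F, N, n_src).
    """
    F, N, n_ch = X.shape
    n_src = A.shape[2]
    # Source variances v[f, n, j] = sum_k w_fk h_kn for each source j.
    V = np.stack([W @ H for W, H in zip(W_list, H_list)], axis=-1)
    S_hat = np.zeros((F, N, n_src), dtype=complex)
    eye = np.eye(n_ch)
    for f in range(F):
        Af = A[f]
        AfH = Af.conj().T
        for n in range(N):
            # Mixture covariance: A_f diag(v) A_f^H + noise_var * I.
            Sigma_x = (Af * V[f, n]) @ AfH + noise_var * eye
            # diag(v) A_f^H Sigma_x^{-1} x, using solve instead of an explicit inverse.
            S_hat[f, n] = V[f, n] * (AfH @ np.linalg.solve(Sigma_x, X[f, n]))
    return S_hat

# Toy example with 2 channels and 3 sources (synthetic values, not the paper's data).
rng = np.random.default_rng(0)
F, N = 4, 5
X_demo = rng.standard_normal((F, N, 2)) + 1j * rng.standard_normal((F, N, 2))
A_demo = rng.standard_normal((F, 2, 3)) + 1j * rng.standard_normal((F, 2, 3))
W_demo = [rng.random((F, 2)) + 0.1 for _ in range(3)]
H_demo = [rng.random((2, N)) + 0.1 for _ in range(3)]
S_demo = wiener_separate(X_demo, A_demo, W_demo, H_demo)
print(S_demo.shape)
```

For an instantaneous mixture, the same function applies with one mixing matrix repeated across all frequency bins.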

## 4. Experiments

#### 4.1. Datasets

#### 4.2. Source Signal Separation Evaluation Criteria

#### 4.3. Algorithm Parameters

#### 4.4. Underdetermined BSS in the Linear Instantaneous Case and Convolutive Mixture Case

#### 4.4.1. Music Signal Mixtures in the Linear Instantaneous Case

#### 4.4.2. Speech Signal Mixtures in the Linear Instantaneous Case

#### 4.4.3. Music Signal Mixtures in the Convolutive Case

#### 4.4.4. Speech Signal Mixtures in the Convolutive Case

**Discussion 1**. The experimental results on Datasets A, B, C, and D show that the proposed algorithm separates both music and speech signal mixtures in the underdetermined linear instantaneous and convolutive cases. Moreover, the average source separation results show that the proposed algorithm outperforms the baseline algorithms.

#### 4.5. The Runtime of All Algorithms

## 5. Conclusions and Future Work

## Author Contributions

## Funding

## Conflicts of Interest

## References

1. Wang, L.; Reiss, J.D.; Cavallaro, A. Over-determined source separation and localization using distributed microphones. IEEE/ACM Trans. Audio Speech Lang. Process. **2016**, 24, 1573–1588.
2. Loesch, B.; Yang, B. Adaptive segmentation and separation of determined convolutive mixtures under dynamic conditions. In Proceedings of the International Conference on Latent Variable Analysis and Signal Separation, St. Malo, France, 27–30 September 2010; pp. 41–48.
3. Kitamura, D.; Ono, N.; Sawada, H.; Kameoka, H.; Saruwatari, H. Determined blind source separation unifying independent vector analysis and nonnegative matrix factorization. IEEE/ACM Trans. Audio Speech Lang. Process. **2016**, 24, 1622–1637.
4. Sawada, H.; Araki, S.; Makino, S. Underdetermined convolutive blind source separation via frequency bin-wise clustering and permutation alignment. IEEE Trans. Audio Speech Lang. Process. **2011**, 19, 516–527.
5. Cho, J.; Yoo, C.D. Underdetermined convolutive BSS: Bayes risk minimization based on a mixture of super-Gaussian posterior approximation. IEEE Press **2015**, 23, 828–839.
6. Harshman, R.A. Foundations of the PARAFAC procedure: Models and conditions for an “explanatory” multi-model factor analysis. UCLA Work. Pap. Phon. **1970**, 16, 1–84.
7. Kolda, T.G.; Bader, B.W. Tensor decompositions and applications. SIAM Rev. **2009**, 51, 455–500.
8. Nion, D.; Mokios, K.N.; Sidiropoulos, N.D.; Potamianos, A. Batch and adaptive PARAFAC-based blind separation of convolutive speech mixtures. IEEE Trans. Audio Speech Lang. Process. **2010**, 18, 1193–1207.
9. Liavas, A.P.; Sidiropoulos, N.D. Parallel algorithms for constrained tensor factorization via alternating direction method of multipliers. IEEE Trans. Signal Process. **2015**, 63, 5450–5463.
10. Vincent, E. Complex Nonconvex lp Norm Minimization for Underdetermined Source Separation. In Proceedings of the International Conference on Independent Component Analysis and Signal Separation, London, UK, 9–12 September 2007; pp. 430–437.
11. Yilmaz, O.; Rickard, S. Blind separation of speech mixtures via time-frequency masking. IEEE Trans. Signal Process. **2004**, 52, 1830–1847.
12. Lee, D.D.; Seung, H.S. Learning the parts of objects by non-negative matrix factorization. Nature **1999**, 401, 788–791.
13. Gillis, N.A.; Vavasis, S. Fast and robust recursive algorithms for separable nonnegative matrix factorization. IEEE Pattern Anal. Mach. Intell. **2014**, 36, 698–714.
14. Févotte, C.; Bertin, N.; Durrieu, J.L. Nonnegative matrix factorization with the Itakura-Saito divergence: With application to music analysis. Neural Comput. **2009**, 21, 793.
15. Gao, B.; Woo, W.L.; Dlay, S.S. Variational regularized 2-D nonnegative matrix factorization. IEEE Trans. Neural Netw. Learn. Syst. **2012**, 23, 703–716.
16. Ozerov, A.; Fevotte, C. Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation. IEEE Trans. Audio Speech Lang. Process. **2010**, 18, 550–563.
17. Al-Tmeme, A.; Woo, W.L.; Dlay, S.S.; Gao, B. Underdetermined convolutive source separation using GEM-MU with variational approximated optimum model order NMF2D. IEEE/ACM Trans. Audio Speech Lang. Process. **2017**, 25, 35–49.
18. Dempster, A.P. Maximum likelihood estimation from incomplete data via the EM algorithm (with discussion). J. R. Stat. Soc. **1977**, 39, 1–38.
19. Kitamura, D.; Saruwatari, H.; Kameoka, H.; Yu, T.; Kondo, K.; Nakamura, S. Multichannel signal separation combining directional clustering and nonnegative matrix factorization with spectrogram restoration. IEEE/ACM Trans. Audio Speech Lang. Process. **2015**, 23, 654–669.
20. Sawada, H.; Kameoka, H.; Araki, S.; Ueda, N. Multichannel extensions of non-negative matrix factorization with complex-valued data. IEEE Trans. Audio Speech Lang. Process. **2013**, 21, 971–982.
21. Neeser, F.D.; Massey, J.L. Proper complex random processes with applications to information theory. IEEE Trans. Inform. Theor. **2002**, 39, 1293–1302.
22. Zhou, G.; Cichocki, A.; Zhao, Q.; Xie, S. Nonnegative matrix and tensor factorizations: An algorithmic perspective. IEEE Signal Process. Mag. **2009**, 31, 54–65.
23. Cichocki, A.; Mandic, D.; Lathauwer, L.D.; Zhou, G.; Zhao, Q.; Caiafa, C.; Phan, H.A. Tensor decompositions for signal processing applications: From two-way to multiway component analysis. IEEE Signal Process. Mag. **2015**, 32, 145–163.
24. Sidiropoulos, N.D.; Giannakis, G.B.; Bro, R. Blind PARAFAC receivers for DS-CDMA systems. IEEE Trans. Signal Process. **2000**, 48, 810–823.
25. Rajih, M.; Comon, P. Enhanced Line Search: A novel method to accelerate PARAFAC. In Proceedings of the 13th European Signal Processing Conference, Antalya, Turkey, 4–8 September 2005; pp. 1–4.
26. Nion, D.; Lathauwer, L.D. Line search computation of the block factor model for blind multi-user access in wireless communications. In Proceedings of the IEEE 7th Workshop on Signal Processing Advances in Wireless Communications, Cannes, France, 2–5 July 2006; pp. 1–4.
27. Domanov, I.; De Lathauwer, L. An Enhanced Plane Search Scheme for Complex-Valued Tensor Decompositions. In Proceedings of the 16th Conference of the International Linear Algebra Society (ILAS), Pisa, Italy, 1 June 2010.
28. De Lathauwer, L. A link between the canonical decomposition in multilinear algebra and simultaneous matrix diagonalization. SIAM J. Matrix Anal. Appl. **2006**, 28, 642–666.
29. Vincent, E.; Sawada, H.; Bofill, P.; Makino, S.; Rosca, J.P. First stereo audio source separation evaluation campaign: Data, algorithms and results. In Proceedings of the International Conference on Independent Component Analysis and Signal Separation (ICA 2007), London, UK, 9–12 September 2007; pp. 552–559.
30. Duong, N.Q.K.; Vincent, E. Under-Determined Reverberant Audio Source Separation Using a Full-Rank Spatial Covariance Model; IEEE Press: New York, NY, USA, 2010; pp. 1830–1840.
31. Arberet, S. A robust method to count and locate audio sources in a stereophonic linear instantaneous mixture. In Proceedings of the International Conference on Independent Component Analysis and Blind Signal Separation, Charleston, SC, USA, 5–8 March 2006; pp. 536–543.
32. O’Grady, P.D.; Pearlmutter, B.A. Soft-LOST: EM on a mixture of oriented lines. In Proceedings of the International Conference on Independent Component Analysis and Signal Separation, Granada, Spain, 22–24 September 2004; Volume 3195, pp. 430–436.
33. Tan, V.Y.F.; Févotte, C. Automatic Relevance Determination in Nonnegative Matrix Factorization. In Proceedings of the Signal Processing with Adaptive Sparse Structured Representations, SPARS, St-Malo, France, 6–9 April 2009; Volume 35, pp. 1592–1605.
34. Wax, M.; Kailath, T. Determining the number of signals by information theoretic criteria. In Proceedings of the IEEE International Conference on ICASSP Acoustics, Speech, and Signal Processing, San Diego, CA, USA, 19–21 March 1984; pp. 232–235.
35. He, Z.; Cichocki, A.; Xie, S.; Choi, K. Detecting the number of clusters in n-way probabilistic clustering. IEEE Trans. Pattern Anal. Mach. Intell. **2010**, 32, 2006–2021.
36. Nikunen, J.; Diment, A.; Virtanen, T. Separation of moving sound sources using multichannel NMF and acoustic tracking. IEEE/ACM Trans. Audio Speech Lang. Process. **2017**, 26, 281–295.
37. Taseska, M.; Habets, E.A.P. Blind source separation of moving sources using sparsity-based source detection and tracking. IEEE/ACM Trans. Audio Speech Lang. Process. **2017**, 26, 657–670.

**Figure 4.** A numerical example showing (**a**) waveforms of the music source signals with drum in the linear mixture case; (**b**) waveforms of the mixture signals; (**c**) waveforms of the sources estimated using the MU algorithm for the drum case [16]; (**d**) waveforms of the sources estimated using the EM algorithm for the drum case [16]; (**e**) waveforms of the sources estimated using the ${l}_{0}$ minimization algorithm for the drum case [10]; and (**f**) waveforms of the sources estimated using our proposed algorithm (Tensor-IS) in the linear instantaneous mixture case.

**Figure 9.** A numerical example showing (**a**) waveforms of the music source signals with drum in the convolutive mixture case; (**b**) waveforms of the mixture signals [16]; (**c**) waveforms of the sources estimated using the MU algorithm [16]; (**d**) waveforms of the sources estimated using the EM algorithm; (**e**) waveforms of the sources estimated using the binary masking algorithm [11]; and (**f**) waveforms of the sources estimated using the proposed algorithm (Tensor-IS) in the convolutive mixture case.

Dataset | Window Length (Samples) | Window Length (Milliseconds) | Sampling Freq. (Hz) | Iterations |
---|---|---|---|---|
A-inst | 1024 | 64 | 16000 | 200 |
B-inst | 1024 | 64 | 16000 | 200 |
C-conv | 2048 | 128 | 16000 | 500 |
D-conv | 2048 | 128 | 16000 | 500 |
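The window lengths in milliseconds follow directly from the window length in samples and the sampling frequency. The short sketch below checks that arithmetic for each dataset. The helper names `window_ms` and `n_freq_bins` are illustrative, and the bin count assumes a one-sided STFT of a real signal whose FFT length equals the window length, which the table does not state.

```python
def window_ms(window_samples, fs_hz):
    """Window duration in milliseconds for a given length in samples."""
    return 1000.0 * window_samples / fs_hz

def n_freq_bins(window_samples):
    """Number of one-sided STFT bins, assuming FFT length == window length."""
    return window_samples // 2 + 1

# (window length in samples, sampling frequency in Hz, EM iterations), taken from the table.
settings = {
    "A-inst": (1024, 16000, 200),
    "B-inst": (1024, 16000, 200),
    "C-conv": (2048, 16000, 500),
    "D-conv": (2048, 16000, 500),
}

for name, (win, fs, iters) in settings.items():
    print(name, window_ms(win, fs), "ms,", n_freq_bins(win), "bins,", iters, "iterations")
```

The convolutive datasets use a window twice as long as the instantaneous ones, which doubles the frequency resolution under the same sampling frequency.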

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Xie, Y.; Xie, K.; Yang, J.; Xie, S.
Underdetermined Blind Source Separation Combining Tensor Decomposition and Nonnegative Matrix Factorization. *Symmetry* **2018**, *10*, 521.
https://doi.org/10.3390/sym10100521
