# Underdetermined Blind Source Separation Combining Tensor Decomposition and Nonnegative Matrix Factorization

## Abstract

## 1. Introduction

## 2. Problem Formulation

#### 2.1. Linear Instantaneous Mixture Model

#### 2.2. The NMF Source Model

#### 2.3. Objective

## 3. The Proposed Optimization Algorithm

#### 3.1. Mixing Matrix Estimation Using Tensor Decomposition

#### 3.2. Source Separation Using the Baseline Methods

#### 3.2.1. ${l}_{p}$ Norm Minimization Method

#### 3.2.2. Binary Masking Method

#### 3.3. The Optimization EM Algorithm

**E-step:** Conditional Expectations of Natural Statistics

**M-step:** Update of Parameters

Algorithm 1: Proposed Algorithm for Underdetermined Linear BSS.

• Underdetermined Linear Mixture Case ($I>J$)

  - Step 1. Estimate the mixing matrix $\mathbf{A}$ by using the time-domain tensor decomposition.
  - Step 2. Perform STFT on $\mathbf{x}(t)$ to get ${\mathbf{x}}_{fn}$.
  - Step 3. Estimate the sources using (20) and detect the source spectrogram factors employing the NMF method with (7).
  - Step 4. Initialize the updated matrix, the spectral basis, and temporal code, then update these parameters using the EM algorithm, i.e., repeat until convergence:
    - (i). Update $\mathbf{A}$ with (33) in the linear mixture case.
    - (ii). Alternately update ${w}_{fk}$ and ${h}_{kn}$ with (35).
  - Step 5. Estimate ${\widehat{\mathbf{s}}}_{fn}$ by using the Wiener filter of (28).
  - Step 6. Transform ${\widehat{\mathbf{s}}}_{fn}$ into the time domain to obtain $\mathbf{s}(t)$ through the inverse STFT.
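Steps 3 and 4(ii) both factor a nonnegative source power spectrogram into a spectral basis ${w}_{fk}$ and a temporal code ${h}_{kn}$, updated alternately. Equations (7) and (35) are not reproduced in this section, so the sketch below does not implement them. As a stand-in it uses the widely known multiplicative updates for NMF under the Itakura-Saito divergence. The function names `is_nmf` and `is_divergence`, the random initialization, and the small constant `eps` are assumptions made for illustration only.

```python
import numpy as np

def is_divergence(V, V_hat, eps=1e-12):
    """Itakura-Saito divergence between a spectrogram V and its model V_hat."""
    R = (V + eps) / (V_hat + eps)
    return float(np.sum(R - np.log(R) - 1.0))

def is_nmf(V, K, n_iter=200, eps=1e-12, seed=0):
    """Factor V (F x N, nonnegative) as W @ H with W (F x K) and H (K x N).

    Stand-in for the paper's updates: standard multiplicative updates
    for the Itakura-Saito divergence, applied alternately to W and H.
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    W = rng.random((F, K)) + eps   # spectral basis, entries w_fk
    H = rng.random((K, N)) + eps   # temporal code, entries h_kn
    for _ in range(n_iter):
        # Update the spectral basis with the temporal code held fixed.
        V_hat = W @ H + eps
        W *= ((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T + eps)
        # Update the temporal code with the spectral basis held fixed.
        V_hat = W @ H + eps
        H *= (W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat) + eps)
    return W, H
```

Because every update multiplies the current factor by a ratio of nonnegative terms, `W` and `H` stay nonnegative without an explicit projection step.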

#### 3.4. Convolutive Mixed Sources Case

Algorithm 2: Proposed Algorithm for Underdetermined Convolutive BSS.

• Underdetermined Convolutive Mixture Case ($I>J$)

  - Step 1. Perform STFT on $\mathbf{x}(t)$ to get ${\mathbf{x}}_{fn}$.
  - Step 2. Estimate the mixing matrix ${\mathbf{A}}_{f}$ by using frequency-domain tensor decomposition.
  - Step 3. Estimate the sources using (22), and detect the source spectrogram factors employing the NMF method with (7).
  - Step 4. Initialize the updated matrix, the spectral basis, and temporal code, then update these parameters using the EM algorithm, i.e., repeat until convergence:
    - (i). Update ${\mathbf{A}}_{f}$ with (41) in the convolutive mixture case.
    - (ii). Alternately update ${w}_{fk}$ and ${h}_{kn}$ with (35).
  - Step 5. Estimate ${\widehat{\mathbf{s}}}_{fn}$ by using the Wiener filter of (28).
  - Step 6. Transform ${\widehat{\mathbf{s}}}_{fn}$ into the time domain to obtain $\mathbf{s}(t)$ through the inverse STFT.
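Step 5 can be illustrated with a generic multichannel Wiener filter that uses one mixing matrix per frequency bin. Equation (28) is not reproduced in this section, so this sketch assumes the usual Gaussian form: each source has variance $v_{j,fn}=\sum_k w_{fk}h_{kn}$, and the estimate is $\widehat{\mathbf{s}}_{fn}=\boldsymbol{\Sigma}_{s}\mathbf{A}_f^{H}(\mathbf{A}_f\boldsymbol{\Sigma}_{s}\mathbf{A}_f^{H}+\sigma^2\mathbf{I})^{-1}\mathbf{x}_{fn}$. The name `wiener_separate`, the array layout, and the isotropic noise variance `noise_var` are assumptions for illustration. The instantaneous case corresponds to using the same matrix for every frequency.

```python
import numpy as np

def wiener_separate(X, A, V, noise_var=1e-8):
    """Generic multichannel Wiener filter (assumed form, not the paper's (28)).

    X: (F, N, n_mix) complex STFT of the mixtures.
    A: (F, n_mix, n_src) mixing matrix for each frequency bin.
    V: (n_src, F, N) nonnegative source variances, e.g. W_j @ H_j from NMF.
    Returns S_hat with shape (F, N, n_src).
    """
    F, N, n_mix = X.shape
    n_src = A.shape[2]
    S_hat = np.zeros((F, N, n_src), dtype=complex)
    eye = np.eye(n_mix)
    for f in range(F):
        Af = A[f]
        for n in range(N):
            Sigma_s = np.diag(V[:, f, n])                         # source covariance
            Sigma_x = Af @ Sigma_s @ Af.conj().T + noise_var * eye  # mixture covariance
            gain = Sigma_s @ Af.conj().T @ np.linalg.inv(Sigma_x)   # Wiener gain
            S_hat[f, n] = gain @ X[f, n]
    return S_hat
```

With a small `noise_var`, remixing the estimates through `A` reproduces the observed mixture almost exactly, which is a convenient sanity check even when there are more sources than mixtures.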

## 4. Experiments

#### 4.1. Datasets

#### 4.2. Source Signal Separation Evaluation Criteria

#### 4.3. Algorithm Parameters

#### 4.4. Underdetermined BSS in the Linear Instantaneous Case and Convolutive Mixture Case

#### 4.4.1. Music Signal Mixtures in the Linear Instantaneous Case

#### 4.4.2. Speech Signal Mixtures in the Linear Instantaneous Case

#### 4.4.3. Music Signal Mixtures in the Convolutive Case

#### 4.4.4. Speech Signal Mixtures in the Convolutive Case

**Discussion 1**. The experimental results on Dataset A, Dataset B, Dataset C, and Dataset D show that the proposed algorithm can separate both music signal mixtures and speech signal mixtures in the underdetermined linear instantaneous and convolutive cases. Moreover, the average source separation results show that the proposed algorithm outperforms the baseline algorithms.

#### 4.5. The Runtime of All Algorithms

## 5. Conclusions and Future Work

## Author Contributions

## Funding

## Conflicts of Interest

## References

Dataset | Window Length (Samples) | Window Length (ms) | Sampling Freq. (Hz) | Iterations
---|---|---|---|---
A-inst | 1024 | 64 | 16000 | 200
B-inst | 1024 | 64 | 16000 | 200
C-conv | 2048 | 128 | 16000 | 500
D-conv | 2048 | 128 | 16000 | 500
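The two window-length columns in the table are related through the sampling frequency: the duration in milliseconds is the number of samples divided by the sampling rate, times 1000. The short check below, with the hypothetical helper name `window_ms`, confirms that the tabulated values are consistent.

```python
def window_ms(n_samples: int, fs_hz: int) -> float:
    """Duration in milliseconds of a window of n_samples at sampling rate fs_hz."""
    return 1000.0 * n_samples / fs_hz

# Values taken from the table: instantaneous and convolutive datasets.
inst_ms = window_ms(1024, 16000)   # A-inst, B-inst
conv_ms = window_ms(2048, 16000)   # C-conv, D-conv
print(inst_ms, conv_ms)
```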

