Article

A Connection Between the Kalman Filter and an Optimized LMS Algorithm for Bilinear Forms

by Laura-Maria Dogariu 1, Silviu Ciochină 1, Constantin Paleologu 1,* and Jacob Benesty 2
1 Department of Telecommunications, University Politehnica of Bucharest, 1-3, Iuliu Maniu Blvd., 061071 Bucharest, Romania
2 Energy Materials Telecommunications Research Centre, National Institute of Scientific Research (INRS-EMT), University of Quebec, Montreal, QC H5A 1K6, Canada
* Author to whom correspondence should be addressed.
Algorithms 2018, 11(12), 211; https://doi.org/10.3390/a11120211
Submission received: 1 December 2018 / Revised: 13 December 2018 / Accepted: 14 December 2018 / Published: 17 December 2018
(This article belongs to the Special Issue Adaptive Filtering Algorithms)

Abstract: The system identification problem becomes more challenging when the parameter space increases. Recently, several works have focused on the identification of bilinear forms, which are related to the impulse responses of a spatiotemporal model, in the context of a multiple-input/single-output system. In this framework, the problem was addressed in terms of the Wiener filter and different basic adaptive algorithms. This paper studies two types of algorithms tailored for the identification of such bilinear forms, i.e., the Kalman filter (along with its simplified version) and an optimized least-mean-square (LMS) algorithm. A comparison between them is also performed, which reveals interesting similarities. In addition to the mathematical derivation of the algorithms, we provide extensive experimental results, which support the theoretical findings and indicate the good performance of the proposed solutions.

1. Introduction

Bilinear systems have previously been treated in various contexts [1], such as the approximation of nonlinear systems. The resulting applications are numerous, among which we can mention system identification [2,3,4,5], the design of digital filters [6], echo cancellation [7], chaotic communications [8], active noise control [9], neural networks [10], etc. In all these papers, the bilinear term is understood in terms of an input-output relation (with respect to the data).
More recently, a new approach was studied in [11], where the bilinear term was defined in the context of a multiple-input/single-output (MISO) system, with respect to the impulse responses of a spatiotemporal model. In [11], the Wiener filter solution for the identification of such bilinear forms was proposed, and then, in [12,13,14], some adaptive solutions based on different basic algorithms were provided. Similar frameworks can be found in [15,16,17,18], in conjunction with specific applications, such as channel equalization and nonlinear acoustic echo cancellation; however, these works were not associated with the identification of bilinear forms or analyzed in this context.
The Kalman filter was first introduced in [19] and it has a wide range of applications in technology, as well as in signal processing [20]. Some of its applications in the automotive field are mentioned in [21,22], and the references therein, while [23] also provides a computational complexity analysis.
In this paper, we focus on a comparative study of two different types of algorithms tailored for the identification of bilinear forms. The first one is based on the Kalman filter [19], whose bilinear form was introduced in our previous work [13], together with a simplified (i.e., low-complexity) version, which could be more suitable in real-world applications. The second algorithm is a version of the least-mean-square (LMS) adaptive filter, in which the step-size parameter has been optimized in order to strike a proper compromise between the convergence rate and misadjustment. In addition, as will be shown, there is a strong similarity between the simplified Kalman filter and the optimized LMS algorithm. Simulations show the good performance of the proposed solutions, in comparison with other related works based on the conventional algorithms. Moreover, the gain could be twofold, in terms of both convergence performance and computational complexity.
The paper structure is as follows. Section 2 introduces the system model in the context of bilinear forms. In this framework, we present the Kalman filter together with its simplified version in Section 3. The optimized LMS algorithm is derived in Section 4. Next, a comparison between the simplified Kalman filter and the optimized LMS algorithm for bilinear forms is presented in Section 5. Section 6 provides some practical considerations. Simulations are performed in the framework of system identification and the results are given in Section 7. Finally, some conclusions are drawn in Section 8. Also, Appendix A provides detailed computations of some terms required within the optimized LMS algorithm, while Appendix B summarizes and briefly explains the main parameters used in the paper, in order to facilitate the reading. For the sake of clarity, the main notation used in this work is provided in Table 1.

2. System Model

The signal model that shall be used throughout the paper is given by [11]
d(n) = h^T(n) X(n) g(n) + v(n) = y(n) + v(n),
where d(n) is the zero-mean desired (or reference) signal at the discrete-time index n, h(n) and g(n) are the two impulse responses of the system, of lengths L and M, respectively, the superscript T is the transpose operator, X(n) = [x_1(n) \ x_2(n) \ \cdots \ x_M(n)] is the zero-mean multiple-input signal matrix, x_m(n) = [x_m(n) \ x_m(n-1) \ \cdots \ x_m(n-L+1)]^T is a vector containing the L most recent time samples of the mth (m = 1, 2, \ldots, M) input signal, y(n) = h^T(n) X(n) g(n) is the output signal, which represents a bilinear form, and v(n) is a zero-mean additive noise (with variance \sigma_v^2). It is assumed that all signals are real-valued, and that X(n) and v(n) are independent. As we can notice, y(n) is a bilinear function of h(n) and g(n): for every fixed h(n), y(n) is a linear function of g(n), and for every fixed g(n), it is a linear function of h(n).
We can rewrite the matrix X ( n ) , of size L × M , as a vector of length M L , by using the vectorization operation:
\mathrm{vec}[X(n)] = [x_1^T(n) \ x_2^T(n) \ \cdots \ x_M^T(n)]^T = \tilde{x}(n).
Therefore, the output signal can be expressed as
y(n) = h^T(n) X(n) g(n) = [g(n) \otimes h(n)]^T \tilde{x}(n) = f^T(n) \tilde{x}(n),
where \otimes denotes the Kronecker product between the individual impulse responses, while the vector f(n) = g(n) \otimes h(n), of length ML, represents the spatiotemporal (i.e., global) impulse response of the system. Consequently, the signal model in (1) becomes
d(n) = f^T(n) \tilde{x}(n) + v(n).
The major difference compared to a general MISO system stems from the fact that, in this bilinear context, f(n) is formed with only M + L different elements, despite being of length ML.
The goal is the identification of the two impulse responses h ( n ) and g ( n ) , and, in this way, of the spatiotemporal impulse response f ( n ) . For this aim, we can use two adaptive filters, h ^ ( n ) and g ^ ( n ) ; hence, the global impulse response can be evaluated as
\hat{f}(n) = \hat{g}(n) \otimes \hat{h}(n).
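The equivalence between the bilinear form h^T(n) X(n) g(n) and the linear-in-f form f^T(n) \tilde{x}(n), with f(n) = g(n) \otimes h(n), is easy to check numerically. The following NumPy sketch is our own illustration (the dimensions L = 64 and M = 8 match the experimental setup in Section 7); it verifies the identity on random data:

```python
import numpy as np

rng = np.random.default_rng(0)
L, M = 64, 8                     # temporal and spatial filter lengths

h = rng.standard_normal(L)       # temporal impulse response h(n)
g = rng.standard_normal(M)       # spatial impulse response g(n)
X = rng.standard_normal((L, M))  # multiple-input signal matrix X(n)

# Bilinear output y(n) = h^T X g
y_bilinear = h @ X @ g

# Equivalent linear form: f = g (Kronecker) h acting on vec[X]
x_tilde = X.flatten(order="F")   # stack the columns x_1, ..., x_M
f = np.kron(g, h)                # spatiotemporal response, length ML

y_linear = f @ x_tilde
print(np.allclose(y_bilinear, y_linear))  # True
```

Note that f has ML = 512 entries but, being a Kronecker product, carries only M + L = 72 degrees of freedom, which is exactly the structure the bilinear algorithms exploit.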
Let \eta \neq 0 be a real-valued constant. We can see from (1) that
\left[\frac{1}{\eta} h(n)\right]^T X(n) \left[\eta g(n)\right] = h^T(n) X(n) g(n) = y(n),
meaning that the pairs [h(n)/\eta, \eta g(n)] and [h(n), g(n)] are equivalent in the bilinear form. This implies that we can only identify \hat{h}(n) and \hat{g}(n) up to a scaling factor. A similar discussion can be found in [15,18] in the framework of blind identification/equalization and nonlinear acoustic echo cancellation, respectively. Nevertheless, because
f(n) = g(n) \otimes h(n) = [\eta g(n)] \otimes \left[\frac{1}{\eta} h(n)\right],
the global impulse response can be identified with no scaling ambiguity. Consequently, for the performance evaluation of the identification of the temporal and spatial filters, we can use the normalized projection misalignment (NPM), defined in [24]:
\mathrm{NPM}\left[h(n), \hat{h}(n)\right] = 1 - \left[\frac{h^T(n) \hat{h}(n)}{\|h(n)\| \, \|\hat{h}(n)\|}\right]^2,
\mathrm{NPM}\left[g(n), \hat{g}(n)\right] = 1 - \left[\frac{g^T(n) \hat{g}(n)}{\|g(n)\| \, \|\hat{g}(n)\|}\right]^2,
where \|\cdot\| denotes the Euclidean norm. On the other hand, for the identification of the global filter, f(n), we should use the normalized misalignment:
\mathrm{NM}\left[f(n), \hat{f}(n)\right] = \frac{\|f(n) - \hat{f}(n)\|^2}{\|f(n)\|^2}.
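Both performance measures follow directly from their definitions. In the sketch below (our own helper functions, not part of the paper), note that the NPM is invariant to the scaling ambiguity discussed above, so a scaled estimate of an individual filter counts as a perfect identification:

```python
import numpy as np

def npm(w, w_hat):
    """Normalized projection misalignment (8)-(9): invariant to any
    nonzero scaling of w_hat."""
    cos2 = (w @ w_hat) ** 2 / ((w @ w) * (w_hat @ w_hat))
    return 1.0 - cos2

def nm(f, f_hat):
    """Normalized misalignment (10) for the global (spatiotemporal) filter."""
    return np.sum((f - f_hat) ** 2) / np.sum(f ** 2)

rng = np.random.default_rng(1)
h = rng.standard_normal(64)
# A scaled (even sign-flipped) estimate gives zero projection misalignment:
print(npm(h, -2.5 * h))  # ~0.0 (up to rounding)
```

In the simulations these quantities are reported in dB, i.e., 10 log10 of the values computed above.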
In [11], this bilinear system identification problem has been studied in terms of the Wiener filter. Therefore, the assumption was that the two impulse responses that need to be identified are time-invariant (which represents a basic assumption in the context of the Wiener filter). In practice, however, these systems could vary in time. For this reason, in this paper, we approach the system identification problem considering that the systems that need to be identified vary in time. Thus, we assume that h ( n ) and g ( n ) are zero-mean random vectors, following a simplified first-order Markov model, i.e.,
h(n) = h(n-1) + w_h(n),
g(n) = g(n-1) + w_g(n),
where w_h(n) and w_g(n) are zero-mean white Gaussian noise vectors, with correlation matrices R_{w_h}(n) = \sigma_{w_h}^2 I_L and R_{w_g}(n) = \sigma_{w_g}^2 I_M, respectively (with I_L and I_M being the identity matrices of sizes L \times L and M \times M, respectively). It is considered that w_h(n) is uncorrelated with h(n-1) and v(n), while w_g(n) is uncorrelated with g(n-1) and v(n). The variances \sigma_{w_h}^2 and \sigma_{w_g}^2 capture the uncertainties in h(n) and g(n), respectively.

3. Kalman Filter for Bilinear Forms

In this section, we address the previously described system identification problem in terms of the Kalman filter, summarizing the findings from [13]. In this context, the signal model from (1) may be interpreted as the observation equation, while the system impulse responses can be considered as state equations. Given the two adaptive filters h ^ ( n ) and g ^ ( n ) , the estimated signal is given by
\hat{y}(n) = \hat{h}^T(n-1) X(n) \hat{g}(n-1).
As a result, the a priori error signal between the desired and estimated signals can be defined as
e(n) = d(n) - \hat{y}(n) = d(n) - \hat{h}^T(n-1) X(n) \hat{g}(n-1) = d(n) - \left[\hat{g}(n-1) \otimes \hat{h}(n-1)\right]^T \tilde{x}(n) = d(n) - \hat{f}^T(n-1) \tilde{x}(n) = d(n) - \hat{h}^T(n-1) x_{\hat{g}}(n) = d(n) - \hat{g}^T(n-1) x_{\hat{h}}(n),
where
x_{\hat{g}}(n) = \left[\hat{g}(n-1) \otimes I_L\right]^T \tilde{x}(n),
x_{\hat{h}}(n) = \left[I_M \otimes \hat{h}(n-1)\right]^T \tilde{x}(n).
In the context of the linear sequential Bayesian approach, the optimal estimates of the state vectors have the forms [25]:
\hat{h}(n) = \hat{h}(n-1) + k_h(n) e(n),
\hat{g}(n) = \hat{g}(n-1) + k_g(n) e(n),
where k h ( n ) and k g ( n ) are the Kalman gain vectors. Next, we can define the a posteriori misalignments (which represent the state estimation errors) related to the temporal and spatial impulse responses:
c_h(n) = \frac{1}{\eta} h(n) - \hat{h}(n),
c_g(n) = \eta g(n) - \hat{g}(n),
for which the correlation matrices are R_{c_h}(n) = E\left[c_h(n) c_h^T(n)\right] and R_{c_g}(n) = E\left[c_g(n) c_g^T(n)\right], respectively, where E[\cdot] denotes mathematical expectation. As mentioned in Section 2, we can only identify the impulse responses up to this arbitrary scaling factor \eta; however, the pair h(n)/\eta, \eta g(n) is equivalent to the pair h(n), g(n) in the bilinear form. Let us also define the a priori misalignments related to the two impulse responses:
c_h^a(n) = \frac{1}{\eta} h(n) - \hat{h}(n-1) = c_h(n-1) + \frac{1}{\eta} w_h(n),
c_g^a(n) = \eta g(n) - \hat{g}(n-1) = c_g(n-1) + \eta w_g(n),
whose correlation matrices are R_{c_h^a}(n) = E\left[c_h^a(n) c_h^{aT}(n)\right] and R_{c_g^a}(n) = E\left[c_g^a(n) c_g^{aT}(n)\right], respectively.
For the sake of simplicity of the upcoming developments, let us multiply w h ( n ) and w g ( n ) by 1 / η and η , respectively, and introduce the notation:
\bar{w}_h(n) = \frac{1}{\eta} w_h(n),
\bar{w}_g(n) = \eta w_g(n).
These new terms are also zero-mean white Gaussian noise vectors, having the correlation matrices R_{\bar{w}_h}(n) = \sigma_{\bar{w}_h}^2 I_L and R_{\bar{w}_g}(n) = \sigma_{\bar{w}_g}^2 I_M, respectively. Clearly, we have \sigma_{\bar{w}_h}^2 = \sigma_{w_h}^2 / \eta^2 and \sigma_{\bar{w}_g}^2 = \eta^2 \sigma_{w_g}^2. Consequently, using this notation in (21) and (22), we obtain
R_{c_h^a}(n) = R_{c_h}(n-1) + \sigma_{\bar{w}_h}^2 I_L,
R_{c_g^a}(n) = R_{c_g}(n-1) + \sigma_{\bar{w}_g}^2 I_M.
In this context, the Kalman gain vectors are computed from the minimization of the criteria:
J_h(n) = \frac{1}{L} \mathrm{tr}\left[R_{c_h}(n)\right],
J_g(n) = \frac{1}{M} \mathrm{tr}\left[R_{c_g}(n)\right],
with respect to k_h(n) and k_g(n), respectively, where \mathrm{tr}[\cdot] denotes the trace of a square matrix. From these minimizations, we find that
k_h(n) = \frac{R_{c_h^a}(n) x_{\hat{g}}(n)}{x_{\hat{g}}^T(n) R_{c_h^a}(n) x_{\hat{g}}(n) + \sigma_v^2},
k_g(n) = \frac{R_{c_g^a}(n) x_{\hat{h}}(n)}{x_{\hat{h}}^T(n) R_{c_g^a}(n) x_{\hat{h}}(n) + \sigma_v^2},
and
R_{c_h}(n) = \left[I_L - k_h(n) x_{\hat{g}}^T(n)\right] R_{c_h^a}(n),
R_{c_g}(n) = \left[I_M - k_g(n) x_{\hat{h}}^T(n)\right] R_{c_g^a}(n).
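As a minimal illustration of the recursions above (our own NumPy sketch with hypothetical variable names, not the authors' implementation; only the temporal-filter branch is shown, the spatial one being symmetric), one KF-BF update of the temporal filter combines (25), (29), (31), and (17):

```python
import numpy as np

def kf_bf_h_update(h_hat, R_ch, x_g_hat, e, sigma_v2, sigma_wh2):
    """One KF-BF update of the temporal filter estimate h_hat.

    R_ch is the L x L a posteriori misalignment correlation matrix,
    x_g_hat the filtered input vector (15), e the a priori error (14)."""
    L = h_hat.size
    R_cha = R_ch + sigma_wh2 * np.eye(L)                              # (25)
    k_h = (R_cha @ x_g_hat) / (x_g_hat @ R_cha @ x_g_hat + sigma_v2)  # (29)
    h_hat = h_hat + k_h * e                                           # (17)
    R_ch = (np.eye(L) - np.outer(k_h, x_g_hat)) @ R_cha               # (31)
    return h_hat, R_ch, k_h
```

When R_ch is (close to) a scaled identity r I_L, the gain above collapses to the regularized form x_g_hat / (x_g_hat^T x_g_hat + sigma_v2 / r), which is precisely the simplification exploited next.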
Summarizing, the Kalman filter for bilinear forms (namely KF-BF) is defined by Equations (17), (18), (25), (26), and (29)–(32). As we can notice, the computational complexity of this algorithm is proportional to O(L^2 + M^2). For the purpose of reducing the computational complexity of the KF-BF, a simplified version of this algorithm is derived. The idea of this low-complexity version was inspired by the work presented in [26], in the context of echo cancellation. First, let us assume that the KF-BF has reached its steady-state convergence. Consequently, R_{c_h^a}(n) and R_{c_g^a}(n) tend to become diagonal matrices, with all the elements on the main diagonal equal to small positive numbers, \sigma_{c_h^a}^2(n) and \sigma_{c_g^a}^2(n), respectively. Therefore, we can use the approximations:
R_{c_h^a}(n) \approx \sigma_{c_h^a}^2(n) I_L,
R_{c_g^a}(n) \approx \sigma_{c_g^a}^2(n) I_M.
Hence, the Kalman gain vectors for the temporal and spatial impulse responses become
k_h(n) = \frac{x_{\hat{g}}(n)}{x_{\hat{g}}^T(n) x_{\hat{g}}(n) + \delta_h(n)},
k_g(n) = \frac{x_{\hat{h}}(n)}{x_{\hat{h}}^T(n) x_{\hat{h}}(n) + \delta_g(n)},
where \delta_h(n) = \sigma_v^2 / \sigma_{c_h^a}^2(n) and \delta_g(n) = \sigma_v^2 / \sigma_{c_g^a}^2(n) can be seen as variable regularization parameters. Then, we use the Kalman gain vectors from (35) and (36) in the updates (17) and (18), respectively.
Next, a new simplification can be made, by considering that the matrices appearing in the updates of R c h ( n ) and R c g ( n ) can be approximated as
I_L - k_h(n) x_{\hat{g}}^T(n) \approx \left[1 - \frac{1}{L} k_h^T(n) x_{\hat{g}}(n)\right] I_L,
I_M - k_g(n) x_{\hat{h}}^T(n) \approx \left[1 - \frac{1}{M} k_g^T(n) x_{\hat{h}}(n)\right] I_M.
We can perform these approximations because, as the filters start to converge, the misalignments of the individual coefficients tend to become uncorrelated; due to this fact, the matrices R_{c_h}(n) and R_{c_g}(n) tend to become diagonal. Using the notation:
R_{c_h^a}(n) \approx \sigma_{c_h^a}^2(n) I_L = r_{c_h^a}(n) I_L,
R_{c_g^a}(n) \approx \sigma_{c_g^a}^2(n) I_M = r_{c_g^a}(n) I_M,
together with
R_{c_h}(n) \approx \sigma_{c_h}^2(n) I_L = r_{c_h}(n) I_L,
R_{c_g}(n) \approx \sigma_{c_g}^2(n) I_M = r_{c_g}(n) I_M,
we can summarize the simplified Kalman filter for bilinear forms (SKF-BF) in Table 2. As we can notice, its computational complexity is proportional to O ( L + M ) , which represents an important gain as compared to KF-BF.
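One full SKF-BF iteration can then be sketched as follows (our own NumPy rendering of the Table 2 recursions, with hypothetical names; the scalars r_ch and r_cg play the roles of r_{c_h}(n) and r_{c_g}(n)):

```python
import numpy as np

def skf_bf_step(state, x_tilde, d, sigma_v2, sigma_wh2, sigma_wg2):
    """One SKF-BF iteration; state = (h_hat, g_hat, r_ch, r_cg)."""
    h_hat, g_hat, r_ch, r_cg = state
    L, M = h_hat.size, g_hat.size
    X = x_tilde.reshape(L, M, order="F")  # undo the vectorization (2)

    x_g_hat = X @ g_hat          # (15): [g_hat kron I_L]^T x_tilde
    x_h_hat = X.T @ h_hat        # (16): [I_M kron h_hat]^T x_tilde
    e = d - h_hat @ x_g_hat      # a priori error (14)

    # temporal filter branch
    r_cha = r_ch + sigma_wh2
    k_h = x_g_hat / (x_g_hat @ x_g_hat + sigma_v2 / r_cha)   # (35)
    h_new = h_hat + k_h * e                                  # (17)
    r_ch = (1.0 - (k_h @ x_g_hat) / L) * r_cha               # (37)

    # spatial filter branch
    r_cga = r_cg + sigma_wg2
    k_g = x_h_hat / (x_h_hat @ x_h_hat + sigma_v2 / r_cga)   # (36)
    g_new = g_hat + k_g * e                                  # (18)
    r_cg = (1.0 - (k_g @ x_h_hat) / M) * r_cga               # (38)

    return (h_new, g_new, r_ch, r_cg), e
```

Every quantity here is a vector of length L or M (or a scalar), which is the source of the O(L + M) complexity mentioned above.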

4. Optimized LMS Algorithm for Bilinear Forms

In this section, we approach the system identification problem based on the LMS algorithm, aiming to optimize its step-size parameter in order to address the compromise between the main performance criteria, i.e., convergence rate versus misadjustment [27]. In the following, the proposed optimized LMS algorithm for bilinear forms (namely OLMS-BF) is derived based on the same system model given in Section 2. As will be explained in Section 5, this algorithm bears striking resemblances to the SKF-BF, even if their derivations follow different patterns.
Let us consider the two estimated impulse responses h ^ ( n ) and g ^ ( n ) , such that the estimated signal is given by (13). As a consequence, the a priori error signal between the desired signal and the estimated one can be defined following (14), i.e.,
e(n) = d(n) - \hat{y}(n) = \left[g(n) \otimes h(n)\right]^T \tilde{x}(n) + v(n) - \left[\hat{g}(n-1) \otimes \hat{h}(n-1)\right]^T \tilde{x}(n) = h^T(n) x_g(n) + v(n) - \hat{h}^T(n-1) x_{\hat{g}}(n) = g^T(n) x_h(n) + v(n) - \hat{g}^T(n-1) x_{\hat{h}}(n),
where x_g(n) = \left[g(n) \otimes I_L\right]^T \tilde{x}(n) and x_h(n) = \left[I_M \otimes h(n)\right]^T \tilde{x}(n), while x_{\hat{g}}(n) and x_{\hat{h}}(n) are defined in (15) and (16), respectively.
The desired signal d ( n ) may also be expressed as
d(n) = g^T(n) x_h(n) + g^T(n) x_{\hat{h}}(n) - g^T(n) x_{\hat{h}}(n) + v(n) = g^T(n) x_{\hat{h}}(n) + v(n) + v_g(n),
where the term:
v_g(n) = g^T(n) \left[x_h(n) - x_{\hat{h}}(n)\right] = c_h^{aT}(n) x_g(n) = \left[c_h(n-1) + \bar{w}_h(n)\right]^T x_g(n)
can be seen as an additional “noise” term, of variance σ v g 2 ( n ) , introduced by the system g . The terms c h and c h a were previously defined in (19) and (21), respectively. For the sake of simplicity, the scaling factor η does not appear explicitly in the following. As explained before, this parameter is included in the expression of the uncertainty parameter, thus leading to the notation from (23) and (24).
In a similar way, for the second system we have
d(n) = h^T(n) x_g(n) + h^T(n) x_{\hat{g}}(n) - h^T(n) x_{\hat{g}}(n) + v(n) = h^T(n) x_{\hat{g}}(n) + v(n) + v_h(n),
where
v_h(n) = h^T(n) \left[x_g(n) - x_{\hat{g}}(n)\right] = c_g^{aT}(n) x_h(n) = \left[c_g(n-1) + \bar{w}_g(n)\right]^T x_h(n)
can be interpreted as an additional “noise” term, of variance σ v h 2 ( n ) , related to the system h . Here, the a posteriori and a priori misalignments corresponding to the system g were defined in (20) and (22), respectively. In Figure 1 and Figure 2, the equivalent system identification scheme is represented in terms of the two components, g ( n ) and h ( n ) , respectively; it can be observed that each system influences the other one through the additional “noise” term.
In the framework of the LMS algorithm for bilinear forms, namely LMS-BF [12], the updates are the following:
\hat{g}(n) = \hat{g}(n-1) + \mu_{\hat{g}} x_{\hat{h}}(n) e(n),
\hat{h}(n) = \hat{h}(n-1) + \mu_{\hat{h}} x_{\hat{g}}(n) e(n),
where μ g ^ and μ h ^ are the step-size parameters. In this context, the vectors corresponding to the a posteriori misalignments become
c_g(n) = g(n) - \hat{g}(n-1) - \mu_{\hat{g}} x_{\hat{h}}(n) e(n) = c_g(n-1) + \bar{w}_g(n) - \mu_{\hat{g}} x_{\hat{h}}(n) e(n),
c_h(n) = h(n) - \hat{h}(n-1) - \mu_{\hat{h}} x_{\hat{g}}(n) e(n) = c_h(n-1) + \bar{w}_h(n) - \mu_{\hat{h}} x_{\hat{g}}(n) e(n).
At this point, let us introduce the notation m_g(n) = E\left[\|c_g(n)\|^2\right] and m_h(n) = E\left[\|c_h(n)\|^2\right]. Taking the squared \ell_2 norms on both sides of (49) and (50), respectively, we can recursively evaluate
m_g(n) = m_g(n-1) - 2\mu_{\hat{g}} E\left\{\left[c_g(n-1) + \bar{w}_g(n)\right]^T x_{\hat{h}}(n) e(n)\right\} + \mu_{\hat{g}}^2 E\left[x_{\hat{h}}^T(n) x_{\hat{h}}(n) e^2(n)\right] + M \sigma_{\bar{w}_g}^2 = m_g(n-1) - 2 A_g \mu_{\hat{g}} + \mu_{\hat{g}}^2 B_g + M \sigma_{\bar{w}_g}^2,
m_h(n) = m_h(n-1) - 2\mu_{\hat{h}} E\left\{\left[c_h(n-1) + \bar{w}_h(n)\right]^T x_{\hat{g}}(n) e(n)\right\} + \mu_{\hat{h}}^2 E\left[x_{\hat{g}}^T(n) x_{\hat{g}}(n) e^2(n)\right] + L \sigma_{\bar{w}_h}^2 = m_h(n-1) - 2 A_h \mu_{\hat{h}} + \mu_{\hat{h}}^2 B_h + L \sigma_{\bar{w}_h}^2,
where
A_g = E\left\{\left[c_g(n-1) + \bar{w}_g(n)\right]^T x_{\hat{h}}(n) e(n)\right\},
B_g = E\left[x_{\hat{h}}^T(n) x_{\hat{h}}(n) e^2(n)\right],
A_h = E\left\{\left[c_h(n-1) + \bar{w}_h(n)\right]^T x_{\hat{g}}(n) e(n)\right\},
B_h = E\left[x_{\hat{g}}^T(n) x_{\hat{g}}(n) e^2(n)\right].
It is very difficult to further process the expectation terms from (53)–(56) (and, consequently, (51) and (52)) without any supporting assumptions on the character of the input signals. Hence, let us consider that the covariance matrices of the inputs are close to a diagonal one. This is a fairly restrictive assumption on the input signals, which has been widely used to simplify the convergence analysis of many adaptive algorithms [27,28]. Also, let us consider that the input signals are independent and have the same power. In this context, the computations of the expectation terms from (53)–(56) are detailed in Appendix A. Summarizing the results from this appendix, these terms result in
A_g = p_g(n) + \sigma_x^2 E\left[\|\hat{h}(n-1)\|^2\right] \left[m_g(n-1) + M \sigma_{\bar{w}_g}^2\right],
A_h = p_h(n) + \sigma_x^2 E\left[\|\hat{g}(n-1)\|^2\right] \left[m_h(n-1) + L \sigma_{\bar{w}_h}^2\right],
B_g = M \sigma_x^2 E\left[\|\hat{h}(n-1)\|^2\right] \left\{\sigma_v^2 + 2 p_g(n) + \frac{1}{L} \sigma_{v_g}^2(n) + \sigma_x^2 E\left[\|\hat{h}(n-1)\|^2\right] \left[m_g(n-1) + M \sigma_{\bar{w}_g}^2\right]\right\},
B_h = L \sigma_x^2 E\left[\|\hat{g}(n-1)\|^2\right] \left\{\sigma_v^2 + 2 p_h(n) + \frac{1}{M} \sigma_{v_h}^2(n) + \sigma_x^2 E\left[\|\hat{g}(n-1)\|^2\right] \left[m_h(n-1) + L \sigma_{\bar{w}_h}^2\right]\right\},
where \sigma_x^2 denotes the variance of the (zero-mean) input signals, and the terms denoted by p_g(n) and p_h(n) are evaluated as
p_g(n) = E\left\{c_g^T(n-1) x_{\hat{h}}(n) \tilde{x}^T(n) \left[g \otimes c_h(n-1)\right]\right\} = E\left[c_g^T(n-1) x_{\hat{h}}(n) c_h^T(n-1) x_g(n)\right],
p_h(n) = E\left\{c_h^T(n-1) x_{\hat{g}}(n) \tilde{x}^T(n) \left[c_g(n-1) \otimes h\right]\right\} = E\left[c_h^T(n-1) x_{\hat{g}}(n) c_g^T(n-1) x_h(n)\right].
The expressions of the variances σ v g 2 ( n ) and σ v h 2 ( n ) are also provided in Appendix A. Consequently, using (57)–(60) in (51) and (52), we obtain
m_g(n) = m_g(n-1) \left\{1 - 2 \mu_{\hat{g}} \sigma_x^2 E\left[\|\hat{h}(n-1)\|^2\right] + \mu_{\hat{g}}^2 \sigma_x^4 M \left[E\left(\|\hat{h}(n-1)\|^2\right)\right]^2\right\} - 2 \mu_{\hat{g}} p_g(n) + \mu_{\hat{g}}^2 M \sigma_x^2 E\left[\|\hat{h}(n-1)\|^2\right] \left[\sigma_v^2 + \frac{1}{L} \sigma_{v_g}^2(n) + M \sigma_{\bar{w}_g}^2 \sigma_x^2 E\left(\|\hat{h}(n-1)\|^2\right) + 2 p_g(n)\right] + M \sigma_{\bar{w}_g}^2 \left\{1 - 2 \mu_{\hat{g}} \sigma_x^2 E\left[\|\hat{h}(n-1)\|^2\right]\right\},
m_h(n) = m_h(n-1) \left\{1 - 2 \mu_{\hat{h}} \sigma_x^2 E\left[\|\hat{g}(n-1)\|^2\right] + \mu_{\hat{h}}^2 \sigma_x^4 L \left[E\left(\|\hat{g}(n-1)\|^2\right)\right]^2\right\} - 2 \mu_{\hat{h}} p_h(n) + \mu_{\hat{h}}^2 L \sigma_x^2 E\left[\|\hat{g}(n-1)\|^2\right] \left[\sigma_v^2 + \frac{1}{M} \sigma_{v_h}^2(n) + L \sigma_{\bar{w}_h}^2 \sigma_x^2 E\left(\|\hat{g}(n-1)\|^2\right) + 2 p_h(n)\right] + L \sigma_{\bar{w}_h}^2 \left\{1 - 2 \mu_{\hat{h}} \sigma_x^2 E\left[\|\hat{g}(n-1)\|^2\right]\right\}.
In the context of system identification problems, the main goal is to reduce the system misalignment, which basically represents the difference between the true impulse response and the estimated one. Therefore, in our framework, the optimal step-size parameters (denoted in the following by \mu_{\hat{g},o} and \mu_{\hat{h},o}) can be found by minimizing (63) and (64). This is done by canceling the derivatives of (63) and (64) with respect to the step-sizes, which results in:
\frac{\partial m_g(n)}{\partial \mu_{\hat{g}}} = -2 A_g + 2 B_g \mu_{\hat{g}} = 0 \;\Rightarrow\; \mu_{\hat{g},o} = \frac{A_g}{B_g},
\frac{\partial m_h(n)}{\partial \mu_{\hat{h}}} = -2 A_h + 2 B_h \mu_{\hat{h}} = 0 \;\Rightarrow\; \mu_{\hat{h},o} = \frac{A_h}{B_h}.
By replacing A g , B g , A h , and B h with their expressions (see (57)–(60)), the step-size parameters of the proposed optimized LMS algorithm for bilinear forms (namely OLMS-BF) are found. Finally, introducing these parameters in (47) and (48), the updates of the OLMS-BF algorithm become
\hat{g}(n) = \hat{g}(n-1) + \mu_{\hat{g},o}(n) x_{\hat{h}}(n) e(n) = \hat{g}(n-1) + \frac{x_{\hat{h}}(n) e(n)}{M \sigma_x^2 E\left[\|\hat{h}(n-1)\|^2\right] \left\{1 + \dfrac{p_g(n) + \sigma_v^2 + \frac{1}{L} \sigma_{v_g}^2(n)}{p_g(n) + \sigma_x^2 E\left[\|\hat{h}(n-1)\|^2\right] \left[m_g(n-1) + M \sigma_{\bar{w}_g}^2\right]}\right\}},
\hat{h}(n) = \hat{h}(n-1) + \mu_{\hat{h},o}(n) x_{\hat{g}}(n) e(n) = \hat{h}(n-1) + \frac{x_{\hat{g}}(n) e(n)}{L \sigma_x^2 E\left[\|\hat{g}(n-1)\|^2\right] \left\{1 + \dfrac{p_h(n) + \sigma_v^2 + \frac{1}{M} \sigma_{v_h}^2(n)}{p_h(n) + \sigma_x^2 E\left[\|\hat{g}(n-1)\|^2\right] \left[m_h(n-1) + L \sigma_{\bar{w}_h}^2\right]}\right\}}.
The most problematic terms in (67) and (68) are p g ( n ) and p h ( n ) (from (61) and (62), respectively), which depend on the true impulse responses. However, as shown in the next section, these terms could be omitted in practice.

5. SKF-BF versus OLMS-BF

The SKF-BF and OLMS-BF algorithms were developed following different theoretical patterns. However, there are strong similarities between these two algorithms, as will be explained in this section.
The update equations of the SKF-BF are given by (17) and (18), where the Kalman gain vectors have the expressions in (35) and (36). It can be noticed that the updates of the SKF-BF can be expressed as
\hat{g}(n) = \hat{g}(n-1) + \mu_{\hat{g},K}(n) x_{\hat{h}}(n) e(n),
\hat{h}(n) = \hat{h}(n-1) + \mu_{\hat{h},K}(n) x_{\hat{g}}(n) e(n),
where the Kalman step-size parameters are
\mu_{\hat{g},K}(n) = \frac{1}{x_{\hat{h}}^T(n) x_{\hat{h}}(n)} \cdot \frac{1}{1 + \dfrac{M \sigma_v^2}{\left[m_g(n-1) + M \sigma_{\bar{w}_g}^2\right] x_{\hat{h}}^T(n) x_{\hat{h}}(n)}},
\mu_{\hat{h},K}(n) = \frac{1}{x_{\hat{g}}^T(n) x_{\hat{g}}(n)} \cdot \frac{1}{1 + \dfrac{L \sigma_v^2}{\left[m_h(n-1) + L \sigma_{\bar{w}_h}^2\right] x_{\hat{g}}^T(n) x_{\hat{g}}(n)}}.
Comparing these parameters with the optimal step-sizes from (67) and (68) (also taking (A7) and (A8) into account; see Appendix A), we can notice striking resemblances between SKF-BF and OLMS-BF. In fact, these two algorithms are very similar when
p_g = p_h = 0.
On the other hand, as it was indicated in [12], this could represent a reasonable assumption, since
\lim_{n \to \infty} p_g(n) = \lim_{n \to \infty} p_h(n) = 0,
suggesting that in the steady-state of the algorithm, the influence of the terms p h and p g on the step-size parameters diminishes. As will be supported in simulations, (73) can be fairly imposed within the OLMS-BF algorithm, while still leading to a very good compromise between the performance criteria (e.g., convergence rate versus misadjustment). Under these considerations, the OLMS-BF algorithm is summarized in Table 3 (in a practical form that facilitates its implementation).
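Under the assumption p_g = p_h = 0, the optimal step-sizes and the recursions for m_g(n) and m_h(n) reduce to simple scalar updates. The sketch below is our own rendering (hypothetical names throughout), in which the expectations E[‖ĥ(n−1)‖²] and E[‖ĝ(n−1)‖²] are replaced by the instantaneous norms, and the coupling variances σ_vg²(n), σ_vh²(n) are approximated from the current misalignment estimates; both are assumptions of this sketch, not formulas stated in the paper:

```python
import numpy as np

def olms_bf_stepsizes(m_g, m_h, h_hat, g_hat, sigma_x2, sigma_v2,
                      sigma_wg2, sigma_wh2):
    """Optimal step-sizes (67)-(68) with p_g = p_h = 0, plus the
    misalignment recursions (51)-(52) evaluated at the optimum."""
    L, M = h_hat.size, g_hat.size
    s_h = sigma_x2 * (h_hat @ h_hat)   # proxy for sigma_x^2 E[||h_hat||^2]
    s_g = sigma_x2 * (g_hat @ g_hat)
    # coupling "noise": v_g = c_h^a{}^T x_g, so its variance scales with
    # the misalignment of the *other* filter (an assumption of this sketch)
    var_vg = s_g * (m_h + L * sigma_wh2)
    var_vh = s_h * (m_g + M * sigma_wg2)

    a_g = m_g + M * sigma_wg2
    mu_g = a_g / (M * (sigma_v2 + var_vg / L + s_h * a_g))
    a_h = m_h + L * sigma_wh2
    mu_h = a_h / (L * (sigma_v2 + var_vh / M + s_g * a_h))

    # at the optimum, m(n) = m(n-1) + const - A^2/B = a * (1 - mu * s)
    m_g_new = a_g * (1.0 - mu_g * s_h)
    m_h_new = a_h * (1.0 - mu_h * s_g)
    return mu_g, mu_h, m_g_new, m_h_new
```

Note how each step-size shrinks as the noise term grows and as the misalignment estimate decreases, which is exactly the variable-step-size behavior of the Kalman gains (71)-(72).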

6. Practical Considerations

The previously developed algorithms are designed to identify the individual impulse responses of the bilinear form. The global (spatiotemporal) impulse response can then be computed based on the Kronecker product between them. An alternative solution is to use the regular Kalman filter to identify the spatiotemporal impulse response directly, relying on the observation Equation (4) and on the state equation:
f(n) = f(n-1) + w(n),
where w(n) is a zero-mean white Gaussian noise signal vector. The covariance matrix of w(n) is R_w(n) = \sigma_w^2 I_{ML}, where I_{ML} is the identity matrix of size ML \times ML and the variance \sigma_w^2 captures the uncertainties in f(n).
In this way, following the approach from [26], we can easily derive the regular Kalman filter (KF) and its simplified version (namely SKF), which can identify the global impulse response using a single adaptive filter \hat{f}(n); for further details, please see Sections VI and VII in [26]. However, we need to mention that the solution based on the regular KF and SKF involves an adaptive filter of length ML, whereas their counterparts tailored for bilinear forms (i.e., KF-BF and SKF-BF) use two shorter filters, of lengths L and M, respectively. As a consequence, besides a lower computational complexity, a much faster convergence rate and better tracking are expected for the bilinear algorithms with respect to the conventional ones. The same ideas apply to the OLMS-BF algorithm, as compared to its regular counterpart, i.e., the joint-optimized normalized LMS (JO-NLMS) algorithm [29], which could be used to identify the global impulse response \hat{f}(n).
The computational complexity of the previously discussed algorithms is summarized in Table 4. It can be easily seen that the SKF-BF offers a great reduction in terms of complexity with respect to the KF-BF. Also, the SKF-BF and OLMS-BF differ only by a small number of operations, thus confirming the similarity that was highlighted in Section 5. Finally, when ML \gg M + L (which is usually the case in practice), we can notice that the algorithms tailored for bilinear forms (namely KF-BF, SKF-BF, and OLMS-BF) offer lower computational complexities than their regular counterparts (i.e., KF, SKF, and JO-NLMS, respectively).
Next, a few important observations ought to be made regarding the specific parameters that must be set within the algorithms. First, the noise power \sigma_v^2 is required in order to compute the Kalman gain vectors (for KF-BF and SKF-BF) or the optimal step-sizes (for OLMS-BF). In practice, we can estimate this parameter in different ways; some simple and efficient methods for this purpose are presented in [30,31]. Although various other methods can be used to estimate the noise power, the analysis of their influence on the performance of the algorithms lies beyond the scope of this paper.
The parameters related to the uncertainties in the unknown systems also need to be set or estimated, i.e., \sigma_{\bar{w}_h}^2 and \sigma_{\bar{w}_g}^2. Choosing small values for these parameters yields a small misalignment, but at the same time a poor tracking. On the other hand, large values (meaning that there are high uncertainties in the unknown systems) lead to good tracking but also a high misalignment. This means that we always need to reach a good compromise between fast tracking and low misalignment. In practice, if we have some a priori information about the systems that we need to identify, we can take it into consideration when setting the values of these parameters. For example, if we assume the spatial impulse response to be time-invariant, we could fix \sigma_{\bar{w}_g}^2 = 0 and tune only the parameter related to the temporal impulse response. Thus, based on the state equation related to h(n), together with the approximation \|\bar{w}_h(n)\|^2 \approx L \sigma_{\bar{w}_h}^2 (which is valid when L \gg 1), and replacing h(n) and h(n-1) by their estimates, we can evaluate
\hat{\sigma}_{\bar{w}_h}^2(n) = \frac{1}{L} \left\|\hat{h}(n) - \hat{h}(n-1)\right\|^2.
It can be noticed that the estimation in (76) is designed to achieve a proper compromise between good tracking and low misalignment. When the algorithm starts to converge, or when there is an abrupt change of the system, the difference between \hat{h}(n) and \hat{h}(n-1) is significant, leading to large values of the parameter \hat{\sigma}_{\bar{w}_h}^2(n) and therefore providing fast convergence and tracking. On the contrary, when the algorithm converges to its steady-state, the difference between \hat{h}(n) and \hat{h}(n-1) reduces, thus leading to small values of \hat{\sigma}_{\bar{w}_h}^2(n) and, consequently, to a low misalignment.
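The estimator in (76) is straightforward to implement; the brief sketch below (our own code, with hypothetical names) illustrates its two regimes: it returns large values when the estimate moves a lot (abrupt system change) and small values near steady state:

```python
import numpy as np

def estimate_sigma_wh2(h_hat_new, h_hat_old):
    """Recursive uncertainty estimate (76): mean squared change of the
    temporal filter coefficients between consecutive iterations."""
    L = h_hat_new.size
    return np.sum((h_hat_new - h_hat_old) ** 2) / L

rng = np.random.default_rng(3)
h_prev = rng.standard_normal(64)
# near steady state the successive estimates barely move,
# so the estimated uncertainty (and hence the effective step) is tiny:
print(estimate_sigma_wh2(h_prev + 1e-4 * rng.standard_normal(64), h_prev))
```

The same construction, with 1/(ML) and \hat{f}(n), is used for the regular SKF in the experiments of Section 7.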

7. Results

Experiments are performed in the context of system identification, in order to highlight the performance of the Kalman-based algorithms for bilinear forms (referred to as KF-BF and SKF-BF), in comparison with their regular counterparts (KF and SKF, as mentioned in Section 6). Also, we aim to evaluate the features of the OLMS-BF algorithm, as compared to other existing solutions, e.g., the normalized LMS-BF (NLMS-BF) and the JO-NLMS algorithms, which were introduced in [12] and [29], respectively.
In most of the experiments, both the temporal and the spatial impulse responses are randomly generated from a Gaussian distribution, with lengths L = 64 and M = 8, respectively. This leads to a spatiotemporal impulse response of length ML = 8 \times 64 = 512. It is also useful to evaluate the tracking capabilities of the algorithms; to this purpose, a sudden change in the temporal impulse response is applied in the middle of the simulations, by generating a new random vector of length L = 64, also from a Gaussian distribution. Only in the last experiment, the impulse response h(n) is an acoustic echo path of length L = 512, while the coefficients of g(n) are computed as g_m(n) = 0.5^m, with m = 1, \ldots, M and M = 4; in this case, the length of the global system is ML = 4 \times 512 = 2048.
The input signals x_m(n), m = 1, 2, \ldots, M, are either white Gaussian noises (WGNs) or AR(1) processes (obtained by passing white Gaussian noise through a first-order system with the transfer function 1/(1 - 0.8 z^{-1})). The additive noise v(n) is white and Gaussian, with variance \sigma_v^2 = 0.01; we assume that this parameter is available in the experiments. In most of the simulations, the performance measure is the NM (in dB) (see (10)), used to evaluate the identification of the global impulse response. In addition, in the second set of experiments (focusing on the OLMS-BF algorithm), we also involve the NPMs (based on (8) and (9)), related to the individual impulse responses.
In Figure 3 and Figure 4, the KF-BF is compared to the regular KF for WGN and AR(1) input signals, respectively. The specific parameters of the algorithms are set to $\sigma^2_{\bar{w}_h} = \sigma^2_{\bar{w}_g} = \sigma_w^2 = 10^{-9}$. Both figures show that the KF-BF achieves a faster convergence rate than the regular KF, for both types of input signals, while also providing a better tracking capability. The gain is even more apparent in the case of AR(1) inputs.
The previous experiment is repeated (for the same two types of inputs) in Figure 5 and Figure 6, this time comparing the SKF-BF with the regular SKF [26]. As can be observed, the simplified versions (SKF-BF and SKF) yield a slower convergence rate (especially in the case of AR(1) inputs) than the full versions (KF-BF and KF, respectively); however, their computational complexities are much lower. As expected, the SKF-BF outperforms the regular SKF in terms of the convergence rate; the improvement is much more visible in the case of AR(1) inputs.
Next, the performance of the SKF-BF is evaluated in Figure 7 and Figure 8, this time using the recursive estimate $\hat{\sigma}^2_{\bar{w}_h}(n)$ from (76) (instead of a constant value, as in the previous experiments). The spatial impulse response is assumed to be time invariant, so that we can set $\sigma^2_{\bar{w}_g} = 0$. The regular SKF is considered for comparison, using a similar estimate of its specific parameter, i.e., $\hat{\sigma}_w^2(n) = \frac{1}{ML}\left\|\hat{\mathbf{f}}(n) - \hat{\mathbf{f}}(n-1)\right\|^2$ [26]. Because of the nature of these estimators (as explained in Section 6), the algorithms behave like variable step-size adaptive filters, achieving both low misalignment and fast convergence/tracking. Moreover, as the two figures show, the proposed SKF-BF still outperforms the regular SKF in terms of both performance criteria.
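The recursive variance estimator of the regular SKF quoted above can be sketched as follows (a hypothetical helper; the SKF-BF counterpart from (76), which operates on the individual responses, is not reproduced here):

```python
import numpy as np

def estimate_state_noise_variance(f_hat_now, f_hat_prev):
    """Recursive estimate used by the regular SKF (hypothetical sketch):
    sigma_w^2(n) = ||f_hat(n) - f_hat(n-1)||^2 / (M*L)."""
    return np.sum((f_hat_now - f_hat_prev) ** 2) / f_hat_now.size
```

When the filter is converging fast, consecutive estimates differ strongly and this quantity is large (large step); near steady state it shrinks, which yields the variable step-size behavior described above.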
As outlined in Section 5, there are strong similarities between the SKF-BF and OLMS-BF algorithms. In Figure 9 and Figure 10, we compare the performances of these algorithms using two types of input signals, i.e., WGNs and AR(1) processes, respectively. Both algorithms use the recursive estimate $\hat{\sigma}^2_{\bar{w}_h}(n)$ (from (76)) and $\sigma^2_{\bar{w}_g} = 0$. The SKF-BF and OLMS-BF algorithms behave quite similarly, especially when the input signals are WGNs (Figure 9). When the input signals are AR(1) processes (Figure 10), the SKF-BF outperforms the OLMS-BF in terms of the initial convergence rate, but at the price of a slower tracking reaction. Nevertheless, the overall performances of these algorithms are very similar, as supported by the comparison provided in Section 5.
In the second set of experiments, the behavior of the OLMS-BF algorithm is analyzed in comparison with the NLMS-BF algorithm [12]. The NLMS-BF algorithm uses different values of its step-size parameters, $\alpha_{\hat{h}}$ and $\alpha_{\hat{g}}$. The performances are now evaluated in terms of both NPMs and NM, using the same two types of input signals as before (WGNs and AR(1) processes). The results are presented in Figure 11 and Figure 12, using WGNs as inputs, and in Figure 13 and Figure 14, where the input signals are AR(1) processes. The proposed solution achieves a similar convergence rate but a much lower misalignment level than the NLMS-BF algorithm with $\alpha_{\hat{h}} = \alpha_{\hat{g}} = 0.5$ (which provides the fastest convergence rate [12]). On the other hand, if we target a lower misalignment and set the step-sizes of the NLMS-BF to smaller values (i.e., $\alpha_{\hat{h}} = \alpha_{\hat{g}} = 0.1$ and $\alpha_{\hat{h}} = \alpha_{\hat{g}} = 0.01$), the convergence rate also decreases. The OLMS-BF algorithm, however, reaches a misalignment level similar to that of the NLMS-BF algorithm with the smallest step-sizes. Moreover, when the input signals are AR(1) processes, the improvement offered by the OLMS-BF algorithm is even more apparent.
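As a point of reference, one NLMS-BF iteration amounts to two coupled normalized LMS updates on the short filters (a hypothetical sketch of the algorithm from [12]; the regularization constant `delta` and the variable names are our additions, and the construction of the input vectors is assumed to be done elsewhere):

```python
import numpy as np

def nlms_bf_iteration(d, x_ghat, x_hhat, h_hat, g_hat,
                      alpha_h=0.5, alpha_g=0.5, delta=1e-6):
    """One NLMS-BF iteration (hypothetical sketch): normalized LMS updates
    on the temporal and spatial filters, with step-sizes alpha_h, alpha_g."""
    e = d - x_ghat @ h_hat  # the dual expression d - x_hhat @ g_hat coincides
    h_hat = h_hat + alpha_h * x_ghat * e / (x_ghat @ x_ghat + delta)
    g_hat = g_hat + alpha_g * x_hhat * e / (x_hhat @ x_hhat + delta)
    return h_hat, g_hat, e
```

The fixed step-sizes make the convergence/misalignment trade-off explicit, which is exactly the trade-off the optimized step-sizes of the OLMS-BF remove.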
Next, the performance of the OLMS-BF algorithm is evaluated against the JO-NLMS algorithm [29], which is applied to the identification of the global impulse response of length $ML = 512$. The results are presented in Figure 15 and Figure 16, using WGN and AR(1) input signals, respectively. As specified in Section 6, the JO-NLMS algorithm is the regular counterpart of the OLMS-BF in a classical (one-dimensional) system identification scenario. The proposed solution (tailored for bilinear forms, i.e., exploiting the two-dimensional decomposition) offers faster convergence and tracking, as well as a lower misalignment, as compared to the JO-NLMS algorithm. The performance improvement is even more pronounced in the case of AR(1) input signals.
Finally, to validate our approach, we assess the performance of the OLMS-BF algorithm in a context that is closer to a real scenario. The temporal impulse response $\mathbf{h}(n)$ is a real-world echo path of length $L = 512$. The spatial impulse response $\mathbf{g}(n)$, of length $M = 4$, is generated using an exponential decay, with the elements $g_m = 0.5^m$, $m = 1, \ldots, M$. Both impulse responses are then normalized such that $\|\mathbf{h}(n)\| = \|\mathbf{g}(n)\| = 1$. The input signal is an AR(1) process and we compare the behaviors of the OLMS-BF and NLMS-BF algorithms. The performance is illustrated in Figure 17 and Figure 18. The proposed solution slightly outperforms the NLMS-BF with its fastest setting ($\alpha_{\hat{h}} = \alpha_{\hat{g}} = 0.5$) in terms of the convergence rate, while at the same time offering a much lower misalignment. If, however, we use the NLMS-BF algorithm with the smaller step-sizes (in order to obtain a better misalignment), the resulting convergence rate is much lower than that of the OLMS-BF algorithm.

8. Discussion

In this paper, we have focused on the Kalman filter tailored for the identification of bilinear forms (KF-BF), together with its simplified version (SKF-BF). Also, we have developed an optimized version of the LMS algorithm for bilinear forms, namely OLMS-BF. In addition, a comparison between the SKF-BF and OLMS-BF algorithms has been outlined, indicating strong similarities between these two solutions. In our framework, the bilinear term has been defined with respect to the impulse responses of the spatiotemporal model.
The SKF-BF provides a reduced computational complexity as compared to the KF-BF; the downside is a slower convergence rate, which is more visible for correlated inputs. On the other hand, the SKF-BF and OLMS-BF algorithms perform very similarly. Experimental results also indicate that the algorithms tailored for bilinear forms outperform their regular counterparts (in such two-dimensional system identification scenarios) in terms of convergence rate, tracking, and steady-state misalignment. Together with the reduced computational cost of using two shorter adaptive filters instead of a single (much longer) one, this leads us to conclude that the proposed algorithms represent appealing solutions for the identification of bilinear forms.

Author Contributions

Conceptualization, L.-M.D.; Formal analysis, S.C.; Software, C.P.; Methodology, J.B.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

We detail here the evaluation of the expectation terms from (53)–(56), which are required in the development of (51) and (52). First, based on (43), we can express the error as
$$
\begin{aligned}
e(n) &= \mathbf{c}_g^T(n-1)\,\mathbf{x}_{\hat{h}}(n) + \bar{\mathbf{w}}_g^T(n)\,\mathbf{x}_{\hat{h}}(n) + v(n) + v_g(n) \\
&= \mathbf{c}_g^T(n-1)\,\mathbf{x}_{\hat{h}}(n) + \bar{\mathbf{w}}_g^T(n)\,\mathbf{x}_{\hat{h}}(n) + v(n) + \mathbf{c}_h^T(n-1)\,\mathbf{x}_g(n) + \bar{\mathbf{w}}_h^T(n)\,\mathbf{x}_g(n).
\end{aligned}
\tag{A1}
$$
Using this relation, together with (14) and (16), and taking into account that $\mathbf{c}_g(n-1)$ and $\mathbf{x}_{\hat{h}}(n)$ are uncorrelated, the term from (53) results in
$$
\begin{aligned}
A_g &= E\left\{\left[\mathbf{c}_g^T(n-1) + \bar{\mathbf{w}}_g^T(n)\right]\mathbf{x}_{\hat{h}}(n)\,e(n)\right\} \\
&= E\left\{\left[\mathbf{c}_g^T(n-1) + \bar{\mathbf{w}}_g^T(n)\right]\mathbf{x}_{\hat{h}}(n)\left[v(n) + \bar{\mathbf{w}}_g^T(n)\mathbf{x}_{\hat{h}}(n) + v_g(n) + \mathbf{c}_g^T(n-1)\mathbf{x}_{\hat{h}}(n)\right]\right\} \\
&= p_g(n) + E\left[\mathbf{c}_g^T(n-1)\mathbf{x}_{\hat{h}}(n)\mathbf{x}_{\hat{h}}^T(n)\mathbf{c}_g(n-1) + \bar{\mathbf{w}}_g^T(n)\mathbf{x}_{\hat{h}}(n)\mathbf{x}_{\hat{h}}^T(n)\bar{\mathbf{w}}_g(n)\right] \\
&= p_g(n) + \mathrm{tr}\left\{E\left[\mathbf{c}_g(n-1)\mathbf{c}_g^T(n-1) + \bar{\mathbf{w}}_g(n)\bar{\mathbf{w}}_g^T(n)\right]E\left[\mathbf{x}_{\hat{h}}(n)\mathbf{x}_{\hat{h}}^T(n)\right]\right\} \\
&= p_g(n) + \frac{1}{M}\left[m_g(n-1) + M\sigma^2_{\bar{w}_g}\right]\mathrm{tr}\left\{E\left[\mathbf{x}_{\hat{h}}(n)\mathbf{x}_{\hat{h}}^T(n)\right]\right\},
\end{aligned}
\tag{A2}
$$
where $p_g(n)$ is given in (61).
Next, we should concentrate on the last expectation term in (A2), which can be expressed as
$$
E\left[\mathbf{x}_{\hat{h}}(n)\mathbf{x}_{\hat{h}}^T(n)\right] = E\left\{\left[\mathbf{I}_M \otimes \hat{\mathbf{h}}(n-1)\right]^T \tilde{\mathbf{x}}(n)\tilde{\mathbf{x}}^T(n)\left[\mathbf{I}_M \otimes \hat{\mathbf{h}}(n-1)\right]\right\}.
\tag{A3}
$$
The main diagonal terms of this matrix are $E\left[\hat{\mathbf{h}}^T(n-1)\mathbf{x}_m(n)\mathbf{x}_m^T(n)\hat{\mathbf{h}}(n-1)\right]$, $m = 1, 2, \ldots, M$. In the following, we assume that the input signals are independent and have the same power, and that their covariance matrices are close to diagonal [27,28]. Consequently,
$$
\begin{aligned}
E\left[\hat{\mathbf{h}}^T(n-1)\mathbf{x}_m(n)\mathbf{x}_m^T(n)\hat{\mathbf{h}}(n-1)\right] &= \mathrm{tr}\left\{E\left[\mathbf{x}_m(n)\mathbf{x}_m^T(n)\hat{\mathbf{h}}(n-1)\hat{\mathbf{h}}^T(n-1)\right]\right\} \\
&= \mathrm{tr}\left\{E\left[\mathbf{x}_m(n)\mathbf{x}_m^T(n)\right]E\left[\hat{\mathbf{h}}(n-1)\hat{\mathbf{h}}^T(n-1)\right]\right\} \\
&= \sigma_x^2\,E\left\|\hat{\mathbf{h}}(n-1)\right\|^2.
\end{aligned}
\tag{A4}
$$
Finally, using (A3) and (A4) in (A2), we obtain
$$
A_g = p_g(n) + \sigma_x^2\,E\left\|\hat{\mathbf{h}}(n-1)\right\|^2\left[m_g(n-1) + M\sigma^2_{\bar{w}_g}\right].
\tag{A5}
$$
In a similar manner, the corresponding term from (55) is derived as
$$
A_h = p_h(n) + \sigma_x^2\,E\left\|\hat{\mathbf{g}}(n-1)\right\|^2\left[m_h(n-1) + L\sigma^2_{\bar{w}_h}\right].
\tag{A6}
$$
The terms $p_g(n)$ and $p_h(n)$ are given in (61) and (62), respectively.
Further, we detail the evaluation of the expectation term from (54). To begin, let us focus on the product $\mathbf{x}_{\hat{h}}^T(n)\mathbf{x}_{\hat{h}}(n)$. Relying on the same considerations and assumptions as in (A3) and (A4), we obtain
$$
\begin{aligned}
\mathbf{x}_{\hat{h}}^T(n)\mathbf{x}_{\hat{h}}(n) &= \tilde{\mathbf{x}}^T(n)\left[\mathbf{I}_M \otimes \hat{\mathbf{h}}(n-1)\right]\left[\mathbf{I}_M \otimes \hat{\mathbf{h}}(n-1)\right]^T \tilde{\mathbf{x}}(n) \\
&= \mathrm{tr}\left\{\left[\mathbf{I}_M \otimes \hat{\mathbf{h}}(n-1)\right]^T \tilde{\mathbf{x}}(n)\tilde{\mathbf{x}}^T(n)\left[\mathbf{I}_M \otimes \hat{\mathbf{h}}(n-1)\right]\right\} \\
&\approx M\sigma_x^2\left\|\hat{\mathbf{h}}(n-1)\right\|^2.
\end{aligned}
\tag{A7}
$$
Similarly,
$$
\mathbf{x}_{\hat{g}}^T(n)\mathbf{x}_{\hat{g}}(n) \approx L\sigma_x^2\left\|\hat{\mathbf{g}}(n-1)\right\|^2.
\tag{A8}
$$
Hence, considering some degree of stationarity of the input signals, (A7) can be treated as a deterministic quantity, yielding
$$
B_g = E\left[\mathbf{x}_{\hat{h}}^T(n)\mathbf{x}_{\hat{h}}(n)\,e^2(n)\right] \approx M\sigma_x^2\,E\left\|\hat{\mathbf{h}}(n-1)\right\|^2 E\left[e^2(n)\right].
\tag{A9}
$$
Let us now focus on the computation of the expectation term $E\left[e^2(n)\right]$. Using (A1), we obtain
$$
E\left[e^2(n)\right] = \sigma_v^2 + E\left[\bar{\mathbf{w}}_g^T(n)\mathbf{x}_{\hat{h}}(n)\mathbf{x}_{\hat{h}}^T(n)\bar{\mathbf{w}}_g(n)\right] + E\left\{\left[v_g(n) + \mathbf{x}_{\hat{h}}^T(n)\mathbf{c}_g(n-1)\right]\left[v_g(n) + \mathbf{c}_g^T(n-1)\mathbf{x}_{\hat{h}}(n)\right]\right\}.
\tag{A10}
$$
At this point, we need to evaluate the variance of $v_g(n)$, which can be developed as
$$
\begin{aligned}
\sigma^2_{v_g}(n) = E\left[v_g^2(n)\right] &= E\left\{\left[\mathbf{c}_h^T(n-1)\mathbf{x}_g(n) + \bar{\mathbf{w}}_h^T(n)\mathbf{x}_g(n)\right]\left[\mathbf{x}_g^T(n)\mathbf{c}_h(n-1) + \mathbf{x}_g^T(n)\bar{\mathbf{w}}_h(n)\right]\right\} \\
&= E\left[\mathbf{c}_h^T(n-1)\mathbf{x}_g(n)\mathbf{x}_g^T(n)\mathbf{c}_h(n-1) + \bar{\mathbf{w}}_h^T(n)\mathbf{x}_g(n)\mathbf{x}_g^T(n)\bar{\mathbf{w}}_h(n)\right] \\
&= \mathrm{tr}\left\{E\left[\mathbf{c}_h(n-1)\mathbf{c}_h^T(n-1)\right]E\left[\mathbf{x}_g(n)\mathbf{x}_g^T(n)\right]\right\} + \sigma^2_{\bar{w}_h}\,\mathrm{tr}\left\{E\left[\mathbf{x}_g(n)\mathbf{x}_g^T(n)\right]\right\} \\
&= E\left\|\mathbf{x}_g(n)\right\|^2\left[L\sigma^2_{\bar{w}_h} + m_h(n-1)\right] \\
&= L\sigma_x^2\,E\left\|\mathbf{g}(n)\right\|^2\left[L\sigma^2_{\bar{w}_h} + m_h(n-1)\right],
\end{aligned}
\tag{A11}
$$
where $\sigma_x^2$ denotes the variance of the input signal. Therefore, (A10) results in
$$
E\left[e^2(n)\right] = \sigma_v^2 + \sigma_x^2\left[M\sigma^2_{\bar{w}_g} + m_g(n-1)\right]E\left\|\hat{\mathbf{h}}(n-1)\right\|^2 + \frac{1}{L}\sigma^2_{v_g}(n) + 2p_g(n),
\tag{A12}
$$
thus obtaining
$$
B_g = M\sigma_x^2\,E\left\|\hat{\mathbf{h}}(n-1)\right\|^2\left\{\sigma_v^2 + 2p_g(n) + \frac{1}{L}\sigma^2_{v_g}(n) + \sigma_x^2\,E\left\|\hat{\mathbf{h}}(n-1)\right\|^2\left[m_g(n-1) + M\sigma^2_{\bar{w}_g}\right]\right\}.
\tag{A13}
$$
For the corresponding term from (56), we use the dual expression for $e(n)$ (see (14)), which leads to
$$
E\left[e^2(n)\right] = \sigma_v^2 + \sigma_x^2\left[L\sigma^2_{\bar{w}_h} + m_h(n-1)\right]E\left\|\hat{\mathbf{g}}(n-1)\right\|^2 + \frac{1}{M}\sigma^2_{v_h}(n) + 2p_h(n),
\tag{A14}
$$
where (similar to (A11))
$$
\sigma^2_{v_h}(n) = E\left[v_h^2(n)\right] = M\sigma_x^2\,E\left\|\mathbf{h}(n)\right\|^2\left[M\sigma^2_{\bar{w}_g} + m_g(n-1)\right].
\tag{A15}
$$
Thus, we finally obtain
$$
B_h = L\sigma_x^2\,E\left\|\hat{\mathbf{g}}(n-1)\right\|^2\left\{\sigma_v^2 + 2p_h(n) + \frac{1}{M}\sigma^2_{v_h}(n) + \sigma_x^2\,E\left\|\hat{\mathbf{g}}(n-1)\right\|^2\left[m_h(n-1) + L\sigma^2_{\bar{w}_h}\right]\right\}.
\tag{A16}
$$
Summarizing, we can use (A5), (A6), (A13), and (A16) in (51)–(52), in order to obtain the recursive relations from (63)–(64), which are further used in the development of the OLMS-BF algorithm.

Appendix B

To facilitate the reading, Table A1 summarizes the main parameters used throughout the paper.
Table A1. The main parameters used throughout the paper.
| Parameter | Significance |
| $\mathbf{X}(n)$ | zero-mean multiple-input signal matrix |
| $\tilde{\mathbf{x}}(n) = \mathrm{vec}[\mathbf{X}(n)]$ | input signal vector |
| $y(n)$ | output signal at time $n$ (i.e., the bilinear form) |
| $d(n)$ | zero-mean desired signal at time $n$ |
| $v(n)$ | zero-mean additive noise, of variance $\sigma_v^2$ |
| $\mathbf{h}(n)$, $\mathbf{g}(n)$ | true impulse responses of the system, of lengths $L$ and $M$, respectively |
| $\eta$ | arbitrary scaling parameter |
| $\mathbf{f}(n)$ | true spatiotemporal (i.e., global) impulse response of the system, of length $ML$ |
| $\hat{\mathbf{h}}(n)$, $\hat{\mathbf{g}}(n)$ | estimated impulse responses of the system |
| $\hat{\mathbf{f}}(n) = \hat{\mathbf{g}}(n) \otimes \hat{\mathbf{h}}(n)$ | estimated spatiotemporal impulse response of the system |
| $\mathbf{x}_{\hat{h}}(n)$, $\mathbf{x}_{\hat{g}}(n)$ | the input signals from Figure 1 and Figure 2 |
| $\hat{y}(n)$ | estimated output signal at time $n$ |
| $e(n)$ | error signal between the desired and estimated signals at time $n$ |
| $\bar{\mathbf{w}}_h(n)$, $\bar{\mathbf{w}}_g(n)$ | zero-mean white Gaussian noise vectors, with correlation matrices $\mathbf{R}_{\bar{w}_h}(n) = \sigma^2_{\bar{w}_h}\mathbf{I}_L$ and $\mathbf{R}_{\bar{w}_g}(n) = \sigma^2_{\bar{w}_g}\mathbf{I}_M$, respectively |
| $\mathbf{k}_h(n)$, $\mathbf{k}_g(n)$ | Kalman gain vectors |
| $\mathbf{c}_h(n)$, $\mathbf{c}_g(n)$ | a posteriori misalignments corresponding to the two impulse responses, with correlation matrices $\mathbf{R}_{c_h}(n)$ and $\mathbf{R}_{c_g}(n)$, respectively |
| $\mathbf{c}_h^a(n)$, $\mathbf{c}_g^a(n)$ | a priori misalignments corresponding to the two impulse responses, with correlation matrices $\mathbf{R}_{c_h^a}(n)$ and $\mathbf{R}_{c_g^a}(n)$, respectively |
| $m_h(n) = E\|\mathbf{c}_h(n)\|^2$, $m_g(n) = E\|\mathbf{c}_g(n)\|^2$ | squared norms of the a posteriori misalignments |
| $\mu_{\hat{h},o}$, $\mu_{\hat{g},o}$ | optimal step-size parameters corresponding to the two adaptive filters |

References

  1. Mohler, R.R.; Kolodziej, W.J. An overview of bilinear system theory and applications. IEEE Trans. Syst. Man Cybern. 1980, 10, 683–688.
  2. Baik, H.K.; Mathews, V.J. Adaptive lattice bilinear filters. IEEE Trans. Signal Process. 1993, 41, 2033–2046.
  3. Han, S.; Kim, J.; Sung, K. Extended generalized total least squares method for the identification of bilinear systems. IEEE Trans. Signal Process. 1996, 44, 1015–1018.
  4. Tsoulkas, V.; Koukoulas, P.; Kalouptsidis, N. Identification of input-output bilinear systems using cumulants. IEEE Trans. Signal Process. 2001, 51, 2753–2761.
  5. Dos Santos, P.L.; Ramos, J.A.; de Carvalho, J.L.M. Identification of bilinear systems with white noise inputs: an iterative deterministic-stochastic subspace approach. IEEE Trans. Control Syst. Technol. 2009, 17, 1145–1153.
  6. Forssén, U. Adaptive bilinear digital filters. IEEE Trans. Circuits Syst. II Analog Digital Signal Process. 1993, 40, 729–735.
  7. Hu, R.; Hassan, H.M. Echo cancellation in high speed data transmission systems using adaptive layered bilinear filters. IEEE Trans. Commun. 1994, 42, 655–663.
  8. Zhu, Z.; Leung, H. Adaptive identification of nonlinear systems with application to chaotic communications. IEEE Trans. Circuits Syst. I Fundam. Theory Appl. 2000, 47, 1072–1080.
  9. Kuo, S.M.; Wu, H.-T. Nonlinear adaptive bilinear filters for active noise control systems. IEEE Trans. Circuits Syst. I Regul. Pap. 2005, 52, 617–624.
  10. Zhao, H.; Zeng, X.; He, Z. Low-complexity nonlinear adaptive filter based on a pipelined bilinear recurrent neural network. IEEE Trans. Neural Netw. 2011, 22, 1494–1507.
  11. Benesty, J.; Paleologu, C.; Ciochină, S. On the identification of bilinear forms with the Wiener filter. IEEE Signal Process. Lett. 2017, 24, 653–657.
  12. Paleologu, C.; Benesty, J.; Ciochină, S. Adaptive filtering for the identification of bilinear forms. Digital Signal Process. 2018, 75, 153–167.
  13. Dogariu, L.; Paleologu, C.; Ciochină, S.; Benesty, J.; Piantanida, P. Identification of bilinear forms with the Kalman filter. In Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, 15–20 April 2018; pp. 4134–4138.
  14. Elisei-Iliescu, C.; Stanciu, C.; Paleologu, C.; Benesty, J.; Anghel, C.; Ciochină, S. Efficient recursive least-squares algorithms for the identification of bilinear forms. Digital Signal Process. 2018, 83, 280–296.
  15. Gesbert, D.; Duhamel, P. Robust blind joint data/channel estimation based on bilinear optimization. In Proceedings of the 8th Workshop on Statistical Signal and Array Processing, Corfu, Greece, 24–26 June 1996; pp. 168–171.
  16. Stenger, A.; Kellermann, W.; Rabenstein, R. Adaptation of acoustic echo cancellers incorporating a memoryless nonlinearity. In Proceedings of the IEEE Workshop on Acoustic Echo and Noise Control (IWAENC’99), Pocono Manor, PA, USA, 27–29 September 1999.
  17. Stenger, A.; Kellermann, W. Adaptation of a memoryless preprocessor for nonlinear acoustic echo cancelling. Signal Process. 2000, 80, 1747–1760.
  18. Huang, Y.; Skoglund, J.; Luebs, A. Practically efficient nonlinear acoustic echo cancellers using cascaded block RLS and FLMS adaptive filters. In Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, LA, USA, 5–9 March 2017; pp. 596–600.
  19. Kalman, R.E. A new approach to linear filtering and prediction problems. J. Basic Eng. 1960, 82, 35–45.
  20. Zarchan, P.; Musoff, H. Fundamentals of Kalman Filtering: A Practical Approach, 4th ed.; American Institute of Aeronautics and Astronautics, Incorporated: Reston, VA, USA, 2015.
  21. Hsieh, C.-S. Robust two-stage Kalman filters for systems with unknown inputs. IEEE Trans. Autom. Control 2000, 45, 2374–2378.
  22. Chen, L.; Mercorelli, P.; Liu, S. A Kalman estimator for detecting repetitive disturbances. In Proceedings of the 2005 American Control Conference, Portland, OR, USA, 8–10 June 2005; pp. 1631–1636.
  23. Mercorelli, P. A motion-sensorless control for intake valves in combustion engines. IEEE Trans. Ind. Electron. 2016, 64, 3402–3412.
  24. Morgan, D.R.; Benesty, J.; Sondhi, M.M. On the evaluation of estimated impulse responses. IEEE Signal Process. Lett. 1998, 5, 174–176.
  25. Kay, S.M. Fundamentals of Statistical Signal Processing, Volume I: Estimation Theory; Prentice Hall: Englewood Cliffs, NJ, USA, 1993.
  26. Paleologu, C.; Benesty, J.; Ciochină, S. Study of the general Kalman filter for echo cancellation. IEEE Trans. Audio Speech Lang. Process. 2013, 21, 1539–1549.
  27. Haykin, S.; Widrow, B. (Eds.) Least-Mean-Square Adaptive Filters; Wiley: Hoboken, NJ, USA, 2003.
  28. Sulyman, A.I.; Zerguine, A. Convergence and steady-state analysis of a variable step-size NLMS algorithm. Signal Process. 2003, 83, 1255–1273.
  29. Ciochină, S.; Paleologu, C.; Benesty, J. An optimized NLMS algorithm for system identification. Signal Process. 2016, 118, 115–121.
  30. Iqbal, M.A.; Grant, S.L. Novel variable step size NLMS algorithm for echo cancellation. In Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, Las Vegas, NV, USA, 31 March–4 April 2008; pp. 241–244.
  31. Paleologu, C.; Ciochină, S.; Benesty, J. Double-talk robust VSS-NLMS algorithm for under-modeling acoustic echo cancellation. In Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, Las Vegas, NV, USA, 31 March–4 April 2008; pp. 245–248.
Figure 1. Equivalent system identification scheme when considering the system $\mathbf{g}(n)$ and the input $\mathbf{x}_{\hat{h}}(n)$.
Figure 2. Equivalent system identification scheme when considering the system $\mathbf{h}(n)$ and the input $\mathbf{x}_{\hat{g}}(n)$.
Figure 3. Normalized misalignment of the KF-BF and regular KF using WGNs as input signals. The length of the global impulse response is $ML = 512$. The specific parameters are set to $\sigma^2_{\bar{w}_h} = \sigma^2_{\bar{w}_g} = \sigma_w^2 = 10^{-9}$.
Figure 4. Normalized misalignment of the KF-BF and regular KF using AR(1) processes as input signals. The length of the global impulse response is $ML = 512$. The specific parameters are set to $\sigma^2_{\bar{w}_h} = \sigma^2_{\bar{w}_g} = \sigma_w^2 = 10^{-9}$.
Figure 5. Normalized misalignment of the SKF-BF and regular SKF using WGNs as input signals. The length of the global impulse response is $ML = 512$. The specific parameters are set to $\sigma^2_{\bar{w}_h} = \sigma^2_{\bar{w}_g} = \sigma_w^2 = 10^{-9}$.
Figure 6. Normalized misalignment of the SKF-BF and regular SKF using AR(1) processes as input signals. The length of the global impulse response is $ML = 512$. The specific parameters are set to $\sigma^2_{\bar{w}_h} = \sigma^2_{\bar{w}_g} = \sigma_w^2 = 10^{-9}$.
Figure 7. Normalized misalignment of the SKF-BF and regular SKF (for WGN input signals), using the recursive estimates $\hat{\sigma}^2_{\bar{w}_h}(n)$ and $\hat{\sigma}_w^2(n)$, respectively; the SKF-BF uses $\sigma^2_{\bar{w}_g} = 0$. The length of the global impulse response is $ML = 512$.
Figure 8. Normalized misalignment of the SKF-BF and regular SKF (for AR(1) input signals), using the recursive estimates $\hat{\sigma}^2_{\bar{w}_h}(n)$ and $\hat{\sigma}_w^2(n)$, respectively; the SKF-BF uses $\sigma^2_{\bar{w}_g} = 0$. The length of the global impulse response is $ML = 512$.
Figure 9. Normalized misalignment of the SKF-BF and OLMS-BF algorithms using WGNs as input signals. Both algorithms use the recursive estimate $\hat{\sigma}^2_{\bar{w}_h}(n)$ and $\sigma^2_{\bar{w}_g} = 0$. The length of the global impulse response is $ML = 512$.
Figure 10. Normalized misalignment of the SKF-BF and OLMS-BF algorithms using AR(1) processes as input signals. Both algorithms use the recursive estimate $\hat{\sigma}^2_{\bar{w}_h}(n)$ and $\sigma^2_{\bar{w}_g} = 0$. The length of the global impulse response is $ML = 512$.
Figure 11. Normalized projection misalignment of the OLMS-BF and NLMS-BF (using different step-size parameters): (Top) identification of the temporal impulse response $\mathbf{h}(n)$; (Bottom) identification of the spatial impulse response $\mathbf{g}(n)$. The input signals are WGNs, $L = 64$, and $M = 8$.
Figure 12. Normalized misalignment of the OLMS-BF and NLMS-BF (using different step-size parameters). The input signals are WGNs and $ML = 512$.
Figure 13. Normalized projection misalignment of the OLMS-BF and NLMS-BF (using different step-size parameters): (Top) identification of the temporal impulse response $\mathbf{h}(n)$; (Bottom) identification of the spatial impulse response $\mathbf{g}(n)$. The input signals are AR(1) processes, $L = 64$, and $M = 8$.
Figure 14. Normalized misalignment of the OLMS-BF and NLMS-BF (using different step-size parameters). The input signals are AR(1) processes and $ML = 512$.
Figure 15. Normalized misalignment of the OLMS-BF and regular JO-NLMS algorithms. The input signals are WGNs and $ML = 512$.
Figure 16. Normalized misalignment of the OLMS-BF and regular JO-NLMS algorithms. The input signals are AR(1) processes and $ML = 512$.
Figure 17. Normalized projection misalignment of the OLMS-BF and NLMS-BF (using different step-size parameters): (Top) identification of the temporal impulse response $\mathbf{h}(n)$; (Bottom) identification of the spatial impulse response $\mathbf{g}(n)$. The input signals are AR(1) processes, $L = 512$, and $M = 4$.
Figure 18. Normalized misalignment of the OLMS-BF and NLMS-BF (using different step-size parameters). The input signals are AR(1) processes and $ML = 2048$.
Table 1. Notation used throughout the paper.
| Symbol | Significance |
| $a$ | scalar |
| $\mathbf{a}$ | vector |
| $\mathbf{A}$ | matrix |
| $(\cdot)^T$ | transpose operator |
| $\otimes$ | Kronecker product operator |
| $\|\mathbf{b}\|$ | Euclidean norm of the vector $\mathbf{b}$ |
| $E(\cdot)$ | mathematical expectation |
| $\mathbf{I}_L$ | identity matrix of size $L \times L$ |
| $\mathrm{tr}(\mathbf{B})$ | trace of the square matrix $\mathbf{B}$ |
| $\mathrm{vec}(\mathbf{C}) = \mathbf{c}$ | vectorization operation, i.e., conversion of a matrix ($\mathbf{C}$ of size $L \times M$) into a vector ($\mathbf{c}$ of length $ML$) |
Table 2. Simplified Kalman filter for bilinear forms (SKF-BF).
Initialization:
$\hat{\mathbf{h}}(0) = [1\ 0\ \cdots\ 0]^T$, $\hat{\mathbf{g}}(0) = \frac{1}{M}[1\ 1\ \cdots\ 1]^T$
$r_{c_h}(0) = \epsilon_h$, $r_{c_g}(0) = \epsilon_g$ (positive constants)
Parameters: $\sigma^2_{\bar{w}_h}$, $\sigma^2_{\bar{w}_g}$, $\sigma_v^2$ (known or estimated)
Algorithm:
$r_{c_h^a}(n) = r_{c_h}(n-1) + \sigma^2_{\bar{w}_h}(n)$
$r_{c_g^a}(n) = r_{c_g}(n-1) + \sigma^2_{\bar{w}_g}(n)$
$\delta_h(n) = \sigma_v^2 / r_{c_h^a}(n)$
$\delta_g(n) = \sigma_v^2 / r_{c_g^a}(n)$
$e(n) = d(n) - \mathbf{x}_{\hat{g}}^T(n)\hat{\mathbf{h}}(n-1) = d(n) - \mathbf{x}_{\hat{h}}^T(n)\hat{\mathbf{g}}(n-1)$
$\hat{\mathbf{h}}(n) = \hat{\mathbf{h}}(n-1) + \dfrac{\mathbf{x}_{\hat{g}}(n)\,e(n)}{\mathbf{x}_{\hat{g}}^T(n)\mathbf{x}_{\hat{g}}(n) + \delta_h(n)}$
$\hat{\mathbf{g}}(n) = \hat{\mathbf{g}}(n-1) + \dfrac{\mathbf{x}_{\hat{h}}(n)\,e(n)}{\mathbf{x}_{\hat{h}}^T(n)\mathbf{x}_{\hat{h}}(n) + \delta_g(n)}$
$r_{c_h}(n) = \left[1 - \dfrac{\mathbf{x}_{\hat{g}}^T(n)\mathbf{x}_{\hat{g}}(n)}{L\left(\mathbf{x}_{\hat{g}}^T(n)\mathbf{x}_{\hat{g}}(n) + \delta_h(n)\right)}\right] r_{c_h^a}(n)$
$r_{c_g}(n) = \left[1 - \dfrac{\mathbf{x}_{\hat{h}}^T(n)\mathbf{x}_{\hat{h}}(n)}{M\left(\mathbf{x}_{\hat{h}}^T(n)\mathbf{x}_{\hat{h}}(n) + \delta_g(n)\right)}\right] r_{c_g^a}(n)$
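A direct transcription of the SKF-BF recursion in Table 2 might look as follows (a hypothetical NumPy sketch; the construction of the input vectors `x_ghat` and `x_hhat` from the multichannel input is assumed to be done elsewhere):

```python
import numpy as np

def skf_bf_iteration(d, x_ghat, x_hhat, h_hat, g_hat,
                     r_ch, r_cg, sigma_wh2, sigma_wg2, sigma_v2):
    """One iteration of the SKF-BF recursion from Table 2 (hypothetical sketch)."""
    L, M = h_hat.size, g_hat.size
    # a priori misalignment variances and regularization terms
    r_cha = r_ch + sigma_wh2
    r_cga = r_cg + sigma_wg2
    delta_h = sigma_v2 / r_cha
    delta_g = sigma_v2 / r_cga
    # error signal (the two expressions in Table 2 coincide)
    e = d - x_ghat @ h_hat
    # filter updates
    nh = x_ghat @ x_ghat
    ng = x_hhat @ x_hhat
    h_hat = h_hat + x_ghat * e / (nh + delta_h)
    g_hat = g_hat + x_hhat * e / (ng + delta_g)
    # a posteriori misalignment variances
    r_ch = (1.0 - nh / (L * (nh + delta_h))) * r_cha
    r_cg = (1.0 - ng / (M * (ng + delta_g))) * r_cga
    return h_hat, g_hat, r_ch, r_cg, e
```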
Table 3. Optimized LMS algorithm for bilinear forms (OLMS-BF) (practical version, i.e., using (73)).
Initialization:
$\hat{\mathbf{h}}(0) = [1\ 0\ \cdots\ 0]^T$, $\hat{\mathbf{g}}(0) = \frac{1}{M}[1\ 1\ \cdots\ 1]^T$
$m_h(0) = \epsilon_h$, $m_g(0) = \epsilon_g$ (positive constants)
Parameters: $E\|\mathbf{h}(n)\|^2$, $E\|\mathbf{g}(n)\|^2$, $\sigma^2_{\bar{w}_h}$, $\sigma^2_{\bar{w}_g}$, $\sigma_v^2$ (known or estimated)
Algorithm:
$\sigma^2_{v_h}(n) = M\sigma_x^2\,E\|\mathbf{h}(n)\|^2\left[m_g(n-1) + M\sigma^2_{\bar{w}_g}\right]$
$\sigma^2_{v_g}(n) = L\sigma_x^2\,E\|\mathbf{g}(n)\|^2\left[m_h(n-1) + L\sigma^2_{\bar{w}_h}\right]$
$A_h = \sigma_x^2\,E\|\hat{\mathbf{g}}(n-1)\|^2\left[m_h(n-1) + L\sigma^2_{\bar{w}_h}\right]$
$B_h = L\sigma_x^2\,E\|\hat{\mathbf{g}}(n-1)\|^2\left\{\sigma_v^2 + \frac{1}{M}\sigma^2_{v_h}(n) + \sigma_x^2\,E\|\hat{\mathbf{g}}(n-1)\|^2\left[m_h(n-1) + L\sigma^2_{\bar{w}_h}\right]\right\}$
$A_g = \sigma_x^2\,E\|\hat{\mathbf{h}}(n-1)\|^2\left[m_g(n-1) + M\sigma^2_{\bar{w}_g}\right]$
$B_g = M\sigma_x^2\,E\|\hat{\mathbf{h}}(n-1)\|^2\left\{\sigma_v^2 + \frac{1}{L}\sigma^2_{v_g}(n) + \sigma_x^2\,E\|\hat{\mathbf{h}}(n-1)\|^2\left[m_g(n-1) + M\sigma^2_{\bar{w}_g}\right]\right\}$
$\mu_{\hat{h},o} = A_h / B_h$
$\mu_{\hat{g},o} = A_g / B_g$
$e(n) = d(n) - \mathbf{x}_{\hat{g}}^T(n)\hat{\mathbf{h}}(n-1) = d(n) - \mathbf{x}_{\hat{h}}^T(n)\hat{\mathbf{g}}(n-1)$
$\hat{\mathbf{h}}(n) = \hat{\mathbf{h}}(n-1) + \mu_{\hat{h},o}\,\mathbf{x}_{\hat{g}}(n)\,e(n)$
$\hat{\mathbf{g}}(n) = \hat{\mathbf{g}}(n-1) + \mu_{\hat{g},o}\,\mathbf{x}_{\hat{h}}(n)\,e(n)$
$m_h(n) = m_h(n-1) - 2A_h\mu_{\hat{h},o} + \mu_{\hat{h},o}^2 B_h + L\sigma^2_{\bar{w}_h}$
$m_g(n) = m_g(n-1) - 2A_g\mu_{\hat{g},o} + \mu_{\hat{g},o}^2 B_g + M\sigma^2_{\bar{w}_g}$
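The step-size computation in Table 3 can be sketched as a hypothetical NumPy helper (our code; `norm_h2` and `norm_g2` stand for the assumed-known quantities $E\|\mathbf{h}(n)\|^2$ and $E\|\mathbf{g}(n)\|^2$, and the expectations of the estimated-filter norms are replaced by plug-in values):

```python
import numpy as np

def olms_bf_step_sizes(m_h, m_g, h_hat, g_hat, norm_h2, norm_g2,
                       sigma_x2, sigma_wh2, sigma_wg2, sigma_v2):
    """Optimal step-sizes mu_h and mu_g of the OLMS-BF algorithm
    (Table 3; hypothetical sketch)."""
    L, M = h_hat.size, g_hat.size
    sigma_vh2 = M * sigma_x2 * norm_h2 * (m_g + M * sigma_wg2)
    sigma_vg2 = L * sigma_x2 * norm_g2 * (m_h + L * sigma_wh2)
    Eg2 = g_hat @ g_hat          # plug-in estimate of E[||g_hat(n-1)||^2]
    Eh2 = h_hat @ h_hat          # plug-in estimate of E[||h_hat(n-1)||^2]
    A_h = sigma_x2 * Eg2 * (m_h + L * sigma_wh2)
    B_h = L * sigma_x2 * Eg2 * (sigma_v2 + sigma_vh2 / M
                                + sigma_x2 * Eg2 * (m_h + L * sigma_wh2))
    A_g = sigma_x2 * Eh2 * (m_g + M * sigma_wg2)
    B_g = M * sigma_x2 * Eh2 * (sigma_v2 + sigma_vg2 / L
                                + sigma_x2 * Eh2 * (m_g + M * sigma_wg2))
    return A_h / B_h, A_g / B_g
```

The ratios $A/B$ shrink as the misalignment terms $m_h$, $m_g$ decrease, which is what produces the variable step-size behavior observed in the experiments.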
Table 4. Computational complexity of the algorithms.
| Algorithm | $\times$ | $+$ | $\div$ |
| KF-BF [13] | $3(L^2 + M^2) + 2ML + 3L + 4M$ | $2(L^2 + M^2) + 2ML + L + 2M$ | 2 |
| KF [26] | $3(ML)^2 + 4ML$ | $2(ML)^2 + 3ML$ | 1 |
| SKF-BF (Table 2) | $2ML + 2L + 3M + 6$ | $2ML + L + 2M + 4$ | 2 |
| SKF [26] | $3ML + 6$ | $3ML + 4$ | 1 |
| OLMS-BF (Table 3) | $2ML + 2L + 3M + 12$ | $2ML + L + 2M + 8$ | 2 |
| JO-NLMS [29] | $3ML + 6$ | $3ML + 5$ | 1 |

Dogariu, L.-M.; Ciochină, S.; Paleologu, C.; Benesty, J. A Connection Between the Kalman Filter and an Optimized LMS Algorithm for Bilinear Forms. Algorithms 2018, 11, 211. https://doi.org/10.3390/a11120211

