Complex-Valued FastICA Estimator with a Weighted Unitary Constraint: A Robust and Equivariant Estimator

E, Jianwei; Yang, Mingshu

doi:10.3390/math12121840

Open Access Article


by Jianwei E and Mingshu Yang *

College of Mathematics and Physics, Center for Applied Mathematics of Guangxi, Guangxi Minzu University, Nanning 530006, China

* Author to whom correspondence should be addressed.

Mathematics 2024, 12(12), 1840; https://doi.org/10.3390/math12121840

Submission received: 28 May 2024 / Revised: 11 June 2024 / Accepted: 11 June 2024 / Published: 13 June 2024


Abstract

Independent component analysis (ICA), as a statistical and computational approach, has been successfully applied to digital signal processing. Performance analysis of ICA approaches is regarded as a challenging task. This contribution concerns the complex-valued FastICA algorithm, i.e., ICA over the complex number domain, and focuses on analyzing the robustness and equivariance of the complex-valued FastICA estimator. Although the complex-valued FastICA algorithm and its derivatives are widely used for the complex blind signal separation problem, rigorous mathematical treatments of the robustness measure and of the equivariance of the complex-valued FastICA estimator are still missing. This paper rigorously analyzes the robustness against outliers and the dependence of the separation performance on the global system. We begin by defining the influence function (IF) of the complex-valued FastICA functional and then derive its closed-form expression. Then, we prove that the complex-valued FastICA algorithm based on optimizing the cost function is linear-equivariant, so that its performance depends only on the source signals.

Keywords:

complex-valued independent component analysis; FastICA algorithm; robustness; equivariance

MSC:

62F35; 62B15; 62H12

1. Introduction

Independent component analysis (ICA) is a statistical and computational model in which the observed signals are considered linear mixtures of underlying source signals. The purpose of ICA is to estimate mutually independent source signals from their linear mixtures without prior knowledge of the sources or the mixing coefficients [1,2]. To date, ICA has been widely applied in diverse fields, such as signal processing [3,4] and financial analysis [5,6]. Hence, a growing body of literature recognizes the importance of analyzing the theoretical properties of the employed ICA approaches. The main purpose of this paper is to analyze the robustness and equivariance of the complex-valued fast fixed-point algorithm for ICA (complex-valued FastICA for short) [7,8], which is one of the most prominent algorithms in the complex number domain due to its fast convergence and easy implementation.

Among many publications, algorithmic properties are among the most fundamental properties studied for complex-valued ICA. Novey and Adali proposed complex ICA based on negentropy maximization (NM) [9] and provided the local stability condition of the algorithm. Furthermore, Qian and Wei gave a stability analysis from a unique perspective [10], pointing out that the NM-based ICA algorithm might converge to a poor separation vector even if the source signals meet the stability conditions. Reference [11] presented the Riemann and Lie structures of the complex unitary group, which generalized the topological properties of the complex-valued ICA. Koldovský and Tichavský [12] addressed the region of convergence of gradient-related ICA algorithms. Their results showed that the size of the region of convergence depends on the employed algorithm and on the ratio of the scales of the source signals. E et al. [13] provided a performance analysis of the complex-valued FastICA algorithm with the M-estimator cost function, including stability and local convergence. They proved the existence of local optimal solutions and established stability conditions.

Moreover, statistical properties are another fundamental aspect of complex-valued ICA. The Cramér-Rao bound (CRB) is essential for describing the performance limit of ICA algorithms and has been studied, e.g., in [14,15], where the authors derived a closed-form expression for the CRB of the separation parameter for complex-valued ICA. Fu et al. [16] established the theory for complex-valued ICA, providing the Cramér-Rao lower bound and identification conditions, which exploited diversities of non-Gaussianity, non-whiteness, and non-circularity. Furthermore, Koldovský et al. [17] analyzed the accuracy of fast dynamic independent vector analysis, showing asymptotic efficiency under given mixing and statistical models, which coincided with the Cramér-Rao lower bound derived in [15]. Reference [13] also analyzed the statistical properties of the complex-valued FastICA method, including uniformity and robustness (the sequences of the complex-valued FastICA estimator converge in probability to the true demixing vector). However, the robustness property was addressed there without a rigorous mathematical treatment. In this paper, we focus on deriving a closed-form expression of the robustness measure for the complex-valued FastICA functional.

The circular complex-valued FastICA (c-FastICA) algorithm was introduced by Bingham and Hyvärinen [7] as an extension of FastICA in the real domain [18,19]. It has received wide attention for complex-valued source signal separation due to its fast convergence and easy implementation. In order to improve the robustness against outliers, Chao and Douglas [20] selected the Huber M-estimator as the nonlinearity in the cost function within the complex-circular FastICA algorithm. The local stability analysis showed that the improved approach was locally stable for Huber's single-parameter M-estimator cost function with circular source signals. The aforementioned complex-valued FastICA algorithm, however, performs poorly when separating noncircular source signals. To address this obstacle, Novey and Adali extended it to the noncircular source separation scenario and analyzed the local convergence of the estimator [8]. Although there have been several attempts to study the statistical properties of the complex-noncircular FastICA algorithm (nc-FastICA for short), a rigorous analysis of the robust and equivariant behavior of nc-FastICA is still missing. To fill this gap, this paper analyzes the robustness against outliers and the separation performance of the complex-valued FastICA estimator with a weighted unitary constraint.

The innovations of this paper are three-fold. First, we define the complex-valued FastICA functional and its influence function (IF) for the deflationary procedure. Second, a closed-form expression of the IF for the complex-valued FastICA functional is derived, which is used to measure robustness against outliers from a rigorous mathematical perspective. Third, we prove that the complex-valued FastICA algorithm is equivariant.

This paper is organized as follows: Section 2 provides the preliminaries of the complex-valued ICA model and the deflationary complex FastICA algorithm under consideration. Section 3 establishes the theory concerning the complex-valued FastICA functional and its influence function for ICA over the complex number domain. Section 4 analyzes the equivariance of the complex FastICA estimator. Section 5 concludes the paper.

2. Preliminaries

2.1. Complex-Valued ICA Model

Assume that a complex-valued random vector $x \in C^{n}$ consists of two real-valued random vectors $x^{R}, x^{I} \in R^{n}$ with imaginary unit $j$, i.e., $x = x^{R} + j x^{I}$, and the conjugate of $x$ is defined by $x^{*} = x^{R} - j x^{I}$. We denote by

m_{x} = E \{x\},

C_{x} = E \{(x - m_{x}) (x - m_{x})^{H}\},

\tilde{C}_{x} = E \{(x - m_{x}) (x - m_{x})^{T}\},

the mean vector, covariance matrix, and complementary covariance matrix of $x$, respectively, wherein $E \{x\} = E \{x^{R}\} + j E \{x^{I}\}$, $H$ denotes the conjugate transpose, and $T$ represents the ordinary transpose. The complex-valued random vector $x$ is circular if $\tilde{C}_{x} = 0$, and otherwise noncircular.
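The definitions above can be illustrated numerically. The following is a minimal sketch, assuming sample averages stand in for the expectations; the helper name `second_order_stats` and the two synthetic signals are illustrative choices, not part of the original text.

```python
import numpy as np

def second_order_stats(X):
    """Sample mean, covariance, and complementary covariance.

    X is an (n, T) complex array whose columns are samples of x.
    """
    m = X.mean(axis=1, keepdims=True)
    Xc = X - m
    T = X.shape[1]
    C = Xc @ Xc.conj().T / T      # estimates E{(x - m)(x - m)^H}
    C_tilde = Xc @ Xc.T / T       # estimates E{(x - m)(x - m)^T}
    return m.ravel(), C, C_tilde

rng = np.random.default_rng(0)
T = 200_000

# Circular example: real and imaginary parts independent with equal variance.
x_circ = (rng.standard_normal((1, T)) + 1j * rng.standard_normal((1, T))) / np.sqrt(2)

# Noncircular example: the real part carries most of the (unit) variance.
x_nonc = (np.sqrt(0.9) * rng.standard_normal((1, T))
          + 1j * np.sqrt(0.1) * rng.standard_normal((1, T)))

_, C_c, Ct_c = second_order_stats(x_circ)
_, C_n, Ct_n = second_order_stats(x_nonc)
```

Both signals have unit variance, but only the second has a complementary covariance that is clearly away from zero (its population value is 0.9 - 0.1 = 0.8), which is what makes it noncircular in the sense used here.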

Let $f (x, x^{*})$ be an analytic function of a complex random vector $x$ and its conjugate $x^{*}$. In this paper, we utilize the Wirtinger calculus to obtain the partial derivatives with respect to $x$ and $x^{*}$, treating $x$ and $x^{*}$ as two independent variables [21].

The ICA model established in the complex number field can be considered an extension of the real-valued ICA model [2]. It assumes that the observed signals are linear combinations of the unknown source signals $s = (s_{1}, s_{2}, \dots, s_{n})^{T}$, which generates the observed signal vector $x = (x_{1}, x_{2}, \dots, x_{n})^{T}$ according to

x = As = \sum_{i = 1}^{n} a_{i} s_{i},

(1)

where $A \in C^{n \times n}$ is the unknown mixing matrix (or coefficient matrix), which determines the cumulative distribution function (cdf) $F (x)$ of the observed signal $x$, and $a_{i}$ is the $i$th column of the matrix $A$. Generally, the following assumptions are needed in the ICA model:

Assumption 1.

The source signals are of zero mean and unit variance, i.e., $C_{s} = I$, the sources $s_{1}, s_{2}, \dots, s_{n}$ are statistically mutually independent, and at most one source is Gaussian.

Assumption 2.

The mixing matrix $A$ is of full rank.

The primary aim of solving the ICA problem is to obtain a matrix $B$, named the demixing matrix, which is used to estimate the source signals $s$ by $y = (y_{1}, y_{2}, \dots, y_{n})^{T}$ defined as follows:

y = Bx = BAs = P Λ s,

(2)

where $P$ and $Λ$ are permutation and diagonal matrices, respectively, representing the ambiguities of the ICA model (1). Such a matrix $B$ can be obtained by the non-Gaussianity maximization criterion [2], up to the scales, phases, and order of the source signals.

The data-centering procedure is needed when solving the ICA problem, as it simplifies the theory and algorithm. Considering the zero-mean source signals $s$, which can be achieved by subtracting the mean vector $m_{x}$ from $x$, we can similarly define their covariance matrix $C_{s}$ and complementary covariance matrix $\tilde{C}_{s}$. In particular, the source signal is circular if $\tilde{C}_{s} = 0$; otherwise, it is noncircular. In this study, we analyze the robustness and equivariant characteristics of symmetric complex-valued FastICA under the more general case of noncircular sources.

2.2. Deflationary Complex FastICA

To date, various approaches have been developed for solving the complex ICA problem, with the complex-valued FastICA algorithm being one of the most prominent for finding the demixing matrix $B$.

Let $b$ be one of the columns of the demixing matrix $B$, referred to as a demixing vector. On the basis of Assumption 1, the estimated source signal $y = b^{H} x$ should be of unit variance, which is enforced by searching for the solution on the weighted unitary constraint group $O (n) = \{b \in C^{n} : \| b \|_{C_{x}} = 1\}$ with the weighted vector norm $\| b \|_{C_{x}}^{2} = b^{H} C_{x} b$. Hence, the cost function of the complex-valued FastICA algorithm with the weighted unitary constraint for the one-unit version has the following form:

J (b) = E \{G (| b^{H} x |^{2})\}, b \in O (n),

(3)

where $G : R^{+} \cup \{0\} \to R$ represents a smooth even function called the nonlinearity (it determines the robustness of the complex-valued FastICA algorithm, as discussed further in Section 3), and $x$ denotes the observed signal vector defined in (1).

We note that (i) the cost function operates on the squared modulus $| b^{H} x |^{2}$ instead of a complex-valued quantity, because one seeks to maximize the expectation of a real-valued nonlinear cost function; (ii) optimizing over the weighted unitary constraint group $O (n)$ guarantees that the estimated source signal $y = b^{H} x$ has unit variance.

In order to solve the optimization problem (3), Bingham and Hyvärinen [7] presented a fast fixed-point type algorithm (called c-FastICA), which is capable of estimating circular source signals. However, it performs poorly when dealing with noncircular source separation. To overcome this obstacle, Novey and Adali extended the c-FastICA algorithm to the noncircular source separation scenario (called the nc-FastICA algorithm), which consists of the following learning rule [8]:

Step 1. Choose an arbitrary initial value $b \in O (n)$ of unit norm;
Step 2. Run iterations:

$\begin{aligned} b \leftarrow {} & - E \{g (| b^{H} x |^{2}) (b^{H} x)^{*} x\} + E \{g^{'} (| b^{H} x |^{2}) | b^{H} x |^{2} + g (| b^{H} x |^{2})\} b \\ & + E \{x x^{T}\} E \{g^{'} (| b^{H} x |^{2}) (b^{H} x)^{* 2}\} b^{*}, \\ b \leftarrow {} & \frac{b}{\| b \|_{C_{x}}} \end{aligned}$

(4)

until convergence, where $g (z)$ and $g^{'} (z)$ denote $\frac{d G (z)}{d z}$ and $\frac{d g (z)}{d z}$, respectively.
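The one-unit iteration (4) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the data are whitened first (so that $C_{x} = I$ and the weighted norm reduces to the Euclidean norm), the nonlinearity is taken to be $G (u) = \log (0.1 + u)$, one of the choices proposed for c-FastICA in [7], and the names `whiten` and `nc_fastica_one_unit` as well as the synthetic BPSK/QPSK sources are hypothetical.

```python
import numpy as np

def g(u, a=0.1):
    """Derivative of the assumed nonlinearity G(u) = log(a + u)."""
    return 1.0 / (a + u)

def g_prime(u, a=0.1):
    """Second derivative of G(u) = log(a + u)."""
    return -1.0 / (a + u) ** 2

def whiten(X):
    """Center X (n, T) and whiten it so the sample covariance is the identity."""
    Xc = X - X.mean(axis=1, keepdims=True)
    C = Xc @ Xc.conj().T / Xc.shape[1]
    d, E = np.linalg.eigh(C)
    V = E @ np.diag(d ** -0.5) @ E.conj().T   # C^{-1/2}
    return V @ Xc

def nc_fastica_one_unit(X, n_iter=200, tol=1e-8, seed=0):
    """One-unit update (4) on whitened data X of shape (n, T)."""
    n, T = X.shape
    rng = np.random.default_rng(seed)
    b = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    b /= np.linalg.norm(b)
    P = X @ X.T / T                                   # E{x x^T}
    for _ in range(n_iter):
        y = b.conj() @ X                              # y = b^H x
        u = np.abs(y) ** 2
        term1 = -(X * (g(u) * y.conj())).mean(axis=1)      # -E{g(|y|^2) y^* x}
        term2 = (g_prime(u) * u + g(u)).mean() * b         # E{g'|y|^2 + g} b
        term3 = P @ b.conj() * (g_prime(u) * y.conj() ** 2).mean()
        b_new = term1 + term2 + term3
        b_new /= np.linalg.norm(b_new)                # weighted norm with C_x = I
        converged = 1.0 - abs(np.vdot(b_new, b)) < tol
        b = b_new
        if converged:
            break
    return b

# Hypothetical test data: one noncircular (BPSK) and one circular (QPSK) source.
rng = np.random.default_rng(1)
T = 5000
S = np.vstack([
    rng.choice([-1, 1], T) + 0j,
    (rng.choice([-1, 1], T) + 1j * rng.choice([-1, 1], T)) / np.sqrt(2),
])
A = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
Z = whiten(A @ S)
b = nc_fastica_one_unit(Z)
```

The stopping rule compares successive iterates up to a phase factor, since a demixing vector is only identifiable up to a unit-modulus scalar.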

When several source signals need to be estimated, the one-unit complex FastICA algorithm (4) should be run several times with vectors $b_{1}, b_{2}, \dots, b_{n}$. To prevent the vectors from converging to the same maximum, they need to be orthogonalized after each iteration using the Gram–Schmidt method [2], resulting in the following deflationary complex FastICA algorithm:

Step 1. Set the counter $j = 1$ and choose an arbitrary initial value $b_{j} \in O (n)$ of unit norm;
Step 2. Run iterations:

$\begin{aligned} b_{j} \leftarrow {} & - E \{g (| b_{j}^{H} x |^{2}) (b_{j}^{H} x)^{*} x\} + E \{g^{'} (| b_{j}^{H} x |^{2}) | b_{j}^{H} x |^{2} + g (| b_{j}^{H} x |^{2})\} b_{j} \\ & + E \{x x^{T}\} E \{g^{'} (| b_{j}^{H} x |^{2}) (b_{j}^{H} x)^{* 2}\} b_{j}^{*}, \\ b_{j} \leftarrow {} & b_{j} - \sum_{i = 1}^{j - 1} (b_{j}^{T} b_{i}) b_{i}, \\ b_{j} \leftarrow {} & \frac{b_{j}}{\| b_{j} \|_{C_{x}}} \end{aligned}$

(5)

until convergence;
Step 3. Set $j \leftarrow j + 1$ and repeat Step 2 until $j = n$.
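The orthogonalization and normalization steps of the deflationary scheme can be sketched as follows. This is a hedged illustration, not the procedure exactly as printed in (5): it assumes the projection is taken in the $C_{x}$-weighted inner product, so that the result satisfies a weighted orthonormality condition of the form $b_{i}^{H} C_{x} b_{j} = δ_{ij}$. The function name `weighted_gram_schmidt` and the random test matrices are hypothetical.

```python
import numpy as np

def weighted_gram_schmidt(B0, C):
    """Orthonormalize the columns of B0 with respect to <a, b>_C = a^H C b.

    B0 : (n, m) complex array of candidate demixing vectors (columns).
    C  : (n, n) Hermitian positive definite matrix (stand-in for C_x).
    """
    m = B0.shape[1]
    B = np.zeros_like(B0, dtype=complex)
    for j in range(m):
        b = B0[:, j].astype(complex)
        for i in range(j):
            # Remove the component of b along the already-found vector b_i.
            b = b - (B[:, i].conj() @ C @ b) * B[:, i]
        # Normalize in the weighted norm ||b||_C = sqrt(b^H C b).
        b = b / np.sqrt((b.conj() @ C @ b).real)
        B[:, j] = b
    return B

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
C = M @ M.conj().T + 3 * np.eye(3)          # Hermitian positive definite
B0 = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
B = weighted_gram_schmidt(B0, C)
gram = B.conj().T @ C @ B                   # should be close to the identity
```

In a full deflationary run, this projection would be applied to the current vector $b_{j}$ after every fixed-point update, using only the previously converged vectors $b_{1}, \dots, b_{j - 1}$.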

3. Robustness of the Complex-Valued FastICA Estimator

3.1. Nonlinearity

From a statistical viewpoint, the nonlinear function $G$ provides information on the higher-order statistics through the expectation $E \{G (| b^{H} x |^{2})\}$, and it determines the choice of $g$ (the derivative of the nonlinear function $G$) in the algorithm (4). Hence, the statistical properties of the complex-valued FastICA estimator $b_{k}$ depend essentially on the choice of the nonlinear function $G$. In this paper, we analyze the robustness of $b_{k}$ via the IF.

Generally, robustness against outliers is a desirable property for an estimator, meaning that the estimator is insensitive to individual, highly erroneous observations. In this section, we mainly address the question of how to measure the robustness of the complex-valued FastICA estimator $b_{k}$. Heuristically, the value of the function $G (x)$ should not grow fast as $| x |$ increases if one needs a robust estimator. Specifically, we list the classical nonlinearities $G (x)$ in Table 1. The curves of the function $G (x)$ and of its derivative with respect to $x$ are plotted in Figure 1 and Figure 2.

From these figures, one can see that the Tukey M-estimator function yields a more robust estimator that is insensitive to outliers, whereas kurtosis gives a non-robust estimator that may be influenced by individual highly erroneous observations. In fact, the values of $G_{kurt} (x)$ (red line in Figure 1) increase quickly for $x$ beyond 1 or −1, without any downward trend. In contrast, as $| x |$ increases, outliers do not have much influence on the values of $G_{Tukey} (x)$, so the complex-valued FastICA estimator based on the Tukey M-estimator cost function is recommended for better robustness. The same behavior can be seen in the curves of the derivatives of the nonlinearities $G (x)$ in Figure 2. In what follows, we analyze the robustness of the complex-valued FastICA estimator via the IF.
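The qualitative contrast described above can be reproduced with a short sketch. Since Table 1 is not reproduced in this section, the two functional forms below are assumptions: a quartic kurtosis-type function and the standard Tukey biweight with the commonly used tuning constant $c = 4.685$. The function names are hypothetical.

```python
import numpy as np

def G_kurt(x):
    """Assumed kurtosis-type nonlinearity: grows like x^4."""
    return np.asarray(x, dtype=float) ** 4 / 4.0

def g_kurt(x):
    """Derivative of G_kurt: unbounded, grows like x^3."""
    return np.asarray(x, dtype=float) ** 3

def G_tukey(x, c=4.685):
    """Standard Tukey biweight function: constant (c^2 / 6) for |x| >= c."""
    r = np.clip(np.abs(x) / c, 0.0, 1.0)
    return (c ** 2 / 6.0) * (1.0 - (1.0 - r ** 2) ** 3)

def g_tukey(x, c=4.685):
    """Derivative of G_tukey: redescends to exactly zero for |x| > c."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) <= c, x * (1.0 - (x / c) ** 2) ** 2, 0.0)

xs = np.array([1.0, 5.0, 50.0])             # a typical value and two outliers
kurt_values, kurt_slopes = G_kurt(xs), g_kurt(xs)
tukey_values, tukey_slopes = G_tukey(xs), g_tukey(xs)
```

Under these assumed forms, an observation at $x = 50$ contributes no more to the Tukey cost than one at $x = 5$, while its contribution to the kurtosis-type cost keeps growing polynomially.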

3.2. Influence Function of Complex-Valued FastICA Functional

We analyze the robustness of the complex-valued FastICA estimator by considering the deflationary version of the complex-valued FastICA algorithm. For convenience, we shall hereafter suppose that $b_{k}$ is the $k$th column of the demixing matrix $B$, which corresponds to the estimator for finding the $k$th source signal $s_{k}$ in the sense that $y_{k} = b_{k}^{H} x$ gives an estimate of the source signal $s_{k}$.

In the complex-valued ICA model (1), the observed signal vector $x$ comes from an unknown distribution with the cdf $F (x)$. Hence, in order to measure the robustness of the complex-valued FastICA estimator, we first define the complex-valued FastICA functional $b_{k} (F)$ for the deflationary procedure and its influence function.

Definition 1.

Assume that the observed vector $x$ follows the complex-valued ICA model (1). The complex-valued FastICA functional $b_{k} (F)$ for the deflationary procedure is defined as follows:

b_{k} (F) = \arg\max_{\| b_{k} \|_{C_{x}} = 1} E \{G (| b_{k}^{H} x |^{2})\},

(6)

subject to the following weighted unitary constraint:

b_{k} (F)^{H} C_{x} (F) b_{j} (F) = \begin{cases} 0, & j < k, \\ 1, & j = k, \end{cases}

(7)

where $G : R^{+} \cup \{0\} \to R$ represents a smooth even function and $C_{x} (F)$ denotes the covariance matrix function of the observed vector $x$ at the distribution $F (x)$.

We note that the constraint condition (7) can be performed by the deflationary orthogonalization process, which separates the source signals one by one.

Definition 2.

The influence function (IF) of the complex-valued FastICA functional $b_{k} (\cdot)$ at $F (x)$ is given by the following:

I F (x, b, F (x)) = \lim_{t \to 0} \frac{b_{k} ((1 - t) F (x) + t \Delta_{x}) - b_{k} (F (x))}{t},

(8)

where $\Delta_{x}$ denotes the probability measure that puts mass 1 at the point $x \in C^{n}$.

We note that the IF of the complex-valued FastICA functional $b_{k} (\cdot)$ is defined on the premise that the functional is Gâteaux differentiable [22] at the distribution $F (x)$ in $C^{n}$. Thus, it can also be written as follows:

I F (x, b, F (x)) = \frac{\partial b_{k} (F_{t} (x))}{\partial t} |_{t = 0},

(9)

where $F_{t} (x) = (1 - t) F (x) + t \Delta_{x}$.
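The Gâteaux-derivative form (9) can be checked numerically on a functional whose influence function is known in closed form. The sketch below does this for the covariance functional, whose IF at a contamination point $z$ is the standard result $(z - m_{x}) (z - m_{x})^{H} - C_{x}$. It assumes a discrete (empirical) distribution as a stand-in for $F$; the helper names `cov_functional` and `contaminate` are hypothetical.

```python
import numpy as np

def cov_functional(X, w):
    """Covariance of the discrete distribution with atoms X[:, i], weights w[i]."""
    m = X @ w
    Xc = X - m[:, None]
    return (Xc * w) @ Xc.conj().T

def contaminate(X, w, z, t):
    """Atoms and weights of F_t = (1 - t) F + t * Delta_z."""
    return np.column_stack([X, z]), np.append((1.0 - t) * w, t)

rng = np.random.default_rng(0)
n, T = 2, 1000
X = rng.standard_normal((n, T)) + 1j * rng.standard_normal((n, T))
w = np.full(T, 1.0 / T)                     # uniform weights: empirical cdf
z = np.array([3.0 + 1.0j, -2.0j])           # contamination point
t = 1e-6                                    # small contamination level

C = cov_functional(X, w)
Xt, wt = contaminate(X, w, z, t)
IF_numeric = (cov_functional(Xt, wt) - C) / t    # finite-difference version of (8)

d = z - X @ w                                    # centered contamination point
IF_closed = np.outer(d, d.conj()) - C            # known closed form
```

The finite difference agrees with the closed form up to an error of order $t$, which illustrates how definition (8) quantifies the first-order effect of a point mass at $z$.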

To simplify the notation, we shall hereafter write $I F (b)$ instead of $I F (x, b, F (x))$ for the IF of the complex-valued FastICA functional $b_{k} (\cdot)$, $C_{x}$ (resp. $m_{x}$) instead of $C_{x} (F)$ (resp. $m_{x} (F)$) for the covariance matrix function (resp. mean vector function) of the observed vector $x$ at the distribution $F (x)$, and $F_{t}$ instead of $F_{t} (x)$.

The significance of the IF lies in its heuristic explanation: it reports the influence of the contamination at point $x$ on the estimate. From the robust statistics point of view, the IF quantifies the asymptotic bias resulting from contamination in the observation [22]. Hence, the forthcoming section provides a closed-form expression of the IF of the complex-valued FastICA functional $b_{k} (\cdot)$.

3.3. Robustness

In order to analyze the robustness, we use the Lagrangian multiplier method to derive an explicit expression of the influence function of the complex-valued FastICA functional. Combining the weighted unitary constraint (7) with the cost function (3), the Lagrange function is obtained as follows:

L (b_{k}, λ) = J (b_{k}) - λ_{k} [b_{k}^{H} C_{x} b_{k} - 1] - \sum_{j = 1}^{k - 1} λ_{j} b_{j}^{H} C_{x} b_{k},

(10)

where $λ_{j} \in R$, $j = 1, \dots, k$, are the Lagrange multipliers and $J (b_{k})$ is the cost function of the complex-valued FastICA. Note that the observed data $x$ are considered without a preprocessing procedure, in the sense that

J (b_{k}) = E \{G (| y_{x, k} |^{2})\} = E \{G (y_{x, k} y_{x, k}^{*})\},

wherein $y_{x, k} = b_{k}^{H} (x - m_{x}) = \sum_{i = 1}^{n} b_{i}^{*} (x_{i} - m_{i})$, and $b_{i}, x_{i}, m_{i}$ denote the $i$th elements of $b_{k}$, $x$, and $m_{x}$, respectively. Differentiating $L (b_{k}, λ)$ with respect to $b_{k} = (b_{1}, \dots, b_{n})^{T}$ and equating the result to zero yields

\nabla_{b_{k}} J (b_{k}) = \nabla_{b_{k}} \{λ_{k} [b_{k}^{H} C_{x} b_{k} - 1]\} + \nabla_{b_{k}} \sum_{j = 1}^{k - 1} λ_{j} b_{j}^{H} C_{x} b_{k},

(11)

where $\nabla_{b}$ denotes the gradient operator with respect to $b$, yielding the following:

\frac{\partial E \{G (y_{x, k} y_{x, k}^{*})\}}{\partial b_{i}} = E \{g (y_{x, k} y_{x, k}^{*}) y_{x, k} [x_{i} - m_{i}]^{*}\},

\frac{\partial λ_{k} b_{k}^{H} C_{x} b_{k}}{\partial b_{i}} = λ_{k} b_{k}^{H} c_{i},

\frac{\partial λ_{j} b_{j}^{H} C_{x} b_{k}}{\partial b_{i}} = λ_{j} b_{j}^{H} c_{i},

where $c_{i}$ represents the $i$th column of the covariance matrix function $C_{x} (F)$. Furthermore,

E \{g (| y_{x, k} |^{2}) y_{x, k} [x - m_{x}]^{*}\} = λ_{k} C_{x}^{T} b_{k}^{*} + \sum_{j = 1}^{k - 1} λ_{j} C_{x}^{T} b_{j}^{*} .

(12)

For convenience, we define the following:

L_{k} (F) ≜ E \{g (| y_{x, k} |^{2}) y_{x, k} [x - m_{x}]^{*}\}

(13)

and

R_{k} (F) ≜ λ_{k} C_{x}^{T} b_{k}^{*} + \sum_{j = 1}^{k - 1} λ_{j} C_{x}^{T} b_{j}^{*},

(14)

so that (12) becomes the following:

L_{k} (F) = R_{k} (F) .

(15)

Theorem 1.

The influence function of the complex-valued FastICA functional $b_{k} (\cdot)$, $k \in \{1, 2, \dots, n\}$, at the centered mixture distribution $F (x)$ is as follows:

I F (b_{k}) = - \bar{s}_{k}^{*} \sum_{i = 1}^{k - 1} [u_{i} (\bar{s}_{i})^{*} + \bar{s}_{i}] b_{i} + \frac{1 - | \bar{s}_{k} |^{2}}{2} b_{k} + u_{k} (\bar{s}_{k}) \sum_{i = k + 1}^{n} \bar{s}_{i} b_{i},

(16)

where $u_{k} (x) = \frac{(g (| x |^{2}) - c_{k}) x^{*} - ρ_{k}^{*}}{c_{k} - μ_{k}}$, $c_{k} = E \{g (| s_{k} |^{2}) | s_{k} |^{2}\}$, $μ_{k} = E \{g (| s_{k} |^{2})\}$, $ρ_{k} = E \{g (| s_{k} |^{2}) s_{k}^{2}\}$, $g (\cdot)$ is the derivative of the nonlinearity $G (\cdot)$ in the cost function (3), and $\bar{s}_{i} = b_{i}^{H} (z - m_{x})$ denotes the projection of the centered contamination point $z$ onto the direction $b_{i}$.
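To make the structure of (16) concrete, the following sketch evaluates the closed-form IF for given inputs. It is a direct transcription of the formula using 0-based indices; the constants $c_{k}$, $μ_{k}$, $ρ_{k}$ used in the example are placeholder numbers rather than values computed from any source distribution, and the nonlinearity derivative $g (u) = 1 / (0.1 + u)$ is an assumed choice. The function names are hypothetical.

```python
import numpy as np

def u(x, g, c, mu, rho):
    """Weight function u_k(x) = ((g(|x|^2) - c_k) x^* - rho_k^*) / (c_k - mu_k)."""
    return ((g(abs(x) ** 2) - c) * np.conj(x) - np.conj(rho)) / (c - mu)

def influence_function(k, B, s_bar, g, c, mu, rho):
    """Evaluate Eq. (16) for the 0-based index k.

    B     : (n, n) array whose columns are the demixing vectors b_i.
    s_bar : (n,) projections b_i^H (z - m_x) of the contamination point.
    c, mu, rho : (n,) arrays of the source-dependent constants.
    """
    n = B.shape[1]
    out = 0.5 * (1.0 - abs(s_bar[k]) ** 2) * B[:, k].astype(complex)
    for i in range(k):                       # previously extracted components
        ui = u(s_bar[i], g, c[i], mu[i], rho[i])
        out -= np.conj(s_bar[k]) * (np.conj(ui) + s_bar[i]) * B[:, i]
    uk = u(s_bar[k], g, c[k], mu[k], rho[k])
    for i in range(k + 1, n):                # components not yet extracted
        out += uk * s_bar[i] * B[:, i]
    return out

g = lambda v: 1.0 / (0.1 + v)                # assumed nonlinearity derivative
B = np.eye(2, dtype=complex)                 # placeholder demixing vectors
c = np.array([0.5, 0.5])                     # placeholder constants
mu = np.array([1.0, 1.0])
rho = np.array([0.1, 0.1])

IF_origin = influence_function(0, B, np.array([0.0, 0.0], dtype=complex), g, c, mu, rho)
IF_r10 = influence_function(0, B, np.array([1.0, 10.0], dtype=complex), g, c, mu, rho)
IF_r100 = influence_function(0, B, np.array([1.0, 100.0], dtype=complex), g, c, mu, rho)
```

With $\bar{s}_{1}$ held fixed, the norm of the IF for $k = 1$ scales linearly with $| \bar{s}_{2} |$, which is a small numerical illustration of the unboundedness of the weights in (16).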

To prove Theorem 1, the following two lemmas are needed, which are proved in Appendix A and Appendix B.

Lemma 1.

Assume that the observed data $x$ obey the distribution $F (x)$ with the mean vector $m_{x}$, and the sources $s$ follow Assumption 1. Let $y_{x, i} (F_{t}) = b_{i}^{H} (F_{t}) (x - m_{x})$, $L_{k} (F) = E \{g (| y_{x, k} |^{2}) y_{x, k} [x - m_{x}]^{*}\}$, and $B_{k} (F) = E \{g (| y_{x, k} |^{2}) [x - m_{x}] [x - m_{x}]^{H}\}$. Then, one obtains the following:

(a): $I F (y_{x, i}) = I F (b_{i})^{H} (x - m_{x}) - \bar{s}_{i}$,
(b): $L_{k} (F) = c_{k} a_{k}^{*}$,
(c): $B_{k} (F) = μ_{k} A A^{H} + (c_{k} - μ_{k}) a_{k} a_{k}^{H}$,
(d): $a_{k}^{H} I F (b_{i}) = - I F (b_{k})^{H} a_{i} - \bar{s}_{k} \bar{s}_{i}^{*}$, $i = 1, \dots, k - 1$,
(e): $2 Re (a_{k}^{H} I F (b_{k})) = 1 - | \bar{s}_{k} |^{2}$,

where $\bar{s}_{i} = b_{i}^{H} (z - m_{x})$, $i = 1, \dots, n$, denotes the projection of the centered contamination point $z$ onto the direction $b_{i}$; $c_{k} = E \{g (| s_{k} |^{2}) | s_{k} |^{2}\}$, $μ_{k} = E \{g (| s_{k} |^{2})\}$; and $a_{k}$ represents the $k$th column of the matrix $A$.

Proof.

See Appendix A. □

Lemma 2.

Assume that the observed data $x$ obey the distribution $F (x)$ with the mean vector $m_{x}$, and the sources $s$ follow Assumption 1. For $λ_{i}$, $i = 1, \dots, k$, in (10) at the distribution $F (x)$, we have the following:

(a): $λ_{i} (F) = 0$, $i = 1, \dots, k - 1$,
(b): $λ_{k} (F) = c_{k}$,
(c): $I F (λ_{i}) = (c_{k} - μ_{k}) a_{k}^{H} I F (b_{i}) + (g (| \bar{s}_{k} |^{2}) - μ_{k}) \bar{s}_{k} \bar{s}_{i}^{*} - ρ_{k} \bar{s}_{i}^{*}$, $i = 1, \dots, k - 1$,
(d): $I F (λ_{k}) = (g (| \bar{s}_{k} |^{2}) - c_{k}) | \bar{s}_{k} |^{2} + E \{g^{'} (| s_{k} |^{2}) | s_{k} |^{2} v_{k} (\bar{s}_{k})\} - 2 Re (E \{g (| s_{k} |^{2}) \bar{s}_{k} s_{k}^{*}\})$,

where $v_{k} (\bar{s}_{k}) = | s_{k} |^{2} - | s_{k} |^{2} | \bar{s}_{k} |^{2} - 2 Re (\bar{s}_{k} s_{k}^{*})$, and $c_{k}$, $μ_{k}$, $\bar{s}_{i}$, $ρ_{k}$ are defined as in Theorem 1.

Proof.

See Appendix B. □

Based on Lemmas 1 and 2, we now prove Theorem 1 in the following four steps:

Proof.

Step 1. Applying (15) to $F_{t}$ and differentiating $L_{k} (F_{t})$ and $R_{k} (F_{t})$ with respect to $t$ at $t = 0$ yields the following equation:

\frac{\partial L_{k} (F_{t})}{\partial t} |_{t = 0} = \frac{\partial R_{k} (F_{t})}{\partial t} |_{t = 0},

(17)

which is the starting point for obtaining a closed-form expression of the influence function $I F (b_{k})$ of the complex-valued FastICA functional.

Step 2. Calculating the left-hand side of Equation (17). Combining $F_{t} = (1 - t) F + t \Delta_{z}$, where $z$ denotes the contamination point, with Equation (13), we conclude the following:

L_{k} (F_{t}) = (1 - t) E \{g (| y_{x, k} |^{2}) y_{x, k} [x - m_{x}]^{*}\} + t g (| y_{z, k} |^{2}) y_{z, k} [z - m_{x}]^{*} .

(18)

Differentiating $L_{k} (F_{t})$ with respect to $t$ at $t = 0$ and leveraging Lemma 1, one can obtain the following:

\begin{aligned} \frac{\partial L_{k} (F_{t})}{\partial t} |_{t = 0} = {} & - L_{k} (F) + (1 - t) [E \{g^{'} (| y_{x, k} |^{2}) (I F (y_{x, k}) y_{x, k}^{*} + y_{x, k} I F (y_{x, k})^{*}) y_{x, k} (x - m_{x})^{*}\} \\ & + E \{g (| y_{x, k} |^{2}) I F (y_{x, k}) (x - m_{x})^{*}\} - E \{g (| y_{x, k} |^{2}) y_{x, k} I F (m_{x})^{*}\}] |_{t = 0} \\ & + g (| y_{z, k} |^{2}) y_{z, k} (z - m_{x})^{*} |_{t = 0} \\ = {} & - c_{k} a_{k}^{*} + E \{g^{'} (| s_{k} |^{2}) | s_{k} |^{2} v_{k} (\bar{s}_{k})\} a_{k}^{*} + [μ_{k} A A^{H} + (c_{k} - μ_{k}) a_{k} a_{k}^{H}]^{*} I F (b_{k})^{*} \\ & - 2 Re (E \{g (| s_{k} |^{2}) \bar{s}_{k} s_{k}^{*}\}) a_{k}^{*} + ρ_{k} \bar{s}_{k}^{*} a_{k}^{*} + [g (| \bar{s}_{k} |^{2}) \bar{s}_{k} - ρ_{k}] (z - m_{x})^{*} . \end{aligned}

Step 3. Calculating the right-hand side of (17). From (14), we have the following:

R_{k} (F_{t}) = λ_{k} (F_{t}) C_{x}^{T} b_{k}^{*} (F_{t}) + \sum_{j = 1}^{k - 1} λ_{j} (F_{t}) C_{x}^{T} b_{j}^{*} (F_{t}) .

(19)

Differentiating $R_{k} (F_{t})$ with respect to $t$ at $t = 0$ and leveraging Lemma 2, one can obtain the following:

\begin{aligned} \frac{\partial R_{k} (F_{t})}{\partial t} |_{t = 0} = {} & I F (λ_{k}) (A A^{H})^{T} b_{k}^{*} + λ_{k} (F) I F (C_{x})^{T} b_{k}^{*} + λ_{k} (F) (A A^{H})^{T} I F (b_{k})^{*} \\ & + \sum_{j = 1}^{k - 1} [I F (λ_{j}) (A A^{H})^{T} b_{j}^{*} + λ_{j} (F) I F (C_{x})^{T} b_{j}^{*} + λ_{j} (F) (A A^{H})^{T} I F (b_{j})^{*}] \\ = {} & I F (λ_{k}) a_{k}^{*} + c_{k} [(z - m_{x})^{*} \bar{s}_{k} - a_{k}^{*}] + c_{k} (A A^{H})^{T} I F (b_{k})^{*} + \sum_{j = 1}^{k - 1} I F (λ_{j}) a_{j}^{*} \\ = {} & [g (| \bar{s}_{k} |^{2}) | \bar{s}_{k} |^{2} - c_{k} | \bar{s}_{k} |^{2}] a_{k}^{*} - c_{k} a_{k}^{*} + c_{k} (z - m_{x})^{*} \bar{s}_{k} + c_{k} (A A^{H})^{T} I F (b_{k})^{*} \\ & + E \{g^{'} (| s_{k} |^{2}) | s_{k} |^{2} v_{k} (\bar{s}_{k})\} a_{k}^{*} - 2 Re (E \{g (| s_{k} |^{2}) \bar{s}_{k} s_{k}^{*}\}) a_{k}^{*} - ρ_{k} \sum_{j = 1}^{k - 1} \bar{s}_{j}^{*} a_{j}^{*} \\ & + (c_{k} - μ_{k}) \sum_{j = 1}^{k - 1} a_{k}^{H} I F (b_{j}) a_{j}^{*} + [g (| \bar{s}_{k} |^{2}) - μ_{k}] \bar{s}_{k} \sum_{j = 1}^{k - 1} \bar{s}_{j}^{*} a_{j}^{*} . \end{aligned}

Step 4. Deriving the explicit expression of the IF of the complex-valued FastICA functional. Plugging the results derived in Steps 2 and 3 into (17), after tedious manipulation, one obtains the following:

\begin{aligned} I F (b_{k}) = {} & \frac{g (| \bar{s}_{k} |^{2}) \bar{s}_{k}^{*} - c_{k} \bar{s}_{k}^{*} - ρ_{k}^{*}}{c_{k} - μ_{k}} B^{H} B (z - m_{x}) - \frac{g (| \bar{s}_{k} |^{2}) - c_{k}}{c_{k} - μ_{k}} | \bar{s}_{k} |^{2} B^{H} B a_{k} + \frac{ρ_{k}^{*} \bar{s}_{k}}{c_{k} - μ_{k}} B^{H} B a_{k} \\ & + \frac{ρ_{k}^{*}}{c_{k} - μ_{k}} \sum_{j = 1}^{k - 1} \bar{s}_{j} B^{H} B a_{j} - \sum_{j = 1}^{k - 1} a_{k}^{T} I F (b_{j})^{*} B^{H} B a_{j} \\ & - \frac{g (| \bar{s}_{k} |^{2}) - μ_{k}}{c_{k} - μ_{k}} \bar{s}_{k}^{*} \sum_{j = 1}^{k - 1} \bar{s}_{j} B^{H} B a_{j} + \frac{1 - | \bar{s}_{k} |^{2}}{2} B^{H} B a_{k} \\ = {} & \frac{g (| \bar{s}_{k} |^{2}) \bar{s}_{k}^{*} - c_{k} \bar{s}_{k}^{*} - ρ_{k}^{*}}{c_{k} - μ_{k}} (\sum_{j \neq k}^{n} b_{j} \bar{s}_{j} + \bar{s}_{k} b_{k}) - \frac{g (| \bar{s}_{k} |^{2}) - c_{k}}{c_{k} - μ_{k}} | \bar{s}_{k} |^{2} b_{k} + \frac{ρ_{k}^{*} \bar{s}_{k}}{c_{k} - μ_{k}} b_{k} \\ & + \frac{ρ_{k}^{*}}{c_{k} - μ_{k}} \sum_{j = 1}^{k - 1} \bar{s}_{j} b_{j} - \sum_{j = 1}^{k - 1} a_{k}^{T} I F (b_{j})^{*} b_{j} - \frac{g (| \bar{s}_{k} |^{2}) - μ_{k}}{c_{k} - μ_{k}} \bar{s}_{k}^{*} \sum_{j = 1}^{k - 1} \bar{s}_{j} b_{j} + \frac{1 - | \bar{s}_{k} |^{2}}{2} b_{k} \\ = {} & - \bar{s}_{k}^{*} \sum_{j = 1}^{k - 1} \bar{s}_{j} b_{j} + u_{k} (\bar{s}_{k}) \sum_{j = k + 1}^{n} \bar{s}_{j} b_{j} - \sum_{j = 1}^{k - 1} a_{k}^{T} I F (b_{j})^{*} b_{j} + \frac{1 - | \bar{s}_{k} |^{2}}{2} b_{k} . \end{aligned}

(20)

Combining the preceding equality with $a_{k}^{H} I F (b_{j}) = u_{j} (\bar{s}_{j}) \bar{s}_{k}$ for $j < k$ concludes the proof. □

Remark 1.

In the proofs of Theorem 1 and Lemmas 1 and 2, we use the following facts:

A^{H} b_{k} = e_{k},

A e_{k} = a_{k},

b_{i}^{H} a_{j} = \begin{cases} 0, & j \neq i, \\ 1, & j = i, \end{cases}

where $A$ denotes the mixing matrix in the ICA model (1); $a_{k}$ is the $k$th column of the matrix $A$; $e_{k}$ represents a vector with 1 in the $k$th element and 0 elsewhere; and $b_{i}$ is the $i$th column of the demixing matrix $B$.
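These identities are easy to verify numerically for a random mixing matrix. The sketch below assumes unit-variance independent sources, so that $C_{x} = A A^{H}$, and builds the demixing vectors as the columns of $(A^{-1})^{H}$, which is one matrix satisfying $A^{H} b_{k} = e_{k}$; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Columns b_k of B satisfy A^H b_k = e_k, i.e. B = (A^{-1})^H.
B = np.linalg.inv(A).conj().T

# Covariance of x = A s under Assumption 1 (C_s = I).
C_x = A @ A.conj().T

fact_1 = A.conj().T @ B          # should be the identity: A^H b_k = e_k
fact_2 = B.conj().T @ A          # entries b_i^H a_j, again the identity
constraint = B.conj().T @ C_x @ B  # weighted unitary constraint (7)
lemma_like = C_x.T @ B.conj()    # columns C_x^T b_k^*, expected to equal a_k^*
```

The last line mirrors the simplification $(A A^{H})^{T} b_{k}^{*} = a_{k}^{*}$ that appears when passing from the first to the second line of the derivation in Step 3.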

Remark 2.

Pre-multiplying both sides of Equation (20) by $a_{i}^{H}$, we have the following:

a_{i}^{H} I F (b_{k}) = - \bar{s}_{k}^{*} \sum_{j = 1}^{k - 1} \bar{s}_{j} a_{i}^{H} b_{j} + u_{k} (\bar{s}_{k}) \sum_{j = k + 1}^{n} \bar{s}_{j} a_{i}^{H} b_{j} - \sum_{j = 1}^{k - 1} a_{k}^{T} I F (b_{j})^{*} a_{i}^{H} b_{j} + \frac{1 - | \bar{s}_{k} |^{2}}{2} a_{i}^{H} b_{k} .

(21)

After analyzing (21), one can observe that $a_{i}^{H} I F (b_{k}) = u_{k} (\bar{s}_{k}) \bar{s}_{i}$ for $i > k$, or equivalently $a_{k}^{H} I F (b_{j}) = u_{j} (\bar{s}_{j}) \bar{s}_{k}$ for $j < k$.

Remark 3.

The expression of complex-valued FastICA IF can also be written as follows:

\begin{matrix} I F (b_{k}) = \{\begin{matrix} \frac{1 - | {\bar{s}}_{1} |^{2}}{2} b_{1} + u_{1} ({\bar{s}}_{1}) \sum_{i = 2}^{n} {\bar{s}}_{i} b_{i}, & k = 1, \\ - {\bar{s}}_{k}^{*} \sum_{i = 1}^{k - 1} [u_{i} {({\bar{s}}_{i})}^{*} + {\bar{s}}_{i}] b_{i} + \frac{1 - | {\bar{s}}_{k} |^{2}}{2} b_{k} + u_{k} ({\bar{s}}_{k}) \sum_{i = k + 1}^{n} {\bar{s}}_{i} b_{i}, & 1 < k < n, \\ - {\bar{s}}_{n}^{*} \sum_{i = 1}^{n - 1} [u_{i} {({\bar{s}}_{i})}^{*} + {\bar{s}}_{i}] b_{i} + \frac{1 - | {\bar{s}}_{n} |^{2}}{2} b_{n}, & k = n . \end{matrix} \end{matrix}

(22)

By inspecting the closed-form expression of the IF for the complex-valued FastICA functional $b_{k}$ in Theorem 1, we see that the IF of $b_{k}$ is a weighted sum of the separation vectors $b_{1}, \dots, b_{n}$ whose weight coefficients are unbounded functions of the projections $\bar{s}_{1}, \dots, \bar{s}_{n}$ of the contamination point. This finding confirms that the values of $I F (b_{k})$ are large when outliers are present in the source signals. As can also be seen in Theorem 1, the larger the projections $\bar{s}_{1}, \dots, \bar{s}_{n}$ of the contamination point, the larger the values of $I F (b_{k})$.

4. Equivariance of the Complex-Valued FastICA Estimator

4.1. Equivariance of Complex-Valued ICA

Equivariance is a fundamental property in statistics [23]: a transformation applied to the sample data induces the same transformation of the estimated model parameter. This leads to the following definition.

Definition 3.

Assume that

\hat{A} = A (X)

is an estimator of the mixing matrix

A

in the complex-valued ICA model (1), which relies on the sample matrix

X = [x (1), \dots, x (T)]

with T sample points. Then, we call the estimator

\hat{A}

equivariant if

A (TX) = T A (X)

(23)

for any invertible transformation matrix

T

(not to be confused with the number of sample points T).

To analyze the equivariance of the complex-valued FastICA estimator, the following theorem is needed.

Theorem 2.

Let

C = BA

be the global mixing–demixing system of the complex-valued ICA model (1) and

\hat{A}

be equivariant. Then the global mixing–demixing system

C

of ICA algorithms based on optimizing a cost function depends only on the source signal matrix

S = [s (1), \dots, s (T)]

, rather than on the mixing system

A

and demixing system

B

.

Proof.

Denote by

C

the mixing–demixing matrix

BA

; the source estimate can then be written as follows:

y = Cs .

(24)

The key property of an equivariant estimator for the BSS problem is that it offers uniform performance, in the sense that

y = Bx = A {(X)}^{- 1} As = A {(AS)}^{- 1} As = A {(S)}^{- 1} s,

(25)

where

S = [s (1), \dots, s (T)]

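is the source signal matrix with T sample points. The last equality uses the equivariance property (23) with the transformation taken to be the mixing matrix, that is,

A (AS) = A A (S),

so that

A {(AS)}^{- 1} A = A {(S)}^{- 1} A^{- 1} A = A {(S)}^{- 1} .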

By comparing Equations (24) and (25), the result follows from the fact that

C = A {(S)}^{- 1} .

□
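The argument in Equation (25) can be checked numerically with a short Python sketch. The map `toy_estimator` below is a hypothetical, deliberately trivial equivariant map (it returns the first n samples of the data matrix) and is not an ICA estimator; it is used only because it satisfies property (23) exactly, which is all that the proof requires. The variable names and random test data are likewise assumptions made for illustration.

```python
import numpy as np

def toy_estimator(X):
    """Trivial equivariant map: the first n columns of the (n, T) data matrix.

    It satisfies toy_estimator(M @ X) == M @ toy_estimator(X) for any square M,
    which is the only property used in Eq. (25). It is NOT an ICA estimator.
    """
    n = X.shape[0]
    return X[:, :n]

rng = np.random.default_rng(0)
n, T = 3, 50
S = rng.standard_normal((n, T)) + 1j * rng.standard_normal((n, T))  # source matrix

def global_system(A):
    X = A @ S                                # mixtures x(t) = A s(t)
    B = np.linalg.inv(toy_estimator(X))      # demixing matrix B = A(X)^{-1}
    return B @ A                             # global system C = B A

# Two unrelated mixing matrices applied to the same sources.
A1 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A2 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
C1, C2 = global_system(A1), global_system(A2)
```

Both mixing matrices lead to the same global system, which coincides with the inverse of the estimator applied directly to the sources, as stated in Theorem 2.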

Remark 4.

From Theorem 2, the current source estimate

y_{t}

produced by an equivariant estimator depends only on the source signal matrix

S

and the source

s_{t}

, rather than the mixing matrix

A

.

Remark 5.

Because an estimator obtained by optimizing a cost function is equivariant [24], we only need to consider the equivariance of the complex-valued FastICA in terms of the global mixing–demixing system

C = BA

, i.e., the learning rule for the estimated parameter of the ICA algorithm can be expressed in terms of this composite system.

4.2. Equivariance of Symmetric Complex-Valued FastICA

For convenience, we analyze the equivariance of the complex-valued FastICA algorithm by focusing on the symmetric version, in the sense that the source signals are estimated in parallel. In particular, the demixing vectors

b_{1}, b_{2}, \dots, b_{n}

are iterated simultaneously, as follows:

\begin{matrix} b_{1} \leftarrow & - E {g (| b_{1}^{H} x |^{2}) {(b_{1}^{H} x)}^{*} x} + E {g^{'} (| b_{1}^{H} x |^{2}) | b_{1}^{H} x |^{2} + g (| b_{1}^{H} x |^{2})} b_{1} + E {x x^{T}} E {g^{'} (| b_{1}^{H} x |^{2}) {(b_{1}^{H} x)}^{* 2}} b_{1}^{*}, \\ \vdots & \\ b_{n} \leftarrow & - E {g (| b_{n}^{H} x |^{2}) {(b_{n}^{H} x)}^{*} x} + E {g^{'} (| b_{n}^{H} x |^{2}) | b_{n}^{H} x |^{2} + g (| b_{n}^{H} x |^{2})} b_{n} + E {x x^{T}} E {g^{'} (| b_{n}^{H} x |^{2}) {(b_{n}^{H} x)}^{* 2}} b_{n}^{*}, \end{matrix}

where

b_{1}, b_{2}, \dots, b_{n}

form an orthonormal set. To prevent different demixing vectors from converging to the same maximum, the estimated sources

y_{1} = b_{1}^{H} x, y_{2} = b_{2}^{H} x, \dots, y_{n} = b_{n}^{H} x

need to be decorrelated, which can be accomplished by the classic symmetric orthogonalization based on the inverse matrix square root:

B \leftarrow {(B B^{H})}^{- \frac{1}{2}} B,

(26)

where

B = (b_{1}, \dots, b_{n})

is the matrix of the vectors. Note that the inverse square root

{(B B^{H})}^{- \frac{1}{2}}

can be obtained from the eigenvalue decomposition (EVD)

B B^{H} = E \, \mathrm{diag} (d_{1}, d_{2}, \dots, d_{n}) E^{H}

. For any vector

h = {(h_{1}, \dots, h_{n})}^{T}

, we denote the following:

\mathrm{diag} (h) = (\begin{matrix} h_{1} & & 0 \\ & \ddots & \\ 0 & & h_{n} \end{matrix}) .

Hence, we have the following:

{(B B^{H})}^{- \frac{1}{2}} = E \, \mathrm{diag} (d_{1}^{- \frac{1}{2}}, d_{2}^{- \frac{1}{2}}, \dots, d_{n}^{- \frac{1}{2}}) E^{H} .

(27)
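The orthogonalization in Equations (26) and (27) can be sketched in a few lines of Python. The sketch assumes that B B^H is positive definite, so that all eigenvalues d_i are strictly positive. The helper name `symmetric_orthogonalize` and the random test matrix are illustrative assumptions.

```python
import numpy as np

def symmetric_orthogonalize(B):
    """Return (B B^H)^{-1/2} B, using the EVD of B B^H as in Eqs. (26)-(27)."""
    d, E = np.linalg.eigh(B @ B.conj().T)     # B B^H = E diag(d) E^H, with d > 0
    inv_sqrt = (E * d ** -0.5) @ E.conj().T   # E diag(d^{-1/2}) E^H
    return inv_sqrt @ B

rng = np.random.default_rng(1)
B0 = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
B1 = symmetric_orthogonalize(B0)              # B1 @ B1^H is the identity up to rounding
```

Here `np.linalg.eigh` is used because B B^H is Hermitian; multiplying `E` by the row vector `d ** -0.5` scales its columns, which is equivalent to the product E diag(d^{-1/2}).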

Using the matrix representation, the symmetric version of the complex-valued FastICA algorithm can be rewritten by the following learning rule:

Step 1: Choose arbitrary initial values for the $b_{1}, \dots, b_{n} \in O (n)$; orthogonalize the matrix $B = (b_{1}, \dots, b_{n})$ as in Step 2 below.
Step 2: Let $y = Bx$ and run the iteration

\begin{matrix} B \leftarrow & - E {\mathrm{diag} (g (| y |^{2})) y x^{H}} + E {\mathrm{diag} (g^{'} (| y |^{2}) | y |^{2} + g (| y |^{2}))} B \\ & + E {\mathrm{diag} (g^{'} (| y |^{2}) y^{2})} E {y^{*} x^{H}}, \\ B \leftarrow & {(B B^{H})}^{- \frac{1}{2}} B \end{matrix}

(28)

until convergence. Here, $g$, $g^{'}$, $| y |^{2}$ and $y^{2}$ are applied componentwise to the vector $y$.
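The two steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than a reference implementation: expectations are replaced by sample means, the observations are centered and whitened beforehand, the rows of B are taken to be b_k^H so that y = Bx, and every diag(·) in (28) is read as a diagonal matrix built from a componentwise vector. The choice g(u) = u (a kurtosis-type contrast), the QPSK test sources, the fixed number of iterations, and all function names are illustrative assumptions.

```python
import numpy as np

def sym_orth(B):
    """Symmetric orthogonalization B <- (B B^H)^{-1/2} B via the EVD of B B^H."""
    d, E = np.linalg.eigh(B @ B.conj().T)
    return (E * d ** -0.5) @ E.conj().T @ B

def whiten(X):
    """Center the rows of X and whiten so that the sample covariance is I."""
    X = X - X.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(X @ X.conj().T / X.shape[1])
    return (E * d ** -0.5) @ E.conj().T @ X

def fastica_step(B, X, g, dg):
    """One pass of the update (28), with sample means in place of E{.}.

    B     : (n, n) current demixing matrix (rows assumed to be b_k^H)
    X     : (n, T) centered and whitened observations
    g, dg : componentwise nonlinearity g and its derivative g'
    """
    T = X.shape[1]
    y = B @ X
    a2 = np.abs(y) ** 2
    XH = X.conj().T
    term1 = -(g(a2) * y) @ XH / T                                  # -E{diag(g) y x^H}
    term2 = np.mean(dg(a2) * a2 + g(a2), axis=1)[:, None] * B      # E{diag(g'|y|^2 + g)} B
    term3 = np.mean(dg(a2) * y ** 2, axis=1)[:, None] * (y.conj() @ XH / T)
    return sym_orth(term1 + term2 + term3)

# Toy run: two QPSK sources, random complex mixing matrix (all illustrative).
rng = np.random.default_rng(2)
n, T = 2, 2000
S = np.exp(1j * np.pi / 2 * rng.integers(0, 4, size=(n, T)))
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Xw = whiten(A @ S)
B = sym_orth(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
for _ in range(20):
    B = fastica_step(B, Xw, g=lambda u: u, dg=lambda u: np.ones_like(u))
```

Other nonlinearities can be passed through the `g` and `dg` arguments in the same way; a practical implementation would also replace the fixed iteration count with a convergence test.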

Definition 4.

Assume that

B = B (X)

is an estimator of the demixing matrix

B

in the complex-valued ICA model (1), which relies on the sample matrix

X = [x (1), \dots, x (T)]

with T sample points. Then, we call the estimator

B (X)

linear-equivariant, if

C = B (S),

(29)

where

x (i) = As (i), i = 1, \dots, T

,

S = [s (1), \dots, s (T)]

and

C = BA

.

From Definition 4, we can obtain the equivariance of the symmetric complex-valued FastICA algorithm:

Theorem 3.

The algorithm (28) for the complex-valued BSS is linear-equivariant and

B (S) = A {(S)}^{- 1}

.

Proof.

We denote by

C

the mixing–demixing matrix

BA

, and the estimation of sources can be rewritten as follows:

y = Cs .

(30)

Substituting

x = As, \quad y = Cs

into the algorithm (28), post-multiplying both sides by

A

, and using the orthonormality of the demixing vectors enforced by the symmetric orthogonalization, which gives

A^{H} A = A^{H} B^{H} B A = C^{H} C

, one obtains

\begin{matrix} C \leftarrow & - E {\mathrm{diag} (g (| Cs |^{2})) C s s^{H} C^{H} C} + E {\mathrm{diag} (g^{'} (| Cs |^{2}) | Cs |^{2} + g (| Cs |^{2}))} C \\ & + E {\mathrm{diag} (g^{'} (| Cs |^{2}) {(Cs)}^{2})} E {{(Cs)}^{*} s^{H} C^{H} C} . \end{matrix}

(31)

The right-hand side of (31) involves only the global system and the source signals, so we have

C = B (S)

. Hence, the algorithm (28) for the complex-valued BSS is linear-equivariant. The remaining conclusion

B (S) = A {(S)}^{- 1}

follows by combining Theorem 2 and Definition 4. □
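The step from (28) to (31) is an exact algebraic identity once expectations are replaced by sample means, and it can be verified numerically. The sketch below assumes a unitary demixing matrix B (as produced by the symmetric orthogonalization), an arbitrary invertible mixing matrix A, and reads each diag(·) as a diagonal matrix built from a componentwise vector. The nonlinearity g(u) = 1/(0.1 + u), the random data, and the function names are illustrative assumptions; the sources are arbitrary random draws because only the identity is being checked, not the separation quality.

```python
import numpy as np

def g(u):                      # illustrative nonlinearity (assumption)
    return 1.0 / (0.1 + u)

def dg(u):                     # its derivative
    return -1.0 / (0.1 + u) ** 2

def update_B(B, X):
    """First line of (28), before orthogonalization, using sample means."""
    T = X.shape[1]
    y = B @ X
    a2 = np.abs(y) ** 2
    XH = X.conj().T
    return (-(g(a2) * y) @ XH / T
            + np.mean(dg(a2) * a2 + g(a2), axis=1)[:, None] * B
            + np.mean(dg(a2) * y ** 2, axis=1)[:, None] * (y.conj() @ XH / T))

def update_C(C, S):
    """Right-hand side of (31): depends only on C and the sources S."""
    T = S.shape[1]
    y = C @ S
    a2 = np.abs(y) ** 2
    SH = S.conj().T
    CHC = C.conj().T @ C
    return (-(g(a2) * y) @ SH / T @ CHC
            + np.mean(dg(a2) * a2 + g(a2), axis=1)[:, None] * C
            + np.mean(dg(a2) * y ** 2, axis=1)[:, None] * (y.conj() @ SH / T @ CHC))

rng = np.random.default_rng(3)
n, T = 3, 200
S = rng.standard_normal((n, T)) + 1j * rng.standard_normal((n, T))
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
# Unitary B from a QR factorization of a random complex matrix.
B, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

lhs = update_B(B, A @ S) @ A   # update (28) on the mixtures, post-multiplied by A
rhs = update_C(B @ A, S)       # update (31) on the sources alone
```

The two results agree to rounding error, so the update of the global system never needs A and B separately.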

The results obtained demonstrate the following: (1) Theorems 2 and 3 show that the complex-valued FastICA estimator is equivariant; namely, the performance of the complex-valued FastICA algorithm depends only on the source signal

s

, rather than the mixing matrix

A

and demixing matrix

B

; (2) There is an invertible relationship between the mixing matrix

A

and demixing matrix

B

under Assumption 2.

5. Conclusions

To provide a rigorous mathematical treatment of robustness and equivariance for the complex-valued FastICA estimator, this paper analyzed the statistical properties of the complex-valued FastICA algorithm in the context of ICA over the complex number domain. First, a closed-form expression of the IF of the complex-valued FastICA functional was derived and used to measure robustness against outliers. We found that the complex-valued FastICA algorithm based on Tukey’s single-parameter M-estimator cost function had the best separation performance in both circular and noncircular scenarios. Then, we proved that the complex-valued FastICA algorithm is equivariant in the sense that the global mixing–demixing system of the algorithm depends only on the source signals.

Author Contributions

Conceptualization, J.E. and M.Y.; methodology, J.E. and M.Y.; software, M.Y. writing—original draft preparation, J.E. and M.Y.; validation, J.E. and M.Y.; writing—review and editing, J.E. and M.Y.; visualization, J.E. and M.Y.; supervision, M.Y.; funding acquisition, M.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Guangxi Natural Science Foundation under grant 2023GXNSFBA026180, Middle-aged and Young Teachers’ Basic Ability Promotion Project of Guangxi Province under grant 2024KY0181, and the Natural Science Basic Research Program of Shaanxi under grant 2024JC-YBMS-043.

Data Availability Statement

The data used to support the findings of this study are included within the article.

Acknowledgments

Sincere thanks are given to anonymous referees for their insightful comments which are valuable for improving this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Proof of Lemma 1.

(a) Differentiating

y_{x, i} (F_{t})

with respect to t at

t = 0

, we have the following:

\begin{matrix} I F (y_{x, i}) & = \frac{\partial y_{x, i} (F_{t})}{\partial t} |_{t = 0} \\ & = \frac{\partial b_{i}^{H} (F_{t})}{\partial t} |_{t = 0} \cdot (x - m_{x}) |_{t = 0} + b_{i}^{H} (F_{t}) |_{t = 0} \cdot \frac{\partial (x - m_{x})}{\partial t} |_{t = 0} \\ & = I F {(b_{i})}^{H} (x - m_{x}) - b_{i}^{H} I F (m_{x}) \\ & = I F {(b_{i})}^{H} (x - m_{x}) - b_{i}^{H} (z - m_{x}) \\ & = I F {(b_{i})}^{H} (x - m_{x}) - {\bar{s}}_{i} . \end{matrix}

(b) Since

y_{x, k} (F) = b_{k}^{H} (F) [x - m_{x}] = b_{k}^{H} As = e_{k}^{H} s = s_{k}

, where

e_{k}

denotes the real-valued vector with 1 in the kth component and 0 elsewhere, and since the independence of the sources gives

E {g (| s_{k} |^{2}) s_{k} s_{l}^{*}} = E {g (| s_{k} |^{2}) s_{k}} E {s_{l}^{*}} = 0, l \neq k

, we have the following:

\begin{matrix} L_{k} (F) & = E {g (| s_{k} |^{2}) s_{k} {(As)}^{*}} = A^{*} E {g (| s_{k} |^{2}) | s_{k} |^{2} e_{k}} \\ = E {g (| s_{k} |^{2}) | s_{k} |^{2}} a_{k}^{*} = c_{k} a_{k}^{*}, \end{matrix}

where

A e_{k} = a_{k}

.

(c) By the assumption

E {s} = 0, E {{ss}^{H}} = I

; that is,

E {s_{i}} = 0, E {s_{i} s_{j}^{*}} = 0

,

E {| s_{i} |^{2}} = 1, i \neq j = 1, \dots, n

, one can obtain the following:

\begin{matrix} B_{k} (F_{A}) & = E {g (| s_{k} |^{2}) As {(As)}^{H}} = A E {g (| s_{k} |^{2}) s s^{H}} A^{H} \\ & = A (\begin{matrix} E {g (| s_{k} |^{2})} & & & & \\ & \ddots & & & \\ & & E {g (| s_{k} |^{2}) | s_{k} |^{2}} & & \\ & & & \ddots & \\ & & & & E {g (| s_{k} |^{2})} \end{matrix}) A^{H} \\ & = A (c_{k} e_{k} e_{k}^{T} + μ_{k} \sum_{l \neq k} e_{l} e_{l}^{T}) A^{H} \\ & = c_{k} A e_{k} e_{k}^{H} A^{H} + μ_{k} A A^{H} - μ_{k} a_{k} a_{k}^{H} \\ & = μ_{k} A A^{H} + (c_{k} - μ_{k}) a_{k} a_{k}^{H} . \end{matrix}
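The last three equalities in (c) are purely algebraic and can be confirmed with a short numerical check. In the sketch below, the values assigned to c_k and μ_k are arbitrary placeholders, the mixing matrix is a random draw, and the index k is 0-based; all of these are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 4, 2                                   # k is a 0-based index here
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
c_k, mu_k = 1.7, 0.6                          # arbitrary placeholder values

# Diagonal matrix with c_k in position k and mu_k elsewhere.
D = mu_k * np.eye(n)
D[k, k] = c_k

lhs = A @ D @ A.conj().T                      # A (c_k e_k e_k^T + mu_k sum_l e_l e_l^T) A^H
a_k = A[:, k]                                 # a_k = A e_k
rhs = mu_k * (A @ A.conj().T) + (c_k - mu_k) * np.outer(a_k, a_k.conj())
```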

(d) According to the constraint condition (7), if

i < k

:

b_{k} {(F_{t})}^{H} C_{x} (F_{t}) b_{i} (F_{t}) = 0

. Differentiating both sides with respect to t at

t = 0

, we have the following:

\begin{matrix} 0 & = I F {(b_{k})}^{H} {AA}^{H} b_{i} + b_{k}^{H} I F (C_{x}) b_{i} + b_{k}^{H} {AA}^{H} I F (b_{i}) \\ = I F {(b_{k})}^{H} a_{i} + b_{k}^{H} [(z - m_{x}) {(z - m_{x})}^{H} - {AA}^{H}] b_{i} + a_{k}^{H} I F (b_{i}) \\ = I F {(b_{k})}^{H} a_{i} + {\bar{s}}_{k} {\bar{s}}_{i}^{*} - b_{k}^{H} {AA}^{H} b_{i} + a_{k}^{H} I F (b_{i}), \end{matrix}

which is equivalent to

a_{k}^{H} I F (b_{i}) = - I F {(b_{k})}^{H} a_{i} - {\bar{s}}_{k} {\bar{s}}_{i}^{*}

, where

{\bar{s}}_{i} = b_{i}^{H} (z - m_{x})

,

i = 1, \dots, k - 1

denotes the projection of the centered contamination point

z

onto the direction

b_{i}

.

(e) According to the constraint condition (7), if

i = k

:

b_{k} {(F_{t})}^{H} C_{x} (F_{t}) b_{k} (F_{t}) = 1

. Differentiating both sides with respect to t at

t = 0

, we have the following:

\begin{matrix} 0 & = I F {(b_{k})}^{H} {AA}^{H} b_{k} + b_{k}^{H} I F (C_{x}) b_{k} + b_{k}^{H} {AA}^{H} I F (b_{k}) \\ = I F {(b_{k})}^{H} a_{k} + b_{k}^{H} [(z - m_{x}) {(z - m_{x})}^{H} - {AA}^{H}] b_{k} + a_{k}^{H} I F (b_{k}) \\ = I F {(b_{k})}^{H} a_{k} + {\bar{s}}_{k} {\bar{s}}_{k}^{*} - b_{k}^{H} {AA}^{H} b_{k} + a_{k}^{H} I F (b_{k}) \\ = I F {(b_{k})}^{H} a_{k} + a_{k}^{H} I F (b_{k}) + {| {\bar{s}}_{k} |}^{2} - 1, \end{matrix}

which is equivalent to

2 R e (a_{k}^{H} I F (b_{k})) = 1 - {| {\bar{s}}_{k} |}^{2}

, where

{\bar{s}}_{k} = b_{k}^{H} (z - m_{x})

denotes the projection of the centered contamination point

z

onto the direction

b_{k}

. Note that

A^{H} b_{k} = e_{k}

;

A e_{k} = a_{k}

;

a_{k}^{H} b_{k} = 1

,

a_{i}^{H} b_{k} = 0 (i < k)

. □
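The proofs of Lemma 1 (a), (d) and (e) rely on the influence functions of the mean and of the covariance, namely IF(m_x) = z − m_x and IF(C_x) = (z − m_x)(z − m_x)^H − C_x. A minimal Python sketch can confirm both by a finite-difference derivative of the contaminated moments at t = 0. The empirical distribution of random complex samples stands in for F, and the contamination point `z`, the step size `t`, and the function name are illustrative assumptions.

```python
import numpy as np

def contaminated_moments(X, z, t):
    """Mean and covariance of the contaminated distribution (1 - t) F + t * delta_z.

    F is represented by the empirical distribution of the columns of X.
    """
    m = X.mean(axis=1)
    m_t = (1 - t) * m + t * z
    Xc = X - m_t[:, None]
    C_t = ((1 - t) * (Xc @ Xc.conj().T) / X.shape[1]
           + t * np.outer(z - m_t, (z - m_t).conj()))
    return m_t, C_t

rng = np.random.default_rng(5)
X = rng.standard_normal((2, 500)) + 1j * rng.standard_normal((2, 500))
z = np.array([3.0 + 1.0j, -2.0j])             # contamination point (arbitrary)

m0, C0 = contaminated_moments(X, z, 0.0)
t = 1e-6
m1, C1 = contaminated_moments(X, z, t)

if_mean_fd = (m1 - m0) / t                    # finite-difference derivative at t = 0
if_cov_fd = (C1 - C0) / t
if_mean = z - m0                              # IF(m_x) = z - m_x
if_cov = np.outer(z - m0, (z - m0).conj()) - C0   # IF(C_x) = (z - m_x)(z - m_x)^H - C_x
```

In the setting of the paper the covariance is C_x = AA^H, which gives the bracketed term used in parts (d) and (e).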

Appendix B

Proof of Lemma 2.

(a) and (b) Recalling

R_{k} (F) = λ_{k} C_{x}^{T} b_{k}^{*} + \sum_{j = 1}^{k - 1} λ_{j} C_{x}^{T} b_{j}^{*},

pre-multiplying both sides by

b_{i}^{T}

and using the constraint condition (7), one obtains the following:

λ_{i} (F) = b_{i}^{T} R_{k} (F), i = 1, \dots, k

. Noting that

L_{k} (F) = R_{k} (F)

and recalling Lemma 1 (b), we have the following:

λ_{i} (F_{A}) = c_{k} {(b_{i}^{H} a_{k})}^{*} .

Hence,

λ_{i} (F) = 0

, if

i = 1, \dots, k - 1

;

λ_{i} (F) = c_{k}

if

i = k

.

(c) and (d) According to the proof of Lemma 1, we have the following:

\begin{matrix} λ_{i} (F) & = b_{i}^{T} L_{k} (F) \\ & = b_{i}^{T} E {g (| y_{x, k} |^{2}) y_{x, k} {[x - m_{x}]}^{*}} \\ & = E {g (| y_{x, k} |^{2}) y_{x, k} y_{x, i}^{*}}, \quad i = 1, \dots, k . \end{matrix}

Since

λ_{i} (F_{t}) = (1 - t) E {g (| y_{x, k} |^{2}) y_{x, k} y_{x, i}^{*}} + t g (| y_{z, k} |^{2}) y_{z, k} y_{z, i}^{*},

differentiating with respect to t at

t = 0

yields the following:

\begin{matrix} I F (λ_{i}) = & - λ_{i} (F) + 2 E {g^{'} (| y_{x, k} |^{2}) R e (I F (y_{x, k}) y_{x, k}^{*}) y_{x, k} y_{x, i}^{*}} \\ + E {g (| y_{x, k} |^{2}) I F (y_{x, k}) y_{x, i}^{*}} + E {g (| y_{x, k} |^{2}) y_{x, k} I F {(y_{x, i})}^{*}} + g (| {\bar{s}}_{k} |^{2}) {\bar{s}}_{k} {\bar{s}}_{i}^{*} . \end{matrix}

When

i = 1, \dots, k - 1

, after tedious calculations, we have the following:

\begin{matrix} I F (λ_{i}) & = - μ_{k} [I F {(b_{i})}^{T} a_{k}^{*} + {\bar{s}}_{k} {\bar{s}}_{i}^{*}] + c_{k} a_{k}^{H} I F (b_{i}) - ρ_{k} {\bar{s}}_{i}^{*} + g (| {\bar{s}}_{k} |^{2}) {\bar{s}}_{k} {\bar{s}}_{i}^{*} \\ & = (c_{k} - μ_{k}) a_{k}^{H} I F (b_{i}) + (g (| {\bar{s}}_{k} |^{2}) - μ_{k}) {\bar{s}}_{k} {\bar{s}}_{i}^{*} - ρ_{k} {\bar{s}}_{i}^{*}, \end{matrix}

which completes the proof of Lemma 2 (c).

When

i = k

, after tedious calculations, we have the following:

\begin{matrix} I F (λ_{i}) & = - c_{k} + E {g^{'} (| s_{k} |^{2}) (| s_{k} |^{2} - | s_{k} |^{2} | {\bar{s}}_{k} |^{2} - 2 R e ({\bar{s}}_{k} s_{k}^{*})) | s_{k} |^{2}} + μ_{k} a_{k}^{T} I F {(b_{k})}^{*} \\ & \quad + (c_{k} - μ_{k}) a_{k}^{T} I F {(b_{k})}^{*} - E {g (| s_{k} |^{2}) {\bar{s}}_{k} s_{k}^{*}} + c_{k} a_{k}^{H} I F (b_{k}) \\ & \quad - E {g (| s_{k} |^{2}) s_{k} {\bar{s}}_{k}^{*}} + g (| {\bar{s}}_{k} |^{2}) | {\bar{s}}_{k} |^{2} \\ & = - c_{k} + E {g^{'} (| s_{k} |^{2}) v_{k} ({\bar{s}}_{k}) | s_{k} |^{2}} + 2 c_{k} R e (a_{k}^{H} I F (b_{k})) \\ & \quad - 2 R e (E {g (| s_{k} |^{2}) {\bar{s}}_{k} s_{k}^{*}}) + g (| {\bar{s}}_{k} |^{2}) | {\bar{s}}_{k} |^{2} \\ & = (g (| {\bar{s}}_{k} |^{2}) - c_{k}) | {\bar{s}}_{k} |^{2} + E {g^{'} (| s_{k} |^{2}) v_{k} ({\bar{s}}_{k}) | s_{k} |^{2}} - 2 R e (E {g (| s_{k} |^{2}) {\bar{s}}_{k} s_{k}^{*}}), \end{matrix}

which completes the proof of Lemma 2 (d). □

References

1. Comon, P. Independent component analysis, A new concept? Signal Process. 1994, 36, 287–314.
2. Hyvärinen, A.; Karhunen, J.; Oja, E. Independent Component Analysis; Wiley and Sons: New York, NY, USA, 2001.
3. Chen, Y.H.; Wang, S.P. Low-cost implementation of independent component analysis for biomedical signal separation using very-large-scale integration. IEEE Trans. Circuit Syst. II 2020, 67, 3437–3441.
4. Schell, A.; Oberhauser, H. Nonlinear independent component analysis for discrete-time and continuous-time signals. Ann. Stat. 2023, 51, 487–518.
5. Liu, J.; Ye, J.; E, J. A multi-scale forecasting model for CPI based on independent component analysis and non-linear autoregressive neural network. Phys. A 2023, 609, 128369.
6. E, J.; He, K.; Liu, H.; Ji, Q. A novel separation-ensemble analyzing and forecasting method for the gold price forecasting based on RLS-type independent component analysis. Expert Syst. Appl. 2023, 232, 120852.
7. Bingham, E.; Hyvärinen, A. A fast fixed-point algorithm for independent component analysis of complex valued signals. Int. J. Neural Syst. 2000, 10, 1–8.
8. Novey, M.; Adali, T. On extending the complex FastICA algorithm to noncircular sources. IEEE Trans. Signal Process. 2008, 56, 2148–2154.
9. Novey, M.; Adali, T. Complex ICA by negentropy maximization. IEEE Trans. Neural Netw. 2008, 19, 596–609.
10. Qian, G.; Wei, P. Stability analysis of complex ICA by negentropy maximization: A unique perspective. Neurocomputing 2016, 214, 80–85.
11. Mika, D. Fast gradient algorithm with toral decomposition for complex ICA. Mech. Syst. Signal Process. 2022, 178, 109266.
12. Koldovský, Z.; Tichavský, P. Gradient algorithms for complex non-Gaussian independent component/vector extraction, question of convergence. IEEE Trans. Signal Process. 2019, 67, 1050–1064.
13. E, J.; Ye, J.; He, L.; Jin, H. Performance analysis for complex-valued FastICA and its improvement based on the Tukey M-estimator. Digit. Signal Process. 2021, 115, 103077.
14. Loesch, B.; Yang, B. Cramér-Rao bound for circular and noncircular complex independent component analysis. IEEE Trans. Signal Process. 2012, 61, 365–379.
15. Kautský, V.; Koldovský, Z.; Tichavský, P.; Zarzoso, V. Cramér-Rao bounds for complex-valued independent component extraction: Determined and piecewise determined mixing models. IEEE Trans. Signal Process. 2020, 68, 5230–5243.
16. Fu, G.S.; Phlypo, R.; Anderson, M.; Adalı, T. Complex independent component analysis using three types of diversity: Non-Gaussianity, nonwhiteness and noncircularity. IEEE Trans. Signal Process. 2015, 63, 794–805.
17. Koldovský, Z.; Kautský, V.; Tichavský, P.; Čmejla, J.; Málek, J. Dynamic independent component/vector analysis: Time-variant linear mixtures separable by time-invariant beamformers. IEEE Trans. Signal Process. 2021, 69, 2158–2173.
18. Hyvärinen, A.; Oja, E. A fast fixed-point algorithm for independent component analysis. Neural Comput. 1997, 9, 1483–1492.
19. Hyvärinen, A. Fast and robust fixed-point algorithms for independent component analysis. IEEE Trans. Neural Netw. 1999, 10, 626–634.
20. Chao, J.C.; Douglas, S.C. A Robust Complex FastICA Algorithm Using the Huber M-Estimator Cost Function; Springer: Berlin/Heidelberg, Germany, 2007.
21. Adali, T.; Schreier, P.J.; Scharf, L.L. Complex-valued signal processing: The proper way to deal with impropriety. IEEE Trans. Signal Process. 2011, 59, 5101–5125.
22. Hampel, F.R.; Ronchetti, E.M.; Stahel, W.A. Robust Statistics: The Approach Based on Influence Functions; Wiley and Sons: New York, NY, USA, 1986.
23. Lehmann, E.L. Testing Statistical Hypotheses; Wiley: New York, NY, USA, 1959.
24. Cardoso, J.F. Blind signal separation: Statistical principles. Proc. IEEE 1998, 86, 2009–2025.

Figure 1. Plot of nonlinearities

G (y)

.

Figure 2. Plot of nonlinearities

g (y)

.

Table 1. The nonlinearities in the complex-valued FastICA algorithm.

Nonlinear function $G (x)$	Derivative $g (x)$ of $G (x)$
$G_{\log} = \log (1 + x^{2})$	$g_{\log} = \frac{2 x}{1 + x^{2}}$
$G_{\tanh} = \log \cosh (x)$	$g_{\tanh} = \tanh (x)$
$G_{kurt} = \frac{1}{2} x^{4}$	$g_{kurt} = 2 x^{3}$
$G_{sqrt} = \sqrt{1 + x^{2}}$	$g_{sqrt} = \frac{x}{\sqrt{1 + x^{2}}}$
$G_{Huber} = \begin{cases} \frac{1}{2} {\| x \|}^{2}, & \| x \| \leq θ, \\ θ \| x \| - \frac{θ^{2}}{2}, & \| x \| > θ . \end{cases}$	$g_{Huber} = \begin{cases} x, & \| x \| \leq θ, \\ θ \, \mathrm{sgn} (x), & \| x \| > θ . \end{cases}$
$G_{Tukey} = \begin{cases} \frac{θ^{2}}{6} [1 - {(1 - \frac{x^{2}}{θ^{2}})}^{3}], & \| x \| \leq θ, \\ \frac{θ^{2}}{6}, & \| x \| > θ . \end{cases}$	$g_{Tukey} = \begin{cases} x {(1 - \frac{x^{2}}{θ^{2}})}^{2}, & \| x \| \leq θ, \\ 0, & \| x \| > θ . \end{cases}$

Note: The functions with subscripts denote the logarithm, hyperbolic tangent, kurtosis, mean square root, Huber M-estimator, and Tukey M-estimator functions, respectively.
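The pairs in Table 1 can be written down directly in Python, and the relation g = G′ can be checked with a central finite difference. The sketch below assumes a real argument x (so that ‖x‖ reduces to the absolute value) and an illustrative threshold θ = 1.5; the dictionary name, the helper function, and the evaluation grid are assumptions made for this example.

```python
import numpy as np

THETA = 1.5  # threshold parameter theta (illustrative value)

# Each entry maps a name to the pair (G, g) from Table 1, for real x.
NONLINEARITIES = {
    "log": (lambda x: np.log(1 + x**2),
            lambda x: 2 * x / (1 + x**2)),
    "tanh": (lambda x: np.log(np.cosh(x)),
             lambda x: np.tanh(x)),
    "kurt": (lambda x: 0.5 * x**4,
             lambda x: 2 * x**3),
    "sqrt": (lambda x: np.sqrt(1 + x**2),
             lambda x: x / np.sqrt(1 + x**2)),
    "huber": (lambda x: np.where(np.abs(x) <= THETA,
                                 0.5 * x**2,
                                 THETA * np.abs(x) - 0.5 * THETA**2),
              lambda x: np.where(np.abs(x) <= THETA, x, THETA * np.sign(x))),
    "tukey": (lambda x: np.where(np.abs(x) <= THETA,
                                 THETA**2 / 6 * (1 - (1 - x**2 / THETA**2)**3),
                                 THETA**2 / 6),
              lambda x: np.where(np.abs(x) <= THETA,
                                 x * (1 - x**2 / THETA**2)**2,
                                 0.0)),
}

def max_derivative_error(G, g, x, h=1e-6):
    """Largest gap between g and a central-difference derivative of G on the grid x."""
    return np.max(np.abs((G(x + h) - G(x - h)) / (2 * h) - g(x)))

x = np.linspace(-4.0, 4.0, 801)
errors = {name: max_derivative_error(G, g, x)
          for name, (G, g) in NONLINEARITIES.items()}
```

The Tukey derivative vanishes identically beyond the threshold, while the Huber derivative saturates at ±θ and the kurtosis derivative grows cubically; this difference in growth is what separates the bounded from the unbounded choices.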

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

E, J.; Yang, M. Complex-Valued FastICA Estimator with a Weighted Unitary Constraint: A Robust and Equivariant Estimator. Mathematics 2024, 12, 1840. https://doi.org/10.3390/math12121840