1. Introduction
Independent component analysis (ICA) is a statistical and computational model in which the observed signals are treated as linear mixtures of underlying source signals. The purpose of ICA is to estimate mutually independent source signals from their linear mixtures without prior knowledge of the sources or the mixing coefficients [1,2]. To date, ICA has been widely applied in diverse fields, such as signal processing [3,4] and financial analysis [5,6], among others. Hence, a growing body of literature recognizes the importance of analyzing the theoretical properties of the ICA approaches employed. The main purpose of this paper is to analyze the robustness and equivariance of the complex-valued fast fixed-point algorithm for ICA (complex-valued FastICA for short) [7,8], which is one of the most prominent algorithms in the complex number domain due to its fast convergence and easy implementation.
Among the many publications, the algorithmic properties are among the most fundamental for complex-valued ICA. Novey and Adali proposed complex ICA based on negentropy maximization (NM) [9] and provided the local stability condition of the algorithm. Furthermore, Qian and Wei gave a stability analysis from a unique perspective [10], pointing out that the NM-based ICA algorithm might yield a poor separation vector even if the source signals meet the stability conditions. Reference [11] presented the Riemann and Lie structures of the complex unitary group, which generalized the topological properties of complex-valued ICA. Koldovský and Tichavský [12] addressed the region of convergence of gradient-related ICA algorithms. Their results showed that the size of the region of convergence depends on the employed algorithm and on the ratio of the scales of the source signals. E et al. [13] provided a performance analysis of the complex-valued FastICA algorithm with the M-estimator cost function, including stability and local convergence. They proved the existence of locally optimal solutions and established stability conditions.
Moreover, the statistical properties are another fundamental aspect of complex-valued ICA. The Cramér-Rao bound (CRB) is essential for describing the performance limit of ICA algorithms and has been studied, e.g., in [14,15], where the authors derived a closed-form expression for the CRB of the separation parameter for complex-valued ICA. Fu et al. [16] established the theory for complex-valued ICA, providing the Cramér-Rao lower bound and identification conditions, which exploited the diversities of non-Gaussianity, non-whiteness, and non-circularity. Furthermore, Koldovský et al. [17] analyzed the accuracy of fast dynamic independent vector analysis, showing asymptotic efficiency under given mixing and statistical models, which coincided with the Cramér-Rao lower bound derived in [15]. Reference [13] also analyzed the statistical properties of the complex-valued FastICA method, including uniformity and robustness (the sequences of the complex-valued FastICA estimator converge in probability to the true demixing vector). However, the robustness property was addressed there without a rigorous mathematical treatment. In this paper, we focus on deriving a closed-form expression of the robustness measure for the complex-valued FastICA functional.
The circular complex-valued FastICA (c-FastICA) algorithm was introduced by Bingham and Hyvärinen [7] as an extension of FastICA in the real domain [18,19]. It has received wide attention for separating complex-valued source signals due to its fast convergence and easy implementation. To improve the robustness against outliers, Chao and Douglas [20] selected the Huber M-estimator as the nonlinearity in the cost function of the complex-circular FastICA algorithm. Their local stability analysis showed that the improved approach was locally stable for Huber’s single-parameter M-estimator cost function with circular source signals. The aforementioned complex-valued FastICA algorithm, however, performs poorly when separating noncircular source signals. To address this obstacle, Novey and Adali extended it to the noncircular source signal separation scenario and analyzed the local convergence of the estimator [8]. Although there have been several attempts to study the statistical properties of the complex-noncircular FastICA algorithm (nc-FastICA for short), a rigorous analysis of the robust and equivariant behavior of nc-FastICA is still missing. To fill this gap, this paper analyzes the robustness against outliers and the separation performance of the complex-valued FastICA estimator with a weighted unitary constraint.
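The distinction between circular and noncircular sources can be illustrated numerically. The following is a minimal sketch that is not taken from the paper: it assumes that second-order circularity is assessed through the pseudo-variance E{s²}, and the helper name `circularity_coefficient` as well as the QPSK and BPSK test signals are illustrative choices.

```python
import numpy as np

def circularity_coefficient(s):
    """Return |E{s^2}| / E{|s|^2} for a complex sample s (centered first).

    Values near 0 suggest a second-order circular signal;
    values near 1 suggest a strongly noncircular one.
    """
    s = s - s.mean()
    return np.abs(np.mean(s ** 2)) / np.mean(np.abs(s) ** 2)

rng = np.random.default_rng(0)
n = 10_000

# QPSK: four equiprobable phases, a standard example of a circular source.
qpsk = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, n)))
# BPSK: real-valued +/-1 symbols, a standard example of a noncircular source.
bpsk = (2.0 * rng.integers(0, 2, n) - 1.0).astype(complex)

rho_qpsk = circularity_coefficient(qpsk)
rho_bpsk = circularity_coefficient(bpsk)
print(rho_qpsk, rho_bpsk)
```

Under these assumptions, the coefficient of the QPSK sample stays close to zero, whereas that of the BPSK sample equals one up to rounding, which is the kind of source for which a noncircular extension is intended.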
The contributions of this paper are three-fold. First, we define the complex-valued FastICA functional and its influence function (IF) for the deflationary procedure. Second, a closed-form expression of the IF for the complex-valued FastICA functional is derived, which is used to measure robustness against outliers from a rigorous mathematical perspective. Third, we prove that the complex-valued FastICA algorithm is equivariant.
This paper is organized as follows:
Section 2 provides the preliminaries of the complex-valued ICA model and the deflationary complex-valued FastICA algorithm under consideration.
Section 3 establishes the theory concerning the complex-valued FastICA functional and its influence function in the setting of ICA over the complex number domain.
Section 4 analyzes the equivariance of the complex FastICA estimator.
Section 5 concludes the whole paper.
3. Robustness of the Complex-Valued FastICA Estimator
3.1. Nonlinearity
From a statistical viewpoint, the nonlinear function G provides information on the higher-order statistics in expectation form, which determines the selection of g (the derivative of the nonlinear function G) in the algorithm (4). Hence, the statistical properties of the complex-valued FastICA estimator lie essentially in the selection of the nonlinear function G. In this paper, we analyze the robustness of the estimator via the IF.
Generally, robustness against outliers is a desirable property for an estimator, meaning that the estimator is insensitive to individual, highly erroneous observations. In this section, we mainly address the problem of how to measure the robustness of the complex-valued FastICA estimator. Heuristically, the value of the nonlinearity cannot grow fast as its argument increases if one needs a robust estimator. Specifically, we list the classical nonlinearities in Table 1. The curves of each nonlinearity and its derivative with respect to x are plotted in Figure 1 and Figure 2, respectively.
From these figures, one can see that the Tukey M-estimator function yields a more robust estimator that is insensitive to outliers, whereas kurtosis gives a non-robust estimator that may be influenced by individual highly erroneous observations. In fact, the values of the kurtosis nonlinearity (red line in Figure 1) increase quickly starting from x = 1 or x = −1 without a downward trajectory. By contrast, for the Tukey M-estimator, outliers do not have much influence on the values of the nonlinearity as the argument grows, so the complex-valued FastICA estimator based on the Tukey M-estimator cost function is recommended for its better robustness. This can also be seen in the curves of the derivatives of the nonlinearities in Figure 2. In the remainder of this paper, we analyze the robustness of the complex-valued FastICA estimator via the IF.
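The heuristic above can be checked numerically. The sketch below is only illustrative: Table 1 is not reproduced here, so the code assumes the textbook real-valued forms of the kurtosis, Huber, and Tukey biweight score functions with their conventional tuning constants (1.345 and 4.685); the exact forms used in the paper may differ.

```python
import numpy as np

def g_kurtosis(x):
    # Derivative of G(x) = x**4 / 4; grows without bound.
    return x ** 3

def g_huber(x, c=1.345):
    # Huber score: linear near zero, clipped to [-c, c] elsewhere.
    return np.clip(x, -c, c)

def g_tukey(x, c=4.685):
    # Tukey biweight score: redescends to exactly zero for |x| > c.
    inside = np.abs(x) <= c
    return np.where(inside, x * (1 - (x / c) ** 2) ** 2, 0.0)

x = np.array([0.5, 2.0, 10.0, 100.0])
for name, g in [("kurtosis", g_kurtosis), ("Huber", g_huber), ("Tukey", g_tukey)]:
    print(name, g(x))
```

With these assumed forms, an observation at x = 100 contributes a value of 10^6 under kurtosis, a bounded value c under Huber, and exactly zero under Tukey, which matches the ordering of robustness described in the text.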
3.2. Influence Function of Complex-Valued FastICA Functional
We analyze the robustness of the complex-valued FastICA estimator by considering the deflationary version of the complex-valued FastICA algorithm. For convenience, we shall hereafter consider the kth column of the demixing matrix, which corresponds to the estimator for finding the kth source signal, in the sense that it gives an estimate of that source signal.
In the complex-valued ICA model, the observed signal vector comes from a distribution with an unknown cdf. Hence, in order to measure the robustness of the complex-valued FastICA estimator, we first define the complex-valued FastICA functional for the deflationary procedure and its influence function.
Definition 1. Assume that the observed vector follows the complex-valued ICA model (1). The complex-valued FastICA functional for the deflationary procedure is defined as follows: subject to the following weighted unitary constraint: where represents a smooth even function and denotes the covariance matrix function of the observed vector at the distribution . We note that the constraint condition (7) can be enforced by the deflationary orthogonalization process, which separates the source signals one by one.
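To make the role of the weighted unitary constraint in Definition 1 concrete, the following sketch runs a one-unit fixed-point iteration on whitened data and maps the result back to the raw data. It is an illustration under stated assumptions rather than the algorithm (4) of the paper: the update is the one-unit rule of the circular FastICA type [7] with the kurtosis-type choice G(u) = u²/2, the constraint is assumed to take the single-vector form wᴴCw = 1 with C the sample covariance, and all names (`one_unit_cfastica`, `C_inv_sqrt`, the synthetic sources) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 5000

# Two circular, non-Gaussian sources (illustrative choices).
s1 = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, T)))   # QPSK
s2 = rng.exponential(1.0, T) * np.exp(2j * np.pi * rng.random(T))   # heavy-tailed modulus
S = np.vstack([s1, s2])

A = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))  # mixing matrix
X = A @ S
X = X - X.mean(axis=1, keepdims=True)

# Sample covariance and its inverse square root (used only for whitening).
C = X @ X.conj().T / T
vals, vecs = np.linalg.eigh(C)
C_inv_sqrt = (vecs / np.sqrt(vals)) @ vecs.conj().T
Z = C_inv_sqrt @ X

def one_unit_cfastica(Z, n_iter=50, seed=0):
    """One-unit fixed-point iteration on whitened data with G(u) = u**2 / 2."""
    r = np.random.default_rng(seed)
    v = r.standard_normal(Z.shape[0]) + 1j * r.standard_normal(Z.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        y = v.conj() @ Z                  # v^H z for every sample
        u = np.abs(y) ** 2
        g, dg = u, np.ones_like(u)        # g = G', dg = G''
        v_new = (Z * (y.conj() * g)).mean(axis=1) - (g + u * dg).mean() * v
        v = v_new / np.linalg.norm(v_new)
    return v

v = one_unit_cfastica(Z)
w = C_inv_sqrt @ v                        # demixing vector acting on the raw data
constraint = np.real(w.conj() @ C @ w)    # w^H C w, equal to ||v||^2 = 1

y = w.conj() @ X
corr = [abs(np.mean(y * s.conj())) / (np.std(y) * np.std(s)) for s in S]
print(constraint, corr)
```

The design choice here is that a unit-norm vector v on whitened data corresponds to w = C^(-1/2) v on the raw data, so that wᴴCw = 1 holds automatically; this is one way to read a weighted unitary constraint when no preprocessing is applied. The extracted signal should correlate strongly with exactly one of the two sources, up to an arbitrary phase.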
Definition 2. The influence function (IF) of the complex-valued FastICA functional at is given by the following: where denotes the probability measure, which puts mass 1 at point . We note that the IF of the complex-valued FastICA functional is defined based on the fact that it is Gâteaux differentiable [22] at the distribution in . Thus, it can also be written as follows: where .
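For reference, the standard definition of the influence function in robust statistics has the following form. This block states the generic textbook definition only; the symbols T (a statistical functional), F (the underlying distribution), z (the contamination point), ε (the contamination level), and Δ_z (the point mass at z) are placeholders, and the notation used in Definition 2 may differ.

```latex
\mathrm{IF}(\mathbf{z}; T, F)
  = \lim_{\varepsilon \downarrow 0}
    \frac{T\big((1-\varepsilon)F + \varepsilon\,\Delta_{\mathbf{z}}\big) - T(F)}{\varepsilon}
  = \left.\frac{\partial}{\partial \varepsilon}\,
    T\big((1-\varepsilon)F + \varepsilon\,\Delta_{\mathbf{z}}\big)\right|_{\varepsilon = 0}
```

The second equality holds whenever the functional is Gâteaux differentiable at F, which is the condition invoked in the text.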
To simplify the notation, we shall hereafter replace with to denote the IF of the complex-valued FastICA functional , (resp. ) by (resp. ) to denote the covariance matrix function (resp. mean vector function) of the observed vector at the distribution , and by , respectively.
The significance of the IF lies in its heuristic interpretation: it reports the influence of contamination at a given point on the estimate. From the robust statistics point of view, the IF quantifies the asymptotic bias resulting from contamination in the observations [22]. Hence, the following section provides a closed-form expression of the IF of the complex-valued FastICA functional.
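The interpretation of the IF as the effect of a single contaminating point can be illustrated with its finite-sample analogue, the sensitivity curve. The sketch below does not involve the FastICA functional: it uses the sample mean and the sample median as stand-in estimators, and the helper name `sensitivity_curve` is hypothetical.

```python
import numpy as np

def sensitivity_curve(estimator, sample, z):
    """Finite-sample analogue of the IF: (n + 1) * (T(sample with z) - T(sample))."""
    n = sample.size
    return (n + 1) * (estimator(np.append(sample, z)) - estimator(sample))

rng = np.random.default_rng(0)
sample = rng.standard_normal(1001)

for z in (1e1, 1e3, 1e6):
    sc_mean = sensitivity_curve(np.mean, sample, z)
    sc_median = sensitivity_curve(np.median, sample, z)
    print(z, sc_mean, sc_median)
```

For the mean, the curve equals z minus the sample mean and therefore grows without bound in z, whereas for the median it stops changing once z lies beyond the sample range. An estimator with a bounded curve of this kind is the sort that is called robust in the sense discussed above.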
3.3. Robustness
In order to analyze the robustness, we use the Lagrange multiplier method to deduce the specific expression of the influence function of the complex-valued FastICA functional. Incorporating the weighted unitary constraint (7) into the cost function (3), the Lagrange function can be obtained as follows:
where is the penalty factor; is the cost function of the complex-valued FastICA. Note that the observed data, , are considered without a preprocessing procedure, in the sense that
wherein , and denote the ith elements of their counterparts, respectively. After differentiating the with respect to and equating to zero, one obtains
where denotes the gradient operator with respect to , yielding the following:
where represents the ith column of the covariance matrix function . Furthermore,
For the sake of convenience in writing, we shall identify the following:
and
respectively, yielding the following:
Theorem 1. The influence function of the complex-valued FastICA functional , , at the centered mixture distribution is as follows: where , , , , is the derivative operator of the nonlinearity in the cost function (3), and denotes the projection of the centered contamination point onto the direction .
To prove Theorem 1, the following two lemmas are needed, which are proved in Appendix A and Appendix B.
Lemma 1. Assume that the observed data, , obey the distribution with the mean vector , and the sources follow Assumption 1. Let , and , one obtains the following:
- (a)
- (b)
,
- (c)
,
- (d)
, ,
- (e)
,
where denotes the projection of the centered contamination point into the direction ; , . The notation of represents the kth column of matrix .
Lemma 2. Assume that the observed data, , obey the distribution with the mean vector , and the sources follow Assumption 1. For in (10) at the distribution , we have the following:
- (a)
,
- (b)
,
- (c)
,
- (d)
,
where , and , , , are defined as in Theorem 1.
Based on Lemmas 1 and 2, we now prove Theorem 1 in the following four steps:
Proof. Step 1. Differentiating and with respect to in (15) yields the following equation:
which is used to obtain a closed-form expression of the influence function of the complex-valued FastICA functional .
Step 2. Calculating the left-hand side of Equation (17). Combining with Equation (13), we conclude the following:
Differentiating with respect to and leveraging Lemma 1, one can obtain the following:
Step 3. Calculating the right-hand side of (17). From (14), we have the following:
Differentiating with respect to and leveraging Lemma 2, one can obtain the following:
Step 4. Deriving the specific expression of the IF of the complex-valued FastICA functional. Plugging the results derived in Step 2 and Step 3 into (17), after tedious manipulation, one obtains the following:
Combining the preceding equalities concludes our proof. □
Remark 1. In the proof of Theorem 1 and Lemmas 1–2, we need to consider the following facts: where denotes the mixing matrix in ICA model (1); is the kth column of matrix ; represents a vector with 1 in the kth element and 0 elsewhere; and is the ith column of the demixing matrix .
Remark 2. Pre-multiplying both sides of Equation (20) by , we have the following: After analyzing (21), one can observe the following: for , or for .
Remark 3. The expression of the complex-valued FastICA IF can also be written as follows:
By observing the closed-form expression of the IF for the complex-valued FastICA functional in Theorem 1, we find that the IF of is the weighted sum of the separation vector with the unbounded weight coefficient function, with respect to the projection of the contaminated point . This finding confirms that the values of are large when outliers are present in the source signals. As can also be seen in Theorem 1, the larger the contaminated point present in the source signals, the higher the values of .