Stable Analysis of Compressive Principal Component Pursuit

You, Qingshan; Wan, Qun

doi:10.3390/a10010029

by Qingshan You ^1,2,* and Qun Wan ^2

¹ School of Computer Science, Civil Aviation Flight University of China, Guanghan 618307, China

² School of Electronic Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China

* Author to whom correspondence should be addressed.

Algorithms 2017, 10(1), 29; https://doi.org/10.3390/a10010029

Submission received: 5 January 2017 / Revised: 10 February 2017 / Accepted: 17 February 2017 / Published: 21 February 2017

Abstract: Compressive principal component pursuit (CPCP) recovers a target matrix that is a superposition of low-complexity structures from a small set of linear measurements. Previous works mainly focus on the analysis of existence and uniqueness. In this paper, we address its stability. We prove that the solution to the related convex program of CPCP gives an estimate that is stable to small entry-wise noise. We also provide numerical simulations, which show that the solution to the related convex program is stable to small entry-wise noise under broad conditions.

Keywords: matrix completion; low-complexity structure; stability analysis; compressive principal component pursuit

1. Introduction

Recently, there has been rapidly increasing interest in recovering a target matrix that is a superposition of low-rank and sparse components from a small set of linear measurements. In many cases, this problem is referred to as matrix completion [1,2,3], which arises in a number of fields, such as medical imaging [4,5], seismology [6], computer vision [7,8], and Kalman filtering [9]. Mathematically, there exists a large-scale data matrix $M = L_{0} + S_{0}$, where $L_{0}$ is a low-rank matrix and $S_{0}$ is a sparse matrix. One of the important problems here is how to extract the intrinsic low-dimensional structure from a small set of linear measurements. In a recent paper [10], E. J. Candès et al. proved that most low-rank matrices and sparse components can be recovered, provided that the rank of the low-rank component is not too large and that the sparse component is reasonably sparse. More importantly, they proved that these two components can be recovered by solving a simple convex optimization problem. In [11], John Wright et al. generalized this problem to decompose a matrix into multiple incoherent components:

\begin{matrix} \text{minimize} & \sum_{i=1}^{τ} λ_{i} ∥X_{i}∥_{(i)} \\ \text{subject to} & \sum_{i=1}^{τ} X_{i} = M, \end{matrix}

(1)

where $∥X_{i}∥_{(i)}$ are norms that encourage various types of low-complexity structure. The authors also provide a sufficient condition that guarantees the existence and uniqueness of the solution of compressive principal component pursuit (CPCP). The result in [11] requires that the components are low-complexity structures.

However, in many applications, the observed measurements are corrupted by noise, which may affect every entry of the data matrix. To further complete the theory developed in [11], it is necessary to study the stability of CPCP, which can guarantee stable and accurate recovery in the presence of entry-wise noise. In this paper, we make an attempt in this direction. We denote by M the observed matrix, which can be decomposed into multiple incoherent components plus noise, and assume that

M = \sum_{i=1}^{τ} X_{i, 0} + Z_{0},

where $X_{i, 0}$ are the corresponding incoherent components and $Z_{0}$ is an independent and identically distributed (i.i.d.) noise term. The only assumption on $Z_{0}$ is that $∥Z_{0}∥_{F} \leq δ$ for some $δ > 0$. In order to recover the unknown low-complexity structures, we suggest solving the following relaxed optimization problem:

\begin{matrix} \text{minimize} & \sum_{i=1}^{τ} λ_{i} ∥X_{i}∥_{(i)} \\ \text{subject to} & ∥\sum_{i=1}^{τ} X_{i} - M∥_{F} \leq δ \end{matrix}

(2)

In this paper, we prove that the solution of (2) is stable to small entry-wise noise. The rest of the paper is organized as follows. In Section 2, we introduce the notation and state the main result, which is proven in Section 3 and Section 4. In Section 3, we give two lemmas that are important parts of the proof of the main result. In Section 4, the proof of Theorem 1 is given. We provide numerical results in Section 5 and conclude the paper in Section 6.

2. Notations and Main Results

In this section, we first introduce the notation used throughout this paper and then state the main result.

2.1. Notations

We denote the operator norm of a matrix by $∥X∥$, the Frobenius norm by $∥X∥_{F}$, and the nuclear norm by $∥X∥_{*}$, and we denote the dual norm of $∥X∥_{(i)}$ by $∥X∥_{(i)}^{*}$. The Euclidean inner product between two matrices is defined by $〈X, Y〉 = \mathrm{trace}(X^{*} Y)$. Note that $∥X∥_{F}^{2} = 〈X, X〉$. The Cauchy–Schwarz inequality gives $〈X, Y〉 \leq ∥X∥_{F} ∥Y∥_{F}$, and it is well known that we also have $〈X, Y〉 \leq ∥X∥_{(i)} ∥Y∥_{(i)}^{*}$ (e.g., [1,12]). We say that $∥\cdot∥_{(i)}$ majorizes the Frobenius norm if $∥X∥_{(i)} \geq ∥X∥_{F}$ for all X. Linear transformations that act on the space of matrices are written as $P_{T} X$. The operator $P_{T}$ can be viewed as a high-dimensional matrix acting on the vectorized X. The operator norm of this operator is denoted by $∥P_{T}∥$; note that $∥P_{T}∥ = \sup_{∥X∥_{F} = 1} ∥P_{T} X∥_{F}$.
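The norms and inequalities above can be checked numerically. The following is a minimal sketch, assuming the norm $∥\cdot∥_{(i)}$ is the nuclear norm (whose dual is the operator norm); the random test matrices and the helper name `inner` are illustrative and not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))
Y = rng.standard_normal((5, 4))

def inner(A, B):
    """Euclidean (trace) inner product <A, B> = trace(A^T B)."""
    return float(np.trace(A.T @ B))

op_norm = np.linalg.norm(X, 2)        # operator norm: largest singular value
fro_norm = np.linalg.norm(X, "fro")   # Frobenius norm
nuc_norm = np.linalg.norm(X, "nuc")   # nuclear norm: sum of singular values

# ||X||_F^2 = <X, X>
fro_identity_holds = np.isclose(fro_norm**2, inner(X, X))
# Cauchy-Schwarz: <X, Y> <= ||X||_F ||Y||_F
cauchy_schwarz_holds = inner(X, Y) <= fro_norm * np.linalg.norm(Y, "fro") + 1e-12
# Duality (assumed pairing): <X, Y> <= ||X||_* ||Y||, nuclear vs. operator norm
duality_holds = inner(X, Y) <= nuc_norm * np.linalg.norm(Y, 2) + 1e-12
# The nuclear norm majorizes the Frobenius norm
majorizes_frobenius = nuc_norm >= fro_norm - 1e-12
```

All four flags evaluate to true for any real matrices of matching shape, since each encodes an inequality or identity that holds in general.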

For any matrix vector $x = [X_{i}]$, $i = 1, 2, \dots, τ$, where $X_{i} \in R^{m \times n}$ is the i-th matrix, we consider two norms, defined as $∥x∥_{⋄} := \sum_{i=1}^{τ} λ_{i} ∥X_{i}∥_{(i)}$ and $∥x∥_{2} := \sum_{i=1}^{τ} ∥X_{i}∥_{F}$. To simplify the stability analysis of CPCP, we also define the subspaces (the common component) $γ := [Γ_{i}]$, $Γ_{i} = (\sum_{l=1}^{τ} X_{l}) / τ$, $i = 1, 2, \dots, τ$, and (the different component) $γ^{⊥} := [Γ_{i}^{⊥}]$, $Γ_{i}^{⊥} = X_{i} - Γ_{i}$, $i = 1, 2, \dots, τ$. To analyze the behavior of the projections onto the subspaces $T_{i}$, we define the product projection operator $P_{T_{1}} \times \dots \times P_{T_{τ}} (x) := [P_{T_{1}} (X_{1}), \dots, P_{T_{τ}} (X_{τ})]$.
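The two norms and the split into common and different components can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the component norms $∥\cdot∥_{(i)}$ are passed in as callables because the text keeps them general.

```python
import numpy as np

def diamond_norm(x, lambdas, norms):
    """||x||_diamond = sum_i lambda_i * ||X_i||_(i); `norms` holds one callable per component."""
    return sum(lam * norm(X) for lam, norm, X in zip(lambdas, norms, x))

def two_norm(x):
    """||x||_2 as defined in the text: the sum of the Frobenius norms of the components."""
    return sum(np.linalg.norm(X, "fro") for X in x)

def project_common(x):
    """P_gamma(x): every component is replaced by the average (sum_l X_l) / tau."""
    mean = sum(x) / len(x)
    return [mean.copy() for _ in x]

def project_different(x):
    """P_gamma_perp(x): each component minus the common average."""
    mean = sum(x) / len(x)
    return [X - mean for X in x]
```

By construction, the components of `project_different(x)` sum to the zero matrix, which is the property the stability proof relies on when it applies a perturbation bound to the different component.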

We assume that $∥X_{i}∥_{(i)}$, $i = 1, 2, \dots, τ$, are decomposable norms, defined as follows.

Definition 1 (Decomposable Norms). If there exist a subspace T and a matrix Z satisfying

\begin{matrix} \partial ∥\cdot∥ (X) = \{ Λ \mid P_{T} Λ = Z, \ ∥P_{T^{⊥}} Λ∥^{*} \leq 1 \}, \end{matrix}

(3)

where $∥\cdot∥^{*}$ denotes the dual norm of $∥\cdot∥$ and $P_{T^{⊥}}$ is nonexpansive with respect to $∥\cdot∥^{*}$, then we say that the norm $∥\cdot∥$ is decomposable at X.
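As an illustration of Definition 1, the nuclear norm is decomposable at a rank-r matrix with compact SVD $U Σ V^{T}$: a standard choice (not spelled out in the text, so treated here as an assumption) is $T = \{U A^{T} + B V^{T}\}$, $Z = U V^{T}$, with $P_{T} X = U U^{T} X + X V V^{T} - U U^{T} X V V^{T}$. The sketch below builds these objects; the function name is hypothetical.

```python
import numpy as np

def tangent_projector(L, r):
    """Return (P_T, P_T_perp, Z) for the nuclear norm at a rank-r matrix L.

    Assumed construction: T is the tangent space {U A^T + B V^T} at L,
    and Z = U V^T, where U, V hold the top-r singular vectors of L.
    """
    U, _, Vt = np.linalg.svd(L, full_matrices=False)
    U, V = U[:, :r], Vt[:r].T
    PU, PV = U @ U.T, V @ V.T          # projectors onto column and row spaces

    def P_T(X):
        return PU @ X + X @ PV - PU @ X @ PV

    def P_T_perp(X):
        # Equals (I - PU) X (I - PV), hence nonexpansive in the operator norm.
        return X - P_T(X)

    return P_T, P_T_perp, U @ V.T
```

Here `P_T_perp` is a product of two orthogonal projections applied on either side of X, so it cannot increase the operator norm, which is the dual of the nuclear norm; this matches the nonexpansiveness requirement in the definition.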

Definition 2 (Inexact Certificate). We say Λ is an $(α, β)$-inexact certificate for a putative solution $(X_{1, ⋄}, \dots, X_{τ, ⋄})$ to (1) with parameters $(λ_{1}, \dots, λ_{τ})$ if for each i, $∥P_{T_{i}} Λ - λ_{i} Z_{i}∥_{F} \leq α$ and $∥P_{T_{i}^{⊥}} Λ∥_{(i)}^{*} < λ_{i} β$.

2.2. Main Results

For Problem (1), we have the following result.

Lemma 1 ([11]). Assume there exists a feasible solution $x = (X_{1}, \dots, X_{τ})$ to the optimization Problem (1). Suppose that each of the norms $∥\cdot∥_{(i)}$ is decomposable at $X_{i}$, and that each of the $∥\cdot∥_{(i)}$ majorizes the Frobenius norm. Then, x is the unique optimal solution if $T_{1}, \dots, T_{τ}$ are independent subspaces with

\begin{matrix} ∥P_{T_{i}} P_{T_{j}}∥ < \frac{1}{τ - 1} \quad \forall i \neq j, \end{matrix}

and there exists an $(α, β)$-inexact certificate $\hat{Λ}$ with

\begin{matrix} β + \frac{α \sqrt{τ}}{\sqrt{1 - (τ - 1) \max_{i \neq j} ∥P_{T_{i}} P_{T_{j}}∥}} \times \frac{1}{\min_{l} λ_{l}} \leq 1 . \end{matrix}

The main contribution of this paper is the stability analysis of the solution of CPCP; the main theorem of [13] can be regarded as a special case of our result (although the main idea of the proof is similar to that of [13], there are some important differences). We show that the solution of the convex program (2) is stable to small entry-wise noise under broad conditions. The main result of this paper is stated below.

Theorem 1. Assume that $x_{⋄} = (X_{1, ⋄}, \dots, X_{τ, ⋄})$ and $\hat{x} = (X_{1}, \dots, X_{τ})$ are the solutions of the optimization Problems (1) and (2), respectively. Suppose that each of the norms $∥\cdot∥_{(i)}$ is decomposable at $X_{i, ⋄}$, and that each of the $∥\cdot∥_{(i)}$ majorizes the Frobenius norm. If $T_{1}, \dots, T_{τ}$ are independent subspaces with

\begin{matrix} ∥P_{T_{i}} P_{T_{j}}∥ < \frac{1}{τ - 1} \quad \forall i \neq j \end{matrix}

and there exists an $(α, β)$-inexact certificate $\hat{Λ}$ with

\begin{matrix} β + \frac{α \sqrt{τ}}{\sqrt{1 - (τ - 1) \max_{i \neq j} ∥P_{T_{i}} P_{T_{j}}∥}} \times \frac{1}{\min_{l} λ_{l}} \leq 1, \end{matrix}

(4)

then for any $Z_{0}$ with $∥Z_{0}∥_{F} \leq δ$, the solution $\hat{x}$ to the convex program (2) obeys

\begin{matrix} \sum_{i} ∥\hat{x}_{i} - x_{i, ⋄}∥_{2}^{2} \leq C (n, τ, α, β) δ^{2}, \end{matrix}

(5)

where $C (n, τ, α, β)$ is a numerical constant depending only on $n, τ, α, β$.
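Condition (4) is a scalar inequality in α, β, the weights $λ_{l}$, and the largest pairwise coherence $\max_{i \neq j} ∥P_{T_{i}} P_{T_{j}}∥$, so it is easy to evaluate once those quantities are known. The helper below is a sketch with hypothetical names; it assumes the coherence has already been computed and that τ equals the number of weights.

```python
import math

def certificate_condition(alpha, beta, lambdas, max_coherence):
    """Evaluate the left-hand side of condition (4) and whether it is <= 1.

    max_coherence stands for max_{i != j} ||P_{T_i} P_{T_j}|| and must satisfy
    (tau - 1) * max_coherence < 1, as required by the theorem.
    """
    tau = len(lambdas)
    gap = 1.0 - (tau - 1) * max_coherence
    if gap <= 0:
        raise ValueError("need max_coherence < 1 / (tau - 1)")
    lhs = beta + alpha * math.sqrt(tau) / math.sqrt(gap) / min(lambdas)
    return lhs, lhs <= 1.0

# Example with made-up values: tau = 2, alpha = 0.1, beta = 0.5, coherence 0.5.
lhs, ok = certificate_condition(0.1, 0.5, [1.0, 0.5], 0.5)
```

With these illustrative values the left-hand side is about 0.9, so the condition holds; raising α or the coherence pushes it toward 1.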

3. Main Lemmas

In this section, we present two main lemmas that are used to obtain Theorem 1. We first recall the following result from [11].

Lemma 2 ([11]). Suppose $T_{1}, \dots, T_{τ}$ are independent subspaces of $R^{m \times n}$ and $Z_{1} \in T_{1}, \dots, Z_{τ} \in T_{τ}$, under the other conditions of Lemma 1. Then, the equations

\begin{matrix} P_{T_{i}} Δ = λ_{i} Z_{i} - P_{T_{i}} Λ, \quad i = 1, \dots, τ \end{matrix}

have a solution $Δ \in T_{1} + \dots + T_{τ}$ obeying

\begin{matrix} ∥Δ∥_{F} \leq \sqrt{\frac{α^{2} τ}{1 - (τ - 1) \max_{i \neq j} ∥P_{T_{i}} P_{T_{j}}∥}} . \end{matrix}

In order to bound the behavior of the norm $∥x∥_{⋄}$, we have the first main lemma that is used to obtain Theorem 1.

Lemma 3. Assume $∥P_{T_{i}} P_{T_{j}}∥ < \frac{1}{τ - 1}$ for all $i \neq j$. Suppose there exists an $(α, β)$-inexact certificate $\hat{Λ}$ satisfying Lemma 1. Then, for any perturbation $h = [H_{i}]$ obeying $\sum_{i} H_{i} = 0$,

\begin{matrix} ∥x_{0} + h∥_{⋄} \geq ∥x_{0}∥_{⋄} + \sum_{i = 1}^{τ} (λ_{i} - C_{α} - λ_{i} β) ∥P_{T_{i}^{⊥}} H_{i}∥_{(i)}, \end{matrix}

where $C_{α} = \sqrt{\frac{α^{2} τ}{1 - (τ - 1) \max_{i \neq j} ∥P_{T_{i}} P_{T_{j}}∥}}$. It is easy to see that under the hypothesis of Lemma 1, the coefficients of $∥P_{T_{i}^{⊥}} H_{i}∥_{(i)}$ satisfy $λ_{i} - C_{α} - λ_{i} β > 0$.

Proof. By convexity, for any subgradient $z = [Z_{i}] \in \partial ∥x_{0}∥_{⋄}$, we have

\begin{matrix} ∥x_{0} + h∥_{⋄} \geq ∥x_{0}∥_{⋄} + \sum_{i = 1}^{τ} λ_{i} 〈Z_{i}, H_{i}〉 . \end{matrix}

Now, because each norm $∥\cdot∥_{(i)}$ is decomposable at $X_{i}$, there exist Λ, $Z_{i}$, α, and β obeying $∥P_{T_{i}} Λ - λ_{i} Z_{i}∥_{F} \leq α$ and $∥P_{T_{i}^{⊥}} Λ∥_{(i)}^{*} < λ_{i} β$. Let $Δ_{i} := P_{T_{i}} Δ = λ_{i} Z_{i} - P_{T_{i}} Λ \in T_{i}$ (see Lemma 2). Note that

\begin{matrix} Λ + Δ_{i} + P_{T_{i}^{⊥}} (λ_{i} Z_{i} - Λ) & = & Λ + λ_{i} Z_{i} - P_{T_{i}} Λ + P_{T_{i}^{⊥}} λ_{i} Z_{i} - P_{T_{i}^{⊥}} Λ \\ & = & λ_{i} Z_{i} + P_{T_{i}^{⊥}} λ_{i} Z_{i} \\ & = & λ_{i} Z_{i}, \end{matrix}

where the last equality follows from $Z_{i} \in T_{i}$. Using this identity, we now bound $\sum_{i = 1}^{τ} λ_{i} 〈Z_{i}, H_{i}〉$:

\begin{matrix} \sum_{i = 1}^{τ} λ_{i} 〈Z_{i}, H_{i}〉 & = & \sum_{i = 1}^{τ} 〈Λ + Δ_{i} + P_{T_{i}^{⊥}} (λ_{i} Z_{i} - Λ), H_{i}〉 \\ & = & \sum_{i = 1}^{τ} 〈Λ, H_{i}〉 + \sum_{i = 1}^{τ} 〈P_{T_{i}} Δ, H_{i}〉 + \sum_{i = 1}^{τ} 〈P_{T_{i}^{⊥}} (λ_{i} Z_{i} - Λ), H_{i}〉 \\ & = & 〈Λ, \sum_{i = 1}^{τ} H_{i}〉 + \sum_{i = 1}^{τ} 〈Δ, P_{T_{i}} H_{i}〉 + \sum_{i = 1}^{τ} 〈λ_{i} Z_{i} - Λ, P_{T_{i}^{⊥}} H_{i}〉 \\ & = & \sum_{i = 1}^{τ} 〈Δ, H_{i}〉 - \sum_{i = 1}^{τ} 〈Δ, P_{T_{i}^{⊥}} H_{i}〉 + \sum_{i = 1}^{τ} 〈λ_{i} Z_{i} - Λ, P_{T_{i}^{⊥}} H_{i}〉 \\ & \geq & \sum_{i = 1}^{τ} 〈λ_{i} Z_{i} - Λ, P_{T_{i}^{⊥}} H_{i}〉 - \sum_{i = 1}^{τ} ∥Δ∥_{F} ∥P_{T_{i}^{⊥}} H_{i}∥_{F} \\ & \geq & \sum_{i = 1}^{τ} 〈λ_{i} Z_{i} - Λ, P_{T_{i}^{⊥}} H_{i}〉 - \sum_{i = 1}^{τ} ∥Δ∥_{F} ∥P_{T_{i}^{⊥}} H_{i}∥_{(i)} . \end{matrix}

Here the fourth line uses $\sum_{i} H_{i} = 0$ and $P_{T_{i}} H_{i} = H_{i} - P_{T_{i}^{⊥}} H_{i}$, the fifth line uses $\sum_{i} 〈Δ, H_{i}〉 = 〈Δ, \sum_{i} H_{i}〉 = 0$ together with the Cauchy–Schwarz inequality, and the last line uses the fact that $∥\cdot∥_{(i)}$ majorizes the Frobenius norm.

By the definition of the dual norm, there exists $\hat{Z}_{i} \in \partial ∥X_{i, 0}∥_{(i)}$ with $∥\hat{Z}_{i}∥_{(i)}^{*} \leq 1$ such that $〈\hat{Z}_{i}, P_{T_{i}^{⊥}} H_{i}〉 = ∥P_{T_{i}^{⊥}} H_{i}∥_{(i)}$. Moreover, with the Cauchy–Schwarz inequality, we have

\begin{matrix} | 〈Λ, P_{T_{i}^{⊥}} H_{i}〉 | & = & | 〈P_{T_{i}^{⊥}} Λ, P_{T_{i}^{⊥}} H_{i}〉 | \leq ∥P_{T_{i}^{⊥}} Λ∥_{(i)}^{*} ∥P_{T_{i}^{⊥}} H_{i}∥_{(i)} . \end{matrix}

Let $Z_{i} = \hat{Z}_{i}$. Then, we can obtain

〈λ_{i} Z_{i} - Λ, P_{T_{i}^{⊥}} H_{i}〉 \geq (λ_{i} - ∥P_{T_{i}^{⊥}} Λ∥_{(i)}^{*}) ∥P_{T_{i}^{⊥}} H_{i}∥_{(i)} .

Combining the inequalities above, we obtain

\begin{matrix} ∥x_{0} + h∥_{⋄} & \geq & ∥x_{0}∥_{⋄} + \sum_{i = 1}^{τ} (λ_{i} - ∥Δ∥_{F} - ∥P_{T_{i}^{⊥}} Λ∥_{(i)}^{*}) ∥P_{T_{i}^{⊥}} H_{i}∥_{(i)} \\ & \geq & ∥x_{0}∥_{⋄} + \sum_{i = 1}^{τ} (λ_{i} - C_{α} - λ_{i} β) ∥P_{T_{i}^{⊥}} H_{i}∥_{(i)} . \end{matrix}

Lemma 3 is established. ☐

To bound $\sum_{i} ∥\hat{x}_{i} - x_{i, ⋄}∥_{F}^{2}$, we have to control the projection operator $P_{T_{1}} \times \dots \times P_{T_{τ}} (x)$. This is the purpose of the second main lemma used to obtain Theorem 1.

Lemma 4. Assume that $∥P_{T_{i}} P_{T_{j}}∥ < \frac{1}{τ - 1}$ for all $i \neq j$. For any matrix vector $x = [X_{i}]$, we have

\begin{matrix} ∥P_{γ} (P_{T_{1}} \times \dots \times P_{T_{τ}}) (x)∥_{F}^{2} \geq \frac{1 - \max_{i} \frac{1}{2} \sum_{j \neq i} (∥P_{T_{i}} P_{T_{j}}∥ + ∥P_{T_{j}} P_{T_{i}}∥)}{τ} ∥P_{T_{1}} \times \dots \times P_{T_{τ}} (x)∥_{F}^{2} . \end{matrix}

It is easy to see that under the hypothesis $∥P_{T_{i}} P_{T_{j}}∥ < \frac{1}{τ - 1}$ for all $i \neq j$, the constant $\frac{1 - \max_{i} \frac{1}{2} \sum_{j \neq i} (∥P_{T_{i}} P_{T_{j}}∥ + ∥P_{T_{j}} P_{T_{i}}∥)}{τ}$ is strictly greater than zero.

Proof. For any matrix vector $x = [X_{i}]$, we have $P_{γ} (x) = [Γ_{i}]$, where $Γ_{i} = (\sum_{l = 1}^{τ} X_{l}) / τ$. It is easy to see that $∥P_{γ} (x)∥_{F}^{2} = \frac{1}{τ} ∥\sum_{l = 1}^{τ} X_{l}∥_{F}^{2}$. Then, we have

\begin{matrix} ∥P_{γ} (P_{T_{1}} \times \dots \times P_{T_{τ}}) (x)∥_{F}^{2} & = & \frac{1}{τ} ∥\sum_{i = 1}^{τ} P_{T_{i}} X_{i}∥_{F}^{2} \\ & = & \frac{1}{τ} \sum_{i = 1}^{τ} \left( ∥P_{T_{i}} X_{i}∥_{F}^{2} + \sum_{j \neq i} 〈P_{T_{i}} X_{i}, P_{T_{j}} X_{j}〉 \right) . \end{matrix}

Note that

\begin{matrix} 〈P_{T_{i}} X_{i}, P_{T_{j}} X_{j}〉 & = & 〈P_{T_{i}} X_{i}, P_{T_{i}} P_{T_{j}} X_{j}〉 \\ & \geq & - ∥P_{T_{i}} P_{T_{j}}∥ ∥P_{T_{i}} X_{i}∥_{F} ∥P_{T_{j}} X_{j}∥_{F} . \end{matrix}

Together with $∥P_{T_{i}} P_{T_{j}}∥ < \frac{1}{τ - 1}$ for all $i \neq j$, we have

\begin{matrix} ∥P_{γ} (P_{T_{1}} \times \dots \times P_{T_{τ}}) (x)∥_{F}^{2} & \geq & \frac{1}{τ} \sum_{i = 1}^{τ} \left( ∥P_{T_{i}} X_{i}∥_{F}^{2} - \sum_{j \neq i} ∥P_{T_{i}} P_{T_{j}}∥ ∥P_{T_{i}} X_{i}∥_{F} ∥P_{T_{j}} X_{j}∥_{F} \right) \\ & \geq & \frac{1}{τ} \sum_{i = 1}^{τ} \left( ∥P_{T_{i}} X_{i}∥_{F}^{2} - \sum_{j \neq i} \frac{∥P_{T_{i}} P_{T_{j}}∥}{2} (∥P_{T_{i}} X_{i}∥_{F}^{2} + ∥P_{T_{j}} X_{j}∥_{F}^{2}) \right) \\ & = & \frac{1}{τ} \sum_{i = 1}^{τ} \left( 1 - \frac{1}{2} \sum_{j \neq i} (∥P_{T_{i}} P_{T_{j}}∥ + ∥P_{T_{j}} P_{T_{i}}∥) \right) ∥P_{T_{i}} X_{i}∥_{F}^{2} \\ & \geq & \frac{1 - \max_{i} \frac{1}{2} \sum_{j \neq i} (∥P_{T_{i}} P_{T_{j}}∥ + ∥P_{T_{j}} P_{T_{i}}∥)}{τ} \sum_{i} ∥P_{T_{i}} X_{i}∥_{F}^{2} \\ & \geq & \frac{1 - \max_{i} \frac{1}{2} \sum_{j \neq i} (∥P_{T_{i}} P_{T_{j}}∥ + ∥P_{T_{j}} P_{T_{i}}∥)}{τ} ∥P_{T_{1}} \times \dots \times P_{T_{τ}} (x)∥_{F}^{2}, \end{matrix}

where in the second inequality we have used the inequality $2 x y \leq x^{2} + y^{2}$, valid for any $x, y$. Therefore, Lemma 4 is established. ☐
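The inequality of Lemma 4 can be sanity-checked numerically. The sketch below is an illustration under stated assumptions: matrices are vectorized, each subspace $T_{i}$ is a random low-dimensional subspace of the vectorized space (so $P_{T_{i}}$ is an explicit projection matrix), and squared norms of matrix vectors are taken as sums of squared Euclidean norms of the components, as in the proof. All function names are hypothetical.

```python
import numpy as np

def random_projector(dim, k, rng):
    """Orthogonal projector onto a random k-dimensional subspace of R^dim."""
    Q, _ = np.linalg.qr(rng.standard_normal((dim, k)))
    return Q @ Q.T

def lemma4_sides(projectors, xs):
    """Return (lhs, rhs) of the Lemma 4 inequality for vectorized components xs."""
    tau = len(projectors)
    px = [P @ x for P, x in zip(projectors, xs)]
    # ||P_gamma(P_T1 x ... x P_Ttau)(x)||^2 = (1 / tau) * ||sum_i P_Ti X_i||^2
    lhs = np.linalg.norm(sum(px)) ** 2 / tau

    def coherence(A, B):
        return np.linalg.norm(A @ B, 2)   # operator norm of P_Ti P_Tj

    worst = max(
        0.5 * sum(
            coherence(projectors[i], projectors[j]) + coherence(projectors[j], projectors[i])
            for j in range(tau) if j != i
        )
        for i in range(tau)
    )
    rhs = (1.0 - worst) / tau * sum(np.linalg.norm(p) ** 2 for p in px)
    return lhs, rhs
```

The chain of inequalities in the proof does not itself use the coherence hypothesis; that hypothesis only guarantees that the constant on the right-hand side is positive, so the check `lhs >= rhs` is expected to pass for arbitrary subspaces.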

4. Proof of Theorem 1

In this section, we provide the proof of Theorem 1. The proof is based on two elementary but important properties of $\hat{x}$, the solution of Problem (2). First, note that $x_{0}$ is also a feasible solution to Problem (2) and $\hat{x}$ is the optimal solution; therefore, $∥\hat{x}∥_{⋄} \leq ∥x_{0}∥_{⋄}$. Second, by the triangle inequality, we obtain

\begin{matrix} ∥\hat{x} - x_{0}∥_{2} & = & ∥\hat{x} - M - (x_{0} - M)∥_{2} \\ & \leq & ∥\hat{x} - M∥_{2} + ∥x_{0} - M∥_{2} \\ & \leq & 2 δ . \end{matrix}

(6)

Let $\hat{x} = x_{0} + h$, where $h = [H_{i}]$. According to the definition of the subspace γ, we write $h^{γ} := P_{γ} (h)$ and $h^{γ^{⊥}} := P_{γ^{⊥}} (h)$ for short. Our main aim is to bound $∥h∥_{2} = ∥\hat{x} - x_{0}∥_{2}$, which can be rewritten as

\begin{matrix} ∥h∥_{2}^{2} & = & ∥h^{γ}∥_{2}^{2} + ∥h^{γ^{⊥}}∥_{2}^{2} \\ & = & ∥h^{γ}∥_{2}^{2} + ∥P_{T_{1}} \times \dots \times P_{T_{τ}} h^{γ^{⊥}}∥_{2}^{2} + ∥P_{T_{1}^{⊥}} \times \dots \times P_{T_{τ}^{⊥}} h^{γ^{⊥}}∥_{2}^{2} . \end{matrix}

(7)

Combining with (6), we have

\begin{matrix} ∥h^{γ}∥_{2}^{2} = \sum_{i} ∥\frac{\sum_{j = 1}^{τ} H_{j}}{τ}∥_{2}^{2} \leq \frac{4 δ^{2}}{τ} . \end{matrix}

Therefore, it remains to bound the other two terms on the right-hand side of (7). We bound these two terms separately.

The norm equivalence theorem states that any two norms on a finite-dimensional normed space are equivalent, which implies that there exist two constants $C (n, τ) \geq c (n, τ) > 0$ satisfying

\begin{matrix} c (n, τ) ∥x∥_{2} \leq ∥x∥_{⋄} \leq C (n, τ) ∥x∥_{2} . \end{matrix}

(8)
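As a concrete instance of (8), consider $τ = 2$ with the nuclear norm and the entrywise $ℓ_{1}$ norm. This pairing is an assumption made here for illustration (the text keeps the norms general), and the constants below are one admissible choice rather than the sharpest ones. They follow from the standard bounds stated in the comments.

```latex
% Assumed setting: \tau = 2, \|X\|_{(1)} = \|X\|_* (nuclear), \|X\|_{(2)} = \|X\|_1 (entrywise),
% with X_i \in R^{m \times n} and \|x\|_2 = \|X_1\|_F + \|X_2\|_F as defined in Section 2.1.
% Standard facts: \|X\|_F \le \|X\|_* \le \sqrt{\min(m,n)}\,\|X\|_F
%            and  \|X\|_F \le \|X\|_1 \le \sqrt{mn}\,\|X\|_F .
\begin{aligned}
\|x\|_{\diamond} &= \lambda_1 \|X_1\|_* + \lambda_2 \|X_2\|_1
   \;\ge\; \min(\lambda_1, \lambda_2)\,\bigl(\|X_1\|_F + \|X_2\|_F\bigr)
   \;=\; c\,\|x\|_2 , \\
\|x\|_{\diamond} &\le \lambda_1 \sqrt{\min(m,n)}\,\|X_1\|_F + \lambda_2 \sqrt{mn}\,\|X_2\|_F
   \;\le\; \max\bigl(\lambda_1 \sqrt{\min(m,n)},\ \lambda_2 \sqrt{mn}\bigr)\,\|x\|_2
   \;=\; C\,\|x\|_2 .
\end{aligned}
```

In this instance $c = \min(λ_{1}, λ_{2})$ and $C = \max(λ_{1} \sqrt{\min(m, n)}, λ_{2} \sqrt{m n})$, which makes the dependence of the constants on the dimension explicit.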

A. Estimate the third term of (7). Let Λ be a dual certificate obeying Lemma 1. Then, using the triangle inequality, we have

\begin{matrix} ∥x_{0} + h∥_{2} \geq ∥x_{0} + h^{γ^{⊥}}∥_{2} - ∥h^{γ}∥_{2} . \end{matrix}

(9)

Combining with Lemma 3, we can obtain

\begin{matrix} ∥x_{0} + h^{γ^{⊥}}∥_{2} & \geq & ∥x_{0}∥_{⋄} + \sum_{i = 1}^{τ} (λ_{i} - C_{α} - λ_{i} β) ∥P_{T_{i}^{⊥}} H_{i}^{Γ^{⊥}}∥_{(i)} \\ & \geq & ∥x_{0}∥_{⋄} + (1 - C_{α} \frac{1}{\min_{i} λ_{i}} - β) \sum_{i = 1}^{τ} λ_{i} ∥P_{T_{i}^{⊥}} H_{i}^{Γ^{⊥}}∥_{(i)} \\ & \geq & ∥x_{0} + h∥_{⋄} + (1 - C_{α} \frac{1}{\min_{i} λ_{i}} - β) \sum_{i = 1}^{τ} λ_{i} ∥P_{T_{i}^{⊥}} H_{i}^{Γ^{⊥}}∥_{(i)}, \end{matrix}

where, to get the third inequality, we used the fact that $∥\hat{x}∥_{⋄} \leq ∥x_{0}∥_{⋄}$. For simplicity, let

\begin{matrix} C_{1} (α, β) ≜ (1 - C_{α} \frac{1}{\min_{i} λ_{i}} - β) > 0 . \end{matrix}

Therefore, we have

\begin{matrix} ∥x_{0} + h^{γ^{⊥}}∥_{2} & \geq & ∥x_{0} + h∥_{⋄} + C_{1} (α, β) \sum_{i = 1}^{τ} λ_{i} ∥P_{T_{i}^{⊥}} H_{i}^{Γ^{⊥}}∥_{(i)} . \end{matrix}

Combining with (9), we can obtain

\begin{matrix} C_{1} (α, β) \sum_{i = 1}^{τ} λ_{i} ∥P_{T_{i}^{⊥}} H_{i}^{Γ^{⊥}}∥_{(i)} \leq ∥h^{γ}∥_{2} . \end{matrix}

Then

\begin{matrix} \sum_{i = 1}^{τ} λ_{i} ∥P_{T_{i}^{⊥}} H_{i}^{Γ^{⊥}}∥_{(i)} \leq C_{2} (α, β) ∥h^{γ}∥_{2}, \end{matrix}

(10)

where $C_{2} (α, β) = 1 / C_{1} (α, β)$. We now estimate the third term of (7). Using the triangle inequality, we have

\begin{matrix} ∥P_{T_{1}^{⊥}} \times \dots \times P_{T_{τ}^{⊥}} h^{γ^{⊥}}∥_{2} & \leq & \sum_{i} ∥P_{T_{i}^{⊥}} H_{i}^{Γ^{⊥}}∥_{F} \\ & \leq & \frac{1}{c (n, τ)} \sum_{i = 1}^{τ} λ_{i} ∥P_{T_{i}^{⊥}} H_{i}^{Γ^{⊥}}∥_{(i)} \\ & \leq & \frac{C_{2} (α, β)}{c (n, τ)} ∥h^{γ}∥_{2} \\ & \leq & C (n, τ, α, β) δ, \end{matrix}

where $C (n, τ, α, β) := \frac{2 C_{2} (α, β)}{c (n, τ) \sqrt{τ}}$. The second inequality follows from (8), the third from (10), and the last from the fact that $∥h^{γ}∥_{2} \leq \frac{2 δ}{\sqrt{τ}}$. Therefore, we can obtain

\begin{matrix} ∥P_{T_{1}^{⊥}} \times \dots \times P_{T_{τ}^{⊥}} (h^{γ^{⊥}})∥_{2}^{2} \leq C^{2} (n, τ, α, β) δ^{2}, \end{matrix}

(11)

which implies that the third term of (7) is bounded by $C^{2} δ^{2}$.

B. Estimate the second term of (7). According to Lemma 4, we can obtain

\begin{matrix} ∥P_{γ} (P_{T_{1}} \times \dots \times P_{T_{τ}}) (h^{γ^{⊥}})∥_{2}^{2} & \geq & \frac{1 - \max_{i} \frac{1}{2} \sum_{j \neq i} (∥P_{T_{i}} P_{T_{j}}∥ + ∥P_{T_{j}} P_{T_{i}}∥)}{τ} ∥P_{T_{1}} \times \dots \times P_{T_{τ}} (h^{γ^{⊥}})∥_{2}^{2} \\ & = & \hat{C} (τ, α, β) ∥P_{T_{1}} \times \dots \times P_{T_{τ}} (h^{γ^{⊥}})∥_{2}^{2}, \end{matrix}

where $\hat{C} (τ, α, β) := \frac{1 - \max_{i} \frac{1}{2} \sum_{j \neq i} (∥P_{T_{i}} P_{T_{j}}∥ + ∥P_{T_{j}} P_{T_{i}}∥)}{τ}$. Note that

\begin{matrix} 0 & = & P_{γ} (h^{γ^{⊥}}) \\ & = & P_{γ} P_{T_{1}} \times \dots \times P_{T_{τ}} h^{γ^{⊥}} + P_{γ} P_{T_{1}^{⊥}} \times \dots \times P_{T_{τ}^{⊥}} h^{γ^{⊥}} . \end{matrix}

Therefore,

\begin{matrix} ∥P_{γ} P_{T_{1}} \times \dots \times P_{T_{τ}} h^{γ^{⊥}}∥_{2} & = & ∥P_{γ} P_{T_{1}^{⊥}} \times \dots \times P_{T_{τ}^{⊥}} h^{γ^{⊥}}∥_{2} \\ & \leq & ∥P_{T_{1}^{⊥}} \times \dots \times P_{T_{τ}^{⊥}} h^{γ^{⊥}}∥_{2} . \end{matrix}

Combining the previous two inequalities, we have

\begin{matrix} ∥P_{T_{1}} \times \dots \times P_{T_{τ}} h^{γ^{⊥}}∥_{2} & \leq & \frac{∥P_{γ} P_{T_{1}^{⊥}} \times \dots \times P_{T_{τ}^{⊥}} h^{γ^{⊥}}∥_{2}}{\sqrt{\hat{C} (τ, α, β)}} \\ & \leq & \frac{∥P_{T_{1}^{⊥}} \times \dots \times P_{T_{τ}^{⊥}} h^{γ^{⊥}}∥_{2}}{\sqrt{\hat{C} (τ, α, β)}} \\ & \leq & C (n, τ, α, β) δ, \end{matrix}

(12)

where $C (n, τ, α, β)$ is an appropriate constant. Combining (7), (11), and (12) with the bound on $∥h^{γ}∥_{2}^{2}$, we can obtain

\begin{matrix} ∥h∥_{2}^{2} & = & ∥h^{γ}∥_{2}^{2} + ∥P_{T_{1}} \times \dots \times P_{T_{τ}} h^{γ^{⊥}}∥_{2}^{2} + ∥P_{T_{1}^{⊥}} \times \dots \times P_{T_{τ}^{⊥}} h^{γ^{⊥}}∥_{2}^{2} \\ & \leq & \hat{C} (n, τ, α, β) δ^{2} . \end{matrix}

Therefore, Theorem 1 is established.

Remark 1. If $τ = 2$, then Theorem 1 reduces to the main result of [13].

5. Numerical Results

In this section, we report numerical experiments for various values of the noise parameter σ, the sparsity parameter $ρ_{s}$, and the rank r. For each setting of the parameters, we report the average error over 10 trials. Our implementation was written in MATLAB. All computational results were obtained on a desktop computer with a 2.27-GHz CPU (Intel(R) Core(TM) i3) and 2 GB of memory. Without loss of generality, we assume that $τ = 2$. In [13], the authors verified this result numerically with the Accelerated Proximal Gradient (APG) method. In our numerical experiments, we show that the result also holds for Principal Component Pursuit by Alternating Direction Method (PCP-ADM). In our simulations, the matrix is generated as $M = L_{0} + S_{0} + N_{0}$, where the rank-r matrix $L_{0}$ is a product $L_{0} = X Y^{T}$, and X and Y are $m \times r$ and $n \times r$ matrices whose entries are independently sampled from an $N (0; 1)$ distribution. Following PCP-ADM, we generate $S_{0}$ by choosing a support set Ω of size $k_{s} = ρ_{s} m n$ uniformly at random and setting $S_{0} = P_{Ω} E$. The noise component $N_{0}$ is generated with entries independently sampled from an $N (0; σ)$ distribution. We set $m = n = 200$ and $ρ_{s} = 0.01$; the other parameters required by PCP-ADM are the same as in [10]. We now briefly describe PCP-ADM. In [10], in order to stably recover $\hat{X} = (\hat{L}; \hat{S})$, the ADM method operates on the augmented Lagrangian

\begin{matrix} l (L, S, Y) & = & ∥L∥_{*} + λ ∥S∥_{1} + 〈Y, M - L - S〉 + \frac{μ}{2} ∥M - L - S∥_{F}^{2} . \end{matrix}

The details of the PCP-ADM can be found in [14,15].
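For readers who want to experiment, the alternating scheme on the augmented Lagrangian above can be sketched in a few lines. This is a minimal Python sketch, not the authors' MATLAB implementation: it alternates a singular-value-thresholding step for L, a soft-thresholding step for S, and a multiplier update for Y. The default values of `lam` and `mu` are assumptions taken from common practice for principal component pursuit, and the function names are hypothetical.

```python
import numpy as np

def soft_threshold(X, t):
    """Entrywise shrinkage: the proximal operator of t * ||.||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - t, 0.0)

def singular_value_threshold(X, t):
    """Shrink singular values by t: the proximal operator of t * ||.||_*."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - t, 0.0)) @ Vt

def pcp_adm(M, lam=None, mu=None, tol=1e-7, max_iter=500):
    """Alternating minimization of l(L, S, Y) followed by a multiplier update."""
    m, n = M.shape
    # Assumed defaults (not specified in the text above).
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam
    mu = m * n / (4.0 * np.abs(M).sum()) if mu is None else mu
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)
    norm_M = np.linalg.norm(M, "fro")
    for _ in range(max_iter):
        L = singular_value_threshold(M - S + Y / mu, 1.0 / mu)   # minimize over L
        S = soft_threshold(M - L + Y / mu, lam / mu)             # minimize over S
        R = M - L - S                                            # constraint residual
        Y = Y + mu * R                                           # multiplier update
        if np.linalg.norm(R, "fro") <= tol * norm_M:             # relative stopping rule
            break
    return L, S
```

Each of the two minimization steps has a closed form because the augmented Lagrangian, viewed as a function of L alone or S alone, is a norm plus a quadratic; this is what makes the alternating direction approach inexpensive per iteration apart from the SVD.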

In our simulations, the PCP-ADM algorithm stops when

\begin{matrix} \frac{∥L + S - M∥_{F}}{∥M∥_{F}} \leq \text{tolerance} \end{matrix}

or when the maximum iteration number ($k_{max} = 500$) is reached. To estimate the errors, we use the root-mean-squared (RMS) errors $∥\hat{L} - L_{0}∥_{F} / n$ and $∥\hat{S} - S_{0}∥_{F} / n$ for the low-rank component and the sparse component, respectively. Figure 1 shows how the RMS errors vary with $σ^{2}$. The RMS error grows approximately linearly with the noise level in Figure 1. This behavior is consistent with Theorem 1 in numerical experiments with PCP-ADM (the same behavior is observed in [13] with APG, which is very different from PCP-ADM in principle).
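The data generation and the RMS error described in this section can be sketched as follows. Two points are assumptions, since the text does not pin them down: the matrix E is taken to have independent ±1 entries, and σ in $N (0; σ)$ is read as the standard deviation. The function names are hypothetical.

```python
import numpy as np

def make_problem(m, n, r, rho_s, sigma, rng):
    """Generate M = L0 + S0 + N0 following the description in Section 5."""
    X = rng.standard_normal((m, r))
    Y = rng.standard_normal((n, r))
    L0 = X @ Y.T                                      # rank-r component
    k_s = int(round(rho_s * m * n))                   # support size k_s = rho_s * m * n
    support = rng.choice(m * n, size=k_s, replace=False)
    S0 = np.zeros(m * n)
    S0[support] = rng.choice([-1.0, 1.0], size=k_s)   # assumption: E has +/-1 entries
    S0 = S0.reshape(m, n)
    N0 = sigma * rng.standard_normal((m, n))          # assumption: sigma is a std. dev.
    return L0 + S0 + N0, L0, S0, N0

def rms_error(estimate, truth):
    """RMS error ||estimate - truth||_F / n, with n the number of columns."""
    return np.linalg.norm(estimate - truth, "fro") / truth.shape[1]
```

A typical experiment would call `make_problem(200, 200, 10, 0.01, sigma, rng)` for a range of `sigma` values, run a solver on the returned M, and average `rms_error` over 10 trials, as the section describes.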

6. Conclusions

In this paper, we have investigated the stability of CPCP. Our main contribution is the proof of Theorem 1, which implies that the solution to the related convex program (2) is stable to small entry-wise noise under broad conditions. It is an extension of the result in [13], which only allows $τ = 2$. Moreover, in the numerical experiments, we have investigated the performance of the PCP-ADM algorithm. Numerical results showed that it is stable to small entry-wise noise.

Acknowledgments

The authors would like to thank the anonymous reviewers for their comments that helped to improve the quality of the paper. This research was supported by the National Natural Science Foundation of China (NSFC) under Grant U1533125, and the Scientific Research Program of the Education Department of Sichuan under Grant 16ZB0032.

Author Contributions

Qingshan You and Qun Wan contributed reagents/materials/analysis tools; Qingshan You wrote the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Candès, E.J.; Recht, B. Exact matrix completion via convex optimization. Found. Comput. Math. 2009, 9, 717–772.
2. Candès, E.J.; Plan, Y. Matrix completion with noise. Proc. IEEE 2010, 98, 925–936.
3. Candès, E.J.; Tao, T. The power of convex relaxation: Near-optimal matrix completion. IEEE Trans. Inf. Theory 2010, 56, 2053–2080.
4. Ellenberg, J. Fill in the blanks: Using math to turn lo-res datasets into hi-res samples. Wired 2010. Available online: https://www.wired.com/2010/02/ff_algorithm/all/1 (accessed on 26 January 2016).
5. Chambolle, A.; Lions, P.-L. Image recovery via total variation minimization and related problems. Numer. Math. 1997, 76, 167–188.
6. Claerbout, J.F.; Muir, F. Robust modeling of erratic data. Geophysics 1973, 38, 826–844.
7. Zeng, B.; Fu, J. Directional discrete cosine transforms: A new framework for image coding. IEEE Trans. Circuits Syst. Video Technol. 2011, 18, 305–313.
8. Elad, M.; Aharon, M. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Trans. Image Process. 2006, 15, 3736–3745.
9. Rodger, J.A. Toward reducing failure risk in an integrated vehicle health maintenance system: A fuzzy multi-sensor data fusion Kalman filter approach for IVHMS. Expert Syst. Appl. 2012, 39, 9821–9836.
10. Candès, E.J.; Li, X.; Ma, Y.; Wright, J. Robust principal component analysis? J. ACM 2011.
11. Wright, J.; Ganesh, A.; Min, K.; Ma, Y. Compressive Principal Component Pursuit. Available online: http://yima.csl.illinois.edu/psfile/CPCP.pdf (accessed on 9 April 2012).
12. Recht, B.; Fazel, M.; Parrilo, P. Guaranteed minimum rank solutions of matrix equations via nuclear norm minimization. arXiv, 2007; arXiv:0706.4138.
13. Zhou, Z.; Li, X.; Wright, J.; Candès, E.J.; Ma, Y. Stable Principal Component Pursuit. arXiv, 2010; arXiv:1001.2363v1.
14. Yuan, X.; Yang, J. Sparse and low rank matrix decomposition via alternating direction method. Pac. J. Optim. 2009, 9, 167–180.
15. Kontogiorgis, S.; Meyer, R. A variable-penalty alternating direction method for convex optimization. Math. Program. 1989, 83, 29–53.

Figure 1. Root-mean-squared (RMS) errors as a function of $σ^{2}$ with $r = 10$; $ρ_{s} = 0.01$; $n = 200$. PCP-ADM: Principal Component Pursuit by Alternating Direction Method.

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license ( http://creativecommons.org/licenses/by/4.0/).
