Article

The PPADMM Method for Solving Quadratic Programming Problems

Hai-Long Shen and Xu Tang
1 Department of Mathematics, College of Sciences, Northeastern University, Shenyang 100819, China
2 Northwest Institute of Mechanical and Electrical Engineering, Xianyang 712000, China
* Author to whom correspondence should be addressed.
Mathematics 2021, 9(9), 941; https://doi.org/10.3390/math9090941
Submission received: 28 February 2021 / Revised: 6 April 2021 / Accepted: 9 April 2021 / Published: 23 April 2021

Abstract: In this paper, a preconditioned and proximal alternating direction method of multipliers (PPADMM) is established for iteratively solving equality-constrained quadratic programming problems. Based on a strict matrix analysis, we prove that this method is asymptotically convergent. We also show the connection between this method and some existing methods, so that it combines their advantages. Finally, numerical examples show that the proposed algorithm is efficient, stable, and flexible for solving quadratic programming problems with equality constraints.

1. Introduction

This manuscript considers the following convex optimization model with linear constraints and a separable objective function:
$$\min\; f_1(x) + f_2(y), \qquad \text{s.t.}\;\; Ax + By = b, \tag{1}$$
where $A \in \mathbb{R}^{p \times n}$ and $B \in \mathbb{R}^{p \times m}$ are two matrices, and $b \in \mathbb{R}^{p}$ is a known vector. The objective functions $f_1 : \mathbb{R}^{n} \to \mathbb{R}$ and $f_2 : \mathbb{R}^{m} \to \mathbb{R}$ are two quadratic functions defined by
$$f_1(x) = \tfrac{1}{2} x^{T} F x + x^{T} f, \qquad f_2(y) = \tfrac{1}{2} y^{T} G y + y^{T} g, \tag{2}$$
where $F \in \mathbb{R}^{n \times n}$ and $G \in \mathbb{R}^{m \times m}$ are symmetric positive semidefinite matrices, and $f \in \mathbb{R}^{n}$, $g \in \mathbb{R}^{m}$ are known vectors. This class of convex minimization problems arises in many areas of computational science and engineering, such as compressed sensing [1], finance [2,3], image restoration [4,5,6], network optimization [7,8], and traffic planning [9,10,11,12]. The model (1)–(2) captures many applications in different areas; see the l1-norm regularized least-squares problems in [12,13], the total variation image restoration in [13,14,15,16], and the standard quadratic programming problems in [7,13].
Let $H \in \mathbb{R}^{p \times p}$ be a symmetric positive definite matrix, let $\langle \cdot,\cdot \rangle_{H}$ denote the weighted inner product with weighting matrix $H$, let $\|\cdot\|$ denote the Euclidean norm, and let $\|\cdot\|_{H}$ denote the analogous weighted norm. Note that for vectors $u, v \in \mathbb{R}^{p}$ and a matrix $X \in \mathbb{R}^{p \times p}$, it holds that $\langle u, v \rangle_{H} = \langle Hu, v \rangle$, $\|u\|_{H} = \|H^{\frac{1}{2}} u\|$, and $\|X\|_{H} = \|H^{\frac{1}{2}} X H^{-\frac{1}{2}}\|$.
If $\langle u, v \rangle_{H} = 0$, we say that $u, v \in \mathbb{R}^{p}$ are $H$-orthogonal, which is denoted by $u \perp_{H} v$. In particular, if $H$ is the identity matrix, then the vectors $u$ and $v$ are orthogonal, which is simply denoted by $u \perp v$. For $\zeta \in \mathbb{C}$, $\bar{\zeta}$ stands for its complex conjugate.
The problem (1)–(2) is mathematically equivalent to the unconstrained optimization problem [7]
$$\max_{\lambda} \min_{x, y}\; \psi(x, y, \lambda), \tag{3}$$
where $\psi(x, y, \lambda)$ is the augmented Lagrangian function defined as
$$\psi(x, y, \lambda) = f_1(x) + f_2(y) - \langle Ax + By - b, \lambda \rangle + \frac{\beta}{2} \|Ax + By - b\|^{2}, \tag{4}$$
where $\lambda$ is the Lagrange multiplier and $\beta$ is a regularization parameter. In short, a point $(x^{*}, y^{*})$ is a solution to the problem (1)–(2) if and only if there exists $\lambda^{*} \in \mathbb{R}^{p}$ such that the point $(x^{*}, y^{*}, \lambda^{*}) \in \mathbb{R}^{n} \times \mathbb{R}^{m} \times \mathbb{R}^{p}$ is a solution to the problem (3)–(4) [7].
The most common method for solving the problem (3)–(4) is the alternating direction method of multipliers (ADMM) [7], which performs a Gauss–Seidel decomposition of each iteration of the augmented Lagrangian method (ALM) [17,18]. The scheme of the ADMM for (1)–(2) is
$$\begin{cases} x^{(k+1)} = \arg\min_{x} \{\psi(x, y^{(k)}, \lambda^{(k)})\},\\[2pt] y^{(k+1)} = \arg\min_{y} \{\psi(x^{(k+1)}, y, \lambda^{(k)})\},\\[2pt] \lambda^{(k+1)} = \lambda^{(k)} - \beta\,(A x^{(k+1)} + B y^{(k+1)} - b). \end{cases}$$
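To make the scheme concrete, the following MATLAB sketch applies the ADMM iteration above to a small, randomly generated instance of (1)–(2); the problem data, the penalty parameter, and the iteration count are arbitrary values chosen only for illustration. Because f1 and f2 are quadratic, each subproblem reduces to a linear system.

% Minimal ADMM sketch for problem (1)-(2) with quadratic f1 and f2.
% All data below are hypothetical and chosen only for illustration.
rng(0);                                   % reproducible random data
n = 5; m = 4; p = 3;
F = eye(n); G = eye(m);                   % symmetric positive semidefinite Hessians
f = randn(n,1); g = randn(m,1);
A = randn(p,n); B = randn(p,m); b = randn(p,1);
beta = 1.0;                               % penalty parameter (example value)
x = zeros(n,1); y = zeros(m,1); lambda = zeros(p,1);
for k = 1:300
    % x-subproblem: (F + beta*A'*A) x = A'*lambda + beta*A'*(b - B*y) - f
    x = (F + beta*(A'*A)) \ (A'*lambda + beta*A'*(b - B*y) - f);
    % y-subproblem: (G + beta*B'*B) y = B'*lambda + beta*B'*(b - A*x) - g
    y = (G + beta*(B'*B)) \ (B'*lambda + beta*B'*(b - A*x) - g);
    % multiplier update
    lambda = lambda - beta*(A*x + B*y - b);
end
fprintf('constraint residual: %.2e\n', norm(A*x + B*y - b));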
A significant advantage of the ADMM is that each subproblem involves only one of the functions f1 and f2 of the original problem. Therefore, the variables x and y are treated separately during the iterations, which makes the subproblems in (3)–(4) evidently easier to solve than the original problem (1)–(2). Owing to its easy implementation and impressive efficiency, the ADMM has recently received extensive attention in many different areas, and it is a very effective method for solving convex optimization problems. Glowinski and Marrocco [19] were the first to describe it, and Gabay [20], Glowinski and Le Tallec [21], as well as Eckstein and Bertsekas [22], studied convergence results related to the ADMM. It is suitable for solving convex programming problems with separable variables [7,23,24] and is widely used in image processing, statistical learning, and machine learning [4,5,6,7,8,9].
For convex optimization problems, some gradient-based methods are sensitive to the choice of the step size; if the parameters are not properly selected, the algorithm may fail to converge. In contrast, the ADMM is robust with respect to the choice of parameters: under some mild conditions, the method is guaranteed to converge for any positive value of its single parameter. When the objective function is quadratic, its global convergence has been proved [13,24], and the method is linearly convergent; for example, the ADMM converges linearly to the optimal solution of problems (1) and (2). Although the convergence of the ADMM is well understood, the accurate estimation of its convergence rate is still in its early stages; see, for example, [13,14].
Since exactly solving the subproblems of the classical ADMM can be inefficient, Deng and Yin [25] proposed a generalized ADMM, which adds the proximal terms $\frac{1}{2}\|x - x^{(k)}\|_{P}^{2}$ and $\frac{1}{2}\|y - y^{(k)}\|_{T}^{2}$ to the x- and y-subproblems, respectively. Its scheme for (1)–(2) is
$$\begin{cases} x^{(k+1)} = \arg\min_{x} \{\psi(x, y^{(k)}, \lambda^{(k)}) + \tfrac{1}{2}\|x - x^{(k)}\|_{P}^{2}\},\\[2pt] y^{(k+1)} = \arg\min_{y} \{\psi(x^{(k+1)}, y, \lambda^{(k)}) + \tfrac{1}{2}\|y - y^{(k)}\|_{T}^{2}\},\\[2pt] \lambda^{(k+1)} = \lambda^{(k)} - \alpha\beta\,(A x^{(k+1)} + B y^{(k+1)} - b), \end{cases}$$
where $\alpha \in \big(0, \frac{1+\sqrt{5}}{2}\big)$, and the matrices $P$ and $T$ are symmetric positive semidefinite.
Deng and Yin established the global and linear convergence of the generalized ADMM and gave the mathematical proof for the case where P and T are symmetric positive semidefinite matrices. Moreover, in order to make the subproblems of the generalized ADMM easier to solve and more efficient to run, the key is to choose adequate P and T.
In order to iteratively solve the linearly constrained quadratic programming problem (1)–(2), Bai and Tao [18] proposed a class of preconditioned alternating variable minimization with multiplier (PAVMM) methods by a matrix preconditioning strategy and a parameter-accelerating technique, based on a weighted inner product and the corresponding weighted norm. The iteration scheme is as follows:
$$\begin{cases} x^{(k+1)} = \arg\min_{x} \{\tilde{\psi}(x, y^{(k)}, \lambda^{(k)})\},\\[2pt] y^{(k+1)} = \arg\min_{y} \{\tilde{\psi}(x^{(k+1)}, y, \lambda^{(k)})\},\\[2pt] \lambda^{(k+1)} = \lambda^{(k)} - \alpha\, Q^{-1} W^{-1} (A x^{(k+1)} + B y^{(k+1)} - b), \end{cases}$$
where $\tilde{\psi}(x, y, \lambda)$ is the following weighted augmented Lagrangian function:
$$\tilde{\psi}(x, y, \lambda) = f_1(x) + f_2(y) - \langle Ax + By - b, \lambda \rangle_{W^{-1}} + \frac{\beta}{2} \|Ax + By - b\|_{W^{-1}}^{2},$$
the matrices W and Q are symmetric positive semidefinite and nonsingular, and $\alpha$ is a relaxation parameter. Actually, the PAVMM method is a class of preconditioned alternating direction methods of multipliers (PADMM); therefore, in this manuscript, the PAVMM method is referred to as the PADMM. In particular, the ADMM is a special case of the PADMM: if $W = Q = I$ and $\alpha = \beta$, the PADMM automatically reduces to the ADMM. Later, Bai and Tao [26] also established a preconditioned and relaxed alternating variable minimization with multiplier (PRAVMM) method based on the PAVMM method, with the scheme
$$\begin{cases} \hat{x}^{(k+1)} = \arg\min_{x} \{\tilde{\psi}(x, y^{(k)}, \lambda^{(k)})\},\\[2pt] x^{(k+1)} = \varpi\, \hat{x}^{(k+1)} + (1 - \varpi)\, x^{(k)},\\[2pt] \hat{y}^{(k+1)} = \arg\min_{y} \{\tilde{\psi}(x^{(k+1)}, y, \lambda^{(k+\frac{1}{2})})\},\\[2pt] y^{(k+1)} = \tau\, \hat{y}^{(k+1)} + (1 - \tau)\, y^{(k)},\\[2pt] \lambda^{(k+1)} = \lambda^{(k+\frac{1}{2})} - \alpha\, Q^{-1} W^{-1} (A x^{(k+1)} + B y^{(k+1)} - b), \end{cases}$$
where $\varpi$, $\tau$, and $\alpha$ are positive constants, and $\lambda^{(k+\frac{1}{2})}$ is an intermediate multiplier update (see [26]). As above, we also rewrite the PRAVMM as the PRADMM. To achieve acceleration, the PRADMM clearly adds two relaxation parameters to the iterative process. Hence, the PADMM and the ADMM are special cases of the PRADMM: when $\varpi = \tau = 1$, the PRADMM degenerates into the PADMM; when $\varpi = \tau = 1$, $W = Q = I$, and $\alpha = \beta$, the PRADMM reduces to the ADMM.
To further generalize the PADMM and to improve its convergence speed, in this manuscript we establish a preconditioned and proximal alternating direction method of multipliers (PPADMM) for iteratively solving the problem (1)–(2). Assuming that the matrices $F \in \mathbb{R}^{n \times n}$ and $G \in \mathbb{R}^{m \times m}$ are symmetric positive semidefinite, and that the parameters, the weighting matrix, and the preconditioning matrices are chosen reasonably, the PPADMM converges to the unique solution of the problem (1)–(2) for any initial guess. The PPADMM proposed in this paper is an extension of the PADMM; hence, the ADMM and the PADMM are special cases of the PPADMM. In addition, we test the robustness and effectiveness of the PPADMM by numerical experiments. The experimental results indicate that this method performs better than the ADMM, PADMM, and PRADMM when they are employed to solve the convex optimization problem (1)–(2).
The paper is organized as follows. In Section 2, the PPADMM and its computational properties are established. In Section 3, we introduce the necessary concepts and prove the asymptotic convergence of the PPADMM. In Section 4, we test the method on the image deblurring problem to illustrate its effectiveness. Finally, we give some concluding remarks in Section 5.

2. The PPADMM Method

In this section, we introduce the PPADMM proposed in this paper. At the k-th iteration step of the PPADMM, we add the proximal terms $\frac{1}{2}\|x - x^{(k)}\|_{P}^{2}$ and $\frac{1}{2}\|y - y^{(k)}\|_{T}^{2}$ to the x- and y-subproblems when computing the minimizers $x^{(k+1)}$ and $y^{(k+1)}$. In order to solve the problem (1)–(2) by iterative methods, we establish the preconditioned and proximal alternating direction method of multipliers (PPADMM) as follows:
$$\begin{cases} x^{(k+1)} = \arg\min_{x} \{\tilde{\psi}(x, y^{(k)}, \lambda^{(k)}) + \tfrac{1}{2}\|x - x^{(k)}\|_{P}^{2}\},\\[2pt] y^{(k+1)} = \arg\min_{y} \{\tilde{\psi}(x^{(k+1)}, y, \lambda^{(k)}) + \tfrac{1}{2}\|y - y^{(k)}\|_{T}^{2}\},\\[2pt] \lambda^{(k+1)} = \lambda^{(k)} - \alpha\beta\, Q^{-1} W^{-1} (A x^{(k+1)} + B y^{(k+1)} - b), \end{cases} \tag{10}$$
where $P \in \mathbb{R}^{n \times n}$ and $T \in \mathbb{R}^{m \times m}$ are two symmetric positive semidefinite matrices, and $\alpha$ and $\beta$ are positive constants. We choose $P = \tau_1 I_n - \beta A^{T} A$ with the requirement $\tau_1 > \beta \|A^{T} A\|$, and $T = \tau_2 I_m - \beta B^{T} B$ with the requirement $\tau_2 > \beta \|B^{T} B\|$. In the same way, the matrices W and Q are generated as follows:
$$W = \xi_1 I_n - \beta A^{T} A, \qquad Q = \xi_2 I_m - \beta B^{T} B,$$
where $\xi_1 \in \big(0, \tfrac{1}{\lambda_{\max}(A^{T} A)}\big)$ and $\xi_2 \in \big(0, \tfrac{1}{\lambda_{\max}(B^{T} B)}\big)$.
It is easy to obtain the derivatives of the quadratic functions $f_1(x)$ and $f_2(y)$ in (2):
$$\nabla f_1(x) = F x + f \quad \text{and} \quad \nabla f_2(y) = G y + g.$$
Using these derivatives, we differentiate the first and second formulas in (10) with respect to x and y, respectively; after a simple manipulation, the iteration scheme of the PPADMM can be rewritten in the following mathematically equivalent matrix–vector form:
$$\begin{cases} F x^{(k+1)} + f - A^{T} W^{-1} \lambda^{(k)} + \beta A^{T} W^{-1} (A x^{(k+1)} + B y^{(k)} - b) + P (x^{(k+1)} - x^{(k)}) = 0,\\[2pt] G y^{(k+1)} + g - B^{T} W^{-1} \lambda^{(k)} + \beta B^{T} W^{-1} (A x^{(k+1)} + B y^{(k+1)} - b) + T (y^{(k+1)} - y^{(k)}) = 0,\\[2pt] \lambda^{(k+1)} = \lambda^{(k)} - \alpha\beta\, Q^{-1} W^{-1} (A x^{(k+1)} + B y^{(k+1)} - b), \end{cases}$$
which can be equivalently reformulated as
$$\begin{cases} (F + \beta A^{T} W^{-1} A + P)\, x^{(k+1)} = P x^{(k)} + A^{T} W^{-1} [\beta (b - B y^{(k)}) + \lambda^{(k)}] - f,\\[2pt] (G + \beta B^{T} W^{-1} B + T)\, y^{(k+1)} = B^{T} W^{-1} [\beta (b - A x^{(k+1)}) + \lambda^{(k)}] + T y^{(k)} - g,\\[2pt] \lambda^{(k+1)} = \lambda^{(k)} - \alpha\beta\, Q^{-1} W^{-1} (A x^{(k+1)} + B y^{(k+1)} - b). \end{cases} \tag{13}$$
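To make the structure of (13) concrete, the following MATLAB sketch performs the three updates on a small random instance; the choices W = I, Q = I and the scaled-identity proximal matrices P and T are placeholders used only for this illustration and are not the choices proposed above.

% One run of the PPADMM updates in (13) with placeholder weighting matrices.
% W, Q, P, T and all problem data below are illustrative assumptions.
rng(1);
n = 5; m = 4; p = 3;
F = eye(n); G = eye(m); f = randn(n,1); g = randn(m,1);
A = randn(p,n); B = randn(p,m); b = randn(p,1);
alpha = 1.0; beta = 1.0;
W = eye(p); Q = eye(p);                     % weighting/preconditioning placeholders
P = 0.1*eye(n); T = 0.1*eye(m);             % proximal matrices (placeholders)
Wi = inv(W);                                % p is tiny here, so an explicit inverse is fine
x = zeros(n,1); y = zeros(m,1); lambda = zeros(p,1);
for k = 1:300
    x = (F + beta*A'*Wi*A + P) \ (P*x + A'*Wi*(beta*(b - B*y) + lambda) - f);
    y = (G + beta*B'*Wi*B + T) \ (B'*Wi*(beta*(b - A*x) + lambda) + T*y - g);
    lambda = lambda - alpha*beta*(Q \ (Wi*(A*x + B*y - b)));
end
fprintf('constraint residual: %.2e\n', norm(A*x + B*y - b));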
Since F, G, P, and T are symmetric positive semidefinite matrices, W is a symmetric positive definite matrix, and $\alpha > 0$, $\beta > 0$, the matrices $F + \beta A^{T} W^{-1} A + P$ and $G + \beta B^{T} W^{-1} B + T$ are symmetric positive definite if and only if the following statements are true [18]:
(a)
$\mathrm{null}(F) \cap \mathrm{null}(A) = \{0\}$ or $\mathrm{null}(F) \cap \mathrm{null}(P) = \{0\}$ or $\mathrm{null}(A) \cap \mathrm{null}(P) = \{0\}$;
(b)
$\mathrm{null}(G) \cap \mathrm{null}(B) = \{0\}$ or $\mathrm{null}(G) \cap \mathrm{null}(T) = \{0\}$ or $\mathrm{null}(B) \cap \mathrm{null}(T) = \{0\}$.
Therefore, when the above null-space conditions of the PPADMM are satisfied, the main computational cost of the iteration scheme (13) is the solution of the linear systems with coefficient matrices $F + \beta A^{T} W^{-1} A + P$ and $G + \beta B^{T} W^{-1} B + T$. When the sizes n, m, and/or p are relatively small, direct methods such as the Cholesky factorization can solve these linear systems effectively. However, when the sizes are large, direct methods become too time-consuming, so iterative methods are used instead, e.g., preconditioned conjugate gradient methods. Of course, the weighting matrices W, P, and T and the parameters $\alpha$, $\beta$ should be chosen reasonably so that the matrices $F + \beta A^{T} W^{-1} A + P$ and $G + \beta B^{T} W^{-1} B + T$ are better conditioned than the original matrices F and G, respectively. This makes the linear solves at each step of the PPADMM accurate, fast, and robust.
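For instance, when n is large and the coefficient matrix is only available through matrix-vector products, the x-subproblem in (13) can be solved with MATLAB's built-in pcg function applied to a function handle; the data sizes, tolerance, and iteration cap below are illustrative assumptions.

% Matrix-free conjugate gradient solve of the x-subproblem in (13).
% The problem data, tolerance, and maxit are illustrative assumptions.
rng(2);
n = 200; p = 150;
A = randn(p,n); F = speye(n); P = 0.1*speye(n);
W = speye(p); beta = 1.0;
applyCoeff = @(v) F*v + beta*(A'*(W\(A*v))) + P*v;   % v -> (F + beta*A'*inv(W)*A + P)*v
rhs = randn(n,1);                                    % stands for the right-hand side of (13)
tol = 1e-8; maxit = 500;
[xk, flag, relres, iter] = pcg(applyCoeff, rhs, tol, maxit);
fprintf('pcg flag = %d, relative residual = %.2e after %d iterations\n', flag, relres, iter);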
For a linear system $\mathcal{A} x = b$, we define the splitting
$$\mathcal{A} = D - C_L - C_U,$$
where $\mathcal{A} = (\mathcal{A}_{ij})$ is partitioned into blocks, $D = \mathrm{diag}(\mathcal{A}_{11}, \ldots, \mathcal{A}_{mm})$ is its block diagonal part, and $C_L$ and $C_U$ are the strictly lower and strictly upper block triangular parts, respectively. Let $L = D^{-1} C_L$ and $U = D^{-1} C_U$; then the block SOR iterative method is described as
$$x^{(k+1)} = \mathcal{L}_w\, x^{(k)} + w (I - w L)^{-1} D^{-1} b, \qquad \mathcal{L}_w = (I - w L)^{-1} [(1 - w) I + w U].$$
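For reference, the following MATLAB sketch carries out the SOR iteration above (with 1-by-1 blocks, i.e., point SOR) on a small symmetric positive definite test system; the test matrix, the relaxation factor w, and the iteration count are illustrative assumptions.

% Point SOR iteration x^{k+1} = Lw*x^{k} + w*(I - w*L)^{-1}*D^{-1}*b for A = D - CL - CU.
% The test system, the relaxation factor, and the iteration count are illustrative.
rng(3);
N = 50;
Amat = diag(4*ones(N,1)) + diag(-ones(N-1,1),1) + diag(-ones(N-1,1),-1);  % SPD tridiagonal
bvec = randn(N,1);
D  = diag(diag(Amat));
CL = -tril(Amat,-1);                 % strictly lower part, so that Amat = D - CL - CU
CU = -triu(Amat, 1);                 % strictly upper part
L  = D\CL;  U = D\CU;
w  = 1.2;                            % relaxation factor (example value)
Lw = (eye(N) - w*L) \ ((1-w)*eye(N) + w*U);
c  = (eye(N) - w*L) \ (w*(D\bvec));
x  = zeros(N,1);
for k = 1:200
    x = Lw*x + c;
end
fprintf('SOR residual: %.2e\n', norm(Amat*x - bvec));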
According to the iteration scheme (13), the PPADMM can be classified as a modified block SOR iterative method for solving the following linear system:
$$\begin{cases} (F + \beta A^{T} W^{-1} A)\, x + \beta A^{T} W^{-1} B\, y - A^{T} W^{-1} z = \beta A^{T} W^{-1} b - f,\\[2pt] \beta B^{T} W^{-1} A\, x + (G + \beta B^{T} W^{-1} B)\, y - B^{T} W^{-1} z = \beta B^{T} W^{-1} b - g,\\[2pt] W^{-1} A\, x + W^{-1} B\, y = W^{-1} b. \end{cases}$$
For the properties possessed by a saddle point of the weighted augmented Lagrangian function $\tilde{\psi}(x, y, \lambda)$, we refer the reader to [18] for more details.
Theorem 1.
([7]). Let $F \in \mathbb{R}^{n \times n}$ and $G \in \mathbb{R}^{m \times m}$ be symmetric positive semidefinite matrices, and let $A \in \mathbb{R}^{p \times n}$ and $B \in \mathbb{R}^{p \times m}$ be two arbitrary matrices. Let
$$\mathcal{A}(\beta) = \begin{bmatrix} F + \beta A^{T} W^{-1} A & \beta A^{T} W^{-1} B & -A^{T} W^{-1} \\ \beta B^{T} W^{-1} A & G + \beta B^{T} W^{-1} B & -B^{T} W^{-1} \\ W^{-1} A & W^{-1} B & 0 \end{bmatrix}, \qquad b(\beta) = \begin{pmatrix} \beta A^{T} W^{-1} b - f \\ \beta B^{T} W^{-1} b - g \\ W^{-1} b \end{pmatrix}, \qquad \mathbf{x} = \begin{pmatrix} x \\ y \\ \lambda \end{pmatrix}.$$
Then, the following results are obtained:
(i)
For $x^{*} \in \mathbb{R}^{n}$, $y^{*} \in \mathbb{R}^{m}$, and $\lambda^{*} \in \mathbb{R}^{p}$, the point $(x^{*}, y^{*}, \lambda^{*})$ is a saddle point of the weighted augmented Lagrangian function $\tilde{\psi}(x, y, \lambda)$ if and only if $\mathbf{x}^{*} = (x^{*T}, y^{*T}, \lambda^{*T})^{T}$ is a solution of the linear system $\mathcal{A}(\beta)\, \mathbf{x} = b(\beta)$;
(ii)
The matrix $\mathcal{A}(\beta) \in \mathbb{R}^{(n+m+p) \times (n+m+p)}$ is nonsingular if and only if
(a)
$\big(\mathrm{null}(F) \oplus \mathrm{null}(G)\big) \cap \mathrm{null}\big((A \;\; B)\big) = \{0\}$ and
(b)
$\mathrm{null}(A^{T}) \cap \mathrm{null}(B^{T}) = \{0\}$.
In addition, Lemma 1 gives a criterion for the location of the two roots of a complex quadratic polynomial equation; it is indispensable for proving the asymptotic convergence of the PPADMM in Section 3.
Lemma 1.
([16,27,28]). Assume $\eta$ and $\zeta$ are two complex constants; then both roots of the complex quadratic polynomial equation $\lambda^{2} + \zeta \lambda + \eta = 0$ have modulus less than one if and only if
$$|\zeta - \bar{\zeta} \eta| + |\eta|^{2} < 1.$$
In particular, if both $\eta$ and $\zeta$ are real constants, the above condition degrades into $|\eta| < 1$ and $|\zeta| < 1 + \eta$.
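Lemma 1 is easy to check numerically; in the MATLAB snippet below, the coefficients ζ and η are arbitrary example values, and the criterion is compared with the moduli of the two roots computed directly.

% Numerical check of Lemma 1 for lambda^2 + zeta*lambda + eta = 0.
% The coefficients zeta and eta are arbitrary example values.
zeta = 0.3 + 0.2i;  eta = 0.4 - 0.1i;
criterion = abs(zeta - conj(zeta)*eta) + abs(eta)^2 < 1;   % condition of Lemma 1
r = roots([1, zeta, eta]);                                 % the two roots
inside = all(abs(r) < 1);                                  % both moduli less than one?
fprintf('criterion satisfied: %d, both roots in the unit disk: %d\n', criterion, inside);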

3. The Asymptotic Convergence of the PPADMM

In this section, we prove that the PPADMM is globally and asymptotically convergent, and we also estimate its asymptotic convergence rate.
First, we define the matrices
$$M_{\beta}(\alpha) = \begin{bmatrix} F + \beta A^{T} W^{-1} A + P & 0 & 0 \\ \beta B^{T} W^{-1} A & G + \beta B^{T} W^{-1} B + T & 0 \\ W^{-1} A & W^{-1} B & \frac{1}{\alpha\beta} Q \end{bmatrix}, \qquad N_{\beta}(\alpha) = \begin{bmatrix} P & -\beta A^{T} W^{-1} B & A^{T} W^{-1} \\ 0 & T & B^{T} W^{-1} \\ 0 & 0 & \frac{1}{\alpha\beta} Q \end{bmatrix}.$$
Obviously, the following matrix equations are true:
$$\mathcal{A}(\beta) = M_{\beta}(\alpha) - N_{\beta}(\alpha) = \begin{bmatrix} F + \beta A^{T} W^{-1} A & \beta A^{T} W^{-1} B & -A^{T} W^{-1} \\ \beta B^{T} W^{-1} A & G + \beta B^{T} W^{-1} B & -B^{T} W^{-1} \\ W^{-1} A & W^{-1} B & 0 \end{bmatrix}.$$
Hence, the iteration scheme (13) of the PPADMM can be rewritten as the equivalent matrix-vector form
$$M_{\beta}(\alpha)\, \mathbf{x}^{(k+1)} = N_{\beta}(\alpha)\, \mathbf{x}^{(k)} + b(\beta). \tag{16}$$
Multiplying both sides of Equation (16) by $M_{\beta}^{-1}(\alpha)$, we obtain
$$\mathbf{x}^{(k+1)} = M_{\beta}^{-1}(\alpha) N_{\beta}(\alpha)\, \mathbf{x}^{(k)} + M_{\beta}^{-1}(\alpha)\, b(\beta). \tag{17}$$
It is easy to see that $L_{\beta}(\alpha) = M_{\beta}^{-1}(\alpha) N_{\beta}(\alpha)$ is the iteration matrix of (17). By the matrix splitting above, the iterative scheme of the PPADMM can thus be equivalently reformulated as the matrix splitting iteration method (16) for solving the linear system $\mathcal{A}(\beta)\, \mathbf{x} = b(\beta)$. Therefore, the PPADMM is asymptotically and globally convergent if and only if the spectral radius of the iteration matrix $L_{\beta}(\alpha)$ is less than one, i.e., $\rho(L_{\beta}(\alpha)) < 1$. We define the weighted matrices
$$\hat{A} = W^{-\frac{1}{2}} A, \qquad \hat{B} = W^{-\frac{1}{2}} B, \qquad \hat{Q} = W^{\frac{1}{2}} Q W^{\frac{1}{2}},$$
the augmented matrices
$$\hat{F} = F + \beta A^{T} W^{-1} A + P, \qquad \hat{G} = G + \beta B^{T} W^{-1} B + T,$$
and the compounded matrices
$$\hat{R} = \hat{A}\, (P - \lambda \hat{F})^{-1} \hat{A}^{T}, \qquad \hat{S} = \hat{B}\, (T - \lambda \hat{G})^{-1} \hat{B}^{T}.$$
Therefore, $M_{\beta}(\alpha)$ and $N_{\beta}(\alpha)$ can be rewritten as
$$M_{\beta}(\alpha) = \begin{bmatrix} \hat{F} & 0 & 0 \\ \beta \hat{B}^{T} \hat{A} & \hat{G} & 0 \\ W^{-\frac{1}{2}} \hat{A} & W^{-\frac{1}{2}} \hat{B} & \frac{1}{\alpha\beta} Q \end{bmatrix}, \qquad N_{\beta}(\alpha) = \begin{bmatrix} P & -\beta \hat{A}^{T} \hat{B} & \hat{A}^{T} W^{-\frac{1}{2}} \\ 0 & T & \hat{B}^{T} W^{-\frac{1}{2}} \\ 0 & 0 & \frac{1}{\alpha\beta} Q \end{bmatrix}.$$
Here, we define the block diagonal matrix
$$H = \begin{bmatrix} I & 0 & 0 \\ 0 & I & 0 \\ 0 & 0 & W^{\frac{1}{2}} \end{bmatrix}.$$
Transforming $M_{\beta}(\alpha)$ and $N_{\beta}(\alpha)$ with H, we obtain the block lower-triangular matrix $\hat{M}_{\beta}(\alpha)$ and the block upper-triangular matrix $\hat{N}_{\beta}(\alpha)$ as follows:
$$\hat{M}_{\beta}(\alpha) = H M_{\beta}(\alpha) H = \begin{bmatrix} \hat{F} & 0 & 0 \\ \beta \hat{B}^{T} \hat{A} & \hat{G} & 0 \\ \hat{A} & \hat{B} & \frac{1}{\alpha\beta} \hat{Q} \end{bmatrix}, \qquad \hat{N}_{\beta}(\alpha) = H N_{\beta}(\alpha) H = \begin{bmatrix} P & -\beta \hat{A}^{T} \hat{B} & \hat{A}^{T} \\ 0 & T & \hat{B}^{T} \\ 0 & 0 & \frac{1}{\alpha\beta} \hat{Q} \end{bmatrix}.$$
It follows that $\hat{L}_{\beta}(\alpha) = \hat{M}_{\beta}(\alpha)^{-1} \hat{N}_{\beta}(\alpha) = H^{-1} L_{\beta}(\alpha) H$. Obviously, $\hat{L}_{\beta}(\alpha)$ and $L_{\beta}(\alpha)$ are similar matrices, so they have the same eigenvalues. When $\rho(L_{\beta}(\alpha)) < 1$, the PPADMM converges globally to the optimal solution of the problem (1)–(2).
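For moderate problem sizes, the condition ρ(L_β(α)) < 1 can be verified numerically by assembling M_β(α) and N_β(α) explicitly, as in the MATLAB sketch below; the problem data and the choices of W, Q, P, T are placeholders used only for this illustration.

% Numerical check of rho(L_beta(alpha)) for a small instance of the splitting M - N.
% All problem data and the matrices W, Q, P, T are placeholder choices.
rng(4);
n = 6; m = 5; p = 4;
F = eye(n); G = eye(m);
A = randn(p,n); B = randn(p,m);
alpha = 1.0; beta = 1.0;
W = eye(p); Q = eye(p); P = 0.1*eye(n); T = 0.1*eye(m);
Wi = inv(W);
M = [F + beta*A'*Wi*A + P,  zeros(n,m),             zeros(n,p);
     beta*B'*Wi*A,          G + beta*B'*Wi*B + T,   zeros(m,p);
     Wi*A,                  Wi*B,                   Q/(alpha*beta)];
N = [P,           -beta*A'*Wi*B,  A'*Wi;
     zeros(m,n),   T,             B'*Wi;
     zeros(p,n),   zeros(p,m),    Q/(alpha*beta)];
rho = max(abs(eig(M\N)));            % spectral radius of the iteration matrix
fprintf('spectral radius of L_beta(alpha): %.4f\n', rho);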
Theorem 2.
Suppose that $F \in \mathbb{R}^{n \times n}$ and $G \in \mathbb{R}^{m \times m}$ are symmetric positive semidefinite matrices, and that $A \in \mathbb{R}^{p \times n}$ and $B \in \mathbb{R}^{p \times m}$ are two matrices such that:
(a)
$\big(\mathrm{null}(F) \oplus \mathrm{null}(G)\big) \cap \mathrm{null}\big((A \;\; B)\big) = \{0\}$ and
(b)
$\mathrm{null}(A^{T}) \cap \mathrm{null}(B^{T}) = \{0\}$.
Define the scaled matrices $\hat{A}, \hat{B}, \hat{Q}$, $\hat{F}, \hat{G}$, and $\hat{R}, \hat{S}$ as in (18)–(19). Suppose that $\lambda$ is a nonzero eigenvalue of the PPADMM iteration matrix $L_{\beta}(\alpha)$. If the matrices $I + \beta\lambda\hat{S}$, $P - \lambda\hat{F}$, and $T - \lambda\hat{G}$ are nonsingular, then $\lambda$ is an eigenvalue of the following eigenvalue problem: $(\lambda^{2} \beta^{2} \hat{R}\hat{S}\tilde{Q} + \lambda E + \tilde{Q})\, \tilde{w} = 0$, where
$$E = \alpha\beta \tilde{S} - \tilde{Q} + \alpha\beta^{2} \hat{R}\tilde{S} - \beta^{2} \hat{R}\hat{S}\tilde{Q} + \alpha\beta \hat{R},$$
$$\begin{cases} \tilde{Q} = (I + \beta\lambda\hat{S})^{-1}\, \hat{Q}\, (I + \beta\lambda\hat{S}),\\[2pt] \tilde{S} = (I + \beta\lambda\hat{S})^{-1}\, \hat{S}\, (I + \beta\lambda\hat{S}),\\[2pt] \tilde{w} = (I + \beta\lambda\hat{S})^{-1} w. \end{cases}$$
Proof. 
In order to analyze the nonzero eigenvalues of $L_{\beta}(\alpha)$, we first analyze the nonzero eigenvalues of $\hat{L}_{\beta}(\alpha)$, because the matrices $\hat{L}_{\beta}(\alpha)$ and $L_{\beta}(\alpha)$ are similar. Obviously, conditions (a) and (b) guarantee the nonsingularity of the matrix $\mathcal{A}(\beta)$, so $\lambda = 1$ is not an eigenvalue of $\hat{L}_{\beta}(\alpha)$.
Assume $\lambda$ is a nonunit and nonzero eigenvalue of the matrix $\hat{L}_{\beta}(\alpha)$, and let $\mathbf{u} = (u^{T}, v^{T}, w^{T})^{T} \in \mathbb{C}^{n+m+p}$, with $u \in \mathbb{C}^{n}$, $v \in \mathbb{C}^{m}$, and $w \in \mathbb{C}^{p}$, be the corresponding eigenvector, i.e., $\hat{L}_{\beta}(\alpha)\, \mathbf{u} = \lambda \mathbf{u}$. Then it holds that $\hat{N}_{\beta}(\alpha)\, \mathbf{u} = \lambda \hat{M}_{\beta}(\alpha)\, \mathbf{u}$, or equivalently,
$$\begin{cases} (P - \lambda\hat{F})\, u = \beta \hat{A}^{T} \hat{B}\, v - \hat{A}^{T} w,\\[2pt] (T - \lambda\hat{G})\, v = \lambda\beta \hat{B}^{T} \hat{A}\, u - \hat{B}^{T} w,\\[2pt] \dfrac{1 - \lambda}{\alpha\beta}\, \hat{Q} w = \lambda\, (\hat{A} u + \hat{B} v). \end{cases} \tag{21}$$
Since the matrices $I + \beta\lambda\hat{S}$, $P - \lambda\hat{F}$, and $T - \lambda\hat{G}$ are nonsingular, by simplifying the first and second equations in (21) we get
$$\begin{cases} \hat{A} u = \hat{R}\, (\beta \hat{B} v - w),\\[2pt] \hat{B} v = \hat{S}\, (\lambda\beta \hat{A} u - w). \end{cases}$$
From the third equation in (21), we obviously have $\hat{B} v = \dfrac{1-\lambda}{\alpha\beta\lambda}\, \hat{Q} w - \hat{A} u$. Substituting this into the two equations above, it holds that
$$\begin{cases} (I + \beta\hat{R})\, \hat{A} u = \hat{R}\left(\dfrac{1-\lambda}{\alpha\lambda}\, \hat{Q} - I\right) w,\\[6pt] (I + \beta\lambda\hat{S})\, \hat{A} u = \left(\dfrac{1-\lambda}{\alpha\beta\lambda}\, \hat{Q} + \hat{S}\right) w. \end{cases}$$
Eliminating $\hat{A} u$ from these two equations, we get the following simplified equation:
$$(I + \beta\hat{R})\, \big[(1 - \lambda)\tilde{Q} + \alpha\beta\lambda \tilde{S}\big]\, \tilde{w} = \beta \hat{R}\, (I + \beta\lambda\hat{S})\, \big[(1 - \lambda)\tilde{Q} - \alpha\lambda I\big]\, \tilde{w}. \tag{25}$$
By collecting the terms in (25) according to the powers of $\lambda$, we get the following equation:
$$\Big\{ \lambda^{2} \beta^{2} \hat{R}\hat{S}\tilde{Q} + \lambda \big[(I + \beta\hat{R})(\alpha\beta\tilde{S} - \tilde{Q}) - \beta^{2}\hat{R}\hat{S}\tilde{Q} + \beta\hat{R}\tilde{Q} + \alpha\beta\hat{R}\big] + \tilde{Q} \Big\}\, \tilde{w} = 0. \tag{26}$$
As a result of
$$(I + \beta\hat{R})(\alpha\beta\tilde{S} - \tilde{Q}) - \beta^{2}\hat{R}\hat{S}\tilde{Q} + \beta\hat{R}\tilde{Q} + \alpha\beta\hat{R} = \alpha\beta\tilde{S} - \tilde{Q} + \alpha\beta^{2}\hat{R}\tilde{S} - \beta^{2}\hat{R}\hat{S}\tilde{Q} + \alpha\beta\hat{R},$$
we can rewrite (26) as $(\lambda^{2}\beta^{2}\hat{R}\hat{S}\tilde{Q} + \lambda E + \tilde{Q})\, \tilde{w} = 0$. □
In accordance with Theorem 2, we immediately obtain the following sufficient condition, which guarantees the global asymptotic convergence of the PPADMM.
Theorem 3.
Let the conditions of Theorem 2 be satisfied. For any nonzero vector $\tilde{w} \in \mathbb{C}^{p}$, define
$$\mu_s = \frac{\tilde{w}^{*} \tilde{S} \tilde{w}}{\tilde{w}^{*} \hat{R}\hat{S}\tilde{Q}\, \tilde{w}}, \qquad \mu_q = \frac{\tilde{w}^{*} \tilde{Q} \tilde{w}}{\tilde{w}^{*} \hat{R}\hat{S}\tilde{Q}\, \tilde{w}}, \qquad \chi = \frac{\tilde{w}^{*} \hat{R}\tilde{S} \tilde{w}}{\tilde{w}^{*} \hat{R}\hat{S}\tilde{Q}\, \tilde{w}}, \qquad \mu_r = \frac{\tilde{w}^{*} \hat{R} \tilde{w}}{\tilde{w}^{*} \hat{R}\hat{S}\tilde{Q}\, \tilde{w}}. \tag{28}$$
If $|\kappa - \bar{\kappa}\eta| + |\eta|^{2} < 1$ is satisfied, then the iteration sequence $\{\mathbf{x}^{(k)}\}_{k=0}^{\infty}$ generated by the iteration scheme of the PPADMM in (10) converges to the optimal solution of the problem (1)–(2), where
$$\kappa = \frac{\alpha\beta\mu_s - \mu_q + \alpha\beta^{2}\chi - \beta^{2} + \alpha\beta\mu_r}{\beta^{2}}, \qquad \eta = \frac{\mu_q}{\beta^{2}}.$$
Moreover, the convergence factor of the PPADMM is given by $\sigma(\alpha, \beta) = \max\{\lambda_{+}^{(\max)}, \lambda_{-}^{(\max)}\}$, where
$$\lambda_{\pm}^{(\max)} = \max_{\tilde{w} \in \mathbb{C}^{p} \setminus \{0\}} \left\{ \frac{\big|\kappa \pm \sqrt{\kappa^{2} - 4\eta}\big|}{2} \right\}.$$
Proof. 
According to Theorem 2, we know that the PPADMM generates the following rational eigenvalue problem
$$\big(\lambda^{2} \beta^{2} \hat{R}\hat{S}\tilde{Q} + \lambda E + \tilde{Q}\big)\, \tilde{w} = 0.$$
Multiplying both sides of this equation on the left by $\tilde{w}^{*}$ with $\tilde{w} \in \mathbb{C}^{p} \setminus \{0\}$, we get
$$\lambda^{2} \beta^{2}\, \tilde{w}^{*} \hat{R}\hat{S}\tilde{Q}\, \tilde{w} + \lambda\, \tilde{w}^{*} E\, \tilde{w} + \tilde{w}^{*} \tilde{Q}\, \tilde{w} = 0, \tag{30}$$
where $E = \alpha\beta\tilde{S} - \tilde{Q} + \alpha\beta^{2}\hat{R}\tilde{S} - \beta^{2}\hat{R}\hat{S}\tilde{Q} + \alpha\beta\hat{R}$. We can rewrite Equation (30) as
$$\beta^{2}\lambda^{2} + \lambda\,\big(\alpha\beta\mu_s - \mu_q + \alpha\beta^{2}\chi - \beta^{2} + \alpha\beta\mu_r\big) + \mu_q = 0.$$
With the notation in (28), Equation (30) is organized into
$$\lambda^{2} + \kappa\lambda + \eta = 0. \tag{32}$$
According to Lemma 1, both roots of this quadratic equation in the eigenvalue $\lambda$ have modulus less than one if and only if the coefficients satisfy the following condition:
$$|\kappa - \bar{\kappa}\eta| + |\eta|^{2} < 1.$$
Last but not least, it is easy to obtain the two roots of the quadratic Equation (32); the maximal moduli of the two roots are
$$\lambda_{\pm}^{(\max)} = \max_{\tilde{w} \in \mathbb{C}^{p} \setminus \{0\}} \left\{ \frac{\big|\kappa \pm \sqrt{\kappa^{2} - 4\eta}\big|}{2} \right\}.$$
So, we obtain the convergence factor of the PPADMM as $\sigma(\alpha, \beta) = \max\{\lambda_{+}^{(\max)}, \lambda_{-}^{(\max)}\}$. □

4. Numerical Results

In this section, numerical examples are adopted to illustrate the performance of the PPADMM on the image deblurring problem. In [26], the PRADMM was shown to be more efficient than the PADMM; hence, in this section we only compare the proposed PPADMM with the PRADMM and the ADMM.
Image deblurring is a classical and significant subject in image processing and is usually an inverse problem that has received broad attention in computer vision. Normally, the goal of image deblurring is to recover an unknown original image $u \in \mathbb{R}^{n}$ from a degraded image $y \in \mathbb{R}^{n}$ that is often modeled as $y = Bu + n$, where B is a linear (blurring) operator and n is white Gaussian noise with variance $\sigma^{2}$.
In general, the blurring matrix B is highly ill-conditioned, so the image deblurring problem is an ill-conditioned inverse problem. A common way of handling such ill-conditioned problems is regularization. Therefore, in practical applications, the image deblurring problem is converted into the following optimization problem:
$$\min_{x} \left\{ \frac{1}{2} \|A x - c\|_{2}^{2} + \frac{\varepsilon^{2}}{2} \|K x\|_{2}^{2} \right\}, \tag{33}$$
where A is a blurring operator, K is a regularization operator, $\varepsilon$ is a regularization parameter, and c is the observed image (see [16,29,30,31,32,33]). The mathematical expression (33) can be equivalently reformulated as the equality-constrained quadratic programming problem (1)–(2) as follows:
$$\begin{cases} \min_{x, y}\; \dfrac{\varepsilon^{2}}{2} \|K x\|^{2} + \dfrac{1}{2} \|y\|^{2},\\[4pt] \text{s.t.}\;\; A x - y = b. \end{cases} \tag{34}$$
As mentioned above, the blurred image can be described as $b = A x_t + \omega r$, where $x_t$ is the true image, r is a random noise vector, and $\omega$ is the noise level. In our simulations of image deblurring, the blurred images are generated by first blurring the original images with a blurring kernel and then adding Gaussian white noise. As in [15], we set $\varepsilon = 0.1$, $\beta = 0.1$, and $\omega = 3$.
In our simulations, K is taken to be the Laplacian operator. Under periodic boundary conditions, K and A are block circulant matrices with circulant blocks, and therefore $K^{T}K$ and $A^{T}A$ are also block circulant matrices with circulant blocks, so they can be diagonalized by the FFT2 (two-dimensional fast Fourier transform); see, e.g., [34]. Consequently, the three methods in the experiments require $O(n \log n)$ operations per iteration.
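As a concrete illustration of this FFT2-based diagonalization, the MATLAB sketch below builds a periodic blurring operator from a small averaging kernel, uses the discrete Laplacian stencil for K, and solves a system of the form (ε²K^TK + βA^TA)x = r entirely in the Fourier domain; the image size, the kernel, and the right-hand side are illustrative assumptions.

% Solving (eps^2*K'*K + beta*A'*A) x = r via FFT2, where A and K are block circulant
% with circulant blocks (periodic boundary conditions). All data are illustrative.
N = 64; eps2 = 0.1^2; beta = 0.1;
psf = ones(5)/25;                                % 5x5 averaging blur kernel (example)
padPsf = zeros(N); padPsf(1:5,1:5) = psf;
padPsf = circshift(padPsf, [-2 -2]);             % center the kernel at pixel (1,1)
Ahat = fft2(padPsf);                             % eigenvalues of the blurring operator A
lap = zeros(N); lap(1,1) = 4;                    % 5-point discrete Laplacian stencil
lap(1,2) = -1; lap(2,1) = -1; lap(1,N) = -1; lap(N,1) = -1;
Khat = fft2(lap);                                % eigenvalues of the regularization operator K
r = randn(N);                                    % an arbitrary right-hand side
denom = eps2*abs(Khat).^2 + beta*abs(Ahat).^2;   % eigenvalues of eps^2*K'*K + beta*A'*A
x = real(ifft2(fft2(r) ./ denom));               % elementwise solve in the Fourier domain
res = real(ifft2(denom .* fft2(x))) - r;         % residual of the solved system
fprintf('relative residual: %.2e\n', norm(res,'fro')/norm(r,'fro'));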
The PRADMM is used to solve the problem (34), and its corresponding iteration scheme is as follows:
$$\begin{cases} (\varepsilon^{2} K^{T} K + \beta A^{T} W^{-1} A)\, \hat{x}^{(k+1)} = A^{T} W^{-1} [\beta (c + y^{(k)}) + \lambda^{(k)}],\\[2pt] x^{(k+1)} = \varpi\, \hat{x}^{(k+1)} + (1 - \varpi)\, x^{(k)},\\[2pt] (I + \beta W^{-1})\, \hat{y}^{(k+1)} = W^{-1} [\beta (A x^{(k+1)} - c) + \lambda^{(k)}],\\[2pt] y^{(k+1)} = \tau\, \hat{y}^{(k+1)} + (1 - \tau)\, y^{(k)},\\[2pt] \lambda^{(k+1)} = \lambda^{(k)} - \alpha\, Q^{-1} W^{-1} (A x^{(k+1)} - y^{(k+1)} - c). \end{cases} \tag{35}$$
Applying the PPADMM to (34), we obtain the following iterative scheme:
$$\begin{cases} (\varepsilon^{2} K^{T} K + \beta A^{T} W^{-1} A + P)\, x^{(k+1)} = P x^{(k)} + A^{T} W^{-1} [\beta (c + y^{(k)}) + \lambda^{(k)}],\\[2pt] (I + \beta W^{-1} + T)\, y^{(k+1)} = W^{-1} [\beta (A x^{(k+1)} - c) + \lambda^{(k)}] + T y^{(k)},\\[2pt] \lambda^{(k+1)} = \lambda^{(k)} - \alpha\beta\, Q^{-1} W^{-1} (A x^{(k+1)} - y^{(k+1)} - c). \end{cases} \tag{36}$$
In order to reduce the computational cost of the iterative schemes (35) and (36), we generate the positive definite proximal matrices P and T as $P = \frac{\beta}{\tau_1} I - \beta A^{T} A$ and $T = \big(\frac{\beta}{\tau_2} - \beta\big) I$, where $\tau_1 \in \big(0, \frac{1}{\lambda_{\max}(A^{T} A)}\big)$ and $\tau_2 \in (0, 1)$. For the PRADMM and the PPADMM, $W = \big(\frac{\beta}{\gamma_1} I - \beta A^{T} A\big)^{-1}$ and $Q = \big(\frac{\beta}{\gamma_2} - \beta\big) I$, where $\gamma_1 \in \big(0, \frac{1}{\lambda_{\max}(A^{T} A)}\big)$ and $\gamma_2 \in (0, 1)$. In our simulations, $\tau_1 = 0.9$, $\tau_2 = 0.04$, and $\gamma_1 = \gamma_2 = 0.1$. It is obvious that the matrices $\varepsilon^{2} K^{T} K + \beta A^{T} W^{-1} A$, $I + \beta W^{-1}$, $I + \beta W^{-1} + T$, and $\varepsilon^{2} K^{T} K + \beta A^{T} W^{-1} A + P$ are block circulant matrices with circulant blocks, so the corresponding linear systems can be solved easily by the FFT2. This greatly reduces the computational cost of the schemes (35) and (36).
In our experiments, the methods were implemented in MATLAB (version R2014a) on a PC with a Core i5-4590 3.30 GHz CPU and 8.00 GB RAM. We test the 256-by-256 grayscale images Capsule, Baboon, House, and Cameraman [35], shown in Figure 1. The image Capsule is a .jpg file, and the other three images are .pgm files. For 8-bit images, the pixel values are integers in [0, 255]. Hence, for these four images, we have m = n = p = 256 × 256 = 65,536 in the problem (34). In our deblurring experiments, we test two types of blurring kernels: Type I (fspecial('average', 13)) and Type II (fspecial('gaussian', [9, 9], 3)). For the three methods in the experiments, all initial values are chosen as the blurred image c, i.e., $x^{(0)} = c$; the iterations are terminated once $\|x^{(k)} - x^{(k-1)}\| / \|x^{(k-1)}\| \le 10^{-5}$, and all computations are performed in MATLAB with machine precision $10^{-16}$.
The peak signal-to-noise ratio (PSNR) is the most common objective measure for evaluating the quality of an image, and it is widely used in the literature to investigate the performance of methods for the image restoration problem. The PSNR is defined as follows:
$$\mathrm{PSNR}(x) = 20 \log_{10} \frac{255}{\mathrm{var}(x, x_t)} \quad \text{with} \quad \mathrm{var}(x, x_t) = \sqrt{\frac{1}{n} \sum_{j=0}^{n-1} (x_j^{t} - x_j)^{2}},$$
where $x_j^{t}$ is the j-th pixel value of the original image $x_t$, and $x_j$ is the j-th pixel value of the corresponding restored image x. The PSNR is a very important index in image deblurring, which measures the difference between the restored image and the original image. In general, the PSNR value increases with the quality of the restoration: the larger the PSNR, the better the deblurring effect, and obtaining a high PSNR value can be worth a longer computing time. Table 1 shows the optimal parameters for the PPADMM and the PRADMM in the experiments; these optimal parameters were obtained experimentally by maximizing the corresponding PSNR.
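A direct MATLAB implementation of this PSNR measure for two images stored as double arrays in the range [0, 255]; the test images below are arbitrary examples.

% PSNR between a reference image xt and a restored image x (pixel range [0, 255]).
% The two test images are arbitrary examples.
psnr_val = @(x, xt) 20*log10(255 / sqrt(mean((xt(:) - x(:)).^2)));
xt = 128*ones(64);                 % hypothetical reference image
x  = xt + 5*randn(64);             % hypothetical restored image with errors
fprintf('PSNR = %.2f dB\n', psnr_val(x, xt));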
In order to illustrate the effectiveness and robustness of the PPADMM on different images, Table 2 reports the experimental results of the PPADMM, PRADMM, and ADMM for the image deblurring problem (34), namely the elapsed CPU time in seconds (denoted "CPU") and the peak signal-to-noise ratio in dB (denoted "PSNR"). For accuracy, the reported results are the averages of three repeated runs. In short, the PPADMM, PRADMM, and ADMM are denoted as PPAD, PRAD, and AD. From Table 2, we can see that the PPADMM attains the highest PSNR for all four images and both types of blurring kernel; the improvement in PSNR ranges from 0.38 to 0.94 dB. To show the relationship between the PSNR and the iterations, Figure 2 takes the Capsule image blurred by Type I as an example and plots the PSNR versus iterations for the PPADMM, PRADMM, and ADMM. For this image, the PPADMM solved the problem (34) in 83 iterations and achieved a PSNR of 25.05 dB, whereas the PRADMM and the ADMM took 18 and 47 iterations, respectively, and obtained a PSNR of about 24.5 dB. Clearly, although the PPADMM took more iterations than the PRADMM and the ADMM, the PSNR it attains is noticeably larger than those of the other two methods. Above all, the PPADMM is much more effective than the PRADMM and the ADMM in the quality of the deblurred images, and it is robust.
Finally, for the blurred (Type I) and noised (B&N) image Capsule, we deblur it with the ADMM, PRADMM, and PPADMM, respectively, and show the deblurred images in Figure 3. We can see that the PRADMM and the ADMM produce over-smoothed results and eliminate many image details. In contrast, the PPADMM ensures that the restored image has better visual quality: on the one hand, it effectively removes the blurring effects and the noise; on the other hand, it also reconstructs more image edges than the other two methods.
In summary, it can be concluded from Table 2 and Figures 2 and 3 that for the image deblurring problem (33), the proposed PPADMM is clearly more effective than the PRADMM and the ADMM in the quality of the deblurred images it obtains.

5. Conclusions

In this paper, we have proposed an efficient PPADMM for solving the linearly constrained quadratic programming problem (1)–(2). This algorithm is a proximal generalization of the PADMM, and it extrapolates the block variables and the block multiplier in each new iteration. In fact, from the viewpoint of matrix computation, the PPADMM is naturally a generalized and modified block SOR iteration method. Its theoretical properties, such as global convergence and the convergence factor, are established. Meanwhile, our numerical results verify the efficiency of the proposed method in comparison with the PRADMM and the ADMM. In addition, it can easily be applied to construct similar methods for solving convex optimization problems with several block variables and several equality and inequality constraints. However, how to choose the optimal parameters is still a challenging problem that deserves further discussion.

Author Contributions

Conceptualization, H.-L.S. and X.T.; methodology, H.-L.S. and X.T.; software, X.T.; validation, H.-L.S. and X.T.; formal analysis, H.-L.S. and X.T.; investigation, H.-L.S.; resources, H.-L.S.; data curation, X.T.; writing—original draft preparation, H.-L.S. and X.T.; writing—review and editing, H.-L.S.; visualization, H.-L.S.; supervision, H.-L.S. and X.T.; project administration, H.-L.S.; funding acquisition, H.-L.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Natural Science Foundation of Liaoning Province (No. 20170540323), Central University Basic Scientific Research Business Expenses Special Funds (N2005013).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The codes related to numerical examples refer to the link: https://github.com/Klarustx/ADMM/tree/master/code (accessed on 6 July 2020).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Wang, H.; Li, G.; Tsai, C.L. Regression coefficient and autoregressive order shrinkage and selection via the lasso. J. R. Stat. Soc. 2007, 69, 63–78. [Google Scholar]
  2. Markowitz, H.M. Portfolio Selection: Efficient Diversification of Investment. J. Inst. Actuar. 1992, 119, 243–265. [Google Scholar]
  3. André, P.; Harry, M. Markowitz, Sparsity and Piecewise Linearity in Large Portfolio Optimization Problems. In Sparse Matricies and Their Uses; Academic Press: Cambridge, MA, USA, 1981; pp. 89–108. [Google Scholar]
  4. Recht, B.; Fazel, M.; Parrilo, P.A. Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization. SIAM Rev. J. 2010, 52, 471–501. [Google Scholar] [CrossRef] [Green Version]
  5. Haber, E.; Modersitzki, J. Numerical methods for volume preserving image registration. Inverse Probl. J. 2004, 20, 1621–1638. [Google Scholar] [CrossRef] [Green Version]
  6. Han, D.; Yuan, X.; Zhang, W. An augmented Lagrangian based parallel splitting method for separable convex minimization with applications to image processing. J. Math. Comput. 2014, 83, 2263–2291. [Google Scholar] [CrossRef]
  7. Boyd, S.; Parikh, N.; Chu, E. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers. J. Found. Trends Mach. Learn. 2010, 3, 1–122. [Google Scholar] [CrossRef]
  8. Erseghe, T.; Zennaro, D.; Dall’Anese, E.; Vangelista, L. Fast Consensus by the Alternating Direction Multipliers Method. J. IEEE Trans. Signal Process. 2011, 59, 5523–5537. [Google Scholar] [CrossRef]
  9. He, B.; Xu, M.; Yuan, X. Solving Large-Scale Least Squares Semidefinite Programming by Alternating Direction Methods. SIAM J. Matrix Anal. Appl. 2011, 32, 136–152. [Google Scholar] [CrossRef]
  10. Amir, F.; Farajzadeh, A.; Petrot, N. Proximal Point Algorithm for Differentiable Quasi-Convex Multiobjective Optimization. J. Filomat 2020, 34, 2367–2376. [Google Scholar] [CrossRef]
  11. Goodrich, C.S.; Ragusa, M.A. Holder continuity of weak solutions of p-Laplacian PDEs with VMO coefficients. J. Nonlinear Anal. Theory Methods Appl. 2019, 185, 336–355. [Google Scholar] [CrossRef]
  12. Suantai, S.; Kankam, K.; Cholamjiak, P. A Novel Forward-Backward Algorithm for Solving Convex Minimization Problem in Hilbert Spaces. J. Math. 2020, 8, 42. [Google Scholar] [CrossRef] [Green Version]
  13. Daniel, B. Local Linear Convergence of the Alternating Direction Method of Multipliers on Quadratic or Linear Programs. SIAM J. Optim. 2013, 23, 2183–2207. [Google Scholar]
  14. He, B.; Yuan, X. On the O(1/n) Convergence Rate of the Douglas-Rachford Alternating Direction Method. SIAM J. Numer. Anal. 2012, 50, 700–709. [Google Scholar] [CrossRef]
  15. Morini, B.; Porcelli, M.; Chan, R.H. A reduced Newton method for constrained linear least-squares problems. J. Comput. Appl. Math. 2010, 233, 2200–2212. [Google Scholar] [CrossRef] [Green Version]
  16. Miller, J.H. On the Location of Zeros of Certain Classes of Polynomials with Applications to Numerical Analysis. J. Inst. Math. Appl. 1971, 8, 397–406. [Google Scholar] [CrossRef]
  17. Powell, M.J.D. A method for nonlinear constraints in minimization problems. J. Optim. 1969, 5, 283–298. [Google Scholar]
  18. Bai, Z.Z.; Tao, M. Rigorous convergence analysis of alternating variable minimization with multiplier methods for quadratic programming problems with equality constraints. J. Bit Numer. Math. 2016, 56, 399–422. [Google Scholar] [CrossRef]
  19. Glowinski, R.; Marroco, A. Sur l’approximation, par éléments finis d’ordre un, et la résolution, par pénalisation-dualité d’une classe de problèmes de Dirichlet non linéaires. J. Equine Vet. Sci. 1975, 2, 41–76. [Google Scholar] [CrossRef]
  20. Gabay, D.; Mercier, B. A dual algorithm for the solution of nonlinear variational problems via finite element approximation. Comput. J. Math. Appl. 1976, 2, 17–40. [Google Scholar] [CrossRef] [Green Version]
  21. Oden, J.T. Augmented Lagrangian and Operator-Splitting Methods in Nonlinear Mechanics; Glowinski, R., Tallec, R.L.M., Eds.; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 1989. [Google Scholar]
  22. Eckstein, J.; Bertsekas, D.P. On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators. J. Math. Program. 1992, 55, 293–318. [Google Scholar] [CrossRef] [Green Version]
  23. Bertsekas, D.P. Constrained Optimization and Lagrange Multiplier Methods; Academic Press: Cambridge, MA, USA, 1982. [Google Scholar]
  24. Gabay, D. Chapter ix applications of the method of multipliers to variational inequalities. In Studies in Mathematics and Its Applications; Elsevier: Amsterdam, The Netherlands, 1983; Volume 15, pp. 299–331. [Google Scholar]
  25. Deng, W.; Yin, W. On the Global and Linear Convergence of the Generalized Alternating Direction Method of Multipliers. J. Sci. Comput. 2016, 66, 889–916. [Google Scholar] [CrossRef] [Green Version]
  26. Bai, Z.Z.; Tao, M. On Preconditioned and Relaxed AVMM Methods for Quadratic Programming Problems with Equality Constraints. J. Linear Algebra Its Appl. 2016, 516, 264–285. [Google Scholar] [CrossRef]
  27. Bai, Z.Z.; Parlett, B.N.; Wang, Z.Q. On generalized successive overrelaxation methods for augmented linear systems. J. Numer. Math. 2005, 102, 1–38. [Google Scholar] [CrossRef]
  28. Young, D.M. Iterative Solution of Large Linear Systems; Academic Press: New York, NY, USA, 1971. [Google Scholar]
  29. Xie, S.; Rahardja, S. Alternating Direction Method for Balanced Image Restoration. J. IEEE Trans. Image Process. 2012, 21, 4557–4567. [Google Scholar]
  30. Hearn, T.A.; Reichel, L. Image Denoising via Residual Kurtosis Minimization. J. Numer. Math. Theory Methods Appl. 2015, 8, 406–424. [Google Scholar] [CrossRef]
  31. Hochstenbach, M.E.; Noschese, S.; Reichel, L. Fractional regularization matrices for linear discrete ill-posed problems. J. Eng. Math. 2015, 93, 1–17. [Google Scholar] [CrossRef] [Green Version]
  32. Hearn, T.A.; Reichel, L. Application of denoising methods to regularization of ill-posed problems. J. Numer. Algorithms 2014, 66, 761–777. [Google Scholar] [CrossRef]
  33. Hageman, L.A.; Young, D.M. Iterative Solution of Large Linear Systems. J. Am. Math. Mon. 1973, 80, 92. [Google Scholar] [CrossRef]
  34. Chan, R.H.; Ng, M.K. Conjugate Gradient Methods for Toeplitz Systems. J. SIAM Rev. 1995, 38, 427–482. [Google Scholar] [CrossRef]
  35. Available online: http://decsai.ugr.es/cvg/dbimagenes/ (accessed on 13 March 2014).
Figure 1. The original images: (a) Capsule; (b) Baboon; (c) House; (d) Cameraman.
Figure 2. PSNR versus iterations for the PPADMM, PRADMM, and ADMM.
Figure 3. The blurred and noised (B&N) image, and the deblurred images obtained by the ADMM, PRADMM, and PPADMM.
Table 1. The optimal iteration parameters computed by experiment for the preconditioned and proximal alternating direction method of multipliers (PPADMM) and the preconditioned and relaxed alternating direction method of multipliers (PRADMM).

Image               Capsule        Baboon         House          Cameraman
Blur type           I      II      I      II      I      II      I      II
PPADMM  α*          2.0    2.1     2.1    2.1     2.1    2.1     2.1    2.1
PRADMM  ϖ*          0.8    0.8     0.8    0.8     0.8    0.8     0.8    0.8
        τ*          0.6    0.6     0.6    0.6     0.6    0.6     0.6    0.6
        α*          0.23   0.23    0.23   0.23    0.23   0.26    0.25   0.26
Table 2. The elapsed CPU time (CPU) and peak signal-to-noise ratio (PSNR) for the PPAD (PPADMM), PRAD (PRADMM), and AD (alternating direction method of multipliers, or ADMM) with respect to the optimal parameters.

Image        Blur Type    CPU (s)                              PSNR (dB)
                          PPAD       PRAD       AD             PPAD      PRAD      AD
Capsule      I            1.018558   0.211038   0.449924       25.0501   24.4955   24.4942
             II           0.732373   0.198781   0.425788       26.5942   26.0038   26.0026
Baboon       I            0.670735   0.190145   0.428676       20.8321   20.4507   20.4500
             II           0.625869   0.180112   0.411398       21.0639   20.6318   20.6311
House        I            0.604927   0.188013   0.433318       26.6212   26.1505   26.1484
             II           0.544012   0.161778   0.404065       27.3147   26.3739   26.3706
Cameraman    I            0.677334   0.169664   0.421255       23.7530   23.2583   23.2566
             II           0.538381   0.168758   0.416063       24.4705   23.6654   23.6632
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
