The PPADMM Method for Solving Quadratic Programming Problems

Abstract: In this paper, a preconditioned and proximal alternating direction method of multipliers (PPADMM) is established for iteratively solving equality-constrained quadratic programming problems. Based on a strict matrix analysis, we prove that this method is asymptotically convergent. We also show the connections between this method and several existing methods, so that it combines their advantages. Finally, numerical examples show that the proposed algorithm is efficient, stable, and flexible for solving quadratic programming problems with equality constraints.


Introduction
This manuscript introduces the following convex optimization model with linear constraints and a separable objective function:

$$\min_{x,y}\; f_1(x) + f_2(y) \quad \text{s.t.} \quad Ax + By = b, \qquad (1)$$

where $A \in \mathbb{R}^{p\times n}$ and $B \in \mathbb{R}^{p\times m}$ are two matrices, and $b \in \mathbb{R}^p$ is a known vector. The objective functions $f_1: \mathbb{R}^n \to \mathbb{R}$ and $f_2: \mathbb{R}^m \to \mathbb{R}$ are two quadratic functions defined by

$$f_1(x) = \tfrac{1}{2}x^T F x + x^T f, \qquad f_2(y) = \tfrac{1}{2}y^T G y + y^T g, \qquad (2)$$

where $F \in \mathbb{R}^{n\times n}$ and $G \in \mathbb{R}^{m\times m}$ are symmetric positive semidefinite matrices and $f \in \mathbb{R}^n$, $g \in \mathbb{R}^m$ are known vectors. This class of convex minimization problems arises in many areas of computational science and engineering, such as compressed sensing [1], finance [2,3], image restoration [4-6], network optimization [7,8], and traffic planning [9-12]. The model (1)-(2) captures many applications in different areas; see the $\ell_1$-norm regularized least-squares problems in [12,13], the total variation image restoration in [13-16], and the standard quadratic programming problems in [7,13].
Let $H \in \mathbb{R}^{p\times p}$ be a symmetric positive definite matrix, let $\langle\cdot,\cdot\rangle_H$ denote the weighted inner product with weighting matrix $H$, let $\|\cdot\|$ denote the Euclidean norm, and let $\|\cdot\|_H$ denote the analogous weighted norm. Note that for vectors $u, v \in \mathbb{R}^p$ and a matrix $X \in \mathbb{R}^{p\times p}$, it holds that $\langle u, v\rangle_H = \langle Hu, v\rangle$, $\|u\|_H = \|H^{1/2}u\|$, and $\|X\|_H = \|H^{1/2} X H^{-1/2}\|$. If $\langle u, v\rangle_H = 0$, we say that $u, v \in \mathbb{R}^p$ are $H$-orthogonal, denoted $u \perp_H v$. In particular, if $H$ is the identity matrix, then $u$ and $v$ are orthogonal, simply denoted $u \perp v$. For $\zeta \in \mathbb{C}$, $\bar{\zeta}$ denotes its complex conjugate.
The problem (1)-(2) is mathematically equivalent to the unconstrained optimization problem [7]

$$\max_{\lambda}\; \min_{x,y}\; \psi(x, y, \lambda), \qquad (3)$$

where $\psi(x, y, \lambda)$ is the augmented Lagrangian function

$$\psi(x, y, \lambda) = f_1(x) + f_2(y) + \lambda^T(Ax + By - b) + \frac{\beta}{2}\|Ax + By - b\|^2, \qquad (4)$$

$\lambda$ is the Lagrangian multiplier, and $\beta$ is a regularization parameter. In short, a point $(x^*, y^*)$ is a solution to the problem (1)-(2) if and only if there exists $\lambda^* \in \mathbb{R}^p$ such that the point $(x^*, y^*, \lambda^*) \in \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^p$ is a solution to the problem (3)-(4) [7]. The most common method for solving the problem (3)-(4) is the alternating direction method of multipliers (ADMM) [7], in which each iteration of the augmented Lagrangian method (ALM) [17,18] is given a Gauss-Seidel decomposition. The scheme of the ADMM for (1)-(2) is

$$x^{(k+1)} = \arg\min_{x} \psi(x, y^{(k)}, \lambda^{(k)}), \quad y^{(k+1)} = \arg\min_{y} \psi(x^{(k+1)}, y, \lambda^{(k)}), \quad \lambda^{(k+1)} = \lambda^{(k)} + \beta(Ax^{(k+1)} + By^{(k+1)} - b).$$

A significant advantage of the ADMM is that each subproblem involves only one of the functions $f_1$ and $f_2$ of the original problem. The variables $x$ and $y$ are therefore treated separately during the iterations, which makes the subproblems in (3)-(4) evidently easier to solve than the original problem (1)-(2). Thanks to its easy implementation and impressive efficiency, the ADMM has recently received extensive attention in many different areas, and it is a very effective method for solving convex optimization problems. Glowinski and Marrocco [19] were the first to describe it; Gabay [20], Glowinski and Le Tallec [21], as well as Eckstein and Bertsekas [22], studied convergence results related to the ADMM. It is well suited to convex programming problems with separable variables [7,23,24] and is widely used in image processing, statistical learning, and machine learning [4-9].
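The scheme above can be sketched in a few lines of code. The following Python sketch (a hypothetical illustration, not the implementation used in this paper) applies the plain ADMM to the quadratic model (1)-(2); since $f_1$ and $f_2$ are quadratic, each subproblem reduces to a linear solve.

```python
import numpy as np

def admm_qp(F, f, G, g, A, B, b, beta=1.0, iters=500):
    """Plain two-block ADMM for
        min 0.5 x'Fx + f'x + 0.5 y'Gy + g'y  s.t.  Ax + By = b.
    Illustrative sketch: each subproblem is the linear solve obtained by
    setting the gradient of the augmented Lagrangian to zero."""
    n, m, p = F.shape[0], G.shape[0], A.shape[0]
    x, y, lam = np.zeros(n), np.zeros(m), np.zeros(p)
    Hx = F + beta * A.T @ A          # x-subproblem coefficient matrix
    Hy = G + beta * B.T @ B          # y-subproblem coefficient matrix
    for _ in range(iters):
        # x-step: solve (F + beta*A'A) x = -f - A'lam - beta*A'(By - b)
        x = np.linalg.solve(Hx, -f - A.T @ lam - beta * A.T @ (B @ y - b))
        # y-step: solve (G + beta*B'B) y = -g - B'lam - beta*B'(Ax - b)
        y = np.linalg.solve(Hy, -g - B.T @ lam - beta * B.T @ (A @ x - b))
        # multiplier (dual ascent) step
        lam = lam + beta * (A @ x + B @ y - b)
    return x, y, lam
```

On a toy problem with $F = 2I$, $G = I$, $A = B = I$, $b = \mathbf{1}$, and $f = g = 0$, the iterates converge to $x = b/3$ and $y = 2b/3$, which is the exact solution obtained by eliminating the constraint.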
For convex optimization problems, some methods based on the gradient operator are sensitive to the choice of the iteration step: if the parameters are not properly selected, the algorithm may not converge. In contrast, the ADMM is robust to the choice of parameters; under some mild conditions, the method is guaranteed to converge for any positive value of its single parameter. When the objective function is quadratic, its global convergence is proved in [13,24], and the method is linearly convergent; for example, the ADMM converges linearly to the optimal solution of problems (1) and (2). Although the convergence theory of the ADMM is well established, the accurate estimation of its convergence rate is still in its early stages; see, for example, [13,14].
Since the subproblems of the classical ADMM can be inefficient to solve accurately, Deng and Yin proposed a generalized ADMM in [25], which adds the proximal terms $\frac{1}{2}\|x - x^{(k)}\|_P^2$ and $\frac{1}{2}\|y - y^{(k)}\|_T^2$ to the $x$- and $y$-subproblems, respectively. In its scheme for (1)-(2), the relaxation parameter satisfies $\alpha \in (0, \frac{1+\sqrt{5}}{2})$, and the matrices $P$ and $T$ are symmetric positive semidefinite. Deng and Yin established the global and linear convergence of the generalized ADMM and gave a mathematical proof for the case where $P$ and $T$ are symmetric positive semidefinite matrices. Moreover, the ultimate goal is to choose adequate $P$ and $T$ so that the subproblems of the generalized ADMM are easier to solve and more efficient to run.
In order to iteratively solve the linearly constrained quadratic programming problem (1)-(2), Bai and Tao [18] proposed a class of preconditioned alternating variable minimization with multiplier (PAVMM) methods based on a matrix preconditioning strategy and a parameter-accelerating technique, which relies on a weighted inner product and the corresponding weighted norm. In its iteration scheme, $\psi(x, y, \lambda)$ is a weighted augmented Lagrangian function, the matrices $W$ and $Q$ are symmetric positive semidefinite and nonsingular, and $\alpha$ is a relaxation parameter. The PAVMM method is in fact a class of preconditioned alternating direction methods of multipliers (PADMM); therefore, in this manuscript, the PAVMM method is denoted as PADMM. In particular, the ADMM is a special case of the PADMM: if $W = Q = I$ and $\alpha = \beta$, the PADMM automatically reduces to the ADMM. Later, Bai and Tao [26] also established a preconditioned and relaxed alternating variable minimization with multiplier (PRAVMM) method based on the PAVMM method, in whose scheme the two relaxation parameters and $\alpha$ are positive constants. As above, we also denote the PRAVMM as PRADMM. To achieve acceleration, the PRADMM adds two relaxation parameters to the iterative process; hence, the PADMM and ADMM are special cases of the PRADMM. When both relaxation parameters equal 1, the PRADMM degenerates into the PADMM; when, in addition, $W = Q = I$ and $\alpha = \beta$, the PRADMM reduces to the ADMM.
To further generalize the PADMM and maximally promote its convergence speed, in this manuscript, we establish a preconditioned and proximal alternating direction method of multipliers (PPADMM) to solve the problem (1)-(2) iteratively. Assuming that the matrices $F \in \mathbb{R}^{n\times n}$ and $G \in \mathbb{R}^{m\times m}$ are symmetric positive semidefinite, and choosing the parameters, the weighting matrix, and the preconditioning matrices reasonably, the PPADMM converges to the unique solution of the problem (1)-(2) for any initial guess. The PPADMM proposed in this paper is clearly an extension of the PADMM; hence, the ADMM and PADMM are special cases of the PPADMM. In addition, we test the robustness and effectiveness of the PPADMM with numerical experiments. The experimental results indicate that this method performs better than the ADMM, PADMM, and PRADMM when they are employed to solve the convex optimization problem (1)-(2).
The paper is organized as follows. In Section 2, the computational properties of the PPADMM are established. We introduce some necessary concepts for analyzing the asymptotic convergence of the PPADMM in Section 3. Then, based on the analysis in Section 3, the asymptotic convergence of the PPADMM is demonstrated in Section 4. In Section 5, we present some tests solving the image deblurring problem to illustrate that our method is effective. Finally, we give some concluding remarks in Section 6.

The PPADMM Method
In this section, we introduce the PPADMM proposed in this paper. At the $k$-th iteration step of the PPADMM, we add the proximal terms $\frac{1}{2}\|x - x^{(k)}\|_P^2$ and $\frac{1}{2}\|y - y^{(k)}\|_T^2$ to the $x$- and $y$-subproblems when computing the minimizers $x^{(k+1)}$ and $y^{(k+1)}$ of the minimization subproblems. In order to solve the problem (1)-(2) iteratively, we establish the preconditioned and proximal alternating direction method of multipliers (PPADMM) as follows, where $P \in \mathbb{R}^{n\times n}$ and $T \in \mathbb{R}^{m\times m}$ are two symmetric positive semidefinite matrices, and $\alpha$ and $\beta$ are positive constants. We choose $P = \tau_1 I_n - \beta A^T A$ with the requirement $\tau_1 > \beta\|A^T A\|$ and $T = \tau_2 I_m - \beta B^T B$ with the requirement $\tau_2 > \beta\|B^T B\|$; the matrices $W$ and $Q$ are generated in the same way. It is easy to obtain the derivatives of the quadratic functions $f_1(x)$ and $f_2(y)$ in (2). Differentiating the first and second formulas in Equation (10) with respect to $x$ and $y$, respectively, and applying a simple manipulation, the iteration scheme of the PPADMM can be rewritten in the mathematically equivalent matrix-vector form (13). Since $F$, $G$ and $P$, $T$ are symmetric positive semidefinite matrices, $W$ is a symmetric positive definite matrix, and $\alpha > 0$, $\beta > 0$, the matrices $F + \beta A^T W^{-1} A + P$ and $G + \beta B^T W^{-1} B + T$ are symmetric positive definite if and only if the corresponding null-space conditions hold [18]. When these null-space conditions of the PPADMM are satisfied, measuring the computational costs of the iteration scheme (13), it is clear that the main cost of the PPADMM lies in solving the linear systems with coefficient matrices $F + \beta A^T W^{-1} A + P$ and $G + \beta B^T W^{-1} B + T$. When the sizes $n$, $m$, and/or $p$ are relatively small, direct methods such as the Cholesky factorization can solve these systems of linear equations effectively. However, when the sizes $n$, $m$, and/or $p$ are huge, solving these linear systems by direct methods takes a great deal of time.
Iterative methods are therefore used to solve these subproblems, e.g., the preconditioned conjugate gradient method. Of course, the weighting matrices $W$, $P$, and $T$ and the penalty parameters $\alpha$, $\beta$ should be chosen so that the matrices $F + \beta A^T W^{-1} A + P$ and $G + \beta B^T W^{-1} B + T$ are better conditioned than the original matrices $F$ and $G$, respectively. This makes solving the linear systems with these coefficient matrices accurate, fast, and robust at each step of the PPADMM.
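To illustrate why the choice $P = \tau_1 I_n - \beta A^T A$ helps, the following Python sketch (with $W = I$ and illustrative random data, not taken from the paper) checks that $P$ is positive definite when $\tau_1 > \beta\|A^T A\|$ and that the $x$-subproblem matrix $F + \beta A^T A + P$ collapses to the shifted matrix $F + \tau_1 I$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 4
A = rng.standard_normal((p, n))
F = np.eye(n)                                     # any SPSD F (illustrative)
beta = 0.5
tau1 = beta * np.linalg.norm(A.T @ A, 2) + 1.0    # tau1 > beta * ||A'A||

# Proximal matrix P = tau1*I - beta*A'A, positive definite by the choice of tau1
P = tau1 * np.eye(n) - beta * A.T @ A
assert np.all(np.linalg.eigvalsh(P) > 0)

# With W = I, the x-subproblem matrix F + beta*A'A + P collapses to F + tau1*I,
# so the expensive A'A term cancels out of the linear solve
H = F + beta * A.T @ A + P
assert np.allclose(H, F + tau1 * np.eye(n))
```

The cancellation is the point of the construction: the subproblem solve no longer involves $A^T A$ at all, only a diagonal shift of $F$.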
For the linear system $Ax = b$, we write $A = D - C_L - C_U$, where $D$ is the (block) diagonal part of $A$, and $C_L$ and $C_U$ are its strictly lower triangular and strictly upper triangular parts, respectively. With the relaxation parameter $\omega$, the block SOR iterative method reads

$$x^{(k+1)} = (D - \omega C_L)^{-1}\big[(1-\omega)D + \omega C_U\big]\, x^{(k)} + \omega (D - \omega C_L)^{-1} b.$$

According to the iterative scheme (13), the PPADMM can be classified as a modified block SOR iterative method for solving the corresponding linear system.
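As a point of reference for the splitting $A = D - C_L - C_U$, the following Python sketch implements the pointwise SOR sweep, the scalar analogue of the block method above; the test system and parameter values are illustrative only.

```python
import numpy as np

def sor(A, b, omega=1.2, iters=200):
    """Pointwise SOR sweeps for Ax = b with the splitting A = D - C_L - C_U.
    Each sweep updates one component at a time, mixing the old value with the
    Gauss-Seidel update through the relaxation parameter omega."""
    n = len(b)
    x = np.zeros(n)
    for _ in range(iters):
        for i in range(n):
            # sum of off-diagonal contributions, using the freshest values
            sigma = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (1 - omega) * x[i] + omega * (b[i] - sigma) / A[i, i]
    return x
```

For a symmetric positive definite system, SOR converges for any $\omega \in (0, 2)$; e.g., for $A = \begin{pmatrix}4 & 1\\ 1 & 3\end{pmatrix}$ and $b = (1, 2)^T$, a few hundred sweeps reproduce the direct solution.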
For the properties that a saddle point of the weighted augmented Lagrangian function $\psi(x, y, \lambda)$ defined in (3) possesses, we refer to the literature [18] for more details.

Theorem 1 ([7]). Let $F \in \mathbb{R}^{n\times n}$, $G \in \mathbb{R}^{m\times m}$ be symmetric positive semidefinite matrices and $A \in \mathbb{R}^{p\times n}$, $B \in \mathbb{R}^{p\times m}$ be two arbitrary matrices. Then the following results are obtained: (i) for $x^* \in \mathbb{R}^n$, $y^* \in \mathbb{R}^m$, and $\lambda^* \in \mathbb{R}^p$, the point $(x^*, y^*, \lambda^*)$ is a saddle point of the weighted augmented Lagrangian function $\psi(x, y, \lambda)$ defined in (3) if and only if it satisfies the corresponding first-order optimality conditions.

In addition, for a complex quadratic polynomial equation, the determinant criterion locating its two roots is given in Lemma 1. It is indispensable for proving the asymptotic convergence of the PPADMM in Section 4.

Lemma 1 ([16,27,28]). Assume $\eta$ and $\zeta$ are two complex constants; then both roots of the complex quadratic polynomial equation $\lambda^2 + \zeta\lambda + \eta = 0$ have modulus less than one if and only if

$$|\zeta - \bar{\zeta}\eta| + |\eta|^2 < 1. \qquad (14)$$

In particular, if both $\eta$ and $\zeta$ are real constants, the condition (14) reduces to $|\eta| < 1$ and $|\zeta| < 1 + \eta$.
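Lemma 1 can be checked numerically in its real-coefficient form. The following Python sketch (an illustration, not part of the paper) compares the direct root-modulus test against the criterion $|\eta| < 1$ and $|\zeta| < 1 + \eta$ on random real coefficients:

```python
import numpy as np

def roots_inside_unit_disk(zeta, eta):
    """Directly check whether both roots of lambda^2 + zeta*lambda + eta = 0
    lie strictly inside the unit disk."""
    return bool(np.all(np.abs(np.roots([1.0, zeta, eta])) < 1.0))

def criterion_real(zeta, eta):
    """Lemma 1 specialized to real coefficients: |eta| < 1 and |zeta| < 1 + eta."""
    return abs(eta) < 1.0 and abs(zeta) < 1.0 + eta

# Spot-check the equivalence on random real coefficients
rng = np.random.default_rng(1)
for _ in range(400):
    zeta = rng.uniform(-2.5, 2.5)
    eta = rng.uniform(-1.5, 1.5)
    assert roots_inside_unit_disk(zeta, eta) == criterion_real(zeta, eta)
```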

The Asymptotic Convergence of the PPADMM
In this section, we prove that the PPADMM is globally and asymptotically convergent, and we also estimate its corresponding asymptotic convergence rate.
First, we define the matrices $M_\beta(\alpha)$ and $N_\beta(\alpha)$ associated with the scheme (13). Obviously, the corresponding matrix equations hold, and hence the iteration scheme (13) of the PPADMM can be rewritten in the equivalent matrix-vector form (16). Multiplying both sides of Equation (16) by $M_\beta^{-1}(\alpha)$, we obtain the fixed-point iteration (17) with the iteration matrix $L_\beta(\alpha) = M_\beta^{-1}(\alpha)\, N_\beta(\alpha)$. For this matrix splitting, the iterative scheme of the PPADMM can be equivalently reconstructed as the matrix splitting iteration method (16), which solves the linear system $\mathcal{A}(\beta)x = b(\beta)$. Therefore, the PPADMM is asymptotically and globally convergent if and only if the spectral radius of the iteration matrix $L_\beta(\alpha)$ is less than one, i.e., $\rho(L_\beta(\alpha)) < 1$. We define the weighted matrices $\hat{A}$, $\hat{B}$, $\hat{Q}$, the augmented matrices $\hat{F}$, $\hat{G}$, and the compounded matrices $\hat{R}$, $\hat{S}$ as in (18)-(19), so that $M_\beta(\alpha)$ and $N_\beta(\alpha)$ can be rewritten in terms of these matrices. Here, we also define a block diagonal preconditioning matrix. After preconditioning $M_\beta(\alpha)$ and $N_\beta(\alpha)$, respectively, we obtain the block lower-triangular matrix $\hat{M}_\beta(\alpha)$ and the block upper-triangular matrix $\hat{N}_\beta(\alpha)$, and we set $\hat{L}_\beta(\alpha) = \hat{M}_\beta^{-1}(\alpha)\, \hat{N}_\beta(\alpha)$. It is obvious that $\hat{L}_\beta(\alpha)$ and $L_\beta(\alpha)$ are similar matrices, so they have the same eigenvalues. When $\rho(L_\beta(\alpha)) < 1$, the PPADMM converges globally to the optimal solution of the problem (1)-(2).
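The role of the condition $\rho(L_\beta(\alpha)) < 1$ can be seen on any stationary iteration $x^{(k+1)} = L x^{(k)} + c$: the sequence converges to the fixed point $(I - L)^{-1}c$ for every starting vector exactly when the spectral radius of $L$ is below one. A small Python illustration with hypothetical $2\times 2$ data:

```python
import numpy as np

# A stationary iteration x_{k+1} = L x_k + c converges for every starting
# vector iff the spectral radius rho(L) < 1 (hypothetical 2x2 example).
L = np.array([[0.5, 0.2],
              [0.1, 0.4]])
c = np.array([1.0, 1.0])

rho = max(abs(np.linalg.eigvals(L)))
assert rho < 1.0            # here rho = 0.6

x = np.zeros(2)
for _ in range(300):
    x = L @ x + c           # run the stationary iteration

x_star = np.linalg.solve(np.eye(2) - L, c)   # fixed point: (I - L) x = c
assert np.allclose(x, x_star)
```

The error contracts like $\rho(L)^k$, which is exactly the sense in which the convergence factor of the PPADMM is read off from its iteration matrix.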

Theorem 2.
Suppose that $F \in \mathbb{R}^{n\times n}$ and $G \in \mathbb{R}^{m\times m}$ are symmetric positive semidefinite matrices and that $A \in \mathbb{R}^{p\times n}$ and $B \in \mathbb{R}^{p\times m}$ are two matrices satisfying the null-space conditions (a) and (b). Define the scaled matrices $\hat{A}$, $\hat{B}$, $\hat{Q}$, $\hat{F}$, $\hat{G}$ and $\hat{R}$, $\hat{S}$ as in (18)-(19). Suppose that $\lambda$ is a nonzero eigenvalue of the PPADMM iteration matrix $L_\beta(\alpha)$. If the matrices $I + \beta\lambda\hat{S}$, $P - \lambda\hat{F}$, and $T - \lambda\hat{G}$ are nonsingular, then $\lambda$ is an eigenvalue of the quadratic eigenvalue problem

$$(\lambda^2 \beta^2 \hat{R}\hat{S}\hat{Q} + \lambda E + \hat{Q})\, w = 0,$$

where $E$ is the matrix defined in the proof.

Proof. In order to analyze the nonzero eigenvalues of $L_\beta(\alpha)$, we first analyze the nonzero eigenvalues of $\hat{L}_\beta(\alpha)$, because the matrices $\hat{L}_\beta(\alpha)$ and $L_\beta(\alpha)$ are similar. Since the conditions (a) and (b) guarantee the nonsingularity of the matrix $\mathcal{A}(\beta)$, $\lambda = 1$ is not an eigenvalue of $\hat{L}_\beta(\alpha)$. Assume $\lambda$ is a nonunit and nonzero eigenvalue of the matrix $\hat{L}_\beta(\alpha)$, and let $\mathbf{u} = (u^T, v^T, w^T)^T \in \mathbb{C}^{n+m+p}$, with $u \in \mathbb{C}^n$, $v \in \mathbb{C}^m$, and $w \in \mathbb{C}^p$, be the corresponding eigenvector, i.e., $\hat{L}_\beta(\alpha)\mathbf{u} = \lambda\mathbf{u}$. Then it holds that $\hat{N}_\beta(\alpha)\mathbf{u} = \lambda\hat{M}_\beta(\alpha)\mathbf{u}$, which yields the componentwise equations (21). Since the matrices $I + \beta\lambda\hat{S}$, $P - \lambda\hat{F}$, and $T - \lambda\hat{G}$ are nonsingular, simplifying the first and second equations in (21) yields expressions for $u$ and $v$. Obviously, $\hat{B}v = \frac{1-\lambda}{\alpha\beta\lambda}\hat{Q}w - \hat{A}u$. Substituting this equation into the first and second equations in (21), we obtain the simplified Equation (25). Combining the terms in (25) according to the powers of $\lambda$, we obtain Equation (26), which can be rewritten as $(\lambda^2 \beta^2 \hat{R}\hat{S}\hat{Q} + \lambda E + \hat{Q})\, w = 0$. □
In accordance with Theorem 2, we immediately obtain the following sufficient condition, which guarantees the global asymptotic convergence of the PPADMM.
If $|\kappa - \bar{\kappa}\eta| + |\eta|^2 < 1$ is satisfied, then the iteration sequence $\{x^{(k)}\}_{k=0}^{\infty}$ generated by the iteration scheme of the PPADMM in (10) converges to the optimal solution of the problem (1)-(2). Moreover, the convergence factor of the PPADMM is given below.

Proof. According to Theorem 2, the PPADMM generates the rational eigenvalue problem (20). Multiplying the equation on the left by $w^* \in \mathbb{C}^p \setminus \{0\}$, we obtain (30), where $E = \alpha\beta\hat{S} - \hat{Q} + \alpha\beta^2\hat{R}\hat{S} - \beta^2\hat{R}\hat{S}\hat{Q} + \alpha\beta\hat{R}$. With the notation in (28), Equation (30) can be organized into the scalar quadratic equation $\lambda^2 + \kappa\lambda + \eta = 0$ in (32). According to Lemma 1, both roots of this quadratic equation in the eigenvalue $\lambda$ have modulus less than one if its coefficients satisfy $|\kappa - \bar{\kappa}\eta| + |\eta|^2 < 1$. □
Last but not least, it is easy to obtain the two roots of the quadratic Equation (32):

$$\lambda_{\pm} = \frac{-\kappa \pm \sqrt{\kappa^2 - 4\eta}}{2}.$$

Hence, the convergence factor of the PPADMM is $\sigma = \max\{|\lambda_+|, |\lambda_-|\}$.
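The convergence factor, i.e., the maximum modulus of the two roots of $\lambda^2 + \kappa\lambda + \eta = 0$, can be evaluated directly from the quadratic formula. A small Python helper (illustrative only; the coefficient values in the usage are hypothetical):

```python
import numpy as np

def convergence_factor(kappa, eta):
    """Maximum modulus of the two roots of lambda^2 + kappa*lambda + eta = 0,
    computed via the quadratic formula; complex arithmetic handles the case
    kappa^2 < 4*eta, where the roots form a conjugate pair."""
    disc = np.sqrt(complex(kappa * kappa - 4.0 * eta))
    r1 = (-kappa + disc) / 2.0
    r2 = (-kappa - disc) / 2.0
    return max(abs(r1), abs(r2))
```

For instance, $\kappa = 0.5$, $\eta = 0.06$ gives the real roots $-0.2$ and $-0.3$, hence factor $0.3$; for a conjugate pair ($\kappa^2 < 4\eta$), the factor equals $\sqrt{\eta}$.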

Numerical Results
In this section, numerical examples are adopted to illustrate the performance of the PPADMM on the image deblurring problem. In [26], the PRADMM was shown to be more efficient than the PADMM; hence, we only compare the proposed PPADMM with the PRADMM and the ADMM in this section.
Image deblurring is a classic and significant topic in image processing and is usually an inverse problem; it has received extensive attention from researchers in computer vision. Normally, the goal of image deblurring is to recover an unknown original image $u \in \mathbb{R}^n$ from a noisy observation $y \in \mathbb{R}^n$, which is often modeled as $y = Bu + n$, where $B$ is a linear operator and $n$ is white Gaussian noise with variance $\sigma^2$.
In general, the blurring matrix $B$ is highly ill conditioned, so the image deblurring problem is an ill-conditioned inverse problem. A common approach to solving ill-conditioned problems is regularization. Therefore, in practical applications, the image deblurring problem is converted into the following optimization problem:

$$\min_x\; \|Ax - c\|^2 + \varepsilon \|Kx\|^2, \qquad (33)$$

where $A$ is a blurring operator, $K$ is a regularization operator, $\varepsilon$ is a regularization parameter, and $c$ is the observed image (see [16,29-33]). The mathematical model (33) can be equivalently reformulated into the equality-constrained quadratic programming problem (1)-(2), e.g., by introducing the auxiliary variable $y = Kx$. As mentioned above, the blurred image $c$ can be described as $c = Ax_t + \omega r$, where $x_t$ is the true image, $r$ is a random noise vector, and $\omega$ is the noise level. In our simulations of image deblurring, the blurred images are generated by first blurring the original images with a blurring kernel and then adding Gaussian white noise. As in [15], we set $\varepsilon = 0.1$, $\beta = 0.1$, and $\omega = 3$.
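For intuition, the regularized model can be solved directly through its normal equations. The following dense Python sketch is a toy illustration (not the FFT-based solver used in the experiments), assuming the model $\min_x \|Ax - c\|^2 + \varepsilon\|Kx\|^2$:

```python
import numpy as np

def tikhonov_deblur(A, K, c, eps):
    """Solve the regularized normal equations (A'A + eps*K'K) x = A'c for
    the model min ||Ax - c||^2 + eps*||Kx||^2 (dense toy sketch; for real
    images one would exploit the circulant structure instead)."""
    return np.linalg.solve(A.T @ A + eps * K.T @ K, A.T @ c)
```

For example, with $A = K = I$ and $\varepsilon = 1$ the solution is simply $x = c/2$, which shows how the regularization term shrinks the naive inverse.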
In our simulations, $K$ is taken to be the Laplacian operator. Under periodic boundary conditions, $K$ and $A$ are block circulant matrices with circulant blocks; therefore, both $K^T K$ and $A^T A$ are also block circulant matrices with circulant blocks, so they can be diagonalized by the two-dimensional fast Fourier transform (FFT2); see, e.g., [34]. Consequently, the three tested methods require $O(n \log n)$ operations per iteration.
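The $O(n\log n)$ cost comes from the fact that a circulant system is diagonalized by the FFT. The following one-dimensional Python sketch (illustrative data; the 2-D case with FFT2 is analogous) solves $Cx = b$ for a circulant matrix $C$ with two FFTs and a pointwise division:

```python
import numpy as np

def circulant_solve(c, b):
    """Solve Cx = b where C is circulant with first column c.
    C is diagonalized by the discrete Fourier transform, so the solve is
    two FFTs and one pointwise divide: O(n log n) instead of O(n^3)."""
    return np.real(np.fft.ifft(np.fft.fft(b) / np.fft.fft(c)))

# Verify against the explicit dense circulant matrix C[i, j] = c[(i-j) mod n]
c = np.array([4.0, 1.0, 0.0, 1.0])
n = len(c)
C = np.array([[c[(i - j) % n] for j in range(n)] for i in range(n)])
b = np.array([1.0, 2.0, 3.0, 4.0])
x = circulant_solve(c, b)
assert np.allclose(C @ x, b)
```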
The PRADMM is used to solve the problem (34), and its corresponding iteration scheme is given in (35). Applying the PPADMM to (34), we obtain the iterative scheme (36).
In order to reduce the computational cost of the iterative schemes (35) and (36), the positive definite proximal term matrices $P$ and $T$ and the weighting matrix $W$ for the PRADMM and the PPADMM are generated as described above. The iterations are stopped when the relative change between successive iterates satisfies $\|x^{(k)} - x^{(k-1)}\| / \|x^{(k-1)}\| \le 10^{-5}$, and all experiments are performed in MATLAB with a machine precision of $10^{-16}$.
Mathematics 2021, 9, x FOR PEER REVIEW 11 of 16

The peak signal-to-noise ratio (PSNR) is the most common and widely used objective measure for evaluating the quality of an image, and it is widely used in the literature to investigate the performance of methods for the image restoration problem. The PSNR is defined as follows:

$$\mathrm{PSNR}(x) = 20 \log_{10} \frac{255}{\sqrt{\frac{1}{n}\sum_{j=1}^{n}\big(x_j - x_j^t\big)^2}},$$

where $x_j^t$ is the $j$-th pixel value of the original image $x_t$, and $x_j$ is the $j$-th pixel value of the corresponding restored image $x$. The PSNR is a very important index in image deblurring, as it measures the difference between the restored image and the original image. In general, the PSNR value increases with the quality of the restoration: the larger the PSNR, the better the deblurring effect, so obtaining a high PSNR value can be worth a longer computation time. Table 1 shows the optimal parameters for the PPADMM and the PRADMM in the experiments; these optimal parameters are obtained experimentally by maximizing the corresponding PSNR.

Table 1. The optimal iteration parameters computed by experiment for the preconditioned and proximal alternating direction method of multipliers (PPADMM) and the preconditioned and relaxed alternating direction method of multipliers (PRADMM).

In order to illustrate the effectiveness and robustness of the PPADMM for different images, we show the experimental results for the PPADMM, PRADMM, and ADMM when solving the image deblurring problem in Table 2.
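The PSNR defined above is straightforward to compute. The following Python helper (an illustration, assuming 8-bit images with peak value 255) implements $\mathrm{PSNR} = 20\log_{10}(255/\mathrm{RMSE})$:

```python
import numpy as np

def psnr(x, x_true, peak=255.0):
    """Peak signal-to-noise ratio in dB between a restored image x and the
    original x_true, using PSNR = 20*log10(peak / RMSE)."""
    x = np.asarray(x, dtype=float)
    x_true = np.asarray(x_true, dtype=float)
    rmse = np.sqrt(np.mean((x - x_true) ** 2))   # root mean squared error
    return 20.0 * np.log10(peak / rmse)
```

For instance, a uniform error of 25.5 gray levels against a 255-peak image yields a PSNR of 20 dB, and a uniform error of the full 255 range yields 0 dB.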
Table 2 shows the elapsed CPU time in seconds (denoted as "CPU") and the peak signal-to-noise ratio measured in decibels (denoted as "PSNR") for the image deblurring problem (34). For accuracy, the experimental results are the average values of three repeated executions. For short, the PPADMM, PRADMM, and ADMM are denoted as PPAD, PRAD, and AD. From Table 2, we can see that the PPADMM attains the highest PSNR value for all four images and both types of blurring kernel; the difference in PSNR ranges from 0.38 to 0.94 dB. To show the relationship between the PSNR and the iterations, Figure 2 takes the capsule image blurred by kernel Type I as an example and plots the PSNR versus the iterations for the PPADMM, PRADMM, and ADMM. For this image, the PPADMM solved the problem (34) in 83 iterations and achieved a PSNR value of 25.05 dB, whereas the PRADMM and ADMM took 18 and 47 iterations, respectively, and obtained a PSNR value of 24.5 dB. Although the PPADMM took more iterations than the PRADMM and ADMM, the PSNR value it attained is much larger than those of the other two methods. Above all, the PPADMM is much more effective than the PRADMM and the ADMM with regard to the quality of the deblurred images, and it is robust. Finally, for the blurred (Type I) and noised (B&N) capsule image, we deblur it with the ADMM, PRADMM, and PPADMM, respectively, and show the deblurred images in Figure 3. We can see that the PRADMM and ADMM produced over-smoothed results and eliminated many image details. Instead, the PPADMM ensured that the restored images have better visual quality: on the one hand, it effectively removes the blurring effects and noise; on the other hand, it reconstructs more image edges than the other two methods.

In summary, it can be concluded from Table 2 and Figures 1 and 2 that for the image deblurring problem (33), the proposed PPADMM is clearly more effective than the PRADMM and ADMM in obtaining comparable qualities of the deblurred images.

Conclusions
In this paper, we have proposed an efficient PPADMM for solving the linearly constrained quadratic programming problem (1)-(2). This algorithm is a proximal generalization of the PADMM, and it extrapolates the block variables and the block multiplier in each new iteration. In fact, the PPADMM is naturally a generalized and modified block SOR iterative method from the viewpoint of matrix computation. Its theoretical properties, such as global convergence and the convergence factor, are established. Meanwhile, our numerical results verify the efficiency of the proposed method in comparison with the PRADMM and ADMM. In addition, it can easily be applied to construct similar methods for solving convex optimization problems with several block variables and with several equality and inequality constraints. However, how to choose the optimal parameters is still a challenging problem that deserves further discussion.
