1. Introduction
This manuscript introduces the following convex optimization model with linear constraints and a separable objective function:
$$\min_{x,y}\ f_1(x)+f_2(y)\quad\text{subject to}\quad Ax+By=b,\qquad(1)$$
where $A$ and $B$ are two matrices of compatible sizes, and $b$ is a known vector. The objective functions $f_1$ and $f_2$ are two quadratic functions defined by
$$f_1(x)=\tfrac{1}{2}x^{T}Fx+f^{T}x,\qquad f_2(y)=\tfrac{1}{2}y^{T}Gy+g^{T}y,\qquad(2)$$
where $F$ and $G$ are symmetric positive semidefinite matrices and $f$, $g$ are the known vectors. This class of convex minimization problems arises in many areas of computational science and engineering, such as compressed sensing [1], finance [2,3], image restoration [4,5,6], network optimization [7,8], and traffic planning [9,10,11,12]. The model (1)–(2) captures many applications in different areas; see the l1-norm regularized least-squares problems in [12,13], the total variation image restoration in [13,14,15,16], and the standard quadratic programming problems in [7,13].
Let $H$ be a symmetric positive definite matrix; $\langle x,y\rangle_H = x^{T}Hy$ represents the weighted inner product with the weighting matrix $H$, $\|\cdot\|$ stands for the Euclidean norm, and $\|x\|_H=\sqrt{x^{T}Hx}$ stands for the analogous weighted norm. Note that for vectors $x$ and the matrix $H$, it holds that $\|x\|_H^2=\langle x,x\rangle_H=x^{T}Hx$.
If $\langle x,y\rangle_H=0$, we say that $x$ and $y$ are $H$-orthogonal, which is denoted by $x\perp_H y$. In particular, if $H$ is the identity matrix, then the vectors $x$ and $y$ are orthogonal, which is simply denoted by $x\perp y$. For a complex number $\lambda$, $\bar{\lambda}$ stands for its complex conjugate.
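As a concrete illustration of this notation, the following sketch (with an assumed 2-by-2 weighting matrix H chosen purely for illustration) computes the weighted inner product and the weighted norm numerically:

```python
import numpy as np

# Weighted inner product <x, y>_H = x^T H y and weighted norm ||x||_H,
# assuming H is symmetric positive definite.
def h_inner(x, y, H):
    return x @ H @ y

def h_norm(x, H):
    return np.sqrt(h_inner(x, x, H))

H = np.array([[2.0, 0.0],
              [0.0, 1.0]])
x = np.array([1.0, 2.0])
y = np.array([1.0, -1.0])

print(h_inner(x, y, H))  # 2*1*1 + 1*2*(-1) = 0, so x and y are H-orthogonal
print(h_norm(x, H))      # sqrt(1*2*1 + 2*1*2) = sqrt(6)
```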
The problem (1)–(2) is mathematically equivalent to the unconstrained optimization problem [
7]
is the augmented Lagrangian function defined as
where
is the Lagrange multiplier, and
is a regularization parameter. In other words, a point
is the solution to the problem (1)–(2) if and only if there exists
such that the point
is the solution to the problem (3)–(4) [
7].
The most common method to solve the problem (3)–(4) is the alternating direction method of multipliers (ADMM) [
7], in which each iteration of the augmented Lagrangian method (ALM) [
17,
18] has a Gauss–Seidel decomposition. The ADMM scheme for (1)–(2) is
A significant advantage of ADMM is that each subproblem involves only one of the functions
f1 and
f2 in the original problem. Therefore, the variables
x and
y are treated separately during the iterations, which makes the subproblems in (3)–(4) evidently easier to solve than the original problem (1)–(2). Thanks to its easy implementation and impressive efficiency, the ADMM has recently received extensive attention in many different areas and is a very effective method for solving convex optimization problems. Glowinski and Marrocco [
19] were the first to describe it. Gabay [
20], Glowinski and Le Tallec [
21], as well as Eckstein and Bertsekas [
22] studied some convergence results related to the ADMM. The ADMM is well suited to convex programming problems with separable variables [
7,
23,
24], and it is widely used in image processing, statistical learning, and machine learning [
4,
5,
6,
7,
8,
9].
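To make the scheme above concrete, the following sketch implements a plain ADMM loop for the separable quadratic model; the sign conventions for f1, f2, the augmented Lagrangian, and the multiplier update are assumptions chosen for illustration, not necessarily those of the paper:

```python
import numpy as np

# Minimal ADMM sketch (assumed setup) for
#   min 1/2 x^T F x + f^T x + 1/2 y^T G y + g^T y  s.t.  A x + B y = b.
# With quadratic objectives, each subproblem reduces to a linear solve.
def admm(F, f, G, g, A, B, b, beta=1.0, iters=200):
    n, m = F.shape[0], G.shape[0]
    x, y, lam = np.zeros(n), np.zeros(m), np.zeros(b.size)
    for _ in range(iters):
        # x-subproblem: minimize the augmented Lagrangian over x
        x = np.linalg.solve(F + beta * A.T @ A,
                            -f + A.T @ lam - beta * A.T @ (B @ y - b))
        # y-subproblem: minimize over y, using the freshly updated x
        y = np.linalg.solve(G + beta * B.T @ B,
                            -g + B.T @ lam - beta * B.T @ (A @ x - b))
        # dual (multiplier) update on the constraint residual
        lam = lam - beta * (A @ x + B @ y - b)
    return x, y, lam
```

For instance, minimizing x^2 + y^2 subject to x + y = 1 (so F = G = 2, f = g = 0) drives the iterates toward the optimal point x = y = 0.5.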
For convex optimization problems, some gradient-based methods are sensitive to the choice of the step size: if the parameters are not properly selected, the algorithm may fail to converge. In contrast, the ADMM is robust to the choice of parameters; under some mild conditions, the method is guaranteed to converge for any positive value of its single parameter. When the objective function is quadratic, its global convergence has been proved [
13,
24], and this method is linearly convergent; for example, the ADMM converges linearly to the optimal solution of problems (1) and (2). Although the convergence of the ADMM is well understood, the accurate estimation of its convergence rate is still in its early stages; see, for example, [
13,
14].
Since the subproblems of the classical ADMM can be expensive to solve accurately, Deng and Yin proposed a generalized ADMM in [
25], which adds the proximal terms
and
to the
x- and
y-subproblems, respectively. Its scheme for (1)–(2) is
where
, and the matrices
P and
T are symmetric positive semidefinite.
Deng and Yin established the global and linear convergence of the generalized ADMM and gave the mathematical proof for the case where P and T are symmetric positive semidefinite matrices. Moreover, to make the subproblems of the generalized ADMM easier to solve and more efficient to run, the ultimate goal is to choose adequate P and T.
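The x-update of a proximal ADMM of this kind can be sketched as follows; the quadratic objective f1(x) = (1/2) x^T F x + f^T x, the multiplier sign, and the proximal term (1/2)||x - x_k||_P^2 are assumed conventions chosen for illustration:

```python
import numpy as np

# One x-update of a generalized (proximal) ADMM sketch: the proximal term
# (1/2)||x - x_k||_P^2 is added to the x-subproblem, so the stationarity
# condition becomes
#   F x + f - A^T lam + beta A^T (A x + B y_k - b) + P (x - x_k) = 0,
# which is a single linear solve.
def proximal_x_update(F, f, A, B, b, P, x_k, y_k, lam_k, beta):
    lhs = F + beta * A.T @ A + P
    rhs = -f + A.T @ lam_k - beta * A.T @ (B @ y_k - b) + P @ x_k
    return np.linalg.solve(lhs, rhs)
```

With P = 0 this reduces to the plain ADMM x-step, while a large P keeps the new iterate close to x_k, which is exactly the regularizing effect of the proximal term.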
In order to iteratively solve the linear constraint quadratic programming problem (1)–(2), Bai and Tao [
18] proposed a class of
preconditioned alternating variable minimization with multiplier (
PAVMM) methods by a matrix preconditioning strategy combined with a parameter acceleration technique, which is based on a weighted inner product and the corresponding weighted norm. The iteration scheme is as follows:
where
is the following weighted augmented Lagrangian function:
the matrices
W and
Q are symmetric positive semidefinite and nonsingular, and
is the relaxation parameter. Actually, the PAVMM method is a class of preconditioned alternating direction method of multipliers (
PADMM). Therefore, in this manuscript, the PAVMM method is denoted as PADMM. In particular, the ADMM is a special case of the PADMM. If
and
, the PADMM automatically reduces to the ADMM. Later, Bai and Tao [
26] also established a
preconditioned and relaxed alternating variable minimization with multiplier (PRAVMM) method based on the PAVMM method, and the scheme is
where
,
, and
are positive constants. As above, we also denote the PRAVMM by PRADMM. To achieve acceleration, the PRADMM adds two relaxation parameters to the iterative process. Hence, the PADMM and the ADMM are special cases of the PRADMM. When
, the PRADMM degenerates into the PADMM; when
,
and
; then, the PRADMM reduces to the ADMM.
To further generalize the PADMM and maximally improve its convergence speed, in this manuscript we establish a preconditioned and proximal alternating direction method of multipliers (PPADMM) to solve the problem (1)–(2) iteratively. Assuming that the proximal matrices are symmetric positive semidefinite and that the parameters, the weighting matrix, and the preconditioning matrices are chosen reasonably, the PPADMM converges to the unique solution of the problem (1)–(2) for any initial guess. Obviously, the PPADMM proposed in this paper is an extension of the PADMM; hence, the ADMM and PADMM are special cases of the PPADMM. In addition, we test the robustness and effectiveness of the PPADMM by numerical experiments. The experimental results indicate that this method performs better than the ADMM, PADMM, and PRADMM when they are employed to solve the convex optimization problem (1)–(2).
The paper is organized as follows. In
Section 2, the PPADMM and its computational properties are established. The asymptotic convergence of the PPADMM is analyzed in
Section 3. In
Section 4, we report numerical tests on the image deblurring problem to illustrate that our method is effective. Finally, we give some concluding remarks in
Section 5.
2. The PPADMM Method
In this section, we will introduce the PPADMM proposed in this paper. At the
k-th iteration step of the PPADMM, we add the proximal terms
and
to the
x- and
y-subproblems when computing the minimizers
and
of the minimization subproblems. In order to solve the problem (1)–(2) iteratively, we establish the preconditioned and proximal alternating direction method of multipliers (PPADMM) as follows:
where
and
are two symmetric positive semidefinite matrices, and
and
are positive constants. We choose
with the requirement
and
with the requirement
. In the same way, the matrices
W,
Q are generated as follows:
where
,
.
It is easy to obtain the derivatives of the quadratic functions
and
in (2) as follows:
By differentiating the first and second formulas in Equation (10) with respect to
x and
y, respectively, and performing a simple manipulation, the iteration scheme of the PPADMM can be rewritten in the following mathematically equivalent matrix–vector form:
which can be equivalently reformulated as
Since
F,
G and
P,
T are symmetric positive semidefinite matrices,
W is a symmetric positive definite matrix, and
,
; thus, we know that the matrices
and
are symmetric positive definite if and only if the following statements are true [
18]:
- (a)
or or .
- (b)
or or .
Therefore, when the above null-space conditions of the PPADMM are satisfied, an inspection of the computational costs of the iteration scheme (13) shows that the main cost of the PPADMM lies in solving the linear systems with the two coefficient matrices. When the sizes n, m, and/or p are relatively small, direct methods such as the Cholesky factorization can solve these systems of linear equations effectively. However, when the sizes n, m, and/or p are huge, solving these linear systems by direct methods is very time-consuming, so iterative methods, e.g., the preconditioned conjugate gradient method, are used instead. Of course, the weighting matrices W, P, and T and the penalty parameters should be selected so that both coefficient matrices are better conditioned than the original matrices F and G, respectively. This makes the solution of the linear systems accurate, fast, and robust at each step of the PPADMM.
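As an illustration of the iterative alternative mentioned above, here is a minimal (unpreconditioned) conjugate gradient sketch for the symmetric positive definite subproblem systems; a production solver would add a preconditioner:

```python
import numpy as np

# Conjugate gradient sketch for an SPD system A x = b, a stand-in for the
# (preconditioned) CG solver one would use when the PPADMM coefficient
# matrices are too large to factorize directly.
def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    x = np.zeros_like(b)
    r = b - A @ x          # initial residual
    p = r.copy()           # initial search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x
```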
For the linear systems
, we define
where
(
),
,
and
are the strictly lower triangular and the strictly upper triangular parts, respectively. Let
; then, the block SOR iterative method is described as
According to the iterative scheme (13), we can see that the PPADMM can be classified as a modified block SOR iterative method for solving the following linear systems.
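For intuition, a pointwise (non-block) analogue of the SOR iteration can be sketched as follows, using the splitting A = D - L - U with a relaxation parameter omega in (0, 2); this is a simplified stand-in for the block variant described above:

```python
import numpy as np

# Plain SOR sketch for A x = b with relaxation parameter omega, based on
# the splitting A = D - L - U (D diagonal, L/U strictly lower/upper parts).
def sor(A, b, omega=1.2, iters=200):
    n = b.size
    x = np.zeros(n)
    for _ in range(iters):
        for i in range(n):
            # sum over already-updated entries (lower part) and old entries
            sigma = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (1 - omega) * x[i] + omega * (b[i] - sigma) / A[i, i]
    return x
```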
For the properties that a saddle point of the weighted augmented Lagrangian function
defined in (3) possesses, we refer to [
18] for more details.
Theorem 1 ([7]). Let , be symmetric positive semidefinite matrices and , be two arbitrary matrices. Let Then, the following results are obtained:
- (i)
For and , then the point is a saddle point of the weighted augmented Lagrangian function defined in (3) if and only if is a solution of the linear system ;
- (ii)
The matrix is nonsingular if and only if
- (a)
- (b)
.
In addition, for a complex quadratic polynomial equation, the determinant criterion for the locations of its two roots is given in Lemma 1. It is indispensable for proving the asymptotic convergence of the PPADMM in
Section 4.
Lemma 1 ([16,27,28]). Assume and are two complex constants; then, both roots of the complex quadratic polynomial equation have a modulus of less than one if and only if In particular, if both and are real constants, the condition (14) reduces to and .
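The real case of such a root-location criterion can be cross-checked numerically. The sketch below uses the classical Jury/Schur-Cohn test for a degree-two polynomial, stated here on the assumption that it matches the elided real-case condition in the text:

```python
import numpy as np

# Numerical cross-check of the real case of a degree-two root-location
# criterion (Jury/Schur-Cohn test, an assumed stand-in for condition (14)):
# both roots of z^2 + a*z + b = 0 lie strictly inside the unit circle
# iff |b| < 1 and |a| < 1 + b.
def roots_inside_unit_circle(a, b):
    return bool(np.all(np.abs(np.roots([1.0, a, b])) < 1.0))

def jury_condition(a, b):
    return abs(b) < 1.0 and abs(a) < 1.0 + b

# Compare the criterion with directly computed root moduli on a grid
for a in np.linspace(-2.0, 2.0, 40):
    for b in np.linspace(-2.0, 2.0, 40):
        assert roots_inside_unit_circle(a, b) == jury_condition(a, b)
```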
3. The Asymptotic Convergence of the PPADMM
In this section, we will prove that the PPADMM is globally and asymptotically convergent, and its corresponding asymptotic convergence rate is also estimated.
First, we define the matrices
Obviously, the following matrix equations are true:
Hence, the iteration scheme (13) of the PPADMM can be rewritten as the equivalent matrix-vector form
Multiplying
by both sides of Equation (16), we obtain
It is easy to see that
is the iterative matrix of (17). For the matrix splitting of (14), the iterative scheme of the PPADMM can be equivalently reconstructed into the matrix splitting iteration method (16), which can solve the linear equation
A(β)x =
b(β). Therefore, the PPADMM is asymptotically and globally convergent if and only if the spectral radius of the iterative matrix
is less than one, i.e.,
. We define the weighted matrices
the augmented matrices
and the compounded matrices
Therefore,
can be rewritten as
Here, we define the block diagonal matrix
After preconditioning
and
, respectively, we can obtain the block lower-triangular matrix
and the block upper-triangular matrix
as follows:
It follows that . It is obvious that and are similar matrices, so they have the same eigenvalues. When , the PPADMM globally converges to the optimal solution of the problem (1)–(2).
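The convergence criterion used here, a spectral radius below one, can be illustrated on a generic stationary iteration x^{k+1} = M x^k + c, which converges for every initial guess precisely when rho(M) < 1; the matrix M below is an arbitrary convergent example, not the PPADMM iteration matrix:

```python
import numpy as np

# Spectral-radius convergence test sketch: the stationary iteration
# x^{k+1} = M x^k + c converges for every initial guess iff rho(M) < 1.
def spectral_radius(M):
    return float(np.max(np.abs(np.linalg.eigvals(M))))

M = np.array([[0.5, 0.2],
              [0.1, 0.4]])
c = np.array([1.0, 1.0])

rho = spectral_radius(M)
assert rho < 1.0  # the iteration below is therefore convergent

# Iterate and compare with the exact fixed point (I - M)^{-1} c.
x = np.zeros(2)
for _ in range(300):
    x = M @ x + c
x_star = np.linalg.solve(np.eye(2) - M, c)
```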
Theorem 2. Suppose that and are symmetric positive semidefinite matrices, and and are two matrices such that:
- (a)
and
- (b)
.
Define the scaled matrices
and
as in (18)–(19). Suppose that
is a nonzero eigenvalue of the PPADMM iteration matrix
. If the matrices
,
, and
are nonsingular, then
is an eigenvalue of the following eigenvalue problem
, where
Proof. In order to analyze the nonzero eigenvalues of , we first analyze the nonzero eigenvalues of , because the matrices and are similar. Obviously, the conditions (a) and (b) guarantee the nonsingularity of the matrix , so that zero is not an eigenvalue of .
Assume
is a nonunit and nonzero eigenvalue of the matrix
,
, with
,
and
is the corresponding eigenvector, i.e.,
. Then, it holds that
, or equivalently,
Since the matrices
,
and
are nonsingular, by simplifying the first and second equations in (21), we can get
Obviously,
. Substituting this equation into the first and second equations in (21), it holds that
We can get the following simplified equation
By combining the same terms in (25) according to the power exponents of
, we can get the following equation
As a result of
we can rewrite (26) as
. □
In accordance with Theorem 2, we can immediately obtain the following sufficient condition, which guarantees the global asymptotic convergence of the PPADMM.
Theorem 3. Let the conditions of Theorem 2 be satisfied. For any nonzero vector, define If
is satisfied, then the iteration sequence
generated by the iteration scheme of the PPADMM in (10) converges to the optimal solution of the problem (1)–(2), where
Moreover, the convergence factor of the PPADMM is given by
, where
Proof. According to Theorem 2, we know that the PPADMM generates the following rational eigenvalue problem
Multiplying both sides of the equation by
, we can get
where
, we can rewrite Equation (30) as
With the notation in (28), Equation (30) is organized into
According to Lemma 1, if both roots of the quadratic equation in the eigenvalue are to have a modulus of less than one, the coefficients must satisfy the following conditions:
Finally, it is easy to obtain the two roots of the quadratic Equation (32):
So, we can get the convergence factor of the PPADMM as follows . □
4. Numerical Results
In this section, numerical examples are presented to illustrate the performance of the PPADMM for the image deblurring problem. In [
26], the PRADMM was shown to be more efficient than the PADMM. Hence, we only need to compare the PPADMM that we proposed with the PRADMM and ADMM in this section.
Image deblurring is a classical and significant topic in image processing, usually posed as an inverse problem, and it has received broad attention from scholars in computer vision. Normally, the goal of image deblurring is to recover an unknown original image from a blurred and noisy observation, which is often modeled as , where B is a linear operator, and n is white Gaussian noise with variance .
In general, the blurring matrix
B is a highly ill-conditioned matrix, so the image deblurring problem is an ill-conditioned inverse problem. A common approach for solving such ill-conditioned problems is regularization. Therefore, in practical applications, the image deblurring problem is converted into the following optimization problem
where
A is a blurring operator,
K is a regularization operator,
is a regularization parameter, and
c is the observed image (see [
16,
29,
30,
31,
32,
33]). The mathematical expression (33) can be equivalently reformulated into the equality-constraint quadratic programming problem (1)–(2) as follows:
As mentioned above, the blurred image
c can be described as
, where
is the true image,
r is the noise generated by the random vector, and
is the level of noise. In our simulations of image deblurring, to generate blurred images, the original images are first blurred by a blurring kernel and then corrupted by additive white Gaussian noise. As in [
15], we set
,
, and
.
In our simulations,
K takes the Laplacian operator. Under the periodic boundary conditions,
K and
A are the block circulant matrices with circulant blocks. Therefore, both
and
are also block circulant matrices with circulant blocks, so that they can be diagonalized by the FFT2 (two-dimensional fast Fourier transform). See, e.g., [
34]. Therefore, each of the three tested methods requires
operations per iteration.
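The FFT-based solves described above can be sketched as follows for a Tikhonov-type system (A^T A + mu K^T K) x = A^T c under periodic boundary conditions; the kernels below (an averaging blur and a discrete Laplacian) are illustrative stand-ins, not the exact ones used in the experiments:

```python
import numpy as np

# FFT solve sketch for (A^T A + mu * K^T K) x = A^T c, where A and K act
# as 2-D circular convolutions (periodic boundary conditions) and are thus
# diagonalized by the 2-D FFT.
def fft_tikhonov_solve(c, a_kernel, k_kernel, mu):
    shape = c.shape
    A_hat = np.fft.fft2(a_kernel, s=shape)   # eigenvalues of A
    K_hat = np.fft.fft2(k_kernel, s=shape)   # eigenvalues of K
    rhs_hat = np.conj(A_hat) * np.fft.fft2(c)
    x_hat = rhs_hat / (np.abs(A_hat) ** 2 + mu * np.abs(K_hat) ** 2)
    return np.real(np.fft.ifft2(x_hat))

# Illustrative kernels: 3x3 averaging blur and a discrete Laplacian
a_kernel = np.full((3, 3), 1.0 / 9.0)
k_kernel = np.array([[0.0, -1.0, 0.0],
                     [-1.0, 4.0, -1.0],
                     [0.0, -1.0, 0.0]])

restored = fft_tikhonov_solve(np.ones((8, 8)), a_kernel, k_kernel, mu=0.1)
```

Every step is elementwise in the Fourier domain, which is why the per-iteration cost of the tested methods is dominated by a few FFTs.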
The PRADMM is used to solve the problem (34), and its corresponding iteration scheme is as follows:
Applying the PPADMM to (33), we obtain the following iterative scheme:
In order to reduce the computational cost of the iterative schemes (35) and (36), we choose the following method to generate the positive definite proximal term matrices P and T: , , where , . For the PRADMM and PPADMM, , , where , . In our simulations, , , and . It is obvious that the matrices , , and , are block circulant matrices with circulant blocks, so their inverses can be easily obtained by the FFT2. This greatly reduces the computational cost of the schemes (35) and (36).
In our experiments, the methods were implemented in MATLAB (version R2014a) on a PC with a Core i5-4590, 3.30 GHz CPU, and 8.00 GB RAM. We test the 256-by-256 grayscale images Capsule, Baboon, House, and Cameraman [
35], which are shown in
Figure 1. Note that the image Capsule is a .jpg file, and the other three images are all .pgm files. For 8-bit images, the pixel values are all integers in [0, 255]. Hence, for these four images, we have
in the problem (34). In our deblurring experiments, we test two types of blurring kernels: Type I (fspecial('average', 13)) and Type II (fspecial('gaussian', [9, 9], 3)). For the three methods in the experiment, all initial values are chosen as the blurred image
, i.e.,
. The iterations are terminated when the current iterates satisfy
and are performed in MATLAB with a machine precision of 10^{-16}.
The peak signal-to-noise ratio (PSNR) is the most common and widely used objective measure for evaluating the quality of an image, and it is widely used in the literature to assess the performance of methods for solving the image restoration problem. The PSNR is defined as follows:
where
is the
j-th pixel value of the original image
, and
is the
j-th pixel value of the corresponding estimated image
. The PSNR is a very important index in image deblurring, as it measures the difference between the restored image and the original image. In general, the PSNR value increases with the quality of the restoration: the larger the PSNR, the better the deblurring effect. Obtaining a higher PSNR can therefore be worth a longer computation time.
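Assuming the standard definition for 8-bit images (peak value 255), the PSNR can be computed as follows:

```python
import numpy as np

# PSNR sketch for 8-bit images, following the usual definition
# PSNR = 10 * log10(peak^2 / MSE) with peak = 255. (Identical images give
# MSE = 0 and an infinite PSNR, which is not handled here.)
def psnr(original, restored, peak=255.0):
    mse = np.mean((original.astype(float) - restored.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```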
Table 1 shows the optimal parameters for the PPADMM and the PRADMM in the experiments. These optimal parameters were obtained experimentally by maximizing the corresponding PSNR.
In order to illustrate the effectiveness and robustness of the PPADMM to the different images, we show the experimental results for the PPADMM, PRADMM, and ADMM to solve the image deblurring problem in
Table 2.
Table 2 shows the elapsed CPU time in seconds (denoted as “CPU”) and the peak signal-to-noise ratio measured in decibels (denoted as “PSNR”) for the image deblurring problem (34). For accuracy, the reported results are the averages over three repeated runs. For brevity, the PPADMM, PRADMM, and ADMM are denoted as PPAD, PRAD, and AD, respectively. From
Table 2, we can see that the PPADMM attains the highest PSNR value for all four images and both types of blurring kernel. The improvement in PSNR ranges from 0.38 to 0.94 dB. To show the relationship between the PSNR and the iteration count,
Figure 2 takes the image Capsule blurred by Type I as an example and shows the PSNR versus the iterations for the PPADMM, PRADMM, and ADMM. We can see that for the image Capsule blurred by Type I, the PPADMM solved the problem (34) in 83 iterations and achieved a PSNR value of 25.05 dB, while the PRADMM and ADMM took 18 and 47 iterations, respectively, and obtained a PSNR value of 24.5 dB. Clearly, although the PPADMM took more iterations than the PRADMM and ADMM, the PSNR value attained by the PPADMM is much larger than those of the other two methods. Above all, the PPADMM is much more effective than the PRADMM and the ADMM in terms of the quality of the deblurred images, and it is robust.
Finally, for the blurred (
Type I) and noised (B&N) image capsule, we deblur it by the ADMM, PRADMM, and PPADMM, respectively, and we show the deblurred images in
Figure 3. We can see that the PRADMM and ADMM produced over-smoothed results and eliminated many image details. In contrast, the PPADMM ensured that the restored images have better visual quality: on the one hand, it can effectively remove the blurring effects and noise; on the other hand, it is also able to reconstruct more image edges than the other two methods.
In summary, it can be concluded from
Table 2 and
Figure 2 and
Figure 3 that for the image deblurring problem (33), the proposed PPADMM is clearly more effective than the PRADMM and the ADMM in the quality of the deblurred images.