A Modified Fletcher–Reeves Conjugate Gradient Method for Monotone Nonlinear Equations with Some Applications

The conjugate gradient (CG) method is one of the fastest-growing and most efficient methods for solving unconstrained minimization problems. Recently, considerable effort has been made to extend the CG method to monotone nonlinear equations. In this research article, we present a modification of the Fletcher–Reeves (FR) conjugate gradient projection method for constrained monotone nonlinear equations. The method possesses the sufficient descent property, and its global convergence is proved under some appropriate assumptions. Two sets of numerical experiments were carried out to show the good performance of the proposed method compared with some existing ones. The first experiment solves constrained monotone nonlinear equations using some benchmark test problems, while the second applies the method to signal and image recovery problems arising from compressive sensing.


Introduction
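In this paper, we consider a system of nonlinear monotone equations of the form

$$F(x) = 0, \quad \text{subject to } x \in E, \tag{1}$$

where $E \subseteq \mathbb{R}^n$ is closed and convex, and $F : \mathbb{R}^n \to \mathbb{R}^m$ ($m \ge n$) is continuous and monotone, which means

$$\langle F(x) - F(y),\, x - y \rangle \ge 0, \quad \forall x, y \in \mathbb{R}^n.$$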
We propose a modified Fletcher–Reeves method (MFRM), test its performance in solving monotone nonlinear equations with convex constraints, and also apply it to recover a noisy signal and a blurred image.

Algorithm
In this section, we define the projection map together with its well-known properties, give some useful assumptions and finally present the proposed algorithm. Throughout this article, $\|\cdot\|$ denotes the Euclidean norm.

Definition 1. Let $E \subset \mathbb{R}^n$ be a nonempty, closed and convex set. Then for any $x \in \mathbb{R}^n$, its projection onto $E$ is defined as

$$P_E(x) = \arg\min\{\|x - y\| : y \in E\}.$$

The following lemma gives some properties of the projection map.

Lemma 1 ([37]). Suppose $E \subset \mathbb{R}^n$ is a nonempty, closed and convex set. Then the following statements are true: 1. x

Throughout, we suppose the following:
(C1) The solution set of (1), denoted by E, is nonempty.
(C2) The mapping $F$ is monotone.
(C3) The mapping $F$ is Lipschitz continuous, that is, there exists a positive constant $L$ such that $\|F(x) - F(y)\| \le L\|x - y\|$, $\forall x, y \in \mathbb{R}^n$.
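For simple constraint sets, the projection map of Definition 1 has a closed form. The following minimal Python sketch (the function names are ours and purely illustrative) evaluates $P_E$ for the nonnegative orthant and for a box, and checks numerically the standard nonexpansiveness property of projections onto closed convex sets, $\|P_E(x) - P_E(y)\| \le \|x - y\|$.

```python
import numpy as np

def project_nonneg(x):
    """Projection onto the nonnegative orthant {y : y >= 0}."""
    return np.maximum(x, 0.0)

def project_box(x, lower, upper):
    """Projection onto the box {y : lower <= y <= upper}."""
    return np.clip(x, lower, upper)

# Numerical check of nonexpansiveness on two random points.
rng = np.random.default_rng(0)
x, y = rng.normal(size=5), rng.normal(size=5)
px, py = project_nonneg(x), project_nonneg(y)
nonexpansive = np.linalg.norm(px - py) <= np.linalg.norm(x - y) + 1e-12
```

Both projections act componentwise, so they cost only O(n) operations per call.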
Our algorithm is motivated by the work of Papp and Rapajić [12], who modified the well-known Fletcher–Reeves conjugate gradient method to solve unconstrained nonlinear monotone equations. The modification adds the term $-\theta_k F(x_k)$ to the Fletcher–Reeves direction. The parameter $\theta_k$ was determined in three different ways, giving three directions, namely M3TFR1, M3TFR2 and M3TFR3. The direction we are interested in is M3TFR1, defined as

$$d_0 = -F(x_0), \qquad d_k = -F(x_k) + \beta_k^{FR} w_{k-1} - \theta_k F(x_k), \quad k \ge 1, \tag{2}$$

where

$$\beta_k^{FR} = \frac{\|F(x_k)\|^2}{\|F(x_{k-1})\|^2}, \qquad \theta_k = \frac{F(x_k)^T w_{k-1}}{\|F(x_{k-1})\|^2}.$$

It follows that $F(x_k)^T d_k = -\|F(x_k)\|^2$.
Using the same modification proposed in [3], we modify the direction (2) as follows:

$$d_0 = -F(x_0), \qquad d_k = -F(x_k) + \frac{\|F(x_k)\|^2 w_{k-1} - \big(F(x_k)^T w_{k-1}\big) F(x_k)}{\max\{\mu \|w_{k-1}\| \|F(x_k)\|,\ \|F(x_{k-1})\|^2\}}, \quad k \ge 1, \tag{3}$$

where $\mu > 0$ is a positive constant. The difference between the M3TFR1 direction and the direction proposed in this paper is the scaling term appearing in the denominator of Equation (3), i.e., $\max\{\mu \|w_{k-1}\| \|F(x_k)\|, \|F(x_{k-1})\|^2\}$. This modification was shown to perform very well numerically in [3], and it also makes the boundedness of the direction easy to establish.
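As an illustration, the sketch below evaluates the direction $d_k = -F(x_k) + \big(\|F(x_k)\|^2 w_{k-1} - (F(x_k)^T w_{k-1}) F(x_k)\big) / \max\{\mu \|w_{k-1}\| \|F(x_k)\|, \|F(x_{k-1})\|^2\}$, which is our reading of the modified direction, and checks the identity $F(x_k)^T d_k = -\|F(x_k)\|^2$ numerically. The function name and the random test vectors are hypothetical, and the vector $w_{k-1}$ is simply passed in by the caller because the surviving text does not define it.

```python
import numpy as np

def mfrm_direction(F_k, F_prev, w_prev, mu=0.01):
    """Search direction with the max-scaled denominator (our reading of Eq. (3)).

    F_k, F_prev : residuals F(x_k) and F(x_{k-1})
    w_prev      : the vector w_{k-1}; supplied by the caller (assumption)
    """
    norm_Fk_sq = float(np.dot(F_k, F_k))
    denom = max(mu * np.linalg.norm(w_prev) * np.sqrt(norm_Fk_sq),
                float(np.dot(F_prev, F_prev)))
    correction = norm_Fk_sq * w_prev - np.dot(F_k, w_prev) * F_k
    return -F_k + correction / denom

# Hypothetical data: three random vectors of length 6.
rng = np.random.default_rng(1)
F_k, F_prev, w_prev = rng.normal(size=(3, 6))
d_k = mfrm_direction(F_k, F_prev, w_prev)

# Gap in the descent identity F_k^T d_k = -||F_k||^2 (should be ~0).
descent_gap = abs(np.dot(F_k, d_k) + np.dot(F_k, F_k))
```

The correction term is orthogonal to $F(x_k)$ by construction, so the descent identity holds for any positive denominator; the `max` only affects the length of the correction.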

Remark 1.
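Note that the parameter $\mu$ is chosen to be strictly positive because if $\mu \le 0$, then

$$\max\{\mu \|w_{k-1}\| \|F(x_k)\|,\ \|F(x_{k-1})\|^2\} = \|F(x_{k-1})\|^2.$$

This means that the direction $d_k$ will always be the M3TFR1 direction given by (2).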

Convergence Analysis
To prove the global convergence of Algorithm 1, the following results are needed.
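Algorithm 1.
Step 1. If $\|F(x_k)\| \le \mathrm{Tol}$, stop; otherwise go to Step 2.
Step 3. Find the step length $\alpha_k = \gamma \rho^{m_k}$, where $m_k$ is the smallest non-negative integer $m$ such that the line search condition (4) holds.
Step 4. Set $z_k = x_k + \alpha_k d_k$. If $z_k \in E$ and $\|F(z_k)\| \le \mathrm{Tol}$, stop. Otherwise, compute the next iterate $x_{k+1}$.
Step 5. Let $k = k + 1$ and go to Step 1.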
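The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the line search condition (4) and the update for $x_{k+1}$ do not survive in the text, so the code substitutes a commonly used derivative-free line search, $-F(x_k + \alpha d_k)^T d_k \ge \sigma \alpha \|F(x_k + \alpha d_k)\| \|d_k\|^2$, and a hyperplane projection update $x_{k+1} = P_E[x_k - \zeta_k F(z_k)]$ with $\zeta_k = F(z_k)^T (x_k - z_k) / \|F(z_k)\|^2$. It also assumes $w_k = z_k - x_k$ and that the direction is recomputed once per iteration. The test problem $F(x) = e^x - 1$ on the nonnegative orthant is hypothetical.

```python
import numpy as np

def mfrm_solve(F, project, x0, gamma=1.0, rho=0.9, mu=0.01,
               sigma=1e-4, tol=1e-5, max_iter=1000):
    """Hedged sketch of a projection-based MFRM iteration."""
    x = project(np.asarray(x0, dtype=float))
    Fx = F(x)
    d = -Fx  # initial direction d_0 = -F(x_0)
    for k in range(max_iter):
        # Step 1: stopping test on the residual norm.
        if np.linalg.norm(Fx) <= tol:
            return x, k
        # Step 3: backtracking, alpha = gamma * rho**m.
        # ASSUMED acceptance test (condition (4) is not in the text).
        alpha = gamma
        while True:
            z = x + alpha * d
            Fz = F(z)
            if -np.dot(Fz, d) >= sigma * alpha * np.linalg.norm(Fz) * np.dot(d, d):
                break
            alpha *= rho
        # Step 4: stop if the trial point is feasible and solves the system.
        if np.linalg.norm(Fz) <= tol and np.array_equal(project(z), z):
            return z, k
        # ASSUMED update: hyperplane projection step, then projection onto E.
        zeta = np.dot(Fz, x - z) / np.dot(Fz, Fz)
        x_next = project(x - zeta * Fz)
        F_next = F(x_next)
        # ASSUMED: w_k = z_k - x_k; direction with the max-scaled denominator.
        w = z - x
        denom = max(mu * np.linalg.norm(w) * np.linalg.norm(F_next),
                    np.dot(Fx, Fx))
        d = -F_next + (np.dot(F_next, F_next) * w
                       - np.dot(F_next, w) * F_next) / denom
        # Step 5: advance to the next iteration.
        x, Fx = x_next, F_next
    return x, max_iter

# Hypothetical monotone test problem on E = nonnegative orthant.
F = lambda x: np.exp(x) - 1.0
project = lambda x: np.maximum(x, 0.0)
x_sol, iters = mfrm_solve(F, project, np.full(5, 0.5))
```

The default parameter values mirror those reported for MFRM in the numerical experiments; the sketch does not guard against a zero residual at an infeasible trial point.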
Lemma 3. Suppose that assumptions (C1)-(C3) hold and the sequences $\{x_k\}$ and $\{z_k\}$ are generated by Algorithm 1. Then we have
Proof. Suppose $\alpha_k \ne \gamma$; then $\alpha_k \rho^{-1}$ does not satisfy Equation (4), that is This combined with (7) and the fact that $F$ is Lipschitz continuous yields The above inequality implies which completes the proof.

Lemma 4.
Suppose that assumptions (C1)-(C3) hold; then the sequences $\{x_k\}$ and $\{z_k\}$ generated by Algorithm 1 are bounded. Moreover, we have
Proof. We start by showing that the sequences $\{x_k\}$ and $\{z_k\}$ are bounded. Suppose $\bar{x} \in E$; then by monotonicity of $F$, we get Also by definition of $z_k$ and the line search (4), we have So, we have Thus the sequence $\{\|x_k - \bar{x}\|\}$ is nonincreasing and convergent, and hence $\{x_k\}$ is bounded. Furthermore, from Equation (15), we have and we can deduce recursively that Then from assumption (C3), we obtain By the definition of $z_k$, Equation (14), monotonicity of $F$ and the Cauchy–Schwarz inequality, we get The boundedness of the sequence $\{x_k\}$, together with Equations (17) and (18), implies that the sequence $\{z_k\}$ is bounded.
Now, as $\{z_k\}$ is bounded, for any $\bar{x} \in E$ the sequence $\{z_k - \bar{x}\}$ is also bounded, that is, there exists a positive constant $\nu > 0$ such that This, together with assumption (C3), yields Therefore, using Equation (15), we have Equation (19) implies lim However, using statement 2 of Lemma 1, the definition of $\zeta_k$ and the Cauchy–Schwarz inequality, we have which yields $\lim_{k \to \infty} \|x_{k+1} - x_k\| = 0$.
(11) and definition of $z_k$, then

Numerical Experiments
To test the performance of the proposed method, we compare it with the accelerated conjugate gradient descent (ACGD) and projected Dai-Yuan (PDY) methods in [27,28], respectively. In addition, the MFRM method is applied to signal and image recovery problems arising in compressive sensing. All codes were written in MATLAB R2018b and run on a PC with an Intel Core i5 processor, 4 GB of RAM and a 2.3 GHz CPU. All runs were stopped whenever $\|F(x_k)\| < 10^{-5}$. The parameters chosen for each method are as follows:
MFRM method: $\gamma = 1$, $\rho = 0.9$, $\mu = 0.01$, $\sigma = 0.0001$.
ACGD method: all parameters are chosen as in [27].
PDY method: all parameters are chosen as in [28].
and E = R n + .
and E = {x ∈ R n : It is clear that problem 3 is nonsmooth at x = 0.
and E = R n + .
To show in detail the efficiency and robustness of all methods, we employ the performance profile developed in [41], which is a helpful way of standardizing the comparison of methods. Suppose that we have $n_s$ solvers and $n_l$ problems, and we are interested in using the number of iterations, the CPU time or the number of function evaluations as our measure of performance; we let $k_{l,s}$ be the number of iterations, CPU time or number of function evaluations required to solve problem $l$ by solver $s$. To compare the performance on problem $l$ by solver $s$ with the best performance by any solver on this problem, we use the performance ratio $r_{l,s}$ defined as

$$r_{l,s} = \frac{k_{l,s}}{\min\{k_{l,s} : s \in S\}},$$

where $S$ is the set of solvers. The overall performance of a solver is obtained using the (cumulative) distribution function for the performance ratio. If we let

$$P(t) = \frac{1}{n_l}\,\mathrm{size}\{l \in L : r_{l,s} \le t\},$$

then $P(t)$ is the probability for solver $s \in S$ that a performance ratio $r_{l,s}$ is within a factor $t \in \mathbb{R}$ of the best possible ratio. If the set of problems $L$ is large enough, then the solvers with the largest probability $P(t)$ are considered the best.

Figure 1 reveals that MFRM performed better in terms of the number of iterations, as it solves and wins over 70 percent of the problems with fewer iterations, while ACGD and PDY solve and win over 40 and almost 10 percent, respectively. The story is slightly different in Figure 2, where the ACGD method was very competitive. However, the MFRM method performed slightly better, solving and winning over 50 percent of the problems with less CPU time, against the ACGD method, which solves and wins less than 50 percent of the problems considered. The PDY method had the weakest performance, with just 10 percent success. The interpretation of Figure 3 is similar to that of Figure 1. Finally, in Table 11 we report numerical results for MFRM, ACGD and PDY for problem 2 with given initial points and dimensions with double float ($10^{-16}$) accuracy.
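The performance profile described above is straightforward to compute from a table of costs. The following minimal Python sketch assumes a cost matrix with one row per problem and one column per solver, uses `np.inf` to mark a failed run, and assumes at least one solver succeeds on every problem; the function name and the numbers in `costs` are hypothetical and are not results from this paper.

```python
import numpy as np

def performance_profile(costs, taus):
    """Return P[i, s]: fraction of problems with ratio r_{l,s} <= taus[i].

    costs[l, s] is the cost (iterations, CPU time, ...) of solver s on
    problem l; np.inf marks a failure.
    """
    costs = np.asarray(costs, dtype=float)
    best = costs.min(axis=1, keepdims=True)  # best cost on each problem
    ratios = costs / best                    # performance ratios r_{l,s}
    return np.array([(ratios <= t).mean(axis=0) for t in taus])

# Hypothetical costs: 4 problems (rows) x 3 solvers (columns).
costs = np.array([[10.0, 12.0, 30.0],
                  [ 8.0,  8.0, np.inf],
                  [20.0, 15.0, 45.0],
                  [ 5.0,  9.0, 10.0]])
profile = performance_profile(costs, taus=[1.0, 2.0, 4.0])
```

The row of `profile` at $t = 1$ gives the fraction of problems on which each solver attains the best cost, which is the "wins" figure quoted when reading such plots.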

Experiments on Solving Sparse Signal Problems
There are many problems in signal processing and statistical inference that involve finding sparse solutions to ill-conditioned linear systems of equations. A popular approach is to minimize an objective function that contains a quadratic ($\ell_2$) error term and a sparse $\ell_1$-regularization term, i.e.,

$$\min_x \ \frac{1}{2}\|y - Bx\|_2^2 + \eta \|x\|_1, \tag{25}$$

where $x \in \mathbb{R}^n$, $y \in \mathbb{R}^k$ is an observation, $B \in \mathbb{R}^{k \times n}$ ($k \ll n$) is a linear operator, $\eta$ is a non-negative parameter, $\|x\|_2$ denotes the Euclidean norm of $x$ and $\|x\|_1 = \sum_{i=1}^{n} |x_i|$ is the $\ell_1$-norm of $x$. It is easy to see that problem (25) is a convex unconstrained minimization problem. Problem (25) frequently appears in compressive sensing because, if the original signal is sparse or approximately sparse in some orthogonal basis, an exact restoration can be produced by solving (25).
Iterative methods for solving (25) have been presented in many papers (see [42-45]). The most popular among these are gradient-based methods, and the earliest gradient projection method for sparse reconstruction (GPSR) was proposed by Figueiredo et al. [44]. The first step of the GPSR method is to express (25) as a quadratic problem using the following process. Consider a point $x \in \mathbb{R}^n$ such that $x = u - v$, where $u, v \ge 0$. Here $u$ and $v$ are chosen so that $x$ is split into its positive and negative parts: $u_i = (x_i)_+$, $v_i = (-x_i)_+$ for all $i = 1, 2, \ldots, n$, where $(\cdot)_+ = \max\{0, \cdot\}$. By the definition of the $\ell_1$-norm, we have $\|x\|_1 = e_n^T u + e_n^T v$, where $e_n = (1, 1, \ldots, 1)^T \in \mathbb{R}^n$. Now (25) can be written as

$$\min_{u,v} \ \frac{1}{2}\|y - B(u - v)\|_2^2 + \eta e_n^T u + \eta e_n^T v, \quad \text{s.t. } u \ge 0,\ v \ge 0, \tag{26}$$

which is a bound-constrained quadratic program. However, from [44], Equation (26) can be written in standard form as

$$\min_z \ \frac{1}{2} z^T D z + c^T z, \quad \text{s.t. } z \ge 0, \tag{27}$$

where

$$z = \begin{pmatrix} u \\ v \end{pmatrix}, \quad c = \eta e_{2n} + \begin{pmatrix} -b \\ b \end{pmatrix}, \quad b = B^T y, \quad D = \begin{pmatrix} B^T B & -B^T B \\ -B^T B & B^T B \end{pmatrix}.$$

Clearly, $D$ is a positive semi-definite matrix, which implies that Equation (27) is a convex quadratic problem. Xiao et al. [20] translated (27) into a linear variational inequality problem, which is equivalent to a linear complementarity problem. Moreover, $z$ is a solution of the linear complementarity problem if and only if it is a solution of the following nonlinear equation:

$$F(z) = \min\{z, Dz + c\} = 0.$$

The function $F$ is vector-valued and the "min" is interpreted as the componentwise minimum. Furthermore, $F$ was proved to be continuous and monotone in [46]. Therefore problem (25) can be translated into problem (1), and thus the MFRM method can be applied to solve it.
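To make the reformulation concrete, the following Python sketch builds the residual map $F(z) = \min\{z, Dz + c\}$ for $z = (u; v)$, using the standard GPSR block structure $D = \begin{pmatrix} B^T B & -B^T B \\ -B^T B & B^T B \end{pmatrix}$ and $c = \eta e_{2n} + (-B^T y;\ B^T y)$. The function names and the small random instance are hypothetical. As a design choice, the sketch never forms the $2n \times 2n$ matrix $D$ and applies it through two products with $B$ instead.

```python
import numpy as np

def split_signal(x):
    """Split x into z = [u; v] with u = max(x, 0), v = max(-x, 0)."""
    return np.concatenate([np.maximum(x, 0.0), np.maximum(-x, 0.0)])

def make_cs_residual(B, y, eta):
    """Return the map F(z) = min{z, Dz + c} without forming D explicitly."""
    n = B.shape[1]
    b = B.T @ y
    c = eta * np.ones(2 * n) + np.concatenate([-b, b])

    def F(z):
        u, v = z[:n], z[n:]
        g = B.T @ (B @ (u - v))        # B^T B (u - v)
        Dz = np.concatenate([g, -g])   # block structure of D applied to z
        return np.minimum(z, Dz + c)   # componentwise minimum
    return F

# Small hypothetical instance.
rng = np.random.default_rng(0)
B = rng.normal(size=(4, 8))
x = rng.normal(size=8)
y = B @ x
eta = 0.1
z = split_signal(x)
F = make_cs_residual(B, y, eta)
residual = F(z)
```

Passing `F` together with the projection onto the nonnegative orthant to a solver for problem (1) then gives a method for (25); the recovered signal is $x = u - v$.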
In this experiment, we consider a simple compressive sensing scenario, where our goal is to reconstruct a sparse signal of length $n$ from $k$ observations. The quality of recovery is assessed by the mean squared error (MSE) with respect to the original signal $\bar{x}$,

$$\mathrm{MSE} = \frac{1}{n}\|\bar{x} - x^*\|^2,$$

where $x^*$ is the recovered signal. The signal size is chosen as $n = 2^{11}$, $k = 2^{9}$, and the original signal contains $2^{6}$ randomly placed nonzero elements. In addition, the measurement $y$ is disturbed by noise, that is, $y = Bx + \varepsilon$, where $B$ is a randomly generated Gaussian matrix and $\varepsilon$ is Gaussian noise with mean 0 and variance $10^{-4}$.
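A test instance of this kind can be generated along the following lines. This is a sketch under assumptions: the text fixes only the sizes, the Gaussian measurement matrix and the noise variance, so the Gaussian amplitudes of the nonzero entries, the absence of any normalization of $B$, the random seed and the function names are all our own choices.

```python
import numpy as np

def make_sparse_problem(n=2**11, k=2**9, nnz=2**6, noise_var=1e-4, seed=0):
    """Generate (B, y, x_true) for a noisy sparse recovery test."""
    rng = np.random.default_rng(seed)
    x_true = np.zeros(n)
    support = rng.choice(n, size=nnz, replace=False)  # random positions
    x_true[support] = rng.normal(size=nnz)            # ASSUMED amplitudes
    B = rng.normal(size=(k, n))                       # Gaussian matrix
    noise = rng.normal(scale=np.sqrt(noise_var), size=k)
    y = B @ x_true + noise
    return B, y, x_true

def mse(x_true, x_rec):
    """Mean squared error (1/n) * ||x_true - x_rec||^2."""
    return np.sum((x_true - x_rec) ** 2) / x_true.size

B, y, x_true = make_sparse_problem()
```

Note that `scale` in `rng.normal` is a standard deviation, hence the square root of the stated variance.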
To demonstrate the performance of the MFRM method in signal recovery problems, we compare it with the conjugate gradient descent (CGD) [20] and projected conjugate gradient (PCG) [23] methods. The parameters in the PCG and CGD methods are chosen as $\gamma = 10$, $\sigma = 10^{-4}$, $\rho = 0.5$, whereas we chose $\gamma = 1$, $\sigma = 10^{-4}$, $\rho = 0.9$ and $\mu = 0.01$ in the MFRM method. For fairness in comparison, each code was run from the same initial point with the same continuation technique on the parameter $\eta$, and we observed only the convergence behavior of each method in reaching a solution of similar accuracy. The experiment was initialized with $x_0 = B^T y$ and terminates when a stopping criterion based on the objective function $f(x_k) = \frac{1}{2}\|y - Bx_k\|_2^2 + \eta \|x_k\|_1$ is satisfied. In Figures 4 and 5, the MFRM, CGD and PCG methods recovered the disturbed signal almost exactly. The experiment was repeated for 20 different noise samples (see Table 9). It can be observed that MFRM is more efficient in terms of the number of iterations and CPU time than the CGD and PCG methods in most cases. Furthermore, MFRM achieved the least MSE in nine (9) of the twenty (20) experiments. To show the performance of the methods visually, two figures were plotted to demonstrate their convergence behavior based on MSE, objective function values, the number of iterations and CPU time (see Figures 6 and 7). It can also be observed that MFRM requires less computing time to achieve a similar quality of recovery. This can be seen graphically in Figures 6 and 7, which illustrate that the objective function values obtained by MFRM decrease faster throughout the entire iteration process.

Experiments on Blurred Image Restoration
In this subsection, we test the performance of MFRM in restoring a blurred image. We use the following well-known grayscale test images for the experiments: (P1) Cameraman, (P2) Lena, (P3) House and (P4) Peppers. We use 4 different Gaussian blur kernels with standard deviation $\upsilon$ to compare the robustness of the MFRM method with the CGD method proposed in [20].
To assess the performance of each algorithm with respect to metrics that indicate the quality of restoration, in Table 10 we report the objective function (ObjFun) at the approximate solution, the MSE, the signal-to-noise ratio (SNR), which is defined as

$$\mathrm{SNR} = 20 \times \log_{10}\left(\frac{\|\bar{x}\|}{\|x - \bar{x}\|}\right),$$

and the structural similarity (SSIM) index, which measures the similarity between the original image and the restored image [47], for each of the 16 experiments. The MATLAB implementation of the SSIM index can be obtained at http://www.cns.nyu.edu/~lcv/ssim/. The original, blurred and restored images for each algorithm are given in Figures 8-11. The figures demonstrate that both algorithms can restore the blurred images. Compared with CGD, the quality of the image restored by MFRM is superior in most cases. Table 11 reports numerical results for MFRM, ACGD and PDY for problem 2.
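The SNR used here is simple to evaluate. The short Python sketch below assumes that $\bar{x}$ denotes the original image and $x$ the restored one, and that the norm of an image is the Frobenius norm of its pixel array; the function name and the two-pixel example are hypothetical.

```python
import numpy as np

def snr_db(original, restored):
    """SNR = 20 * log10(||original|| / ||restored - original||), in dB."""
    original = np.asarray(original, dtype=float)
    restored = np.asarray(restored, dtype=float)
    # np.linalg.norm flattens 2-D input to the Frobenius norm by default.
    error = np.linalg.norm(restored - original)
    return 20.0 * np.log10(np.linalg.norm(original) / error)

# Hypothetical two-pixel "image": ||original|| = 5, error norm = 0.5.
value = snr_db([[3.0, 4.0]], [[3.0, 4.5]])
```

In this example the ratio of norms is 10, so the SNR is 20 dB; a larger value indicates a restoration closer to the original.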

Conclusions
In this paper, a modified conjugate gradient method for solving monotone nonlinear equations with convex constraints, similar to that in [3], was presented. The proposed method is suitable for non-smooth equations. Under some suitable assumptions, the global convergence of the proposed method was established. Numerical results were presented to show the effectiveness of the MFRM method compared with the ACGD and PDY methods for the given constrained monotone equation problems. Finally, MFRM was also shown to be effective in decoding sparse signals and restoring blurred images.