A Spectral Conjugate Gradient Method with Descent Property

The spectral conjugate gradient method (SCGM) is an important generalization of the conjugate gradient method (CGM) and one of the effective numerical methods for large-scale unconstrained optimization. Designing the spectral parameter and the conjugate parameter is the core task in an SCGM, and the aim of this paper is to propose a new and effective alternative for these two parameters. First, motivated by the requirement of the strong Wolfe line search, we design a new spectral parameter. Second, we propose a hybrid conjugate parameter. This way of generating the two parameters ensures that the search directions always possess the descent property, independently of any line search rule. As a result, a new SCGM with the standard Wolfe line search is proposed. Under usual assumptions, the global convergence of the proposed SCGM is proved. Finally, a large number of numerical experiments are executed on 108 test instances, with dimensions ranging from 2 to 1,000,000, taken from the CUTE library and other classic test collections, comparing the presented SCGM with both SCGMs and CGMs. The detailed results and the corresponding performance profiles are reported, showing that the proposed SCGM is effective and promising.


Introduction
The conjugate gradient method (CGM) is one of the prevailing classes of methods for solving large-scale optimization problems, since it possesses simple iterations, fast convergence and low memory requirements. In this paper, we consider the following unconstrained optimization problem:

min f(x), x ∈ R^n, (1)

where f : R^n → R is a continuously differentiable function whose gradient is denoted by g(x) = ∇f(x). The iterates of the classical CGM can be formulated as

x_{k+1} = x_k + α_k d_k, (2)

and

d_k = −g_k + β_k d_{k−1}, d_1 = −g_1, (3)

where g_k = g(x_k), d_k is the search direction, β_k is the so-called conjugate parameter, and α_k is the steplength, which is obtained by a suitable exact or inexact line search. However, due to the high cost of the exact line search, α_k is usually generated by an inexact line search, such as the Wolfe line search

f(x_k + α_k d_k) ≤ f(x_k) + δ α_k g_k^T d_k,  g(x_k + α_k d_k)^T d_k ≥ σ g_k^T d_k, (4)

or the strong Wolfe line search

f(x_k + α_k d_k) ≤ f(x_k) + δ α_k g_k^T d_k,  |g(x_k + α_k d_k)^T d_k| ≤ −σ g_k^T d_k. (5)

The parameters δ and σ above are generally required to satisfy 0 < δ < σ < 1. As we know, different choices for β_k generate different CGMs. The most well-known CGMs are the Hestenes-Stiefel (HS, 1952) method [1], the Fletcher-Reeves (FR, 1964) method [2], the Polak-Ribière-Polyak (PRP, 1969) method [3,4] and the Dai-Yuan (DY, 1999) method [5], whose corresponding formulas for β_k are

β_k^HS = g_k^T y_{k−1} / (d_{k−1}^T y_{k−1}),  β_k^FR = ||g_k||² / ||g_{k−1}||²,  β_k^PRP = g_k^T y_{k−1} / ||g_{k−1}||²,  β_k^DY = ||g_k||² / (d_{k−1}^T y_{k−1}),

respectively, where y_{k−1} = g_k − g_{k−1} and ||·|| is the standard Euclidean norm. Generally, these four methods are referred to as the classical CGMs. Under the corresponding assumptions, the convergence properties and the numerical performance of the four CGMs were analysed in References [2][3][4][5][6], respectively. It is well known that the FR and DY CGMs possess nice convergence properties, but their numerical performance is not so good. Conversely, the PRP and HS methods show excellent performance in practical computation, but their convergence properties are hard to establish. Thus, to overcome these shortcomings of the classical CGMs, many researchers have paid great attention to improving them.
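To make the classical iteration (2)-(3) concrete, the following minimal sketch implements the FR method in Python. For illustration only, it minimizes a strictly convex quadratic, where the exact steplength along d_k is available in closed form; a practical implementation would instead use the Wolfe line search (4). All names in the sketch are illustrative, not the paper's code.

```python
import numpy as np

def fr_cg(A, b, x0, tol=1e-8, max_iter=200):
    """Fletcher-Reeves CGM for the quadratic f(x) = 0.5 x^T A x - b^T x.

    On a quadratic, the exact steplength along d_k has a closed form,
    so it is used here in place of an inexact (Wolfe) line search.
    """
    x = x0.astype(float)
    g = A @ x - b                          # gradient of the quadratic
    d = -g                                 # d_1 = -g_1
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:
            break
        alpha = -(g @ d) / (d @ (A @ d))   # exact minimizer along d_k
        x = x + alpha * d                  # x_{k+1} = x_k + alpha_k d_k
        g_new = A @ x - b
        beta = (g_new @ g_new) / (g @ g)   # beta^FR = ||g_k||^2 / ||g_{k-1}||^2
        d = -g_new + beta * d              # d_k = -g_k + beta_k d_{k-1}
        g = g_new
    return x
```

With exact steplengths on a quadratic this iteration reduces to linear CG, so it terminates in at most n steps in exact arithmetic.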
As a result, many improvements of the CGMs with excellent theoretical properties and numerical performance were proposed, for example, References [7][8][9][10][11][12][13][14][15][16][17][18][19][20]. Among them, the spectral conjugate gradient method (SCGM) proposed by Birgin and Martinez [12] can be seen as an important development of the CGM. The main difference between an SCGM and a CGM lies in the computation of the search direction. The search direction of an SCGM is usually yielded as follows:

d_k = −θ_k g_k + β_k d_{k−1}, d_1 = −g_1, (6)

where θ_k is a spectral parameter. Obviously, for an SCGM, the selection techniques for the spectral parameter θ_k and the conjugate parameter β_k are the core work and very important. In Reference [12], after giving the concrete conjugate parameter β_k, Birgin and Martinez required the spectral search direction yielded by (6) to satisfy g_k^T d_k = −||g_k||² (a special sufficient descent property), and then obtained the corresponding spectral parameter

θ_k = 1 + β_k g_k^T d_{k−1} / ||g_k||². (7)

Under a suitable line search, the SCGM yielded by (2), (6) and (7) performs superiorly to the PRP CGM, the FR CGM and the Perry method [21].
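The choice (7) can be checked numerically: substituting θ_k = 1 + β_k g_k^T d_{k−1}/||g_k||² into (6) makes g_k^T d_k = −||g_k||² hold identically, for any value of β_k. A small sketch (the test vectors are arbitrary stand-ins, not data from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
g = rng.standard_normal(5)        # stand-in for the current gradient g_k
d_prev = rng.standard_normal(5)   # stand-in for the previous direction d_{k-1}
beta = 0.7                        # any value of the conjugate parameter

# Spectral parameter (7), forced by the condition g_k^T d_k = -||g_k||^2:
theta = 1.0 + beta * (g @ d_prev) / (g @ g)
d = -theta * g + beta * d_prev    # spectral direction (6)

# The special sufficient descent property holds exactly (up to rounding):
assert abs(g @ d + g @ g) < 1e-10
```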
Yielding the conjugate parameter β_k by a modified DY formula, Andrei [13] considered two approaches to generate the spectral parameter θ_k, denoted θ_k^N and θ_k^C, based on the Newton direction and on the quasi-Newton equation together with the conjugacy conditions, respectively. The two SCGMs associated with θ_k^N and θ_k^C are both sufficiently descent without depending on any line search, and are globally convergent with the Wolfe line search (4). Moreover, the numerical results show that the SCGM associated with θ_k^N is more encouraging. Recently, by requiring the spectral direction d_k defined by (6) to satisfy the special sufficient descent condition g_k^T d_k = −||g_k||² for a general β_k, Liu et al. [15] proposed a class of choices for θ_k as follows:

θ_k^LFZ = 1 + β_k g_k^T d_{k−1} / ||g_k||².

Under the conventional assumptions, the requirement |β_k| ≤ |β_k^FR|, and the Wolfe line search (4), the SCGM developed from θ_k^LFZ is globally convergent and performs well in computation.
On the other hand, Jiang et al. [19] sought to improve both the FR method and the DY method by utilizing the strong Wolfe line search (5), and achieved good numerical results; as a result, two schemes for the conjugate parameter were proposed. Interestingly, it follows from formulas (3) and (6) that, for the same β_k, the SCGM can lead to more decrease than the classical CGM whenever θ_k > 1. Therefore, in this work, motivated by the ideas of the modified FR and DY methods [19], and making full use of the second condition of the strong Wolfe line search (5), we first introduce a new approach (8) for yielding the spectral parameter θ_k^JYJLL. Obviously, θ_k^JYJLL ≥ 1 whenever the previous direction d_{k−1} is a descent direction. Secondly, based on the scheme for the conjugate parameter in Reference [20], and fully absorbing the hybrid idea of Reference [11], we propose a new conjugate parameter (9), β_k^JYJLL. So far, the basic conception of our SCGM has been formed. As a result, a new SCGM is proposed, and its theoretical features and numerical performance are analysed and reported.

Algorithm and the Descent Property
Based on the formulas (2) and (6), together with (8) and (9), we establish the new SCGM (the JYJLL-SCGM) as follows.
Step 2. Determine a steplength α k by an inexact line search.
The following lemma indicates that the JYJLL-SCGM always satisfies the descent condition without depending on any line search, and that the conjugate parameter β_k^JYJLL has properties similar to those of the DY formula.

Lemma 1.
Suppose that the search direction d_k is generated by the JYJLL-SCGM. Then we have g_k^T d_k < 0 for k ≥ 1; that is, the search direction satisfies the descent condition. Furthermore, we obtain 0 ≤ β_k^JYJLL.

Proof. We first prove the former claim by induction. For k = 1, since d_1 = −g_1, it is easy to see that g_1^T d_1 = −||g_1||² < 0. Now suppose the claim holds for k − 1; we prove that g_k^T d_k < 0 holds for k. Let θ̂_k be the angle between g_k and d_{k−1}. The proof is divided into the following two cases.
Now we prove the second assertion. From (10) and (11), together with (9), we deduce that 0 ≤ β_k^JYJLL, and the proof is complete.

Convergence Analysis
To analyze and ensure the global convergence of the JYJLL-SCGM, we choose the Wolfe line search (4) to yield the steplength α_k. Further, the following basic assumptions on the objective function are needed.

(H1) The level set Λ = {x ∈ R^n : f(x) ≤ f(x_1)} is bounded, where x_1 is the initial point.

(H2) f(x) is continuously differentiable in a neighborhood U of Λ, and its gradient g(x) is Lipschitz continuous, namely, there exists a constant L > 0 such that ||g(x) − g(y)|| ≤ L ||x − y||, ∀ x, y ∈ U.
In the following lemma, we review the well-known Zoutendijk condition [6], which plays an important role in the convergence analysis of CGMs. Also, the Zoutendijk condition is suitable for the convergence analysis of the JYJLL-SCGM.

Lemma 2.
Suppose that Assumption 1 holds. Consider a general iterative method x_{k+1} = x_k + α_k d_k, where d_k is a descent direction satisfying g_k^T d_k < 0 and the steplength α_k satisfies the Wolfe line search condition (4). Then

Σ_{k≥1} (g_k^T d_k)² / ||d_k||² < +∞.
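For intuition, the Zoutendijk condition can be observed numerically. The sketch below is an illustrative setup, not the paper's algorithm: it runs steepest descent, d_k = −g_k, with the exact steplength on a strictly convex quadratic (the exact step satisfies (4) here for δ ≤ 1/2). Since (g_k^T d_k)²/||d_k||² = ||g_k||² for this direction, the summability asserted by Lemma 2 amounts to Σ ||g_k||² < ∞.

```python
import numpy as np

# Steepest descent on f(x) = 0.5 x^T A x with the exact steplength.
A = np.diag([1.0, 5.0, 25.0])     # SPD Hessian; conditioning chosen arbitrarily
x = np.array([1.0, 1.0, 1.0])
zoutendijk_sum = 0.0              # partial sums of (g_k^T d_k)^2 / ||d_k||^2
for _ in range(200):
    g = A @ x                     # gradient g_k; direction d_k = -g_k
    zoutendijk_sum += g @ g       # here (g_k^T d_k)^2 / ||d_k||^2 = ||g_k||^2
    alpha = (g @ g) / (g @ (A @ g))   # exact minimizer along -g_k
    x = x - alpha * g

# The series stays bounded while the gradient norm is driven to zero.
print(zoutendijk_sum, np.linalg.norm(A @ x))
```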
Based on Lemmas 1 and 2, we can establish the global convergence of the JYJLL-SCGM.
Next, dividing both sides of the above inequality by (g_k^T d_k)², we obtain

||d_k||² / (g_k^T d_k)² ≤ ||d_{k−1}||² / (g_{k−1}^T d_{k−1})² + 1 / ||g_k||².

Since ||d_1||² / (g_1^T d_1)² = 1 / ||g_1||², together with the above relations and ||g_k||² ≥ γ, we have

||d_k||² / (g_k^T d_k)² ≤ Σ_{i=1}^{k} 1 / ||g_i||² ≤ k / γ,

that is, Σ_{k≥1} (g_k^T d_k)² / ||d_k||² ≥ Σ_{k≥1} γ / k = ∞, which contradicts Lemma 2. Therefore, the proof is complete.

Numerical Results
In this section, we test the numerical performance of our method (denoted by JYJLL for short) on 108 test problems, and compare it with four methods: HZ [7], KD [8], AN1 [13] and LFZ [15]. The HZ and KD methods are CGMs with excellent effect, while AN1 and LFZ are SCGMs with highly efficient performance. The first 53 test problems (from bard to woods) are taken from the CUTE library of N. I. M. Gould et al. [22], and the last 55 are from References [23,24]; their dimensions range from 2 to 1,000,000. All codes were written in Matlab 2016a and run on a DELL PC with 4 GB of memory under the Windows 10 operating system. All steplengths α_k are generated by the Wolfe line search with σ = 0.1 and δ = 0.01.
In the experiments, the notations Itr, NF, NG, Tcpu and g* denote the number of iterations, the number of function evaluations, the number of gradient evaluations, the CPU time and the final gradient norm, respectively. We stop the iteration if one of the following two conditions is satisfied: (i) ||g_k|| ≤ 10^{−6}; (ii) Itr > 2000. When case (ii) occurs, the method is deemed to have failed, which is denoted by "F".
To show the numerical performance of the tested methods, we report the values of Itr, NF, NG, Tcpu and g* generated by the five tested methods for each test instance; see Tables 1 and 2 below. On the other hand, to visually characterize and compare the numerical results in Tables 1 and 2, we use the performance profiles introduced by Dolan and Moré [25] to describe the performance of the five tested methods according to Itr, NF, NG and Tcpu, respectively; see Figures 1-4 below.
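A Dolan-Moré performance profile of the kind used in Figures 1-4 can be computed as follows. This is a generic sketch of the technique, with toy data and an illustrative failure convention (np.inf marks an "F" entry), not the paper's results: for a cost matrix T with one row per problem and one column per solver, ρ_s(τ) is the fraction of problems that solver s brings within a factor τ of the best solver.

```python
import numpy as np

def performance_profile(T, taus):
    """Dolan-More performance profiles.

    T is an (n_problems, n_solvers) array of a cost metric (e.g. Itr or
    Tcpu); np.inf marks a failure. For each tau in taus, returns the
    fraction of problems with performance ratio r_{p,s} <= tau.
    """
    T = np.asarray(T, dtype=float)
    best = T.min(axis=1, keepdims=True)   # best cost on each problem
    ratios = T / best                     # performance ratios r_{p,s}
    n_problems = T.shape[0]
    return np.array([(ratios <= tau).sum(axis=0) / n_problems for tau in taus])

# Toy example: 3 problems, 2 solvers; solver 0 fails on the last problem.
T = [[10.0, 12.0],
     [20.0, 10.0],
     [np.inf, 30.0]]
rho = performance_profile(T, taus=[1.0, 2.0, 4.0])
print(rho)
```

Plotting ρ_s(τ) against τ for each solver gives the profile curves; a higher curve means a more effective solver on that metric.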

Discussion of Results
First, it is known from the nature of performance profiles that the higher a curve lies in the figures, the better the associated method. Second, summarizing the convergence analysis and the numerical reports in Tables 1 and 2 and Figures 1-4, the proposed JYJLL-SCGM shows the following three advantages.
(i) It has good global convergence under mild assumptions.
(ii) It is practically effective, at least for the 108 tested instances.
(iii) It is the most effective among the five tested methods. In addition, the numerical performance of the AN1 method [13] and the JYJLL-SCGM is relatively stable.
Of course, the advantages above are attributed to the choice techniques (8) and (9) for the spectral parameter and the conjugate parameter.

Conclusions
The contributions of this work are twofold. The first is a new computing scheme for the spectral parameter which ensures θ_k > 1. The second is a new computing method for the conjugate parameter. These two techniques ensure that the search directions always possess the descent property, independently of the line search technique. As a result, the presented JYJLL-SCGM possesses global convergence when the Wolfe line search is used to yield the steplength. A large number of numerical experiments, in comparison with related methods, show that our SCGM is promising.
As future work, we think the following two problems are interesting and worth studying. The first is to design new approaches for the spectral parameter that guarantee θ_k > 1, for example by combining the Newton direction with new quasi-Newton equations or new conjugacy conditions. The second is to find new computational techniques for the conjugate parameter with the help of existing approaches, for example the hybrid parameter or the three-term conjugate parameter.