Article

A New Hybrid Three-Term Conjugate Gradient Algorithm for Large-Scale Unconstrained Problems

1 School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China
2 School of Information and Control Engineering, Qingdao University of Technology, Qingdao 266520, China
* Author to whom correspondence should be addressed.
Mathematics 2021, 9(12), 1353; https://doi.org/10.3390/math9121353
Submission received: 9 May 2021 / Revised: 7 June 2021 / Accepted: 8 June 2021 / Published: 11 June 2021

Abstract

Three-term conjugate gradient methods have attracted much attention for large-scale unconstrained problems in recent years, since they possess attractive practical features such as simple computation, low memory requirements, a better descent property and strong global convergence. In this paper, a hybrid three-term conjugate gradient algorithm is proposed; it possesses a sufficient descent property independent of any line search technique. Under some mild conditions, the proposed method is globally convergent for uniformly convex objective functions. Moreover, by using the modified secant equation, the proposed method is also globally convergent without any convexity assumption on the objective function. Numerical results indicate that the proposed algorithm is more efficient and reliable than the compared methods on the test problems.

1. Introduction

In this paper, we consider the following unconstrained problem:
$$\min_{x \in \mathbb{R}^n} f(x), \qquad (1)$$
where $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable and bounded below. There are many methods for solving (1), such as Levenberg–Marquardt methods [1], Newton methods [2] and quasi-Newton methods [3,4]. However, these methods are efficient only for small and medium-sized problems and are not suitable for large-scale problems, since they require storing a matrix of second-order information or an approximation of it. Conjugate gradient (CG) methods [4,5,6,7,8,9,10,11,12] are much more effective for unconstrained problems, especially large-scale ones, owing to their low memory requirements and strong convergence properties [6,8,9,10,11]. Meanwhile, CG methods have been applied to image restoration problems, optimal control problems and optimization problems in machine learning [13,14,15]. In this paper, we design a CG method for (1).
The nonlinear CG method was first proposed by Hestenes and Stiefel [16] for linear equations $Ax = b$. In 1964, Fletcher and Reeves [17] extended the CG method in [16] to unconstrained optimization problems. Since then, many researchers have proposed various CG methods [6,7,8,9,10,12,18]. In a CG method, a sequence of iterates $\{x_k\}$ is generated from an initial point $x_0$ by:
$$x_{k+1} = x_k + \alpha_k d_k, \quad k \ge 0, \qquad (2)$$
where $\alpha_k$ is the step size, determined by some line search technique, and $d_k$ is the search direction. In a traditional CG method, the direction is usually defined by:
$$d_k = \begin{cases} -g_k, & \text{if } k = 0, \\ -g_k + \beta_k d_{k-1}, & \text{if } k \ge 1. \end{cases}$$
Different choices of the conjugate parameter $\beta_k$ generate different CG methods, which may differ significantly in theoretical properties and numerical performance. The Hestenes–Stiefel (HS) method [16] and the Polak–Ribière–Polyak (PRP) method [19,20] have good numerical performance, and their conjugate parameters are:
$$\beta_{k+1}^{HS} = \frac{g_{k+1}^T y_k}{d_k^T y_k}, \qquad \beta_{k+1}^{PRP} = \frac{g_{k+1}^T y_k}{\|g_k\|^2},$$
where $g_k = \nabla f(x_k)$, $g_{k+1} = \nabla f(x_{k+1})$ and $y_k = g_{k+1} - g_k$. Note that the HS method automatically satisfies the conjugate condition $d_{k+1}^T y_k = 0$, independently of the line search technique. Dai and Liao [18] extended this conjugate condition to:
$$d_{k+1}^T y_k = -t\, g_{k+1}^T s_k, \quad k \ge 1, \qquad (4)$$
where $t \ge 0$ and $s_k = x_{k+1} - x_k$. The new condition (4) gives a more accurate approximation of the Hessian matrix of the objective function. Based on condition (4), Dai and Liao [18] presented a new conjugate parameter:
$$\beta_{k+1}^{DL}(t) = \frac{g_{k+1}^T (y_k - t s_k)}{y_k^T d_k} = \frac{g_{k+1}^T y_k}{y_k^T d_k} - t\, \frac{g_{k+1}^T s_k}{y_k^T d_k}.$$
In order to ensure global convergence, they selected the non-negative part of the conjugate parameter, namely:
$$\beta_{k+1}^{DL+}(t) = \max\{0, \beta_{k+1}^{DL}(t)\}.$$
Under some mild conditions, global convergence was established. However, the choice of the parameter t strongly affects the numerical performance, so many scholars have focused on choices of the parameter t; see [21,22,23,24,25,26], etc.
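To make the above formulas concrete, the following Python sketch evaluates $\beta_{k+1}^{HS}$, $\beta_{k+1}^{PRP}$ and the non-negative Dai–Liao parameter $\beta_{k+1}^{DL+}(t)$ from the current gradients and direction. The function name and the default value of t are illustrative choices; note that the HS and DL denominators may vanish under an inexact line search, which is precisely the issue revisited in Section 2.

```python
import numpy as np

def cg_betas(g_new, g_old, d_old, s_old, t=0.1):
    """Illustrative evaluation of the HS, PRP and DL+ conjugate parameters.

    g_new, g_old : gradients g_{k+1} and g_k
    d_old        : previous search direction d_k
    s_old        : step s_k = x_{k+1} - x_k
    t            : Dai-Liao parameter, t >= 0
    """
    y = g_new - g_old                       # y_k = g_{k+1} - g_k
    dy = d_old @ y                          # d_k^T y_k (may vanish for inexact line searches)
    beta_hs = (g_new @ y) / dy              # Hestenes-Stiefel
    beta_prp = (g_new @ y) / (g_old @ g_old)                  # Polak-Ribiere-Polyak
    beta_dl_plus = max(0.0, (g_new @ (y - t * s_old)) / dy)   # non-negative Dai-Liao
    return beta_hs, beta_prp, beta_dl_plus
```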
Compared with traditional two-term CG methods, three-term CG methods [27,28,29,30,31] usually have good numerical performance and nice theoretical properties, such as the sufficient descent property independent of the accuracy of the line search, i.e., it always holds that:
$$g_k^T d_k \le -c\, \|g_k\|^2,$$
where $c > 0$. Specifically, Zhang et al. [29] proposed a descent three-term PRP CG method in which the direction has the form:
$$d_0 = -g_0, \qquad d_{k+1} = -g_{k+1} + \beta_{k+1}^{PRP} d_k - \delta_{k+1}^{PRP} y_k, \qquad \delta_{k+1}^{PRP} = \frac{g_{k+1}^T d_k}{\|g_k\|^2}, \qquad k \ge 0,$$
and Zhang et al. [28] presented a descent three-term HS CG method in which the direction is:
$$d_0 = -g_0, \qquad d_{k+1} = -g_{k+1} + \beta_{k+1}^{HS} d_k - \delta_{k+1}^{HS} y_k, \qquad \delta_{k+1}^{HS} = \frac{g_{k+1}^T d_k}{d_k^T y_k}, \qquad k \ge 0,$$
and Babaie-Kafaki and Ghanbari [31] gave a modified three-term HS/DL method in which the direction has the form:
$$d_0 = -g_0, \qquad d_{k+1} = -g_{k+1} + \Big( \beta_{k+1}^{HS} - t\, \frac{g_{k+1}^T s_k}{|y_k^T d_k|} \Big) d_k - \delta_{k+1}^{HS} y_k, \qquad \delta_{k+1}^{HS} = \frac{g_{k+1}^T d_k}{d_k^T y_k}, \qquad k \ge 0.$$
For each of the above three directions, the sufficient descent property is always satisfied with $c = 1$, i.e., $g_k^T d_k \le -\|g_k\|^2$. Note that the sufficient descent property is stronger than the descent property and may greatly improve the numerical performance of the corresponding methods.
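As an illustration of the three-term structure, the following Python sketch evaluates the three-term PRP and HS directions displayed above; the function names are illustrative and the code simply mirrors the formulas, so it is not the referenced authors' implementation.

```python
import numpy as np

def ttprp_direction(g_new, g_old, d_old):
    """Three-term PRP direction of Zhang et al. [29] (illustrative sketch)."""
    y = g_new - g_old
    beta = (g_new @ y) / (g_old @ g_old)       # beta_{k+1}^{PRP}
    delta = (g_new @ d_old) / (g_old @ g_old)  # delta_{k+1}^{PRP}
    return -g_new + beta * d_old - delta * y

def tths_direction(g_new, g_old, d_old):
    """Three-term HS direction of Zhang et al. [28] (illustrative sketch)."""
    y = g_new - g_old
    dy = d_old @ y                             # d_k^T y_k (assumed nonzero here)
    beta = (g_new @ y) / dy                    # beta_{k+1}^{HS}
    delta = (g_new @ d_old) / dy               # delta_{k+1}^{HS}
    return -g_new + beta * d_old - delta * y
```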
Motivated by the above discussions, in this paper we propose a new descent hybrid three-term CG algorithm. Under suitable conditions, the direction of this algorithm reduces to the directions in [28,29,31], respectively, or to another new three-term direction which also satisfies the sufficient descent property with $c = 1$ (which is why we call our method a hybrid three-term CG method). The new method possesses the sufficient descent property independently of the accuracy of the line search technique. Under some mild conditions, global convergence is established for uniformly convex objective functions. For general functions without any convexity assumption, global convergence is also established by using the modified secant condition in [32]. Numerical results indicate that the proposed algorithm is effective and reliable.
The paper is organized as follows. In Section 2, we first present the motivation for the hybrid three-term CG method, then propose the new hybrid three-term direction, prove some of its properties and finally establish global convergence for uniformly convex objective functions. In Section 3, global convergence for general nonlinear functions is established with the help of the modified secant condition. Numerical tests are reported in Section 4 to show the efficiency and reliability of the proposed algorithm. Finally, conclusions are presented in Section 5.

2. Motivation and Algorithm

In this section, we first present the motivation and then give the form of the new direction.
It should be noted that if the exact line search technique is adopted, which implies $g_{k+1}^T d_k = 0$, then it holds that:
$$\beta_{k+1}^{HS} = \beta_{k+1}^{PRP} = \beta_{k+1}^{DL}(t).$$
If an inexact line search technique is adopted, these three methods may differ in theoretical properties and numerical performance, and the HS and DL methods may not be well defined (the denominator $y_k^T d_k$ may be 0). Zhang [33] presented a hybrid conjugate parameter $\beta_{k+1}^{hybrid}$ for the traditional two-term Dai–Liao CG method:
$$\beta_{k+1}^{hybrid} = \frac{g_{k+1}^T (y_k - t s_k)}{\max\{y_k^T d_k, \|g_k\|^2\}}.$$
The numerical results for general nonlinear equations show that the hybrid two-term conjugate residual method is effective and reliable.
Motivated by the above discussions and the nice properties of three-term CG methods, in the following we propose a new hybrid descent three-term direction of the following form: $d_0 = -g_0$ and:
$$d_{k+1}^N = -g_{k+1} + \beta_{k+1}^N s_k - \delta_{k+1}^N y_k, \qquad (9)$$
where:
$$\beta_{k+1}^N = \frac{g_{k+1}^T (y_k - t s_k)}{\max\{y_k^T s_k, \|g_k\|^2\}}, \qquad \delta_{k+1}^N = \frac{g_{k+1}^T s_k}{\max\{y_k^T s_k, \|g_k\|^2\}}.$$
Note that the direction $d_{k+1}^N$ is well defined. In fact, the denominator $\max\{y_k^T s_k, \|g_k\|^2\}$ vanishes only if $\|g_k\|^2 = 0$, in which case the method stops and the optimal solution $x_k$ has been obtained.
In the following, we give some remarks on the above direction. Note that if $t = 0$ and $y_k^T s_k \ge \|g_k\|^2$ hold, the direction $d_{k+1}^N$ reduces to the direction $d_{k+1}^{TTHS}$ in [28], and if $t = 0$ and $y_k^T s_k < \|g_k\|^2$ hold, the direction $d_{k+1}^N$ reduces to the direction $d_{k+1}^{TTPRP}$ in [29]. Note also that if $y_k^T s_k \ge \|g_k\|^2$ holds, then the parameter $\beta_{k+1}^N$ reduces to the conjugate parameter $\beta_{k+1}^{DL}$ and the direction $d_{k+1}^N$ reduces to a modified version of the direction $d_{k+1}^{TTDL}$ in [31]. If $y_k^T s_k < \|g_k\|^2$ holds, the direction $d_{k+1}^N$ reduces to a new three-term direction which also satisfies the sufficient descent property with $c = 1$. Overall, we regard the direction $d_{k+1}^N$ as a hybrid of the HS direction, the Dai–Liao direction and the PRP direction.
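For concreteness, here is a minimal Python sketch of how the hybrid direction (9) can be evaluated; the function name and the value of t passed in are illustrative assumptions rather than part of the algorithm's specification.

```python
import numpy as np

def hybrid_direction(g_new, g_old, s, t=0.1):
    """Hybrid three-term direction d_{k+1}^N of (9) (illustrative sketch).

    g_new : gradient g_{k+1}
    g_old : gradient g_k
    s     : step s_k = x_{k+1} - x_k
    t     : Dai-Liao-type parameter, t >= 0
    """
    y = g_new - g_old
    denom = max(y @ s, g_old @ g_old)      # max{y_k^T s_k, ||g_k||^2}
    beta = (g_new @ (y - t * s)) / denom   # beta_{k+1}^N
    delta = (g_new @ s) / denom            # delta_{k+1}^N
    return -g_new + beta * s - delta * y
```

If the denominator were zero, then $\|g_k\| = 0$ and the iteration would already have stopped, so no additional safeguard is shown here.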

2.1. Algorithm for Uniformly Convex Functions

Now, based on the above analyses, we state the steps of our algorithm as follows:
Algorithm 1: New hybrid three-term conjugate gradient method (HTTCG).
Step 0. Select the initial point $x_0 \in \mathbb{R}^n$ and a tolerance $\varepsilon > 0$. Compute $g(x_0)$ and set $d_0 = -g_0$. Let $k := 0$.
Step 1. If $\|g_k\| \le \varepsilon$, then stop; otherwise go to the next step.
Step 2. Compute the step size $\alpha_k$ along the direction $d_k$ by a line search technique.
Step 3. Let $x_{k+1} = x_k + \alpha_k d_k$.
Step 4. Compute the search direction $d_{k+1}$ by (9).
Step 5. Set $k := k + 1$ and go to Step 1.
Remark 1.
Note that in Algorithm 1, the line search technique is not explicitly specified. In fact, any line search technique is acceptable.
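To show how the pieces fit together, the following Python sketch implements Algorithm 1 for a smooth objective. Since Remark 1 leaves the line search open, a simple backtracking (Armijo) rule is used here purely as a placeholder; the function names, parameter values and the quadratic example are illustrative assumptions, not the authors' code.

```python
import numpy as np

def httcg(f, grad, x0, t=0.1, eps=1e-6, max_iter=10000):
    """Illustrative sketch of Algorithm 1 (HTTCG) with a placeholder line search."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                   # Step 0: d_0 = -g_0
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:         # Step 1: stopping test
            break
        # Step 2: placeholder backtracking (Armijo) line search
        alpha, fx, gtd = 1.0, f(x), g @ d
        while f(x + alpha * d) > fx + 1e-4 * alpha * gtd and alpha > 1e-12:
            alpha *= 0.5
        x_new = x + alpha * d                # Step 3
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        denom = max(y @ s, g @ g)            # Step 4: hybrid direction (9)
        beta = (g_new @ (y - t * s)) / denom
        delta = (g_new @ s) / denom
        d = -g_new + beta * s - delta * y
        x, g = x_new, g_new                  # Step 5
    return x

# Example: minimize the illustrative quadratic f(x) = 0.5 * ||x||^2
x_star = httcg(lambda x: 0.5 * x @ x, lambda x: x, np.ones(5))
```

For the convergence theory of Sections 2.2 and 3, the Wolfe line search (20) below would be used in place of the backtracking rule.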
In the following, we show that Algorithm 1 possesses the sufficient descent property independently of any line search technique.
Lemma 1.
For any line search technique, the sequence $\{d_k^N\}$ generated by Algorithm 1 always satisfies:
$$g_k^T d_k^N \le -\|g_k\|^2, \quad \forall k \ge 0. \qquad (10)$$
Proof. 
If $k = 0$, we have $d_0 = -g_0$, so that $g_0^T d_0 = -\|g_0\|^2$. For $k \ge 0$, by the definition of $d_{k+1}^N$, we have:
$$\begin{aligned} g_{k+1}^T d_{k+1}^N &= -\|g_{k+1}\|^2 + \frac{g_{k+1}^T (y_k - t s_k)}{\max\{y_k^T s_k, \|g_k\|^2\}}\, g_{k+1}^T s_k - \frac{g_{k+1}^T s_k}{\max\{y_k^T s_k, \|g_k\|^2\}}\, g_{k+1}^T y_k \\ &= -\|g_{k+1}\|^2 + \frac{g_{k+1}^T s_k\, g_{k+1}^T y_k}{\max\{y_k^T s_k, \|g_k\|^2\}} - \frac{t\,(g_{k+1}^T s_k)^2}{\max\{y_k^T s_k, \|g_k\|^2\}} - \frac{g_{k+1}^T s_k\, g_{k+1}^T y_k}{\max\{y_k^T s_k, \|g_k\|^2\}} \\ &= -\|g_{k+1}\|^2 - \frac{t\,(g_{k+1}^T s_k)^2}{\max\{y_k^T s_k, \|g_k\|^2\}} \le -\|g_{k+1}\|^2, \end{aligned}$$
where the last inequality holds by $t \ge 0$. Then, (10) holds. This completes the proof. □
Lemma 1 means that the new direction satisfies the sufficient descent property independently of the line search technique. The conjugate condition also plays an important role in numerical performance. The HS method automatically satisfies the condition $d_{k+1}^T y_k = 0$, and the Dai–Liao method always satisfies the modified condition $d_{k+1}^T y_k = -t\, g_{k+1}^T s_k$. For our direction $d_{k+1}^N$, writing $M_k = \max\{y_k^T s_k, \|g_k\|^2\}$, we have:
$$(d_{k+1}^N)^T y_k = -g_{k+1}^T y_k + \frac{g_{k+1}^T (y_k - t s_k)}{M_k}\, y_k^T s_k - \frac{g_{k+1}^T s_k}{M_k}\, \|y_k\|^2 = -g_{k+1}^T y_k + \frac{y_k^T s_k}{M_k}\, g_{k+1}^T y_k - \frac{t\, y_k^T s_k + \|y_k\|^2}{M_k}\, g_{k+1}^T s_k. \qquad (12)$$
From (12), the new direction $d_{k+1}^N$ also satisfies the DL conjugate condition (4) in an extended form with parameter $t_1 = \frac{t\, y_k^T s_k + \|y_k\|^2}{\max\{y_k^T s_k, \|g_k\|^2\}}$; in particular, when $y_k^T s_k \ge \|g_k\|^2$, the first two terms cancel and $(d_{k+1}^N)^T y_k = -t_1\, g_{k+1}^T s_k$. In fact, if we adopt a line search technique which ensures $y_k^T s_k \ge 0$, then it holds that $t_1 = \frac{t\, y_k^T s_k + \|y_k\|^2}{\max\{y_k^T s_k, \|g_k\|^2\}} > 0$.
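As a quick numerical sanity check (purely illustrative, with random data and an arbitrary t), the following Python snippet verifies the sufficient descent inequality (10) and prints both sides of the extended conjugacy relation; the two sides coincide exactly when $y_k^T s_k \ge \|g_k\|^2$.

```python
import numpy as np

rng = np.random.default_rng(0)
g_old, g_new, s = rng.standard_normal((3, 8))   # stand-ins for g_k, g_{k+1}, s_k
y, t = g_new - g_old, 0.1
denom = max(y @ s, g_old @ g_old)               # max{y_k^T s_k, ||g_k||^2}
beta = (g_new @ (y - t * s)) / denom
delta = (g_new @ s) / denom
d_new = -g_new + beta * s - delta * y           # d_{k+1}^N from (9)

# Sufficient descent (10): g_{k+1}^T d_{k+1}^N <= -||g_{k+1}||^2
print(g_new @ d_new <= -(g_new @ g_new) + 1e-12)
# Extended Dai-Liao conjugacy (12): compare (d_{k+1}^N)^T y_k with -t_1 g_{k+1}^T s_k
t1 = (t * (y @ s) + y @ y) / denom
print(d_new @ y, -t1 * (g_new @ s))
```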

2.2. Convergence for Uniformly Convex Functions

In the following, we present the global convergence analysis of the HTTCG method under the following assumptions.
Assumption 1.
The level set $\mathcal{T} := \{x \in \mathbb{R}^n : f(x) \le f(x_0)\}$ is bounded, where $x_0$ is the starting point; namely, there exists a constant $X > 0$ such that:
$$\|x\| \le X, \quad \forall x \in \mathcal{T}.$$
Assumption 2.
In some neighborhood $\mathcal{N}$ of $\mathcal{T}$, the gradient $g(x)$ of the function $f(x)$ is Lipschitz continuous, which means there exists a constant $L > 0$ such that:
$$\|g(x) - g(y)\| \le L \|x - y\|, \quad \forall x, y \in \mathcal{N}.$$
Note that based on Assumptions 1 and 2, there exists a positive constant $G$ such that:
$$\|g(x)\| \le G, \quad \forall x \in \mathcal{T}. \qquad (15)$$
In the following, we show that the sequence $\{d_k^N\}$ generated by Algorithm 1 is bounded.
Lemma 2.
Assume that $0 < t \le T$ and that Assumptions 1 and 2 hold. For any line search technique, let the sequence $\{d_k^N\}$ be generated by Algorithm 1. If the objective function f is uniformly convex on the set $\mathcal{N}$, then $\{d_k^N\}$ is bounded.
Proof. 
Since the function f is uniformly convex on the set $\mathcal{N}$, for any $x, y \in \mathcal{N}$ we have:
$$(\nabla f(x) - \nabla f(y))^T (x - y) \ge \mu \|x - y\|^2,$$
where $\mu > 0$ is the uniform convexity parameter. In particular, taking $x = x_{k+1}$ and $y = x_k$, it holds that:
$$y_k^T s_k \ge \mu \|s_k\|^2 > 0.$$
In the following, we prove the boundedness of the parameters $\beta_{k+1}^N$ and $\delta_{k+1}^N$. In fact, by their definitions, we have:
$$|\beta_{k+1}^N| = \left| \frac{g_{k+1}^T (y_k - t s_k)}{\max\{y_k^T s_k, \|g_k\|^2\}} \right| \le \frac{\|g_{k+1}\|\,(\|y_k\| + t\|s_k\|)}{|\max\{y_k^T s_k, \|g_k\|^2\}|} \le \frac{\|g_{k+1}\|\,(\|y_k\| + t\|s_k\|)}{y_k^T s_k} \le \frac{(L + T)\|s_k\|}{\mu \|s_k\|^2}\,\|g_{k+1}\| = \frac{L + T}{\mu}\,\frac{\|g_{k+1}\|}{\|s_k\|}, \qquad (17)$$
$$|\delta_{k+1}^N| = \left| \frac{g_{k+1}^T s_k}{\max\{y_k^T s_k, \|g_k\|^2\}} \right| \le \frac{\|g_{k+1}\|\,\|s_k\|}{|\max\{y_k^T s_k, \|g_k\|^2\}|} \le \frac{\|g_{k+1}\|\,\|s_k\|}{y_k^T s_k} \le \frac{\|g_{k+1}\|\,\|s_k\|}{\mu \|s_k\|^2} = \frac{1}{\mu}\,\frac{\|g_{k+1}\|}{\|s_k\|}. \qquad (18)$$
By the definition of $d_{k+1}^N$, we have:
$$\|d_{k+1}^N\| \le \|g_{k+1}\| + |\beta_{k+1}^N|\,\|s_k\| + |\delta_{k+1}^N|\,\|y_k\| \le \|g_{k+1}\| + \frac{L + T}{\mu}\,\|g_{k+1}\| + \frac{L}{\mu}\,\|g_{k+1}\| = \Big(1 + \frac{2L + T}{\mu}\Big)\|g_{k+1}\| \le \Big(1 + \frac{2L + T}{\mu}\Big) G,$$
where the last inequality holds by (15). This completes the proof. □
The standard Wolfe line search technique is often used in CG methods (see [6,7,11], etc.). It has the following form:
$$f(x_k + \alpha_k d_k) \le f(x_k) + \sigma_1 \alpha_k g_k^T d_k, \qquad g(x_k + \alpha_k d_k)^T d_k \ge \sigma_2 g_k^T d_k, \qquad (20)$$
where $0 < \sigma_1 < \sigma_2 < 1$.
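As a small illustration, the following Python helper checks whether a trial step size satisfies the two Wolfe conditions (20); the function name is an illustrative choice, and the defaults echo the values $\sigma_1 = 0.20$ and $\sigma_2 = 0.85$ used in the experiments of Section 4.

```python
import numpy as np

def satisfies_wolfe(f, grad, x, d, alpha, sigma1=0.20, sigma2=0.85):
    """Check the standard Wolfe conditions (20) for a trial step size alpha."""
    gtd = grad(x) @ d                      # g_k^T d_k (negative for a descent direction)
    armijo = f(x + alpha * d) <= f(x) + sigma1 * alpha * gtd   # sufficient decrease
    curvature = grad(x + alpha * d) @ d >= sigma2 * gtd        # curvature condition
    return armijo and curvature
```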
The following lemma plays an essential role in the global convergence analysis of our method. It can be found as Lemma 3.1 in [34]; hence, we only state it here and omit its proof.
Lemma 3.
Suppose that Assumptions 1 and 2 hold. Consider any iterative method of the form (2) in which $d_k$ satisfies the sufficient descent property and $\alpha_k$ is computed by the Wolfe line search technique (20). If the following relationship holds:
$$\sum_{k \ge 0} \frac{1}{\|d_k\|^2} = +\infty, \qquad (21)$$
then the method converges globally in the sense that:
$$\liminf_{k \to +\infty} \|g_k\| = 0. \qquad (22)$$
Now, we establish the global convergence of Algorithm 1 for uniformly convex objective functions.
Theorem 1.
Suppose that Assumptions 1 and 2 hold. Consider Algorithm 1 in which the step size $\alpha_k$ is computed by the line search technique (20). If the objective function f is uniformly convex on the set $\mathcal{N}$, then Algorithm 1 converges globally in the sense that:
$$\lim_{k \to +\infty} \|g_k\| = 0. \qquad (23)$$
Proof. 
From Lemma 1, the direction $d_{k+1}^N$ satisfies the sufficient descent property with $c = 1$. By the first inequality of the line search technique (20), the sequence $\{f(x_k)\}_{k \ge 0}$ is non-increasing and $\{x_k\}_{k \ge 0} \subset \mathcal{N}$. By the boundedness of $\{d_k^N\}$ in Lemma 2, (21) holds, and therefore (22) holds. Since f is uniformly convex, (23) then follows. This completes the proof. □

3. Convergence for General Nonlinear Functions

In order to achieve global convergence without a convexity assumption on the general objective function, we adopt the modified secant condition in [32] (similar modified secant conditions can also be found in [35,36], etc.). Concretely, the modified secant condition is:
$$\nabla^2 f(x_{k+1})\, s_k = z_k,$$
where $p > 0$, $C > 0$ and:
$$z_k = y_k + h_k \|g_k\|^p s_k, \qquad h_k = C + \max\Big\{0, -\frac{y_k^T s_k}{\|s_k\|^2}\Big\}\, \|g_k\|^{-p}.$$
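The following Python sketch shows one way to form the modified secant vector $z_k$ defined above; the function name is illustrative, and the default C = 0.1 merely echoes the experimental setting of Section 4.

```python
import numpy as np

def modified_secant_vector(g_new, g_old, s, C=0.1, p=1.0):
    """Modified secant vector z_k = y_k + h_k ||g_k||^p s_k (illustrative sketch)."""
    y = g_new - g_old
    gnorm_p = np.linalg.norm(g_old) ** p            # ||g_k||^p (positive before the stopping test fires)
    h = C + max(0.0, -(y @ s) / (s @ s)) / gnorm_p  # h_k
    return y + h * gnorm_p * s                      # z_k
```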
Based on the modified secant condition, we present the direction $d_0 = -g_0$ and:
$$d_{k+1}^{NN} = -g_{k+1} + \beta_{k+1}^{NN} s_k - \delta_{k+1}^{NN} z_k, \qquad (25)$$
where:
$$\beta_{k+1}^{NN} = \frac{g_{k+1}^T (z_k - t s_k)}{\max\{z_k^T s_k, \|g_k\|^2\}}, \qquad \delta_{k+1}^{NN} = \frac{g_{k+1}^T s_k}{\max\{z_k^T s_k, \|g_k\|^2\}}.$$
Now, based on the above discussions, we state our algorithm as follows:
Algorithm 2: Hybrid three-term CG method using the modified secant condition (HTTCGSC).
Step 0. Select $x_0 \in \mathbb{R}^n$, constants $C > 0$ and $r > 0$, and a tolerance $\varepsilon > 0$. Compute $g(x_0)$ and set $d_0 = -g_0$. Let $k := 0$.
Step 1. If $\|g_k\| \le \varepsilon$, then stop; otherwise go to the next step.
Step 2. Compute the step size $\alpha_k$ along the direction $d_k$ by a line search technique.
Step 3. Let $x_{k+1} = x_k + \alpha_k d_k$.
Step 4. Compute the search direction $d_{k+1}$ by (25).
Step 5. Set $k := k + 1$ and go to Step 1.
Note also that in Algorithm 2, the line search technique is not explicitly specified. Similar to Lemma 1, we also have the following lemma, whose proof is omitted.
Lemma 4.
For any line search technique, the sequence $\{d_k^{NN}\}$ generated by Algorithm 2 always satisfies:
$$g_k^T d_k^{NN} \le -\|g_k\|^2, \quad \forall k \ge 0.$$
Lemma 5.
Let the sequences $\{z_k\}$ and $\{s_k\}$ be generated by Algorithm 2. Then, for any line search technique, we have:
$$z_k^T s_k \ge C \|g_k\|^p \|s_k\|^2, \qquad (27)$$
and:
$$\|z_k\| \le (2L + C G^p)\, \|s_k\|. \qquad (28)$$
Proof. 
In fact, for any line search technique, we consider two cases.
Case (i): $y_k^T s_k \ge 0$. In this case, we have $h_k = C$ and $z_k = y_k + C \|g_k\|^p s_k$. Then:
$$z_k^T s_k = y_k^T s_k + C \|g_k\|^p \|s_k\|^2 \ge C \|g_k\|^p \|s_k\|^2.$$
Case (ii): $y_k^T s_k < 0$. In this case, it holds that $h_k = C - \frac{y_k^T s_k}{\|s_k\|^2}\, \|g_k\|^{-p}$ and $z_k = y_k + C \|g_k\|^p s_k - \frac{y_k^T s_k}{\|s_k\|^2}\, s_k$. Then:
$$z_k^T s_k = y_k^T s_k + C \|g_k\|^p \|s_k\|^2 - y_k^T s_k = C \|g_k\|^p \|s_k\|^2.$$
Based on the above discussion, (27) always holds for any line search technique.
By the definition of $z_k$, we have:
$$\|z_k\| \le \|y_k\| + |h_k|\, \|g_k\|^p \|s_k\| \le L \|s_k\| + \Big(C + \frac{|y_k^T s_k|}{\|s_k\|^2}\, \|g_k\|^{-p}\Big) \|g_k\|^p \|s_k\| \le (2L + C G^p)\, \|s_k\|,$$
where the last inequality holds by $\|g_k\| \le G$ and $|y_k^T s_k| \le L \|s_k\|^2$. Then, (28) holds. This completes the proof. □
In the following, we assume that the limit (22) does not hold (otherwise Algorithm 2 converges). This means that there exists a positive constant $\eta$ such that:
$$\|g_k\| \ge \eta, \quad \forall k \ge 0. \qquad (31)$$
Now, we give the global convergence of Algorithm 2 for general nonlinear problems without any convexity assumption.
Theorem 2.
Suppose that Assumptions 1 and 2 hold. Consider Algorithm 2 in which the step size $\alpha_k$ is computed by the line search technique (20). Then, Algorithm 2 converges globally in the sense that (22) holds.
Proof. 
We proceed by contradiction. Suppose that (22) is not true; then the inequality (31) holds.
We first prove that the sequence $\{d_k^{NN}\}_{k \ge 0}$ is bounded. Similarly to the analyses in (17) and (18), we have:
$$|\beta_{k+1}^{NN}| = \left| \frac{g_{k+1}^T (z_k - t s_k)}{\max\{z_k^T s_k, \|g_k\|^2\}} \right| \le \frac{\|g_{k+1}\|\,(\|z_k\| + t \|s_k\|)}{z_k^T s_k} \le \frac{(2L + C G^p + T)\|s_k\|}{C \|g_k\|^p \|s_k\|^2}\,\|g_{k+1}\| \le \frac{2L + C G^p + T}{C \eta^p}\,\frac{\|g_{k+1}\|}{\|s_k\|},$$
where the second inequality holds by (27), (28) and $0 < t \le T$, and the last inequality holds by $\|g_k\| \ge \eta$. Similarly,
$$|\delta_{k+1}^{NN}| = \left| \frac{g_{k+1}^T s_k}{\max\{z_k^T s_k, \|g_k\|^2\}} \right| \le \frac{\|g_{k+1}\|\,\|s_k\|}{|\max\{z_k^T s_k, \|g_k\|^2\}|} \le \frac{\|g_{k+1}\|\,\|s_k\|}{z_k^T s_k} \le \frac{\|g_{k+1}\|\,\|s_k\|}{C \|g_k\|^p \|s_k\|^2} \le \frac{1}{C \eta^p}\,\frac{\|g_{k+1}\|}{\|s_k\|},$$
where the last inequality holds by $\|g_k\| \ge \eta$. Then, by (25), it holds that:
$$\|d_{k+1}^{NN}\| \le \|g_{k+1}\| + |\beta_{k+1}^{NN}|\,\|s_k\| + |\delta_{k+1}^{NN}|\,\|z_k\| \le \|g_{k+1}\| + \frac{2L + C G^p + T}{C \eta^p}\,\|g_{k+1}\| + \frac{2L + C G^p}{C \eta^p}\,\|g_{k+1}\| = \Big(1 + \frac{4L + 2 C G^p + T}{C \eta^p}\Big)\|g_{k+1}\| \le \Big(1 + \frac{4L + 2 C G^p + T}{C \eta^p}\Big) G.$$
This means that the sequence $\{d_k^{NN}\}_{k \ge 0}$ is bounded, and hence (21) holds. From Lemmas 3 and 4, we obtain $\liminf_{k \to +\infty} \|g_k\| = 0$, which contradicts (31). Then, (22) holds. This completes the proof. □

4. Numerical Results

In this section, we first present the numerical performance of Algorithm 2 and compare it with the methods in [28,31]. Then, an acceleration technique is applied to our method.

4.1. Numerical Performance of Algorithm 2

In this subsection, we focus on the numerical performance of Algorithm 2 and compare it with the MTTDLCG method in [31] and the MTTHSCG method in [28]. For the MTTDLCG and MTTHSCG methods, we adopt the parameter settings of the original papers. For the value of the parameter t, the authors of [37] point out that $t = \frac{\|y_k\|^2}{y_k^T s_k}$ is a good choice for the Dai–Liao method, and the authors of [18] suggest that $t = 0.1$ is a good choice. In this paper, we take $t = \max\Big\{0.1, \frac{\|z_k\|^2}{\max\{z_k^T s_k, \|g_k\|^2\}}\Big\}$.
We executed the tests on a personal computer with the Windows 10 operating system, an AMD CPU at 2.1 GHz and 16.00 GB of RAM. The corresponding codes were written in MATLAB R2016b. The parameters are set as follows: $\sigma_1 = 0.20$, $\sigma_2 = 0.85$, $C = 0.1$, and $r = 1$ if $\|s_k\|^2 < 1$, $r = 3$ otherwise. In the following, we present the stopping rules:
Stopping rule (Himmelblau rule): during the testing, if $|f(x_k)| > \varepsilon_1$, set $\Delta = \frac{|f(x_k) - f(x_{k+1})|}{|f(x_k)|}$; otherwise, set $\Delta = |f(x_k) - f(x_{k+1})|$. The testing stops if $\|g(x_k)\| < \varepsilon$ or $\Delta < \varepsilon_2$ holds, where $\varepsilon_1 = \varepsilon_2 = 10^{-5}$ and $\varepsilon = 10^{-6}$. The testing also stops if the total number of iterations exceeds 10,000.
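A minimal Python sketch of this stopping test, assuming f_old, f_new and g_new hold $f(x_k)$, $f(x_{k+1})$ and $g(x_{k+1})$; the function name and argument layout are illustrative.

```python
import numpy as np

def should_stop(f_old, f_new, g_new, eps=1e-6, eps1=1e-5, eps2=1e-5):
    """Himmelblau-type stopping test described above (illustrative sketch)."""
    if abs(f_old) > eps1:
        delta = abs(f_old - f_new) / abs(f_old)   # relative decrease of f
    else:
        delta = abs(f_old - f_new)                # absolute decrease of f
    return np.linalg.norm(g_new) < eps or delta < eps2
```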
For the step size, $\alpha_k$ is accepted once the number of trials of the WWP line search exceeds 6. The test problems and initial points considered here are taken from [38] and are listed in Table 1. For each problem, ten large-scale dimensions with 1500, 3000, 6000, 7500, 9000, 15,000, 30,000, 60,000, 75,000 and 90,000 variables are considered.
To assess the numerical performance, the performance profile introduced by Dolan and Moré [39] is adopted. That is, for each method, we plot the fraction P of the test problems for which the method is within a factor τ of the best result. The left side of each figure shows the percentage of test problems for which a method is the fastest; the right side shows the percentage of test problems that are successfully solved by each method. Figure 1 presents the performance profile of the three methods with respect to the number of iterations.
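For readers who wish to reproduce such plots, here is a compact Python/matplotlib sketch of the Dolan–Moré performance profile; the input format (one row per problem, one column per solver, with np.inf marking a failure) is an illustrative assumption.

```python
import numpy as np
import matplotlib.pyplot as plt

def performance_profile(costs, labels):
    """Dolan-Moré performance profile [39] (illustrative sketch).

    costs : (n_problems, n_solvers) array of iteration counts, evaluation
            counts or CPU times; np.inf marks a failure on that problem.
    """
    ratios = costs / np.min(costs, axis=1, keepdims=True)   # ratio to the best solver
    taus = np.sort(np.unique(ratios[np.isfinite(ratios)]))
    for j, label in enumerate(labels):
        frac = [(ratios[:, j] <= tau).mean() for tau in taus]
        plt.step(taus, frac, where="post", label=label)
    plt.xlabel(r"$\tau$")
    plt.ylabel(r"$P(\mathrm{ratio} \le \tau)$")
    plt.legend()
    plt.show()
```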
From Figure 1, we observe that Algorithm 2 (the HTTCG method) solves 53% of the test problems with the fewest iterations, compared with 35% for the MTTDLCG method and 41% for the MTTHSCG method. This indicates that Algorithm 2 performs best in this respect. Figure 2 presents the performance profile of the three methods with respect to the number of function and gradient evaluations.
From Figure 2, we observe that Algorithm 2 solves 69% of the test problems with the fewest function and gradient evaluations, compared with 14% for the MTTDLCG method and 22% for the MTTHSCG method. This also indicates that Algorithm 2 performs best. Figure 3 presents the performance profile of the three methods with respect to CPU time.
From Figure 3, we observe that Algorithm 2 solves 12% of the test problems with the least CPU time, compared with 31% for the MTTDLCG method and 46% for the MTTHSCG method; this indicates that the MTTHSCG method performs best in terms of CPU time. Overall, Figure 1, Figure 2 and Figure 3 show that Algorithm 2 is effective and comparable with the MTTDLCG and MTTHSCG methods on the test problems in Table 1.

4.2. Accelerated Strategy for Algorithm 2

In order to improve the numerical performance of Algorithm 2, in this subsection we utilize the acceleration strategy in [40], which modifies the step length in a multiplicative manner along the iterations. Concretely, the iterative form (2) becomes:
$$x_{k+1} = x_k + \rho_k \alpha_k d_k,$$
where $\theta_k = \alpha_k g_k^T d_k$ and $\xi_k = \alpha_k [g(x_k + \alpha_k d_k) - g_k]^T d_k$; if $\xi_k < 0$, then $\rho_k = \theta_k / \xi_k$, otherwise $\rho_k = 1$.
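A minimal Python sketch of the multiplicative step modification exactly as stated above; the function name is illustrative, and any further safeguards would follow [40].

```python
import numpy as np

def accelerated_step(grad, x, d, alpha):
    """Multiplicative acceleration of the step, as described in Section 4.2 (sketch)."""
    g = grad(x)
    theta = alpha * (g @ d)                         # theta_k
    xi = alpha * ((grad(x + alpha * d) - g) @ d)    # xi_k
    rho = theta / xi if xi < 0 else 1.0             # rho_k
    return x + rho * alpha * d                      # x_{k+1} = x_k + rho_k alpha_k d_k
```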
We also tested the problems in Table 1, comparing Algorithm 2 equipped with the acceleration strategy against plain Algorithm 2. The corresponding parameters remain unchanged. The performance profiles can be found in Figure 4, Figure 5 and Figure 6.
From Figure 4, we find that Algorithm 2 with the acceleration strategy (the HTTCG-A method) solves 73% of the test problems with the fewest iterations, compared with 36% for the HTTCG method. From Figure 5, we see that the HTTCG-A method solves 72% of the test problems with the fewest function and gradient evaluations, compared with 35% for the HTTCG method. From Figure 6, we see that the HTTCG-A method solves 59% of the test problems with the least CPU time, compared with 28% for the HTTCG method. These results all indicate that the acceleration strategy works and reduces the number of iterations, the number of function and gradient evaluations, and the CPU time consumed.

5. Conclusions

Unconstrained smooth optimization problems arise in many applications such as optimal control and machine learning. In this paper, a hybrid three-term descent conjugate gradient algorithm is proposed. This algorithm possesses the sufficient descent property independently of any line search technique and also satisfies an extended Dai–Liao conjugate condition. Under some mild conditions, the algorithm is globally convergent for uniformly convex functions. For general nonlinear functions, the hybrid method is also globally convergent by using a modified secant condition. Numerical results indicate that the hybrid method is effective and reliable. Meanwhile, an acceleration strategy is adopted to improve the numerical performance. In the future, we plan to apply our conjugate gradient methods to non-smooth problems via smoothing strategies and the Moreau–Yosida regularization technique, and to image restoration problems.

Author Contributions

Conceptualization, Q.T., X.W., L.P. and M.Z.; methodology, Q.T. and X.W.; software, X.W.; validation, X.W., L.P., M.Z. and F.M.; formal analysis, X.W., Q.T. and F.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially supported by the Natural Science Foundation of Shandong Province with No. ZR2019BA014.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data can be found in the manuscript.

Acknowledgments

The authors would like to thank Gonglin Yuan of the School of Mathematics and Information Science at Guangxi University for his assistance with the numerical experiments. The authors are grateful to the referees and the editor for their constructive comments and helpful suggestions, which greatly improved the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Fan, J.; Zeng, J. A Levenberg–Marquardt algorithm with correction for singular system of nonlinear equations. Appl. Math. Comput. 2013, 219, 9438–9446. [Google Scholar] [CrossRef]
  2. Yerina, M.; Izmailov, A. The Gauss–Newton method for finding singular solutions to systems of nonlinear equations. Comp. Math. Math. Phys. 2007, 47, 748–759. [Google Scholar] [CrossRef]
  3. Yuan, G.; Wei, Z.; Wang, Z. Gradient trust region algorithm with limited memory BFGS update for nonsmooth convex minimization. Comput. Optim. Appl. 2013, 54, 45–64. [Google Scholar] [CrossRef]
  4. Yuan, G.; Wei, Z.; Lu, X. Global convergence of BFGS and PRP methods under a modified weak Wolfe–Powell line search. Appl. Math. Model. 2017, 47, 811–825. [Google Scholar] [CrossRef]
  5. Yuan, G.; Lu, J.; Wang, Z. The PRP conjugate gradient algorithm with a modified WWP line search and its application in the image restoration problems. Appl. Numer. Math. 2020, 152, 1–11. [Google Scholar] [CrossRef]
  6. Dai, Y.; Han, J.; Liu, G.; Sun, D.; Yin, H.; Yuan, Y. Convergence properties of nonlinear conjugate gradient methods. SIAM J. Optim. 2000, 10, 345–358. [Google Scholar] [CrossRef]
  7. Yuan, G.; Wang, X.; Sheng, Z. Family weak conjugate gradient algorithms and their convergence analysis for nonconvex functions. Numer. Algorithms 2020, 84, 935–956. [Google Scholar] [CrossRef]
  8. Hager, W.; Zhang, H. A survey of nonlinear conjugate gradient methods. Pac. J. Optim. 2006, 2, 35–58. [Google Scholar]
  9. Andrei, N. Numerical comparison of conjugate gradient algorithms for unconstrained optimization. Stud. Inform. Control. 2007, 16, 333–352. [Google Scholar]
  10. Li, X.; Wang, X.; Sheng, Z.; Duan, X. A modified conjugate gradient algorithm with backtracking line search technique for large-scale nonlinear equations. Int. J. Comput. Math. 2018, 95, 382–395. [Google Scholar] [CrossRef]
  11. Wang, X.; Hu, W.; Yuan, G. A Modified Wei-Yao-Liu Conjugate Gradient Algorithm for Two Type Minimization Optimization Models; Springer: Cham, Switzerland, 2018. [Google Scholar]
  12. Yuan, G.; Meng, Z.; Li, Y. A modified Hestenes and Stiefel conjugate gradient algorithm for large-scale nonsmooth minimizations and nonlinear equations. J. Optim. Theory Appl. 2016, 168, 129–152. [Google Scholar] [CrossRef]
  13. Cao, J.; Wu, J. A conjugate gradient algorithm and its applications in image restoration. Appl. Numer. Math. 2020, 152, 243–252. [Google Scholar] [CrossRef]
  14. Møller, F. A scaled conjugate gradient algorithm for fast supervised learning. Neural Netw. 1993, 6, 525–533. [Google Scholar]
  15. Zhou, Y.; Wu, Y.; Li, X. A New Hybrid PRPFR Conjugate Gradient Method for Solving Nonlinear Monotone Equations and Image Restoration Problems. Math. Probl. Eng. 2020, 2020, 1–13. [Google Scholar] [CrossRef]
  16. Hestenes, M.R.; Stiefel, E. Methods of conjugate gradients for solving linear systems. J. Res. Natl. Bur. Stand. 1952, 49, 409–436. [Google Scholar] [CrossRef]
  17. Fletcher, R.; Reeves, C.M. Function minimization by conjugate gradients. Comput. J. 1964, 2, 149–154. [Google Scholar] [CrossRef] [Green Version]
  18. Dai, Y.; Liao, L. New conjugacy conditions and related nonlinear conjugate gradient methods. Appl. Math. Opt. 2001, 43, 87–101. [Google Scholar] [CrossRef]
  19. Polyak, B.T. The conjugate gradient method in extreme problems. USSR Comput. Math. Math. Phys. 1969, 9, 94–112. [Google Scholar] [CrossRef]
  20. Polak, E.; Ribière, G. Note sur la convergence de directions conjugees. Rev. Fr. Informat Rech. Opertionelle 1969, 16, 35–43. [Google Scholar]
  21. Zhang, K.; Liu, H.; Liu, Z. A new Dai-Liao conjugate gradient method with optimal parameter choice. Numer. Funct. Anal. Optim. 2019, 40, 194–215. [Google Scholar] [CrossRef]
  22. Andrei, N. A Dai-Liao conjugate gradient algorithm with clustering of eigenvalues. Numer. Algorithms 2018, 77, 1273–1282. [Google Scholar] [CrossRef]
  23. Babaie-kafaki, S.; Ghanbari, R. The Dai-Liao nonlinear conjugate gradient method with optimal parameter choices. Eur. J. Oper. Res. 2014, 234, 625–630. [Google Scholar] [CrossRef]
  24. Zheng, Y.; Zheng, B. Two new Dai-Liao-type conjugate gradient methods for unconstrained optimization problems. J. Optim. Theory Appl. 2017, 175, 502–509. [Google Scholar] [CrossRef]
  25. Peyghami, M.; Ahmadzadeh, H.; Fazli, A. A new class of efficient and globally convergent conjugate gradient methods in the Dai-Liao family. Optim. Method. Softw. 2015, 30, 843–863. [Google Scholar] [CrossRef]
  26. Yuan, G.; Wang, X.; Sheng, Z. The projection technique for two open problems of unconstrained optimization problems. J. Optim. Theory Appl. 2020, 186, 590–619. [Google Scholar] [CrossRef]
  27. Narushima, Y.; Yabe, H.; Ford, J. A three-term conjugate gradient method with sufficient descent property for unconstrained optimization. SIAM J. Optim. 2011, 21, 212–230. [Google Scholar] [CrossRef] [Green Version]
  28. Zhang, L.; Zhou, W.; Li, D. Some descent three-term conjugate gradient methods and their global convergence. Optim. Method. Softw. 2007, 22, 697–711. [Google Scholar] [CrossRef]
  29. Zhang, L.; Zhou, W.; Li, D. A descent modified Polak-Ribière-Polyak conjugate gradient method and its global convergence. IMA J. Numer. Anal. 2006, 26, 629–640. [Google Scholar] [CrossRef]
  30. Andrei, N. A simple three-term conjugate gradient algorithm for unconstrained optimization. J. Comput. Appl. Math. 2013, 241, 19–29. [Google Scholar] [CrossRef]
  31. Babaie-Kafaki, S.; Ghanbari, R. Two modified three-term conjugate gradient methods with sufficient descent property. Optim. Lett. 2014, 8, 2285–2297. [Google Scholar] [CrossRef]
  32. Li, D.; Fukushima, M. A modified BFGS method and its global convergence in nonconvex minimization. J. Comput. Appl. Math. 2001, 129, 15–35. [Google Scholar] [CrossRef] [Green Version]
  33. Zhang, L. A derivative-free conjugate residual method using secant condition for general large-scale nonlinear equations. Numer. Algorithms 2020, 83, 1277–1293. [Google Scholar] [CrossRef]
  34. Sugiki, K.; Narushima, Y.; Yabe, H. Globally convergent three-term conjugate gradient methods that use secant conditions and generate descent search directions for unconstrained optimization. J. Optim. Theory Appl. 2012, 153, 733–757. [Google Scholar] [CrossRef]
  35. Wei, Z.; Li, G.; Qi, L. New quasi-Newton methods for unconstrained optimization problems. Appl. Math. Comput. 2006, 175, 1156–1188. [Google Scholar] [CrossRef]
  36. Yuan, G.; Wei, Z. Convergence analysis of a modified BFGS method on convex minimizations. Comput. Optim. Appl. 2010, 47, 237–255. [Google Scholar] [CrossRef]
  37. Dai, Y.; Kou, C. A nonlinear conjugate gradient algorithm with an optimal property and an improved wolfe line search. SIAM J. Optim. 2013, 23, 296–320. [Google Scholar] [CrossRef] [Green Version]
  38. Andrei, N. An unconstrained optimization test functions collection. Adv. Model Optim. 2008, 10, 147–161. [Google Scholar]
  39. Dolan, E.; Moré, J. Benchmarking optimization software with performance profiles. Math. Program 2002, 91, 201–213. [Google Scholar] [CrossRef]
  40. Andrei, N. An acceleration of gradient descent algorithm with backtracking for unconstrained optimization. Numer. Algorithms 2006, 42, 63–73. [Google Scholar] [CrossRef]
Figure 1. Performance profiles of the three methods with respect to the number of iterations.
Figure 2. Performance profiles of the three methods with respect to the number of function and gradient evaluations.
Figure 3. Performance profiles of the three methods with respect to CPU time consumed.
Figure 4. Performance profiles of Algorithm 2 and its accelerated variant with respect to the number of iterations (NI).
Figure 5. Performance profiles of Algorithm 2 and its accelerated variant with respect to the number of function and gradient evaluations (NFG).
Figure 6. Performance profiles of Algorithm 2 and its accelerated variant with respect to CPU time.
Table 1. The test problems.
No. | Problem | No. | Problem
1 | Extended Trigonometric Function | 26 | BDQRTIC (CUTE)
2 | Extended Rosenbrock Function | 27 | ARWHEAD (CUTE)
3 | Extended White and Holst Function | 28 | NONDIA (Shanno-78) (CUTE)
4 | Extended Beale Function U63 (MatrixRom) | 29 | DQDRTIC (CUTE)
5 | Extended Penalty Function | 30 | EG2 (CUTE)
6 | Raydan 1 Function | 31 | DIXMAANA (CUTE)
7 | Raydan 2 Function | 32 | DIXMAANB (CUTE)
8 | Diagonal 3 Function | 33 | DIXMAANC (CUTE)
9 | Generalized Tridiagonal-1 Function | 34 | DIXMAANE (CUTE)
10 | Extended Tridiagonal-1 Function | 35 | Broyden Tridiagonal
11 | Extended Three Exponential Terms | 36 | EDENSCH Function (CUTE)
12 | Generalized Tridiagonal-2 Function | 37 | VARDIM Function (CUTE)
13 | Diagonal 4 Function | 38 | DIAGONAL 6
14 | Diagonal 5 Function (MatrixRom) | 39 | DIXMAANF (CUTE)
15 | Extended Himmelblau Function | 40 | DIXMAANG (CUTE)
16 | Generalized PSC1 Function | 41 | DIXMAANH (CUTE)
17 | Extended PSC1 Function | 42 | DIXMAANI (CUTE)
18 | Extended Maratos Function | 43 | DIXMAANJ (CUTE)
19 | Extended Cliff Function | 44 | DIXMAANK (CUTE)
20 | Extended Wood Function | 45 | DIXMAANL (CUTE)
21 | Extended Quadratic Penalty QP1 Function | 46 | DIXMAAND (CUTE)
22 | Extended Quadratic Penalty QP2 Function | 47 | ENGVAL1 (CUTE)
23 | A Quadratic Function QF2 | 48 | COSINE (CUTE)
24 | Extended EP1 Function | 49 | Extended DENSCHNB (CUTE)
25 | Extended Tridiagonal-2 Function | 50 | Extended DENSCHNF (CUTE)
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
