Article

A New Filter Nonmonotone Adaptive Trust Region Method for Unconstrained Optimization

1 School of Science, Southwest Petroleum University, Chengdu 610500, China
2 School of Artificial Intelligence, Southwest Petroleum University, Chengdu 610500, China
* Author to whom correspondence should be addressed.
Symmetry 2020, 12(2), 208; https://doi.org/10.3390/sym12020208
Submission received: 17 December 2019 / Revised: 20 January 2020 / Accepted: 21 January 2020 / Published: 2 February 2020
(This article belongs to the Special Issue Advance in Nonlinear Analysis and Optimization)

Abstract

In this paper, a new filter nonmonotone adaptive trust region method with a fixed step length for unconstrained optimization is proposed. The trust region radius adopts a new adaptive strategy to avoid additional computational cost at each iteration. A new nonmonotone trust region ratio is introduced. When a trial step is not successful, a multidimensional filter is employed to increase the possibility of the trial step being accepted. If the trial step is still not accepted by the filter set, a new iteration point is found along the trial step, with the step length computed by a fixed formula. The symmetric positive definite approximation of the Hessian matrix is updated by a modified BFGS (MBFGS) formula. The global convergence and superlinear convergence of the proposed algorithm are proven under some classical assumptions. The efficiency of the algorithm is demonstrated by numerical experiments.

1. Introduction

Consider the following unconstrained optimization problem:
$$ \min_{x \in \mathbb{R}^n} f(x), \qquad (1) $$
where $f : \mathbb{R}^n \to \mathbb{R}$ is a twice continuously differentiable function. The trust region method is one of the prominent classes of iterative methods for this problem. At the current iterate $x_k$, the trial step $d_k$ is obtained by solving the following quadratic subproblem:
$$ \min_{d \in \mathbb{R}^n} \; m_k(d) = g_k^T d + \tfrac{1}{2} d^T B_k d, \quad \text{s.t.}\ \|d\| \le \Delta_k, \qquad (2) $$
where $\|\cdot\|$ is the Euclidean norm, $f_k = f(x_k)$, $g_k = \nabla f(x_k)$, $G_k = \nabla^2 f(x_k)$, $B_k$ is a symmetric approximation of $G_k$, and $\Delta_k$ is the trust region radius. The standard trust region ratio is defined as follows:
$$ \rho_k = \frac{f_k - f(x_k + d_k)}{m_k(0) - m_k(d_k)}. \qquad (3) $$
Generally, the numerator is referred to as the actual reduction and the denominator is known as the predicted reduction.
The disadvantage of the traditional trust region method is that the subproblem may need to be solved several times before an acceptable trial step is obtained in one iteration. To overcome this drawback, Mo et al. [1] first proposed a nonmonotone trust region algorithm with a fixed step length: when the trial step is not acceptable, a fixed step length is used to find a new iteration point instead of re-solving the subproblem. Exploiting this idea, Ou et al. [2], Wang and Tong [3], and Hang and Liu [4] proposed trust region algorithms with a fixed step length. The fixed step length is computed by
$$ \alpha_k = -\delta\, \frac{g_k^T d_k}{d_k^T B_k d_k}, \qquad (4) $$
where $\delta \in (0,1)$ is a given constant.
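As a small illustration (not code from the paper; the function name, the default value of delta, and the use of NumPy are our own choices), the fixed step length (4) can be evaluated as follows in Python, assuming the reconstruction above with the negative sign and a curvature condition $d^T B d > 0$ as in Assumption H3:

import numpy as np

def fixed_step_length(g, d, B, delta=0.5):
    # Fixed step length used when the trial step is rejected; sketch of Eq. (4).
    # delta is a user-chosen constant in (0, 1); 0.5 is only an example value.
    curvature = d @ B @ d          # requires d^T B d > 0 (Assumption H3)
    return -delta * (g @ d) / curvature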
It is well known that the strategy for selecting the trust region radius has a significant impact on the performance of trust region methods; an inappropriate radius increases the number of subproblems that must be solved, thereby reducing efficiency. In 1997, Sartenaer [5] presented a strategy which can automatically determine an initial trust region radius. In 2002, Zhang et al. [6] provided another scheme to reduce the number of subproblems that need to be solved, in which the trust region radius is given by $\Delta_k = c^p \|g_k\| \, \|\hat{B}_k^{-1}\|$ with $\hat{B}_k = B_k + i I$ for some $i \in \mathbb{N}$. Zhang's strategy requires an estimate of $\hat{B}_k^{-1}$ at each iteration; the authors of [4] therefore suggested another practically efficient adaptive radius, $\Delta_k = \frac{\|d_{k-1}\|}{\|y_{k-1}\|}\,\|g_k\|$, which uses only gradient and step information from the previous iteration. Inspired by these facts, several modified adaptive trust region methods have been proposed in [7,8,9,10].
As is well known, monotone techniques may slow down the rate of convergence, especially in the presence of narrow curved valleys, because they require the function value to decrease at every iteration. In order to overcome these disadvantages, Deng et al. [11] proposed a nonmonotone trust region algorithm in 1993. The general nonmonotone term $f_{l(k)}$ is defined by
$$ f_{l(k)} = f(x_{l(k)}) = \max_{0 \le j \le m(k)} \{ f_{k-j} \}, \quad k = 0, 1, 2, \ldots, \qquad (5) $$
where $m(0) = 0$, $0 \le m(k) \le \min\{N, m(k-1)+1\}$, and $N \ge 0$ is an integer. Deng et al. [11] modified the ratio (3), which evaluates the consistency between the quadratic model and the objective function in trust region methods. The most common nonmonotone ratio is defined as follows:
$$ \tilde{\rho}_k = \frac{f_{l(k)} - f(x_k + d_k)}{m_k(0) - m_k(d_k)}. \qquad (6) $$
The general nonmonotone term $f_{l(k)}$ suffers from various drawbacks; for example, the numerical performance is highly dependent on the choice of $N$. In order to introduce a more suitable nonmonotone strategy, Ahookhosh et al. [12] proposed a new nonmonotone ratio as follows:
$$ \hat{\rho}_k = \frac{R_k - f(x_k + d_k)}{m_k(0) - m_k(d_k)}, \qquad (7) $$
where
$$ R_k = \eta_k f_{l(k)} + (1 - \eta_k) f_k, \qquad (8) $$
in which $\eta_k \in [\eta_{\min}, \eta_{\max}]$, with $\eta_{\min} \in [0,1)$ and $\eta_{\max} \in [\eta_{\min}, 1]$. We recommend that interested readers refer to [13,14] for more details and progress on nonmonotone trust region algorithms.
In order to overcome the difficulties associated with penalty functions, especially the adjustment of the penalty parameter, filter methods were presented by Fletcher and Leyffer [15] for constrained nonlinear optimization. More recently, Gould et al. [16] explored a new nonmonotone trust region algorithm for unconstrained optimization problems based on the multidimensional filter technique of [17]. Compared with the standard nonmonotone algorithm, the new algorithm dynamically determines iterations based on the filter elements and increases the possibility of the trial step being accepted. Therefore, this topic has received great attention in recent years (see [18,19,20,21]).
The remainder of this paper is organized as follows. In Section 2, we describe a new trust region method. The global convergence is investigated in Section 3. In Section 4, we prove the superlinear convergence of the algorithm. Numerical results are shown in Section 5. Finally, the paper ends with some conclusions in Section 6.

2. The New Algorithm

In this section, we propose a trust region method that combines a new trust region radius with a modified trust region ratio to solve unconstrained optimization problems effectively. In each iteration, a trial step $d_k$ is generated by solving the adaptive trust region subproblem
$$ \min_{d \in \mathbb{R}^n} \; m_k(d) = g_k^T d + \tfrac{1}{2} d^T B_k d, \qquad (9) $$
$$ \text{s.t.}\ \|d\| \le \Delta_k := c_k \|g_k\|^{\gamma}, \qquad (10) $$
where $0 < \gamma < 1$ and $c_k$ is an adjustment parameter. Owing to this adaptive technique, the proposed method has the following attractive property: it is not necessary to compute a matrix inverse at each iteration, which reduces the related workload and computation time.
In fact, the matrix $B_k$ is usually obtained by approximation, and the subproblem is only solved approximately. In this case, it may be more reasonable to adjust the next trust region radius $\Delta_{k+1}$ not only according to $\hat{\rho}_k$ but also by taking $\{\hat{\rho}_{k-m+1}, \ldots, \hat{\rho}_k\}$ into account. To improve the efficiency of nonmonotone trust region methods, we define the following modified ratio based on (7):
$$ \bar{\rho}_k = \sum_{i=1}^{\min\{k,m\}} w_k^i \, \hat{\rho}_{k-i+1}, \qquad (11) $$
where $m$ is a positive integer and $w_k^i$ is the weight of $\hat{\rho}_{k-i+1}$, such that $\sum_{i=1}^{\min\{k,m\}} w_k^i = 1$.
More precisely, $\bar{\rho}_k$ is used to determine whether the trial step is acceptable. The adjustment of the next radius $\Delta_{k+1}$ also depends on (11); thus $c_k$ is updated by
$$ c_{k+1} = \begin{cases} \min\{\beta_2 c_k, \, c_{\max}\}, & \text{if } \bar{\rho}_k \ge \mu_2, \\ c_k, & \text{if } \mu_1 \le \bar{\rho}_k < \mu_2, \\ \beta_1 c_k, & \text{if } \bar{\rho}_k < \mu_1, \end{cases} \qquad (12) $$
where $0 < \beta_1 < 1 < \beta_2$ and $c_{\max} > 0$ are constants.
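For concreteness, the following Python sketch (our own illustration, not code from the paper) computes the nonmonotone reference value $R_k$, the ratio $\hat{\rho}_k$, the weighted ratio $\bar{\rho}_k$ with uniform weights, and the updated adjustment parameter $c_{k+1}$. The values $\mu_1 = 0.25$, $\mu_2 = 0.75$, $\beta_1 = 0.25$, and $\beta_2 = 1.5$ are taken from the experiments in Section 5; the remaining defaults (eta, N, m, c_max) and the uniform weights are example choices.

import numpy as np

def nonmonotone_ratio(f_hist, f_new, pred_red, eta=0.85, N=5):
    # R_k = eta*f_{l(k)} + (1 - eta)*f_k, with f_{l(k)} the maximum of the last N+1 values; Eq. (8)
    f_lk = max(f_hist[-(N + 1):])
    R_k = eta * f_lk + (1.0 - eta) * f_hist[-1]
    return (R_k - f_new) / pred_red                 # rho_hat_k, Eq. (7)

def weighted_ratio(rho_hist, m=3):
    # bar_rho_k: weighted average of the last min(k, m) ratios, Eq. (11); uniform weights here
    recent = rho_hist[-m:]
    w = np.full(len(recent), 1.0 / len(recent))
    return float(w @ np.array(recent))

def update_c(c, bar_rho, mu1=0.25, mu2=0.75, beta1=0.25, beta2=1.5, c_max=1e3):
    # adjustment parameter update of Eq. (12); the radius is then Delta_k = c_k * ||g_k||**gamma
    if bar_rho >= mu2:
        return min(beta2 * c, c_max)
    elif bar_rho >= mu1:
        return c
    return beta1 * c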
In what follows, we write $g_k = g(x_k) = \nabla f(x_k)$, and we denote the $i$th component of $g_k$ by $g_i(x_k)$. In the filter setting, we say that an iterate $x_1$ dominates $x_2$ if, and only if,
$$ |g_i(x_1)| \le |g_i(x_2)|, \quad \forall\, i = 1, 2, \ldots, n. \qquad (13) $$
A multidimensional filter $F$ is a list of $n$-tuples of the form $(g_1(x_k), g_2(x_k), \ldots, g_n(x_k))$ such that no entry dominates another, i.e.,
$$ |g_j(x_k)| < |g_j(x_l)| \quad \text{for at least one } j \in \{1, 2, \ldots, n\}, \qquad (14) $$
whenever $g_k$ and $g_l$ are distinct members of $F$.
A new trial point $x_k^+$ is acceptable for the filter if, for all $g(x_l) \in F$, there exists $j \in \{1, 2, \ldots, n\}$ such that
$$ |g_j(x_k^+)| \le |g_j(x_l)| - \gamma_g \|g(x_l)\|, \qquad \gamma_g \in \left(0, \tfrac{1}{\sqrt{n}}\right). \qquad (15) $$
If the iterate $x_k^+$ is acceptable, we add $g(x_k^+)$ to the filter and, at the same time, remove from the filter all entries dominated by $x_k^+$. In the general filter trust region algorithm, a trial point $x_k^+$ satisfying $\hat{\rho}_k < \mu_1$ is tested for acceptance by the filter $F$; in our algorithm, it is the trial point $x_k^+$ satisfying $0 < \bar{\rho}_k < \mu_1$ that is tested for acceptance by the filter $F$.
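The multidimensional filter can be maintained as a simple list of stored gradient vectors, as in the following Python sketch (an illustrative implementation under our own naming; the class name and the default value of gamma_g are not from the paper):

import numpy as np

class GradientFilter:
    # Multidimensional filter on the components |g_i(x)|; gamma_g should lie in (0, 1/sqrt(n)).
    def __init__(self, gamma_g=0.01):
        self.entries = []              # stored gradient vectors g(x_l)
        self.gamma_g = gamma_g

    def acceptable(self, g_new):
        # acceptable w.r.t. every entry: some component beats |g_j(x_l)| - gamma_g*||g(x_l)||, Eq. (15)
        for g_l in self.entries:
            margin = self.gamma_g * np.linalg.norm(g_l)
            if not np.any(np.abs(g_new) <= np.abs(g_l) - margin):
                return False
        return True

    def add(self, g_new):
        # drop entries dominated by g_new in the sense of Eq. (13), then store g_new
        self.entries = [g_l for g_l in self.entries
                        if not np.all(np.abs(g_new) <= np.abs(g_l))]
        self.entries.append(np.asarray(g_new, dtype=float))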
Our discussion can be summarized as the following Algorithm 1.
Algorithm 1. A new filter nonmonotone adaptive trust region method.
Step 0. (Initialization) An initial point $x_0 \in \mathbb{R}^n$ and a symmetric matrix $B_0 \in \mathbb{R}^{n \times n}$ are given. The constants $0 < \mu_1 < \mu_2 < 1$, $\tau > 0$, $N > 0$, $\varepsilon > 0$, $\eta_{\min} \in [0,1)$, and $\eta_{\max} \in [\eta_{\min}, 1)$ are also given. Set $k = 0$.
Step 1. If $\|g_k\| \le \varepsilon$, then stop.
Step 2. Solve the subproblem (9)-(10) to find the trial step $d_k$, and set $x_k^+ = x_k + d_k$.
Step 3. Choose $w_k^i \in [0,1]$ satisfying $\sum_{i=1}^{\min\{k,m\}} w_k^i = 1$. Compute $R_k$, $\hat{\rho}_k$, and $\bar{\rho}_k$ by (8), (7), and (11), respectively.
Step 4. Test the trial step.
If $\bar{\rho}_k \ge \mu_1$, then set $x_{k+1} = x_k^+$.
Otherwise, compute $g_k^+ = \nabla f(x_k^+)$;
  if $x_k^+$ is acceptable to the filter $F$, then set $x_{k+1} = x_k^+$ and add $g_k^+ = \nabla f(x_k^+)$ to the filter $F$;
  otherwise, compute the step length $\alpha_k$ by (4) and set $x_{k+1} = x_k + \alpha_k d_k$.
  end (if)
end (if)
Step 5. Update the trust region radius by $\Delta_{k+1} = c_{k+1} \|g_{k+1}\|^{\gamma}$, where $c_{k+1}$ is updated by (12).
Step 6. Compute the new Hessian approximation $B_{k+1}$ by a modified BFGS formula. Set $k = k + 1$ and return to Step 1.
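To show how the pieces fit together, the following condensed Python driver (an illustrative sketch only, reusing the helper functions fixed_step_length, nonmonotone_ratio, weighted_ratio, update_c, and GradientFilter sketched earlier) runs the main loop of Algorithm 1. Two simplifications are assumptions of this sketch and not part of the paper: the subproblem is solved only approximately by a Cauchy-point step, and $B_k$ is kept equal to the identity instead of being updated by the MBFGS formula of Section 5. The parameter values are examples.

import numpy as np

def cauchy_step(g, B, delta):
    # Approximate subproblem solution: minimize the model along -g inside the ball.
    gBg = g @ B @ g
    t_star = delta / np.linalg.norm(g)
    if gBg > 0:
        t_star = min(t_star, (g @ g) / gBg)
    return -t_star * g

def filter_trust_region(f, grad, x0, gamma=0.5, c0=1.0, tol=1e-6, max_iter=1000,
                        mu1=0.25, mu2=0.75, delta_fixed=0.5, m=3):
    x = np.asarray(x0, dtype=float)
    B = np.eye(x.size)                                  # simplification: B_k = I throughout
    c = c0
    f_hist, rho_hist = [f(x)], []
    filt = GradientFilter(gamma_g=0.5 / np.sqrt(x.size))
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:                    # Step 1
            break
        Delta = c * np.linalg.norm(g) ** gamma          # radius (10)
        d = cauchy_step(g, B, Delta)                    # Step 2
        x_plus = x + d
        pred = -(g @ d + 0.5 * d @ B @ d)               # predicted reduction m_k(0) - m_k(d_k)
        rho_hat = nonmonotone_ratio(f_hist, f(x_plus), pred)   # (7)
        rho_hist.append(rho_hat)
        rho_bar = weighted_ratio(rho_hist, m)                  # (11)
        if rho_bar >= mu1:                              # Step 4: successful step
            x = x_plus
        elif filt.acceptable(grad(x_plus)):             # accepted by the filter
            filt.add(grad(x_plus))
            x = x_plus
        else:                                           # fixed step length (4)
            x = x + fixed_step_length(g, d, B, delta_fixed) * d
        f_hist.append(f(x))
        c = update_c(c, rho_bar, mu1, mu2)              # Step 5 via (12)
    return x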
In order to obtain the convergence results, we use the following notation:
$D = \{ k \mid \bar{\rho}_k \ge \mu_1 \}$, $A = \{ k \mid 0 < \bar{\rho}_k < \mu_1 \text{ and } x_k^+ \text{ is accepted by the filter } F \}$, and $S = \{ k \mid x_{k+1} = x_k + d_k \}$. Then we have $S = \{ k \mid \bar{\rho}_k \ge \mu_1 \text{ or } x_k^+ \text{ is accepted by the filter } F \}$. When $k \notin S$, we have $x_{k+1} = x_k + \alpha_k d_k$.

3. Convergence Analysis

To establish the convergence of Algorithm 1, we make the following common assumption.
Assumption 1.
H1. The level set $L(x_0) = \{ x \in \mathbb{R}^n \mid f(x) \le f(x_0) \} \subset \Omega$, where $\Omega \subset \mathbb{R}^n$ is a bounded set, and $f(x)$ is continuously differentiable on the level set $L(x_0)$.
H2. The matrix $B_k$ is uniformly bounded, i.e., there exists a constant $M_1 > 0$ such that $\|B_k\| \le M_1$ for all $k$.
H3. There exists a constant $v > 0$ such that $v \|d\|^2 \le d^T B_k d$ for all $d \in \mathbb{R}^n$ and all $k \in \mathbb{N} \cup \{0\}$.
Remark 1.
In order to analyze the convergence of the new algorithm, the trial step $d_k$ is required to satisfy the following conditions:
$$ m_k(0) - m_k(d_k) \ge \tau \|g_k\| \min\left\{ \Delta_k, \frac{\|g_k\|}{\|B_k\|} \right\}, \qquad (16) $$
$$ -g_k^T d_k \ge \tau \|g_k\| \min\left\{ \Delta_k, \frac{\|g_k\|}{\|B_k\|} \right\}, \qquad (17) $$
where the constant $\tau \in (0,1)$.
Remark 2.
If $f$ is a twice continuously differentiable function, then H1 implies that there is a positive constant $L$ such that
$$ \| \nabla f(x) - \nabla f(y) \| \le L \|x - y\|, \quad \forall\, x, y \in \Omega. \qquad (18) $$
Lemma 1.
For all $k$, we have
$$ \left| f_k - f(x_k + d_k) - \left( m_k(0) - m_k(d_k) \right) \right| = O(\|d_k\|^2). \qquad (19) $$
Proof. 
The proof follows by applying Taylor's expansion together with the assumptions above; a short version of the computation is given below. □
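For completeness, one way to carry out the omitted computation (using the Lipschitz property (18) and the bound on $B_k$ from H2) is:
$$ \left| f_k - f(x_k + d_k) - \left( m_k(0) - m_k(d_k) \right) \right| = \left| f(x_k + d_k) - f_k - g_k^T d_k - \tfrac12 d_k^T B_k d_k \right| \le \left| \int_0^1 \left( g(x_k + t d_k) - g_k \right)^T d_k \, dt \right| + \tfrac12 \left| d_k^T B_k d_k \right| \le \tfrac{L}{2} \|d_k\|^2 + \tfrac{M_1}{2} \|d_k\|^2 = O(\|d_k\|^2). $$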
Lemma 2.
Suppose that H1–H3 hold and that the sequence $\{x_k\}$ is generated by Algorithm 1. Moreover, assume that there exists a constant $0 < \varepsilon < 1$ such that $\|g_k\| > \varepsilon$ for all $k$. Then, for every $k$, there exists a nonnegative integer $p$ such that $x_{k+p+1}$ is a successful iteration point, i.e., $\hat{\rho}_{k+p} \ge \mu_1$.
Proof. 
We prove this by contradiction. Assume that there exists an iteration index $k$ such that $x_{k+p+1}$ is an unsuccessful iteration point for every nonnegative integer $p$, i.e.,
$$ \hat{\rho}_{k+p} < \mu_1, \quad p = 0, 1, 2, \ldots. \qquad (20) $$
It follows from (11) that $\bar{\rho}_{k+p} < \mu_1$ for $p = 0, 1, 2, \ldots$. Thus, using (10) and (12), we obtain
$$ \Delta_{k+p+1} \le \beta_1^{\,p+1} c_k \|g_{k+p}\|^{\gamma} \le \beta_1^{\,p+1} c_{\max} \|g_k\|^{\gamma}. \qquad (21) $$
As a matter of fact, in the unsuccessful iterations we have $x_{k+p} = x_k$. Thus, since $0 < \beta_1 < 1$, (21) yields
$$ \lim_{p \to \infty} \Delta_{k+p+1} = 0. \qquad (22) $$
Now, using Lemma 1 and (16), we get
$$ \left| \frac{f(x_{k+p}) - f(x_{k+p} + d_{k+p})}{m_{k+p}(0) - m_{k+p}(d_{k+p})} - 1 \right| = \frac{\left| f(x_{k+p}) - f(x_{k+p} + d_{k+p}) - \left( m_{k+p}(0) - m_{k+p}(d_{k+p}) \right) \right|}{m_{k+p}(0) - m_{k+p}(d_{k+p})} \le \frac{O(\|d_{k+p}\|^2)}{\tau \|g_{k+p}\| \min\left\{ \Delta_{k+p}, \frac{\|g_{k+p}\|}{\|B_{k+p}\|} \right\}} \le \frac{O(\|d_{k+p}\|^2)}{\tau \varepsilon \min\left\{ \Delta_{k+p}, \frac{\varepsilon}{M_1} \right\}} \le \frac{O(\Delta_{k+p}^2)}{O(\Delta_{k+p})} \to 0 \quad (p \to \infty). $$
According to the definition of $R_k$, we get $R_k \ge \eta_k f_k + (1 - \eta_k) f_k = f_k$. Thus, for sufficiently large $p$, we have
$$ \hat{\rho}_{k+p} = \frac{R_{k+p} - f(x_{k+p} + d_{k+p})}{m_{k+p}(0) - m_{k+p}(d_{k+p})} \ge \frac{f_{k+p} - f(x_{k+p} + d_{k+p})}{m_{k+p}(0) - m_{k+p}(d_{k+p})} \to 1, \qquad (23) $$
which contradicts (20). This completes the proof of Lemma 2. □
Lemma 3.
Suppose that H1–H3 hold and that the sequence $\{x_k\}$ is generated by Algorithm 1. Let $\delta \in \left(0, \min\{1, \tfrac{v}{L}\}\right)$. For $k \notin S$, we have
$$ f_{k+1} - R_k \le \frac{\delta}{2}\left(1 - \frac{L\delta}{v}\right) g_k^T d_k \le 0. \qquad (24) $$
Proof. 
According to the definition of $R_k$, for all $k$ we have $f_k \le R_k$. Using the mean value theorem, we get
$$ f_{k+1} - R_k \le f_{k+1} - f_k = g(\xi)^T (x_{k+1} - x_k), \qquad (25) $$
where $\xi \in [x_k, x_{k+1}]$. For $k \notin S$, using (18) we obtain
$$ \begin{aligned} g(\xi)^T (x_{k+1} - x_k) &= g_k^T (x_{k+1} - x_k) + \left( g(\xi) - g_k \right)^T (x_{k+1} - x_k) \\ &\le g_k^T (x_{k+1} - x_k) + \| g(\xi) - g_k \| \, \| x_{k+1} - x_k \| \\ &\le g_k^T (x_{k+1} - x_k) + L \| x_{k+1} - x_k \|^2 \\ &= \alpha_k g_k^T d_k + L \alpha_k^2 \| d_k \|^2 \\ &= \left( 1 - \frac{L \delta \| d_k \|^2}{d_k^T B_k d_k} \right) \alpha_k g_k^T d_k \\ &\le \left( 1 - \frac{L \delta}{v} \right) \alpha_k g_k^T d_k. \end{aligned} \qquad (26) $$
Note that (4) and (16) imply that $\alpha_k \ge \delta/2$ for all $k$ (see the verification below), and the choice $\delta \in \left(0, \min\{1, \tfrac{v}{L}\}\right)$ guarantees $1 - L\delta/v > 0$. Thus, according to (17), (25), and (26), we conclude that (24) holds. □
Lemma 4.
Suppose that the sequence $\{x_k\}$ is generated by Algorithm 1. Then we have $\{x_k\} \subset L(x_0)$.
Proof. 
We proceed by induction. When $k = 0$, clearly $x_0 \in L(x_0)$.
Assume that $x_k \in L(x_0)$ holds for some $k \ge 0$, so that $f_k \le f_0$. We now show that $x_{k+1} \in L(x_0)$.
(a) When $k \in S$, consider two cases:
Case 1: $k \in D$. According to (7) and (16), we obtain $R_k - f_{k+1} \ge \mu_1 \left( m_k(0) - m_k(d_k) \right) \ge 0$. Thus, we have $R_k \ge f_{k+1}$. Following the definitions of $R_k$ and $f_{l(k)}$, we obtain
$$ R_k = \eta_k f_{l(k)} + (1 - \eta_k) f_k \le \eta_k f_{l(k)} + (1 - \eta_k) f_{l(k)} = f_{l(k)}. \qquad (27) $$
The above two inequalities show that
$$ f_{k+1} \le R_k \le f_{l(k)} \le f_0. \qquad (28) $$
Case 2: $k \in A$. According to $0 < \bar{\rho}_k < \mu_1$, we have $R_k - f(x_k + d_k) > 0$. Thus, we get $f_{k+1} \le R_k \le f_{l(k)} \le f_0$.
(b) When $k \notin S$: using Lemma 3 and (27), we obtain $f_{k+1} \le R_k \le f_{l(k)} \le f_0$.
We can now conclude that $\{x_k\} \subset L(x_0)$. □
Lemma 5.
Suppose that H1–H3 hold and that the sequence $\{x_k\}$ is generated by Algorithm 1. Then $\{f_{l(k)}\}$ is a monotonically non-increasing sequence and it is convergent.
Proof. 
First, we prove that the sequence $\{f_{l(k)}\}$ is non-increasing. We consider two cases:
Case 1: For $k < N$, it is clear that $m(k) = k$. Since $f_k \le f_0$ for any $k$, we get $f_{l(k)} = f_0$.
Case 2: For $k \ge N$, we have $m(k+1) \le m(k) + 1$. Thus, using the definition of $f_{l(k+1)}$ and (5), we observe that
$$ f_{l(k+1)} = \max_{0 \le j \le m(k+1)} \{ f_{k+1-j} \} \le \max_{0 \le j \le m(k)+1} \{ f_{k+1-j} \} = \max\{ f_{l(k)}, f_{k+1} \} \le f_{l(k)}, \qquad (29) $$
where the last inequality uses $f_{k+1} \le f_{l(k)}$ from Lemma 4. Hence, $\{f_{l(k)}\}$ is a monotonically non-increasing sequence. This fact, together with H1, implies that there exists $\lambda$ such that, for all $n \in \mathbb{N} \cup \{0\}$,
$$ \lambda \le f_{k+n} \le f_{l(k+n)} \le \cdots \le f_{l(k+1)} \le f_{l(k)}. \qquad (30) $$
This shows that the sequence $\{f_{l(k)}\}$ is convergent. □
Lemma 6.
Suppose that H1–H3 hold and that there exists $\varepsilon > 0$ such that $\|g_k\| \ge \varepsilon$ for all $k$. Then there is a constant $\beta > 0$ such that
$$ \Delta_k \ge \frac{\beta}{M_k}, \qquad (31) $$
where $M_k = \max_{0 \le i \le k} \|B_i\| + 1$.
Proof. 
Set $\sigma = \dfrac{\tau \varepsilon (1 - \mu_1)}{2L}$. We proceed by induction. Set
$$ \beta = \min\left\{ \Delta_0 M_0, \; \beta_1 \sigma M_0, \; (1 - \mu_1) \beta_1 \tau \varepsilon, \; \beta_1 \varepsilon \right\}. \qquad (32) $$
When $k = 0$, we clearly have $\Delta_0 \ge \beta / M_0$. Now assume that (31) holds for $k$. Note that $\{M_k\}$ is a non-decreasing sequence. We prove that
$$ \Delta_{k+1} \ge \frac{\beta}{M_k}, \qquad k = 0, 1, \ldots. \qquad (33) $$
(a) When $k \in D$, i.e., $\bar{\rho}_k \ge \mu_1$. Using (11) and (7), we deduce that $\hat{\rho}_k \ge \mu_1$. From (12), we get $\Delta_{k+1} \ge \lambda \Delta_k$, where $\lambda \ge 1$ is a constant. Thus, the inequality $\Delta_k \ge \beta / M_k$ implies that $\Delta_{k+1} \ge \beta / M_k$.
(b) When $k \in A$, i.e., $0 < \bar{\rho}_k < \mu_1$. First suppose that $\|d_k\| \ge \sigma$; then
$$ \Delta_{k+1} = \beta_1 \Delta_k \ge \beta_1 \|d_k\| \ge \beta_1 \sigma \ge \frac{\beta_1 \sigma M_0}{M_k} \ge \frac{\beta}{M_k}. \qquad (34) $$
Now suppose that $\|d_k\| < \sigma$. In this case,
$$ \frac{f_k - f(x_k + d_k)}{m_k(0) - m_k(d_k)} \le \frac{R_k - f(x_k + d_k)}{m_k(0) - m_k(d_k)} < \mu_1. $$
Thus,
$$ f(x_k + d_k) - f_k > -\mu_1 \left( m_k(0) - m_k(d_k) \right) = \mu_1 \left( g_k^T d_k + \tfrac12 d_k^T B_k d_k \right). \qquad (35) $$
Using Taylor's formula and H1–H3, it is easy to show that
$$ f(x_k + d_k) - f_k = g(\eta)^T d_k = g_k^T d_k + \left( g(\eta) - g_k \right)^T d_k \le g_k^T d_k + L \|d_k\|^2 \le g_k^T d_k + \frac{\tau (1 - \mu_1) \varepsilon}{2} \|d_k\|, \qquad (36) $$
where $\eta \in [x_k, x_k + d_k]$. Combining (35) with (36), we find that
$$ (1 - \mu_1) \left( g_k^T d_k + \frac{\tau \varepsilon}{2} \|d_k\| \right) > \frac{\mu_1}{2} d_k^T B_k d_k. \qquad (37) $$
Moreover, inequality (16), together with $\|g_k\| \ge \varepsilon$, implies that
$$ -g_k^T d_k - \frac12 d_k^T B_k d_k \ge \tau \varepsilon \min\left\{ \Delta_k, \frac{\varepsilon}{\|B_k\|} \right\}. \qquad (38) $$
Multiplying both sides of inequality (38) by $(1 - \mu_1)$ gives
$$ -(1 - \mu_1) \left( g_k^T d_k + \frac12 d_k^T B_k d_k \right) \ge (1 - \mu_1) \tau \varepsilon \min\left\{ \Delta_k, \frac{\varepsilon}{\|B_k\|} \right\}. \qquad (39) $$
On the other hand, from H3, (37), and (39), we have
$$ \Delta_k \|B_k\| \ge \tau (1 - \mu_1) \varepsilon \min\left\{ 1, \; \frac{2\varepsilon}{\|B_k\| \Delta_k} \right\}. \qquad (40) $$
If $\Delta_k \|B_k\| \le 2\varepsilon$, the minimum in (40) equals one and we have $\Delta_k \|B_k\| \ge (1 - \mu_1) \tau \varepsilon$; otherwise, we obtain $\Delta_k \|B_k\| > 2\varepsilon \ge \varepsilon$. Hence, following (40), we obtain
$$ \Delta_k \|B_k\| \ge \min\left\{ (1 - \mu_1) \tau \varepsilon, \; \varepsilon \right\}. \qquad (41) $$
Thus,
$$ \Delta_{k+1} = \beta_1 \Delta_k \ge \frac{\min\left\{ (1 - \mu_1) \beta_1 \tau \varepsilon, \; \beta_1 \varepsilon \right\}}{\|B_k\|} \ge \frac{\beta}{M_k}. $$
The proof is completed. □
Based on the analyses and lemmas above, we obtain the global convergence of Algorithm 1 as follows:
Theorem 1.
(Global Convergence). Suppose that H1–H3 hold and that the sequence $\{x_k\}$ is generated by Algorithm 1. Then
$$ \liminf_{k \to \infty} \|g_k\| = 0. \qquad (42) $$
Proof. 
Consider the following two cases:
Case 1: The number of successful iterations and the number of filter iterations are both infinite, i.e., $|S| = +\infty$ and $|A| = +\infty$.
We argue by contradiction. Suppose that (42) is not true; then there exists a positive constant $\varepsilon$ such that $\|g_k\| > \varepsilon$ for all $k$. From H1, we see that $\{g_k\}$ is bounded. Denote by $\{k_i\}$ the sequence of indices in the set $A$. Then there exists a subsequence $\{k_t\} \subseteq \{k_i\}$ satisfying $\lim_{t \to \infty} g_{k_t} = g_{\infty}$ with $\|g_{\infty}\| \ge \varepsilon$, and an index $j \in \{1, 2, \ldots, n\}$ such that
$$ |g^{\,j}_{k_t}| \le |g^{\,j}_{k_{t-1}}| - \gamma_g \|g_{k_{t-1}}\|. \qquad (43) $$
Using (43), the convergence of $\{g_{k_t}\}$, and $\|g_{k_{t-1}}\| > \varepsilon$, and letting $t \to \infty$, we obtain
$$ 0 = \lim_{t \to \infty} \left( |g^{\,j}_{k_t}| - |g^{\,j}_{k_{t-1}}| \right) \le -\gamma_g \varepsilon < 0. \qquad (44) $$
This is a contradiction, which completes the proof in this case.
Case 2: The number of successful iterations is infinite and the number of filter iterations is finite, i.e., $|S| = +\infty$ and $|A| < +\infty$.
In this case, there exists an integer $k_1$ such that, for all $k > k_1$, we have $k \in D$ and therefore $\hat{\rho}_k \ge \mu_1$. Hence, from (16) and (27), we find that
$$ f_{l(k)} - f_{k+1} \ge R_k - f_{k+1} \ge \mu_1 \tau \|g_k\| \min\left\{ \Delta_k, \frac{\|g_k\|}{\|B_k\|} \right\} \ge 0. \qquad (45) $$
We argue by contradiction. Suppose that there exist constants $\varepsilon > 0$ and $k_2 > k_1$ such that $\|g_k\| \ge \varepsilon$ for all $k \ge k_2$. Based on Lemma 6 and (45), we can write
$$ f_{l(k)} - f_{k+1} \ge R_k - f_{k+1} \ge \mu_1 \tau \varepsilon \min\left\{ \frac{\beta}{M_k}, \frac{\varepsilon}{M_k} \right\} = \frac{\mu_1 \tau \varepsilon \min\{\beta, \varepsilon\}}{M_k}. \qquad (46) $$
Set $a = \mu_1 \tau \varepsilon \min\{\beta, \varepsilon\}$; thus,
$$ f_{l(k)} - f_{k+1} \ge \frac{a}{M_k}. \qquad (47) $$
According to (47) and the fact that $\{M_k\}$ is a non-decreasing sequence, we have
$$ \begin{aligned} f_{l(k)} &\ge f_{k+1} + a / M_k \ge f_{k+1} + a / M_{k+M+1}, \\ f_{l(k+1)} &\ge f_{k+2} + a / M_{k+1} \ge f_{k+2} + a / M_{k+M+1}, \\ &\;\;\vdots \\ f_{l(k+M)} &\ge f_{k+M+1} + a / M_{k+M} \ge f_{k+M+1} + a / M_{k+M+1}. \end{aligned} \qquad (48) $$
Taking the maximum over the right-hand sides of (48) and using the monotonicity of $\{f_{l(k)}\}$ from Lemma 5, we obtain
$$ f_{l(k)} \ge \max\{ f_{k+1}, f_{k+2}, \ldots, f_{k+M+1} \} + \frac{a}{M_{k+M+1}}, \qquad k \ge k_2. \qquad (49) $$
According to (5), we have
$$ f_{l(k+M+1)} \le \max\{ f_{k+1}, f_{k+2}, \ldots, f_{k+M+1} \}. \qquad (50) $$
Thus, we get
$$ f_{l(k)} - f_{l(k+M+1)} \ge \frac{a}{M_{k+M+1}}. \qquad (51) $$
Now, using (51) together with the monotonicity and convergence of $\{f_{l(k)}\}$ (Lemma 5), we deduce that
$$ \sum_{k \ge k_2} \frac{1}{M_{k+M+1}} \le \frac{1}{a} \sum_{k \ge k_2} \left( f_{l(k)} - f_{l(k+M+1)} \right) = \frac{1}{a} \sum_{k \ge k_2} \sum_{s=0}^{M} \left( f_{l(k+s)} - f_{l(k+s+1)} \right) \le \frac{M+1}{a} \sum_{k \ge k_2} \left( f_{l(k)} - f_{l(k+1)} \right) < +\infty, $$
which contradicts $\sum_{k=1}^{\infty} \frac{1}{M_k} = +\infty$ (note that H2 implies $M_k \le M_1 + 1$, so this series indeed diverges). This completes the proof of Theorem 1. □

4. Local Convergence

In this section, we will demonstrate the superlinear convergence of Algorithm 1 under appropriate conditions.
Theorem 2.
(Superlinear Convergence). Suppose that H1–H3 hold and that the sequence $\{x_k\}$ generated by Algorithm 1 converges to $x^*$. Moreover, assume that $\nabla^2 f(x^*)$ is positive definite and that $\nabla^2 f(x)$ is Lipschitz continuous in a neighborhood of $x^*$. If $\|d_k\| \le \Delta_k$, where $d_k = -B_k^{-1} g_k$, and
$$ \lim_{k \to \infty} \frac{\left\| \left( B_k - \nabla^2 f(x^*) \right) d_k \right\|}{\|d_k\|} = 0, \qquad (52) $$
then the sequence $\{x_k\}$ converges to $x^*$ superlinearly; that is,
$$ \| x_{k+1} - x^* \| = o\left( \| x_k - x^* \| \right). \qquad (53) $$
Proof. 
By Lemmas 1 and 2, we have $\hat{\rho}_k \ge \mu_1$ for all sufficiently large $k$, so that Algorithm 1 eventually reduces to a standard quasi-Newton method with superlinear convergence [22]. Thus, the superlinear convergence of the algorithm can be proven in the same way as Theorem 5.5.1 in [22]; we omit the details for brevity. □

5. Preliminary Numerical Experiments

In this section, we report numerical experiments with Algorithm 1 and compare it with the algorithms of Mo et al. [1] and Hang and Liu [4]. A set of unconstrained test problems of variable dimension is selected from [23]. The experiments were carried out in MATLAB 9.4 on a PC with an Intel(R) Core(TM) processor (2.00 GHz) and 6 GB of RAM. The common parameters of the algorithms take exactly the same values: $\mu_1 = 0.25$, $\mu_2 = 0.75$, $\beta_1 = 0.25$, $\beta_2 = 1.5$, $M = 5$. In our experiments, an algorithm is stopped when $\|g_k\| \le 10^{-6} \|g_0\|$ or when the number of iterations exceeds 10,000. CPU denotes the running time, and $n_f$ and $n_i$ denote the total number of function evaluations and the total number of gradient evaluations, respectively. The matrix $B_k$ is updated by the MBFGS formula [24]:
$$ B_{k+1} = \begin{cases} B_k + \dfrac{z_k z_k^T}{z_k^T d_k} - \dfrac{B_k d_k d_k^T B_k}{d_k^T B_k d_k}, & y_k^T d_k > 0, \\[1ex] B_k, & y_k^T d_k \le 0, \end{cases} \qquad (54) $$
where $d_k = x_{k+1} - x_k$, $y_k = g_{k+1} - g_k$, $z_k = y_k + t_k \|g_k\| d_k$, and $t_k = 1 + \max\left\{ -\dfrac{y_k^T d_k}{\|g_k\| \, \|d_k\|^2}, \; 0 \right\}$.
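A direct transcription of this update in Python reads as follows (a sketch only; the scaling inside $t_k$ follows the reconstruction above and should be checked against [24]):

import numpy as np

def mbfgs_update(B, x, x_new, g, g_new):
    # MBFGS update of the Hessian approximation, Eq. (54).
    d = x_new - x                       # d_k = x_{k+1} - x_k
    y = g_new - g                       # y_k = g_{k+1} - g_k
    if y @ d <= 0:                      # skip the update when the curvature condition fails
        return B
    t = 1.0 + max(-(y @ d) / (np.linalg.norm(g) * (d @ d)), 0.0)
    z = y + t * np.linalg.norm(g) * d   # z_k = y_k + t_k * ||g_k|| * d_k
    Bd = B @ d
    return B + np.outer(z, z) / (z @ d) - np.outer(Bd, Bd) / (d @ Bd)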
To facilitate the comparison, we use the following abbreviations for the algorithms:
ANTRFS: the nonmonotone trust region method with fixed step length of [1];
FSNATR: the fixed step length nonmonotone adaptive trust region method of [4].
Table 1 reports results for variable-dimension problems, with dimensions selected in the range [4, 1000]. The results show that the new algorithm is generally better than ANTRFS and FSNATR in terms of the total number of gradient evaluations and function evaluations, and it solves all of the test functions in Table 1. The performance profiles of Dolan and Moré [25] are used to compare the efficiency of the three algorithms. Figure 1, Figure 2, and Figure 3 give the performance profiles for running time, the number of gradient evaluations, and the number of function evaluations, respectively. The figures show that Algorithm 1 performs well compared with the other algorithms, at least on the test problems considered, which are mostly of small dimension. It can also be observed that the profile of Algorithm 1 rises faster than those of the other algorithms, especially in contrast to ANTRFS. Therefore, we conclude that the new algorithm is more efficient and robust than the other trust region algorithms considered for solving small- and medium-scale unconstrained optimization problems.
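For reference, a performance profile in the sense of Dolan and Moré [25] can be computed as in the following sketch (illustrative Python code, not the script used to produce the figures); data[s, p] holds the cost (e.g., CPU time or $n_f$) of solver s on problem p, with infinity marking failures:

import numpy as np

def performance_profile(data, taus):
    # Fraction of problems on which each solver is within a factor tau of the best solver.
    data = np.asarray(data, dtype=float)
    best = data.min(axis=0)                 # best cost per problem
    ratios = data / best                    # performance ratios r_{s,p}
    return np.array([[np.mean(ratios[s] <= tau) for tau in taus]
                     for s in range(data.shape[0])])

# Example: three solvers on four problems (CPU seconds; np.inf = failure)
costs = [[2.7, 0.01, 0.09, np.inf],
         [0.3, 0.07, 0.06, 1.2],
         [0.1, 0.03, 0.04, 0.9]]
profile = performance_profile(costs, taus=np.linspace(1, 5, 9))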

6. Conclusions

In this paper, we proposed a new filter nonmonotone adaptive trust region method with the following innovations:
(1) a new adaptive radius strategy that reduces the computational cost of each iteration;
(2) a modified nonmonotone trust region ratio, combined with a multidimensional filter technique, to solve unconstrained optimization problems effectively. Theorems 1 and 2 show that the proposed algorithm preserves global convergence and superlinear convergence, respectively. The preliminary numerical experiments indicate that the new algorithm is effective for unconstrained optimization and that the nonmonotone technique is helpful for many optimization problems. In future work, we plan to combine a modified conjugate gradient method with a modified trust region method, and to extend the new algorithm to constrained optimization problems.

Author Contributions

Conceptualization, X.W. and Q.Q.; methodology, X.W.; software, X.W.; validation, X.W., Q.Q. and X.D.; formal analysis, X.D.; investigation, Q.Q.; resources, X.D.; data curation, Q.Q.; writing—original draft preparation, X.W.; writing—review and editing, X.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Acknowledgments

At the point of finishing this paper, I’d like to express my sincere thanks to all those who have lent me a hand over the course of my writing this paper. First of all, I would like to take this opportunity to show my sincere gratitude to my supervisor, Xianfeng Ding, who has given me so much useful advice on my writing and has tried his best to improve my paper. Secondly, I would like to express my gratitude to my classmates, who offered me references and information on time. Without their help, it would have been much harder for me to finish this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Mo, J.T.; Zhang, K.C.; Wei, Z.X. A nonmonotone trust region method for unconstrained optimization. Appl. Math. Comput. 2005, 171, 371–384. [Google Scholar] [CrossRef]
  2. Ou, Y.G.; Zhou, Q.; Lin, H.C. An ODE-based trust region method for unconstrained optimization problems. J. Comput. Appl. Math. 2009, 232, 318–326. [Google Scholar] [CrossRef] [Green Version]
  3. Wang, X.Y.; Tong, J. A Nonmonotone Adaptive Trust Region Algorithm with Fixed Stepsize for Unconstrained Optimization Problems. Math. Appl. 2009, 3, 496–500. [Google Scholar]
  4. Hang, D.; Liu, M. On a Fixed Stepsize Nonmonotonic Self-Adaptive Trust Region Algorithm. J. Southwest China Norm. Univ. 2013, 38. [Google Scholar] [CrossRef]
  5. Sartenaer, A. Automatic determination of an initial trust region in nonlinear programming. SIAM J. Sci. Comput. 1997, 18, 1788–1803. [Google Scholar] [CrossRef] [Green Version]
  6. Zhang, X.S.; Zhang, J.L.; Liao, L.Z. An adaptive trust region method and its convergence. Sci. China 2002, 45, 620–631. [Google Scholar] [CrossRef] [Green Version]
  7. Shi, Z.J.; Guo, J.H. A new trust region methods for unconstrained optimization. Comput. Appl. Math. 2008, 213, 509–520. [Google Scholar] [CrossRef] [Green Version]
  8. Kimiaei, M. A new class of nonmonotone adaptive trust-region methods for nonlinear equations with box constraints. Calcolo 2017, 54, 769–812. [Google Scholar] [CrossRef]
  9. Amini, K.; Shiker Mushtak, A.K.; Kimiaei, M. A line search trust-region algorithm with nonmonotone adaptive radius for a system of nonlinear equations. Q. J. Oper. Res. 2016, 4, 132–152. [Google Scholar] [CrossRef]
  10. Peyghami, M.R.; Tarzanagh, D.A. A relaxed nonmonotone adaptive trust region method for solving unconstrained optimization problems. Comput. Optim. Appl. 2015, 61, 321–341. [Google Scholar] [CrossRef]
  11. Deng, N.Y.; Xiao, Y.; Zhou, F.J. Nonmonotone Trust Region Algorithm. J. Optim. Theory Appl. 1993, 76, 259–285. [Google Scholar] [CrossRef]
  12. Ahookhoosh, M.; Amini, K.; Peyghami, M. A nonmonotone trust region line search method for large scale unconstrained optimization. Appl. Math. Model. 2012, 36, 478–487. [Google Scholar] [CrossRef]
  13. Zhang, H.C.; Hager, W.W. A nonmonotone line search technique and its application to unconstrained optimization. SIAM J. Optim. 2004, 14, 1043–1056. [Google Scholar] [CrossRef] [Green Version]
  14. Gu, N.Z.; Mo, J.T. Incorporating nonmonotone strategies into the trust region for unconstrained optimization. Comput. Math. Appl. 2008, 55, 2158–2172. [Google Scholar] [CrossRef] [Green Version]
  15. Fletcher, R.; Leyffer, S. Nonlinear programming without a penalty function. Math. Program. 2002, 91, 239–269. [Google Scholar] [CrossRef]
  16. Gould, N.I.; Sainvitu, C.; Toint, P.L. A filter-trust-region method for unconstrained optimization. SIAM J. Optim. 2005, 16, 341–357. [Google Scholar] [CrossRef] [Green Version]
  17. Gould, N.I.; Leyffer, S.; Toint, P.L. A multidimensional filter algorithm for nonlinear equations and nonlinear least-squares. SIAM J. Optim. 2004, 15, 17–38. [Google Scholar] [CrossRef] [Green Version]
  18. Wächter, A.; Biegler, L.T. Line search filter methods for nonlinear programming and global convergence. SIAM J. Optim. 2005, 16, 1–31. [Google Scholar] [CrossRef]
  19. Miao, W.H.; Sun, W. A filter trust-region method for unconstrained optimization. Numer. Math. J. Chin. Univ. 2007, 19, 88–96. [Google Scholar]
  20. Zhang, Y.; Sun, W.; Qi, L. A nonmonotone filter Barzilai-Borwein method for optimization. Asia Pac. J. Oper. Res. 2010, 27, 55–69. [Google Scholar] [CrossRef]
  21. Fatemi, M.; Mahdavi-Amiri, N. A filter trust-region algorithm for unconstrained optimization with strong global convergence properties. Comput. Optim. Appl. 2012, 52, 239–266. [Google Scholar] [CrossRef]
  22. Nocedal, J.; Wright, S.J. Numerical Optimization; Springer: New York, NY, USA, 2006. [Google Scholar]
  23. Andrei, N. An unconstrained optimization test functions collection. Environ. Sci. Technol. 2008, 10, 6552–6558. [Google Scholar]
  24. Pang, S.; Chen, L. A new family of nonmonotone trust region algorithm. Math. Pract. Theory. 2011, 10, 211–218. [Google Scholar]
  25. Dolan, E.D.; Moré, J.J. Benchmarking optimization software with performance profiles. Math. Program. 2002, 91, 201–213. [Google Scholar] [CrossRef]
Figure 1. CPU performance profile for the three algorithms.
Figure 2. Performance profile for the number of gradient evaluations ($n_i$).
Figure 3. Performance profile for the number of function evaluations ($n_f$).
Table 1. Numerical comparisons on a subset of test problems. Each row lists the problem name and its dimension n, followed by $n_f$/$n_i$ and the CPU time (in seconds) for ANTRFS [1], FSNATR [4], and Algorithm 1, in that order.
Ext.Rose4755/3822.755795168/880.32203687/580.104316
Ext. Beale425/130.00865141/210.06918518/160.028946
Penalty i218/100.08753218/100.06702017/140.032533
Pert.Quad628/250.05892125/130.05870018/170.035631
Raydan 1818/100.01510938/200.10592839/200.070292
Raydan 2421/110.01535613/80.01272911/60.017449
Diagonal 11013/80.00949335/180.07019927/260.064282
Diagonal 21056/290.01784158/300.11938557/290.083905
Diagonal 350200/1011.926143182/921.232287127/1261.849887
Hager1027/140.04903727/140.04824733/170.071906
Gen. Trid 120967/4843.53605550/260.43257747/240.217367
Ext.Trid 11027/140.01389029/150.12869618/120.071580
Ext. TET5013/70.20309316/90.03141617/90.119907
Diadonal 41007/40.0359339/50.3438495/40.146901
Ext.Him10029/150.14710225/130.20897629/280.409463
Gen. White50785/57610.47342771/4299.940880443/2285.741535
Ext. Powell161567/7877.266044794/4042.148929496/3371.208253
Full Hessian FH310011/60.05359811/60.0847268/70.088831
Ext.BD110051/270.21079050/280.73962121/150.261978
Pert. Quad20091/662.54768987/442.42159657/562.405979
Extended Hiebert161821/10009.819290175/1432.456780135/680.527388
Quadratic QF1415/80.00790317/90.01702511/100.010983
FLETCHCR3436210/1231.847519150/910.950314165/831.786160
ARWHEAD200297/15037.92805029/150.31797615/120.317976
NONDIA5075/390.36828092/470.54407951/350.307129
DQDRTIC5067/380.5124353/280.34143532/300.318596
EG220032/170.31995428/160.37376449/352.633184
Bro.Tridiagonal2002797/1504441.453385744/398119.57083869/351.539657
A.Per.Quad1673/470.14489063/320.13264445/260.128349
Pert.Trid.Quad100330/16610.985321325/1639.663929289/1568.521700
Ext.DENSCH10037/190.19054943/220.398777128/825.638770
SINCOS1004303/2152198.7175441303/952142.54318565/361.122092
BIGGSB1101949/10428.466655329/1950.676394275/1850.376394
ENGVAL1200788/487139.949938643/40699.088596474/47288.401960
EDENSCH100474/23825.63966445/260.40757437/230.930150
CUBE100430/22021.53234357/19820.93564280/14713.946540
BDEXP100476/36934.54797452/35624.56919622/210.550708
GENHUMPS100532/3213.27453412/2130.4754531014/5371.235720
QUARTC10057/321.03573443/220.44332518/170.326680
Gen. PSC1500198/21210.45762451/549.56235451/548.539801
Ext. PSC150015/151.25432715/151.098345213/131.562763
Variably dim.50041/272.957824321/161.453698217/151.093456
DIXMAANA100021/211.45789321/211.23764220/201.025372
SINQUAD10001582/1063187.5637231995/1215135.872354912/579100.458723
DIXMAANJ10002415/2398431.2534852320/2311410.2534852246/2132397.256732
