A Filter and Nonmonotone Adaptive Trust Region Line Search Method for Unconstrained Optimization

Abstract: In this paper, a new nonmonotone adaptive trust region algorithm is proposed for unconstrained optimization by combining a multidimensional filter and the Goldstein-type line search technique. A modified trust region ratio is presented which yields more reasonable consistency between the accurate model and the approximate model. When a trial step is rejected, we use a multidimensional filter to increase the likelihood that the trial step is accepted. If the trial step is still unsuccessful with the filter, a nonmonotone Goldstein-type line search is used in the direction of the rejected trial step. The approximation of the Hessian matrix is updated by the modified quasi-Newton formula (CBFGS). Under appropriate conditions, the proposed algorithm is globally convergent and superlinearly convergent. The new algorithm shows better performance in terms of the Dolan–Moré performance profile. Numerical results demonstrate the efficiency and robustness of the proposed algorithm for solving unconstrained optimization problems.


Introduction
Consider the following unconstrained optimization problem:

$$\min_{x \in \mathbb{R}^n} f(x),$$

where $f : \mathbb{R}^n \to \mathbb{R}$ is a twice continuously differentiable function. This problem arises widely in applications such as medical science, optimal control, and functional approximation. As is well known, there are many methods for solving unconstrained optimization problems, such as the conjugate gradient method [1][2][3], the Newton method [4,5], and the trust region method [6][7][8]. Constrained optimization problems can also be solved by processing the constraint conditions and transforming the problems into unconstrained ones. Motivated by this, it is worthwhile to propose a new modified trust region method for solving unconstrained optimization problems.

The trust region method and the line search method are two frequently used iterative methods. Line search methods calculate a step length $\alpha_k$ along a specific direction $d_k$ and derive a new point as $x_{k+1} = x_k + \alpha_k d_k$. The primary idea of the trust region method is as follows: at the current iteration point $x_k$, the trial step $d_k$ is obtained by solving the subproblem

$$\min_{d \in \mathbb{R}^n} m_k(d) = g_k^T d + \frac{1}{2} d^T B_k d, \quad \text{s.t. } \|d\| \le \Delta_k,$$

where $g_k = \nabla f(x_k)$, $B_k$ is a symmetric approximation of the Hessian matrix, and $\Delta_k > 0$ is the trust region radius. A classic nonmonotone line search accepts a step length $\alpha_k$ satisfying

$$f(x_k + \alpha_k d_k) \le f_{l(k)} + \sigma \alpha_k g_k^T d_k,$$

where $\sigma \in (0, 1)$, $f_{l(k)} = \max_{0 \le j \le m(k)} f_{k-j}$, $m(0) = 0$, $0 \le m(k) \le \min\{m(k-1) + 1, N\}$, and $N \ge 0$ is an integer constant. However, the common nonmonotone term $f_{l(k)}$ suffers from various drawbacks. For example, valid values of the function $f$ produced in some iterations are essentially discarded, and the numerical results depend strongly on the choice of $N$. To overcome these drawbacks, Cui et al. [15] proposed another nonmonotone line search of the form

$$f(x_k + \alpha_k d_k) \le C_k + \sigma \alpha_k g_k^T d_k,$$

where the nonmonotone term $C_k$ is a weighted combination of previous function values governed by $\eta_k$ (see [15] for its precise definition), $\sigma \in (0, 1)$, $\eta_k \in [\eta_{\min}, \eta_{\max}]$, $\eta_{\min} \in [0, 1]$, and $\eta_{\max} \in [\eta_{\min}, 1]$. Based on this idea, in order to include the minimum value of $\alpha_k$ in an acceptable interval and keep the consistency of the nonmonotone term, we propose a trust region method with the Goldstein-type line search technique, in which the step length $\alpha_k$ satisfies the following inequalities:

$$f(x_k + \alpha_k d_k) \le R_k + c_1 \alpha_k g_k^T d_k, \qquad (8)$$

$$f(x_k + \alpha_k d_k) \ge R_k + c_2 \alpha_k g_k^T d_k, \qquad (9)$$

where $R_k = \eta_k f_{l(k)} + (1 - \eta_k) f_k$ is the nonmonotone term, $c_1 \in (0, \frac{1}{2})$, $c_2 \in (c_1, 1)$, $\eta_k \in [\eta_{\min}, \eta_{\max}]$, $\eta_{\min} \in [0, 1]$, and $\eta_{\max} \in [\eta_{\min}, 1]$.
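To make the Goldstein-type search concrete, the following MATLAB sketch brackets a step length satisfying the two inequalities above, assuming the reconstructed forms of Equations (8) and (9); the function name, the iteration cap, and the bracketing scheme are illustrative choices rather than the authors' implementation.

% Nonmonotone Goldstein-type line search (sketch; see assumptions above).
function alpha = goldstein_nonmonotone(fun, xk, dk, gk, Rk, c1, c2)
% fun: handle returning f(x); xk: current point; dk: search direction
% gk: gradient at xk; Rk: nonmonotone reference value R_k
% c1, c2: Goldstein constants with 0 < c1 < 1/2 and c1 < c2 < 1
    slope = gk' * dk;              % directional derivative, assumed negative
    lo = 0; hi = inf; alpha = 1;   % current bracket and trial step
    for iter = 1:50                % cap the search; return last trial if hit
        fnew = fun(xk + alpha * dk);
        if fnew > Rk + c1 * alpha * slope       % Equation (8) fails: too long
            hi = alpha; alpha = 0.5 * (lo + hi);
        elseif fnew < Rk + c2 * alpha * slope   % Equation (9) fails: too short
            lo = alpha;
            if isinf(hi), alpha = 2 * alpha; else, alpha = 0.5 * (lo + hi); end
        else
            return;                             % both inequalities hold
        end
    end
end

Because $c_1 < c_2$, the two tests delimit a nonempty interval of acceptable steps under standard assumptions, so the bracketing terminates quickly in practice.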
To evaluate the consistency between the quadratic model and the objective function, Ahookhosh et al. [16] defined the ratio

$$\hat{r}_k = \frac{R_k - f(x_k + d_k)}{m_k(0) - m_k(d_k)}.$$

It is well known that an adaptive radius plays a valuable role in performance. In 1997, an adaptive strategy for automatically determining the initial trust region radius was proposed by Sartenaer [17]. However, that strategy does not explicitly use gradient or Hessian information to update the radius. Motivated by the first-order and second-order information of the objective function, Zhang et al. [18] proposed a new scheme for determining the trust region radius in 2002:

$$\Delta_k = c^p \|g_k\| \|\hat{B}_k^{-1}\|,$$

where $\hat{B}_k$ is a positive definite modification of $B_k$. In order to avoid computing the inverse of the matrix and the Euclidean norm of $\hat{B}_k^{-1}$ at each iteration point $x_k$, Zhou et al. [19] proposed the adaptive trust region radius

$$\Delta_k = c^p \frac{\|d_{k-1}\|}{\|y_{k-1}\|} \|g_k\|,$$

where $y_{k-1} = g_k - g_{k-1}$, and $c$ and $p$ are parameters. Prompted by the adaptive technique, Wang et al. [8] proposed a new adaptive trust region radius,

$$\Delta_k = c_k \|g_k\|^{\gamma},$$

which reduces the related workload and calculation time. Based on this fact, other authors have also proposed modified adaptive trust region methods [20][21][22]. In order to overcome the difficulty of selecting penalty factors when using penalty functions, Fletcher et al. first recommended filter techniques for constrained nonlinear optimization (see [23] for details). More recently, Gould et al. [24] explored a new nonmonotone trust region method with multidimensional filter techniques for solving unconstrained optimization problems. This idea incorporates the concept of nonmonotonicity to build a filter that can reject poor iteration points and enforce convergence from arbitrary starting points. At the same time, the multidimensional filter technique relaxes the monotonicity requirements of the classic trust region framework. This idea has been further developed by several authors [25][26][27].
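The following small MATLAB fragment illustrates how the last two radii above are evaluated without forming any matrix inverse; all numerical values are synthetic placeholders.

% Adaptive radii of Zhou et al. [19] and Wang et al. [8] (synthetic data).
gk      = [1; -2];  gk_prev = [1.5; -1];  % current and previous gradients
dk_prev = [0.1; 0.2];                     % previous trial step
c = 0.5; p = 1; ck = 0.8; gam = 0.9;      % placeholder parameter values
yk1 = gk - gk_prev;                                        % y_{k-1}
delta_zhou = c^p * (norm(dk_prev) / norm(yk1)) * norm(gk); % Zhou et al. [19]
delta_wang = ck * norm(gk)^gam;                            % Wang et al. [8]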
In the following, we denote $\nabla f(x_k)$ by $g_k = (g_k^1, g_k^2, \ldots, g_k^n)$; when the $i$-th component of $g_k = g(x_k)$ is needed, it is denoted by $g_k^i$, where $i \in \{1, 2, \ldots, n\}$. We say that an iteration point $x_1$ dominates $x_2$ whenever

$$|g^j(x_1)| \le |g^j(x_2)| - \gamma_g \|g(x_2)\| \quad \text{for all } j \in \{1, 2, \ldots, n\},$$

where $\gamma_g \in (0, \frac{1}{\sqrt{n}})$ is a small positive constant. Based on [8], a multidimensional filter $\mathcal{F}$ is a list of $n$-tuples of the form $(g_k^1, g_k^2, \ldots, g_k^n)$ in which no entry dominates another; that is, if $g_k$ and $g_l$ both belong to $\mathcal{F}$, then $|g_k^j| \le |g_l^j|$ holds for at least one $j \in \{1, 2, \ldots, n\}$. For all $g_l \in \mathcal{F}$, a new trial point $x_k$ is acceptable if there exists $j \in \{1, 2, \ldots, n\}$ such that a margin-type acceptance inequality holds (the precise inequality follows [8]), where $\gamma_1$ and $\gamma_2$ are positive constants, and $\lambda_1$ and $\lambda_2$ satisfy $0 \le \lambda_1 < \lambda_2 < \frac{1}{\sqrt{n}}$. When an iteration point $x_k$ is accepted by the filter, we add $g(x_k)$ to the filter, and any $g(x_l) \in \mathcal{F}$ that is dominated by $g(x_k)$ is removed from the filter.
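A minimal MATLAB sketch of the filter test follows. Since the paper's exact acceptance inequality (with $\gamma_1$, $\gamma_2$, $\lambda_1$, $\lambda_2$) could not be recovered here, the sketch uses the simpler margin rule of Gould et al. [24], $|g^j(x)| \le |g^j(x_l)| - \gamma_g \|g(x_l)\|$; the function name is illustrative.

% Multidimensional filter acceptability test (sketch; margin rule assumed).
function ok = filter_acceptable(gnew, F, gamma_g)
% gnew:    gradient at the trial point (n-by-1 vector)
% F:       n-by-m matrix whose columns are the gradients stored in the filter
% gamma_g: margin constant in (0, 1/sqrt(n))
    ok = true;
    for l = 1:size(F, 2)
        gl = F(:, l);
        % acceptable w.r.t. entry l if SOME component improves by the margin
        if ~any(abs(gnew) <= abs(gl) - gamma_g * norm(gl))
            ok = false; return;  % no component improves: rejected by entry l
        end
    end
end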
The rest of this article is organized as follows. In Section 2, we describe a new nonmonotone adaptive trust region algorithm. We establish the global convergence and superlinear convergence of the algorithm in Section 3. In Section 4, numerical results are given, which show that the new method is effective. Finally, some concluding comments are provided in Section 5.

The new algorithm
In this section, a new filter and nonmonotone adaptive trust region method with a Goldstein-type line search is proposed. The trust region ratio is used to determine whether the trial step $d_k$ is accepted. Following the trust region ratio of Ahookhosh et al. in [16], we define a modified form as follows:

$$\hat{\rho}_k = \frac{R_k - f(x_k + d_k)}{R_k - f_k + m_k(0) - m_k(d_k)}. \qquad (16)$$

We can see that the effect of nonmonotonicity is controlled in the numerator and the denominator, respectively. Thus, the new trust region ratio may find the global optimal solution more effectively. Compared with the general filter trust region algorithm in [24], we propose a new criterion: when the trial point $x_k^+$ satisfies $0 < \hat{\rho}_k < \mu_1$, we verify whether it is accepted by the filter $\mathcal{F}$. At the same time, a new adaptive trust region radius is presented as follows:

$$\Delta_k = c^p \|g_k\|^{\gamma}, \qquad (17)$$

where $0 < \gamma < 1$, $0 < c < 1$, and $p$ is a nonnegative integer. Compared with the adaptive trust region method in [8], the new method has the following effective properties: the parameter $p$ plays a vital role in adjusting the radius, and the formula also reduces the workload and computational time. However, the new trust region radius uses only gradient information, not function-value information. In each iteration, the trial step $d_k$ is obtained by solving the subproblem

$$\min_{d \in \mathbb{R}^n} m_k(d) = g_k^T d + \frac{1}{2} d^T B_k d, \qquad (18)$$

$$\text{s.t. } \|d\| \le \Delta_k, \qquad (19)$$

with $\Delta_k$ given by Equation (17).
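The two quantities of this section can be computed as follows in MATLAB, assuming the reconstructed forms of Equations (16) and (17) and the convex-combination nonmonotone term $R_k = \eta_k f_{l(k)} + (1 - \eta_k) f_k$; the function name is illustrative.

% Modified ratio (16) and adaptive radius (17) (sketch; see assumptions above).
function [rho_hat, delta] = ratio_and_radius(fk, flk, fnew, gk, Bk, dk, eta, c, p, gam)
% fk: f(x_k); flk: f_{l(k)}; fnew: f(x_k + d_k); eta: eta_k in [0, 1]
    Rk   = eta * flk + (1 - eta) * fk;          % nonmonotone reference value
    pred = -(gk' * dk + 0.5 * dk' * Bk * dk);   % m_k(0) - m_k(d_k)
    rho_hat = (Rk - fnew) / (Rk - fk + pred);   % assumed form of Equation (16)
    delta   = c^p * norm(gk)^gam;               % Equation (17)
end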
Algorithm 1. A filter and nonmonotone adaptive trust region line search method.

Step 0. Given $x_0 \in \mathbb{R}^n$, a symmetric positive definite matrix $B_0$, an empty filter $\mathcal{F}$, and constants $\mu_1 \in (0, 1)$, $c \in (0, 1)$, $\gamma \in (0, 1)$, $c_1 \in (0, \frac{1}{2})$, and $c_2 \in (c_1, 1)$, set $k = 0$ and $p = 0$.

Step 1. If the stopping criterion is satisfied, stop. Otherwise, compute the trust region radius $\Delta_k$ by Equation (17).

Step 2. Solve the subproblem of Equations (18) and (19) to find the trial step $d_k$, and set $x_k^+ = x_k + d_k$.

Step 3. Compute the trust region ratio $\hat{\rho}_k$ by Equation (16).
Step 4. Test the trial step. If $\hat{\rho}_k \ge \mu_1$, set $x_{k+1} = x_k + d_k$ and go to Step 5. If $0 < \hat{\rho}_k < \mu_1$ and $x_k^+$ is acceptable to the filter $\mathcal{F}$, set $x_{k+1} = x_k^+$, add $g(x_k^+)$ to the filter $\mathcal{F}$, and go to Step 5. Otherwise, find the step length $\alpha_k$ satisfying Equations (8) and (9), and set $x_{k+1} = x_k + \alpha_k d_k$. Then, set $p = p + 1$, and go to Step 5.
Step 5. Update the symmetric matrix $B_k$ by the modified quasi-Newton (CBFGS) formula. Set $k = k + 1$, $p = 0$, and go to Step 1.
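Step 5 can be sketched in MATLAB as follows, assuming the cautious BFGS (CBFGS) rule of Li and Fukushima [32] in the form given in Section 4; the function name and the threshold argument eps_c are illustrative.

% CBFGS update for Step 5 (sketch; cautious condition as in Section 4).
function B = cbfgs_update(B, dk, yk, gk, eps_c)
% dk = x_{k+1} - x_k; yk = g_{k+1} - g_k; eps_c > 0 (e.g., 0.5)
    if yk' * dk >= eps_c * norm(gk) * (dk' * dk)   % cautious curvature test
        Bd = B * dk;
        B  = B - (Bd * Bd') / (dk' * Bd) + (yk * yk') / (yk' * dk);
    end                         % otherwise B is kept unchanged
end

When the cautious test fails, the update is simply skipped, so $B_k$ stays symmetric positive definite without any extra safeguarding.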
In particular, we consider the following assumptions to analyze the convergence properties of Algorithm 1.
Assumption 1.
The objective function $f$ is continuously differentiable and bounded below on the level set $L(x_0) = \{x \in \mathbb{R}^n \mid f(x) \le f(x_0)\}$.

Assumption 2.
The matrix $B_k$ is uniformly bounded; i.e., there exists a constant $M_1 > 0$ such that $\|B_k\| \le M_1$ for all $k$.

Remark 1.
There is a constant $\tau \in (0, 1)$ such that $B_k$ is a positive definite symmetric matrix and $d_k$ satisfies the following inequalities:

$$m_k(0) - m_k(d_k) \ge \tau \|g_k\| \min\Bigl\{\Delta_k, \frac{\|g_k\|}{\|B_k\|}\Bigr\}, \qquad \|d_k\| \le \Delta_k.$$
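For intuition, the decrease demanded in Remark 1 is exactly what the Cauchy step guarantees; a standard sketch, under the reconstructed form of the inequality above, is:

$$d_k^C = -t^* g_k, \qquad t^* = \mathop{\arg\min}_{0 \le t \le \Delta_k / \|g_k\|} m_k(-t g_k),$$

$$m_k(0) - m_k(d_k^C) \ge \frac{1}{2} \|g_k\| \min\Bigl\{\Delta_k, \frac{\|g_k\|}{\|B_k\|}\Bigr\},$$

so any trial step whose model reduction is at least that of the Cauchy step satisfies the inequality of Remark 1; in particular, $\tau = \frac{1}{2}$ works for the Cauchy step itself.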

Convergence Analysis
In order to derive the convergence results conveniently, we define the following index sets:

$$D = \{k \mid \hat{\rho}_k \ge \mu_1\}, \qquad A = \{k \mid 0 < \hat{\rho}_k < \mu_1 \text{ and } x_k^+ \text{ is accepted by the filter } \mathcal{F}\},$$

and $S = \{k \mid x_{k+1} = x_k + d_k\}$. Then, $S = \{k \mid \hat{\rho}_k \ge \mu_1 \text{ or } x_k^+ \text{ is accepted by the filter } \mathcal{F}\} = D \cup A$. For $k \notin S$, we obtain $x_{k+1} = x_k + \alpha_k d_k$.

Lemma 1. Suppose that Assumptions 1 and 2 hold, and $d_k$ is the solution of Equation (18); then,

$$R_k - f_k + m_k(0) - m_k(d_k) \ge \tau \|g_k\| \min\Bigl\{\Delta_k, \frac{\|g_k\|}{\|B_k\|}\Bigr\}. \qquad (23)$$
Proof.
Taking into account Equation (24) and Remark 1, we can conclude that Equation (23) holds.

Lemma 2.
For all $k$, we have

$$\bigl|f(x_k + d_k) - f(x_k) - m_k(d_k)\bigr| = O(\|d_k\|^2).$$

Proof. The proof can be obtained by Taylor's expansion together with the smoothness of $f$ and Assumption 2.
Lemma 3.
Suppose that Assumptions 1 and 2 hold, and the sequence $\{x_k\}$ is generated by Algorithm 1. Then $x_k \in L(x_0)$ for all $k$.

Proof. We proceed by induction. When $k = 0$, we obviously have $x_0 \in L(x_0)$. Assuming $x_k \in L(x_0)$, we prove $x_{k+1} \in L(x_0)$ by considering the following two cases.

Case 1: When $k \in D$, according to Equation (16) we have

$$R_k - f_{k+1} \ge \mu_1 \bigl(R_k - f_k + m_k(0) - m_k(d_k)\bigr).$$

Thus, according to Equations (23) and (27), we can obtain $R_k \ge f_{k+1}$. Using the definition of $R_k$ and $f_{l(k)}$, we get $R_k \le f_{l(k)}$. The above two inequalities show that $f_{k+1} \le f_{l(k)} \le f(x_0)$, and hence $x_{k+1} \in L(x_0)$.

Case 2: When $k \in A$ or $k \notin S$, the same conclusion follows from the filter acceptance criterion and from the line search conditions of Equations (8) and (9), respectively.

Lemma 4.
Suppose that Assumptions 1 and 2 hold, and the sequence $\{x_k\}$ is generated by Algorithm 1. Then, the sequence $\{f_{l(k)}\}$ is monotonically nonincreasing and convergent.
Proof. The proof is similar to the proof of Lemma 5 in [8] and is here omitted.

Lemma 5.
Suppose that Assumptions 1 and 2 hold, and the sequence $\{x_k\}$ is generated by Algorithm 1. Moreover, assume that there exists a constant $0 < \varepsilon < 1$ such that $\|g_k\| > \varepsilon$ for all $k$. Then, Algorithm 1 is well defined; that is, at each iteration the algorithm terminates in a finite number of steps.
Proof. By contradiction, suppose that Algorithm 1 cycles infinitely at iteration $k$. Then, for every nonnegative integer $p$, we have

$$\hat{\rho}_k^p < \mu_1, \qquad (30)$$

and the corresponding trial point is not accepted by the filter. Following Equation (17), we have $c^p \to 0$ as $p \to \infty$. Thus, we get

$$\|d_k^p\| \le \Delta_k^p = c^p \|g_k\|^{\gamma} \to 0 \quad (p \to \infty),$$

where $d_k^p$ is a solution of the subproblem of Equation (18) corresponding to $p$ in the $k$-th iteration. Combining Lemma 1, Lemma 2, and Equation (28), we obtain $\hat{\rho}_k^p \to 1$ as $p \to \infty$, which implies that there exists a sufficiently large $p$ such that $\hat{\rho}_k^p \ge \mu_1$. This contradicts Equation (30) and shows that Algorithm 1 is well defined.

Lemma 6.
Suppose that Assumptions 1 and 2 hold, and there exists a constant $\varepsilon$ such that $\|g_k\| \ge \varepsilon$ for all $k$. Then, there is a constant $\upsilon > 0$ such that $\Delta_k > \upsilon$ for $k = 0, 1, 2, \ldots$.

Proof. The proof is similar to that of Theorem 6.4.3 in [28], and is therefore omitted here.
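For clarity, the key estimate behind the proof of Lemma 5 can be written out explicitly; the following sketch uses the reconstructed bounds of Lemma 1 (denominator) and Lemma 2 (numerator) and the classical ratio $\rho_k^p$:

$$|\rho_k^p - 1| = \frac{\bigl|f(x_k + d_k^p) - f(x_k) - m_k(d_k^p)\bigr|}{m_k(0) - m_k(d_k^p)} \le \frac{O(\|d_k^p\|^2)}{\tau \varepsilon \min\{\Delta_k^p, \varepsilon / M_1\}} \to 0 \quad (p \to \infty),$$

since $\|d_k^p\| \le \Delta_k^p \to 0$ makes the numerator shrink quadratically while the denominator shrinks only linearly in $\Delta_k^p$. Moreover, $R_k \ge f_k$ gives $\hat{\rho}_k^p \ge \min\{\rho_k^p, 1\}$, so $\hat{\rho}_k^p \ge \mu_1$ for all sufficiently large $p$.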
In what follows, we establish the global convergence of Algorithm 1 based on the preceding lemmas.

Theorem 1.
(Global Convergence) Suppose that Assumptions 1 and 2 hold, and the sequence $\{x_k\}$ is generated by Algorithm 1. Then,

$$\liminf_{k \to \infty} \|g_k\| = 0. \qquad (34)$$

Proof. Suppose, on the contrary, that Equation (34) does not hold. Then there exists a constant $\varepsilon$ such that $\|g_k\| > \varepsilon$ for all sufficiently large $k$. Write the index set $S$ as $\{k_i\}$. Following Assumption 1, the sequence $\{g_k\}$ is bounded. First, consider the case $|A| = +\infty$. Then there is a convergent subsequence $\{g_{k_t}\}$ with $\|g_{k_t}\| > \varepsilon$ for every $t$, where each iteration point $x_{k_t}$ is accepted by the filter $\mathcal{F}_{k_t}$; hence, for every $t > 1$, there exists $j \in \{1, 2, \ldots, n\}$ such that

$$|g_{k_t}^j| \le |g_{k_{t-1}}^j| - \gamma_g \|g_{k_{t-1}}\|. \qquad (36)$$

Since the subsequence converges, as $t$ becomes sufficiently large we have

$$|g_{k_t}^j| - |g_{k_{t-1}}^j| \to 0. \qquad (37)$$

However, Equation (36) yields $|g_{k_t}^j| - |g_{k_{t-1}}^j| \le -\gamma_g \|g_{k_{t-1}}\| \le -\gamma_g \varepsilon < 0$, which means that Equation (37) does not hold. The proof of this case is completed.
We proceed with the remaining case, $|A| < +\infty$, by contradiction. Suppose that there exists a constant $\varepsilon > 0$ such that $\|g_k\| \ge \varepsilon$ for sufficiently large $k$. Based on $|A| < +\infty$, for all sufficiently large $k \in S$ we have $\hat{\rho}_k \ge \mu_1$. Thus, for a sufficiently large index $p$, set $\xi_k = |\{p, p + 1, \ldots, k\} \cap S|$.
Based on Assumption 2, Equation (28), Lemma 1, and Lemma 6, we can write

$$f_{l(p)} - f_{l(k+1)} \ge \xi_k \mu_1 \tau \varepsilon \min\Bigl\{\upsilon, \frac{\varepsilon}{M_1}\Bigr\}. \qquad (39)$$

As $p$ and $k$ become sufficiently large, according to $|S| = +\infty$ and $|A| < +\infty$, we know that $\xi_k$ becomes arbitrarily large. Thus, $\xi_k \mu_1 \tau \varepsilon \min\{\upsilon, \varepsilon / M_1\} \to +\infty$, and we can deduce from Equation (39) that

$$f_{l(k+1)} \to -\infty. \qquad (40)$$

Using Lemma 4, however, the sequence $\{f_{l(k)}\}$ is convergent and hence bounded below, which contradicts Equation (40). This completes the proof of Theorem 1. Now, based on appropriate conditions, the following superlinear convergence result is presented for Algorithm 1.

Theorem 2.
(Superlinear Convergence) Suppose that Assumptions 1 and 2 hold, and the sequence $\{x_k\}$ generated by Algorithm 1 converges to $x^*$. Moreover, assume that the Hessian matrix $\nabla^2 f(x^*)$ is positive definite, that $\nabla^2 f(x)$ is continuous in a neighborhood of $x^*$, and that

$$\lim_{k \to \infty} \frac{\|(B_k - \nabla^2 f(x^*)) d_k\|}{\|d_k\|} = 0;$$

then the sequence $\{x_k\}$ converges to $x^*$ superlinearly.
Proof. The proof follows the same lines as that of Theorem 4.1 in [29], and is omitted here.

Preliminary Numerical Experiments
In this section, we present numerical results to illustrate the performance of Algorithm 1 in comparison with the standard nonmonotone trust region algorithm of Pang et al. in [30] (ASNTR), the nonmonotone adaptive trust region algorithm of Ahookhosh et al. in [16] (ANATR), and the multidimensional filter trust region algorithm of Wang et al. in [8] (AFTR). All codes were implemented in double precision in MATLAB 9.4 and are given in Appendix A. A set of unconstrained optimization test problems, including some medium-scale and large-scale problems, was selected from Andrei [31]. The stopping criteria are that the number of iterations exceeds 10,000 or $\|g_k\| \le 10^{-6}(1 + |f(x_k)|)$. Here $n_f$, $n_i$, and CPU denote the total number of function evaluations, the total number of gradient evaluations, and the running time in seconds, respectively. Following Step 0, we use the values $\mu_1 = 0.25$, $\beta_1 = 0.25$, $\beta_2 = 1.5$, $\eta_0 = 0.25$, $N = 5$, $\varepsilon = 0.5$, $c_1 = 0.25$, $c_2 = 0.75$, and $B_0 = I \in \mathbb{R}^{n \times n}$. In addition, $\eta_k$ is updated by the recursive formula

$$\eta_k = \begin{cases} \eta_0 / 2, & k = 1, \\ (\eta_{k-1} + \eta_{k-2}) / 2, & k \ge 2. \end{cases}$$

The matrix $B_k$ is updated using the CBFGS formula [32]:

$$B_{k+1} = \begin{cases} B_k - \dfrac{B_k d_k d_k^T B_k}{d_k^T B_k d_k} + \dfrac{y_k y_k^T}{y_k^T d_k}, & \text{if } \dfrac{y_k^T d_k}{\|d_k\|^2} \ge \varepsilon \|g_k\|, \\ B_k, & \text{otherwise}, \end{cases}$$

where $d_k = x_{k+1} - x_k$ and $y_k = g_{k+1} - g_k$. From Table 1, it can easily be seen that Algorithm 1 outperforms the ASNTR, ANATR, and AFTR algorithms with respect to $n_f$, $n_i$, and CPU, especially for some problems. The Dolan–Moré [33] performance profile was used to compare efficiency in terms of the number of function evaluations, the number of gradient evaluations, and the running time. For a chosen performance index, the profile reports, for every $\tau \ge 1$, the proportion $\rho(\tau)$ of test problems on which an algorithm's performance is within a factor $\tau$ of the best performance achieved by any of the compared algorithms.
It can easily be seen from Figures 1-3 that the new algorithm performs better than the other algorithms in terms of the number of function evaluations, the number of gradient evaluations, and the running time, especially in contrast to ASNTR. As a general result, we can infer that the new algorithm is more efficient and robust than the other algorithms mentioned, in terms of the total number of iterations and the running time.
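For reference, the profile values $\rho(\tau)$ can be computed with the following MATLAB sketch; the function name perf_profile and the convention that failures are marked with Inf are illustrative assumptions.

% Dolan-More performance profile [33] (sketch; Inf marks a failed run).
function [tau, rho] = perf_profile(T)
% T:   np-by-ns matrix; T(i, s) is the measure (e.g., CPU time) of solver s
%      on problem i
% tau: breakpoints; rho: rho(t, s) = fraction of problems with ratio <= tau(t)
    [np, ns] = size(T);
    r = T ./ min(T, [], 2);               % performance ratios r_{p,s}
    tau = unique(r(isfinite(r)));         % sorted breakpoints
    rho = zeros(numel(tau), ns);
    for s = 1:ns
        for t = 1:numel(tau)
            rho(t, s) = sum(r(:, s) <= tau(t)) / np;
        end
    end
end

Plotting stairs(tau, rho) then reproduces profiles of the kind shown in Figures 1-3.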


Conclusions
In this paper, we combine the nonmonotone adaptive line search strategy with multidimensional filter techniques, and propose a nonmonotone trust region method with a new adaptive radius. Our method possesses the following attractive properties: (1) The new algorithm is quite different from the standard trust region method; in order to avoid resolving the subproblem, a new nonmonotone Goldstein-type line search is performed in the direction of the rejected trial step.
(2) A new adaptive trust region radius is presented, which decreases the workload and computational time. However, the new trust region radius does not yet make full use of function-value information. In addition, a modified trust region ratio is computed, which provides more information for evaluating the consistency between the quadratic model and the objective function.
(3) The approximation of the Hessian matrix is updated by the modified BFGS (CBFGS) method. Convergence analysis has shown that the proposed algorithm preserves global convergence as well as superlinear convergence. Numerical experiments were performed on a set of unconstrained optimization test problems from [31]. The numerical results showed that the proposed method is more competitive than the ASNTR, ANATR, and AFTR algorithms for medium-scale and large-scale problems with respect to the Dolan–Moré performance profile [33]. Thus, we can conclude that the new algorithm works quite well for solving unconstrained optimization problems. In the future, it will be interesting to apply the new nonmonotone trust region method to constrained optimization problems and to nonlinear equations with constraints. It will also be interesting to combine an improved conjugate gradient algorithm with an improved nonmonotone trust region method to solve a wider range of optimization problems.