On q-Quasi-Newton’s Method for Unconstrained Multiobjective Optimization Problems

A parameter-free optimization technique is applied to the quasi-Newton method for solving unconstrained multiobjective optimization problems. The components of the Hessian matrix are constructed using the q-derivative and are kept positive definite at every iteration. The step length is computed by an Armijo-like rule which, owing to the q-derivative, helps the iterates escape a local minimum in favor of a global minimum. Furthermore, the rate of convergence is proved to be superlinear in a local neighborhood of a minimum point based on the q-derivative. Finally, numerical experiments show the better performance of the proposed method.


Introduction
Multiobjective optimization is the problem of optimizing two or more real-valued objective functions at the same time. In general, there is no ideal minimizer that minimizes all objective functions at once, so the optimality concept is replaced by the idea of Pareto optimality/efficiency. A point is called Pareto optimal, or efficient, if there exists no alternative point whose objective function values are all equal or smaller, with a strict decrease in at least one objective function value. In many applications, such as engineering [1,2], economic theory [3], management science [4], machine learning [5,6], and space exploration [7], multiobjective optimization techniques are used to make the desired decision. One of the basic approaches is the weighting method [8], where a single-objective optimization problem is created by weighting the several objective functions. Another approach is the ε-constraint method [9], where we minimize only the chosen objective function and keep the other objectives as constraints. Some multiobjective algorithms use a lexicographic method, where all objective functions are optimized in their order of priority [10,11]. First, the most preferred function is optimized; then that objective function is transformed into a constraint and the second-priority objective function is optimized. This process is repeated until the last objective function is optimized. The user needs to choose the sequence of objectives, and two lexicographic optimizations with distinct sequences of objective functions need not produce the same solution. The disadvantages of these approaches are the choice of weights, constraints, and importance of the functions, respectively, which are not known in advance and have to be specified from the beginning. Some other techniques [12][13][14] that do not need any prior information have been developed for solving unconstrained multiobjective optimization problems (UMOP), with at most a linear convergence rate.
Other methods like heuristic approaches or evolutionary approaches [15] provide an approximate Pareto front but do not guarantee the convergence property.
Newton's method [16], which solves single-objective optimization problems, has been extended to solve (UMOP) based on an a priori parameter-free optimization approach [17]. In this setting, the objective functions are twice continuously differentiable, no other parameter or ordering of the functions is needed, and each objective function is replaced with a quadratic model. The rate of convergence is superlinear, and it is quadratic if the second-order derivative is Lipschitz continuous. Newton's method has also been studied in Banach and Hilbert spaces for finding the efficient solutions of (UMOP) [18]. A new type of quasi-Newton algorithm has been developed to solve nonsmooth multiobjective optimization problems in which the directional derivative of every objective function exists [19].
A necessary condition for finding a vector critical point of (UMOP) is introduced in the steepest descent algorithm [12], where neither weighting factors nor ordering information for the different objective functions is assumed to be known. The relationship between critical points and efficient points is discussed in [17]. If the domain of (UMOP) is a convex set and the objective functions are component-wise convex, then every critical point is a weakly efficient point; if the objective functions are component-wise strictly convex, then every critical point is an efficient point. New classes of vector invex and pseudoinvex functions for (UMOP) have also been characterized in terms of critical points and (weak) efficient points [20] by using the Fritz John (FJ) and Karush-Kuhn-Tucker (KKT) optimality conditions. Our focus is on Newton's direction for a standard scalar optimization problem which is implicitly induced by weighting the several objective functions.
The weighting values are a priori unknown, non-negative KKT multipliers; that is, they need not be fixed in advance. Every new point generated by the Newton algorithm [17] induces such weights in the form of KKT multipliers.
Quantum calculus, or q-calculus, is also called calculus without limits. The q-analogues of mathematical objects are recaptured as q → 1. The history of quantum calculus can be traced back to Euler (1707-1783), who first introduced the parameter q in Newton's infinite series. In recent years, many researchers have shown considerable interest in examining and exploring quantum calculus; it has therefore emerged as an interdisciplinary subject. Quantum analysis is very useful in numerous fields, such as signal processing [21], operator theory [22], fractional integrals and derivatives [23], integral inequalities [24], variational calculus [25], transform calculus [26], and sampling theory [27]. Quantum calculus is seen as a bridge between mathematics and physics. For some recent developments in quantum calculus, interested researchers may refer to [28][29][30][31].
The q-calculus was first studied in the area of optimization in [32], where the q-gradient is used in the steepest descent method to optimize objective functions. Further, the global optimum was searched using the q-steepest descent and q-conjugate gradient methods, where a descent scheme is presented using q-calculus with a stochastic approach that does not address the order of convergence of the scheme [33]. The q-calculus has been applied in Newton's method to solve unconstrained single-objective optimization problems [34]. Further, this idea was extended to solve (UMOP) within the context of q-calculus [35].
In this paper, we apply the q-calculus in the quasi-Newton method for solving (UMOP). We approximate the second q-derivative matrices instead of evaluating them. Using q-calculus, we show that the convergence rate is superlinear.
The rest of this paper is organized as follows. Section 2 recalls the problem, notation, and preliminaries. Section 3 derives a q-quasi-Newton direction search method solved via the (KKT) conditions. Section 4 establishes the algorithm and its convergence analysis. The numerical results are given in Section 5, and the conclusion is in the last section.

Preliminaries
Denote R as the set of real numbers, N as the set of positive integers, and R+ (R−) as the set of strictly positive (negative) real numbers. If a function is continuous on any interval excluding zero, then the function is called continuously q-differentiable. For a function f : R → R and q ∈ (0, 1), the q-derivative of f [36], denoted D_{q,x} f, is given as

    D_{q,x} f(x) = (f(x) − f(qx)) / ((1 − q) x),  x ≠ 0,    (1)

and D_{q,x} f(0) = f′(0), provided f′(0) exists. Suppose f : R^n → R, whose partial derivatives exist. For x ∈ R^n, consider the operator ε_{q,i} on f that scales the i-th coordinate by q:

    (ε_{q,i} f)(x) = f(x_1, . . . , x_{i−1}, q x_i, x_{i+1}, . . . , x_n).

The q-partial derivative of f at x with respect to x_i, denoted D_{q,x_i} f, is [23]:

    D_{q,x_i} f(x) = (f(x) − (ε_{q,i} f)(x)) / ((1 − q) x_i),  x_i ≠ 0.

We are interested in solving the following (UMOP):

    min_{x ∈ X} F(x),

where X ⊆ R^n is a feasible region and F : X → R^m. Note that F = (f_1, f_2, . . . , f_m) is a vector function whose components are real-valued functions f_j : X → R, where j = 1, . . . , m.
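The q-derivative in (1) is straightforward to compute numerically. The following is a minimal Python sketch (the helper name q_derivative is ours, not from the paper):

```python
def q_derivative(f, x, q=0.9):
    """q-derivative: D_{q,x} f = (f(x) - f(qx)) / ((1 - q) x), for x != 0."""
    if x == 0:
        raise ValueError("at x = 0 the q-derivative reduces to the classical f'(0)")
    return (f(x) - f(q * x)) / ((1 - q) * x)

# For f(x) = x^2 the q-derivative is (1 + q) x, which tends to the
# classical derivative 2x as q -> 1.
f = lambda x: x ** 2
print(q_derivative(f, 2.0, q=0.5))    # (1 + 0.5) * 2.0 = 3.0
print(q_derivative(f, 2.0, q=0.999))  # close to the classical value 4.0
```

Note how the q-derivative of x^2 evaluates the function at two distinct points x and qx rather than taking a limit, which is the sense in which q-calculus is "calculus without limits".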
In general, n and m are independent. For x, y ∈ R^n, we use the vector inequalities

    x ≤ y  ⇔  x_i ≤ y_i for all i = 1, . . . , n,   and   x < y  ⇔  x_i < y_i for all i = 1, . . . , n.

A point x* ∈ X is called a Pareto optimal point if there is no point x ∈ X for which F(x) ≤ F(x*) and F(x) ≠ F(x*); it is called a weakly Pareto optimal point if there is no x ∈ X with F(x) < F(x*). Note that every Pareto optimal point is a weakly Pareto optimal point [37]. The q-directional derivative of f_j at x in the direction d_q is

    f′_j(x; d_q) = ∇_q f_j(x)^T d_q.

The necessary condition for a critical point of multiobjective optimization problems is given in [17]. For any x ∈ R^n, ‖x‖ denotes the Euclidean norm in R^n. Let K(x_0, r) = {x : ‖x − x_0‖ ≤ r} be the closed ball with center x_0 ∈ R^n and radius r ∈ R+. The norm of a matrix A ∈ R^{n×n} is ‖A‖ = max_{x ∈ R^n, x ≠ 0} ‖Ax‖ / ‖x‖. The following proposition indicates that when f(x) is a linear (affine) function, the q-gradient coincides with the classical gradient.
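The Pareto dominance relation above can be checked componentwise. A small illustrative sketch (the function name dominates is ours):

```python
def dominates(Fx, Fy):
    """True if the objective vector Fx Pareto-dominates Fy: Fx <= Fy
    componentwise, with a strict improvement in at least one component."""
    return all(a <= b for a, b in zip(Fx, Fy)) and any(a < b for a, b in zip(Fx, Fy))

print(dominates([1.0, 2.0], [1.0, 3.0]))  # True: equal in f1, better in f2
print(dominates([1.0, 2.0], [1.0, 2.0]))  # False: no strict improvement
print(dominates([1.0, 4.0], [2.0, 3.0]))  # False: the two vectors are incomparable
```

A point is Pareto optimal precisely when no feasible point's objective vector dominates its own; the third call illustrates why Pareto fronts contain many mutually incomparable points.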
Proposition 1 ([33]). If f(x) = a + p^T x, where a ∈ R and p ∈ R^n, then for any x ∈ R^n and q ∈ (0, 1), ∇_q f(x) = ∇f(x) = p. All quasi-Newton methods approximate the Hessian of f by a matrix W_k ∈ R^{n×n} and update it from the previous approximation [38]. Line search methods are important methods for (UMOP), in which a search direction is first computed and then a step length is chosen along this direction. The entire process is iterative.
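Proposition 1 can be verified numerically: for an affine f, the q-partial derivatives return the coefficients p exactly, independent of q. A quick sketch (the helper name q_partial is ours):

```python
def q_partial(f, x, i, q):
    """q-partial derivative of f at x with respect to x_i (requires x_i != 0)."""
    xq = list(x)
    xq[i] = q * xq[i]
    return (f(x) - f(xq)) / ((1 - q) * x[i])

# f(x) = a + p^T x with p = (3, -1); the q-gradient equals p for any q in (0, 1).
a, p = 5.0, [3.0, -1.0]
f = lambda x: a + sum(pi * xi for pi, xi in zip(p, x))
print([q_partial(f, [2.0, 4.0], i, q=0.3) for i in range(2)])  # [3.0, -1.0]
```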

The q-Quasi-Newton Direction for Multiobjective
The most well-known quasi-Newton method for a single objective function is the BFGS (Broyden, Fletcher, Goldfarb, and Shanno) method. This is a line search method along a descent direction d_q^k within the context of the q-derivative, given as:

    W_k d_q^k = −∇_q f(x^k),

where f is a continuously q-differentiable function and W_k ∈ R^{n×n} is a positive definite matrix that is updated at every iteration. The new point is:

    x^{k+1} = x^k + α_k d_q^k.

In the steepest descent method and Newton's method, W_k is taken to be the identity matrix and the exact Hessian of f, respectively. The quasi-Newton BFGS scheme generates the next W_{k+1} as

    W_{k+1} = W_k − (W_k s_k s_k^T W_k) / (s_k^T W_k s_k) + (y_k y_k^T) / (y_k^T s_k),    (8)

where s_k = x^{k+1} − x^k = α_k d_q^k and y_k = ∇_q f(x^{k+1}) − ∇_q f(x^k). In Newton's method, second-order differentiability of the function is required; while calculating W_k, we instead use the q-derivative, which behaves like a Hessian matrix of f(x). W_{k+1} may fail to be positive definite, in which case it can be modified to be positive definite through a symmetric indefinite factorization [39]. The q-quasi-Newton direction d_q(x) is an optimal solution of the following modified problem [40]:

    min_{d_q ∈ R^n}  max_{j = 1, . . . , m}  ∇_q f_j(x)^T d_q + (1/2) d_q^T W_j(x) d_q,    (9)

where W_j(x) is computed as in (8). The solution and optimal value of (9) are:

    d_q(x) = argmin_{d_q ∈ R^n} max_{j = 1, . . . , m} ∇_q f_j(x)^T d_q + (1/2) d_q^T W_j(x) d_q,    (10)

and

    ψ(x) = min_{d_q ∈ R^n} max_{j = 1, . . . , m} ∇_q f_j(x)^T d_q + (1/2) d_q^T W_j(x) d_q.    (11)

Problem (9) is equivalent to the following convex quadratic optimization problem (CQOP):

    min t   subject to   ∇_q f_j(x)^T d_q + (1/2) d_q^T W_j(x) d_q ≤ t,  j = 1, . . . , m,

where (t, d_q) ∈ R × R^n.
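The BFGS update (8), driven here by q-gradient differences, can be sketched in pure Python (list-based linear algebra for clarity; all names are ours):

```python
def bfgs_update(W, s, y):
    """BFGS update: W' = W - (W s s^T W)/(s^T W s) + (y y^T)/(y^T s)."""
    n = len(s)
    Ws = [sum(W[i][j] * s[j] for j in range(n)) for i in range(n)]
    sWs = sum(s[i] * Ws[i] for i in range(n))
    ys = sum(y[i] * s[i] for i in range(n))  # curvature term; must be > 0
    return [[W[i][j] - Ws[i] * Ws[j] / sWs + y[i] * y[j] / ys
             for j in range(n)] for i in range(n)]

# Sanity check: for f(x) = (1/2)||x||^2 the (q-)gradient difference is y = s,
# and the identity matrix is a fixed point of the update.
I = [[1.0, 0.0], [0.0, 1.0]]
s = [1.0, 2.0]
print(bfgs_update(I, s, s))  # stays the identity
```

When the curvature condition y^T s > 0 fails, the update is skipped or W is repaired, which is the role of the symmetric indefinite factorization mentioned above.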
The Lagrangian function of (CQOP) is:

    L(t, d_q, λ) = t + Σ_{j=1}^m λ_j ( ∇_q f_j(x)^T d_q + (1/2) d_q^T W_j(x) d_q − t ).

For λ = (λ_1, λ_2, . . . , λ_m)^T, we obtain the following (KKT) conditions [40]:

    Σ_{j=1}^m λ_j ( ∇_q f_j(x) + W_j(x) d_q ) = 0,    (14)
    Σ_{j=1}^m λ_j = 1,
    λ_j ≥ 0,  j = 1, . . . , m,
    ∇_q f_j(x)^T d_q + (1/2) d_q^T W_j(x) d_q ≤ t,  j = 1, . . . , m,
    λ_j ( ∇_q f_j(x)^T d_q + (1/2) d_q^T W_j(x) d_q − t ) = 0,  j = 1, . . . , m.    (18)

The solution (d_q(x), ψ(x)) is unique, and we set λ_j = λ_j(x) for all j = 1, . . . , m, with d_q = d_q(x) and t = ψ(x), satisfying (14)-(18). From (14), we obtain

    d_q(x) = − [ Σ_{j=1}^m λ_j W_j(x) ]^{−1} Σ_{j=1}^m λ_j ∇_q f_j(x).    (19)

This is the so-called q-quasi-Newton direction for solving (UMOP). We present the basic result relating the stationarity condition at a given point x to its q-quasi-Newton direction d_q(x) and the function ψ.

Proposition 2. Let ψ : X → R and d_q : X → R^n be given by (11) and (10), respectively, and let W_j(x) be positive definite for all x ∈ X. Then:
1. ψ(x) ≤ 0 for all x ∈ X.
2. The following conditions are equivalent: (a) the point x is non-stationary; (b) ψ(x) < 0; (c) d_q(x) ≠ 0.
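Once the KKT multipliers λ_j are known, the direction follows in closed form from (14). A sketch for n = 1, where the matrix inverse reduces to a scalar division (the multipliers are supplied by hand here, whereas in the algorithm they come from solving (CQOP); all names are ours):

```python
def q_qn_direction_1d(q_grads, hessians, lams):
    """n = 1 case of the direction formula:
    d_q = -(sum_j lam_j * grad_j) / (sum_j lam_j * W_j)."""
    num = sum(l * g for l, g in zip(lams, q_grads))
    den = sum(l * w for l, w in zip(lams, hessians))
    return -num / den

# Two objectives with q-gradients 2 and -1, q-Hessian approximations 1 and 2,
# and equal weights lam = (0.5, 0.5):
print(q_qn_direction_1d([2.0, -1.0], [1.0, 2.0], [0.5, 0.5]))  # -1/3
```

With m = 1 (and λ_1 = 1) this reduces to the classical quasi-Newton step −W^{-1} ∇_q f.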

3. The function ψ is continuous.
Proof. 1. Since d_q = 0 is feasible for (CQOP) with t = 0, we have from (11) that ψ(x) ≤ 0 for all x ∈ X. 2. If d_q(x) ≠ 0 then, since each W_j(x) is positive definite, from (10) and (11) we get ψ(x) < 0; thus, the given point x ∈ R^n is non-stationary. Conversely, since ψ(x) is the optimal value of (CQOP) and is negative at a non-stationary point, the solution of (CQOP) can never be d_q(x) = 0. 3. It suffices to show the continuity [41] of ψ on a compact set Y ⊂ X. Since ψ(x) ≤ 0, and the matrices W_j(x), j = 1, . . . , m, are positive definite for all x ∈ Y, the eigenvalues of the matrices W_j(x), j = 1, . . . , m, are uniformly bounded away from zero on Y, so there exist R, S ∈ R+ such that

    R = max_{x ∈ Y, ‖e‖ = 1, j = 1, . . . , m} e^T W_j(x) e   and   S = min_{x ∈ Y, ‖e‖ = 1, j = 1, . . . , m} e^T W_j(x) e.
From (20) and the Cauchy-Schwarz inequality, we get, for all x ∈ Y, that the q-Newton direction is uniformly bounded on Y. We introduce the family of functions {ℵ_{x,j}}_{x ∈ Y, j = 1, . . . , m}, where ℵ_{x,j} : Y → R. We shall prove that this family of functions is uniformly equicontinuous. For a small value z ∈ R+ there exists δ_z ∈ R+ such that, for y ∈ K(z, δ_z), the corresponding differences are small for all j = 1, . . . , m; the second inequality holds because of the q-continuity of the Hessian matrices. Since Y is a compact space, there exists a finite sub-cover.
The following modified lemma is due to [17,42].
Proof. Since x* is not a critical point, ψ(x) < 0. Let r > 0 be such that K(x, r) ⊂ X and let α ∈ (0, σ]. Then the last term on the right-hand side of the resulting inequality is non-positive because ψ(x) ≤ ψ(x*).

Algorithm and Convergence Analysis
We first present the following Algorithm 1 [43] to find the gradient of the function using q-calculus. The higher-order q-derivative of f can be found in [44].
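A componentwise q-gradient in the spirit of Algorithm 1 can be sketched as follows (a simplification that keeps q fixed; the function name q_gradient and the zero-coordinate fallback are ours):

```python
def q_gradient(f, x, q=0.9):
    """q-gradient of f: R^n -> R, computed componentwise from the q-partial
    derivatives; falls back to a central difference where x_i = 0."""
    grad = []
    for i in range(len(x)):
        if x[i] == 0.0:
            h = 1e-7
            xp, xm = list(x), list(x)
            xp[i] += h
            xm[i] -= h
            grad.append((f(xp) - f(xm)) / (2.0 * h))
        else:
            xq = list(x)
            xq[i] = q * xq[i]
            grad.append((f(x) - f(xq)) / ((1.0 - q) * x[i]))
    return grad

# f(x) = x1^2 + x2^2: the q-partial with respect to x_i is (1 + q) x_i.
f = lambda x: x[0] ** 2 + x[1] ** 2
print(q_gradient(f, [2.0, 3.0], q=0.5))  # [3.0, 4.5]
```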
Example 1. Given that f :

We are now prepared to write the unconstrained q-quasi-Newton Algorithm 2 for solving (UMOP). At each step, we solve (CQOP) to find the q-quasi-Newton direction. Then, we obtain the step length using the Armijo line search method. In every iteration, the new point and the Hessian approximation are generated from the historical values.

Algorithm 2 q-Quasi-Newton Algorithm for Unconstrained Multiobjective Optimization (q-QNUM)
1: Choose q ∈ (0, 1), x_0 ∈ X, a symmetric positive definite matrix W_0 ∈ R^{n×n}, c ∈ (0, 1), and a small tolerance value ε > 0.
2: for k = 0, 1, 2, . . . do
3:   Solve (CQOP).
4:   Compute d_q^k and ψ^k.
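To make the flow of Algorithm 2 concrete, here is a deliberately simplified single-objective, one-dimensional sketch (m = n = 1, fixed q, and a secant update in place of the full BFGS formula; none of this code is from the paper):

```python
def q_deriv(f, x, q):
    if x == 0.0:  # at 0 the q-derivative reduces to f'(0); use a central difference
        h = 1e-8
        return (f(h) - f(-h)) / (2.0 * h)
    return (f(x) - f(q * x)) / ((1.0 - q) * x)

def q_qnum_1d(f, x0, q=0.9, c=1e-4, tol=1e-8, max_iter=50):
    x, W = x0, 1.0  # W_0 = 1, the one-dimensional "identity"
    for _ in range(max_iter):
        g = q_deriv(f, x, q)
        d = -g / W                       # quasi-Newton direction
        psi = g * d + 0.5 * W * d * d    # optimal value of the 1-D subproblem
        if psi > -tol:                   # stopping criterion psi(x) > -tol
            break
        alpha = 1.0                      # Armijo backtracking line search
        while f(x + alpha * d) > f(x) + c * alpha * g * d:
            alpha *= 0.5
        s = alpha * d
        y = q_deriv(f, x + s, q) - g
        if y * s > 0:                    # 1-D BFGS collapses to the secant W = y / s
            W = y / s
        x += s
    return x

print(q_qnum_1d(lambda x: x * x, 5.0, q=0.9))  # converges to the minimizer 0
```

Near the solution the Armijo test accepts the full step α = 1, mirroring the behavior the convergence analysis below establishes.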
We now show that every sequence produced by the proposed method converges to a weakly efficient point, no matter how poorly the initial point is chosen. We assume that the method does not stop and produces an infinite sequence of iterates. We now present the modified sufficient conditions for superlinear convergence [17,40] within the context of q-calculus.

Theorem 1. Let {x^k} be a sequence generated by (q-QNUM), and let Y ⊂ X be a convex set. Also, let γ ∈ (0, 1) and r, a, b, δ, ε > 0, and assume (a) aI ≤ W_j(x) ≤ bI for all x ∈ Y, j = 1, . . . , m. Then, for all k ≥ k_0, the sequence {x^k} converges to a local Pareto point x* ∈ R^n, and the convergence rate is superlinear.
Proof. From parts 1 and 3 of this theorem and the triangle inequality, together with (d) and (f), it follows that x^k, x^k + d_q(x^k) ∈ K(x^{k_0}, r). Since the descent inequalities hold for all j = 1, . . . , m, the Armijo condition holds for α_k = 1, and part 1 of this theorem holds. We now take x^k, x^{k+1} ∈ K(x^{k_0}, r) with ‖x^{k+1} − x^k‖ < δ. We next estimate ‖v(x^{k+1})‖. For x ∈ X, we define

    G_k(x) = Σ_{j=1}^m λ_j^k f_j(x),

where λ_j^k ≥ 0, j = 1, . . . , m, are the KKT multipliers; then v^{k+1} = ∇_q G_k(x^{k+1}). From assumptions (b) and (c) of this theorem, the corresponding bounds hold for all x, y ∈ Y with ‖y − x‖ < δ and k ≥ k_0. Thus, part 4 is proved. We finally prove the superlinear convergence of {x^k}. First we define r_k and δ_k; from the triangle inequality, assumptions (e) and (f), and part 1, we have K(x^k, r_k) ⊂ K(x^{k_0}, r) ⊂ V. Choose any τ ∈ R+ and define ε̄ = min{ aτ / (1 + 2τ), ε }.
For all x, y ∈ K(x^k, r_k) with ‖y − x‖ < δ_k, and for all y ∈ K(x^k, r_k) and l ≥ k, the corresponding bounds hold for j = 1, . . . , m. Hence assumptions (a)-(f) are satisfied for ε̄, r_k, δ_k, and x^k in place of ε, r, δ, and x_0, respectively. Letting l → ∞, we get

    ‖x* − x^k‖ ≤ ‖d_q(x^k)‖ / (1 − ε̄/a).

Using the last inequality, part 4, and the triangle inequality, and since 1 − 2ε̄/a > 0, the convergence ratio is bounded above by a quantity proportional to τ, where τ ∈ R+ was chosen arbitrarily. Thus, the sequence {x^k} converges superlinearly to x*.

Numerical Results
The proposed algorithm (q-QNUM), i.e., Algorithm 2, presented in Section 4, is implemented in MATLAB (2017a) and tested on test problems known from the literature. All tests were run under the same conditions. Box constraints of the form lb ≤ x ≤ ub are used for each test problem. These constraints are imposed in the direction search problem (CQOP) so that the newly generated point always lies in the same box, that is, lb ≤ x + d_q ≤ ub holds. We use the stopping criterion at x^k: ψ(x^k) > −ε, where ε ∈ R+. All test problems given in Table 1 are solved 100 times. The starting points are randomly chosen from a uniform distribution between lb and ub. The first column in the table gives the name of the test problem; we use the abbreviation of the authors' names and the number of the problem in the corresponding paper. The second column indicates the source paper. The third column gives the lower and upper bounds. We compare the results of (q-QNUM) with (QNMO) of [40] in terms of the number of iterations (iter), the number of objective function evaluations (obj), and the number of gradient evaluations (grad), respectively. From Table 1, we can conclude that our algorithm shows better performance.

Example 2.
Find the approximate Pareto front using (q-QNUM) and (QNMO) for the given (UMOP) [45]: The number of Pareto points generated by (q-QNUM) with Algorithm 1 and by (QNMO) is shown in Figure 1. One can observe that iter = 200 iterations for (q-QNUM) and iter = 525 for (QNMO) were required to generate the approximate Pareto front of the above (UMOP).

Conclusions
The q-quasi-Newton method converges superlinearly to a solution of (UMOP) if all objective functions are strongly convex within the context of the q-derivative. In a neighborhood of this solution, the algorithm accepts the full Armijo step length. Numerically, approximating the second q-derivative matrices is faster than their exact evaluation, and the proposed algorithm outperforms (QNMO) on the test problems.