Article

Issues on a 2–Dimensional Quadratic Sub–Problem and Its Applications in Nonlinear Programming: Trust–Region Methods (TRMs) and Linesearch Based Methods (LBMs)

by Giovanni Fasano 1,*,†, Christian Piermarini 2,† and Massimo Roma 2,†
1 Department of Management, Venice School of Management, Ca’ Foscari University, S. Giobbe, Cannaregio 873, 30121 Venice, Italy
2 Department of Computer, Control and Management Engineering, Sapienza University of Rome, Via Ariosto 25, 00185 Rome, Italy
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Algorithms 2024, 17(12), 563; https://doi.org/10.3390/a17120563
Submission received: 15 August 2024 / Revised: 27 November 2024 / Accepted: 28 November 2024 / Published: 9 December 2024
(This article belongs to the Special Issue Numerical Optimization and Algorithms: 2nd Edition)

Abstract

This paper analyses the solution of a specific quadratic sub-problem, along with its possible applications, within both constrained and unconstrained Nonlinear Programming frameworks. We give evidence that this sub-problem may appear in a number of Linesearch Based Method (LBM) schemes, and to some extent it reveals a close analogy with the solution of trust-region sub-problems. Namely, we refer to a two-dimensional structured quadratic problem that includes five linear inequality constraints. Finally, we detail how to compute an exact global solution of our two-dimensional quadratic sub-problem, exploiting first-order Karush-Kuhn-Tucker (KKT) conditions.

1. Introduction

There are plenty of real problems where the minimization of a twice continuously differentiable functional $f : \mathbb{R}^n \to \mathbb{R}$ is sought, (possibly) subject to several linear and nonlinear constraints. Such problems are widely detailed in authoritative textbooks such as [1,2,3]. These general problems typically require the solution of a sequence of simple sub-problems with the following pattern:
$$\min_{x}\ \varphi_k(x) \quad \text{s.t.}\quad x \in D_k, \qquad D_k:\ \begin{cases} A_k x + u_k = 0 \\ B_k x + v_k \le 0, \end{cases} \tag{1}$$
where $A_k \in \mathbb{R}^{m_k \times n}$, $B_k \in \mathbb{R}^{p_k \times n}$, $u_k \in \mathbb{R}^{m_k}$, $v_k \in \mathbb{R}^{p_k}$, $m_k, p_k \ge 1$ and $k \ge 1$. Furthermore, $\varphi_k(x)$ represents a model of the smooth function $f(x)$ at the current iterate $x_k$, and the feasible set $D_k$ represents a linearization of the constraints.
As is well known, affine and quadratic polynomials based on Taylor's expansion are often adopted to represent the models $\{\varphi_k(x)\}$, but valid alternatives also include least squares approximations, Radial Basis Functions, metamodels based on Splines, B-Splines, Kriging, etc. [4,5]. We remark that the advantage of solving the sequence of sub-problems (1) in place of the original nonlinearly constrained problem, within a suitable convergence framework, essentially relies on their simplicity. In particular, our focus in this paper is on investigating the role and the properties of the next problem (2), which represents a special sub-case of the more general problem (1). More specifically, we consider the case where in (1) the feasible set $D_k$ includes only a finite number of inequalities and the function $\varphi_k(x)$ is a quadratic functional; i.e., dropping the dependency on $k$, we focus on the sub-problem
$$\begin{aligned} \min_{\alpha, \beta}\ & \varphi(x) = \tfrac{1}{2} x^T Q x + b^T x + c \\ & x = \bar{x} + \alpha d + \beta z \\ & a_1 \le \alpha \le b_1 \\ & a_2 \le \beta \le b_2 \\ & \epsilon_1 \alpha + \epsilon_2 \beta \le \epsilon_3, \end{aligned} \tag{2}$$
where $Q \in \mathbb{R}^{n \times n}$, $b, \bar{x} \in \mathbb{R}^n$, $c \in \mathbb{R}$, $d, z \in \mathbb{R}^n$ are given real search directions, and $a_1 \le b_1$, $a_2 \le b_2$. Despite the apparently specific structure of (2), a number of real applications may benefit from its solution, as partly described in Section 4 (see also [6] for a general perspective and [7] for a more recent similar viewpoint within neural network frameworks).
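To fix the notation in a computational form, the following minimal Python sketch gathers the data of (2); the container and all its names are ours and purely illustrative, assuming NumPy and a symmetric $Q$.

import numpy as np
from dataclasses import dataclass

@dataclass
class SubProblem2:
    # Data of the two-dimensional sub-problem (2); names are illustrative.
    Q: np.ndarray        # n x n symmetric matrix of the quadratic term
    b: np.ndarray        # n-vector of the linear term
    c: float             # scalar offset
    x_bar: np.ndarray    # current point
    d: np.ndarray        # first search direction
    z: np.ndarray        # second search direction
    a1: float            # lower bound on alpha (a1 <= b1)
    b1: float            # upper bound on alpha
    a2: float            # lower bound on beta (a2 <= b2)
    b2: float            # upper bound on beta
    eps1: float          # coefficients of the linear constraint
    eps2: float          #   eps1*alpha + eps2*beta <= eps3
    eps3: float

    def phi(self, alpha: float, beta: float) -> float:
        # Objective of (2) evaluated at (alpha, beta).
        x = self.x_bar + alpha * self.d + beta * self.z
        return 0.5 * x @ (self.Q @ x) + self.b @ x + self.c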
As an example of the versatility of the structure of (2), both in TRMs and LBMs, we briefly consider how it may be successfully embedded within the framework of Truncated Newton's methods (TNMs; see Table 1), where (see also [8,9,10,11,12,13,14])
  • $d \in \mathbb{R}^n$ represents an approximate Newton-type direction, at the current feasible point $\bar{x} \in \mathbb{R}^n$;
  • $z \in \mathbb{R}^n$ represents a negative curvature direction for the nonlinear function $f(x)$, at the current feasible point $\bar{x} \in \mathbb{R}^n$;
  • $Q \in \mathbb{R}^{n \times n}$ represents the exact/approximate Hessian matrix of $f(x)$ at $\bar{x}$;
  • $b \in \mathbb{R}^n$ represents the exact/approximate gradient vector of $f(x)$ at $\bar{x}$;
  • $\alpha$ and $\beta$ are steplengths along the directions $d, z \in \mathbb{R}^n$ (i.e., following the taxonomy of Table 1, we have $\alpha \equiv \omega_1(\alpha)$ and $\beta \equiv \omega_2(\alpha)$), with $-\infty < a_1 \le b_1 < +\infty$ and $-\infty < a_2 \le b_2 < +\infty$. The constraint $\varepsilon_1 \alpha + \varepsilon_2 \beta \le \varepsilon_3$ potentially plays a multi-purpose role, modeling for instance the gradient-related property for the search direction $\alpha d + \beta z \in \mathbb{R}^n$ at $\bar{x}$, i.e.,
$$(\alpha d + \beta z)^T \nabla f(\bar{x}) \le -\bar{c}\,\|\nabla f(\bar{x})\|^h, \tag{3}$$
where
$$\varepsilon_1 = d^T \nabla f(\bar{x}), \qquad \varepsilon_2 = z^T \nabla f(\bar{x}), \qquad \varepsilon_3 = -\bar{c}\,\|\nabla f(\bar{x})\|^h, \qquad \bar{c}, h > 0$$
(a small computational sketch of this assembly is given below).
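As a minimal sketch (our names; g denotes the gradient of $f$ at $\bar{x}$, and the default values of c_bar and h are purely illustrative), the coefficients of (3) can be assembled as follows:

import numpy as np

def gradient_related_coefficients(g, d, z, c_bar=1e-4, h=2.0):
    # Build eps1, eps2, eps3 of (3) from g = grad f(x_bar); c_bar, h > 0
    # are user-chosen constants.
    eps1 = d @ g                               # d^T grad f(x_bar)
    eps2 = z @ g                               # z^T grad f(x_bar)
    eps3 = -c_bar * np.linalg.norm(g) ** h     # -c_bar * ||grad f(x_bar)||^h
    return eps1, eps2, eps3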
The availability of an (exact) global solution for (2) may also suggest some alternatives to Table 1, either selecting a TRM or an LBM framework, or combining the two approaches. In particular, the scheme in Table 2 represents an immediate acceleration scheme for linesearch-based TNMs with respect to Table 1, in case the global convergence of $\{x_k\}$ to stationary limit points is simply sought. Note that selecting negative values for $a_1, a_2$ and positive ones for $b_1, b_2$ allows us to possibly perform the following:
  • reverse the directions $d_k$ and $z_k$;
  • use (2) in the light of simulating a dogleg-like [3] procedure for TRMs, also in LBM nonconvex frameworks.
In Table 2, the global convergence to stationary limit points is easily preserved by using results similar to those adopted for Table 1.
As a further alternative for exploiting the exact global solution of (2), with respect to Table 1 and Table 2, we have the scheme in Table 3, where we suitably combine the strategies used in TRMs and LBMs to ensure global convergence. (In particular, TRMs require the fulfillment of a sufficient reduction of the model in order to force a sufficient decrease in the objective function, so that they do not need any linesearch procedure, possibly implying a reduced computational burden with respect to LBMs. Conversely, LBMs easily compute an effective search direction, but they need to perform a linesearch procedure, because they do not include any direct function reduction mechanism based on the local quadratic model.) In particular, if the test $Ared_k / Pred_k > \rho$ is fulfilled, there is no need to perform a linesearch procedure, since the global convergence of $\{x_k\}$ is preserved by the trust-region framework. We also remark that, in Table 3, the computation of both $\varphi(x_k)$ and $\varphi(x_k + v_k)$ is required regardless of the outcome of the test $Ared_k / Pred_k \le \rho$, since these quantities are needed to form the ratio itself.
Finally, there is a chance to further exploit the scheme (2) in a TNM framework based on the linesearch procedure, in order to ensure global convergence properties for the sequence $\{x_k\}$ to stationary limit points satisfying the second-order necessary optimality conditions (namely, those stationary points where the Hessian matrix is positive semidefinite). The resulting scheme is proposed in Table 4 and requires no additional comments. The above examples give an overview of the possible basic contexts where the solution of the sub-problem (2) is sought. Hence, to some extent, specifically exploiting issues on its solution may yield a tool for practitioners working in Nonlinear Programming frameworks. We remark that, both in Section 3 and Section 6, the reader may find additional guidelines for possible alternatives and extensions to the use of global solutions of (2).
We also highlight that our perspective differs both from SQP (Sequential Quadratic Programming) methods (see, for example, the seminal paper [15]) and from approaches in the literature where LBMs and TRMs have been combined. Indeed, in the basic structure of SQPs (see also [16,17]), inner and outer iterations are performed. At each outer iteration, the pair given by the primal-dual variables is computed, and a problem similar to (2) is addressed. On the contrary, we do not intend to propose a (novel) framework of global convergence for Nonlinear Programming; rather, we point out the generality of the scheme (2) within a number of cases from the literature. Furthermore, we specifically focus on the exact solution of the quadratic sub-problem (2) as well.
On the other hand, in the seminal paper [18] and in the more recent ones [19,20,21], linesearch and trust-region techniques are integrated into a unified framework; conversely, our point of view merely intends to bridge the gap between them.
The structure of the present paper is as follows. In Section 2, we describe the conditions ensuring the feasibility of our problem. In Section 3, we reveal the basic motivations for our analysis and outcomes. Section 4 reports relevant remarks, highlighting how general our proposal can be. Section 5 includes the Karush-Kuhn-Tucker conditions associated with problem (2), along with precise guidelines to find a global minimum for it. Finally, Section 6 provides some conclusions and suggestions for future work.
As regards the symbols adopted in the paper, $\|x\|_1$, $\|x\|_2$ and $\|x\|_\infty$ are, respectively, used to indicate the 1-norm, the 2-norm, and the $\infty$-norm of the vector $x \in \mathbb{R}^n$ or of the real $n \times m$ matrix $x$. Given the real $n$-vectors $x$ and $y$, we indicate their standard inner product by $x^T y$. Given the matrix $A \in \mathbb{R}^{m \times n}$, we then indicate by $A^+$ its Moore-Penrose pseudoinverse matrix, i.e., the unique matrix such that $A A^+ A = A$, $A^+ A A^+ = A^+$, $(A A^+)^T = A A^+$, $(A^+ A)^T = A^+ A$. With $A \succeq 0$ ($A \succ 0$), we indicate a positive semidefinite (positive definite) matrix $A$.

2. Feasibility Issues for Our Quadratic Problem

Here, we consider some feasibility issues for the linear inequality constrained quadratic problem (2). Clearly, (2) just includes the two real unknowns $\alpha$ and $\beta$. Moreover, as regards the existence of solutions for (2), we have the following result.
Lemma 1
(Feasibility). Let the problem (2) be given and assume that the real values $a_1, b_1, a_2, b_2$ are finite, with $a_1 \le b_1$ and $a_2 \le b_2$. Then, (2) admits solutions if and only if at least one of the following conditions holds:
  • Cond. I: $\epsilon_1 = \epsilon_2 = 0$ and $\epsilon_3 \ge 0$.
  • Cond. II: $\epsilon_1 = 0$ and $\epsilon_2 \ne 0$; moreover,
    - if $\epsilon_2 > 0$, then $a_2 \le \epsilon_3 / \epsilon_2$;
    - if $\epsilon_2 < 0$, then $b_2 \ge \epsilon_3 / \epsilon_2$.
  • Cond. III: $\epsilon_1 \ne 0$ and $\epsilon_2 = 0$; moreover,
    - if $\epsilon_1 > 0$, then $a_1 \le \epsilon_3 / \epsilon_1$;
    - if $\epsilon_1 < 0$, then $b_1 \ge \epsilon_3 / \epsilon_1$.
  • Cond. IV: $\epsilon_1 \ne 0$, $\epsilon_2 \ne 0$, $-\epsilon_1/\epsilon_2 < 0$; moreover,
    - if $\epsilon_1 > 0$ and $\epsilon_2 > 0$, then $a_2 \le -(\epsilon_1/\epsilon_2)\, a_1 + (\epsilon_3/\epsilon_2)$;
    - if $\epsilon_1 < 0$ and $\epsilon_2 < 0$, then $b_2 \ge -(\epsilon_1/\epsilon_2)\, b_1 + (\epsilon_3/\epsilon_2)$.
  • Cond. V: $\epsilon_1 \ne 0$, $\epsilon_2 \ne 0$, $-\epsilon_1/\epsilon_2 > 0$; moreover,
    - if $\epsilon_1 < 0$ and $\epsilon_2 > 0$, then $a_2 \le -(\epsilon_1/\epsilon_2)\, b_1 + (\epsilon_3/\epsilon_2)$;
    - if $\epsilon_1 > 0$ and $\epsilon_2 < 0$, then $b_2 \ge -(\epsilon_1/\epsilon_2)\, a_1 + (\epsilon_3/\epsilon_2)$.
Proof of Lemma 1.
For the sake of simplicity, we refer to Figure 1. The objective function in (2) is continuous, so that the existence of solutions follows from the compactness and nonemptiness of the feasible region. In this regard, the compactness is a consequence of assuming $a_1, b_1, a_2, b_2$ finite. Furthermore, it is not difficult to realize that the feasible set of (2) is nonempty as long as at least one among the five conditions, Cond. I–Cond. V, is fulfilled, where the dashed-dotted line in Figure 1 represents the line associated with the last inequality constraint in (2). In particular, Cond. IV refers to the corner points A and B of Figure 1, while Cond. V refers to the vertices C and D. □
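In computational terms, Cond. I–Cond. V collapse into a single test: the linear function $\epsilon_1 \alpha + \epsilon_2 \beta$ attains its minimum over the box at a corner, and the feasible set is nonempty exactly when that minimum does not exceed $\epsilon_3$. A minimal sketch (our simplification, equivalent to the lemma under its assumptions):

def box_line_feasible(a1, b1, a2, b2, eps1, eps2, eps3):
    # Nonemptiness test for the feasible set of (2), with a1 <= b1 and
    # a2 <= b2 finite: the minimum of eps1*alpha + eps2*beta over the box
    # is attained at the corner chosen coordinate-wise below.
    alpha_min = a1 if eps1 >= 0 else b1   # corner minimizing eps1*alpha
    beta_min = a2 if eps2 >= 0 else b2    # corner minimizing eps2*beta
    return eps1 * alpha_min + eps2 * beta_min <= eps3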

3. On the Use of Quadratic Sub-Problems Within TRMs and LBMs, in Large-Scale Optimization

Here, we give details about a possible motivation for our proposal, in order to reduce the gap between two renowned classes of optimization methods, namely TRMs and LBMs. We are indeed persuaded that such a viewpoint may suggest a number of possible enhancements for both classes of methods.
In this regard, observe that a TRM for large-scale problems is an iterative procedure that generates the sequence of iterates $\{x_k\} \subset \mathbb{R}^n$ and seeks at any step $k$ the solution of the trust-region sub-problem
$$\min_{s}\ q_k(s) = f(x_k) + \nabla f(x_k)^T s + \tfrac{1}{2} s^T Q_k s, \qquad \|s\|_2 \le \Delta_k, \tag{4}$$
where $x_k$ is the current iterate, $Q_k$ represents the exact/approximate Hessian matrix $\nabla^2 f(x_k)$, and $\Delta_k > 0$ represents the radius of the trust region, i.e., the compact subset where the model $q_k(s)$ needs to be validated (for an exhaustive description of TRMs for Nonlinear Programming, the reader can refer to [2]). A number of possible variants of (4) can be introduced when $n$ is large, including iterative updating strategies for both $Q_k$ and $\Delta_k$, and a number of approximate/sophisticated/refined schemes for its solution are available in the literature.
A distinguishing feature of TRMs, with respect to LBMs, is that at iteration $k$ the methods in the first class attempt to determine the stepsize $\alpha_k$ and the search direction $d_k$ at once, so that $x_{k+1} = x_k + s_k \equiv x_k + \alpha_k d_k$, where $s_k$ indeed approximately/exactly solves (4). Conversely, in LBMs, the computations of $\alpha_k$ and $d_k$ are independent, as detailed later on in this paper. In particular (see also [3]), the effective computation of $s_k$ in TRMs properly attempts to comply with the following issues:
  • $s_k$ can be computed by either an exact (small- and medium-scale problems) or an approximate (large-scale problems) procedure;
  • In order to prove the global convergence of the sequence $\{x_k\}$ to stationary limit points satisfying either first- or second-order necessary optimality conditions, $s_k$ is required to provide a sufficient reduction of the quadratic model $q_k(s)$; i.e., the difference $q_k(0) - q_k(s_k)$ is asked to satisfy a condition like (with $c > 0$)
$$q_k(0) - q_k(s_k) \ge c\,\|\nabla f(x_k)\|_2 \min\left\{\Delta_k,\ \frac{\|\nabla f(x_k)\|_2}{1 + \|Q_k\|_2}\right\};$$
  • $s_k$ can be computed by an approximate procedure, e.g., by adopting a Cauchy step or using the Steihaug conjugate gradient (see [22,23]), regardless of the signature of $Q_k$ (a minimal Cauchy-step sketch is reported after this list). Then, the approximate solution of (4) is merely sought on a linear manifold of dimension one or at most two, rather than on the entire subset $\mathcal{B} \equiv \{s \in \mathbb{R}^n : \|s\|_2 \le \Delta_k\}$;
  • Depending on a number of additional assumptions, TRMs can be proved to be globally convergent to either a simple stationary limit point, or to a point which satisfies second-order necessary optimality conditions [2];
  • Finding the exact/accurate solution of the sub-problem (4) is in general quite a cumbersome task in large-scale problems, representing a difficult goal that is often (when possible) skipped.
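For reference, the Cauchy step admits a simple closed form; the code below is our illustrative sketch of the standard formula (see, e.g., [3]).

import numpy as np

def cauchy_step(g, Q, delta):
    # Minimizer of the quadratic model along -g within ||s||_2 <= delta:
    # s_C = -tau * (delta/||g||) * g, with tau = 1 when the curvature
    # g^T Q g is nonpositive (the boundary is reached), and
    # tau = min(||g||^3 / (delta * g^T Q g), 1) otherwise.
    gnorm = np.linalg.norm(g)
    if gnorm == 0.0:
        return np.zeros_like(g)
    gQg = g @ (Q @ g)
    tau = 1.0 if gQg <= 0.0 else min(gnorm**3 / (delta * gQg), 1.0)
    return -tau * (delta / gnorm) * g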
On the other hand, to some extent, LBMs represent the counterpart of TRMs. Indeed, to yield the next iterate $x_{k+1} = x_k + \alpha_k d_k$, they perform the computation of the steplength $\alpha_k$ and the direction $d_k$ as separate tasks. Furthermore, unlike for TRMs, the novel iterate $x_{k+1}$ in LBMs can also be obtained by adopting the more general update
$$x_{k+1} = x_k + \alpha_k d_k + \beta_k z_k, \tag{5}$$
with $d_k$ and $z_k$ now being two search directions summarizing different information on the function $f(x)$, and $\alpha_k$ and $\beta_k$ being stepsizes. In particular:
  • when $z_k \equiv 0$ (or $\beta_k \equiv 0$ for any $k$), then $d_k$ represents a Newton-type direction, being typically computed by approximately solving Newton's equation $\nabla^2 f(x_k)\, d = -\nabla f(x_k)$ at the current iterate $x_k$. Then, an Armijo-type linesearch procedure (see the minimal backtracking sketch after this list) is applied along $d_k$ to compute $\alpha_k$, provided that $d_k$ is gradient-related (see, e.g., [3]) at $x_k$;
  • when $z_k \ne 0$, then $d_k$ represents a Newton-type direction again, while $z_k$ is typically a negative curvature direction for $f(x)$ at $x_k$, which approximates an eigenvector associated with the least negative eigenvalue of $\nabla^2 f(x_k)$. The vector $z_k$ plays an essential role when LBMs' convergence to stationary points satisfying the second-order necessary optimality conditions needs to be proved. In the last case, the computation of the steplengths $\alpha_k$ and $\beta_k$ is often carried out at once (as in curvilinear linesearch procedures; see [24]), or the steplength computation is carried out by pursuing independent tasks (see, for example, [25]). We highlight that in (5), when both $d_k \ne 0$ and $z_k \ne 0$, we may experience difficulties related to properly scaling the two search directions.
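A minimal backtracking version of the Armijo rule referenced above (our sketch; f and grad_f are callables for the objective and its gradient, while sigma and rho are the usual Armijo parameters, here with illustrative defaults):

def armijo_linesearch(f, grad_f, x, v, alpha0=1.0, sigma=1e-4, rho=0.5, max_iter=60):
    # Accept the first steplength alpha = alpha0 * rho^j satisfying
    #   f(x + alpha v) <= f(x) + sigma * alpha * grad_f(x)^T v,
    # assuming v is a descent (gradient-related) direction at x.
    fx = f(x)
    slope = grad_f(x) @ v          # negative for a descent direction
    alpha = alpha0
    for _ in range(max_iter):
        if f(x + alpha * v) <= fx + sigma * alpha * slope:
            break
        alpha *= rho
    return alpha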
As a general class of efficient algorithms within LBMs for large-scale problems, we find Truncated Newton methods (TNMs) coupled with a linesearch procedure (see Table 1). Similarly to general TRMs, they are evidently based on possibly computing $d_k$ and $z_k$ after exploiting the second-order Taylor's expansion of $f(x)$ at $x_k$. However, a couple of quite disappointing issues arise when applying linesearch-based TNMs, namely:
  • Unlike trust-region based TNMs, at the iterate $x_k$, the search for a stationary point of a quadratic polynomial model of $f(x)$ (i.e., Newton's equation) is performed on $\mathbb{R}^n$, so that the quadratic expansion is not trusted on a more reliable compact subset (trust region) of $\mathbb{R}^n$. Thus, the search direction $d_k$ might show poor performance when the iterates in the sequence $\{x_k\}$ are far from a stationary limit point $x^*$. More specifically, note that in case $\nabla^2 f(x_k) \succ 0$, then solving Newton's equation and the trust-region sub-problem
$$\min_{d}\ q_k(d) = f(x_k) + \nabla f(x_k)^T d + \tfrac{1}{2} d^T \nabla^2 f(x_k) d, \qquad \|d\|_2 \le \gamma_k,$$
    for any $\gamma_k \ge \|[\nabla^2 f(x_k)]^{+} \nabla f(x_k)\|_2$ yields the same solutions. Conversely, when $\nabla^2 f(x_k)$ is indefinite, then Newton's equation provides a saddle point for $q_k(d)$, which might be interpreted as a solution to a trust-region sub-problem (the interested reader may consider the paper [26] for some extensions). Furthermore, from this perspective, we remark that in LBMs, solving (2) with $d_k = -\nabla f(x_k)$, $z_k \equiv 0$, and $\varepsilon_1 = \varepsilon_2 = \varepsilon_3 = 0$ is to a large extent equivalent to computing the Cauchy step when solving (4). Indeed, in the last case, the trust-region constraint in (4) can, in principle, be equivalently replaced by the compact feasible set (box constraints) in (2), after setting $\varepsilon_1 = \varepsilon_2 = \varepsilon_3 = 0$. On the other hand, in case $\nabla^2 f(x_k) \succ 0$, and setting in (2) $d_k = -\nabla f(x_k)$ and $z_k = -[\nabla^2 f(x_k)]^{+} \nabla f(x_k)$, along with $\varepsilon_1 = \varepsilon_2 = \varepsilon_3 = 0$, then, with similar reasoning, the solution to (2) closely resembles the application of the dogleg method when solving (4). Finally, since the coefficients $a_1, a_2$ in (2) may have negative values, we may potentially reverse the directions $d_k$ and $z_k$ when solving (2). Thus, following the idea behind (3), the scheme (2) suggests that, in case $\nabla^2 f(x_k)$ is also indefinite, it easily generalizes the proposals in [27]. In fact, following (3), we are able to exactly compute a global minimum $(\alpha^*, \beta^*)$ for (2), regardless of the signature of $Q$, so that the resulting direction $\alpha^* d_k + \beta^* z_k$ is gradient-related at $x_k$.
  • As in (5), the search directions $d_k$ and $z_k$ might be suitably combined in a curvilinear framework (see, for example, [24]). However, to our knowledge, the selection of $\alpha_k$ and $\beta_k$ in the literature is seldom performed by separately assessing $\alpha_k$ and $\beta_k$; i.e., $\alpha_k$ and $\beta_k$ are rarely chosen as independent parameters. Hence, in the literature of linesearch-based TNMs, the linesearch procedure that starts from $x_k$ and yields $x_{k+1}$ explores a one-dimensional manifold (a regular curve), rather than considering $x_k + \alpha d_k + \beta z_k$ as a two-dimensional manifold with independent real coefficients $\alpha$ and $\beta$.
In this regard, using (2) within LBMs tends to partially compensate for the drawbacks in the last two items, in light of the great success that TRMs have achieved in the last decade. In particular, by using (2) within linesearch-based TNMs, our aim is to develop a simple tool which could possibly carry out the following:
  • Combine at the iterate $x_k$ two independently computed vectors, namely $d_k, z_k \in \mathbb{R}^n$, by exactly computing a global minimum (we recall that, conversely, a global solution of the trust-region sub-problem (4) is often only approximately computed) for the two-dimensional constrained problem (2), being $\bar{x} \equiv x_k$, $d \equiv d_k$, $z \equiv z_k$;
  • Adaptively update the parameters $a_1, a_2, b_1, b_2$ in (2) when the iterate $x_k$ changes, following the rationale behind the update of $\Delta_k$ in (4), and retaining the strong convergence properties of TRMs. This fact is of remarkable interest, since in (2) the information associated with the search directions $d_k$ and $z_k$ is suitably trusted in a compact subset of $\mathbb{R}^n$ (namely, the box constraints $a_1 \le \alpha \le b_1$, $a_2 \le \beta \le b_2$);
  • Exactly compute a cheap global minimum $(\alpha^*, \beta^*)$ for (2), so that the vector $\alpha^* d_k + \beta^* z_k$ is then provided to a standard linesearch procedure such as the Armijo rule, to ensure that the global convergence of the sequence $\{x_k\}$ to stationary (limit) points is preserved;
  • Allow for the convergence of subsequences of the iterates $\{x_k\}$ to stationary limit points, where either first- or second-order necessary optimality conditions are fulfilled;
  • Preserve generality within a wide range of optimization frameworks, as reported in the next Section 4;
  • Combine the effects of $d_k$ and $z_k$, skipping all the drawbacks related to a possible different scaling between these directions. We recall that, since $d_k$ and $z_k$ are generated through the application of different methods, the comparison of their performances may be biased by those generating methods.
The TNMs sketched in Table 2, Table 3 and Table 4 are only examples of proposals in light of the previous comments.

4. How General Is the Model (2) in Nonlinear Programming Frameworks?

This section is devoted to reporting a number of real constrained optimization schemes from Nonlinear Programming whose formulation is encompassed in (2). We can see that, for some of the following schemes (see Figure 2), more than one reformulation within the framework (2) can be considered.

4.1. Minimization over a Bounded Simplex

We consider the problem of minimizing a quadratic functional over the simplex $S \subset \mathbb{R}^n$, i.e.,
$$\min_{x}\ \tfrac{1}{2} x^T Q x + b^T x + c, \qquad x \in S, \tag{6}$$
where $S = \{x \in \mathbb{R}^n : x = \sum_{i=1}^{3} \lambda_i x_i,\ \sum_{i=1}^{3} \lambda_i = 1,\ \lambda_i \ge 0,\ i = 1, 2, 3\}$. Figure 2a reports an example of a simplex. In this regard, by simply setting in (2)
  • $d = x_2 - x_1$, $z = x_3 - x_1$,
  • $\bar{x} = x_1$,
  • $a_1 = 0$, $b_1 = 1$, $a_2 = 0$, $b_2 = 1$,
  • $\epsilon_1 = \epsilon_2 = \epsilon_3 = 1$,
the problem (6) is a special case of the problem (2).
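Reusing the illustrative SubProblem2 container sketched in Section 1, the mapping just described reads as follows (our code):

def simplex_to_subproblem(Q, b, c, x1, x2, x3):
    # Map the simplex problem (6), with vertices x1, x2, x3, onto the data
    # of (2), following Section 4.1.
    return SubProblem2(Q=Q, b=b, c=c, x_bar=x1,
                       d=x2 - x1, z=x3 - x1,
                       a1=0.0, b1=1.0, a2=0.0, b2=1.0,
                       eps1=1.0, eps2=1.0, eps3=1.0)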

4.2. Minimization over a Bounded Polygon

We consider the problem of minimizing a quadratic functional over a polygon $P \subset \mathbb{R}^2$ described by a finite number $m$ of vertices (observe that the points in the polygon $P$ must belong to a hyperplane $\pi \subset \mathbb{R}^n$, with $\pi : \omega^T x + \omega_0 = 0$, $\omega = (\omega_1, \dots, \omega_n)^T \in \mathbb{R}^n$, $\omega_0 \in \mathbb{R}$, so that $\omega^T \bar{x} + \omega_0 = 0$ for any $\bar{x} \in P$), i.e.,
$$\min_{x}\ \tfrac{1}{2} x^T Q x + b^T x + c, \qquad x \in P, \tag{7}$$
where $P = \{x \in \mathbb{R}^2 : x = \sum_{i=1}^{m} \lambda_i x_i,\ \sum_{i=1}^{m} \lambda_i = 1,\ \lambda_i \ge 0,\ i = 1, \dots, m\}$. Figure 2b reports an example of a polygon with $m = 5$. In this regard, the problem (7) can be split into the solution of the $(m-2)$ sub-problems
$$\min_{x}\ \tfrac{1}{2} x^T Q x + b^T x + c, \qquad x \in S_i, \quad i = 1, \dots, m-2, \tag{8}$$
where
$$S_i = \left\{ x \in \pi : x = \sum_{j \in \{1, i+1, i+2\}} \lambda_j x_j,\ \sum_{j \in \{1, i+1, i+2\}} \lambda_j = 1,\ \lambda_j \ge 0,\ j \in \{1, i+1, i+2\} \right\},$$
which are of the form (6). Thus, solving the problem (7) corresponds to solving a sequence of $(m-2)$ instances of the problem (2).
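The fan-like splitting of (7) into the triangles $S_i$ can be sketched as follows (our code, on top of the previous illustrative helper):

def polygon_to_subproblems(Q, b, c, vertices):
    # Split the polygon problem (7) into the (m-2) simplex instances (8),
    # fanning triangles out of vertices[0], as in Section 4.2.
    m = len(vertices)
    return [simplex_to_subproblem(Q, b, c,
                                  vertices[0], vertices[i + 1], vertices[i + 2])
            for i in range(m - 2)]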

4.3. Minimization over a Bounded Segment

We consider the problem of minimizing a quadratic functional over a segment $L \subset \mathbb{R}^n$, i.e.,
$$\min_{x}\ \tfrac{1}{2} x^T Q x + b^T x + c, \qquad x \in L, \tag{9}$$
where $L = \{x \in \mathbb{R}^n : x = \lambda x_1 + (1 - \lambda) x_2,\ \lambda \in [0, 1]\}$. Figure 2c reports an example of a segment. In this regard, by simply setting in (2)
  • $d = x_2 - x_1$, $z \equiv 0$,
  • $\bar{x} = x_1$,
  • $a_1 = 0$, $b_1 = 1$, $a_2 = 0$, $b_2 = 0$,
  • $\epsilon_1 = \epsilon_2 = \epsilon_3 = 0$,
the problem (9) is a special case of the problem (2).

4.4. Minimization over a Bounded Box in $\mathbb{R}^2$

We consider the problem of minimizing a quadratic functional over a box domain $D \subset \mathbb{R}^2$, i.e.,
$$\min_{x}\ \tfrac{1}{2} x^T Q x + b^T x + c, \qquad x \in D, \tag{10}$$
where $D = \{x \in \mathbb{R}^2 : c_i \le x_i \le e_i,\ i = 1, 2\}$. Figure 2d reports an example of a box domain. In this regard, by simply setting in (2)
  • $d = (e_1 - c_1, 0)^T$, $z = (0, e_2 - c_2)^T$,
  • $\bar{x} = (c_1, c_2)^T$,
  • $a_1 = 0$, $b_1 = 1$, $a_2 = 0$, $b_2 = 1$,
  • $\epsilon_1 = \epsilon_2 = \epsilon_3 = 0$,
the problem (10) is a special case of the problem (2). As an alternative to the previous setting, we might also consider treating this case with a setting in (2) given by
  • $d = ((e_1 - c_1)/2, 0)^T$, $z = (0, (e_2 - c_2)/2)^T$,
  • $\bar{x} = ((e_1 + c_1)/2, (e_2 + c_2)/2)^T$,
  • $a_1 = -1$, $b_1 = 1$, $a_2 = -1$, $b_2 = 1$,
  • $\epsilon_1 = \epsilon_2 = \epsilon_3 = 0$.
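The second (centered) parameterization, in the same illustrative style as before:

import numpy as np

def box_to_subproblem(Q, b, c, c1, e1, c2, e2):
    # Map the box problem (10) onto (2) via the centered parameterization
    # of Section 4.4 (second alternative).
    return SubProblem2(Q=Q, b=b, c=c,
                       x_bar=np.array([(e1 + c1) / 2.0, (e2 + c2) / 2.0]),
                       d=np.array([(e1 - c1) / 2.0, 0.0]),
                       z=np.array([0.0, (e2 - c2) / 2.0]),
                       a1=-1.0, b1=1.0, a2=-1.0, b2=1.0,
                       eps1=0.0, eps2=0.0, eps3=0.0)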

4.5. Minimization Including a 1-Norm Inequality Constraint in $\mathbb{R}^2$

We consider the problem of minimizing a quadratic functional subject to the 1-norm inequality constraint $x \in N$, with $N \subset \mathbb{R}^2$, i.e.,
$$\min_{x}\ \tfrac{1}{2} x^T Q x + b^T x + c, \qquad x \in N, \tag{11}$$
where $N = \{x \in \mathbb{R}^2 : \|x\|_1 \le a\}$. Figure 2e reports an example of such a constraint. In this regard, it suffices to recast (11) as in (8), where
  • $m = 6$,
  • $\bar{x} = x_1 = 0$,
  • $x_2 = (0, a)^T$, $x_3 = (a, 0)^T$, $x_4 = (0, -a)^T$, $x_5 = (-a, 0)^T$, $x_6 = x_2$,
so that four instances of the problem (2) need to be solved.

4.6. Minimization Including an $\infty$-Norm Inequality Constraint in $\mathbb{R}^2$

We consider the problem of minimizing a quadratic functional subject to the $\infty$-norm inequality constraint $x \in E$, with $E \subset \mathbb{R}^2$, i.e.,
$$\min_{x}\ \tfrac{1}{2} x^T Q x + b^T x + c, \qquad x \in E, \tag{12}$$
where $E = \{x \in \mathbb{R}^2 : \|x\|_\infty \le a\}$. In this regard, we obtain similar results with respect to Section 4.5. Indeed, by simply setting in (2)
  • $d = (a, 0)^T$, $z = (0, a)^T$,
  • $\bar{x} = 0$,
  • $a_1 = -1$, $b_1 = 1$, $a_2 = -1$, $b_2 = 1$,
  • $\epsilon_1 = \epsilon_2 = \epsilon_3 = 0$,
the problem (12) is a special case of the problem (2).

4.7. Minimization Including a 2-Norm Inequality Constraint in $\mathbb{R}^2$

We consider the problem of minimizing a quadratic functional in $\mathbb{R}^2$ subject to the 2-norm inequality constraint $\|x\|_2 \le \gamma$, with $\gamma \ge 0$, i.e.,
$$\min_{x}\ \tfrac{1}{2} x^T Q x + b^T x + c, \qquad \|x\|_2 \le \gamma, \quad x \in \mathbb{R}^2. \tag{13}$$
In this regard, it suffices to observe that the solution of (2) provides both
  • a LOWER bound to the solution of (13), as long as we set (following Section 4.6)
    - $d = (\gamma, 0)^T$, $z = (0, \gamma)^T$,
    - $\bar{x} = 0$,
    - $a_1 = -1$, $b_1 = 1$, $a_2 = -1$, $b_2 = 1$,
    - $\epsilon_1 = \epsilon_2 = \epsilon_3 = 0$;
  • an UPPER bound to the solution of (13), as long as we follow the indications in Section 4.5; i.e., we recast and solve (11) as in (8), where
    - $m = 6$,
    - $\bar{x} = x_1 = 0$,
    - $x_2 = (0, \gamma)^T$, $x_3 = (\gamma, 0)^T$, $x_4 = (0, -\gamma)^T$, $x_5 = (-\gamma, 0)^T$, $x_6 = x_2$,
    so that four instances of the problem (2) need to be solved.
Note that dogleg-like methods for the approximate solution of the trust-region problem (4), in the convex case, equivalently solve the sub-problem (13) with just a couple of unknowns.
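A sketch of the resulting bracketing of the optimal value of (13), assuming a routine solve_subproblem2 that returns the optimal value of an instance of (2) (e.g., via the KKT-based procedure of Section 5); the circumscribed $\infty$-norm box yields the lower bound and the inscribed 1-norm diamond the upper bound:

import numpy as np

def two_norm_bounds(Q, b, c, gamma, solve_subproblem2):
    # Lower bound: one instance over the box ||x||_inf <= gamma, which
    # contains the 2-norm disk (Section 4.6 setting).
    box = SubProblem2(Q=Q, b=b, c=c, x_bar=np.zeros(2),
                      d=np.array([gamma, 0.0]), z=np.array([0.0, gamma]),
                      a1=-1.0, b1=1.0, a2=-1.0, b2=1.0,
                      eps1=0.0, eps2=0.0, eps3=0.0)
    lower = solve_subproblem2(box)
    # Upper bound: four simplex instances fanning the inscribed 1-norm
    # diamond of radius gamma (Section 4.5 setting, m = 6).
    diamond = [np.zeros(2),
               np.array([0.0, gamma]), np.array([gamma, 0.0]),
               np.array([0.0, -gamma]), np.array([-gamma, 0.0]),
               np.array([0.0, gamma])]
    upper = min(solve_subproblem2(s)
                for s in polygon_to_subproblems(Q, b, c, diamond))
    return lower, upper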

5. KKT Conditions and the Fast Solution of Problem (2)

Replacing the expression of the vector $x$ in (2) within the objective function, we easily obtain the equivalent problem
$$\min_{\alpha, \beta}\ \varphi(\alpha, \beta) \qquad P:\ \begin{cases} a_1 \le \alpha \le b_1 \\ a_2 \le \beta \le b_2 \\ \epsilon_1 \alpha + \epsilon_2 \beta \le \epsilon_3, \end{cases} \tag{14}$$
where
$$\varphi(\alpha, \beta) = \frac{1}{2} \begin{pmatrix} \alpha \\ \beta \end{pmatrix}^T \begin{pmatrix} t & u \\ u & w \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} + \begin{pmatrix} y \\ h \end{pmatrix}^T \begin{pmatrix} \alpha \\ \beta \end{pmatrix} + q, \qquad \begin{cases} t = d^T Q d \\ u = d^T Q z = z^T Q d \\ w = z^T Q z \\ y = (Q\bar{x} + b)^T d \\ h = (Q\bar{x} + b)^T z \\ q = \left(\tfrac{1}{2} Q \bar{x} + b\right)^T \bar{x} + c. \end{cases}$$
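A direct transcription of these formulas (our sketch; g_bar denotes $Q\bar{x} + b$, which in the TNM setting is typically already available as the model gradient at $\bar{x}$):

import numpy as np

def reduce_to_2d(Q, b, c, x_bar, d, z):
    # Coefficients t, u, w, y, h, q of the reduced problem (14).
    Qd = Q @ d                     # first additional matrix-vector product
    Qz = Q @ z                     # second additional matrix-vector product
    g_bar = Q @ x_bar + b          # often already at hand in TNM frameworks
    t = d @ Qd
    u = d @ Qz                     # equals z^T Q d for symmetric Q
    w = z @ Qz
    y = g_bar @ d
    h = g_bar @ z
    q = 0.5 * x_bar @ (g_bar - b) + b @ x_bar + c   # value of the quadratic at x_bar
    return t, u, w, y, h, q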
Observe that transforming (2) into (14) only requires the computation of two additional matrix-vector products (i.e., $Qd$ and $Qz$), along with six inner products. The problem (14) is a constrained quadratic problem whose first-order optimality conditions do not require additional constraint qualifications (since all the constraints are linear). Thus, after considering its Lagrangian function
$$L(\alpha, \beta, \mu_1, \mu_2, \mu_3, \mu_4, \mu_5) = \varphi(\alpha, \beta) - \mu_1 (\alpha - a_1) + \mu_2 (\alpha - b_1) - \mu_3 (\beta - a_2) + \mu_4 (\beta - b_2) + \mu_5 (\epsilon_1 \alpha + \epsilon_2 \beta - \epsilon_3),$$
we have the next set of equalities/inequalities representing the associated KKT conditions:
$$\begin{cases} \begin{pmatrix} t & u \\ u & w \end{pmatrix} \begin{pmatrix} \alpha^* \\ \beta^* \end{pmatrix} + \begin{pmatrix} y \\ h \end{pmatrix} + \begin{pmatrix} -\mu_1^* + \mu_2^* + \epsilon_1 \mu_5^* \\ -\mu_3^* + \mu_4^* + \epsilon_2 \mu_5^* \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \\ (\alpha^*, \beta^*) \in P \\ \mu_1^* [\alpha^* - a_1] = 0 \\ \mu_2^* [\alpha^* - b_1] = 0 \\ \mu_3^* [\beta^* - a_2] = 0 \\ \mu_4^* [\beta^* - b_2] = 0 \\ \mu_5^* [\epsilon_1 \alpha^* + \epsilon_2 \beta^* - \epsilon_3] = 0 \\ \mu_i^* \ge 0, \quad i = 1, \dots, 5. \end{cases} \tag{15}$$
The remaining part of the present section is devoted to analyzing all the possible solutions of (15), with the aim of computing a global minimum for (2). In this regard, the analysis of the solutions of (15) reduces to the cases (I)–(XII) in Figure 3.
Observing that in (15) the multipliers $\mu_i^*$, $i = 1, \dots, 5$, must fulfill nonnegativity conditions, it is not difficult to realize that computing all the KKT points satisfying (15) can turn out to be a burdensome task, including a number of sub-cases depending on the possible combinations of signs of the parameters $a_1, b_1, a_2, b_2, \epsilon_1, \epsilon_2$ and $\epsilon_3$. Conversely, a global minimizer for (2) can be equivalently computed by analyzing all the possible solutions of (15) uniquely in terms of $\alpha^*$ and $\beta^*$, without also requiring the computation of the multipliers. Hence, we limit our analysis to the computation of $\alpha^*$ and $\beta^*$ in the cases (I)–(XII) of Figure 3, where
  • Cases (I), (II), (III), (IV) are associated with possible solutions at the vertices of the box constraints;
  • Cases (V), (VI), (VII), (VIII) are associated with possible solutions on the edges of the box constraints;
  • Case (IX) represents a possible feasible unconstrained minimizer for the objective function in (2);
  • Cases (X), (XI), (XII) are associated with possible solutions making the last inequality constraint in (14) active.
Then, in Lemma 2, we will provide a simple theoretical result which justifies our simplification with respect to computing all the KKT points. In this regard, we preliminarily set $i = 1$ and consider the next cases from Figure 3, with $\{y_i\}$ being the sequence of tentative solution points of (14):
  • Case (I): We set $\bar{\alpha} = b_1$, $\bar{\beta} = b_2$. If $\epsilon_1 \bar{\alpha} + \epsilon_2 \bar{\beta} \le \epsilon_3$, then set
$$P_1 = (b_1, b_2)^T, \quad \varphi_i = \varphi(b_1, b_2), \quad y_i = P_1, \quad i = i + 1; \tag{16}$$
  • Case (II): We set $\bar{\alpha} = a_1$, $\bar{\beta} = b_2$. If $\epsilon_1 \bar{\alpha} + \epsilon_2 \bar{\beta} \le \epsilon_3$, then set
$$P_2 = (a_1, b_2)^T, \quad \varphi_i = \varphi(a_1, b_2), \quad y_i = P_2, \quad i = i + 1; \tag{17}$$
  • Case (III): We set $\bar{\alpha} = b_1$, $\bar{\beta} = a_2$. If $\epsilon_1 \bar{\alpha} + \epsilon_2 \bar{\beta} \le \epsilon_3$, then set
$$P_3 = (b_1, a_2)^T, \quad \varphi_i = \varphi(b_1, a_2), \quad y_i = P_3, \quad i = i + 1; \tag{18}$$
  • Case (IV): We set $\bar{\alpha} = a_1$, $\bar{\beta} = a_2$. If $\epsilon_1 \bar{\alpha} + \epsilon_2 \bar{\beta} \le \epsilon_3$, then set
$$P_4 = (a_1, a_2)^T, \quad \varphi_i = \varphi(a_1, a_2), \quad y_i = P_4, \quad i = i + 1; \tag{19}$$
  • Case (V): We set $\bar{\alpha} = b_1$ and possibly compute the solution $\bar{\beta} = -(u b_1 + h)/w$ of the equation
$$\frac{d\varphi(b_1, \beta)}{d\beta} = w\beta + u b_1 + h = 0,$$
    so that:
    - if $w \ne 0$ AND $\bar{\beta} \in [a_2, b_2]$ AND $\epsilon_1 \bar{\alpha} + \epsilon_2 \bar{\beta} \le \epsilon_3$, then set
$$P_5 = (\bar{\alpha}, \bar{\beta})^T, \quad \varphi_i = \varphi(\bar{\alpha}, \bar{\beta}), \quad y_i = P_5, \quad i = i + 1; \tag{20}$$
    - if $w = 0$ AND $u b_1 + h \ne 0$, then there is no solution for Case (V);
    - if $w = 0$ AND $u b_1 + h = 0$, then set $\bar{\beta} \in [a_2, b_2]$ as any value satisfying $\epsilon_1 \bar{\alpha} + \epsilon_2 \bar{\beta} \le \epsilon_3$, and compute $P_5$ as in (20);
  • Case (VI): We set $\bar{\beta} = a_2$ and possibly compute the solution $\bar{\alpha} = -(u a_2 + y)/t$ of the equation
$$\frac{d\varphi(\alpha, a_2)}{d\alpha} = t\alpha + u a_2 + y = 0,$$
    so that:
    - if $t \ne 0$ AND $\bar{\alpha} \in [a_1, b_1]$ AND $\epsilon_1 \bar{\alpha} + \epsilon_2 \bar{\beta} \le \epsilon_3$, then set
$$P_6 = (\bar{\alpha}, \bar{\beta})^T, \quad \varphi_i = \varphi(\bar{\alpha}, \bar{\beta}), \quad y_i = P_6, \quad i = i + 1; \tag{21}$$
    - if $t = 0$ AND $u a_2 + y \ne 0$, then there is no solution for Case (VI);
    - if $t = 0$ AND $u a_2 + y = 0$, then set $\bar{\alpha} \in [a_1, b_1]$ as any value satisfying $\epsilon_1 \bar{\alpha} + \epsilon_2 \bar{\beta} \le \epsilon_3$, and compute $P_6$ as in (21);
  • Case (VII): We set $\bar{\alpha} = a_1$ and possibly compute the solution $\bar{\beta} = -(u a_1 + h)/w$ of the equation
$$\frac{d\varphi(a_1, \beta)}{d\beta} = w\beta + u a_1 + h = 0,$$
    so that:
    - if $w \ne 0$ AND $\bar{\beta} \in [a_2, b_2]$ AND $\epsilon_1 \bar{\alpha} + \epsilon_2 \bar{\beta} \le \epsilon_3$, then set
$$P_7 = (\bar{\alpha}, \bar{\beta})^T, \quad \varphi_i = \varphi(\bar{\alpha}, \bar{\beta}), \quad y_i = P_7, \quad i = i + 1; \tag{22}$$
    - if $w = 0$ AND $u a_1 + h \ne 0$, then there is no solution for Case (VII);
    - if $w = 0$ AND $u a_1 + h = 0$, then set $\bar{\beta} \in [a_2, b_2]$ as any value satisfying $\epsilon_1 \bar{\alpha} + \epsilon_2 \bar{\beta} \le \epsilon_3$, and compute $P_7$ as in (22);
  • Case (VIII): We set $\bar{\beta} = b_2$ and possibly compute the solution $\bar{\alpha} = -(u b_2 + y)/t$ of the equation
$$\frac{d\varphi(\alpha, b_2)}{d\alpha} = t\alpha + u b_2 + y = 0,$$
    so that:
    - if $t \ne 0$ AND $\bar{\alpha} \in [a_1, b_1]$ AND $\epsilon_1 \bar{\alpha} + \epsilon_2 \bar{\beta} \le \epsilon_3$, then set
$$P_8 = (\bar{\alpha}, \bar{\beta})^T, \quad \varphi_i = \varphi(\bar{\alpha}, \bar{\beta}), \quad y_i = P_8, \quad i = i + 1; \tag{23}$$
    - if $t = 0$ AND $u b_2 + y \ne 0$, then there is no solution for Case (VIII);
    - if $t = 0$ AND $u b_2 + y = 0$, then set $\bar{\alpha} \in [a_1, b_1]$ as any value satisfying $\epsilon_1 \bar{\alpha} + \epsilon_2 \bar{\beta} \le \epsilon_3$, and compute $P_8$ as in (23);
  • Case (IX): If $tw - u^2 \ne 0$, we compute the solution
$$\begin{pmatrix} \bar{\alpha} \\ \bar{\beta} \end{pmatrix} = -\begin{pmatrix} t & u \\ u & w \end{pmatrix}^{-1} \begin{pmatrix} y \\ h \end{pmatrix}$$
    of the linear system
$$\begin{cases} \varphi_\alpha(\alpha, \beta) = t\alpha + u\beta + y = 0 \\ \varphi_\beta(\alpha, \beta) = u\alpha + w\beta + h = 0; \end{cases}$$
    otherwise, in case $tw - u^2 = 0$ AND [$(th - uy \ne 0)$ OR $(uh - wy \ne 0)$], there is no solution for Case (IX);
    otherwise, in case $tw - u^2 = 0$ AND $(th - uy = 0)$ AND $(uh - wy = 0)$, we have three sub-cases:
    • $t > 0$: recalling that we are in the sub-case where the equations $\varphi_\alpha(\alpha, \beta) = 0$ and $\varphi_\beta(\alpha, \beta) = 0$ yield the same information, we exploit the equation $\varphi_\alpha(\alpha, \beta) = 0$ and set $\alpha = -(u\beta + y)/t$. Thus, from the bounds and the last inequality in (14), we obtain
$$a_2 \le \beta \le b_2, \qquad (\varepsilon_2 t - \varepsilon_1 u)\,\beta \le \varepsilon_3 t + \varepsilon_1 y, \qquad -(b_1 t + y) \le u\beta \le -(a_1 t + y),$$
      which yield the next three cases:
      - $\varepsilon_2 t - \varepsilon_1 u > 0$: admitting three further sub-cases, namely
        $u > 0$, so that we set
$$\beta_1 = \max\left\{ a_2,\ -\frac{b_1 t + y}{u} \right\} \le \beta \le \min\left\{ b_2,\ \frac{\varepsilon_3 t + \varepsilon_1 y}{\varepsilon_2 t - \varepsilon_1 u},\ -\frac{a_1 t + y}{u} \right\} = \beta_2$$
        $u = 0$, so that we set
$$\beta_1 = a_2 \le \beta \le \min\left\{ b_2,\ \frac{\varepsilon_3 t + \varepsilon_1 y}{\varepsilon_2 t} \right\} = \beta_2$$
        $u < 0$, so that we set
$$\beta_1 = \max\left\{ a_2,\ -\frac{a_1 t + y}{u} \right\} \le \beta \le \min\left\{ b_2,\ \frac{\varepsilon_3 t + \varepsilon_1 y}{\varepsilon_2 t - \varepsilon_1 u},\ -\frac{b_1 t + y}{u} \right\} = \beta_2$$
      - $\varepsilon_2 t - \varepsilon_1 u = 0$: admitting no solution for Case (IX) as long as the condition $\varepsilon_3 t + \varepsilon_1 y < 0$ holds. Conversely, in case $\varepsilon_3 t + \varepsilon_1 y \ge 0$, we have the three cases:
        $u > 0$, so that we set
$$\beta_1 = \max\left\{ a_2,\ -\frac{b_1 t + y}{u} \right\} \le \beta \le \min\left\{ b_2,\ -\frac{a_1 t + y}{u} \right\} = \beta_2$$
        $u = 0$, so that we set
$$\beta_1 = a_2 \le \beta \le b_2 = \beta_2$$
        $u < 0$, so that we set
$$\beta_1 = \max\left\{ a_2,\ -\frac{a_1 t + y}{u} \right\} \le \beta \le \min\left\{ b_2,\ -\frac{b_1 t + y}{u} \right\} = \beta_2$$
      - $\varepsilon_2 t - \varepsilon_1 u < 0$: corresponding to the three cases:
        $u > 0$, so that we set
$$\beta_1 = \max\left\{ a_2,\ -\frac{b_1 t + y}{u},\ \frac{\varepsilon_3 t + \varepsilon_1 y}{\varepsilon_2 t - \varepsilon_1 u} \right\} \le \beta \le \min\left\{ b_2,\ -\frac{a_1 t + y}{u} \right\} = \beta_2$$
        $u = 0$, so that we set
$$\beta_1 = \max\left\{ a_2,\ \frac{\varepsilon_3 t + \varepsilon_1 y}{\varepsilon_2 t - \varepsilon_1 u} \right\} \le \beta \le b_2 = \beta_2$$
        $u < 0$, so that we set
$$\beta_1 = \max\left\{ a_2,\ -\frac{a_1 t + y}{u},\ \frac{\varepsilon_3 t + \varepsilon_1 y}{\varepsilon_2 t - \varepsilon_1 u} \right\} \le \beta \le \min\left\{ b_2,\ -\frac{b_1 t + y}{u} \right\} = \beta_2$$
    • $t = 0$: recalling that we are in the sub-case where the equations $\varphi_\alpha = 0$ and $\varphi_\beta = 0$ yield the same information, with $tw - u^2 = 0$ (hence $u = 0$), we exploit the equation $\varphi_\alpha(\alpha, \beta) = 0$ with $t = u = 0$. Therefore, we have
$$a_2 \le \beta \le b_2, \qquad y = 0, \qquad a_1 \le \alpha \le b_1,$$
      which yield the next two cases:
      - $y = 0$: this case implies that the objective function is constant (i.e., $\varphi(\alpha, \beta) = q$), so that we set
$$\beta_1 = a_2 \le \beta \le b_2 = \beta_2$$
      - $y \ne 0$: admitting no solution for Case (IX);
    • $t < 0$: recalling again that $\varphi_\alpha = 0$ and $\varphi_\beta = 0$ yield the same information, we exploit $\varphi_\alpha(\alpha, \beta) = 0$ and set $\alpha = -(u\beta + y)/t$. Thus, from the bounds and the last inequality in (14), we obtain
$$a_2 \le \beta \le b_2, \qquad (\varepsilon_2 t - \varepsilon_1 u)\,\beta \ge \varepsilon_3 t + \varepsilon_1 y, \qquad -(a_1 t + y) \le u\beta \le -(b_1 t + y),$$
      which yield the next three cases:
      - $\varepsilon_2 t - \varepsilon_1 u > 0$: admitting three further cases, namely
        $u > 0$, so that we set
$$\beta_1 = \max\left\{ a_2,\ -\frac{a_1 t + y}{u},\ \frac{\varepsilon_3 t + \varepsilon_1 y}{\varepsilon_2 t - \varepsilon_1 u} \right\} \le \beta \le \min\left\{ b_2,\ -\frac{b_1 t + y}{u} \right\} = \beta_2$$
        $u = 0$, so that $-(a_1 t + y) \le u\beta \le -(b_1 t + y)$ is always fulfilled and we set
$$\beta_1 = \max\left\{ a_2,\ \frac{\varepsilon_3 t + \varepsilon_1 y}{\varepsilon_2 t} \right\} \le \beta \le b_2 = \beta_2$$
        $u < 0$, so that we set
$$\beta_1 = \max\left\{ a_2,\ -\frac{b_1 t + y}{u},\ \frac{\varepsilon_3 t + \varepsilon_1 y}{\varepsilon_2 t - \varepsilon_1 u} \right\} \le \beta \le \min\left\{ b_2,\ -\frac{a_1 t + y}{u} \right\} = \beta_2$$
      - $\varepsilon_2 t - \varepsilon_1 u = 0$: admitting no solution for Case (IX) as long as the condition $\varepsilon_3 t + \varepsilon_1 y > 0$ holds. Conversely, in case $\varepsilon_3 t + \varepsilon_1 y \le 0$, we have the three cases:
        $u > 0$, so that we set
$$\beta_1 = \max\left\{ a_2,\ -\frac{a_1 t + y}{u} \right\} \le \beta \le \min\left\{ b_2,\ -\frac{b_1 t + y}{u} \right\} = \beta_2$$
        $u = 0$, so that we set
$$\beta_1 = a_2 \le \beta \le b_2 = \beta_2$$
        $u < 0$, so that we set
$$\beta_1 = \max\left\{ a_2,\ -\frac{b_1 t + y}{u} \right\} \le \beta \le \min\left\{ b_2,\ -\frac{a_1 t + y}{u} \right\} = \beta_2$$
      - $\varepsilon_2 t - \varepsilon_1 u < 0$: corresponding to the three cases:
        $u > 0$, so that we set
$$\beta_1 = \max\left\{ a_2,\ -\frac{a_1 t + y}{u} \right\} \le \beta \le \min\left\{ b_2,\ \frac{\varepsilon_3 t + \varepsilon_1 y}{\varepsilon_2 t - \varepsilon_1 u},\ -\frac{b_1 t + y}{u} \right\} = \beta_2$$
        $u = 0$, so that we set
$$\beta_1 = a_2 \le \beta \le \min\left\{ b_2,\ \frac{\varepsilon_3 t + \varepsilon_1 y}{\varepsilon_2 t - \varepsilon_1 u} \right\} = \beta_2$$
        $u < 0$, so that we set
$$\beta_1 = \max\left\{ a_2,\ -\frac{b_1 t + y}{u} \right\} \le \beta \le \min\left\{ b_2,\ \frac{\varepsilon_3 t + \varepsilon_1 y}{\varepsilon_2 t - \varepsilon_1 u},\ -\frac{a_1 t + y}{u} \right\} = \beta_2.$$
    Thus, overall, for Case (IX), if $\beta_1 \le \beta_2$, we set
$$\bar{\beta} = (\beta_1 + \beta_2)/2, \qquad \bar{\alpha} = \begin{cases} -(u\bar{\beta} + y)/t & \text{if } t \ne 0 \\ (a_1 + b_1)/2 & \text{if } t = 0, \end{cases}$$
    along with
$$P_9 = (\bar{\alpha}, \bar{\beta})^T, \quad \varphi_i = \varphi(\bar{\alpha}, \bar{\beta}), \quad y_i = P_9, \quad i = i + 1; \tag{24}$$
    otherwise, if $\beta_1 > \beta_2$, there is no solution for Case (IX);
  • Case (X): We set $\bar{\alpha} = a_1$, with $\varepsilon_1 a_1 + \varepsilon_2 \beta = \varepsilon_3$, and we distinguish among three cases:
    - if $\varepsilon_2 = 0$ AND $\varepsilon_3 = \varepsilon_1 a_1$, then set $\beta_1 = a_2 \le \beta \le b_2 = \beta_2$;
    - if $\varepsilon_2 = 0$ AND $\varepsilon_3 \ne \varepsilon_1 a_1$, then there is no solution for Case (X);
    - if $\varepsilon_2 \ne 0$, then set
$$\beta_1 = \max\left\{ a_2,\ \frac{\varepsilon_3 - \varepsilon_1 a_1}{\varepsilon_2} \right\} \le \beta \le \min\left\{ b_2,\ \frac{\varepsilon_3 - \varepsilon_1 a_1}{\varepsilon_2} \right\} = \beta_2.$$
    If $\beta_1 \le \beta_2$, set $\bar{\beta} = (\beta_1 + \beta_2)/2$ with
$$P_{10} = (a_1, \bar{\beta})^T, \quad \varphi_i = \varphi(a_1, \bar{\beta}), \quad y_i = P_{10}, \quad i = i + 1; \tag{25}$$
    otherwise, there is no solution for Case (X);
  • Case (XI): We distinguish among the next four cases:
    - if $\varepsilon_1 = \varepsilon_2 = 0$ AND $\varepsilon_3 = 0$, then set $\bar{\alpha} = (a_1 + b_1)/2$, $\beta_1 = a_2 \le \beta \le \beta_2 = b_2$; otherwise ($\varepsilon_3 \ne 0$), there is no solution for Case (XI);
    - if $\varepsilon_1 > 0$, then $\alpha = (\varepsilon_3 - \varepsilon_2 \beta)/\varepsilon_1$ and we analyze three sub-cases:
      • if $\varepsilon_2 > 0$, then set
$$\beta_1 = \max\left\{ a_2,\ \frac{\varepsilon_3 - \varepsilon_1 b_1}{\varepsilon_2} \right\} \le \beta \le \min\left\{ b_2,\ \frac{\varepsilon_3 - \varepsilon_1 a_1}{\varepsilon_2} \right\} = \beta_2;$$
      • if $\varepsilon_2 = 0$, then set $\beta_1 = a_2 \le \beta \le b_2 = \beta_2$;
      • if $\varepsilon_2 < 0$, then set
$$\beta_1 = \max\left\{ a_2,\ \frac{\varepsilon_3 - \varepsilon_1 a_1}{\varepsilon_2} \right\} \le \beta \le \min\left\{ b_2,\ \frac{\varepsilon_3 - \varepsilon_1 b_1}{\varepsilon_2} \right\} = \beta_2;$$
    - if $\varepsilon_1 = 0$ AND $\varepsilon_2 \ne 0$, then set $\bar{\beta} = \varepsilon_3/\varepsilon_2$, $\bar{\alpha} = (a_1 + b_1)/2$; if ($\bar{\beta} < a_2$ OR $\bar{\beta} > b_2$), then there is no solution for Case (XI);
    - if $\varepsilon_1 < 0$, then $\alpha = (\varepsilon_3 - \varepsilon_2 \beta)/\varepsilon_1$ and we analyze three sub-cases:
      • if $\varepsilon_2 > 0$, then set
$$\beta_1 = \max\left\{ a_2,\ \frac{\varepsilon_3 - \varepsilon_1 a_1}{\varepsilon_2} \right\} \le \beta \le \min\left\{ b_2,\ \frac{\varepsilon_3 - \varepsilon_1 b_1}{\varepsilon_2} \right\} = \beta_2;$$
      • if $\varepsilon_2 = 0$, then set $\beta_1 = a_2 \le \beta \le b_2 = \beta_2$;
      • if $\varepsilon_2 < 0$, then set
$$\beta_1 = \max\left\{ a_2,\ \frac{\varepsilon_3 - \varepsilon_1 b_1}{\varepsilon_2} \right\} \le \beta \le \min\left\{ b_2,\ \frac{\varepsilon_3 - \varepsilon_1 a_1}{\varepsilon_2} \right\} = \beta_2.$$
    When $\varepsilon_1 \ne 0$, set $\bar{\beta} = (\beta_1 + \beta_2)/2$ and $\bar{\alpha} = (\varepsilon_3 - \varepsilon_2 \bar{\beta})/\varepsilon_1$; if $a_1 \le \bar{\alpha} \le b_1$, then set
$$P_{11} = (\bar{\alpha}, \bar{\beta})^T, \quad \varphi_i = \varphi(\bar{\alpha}, \bar{\beta}), \quad y_i = P_{11}, \quad i = i + 1, \tag{26}$$
    otherwise, there is no solution for Case (XI);
  • Case (XII): We set $\bar{\beta} = a_2$, with $\varepsilon_1 \alpha + \varepsilon_2 a_2 = \varepsilon_3$, and we distinguish among three cases:
    - if $\varepsilon_1 = 0$ AND $\varepsilon_3 = \varepsilon_2 a_2$, then set $\bar{\alpha} = (a_1 + b_1)/2$;
    - if $\varepsilon_1 = 0$ AND $\varepsilon_3 \ne \varepsilon_2 a_2$, then there is no solution for Case (XII);
    - if $\varepsilon_1 \ne 0$, then set
$$\alpha_1 = \max\left\{ a_1,\ \frac{\varepsilon_3 - \varepsilon_2 a_2}{\varepsilon_1} \right\} \le \alpha \le \min\left\{ b_1,\ \frac{\varepsilon_3 - \varepsilon_2 a_2}{\varepsilon_1} \right\} = \alpha_2,$$
    and, if $\alpha_1 \le \alpha_2$, set $\bar{\alpha} = (\alpha_1 + \alpha_2)/2$. Finally, set
$$P_{12} = (\bar{\alpha}, a_2)^T, \quad \varphi_i = \varphi(\bar{\alpha}, a_2), \quad y_i = P_{12}, \quad i = i + 1. \tag{27}$$
The next lemma justifies the role of the above analysis in the computation of possible solutions of (14).
Lemma 2.
Given the problem (14), let the assumptions of Lemma 1 hold. Consider the sequence of the $m$ points $\{y_i\}$ and the sequence of the $m$ values $\{\varphi_i\}$ from (16)–(27), relabelled in nondecreasing order, so that for any index $i \ge 2$ we have
$$\varphi_{i-1} \le \varphi_i.$$
Then, if
$$\hat{\imath} \in \arg\min_{1 \le i \le m} \{\varphi_i\},$$
the point $y_{\hat{\imath}}$ is a global minimum for (14).
Proof of Lemma 2.
The existence of a global minimum $y^*$, with corresponding value $\varphi(y^*)$, for (14) is ensured by Lemma 1. Moreover, each global minimum of (14) naturally fulfills the KKT conditions, so that each global minimum must belong to the sequence $\{y_i\}$. Now, assume by contradiction that there exists a point $\tilde{y} \in \{y_i\}$, with $\tilde{y} \in \arg\min_{1 \le i \le m} \{\varphi_i\}$, which is not a global minimum, i.e., $\varphi(\tilde{y}) > \varphi(y^*)$. Since $y^*$ also belongs to $\{y_i\}$, this contradicts the fact that $\tilde{y}$ attains the least value among $\{\varphi_i\}$. □
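The whole procedure of this section can be condensed into the following numerical sketch (our code): it enumerates the same candidate families as Cases (I)–(XII), namely box vertices, box-edge stationary points, the unconstrained stationary point, and points on the active line $\epsilon_1 \alpha + \epsilon_2 \beta = \epsilon_3$, filters them by feasibility, and applies Lemma 2. The degenerate sub-cases (vanishing $t$, $u$, $w$, or $\epsilon_i$) are spelled out in full in the text and only partially handled here.

import numpy as np

def solve_reduced(t, u, w, y, h, q, a1, b1, a2, b2, e1, e2, e3, tol=1e-12):
    # Global minimizer of (14) by candidate enumeration, in the spirit of
    # Cases (I)-(XII) and Lemma 2 (condensed sketch).
    H = np.array([[t, u], [u, w]])
    g = np.array([y, h])

    def phi(p):
        return 0.5 * p @ (H @ p) + g @ p + q

    def feasible(p):
        return (a1 - tol <= p[0] <= b1 + tol and a2 - tol <= p[1] <= b2 + tol
                and e1 * p[0] + e2 * p[1] <= e3 + tol)

    cands = [np.array([aa, bb]) for aa in (a1, b1) for bb in (a2, b2)]  # (I)-(IV)
    if abs(w) > tol:                                   # edges alpha = a1, b1: (V), (VII)
        cands += [np.array([aa, -(u * aa + h) / w]) for aa in (a1, b1)]
    if abs(t) > tol:                                   # edges beta = a2, b2: (VI), (VIII)
        cands += [np.array([-(u * bb + y) / t, bb]) for bb in (a2, b2)]
    if abs(t * w - u * u) > tol:                       # unconstrained stationary point: (IX)
        cands.append(np.linalg.solve(H, -g))
    if abs(e1) > tol or abs(e2) > tol:                 # active line: (X)-(XII)
        p0 = (np.array([0.0, e3 / e2]) if abs(e2) > tol
              else np.array([e3 / e1, 0.0]))           # one point on the line
        v = np.array([e2, -e1])                        # direction of the line
        vHv = v @ (H @ v)
        if abs(vHv) > tol:                             # stationary point along the line
            s = -(v @ (H @ p0 + g)) / vHv
            cands.append(p0 + s * v)
        if abs(e2) > tol:                              # intersections with alpha = a1, b1
            cands += [np.array([aa, (e3 - e1 * aa) / e2]) for aa in (a1, b1)]
        if abs(e1) > tol:                              # intersections with beta = a2, b2
            cands += [np.array([(e3 - e2 * bb) / e1, bb]) for bb in (a2, b2)]

    feas = [p for p in cands if feasible(p)]           # Lemma 1 guarantees nonemptiness
    best = min(feas, key=phi)                          # Lemma 2: the least value wins
    return best, phi(best)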

6. Conclusions and Future Work

We have considered a very relevant issue within Nonlinear Programming, namely the solution of a specific constrained quadratic problem, whose exact global solution can be easily computed after analyzing the first-order KKT conditions associated with it. We also highlighted that our proposal may, to a large extent, suggest guidelines for research on novel LBMs, by drawing inspiration from TRMs. This last observation represents a promising tool for providing algorithms which guarantee global convergence to stationary limit points satisfying either first- or second-order necessary optimality conditions. In particular, we can summarize the following promising lines of research for large-scale methods which iteratively generate the sequences of points
$$x_{k+1} = x_k + \alpha_k d_k \quad \text{or} \quad x_{k+1} = x_k + \alpha_k d_k + \beta_k z_k \quad \text{for LBMs}, \qquad x_{k+1} = x_k + s_k \quad \text{for TRMs},$$
where $d_k$, $z_k$, and $s_k$ are search directions at the current iterate $x_k$:
  • Developing novel iterative LBMs (e.g., linesearch-based TNMs), where the search direction $d_k$ (e.g., a Newton-type direction) is possibly combined with another direction $z_k$ (e.g., the steepest descent direction at $x_k$, a negative curvature direction at $x_k$, etc.) through the use of (14). Then, comparing the efficiency of the novel methods with more standard linesearch-based approaches from the literature could give indications of the reliability of the ideas in this paper;
  • Developing novel hybrid methods where the rationale behind alternating trust-region and linesearch-based techniques is exploited. In particular, the iterative scheme $x_{k+1} = x_k + \alpha_k d_k + \beta_k z_k$ (respectively, $x_{k+1} = x_k + \alpha_k d_k$) might be considered, where the search directions $d_k$ and $z_k$, along with the steplengths $\alpha_k$ and $\beta_k$ (respectively, $d_k$ and $\alpha_k$), are alternately computed by solving
    • a trust-region sub-problem like (4), so that a sufficient reduction in the quadratic model is ensured, or
    • a sub-problem like (14), so that the solution $\alpha^* d_k + \beta^* z_k$ is a promising gradient-related direction to be used within a linesearch procedure,
    in order to preserve the global convergence to stationary points satisfying either first- or second-order necessary optimality conditions;
  • Specifically, comparing the use of dogleg methods (within TRMs) vs. the application of (14) coupled with a linesearch technique. This issue is tricky, since dogleg methods are applied to trust-region sub-problems like (4), including a general quadratic constraint (i.e., the trust-region constraint), while in (14) all the constraints are linear, so that the exact global solution of (14) is easily computed. Moreover, the last issue might also shed light on the opportunity of (possibly) privileging an efficient linesearch procedure applied to a coarsely computed gradient-related search direction, in place of a precise computation of the search direction in LBMs coupled with an inexpensive linesearch procedure. In other words, it is at present questionable whether coupling a coarse computation of the vectors $d_k$ and $z_k$ with an accurate linesearch procedure would be preferable to coupling accurately computed vectors $d_k$ and $z_k$ with a cheaper linesearch procedure;
  • Introducing nonmonotone stabilization techniques (see, e.g., [28]), combining nonmonotonicity with any of the above ideas, for both TRMs and LBMs.

Author Contributions

G.F., C.P. and M.R. have equally contributed to this paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Data are contained within the article.

Acknowledgments

Giovanni Fasano and Massimo Roma thank INdAM (Istituto Nazionale di Alta Matematica) for the support they received.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Ben-Tal, A.; Nemirovski, A. Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications. In MPS-SIAM Series on Optimization; SIAM: Philadelphia, PA, USA, 2001.
  2. Conn, A.R.; Gould, N.I.M.; Toint, P.L. Trust-Region Methods. In MPS-SIAM Series on Optimization; SIAM: Philadelphia, PA, USA, 2000.
  3. Nocedal, J.; Wright, S.J. Numerical Optimization, 2nd ed.; Springer: New York, NY, USA, 2006.
  4. Micchelli, C.A. Interpolation of scattered data: Distance matrices and conditionally positive definite functions. Constr. Approx. 1986, 2, 11–22.
  5. Myers, D.E. Kriging, cokriging, radial basis functions and the role of positive definiteness. Comput. Math. Appl. 1992, 24, 139–148.
  6. Conn, A.R.; Gould, N.I.M.; Sartenaer, A.; Toint, P.L. On Iterated-Subspace Minimization Methods for Nonlinear Optimization. In Proceedings on Linear and Nonlinear Conjugate Gradient-Related Methods; Adams, L., Nazareth, L., Eds.; SIAM: Philadelphia, PA, USA, 1996; pp. 50–78.
  7. Shea, B.; Schmidt, M. Why line search when you can plane search? SO-friendly neural networks allow per-iteration optimization of learning and momentum rates for every layer. arXiv 2024, arXiv:2406.17954.
  8. Caliciotti, A.; Fasano, G.; Nash, S.; Roma, M. An adaptive truncation criterion for Newton-Krylov methods in large scale nonconvex optimization. Oper. Res. Lett. 2018, 46, 7–12.
  9. McCormick, G.P. A modification of Armijo's step-size rule for negative curvature. Math. Program. 1977, 13, 111–115.
  10. Moré, J.J.; Sorensen, D.C. On the use of directions of negative curvature in a modified Newton method. Math. Program. 1979, 16, 1–20.
  11. Nash, S.G. A survey of truncated-Newton methods. J. Comput. Appl. Math. 2000, 124, 45–59.
  12. Fasano, G.; Roma, M. Iterative computation of negative curvature directions in large scale optimization. Comput. Optim. Appl. 2007, 38, 81–104.
  13. De Leone, R.; Fasano, G.; Roma, M.; Sergeyev, Y.D. Iterative Grossone-Based Computation of Negative Curvature Directions in Large-Scale Optimization. J. Optim. Theory Appl. 2020, 186, 554–589.
  14. Curtis, F.E.; Robinson, D.P. Exploiting negative curvature in deterministic and stochastic optimization. Math. Program. 2019, 176, 69–94.
  15. Gill, P.E.; Wong, E. Sequential Quadratic Programming Methods. In Mixed Integer Nonlinear Programming; Lee, J., Leyffer, S., Eds.; The IMA Volumes in Mathematics and Its Applications; Springer: New York, NY, USA, 2012; Volume 154.
  16. Fletcher, R.; Gould, N.I.; Leyffer, S.; Toint, P.; Wächter, A. Global convergence of a trust-region SQP-filter algorithm for general nonlinear programming. SIAM J. Optim. 2002, 13, 635–659.
  17. Wang, J.; Petra, C.G. A Sequential Quadratic Programming Algorithm for Nonsmooth Problems with Upper-C2 Objective. SIAM J. Optim. 2023, 33, 2379–2405.
  18. Nocedal, J.; Yuan, Y. Combining trust-region and line-search techniques. In Advances in Nonlinear Programming; Yuan, Y., Ed.; Kluwer: Boston, MA, USA, 1998; pp. 157–175.
  19. Tong, X.; Zhou, S. Combining Trust Region and Line Search Methods for Equality Constrained Optimization. Numer. Funct. Anal. Optim. 2006, 24, 143–162.
  20. Waltz, R.A.; Morales, J.L.; Nocedal, J.; Orban, D. An interior algorithm for nonlinear optimization that combines line search and trust region steps. Math. Program. 2006, 107, 391–408.
  21. Pei, Y.; Zhu, D. A trust-region algorithm combining line search filter technique for nonlinear constrained optimization. Int. J. Comput. Math. 2014, 91, 1817–1839.
  22. Dembo, R.S.; Eisenstat, S.C.; Steihaug, T. Inexact Newton methods. SIAM J. Numer. Anal. 1982, 19, 400–408.
  23. Steihaug, T. The Conjugate Gradient method and Trust Regions in large scale optimization. SIAM J. Numer. Anal. 1983, 20, 626–637.
  24. Lucidi, S.; Rochetich, F.; Roma, M. Curvilinear stabilization techniques for truncated Newton methods in large scale unconstrained optimization. SIAM J. Optim. 1998, 8, 916–939.
  25. Gould, N.I.M.; Lucidi, S.; Roma, M.; Toint, P.L. Solving the trust-region subproblem using the Lanczos method. SIAM J. Optim. 1999, 9, 504–525.
  26. De Leone, R.; Fasano, G.; Sergeyev, Y.D. Planar methods and Grossone for the Conjugate Gradient breakdown in Nonlinear Programming. Comput. Optim. Appl. 2018, 71, 73–93.
  27. Grippo, L.; Lampariello, F.; Lucidi, S. A truncated Newton method with nonmonotone linesearch for unconstrained optimization. J. Optim. Theory Appl. 1989, 60, 401–419.
  28. Grippo, L.; Lampariello, F.; Lucidi, S. A class of nonmonotone stabilization methods in unconstrained optimization. Numer. Math. 1991, 59, 779–805.
Figure 1. A graphical representation of the feasible set in (2). The dashed-dotted lines represent all the extreme choices for the last inequality constraint in (2).
Figure 2. Examples where the structure of the feasible set in (2) is helpful: case (a) is treated in Section 4.1, case (b) is treated in Section 4.2, case (c) is treated in Section 4.3, case (d) is treated in Section 4.4 and case (e) is treated in Section 4.5.
Figure 3. Overview of possible solutions (I)–(XII) for KKT conditions in (15).
Table 1. A standard framework for linesearch-based TNMs for large-scale problems. The alternative of possibly using negative curvature directions allows for convergence to stationary limit points which fulfill second-order necessary optimality conditions.
Set $x_0 \in \mathbb{R}^n$
Set $\eta_k \in [0, 1)$ for any $k$, with $\{\eta_k\} \to 0$
OUTER ITERATIONS
for $k = 0, 1, \dots$
        Compute $b \equiv \nabla f(x_k)$ and $Q \equiv \nabla^2 f(x_k)$; if $\|b\|$ is small then STOP
               INNER ITERATIONS
                   -   Compute $d_k$, which approximately solves Newton's equation
                       $Qd + b = 0$, i.e., it satisfies the truncation rule $\|Qd_k + b\| \le \eta_k \|b\|$
                   -   Possibly compute a bounded negative curvature direction $z_k$ at $x_k$
       Use a criterion to either combine $d_k$ and $z_k$, or choose between $d_k$ and $z_k$
       If the directions $d_k$ and $z_k$ were combined, set $v_k(\alpha) = \omega_1(\alpha) d_k + \omega_2(\alpha) z_k$, and
       use a curvilinear linesearch procedure to select $\alpha \equiv \alpha_k$. Otherwise, set $v_k(\alpha) = \alpha \bar{d}$
       with $\bar{d} \in \{d_k, z_k\}$, and use an Armijo-type procedure to select $\alpha \equiv \alpha_k$
       Update $x_{k+1} = x_k + v_k(\alpha_k)$
endfor
Table 2. A standard framework for linesearch-based TNMs for large-scale problems which exploits the sub-problem (2). Differences with respect to Table 1 are quite evident.
Set $x_0 \in \mathbb{R}^n$
Set $\eta_k \in [0, 1)$ for any $k$, with $\{\eta_k\} \to 0$
OUTER ITERATIONS
for $k = 0, 1, \dots$
        Compute $b \equiv \nabla f(x_k)$ and $Q \equiv \nabla^2 f(x_k)$; if $\|b\|$ is small then STOP
               INNER ITERATIONS
                     -   Compute $d_k$, which approximately solves Newton's equation
                         $Qd + b = 0$, i.e., it satisfies the truncation rule $\|Qd_k + b\| \le \eta_k \|b\|$
                     -   Set $z_k = -b$
        Compute $\alpha^*$ and $\beta^*$ by solving (2); then, update the trust-region
        parameters $a_1, a_2, b_1, b_2$
        Set $v_k = \alpha^* d_k + \beta^* z_k$, and use an Armijo-type procedure to select the
        steplength $\alpha_k$ along the direction $v_k$
        Update $x_{k+1} = x_k + \alpha_k v_k$
endfor
Table 3. A framework combining trust-region and linesearch approaches within TNMs for large-scale problems, again exploiting the sub-problem (2). Differences with respect to Table 1 and Table 2 are quite evident.
Set $x_0 \in \mathbb{R}^n$
Set $\eta_k \in [0, 1)$ for any $k$, with $\{\eta_k\} \to 0$. Set $\rho > 0$
OUTER ITERATIONS
for $k = 0, 1, \dots$
        Compute $b \equiv \nabla f(x_k)$ and $Q \equiv \nabla^2 f(x_k)$; if $\|b\|$ is small then STOP
               INNER ITERATIONS
                     -   Compute $d_k$, which approximately solves Newton's equation
                         $Qd + b = 0$, i.e., it satisfies the truncation rule $\|Qd_k + b\| \le \eta_k \|b\|$
                     -   Set $z_k = -b$
        Compute $\alpha^*$ and $\beta^*$ by solving (2); then, set $v_k = \alpha^* d_k + \beta^* z_k$,
        $Ared_k = f(x_k) - f(x_k + v_k)$, $Pred_k = \varphi(x_k) - \varphi(x_k + v_k)$
        If $Ared_k / Pred_k \le \rho$, use an Armijo-type procedure to select the steplength
        $\alpha_k$ along $v_k$; otherwise, skip the linesearch procedure (set $\alpha_k = 1$)
        Update the trust-region parameters $a_1, a_2, b_1, b_2$
        Update $x_{k+1} = x_k + \alpha_k v_k$
endfor
Table 4. A framework of linesearch-based approaches within TNMs for large-scale problems: solving the sub-problem (2) successfully allows for the convergence of the sequence $\{x_k\}$ to limit points satisfying second-order necessary optimality conditions. Differences with respect to Table 1, Table 2 and Table 3 are quite evident.
Set $x_0 \in \mathbb{R}^n$
Set $\eta_k \in [0, 1)$ for any $k$, with $\{\eta_k\} \to 0$
OUTER ITERATIONS
for $k = 0, 1, \dots$
        Compute $b \equiv \nabla f(x_k)$ and $Q \equiv \nabla^2 f(x_k)$; if $\|b\|$ is small then STOP
               INNER ITERATIONS
                     -   Compute $d_k$, which approximately solves Newton's equation
                         $Qd + b = 0$, i.e., it satisfies the truncation rule $\|Qd_k + b\| \le \eta_k \|b\|$
                     -   Compute a suitable negative curvature direction $z_k$ for $f(x)$ at $x_k$
        Compute $\alpha^*$ and $\beta^*$ by solving (2); then, set $v_k = \alpha^* d_k + \beta^* z_k$. Update the
        trust-region parameters $a_1, a_2, b_1, b_2$
        Use an Armijo-type procedure to select the steplength $\alpha_k$ along $v_k$
        Update $x_{k+1} = x_k + \alpha_k v_k$
endfor