Abstract
This paper analyses the solution of a specific quadratic sub-problem, along with its possible applications, within both constrained and unconstrained Nonlinear Programming frameworks. We give evidence that this sub-problem may appear in a number of Linesearch Based Methods (LBM) schemes, and to some extent it reveals a close analogy with the solution of trust-region sub-problems. Namely, we refer to a two-dimensional structured quadratic problem, where five linear inequality constraints are included. Finally, we detail how to compute an exact global solution of our two-dimensional quadratic sub-problem, exploiting first-order Karush-Kuhn-Tucker (KKT) conditions.
1. Introduction
There are plenty of real problems where the minimization of a twice continuously differentiable functional is sought, (possibly) subject to several linear and nonlinear constraints. Among the authoritative textbooks where such problems are widely detailed, we can surely find [1,2,3]. Such general problems typically require the solution of a sequence of simple sub-problems with the following pattern:
where $m_k: \mathbb{R}^n \to \mathbb{R}$ and $F_k \subseteq \mathbb{R}^n$, for any $k$. Furthermore, $m_k$ represents a model of the smooth function $f$ at the current iterate $x_k$, and the feasible set $F_k$ represents a linearization of the constraints.
As is well known, affine and quadratic polynomials based on Taylor’s expansion are often adopted to represent the models $m_k$, but valid alternatives also include least squares approximations, Radial Basis Functions, metamodels based on Splines, B-Splines, Kriging, etc. [4,5]. We remark that the advantage of solving the sequence of sub-problems (1), in place of the original nonlinearly constrained problem, within a suitable convergence framework, essentially relies on their simplicity. In particular, in this paper our focus is on investigating the role and the properties of the next problem (2), which represents a special sub-case of the more general problem (1). More specifically, we consider the case where in (1) the feasible set $F_k$ includes only a finite number of inequalities and the function $m_k$ is a quadratic functional, i.e., we focus on the following sub-problem (2) and drop the dependency on $k$:
where $d_1, d_2 \in \mathbb{R}^n$ are given search directions, and the remaining data define the quadratic objective and the five linear inequality constraints (a plausible explicit form is sketched below). Despite the apparent specific structure of (2), a number of real applications may benefit from its solution, as partly described in Section 4 (see also [6] for a general perspective and [7] for a more recent similar viewpoint within neural network frameworks).
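Since the display of (2) is not reproduced above, the following sketch gives a plausible reading of its structure, based on the box constraints and the single additional linear inequality described in Sections 2 and 5; the symbols $\bar x$, $\alpha$, $\beta$, $\alpha_l$, $\alpha_u$, $\beta_l$, $\beta_u$, $a$, $b$, $c_0$ are our notation, not the paper's:

$$\min_{\alpha,\beta} \; q(\bar x + \alpha d_1 + \beta d_2) \quad \text{s.t.} \quad \alpha_l \le \alpha \le \alpha_u, \;\; \beta_l \le \beta \le \beta_u, \;\; a\alpha + b\beta \le c_0,$$

with $q(x) = \frac{1}{2} x^\top Q x + c^\top x$, so that the four bounds plus the last inequality give the five linear inequality constraints mentioned in the Abstract.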
As an example of the versatility of the structure of (2), both in trust-region methods (TRMs) and LBMs, we will shortly consider how it can possibly be successfully embedded within the framework of Truncated Newton's methods (TNMs; see Table 1).
Table 1.
A standard framework for linesearch-based TNMs for large-scale problems. The alternative of possibly using negative curvature directions allows for convergence to stationary limit points which fulfill second-order necessary optimality conditions.
Where (see also [8,9,10,11,12,13,14])
- $s_k$ represents an approximate Newton-type direction, at the current feasible point $x_k$;
- $d_k$ represents a negative curvature direction for the nonlinear function $f$, at the current feasible point $x_k$;
- $B_k$ represents the exact/approximate Hessian matrix of $f$ at $x_k$;
- $g_k$ represents the exact/approximate gradient vector of $f$ at $x_k$;
- $\alpha_k$ and $\beta_k$ are steplengths along the directions $s_k$ and $d_k$ (i.e., following the taxonomy of Table 1, the roles of $d_1$ and $d_2$ in (2) are played by $s_k$ and $d_k$). The last inequality constraint in (2) potentially plays a multi-purpose role, modeling for instance the gradient-related property for the search direction at $x_k$, i.e., condition (3).
The availability of an (exact) global solution for (2) may also suggest some alternatives to Table 1, either selecting a TRM or an LBM framework, or combining the two approaches. In particular, the scheme in Table 2 represents an immediate acceleration scheme for linesearch-based TNMs with respect to Table 1, in case the global convergence of the sequence $\{x_k\}$ to stationary limit points is simply sought. Note that selecting negative values for the lower bounds in (2) and positive ones for the upper bounds allows us to possibly perform the following:
- reverse the directions $d_1$ and $d_2$;
- use (2) to simulate a dogleg-like procedure [3] for TRMs, also in nonconvex LBM frameworks.
As a further alternative to Table 1 and Table 2 for exploiting the exact global solution of (2), we have the scheme in Table 3, where we suitably combine the strategies used in TRMs and LBMs to ensure global convergence. (In particular, TRMs require the fulfillment of a sufficient reduction of the model in order to force a sufficient decrease in the objective function, so that they do not need any linesearch procedure, possibly implying a reduced computational burden with respect to LBMs. Conversely, LBMs easily compute an effective search direction, but they need to perform a linesearch procedure, because they do not include any direct function reduction mechanism based on the local quadratic model.) In particular, if the test on the sufficient reduction of the model is fulfilled, there is no need to perform a linesearch procedure, since the global convergence of $\{x_k\}$ is preserved by the trust-region framework. We also remark that, in Table 3, the quantities entering this test must be computed in any case, regardless of its outcome.
Finally, there is a chance to further exploit the scheme (2) in a TNM framework based on a linesearch procedure, in order to ensure global convergence properties for the sequence $\{x_k\}$ to stationary limit points satisfying the second-order necessary optimality conditions (namely, those stationary points where the Hessian matrix is positive semidefinite). The resulting scheme is proposed in Table 4 and requires no additional comments. The above examples give an overview of the possible basic contexts where the solution of the sub-problem (2) is sought. Hence, to some extent, specifically exploiting issues on its solution may yield a tool for practitioners working in Nonlinear Programming frameworks. We remark that, both in Section 3 and Section 6, the reader may find additional guidelines for possible alternatives and extensions to the use of global solutions of (2).
Table 4.
A framework of linesearch-based approaches within TNMs for large-scale problems: solving the sub-problem (2) successfully allows for the convergence of the sequence to limit points satisfying second-order necessary optimality conditions. Differences with respect to Table 1, Table 2 and Table 3 are quite evident.
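As a purely illustrative sketch of the hybrid rationale behind Table 3 (our naming and simplifications, since the table itself is not reproduced here), the acceptance logic might be organized as follows: accept the step TRM-style when a sufficient fraction of the predicted model decrease is achieved, and fall back to an Armijo-type linesearch otherwise.

```python
def hybrid_step(f, grad, x, step, model_decrease, eta=0.25, c1=1e-4):
    """Sketch of a TRM/LBM hybrid acceptance rule (our simplification):
    'step' is assumed to globally solve a sub-problem like (2); if the
    achieved reduction of f is a sufficient fraction of the predicted
    model decrease, accept it without any linesearch (TRM-style);
    otherwise backtrack along the gradient-related step (LBM-style)."""
    fx = f(x)
    if fx - f(x + step) >= eta * model_decrease:
        return x + step                     # TRM-style acceptance
    t, slope = 1.0, grad(x) @ step          # slope < 0 for a descent step
    for _ in range(60):                     # Armijo backtracking fallback
        if f(x + t * step) <= fx + c1 * t * slope:
            break
        t *= 0.5
    return x + t * step
```

Here f, grad, x and step are assumed to be a callable objective, its gradient, and NumPy arrays, respectively.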
We also highlight that our perspective differs both from SQP (Sequential Quadratic Programming) methods (see, for example, the seminal paper [15]) and from approaches in the literature where LBMs and TRMs have been combined. Indeed, in the basic structure of SQPs (see also [16,17]), inner and outer iterations are performed. At each outer iteration, the pair given by the primal-dual variables is computed, and a problem similar to (2) is addressed. On the contrary, we do not intend to propose a (novel) framework of global convergence for Nonlinear Programming; rather, we suggest the generality of the scheme (2) within a number of cases from the literature. Furthermore, we specifically focus on the exact solution of the quadratic sub-problem (2).
On the other hand, in the seminal paper [18], and in the more recent ones [19,20,21], linesearch and trust-region techniques are integrated in a unified framework. In contrast, our point of view merely intends to bridge the gap between the two approaches.
The structure of the present paper is as follows. In Section 2, we describe the conditions ensuring the feasibility of our problem. In Section 3, we reveal the basic motivations for our analysis and outcomes. Section 4 reports relevant remarks, highlighting how general our proposal can be. Section 5 includes the Karush-Kuhn-Tucker conditions associated with problem (2), along with precise guidelines to find a global minimum for it. Finally, Section 6 provides some conclusions and suggestions for future work.
As regards the symbols adopted in the paper, $\|x\|_1$, $\|x\|_2$ and $\|x\|_\infty$ are, respectively, used to indicate the 1-norm, the 2-norm, and the ∞-norm of the vector or real matrix $x$. Given the n-real vectors $x$ and $y$, we indicate their standard inner product with $x^\top y$. Given the matrix $A \in \mathbb{R}^{m \times n}$, we then indicate by $A^\dagger$ its Moore-Penrose pseudoinverse matrix, i.e., the unique matrix such that $A A^\dagger A = A$, $A^\dagger A A^\dagger = A^\dagger$, $(A A^\dagger)^\top = A A^\dagger$, $(A^\dagger A)^\top = A^\dagger A$. With $A \succeq 0$ ($A \succ 0$), we indicate a positive semidefinite (positive definite) matrix $A$.
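As a quick numerical illustration (ours, not from the paper) of the four Moore-Penrose conditions above, using NumPy:

```python
import numpy as np

# Verify the four Penrose conditions for a random rectangular matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
P = np.linalg.pinv(A)                    # Moore-Penrose pseudoinverse of A

assert np.allclose(A @ P @ A, A)         # A A+ A = A
assert np.allclose(P @ A @ P, P)         # A+ A A+ = A+
assert np.allclose((A @ P).T, A @ P)     # (A A+)^T = A A+
assert np.allclose((P @ A).T, P @ A)     # (A+ A)^T = A+ A
```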
2. Feasibility Issues for Our Quadratic Problem
Here, we consider some feasibility issues for the linear inequality constrained quadratic problem (2). Clearly, (2) just includes the two real unknowns $\alpha$ and $\beta$. Moreover, as regards the existence of solutions for (2), we have the following result.
Lemma 1. Problem (2) admits a global solution whenever at least one among the following five conditions is fulfilled:
- Cond. I: and .
- Cond. II: and ; moreover,
- –
- if , then
- –
- if , then
- Cond. III: and ; moreover,
- –
- if , then
- –
- if , then
- Cond. IV: , , , moreover,
- –
- if and , then
- –
- if and , then .
- Cond. V: , , , moreover,
- –
- if and , then
- –
- if and , then .
Proof of Lemma 1.
For the sake of simplicity, we refer to Figure 1. The objective function in (2) is continuous, so that the existence of solutions follows from the compactness and nonemptiness of the feasible region. In this regard, the compactness is a consequence of assuming finite bounds in the box constraints of (2). Furthermore, it is not difficult to realize that the feasible set of (2) is nonempty as long as at least one among the five conditions, Cond. I–Cond. V, is fulfilled, where the dashed-dotted line in Figure 1 represents the line associated with the last inequality constraint in (2). In particular, Cond. IV refers to the corner points A and B of Figure 1, while Cond. V refers to the vertices C and D. □
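Assuming, as in our reading of (2), a feasible set given by box bounds on the two unknowns plus one additional linear inequality (the coefficient names a, b, c0 below are ours), a minimal nonemptiness check in the spirit of Lemma 1 is:

```python
def feasible_region_nonempty(lo, hi, a, b, c0):
    """Check nonemptiness of {(x, y): lo <= (x, y) <= hi, a*x + b*y <= c0}.
    Since the extra constraint is linear, the minimum of a*x + b*y over the
    box is attained at a corner: the region is nonempty iff the box is
    nonempty and that minimum does not exceed c0."""
    if lo[0] > hi[0] or lo[1] > hi[1]:
        return False
    corners = [(x, y) for x in (lo[0], hi[0]) for y in (lo[1], hi[1])]
    return min(a * x + b * y for x, y in corners) <= c0
```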
3. On the Use of Quadratic Sub-Problems Within TRMs and LBMs, in Large-Scale Optimization
Here, we give details about a possible motivation for our proposal, in order to reduce the gap between two renowned classes of optimization methods, namely TRMs and LBMs. We are indeed persuaded that such a viewpoint may suggest a number of possible enhancements to both classes of methods.
In this regard, observe that a TRM for large-scale problems is an iterative procedure that generates the sequence of n-real iterates $\{x_k\}$ and, at any step $k$, seeks the solution of the trust-region sub-problem

$$\min_{\|s\| \le \Delta_k} \; g_k^\top s + \frac{1}{2}\, s^\top B_k s, \qquad (4)$$

where $x_k$ is the current iterate, $B_k$ represents the exact/approximate Hessian matrix $\nabla^2 f(x_k)$, $g_k$ the exact/approximate gradient, and $\Delta_k > 0$ represents the radius of the trust region, i.e., of the compact subset where the model needs to be validated (for an exhaustive description of TRMs for Nonlinear Programming, the reader can refer to [2]). A number of possible variants of (4) can be introduced when n is large, including iterative updating strategies for both $B_k$ and $\Delta_k$, and a number of approximate/sophisticated/refined schemes for its solution are available in the literature.
A distinguishing feature of TRMs, with respect to LBMs, is that at iteration k the methods in the first class attempt to determine the stepsize and the search direction at once, so that $x_{k+1} = x_k + s_k$, where $s_k$ indeed approximately/exactly solves (4). Conversely, in LBMs, the computations of the stepsize and of the search direction are independent, as detailed later on in this paper. In particular (see also [3]), the effective computation of $s_k$ in TRMs properly attempts to comply with the following issues:
- $s_k$ can be computed by either an exact (small- and medium-scale problems) or an approximate (large-scale problems) procedure;
- In order to prove the global convergence of the sequence $\{x_k\}$ to stationary limit points satisfying either first- or second-order necessary optimality conditions, $s_k$ is required to provide a sufficient reduction of the quadratic model in (4), i.e., the difference between the model values at the null step and at $s_k$ is asked to satisfy a sufficient decrease condition (e.g., a fraction of the decrease attained by the Cauchy step);
- $s_k$ can be computed by an approximate procedure, e.g., by adopting a Cauchy step (a minimal sketch of this step follows the present list) or using the Steihaug conjugate gradient (see [22,23]), regardless of the signature of $B_k$. Then, the approximate solution of (4) is merely sought on a linear manifold of dimension one or at most two, rather than on the entire trust region;
- Depending on a number of additional assumptions, TRMs can prove to be globally convergent to either a simple stationary limit point, or to a point which satisfies second-order necessary optimality conditions [2];
- Finding the exact/accurate solution of the sub-problem (4) is in general quite a cumbersome task in large-scale problems, representing a difficult goal that is often (when possible) skipped.
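For illustration, a minimal sketch (ours, with the standard formulas from [3]) of the Cauchy step for (4):

```python
import numpy as np

def cauchy_step(g, B, delta):
    """Cauchy point for min g^T s + 0.5 s^T B s subject to ||s|| <= delta:
    the minimizer of the model along the steepest-descent direction -g,
    restricted to the trust region (see, e.g., [3])."""
    gnorm = np.linalg.norm(g)
    if gnorm == 0.0:
        return np.zeros_like(g)
    gBg = g @ B @ g
    if gBg <= 0.0:
        tau = 1.0       # nonpositive curvature along -g: stop at the boundary
    else:
        tau = min(gnorm**3 / (delta * gBg), 1.0)
    return -tau * (delta / gnorm) * g
```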
On the other hand, to some extent, LBMs represent the counterpart of TRMs. Indeed, to yield the next iterate $x_{k+1}$, they perform the computation of the steplength and the direction as separate tasks. Furthermore, unlike for TRMs, the novel iterate in LBMs can also be obtained by adopting the more general update

$$x_{k+1} = x_k + \alpha_k s_k + \beta_k d_k, \qquad (5)$$

with $s_k$ and $d_k$ now being two search directions summarizing different information on the function $f$, and $\alpha_k$ and $\beta_k$ being stepsizes. In particular:
- when $d_k = 0$ for any $k$ (or $\beta_k = 0$), then $s_k$ represents a Newton-type direction, being typically computed by approximately solving Newton's equation at the current iterate $x_k$. Then, an Armijo-type linesearch procedure is applied along $s_k$ to compute $\alpha_k$, provided that $s_k$ is gradient-related (see e.g., [3]) at $x_k$;
- when $d_k \neq 0$, then $s_k$ represents a Newton-type direction again, while $d_k$ is typically a negative curvature direction for $f$ at $x_k$, which approximates an eigenvector associated with the smallest (most negative) eigenvalue of $\nabla^2 f(x_k)$ (a small illustrative sketch follows this list). The vector $d_k$ plays an essential role when LBMs' convergence to stationary points satisfying the second-order necessary optimality conditions needs to be proved. In the last case, the computation of the steplengths $\alpha_k$ and $\beta_k$ is often carried out at once (as in curvilinear linesearch procedures; see [24]), or the steplength computation is carried out by pursuing independent tasks (see, for example, [25]). We highlight that in (5), when both $s_k \neq 0$ and $d_k \neq 0$, we may experience difficulties related to properly scaling the two search directions.
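As a small illustrative sketch (ours) of the negative curvature direction mentioned above, via a full eigendecomposition:

```python
import numpy as np

def negative_curvature_direction(H, tol=1e-10):
    """Return d with d^T H d < 0 when the symmetric matrix H has a negative
    eigenvalue, else None. Illustrative only: in large-scale settings this
    dense eigendecomposition is replaced by iterative procedures (see,
    e.g., [12,13])."""
    eigvals, eigvecs = np.linalg.eigh(H)   # eigenvalues in ascending order
    if eigvals[0] < -tol:                  # smallest (most negative) eigenvalue
        return eigvecs[:, 0]
    return None
```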
As a general class of efficient algorithms within LBMs for large-scale problems, we find Truncated Newton methods (TNMs) coupled with a linesearch procedure (see Table 1). Similarly to general TRMs, they are evidently based on possibly computing $s_k$ and $d_k$ after exploiting the second-order Taylor's expansion of $f$ at $x_k$. However, a couple of quite disappointing issues arise when applying linesearch-based TNMs, namely:
- Unlike trust-region based TNMs, at the iterate $x_k$, the search of a stationary point for a quadratic polynomial model of $f$ (i.e., Newton's equation) is performed on the whole of $\mathbb{R}^n$, so that the quadratic expansion is not trusted on a more reliable compact subset (trust region) of $\mathbb{R}^n$. Thus, the search direction might show poor performance when the iterates in the sequence $\{x_k\}$ are far from a stationary limit point. More specifically, note that in case $B_k \succ 0$, solving Newton's equation and the trust-region sub-problem, for any sufficiently large radius $\Delta_k$, yields the same solutions. Conversely, when $B_k$ is indefinite, then Newton's equation provides a saddle point for the quadratic model, that might be interpreted as a solution to a trust-region sub-problem (the interested reader may consider the paper [26] for some extensions). Furthermore, from this perspective, we remark that in LBMs, solving (2) with a suitable setting of the directions and of the bounds is to a large extent equivalent to computing the Cauchy step when solving (4). Indeed, in the last case, the trust-region constraint in (4) can in principle be equivalently replaced by the compact feasible set (box constraints) in (2). On the other hand, with a different setting of the directions and bounds in (2), the solution to (2) closely resembles the application of the dogleg method when solving (4). Finally, since the bound coefficients in (2) may have negative values, we may potentially reverse the directions $d_1$ and $d_2$ when solving (2). Thus, following the idea behind (3), the scheme (2) suggests that, in case $B_k$ is indefinite, (2) easily generalizes the proposals in [27]. In fact, we are able to exactly compute a global minimum for (2), regardless of the signature of $Q$, so that the resulting direction is gradient-related at $x_k$.
- As in (5), the search directions $s_k$ and $d_k$ might be suitably combined in a curvilinear framework (see, for example, [24]). However, to our knowledge, the selection of $\alpha_k$ and $\beta_k$ in the literature is seldom performed with a joint procedure that separately assesses the two steplengths, i.e., $\alpha_k$ and $\beta_k$ are rarely chosen as independent parameters. Hence, in the literature of linesearch-based TNMs, the linesearch procedure that starts from $x_k$ and yields $x_{k+1}$ explores a one-dimensional manifold (a regular curve), rather than considering $x_k + \alpha s_k + \beta d_k$ as a two-dimensional manifold with independent real coefficients $\alpha$ and $\beta$.
In this regard, using (2) within LBMs tends to partially compensate for the drawbacks in the last two items, in light of the great success that TRMs have achieved in the last decade. In particular, using (2) within linesearch-based TNMs, our aim is to develop a simple tool which could possibly carry out the following:
- Adaptively updates the bound parameters $\alpha_l$, $\alpha_u$, $\beta_l$, $\beta_u$ in (2) when the iterate $x_k$ changes, following the rationale behind the update of $\Delta_k$ in (4), and retaining the strong convergence properties of TRMs. This fact is of remarkable interest, since in (2) the information associated with the search directions $d_1$ and $d_2$ is suitably trusted in a compact subset (namely, the one induced by the box constraints on $\alpha$ and $\beta$);
- Exactly computes a cheap global minimum for (2), so that the resulting vector is then provided to a standard linesearch procedure, such as one based on the Armijo rule (a minimal sketch follows this list), to ensure that the global convergence of the sequence $\{x_k\}$ to stationary (limit) points is preserved;
- Allows for the convergence of subsequences of the iterates to stationary limit points, where either first- or second-order necessary optimality conditions are fulfilled;
- Preserves generality within a wide range of optimization frameworks, as reported in the next Section 4;
- Combines the effects of $d_1$ and $d_2$, skipping all the drawbacks related to a possible different scaling between these directions. We recall that, since $d_1$ and $d_2$ are generated through the application of different methods, the comparison of their performance may be biased by the generating methods.
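A minimal sketch (ours) of the Armijo backtracking rule mentioned in the list above, for a gradient-related direction p:

```python
import numpy as np

def armijo(f, x, p, g, c1=1e-4, rho=0.5, max_iter=60):
    """Backtracking Armijo rule: return a steplength t such that
    f(x + t p) <= f(x) + c1 * t * g^T p, assuming p is gradient-related
    at x (in particular g^T p < 0)."""
    fx, slope = f(x), g @ p
    assert slope < 0, "p must be a descent direction"
    t = 1.0
    for _ in range(max_iter):
        if f(x + t * p) <= fx + c1 * t * slope:
            break
        t *= rho
    return t
```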
4. How General Is the Model (2) in Nonlinear Programming Frameworks?
This section is devoted to reporting a number of real constrained optimization schemes from Nonlinear Programming, whose formulation is encompassed in (2). We can see that, for some of the following schemes (see Figure 2), more than one reformulation within the framework (2) can be considered.
Figure 2.
Examples where the structure of the feasible set in (2) is helpful: case (a) is treated in Section 4.1, case (b) is treated in Section 4.2, case (c) is treated in Section 4.3, case (d) is treated in Section 4.4 and case (e) is treated in Section 4.5.
4.1. Minimization over a Bounded Simplex
We consider the problem of minimizing a quadratic functional over the simplex , such that
where . Figure 2a reports an example of a simplex. In this regard, by simply setting in (2)
- , ,
- ,
- , , , ,
- ,
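Since the explicit settings above are not reproduced, the following reconstruction (our notation, with vertices $v_0, v_1, v_2$) sketches how a two-dimensional simplex fits the five-constraint structure of (2):

$$\Delta = \{x = v_0 + \alpha (v_1 - v_0) + \beta (v_2 - v_0)\}, \qquad d_1 = v_1 - v_0, \quad d_2 = v_2 - v_0,$$

with bounds $0 \le \alpha \le 1$, $0 \le \beta \le 1$ and the additional linear inequality $\alpha + \beta \le 1$ describing the simplex.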
4.2. Minimization over a Bounded Polygon
We consider the problem of minimizing a quadratic functional over a polygon $P$, described by a finite number m of vertices (observe that the points in the polygon $P$ must belong to a common hyperplane), i.e.,
where . Figure 2b reports an example of a polygon with . In this regard, the problem (7) can be split into the solution of the sub-problems
where
which are of the form (6). Thus, solving the problem (7) corresponds to solving a sequence of instances of the problem (2).
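A standard way to realize this splitting, assuming $P$ convex with vertices $v_1, \dots, v_m$ ordered along its boundary (our notation), is the fan triangulation

$$P = \bigcup_{i=2}^{m-1} \operatorname{conv}\{v_1, v_i, v_{i+1}\},$$

so that minimizing over $P$ amounts to solving $m-2$ simplex sub-problems of the form (6) and retaining the best value found.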
4.3. Minimization over a Bounded Segment
We consider the problem of minimizing a quadratic functional over a segment , i.e.,
where . Figure 2c reports an example of a segment. In this regard, by simply setting in (2)
- , ,
- ,
- , , , ,
- ,
4.4. Minimization over a Bounded Box in
We consider the problem of minimizing a quadratic functional over a box domain , i.e.,
where . Figure 2d reports an example of a box domain. In this regard, by simply setting in (2)
- ,
- , , , ,
- ,
- ,
- , , , ,
- .
4.5. Minimization Including a 1-Norm Inequality Constraint in
We consider the problem of minimizing a quadratic functional subject to the 1-norm inequality constraint , with , i.e.,
Figure 2e reports an example of such a constraint. In this regard, it suffices to recast (11) as in (8), where
- , , , , ,
- so that four instances of the problem (2) need to be solved.
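In our notation (with $z$ the two reduced variables and $\delta > 0$ the radius), the splitting rests on the sign-pattern decomposition of the two-dimensional 1-norm ball into four triangles of the form (6):

$$\{z \in \mathbb{R}^2 : \|z\|_1 \le \delta\} = \bigcup_{s \in \{\pm 1\}^2} \{z : s_1 z_1 \ge 0, \; s_2 z_2 \ge 0, \; s_1 z_1 + s_2 z_2 \le \delta\}.$$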
4.6. Minimization Including an ∞-Norm Inequality Constraint in
We consider the problem of minimizing a quadratic functional subject to the ∞-norm inequality constraint , with , i.e.,
In this regard, we obtain similar results with respect to Section 4.5; note, however, that the ∞-norm ball is directly a box, so that a single instance of (2) suffices. Indeed, by simply setting in (2)
- ,
- , , , ,
- ,
4.7. Minimization Including a 2-Norm Inequality Constraint in
We consider the problem of minimizing a quadratic functional in subject to the 2-norm inequality constraint , with , i.e.,
In this regard, it suffices to observe that the solution of (2) provides both a
- LOWER bound to the solution of (13), as long as we follow the indications in Section 4.6, i.e., we replace the 2-norm constraint by the enclosing ∞-norm ball (a box), so that a single instance of the problem (2) needs to be solved;
- UPPER bound to the solution of (13), as long as we follow the indications in Section 4.5, i.e., we recast and solve (11) as in (8), so that four instances of the problem (2) need to be solved.
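The bounding argument rests on the standard norm-ball inclusions; in our notation, with $q$ the objective of (13),

$$\{\|z\|_1 \le \delta\} \subseteq \{\|z\|_2 \le \delta\} \subseteq \{\|z\|_\infty \le \delta\} \;\Longrightarrow\; \min_{\|z\|_\infty \le \delta} q(z) \;\le\; \min_{\|z\|_2 \le \delta} q(z) \;\le\; \min_{\|z\|_1 \le \delta} q(z).$$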
5. KKT Conditions and the Fast Solution of Problem (2)
Replacing the expression of the vector x in (2) within the objective function, we easily obtain the equivalent problem
where
Observe that transforming (2) into (14) only requires the computation of two additional matrix-vector products (i.e., $Q d_1$ and $Q d_2$), along with six inner products. The problem (14) is a constrained quadratic problem, such that first-order optimality conditions hold without additional constraint qualifications (since all the constraints are linear). Thus, after considering its Lagrangian function
we have the next set of equalities/inequalities representing the associated KKT conditions:
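Since the display of (15) is not reproduced above, the following sketch (our generic notation, assuming the reduced problem (14) has box bounds on $(\alpha,\beta)$ plus one linear inequality $a\alpha + b\beta \le c_0$, with multipliers $\lambda_1, \dots, \lambda_5$) shows the expected structure of the KKT system:

$$\begin{aligned} &\partial_\alpha \tilde q(\alpha,\beta) - \lambda_1 + \lambda_2 + a\lambda_5 = 0, \qquad \partial_\beta \tilde q(\alpha,\beta) - \lambda_3 + \lambda_4 + b\lambda_5 = 0,\\ &\lambda_1(\alpha_l - \alpha) = \lambda_2(\alpha - \alpha_u) = \lambda_3(\beta_l - \beta) = \lambda_4(\beta - \beta_u) = \lambda_5(a\alpha + b\beta - c_0) = 0,\\ &\lambda_i \ge 0, \;\; i = 1, \dots, 5, \qquad \alpha_l \le \alpha \le \alpha_u, \quad \beta_l \le \beta \le \beta_u, \quad a\alpha + b\beta \le c_0. \end{aligned}$$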
The remaining part of the present section is devoted to analyzing all the possible solutions of (15), with the aim of computing a global minimum for (2). In this regard, the analysis of the solutions of (15) can evidently be reduced to the cases (I)–(XII) in Figure 3.
Figure 3.
Overview of possible solutions (I)–(XII) for KKT conditions in (15).
Observing that in (15) the multipliers must fulfill nonnegativity conditions, it is not difficult to realize that computing all the KKT points satisfying (15) can turn out to be a burdensome task, including a number of sub-cases depending on the possible combinations of signs of the problem parameters. Conversely, a global minimizer for (2) can be equivalently obtained by analyzing all the possible solutions of (15) uniquely in terms of $\alpha$ and $\beta$, without requiring the computation of the multipliers as well. Hence, we limit our analysis to the computation of $\alpha$ and $\beta$ in the cases (I)–(XII) of Figure 3, where
- Cases (I), (II), (III), (IV) are associated with possible solutions in the vertices of the box constraints;
- Cases (V), (VI), (VII), (VIII) are associated with possible solutions on the edges of the box constraints;
- Case (IX) represents a possible feasible unconstrained minimizer for the objective function in (2);
- Cases (X), (XI), (XII) are associated with possible solutions, making the last inequality constraint in (14) active.
Then, in Lemma 2, we will provide a simple theoretical result which justifies our simplification with respect to computing all the KKT points. In this regard, we preliminarily consider the next cases from Figure 3, with $\{z_i\}$ being the resulting sequence of tentative solution points of (14):
- Case (I): We set , . If , then set
- Case (II): We set , . If , then set
- Case (III): We set , . If , then set
- Case (IV): We set , . If , then set
- Case (V): We set and possibly compute the solution of the equation, so that:
- –
- if , then set
- –
- if , then there is no solution for Case (V);
- –
- if , then set as any value satisfying , and compute as in (20);
- Case (VI): We set and possibly compute the solution of the equation, so that:
- –
- if , then set
- –
- if , then there is no solution for Case (VI);
- –
- if , then set as any value satisfying , and compute as in (21);
- Case (VII): We set and possibly compute the solution of the equation, so that:
- –
- if , then set
- –
- if , then there is no solution for Case (VII);
- –
- if , then set as any value satisfying , and compute as in (22);
- Case (VIII): We set and possibly compute the solution of the equation, so that:
- –
- if , then set
- –
- if , then there is no solution for Case (VIII);
- –
- if , then set as any value satisfying , and compute as in (23);
- Case (IX): If , we compute the solution of the linear system; otherwise, in case , there is no solution for Case (IX); otherwise, in case , we have three sub-cases:
- : then, recalling that we are in the sub-case where equations and yield the same information, we exploit equation and we set . Thus, from the bounds and the last inequality in (14), we obtain, which yields the next three cases:
- –
- : admitting other three sub-cases, namely
- ∗
- , so that we set
- ∗
- , so that we set
- ∗
- , so that we set
- –
- : admitting no solution for Case (IX) as long as the condition holds. Conversely, in case , we have the three cases:
- ∗
- , so that we set
- ∗
- , so that we set
- ∗
- , so that we set
- –
- : corresponding to the three cases:
- ∗
- , so that we set
- ∗
- , so that we set
- ∗
- , so that we set
- : then, recalling that we are in the sub-case where equations and yield the same information, with , we exploit equation with . Therefore, we have, which yields the next two cases:
- –
- This case implies that the objective function is constant (i.e., ), so that we set
- –
- : admitting no solution for Case (IX)
- : then, recalling that we are again in the sub-case where equations and yield the same information, we exploit equation and we set . Thus, from the bounds and the last inequality in (14), we obtain, which yields the next three cases:
- –
- : admitting other three cases, namely
- ∗
- , so that we set
- ∗
- , so that is always fulfilled and we set
- ∗
- , so that we set
- –
- : admitting no solution for Case (IX) as long as the condition holds. Conversely, in case we have the three cases:
- ∗
- , so that we set
- ∗
- , so that we set
- ∗
- , so that we set
- –
- : corresponding to the three cases
- ∗
- , so that we set
- ∗
- , so that we set
- ∗
- , so that we set
Thus, overall, for Case (IX), if , we set , along with ; otherwise, if , there is no solution for Case (IX).
- Case (X): We set with , and we distinguish among three cases:
- –
- if , then set ;
- –
- if , then there is no solution for Case (X);
- –
- if , then set
Set with .
- Case (XI): We distinguish among the next four cases:
- –
- if , then set , ; otherwise, there is no solution for Case (XI);
- –
- if , then and we analyze three sub-cases:
- If , then set
- If , then set ;
- If , then set
- –
- if , then set , ; if ( OR ), then there is no solution for Case (XI);
- –
- if , then and we analyze three sub-cases:
- If , then set
- If , then set ;
- If , then set
Set and ; if , then set ; otherwise, there is no solution for Case (XI).
- Case (XII): We set with , and we distinguish among three cases:
- –
- if , then set ;
- –
- if , then there is no solution for Case (XII);
- –
- if , then set
Set with
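To make the enumeration concrete, here is a self-contained sketch of the candidate-enumeration strategy of Cases (I)–(XII), under our assumed form of (14) (box bounds plus one extra linear inequality); the function name, argument names and tolerances are ours:

```python
import numpy as np

def solve_2d_qp(Q, g, lb, ub, a, c0):
    """Globally minimize f(z) = 0.5 z^T Q z + g^T z over z in R^2 with
    lb <= z <= ub (componentwise) and a^T z <= c0, Q symmetric.
    Rationale of Cases (I)-(XII): the global minimizer is the feasible
    unconstrained stationary point, a vertex of the feasible polygon, or
    an interior stationary point of f restricted to one of its edges."""
    f = lambda z: 0.5 * z @ Q @ z + g @ z
    feasible = lambda z: (np.all(z >= lb - 1e-12) and np.all(z <= ub + 1e-12)
                          and a @ z <= c0 + 1e-12)
    cands = []
    # Cases (I)-(IV): the four vertices of the box.
    for u in (lb[0], ub[0]):
        for v in (lb[1], ub[1]):
            cands.append(np.array([u, v]))
    # Cases (V)-(VIII): stationary points of f restricted to the box edges.
    for i in range(2):
        j = 1 - i
        for v in (lb[i], ub[i]):
            if Q[j, j] > 1e-14:                 # positive 1D curvature
                z = np.empty(2)
                z[i] = v
                z[j] = -(g[j] + Q[j, i] * v) / Q[j, j]
                cands.append(z)
    # Case (IX): unconstrained stationary point of f (Q nonsingular).
    try:
        cands.append(np.linalg.solve(Q, -g))
    except np.linalg.LinAlgError:
        pass
    # Cases (X)-(XII): the extra inequality a^T z = c0 active.
    if np.linalg.norm(a) > 1e-14:
        d = np.array([-a[1], a[0]])             # direction of the line
        z0 = a * (c0 / (a @ a))                 # a point on the line
        curv = d @ Q @ d
        if curv > 1e-14:                        # stationary point on the line
            t = -(d @ (Q @ z0 + g)) / curv
            cands.append(z0 + t * d)
        for i in range(2):                      # line/box-edge intersections
            j = 1 - i
            if abs(a[j]) > 1e-14:
                for v in (lb[i], ub[i]):
                    z = np.empty(2)
                    z[i] = v
                    z[j] = (c0 - a[i] * v) / a[j]
                    cands.append(z)
    cands = [z for z in cands if feasible(z)]
    if not cands:
        raise ValueError("empty feasible region (cf. Lemma 1)")
    return min(cands, key=f)
```

For instance, with Q = 2I, g = (-2, -2), bounds [0, 1] on both variables (all passed as NumPy arrays) and the extra constraint z1 + z2 <= 1, the sketch returns (0.5, 0.5), i.e., the projection of the (infeasible) unconstrained minimizer (1, 1) onto the active constraint.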
The next lemma justifies the role of the last analysis for the computation of possible solutions of (14).
Lemma 2. Let $\{z_i\}$ be the (finite) sequence of tentative solution points of (14) generated in Cases (I)–(XII). Then, any point $z^*$ minimizing the objective function of (14) over $\{z_i\}$ is a global minimum for (14).
Proof of Lemma 2.
The existence of a global minimum, and of the corresponding value, for (14) is ensured by Lemma 1. Moreover, each global minimum of (14) naturally fulfills the KKT conditions, so that each global minimum must belong to the sequence $\{z_i\}$. Now, assume by contradiction that a point $z^*$ minimizing the objective of (14) over $\{z_i\}$ is not a global minimum. Then, a global minimum would belong to $\{z_i\}$ and attain there an objective value strictly smaller than the minimum over $\{z_i\}$, which is a contradiction. □
6. Conclusions and Future Work
We have considered a very relevant issue within Nonlinear Programming, namely the solution of a specific constrained quadratic problem, whose exact global solution can be easily computed after analyzing the first-order KKT conditions associated with it. We also highlighted that our proposal may, to a large extent, suggest guidelines for the research of novel LBMs, by drawing inspiration from TRMs. This last observation represents a promising tool, in order to provide algorithms which guarantee global convergence to stationary limit points satisfying either first- or second-order necessary optimality conditions. In particular, we can summarize the following promising lines of research for large-scale problems, which iteratively generate sequences of points of the form $x_{k+1} = x_k + \alpha_k s_k + \beta_k d_k$, with $s_k$ and $d_k$ search directions at the current iterate $x_k$:
- Developing novel iterative LBMs (e.g., linesearch-based TNMs), where the search direction $s_k$ (e.g., a Newton-type direction) is possibly combined with another direction $d_k$ (e.g., the steepest descent direction at $x_k$, a negative curvature direction at $x_k$, etc.) through the use of (14). Then, comparing the efficiency of the novel methods with more standard linesearch-based approaches from the literature could give indications on the reliability of the ideas in this paper;
- Developing novel hybrid methods where the rationale behind alternating trust-region and linesearch-based techniques is exploited. In particular, the iterative scheme $x_{k+1} = x_k + \alpha_k s_k + \beta_k d_k$ (or a curvilinear variant of it) might be considered, where the search directions $s_k$ and $d_k$, along with the steplengths $\alpha_k$ and $\beta_k$, are alternately computed by solving
- A trust-region sub-problem like (4), so that a sufficient reduction in the quadratic model is ensured;
- A sub-problem like (14), so that the solution is a promising gradient-related direction to be used within a linesearch procedure.
In order to preserve the global convergence to stationary points satisfying either first- or second-order necessary optimality conditions;
- Specifically, comparing the use of dogleg methods (within TRMs) vs. the application of (14) coupled with a linesearch technique. This issue is tricky, since dogleg methods are applied to trust-region sub-problems like (4), including a general quadratic constraint (i.e., the trust-region constraint), while in (14) all the constraints are linear, so that the exact global solution of (14) is easily computed. Moreover, the last issue might also shed light on the possible opportunity of privileging an efficient linesearch procedure applied to a (coarsely computed) gradient-related search direction, in place of a precise computation of the search direction coupled with an inexpensive linesearch procedure. In other words, it is at present questionable whether coupling a coarse computation of the vectors $s_k$ and $d_k$ with an accurate linesearch procedure would be preferable to coupling accurately computed vectors $s_k$ and $d_k$ with a cheaper linesearch procedure;
- Introducing nonmonotone stabilization techniques (see e.g., [28]) combining nonmonotonicity with any of the above ideas, for both TRMs and LBMs.
Author Contributions
G.F., C.P. and M.R. have equally contributed to this paper. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Informed Consent Statement
Informed consent was obtained from all subjects involved in the study.
Data Availability Statement
Data are contained within the article.
Acknowledgments
Giovanni Fasano and Massimo Roma thank INdAM (Istituto Nazionale di Alta Matematica) for the support they received.
Conflicts of Interest
The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.
References
- Ben-Tal, A.; Nemirovski, A. Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications. In MPS-SIAM Series on Optimization; SIAM: Philadelphia, PA, USA, 2001. [Google Scholar]
- Conn, A.R.; Gould, N.I.M.; Toint, P.L. Trust-region methods. In MPS-SIAM Series on Optimization; SIAM: Philadelphia, PA, USA, 2000. [Google Scholar]
- Nocedal, J.; Wright, S.J. Numerical Optimization, 2nd ed.; Springer: New York, NY, USA, 2006. [Google Scholar]
- Micchelli, C.A. Interpolation of scattered data: Distance matrices and conditionally positive definite functions. Constr. Approx. 1986, 2, 11–22. [Google Scholar] [CrossRef]
- Myers, D.E. Kriging, cokriging, radial basis functions and the role of positive definiteness. Comput. Math. Appl. 1992, 24, 139–148. [Google Scholar] [CrossRef]
- Conn, A.R.; Gould, N.I.M.; Sartenaer, A.; Toint, P.L. On Iterated-Subspace Minimization Methods for Nonlinear Optimization. In Proceedings on Linear and Nonlinear Conjugate Gradient-Related Methods; Adams, L., Nazareth, L., Eds.; SIAM: Philadelphia, PA, USA, 1996; pp. 50–78. [Google Scholar]
- Shea, B.; Schmidt, M. Why line search when you can plane search? SO-friendly neural networks allow per-iteration optimization of learning and momentum rates for every layer. arXiv 2024, arXiv:2406.17954. [Google Scholar]
- Caliciotti, A.; Fasano, G.; Nash, S.; Roma, M. An adaptive truncation criterion for Newton-Krylov methods in large scale nonconvex optimization. Oper. Res. Lett. 2018, 46, 7–12. [Google Scholar] [CrossRef]
- McCormick, G.P. A modification of Armijo’s step-size rule for negative curvature. Math. Program. 1977, 13, 111–115. [Google Scholar] [CrossRef]
- Moré, J.J.; Sorensen, D.C. On the use of directions of negative curvature in a modified Newton method. Math. Program. 1979, 16, 1–20. [Google Scholar] [CrossRef]
- Nash, S.G. A survey of truncated-Newton methods. J. Comput. Appl. Math. 2000, 124, 45–59. [Google Scholar] [CrossRef]
- Fasano, G.; Roma, M. Iterative computation of negative curvature directions in large scale optimization. Comput. Optim. Appl. 2007, 38, 81–104. [Google Scholar] [CrossRef]
- De Leone, R.; Fasano, G.; Roma, M.; Sergeyev, Y.D. Iterative Grossone-Based Computation of Negative Curvature Directions in Large-Scale Optimization. J. Optim. Theory Appl. 2020, 186, 554–589. [Google Scholar] [CrossRef]
- Curtis, F.E.; Robinson, D.P. Exploiting negative curvature in deterministic and stochastic optimization. Math. Program. 2019, 176, 69–94. [Google Scholar] [CrossRef]
- Gill, P.E.; Wong, E. Sequential Quadratic Programming Methods. In Mixed Integer Nonlinear Programming; Lee, J., Leyffer, S., Eds.; The IMA Volumes in Mathematics and Its Applications; Springer: New York, NY, USA, 2012; Volume 154. [Google Scholar]
- Fletcher, R.; Gould, N.I.; Leyffer, S.; Toint, P.; Wächter, A. Global convergence of a trust-region SQP-filter algorithm for general nonlinear programming. SIAM J. Optim. 2002, 13, 635–659. [Google Scholar] [CrossRef]
- Wang, J.; Petra, C.G. A Sequential Quadratic Programming Algorithm for Nonsmooth Problems with Upper-Objective. SIAM J. Optim. 2023, 33, 2379–2405. [Google Scholar] [CrossRef]
- Nocedal, J.; Yuan, Y. Combining trust-region and line-search techniques. In Advances in Nonlinear Programming; Yuan, Y., Ed.; Kluwer: Boston, MA, USA, 1998; pp. 157–175. [Google Scholar]
- Tong, X.; Zhou, S. Combining Trust Region and Line Search Methods for Equality Constrained Optimization. Numer. Funct. Anal. Optim. 2006, 24, 143–162. [Google Scholar] [CrossRef]
- Waltz, R.A.; Morales, J.L.; Nocedal, J.; Orban, D. An interior algorithm for nonlinear optimization that combines line search and trust region steps. Math. Program. 2006, 107, 391–408. [Google Scholar] [CrossRef]
- Pei, Y.; Zhu, D. A trust-region algorithm combining line search filter technique for nonlinear constrained optimization. Int. J. Comput. Math. 2014, 91, 1817–1839. [Google Scholar] [CrossRef]
- Dembo, R.S.; Eisenstat, S.C.; Steihaug, T. Inexact Newton methods. SIAM J. Numer. Anal. 1982, 19, 400–408. [Google Scholar] [CrossRef]
- Steihaug, T. The Conjugate Gradient method and Trust Regions in large scale optimization. SIAM J. Numer. Anal. 1983, 20, 626–637. [Google Scholar] [CrossRef]
- Lucidi, S.; Rochetich, F.; Roma, M. Curvilinear stabilization techniques for truncated Newton methods in large scale unconstrained optimization. SIAM J. Optim. 1998, 8, 916–939. [Google Scholar] [CrossRef]
- Gould, N.I.M.; Lucidi, S.; Roma, M.; Toint, P.L. Solving the trust-region subproblem using the Lanczos method. SIAM J. Optim. 1999, 9, 504–525. [Google Scholar] [CrossRef]
- De Leone, R.; Fasano, G.; Sergeyev, Y.D. Planar methods and Grossone for the Conjugate Gradient breakdown in Nonlinear Programming. Comput. Optim. Appl. 2018, 71, 73–93. [Google Scholar] [CrossRef]
- Grippo, L.; Lampariello, F.; Lucidi, S. A truncated Newton method with nonmonotone linesearch for unconstrained optimization. J. Optim. Theory Appl. 1989, 60, 401–419. [Google Scholar] [CrossRef]
- Grippo, L.; Lampariello, F.; Lucidi, S. A class of nonmonotone stabilization methods in unconstrained optimization. Numer. Math. 1991, 59, 779–805. [Google Scholar] [CrossRef]