Abstract
In this work, we incorporate new approximate functions into the logarithmic penalty method to solve nonlinear optimization problems. First, we determine the descent direction by Newton's method. Then, we establish an efficient algorithm to compute the displacement step along this direction. Finally, we illustrate the superior performance of our new approximate functions with respect to the line search method through numerical experiments on several collections of test problems.
Keywords:
interior point methods; logarithmic penalty method; line search; approximate functions; nonlinear optimization
MSC:
90C25; 90C30; 90C51
1. Introduction
Nonlinear optimization is a fundamental subject in the modern optimization literature. It focuses on the problem of optimizing an objective function in the presence of inequality and/or equality constraints. An optimization problem is linear if all of its functions are linear; otherwise, it is called a nonlinear optimization problem.
This research field is motivated by the fact that it arises in various problems encountered in practice, such as business administration, economics, agriculture, mathematics, engineering, and physical sciences.
To the best of our knowledge, Frank and Wolfe pioneered the study of nonlinear optimization problems, establishing a powerful algorithm in [1] to solve them. Later, another method [2] was based on applying the Simplex method to the nonlinear problem after converting it to a linear one.
This pioneer work inspired many authors to propose and develop several methods and techniques to solve this class of problems. We refer to [3,4] for interior point methods to find the solution of nonlinear optimization problems with a high dimension.
To make this theory applicable in practice, other methods were designed building on the linear optimization literature, among them robust algorithms with polynomial complexity. In this vein, Khachian succeeded in 1979 in introducing a new ellipsoid method derived from approaches originally applied to nonlinear optimization.
Interior point methods outperform Simplex ones, and they have recently been the subject of several monographs, including those of Bonnans and Gilbert [5], Evtushenko and Zhadan [6], Nesterov and Nemirovskii [7], Wright [8], and Ye [9].
Interior point methods can be classified into three different groups as follows: projective methods and their alternatives as in Powell [10] and Rosen [11,12], central trajectory methods (see Ouriemchi [13] and Forsgren et al. [14]), and barrier/penalty methods, where majorant functions were originally proposed by Crouzeix and Merikhi [15] to solve a semidefinite optimization problem. Inspired by this work, Menniche and Benterki [16] and Bachir Cherif and Merikhi [17] applied this idea to linear and nonlinear optimizations, respectively.
A majorant function for the penalty method in convex quadratic optimization was proposed by Chaghoub and Benterki [18]. On the other hand, A. Leulmi et al. [19,20] used new minorant functions for semidefinite optimization, and this idea was extended to linear programming by A. Leulmi and S. Leulmi in [21].
As far as we know, our new approximate function has not been studied in the nonlinear optimization literature. These approximate functions are more convenient and efficient than the line search method for rapidly computing the displacement step.
Therefore, in our work, we aim to optimize a nonlinear problem based on prior efforts. Thus, we propose a straightforward and effective barrier penalty method using new minorant functions.
More precisely, we first introduce the statement of the problem and its perturbed problem, with the convergence results, in Section 2 and Section 3 of our paper. Then, in Section 4, we establish the solution of the perturbed problem by finding new minorant functions. Section 5 is devoted to a concise description of the algorithm and to illustrating the superior performance of our new approach through a simulation study. Finally, we summarize our work in the conclusion.
Throughout this paper, the following notations are adopted. Let $\langle \cdot, \cdot \rangle$ and $\| \cdot \|$ denote the scalar product and the Euclidean norm, respectively, given by the following:
$$\langle x, y \rangle = \sum_{i=1}^{n} x_i y_i \quad \text{and} \quad \| x \| = \sqrt{\langle x, x \rangle}.$$
2. The Problem
We aim to present an algorithm for solving the following optimization problem:
where and A is a full-rank matrix with
For this purpose, we need the following hypothesis:
Hypothesis 1.
f is nonlinear, twice continuously differentiable, and convex on , where is the set of feasible solutions of (P).
Hypothesis 2.
(P) satisfies the condition of interior point (IPC), i.e., there exists such that
Hypothesis 3.
The set of optimal solutions of (P) is nonempty and bounded.
Notice that these conditions are standard in this context. We refer to [17,20].
If is an optimal solution, there exist two Lagrange multipliers and such that
3. Formulation of the Perturbed Problem of (P)
Let us first consider the function defined on by the following:
where is a convex, lower semicontinuous and proper function given by the following:
Thus, is a proper, convex, and lower semicontinuous function.
Furthermore, the function g defined by
is convex. Notice that for the perturbed problem coincides with the initial problem (P); then,
3.1. Existence and Uniqueness of Optimal Solution
To show that the perturbed problem has a unique optimal solution, it is sufficient to demonstrate that the recession cone of is reduced to zero.
Proof.
For a fixed the function is proper, convex, and lower semicontinuous. The asymptotic function of is defined by the following:
thus, the asymptotic functions of f and satisfy the relation:
Moreover, hypothesis H3 is equivalent to
Then,
and from [17], for each non-negative real number the strictly convex problem admits a unique optimal solution, denoted by . The solution of the problem (P) is the limit of the sequence of solutions of the perturbed problem as tends to 0. □
3.2. Convergence of the Solution
Now, we are in a position to state the convergence result of to (P), which is proved in Lemma 1 of [18].
Let for all we define
Lemma 1
([18]). We consider If the perturbed problem admits an optimal solution such that , then the problem (P) admits an optimal solution
We use the classical prototype of penalty methods. We begin our process with
where
and the iteration scheme is divided into the following steps:
1. Select
2. Establish an approximate solution for It is obvious that
Remark 1.
If the values of the objective functions of the problem (P) and the perturbed problem are equal and finite, then (P) will have an optimal solution if and only if has an optimal solution.
The iterative process stops when we obtain an acceptable approximation of
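The outer penalty loop described above can be sketched as follows. This is a minimal illustration: the inner solver `solve_perturbed`, the reduction factor `shrink`, and the stopping tolerance are our own assumptions, not the paper's exact choices.

```python
def penalty_method(solve_perturbed, x0, eta0=1.0, shrink=0.1, eps=1e-8, max_outer=50):
    """Classical outer penalty prototype: solve the perturbed problem for a
    decreasing sequence of penalty parameters eta_k -> 0 (sketch)."""
    x, eta = x0, eta0
    for _ in range(max_outer):
        # warm-start the inner solver at the previous approximate solution
        x = solve_perturbed(x, eta)
        if eta < eps:          # stop once the perturbation is negligible
            break
        eta *= shrink          # select eta_{k+1} < eta_k
    return x
```

Each inner problem is only solved approximately in practice; warm-starting at the previous solution is what makes the scheme efficient.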
4. Computational Resolution of the Perturbed Problem
Our approach to the numerical solution of the perturbed problem consists of two stages. In the first, we calculate the descent direction using the Newton approach; in the second, we propose an efficient new minorant-functions approach to compute the displacement step more easily and quickly than the line search method.
4.1. The Descent Direction
As is strictly convex, the necessary and sufficient optimality conditions state that is an optimal solution of if and only if it satisfies the nonlinear system:
Using the Newton approach, a penalty method is provided to solve the above system, where the vector in each is given by
The solution of the following quadratic convex optimization problem is necessary to obtain the Newton descent direction d:
where and
with the diagonal matrix
The Lagrangian is given by the following:
where is the Lagrange multiplier. It suffices to solve the linear system of equations with
then,
It is simple to prove that system (3) is non-singular. We obtain
As and we obtain
The system can also be written as follows:
Thus, the Newton descent direction is obtained.
Throughout this paper, we take x instead of
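The Newton descent direction above solves an equality-constrained quadratic problem via its KKT system. The following is a minimal numerical sketch under generic assumptions (a positive definite Hessian H and a full-row-rank matrix A); it is an illustration of the linear algebra, not the paper's exact implementation.

```python
import numpy as np

def newton_direction(H, A, grad):
    """Newton descent direction for  min (1/2) d'Hd + grad'd  s.t.  A d = 0,
    obtained from the KKT system  [H A'; A 0] [d; lam] = [-grad; 0]  (sketch)."""
    n, m = H.shape[0], A.shape[0]
    K = np.block([[H, A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([-grad, np.zeros(m)])
    # the system is non-singular when H is positive definite and A has full row rank
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]        # direction d and multiplier lam
```

By construction, the returned direction satisfies A d = 0, so feasibility with respect to the equality constraints is preserved along d.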
4.2. Computation of the Displacement Step
This section deals with the numerical solution of the displacement step. We give a brief highlight of the line search methods used in nonlinear optimization problems. Then, we collect some important results of approximate function approaches applied to both semidefinite and linear programming problems. Finally, we propose our new approximate function method for the nonlinear optimization problem (P).
4.2.1. Line Search Methods
The line search methods consist of determining a displacement step , which ensures a sufficient decrease in the objective at each iteration, where , along the descent direction ; in other words, they involve solving the following one-dimensional problem:
The disadvantage of this approach is that the computed step is not necessarily optimal, so the feasibility of the next iterate is not guaranteed.
The line search techniques of Wolfe, Goldstein-Armijo, and Fibonacci are the most widely used. However, their computational cost is generally high, which motivated our search for an alternative.
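For reference, the Goldstein-Armijo backtracking strategy, one of the classical line search techniques just mentioned, can be sketched as follows; the parameter values `c1` and `beta` are conventional assumptions, not values prescribed by the paper.

```python
def armijo_backtracking(phi, phi0, dphi0, t0=1.0, c1=1e-4, beta=0.5, max_iter=60):
    """Goldstein-Armijo backtracking: shrink t until the sufficient-decrease
    condition  phi(t) <= phi(0) + c1 * t * phi'(0)  holds (sketch).
    phi(t) may return +inf outside the feasible domain."""
    t = t0
    for _ in range(max_iter):
        if phi(t) <= phi0 + c1 * t * dphi0:
            return t
        t *= beta
    return t
```

Each trial step requires a fresh evaluation of the objective, which is precisely the computational cost the approximate-function approach seeks to avoid.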
4.2.2. Approximate Functions Techniques
These methods are based on sophisticated techniques introduced by J.P. Crouzeix et al. [15] and A. Leulmi et al. [20] to obtain the solution of a semidefinite optimization problem.
The aim of these techniques is to give a minimized approximation of one real-variable function defined by
The function is convex, and we obtain the following:
We find that , deduced from (4), which is expected since d is the Newton descent direction.
We aim to avoid the disadvantages of line search methods and to accelerate the convergence of the algorithm. For this reason, we have to identify a step that yields a significant decrease in the function . This amounts to solving a polynomial equation when f is a linear function.
Now, we include a few helpful inequalities below, which are used throughout the paper.
H. Wolkowicz et al. [22] (see also Crouzeix and Seeger [23]) presented the following inequalities:
$$\bar{x} - \sigma_x\sqrt{n-1} \le \min_i x_i \le \bar{x} - \frac{\sigma_x}{\sqrt{n-1}}, \qquad \bar{x} + \frac{\sigma_x}{\sqrt{n-1}} \le \max_i x_i \le \bar{x} + \sigma_x\sqrt{n-1},$$
where $\bar{x}$ and $\sigma_x$ represent the mean and the standard deviation, respectively, of a series of real numbers $x_1, x_2, \dots, x_n$. The latter quantities are defined as follows:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \sigma_x^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2.$$
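These inequalities can be checked numerically. Assuming the standard Wolkowicz-Styan form of the bounds on the extreme values of a sample, a small sketch (the function name is ours):

```python
import numpy as np

def wolkowicz_bounds(x):
    """Wolkowicz-Styan bounds on the extreme values of a real sample:
       mean - s*sqrt(n-1) <= min(x) <= mean - s/sqrt(n-1)
       mean + s/sqrt(n-1) <= max(x) <= mean + s*sqrt(n-1)
    where s is the (population) standard deviation."""
    x = np.asarray(x, dtype=float)
    n, mean = x.size, x.mean()
    s = np.sqrt(((x - mean) ** 2).mean())
    r = np.sqrt(n - 1.0)
    return mean - s * r, mean - s / r, mean + s / r, mean + s * r
```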
Theorem 1
([15]). Let for We have the following:
where , and
We will proceed to present the paper’s principal result.
4.2.3. New Approximate Functions Approach
Let
be defined on such that
To find the displacement step, it is necessary to solve . Given the difficulty of solving such a non-algebraic equation, approximate functions are a recommended alternative.
Two novel approximation functions of are introduced in the following lemma.
Lemma 2.
For all with we have
and for all with we obtain
where
and
with
Furthermore, we have
Proof.
We start by proving that
Theorem 1 gives
then,
and
Hence,
Therefore,
Let us consider the following:
We have
Because of the fact that and it is easy to see that
Therefore,
then, □
Hence, the domain of is included in the domain of which is where
Let us remark that
Thus, is well approximated by in a neighborhood of Since is strictly convex, it attains its minimum at one unique point which is the unique root of the equation . This point belongs to the domain of Therefore, is bounded from below by
And it is also bounded from below by
Then, gives an apparent decrease in the function
4.3. Minimize an Auxiliary Function
We now consider the minimization of the function
and we also have the following approximate function:
where is defined in (7). Then, we have the following:
and
We remark that for
We impose the conditions and . The function is strictly convex; it attains its minimum at a unique point such that , which is one of the roots of the equations
and
For Equation (8), the roots are explicitly calculated, and we distinguish the following cases:
- If we obtain
- If we obtain
- If we have
- If , is the only root of the second-degree equation that belongs to the domain of definition of . We obtain both roots, and
Then, the root of Equation (9) is explicitly calculated, and we have
Consequently, we compute the two values , explicitly. Then, we take , where is a fixed precision and
Remark 2.
The computation of is performed through a dichotomous procedure in the cases where and , as follows:
1. Put .
2. While , do:
- If , then ;
- else, , so .
This computation guarantees a better approximation of the minimum of while remaining in the domain of
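The dichotomous procedure of Remark 2 is, in essence, a bisection on the sign of the derivative of the strictly convex auxiliary function. A generic sketch (the interval endpoints and the tolerance are illustrative assumptions):

```python
def dichotomy_root(dG, a, b, eps=1e-8, max_iter=200):
    """Dichotomous search for the root of the increasing derivative dG of a
    strictly convex function on [a, b] (sketch of the procedure in Remark 2)."""
    assert dG(a) < 0 < dG(b), "minimum must lie strictly inside [a, b]"
    while b - a > eps and max_iter > 0:
        t = 0.5 * (a + b)
        if dG(t) < 0:
            a = t          # minimizer lies to the right of t
        else:
            b = t          # minimizer lies to the left of t
        max_iter -= 1
    return 0.5 * (a + b)
```

Since the derivative of a strictly convex function is increasing, the sign test at the midpoint halves the bracketing interval at every step, which guarantees the approximation claimed above.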
4.4. The Objective Function f Is
4.4.1. Linear
For all there exists such that
The minimum of is reached at the unique root of the equation Then,
Take in the auxiliary function The two functions and coincide.
yields a significant decrease in the function along the descent direction It is interesting to note that the condition implies the following:
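To illustrate why the linear case leads to an explicitly solvable equation, consider a hypothetical instance with only two logarithmic terms, $\varphi(t) = a\,t - \ln(1+t y_1) - \ln(1+t y_2)$: setting $\varphi'(t)=0$ and clearing denominators yields a second-degree equation. This two-term form is our own toy illustration, not the paper's exact function.

```python
import numpy as np

def linear_case_step(a, y1, y2):
    """Toy linear case: phi(t) = a*t - log(1+t*y1) - log(1+t*y2).
    phi'(t) = 0 reduces, after clearing denominators, to the quadratic
       (a*y1*y2) t^2 + (a*(y1+y2) - 2*y1*y2) t + (a - y1 - y2) = 0,
    whose admissible root is computed explicitly (sketch with assumed data)."""
    A2 = a * y1 * y2
    A1 = a * (y1 + y2) - 2.0 * y1 * y2
    A0 = a - y1 - y2
    roots = np.roots([A2, A1, A0])
    # keep real roots inside the domain where both logarithms are defined
    ok = [t.real for t in roots if abs(t.imag) < 1e-12
          and 1 + t.real * y1 > 0 and 1 + t.real * y2 > 0]
    return min(ok) if ok else None
```

With more logarithmic terms the degree of the polynomial grows accordingly, which is why the minorant functions, whose stationarity equations stay of low degree, are attractive.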
4.4.2. Convex
is no longer constant, and the equation is not reduced to one equation of a second degree for
We consider another function less than Given we have, for all , the following:
then,
and
We choose in the auxiliary function , and we compute the root of the equation with
Therefore, we have two cases:
- Where : we have the following:
and, thus, along the direction , we obtain a significant decrease in the function . Since the approximation of by is more accurate for small values of , it is recommended to use a new value of situated between and the former one for the next iteration. Moreover, the cost of this supplementary computation is small, since it amounts to one evaluation of f and the resolution of a second-order equation.
- Where : the computation of is performed through a dichotomous procedure (see Remark 2).
5. Description of the Algorithm and Numerical Simulations
5.1. Description of the Algorithm
This section is devoted to introducing our algorithm for obtaining an optimal solution of (P).
Begin
Initialization
is a given precision. and are given.
is a strictly feasible solution from
Iteration
- Start with
- Calculate d and
- If calculate , and
- Determine following (8), (10), or (9) depending on the linear or nonlinear case.
- Take the new iterate and go back to step 2.
- If a good approximation of has been obtained.
- (a)
- If and return to step 2.
- (b)
- If STOP: a good approximate solution of has been obtained.
End algorithm.
The aim of this method is to reduce the number of iterations and the time consumption. In the next section, we provide some examples.
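To make the overall scheme concrete, the following end-to-end sketch applies the logarithmic-penalty Newton iteration to a toy problem, min (1/2)‖x‖² subject to sum(x) = 1, x > 0, whose unique optimal solution is x* = (1/n)·1. The toy problem, the fraction-to-boundary damping rule, the Armijo safeguard, and all parameter values are our own illustrative choices, not the algorithm's exact settings.

```python
import numpy as np

def solve_example(n=5, eta0=1.0, eps=1e-8):
    """Sketch of the full scheme: outer penalty loop + inner Newton iterations
    on  min (1/2)||x||^2  s.t.  sum(x) = 1, x > 0  (solution: x = ones(n)/n)."""
    A = np.ones((1, n))
    x = np.full(n, 1.0 / n) + np.linspace(-0.05, 0.05, n)  # strictly feasible start
    eta = eta0
    while eta > eps:
        F = lambda z: 0.5 * z @ z - eta * np.log(z).sum()   # penalized objective
        for _ in range(50):                                  # inner Newton loop
            grad = x - eta / x
            H = np.eye(n) + np.diag(eta / x ** 2)
            K = np.block([[H, A.T], [A, np.zeros((1, 1))]])
            d = np.linalg.solve(K, np.concatenate([-grad, [0.0]]))[:n]
            if np.linalg.norm(d) < 1e-10:
                break
            # displacement step: stay strictly positive, then Armijo safeguard
            t, neg = 1.0, d < 0
            if neg.any():
                t = min(1.0, 0.99 * np.min(-x[neg] / d[neg]))
            while F(x + t * d) > F(x) + 1e-4 * t * (grad @ d):
                t *= 0.5
            x = x + t * d
        eta *= 0.1
    return x
```

Since Ad = 0 at every Newton step, the equality constraint is preserved throughout, while the damping rule keeps the iterates strictly interior.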
5.2. Numerical Simulations
To assess the superior performance and accuracy of our algorithm, based on our minorant functions, numerical tests are conducted to make comparisons between our new approach and the classical line search method.
For this purpose, in this section, we present comparative numerical tests on different examples taken from the literature [5,24].
We report the results obtained by implementing the algorithm in MATLAB on an Intel Core i7-7700HQ (2.80 GHz) machine with 16.00 GB of RAM.
5.2.1. Examples with a Fixed Size
Nonlinear Convex Objective
Example 1.
Let us take the following problem:
The optimal value is and the optimal solution is
Example 2.
Let us take the following problem:
The optimal value is and the optimal solution is
Example 3.
Let us consider the following problem:
The optimal value is and the optimal solution is
This table presents the results of the previous examples:
| Example | st1 iter | st1 Time (s) | st2 iter | st2 Time (s) | LS iter | LS Time (s) |
|---|---|---|---|---|---|---|
| 1 | 12 | 0.0006 | 19 | 0.0015 | 6 | 0.0091 |
| 2 | 5 | 0.0004 | 9 | 0.0009 | 44 | 0.099 |
| 3 | 3 | 0.0001 | 5 | 0.0006 | 65 | 0.89 |
5.2.2. Example with a Variable Size
The Objective Function f Is
1-Linear: Let us consider the linear programming problem:
where A is an matrix given by the following:
where
The results are presented in the table below.
| Size | st1 iter | st1 Time (s) | st2 iter | st2 Time (s) | LS iter | LS Time (s) |
|---|---|---|---|---|---|---|
|  | 1 | 0.0021 | 2 | 0.0039 | 9 | 0.0512 |
|  | 1 | 0.0031 | 3 | 0.0045 | 13 | 0.0821 |
|  | 2 | 0.0049 | 3 | 0.0032 | 17 | 0.3219 |
|  | 2 | 0.0053 | 4 | 0.0088 | 19 | 0.5383 |
|  | 2 | 0.0088 | 4 | 0.0098 | 22 | 0.9220 |
|  | 3 | 0.0096 | 5 | 0.0125 | 26 | 9.2647 |
2-Nonlinear:
Example 4
(Quadratic case [13]). Let the quadratic problem be as follows:
with Q is the matrix defined for by the following:
This example is tested for many values of
The obtained results are given by the following table:
| ex() | st1 iter | st1 Time (s) | st2 iter | st2 Time (s) | LS iter | LS Time (s) |
|---|---|---|---|---|---|---|
|  | 5 | 0.9968 | 4 | 0.9699 | 26 | 19.5241 |
|  | 7 | 18.1448 | 5 | 9.6012 | 35 | 86.1259 |
|  | 12 | 36.3259 | 5 | 19.0099 | 23 | 98.2354 |
|  | 21 | 56.9912 | 17 | 41.1012 | 33 | 109.2553 |
|  | 28 | 140.1325 | 23 | 95.6903 | 40 | 1599.1596 |
Example 5
(The problem of Erikson [25]). Let the following be the quadratic problem:
where and are fixed.
This example is tested for different values of , and
The following table resumes the obtained results in the case
| ex() | st1 iter | st1 Time (s) | st2 iter | st2 Time (s) | LS iter | LS Time (s) |
|---|---|---|---|---|---|---|
|  | 1 | 0.0001 | 2 | 0.0012 | 4 | 0.0236 |
|  | 2 | 0.0021 | 3 | 0.0033 | 5 | 0.7996 |
|  | 2 | 0.0043 | 3 | 0.0201 | 5 | 1.5289 |
|  | 2 | 3.0901 | 4 | 5.9619 | 12 | 22.1254 |
In the above tables, we take = 1.0 × .
We also denote the following:
- (iter) is the number of iterations.
- (time) is the computational time in seconds (s).
- (st_i), i = 1, 2, represents the strategies of approximate functions introduced in this paper.
- (LS) represents the classical line search method.
Commentary: The numerical tests clearly show that our approach leads to a very significant reduction in computational cost and an improvement in the results. Compared with the line search approach, the approximate functions substantially reduce both the number of iterations and the computing time.
6. Conclusions
The contribution of this paper is particularly focused on the study of nonlinear optimization problems using the logarithmic penalty method based on new approximate functions. We first formulate the problem and its perturbed version, together with the convergence results. Then, we find their solutions by using new approximate functions.
Finally, to lend further support to our theoretical results, a simulation study is conducted to illustrate the good accuracy of the studied approach. More precisely, our new approximate functions approach outperforms the line search one as it significantly reduces the cost and computing time.
Funding
This work has been supported by the General Directorate of Scientific Research and Technological Development (DGRSDT-MESRS), Algeria, under PRFU project number C00L03UN190120220009.
Data Availability Statement
Data are contained within the article.
Acknowledgments
The author is very pleased to thank the editor and the reviewers for their helpful suggestions and comments.
Conflicts of Interest
The author declares no conflicts of interest.
References
- Frank, M.; Wolfe, B. An algorithm for quadratic programming. Nav. Res. Logist. Q. 1956, 3, 95–110. [Google Scholar] [CrossRef]
- Wolfe, P. A Duality Theorem for Nonlinear Programming. Q. Appl. Math. 1961, 19, 239–244. [Google Scholar] [CrossRef]
- Bracken, J.; McCormick, G.P. Selected Applications of Nonlinear Programming; John Wiley & Sons, Inc.: New York, NY, USA, 1968. [Google Scholar]
- Fiacco, A.V.; McCormick, G.P. Nonlinear Programming: Sequential Unconstrained Minimization Techniques; John Wiley & Sons, Inc.: New York, NY, USA, 1968. [Google Scholar]
- Bonnans, J.-F.; Gilbert, J.-C.; Lemaréchal, C.; Sagastizàbal, C. Numerical Optimization: Theoretical and Practical Aspects; Mathematics and Applications; Springer: Berlin/Heidelberg, Germany, 2003; Volume 27. [Google Scholar]
- Evtushenko, Y.G.; Zhadan, V.G. Stable barrier-projection and barrier-Newton methods in nonlinear programming. In Optimization Methods and Software; Taylor & Francis: Abingdon, UK, 1994; Volume 3, pp. 237–256. [Google Scholar]
- Nesterov, Y.E.; Nemirovskii, A. Interior-Point Polynomial Algorithms in Convex Programming; SIAM: Philadelphia, PA, USA, 1994. [Google Scholar]
- Wright, S.J. Primal–Dual Interior Point Methods; SIAM: Philadelphia, PA, USA, 1997. [Google Scholar]
- Ye, Y. Interior Point Algorithms: Theory and Analysis. In Discrete Mathematics Optimization; Wiley-Interscience Series; John Wiley & Sons: New York, NY, USA, 1997. [Google Scholar]
- Powell, M.J.D. Karmarkar’s Algorithm: A View from Nonlinear Programming; Department of Applied Mathematics and Theoretical Physics, University of Cambridge: Cambridge, UK, 1989; Volume 53. [Google Scholar]
- Rosen, J.B. The Gradient Projection Method for Nonlinear Programming. Soc. Ind. Appl. Math. J. Appl. Math. 1960, 8, 181–217. [Google Scholar] [CrossRef]
- Rosen, J.B. The Gradient Projection Method for Nonlinear Programming. Soc. Ind. Appl. Math. J. Appl. Math. 1961, 9, 514–553. [Google Scholar] [CrossRef]
- Ouriemchi, M. Résolution de Problèmes non Linéaires par les Méthodes de Points Intérieurs. Théorie et Algorithmes. Doctoral Thesis, Université du Havre, Havre, France, 2006. [Google Scholar]
- Forsgren, A.; Gill, P.E.; Wright, M.H. Interior Methods for Nonlinear Optimization; SIAM: Philadelphia, PA, USA, 2002; Volume 44, pp. 525–597. [Google Scholar]
- Crouzeix, J.P.; Merikhi, B. A logarithm barrier method for semidefinite programming. RAIRO-Oper. Res. 2008, 42, 123–139. [Google Scholar] [CrossRef]
- Menniche, L.; Benterki, D. A Logarithmic Barrier Approach for Linear Programming. J. Computat. Appl. Math. 2017, 312, 267–275. [Google Scholar] [CrossRef]
- Cherif, L.B.; Merikhi, B. A Penalty Method for Nonlinear Programming. RAIRO-Oper. Res. 2019, 53, 29–38. [Google Scholar] [CrossRef]
- Chaghoub, S.; Benterki, D. A Logarithmic Barrier Method Based on a New Majorant Function for Convex Quadratic Programming. IAENG Int. J. Appl. Math. 2021, 51, 563–568. [Google Scholar]
- Leulmi, A. Etude d’une Méthode Barrière Logarithmique via Minorants Functions pour la Programmation Semi-Définie. Doctoral Thesis, Université de Biskra, Biskra, Algeria, 2018. [Google Scholar]
- Leulmi, A.; Merikhi, B.; Benterki, D. Study of a Logarithmic Barrier Approach for Linear Semidefinite Programming. J. Sib. Fed. Univ. Math. Phys. 2018, 11, 300–312. [Google Scholar]
- Leulmi, A.; Leulmi, S. Logarithmic Barrier Method via Minorant Function for Linear Programming. J. Sib. Fed. Univ. Math. Phys. 2019, 12, 191–201. [Google Scholar] [CrossRef]
- Wolkowicz, H.; Styan, G.P.H. Bounds for Eigenvalues Using Traces. Lin. Alg. Appl. 1980, 29, 471–506. [Google Scholar] [CrossRef]
- Crouzeix, J.-P.; Seeger, A. New bounds for the extreme values of a finite sample of real numbers. J. Math. Anal. Appl. 1996, 197, 411–426. [Google Scholar] [CrossRef]
- Bazaraa, M.S.; Sherali, H.D.; Shetty, C.M. Nonlinear Programming; Wiley-Interscience, John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2006. [Google Scholar]
- Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423, 623–656. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).