Inexact Restoration Methods for Semivectorial Bilevel Programming Problem on Riemannian Manifolds

Liao, Jiagen; Wan, Zhongping

doi:10.3390/axioms11120696

Open AccessArticle

Inexact Restoration Methods for Semivectorial Bilevel Programming Problem on Riemannian Manifolds

by

Jiagen Liao

^*,†

and

Zhongping Wan

^†

School of Mathematics and Statistics, Wuhan University, Wuhan 430072, China

^*

Author to whom correspondence should be addressed.

^†

These authors contributed equally to this work.

Axioms 2022, 11(12), 696; https://doi.org/10.3390/axioms11120696

Submission received: 30 September 2022 / Revised: 1 December 2022 / Accepted: 2 December 2022 / Published: 5 December 2022

(This article belongs to the Special Issue 10th Anniversary of Axioms: Mathematical Analysis)

Download Versions Notes

Abstract

:

For a better understanding of the bilevel programming on Riemannian manifolds, a semivectorial bilevel programming scheme is proposed in this paper. The semivectorial bilevel programming is firstly transformed into a single-level programming problem by using the Karush–Kuhn–Tucker (KKT) conditions of the lower-level problem, which is convex and satisfies the Slater constraint qualification. Then, the single-level programming is divided into two stages: restoration and minimization, based on which an Inexact Restoration algorithm is developed. Under certain conditions, the stability and convergence of the algorithm are analyzed.

Keywords:

Riemannian manifolds; semivectorial bilevel programming; Inexact Restoration algorithm

MSC:

58C05; 90C25; 90C29

1. Introduction

The bilevel optimization problem on Euclidean spaces has been shown to be NP-hard, and even the verification of the local optimality for a feasible solution is in general NP-hard. Bilevel optimization problems are often nonconvex optimization problems, and this makes the computation of an optimal solution a challenging task. Thus, it is natural to consider the bilevel optimization problems on Riemannian manifolds. Actually, studying optimization problems on Riemannian manifolds has many advantages. Some constrained optimization problems on Euclidean spaces can be seen as unconstrained ones from the Riemannian geometry viewpoint. Moreover, some nonconvex optimization problems in the setting of Euclidean spaces may become convex optimization problems by introducing an appropriate Riemannian metric. See for instance [1,2]. The aim of this paper is to study the bilevel optimization problem on Riemannian manifolds.

In order to study the bilevel optimization problem on Riemannian manifolds, it is reasonable to have some idea of solving the bilevel optimization problem in Euclidean spaces. An approach to investigate bilevel optimization problems on Euclidean spaces is to replace the lower-level problem by its (under certain necessary and sufficient assumptions) KKT optimality conditions. In a recent article [3], the authors presented the KKT reformulation of the bilevel optimization problems on Riemannian manifolds. Moreover, it has been shown that global optimal solutions of the KKT reformulation correspond to global optimal solutions of the bilevel problem on the Riemannian manifolds provided the lower level convex problem satisfies Slater’s constraint qualification. On this basis, we consider a semivectorial bilevel optimization problem on Riemannian manifolds with a multiobjective problem in the lower-level problem. Since the Inexact Restoration (IR) algorithm [4,5] was introduced to solve constrained optimization problems and if we transform the semivectorial bilevel optimization problem into a single-level problem, it also can be solved by using the IR algorithm as a constrained optimization problem.

For the convenience of the readers, let us review the IR algorithm on Euclidean spaces firstly. Each iteration of the IR algorithm consists of two phases: restoration and minimization. Consider the following nonlinear programming:

\begin{matrix} min & f (x) \\ s . t . & C (x) \leq 0, x \in Ω, \end{matrix}

(1)

where

f : R^{n} \to R

and

C : R^{n} \to R^{m}

are continuous differentiable functions and the set

Ω \subset R^{m}

is closed convex. The algorithm generates feasible iterates with respect to

Ω

,

x^{k} \in Ω

(for all

k = 0, 1, 2 \dots

).

In the restoration step, which is executed once per iteration, an intermediate point

y^{k} \in Ω

is found such that the infeasibility at

y^{k}

is a fraction of the infeasibility at

x^{k}

. Immediately after restoration, we construct an approximation

π_{k}

of the feasible region using available information at

y^{k}

. In the minimization step, we compute a trial point

z^{k i} \in π_{k}

such that

f (z^{k i}) ≪ f (y^{k})

. Here, the symbol ≪ means sufficiently smaller than, and

∥ z^{k i} - y^{k} ∥ \leq δ_{k i}

, where

δ_{k i}

is a trust-region radius. The trial point

z^{k i}

is accepted as a new iteration one if the value of a nonsmooth (exact penalty) merit function at

z^{k i}

is sufficiently smaller than its value at

x^{k}

. If

z^{k i}

is not acceptable, the trust-region radius is reduced.

The IR algorithm is related to classical feasible methods for nonlinear programming, such as the generalized reduced gradient (GRG) and the family of sequential gradient restoration algorithms. There are several studies on the numerical characteristics of the IR algorithm. For example, this method was applied to the general constraint problem in [6], and good results were obtained. In addition, the IR algorithm using the regularization strategy was proposed in [7], in which the problem of derivative-free optimization was effectively solved. The IR algorithms are especially useful when there is some natural way to restore feasibility. One of the most successful applications of the IR algorithm is electronic structure calculation, as shown in [8]. Moreover, the IR algorithm has also been successful applied to optimization problems with the box constraint in [9] and problems with multiobjective constraints under weighted-sum scalarization in [10]. For more applications, please see [11,12].

Since the IR algorithm is so important in applications, many researches have been trying to improve it from different angles. The restoration phase improves feasibility, and in the minimization step, optimality is improved as a linear tangent approximation of the constraints. When a sufficient descent criterion does not hold, the trial point is modified in such a way that, eventually, acceptance occurs at a point that may be close to the solution of the restoration (first) phase. The acceptance criterion may use merit functions [4,5] or filters [13]. The minimization step consists of an inexact (approximate) minimization of f with linear constraints. In this case, the restoration step represents also an inexact minimization of infeasibility with linear constraints. Therefore, the available algorithms for (large-scale) linearly constrained minimization can be fully exploited; see the published articles [14,15,16]. Furthermore, IR techniques for constrained optimization were improved, extended, and analyzed in [7,17,18,19], among others.

Inspired and motivated by the research works [4,10,20,21,22,23,24,25], we introduce a kind of bilevel programming with a multiobjective problem in the lower level on Riemannian manifolds, the so-called semivectorial bilevel programming. Then, we transform the semivectorial bilevel programming into a single-level programming by using the KKT optimality conditions of the lower-level problem, which is convex and satisfies the Slater constraint qualification. Finally, we divide the single-level programming into two stages: restoration and minimization, and give an IR algorithm for semivectorial bilevel programming. Under certain conditions, we analyze the well-definiteness and convergence of the presented algorithm.

The remainder of this paper is organized as follows: In Section 2, some basic concepts, notations, and important results of Riemannian geometry are presented. In Section 3, we propose the semivectorial bilevel programming on the Riemannian manifold and give the KKT reformulation, and then, we present an algorithm by using the IR technique for solving the semivectorial bilevel programming on Riemannian manifolds. In Section 4, its convergence properties are studied. The conclusions are given in Section 5.

2. Preliminaries

An m-dimensional Riemannian manifold is a pair

(M, g)

, where M stands for an m-dimensional smooth manifold and g stands for a smooth, symmetric positive definite

(0, 2)

-tensor field on M, called a Riemannian metric on M. If

(M, g)

is a Riemannian manifold, then for any point

x \in M

, the restriction

g_{x} : T_{x} M \times T_{x} M \to R

is an inner product on the tangent space

T_{x} M

. The tangent bundle

T M

over M is

T M : = ⋃_{x \in M} T_{x} M

, and a vector field on M is a section of the tangent bundle, which is a mapping

X : M \to T M

such that, for any

x \in M

,

X (x) \equiv X_{x} \in T_{x} M

.

We denote

{〈 \cdot, \cdot 〉}_{x}

by the scalar product on

T_{x} M

with the associated norm

{∥ . ∥}_{x}

. The length of a tangent vector

v \in T_{x} M

is defined by

{∥ v ∥}_{x} = {〈 v, v 〉}^{\frac{1}{2}}

. Given a piecewise smooth curve

γ : [a, b] \subset R \to M

joining x to y, i.e.,

γ (a) = x

and

γ (b) = y

, then its length is defined by

L (γ) = \int_{a}^{b} {∥ \dot{γ} (t) ∥}_{γ (t)} d t

, where

\dot{γ}

means the first derivative of

γ

with respect to t. Let x and y be two points in Riemannian manifold

(M, g)

and

Γ_{x, y}

the set of all piecewise smooth curves joining x and y. The function:

d : M \times M \to R, d (x, y) : = inf {L (γ) : γ \in Γ_{x, y}}

is a distance on M, and the induced metric topology on M coincides with the topology of M as the manifold.

Let ∇ be the Levi-Civita connection associated with the Riemannian metric and

γ

be a smooth curve in M. A vector field X is said to be parallel along

γ : [0, 1] \to M

if

\nabla_{\dot{γ}} X = 0

. If

\dot{γ}

itself is parallel along

γ

joining x to y,

γ (0) = x, γ (1) = y and \nabla_{\dot{γ}} \dot{γ} = 0 on [0, 1],

then we say that

γ

is a geodesic, and in this case,

∥ \dot{γ} ∥

is constant. When

∥ \dot{γ} ∥ = 1

,

γ

is said to be normalized. A geodesic joining x to y in M is said to be minimal if its length equals

d (x, y)

.

By the Hopf–Rinow theorem, we know that, if M is complete, then any pair of points in M can be joined by a minimal geodesic. Moreover,

(M, d)

is a complete metric space, and the bounded closed subsets are compact. Furthermore, for the exponential mapping at x,

\exp_{x} : T_{x} M \to M

is well defined on

T_{x} M

. Clearly, a curve

γ : [0, 1] \to M

is a minimal geodesic joining x to y if and only if there exists a vector

v \in T_{x} M

such that

∥ v ∥ = d (x, y)

and

γ (t) = \exp_{x} (t v)

for each

t \in [0, 1]

.

Set

p \in M

and

V_{p} : = {v \in T_{p} M : γ_{v}

defined in

[0, 1]}

. The exponential mapping

\exp_{p} : V_{p} \to M

is defined by

\exp_{p} (v) = γ_{v} (1), \forall v \in V_{p}

. The exponential mapping

\exp_{p} : T_{p} M \to M

at

p \in M

is well posed on the tangent space

T_{p} M

. Obviously, a curve

γ : [0, 1] \to M

joining p and q is a minimum geodesic, if and only if there is a vector

v \in T_{p} M

such that

∥ v ∥ = d (p, q)

and

γ (t) = \exp_{p} (t v)

hold for every

t \in [0, 1]

.

The gradient of a differentiable function

f : M \to R

with respect to the Riemannian metric g is the vector field

grad f

defined by

g (grad f, X) = d f (X)

,

\forall X \in T M

, where

d f

denotes the differential of the function f.

In this normal coordinate system, the geodesics through p are represented by lines passing through the origin. Moreover, the matrix

(g_{i j})

associated with the bilinear form g at the point p in this orthonormal basis reduces to the identity matrix, and the Christoffel symbols vanish. Thus, for any smooth function

f : M \to R

, in normal coordinates around p, we obtain

grad f (p) = \sum_{i} \frac{\partial f}{\partial x^{i}} (p) \frac{\partial}{\partial x^{i}} .

Now, consider a smooth function

f : M \to R

and the real-valued function

T_{p} M ∋ v \mapsto f_{p} (v) : = f (\exp_{p} v)

defined around 0 in

T_{p} M

.

It is easy to see that

\frac{\partial f_{p}}{\partial x^{i}} (0) = \frac{\partial f}{\partial x^{i}} (p) .

The Taylor–Young formula (for Euclidean spaces) applied to

f_{p}

around the origin can be written using matrices as

f_{p} (v) = f_{p} (0) + J_{f_{p}} (0) v + \frac{1}{2} v^{T} {Hess}_{f_{p}} (0) v + {o (∥ v ∥}^{2}),

where

\begin{matrix} v & = {[v^{1} \dots v^{n}]}^{T}, \\ J_{f_{p}} (0) & = [\frac{\partial f}{\partial x^{1}} (p) \dots \frac{\partial f}{\partial x^{n}} (p)], \\ {Hess}_{f_{p}} (0) & = (\frac{\partial^{2} f}{\partial x^{i} \partial x^{j}} (p)) = {Hess}_{p} f (v, v) . \end{matrix}

In other words, we have the following Taylor–Young expansion for f around p:

f (\exp_{p} v) = f (p) + g_{p} (grad f, v) + \frac{1}{2} {Hess}_{p} f (v, v) + {o (∥ v ∥}_{p}^{2})

which holds in any coordinate system.

The set

A \subset M

is said to be convex if it contains a geodesic segment

γ

whenever it contains the end points of

γ

, that is

γ ((1 - t) a + t b)

is in A whenever

x = γ (a)

and

y = γ (b)

are in A, and

t \in [0, 1]

. A function

f : M \to R

is said to be convex if its restriction to any geodesic curve

γ : [a, b] \to M

is convex in the classical sense, such that the one real variable function

f \circ γ : [a, b] \to R

is convex. Let

P_{A}

denote the projection on

A \subset M

, that is, for each

x \in M

,

P_{A} x = \{\bar{x} \in A : d (x, \bar{x}) = inf_{z \in A} d (x, z)\} .

(2)

For more details and complete information on the fundamentals in Riemannian geometry, see [1,26,27,28].

3. Inexact Restoration Algorithm

We study an optimistic bilevel programming on an m-dimensional Riemannian manifold

(M, g)

, where the lower-level problem is a multi-objective problem, the so-called semivectorial bilevel programming. The problem is formulated below:

\begin{matrix} min & F (x) \\ s . t . & x \in Sol (MOP), \end{matrix}

(3)

where

F : M \to R

and

Sol (MOP)

is the effective solution set of the following multi-objective problem (MOP):

\begin{matrix} min & {f_{1} (x), \dots, f_{p} (x)} \\ s . t . & h (x) = 0, \\ x \in M, \end{matrix}

(4)

where

f = {f_{1} (x), \dots, f_{p} (x)} : M \to R^{p}

,

I : = {1, \dots, p}

,

h : M \to R^{n}

, and

D = {x \in M : h (x) = 0}

denote the feasible solution of the MOP.

Definition 1.

Let

f : M \to R^{p}

be a vectorial function on Riemannian manifold M. Then, f is said to be convex on M if, for every

x, y \in M

and every geodesic segment

γ : [0, 1] \to M

joining x to y, i.e.,

γ (0) = x

and

γ (1) = y

, it holds that

f (γ (t)) ⪯ (1 - t) f (x) + t f (y), t \in [0, 1] .

The above definition is a natural extension of the definition of convexity in Euclidean space to the Riemannian context; see [29].

Definition 2.

A point

x \in M

is said to be Pareto critical of f on Riemannian manifold M if, for any

v \in T_{x} M

, there are an index

i \in I

and

u \in grad f_{i} (x)

, such that

〈 u, v 〉 \geq 0 .

Definition 3.

(a) A point

x^{*} \in M

is a Pareto-optimal point of f on Riemannian manifold M if there is no

x \in M

with

f (x) ⪯ f (x^{*})

. (b) A point

x^{*} \in M

is a weak Pareto-optimal point of f on Riemannian manifold M if there is no

x \in M

with

f (x) ≺ f (x^{*})

.

We know that criticality is a necessary, but not a sufficient condition for optimality. Under the convexity of the vectorial function f, the following proposition shows that criticality is equivalent to weak optimality.

Proposition 1

([29]). Let

f : M \to R^{p}

be a convex function given by

f = {f_{1} (x), \dots, f_{p} (x)}

. A point

x \in M

is a critical Pareto-optimal point of the function f if and only if it is a weak Pareto-optimal point of the function f.

We assume that the functions

f = {f_{1} (x), \dots, f_{p} (x)} : M \to R^{p}

and

h : M \to R^{n}

are twice continuously differentiable and consider the weighted sun scaling problem related to the MOP, as follows.

Let

ω_{i} \geq 0, i = 1, \dots, p

such that

\sum_{i = 1}^{p} ω_{i} = 1

:

\begin{matrix} min_{x} & \sum_{i = 1}^{p} ω_{i} f_{i} (x) \\ s . t . & h (x) = 0, \\ x \in M . \end{matrix}

(5)

Note that, if

ω_{i} \geq 0, i = 1, \dots, p

such that

\sum_{i = 1}^{p} ω_{i} = 1

, then the weak Pareto-optimal solution sets of Problem (4) are equivalent to the union of the optimal solution sets of Problem (5). Meanwhile, if

f_{i} : M \to R

,

i = 1, \dots, p

is the convex function on the Riemannian manifold, then the function

\sum_{i = 1}^{p} ω_{i} f_{i} (x)

is also convex. Thus, the bilevel programming (3)–(4) can be transformed into the following problem:

\begin{matrix} min_{x, ω} & F (x) \\ s . t . & \sum_{i = 1}^{p} ω_{i} = 1, \\ ω_{i} \geq 0, i \in I, \\ x \in \arg min \{\begin{matrix} min & \sum_{i = 1}^{p} ω_{i} f_{i} (x) \\ s . t . & h (x) = 0, \\ x \in M . \end{matrix}\} . \end{matrix}

(6)

A strategy to solve the bilevel problem (6) on the Riemannian manifolds is to replace the lower-level problem with the KKT conditions. When the lower-level problem is convex and satisfies the Slater constraint qualification, the global optimal solutions of the KKT reformulation correspond to the global optimal solutions of the bilevel problem on the Riemannian manifolds. See Theorems 4.1 and 4.2 in [3].

In the following, we give the KKT reformulation of the semivectorial bilevel programming on Riemannian manifolds.

\begin{matrix} min_{x, ω} & F (x) \\ s . t . & ω \in W, \\ \sum_{i = 1}^{p} ω_{i} {grad}_{x} f_{i} (x) + {grad}_{x} h (x) μ = 0, \\ h (x) = 0, \\ x \in M, \end{matrix}

(7)

where

W = \{ω \in R^{p} : \sum_{i = 1}^{p} ω_{i} = 1, ω_{i} \geq 0, i = 1, \dots, p\}

is a convex and compact set,

μ \in R^{n}

, and M is a complete m-dimensional Riemannian manifold.

We will adopt an IR method to solve the optimization problem in two stages, first pursuing feasibility and optimality, keeping a certain control over the feasibility that has been realized. Consequently, the approach exploits the inherent minimization structure of the problem, especially in the feasibility phase, so that it can obtain better solutions. Moreover, in the feasibility phase of the IR strategy, the user is free to choose the method of his/her choice, as long as the recovered iteration satisfies some mild assumptions [4,5].

For simplicity, we introduce the following notations:

C (x, ω, μ) = (\begin{matrix} \sum_{i = 1}^{p} ω_{i} {grad}_{x} f_{i} (x) + {grad}_{x} h (x) μ \\ h (x) \end{matrix}) \in R^{m + n}

(8)

and

L (x, ω, μ, λ) = F (x) + C {(x, ω, μ)}^{T} λ, λ \in R^{m + n} .

(9)

We write shortly

s = (x, ω, μ) \in M \times W \times R^{n}

and give the Jacobian of C as follows:

C^{'} (s) = (\begin{matrix} \sum_{i = 1}^{p} ω_{i} {Hess}_{x} f_{i} + \sum_{j = 1}^{n} μ_{j} {Hess}_{x} h_{j} & {grad}_{x} f_{1} & \dots & {grad}_{x} f_{p} & {grad}_{x} h \\ {grad}_{x} h^{T} & 0 & \dots & 0 & 0 \end{matrix}) .

(10)

Thus, the semivectorial bilevel programming can be reduced:

\begin{matrix} min & F (s) \\ s . t . & C (s) = 0, \\ s \in M \times W \times R^{n} . \end{matrix}

(11)

Before giving a rigorous description of the algorithm, let us start with an overview of each step.

Restoration step: We apply any globally convergent optimization algorithm to solve the lower-level minimization problem parameterized by

z^{k} = (\bar{x}, ω^{k}, \bar{μ})

. Once an approximate minimizer

\bar{x}

and a pair of corresponding estimated Lagrange multiplier vectors are obtained, then we compute the current set

π_{k}

and the direction

d_{\tan}^{k}

.

Approximate linearized feasible region: The set

π_{k}

is a linear approximation of the region described by KKT

(\bar{x})

containing

z^{k} = (\bar{x}, ω^{k}, \bar{μ})

. This auxiliary region is given by

π_{k} = {s \in M \times W \times R^{n} : 〈 C^{'} (z^{k}), {\dot{γ}}_{s, z^{k}} (0) 〉 = 0} .

Descent direction: Using the projection on Riemannian manifolds, the projection defined on

π_{k}

is represented as follows:

P_{π_{k}} (z^{k}) = P_{k} (\exp_{z^{k}} (- η {grad}_{s} L (z^{k}, λ^{k}))),

where

η > 0

is an arbitrary scaling parameter independent of k. It turns out that

d_{\tan}^{k} = P_{k} (\exp_{z^{k}} (- η {grad}_{s} L (z^{k}, λ^{k}))) - z^{k}

which is a feasible descent direction on

π_{k}

.

Minimization step: The objective of the minimization step is to obtain

v^{k, i} \in π_{k}

such that

L (v^{k, i}, λ^{k}) < L (z^{k}, λ^{k})

and

v^{k, i} \in B_{k, i} = {v : d (v, z^{k}) \leq δ_{k, i}}

, where

δ_{k, i}

is a trust-region radius. The first trial point at each iteration is obtained using a trust-region radius

δ_{k, 0}

. A successive trust-region radius is tried until a point

v^{k, i}

is found such that the merit function at this point is sufficiently smaller than the merit function at

s^{k}

.

Merit function and penalty parameter: We decided to use a variant of the sharp Lagrangian merit function, given by

Ψ (s, λ, θ) = θ L (s, λ) + (1 - θ) | C (s) |,

where

θ \in (0, 1]

is a penalty parameter used to give different weights to the objective function and the feasibility objective. The choice of the parameter

θ

at each iteration depends on practical and theoretical considerations. Roughly speaking, we wish the merit function at the new point to be less than the merit function at the current point

s^{k}

.

That is, we want

{Ared}_{k, i} > 0

, where

{Ared}_{k, i}

is the actual reduction of the merit function, defined by

{Ared}_{k, i} = Ψ (s^{k}, λ^{k}, θ_{k, i}) - Ψ (v^{k, i}, λ^{k}, θ_{k, i}) .

So,

{Ared}_{k, i} = θ_{k, i} [L (s^{k}, λ^{k}) - L (v^{k, i}, λ^{k, i})] + (1 - θ_{k, i}) [| C (s^{k}) | - | C (v^{k, i}) |] .

However, merely a reduction of the merit function is not sufficient to guarantee convergence. In fact, we need a sufficient reduction of the merit function, which will be defined by the satisfaction of the following test:

{Ared}_{k, i} \geq 0.1 {Pred}_{k, i},

where

{Pred}_{k, i}

is a positive predicted reduction of the merit function

Ψ (s, λ, θ)

between

s^{k}

and

v^{k, i}

. It is defined by

\begin{matrix} {Pred}_{k, i} = & θ_{k, i} [L (s^{k}, λ^{k}) - L (v^{k, i}, λ^{k}) - C {(z^{k})}^{T} (λ^{k, i} - λ^{k})] + \\ (1 - θ_{k, i}) [| C (s^{k}) | - | C (z^{k}) |] . \end{matrix}

The quantity

{Pred}_{k, i}

defined above can be nonpositive depending on the value of the penalty parameter. Fortunately, if

θ_{k, i}

is small enough,

{Pred}_{k, i}

is arbitrarily close to

[| C (s^{k}) | - | C (z^{k}) |]

, which is necessarily nonnegative. Therefore, we will always be able to choose

θ_{k, i} \in (0, 1]

such that

{Pred}_{k, i} \geq \frac{1}{2} [| C (s^{k}) | - | C (z^{k}) |] .

When the criterion

{Ared}_{k, i} \geq 0.1 {Pred}_{k, i}

is satisfied, we accept

v^{k, i} = z^{k}

. Otherwise, we reduce the trust-region radius.

To establish IR methods for semivectorial bilevel programming on Riemannian manifolds, we adapt the IR method presented in [4]. In the presented algorithm, the parameters

η > 0

,

N > 0

,

θ_{- 1} \in (0, 1)

,

δ_{min} > 0

,

τ_{1} > 0

, and

τ_{2} > 0

are given. The initial approximations

s^{0} \in W \times M \times R^{n}

,

λ^{0} \in R^{m + n}

, as well as a sequence

{ω^{k}}

such that

\sum_{k = 0}^{+} \infty ω^{k} < + \infty

are also given.

4. Convergence Results

Using the method for studying the convergence of the IR algorithm in Euclidean spaces [20,22], the convergence results of IR algorithms for semivectorial bilevel programming on Riemannian manifolds are given under the following assumptions. From now on, we assume that the semivectorial bilevel optimization problems on Riemannian manifolds satisfy assumptions

H_{1}

–

H_{3}

stated below:

$H_{1}$: There exists $L_{1}$ such that, for all $(x, ω)$ , $(\bar{x}, \bar{ω}) \in M \times W$ , $μ, \bar{μ} \in R^{n}$ , and $ξ \in [0, ξ_{max}]$ ,

$| C^{'} (x, ω, μ) - C^{'} (\bar{x}, \bar{ω}, \bar{μ}) | \leq L_{1} d ((x, ω, μ), (\bar{x}, \bar{ω}, \bar{μ})) .$
$H_{2}$: There exists $L_{2}$ such that, for all $x, \bar{x} \in M$ ,

$| {gard}_{x} F (x) - {gard}_{x} F (\bar{x}) | \leq L_{2} d ((x, \bar{x})) .$
$H_{3}$: There exists $r \in [0, 1)$ , independently of k, such that the point $z^{k} = (\bar{x}, \bar{ω}, \bar{μ})$ obtained at the restoration phase satisfies

$| C (z^{k}) | \leq r | C (s^{k}) |,$

where $s^{k} = (x^{k}, ω^{k}, μ^{k})$ . Moreover, if $C (s^{k}) = 0$ , then $z^{k} = s^{k}$ .

Theorem 1 (Well-definiteness).

Under assumptions

H_{1} - H_{3}

, IR Algorithm 1 for bilevel programming is well defined.

Algorithm 1: Inexact Restoration algorithm

Define $θ_{k}^{min} = min {1, θ_{k - 1}, \dots, θ_{- 1}}$ , $θ_{k}^{large} = min {1, θ_{k}^{min} + ω^{k}}$ , and $θ_{k, - 1} = θ_{k}^{large}$ .
(Restoration phase) Find an approximate minimizer $\bar{x}$ and multipliers $\bar{μ} \in R^{n}$ for the problem:

$\begin{matrix} min_{x} & \sum_{i = 1}^{p} ω_{i}^{k} f_{i} (x) \\ s . t . & h (x) = 0, \\ x \in M, \end{matrix}$

and define $z^{k} = (\bar{x}, ω^{k}, \bar{μ})$ .
(Direction) Compute

$d_{\tan}^{k} = P_{k} (\exp_{z^{k}} (- η {grad}_{s} L (z^{k}, λ^{k}))) - z^{k},$

where $P_{k}$ is the projection on

$π_{k} = {s \in M \times W \times R^{n} : 〈 C^{'} (z^{k}), {\dot{γ}}_{s, z^{k}} (0) 〉 = 0},$

and $P_{k} (\exp_{z^{k}} (- η {grad}_{s} L (z^{k}, λ^{k})))$ is a solution of the following problem:

$\begin{matrix} min_{y \in M \times W \times R^{n}} & \frac{1}{2} ∥ y - \exp_{z^{k}} (- η {grad}_{s} L (z^{k}, λ^{k})) ∥^{2} \\ s . t . & 〈 C^{'} (z^{k}), {\dot{γ}}_{y, z^{k}} (0) 〉 = 0 . \end{matrix}$

If $z^{k} = s^{k}$ , $d_{\tan}^{k} = 0$ , then stop and return $x^{k}$ as a solution of Problem (7). Otherwise, we set
$i \leftarrow 0$ and choose $δ_{k, 0} \geq δ_{min}$ .
(Minimization phase) If $d_{\tan}^{k} = 0$ , then we take $v^{k, i} = z^{k}$ . Otherwise, we take $t_{break}^{k, i} = min \{1, \frac{δ_{k, i}}{d_{\tan}^{k}}\}$ and find $v^{k, i} \in π_{k}$ such that, for some $0 < t < t_{break}^{k, i}$ , we have

$L (v^{k, i}, λ^{k}) \leq max \{L (z^{k} + t d_{\tan}^{k}, λ^{k}), L (z^{k}, λ^{k}) - τ_{1} δ_{k, i}, L (z^{k}, λ^{k}) - τ_{2}\}$

and $d (v^{k, i}, z^{k}) \leq δ_{k, i}$ .
If $d_{\tan}^{k} = 0$ , define $λ^{k, i} = λ^{k}$ . Otherwise, we take $λ^{k, i} \in R^{n + m}$ such that $| λ^{k, i} | \leq N$ .
For all $θ \in [0, 1]$ , we define

$\begin{matrix} {Pred}_{k, i} (θ) = & θ [L (s^{k}, λ^{k}) - L (v^{k, i}, λ^{k}) - C {(z^{k})}^{T} (λ^{k, i} - λ^{k})] + \\ (1 - θ) [| C (s^{k}) | - | C (z^{k}) |] . \end{matrix}$

We take $θ_{k, i}$ as the maximum $θ \in [0, θ_{k, i - 1}]$ that it satisfies:

${Pred}_{k, i} (θ) \geq \frac{1}{2} [| C (s^{k}) | - | C (z^{k}) |],$

(12)

and define ${Pred}_{k, i} = {Pred}_{k, i} (θ_{k, i})$ .
Compute

${Ared}_{k, i} = θ_{k, i} [L (s^{k}, λ^{k}) - L (v^{k, i}, λ^{k, i})] + (1 - θ_{k, i}) [| C (s^{k}) | - | C (v^{k, i}) |] .$

If

${Ared}_{k, i} \geq 0.1 {Pred}_{k, i},$

then we take

$\begin{matrix} s^{k + 1} = v^{k, i}, & λ^{k + 1} = λ^{k, i}, θ_{k} = θ_{k, i}, δ_{k} = δ_{k, i}, \\ {Ared}_{k} = {Ared}_{k, i}, {Pred}_{k} = {Pred}_{k, i} . \end{matrix}$

and finish the current $k th$ iteration. Otherwise, we choose $δ_{k, i + 1} \in [0.1 δ_{k, i}, 0.9 δ_{k, i}]$ , set $i \leftarrow i + 1$ , and go to Step 4.

Proof.

According to Step 6 and Step 7 of Algorithm 1, it can be calculated that

\begin{matrix} {Ared}_{k, i} - 0.1 {Pred}_{k, i} & = 0.9 {Pred}_{k, i} + (1 - θ_{k, i}) [| C (z^{k}) | - | C (v^{k, i}) |] \\ + θ_{k, i} [L (v^{k, i}, λ^{k}) - L (v^{k, i}, λ^{k, i}) + C {(z^{k})}^{T} (λ^{k, i} - λ^{k})] \\ = 0.9 {Pred}_{k, i} + (1 - θ_{k, i}) [| C (z^{k}) | - | C (v^{k, i}) |] \\ + θ_{k, i} [C {(v^{k, i})}^{T} λ^{k} - C {(v^{k, i})}^{T} λ^{k, i} + C {(z^{k})}^{T} (λ^{k, i} - λ^{k})] \\ = 0.9 {Pred}_{k, i} + (1 - θ_{k, i}) [| C (z^{k}) | - | C (v^{k, i}) |] \\ + θ_{k, i} {(C (z^{k}) - C (v^{k, i}))}^{T} (λ^{k, i} - λ^{k}) . \end{matrix}

Through the condition (12), we have

\begin{matrix} {Ared}_{k, i} - 0.1 {Pred}_{k, i} & \geq 0.45 [| C (s^{k}) | - | C (z^{k}) |] + (1 - θ_{k, i}) [| C (z^{k}) | - | C (v^{k, i}) |] \\ + θ_{k, i} {(C (z^{k}) - C (v^{k, i}))}^{T} (λ^{k, i} - λ^{k}) . \end{matrix}

(13)

Then, from the assumption

H_{3}

,

\begin{matrix} {Ared}_{k, i} - 0.1 {Pred}_{k, i} & = 0.45 (1 - r) | C (s^{k}) | + (1 - θ_{k, i}) [| C (z^{k}) | - | C (v^{k, i}) |] \\ + θ_{k, i} {(C (z^{k}) - C (v^{k, i}))}^{T} (λ^{k, i} - λ^{k}) . \end{matrix}

If

C (s^{k}) \neq 0

, due to the continuity of C and

δ_{k, i} \to 0

, we have

| C (z^{k}) | - | C (v^{k, i}) | \to 0

. Thus, there exists a positive constant

δ_{k, i}

such that

{Ared}_{k, i} - 0.1 {Pred}_{k, i} \geq 0 .

This means that the algorithm is well defined when

C (s^{k}) \neq 0

.

If

C (s^{k}) = 0

, then

s^{k}

is feasible. Since the algorithm does not terminate at the kth iteration, we know that

d_{\tan}^{k} \neq 0

. Therefore, we have

z^{k} = s^{k} a n d C (z^{k}) = C (s^{k}) = 0 .

Combining the condition (12), it follows that

{Pred}_{k, i} (θ) = θ [L (s^{k}, λ^{k}) - L (v^{k, i}, λ^{k})] \geq 0,

and independent of

θ

, for all i,

θ_{k, i} = θ_{k, - 1}

. In terms of the inequality (13), when

δ_{k, i}

is sufficiently small, we obtain

{Ared}_{k, i} - 0.1 {Pred}_{k, i} \geq 0 .

Therefore, Algorithm 1 is well defined. □

The next theorem is an important tool for proving the convergence of Algorithm 1. We prove that the actual reduction

{Ared}_{k, i^{*}}

, with

i^{*}

the accepted value of i, achieved at each iteration necessarily tends to 0.

Theorem 2.

Under the assumptions

H_{1} - H_{3}

, if Algorithm 1 generates an infinite sequence, then

lim_{k \to + \infty} {Ared}_{k} = 0, lim_{k \to + \infty} | C (s^{k}) | = 0 .

The same results above occur when

λ^{k} = 0

, for all k.

Proof.

Let us prove that

{lim}_{k \to + \infty} {Ared}_{k} = 0

, i.e., we need to prove

lim_{k \to + \infty} [θ_{k} [L (s^{k}, λ^{k}) - L (s^{k + 1}, λ^{k + 1})] + (1 - θ_{k}) [| C (s^{k}) | - | C (s^{k + 1}) |]] = 0,

that is

lim_{k \to + \infty} [θ_{k} L (s^{k}, λ^{k}) + (1 - θ_{k}) | C (s^{k}) | - [θ_{k} L (s^{k + 1}, λ^{k + 1}) + (1 - θ_{k}) | C (s^{k + 1}) |]] = 0,

namely

lim_{k \to + \infty} [Ψ (s^{k}, θ_{k}) - Ψ (s^{k + 1}, θ_{k})] = 0,

where

Ψ (s^{k}, θ_{k}) = θ_{k} L (s^{k}, λ^{k}) + (1 - θ_{k}) | C (s^{k}) |

.

By contradiction, suppose that there is an infinite indicator set

T_{1} \subset {0, 1, 2 \dots}

and a positive constant

ζ > 0

such that, for any

k \in T_{1}

, we have

Ψ (s^{k + 1}, θ_{k}) \leq Ψ (s^{k}, θ_{k}) - ζ .

Let

Ψ_{k} = Ψ (s^{k}, θ_{k})

, then

\begin{matrix} Ψ_{k + 1} & = θ_{k + 1} L (s^{k + 1}, λ^{k + 1}) + (1 - θ_{k + 1}) | C (s^{k + 1}) | \\ = θ_{k + 1} L (s^{k + 1}, λ^{k + 1}) + (1 - θ_{k + 1}) | C (s^{k + 1}) | \\ - θ_{k} L (s^{k + 1}, λ^{k + 1}) + (1 - θ_{k}) | C (s^{k + 1}) | \\ + θ_{k} L (s^{k + 1}, λ^{k + 1}) + (1 - θ_{k}) | C (s^{k + 1}) | \\ = (θ_{k + 1} - θ_{k}) L (s^{k + 1}, λ^{k + 1}) + (θ_{k} - θ_{k + 1}) | C (s^{k + 1}) | \\ + θ_{k} L (s^{k + 1}, λ^{k + 1}) + (1 - θ_{k}) | C (s^{k + 1}) | \\ \leq (θ_{k} - θ_{k + 1}) [∥ C (s^{k + 1}) ∥ - L (s^{k + 1}, λ^{k + 1})] \\ + θ_{k} L (s^{k}, λ^{k}) + (1 - θ_{k}) | C (s^{k}) | - ζ_{k} . \end{matrix}

Equivalently,

Ψ_{k + 1} \leq (θ_{k} - θ_{k + 1}) [| C (s^{k + 1}) | - L (s^{k + 1}, λ^{k + 1})] + Ψ_{k} - ζ_{k},

(14)

where

ζ_{k} > 0

and

ζ_{k} > ζ > 0

,

k \in T_{1}

.

According to the definition of

θ_{k, - 1}

,

θ_{k} - θ_{k + 1} + ω_{k} \geq 0, k \in T_{1} .

There is an upper bound

c > 0

, such that

| C (s^{k}) | - ∥ L (s^{k + 1}, λ^{k + 1}) ∥ \leq c .

(15)

Combining the inequalities (14) and (15), it follows that

\begin{matrix} Ψ_{j + 1} & \leq (θ_{j} - θ_{j + 1} + ω_{j}) [| C (s^{j + 1}) | - L (s^{j + 1}, λ^{j + 1})] \\ + Ψ_{j} - ζ_{j} - ω_{j} [| C (s^{j + 1}) | - L (s^{j + 1}, λ^{j + 1})] \\ \leq (θ_{j} - θ_{j + 1} + ω_{j}) c + Ψ_{j} - ζ_{j} + ω_{j} c \\ \leq (θ_{j} - θ_{j + 1}) c + Ψ_{j} - ζ_{j} + 2 ω_{j} c . \end{matrix}

Then, for all

k \geq 1

, we have

\begin{matrix} Ψ_{k} & \leq Ψ_{0} + (θ_{0} - θ_{k + 1}) c + - \sum_{j = 0}^{k - 1} ζ_{j} + \sum_{j = 0}^{k - 1} 2 ω_{j} c \\ \leq Ψ_{0} + 2 c + - \sum_{j = 0}^{k - 1} ζ_{j} + \sum_{j = 0}^{k - 1} 2 ω_{j} c . \end{matrix}

Since

\sum_{j = 0}^{k - 1} 2 ω_{j}

is the convergence and

ζ_{j}

is bounded away from zero, this implies that

Ψ_{k}

is unbounded. This is a contradiction. Thus, we have that

{lim}_{k \to + \infty} {Ared}_{k} = 0

. In addition, in a similar way, we can prove

{lim}_{k \to + \infty} | C (s^{k}) | = 0

. □

According to Theorem 2, it means that the point generated by the IR algorithm for the KKT transformation (7) will converge to a feasible point eventually. Then, we prove that

d_{t a n}^{k}

cannot be bounded away from zero under the following assumption

H_{4}

. This means that the point generated by the IR algorithm will converge to a weak Pareto solution of Problem (7):

$H_{4}$: There exists $β > 0$ , independently of k, such that

$d (s^{k}, z^{k}) \leq β | C (s^{k}) | .$

Theorem 3.

Suppose that the assumptions

H_{1}

,

H_{2}

,

H_{3}

, and

H_{4}

hold. If

{s^{k}}

is an infinite sequence generated by Algorithm 1,

{z^{k}}

is the sequence defined at the restoration phase in Algorithm 1, then:

1: $|C (s^{k})| \to 0$ .
2: There exists a limit point $s^{*}$ of ${s^{k}}$ .
3: Every limit point of ${s^{k}}$ is a feasible point of the KKT reformulation (7).
4: If, for all ω, a global solution of the lower-level problem is found, then any limit point $(x^{*}, ω^{*})$ is feasible for the weighted semivectorial bilevel programming (6).
5: If $s^{*}$ is a limit point of ${s^{k}}$ , there exists an infinite set $K \subset N$ such that

$lim_{k \in K} s^{k} = lim_{k \in K} z^{k} = s^{*}, C (s^{*}) = 0, lim_{k \in K} d_{t a n}^{k} = 0 .$

Proof.

We can prove the first two items from Theorem 2 and the assumption

H_{1} - H_{3}

. Based on the conclusions of the first two terms, the third and forth items are valid. The fifth item follows from the assumption

H_{4}

and the first item. □

The above conclusions give the well-definiteness and convergence of the algorithm proposed for semivectorial bilevel programming on Riemannian manifolds. From the point of view of the assumption put forward in this paper, the assumptions

H_{3}

and

H_{4}

are related to the sequences generated by the IR algorithm. Therefore, it is worth studying establishing sufficient conditions to ensure their effectiveness. Two assumptions about the lower-level problem are given below to verify the hypotheses

H_{3}

and

H_{4}

:

$H_{5}$: For every solution $s = (x, ω, μ)$ of $C (x, ω, μ) = 0$ , such that the gradients $grad h_{i} (x)$ , $i = 1, \dots, n$ of the active lower level constraints are linearly independent.
$H_{6}$: For every solution $s = (x, ω, μ)$ of $C (x, ω, μ) = 0$ such that the matrix:

$H (x, ω, μ) = \sum_{i = 1}^{p} ω_{i} {Hess}_{x} f_{i} (x) + \sum_{i = 1}^{n} μ_{i} {Hess}_{x} h_{i} (x),$

is positive definite in the following set:

$Z (x) = {d \in R^{n} | grad h {(x)}^{T} d = 0, d_{j} = 0 for all j} .$

For convenience, to verify

H_{3}

and

H_{4}

, we define the following matrix:

D^{'} (s) = (\begin{matrix} \sum_{i = 1}^{p} ω_{i} {Hess}_{x} f_{i} + \sum_{i = 1}^{n} μ_{i} {Hess}_{x} h_{i} & {grad}_{x} h \\ {grad}_{x} h^{T} & 0 \end{matrix}) .

Lemma 1.

The matrix

D^{'} (s)

is non-singular for any solution

s = (x, ω, μ)

of

C (x, ω, μ) = 0

.

Proof.

Assuming that there exist

u \in R^{m}

and

v \in R^{p}

such that

D^{'} (s) (\begin{matrix} u \\ v \end{matrix}) = 0,

then we have

\begin{matrix} (\sum_{i = 1}^{p} ω_{i} {Hess}_{x} f_{i} + \sum_{i = 1}^{n} μ_{i} {Hess}_{x} h_{i}) u + {grad}_{x} h v = 0, \end{matrix}

(16)

\begin{matrix} {grad}_{x} h u = 0 . \end{matrix}

(17)

According to the assumptions

H_{5}

–

H_{6}

and Equalities (16) and (17), it follows that

u = 0

and

v = 0

. This means that the matrix

D^{'} (s)

is non-singular for any solution

s = (x, ω, μ)

of

C (x, ω, μ) = 0

. □

Let

D (s)

be defined on

M \times W \times R^{n}

, for each

ω \in W

, a solution

u (ω) = (x (ω), μ (ω))

of

C (x, ω, μ) = 0

such that the function

v (ω) = u (ω)

is continuous on W. Now, we fix the function

v (ω)

, by Lemma 1, and we can define a function

Υ (ω) = D^{'} {(ω, v (ω))}^{- 1}

over the set W. Let

V (v (ω), α) = {v \in M \times R^{n} : d (v, v (ω)) \leq α}

. Furthermore, the following lemma can be obtained.

Lemma 2.

There exist

α > 0

and

β > 0

, such that, for all

ω \in W

, it holds

| Υ (ω) | < β

, and for all

v \in V (v (ω), α)

,

Υ (ω)

coincides with the local inverse operator of

D^{'} (ω, \cdot)

.

Proof.

Since

D^{'} (ω, v)

is continuous on

(ω, v)

,

v (ω)

is continuous on W, and

Υ (ω)

is continuous with respect to

ω \in W

, there exists

β > 0

, such that, for all

ω \in W

,

| Υ (ω) | < β

.

For each fixed value of

ω \in W

, associated with each v, the continuously differentiable operator of the vector

C (ω, v)

verifies the assumption of the inverse function theorem at

v (ω)

. Hence, there exists

α > 0

such that

C (ω, \cdot)

has a continuously differentiable local inverse operator

G (ω) : C (ω, V (v (ω), α)) \mapsto V (v (ω), α)

, and the Jacobian matrix

{[G (ω)]}^{'}

is consistent with

Υ (ω)

. This ends the proof. □

Finally, we state that

H_{3}

and

H_{4}

hold under the assumptions

H_{5}

to

H_{6}

. The next theorem summarizes this fact, and it can be proven as follows.

Theorem 4.

Let

r \in [0, 1)

,

(ω, u) \in W \times M \times R^{n}

be such that

C (ω, u) \neq 0

. If the assumptions

H_{5}

–

H_{6}

hold, then there exist

β > 0

,

ω \in W

, and

\bar{u} = (\bar{x}, \bar{μ}) \in M \times R^{n}

such that

| C (ω, \bar{u}) | \leq r | C (ω, u) |,

and

d ((ω, u), (ω, \bar{u})) \leq β | C (ω, u) | .

Proof.

According to Lemmas 1 and 2, combining the assumptions

H_{5}

and

H_{6}

, by using Taylor expansions of the functions on Riemannian manifolds, the statement follows from the results of [20]. This ends the proof. □

Example 1.

We consider the particular case

M = R_{+}^{2} : = {(x_{1}, x_{2}) \in R^{2} | x_{1} > 0, x_{2} > 0}

with the metric g given in Cartesian coordinates

(x_{1}, x_{2})

around the point

x \in M

by the matrix:

M ∋ y \mapsto {(g_{i j})}_{y} = (g (\frac{\partial}{\partial y_{i}}, \frac{\partial}{\partial y_{j}})) : = diag (x_{1}^{- 1}, x_{2}^{- 1}) .

In other words, for any vectors

u = (u_{1}, u_{2})

and

v = (v_{1}, v_{2})

in the tangent plane at

x \in M

, denoted by

T_{y} M

, which coincides with

R^{2}

, we have

g (u, v) = \frac{u_{1} v_{1}}{x_{1}} + \frac{u_{2} v_{2}}{x_{2}} .

Let

a = (a_{1}, a_{2}) \in M

and

v = (v_{1}, v_{2}) \in T_{a} M

. It is easy to see that the (minimizing) geodesic curve

t \mapsto γ (t)

verifying

γ (0) = a

,

γ (0) = v

is given by

R ∋ t \mapsto (a_{1} e^{\frac{v_{1}}{a_{1}} t}, a_{2} e^{\frac{v_{2}}{a_{2}} t}) .

Hence, M is a complete Riemannian manifold. Furthermore, the (minimizing) geodesic segment

γ : [0, 1] \to M_{2}

joining the points

a = (a_{1}, a_{2})

and

b = (b_{1}, b_{2})

, i.e.,

γ (0) = a

,

γ (1) = b

is given by

γ_{i} (t) = a_{1}^{1 - t} b_{i}^{t}

,

i = 1, 2

. Thus, the distance d on the metric space

(M_{2}, g_{2})

is given by

\begin{matrix} d (a, b) & = \int_{0}^{1} {∥ \dot{γ} (t) ∥}_{γ (t)} d t = \int_{0}^{1} \sqrt{{(\frac{{\dot{γ}}_{1} (t)}{γ_{1} (t)})}^{2} + {(\frac{{\dot{γ}}_{2} (t)}{γ_{2} (t)})}^{2}} d t \\ = \sqrt{{(ln \frac{a_{1}}{b_{1}})}^{2} + {(ln \frac{a_{2}}{b_{2}})}^{2}} . \end{matrix}

It follows easily that the closed ball

B (a; R)

centered in

a \in M

of radius

R \geq 0

verifies

[a_{1} e^{- \frac{R}{\sqrt{2}}}, a_{1} e^{- \frac{R}{\sqrt{2}}}] \times [a_{2} e^{- \frac{R}{\sqrt{2}}}, a_{2} e^{- \frac{R}{\sqrt{2}}}] \subset B (a; R);

thus, every closed rectangle

[ρ_{1}, η_{1}] \times [ρ_{2}, η_{2}] (ρ_{1} > 0, ρ_{2} > 0)

is bounded in the metric space

(M, g)

with the distance d.

Next, we consider the functions

F : M \to R

,

f : M \to R^{2}

and

h : M \to R

given for any

x \in M

by

\begin{matrix} F (x) & = - x_{1}, \\ f_{1} (x) & = \frac{1}{2} {(x_{1} - 1)}^{2} - \frac{3}{4} ln x_{1} + \frac{3}{8} {(x_{2} - 1)}^{2}, \\ f_{2} (x) & = \frac{1}{4} {(x_{1} - 1)}^{2} - \frac{3}{8} ln x_{1} + \frac{3}{16} {(x_{2} - 1)}^{2}, \\ h (x) & = \frac{1}{3} {(x_{1} - 1)}^{2} + \frac{1}{3} {(x_{2} - 1)}^{2} - \frac{1}{3} . \end{matrix}

It is easy to see that, for

x \in M

and any geodesic segment

γ : [0, 1] \to M

with

γ (0) = a

,

γ (1) = b

, the functions

f_{i} (x), i = 1, 2

, and

h (x)

are all convex on M with the Riemannian metric g. Moreover, the function

h (x)

satisfies the Slater constraint qualification.

We then consider the corresponding KKT reformulation of the semivectorial bilevel programming on Riemannian manifolds:

\begin{matrix} min_{x, ω} & F (x) = - x_{2} \\ s . t . & ω \in W, \\ \sum_{i = 1}^{2} ω_{i} {grad}_{x} f_{i} (x) + {grad}_{x} h (x) μ = 0, \\ h (x) = 0, \\ x \in M . \end{matrix}

By the definition of the gradient of a differentiable function with respect to the Riemannian metric g, let

ω_{1} = \frac{1}{3}

,

ω_{2} = \frac{2}{3}

,

ω_{1} + ω_{2} = 1

, and

μ = {(\frac{1}{2}, \frac{3}{4})}^{T} \in R^{2}

; we have

\begin{matrix} min_{x, ω} & F (x) = - x_{1} \\ s . t . & {(x_{1} - \frac{1}{2})}^{2} + {(x_{2} - \frac{1}{2})}^{2} - 1 = 0, \\ \frac{1}{3} {(x_{1} - 1)}^{2} + \frac{1}{3} {(x_{2} - 1)}^{2} - \frac{1}{3} = 0, \\ x \in M . \end{matrix}

It is easy to see that the unique optimal solution of the KKT reformulation is

x = (\frac{3 - \sqrt{7}}{4}, \frac{3 + \sqrt{7}}{4})

.

According to Algorithm 1, we first give the initial approximations

s^{0} \in W \times M \times R^{2}

,

λ^{0} \in R^{2}

, and a sequence

{ω^{k}}

. In the restoration phase, find an approximate minimizer

\bar{x} = (\bar{x_{1}}, \bar{x_{2}}) \in M

and multiplier

\bar{μ} = (\bar{μ_{1}}, \bar{μ_{2}}) \in R^{2}

for the problem:

\begin{matrix} min_{x} & ω_{1}^{k} f_{1} (x) + ω_{2}^{k} f_{2} (x) \\ s . t . & h (x) = 0, \\ x \in M, \end{matrix}

and define

z^{k} = (\bar{x}, ω^{k}, \bar{μ})

.

We then compute the direction by using the exponential mapping and the projection defined on Riemannian manifold M.

\begin{matrix} d_{\tan}^{k} & = P_{k} (\exp_{z^{k}} (- η {grad}_{s} L (z^{k}, λ^{k}))) - z^{k}, \\ = P_{k} (z_{1}^{k} e^{- η \frac{{grad}_{s} L (z^{k}, λ^{k})}{z_{1}^{k}}}, z_{2}^{k} e^{- η \frac{{grad}_{s} L (z^{k}, λ^{k})}{z_{2}^{k}}}) - z^{k}, \end{matrix}

where

L (z^{k}, λ^{k}) = - x_{1} + λ_{1}^{k} (\sum_{i = 1}^{2} ω_{i}^{k} {grad}_{s} f_{i} (\bar{x}) + {grad}_{s} h (\bar{x}) \bar{μ}) + λ_{2}^{k} h (\bar{x})

.

In the minimization phase, we first find

v^{k, i}

such that

L (v^{k, i}, λ^{k}) < L (z^{k}, λ^{k})

and

v^{k, i} \in B_{k, i} = {v : d (v, z^{k}) \leq δ_{k, i}}

. Then, by calculating the actual reduction

{Ared}_{k, i}

and positive predicted reduction

{Pred}_{k, i}

of the merit function

Ψ (s, λ, θ)

such that

{Ared}_{k, i} \geq 0.1 {Pred}_{k, i}

, we obtain a sequence

{s^{k}}

.

According to Theorems 3 and 4, the sequence

{s^{k}}

generated by the IR method established in the present paper converges to a solution of the semivectorial bilevel programming on Riemannian manifolds.

5. Conclusions

In this paper, a new algorithm for solving the semivectorial bilevel programming based on the IR technique was proposed, which preserves the two-stage structure of the problem. In the feasibility phase, lower-level problems can be solved imprecisely using their properties, and users are free to use special-purpose solvers. In the optimal stage, a minimization algorithm with linear constraints was used. Moreover, it was also proven that the algorithm is well-defined and converges to the feasible point under mild conditions. Under more stringent assumptions, the convergence of sequences generated by the presented algorithm was proven. Furthermore, the validity of some conditions generated by the algorithm was given as well.

Author Contributions

Conceptualization, J.L. and Z.W.; methodology, J.L.; writing—original draft preparation, J.L.; writing—review and editing, J.L.; supervision, Z.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Foundation of China Grant Number 11871383.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare that they have no conflict of interest.

References

Udrişte, C. Convex Functions and Optimization Methods on Riemannian Manifolds; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013; Volume 297. [Google Scholar]
Boumal, N. An Introduction to Optimization on Smooth Manifolds. 2020. Available online: https://www.nicolasboumal.net/book/ (accessed on 11 September 2020).
Liao, L.; Wan, Z. On the Karush–Kuhn–Tucker reformulation of the bilevel optimization problems on Riemannian manifolds. Filomat, 2022; 36, accepted and in press. [Google Scholar]
Martínez, J.M.; Pilotta, E.A. Inexact-restoration algorithms for constrained optimization. J. Optim. Theory Appl. 2000, 104, 135–163. [Google Scholar] [CrossRef]
Martínez, J.M. Inexact-restoration method with Lagrangian tangent decrease and new merit function for nonlinear programming. J. Optim. Theory Appl. 2001, 111, 39–58. [Google Scholar] [CrossRef] [Green Version]
Birgin, E.G.; Bueno, L.F.; Martínez, J.M. Assessing the reliability of general-purpose inexact restoration methods. J. Comput. Appl. Math. 2015, 282, 1–16. [Google Scholar] [CrossRef]
Bueno, L.F.; Friedlander, A.; Martínez, J.M.; Sobral, F.N. Inexact restoration method for derivative-free optimization with smooth constraints. SIAM J. Optim. 2013, 23, 1189–1213. [Google Scholar] [CrossRef] [Green Version]
Francisco, J.B.; Martínez, J.M.; Martínez, L.; Pisnitchenko, F. Inexact restoration method for minimization problems arising in electronic structure calculations. Comput. Optim. Appl. 2011, 50, 555–590. [Google Scholar] [CrossRef]
Banihashemi, N.; Kaya, C.Y. Inexact restoration for Euler discretization of box-constrained optimal control problems. J. Optim. Theory Appl. 2013, 156, 726–760. [Google Scholar] [CrossRef]
Bueno, L.F.; Haeser, G.; Martínez, J.M. An inexact restoration approach to optimization problems with multiobjective constraints under weighted-sum scalarization. Optim. Lett. 2016, 10, 1315–1325. [Google Scholar] [CrossRef]
Krejić, N.; Jerinkić, N.K.; Ostojić, T. An inexact restoration-nonsmooth algorithm with variable accuracy for stochastic nonsmooth convex optimization problems in machine learning and stochastic linear complementarity problems. J. Comput. Appl. Math. 2022, 423, 114943. [Google Scholar] [CrossRef]
Ma, Y.; Pan, B.; Yan, R. Feasible Sequential Convex Programming With Inexact Restoration for Multistage Ascent Trajectory Optimization. IEEE T. Aero. Elec. Sys. 2022, 1–14. [Google Scholar] [CrossRef]
Gabay, D. Minimizing a differentiable function over a differential manifold. J. Optim. Theory Appl. 1982, 37, 177–219. [Google Scholar] [CrossRef]
Murtagh, B.A.; Saunders, M.A. Large-scale linearly constrained optimization. Math. Program. 1978, 14, 41–72. [Google Scholar] [CrossRef] [Green Version]
Gay, D.M. A trust-region approach to linearly constrained optimization. In Numerical Analysis; Springer: Berlin/Heidelberg, Germany, 1984; Volume 1066, pp. 72–105. [Google Scholar]
Gill, P.E.; Murray, W.; Wright, M.H. Practical Optimization; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2019. [Google Scholar]
Fischer, A.; Friedlander, A. A new line search inexact restoration approach for nonlinear programming. Comput. Optim. Appl. 2010, 46, 333–346. [Google Scholar] [CrossRef]
Bueno, L.F.; Martínez, J.M. On the complexity of an inexact restoration method for constrained optimization. SIAM J. Optim. 2020, 30, 80–101. [Google Scholar] [CrossRef]
Andreani, R.; Ramos, A.; Secchin, L.D. Improving the Global Convergence of Inexact Restoration Methods for Constrained Optimization Problems. Optimization-Online. 2022. Available online: https://optimization-online.org/?p=18821 (accessed on 28 March 2022).
Andreani, R.; Castro, S.L.C.; Chela, J.L.; Friedlander, A.; Santos, S.A. An inexact-restoration method for nonlinear bilevel programming problems. Comput. Optim. Appl. 2009, 43, 307–328. [Google Scholar] [CrossRef]
Friedlander, A.; Gomes, F.A.M. Solution of a truss topology bilevel programming problem by means of an inexact restoration method. Comput. Appl. Math. 2011, 30, 109–125. [Google Scholar]
Andreani, R.; Ramirez, V.A.; Santos, S.A.; Secchin, L.D. Bilevel optimization with a multiobjective problem in the lower level. Numer. Algorithms 2019, 81, 915–946. [Google Scholar] [CrossRef]
Martínez, J.M.; Pilotta, E.A. Inexact restoration methods for nonlinear programming: Advances and Perspectives. In Optimization and Control with Applications; Springer: Boston, MA, USA, 2005; pp. 271–291. [Google Scholar]
Fernández, D.; Pilotta, E.A.; Torres, G.A. An inexact restoration strategy for the globalization of the sSQP method. Comput. Optim. Appl. 2013, 54, 595–617. [Google Scholar] [CrossRef]
Dempe, S.; Kalashnikov, V.; Pérez-Valdés, G.A.; Kalashnykova, N. Bilevel programming problems. In Energy Systems; Springer: Berlin/Heidelberg, Germany, 2015. [Google Scholar]
Eisenhart, L.P. Riemannian Geometry; Princeton University Press: Princeton, NJ, USA, 1997; Volume 51. [Google Scholar]
Jost, J. Riemannian Geometry and Geometric Analysis; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2008; Volume 42005. [Google Scholar]
Absil, P.A.; Mahony, R.; Sepulchre, R. Optimization Algorithms on Matrix Manifolds; Princeton University Press: Princeton, NJ, USA, 2009. [Google Scholar]
Bento, G.C.; Cruz Neto, J.X. A subgradient method for multiobjective optimization on Riemannian manifolds. J. Optim. Theory Appl. 2013, 159, 125–137. [Google Scholar] [CrossRef]

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Liao, J.; Wan, Z. Inexact Restoration Methods for Semivectorial Bilevel Programming Problem on Riemannian Manifolds. Axioms 2022, 11, 696. https://doi.org/10.3390/axioms11120696

AMA Style

Liao J, Wan Z. Inexact Restoration Methods for Semivectorial Bilevel Programming Problem on Riemannian Manifolds. Axioms. 2022; 11(12):696. https://doi.org/10.3390/axioms11120696

Chicago/Turabian Style

Liao, Jiagen, and Zhongping Wan. 2022. "Inexact Restoration Methods for Semivectorial Bilevel Programming Problem on Riemannian Manifolds" Axioms 11, no. 12: 696. https://doi.org/10.3390/axioms11120696

APA Style

Liao, J., & Wan, Z. (2022). Inexact Restoration Methods for Semivectorial Bilevel Programming Problem on Riemannian Manifolds. Axioms, 11(12), 696. https://doi.org/10.3390/axioms11120696

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Inexact Restoration Methods for Semivectorial Bilevel Programming Problem on Riemannian Manifolds

Abstract

1. Introduction

2. Preliminaries

3. Inexact Restoration Algorithm

4. Convergence Results

5. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI