1. Introduction
Bilevel optimization on Euclidean spaces is known to be NP-hard, and even verifying local optimality of a feasible solution is NP-hard in general. Bilevel optimization problems are often nonconvex, which makes computing an optimal solution a challenging task. It is therefore natural to consider bilevel optimization problems on Riemannian manifolds. Studying optimization problems on Riemannian manifolds has several advantages: some constrained optimization problems on Euclidean spaces can be seen as unconstrained ones from the Riemannian geometry viewpoint, and some nonconvex optimization problems in the Euclidean setting become convex optimization problems after introducing an appropriate Riemannian metric; see, for instance, [1,2]. The aim of this paper is to study the bilevel optimization problem on Riemannian manifolds.
To study bilevel optimization on Riemannian manifolds, it is helpful to first recall how bilevel problems are solved in Euclidean spaces. A common approach is to replace the lower-level problem by its KKT optimality conditions, which are necessary and sufficient under suitable assumptions. In a recent article [3], the authors presented the KKT reformulation of bilevel optimization problems on Riemannian manifolds and showed that global optimal solutions of the KKT reformulation correspond to global optimal solutions of the bilevel problem on Riemannian manifolds, provided the convex lower-level problem satisfies Slater's constraint qualification. On this basis, we consider a semivectorial bilevel optimization problem on Riemannian manifolds, that is, a bilevel problem with a multiobjective problem in the lower level. The Inexact Restoration (IR) algorithm [4,5] was introduced to solve constrained optimization problems; hence, once the semivectorial bilevel optimization problem is transformed into a single-level problem, it can also be solved by the IR algorithm as a constrained optimization problem.
For the convenience of the reader, let us first review the IR algorithm on Euclidean spaces. Each iteration of the IR algorithm consists of two phases: restoration and minimization. Consider the following nonlinear programming problem:
$$\min\ f(x)\quad\text{s.t.}\quad C(x)=0,\ \ x\in\Omega,$$
where $f:\mathbb{R}^n\to\mathbb{R}$ and $C:\mathbb{R}^n\to\mathbb{R}^p$ are continuously differentiable functions and the set $\Omega\subseteq\mathbb{R}^n$ is closed and convex. The algorithm generates iterates that are feasible with respect to $\Omega$, i.e., $x^k\in\Omega$ for all $k$.
In the restoration step, which is executed once per iteration, an intermediate point $y^k\in\Omega$ is found such that the infeasibility at $y^k$ is a fraction of the infeasibility at $x^k$. Immediately after restoration, we construct an approximation $\pi_k$ of the feasible region using the information available at $y^k$. In the minimization step, we compute a trial point $z^{k,i}\in\pi_k$ such that $f(z^{k,i})\ll f(y^k)$. Here, the symbol $\ll$ means "sufficiently smaller than", and $\|z^{k,i}-y^k\|\le\delta_{k,i}$, where $\delta_{k,i}$ is a trust-region radius. The trial point $z^{k,i}$ is accepted as the new iterate if the value of a nonsmooth (exact penalty) merit function at $z^{k,i}$ is sufficiently smaller than its value at $x^k$. If $z^{k,i}$ is not acceptable, the trust-region radius is reduced.
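To make the two-phase structure concrete, the following minimal Python sketch runs one IR iteration; it is our illustration, not the authors' code. The callbacks `restore` and `minimize_tangent` are hypothetical stand-ins for a restoration solver and a tangent-minimization solver, and the merit and reduction formulas follow one standard choice from the IR literature.

```python
import numpy as np

# Minimal sketch of one IR iteration for  min f(x)  s.t.  C(x) = 0,  x in Omega.
# `restore(x)` is expected to return y with ||C(y)|| <= r * ||C(x)||, r < 1;
# `minimize_tangent(f, C, y, delta)` approximately minimizes f on a linearized
# feasible set around y within a trust region of radius delta.
def merit(x, f, C, theta):
    return theta * f(x) + (1.0 - theta) * np.linalg.norm(C(x))

def ir_iteration(x, f, C, restore, minimize_tangent, theta, delta):
    y = restore(x)                                   # restoration phase
    while True:                                      # minimization phase
        z = minimize_tangent(f, C, y, delta)
        ared = merit(x, f, C, theta) - merit(z, f, C, theta)
        pred = theta * (f(x) - f(z)) + (1.0 - theta) * (
            np.linalg.norm(C(x)) - np.linalg.norm(C(y)))
        if pred > 0 and ared >= 0.1 * pred:
            return z                                 # sufficient decrease: accept
        delta *= 0.5                                 # otherwise shrink trust region
```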
The IR algorithm is related to classical feasible methods for nonlinear programming, such as the generalized reduced gradient (GRG) method and the family of sequential gradient restoration algorithms. Several studies address the numerical behavior of the IR algorithm. For example, the method was applied to general constrained problems in [6] with good results. In addition, an IR algorithm with a regularization strategy was proposed in [7], where derivative-free optimization problems were solved effectively. IR algorithms are especially useful when there is a natural way to restore feasibility. One of the most successful applications of the IR algorithm is electronic structure calculation, as shown in [8]. Moreover, the IR algorithm has also been applied successfully to optimization problems with box constraints in [9] and to problems with multiobjective constraints under weighted-sum scalarization in [10]. For more applications, see [11,12].
Since the IR algorithm is so useful in applications, many researchers have tried to improve it from different angles. The restoration phase improves feasibility; in the minimization step, optimality is improved on a linear tangent approximation of the constraints. When the sufficient descent criterion does not hold, the trial point is modified in such a way that, eventually, acceptance occurs at a point that may be close to the output of the restoration (first) phase. The acceptance criterion may use merit functions [4,5] or filters [13]. The minimization step consists of an inexact (approximate) minimization of $f$ subject to linear constraints; likewise, the restoration step is an inexact minimization of infeasibility subject to linear constraints. Therefore, the available algorithms for (large-scale) linearly constrained minimization can be fully exploited; see [14,15,16]. Furthermore, IR techniques for constrained optimization were improved, extended, and analyzed in [7,17,18,19], among others.
Inspired and motivated by the works [4,10,20,21,22,23,24,25], we introduce a class of bilevel programs on Riemannian manifolds with a multiobjective problem in the lower level, the so-called semivectorial bilevel programming. We then transform the semivectorial bilevel program into a single-level program by using the KKT optimality conditions of the lower-level problem, which is convex and satisfies the Slater constraint qualification. Finally, we split the single-level program into two stages, restoration and minimization, and give an IR algorithm for semivectorial bilevel programming. Under suitable conditions, we analyze the well-definedness and convergence of the proposed algorithm.
The remainder of this paper is organized as follows. In Section 2, basic concepts, notation, and important results of Riemannian geometry are presented. In Section 3, we propose the semivectorial bilevel program on a Riemannian manifold, give its KKT reformulation, and present an algorithm based on the IR technique for solving the semivectorial bilevel program on Riemannian manifolds. In Section 4, its convergence properties are studied. Conclusions are given in Section 5.
2. Preliminaries
An $m$-dimensional Riemannian manifold is a pair $(M,g)$, where $M$ stands for an $m$-dimensional smooth manifold and $g$ stands for a smooth, symmetric, positive definite $(0,2)$-tensor field on $M$, called a Riemannian metric on $M$. If $(M,g)$ is a Riemannian manifold, then for any point $x\in M$, the restriction $g_x$ is an inner product on the tangent space $T_xM$. The tangent bundle over $M$ is $TM=\bigcup_{x\in M}T_xM$, and a vector field on $M$ is a section of the tangent bundle, which is a mapping $X:M\to TM$ such that, for any $x\in M$, $X(x)\in T_xM$.
We denote by $\langle\cdot,\cdot\rangle_x$ the scalar product on $T_xM$ with the associated norm $\|\cdot\|_x$. The length of a tangent vector $v\in T_xM$ is defined by $\|v\|_x=\langle v,v\rangle_x^{1/2}$. Given a piecewise smooth curve $\gamma:[a,b]\to M$ joining $x$ to $y$, i.e., $\gamma(a)=x$ and $\gamma(b)=y$, its length is defined by $L(\gamma)=\int_a^b\|\gamma'(t)\|_{\gamma(t)}\,dt$, where $\gamma'(t)$ means the first derivative of $\gamma$ with respect to $t$. Let $x$ and $y$ be two points in the Riemannian manifold $M$ and $\Gamma_{x,y}$ the set of all piecewise smooth curves joining $x$ and $y$. The function
$$d(x,y)=\inf\left\{L(\gamma):\ \gamma\in\Gamma_{x,y}\right\}$$
is a distance on $M$, and the induced metric topology on $M$ coincides with the topology of $M$ as a manifold.
Let $\nabla$ be the Levi-Civita connection associated with the Riemannian metric and $\gamma$ be a smooth curve in $M$. A vector field $X$ is said to be parallel along $\gamma$ if $\nabla_{\gamma'}X=0$. If $\gamma'$ itself is parallel along $\gamma$ joining $x$ to $y$, that is, $\nabla_{\gamma'}\gamma'=0$, then we say that $\gamma$ is a geodesic, and in this case, $\|\gamma'\|$ is constant. When $\|\gamma'\|=1$, $\gamma$ is said to be normalized. A geodesic joining $x$ to $y$ in $M$ is said to be minimal if its length equals $d(x,y)$.
By the Hopf–Rinow theorem, we know that, if $M$ is complete, then any pair of points in $M$ can be joined by a minimal geodesic. Moreover, $(M,d)$ is a complete metric space, and bounded closed subsets are compact. Furthermore, the exponential mapping at $x$, defined by $\exp_x(v)=\gamma_v(1)$, where $\gamma_v$ is the geodesic with $\gamma_v(0)=x$ and $\gamma_v'(0)=v$, is well defined on the whole tangent space $T_xM$. Clearly, a curve $\gamma$ is a minimal geodesic joining $x$ to $y$ if and only if there exists a vector $v\in T_xM$ such that $\|v\|=d(x,y)$ and $\gamma(t)=\exp_x(tv)$ for each $t\in[0,1]$.
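As a concrete illustration (ours, not from the paper), on the unit sphere the exponential mapping has the closed form $\exp_x(v)=\cos(\|v\|)\,x+\sin(\|v\|)\,v/\|v\|$, and the geodesic distance is $d(x,y)=\arccos\langle x,y\rangle$; the sketch below checks numerically that $d(x,\exp_x(v))=\|v\|$.

```python
import numpy as np

# Exponential map on the unit sphere S^2 embedded in R^3:
#   exp_x(v) = cos(||v||) x + sin(||v||) v / ||v||,  for v in T_x S^2 (v ⟂ x).
def sphere_exp(x, v):
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return x
    return np.cos(nv) * x + np.sin(nv) * (v / nv)

x = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, np.pi / 2, 0.0])        # tangent vector: <x, v> = 0
y = sphere_exp(x, v)                        # ≈ (0, 1, 0)
print(np.arccos(np.clip(x @ y, -1, 1)))    # ≈ pi/2 = ||v|| = d(x, y)
```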
The gradient of a differentiable function $f:M\to\mathbb{R}$ with respect to the Riemannian metric $g$ is the vector field $\operatorname{grad}f$ defined by $\langle\operatorname{grad}f(x),v\rangle_x=df(x)v$ for all $v\in T_xM$, where $df$ denotes the differential of the function $f$.
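In local coordinates, the defining identity reads $G(x)\operatorname{grad}f(x)=\nabla f(x)$, where $G(x)$ is the matrix of the metric and $\nabla f(x)$ the Euclidean gradient of the coordinate expression of $f$; the Riemannian gradient is therefore obtained by solving a linear system with the metric. A minimal numerical sketch with made-up data:

```python
import numpy as np

# Riemannian gradient in coordinates: <grad f, v>_g = df(v) for all v
# is equivalent to G(x) grad_f = euclidean_grad_f, an SPD linear system.
def riemannian_grad(G, euclidean_grad):
    return np.linalg.solve(G, euclidean_grad)

G = np.array([[2.0, 0.0],
              [0.0, 1.0]])                 # metric matrix at x (assumed SPD)
eg = np.array([4.0, 3.0])                  # Euclidean gradient of f at x
print(riemannian_grad(G, eg))              # [2. 3.]: rescaled by the metric
```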
Around any point $p\in M$, one can choose a normal coordinate system, given by $\exp_p$ and an orthonormal basis of $T_pM$. In this normal coordinate system, the geodesics through $p$ are represented by lines passing through the origin. Moreover, the matrix $(g_{ij})$ associated with the bilinear form $g$ at the point $p$ in this orthonormal basis reduces to the identity matrix, and the Christoffel symbols vanish at $p$. Thus, for any smooth function $f:M\to\mathbb{R}$, in normal coordinates around $p$, the Riemannian gradient and Hessian of $f$ at $p$ are represented by the Euclidean gradient and Hessian of the coordinate expression of $f$.
Now, consider a smooth function $f:M\to\mathbb{R}$ and the real-valued function $\hat f=f\circ\exp_p$ defined around $0$ in $T_pM$. The Taylor–Young formula (for Euclidean spaces) applied to $\hat f$ around the origin can be written using matrices as
$$\hat f(v)=\hat f(0)+\nabla\hat f(0)^{\top}v+\tfrac12\,v^{\top}\nabla^2\hat f(0)\,v+o(\|v\|^2),$$
where $\nabla\hat f(0)$ and $\nabla^2\hat f(0)$ represent $\operatorname{grad}f(p)$ and $\operatorname{Hess}f(p)$ in the chosen orthonormal basis. In other words, we have the following Taylor–Young expansion for $f$ around $p$:
$$f(\exp_p(v))=f(p)+\langle\operatorname{grad}f(p),v\rangle_p+\tfrac12\,\langle\operatorname{Hess}f(p)\,v,v\rangle_p+o(\|v\|_p^2),$$
which holds in any coordinate system.
The set $A\subseteq M$ is said to be convex if it contains a geodesic segment $\gamma$ whenever it contains the end points of $\gamma$; that is, $\gamma(t)$ is in $A$ whenever $\gamma(0)$ and $\gamma(1)$ are in $A$ and $t\in[0,1]$. A function $f:M\to\mathbb{R}$ is said to be convex if its restriction to any geodesic curve $\gamma$ is convex in the classical sense, that is, if the one-real-variable function $t\mapsto f(\gamma(t))$ is convex. Let $P_A$ denote the projection onto a closed convex set $A\subseteq M$; that is, for each $x\in M$,
$$P_A(x)=\left\{y\in A:\ d(x,y)\le d(x,z)\ \text{for all}\ z\in A\right\}.$$
For more details and complete information on the fundamentals of Riemannian geometry, see [1,26,27,28].
3. Inexact Restoration Algorithm
We study an optimistic bilevel programming problem on an $m$-dimensional Riemannian manifold $M$, where the lower-level problem is a multiobjective problem, the so-called semivectorial bilevel programming. The problem is formulated below:
$$\min\ F(x)\quad\text{s.t.}\quad x\in S,\qquad(3)$$
where $F:M\to\mathbb{R}$ is the upper-level objective and $S$ is the effective solution set of the following multiobjective problem (MOP):
$$\min\ f(x)=\left(f_1(x),\ldots,f_l(x)\right)\quad\text{s.t.}\quad h(x)\le0,\qquad(4)$$
where $f_i:M\to\mathbb{R}$, $i=1,\ldots,l$, $h=(h_1,\ldots,h_s):M\to\mathbb{R}^s$, and the points $x\in M$ with $h(x)\le0$ denote the feasible solutions of the MOP.
Definition 1. Let $f=(f_1,\ldots,f_l):M\to\mathbb{R}^l$ be a vectorial function on a Riemannian manifold $M$. Then, $f$ is said to be convex on $M$ if, for every $x,y\in M$ and every geodesic segment $\gamma:[0,1]\to M$ joining $x$ to $y$, i.e., $\gamma(0)=x$ and $\gamma(1)=y$, it holds (componentwise) that
$$f(\gamma(t))\le(1-t)f(x)+t\,f(y)\quad\text{for all }t\in[0,1].$$
The above definition is a natural extension of the definition of convexity in Euclidean spaces to the Riemannian context; see [29].
Definition 2. A point $x^*\in M$ is said to be Pareto critical of $f$ on a Riemannian manifold $M$ if, for any $v\in T_{x^*}M$, there is an index $i\in\{1,\ldots,l\}$ such that
$$\langle\operatorname{grad}f_i(x^*),v\rangle_{x^*}\ge0.$$
Definition 3. (a) A point $x^*\in M$ is a Pareto-optimal point of $f$ on a Riemannian manifold $M$ if there is no $x\in M$ with $f_i(x)\le f_i(x^*)$ for all $i$ and $f(x)\ne f(x^*)$. (b) A point $x^*\in M$ is a weak Pareto-optimal point of $f$ on a Riemannian manifold $M$ if there is no $x\in M$ with $f_i(x)<f_i(x^*)$ for all $i$.
We know that criticality is a necessary, but not sufficient, condition for optimality. Under the convexity of the vectorial function $f$, the following proposition shows that criticality is equivalent to weak optimality.
Proposition 1 ([29]). Let $f=(f_1,\ldots,f_l):M\to\mathbb{R}^l$ be a convex function. A point $x^*\in M$ is a Pareto critical point of the function $f$ if and only if it is a weak Pareto-optimal point of the function $f$.

We assume that the functions $f_i$, $i=1,\ldots,l$, and $h_j$, $j=1,\ldots,s$, are twice continuously differentiable and consider the weighted-sum scalarization of the MOP, as follows.
Let $\omega\in\mathbb{R}^l$ with $\omega\ge0$ be such that $\sum_{i=1}^l\omega_i=1$:
$$\min_{x\in M}\ \sum_{i=1}^l\omega_i f_i(x)\quad\text{s.t.}\quad h(x)\le0.\qquad(5)$$
Note that, as $\omega$ ranges over all weights with $\omega\ge0$ and $\sum_{i=1}^l\omega_i=1$, the weak Pareto-optimal solution set of Problem (4) is equal to the union of the optimal solution sets of Problem (5). Meanwhile, if each $f_i$, $i=1,\ldots,l$, is a convex function on the Riemannian manifold, then the function $\sum_{i=1}^l\omega_i f_i$ is also convex. Thus, the bilevel programming (3)–(4) can be transformed into the following problem:
$$\min_{x,\omega}\ F(x)\quad\text{s.t.}\quad \omega\in\Delta,\ \ x\in\operatorname{argmin}\left\{\sum_{i=1}^l\omega_i f_i(u):\ h(u)\le0,\ u\in M\right\},\qquad(6)$$
where $\Delta=\{\omega\in\mathbb{R}^l:\ \omega\ge0,\ \sum_{i=1}^l\omega_i=1\}$.
A strategy to solve the bilevel problem (6) on Riemannian manifolds is to replace the lower-level problem with its KKT conditions. When the lower-level problem is convex and satisfies the Slater constraint qualification, the global optimal solutions of the KKT reformulation correspond to the global optimal solutions of the bilevel problem on Riemannian manifolds; see Theorems 4.1 and 4.2 in [3].
In the following, we give the KKT reformulation of the semivectorial bilevel programming on Riemannian manifolds:
$$\begin{aligned}\min_{x,\omega,\lambda}\ \ &F(x)\\ \text{s.t.}\ \ &\sum_{i=1}^{l}\omega_i\operatorname{grad}f_i(x)+\sum_{j=1}^{s}\lambda_j\operatorname{grad}h_j(x)=0,\\ &\lambda_j h_j(x)=0,\ \ \lambda_j\ge0,\ \ h_j(x)\le0,\ \ j=1,\ldots,s,\\ &\omega\in\Delta,\end{aligned}\qquad(7)$$
where $\Delta$ is a convex and compact set, $\lambda\in\mathbb{R}^s$, and $M$ is a complete $m$-dimensional Riemannian manifold.
We will adopt an IR method to solve this optimization problem in two stages, first pursuing feasibility and then optimality, while keeping a certain control over the feasibility already achieved. Consequently, the approach exploits the inherent minimization structure of the problem, especially in the feasibility phase, so that better solutions can be obtained. Moreover, in the feasibility phase of the IR strategy, the user is free to choose any method, as long as the restored iterate satisfies some mild assumptions [4,5].
For simplicity, we introduce the following notation: let $z=(x,\omega,\lambda)$ collect the variables and let
$$C(z)=\begin{pmatrix}\sum_{i=1}^l\omega_i\operatorname{grad}f_i(x)+\sum_{j=1}^s\lambda_j\operatorname{grad}h_j(x)\\ \lambda_1h_1(x)\\ \vdots\\ \lambda_sh_s(x)\end{pmatrix}$$
collect the stationarity and complementarity residuals of the lower-level KKT system. We write shortly $C(z)=0$ for the lower-level KKT conditions and denote by $C'(z)$ the Jacobian of $C$. Thus, the semivectorial bilevel programming can be reduced to
$$\min\ F(x)\quad\text{s.t.}\quad C(z)=0,\ \ z\in\Omega,$$
where $\Omega$ gathers the remaining convex constraints $h(x)\le0$, $\lambda\ge0$, and $\omega\in\Delta$.
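Schematically (our illustration; Euclidean gradients stand in for their Riemannian counterparts, and all names are hypothetical), the residual mapping $C$ can be assembled as follows:

```python
import numpy as np

# KKT residual C(z) for z = (x, w, lam): stationarity of the weighted
# Lagrangian plus complementarity of the lower-level constraints.
def kkt_residual(x, w, lam, grad_fs, grad_hs, hs):
    stationarity = sum(wi * gf(x) for wi, gf in zip(w, grad_fs)) \
                 + sum(lj * gh(x) for lj, gh in zip(lam, grad_hs))
    complementarity = np.array([lj * h(x) for lj, h in zip(lam, hs)])
    return np.concatenate([stationarity, complementarity])

# Toy data on R^2: two objectives and one (inactive) constraint.
grad_fs = [lambda x: 2 * x, lambda x: 2 * (x - 1.0)]
grad_hs = [lambda x: np.array([1.0, 0.0])]
hs = [lambda x: x[0] - 1.0]
print(kkt_residual(np.array([0.5, 0.5]), [0.5, 0.5], [0.0],
                   grad_fs, grad_hs, hs))   # ≈ 0: a KKT point for w = (1/2, 1/2)
```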
Before giving a rigorous description of the algorithm, let us start with an overview of each step.
Restoration step: We apply any globally convergent optimization algorithm to solve the lower-level minimization problem parameterized by the current weights $\omega^k$. Once an approximate minimizer and a pair of corresponding estimated Lagrange multiplier vectors are obtained, we compute the current set $\pi_k$ and the direction $d^k$.
Approximate linearized feasible region: The set $\pi_k$ is a linear approximation of the region described by the KKT system, containing the restored point $y^k$. This auxiliary region is obtained by linearizing the constraint $C(z)=0$ at $y^k$, working in the tangent space at $y^k$ in the manifold variables.
Descent direction: Using the projection on Riemannian manifolds, we project the scaled negative gradient $-\eta\operatorname{grad}F(y^k)$ onto $\pi_k$, where $\eta>0$ is an arbitrary scaling parameter independent of $k$. It turns out that the resulting direction $d^k$ is a feasible descent direction on $\pi_k$.
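In a Euclidean surrogate of this construction (an illustration with a made-up Jacobian $A=C'(y^k)$ and objective gradient), the orthogonal projector onto the null space of $A$ produces exactly such a feasible descent direction:

```python
import numpy as np

# d = -eta * (I - A^T (A A^T)^{-1} A) grad_F(y): tangent to the linearized
# constraints (A d = 0) and a descent direction (grad_F(y) . d < 0).
def projected_descent(A, grad_F_y, eta=1.0):
    P = np.eye(A.shape[1]) - A.T @ np.linalg.solve(A @ A.T, A)
    return -eta * P @ grad_F_y

A = np.array([[1.0, 1.0, 0.0]])            # Jacobian at y, full row rank
g = np.array([3.0, 1.0, 2.0])              # gradient of the objective at y
d = projected_descent(A, g)
print(A @ d, g @ d)                        # A d ≈ 0 and g . d = -6 < 0
```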
Minimization step: The objective of the minimization step is to obtain $z^{k,i}\in\pi_k$ such that $F(z^{k,i})\ll F(y^k)$ and $d(z^{k,i},y^k)\le\delta_{k,i}$, where $\delta_{k,i}$ is a trust-region radius. The first trial point at each iteration is obtained using a trust-region radius $\delta_{k,0}\ge\delta_{\min}$. Successively smaller trust-region radii are tried until a point is found such that the merit function at this point is sufficiently smaller than the merit function at $x^k$.
Merit function and penalty parameter: We use a variant of the sharp Lagrangian merit function, given by
$$\Phi(z,\theta)=\theta\,F(z)+(1-\theta)\,\|C(z)\|,$$
where $\theta\in(0,1]$ is a penalty parameter used to give different weights to the objective function and to the feasibility objective. The choice of the parameter $\theta_{k,i}$ at each iteration depends on practical and theoretical considerations. Roughly speaking, we wish the merit function at the new point to be less than the merit function at the current point $x^k$. That is, we want $Ared_{k,i}>0$, where $Ared_{k,i}$ is the actual reduction of the merit function, defined by
$$Ared_{k,i}=\Phi(x^k,\theta_{k,i})-\Phi(z^{k,i},\theta_{k,i}).$$
However, a mere reduction of the merit function is not sufficient to guarantee convergence. In fact, we need a sufficient reduction of the merit function, which will be defined by the satisfaction of the following test:
$$Ared_{k,i}\ge0.1\,Pred_{k,i},$$
where $Pred_{k,i}$ is a positive predicted reduction of the merit function between $x^k$ and $z^{k,i}$. It is defined by
$$Pred_{k,i}=\theta_{k,i}\left(F(x^k)-F(z^{k,i})\right)+(1-\theta_{k,i})\left(\|C(x^k)\|-\|C(y^k)\|\right).$$
The quantity $Pred_{k,i}$ defined above can be nonpositive, depending on the value of the penalty parameter. Fortunately, if $\theta_{k,i}$ is small enough, $Pred_{k,i}$ is arbitrarily close to $\|C(x^k)\|-\|C(y^k)\|$, which is necessarily nonnegative. Therefore, we will always be able to choose $\theta_{k,i}$ such that
$$Pred_{k,i}\ge\tfrac12\left(\|C(x^k)\|-\|C(y^k)\|\right).\qquad(12)$$
When the criterion is satisfied, we accept $z^{k,i}$ as the new iterate. Otherwise, we reduce the trust-region radius.
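Because $Pred_{k,i}$ is affine in $\theta$, the safeguard (12) can be enforced by an explicit rule. The sketch below (ours; the variable names are hypothetical) returns the largest penalty parameter not exceeding the previous one that satisfies (12):

```python
def choose_theta(theta_prev, F_x, F_z, c_x, c_y):
    """Largest theta <= theta_prev with
    Pred(theta) = theta*(F_x - F_z) + (1 - theta)*(c_x - c_y) >= (c_x - c_y)/2,
    where c_x = ||C(x^k)||, c_y = ||C(y^k)||, and c_x >= c_y by restoration."""
    target = 0.5 * (c_x - c_y)
    pred = lambda t: t * (F_x - F_z) + (1.0 - t) * (c_x - c_y)
    if pred(theta_prev) >= target:
        return theta_prev                  # previous parameter still admissible
    # pred is affine in theta; solve pred(t) = target for the threshold value
    t = target / ((c_x - c_y) - (F_x - F_z))
    return min(theta_prev, max(t, 0.0))
```

When the first branch fails, the slope $(F_x-F_z)-(c_x-c_y)$ of the affine function is necessarily negative, so the division in the fallback is safe.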
To establish IR methods for semivectorial bilevel programming on Riemannian manifolds, we adapt the IR method presented in [4]. In the presented algorithm, the parameters $\theta_{-1}\in(0,1)$, $r\in[0,1)$, $\beta>0$, $\delta_{\min}>0$, and the trust-region contraction bounds $\tau_1,\tau_2$ with $0<\tau_1\le\tau_2<1$ are given. The initial approximations $x^0$, $\omega^0$, and $\lambda^0$, as well as a positive sequence $\{s_k\}$ such that $\sum_{k=0}^{\infty}s_k<+\infty$, are also given.
4. Convergence Results
Using the method for studying the convergence of the IR algorithm in Euclidean spaces [20,22], the convergence results of IR algorithms for semivectorial bilevel programming on Riemannian manifolds are given under the following assumptions. From now on, we assume that the semivectorial bilevel optimization problems on Riemannian manifolds satisfy assumptions (A1)–(A3) stated below:
(A1) There exists $L_1>0$ such that, for all points $y$ and $z$ in a region containing the iterates $x^k$, the restored points $y^k$, and the trial points $z^{k,i}$,
$$\|C'(z)-C'(y)\|\le L_1\,d(z,y).$$
(A2) There exists $L_2>0$ such that, for all $y$ and $z$ in the same region,
$$|F(z)-F(y)|\le L_2\,d(z,y).$$
(A3) There exists $r\in[0,1)$, independent of $k$, such that the point $y^k$ obtained at the restoration phase satisfies
$$\|C(y^k)\|\le r\,\|C(x^k)\|,$$
where $x^k$ is the current iterate. Moreover, if $C(x^k)=0$, then $y^k=x^k$.
Theorem 1 (Well-definedness). Under assumptions (A1)–(A3), IR Algorithm 1 for bilevel programming is well defined.
Algorithm 1: Inexact Restoration algorithm

Step 1. (Initialization) Set $k\leftarrow0$ and take $\theta_{k-1}=\theta_{-1}$.
Step 2. (Restoration phase) Find an approximate minimizer and multipliers of the lower-level problem parameterized by the current weights, and define from them the restored point $y^k$, which must satisfy the assumption (A3).
Step 3. (Direction) Compute $d^k$, the projection onto $\pi_k$ of the scaled negative gradient $-\eta\operatorname{grad}F(y^k)$, obtained as the solution of the corresponding projection subproblem. If $C(x^k)=0$ and $d^k=0$, then stop and return $x^k$ as a solution of Problem (7). Otherwise, set $i\leftarrow0$ and choose $\delta_{k,0}\ge\delta_{\min}$.
Step 4. (Minimization phase) If $\|d^k\|\le\delta_{k,i}$, take the full step along $d^k$; otherwise, truncate $d^k$ to the trust-region radius. In either case, find $z^{k,i}\in\pi_k$ such that, for some $t>0$,
$$F(z^{k,i})\le F(y^k)+\tfrac12\,t\,\langle\operatorname{grad}F(y^k),d^k\rangle\qquad(13)$$
and $d(z^{k,i},y^k)\le\delta_{k,i}$.
Step 5. (Penalty parameter) If $i=0$, start from $\theta_{k,-1}=\theta_{k-1}$; otherwise, start from $\theta_{k,i-1}$. Take $\theta_{k,i}$ as the maximum value in $(0,\theta_{k,i-1}]$ that satisfies the safeguard (12).
Step 6. (Predicted reduction) Define $Pred_{k,i}$ and $Ared_{k,i}$ as in Section 3 with $\theta=\theta_{k,i}$.
Step 7. (Acceptance test) If
$$Ared_{k,i}\ge0.1\,Pred_{k,i},$$
then take $x^{k+1}=z^{k,i}$, $\theta_k=\theta_{k,i}$, and finish the current iteration. Otherwise, choose $\delta_{k,i+1}\in[\tau_1\,\delta_{k,i},\,\tau_2\,\delta_{k,i}]$, set $i\leftarrow i+1$, and go to Step 4.
Proof. According to Step 6 and Step 7 of Algorithm 1, the actual and predicted reductions at a trial point can be computed explicitly from the merit function. Through the condition (12), we have
$$Pred_{k,i}\ge\tfrac12\left(\|C(x^k)\|-\|C(y^k)\|\right).$$
Then, from the assumption (A3),
$$Pred_{k,i}\ge\tfrac12(1-r)\,\|C(x^k)\|.$$
If $C(x^k)\ne0$, then, due to the continuity of $C$ and $F$, we have $Ared_{k,i}-Pred_{k,i}=(1-\theta_{k,i})\left(\|C(y^k)\|-\|C(z^{k,i})\|\right)\to0$ as $\delta_{k,i}\to0$, since $z^{k,i}\to y^k$, while $Pred_{k,i}$ stays bounded away from zero. Thus, there exists a positive constant $\bar\delta$ such that the acceptance test of Step 7 holds whenever $\delta_{k,i}\le\bar\delta$. This means that the algorithm is well defined when $C(x^k)\ne0$.

If $C(x^k)=0$, then $x^k=y^k$ is feasible. Since the algorithm does not terminate at the $k$th iteration, we know that $d^k\ne0$. Therefore, we have
$$Pred_{k,i}=\theta_{k,i}\left(F(x^k)-F(z^{k,i})\right).$$
Combining the condition (12), it follows that $Pred_{k,i}>0$ and, independently of $\theta_{k,i}$, for all $i$, $F(x^k)-F(z^{k,i})>0$. In terms of the inequality (13), when $\delta_{k,i}$ is sufficiently small, we obtain
$$Ared_{k,i}\ge0.1\,Pred_{k,i}.$$
Therefore, Algorithm 1 is well defined. □
The next theorem is an important tool for proving the convergence of Algorithm 1. We prove that the actual reduction $Ared_{k,i(k)}$, computed with the accepted value $i(k)$ of $i$, achieved at each iteration necessarily tends to zero.
Theorem 2. Under the assumptions (A1)–(A3), if Algorithm 1 generates an infinite sequence $\{x^k\}$, then
$$\lim_{k\to\infty}Ared_{k,i(k)}=0\quad\text{and}\quad\lim_{k\to\infty}\|C(x^k)\|=0.$$
The same results occur when $C(x^k)=0$ for all $k$.

Proof. Let us prove that $\lim_{k\to\infty}Ared_{k,i(k)}=0$; i.e., we need to prove that
$$\lim_{k\to\infty}\left[\Phi(x^k,\theta_k)-\Phi(x^{k+1},\theta_k)\right]=0,$$
that is,
$$\lim_{k\to\infty}\left[\theta_k\left(F(x^k)-F(x^{k+1})\right)+(1-\theta_k)\left(\|C(x^k)\|-\|C(x^{k+1})\|\right)\right]=0,$$
where $\theta_k=\theta_{k,i(k)}$ denotes the penalty parameter accepted at iteration $k$.

By contradiction, suppose that there are an infinite index set $K$ and a positive constant $c$ such that, for any $k\in K$, we have
$$\Phi(x^k,\theta_k)-\Phi(x^{k+1},\theta_k)\ge c.\qquad(14)$$
Let $\Phi_k=\Phi(x^k,\theta_k)$. According to the definition of $\Phi$ and the update rule for the penalty parameters, there is an upper bound $\bar c>0$ such that
$$\Phi(x^{k+1},\theta_{k+1})\le\Phi(x^{k+1},\theta_k)+\bar c\,s_{k+1}.\qquad(15)$$
Combining the inequalities (14) and (15), it follows that
$$\Phi_{k+1}\le\Phi_k-\left[\Phi(x^k,\theta_k)-\Phi(x^{k+1},\theta_k)\right]+\bar c\,s_{k+1},$$
where the bracketed reduction is nonnegative for every $k$ and at least $c$ for $k\in K$. Then, for all $k$, summing over the iterations up to $k$, we have
$$\Phi_{k+1}\le\Phi_0-c\,\#\{j\in K:\ j\le k\}+\bar c\sum_{j=1}^{k+1}s_j.$$
Since $\sum_k s_k$ is convergent and the reduction along $K$ is bounded away from zero, this implies that $\{\Phi_k\}$ is unbounded below. This contradicts the fact that $F$ and $\|C\|$ are bounded below on the sequence generated by the algorithm. Thus, we have $\lim_{k\to\infty}Ared_{k,i(k)}=0$. In addition, in a similar way, we can prove that $\lim_{k\to\infty}\|C(x^k)\|=0$. □
According to Theorem 2, the points generated by the IR algorithm for the KKT reformulation (7) eventually converge to a feasible point. Next, we prove that $\|d^k\|$ cannot be bounded away from zero under the following assumption (A4). This means that the points generated by the IR algorithm will converge to a weak Pareto solution of Problem (7):

(A4) There exists $b>0$, independent of $k$, such that the point $y^k$ obtained at the restoration phase satisfies
$$d(x^k,y^k)\le b\,\|C(x^k)\|.$$
Theorem 3. Suppose that the assumptions (A1), (A2), (A3), and (A4) hold. If $\{x^k\}$ is an infinite sequence generated by Algorithm 1 and $\{y^k\}$ is the sequence defined at the restoration phase in Algorithm 1, then:
1. $\lim_{k\to\infty}d(x^k,y^k)=0$.
2. There exists a limit point $x^*$ of $\{x^k\}$.
3. Every limit point of $\{x^k\}$ is a feasible point of the KKT reformulation (7).
4. If, for all $\omega$, a global solution of the lower-level problem is found, then any limit point is feasible for the weighted semivectorial bilevel programming (6).
5. If $x^*$ is a limit point of $\{x^k\}$, there exists an infinite set $K\subseteq\mathbb{N}$ such that $\lim_{k\in K}\|d^k\|=0$.
Proof. We can prove the first two items from Theorem 2 and the assumption (A4). Based on the conclusions of the first two items, the third and fourth items are valid. The fifth item follows from the assumption (A4) and the first item. □
The above conclusions give the well-definedness and convergence of the algorithm proposed for semivectorial bilevel programming on Riemannian manifolds. Regarding the assumptions put forward in this paper, the assumptions (A3) and (A4) are related to the sequences generated by the IR algorithm. Therefore, it is worth establishing sufficient conditions that guarantee their validity. Two assumptions about the lower-level problem are given below to verify the hypotheses (A3) and (A4):
(A5) For every solution of the lower-level problem, the gradients of the active lower-level constraints are linearly independent.
(A6) For every solution $x$ of the lower-level problem, the matrix
$$B=\operatorname{Hess}_x\Big(\sum_{i=1}^l\omega_if_i+\sum_{j=1}^s\lambda_jh_j\Big)$$
(the Hessian of the lower-level Lagrangian) is positive definite on the following set:
$$\left\{u:\ \langle\operatorname{grad}h_j(x),u\rangle_x=0\ \text{for all active indices }j\right\}.$$
For convenience, to verify (A3) and (A4), we define the following matrix:
$$\mathcal{K}=\begin{pmatrix}B&A^{\top}\\ A&0\end{pmatrix},$$
where $A$ is the matrix whose rows represent the gradients of the active lower-level constraints.
Lemma 1. The matrix $\mathcal{K}$ is non-singular at any solution of the lower-level problem.
Proof. Assume that there exist vectors $u$ and $v$ such that $\mathcal{K}(u,v)^{\top}=0$; then, we have
$$Bu+A^{\top}v=0,\qquad(16)$$
$$Au=0.\qquad(17)$$
Multiplying (16) on the left by $u^{\top}$ and using (17), we obtain $u^{\top}Bu=0$. According to the assumptions (A5)–(A6) and Equalities (16) and (17), it follows that $u=0$ and $v=0$: indeed, (A6) forces $u=0$ since $u$ lies in the null space of $A$, and then (16) reduces to $A^{\top}v=0$, so (A5) yields $v=0$. This means that the matrix $\mathcal{K}$ is non-singular at any solution of the lower-level problem. □
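The linear-algebra fact behind Lemma 1 can be checked numerically on made-up data (our illustration): the block matrix is non-singular as soon as $A$ has full row rank, cf. (A5), and $B$ is positive definite on the null space of $A$, cf. (A6), even when $B$ itself is indefinite.

```python
import numpy as np

# K = [[B, A^T], [A, 0]] with B positive definite on ker(A), A full row rank.
B = np.diag([2.0, 1.0, -1.0])              # indefinite on R^3 ...
A = np.array([[0.0, 0.0, 1.0]])            # ... but B > 0 on ker(A) = span(e1, e2)
K = np.block([[B, A.T], [A, np.zeros((1, 1))]])
print(abs(np.linalg.det(K)) > 1e-12)       # True: K is non-singular
```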
Let the matrix $\mathcal{K}$ be defined on a compact set $W$ by choosing, for each parameter in $W$, a solution of the lower-level problem in such a way that the resulting function is continuous on $W$. Now, fixing this choice, by Lemma 1, we can define the function $\mathcal{K}^{-1}$ over the set $W$. Let $\kappa=\max_{W}\|\mathcal{K}^{-1}\|$. Furthermore, the following lemma can be obtained.
Lemma 2. There exist $\varepsilon>0$ and $\rho>0$ such that, for all points within distance $\varepsilon$ of the reference solutions, it holds that $\|\mathcal{K}^{-1}\|\le2\kappa$, and, on the corresponding $\rho$-neighborhoods, $\mathcal{K}^{-1}$ coincides with the Jacobian of the local inverse operator of the KKT system.

Proof. Since the solution choice is continuous on $W$, $\mathcal{K}$ is continuous on $W$, and $\mathcal{K}$ is continuous with respect to its arguments; hence, there exists $\varepsilon>0$ such that, for all perturbations of size at most $\varepsilon$, $\|\mathcal{K}^{-1}\|\le2\kappa$.

For each fixed value of the parameters, the continuously differentiable KKT operator verifies the assumptions of the inverse function theorem at the associated solution. Hence, there exists $\rho>0$ such that it has a continuously differentiable local inverse operator, and the Jacobian matrix of this inverse is consistent with $\mathcal{K}^{-1}$. This ends the proof. □
Finally, we state that (A3) and (A4) hold under the assumptions (A5) and (A6). The next theorem summarizes this fact, and it can be proven as follows.
Theorem 4. Let $(\bar x,\bar\omega)$ be such that $\bar x$ solves the lower-level problem associated with $\bar\omega$. If the assumptions (A5)–(A6) hold, then there exist $\varepsilon>0$, $b>0$, and $r\in[0,1)$ such that, whenever the current iterate lies within distance $\varepsilon$ of $(\bar x,\bar\omega)$, the restoration phase produces $y^k$ with
$$\|C(y^k)\|\le r\,\|C(x^k)\|$$
and
$$d(x^k,y^k)\le b\,\|C(x^k)\|.$$
Proof. According to Lemmas 1 and 2, combining the assumptions (A5) and (A6), and using Taylor expansions of the functions on Riemannian manifolds, the statement follows from the results of [20]. This ends the proof. □
Example 1. We consider the particular case $M=\mathbb{R}^2$ endowed with a Riemannian metric $g$ given, in Cartesian coordinates around each point, by a smooth, symmetric, positive definite matrix $G$. In other words, for any vectors $u$ and $v$ in the tangent plane at a point $x$, which coincides with $\mathbb{R}^2$, we have $\langle u,v\rangle_x=u^{\top}G(x)\,v$. For this metric, the minimizing geodesic with prescribed initial point and initial velocity can be written in closed form; hence, $M$ is a complete Riemannian manifold, and the minimizing geodesic segment joining two given points yields the distance $d$ on the metric space $(M,d)$. It follows easily that every closed ball of finite radius, and thus every closed rectangle, is bounded in the metric space $(M,d)$ with the distance $d$. Next, we consider lower-level objective functions and a constraint function whose compositions with any geodesic segment are convex functions of one real variable; hence, they are all convex on $M$ with the Riemannian metric $g$. Moreover, the constraint function satisfies the Slater constraint qualification.

We then consider the corresponding KKT reformulation of the semivectorial bilevel programming on Riemannian manifolds. By the definition of the gradient of a differentiable function with respect to the Riemannian metric $g$, the Riemannian gradients of the objective and constraint functions are obtained from their Euclidean gradients by solving $G(x)\operatorname{grad}f(x)=\nabla f(x)$. With these gradients, it is easy to verify that the KKT reformulation has a unique optimal solution.
According to Algorithm 1, we first give the initial approximations and a summable positive tolerance sequence. In the restoration phase, we find an approximate minimizer and multiplier for the weighted lower-level problem and define the restored point $y^k$. We then compute the direction $d^k$ by using the exponential mapping and the projection defined on the Riemannian manifold $M$. In the minimization phase, we first find a trial point $z^{k,i}$ within the trust region such that the objective is sufficiently reduced, and then, by calculating the actual reduction and a positive predicted reduction of the merit function satisfying the sufficient-decrease test, we obtain a convergent sequence $\{x^k\}$.
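Since the data of the example are symbolic, the following self-contained Euclidean surrogate (entirely our illustration: the problem, the constants, and the exact radial restoration are made up) runs the complete loop of Algorithm 1 on $\min\ \|x-(2,2)\|^2$ subject to $\|x\|^2=1$, for which the restoration phase is the projection onto the circle; the iterates converge to the constrained minimizer $(1/\sqrt2,1/\sqrt2)$.

```python
import numpy as np

F = lambda x: np.sum((x - np.array([2.0, 2.0]))**2)    # upper-level objective
C = lambda x: x @ x - 1.0                               # KKT-residual surrogate
gF = lambda x: 2.0 * (x - np.array([2.0, 2.0]))
gC = lambda x: 2.0 * x
merit = lambda x, th: th * F(x) + (1 - th) * abs(C(x))  # sharp merit function

x, theta, eta = np.array([2.0, 0.0]), 0.5, 0.1          # theta fixed (simplified Step 5)
for k in range(60):
    y = x / np.linalg.norm(x)                # restoration: C(y) = 0 exactly
    n = gC(y) / np.linalg.norm(gC(y))
    d = -eta * (gF(y) - (gF(y) @ n) * n)     # scaled descent direction, tangent at y
    if abs(C(x)) + np.linalg.norm(d) < 1e-12:
        break                                # feasible and stationary: stop (Step 3)
    delta = 1.0
    while True:                              # minimization phase with trust region
        nd = np.linalg.norm(d)
        z = y + (d if nd <= delta else (delta / nd) * d)
        ared = merit(x, theta) - merit(z, theta)
        pred = theta * (F(x) - F(z)) + (1 - theta) * (abs(C(x)) - abs(C(y)))
        if pred > 0 and ared >= 0.1 * pred:
            x = z                            # sufficient decrease: accept (Step 7)
            break
        delta *= 0.5                         # otherwise shrink the trust region
print(x)                                      # ≈ [0.7071, 0.7071]
```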
According to Theorems 3 and 4, the sequence generated by the IR method established in the present paper converges to a solution of the semivectorial bilevel programming on Riemannian manifolds.