1. Introduction
Optimization problems have attracted significant attention due to their wide range of practical applications. In real-world scenarios, many problems exhibit special structures, for example,
where
and
is regarded as a regularization term that imposes certain structures on
to yield a desired solution; see [1,2,3,4,5] for more information. Particularly, the functions
and
often exhibit distinct properties, thus requiring different numerical treatment strategies. For example, in the LASSO problem,
is typically a smooth function, whereas
possesses non-smooth properties. Given a constrained optimization problem,
where
and
is a closed subset in
, the classical Lagrangian function provides a relaxation of the primal problem (2). However, if either
g or
is nonconvex, this approach typically leads to a nonzero duality gap due to the presence of nonconvex constraints. Therefore, it is necessary to consider alternative strategies. In this case, we can employ the augmented Lagrangian function [6,7,8]. However, this approach requires the penalty parameter
to be sufficiently large to achieve the zero duality gap property between primal and dual problems, which inevitably introduces computational difficulties such as numerical instability.
Note that separable structures appear not only in the objective function as shown in (1) but also in the constraint system. In fact, in some applications, the constraint system can also be categorized into different types [9,10,11,12,13,14]. For example, the author in [14] considers the following optimization problem:
where
,
,
. The first two constraints are of the classical nonlinear programming form. The third constraint, however, requires at least one of the corresponding component functions of
G and
H to be zero, and hence, the problem is referred to as a mathematical program with switching constraints. If the third constraint is replaced by the complementarity constraint
,
,
, or equivalently
, then the corresponding problem is termed a mathematical program with complementarity constraints. Due to the special structure of the third constraint, we need to apply distinct analytical approaches to the first two constraints and the third constraint separately. In this paper, we consider the following structured constrained optimization problem:
which corresponds to problem (2) by setting
and
. The main reason for partitioning the constraint system into two components stems from their fundamentally different characteristics: in (3a),
is continuously differentiable with Lipschitz-continuous derivatives (i.e.,
), and
is a convex set; meanwhile, in (3b),
is twice continuously differentiable (i.e.,
), but
is a closed (but not necessarily convex) set. The objective function
is continuously differentiable with Lipschitz gradient, i.e.,
.
We adopt different strategies to handle these two distinct types of constraints in (3). For the convex constraint (3a), since
K is convex, we utilize the Moreau envelope for regularization. Recall that for a convex function
, the Moreau envelope with parameter
is defined as follows:
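In standard notation (assuming the Euclidean norm; the authors' symbols may differ), the Moreau envelope of a convex function $g$ with parameter $\lambda>0$ reads:

```latex
e_{\lambda} g(x) \;=\; \min_{y \in \mathbb{R}^{n}} \Big\{\, g(y) + \frac{1}{2\lambda}\,\|x-y\|^{2} \Big\},
```

whose unique minimizer is the proximal point $\mathrm{prox}_{\lambda g}(x)$.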
A key advantage of the Moreau envelope lies in its smoothing property: even if
g is nonsmooth,
remains smooth, and hence,
serves as a smooth approximation of
g. For the nonconvex constraint (3b), as mentioned above, since
D is nonconvex, the augmented Lagrangian function requires sufficiently large penalty parameters, which in turn leads to poor numerical performance. To overcome this drawback, we choose to retain this constraint explicitly rather than relax it into the objective function. Based on these considerations, in this paper we mainly study the following partial augmented Lagrangian function:
where
is the multiplier and
is a penalty parameter. The key difference between the proximal Lagrangian and the partial augmented Lagrangian lies in their treatment of nonconvex constraints: the former generally relaxes or regularizes nonconvex constraints by introducing a proximal term, whereas the latter retains nonconvex constraints directly within the constraint set and applies relaxation only to the convex part. The modified Lagrangian of this form has been studied in [8] for sparse optimization problems, where
D denotes the sparse constraint, i.e.,
. In this case,
D is a union of polyhedra, but in this paper, we require
D to be merely a closed set, without any additional structure. In addition, when analyzing the second-order variational geometry of a set, one typically requires the set to be second-order regular or its second-order tangent set to be nonempty [15,16,17]. Our approach eliminates this requirement.
The main work of this paper is summarized in the following three aspects:
- (i)
Saddle points and Karush–Kuhn–Tucker (KKT) conditions. We first analyze the local/global saddle points of the partial augmented Lagrangian function (5). The relationships between saddle points, minimizers, and KKT points of problem (3) are established. In particular, if
is a local/global saddle point of
, then
is a local/global minimizer of problem (3). Furthermore,
with
becomes a KKT point, provided that the metric subregularity constraint qualification (MSCQ) holds. Conversely, if
is a KKT point, then
is a global saddle point of
for all
, as problem (3) is convex, where we only require the set
K to be closed under addition, not necessarily to be a cone. The relationship between saddle points and the dual problem associated with the partial augmented Lagrangian is discussed.
- (ii)
Second-order analysis for
functions. According to the definition of saddle points (15) below, we know that
is a minimizer of the following problem:
where
. Note that since
f and
G are
functions, then
also belongs to the
class, rather than being twice continuously differentiable. To establish second-order optimality conditions for problem (6), we need to study the second-order approximation of
. Toward this end, we employ the second-order subdifferential
, defined as the coderivative of the gradient
. It enables us to obtain upper and lower bounds for the first-order Taylor expansion of
with respect to
x. In the process of theoretical analysis, the outer semicontinuity and local boundedness properties of
for
functions play an important role. Many works on second-order optimality conditions traditionally require the functions to be twice continuously differentiable; see [7,17,18,19,20,21]. Our work further relaxes this requirement from twice continuous differentiability to first-order continuous differentiability with Lipschitzian gradients, i.e., from
to
.
- (iii)
Second-order variational geometry of the constraint set
. Since the set
is explicitly retained in the constraint system, it is necessary to study its variational geometric properties. The traditional approach for describing second-order geometric information of sets utilizes the concept of second-order tangent sets. However, this set may be empty even for convex sets. To overcome this limitation, we further study the asymptotic second-order tangent cone. A key theoretical result is that the second-order tangent set and the asymptotic second-order tangent cone cannot be empty simultaneously (Proposition 2.1 in [22]). Therefore, the asymptotic second-order tangent cone serves as a supplementary tool to second-order tangent sets, and their combination provides a complete characterization of a set's second-order geometric information. By combining this geometric analysis of the constraint system with the aforementioned second-order analysis of the objective function, we establish second-order optimality conditions for problem (6).
The structure of this paper is organized as follows.
Section 2 presents fundamental notations and related results in the field of variational analysis.
Section 3 introduces the partial augmented Lagrangian function and investigates its saddle point properties.
Section 4 develops second-order optimality conditions for the partial augmented Lagrangian. Conclusions are drawn in Section 5.
2. Basic Notations and Tools in Variational Analysis
In this section, we first recall some notations and fundamental results in variational analysis which are used throughout the paper.
Let
denote the closed unit ball in
. For a nonempty set
S, the support function is defined by
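In standard notation, the support function of a nonempty set $S \subseteq \mathbb{R}^{n}$ is:

```latex
\sigma_{S}(v) \;=\; \sup_{s \in S} \langle v, s \rangle .
```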
The indicator function is defined as follows:
if
, and
otherwise. The metric projection of a point
x onto the set
S is denoted by
. For a set-valued mapping
, its graph and inverse mapping are
and
The Painlevé-Kuratowski upper/outer limit of a set-valued mapping
F at a point
x is defined as
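A common formulation of this outer limit (our transcription) is:

```latex
\mathop{\mathrm{Limsup}}_{x' \to x} F(x') \;=\; \bigl\{\, y \;:\; \exists\, x_k \to x,\; y_k \to y \ \text{with}\ y_k \in F(x_k) \,\bigr\}.
```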
The Bouligand-Severi tangent/contingent cone to a closed set
S at a point
is
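In standard notation, the contingent cone reads:

```latex
T_{S}(\bar{x}) \;=\; \bigl\{\, d \;:\; \exists\, t_k \downarrow 0,\; d_k \to d \ \text{with}\ \bar{x} + t_k d_k \in S \,\bigr\}.
```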
The Fréchet normal cone and the limiting/Mordukhovich/basic normal cone of
S at
are given by
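The standard forms of these two cones (our transcription, using the convergence notation explained just below) are:

```latex
\widehat{N}_{S}(\bar{x}) \;=\; \Bigl\{\, v \;:\; \limsup_{x \overset{S}{\to} \bar{x}} \frac{\langle v,\, x - \bar{x} \rangle}{\|x - \bar{x}\|} \le 0 \,\Bigr\},
\qquad
N_{S}(\bar{x}) \;=\; \mathop{\mathrm{Limsup}}_{x \overset{S}{\to} \bar{x}} \widehat{N}_{S}(x),
```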
where
represents the convergence of
x to
with
. If
S is convex, the Fréchet and limiting normal cones coincide. For a given direction
, the limiting normal cone to
S in direction
d at
is defined as
If, in particular,
, then
coincides with
.
Now we are ready to review two kinds of second-order tangent sets, both of which play fundamental roles in the second-order analysis later.
Definition 1 ([16,22]). Let , and .
- (i)
The outer second-order tangent set to S at in direction d is defined by
- (ii)
The asymptotic second-order tangent cone to S at in direction d is defined by
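In commonly used notation (which may differ in normalization from [16,22]), these two objects can be written as:

```latex
T_{S}^{2}(\bar{x}, d) \;=\; \bigl\{\, w \;:\; \exists\, t_k \downarrow 0,\; w_k \to w \ \text{with}\ \bar{x} + t_k d + \tfrac{1}{2} t_k^{2} w_k \in S \,\bigr\},
\qquad
T_{S}^{\prime\prime}(\bar{x}, d) \;=\; \bigl\{\, w \;:\; \exists\, t_k \downarrow 0,\; r_k \downarrow 0,\; t_k / r_k \to 0,\; w_k \to w \ \text{with}\ \bar{x} + t_k d + \tfrac{1}{2} t_k r_k w_k \in S \,\bigr\}.
```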
The asymptotic second-order tangent cone was first introduced by Penot [22] in the study of optimality conditions for scalar optimization. Note that the asymptotic second-order tangent cone is indeed a cone, while the second-order tangent set may fail to be a cone and may even be empty; see, e.g., Bonnans and Shapiro (Example 3.29 in [16]). An important fact is that both sets cannot be empty simultaneously (Proposition 2.1 in [22]), i.e.,
. In this sense, these two sets together sufficiently describe the second-order information of the involved set. In addition, it is easy to see from the definitions that
The following is the concept of directional neighborhood.
Definition 2 ([23]). Given a direction and positive numbers , the directional neighborhood of direction d is defined as follows:
The concept of directional metric subregularity, as a directional version of constraint qualifications, plays an important role in developing variational geometric properties of constraint systems.
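One common formulation of the directional neighborhood $V_{\rho,\delta}(d)$ in the directional-regularity literature (stated here as an assumption; the authors' normalization may differ) is:

```latex
V_{\rho,\delta}(d) \;=\; \bigl\{\, u \in \rho \mathbb{B} \;:\; \bigl\|\, \|d\|\, u - \|u\|\, d \,\bigr\| \le \delta\, \|u\|\, \|d\| \,\bigr\}.
```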
Definition 3 (Directional Metric Subregularity, [23]). Given a multifunction and , the mapping M is said to be metrically subregular at in direction , if there are positive numbers such that
The infimum of κ over all combinations , and κ satisfying the above relations is called the modulus of directional metric subregularity. In the case of , we simply say that M is metrically subregular at . For the constraint system , we say that the metric subregularity constraint qualification (MSCQ) holds at in direction d if the set-valued mapping is metrically subregular at in direction d.
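The defining estimate in Definition 3 is commonly stated as follows (our transcription, using the directional neighborhood of Definition 2):

```latex
\mathrm{dist}\bigl(x,\, M^{-1}(\bar{y})\bigr) \;\le\; \kappa\, \mathrm{dist}\bigl(\bar{y},\, M(x)\bigr)
\qquad \text{for all } x \in \bar{x} + V_{\rho,\delta}(d).
```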
Lemma 1 (Lemma 2.2, [21]). Let and assume that MSCQ holds in direction for the constraint system with modulus . Then,
The relations between the second-order tangent set and the asymptotic second-order tangent cone of the sets C and D, where , under directional MSCQ are given below.
Lemma 2 (Proposition 2.2, [21]). Let and . Then
and
If, in addition, MSCQ holds in direction for the constraint system with modulus , then (9) and (10) hold as equalities, together with the following estimates
and
Let
be a single-valued map, and suppose that
satisfies
. The limiting (Mordukhovich) subdifferential of
at
is defined as
where
stands for the epigraph of
.
Definition 4 ([24]). Let be a multifunction and . The limiting (Mordukhovich) coderivative of F at is a multifunction with the values
If , one puts for any . We simply write when F is single-valued at and .
By employing the notion of coderivative, we can establish the second-order generalized differential theory for extended-real-valued functions. This theoretical framework has become increasingly significant in areas such as second-order conditions, stability theory, and algorithmic analysis [25,26,27].
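In standard notation, the coderivative of Definition 4 is given via the limiting normal cone to the graph:

```latex
D^{*}F(\bar{x}, \bar{y})(v) \;=\; \bigl\{\, u \;:\; (u, -v) \in N_{\mathrm{gph}\, F}(\bar{x}, \bar{y}) \,\bigr\}.
```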
Definition 5 ([24]). Let be a function with a finite value at . For any , the map with the values
is said to be the limiting (Mordukhovich) second-order subdifferential of ϕ at relative to . If is a singleton, we simply write for convenience.
If
is twice continuously differentiable in a neighborhood of
, by Proposition 1.119 in [28],
where
denotes the Hessian matrix of
at
. Denote by
the class of real-valued functions
, which are Fréchet differentiable, and the gradient mapping
is locally Lipschitz. According to Proposition 1.120 in [28], if
, one has
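In standard notation (our transcription), the two formulas referenced in this paragraph are the Hessian formula for twice continuously differentiable $\phi$ and the scalarization formula for $\phi \in C^{1,1}$:

```latex
\partial^{2}\phi(\bar{x})(u) \;=\; \bigl\{ \nabla^{2}\phi(\bar{x})\, u \bigr\},
\qquad
\partial^{2}\phi(\bar{x})(u) \;=\; \partial \langle u, \nabla \phi \rangle(\bar{x}),
\qquad u \in \mathbb{R}^{n}.
```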
Lemma 3 (Proposition 2.6, [29]). Let . The following assertions hold:
- (i)
For any , one has ,
- (ii)
For any the mapping is locally bounded. Moreover, if for all , then .
The following result establishes upper and lower bounds for the first-order Taylor approximation of functions by employing the limiting second-order subdifferential.
Lemma 4 (Theorem 3.1, [30]). Let and . Then, there exist where , and where , such that
3. Saddle Points and KKT Conditions
The set of minimum points corresponding to the Moreau envelope
(4) is defined as
In particular, if
g is convex, then
is Fréchet differentiable and its gradient
is
Lipschitz-continuous. Because the set
K considered in this paper is convex, then
is continuously differentiable with respect to
x and the derivative takes the form
Since the projection operator
is Lipschitz with constant 1, then
is Lipschitz as well. Hence the partial augmented Lagrangian function
belongs to
.
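The smoothing effect just described can be illustrated numerically. The sketch below is our own one-dimensional illustration (function names are ours): it computes the Moreau envelope of g(y) = |y|, which is the classical Huber function, together with the gradient formula (x − prox)/λ.

```python
def prox_abs(x, lam):
    """Proximal operator of g(y) = |y|: soft-thresholding with threshold lam."""
    sign = (x > 0) - (x < 0)
    return sign * max(abs(x) - lam, 0.0)

def moreau_env(x, lam):
    """Moreau envelope e_lam(x) = min_y { |y| + (x - y)**2 / (2*lam) },
    attained at y = prox_abs(x, lam); this is the Huber function."""
    y = prox_abs(x, lam)
    return abs(y) + (x - y) ** 2 / (2.0 * lam)

def moreau_grad(x, lam):
    """Gradient of the envelope: (x - prox_abs(x, lam)) / lam, Lipschitz with modulus 1/lam."""
    return (x - prox_abs(x, lam)) / lam

# Quadratic near the kink of |x|, linear in the tails: the nonsmoothness is smoothed out.
print(moreau_env(0.5, 1.0))   # 0.125  (= x**2 / (2*lam) for |x| <= lam)
print(moreau_env(2.0, 1.0))   # 1.5    (= |x| - lam/2 for |x| > lam)
print(moreau_grad(2.0, 1.0))  # 1.0    (the gradient saturates at +/-1)
```

The nonexpansiveness of the projection (or here, of the proximal map) is exactly what makes the gradient Lipschitz, matching the discussion above.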
For simplicity, we say that
is a Karush–Kuhn–Tucker (KKT) point of problem (3) if there exists
satisfying the following condition:
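For a constraint structure of the form $G(x) \in K$, $H(x) \in D$, a standard KKT system reads as follows (our transcription, stated as an assumption; the exact form of condition (14) may differ):

```latex
0 \;=\; \nabla f(\bar{x}) + \nabla G(\bar{x})^{\top} \bar{\mu} + \nabla H(\bar{x})^{\top} \bar{\lambda},
\qquad
\bar{\mu} \in N_{K}\bigl(G(\bar{x})\bigr),
\qquad
\bar{\lambda} \in N_{D}\bigl(H(\bar{x})\bigr).
```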
The definition of local and global saddle points is given below.
Definition 6 (Local/Global saddle point). A pair with is called a local saddle point of the partial augmented Lagrangian if there exists and a neighborhood U of such that
If the restriction of U is omitted, then the pair is called a global saddle point and the infimum of all such is denoted by .
Notice that there are two inequalities in the definition of saddle points. We begin by analyzing the first inequality.
Lemma 5. For a given , suppose that there exists such that
Then the following results hold.
- (i)
is a feasible solution of problem (3);
- (ii)
;
- (iii)
.
Proof. (i) Suppose on the contrary that is not a feasible point of problem (3). Since by assumption, i.e., the constraint condition holds, then it remains to consider the case of .
Let
Then
since
It follows from Example 6.16 in [31] that
further implying
for all
, since
is a cone. Note that
by (17). This together with (18) implies
where the second step above comes from the fact that
. Thus, we obtain from (17) and (19) that
Furthermore, as
, it follows from (16) and (20) that
Taking the limit as
in (21) and using the fact that
leads to a contradiction with the finiteness of
by (16). Thus,
.
(ii) If
x satisfies
and
, then
Hence,
In addition,
Since
is a feasible point by (i), according to (22) and (23), we have
Combining (16) and (24) yields
Thus,
for all
.
(iii) According to (25), we have
which implies
because both
and
belong to
K and the projection onto a convex set is unique. Hence,
This completes the proof. □
From the above proof, it can be seen that
i.e., (22) holds as an equality whenever
. In fact, since
, then
, and hence,
Lemma 6. If is a local/global saddle point of , then is a locally/globally optimal solution of problem (3).
Proof. The whole proof is divided into two parts. The first part proves that is a feasible point of problem (3) and the second part proves that is a local/global minimizer. Here we only consider the case of local saddle points, because the case of global saddle points can be proved by replacing the neighborhood U appearing in the following analysis by the whole space .
(i). Since
is a local saddle point of the partial augmented Lagrangian function
, then there exists
and a neighborhood
U of
such that
According to Lemma 5 (i), we know that
is a feasible solution of problem (3).
(ii). Since
is a local saddle point of the partial augmented Lagrangian function
, then for a feasible point
x of problem (3) satisfying
and
, one has
where the first equality comes from Lemma 5 (ii), the first inequality follows from (27), and the last step is due to (22). Thus
is a locally optimal solution to problem (3). □
The dual problem associated with the partial augmented Lagrangian
for problem (3) is defined as
where
and
. From (22), the weak duality property between the primal problem (3) and its dual problem holds, i.e.,
where
denotes the feasible set of problem (3).
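In standard notation (with the dual objective written here as $\theta$, our own symbol, and $\mathcal{F}$ the feasible set), the weak duality relation states:

```latex
\sup_{\mu,\, \rho > 0} \;\theta(\mu, \rho) \;\le\; \inf_{x \in \mathcal{F}} f(x).
```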
We say that the zero duality gap property holds for the partial augmented Lagrangian
, if
The relationship between saddle points and the zero duality gap property is given below.
Theorem 1. (i) If is a saddle point of the partial augmented Lagrangian , then for any , the pair is an optimal solution of the dual problem (28), and the zero duality gap property holds. (ii) If is an optimal solution of the dual problem (28), is an optimal solution of problem (3), and the zero duality gap property holds, then the pair is a global saddle point of and .
Proof. (i) Let
be a saddle point of
. For
, applying Lemma 5 yields
By the definition of
, it follows that for all
and
Taking the supremum over all
, we obtain from (30) that for all
,
which implies
Therefore,
is an optimal solution of the dual problem (28), and the zero duality gap property holds, since
is an optimal solution of problem (3) by Lemma 6.
(ii) Let
be an optimal solution of the dual problem (28),
be an optimal solution of problem (3), and suppose that zero duality gap property holds, i.e.,
Note that
is nondecreasing in
for all
. It follows that
for all
. Since
is feasible, then
for any
. Therefore, by the definition of
, we have
which implies
. Since
is feasible for problem (3), it follows from (29) that for all
Combining (31) and (32) yields
This means that
is a global saddle point of
and
. □
Under the metric subregularity constraint qualification, saddle points necessarily satisfy the KKT conditions.
Theorem 2. If is a local/global saddle point of , and the metric subregularity constraint qualification (MSCQ) holds at for the system , then there exists such that satisfies the KKT conditions for problem (3).
Proof. Since
is a local/global saddle point of
, then
by Lemma 5 (iii), which further implies
. Thus, according to (13), we have
Under the MSCQ condition, it follows from (8) that
Since
is continuously differentiable, then
where the second equality follows from (33) and
by Exercise 8.14 in [31].
Since
is a local/global saddle point of
, it follows from the definition that
is a local/global minimizer of
over
, i.e.,
is a local/global minimizer of
. By applying Fermat’s rule generalized (see Theorem 10.1 in [
31]), we have
where the second step follows from the fact that
since
, and the third step comes from (35), and the last step is due to (34).
The formula (36) ensures the existence of
such that
This together with the fact
as shown in Lemma 5 (iii) yields that
satisfies the KKT condition (14) for problem (3). □
In the remainder of this section, we show that the converse of Theorem 2 is valid when problem (3) is convex.
Definition 7. We say that problem (3) is convex if f is convex, the sets K and D are closed convex sets satisfying the additive closure properties:
and the mappings G and H are convex with respect to K and D, respectively, i.e., for all and , we have
where means .
Here are some examples of convex sets that satisfy the additivity property.
Example 1. (i). The set K is a convex cone. It is well-known that for any convex cone K, the property holds, meaning that additivity is satisfied.
- (ii).
The set , where for , is a closed convex set that satisfies , but it is not a convex cone.
- (iii).
Let and . Define the set Then K is a closed convex set such that , but K is not a convex cone.
Let . Clearly f is continuous on the nonnegative orthant , which implies that K is closed. Note first that for any , we must have for all i, due to . Thus, K is a subset of the positive orthant . The function f is concave on , as it is the geometric mean function (Examples 3.1.5 in [32]). Hence, K is convex, since it is a superlevel set of a concave function. Now, take and consider their sum . Clearly, for all i. It remains to show that . By the inequality of arithmetic and geometric means, we have for each i. Therefore, Taking the th power on both sides yields Thus, , which implies . Note that K is not a cone. In fact, take any such that (for example, ). For , we have , so . Thus, K is not closed under scalar multiplication and hence is not a cone.
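The additivity argument above can be checked numerically. The sketch below is our own illustration: the threshold value c = 1 in the membership test is an assumed instantiation (the defining level of K is not shown in the text), and the check rests on the superadditivity of the geometric mean used in the proof.

```python
import math

def geometric_mean(x):
    """(x_1 * ... * x_n) ** (1/n) for a vector with nonnegative entries."""
    return math.prod(x) ** (1.0 / len(x))

def in_K(x, c=1.0):
    """Membership in K = {x >= 0 : geometric_mean(x) >= c}; c = 1 is an assumed choice."""
    return all(t >= 0 for t in x) and geometric_mean(x) >= c

u, v = [1.0, 4.0], [4.0, 1.0]
s = [a + b for a, b in zip(u, v)]  # s = [5.0, 5.0]
# Superadditivity (a Minkowski-type inequality): gm(u + v) >= gm(u) + gm(v),
# which is what makes K closed under addition.
print(geometric_mean(u), geometric_mean(v), geometric_mean(s))  # 2.0 2.0 5.0
print(in_K(u) and in_K(v) and in_K(s))  # True
# K is not a cone: (1, 1) lies in K, but halving it drops the geometric mean below 1.
print(in_K([0.5, 0.5]))  # False
```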
The following results show that the Moreau envelope preserves convexity properties under certain conditions.
Lemma 7. Let and K be a closed convex set. Assume that G is a convex mapping with respect to K with . Then and
is convex, where .
Proof. Since
G is a convex mapping with respect to
K, it follows from (37) that for any
and
, we have
This ensures that
which is equivalent to the statement that if
, then
as well. Suppose on the contrary that
, then
where the last step is due to the fact that
. This leads to a contradiction with
. Note that
is convex, since
K is convex. This together with (39) yields
which means that
is convex.
Since
K is convex, it follows from (12) that
is differentiable with the gradient
Hence, for
, we have
where the fifth step follows from the fact that the metric projection is a non-expansive mapping, i.e.,
Thus, we show that
is monotone by (40). Taking into account Theorem 12.17 in [31], the function
is convex.
Note that
where the inequality comes from the fact that
by (38) and
by assumption.
Since
is convex as shown above, we have
This together with (41) yields
i.e.,
. This completes the proof. □
Theorem 3. Suppose that problem (3) is convex. If satisfies the KKT conditions, then is a global saddle point of for all .
Proof. Taking into account the convexity of problem (3) and Lemma 7, we know that the function
is convex. Hence,
where the first step follows from the fact
since
D is convex, and the second step comes from
by Exercise 8.14 and Theorem 10.6 in [31].
Define
. Note that the function
is convex, since
is composed of
,
, and
, and these functions are convex with respect to
x by Lemma 7. Since
satisfies the KKT conditions (14), then
and
. It then follows from (33) that
where the fourth step is ensured by (42).
Since
is convex and
by (43), we know that
is a global optimal solution of
. Therefore,
On the other hand, for any
and
, since
is feasible and
, it follows from (22) and (26) that
Therefore, putting (44) and (45) together yields
which means that
is a global saddle point of
. □
4. Optimality Conditions for Partial Augmented Lagrangian
In the previous section, we have discussed the first inequality in the definition of saddle points (15). Now, we turn our attention to the second inequality that appears in (15). This inequality indicates that
is a local optimal solution of the following optimization problem
The concept of directional local minimizer is given below.
Definition 8. A point is said to be a local optimal solution of (46) in direction , if there exist positive numbers such that The following result establishes the first-order optimality conditions for problem (
46).
Theorem 4. Let be a local optimal solution of problem (46) in direction . For the constraint system , suppose that MSCQ holds at in direction with modulus . Then
- (i)
- (ii)
if , then there exists such that
Proof. Since
is continuously differentiable with respect to
x, applying Proposition 3.1 in [21] yields
- (a)
;
- (b)
if , then .
By (13), we know
which together with (a) yields the desired conclusion (i).
According to (b) and Lemma 1, there exists
such that
and
. The desired results hold by further utilizing the formula of
given in (47). □
The second-order necessary condition is obtained using second-order tangent sets and asymptotic second-order tangent cones.
Theorem 5. Let be a local optimal solution of problem (46) in direction with . Then,
- (i)
- (ii)
for any , there exists such that
Proof. (i) Pick
. Then, there exist
and
such that
and
for all
. Since
then
whenever
k is sufficiently large, and hence
. Note that
. Then
Since
, it follows from Lemma 4 that there exists
, where
, such that
According to Lemma 3, we have
which implies
for some
This together with (48) and (49) yields
By dividing both sides of the above inequality by
, we get
We claim that
is bounded. Let
. According to (11) and (50), we have
Note that
where
L is the Lipschitz constant of
. This implies that the subdifferential
is included in
. Since
by (52), then
is bounded. We can assume without loss of generality that
. Since
, then
by Lemma 3. Using
and taking the limit on both sides of (51) yields
(ii) Pick
. Then, there exist
and
such that
for all
. By an argument similar to that used for (51) in case (i), we can obtain
where
and
. Taking limits yields
This completes the proof. □
Corollary 1. Let be a local optimal solution of problem (46) in direction with . For the constraint system , suppose that MSCQ holds at in direction d. Then,
- (i)
for every satisfying , we have .
- (ii)
for every satisfying , there exists such that .
Proof. The result follows immediately by applying Lemma 2 and Theorem 5. □
The following result develops second-order necessary conditions in terms of support functions.
Theorem 6. Let be a local optimal solution of problem (46) in direction with . If satisfies , then
- (i)
- (ii)
for each , there exists such that
Proof. (i) Pick
, i.e., there exists
such that
. Hence,
where the third step comes from the fact
by assumption and the last step follows from Theorem 5 (i).
(ii) If
, then the corresponding support function takes the value
, and in this case, (53) holds trivially. If
, then there exists
such that
According to Theorem 5, for the above
, there exists
such that
Therefore,
where the third equality comes from the fact
by assumption, the fifth equality follows from (54), and the last step is due to (55). □
Lemma 8 (Lemma 3.4, [33]). Let . If the sequence converges to such that converges to some nonzero vector , where , then either converges to some vector , or there exists a sequence such that and converges to some vector , where denotes the orthogonal subspace to d.
The second-order sufficient condition is derived below.
Theorem 7. Let be a feasible solution of problem (46), i.e., . Let with . Suppose that there exists such that and
- (i)
- (ii)
Then, the second-order growth condition holds at in direction d, i.e., there exist such that
Proof. Suppose on the contrary that the second-order growth condition in direction
d does not hold at
. This means that there exists a sequence
such that
and
Pick
with
satisfying (i) and (ii). Let
and
. Then
and we can assume without loss of generality that
. Lemma 8 ensures that one of the following conditions holds:
- (a)
converges to some vector ;
- (b)
there exists a sequence such that and converges to some vector .
Case (1). If condition (a) holds, then
. Since
, it follows from Lemma 4 that there exists
, where
, such that
Note that
where the last step is due to
by assumption. Hence, it follows from (57) and (58) that
According to Lemma 3, we have
Hence,
with
. Following a similar argument as given for (52), we can assume without loss of generality that
converges to some
by Lemma 3 (i). Hence
for some
.
It follows from (56) and (59) that
Since
, then
Putting (60) and (61) together gives
Dividing both sides of the above equation by
and letting
yields
Since
, then
and hence,
Note that
where the first step is due to
by assumption, the third step follows from (64), and the last step comes from (63). Hence, it follows from (65) that
Recall that
and
. Hence, applying (7), the above formula can be rewritten as
which is a contradiction to assumption (ii).
Case (2). If condition (b) holds, then
. By following a similar argument on formula (62) as given in case (1), we can obtain
where
. Dividing both sides of the above equation by
yields
Taking the limit as
and noting that
yields
This together with the fact
by assumption implies
which is a contradiction to condition (i). □
Corollary 2. Let be a feasible solution of problem (46), i.e., , and let such that . Suppose that there exists satisfying and
- (i)
- (ii)
Then, the second-order growth condition holds at in direction d.
Proof. The result follows immediately by applying Lemma 2 and Theorem 7. □
Example 2. Consider the following optimization problem:
where
For , the partial augmented Lagrangian is
and its gradient is
Let . Note that and . Hence, if and lies in some neighborhood of , then , which implies . So,
For any nonzero direction , we show that either or and the second-order condition in Corollary 2 holds. In fact, since , then . Consider the following two cases. If , then . If , then by direct calculation, we obtain , , and
Let . Then, and for all . Hence, the condition (i) in Corollary 2 holds.
Define . Then,
For , the gradient of ϕ is , and for , it is . Therefore, the subdifferential of ϕ at is given by
It is straightforward to compute that
For any , we have with if , or if . So,
The support function of the second-order tangent set at is
Combining (66)–(68)
yields
So, the condition (ii) in Corollary 2 holds. According to the above analysis, we can conclude that is a local minimizer of . This establishes the second inequality in (15). The first inequality in (15) follows from the fact that by (26) since , and that for all by (22) due to being feasible. This shows that is a local saddle point of the problem.