Abstract
In this work, we consider the monotone inclusion problem in real Hilbert spaces and propose a simple inertial method that does not require any evaluations of the associated resolvent or projection. Under suitable assumptions, we establish the strong convergence of the method to a minimal-norm solution. Saddle points of minimax problems and critical point problems are considered as applications. Numerical examples in finite- and infinite-dimensional spaces illustrate the performance of our scheme.
MSC:
65K05; 65K10; 47H10; 47L25
1. Introduction
Since Minty [1], and many others to follow, such as [2,3,4], introduced the theory of monotone operators, a large number of theoretical and practical developments have been presented. Pascali and Sburlan [5] pointed out that the class of monotone operators is important and, due to the simple structure of the monotonicity condition, can be handled easily. The monotone inclusion problem is one of the highlights owing to its significance in convex analysis and convex optimization, as it encompasses convex minimization, monotone variational inequalities, convex-concave minimax problems, linear programming and many other problems. For further information and applications, see, e.g., Bot and Csetnek [6], Korpelevich [7], Khan et al. [8], Sicre et al. [9], Xu [10], Yin et al. [11] and the many references therein [12,13,14,15].
Let H be a real Hilbert space and let $A : H \to H$ be a given operator with domain $D(A)$. The monotone inclusion problem is formulated as finding a point $x^* \in H$ such that
$$0 \in Ax^*. \tag{1}$$
We denote the solution set of (1) by $\Omega$.
One of the simplest classical algorithms for solving the monotone inclusion problem (1) is the proximal point method of Martinet [16]. Given a maximal monotone mapping A and its associated resolvent $J_{\lambda}^{A} = (I + \lambda A)^{-1}$ with $\lambda > 0$, the proximal point algorithm generates a sequence $\{x_n\}$ according to the update rule:
$$x_{n+1} = J_{\lambda_n}^{A} x_n = (I + \lambda_n A)^{-1} x_n. \tag{2}$$
The proximal point algorithm, also known as the regularization algorithm, is a first-order optimization method that requires only function and gradient (subgradient) evaluations, and it has thus attracted much interest. For more improvements and achievements concerning regularization methods in Hilbert spaces, one can refer to [17,18,19,20,21,22,23].
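To make the resolvent evaluation in (2) concrete, here is a minimal sketch that applies the proximal point update to the illustrative linear operator $A(x) = Mx$ with $M$ symmetric positive semidefinite, for which the resolvent $(I + \lambda A)^{-1}$ reduces to a linear solve; the matrix, step size and iteration count are our assumptions, not taken from the paper.

```python
import numpy as np

# Proximal point update (2): x_{n+1} = (I + lam*A)^{-1} x_n for the
# illustrative maximal monotone operator A(x) = M x, M symmetric PSD,
# so each resolvent evaluation is a single linear solve.
M = np.array([[2.0, 0.0],
              [0.0, 1.0]])      # hypothetical operator matrix
lam = 1.0                       # regularization parameter (held constant)
I = np.eye(2)

x = np.array([5.0, -3.0])       # arbitrary starting point
for n in range(100):
    x = np.linalg.solve(I + lam * M, x)   # resolvent evaluation

print(x)                        # tends to the zero of A, here the origin
```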
One important application of monotone inclusions is the convex minimization problem. Given a nonempty, closed and convex set $C \subseteq H$ and a continuously differentiable function f, the constrained minimization problem aims to find a point $x^* \in C$ such that
$$f(x^*) = \min_{x \in C} f(x). \tag{3}$$
Using standard properties from operator theory, it is known that $x^*$ solves (3) if and only if $x^* = P_C(x^* - \lambda \nabla f(x^*))$ for some $\lambda > 0$. This relationship translates into the projected gradient method:
$$x_{n+1} = P_C(x_n - \lambda \nabla f(x_n)),$$
where $P_C$ is the metric projection onto C and $\nabla f$ is the gradient of f.
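As an illustration of the projected gradient update, the sketch below minimizes the assumed quadratic $f(x) = \frac{1}{2}\|x - b\|^2$ over the box $C = [0,1]^2$, where both the gradient and the projection have closed forms; all problem data are hypothetical.

```python
import numpy as np

# Projected gradient step: x_{n+1} = P_C(x_n - lam * grad_f(x_n)) for the
# hypothetical problem f(x) = 0.5*||x - b||^2 over the box C = [0, 1]^2.
b = np.array([2.0, -1.0])
grad_f = lambda x: x - b                   # gradient of f (Lipschitz constant 1)
proj_C = lambda x: np.clip(x, 0.0, 1.0)    # closed-form projection onto the box
lam = 0.5                                  # step size in (0, 2/L)

x = np.zeros(2)
for n in range(200):
    x = proj_C(x - lam * grad_f(x))

print(x)   # minimizer: b clipped to the box, i.e., [1, 0]
```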
The projected gradient method calls for the evaluation of the projection onto the feasible set C as well as the gradient of f; this guarantees a reduction in the objective function while keeping the iterates feasible. With the set C as above and an operator $A : C \to H$, an important problem worth mentioning is the monotone variational inequality problem, consisting of finding a point $x^* \in C$ such that
$$\langle Ax^*, x - x^* \rangle \ge 0 \quad \text{for all } x \in C. \tag{4}$$
Using the relationship between the projection $P_C$, the resolvent and the normal cone $N_C$ of the set C, that is,
$$P_C = (I + \lambda N_C)^{-1} = J_{\lambda}^{N_C}, \quad \lambda > 0,$$
we obtain the iterative step rule for solving (4):
$$x_{n+1} = P_C(x_n - \lambda A x_n). \tag{5}$$
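The following is a minimal sketch of iteration (5) for a monotone affine operator and a ball constraint, both chosen by us for illustration; at a solution the fixed-point residual $\|x - P_C(x - \lambda Ax)\|$ vanishes.

```python
import numpy as np

# Iteration (5): x_{n+1} = P_C(x_n - lam * A(x_n)) for a hypothetical VI with
# the strongly monotone affine operator A(x) = R x + q over the unit ball C.
R = np.array([[1.0, -0.5],
              [0.5,  1.0]])               # symmetric part = I  =>  monotone
q = np.array([1.0, 1.0])
A = lambda x: R @ x + q

def proj_ball(x):
    nrm = np.linalg.norm(x)
    return x if nrm <= 1.0 else x / nrm   # projection onto the unit ball

x = np.zeros(2)
lam = 0.3
for n in range(500):
    x = proj_ball(x - lam * A(x))

res = np.linalg.norm(x - proj_ball(x - lam * A(x)))
print(x, res)                             # fixed-point residual ~ 0 at a VI solution
```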
Indeed, the optimization methods mentioned above now "dominate" among modern optimization algorithms based on first-order information (such as function values and gradients/subgradients), and it can be expected that they will become increasingly important as the scale of practical problems increases. For excellent accounts, one can refer to Teboulle [24], Drusvyatskiy and Lewis [25], etc. However, it is undeniable that they depend heavily on the structure of the given problem; computationally, these methods rely on the ability to compute resolvents/projections at each iteration. Taking algorithm (5), for instance, the complexity of each step depends on the computation of the projection onto the convex set C.
Hence, in this work, we wish to combine the popular inertial technique (see, e.g., Nesterov [26], Alvarez [27] and Alvarez-Attouch [28]) and establish a strongly convergent iterative method that uses neither resolvents nor projections and that enjoys good convergence properties thanks to the inertial term.
The outline of this paper is as follows. In Section 2, we collect the definitions and results needed for our analysis. In Section 3, the resolvent/projection-free algorithm and its convergence analysis are presented. Later, in Section 4, we present two applications of the monotone inclusion problem, saddle points of the minimax problem and the critical points problem. Finally, in Section 5, numerical experiments illustrate the performance of our scheme in finite- and infinite-dimensional spaces.
2. Preliminaries
Let C be a nonempty, closed and convex subset of a real Hilbert space H equipped with the inner product $\langle \cdot, \cdot \rangle$ and the induced norm $\|\cdot\|$. We denote the strong convergence of a sequence $\{x_n\}$ to x by $x_n \to x$, its weak convergence by $x_n \rightharpoonup x$, and the weak limit set of $\{x_n\}$ by
$$\omega_w(x_n) = \{ x \in H : x_{n_j} \rightharpoonup x \text{ for some subsequence } \{x_{n_j}\} \text{ of } \{x_n\} \}.$$
We recall two useful properties of the norm:
$$\|x + y\|^2 \le \|x\|^2 + 2\langle y, x + y \rangle$$
and
$$\|\alpha x + \beta y\|^2 = \alpha \|x\|^2 + \beta \|y\|^2 - \alpha\beta \|x - y\|^2$$
for all $x, y \in H$ and $\alpha, \beta \in [0, 1]$ such that $\alpha + \beta = 1$.
Definition 1.
Let H be a real Hilbert space. An operator $A : H \to H$ is called inverse strongly monotone (μ-ism) (or μ-cocoercive) if there exists a number $\mu > 0$ such that
$$\langle Ax - Ay, x - y \rangle \ge \mu \|Ax - Ay\|^2 \quad \text{for all } x, y \in H.$$
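As a quick sanity check of Definition 1, the sketch below numerically verifies that the assumed linear operator $A(x) = Mx$, with $M$ symmetric positive semidefinite, is $\mu$-cocoercive with $\mu = 1/\lambda_{\max}(M)$; the matrix is hypothetical.

```python
import numpy as np

# Sanity check of Definition 1: A(x) = Mx with M symmetric PSD is
# mu-cocoercive with mu = 1/lambda_max(M); M is a hypothetical example.
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
M = B.T @ B                                  # symmetric positive semidefinite
mu = 1.0 / np.linalg.eigvalsh(M)[-1]         # reciprocal of the largest eigenvalue

for _ in range(1000):
    x, y = rng.standard_normal(4), rng.standard_normal(4)
    lhs = np.dot(M @ x - M @ y, x - y)       # <Ax - Ay, x - y>
    rhs = mu * np.linalg.norm(M @ x - M @ y) ** 2
    assert lhs >= rhs - 1e-9                 # the cocoercivity inequality
print("cocoercivity inequality verified on random samples")
```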
Definition 2.
Let C be a nonempty, closed and convex subset of H. The operator $P_C$ is called the metric projection of H onto C: for every element $x \in H$, there is a unique nearest point $P_C x$ in C such that
$$\|x - P_C x\| \le \|x - y\| \quad \text{for all } y \in C.$$
The characterization of the metric projection $P_C$ is
$$z = P_C x \iff \langle x - z, y - z \rangle \le 0 \quad \text{for all } y \in C.$$
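This variational characterization can be checked numerically; the sketch below does so for the hypothetical box $C = [0,1]^3$, whose projection is a componentwise clip.

```python
import numpy as np

# Numerical check of the characterization: z = P_C(x) iff <x - z, y - z> <= 0
# for all y in C, with the hypothetical box C = [0, 1]^3.
rng = np.random.default_rng(1)
proj_C = lambda u: np.clip(u, 0.0, 1.0)      # componentwise projection onto the box

x = 3.0 * rng.standard_normal(3)
z = proj_C(x)
for _ in range(1000):
    y = rng.uniform(0.0, 1.0, size=3)        # arbitrary point of C
    assert np.dot(x - z, y - z) <= 1e-12
print("variational characterization of the projection holds on samples")
```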
Lemma 1
(Xu [29], Maingé [30]). Assume that $\{a_n\}$ and $\{c_n\}$ are nonnegative real sequences such that
$$a_{n+1} \le (1 - \gamma_n) a_n + \gamma_n \delta_n + c_n,$$
where $\{\gamma_n\}$ is a sequence in $(0,1)$ and $\{\delta_n\}$ is a real sequence. Provided that
(a) $\sum_{n=1}^{\infty} \gamma_n = \infty$; $\limsup_{n \to \infty} \delta_n \le 0$;
(b) $\sum_{n=1}^{\infty} c_n < \infty$.
Then, the limit of the sequence $\{a_n\}$ exists and $\lim_{n \to \infty} a_n = 0$.
Lemma 2
(see, e.g., Opial [31]). Let H be a real Hilbert space and $\{x_n\} \subset H$ a sequence such that there exists a nonempty, closed and convex set $S \subseteq H$ satisfying the following:
(1) For every $z \in S$, $\lim_{n \to \infty} \|x_n - z\|$ exists;
(2) Any weak cluster point of $\{x_n\}$ belongs to S.
Then, there exists $\bar{x} \in S$ such that $\{x_n\}$ converges weakly to $\bar{x}$.
Lemma 3
(see, e.g., Maingé [30]). Let $\{\Gamma_n\}$ be a sequence of real numbers that does not decrease at infinity, in the sense that there exists a subsequence $\{\Gamma_{n_j}\}$ of $\{\Gamma_n\}$ such that $\Gamma_{n_j} < \Gamma_{n_j + 1}$ for all $j \ge 0$. Also consider the sequence of integers $\{\tau(n)\}_{n \ge n_0}$ defined by
$$\tau(n) = \max\{ k \le n : \Gamma_k < \Gamma_{k+1} \}.$$
Then, $\{\tau(n)\}_{n \ge n_0}$ is a nondecreasing sequence verifying $\lim_{n \to \infty} \tau(n) = \infty$ and, for all $n \ge n_0$,
$$\Gamma_{\tau(n)} \le \Gamma_{\tau(n)+1} \quad \text{and} \quad \Gamma_n \le \Gamma_{\tau(n)+1}.$$
3. Main Result
We are concerned with the following monotone inclusion problem: finding $x^* \in H$ such that
$$0 \in Ax^*, \tag{9}$$
where A is a monotone-type operator on H.
Remark 1.
Clearly, if $Ax_n = 0$ for some n, then $x_n$ is a solution of (9) and the iteration process terminates after finitely many steps. In general, the algorithm does not stop after finitely many iterations, and thus we assume that it generates an infinite sequence.
Convergence Analysis
For the convergence analysis of our algorithm, we assume the following assumptions:
(A1) A is a continuous maximal monotone operator from H to H with cocoercivity coefficient $\mu > 0$;
(A2) The solution set of (9) is nonempty.
Theorem 1.
Suppose that assumptions (A1) and (A2) hold and that the parameter sequences of Algorithm 1 satisfy the following conditions:
(B1) , and ;
(B2)
Then, the sequence generated by Algorithm 1 converges strongly to the element p of Ω which is closest to 0; that is, p is the minimal-norm solution of (9).
Algorithm 1
Initialization: Choose the parameter sequences so that the conditions of Theorem 1 hold, select arbitrary starting points $x_0, x_1 \in H$, and set $n = 1$.
Iterative Step: Given the iterates $x_{n-1}$ and $x_n$, for each $n \ge 1$ choose the inertial parameter within the admissible bound and compute $x_{n+1}$ according to the recursion (10).
Stopping Criterion: If $Ax_n = 0$, then stop. Otherwise, set $n := n + 1$ and return to the Iterative Step.
Proof.
First, we prove that the generated sequence is bounded. Without loss of generality, let p be the element of Ω closest to 0, which exists because Ω is nonempty, closed and convex. It follows from the cocoercivity of A with coefficient $\mu$ that
Taking into account the definition of in the recursion (10), we have
and
which implies that Furthermore, we have
In view of the assumption on , we obtain , which entails that there exists some positive constant such that ; therefore,
namely, the sequence is bounded, and so are and .
By using again the definition of , we obtain
Here, two cases should be considered.
Case I. Assume that the sequence is eventually decreasing; namely, there exists $n_0$ such that the sequence decreases for each $n \ge n_0$. Then the limit of the sequence exists. It turns out from (14) and the condition that
which implies that and .
Furthermore, by the setting of , we have and , which together with yields that
Because the sequence is bounded, it follows from the Eberlein-Šmulian theorem that, for an arbitrary point q of its weak limit set, there exists a subsequence that converges weakly to q. Since A is continuous, we have
which entails that $q \in \Omega$. In view of the fact that the choice of q in the weak limit set was arbitrary, we conclude that the whole weak limit set is contained in Ω, which makes Lemma 2 applicable; that is, the sequence converges weakly to some point in Ω.
Now, we claim that , where .
For this purpose, let , and then we have
which yields that
In addition, by using again the definition of , we obtain
and substituting the above inequality in (15), we have
where .
Owing to , we can infer that for each , so we have
In addition, from the assumption on , we have
and from , we have
and therefore Lemma 1 can be applied to (16), namely, .
Case II. Suppose that the sequence does not decrease at infinity, in the sense that there exists a subsequence along which it increases. Owing to Lemma 3, we can deduce that $\Gamma_{\tau(n)} \le \Gamma_{\tau(n)+1}$ and $\Gamma_n \le \Gamma_{\tau(n)+1}$, where $\tau(n)$ is the indicator defined in Lemma 3 and $\tau(n) \to \infty$ as $n \to \infty$.
Note that formula (14) still holds in this case, that is,
In addition, from the theorem's assumptions, we obtain that
Rearranging terms again, we have
which amounts to
Consequently, the sequence converges strongly to p, which is the closest point to 0 in Ω. This completes the proof. □
Remark 2.
If the operator A is cocoercive and accretive, or maximal monotone, then all of the above results still hold.
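Before turning to the applications, the following sketch only illustrates the general shape of such an inertial, resolvent- and projection-free iteration: an extrapolation step followed by a forward evaluation of A with an anchoring factor driving the iterates toward the minimal-norm zero. The update rule, the operator $A(x) = Mx$ and all parameter choices are our illustrative assumptions, not the paper's exact recursion (10).

```python
import numpy as np

# Generic sketch of an inertial, resolvent/projection-free iteration (NOT the
# paper's exact recursion (10)):
#   w_n     = x_n + theta_n (x_n - x_{n-1})         inertial extrapolation
#   x_{n+1} = (1 - alpha_n) (w_n - beta * A(w_n))   anchored forward step
# The anchoring toward 0 reflects strong convergence to the minimal-norm zero.
M = np.array([[2.0, 1.0],
              [1.0, 3.0]])                        # hypothetical SPD matrix
A = lambda x: M @ x                               # mu-cocoercive, mu = 1/lambda_max(M)
beta = 1.0 / np.linalg.eigvalsh(M)[-1]            # step size inside (0, 2*mu)

x_prev, x = np.array([4.0, -2.0]), np.array([3.0, 1.0])
for n in range(1, 2000):
    alpha_n = 1.0 / (n + 1)                       # alpha_n -> 0, sum alpha_n = inf
    theta_n = 0.3                                 # constant inertial coefficient
    w = x + theta_n * (x - x_prev)
    x_prev, x = x, (1.0 - alpha_n) * (w - beta * A(w))

print(x, np.linalg.norm(A(x)))                    # both tend to zero
```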
4. Applications
4.1. Minimax Problem
Suppose $H_1$ and $H_2$ are two real Hilbert spaces. The general convex-concave minimax problem in the Hilbert space setting reads as follows:
$$\min_{x \in Q} \max_{y \in S} L(x, y), \tag{21}$$
where Q and S are nonempty, closed and convex subsets of the Hilbert spaces $H_1$ and $H_2$, respectively, and $L : Q \times S \to \mathbb{R}$ is convex in x (for each fixed $y \in S$) and concave in y (for each fixed $x \in Q$).
A solution $(x^*, y^*)$ of the minimax problem (21) is interpreted as a saddle point, satisfying the inequality
$$L(x^*, y) \le L(x^*, y^*) \le L(x, y^*) \quad \text{for all } x \in Q,\ y \in S,$$
which amounts to the fact that $x^*$ is a minimizer over Q of the function $L(\cdot, y^*)$, and $y^*$ is a maximizer over S of the function $L(x^*, \cdot)$.
Minimax problems are an important modeling tool due to their ability to handle many important applications in machine learning, in particular, in generative adversarial nets (GANs), statistical learning, certification of robustness in deep learning and distributed computing. Some recent works can be seen in, e.g., Ataş [32], Ji-Zhao [33] and Hassanpour et al. [34].
For example, consider the standard convex programming problem
$$\min f(x) \quad \text{subject to} \quad g_i(x) \le 0, \quad i = 1, \dots, m, \tag{22}$$
where f and $g_1, \dots, g_m$ are convex functions. Using the Lagrange function L, problem (22) can be reformulated as the following minimax problem (see, e.g., Qi and Sun [35]):
$$\min_{x} \max_{\lambda \ge 0} L(x, \lambda). \tag{23}$$
It can be seen that $L(x, \lambda)$ in (23) is a convex-concave function, where
$$L(x, \lambda) = f(x) + \sum_{i=1}^{m} \lambda_i g_i(x),$$
and the Kuhn-Tucker vector of (22) is exactly the saddle point of the Lagrangian function in (23).
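As a worked illustration (our own toy instance, not from the paper), take $f(x) = x^2$ with the single constraint $g_1(x) = 1 - x \le 0$; the saddle point of the Lagrangian can be computed by hand:

```latex
L(x, \lambda) = x^2 + \lambda (1 - x), \qquad \lambda \ge 0.
% Stationarity in x:   \partial L / \partial x = 2x - \lambda = 0  \implies  x = \lambda / 2.
% Complementarity:     \lambda (1 - x) = 0 \text{ with } 1 - x \le 0.
% Hence x^* = 1, \lambda^* = 2, and the saddle inequalities hold:
L(1, \lambda) = 1 \;\le\; L(1, 2) = 1 \;\le\; L(x, 2) = x^2 + 2 - 2x = (x - 1)^2 + 1.
```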
Another nice example is the Tchebychev approximation problem, which consists of finding a point such that
that is, for given , finding approaching , where and Q is the space composed of the functions .
It is known that L has a saddle point if and only if
$$\min_{x \in Q} \max_{y \in S} L(x, y) = \max_{y \in S} \min_{x \in Q} L(x, y).$$
If L is convex-concave and differentiable, let $L_x'$ and $L_y'$ denote the derivatives of L with respect to x and y, respectively, and define the operator
$$T(x, y) = (L_x'(x, y), -L_y'(x, y)).$$
Note that T is maximal monotone in the unconstrained case (i.e., $Q = H_1$ and $S = H_2$), and finding a saddle point of L is equivalent to solving the equation $T(x, y) = 0$. For more details on the minimax problem and its solutions, one can refer to the von Neumann works from the 1920s and 1930s [36,37] and Ky Fan's minimax theorem [38].
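To illustrate this reduction, the sketch below treats the assumed convex-concave function $L(x, y) = \frac{1}{2}x^2 + xy - \frac{1}{2}y^2$ and drives $T$ to zero with a plain forward step, which suffices here because this particular $T$ happens to be cocoercive; this illustrates the operator reduction only, not the paper's scheme (24).

```python
import numpy as np

# Saddle point of the hypothetical convex-concave function
#   L(x, y) = 0.5*x**2 + x*y - 0.5*y**2
# via the monotone operator T(x, y) = (L_x'(x, y), -L_y'(x, y)).
L_x = lambda x, y: x + y              # derivative of L in x
L_y = lambda x, y: x - y              # derivative of L in y
T = lambda z: np.array([L_x(z[0], z[1]), -L_y(z[0], z[1])])

z = np.array([2.0, -1.5])             # arbitrary starting point (x_0, y_0)
eta = 0.5                             # step in (0, 2*mu); here T is 1/2-cocoercive
for n in range(200):
    z = z - eta * T(z)                # plain forward step suffices for this T

print(z)                              # converges to the saddle point (0, 0)
```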
Now, we consider the minimax problem (21) in the unconstrained case and let the solution set of the minimax problem be nonempty. Then, by taking $A = T$, we can obtain the saddle point of the minimax problem from the following results.
Theorem 2.
Let $H_1$ and $H_2$ be two real Hilbert spaces. Suppose that the function L is convex-concave and differentiable and that its saddle point set is nonempty. Under the setting of the parameters in Algorithm 1, if the parameter sequences satisfy the conditions of Theorem 1, then the sequence generated by the following scheme
converges strongly to the least-norm saddle point, where the two starting points are arbitrary.
Proof.
Noting that T is maximal monotone, letting A = T in Algorithm 1 and following Theorem 1, we have the result. □
Indeed, if we denote the iterates of (24) componentwise, then the recursion can be rewritten as follows for arbitrary initial points:
and the sequence pair converges strongly to the saddle point which is closest to the origin.
4.2. Critical Points Problem
In this part, we focus on finding the critical points of the functional F defined by
where H is a real Hilbert space and the functional is composed of a proper, convex and lower semi-continuous function and a convex, locally Lipschitz mapping.
A point $x^* \in H$ is said to be a critical point of the functional if it satisfies
$$F^{0}(x^*; v) \ge 0 \quad \text{for all } v \in H,$$
where $F^{0}(x; v)$ is the generalized directional derivative of F at x in the direction v, defined by
$$F^{0}(x; v) = \limsup_{y \to x,\ t \downarrow 0} \frac{F(y + tv) - F(y)}{t}.$$
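The limit superior in this definition can be probed numerically. The sketch below crudely approximates $F^{0}(x; v)$ for the hypothetical nonsmooth convex function $F(x) = |x|$ by sampling nearby base points y and small positive steps t.

```python
import numpy as np

# Crude sampling approximation of the generalized directional derivative
#   F0(x; v) = limsup_{y -> x, t -> 0+} (F(y + t v) - F(y)) / t
# for the hypothetical nonsmooth convex function F(x) = |x|.
F = abs

def clarke_dd(F, x, v, eps=1e-4, samples=2000):
    rng = np.random.default_rng(2)
    best = -np.inf
    for _ in range(samples):
        y = x + eps * rng.uniform(-1.0, 1.0)   # base point near x
        t = eps * rng.uniform(1e-3, 1.0)       # small positive step
        best = max(best, (F(y + t * v) - F(y)) / t)
    return best

print(clarke_dd(F, 0.0, +1.0))   # ~ 1 = F0(0; +1)
print(clarke_dd(F, 0.0, -1.0))   # ~ 1 as well, reflecting the kink at 0
```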
Critical point theory is a powerful theoretical tool, which has been greatly developed in recent years and widely applied in many fields, such as differential equations, operations research and optimization, and so on. For some recent works on applications of critical point theory, we refer to Trushnikov et al. [39], Turgut et al. [40] and the references therein.
A typical instance is finding the solution of an impulsive differential equation model, which arises in medicine, biology, rocket and aerospace motion, and optimization theory, and which can be transformed into finding the critical point of some functional.
Specifically, we consider the following impulsive differential equation:
where , , , , . In addition, there exist and such that holds for all .
Let the space be equipped with the norm induced by the inner product.
Denote , and define the functional F on H as
and then the periodic solutions of system (27) correspond one-to-one to the critical points of the functional F.
If the functional F in (26) satisfies the Palais-Smale compactness condition and F is bounded from below, then there exists a critical point $x^*$ such that $F(x^*) = \inf_{x \in H} F(x)$ (see, e.g., Motreanu and Panagiotopoulos [41]). From Fermat's theorem, one can infer that the critical point is a solution of the inclusion (see, e.g., Moameni [42]),
where is the generalized derivative of defined as
From Clarke [43], the generalized derivative carries bounded sets of H into bounded sets and is hemicontinuous. Moreover, we can infer that it is monotone because the underlying function is convex, which makes Browder ([17], Theorem 2) applicable; namely, it is a maximal monotone mapping. The set of critical points of problem (26) thus coincides with the solution set of the inclusion above. By taking A to be this maximal monotone mapping in Algorithm 1, we have the following result.
Theorem 3.
Let H be a real Hilbert space. Suppose that F is of the form (26), is bounded from below and satisfies the Palais-Smale compactness condition, and that its critical point set is nonempty. Under the setting of the parameters in Algorithm 1, if the parameter sequences satisfy the conditions as in Theorem 1, then the sequence generated by the following scheme
converges strongly to the critical point which is closest to 0.
5. Numerical Examples
In this section, we present numerical examples in finite- and infinite-dimensional spaces to illustrate the applicability, efficiency and stability of Algorithm 1. All codes are written in MATLAB R2016b and were run on an LG dual-core personal computer.
Example 1.
Here, we test the effectiveness of our algorithm in a finite-dimensional space, which does not require very high dimensions. For this purpose, define the monotone operator A as follows:
It is easy to verify the cocoercivity coefficient of A, and the parameters are set accordingly.
Next, let us compare our Algorithm 1 with the regularization method. Specifically, the regularization algorithm (RM) is considered as
For both our Algorithm 1 and the regularization method (RM), the initial points are generated randomly by MATLAB, and the inertial coefficient is chosen to satisfy: if , then ; otherwise, , where , , . The experimental results are reported in Figure 1. Moreover, the number of iterations and the convergence rate of Algorithm 1 for different parameter values are presented in Table 1; a stand-in sketch of this comparison follows the table.
Figure 1.
Algorithm 1 and the Regularization Method.
Table 1.
Example 1 Numerical Results for Algorithm 1 and Regularization Method.
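Since the operator of Example 1 is not reproduced above, the following stand-in sketch only mirrors the structure of the experiment with the assumed cocoercive choice $A(x) = Mx$ on $\mathbb{R}^3$: a generic inertial forward iteration (as sketched after Remark 2) is run alongside a regularized proximal (RM) iteration from the same random starting points; all data and parameters are hypothetical.

```python
import numpy as np

# Stand-in for Example 1 (the paper's operator is not reproduced here):
# A(x) = Mx on R^3 with M SPD; compare a generic inertial forward iteration
# with a regularized proximal (RM) iteration from the same random start.
rng = np.random.default_rng(3)
B = rng.standard_normal((3, 3))
M = B.T @ B + np.eye(3)                                       # hypothetical SPD matrix
A = lambda x: M @ x
beta = 1.0 / np.linalg.eigvalsh(M)[-1]
I3 = np.eye(3)

x_prev = rng.standard_normal(3)
x = rng.standard_normal(3)                                    # inertial iterate
z = x.copy()                                                  # RM iterate
for n in range(1, 300):
    alpha = 1.0 / (n + 1)
    w = x + 0.3 * (x - x_prev)                                # inertial extrapolation
    x_prev, x = x, (1 - alpha) * (w - beta * A(w))            # forward step, no resolvent
    z = np.linalg.solve(I3 + beta * M, (1 - alpha) * z)       # resolvent (linear solve)

print(np.linalg.norm(A(x)), np.linalg.norm(A(z)))             # both residuals tend to 0
```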
Example 2.
Now, we test our Algorithm 1 in with . Define the mapping A by for all , and then it can be shown that A is a cocoercive monotone mapping. All the parameters , θ, , and are chosen as in Example 1. The stopping criterion is . We test Algorithm 1 for the following three different initial points:
Case I: ;
Case II:
Case III:
In addition, we also test the regularization method as described in Example 1; the behavior of the iterate sequence is reported in Figure 2, Figure 3 and Table 2, and a truncated stand-in implementation is sketched after Table 2.
Figure 2.
Algorithm 1 for Case I, Case II, Case III in Example 2.
Figure 3.
Regularization Method for Case I, Case II, Case III in Example 2.
Table 2.
Example 2 Numerical Results for Algorithm 1 and Regularization Method.
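Since the mapping of Example 2 is not reproduced above, here is a stand-in sketch with truncated $\ell_2$ sequences and the assumed 2-cocoercive choice $A(x) = \frac{1}{2}x$; the decaying initial points only mimic the flavor of Cases I-III, and all parameters are hypothetical.

```python
import numpy as np

# Stand-in for Example 2 (the paper's mapping is not reproduced here):
# truncate l2 sequences to N coordinates and use the 2-cocoercive choice
# A(x) = 0.5*x; the initial points mimic decaying l2 sequences.
N = 1000
k = np.arange(1, N + 1)
A = lambda x: 0.5 * x

x_prev = 1.0 / k                      # e.g., (1, 1/2, 1/3, ...) truncated
x = 1.0 / k**2
tol, n = 1e-6, 1
while np.linalg.norm(x - x_prev) > tol and n < 10000:
    alpha = 1.0 / (n + 1)
    w = x + 0.3 * (x - x_prev)        # inertial extrapolation
    x_prev, x = x, (1 - alpha) * (w - 1.0 * A(w))   # step 1.0 < 2*mu = 4
    n += 1

print(n, np.linalg.norm(x))           # strong convergence to the zero sequence
```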
6. Conclusions
The proximal point (regularization) method and projection-based methods are two classical and significant approaches for solving monotone inclusions, variational inequalities and related problems.
However, the evaluations of resolvents/projections in these methods rely heavily on the structure of the given problem, and in the general case, this may seriously affect the computational effort of the method. Thus, motivated by the ideas of Chidume et al. [44], Alvarez [27], Alvarez-Attouch [28] and Zegeye [45], we presented a simple strongly convergent method that avoids the need to compute resolvents/projections.
We also presented several theoretical applications, such as minimax problems and critical point problems, as well as some numerical experiments illustrating the performance of our scheme.
Author Contributions
All authors contributed equally to this work. All authors read and approved the final manuscript.
Funding
This article was funded by the National Natural Science Foundation of China (12071316) and the Natural Science Foundation of Chongqing (cstc2021jcyj-msxmX0177).
Conflicts of Interest
The authors declare no conflict of interest.
References
- Minty, G.J. Monotone (nonlinear) operators in Hilbert spaces. Duke Math. J. 1962, 29, 341–346.
- Browder, F. The solvability of nonlinear functional equations. Duke Math. J. 1963, 30, 557–566.
- Leray, J.; Lions, J. Quelques résultats de Višik sur les problèmes elliptiques non linéaires par les méthodes de Minty-Browder. Bull. Soc. Math. Fr. 1965, 93, 97–107.
- Minty, G.J. On a monotonicity method for the solution of non-linear equations in Banach spaces. Proc. Natl. Acad. Sci. USA 1963, 50, 1038–1041.
- Pascali, D.; Sburlan, S. Nonlinear Mappings of Monotone Type; Editura Academiei: Bucharest, Romania, 1978; p. 101.
- Bot, R.I.; Csetnek, E.R. An inertial forward-backward-forward primal-dual splitting algorithm for solving monotone inclusion problems. Numer. Algorithms 2016, 71, 519–540.
- Korpelevich, G.M. The extragradient method for finding saddle points and other problems. Ekonomika i Matematicheskie Metody 1976, 12, 747–756.
- Khan, S.A.; Suantai, S.; Cholamjiak, W. Shrinking projection methods involving inertial forward–backward splitting methods for inclusion problems. Rev. Real Acad. Cienc. Exactas Fis. Nat. A Mat. 2019, 113, 645–656.
- Sicre, M.R. On the complexity of a hybrid proximal extragradient projective method for solving monotone inclusion problems. Comput. Optim. Appl. 2020, 76, 991–1019.
- Xu, H.K. A regularization method for the proximal point algorithm. J. Glob. Optim. 2006, 36, 115–125.
- Yin, J.H.; Jian, J.B.; Jiang, X.Z.; Liu, M.X.; Wang, L.Z. A hybrid three-term conjugate gradient projection method for constrained nonlinear monotone equations with applications. Numer. Algorithms 2021, 88, 389–418.
- Berinde, V. Iterative Approximation of Fixed Points; Lecture Notes in Mathematics; Springer: London, UK, 2007.
- Chidume, C.E. An approximation method for monotone Lipschitzian operators in Hilbert spaces. J. Austral. Math. Soc. Ser. A 1986, 41, 59–63.
- Kačurovskii, R.I. On monotone operators and convex functionals. Usp. Mat. Nauk 1960, 15, 213–215.
- Zarantonello, E.H. Solving Functional Equations by Contractive Averaging; Technical Report No. 160; U.S. Army Mathematics Research Center: Madison, WI, USA, 1960.
- Martinet, B. Régularisation d'inéquations variationnelles par approximations successives. Rev. Fr. Inform. Rech. Opér. 1970, 4, 154–158.
- Browder, F.E. Nonlinear maximal monotone operators in Banach space. Math. Ann. 1968, 175, 89–113.
- Bruck, R.E., Jr. A strongly convergent iterative method for the solution of 0∈Ux for a maximal monotone operator U in Hilbert space. J. Math. Anal. Appl. 1974, 48, 114–126.
- Boikanyo, O.A.; Morosanu, G. A proximal point algorithm converging strongly for general errors. Optim. Lett. 2010, 4, 635–641.
- Khatibzadeh, H. Some remarks on the proximal point algorithm. J. Optim. Theory Appl. 2012, 153, 769–778.
- Rockafellar, R.T. Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 1976, 14, 877–898.
- Shehu, Y. Single projection algorithm for variational inequalities in Banach spaces with applications to contact problems. Acta Math. Sci. 2020, 40B, 1045–1063.
- Yao, Y.H.; Shahzad, N. Strong convergence of a proximal point algorithm with general errors. Optim. Lett. 2012, 6, 621–628.
- Teboulle, M. A simplified view of first order methods for optimization. Math. Program. Ser. B 2018, 170, 67–96.
- Drusvyatskiy, D.; Lewis, A.S. Error bounds, quadratic growth, and linear convergence of proximal methods. Math. Oper. Res. 2018, 43, 919–948.
- Nesterov, Y. Introductory Lectures on Convex Optimization; Kluwer Academic Publishers: Boston, MA, USA, 2004.
- Alvarez, F. Weak convergence of a relaxed and inertial hybrid projection-proximal point algorithm for maximal monotone operators in Hilbert spaces. SIAM J. Optim. 2004, 14, 773–782.
- Alvarez, F.; Attouch, H. An inertial proximal method for maximal monotone operators via discretization of a nonlinear oscillator with damping. Set-Valued Anal. 2001, 9, 3–11.
- Xu, H.K. Iterative algorithms for nonlinear operators. J. Lond. Math. Soc. 2002, 66, 240–256.
- Maingé, P.E. Approximation methods for common fixed points of nonexpansive mappings in Hilbert spaces. J. Math. Anal. Appl. 2007, 325, 469–479.
- Opial, Z. Weak convergence of the sequence of successive approximations for nonexpansive mappings. Bull. Amer. Math. Soc. 1967, 73, 591–597.
- Ataş, İ. Comparison of deep convolution and least squares GANs for diabetic retinopathy image synthesis. Neural Comput. Appl. 2023, 35, 14431–14448.
- Ji, M.M.; Zhao, P. Image restoration based on the minimax-concave and the overlapping group sparsity. Signal Image Video Process. 2023, 17, 1733–1741.
- Hassanpour, H.; Hosseinzadeh, E.; Moodi, M. Solving intuitionistic fuzzy multi-objective linear programming problem and its application in supply chain management. Appl. Math. 2023, 68, 269–287.
- Qi, L.Q.; Sun, W.Y. In Minimax and Applications; Nonconvex Optimization and Its Applications (NOIA), Volume 4; Kluwer Academic Publishers: London, UK, 1995; pp. 55–67.
- Von Neumann, J. Zur Theorie der Gesellschaftsspiele. Math. Ann. 1928, 100, 295–320.
- Von Neumann, J. Über ein ökonomisches Gleichungssystem und eine Verallgemeinerung des Brouwerschen Fixpunktsatzes. Ergebn. Math. Kolloqu. Wien 1935, 8, 73–83.
- Fan, K. A minimax inequality and applications. In Inequalities, III; Shisha, O., Ed.; Academic Press: San Diego, CA, USA, 1972; pp. 103–113.
- Trushnikov, D.N.; Krotova, E.L.; Starikov, S.S.; Musikhin, N.A.; Varushkin, S.V.; Matveev, E.V. Solving the inverse problem of surface reconstruction during electron beam surfacing. Russ. J. Nondestruct. Test. 2023, 59, 240–250.
- Turgut, O.E.; Turgut, M.S.; Kirtepe, E. A systematic review of the emerging metaheuristic algorithms on solving complex optimization problems. Neural Comput. Appl. 2023, 35, 14275–14378.
- Motreanu, D.; Panagiotopoulos, P.D. Minimax Theorems and Qualitative Properties of the Solutions of Hemivariational Inequalities; Nonconvex Optimization and Its Applications; Kluwer Academic: New York, NY, USA, 1999.
- Moameni, A. Critical point theory on convex subsets with applications in differential equations and analysis. J. Math. Pures Appl. 2020, 141, 266–315.
- Clarke, F. Functional Analysis, Calculus of Variations and Optimal Control; Springer: London, UK, 2013; pp. 193–209.
- Chidume, C.E.; Osilike, M.O. Iterative solutions of nonlinear accretive operator equations in arbitrary Banach spaces. Nonlinear Anal. Theory Methods Appl. 1999, 36, 863–872.
- Zegeye, H. Strong convergence theorems for maximal monotone mappings in Banach spaces. J. Math. Anal. Appl. 2008, 343, 663–671.