A Mean Extragradient Method for Solving Variational Inequalities

We propose a modified extragradient method for solving the variational inequality problem in a Hilbert space. The method combines the well-known subgradient extragradient method with Mann's mean value method, in which the updated iterate is picked in the convex hull of all previous iterates. We show weak convergence of the mean value iterate to a solution of the variational inequality problem, provided that a condition on the corresponding averaging matrix is fulfilled. Some numerical experiments are given to show the effectiveness of the obtained theoretical result.


Introduction
Let $H$ be a real Hilbert space with inner product $\langle \cdot, \cdot \rangle$ and associated norm $\|\cdot\|$. Let $C$ be a nonempty closed convex subset of $H$, and let $F : H \to H$ be a monotone operator, that is, $\langle x - y, F(x) - F(y) \rangle \geq 0$ for all $x, y \in H$, which is also $L$-Lipschitz continuous, that is, $\|F(x) - F(y)\| \leq L \|x - y\|$ for all $x, y \in H$. The (Stampacchia) variational inequality problem is to find a point $x^* \in C$ such that
$$\langle F(x^*), x - x^* \rangle \geq 0 \quad \text{for all } x \in C. \tag{1}$$
We denote the solution set of the considered variational inequality by VIP(F, C) and assume that it is nonempty. Since VIP(F, C) has been utilized for modeling many mathematical and practical situations (see [1] for insightful discussions), many iterative methods have been proposed for solving it. The classical method is due to Goldstein [2] and reads as follows: for a given $x_1 \in H$, calculate
$$x_{k+1} := P_C(x_k - \tau F(x_k)), \quad k \in \mathbb{N}, \tag{2}$$
where $\tau > 0$ is a step-size parameter and $P_C$ is the metric projection onto $C$. Assuming that $F$ is $\eta$-strongly monotone and $L$-Lipschitz continuous and that $\tau \in (0, 2\eta/L^2)$, it has been proved that the sequence generated by (2) converges strongly to the unique solution of VIP(F, C).
Since the convergence of the iterative scheme (2) relies on the strong monotonicity of $F$, which is quite restrictive, Korpelevich [3] proposed in 1976 the so-called extragradient method (in short, EM), which is defined as follows: for a given $x_1 \in H$, calculate
$$y_k := P_C(x_k - \tau F(x_k)), \qquad x_{k+1} := P_C(x_k - \tau F(y_k)), \quad k \in \mathbb{N}. \tag{3}$$
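To illustrate the difference between the projection scheme (2) and the extragradient scheme (3), the following minimal sketch runs both on a toy problem. The problem is an assumption of this example and is not taken from the experiments of this paper: $F(x) = Mx$ with a skew-symmetric matrix $M$ (monotone and 1-Lipschitz, but not strongly monotone) and $C$ the closed unit ball, for which the origin is the only solution. The names `project_ball`, `projection_method`, and `extragradient` are hypothetical.

```python
import numpy as np

# Skew-symmetric matrix: <Mx, x> = 0, so F is monotone and 1-Lipschitz,
# but not strongly monotone.
M = np.array([[0.0, 1.0], [-1.0, 0.0]])


def F(x):
    return M @ x


def project_ball(x):
    """Metric projection onto the closed unit ball."""
    norm = np.linalg.norm(x)
    return x if norm <= 1.0 else x / norm


def projection_method(x, tau, iters):
    """Scheme (2): a single projected step per iteration."""
    for _ in range(iters):
        x = project_ball(x - tau * F(x))
    return x


def extragradient(x, tau, iters):
    """Scheme (3): a predictor step y_k followed by a corrector step."""
    for _ in range(iters):
        y = project_ball(x - tau * F(x))
        x = project_ball(x - tau * F(y))
    return x


x1 = np.array([0.5, 0.5])
x_proj = projection_method(x1, tau=0.5, iters=500)
x_em = extragradient(x1, tau=0.5, iters=500)
print("projection method, distance to solution:", np.linalg.norm(x_proj))
print("extragradient,     distance to solution:", np.linalg.norm(x_em))
```

On this example the iterates of scheme (2) are pushed out to the boundary of the ball and stay there, while the iterates of scheme (3) contract toward the origin. This is consistent with the fact that (2) needs strong monotonicity whereas (3) does not.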
In the setting of a finite-dimensional space, it has been proved that the sequence generated by EM (3) converges to a solution of VIP(F, C) provided that $F$ is monotone and Lipschitz continuous. From this starting point, several variants of Korpelevich's EM have been investigated, for instance [4][5][6][7][8][9] and the references cited therein. Here we single out the work of Censor, Gibali and Reich [10]. As one can see from the above scheme, EM requires two metric projections onto $C$ in each iteration. For this reason, EM is suitable when the constrained set $C$ is simple enough that the metric projection $P_C$ has a closed-form expression; otherwise, one needs to solve a hidden minimization sub-problem. To avoid this situation, Censor, Gibali and Reich proposed the so-called subgradient-extragradient method (SEM), which requires only one metric projection onto $C$ for updating $y_k$, while the second one is replaced by the metric projection onto a half-space containing $C$ for updating the next iterate $x_{k+1}$. The method essentially has the form:
$$y_k := P_C(x_k - \tau F(x_k)), \qquad x_{k+1} := P_{T_k}(x_k - \tau F(y_k)), \quad k \in \mathbb{N},$$
where
$$T_k := \{w \in H : \langle (x_k - \tau F(x_k)) - y_k, w - y_k \rangle \leq 0\}.$$
It is worth noting that the closed-form expression of $P_{T_k}$ is explicitly given in the literature (see Formula (7) for further details). The weak convergence result is also given in [10]. Several variants of SEM have been investigated, see for instance [11][12][13][14][15][16]. Note that, even though SEM avoids the metric projection onto $C$ when computing $x_{k+1}$, the metric projection is still needed when evaluating $y_k$. Hence an inner-loop iteration remains necessary when the constrained set $C$ is not simple enough, for example, when it is the intersection of a finite number of nonempty closed convex simple sets.

On the other hand, let us move to another aspect of nonlinear problems. Let $T : H \to H$ be a nonlinear operator. The celebrated fixed-point problem is to find $x^* \in \operatorname{Fix} T := \{x \in H : x = Tx\} \neq \emptyset$.
In order to solve this problem, we recall the classical Picard iteration, which updates the next iterate $x_{k+1}$ by using only the information of the current iterate $x_k$, that is,
$$x_{k+1} := T x_k.$$
This kind of method is known as a memoryless scheme, and it is well known that the sequence generated by Picard's iterative method may fail to converge to a point in $\operatorname{Fix} T$. In 1953, Mann [17] proposed a modified version of Picard's iteration as
$$x_{k+1} := T \bar{x}_k,$$
where $\bar{x}_k$ denotes a convex combination of the iterates $\{x_j\}_{j=1}^{k}$, or in other words, $\bar{x}_k$ is a point in the convex hull of all previous iterates. This method is known as Mann's mean value iteration. Its advantage lies in avoiding some numerically undesirable situations, for instance, zig-zag or spiral behavior of the generated sequence around the solution set; see [18] for a more detailed discussion. Some works based on the idea of Mann's mean value iteration have been investigated, for instance [18,19].
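The contrast between the two schemes can be seen on a small example. The sketch below is an illustration under assumptions not made in the paper: $T$ is a rotation of the plane by 90 degrees (nonexpansive, with the origin as its only fixed point), and the mean iterate is formed by the recursion $\bar{x}_{k+1} = (1 - \alpha)\bar{x}_k + \alpha x_{k+1}$, which is one particular choice of convex combination. The names `picard` and `mann_mean` are hypothetical.

```python
import numpy as np

# Rotation by 90 degrees: nonexpansive, and Fix T = {0}.
R = np.array([[0.0, -1.0], [1.0, 0.0]])


def T(x):
    return R @ x


def picard(x, iters):
    """Memoryless scheme x_{k+1} = T x_k."""
    for _ in range(iters):
        x = T(x)
    return x


def mann_mean(x, alpha, iters):
    """Mean value scheme x_{k+1} = T(x_bar_k), with the assumed averaging rule
    x_bar_{k+1} = (1 - alpha) * x_bar_k + alpha * x_{k+1}."""
    x_bar = x.copy()  # x_bar_1 = x_1
    for _ in range(iters):
        x = T(x_bar)
        x_bar = (1.0 - alpha) * x_bar + alpha * x
    return x_bar


x1 = np.array([1.0, 0.0])
x_picard = picard(x1, 100)
x_mann = mann_mean(x1, 0.5, 100)
print("Picard, distance to fixed point:", np.linalg.norm(x_picard))
print("Mann,   distance to fixed point:", np.linalg.norm(x_mann))
```

Picard's iterates keep circling at distance one from the fixed point, whereas the averaged iterates spiral into it.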
In this work, we present an iterative method that combines the ideas of the celebrated SEM with Mann's mean value iteration for solving VIP(F, C) governed by a monotone and Lipschitz continuous operator. We show that the sequence generated by the proposed method converges weakly to a solution of VIP(F, C). To demonstrate the numerical behavior of the proposed method, we consider a constrained minimization problem in which the constrained set is given by the intersection of a finite family of nonempty closed convex simple sets. We present numerical experiments which show that, under some suitable parameters, the proposed method outperforms the existing one.

Preliminaries
For convenience, we present here some notations which are used throughout the paper. For more details, the reader may consult the reference books [20,21].
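We denote the strong convergence and weak convergence of a sequence $\{x_k\}_{k=1}^{\infty}$ to $x \in H$ by $x_k \to x$ and $x_k \rightharpoonup x$, respectively. We denote the identity operator on $H$ by Id.

Let $C$ be a nonempty, closed, and convex subset of $H$. For each point $x \in H$, there exists a unique nearest point in $C$, denoted by $P_C(x)$, that is,
$$\|x - P_C(x)\| = \min_{y \in C} \|x - y\|. \tag{5}$$
The mapping $P_C : H \to C$ is called the metric projection of $H$ onto $C$. Note that $P_C$ is a nonexpansive mapping of $H$ onto $C$, i.e.,
$$\|P_C(x) - P_C(y)\| \leq \|x - y\| \quad \text{for all } x, y \in H.$$
Moreover, the metric projection $P_C$ satisfies the variational property:
$$\langle x - P_C(x), y - P_C(x) \rangle \leq 0 \quad \text{for all } x \in H,\ y \in C. \tag{6}$$
Let $a \in H \setminus \{0\}$ and $\beta \in \mathbb{R}$. We define the hyperplane in $H$ by
$$H(a; \beta) := \{x \in H : \langle a, x \rangle = \beta\},$$
and the half-space in $H$ by
$$H_{\leq}(a; \beta) := \{x \in H : \langle a, x \rangle \leq \beta\}.$$
It is clear that both the hyperplane and the half-space are closed and convex sets. Moreover, it is important to note that the metric projection onto the half-space $H_{\leq}(a; \beta)$ can be computed explicitly by the formula
$$P_{H_{\leq}(a; \beta)}(x) = x - \frac{\max\{0, \langle a, x \rangle - \beta\}}{\|a\|^2}\, a. \tag{7}$$
For a point $x \in H$ and a nonempty closed convex set $C \subset H$, we say that a point $z \in H$ separates $C$ from $x$ if
$$\langle x - z, y - z \rangle \leq 0 \quad \text{for all } y \in C.$$
We say that an operator $T : H \to H$ is a separator of $C$ if the point $Tx$ separates $C$ from $x$ for all $x \in H$. It is clear from the relation (6) that the projection $P_C$ is a separator of $C$. It is worth noting that for any $x \notin C$, the hyperplane $H(x - P_C(x); \langle P_C(x), x - P_C(x) \rangle)$ cuts the space $H$ into two half-spaces. One half-space contains the element $x$ while the other one contains the subset $C$. We also know that $C \subset H_{\leq}(x - P_C(x); \langle P_C(x), x - P_C(x) \rangle)$, and the hyperplane $H(x - P_C(x); \langle P_C(x), x - P_C(x) \rangle)$ is a supporting hyperplane to $C$ at the point $P_C(x)$.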
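The closed-form projection onto a half-space in Formula (7) translates directly into code. The following is a minimal sketch in finite dimensions; the function name `project_halfspace` and the sample data are hypothetical.

```python
import numpy as np


def project_halfspace(x, a, beta):
    """Metric projection of x onto the half-space {w : <a, w> <= beta}, a != 0."""
    x = np.asarray(x, dtype=float)
    a = np.asarray(a, dtype=float)
    norm_sq = float(a @ a)
    if norm_sq == 0.0:
        raise ValueError("the normal vector a must be nonzero")
    # If x already satisfies <a, x> <= beta, the max(...) term vanishes
    # and x is returned unchanged.
    return x - max(0.0, float(a @ x) - beta) / norm_sq * a


p_outside = project_halfspace([3.0, 4.0], [1.0, 0.0], 1.0)  # point violating the constraint
p_inside = project_halfspace([0.0, 4.0], [1.0, 0.0], 1.0)   # point already in the half-space
print(p_outside, p_inside)
```

A point outside the half-space is moved along the normal vector $a$ onto the bounding hyperplane, and a point inside is left where it is.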
Let $A : H \to 2^H$ be a set-valued operator. We denote its graph by
$$\operatorname{Gr}(A) := \{(x, u) \in H \times H : u \in A(x)\},$$
and the set of all zeros of $A$ by
$$A^{-1}(0) := \{x \in H : 0 \in A(x)\}.$$
The operator $A$ is said to be monotone if
$$\langle x - y, u - v \rangle \geq 0$$
for all $(x, u), (y, v) \in \operatorname{Gr}(A)$, and it is called maximally monotone if its graph is not properly contained in the graph of any other monotone operator. Note that if $A$ is maximally monotone, then $A^{-1}(0)$ is a convex and closed set.
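Let $C \subset H$ be a nonempty closed convex set. We denote by $N_C(x)$ the normal cone to $C$ at $x \in C$, i.e.,
$$N_C(x) := \{u \in H : \langle u, y - x \rangle \leq 0 \ \text{for all } y \in C\}.$$
Let $F : H \to H$ be a monotone continuous operator. Define the operator $A : H \to 2^H$ by
$$A(x) := \begin{cases} F(x) + N_C(x), & \text{if } x \in C, \\ \emptyset, & \text{otherwise}. \end{cases}$$
Then $A$ is a maximally monotone operator, and the following important property holds:
$$A^{-1}(0) = \mathrm{VIP}(F, C).$$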
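As a brief check of why the zeros of $A$ coincide with the solutions of the variational inequality, the following worked derivation unfolds the definitions. It assumes the usual normal cone $N_C(x) = \{u \in H : \langle u, y - x \rangle \leq 0 \text{ for all } y \in C\}$ and $A(x) = F(x) + N_C(x)$ for $x \in C$:

```latex
\begin{align*}
0 \in A(x)
  &\iff x \in C \ \text{and} \ -F(x) \in N_C(x) \\
  &\iff x \in C \ \text{and} \ \langle -F(x), y - x \rangle \leq 0 \quad \text{for all } y \in C \\
  &\iff x \in C \ \text{and} \ \langle F(x), y - x \rangle \geq 0 \quad \text{for all } y \in C \\
  &\iff x \in \mathrm{VIP}(F, C).
\end{align*}
```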

Mann's Type Mean Extragradient Algorithm
In this section, we present a mean extragradient algorithm for solving the considered variational inequality problem.
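We start by recalling the so-called averaging matrix. An infinite lower triangular row matrix $[\alpha_{k,j}]_{k,j=1}^{\infty}$ is said to be averaging if the following conditions are satisfied:
(i) $\alpha_{k,j} \geq 0$ for all $k, j \geq 1$;
(ii) $\alpha_{k,j} = 0$ whenever $j > k$;
(iii) $\sum_{j=1}^{k} \alpha_{k,j} = 1$ for all $k \geq 1$;
(iv) $\lim_{k \to \infty} \alpha_{k,j} = 0$ for all $j \geq 1$.
For a sequence $\{x_k\}_{k=1}^{\infty} \subset H$ and an averaging matrix $[\alpha_{k,j}]_{k,j=1}^{\infty}$, we denote the mean iterate by
$$\bar{x}_k := \sum_{j=1}^{k} \alpha_{k,j} x_j.$$
Now, we are in a position to state the Mann mean extragradient method (Mann-MEM) as Algorithm 1.

Algorithm 1 (Mann-MEM).
Step 1: Given a current iterate $x_k \in H$, compute the mean iterate $\bar{x}_k := \sum_{j=1}^{k} \alpha_{k,j} x_j$ and
$$y_k := P_C(\bar{x}_k - \tau F(\bar{x}_k)).$$
Step 2: If $y_k = \bar{x}_k$, then $\bar{x}_k \in \mathrm{VIP}(F, C)$ and STOP. If not, construct the half-space $T_k$ defined by
$$T_k := \{w \in H : \langle (\bar{x}_k - \tau F(\bar{x}_k)) - y_k, w - y_k \rangle \leq 0\},$$
and calculate the next iterate
$$x_{k+1} := P_{T_k}(\bar{x}_k - \tau F(y_k)).$$
Update $k := k + 1$ and go to Step 1.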
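The steps of Mann-MEM can be sketched in a few lines. The code below is a minimal illustration under stated assumptions rather than the implementation used in the experiments: the averaging matrix is taken to be $\alpha_{1,1} = 1$, $\alpha_{k+1,k+1} = \alpha$ and $\alpha_{k+1,j} = (1 - \alpha)\alpha_{k,j}$ for $j \leq k$, so that the mean iterate obeys $\bar{x}_{k+1} = (1 - \alpha)\bar{x}_k + \alpha x_{k+1}$; the exact test $y_k = \bar{x}_k$ is relaxed to a tolerance; and the test problem ($F(x) = x - c$ over the box $[0, 1]^3$, whose solution is the projection of $c$ onto the box) is made up for the example. The names `mann_mem` and `project_halfspace` are hypothetical.

```python
import numpy as np


def project_halfspace(x, a, beta):
    """Projection onto {w : <a, w> <= beta}; a = 0 is read as the whole space."""
    norm_sq = float(a @ a)
    if norm_sq == 0.0:
        return x.copy()
    return x - max(0.0, float(a @ x) - beta) / norm_sq * a


def mann_mem(F, project_C, x1, tau, alpha, max_iter=500, tol=1e-10):
    """Sketch of Mann-MEM with the assumed averaging rule
    x_bar_{k+1} = (1 - alpha) * x_bar_k + alpha * x_{k+1}."""
    x_bar = np.asarray(x1, dtype=float).copy()  # x_bar_1 = x_1
    for _ in range(max_iter):
        v = x_bar - tau * F(x_bar)
        y = project_C(v)                        # Step 1: y_k
        if np.linalg.norm(y - x_bar) <= tol:    # Step 2: relaxed stopping test
            break
        a = v - y                               # normal vector of the half-space T_k
        x_next = project_halfspace(x_bar - tau * F(y), a, float(a @ y))
        x_bar = (1.0 - alpha) * x_bar + alpha * x_next
    return x_bar


# Toy problem (an assumption of this example): F(x) = x - c over the box [0, 1]^3.
c = np.array([2.0, -3.0, 0.5])
x_sol = mann_mem(
    F=lambda x: x - c,
    project_C=lambda v: np.clip(v, 0.0, 1.0),
    x1=np.zeros(3),
    tau=0.5,    # F is 1-Lipschitz, so tau lies in (0, 1/L)
    alpha=0.9,
)
print(x_sol)
```

Keeping only the running mean $\bar{x}_k$ avoids storing all previous iterates, which is possible here because of the recursive form of the assumed averaging matrix; a general averaging matrix would require the full history.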

Remark 1.
In the case that the averaging matrix $[\alpha_{k,j}]_{k,j=1}^{\infty}$ is the identity matrix, Mann-MEM reduces to the classical subgradient extragradient method proposed by Censor et al. ([10], Algorithm 4.1).
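The following proposition justifies the stopping criterion of Mann-MEM in Step 2.

Proposition 1. If $y_{k_0} = \bar{x}_{k_0}$ for some $k_0 \in \mathbb{N}$, then $\bar{x}_{k_0} \in \mathrm{VIP}(F, C)$.

Proof. Let $k_0 \in \mathbb{N}$ be such that $y_{k_0} = \bar{x}_{k_0}$. Then $\bar{x}_{k_0} = y_{k_0} \in C$ and, by the definition of $y_k$ and the variational property (6), we have
$$\langle (\bar{x}_{k_0} - \tau F(\bar{x}_{k_0})) - \bar{x}_{k_0}, w - \bar{x}_{k_0} \rangle \leq 0 \quad \text{for all } w \in C,$$
that is, $\tau \langle F(\bar{x}_{k_0}), w - \bar{x}_{k_0} \rangle \geq 0$ for all $w \in C$. Since $\tau > 0$, this yields
$$\langle F(\bar{x}_{k_0}), w - \bar{x}_{k_0} \rangle \geq 0 \quad \text{for all } w \in C.$$
Hence, we conclude that $\bar{x}_{k_0} \in \mathrm{VIP}(F, C)$ as required.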
According to Proposition 1, for the rest of our convergence analysis, we may assume throughout this section that Mann-MEM does not terminate after a finite number of iterations, that is, we assume that $y_k \neq \bar{x}_k$ for all $k \geq 1$.
The following technical lemma is a key tool for proving the convergence of a sequence generated by Mann-MEM.
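Lemma 1. Let $\{x_k\}_{k=1}^{\infty}$ be a sequence generated by Mann-MEM. For every $k \geq 1$ and $u \in \mathrm{VIP}(F, C)$, it holds that
$$\|x_{k+1} - u\|^2 \leq \sum_{j=1}^{k} \alpha_{k,j} \|x_j - u\|^2 - (1 - \tau^2 L^2) \|\bar{x}_k - y_k\|^2.$$

Proof. Let $k \geq 1$ and $u \in \mathrm{VIP}(F, C)$ be fixed. Since $F$ is monotone, we note that
$$\langle F(y_k), y_k - u \rangle \geq \langle F(u), y_k - u \rangle \geq 0,$$
where the second inequality holds true by the fact that $y_k \in C$ and $u \in \mathrm{VIP}(F, C)$. Thus, we also have
$$\langle F(y_k), x_{k+1} - u \rangle \geq \langle F(y_k), x_{k+1} - y_k \rangle. \tag{8}$$
Now, invoking the definition of $T_k$ and the fact that $x_{k+1} \in T_k$, we note that
$$\langle x_{k+1} - y_k, (\bar{x}_k - \tau F(\bar{x}_k)) - y_k \rangle \leq 0,$$
and it follows that
$$\langle x_{k+1} - y_k, (\bar{x}_k - \tau F(y_k)) - y_k \rangle \leq \tau \langle x_{k+1} - y_k, F(\bar{x}_k) - F(y_k) \rangle. \tag{9}$$
Denoting $z_k := \bar{x}_k - \tau F(y_k)$, we note that
$$\|x_{k+1} - u\|^2 = \|P_{T_k}(z_k) - u\|^2 = \|z_k - u\|^2 + \|z_k - P_{T_k}(z_k)\|^2 + 2 \langle P_{T_k}(z_k) - z_k, z_k - u \rangle. \tag{10}$$
Since $u \in C \subset T_k$, it follows from the variational property of $P_{T_k}$ that
$$2 \|z_k - P_{T_k}(z_k)\|^2 + 2 \langle P_{T_k}(z_k) - z_k, z_k - u \rangle = 2 \langle z_k - P_{T_k}(z_k), u - P_{T_k}(z_k) \rangle \leq 0,$$
which yields that
$$\|z_k - P_{T_k}(z_k)\|^2 + 2 \langle P_{T_k}(z_k) - z_k, z_k - u \rangle \leq - \|z_k - P_{T_k}(z_k)\|^2. \tag{11}$$
By substituting (11) in (10), we obtain
$$\|x_{k+1} - u\|^2 \leq \|z_k - u\|^2 - \|z_k - x_{k+1}\|^2 = \|\bar{x}_k - u\|^2 - \|\bar{x}_k - x_{k+1}\|^2 + 2 \tau \langle u - x_{k+1}, F(y_k) \rangle.$$
Thus, from the above inequality and by using (8) and (9), we have
$$\begin{aligned} \|x_{k+1} - u\|^2 &\leq \|\bar{x}_k - u\|^2 - \|\bar{x}_k - x_{k+1}\|^2 + 2 \tau \langle y_k - x_{k+1}, F(y_k) \rangle \\ &= \|\bar{x}_k - u\|^2 - \|\bar{x}_k - y_k\|^2 - \|y_k - x_{k+1}\|^2 + 2 \langle x_{k+1} - y_k, (\bar{x}_k - \tau F(y_k)) - y_k \rangle \\ &\leq \|\bar{x}_k - u\|^2 - \|\bar{x}_k - y_k\|^2 - \|y_k - x_{k+1}\|^2 + 2 \tau \langle x_{k+1} - y_k, F(\bar{x}_k) - F(y_k) \rangle. \end{aligned}$$
By using the $L$-Lipschitz continuity of $F$ and the fact that $2ab \leq a^2 + b^2$ for all $a, b \in \mathbb{R}$, we have
$$2 \tau \langle x_{k+1} - y_k, F(\bar{x}_k) - F(y_k) \rangle \leq 2 \tau L \|x_{k+1} - y_k\| \|\bar{x}_k - y_k\| \leq \|x_{k+1} - y_k\|^2 + \tau^2 L^2 \|\bar{x}_k - y_k\|^2,$$
and hence
$$\|x_{k+1} - u\|^2 \leq \|\bar{x}_k - u\|^2 - (1 - \tau^2 L^2) \|\bar{x}_k - y_k\|^2.$$
Finally, by using the assumption that $[\alpha_{k,j}]_{k,j=1}^{\infty}$ is an averaging matrix and the convexity of $\|\cdot\|^2$, we have
$$\|\bar{x}_k - u\|^2 = \Big\| \sum_{j=1}^{k} \alpha_{k,j} (x_j - u) \Big\|^2 \leq \sum_{j=1}^{k} \alpha_{k,j} \|x_j - u\|^2,$$
and the proof is completed.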
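Next, we recall the following concept, which plays a crucial role in the convergence analysis of our work.

Definition 1. An averaging matrix $[\alpha_{k,j}]_{k,j=1}^{\infty}$ is said to be M-concentrating if, for all nonnegative real sequences $\{\varphi_k\}_{k=1}^{\infty}$ and $\{\varepsilon_k\}_{k=1}^{\infty}$ such that $\sum_{k=1}^{\infty} \varepsilon_k < \infty$ and
$$\varphi_{k+1} \leq \sum_{j=1}^{k} \alpha_{k,j} \varphi_j + \varepsilon_k \quad \text{for all } k \geq 1, \tag{13}$$
the limit $\lim_{k \to \infty} \varphi_k$ exists.

The following result is very useful in our convergence proof, and it is due to [22] (Section 3.5, Theorem 4).

Lemma 2. Let $[\alpha_{k,j}]_{k,j=1}^{\infty}$ be an averaging matrix and let $\{\varphi_k\}_{k=1}^{\infty}$ be a real sequence such that $\lim_{k \to \infty} \varphi_k = \varphi$. Then $\lim_{k \to \infty} \sum_{j=1}^{k} \alpha_{k,j} \varphi_j = \varphi$.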
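Note that, in view of Lemma 1, if we impose an additional condition on $\tau$ so that the term $-(1 - \tau^2 L^2) \|\bar{x}_k - y_k\|^2$ on the right-hand side is nonpositive, together with the assumption that the averaging matrix $[\alpha_{k,j}]_{k,j=1}^{\infty}$ is M-concentrating, then the sequence $\{\|x_k - u\|^2\}_{k=1}^{\infty}$ converges. Now, we are in a position to formally state the convergence analysis of Mann-MEM.

Theorem 1. Assume that the averaging matrix $[\alpha_{k,j}]_{k,j=1}^{\infty}$ is M-concentrating and that $\tau \in (0, 1/L)$. Then the sequence of mean iterates $\{\bar{x}_k\}_{k=1}^{\infty}$ generated by Mann-MEM converges weakly to a solution of VIP(F, C).

Proof. Let $k \geq 1$ and $u \in \mathrm{VIP}(F, C)$ be fixed. We note from Lemma 1 that
$$\|x_{k+1} - u\|^2 \leq \sum_{j=1}^{k} \alpha_{k,j} \|x_j - u\|^2 - (1 - \tau^2 L^2) \|\bar{x}_k - y_k\|^2. \tag{14}$$
Since $\tau \in (0, 1/L)$, we have $1 - \tau^2 L^2 > 0$, and then the inequality (14) can be written as
$$\|x_{k+1} - u\|^2 \leq \sum_{j=1}^{k} \alpha_{k,j} \|x_j - u\|^2. \tag{15}$$
Taking $\varphi_k := \|x_k - u\|^2$ and $\varepsilon_k := 0$ for every $k \geq 1$ in (13), and using the assumption that the averaging matrix $[\alpha_{k,j}]_{k,j=1}^{\infty}$ is M-concentrating, we obtain that $\lim_{k \to \infty} \|x_k - u\|^2$ exists, say $r(u) \in \mathbb{R}$. Invoking Lemma 2, we have that $\lim_{k \to \infty} \sum_{j=1}^{k} \alpha_{k,j} \|x_j - u\|^2$ exists with the same limit $r(u)$, and subsequently, it follows from these together with (14) that
$$\lim_{k \to \infty} \|\bar{x}_k - y_k\| = 0. \tag{16}$$
Moreover, the estimate $\|x_{k+1} - u\|^2 \leq \|\bar{x}_k - u\|^2 \leq \sum_{j=1}^{k} \alpha_{k,j} \|x_j - u\|^2$ obtained in the proof of Lemma 1 shows that we also have $\lim_{k \to \infty} \|\bar{x}_k - u\|^2 = r(u)$.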
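Since the sequence $\{\bar{x}_k\}_{k=1}^{\infty}$ is bounded, there are a weak cluster point $\hat{x} \in H$ and a subsequence $\{\bar{x}_{k_i}\}_{i=1}^{\infty}$ such that $\bar{x}_{k_i} \rightharpoonup \hat{x}$. Thus, it follows from (16) that $y_{k_i} \rightharpoonup \hat{x}$. Now, let us define the operator $A : H \to 2^H$ by
$$A(v) := \begin{cases} F(v) + N_C(v), & \text{if } v \in C, \\ \emptyset, & \text{otherwise}. \end{cases}$$
Then, we know that $A$ is a maximally monotone operator and $A^{-1}(0) = \mathrm{VIP}(F, C)$. Let $(v, w) \in \operatorname{Gr}(A)$, that is, $v \in C$ and $w - F(v) \in N_C(v)$. By the definition of the normal cone, we have
$$\langle w - F(v), v - y \rangle \geq 0 \quad \text{for all } y \in C. \tag{17}$$
On the other hand, by the variational property of $y_k$, we have
$$\Big\langle \frac{y_k - \bar{x}_k}{\tau} + F(\bar{x}_k), v - y_k \Big\rangle \geq 0 \tag{18}$$
for all $k \geq 1$. Hence, by using (17) and (18), replacing $y$ by $y_{k_i}$ and $y_k$ by $y_{k_i}$, respectively, and using the monotonicity of $F$, we have
$$\begin{aligned} \langle w, v - y_{k_i} \rangle &\geq \langle F(v), v - y_{k_i} \rangle \\ &\geq \langle F(v), v - y_{k_i} \rangle - \Big\langle \frac{y_{k_i} - \bar{x}_{k_i}}{\tau} + F(\bar{x}_{k_i}), v - y_{k_i} \Big\rangle \\ &= \langle F(v) - F(y_{k_i}), v - y_{k_i} \rangle + \langle F(y_{k_i}) - F(\bar{x}_{k_i}), v - y_{k_i} \rangle - \Big\langle \frac{y_{k_i} - \bar{x}_{k_i}}{\tau}, v - y_{k_i} \Big\rangle \\ &\geq \langle F(y_{k_i}) - F(\bar{x}_{k_i}), v - y_{k_i} \rangle - \Big\langle \frac{y_{k_i} - \bar{x}_{k_i}}{\tau}, v - y_{k_i} \Big\rangle. \end{aligned}$$
Taking the limit as $i \to \infty$ and using (16) together with the Lipschitz continuity of $F$, we obtain
$$\langle w, v - \hat{x} \rangle \geq 0.$$
Now, since $A$ is a maximally monotone operator, we obtain that $\hat{x} \in A^{-1}(0) = \mathrm{VIP}(F, C)$.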
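Next, we show that the whole sequence converges weakly to $\hat{x}$. Assume that there is a subsequence $\{\bar{x}_{k_j}\}_{j=1}^{\infty}$ of $\{\bar{x}_k\}_{k=1}^{\infty}$ that converges weakly to some $\hat{y} \neq \hat{x}$. By following all the above arguments, we also obtain that $\hat{y} \in \mathrm{VIP}(F, C)$ and that $\lim_{k \to \infty} \|\bar{x}_k - \hat{y}\|$ exists. Invoking Opial's condition, we note that
$$\begin{aligned} \lim_{k \to \infty} \|\bar{x}_k - \hat{x}\| &= \liminf_{i \to \infty} \|\bar{x}_{k_i} - \hat{x}\| < \liminf_{i \to \infty} \|\bar{x}_{k_i} - \hat{y}\| = \lim_{k \to \infty} \|\bar{x}_k - \hat{y}\| \\ &= \liminf_{j \to \infty} \|\bar{x}_{k_j} - \hat{y}\| < \liminf_{j \to \infty} \|\bar{x}_{k_j} - \hat{x}\| = \lim_{k \to \infty} \|\bar{x}_k - \hat{x}\|, \end{aligned}$$
which is a contradiction. Therefore, $\hat{x} = \hat{y}$, and hence we conclude that $\{\bar{x}_k\}_{k=1}^{\infty}$ converges weakly to $\hat{x}$.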
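Next, we discuss an important example of M-concentrating averaging matrices. For simplicity, we make use of the following notions. For a given averaging matrix $[\alpha_{k,j}]_{k,j=1}^{\infty}$, we denote
$$\tilde{\alpha}_{k,j} := \alpha_{k+1,j} - (1 - \alpha_{k+1,k+1}) \alpha_{k,j}$$
for all $k \geq 1$ and $j = 1, \ldots, k$. An averaging matrix $[\alpha_{k,j}]_{k,j=1}^{\infty}$ is said to satisfy the generalized segmenting condition [18] if
$$\sum_{k=1}^{\infty} \sum_{j=1}^{k} |\tilde{\alpha}_{k,j}| < \infty.$$
If $\tilde{\alpha}_{k,j} = 0$ for all $k \geq 1$ and $j = 1, \ldots, k$, then $[\alpha_{k,j}]_{k,j=1}^{\infty}$ is said to satisfy the segmenting condition.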
The following proposition indicates the sufficient and necessary condition for an averaging matrix satisfying the generalized segmenting condition to be M-concentrating.

Numerical Result
In this section, we illustrate the effectiveness of the proposed algorithm by minimizing the distance to a given point over the intersection of a finite number of linear half-spaces.
Let $c \in \mathbb{R}^n$, $a_i \in \mathbb{R}^n$ and $b_i \geq 0$ be given data, for all $i = 1, \ldots, m$. In this experiment, we want to solve the constrained minimization problem of the form:
$$\text{minimize} \quad \tfrac{1}{2} \|x - c\|^2 \quad \text{subject to} \quad \langle a_i, x \rangle \leq b_i, \quad i = 1, \ldots, m. \tag{19}$$
Note that the function $f := \tfrac{1}{2} \|\cdot - c\|^2$ is convex and Fréchet differentiable with gradient $\nabla f(x) = x - c$, and that each constrained set $C_i := \{x \in \mathbb{R}^n : \langle a_i, x \rangle \leq b_i\}$, $i = 1, \ldots, m$, is a nonempty closed convex set. Thus, the problem (19) fits into the setting of the variational inequality problem (1), where $F = \nabla f$ and $C = \bigcap_{i=1}^{m} C_i$. One can easily see that $F$ is monotone and 1-Lipschitz continuous. In this situation, the obtained theoretical results hold and we can apply Mann-MEM for solving the problem (19).
All the experiments were performed under MATLAB 9.6 (R2019a) running on a MacBook Pro 13-inch, 2019 with a 2.4 GHz Intel Core i5 processor and 8 GB 2133 MHz LPDDR3 memory. All computational times are given in seconds (sec.). In all tables of computational results, SEM means the classical subgradient extragradient method [10], while Mann-MEM means the Mann type mean extragradient method with the generalized segmenting averaging matrix $[\alpha_{k,j}]_{k,j=1}^{\infty}$ given by (20), where the parameter $\alpha \in (0, 1)$. Note that the set $T_k := H_{\leq}((\bar{x}_k - \tau F(\bar{x}_k)) - y_k; \langle y_k, (\bar{x}_k - \tau F(\bar{x}_k)) - y_k \rangle)$ in Mann-MEM is a half-space whose bounding hyperplane supports $C$ at the point $y_k$. In this situation, the metric projection $P_{T_k}$ can be computed explicitly by Formula (7) provided that $(\bar{x}_k - \tau F(\bar{x}_k)) - y_k \neq 0$. If, on the other hand, $(\bar{x}_k - \tau F(\bar{x}_k)) - y_k = 0$, then the half-space $T_k$ turns out to be the whole space $H$, so that the iterate $x_{k+1}$ is nothing else but $\bar{x}_k - \tau F(y_k)$.
Observe that extragradient-type methods require the computation of the metric projection onto the constrained set $C$, which here is the intersection of a finite number of linear half-spaces. The metric projection $P_C$ onto this set has no closed-form expression, and we need to solve the sub-optimization problem (5) in order to obtain it. To deal with this situation, we make use of the classical Halpern iteration as an inner loop: given an arbitrary initial point $\phi_1 \in \mathbb{R}^n$ and a sequence $\{\lambda_i\}_{i=1}^{\infty}$, we compute the iterates $\{\phi_i\}_{i=1}^{\infty}$ by (21). It is well known that, under suitable conditions on $\{\lambda_i\}_{i=1}^{\infty}$, the sequence $\{\phi_i\}_{i=1}^{\infty}$ converges to the unique point $P_C(\bar{x}_k - \tau F(\bar{x}_k))$ (see [20], Theorem 30.1), which is nothing else than the point $y_k$ in Mann-MEM. In order to approximate the point $y_k$, in all experiments we use the stopping criterion $\frac{\|\phi_{i+1} - \phi_i\|}{\|\phi_i\| + 1} \leq 10^{-8}$ for the inner loop. Notice that this strategy is also used when performing SEM.
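The inner loop can be sketched as follows. This is an illustration under assumptions rather than the exact scheme (21): the nonexpansive operator inside the Halpern iteration is taken to be the composition of the projections onto the individual half-spaces, the anchor point is the point to be projected, the inner step size is $\lambda_i = \lambda/(i + 1)$, and the inner loop is started at the anchor. The names `halpern_projection` and `project_halfspace`, as well as the sample data, are hypothetical.

```python
import numpy as np


def project_halfspace(x, a, beta):
    """Projection onto {w : <a, w> <= beta}, assuming a != 0."""
    return x - max(0.0, float(a @ x) - beta) / float(a @ a) * a


def halpern_projection(anchor, A, b, lam=1.9, tol=1e-8, max_iter=200_000):
    """Approximate the projection of `anchor` onto {x : A x <= b} by the Halpern
    iteration phi_{i+1} = lam_i * anchor + (1 - lam_i) * T(phi_i), where T is
    (by assumption) the composition of the half-space projections."""
    anchor = np.asarray(anchor, dtype=float)
    phi = anchor.copy()  # arbitrary starting point; here the anchor itself
    for i in range(1, max_iter + 1):
        lam_i = lam / (i + 1)
        t = phi
        for a_row, b_i in zip(A, b):
            t = project_halfspace(t, a_row, b_i)
        phi_next = lam_i * anchor + (1.0 - lam_i) * t
        # Relative-change stopping rule of the inner loop.
        if np.linalg.norm(phi_next - phi) / (np.linalg.norm(phi) + 1.0) <= tol:
            return phi_next
        phi = phi_next
    return phi


# Toy data: C = {x1 <= 0} intersected with {x2 <= 0}; the projection of (1, 2) is (0, 0).
A = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([0.0, 0.0])
y_approx = halpern_projection(np.array([1.0, 2.0]), A, b)
print(y_approx)
```

Note that the stopping rule controls the change between consecutive inner iterates, not the distance to the exact projection, so the returned point is only an approximation of $y_k$. Since the anchored iteration converges slowly, many inner iterations may be needed, which is why the number of inner iterations is reported alongside the outer iteration count.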
In  (20) to be 0.9. We terminate the methods as follows: for SEM, when $\|x_k - c\| \leq 10^{-5}$ or after 100 iterations, whichever comes first, and for Mann-MEM, when $\|x_k - c\| \leq 10^{-5}$ or after 100 iterations, whichever comes first. We present in Table 1 the influence of the parameter $\lambda \in [1.3, 1.9]$ on the computational time (Time), the number of iterations $k$ (#(Iters)), and the total number of inner iterations $i$ given by (21) (#(Inner)) when the stopping criteria were met.

Table 1. Influence of the stepsize $\lambda_k = \lambda/(k + 1)$ for several parameters $\lambda > 0$ when performing the subgradient-extragradient method (SEM) and the Mann mean extragradient method (Mann-MEM).

It can be seen from Table 1 that, for each of the two algorithms tested, larger values of the parameter $\lambda$ give better performance, that is, the least computational time is achieved when $\lambda$ is as large as possible. This behavior is probably due to the fact that a larger stepsize $\lambda_k$, which is defined by the parameter $\lambda$, makes the inner loop (21) terminate in fewer iterations, so that the algorithmic runtime decreases. However, we can see that SEM with $\lambda = 1.7$ and Mann-MEM with $\lambda = 1.3, 1.4$ need more than 100 iterations to reach the stopping criterion. We observe that the best performance of both SEM and Mann-MEM is obtained with the choice $\lambda = 1.9$; moreover, Mann-MEM with $\lambda = 1.9$ gives the best runtime of 0.0607 seconds.

In Figure 1, we present experiments in which the stepsize $\tau > 0$ is varied in the two tested methods. We use the same setting as in the above experiment and put the inner-loop stepsize $\lambda_k = 1.9/(k + 1)$ for SEM and Mann-MEM. We observe that the best computational time for both SEM and Mann-MEM is obtained with the choice $\tau = 0.6$.

For more insight into the convergence behavior of Mann-MEM, we also consider the influence of the parameter $\alpha$ in Mann-MEM. We put $\lambda_k = 1.9/(k + 1)$ and $\tau = 0.6$, and the results are presented in Figure 2. It can be observed that the least computational time and the smallest number of iterations are achieved when the parameter $\alpha$ is quite large, that is, the best performance of the algorithm is obtained with the choice $\alpha = 0.99$.

In the next experiment, we again solve the problem (19) by the tested methods. We compare the methods for various dimensions $n$ and numbers of constraints $m$. We take vectors $a_i$, $i = 1, \ldots, m$, whose coordinates are randomly chosen from the interval $[-m, m]$, positive real numbers $b_i = 0.5$, $i = 1, \ldots, m$, and an initial point $x_1$ whose coordinates are randomly chosen from the interval $[0, 1]$. We set the point $c$ to be the vector whose coordinates are all equal to 1, and we use the best parameter choices found above, namely $\lambda = 1.9$ and $\tau = 0.6$ for SEM, and $\lambda = 1.9$, $\tau = 0.6$, and $\alpha = 0.99$ for Mann-MEM. In the following numerical experiments, we applied one stopping criterion to terminate SEM and another, of max type, to terminate Mann-MEM. We performed 10 independent tests for every combination of the dimensions $n = 500, 1000, 2000$, and $3000$ and the numbers of constraints $m = 50, 100$, and $200$. The results are presented in Table 2, where the average computational runtime and the average number of iterations for each combination of $n$ and $m$ are reported.
It is clear from Table 2 that Mann-MEM is more efficient than SEM in the sense that Mann-MEM requires less average computational runtime than SEM. One notable behavior is that, when $m$ is quite large, Mann-MEM requires significantly less average computational runtime. For each dimension, we observe that larger problem sizes need more average computational runtime. This suggests that Mann-MEM with the generalized segmenting averaging matrix is more efficient than SEM. In this situation, we note that the superiority of Mann-MEM over SEM depends on the choice of the averaging matrix $[\alpha_{k,j}]_{k,j=1}^{\infty}$, which is, in our experiments, the generalized segmenting averaging matrix.

Some Concluding Remarks
The objective of this work was the solving of a variational inequality problem governed by a monotone and Lipschitz continuous operator. We associated to it the so-called Mann's mean extragradient method, and proved weak convergence of the generated sequence of iterates to a solution to the considered problem. Numerical experiments show that under some suitable parameters, the proposed method has a better convergence behavior compared to the classical subgradient-extragradient method. For future work, some comments are in order.
(i) Let us observe that the convergence of Mann-MEM requires us to know the Lipschitz constant $L$ of the operator $F$. Nevertheless, it is sometimes difficult to determine the Lipschitz constant exactly, so that Mann-MEM and its convergence result may not be applicable in practice. It would be very interesting to consider a variant of Mann-MEM with a variable stepsize $\{\tau_k\}_{k=1}^{\infty}$ in place of the fixed stepsize $\tau > 0$, for which prior knowledge of $L$ is not required.
(ii) It can be noted that the superiority of Mann-MEM over SEM depends on the choice of the averaging matrix $[\alpha_{k,j}]_{k,j=1}^{\infty}$. It would also be very interesting to find more examples of averaging matrices satisfying the M-concentrating condition.