
# An Adaptive Proximal Bundle Method with Inexact Oracles for a Class of Nonconvex and Nonsmooth Composite Optimization

by Xiaoliang Wang *, Liping Pang, Qi Wu and Mingkun Zhang

School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China

\* Author to whom correspondence should be addressed.
Mathematics 2021, 9(8), 874; https://doi.org/10.3390/math9080874
Submission received: 12 March 2021 / Revised: 7 April 2021 / Accepted: 9 April 2021 / Published: 15 April 2021

## Abstract

In this paper, an adaptive proximal bundle method is proposed for a class of nonconvex and nonsmooth composite problems with inexact information. The composite problems are the sum of a finite convex function, known only through inexact information, and a nonconvex function. For the nonconvex function, we design a convexification technique that ensures the linearization errors of its augmented function are nonnegative. The sum of the convex function and the augmented function is then regarded as an approximate function of the primal problem. For the approximate function, we adopt a disaggregate strategy and take the sum of the cutting-plane models of the convex function and the augmented function as a cutting-plane model of the approximate function. On this basis, we give the adaptive nonconvex proximal bundle method. Meanwhile, for the convex function with inexact information, we employ a noise management strategy and update the proximal parameter to reduce the influence of the inexact information. The method obtains an approximate solution. Two polynomial functions and six DC problems are used in the numerical experiments. The preliminary numerical results show that our algorithm is effective and reliable.

## 1. Introduction

Consider the following optimization problem:
$\min_{x \in \mathbb{R}^N} \psi(x) := f(x) + h(x),$
where $f : \mathbb{R}^N \to \mathbb{R}$ is a finite convex function and the function h is not necessarily convex, so the primal function (1) may be nonconvex; note also that neither f nor h is necessarily smooth. In this paper, we consider the case in which h is easy to evaluate while f is much harder and more time consuming to evaluate.
The sum of two functions appears in many optimization problems, such as the Lasso problem in imaging and various problems in machine learning. The composite form (1) can also arise from other problems, for example through splitting techniques or from nonlinear programming. Concretely, when the function under consideration is complicated and expensive to evaluate, splitting the primal function into two functions f and h with relatively simple structure is one possible way to speed up computations. Another is the penalty strategy, which transfers a constrained problem into an unconstrained problem of the sum form.
Note that splitting-type methods (see [1,2]) and alternating-type methods (see [3,4]) are two important classes of methods for composite optimization. When f and h have special structure, these methods can be effective and enjoy good convergence results. However, if the functions lack special structure, or are complex and difficult to evaluate, these methods may not be suitable for Problem (1). Moreover, alternating-direction-type methods solve at least two subproblems at each iteration; if one of the subproblems is hard to solve, the effectiveness of the algorithm may deteriorate. It is therefore meaningful to seek other methods for Problem (1) that do not rely on special structure.
In recent years, many scholars have devoted themselves to seeking effective methods for nonconvex and nonsmooth optimization problems; see [5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]. Bundle methods are usually very effective for solving nonsmooth optimization problems [21,22,23,24,25,26]. They use a “black box” to compute the objective value and one (arbitrary) subgradient at each iteration, so bundle techniques are a natural candidate for the composite problem (1). At present, the proximal alternating linearization methods (see [4,27,28,29]) are one effective kind of bundle method for certain composite problems. They solve two subproblems at each iteration, and the data involved are usually exact. When inexact oracles are involved, these methods may not be applicable and may even fail to converge.
In this paper, we design a proximal bundle method for the inexact composite problem (1) and update the proximal parameter $\mu$ to reduce the effects of the inexact information. In the following, we first present some cases in which inexact evaluations arise.
Inexact evaluations commonly arise in stochastic programming and Lagrangian relaxation [30,31], where solving the subproblems exactly is impractical, if possible at all. In bundle methods, inexact information is obtained from inexact oracles, of which there are several types. In our work, we consider the upper oracle (see (2a)–(2c) below). Upper oracles may overestimate the corresponding function values and yield negative linearization errors even if the primal function is convex.
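As a concrete illustration of such an oracle, the following sketch wraps an exact convex function into a noisy "upper oracle". All names and the noise construction are ours, not the paper's; the point is only that the returned data satisfy bounds of the form (2a)–(2b) with errors that are bounded but need not vanish.

```python
import random

def make_inexact_oracle(f, subgrad, theta_bar, eps_bar, seed=0):
    """Wrap an exact oracle of a convex f into a noisy 'upper oracle' (a sketch).
    The returned values obey
      f_hat >= f(x) - theta_bar                           (cf. (2a))
      f(y) >= f_hat + g_hat * (y - x) - eps_bar for all y (cf. (2b)),
    because we perturb the exact value by a bounded noise in
    [-theta_bar, eps_bar] and keep an exact subgradient."""
    rng = random.Random(seed)

    def oracle(x):
        noise = rng.uniform(-theta_bar, eps_bar)  # bounded, need not vanish
        return f(x) + noise, subgrad(x)

    return oracle

# Example: f(x) = |x| with subgradient sign(x).
oracle = make_inexact_oracle(abs, lambda x: -1.0 if x < 0 else 1.0,
                             theta_bar=0.1, eps_bar=0.1)
f_hat, g_hat = oracle(0.5)
```

Since the noise is added to the exact value, the perturbed cutting plane stays below $f + \bar\varepsilon$, which is exactly the overestimation behavior described above.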
In this paper, we focus on a class of nonconvex and nonsmooth composite problems with inexact data. The design and convergence analysis of bundle methods for nonconvex problems with inexact function and subgradient evaluations are quite involved, and there are only a handful of papers on this topic; see [15,32,33,34,35].
In this paper, we present a proximal bundle method with a convexification technique and a noise management strategy to solve the composite problem (1). Concretely, we design a “convexification” technique for the nonconvex function h to ensure the corresponding linearization errors are nonnegative, and we adopt a noise management strategy for the inexact function f. If the error is “too” large and the testing condition (22) is not satisfied, we decrease the proximal parameter $\mu$ to obtain a better iterate. We summarize our work as follows:
• Firstly, we design the convexification technique for the nonconvex function h so that the linearization errors of the augmented function $\phi_n = h + \frac{\eta_n}{2} \| \cdot \|^2$ are nonnegative. Although the augmented function $\phi_n$ may not be convex, nonnegative linearization errors can be obtained through the choice of the parameter $\eta_n$. Similar strategies can be found in [10,11,15,16].
• Then, the sum of the functions f and $\phi_n$ is regarded as an approximate function of the composite function (1). We construct cutting-plane models for f and $\phi_n$ separately and take their sum as the cutting-plane model of the approximate function, which may yield a better model. It should be noted that, since inexact information is involved, the corresponding cutting-plane model need not lie below f.
• Although we construct cutting-plane models for f and $\phi_n$ separately, only one quadratic programming (QP) subproblem needs to be solved at each iteration. By construction, the QP subproblem is strictly convex and has a unique solution, which makes our algorithm more effective.
• In the method, we include a noise management step to handle the inexact function and subgradient values, whose errors are only required to be bounded and need not vanish. If the noise is “too” large and the testing condition (22) is not satisfied, we decrease $\mu$ to obtain a better iterate.
• Two polynomial functions in twenty different dimensions and six DC (difference of convex) problems are used in the numerical experiments. In the exact case, our method is comparable with the method in [16] and attains higher precision. Among five different types of inexact oracles, the exact case performs best, and the vanishing-error cases generally perform better than the constant-error cases. We also apply our method to six DC problems, and the results show that our algorithm is effective and reliable.
The remainder of this paper is organized as follows. In Section 2, we review some definitions from variational analysis and some preliminaries for the proximal bundle method. Our proximal bundle method is given in Section 3. In Section 4, we present the convergence properties of the algorithm. Some preliminary numerical tests are reported in Section 5. In Section 6, we give some conclusions.

## 2. Preliminaries

In this section, we firstly review some concepts and definitions and then present some preliminaries for a proximal bundle method.

#### 2.1. Preliminary

In this subsection, we recall concepts and results of variational analysis that will be used in the remainder of the paper. The definition of lower-$C^k$ is given in Definition 10.29 in [36]. For completeness, we state it as follows:
Definition 1.
A function $F : O \to \mathbb{R}$, where $O$ is an open subset of $\mathbb{R}^N$, is said to be lower-$C^k$ on $O$ if, on some neighborhood V of each $\hat{x} \in O$, there is a representation:
$F(x) = \max_{t \in T} F_t(x),$
in which the functions $F_t$ are of class $C^k$ on V and the index set T is a compact space such that $F_t(x)$ and all its partial derivatives through order k depend continuously not just on x but on $(t, x) \in T \times V$.
If $k = 2$, F is a lower-$C^2$ function. Lower-$C^2$ functions have a special relationship with convexity; see Theorem 10.33 in [36]. We state an equivalent form as follows: a function F is lower-$C^2$ on an open set $O \subseteq \mathbb{R}^N$ if F is finite on $O$ and, for any $x \in O$, there exists a threshold $\bar{\lambda} \ge 0$ such that $F + \frac{\lambda}{2} \| \cdot \|^2$ is convex on an open neighborhood V of x for all $\lambda \ge \bar{\lambda}$. In particular, if F is convex and finite-valued, then F is lower-$C^2$ with threshold $\bar{\lambda} = 0$.
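The threshold behavior can be checked numerically. The sketch below (our own illustration, not from the paper) takes $h(x) = -x^2$, a lower-$C^2$ function whose threshold is $\bar{\lambda} = 2$, and verifies with a crude grid test that $h + \frac{\lambda}{2} x^2$ is midpoint-convex exactly when $\lambda \ge 2$.

```python
def is_midpoint_convex(F, lo=-2.0, hi=2.0, n=81, tol=1e-9):
    """Crude numeric convexity check on a grid:
    F((x + y)/2) <= (F(x) + F(y))/2 for all grid pairs."""
    pts = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    for x in pts:
        for y in pts:
            if F((x + y) / 2.0) > (F(x) + F(y)) / 2.0 + tol:
                return False
    return True

h = lambda x: -x * x                                    # lower-C^2, threshold 2
aug = lambda lam: (lambda x: h(x) + 0.5 * lam * x * x)  # F + (lam/2)|.|^2
```

For $\lambda = 2$ the augmented function is identically zero (hence convex), while for any $\lambda < 2$ it remains concave.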
For the nonconvex function h, we consider in the following that h is a lower-$C^2$ function. Since neither f nor h is necessarily smooth, the composite function (1) is also not necessarily smooth. For the proper convex function f, the common subdifferential of convex analysis is used, denoted by $\partial f(x)$ at the point $x \in \mathbb{R}^N$ (see [37]). For the proper and regular function h, we utilize the limiting subdifferential and also denote it by $\partial h(x)$ at the point x; for its definition, see [36].
In nonsmooth analysis, for the convex function f, the $\varepsilon$-subdifferential at a point $x_k$ is often used; it is defined by
$\partial_\varepsilon f(x_k) = \{ g \in \mathbb{R}^N : f(y) \ge f(x_k) + \langle g, y - x_k \rangle - \varepsilon, \ \forall y \in \mathbb{R}^N \},$
where $\varepsilon \ge 0$. In the following, we present the inexact data for the function f and give some preliminaries for the proximal bundle method.

#### 2.2. Inexact Information and Bundle Construction

Bundle methods are very effective for nonsmooth problems and use a “black box” to compute the function value and one subgradient at each iterate; the obtained subgradient is arbitrary. Along the iterative process, the generated points fall into two classes: null points, used essentially to increase the model’s accuracy; and serious points, which significantly decrease the objective function (and also improve the approximate model’s accuracy). The corresponding iterations are called null steps and serious steps, respectively. In the literature, serious points are sometimes called prox-centers or stability centers, denoted by $\hat{x}_{k(n)}$. The sequence $\{ \hat{x}_{k(n)} \}$ is thus a subsequence of the sequence $\{ x_n \}$. For notational simplicity, we write $\hat{x}_k = \hat{x}_{k(n)}$.
For the function f, the oracle can only provide an inexact function value $\hat{f}_l$ and one inexact subgradient $\hat{g}^l_f$ at each iterate $x_l$, with unknown but bounded inaccuracy. That is, for all $x_l \in \mathbb{R}^N$, we have
$\hat{f}_l \ge f(x_l) - \theta_l,$ (2a)
$f(\cdot) \ge \hat{f}_l + \langle \hat{g}^l_f, \cdot - x_l \rangle - \varepsilon_l,$ (2b)
Meanwhile
$\theta_l \le \bar{\theta} \quad \text{and} \quad \varepsilon_l \le \bar{\varepsilon}.$ (2c)
According to (2a)–(2c), we have the following relationships
Note that we only require the relationship $\theta_l + \varepsilon_l \ge 0$ to hold for each index l. The bundle for the function f can be denoted by
Now we present the cutting plane model of function f by the inexact information:
where $\hat{f}_{\bar{k}}$ denotes the oracle value at the current stability center $\hat{x}_k = x_{k(n)}$, with index $k(n)$ indicating which candidate point is the current center, and $e^k_{f,l}$ is the linearization error, which measures the difference between the cutting plane and the function value computed by the oracle at the current serious point, that is
$e^k_{f,l} = \hat{f}_{\bar{k}} - \hat{f}_l - \langle \hat{g}^l_f, \hat{x}_k - x_l \rangle.$
In particular, note that the relation $\tilde{\varphi}_n(x) \le f(x)$ does not necessarily hold, so the linearization error $e^k_{f,l}$ may be negative. In fact, by (2a), (2b) and (6), $e^k_{f,l}$ satisfies
$e^k_{f,l} \ge -(\theta_{k(n)} + \varepsilon_l).$
Meanwhile, the cutting-plane model $\tilde{\varphi}_n$ may overestimate f at some points. By (2b), the following inequality holds:
$\tilde{\varphi}_n(x) \le f(x) + \max_{l \in I_n} \varepsilon_l.$
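The effect of oracle noise on the linearization errors (6)–(7) can be seen on a tiny one-dimensional example. The sketch below (hypothetical names, our own illustration) computes the errors for exact data, where they are nonnegative, and then for a center value underestimated by $\theta = 0.2$, where one error turns negative but respects the lower bound $-(\theta + \varepsilon)$.

```python
def linearization_errors(bundle, x_hat, f_hat_center):
    """e_{f,l}^k = f_hat(center) - f_hat_l - g_hat_l * (x_hat - x_l), cf. (6);
    bundle is a list of one-dimensional triples (x_l, f_hat_l, g_hat_l)."""
    return [f_hat_center - f_l - g_l * (x_hat - x_l)
            for (x_l, f_l, g_l) in bundle]

# f(x) = |x|: exact data at x = -1 and x = 1, center x_hat = 1.
bundle = [(-1.0, 1.0, -1.0), (1.0, 1.0, 1.0)]
errs_exact = linearization_errors(bundle, 1.0, 1.0)   # all nonnegative
# If the oracle underestimates the center value by theta = 0.2,
# one linearization error becomes negative, but stays >= -(theta + eps).
errs_noisy = linearization_errors(bundle, 1.0, 0.8)
```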
For the nonconvex function h, the linearization errors may be negative. In bundle methods, nonnegative linearization errors are very important for convergence. We therefore present a local “convexification” technique; similar techniques can be found in [15,16,38]. The convexification parameter $\eta_n$ is chosen as follows
where $I_n$ denotes an index set, i.e., $I_n \subseteq \{ 0, 1, 2, \cdots \}$, and $e^k_{h,l}$ is the linearization error of h, defined as follows with $h_l = h(x_l)$, $g^l_h \in \partial h(x_l)$:
$e^k_{h,l} := h(\hat{x}_k) - h_l - \langle g^l_h, \hat{x}_k - x_l \rangle.$
The bundle for the function h can be denoted by
Next, we introduce the augmented function $\phi_n$ of h, defined by
$\phi_n(x) := h(x) + \frac{\eta_n}{2} \| x - \hat{x}_k \|^2, \quad \forall x \in \mathbb{R}^N,$
where $\eta_n \ge \eta_n^{\min}$ holds. By the definition of $\phi_n$, we have $h(\hat{x}_k) = \phi_n(\hat{x}_k)$. By the subgradient calculus, there exists $g^l_h \in \partial h(x_l)$ satisfying
$g^l_\phi = g^l_h + \eta_n (x_l - \hat{x}_k) \in \partial \phi_n(x_l).$
Meanwhile, the linearization error of function $ϕ n$ is
By the choice of the convexification parameter $\eta_n$, $e^k_{\phi,l} \ge 0$ holds for all $l \in I_n$.
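Since the paper's concrete choice rule (9) is not reproduced here, the sketch below uses a hedged stand-in with the same intent: the smallest $\eta$ for which every augmented linearization error $e^k_{\phi,l} = e^k_{h,l} + \frac{\eta}{2} |x_l - \hat{x}_k|^2$ becomes nonnegative. All names are ours, and the example uses the nonconvex $h(x) = -x^2$.

```python
def eta_min(bundle_h, x_hat, h_hat):
    """Smallest eta >= 0 making every augmented linearization error
    e_{h,l} + (eta/2)|x_l - x_hat|^2 nonnegative. A hedged stand-in for the
    paper's rule (9), which this sketch does not reproduce verbatim."""
    candidates = [0.0]
    for (x_l, h_l, g_l) in bundle_h:
        e_h = h_hat - h_l - g_l * (x_hat - x_l)   # cf. (10)
        d2 = (x_l - x_hat) ** 2
        if d2 > 0.0 and e_h < 0.0:
            candidates.append(-2.0 * e_h / d2)
    return max(candidates)

def phi_errors(bundle_h, x_hat, h_hat, eta):
    """Linearization errors of phi = h + (eta/2)|. - x_hat|^2; a short
    computation shows e_phi = e_h + (eta/2)|x_l - x_hat|^2."""
    return [h_hat - h_l - g_l * (x_hat - x_l) + 0.5 * eta * (x_l - x_hat) ** 2
            for (x_l, h_l, g_l) in bundle_h]

# Nonconvex h(x) = -x^2 with gradient -2x; center x_hat = 0.
h = lambda x: -x * x
gh = lambda x: -2.0 * x
bundle_h = [(x, h(x), gh(x)) for x in (1.0, -0.5, 2.0)]
eta = eta_min(bundle_h, 0.0, h(0.0))
```

For this data the rule returns $\eta = 2$, matching the lower-$C^2$ threshold of $h(x) = -x^2$, and all augmented errors become nonnegative.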
In the following, we regard the sum of functions f and $ϕ n$ as an approximate function for composite function (1):
$\Psi_n(x) = f(x) + \phi_n(x).$
For (13), we use the sum of the cutting-plane models of f and $\phi_n$ as the cutting-plane model. The cutting-plane model of the augmented function $\phi_n$ is defined as follows
Its equivalent form is
Then, the cutting plane model for the approximate function $Ψ n$ is
$\Phi_n(x) = \tilde{\varphi}_n(x) + \tilde{\phi}_n(x).$
The new iterate $x_{n+1}$ is given by the following QP (quadratic programming) subproblem
where $\mu_n > 0$ is the proximal parameter. Note that $x_{n+1}$ is the unique solution to (15) by strong convexity. The following lemma shows the relation between the current stability center and the newly generated point. A similar conclusion, for convex functions, can be found in Lemma 10.8 in [39]; we omit the proof.
Lemma 1.
Let $x n + 1$ be the unique solution to the QP subproblem (15) and proximal parameter $μ n > 0$. Then, we have
$x_{n+1} = \hat{x}_k - \frac{1}{\mu_n} G_n,$
where
Meanwhile, $\alpha_1 = (\alpha_1^1, \cdots, \alpha_1^n)$ and $\alpha_2 = (\alpha_2^1, \cdots, \alpha_2^n)$ solve
In addition, the following relations hold:
(i)
$G n ∈ ∂ Φ n ( x n + 1 )$;
(ii)
$\hat{f}_{\bar{k}} + \phi_n(\hat{x}_k) - \Phi_n(x_{n+1}) = \mu_n \| x_{n+1} - \hat{x}_k \|^2 + \sum_{l \in I_n} \alpha_1^l e^k_{f,l} + \sum_{l \in I_n} \alpha_2^l e^k_{\phi,l}$.
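In one dimension the QP subproblem (15) can be minimized directly, since the cutting-plane model plus the proximal term is strictly convex. The following sketch (our own illustration, not the paper's QP-dual implementation) uses ternary search; for cuts of $f(x) = |x|$ taken at $x = \pm 1$ with center $\hat{x}_k = 1$ and $\mu_n = 1$, the minimizer is $x_{n+1} = 0$.

```python
def cutting_plane_model(cuts, x):
    """Piecewise-affine model: max of the stored linearizations
    f_l + g_l * (x - x_l)."""
    return max(f_l + g_l * (x - x_l) for (x_l, f_l, g_l) in cuts)

def solve_prox_subproblem(cuts, x_hat, mu, lo=-10.0, hi=10.0, iters=200):
    """Minimize cutting_plane_model(x) + (mu/2)(x - x_hat)^2 over [lo, hi].
    The objective is strictly convex in 1-D, so ternary search converges."""
    obj = lambda x: cutting_plane_model(cuts, x) + 0.5 * mu * (x - x_hat) ** 2
    for _ in range(iters):
        t1, t2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if obj(t1) <= obj(t2):
            hi = t2
        else:
            lo = t1
    return 0.5 * (lo + hi)

# Cuts of f(x) = |x| at x = 1 and x = -1; prox center x_hat = 1, mu = 1.
cuts = [(1.0, 1.0, 1.0), (-1.0, 1.0, -1.0)]
x_next = solve_prox_subproblem(cuts, x_hat=1.0, mu=1.0)
```

One can check by hand that the subdifferential of $|x| + \frac{1}{2}(x-1)^2$ contains 0 at $x = 0$, in line with the optimality relation $x_{n+1} = \hat{x}_k - G_n/\mu_n$ of Lemma 1.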
In the following, we present the concept of predicted descent. Concretely, the predicted descents for the functions $f, \phi_n, \Psi_n$ are stated as follows
The predicted descent is very important for the convergence of bundle methods. By the definitions of the functions $\phi_n$ and $\tilde{\phi}_n$, we have $\delta^\phi_{n+1} \ge 0$. Since inexact data enter the computation of f, the nonnegativity of $\delta^f_{n+1}$ cannot be guaranteed; hence the nonnegativity of $\delta_{n+1}$ cannot be guaranteed either.
Next we give the aggregate linearization error which is defined by
$e_{n+1} := \sum_{l \in I_n} \alpha_1^l e^k_{f,l} + \sum_{l \in I_n} \alpha_2^l e^k_{\phi,l}.$
By the term (ii) in Lemma 1 and the definition of $δ n + 1$ in (18), the following relationship holds
$\delta_{n+1} = \frac{\| G_n \|^2}{\mu_n} + \frac{\eta_n}{2} \| x_{n+1} - \hat{x}_k \|^2 + e_{n+1} = \frac{R_n + \mu_n}{2} \| x_{n+1} - \hat{x}_k \|^2 + e_{n+1},$
where $R n = μ n + η n$. Next we define the aggregate linearization for approximate model $Φ n$:
$\Phi_n^{lin}(x) := \Phi_n(x_{n+1}) + \langle G_n, x - x_{n+1} \rangle.$
Then, the aggregate linearization error can also be expressed as the difference between the oracle value at the current serious point and the value of the aggregate linearization $\Phi_n^{lin}$ at that point, that is,
$e_{n+1} = \hat{f}_{\bar{k}} + h(\hat{x}_k) - \Phi_n^{lin}(\hat{x}_k).$
Indeed, by the definition of $Φ n l i n ( x )$, we have
$\Phi_n^{lin}(x) = \hat{f}_{\bar{k}} + h(\hat{x}_k) + \langle G_n, x - \hat{x}_k \rangle - e_{n+1}.$
By the convexity of function $Φ n$, the inequality $Φ n ( x ) ≥ Φ n l i n ( x )$ holds. So for any $x ∈ R N$, we have
$\hat{f}_{\bar{k}} + h(\hat{x}_k) \le \Phi_n(x) - \langle G_n, x - \hat{x}_k \rangle + e_{n+1}.$
By (8), the following inequality holds under the condition $ϕ ˜ n ( x ) ≤ ϕ n ( x )$:
$\hat{f}_{\bar{k}} + h(\hat{x}_k) \le \psi(x) + \frac{\eta_n}{2} \| x - \hat{x}_k \|^2 + \max_{l \in I_n} \varepsilon_l - \langle G_n, x - \hat{x}_k \rangle + e_{n+1}.$
Note that the condition $\tilde{\phi}_n(x) \le \phi_n(x)$ may not hold if the convexification parameter $\eta_n$ is less than the threshold parameter $\bar{\rho}$ (the function $\phi_n(x)$ may not be convex), but the choice of $\eta_n$ ensures the nonnegativity of $e^k_{\phi,l}$ for all $l \in I_n$.
By the nonnegativity of $e ϕ , l k$ and (7), the aggregate linearization error satisfies
Using the fact that $x_{n+1}$ is the solution of the QP problem (15) and the definition of the predicted descent in (18), we have
where the second inequality follows from the nonnegativity of $e^k_{\phi,l}$. By (5) with $x = \hat{x}_k$, the corresponding inequality holds. Note that if only “small” errors have been introduced into the model $\Phi_n$, then it holds that
$\delta_{n+1} > \frac{\eta_n}{2} \| x_{n+1} - \hat{x}_k \|^2 + \frac{\mu_n}{2} \| x_{n+1} - \hat{x}_k \|^2 = \frac{R_n}{2} \| x_{n+1} - \hat{x}_k \|^2.$
Then, by (20) and (16), (22) has the following equivalent forms:
Next, we present an optimality measure. Concretely, it is
$V_n := \max \{ \| G_n \|, e_{n+1} \}.$
By the above discussions, we have
From the above inequalities, a smaller $\mu_n$ makes inequality (22) more likely to hold. Based on this, we update the parameter $\mu_n$ to reduce the effects of the errors. In the next section, we give our proximal bundle algorithm for the primal composite problem (1) with inexact information.

## 3. Algorithm

In this section, we present our adaptive bundle algorithm for the composite problem (1) with inexact information. To handle the inexact information, similar to [17], we introduce a noise management step. Concretely, when condition (22) does not hold, $\mu_n$ is reduced in order to make $\delta_{n+2} > \delta_{n+1}$ and increase the likelihood that condition (22) holds.
Algorithm 1 (Nonconvex Nonsmooth Adaptive Proximal Bundle Method with Inexact Information for a Class of Composite Optimization)

Step 0 (Input and initialization): Choose an initial point $x_0 \in \mathbb{R}^N$, the required constants, an unacceptable-increase parameter $M_0 > 0$, $\mu_{max} > 0$, $R_0 > 0$, $\tau > 1$, $\gamma \ge 1$, and a stopping tolerance $Tol \ge 0$. Set the noise management parameter NMP = 0 and $\hat{x}_0 = x_0$. Set $(\eta_0, \mu_0) = (0, R_0)$. Call the black box to compute $f_{\hat{x}_0}, g^0_{\hat{x}_0}, h(\hat{x}_0), g^0_{h,0}$. Set $n = k = 0$.

Step 1 (Model generation and QP subproblem): Given the current proximal center $\hat{x}_k$, the current bundles $B^f_n$ and $B^h_n$ with index set $I_n$, the current proximal parameter $\mu_n$, the convexification parameter $\eta_n$, and the current approximate models $\tilde{\varphi}_n(x)$ and $\tilde{\phi}_n(x)$, solve the QP problem (15) to obtain the next iterate $x_{n+1}$ and the simplex multipliers $(\alpha_1, \alpha_2)$. Then compute $G_n, \delta_{n+1}, e_{n+1}$ and $V_n$.

Step 2 (Stopping criterion): If $V_n \le Tol$, stop. Otherwise, go to Step 3.

Step 3 (Noise management): If relationship (22) does not hold, set NMP = 1, $\mu_{n+1} = \kappa \mu_n$, $n := n + 1$, and go to Step 1; otherwise, set NMP = 0, declare the noise acceptable, and go to Step 4.

Step 4 (Descent test): Call the black box to compute $(\hat{f}_{n+1}, \hat{g}^{n+1}_f)$ and $(h_{n+1}, g^{n+1}_h)$. Check the descent condition
$\hat{f}_{\bar{k}} + h(\hat{x}_k) - \hat{f}_{n+1} - h_{n+1} - \frac{\eta_n}{2} \| x_{n+1} - \hat{x}_k \|^2 \ge m_1 \delta_{n+1},$ (25)
If (25) does not hold, declare a null step and set $k(n+1) = k(n)$; if NMP = 0, choose $\mu_{n+1} \in [\gamma \mu_n, \mu_{max}]$, and if NMP = 1, take $\mu_{n+1} = \mu_n$; update the bundle information and go to Step 5. Otherwise, declare a serious step, choose $\mu_{n+1} = \mu_n$, update the bundle information and go to Step 5.

Step 5 (Parameter update): Apply rule (26) to compute $\eta_{n+1}$, where $\eta_n^{min}$ is given by (9), written with n replaced by $n + 1$.

Step 6 (Restart step): If $\psi(x_{n+1}) > \psi(\hat{x}_k) + M_0$ holds, the objective increase is unacceptable; restart the algorithm by setting
$\eta_0 := \eta_n, \ \mu_0 := \tau \mu_n, \ R_0 := \eta_0 + \mu_0, \ x_0 := \hat{x}_k, \ k(0) := 0, \ i_0 := 0, \ I_0 := \{ 0 \}, \ n := 0,$
where $i_k$ is the index of serious points. Then loop to Step 1. Otherwise, increase k by 1 in the case of a serious step; in all cases, increase n by 1 and loop to Step 1.
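The serious/null-step mechanics of Steps 1, 2 and 4 can be illustrated by a much-simplified sketch: a one-dimensional proximal bundle loop with an exact oracle, a convex objective and a fixed proximal parameter. The convexification, noise-management and restart machinery of Algorithm 1 is deliberately omitted, and all names are ours, not the paper's.

```python
def _model(cuts, x):
    """Piecewise-affine cutting-plane model: max of stored linearizations."""
    return max(f_l + g_l * (x - x_l) for (x_l, f_l, g_l) in cuts)

def _solve_qp(cuts, x_hat, mu, lo=-50.0, hi=50.0, iters=300):
    """Minimize model(x) + (mu/2)(x - x_hat)^2 by ternary search (1-D, convex)."""
    obj = lambda x: _model(cuts, x) + 0.5 * mu * (x - x_hat) ** 2
    for _ in range(iters):
        t1, t2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if obj(t1) <= obj(t2):
            hi = t2
        else:
            lo = t1
    return 0.5 * (lo + hi)

def proximal_bundle(f, subgrad, x0, mu=1.0, m1=0.1, tol=1e-9, max_iter=500):
    """Bare-bones serious/null-step loop; returns the final stability center."""
    x_hat, f_hat = x0, f(x0)
    cuts = [(x0, f_hat, subgrad(x0))]
    for _ in range(max_iter):
        x_new = _solve_qp(cuts, x_hat, mu)
        delta = f_hat - _model(cuts, x_new)       # predicted descent
        if delta <= tol:                          # approximate optimality
            break
        f_new = f(x_new)
        cuts.append((x_new, f_new, subgrad(x_new)))
        if f_hat - f_new >= m1 * delta:           # descent test (cf. (25))
            x_hat, f_hat = x_new, f_new           # serious step
        # otherwise: null step -- the new cut just refines the model
    return x_hat

# Minimize f(x) = |x - 2| starting far from the solution.
x_star = proximal_bundle(lambda x: abs(x - 2.0),
                         lambda x: -1.0 if x < 2.0 else 1.0,
                         x0=10.0)
```

On this example the loop makes a run of serious steps toward $x = 2$, one null step that adds the missing cut from the other side of the kink, and then stops once the predicted descent vanishes.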
Remark 1.
Note that Algorithm 1 does not state explicitly how the bundle elements are updated; the updating strategies differ between null steps and serious steps. When a serious step occurs, the newly generated point becomes the new proximal center and all linearization errors in the bundles must be updated. When a null step occurs, the proximal center stays unchanged and only the newly generated information is added to the bundles to improve the model’s accuracy. As the iterations proceed, the bundles may grow so large that the efficiency of the algorithm deteriorates. Then the active-set technique (only the active elements $\alpha_1^l$ and $\alpha_2^l$ are kept in the bundles) and the compression strategy can be adopted. With the compression strategy, the bundles can be reduced to as few as two elements: the aggregate information and the newly generated information. Although compression does not impair the convergence of the algorithm, it may affect the model’s effectiveness if the number of bundle elements is too small.
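The compression strategy described above can be sketched in one dimension: keep only the cuts with active multipliers and replace the rest by a single aggregate cut, the linearization of the model at the new iterate built from the convex combination of stored subgradients. Names and the one-dimensional setting are ours.

```python
def compress_bundle(cuts, alphas, x_next, tol=1e-12):
    """Keep only the active cuts (alpha_l > tol) plus one aggregate cut.
    cuts: list of (x_l, f_l, g_l); alphas: simplex multipliers from the QP."""
    G = sum(a * g_l for a, (_, _, g_l) in zip(alphas, cuts))      # aggregate subgradient
    val = sum(a * (f_l + g_l * (x_next - x_l))                    # model value at x_next
              for a, (x_l, f_l, g_l) in zip(alphas, cuts))
    kept = [c for a, c in zip(alphas, cuts) if a > tol]
    return kept + [(x_next, val, G)]

# Cuts of f(x) = |x| at x = 1 and x = -1 with equal multipliers.
cuts = [(1.0, 1.0, 1.0), (-1.0, 1.0, -1.0)]
compressed = compress_bundle(cuts, [0.5, 0.5], x_next=0.0)
```

With multipliers $(0.5, 0.5)$ both cuts stay active, so the compressed bundle holds them plus the aggregate cut $(0, 0, 0)$; with multipliers $(1, 0)$ the second cut is dropped and only two elements remain, as in the minimal-bundle case mentioned above.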
In the following, we analyze Algorithm 1 and show that it is well defined. If the algorithm loops forever, three situations may occur (the number of restart steps is finite, as shown in Lemma 3):
• an infinite loop of noise management between Step 1 and Step 3, driving $\mu_n \to 0$;
• a finite number of serious steps, followed by an infinite number of null steps;
• an infinite number of serious steps.
We first consider the case of an infinite loop of noise management.
Lemma 2.
If an infinite loop between Step 1 and Step 3 of Algorithm 1 occurs, then the optimality measure $V_n \to 0$.
Proof.
Suppose an infinite loop between Step 1 and Step 3 begins at iteration index $\bar{l}$. According to the algorithm, this means that for all $n \ge \bar{l}$, neither the proximal center $\hat{x}_{k(n)} = \hat{x}_{k(\bar{l})}$ nor the approximate models $\tilde{\varphi}_n, \tilde{\phi}_n, \Phi_n$ change. Hence, when solving the QP problem (15) sequentially, only the parameter $\mu_n$ is updated. By the update rule for $\mu_n$, we have $\mu_n = \kappa^{n - \bar{l}} \mu_{\bar{l}}$, so $\mu_n \to 0$ as $n \to \infty$ since $\kappa \in (0, 1)$. Using (24), we have
$0 \le V_n \le \sqrt{2 \mu_n (\bar{\theta} + \bar{\varepsilon})} \to 0 \quad \text{as } \mu_n \to 0, \ n \to \infty.$
Then, the proof is completed. □
Note that if infinitely many noise management steps occur, there are only finitely many updates of the convexification parameter $\eta_n$; hence $\eta_n$ is eventually bounded. Before treating the last two cases, we show that only finitely many restart steps occur in Algorithm 1. For this, we make an assumption, which can also be found in [16].
Assumption 1.
The following level set $T : = { x ∈ R N : ψ ( x ) ≤ ψ ( x ^ 0 ) + M 0 }$ is nonempty and compact.
By the definition of lower-$C^2$, the compactness of the set $T$ and the finite covering theorem, there exists a threshold $\bar{\rho}$ such that for all $\eta \ge \bar{\rho}$, the augmented function $h(x) + \eta \| x - \hat{x}_k \|^2 / 2$ is convex for all $x, \hat{x}_k \in T$.
The compactness of $T$ also allows us to find Lipschitz constants $L_f$ and $L_h$ for the functions f and h, respectively (by the local Lipschitz property of lower-$C^2$ functions and the finite covering theorem). The following lemma shows that only finitely many restart steps occur in Algorithm 1.
Lemma 3.
Suppose only finitely many noise management steps occur and Assumption 1 holds, and consider the sequence of iterates $\{ x_n \}$ generated by Algorithm 1. Let the index $l_k \in I_n$ denote the current proximal-center index. Then only finitely many restart steps can occur in Algorithm 1; hence, eventually the sequence $\{ \hat{x}_k \}$ lies entirely in $T$.
Proof.
Firstly, the new iterate $x_{n+1}$ is well defined by the strong convexity of the QP subproblem (15). Since f and h are Lipschitz continuous on the level set $T$ with respective Lipschitz constants $L_f$ and $L_h$, $\psi(x)$ is also Lipschitz continuous on the compact set $T$ with Lipschitz constant $L := L_f + L_h$. By the Lipschitz continuity of $\psi$, there exists $\epsilon > 0$ such that for any $\tilde{x} \in \{ x : \psi(x) \le \psi(\hat{x}_0) \}$, the open ball $B_\epsilon(\tilde{x})$ is contained in the compact set $T$ (indeed, the choice $\epsilon = M_0 / L$ suffices). Note that:
where $l_k \in I_n$ and $g_{l_k} \in \partial (\tilde{\varphi}_n + \tilde{\phi}_n)(\hat{x}_k)$. It also holds that $g_{l_k} \in \partial \psi(\hat{x}_k)$, so $\| g_{l_k} \| \le L$. In Algorithm 1, $\mu_n$ increases when restart steps and null steps with NMP = 0 occur, so eventually the proximal parameter $\mu_n$ becomes large enough that $2L / \mu_n < \epsilon$ holds. Noting that $\psi(\hat{x}_k) \le \psi(\hat{x}_0)$ for every new serious point $\hat{x}_k$ generated by Algorithm 1 completes the proof. □
Next, we focus on the update of the convexification parameter $\eta_n$. The following lemma shows that $\eta_n$ eventually remains unchanged.
Lemma 4.
Suppose there are only finitely many noise management steps and Assumption 1 holds. Then there exists an iteration index $\bar{n}$ such that for all $n \ge \bar{n}$, the convexification parameter stabilizes, i.e., $\eta_n = \bar{\eta}$. Moreover, if $\bar{\eta} \ge \bar{\rho}$ holds, then for all $n \ge \bar{n}$, the augmented function $\phi_n(x) = h(x) + \frac{\eta_n}{2} \| x - \hat{x}_k \|^2$ is convex on the compact set $T$.
Proof.
By the update rule for $\eta_n$ in Algorithm 1, the sequence $\{ \eta_n \}$ is nondecreasing: either $\eta_{n+1} = \eta_n$ or $\eta_{n+1} = \tau \eta^{min}_{n+1} > \tau \eta_n$. Suppose the sequence $\{ \eta_n \}$ does not stabilize; then the convexification parameter is increased by a factor of at least $\tau$ infinitely often, which leads to a contradiction. Indeed, there exists an index $\tilde{n}$ such that $\eta_{\tilde{n}} \ge \bar{\rho}$ and $h(x) + \frac{\eta_{\tilde{n}}}{2} \| x - x_{k(n)} \|^2$ is convex on the compact set $T$. For this iteration, $e^k_{h,l} + \eta_{\tilde{n}} \| x - x_{k(\tilde{n})} \|^2 / 2 \ge 0$ for all $l \in I_{\tilde{n}}$ (the linearization error of a convex function is always nonnegative), so the corresponding inequality holds. From this iteration onward, the convexification parameter remains unchanged, i.e., $\eta_{\tilde{n} + i} = \eta_{\tilde{n}}$ for all $i \ge 1$, and the sequence $\{ \eta_n \}$ stabilizes; in particular, we may choose $\bar{\eta} = \eta_{\tilde{n}}$. For $n \ge \bar{n}$, if $\eta_n = \bar{\eta} \ge \bar{\rho}$ holds, then the augmented function $\phi_n(x)$ is convex on $T$. □
The optimality measure in Algorithm 1 for inexact information differs from that in the exact case. The following lemma justifies the choice of $V_n$ as optimality measure and shows that the accumulation point is an approximate solution of the primal problem (1).
Lemma 5.
Suppose there are only finitely many noise management steps and Assumption 1 holds. Suppose that for an infinite subset of iterations $I \subseteq \{ 0, 1, \cdots \}$, the sequence $\{ V_\lambda \}_{\lambda \in I} \to 0$ as $I \ni \lambda \to \infty$. Let $\{ \hat{x}_{k(\lambda)} \}_{\lambda \in I}$ be the corresponding subsequence of serious points and let $\hat{x}_{acc}$ be an accumulation point. If $\bar{\eta} \ge \bar{\rho}$ holds, then $\hat{x}_{acc}$ is an approximate solution to problem (13) with
where $Ψ *$ is the optimal value of function $Ψ n$.
Proof.
Taking $\lambda \in I$ large enough and using the definition of $\Psi_n$, we have $\Psi_n(\hat{x}_{k(\lambda)}) = f(\hat{x}_{k(\lambda)}) + h(\hat{x}_{k(\lambda)}) + \frac{\bar{\eta}}{2} \| \hat{x}_{k(\lambda)} - \hat{x}_{k(\lambda)} \|^2 = f(\hat{x}_{k(\lambda)}) + h(\hat{x}_{k(\lambda)})$. Passing to the limit in inequality (21) and using $\bar{\eta} \ge \bar{\rho}$, we obtain
Moreover, for any cluster point $\hat{x}_{acc}$ of $\{ \hat{x}_{k(\lambda)} \}_{\lambda \in I}$, passing to the limit in (2a), we obtain
$\Psi_n(\hat{x}_{acc}) - \limsup_{\lambda \to \infty} \theta_{k(\lambda)} \le \lim_{\lambda \to \infty} ( \hat{f}_{k(\lambda)} + h(\hat{x}_{k(\lambda)}) ).$
Rewriting the two inequalities above, the conclusion holds. □
Note that by the definition of the function $\Psi_n$, for a large enough index n we have $\Psi_n(x) = \psi(x) + \frac{\bar{\eta}}{2} \| x - \hat{x}_{acc} \|^2$ and $\Psi_n(\hat{x}_{acc}) = \psi(\hat{x}_{acc})$. By the above discussion, the corresponding bound holds, where $x^*$ and $\psi^*$ are a local optimal solution and the optimal value, respectively. Then $\hat{x}_{acc}$ is an approximate solution to the primal problem (1). Lemma 5 has some corollaries that are very important for the convergence analysis. We state them here but omit the proofs.
Corollary 1.
(i) If for some iteration index λ, $\eta_\lambda \ge \bar{\rho}$ holds and the optimality measure satisfies $V_\lambda = 0$, then the serious point $\hat{x}_{k(\lambda)}$ is an approximate solution to problem (13) with
$\Psi_n(\hat{x}_{k(\lambda)}) \le \Psi^* + \theta_{k(\lambda)} + \max_{l \in I_n} \varepsilon_l.$
(ii) Suppose that the serious-point sequence eventually stabilizes, i.e., there exists a constant m such that $\hat{x}_{k(\lambda)} = \hat{x}_{k(m)}$ for all $\lambda \ge m$. If $\eta_m \ge \bar{\rho}$ holds, then $\hat{x}_{k(m)}$ is an approximate solution to problem (13) with
Note that if an infinite loop of noise management occurs after some iteration $\tilde{l}$ and $\eta_{\tilde{l}} \ge \bar{\rho}$, then the proximal center remains unchanged; according to (29), the last serious point $\hat{x}_{k(\tilde{l})}$ is an approximate solution to problem (13). From the above lemmas and corollary, Algorithm 1 is well defined. In the next section, we study the last two cases separately.

## 4. Convergence Theory

In this section, we study the last two cases above separately. Similar proofs can be found in [13,16,17,38,40]. In the following lemma, the second case, i.e., finitely many serious steps followed by infinitely many null steps, is considered.
Lemma 6.
Suppose Assumption 1 holds and that, after some iteration $\bar{n}$, $\eta_{\bar{n}} \ge \bar{\rho}$ holds and no serious step is declared in Algorithm 1. Then there exists a subsequence $\{ x_n \}_{n \in I_n}$ such that $V_n \to 0$ as $I_n \ni n \to \infty$.
Proof.
After iteration $\bar{n}$, no serious step is declared; hence only noise management steps or null steps occur for $n \ge \bar{n}$. The serious point does not change, i.e., $\hat{x}_{k(n)} = \hat{x}_{k(\bar{n})}$ for all $n \ge \bar{n}$. For notational simplicity, we write $\hat{x} := \hat{x}_{k(\bar{n})}$.
If the number of noise management steps is infinite, then $\mu_n \to 0$ as $I_n \ni n \to \infty$, and the earlier proof shows that there exists a subsequence $\{ x_{n+1} \}_{n \in I_n}$ such that $V_n \to 0$ as $I_n \ni n \to \infty$.
Now suppose there are only finitely many noise management steps. Since the number of restart steps is finite, there exists an iteration index $\hat{n}$ such that (22) holds and only null steps occur for all $n \ge \hat{n}$. Consequently, $\{ \mu_n \}$ is a nondecreasing sequence, since $\mu_{n+1} \in [\gamma \mu_n, \mu_{max}]$ for all $n > \hat{n}$; hence $\mu_n \to \tilde{\mu} \le \mu_{max}$ as $n \to \infty$. In the following, we show $\delta_n \to 0$. Let $P_n$ be the partial linearization of the QP model (15), that is,
$P n ( x ) : = Φ n l i n ( x ) + μ n 2 ∥ x − x ^ ∥ 2 , ∀ x ∈ T .$
By Lemma 10.10 in [39], the rules used to select the bundle guarantee that $Φ n l i n ( x ) ≤ Φ n + 1 ( x )$ holds. By inequality (8), we have
$P n ( x ^ ) = Φ n l i n ( x ^ ) ≤ Φ n + 1 ( x ^ ) ≤ ψ ( x ^ ) + η n 2 ∥ x ^ − x ^ ∥ 2 + ε ¯ = ψ ( x ^ ) + ε ¯ .$
Similarly, evaluating $P n$ at $x n + 2$, and using the fact that $μ n + 1 ≥ γ μ n$, we have
$P n ( x n + 2 ) ≤ Φ n + 1 ( x n + 2 ) + μ n + 1 2 ∥ x n + 2 − x ^ ∥ 2 = Φ n + 1 l i n ( x n + 2 ) − 〈 G n + 1 , x n + 2 − x n + 2 〉 + μ n + 1 2 ∥ x n + 2 − x ^ ∥ 2 = P n + 1 ( x n + 2 ) .$
Furthermore, since $x n + 1$ is the unique solution to (15), we have $∇ P n ( x n + 1 ) = 0$. By Taylor’s expansion, we obtain
$P n ( x ) = P n ( x n + 1 ) + μ n 2 ∥ x − x n + 1 ∥ 2 .$
Hence, evaluating this expansion at $x ^$ and at $x n + 2$, the following two equalities hold
$P n ( x ^ ) = P n ( x n + 1 ) + μ n 2 ∥ x ^ − x n + 1 ∥ 2 , P n ( x n + 2 ) = P n ( x n + 1 ) + μ n 2 ∥ x n + 2 − x n + 1 ∥ 2 .$
Using the relationships above, the fact that $μ n ≥ μ n ^$, and (30), we obtain
Then, the sequence ${ P n ( x n + 1 ) }$ is nondecreasing and bounded above. Hence, the limit exists:
$P n ( x n + 1 ) → P * < ∞ , and ∥ x n + 1 − x n ∥ → 0 , when n → ∞ .$
Then, the sequence of null steps ${ x n }$ is bounded. By (16) and the boundedness of ${ μ n }$, the sequence ${ G n + 1 }$ is bounded (see [39]). Since for $n > n ^$ the serious step test is not satisfied, by the definition of $δ n + 1$, we have
$f ^ n + 1 + h ( x n + 1 ) + η n ∥ x n + 1 − x ^ ∥ 2 − Φ n ( x n + 1 ) > ( 1 − m 1 ) δ n + 1 .$
Since $f ^ n + 1 + h ( x n + 1 ) + η n 2 ∥ x n + 1 − x ^ ∥ 2 ≤ Φ n + 1 ( x n + 2 ) + 〈 G n + 1 , x n + 1 − x n + 2 〉$ holds, then by the definition of partial linearization and $Ω : = f ^ n + 1 + h ( x n + 1 ) + η n ∥ x n + 1 − x ^ ∥ 2 − Φ n ( x n + 1 )$, we have
By (32), Theorem 1 in [16], and $μ n → μ ˜ ≤ μ m a x$, the right-hand side of the above inequality vanishes as $n → ∞$. Hence, $Ω → 0$ as $n → ∞$, and therefore
$0 ≤ ( 1 − m 1 ) δ n + 1 < f ^ n + 1 + h ( x n + 1 ) + η n ∥ x n + 1 − x ^ ∥ 2 − Φ n ( x n + 1 ) → 0 .$
Then, $δ n + 1 → 0$ holds as $n → ∞$. By (24), $V n → 0$ holds as $l ^ < n → ∞$. □
Theorem 1.
Suppose Algorithm 1 loops forever and Assumption 1 holds. Assume there is only a finite number of serious steps and that $η ¯ ≥ ρ ¯$ holds. Then, the last serious point $x ^$ is an approximate solution of problem (13) with
Proof.
(i) If infinitely many noise management steps happen, Algorithm 1 finally stops and the conclusion holds. (ii) If infinitely many null steps happen in Algorithm 1, then by Lemma 6 and the second item of Corollary 1, the conclusion holds. □
The case of infinitely many serious points generated by Algorithm 1 is considered in the next lemma. For notational convenience, we denote by $K$ the subset of iterations at which serious points are declared. Let $x ^ k$ and $x ^ k *$ be two successive serious points.
Lemma 7.
Suppose that an infinite sequence of serious steps is generated by Algorithm 1 and that Assumption 1 and $η ¯ ≥ ρ ¯$ hold. Then, $V n → 0$ as $K ∋ n → ∞$.
Proof.
Since the serious points satisfy the descent condition (25), applying the descent inequality to two successive serious points $x ^ k$ and $x ^ k *$, we have
$f ^ k ¯ + h ( x ^ k ) − f ^ k ¯ * − h ( x ^ k * ) − η n k 2 ∥ x ^ k * − x ^ k ∥ 2 ≥ m 1 δ n k * .$
Rewriting the above inequality, we have
$f ^ k ¯ + h ( x ^ k ) − f ^ k ¯ * − h ( x ^ k * ) ≥ m 1 δ n k * + η n k 2 ∥ x ^ k * − x ^ k ∥ 2 > 0 .$
Then, the sequence ${ f ^ k ¯ + h ( x ^ k ) }$ is strictly decreasing. Summing this inequality over all serious steps, we deduce that
$1 2 ∑ k ∈ K η k ∥ x ^ k * − x ^ k ∥ 2 + m 1 ∑ k ∈ K δ k < ∞ , with δ k > 0 .$
Hence, the above inequality implies $δ k → 0$. Since (22) holds, by (24) we have $V k → 0$ as $K ∋ k → ∞$. □
Theorem 2.
Suppose Algorithm 1 loops forever, an infinite number of serious steps occurs, and $η ¯ ≥ ρ ¯$ holds. Then, any accumulation point $x ^ a c c$ of the serious point sequence ${ x ^ k } k ∈ K$ is an approximate solution of problem (13) with
Proof.
The conclusion follows from Lemma 5 and Lemma 7. □

## 5. Numerical Results

In this section, we consider two Ferrier polynomial functions (see [10,15,16]) and several DC (difference of convex) functions (see [41,42,43,44]). The section is divided into three parts. Algorithm 1 is coded in MATLAB R2016 and run on a PC with a 2.10 GHz CPU. The quadratic programming solver used for Algorithm 1 in this paper is quadprog.m, available in the MATLAB Optimization Toolbox. Note that nothing in the method depends on this particular choice: any quadratic programming solver can be used.
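Before presenting the test problems, it may help to see the serious/null-step mechanism in miniature. The following Python sketch (not the authors' MATLAB code) runs a generic proximal bundle iteration on the toy convex function $f ( x ) = | x |$ in one dimension: the descent test with parameter m1 mirrors the test of Algorithm 1, while a brute-force grid search stands in for the quadprog QP solve, and the convexification and noise management steps are omitted since this $f$ is convex and evaluated exactly. All names and constants here are illustrative.

```python
import numpy as np

def f(x):
    return abs(x)

def subgrad(x):
    return 1.0 if x >= 0 else -1.0

def prox_bundle_1d(x0, mu=1.0, m1=0.01, tol=1e-6, max_iter=50):
    """Generic proximal bundle loop with serious/null steps only."""
    xhat = x0                                  # proximal center (serious point)
    bundle = [(x0, f(x0), subgrad(x0))]        # triples (point, value, subgradient)
    for _ in range(max_iter):
        # cutting-plane model plus prox term, minimized on a fine grid
        # (a QP solver such as quadprog would be used in practice)
        grid = np.linspace(xhat - 5.0, xhat + 5.0, 100001)
        model = np.max([fv + g * (grid - xi) for xi, fv, g in bundle], axis=0)
        xnew = grid[np.argmin(model + 0.5 * mu * (grid - xhat) ** 2)]
        phi = max(fv + g * (xnew - xi) for xi, fv, g in bundle)
        delta = f(xhat) - (phi + 0.5 * mu * (xnew - xhat) ** 2)  # predicted decrease
        if delta <= tol:                       # approximate stationarity: stop
            return xhat
        if f(xhat) - f(xnew) >= m1 * delta:    # descent test passed: serious step
            xhat = xnew
        bundle.append((xnew, f(xnew), subgrad(xnew)))  # otherwise: null step, enrich model
    return xhat
```

Starting from $x 0 = 1$, the sketch reaches the minimizer 0 after a handful of serious and null steps.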

#### 5.1. Two Polynomial Functions

In this subsection, we first present two polynomial functions which are in the form of the objective function (1). The two polynomial functions have the following forms:
where $ω i ( x ) : R N → R$ is defined by $ω i ( x ) = ( i x i 2 − 2 x i ) + ∑ j = 1 n x j$, where ∀$x ∈ R N$ and for each $i = 1 , ⋯ , N$. It is clear that the above functions are nonconvex, nonsmooth, lower-$C 2$ and have $0$ as their global minimizer. If we denote and $f ( x ) = ∥ x ∥ 2 / 2$ or $f ( x ) = ∥ x ∥ / 2$, the above functions are clear in the form (1). In the following, we adopt initial points $x 0 = [ 1 , 1 , … , 1 ]$ and consider the case $N ∈ { 1 , 2 , … , 19 , 20 }$. The parameters in this subsection are set as follows: $m 1 = 0.01$, $κ = 0.9$, $R 0 = 10$, $M 0 = 10$, $τ = 2$, $γ = 2$, $μ m a x = 10 20$ and $T o l = 10 − 6$. We also stop the algorithm when the iterative number is over 1000. First, we present the numerical results in Table 1 and Table 2 for the case $θ ¯ = 0$ and $ε ¯ = 0$, that is the exact case, and compare them with the results in [16]. We call the algorithm in [16] as the RedistProx algorithm. Meanwhile, we adopt $I n = { 0 , 1 , 2 , ⋯ }$. Note that in the exact case, we stop the progress when $δ k ≤ T o l$ occurs, which is the same in [16]. In exact case, the linearization error $e f , i k$ and $e ϕ , i k$ are nonnegative, so the noise attenuation steps never happen. Then, in the numerical results for exact cases, the NNA is always zero, and we omit the NNA in the Table 1 and Table 2. The columns of Tables have the following meanings: Dim: the the tested problem dimension, NS: the number of serious steps, NNA: the number of noise attenuation steps, NF: the number of oracle function evaluations used, fk: the minimal function value found, $δ k$: the value of $δ k$ at the final iteration, $ψ *$: the optimal function found, $V k$: the value of $V n$ at the final iteration, RN: the number of restart steps, Nu: the number of null steps.
From Table 1 and Table 2, in most cases our algorithm attains higher accuracy and compares favorably with the RedistProx algorithm in [16]. For $N ∈ { 11 , 12 , 13 , 14 }$, we adopt a larger initial model prox-parameter $μ 0$ (a smaller step length), $R 0 = 1000$, while $T o l$ and the other parameters remain unchanged. For $N ∈ { 15 , 16 , 17 , 18 , 19 , 20 }$, we take $T o l = 10 − 5$, $R 0 = 1000$ and keep the other parameters unchanged. The numerical results for $ψ 1 ( x )$ and $ψ 2 ( x )$ are reported in Table 3.
From Table 3, Algorithm 1 solves the two Ferrier polynomial functions in higher dimensions successfully with reasonable accuracy. The parameters $μ$ and $η$ eventually remain unchanged in the exact case, as illustrated by Figure 1.
Next, we turn to inexact data and consider random noise in the function values and subgradients. We introduce two kinds of random noise in the MATLAB code. The first case is $θ j = 0.01 ∗ n o r m r n d ( 0 , 0.1 )$ and $ε j = 0.01 ∗ n o r m r n d ( 0 , 0.1 , 1 , d i m )$. The call $n o r m r n d ( 0 , 0.1 , 1 , d i m )$ generates random numbers from the normal distribution with mean 0 and standard deviation 0.1; the scalars 1 and $d i m$ are the row and column dimensions. We take $m 1 = 0.01$, $κ = 0.9$, $γ = 2$, $τ = 2$, $M 0 = 5$ and $R 0 = 1000$ in this random error case. The algorithm stops when $V k ≤ T o l$ holds or the number of function evaluations exceeds 1000. The numerical results for this case are reported in Table 4.
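For readers reproducing the experiment outside MATLAB, the two noise injections above can be mimicked with NumPy; this is a hedged equivalent of the normrnd calls, with an arbitrarily chosen seed and dimension.

```python
import numpy as np

rng = np.random.default_rng(12345)  # seed chosen arbitrarily for reproducibility
dim = 19                            # problem dimension, e.g. N = 19

# scalar noise on the function value: 0.01 * normrnd(0, 0.1)
theta_j = 0.01 * rng.normal(0.0, 0.1)
# 1-by-dim noise on the subgradient: 0.01 * normrnd(0, 0.1, 1, dim)
eps_j = 0.01 * rng.normal(0.0, 0.1, size=(1, dim))
```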
From Table 4, Algorithm 1 solves $ψ 1 ( x )$ and $ψ 2 ( x )$ successfully with reasonable accuracy under random errors. We also track the parameters $η$ and $μ$ during the run of Algorithm 1. Although the convexification parameter $η n$ eventually remains unchanged, the update of the proximal parameter $μ n$ is more involved: when a noise management step occurs, $μ n$ is decreased to reduce the impact of the noise, and when the unacceptability condition is triggered, $μ n$ is increased to obtain a smaller step length. Figure 2 shows the variation of $η$ and $μ$ along NF for $ψ 1 ( x )$ with $N = 19$ in the normal random error case.
Next, we consider the error case $θ j = 0.01 ∗ u n i f r n d ( 0 , 1 )$ and $ε j = 0.01 ∗ u n i f r n d$$( 0 , 1 , 1 , d i m )$. The call $u n i f r n d ( 0 , 1 , 1 , d i m )$ is analogous to the ‘normrnd’ case, drawing from the uniform distribution on $[ 0 , 1 ]$. In this case, we adopt two $T o l$ values and two initial proximal parameter values $R 0$ for different dimensions. Concretely, we take $m 1 = 0.01$, $κ = 0.9$, $τ = 2$, $γ = 2$, $M 0 = 5$, $R 0 = 20$ and $T o l = 10 − 6$ for $N ∈ { 1 , 2 , ⋯ , 8 }$. For $N ∈ { 9 , 10 , ⋯ , 20 }$, we take $R 0 = 200$, $T o l = 10 − 4$ and keep the other parameters unchanged. We again cap the number of function evaluations at 1000: the algorithm stops when $V k ≤ T o l$ holds or the number of function evaluations exceeds 1000. The numerical results for this error case are reported in Table 5.
From Table 5, Algorithm 1 solves $ψ 1 ( x )$ and $ψ 2 ( x )$ with reasonable accuracy in the ‘unifrnd’ random error case. For this inexact case, we also illustrate the variation of $η n$ and $μ n$ in Figure 3 and Figure 4. The parameter $η n$ is eventually stable. Although the variation of the proximal parameter $μ n$ is complicated in the inexact case, the numerical testing shows that the assumed upper bound on $μ n$ is reasonable.

#### 5.2. Noise’s Impact on Solution Accuracy

Errors come in different types. To analyze the impact of different noise types, we test five different inexact oracles:
• NNE (no noise error): $θ ¯ = ε ¯ = 0$ and $θ i = ε i = 0$ for all i throughout the iterations;
• CNE (constant noise error): $θ ¯ = ε ¯ = θ i = ε i = 0.01$ for all i;
• VNE (vanishing noise error): $θ ¯ = ε ¯ = 0.01$ and $θ i = ε i = min { 0.01 , ∥ x i ∥ / 100 }$ for all i;
• CGNE (constant subgradient noise error): $θ ¯ = θ i = 0$ and $ε ¯ = ε i = 0.01$ for all i;
• VGNE (vanishing subgradient noise error): $θ ¯ = θ i = 0$, $ε ¯ = 0.01$ and $ε i = min { 0.01 , ∥ x i ∥ / 100 }$ for all i.
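A compact way to organize the five regimes is a helper that returns the noise bounds used at each iterate. The sketch below is illustrative: the text specifies only the bounds $θ i$ and $ε i$, not the sign or distribution of the injected perturbations, so only the bounds are computed.

```python
import numpy as np

def noise_bounds(x, case):
    """Per-iterate noise bounds (theta_i, eps_i) for the five oracle types
    of Section 5.2; x is the current iterate."""
    v = min(0.01, np.linalg.norm(x) / 100)  # vanishing level ||x_i||/100, capped at 0.01
    table = {
        "NNE":  (0.0,  0.0),   # no noise
        "CNE":  (0.01, 0.01),  # constant value and subgradient noise
        "VNE":  (v,    v),     # vanishing value and subgradient noise
        "CGNE": (0.0,  0.01),  # constant subgradient noise only
        "VGNE": (0.0,  v),     # vanishing subgradient noise only
    }
    return table[case]
```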
In this numerical experiment, the parameters are the same as in the ‘unifrnd’ error case. We present the numerical results for the no-noise-error case (exact values) in Table 6. In this test, for $N = 2$, we take $T o l = 10 − 5$. In the exact case, the number of NNA steps is always 0, so we omit the NNA columns in Table 6.
Next, we present the numerical results for the constant noise error case in Table 7. The parameters are the same as in the NNE case except for $ψ 1 ( x )$ with $N = 2 , 18$. For $ψ 1 ( x )$ with $N = 2$, we take $T o l = 10 − 1$, $R 0 = 200$ and keep the other parameters unchanged. For $ψ 1 ( x )$ with $N = 18$, we take $T o l = 10 0$, $R 0 = 200$ and keep the other parameters unchanged.
Next, Table 8 presents the results for the vanishing noise error case. The parameters remain unchanged except for $ψ 1 ( x )$ with $N = 2$ and $ψ 2 ( x )$ with $N = 8$. For $ψ 1 ( x )$ with $N = 2$, we take $T o l = 10 − 1$ and keep the other parameters unchanged. For $ψ 2 ( x )$ with $N = 8$, we take $R 0 = 10$ and keep the other parameters unchanged. The results for $ψ 2 ( x )$ with $N = 19$ are also worth noting.
Table 9 presents the results for the constant subgradient noise error case (CGNE). The parameters remain unchanged except for $ψ 1 ( x )$ with $N = 18$, where we take $T o l = 10 0$ and $R 0 = 500$. Table 10 presents the results for the vanishing subgradient noise error case (VGNE). The parameters remain unchanged except for $ψ 1 ( x )$ with $N = 2 , 7 , 14 , 18$, where we take $T o l = 10 − 1$, $R 0 = 200$ and keep the other parameters unchanged.
Next, we compare the numerical performance under the different noise types. For the comparison, we adopt the formula Precision $= | l o g 10 ( | f k | ) |$ and regard the NNE case as a benchmark. The constant noise cases (CNE and CGNE) and the exact case (NNE) are compared in Figure 5. Clearly, the exact case performs best, and Algorithm 1 achieves reasonable accuracy in the constant noise cases. Meanwhile, the performance in the CGNE case is better than that in the CNE case. Similarly, Figure 6 reports the numerical performance for the vanishing cases (VNE and VGNE) and the exact case (NNE). From Figure 6, the performance in the VGNE case is comparable with that of the exact (NNE) case. Meanwhile, the performance in the vanishing error cases is generally better than that in the constant error cases.

#### 5.3. Application to Some DC Problems

In this subsection, we test some unconstrained DC examples to illustrate the effectiveness of Algorithm 1. These examples come from [42,43,44]. A DC function has the form $ψ ( x ) = f ( x ) − g ( x )$. Taking $h ( x ) = − g ( x )$, these problems are of the form (1).
Problem 1.
Dimension: $N = 2$,
Component functions: ; ,
Relevant information: $x 0 = ( 2 , 5 ) T$, $x * = ( 1 , 1 ) T$, $ψ * = 0$.
Problem 2.
Dimension: $N = 4$,
Component functions: ; ,
Relevant information: $x 0 = ( 1 , 3 , 3 , 1 ) T$, $x * = ( 1 , 1 , 1 , 1 ) T$, $ψ * = 0$.
Problem 3.
Dimension: $N = 2 , 5 , 10$,
Component functions: $f ( x ) = N max { | x i | : i = 1 , ⋯ , N }$, $g ( x ) = ∑ 1 N | x i |$,
Relevant information: $x 0 = ( i , i = 1 , ⋯ , ⌊ N / 2 ⌋ , − i , i = ⌊ N / 2 ⌋ + 1 , ⋯ , N ) T$, $x * = ( x 1 * , ⋯ , x N * ) T$, $x i * = a$ or $x i * = − a$, $a ∈ R$, $i = 1 , ⋯ , N$, $ψ * = 0$.
Problem 4.
Dimension: $N = 4$,
Component functions: $f ( x ) = x 1 2 + ( x 1 − 1 ) 2 + 2 ( x 1 − 2 ) 2 + ( x 1 − 3 ) 2 + 2 x 2 2 + ( x 2 − 1 ) 2 + 2 ( x 2 − 2 ) 2 + x 3 2 + ( x 3 − 1 ) 2 + 2 ( x 3 − 2 ) 2 + ( x 3 − 3 ) 2 + 2 x 4 2 + ( x 4 − 1 ) 2 + 2 ( x 4 − 2 ) 2$;
$g ( x ) = max { ( x 1 − 2 ) 2 + x 2 2 , ( x 3 − 2 ) 2 + x 4 2 } + max { ( x 1 − 2 ) 2 + ( x 2 − 1 ) 2 , ( x 3 − 2 ) 2 + ( x 4 − 1 ) 2 } + max { ( x 1 − 3 ) 2 + x 2 2 , ( x 3 − 3 ) 2 + x 4 2 } + max { x 1 2 + ( x 2 − 2 ) 2 , x 3 2 + ( x 4 − 2 ) 2 } + max { ( x 1 − 1 ) 2 + ( x 2 − 2 ) 2 , ( x 3 − 1 ) 2 + ( x 4 − 2 ) 2 } ,$
Relevant information: $x 0 = ( 3 , 1 , 3 , 1 ) T$, $x * = ( 7 / 3 , 1 / 3 , 0.5 , 2 ) T$, $ψ * = 11 / 6$;
Problem 5.
Dimension: $N = 2 , 5$,
Component functions: $f ( x ) = 20 max { | ∑ i = 1 N ( x i − x i * ) t j i − 1 | : j = 1 , 2 , ⋯ , 20 }$,
$g ( x ) = ∑ j = 1 20 | ∑ i = 1 N ( x i − x i * ) t j i − 1 | , t j = 0.05 j , j = 1 , 2 , ⋯ , 20 ,$
Relevant information: $x 0 = ( 1 / N , 0 , ⋯ , 0 ) T$, $x * = ( 1 / N , 1 / N , ⋯ , 1 / N ) T$, $ψ * = 0$.
Problem 6.
Dimension: $N = 2 , 4$,
Component functions: $f ( x ) = ∑ j = 1 100 | ∑ i = 1 N ( x i − x i * ) t j i − 1 | , t j = 0.01 j , j = 1 , 2 , ⋯ , 100 ,$
$g ( x ) = max { | ∑ i = 1 N ( x i − x i * ) t j i − 1 | : j = 1 , 2 , ⋯ , 100 }$
Relevant information: $x 0 = ( 0 , 0 , ⋯ , 0 ) T$, $x * = ( 1 / N , 1 / N , ⋯ , 1 / N ) T$, $ψ * = 0$.
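Among the test problems above, Problem 3 is fully specified, which makes it convenient for a quick sanity check. The Python sketch below (illustrative, not the authors' test code) evaluates $ψ ( x ) = f ( x ) − g ( x )$ and confirms that any point whose components share the same absolute value $a$ attains the optimal value $ψ * = 0$.

```python
import numpy as np

def psi3(x):
    """Problem 3: f(x) = N * max_i |x_i|, g(x) = sum_i |x_i|, psi = f - g."""
    x = np.asarray(x, dtype=float)
    return x.size * np.max(np.abs(x)) - np.sum(np.abs(x))

def x0_problem3(N):
    """Starting point (1, ..., floor(N/2), -(floor(N/2)+1), ..., -N)."""
    i = np.arange(1, N + 1, dtype=float)
    return np.where(i <= N // 2, i, -i)

print(psi3([2.0, -2.0]))     # equal magnitudes give the optimal value psi = 0
print(psi3(x0_problem3(2)))  # starting point (1, -2): psi = 2*2 - 3 = 1
```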
To assess the effectiveness of Algorithm 1, we compare it with the TCM algorithm, the NCVX algorithm and the penalty NCVX algorithm in [42]. The parameter values in Algorithm 1 are: $m 1 = 0.01$, $κ = 0.9$, $τ = 2$, $γ = 2$, $M 0 = 5$, $R 0 = 10$ and $T o l = 10 − 3$. The results are given in Table 11, where ∗ means the obtained value is not optimal. From Table 11, Algorithm 1 successfully solves all of these DC problems, whereas the TCM algorithm cannot solve Problem 4, the NCVX algorithm cannot solve Problems 1 and 4, and the penalty NCVX algorithm cannot solve Problem 1. Hence, Algorithm 1 is reliable. Judging from the obtained function values and the number of function evaluations, Algorithm 1 is also effective.
For the above DC problems, we consider the vanishing noise error (VNE) case and the exact (NNE) case. We again cap the number of function evaluations at 1000; the algorithm stops when $V k ≤ T o l$ holds or the number of function evaluations exceeds 1000. For the vanishing noise case, we set $θ i = min { 0.01 , ∥ x − x * ∥ / 100 }$ and $ε i = min { 0.01 , ∥ x − x * ∥ / 100 }$ except for Problem 3. In Problem 3, the optimal solutions vary with the dimension, so we set $θ i = min { 0.01 , ∥ x ∥ 2 / 100 }$ and $ε i = min { 0.01 , ∥ x ∥ 2 / 100 }$. Table 12 presents the results for the VNE and NNE cases; the column $P r$ in Table 12 denotes the problem index. We also compute the Precision; however, the formula above is not suitable when the optimal value is nonzero. To handle this, we take $a k = ( f k − f * ) / f *$ and Precision $= | l o g 10 ( | a k | ) |$. The numerical results are reported as follows.
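For reference, both accuracy measures can be written in a few lines; the helper names below are illustrative.

```python
import math

def precision_abs(fk):
    """Section 5.2 measure for problems with optimal value 0: |log10(|f_k|)|."""
    return abs(math.log10(abs(fk)))

def precision_rel(fk, fstar):
    """Section 5.3 measure for fstar != 0: a_k = (f_k - f*)/f*, |log10(|a_k|)|."""
    return abs(math.log10(abs((fk - fstar) / fstar)))

print(precision_abs(1e-7))                       # roughly 7 correct digits
print(precision_rel(11/6 + 11/6 * 1e-3, 11/6))   # about 3 digits of relative accuracy
```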
From Table 12, Algorithm 1 successfully solves the above DC problems with high precision and handles the VNE case with reasonable accuracy. Hence, Algorithm 1 is effective and reliable on these DC problems. During the numerical experiments, we also track the variation of the parameters $η$ and $μ$: both are bounded and $η$ eventually remains unchanged, as illustrated in Figure 7 and Figure 8.

## 6. Conclusions

In this paper, we consider a special class of nonconvex and nonsmooth composite problems. The problem is the sum of two functions: one is a finite-valued convex function with inexact information and the other is a nonconvex (lower-$C 2$) function. For the nonconvex function, we utilize the convexification technique and adjust the parameter dynamically to keep the linearization errors of the augment function nonnegative, and we construct the corresponding cutting plane models. We then regard the sum of the convex function and the augment function as an approximate function. For the convex function with inexact information, we construct the cutting plane model from its inexact data, noting that this model may not lie below the convex function. The sum of the cutting plane models of the convex function and the augment function is then regarded as a cutting plane model of the approximate function, from which we design an adaptive proximal bundle method. Meanwhile, for the convex function with inexact information, we utilize the noise management strategy and adaptively update the proximal parameter to reduce the influence of inexact information. Two polynomial functions with five different inexact oracle types and six DC problems of various dimensions are treated in the numerical experiments. The preliminary numerical results show that our algorithm is effective and reliable. In future work, our method may also be applied to constrained problems and stochastic programming.

## Author Contributions

Conceptualization, X.W. and L.P.; methodology, X.W.; software, X.W., Q.W. and M.Z.; validation, X.W., L.P. and M.Z.; formal analysis, X.W. and Q.W. All authors have read and agreed to the published version of the manuscript.

## Funding

This research received no external funding.

## Institutional Review Board Statement

Not applicable.

## Informed Consent Statement

Not applicable.

## Data Availability Statement

The data can be found in the manuscript.

## Acknowledgments

We are greatly indebted to three anonymous referees for many helpful comments.

## Conflicts of Interest

The authors declare no conflict of interest.

## References

1. Condat, L. A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms. J. Optim. Theory Appl. 2013, 158, 460–479. [Google Scholar] [CrossRef] [Green Version]
2. Li, G.Y.; Pong, T.K. Global convergence of splitting methods for nonconvex composite optimization. SIAM J. Optim. 2015, 25, 2434–2460. [Google Scholar] [CrossRef] [Green Version]
3. Hong, M.; Luo, Z.Q. On the linear convergence of the alternating direction method of multipliers. Math. Program. 2017, 162, 165–199. [Google Scholar] [CrossRef] [Green Version]
4. Li, D.; Pang, L.P.; Chen, S. A proximal alternating linearization method for nonconvex optimization problems. Optim. Method Softw. 2014, 29, 771–785. [Google Scholar] [CrossRef]
5. Burke, J.V.; Lewis, A.S.; Overton, M.L. A robust gradient sampling algorithm for nonsmooth, nonconvex optimization. SIAM J. Optim. 2005, 15, 751–779. [Google Scholar] [CrossRef] [Green Version]
6. Kiwiel, K.C. A method of centers with approximate subgradient linearizations for nonsmooth convex optimization. SIAM J. Optim. 2008, 18, 1467–1489. [Google Scholar] [CrossRef]
7. Yuan, G.L.; Meng, Z.H.; Li, Y. A modified Hestenes and Stiefel conjugate gradient algorithm for large-scale nonsmooth minimizations and nonlinear equations. J. Optim. Theory Appl. 2016, 168, 129–152. [Google Scholar] [CrossRef]
8. Yuan, G.L.; Sheng, Z. Nonsmooth Optimization Algorithms; Science Press: Beijing, China, 2017. [Google Scholar]
9. Yuan, G.L.; Wei, Z.X.; Li, G. A modified Polak-Ribière-Polyak conjugate gradient algorithm for nonsmooth convex programs. J. Comput. Appl. Math. 2014, 255, 86–96. [Google Scholar] [CrossRef]
10. Lv, J.; Pang, L.P.; Meng, F.F. A proximal bundle method for constrained nonsmooth nonconvex optimization with inexact information. J. Glob. Optim. 2018, 70, 517–549. [Google Scholar] [CrossRef]
11. Yang, Y.; Pang, L.P.; Ma, X.F.; Shen, J. Constrained nonconvex nonsmooth optimization via proximal bundle method. J. Optim. Theory Appl. 2014, 163, 900–925. [Google Scholar] [CrossRef]
12. Fuduli, A.; Gaudioso, M.; Giallombardo, G. Minimizing nonconvex nonsmooth functions via cutting planes and proximity control. SIAM J. Optim. 2004, 14, 743–756. [Google Scholar] [CrossRef]
13. Sagastizábal, C. Composite proximal bundle method. Math. Program. 2013, 140, 189–233. [Google Scholar] [CrossRef] [Green Version]
14. Mäkelä, M.M. Survey of bundle methods for nonsmooth optimization. Optim. Method Softw. 2002, 17, 1–29. [Google Scholar] [CrossRef]
15. Hare, W.; Sagastizábal, C.; Solodov, M. A proximal bundle method for nonsmooth nonconvex functions with inexact information. Comput. Optim. Appl. 2016, 63, 1–28. [Google Scholar] [CrossRef]
16. Sagastizábal, C.; Hare, W. A redistributed proximal bundle method for nonconvex optimization. SIAM J. Optim. 2010, 20, 2442–2473. [Google Scholar]
17. Kiwiel, K.C. A proximal bundle method with approximate subgradient linearizations. SIAM J. Optim. 2006, 16, 1007–1023. [Google Scholar] [CrossRef] [Green Version]
18. Kiwiel, K.C. A linearization algorithm for nonsmooth minimization. Math. Oper. Res. 1985, 10, 185–194. [Google Scholar] [CrossRef]
19. Tang, C.M.; Liu, S.; Jian, J.B.; Li, J.L. A feasible SQP-GS algorithm for nonconvex, nonsmooth constrained optimization. Numer. Algorithms 2014, 65, 1–22. [Google Scholar] [CrossRef]
20. Tang, C.M.; Jian, J.B. Strongly sub-feasible direction method for constrained optimization problems with nonsmooth objective functions. Eur. J. Oper. Res. 2012, 218, 28–37. [Google Scholar] [CrossRef]
21. Hintermüller, M. A proximal bundle method based on approximate subgradients. Comput. Optim. Appl. 2001, 20, 245–266. [Google Scholar] [CrossRef]
22. Lukšan, L.; Vlček, J. A bundle-Newton method for nonsmooth unconstrained minimization. Math. Program. 1998, 83, 373–391. [Google Scholar] [CrossRef] [Green Version]
23. Solodov, M.V. On approximations with finite precision in bundle methods for nonsmooth optimization. J. Optim. Theory Appl. 2003, 119, 151–165. [Google Scholar] [CrossRef] [Green Version]
24. Kiwiel, K.C. Restricted step and Levenberg-Marquardt techniques in proximal bundle methods for nonconvex nondifferentiable optimization. SIAM J. Optim. 1996, 6, 227–249. [Google Scholar] [CrossRef]
25. Borghetti, A.; Frangioni, A.; Lacalandra, F.; Nucci, C.A. Lagrangian heuristics based on disaggregated bundle methods for hydrothermal unit commitment. IEEE Trans Power Syst. 2003, 18, 313–323. [Google Scholar] [CrossRef]
26. Zhang, Y.; Gatsis, N.; Giannakis, G.B. Disaggregated bundle methods for distributed market clearing in power networks. In Proceedings of the 2013 IEEE Global Conference on Signal and Information Processing, Austin, TX, USA, 3–5 December 2013; pp. 835–838. [Google Scholar]
27. Gao, H.; Lv, J.; Wang, X.L.; Pang, L.P. An alternating linearization bundle method for a class of nonconvex optimization problem with inexact information. J. Ind. Manag. Optim. 2021, 17, 805–825. [Google Scholar] [CrossRef] [Green Version]
28. Goldfarb, D.; Ma, S.; Scheinberg, K. Fast alternating linearization methods for minimizing the sum of two convex functions. Math. Program. 2013, 141, 349–382. [Google Scholar] [CrossRef] [Green Version]
29. Bolte, J.; Sabach, S.; Teboulle, M. Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Program. 2014, 146, 459–494. [Google Scholar] [CrossRef]
30. De Oliveira, W.; Solodov, M. Bundle Methods for Inexact Data; Technical Report; 2018; Available online: http://pages.cs.wisc.edu/~solodov/wlomvs18iBundle.pdf (accessed on 1 April 2021).
31. Fábián, C.I.; Wolf, C.; Koberstein, A.; Suhl, L. Risk-averse optimization in two-stage stochastic models: Computational aspects and a study. SIAM J. Optim. 2015, 25, 28–52. [Google Scholar] [CrossRef]
32. Solodov, M.V.; Zavriev, S.K. Error stability properties of generalized gradient-type algorithms. J. Optim. Theory Appl. 1998, 98, 663–680. [Google Scholar] [CrossRef] [Green Version]
33. De Oliveira, W.; Sagastizábal, C.; Lemaréchal, C. Convex proximal bundle methods in depth: A unified analysis for inexact oracles. Math. Program. 2014, 148, 241–277. [Google Scholar] [CrossRef]
34. Hertlein, L.; Ulbrich, M. An inexact bundle algorithm for nonconvex nonsmooth minimization in Hilbert space. SIAM J. Optim. 2019, 57, 3137–3165. [Google Scholar] [CrossRef]
35. Noll, D. Bundle Method for Non-Convex Minimization with Inexact Subgradients and Function Values. In Computational and Analytical Mathematics; Springer Proceedings in Mathematics; 2013; Volume 50, pp. 555–592. Available online: https://link.springer.com/chapter/10.1007/978-1-4614-7621-4_26 (accessed on 10 March 2021).
36. Rockafellar, R.T.; Wets, R.J.B. Variational Analysis; Springer Science & Business Media: Berlin/Heidelberg, Germany, 1998. [Google Scholar]
37. Hiriart-Urruty, J.B.; Lemaréchal, C. Convex Analysis and Minimization Algorithms; No. 305–306 in Grund. der math. Wiss; Springer: Berlin, Germany, 1993; Volume 2, Available online: https://core.ac.uk/display/44384992 (accessed on 10 March 2021).
38. Hare, W.; Sagastizábal, C. Computing proximal points of nonconvex functions. Math. Program. 2009, 116, 221–258. [Google Scholar] [CrossRef]
39. Bonnans, J.; Gilbert, J.; Lemaréchal, C.; Sagastizábal, C. Numerical Optimization: Theoretical and Practical Aspects, 2nd ed.; Springer: Berlin, Germany, 2006. [Google Scholar]
40. Emiel, G.; Sagastizábal, C. Incremental-like bundle methods with application to energy planning. Comput. Optim. Appl. 2010, 46, 305–332. [Google Scholar] [CrossRef]
41. Fuduli, A.; Gaudioso, M.; Giallombardo, G. A DC piecewise affine model and a bundling technique in nonconvex nonsmooth minimization. Optim. Method Softw. 2004, 19, 89–102. [Google Scholar] [CrossRef]
42. Joki, K.; Bagirov, A.M.; Karmitsa, N.; Mäkelä, M.M. A proximal bundle method for nonsmooth DC optimization utilizing nonconvex cutting planes. J. Glob. Optim. 2017, 68, 501–535. [Google Scholar] [CrossRef]
43. Bagirov, A. A method for minimization of quasidifferentiable functions. Optim. Method Softw. 2002, 17, 31–60. [Google Scholar] [CrossRef]
44. Bagirov, A.M.; Ugon, J. Codifferential method for minimizing nonsmooth DC functions. J. Glob. Optim. 2011, 50, 3–22. [Google Scholar] [CrossRef]
Figure 1. Values of $η$ and $μ$ in function $ψ 1 ( x )$ with $N = 17$ in exact case.
Figure 2. The values of $η$ and $μ$ in function $ψ 1 ( x )$ with $N = 19$ in inexact case.
Figure 3. Values of $η$ and $μ$ in function $ψ 1 ( x )$ with $N = 14$ in unifrnd error case.
Figure 4. Values of $η$ and $μ$ in function $ψ 2 ( x )$ with $N = 6$ in unifrnd error case.
Figure 5. Performance of Algorithm 1 for the NNE, CNE and CGNE cases.
Figure 6. Performance of Algorithm 1 for the NNE, VNE and VGNE cases.
Figure 7. The values of $η$ and $μ$ in Problem 3 with $N = 5$ in the VNE case.
Figure 8. The values of $η$ and $μ$ in Problem 6 with $N = 4$ in the VNE case.
Table 1. The numerical results of Algorithm 1 and RedistProx for $ψ 1 ( x )$ (columns: Dim; NS, NF, $δ k$, fk for Algorithm 1; NF, $δ k$, fk for RedistProx; $ψ *$).
Table 2. The numerical results of Algorithm 1 and RedistProx for $ψ 2 ( x )$ (columns: Dim; NS, NF, $δ k$, fk for Algorithm 1; NF, $δ k$, fk for RedistProx; $ψ *$).
Table 3. The numerical results of Algorithm 1 for $ψ 1 ( x )$ and $ψ 2 ( x )$ (columns: Dim; NS, NF, $δ k$, fk for each function; $ψ *$).
Table 4. The numerical results of Algorithm 1 for $ψ 1 ( x )$ and $ψ 2 ( x )$ in the normal random error case (columns: Dim; Nu, NS, NNA, NF, $V k$, fk for each function).
Table 5. The numerical results of Algorithm 1 for $\psi_1(x)$ and $\psi_2(x)$ in the unifrnd error case.
Algorithm 1 for $\psi_1(x)$ | Algorithm 1 for $\psi_2(x)$
Dim | Nu | NS | NNA | NF | $V_k$ | $f_k$ | Nu | NS | NNA | NF | $V_k$ | $f_k$
14983979.0613−6.47942019229.77554.9060
23171271489.1482−1.875811879999.02237.7941
33111101259.03881.743621183979.5395−7.1758
421179939.48137.107901081929.42332.0590
5110961089.4407−4.5471381021149.01886.8412
64111021189.99767.357031372899.42476.0276
76101121299.23997.523479851029.27231.2366
87161081329.70601.48246131021229.13417.7913
9538811259.33525.26112361081479.05091.4101
10736941389.38082.79202361071469.92702.3615
114361251669.39526.23962341421799.79262.2681
1232392493219.89771.1030234961339.90741.3646
13736931379.02633.85766341031449.77384.7721
1431412313049.67814.2488040831249.64857.3474
152381101519.59996.19327341041469.46304.2409
16537921358.66624.90526341061479.40221.1794
1724462052769.5651−8.02784341161559.94565.7203
1832422172929.43221.41774331071459.08055.2319
1912381231749.77891.15085351261679.20052.8864
2024521181959.83367.62412144851519.35819.4419
Table 6. The numerical results of Algorithm 1 for $\psi_1(x)$ and $\psi_2(x)$ in the NNE case.
Algorithm 1 for $\psi_1(x)$ | Algorithm 1 for $\psi_2(x)$
Dim | Nu | NS | NF | $V_k$ | $f_k$ | Nu | NS | NF | $V_k$ | $f_k$
1561679.42154.58823043.44410.5000
2216422599.65603.7064722304.14911.3116
31919395.19177.2127915255.20488.5245
4821309.64716.1925515212.87176.8710
53423588.26901.31921614314.61467.9665
65226797.64721.44892018399.75122.9449
74329738.24911.2567130321639.89732.6066
8166281956.22958.90616632996.86947.3557
964591249.48502.1974767756.27591.1151
101155676.94911.64401456715.77842.3212
1171521247.33342.9259854639.36011.7999
12180772585.66464.70892857869.44411.4864
1335761127.86022.93281851708.35281.7982
142569956.81301.96052869988.99343.2136
15111882009.18812.639872681416.71256.2161
1677821407.71132.09201960805.42672.4310
17140852266.33214.44852567936.56682.6549
186981398388.17464.14552556827.26422.4928
19971092078.53622.17822411253679.92035.6158
202411003425.84326.6563851162029.33623.3206
Table 7. The numerical results of Algorithm 1 for $\psi_1(x)$ and $\psi_2(x)$ in the CNE case.
Algorithm 1 for $\psi_1(x)$ | Algorithm 1 for $\psi_2(x)$
Dim | Nu | NS | NNA | NF | $V_k$ | $f_k$ | Nu | NS | NNA | NF | $V_k$ | $f_k$
126110745.4925−9.99992020239.29630.4900
2$0 *$22002219.8078−9.1205212334799.6577−9.7816
321362789.9421−9.947211443599.6385−9.9870
421242579.0680−9.325021645649.0853−9.9870
551510319.2336−9.534931238549.9726−3.2581
6162327678.4599−9.9969314981169.5355−5.3995
716652282478.6648−9.995989881069.6388−3.8179
8112735749.6865−9.99024131101289.81784.4799
910560739.7410−9.9957363521198.6710−9.5796
10768401619.6288−9.9981859581269.7235−9.3007
11476901178.3545−9.996385531959.8103−9.9510
1233810504447.3700−9.997974443959.6506−9.9432
13529301466.3101−9.9978649591159.8502−9.1591
141738202567.8317−9.99731250411049.3526−9.7982
152871341349.1862−9.88354269581709.4662−8.7053
16347001057.4213−9.9973956871539.0649−5.9286
171049502006.0123−9.99701650691369.0611−7.4826
18$3 *$620669.93865.1193852741359.3028−1.9240
199310902039.9699−9.997712341041519.33232.8573
20447001155.8482−3.34851976981949.3637−3.0299
Table 8. The numerical results of Algorithm 1 for $\psi_1(x)$ and $\psi_2(x)$ in the VNE case.
Algorithm 1 for $\psi_1(x)$ | Algorithm 1 for $\psi_2(x)$
Dim | Nu | NS | NNA | NF | $V_k$ | $f_k$ | Nu | NS | NNA | NF | $V_k$ | $f_k$
13630672.24113.77731090929.42610.4900
21 $*$270299.85749.620314230384.19762.1736
34180238.82751.38005150214.56756.2853
46200275.75813.595816200373.72274.9713
545320785.64451.009716200379.02821.6974
611220348.88171.018334250606.81766.3339
7111745749.70748.0881713301059.24379.3392
88844111445.77912.44733 $*$161211419.72243.5503
9267291084.97322.439120590807.80922.8594
10728501589.99152.202014540698.77094.8392
11918001729.44441.741821640865.31671.8802
122529733536.29313.9407967801758.77603.1215
13397561219.27344.318617510697.38071.7909
143919854959.61825.730022590829.47833.1236
157510501819.72632.36242908503763.93854.0266
1625550819.29342.949120610827.75002.9207
17628061496.76725.171035570939.92521.7155
1820313883506.57854.549425520786.62011.8421
19889621878.72634.5004464011205673.6082 $*$1.7001
201409122346.50715.50172394631819.33517.6290
Table 9. The numerical results of Algorithm 1 for $\psi_1(x)$ and $\psi_2(x)$ in the CGNE case.
Algorithm 1 for $\psi_1(x)$ | Algorithm 1 for $\psi_2(x)$
Dim | Nu | NS | NNA | NF | $V_k$ | $f_k$ | Nu | NS | NNA | NF | $V_k$ | $f_k$
126110745.49251.34722020239.29630.5000
221531903374.32257.1521212334799.65772.1843
321362789.94215.265011443599.63851.2983
421242579.06806.750421645649.08531.1792
551510319.23364.650631238549.97266.7418
6162327678.45993.1286314981169.53559.4601
716253232397.84022.382389881069.63886.1821
8112735749.68659.80104131101289.81781.4480
910566739.74104.2731363521198.67104.2042
10768401619.62891.9282859581269.72356.9928
11476901178.35463.739685531959.81034.9014
1236310404689.41692.692774443959.65065.6826
13529301466.31042.1522649591159.85028.4093
141738202567.83172.72011250411049.35262.0185
152871341349.18621.16544269581709.46621.2947
16347001057.42132.6541956871539.06494.0714
171049502006.01233.01151650691369.06112.5174
182 $*$770808.02723.3133852741359.30288.0760
197410901849.91722.668312341041519.33232.9573
20447001155.84826.651519761011979.81159.6949
Table 10. The numerical results of Algorithm 1 for $\psi_1(x)$ and $\psi_2(x)$ in the VGNE case.
Algorithm 1 for $\psi_1(x)$ | Algorithm 1 for $\psi_2(x)$
Dim | Nu | NS | NNA | NF | $V_k$ | $f_k$ | Nu | NS | NNA | NF | $V_k$ | $f_k$
13630674.34521.01162020239.29640.5000
227 $*$29803263.33381.1757182341839.76772.7914
36170244.61894.86932172227.63151.3123
42200233.40816.756131538579.41559.1216
532320659.66111.355751645678.07612.2608
616240413.88629.884552158859.24912.4062
76 $*$570642.91991.033382068979.45524.4226
81304101725.86814.3867715931169.86711.0319
9547401298.54122.358285419828.92521.3088
10917901716.47273.478875115747.24761.2106
11797501557.34594.738665436974.89217.0049
121729002634.26173.9532970391198.97731.3272
13467801258.44362.985255040969.49992.1125
148 $*$780876.36577.5245144734969.29847.1209
152038002846.93321.42014980541847.25073.7361
16556301196.59302.25001052701339.25832.3373
17797901598.51364.54432456551367.35766.2449
1847 $*$222162854.4363 $*$2.6364947841419.03352.4064
19548901446.60942.5192308126685036.89941.0272
20979801967.88126.65411979721719.17186.4603
Table 11. The numerical results of Algorithm 1 and the TCM, NCVX, and PNCVX algorithms on Problems 1–5.
Algorithm 1 | TCM | NCVX | PNCVX
Pr | n | Nf | $f_k$ | Nf | $f_k$ | Nf | $f_k$ | Nf | $f_k$ | $\psi^*$
122047.42195268.4453201.0000 $*$481.0000 $*$0
24696.42935173.7104596.8183595.20880
3262.00001254.0552191.280041.25000
35303.71003993.6698252.5204877.17310
310911.41168441.0639346.0636782.20920
44211.83333079.2000 $*$59.2000 $*$341.833311/6
52325.77271491.9922191.1842247.52300
55124.065411747.1262139.6573121.15460
Table 12. The numerical results of Algorithm 1 for the DC problems in the VNE and NNE cases.
Algorithm 1 for VNE Case | Algorithm 1 for NNE Case
Pr | Dim | Nu | NS | NNA | NF | $V_k$ | $f_k$ | Precision | Nu | NS | RS | NF | $V_k$ | $f_k$ | Precision
12219902027.3309−9.87262.0044320012041.82217.42199.1295
248680777.67241.75275.75635632691.55696.42936.1918
3217096.4421−4.97032.301005063.26012.000010.6990
35307932706719.8576−8.75702.05551280302.30673.71225.4304
3106267242982.66371.14800.94011890912.97601.41163.8503
442230268.20231.82342.26611190218.84581.83334.7404
5216086.21981.68643.77306250329.26825.77275.2386
555130198.0435−6.53853.1845290127.56984.06546.3909
6225081.16071.85864.730824271.08405.385714.2688
6420201131541.3749−2.64062.58504112161.31231.93016.7144
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Share and Cite

MDPI and ACS Style

Wang, X.; Pang, L.; Wu, Q.; Zhang, M. An Adaptive Proximal Bundle Method with Inexact Oracles for a Class of Nonconvex and Nonsmooth Composite Optimization. Mathematics 2021, 9, 874. https://doi.org/10.3390/math9080874
