Abstract
The accelerated prox-level (APL) and uniform smoothing level (USL) methods recently proposed by Lan (Math Program, 149: 1–45, 2015) achieve uniformly optimal complexity when solving black-box convex programming (CP) and structured non-smooth CP problems. In this paper, we propose two modified accelerated bundle-level type methods, namely, the modified APL (MAPL) and modified USL (MUSL) methods. Compared with the original APL and USL methods, the MAPL and MUSL methods reduce the number of subproblems by one in each iteration, thereby improving the efficiency of the algorithms. Optimal iteration complexity results for the proposed algorithms are established. Furthermore, the modified methods are applied to two-stage stochastic programming, and numerical experiments are presented to illustrate the advantages of our methods in terms of efficiency and accuracy.
1. Introduction
In the fields of production planning, financial risk, telecommunications, and electricity, decision makers need to take into account uncertainty in both the available information and the model itself. Uncertainty in information arises from lack of data, measurement errors, and unpredictability, among other causes. Model uncertainty derives from the structure of the problem, the nature of the constraints, and the risks and profiles of decisions. Stochastic programming (SP) is an effective tool for dealing with optimization problems under uncertainty. The expectation model in stochastic programming, which maximizes the expected benefit or minimizes the expected loss under expectation constraints, is widely used. In this paper, we are concerned with two-stage stochastic programming with recourse. Next, through a practical problem, network planning with random demand, we introduce the mathematical model of two-stage stochastic programming.
Due to demand for higher bandwidth and dedicated lines, network capacity is becoming a scarce resource. Consider a situation where a network provider plans bandwidth allocation between network links, under a total network capacity b that is available for allocation. In addition, there are n different links whose capacity needs to be expanded. The extra capacity allocated to link j is $x_j$, where $x_j \ge 0$ and the vector $x$ consists of the elements $x_1, \dots, x_n$. In network planning, demands refer to the number of connections requested between point-to-point pairs provided by the network at a certain time. Here, the demand related to the m point-to-point pairs is modeled as a random variable.
Suppose that the extra capacities $x_j$, $j = 1, \dots, n$, are given and the demand D is observed. Then the capacity planning model introduced in [1] is to minimize the total number of unoffered requests. Let $i = 1, \dots, m$ index the point-to-point pairs, and let the paths that can be offered to connections related to point-to-point pair i be collected in a given path set. The current capacity of each link j is given, and the numbers of connections and of unoffered requests related to pair i are the second-stage decision variables. For an observation of the random variable D, one can obtain the optimal decision by solving the linear programming problem as follows
Here, a 0-1 incidence vector is introduced whose j-th component is 1 if link j lies in path p and 0 otherwise. Let $Q(x, d)$ denote the optimal value of the above linear program (1), which depends on the observation d of D and the capacity x. It is obvious that $Q(x, D)$ is also a random variable due to the randomness of D. Thus, the capacities $x_j$, $j = 1, \dots, n$, can be obtained from the following programming problem
Here the expectation is taken with respect to the probability distribution of the random variable D. In a word, the purpose of programming problem (2) is to minimize the expected number of unoffered requests, subject to the constraint that the allocation stays within the total network capacity b. The above optimization model (1) and (2) is an instance of two-stage stochastic programming with recourse, in which (1) is the second-stage problem and (2) is the first-stage problem.
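To make the two-stage structure concrete, the following sketch solves a tiny instance of a second-stage problem in the spirit of (1) as a linear program. All data (two links, two point-to-point pairs, the path incidence, capacities and demands) are hypothetical illustration values, and the variable layout [served flows, unserved requests] is our own choice, not the exact formulation used in [1].

```python
# Toy second-stage problem in the spirit of (1): given extra capacities x and
# an observed demand d, minimize the total number of unserved requests.
# Variables are ordered [f_1..f_m, u_1..u_m]: f_i = flow served on pair i's
# (single) path, u_i = unserved requests of pair i. All data are hypothetical.
import numpy as np
from scipy.optimize import linprog

def second_stage(x, d, cap, paths):
    """Return Q(x, d): optimal number of unserved requests.

    x: extra capacity per link; d: demand per pair; cap: current link capacity;
    paths: 0-1 incidence, paths[i][j] = 1 if pair i's path uses link j."""
    m, n = len(d), len(cap)
    c = np.concatenate([np.zeros(m), np.ones(m)])      # minimize sum_i u_i
    A_eq = np.hstack([np.eye(m), np.eye(m)])           # f_i + u_i = d_i
    # link capacity: sum of f_i over pairs whose path uses link j <= cap_j + x_j
    A_ub = np.hstack([np.asarray(paths, dtype=float).T, np.zeros((n, m))])
    res = linprog(c, A_ub=A_ub, b_ub=cap + x, A_eq=A_eq, b_eq=d,
                  bounds=[(0, None)] * (2 * m))
    return res.fun

paths = [[1, 0], [1, 1]]        # pair 0 uses link 0; pair 1 uses links 0 and 1
cap = np.array([3.0, 2.0])      # current link capacities
d = np.array([4.0, 3.0])        # observed demands
print(second_stage(np.array([0.0, 0.0]), d, cap, paths))  # 4.0 unserved
print(second_stage(np.array([4.0, 1.0]), d, cap, paths))  # 0.0 unserved
```

With no extra capacity, the shared link 0 can carry only 3 of the 7 requested units, so 4 requests go unserved; after expansion, all demand is met.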
From the above practical instance we can see that stochastic programming plays a key role in network planning with random demand. Beyond stochastic programming, the unit commitment problems in the field of energy systems usually involve mixed-integer quadratic programming, and there have been several recent studies on solving such problems [2,3,4]. For solving the programming problems (1) and (2), Sen et al. [1] applied the stochastic decomposition method, which is one of the most efficient methods for stochastic programming. There are many further applications of stochastic programming in other fields, such as insurance and finance, valuation of electricity, telecommunications, hydrothermal power production planning, and pollution control. Stochastic models of these application problems can be found in [5]. In what follows, as an extension of the network planning problem, the standard mathematical model of two-stage stochastic programming with recourse is given, which is more convenient for generalization and theoretical analysis.
The two-stage stochastic programming with fixed recourse is in the following form
where x is the decision variable, c is the cost of production and $Q(x, D)$ denotes the optimal objective value of the second-stage problem
Here, D is the demand vector and T denotes the technology matrix. Let W be the fixed recourse matrix and let D be a random variable with given support of its probability distribution. The expectation in the first-stage problem (3) is taken with respect to D.
The mathematical model of the two-stage stochastic programming has been given, and next we summarize the methods mainly used to solve it. Since two-stage and multi-stage stochastic programming problems have very large dimension and special structure, they can be solved by means of decomposition. In the two-stage model (3) and (4), suppose that the probability space is finite, with the elementary events and their probabilities given. Then the first-stage problem can be rewritten in the form
Here, is the optimal value of the second-stage problem
where the problem data are given by the s-th realization of the random quantities.
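Under the finite-scenario assumption, the expectation in (5) is simply a probability-weighted sum of second-stage optimal values. The sketch below illustrates this with a hypothetical closed-form recourse (a per-unit shortage penalty) standing in for the LP value of (6); all names and data are invented for illustration.

```python
# Finite-scenario version of the first-stage objective (5): the expectation
# reduces to a probability-weighted sum of second-stage optimal values.
# Here Q is a hypothetical closed-form recourse (unit shortage penalty q),
# standing in for the LP value of the subproblem (6).
def Q(x, d, q=2.0):
    """Toy second-stage value: penalty q per unit of unmet demand."""
    return q * max(d - x, 0.0)

def first_stage_objective(x, c, scenarios):
    """f(x) = c*x + sum_s p_s * Q(x, d_s) over a finite scenario set."""
    return c * x + sum(p * Q(x, d) for d, p in scenarios)

scenarios = [(1.0, 0.3), (2.0, 0.5), (4.0, 0.2)]   # (demand d_s, probability p_s)
print(first_stage_objective(1.5, 1.0, scenarios))  # 1.5 + 0.5*1.0 + 0.2*5.0 = 3.0
```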
The main idea of the basic dual decomposition methods is to establish certain approximations to the first-stage problem by means of solving subproblems of the structure (6). As an original decomposition-type method, the cutting-plane method builds a linear model of the objective; its main scheme can be found in [6,7]. Simplicity and low computational cost are advantages of the cutting-plane model. As the number of cuts increases, however, the storage demand grows, which is a typical difficulty of such methods. On the other hand, even starting from a good initial iterate, such methods may take overly long steps. In order to tackle these difficulties, Ruszczynski [8] proposed the regularized decomposition method, which can be considered an improvement of the basic one. Another way to avoid overly long steps is the trust region method, which was extended to two-stage stochastic programming by Linderoth and Wright [9]. In addition, bundle methods, which can be viewed as stabilized variants of the original cutting-plane methods, were developed in [10,11,12]. Algorithmic modifications based on bundle methods were introduced in [13,14]. The bundle level (BL) method was first proposed by Lemaréchal et al. [15] as another kind of bundle method. On this basis, the “restricted-memory” version of the BL method was developed in [16,17,18], which performs well in numerical experiments. In recent years, there has been substantial development of “asynchronous” and “partial” versions of BL methods; see references [19,20,21,22,23]. Considering that research on BL methods had focused only on general non-smooth convex programming (CP) problems, Lan [24] proposed an accelerated BL-type method, namely the accelerated bundle level (ABL) method, and its restricted-memory version, the accelerated prox-level (APL) method.
Benefiting from the multi-step strategy introduced by Nesterov [25] and later applied in [26,27,28,29,30], both the ABL and APL methods are uniformly optimal for solving non-smooth, weakly smooth and smooth CP problems. In addition, by incorporating Nesterov’s smoothing technique [16,17] into the APL method, Lan [24] presented the uniform smoothing level (USL) method for solving structured non-smooth CP problems with optimal iteration complexity. In particular, the USL method does not require any input of problem parameters. Moreover, Lan [24] illustrated that the APL and USL methods, when applied to semidefinite programming and two-stage stochastic programming, have obvious advantages in computation time and accuracy over related gradient-type algorithms and some existing methods.
Our main work in this paper includes several aspects. First of all, on the basis of Lan’s work, we make further improvements and present the modified accelerated prox-level (MAPL) method. By selecting a proper proximal function, the MAPL method only needs to solve one subproblem to update the prox-center and the lower bound simultaneously, which improves the computational efficiency. In addition, the MAPL method achieves uniformly optimal complexity for solving smooth, weakly smooth, and non-smooth CP problems. Furthermore, we extend the MAPL method to structured non-smooth CP problems and present the modified uniform smoothing level (MUSL) method. Finally, we apply the proposed methods to two-stage stochastic programming problems with recourse. The numerical results show that the MAPL and MUSL methods have certain advantages in the number of iterations and computation time.
The present paper is organized as follows. In Section 2, some related work on BL-type methods is reviewed. The MAPL method and its complexity analysis are presented in Section 3. The MUSL method and its complexity analysis are presented in Section 4. The application of the MAPL and MUSL methods to two-stage stochastic programming, together with numerical experiments, is shown in Section 5. Conclusions are presented in the final section.
2. Related Work
This section reviews some related work on BL methods. Specifically, the main ideas of BL methods and related notation are reviewed in Section 2.1. The APL method and its gap reduction procedure introduced in Section 2.2 are the basis of the main work in this paper.
2.1. The Bundle Level (BL) Method
Consider the general CP problem
where the constraint set $X$ is convex and compact, and the objective function $f$ is closed and convex over X. The function f is known only through a first-order oracle which, for a given point $x \in X$, returns the function value $f(x)$ and a subgradient $f'(x) \in \partial f(x)$, where $\partial f(x)$ is the subdifferential of f at x.
Given a sequence of iteration points $x_1, \dots, x_k$, the first-order information $f(x_i)$ and $f'(x_i)$ is provided by the oracle. The cutting-plane approximation of f is generated by
where $h(z, x) = f(z) + \langle f'(z), x - z \rangle$. In addition, $\langle \cdot, \cdot \rangle$ refers to the inner product. The level set of the model $m_k$ with level parameter $\ell_k$ is defined by $X_k = \{ x \in X : m_k(x) \le \ell_k \}$. The BL method [15] generates the next iteration point by
The main steps of the BL method are listed below.
Step 1. Set $\bar f_k = \min_{0 \le i \le k} f(x_i)$ and compute a lower bound $\underline f_k = \min_{x \in X} m_k(x)$;
Step 2. Set the level parameter $\ell_k = \lambda \underline f_k + (1 - \lambda) \bar f_k$ for some $\lambda \in (0, 1)$;
Step 3. Set $X_k = \{ x \in X : m_k(x) \le \ell_k \}$ and generate the new iterate by solving (8).
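The three steps above can be sketched in code. The following is a minimal one-dimensional illustration on the toy objective f(x) = |x − 1| over X = [−3, 3]; in one dimension the level set is an interval, so the projection subproblem (8) reduces to clipping, and the model minimization is approximated on a grid. This is a didactic sketch under invented data, not the implementation studied in the paper.

```python
# Minimal 1-D sketch of the BL steps on f(x) = |x - 1| over X = [-3, 3].
# Each cut f(x_i) + g_i*(x - x_i) <= level is a half-line in 1-D, so the
# level set X_k is an interval and projection (8) reduces to clipping; the
# model minimum (lower bound) is approximated on a fine grid.
def f(x): return abs(x - 1.0)
def g(x): return 1.0 if x >= 1.0 else -1.0          # a subgradient of f

def bundle_level(x0, lam=0.5, tol=1e-6, max_iter=200, X=(-3.0, 3.0)):
    cuts, x, ub = [], x0, f(x0)
    for _ in range(max_iter):
        cuts.append((f(x), g(x), x))                # add the new cut at x
        ub = min(ub, f(x))                          # Step 1: upper bound
        model = lambda y: max(fv + gv * (y - xv) for fv, gv, xv in cuts)
        grid = [X[0] + i * (X[1] - X[0]) / 2000 for i in range(2001)]
        lb = min(model(y) for y in grid)            # Step 1: lower bound
        if ub - lb <= tol:
            break
        level = lam * lb + (1 - lam) * ub           # Step 2: level parameter
        lo, hi = X                                  # Step 3: level set interval
        for fv, gv, xv in cuts:
            if gv > 0:   hi = min(hi, xv + (level - fv) / gv)
            elif gv < 0: lo = max(lo, xv + (level - fv) / gv)
        x = min(max(x, lo), hi)                     # project x onto the level set
    return x, ub

x_star, val = bundle_level(x0=-2.0)
print(x_star, val)   # both close to the minimizer 1 and minimum value 0
```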
2.2. The Accelerated Prox-Level (APL) Method and Its Gap Reduction Procedure
In this subsection we consider the CP problem (7), where f satisfies the following inequality
for some $M > 0$ and $\rho \in [0, 1]$. It is easy to show that non-smooth ($\rho = 0$), smooth ($\rho = 1$) and weakly smooth ($\rho \in (0, 1)$) CP problems are contained in this family of problems. Lan [24] generalized the BL method to the accelerated bundle level (ABL) and accelerated prox-level (APL) methods, such that they achieve uniformly optimal complexity bounds for functions satisfying (9). Here, we mainly introduce the APL method and its principle.
Firstly, Lan [24] introduced three related iteration sequences, used to establish the cutting-plane approximation, to generate the upper bounds, and to update the prox-center, respectively; the search points are updated as convex combinations of previous iterates with stepsizes $\alpha_k$. Secondly, Lan introduced an internal procedure to reduce the gap between the upper and lower bounds on the optimal value. The algorithmic framework of the APL method is as follows (Algorithm 1).
| Algorithm 1: The APL method. |
Input: Choose the initial point, stopping tolerance and parameters.
Step 0: (Initialization) Set the initial lower bound, upper bound and gap. Let k = 1.
Step 1: (Stopping test) If the current gap does not exceed the tolerance, terminate.
Step 2: (Call procedure) Call the gap reduction procedure to obtain a new search point and new bounds.
Step 3: (Loop) Set k = k + 1, and return to Step 1.
Next, we focus on describing the gap reduction procedure . Denote the level set . For a given iterate z, denote
Then, it can be verified that
This implies that a lower bound of . However, the problem (10) is difficult to solve in general. To overcome the difficulty and obtain the lower bound in a convenient way, Lan used a compact and convex set to replace in problem (10). Thus, one can solve the relaxation of (10)
Here, the set is called the localizer of the level set satisfying . Then we obtain a lower bound on as follow
Indeed, as shown in [24], if , then for all . If , we have that and , and therefore for all . Thus, (11) holds.
Moreover, in order to make better use of the structure of the feasible set X, similar to the NERML algorithm in [16,17], Lan introduced a prox-function to replace the Euclidean distance function. Here, a function serves as a prox-function of a convex compact set with coefficient $\sigma > 0$ if it is differentiable and strongly convex with coefficient $\sigma$, i.e.
Furthermore, one can redefine the diameter of X with respect to the prox-function by
This leads to the following relation
Furthermore, let
be the prox-function on X and let its minimizer over X be the prox-center. It follows that
and the prox-function is strongly convex with coefficient $\sigma$.
The internal gap reduction procedure of APL method is as follows.
The APL gap reduction procedure:
- Step 0: (Initialization) Set and . In addition, choose and the initial localizer . The prox-function is defined in (15). Also let .
- Step 1: (Update lower bound) Let , and . If , then stop the procedure and output and .
- Step 2: (Update prox-center) Set
- Step 3: (Update upper bound) Set and such that . If , then stop the procedure and output and .
- Step 4: (Update localizer) Choose such that , where
- Step 5: (Loop) Set and return to Step 1.
3. The Modified Accelerated Prox-Level (MAPL) Method
In this section, we propose the modified accelerated prox-level (MAPL) method, which requires only one subproblem to be solved per iteration while achieving the uniformly optimal iteration complexity for solving black-box CP problems. We first present the modified gap reduction procedure, which, from the given input p and lb, generates a new search point and a new lower bound such that the gap between the upper and lower bounds is reduced by a factor $q \in (0, 1)$. Here, the value of q depends on the parameters of the procedure.
The MAPL gap reduction procedure:
- Step 0: (Initialization) Set and . In addition, choose and the initial localizer . The prox-function is defined in (15). Let .
- Step 1: (Update level set) Set
- Step 2: (Update prox-center and lower bound) Set
If , then stop the procedure and output .
- Step 3: (Update upper bound) Set
and . If , then stop the procedure (there is a significant improvement on the upper bound) and output .
- Step 4: (Update localizer) Choose an arbitrary such that , where
- Step 5: (Loop) Set and return to Step 1.
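The key point of Step 2 — a single subproblem that either returns the new prox-center or certifies that ℓ is a valid lower bound — can be illustrated in one dimension, where the localizer cuts define an interval and projection with the Euclidean prox-function reduces to clipping. The function below and its data are hypothetical sketches of this idea, not the paper's subproblem (20).

```python
# 1-D sketch of the merged Step 2: one projection subproblem either returns
# the new prox-center, or detects that the level set is empty, in which case
# the level value itself is a valid lower bound -- no separate LP is solved.
# Cuts, points and levels below are invented illustration data.
def prox_step(prox_center, cuts, level, X):
    """cuts: list of (f_val, subgrad, point). Returns ('center', x_new) if the
    level set is nonempty, else ('lb', level)."""
    lo, hi = X
    for fv, gv, xv in cuts:                    # intersect the cut half-lines
        if gv > 0:
            hi = min(hi, xv + (level - fv) / gv)
        elif gv < 0:
            lo = max(lo, xv + (level - fv) / gv)
        elif fv > level:                       # flat cut lying above the level
            return ('lb', level)
    if lo > hi:                                # empty level set:
        return ('lb', level)                   # level is a new lower bound
    # Euclidean prox-function: projection onto an interval is clipping
    return ('center', min(max(prox_center, lo), hi))

cuts = [(3.0, -1.0, -2.0), (1.0, 1.0, 2.0)]    # two cuts of f(x) = |x - 1|
print(prox_step(0.0, cuts, 0.2, (-3.0, 3.0)))  # nonempty: new prox-center
print(prox_step(0.0, cuts, -0.5, (-3.0, 3.0))) # empty: lb updated to the level
```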
The following are a few remarks on the MAPL gap reduction procedure. Firstly, the upper and lower bounds in Step 0 are obtained from the outer iteration of the MAPL method (described below), and they are fixed throughout the entire progress of the procedure. Furthermore, the level parameter ℓ is also fixed, as a convex combination of the lower and upper bounds; in the original BL methods, by contrast, the level parameter changes at each iteration. Secondly, Step 2 and Step 3 provide two exits. When the procedure stops at Step 2, ℓ is a lower bound on the optimal value; when the procedure stops at Step 3, significant progress has been made on the upper bound, the amount of which depends on the corresponding parameter. Compared with the gap reduction procedure of the APL method, the modified procedure updates the lower bound in an easier way. Indeed, in Step 1 of the APL procedure (see [24]), a linear programming subproblem is first solved in order to determine whether the lower bound lb needs to be updated. In the modified procedure, however, the prox-center and the lower bound lb are merged into one subproblem by selecting an appropriate prox-function. While solving the subproblem (20), it can be automatically checked whether the level set is empty. If it is, the lower bound lb is directly updated to ℓ, which avoids solving the linear programming subproblem and reduces the amount of computation. Thirdly, in Step 4, the localizer can be selected arbitrarily subject to the stated inclusion. For convenience, one can directly choose either of the two extreme localizers in (19). However, the number of constraints in the larger localizer increases with k, while the smaller one has only one more constraint than the feasible set X. In practice, we can choose between these two sets, which controls the number of constraints in (19) and reduces the computational cost. Moreover, a proper selection of the stepsize sequence is critical for the procedure to terminate after finitely many iterations and achieve the optimal iteration complexity. Lan proposed a general selection rule for the sequence in [24]. Chen et al. [31] proposed a more concise selection rule, i.e., the sequence satisfies the following conditions:
for any and some . Two examples for are as follows [31]:
(1) with ;
(2) , with .
The following lemma shows important properties of the procedure. These properties are similar to those in [24,31], and their proofs can be found in Appendix A.
Lemma 1.
The following properties hold for the procedure .
a. is a collection of localizers for the level set ;
b. holds for any ;
c. , and hence for any ;
d. When holds, the problem (20) has a unique solution. In addition, if the procedure stops at Step 2, we have .
e. When the procedure stops, the relation holds, where
Following the theoretical analysis of [24], the following proposition shows that the gap between the upper bound and the level parameter ℓ decreases with k, and proves that, when the procedure stops, the total number of internal iterations does not exceed a certain bound.
Proposition 1.
Proof.
By the definition of in (23) we have for any . This together with shows that
It follows from (19) and (20) that , for all . Moreover, due to the strong convexity of and the optimality condition of subproblem (20), we have
It turns out that
Summing up the above inequalities over k, we obtain that
It then follows that
Subtracting ℓ from both sides of the above inequality and dividing both sides by , from (24), we obtain that
As a result,
Summing up the above inequality over m and from the fact that we have
Applying the Hölder inequality to the above inequality and using (27), we have
From the fact that does not stop at Step 3, we have
This together with (29) shows that
We finally conclude that
that is
□
After giving the relevant properties and complexity analysis of the gap reduction procedure, we are now in a position to introduce the modified accelerated prox-level (MAPL) method, which repeatedly calls the procedure until it finds an approximate solution of the given accuracy. The algorithmic framework of the MAPL method is as follows (Algorithm 2).
| Algorithm 2: The MAPL method. |
Parameters: Choose the stopping tolerance and parameters.
Step 0: (Initialization) Choose an initial point; set the initial lower bound, upper bound and gap. Let s = 1;
Step 1: (Stopping test) If the current gap does not exceed the tolerance, terminate;
Step 2: (Call procedure) Call the gap reduction procedure to obtain a new search point and new bounds;
Step 3: (Loop) Set s = s + 1, and return to Step 1.
Since the gap reduction procedure is called during the progress of the MAPL method, we regard an iteration of the procedure as an iteration of the MAPL method as well. Taking this into consideration, the following theorem establishes the convergence and iteration complexity of the MAPL method. The principle of the proof comes from reference [24].
Theorem 1.
For given , if , in are chosen to satisfy condition (24) with some , then
(1) The number of times the gap reduction procedure is called by the MAPL method can be bounded by
(2) The total number of iterations performed by the MAPL method does not exceed
Proof.
(1) Without loss of generality, we suppose that . According to Step 1 of the MAPL method, (9) and (14), we have
From Lemma 1 and the fact that , we have
Furthermore, suppose that the MAPL method finds an -solution after calling the procedure a certain number of times, i.e., the MAPL method stops at that stage. Then we have
and
It is easy to obtain
that is
(2) Suppose that the procedure has been called times in the MAPL method. Then by (30) and (31) we have
Due to , we know that
This together with Lemma 1 and Proposition 1 shows that the total number of iterations performed by the MAPL method does not exceed
Here, we denote by the number of internal iterations performed by the s-th call of the procedure. □
We present a few remarks on the iteration complexity of the MAPL method.
Remark 1.
According to the classic complexity theory [32] for CP problem (7), the number of iterations to find an ε-solution, i.e., an approximate solution $\bar x \in X$ such that $f(\bar x) - f^* \le \varepsilon$, does not exceed $\mathcal{O}(1/\varepsilon^{2})$ if f is a general non-smooth Lipschitz continuous convex function. For smooth convex optimization, the optimal iteration complexity bound is $\mathcal{O}(1/\sqrt{\varepsilon})$. Furthermore, in case f is weakly smooth and its gradient is Hölder continuous, the optimal iteration complexity is bounded by $\mathcal{O}\big((1/\varepsilon)^{2/(1+3\rho)}\big)$ for some $\rho \in (0,1)$. It follows from Theorem 1 that the iteration complexity bound of the MAPL method matches these bounds.
In other words, the MAPL method achieves the uniformly optimal iteration complexity bounds for solving non-smooth ($\rho = 0$), smooth ($\rho = 1$) and weakly smooth ($\rho \in (0,1)$) CP problems.
4. The Modified Uniform Smoothing Level (MUSL) Method
In this section we consider the objective function f in (7) with the form of
where is Lipschitz continuous and simple. In addition, has a special structure that
Here, the compact convex set is nonempty, is a linear operator and is convex and continuous on Y.
Generally, the function F is convex and non-smooth. In this case, F can be approximated by constructing a series of smooth convex functions [16,17]. Let denote the prox-function of compact convex set Y with coefficient . refers to the prox-center of . Let
The functions and are approximated by and (with some smoothing parameter ), respectively, i.e.,
As described in [17], from the first-order optimality conditions, the convexity of the inner function and the strong convexity of the prox-function, we know that the gradient of the smoothed function is Lipschitz continuous with Lipschitz constant
Furthermore, we have
and
for any .
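As a concrete illustration of these smoothing relations, consider the simplest structured non-smooth function |x| = max over |y| ≤ 1 of xy. With the prox-function ω(y) = y²/2 (strong-convexity coefficient 1), the smoothed function has a closed form (the Huber function), its gradient is Lipschitz with constant 1/η, and the uniform approximation error is at most η/2. This is a minimal sketch with all names our own, not notation from [16,17].

```python
# Smoothing the structured non-smooth function |x| = max_{|y|<=1} x*y.
# With prox-function w(y) = y^2 / 2 (coefficient 1), the smoothed function
# f_eta(x) = max_{|y|<=1} (x*y - eta * y^2 / 2) has a closed form (the Huber
# function); its gradient is Lipschitz with constant 1/eta, and
# 0 <= |x| - f_eta(x) <= eta/2 uniformly, mirroring the relations above.
def f_smooth(x, eta):
    if abs(x) <= eta:
        return x * x / (2 * eta)   # interior maximizer y = x / eta
    return abs(x) - eta / 2        # boundary maximizer y = sign(x)

eta = 0.1
for x in (-1.0, 0.05, 2.0):
    gap = abs(x) - f_smooth(x, eta)
    # 0 <= f - f_eta <= eta/2 (up to floating-point rounding)
    assert -1e-12 <= gap <= eta / 2 + 1e-12
print(f_smooth(2.0, eta))          # the Huber value 2 - eta/2
```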
Inspired by the smoothing technique of [16,17], Lan [24] proposed the uniform smoothing level (USL) method for solving the structured non-smooth CP problem (7) with f defined by (32). The advantage of the USL method is that the smoothing parameter and the related problem estimate can be automatically adjusted and obtained during the gap reduction procedure, which makes the USL method free of problem parameters. However, similar to the APL method, each iteration of the USL method also involves two subproblems. Based on the USL method, and combining the analysis of the MAPL method in the previous section, we propose the modified uniform smoothing level (MUSL) method. As with the USL method, the MUSL method achieves the optimal iteration complexity when solving problem (7) with (32), but only one subproblem needs to be solved in each iteration.
We next describe the gap reduction procedure of the MUSL method. As the internal procedure of the MUSL method, it is called to compress the gap between the upper and lower bounds on the optimal value.
The MUSL gap reduction procedure:
- Step 0: (Initialization) Set and . Let
Choose and the initial localizer . The prox-function is defined in (15). Let .
- Step 1: (Update level set) Let
- Step 2: (Update prox-center and lower bound) Let
If , then stop the procedure and output .
- Step 3: (Update upper bound) Let
and . Check the following two possible stopping rules:
- (3a) if , stop the procedure with .
- (3b) Otherwise, if , stop the procedure with
- Step 4: (Update localizer) Choose an arbitrary such that , where
- Step 5: (Loop) Set and return to Step 1.
The following are a few remarks on the MUSL gap reduction procedure, including its differences from the MAPL procedure and some of its properties. Firstly, compared with the MAPL procedure, this procedure needs one more input parameter, which is used to calculate the smoothing parameter so that the smooth approximation of f can be defined. Secondly, the two procedures approximate the objective function in different ways: the MAPL procedure approximates the objective function f by the linearization of f in (18), while the MUSL procedure only linearizes the smooth component and approximates the function f by (40). According to (40), the convexity of the smoothed function, (36) and (32), we have
which means that the function in (40) is a lower estimate of f. Thirdly, the MUSL procedure has one more exit than the MAPL procedure, with three possible exits: Step 2, Step 3a, and Step 3b. If the procedure stops at Step 2, the lower bound lb is updated to ℓ; if it stops at Step 3a, a significant improvement has been made on the upper bound, which is updated accordingly. In addition, if the procedure stops at Step 3b, it is considered that there is no significant improvement on the upper bound, so the smoothing parameter needs to be adjusted and the corresponding estimate is updated.
Referring to the work of [24,31], the following lemma gives some simple observations on the procedure. Its proof can be found in Appendix A.
Lemma 2.
The following results hold for internal procedure :
a. If the procedure terminates at Step 2 or Step 3a, we have , where ;
b. If the procedure terminates at Step 3b, we have and .
Similar to the previous section, we now establish the convergence results for the procedure. Following the theoretical analysis of [24], the detailed derivation of the iteration complexity is given as follows.
Proposition 2.
Proof.
Suppose the gap reduction procedure does not stop at the K-th iteration. From (34) and inequality (45), we have . Because of (34), (40) and the Lipschitz continuity of the gradient of the smoothed function, we have
Hence,
It then follows that
Subtracting ℓ from both sides of the above inequality and then dividing both sides by , we have
From the fact that the procedure does not stop at Step 3b at the K-th iteration, we obtain . Noticing that , we thus have
In conclusion, we obtain
□
Based on the above convergence results for the gap reduction procedure, we next give the algorithm of the MUSL method. Similar to the MAPL method, the MUSL method is also implemented with an outer algorithmic framework and an internal gap reduction procedure. The outer algorithm of the MUSL method mainly determines whether the gap between the upper and lower bounds has reached the given tolerance at the current iteration. If the given tolerance is reached, the algorithm terminates and outputs an approximate optimal solution of f; otherwise, the outer algorithm continues to call the internal procedure to compress the gap between the upper and lower bounds. The algorithmic framework of the MUSL method is as follows (Algorithm 3).
| Algorithm 3: The MUSL method. |
Step 0: (Input) Choose the initial point, stopping tolerance, initial estimate and parameters.
Step 1: (Initialization) Set the initial lower bound, upper bound and gap. Let s = 1;
Step 2: (Stopping test) If the current gap does not exceed the tolerance, terminate;
Step 3: (Call procedure) Call the gap reduction procedure to obtain a new search point, new bounds and a new estimate;
Step 4: (Loop) Set s = s + 1, and return to Step 2.
We now turn to analyze the optimal complexity bound for the MUSL method. Please note that the following results are modifications of those in [24].
Lemma 3.
Theorem 2.
Suppose that the stepsize sequence in the procedure is chosen to satisfy condition (24), and that . Then for given , the following statements hold for the MUSL method.
(1) the number of non-significant phases can be bounded by
and the number of significant phases can be bounded by
(2) the total number of iterations performed by the MUSL method does not exceed
where , and and are defined in (14).
Proof.
(1) Let . Without loss of generality, we suppose that . From conclusion (2) of Lemma 2, we know that , if a non-significant phase occurs. From and , we have
From (e) in Lemma 1 and the definition of and , we know that and . In conclusion, we obtain that
(2) Let and denote the index sets of the non-significant and significant phases, respectively. For any in the non-significant phases, we have . Then we know that and Due to Proposition 2, we know that K is monotonically decreasing in its first variable and monotonically increasing in its second variable. In addition, we conclude that the total number of iterations performed in the significant phases can be bounded by
From the fact that , we know that
It follows from the monotonicity of K in its variables and Proposition 2 that . In addition, we obtain that the total number of iterations performed in the non-significant phases can be bounded by
In what follows, we apply the MUSL method to a special case in which the component in (32) is a smooth convex function, and the complexity results of the procedure and the MUSL method are established in this special case. Saying that this component is a smooth convex function means that there exists a constant such that it satisfies
From the fact that has a Lipschitz continuous gradient with Lipschitz-constant , we have
Thus, we know that is a smooth function on X and satisfies
where . From the fact that F is non-smooth, we know that f is non-smooth as well.
We make the following changes to the outer algorithm of the MUSL method and the internal procedure. Replace (47) in the outer algorithm with
and (40) in procedure with
The function is fixed as a result of the fact that the smoothing parameter is fixed in the internal procedure. Similar to Proposition 2, we have the following results.
Theorem 3.
Proof.
Suppose that the procedure does not stop in iteration . Because of the features of and , we know that satisfies (52). Let . From (14), we have
Then according to the stopping test in Step 3b in procedure , we have
Thus, we have
□
Here we will also establish the complexity of the MUSL method under this special case where is smooth and convex.
Theorem 4.
Assume that satisfy condition (24) and take . If is a smooth convex function, then the following statements hold for the MUSL method.
(1) the number of non-significant phases can be bounded by
and the number of significant phases can be bounded by
(2) the total number of iterations performed by the MUSL method does not exceed
Proof.
From (e) in Lemma 1 and the definition of and , we know that and . In conclusion, we obtain that
(2) Let and denote the index sets of the non-significant and significant phases, respectively. For any in the non-significant phases, we have . Then we know that and By Proposition 2, the bound is monotonically decreasing in its first variable and monotonically increasing in its second variable. It follows that the total number of iterations performed in the significant phases can be bounded by
From the fact that , we know that
By the monotonicity of the bound in its variables and the fact that , we know that the total number of iterations performed in the non-significant phases can be bounded by
5. Two-Stage Stochastic Programming and Numerical Experiments
In this section, the MAPL and MUSL methods are applied to solve two-stage stochastic programming problems, and the two modified methods are compared with the APL and USL methods. All algorithms are implemented in MATLAB (R2014a), and Mosek (8.0) is called for solving subproblems. The programs are run under Windows 7 (64-bit) on an Intel(R) Core(TM) i7-6700 CPU at 3.40 GHz with 16 GB of memory.
Consider the following two-stage stochastic programming with recourse
where
Here, and are the decision variables of the first-stage problem and the second-stage problem respectively. is a compact convex set, and is a random vector with a known probability distribution and support . In [33], it was pointed out that by strong convexity one has
Let be the feasible set of (58) and assume that . It is easy to see that the recourse function is generally non-smooth. Therefore, for general distributions of , (56) is a non-smooth convex programming problem. In [33], Ahmed applied Nesterov’s smoothing technique to the two-stage stochastic programming problem (56) and established a proper smooth approximation as follows. For a given smoothing parameter , consider the function of the form
where
For a random vector with discrete distribution, it is shown in [34,35] that the smoothed function is differentiable and its gradient is Lipschitz continuous. Furthermore, when the smoothing parameter is sufficiently small, the smoothed function approximates f uniformly. Based on the above analysis, we next apply the MAPL and MUSL methods to problems (59) and (60) to illustrate the effectiveness of the methods and compare them with the APL and USL methods.
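The effect of the smoothing parameter on the expected recourse can be sketched with a toy one-dimensional recourse Q(x, d) = max(d − x, 0) = max over y in [0, 1] of (d − x)y, smoothed by subtracting (η/2)y². This is only an illustration of the construction in (59) and (60) under invented data, not the actual SSN or 20Term recourse.

```python
# Toy smoothed recourse in the spirit of (59)-(60): the non-smooth recourse
# Q(x, d) = max(d - x, 0) = max_{0<=y<=1} (d - x)*y is replaced by its
# smoothed counterpart with prox-term (eta/2)*y^2, and the first-stage
# objective averages the smoothed values over scenarios. Invented data only.
def q_smooth(t, eta):
    """Smoothed max(t, 0): maximize t*y - (eta/2)*y^2 over y in [0, 1]."""
    if t <= 0.0:
        return 0.0                 # maximizer y = 0
    if t >= eta:
        return t - eta / 2         # maximizer y = 1
    return t * t / (2 * eta)       # interior maximizer y = t / eta

def expected_recourse_smooth(x, scenarios, eta):
    """sum_s p_s * Q_eta(x, d_s) over a finite scenario set of (d_s, p_s)."""
    return sum(p * q_smooth(d - x, eta) for d, p in scenarios)

scenarios = [(1.0, 0.5), (3.0, 0.5)]
print(expected_recourse_smooth(2.0, scenarios, 0.5))  # 0.5 * (1 - 0.25) = 0.375
```

The smoothed average is differentiable in x, which is what allows the smooth-oriented steps of the MUSL method to be applied to each scenario subproblem.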
We perform numerical experiments on some existing SP instances from [1,36,37], including a telecommunication network design problem (SSN) and a motor freight carrier routing problem (20Term). The SSN problem, studied by Sen, Doverspike, and Cosares [1], comes from the telecommunications industry: the first-stage problem allocates capacity among network links, and the second-stage problem serves the demands for connections requested between point-to-point pairs. The 20Term problem, studied by Mak, Morton, and Wood [37], comes from a motor freight carrier’s model: the first-stage problem determines a program of carriers, and the second-stage problem adjusts the program according to a multi-commodity network. The instances of SSN, 20Term and Storm are downloaded from http://pwp.gatech.edu/guanghui-lan/computer-codes/. The dimensions of these instances are shown in Table 1. denotes the number of constraints in the i-th stage problem, and denotes the number of variables in the i-th stage problem. In particular, we assume that the number of possible realizations of is fixed, i.e., or 100. In this case, a total of five instances are tested. The integers in brackets are the numbers of possible realizations. For given parameters , tolerance , and stepsize satisfying (24), we compare the number of iterations and the CPU time of the different methods. The results are shown in Table 2, Table 3 and Table 4.
Table 1.
Data about instances.
Table 2.
MAPL and APL methods on SP instances.
Table 3.
MUSL and USL methods on SP instances.
Table 4.
MAPL and MUSL methods on SP instances.
Some observations on the results in Table 2, Table 3 and Table 4 follow. When the initial gap is large (up to or ), the MAPL and APL algorithms are run for 400 iterations. The results in Table 2 show that the MAPL method has certain advantages over the APL method in terms of both CPU time and the number of iterations. The results in Table 3 show that, in addition to its advantage in CPU time, the MUSL algorithm can achieve higher accuracy. From the results in Table 4 we see that, compared with the MUSL algorithm, the MAPL algorithm requires less CPU time and fewer iterations. This is because the MAPL method solves linear programming subproblems, whereas the MUSL method needs to solve N smooth quadratic programming subproblems in the course of the algorithm.
6. Conclusions
In this paper, we presented two modified BL-type methods, the modified accelerated prox-level (MAPL) and modified uniform smoothing level (MUSL) methods, for uniformly solving black-box CP problems and a class of structured non-smooth problems. Both the MAPL and MUSL methods achieve the corresponding optimal iteration complexities. To illustrate the effectiveness of the modified methods, we applied them to the two-stage stochastic programming problem with recourse and carried out numerical experiments. The numerical results show that the MAPL and MUSL methods have certain advantages in terms of algorithm efficiency and solution time.
Author Contributions
C.T. mainly contributed to the algorithm design and convergence analysis; B.H. mainly contributed to the convergence analysis and numerical results; and Z.W. mainly contributed to the algorithm design. All authors have read and agreed to the published version of the manuscript.
Funding
This work was supported by the National Natural Science Foundation (11761013, 71861002) and Guangxi Natural Science Foundation (2018GXNSFFA281007) of China.
Acknowledgments
The authors gratefully acknowledge the support of the funds listed above.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A
Proof of Lemma 1.
(a) We first show part (a) by mathematical induction. It is obvious that . Assume that is a localizer of , i.e., , for all . Then we have , for all . By the definition of h and the convexity of f, we know that , for any . From the facts that and , we have .
(b) It follows from in Step 3 of the procedure that , for all .
(c) From (a) we have . By the optimality condition of (20), we have . Then we obtain , for any . By the definition of , we have , and further , i.e., there exists a set such that . In conclusion, we have , for all .
(d) It follows from the fact that X is a compact convex set and the definition of that is a compact convex set. Together with the strong convexity of , this implies that (20) has a unique optimal solution if . Then we have , for any . Furthermore, we know that .
(e) From (b) we know that . If the procedure stops at Step 2, we have . From the fact that , we have
If the procedure stops at Step 3, we have and . From the fact that , we have
□
Proof of Lemma 2.
(a) The proof of part (a) is the same as that of part (e) of Lemma 1.
(b) If the internal procedure stops at Step 3, we have and . Then we have
Furthermore we know that
Then we have . Together with the definition in Step 3b, we know that . □
References
- Sen, S.; Doverspike, R.D.; Cosares, S. Network planning with random demand. Telecommun. Syst. 1994, 3, 11–30. [Google Scholar] [CrossRef]
- Yang, L.F.; Jian, J.B.; Wang, Y.Y.; Dong, Z.Y. Projected mixed integer programming formulations for unit commitment problem. Int. J. Electr. Power Energy Syst. 2015, 68, 195–202. [Google Scholar] [CrossRef]
- Yang, L.F.; Jian, J.B.; Zhu, Y.N.; Dong, Z.Y. Tight Relaxation Method for Unit Commitment Problem Using Reformulation and Lift-and-Project. IEEE Trans. Power Syst. 2015, 30, 13–23. [Google Scholar] [CrossRef]
- Yang, L.F.; Zhang, C.; Jian, J.B.; Meng, K.; Xu, Y.; Dong, Z.Y. A novel projected two-binary-variable formulation for unit commitment in power systems. Appl. Energy 2017, 187, 732–745. [Google Scholar] [CrossRef]
- Wallace, S.W.; Ziemba, W.T. Applications of Stochastic Programming; Society for Industrial and Applied Mathematics and the Mathematical Programming Society: Philadelphia, PA, USA, 2005. [Google Scholar]
- Kelley, J.E., Jr. The cutting-plane method for solving convex programs. J. Soc. Ind. Appl. Math. 1960, 8, 703–712. [Google Scholar] [CrossRef]
- Veinott, A.F. The Supporting Hyperplane Method for Unimodal Programming. Oper. Res. 1967, 15, 147–152. [Google Scholar] [CrossRef]
- Ruszczynski, A. A regularized decomposition method for minimizing a sum of polyhedral functions. Math. Program. 1986, 35, 309–333. [Google Scholar] [CrossRef]
- Linderoth, J.; Wright, S. Decomposition Algorithms for Stochastic Programming on a Computational Grid. Comput. Optim. Appl. 2003, 24, 207–250. [Google Scholar] [CrossRef]
- Lemaréchal, C. Nonsmooth optimization and descent methods. In Research Report 78-4; IIASA: Laxenburg, Austria, 1978. [Google Scholar]
- Mifflin, R. A modification and an extension of Lemaréchal’s algorithm for nonsmooth minimization. In Nondifferential and Variational Techniques in Optimization; Springer: Berlin, Germany, 1982; pp. 77–90. [Google Scholar]
- Kiwiel, K.C. An aggregate subgradient method for nonsmooth convex minimization. Math. Program. 1983, 27, 320–341. [Google Scholar] [CrossRef]
- Kiwiel, K.C. Proximity control in bundle methods for convex nondifferentiable minimization. Math. Program. 1990, 46, 105–122. [Google Scholar] [CrossRef]
- Ruszczyński, A.; Świȩtanowski, A. Accelerating the regularized decomposition method for two stage stochastic linear problems. Eur. J. Oper. Res. 1997, 101, 328–342. [Google Scholar] [CrossRef]
- Lemaréchal, C.; Nesterov, Y.; Nemirovskii, A. New variants of bundle methods. Math. Program. 1995, 69, 111–147. [Google Scholar] [CrossRef]
- Ben-Tal, A.; Nemirovski, A. Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications; SIAM: Philadelphia, PA, USA, 2001. [Google Scholar]
- Ben-Tal, A.; Nemirovski, A. Non-euclidean restricted memory level method for large-scale convex optimization. Math. Program. 2005, 102, 407–456. [Google Scholar] [CrossRef]
- Richtárik, P. Approximate Level Method for Nonsmooth Convex Minimization. J. Optim. Theory Appl. 2012, 152, 334–350. [Google Scholar] [CrossRef]
- Fischer, F.; Helmberg, C. A parallel bundle framework for asynchronous subspace optimization of nonsmooth convex functions. SIAM J. Optim. 2014, 24, 795–822. [Google Scholar] [CrossRef]
- Kim, K.; Petra, C.G.; Zavala, V.M. An asynchronous bundle-trust-region method for dual decomposition of stochastic mixed-integer programming. SIAM J. Optim. 2019, 29, 318–342. [Google Scholar] [CrossRef]
- Van Ackooij, W.; Frangioni, A. Incremental bundle methods using upper models. SIAM J. Optim. 2018, 28, 379–410. [Google Scholar] [CrossRef]
- Iutzeler, F.; Malick, J.; de Oliveira, W. Asynchronous level bundle methods. Math. Program. 2019, 49, 1–30. [Google Scholar] [CrossRef]
- Tang, C.; Jian, J.; Li, G. A proximal-projection partial bundle method for convex constrained minimax problems. J. Ind. Manag. Optim. 2019, 15, 757–774. [Google Scholar] [CrossRef]
- Lan, G. Bundle-level type methods uniformly optimal for smooth and nonsmooth convex optimization. Math. Program. 2015, 149, 1–45. [Google Scholar] [CrossRef]
- Nesterov, Y. A method for unconstrained convex minimization problem with the rate of convergence O(1/k2). Doklady AN USSR 1983, 269, 543–547. [Google Scholar]
- Auslender, A.; Teboulle, M. Interior gradient and proximal methods for convex and conic optimization. SIAM J. Optim. 2006, 16, 697–725. [Google Scholar] [CrossRef]
- Lan, G.; Lu, Z.; Monteiro, R.D. Primal-dual first-order methods with O(1/ϵ) iteration-complexity for cone programming. Math. Program. 2011, 126, 1–29. [Google Scholar] [CrossRef]
- Lan, G. An optimal method for stochastic composite optimization. Math. Program. 2012, 133, 365–397. [Google Scholar] [CrossRef]
- Nesterov, Y. Introductory Lectures on Convex Optimization a Basic Course; Springer Science & Business Media: New York, NY, USA, 2004; Volume 87. [Google Scholar]
- Nesterov, Y. Smooth minimization of non-smooth functions. Math. Program. 2005, 103, 127–152. [Google Scholar] [CrossRef]
- Chen, Y.; Lan, G.; Ouyang, Y.; Zhang, W. Fast Bundle-Level Type Methods for Unconstrained and Ball-Constrained Convex Optimization. Comput. Optim. Appl. 2019, 73, 159–199. [Google Scholar] [CrossRef]
- Nemirovsky, A.S.; Yudin, D. Problem Complexity and Method Efficiency in Optimization. In Wiley-Interscience Series in Discrete Mathematics; Wiley-Interscience: New York, NY, USA, 1983. [Google Scholar]
- Ahmed, S. Smooth Minimization of Two-Stage Stochastic Linear Programs; Georgia Institute of Technology: Atlanta, GA, USA, 2006. [Google Scholar]
- Chen, X.; Qi, L.; Womersley, R.S. Newton’s method for quadratic stochastic programs with recourse. J. Comput. Appl. Math. 1995, 60, 29–46. [Google Scholar] [CrossRef]
- Chen, X. A parallel BFGS-SQP method for stochastic linear programs. In Computational Techniques and Applications; World Scientific: Princeton, NJ, USA, 1995; pp. 67–74. [Google Scholar]
- Linderoth, J.; Shapiro, A.; Wright, S. The empirical behavior of sampling methods for stochastic programming. Ann. Oper. Res. 2006, 142, 215–241. [Google Scholar] [CrossRef]
- Mak, W.; Morton, D.P.; Wood, R.K. Monte Carlo bounding techniques for determining solution quality in stochastic programs. Oper. Res. Lett. 1999, 24, 47–56. [Google Scholar] [CrossRef]
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).