An Alternating Iteration Algorithm for a Parameter-Dependent Distributionally Robust Optimization Model

Shuang Lin; Jie Zhang; Nan Shi

doi:10.3390/math10071175

¹ Department of Basic Courses Teaching, Dalian Polytechnic University, Dalian 116034, China

² School of Mathematics, Liaoning Normal University, Dalian 116029, China

* Author to whom correspondence should be addressed.

Mathematics 2022, 10(7), 1175; https://doi.org/10.3390/math10071175


Abstract

Based on a successive convex programming method, an alternating iteration algorithm is proposed for solving a parameter-dependent distributionally robust optimization problem. Under a Slater-type condition, the convergence of the algorithm is established. When the objective function is convex, a modified algorithm is proposed and a less conservative solution is obtained. Finally, numerical test results are reported to show the efficiency of the algorithm.

Keywords:

distributionally robust optimization; alternating iteration algorithm; convergence analysis

MSC:

90C30

1. Introduction

In stochastic programming, the random variables involved are usually assumed to follow a known distribution. In practice, however, this distribution may be unknown or only partially known. Distributionally robust optimization (DRO) is an effective way to handle such uncertainty.

The study of DRO can be traced back to Scarf's early work [1], which addresses uncertainty in supply chain and inventory control. In DRO, historical data may not be sufficient to estimate the future distribution; therefore, a larger set of distributions containing the true distribution is used to hedge against the risk arising from this ambiguity. The DRO model has been widely used in operations research, finance and management science; see [2,3,4,5,6] for recent developments and further research. However, most ambiguity sets in the DRO literature are independent of the decision variable.

Recently, Zhang, Xu and Zhang [7] proposed a parameter-dependent DRO model, in which the probability distribution of the underlying random variables depends on the decision variables and the ambiguity set is defined through parametric moment conditions with generic cone constraints. Under Slater-type conditions, quantitative stability results are established for the parameter-dependent DRO. Using recent developments in variational theory, Royset and Wets [8] established convergence results for approximations of a class of DRO problems with decision-dependent ambiguity sets. Their discussion covers a variety of ambiguity sets, including moment-based and stochastic-dominance-based ones. Luo and Mehrotra [9] obtained formulations for problems that feature distributional ambiguity sets defined by decision-dependent bounds on moments. Until recently, DRO with decision-dependent ambiguity sets has been an almost untouched research field. The few studies [7,8,9] on DRO with decision-dependent ambiguity sets are mostly theoretical, and algorithms for solving such DRO problems are not addressed.

In this paper, we propose an alternating iteration algorithm for the parameter-dependent DRO model in [7] and a less conservative solution strategy for a special case of it.

The main contributions of this paper can be summarized as follows. First, we carry out a convergence analysis for the alternating iteration algorithm. Under the Slater constraint qualification, we show that any cluster point of the sequence generated by the alternating iteration algorithm is an optimal solution of the parameter-dependent DRO. Note that the convergence proof of the successive convex programming method in [10] does not cover our setting, since the ambiguity set in problem (1) depends on x; our convergence analysis can therefore be seen as an extension of the result in [10]. Second, when the corresponding objective function is convex, a less conservative DRO model is constructed and a modified algorithm is proposed for it. Finally, numerical experiments are carried out to show the efficiency of the algorithm.

The paper is organized as follows. Section 2 presents the structure of the algorithm for the parameter-dependent DRO and establishes its convergence. In Section 3, the modified algorithm is proposed for a special case of DRO and a less conservative solution is obtained. In Section 4, numerical test results are reported to show that the solutions obtained by the modified algorithm are less conservative.

Throughout the paper, we use the following notation. We use $\mathbb{R}^{n \times n}$ and $\mathbb{S}^{n \times n}$ to denote the spaces of all $n \times n$ matrices and all $n \times n$ symmetric matrices, respectively. For a matrix $A \in \mathbb{S}^{n \times n}$, $A \preceq 0$ means that $A$ is negative semidefinite. $\|x\|$ denotes the Euclidean norm of a vector $x \in \mathbb{R}^{n}$. For a real-valued function $\varphi : \mathbb{R}^{n} \to \mathbb{R}$, $\nabla \varphi(x)$ denotes the gradient of $\varphi$ at $x$.

2. DRO Model and Its Algorithm

Consider the following distributionally robust optimization (DRO) problem:

$$(P) \qquad \begin{array}{cl} \min_{x} & \sup_{P \in \mathcal{P}(x)} E_{P}[f(x, \xi(\omega))] \\ \text{s.t.} & x \in X, \end{array} \tag{1}$$

where $X$ is a compact subset of $\mathbb{R}^{n}$, $f : \mathbb{R}^{n} \times \mathbb{R}^{k} \to \mathbb{R}$ is a continuously differentiable function, $\xi : \Omega \to \Xi$ is a vector of random variables defined on a probability space $(\Omega, \mathcal{F}, P)$ with support set $\Xi \subset \mathbb{R}^{k}$, and, for fixed $x \in X$, $\mathcal{P}(x)$ is a set of distributions that contains the true probability distribution of the random variable $\xi$. $E_{P}[\cdot]$ denotes the expected value with respect to the probability measure $P \in \mathcal{P}(x)$.

In this paper, we consider the case when $\mathcal{P}(x)$ is constructed through the moment condition

$$\mathcal{P}(x) := \{P \in \mathscr{P} : E_{P}[\Psi(x, \xi(\omega))] \in \mathcal{K}\}, \tag{2}$$

where $\Psi$ is a random map which consists of vectors and/or matrices with measurable random components, $\mathscr{P}$ denotes the set of all probability distributions/measures on the space $(\Omega, \mathcal{F})$, and $\mathcal{K}$ is a closed convex cone in a finite-dimensional vector and/or matrix space. If we consider $(\Xi, \mathcal{B})$ as a measurable space equipped with the Borel sigma algebra $\mathcal{B}$, then $\mathcal{P}(x)$ may be viewed as a set of probability measures defined on $(\Xi, \mathcal{B})$ induced by the random variate $\xi$. To ease notation, we use $\xi$ to denote either the random vector $\xi(\omega)$ or an element of $\mathbb{R}^{k}$, depending on the context.

When $\Xi$ is a finite discrete set, that is, $\Xi = \{\xi_{1}, \dots, \xi_{N}\}$ for some $N$, (2) can be written as

$$\mathcal{P}(x) = \Big\{(p_{1}, \dots, p_{N}) : \sum_{j=1}^{N} p_{j} \Psi(x, \xi_{j}) \in \mathcal{K},\ p_{j} \geq 0,\ \sum_{j=1}^{N} p_{j} = 1\Big\}. \tag{3}$$

In this section, we consider the DRO model (1) with $\mathcal{P}(x)$ defined by (3). In this case,

$$E_{P}[f(x, \xi(\omega))] = \sum_{j=1}^{N} p_{j} f(x, \xi_{j}) \quad \text{and} \quad E_{P}[\Psi(x, \xi(\omega))] = \sum_{j=1}^{N} p_{j} \Psi(x, \xi_{j}).$$

In [10], a successive convex programming (SCP) method for a max–min problem with a fixed compact set is proposed. However, the SCP method in [10] cannot be used to solve (1) directly, since $\mathcal{P}(x)$ in (1) depends on $x$.
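To make the decision dependence of the set in (3) concrete, the following minimal Python sketch checks whether a given weight vector belongs to $\mathcal{P}(x)$. It is an illustration only: the scalar map `psi(x, xi) = xi - x`, the cone $\mathcal{K} = (-\infty, 0]$ and the three scenarios are assumptions chosen for the example and do not come from the paper.

```python
from typing import Callable, Sequence


def expectation(p: Sequence[float], g: Callable[[float], float],
                xi: Sequence[float]) -> float:
    """E_P[g(xi)] = sum_j p_j * g(xi_j) for a discrete distribution p."""
    return sum(pj * g(sj) for pj, sj in zip(p, xi))


def in_ambiguity_set(p: Sequence[float], x: float, xi: Sequence[float],
                     psi: Callable[[float, float], float],
                     tol: float = 1e-9) -> bool:
    """Membership test for the set in (3).

    Assumption for this sketch: psi is scalar valued and K = (-inf, 0].
    """
    on_simplex = all(pj >= -tol for pj in p) and abs(sum(p) - 1.0) <= tol
    moment_ok = expectation(p, lambda s: psi(x, s), xi) <= tol
    return on_simplex and moment_ok


# Hypothetical data: psi(x, xi) = xi - x, so P(x) collects the
# distributions whose mean is at most x.
scenarios = [1.0, 2.0, 4.0]
weights = [0.5, 0.25, 0.25]          # mean = 0.5 + 0.5 + 1.0 = 2.0


def psi(x: float, s: float) -> float:
    return s - x


feasible_at_2 = in_ambiguity_set(weights, 2.0, scenarios, psi)
feasible_at_1_5 = in_ambiguity_set(weights, 1.5, scenarios, psi)
```

The same weight vector is feasible for one decision and infeasible for another, which is exactly why a method designed for a fixed ambiguity set cannot be applied without modification.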

Based on the SCP algorithm, we propose an alternating iteration algorithm for solving (1); see Table 1. In the proposed algorithm, the optimal solution is obtained by alternately solving the inner maximization problems and the outer minimization problems in (1). For convenience, let

$$\mathcal{C} = \Big\{(p_{1}, \dots, p_{N}) \in \mathbb{R}^{N} : p_{j} \geq 0,\ j = 1, 2, \dots, N,\ \sum_{j=1}^{N} p_{j} = 1\Big\}.$$

If the algorithm stops after finitely many steps with $\mathcal{C}_{k+1} = \mathcal{C}_{k}$ or $v_{k} \leq t_{k}$, then $x_{k}$ is an optimal solution of (1). In practice, problem (6) can be solved via its dual problem. For the case in which an infinite sequence is produced, the following theorem ensures the validity of the algorithm.

We introduce some notation that is used in the convergence proof of the algorithm in Table 1. For $P, Q \in \mathscr{P}$, the total variation metric between $P$ and $Q$ is defined as (see, e.g., page 270 in [11])

$$d_{TV}(P, Q) := \sup_{h \in \mathcal{H}} \big|E_{P}[h(\xi)] - E_{Q}[h(\xi)]\big|, \tag{4}$$

where

$$\mathcal{H} := \Big\{h : \mathbb{R}^{k} \to \mathbb{R} : h \text{ is } \mathcal{B}\text{-measurable},\ \sup_{\xi \in \Xi} |h(\xi)| \leq 1\Big\}. \tag{5}$$

Table 1. The Alternating Iteration Algorithm.

Using the total variation metric, we can define the distance from a probability measure $Q \in \mathscr{P}$ to a set of probability measures $\mathcal{P} \subset \mathscr{P}$, that is,

$$d_{TV}(Q, \mathcal{P}) := \inf_{P \in \mathcal{P}} d_{TV}(Q, P).$$
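On a finite support, the supremum in (4) is attained at $h_j = \operatorname{sign}(p_j - q_j)$, so with the normalization in (5) the metric reduces to $\sum_j |p_j - q_j|$ (values between 0 and 2). The short Python sketch below illustrates this; the function names are illustrative, and the set is assumed to be a finite list of candidate distributions so that the infimum becomes a minimum.

```python
from typing import Sequence


def tv_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """d_TV(P, Q) from (4)-(5) for two distributions on the same finite support.

    The supremum over |h| <= 1 is attained at h_j = sign(p_j - q_j),
    which yields the l1 distance between the weight vectors.
    """
    return sum(abs(pj - qj) for pj, qj in zip(p, q))


def tv_distance_to_set(q: Sequence[float],
                       candidates: Sequence[Sequence[float]]) -> float:
    """Distance from q to a finite set of distributions (min replaces inf)."""
    return min(tv_distance(q, p) for p in candidates)


d_pair = tv_distance([0.5, 0.5], [1.0, 0.0])
d_set = tv_distance_to_set([0.5, 0.5], [[1.0, 0.0], [0.25, 0.75]])
```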

We next provide the convergence of the Algorithm in Table 1.

Theorem 1.

Let $\{x_{n}\}$ be a sequence generated by the algorithm in Table 1 and let $x_{0}$ be a cluster point. If (a) $(x, P) \mapsto E_{P}[f(x, \xi)]$ and $(x, P) \mapsto E_{P}[\Psi(x, \xi)]$ are both continuous on $X \times \mathcal{C}$; (b) for each $x \in X$, $f(x, \cdot)$ and $\Psi(x, \cdot)$ are finite valued and continuous on $\Xi$; and (c) $0 \in \operatorname{int}\{E_{P}[\Psi(x_{0}, \xi)] - \mathcal{K} : P \in \mathcal{C}\}$, then $x_{0}$ is an optimal solution of problem (1).

Proof.

Since

C_{n}

is an increasing sequence of sets and

C

is a compact set, we have

{lim}_{n \to \infty} C_{n}

= cl [\cup_{n = 1}^{\infty} C_{n}] : = C^{+} .

Since

x_{0}

is a cluster point of

{x_{n}}

, there exists a subsequence of

{x_{n}}

converging to

x_{0}

. Without loss of generality, for simplicity, we assume that

x_{0}

is the limit point of

{x_{n}}

. We know from step 2 in the algorithm that

x_{n}

is an optimal solution of

\begin{matrix} min_{x} & sup_{P \in P (x) \cap C_{n}} E_{P} [f (x, ξ (ω))] \\ s . t . & x \in X . \end{matrix}

(8)

Let

{\hat{S}}_{n} (x)

and

{\hat{v}}_{n} (x)

denote the optimal solution set and optimal value of

sup_{P \in P_{n} (x)} E_{P} [f (x, ξ)]

respectively,

\hat{S} (x)

and

\hat{v} (x)

denote the optimal solution set and optimal value of

sup_{P \in P (x) \cap C^{+}} E_{P} [f (x, ξ)],

respectively. Then, we have from (8) that:

{\hat{v}}_{n} (x_{n}) \leq {\hat{v}}_{n} (x) for any x \in X .

(9)

We carry out the rest of the proof in three steps.

Step 1. We next show

lim_{n \to \infty} {\hat{v}}_{n} (x_{n}) = \hat{v} (x_{0}) .

(10)

Let

{\hat{P}}_{n} \in {\hat{S}}_{n} (x_{n})

, by compactness of

C^{+}

,

{{\hat{P}}_{n}}

has cluster points. We assume

{\hat{P}}^{*}

is a cluster point of

{{\hat{P}}_{n}}

, then there exists a subsequence

{n_{k}} \subseteq {n}

such that

{\hat{P}}_{n_{k}}

converges to

{\hat{P}}^{*}

weakly as

k \to \infty

and

{\hat{P}}^{*} \in C^{+}

. Under conditions (a) and (b), we have

{\hat{v}}_{n_{k}} (x_{n_{k}}) = E_{{\hat{P}}_{n_{k}}} [f (x_{n_{k}}, ξ)] \to E_{{\hat{P}}^{*}} [f (x_{0}, ξ)] \leq \hat{v} (x_{0})

as

k \to \infty

. Hence, we have

\underset{n \to \infty}{lim sup} {\hat{v}}_{n} (x_{n}) \leq \hat{v} (x_{0}) .

(11)

Since

P_{n} (x_{n}) = P (x_{n}) \cap C_{n} \neq \emptyset

, we have from condition (a) that

P (x_{0}) \cap C^{+} \neq \emptyset

, which means that

\hat{S} (x_{0}) \neq \emptyset

. Let

P^{*} \in \hat{S} (x_{0})

, we next show that there exists a sequence

{{\hat{P}}_{n}}

with

{\hat{P}}_{n} \in P_{n} (x_{n})

such that

{\hat{P}}_{n}

converges to

P^{*}

weakly as

n \to \infty

. Under conditions (b) and (c), we know from [Theorem 2.1] in [7] that there exist positive constants

γ

and

ν \in (0, 1)

such that:

d_{T V} (Q, P (x_{n})) \leq γ {∥ x_{n} - x_{0} ∥}^{ν}

for all

Q \in P (x_{0})

and n large enough, which means that for

P^{*} \in \hat{S} (x_{0})

,

d_{T V} (P^{*}, P (x_{n}) \cap C_{n}) \leq d_{T V} (P^{*}, P (x_{n})) + d_{T V} (P^{*}, C_{n}) \leq γ {∥ x_{n} - x_{0} ∥}^{ν} + d_{T V} (P^{*}, C_{n})

(12)

for n large enough. Let

{\hat{P}}_{n} = Π_{P (x_{n}) \cap C_{n}} (P^{*})

, then by (12), we have

{\hat{P}}_{n}

converges to

P^{*}

weakly as n converges to infinity. Consequently, under condition (b),

{\hat{v}}_{n} (x_{n}) \geq E_{{\hat{P}}_{n}} [f (x_{n}, ξ)] \to E_{P^{*}} [f (x_{0}, ξ)] = \hat{v} (x_{0})

as

n \to \infty

and hence,

\underset{n \to \infty}{lim inf} {\hat{v}}_{n} (x_{n}) \geq \hat{v} (x_{0}) .

(13)

Combining (11) and (13), we have

{\hat{v}}_{n} (x_{n})

converges to

\hat{v} (x_{0})

as

n \to \infty

.

Step 2. We next show for any fixed

x \in X

,

lim_{n \to \infty} {\hat{v}}_{n} (x) = \hat{v} (x) .

(14)

Since

{lim}_{n \to \infty} C_{n} = C^{+}

, we have

{lim}_{n \to \infty} P (x) \cap C_{n} = P (x) \cap C^{+}

.

Then under conditions (a) and (b), similarly to the proof of step 1, we have

{\hat{v}}_{n} (x)

converges to

\hat{v} (x)

as

n \to \infty .

Step 3. Combining (9), (10) and (14), we have

\hat{v} (x_{0}) \leq \hat{v} (x) for any x \in X,

(15)

which means that,

x_{0}

is an optimal solution of

\begin{matrix} min_{x} & sup_{P \in P (x) \cap C^{+}} E_{P} [f (x, ξ (ω))] \\ s . t . & x \in X . \end{matrix}

(16)

By step 3 of the algorithm, we have

sup_{P \in P (x_{n}) \cap C^{+}} E_{P} [f (x_{n}, ξ)] \leq sup_{P \in P (x_{n}) \cap C} E_{P} [f (x_{n}, ξ)] = E_{{\hat{P}}_{n + 1}} [f (x_{n}, ξ)] \leq sup_{P \in P (x_{n}) \cap C^{+}} E_{P} [f (x_{n}, ξ)],

which means that

sup_{P \in P (x_{n}) \cap C^{+}} E_{P} [f (x_{n}, ξ)] = sup_{P \in P (x_{n}) \cap C} E_{P} [f (x_{n}, ξ)] .

Then by the proof in step 1, letting

n \to \infty

, we have

\hat{v} (x_{0}) = {sup}_{P \in P (x_{0}) \cap C} E_{P} [f (x_{0}, ξ)] .

Consequently, by (15), we have

\hat{v} (x_{0}) \leq \hat{v} (x) \leq sup_{P \in P (x) \cap C} E_{P} [f (x, ξ)]

for all

x \in X

. Therefore,

x_{0}

is an optimal solution of (1). □

Remark 1.

In [10], the convergence of the SCP method is proved without any constraint qualification. In our proof, however, the ambiguity set in (1) depends on x, and the Slater-type condition is needed to complete the argument. The proof above also shows that if the ambiguity set in (1) is independent of x, the Slater-type condition can be omitted. Therefore, our convergence analysis can be seen as an extension of the result in [10].

3. Less Conservative Model and a Modified Algorithm

In this section, we consider a special case of (1) and provide a less-conservative model.

In the case when

Ξ = {ξ_{1}, \dots, ξ_{N}}

and the ambiguity set is

P (x) : = \{P \in P : \begin{matrix} E_{P} {[ξ - μ_{0}]}^{T} Σ_{0}^{- 1} E_{P} [ξ - μ_{0}] \leq γ_{1} \\ E_{P} [(ξ - μ_{0}) {(ξ - μ_{0})}^{T}] - γ_{2} Σ_{0} ⪯ 0 \end{matrix}\},

(17)

where

γ_{1}

and

γ_{2}

are nonnegative constants,

μ_{0} \in R^{k}

and

Σ_{0} \in S^{k \times k}

is positive semidefinite, the model (1) is the following problem:

\begin{matrix} min_{x \in X} max_{(p_{1}, \dots, p_{N}) \in R^{N}} & E_{P} [f (x, ξ)] \\ s . t . & \sum_{j = 1}^{N} p_{j} g_{1} (ξ_{j}) ⪯ 0, \\ \sum_{j = 1}^{N} p_{j} g_{2} (ξ_{j}) ⪯ 0, \\ p_{j} \geq 0, j = 1, \dots, N, \\ \sum_{j = 1}^{N} p_{j} = 1, \end{matrix}

(18)

where

g_{1} (ξ) = [\begin{matrix} - Σ_{0} & μ_{0} - ξ \\ {(μ_{0} - ξ)}^{T} & - γ_{1} \end{matrix}]

and

g_{2} (ξ) = (ξ - μ_{0}) {(ξ - μ_{0})}^{T} - γ_{2} Σ_{0} .

The model has been investigated in [2]. As shown in [2], the constraints in (18) imply that the mean of

ξ

lies in an ellipsoid of size

γ_{1}

centered at the estimate

μ_{0}

and the centered second moment matrix of

ξ

lies in a positive semidefinite cone defined with a matrix inequality.

However, under the constraints of (18), not every $\xi_{j}$ lies in the ellipsoid of size $\gamma_{1}$ centered at the estimate $\mu_{0}$. In practice, we may be interested only in those $\xi_{j}$ that lie in the ellipsoid and may omit the ones outside it. Consequently, we propose a less conservative DRO model, that is,

$$\begin{array}{cl} \min_{x \in X} \max_{(p_{1}, \dots, p_{N}) \in \mathbb{R}^{N}} & E_{P}[f(x, \xi)] \\ \text{s.t.} & p_{j} g_{1}(\xi_{j}) \preceq 0, \\ & p_{j} g_{2}(\xi_{j}) \preceq 0, \\ & p_{j} \geq 0,\ j = 1, \dots, N, \\ & \sum_{j=1}^{N} p_{j} = 1. \end{array} \tag{19}$$

In the above model, if $\xi_{j}$ does not lie in the ellipsoid of size $\gamma_{1}$ centered at the estimate $\mu_{0}$, or does not satisfy the matrix inequality $g_{2}(\xi_{j}) \preceq 0$, then the corresponding constraints can hold only with $p_{j} = 0$, so that scenario receives zero probability. Moreover, we can choose $\gamma_{1}$ and $\gamma_{2}$ such that the feasible set of the inner problem is nonempty; for example, for the first constraint, let $\gamma_{1} = \max\{(\xi_{j} - \mu_{0})^{T} \Sigma_{0}^{-1} (\xi_{j} - \mu_{0}) : j = 1, 2, \dots, N\}$. Compared with model (18), model (19) is less conservative, since the feasible set of the inner maximization problem is smaller.

Let

Q

be a set of probability distributions defined as

Q = \{(p_{1}, \dots, p_{N}) \in R^{N} : p_{j} g_{i} (ξ_{j}) ⪯ 0, \sum_{j = 1}^{N} p_{j} = 1, p_{j} \geq 0, j = 1, \dots, N, i = 1, 2\} .

(20)
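Because $p_j \geq 0$, a constraint $p_j g_i(\xi_j) \preceq 0$ is either satisfied by $g_i(\xi_j) \preceq 0$ itself or forces $p_j = 0$. The set in (20) is therefore the face of the probability simplex supported on the scenarios that satisfy both matrix inequalities, and a linear objective over it is maximized by a point mass on one of those scenarios. The Python sketch below illustrates this for scalar demand ($k = 1$), where a Schur complement argument with $\Sigma_0 > 0$ turns $g_1(\xi) \preceq 0$ into $(\xi - \mu_0)^2 / \Sigma_0 \leq \gamma_1$. All data and function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def allowed_scenarios(xi: Sequence[float], mu0: float, sigma0: float,
                      gamma1: float, gamma2: float) -> List[int]:
    """Indices j with g_1(xi_j) <= 0 and g_2(xi_j) <= 0 in the scalar case.

    Scalar case only (k = 1, sigma0 > 0):
      g_1(xi) <= 0  iff  (xi - mu0)^2 / sigma0 <= gamma1,
      g_2(xi) <= 0  iff  (xi - mu0)^2 <= gamma2 * sigma0.
    """
    keep = []
    for j, s in enumerate(xi):
        d2 = (s - mu0) ** 2
        if d2 / sigma0 <= gamma1 and d2 <= gamma2 * sigma0:
            keep.append(j)
    return keep


def inner_max_over_Q(f_vals: Sequence[float],
                     keep: Sequence[int]) -> Tuple[int, float]:
    """Maximize sum_j p_j f_j over the set in (20).

    The feasible set is a face of the simplex, so the maximum is the
    largest f_j among the allowed scenarios.
    """
    j_star = max(keep, key=lambda j: f_vals[j])
    return j_star, f_vals[j_star]


# Hypothetical data: four scenarios, one of them far from the estimate mu0.
xi = [0.0, 1.0, 2.0, 5.0]
keep = allowed_scenarios(xi, mu0=1.0, sigma0=1.0, gamma1=1.0, gamma2=1.1)
f_vals = [(1.0 - s) ** 2 for s in xi]     # f(x, xi) = (x - xi)^2 at x = 1
j_star, worst_case = inner_max_over_Q(f_vals, keep)
```

In this toy instance the outlying scenario would drive the worst case to 16 if it were allowed to carry probability mass; excluding it gives a worst case of 1.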

Next, we give a modified alternating iteration algorithm for (19); see Table 2.

The algorithm in Table 2 is based on the algorithm of Pflug and Wozabal [10] for solving a distributionally robust investment problem and on the cutting plane algorithm of Kelley [12] for solving convex optimization problems. A similar algorithm has been used by Xu et al. [5] to solve a different DRO model, where the convergence proof is omitted. In the following, we provide a convergence analysis of the modified alternating iteration algorithm based on Theorem 1.

Theorem 2.

Let

{x_{n}}

be a sequence generated by Algorithm in Table 2 and

x_{0}

be a limit point. If for each

ξ \in Ξ

,

f (\cdot, ξ)

is continuously differentiable and convex on X, then

x_{0}

is an optimal solution of problem (19).

Table 2. The Modified Alternating Iteration Algorithm.

Proof.

The proof is similar to the proof of Theorem 1. Since

Q_{n}

is an increasing sequence of sets and

Q

is a compact set, we have

{lim}_{n \to \infty} Q_{n} = cl [\cup_{n = 1}^{\infty} Q_{n}] : = Q^{+} .

Let

{\hat{S}}_{n} (x)

and

{\hat{v}}_{n} (x)

denote the optimal solution set and optimal value of

sup_{(p_{1}, \dots, p_{N}) \in Q_{n}} \sum_{j = 1}^{N} p_{j} [f (x_{n - 1}, ξ_{j}) + \nabla_{x} f {(x_{n - 1}, ξ_{j})}^{T} (x - x_{n - 1})]

respectively,

\hat{S} (x)

and

\hat{v} (x)

denote the optimal solution set and optimal value of

sup_{(p_{1}, \dots, p_{N}) \in Q^{+}} \sum_{j = 1}^{N} p_{j} f (x, ξ_{j})

respectively. Then we have

{\hat{v}}_{n} (x_{n}) \leq {\hat{v}}_{n} (x) for any x \in X .

(23)

Let

(p_{1}^{n}, \dots, p_{N}^{n}) \in {\hat{S}}_{n} (x_{n})

, by compactness of

Q^{+}

,

{(p_{1}^{n}, \dots, p_{N}^{n})}

has cluster points. We assume

(p_{1}^{*}, \dots, p_{N}^{*})

is a cluster point of

{(p_{1}^{n}, \dots, p_{N}^{n})}

, then there exists a subsequence

{n_{k}} \subseteq {n}

such that

(p_{1}^{n_{k}}, \dots, p_{N}^{n_{k}})

converges to

(p_{1}^{*}, \dots, p_{N}^{*})

weakly as

k \to \infty

and

(p_{1}^{*}, \dots, p_{N}^{*}) \in Q^{+}

. Then we have

\begin{matrix} {\hat{v}}_{n_{k}} (x_{n_{k}}) & = & \sum_{j = 1}^{N} p_{j}^{n_{k}} [f (x_{n_{k} - 1}, ξ_{j}) + \nabla_{x} f {(x_{n_{k} - 1}, ξ_{j})}^{T} (x_{n_{k}} - x_{n_{k} - 1})] \\ \leq & \sum_{j = 1}^{N} p_{j}^{n_{k}} f (x_{n_{k}}, ξ_{j}) \to \sum_{j = 1}^{N} p_{j}^{*} [f (x_{0}, ξ_{j})] \leq \hat{v} (x_{0}) \end{matrix}

as

k \to \infty

. Hence, we have

\underset{n \to \infty}{lim sup} {\hat{v}}_{n} (x_{n}) \leq \hat{v} (x_{0}) .

(24)

On the other hand, for

(p_{1}^{*}, \dots, p_{N}^{*}) \in \hat{S} (x_{0}), (p_{1}^{*}, \dots, p_{N}^{*}) \in Q^{+}

, which means that

\exists (p_{1}^{n}, \dots, p_{N}^{n}) \in Q_{n}

such that

(p_{1}^{n}, \dots, p_{N}^{n}) \to (p_{1}^{*}, \dots, p_{N}^{*})

as

n \to \infty

. Therefore, we have

{\hat{v}}_{n} (x_{n}) \geq \sum_{j = 1}^{N} p_{j}^{n} [f (x_{n - 1}, ξ_{j}) + \nabla_{x} f {(x_{n - 1}, ξ_{j})}^{T} (x_{n} - x_{n - 1})] \to \sum_{j = 1}^{N} p_{j}^{*} f (x_{0}, ξ_{j}) = \hat{v} (x_{0})

(25)

as

n \to \infty

. Combining (24) and (25), we obtain

lim_{n \to \infty} {\hat{v}}_{n} (x_{n}) = \hat{v} (x_{0}) .

The rest of the proof follows from the proof of Theorem 1. □

Remark 2.

Notice that the Slater condition is not used in the proof: since the ambiguity set in this special case is independent of x, the Slater condition can be omitted.

4. Numerical Tests

In this section, we discuss the numerical performance of the proposed alternating iteration algorithm for solving (18) and (19). We apply the algorithm to a newsvendor problem [4] and provide a comparative analysis of the numerical results.

Suppose the newsvendor trades in $n$ products and has to decide the order quantity $x_{j}$ of product $j$ to meet the demand $\xi_{j}$, $j = 1, \dots, n$. Before knowing the uncertain demand $\xi_{j}$, the newsvendor orders $x_{j}$ units of product $j$ at the wholesale price $c_{j} > 0$. Once the demand $\xi_{j}$ is known, the quantity $\min\{x_{j}, \xi_{j}\}$ is sold at the retail price $v_{j}$. Any unsold stock $(x_{j} - \xi_{j})_{+}$ is cleared at the remedy price $h_{j}$. Any unsatisfied demand $(\xi_{j} - x_{j})_{+}$ is lost. The total loss of the newsvendor can be described as a function of the order decision $x := (x_{1}, \dots, x_{n})^{\top}$:

$$L(x, \xi) = c^{\top} x - v^{\top} \min(x, \xi) - h^{\top} (x - \xi)_{+} = (c - v)^{\top} x + (v - h)^{\top} (x - \xi)_{+}, \tag{26}$$

where the positive-part and minimum operators are applied componentwise. We study the risk-averse newsvendor problem through two models:

$$(H1) \qquad \min_{x \in X} \sup_{P \in \mathcal{P}} E_{P}[U(L(x, \xi))], \tag{27}$$

and

$$(H2) \qquad \min_{x \in X} \sup_{P \in \mathcal{Q}} E_{P}[U(L(x, \xi))], \tag{28}$$

where $U(w) := e^{w/10}$ is an exponential disutility function,

$$\mathcal{P} = \Big\{(p_{1}, \dots, p_{N}) \in \mathbb{R}^{N} : \sum_{j=1}^{N} p_{j} g_{i}(\xi_{j}) \preceq 0,\ \sum_{j=1}^{N} p_{j} = 1,\ p_{j} \geq 0,\ j = 1, \dots, N,\ i = 1, 2\Big\},$$

and $\mathcal{Q}$ is defined as in (20). Notice that, for the newsvendor problem, problems (18) and (19) are exactly (H1) and (H2), respectively.
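The two expressions for the loss in (26) agree because $\min(x, \xi) = x - (x - \xi)_+$ componentwise. The Python sketch below evaluates both forms and the function $U$ on a small instance. The prices follow the formulas used in the data description of this section with $n = 2$; the order and demand vectors are hypothetical values chosen for illustration.

```python
import math
from typing import Sequence


def loss(x: Sequence[float], xi: Sequence[float], c: Sequence[float],
         v: Sequence[float], h: Sequence[float]) -> float:
    """First form of (26): ordering cost minus sales and clearance revenue."""
    return sum(cj * xj - vj * min(xj, sj) - hj * max(xj - sj, 0.0)
               for xj, sj, cj, vj, hj in zip(x, xi, c, v, h))


def loss_compact(x: Sequence[float], xi: Sequence[float], c: Sequence[float],
                 v: Sequence[float], h: Sequence[float]) -> float:
    """Second form of (26), obtained from min(x, xi) = x - (x - xi)_+."""
    return sum((cj - vj) * xj + (vj - hj) * max(xj - sj, 0.0)
               for xj, sj, cj, vj, hj in zip(x, xi, c, v, h))


def disutility(w: float) -> float:
    """U(w) = exp(w / 10)."""
    return math.exp(w / 10.0)


n = 2
c = [0.10 * (5 + j - 1) for j in range(1, n + 1)]   # wholesale prices
v = [0.15 * (5 + j - 1) for j in range(1, n + 1)]   # retail prices
h = [0.05 * (5 + j - 1) for j in range(1, n + 1)]   # remedy prices

x = [3.0, 1.0]      # hypothetical order quantities
xi = [2.0, 2.0]     # hypothetical realized demands

total_loss = loss(x, xi, c, v, h)
total_loss_compact = loss_compact(x, xi, c, v, h)
risk_adjusted = disutility(total_loss)
```

Here product 1 is over-ordered by one unit and product 2 is under-ordered, so only the first product contributes a clearance term.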

The data are generated as follows. For the $j$-th product, the wholesale, retail and remedy prices are $c_{j} = 0.1(5 + j - 1)$, $v_{j} = 0.15(5 + j - 1)$ and $h_{j} = 0.05(5 + j - 1)$, respectively; the product demand vector $\xi$ follows a multivariate log-normal distribution with mean $\mu = (\mu_{1}, \dots, \mu_{n})$, $\mu_{j} = 2$, $j = 1, \dots, n$. In the execution of the algorithm, we use the ambiguity set $\mathcal{Q}$ in (20) with $\gamma_{1} = 0.1$ and $\gamma_{2} = 1.1$. The mean $\mu_{0}$ and the covariance matrix $\Sigma_{0}$ are computed from data generated by computer. The experiments are carried out in Matlab 2016 on a Dell notebook computer with the Windows 7 operating system and an Intel Core i5 processor. The SDP subproblems in the algorithms are solved by the Matlab solver “SDPT3-4.0” [13].

The computational results are shown in Tables 3 and 4 and Figures 1–5. In Tables 3 and 4, we report the average CPU time (Time (s)), the number of iterations (Iter) and the optimal value (Optimal Value) of each test problem with different sample sizes.

Table 3. The performance of (H1).

Table 4. The performance of (H2).

Figure 1. Comparative analysis of (H1) and (H2) on Time (s).

Figure 2. Comparative analysis of (H1) and (H2) on Iter.

Figure 3. Comparative analysis of (H1) and (H2) on Optimal Value.

Figure 4. Comparative analysis of Optimal Value from (H1).

Figure 5. Comparative analysis of Optimal Value from (H2).

From Tables 3 and 4 and Figures 1–5, we can see that problems (H1) and (H2) can both be solved by the alternating iteration algorithm. The figures show that the number of iterations and the time needed to solve (H1) are in most cases larger than those needed to solve (H2). Moreover, the optimal values of (H2) are smaller than those of (H1). Since the DRO model is usually used to describe an upper bound of uncertain optimization problems, the smaller the optimal value of the DRO model, the less conservative the DRO model is. Therefore, (H2) is a less conservative DRO model. However, according to Figure 3, (H1) is more robust than (H2), because the curve for (H1) is more stable.

The numerical results show that, in order to obtain a less conservative total loss in the newsvendor problem, solving the DRO model (H2) by the alternating iteration algorithm usually performs better than solving the DRO model (H1). However, in our observations, when only robustness matters, the DRO model (H1) may be the better choice. The source code is available at https://pan.baidu.com/s/1dSmMUynZqi5LzWgn6aUUoQ?pwd=xn44 (accessed on 25 January 2022).

5. Conclusions

In this paper, we carry out a convergence analysis of an alternating iteration algorithm for a distributionally robust optimization problem in which the ambiguity set depends on the decision variables. Convergence of the algorithm is established under a Slater-type condition, which can be seen as an extension of the result in [10]. When the objective function is convex, a modified alternating iteration algorithm is proposed for obtaining a less conservative solution of the DRO problem, and its convergence is established. Finally, we discuss the numerical performance of the proposed alternating iteration algorithm for obtaining a conservative total loss in the newsvendor problem. A similar analysis can be undertaken when the ambiguity set in DRO is constructed in other ways, such as via the Kullback–Leibler divergence [14] or the Wasserstein metric [15,16]. We leave these for future research, as they are beyond the focus of this paper.

Author Contributions

Conceptualization, S.L. and J.Z.; methodology, S.L.; software, N.S.; validation, J.Z., S.L. and N.S.; formal analysis, S.L.; investigation, J.Z.; writing—original draft preparation, S.L.; writing—review and editing, J.Z.; visualization, N.S.; supervision, J.Z.; project administration, J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Natural Science Foundation of China under Project Grant Nos. 12171219 and 61877032, the Liaoning Revitalization Talents Program No. XLYC2007113, Scientific Research Fund of Liaoning Provincial Education Department under Project No. LJKZ0961.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

[1] Scarf, H. A min-max solution of an inventory problem. In Studies in the Mathematical Theory of Inventory and Production; Arrow, K.S., Karlin, S., Scarf, H.E., Eds.; Stanford University Press: Stanford, CA, USA, 1958; pp. 201–209.
[2] Ye, Y.; Delage, E. Distributionally robust optimization under moment uncertainty with application to data-driven problems. Oper. Res. 2010, 58, 595–612.
[3] Goh, J.; Sim, M. Distributionally robust optimization and its tractable approximations. Oper. Res. 2010, 58, 902–917.
[4] Wiesemann, W.; Kuhn, D.; Sim, M. Distributionally robust convex optimization. Oper. Res. 2014, 62, 1358–1376.
[5] Xu, H.; Liu, Y.C.; Sun, H.L. Distributionally robust optimization with matrix moment constraints: Lagrange duality and cutting plane methods. Math. Program. 2017, 130, 1–22.
[6] Shapiro, A. On duality theory of conic linear problems. In Semi-Infinite Programming; Goberna, M.A., Lopez, M.A., Eds.; Springer: Boston, MA, USA, 2001; pp. 135–165.
[7] Zhang, J.; Xu, H.; Zhang, L. Quantitative stability analysis for distributionally robust optimization with moment constraints. SIAM J. Optim. 2017, 26, 1855–1882.
[8] Royset, J.O.; Wets, R.J.-B. Variational theory for optimization under stochastic ambiguity. SIAM J. Optim. 2017, 27, 1118–1149.
[9] Luo, F.; Mehrotra, S. Distributionally robust optimization with decision dependent ambiguity sets. Tech. Rep. 2018, 14, 2565–2594.
[10] Pflug, G.C.; Wozabal, D. Ambiguity in portfolio selection. Quantitative 2007, 7, 435–442.
[11] Athreya, K.B.; Lahiri, S.N. Measure Theory and Probability Theory; Springer: New York, NY, USA, 2006.
[12] Kelley, J.E. The cutting-plane method for solving convex programs. SIAM J. Appl. Math. 1960, 8, 703–712.
[13] Toh, K.C.; Todd, M.J.; Tütüncü, R.H. SDPT3-a Matlab software package for semidefinite programming. Optim. Methods Softw. 1999, 11, 545–581.
[14] Hu, Z.; Hong, L.J. Kullback-Leibler Divergence Constrained Distributionally Robust Optimization. Available online: http://www.optimization-online.org/DB_HTML/2012/11/3677.html (accessed on 1 November 2012).
[15] Esfahani, P.M.; Kuhn, D. Data-driven distributionally robust optimization using the wasserstein metric: Performance guarantees and tractable reformulations. Math. Program. 2018, 171, 115–166.
[16] Zhao, C.; Guan, Y. Data-driven risk-averse stochastic optimization with wasserstein metric. Oper. Res. Lett. 2018, 46, 262.


Table 1. The Alternating Iteration Algorithm.

1. Set $k = 0$ and $\mathcal{C}_{0} = \{\hat{P}\}$ with $\hat{P} \in \mathcal{C}$ satisfying $\{x \in X : E_{\hat{P}}[\Psi(x, \xi)] \in \mathcal{K}\} \neq \emptyset$.

2. Solve the outer problem

$$\begin{array}{cl} \min_{x, t} & t \\ \text{s.t.} & E_{P}[f(x, \xi(\omega))] \leq t,\ \forall P \in \mathcal{P}_{k}(x), \\ & x \in X, \end{array} \tag{6}$$

and obtain the solution $(x_{k}, t_{k})$, where $\mathcal{P}_{k}(x) = \{P \in \mathcal{C}_{k} : E_{P}[\Psi(x, \xi)] \in \mathcal{K}\}$.

3. Solve the inner problem

$$\begin{array}{cl} \max_{P} & E_{P}[f(x_{k}, \xi)] \\ \text{s.t.} & E_{P}[\Psi(x_{k}, \xi)] \in \mathcal{K},\ P \in \mathcal{C}, \end{array} \tag{7}$$

and obtain the solution $\hat{P}_{k}$ and the optimal value $v_{k}$.

4. Let $\mathcal{C}_{k+1} = \mathcal{C}_{k} \cup \{\hat{P}_{k}\}$.

5. If $\mathcal{C}_{k+1} = \mathcal{C}_{k}$ or $v_{k} \leq t_{k}$, then a solution of (1) is found and the algorithm stops. Otherwise, set $k = k + 1$ and go to step 2.
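For a finite scenario set, the inner problem (7) in step 3 of Table 1 is a linear program in the weights. The Python sketch below solves it by brute force in a deliberately simple setting that is an assumption of this example, not the paper's setup: a single scalar moment function and the cone $(-\infty, 0]$. In that case every vertex of the feasible set has at most two nonzero weights, so enumerating single scenarios and pairs is enough. The function name and data are hypothetical; the paper itself solves SDP subproblems with SDPT3.

```python
from typing import List, Optional, Sequence, Tuple


def solve_inner(f_vals: Sequence[float],
                psi_vals: Sequence[float]) -> Tuple[float, List[float]]:
    """Maximize sum_j p_j f_j s.t. sum_j p_j psi_j <= 0, p in the simplex.

    Assumes a scalar moment constraint, so an optimal vertex has at most
    two nonzero weights and can be found by enumeration.
    """
    n = len(f_vals)
    best_val = float("-inf")
    best_p: Optional[List[float]] = None

    # Vertices with one nonzero weight: point mass on a feasible scenario.
    for j in range(n):
        if psi_vals[j] <= 0 and f_vals[j] > best_val:
            best_val = f_vals[j]
            best_p = [1.0 if i == j else 0.0 for i in range(n)]

    # Vertices with two nonzero weights: the moment constraint is active.
    for i in range(n):
        for j in range(n):
            if psi_vals[i] < 0 < psi_vals[j]:
                lam = psi_vals[j] / (psi_vals[j] - psi_vals[i])  # weight on i
                val = lam * f_vals[i] + (1.0 - lam) * f_vals[j]
                if val > best_val:
                    best_val = val
                    best_p = [0.0] * n
                    best_p[i] = lam
                    best_p[j] = 1.0 - lam

    if best_p is None:
        raise ValueError("inner problem is infeasible")
    return best_val, best_p


# Hypothetical values of f(x_k, xi_j) and psi(x_k, xi_j) on three scenarios.
v_k, p_hat = solve_inner([1.0, 3.0, 10.0], [-1.0, 0.0, 2.0])
```

The returned weight vector plays the role of $\hat{P}_k$ and would be appended to $\mathcal{C}_k$ in step 4; comparing the returned value with $t_k$ gives the stopping test in step 5.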

Table 2. The Modified Alternating Iteration Algorithm.

1. Let $P_{0} = (p_{1}^{0}, \dots, p_{N}^{0}) \in \mathcal{Q}$, $\mathcal{Q}_{0} := \{P_{0}\}$ and $x_{0} \in X$. Set $k = 0$.

2. Solve the outer minimization problem

$$\begin{array}{cl} \min_{x, t} & t \\ \text{s.t.} & x \in X, \\ & \sum_{j=1}^{N} p_{j}^{k} \big[f(x_{k}, \xi_{j}) + \nabla_{x} f(x_{k}, \xi_{j})^{\top} (x - x_{k})\big] \leq t, \ \text{for } P_{k} = (p_{1}^{k}, \dots, p_{N}^{k}) \in \mathcal{Q}_{k}, \end{array} \tag{21}$$

and obtain the solution $(x_{k+1}, t_{k+1})$.

3. Solve the inner maximization problem

$$\begin{array}{cl} \max_{(p_{1}, \dots, p_{N}) \in \mathbb{R}^{N}} & \sum_{j=1}^{N} p_{j} f(x_{k+1}, \xi_{j}) \\ \text{s.t.} & p_{j} \begin{bmatrix} -\Sigma_{0} & \mu_{0} - \xi_{j} \\ (\mu_{0} - \xi_{j})^{\top} & -\gamma_{1} \end{bmatrix} \preceq 0, \\ & p_{j} \big[(\xi_{j} - \mu_{0})(\xi_{j} - \mu_{0})^{\top} - \gamma_{2} \Sigma_{0}\big] \preceq 0, \\ & p_{j} \geq 0,\ j = 1, \dots, N, \\ & \sum_{j=1}^{N} p_{j} = 1, \end{array} \tag{22}$$

and obtain the solution $(P_{k+1}, v_{k+1})$.

4. Let $\mathcal{Q}_{k+1} = \mathcal{Q}_{k} \cup \{P_{k+1}\}$. If $\mathcal{Q}_{k+1} = \mathcal{Q}_{k}$ or $v_{k+1} \leq t_{k+1}$, then stop; otherwise, let $k := k + 1$ and go to step 2.
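The steps of Table 2 can be sketched in a few lines of Python for a one-dimensional decision variable. This is a toy illustration under stated assumptions rather than the authors' implementation: $X$ is an interval, the scenarios passed in are those allowed by the constraints defining $\mathcal{Q}$ (so the inner maximum is attained at a single scenario), cuts are accumulated over iterations with each cut pairing $P_i$ with $x_i$ (one reading of (21), as in Kelley's method), and only the gap test $v_{k+1} \leq t_{k+1}$ is used for stopping. All names and data are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def min_of_max_affine(cuts: Sequence[Tuple[float, float]],
                      lo: float, hi: float) -> Tuple[float, float]:
    """Exactly solve min over [lo, hi] of max_i (a_i + b_i * x).

    A convex piecewise-linear function on an interval attains its minimum
    at an endpoint or where two of the lines cross.
    """
    candidates = [lo, hi]
    for i, (a1, b1) in enumerate(cuts):
        for a2, b2 in cuts[i + 1:]:
            if abs(b1 - b2) > 1e-12:
                x = (a2 - a1) / (b1 - b2)
                if lo <= x <= hi:
                    candidates.append(x)

    def model(x: float) -> float:
        return max(a + b * x for a, b in cuts)

    x_best = min(candidates, key=model)
    return x_best, model(x_best)


def modified_alternating_iteration(
        f: Callable[[float, float], float],
        df: Callable[[float, float], float],
        scenarios: Sequence[float],
        lo: float, hi: float, x0: float,
        tol: float = 1e-6, max_iter: int = 200) -> Tuple[float, float]:
    """Toy version of Table 2 with scalar x and point-mass worst cases."""
    x, j = x0, 0                       # step 1: x_0 and P_0 (mass on scenario 0)
    cuts: List[Tuple[float, float]] = []
    v = f(x, scenarios[j])
    for _ in range(max_iter):
        s = scenarios[j]
        slope = df(x, s)
        cuts.append((f(x, s) - slope * x, slope))   # linearize f(., s) at x
        x, t = min_of_max_affine(cuts, lo, hi)      # step 2: outer problem
        j = max(range(len(scenarios)),
                key=lambda i: f(x, scenarios[i]))   # step 3: inner problem
        v = f(x, scenarios[j])
        if v <= t + tol:                            # step 4: stopping test
            break
    return x, v


# Hypothetical convex instance: f(x, xi) = (x - xi)^2 on X = [-3, 3].
x_star, v_star = modified_alternating_iteration(
    f=lambda x, s: (x - s) ** 2,
    df=lambda x, s: 2.0 * (x - s),
    scenarios=[0.0, 1.0, 2.0],
    lo=-3.0, hi=3.0, x0=-3.0)
```

For this instance the worst-case objective is $\max\{x^2, (x - 2)^2\}$, which is minimized at $x = 1$ with value 1, and the iterates approach that point. Convexity of $f(\cdot, \xi)$ is what makes each linearization a valid lower bound, which is why the stopping test certifies optimality up to the tolerance.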

Table 3. The performance of (H1).

n	Time (s)	Iter	Optimal Value
2	31.234226	48	0.9669
4	44.774935	66	0.9663
6	52.961924	72	0.9616
8	52.583319	69	0.9578
10	56.231204	74	0.9659
12	65.792619	85	0.9629

Table 4. The performance of (H2).

n	Time (s)	Iter	Optimal Value
2	28.848998	48	0.9483
4	32.741311	50	0.7441
6	38.559070	57	0.7603
8	41.873316	57	0.8045
10	61.811460	76	0.8577
12	63.149645	75	0.8433

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
