Article

Computable Reformulation of Data-Driven Distributionally Robust Chance Constraints: Validated by Solution of Capacitated Lot-Sizing Problems

1 School of Mathematics and Statistics, Central South University, Changsha 410083, China
2 School of Mathematics and Finance, Hunan University of Humanities, Science and Technology, Loudi 417000, China
3 International Business School, Hunan University of Information Technology, Changsha 410151, China
* Author to whom correspondence should be addressed.
Mathematics 2026, 14(2), 331; https://doi.org/10.3390/math14020331
Submission received: 8 December 2025 / Revised: 10 January 2026 / Accepted: 16 January 2026 / Published: 19 January 2026
(This article belongs to the Section D: Statistics and Operational Research)

Abstract

Uncertainty in optimization models often leads to deterministic equivalent formulations (DEFs) with awkward properties, even for simple linear models. Chance-constrained programming is a reasonable tool for handling optimization problems with random parameters in objective functions and constraints, but it assumes that the distribution of these random parameters is known, and its DEF is often associated with the complicated computation of multiple integrals, which impedes its wide application. In this paper, for optimization models with chance constraints, the historical data of random model parameters are first exploited to construct an adaptive approximate density function by incorporating piecewise linear interpolation into the well-known histogram method, so as to remove the assumption of a known distribution. Then, based on this estimate, a novel confidence set involving only finitely many variables is constructed to depict all the potential distributions of the random parameters, and a computable reformulation of data-driven distributionally robust chance constraints is proposed. By virtue of such a confidence set, it is proven that the deterministic equivalent constraints can be reformulated as several ordinary constraints in line with the principles of the distributionally robust optimization approach, without the need to solve complicated semi-definite programming problems, compute multiple integrals, or solve additional auxiliary optimization problems, as done in existing works. The proposed method is further validated by the solution of the stochastic multiperiod capacitated lot-sizing problem, and the numerical results demonstrate the following: (1) the proposed method can significantly reduce the computational time needed to find a robust optimal production strategy compared with similar methods in the literature; (2) the optimal production strategy provided by our method maintains moderate conservatism, i.e., it achieves a better trade-off between cost-effectiveness and robustness than existing methods.

1. Introduction

Chance-constrained optimization is a popular way to handle optimization models with random parameters, possessing significant advantages over the classical expectation-based approaches [1]. In this paper, we consider the following chance-constrained optimization problem:
$$\min\ C(x)\tag{1a}$$
$$\text{s.t.}\quad\mathbb{P}\{A(\xi)x\le b(\xi)\}\ge 1-\alpha,\tag{1b}$$
$$x\in X\subseteq\mathbb{R}^n,\tag{1c}$$
where $x\in\mathbb{R}^n$ is the decision vector, $C:\mathbb{R}^n\to\mathbb{R}$ is a real-valued convex continuous objective function, $X\subseteq\mathbb{R}^n$ denotes a computable bounded closed convex set, $\xi$ is a random model parameter causing uncertainty in the coefficient matrix $A(\xi)\in\mathbb{R}^{m\times n}$ and the capacity vector $b(\xi)\in\mathbb{R}^m$, $\mathbb{P}\{\cdot\}$ denotes the probability of a random event when the probability distribution function of $\xi$ is $\mathbb{P}$, and $\alpha\in(0,1)$ represents a given risk level, i.e., the tolerance of constraint violation permitted by the decision-maker.
Owing to the practical difficulties in finding the true distribution of $\xi$, data-driven distributionally robust chance-constrained programming (DRCCP) has been developed, which replaces the single distribution in the chance constraint (1b) with a group of potential distributions in a confidence set $\mathcal{D}$ [2,3,4,5,6]. Specifically, by collecting sufficient observed historical data of $\xi$, the so-called confidence set $\mathcal{D}$ is first constructed to depict the domain of all approximate probability distribution functions of $\xi$. Then, in the realm of distributionally robust optimization, Problem (1) is approximated by the following DRCCP model:
$$\min\ C(x)\tag{2a}$$
$$\text{s.t.}\quad\inf_{\mathbb{P}\in\mathcal{D}}\mathbb{P}\{A(\xi)x\le b(\xi)\}\ge 1-\alpha,\tag{2b}$$
$$x\in X\subseteq\mathbb{R}^n.\tag{2c}$$
Clearly, in order to guarantee the applicability of the above data-driven DRCCP approach, it is crucial to choose an appropriate confidence set, paying particular attention to the computational tractability of the constraint (2b) [7]. For example, moment-based confidence sets were defined by using the first and second moments of the data [2,3]. Such sets, confined by estimates of the mean and covariance, often lead to tractable reformulations but may overlook crucial distributional shapes, potentially resulting in overly conservative decisions. In order to incorporate higher-order distributional information of $\xi$, divergence-based confidence sets were constructed [6,8,9]. These confidence sets typically comprise all potential probability distributions within a specified tolerance of a reference distribution (e.g., the empirical distribution), where the tolerance is quantified by a statistical divergence such as the Kullback–Leibler (KL) divergence [10,11] or the Wasserstein metric [6,12]. However, the reformulation of the constraint (2b) in existing works typically involves the solution of complicated semi-definite programming (SDP) problems based on the theory of functional extrema and duality [6]. For instance, Nguyen et al. derived an equivalent SDP reformulation of the problem via an analytical formula for the Wasserstein distance between two normal distributions [13,14]. In addition, distributionally robust chance constraints can be reformulated into mixed-integer programs (MIPs) [15,16], but solving this MIP reformulation is also difficult, even with the most advanced solvers [17]. For large-scale or complex instances, data-driven methods can be beneficial for improving solution efficiency [18].
Motivated by the need to build a more appropriate confidence set from the collected data with a complex distribution, we intend to first construct an adaptive approximate density function by incorporating piecewise linear interpolation into the well-known histogram method, so as to remove the assumption of known distributions. Then, using this estimated distribution, we define a confidence set only involving finitely many variables, and, by the principles of distributionally robust optimization, we reformulate the original DRCCP into a computable standard optimization problem. We summarize the main contributions of this research as follows.
1. This is the first time that an adaptive probability density estimate is constructed by optimizing the bin widths in line with the distributional structure of the data, rather than using a uniform width as in the classic estimation method. In particular, wider bins are chosen where the data fluctuate less, and narrower bins where they fluctuate more. An advantage of doing so is that it reduces the number of nodes used when defining confidence sets, which is beneficial for the dimensionality reduction of the derived robust optimization models.
2. Through the variation distance, a confidence set depicting all the approximate probability distributions is constructed, which strictly restricts the global difference between the distributions in the confidence set and the estimated discrete nominal distribution, avoiding the loss of high-order statistical information caused by moment constraints. Owing to the advantages of such a confidence set, it is proven that the complicated chance constraint (2b) can be reformulated as an easily computable ordinary inequality constraint, which is beneficial for the development of efficient algorithms to solve the original problem (2).
3. The proposed reformulation technique is applied to solve the stochastic multiperiod capacitated lot-sizing problem (SMP-CLSP), and its advantages are validated, especially in comparison with similar ones in the literature.
The remainder of this paper is structured as follows. Section 2 introduces data-driven adaptive confidence sets with finitely many parameters, including the adaptive estimation of probability density functions and the construction of the corresponding confidence sets. Section 3 presents the reformulation of data-driven distributionally robust chance constraints. Section 4 provides numerical validation when applying the proposed computational technique to the SMP-CLSP. This research is concluded in Section 5.

2. Data-Driven Adaptive Confidence Sets with Finite Parameters

2.1. Adaptive Estimation of Probability Density Functions

Instead of fixing the bin widths when obtaining the reference distribution from the data, we first propose a numerical method to adaptively estimate the probability density function directly from the collected data by incorporating a piecewise linear interpolation technique into the well-known histogram method, where the distributional structure of the raw sample data is employed to optimize the choice of bin widths.
Let $\Xi=\{\xi^1,\xi^2,\ldots,\xi^{n_s}\}$ represent a sample set of size $n_s$, where all the samples are independently and identically distributed (i.i.d.) realizations of a random variable $\xi$, governed by an unknown probability density function $f$. We define the sample range by the smallest and largest realizations in the sample set, denoted as $\xi_{\min}=\min_{1\le j\le n_s}\xi^j$ and $\xi_{\max}=\max_{1\le j\le n_s}\xi^j$, respectively. As in the classical histogram method [19], we partition $[\xi_{\min},\xi_{\max}]$ into $N$ subintervals $[z_i,z_{i+1})$, referred to as $N$ bins $B_1,B_2,\ldots,B_N$, and $h_i=z_{i+1}-z_i$ is called the width of the $i$-th bin. For each bin $B_i$, let $v_i$ denote the number of samples that fall into $B_i$. By Sturges' number-of-bins rule, the initial number of bins is chosen to be $N=\lceil\log_2 n_s\rceil+1$, where $\lceil\cdot\rceil$ denotes rounding up. Then, by the classical histogram method, the reference density function [20] $\hat f_n:\mathbb{R}\to[0,+\infty)$ is defined to be
$$\hat f_n(z)=\begin{cases}\dfrac{v_i}{n_s h_i}=\dfrac{1}{n_s h_i}\displaystyle\sum_{j=1}^{n_s} I_{[z_i,z_{i+1})}(\xi^j),& z\in B_i,\ i=1,\ldots,N-1;\\[3mm] \dfrac{v_N}{n_s h_N}=\dfrac{1}{n_s h_N}\displaystyle\sum_{j=1}^{n_s} I_{[z_N,z_{N+1}]}(\xi^j),& z\in B_N,\end{cases}\tag{3}$$
where
$$I_{B_i}(z)=\begin{cases}1,&\text{if }z\in B_i;\\ 0,&\text{otherwise}.\end{cases}$$
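For concreteness, the following is a minimal Python sketch of this classical reference density, applying Sturges' rule and returning the piecewise-constant estimate in (3). The synthetic normal sample is a hypothetical stand-in for the observed realizations of $\xi$.

```python
# A minimal sketch of the reference density (3): a frequency histogram with
# Sturges' rule for the initial number of bins.
import numpy as np

def histogram_density(samples):
    ns = len(samples)
    N = int(np.ceil(np.log2(ns))) + 1                         # Sturges' rule
    edges = np.linspace(samples.min(), samples.max(), N + 1)  # z_1,...,z_{N+1}
    counts, _ = np.histogram(samples, bins=edges)             # v_i per bin
    widths = np.diff(edges)                                   # h_i
    return edges, counts / (ns * widths)                      # f_hat_n per bin

rng = np.random.default_rng(0)
edges, density = histogram_density(rng.normal(size=1000))
# the piecewise-constant estimate integrates to one by construction
assert np.isclose(np.sum(density * np.diff(edges)), 1.0)
```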
Numerous studies have shown that the selection of the bin width plays a crucial role in accurately estimating the probability density [21]. In particular, for sample data with a complex distributional structure, the fixed bin width in the classical histogram method may greatly restrict the ability of the estimated density function to depict the true distribution [20]. In order to overcome this difficulty, we now present a numerical method to adaptively estimate the probability density function directly from the collected data by incorporating the piecewise linear interpolation technique into the well-known histogram method [19]. Specifically, we first define a new function $\hat f:[z_1,z_{N+1}]\to[0,+\infty)$ by
$$\hat f(z)=\begin{cases}\dfrac{2\big(\hat f_n(\eta_{i+1})-\hat f_n(\eta_i)\big)}{h_i+h_{i+1}}(z-\eta_i)+\hat f_n(\eta_i),& z_i\le z<z_{i+1},\ i=1,2,\ldots,N-1;\\[3mm] \dfrac{h_N+\eta_N-z}{h_N}\,\hat f_n(\eta_N),& z_N\le z\le z_{N+1},\end{cases}\tag{4}$$
where $\eta_i=\frac{z_i+z_{i+1}}{2}$, $i=1,2,\ldots,N$. Then, in order to ensure the nonnegativity of $\hat f$ and achieve a better approximation, we compute
$$\begin{aligned}
&h_i=z_{i+1}-z_i,\qquad \eta_i=\frac{z_i+z_{i+1}}{2},\qquad i=1,2,\ldots,N,\\
&slp_i(\hat f)=\frac{2\big(\hat f_n(\eta_{i+1})-\hat f_n(\eta_i)\big)}{h_i+h_{i+1}},\qquad i=1,2,\ldots,N-1,\\
&\hat f[\eta_i,\eta_{i+1},\eta_{i+2}]=\frac{4}{h_{i+1}+h_i}\left(\left|\frac{\hat f_n(\eta_{i+2})-\hat f_n(\eta_{i+1})}{h_{i+1}+h_{i+2}}\right|-\left|\frac{\hat f_n(\eta_{i+1})-\hat f_n(\eta_i)}{h_{i+1}+h_i}\right|\right),\qquad i=1,2,\ldots,N-2,\\
&\hat f[\eta_{N-1},\eta_N,z_{N+1}]=\frac{2}{h_{N-1}+h_N}\left(\left|\frac{\hat f_n(\eta_N)}{h_N}\right|-\left|\frac{2\big(\hat f_n(\eta_N)-\hat f_n(\eta_{N-1})\big)}{h_N+h_{N-1}}\right|\right),\\
&\nabla^2_{i_{\max}}=\max\Big\{\hat f[\eta_i,\eta_{i+1},\eta_{i+2}],\ \hat f[\eta_{N-1},\eta_N,z_{N+1}]\ :\ i=1,2,\ldots,N-2\Big\}.
\end{aligned}$$
It can be seen that $\hat f[\eta_i,\eta_{i+1},\eta_{i+2}]$ and $\nabla^2_{i_{\max}}$ reflect the information of the second-order forward difference quotient and the maximum second-order difference quotient, respectively. With this first- and second-order information, we further optimize the bin widths by inserting more nodes into the bins in line with the following rules.
(I)
When $slp_i(\hat f)>\max\big\{0,\ \frac{2\hat f_n(\eta_i)}{h_i}\big\}$, a new node $z'_{i+2}=\frac{z_{i+1}+z_{i+2}}{2}$ is inserted into the $(i+1)$-th bin. The original $(i+1)$-th bin $B_{i+1}$ is split into two new sub-bins: $B'_{i+1}=[z_{i+1},z'_{i+2})$, $B'_{i+2}=[z'_{i+2},z_{i+2})$. The midpoints of the new sub-bins $B'_{i+1}$ and $B'_{i+2}$ are $\eta'_{i+1}=\frac{z_{i+1}+z'_{i+2}}{2}$ and $\eta'_{i+2}=\frac{z'_{i+2}+z_{i+2}}{2}$, respectively. Apply (3) to obtain the corresponding $\hat f_n(\eta'_{i+1})$ and $\hat f_n(\eta'_{i+2})$ for the two new sub-bins. Consequently, the updated midpoints are given by $\big(\eta'_{i+1},\hat f_n(\eta'_{i+1})\big)$ and $\big(\eta'_{i+2},\hat f_n(\eta'_{i+2})\big)$, respectively. After rearranging the indices of all obtained bins, we keep inserting nodes until the inequality $slp_i(\hat f)\le\frac{2\hat f_n(\eta_i)}{h_i}$ holds.
(II)
When $slp_i(\hat f)<\min\big\{0,\ -\frac{2\hat f_n(\eta_i)}{h_i}\big\}$, a new node $z'_{i+1}=\frac{z_i+z_{i+1}}{2}$ is inserted into the $i$-th bin. Split $B_i$ into two sub-bins: $B'_i=[z_i,z'_{i+1})$, $B'_{i+1}=[z'_{i+1},z_{i+1})$. The midpoints of the new sub-bins $B'_i$ and $B'_{i+1}$ are $\eta'_i=\frac{z_i+z'_{i+1}}{2}$ and $\eta'_{i+1}=\frac{z'_{i+1}+z_{i+1}}{2}$, respectively. Apply (3) to obtain the corresponding $\hat f_n(\eta'_i)$ and $\hat f_n(\eta'_{i+1})$ for the two new sub-bins. Consequently, the updated midpoints are given by $\big(\eta'_i,\hat f_n(\eta'_i)\big)$ and $\big(\eta'_{i+1},\hat f_n(\eta'_{i+1})\big)$, respectively. After rearranging the indices of all bins, we keep inserting nodes until $slp_i(\hat f)\ge-\frac{2\hat f_n(\eta_i)}{h_i}$ holds.
(III)
Denote by $B_{i_{\max}}$, $B_{i_{\max}+1}$, and $B_{i_{\max}+2}$ the three adjacent bins corresponding to this maximum second-order difference quotient. Let $TH$ be the given interpolation tolerance. If the maximum second-order difference quotient satisfies $\nabla^2_{i_{\max}}\ge TH$, then we select the bin with the maximum frequency among $B_{i_{\max}}$, $B_{i_{\max}+1}$, and $B_{i_{\max}+2}$, denoted by $B_{i_m}$, and add its midpoint as the newly inserted interpolation node, i.e., $z'_{i_m+1}=\frac{z_{i_m}+z_{i_m+1}}{2}$. Split $B_{i_m}$ into two sub-bins: $B'_{i_m}=[z_{i_m},z'_{i_m+1})$, $B'_{i_m+1}=[z'_{i_m+1},z_{i_m+1})$. The midpoints of the new sub-bins $B'_{i_m}$ and $B'_{i_m+1}$ are $\eta'_{i_m}=\frac{z_{i_m}+z'_{i_m+1}}{2}$ and $\eta'_{i_m+1}=\frac{z'_{i_m+1}+z_{i_m+1}}{2}$, respectively, and the updated midpoints are given by $\big(\eta'_{i_m},\hat f_n(\eta'_{i_m})\big)$ and $\big(\eta'_{i_m+1},\hat f_n(\eta'_{i_m+1})\big)$, respectively. Consequently, among the three bins involved in the maximum second-order difference quotient, only the bin with the largest frequency needs to be further interpolated. With this new node, we recompute the largest second-order difference quotient and repeat the above interpolation until the inequality $\nabla^2_{i_{\max}}<TH$ is satisfied.
The consecutive steps of the proposed approach to adaptively estimating the probability density based on Rules (I)–(III) are illustrated in Figure 1.
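Beyond the figure, the refinement idea can be compressed into a short sketch. The following Python code is a deliberate simplification of Rules (I) and (II) only: it splits a bin at its midpoint whenever the interpolant's slope violates the nonnegativity bound $2\hat f_n(\eta_i)/h_i$, recounting the per-bin frequencies after every split; the full method additionally applies the curvature-driven Rule (III) and the node bookkeeping described above.

```python
# A simplified sketch of the bin-refinement idea behind Rules (I)-(II); the
# slope test and the split position follow the rules, while recounting all
# frequencies after each split is a simplification of the full scheme.
import numpy as np

def refine_bins(samples, edges, max_rounds=10):
    ns = len(samples)
    for _ in range(max_rounds):
        counts, _ = np.histogram(samples, bins=edges)
        widths = np.diff(edges)
        dens = counts / (ns * widths)            # f_hat_n at the bin midpoints
        mids = (edges[:-1] + edges[1:]) / 2      # eta_i
        # slopes of the piecewise linear interpolant between midpoints
        slopes = 2 * np.diff(dens) / (widths[:-1] + widths[1:])
        # Rule (I)/(II) style test: |slope| must not exceed 2*dens_i/width_i
        bad = np.where(np.abs(slopes) > 2 * dens[:-1] / widths[:-1])[0]
        if bad.size == 0:
            break
        # insert the midpoint of the first offending bin and re-estimate
        edges = np.sort(np.append(edges, mids[bad[0]]))
    return edges
```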
Remark 1.
As demonstrated intuitively in Figure 1, the above three rules aim to optimize the bin widths by adaptively inserting new knots into each bin in line with the desired properties of $\hat f$. Essentially, new knots are inserted into bins with larger slope variations, and the rate of slope change between bins is characterized by the second-order difference quotient.
After applying Rules (I)–(III) to optimize the bin widths, the obtained estimate of the density function can depict the distributional nature of the sample data. Without loss of generality, the final node set is still denoted by $z_{final}=\{z_1,z_2,\ldots,z_{N+1}\}$, where $N$ is the optimized number of bins. Consequently, by Rules (I), (II), and (III), $\hat f$ in (4) is modified into a piecewise linear function with variable steps, still denoted by $\hat f_n$. We can prove that $\hat f_n$ is an improved estimate of the true density function with adaptive bin widths, and the method is called improved density estimation with piecewise linear interpolation (IDE-PLI).
Proposition 1.
Let $\hat f$ be defined by (4) together with Rules (I), (II), and (III). Then, (1) it holds that $\hat f\ge 0$; (2) $\int_{z_1}^{z_{N+1}}\hat f(z)\,dz=1$.
Proof. 
(1) In the case of $slp_i(\hat f)\ge 0$, it follows from the first expression in (4) that, for $z_i\le z<z_{i+1}$,
$$\hat f(z_i)=\frac{2\big(\hat f_n(\eta_{i+1})-\hat f_n(\eta_i)\big)}{h_i+h_{i+1}}(z_i-\eta_i)+\hat f_n(\eta_i)=-slp_i(\hat f)\,(\eta_i-z_i)+\hat f_n(\eta_i)\ \ge\ -\frac{2\hat f_n(\eta_i)}{h_i}(\eta_i-z_i)+\hat f_n(\eta_i)=0,$$
where the inequality uses $slp_i(\hat f)\le\frac{2\hat f_n(\eta_i)}{h_i}$, which is guaranteed by Rule (I), and the last equality uses $\eta_i-z_i=\frac{h_i}{2}$; since $\hat f$ is linear on the bin, its minimum over $[z_i,z_{i+1})$ is attained at $z_i$. A similar proof can be obtained in the case of $slp_i(\hat f)<0$.
(2) For the bin $[z_i,z_{i+1})$, $i=1,2,\ldots,N-1$, we have
$$\begin{aligned}\int_{z_i}^{z_{i+1}}\hat f(z)\,dz&=\int_{z_i}^{z_{i+1}}\frac{2\big(\hat f_n(\eta_{i+1})-\hat f_n(\eta_i)\big)}{h_i+h_{i+1}}(z-\eta_i)\,dz+\int_{z_i}^{z_{i+1}}\hat f_n(\eta_i)\,dz\\
&=\frac{\hat f_n(\eta_{i+1})-\hat f_n(\eta_i)}{h_i+h_{i+1}}\Big[(z_{i+1}-\eta_i)^2-(z_i-\eta_i)^2\Big]+\hat f_n(\eta_i)\,(z_{i+1}-z_i)\\
&=\frac{\hat f_n(\eta_{i+1})-\hat f_n(\eta_i)}{h_i+h_{i+1}}\,h_i\,(z_{i+1}+z_i-2\eta_i)+\hat f_n(\eta_i)h_i=\hat f_n(\eta_i)h_i,\end{aligned}$$
since $z_i+z_{i+1}-2\eta_i=0$. For the bin $[z_N,z_{N+1}]$, we obtain
$$\int_{z_N}^{z_{N+1}}\hat f(z)\,dz=\int_{z_N}^{z_{N+1}}\frac{h_N+\eta_N-z}{h_N}\,\hat f_n(\eta_N)\,dz=\hat f_n(\eta_N)h_N.$$
By the classical histogram method, we know that $\sum_{i=1}^{N}\hat f_n(\eta_i)h_i=1$. Thus,
$$\int_{z_1}^{z_{N+1}}\hat f(z)\,dz=1.$$
The proof is completed. □
Remark 2.
By Proposition 1, $\hat f_n$ can also be viewed as an estimate of the true density function, but its adaptive bin widths enable it to depict the distributional nature of the sample data more clearly than the fixed bin width of the classical frequency histogram method.

2.2. Construction of Confidence Sets Only with Finitely Many Parameters

Using the obtained $\hat f_n(z)$ with $N+1$ nodes $z_1,z_2,\ldots,z_{N+1}$, we calculate the probability mass $\hat{\mathbb{P}}$ of $\hat f_n(z)$ on each bin $B_i$:
$$\hat p_i(z_i,z_{i+1})=\hat{\mathbb{P}}\big(\xi\in[z_i,z_{i+1})\big)=\int_{z_i}^{z_{i+1}}\hat f_n(z)\,dz,\quad i=1,\ldots,N.\tag{5}$$
With $\hat p_i(z_i,z_{i+1})$, $i=1,\ldots,N$, we define a confidence set for the random variable $\xi$ and a given divergence tolerance $\gamma>0$ by
$$\mathcal{D}=\left\{\mathbb{P}\ \middle|\ \begin{aligned}&p_i=\mathbb{P}\Big\{\xi=\eta_i=\tfrac{z_i+z_{i+1}}{2}\Big\},\ i=1,2,\ldots,N,\\ &\sum_{i=1}^{N}p_i=1,\qquad\sum_{i=1}^{N}\big|p_i-\hat p_i(z_i,z_{i+1})\big|\le\gamma,\\ &0\le p_i\le 1,\ i=1,2,\ldots,N\end{aligned}\right\}.\tag{6}$$
Clearly, the divergence tolerance $\gamma>0$ controls the size of the confidence set $\mathcal{D}$. Notably, different from the existing confidence sets in the literature, the set $\mathcal{D}$ in (6) only involves finitely many parameters, rather than a space of probability density functions.
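As a small illustration, checking whether a candidate mass vector belongs to $\mathcal{D}$ in (6) involves only three finite-dimensional conditions; by Proposition 1, the nominal masses satisfy $\hat p_i=\hat f_n(\eta_i)h_i$. The numbers used below are hypothetical placeholders for the $N$ bin masses.

```python
# A small sketch of membership in the confidence set (6).
import numpy as np

def in_confidence_set(p, p_hat, gamma, tol=1e-9):
    return (np.all(p >= -tol) and np.all(p <= 1 + tol)      # 0 <= p_i <= 1
            and abs(p.sum() - 1.0) <= tol                   # sum_i p_i = 1
            and np.sum(np.abs(p - p_hat)) <= gamma + tol)   # variation bound

p_hat = np.array([0.1, 0.4, 0.3, 0.2])    # nominal masses p_hat_i
p     = np.array([0.15, 0.35, 0.3, 0.2])  # a perturbed candidate distribution
print(in_confidence_set(p, p_hat, gamma=0.2))  # True: total deviation is 0.1
```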
Theorem 1.
Let $\xi$ be a random variable with distribution $\mathbb{P}$ and probability density $f$. Partition the domain of $\xi$ into $N$ disjoint bins $\{B_i\}_{i=1}^N$. The sample size is denoted by $n_s$. Let $\hat p_i(z_i,z_{i+1})$ be defined by (5), and let $\mathbb{P}^{n_s}$ denote the sampling distribution induced by samples of size $n_s$. Then, for any confidence level $\beta\in(0,1)$, it holds that
$$\mathbb{P}^{n_s}\left\{\sum_{i=1}^{N}\big|p_i-\hat p_i(z_i,z_{i+1})\big|\le\sqrt{\frac{N^2\ln(2N/\beta)}{2n_s}}\right\}\ge 1-\beta.\tag{7}$$
Proof. 
By (5), for each bin $i$,
$$\hat p_i(z_i,z_{i+1})=\int_{B_i}\frac{v_i}{n_s h_i}\,dz=\frac{v_i}{n_s}=\frac{1}{n_s}\sum_{j=1}^{n_s}I_{B_i}(\xi^j),\quad i=1,\ldots,N,$$
where $I_{B_i}(\xi^j)$ is a Bernoulli variable with $0\le I_{B_i}(\xi^j)\le 1$. Due to sample independence, the indicators $I_{B_i}(\xi^j)$ are independent across $j$, satisfying the independence and boundedness conditions required by Hoeffding's inequality [22]. For any $\epsilon>0$,
$$\mathbb{P}^{n_s}\Big\{\big|\hat p_i(z_i,z_{i+1})-p_i\big|\ge\epsilon\Big\}\le 2\exp(-2n_s\epsilon^2).\tag{8}$$
To control the total variation divergence across all $N$ bins, we apply the union bound to the events $\{|\hat p_i(z_i,z_{i+1})-p_i|\ge\epsilon\}$, i.e.,
$$\mathbb{P}^{n_s}\left\{\bigcup_{i=1}^{N}\big\{|\hat p_i(z_i,z_{i+1})-p_i|\ge\epsilon\big\}\right\}\le\sum_{i=1}^{N}\mathbb{P}^{n_s}\Big\{\big|\hat p_i(z_i,z_{i+1})-p_i\big|\ge\epsilon\Big\}.\tag{9}$$
Substituting the Hoeffding bound (8) into (9), we get
$$\mathbb{P}^{n_s}\left\{\max_{1\le i\le N}\big|\hat p_i(z_i,z_{i+1})-p_i\big|\ge\epsilon\right\}\le 2N\exp(-2n_s\epsilon^2).$$
For a given confidence level $\beta\in(0,1)$, the inequality $2N\exp(-2n_s\epsilon^2)\le\beta$ yields
$$\epsilon\ge\sqrt{\frac{\ln(2N/\beta)}{2n_s}}.$$
Hence, taking $\epsilon=\sqrt{\ln(2N/\beta)/(2n_s)}$, with probability at least $1-\beta$ every bin satisfies $|\hat p_i(z_i,z_{i+1})-p_i|\le\epsilon$, so that $\sum_{i=1}^{N}|p_i-\hat p_i(z_i,z_{i+1})|\le N\epsilon=\sqrt{N^2\ln(2N/\beta)/(2n_s)}$. Consequently, the desired inequality (7) holds. □
Remark 3.
By Theorem 1, we establish a rigorous statistical guarantee for the confidence set $\mathcal{D}$: the total deviation between the estimated probabilities $\hat p_i$ (from the adaptive-bandwidth density estimation) and the true probabilities $p_i$ is bounded by
$$\gamma_{\beta,n_s}=\sqrt{\frac{N^2\ln(2N/\beta)}{2n_s}}$$
with confidence level $1-\beta$.
Notably, $\mathcal{D}$ is constructed based on direct deviations of probability mass functions, rather than the traditional moment information (e.g., mean or covariance), making it inherently suitable for scenarios where higher-order distributional properties are unknown or difficult to estimate.
By Theorem 1, we further prove the following result, which gives a distributionally robust guarantee for the chance constraint $\mathbb{P}\{A(\xi)x\le b(\xi)\}\ge 1-\alpha$.
Theorem 2.
Let $\xi$ be a random variable with unknown distribution $\mathbb{P}$. Suppose that the domain of $\xi$ is partitioned into $N$ disjoint bins $\{B_i\}_{i=1}^N$. Let $n_s$ be the number of independent samples used to construct the reference distribution $\hat{\mathbb{P}}$, with $\hat p_i$ denoting the reference probability of bin $B_i$. Consider the confidence set $\mathcal{D}$ defined in (6), where $\gamma=\sqrt{N^2\ln(2N/\beta)/(2n_s)}$ for some $\beta\in(0,1)$. For the chance constraint $\mathbb{P}\{A(\xi)x\le b(\xi)\}\ge 1-\alpha$, let $\hat V(x)=\hat{\mathbb{P}}\{A(\xi)x>b(\xi)\}$ be the reference violation probability. Then, for any $\alpha,\beta\in(0,1)$, when
$$\hat V(x)\le\alpha-\frac{\gamma}{2},$$
it holds with probability at least $1-\beta$ that $\inf_{\mathbb{P}\in\mathcal{D}}\mathbb{P}\{A(\xi)x\le b(\xi)\}\ge 1-\alpha$.
Moreover, the required sample size $n_s$ to ensure $\hat V(x)\le\alpha-\frac{\gamma}{2}$ satisfies
$$n_s\ge\frac{N^2\ln(2N/\beta)}{8\big(\alpha-\hat V(x)\big)^2}.\tag{10}$$
Proof. 
By Theorem 1, with probability at least $1-\beta$, we have $\sum_{i=1}^{N}|p_i-\hat p_i|\le\gamma$. The total variation distance [23] between $\mathbb{P}$ and $\hat{\mathbb{P}}$ is then
$$\delta(\mathbb{P},\hat{\mathbb{P}})=\frac{1}{2}\sum_{i=1}^{N}|p_i-\hat p_i|\le\frac{\gamma}{2}.$$
Thus, for any random event $E$, we know that
$$\big|\mathbb{P}(E)-\hat{\mathbb{P}}(E)\big|\le\delta(\mathbb{P},\hat{\mathbb{P}})\le\frac{\gamma}{2}.$$
In particular, taking $E=\{A(\xi)x>b(\xi)\}$, we get
$$\mathbb{P}(E)\le\hat{\mathbb{P}}(E)+\frac{\gamma}{2}=\hat V(x)+\frac{\gamma}{2}.$$
Consequently,
$$\mathbb{P}\{A(\xi)x\le b(\xi)\}=1-\mathbb{P}(E)\ge 1-\hat V(x)-\frac{\gamma}{2}.$$
In the case that $\hat V(x)\le\alpha-\frac{\gamma}{2}$, it is concluded that $\mathbb{P}\{A(\xi)x\le b(\xi)\}\ge 1-\alpha$. Since every $\mathbb{P}\in\mathcal{D}$ satisfies $\sum_{i=1}^{N}|p_i-\hat p_i|\le\gamma$, the same argument applies to each $\mathbb{P}\in\mathcal{D}$, and it follows that
$$\inf_{\mathbb{P}\in\mathcal{D}}\mathbb{P}\{A(\xi)x\le b(\xi)\}\ge 1-\alpha.$$
Since
$$\hat V(x)\le\alpha-\frac{\gamma}{2},$$
it follows from $\gamma=\sqrt{N^2\ln(2N/\beta)/(2n_s)}$ that
$$n_s\ge\frac{N^2\ln(2N/\beta)}{8\big(\alpha-\hat V(x)\big)^2}.$$
This completes the proof. □
Remark 4.
In Theorem 2, a sample size condition (10) is provided such that the distributionally robust chance constraint holds with high probability, without requiring knowledge of the mean or covariance. Note that this condition depends on the reference violation probability V ^ ( x ) , which can be directly estimated from the collected data.
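Both bounds can be evaluated directly. The following hedged numeric sketch computes the tolerance $\gamma_{\beta,n_s}$ of Theorem 1 and the sample-size requirement of Theorem 2; the concrete values of $N$, $\beta$, $\alpha$, and $\hat V(x)$ are assumptions chosen only for illustration.

```python
# Numeric illustration of the bounds in Theorems 1 and 2.
import math

def gamma_bound(N, beta, ns):
    # Theorem 1: sum_i |p_i - p_hat_i| <= sqrt(N^2 ln(2N/beta) / (2 ns))
    return math.sqrt(N**2 * math.log(2 * N / beta) / (2 * ns))

def required_samples(N, beta, alpha, V_hat):
    # Theorem 2: ns >= N^2 ln(2N/beta) / (8 (alpha - V_hat)^2)
    return math.ceil(N**2 * math.log(2 * N / beta) / (8 * (alpha - V_hat)**2))

print(gamma_bound(N=20, beta=0.05, ns=50_000))                   # about 0.16
print(required_samples(N=20, beta=0.05, alpha=0.1, V_hat=0.05))  # about 133,700
```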

3. Reformulation of Data-Driven Distributionally Robust Chance Constraints

3.1. Reformulation of Distributionally Robust Chance Constraints with a Random Variable

As in [8], it is supposed that all the model parameters in (2b) are linearly correlated with the random parameter $\xi$, which yields
$$A(\xi)=A^0+A\xi,\qquad b(\xi)=b^0+b\xi.$$
Thus, for each $k\in K=\{1,2,\ldots,m\}$, it holds that
$$A(\xi)x\le b(\xi)\ \Longleftrightarrow\ (Ax-b)\,\xi\le b^0-A^0x\ \Longleftrightarrow\ (A_kx-b_k)\,\xi\le b_k^0-A_k^0x,\quad k\in K,$$
where the $k$-th row of the matrix $A$ is denoted by $A_k$ (and similarly for $A^0$, $b$, and $b^0$).
Remark 5.
When $A_kx-b_k=0$, the inequality degenerates to $0\le b_k^0-A_k^0x$, which is a deterministic constraint independent of $\xi$. When $\xi$ is a continuous random parameter, the degenerate case $A_kx-b_k=0$ can be ignored in the analysis of the chance constraint.
For any given $x$, define
$$c_k(x)=\left\{z\in\mathbb{R}:\ \begin{aligned}&z\le Top_k^+(x)=\frac{b_k^0-A_k^0x}{A_kx-b_k},&&\text{if }A_kx-b_k>0,\\[1mm] &z\ge Top_k^-(x)=\frac{b_k^0-A_k^0x}{A_kx-b_k},&&\text{if }A_kx-b_k<0\end{aligned}\right\}.$$
Let $C(x)=\bigcap_{k=1}^{m}c_k(x)$. Then, we can define the worst-case probability bound by
$$z_D=\inf_{\mathbb{P}\in\mathcal{D}}\mathbb{P}\{A(\xi)x\le b(\xi)\}=\inf_{\mathbb{P}\in\mathcal{D}}\mathbb{P}\{\xi\in C(x)\}.\tag{11}$$
Denote
$$S^+=\{k\in K:\ A_kx-b_k>0\},\qquad U(x)=\min_{k\in S^+}Top_k^+(x),\qquad S^-=\{k\in K:\ A_kx-b_k<0\},\qquad L(x)=\max_{k\in S^-}Top_k^-(x).\tag{12}$$
We now prove that $z_D$ in (11) has the following property.
Proposition 2.
Let $S^+$, $S^-$, $L(x)$, and $U(x)$ be defined by (12). The worst-case probability bound $z_D$ in (11) is specified by one of the following three formulas.

(1) When $S^-=\emptyset$,
$$z_D=\begin{cases}0,&\text{if }U(x)<z_1,\\ \inf_{\mathbb{P}\in\mathcal{D}}\sum_{i=1}^{j}p_i,&\text{if }z_j\le U(x)<z_{j+1},\ j=1,2,\ldots,N,\\ 1,&\text{if }U(x)\ge z_N.\end{cases}\tag{13}$$

(2) When $S^+=\emptyset$,
$$z_D=\begin{cases}1,&\text{if }L(x)<z_1,\\ \inf_{\mathbb{P}\in\mathcal{D}}\sum_{i=j}^{N}p_i,&\text{if }z_j\le L(x)<z_{j+1},\ j=1,2,\ldots,N,\\ 0,&\text{if }L(x)\ge z_{N+1}.\end{cases}\tag{14}$$

(3) When $S^+\ne\emptyset$ and $S^-\ne\emptyset$,
$$z_D=\begin{cases}0,&\text{if }L(x)>U(x),\text{ or }U(x)<z_1,\text{ or }L(x)\ge z_{N+1},\\ \inf_{\mathbb{P}\in\mathcal{D}}\sum_{i=j}^{j'}p_i,&\text{if }L(x)\in[z_j,z_{j+1}),\ U(x)\in[z_{j'},z_{j'+1})\text{ with }j,j'\in\{1,2,\ldots,N\},\\ 1,&\text{if }L(x)\le z_1\text{ and }U(x)\ge z_{N+1}.\end{cases}$$
Proof. 
The probability $\mathbb{P}\{\xi\in C(x)\}$ in (11) reduces to three distinct cases.

Case 1: $S^-=\emptyset$. Then, $\mathbb{P}\{\xi\in C(x)\}=\mathbb{P}\{\xi\le\min_{k\in K}Top_k^+(x)\}$, matching the original formulation when $A_kx-b_k>0$ for all $k$. From (12), we know that
$$\mathbb{P}(\xi\in C(x))=\mathbb{P}\{\xi\le U(x)\}=\begin{cases}0,&\text{if }U(x)<z_1,\\ \sum_{i=1}^{j}p_i,&\text{if }z_j\le U(x)<z_{j+1},\ j=1,2,\ldots,N,\\ 1,&\text{if }U(x)\ge z_N.\end{cases}$$
Thus,
$$z_D=\begin{cases}0,&\text{if }U(x)<z_1,\\ \inf_{\mathbb{P}\in\mathcal{D}}\sum_{i=1}^{j}p_i,&\text{if }z_j\le U(x)<z_{j+1},\ j=1,2,\ldots,N,\\ 1,&\text{if }U(x)\ge z_N.\end{cases}$$

Case 2: $S^+=\emptyset$. Then, $\mathbb{P}\{\xi\in C(x)\}=\mathbb{P}\{\xi\ge\max_{k\in K}Top_k^-(x)\}$, matching the original formulation when $A_kx-b_k<0$ for all $k$. By (12), we get
$$\mathbb{P}(\xi\in C(x))=\mathbb{P}\{\xi\ge L(x)\}=\begin{cases}1,&\text{if }L(x)<z_1,\\ \sum_{i=j}^{N}p_i,&\text{if }z_j\le L(x)<z_{j+1},\ j=1,2,\ldots,N,\\ 0,&\text{if }L(x)\ge z_{N+1}.\end{cases}$$
Then,
$$z_D=\begin{cases}1,&\text{if }L(x)<z_1,\\ \inf_{\mathbb{P}\in\mathcal{D}}\sum_{i=j}^{N}p_i,&\text{if }z_j\le L(x)<z_{j+1},\ j=1,2,\ldots,N,\\ 0,&\text{if }L(x)\ge z_{N+1}.\end{cases}$$

Case 3: $S^+\ne\emptyset$ and $S^-\ne\emptyset$. Then,
$$\mathbb{P}\{\xi\in C(x)\}=\mathbb{P}\Big\{\max_{k\in S^-}Top_k^-(x)\le\xi\le\min_{k\in S^+}Top_k^+(x)\Big\},$$
which is nonzero only if $\max_{k\in S^-}Top_k^-(x)\le\min_{k\in S^+}Top_k^+(x)$. Conversely, if $\max_{k\in S^-}Top_k^-(x)>\min_{k\in S^+}Top_k^+(x)$, then $\mathbb{P}(\xi\in C(x))=0$.
From (12), it follows that the probability $\mathbb{P}\{\xi\in C(x)\}$ is expressed as follows:
$$\mathbb{P}(\xi\in C(x))=\mathbb{P}\{L(x)\le\xi\le U(x)\}=\begin{cases}0,&\text{if }L(x)>U(x),\text{ or }U(x)<z_1,\text{ or }L(x)\ge z_{N+1},\\ \sum_{i=j}^{j'}p_i,&\text{if }L(x)\in[z_j,z_{j+1}),\ U(x)\in[z_{j'},z_{j'+1})\text{ with }j,j'\in\{1,2,\ldots,N\},\\ 1,&\text{if }L(x)\le z_1\text{ and }U(x)\ge z_{N+1}.\end{cases}$$
Then,
$$z_D=\begin{cases}0,&\text{if }L(x)>U(x),\text{ or }U(x)<z_1,\text{ or }L(x)\ge z_{N+1},\\ \inf_{\mathbb{P}\in\mathcal{D}}\sum_{i=j}^{j'}p_i,&\text{if }L(x)\in[z_j,z_{j+1}),\ U(x)\in[z_{j'},z_{j'+1})\text{ with }j,j'\in\{1,2,\ldots,N\},\\ 1,&\text{if }L(x)\le z_1\text{ and }U(x)\ge z_{N+1}.\end{cases}$$
With the above argument, we have completed the proof of the desired result.□
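Proposition 2 can be made concrete with a short sketch: over the set $\mathcal{D}$ in (6), the worst case of the mass that $C(x)$ captures has a closed form, because any $\mathbb{P}\in\mathcal{D}$ is within total variation distance $\gamma/2$ of the nominal distribution and so can shift at most $\gamma/2$ of mass out of $C(x)$. The masses below are illustrative, and the formula assumes $C(x)$ does not cover all bins.

```python
# Closed-form evaluation of inf over D of P(xi in C(x)) for the set (6).
import numpy as np

def worst_case_prob(p_hat, inside, gamma):
    """inf over D of the total mass on bins whose midpoint lies in C(x)."""
    nominal = p_hat[inside].sum()              # nominal mass z_hat_D
    return max(nominal - gamma / 2.0, 0.0)     # adversary removes <= gamma/2

p_hat  = np.array([0.1, 0.4, 0.3, 0.2])        # nominal bin masses
inside = np.array([False, True, True, False])  # bins between L(x) and U(x)
print(worst_case_prob(p_hat, inside, gamma=0.1))  # 0.7 - 0.05 = 0.65
```

This closed form is consistent with Theorem 3 below: the worst case stays at or above $1-\alpha$ exactly when the nominal mass is at least $1-\alpha+\gamma/2$.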
By Proposition 2, we now prove that the chance constraints (2b) can be reformulated as a number of ordinary ones.
Theorem 3.
Let $A$ and $b$ be a given matrix and a given vector. Let $\hat{\mathbb{P}}$ be the estimated distribution obtained through the proposed adaptive data-driven method, and let $\gamma$ be a given divergence tolerance of the ambiguous distributions in the confidence set $\mathcal{D}$ defined by (6), satisfying $2\alpha-\gamma>0$. Then, the feasible set defined by the chance constraint, i.e., $F_{cc}=\big\{x\in\mathbb{R}^n:\inf_{\mathbb{P}\in\mathcal{D}}\mathbb{P}\{A(\xi)x\le b(\xi)\}\ge 1-\alpha\big\}$, can be reformulated in the following forms.

1. The set $F_{cc}=\emptyset$ if one of the following conditions holds:
(i) for all $k\in K$, $A_kx-b_k\ge 0$ and $U(x)<z_1$;
(ii) for all $k\in K$, $A_kx-b_k\le 0$ and $L(x)\ge z_{N+1}$;
(iii) there exist $k_1,k_2\in K$ such that $A_{k_1}x-b_{k_1}\ge 0$, $A_{k_2}x-b_{k_2}\le 0$, and $L(x)>U(x)$;
(iv) there exist $k_1,k_2\in K$ such that $A_{k_1}x-b_{k_1}\ge 0$, $A_{k_2}x-b_{k_2}\le 0$, and $U(x)<z_1$;
(v) there exist $k_1,k_2\in K$ such that $A_{k_1}x-b_{k_1}\ge 0$, $A_{k_2}x-b_{k_2}\le 0$, and $L(x)\ge z_{N+1}$.

2. The set $F_{cc}=C_1^1\cup C_2^1\cup C_3^1$, where
$$C_1^1=\big\{x\in\mathbb{R}^n:\ \forall k\in K,\ A_kx-b_k\ge 0,\ U(x)\ge z_N\big\},$$
$$C_2^1=\big\{x\in\mathbb{R}^n:\ \forall k\in K,\ A_kx-b_k\le 0,\ L(x)<z_1\big\},$$
$$C_3^1=\big\{x\in\mathbb{R}^n:\ \exists k_1,k_2\in K,\ A_{k_1}x-b_{k_1}\ge 0,\ A_{k_2}x-b_{k_2}\le 0,\ L(x)\le z_1\big\}\cap\big\{x\in\mathbb{R}^n:\ \exists k_1,k_2\in K,\ A_{k_1}x-b_{k_1}\ge 0,\ A_{k_2}x-b_{k_2}\le 0,\ U(x)\ge z_{N+1}\big\}.$$

3. The set $F_{cc}=C_1\cup C_2\cup C_3$, where
$$C_1=\Big\{x\in\mathbb{R}^n:\ \sum_{i=1}^{j}\hat p_i(z_i,z_{i+1})\ge 1-\alpha+\frac{\gamma}{2},\ A_kx-b_k\ge 0\ \forall k\in K,\ z_j\le U(x)<z_{j+1}\text{ for some }j\Big\},$$
$$C_2=\Big\{x\in\mathbb{R}^n:\ \sum_{i=j}^{N}\hat p_i(z_i,z_{i+1})\ge 1-\alpha+\frac{\gamma}{2},\ A_kx-b_k\le 0\ \forall k\in K,\ z_j\le L(x)<z_{j+1}\text{ for some }j\Big\},$$
$$C_3=\Big\{x\in\mathbb{R}^n:\ \sum_{i=j}^{j'}\hat p_i(z_i,z_{i+1})\ge 1-\alpha+\frac{\gamma}{2},\ \exists k_1,k_2\in K,\ A_{k_1}x-b_{k_1}\ge 0,\ A_{k_2}x-b_{k_2}\le 0,\ L(x)\in[z_j,z_{j+1}),\ U(x)\in[z_{j'},z_{j'+1}),\text{ with }j,j'\in\{1,2,\ldots,N\}\Big\}.$$
Proof. 
We prove the result in the following three cases.
Case 1: $S^-=\emptyset$, i.e., $A_kx-b_k>0$ for all $k\in K$.

(i) When $U(x)<z_1$, it is deduced from (13) that $z_D=0$. Consequently, the inequality $z_D\ge 1-\alpha$ does not hold.

(ii) When $U(x)\ge z_N$, it follows from (13) that $z_D=1$, and thus $z_D\ge 1-\alpha$ always holds.

(iii) When $z_j\le U(x)<z_{j+1}$, $j=1,2,\ldots,N$, the confidence set $\mathcal{D}$ is defined by (6). For the given $x$ and the set $C(x)$, the worst-case probability bound (11) is transformed into the following programming problem:
$$[\text{Primal}]\qquad z_D=\inf_{\mathbb{P}\in\mathcal{D}}\ \mathbb{P}\{\xi\in C(x)\}\tag{15a}$$
$$=\min_{\mathbb{P}}\ \mathbb{P}\{\xi\in C(x)\}\tag{15b}$$
$$\text{s.t.}\quad\sum_{i=1}^{N}\hat p_i(z_i,z_{i+1})\left|\frac{p_i}{\hat p_i(z_i,z_{i+1})}-1\right|\le\gamma,\tag{15c}$$
$$\sum_{i=1}^{N}p_i=1,\tag{15d}$$
$$p_i\ge 0,\quad i=1,2,\ldots,N,\tag{15e}$$
where the constraint (15c) bounds the divergence from above by $\gamma$ (note that (15c) is exactly $\sum_{i=1}^{N}|p_i-\hat p_i(z_i,z_{i+1})|\le\gamma$), and constraints (15d) and (15e) guarantee that $\mathbb{P}$ is a probability distribution. This transformation is justified by the structural properties of the confidence set $\mathcal{D}$, which collectively constrain the feasible region of $\mathbb{P}$. Consequently, we get
$$z_D=\min_{\mathbb{P}}\ \sum_{i=1}^{j}p_i\quad\text{s.t.}\quad\sum_{i=1}^{N}\hat p_i(z_i,z_{i+1})\left|\frac{p_i}{\hat p_i(z_i,z_{i+1})}-1\right|\le\gamma,\quad\sum_{i=1}^{N}p_i=1,\quad p_i\ge 0,\ i=1,2,\ldots,N.\tag{16}$$
For the constraints in Problem (16), let $\lambda_{eq}$ be the Lagrange multiplier corresponding to the equality constraint and $\lambda_{ne}$ the multiplier corresponding to the inequality constraint. Noting that $\sum_{i=1}^{j}p_i=\sum_{i=1}^{N}I_{C(x)}(\eta_i)\,p_i$ here, the Lagrangian dual reads
$$\begin{aligned}
L(\mathbb{P},\lambda_{eq},\lambda_{ne})&=\max_{\lambda_{eq},\lambda_{ne}\ge 0}\ \min_{\mathbb{P}\ge 0}\ \left\{\sum_{i=1}^{N}I_{C(x)}(\eta_i)\,p_i+\lambda_{eq}\Big(1-\sum_{i=1}^{N}p_i\Big)+\lambda_{ne}\Big(\sum_{i=1}^{N}\hat p_i(z_i,z_{i+1})\Big|\frac{p_i}{\hat p_i(z_i,z_{i+1})}-1\Big|-\gamma\Big)\right\}\\
&=\max_{\lambda_{eq},\lambda_{ne}\ge 0}\ \left\{\lambda_{eq}-\lambda_{ne}\gamma+\min_{\mathbb{P}\ge 0}\sum_{i=1}^{N}\Big[\big(I_{C(x)}(\eta_i)-\lambda_{eq}\big)p_i+\lambda_{ne}\hat p_i(z_i,z_{i+1})\Big|\frac{p_i}{\hat p_i(z_i,z_{i+1})}-1\Big|\Big]\right\}\\
&=\max_{\lambda_{eq},\lambda_{ne}\ge 0}\ \left\{\lambda_{eq}-\lambda_{ne}\gamma-\max_{\mathbb{P}\ge 0}\ \lambda_{ne}\sum_{i=1}^{N}\hat p_i(z_i,z_{i+1})\Big[\frac{\lambda_{eq}-I_{C(x)}(\eta_i)}{\lambda_{ne}}\cdot\frac{p_i}{\hat p_i(z_i,z_{i+1})}-\Big|\frac{p_i}{\hat p_i(z_i,z_{i+1})}-1\Big|\Big]\right\},
\end{aligned}$$
where $I_{C(x)}(z)$ is the indicator function, which takes the value 1 when $z\in C(x)$ and 0 otherwise. Note that the conjugate function of $\varphi(t)=|t-1|$ is
$$\varphi^*(s)=\sup_{t\ge 0}\{st-\varphi(t)\}=\begin{cases}-1,&\text{if }s<-1,\\ s,&\text{if }-1\le s\le 1,\\ +\infty,&\text{if }s>1.\end{cases}\tag{17}$$
Then,
$$\begin{aligned}
z_D'&=\max_{\lambda_{eq},\lambda_{ne}>0}\ \lambda_{eq}-\lambda_{ne}\gamma-\lambda_{ne}\sum_{i=1}^{N}\hat p_i(z_i,z_{i+1})\,\varphi^*\!\left(\frac{\lambda_{eq}-I_{C(x)}(\eta_i)}{\lambda_{ne}}\right)\\
&=\max_{\lambda_{eq},\lambda_{ne}>0}\ \lambda_{eq}-\lambda_{ne}\gamma-\lambda_{ne}\sum_{i:\,I_{C(x)}=1}\hat p_i(z_i,z_{i+1})\,\varphi^*\!\left(\frac{\lambda_{eq}-1}{\lambda_{ne}}\right)-\lambda_{ne}\sum_{i:\,I_{C(x)}=0}\hat p_i(z_i,z_{i+1})\,\varphi^*\!\left(\frac{\lambda_{eq}}{\lambda_{ne}}\right)\\
&=\max_{\lambda_{eq},\lambda_{ne}>0}\ \lambda_{eq}-\lambda_{ne}\gamma-\lambda_{ne}\hat z_D\,\varphi^*\!\left(\frac{\lambda_{eq}-1}{\lambda_{ne}}\right)-\lambda_{ne}\big(1-\hat z_D\big)\,\varphi^*\!\left(\frac{\lambda_{eq}}{\lambda_{ne}}\right),
\end{aligned}\tag{18}$$
where $\hat z_D=\sum_{i:\,I_{C(x)}=1}\hat p_i(z_i,z_{i+1})=\hat{\mathbb{P}}(\xi\in C(x))$.
By strong duality in the theory of linear programming, it holds that $z_D=z_D'$ for Problems (15) and (18). Thus, $z_D\ge 1-\alpha$ if and only if there exist constants $\lambda_{eq}>0$ and $\lambda_{ne}>0$ such that
$$\lambda_{eq}-\lambda_{ne}\gamma-\lambda_{ne}\hat z_D\,\varphi^*\!\left(\frac{\lambda_{eq}-1}{\lambda_{ne}}\right)-\lambda_{ne}(1-\hat z_D)\,\varphi^*\!\left(\frac{\lambda_{eq}}{\lambda_{ne}}\right)\ge 1-\alpha,\tag{19a}$$
which can be rearranged as
$$\hat z_D\,\lambda_{ne}\left[\varphi^*\!\left(\frac{\lambda_{eq}}{\lambda_{ne}}\right)-\varphi^*\!\left(\frac{\lambda_{eq}-1}{\lambda_{ne}}\right)\right]\ \ge\ 1-\alpha-\lambda_{eq}+\lambda_{ne}\gamma+\lambda_{ne}\varphi^*\!\left(\frac{\lambda_{eq}}{\lambda_{ne}}\right).\tag{19b}$$
Denote $\lambda_1=\frac{\lambda_{eq}}{\lambda_{ne}}\ge 0$ and $\lambda_0=\frac{\lambda_{eq}-1}{\lambda_{ne}}$. Then, $\lambda_0=\lambda_1-\frac{1}{\lambda_{ne}}\le\lambda_1$.
We first prove that $\lambda_1>1$ is impossible. Indeed, from (17) and $\lambda_{ne}>0$, when $\lambda_1>1$ we have $\varphi^*(\lambda_1)=+\infty$, so the left side of (19a) tends towards negative infinity, violating the distributionally robust chance constraint. Therefore, $0\le\lambda_1\le 1$ and $\lambda_0\le 1$.
Because $\varphi^*$ is a conjugate function, it follows from its properties that (19b) is equivalent to
$$\hat z_D\ \ge\ \inf_{\lambda_0\le 1,\ 0\le\lambda_1\le 1}\ \frac{1-\alpha-\lambda_{ne}\lambda_1+\lambda_{ne}\gamma+\lambda_{ne}\varphi^*(\lambda_1)}{\lambda_{ne}\big[\varphi^*(\lambda_1)-\varphi^*(\lambda_0)\big]}.\tag{20}$$
We now conduct a case-by-case analysis of $\lambda_0$ as follows.
When $-1\le\lambda_0\le 1$, we have $\varphi^*(\lambda_0)=\lambda_0$ and $\varphi^*(\lambda_1)=\lambda_1$, so that $\lambda_{ne}[\varphi^*(\lambda_1)-\varphi^*(\lambda_0)]=1$ and the right-hand side of inequality (20) is transformed into
$$1-\sup_{-1\le\lambda_0\le 1,\ 0\le\lambda_1\le 1}\{\alpha-\lambda_{ne}\gamma\}=1-\alpha+\frac{\gamma}{2}.\tag{21}$$
Indeed, $-1\le\lambda_0=\lambda_1-\frac{1}{\lambda_{ne}}$ and $\lambda_1\le 1$ give $\frac{1}{\lambda_{ne}}\le\lambda_1+1\le 2$; thus, $\lambda_{ne}\ge\frac{1}{2}$, and the supremum is attained at $\lambda_{ne}=\frac{1}{2}$.
When $\lambda_0<-1$, we have $\varphi^*(\lambda_0)=-1$, and the right-hand side of inequality (20) is rewritten as
$$1-\sup_{\lambda_0<-1,\ 0\le\lambda_1\le 1}\left\{\frac{\alpha-1+\lambda_{ne}(\lambda_1-\gamma+1)}{\lambda_{ne}(\lambda_1+1)}\right\}=1-\sup_{\lambda_0<-1,\ 0\le\lambda_1\le 1}\left\{\frac{\alpha-1}{\lambda_{ne}(\lambda_1+1)}+\frac{\lambda_1-\gamma+1}{\lambda_1+1}\right\}.\tag{22}$$
Since $0\le\lambda_1\le 1$ and $0\le\alpha\le 1$, when $\lambda_{ne}\to 0^+$ the required bound in (22) tends towards positive infinity, so no multipliers in this regime can certify the constraint; moreover, $\lambda_0<-1$ forces $\lambda_{ne}(\lambda_1+1)<1$, under which the bound in (22) always exceeds $1-\alpha+\frac{\gamma}{2}$. Therefore, from (20) and (21), we get
$$\hat z_D\ge 1-\alpha+\frac{\gamma}{2}.$$
Furthermore, by (13), we have
$$\hat z_D=\sum_{i=1}^{j}\hat p_i(z_i,z_{i+1}),\quad\text{when }S^-=\emptyset\text{ and }z_j\le U(x)<z_{j+1},\ j=1,2,\ldots,N.$$
A similar proof can be obtained for Cases 2 and 3.
Case 2: $S^+=\emptyset$, which means that $A_kx-b_k<0$ for all $k\in K$.
(i) When $L(x)\ge z_{N+1}$, (14) implies that $z_D=0$.
(ii) When $L(x)<z_1$, (14) leads to $z_D=1$. As a result, the chance constraint is always satisfied.
(iii) When $z_j\le L(x)<z_{j+1}$, $j=1,2,\ldots,N$, according to (14), it yields
$$\hat z_D=\sum_{i=j}^{N}\hat p_i(z_i,z_{i+1}),\quad\text{when }S^+=\emptyset\text{ and }z_j\le L(x)<z_{j+1},\ j=1,2,\ldots,N.$$
Case 3: $S^+\ne\emptyset$ and $S^-\ne\emptyset$—that is, there exist $k_1,k_2\in K$ with $A_{k_1}x-b_{k_1}>0$ and $A_{k_2}x-b_{k_2}<0$.
(i) When $L(x)>U(x)$, or $U(x)<z_1$, or $L(x)\ge z_{N+1}$, these lead to $z_D=0$.
(ii) When $L(x)\le z_1$ and $U(x)\ge z_{N+1}$, then $z_D=1$. Thus, the chance constraint is always satisfied.
(iii) When $L(x)\in[z_j,z_{j+1})$ and $U(x)\in[z_{j'},z_{j'+1})$ with $j,j'\in\{1,2,\ldots,N\}$, it yields
$$\hat z_D=\sum_{i=j}^{j'}\hat p_i(z_i,z_{i+1}).$$
The proof has been completed. □
Remark 6.
By Theorem 3, constraint (2b) is reformulated as a number of ordinary inequality constraints. Notably, different from the existing results, the derived deterministic equivalent formulation (DEF) of Model (1) is a linear programming model: it does not involve solving any complicated auxiliary optimization problems [8], semi-definite programming problems [3], or mixed-integer second-order cone programming problems.

3.2. Extension to Multistage Chance-Constrained Programming

We further extend the results in Proposition 2 and Theorem 3 to multistage DRCCPs.
Different from Model (2), a multistage DRCCP can be given by
$$\min\ C(x)\quad\text{s.t.}\quad\inf_{\mathbb{P}\in\mathcal{D}}\mathbb{P}\{A(\xi)x\le b(\xi)\}\ge 1-\alpha,\quad x\in X\subseteq\mathbb{R}^n,\tag{23}$$
where $\xi=(\xi_1,\ldots,\xi_T)$ is a random vector, $T$ is the number of stages, $\xi_t$ represents the random parameter at the $t$-th stage, $A(\xi)\in\mathbb{R}^{m\times n}$ is a random coefficient matrix, and $b(\xi)\in\mathbb{R}^m$ is a random capacity vector. Suppose that $\xi_1,\ldots,\xi_T$ are mutually independent. The constraint $A(\xi)x\le b(\xi)$ is equivalent to the intersection of $m$ scalar constraints
$$\bigcap_{s=1}^{m}\big\{A_s(\xi)x\le b_s(\xi)\big\},$$
where $A_s(\xi)\in\mathbb{R}^n$ is the $s$-th row of $A(\xi)$, and $b_s(\xi)\in\mathbb{R}$ is the $s$-th component of $b(\xi)$.
Let $\alpha\in(0,1)$ represent the total risk level, i.e., the required confidence level is $1-\alpha$. If the distribution of each $\xi_t$ in Model (23) is not known precisely but belongs to a confidence set $\mathcal{D}_t$ constructed from historical data based on (6), then $\mathcal{D}_t$ is defined by
$$\mathcal{D}_t=\left\{\mathbb{P}^{(t)}\ \middle|\ \begin{aligned}&p_{i_t}^{(t)}=\mathbb{P}^{(t)}\Big\{\xi_t=\tfrac{z_{i_t}^{(t)}+z_{i_t+1}^{(t)}}{2}\Big\},\ i_t=1,2,\ldots,N_t,\\ &\sum_{i_t=1}^{N_t}p_{i_t}^{(t)}=1,\qquad\sum_{i_t=1}^{N_t}\big|p_{i_t}^{(t)}-\hat p_{i_t}^{(t)}(z_{i_t}^{(t)},z_{i_t+1}^{(t)})\big|\le\gamma_t,\\ &0\le p_{i_t}^{(t)}\le 1,\ i_t=1,2,\ldots,N_t\end{aligned}\right\},\quad t=1,\ldots,T,\tag{24}$$
where $\hat p_{i_t}^{(t)}(z_{i_t}^{(t)},z_{i_t+1}^{(t)})$ is the estimated probability of $\xi_t$ falling in $[z_{i_t}^{(t)},z_{i_t+1}^{(t)})$, and $\gamma_t\ge 0$ controls the confidence level of $\mathcal{D}_t$. Clearly, the overall confidence set for $\xi$ is the product of these marginal sets, i.e., $\mathcal{D}=\mathcal{D}_1\times\cdots\times\mathcal{D}_T$.
For the multistage DRCCP (23), we can prove the following result.
Theorem 4.
Let $\xi=(\xi_1,\ldots,\xi_T)$ be a random vector with independent components, and let the confidence set be $\mathcal{D}=\mathcal{D}_1\times\cdots\times\mathcal{D}_T$, with $\mathcal{D}_t$ defined by (24). If there exist $\alpha_s\in[0,1]$ $(s=1,\ldots,m)$ with $\sum_{s=1}^{m}\alpha_s\le\alpha$ such that
$$\inf_{\mathbb{P}\in\mathcal{D}}\mathbb{P}\big(A_s(\xi)x\le b_s(\xi)\big)\ge 1-\alpha_s,\quad s\in\{1,\ldots,m\},$$
then
$$\inf_{\mathbb{P}\in\mathcal{D}}\mathbb{P}\left(\bigcap_{s=1}^{m}\big\{A_s(\xi)x\le b_s(\xi)\big\}\right)\ge 1-\alpha.$$
Proof. 
Let $E_s=\{A_s(\xi)x\le b_s(\xi)\}$ $(s=1,\ldots,m)$ denote events that may be dependent through the random vector $\xi$. For any joint distribution $\mathbb{P}\in\mathcal{D}$, the Bonferroni inequality (union bound) gives
$$\mathbb{P}\left(\bigcup_{s=1}^{m}E_s^c\right)\le\sum_{s=1}^{m}\mathbb{P}(E_s^c),$$
where $E_s^c=\{A_s(\xi)x>b_s(\xi)\}$ is the complement of $E_s$. Therefore,
$$\mathbb{P}\left(\bigcap_{s=1}^{m}E_s\right)=1-\mathbb{P}\left(\bigcup_{s=1}^{m}E_s^c\right)\ge 1-\sum_{s=1}^{m}\mathbb{P}(E_s^c).$$
Taking the infimum over $\mathbb{P}\in\mathcal{D}$ on both sides,
$$\inf_{\mathbb{P}\in\mathcal{D}}\mathbb{P}\left(\bigcap_{s=1}^{m}E_s\right)\ge 1-\sup_{\mathbb{P}\in\mathcal{D}}\sum_{s=1}^{m}\mathbb{P}(E_s^c).$$
For each $s$, the individual chance constraint $\inf_{\mathbb{P}\in\mathcal{D}}\mathbb{P}(E_s)\ge 1-\alpha_s$ implies $\sup_{\mathbb{P}\in\mathcal{D}}\mathbb{P}(E_s^c)=\sup_{\mathbb{P}\in\mathcal{D}}[1-\mathbb{P}(E_s)]=1-\inf_{\mathbb{P}\in\mathcal{D}}\mathbb{P}(E_s)\le\alpha_s$. Thus,
$$\sup_{\mathbb{P}\in\mathcal{D}}\sum_{s=1}^{m}\mathbb{P}(E_s^c)\le\sum_{s=1}^{m}\sup_{\mathbb{P}\in\mathcal{D}}\mathbb{P}(E_s^c)\le\sum_{s=1}^{m}\alpha_s.$$
Therefore,
$$\inf_{\mathbb{P}\in\mathcal{D}}\mathbb{P}\left(\bigcap_{s=1}^{m}E_s\right)\ge 1-\sum_{s=1}^{m}\alpha_s\ge 1-\alpha.$$
We have completed the proof. □
Remark 7.
In the proof of Theorem 4, the Bonferroni inequality is only employed to derive a sufficient condition by which one can easily transform a joint chance constraint into a group of individual chance constraints, so as to obtain a computationally tractable reformulation of the original model for the given confidence level of the joint chance constraint. Although this inequality may seem overly conservative, Theorem 4 enables a conservative approximation of the joint chance constraint by a group of individual chance constraints with easily allocated confidence levels, such as taking $\alpha_s=\alpha/m$ for all $s$. This also leaves room for future research in which a tighter inequality is used to further optimize the confidence levels of the individual chance constraints.
Remark 8.
Note that Theorem 4 is proven under the assumption that the random variables are independent and the confidence set is a product of marginal sets, but, by this theorem, the results in Proposition 2 and Theorem 3 can be directly applied to handle the multistage DRCCP studied in the subsequent section.
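As a quick sanity check of this allocation, the following toy Monte Carlo sketch verifies the union-bound logic on two independent demands and a candidate production plan; the Poisson distributions and the plan are assumptions used only for illustration.

```python
# A toy Monte Carlo illustration of Theorem 4's risk allocation: if each of
# the m individual constraints holds with probability at least 1 - alpha/m,
# the joint constraint holds with probability at least 1 - alpha.
import numpy as np

rng = np.random.default_rng(1)
alpha, m = 0.1, 2
xi1 = rng.poisson(15, size=200_000)     # stage-1 demand (assumed)
xi2 = rng.poisson(12, size=200_000)     # stage-2 demand (assumed)
x1, x2 = 22, 14                         # a hypothetical production plan

ok1 = x1 >= xi1                         # first scalar constraint
ok2 = x1 + x2 >= xi1 + xi2              # second scalar constraint
print(ok1.mean() >= 1 - alpha / m)      # individual level 1 - alpha/m
print(ok2.mean() >= 1 - alpha / m)
print((ok1 & ok2).mean() >= 1 - alpha)  # joint level 1 - alpha
```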

4. Numerical Tests

4.1. Stochastic Multiperiod Capacitated Lot-Sizing Problems

To assess the performance of the proposed DRCCP approach, we carry out numerical tests on our method by solving the SMP-CLSP, one of the most fundamental inventory management problems [8,24]. Mathematically, the model of this problem reads as follows:
$$\min_{x,y}\ \sum_{t=1}^{T}\big[\bar c_t y_t+(c_t+H_t)x_t\big]\tag{25}$$
$$\text{s.t.}\quad\inf_{\mathbb{P}\in\mathcal{D}}\mathbb{P}\left\{\begin{aligned}&x_1\ge\xi_1\\&x_1+x_2\ge\xi_1+\xi_2\\&\qquad\vdots\\&x_1+x_2+\cdots+x_T\ge\xi_1+\xi_2+\cdots+\xi_T\end{aligned}\right\}\ge 1-\alpha,\tag{25a}$$
$$x_t\le M_t y_t,\quad t=1,\ldots,T,\tag{25b}$$
$$x_t\in\mathbb{Z}_+,\quad t=1,\ldots,T,\tag{25c}$$
$$y_t\in\{0,1\},\quad t=1,\ldots,T,\tag{25d}$$
where the initial inventory level is set to 0, $T$ represents the time horizon, $M_t$ denotes an upper bound on the quantity of units manufactured during period $t$, $c_t$ stands for the unit production cost in period $t$, $H_t$ is the unit holding cost for period $t$, $\bar c_t$ refers to the fixed setup cost per production run, $\xi_t$ represents the demand during period $t$, $x_t$ is the number of units to be produced in period $t$, $\mathbb{Z}_+$ represents the set of nonnegative integers, and $y_t$ serves as a binary setup indicator: $y_t=1$ when a setup is performed during period $t$ (more precisely, $y_t=1$ when $x_t>0$ and $y_t=0$ when $x_t=0$). Here, $x_t$ and $y_t$, $t=1,\ldots,T$, are the decision variables.
The constraint (25a) ensures that the cumulative production up to each period $t$ meets the cumulative demand with probability at least $1-\alpha$ under the worst-case distribution from $\mathcal{D}$. Here, $\mathcal{D}$ is the product of the marginal confidence sets $\mathcal{D}_t$ for each $\xi_t$, as defined in Equation (24), and the random variables $\xi_t$ are assumed independent. Thus, (25a) is equivalent to
$$\inf_{\mathbb{P}\in\mathcal{D}}\mathbb{P}\left\{\bigcap_{t=1}^{T}\Big\{\sum_{s=1}^{t}x_s\ge\sum_{s=1}^{t}\xi_s\Big\}\right\}\ge 1-\alpha.\tag{26}$$
By Theorem 4, if we allocate risk levels $\alpha_t$ with $\sum_{t=1}^{T}\alpha_t\le\alpha$ satisfying
$$\inf_{\mathbb{P}\in\mathcal{D}}\mathbb{P}\left\{\sum_{s=1}^{t}x_s\ge\sum_{s=1}^{t}\xi_s\right\}\ge 1-\alpha_t,\quad t=1,\ldots,T,$$
then the inequality (25a) holds. For example, we can take the equal risk allocation $\alpha_t=\alpha/T$ for all $t$.

4.2. Reformulation of Models and Development of Algorithms

We next reformulate Problem (25) as an ordinary optimization problem.
Let the random variable $\xi_s$ be approximated via the confidence set $\mathcal{D}_s$ defined in (24), $s=1,\ldots,t$. That is to say, $\xi_s$ is discretized as $\xi_s=\eta_{i_s}^{(s)}$ with probabilities $p_{i_s}^{(s)}$, $i_s=1,\ldots,N_s$.
Define the random variable
$$\zeta_t=\sum_{s=1}^{t}\xi_s.$$
Then, $\zeta_t$ takes the value $\sum_{s=1}^{t}\eta_{i_s}^{(s)}$ with probability $\prod_{s=1}^{t}p_{i_s}^{(s)}$. Let $N_\zeta=\prod_{s=1}^{t}N_s$ denote the number of all values taken by $\zeta_t$. Sorting all sums $\sum_{s=1}^{t}\eta_{i_s}^{(s)}$ in ascending order yields $N_\zeta$ nodes $\sum_{s=1}^{t}\eta_{i_s^{(l)}}^{(s)}$ with probabilities $\prod_{s=1}^{t}p_{i_s^{(l)}}^{(s)}$, $l=1,2,\ldots,N_\zeta$. Denote all nodes by $\zeta_t^{(l)}$, $l=1,2,\ldots,N_\zeta$. Then, the cumulative distribution function (CDF) of $\zeta_t$ is specified by the following step function:
$$F_t(z)=\begin{cases}0,&\text{if }z<\zeta_t^{(1)};\\ \displaystyle\sum_{q=1}^{l}\prod_{s=1}^{t}p_{i_s^{(q)}}^{(s)},&\text{if }\zeta_t^{(l)}\le z<\zeta_t^{(l+1)},\ l=1,2,\ldots,N_\zeta-1;\\ 1,&\text{if }z\ge\zeta_t^{(N_\zeta)}.\end{cases}$$
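A small sketch of this construction with hypothetical two-stage data: enumerate the product support, sort it, and read quantiles off the step CDF $F_t$.

```python
# Discrete distribution of zeta_t = xi_1 + ... + xi_t from per-stage supports
# eta^{(s)} and masses p^{(s)}, with its step CDF and quantile.
import numpy as np
from itertools import product

def cumulative_demand_dist(supports, masses):
    vals, probs = [], []
    for combo in product(*[range(len(s)) for s in supports]):
        vals.append(sum(supports[s][i] for s, i in enumerate(combo)))
        probs.append(np.prod([masses[s][i] for s, i in enumerate(combo)]))
    order = np.argsort(vals)
    return np.asarray(vals)[order], np.asarray(probs)[order]

def quantile(vals, probs, level):
    cdf = np.cumsum(probs)                    # the step CDF F_t
    return vals[np.searchsorted(cdf, level)]  # smallest node with F_t >= level

sup1, m1 = [10.0, 15.0, 20.0], [0.3, 0.5, 0.2]  # stage-1 nodes (assumed)
sup2, m2 = [12.0, 18.0],       [0.6, 0.4]       # stage-2 nodes (assumed)
vals, probs = cumulative_demand_dist([sup1, sup2], [m1, m2])
print(quantile(vals, probs, 0.95))  # F_2^{-1}(0.95) -> 38.0 here
```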
Consequently, for any given $x_s$, $s=1,2,\ldots,t$, the value of $\inf_{\mathbb{P}\in\mathcal{D}}\mathbb{P}\big\{\sum_{s=1}^{t}x_s\ge\sum_{s=1}^{t}\xi_s\big\}$ is computed by solving the following auxiliary problem:
$$\min\ F\Big(\sum_{s=1}^{t}x_s\Big)=\sum_{q=1}^{l}\prod_{s=1}^{t}p_{i_s^{(q)}}^{(s)}\qquad\text{s.t.}\quad p_{i_s}^{(s)}\in\mathcal{D}_s,\ s=1,2,\ldots,t,\qquad\sum_{s=1}^{t}\eta_{i_s^{(l)}}^{(s)}\le\sum_{s=1}^{t}x_s.\tag{27}$$
Indeed, if $p_{i_s}^{(s)*}$ is the optimal solution of Problem (27), then its corresponding objective value is the minimal value of $\mathbb{P}\big\{\sum_{s=1}^{t}x_s\ge\sum_{s=1}^{t}\xi_s\big\}$ over $\mathcal{D}$.
With a given distribution $F^{rb}$, such as the one induced by the optimal solution $p_{i_s}^{(s)*}$ of Problem (27), we construct a master problem to solve the SMP-CLSP (25) as follows:
$$\begin{aligned}\min_{x,y}\ &\sum_{t=1}^{T}\big(\bar c_t y_t+(c_t+H_t)x_t\big)\\ \text{s.t.}\ &\sum_{s=1}^{t}x_s\ge F_t^{*-1}(1-\alpha_t),\quad t=1,2,\ldots,T,\\ &x_t\le M_t y_t,\quad t=1,\ldots,T,\\ &x_t\in\mathbb{Z}_+,\quad t=1,\ldots,T,\\ &y_t\in\{0,1\},\quad t=1,\ldots,T.\end{aligned}\tag{28}$$
Model (28) effectively reformulates the original problem (25) as a deterministic programming model.
With the above preparations, we now develop an algorithm to solve the original DRCCP (25): the master problem, a mixed-integer linear programming problem, is first solved for an initial approximate solution; then, the auxiliary problem is solved at the obtained approximate solution of the master problem, and the two steps are alternated. The computational procedure is specified in Algorithm 1.
However, by Theorem 3, we can further reformulate Problem (25) as another form of standard constrained optimization problem.
Indeed, for each period $t=1,\ldots,T$, using the reference distribution $\hat{\mathbb{P}}_t$ at the $t$-th stage, Problem (26) corresponds to Case (1) in Proposition 2 under the condition $A_kx-b_k\ge 0$, obtained by taking $A_k=0$ and $b_k=-1$, so that the coefficient of the random term is $A_kx-b_k=1>0$ and $U(x)=\sum_{s=1}^{t}x_s$. The structure of the feasible set $F_{cc}$, as characterized by Theorem 3, depends on the value of $U(x)$:
(1) If $U(x)<z_1$, part 1(i) implies that $F_{cc}=\emptyset$.
(2) If $U(x)\ge z_N$, the set $C_1^1$ in part 2 applies, yielding $F_{cc}=\{x\in\mathbb{R}^n:U(x)\ge z_N\}$.
(3) If $z_j\le U(x)<z_{j+1}$ for some $j\in\{1,\ldots,N\}$, the set $C_1$ in part 3 applies, and the feasible set is given by
$$\left\{x\in\mathbb{R}^n:\ z_j\le U(x)<z_{j+1},\ \sum_{i=1}^{j}\hat p_i(z_i,z_{i+1})\ge 1-\alpha_t+\frac{\gamma_t}{2T}\right\}.$$
The reformulation of the chance constraint is explicitly given by the condition for the set $C_1$ in the third case above. Therefore, for any $x$ satisfying $z_j\le U(x)<z_{j+1}$, the data-driven chance constraint (26) is equivalent to
$$\hat{\mathbb{P}}_t\left\{\sum_{s=1}^{t}x_s\ge\sum_{s=1}^{t}\xi_s\right\}\ge 1-\alpha_t+\frac{\gamma_t}{2T}.\tag{29}$$
Let $\hat F_t$ be the estimated cumulative distribution function (CDF) of $\zeta_t=\sum_{s=1}^{t}\xi_s$. Then, the constraint (29) becomes
$$\hat F_t\Big(\sum_{s=1}^{t}x_s\Big)\ge 1-\alpha_t+\frac{\gamma_t}{2T}.$$
Thus, the deterministic reformulation of (25) is
$$\begin{aligned}\min_{x,y}\ &\sum_{t=1}^{T}\big[\bar c_t y_t+(c_t+H_t)x_t\big]\\ \text{s.t.}\ &\sum_{s=1}^{t}x_s\ge\hat F_t^{-1}\Big(1-\alpha_t+\frac{\gamma_t}{2T}\Big),\quad t=1,\ldots,T,\\ &x_t\le M_t y_t,\quad t=1,\ldots,T,\\ &x_t\in\mathbb{Z}_+,\quad t=1,\ldots,T,\\ &y_t\in\{0,1\},\quad t=1,\ldots,T.\end{aligned}\tag{30}$$
Algorithm 1 Alternating DRCCP-LS algorithm
Input: Time horizon $T$; sample sizes $n_s$ for $s=1,\ldots,T$; sample data $\{\xi_s^j\}_{j=1}^{n_s}$ for $s=1,\ldots,T$; interpolation tolerance $TH$; divergence tolerances $\gamma_t$, $t=1,\ldots,T$; risk level $\alpha$ with $\alpha_t=\alpha/T$, $t=1,\ldots,T$; cost parameters $\bar c_t$, $c_t$, $H_t$, $t=1,\ldots,T$; capacity parameters $M_t$, $t=1,\ldots,T$; tolerance $\epsilon$
Output: Optimal production plan $(x^*,y^*)$ and the minimum total cost $TC^*$
1: Apply Rules (I)-(III) to determine the bin parameters $N_s$, $\eta_{i_s}^{(s)}$, and the reference distributions $\hat p_{i_s}^{(s)}$, $i_s=1,\ldots,N_s$, $s=1,\ldots,T$
2: Initialize a production plan $x^{(0)}\leftarrow\mathbf{1}$; the total cost $TC^{(0)}$; the iteration counter $k\leftarrow 1$; the convergence flag converged $\leftarrow$ false
3: while not converged do
4:  Initialize robust quantiles $Q^{(k)}\leftarrow[0,\ldots,0]$ of length $T$
5:  for $t=1$ to $T$ do
6:   Compute the cumulative production $z_t\leftarrow\sum_{s=1}^{t}x_s^{(k-1)}$
7:   Solve the auxiliary problem (27) with $x_s=x_s^{(k-1)}$. Denote its optimal solution by $P_t^{(s)*}$ for $s=1,\ldots,t$
8:   With $P_t^{(s)*}$, $s=1,\ldots,t$, define a distribution function $F_t^*$, and compute $Q^{(k)}[t]\leftarrow F_t^{*-1}(1-\alpha_t)$
9:  end for
10:  With $Q^{(k)}$, solve the master problem (28). Denote the optimal solution by $(x^{(k)},y^{(k)})$, and set $TC^{(k)}\leftarrow\sum_{t=1}^{T}\big(\bar c_t y_t^{(k)}+(c_t+H_t)x_t^{(k)}\big)$
11:  if $|TC^{(k)}-TC^{(k-1)}|<\epsilon$ then
12:    converged $\leftarrow$ true
13:  else
14:    $k\leftarrow k+1$
15:  end if
16: end while
17: $x^*\leftarrow x^{(k)}$, $y^*\leftarrow y^{(k)}$, $TC^*\leftarrow TC^{(k)}$
18: return $(x^*,y^*)$ and $TC^*$
Problem (30) is also a mixed-integer linear programming (MILP) model, which can be solved using standard optimization solvers. By virtue of this reformulation defined by Model (30), we can present a more efficient algorithm to find a robust optimal production plan. The computational procedure is specified in Algorithm 2.
Algorithm 2 Reformulated DRCCP-LS algorithm
Input: Time horizon $T$; sample sizes $n_s$ for $s=1,\ldots,T$; sample data $\{\xi_s^j\}_{j=1}^{n_s}$ for $s=1,\ldots,T$; interpolation tolerance $TH$; divergence tolerances $\gamma_t$, $t=1,\ldots,T$; risk parameters $\alpha$, $\alpha_t=\alpha/T$, $t=1,\ldots,T$; cost parameters $\bar c_t$, $c_t$, $H_t$, $t=1,\ldots,T$; capacity parameters $M_t$, $t=1,\ldots,T$
Output: Optimal production plan $(x^*,y^*)$ and the minimum total cost $TC^*$
1: Apply Rules (I)-(III) to determine the bin parameters $N_s$, $\eta_{i_s}^{(s)}$, and the reference distributions $\hat p_{i_s}^{(s)}$, $s=1,\ldots,T$
2: Compute the estimated cumulative distribution functions $\hat F_t$ for $\zeta_t=\sum_{s=1}^{t}\xi_s$, $t=1,\ldots,T$, using the reference distributions $\hat p_{i_s}^{(s)}$
3: for $t=1$ to $T$ do
4:   Compute the quantile $\hat Q_t=\hat F_t^{-1}\big(1-\alpha_t+\frac{\gamma_t}{2T}\big)$
5: end for
6: Solve the MILP problem (30) to obtain $(x^*,y^*)$ and $TC^*$
7: return $(x^*,y^*)$ and $TC^*$
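To make Algorithm 2 concrete, the sketch below implements its final step under stated assumptions: once the robust quantiles are available, the MILP (30) for $T=2$ with a per-period capacity of 50 units is small enough to solve by plain enumeration rather than an MILP solver. The quantiles $Q$ are hypothetical, while the cost and capacity figures mirror the test setting of Section 4.3.

```python
# Brute-force solution of the small MILP (30) for T = 2.
import itertools

def solve_lot_sizing(Q, M, setup, prod_cost, hold):
    T = len(Q)
    best, best_plan = float("inf"), None
    for x in itertools.product(*(range(M[t] + 1) for t in range(T))):
        # feasibility: cumulative production must reach the robust quantiles
        if any(sum(x[: t + 1]) < Q[t] for t in range(T)):
            continue
        y = [1 if x[t] > 0 else 0 for t in range(T)]
        cost = sum(setup[t] * y[t] + (prod_cost[t] + hold[t]) * x[t]
                   for t in range(T))
        if cost < best:
            best, best_plan = cost, (x, y)
    return best, best_plan

# hypothetical robust quantiles F_t^{-1}(1 - alpha_t + gamma_t/(2T))
Q = [48, 65]
cost, (x, y) = solve_lot_sizing(Q, M=[50, 50], setup=[48, 48],
                                prod_cost=[28, 15], hold=[0.5, 0.5])
print(cost, x, y)   # -> 1727.5 (48, 17) [1, 1]
```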
When only the traditional histogram, instead of the approach proposed in this paper, is employed to define the confidence set, Algorithm 1 is used in place of Algorithm 2.
Remark 9.
Clearly, compared with Algorithm 2, Algorithm 1 involves alternately solving a mixed-integer linear master problem and an auxiliary nonlinear programming problem. In contrast, owing to the reformulation technique used in Model (30), Algorithm 2 directly solves a single mixed-integer linear program, which makes any off-the-shelf MILP solver applicable. In the subsequent section, we further validate its efficiency advantages through numerical tests.

4.3. Numerical Tests

In our computational experiments, we consider a planning horizon of $T=2$ periods, which is sufficient to capture the essential dynamics of the multiperiod DRCCP. The initial inventory is set to zero. Both the holding cost and the setup cost are held constant throughout the experiment, at 0.50 and 48, respectively. The production capacity is fixed at 50 units per period. The variable production costs depend on the period: 28 for $t=1$ and 15 for $t=2$.
To simulate realistic and challenging demand patterns, we construct nonstandard demand distributions. The demand in each period is a nonnegative random variable, and demands across periods are mutually independent. For t = 1 , the demand follows a mixture of two negative binomial distributions, 0.3 NB ( 18 , 0.8 ) + 0.7 NB ( 56 , 0.8116 ) , where the parameters r and p in NB ( r , p ) represent the number of successes and the success probability in each trial, respectively. This mixture distribution can capture the multimodal characteristics that are representative of an uncertain market. For t = 2 , the demand is generated from a Poisson distribution with mean λ = 15 , incorporating outliers that occur with a probability of 6 % .
In Figure 2 and Figure 3, we show the application of the proposed IDE-PLI method and the traditional histogram-based approach to estimating the probability density functions for the demand distributions at t = 1 and t = 2 under different sample sizes.
From the results in Figure 2 and Figure 3, the following can be observed.
(1) When the sample size is 5000 and the data contain more outliers, the fixed binning strategy of the traditional histogram method results in the substantial deviation of the histogram shape from the true distribution (see the last sub-figure in Figure 2).
(2) Although the traditional histogram method can better capture the distribution characteristics when the sample size increases to 20,000 (see the last sub-figure in Figure 3), the number of bins in this method grows significantly, thereby increasing the model complexity. In contrast, the IDE-PLI method can effectively capture the underlying multimodal structure and tail behavior of the empirical data at both sample sizes without overfitting to sampling noise.
(3) The proposed IDE-PLI method always requires fewer bins compared to the traditional histogram method. This significant reduction can directly improve the computational efficiency in solving subsequent robust DEFs.
In order to assess the applicability of the proposed IDE-PLI method, we further apply this method to solve the SMP-CLSP, especially in comparison with other methods, when all of them are employed to construct distinct confidence sets. Besides Algorithm 1 (denoted by DRCCP-A) and Algorithm 2 (denoted by DRCCP-E), the other three compared methods are stated as follows.
CCC: The classical chance-constrained programming approach without dependence on constructing any confidence set, which assumes that the true demand distribution is known.
DRCCP-AT: An algorithm with the reference distribution in Algorithm 1 being replaced by the traditional histogram.
DRCCP-KL: The method in [25], where the classic empirical distribution is used as the reference distribution in the construction of confidence sets, and the confidence set is defined by the Kullback–Leibler divergence.
In Table 1, we present a comparison of the computational running times of various methods with the sample sizes of 5000 and 50,000 under different risk aversion levels.
From the numerical results in Table 1, the following conclusions can be drawn.
(1) The proposed DRCCP-E method exhibits remarkable computational efficiency and is in some settings even superior to the CCC method. The CCC model generally runs the fastest since it does not consider the robustness of optimal solutions, but its assumption of a known true distribution is often impractical.
(2) As the sample size increases from 5000 to 50,000, DRCCP-A exhibits a significantly shorter running time compared with DRCCP-AT under the same setting of α . This result highlights the computational advantage of the adaptive binning strategy employed in DRCCP-A over the traditional histogram approach used in DRCCP-AT.
(3) The DRCCP-KL method consistently records the highest computational costs owing to the solution of complicated reformulated models therein. Indeed, its running time increases as the sample size grows, which underscores the numerical challenges associated with the Kullback–Leibler divergence constraint.
Considering changing risk levels ( α ) and changing divergence tolerances ( γ ), we conduct a sensitivity analysis of the minimized total cost and the optimal solutions derived from different methods, so as to validate the ability of the proposed data-driven robust approach in terms of achieving a trade-off between cost and robustness, as well as between optimality and robustness.
For this purpose, we change the value of the divergence tolerance from 0.03 to 0.09 with a step size of 0.02, and we change the value of α from 0.05 to 0.15 with a step size of 0.05. The minimized total cost and the corresponding optimal solutions across different combinations of α and γ are listed in Table 2 and Table 3, respectively. In Table 3, each optimal solution is represented as ( x 1 , x 2 ; y 1 , y 2 ) . All experimental results are derived from a training sample size of 50,000, ensuring statistical significance.
From the results in Table 2, the following observations can be made.
(1) For the distributionally robust models (DRCCP-E, DRCCP-A, DRCCP-AT), the total cost increases as γ increases, reflecting the price of robustness against a larger confidence set. In contrast, the DRCCP-KL method displays nonmonotonic cost behavior, which can be attributed to the nonlinear nature of KL constraints, which often introduce numerical instabilities during optimization and lead to inconsistent convergence. The total cost of the CCC model remains constant since it is not influenced by γ . Among them, the performance of the DRCCP-A and DRCCP-AT methods is comparable, although the former outperforms the latter when γ = 0.05 .
(2) For each of the fixed α values, the proposed DRCCP-E model consistently yields the lowest costs for the majority of the given values of γ among all compared robust methods. This advantage in terms of the total cost underscores its effectiveness in striking a balance between economic efficiency and robustness.
Furthermore, from the results in Table 3, the following observations are derived.
(1) For all DRCCP methods except DRCCP-KL, the optimal single-period production quantity generally increases with $\gamma$, and the total two-period production quantity also displays a consistent increasing trend, in accordance with theoretical expectations. Indeed, a larger value of $\gamma$ expands the confidence set, enforcing robustness over a wider range of distributions but easily leading to an overconservative solution.
(2) The classical CCC method, which assumes a known distribution, always yields the smallest quantity of production among the compared methods since it does not account for distributional ambiguity. However, this advantage may disappear if the nominal distribution is misspecified.
Since any chance-constrained programming approach aims to minimize the constraint violation degree of the obtained optimal strategy when applied to unrealized scenarios of random model parameters in the future, we investigate the performance of the compared five methods in this context, i.e., the percentage of test samples that causes the demand to exceed the production capacity and the risk of constraint violation. In Table 4, we list the violation probabilities of the constraints under different risk levels and divergence tolerances, where 5000 samples are generated for the learning of the distribution and 500 samples are generated for an out-of-sample test.
From the listed results in Table 4, we derive the following observations.
(1) Owing to the assumption of knowing the true demand distribution, the violation probability of the CCC method remains stable across different γ and is generally below the corresponding targeted level of α , which is attributed to finite-sample estimation errors in this method.
(2) All DRCCP methods exhibit consistently and significantly lower violation probabilities than those in the CCC method, further validating their characteristic superiority in view of robustness. By considering the ambiguity of the demand distribution, all DRCCP methods can provide a more conservative production strategy, so as to hedge against the worst-case scenario.
(3) As γ increases, the violation probabilities of all DRCCP methods (DRCCP-E, DRCCP-A, DRCCP-AT, and most cases of DRCCP-KL) decrease monotonically. This inverse relationship between γ and the violation probability shows that γ can be tuned to achieve a desired level of risk protection, albeit at the potential cost of higher production expenses; a sketch of such a tuning loop is given after this list.
(4) The primary practical advantage of the proposed DRCCP-E method lies in its balanced performance. Although it does not always achieve the lowest violation probability, it consistently keeps the violation probability below the required risk level α. Moreover, the cost analysis above showed that DRCCP-E generally achieves the minimum total cost among the compared robust methods. It thus successfully trades off reliability against cost: it provides a robust optimal solution with an acceptable degree of constraint violation while avoiding the excessive costs of overconservative strategies.
(5) The violation probabilities of DRCCP-A and DRCCP-AT typically lie between those of CCC and DRCCP-E. This similar performance indicates that the two histogram-based confidence sets (adaptive and traditional) offer comparable robustness; the minor variations between them arise from differences in bin selection and probability-mass allocation.
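As noted in observation (3), the monotone relationship between γ and the empirical violation probability suggests a simple calibration procedure. The sketch below illustrates one way such tuning could be automated; the callbacks solve_drccp and estimate_violation are hypothetical placeholders for a model solver and an out-of-sample estimator, and this loop is not part of the algorithms proposed in this paper.

```python
def tune_gamma(solve_drccp, estimate_violation, target,
               lo=0.01, hi=0.10, tol=1e-3):
    """Bisection on the divergence tolerance, exploiting the empirically
    monotone decrease of the violation probability in gamma."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        x = solve_drccp(mid)                # robust plan for this tolerance
        if estimate_violation(x) > target:
            lo = mid                        # still too risky: widen the set
        else:
            hi = mid                        # safe: try a cheaper tolerance
    return hi                               # smallest safe tolerance found
```

In practice, each bisection step requires resolving the robust model, so this calibration trades extra computation for a violation probability close to, but not above, the target.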
In summary, all DRCCP methods reduce the violation probabilities relative to the CCC approach, and the proposed DRCCP-E method provides the best balance between reliable risk control and low production cost. The CCC method offers no protection against distributional misspecification. The divergence tolerance γ thus serves as a critical model parameter for controlling the degree of protection against constraint violation.
To conclude this section, we evaluate the performance of the compared distributionally robust optimization models as the confidence level and the sample size vary. Figure 4 presents the numerical results obtained for values of 1 − α from 0.85 to 0.95 under sample sizes of 1000, 5000, 10,000, 20,000, and 50,000.
Looking at Figure 4, the following conclusions can be drawn.
(1) The total cost obtained by each method, except DRCCP-KL, consistently decreases as the confidence level decreases. This reflects the fundamental trade-off between risk and cost and the inherent conservatism of robust optimization: a higher confidence level demands a lower probability of stockout, so the model adopts a more conservative strategy, typically producing more to build safety stock against demand uncertainty and thereby incurring a higher total cost. DRCCP-KL occasionally deviates from this trend, which can again be attributed to the numerical instabilities introduced by its nonlinear KL-divergence constraints during optimization, as noted previously.
(2) The proposed DRCCP-E method displays remarkable stability across varying sample sizes, whereas DRCCP-A and DRCCP-AT are more sensitive to the sample size. Furthermore, as the confidence level decreases and the sample size increases, the minimal costs obtained by the DRCCP-A and DRCCP-AT methods converge. This convergence occurs because both methods are built on the same Algorithm 1 and, with a sufficiently large sample size, the densities estimated by the traditional histogram approach and by the proposed IDE-PLI method both approach the true density (a minimal sketch contrasting the two estimators follows this list). This result further validates the effectiveness of the proposed density estimation method.
(3) Across all parameter settings, the proposed DRCCP-E method displays competitive performance, with total costs lying between those of the more conservative methods (DRCCP-AT, DRCCP-A, DRCCP-KL) and those of the nonrobust CCC benchmark.
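For intuition on the convergence discussed in observation (2), the following minimal sketch contrasts a piecewise-constant histogram density with its piecewise linear interpolation, in the spirit of the IDE-PLI estimator. The adaptive bin rules of Algorithm 1 are omitted, and the gamma-distributed demand and all numerical settings are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def histogram_density(samples, bins=30):
    """Traditional estimator: piecewise-constant density over the bins."""
    heights, edges = np.histogram(samples, bins=bins, density=True)
    return heights, edges

def pli_density(samples, bins=30):
    """Piecewise linear interpolation between bin midpoints, a simplified
    stand-in for IDE-PLI (the adaptive bin rules are omitted)."""
    heights, edges = np.histogram(samples, bins=bins, density=True)
    mids = 0.5 * (edges[:-1] + edges[1:])
    return lambda t: np.interp(t, mids, heights, left=0.0, right=0.0)

rng = np.random.default_rng(1)
for n in (1_000, 5_000, 50_000):
    data = rng.gamma(shape=9.0, scale=2.0, size=n)   # hypothetical demand
    f = pli_density(data)
    grid = np.linspace(data.min(), data.max(), 400)
    mass = np.sum(f(grid)) * (grid[1] - grid[0])     # Riemann sum, close to 1
    print(f"n = {n:6d}: integrated mass = {mass:.4f}")
```

As the sample size grows, the bin heights of both estimators stabilize and the interpolated density differs from the piecewise-constant one only inside each bin, which is consistent with the convergent costs observed for DRCCP-A and DRCCP-AT in Figure 4.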
In conclusion, the DRCCP-E method has the advantage of achieving a more appropriate balance between cost-effectiveness and robustness.

5. Conclusions

In this study, we propose a novel approach to constructing a data-driven adaptive confidence set for optimization problems with chance constraints, involving only finitely many unknown parameters. By leveraging this confidence set, the original complicated chance constraint is reformulated into finitely many tractable ordinary constraints. Remarkably, compared with existing works, the proposed reformulation requires solving neither additional auxiliary optimization problems, nor semi-definite programming problems, nor mixed-integer second-order cone programming problems.
We further expand the scope of this work to address stochastic multistage DRCCPs. A complete alternating solution method and a reformulation-based solution strategy are presented for the SMP-CLSP under demand uncertainty, along with detailed algorithmic frameworks established for each approach. In the alternating solution strategy, the auxiliary problem computes robust quantiles using nonlinear programming, while the master problem solves a deterministic mixed-integer programming problem. The resulting alternating DRCCP-LS algorithm efficiently handles distributional uncertainty while maintaining computational tractability. In the reformulation-based strategy, the DRCCP is transformed into a standard constrained optimization form, leading to the reformulated DRCCP-LS algorithm. Numerical experiments demonstrate that the proposed DRCCP-E method achieves a more appropriate balance between cost-effectiveness and robustness.
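As a rough illustration of the alternating strategy's control flow only, the following skeleton uses hypothetical callbacks: robust_quantile(x) stands in for the nonlinear auxiliary problem that computes a robust demand quantile over the confidence set, and solve_master(q) for the deterministic mixed-integer lot-sizing problem with that quantile on the right-hand side. This is a sketch of the iteration logic under those assumptions, not the paper's DRCCP-LS implementation.

```python
def alternating_drccp_ls(solve_master, robust_quantile, q0,
                         max_iter=50, tol=1e-6):
    """Alternate between the master MIP and the auxiliary robust-quantile
    problem until the quantile stabilizes (hypothetical callbacks)."""
    q = q0                                  # initial quantile, e.g., empirical
    for _ in range(max_iter):
        x = solve_master(q)                 # master: deterministic MIP
        q_new = robust_quantile(x)          # auxiliary: nonlinear program
        if abs(q_new - q) < tol:            # quantile has stabilized
            return x, q_new
        q = q_new
    return x, q                             # best plan after max_iter rounds
```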
In future research, the proposed approach can be extended to handle chance constraints associated with multiple random factors, and efficient algorithms can be developed to solve the derived DEFs based on an analysis of the model properties. A straightforward yet computationally costly generalization would construct a grid over the multidimensional sample space, so investigating computational strategies such as adaptive partitioning and dimension-reduction techniques will be essential. Additional directions include extending the proposed methodology to more complicated scenarios with time-varying confidence sets and applying it to practical operational decision-making problems in engineering and management under uncertainty, such as robust supply chain management, portfolio optimization, production planning, and the robust utilization of renewable energy sources like wind and solar power.

Author Contributions

Conceptualization, Z.W.; validation, H.D.; formal analysis, H.D. and Z.W.; investigation, H.D.; writing—original draft, H.D.; writing—review and editing, Z.W.; supervision, Z.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the National Social Science Foundation of China (Grant No. 21BGL122).

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest regarding the submission and publication of this paper.

References

  1. Nemirovski, A.; Shapiro, A. Convex approximations of chance constrained programs. SIAM J. Optim. 2007, 17, 969–996.
  2. Calafiore, G.C.; Ghaoui, L.E. On distributionally robust chance-constrained linear programs. J. Optim. Theory Appl. 2006, 130, 1–22.
  3. Delage, E.; Ye, Y. Distributionally robust optimization under moment uncertainty with application to data-driven problems. Oper. Res. 2010, 58, 595–612.
  4. Jiang, N.; Xie, W. ALSO-X#: Better convex approximations for distributionally robust chance constrained programs. Math. Program. 2025, 213, 575–638.
  5. Meng, Q.; Jin, X.; Luo, F.; Wang, Z.; Hussain, S. Distributionally robust scheduling for benefit allocation in regional integrated energy system with multiple stakeholders. J. Mod. Power Syst. Clean Energy 2024, 12, 1631–1642.
  6. Mohajerin Esfahani, P.; Kuhn, D. Data-driven distributionally robust optimization using the Wasserstein metric: Performance guarantees and tractable reformulations. Math. Program. 2018, 171, 115–166.
  7. Ning, C.; You, F. Optimization under uncertainty in the era of big data and deep learning: When machine learning meets mathematical programming. Comput. Chem. Eng. 2019, 125, 434–448.
  8. Jiang, R.; Guan, Y. Data-driven chance constrained stochastic program. Math. Program. 2016, 158, 291–327.
  9. Zymler, S.; Kuhn, D.; Rustem, B. Distributionally robust joint chance constraints with second-order moment information. Math. Program. 2013, 137, 167–198.
  10. Shapiro, A.; Zhou, E.; Lin, Y. Bayesian distributionally robust optimization. SIAM J. Optim. 2023, 33, 1279–1304.
  11. Jiang, Y.; Ren, Z.; Li, W. Committed carbon emission operation region for integrated energy systems: Concepts and analyses. IEEE Trans. Sustain. Energy 2023, 15, 1194–1209.
  12. Kuhn, D.; Shafiee, S.; Wiesemann, W. Distributionally robust optimization. Acta Numer. 2025, 34, 579–804.
  13. Nguyen, V.A.; Kuhn, D.; Mohajerin Esfahani, P. Distributionally robust inverse covariance estimation: The Wasserstein shrinkage estimator. Oper. Res. 2022, 70, 490–515.
  14. Rahimian, H.; Mehrotra, S. Frameworks and results in distributionally robust optimization. Open J. Math. Optim. 2022, 3, 4.
  15. Chen, Z.; Kuhn, D.; Wiesemann, W. Data-driven chance constrained programs over Wasserstein balls. Oper. Res. 2024, 72, 410–424.
  16. Zhong, J.; Zhao, Y.; Li, Y.; Yan, M.; Peng, Y.; Cai, Y.; Cao, Y. Synergistic operation framework for the energy hub merging stochastic distributionally robust chance-constrained optimization and Stackelberg game. IEEE Trans. Smart Grid 2024, 16, 1037–1050.
  17. Küçükyavuz, S.; Jiang, R. Chance-constrained optimization under limited distributional information: A review of reformulations based on sampling and distributional robustness. EURO J. Comput. Optim. 2022, 10, 100030.
  18. Zhang, B.; Meng, L.L.; Lu, C.; Han, Y.Y.; Sang, H.Y. Automatic design of constructive heuristics for a reconfigurable distributed flowshop group scheduling problem. Comput. Oper. Res. 2024, 161, 106432.
  19. Pearson, K. Contributions to the mathematical theory of evolution. Philos. Trans. R. Soc. Lond. A 1894, 185, 71–110.
  20. Scott, D.W. Multivariate Density Estimation: Theory, Practice, and Visualization; John Wiley & Sons: Hoboken, NJ, USA, 2015.
  21. Van Kerm, P. Adaptive kernel density estimation. Stata J. 2003, 3, 148–156.
  22. Hoeffding, W. Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. 1963, 58, 13–30.
  23. Levin, D.A.; Peres, Y. Markov Chains and Mixing Times; American Mathematical Society: Providence, RI, USA, 2017.
  24. Beraldi, P.; Ruszczyński, A. A branch and bound method for stochastic integer problems under probabilistic constraints. Optim. Methods Softw. 2002, 17, 359–382.
  25. Chen, Y.; Guo, Q.; Sun, H.; Li, Z.; Wu, W.; Li, Z. A distributionally robust optimization model for unit commitment based on Kullback–Leibler divergence. IEEE Trans. Power Syst. 2018, 33, 5147–5160.
Figure 1. The structure of the proposed approach to adaptively estimating the probability density based on Rules (I)–(III).
Figure 2. Distribution estimations with 5000 samples by the IDE-PLI method and the traditional histogram-based method.
Figure 3. Distribution estimations with 20,000 samples by the IDE-PLI method and the traditional histogram-based method.
Figure 4. Cost values corresponding to different methods at various confidence levels when γ = 0.05.
Table 1. Running time comparison.

Sample Size   α      Time (s)
                     CCC       DRCCP-E   DRCCP-A   DRCCP-AT   DRCCP-KL
5000          0.05   0.42688   0.29209   13.371    17.849     51.000
              0.10   0.41052   0.28372   13.093    17.635     50.675
              0.15   0.38937   0.27614   13.261    17.658     51.035
50,000        0.05   0.40226   0.25176   13.819    32.574     60.924
              0.10   0.40113   0.23571   3.7973    32.282     59.093
              0.15   0.38458   0.25093   3.8887    32.151     59.934
Table 2. Comparison of the total costs for different methods.

α          Method     Minimized Total Cost
                      γ = 0.03   γ = 0.05   γ = 0.07   γ = 0.09
α = 0.05   CCC        1048       1048       1048       1048
           DRCCP-E    1336.5     1367.5     1398.5     1514.5
           DRCCP-A    1367.5     1472       1615       1619
           DRCCP-AT   1383       1526       1545.5     1657.5
           DRCCP-KL   1472       1418       1485       1471.5
α = 0.10   CCC        972.5      972.5      972.5      972.5
           DRCCP-E    1201       1216.5     1232       1263
           DRCCP-A    1232       1278.5     1367.5     1441
           DRCCP-AT   1218.5     1278.5     1338.5     1412
           DRCCP-KL   1338.5     1383       1317       1303.5
α = 0.15   CCC        912.5      912.5      912.5      912.5
           DRCCP-E    988        1019       1050       1112
           DRCCP-A    1096.5     1158.5     1216.5     1263
           DRCCP-AT   1112       1158.5     1218.5     1249.5
           DRCCP-KL   1247.5     1425       1350       1230
Table 3. Comparison of optimal solutions under different methods.

α          Method     Solution (x1, x2; y1, y2)
                      γ = 0.03      γ = 0.05      γ = 0.07      γ = 0.09
α = 0.05   CCC        (20,24;1,1)   (20,24;1,1)   (20,24;1,1)   (20,24;1,1)
           DRCCP-E    (23,37;1,1)   (23,39;1,1)   (23,41;1,1)   (27,41;1,1)
           DRCCP-A    (23,39;1,1)   (25,42;1,1)   (31,40;1,1)   (29,44;1,1)
           DRCCP-AT   (23,40;1,1)   (29,38;1,1)   (27,43;1,1)   (33,39;1,1)
           DRCCP-KL   (25,42;1,1)   (21,46;1,1)   (34,26;1,1)   (33,27;1,1)
α = 0.10   CCC        (19,21;1,1)   (19,21;1,1)   (19,21;1,1)   (19,21;1,1)
           DRCCP-E    (21,32;1,1)   (21,33;1,1)   (21,34;1,1)   (21,36;1,1)
           DRCCP-A    (21,34;1,1)   (21,37;1,1)   (23,39;1,1)   (25,40;1,1)
           DRCCP-AT   (20,35;1,1)   (21,37;1,1)   (22,39;1,1)   (24,40;1,1)
           DRCCP-KL   (22,39;1,1)   (23,40;1,1)   (25,32;1,1)   (24,33;1,1)
α = 0.15   CCC        (18,19;1,1)   (18,19;1,1)   (18,19;1,1)   (18,19;1,1)
           DRCCP-E    (19,22;1,1)   (19,24;1,1)   (19,26;1,1)   (19,30;1,1)
           DRCCP-A    (19,29;1,1)   (19,33;1,1)   (21,33;1,1)   (21,36;1,1)
           DRCCP-AT   (19,30;1,1)   (19,33;1,1)   (20,35;1,1)   (20,37;1,1)
           DRCCP-KL   (21,35;1,1)   (33,24;1,1)   (24,36;1,1)   (22,32;1,1)
Table 4. Violation probabilities of different methods under varying risk levels and divergence tolerances.

α          Method     Violation Probability
                      γ = 0.03   γ = 0.05   γ = 0.07   γ = 0.09
α = 0.05   CCC        0.047      0.047      0.047      0.047
           DRCCP-E    0.043      0.040      0.020      0.010
           DRCCP-A    0.040      0.030      0.020      0.017
           DRCCP-AT   0.040      0.030      0.017      0.017
           DRCCP-KL   0.043      0.043      0.043      0.043
α = 0.10   CCC        0.067      0.067      0.067      0.067
           DRCCP-E    0.047      0.047      0.047      0.043
           DRCCP-A    0.047      0.043      0.043      0.040
           DRCCP-AT   0.047      0.047      0.043      0.040
           DRCCP-KL   0.047      0.043      0.043      0.043
α = 0.15   CCC        0.103      0.103      0.103      0.103
           DRCCP-E    0.050      0.050      0.047      0.047
           DRCCP-A    0.047      0.047      0.047      0.047
           DRCCP-AT   0.047      0.047      0.047      0.047
           DRCCP-KL   0.047      0.047      0.043      0.047