Data-Driven Two-Stage Distributionally Robust Mean Semi-Variance Mixed-Integer Optimization Model for Location Allocation Problems in an Uncertain Environment

Liu, Zhimin; Raza, Hassan

doi:10.3390/sym17040589


by Zhimin Liu ¹,*,† and Hassan Raza ²,†

¹ School of Mathematics Science, Liaocheng University, Liaocheng 252000, China
² School of Mathematical Sciences, Wenzhou-Kean University, Wenzhou 325015, China
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.

Symmetry 2025, 17(4), 589; https://doi.org/10.3390/sym17040589

Submission received: 7 March 2025 / Revised: 30 March 2025 / Accepted: 1 April 2025 / Published: 12 April 2025

(This article belongs to the Section Computer)


Abstract

This study considers the uncertainty caused by data asymmetry in supply chains, the risks associated with this uncertainty and the need for robustness in the supply chain network. It constructs a data-driven two-stage distributionally robust mean semi-variance mixed-integer optimization model to address the location optimization problem under uncertain transportation costs and demand. To solve this model, a distribution separation hybrid genetic algorithm is introduced, which determines the optimal location, the distribution strategy and the expected return of the distribution centers in the worst case. A fresh food supply chain is then used as a case study to analyze the effects of uncertainty on location allocation decisions and to derive managerial insights. In numerical simulations, the proposed model is more robust than traditional stochastic optimization models. The algorithm is also benchmarked against other methods, and its effectiveness and stability are validated in terms of computational time, number of iterations and convergence speed.

Keywords: two-stage distributionally robust mean semi-variance mixed-integer optimization; uncertainty; location allocation; algorithm

1. Introduction

As income and consumption levels rise, demand for high-quality, personalized and experiential consumption keeps growing. This pushes enterprises to improve the customer experience and raise service levels from the customer’s perspective, with logistics and transportation at the center. By strengthening logistics informatization, enterprises can forecast sales volumes accurately, allocate products and serve customers with higher efficiency, lower cost and a better service experience. Many enterprises now manage shipment quantities through the “Internet + logistics” approach: they analyze historical product data, coordinate goods with transportation and realize seamless connections between logistics nodes [1,2]. In this setting, the location of logistics nodes and the transportation scheme are among the most pivotal strategic decisions in the planning and operation of supply chain networks. How to use historical data to optimize location allocation in supply chain networks is therefore a problem worthy of further study.

Supply chain management involves numerous uncertainties. Demand forecasts, for instance, are unreliable because of fluctuating economic conditions, shifts in consumer psychology and a lack of historical demand data. This also leads to information asymmetry about transportation costs among suppliers, distribution centers and customers, which adds further uncertainty. A reliable supply chain system is particularly crucial for meeting volatile customer demand. Although most supply chains are relatively sound and usually operate effectively, events such as global health crises expose their weaknesses and show that they lack real flexibility. For example, the outbreak of COVID-19 brought greater uncertainty to the global supply chain, which means that the supply chain system needs robustness; distributionally robust optimization is one method that can provide it [3,4,5]. Therefore, this study examines the location allocation problem using a distributionally robust optimization method, which makes the supply chain system more robust. In contrast to stochastic programming, which typically presumes that the probability distribution of the random parameters is known beforehand, the core idea of distributionally robust optimization is to build an ambiguity set, centered on a reference distribution, that contains the true probability distribution [6,7].

In recent years, the two-stage stochastic optimization model has become a research hotspot in modern optimization. It divides the decision-making process into two stages. The first stage is the “here and now” decision, which must be made before the value of the random vector is realized. The second stage is the “wait and see” decision, a recourse action that adjusts to the first-stage decision after the value of the random vector is realized. The two-stage stochastic optimization model has been extensively applied across various domains, including financial portfolio management, supply chain operations, emergency resource allocation and smart grid systems [8,9,10,11,12,13].

Furthermore, the majority of location problems adopt a risk-neutral approach: they optimize the expected cost (or the expected profit) without accounting for risk [4,14,15,16,17,18]. In contrast, risk-averse methods take the variability of random outcomes into account, offering a more robust framework for decision-making under uncertainty and allowing decision-makers to assess strategies according to their individual risk preferences. Given the challenges posed by actual market fluctuations, this paper incorporates risk considerations to derive more effective and practical solutions. The variance is a commonly used risk measure that regards both positive and negative deviations from the expected return as risk. The semi-variance regards only the negative deviations below the expected return as risk, which is more in line with people’s perception of risk. As far as we know, no previous study has used a two-stage distributionally robust mean semi-variance mixed-integer optimization model to study location allocation in the supply chain. Therefore, this paper adopts the semi-variance as its risk measure. Based on the above discussion, this study builds a data-driven two-stage distributionally robust mean semi-variance mixed-integer optimization model for the location optimization problem under uncertain transportation costs and demand.

There are two common classes of algorithms for two-stage distributionally robust optimization: decomposition algorithms [12,19] and distribution separation algorithms [20,21]. In this paper, because the first-stage decision variables are integers and the objective contains the semi-variance risk measure, it is difficult to reformulate the problem through duality. This paper therefore uses the idea of distribution separation to solve the distributionally robust optimization problem. Additionally, Medsker [22] introduced several methodologies for developing hybrid intelligent algorithms, providing a foundational framework for addressing mixed-integer programming problems. As far as we know, there is still a lack of research on applying intelligent algorithms to the two-stage distributionally robust mean semi-variance optimization problem. A genetic algorithm is an adaptive global optimization search algorithm that simulates the genetic and evolutionary processes of organisms in the natural environment; it is an efficient, practical and robust heuristic that is widely applied to NP-hard problems. Therefore, drawing on the principles of distribution separation and genetic algorithms, this study presents a solution framework called the distribution separation hybrid genetic algorithm. The primary contributions of this research are outlined below.

⋆: In light of the uncertainty arising from data asymmetry within the supply chain, a two-stage distributionally robust mean semi-variance mixed-integer optimization model is developed.
⋆: A distribution separation hybrid genetic algorithm is proposed for solving the two-stage distributionally robust mean semi-variance mixed-integer optimization model, and its effectiveness is verified by comparison with other methods.
⋆: Managerial insights are derived from a fresh food supply chain case study.

The rest of this paper is organized as follows: Section 2 reviews the literature. Section 3 presents the preliminaries and introduces the concepts related to risk measurement and uncertainty. Section 4 establishes the model. Section 5 proposes the distribution separation hybrid genetic algorithm. Section 6 verifies the effectiveness of the model and the algorithm through an example. Section 7 concludes the paper.

2. Literature Review

Location allocation problems are widely applied in industrial, civil and defense sectors, such as in the siting of logistics centers, warehouses and military bases. They directly impact operational efficiency and service quality, making them a long-standing focus of academic and industrial research and a key interdisciplinary topic. Hu et al. [4] investigated the multi-period hub location problem with uncertain periodic demand. Nick et al. [23] developed a location–inventory allocation model for post-disaster humanitarian logistics distribution points, emphasizing both the temporal dynamics of the system and potential social cost factors in facility location decisions. Qi et al. [24] investigated the competitive facility location problem, formulating a bilevel mixed-integer nonlinear programming model and developing a solution algorithm with guaranteed constant-factor approximation performance. Liu et al. [25,26] studied the location problem of distribution centers in supply chain networks with random transportation costs and demand. To address supply chain risk uncertainty, they established a mean-risk optimization model and developed heuristic algorithms to solve it.

In most studies on location allocation problems, uncertainty is typically based on randomness, with the assumption that the probability information of uncertainty is known [14,15,16,17,18,27]. However, in many cases, historical data or reliable forecasting techniques to derive accurate probability information are limited. Soyster [28] introduced a robust optimization approach to tackle challenges related to data uncertainty and the unavailability of probabilistic information.

In recent years, two-stage stochastic programming has emerged as a pivotal branch of modern optimization theory, garnering significant academic attention. This methodology has demonstrated remarkable value across multiple practical applications, including financial asset allocation, supply chain optimization scheduling, emergency resource distribution and smart energy networks. Atamturk et al. [8] introduced a two-stage robust optimization method to address the network flow design problem under demand uncertainty. Nakao et al. [9] studied stochastic network design problems through a two-stage approach. In the first stage, network capacity is designed, while, in the second stage, single-commodity network traffic is optimized after demand realization. The model aims to minimize total costs, including network capacity allocation, goods flow and penalties for unmet demand. Zhang et al. [10] constructed a multi-microgrid cooperative operation model based on two-stage adaptive robust optimization and discussed how to minimize the worst-case operating cost of the microgrid combination when photovoltaic (PV) output is uncertain. Zhang et al. [11] constructed a two-stage distributionally robust optimization model and studied the storage and scheduling problems of a natural gas–electric hybrid energy system under uncertain wind power. Lei et al. [12] proposed a two-stage robust optimization model for the sizing and routing of mobile device clusters under uncertain demand and gave a solution method based on bilevel cutting planes. Ding et al. [13] employed a two-stage robust optimization framework to investigate the reactive power optimization problem associated with uncertain wind power integration in active distribution networks. Ling et al. [29] proposed a two-stage distributionally robust optimization model based on moment constraints with risk aversion, equivalently reformulated the model as a semidefinite program and applied it to supply chain management and portfolio problems.

Addressing the two-stage stochastic optimization model generally presents greater complexity, as it involves computing the expected value of multivariate random variables. A widely used approach for solving such problems is the scenario-based stochastic optimization method [30,31]. However, the effectiveness of this method heavily depends on the predefined scenarios and their associated probabilities. Moreover, as the number of scenarios increases, the computational complexity grows exponentially, leading to the curse of dimensionality.

Moreover, most location problems adopt a risk-neutral approach. For example, Laporte et al. [14] addressed the capacitated facility location problem with stochastic demands and developed a novel modeling approach based on two-stage stochastic integer programming. Wang et al. [15] explored facility location optimization under demand uncertainty with fixed server constraints. Zadeh et al. [16] investigated the strategic and tactical design of iron and steel supply chain networks. Armas et al. [17] studied the uncapacitated facility location problem with stochastic demands and proposed a simheuristic algorithm for solving it. Li et al. [18] optimized equipment support depot location and transportation allocation under multiple constraints using an uncertain chance-constrained programming approach. To overcome the limitations of traditional risk-neutral optimization methods (which focus solely on expected costs/profits while neglecting risk factors), this study proposes a risk-averse approach that better aligns with real-world decision-making needs. Specifically, it employs a semi-variance risk measure (considering only negative deviations) instead of the traditional variance measure (accounting for both positive and negative deviations) to better reflect human risk perception. A novel two-stage distributionally robust mean semi-variance mixed-integer optimization model for supply chain location allocation problems is developed, providing robust solutions for volatile market environments.

Based on the aforementioned analysis, this study proposes a data-driven two-stage distributionally robust optimization approach to address location problems under uncertainties in both transportation costs and demand, integrating mean semi-variance risk measures with mixed-integer programming techniques. When solving two-stage distributionally robust optimization problems, two predominant algorithmic approaches are typically employed: decomposition algorithms [12,14,19] and distribution separation algorithms [20,21,26]. This study proposes an innovative distribution separation technique to construct a solution framework, overcoming the duality transformation challenges caused by integer-constrained first-stage decision variables and semi-variance risk measures.

3. Preliminaries

Risk measures are mathematical functions that quantify risks associated with random variables through scalar values, offering a method to evaluate and compare outcomes according to decision-makers’ risk preferences. The variance risk measure considers both positive and negative fluctuations deviating from expected returns as risks. The semi-variance risk measure treats only negative deviations below expected returns as risks, or positive deviations above expected costs as risks. It can be seen that using the semi-variance risk measure is more consistent with people’s cognition of risk than the variance risk measure. Next, we give the definition of semi-variance:

Definition 1.

Let ξ be a random cost variable. If $E\big[((ξ - E[ξ])^{+})^{2}\big]$ exists, it is called the semi-variance of ξ, denoted as $SVaR(ξ)$, that is,

$$SVaR(ξ) = E\big[((ξ - E[ξ])^{+})^{2}\big],$$

where $(x)^{+} = \max(0, x)$.

To facilitate the solution, we give some properties of SVaR:

Proposition 1.

If ξ is a random cost variable and C is a constant, then $SVaR(C + ξ) = SVaR(ξ)$.

Proof.

By definition and the linearity of expectation, $E[C + ξ] = C + E[ξ]$, so we obtain

$$\begin{aligned} SVaR(C + ξ) &= E\big[((C + ξ - E[C + ξ])^{+})^{2}\big] \\ &= E\big[((ξ - E[ξ])^{+})^{2}\big] \\ &= SVaR(ξ). \end{aligned}$$

□
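As a small numerical illustration of Definition 1 and Proposition 1, the sketch below estimates the semi-variance from a sample by replacing the expectation with a sample average. The function names and the cost values are hypothetical and chosen only for illustration; they are not data from this study.

```python
def semi_variance(costs):
    """Empirical semi-variance of a cost sample.

    Averages the squared positive parts of the deviations from the sample
    mean, so only above-average costs contribute.
    """
    mean = sum(costs) / len(costs)
    return sum(max(0.0, c - mean) ** 2 for c in costs) / len(costs)


def variance(costs):
    """Population variance, for comparison: deviations of both signs count."""
    mean = sum(costs) / len(costs)
    return sum((c - mean) ** 2 for c in costs) / len(costs)


# Hypothetical cost sample (illustrative numbers only).
costs = [8.0, 10.0, 12.0, 18.0]
shifted = [c + 100.0 for c in costs]  # add a constant C = 100

print(semi_variance(costs))    # 9.0: only the deviation 18 - 12 = 6 counts
print(variance(costs))         # 14.0
print(semi_variance(shifted))  # 9.0 again, in line with Proposition 1
```

The variance penalizes the below-average costs 8 and 10 as well, whereas the semi-variance ignores them, which is the behavior the definition is meant to capture for cost variables.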

The objective of uncertainty quantification is to certify that a given physical, engineering or economic system satisfies multiple safety conditions with high probability. In this paper, we assume that the system parameters are governed by an ambiguous distribution, which is only known to belong to a prescribed ambiguity set.

Definition 2.

Let the random vector ξ be defined on the probability space $(Ω, \mathcal{F}, P)$, where $Ω = \mathbb{R}^{n}$. It is assumed that the first moment (mean) and the second central moment (covariance) of the random vector $ξ \in Ω$ are known. Denote $μ = E_{P}[ξ] \in \mathbb{R}^{n}$ as the expected value (mean vector) and $Σ = E_{P}[(ξ - μ)(ξ - μ)^{T}] \in \mathbb{S}^{n}$ as the covariance matrix of ξ. The ambiguity set, comprising all probability distributions that share these first and second moments, is formally defined as follows:

$$\mathcal{P} := \Big\{ P : \int_{ξ \in Ω} P(ξ)\, dξ = 1, \ \int_{ξ \in Ω} ξ P(ξ)\, dξ = μ, \ \int_{ξ \in Ω} ξ ξ^{T} P(ξ)\, dξ = Σ + μ μ^{T} \Big\}.$$
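To make the moment conditions of Definition 2 concrete, the following sketch checks them for a discrete distribution over finitely many scenarios, with the integrals replaced by weighted sums. This discretization, the nonnegativity check on the probabilities, the function names and the numbers are assumptions made for illustration only.

```python
def moments(scenarios, probs):
    """Mean vector and raw second-moment matrix of a discrete distribution."""
    n = len(scenarios[0])
    mean = [sum(p * s[a] for s, p in zip(scenarios, probs)) for a in range(n)]
    second = [
        [sum(p * s[a] * s[b] for s, p in zip(scenarios, probs)) for b in range(n)]
        for a in range(n)
    ]
    return mean, second


def in_ambiguity_set(scenarios, probs, mu, sigma, tol=1e-9):
    """Check the three conditions of the moment-based ambiguity set.

    Discrete analogue (an assumption): probabilities are nonnegative and sum
    to 1, the mean equals mu, and the second moment equals sigma + mu mu^T.
    """
    if any(p < -tol for p in probs) or abs(sum(probs) - 1.0) > tol:
        return False
    mean, second = moments(scenarios, probs)
    n = len(mu)
    mean_ok = all(abs(mean[a] - mu[a]) <= tol for a in range(n))
    second_ok = all(
        abs(second[a][b] - (sigma[a][b] + mu[a] * mu[b])) <= tol
        for a in range(n)
        for b in range(n)
    )
    return mean_ok and second_ok


# Hypothetical one-dimensional example with three scenarios.
scenarios = [(0.0,), (2.0,), (4.0,)]
mu = (2.0,)
sigma = ((2.0,),)

print(in_ambiguity_set(scenarios, (0.25, 0.5, 0.25), mu, sigma))  # True
print(in_ambiguity_set(scenarios, (1 / 3, 1 / 3, 1 / 3), mu, sigma))  # False
```

Both candidate distributions have mean 2, but only the first also matches the second moment, so only it belongs to the set.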

4. Problem Description and Two-Stage Distributionally Robust Mean Semi-Variance Mixed-Integer Optimization Model

With the increasing demand for fresh products, the construction of fresh product supply chains is gradually accelerating. Although the traditional fresh supply chain promotes the circulation of products in the market, it involves many handovers and long circulation times, which easily causes serious damage to fresh products and drives up sales prices.

Since Internet technology was introduced into fresh product supply chains, these problems have been effectively alleviated. Under the e-commerce platform supply chain model, consumers interact directly with producers, shifting the fresh product market from product-led to service-led. After several years of exploration and development, a fresh food chain enterprise has established a three-tier online sales supply chain, “supplier–distribution center–consumer”, covering the whole process of production, procurement, storage and distribution. Through its app, the enterprise collects customers’ shopping information, analyzes their demand preferences, predicts market demand trends, makes procurement plans and meets consumers’ demand for diversified products. After the fresh products are purchased from planting bases or locally, they are quality-tested, stored and packaged at low temperatures, and the raw materials or semi-finished products are reprocessed and transported to the distribution centers.

This paper addresses the worst-case problem of selecting distribution centers and planning fresh product distribution so that customers receive their orders in the shortest time. Only a single product type is considered in the location allocation problem. Under uncertain demand and transportation costs, the first-stage decision selects the distribution centers to open. In the second stage, once the distribution centers have been selected, the distribution scheme is decided. The goal is to minimize the total cost while meeting demand. To apply the two-stage distributionally robust mean semi-variance mixed-integer optimization model to fresh product location and distribution, the relevant parameters and symbols are given below.

For the sake of simplicity, the following notation is adopted throughout this model. All vectors referenced in this study are assumed to be column vectors by default.

Symbol	Definition
Sets
$I$	The set of suppliers.
$J$	The set of distribution centers.
$K$	The set of customers.
Parameters
$a_{i}$	Supply capacity of supplier i, $i \in I$.
$Cp_{i}$	Unit product cost at supplier i, $i \in I$.
$Cu_{i}$	Cost of processing a unit of product by supplier i, $i \in I$.
$Cd_{j}$	Fixed cost of operating distribution center j, $j \in J$.
$e_{j}$	Distribution capacity of distribution center j, $j \in J$.
$r$	Retail price of a unit of product.
$\tilde{m}_{ij}$	Transportation cost from supplier i to distribution center j (random variable).
$\tilde{n}_{jk}$	Distribution cost from distribution center j to customer k (random variable).
$\tilde{w}_{k}$	Demand of customer k (random variable).
$ξ(ω)$	Demand and transportation cost vector $ξ(ω) = (\tilde{m}_{ij}(ω), \tilde{n}_{jk}(ω), \tilde{w}_{k}(ω))$, $ω \in Ω$ (random vector).
$λ$	Weight of the expected cost in the objective.
$E(\cdot)$	Expectation.
Decision variables
$β_{j}$	0–1 decision variable: 1 if distribution center j is opened and 0 otherwise, $j \in J$.
$x_{ij}$	Quantity of products transported from supplier i to distribution center j, $i \in I$, $j \in J$.
$y_{jk}$	Quantity of products transported from distribution center j to customer k, $j \in J$, $k \in K$.

Assume that the random variables are mutually independent. Based on the above description, we establish a data-driven two-stage distributionally robust mean semi-variance mixed-integer optimization model. The model minimizes both the expected cost of the supply chain and its risk; a convex combination of the expected cost function and the risk measure turns this multi-objective program into a single-objective one. As a function of the random parameter vector $ξ(ω)$, the total cost $\sum_{j \in J} Cd_{j} β_{j} + Q(β, ξ(ω))$ is a random variable. Cost above the expected cost (equivalently, profit below the expected profit) is regarded as risk, so the risk function can be expressed as $SVaR(\sum_{j \in J} Cd_{j} β_{j} + Q(β, ξ(ω)))$. Since the fixed cost $\sum_{j \in J} Cd_{j} β_{j}$ does not depend on ω, Proposition 1 gives $SVaR(\sum_{j \in J} Cd_{j} β_{j} + Q(β, ξ(ω))) = SVaR(Q(β, ξ(ω)))$. We optimize the expected cost and the risk of the supply chain under the worst-case distribution, that is,

$$\min_{β} \sup_{p \in \mathcal{P}} \Big\{ λ \Big( \sum_{j \in J} Cd_{j} β_{j} + E_{p}[Q(β, ξ(ω))] \Big) + (1 - λ)\, SVaR\Big( \sum_{j \in J} Cd_{j} β_{j} + Q(β, ξ(ω)) \Big) \Big\}.$$
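The step from this worst-case objective to model (1) can be written out explicitly. The derivation below is a sketch that uses only Proposition 1 and the fact that the fixed cost does not depend on the distribution p; the shorthand $c(β)$ is introduced here for readability and is not part of the original notation.

```latex
% c(β) is shorthand introduced for this sketch: the deterministic fixed cost.
\begin{aligned}
c(β) &:= \sum_{j \in J} Cd_{j} β_{j}, \\
\sup_{p \in \mathcal{P}} \Big\{ λ \big( c(β) + E_{p}[Q(β, ξ(ω))] \big)
      + (1 - λ)\, SVaR\big( c(β) + Q(β, ξ(ω)) \big) \Big\}
 &= \sup_{p \in \mathcal{P}} \Big\{ λ c(β) + λ E_{p}[Q(β, ξ(ω))]
      + (1 - λ)\, SVaR\big( Q(β, ξ(ω)) \big) \Big\} \\
 &= λ c(β) + \sup_{p \in \mathcal{P}} \Big\{ λ E_{p}[Q(β, ξ(ω))]
      + (1 - λ)\, SVaR\big( Q(β, ξ(ω)) \big) \Big\}.
\end{aligned}
```

The first equality applies Proposition 1 with $C = c(β)$, and the second moves $λ c(β)$ outside the supremum because it is the same for every $p \in \mathcal{P}$.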

Next, we present a data-driven two-stage distributionally robust mean semi-variance mixed-integer optimization model as follows:

$$\begin{aligned} \min_{β} \quad & λ \sum_{j \in J} Cd_{j} β_{j} + \sup_{p \in \mathcal{P}} \big\{ λ E_{p}[Q(β, ξ(ω))] + (1 - λ)\, SVaR(Q(β, ξ(ω))) \big\} \\ \text{s.t.} \quad & β \in \{0, 1\}^{|J|}, \end{aligned} \tag{1}$$

where $\mathcal{P}$ represents the ambiguity set containing the true distribution, and $Q(β, ξ(ω))$ is the optimal value of the following second-stage problem:

$$Q(β, ξ(ω)) = \min_{x, y} \Big\{ \sum_{i \in I} \sum_{j \in J} Cp_{i} x_{ij} + \sum_{i \in I} \sum_{j \in J} Cu_{i} x_{ij} + \sum_{i \in I} \sum_{j \in J} \tilde{m}_{ij} x_{ij} + \sum_{j \in J} \sum_{k \in K} \tilde{n}_{jk} y_{jk} - \sum_{j \in J} \sum_{k \in K} r\, y_{jk} \Big\} \tag{2}$$

$$\text{s.t.} \quad \sum_{j \in J} x_{ij} \leq a_{i}, \quad \forall i \in I, \tag{2a}$$

$$\sum_{k \in K} y_{jk} \leq β_{j} e_{j}, \quad \forall j \in J, \tag{2b}$$

$$\sum_{k \in K} y_{jk} \leq \sum_{i \in I} x_{ij}, \quad \forall j \in J, \tag{2c}$$

$$\sum_{j \in J} y_{jk} = \tilde{w}_{k}(ω), \quad \forall k \in K, \tag{2d}$$

$$x_{ij}, y_{jk} \geq 0, \quad \forall i \in I,\ j \in J,\ k \in K. \tag{2e}$$

This representation makes the order of events explicit. The first-stage decision $β$ is made while $ω$ is still uncertain, and the effect of this uncertainty is measured by the recourse function $E_{p}[Q(β, ξ(ω))]$. In the second stage, the actual value of $ω$ is revealed and the recourse decision $(x, y)$ is determined.

In the model, the objective function (1) minimizes the weighted sum of the expected cost and the associated risk of the supply chain in the worst case. The second stage minimizes the cost function (2), which comprises the product cost, the processing (handling) cost and the transportation and distribution costs, net of the revenue from selling the product. Constraints (2a) and (2b) are the supply capacity constraints of the suppliers and the distribution capacity constraints of the distribution centers, respectively; (2b) also forces $y_{jk} = 0$ for every distribution center that is not opened. Constraint (2c) is the flow balance condition of the product: a distribution center cannot ship more than it receives. Constraint (2d) requires that customer demand be met. The final constraint (2e) guarantees the non-negativity of the decision variables.
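The role of each constraint can be illustrated by checking a candidate plan $(x, y)$ for one realized scenario. The sketch below evaluates objective (2) and tests constraints (2a) to (2e) for given flows; it does not solve the second-stage problem. The function names, the node labels and all numbers are hypothetical.

```python
def second_stage_cost(x, y, Cp, Cu, m, n, r):
    """Objective (2) for given flows: supplier-side costs plus distribution
    cost, minus sales revenue."""
    supply_side = sum((Cp[i] + Cu[i] + m[i, j]) * q for (i, j), q in x.items())
    delivery_side = sum((n[j, k] - r) * q for (j, k), q in y.items())
    return supply_side + delivery_side


def outflow(flows, node):
    """Total flow leaving `node` (first component of each arc key)."""
    return sum(q for (u, _), q in flows.items() if u == node)


def inflow(flows, node):
    """Total flow entering `node` (second component of each arc key)."""
    return sum(q for (_, v), q in flows.items() if v == node)


def is_feasible(x, y, beta, a, e, w, tol=1e-9):
    """Check constraints (2a)-(2e) for one scenario with realized demand w."""
    if any(q < -tol for q in list(x.values()) + list(y.values())):  # (2e)
        return False
    if any(outflow(x, i) > a[i] + tol for i in a):                  # (2a)
        return False
    if any(outflow(y, j) > beta[j] * e[j] + tol for j in e):        # (2b)
        return False
    if any(outflow(y, j) > inflow(x, j) + tol for j in e):          # (2c)
        return False
    return all(abs(inflow(y, k) - w[k]) <= tol for k in w)          # (2d)


# Hypothetical instance: one supplier, two distribution centers, one customer.
a = {"s1": 10.0}
e = {"d1": 8.0, "d2": 8.0}
w = {"c1": 5.0}
beta = {"d1": 1, "d2": 0}          # only d1 is opened in the first stage
Cp, Cu, r = {"s1": 2.0}, {"s1": 1.0}, 6.0
m = {("s1", "d1"): 0.5, ("s1", "d2"): 0.5}
n = {("d1", "c1"): 1.0, ("d2", "c1"): 1.0}

x_open, y_open = {("s1", "d1"): 5.0}, {("d1", "c1"): 5.0}
x_closed, y_closed = {("s1", "d2"): 5.0}, {("d2", "c1"): 5.0}

print(is_feasible(x_open, y_open, beta, a, e, w))           # True
print(second_stage_cost(x_open, y_open, Cp, Cu, m, n, r))   # -7.5
print(is_feasible(x_closed, y_closed, beta, a, e, w))       # False, by (2b)
```

A negative value of (2) means that revenue exceeds cost in that scenario, and routing through the closed center is rejected because its capacity $β_{j} e_{j}$ is zero.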

5. Solution Algorithm

This section discusses the algorithm for solving models (1) and (2). Without loss of generality, in the subsequent analysis, we denote $ξ_{s}(ω)$ as the random vector in scenario s, with $p_{s}$ (unknown) representing its corresponding probability, $s \in S$, and S as the set of all possible scenarios.

In the two-stage distributionally robust mean semi-variance mixed-integer optimization models (1) and (2), when $λ = 1$, model (1) reduces to a two-stage risk-neutral distributionally robust optimization model, whose robust counterpart has been derived and solved in the literature [6]. When $λ \neq 1$, the risk function SVaR is nonlinear and the decision variables are subject to integer constraints, so the first-stage problem is a nonlinear distributionally robust mixed-integer optimization problem. This makes it considerably harder to obtain a robust counterpart of the original problem. In addition, for each $ξ(ω)$, the second-stage problem has $2|I| + 3|J| + 2|K|$ constraints, which makes the two-stage distributionally robust mean semi-variance mixed-integer optimization model very difficult to solve when the number of scenarios $|S|$ is very large or the supply chain has a large number of members.

It is important to note that the two-stage distributionally robust mixed-integer optimization model is at least as hard as the two-stage stochastic mixed-integer optimization model, which is a special case of the former. Since the two-stage stochastic mixed-integer optimization problem is NP-hard, the model poses a significant computational challenge. To address such problems, several researchers have employed the idea of distribution separation within the framework of robust optimization [20,26]. This idea avoids both deriving the robust counterpart and solving a large-scale robust counterpart (when there are many scenarios or many supply chain members). In addition, the idea of hybrid intelligent algorithms lays a foundation for solving mixed-integer optimization problems [32]. A genetic algorithm (GA) is an adaptive global optimization search technique inspired by the genetic and evolutionary mechanisms of biological organisms in natural environments; it is an efficient, practical and robust heuristic that is widely applied to NP-hard problems. Therefore, leveraging the principles of distribution separation and genetic algorithms, this study proposes a solution framework termed the distribution separation hybrid genetic algorithm, with the detailed solution procedure illustrated in Figure 1. First, the first-stage problem is solved, where the decision vector $β$ must be determined before the random transportation cost and demand vector $ξ(ω)$ is realized. Then, in the second stage, given the realized $ξ(ω)$ and the first-stage decision $β$, the second-stage problem is solved to obtain the optimal solution $(x, y)$ and the optimal value $Q(β, ξ(ω))$. By solving the distribution separation problem, the worst-case probability distribution p is obtained. The process then returns to the first stage and repeats this cycle until the best solution found stabilizes.

The optimization problem

$$\mathcal{Q} := \sup_{p \in \mathcal{P}} \big\{ λ E_{p}[Q(β, ξ(ω))] + (1 - λ)\, SVaR(Q(β, ξ(ω))) \big\} \tag{3}$$

is called the distribution separation problem. The algorithm for solving (3) is called the distribution separation algorithm [20].
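For a finite scenario set, the quantity inside the supremum in (3) is straightforward to evaluate for any fixed probability vector. The sketch below does this and then takes the maximum over a short list of candidate distributions. The candidate list is a hypothetical stand-in for the ambiguity set (the moment constraints are not enforced here), so this is an illustration of the objective only and not the distribution separation algorithm of [20].

```python
def mean_semivariance(q, p, lam):
    """lam * E_p[Q] + (1 - lam) * SVaR_p(Q) for scenario values q and
    scenario probabilities p."""
    mean = sum(ps * qs for ps, qs in zip(p, q))
    svar = sum(ps * max(0.0, qs - mean) ** 2 for ps, qs in zip(p, q))
    return lam * mean + (1.0 - lam) * svar


def worst_case_over_candidates(q, candidates, lam):
    """Return (value, distribution) with the largest objective among the
    candidates; a crude finite surrogate for the supremum in (3)."""
    return max(
        ((mean_semivariance(q, p, lam), p) for p in candidates),
        key=lambda pair: pair[0],
    )


# Hypothetical second-stage optimal values Q(beta, xi_s) in three scenarios.
q = [10.0, 20.0, 40.0]
lam = 0.9  # weight of the expected cost, as in the experiments of Section 6

candidates = [
    (0.5, 0.25, 0.25),
    (0.25, 0.5, 0.25),
    (0.25, 0.25, 0.5),
]

value, worst_p = worst_case_over_candidates(q, candidates, lam)
print(worst_p)  # (0.25, 0.25, 0.5): most weight on the costliest scenario
print(value)    # approximately 32.5625
```

Because the semi-variance term depends on p both through the weights and through the mean, the objective is not linear in p, which is one reason the supremum is handled by a dedicated separation step rather than by duality.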

Next, we describe the solution process in detail. Each individual (chromosome) encodes a first-stage decision as $X = (β_{1}, β_{2}, \dots, β_{|J|})$, and the population consists of N individuals, with initial population $B = \{B_{1}, B_{2}, \dots, B_{N}\}$. $Fit(\cdot)$ is the fitness function, and the fitness of each individual is the negative of the first-stage objective value, that is,

$$Fit_{n}(X) = -\Big\{ λ \sum_{j \in J} Cd_{j} β_{j} + \sup_{p \in \mathcal{P}} \big\{ λ E_{p}[Q(β, ξ(ω))] + (1 - λ)\, SVaR(Q(β, ξ(ω))) \big\} \Big\}, \quad n = 1, \dots, N. \tag{4}$$

Therefore, the smaller the first-stage objective value, the higher the fitness value. For each $ξ(ω)$, the interior point algorithm is used to solve the second-stage problem (2).

With roulette wheel selection, the wheel must be spun N times to select N individuals, which is computationally cumbersome. To avoid this, this paper adopts stochastic universal sampling, which selects N individuals with a single spin by generating N equally spaced pointers. Assuming that the total fitness value is $F = \sum_{n=1}^{N} Fit_{n}(X)$ and the number of individuals is N, the specific steps of stochastic universal sampling are as follows:

Step 1: Calculate the pointer spacing $P = F / N$;
Step 2: Randomly generate the starting pointer position Start = [random number between 0 and P];
Step 3: Calculate the pointer positions Pointers = [Start $+ i \cdot P$, $i = 0, 1, \dots, N - 1$];
Step 4: Select N individuals according to the position of each pointer and replicate them; the replicated chromosomes form population $B1$.
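The four steps above can be sketched as follows. The sketch assumes nonnegative fitness values with a positive sum; since the fitness in (4) is the negative of a cost and may itself be negative, a shift to nonnegative values is included as an assumed preprocessing step that is not described in the text. All names and numbers are illustrative.

```python
import random


def shift_to_nonnegative(raw_fitness):
    """Shift fitness values so that the smallest becomes 0 (assumed
    preprocessing; not specified in the source)."""
    lowest = min(raw_fitness)
    return [f - lowest for f in raw_fitness]


def stochastic_universal_sampling(fitness, n_select, start=None, rng=random):
    """Return the indices of n_select individuals chosen with one spin."""
    total = sum(fitness)
    if total <= 0 or min(fitness) < 0:
        raise ValueError("SUS needs nonnegative fitness with a positive sum")
    spacing = total / n_select                 # Step 1: P = F / N
    if start is None:
        start = rng.uniform(0.0, spacing)      # Step 2: random start in [0, P]
    selected = []
    index, cumulative = 0, fitness[0]
    for i in range(n_select):
        pointer = start + i * spacing          # Step 3: equally spaced pointers
        # Advance to the individual whose fitness segment contains the pointer.
        while pointer >= cumulative and index < len(fitness) - 1:
            index += 1
            cumulative += fitness[index]
        selected.append(index)                 # Step 4: select (and replicate)
    return selected


# Hypothetical nonnegative fitness values and a fixed start for reproducibility.
fitness = [1.0, 2.0, 3.0, 4.0]
picked = stochastic_universal_sampling(fitness, n_select=4, start=1.0)
print(picked)  # [1, 2, 3, 3]

print(shift_to_nonnegative([-30.0, -20.0, -10.0]))  # [0.0, 10.0, 20.0]
```

With spacing 2.5 and start 1.0, the pointers fall at 1.0, 3.5, 6.0 and 8.5 on the cumulative fitness axis (1, 3, 6, 10), so the fittest individual is selected twice and the least fit not at all.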

Next, a crossover operation is performed. This paper adopts a probability-based crossover. The number of chromosomes $\bar{c}$ participating in the crossover is determined by the crossover rate $Pc$; $\bar{c}$ chromosomes are randomly selected from $B1$ and paired for the crossover operation, and the resulting new chromosomes replace the original chromosomes to obtain population $B2$.

Similarly, the number of mutations m is determined by the mutation rate $Pm$; m chromosomes are randomly selected from $B2$ and mutated, and the new chromosomes replace the original chromosomes to obtain population $B3$. Population $B3$ is taken as the new generation population, that is, $B3$ replaces B. The fitness $Fit(X)$ of each chromosome in B is then calculated again, and the cycle repeats until the termination condition is met. Next, we present the pseudo-code of Algorithm 1.
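A minimal sketch of the two operators on binary chromosomes is given below. It follows the per-pair and per-gene probability tests used in the pseudo-code of Algorithm 1; the single-point crossover and the bit-flip mutation are assumptions, since the text does not fix the exact operators, and all names are illustrative.

```python
import random


def single_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two chromosomes after a random cut point (assumed
    operator; requires chromosomes of length at least 2)."""
    point = rng.randint(1, len(parent_a) - 1)
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def crossover_population(population, pc, rng):
    """Pair consecutive chromosomes; each pair is crossed with probability pc."""
    new_pop = [list(ch) for ch in population]
    for n in range(0, len(new_pop) - 1, 2):
        if rng.random() < pc:
            new_pop[n], new_pop[n + 1] = single_point_crossover(
                new_pop[n], new_pop[n + 1], rng
            )
    return new_pop


def mutate_population(population, pm, rng):
    """Flip each gene (open <-> closed) independently with probability pm."""
    return [
        [1 - gene if rng.random() < pm else gene for gene in ch]
        for ch in population
    ]


# Hypothetical population B1 for |J| = 5 candidate distribution centers.
rng = random.Random(0)
B1 = [[1, 0, 1, 0, 1], [0, 1, 0, 1, 0], [1, 1, 1, 1, 1], [0, 0, 0, 0, 0]]
B2 = crossover_population(B1, pc=0.9, rng=rng)
B3 = mutate_population(B2, pm=0.05, rng=rng)
print(B3)
```

Crossover only exchanges genes between the two chromosomes of a pair, so the number of open centers at each position, summed over the population, is unchanged from $B1$ to $B2$; mutation is the only operator that introduces new gene values.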

Algorithm 1 Distribution separation hybrid genetic algorithm (DSHGA).

Initialization. Set the population size N, the maximum number of iterations G, the crossover rate $P_{c}$ and the mutation rate $P_{m}$; randomly generate the initial population $B = randint(N, J)$;
Genetic algorithm iterations.
for g = 1:G do
    Calculate the fitness
    for n = 1:N do
        Solve the second-stage optimization problem (2) using the interior point algorithm;
        Solve the distribution separation problem (3) to obtain the value of $Q$;
        Calculate the fitness $Fit_{n}(X)$ of the n-th chromosome in B;
    end for
    Obtain the optimal individual $X_{best}$;
    Stochastic universal sampling
    Calculate the pointer spacing $P = \sum (Fit_{n}(X)) / N$;
    Randomly generate the starting pointer position Start = [random number between 0 and P];
    Calculate the pointer positions Pointers = [Start $+ i * P$ $(i = 0, 1, \dots, N - 1)$];
    Select N individuals according to the position of each pointer to obtain population $B1$;
    Crossover operation
    for n = 1:2:N do
        if $rand < P_{c}$ then
            Cross the paired chromosomes;
        end if
    end for
    Get population $B2$;
    Mutation operation
    for n = 1:N do
        for j = 1:J do
            if $rand < P_{m}$ then
                Mutation;
            end if
        end for
    end for
    Get population $B3$;
    Keep the optimal individual $X_{best}$ in the new population $B3$;
end for
Return $X_{best}$ as the optimal solution and $-Fit(X_{best})$ as the optimal value.
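
The loop structure of Algorithm 1 can be illustrated with a small, self-contained Python sketch. Several parts are assumptions made only for illustration: the fitness `toy_fit` is a hypothetical stand-in for the fitness obtained from problems (2) and (3), single-point crossover and bit-flip mutation are assumed operators (the pseudo-code does not specify them), and the selected population is shuffled before pairing.

```python
import random
from itertools import accumulate


def sus(fitness, rng):
    """Stochastic universal sampling: one random draw, N equally spaced pointers."""
    n = len(fitness)
    spacing = sum(fitness) / n
    start = rng.uniform(0.0, spacing)
    cumulative = list(accumulate(fitness))
    picks, idx = [], 0
    for i in range(n):
        while idx < n - 1 and cumulative[idx] < start + i * spacing:
            idx += 1
        picks.append(idx)
    return picks


def crossover(pop, pc, rng):
    """Pair neighbours (n, n+1) and swap their tails with probability pc."""
    out = [c[:] for c in pop]
    for n in range(0, len(out) - 1, 2):
        if rng.random() < pc:
            cut = rng.randrange(1, len(out[n]))  # single-point cut (assumed operator)
            out[n][cut:], out[n + 1][cut:] = out[n + 1][cut:], out[n][cut:]
    return out


def mutate(pop, pm, rng):
    """Flip each gene independently with probability pm (assumed operator)."""
    return [[1 - g if rng.random() < pm else g for g in c] for c in pop]


def run_ga(fit, n_pop, n_genes, n_iter, pc, pm, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(n_pop)]
    best = max(pop, key=fit)
    for _ in range(n_iter):
        scores = [fit(c) for c in pop]               # fitness of every chromosome
        best = max([best] + pop, key=fit)[:]         # current optimal individual
        b1 = [pop[i][:] for i in sus(scores, rng)]   # selection -> B1
        rng.shuffle(b1)                              # random pairing for crossover
        b3 = mutate(crossover(b1, pc, rng), pm, rng) # B2 -> B3
        b3[0] = best[:]                              # keep the best individual in B3
        pop = b3
    return max(pop, key=fit)


# Hypothetical toy fitness: 1 + number of genes matching an arbitrary target.
TARGET = [0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1]


def toy_fit(chromosome):
    return 1 + sum(g == t for g, t in zip(chromosome, TARGET))


result = run_ga(toy_fit, n_pop=20, n_genes=len(TARGET), n_iter=50, pc=0.9, pm=0.05)
```

Keeping the best individual in each new population means the best fitness found never decreases from one generation to the next, which is the role of the elitism step at the end of the loop.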

6. Numerical Results

In this section, the impact of uncertainty on the location of distribution centers is examined using the fresh supply chain in Shanghai as a case study. All program code was written in MATLAB R2014a and run on a Lenovo computer with an Intel(R) Core(TM) i7-8565U CPU @ 1.80 GHz and 8.00 GB of memory. Throughout the computational experiments, the parameters of the DSHGA are set as $G = 1000$, $N = 500$, $P_{c} = 0.9$, $P_{m} = 0.05$ and $λ = 0.9$.

In the numerical tables, $Pro = -\left\{\sum_{J} C d_{j} β_{j} + \sup_{p \in P} E_{p}[Q(β, ξ(ω))]\right\}$ represents the profit of the supply chain in the worst case, $Val$ represents the optimal value of model (1) and $β^{*}$ is the optimal solution. $x_{ij}^{*}$ represents the quantity of product shipped from supplier i to distribution center j, and $y_{jk}^{*}$ represents the quantity of product shipped from distribution center j to demand area k. $TI$ indicates the computation time in seconds. Next, we give a description of the test problem.

As the economy continues to grow, China’s new retail business model is expanding rapidly. Take “Freshippo” as an example. Consumers who used to shop in supermarkets can now use the Freshippo app to buy any of the supermarket’s commodities without leaving home, and orders are delivered quickly, taking 30 min to reach the customer’s door. The biggest difference from traditional fresh retail is that it uses big data to optimize the configuration of supply, distribution centers and customer distribution. Because of the particular nature of fresh goods, refrigerated transportation is required. How to choose distribution centers reasonably, reduce storage costs and deliver to customers quickly are problems worthy of further study. Consider Shanghai’s fresh supply chain network, in which fresh products are provided by four suppliers and distributed through 30 candidate distribution centers to meet the daily fresh demand of Freshippo consumers in 10 demand areas in Shanghai. The locations in the supply chain are shown in Figure 2. For convenience, we number the four suppliers from $I1$ to $I4$, the 30 candidate distribution centers from $J1$ to $J30$ and the 10 demand areas from $K1$ to $K10$.

Demand area $K1$ is Jiading District. Demand area $K2$ is Baoshan District. Demand area $K3$ consists of Putuo District, Jing’an District, Yangpu District and Hongkou District. Demand area $K4$ consists of Changning District, Xuhui District and Huangpu District. Demand area $K5$ is Qingpu District. Demand area $K6$ is Songjiang District. Demand area $K7$ is Minhang District. Demand area $K8$ is Pudong New Area. Demand area $K9$ is Jinshan District. Demand area $K10$ is Fengxian District. Given that Chongming District is geographically isolated as an independent island, its consumption demand is excluded from the computational analysis.

To simplify the computational process, this study assumes a uniform price for fresh products. Furthermore, it is presumed that distribution centers are capable of successfully delivering fresh products to demand areas. The relevant parameters, random transportation costs and random demand are provided in Table 1, Table 2, Table 3 and Table 4.

Table 2 shows the expected costs of transportation from suppliers to distribution centers. Transportation costs primarily encompass fuel expenses, vehicle maintenance costs and driver salaries. The unit transportation cost is calculated using the following formula:

\text{Fuel cost} = \text{distance} \times \text{fuel consumption} \times \text{oil price},

\text{Unit transportation cost} = \text{fuel expenses} + \text{vehicle maintenance costs} + \text{driver salaries}.

For example, as shown in Table 2, to transport 1000 units of product from $I1$ to $J1$, the distance is 39 km, the fuel consumption is 12 L/100 km, the oil price is 9.93 CNY/L, the vehicle cost is CNY 100 and the engine cost is CNY 200. The unit transportation cost is therefore $(33 * 12 / 100 * 9.93 + 100 + 200) / 1000 = 0.34$ CNY. Due to uncertain factors such as traffic congestion and traffic flow, we use CNY 0.35 to represent the first-moment information regarding the transportation cost from $I1$ to $J1$. Based on the above calculation rules, the first-moment information of random transportation costs is given in Table 2 and Table 3.
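
The arithmetic of this example can be reproduced with a short Python sketch. The function name `unit_transport_cost` and its parameter names are hypothetical, and the distance argument is set to 33 km, the value that appears in the displayed calculation.

```python
def unit_transport_cost(distance_km, litres_per_100km, oil_price,
                        vehicle_cost, other_cost, units):
    """Per-unit transportation cost in CNY for one shipment."""
    # Fuel cost = distance * fuel consumption * oil price
    fuel_cost = distance_km * litres_per_100km / 100 * oil_price
    # Spread the total trip cost over the number of units shipped.
    return (fuel_cost + vehicle_cost + other_cost) / units


# Values from the I1 -> J1 example: 12 L/100 km, 9.93 CNY/L,
# CNY 100 and CNY 200 fixed costs, 1000 units shipped.
cost = unit_transport_cost(33, 12, 9.93, 100, 200, 1000)
print(round(cost, 2))  # 0.34
```

The fixed costs of CNY 300 dominate the fuel cost of roughly CNY 39, so in this example the unit cost is only mildly sensitive to the distance.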

Based on market survey data, the daily online orders in Jiading District amount to 17,600. Assuming that orders for fresh products at a uniform price constitute $50\%$ of the total, the expected demand for $K1$ is 8800. Through statistical analysis, the random daily consumer demand is as presented in Table 4, where $ϱ = 10$ denotes the price elasticity coefficient of demand and r is set to CNY 12.

Since the random vectors are independent of each other, and according to the statistics of market survey data, the second moment (covariance matrix) satisfies $Σ = 0.004 W$, where W is the 430-dimensional identity matrix. To solve the location allocation problem, for any feasible solution, we use the Monte Carlo method to generate 1000 scenarios, each generating 1000 random sampling points $ξ_{n}(ω)$, $n = 1, 2, \dots, 1000$, for random simulation (such a sample size is sufficient to simulate random expected values). When the data are generated, the first-order moment information is as shown in Table 2, Table 3 and Table 4, and the second-order moment satisfies $Σ = 0.004 W$. The numerical results are shown in Table 5, Table 6, Table 7, Table 8 and Table 9 and Figure 3, Figure 4 and Figure 5.
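
A minimal Python sketch of such a sampling step is given below. It rests on assumptions not stated in the text: the samples are drawn from a normal distribution (the model only fixes the first and second moments), the mean vector is a placeholder rather than the values of Tables 2 to 4, and the dimension 430 is read as 4 × 30 supplier-to-center costs plus 30 × 10 center-to-area costs plus 10 demands, which is consistent with the stated size of W.

```python
import math
import random


def sample_scenarios(mean, variance, n_samples, seed=0):
    """Draw independent samples with the given mean vector and a
    diagonal covariance variance * I (normal distribution assumed)."""
    rng = random.Random(seed)
    sd = math.sqrt(variance)
    return [[rng.gauss(m, sd) for m in mean] for _ in range(n_samples)]


n_suppliers, n_centers, n_areas = 4, 30, 10
dim = n_suppliers * n_centers + n_centers * n_areas + n_areas  # 430

mean = [0.35] * dim                      # placeholder first moments
samples = sample_scenarios(mean, 0.004, 1000)

# Empirical first moment of the first component, for a quick sanity check.
sample_mean_0 = sum(s[0] for s in samples) / len(samples)
```

With 1000 samples and variance 0.004, the standard error of each sample mean is about 0.002, which gives a rough sense of the Monte Carlo noise in the simulated expected values.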

Figure 3 illustrates the supply chain network for location allocation, with detailed results presented in Table 5. To maximize customer satisfaction, the enterprise selects $J3$, $J5$, $J6$, $J11$, $J15$, $J16$, $J20$, $J21$, $J24$, $J26$, $J28$ and $J30$ as distribution centers to supply fresh products to customers. Under this configuration, the supply chain achieves a maximum profit of CNY $6.2228 \times 10^{5}$. The expected demand of demand area $K1$ is $E[K1] = 8800 - ϱ r = 8680.00$, and the distribution is completed by distribution center $J24$. The expected demand of demand area $K2$ is $E[K2] = 9000 - ϱ r = 8880.00$, which is provided by $J5$, $J20$ and $J24$, with $249.90$, $8020.10$ and $610.00$, respectively. The expected demand of demand area $K3$ is $E[K3] = 10000 - ϱ r = 9880.00$, which is provided by $J3$ and $J15$, with $9450.00$ and $430.00$, respectively. The expected demand of demand area $K4$ is $E[K4] = 9800 - ϱ r = 9680.00$, and the distribution is completed by distribution center $J16$. The expected demand of demand area $K5$ is $E[K5] = 9100 - ϱ r = 8980.00$, which is provided by $J6$ and $J21$, with $7860.00$ and $1120.00$, respectively. The expected demand of demand area $K6$ is $E[K6] = 9500 - ϱ r = 9380.00$, which is provided by $J6$, $J11$ and $J26$, with $1630.00$, $3730.00$ and $4020.00$, respectively. The expected demand of demand area $K7$ is $E[K7] = 9700 - ϱ r = 9580.00$, which is provided by $J24$ and $J28$, with $80.00$ and $9500.00$, respectively. The expected demand of demand area $K8$ is $E[K8] = 9500 - ϱ r = 9380.00$, which is provided by $J5$ and $J16$, with $9260.10$ and $119.90$, respectively. The expected demand of demand area $K9$ is $E[K9] = 8700 - ϱ r = 8580.00$, which is provided by $J20$ and $J30$, with $1579.90$ and $7000.10$, respectively. The expected demand of demand area $K10$ is $E[K10] = 8500 - ϱ r = 8380.00$, and the distribution is completed by distribution center $J21$. At the same time, it can be seen from Figure 3 that the selected distribution centers are located relatively close to the suppliers and demand areas, in keeping with the principle of nearby supply.
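
As a consistency check on the figures quoted above, the following Python sketch recomputes each expected demand from $ϱ = 10$ and $r = 12$ and verifies that the stated shipments add up to it. The dictionary names are hypothetical; only the seven demand areas for which individual shipment quantities are quoted in the text are included.

```python
RHO, R = 10, 12  # price elasticity coefficient and price r (CNY)

# Base demand before the price adjustment, from Table 4.
base_demand = {"K2": 9000, "K3": 10000, "K5": 9100, "K6": 9500,
               "K7": 9700, "K8": 9500, "K9": 8700}

# Shipments y*_{jk} quoted in the text, grouped by demand area.
shipments = {
    "K2": {"J5": 249.90, "J20": 8020.10, "J24": 610.00},
    "K3": {"J3": 9450.00, "J15": 430.00},
    "K5": {"J6": 7860.00, "J21": 1120.00},
    "K6": {"J6": 1630.00, "J11": 3730.00, "J26": 4020.00},
    "K7": {"J24": 80.00, "J28": 9500.00},
    "K8": {"J5": 9260.10, "J16": 119.90},
    "K9": {"J20": 1579.90, "J30": 7000.10},
}

expected = {k: b - RHO * R for k, b in base_demand.items()}

# Gap between delivered quantity and expected demand for each area.
gaps = {k: sum(shipments[k].values()) - expected[k] for k in expected}
for area, gap in gaps.items():
    print(area, expected[area], round(gap, 2))
```

For all seven areas the delivered quantity matches the expected demand up to floating-point rounding, so the quoted allocation satisfies demand exactly in expectation.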

To further demonstrate the effectiveness of the DSHGA, we evaluate its stability by varying the number of iterations, population size and parameter settings. The numerical outcomes are presented in Table 6, with the final column indicating the relative error, defined as follows:

\text{Error} = \frac{\text{OptimalPro} - \text{Pro}}{\text{OptimalPro}} \times 100\%.

As shown in Table 6, varying the number of iterations, the population size and the parameters of the DSHGA results in relative errors not exceeding $0.34\%$, demonstrating that the DSHGA is robust to its parameter settings and is capable of effectively addressing the two-stage distributionally robust mean semi-variance mixed-integer optimization problem. Furthermore, the observed relative error arises from the use of the Monte Carlo method for random simulation during the computational process.
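
The relative errors can be recomputed from the profits listed in Table 6 with the short Python sketch below. It assumes that OptimalPro is the profit $6.2228 \times 10^{5}$ obtained with the default setting (1000 iterations, population 500, $P_{c} = 0.9$, $P_{m} = 0.05$); the variable names are hypothetical.

```python
OPTIMAL_PRO = 622280  # assumed reference profit: the default-setting row of Table 6

# Profits (CNY) in the row order of Table 6.
profits = [620190, 622100, 622280, 622160, 622120, 622160,
           622230, 622150, 622200, 622180, 622150]


def relative_error(pro, optimal=OPTIMAL_PRO):
    """Relative error in percent, as defined in the text."""
    return (optimal - pro) / optimal * 100


errors = [round(relative_error(p), 2) for p in profits]
print(errors)
print(max(errors))  # 0.34
```

The largest error comes from the run with only 200 iterations; every run with 1000 iterations stays within 0.03%.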

To further assess the efficacy of our proposed DSHGA, a comparative analysis is conducted against other discrete hybrid optimization techniques, including the GA developed by Liu [26]. The comparative performance metrics are presented in Table 7, while the corresponding graphical representation of the results is illustrated in Figure 4.

As demonstrated in Table 7, the implementation of two distinct algorithmic approaches for addressing the supply chain location allocation problem yields identical optimal solutions. Notably, the application of the DSHGA exhibits superior performance characteristics, achieving maximal supply chain profitability while concurrently demonstrating enhanced computational efficiency through reduced processing time. As can be seen from Figure 4, the algorithm proposed in this paper converges faster than the algorithm proposed by Liu [26], and it can be considered that the DSHGA is more suitable for solving this problem.

In Table 8, we compare the impact of three optimization models on decision-making and profit when modeling the supply chain. The examined models are (1) a two-stage distributionally robust mixed-integer optimization model ($λ = 1$), (2) a two-stage distributionally robust mean semi-variance mixed-integer optimization model and (3) a two-stage distributionally robust mean variance mixed-integer optimization model incorporating Value at Risk (VaR) as the risk measurement function. The computational results reveal distinct patterns in supply chain profitability under varying risk considerations. The baseline scenario, which excludes risk factors, yields a supply chain profit of $6.2309 \times 10^{5}$. Incorporating the SVaR metric results in a marginally lower profit of $6.2228 \times 10^{5}$, showing that risk-adjusted profit estimates are more conservative than risk-neutral ones, in line with empirical observations. Furthermore, using VaR as the risk measure generates a profit of $6.2179 \times 10^{5}$, a $0.08\%$ reduction relative to the SVaR-based model. This comparative analysis suggests that the SVaR metric provides a more accurate representation of risk perception than the conventional VaR approach. Regarding model (1), sensitivity analysis indicates that $λ$ exerts a significant influence on decision-making outcomes. Strategic adjustment of the SVaR weighting coefficient enables decision-makers to achieve targeted benefit expectations. The impact of variations in $λ$ on both supply chain profitability and objective function values is presented in Figure 5.
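
The percentage differences between the three profits quoted above can be checked with a few lines of Python. The dictionary keys are hypothetical labels for the three models, and the profits are the values stated in the text.

```python
# Worst-case profits (CNY) of the three models, as quoted in the text.
profit = {
    "risk_neutral": 623090,   # model with lambda = 1
    "svar": 622280,           # mean semi-variance model
    "var": 621790,            # model with VaR as the risk measure
}


def reduction_percent(reference, value):
    """Percentage by which `value` falls below `reference`."""
    return (reference - value) / reference * 100


var_vs_svar = round(reduction_percent(profit["svar"], profit["var"]), 2)
svar_vs_neutral = round(reduction_percent(profit["risk_neutral"], profit["svar"]), 2)
print(var_vs_svar)      # 0.08
print(svar_vs_neutral)  # 0.13
```

Both gaps are small in relative terms, which is worth keeping in mind when weighing them against the Monte Carlo noise of the simulation.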

As can be seen from Figure 5, when $λ$ increases from $0.1$ to $0.9$, the corresponding risk weight $(1 - λ)$ decreases from $0.9$ to $0.1$, the supply chain profit increases, the corresponding fitness function value increases and the function value Val = −Fit decreases. In other words, the greater the risk weight $(1 - λ)$, the more conservative the decision will be. In an uncertain environment, the decision-maker can choose the risk weight according to the actual situation and make the optimal decision.

Table 9 shows the optimal location, profit and function values of the supply chain under the worst-case and expected scenarios. The computational results show that the optimal decisions differ markedly between the two scenarios. In the worst-case scenario, the supply chain achieves a conservative profit of CNY $6.2228 \times 10^{5}$, accompanied by correspondingly conservative location decisions. This conservative configuration ensures solution feasibility across the entire uncertainty set, thereby demonstrating the model’s inherent robustness against parameter variations. Conversely, the expected scenario yields a higher profit of CNY $6.2286 \times 10^{5}$. However, while this scenario offers higher expected returns than the robust optimization approach, it potentially overestimates achievable outcomes, an optimistic bias that may not accurately reflect real-world operating conditions. The comparative analysis underscores the fundamental trade-off between solution robustness and optimality in supply chain optimization under uncertainty. Moreover, it is difficult to predict the exact probability distribution in practice. Therefore, in the absence of precise probabilistic information about uncertainty, the decision-maker can formulate a robust plan using a two-stage distributionally robust mixed-integer optimization model.

Management insights

This study develops a data-driven two-stage distributionally robust mean semi-variance mixed-integer optimization model with the following distinctive features:

(a): Simultaneous consideration of transportation cost volatility and customer demand uncertainty;
(b): Incorporation of SVaR as a risk measure for cost functions;
(c): Application of distributionally robust mixed-integer programming to ensure implementable solutions.

Compared with traditional risk-neutral approaches, our risk-averse model demonstrates three key advantages:

(a): It effectively captures volatility in uncertain outcomes;
(b): It provides more reliable decision support in uncertain environments;
(c): It enables flexible strategy adjustment based on risk preferences.

The data demonstrate that decision-makers can flexibly adjust the SVaR weighting parameter based on their risk preferences to achieve an optimal balance between expected returns and risk protection, as detailed in Table 8 and Figure 5.

Comparative analysis through the models shows the following:

(a): The results of the distributionally robust model are relatively conservative, but ensure feasibility in all scenarios;
(b): Although non-robust models show higher expected profits, there is a significant optimistic bias (see Table 9 for specific numerical comparisons).

Based on the above analysis, we derive some managerial insights. This method enables decision-makers in complex supply chain environments to carry out the following:

(a): Quantitatively evaluate the risk–return characteristics of different strategies;
(b): Avoid operational risks caused by over-optimistic estimations;
(c): Obtain uncertainty-resilient decision solutions.

This study provides a new methodological tool for supply chain risk management, particularly suitable for the current highly volatile business environment. Decision-makers can find the optimal balance between the robustness of the plan and the expected returns based on their actual risk tolerance.

7. Conclusions

This study addresses the location allocation optimization problem under conditions of uncertain transportation costs and demand fluctuations through the development of a novel two-stage distributionally robust mean semi-variance mixed-integer optimization framework. The proposed model, constructed using a data-driven methodology, incorporates an uncertainty set encompassing all probability distributions that share the same first and second moments. Compared with the traditional stochastic optimization model, the model is more robust in numerical simulation. Considering the complexity of the model, a DSHGA is proposed to solve it. Comparison with other algorithms illustrates the effectiveness of the DSHGA in terms of computation time, number of iterations and convergence speed.

It is particularly noteworthy that the proposed model in this study exhibits significant versatility, being not only applicable to conventional location allocation problems but also extendable to complex joint optimization scenarios, including emergency resource distribution and integrated production–inventory–transportation systems. The construction of uncertainty sets is the key when employing distributionally robust optimization frameworks for supply chain optimization problems. Beyond the moment-based constraint approach adopted in this work, alternative uncertainty set formulations can be developed through statistical measurement theories. Specifically, the Wasserstein distance provides a robust metric for quantifying distributional differences between empirical and reference probability distributions. Such distributions can be effectively estimated through data-driven approaches incorporating nonparametric estimation techniques, representing a valuable direction for future research endeavors. This study also has some limitations. For instance, when constructing the uncertainty set, incomplete historical data or excessive noisy data may lead to an incomplete uncertainty set or data bias. In such cases, machine learning algorithms such as clustering algorithms, deep learning algorithms and statistical learning algorithms can be incorporated to construct the uncertainty set, which also provides a valuable direction for future research.

Author Contributions

Conceptualization, Z.L. and H.R.; methodology, Z.L.; software, Z.L.; validation, Z.L. and H.R.; formal analysis, Z.L. and H.R.; investigation, Z.L.; resources, Z.L.; data curation, Z.L.; writing—original draft preparation, Z.L.; writing—review and editing, Z.L. and H.R.; methodology visualization, H.R.; supervision, H.R.; project administration, Z.L. and H.R.; funding acquisition, Z.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Natural Science Foundation of Shandong Province (no. ZR2022QA070).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Liu, J.M.; Chen, W.W.; Yang, J.Y.; Xiong, H.; Chen, C. Iterative prediction-and-optimization for E-logistics distribution network design. Informs J. Comput. 2021, 34, 769–789.
2. Gao, X.; Gao, X.; Liu, Y. A symmetric fourth party logistics routing problem with multiple distributors in uncertain random environments. Symmetry 2024, 16, 701.
3. Long, D.Z.; Qi, J.; Zhang, A.Q. Supermodularity in two-stage distributionally robust optimization. Manag. Sci. 2023, 70, 1394–1409.
4. Hu, J.; Chen, Z.; Wang, S.M. Budget-driven multiperiod hub location: A robust time-series approach. Oper. Res. 2024, 73, 613–631.
5. Niu, S.; Sun, G.J.; Yang, G.Q. Distributionally robust optimization for a capacity-sharing supply chain network design problem. J. Clean. Prod. 2024, 447, 141563.
6. Zymler, S.; Kuhn, D.; Rustem, B. Distributionally robust joint chance constraints with second-order moment information. Math. Program. 2013, 137, 167–198.
7. Esfahani, P.M.; Kuhn, D. Data-driven distributionally robust optimization using the Wasserstein metric: Performance guarantees and tractable reformulations. Math. Program. 2018, 171, 115–166.
8. Atamturk, A.; Zhang, M. Two-stage robust network flow and design under demand uncertainty. Oper. Res. 2007, 55, 662–673.
9. Nakao, H.; Shen, S.Q.; Chen, Z.H. Network design in scarce data environment using moment-based distributionally robust optimization. Comput. Oper. Res. 2017, 88, 44–57.
10. Zhang, B.Y.; Li, Q.Q.; Wang, L.H.; Feng, W. Robust optimization for energy transactions in multi-microgrids under uncertainty. Appl. Energy 2018, 217, 346–360.
11. Zhang, Y.C.; Le, J.; Zheng, F.; Zhang, Y.; Liu, K.P. Two-stage distributionally robust coordinated scheduling for gas-electricity integrated energy system considering wind power uncertainty and reserve capacity configuration. Renew. Energy 2019, 135, 122–135.
12. Lei, C.; Lin, W.H.; Miao, L.X. A two-stage robust optimization approach for the mobile facility fleet sizing and routing problem under uncertainty. Comput. Oper. Res. 2016, 67, 75–89.
13. Ding, T.; Liu, S.Y.; Yuan, W.; Bie, Z.H.; Zeng, B. A two-stage robust reactive power optimization considering uncertain wind power integration in active distribution networks. IEEE Trans. Sustain. Energy 2016, 7, 301–311.
14. Laporte, G.; Louveaux, F.V.; Van Hamme, L. Exact solution to a location problem with stochastic demands. Transp. Sci. 1994, 28, 95–103.
15. Wang, Q.; Batta, R.; Rump, C.M. Algorithms for a facility location problem with stochastic customer demand and immobile servers. Ann. Oper. Res. 2002, 111, 17–34.
16. Zadeh, A.S.; Sahraeian, R.; Homayouni, S.M. A dynamic multi-commodity inventory and facility location problem in steel supply chain network design. Int. J. Adv. Manuf. Technol. 2014, 70, 1267–1282.
17. Armas, J.D.; Juan, A.A.; Marques, J.M.; Pedroso, J.P. Solving the deterministic and stochastic uncapacitated facility location problem: From a heuristic to a simheuristic. J. Oper. Res. Soc. 2017, 68, 1161–1176.
18. Li, H.; Xie, W.; Wen, M.; Li, S.; Yang, Y.; Guo, L. An optimal location-allocation model for equipment supporting system based on uncertainty theory. Symmetry 2023, 15, 338.
19. Liu, X.; Kucukyavuz, S.; Luedtke, J. Decomposition algorithms for two-stage chance-constrained programs. Math. Program. 2016, 157, 219–243.
20. Bansal, M.; Huang, K.L.; Sanjay, M. Decomposition algorithms for two-stage distributionally robust mixed binary programs. SIAM J. Optim. 2018, 28, 2360–2383.
21. Bansal, M.; Mehrotra, S. On solving two-stage distributionally robust disjunctive programs with a general ambiguity set. Eur. J. Oper. Res. 2019, 279, 296–307.
22. Medsker, L.R. Hybrid Intelligent Systems; Kluwer Academic Publishers: Boston, MA, USA, 1995.
23. Nick, L.; Felipe, A.V. Points of distribution location and inventory management model for post disaster humanitarian logistics. Transp. Res. Part E Logist. Transp. Rev. 2018, 116, 1–24.
24. Qi, M.Y.; Jiang, R.W.; Shen, S.Q. Sequential competitive facility location: Exact and approximate algorithms. Oper. Res. 2022, 72, 300–316.
25. Liu, Z.M.; Qu, S.J.; Wu, Z.; Qu, D.Q.; Du, J.H. Two-stage mean-risk stochastic mixed integer optimization model for location-allocation problems under uncertain environment. J. Ind. Manag. Optim. 2021, 17, 2783–2804.
26. Liu, Z.M.; Wu, Z.; Ji, Y.; Qu, S.J.; Hassan, R. Two-stage distributionally robust mixed-integer optimization model for three-level location allocation problems under uncertain environment. Phys. A Stat. Mech. Its Appl. 2021, 572, 125872.
27. Liu, M.; Lin, T.; Chu, F.; Zheng, F.F.; Chu, C.B. A new and general stochastic parallel machine ScheLoc problem with limited location capacity and customer credit risk. RAIRO Oper. Res. 2023, 57, 1179–1193.
28. Soyster, A.L. Convex programming with set-inclusive constraints and applications to inexact linear programming. Oper. Res. 1973, 21, 1154–1157.
29. Ling, A.F.; Sun, J.; Xiu, N.H.; Yang, X.G. Robust two-stage stochastic linear optimization with risk aversion. Eur. J. Oper. Res. 2017, 256, 215–229.
30. Maggioni, F.; Potra, F.A.; Bertocchi, M. A scenario-based framework for supply planning under uncertainty: Stochastic programming versus robust optimization approaches. Comput. Manag. Sci. 2017, 14, 5–44.
31. Rakesh, V.; Adil, G.K. Designing a block stacked warehouse for dynamic and stochastic product flow: A scenario-based robust approach. Int. J. Prod. Res. 2019, 57, 1345–1365.
32. Corchado, E.; Abraham, A.; Carvalho, A.D. Hybrid intelligent algorithms and applications. Inf. Sci. 2010, 180, 2633–2634.

Figure 1. The process of solving the two-stage distributionally robust mean semi-variance mixed-integer optimization problem.

Figure 2. A fresh supply chain in Shanghai.

Figure 3. A fresh supply chain network in Shanghai.

Figure 4. Comparison between different algorithms.

Figure 5. The value of supply chain profit and fitness function under different $λ$ values.


Table 1. Parameters for suppliers and distribution centers.

Index	Suppliers		Distribution centers	
$I, J$	$a_{i}$	$b_{i}$	$d_{j}$	$e_{j}$
1	28,000	4.20	235	10,000
2	24,000	4.10	230	9480
3	26,000	4.40	220	9450
4	25,000	4.00	225	9470
5	/	/	240	9510
6	/	/	230	9490
7	/	/	245	9520
8	/	/	243	9505
9	/	/	233	9460
10	/	/	234	9485
11	/	/	233	9600
12	/	/	229	9380
13	/	/	219	9950
14	/	/	223	9770
15	/	/	238	9410
16	/	/	228	9800
17	/	/	243	9420
18	/	/	242	9600
19	/	/	223	9660
20	/	/	233	9600
21	/	/	243	9500
22	/	/	239	9320
23	/	/	219	9350
24	/	/	213	9370
25	/	/	226	9710
26	/	/	225	9500
27	/	/	213	9320
28	/	/	222	9500
29	/	/	223	9560
30	/	/	213	9400

Table 2. Expected costs of transportation from suppliers to distribution centers.

	I1	I2	I3	I4
$J 1$	$0.35$	$0.35$	$0.40$	$0.41$
$J 2$	$0.37$	$0.33$	$0.39$	$0.40$
$J 3$	$0.36$	$0.34$	$0.38$	$0.39$
$J 4$	$0.35$	$0.35$	$0.39$	$0.40$
$J 5$	$0.34$	$0.36$	$0.40$	$0.40$
$J 6$	$0.38$	$0.33$	$0.37$	$0.38$
$J 7$	$0.36$	$0.35$	$0.38$	$0.39$
$J 8$	$0.35$	$0.36$	$0.39$	$0.39$
$J 9$	$0.36$	$0.35$	$0.38$	$0.38$
$J 10$	$0.35$	$0.38$	$0.40$	$0.41$
$J 11$	$0.40$	$0.33$	$0.35$	$0.36$
$J 12$	$0.39$	$0.34$	$0.35$	$0.36$
$J 13$	$0.37$	$0.35$	$0.37$	$0.37$
$J 14$	$0.38$	$0.35$	$0.36$	$0.37$
$J 15$	$0.37$	$0.37$	$0.38$	$0.38$
$J 16$	$0.37$	$0.37$	$0.38$	$0.38$
$J 17$	$0.36$	$0.38$	$0.40$	$0.39$
$J 18$	$0.35$	$0.39$	$0.41$	$0.41$
$J 19$	$0.39$	$0.35$	$0.35$	$0.35$
$J 20$	$0.38$	$0.36$	$0.37$	$0.37$
$J 21$	$0.37$	$0.39$	$0.40$	$0.40$
$J 22$	$0.43$	$0.36$	$0.32$	$0.33$
$J 23$	$0.41$	$0.36$	$0.33$	$0.33$
$J 24$	$0.39$	$0.37$	$0.36$	$0.36$
$J 25$	$0.39$	$0.38$	$0.37$	$0.37$
$J 26$	$0.38$	$0.39$	$0.39$	$0.38$
$J 27$	$0.38$	$0.41$	$0.40$	$0.40$
$J 28$	$0.42$	$0.38$	$0.33$	$0.33$
$J 29$	$0.41$	$0.38$	$0.35$	$0.34$
$J 30$	$0.42$	$0.39$	$0.35$	$0.34$

Table 3. Expected costs of transportation from distribution centers to consumers.

	K1	K2	K3	K4	K5	K6	K7	K8	K9	K10
$J 1$	0.33	0.31	0.32	0.34	0.36	0.36	0.35	0.35	0.38	0.38
$J 2$	0.31	0.32	0.33	0.34	0.35	0.35	0.35	0.36	0.38	0.38
$J 3$	0.32	0.32	0.32	0.33	0.34	0.34	0.33	0.35	0.37	0.37
$J 4$	0.33	0.31	0.31	0.33	0.36	0.35	0.34	0.34	0.37	0.37
$J 5$	0.34	0.31	0.31	0.33	0.36	0.35	0.34	0.33	0.38	0.37
$J 6$	0.32	0.33	0.33	0.33	0.33	0.33	0.33	0.36	0.36	0.36
$J 7$	0.33	0.32	0.31	0.32	0.35	0.34	0.33	0.33	0.36	0.36
$J 8$	0.34	0.33	0.31	0.32	0.36	0.35	0.32	0.32	0.36	0.35
$J 9$	0.34	0.33	0.32	0.31	0.35	0.34	0.32	0.33	0.36	0.35
$J 10$	0.36	0.34	0.33	0.34	0.38	0.36	0.34	0.31	0.38	0.35
$J 11$	0.33	0.35	0.35	0.34	0.31	0.32	0.34	0.37	0.35	0.37
$J 12$	0.34	0.34	0.34	0.33	0.32	0.31	0.33	0.36	0.34	0.35
$J 13$	0.34	0.34	0.33	0.31	0.34	0.33	0.31	0.34	0.35	0.35
$J 14$	0.34	0.34	0.33	0.31	0.34	0.32	0.31	0.34	0.34	0.34
$J 15$	0.35	0.34	0.33	0.31	0.35	0.34	0.31	0.33	0.35	0.34
$J 16$	0.36	0.34	0.33	0.32	0.36	0.35	0.32	0.32	0.36	0.34
$J 17$	0.36	0.35	0.33	0.33	0.37	0.36	0.33	0.31	0.36	0.34
$J 18$	0.37	0.35	0.34	0.34	0.38	0.37	0.34	0.31	0.38	0.35
$J 19$	0.35	0.35	0.35	0.33	0.33	0.31	0.33	0.36	0.33	0.35
$J 20$	0.35	0.35	0.34	0.32	0.35	0.33	0.31	0.34	0.34	0.33
$J 21$	0.38	0.36	0.35	0.34	0.38	0.37	0.34	0.32	0.33	0.36
$J 23$	0.36	0.37	0.36	0.35	0.33	0.32	0.34	0.37	0.31	0.35
$J 24$	0.37	0.36	0.35	0.33	0.35	0.33	0.32	0.35	0.33	0.32
$J 26$	0.38	0.36	0.35	0.34	0.37	0.35	0.33	0.33	0.35	0.32
$J 27$	0.39	0.38	0.36	0.36	0.39	0.38	0.35	0.34	0.37	0.34
$J 28$	0.38	0.38	0.38	0.36	0.35	0.34	0.35	0.38	0.31	0.34
$J 29$	0.38	0.38	0.37	0.35	0.35	0.34	0.34	0.37	0.31	0.33
$J 30$	0.39	0.39	0.38	0.36	0.36	0.35	0.35	0.37	0.32	0.33

Table 4. Expected demand of consumer.

Consumer	K1	K2	K3	K4	K5
Expected demand	$8800 - ϱ r$	$9000 - ϱ r$	$10{,}000 - ϱ r$	$9800 - ϱ r$	$9100 - ϱ r$
Consumer	K6	K7	K8	K9	K10
Expected demand	$9500 - ϱ r$	$9700 - ϱ r$	$9500 - ϱ r$	$8700 - ϱ r$	$8500 - ϱ r$

Table 5. Numerical optimal solution and value of the example.

$β^{*} =$	$(0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1)$
$x_{1,5}^{*} = 9510.00$	$x_{1,20}^{*} = 9600.00$	$x_{1,21}^{*} = 4870.00$	$x_{1,26}^{*} = 4020.00$	$x_{2,3}^{*} = 9450.00$
$x_{2,6}^{*} = 9490.00$	$x_{2,15}^{*} = 430.00$	$x_{2,21}^{*} = 4630.00$	$x_{3,24}^{*} = 9370.00$	$x_{3,30}^{*} = 5030.20$
$x_{4,11}^{*} = 3730.00$	$x_{4,16}^{*} = 9800.00$	$x_{4,28}^{*} = 9500.00$	$x_{4,30}^{*} = 1970.00$	$y_{3,3}^{*} = 9450.00$
$y_{5,2}^{*} = 249.90$	$y_{5,8}^{*} = 9260.10$	$y_{6,5}^{*} = 7860.00$	$y_{6,6}^{*} = 1630.00$	$y_{11,6}^{*} = 3730.00$
$y_{15,3}^{*} = 430.00$	$y_{16,4}^{*} = 9680.10$	$y_{16,8}^{*} = 119.90$	$y_{20,2}^{*} = 8020.10$	$y_{20,9}^{*} = 1579.90$
$y_{21,5}^{*} = 1120.00$	$y_{21,10}^{*} = 8380.00$	$y_{24,1}^{*} = 8680.00$	$y_{24,2}^{*} = 610.00$	$y_{24,7}^{*} = 80.00$
$y_{26,6}^{*} = 4020.00$	$y_{28,7}^{*} = 9500.00$	$y_{30,9}^{*} = 7000.10$	$Pro = 622{,}280$	$Val = -560{,}050$

Table 6. Results of the DSHGA with different parameters.

System		Parameters		Results
T	N	$P_{c}$	$P_{m}$	$β^{*}$	Pro	Error (%)
200	500	0.9	0.05	$(0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1)$	$6.2019 \times 10^{5}$	0.34
500	500	0.9	0.05	$(0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1)$	$6.2210 \times 10^{5}$	0.03
1000	500	0.9	0.05	$(0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1)$	$6.2228 \times 10^{5}$	0.00
1000	100	0.9	0.05	$(0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1)$	$6.2216 \times 10^{5}$	0.02
1000	200	0.9	0.05	$(0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1)$	$6.2212 \times 10^{5}$	0.03
1000	300	0.9	0.05	$(0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1)$	$6.2216 \times 10^{5}$	0.02
1000	500	0.92	0.05	$(0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1)$	$6.2223 \times 10^{5}$	0.01
1000	500	0.94	0.05	$(0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1)$	$6.2215 \times 10^{5}$	0.02
1000	500	0.96	0.05	$(0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1)$	$6.2220 \times 10^{5}$	0.01
1000	500	0.9	0.02	$(0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1)$	$6.2218 \times 10^{5}$	0.02
1000	500	0.9	0.07	$(0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1)$	$6.2215 \times 10^{5}$	0.02
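The Error (%) column of Table 6 appears to be the relative shortfall of each run's Pro from the best value in the table, $6.2228 \times 10^{5}$. The table gives no formula, so this is an assumption, and the sketch below only checks that it reproduces the printed figures. The names `BEST_PRO`, `relative_error_pct` and `rows` are illustrative.

```python
# ASSUMPTION: the reference is the best Pro in Table 6
# (T = 1000, N = 500, P_c = 0.9, P_m = 0.05).
BEST_PRO = 6.2228e5


def relative_error_pct(pro, best=BEST_PRO):
    """Relative shortfall of `pro` from `best`, in percent."""
    return (best - pro) / best * 100.0


# (T, N, P_c, P_m, Pro, reported Error %) copied from Table 6.
rows = [
    (200, 500, 0.90, 0.05, 6.2019e5, 0.34),
    (500, 500, 0.90, 0.05, 6.2210e5, 0.03),
    (1000, 500, 0.90, 0.05, 6.2228e5, 0.00),
    (1000, 100, 0.90, 0.05, 6.2216e5, 0.02),
    (1000, 200, 0.90, 0.05, 6.2212e5, 0.03),
    (1000, 300, 0.90, 0.05, 6.2216e5, 0.02),
    (1000, 500, 0.92, 0.05, 6.2223e5, 0.01),
    (1000, 500, 0.94, 0.05, 6.2215e5, 0.02),
    (1000, 500, 0.96, 0.05, 6.2220e5, 0.01),
    (1000, 500, 0.90, 0.02, 6.2218e5, 0.02),
    (1000, 500, 0.90, 0.07, 6.2215e5, 0.02),
]

# Print the recomputed value next to the value reported in the table.
for T, N, pc, pm, pro, reported in rows:
    computed = round(relative_error_pct(pro), 2)
    print(f"T={T:4d} N={N:3d} Pc={pc:.2f} Pm={pm:.2f} "
          f"computed={computed:.2f} reported={reported:.2f}")
```

Rounded to two decimals, the recomputed values agree with all eleven rows of the table, which supports this reading of the column without confirming it.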

Table 7. Comparisons of different algorithms.

Algorithm	$β^{*}$	Pro	Fit	TI
DSHGA	$(0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1)$	$6.2228 \times 10^{5}$	$5.6005 \times 10^{5}$	$36937.3489$
Liu [26]	$(0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1)$	$6.2223 \times 10^{5}$	$5.6003 \times 10^{5}$	$38235.7439$

Table 8. Supply chain location and profit under different optimization models.

Algorithm	$β^{*}$	Pro	Fit
No-Risk ($λ = 1$)	$(0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0)$	$6.2309 \times 10^{5}$	$6.2309 \times 10^{5}$
SVaR	$(0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1)$	$6.2228 \times 10^{5}$	$5.6005 \times 10^{5}$
VaR	$(0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0)$	$6.2179 \times 10^{5}$	$5.5553 \times 10^{5}$

Table 9. Supply chain profit with and without robustness.

Algorithm	$β^{*}$	Pro	Fit
No-Robust	$(0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0)$	$6.2286 \times 10^{5}$	$5.6921 \times 10^{5}$
Robust	$(0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1)$	$6.2228 \times 10^{5}$	$5.6005 \times 10^{5}$

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

