A Simulation-Based Multi-Objective Optimization Framework for the Production Planning in Energy Supply Chains

: The work presents a simulation-based Multi-Objective Optimization (MOO) framework for efﬁcient production planning in Energy Supply Chains (ESCs). An Agent-based Model (ABM) that is more comprehensive than others adopted in the literature is developed to simulate the agent’s uncertain behaviors and the transaction processes stochastically occurring in dynamically changing ESC structures. These are important realistic characteristics that are rarely considered. The simulation is embedded into a Non-dominated Sorting Genetic Algorithm (NSGA-II)-based optimization scheme to identify the Pareto solutions for which the ESC total proﬁt is maximized and the disequilibrium among its agent’s proﬁts is minimized, while uncertainty is accounted for by Monte Carlo (MC) sampling. An oil and gas ESC model with ﬁve layers is considered to show the proposed framework and its capability of enabling efﬁcient management of the ESC sustained production while considering the agent’s uncertain interactions and the dynamically changing structure.


Introduction
An Energy Supply Chain (ESC) is a complex system made up of a number of components/agents interacting with each other, the environment, its hazards and threats [1,2].The components/agents are heterogeneously structured in a hierarchical system, within which they operate and cooperate in a balanced transaction environment to realize the maximization of the benefits under various environmental and safety constraints [3,4].ESCs significantly contribute to the sustainment of many industrial areas, such as biomass [5], food [6], oil and gas [7,8], chemistry [9], and sustainable and renewable energy [10][11][12].For this, ESCs must offer enhanced flexibility, innovative connectivity and communication to guarantee an orderly and healthy supply management environment to improve the way the energy industrial world copes with changes and adapts to risks [8,9,[13][14][15][16][17][18].
Energy companies have to make decisions in the planning to satisfy customer (uncertain) demands and maximize their own profits [19,20].However, planning and management are challenging because of the multiple sources of uncertainty existing in ESC.Uncertainty exists in supply and demand, propagates in the interactions throughout the whole ESC and influences the agent's profits and the ESC operations [21].The published research has addressed the planning of ESC whilst treating uncertainty.Ribas, Hamacher and Street [22] proposed a strategic planning model for an integrated oil industry supply chain taking three sources of uncertainty into account, that is, uncertainties in production of crude oil, demand for refined products and market prices.Ren et al. [23] developed a biobjective interval mix integer programming model to design a biofuel supply chain under uncertainties, which considers the CO 2 emission and the life cycle energy consumption.You, Wassick and Grossmann [24] optimized the risk-based mid-term planning of a global multi-product chemical supply chain, considering demand and freight rate uncertainties.
In addition to uncertainties, the other important aspect is that the ESC structure may change dynamically due to the changing interactions among agents.In ESC, every individual may make decisions independently at any time, and these anonymous decisions cause the whole structure to change, which results in an adaptive and dynamic ESC.Therefore, the ESC structure dynamics must be considered when modeling ESC.In this modeling, every agent makes decisions by itself, based on the production amount of the supply and the demand.The supply and the demand are uncertain, so the agent's decisions may be different in each transaction process, and thus, the structure changes from time to time.
The methodological framework proposed in the study supports a simulation-based optimization, as done in many studies [25][26][27][28].A hybrid simulation-optimization framework to assess the risk in pipelines has been proposed [29].In general, the simulation-based optimization framework is processed in two phases: simulation phase and optimization phase.In the application to ESC, the simulation model considers multiple variables, describes their influences in the behavior of the agents and provides the objective optimization values for a given simulation run configuration as the outcome.Then, the Multi-Objective Optimization (MOO) problem is solved by a Genetic Algorithm (GA).GA is a very popular meta-heuristic algorithm that has been successfully applied for simulation-based optimization.
In this research, Agent-based Modeling (ABM) and Non-dominated Sorting Genetic Algorithm (NSGA-II) are used.The overall framework is shown in Figure 1.ABM [30][31][32][33][34] is used to model and simulate the agent behavior and the ESC transaction processes; this is embedded into an optimization scheme solved by NSGA-II [35] to obtain the Pareto front of solutions with respect to the maximization of the ESC total profit and the minimization of the disequilibrium among the agent's own profits.The demand and supply uncertainties are mainly considered and modeled by a set of discrete scenarios with known probabilities [21].The uncertain scenarios are generated by Monte Carlo (MC) simulation and propagated in the dynamically changing ABM-based ESC structure, dependent on the agent decisions and other autonomous decisions.Agent's demand and supply fluctuate at each transaction time and, then, affect the ESC structure and the transaction processes.

ABM-ESC Simulation MOO
In the situation of uncertainty and dynamic ESC structure, the agents of the ESC have to undergo production planning, with the purpose of profit maximization and disequilibrium reduction, which can be measured by the ESC total profit and the disequilibrium among the agent's own profits, respectively.
Without loss of generality and for demonstration, the Non-dominated Sorting Genetic Algorithm II (NSGA-II) approach [35] is used here to solve the MOO problem to find the Pareto front of solutions.The NSGA-II algorithm has been widely used for solving MOO problems [35,36].It is selected as an optimization algorithm in this research because the uncertainty in supply and demand can be conveniently addressed [37].
Based on the min-max method [38,39], the single best compromised solution is identified among the Pareto solutions, guiding the explicit decision in the agent's plans.MC simulations are used in the proposed ABM-MOO framework, in presence of the uncertainty influencing the agent's behaviors [40][41][42].
The originality of this research is threefold: ABM is adopted to model and simulate an ESC with multiple behavioral and interactive agents: the ABM enables describing the ESC structure dynamics and accounting for the supply and demand uncertainties in the ESC transaction process.

2.
MOO is originally embedded within the ABM model for the planning of the ESC production with objectives of the ESC total profit maximization and the disequilibrium of the agent's own profit minimization.

3.
MC is used in the proposed ABM-MOO framework in a way to properly handle and control the uncertainty originating from multiple sources.
For demonstration, an application is carried out on a specific ESC for the oil industry, including five layers with crude oil producers, storages, refineries, terminal storages and retailers.The results show that the framework proposed is capable of handling the uncertainty and complexity of the problem and achieving a desired ESC maximal profit and equilibrium.
The rest of the paper is organized as follows.In Section 2, a literature review on the state-of-the-art ESC modeling is presented.Section 3 introduces the agent-based ESC model.Section 4 presents the ABM-MOO framework used to solve the production planning problem.In Section 5, the analysis of a case study is given to prove the effectiveness of the ABM-MOO framework.Conclusions are given in Section 6.

State-of-the-Art Literature Review on Energy Supply Chain Models
A variety of models have been used to model and describe the characteristics of an ESC, and they can be identified into four categories based on model types, as shown in Figure 2. Mathematical programming has been widely used to solve the problem of the optimal design of ESC networks.A variety of techniques have been applied in this context, such as Linear Programming (LP), Mixed-Integer LP (MILP), Nonlinear Programming (NLP) and MINLP.For example, a multi-objective multi-period fuzzy linear programming model is proposed to evaluate and optimize the natural gas supply chain [43].Guo et al. [44] develop a MILP model that enables biobased supply chain optimization with the integrated biomass logistical center (IBLC) concept.Robertson et al. [45] have used the NLP model to solve refinery production scheduling and unit operation optimization problems.A two-stage stochastic programming model is proposed to describe and optimize the shale gas supply chain network under uncertain conditions [46].Although applicable to the treatment of problems involving blending, continuous flow processing, production and distribution, strategic/tactical planning, etc., mathematical programming models are not flexible in dealing with the stochasticity, uncertainty and complexity of structure and interaction typically encountered in supply chains [47,48].
Analytical models built on mathematical expressions and numerical models characterize the ESC behavior and find solutions to ESC management problems by use of, for example, Game theory and Multi-Criteria Decision Making (MCDM) (including Data Envelopment Analysis (DEA), Analytic Hierarchy Process (AHP)) [49][50][51].In Reference [52], the authors proposed a novel Game-theory-based stochastic model for optimizing decentralized supply chains under uncertainty.The optimization of renewable power sources has been tackled with a fuzzy MCDM technique based on the cumulative prospect theory [53].A DEA model has been used to reduce the complexity of solving the proposed model [54].AHP combined with a fuzzy set theory enhances the reliability of the sustainable results along different stages of petroleum refinery industry projects [55].Analytical modeling can be used to evaluate and improve the performance of an ESC but has strong a limitation in the description of realistically complex supply processes, including stochastic and dynamic structures, uncertainty and partial information sharing [56].
Simulation models [57], for example, Discrete Event Simulation (DES) and Dynamic Simulation (DS), have been developed to explore the behavior of agents in ESC, with the further goals of evaluation [58], analysis and optimization [59], risk management [60], and so on.For example, Windisch et al. [59] have applied DES to simulate the raw material planning in an energy wood supply chain.Becerra-Fernandez et al. [61] have proposed a DS model for assessing alternative security of supply policy along the natural gas value chain.The modeling benefits and the large computational capacities make simulation models increasingly attractive for the modeling of realistic ESC realistic problems.
ABM provides another way to model and simulate ESC that is also applicable to continuous processes [30,31] and can be applied to various types of supply chains.For example, Raghu et al. [62] relied on ABM and Geographical Information System (GIS) to assess the environmental impact on the forest biomass supply chain.Moncada et al. [63] developed a spatially explicit agent-based model to analyze the impact of different blend mandates and taxes levied on investments in processing capacity and on production and consumption of ethanol in the biofuel supply chain.
These studies confirm that ABM can be effectively used for modeling, simulating, assessing and analyzing ESCs, but rarely has it been used in optimizing ESCs operations.To the author's knowledge, no study has attempted to solve ESC planning problems by using ABM within an optimization framework, as well as considering demand and supply uncertainties and structure dynamics at the same time.This is done here, thanks to the capability of ABM in handling complex production processes and service problems under uncertainty, which are difficult to deal with by mathematical programming and analytical models.The main features of our ABM compared with the existing literature are shown in Table 1.

The Agent-Based Energy Supply Chain Modeling
Energy Supply Chain (ESC) can be effectively described by ABM.ABM can provide logical rules to describe the agent behavior and interactions and allows simulating the ESC transaction processes in an uncertain, dynamic and time-dependent environment [30,32,33,[65][66][67].Thus, in this research, ABM is applied to the modeling, analysis and optimization of ESCs, taking into account both the demand and supply uncertainties and the structure dynamics.

Modeling of Agent Uncertain Behavior
An ESC is considered with L layers, and each l-th layer consists of V l agents, a l,1 , a l,2 , . . ., a l,v , . . ., a l,V l (Figure 3).In the ESC, the orders are sent from the Layer 1 flow layer-by-layer to the end layer L, of which the V L agents are suppliers that send supply decisions back to the demanders.An agent a l,v , v = 1, 2, . . ., V l in the l-th layer is assigned with behaviors, which allow the agent to adaptively interact with the others.The details of the model are described in the following.

Sending Orders
The process of sending orders of an agent a l,v in the l-th layer (Figure 4) is as follows: (a) a l,v chooses the supplier(s) a l+1,v in the upper layer l + 1.(b) a l,v sends orders to a l+1,v .(c) a l,v updates the list of alternative suppliers.

Receiving Orders and Choosing Demanders
The process of an agent a l,v receiving orders and choosing demander(s) is shown in Figure 5 and described as follows: (a) a l,v receives the order from a l−1,v in the lower layer l − 1.(b) a l,v checks whether the received order is empty:

•
If yes, a l,v does not choose demanders.• Otherwise, (c) a l,v checks whether it has available productions that satisfy the received orders.
-If yes, (d) a l,v checks whether the received orders exceed the existing production limitation U l,v (t) defined as: where S l,v is the storage of a l,v and S * l,v is the back-up safety storage of a l,v .* If yes, (e) a l,v refuses the order from a l−1,v demanded with the lowest bid price, and returns to (d).
* Otherwise, (f) the agent a l,v accepts the order and makes a contract with a l−1,v .Then, the existing oil production limitation of the agent a l,v (Equation ( 2)) updates for the next time t + 1: where, The process of receiving orders and choosing demander(s).

Response
One demander a l,v may negotiate with its supplier a l+1,v , in case of receiving a response from a l+1,v .This process is shown in Figure 6 and described as follows: (a) The agent a l,v receives a response from the supplier a l+1,v .(b) Check whether the order plan is satisfied:

•
If yes, a l,v stops sending the order plan.• Otherwise, (c) a l,v checks whether all the alternative suppliers have been considered: -If yes, the agent a l,v stops sending the order plan.-Otherwise, (d) the agent a l,v updates its demands, where, y l+1,v l,v (t) is the number of orders sent by the agent a l,v , which are received by the agent a l+1,v at time t, x l,v l+1,v (t) is the number of orders accepted by the agent a l+1,v , which are sent by the agent a l,v at time t.Then, (e) send orders (as discussed in Section 3.1.1)again.

Selling Production
An agent a l,v should sell productions after accepting an order plan.This process is shown in Figure 7 and defined as follows: (a) a l,v checks whether it stores enough productions satisfying the accepted order plan:

•
If yes, (b) sells the productions to the demander a l−1,v and updates the set of accepted orders and the storage for the next time t + 1 (Equation ( 4)): where S l,v is the production storage of a l,v , and (t) is the amount of the production sold by a l,v to a l−1,v .• Otherwise, (c) a l,v sells the productions to the demander who makes their income highest and then updates the set of accepted orders and the storage S l,v (t + 1).(d) Check whether the storage or the set of accepted orders is empty.

Receiving Production
An agent a l,v receives the oil production from the upstream agents a l+1,v and updates its storage S l,v (t + 1) for the next time t + 1: where w l+1,v l,v (t) is the amount of the oil production sent by the agent a l,v , which is received by the agent a l+1,v at time t, and k l,v is the production capacity of a l,v .

The Planning Problem of the Energy Supply Chain
In the aforementioned ESC, each agent a l,v needs to make decisions on the planning to satisfy its customer's uncertain demands and to maximize their own profits.This is challenging due to the fact that the uncertainty originating from the production and purchase quantities (e.g., cost, price, demand, supply) influences the agent's decisions associated with the interaction structure, which, in turn, affects the agent's profits.The uncertainty propagates throughout the dynamic transaction process in the whole ESC, making it difficult to deal with production schedules and purchase orders.
To deal with this, an MOO problem is formulated in terms of the ESC total profit (which is expected to be maximized) and the agent's own profits (for which the disequilibriums are supposed to be minimized).
Equation ( 6) defines the ESC total profit P: where A is related to the income from selling the oil production to the customer, expressed in Equation (7). where is the unit price of the agent a l,v when selling the oil production to the agent is the production sent by the agent a l,v , which is received by the agent a l−1,v at time t.
B is the purchase cost, which includes the procurement cost plus the other costs for example, the transportation cost, the labor cost and so forth.
where p l,v l+1,v is the unit price of the agent a l+1,v for selling the oil production to agent a l,v , o l+1,v l,v is the unit price for the other cost, w l+1,v l,v,t is the amount of the oil production sent by the agent a l+1,v , which is received by the agent a l,v .
Item C calculates the storage cost.
where c S l,v is the agent a l,v storage unit cost, S l,v,t is the production storage of the agent a l,v at time t.
Here, D is a penalty from the loss [68], where r(t) is equal to 1 when the agent suffers a loss after the oil production transaction at time t; otherwise, 0. ε P is an arbitrary large number for the total profit.On the other hand, the profits of the agents are expected to be different in a healthy ESC.It is of importance to measure and control the disequilibriums among the agent's own profits, E, which is defined as the sum of the dispersion quantifying the deviation of the agent's real profits from the expected values (F) and the penalty of loss in all the transaction cycles (G): where σ l is the standard deviation of the profits of the agents in the layer l, µ l is the mean of the agent's profits in the layer l, r(t) is equal to 1 if the agent suffers a loss after the oil production transaction at time t; otherwise, 0. ε E is an arbitrary, large number for the disequilibrium.
Hence, the MOO problem can be formulated: The problem will be solved by maximizing the ESC total profit ( 14) and minimizing the disequilibrium (15), simultaneously.Equations ( 16) and ( 17) are constraints defining the feasible regions for the average orders and the prices.

The Agent-Based Modeling Multi-Objective Optimization Framework
An ABM-MOO framework is originally proposed to obtain the non-dominant solutions of the Pareto fronts, which can maximize the total ESC profit P and minimize the disequilibrium among the agent's profits E. To account for demand and supply uncertainties, MC simulations are used, as sketched in Figure 8. NSGA-II as a type of GA is flexible and easy to apply to the optimization problem in our ESC.The general advantages of NSGA-II are listed as follows [69]: 1.The non-dominated sorting technique is used to get close to the optimal solution.2. The crowding distance technique is used to maintain the diversity of the solutions.3. The elitist technique is used to preserve the best solution for the next generation.Especially in our case, the number of decision variables is 39, and the ABM-ESC environment is uncertain and dynamic, which causes problems for traditional derivative-based methods but can be easily encoded and then optimized by NSGA-II.
The algorithm is summarized as follows: Set a total of MG max runs of the MC loop.

Initialization of the ABM-MOO in each n-th MC run
• Set the transaction time t = 0, 1, 2, . . ., NT max , the GA population size NP, the maximum number of GA generations NG max , the crossover coefficient C c and the mutation coefficient is the amount of productions that are produced by the agent a L,V L in the last layer at time t, and µ Q is the average value, σ 2 Q is the variance.

•
Set the NSGA-II generation index k = 1.

•
Randomly generate the order and price decision matrices ] n NP } within the feasible space.They are coded into the chromosomes of the GA.Notice that a set of Pareto fronts can be identified from each MC run of the ABM-MOO framework, and the best compromised solution can be identified among them by the min-max method [38,39].The min-max finds the highest value that one objective can be sure to get without knowing the strategy that would satisfy the other objective.Given Pareto front H ≡ (P, E), the relative deviations of P and E are defined, respectively: where P min and P max are the minimum and maximum of the fitness values P, respectively.
where E min and E max are the minimum and maximum of the fitness values E, respectively.The best compromised solution is determined as

Rank the chromosomes by running the fast non-dominated sorting algorithm and crowding distance with respect to the fitness values and identify the ranked Preto-optimal fronts
where is the best front and is the least good front

Case Study
For illustration, an oil ESC is considered, which is structured in five layers (Figure 9), including retailers (Layer 1), terminal storage (Layer 2), refinery (Layer 3), storage (Layer 4) and crude oil producers (Layer 5).Three cooperative agents are modeled in Layer 1 (i.e., a 1,1 , a 1,2 , a 1,3 ), Layer 2 (i.e., a 2,1 , a 2,2 , a 2,3 ) and Layer 3 (i.e., a 3,1 , a 3,2 , a 3,3 ), respectively.Two cooperative agents are modeled in Layer 4 (i.e., a 4,1 , a 4,2 ) and Layer 5 (i.e., a 5,1 , a 5,2 ), respectively.Customer agents try to get enough production from suppliers, and at the same time, supplier agents have to choose the demander(s) that can bring them the highest profits.Motivated by this, agents pursue relative behaviors to realize their expectations.These behaviors have been illustrated in Section 3 and define the rules and decision processes of the agents interacting with other agents in an uncertain environment.
The ABM-MOO framework is applied to the ABM of the aforementioned oil ESC, for a total of MG max = 100 MC simulation runs, in a period of 1000 transaction days.Table 2 summarizes the main parameters set for the ABM-MOO algorithm.Table 3 lists the constraints, as discussed in Equations ( 16) and ( 17), and Table 4 lists the unit prices o l+1,v l,v for the other cost from a l+1,v to a l,v .We assume the total transaction days are 1000 days, and every 30 days the agents make a deal.The variables (the supply and the demand) are sampled from Gaussian distributions when the agents make a deal, and the beginning time is different for each agent; so, the sampling size is about 30.
The modeling is implemented in the software ANYLOGIC, which is exported as a jar file and then imported in ECLIPSE.The packages of the NSGA-II are implemented in MATLAB [73].The NSGA-II algorithm is also exported as a jar file and imported in ECLIPSE to run with the modeling jar file.The production capacity of agent a l,v 1 (if l = 3, 0.8) Table 3.The values of the orders and price limitations.

Sensitivity Analysis
Based on References [28,67], we have an unformed sensitive analysis by using Sobol indices.In the Sobol indices, the higher the Sobol indices values, the more influential the respective model parameters are.The result is shown in Figure 10.Regarding the objective total profit, the order amounts of the agents are more important than the prices of the agents.Based on Equations ( 6)-( 9) in the paper, the total profit can be written as: where p 0 1,v is the unit price of the agent a 1,v by selling the oil production to the customers, z 0 1,v (t) is the amount of the oil production sent by the agent a 1,v , which is received by the customers at time t, o l+1,v l,v is the unit price for the other cost, w l+1,v l,v (t) is the amount of the oil production sent by the the agent a 1+1,v , which is received by the agent a 1,v at time t, p 5,v 6 is the unit price of the crude oil production, w 6 5,v (t) is the amount of the oil production received by the agent a 5,v at time t, c S l,v is the agent a l,v storage unit cost and S l,v (t) is the storage of the agent a l,v at time t.
Equation (21) shows that the total profit is related to the order amount but has weak sensitivity on the price, because the price of one agent is the buying cost of others.The price influence is neutralized when calculating the total profit.
Figure 11 shows the sensitivity on the order amount for the total profit.The agents in the higher layer (up-stream) have higher sensitivity for the total profit, which makes sense because the oil supply chain model is a make-to-stock model.The down-stream agent's profit strongly relies on the up-stream agent's production amount in their hand.The agents in the higher layer have greater control of the total profit.Every layer has at least one important variable sensitive to it regarding the objective of disequilibrium, ȳ1,2 in Layer 1, ȳ2,1 in Layer 2, p 3,1 4,2 in Layer 3, ȳ4,2 in Layer 4 are most sensitive to the objective disequilibrium.In each layer, the influence of the order amount and the price on the disequilibrium is complicated and different from layer to layer.For example, ȳ1,2 in Layer 1 is sensitive because its cost is large if it buys the production from agent a 2,1 and agent a 2,2 in Layer 2, which influences the equilibrium of the profit in Layer 1.

Results of the Pareto front
As shown in Figure 12, the Pareto front is not continuous and broken into three pieces, mainly because the feasible area of the profit is discontinuous due to the switch in the decisions of agents and also the constraint, which causes the jumps in the objective function values.
The Pareto front shows that the ESC total profit increases when the disequilibrium among the agent's profits tends to be large, which makes sense because some agents have price and cost advantages when they look for oil productions.Some agents can get more profit than others, which increases the total profit but also the disequilibrium among the agent's profits.
According to Equation (20) discussed in Section 4, the best compromised solution is identified by the min-max method when substituting H 1 , with P equal to 518440.3 and E equal to 1569.5. Figure 12 shows that no single solution exists that simultaneously optimizes each objective: all solutions on the Pareto front are equally good but give different strategies to the ESC ABM and get different results for the total profit and the total disequilibrium.Some solutions (like H 3 in Figure 12) can obtain a large total profit but at the expense of a large disequilibrium.Some solutions (like H 2 in Figure 12) can achieve small disequilibrium but also small profit.
In order to show the agent's own profits with the disequilibrium of each layer, three typical Pareto solutions (H 1 , H 2 and H 3 ) are chosen to input into the original ESC ABM simulation again, where H 1 is the best compromised solution identified, H 2 is the solution with the smallest values of both the ESC total profit and the disequilibrium, and H 3 is the solution with the highest values of both the ESC total profit and the disequilibrium.The values of decision variables getting H 1 , H 2 and H 3 are shown in Figures 13 and 14.

Agent's Own Profits Obtained While Re-Inputting H 1 into the Energy Supply Chain Modeling
Figure 15 shows the simulation results when the best compromised solution H 1 is substituted into the ESC ABM again.Figure 15a-e shows the profits of the agents of different layers during the transaction processes.The results show that every agent in ESC gets a profit rather than a loss, due to the fact that the penalty Equations ( 10) and ( 13) keep each agent from a profit loss.
In detail, in Layer 1 (Figure 15a), Retailer 1 can get more profit, with an average of 3830, than Retailers 2 and 3 in the transaction processes after the initial phase, thanks to the relatively lower total cost allowing greater opportunities to be accepted by the suppliers in Layer 2. This phenomenon is obvious in the intermediate Layer 2. As shown in Figure 15b, Terminal Storage 1 beats Terminal Storage 2 and 3, profiting most in Layer 2. The difference among agent's own profits weakens in Layer 3 and 4, as shown in Figure 15c,d, respectively, due to the fact that the agents in the same layer are bidding with similar price levels and, thus, have equal chances to order productions from the agents in the upper layer and to sell productions to the agents in the lower layer.It should be noted that despite slight differences in the profits of the two crude oil producers in Layer 5 (Figure 15e), both profits fluctuate severely during the transactions, because the supplier profits are mainly determined by the propagation of the demand and supply uncertainties in the supply chain network.
Figure 15f shows the disequilibrium for each layer.The results show that Layer 3 reaches the smallest value of the disequilibrium, equal to 90.8, followed by Layer 4 with a value equal to 130.3.This makes sense because the agents in Layers 3 and 4 profit with slight difference, which makes their deviations from the expected profits small, equivalent to the results of Figure 15c,d, respectively.In contrast, Layers 1 and 2 result in relatively large values of disequilibriums, equal to 643.7 and 478.4,respectively, mainly due to the imbalance of profits among agents in the layers.A value of 226.3 in Layer 5 proves that the propagation of uncertainty in the supply chain network can also result in a relatively high probability of the imbalanced profits among agents.Figure 16 compares the ESC total profits when H 1 , H 2 and H 3 operationalize the ESC ABM.The re-input of the best compromise solution H 1 results in the ESC total profit P equal to 518,440.2, which is larger than P = 490,755.1 obtained from the re-input of H 2 (with the smallest expected ESC total profit), but smaller than P = 526,831.4obtained from the re-input of H 3 (with the largest expected ESC total profit).
Furthermore, the disequilibrium of each layer is compared by re-inserting Pareto solutions into the ESC ABM, as shown in Figure 17.Despite the highest ESC total profit ( 526,831.4), the re-input of H 3 leads to the largest disequilibriums for all the layers, especially an extreme value equal to 2056.3 in the first layer, which can make the retailers own profits uncertain and less-foreseeable.In spite of the relatively large values of the disequilibriums of Layers 1 and 2, the re-input of the best compromise solution H 1 can achieve relatively small disequilibriums, equal to 90.The results of Figures 16 and 17 show that agents in the same layer can get close profits to decrease their disequilibrium, but this includes sacrificing the maximization of their own profits at the same time.This is because the weak agents in the layer are constrained by the cost and the production amount, so the strong agents have to sacrifice their own profits in order to get similar profits to the weak agents and, eventually, to minimize their disequilibrium.

Results from the Total Monte Carlo Runs
MC simulations are used in the proposed ABM-MOO framework in the presence of supply and demand uncertainties influencing the agent's behaviors.Figure 18 shows the results of the Pareto fronts for different scenarios from a total of 100 MC runs.In each run, a population size of 100 is generated and obtains 100 Pareto fronts (i.e., 10,000 Pareto solutions).According to the equation: in the literature [74], given a confidence level α, the sample size n is determined by the level of precision φ and the sample standard deviation s.Assuming s is fixed, an order of magnitude increasing in the level of precision φ requires an increase of two orders of magnitude in the sample size n.Therefore, simply increasing n is not a valid and practical solution in our study due to the huge expense of the running time.In fact, 100 Pareto fronts containing 10,000 Pareto solutions are obtained here, which is enough for our study because we estimate that the sample standard deviation s in the profit is equal to 12,072.
If the confidence level α = 0.1 is set, the level of precision φ = 197.98,which is acceptable regarding the order of magnitudes 10 5 in the profit.The Pareto solutions are distributed roughly in three regions in Figure 18.This is meaningful because, as mentioned before, the ESC ABM considers a number of continuous and discrete design variables and various other design constraints, which consequently make the design space of the ESC ABM system discontinuous and generate inherent nonlinearities in the ESC ABM, of which the concave section is dominated to produce the discontinuous Pareto front.
The mean values and the double-sided 95% confidence intervals of the ESC total profit and the ESC disequilibrium are calculated.The results show that the mean value of the expected ESC total profit (equal to 517,084.5) and of the ESC disequilibrium (equal to 1835.4) are close to those of H 1 (equal to 518,440.2 and 1569.5, respectively), demonstrating the effectiveness of selecting and re-inputting the best compromise solutions into the ESC ABM simulations.
Notice that the MC simulations are considered to control the demand and supply uncertainties and the dynamic structure in the proposed ABM-MOO framework.The estimates of the double-sided 95% confidence intervals can identify the range of the ESC total profit and the disequilibrium that their true values lie in.The ESC total profit is bounded within 493,435.2 and 532,110.5 with a 95% level of confidence, which provides a range of values ( 493,435.2, 532,110.5) that the expected ESC profit lies in; whereas, the disequilibrium is bounded between 516.5 and 3756.8, which suggests the ESC may operate in an uncertain way if the agents transact in an unbalanced and competitive environment.

Conclusions
In this research, an Agent-based Modeling Multi-Objective Optimization (ABM-MOO) framework is proposed for production planning in an Energy Supply Chain (ESC), where the agent's interactive behaviors are uncertain and the ESC structure dynamically changes.ABM is originally used to model and simulate the transaction processes by multiple behavioral and interactive agents, occurring in an ESC with an ESC dynamic structure, and supply and demand uncertainties.An MOO problem is defined to drive the optimization of the production planning towards maximizing the ESC total profit, and at the same time, minimizing the disequilibrium among the agent's profits.A Monte Carlo (MC) sampling approach is deployed to operationalize the proposed ABM-MOO framework with proper handling and to control the uncertainty originating from multiple sources that can reduce the confidence in decision making for the production planning.
As a limitation of this study, because the simulation module and the optimization module are combined, one iteration of the GA and optimization requires one run of the simulation, which is burdensome with regards to total time.Moreover, because of the limitation of the GA, we cannot conclude that we found the best solution for the production planning problem in the ESC, but our methodology produces an efficient Pareto set that allows the decision maker to compromise between the total profit and the disequilibrium.
The framework is illustrated by considering an oil ESC with five layers.The results of the case study show that the ABM-MOO framework is effective in addressing the ESC planning problem in an uncertain environment and with a dynamic structure.
Future research will account for the following: In the ESC modeling, the uncertainty can be simulated by generating random samples from a given distribution and simulating the risk by setting production flows.The agentbased ESC modeling could easily implement all the uncertainties and risks in a real setting.As we mentioned, because of other modelings limitations, the optimization with demand uncertainty and supply disruption is rarely considered in the ESC assessment.On the contrary, the optimization approach needs to account for this in ESC operation.
In the ESC, there are many agents (e.g., retailer, refinery, storage) of which variables and objectives are independent, so Many-objective Optimization Problems (MaOPs) arise in ESC, which are difficult to solve by traditional Evolutionary Algorithms (EAs).To deal with this, the cooperative co-evolutionary algorithm has been considered.
Resilience to high-impact low-probability events affecting the ESC can also be considered for optimization.

NT max
The total transaction time.o l+1,v l,v The unit price for the other cost.
The unit price of the agent a l,v by selling the oil production to the agent a l−1,v .p min l The minimum for the unit price.p max l The maximum for the unit price.
The amount of productions which are produced by the agents in the last layer at time t.r(t) The number of agents who get a loss after the oil production transaction.s The sample standard deviation of Monte Carlo simulation.S l,v (t) The storage of the agent a l,v at time t.S * l,v (t) The back up safety storage in the agent a l,v at time t.

U l,v (t)
The residual oil production limitation of the agent a l,v at time t.{v a } The sets of the agent indexes whose orders are accepted.

V l
The maximal amount of agents in the l-th layer.
The amount of the oil production sent by the the agent a l+1,v , which is received by the agent a l,v at time t.
The number of orders accepted by the agent a l,v , which is sent by the agent a l−1,v at time t.
The number of orders sent by the agent a l,v , which is received by the agent a l+1,v at time t.ȳmin l The minimum for the average orders.ȳmax l The maximum for the average orders.ȳl,v The amount of average orders sent by the agent a l,v at time t.
The amount of the oil production sent by the agent a l,v , which is received by the agent a l−1,v at time t.σ l The standard deviation of agent's profits in the layer l. σ Q The standard deviation of agent's production amount Q L,V L (t). σ y The standard deviation of agent's orders y l+1,v l,v (t).
ε E An arbitrary large value for uncertainty.ε P An arbitrary large value for total profit.λ α The z-statistic under the confidence level α. µ l The mean of agent's profits in the layer l. µ Q The mean of agent's production amount Q L,V L (t). φ The level of precision of Monte Carlo simulation.

Figure 4 .
Figure 4.The process of sending orders.
is the number of orders accepted by the agent a l,v , which are sent by the agent a l−1,v-Otherwise, (g) a l,v sends a response back to the demander a l−1,v .

Figure 6 .
Figure 6.The process of the response.

Figure 7 .
Figure 7.The process of selling production.

Figure 8 .
Figure 8.The flowchart of the Agent-based Modeling Multi-Objective Optimization (ABM-MOO) framework.

Figure 9 .
Figure 9.The main layers in the Energy Supply Chain (ESC).

Figure 10 .
Figure 10.Sensitivity for the profit and the disequilibrium.

Figure 11 .
Figure 11.Sensitivity of the order amounts for the profit.

Figure 13 .
Figure 13.The values of the orders (ton) sent by the agent a l,v for getting Pareto solutions H 1 , H 2 and H 3 .

Figure 14 .
Figure 14.The values of prices ( /ton) for getting H 1 , H 2 and H 3 .

Figure 15 .
Figure 15.The profit and the disequilibrium of the profit for each layer considering variables of H 1 .5.2.2.Comparison of Results Obtained from the Re-Inputs of H 1 , H 2 and H 3 The Pareto solutions H 2 and H 3 are re-inputed into the ESC ABM to implement the simulations of the transaction processes again.The agent's profits of different layers obtained from the re-inputs of H 2 and H 3 , respectively, show the same trends as the re-input of H 1 .The simulation results are attached in Appendix A.Figure16compares the ESC total profits when H 1 , H 2 and H 3 operationalize the ESC ABM.The re-input of the best compromise solution H 1 results in the ESC total profit P equal to 518,440.2, which is larger than P = 490,755.1 obtained from the re-input of H 2 (with the smallest expected ESC total profit), but smaller than P = 526,831.4obtained from the re-input of H 3 (with the largest expected ESC total profit).Furthermore, the disequilibrium of each layer is compared by re-inserting Pareto solutions into the ESC ABM, as shown in Figure17.Despite the highest ESC total profit ( 526,831.4), the re-input of H 3 leads to the largest disequilibriums for all the layers, especially an extreme value equal to 2056.3 in the first layer, which can make the retailers own profits uncertain and less-foreseeable.In spite of the relatively large values of the disequilibriums of Layers 1 and 2, the re-input of the best compromise solution H 1 can achieve relatively small disequilibriums, equal to 90.8 for Layer 3, 130.3 for Layer 4, and 226.3 for Layer 5, which are very close to those obtained from the re-input of H 2 (which gives the smallest expected disequilibriums), equal to 36.1, 123.2 and 204.4,respectively.These results demonstrate that the re-input of the best compromise solution H 1 can be of use in solving the ESC planning problem while leveraging the ESC total profit with the disequilibrium of the ESC layers.The results of Figures16 and 17show that agents in the same layer can get close profits to decrease their disequilibrium, but this includes sacrificing the maximization of their own profits at the same time.This is because the weak agents in the layer are constrained by the cost and the production amount, so the strong agents have to sacrifice their own profits in order to get similar profits to the weak agents and, eventually, to minimize their disequilibrium.
8 for Layer 3, 130.3 for Layer 4, and 226.3 for Layer 5, which are very close to those obtained from the re-input of H 2 (which gives the smallest expected disequilibriums), equal to 36.1, 123.2 and 204.4,respectively.These results demonstrate that the re-input of the best compromise solution H 1 can be of use in solving the ESC planning problem while leveraging the ESC total profit with the disequilibrium of the ESC layers.

Figure 16 .
Figure 16.The ESC total profits obtained from the re-inputs of H 1 , H 2 and H 3 .

Figure 17 .
Figure 17.The disequilibrium for each layer considering variables of H 1 , H 2 and H 3 .

Table 1 .
The main features of our ABM compared with the existing literature.
(t) is the amount of orders sent by agent a l,v that are received by agent a l+1,v at time t, and ȳl,v is the average value and σ 2 y the variance.based on the crowding distance, aimed at finding the Euclidean distance between chromosomes in the front that maximizes P and minimizes E. It is worth pointing out that the chromosomes in the boundary are selected all the time since they are assigned with infinite distance.
k + 1 and check that the NSGA-II stopping criterion (k > NG max ) is reached: • Check whether the MC simulation stopping criterion reaches (n > MG max ).-If yes, end ABM-MOO and get all the Pareto fronts.-Otherwise, begin a new MC simulation cycle.

Table 2 .
The setting of the parameters for the Agent-based Modeling Multi-Objective Optimization (ABM-MOO).

Table 4 .
The values of the unit prices o l+1,v l,v ( /ton) for the other costs from a l+1,v to a l,v .