Data-Driven Stochastic Scheduling for Energy Integrated Systems

As the penetration of intermittent renewable energy increases and unexpected market behaviors continue to occur, new challenges arise for system operators to ensure cost effectiveness while maintaining system reliability under uncertainties. To systematically address these uncertainties and challenges, innovative advanced methods and approaches are needed. Motivated by these, in this paper, we consider an energy integrated system with renewable energy and pumped-storage units involved. In addition, we propose a data-driven risk-averse two-stage stochastic model that considers the features of forbidden zones and dynamic ramping rate limits. This model minimizes the total cost against the worst-case distribution in the confidence set built for an unknown distribution and constructed based on data. Our numerical experiments show how pumped-storage units contribute to the system, how inclusions of the aforementioned two features improve the reliability of the system, and how our proposed data-driven model converges to a risk-neutral model with historical data.


Introduction
In the past several years, renewable energy, and wind power in particular, has steadily increased its penetration in the power system. For instance, more than 33% of the total electricity consumption served by the California Independent System Operator (CAISO) is provided by renewable energy resources [1]. Nevertheless, because wind power is usually intermittent, a high penetration of this source makes power generation fluctuate, which complicates power system operation and makes it difficult to schedule power generation [2]. An example is the duck curve [3], a well-known new pattern of the CAISO demand curve. Therefore, advanced modeling and solutions are needed so that demands can be satisfied while renewable energy is maximally harnessed.
Currently, pumped-storage units are outstanding representatives of power storage devices that can offset the impacts of the aforementioned properties of renewable energy by transferring energy from low-use periods to peak-use periods [4]. Therefore, energy systems that integrate thermal generators, wind farms and pumped-storage units are now becoming a popular research subject, and we refer to this kind of power system as an energy integrated system.
In addition, recently, significant research progress has been achieved to address the uncertainty brought by renewable energy generation [5,6]. Among such studies, stochastic optimization and robust optimization are common approaches to address the aforementioned uncertainty to help

Problem Description
In this paper, we consider the unit commitment (UC) problem in an energy integrated system in which thermal generators, wind farms and pumped-storage units are integrated. In this problem, the renewable energy generation and demands are uncertain, which results in uncertainties in the system that need to be addressed. Moreover, to accurately model the operation of thermal generators, we consider forbidden zones and dynamic ramping rate limits in the proposed model. We aim to study how the pumped-storage units improve the operation of the system, show the effectiveness of our proposed data-driven risk-averse stochastic model in optimizing the operation of the system under uncertainty and present the reliability of the system when forbidden zones and dynamic ramping rate limits are included.
Therefore, we are tasked with constructing a data-driven risk-averse two-stage stochastic model to minimize the total cost of the energy integrated system that includes forbidden zones and dynamic ramping rate limits under uncertainty. Decisions in the UCs and power generation are made at discrete time periods over a finite time horizon. Furthermore, appropriate approaches are adopted to help solve the model efficiently.

Paper Organization
The remaining parts of this paper are organized as follows. Section 2 reviews the most relevant studies in UCs and the corresponding solution techniques. Section 3.1 introduces the deterministic two-stage model with forbidden zones and dynamic ramping rate limits. Then, Section 3.2 describes how to derive the data-driven risk-averse two-stage stochastic model by constructing confidence sets with the $L_1$ and $L_\infty$ norms and how to develop the Benders decomposition algorithm to solve the corresponding model. Section 4 provides computational results and demonstrates the reliability and practicality of our proposed model. Finally, Section 5 concludes this paper.

Literature Review
There is a rapidly growing body of literature on power systems that integrate pumped-storage units and wind generation. The functions of pumped-storage units in the power system have been studied in real cases [9][10][11], and their significance has been demonstrated, especially their strength in addressing uncertainty when renewable energy is involved in the power system. Furthermore, current studies analyse in more detail the characteristics of pumped-storage units that play vital roles in addressing variability and uncertainty in the power system, for example, their capacities [4,12,13] and allocations [14][15][16], and provide detailed solutions for optimizing these two characteristics to achieve a more cost-efficient power system. Although the aforementioned studies present the advantages of pumped-storage units in real cases and offer efficient ways to utilize them, they neglect the properties of thermal generators in real-time operations, and consequently the systems that they consider cannot be fully reliable.
To address the uncertainty in the UC problem, stochastic optimization and robust optimization are two popular traditional optimization-under-uncertainty techniques. Two-stage stochastic frameworks are built for solving UC problems under uncertainty in [17,18]; the former study considers the uncertainty due to wind power output, whereas the latter investigates the problem when demands are uncertain. Similarly, a two-stage stochastic programming (TSP) approach is applied to address the uncertainty associated with the electricity price in the UC problem [19]. However, even though the aforementioned studies perform well on the issues they study, their problems cannot be solved in the extensive form when a large number of scenarios is taken into account. To overcome this obstacle, Niknam et al. [20] utilize an improved teaching-learning-based optimization algorithm and Farzin et al. [21] employ a model predictive control based algorithm. However, their methods cannot work without a predetermined probability distribution of the uncertain parameters. Indeed, a predetermined distribution is necessary whenever the stochastic optimization approach is adopted to solve the UC problem [22,23]. In addition, robust optimization has been extensively studied to help the power system make better decisions under uncertainties. Some early studies [24][25][26] apply robust optimization techniques in modelling and solving UC problems and demonstrate the effectiveness of robust optimization. However, they can all lead to excessively conservative solutions. To control the conservatism, Chen et al. [27] configure an uncertainty range to cover the uncertainty within each scenario. However, this method requires operators to pick the proper scenario, which is somewhat subjective and may lead to solutions of low accuracy. Wu et al. [28] employ a robust interval optimization algorithm to address uncertainties, which may lead to excessively conservative solutions as the wind power prediction error becomes large. Although the conservatism of robust optimization can also be reduced by adjusting the budget of uncertainty [29] or utilizing dynamic uncertainty sets [30], there is no systematic method to choose the value of the budget of uncertainty.
Currently, most studies [31][32][33] focus on a model of traditional thermal generators in a UC problem without taking into account forbidden zones and dynamic ramping rate limits. Even when these two features are included, the corresponding constraints are not reliable enough. For example, forbidden zones are considered in the UC formulations of [7,34,35], but these studies assume that power generation within forbidden zones is prohibited. This assumption is impractical, since power generation ramps continuously, which means that power generation within forbidden zones is inevitable. Regarding dynamic ramping rate limits, Li and Shahidehpour [8] model dynamic ramping rates as stepwise or piecewise functions of generation. However, the constructed models have limitations in imitating the transitions of the ramping rates across different zones (e.g., from a forbidden zone to a normal zone). Based on Li and Shahidehpour [8], Correa-Posada et al. [36] further improve the formulation with dynamic ramping rate limits by incorporating more realistic cases. Unfortunately, these studies cannot capture the dynamic ramping rate limits accurately, since they do not take into account forbidden zones, which have special requirements [37].
Motivated by these, in this paper, we try to address the aforementioned shortcomings. The main results and contributions are summarized as follows:

• We consider an energy integrated system that integrates thermal generators, wind farms and pumped-storage units. The performances under both deterministic and stochastic conditions show the benefits of employing pumped-storage units, which contribute to hedging against uncertainty, specifically by decreasing the total costs and the total online time of the generators. Indeed, the integrated system considered here enables more reliable and practical operation.

• We establish a data-driven risk-averse two-stage stochastic model, which hedges against uncertainty better than traditional stochastic or robust programming models. It does not need the predetermined distribution of uncertain parameters that traditional stochastic optimization requires. Moreover, the conservativeness level of our model can be simply adjusted based on the number of data samples, which is not possible in traditional robust optimization. Benefiting from these, we can obtain reliable solutions with less effort spent on handling data and the model's conservativeness, which increases the computational efficiency of the model.

• We consider a more accurate model by including forbidden zones and dynamic ramping rate limits simultaneously through a status transition modeling approach. This formulation performs well in improving the reliability and feasibility of the system.

Methodology
In this paper, we construct a data-driven risk-averse two-stage stochastic model with forbidden zones and dynamic ramping rate limits to minimize the total cost of the energy integrated system by aggregating generation decisions and uncertainties in the second-stage problem after determining the UCs and status transitions of the generators in the first-stage problem. By employing a status transition modeling approach, we include the forbidden zones and dynamic ramping rate limits in our proposed model while ensuring that the model is solvable. This model is precisely introduced in Sections 3.1 and 3.2.
We employ the Monte Carlo simulation method [38] to generate more data based on the historical data that we already have. By utilizing such historical data, we then derive a distributional ambiguity set, a confidence set, for the distributions of uncertain parameters, for example, uncertain demands, and optimize the total cost against the worst-case distribution in this set, leading to a data-driven risk-averse model. The construction of the confidence set is precisely introduced in Section 3.2.2. To address the computational challenge, we employ the Benders decomposition approach to solve the risk-averse two-stage stochastic model [39,40], the details of which are presented in Section 3.2.3.

Deterministic Model
In this paper, we first construct a model for the energy integrated system considering forbidden zones and dynamic ramping rate limits.
Our goal is to minimize the total cost of the system. The objective function (1) minimizes the start-up/shut-down cost and the operation cost, in which $SU_k$ and $SD_k$ represent the start-up and shut-down costs of generator $k \in G$, respectively, with $G$ being the set of generators. Note that the operating costs for absorbing or generating power by pumped-storage hydro units are usually very low and are therefore not considered in this model. The binary decision variable $u_t^k$ indicates whether generator $k$ starts up at time $t \in T$ ($u_t^k = 1$) or does not start up ($u_t^k = 0$), where $T$ represents the total time horizon. The binary decision variable $y_t^k$ indicates whether generator $k$ is online at time $t$ ($y_t^k = 1$) or offline ($y_t^k = 0$). The parameter $\alpha_n^k$ represents the $n$th break point of the operation range of generator $k$ between its lower bound, $\underline{C}^k$, and upper bound, $\overline{C}^k$. The decision variable $x_t^k$ denotes the amount of power generated by generator $k$ at time $t$.
Next, we start to build constraints.
Constraints (3) and (4) represent the minimum up and down time limits, respectively. Specifically, (3) constrains generator $k$ to remain online for at least $L_k$ time periods until $t$ if it starts up at time $t - L_k + 1$ (i.e., $y^k_{t-L_k+1} = u^k_{t-L_k+1} = 1$). Similarly, (4) constrains generator $k$ to remain offline for at least $\ell_k$ time periods until $t$ if it shuts down at time $t - \ell_k + 1$. Moreover, (5) constrains the relationship between $y_t^k$ and $u_t^k$.
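The intent of the minimum up and down time limits can be illustrated by checking a given on/off schedule directly. The following is a minimal sketch and not the constraint formulation itself: the function name `min_up_down_ok` and its arguments are hypothetical, and runs that touch either end of the horizon are skipped because their true length is unknown (an assumption made only for this illustration).

```python
def min_up_down_ok(y, min_up, min_down):
    """Return True if the 0/1 online schedule y respects both limits.

    Runs touching the start or end of the horizon are not checked,
    because their true length is unknown (illustrative assumption).
    """
    runs = []  # maximal constant runs stored as [status, length]
    for status in y:
        if runs and runs[-1][0] == status:
            runs[-1][1] += 1
        else:
            runs.append([status, 1])
    for status, length in runs[1:-1]:  # interior runs only
        required = min_up if status == 1 else min_down
        if length < required:
            return False
    return True
```

For instance, with a minimum up time of 3 periods and a minimum down time of 2 periods, the schedule `[0, 1, 1, 1, 0, 0, 1]` passes, whereas `[0, 1, 0, 0, 1]` with a minimum up time of 2 fails because the unit is online for a single period.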
Constraint (6) is the load balance constraint and (7) constrains the capacity limits of each transmission line $m$. In these two constraints, the decision variables $H^{Out}_{h,t}$ and $H^{In}_{h,t}$ represent the amount of electricity generated and absorbed by the pumped-storage unit $h$ at time $t$, respectively, where $B$ represents the set of buses, $G$ the set of thermal generators and $H$ the set of pumped-storage units in the system. Moreover, the random parameters $W_{b,t}$ and $d_{b,t}$ denote the wind power generation and the electricity demand at bus $b$ at time $t$, respectively. The deterministic parameter $K_m^b$ denotes the line flow distribution factor for transmission line $m$ at bus $b$, and $C_m$ represents the transmission capacity of line $m \in M$, where $M$ denotes the set of lines in the system. In addition, $G_b$ and $H_b$ denote the sets of thermal generators and pumped-storage units at bus $b$, respectively.
Constraints (8) are the hydro storage balance constraints, in which the variable $s_{h,t}$ represents the water reserve storage of pumped-storage unit $h$ at time $t$ and the parameters $\eta_1$ and $\eta_2$ represent the efficiencies of the absorbing and generating cycles of the pumped-storage units, respectively. In addition, Constraints (9) and (10) set the lower and upper bounds of the electricity absorbed, $L^{In}$ and $U^{In}$, and of the electricity generated, $L^{Out}$ and $U^{Out}$, by the pumped-storage units, respectively. Constraints (11) and (12) define the initial and final water storage levels, $S_0$ and $S_T$, respectively, for the pumped-storage units.
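To make the storage balance concrete, the sketch below rolls a reservoir level forward over time. The exact form of Constraint (8) is not reproduced in the text, so the recursion used here, $s_t = s_{t-1} + \eta_1 H^{In}_t - H^{Out}_t / \eta_2$, is an assumption (one common way to model pumping and generating efficiencies), and the function name `storage_trajectory` is hypothetical.

```python
def storage_trajectory(s0, h_in, h_out, eta_in, eta_out):
    """Roll the reservoir level forward one period at a time.

    Assumed balance (not taken from the paper):
        s_t = s_{t-1} + eta_in * h_in[t] - h_out[t] / eta_out
    """
    levels = [s0]
    for absorbed, generated in zip(h_in, h_out):
        levels.append(levels[-1] + eta_in * absorbed - generated / eta_out)
    return levels
```

Under this assumed form, absorbing 10 units at efficiency 0.8 and later generating 4 units at efficiency 0.5 returns the reservoir to its initial level, which is the kind of condition that fixed initial and final storage levels impose.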
After constructing the basic constraints for the thermal generators, wind farms and pumped-storage units, we next focus on the constraints for the forbidden zones and dynamic ramping rate limits.
For the relevant constraints, we first introduce binary decision variables (represented by directed dashed lines in Figure 1) that describe the status transitions of the generators among different zones, together with the dynamic ramping rate limits. As shown in Figure 1, we define a binary variable for each directed dashed line to indicate whether this line is active or not, which reflects the transition status of the generator. If the line is active, the corresponding generator changes its status following that directed dashed line. Specifically, we define the binary variable $z_u^{t,k,i}$ to represent whether the transition of generator $k$ at time $t$ is from zone $i$ to zone $i + 1$. If it is, $z_u^{t,k,i} = 1$; otherwise, $z_u^{t,k,i} = 0$. Similarly, we define the binary variable $z_d^{t,k,i}$ to represent whether the transition of generator $k$ at time $t$ is from zone $i$ to zone $i - 1$. In addition, we define the binary variables $z_{fu}^{t,k,i_f}$ and $z_{fd}^{t,k,i_f}$ to represent whether the transition of generator $k$ is from forbidden zone $i_f$ to forbidden zone $i_f$ while ramping up and ramping down at time $t$, respectively. We also define the binary variable $z_n^{t,k,i_n}$ to represent whether the transition is from normal zone $i_n$ to normal zone $i_n$, regardless of the ramping direction.
In summary, we can divide these binary variables into two types: one type for transitions across different zones (e.g., $z_u^{t,k,i}$ and $z_d^{t,k,i}$) and another type for transitions within one zone (e.g., $z_{fu}^{t,k,i_f}$, $z_{fd}^{t,k,i_f}$ and $z_n^{t,k,i_n}$). The dynamic ramping rate limits are also presented in Figure 1. Different zones, for instance, normal zone $i$ and forbidden zone $i + 1$, have different ramping rate limits, and we define the ramping up and down rates of generator $k$ in zone $i$ as $RU_i^k$ and $RD_i^k$, respectively, which are equal. Moreover, we define the parameters $P_{\min}^{i,k}$ and $P_{\max}^{i,k}$ as the minimal and maximal generation outputs of generator $k$ in zone $i$, respectively.
Based on these definitions, we have the following constraints. Constraints (13) and (14) ensure that if the generator enters the normal zone $i_n \in N_k$, where $N_k$ denotes the set of normal zones, or the forbidden zone $i_f \in F_k$, where $F_k$ denotes the set of forbidden zones, from any other zone at time $t$, then at the next time $t + 1$ it has to stay within the normal zone $i_n$ or the forbidden zone $i_f$, respectively, or leave for another admissible zone.
Constraints (15) describe the relationship between the online/offline status and the transition status. For example, if generator $k$ is offline at time $t$ ($y_t^k = 0$), none of the transition statuses of the generator is active.
Constraints (16) and (17) restrict the generator's operations within the forbidden zones by enforcing the transition direction. Specifically, (16) means that if a generator has an up-transition process at time $t$ in which it ramps up in forbidden zone $i_f$ or enters this zone from zone $i_f - 1$, then at time $t + 1$ it has to maintain an up-transition process, for instance, by continuing to ramp up in forbidden zone $i_f$ or going up to zone $i_f + 1$. Similarly, Constraints (17) describe the down-transition process.
Constraints (18) and (19) restrict the lower and upper generation bounds of generator $k$ at time $t$.
Constraints (20)-(23) describe the dynamic ramping rate limits. Specifically, Constraints (20) and (21) restrict the ramping rate of the generator within the normal zone, and Constraints (22) and (23) set the ramping rate limits across different zones, in which $\delta$ denotes the unit time interval (e.g., 15 min), $V^k$ is the start-up/shut-down ramping rate of generator $k$ and $RU_{\max}^k$ and $RD_{\max}^k$ represent the largest ramping up and down rates, respectively, of generator $k$ among all zones. Constraints (20) work in such a way that if the generator ramps within the normal zone (i.e., $z_n^{t+1,k,i_n} = 1$), it is restricted by the ramping rate corresponding to normal zone $i_n$, for instance, $RU_{i_n}^k$; otherwise, the constraints are relaxed, since $\max\{RU_{\max}^k \delta, V^k\}$ is large enough. Constraints (22) enforce the ramping amount if the generator ramps up across different zones (i.e., $z_u^{t+1,k,i} = 1$); otherwise, these constraints are also relaxed. Constraints (21) and (23) can be analysed in a similar way.
Constraints (24) and (25) enforce that once the generator enters the forbidden zone, it has to leave with the highest ramping rate, which equals the ramping rate limit of the forbidden zone $i_f$, so that the generator remains there for the shortest possible time.
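The switching behaviour described for Constraints (20) follows the usual big-M pattern, which the sketch below evaluates numerically. It is one reading of the prose description, not the paper's equation: the function name `ramp_up_limit` is hypothetical, and the start-up ramping value of 40 MW used in the usage note is an invented illustration.

```python
def ramp_up_limit(z_within, ru_zone, ru_max, v_startup, delta):
    """Upper bound on x[t+1] - x[t] under a big-M reading of Constraints (20).

    z_within  : 1 if the unit ramps inside the normal zone, else 0
    ru_zone   : ramp-up rate of that zone (MW/min)
    ru_max    : largest ramp-up rate over all zones (MW/min)
    v_startup : start-up/shut-down ramping amount (MW), hypothetical value
    delta     : length of one time interval (min)
    """
    big_m = max(ru_max * delta, v_startup)  # relaxes the bound when z_within = 0
    return ru_zone * delta * z_within + big_m * (1 - z_within)
```

With a zone rate of 2.2 MW/min and 15 min intervals, the active bound is 33 MW per interval; when the indicator is 0, the bound relaxes to the big-M value (40 MW for the illustrative start-up ramping amount).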

Data-Driven Risk-Averse Model
In this section, based on the deterministic model constructed above, we propose a data-driven risk-averse two-stage stochastic model with forbidden zones and dynamic ramping rate limits. Furthermore, we describe how to construct the confidence set and how to employ the Benders decomposition algorithm to solve such a two-stage stochastic model.

Risk-Averse Two-Stage Stochastic Model
In the traditional TSP framework, the distribution of the random parameters is known, which leads to risk-neutral decisions. For example, a set of scenarios, each with a corresponding probability, is given. In practice, however, such probabilities are uncertain, and this is the setting that we consider. We assume that the load and the renewable generation are uncertain and, for simplicity, the renewable generation is treated as a proportion of the uncertain demands, e.g., 30%. Therefore, we consider that the net load distribution (denoted by $P$) is uncertain and ambiguous. However, $P$ can be confined within a confidence set $D$ based on the available samples of historical data, the details of which are introduced in Section 3.2.2. In our risk-averse TSP model, we determine the optimal decisions in UCs and status transitions of the generators in the first stage, and then the amounts of power generation of the thermal units and pumped-storage units are decided against the worst-case distribution in the second stage. In the corresponding data-driven risk-averse TSP model, $\mathbb{E}_P[Q(y, u, \xi)]$ denotes the expectation under distribution $P$, $Q(y, u, \xi)$ denotes the optimal value of the second-stage problem, and $\xi$ denotes a given scenario. We define the variable $f_t^k(\xi)$ as the replacement of the operation costs $\Phi_t^k(x_t^k(\xi))$ of generator $k$ at time $t$ under scenario $\xi$. Thus, the decision variable $x_t^k(\xi)$ denotes the amount of power generated by generator $k$ at time $t$ under scenario $\xi$. The variables $s_{h,t}(\xi)$, $H^{In}_{h,t}(\xi)$ and $H^{Out}_{h,t}(\xi)$ are defined in a similar way. Then, without loss of generality, we assume that we have a finite set of possible scenarios, $\xi^1, \xi^2, \ldots, \xi^J$, with unknown probabilities $p_j$, $j = 1, \ldots, J$, corresponding to each scenario.
Then, the expectation can be formulated as $\mathbb{E}_P[Q(y, u, \xi)] = \sum_{j=1}^{J} p_j Q(y, u, \xi^j)$. Notice again that each probability $p_j$, $j = 1, \ldots, J$, is unknown and constrained by the confidence set $D$.

Confidence Set Construction
In this section, we describe how to construct the confidence set by following the approach introduced in [38].
To construct the confidence set, an empirical distribution first needs to be obtained through a histogram. We divide the $S$ available samples into $J$ bins to fit the historical data, so that the bins contain $S_1, S_2, \ldots, S_J$ samples, respectively. In this way, a histogram with $S = \sum_{j=1}^{J} S_j$ is constructed, and the empirical distribution for the random parameters (i.e., uncertain net loads) can be determined as $P_0 = (p_1^0, p_2^0, \ldots, p_J^0)$, with $p_j^0 = S_j / S$ for $j = 1, \ldots, J$. Considering that the true distribution of uncertain demands may be different from the empirical distribution that we construct, we then design the confidence set for the true distribution by employing statistical inference. Specifically, we construct the confidence set by employing the $L_1$ and $L_\infty$ norms, respectively, since under these two norms the derived empirical distribution converges to the true distribution as the number of available historical data samples (i.e., $S$) approaches infinity. In addition, when these two norms are used, our proposed model can be reformulated as a mixed integer linear programming (MILP) problem that can be solved efficiently.
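The histogram step can be sketched in a few lines. This is only an illustration of $p_j^0 = S_j / S$: the function name `empirical_distribution` is hypothetical, and the use of equal-width bins is an assumption made here for simplicity (the binning rule is not fixed by the description above).

```python
def empirical_distribution(samples, num_bins):
    """Equal-width histogram: return the bin probabilities p0_j = S_j / S."""
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / num_bins or 1.0  # guard against identical samples
    counts = [0] * num_bins
    for value in samples:
        j = min(int((value - lo) / width), num_bins - 1)  # clamp the maximum
        counts[j] += 1
    total = len(samples)
    return [count / total for count in counts]
```

For example, the ten samples 1, 2, ..., 10 split into three equal-width bins give the counts 3, 3 and 4, and hence the empirical probabilities 0.3, 0.3 and 0.4.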
The confidence sets $D_1$ and $D_\infty$ corresponding to the $L_1$ and $L_\infty$ norms, respectively, are constructed as follows:

$$D_1 = \{ P \in \mathbb{R}_+^J : \| P - P_0 \|_1 \le \theta \}, \qquad (29)$$
$$D_\infty = \{ P \in \mathbb{R}_+^J : \| P - P_0 \|_\infty \le \theta \}, \qquad (30)$$

where $\| P - P_0 \|_1 = \sum_{j=1}^{J} |p_j - p_j^0|$, $\| P - P_0 \|_\infty = \max_{1 \le j \le J} |p_j - p_j^0|$, and $\theta$ in (29) and (30) denotes the tolerated distance between the empirical distribution and the true distribution, which is determined by the quantity of data and the confidence level. If we set the confidence level at 99%, the true distribution lies within the confidence set $D_1$ or $D_\infty$ with a probability of 99%. In addition, when a larger amount of historical data is utilized, the distance can be smaller; in other words, $\theta$ decreases. This means that, with a fixed confidence level (e.g., 99%), the confidence set shrinks as the quantity of data increases. For further details on how $\theta$ is defined by the confidence level and the number of samples, please refer to [38], in which the details of the convergence rate are introduced.

Benders Decomposition
The Benders decomposition algorithm has been a very popular approach to solving many stochastic problems in power systems (e.g., [5,6,38,41]).
Before introducing the detailed Benders decomposition algorithm, for notational brevity and ease of reading, we first rewrite our proposed model in an abstract form by employing matrices and vectors, which is denoted by (PP). For instance, we let $e$ denote a vector whose components are all 1. Then, we can derive an abstract formulation, one constraint of which reads

$$L_1 z_n + L_2 z_u + L_3 z_d + L_4 z_{fu} + L_5 z_{fd} + L_6 y = 0,$$

where Constraint (32) represents Constraint (2), Constraint (33) [...]

Considering that the scenarios $j$, $j = 1, \ldots, J$, are independent, which means that the second-stage minimization formulations $\min e^T f_j$ are independent, we can interchange the operations of minimization and summation in the objective Function (31) and reformulate the model accordingly.

Then, through the Benders decomposition method, the proposed problem can be separated into a master problem, denoted as (MP), and a subproblem, denoted as (SP). In (MP), $\sigma$ denotes the operation cost under the worst-case distribution in the second-stage problem, and feasibility cuts and optimality cuts are generated and added to (MP) iteratively, the details of which are introduced in the following. [...] (46), respectively, which are also matrices. Then, the two maximization problems can be integrated, and the dual of the subproblem, denoted as (DSP), can be formulated.

If the objective value of problem (50) (i.e., (DSP)) is $+\infty$, then (DSP) is unbounded, which means that (SP) is infeasible due to duality theory. In that case, we need to generate feasibility cuts and add them to the problem (MP). In the problem (MP), since we denote the variable $\sigma$ as the optimal value of the problem (SP), we should have $\sigma \ge \mathrm{optval(DSP)}$, in which $\mathrm{optval(DSP)}$ represents the optimal value of problem (DSP). However, if not, that is, if $\sigma < \mathrm{optval(DSP)}$, then the current solution is not optimal; accordingly, we need to add optimality cuts to the master problem (MP), in which $\ldots, \lambda_j^{10}$, $j = 1, \ldots, J$, are the optimal solutions obtained by solving (DSP). Then, by adding feasibility cuts and optimality cuts, the Benders decomposition algorithm is guaranteed to converge to the global optimum [39,40]. The implementation of such an algorithm is described as follows:
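The master/subproblem loop can be illustrated on a deliberately tiny problem. The sketch below is not the paper's algorithm and does not solve the UC model: it is a toy with a single integer first-stage decision, closed-form dual solutions, fixed nominal probabilities instead of a worst-case distribution, and complete recourse (so only optimality cuts arise, never feasibility cuts). All names (`benders_toy`, `unit_cost`, `penalty`) are hypothetical.

```python
def benders_toy(unit_cost, penalty, demands, probs, y_max, tol=1e-9, max_iter=50):
    """Toy Benders loop with optimality cuts only.

    First stage: pick an integer capacity y in 0..y_max at cost unit_cost * y.
    Second stage, scenario j: pay penalty * max(d_j - y, 0) for unmet demand.
    """
    cuts = []  # each cut (a, b) encodes sigma >= a - b * y
    for iteration in range(1, max_iter + 1):
        # Master problem (MP): enumerate y; sigma is the tightest cut value
        # (second-stage costs are non-negative, so sigma >= 0 is always valid).
        best = None
        for cand in range(y_max + 1):
            sigma_cand = max([a - b * cand for a, b in cuts] + [0.0])
            value = unit_cost * cand + sigma_cand
            if best is None or value < best[0]:
                best = (value, cand, sigma_cand)
        _, y, sigma = best
        # Dual subproblem (DSP): the dual price equals `penalty` in every
        # scenario where demand exceeds the chosen capacity, else 0.
        duals = [penalty if d > y else 0.0 for d in demands]
        sp_value = sum(p * lam * (d - y) for p, lam, d in zip(probs, duals, demands))
        if sigma >= sp_value - tol:
            # sigma already covers the true second-stage cost: stop.
            return y, unit_cost * y + sp_value, iteration
        # Optimality cut: sigma >= sum_j p_j * lam_j * (d_j - y).
        a = sum(p * lam * d for p, lam, d in zip(probs, duals, demands))
        b = sum(p * lam for p, lam in zip(probs, duals))
        cuts.append((a, b))
    raise RuntimeError("no convergence within max_iter")
```

Calling `benders_toy(1.0, 4.0, [2, 5, 8], [1/3, 1/3, 1/3], 10)` adds two optimality cuts and then stops at capacity 8, where the master's estimate $\sigma$ matches the subproblem value. In a risk-averse variant, the fixed `probs` would be replaced in each iteration by the worst-case probabilities from the confidence set.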

Reformulation Techniques
To characterize the constraints P ∈ D, we utilize the following reformulation techniques to transform the above subproblem into an MILP problem.
• $L_1$ norm case: For the $L_1$ norm case, $P \in D$ represents $\sum_{j=1}^{J} |p_j - p_j^0| \le \theta$, which is equivalent to
• $L_\infty$ norm case: For the $L_\infty$ norm case, $P \in D$ represents $\max_{1 \le j \le J} |p_j - p_j^0| \le \theta$, which is equivalent to
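Both norm constraints bound how far the probabilities $p_j$ may move away from $p_j^0$. A standard way to linearize them (stated here as an assumption, since the equivalent forms are not reproduced above) is to bound each difference $p_j - p_j^0$ between $-t_j$ and $t_j$ with $\sum_j t_j \le \theta$ in the $L_1$ case, and between $-\theta$ and $\theta$ in the $L_\infty$ case. For fixed scenario costs, the worst-case distribution inside either set can also be found without a solver by shifting probability mass greedily. The sketch below does exactly that; it is an illustration rather than the MILP reformulation, and the function names are hypothetical.

```python
def worst_case_l1(costs, p0, theta):
    """Maximise sum(p * cost) over {p >= 0, sum p = 1, sum |p - p0| <= theta}."""
    p = list(p0)
    top = max(range(len(costs)), key=lambda j: costs[j])
    # Mass moved into `top` counts twice in the L1 distance (once in, once out).
    budget = min(theta / 2.0, 1.0 - p[top])
    p[top] += budget
    for j in sorted(range(len(costs)), key=lambda j: costs[j]):  # cheapest first
        if j == top or budget <= 0.0:
            continue
        taken = min(p[j], budget)
        p[j] -= taken
        budget -= taken
    return p


def worst_case_linf(costs, p0, theta):
    """Maximise sum(p * cost) over {p >= 0, sum p = 1, max |p - p0| <= theta}."""
    lower = [max(0.0, q - theta) for q in p0]
    upper = [min(1.0, q + theta) for q in p0]
    p = list(lower)
    remaining = 1.0 - sum(lower)  # mass still to be placed
    for j in sorted(range(len(costs)), key=lambda j: -costs[j]):  # costliest first
        added = min(upper[j] - lower[j], remaining)
        p[j] += added
        remaining -= added
    return p
```

With scenario costs 1, 2, 3, empirical probabilities 0.3, 0.3, 0.4 and $\theta = 0.2$, the nominal expected cost is 2.1; the $L_1$ worst case moves 0.1 of mass from the cheapest to the costliest scenario (expected cost 2.3), while the $L_\infty$ worst case moves 0.2 (expected cost 2.5). As $\theta$ shrinks to 0, both functions return the empirical distribution.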

Case Studies
In this section, we use various datasets to test our proposed model. It is a general model that can run over different time horizons, depending on the fluctuation of the uncertain parameters as well as operators' needs. In this paper, it runs as a look-ahead unit commitment (LAUC), with a shorter time horizon than the day-ahead unit commitment: six hours are considered and each time interval corresponds to fifteen minutes (i.e., δ = 15 min), as is common in LAUC [6], leading to 24 time intervals in total. It can also run as a day-ahead unit commitment covering 24 h with each time interval being 1 h, but in this paper we only study the former setting. If the uncertain parameters fluctuate more significantly, the time horizon and time interval may be shortened.
In particular, since the problem is extremely complicated, we first test the deterministic model in Section 4.1, which allows us to obtain basic insights into the contributions of the pumped-storage units in improving decisions and into the effectiveness of our model when the forbidden zones and dynamic ramping rate limits are included. Then, in Section 4.2, we extend the experiments to the data-driven risk-averse two-stage stochastic model and show how the data impact the performance of the proposed model and how the pumped-storage units contribute to improving the final decisions under uncertainties. All experiments are carried out by employing the commercial solver CPLEX 12.7.1 on an Intel i5 2.3 GHz personal computer with 4 GB of memory.

Deterministic Conditions
In this section, the deterministic model is implemented with the modified IEEE 6-bus system [25]. The system contains 6 buses, 4 thermal generators, 1 wind farm, 1 pumped-storage unit and 6 transmission lines. The loads are deterministic with mean values of 50 MW at bus 2, 80 MW at bus 3 and 200 MW at bus 6. We assume that the power generation from the wind farm contributes to 30% of the total demands.
We denote the deterministic model with pumped-storage units as "With PS" and the one without pumped-storage units as "Without PS". After running these two models in the aforementioned system, the objective cost ($) for the model "With PS" is 82066.7, while for the model "Without PS" it is 91096.5. This result indicates that pumped-storage units can make significant contributions towards reducing the total costs. In addition, we test the effectiveness of the pumped-storage units in optimizing unit commitment decisions, which are shown in Table 1. The amounts of electricity absorbed and generated by the pumped-storage units are presented in Table 2, in which the values in the time periods not mentioned are all 0.

Table 1. Online statuses of the six-bus system under deterministic conditions. Columns: Hour; With PS (G 1, G 2, G 3, G 4); Without PS (G 1, G 2, G 3, G 4).

From Table 1, we can observe that generator 2 (denoted as G 2 in the table) in the model "With PS" only stays online in hours 1 and 2, whereas generator 2 in the model "Without PS" remains online during the whole time horizon. In addition, the online time of generator 3 (i.e., G 3) in the model "With PS" is only 1 h, which is much shorter than the 5 h for which generator 3 in the model "Without PS" remains online. Therefore, the energy integrated system with pumped-storage units performs better than a system without pumped-storage units in reducing the total costs of the system and shortening the online time of the generators.
Then, to verify if our proposed model can represent operations of thermal generators accurately, we present the amount of power generation of generator 1 (i.e., G 1 ) over different time intervals in Figure 2. The dynamic ramping rates (MW/min) of this generator are 2.2, 2, and 1.8 for generation ranges (MW) (e.g., forbidden zones and normal zones) of 50-125, 125-190 and 190-250, respectively. Among such ranges, the forbidden zone is the generation range (MW) of 125-190, whose boundaries have been marked in red lines in Figure 2, while the others are normal zones. In Figure 2, it is obvious that the power generation of generator 1 within the forbidden zone lies in time periods 3 and 4, in which the generator ramps up from 134.243 MW to 164.243 MW with a ramping rate of 2 MW/min, reaching the ramping rate limit in the corresponding forbidden zone. In addition, we can observe that the generator enters the forbidden zone after ramping up in time period 2, and then continues ramping up until it leaves the forbidden zone. Thus, it can be verified that our proposed model can imitate the operations of a real-life system accurately, which leads to more reliable and practical solutions.

Stochastic Conditions
In this section, we test our data-driven risk-averse two-stage stochastic model with the modified IEEE 118-bus system, which is available online at http://motor.ece.iit.edu/data. This system has 118 buses, 54 thermal generators, 186 transmission lines, 1 wind farm and 1 pumped-storage unit. The corresponding generator data, fuel data and hourly demand data of each bus are provided with the system. We assume that the renewable power generation from the wind farm contributes 15% of the total demand. For simplicity, the renewable power output is treated as negative system load, so that the net load at each time is defined as the difference between the system load and the wind power output. In addition, there are 24 time intervals in total, each lasting 15 min (i.e., δ = 15). We compare our proposed model utilizing the L1 and L∞ norms with the traditional stochastic programming model in the modified IEEE 118-bus system with and without pumped-storage units. We summarize the models as follows: (TTSP), the traditional two-stage stochastic programming model; (DD-1), the data-driven risk-averse model with the L1 norm; and (DD-Inf), the data-driven risk-averse model with the L∞ norm.

To generate the set of historical demand data in each time period, we first randomly generate the demand of each bus in each time interval from the demand data of the corresponding hour provided with the IEEE 118-bus system. We then apply Monte Carlo sampling, taking these generated demands as the mean values and using a deviation equal to 30% of the mean. In addition, we set the number of bins to 5 (i.e., J = 5), and the k-means algorithm is employed to allocate the samples into the 5 bins.
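The data-generation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure actually used in the experiments: the samples are assumed to be normally distributed and clipped at zero (the text only specifies the mean and the 30% deviation), the 100 MW mean is a placeholder, and the one-dimensional k-means routine is a plain Lloyd iteration written with the standard library. All function names are hypothetical.

```python
import random


def sample_demands(mean, n_samples, rel_dev=0.30, seed=0):
    """Draw Monte Carlo demand samples around `mean`.

    Assumption: normal distribution with standard deviation rel_dev * mean,
    clipped at zero so that demands stay non-negative.
    """
    rng = random.Random(seed)
    return [max(0.0, rng.gauss(mean, rel_dev * mean)) for _ in range(n_samples)]


def kmeans_1d(values, k=5, n_iter=50):
    """Plain Lloyd's algorithm on scalars; returns (centers, labels)."""
    ordered = sorted(values)
    # Initialise centers at evenly spaced quantiles so the run is deterministic.
    centers = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    labels = [0] * len(values)
    for _ in range(n_iter):
        # Assignment step: each sample goes to its nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def empirical_distribution(labels, k=5):
    """Fraction of samples falling in each bin (the reference distribution)."""
    return [labels.count(j) / len(labels) for j in range(k)]


samples = sample_demands(mean=100.0, n_samples=1000)  # 100 MW is a placeholder mean
centers, labels = kmeans_1d(samples, k=5)
p0 = empirical_distribution(labels, k=5)
print([round(c, 1) for c in centers], [round(p, 3) for p in p0])
```

The bin centers play the role of the J = 5 scenarios, and the bin frequencies give the empirical distribution around which a confidence set can be built.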

Effects on Unit Commitment
This experiment tests the effect of including pumped-storage units on the online/offline statuses of the thermal generators in the system. In addition, we test how the historical data affect the UC decisions.
Unlike deterministic conditions, under stochastic conditions risk plays a pivotal role in shaping the optimal solutions of the proposed model. However, when the pumped-storage unit is included in the system, it can compensate for overproduction or underproduction, which helps to mitigate risk, and consequently the corresponding UC decisions may change. In addition, in the traditional TSP problem, the probability of each scenario is taken as known, which leads to risk-neutral decisions. Conversely, our data-driven model provides decisions against the worst-case distribution, leading to risk-averse decisions. Furthermore, when the number of data samples increases, the risk-averse level of the data-driven model decreases, which affects the corresponding UC decisions. To investigate these effects further, we implement the model on the modified 118-bus system with and without the pumped-storage unit while the number of data samples varies. The UC decisions are shown in Table 3.

Table 3. Online statuses of generator 52 from the 118-bus system under the L1 and L∞ norms. Columns: number of samples; T1 to T6 (With PS); T1 to T6 (Without PS).

From Table 3, it can be observed that generator 52 in the system with the pumped-storage unit remains online for less time than the same generator in the system without the pumped-storage unit. Specifically, when the number of data samples is 500 or 1000, the generator in the system with the pumped-storage unit remains online for only the first two hours, while the generator in the other system remains online during the whole time horizon. Similarly, when the number of data samples is 2000 or 5000, the generator no longer starts up in the system with the pumped-storage unit, whereas in the other system it remains online for two hours. The same holds in the risk-neutral case, in which the scenario probabilities are treated as known and are estimated from 20,000 data samples. The reason is that the pumped-storage unit behaves like a reserve that compensates for the uncertain demand and thereby hedges against the risk. As a result, operators can start up generators to serve load rather than to prepare for uncertainties, and the online time of the generator can be reduced.
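To make the contrast between risk-neutral and worst-case expectations concrete, the sketch below evaluates the worst-case expected cost over an L1-norm confidence set for a small discrete distribution. It is only an illustration under assumptions: the scenario costs and reference probabilities are made-up placeholders, the function name is hypothetical, and the greedy rule replaces the optimization machinery of the actual model (which is solved with a decomposition algorithm).

```python
def worst_case_expectation_l1(costs, p0, theta):
    """Maximise sum_j p_j * costs_j over {p on the simplex : ||p - p0||_1 <= theta}.

    Greedy argument: because p and p0 both sum to one, at most theta / 2 of
    probability mass can be moved, and the most damaging move takes it from
    the cheapest scenarios and gives it to the most expensive one.
    """
    p = list(p0)
    worst = max(range(len(costs)), key=lambda j: costs[j])
    budget = theta / 2.0
    for j in sorted(range(len(costs)), key=lambda j: costs[j]):
        if j == worst or budget <= 0.0:
            continue
        moved = min(p[j], budget)  # cannot push a probability below zero
        p[j] -= moved
        p[worst] += moved
        budget -= moved
    return sum(pj * cj for pj, cj in zip(p, costs)), p


# Hypothetical second-stage costs ($) and reference probabilities for J = 5 bins.
costs = [100.0, 120.0, 150.0, 200.0, 300.0]
p0 = [0.2, 0.2, 0.2, 0.2, 0.2]

risk_neutral = sum(pj * cj for pj, cj in zip(p0, costs))
for theta in (0.0, 0.1, 0.4):
    value, p = worst_case_expectation_l1(costs, p0, theta)
    print(theta, round(value, 2), [round(x, 2) for x in p])
```

With a tolerance of zero the worst-case value equals the risk-neutral expectation, and it grows as the confidence set widens, which is the mechanism behind the more conservative decisions of the data-driven model.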
In addition, we can observe that when more historical data are given, in both the system with the pumped-storage unit and the system without it, the generator remains online for less time, and when the number of data samples is large enough, the online/offline statuses are the same as those obtained from the risk-neutral TSP model. The reason is that as the number of data samples increases, the risk-averse level decreases and the estimated probability distribution becomes closer to the true distribution. The benefit is that operators can decide more precisely whether it is necessary to keep generators online to prepare for future uncertainties.
Consequently, pumped-storage units can reduce the online time of generators under uncertainties by providing compensation that hedges against risk. In addition, as the number of data samples increases, the risk-averse level decreases, and generators need to remain online for less time to hedge against possible future uncertainties.
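The dependence of the risk-averse level on the sample size can be illustrated through the size of the confidence set. The sketch below assumes that the tolerance θ takes the form commonly used for discrete empirical distributions in the data-driven stochastic programming literature, namely θ = J/(2S) ln(2J/(1 − β)) for the L1 norm and θ = 1/(2S) ln(2J/(1 − β)) for the L∞ norm, where S is the number of samples, J the number of bins and β the confidence level. These expressions are an assumption here, since the exact formulas are not restated in this section; the function names are hypothetical.

```python
import math


def theta_l1(n_samples, n_bins=5, beta=0.99):
    # Assumed bound: theta = J / (2 S) * ln(2 J / (1 - beta))
    return n_bins / (2.0 * n_samples) * math.log(2.0 * n_bins / (1.0 - beta))


def theta_linf(n_samples, n_bins=5, beta=0.99):
    # Assumed bound: theta = 1 / (2 S) * ln(2 J / (1 - beta))
    return 1.0 / (2.0 * n_samples) * math.log(2.0 * n_bins / (1.0 - beta))


# Sample sizes taken from the experiments; J = 5 and beta = 0.99 as in the paper.
for s in (500, 1000, 2000, 5000):
    print(s, theta_l1(s), theta_linf(s))
```

Under these assumed bounds, θ shrinks in proportion to 1/S for both norms, and the L1 tolerance is J times the L∞ tolerance for the same data, so the confidence set tightens as more historical data become available.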

Effects on Conservativeness
In this section, we study the impact of historical data on the conservativeness of our proposed model. Specifically, with the confidence level fixed at 99%, we test our proposed model with the number of data samples increasing from 5 to 5000 in systems with and without the pumped-storage units. The results are reported in Tables 4 and 5 and Figures 3 and 4.

From Tables 4 and 5 and Figures 3 and 4, we find that in both systems the objective cost decreases as the number of data samples increases, regardless of the norm that is employed. This is because a larger number of data samples leads to a smaller confidence set for the ambiguous distribution, and consequently the results become less conservative. When the number of historical data samples is large enough, the data-driven risk-averse model converges to the risk-neutral model. Furthermore, Figures 3 and 4 show that the convergence is rapid: when the number of data samples reaches 1000, the objective costs in both systems have almost converged, regardless of the norm that is employed.

In further studies of the conservativeness, we find that the objective costs in DD-1 are always higher than those in DD-Inf, which reflects that DD-1 is more conservative than DD-Inf. This is because the L1 norm leads to a larger convergence rate (i.e., θ) than that derived from the L∞ norm, which is shown in Figure 5. (Figure 5 is reproduced with permission from [6], IEEE Transactions on Industrial Informatics, 2018.) In practice, system operators can choose between models based on different norms according to the desired risk-averse level.

Then, we study the value of the extra data in reducing the conservativeness of the model. First, two gaps, Gap 1 (s) and Gap ∞ (s), are defined as the objective cost differences between the data-driven risk-averse models DD-1 and DD-Inf, respectively, and the traditional two-stage stochastic model (TTSP).
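We specify the gaps as follows:

$$\mathrm{Gap}_1(s) = \mathrm{Cost}_1(s) - \mathrm{Cost}_0(s), \qquad \mathrm{Gap}_\infty(s) = \mathrm{Cost}_\infty(s) - \mathrm{Cost}_0(s),$$

where $\mathrm{Cost}_0(s)$, $\mathrm{Cost}_1(s)$ and $\mathrm{Cost}_\infty(s)$ denote the objective costs of (TTSP), (DD-1) and (DD-Inf), respectively, when the number of data samples equals s.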
Based on these definitions, we define the value of the data, Vod 1 (s) and Vod ∞ (s), for the two norms.

Tables 6 and 7 report the value of the data in the two systems with the two norms employed as the number of data samples varies. As the number of data samples increases, both "Gap" and "Vod" decrease. Additionally, when the number of data samples reaches 5000, both Vod 1 and Vod ∞ are near 1, which means that additional data bring very limited value to our data-driven risk-averse model and that our proposed model converges to the risk-neutral model rapidly. Therefore, the results demonstrate that a solution close to the true one can be obtained without requiring a very large amount of data.

Table 6. Value of the extra data in the 118-bus system with a pumped-storage unit.

Effects on Costs
Next, we investigate the impact of pumped-storage units on the objective costs of the system under uncertainties. Tables 4 and 5 show that the pumped-storage unit contributes towards reducing the objective costs of the system under uncertainties. In a further investigation, we compute the decrease in the objective cost when the pumped-storage unit is included, together with the corresponding percentage, defined as the decrease divided by the optimal cost of the system without the pumped-storage unit. The results are reported in Table 8 and Figure 6.
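As a worked illustration of this percentage, the sketch below applies the definition to the two deterministic six-bus costs reported earlier in the paper (91096.5 without and 82066.7 with the pumped-storage units). These inputs are not the stochastic results of Table 8, and the function name `cost_reduction` is hypothetical.

```python
def cost_reduction(cost_without_ps, cost_with_ps):
    """Return (absolute decrease, decrease as a percentage of the cost without PS)."""
    decrease = cost_without_ps - cost_with_ps
    return decrease, 100.0 * decrease / cost_without_ps


# Deterministic six-bus objective costs ($) quoted earlier in the paper.
decrease, percent = cost_reduction(91096.5, 82066.7)
print(f"decrease = {decrease:.1f} $, i.e. {percent:.2f} % of the cost without PS")
```

For these two values the decrease is about 9029.8 $, which is a little under 10% of the cost without the pumped-storage units.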
From Table 8 and Figure 6, we can observe that the objective cost is reduced when the system includes the pumped-storage unit. Furthermore, as the number of historical data samples increases, both the cost reduction and the corresponding cost reduction percentage exhibit decreasing trends, regardless of the norm that is employed. In other words, when the number of data samples is small, meaning that the risk-averse level is high, the pumped-storage unit contributes more towards reducing the objective cost. As mentioned previously, the pumped-storage unit plays a pivotal role in hedging against the uncertainty and risk in the system.

Table 8. Cost decrease in the 118-bus system.

Figure 6. Effects of historical data on the decrease in the total cost in the 118-bus system.

Conclusions
In this paper, we consider an energy integrated system in which thermal generators, wind farms and pumped-storage units are integrated under uncertainties. We propose a data-driven risk-averse two-stage stochastic model with forbidden zones and dynamic ramping rate limits. We first construct a deterministic model and then extend it to the data-driven risk-averse TSP model. The numerical results indicate that pumped-storage units can contribute towards reducing the objective costs of the system and decreasing the online time of generators. In addition, the results show that in our model the generator is allowed to enter the forbidden zone, but it has to leave the zone at the highest possible ramping rate and with a fixed ramping direction, which indicates that our approach improves the reliability and practicality of the system. Then, we implement our data-driven model to hedge against uncertainties, and from the results obtained by solving the model with the Benders decomposition algorithm, we find that pumped-storage units can contribute towards decreasing the online time and the objective costs of the system under uncertainties, especially when the risk-averse level is high. Furthermore, our model utilizes historical data to construct the distributional confidence set, and we demonstrate that when the number of historical data samples is large enough, our proposed model converges to the risk-neutral model.