Generation Capacity Expansion Planning Considering Hourly Dynamics of Renewable Resources

Abstract: As more generation capacity using renewable sources is accommodated in the power system, methods to represent the uncertainty of renewable sources become more important, and stochastic models with different methods for uncertainty representation have been introduced. This paper investigates the impacts of representing the hourly variability of random variables in a stochastic generation capacity expansion planning model. To represent the hourly variability as well as the uncertainty of random parameters such as wind power availability, solar irradiance, and load, the AutoRegressive-To-Anything (ARTA) stochastic process is applied. Using the autocorrelations and marginal distributions of the random parameters, a stochastic process with hourly intervals is generated, and the generated random sample paths are used as scenarios. A mathematical formulation using stochastic programming is presented, and a modified IEEE 300-bus system with transmission line constraints is used as a test system. Optimal generation capacity solutions obtained using GAMS/CPLEX are compared to those from a model capturing only the uncertainty and seasonal variability of the random parameters. The comparison indicates that the economic value of solar photovoltaic (PV) generation may be overestimated when the hourly variability is not reflected; thus, ignoring the hourly variability may lead to higher building costs and higher capacity of solar PV systems.


Introduction
Uncertainty in power system planning becomes a critical factor as the capacity of renewable generators in power systems increases to cope with global warming and climate change due to electricity generation. In this context, generation capacity planning models considering the uncertainty of renewable resources have been developed and can be found in the literature [1][2][3]. In those models, renewable resources such as wind and solar power are represented by random variables or uncertainty sets as a means of quantifying uncertainty.
In [1], hydropower resources with uncertainty are quantified as scenario sets, and a mathematical generation planning model is formulated with two-stage stochastic programming. In [2], uncertainty sets for electric demand and wind power generation capability are formed via a statistical method, k-means clustering. A multi-stage stochastic generation planning model is presented in [3], where seasonal, day- and nighttime random samples are generated for wind power availability, solar direct normal irradiance (DNI), and electric load using Gaussian copula. Seasonal variability is represented in the scenarios; however, the short-term variability or chronology is not included. In [4], generation capacity planning models applying methods for the uncertainty representation of renewable resources are surveyed.
In the generation capacity planning models described above, the uncertainty of the random parameters is well presented in the view of long-term planning, and optimal solutions are obtained within a reasonable amount of time; however, temporal variability is not fully captured because one random variable or scenario represents a long time period. Moreover, the chronology of the random parameters is ignored, which may affect the optimal solution of generation capacity to be built. Therefore, there have been discussions regarding the effect when the chronology and temporal details of the random parameters such as electric loads are included in the planning models, which can be found in [5][6][7][8][9][10][11][12].
In [5], operating costs for generation capacity building portfolios are evaluated with 1-h time steps, where a detailed operational model with unit commitment decision-making is included in the optimization problem. Scaled time-series data for electric demand and wind power are used. An energy planning tool is employed in [6] to examine the relation between a long-term planning model and the short-term dynamics of renewable resources. As a scenario, a typical load curve is selected from historical time-series data for each month. Four different generation capacity building plans are evaluated based on the given scenario of electric load.
In [7], scenario paths are formed for each season with different time intervals (1-, 2-, 6-, 12-, and 24-h intervals), and the aggregated scenario paths from the spring to the winter form yearly scenario paths.
A unit commitment decision is included in the operational problem with 1-h time resolution in [8], and a test system without transmission lines is applied to investigate the performance of the model.
For more precise representation of the operational model, unit commitment decision-making is included in a generation capacity expansion planning problem with hourly time steps [9], and typical scenarios with 1-h time steps of demand and renewable energy sources for four seasons are selected as representative scenarios.
In [10], linked simulations with pre-developed long-term and short-term models with 1-h time intervals are performed. With the planned (fixed) future capacity of renewable generation systems, scenarios for electricity demand growth and capacity factors for renewable generators are used based on assumed and historical data for the year 2010, respectively.
Experiments are performed with scenarios in [11] with 1- and 4-h time steps for a generation capacity investment problem without considering transmission capacity limits. The scenarios are generated via the aggregation method from the hourly historical data.
The models above consider a detailed temporal variability by applying short time intervals in long-term planning models; however, the uncertainty of renewable resources is not explicitly modeled in the mathematical problem.
It is challenging to maintain a high resolution of the planning horizon with short time intervals, due to increasing problem size. In particular, in the stochastic optimization context, difficulties in computation drastically increase in accordance with short time steps since the number of decision variables is increased by the number of scenarios and time steps. There may be a tradeoff between computational costs and the quality of approximated solutions, because solution quality becomes relatively poor if a small number of scenarios are applied.
In view of integrating solar power in the power systems, it is important to capture the unavailability of solar power during nighttime and rapid changes in availability around sunrise and sunset times. Therefore, any methods where a scenario at a particular time interval represents a long time period may not reflect the actual behavior of solar power accurately.
Several stochastic generation planning models containing a high-resolution operational model have been introduced. For example, in [12], a stochastic model is developed under a two-stage stochastic programming framework with hourly operational intervals; however, transmission capacity constraints are not considered, and the scenario paths for renewable resources such as wind and solar power availability are based simply on historical information. The obtained solutions therefore rely heavily on historical events rather than on future events that may occur with a certain level of probability.
In order to overcome the shortcomings of the generation planning models that have appeared in the literature, as stated above, this paper proposes a stochastic generation capacity planning model using hourly random scenarios and investigates the impacts of including the short-term variability by applying the model to the IEEE 300-bus system with transmission constraints.
The contributions of this paper are as follows:
(1) Impacts on a stochastic generation planning model with an hourly resolution of the operating problem are investigated.
(2) An optimization model is developed to find an optimal solution to the generation capacity planning problem with sample paths under a stochastic programming framework and implemented on a high-voltage 300-bus system with transmission constraints.
(3) An ARTA stochastic process is successfully implemented on the optimization model to generate hourly random paths, which enables Monte Carlo simulation.
The rest of this paper is organized as follows. Section 2 describes the method in which sample paths are generated, and the procedure for sample path generation is explained step by step. The mathematical formulation of a two-stage stochastic generation capacity planning model including the decision process is presented in Section 3. Simulation results applied to a modified IEEE 300-bus system are presented, and a comparison of the results from the models with and without consideration of the short-term variability is reported in Section 4. The paper concludes with some remarks in Section 6.

Scenario Generation Method
As a method to generate random sample paths representing the short-term variability of the uncertain parameters, the AutoRegressive-To-Anything (ARTA) process is applied [13,14]. The ARTA process is obtained by finding an AR($p$) base process, $\{Z_t\}$, from the known autocorrelations and the marginal distribution of the desired process $\{Y_t\}$, and transforming $\{Z_t\}$ into the process $\{Y_t\}$ through the inverse marginal distribution, $Y_t = F_Y^{-1}[\Phi(Z_t)]$, where $\Phi$ is the standard normal cumulative distribution function. The desired process needs to be stationary; however, the corresponding time-series data (solar DNI, wind power availability, and electric load) are not stationary in general, so a stationary desired process $\{X_t\}$ is defined by differencing, $X_t = Y_{t+1} - Y_t$, $t = 1, 2, 3, \ldots, T-1$. The base process, $\{Z_t\}$, is derived using the autocorrelations $\rho$ of the stationary process $\{X_t\}$ and the marginal cumulative distribution, $F_X$, of $\{X_t\}$.
The whole procedure for the ARTA process can be summarized in four steps: (1) finding the autocorrelation structure of the AR($p$) process $\{Z_t\}$ using the autocorrelations and the marginal distribution of $\{X_t\}$ (Section 2.2); (2) finding the AR parameters of $\{Z_t\}$ from the autocorrelations of $\{Z_t\}$ (Section 2.3); (3) forming $\{Z_t\}$ and transforming it into the desired process $\{X_t\}$; and (4) transforming the process $\{X_t\}$ back into the target process $\{Y_t\}$. These steps are illustrated in Figure 1. All calculations for generating the sample paths are done with MATLAB. Detailed steps to generate the ARTA process are described in the following subsections.

Historical Seasonal Data
The obtained historical time-series data include hourly potential wind power, solar irradiance, and electric load in Texas for a year from November 2009 to October 2010 [15]. In order to obtain sample paths that indicate different seasonal characteristics, data are classified into four seasons, and the sample paths are generated for each season. Marginal distributions and autocorrelation structures for those seasonal data are found from the divided datasets separately. For solar DNI, only daytime data are used, and sample paths for nighttime are assumed to be zero without generating samples. The daytime hours for solar DNI are assumed to be from 07:00 to 19:00 h for spring and fall, 07:00 to 20:00 h for summer, and 08:00 to 19:00 h for winter, respectively.

Autocorrelation Structure of $\{Z_t\}$
A marginal distribution and the autocorrelations of each process (solar DNI, wind power availability, and electric load) are derived from the classified data described in Section 2.1. Johnson's translation system is fit to the historical data to obtain the marginal distributions of the process [16], because it offers great flexibility for numerical analysis. Parameter estimation for the distributions is done with the statistical package R. The autocorrelation structure of the process $\{X_t\}$ is represented by $\rho = [\rho_1, \rho_2, \ldots, \rho_p]^{\top}$, where $\rho_h$ refers to the lag-$h$ autocorrelation, $\mathrm{Corr}(X_t, X_{t+h})$.
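The differencing and autocorrelation estimation described above can be illustrated with a short Python sketch. It is not the authors' R or MATLAB code; the function names and the `hourly_load` values are hypothetical, and the standard (biased) sample autocorrelation estimator is assumed.

```python
def difference(y):
    """Return the differenced series X_t = Y_{t+1} - Y_t."""
    return [b - a for a, b in zip(y, y[1:])]


def autocorrelation(x, lag):
    """Sample lag-h autocorrelation of x (standard biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    numer = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return numer / denom


def autocorrelation_structure(y, p):
    """Return [rho_1, ..., rho_p] for the differenced series of y."""
    x = difference(y)
    return [autocorrelation(x, h) for h in range(1, p + 1)]


# Hypothetical hourly load values, for illustration only.
hourly_load = [31.0, 30.2, 29.8, 30.1, 31.5, 33.9,
               36.4, 38.0, 39.1, 39.8, 40.2, 40.0]
rho = autocorrelation_structure(hourly_load, p=2)
```

In practice the estimate would be computed separately for each season and each random parameter, as the text describes.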
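The autocorrelation structure of the underlying standard normal process $\{Z_t\}$ can be represented by the $p \times 1$ vector $r = [r_1, r_2, \ldots, r_p]^{\top}$, and the autocorrelation $r_h$ can be found with

$$\mathrm{Corr}[X_t, X_{t+h}] = \frac{E[X_t X_{t+h}] - (E[X_t])^2}{\mathrm{Var}[X_t]} \tag{1}$$

where

$$E[X_t X_{t+h}] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F_X^{-1}[\Phi(z_t)] \, F_X^{-1}[\Phi(z_{t+h})] \, \varphi_{r_h}(z_t, z_{t+h}) \, dz_t \, dz_{t+h}. \tag{2}$$

In Equations (1) and (2), $E[\cdot]$, $\Phi$, and $\varphi_{r_h}$ represent expectation, the standard normal cumulative distribution function, and the standardized bivariate normal density function with correlation $r_h$, respectively. Since $E[X_t]$, $\mathrm{Var}[X_t]$, and $\mathrm{Corr}[X_t, X_{t+h}]$ are known in (1), the left-hand-side (LHS) value, $E[X_t X_{t+h}]$, in (2) is known. The unknown value $r_h$, which enters implicitly through $\varphi_{r_h}(z_t, z_{t+h})$, is adjusted until the LHS and right-hand-side (RHS) values are equal within a tolerance (in this simulation, the tolerance is $1 \times 10^{-6}$). Using this method, the autocorrelation structure $r = [r_1, r_2, \ldots, r_p]^{\top}$ can be determined. The autocorrelations of the desired process and the derived stationary base process for solar DNI, wind power availability, and electricity demand with different lags, $p = 1$ and $p = 2$, are listed in Tables 1 and 2.

A matrix composed of variances and covariances for the standardized normal stationary process $\{Z_t\}$ with lag $p$ can be denoted by

$$\Psi_Z = \begin{bmatrix} 1 & r_1 & \cdots & r_{p-1} \\ r_1 & 1 & \cdots & r_{p-2} \\ \vdots & \vdots & \ddots & \vdots \\ r_{p-1} & r_{p-2} & \cdots & 1 \end{bmatrix}.$$

With this matrix, the AR parameters $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_p]^{\top}$ follow from the Yule-Walker equations, $\alpha = \Psi_Z^{-1} r$. A random variable, $\varepsilon_t$, can be generated using the variance $\sigma^2$, which is denoted by

$$\sigma^2 = 1 - \alpha^{\top} r.$$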

Generating Scenarios Using ARTA Process
In Section 2.3, the AR parameters, $\alpha_1, \alpha_2, \ldots, \alpha_p$, and the variance $\sigma^2$ are determined. Using these parameters, a stationary AR($p$) base process, $\{Z_t\}$, $t = 1, 2, \ldots, T$, can be found as follows:

$$Z_t = \alpha_1 Z_{t-1} + \alpha_2 Z_{t-2} + \cdots + \alpha_p Z_{t-p} + \varepsilon_t,$$

where $\varepsilon_t$ is a random variable satisfying $\varepsilon_t \sim N(0, \sigma^2)$. Once $\{Z_t\}$ is formed, a stationary process, $\{X_t\}$, can be generated by

$$X_t = F_X^{-1}[\Phi(Z_t)],$$

and the final, target process $\{Y_t\}$ is generated. The procedure to generate random sample paths $\{\tilde{Y}_t\}$ is described below:

(1) Generate $p + 1$ initial values, $\hat{Y}_{t-1}, \ldots, \hat{Y}_{t-p}, \hat{Y}_{t-(p+1)}$, using the cumulative distribution functions of the historical initial value data, where $p$ represents the lag. Initial values for $X_t$, composed of the series of differences $\hat{X}_{t-2}, \hat{X}_{t-3}, \ldots, \hat{X}_{t-(p+1)}$, are then calculated. For example, if the lag $p$ is equal to 2, three initial values, $\hat{Y}_{t-1}$, $\hat{Y}_{t-2}$, and $\hat{Y}_{t-3}$, are generated, and the differences $\hat{X}_{t-2} = \hat{Y}_{t-1} - \hat{Y}_{t-2}$ and $\hat{X}_{t-3} = \hat{Y}_{t-2} - \hat{Y}_{t-3}$ are calculated.
[...] using the differences, $X_t$, which are derived in the previous step. If $\tilde{X}_t$ is out of the expected, realistic range at any $t$, go back to Step (1); otherwise, go to the next step.
(6) Transform $\tilde{X}_t$ back into the target sample path $\tilde{Y}_t$ with $\tilde{Y}_t = \tilde{X}_{t-1} + \hat{Y}_{t-1}$, where the initial value, $\hat{Y}_{t-1}$, is pre-determined in Step (1).
In order to generate N random sample paths for a year, the steps described need to be done for each season and are repeated N times. With the given procedure, N sample paths with AR order p = 2 for T = 24 are generated in this simulation. The generated 20 sample paths are illustrated in Figure 2. The solar DNI sample paths are generated only for daytime, as described in Section 2.1, and the values for nighttime are assumed to be zero.
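The generation procedure of this section can be sketched in Python for the AR(2) case used in the simulation. This is a simplified illustration under stated assumptions rather than the authors' MATLAB code. A logistic quantile function is a hypothetical stand-in for the fitted Johnson marginal $F_X^{-1}$. The base autocorrelations (0.8, 0.5), the increment bounds, and the starting value are made-up numbers. The initial value is passed in directly instead of being drawn from a historical distribution, and the AR(2) process is started from its stationary distribution.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def yule_walker_ar2(r1, r2):
    """AR(2) parameters and innovation variance from base autocorrelations."""
    a1 = r1 * (1.0 - r2) / (1.0 - r1 ** 2)
    a2 = (r2 - r1 ** 2) / (1.0 - r1 ** 2)
    sigma2 = 1.0 - a1 * r1 - a2 * r2
    return a1, a2, sigma2


def inv_marginal(u, loc=0.0, scale=0.5):
    """Hypothetical stand-in for F_X^{-1}: logistic quantile function."""
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return loc + scale * math.log(u / (1.0 - u))


def base_process(r1, r2, length, rng):
    """Stationary standard normal AR(2) path {Z_t} of the given length."""
    a1, a2, sigma2 = yule_walker_ar2(r1, r2)
    z0 = rng.gauss(0.0, 1.0)
    z1 = r1 * z0 + math.sqrt(1.0 - r1 ** 2) * rng.gauss(0.0, 1.0)
    z = [z0, z1]
    while len(z) < length:
        z.append(a1 * z[-1] + a2 * z[-2] + rng.gauss(0.0, math.sqrt(sigma2)))
    return z[:length]


def arta_path(y0, r1, r2, hours, x_bounds, rng, max_tries=1000):
    """One sample path of Y with `hours` points, rejecting unrealistic steps."""
    for _ in range(max_tries):
        z = base_process(r1, r2, hours - 1, rng)
        x = [inv_marginal(PHI(v)) for v in z]       # X_t = F_X^{-1}[Phi(Z_t)]
        if not all(x_bounds[0] <= v <= x_bounds[1] for v in x):
            continue                                # regenerate the path
        y = [y0]
        for step in x:                              # Y_{t+1} = Y_t + X_t
            y.append(y[-1] + step)
        return y
    raise RuntimeError("no path satisfied the increment bounds")


rng = random.Random(2024)
paths = [arta_path(30.0, 0.8, 0.5, hours=24, x_bounds=(-3.0, 3.0), rng=rng)
         for _ in range(20)]
```

Repeating the call for each season and concatenating the four 24-h paths would give one 96-h scenario of the kind used in the optimization model.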

Optimization Problem
In this section, a mathematical optimization model that seeks the optimal capacity of generators minimizing the building and operating costs is provided. When an optimization problem considering uncertainties is formulated, other methodologies shown in the literature, such as robust optimization [17], can also be applied. In this paper, two-stage stochastic programming is selected, and the uncertainties of solar DNI, wind power availability, and uncontrollable load are consequently represented by probability distributions.

Decision Process
Within a two-stage stochastic framework [18,19], scenarios for solar DNI, wind-power availability, and electric load are represented by a scenario fan, where a sample path for 96 h representing a year (24 h for spring, summer, fall, and winter, each) is generated in the second stage as a scenario, which is shown in Figure 3.
The objective function in (7) includes approximated cost terms for generation, reserve procurement, CO2 emissions, and penalty costs of unserved electricity demands with respect to scenarios, where the penalty costs for unserved electricity demands and un-procured reserve are USD 30,000/MWh and USD 1000/MW, respectively, in this formulation. A carbon tax rate is assumed to be USD 10 per metric ton of CO2 as a fixed amount for the whole simulation.
Building decision variables for generators ($x_n$) are represented by binary variables in (8), where 1 and 0 indicate building and not building a generator, respectively. At every electric bus, the power balance, which requires that the injection and withdrawal of power at the bus are always equal, is maintained by constraint (9), where the continuous decision variable ($V^{\omega}_{t}$) represents load shedding (unserved electric demand) and also prevents violation of the constraint. In the power balance constraint, a part of the electric load can be represented by controllable load, and electric vehicles can be modeled as a part of the load as well (e.g., [20]). In this formulation, only load shedding is reflected, and demand-side load control is not considered. Reserve capacity ($R^{\omega}_{tc}$) is assumed to be 15% of generating power, and it is procured only from dispatchable, conventional generators in constraint (10). The reserve procurement costs are assumed to be 10% of the generation costs of conventional generators. This type of margin is different from the operating reserve, which covers short-term system changes; rather, it is a planning reserve margin that ensures a reliable amount of generation capacity in the long run. The deliverability of the reserve is not considered in this formulation, and only the required capacity is procured.
Constraints (11) and (12) restrict the capacity of existing and newly added conventional generators, respectively. With (12), the values representing the capacity of candidate generators ($P^{\max}$) are set to zero when the generators are not built; otherwise, the capacity values are activated as pre-fixed, non-zero values. Wind power generation ($P^{\omega}_{tw}$ and $P^{\omega}_{tk}$ for existing and candidate farms, respectively) is determined by the realized wind power availability ($P^{\max}_{w}(\xi^{\omega}_{wt})$), where $\xi^{\omega}_{wt}$ are random sample paths generated by the method introduced in Section 2, in (13) and (14) for existing and newly built wind farms, respectively. As the uncertainty of solar DNI is realized and sample paths are generated, available solar power is estimated as in Section 4.1. For existing and candidate solar PV farms, solar power generation ($P^{\omega}_{ts}$ for existing and $P^{\omega}_{tm}$ for candidate farms) is restricted by constraints (15) and (16).
In (17), active power transmitted via transmission lines ($f^{\omega}_{tl}$) is restricted by the product of the voltage angle difference across the line and the inverse of the line reactance, where the voltage angle differences are represented by the bus voltage angles ($\theta^{\omega}_{ti}$) and the branch-node incidence matrix, $S_{li}$, which indicates the linkage between electric nodes (buses) and transmission lines.
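The relation between line flows, bus voltage angles, and the branch-node incidence matrix can be illustrated with a small Python sketch. It assumes the usual DC power flow equality, $f_l = (1/x_l) \sum_i S_{li} \theta_i$, with $S_{li} = +1$ at the sending bus and $-1$ at the receiving bus; this sign convention, the three-bus network, and all numerical values are hypothetical and are not taken from the test system.

```python
def dc_line_flows(incidence, reactance, theta):
    """Line flows f_l = (1 / x_l) * sum_i S_li * theta_i (per unit)."""
    flows = []
    for row, x in zip(incidence, reactance):
        angle_diff = sum(s * th for s, th in zip(row, theta))
        flows.append(angle_diff / x)
    return flows


def bus_injections(incidence, flows):
    """Net power leaving each bus through its lines: sum_l S_li * f_l."""
    n_bus = len(incidence[0])
    return [sum(row[i] * f for row, f in zip(incidence, flows))
            for i in range(n_bus)]


# Hypothetical 3-bus, 3-line network (rows are lines, columns are buses).
S = [[1, -1, 0],
     [0, 1, -1],
     [1, 0, -1]]
x = [0.10, 0.20, 0.25]        # line reactances (p.u.)
theta = [0.0, -0.02, -0.05]   # bus voltage angles (rad)

flows = dc_line_flows(S, x, theta)
injections = bus_injections(S, flows)
```

In the optimization model the angles are decision variables, so the same relation appears as a linear constraint per line, scenario, and hour rather than as a direct calculation.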
Constraint (18) avoids overloads on transmission lines by restricting power flows in accordance with the physical thermal limits of the lines. The voltage angle difference between two buses connected by a transmission line is kept between $-\pi/2$ and $\pi/2$ by (19), and the voltage angle at each bus is restricted to the range from $-\pi$ to $\pi$ in (20). All decision variables for generation ($P^{\omega}_{tg}$) need to be non-negative in (21). The notation used in the mathematical formulation in Section 3 is provided in the Nomenclature.
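The two-stage structure described in this section can be illustrated with a toy Python example. It is a deliberately simplified sketch, not the GAMS model: the network, reserve, and carbon tax terms are omitted, the second stage is a merit-order dispatch, scenarios are equally weighted, and the first-stage binary decisions are enumerated exhaustively. The candidate units and load paths are hypothetical; only the USD 30,000/MWh penalty for unserved demand is taken from the text.

```python
from itertools import product

SHED_PENALTY = 30000.0  # USD/MWh for unserved demand, as quoted in the text


def operating_cost(built_units, load_path):
    """Second stage: merit-order dispatch cost over one scenario path."""
    units = sorted(built_units, key=lambda u: u["op_cost"])
    total = 0.0
    for load in load_path:
        remaining = load
        for unit in units:
            output = min(unit["capacity"], remaining)
            total += output * unit["op_cost"]
            remaining -= output
        total += remaining * SHED_PENALTY  # load shedding covers any gap
    return total


def best_plan(candidates, scenarios):
    """First stage: enumerate binary build decisions, minimize expected cost."""
    best = None
    for decision in product((0, 1), repeat=len(candidates)):
        built = [u for u, x in zip(candidates, decision) if x == 1]
        build_cost = sum(u["build_cost"] for u in built)
        expected_op = sum(operating_cost(built, s)
                          for s in scenarios) / len(scenarios)
        total = build_cost + expected_op
        if best is None or total < best[1]:
            best = (decision, total)
    return best


# Hypothetical candidates (MW, USD, USD/MWh) and two 2-h load scenarios (MW).
candidates = [
    {"capacity": 100.0, "build_cost": 50000.0, "op_cost": 20.0},
    {"capacity": 50.0, "build_cost": 10000.0, "op_cost": 60.0},
]
scenarios = [[80.0, 120.0], [60.0, 90.0]]

decision, expected_total = best_plan(candidates, scenarios)
```

Exhaustive enumeration only works for a handful of candidates; a mixed integer linear program is needed at the scale of the case study, where the number of second-stage variables grows with both the number of scenarios and the number of hours.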

Case Study: IEEE 300-Bus System
In order to verify that the mathematical model performs as planned, a case study is performed with a modified IEEE 300-bus system (see [21,22]) including 411 transmission lines, where the grid is a high-voltage transmission system including several low-voltage load buses. Conventional generators (excluding solar and wind generators) in the system are coal, nuclear, conventional combustion turbine (Conv. CT), advanced combustion turbine (Adv. CT), and combined cycle gas turbine (CCGT) generators. The building and operating costs by generation technology are listed in Table 3 [23,24], where the building costs represent the annualized costs, which are assumed to be 15% of overnight building costs. The number of units for solar and wind generators implicitly indicates the number of farms. Peak and average system-wide electric loads are assumed to be 45% of the ERCOT (Electric Reliability Council of Texas) load in 2010 [25]. Although the effects of electric vehicles and controllable load are not considered in the network, they can be modeled and applied as needed. For the carbon tax, CO2 emissions need to be calculated; therefore, emissions rates for different generator types are estimated based on [24]. Those emissions rates per MWh are listed in Table 4.

The available capacity of solar PV systems, $P^{\max}_{s}(\xi^{\omega}_{st})$, is calculated from the realizations of solar DNI ($\xi^{\omega}_{st}$), the installation area $A$, and the efficiency $\eta$ [3,26]. The area for solar PV farm installation, $A$, is assumed to be $1.3935 \times 10^{6}$ m$^2$ for 150 MW, which is the unit capacity of the solar farm. The efficiency ($\eta$) is estimated based on [26] using the ambient temperature, $T_a$, which is obtained by averaging the historical temperature data for the corresponding season and time for the McCamey area in Texas [27]. The efficiency values for PV modules ($\eta_0$) and for DC/AC conversion ($\eta_{inv}$) are assumed to be 15% and 85%, respectively.
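As a rough illustration of how a DNI realization maps to available PV output, the following Python sketch uses a first-order approximation that is an assumption of this example: available power is taken as the product of efficiency, area, and irradiance, with the efficiency fixed at $\eta_0 \eta_{inv}$. The temperature-dependent correction from [26] is not reproduced here, DNI is assumed to be given in W/m$^2$, and the function name is hypothetical.

```python
AREA_M2 = 1.3935e6     # installation area per solar farm, from the text
ETA_MODULE = 0.15      # PV module efficiency, from the text
ETA_INVERTER = 0.85    # DC/AC conversion efficiency, from the text


def pv_available_mw(dni_w_per_m2, area_m2=AREA_M2,
                    eta=ETA_MODULE * ETA_INVERTER):
    """Approximate available AC power (MW), ignoring temperature effects."""
    return eta * area_m2 * dni_w_per_m2 / 1e6


# Hypothetical DNI values for a few daytime hours (W/m^2).
dni_path = [0.0, 250.0, 600.0, 850.0]
available = [pv_available_mw(v) for v in dni_path]
```

With a temperature-dependent efficiency, the available output at high irradiance would generally be lower than this approximation suggests.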

Scenario Generation Using Gaussian Copula
For comparison purposes, scenarios representing a year are generated using Gaussian copula (GC) [28], which is a method to generate random samples using correlations and marginal distributions of random variables. Samples for each season, and day- and nighttime, are generated, and those samples form a scenario representing eight time slices, spring-daytime (sp-da in Figure 4), spring-nighttime, summer-daytime, up to winter-nighttime.
The 20 random samples by GC are illustrated in Figure 4. GC samples only indicate day- and nighttime variability and uncertainty for each season, while ARTA sample paths represent the hourly variability and chronology of stochastic parameters as well as uncertainty in the model. The differences between the scenarios from the two methodologies can be seen by comparing the two plots.
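Gaussian copula sampling for a pair of correlated parameters can be sketched in Python as follows. This is a two-variable illustration under assumptions, not the procedure of [28] or [3] in detail: the uniform marginals for wind availability and load, the correlation of 0.8, and the function names are all hypothetical, and the actual study also includes solar DNI.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def gaussian_copula_pairs(rho, inv_cdf_a, inv_cdf_b, n, rng):
    """Draw n pairs whose dependence follows a Gaussian copula with rho."""
    s = math.sqrt(1.0 - rho ** 2)
    samples = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + s * rng.gauss(0.0, 1.0)  # Corr(z1, z2) = rho
        # Map each normal score to a uniform, then to its own marginal.
        samples.append((inv_cdf_a(PHI(z1)), inv_cdf_b(PHI(z2))))
    return samples


def wind_quantile(u):
    """Hypothetical marginal: availability fraction, uniform on [0, 1]."""
    return u


def load_quantile(u):
    """Hypothetical marginal: load in GW, uniform on [30, 50]."""
    return 30.0 + 20.0 * u


# 20 samples for one time slice, e.g., spring daytime.
slice_samples = gaussian_copula_pairs(0.8, wind_quantile, load_quantile,
                                      n=20, rng=random.Random(7))
```

Each sample here represents a whole time slice, so no ordering in time exists within a slice, whereas an ARTA path carries hour-to-hour correlation.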

Optimal Building Capacity and Costs
The presented mathematical formulation was coded in the General Algebraic Modeling System (GAMS) and applied to a modified IEEE 300-bus system. CPLEX was used as the optimization solver for the mixed integer linear program [29]. In Table 5, the obtained optimal building decisions are listed to compare two cases: one considering the seasonal, day- and nighttime variability and one considering the seasonal hourly variability. These two cases employ scenarios generated from GC and ARTA, respectively, which are illustrated in Figures 4 and 2.
Under the carbon tax, CCGT generators and wind and solar farms tend to be built more than other conventional generators in terms of capacity. For those generation technologies, CO2 emissions rates are relatively low, so only a small cost increment due to the carbon tax is expected when generating electric power. Consequently, those generators are preferred to be built and operated. By the same rationale, coal generators are not built at all.
With 20 GC samples, building capacity for conventional CT and advanced CT is decreased by 337 MW compared to the 10 GC sample case, while wind-power capacity is increased by 200 MW. In terms of costs, the building cost increased by about 15.7 million dollars per year relative to the 10 GC sample case, due to more investment in wind capacity and less investment in CTs, which implies that investments in wind generating units are more attractive due to lower carbon taxes and lower operating costs under the current simulation setup.
In the case with the 30 samples by GC, the built capacity for solar farms is the same as capacity in the case with 10 GC samples, while with the 30 sample paths by ARTA, the built capacity of solar PV systems decreases by 150 MW compared to the result with 10 sample paths.

Impacts of Hourly Variability
From the results in Table 5, the impacts on optimal decisions with and without hourly variability in the stochastic generation planning model can be compared. A noticeable change in the built capacity is observed for solar PV systems with the ARTA sample paths. The built capacity of solar PV systems with ARTA sample paths indicates a smaller value than the case using GC samples that do not consider hourly variability. The simulation model with representation of the hourly variability captures the short-term changes in solar DNI during daytime, which may diminish the economic value of solar PV systems.
Solar PV systems tend to be less economical in cases where the short-term variability is incorporated in the given stochastic generation planning model.

Computation Time
The average computation times are compared in Table 6, which reports the elapsed times in GAMS with different numbers of samples/paths. A computing machine equipped with an Intel Xeon W-2135 CPU and 120 GB RAM is used for solving the presented optimization problem. Table 6 shows that the incremental increase in computation time is much higher with the random sample paths from the ARTA process as the number of scenarios increases. Comparing the ARTA cases with 10, 20, and 30 scenarios, the computation time becomes 7.67 times larger than in the 10 ARTA sample path case when the number of samples is doubled from 10 to 20. In the case with 30 ARTA scenarios, the computation time becomes 21.24 times larger than in the 10 ARTA scenario case.
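One way to read the two reported figures is as points on a power law in the number of sample paths. This reading is an interpretation added here for illustration and is not made in the study; it treats 7.67 and 21.24 as ratios of solution time relative to the 10-path case and assumes time grows as (number of paths) raised to some exponent.

```python
import math

# Ratios of computation time relative to the 10-path ARTA case, from the text.
ratios = {20: 7.67, 30: 21.24}


def scaling_exponent(n_paths, ratio, base_paths=10):
    """Exponent k such that ratio = (n_paths / base_paths) ** k."""
    return math.log(ratio) / math.log(n_paths / base_paths)


exponents = {n: scaling_exponent(n, r) for n, r in ratios.items()}
```

Both exponents come out a little below 3, so under this reading the solution time grows roughly with the cube of the number of ARTA paths over the tested range; two data points are too few to support extrapolation beyond it.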
Simulations with the ARTA sample paths require much more computational time compared to the model using GC samples.

Conclusions
In this paper, a stochastic generation capacity planning model with hourly random sample paths reflecting the uncertainty and hourly variability of renewable resources and electric load is developed, and a case study with the IEEE 300-bus system is conducted. The obtained optimal solutions by the presented model are compared to the solutions from the stochastic model considering uncertainty and seasonal variability.
From the results, it can be concluded that accommodating the hourly variability (short-term variability) in a stochastic generation capacity planning model may lower the economic value of solar PV systems and may lead to a decrease in solar PV system installation under the given conditions. In other words, long-term planning models without consideration of hourly variability may overestimate the optimal capacity of solar PV systems.

Discussions
Long-term planning models using stochastic programming generally do not consider the short-term variability and chronology of renewable resources due to high computational efforts. Therefore, the effect of incorporating the short-term variability as well as the uncertainty has not been properly addressed, and expected changes in the economic value of renewable resources due to the short-term variability have not been examined.
It should be noted that, for accommodating utility-scale solar PV systems, which are unavailable during nighttime and experience rapid changes in generation around sunrise and sunset, the inclusion of chronology and a high temporal resolution of the planning horizon affects the optimal generation capacity solutions of long-term planning models. There is a tradeoff between the level of operational detail in the model and the computational burden in stochastic generation planning models. In many cases, very detailed operational information introduced in the model may make the problem intractable. It is still challenging to obtain optimal solutions to the stochastic generation capacity expansion planning problem considering the short-term variability, and few studies introduce stochastic models incorporating the short-term variability of renewable resources due to excessive computational costs.
A limitation of the model is indeed a high computational cost. Solution times with different numbers of samples are compared in this paper, and from the comparison result we can see that incorporating the short-term variability with high resolution of the operational model dramatically increases the computational time; therefore, an optimal level of operational details needs to be found to efficiently estimate the value of uncertain, intermittent renewable resources. The high computational costs and accuracy of the operational model are tradeoffs in stochastic capacity planning models.

Acknowledgments: The author also would like to thank Ross Baldick of the University of Texas and David P. Morton of Northwestern University for comments on this paper.

Conflicts of Interest:
The author declares no conflict of interest.

Nomenclature
Sets/Indices
$\omega \in \Omega$: Scenarios, sample paths
$d \in D$: Electric demand (electric load)
$i \in I$: Electric buses
$l \in L$: Transmission lines
$t \in T$: Time interval (1 h)
$g \in G$: All existing and candidate generators
$c \in G^{c}$: Existing conventional generators, $G^{c} \subset G$
$r \in G^{f}$: Existing fossil-fueled generators, $G^{f} \subset G$
$w \in G^{w}$: Existing wind generators, $G^{w} \subset G$
$s \in G^{s}$: Existing solar PV generators, $G^{s} \subset G$
$n \in G_{n}$: Candidate generators to be built
$j \in G^{c}_{n}$: Candidate conventional generators, $G^{c}_{n} \subset G_{n}$
$q \in G^{f}_{n}$: Candidate fossil-fueled generators, $G^{f}_{n} \subset G_{n}$
$k \in G^{w}_{n}$: Candidate wind generators, $G^{w}_{n} \subset G_{n}$
$m \in G^{s}_{n}$: Candidate solar PV generators, $G^{s}_{n} \subset G_{n}$

Data/Parameters
$e_{r}, e_{q}$: CO2 emissions rate of generator $r$ and $q$ (metric ton/MWh)
$o^{r}_{c}, o^{r}_{j}$: Cost for procuring reserve from generator $c$ and $j$ ($/MWh)
$o_{g}$: Operating cost for generator $g$ ($/MWh)
$RS$: Fixed reserve requirements ($\times 10^{2}$ %)
$S_{di}$: Demand-node incidence matrix
$S_{gi}$: Generator-node incidence matrix
$S_{li}$: Branch-node incidence matrix
$P^{\max}_{s}(\xi^{\omega}_{st}), P^{\max}_{m}(\xi^{\omega}_{st})$: Availability of solar PV generator $s$ and candidate solar PV generator $m$ for scenario $\omega$ (MW)
$P^{\max}_{w}(\xi^{\omega}_{wt}), P^{\max}_{k}(\xi^{\omega}_{wt})$: Availability of wind generator $w$ and candidate wind generator $k$ for scenario $\omega$ (MW)

Binary Decision Variables
$x_{n}, x_{j}, x_{k}$: Decision to build candidate generator $n$, $j$, $k$

Continuous Decision Variables
$P^{\omega}_{tg}, P^{\omega}_{tc}, P^{\omega}_{tr}, P^{\omega}_{tw}, P^{\omega}_{ts}$: Generation of generator $g$, $c$, $r$, $w$, and $s$ (MW)
$R^{\omega}_{tc}, R^{\omega}_{tj}$: Reserve procured from generator $c$ and $j$ (MW)
$P^{\omega}_{tn}, P^{\omega}_{tj}, P^{\omega}_{tq}, P^{\omega}_{tk}, P^{\omega}_{tm}$: Generation of candidate generator $n$, $j$, $q$, $k$, and $m$ (MW)