## 1. Introduction

Over time, humankind has developed several electricity generation technologies based on different primary sources such as wind, biomass, gas, coal, and nuclear. Each technology has its own costs, sustainability and security-of-supply characteristics, efficiency, and environmental concerns. According to the United States Environmental Protection Agency, primary energy sources are grouped into conventional power, such as oil, natural gas, coal and nuclear; renewable energy, such as large hydropower and municipal solid waste; and green power, such as wind, solar, biomass, geothermal, biogas and low-impact hydropower. In particular, low-impact hydropower is the use of hydroelectric power on a scale suitable for a local community or industry, or as a contribution to distributed generation in a regional electricity grid, with a lower negative environmental impact than large hydropower. In terms of environmental impact, the conventional power sources are the least beneficial and the green power sources are the most beneficial.

The worldwide demand for energy has been increasing over the last decades and it will continue to grow [1]. Consequently, for both countries and companies, the long-term planning of the electricity generation infrastructure is of utmost importance and should be among the central objectives of any energy policy. An optimally designed electricity generation infrastructure tends towards a more balanced portfolio allocation among the available technologies. It is also important for the planning process to distinguish the existing electricity-producing plants, which carry maintenance costs, from the plants still to be built. Economically, drastic changes to the current electricity investment allocations are not feasible. In this paper, our model distinguishes the costs of existing plants from the costs of prospective plants.

The United States Energy Information Administration keeps not only the history of the average annual maintenance, operational, and fuel costs for existing power plants by energy source or major fuel type, but also projections of electricity generation costs [2]. Even so, these costs carry significant uncertainty. For instance, future controls on CO$_{2}$ emissions and the corresponding mechanisms will surely affect electricity generation costs. In particular, the future price of an emitted ton of CO$_{2}$ is uncertain and this uncertainty should be considered in the planning process. Consequently, electricity generation policies that rely solely on the evolution of historical average costs of electricity generation technologies are unsatisfactory. Careful consideration of the uncertainties associated with the current and prospective costs of such technologies is fundamental for planning purposes.

In the literature, with the costs treated as random variables, the allocation of resources among the available electricity generation technologies has been solved as a mean-variance optimization problem using the expected values and the covariance matrix of the technology costs per megawatt hour (see, for instance, [3,4,5,6]). The mean-variance optimization, introduced by Markowitz [7], was the first mathematical formalization of investment diversification and it is part of modern portfolio theory (MPT). The mean-variance optimized portfolios form the so-called efficient frontier, a set of portfolios that dominate all other feasible portfolios in terms of their mean and variance tradeoff. In the MPT, however, the random variables of interest are the returns of the risky assets instead of the costs of the technologies.

In practice, the expected values and the covariance matrix of the electricity generation technology costs for a future time horizon are not exactly known. Using only historical data to estimate the expected values and the covariance matrix is a naive approach because the past will not necessarily repeat itself in the future. The usefulness of the allocations obtained from the mean-variance optimization depends on the precision of these parameters. For instance, in the MPT context, it was shown in [8] that minor changes in the expected values of returns can produce major changes in asset allocation decisions. Consequently, several robust versions of the mean-variance optimization were proposed in the MPT literature to account for uncertainties in the expected returns and the covariance matrix (see, for instance, [9,10,11]).

There is much published research on uncertainty analysis using Bayesian methods for the energy industry. For instance, applications of Bayesian networks in the renewable energy area to deal with storage, smart grids and assessment are numerous (for a complete survey, see [12]). A Bayesian network is a technique used to deal with problems involving uncertainty [13,14]. The related literature is diverse, ranging from a building occupant representation model for energy efficiency using Bayesian networks [15] to a Bayesian framework for power network planning using statistical emulators [16]. A model, together with the computer program used to implement it, is referred to as a simulator, and an emulator is a statistical approximation of a simulator [17,18]. Basically, the uncertainty in the inputs of the models is represented as a probability distribution in a Bayesian framework.

In particular, in the electricity planning context using MPT, a robust portfolio optimization approach to deal with uncertainties in the input parameters was presented in [19]. The uncertainty in the robust portfolio optimization approach is represented by an uncertainty set for the input parameters. In [19], the uncertainty sets considered were the box, the ellipsoidal, the lower and upper bounds, and the convex polytopic sets. However, the energy planning process is very complex and involves other concerns such as sustainability, resiliency, availability, reliability, efficiency, safety and security of the generation technologies. Such concerns bring not only additional uncertainty in the costs of these technologies but also beliefs that come from technology specialists. Indeed, specialists in the electricity generation technologies of interest usually participate in electricity planning processes.

A natural way of conducting a comprehensive planning process is to take into account the available data together with the prior experience of the participating specialists. Bayesian approaches treat the probability distributions themselves as uncertain and subject to updates as new information arrives. Consequently, the Bayesian approach has been successfully applied in the MPT context to take into account not only the beliefs of the investors but also the uncertainties in the expected returns and the corresponding covariance matrix (see, for instance, [8,20,21]). The Bayesian mean-variance portfolio optimizations consider both the estimation uncertainty and the specialist prior information. In short, the prior probability represents the beliefs of the investment specialists, the probability update represents the incorporation of the available data into the model, and the predictive probability represents the beliefs of the specialists updated with the available data.

In the literature, there are different Bayesian approaches to deal with parameter uncertainty in the context of MPT (for instance, see [22,23,24,25,26,27,28,29]). Historically, the initial applications of Bayesian approaches in the 1970s were based on improper or data-based priors [30]. Bayesian approaches based on improper priors usually give results comparable to the classical methods, and differences arise when some risky assets have longer histories than others [31]. Then, in an attempt to incorporate prior information into the asset allocation model, the Black-Litterman model was introduced, using a Bayesian approach to include investors' views and equilibrium relations in the portfolio allocation [8]. The main difficulty in applying the Black-Litterman model in practice is that it requires the investors' views as inputs and, usually, they are not publicly available. Other studies center prior beliefs around values implied by asset pricing theories [32,33] or use investment objectives to obtain useful priors [34].

In this paper, our objective is to introduce the classical-equivalent Bayesian portfolio optimization to electricity generation planning. The main contribution of our Bayesian approach is the possibility of taking into account both the estimation uncertainty and the specialists' information at the same time in the energy planning process. In the next section, we give a brief review of the classical mean-variance optimization with the basic notation and fundamental concepts. Then, we present the classical-equivalent Bayesian approach using both improper and proper priors. In addition, for illustration purposes, we compare the classical-equivalent Bayesian optimal portfolios with the classical mean-variance optimal portfolios using the same data as [19,35]. Finally, we close the paper with some comments on the proposed approach and suggestions for future research.

## 2. Classical or Naive Mean-Variance Approach

Traditionally, the classical or naive mean-variance optimization assumes that the expected cost and the risk, the latter measured as the portfolio volatility, are known when making portfolio allocation decisions. A rational planner would therefore prefer the portfolio with the lowest expected cost for a given level of risk or, alternatively, the portfolio with the lowest risk for a given expected cost. The set of such optimal portfolios is called the efficient frontier. No rational planner would select a portfolio lying above the efficient frontier, since that would mean accepting a higher expected cost for the same amount of risk as an efficient portfolio. Similarly, it would mean accepting greater risk for the same expected cost as an efficient portfolio.
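The dominance idea behind the efficient frontier can be illustrated with a small filter over candidate portfolios. This is only a sketch: the function name `efficient_portfolios` and the (risk, expected cost) pairs are hypothetical, and a real frontier comes from solving the optimization problems below rather than from filtering a finite list.

```python
def efficient_portfolios(portfolios):
    """Return the (risk, expected_cost) pairs that no other pair dominates.

    A pair dominates another when it has risk and cost that are both no
    larger, and at least one of them is strictly smaller.
    """
    efficient = []
    for i, (risk_i, cost_i) in enumerate(portfolios):
        dominated = any(
            risk_j <= risk_i
            and cost_j <= cost_i
            and (risk_j < risk_i or cost_j < cost_i)
            for j, (risk_j, cost_j) in enumerate(portfolios)
            if j != i
        )
        if not dominated:
            efficient.append((risk_i, cost_i))
    return sorted(efficient)


# Hypothetical candidates: (risk in USD/MWh, expected cost in USD/MWh).
candidates = [(2.0, 60.0), (3.0, 55.0), (3.0, 58.0), (5.0, 50.0), (6.0, 52.0)]
frontier = efficient_portfolios(candidates)
```

In this toy list, the portfolios that cost more at the same or higher risk are discarded, which is the cost-side analogue of the usual return-side dominance in MPT.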

Following [19,35], it is important to distinguish in the planning process an existing electricity-producing plant using technology $i$, with random cost ${C}_{i}^{e}$ in USD/MWh, from a prospective new plant using the same technology $i$, with random cost ${C}_{i}^{p}$ in USD/MWh. In practice, substantial changes to the current electricity investment allocations are not feasible and the maintenance costs of existing plants differ from the implementation costs of new plants. When there are $N$ different technologies, the random vectors of costs for existing plants and for prospective new plants are given by

$${\mathit{C}}^{e}={({C}_{i}^{e})}_{N\times 1}\quad \text{and}\quad {\mathit{C}}^{p}={({C}_{i}^{p})}_{N\times 1},$$

respectively. It is also usual to assume that the random costs are multivariate normal

$${\mathit{C}}^{e}\sim \mathcal{N}\left({\mathit{\mu}}^{e},{\mathbf{\Sigma}}^{e}\right)\quad \text{and}\quad {\mathit{C}}^{p}\sim \mathcal{N}\left({\mathit{\mu}}^{p},{\mathbf{\Sigma}}^{p}\right),$$

where ${\mathit{\mu}}^{e}={({\mu}_{i}^{e})}_{N\times 1}$ and ${\mathit{\mu}}^{p}={({\mu}_{i}^{p})}_{N\times 1}$ are mean vectors and ${\mathbf{\Sigma}}^{e}$ and ${\mathbf{\Sigma}}^{p}$ are $N\times N$ covariance matrices. The means ${\mu}_{i}^{e}$ and ${\mu}_{i}^{p}$ are different because maintenance costs differ from the costs of building a new plant. Additionally, the risk or standard deviation of maintenance, ${\sigma}_{i}^{e}$, also differs from the risk or standard deviation of building a new plant, ${\sigma}_{i}^{p}$. However, since the technology is the same, the correlation between ${C}_{i}^{e}$ and ${C}_{i}^{p}$ is ${\rho}_{{C}_{i}^{e},{C}_{i}^{p}}=1$. Thus, we can write almost surely (with probability 1) that (see Proposition 1.1.2 from [36])

$$\frac{{C}_{i}^{p}-{\mu}_{i}^{p}}{{\sigma}_{i}^{p}}=\frac{{C}_{i}^{e}-{\mu}_{i}^{e}}{{\sigma}_{i}^{e}},\quad i=1,2,\dots ,N. \tag{3}$$
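A correlation equal to one means that, almost surely, the prospective cost is an increasing affine function of the existing cost. The sketch below checks this numerically for a single technology; the means, standard deviations, and variable names are hypothetical values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters for one technology, in USD/MWh.
mu_e, sigma_e = 30.0, 4.0   # existing plant
mu_p, sigma_p = 55.0, 9.0   # prospective plant

# One common source of uncertainty drives both costs.
z = rng.standard_normal(10_000)
c_e = mu_e + sigma_e * z

# Unit correlation: the standardized costs coincide, so c_p is affine in c_e.
c_p = mu_p + sigma_p * (c_e - mu_e) / sigma_e

rho = np.corrcoef(c_e, c_p)[0, 1]
```

Here `rho` equals one up to floating-point error, while the two costs keep their own means and standard deviations.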

Essentially, Equation (3) says that the source of uncertainty for both ${C}_{i}^{e}$ and ${C}_{i}^{p}$ is the same. Additionally, ${\mathbf{\Sigma}}^{e}=\mathrm{diag}\left({\mathit{\sigma}}^{e}\right)\mathit{R}\mathrm{diag}\left({\mathit{\sigma}}^{e}\right)$ and ${\mathbf{\Sigma}}^{p}=\mathrm{diag}\left({\mathit{\sigma}}^{p}\right)\mathit{R}\mathrm{diag}\left({\mathit{\sigma}}^{p}\right)$, where the correlation matrix $\mathit{R}$ is the same for both the existing and the prospective costs and ${\mathit{\sigma}}^{e}={\left({\sigma}_{i}^{e}\right)}_{N\times 1}$ and ${\mathit{\sigma}}^{p}={\left({\sigma}_{i}^{p}\right)}_{N\times 1}$ are standard deviation vectors.
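The covariance matrices above can be assembled directly from the standard deviation vectors and the shared correlation matrix. The following is a minimal sketch with hypothetical numbers for three technologies. The helper `covariance_from_correlation` and the cross block `Sigma_ep` are illustrative assumptions: the cross block is what the shared correlation matrix and the unit correlation between existing and prospective costs of the same technology imply.

```python
import numpy as np


def covariance_from_correlation(sigma_row, corr, sigma_col=None):
    """Compute diag(sigma_row) @ corr @ diag(sigma_col)."""
    sigma_col = sigma_row if sigma_col is None else sigma_col
    return np.diag(sigma_row) @ corr @ np.diag(sigma_col)


# Hypothetical standard deviations (USD/MWh) and correlation matrix.
sigma_e = np.array([4.0, 2.0, 1.0])
sigma_p = np.array([9.0, 5.0, 3.0])
R = np.array([
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.2],
    [0.0, 0.2, 1.0],
])

Sigma_e = covariance_from_correlation(sigma_e, R)
Sigma_p = covariance_from_correlation(sigma_p, R)

# Cross-covariance between existing and prospective costs (assumed form).
Sigma_ep = covariance_from_correlation(sigma_e, R, sigma_p)

# Covariance of the stacked 2N-dimensional cost vector.
Sigma = np.block([[Sigma_e, Sigma_ep], [Sigma_ep.T, Sigma_p]])
```

Because the existing and prospective costs share one source of uncertainty, the stacked matrix `Sigma` has rank at most N; it is positive semidefinite but not invertible.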

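Defining $\mathit{C}={\left({\mathit{C}}^{{e}^{\prime}}\phantom{\rule{4pt}{0ex}}{\mathit{C}}^{{p}^{\prime}}\right)}^{\prime}$, it follows that

$$\mathit{C}\sim \mathcal{N}\left(\mathit{\mu},\mathbf{\Sigma}\right),$$

where

$$\mathit{\mu}={\left({\mathit{\mu}}^{{e}^{\prime}}\phantom{\rule{4pt}{0ex}}{\mathit{\mu}}^{{p}^{\prime}}\right)}^{\prime}\quad \text{and}\quad \mathbf{\Sigma}=\left(\begin{array}{cc}{\mathbf{\Sigma}}^{e}& {\mathbf{\Sigma}}^{ep}\\ {\mathbf{\Sigma}}^{{ep}^{\prime}}& {\mathbf{\Sigma}}^{p}\end{array}\right),\quad {\mathbf{\Sigma}}^{ep}=\mathrm{diag}\left({\mathit{\sigma}}^{e}\right)\mathit{R}\mathrm{diag}\left({\mathit{\sigma}}^{p}\right).$$

The cross block ${\mathbf{\Sigma}}^{ep}$ follows from the common correlation matrix $\mathit{R}$ and the unit correlation between ${C}_{i}^{e}$ and ${C}_{i}^{p}$.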

The portfolio weights are the proportions of the total budget allocated to each technology. The allocation vectors for the existing and prospective technologies are denoted by ${\mathit{\omega}}^{e}={({\omega}_{i}^{e})}_{N\times 1}$ and ${\mathit{\omega}}^{p}={({\omega}_{i}^{p})}_{N\times 1}$, respectively. Naturally, $0\le {\omega}_{i}^{e}\le 1$ and $0\le {\omega}_{i}^{p}\le 1$, $\forall i=1,2,\dots ,N$; and

$$\sum _{i=1}^{N}\left({\omega}_{i}^{e}+{\omega}_{i}^{p}\right)=1. \tag{6}$$

Defining $\mathit{\omega}={({\mathit{\omega}}^{{e}^{\prime}}\phantom{\rule{4pt}{0ex}}{\mathit{\omega}}^{{p}^{\prime}})}^{\prime}$, we denote by $\mathsf{\Omega}$ the set of admissible electricity generation mixes, so that we must have $\mathit{\omega}\in \mathsf{\Omega}$. The set $\mathsf{\Omega}$ represents constraints like Equation (6), ${\mathit{\omega}}^{\prime}{\mathbf{1}}_{2N}=1$ (where ${\mathbf{1}}_{2N}$ is a $2N\times 1$ vector of ones), and minimum and/or maximum values for the allocations (${\mathit{\omega}}_{\mathrm{min}}\le \mathit{\omega}$ and/or $\mathit{\omega}\le {\mathit{\omega}}_{\mathrm{max}}$). Using the definition of $\mathit{\omega}$, the total cost of the portfolio is given by

$${\mathit{\omega}}^{\prime}\mathit{C}=\sum _{i=1}^{N}\left({\omega}_{i}^{e}{C}_{i}^{e}+{\omega}_{i}^{p}{C}_{i}^{p}\right). \tag{7}$$

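Using Equation (7), the expected cost of the portfolio is given by

$$\mathrm{E}\left[{\mathit{\omega}}^{\prime}\mathit{C}\right]={\mathit{\omega}}^{\prime}\mathit{\mu}$$

and the variance of the portfolio is given by

$$\mathrm{Var}\left[{\mathit{\omega}}^{\prime}\mathit{C}\right]={\mathit{\omega}}^{\prime}\mathbf{\Sigma}\mathit{\omega}.$$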
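For a given allocation, the expected cost is the linear form $\mathit{\omega}'\mathit{\mu}$ and the variance is the quadratic form $\mathit{\omega}'\mathbf{\Sigma}\mathit{\omega}$. A minimal sketch follows, with hypothetical numbers and a hypothetical helper name `portfolio_moments`:

```python
import numpy as np


def portfolio_moments(weights, mu, Sigma):
    """Return the expected cost w'mu and the variance w'Sigma w."""
    expected_cost = float(weights @ mu)
    variance = float(weights @ Sigma @ weights)
    return expected_cost, variance


# Hypothetical expected costs (USD/MWh) and covariance matrix.
mu = np.array([50.0, 40.0, 60.0])
Sigma = np.diag([100.0, 25.0, 4.0])
w = np.array([0.2, 0.5, 0.3])

expected_cost, variance = portfolio_moments(w, mu, Sigma)
risk = variance ** 0.5  # portfolio volatility, in USD/MWh
```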

For the case in which the vector of expected costs $\mathit{\mu}$ and the covariance matrix $\mathbf{\Sigma}$ are known, three kinds of mean-variance problems are usually considered in the MPT literature (for details, see [20]). In the following, we translate these three problems to the electricity generation planning context. The first approach minimizes the variance of the costs subject to a target maximum expected cost $c$. The target $c\in {\Re}_{+}$ is provided by the energy policy planner and represents the maximum allowable expected energy cost. Formally, the problem is written as follows:

$$\underset{\mathit{\omega}\in \mathsf{\Omega}}{\mathrm{min}}\phantom{\rule{4pt}{0ex}}{\mathit{\omega}}^{\prime}\mathbf{\Sigma}\mathit{\omega}\quad \text{subject to}\quad {\mathit{\omega}}^{\prime}\mathit{\mu}\le c.$$

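The second approach, a dual form of the first approach, minimizes the expected cost subject to a maximum value ${s}^{2}$ for the variance of the costs. The value ${s}^{2}\in {\Re}_{+}$, provided by the policy planner, represents the maximum value that the variance of the cost may reach. Formally, the problem is written as follows:

$$\underset{\mathit{\omega}\in \mathsf{\Omega}}{\mathrm{min}}\phantom{\rule{4pt}{0ex}}{\mathit{\omega}}^{\prime}\mathit{\mu}\quad \text{subject to}\quad {\mathit{\omega}}^{\prime}\mathbf{\Sigma}\mathit{\omega}\le {s}^{2}.$$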

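The third approach minimizes a combination of the expectation and the variance of the costs, weighted by a risk aversion parameter $\lambda >0$. A higher value of $\lambda $ indicates greater risk aversion. Formally, the problem is written as follows:

$$\underset{\mathit{\omega}\in \mathsf{\Omega}}{\mathrm{min}}\phantom{\rule{4pt}{0ex}}{\mathit{\omega}}^{\prime}\mathit{\mu}+\lambda {\mathit{\omega}}^{\prime}\mathbf{\Sigma}\mathit{\omega}.$$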

Considering linear constraints and known expected costs $\mathit{\mu}$ and covariance matrix $\mathbf{\Sigma}$, the solution of the previous optimization problem is readily obtained using any quadratic programming solver. It is also possible to rewrite the previous optimization problem as follows:

$$\underset{\mathit{\omega}\in \mathsf{\Omega}}{\mathrm{min}}\int \phi \left(\mathit{\omega},\mathit{c}\right)\mathrm{p}\left(\mathit{c}|\mathit{\mu},\mathbf{\Sigma}\right)\mathrm{d}\mathit{c}, \tag{15}$$

where $\phi $ is the quadratic cost function whose expectation under $\mathrm{p}\left(\mathit{c}|\mathit{\mu},\mathbf{\Sigma}\right)$ reproduces the objective of the previous problem, and $\mathrm{p}\left(\mathit{c}|\mathit{\mu},\mathbf{\Sigma}\right)$ is the multivariate Gaussian or normal probability density function with mean $\mathit{\mu}$ and covariance matrix $\mathbf{\Sigma}$. In the MPT context, the approximation of the investor utility function by a quadratic function was shown to be exact when the input data are elliptically distributed [37]. For instance, the elliptical family includes the normal, Student's $t$ and Lévy distributions.
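To make the third problem concrete, the sketch below minimizes $\mathit{\omega}'\mathit{\mu}+\lambda \mathit{\omega}'\mathbf{\Sigma}\mathit{\omega}$ over the simplex (non-negative weights summing to one). It uses projected gradient descent instead of a dedicated quadratic programming solver, purely to stay self-contained; the exact objective form, the function names, the data, and the assumption that the admissible set is just the simplex are all illustrative.

```python
import numpy as np


def project_onto_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    shifted = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    active = u - shifted / k > 0
    theta = shifted[active][-1] / k[active][-1]
    return np.maximum(v - theta, 0.0)


def solve_mean_variance(mu, Sigma, lam, n_iter=5000):
    """Minimize w'mu + lam * w'Sigma w over the simplex.

    Projected gradient descent with step 1/L, where L is the Lipschitz
    constant of the gradient (Sigma is assumed to be nonzero).
    """
    step = 1.0 / (2.0 * lam * np.linalg.eigvalsh(Sigma).max())
    w = np.full(mu.size, 1.0 / mu.size)
    for _ in range(n_iter):
        grad = mu + 2.0 * lam * (Sigma @ w)
        w = project_onto_simplex(w - step * grad)
    return w


def objective(w, mu, Sigma, lam):
    return float(w @ mu + lam * (w @ Sigma @ w))


# Hypothetical expected costs (USD/MWh) and covariance matrix.
mu = np.array([50.0, 40.0, 60.0])
Sigma = np.diag([100.0, 25.0, 4.0])
lam = 1.0

w_opt = solve_mean_variance(mu, Sigma, lam)
```

With these numbers the cheapest technology does not receive the whole budget: the risk term spreads the allocation towards the low-variance technology even though it is the most expensive one on average.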

## 3. Classical-Equivalent Bayesian Mean-Variance Approach

In terms of modeling, the Bayesian approaches address estimation risk from a different angle than the approaches of the previous section. Instead of treating the unknown parameters as constants, they treat them as random. Additionally, the belief or prior knowledge of the specialist about the input parameters is combined with the observed data. The Bayesian models provide an entire distribution of predicted costs that explicitly accounts for the estimation and predictive uncertainty [21].

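The posterior or updated density of the unknown parameters $\mathit{\mu}$ and $\mathbf{\Sigma}$, according to Bayes' theorem, is given by

$$\mathrm{p}\left(\mathit{\mu},\mathbf{\Sigma}|{\mathit{c}}_{1},\dots ,{\mathit{c}}_{T}\right)\propto \mathrm{L}\left(\mathit{\mu},\mathbf{\Sigma}|{\mathit{c}}_{1},\dots ,{\mathit{c}}_{T}\right)\pi \left(\mathit{\mu},\mathbf{\Sigma}\right), \tag{17}$$

where ${\mathit{c}}_{1},\dots ,{\mathit{c}}_{T}$ are recorded observations; $\mathrm{L}\left(\cdot |\cdot \right)$ is the likelihood function; and $\pi \left(\cdot \right)$ is the prior distribution. In particular, the likelihood function is given by

$$\mathrm{L}\left(\mathit{\mu},\mathbf{\Sigma}|{\mathit{c}}_{1},\dots ,{\mathit{c}}_{T}\right)\propto {\left|\mathbf{\Sigma}\right|}^{-T/2}\mathrm{exp}\left(-\frac{1}{2}\sum _{t=1}^{T}{\left({\mathit{c}}_{t}-\mathit{\mu}\right)}^{\prime}{\mathbf{\Sigma}}^{-1}\left({\mathit{c}}_{t}-\mathit{\mu}\right)\right),$$

where $\left|\mathbf{\Sigma}\right|$ is the determinant of the covariance matrix.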
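The Gaussian likelihood is usually evaluated on the log scale for numerical stability. The sketch below computes the log-likelihood up to an additive constant, assuming an invertible covariance matrix; the function name `gaussian_log_likelihood` and the simulated observations are hypothetical.

```python
import numpy as np


def gaussian_log_likelihood(mu, Sigma, observations):
    """Log of |Sigma|^(-T/2) * exp(-0.5 * sum_t (c_t - mu)' Sigma^-1 (c_t - mu)).

    `observations` has shape (T, d); the additive constant is omitted.
    """
    T = observations.shape[0]
    centered = observations - mu
    sign, logdet = np.linalg.slogdet(Sigma)
    if sign <= 0:
        raise ValueError("Sigma must be positive definite")
    # Sum over t of the quadratic forms (c_t - mu)' Sigma^-1 (c_t - mu).
    solved = np.linalg.solve(Sigma, centered.T).T
    quad = np.einsum("ti,ti->", centered, solved)
    return -0.5 * T * logdet - 0.5 * quad


rng = np.random.default_rng(1)
true_mu = np.array([50.0, 40.0, 60.0])
Sigma = np.diag([100.0, 25.0, 4.0])
costs = true_mu + rng.standard_normal((200, 3)) * np.sqrt(np.diag(Sigma))

mu_hat = costs.mean(axis=0)
ll_at_sample_mean = gaussian_log_likelihood(mu_hat, Sigma, costs)
ll_shifted = gaussian_log_likelihood(mu_hat + 1.0, Sigma, costs)
```

For a fixed covariance matrix, the likelihood is largest at the sample mean, so `ll_at_sample_mean` exceeds `ll_shifted`.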

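Using the posterior density of the unknown parameters $\mathit{\mu}$ and $\mathbf{\Sigma}$ from Equation (17), it is possible to obtain the predictive density of the costs as

$$\mathrm{p}\left(\mathit{c}|{\mathit{c}}_{1},\dots ,{\mathit{c}}_{T}\right)=\int \int \mathrm{p}\left(\mathit{c}|\mathit{\mu},\mathbf{\Sigma}\right)\mathrm{p}\left(\mathit{\mu},\mathbf{\Sigma}|{\mathit{c}}_{1},\dots ,{\mathit{c}}_{T}\right)\mathrm{d}\mathit{\mu}\phantom{\rule{2pt}{0ex}}\mathrm{d}\mathbf{\Sigma}.$$

Then, using the predictive density of the costs in the optimization problem from Equation (15), the Bayesian optimization problem is defined by

$$\underset{\mathit{\omega}\in \mathsf{\Omega}}{\mathrm{min}}\int \phi \left(\mathit{\omega},\mathit{c}\right)\mathrm{p}\left(\mathit{c}|{\mathit{c}}_{1},\dots ,{\mathit{c}}_{T}\right)\mathrm{d}\mathit{c}.$$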

In the following subsections, we present the predictive distributions using improper and proper priors for the unknown parameters $\mathit{\mu}$ and $\mathbf{\Sigma}$.

#### 3.1. Improper Prior Case

In some cases, our prior beliefs are vague and thus difficult to express as an informative prior. We would then still like to consider the uncertainty of the model parameters without imposing any prior belief on them. Improper priors, also called non-informative, diffuse or vague priors, are employed to that end. We consider the case in which the planner is uncertain about both parameters, $\mathit{\mu}$ and $\mathbf{\Sigma}$, and has no particular prior knowledge of them. This case is modeled using an improper prior, which is typically chosen to be the Jeffreys prior [38]

$$\pi \left(\mathit{\mu},\mathbf{\Sigma}\right)\propto {\left|\mathbf{\Sigma}\right|}^{-\left(2N+1\right)/2},$$

where $\mathit{\mu}$ and $\mathbf{\Sigma}$ are considered independent in the prior, and $\mathit{\mu}$ is not restricted as to the values it can take. The prior is non-informative in the sense that only changes in the data exert an influence on the posterior distribution of the parameters. When the sample mean, $\widehat{\mathit{\mu}}$, and the sample covariance matrix, $\widehat{\mathbf{\Sigma}}$, are given, it is straightforward to verify that the predictive distribution of the costs is a multivariate Student's $t$-distribution with $T-2N$ degrees of freedom (for the complete derivation of the following result, see [20] or [21])

$$\mathit{C}|{\mathit{c}}_{1},\dots ,{\mathit{c}}_{T}\sim t\left(T-2N,\tilde{\mathit{\mu}},\tilde{\mathbf{\Sigma}}\right),$$

where the predictive mean and covariance matrix are, respectively,

$$\tilde{\mathit{\mu}}=\widehat{\mathit{\mu}}\quad \text{and}\quad \tilde{\mathbf{\Sigma}}=\left(1+\frac{1}{T}\right)\frac{T-1}{T-2N-2}\widehat{\mathbf{\Sigma}}.$$

Here, the predictive covariance matrix is the sample covariance scaled up by a factor that reflects the estimation risk. For a given number of technologies $N$, $\tilde{\mathbf{\Sigma}}$ becomes closer to $\widehat{\mathbf{\Sigma}}$ as more historical data become available. In fact, when $N$ is fixed and $T\to \infty $, we have $\tilde{\mathbf{\Sigma}}\to \widehat{\mathbf{\Sigma}}$. On the other hand, with a fixed number of historical observations $T$, increasing the number of technologies $N$ while respecting the constraint $T-2N-2>0$ leads to larger entries, in absolute value, in the predictive covariance matrix and hence to higher estimation risk, since the relative amount of available data decreases. In practice, relevant information is also available from specialists on energy costs. Consequently, in the next subsection, we present a study with proper priors.
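The behavior of the scaling factor can be checked numerically. The sketch below assumes the standard textbook expression for the predictive covariance under the Jeffreys prior, $(1+1/T)(T-1)/(T-d-2)$, with $d=2N$ the dimension of the stacked cost vector; both that expression and the function name `improper_prior_scale` are assumptions made for illustration.

```python
def improper_prior_scale(T, d):
    """Factor multiplying the sample covariance in the predictive covariance.

    T is the number of observations and d the dimension of the cost
    vector (d = 2N here). Assumed form: (1 + 1/T) * (T - 1) / (T - d - 2).
    """
    if T - d - 2 <= 0:
        raise ValueError("the predictive covariance requires T - d - 2 > 0")
    return (1.0 + 1.0 / T) * (T - 1) / (T - d - 2)


d = 16  # eight technologies, existing and prospective
factors = {T: improper_prior_scale(T, d) for T in (25, 50, 100, 1000)}
```

The factor is always above one and shrinks towards one as `T` grows, so fewer observations translate into a larger predictive risk for the same allocation.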

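To conclude, the classical-equivalent Bayesian optimization problem for electricity generation planning for the improper prior case is given by

$$\underset{\mathit{\omega}\in \mathsf{\Omega}}{\mathrm{min}}\phantom{\rule{4pt}{0ex}}{\mathit{\omega}}^{\prime}\tilde{\mathit{\mu}}+\lambda {\mathit{\omega}}^{\prime}\tilde{\mathbf{\Sigma}}\mathit{\omega}.$$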

#### 3.2. Proper Prior Case

In the proper prior case, the specialists have informative beliefs about the mean and covariance of the technology costs. In this subsection, we adopt conjugate priors because they are algebraically convenient and yield a closed-form expression for the posterior. Following an approach that is common in the investment portfolio allocation context [20,21], the conjugate prior for the mean vector of the normal distribution (conditional on $\mathbf{\Sigma}$) is taken to be the multivariate normal, while the conjugate prior for the unknown covariance matrix of the normal distribution is taken to be the inverse Wishart distribution:

$$\mathit{\mu}|\mathbf{\Sigma}\sim \mathcal{N}\left(\mathit{\eta},\frac{1}{\tau}\mathbf{\Sigma}\right)\quad \text{and}\quad \mathbf{\Sigma}\sim \mathrm{IW}\left(\mathsf{\Psi},\nu \right),$$

where $\mathit{\eta}$ is the vector of expected costs based on the specialist experience, $\tau \in {\Re}_{+}$ represents the strength of the confidence the specialist places on the value of $\mathit{\eta}$, $\mathsf{\Psi}$ is the covariance matrix based on the specialist experience, and $\nu \in \Re $ represents the degrees of freedom of the inverse Wishart distribution, reflecting the confidence about $\mathsf{\Psi}$. Lower values of $\tau $ and $\nu $ indicate higher uncertainty about $\mathit{\eta}$ and $\mathsf{\Psi}$, respectively.

As in the improper prior case, the predictive distribution of the costs is a multivariate Student's $t$-distribution (for the complete derivation of the following result, see [20] or [21])

where the predictive mean and covariance matrix are, respectively,

and

We notice that the predictive mean $\breve{\mathit{\mu}}$ is a weighted average of the prior mean, $\mathit{\eta}$, and the sample mean, $\widehat{\mathit{\mu}}$. In other words, the sample mean is shrunk toward the prior mean. Moreover, the predictive mean and the predictive covariance matrix are not proportional to the sample estimates. The improper prior case is suitable when we do not suspect that the sample mean or the sample covariance matrix contains considerable estimation errors. The proper prior case, in turn, is preferable when the planner believes that, in the future, the expectation and covariance matrix of the costs will differ substantially from the historical ones.
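The shrinkage of the sample mean toward the prior mean can be sketched numerically. The code assumes the usual normal conjugate update, in which the prior mean carries weight $\tau /(\tau +T)$ and the sample mean carries weight $T/(\tau +T)$; this weighting, the function name `predictive_mean`, and the numbers are assumptions for illustration.

```python
import numpy as np


def predictive_mean(eta, mu_hat, tau, T):
    """Weighted average of the prior mean eta and the sample mean mu_hat.

    Assumed conjugate-update weights: tau / (tau + T) on the prior mean
    and T / (tau + T) on the sample mean.
    """
    return (tau * eta + T * mu_hat) / (tau + T)


# Hypothetical sample means (USD/MWh) and a prior mean 10% above them.
mu_hat = np.array([50.0, 40.0, 60.0])
eta = 1.10 * mu_hat
T = 50

no_prior_weight = predictive_mean(eta, mu_hat, tau=0.0, T=T)
equal_weight = predictive_mean(eta, mu_hat, tau=50.0, T=T)
strong_prior = predictive_mean(eta, mu_hat, tau=5000.0, T=T)
```

A small `tau` leaves the sample mean almost unchanged, while a large `tau` pulls the predictive mean close to the specialist's value.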

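To conclude, the classical-equivalent Bayesian optimization problem for electricity generation planning for the proper prior case is given by

$$\underset{\mathit{\omega}\in \mathsf{\Omega}}{\mathrm{min}}\phantom{\rule{4pt}{0ex}}{\mathit{\omega}}^{\prime}\breve{\mathit{\mu}}+\lambda {\mathit{\omega}}^{\prime}\breve{\mathbf{\Sigma}}\mathit{\omega},$$

where $\breve{\mathit{\mu}}$ and $\breve{\mathbf{\Sigma}}$ denote the predictive mean and the predictive covariance matrix of the proper prior case.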

## 4. Results

In this section, we present an application to illustrate the classical-equivalent Bayesian approaches. In Table 1, we reproduce from [35] the means and standard deviations of the costs of maintaining existing plants and building new plants for different energy generation technologies: (1) gas; (2) coal; (3) nuclear; (4) fuel oil; (5) biomass; (6) large hydropower; (7) wind; and (8) low-impact or small hydropower. Additionally, in Table 2, we reproduce from [35] the correlation matrix of the technologies considering the fuel costs. Since the correlation matrix is symmetric, we do not repeat the elements. In [35], the data were obtained using the Levelized Busbar Cost (LBC) methodology (for instance, see [4]). LBC is a valuation technique that calculates the costs over the useful lifetimes of the electric plants and averages them to yield a total production cost. For the purpose of our application, we consider the data from Table 1 and Table 2 as the sample estimates ${\widehat{\mathit{\mu}}}^{e}$, ${\widehat{\mathit{\mu}}}^{\mathrm{p}}$, ${\widehat{\mathit{\sigma}}}^{e}$, ${\widehat{\mathit{\sigma}}}^{\mathrm{p}}$ and $\widehat{\mathit{R}}$.

The naive mean-variance efficient frontier obtained using $\widehat{\mathit{\mu}}$ and $\widehat{\mathbf{\Sigma}}$ is presented in Figure 1 and Figure 2 (the same curve is repeated in both figures). The efficient frontier represents the set of all optimal choices. It is important to notice that the portfolios above the efficient frontier are realizable but inefficient and the portfolios below the efficient frontier are unrealizable. In the MPT context, on the other hand, since the random variables are the returns instead of the costs, the portfolios below the efficient frontier are realizable but inefficient, and the portfolios above the efficient frontier are unrealizable. It is fundamental to highlight the differences between the sets of realizable portfolios in the two contexts to avoid misinterpretations. In addition, it is also important to notice that the efficient frontier for the costs is always convex while the efficient frontier for the returns is always concave.

In the MPT context, the efficient frontier is calculated without considering the risk-free asset [20]. The risk-free asset is the one with a certain future return. The identification of the risk-free asset depends on the context of interest. For example, in the United States, treasury bills (T-bills) are considered the risk-free asset because they are backed by the government. Analogously, in the energy planning context, the efficient frontier must be calculated without considering risk-free energy generation technologies. The existence of risk-free technologies also depends on the context of interest. For example, government-backed subsidized energy generation technologies could be considered risk-free technologies. After obtaining the efficient frontier, following the MPT procedure, the risk-free technologies must be linearly combined with the efficient portfolios to obtain new optimal portfolios [39].

In the improper prior case, illustrated in Figure 1, the efficient frontier changes depending on the value of $T$. As mentioned, the predictive covariance of the improper prior case is the sample covariance scaled up by a factor that approaches one as $T$ increases. Here, $T$ does not represent the actual size of the sample used in the estimation. For us, $T$ is not only a proxy for the sample size but also a measure of the degree of confidence the planner has in estimates based only on historical data. Consequently, decreasing the value of $T$ shifts the efficient frontier to the right. The same shift to the right was observed in [19] using the robust mean-variance optimization with uncertainty sets when decreasing the degree of confidence the planner has in the estimates. In other words, both the robust mean-variance optimization and our improper prior case include the uncertainty of the estimates in the electricity planning process. However, we highlight the fact that the robust mean-variance optimization is computationally more expensive than our approach because it requires several optimizations to cover the entire uncertainty set. Our improper prior case requires only a single optimization.

In the proper prior case, the hyperparameters $\mathit{\eta}$ and $\mathsf{\Psi}$ represent the prior information of the specialist about the expected value and the covariance matrix of the technology costs, respectively. Since we do not have such parameters for the situation described in [35], we assume, for illustration purposes, that $\mathit{\eta}$ and $\mathsf{\Psi}$ are obtained by increasing the vectors $\widehat{\mathit{\mu}}$, ${\widehat{\mathit{\sigma}}}^{e}$ and ${\widehat{\mathit{\sigma}}}^{\mathrm{p}}$ by 10%. Unfortunately, specialist priors are not publicly available. Consequently, in the proper prior case, the application is just a toy problem for illustration purposes. In Figure 2, we present the efficient frontiers obtained for different values of $\tau $ with $T=50$ and $\nu =34$. Notably, the resulting efficient frontiers are not simple shifts of the naive mean-variance frontier. Consequently, as mentioned, the informative proper prior case is more suitable than the improper prior case when the planner believes that future costs will differ substantially from the historically estimated ones. In this section, the objective was to show the flexibility and the potential applicability of the Bayesian approach to include not only the estimation uncertainty but also the specialist information in the energy planning process.
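The construction of the illustrative hyperparameters can be sketched as follows. The numbers below are hypothetical and are not the values of Table 1 or Table 2; the sketch also assumes that $\mathsf{\Psi}$ is rebuilt from the inflated standard deviations and the unchanged correlation matrix.

```python
import numpy as np

# Hypothetical sample estimates for three technologies (USD/MWh).
mu_hat = np.array([50.0, 40.0, 60.0])
sigma_hat = np.array([10.0, 5.0, 2.0])
R_hat = np.array([
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.2],
    [0.0, 0.2, 1.0],
])
Sigma_hat = np.diag(sigma_hat) @ R_hat @ np.diag(sigma_hat)

# Prior hyperparameters: means and standard deviations increased by 10%.
eta = 1.10 * mu_hat
sigma_prior = 1.10 * sigma_hat
Psi = np.diag(sigma_prior) @ R_hat @ np.diag(sigma_prior)
```

Under this assumption, a 10% increase in every standard deviation makes `Psi` equal to 1.21 times the sample covariance matrix, so the prior is noticeably more pessimistic about risk than about expected cost.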