Modelling of Fuel- and Energy-Switching Prices by Mean-Reverting Processes and Their Applications to Alberta Energy Markets

: This paper introduces a fuel-switching price to the Alberta market, which is designed for encouraging power plant companies to switch from coal to natural gas when they produce electricity; this has been successfully applied to the European market. Moreover, we consider an energy-switching price which considers power switch from natural gas to wind. We modeled these two prices using ﬁve mean reverting processes including a Regime-switching processes, Lévy-driven Ornstein–Uhlenbeck process and Inhomogeneous Geometric Brownian Motion, and estimate them based on multiple procedures such as Maximum likelihood estimation and Expectation-Maximization algorithm. Finally, this paper proves previous results applied to the Albertan Market where the jump modeling technique is needed when modeling fuel-switching data. In addition, it not only gives promising conclusions on the necessity of introducing Regime-switching models to the fuel-switching data, but also shows that the Regime-switching model is better ﬁtted to the data.


Overview
The federal government of Canada published "The pan-Canadian approach to pricing Carbon Pollution" in October 2016 aiming at reducing green house gas (GHG) emissions by taxing fossil fuels. There are two carbon tax systems mentioned in it: an explicit price-based system and a cap-and-trade system. The cap-and-trade system gives each company limit tradable allowance for GHG emission, while the explicit price-based system sets a tax price for a fixed carbon emission. In Alberta, we are using the explicit price-based system and the current Carbon levy is $50 per tonne. However, Tombe (2017) [1] found that household electricity bills were expected to rise by $150 when the Carbon levy increased in 2018. Moreover, this tax is going to affect poor households which could increase inequalities in turn (Ambasta and Buonocore, 2018 [2]). This all indicates that this price is imperfect and a correct pricing method is needed for Albertan residents.
Electricity prices are effected by factors such as weather conditions, economic growth, political principles and fuel-switching (Sjim et al., 2006 [3]; Seifert et al., 2008 [4]; Carmona et al., 2009 [5]), where the fuel-switching is the only one that is controllable by the company itself. The term fuel-switching means specifically that for any power plant that is coal fired or gas fired, the company can switch based on price and produce the same quantities of electricity. Since coal price is cheaper than natural gas price but nearly twice as polluting as gas (Arora and Taylor 2014 [6]), we are able to consider potentials of GHG emissions reduction through fuel-switching (Erik Delarue, William D'haeseleer 2006 [7]). This paper introduces a fuel-switching price, which explains how to switch from coal to natural gas when producing electricity, and how this can apply to the Albertan energy market. In the Albertan market, there are examples of power plant companies which are working on fuel switching, such as Transalta's Coal-to-Gas Conversions Project and Capital Power's Repowering Genesee Project. We also introduce an energy-switching price which considers a power switch from natural gas to wind. Moreover, note that this paper focuses on different jump and regime-switching mean-reverting stochastic models, where the financial and economic meanings and their applications will be discussed in another paper. We model these two prices using five mean-reverting processes and including regime-switching processes, such as the Lévy-driven Ornstein-Uhlenbeck process and inhomogeneous geometric motion. The estimation of these processes is based on multiple procedures such as maximum likelihood estimation and the expectation-maximization algorithm. This paper also justifies previous results applied to the Albertan energy market showing that the jump modelling technique is needed when modelling fuel-switching data. Finally, the novelty of the paper lies not only in giving a promising conclusion on the necessity of introducing regime-switching models to the fuel-switching data, but also in showing that regime-switching models are better fitted to the data. This paper is based on several projects and ideas, first introduced in Arrigoni et al., 2019 [8], and then extended in Lu 2020 [9] to consider more stochastic models for fuelswitching and energy-switching prices. We consider inhomogeneous geometric Brownian motion and generate its parameter estimation procedure by using change of time method from Swishchuk 2008 [10]. We also consider the regime-switching models and the original Lévy-driven models. Unlike previous models with a constant volatility, the regimeswitching model assumes that the volatility is changing according to each regime and they can switch the value between regimes. The regime-switching models generally consider a business cycle behind our data, with different regimes modelled by a Markov chain that represents different stages in a business cycle. Furthermore, in reality, the volatility often has a high and low value and this feature can be captured by the regime-switching model. Such models have been widely used in economics and finance after Hamilton first introduced them in 1989, and it is useful to model a different volatilities using financial data (Yang et al., 2016 [11], Bai and Wu 2018 [12], Stübinger and Endres 2019 [13]). Finally, we aim at finding a more accurate continuous time model by considering regime-switching and non-negative properties of the prices.
We note that option valuations of futures contracts with negative underlying prices were studied in Swishchuk et al., 2020 [14].

Literature Review
There are many papers that have studied fuel-switching before and there are two main papers we are going to review. First, Goutte and Chevalier 2015 [15] introduced an estimation methodology for Lévy-driven Ornstein-Uhlenbeck processes and applied them on the fuel-switching price in the European market. They also defined a calculation formula to fuel-switching price based on power plants efficiency, fuel price, emissions factor and etc. They mainly concluded that the political decisions impact the jumping nature of the process and enhance our understanding of emission reduction through power systems that could be achieved in the EU Emissions Trading System (EU-ETS). We are going to introduce the fuel-switching price below.
As we discussed before, the fuel-switching price should stimulate companies to reduce GHG in a profitable way. Hence, this price starts from defining the marginal cost of each fuel without a carbon cost: where FC represents fuel cost and η is the power plant's efficiency. In this paper, we do not account for any cost that is related to regulatory cost, maintenance, labour force cost, etc and we follow the same definitions mentioned in Goutte and Chevalier 2015 [15]. Thus, here, the total/average cost for one kWh power generation is equal only to fuel cost. Further, the marginal cost defined here is different from the classic definition of marginal cost which is the change in the total cost that arises when one unit of production is incremented by one. The power plant's efficiency is considered as an average efficiency, so the marginal cost is actually only affected by the fuel cost. Next, if we have carbon cost such as carbon tax or allowance price, the equation changes to: where EF is the emission factor, which depends on the fuel and the amount of fuel burnt, and EC is the emission cost. For simplicity, we do not consider any cost. Now, the fuel switching happens when generated from other fuel is cheaper, so to define the minimum cost for a switch to occur, we need to let MC coal = MC gas , which turns out as solving the emission cost inside this equation: Thus, when the EU Allowance price is lower than this price, generating electricity from coal is cheaper than natural gas and vice versa. Later, they used this price in the European market and modelling it by a jump mean reverting model. They conclude that jump modelling techniques should included when modelling fuel-switching price.
The paper by Arrigoni et al. 2019 [8] introduces this method to the Albertan energy market, and based on the difference situations between the Albertan and European energy markets, they also introduced an energy-switching price as a 'future' fuel-switching price which considers a fuel-switching between natural gas and wind. They then also present their ideas as poster in the 42nd International Association for Energy Economics (IAEE) Annual Conference [16]. This is more meaningful because coal to natural gas is a switch to cleaner energy, while natural gas to wind is a switch to renewable energy. Later, similar to Goutte et al., 2015 [15], they modelled these two prices using Levy-driven Ornstein-Uhlenbeck process and standard Ornstein-Uhlenbeck process, and concluded that jump modelling techniques is needed.
The paper by Fargos et al., 2021 [17] offers new insights into the risk of carbon leakage and industrial relocation in case of asymmetric climate policies, and provides an improved understanding of the sectoral and regional structure of leakage, as well as policy measures to mitigate the adverse impacts on industrial competitiveness. Regarding the existing literature, this paper goes beyond it by considering: (1) assessing for the first time the competitiveness impacts of the ambitious EU Green Deal targets towards climate neutrality (no green house gas emission) by mid-century, (2) using an advanced version of the leading GEM-E3-FIT model (which is a multi-regional, multi-sectoral, recursive dynamic computable general equilibrium (CGE) model) with an enhanced representation of energy system and technologies required for net-zero transition. Here net-zero transition is brought out by Global Future Council of the World Economic Forum, which represents reducing global greenhouse gas emissions to 'net zero' by 2050, (3) exploring the impacts of first-mover coalitions conceptualizing the most recent climate policy announcements of EU and China aiming to achieve carbon neutrality (i.e., achieving net zero carbon dioxide emissions by balancing carbon dioxide emissions) by mid-century, thus ensuring high policy relevance of the analysis and (4) assessing the cost-effectiveness of border carbon adjustment (BCA) mechanism in preventing carbon leakage and supporting the European Emissions-Intensive and Trade-Exposed (EITE) industries, as suggested by the EU Green Deal.
The paper by Montenegro et al. (2021) [18] reviews the recent literature on the modelbased methods utilized for policy analyses in the areas of economics, energy systems, and environmental damages. It also explores ways in which these models and methodologies depict distributional impacts, the main indicators obtained from their results, shortcomings of each modelling methodology, and suggest improvements for future research. Georgescu-Roegen (1971) [19] mentioned that it could generate a reverse or duel case if using the temporal borders of economic process method, that two identical technologies could have two different productivities or marginal cost after several years. This paper is organized as follows. In Section 2, we introduce the economic background and fuel-switching data used in this paper. Section 3 described the parameters estimation procedures for five stochastic models considered in this paper and introduced in Appendix A. Section 4 gives the empirical results obtained by applying the models from Appendix A to the data introduced in Section 2. Section 5 provides model comparison based on AIC and BIC, and cross-validation methods. The Conclusion is given in Section 6. Appendix A discusses five mean-reverting stochastic models which are used in this paper.

Economic Content
Signed in 2016, the Pairs Agreement which is an agreement within the United Nations Framework Convention on Climate Change (UNFCC) aimed at dealing with GHGemissions mitigation, adaptation, and finance. Its goal can be summarized in three parts: (1) Limit the long-term increase of temperature to 1.5 degrees which should be done by reducing emissions as much as possible. (2) Provide a framework for transparency, accountability, and the achievement of more ambitious targets. (3) Mobilize support for climate change mitigation and adaptation. Under this agreement, each country should determine a plan to mitigate warming. According to these, the federal government of Canada published "The pan-Canadian approach to pricing Carbon Pollution" in October 2016 as we mentioned in the beginning and Alberta choose the taxation method for reducing GHG emission. It raises the public's attention on the carbon pricing method.
The prices in the energy sector mainly depend on two aspects. The electricity prices determine majorly on the prices in the energy market such as coal and gas futures, plus the carbon tax in one of the two forms. These factors are usually affected by the supply and demand in the energy market. Further, the fuel-switching is another important aspect to consider when pricing electricity. Theoretically, the switching can happen between either coal, gas or nuclear, but in practise, switching from coal or gas to nuclear seems difficult since nuclear energy is not flexible. There are several reasons the study of fuel switching is needed. In general, there is a trade-off between coal fire plant and gas fire plant that the coal causes more GHG emissions while gas is more expensive, and this can be applied in the switching. Moreover, the fuel-switching between coal and gas appears to be a natural cost effective way to reduce CO 2 emission (Ellerman and Buchner 2007 [20]). At last, a stochastic modelling on fuel-switching gives various applications in operation management and risk management. These reasons make the accurate modelling on fuel switching to become very important.

The Fuel-Switching Data
We use the fuel-switching and energy-switching price of Alberta as presented in   [8]. We are interested in how these processes are performed on this data and what are the general properties of this data reflected by these processes. Moreover, it would be better if we can study this data using one of the mean reverting models, this is why we introduce a cross validation method in Section 5.
In   [8], the authors provided a new method for carbon pricing based on the European energy market experience which we reviewed in chapter one. Remember that the fuel-switching price is defined as EC switch = η coal FC gas − η gas FC coal η gas EF coal − η coal EF gas .
To calculate this price, they found the emission factor is equal to 210 kgCO2eq/MWh for natural gas and 320 kgCO2eq/MWh for coal from the financial data of Winnipeg. Further, a case study from JEM Energy in 2004 gives the average coal efficiency which is 32.6%, and 31.1% for natural gas in Albertan power plants. The coal market price and natural gas market price are derived from NRG stream. Thus, one can calculate the fuel-switching price based on the information above, and the data are given in Figure 1. As we can see from the yellow line in Figure 1, it is negative after September 2017, which means that electricity from natural gas is always more profitable than coal. Hence, it is reasonable to consider a switch between natural gas to wind. So they provide an energyswitching price, where the emission factor for wind is zero and efficiency for wind ranges from 32% to 37%. In the paper, they also generate a wind price by a random uniformly distributed process lies between 37$ to 42.5$. So the energy-switching price is given in Figure 2. We also provide the descriptive statistics for our data in Table 1. As we can see, we have a negative kurtosis and a positive skewness in fuel-switching data which means this data have a thin-tail and are left-skewed. Furthermore, the negative skewness of energyswitching data indicates that the mean of the data values is less than the median, while the positive kurtosis implies a fat tailed distribution. Although it is hard to see graphically from Figures 1 and 2 that Albertan government had the carbon tax in places in January first 2017, the results from regime switching models reveal some signals of it. In the next section, we are going to give the model estimation result of the fuel-switching price and energy-switching price based on Albertan market by a panel of mean reverting processes we introduced before.

Inhomogeneous Geometric Brownian Motion
Using time changed method, Swishchuk (2008) [10] obtained the solution to the inhomogeneous geometric Brownian motion model (IGBM): In order to fit this model into our data, we consider the calibration of this model using the solution above. Firstly, according to the solution we have Then calibrating the above equation we obtain (assuming here we have data set Thus, the log-likelihood for where n is the length of S t and V k = Var[w(φ −1 likelihood estimator is very sensitive to the initial value. Based on this, we choose to use the result from as our initial value and we set time scale as 0, 1 n , 2 n , . . . , n−1 n to compensate this defect, where n is the length of our data. Now in order to maximize the log-likelihood and let our algorithm search in a correct parameter space, we need to maximize the log-likelihood under several constrains for our parameters: In the later section, we will proceed to a Augmented Lagrangian Adaptive Barrier Algorithm to find our maximum likelihood estimator.

Parameter Estimation
We are going to present a two steps estimation procedure introduced by Goutte and Chevalier (2015) [15]. Firstly, we use the Ordinary Least Square (OLS) method to estimate parameters κ, θ and σ in OU process. Secondly, the Maximized Likelihood Estimation will be used to model the parameters in Lévy NIG distribution.
In the previous section we have the solution to OU process and LDOU process which is where Y t = W t for OU process, Y t = L t for LDOU process and ∆t = t k+1 − t k . Here we change the time interval from [0, t] to [t k , t k+1 ] because in practice, we observed the data in time increments 0 < t 0 < t 1 < · · · < t n with ∆t = t k+1 − t k be a constant. Now rearrange this equation we obtain: σ and t k = 1 s t k+1 t k e κ(s−t k+1 ) σdY s . Notice that the term v t k is a white noise, where white noise means it has zero mean with constant finite variance, and no autocorrelation or autocovariance. We consider the Ordinary Least Square (OLS) method for Equation (6) which is one of the popular parameter estimation methods for an autoregressive model of order one (AR(1)), then our first step to estimate parameter {κ, θ, σ} is to minimize the empirical variance of white noise: The solution is given by: where Suppose the Y t inside term t k is a Wiener process (corresponding to OU process), then till now we already find out the estimators of parameters in OU process. However, when Y t is a Levy process with NIG distribution, we need our second step to estimate parameters {α, β, δ, µ} for NIG. Noticing that in LDOU case, S t k+1 − S t k + aS t k = m + v t k is following a non-centered and un-normalized NIG distribution, so we are considering the MLE method for these four parameters. By Barndorff-Nielsen and Halgreen (1977) [21], they have proved the log-likelihood function of N IG(α, β, δ, µ) distribution for a set of i.i.d random variables x 1 , . . . , x n is where τ k = x k −µ δ , a k = 1 + τ 2 k and K 1 is the same function in the density of NIG. Thus, if we let x k = S t k+1 − S t k +âS t k in Equation (9), the only thing we need to do next is to maximize function (9) with respect to parameters (α, β, δ, µ). However, since the log-likelihood function is very complicated, we consider using the nigFit function in R package "GeneralizedHyperbolic" to find the MLE, which use a numeric way to maximize this function. Moreover, notice that in definition of NIG distribution, we need δ > 0, α ≥ 0 and γ ≥ 0, so we are going to maximize the log-likelihood under constrains.

Parameters Initialization
Since we are going to maximize this function numerically, we have to decide the initial value for the algorithm. Here we consider a method of moments for our initial value. In Jordan S. (1999) [22], he present the formula for the first four moments of NIG distribution: where m i is the sample moments of i.i.d random variables x i which follow N IG(α, β, δ, µ) distribution and γ = α 2 − β 2 . Then solving the parameters from Equations (10) to (13), we obtain our formula of initial values.
These result can also be found in Goutte and Chevalier (2015) [15].

Regime-Switching Models
In the application below, we are going to consider a two-state continuous-time Markov Chain Z t and these two states represent the 'expansion' and 'contraction' in economics. The transition of this Markov chain determined the probability that of our parameters will switch to another regime. Considering the complicated form of SDE of these two processes, we are not going to find their explicit solution. Instead, we will use EM algorithm to estimate parameters in it. Moreover, these processes also have the mean-reverting properties while under different states, it will have a different long-term mean, mean reverting rate, volatility, and even the parameters in Lévy process.

Parameter Estimation of RSOU Models Using EM Algorithm
Recall that the RSOU and RSLDOU are the solution to the SDE where Y t can be Wiener process or Levy process. Since it is complicate to find a explicit solution when we add a Markov chain in it, we discretize these two processes by their SDE. For the observed price {S t 1 , S t 2 , . . . , S t n }, we have: We use the EM algorithm estimation procedure which was introduced in Julien Chevallier and Stéphane Goutte (2016) [23] to estimate parameters in RSOU process: For the expectation step, they first calculate the filtered probability: where and because of the Wiener process term in RSOU process. Then they calculate the Smoothed probability, At last, for the maximize step, they provided a iterative solution which is given by: where In our cases, we only consider a two-states Markov chain, which means our stats space S = {1, 2}. The next step of this algorithm is to set an ending constraint for it. Since we are aiming at maximize the log-likelihood, so one of the constrain is log L (m+1) − log L (m) < . Another constrain we set is that m < M where M is a fixed number of iterations we want, since this algorithm might converge very slowly.

Parameter Estimation of the NIG Part of RSLDOU Model Using MLE Algorithm
After estimate the parameters Θ 1 , we then proceed to the remaining set of parameters Θ 2 in NIG distribution. Recall that in Section 3.2.1, we talked about how to estimate NIG parameters using MLE method, here we are using the similar technique instead of that we have two regimes now. Thus, to estimate the remaining set of parameters, we fit NIG distributions for each regime: and Following the same procedures in Section 3.1, we can obtain the estimatesΘ 2 = {α 1 ,β 1 ,δ 1 ,μ 1 ,α 2 ,β 2 ,δ 2 ,μ 2 }.

Estimation of IGBM
When it comes to change of time approach, as we have talked before, due to the limits of exponential term in computer, the algorithm is quite sensitive to the initial values. Thus, in this problem, we use the estimators in Euler-scheme calibration method to be our initial value and Augmented Lagrangian Adaptive Barrier Algorithm to find our maximum likelihood estimator. The results are shown in Table 2. One of the simulation results in given by Figure 3. The reason we have this terrible result is because IGBM keeps the simulation positive all the time. So this model is typically not fit for our fuel switching data since the data decrease below 0 at the end. Thus, we do not consider this situation in the later model comparison section. Table 3 gives esitmation result of OU and LDOU model and Figure 4 present our simulation result. In Table 3, the θ is 1.10466 which represents a very low long-term mean and we also have a very small mean-reverting rate κ = 0.008600791. The α in NIG is close to 0 witch indicates a high intensity of jumps in Lévy process. Comparing to Figure 3, the simulation result of OU process present in Figure 4 is clearly better, since OU process allows negative values. However, we might also see that it reaches negative value in a very early date and bounce back to positive, which could be a signal for not fitting the data well. Figure 4 also gives simulation results for the LDOU process, we can see there are lots of 'cliffs' inside the process, which is the consequence of introducing Lévy process to the model. Next, we talk about the solution of regime-switching models on fuel-switching price.

Estimation of RSOU and RSLDOU Processes
The estimation result for regime switching OU process is given by Table 4. In Table 4, we can see that regime 1 has a positive long-term mean while regime 2 has a negative value, and we will also see in the later that this process switch to regime 2 more frequently when time increase. Compare to the volatility in OU process, regime switching model provides a lower and a higher volatility according to different state. Now, one of the simulation result is in Figure 5.  The reason we observe 'cliffs' in the trajectory even if we do not include Lévy process is because when each regime changes, this process change its parameters, which cause the 'cliffs' in the process. Regime switching Lévy driven OU model has the same estimator value as above, but with different estimator for the parameters in NIG distribution. Next, we present the estimators for two NIG distributions according to the two regimes we set before: The α is increasing in regime two means a lower intensity of jump happened in Lévy process in regime two. Further we have a negative beta value which implies a left skew in NIG distribution. One of the simulation result is present in Figure 6: To see when the switching happened in our simulation result, we need to have a closer look at the filtered and smoothed probabilities in the regime switching model given in Figure 7.   Figure 7 basically shows the probability of each regime happened in time t. Based on the two graphs, when the number of trials is less than around 850, regime one is more likely to happened and when the number of trials is greater than around 850, it has more possibilities to switch to regime 2. According to our data, a number of trials around 850 is consistent with July in 2017 which is the time Carbon Levy is introduce to Albertan energy market. It tells us that the introduce of carbon tax changes the structure of our data and according to Figure 1, the natural gas will be more profitable for generating electricity after the introduction of carbon tax. Further, from Tables 4 and 5, the regime two has greater volatility and higher alpha, which implies a higher volatility and intensity of jumps of fuel-switching data caused by the introduction of carbon levy. So far, we can see that the regime-switching models provides a better fit graphically compare to OU and LDOU process, since their simulation result have more points close to the original trajectory.

Estimation Results for Energy-Switching Data
In this section, we follow the same procedure of Section 4.1 and create estimation result of different models for energy-switching data.

Estimation of IGBM
The obtained estimation results for IGBM model are given in Table 6: The long-term mean reaches 100 means that this price will go higher in the future to around 100.2216 indicated by the IGBM model. One of the simulation results in given by Figure 8. We can see that the blue curve do not fit the red curve very well, as it separate with the red curve in the early date and has an intend to go beyond red curve after the final date of this data. Thus, this model may not fit the data very well. Table 7 presents the estimation result for OU and LDOU process based on two-step estimation method.  Table 7 give a long-run mean 64.3041 which is smaller than the long-term mean in IGBM model. Further, it has a larger α compares to fuel-switching data which shows that the energy-switching price may includes less jumps compare to the fuel-switching price. Similar as fuel-switching data, we have a negative beta here, which indicates a left skewed density. Figure 9 is the simulation result of OU model. Compare to Figure 8, the blue trajectory goes up and down around the red one and it also perfectly explains the trend of red curve. So then we are interested in how LDOU model perform on energy-switching data, as we can see from Figure 10, it keeps following the trend of red curve while it has some extreme values in the early dates.

Estimation of RSOU and RSLDOU Processes
The estimation result for regime switching OU process is given by Table 8. And one of the simulation result is in Figure 11. Similar as before, Regime Switching Lévy Driven OU model has the same estimator value as above, but with different estimator for the parameters in NIG distribution. So in the next, we only present the estimators for two NIG distributions according to the two regimes we set before in Table 9. Further, we have the simulation result given by Figure 12. To see the probabilities of regime one and two happened in time t, we report the smoothed and filtered probabilities for energy-switching data in Figure 13. We can see the switch happened more frequently after the number of trials reaches 170 which is as well the July of 2017 in Figure 2. This indicates that for RSOU model, the volatility and long-term mean are all changes after the introduction of carbon tax in Albertan energy market. Since the volatility for regime two is 109.0701 according to Table 8, and regime one only have σ 2 = 25.8930, it can say that introduction of carbon tax cause a higher volatility for our energy-switching data. As for the NIG distribution in RSLDOU model, a smaller alpha in regime two means the decrease of steepness or pointiness of the density, which implies a fat tail after July 2017.
If we have a closer look at Figures 4-11, it is hard to tell which model fits the data well graphically. Hence, in the next section we are going to propose two model comparison test and hoping to sifting the best fitted model for the two prices.

Models Comparison
So far we have fitted the fuel-switching data into five different mean-reverting models: The Inhomogeneous Geometric Brownian Motion (IGBM), OU process, Levy-driven OU process, regime switching OU process and regime switching Levy driven OU process. In this section, we numerically compare their performance on the fuel-switching and energy-switching data.

AIC and BIC
To compare the goodness of fit of all the models, we first use the Akaike information criteria (AIC) and Bayesian Information Criteria (BIC), as these two criteria are estimators of relative quality of statistical models for a given set of data: whereL is the likelihood value under Maximum likelihood estimator, k is the number of parameters that need to estimated and n is the number of observation. The smaller of AIC and BIC, the better the model is. Table 10 gives the results for fuel-switching data:  [8]. However, we can see the value of AIC and BIC decrease significantly after we introduce regime-switching model to the data and the RSOU process gives the best fit to the fuel-switching data. Notice that here we do not consider AIC and BIC for IGBM because this model is typically not fit for fuel-switching data, since IGBM is always positive while the data have negative values. Table 11 gives the comparison results for energy-switching data: In Table 11, the OU process and LDOU process produced a relatively low value of AIC and BIC,while other models have the value around 1500. It is reasonable since the wind prices used in   [8] is a random uniformly distributed process and the OU process reflect this property.
As we can see from Table 10, our AIC and BIC report negative values, which is reasonable since our data create a relatively large density and cause the In(L) to be a positive value. Moreover, this negative value also just depend on the model and have nothing to do with the initial value. If we try a 3-states MC in RSOU or RSLDOU model, we will still obtain negative AIC and BIC. Another worth noticing feature is that the AIC and BIC values are relatively large, this situation also happened in Chevallier and Goutte (2016) [23] and there is no need to re-scale them.
The AIC and BIC give us a general idea of the goodness of fit for each model on the data. In addition, to access the predictability of each model on the data, we propose a cross validation test in the next section. More importantly, in the cross validation test we estimate the mean square error of each model, which is a different comparison scale with respect to AIC and BIC.

Cross Validation (CV) for Time Series Data: The h-Block Cross Validation
The traditional k-fold cross validation introduced by P. A. Lachenbruch and M. R. Mickey (1968) [24] is a popular method in evaluate machine learning models. The first step of this method is to shuffle the data randomly, while it is impossible for time series data since they already have an order based on time. For times series data, people usually consider a leave-one-out cross validation. Suppose the model we have is where θ is a set of parameters in the model and g is a linear function of variable x t−1 depend on the model, for example, g(θ, x t−1 ) = δ + φx t−1 + for an AR(1) model. The leave-oneout CV first estimate parameter on the data that removed a single case x i , which we name this set of estimatorθ i . Then it assess the predictive ability of the model by the expectation wherex i is a random variable that has same distribution as x i andθ is the estimates of original model. However, when x t are not independent of each other, the estimation of E[(x i+1 − g(θ,x i )) 2 ] in Equation (A16) can be very poor. Thus, in the h-block CV [25], instead of removing one case for the i-th parameter estimate, we remove as well a block of h cases from either side of it and then calculate the corrected h-block cross-validated estimate as: Now we use the method given in Equation (A17), which we set h = 2 to make sure the Independence of each test value on our models and produce the following result in Table 12. Similarly as before, we do not test IGBM for fuel-switching data due to inconsistent property between them. Further we are happy to see that the RSOU process and OU process has the best predictability for fuel-switching data and energy-switching data respectively, which has the same result as we obtained from AIC and BIC. It is saying that these two models are the best fitted model for the two prices respectively.

Conclusions
In this paper, we introduced the fuel-switching price and energy-switching price to the Albertan Market hoping that they can encourage power plant companies to generate electricity with cleaner energy. We modeled these prices using five mean reverting processes: IGBM, OU process, LDOU process, RSOU process, and RSLDOU process. Each of them have characteristic properties and the modelling steps not only to give us an insight on the properties of the price, but also help us to find a practical guidance for companies to proceed fuel switching in a profitable way.
After the model comparison in Section 6, we are able to conclude that the RSOU process and OU process are the best models for fuel-switching price and energy-switching price, respectively. We can see that for the fuel-switching price, the regime-switching model largely increases the goodness of fit compared to other models, which indicates the important property of regime-switching for this price. Moreover, jump modelling techniques are also important as they increase the performance of the OU process, and this finding is similar to the previous results from North American and European Markets.
So far, as reflected by the stochastic models, the fuel-switching price in the Albertan market includes jumps and regime-switching. However, as the natural gas price keep decreasing in Alberta, more and more companies have already switched their power plant to natural gas, and this is why we need to further consider energy-switching prices. The best fit of the OU process on energy-switching price reflects the steadyness of wind price, since it is a uniform distributed process.
The result in this paper can be applied in many different areas related to the energy market. Firstly, for speculators, arbitrageurs and risk managers in the market, a jump or regime-switching modelling technique can be applied in the optimization of futures and options portfolios, or, more importantly, the study of the volatility of a specific market. For example, a regime-switching or jump volatility can totally change the calculation of Valueat-Risk (VaR) or Expected Short Fall (ES) in risk management. Secondly, considering the impact of carbon cost on wholesale electricity prices, our results can be used to better study electricity prices, optimize portfolios including fuel contracts and emission allowances, and compare the price formation in different countries. A similar conclusion can also be found in Goutte and Chevalier (2015) [15] and Zachmann (2011) [26]. Finally, the fuel-switching and energy-switching prices introduced in this paper provide a better understanding of operation management and policy making. For electricity producers, it helps them to understand the trade-off between carbon prices and emission reduction costs. For policy makers in Alberta, it would be better to consider a fuel-switching carbon price instead of an explicit carbon tax in order to better incite the electricity generator of using cleaner energy in a profitable way.

Acknowledgments:
We would like to appreciate four anonymous reviewers for constructive comments and suggestions, Antony Ware for data supporting, as well as all the participants at the 42nd International Association for Energy Economics (IAEE) Annual Conference in Montréal and the New Challenges in Energy Markets-Data Analytics, Modelling and Numerics Workshop in Banff.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

Appendix A. Mean-Reverting Stochastic Models
In this section, we introduce a set of five mean-reverting processes: inhomogeneous geometric Brownian motion (IGBM), Ornstein-Uhlenbeck (OU) process, Lévy-driven OU process, regime-switching OU process and regime-switching Lévy-driven OU process.
where a is the mean reverting rate, L > 0 is the long-term mean, σ > 0 is the volatility and W t is a F t Wiener process.
Unlike the Geometric Brownian Motion (GBM) where we can solve its SDE by applying Ito's lemma on d log(S t ), the SDE of IGBM is very hard to find the explicit solution because of the long-term mean L in the equation. In Swishchuk, A. (2008) [10], he generate this solution by change of time method which introduced in Ikeda and Watanabe (1981) [27] or Elliott (1982) [28] and this method is state as: Theorem A1 ((Ikeda and Watanabe 1981, Chapter IV, Theorem 4.3 [27])). Consider the SDE dX t = α(t, X t )dW t , where W t is the Wiener process and α is a measurable function on [0, +∞] × R. LetW(t) be an one-dimensional F (t) -Wiener process withW(0) = 0, given on a probability space (Ω, F , (F t )t ≥ 0, P) and let X(0) be an F (0)-adopted random variable. Define a continuous process V(t) by Let φ(t) be the change of time process , then there exists aF t adopt Wiener process W(t) such that (X(t), W(t)) is the a solution to the SDE. Now we can solve the SDE of IGBM by letting V t = e at (S t − L) first and calculate dV t = ae at (S t − L)dt + e at dS t = σ(V t + e at L)dW t (A2) by the Itô's Lemma, then use theorem 7.1 to solve V t from the SDE above. The solution is or equivalently, where andW t is a standard Wiener process. From this solution one can find out the expectation and variance which are An apparently result from the formula of expectation will be lim t→∞ E[S t ] = L, this explain why L is called long-term mean. Before we proceed further, let us calculate the E[S t ] and Var[S t ] above. This calculation method can be found in Swishchuk (2008) [10]. We first calculate the mean of S t . Since Now since we need to find out E(W 2 (φ −1 t )) to know what is Var(S t ).
Recall that and so we have Thus, Take derivative with respect to t on both side we have, and solve this nonhomogeneous differential equation we have, so by Equation (A9), we have

. Lévy Process and Normal Inverse Gaussian Distribution
Before we go further to other models, we need to introduce Lévy process. It is named after French mathematician Paul Lévy and it can be considered as a mixture of continuous process and jump process, where jump process means it allows left discontinuous. Another important property the Lévy process have is that it has independent identically distributed increments, like Wiener process, and we can see this from the definition below.

Definition A2. {L t } is Lévy Process if it is a continuous-time stochastic process and it follows
for any three times 0 ≤ s < t < u, the increments of this process L t − L s and L u − L t are independent; • for any two times 0 ≤ s < t, L t − L s have equal distribution to L t−s .

•
The sample path of L t should have right continuous and admit left limits, i.e., a CadLag path.
In this paper, we consider the lévy process that follows a Normal Inverse Gaussian (NIG) distribution which was introduced by Barndorff-Nielsen and Halgreen (1977) [21]. This distribution has been widely used in modeling fat-tailed process in economics and finance. There are also other distributions that can be considered, one can refer to [29].
The parameters of NIG distributions have their natural interpretations related to the overall shapes of the density. Firstly, a large value of α implies light tail as well as a smaller steepness or pointiness. Further, a lower value of α indicates a higher intensity of jumps. When β equals 0 and µ arbitrary, an NIG r.v. is following a Normal distribution if α goes to infinity. Secondly, the beta is the skewness parameter with β < 0 implies a density skew to left and vice versa. When β = 0, we have a density symmetric around µ. At last, δ is a scale parameter in the sense that the re-scaled parameter δα and δβ are consistent under location scaled change of x.
Here we consider a Lévy process with NIG distribution means that the increments is following N IG(α, β, δ, µ) and other definitions remains the same. Under this case, the Lévy process is a purely jump process.

Appendix A.3. Ornstein-Uhlenbeck (OU) Process and Lévy-Driven Ornstein-Uhlenbeck (LDOU) Process
This section includes two mean reverting processes that are related to the Ornstein-Uhlenbeck (OU) process and Lévy process. Comparing to IGBM, OU process is another widely used mean reverting process where its value can be negative. Sometimes an oil company faces a situation of overstock which created a huge amount of storage costs and they would like to sale these oil even in a negative price, so at some point, the spot price of oil may goes to negative even if it looks like impossible. Thus, OU process is also an interesting topic to study and would be useful in commodity market.
Definition A4. Under the probability space (Ω, (F t )t ≥ 0, P), a stochastic process is called Ornstein-Uhlenbeck (OU) process if it satisfied the following SDE: where κ is the mean reverting rate, θ is the long-term mean, σ > 0 is the volatility and W t is a F t -Wiener process.
Note that if we apply Ito's Lemma on f (t, S t ) = S t e κt , we will obtain dS t e κt = κS t e κt dt + e κt dS t (A22) = κS t e κt dt + e κt (κ(θ − S t )dt + σdW t ) (A23) = κθe κt dt + e κt σdW t (A24) which means, The W t term in the SDE indicate that this process is continuous everywhere and do not include jumps in it, so the only difference in Lévy-driven OU process is that we change dW t term into a dL t term, where L t is a Lévy process with NIG distribution, so that we can include jumps in our model.

Definition A5.
Under the probability space (Ω, (F t )t ≥ 0, P), a stochastic process is called Lévy-driven Ornstein-Uhlenbeck (LDOU) process if it satisfied the following SDE: where κ is the mean reverting rate, θ is the long-term mean, σ > 0 is the volatility and L t is a F t Lévy process with NIG distribution. Following the same procedure above, we can solve the SDE of LDOU which is S t = S 0 e −κt + θ(1 − e −κt ) + So far, we have already considered two popular mean reverting processes and one can notice that all this has a constant mean reverting rate, long-term mean and volatility. However, in practice, we are not facing a constant mean reverting rate, long-term mean and volatility, so here we try to upgrade our model by considering a Regime-switching Ornstein-Uhlenbeck (RSOU) process and a Regime-switching Lévy-driven Ornstein-Uhlenbeck (RSLDOU) process. The basic idea of RSOU and RELDOU process is that we assume these constants are following another continuous time Markov chain with discrete states. Each state in the Markov chain can represent different states of economic activities such as deflation and inflation so that these will bring out different values of those constants. Let us introduce the definition of RSOU and RSLDOU.
Definition A6. For all t ∈ [0, T], let Z t be a continuous time Markov chain on finite space S := {1, . . . , K}. A regime-switching model is a stochastic process (X t ) which is solution of the stochastic differential equation given by  (1), . . . , σ(K)} ∈ R K + .