For a better understanding of the classification results, we include a brief description of some markets analyzed.
3.1. Description of Some Electricity Markets Analyzed
The four Australian markets selected in this study are New South Wales (NSW), Queensland (QLD), South Australia (SA) and Victoria (VIC). The Australian National Electricity Market (NEM) promotes efficient generation and demand use through a wholesale market, which allows electricity trade among five regions in the east of Australia (see
Figure 1): Queensland (QLD); New South Wales (NSW); Snowy Mountains region (SN, abolished in 2008 and merged into VIC and NSW areas); Victoria (VIC); South Australia (SA); and Tasmania (TA, fully operational in NEM since 2006).
Each region has different characteristics (generation mix and load) and interconnection capacities. For example, New South Wales is a net importer of electricity and has limited capacity to cover the highest peaks of demand; for this reason, it needs generation support from QLD, Snowy Hydro and VIC. In the period under study (2004 to 2009), Victoria had a substantial low-cost base-load capacity, making it a net exporter of electricity. Queensland is a net exporter too, mainly to NSW, due to their geographical and electrical proximity. South Australia is a net importer (a high percentage of its demand was covered from outside the region until 2005–2006, when new investment in wind generation was developed in this area).
Table 2 (adapted from [
35]) shows the inter-regional trade of these regions.
The NEM market works in unison when electricity can flow freely among all areas, but this does not mean that the price is the same in the five areas during these periods. The “integrity” or price alignment of the NEM market as a percentage of trading hours ranges between 70% and 80% across the regions. Australia manages congestion periods by splitting its regions, allowing different and more independent marginal prices in each area. This separation occurs when a transmission inter-connector becomes congested and limits inter-regional power flows. In these cases, each area needs to reconsider offers from the generation in its own region, and in this way, the market behaves differently in each area (the generation mix is different for each region). This scenario may occur at times of peak demand or when an inter-connector experiences an outage or is under maintenance. The inter-connectors in Australia are shown in
Figure 1. Notice that Australia does not have a meshed link among regions (QLD, NSW, SA, VIC, TA), but a radial one.
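As an illustration of the price alignment ("integrity") idea described above, the following minimal sketch computes the percentage of trading hours in which two regional prices coincide. The function name, the tolerance `tol` and the toy prices are assumptions made for this example; the official NEM calculation may differ (for instance, by accounting for transmission loss factors).

```python
def price_alignment(prices_a, prices_b, tol=0.01):
    """Percentage of hours in which two regional prices coincide (within tol)."""
    if len(prices_a) != len(prices_b):
        raise ValueError("both series must cover the same hours")
    if not prices_a:
        raise ValueError("series must not be empty")
    aligned = sum(1 for a, b in zip(prices_a, prices_b) if abs(a - b) <= tol)
    return 100.0 * aligned / len(prices_a)


# Toy prices (invented): the two regions separate during hours 3 and 4,
# as would happen when the inter-connector between them is congested.
region_a = [30.0, 32.0, 35.0, 80.0, 31.0]
region_b = [30.0, 32.0, 55.0, 40.0, 31.0]
alignment = price_alignment(region_a, region_b)
print(alignment)
```

In this toy case, prices coincide in three of the five hours.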
The Nord Pool markets are divided into several bidding areas. The available transmission capacity may vary and congest the flow of power between the bidding areas, so different area prices are established. For each Nordic country, the local transmission system operator (TSO) decides into which bidding areas the country is divided. The bidding areas have changed over time, and for the time period analyzed (the years 2004 to 2009), we have considered the following: Sweden (SE), Finland (FI), Western Denmark (DK1), Eastern Denmark (DK2), Oslo (NO1) and Trondheim (NO2). Nord Pool calculates a price for each bidding area for each hour of the following day. The Nord Pool System price (NPS) is calculated based on the sale and purchase orders, disregarding the available transmission capacity between the bidding areas in the Nordic market.
The Nordic area is a good example of a well-linked region. From the early 1990s, these countries made solid foundations for the development of a supra-national market, but despite this fact, the integrity of price areas is not the same (see
Figure 2). The Nordic transmission grid connects the four countries of this area, and congestion between the countries is managed by implicit auctions through Nord Pool Spot. The Nordic electricity grid has several AC and DC inter-connectors to link the different countries in the region and to interconnect adjacent areas. For example, in the period under study (2004 to 2009), the Denmark West-Germany corridor had a capacity of 1500 MW in one direction and 950 MW in the opposite direction. Finland is strongly connected to Sweden (2050 MW Sweden-Finland and 1650 MW in the opposite direction), but weakly with North Norway (100 MW) and Estonia (in 2007 with a capacity of 350 MW). Finland forms its own bidding area. The most weakly linked area is Western Denmark (DK1) because it was part of the Continental European synchronous power system, the former UCTE area (Union for the Coordination of the Transmission of Electricity) and now the Continental European Group of ENTSO-E (European Network of Transmission System Operators for Electricity), whereas Eastern Denmark (DK2) was part of the Nordic synchronous area (the former Nordel, now the Baltic Regional Group of ENTSO-E [
36]). The second one, according to
Figure 2, is the NO1 area (Oslo region) due to the capacity problems of the west coast Swedish corridor. Moreover, the capacity usually available from SE to NO2 and NO3 is limited. The most coherent areas in the period analyzed were FI and SE due to the high transmission capacity between Finland and North Sweden.
3.2. Classification Results
For each electricity market, hourly price series from 2004 to 2009 are used in the analysis. The proposed measures allow us to determine which markets present strong relationships and which ones are not related. Furthermore, the strength of the relation can be measured along the year in order to detect periods with the most or the least price dependency.
For that, the whole time series has been divided into non-overlapping blocks of size
w (block size), and then, given an embedding dimension
m, the distance measures proposed in this paper are computed for each block. The block size selected when computing distance measures usually corresponds to a year approximately (
h) or to a season of the year (
h), because the proposed measures do not depend on the block size
w, and we are interested in studying whether the dependency level is homogeneous along time. However, a suitable combination of embedding
m and block size
w should be chosen when performing the independence test. A general rule for good performance is that the block size
w ought to be roughly
·5·
·
. For example, when the embedding dimension is
, a block size of
·5·
·
is recommended. See [
14] for more details.
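The block partition described above can be sketched as follows. This is only an illustration: the function name is hypothetical, and dropping the incomplete block at the end of the series is an assumption of this sketch, not something stated in the text.

```python
def split_into_blocks(series, w):
    """Split a series into consecutive non-overlapping blocks of size w.

    Assumption of this sketch: observations left over at the end of the
    series (fewer than w) are discarded.
    """
    if w <= 0:
        raise ValueError("block size w must be positive")
    n_blocks = len(series) // w
    return [series[i * w:(i + 1) * w] for i in range(n_blocks)]


# Toy example: ten observations and blocks of size three give three blocks;
# the last observation is left out.
blocks = split_into_blocks(list(range(10)), 3)
print(len(blocks), blocks[-1])
```

A distance measure would then be evaluated once per block, giving one value per window.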
Firstly, we highlight the necessity of removing the seasonal component before the analysis. Note that hourly electricity price series have daily and weekly seasonal components (period = 24 h and period = 168 h, respectively), and these seasonal parts are more relevant (higher values) than the stochastic part of the series. Taking into account this framework, we wondered if the dependence test was appropriate for series with a seasonal behavior. Let
be the original price series of a specific electricity market. In this context, we consider three different ways to remove seasonality in the price series to extract the stochastic component:
Taking weekly seasonal differences:
First taking weekly seasonal differences and then daily differences:
Using the method proposed in [
37]:
where
is the number of weeks used for calibration. This approach is more popular among practitioners because it combines differencing at various lags with moving average smoothing.
Note that the length of the resulting stochastic component is less than the length of the original series in all cases, because the first part of the data cannot be used.
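The first two options can be sketched with ordinary lagged differences, assuming lags of 168 h (weekly) and 24 h (daily) as stated in the text. The helper name and the synthetic series are assumptions of this example, and the third method (from [37]) is not reproduced here. The sketch also shows why the resulting series is shorter than the original one.

```python
WEEK, DAY = 168, 24  # seasonal periods in hours


def seasonal_difference(series, lag):
    """Return the lagged differences x[t] - x[t - lag] for t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Synthetic hourly series (invented): a weekly pattern plus a linear trend.
x = [(t % WEEK) + 0.5 * t for t in range(3 * WEEK)]

# Option 1: weekly seasonal differences (the first 168 values are lost).
weekly = seasonal_difference(x, WEEK)

# Option 2: weekly differences followed by daily differences
# (168 + 24 = 192 values are lost in total).
weekly_daily = seasonal_difference(weekly, DAY)

print(len(x), len(weekly), len(weekly_daily))
```

In this synthetic case, the weekly difference removes the weekly pattern and leaves only the constant contribution of the trend, which the daily difference then removes.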
Let us consider the hourly price series in the whole period 2004 to 2009 of two very different electricity markets, Ontario and Omel, which are far away and have different market regulations. It is clear that the prices of both markets are independent, but the presence of seasonality leads to the wrong conclusion if the seasonal component is not previously removed.
Figure 3 shows the correlograms of the two price series, which reveal clear daily and weekly seasonal components (peaks at Lags 24, 168 and their multiples).
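For readers who wish to reproduce this kind of check, a correlogram is simply the sample autocorrelation evaluated over a range of lags. The following sketch uses the standard sample estimator on a synthetic series with a pure 24 h cycle; the function name and the data are assumptions of the example, not the series used in the paper.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a given non-negative lag."""
    n = len(series)
    mean = sum(series) / n
    variance_sum = sum((v - mean) ** 2 for v in series)
    covariance_sum = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return covariance_sum / variance_sum


# Synthetic hourly series (invented): two weeks of a pure daily cycle.
hours = 24 * 14
daily_cycle = [math.sin(2.0 * math.pi * t / 24.0) for t in range(hours)]

acf_24 = autocorrelation(daily_cycle, 24)  # strong positive peak at one day
acf_12 = autocorrelation(daily_cycle, 12)  # strong negative value at half a day
print(round(acf_24, 3), round(acf_12, 3))
```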
Now, we compute Pearson’s chi-squared, the likelihood-ratio and the Cressie–Read statistics in four different situations: using the original data (without removing the seasonal component) and using the stochastic component extracted in the three ways mentioned above.
Figure 4 shows the results for Pearson’s chi-squared statistic (the other statistics were nearly the same), and the dotted line represents the limit of the rejection region. An embedding dimension of
and a block size of
were chosen for the test. When original data are considered (see
Figure 4a), the statistic lies in the rejection region, so we would conclude that both price series are dependent. However, after removing the seasonal component with any method (see
Figure 4b–d), the statistic indicates independence between the price series. The selection of
and
14,400 leads to the same conclusions.
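The three statistics named above can be sketched in their textbook contingency-table forms. This is a minimal illustration under stated assumptions: the two inputs are already sequences of symbols observed at the same time points, the Cressie–Read parameter is set to 2/3 (a common default), and the degrees of freedom are taken from the symbols actually observed. The exact definitions used in the paper may differ, and the function name is hypothetical.

```python
import math
from collections import Counter


def contingency_statistics(codes_x, codes_y, lam=2.0 / 3.0):
    """Pearson chi-squared, likelihood-ratio and Cressie-Read statistics
    for the contingency table of two paired symbol sequences."""
    n = len(codes_x)
    joint = Counter(zip(codes_x, codes_y))
    row, col = Counter(codes_x), Counter(codes_y)
    pearson = likelihood_ratio = cressie_read = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n  # count expected under independence
            observed = joint.get((a, b), 0)
            pearson += (observed - expected) ** 2 / expected
            if observed > 0:  # empty cells contribute zero to the other two
                ratio = observed / expected
                likelihood_ratio += 2.0 * observed * math.log(ratio)
                cressie_read += (
                    2.0 / (lam * (lam + 1.0)) * observed * (ratio ** lam - 1.0)
                )
    dof = (len(row) - 1) * (len(col) - 1)
    return pearson, likelihood_ratio, cressie_read, dof


# Toy symbol sequences (invented).
independent = contingency_statistics([0, 0, 1, 1], [0, 1, 0, 1])
dependent = contingency_statistics([0, 0, 1, 1], [0, 0, 1, 1])
print(independent)
print(dependent)
```

Under independence, each statistic is compared with a chi-squared quantile for the given degrees of freedom; values above the quantile fall in the rejection region.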
In the rest of the paper, we have applied Weron’s method to all price series before each analysis, so the stochastic components of the price series have been used instead of the original data.
As we mentioned before, the proposed distance measurements can be used to study the strength of the dependency along time. To illustrate this task, let us consider the hourly price series of Finland and Sweden from 2004 to 2009, two electricity markets that are strongly related. First, we compute the dependency statistics with
and
to show a true price dependence between these two electricity markets; see
Figure 5. Note that the resulting series are of a size of 51,768 h after applying Weron’s method, so there are 57 windows of a size of
along the period analyzed.
An embedding dimension of
and a block size of
(a season of the year, approximately) are now selected to evaluate how the dependency level varies over time. Note that the resulting series are of a size of 50,724 h after removing the seasonality through Weron’s procedure and starting on 21 March 2004 (spring). Therefore, there are 23 windows of a size of
along the period analyzed, from spring 2004 to autumn 2009.
Figure 6 reveals that the dependency level is not homogeneous over time. On the one hand, a slight increase in the dependency level can be observed over the years analyzed (the distance presents a decreasing trend). On the other hand, there are some dependency peaks (valleys in the distance graph) in autumn of 2004, spring 2005, spring-summer of 2006, spring-summer of 2007, spring-summer of 2008 and autumn of 2009. Furthermore, note that the four distances provide a similar pattern, but the scales change, except for the uncertainty distance (
) and the Universal Distance 2 (
), which are roughly the same.
To explain, from a physical point of view, the results shown in
Figure 6, it is interesting to consider two aspects. The first is that the share of electricity bought from the power exchange in relation to electricity consumption has increased considerably since Finland and Sweden joined the Nordic power market. For example, in Finland, the share of electricity bought from the Nordic power exchange has increased from 5% to 60% of the Finnish consumption in 2012 [
38]. This means a (potentially) higher dependence between Finland and Sweden (and, obviously, with the Nord Pool area) and explains the slight increase in dependency level along the period shown in
Figure 6. The second is the management of congestions. In the Nordic area, two mechanisms are used: counter trade and congestion rents. The first is used with market agents to relieve both national and inter-regional congestions during the daily network operation. The cost of this mechanism in Finland decreased from 0.86 million euros in 2004 to 0.085 million euros in 2009 [
39]. The second mechanism is the most important to evaluate cross-border congestions, the so-called congestion rents. Congestion rents come up in the situation where transmission capacity between bidding zones is not sufficient to fulfill the demand. The congestion splits the price bidding zones into separate price areas, and the power exchange and TSOs receive congestion income from the congested interconnection. The congestion rents are computed as the product of the commercial flow on the day ahead market and the difference of the area prices. In this way, high levels of congestion rents between two areas in some periods of time mean that these areas were more independent during those periods. Historical congestion rents between Finland and Sweden [
39] have been analyzed (from summer of 2006 to autumn 2009), and they are shown in
Figure 7. Note that the right part of
Figure 6 (starting at window Number 10, which corresponds to summer 2006) and
Figure 7 exhibit similar trend changes.
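The congestion rent computation described above (commercial day-ahead flow times the difference of the area prices) can be illustrated with a short sketch. The function name, the units (MWh and EUR/MWh) and the hourly figures are assumptions made for this example; they are not data from [39].

```python
def congestion_rent(flows_mwh, prices_a, prices_b):
    """Total congestion rent over a period: for each hour, the commercial
    flow between two areas times the absolute difference of their prices."""
    if not (len(flows_mwh) == len(prices_a) == len(prices_b)):
        raise ValueError("all series must cover the same hours")
    return sum(
        abs(flow) * abs(pa - pb)
        for flow, pa, pb in zip(flows_mwh, prices_a, prices_b)
    )


# Toy hours (invented). Hour 1: equal prices, so no rent is collected.
# Hour 2: a 10 EUR/MWh price split with 500 MWh flowing between the areas.
# Hour 3: prices differ but no commercial flow takes place.
flows = [1000.0, 500.0, 0.0]
price_area_a = [30.0, 30.0, 30.0]
price_area_b = [30.0, 40.0, 50.0]
rent = congestion_rent(flows, price_area_a, price_area_b)
print(rent)
```

When the areas share a single price, the rent is zero, which is why periods with high rents point to periods in which the two areas behaved more independently.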
Finally, we study the dependence structure among all of the electricity markets analyzed. First, we compute the corresponding distance matrix, and then, we obtain the hierarchical classification of the markets. The distance matrices are computed for each one of the proposed distance measures (
,
,
and
), for each year of the analyzed period (2004, 2005, 2006, 2007, 2008 and 2009) and for the whole period 2004 to 2009. An embedding dimension of
is selected for individual years and
for the six-year period. As examples,
Table 3 and
Table 4 show the distances between each pair of markets for the six-year period and
Table 5 and
Table 6 for the individual year 2007.
The hierarchical clustering of the electricity markets has been developed from the previous distance matrices and using different linkages (single, complete and average). For instance,
Figure 8 shows the classification results for the whole six-year period, V-Cramer distance and single linkage. Dendrograms for all distance measurements and all linkages reveal the same hierarchical classification. Four clusters can be distinguished: two of them are isolated markets (Omel and Ontario, respectively); the third one consists of the four Australian regions (Victoria, New South Wales, South Australia and Queensland); and the fourth cluster includes all Nord Pool regions (Finland, Sweden, Trondheim, Oslo, East Denmark, West Denmark and the system) together with Austria. Note that West Denmark is weakly related to the rest of the Nordic countries, whereas Finland and Sweden have the strongest price dependency.
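To make the role of the linkage explicit, the following is a minimal sketch of agglomerative hierarchical clustering on a precomputed distance matrix, with single, complete and average linkage. It is a plain textbook implementation written for illustration, not the software used in the paper; the function name, the labels A to D and the distance values are invented.

```python
def agglomerative(dist, labels, linkage="single"):
    """Merge clusters step by step; return (cluster_1, cluster_2, height)."""
    clusters = [[i] for i in range(len(labels))]
    merges = []

    def cluster_distance(c1, c2):
        pairwise = [dist[i][j] for i in c1 for j in c2]
        if linkage == "single":
            return min(pairwise)
        if linkage == "complete":
            return max(pairwise)
        if linkage == "average":
            return sum(pairwise) / len(pairwise)
        raise ValueError("unknown linkage: " + linkage)

    while len(clusters) > 1:
        best = None  # (height, index_a, index_b) of the closest pair so far
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = cluster_distance(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((
            sorted(labels[i] for i in clusters[a]),
            sorted(labels[i] for i in clusters[b]),
            d,
        ))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges


# Toy symmetric distance matrix (invented values, not taken from the tables).
labels = ["A", "B", "C", "D"]
dist = [
    [0.00, 0.20, 0.60, 0.95],
    [0.20, 0.00, 0.50, 0.90],
    [0.60, 0.50, 0.00, 0.97],
    [0.95, 0.90, 0.97, 0.00],
]
merges = agglomerative(dist, labels, linkage="single")
for step in merges:
    print(step)
```

The merge heights are what a dendrogram plots on its vertical axis; in this toy case, D joins last and would appear as an isolated series.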
Note that the clustering approach proposed in this paper produces plausible, non-trivial results that can be intuitively explained in the given scenario. Obviously, the final classification results depend on several aspects jointly, such as the size of the regions, the system’s regulation laws, demand daily patterns, costs for the spinning reserve or fees for cross-border energy transmission. Below, we try to highlight some aspects that partially justify the clustering results in spite of the fact that it is not the aim of the work.
The isolation of the Ontario market in this analysis needs no comment, and that of the Spanish market is also well known. For instance, the capacity of the cross-border connection from Spain to France in 2008 was only 1400 MW (3% of Spanish demand), and France did not join the European Power Exchange (EPEX) initiative until 2009 to 2010. According to the European Association of Regulators (ACER), up to 2010, the percentage of hours with equal hourly day-ahead prices for the pair France-Germany was 0%. In this way, Spain had no possibility of economic or physical linkage with other European markets, such as Nord Pool or Austria, outside the limited possibility of exchange with France. Therefore, it is very unlikely that Omel and Nord Pool were linked through EPEX (via France-Germany) during that period. On the other hand, the dendrograms reveal that Austria exhibits a weak dependence with the Denmark areas. This is due to the fact that Austria and the Denmark areas (DK1 and DK2) are linked through Germany. Austria has a high capacity of cross-border lines with Germany (10020 MW and 3664 MW in 2009). However, from 2004 to 2008, the energy volume traded by the Energy Spot Market in Austria (EXAA, which covers German areas) did not reach 7% of the Austrian overall demand [
40]. In September 2008, the EPEX (Germany-Austria) was founded, but in its first year, it traded less than 17% of the Austrian gross demand of electricity. Hence, the market integration was very weak in that period.
The results obtained for the Nordic regions are in agreement with the integrity levels shown in
Figure 2, where DK1 has the lowest integrity percentage with the rest of regions, whereas FI and SE have the highest one. To explain the hierarchical classification in the case of Australia, two aspects can be considered: first, inter-connectors’ capacity and their constraints, and second, the annual power flows between Australian areas. With respect to annual power flows between areas,
Figure 9 shows a snapshot of the NEM market for 2006/2007 (adapted from [
41]). This figure and the above-mentioned conditions of transmission inter-connectors and physical energy exchanges among regions can explain the distance matrices and dendrograms. From these power flows, it can be seen that NSW needs support from QLD and VIC. On the other side, QLD has a sufficient amount of generation in its area (the area is more independent), and its dependency with VIC and SA is lower than the link with NSW. Finally, SA needs imports from VIC (a net exporter area), but not from NSW (a net importer from VIC and QLD).
In general, dendrograms for each individual year lead to clustering results similar to those of the six-year period, but some differences are worth noting (see
Figure 10). For instance, in 2005, there was a strong dependence between prices of Nord Pool’s system and Oslo (even higher than the dependence level between Finland and Sweden). In 2008, the dependency strength of Oslo’s region with the rest of the Nordic regions went down, and it became the weakest (even lower than the association of West Denmark with the rest of the regions). In that year, the hydropower production in Norway was higher to compensate for lower Swedish production (because the availability of nuclear power plants in Sweden went down during 2008, reaching 65% during some months, especially in November and December) and also due to some problems with the imports from the Central-West European area [
42]. Both facts caused congestion problems in the transmission inter-connectors and a loss of price integrity in the NO1 area. Finally, the dependence scheme of the four Australian regions changed over the years: in 2005 and 2006, NSW and VIC were the most related; in 2007 and 2008, the highest dependency went to the pair SA and VIC; but in 2009, NSW and QLD reached the maximum dependence level.
Although we have focused on electricity prices, the proposed approach could be helpful to study the relationships among other kinds of time series like electricity loads. Below, we consider a set of twelve time series corresponding to the hourly electricity loads in four different regions along three different years (2007, 2008 and 2009). Specifically, we have analyzed the electricity load series of three regions in Australia: New South Wales (NSW), South Australia (SA) and Victoria (VIC); and the load time series of Ontario’s market. The objective is to apply the proposed clustering procedure to this set of time series in order to obtain groups of series that present dependency among themselves.
Recall that the steps of the procedure can be summarized as follows:
First, the seasonal component of the time series must be removed. We suggest using Weron’s method given in (
27), but other techniques can be applied.
Secondly, the resulting time series (after removing the seasonal component) are codified by means of permutations. For that, the researcher has to choose the embedding dimension.
Thirdly, the distance between each pair of time series (through their codes) is computed, and the corresponding distance matrix is obtained. In this step, we propose using four different dissimilarity measures.
Finally, the dendrogram is computed obtaining the clustering results. For that, the researcher has to choose the distance measure and the linkage of the hierarchical method.
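The codification and distance steps listed above can be sketched as follows. This is an illustration under explicit assumptions: each window of m consecutive values is coded by the permutation that sorts it (ties broken by position), the codes of two series are paired at the same time index, and the distance is taken as one minus Cramér's V of the resulting contingency table. The paper's exact definition of its distances may differ, and all function names are hypothetical.

```python
import math
from collections import Counter


def ordinal_codes(series, m):
    """Code each window of m consecutive values by its ordinal pattern."""
    return [
        tuple(sorted(range(m), key=lambda k: series[t + k]))
        for t in range(len(series) - m + 1)
    ]


def cramers_v(codes_x, codes_y):
    """Cramer's V of the contingency table of two paired code sequences."""
    n = len(codes_x)
    joint = Counter(zip(codes_x, codes_y))
    row, col = Counter(codes_x), Counter(codes_y)
    chi2 = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            chi2 += (joint.get((a, b), 0) - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    if k <= 0:
        return 0.0  # only one symbol observed: no association measurable
    return min(1.0, math.sqrt(chi2 / (n * k)))


def dependency_distance(x, y, m):
    """Assumed distance for this sketch: 1 - V, so 0 means full dependence."""
    return 1.0 - cramers_v(ordinal_codes(x, m), ordinal_codes(y, m))


# Toy series (invented): y is an increasing transformation of x, so both
# share exactly the same ordinal patterns.
x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0]
y = [2.0 * v + 1.0 for v in x]
d = dependency_distance(x, y, 2)
print(ordinal_codes(x, 2)[:3], d)
```

Because the codes depend only on the ordering of the values, rescaling or shifting a series leaves the distance unchanged.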
Once we have removed the seasonal component of each time series and we have codified the resulting series, we compute the distance matrices.
Figure 11 shows the distance matrices (Cramér’s V distance and Universal Distance 2) of the twelve time series, using embedding dimension
. Additionally,
Figure 12 shows the corresponding classification results choosing different linkages. The electricity loads of New South Wales for 2007, 2008 and 2009 are denoted by NSW07, NSW08 and NSW09, respectively, and similar notation is used for South Australia (SA07, SA08 and SA09), Victoria (VIC07, VIC08 and VIC09) and Ontario (Ont07, Ont08 and Ont09).
In
Figure 12, two different clusters can be seen: the first one formed by the three load series of Ontario’s market and the second one formed by the nine load series of the Australian market. Moreover, in the second cluster, there are three subgroups that are well separated, one for each year analyzed. Therefore, we can state that the strength of dependency is greater among the Australian regions (NSW, SA and VIC) for a specific year than among the years for a specific region.
In each of the three subgroups of the Australian cluster, we can see that the strongest dependency corresponds to the load series of South Australia and Victoria, whereas New South Wales has the weakest dependency inside its subgroup. On the other hand, the three load series of Ontario present a weak dependency level among them, but high enough to create a different cluster from the Australian load series.
Finally, we compare some of our results with those obtained using a classical clustering approach for time series: a raw data-based approach with the Euclidean distance. In this case, we work directly with the original data; that is, the time series are neither transformed nor codified. Additionally, the Euclidean distance is used as a dissimilarity measure, which is combined with different linkages.
Figure 13 shows the Euclidean distance matrix of the twelve time series also considered in
Figure 11. Recall that the Euclidean distance is not upper bounded; it is very sensitive to transformations; and the proximity notion relies on the closeness of the values observed at corresponding points of time.
Figure 14 shows the corresponding clustering results for the electricity loads of Ontario and Australia over different years.
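The sensitivity of the Euclidean distance to transformations can be seen in a very small example. The sketch below uses invented load values and a hypothetical function name; it shows that a constant level shift, which leaves the shape of a series untouched, produces a large and unbounded distance.

```python
import math


def euclidean_distance(x, y):
    """Euclidean distance between two raw series of equal length."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Toy loads (invented): the second series is the first one shifted up by 10.
load = [1.0, 3.0, 2.0, 5.0]
shifted_load = [v + 10.0 for v in load]
distance = euclidean_distance(load, shifted_load)
print(distance)
```

The two series rise and fall at the same hours, yet their distance grows with the size of the shift. This is consistent with the grouping by region reported for the raw data-based approach, since loads of the same region have similar levels in different years.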
Once again, two clusters can be distinguished: one composed of Ontario’s loads and the other one composed of the Australian loads. However, when we compare
Figure 12 with
Figure 14, an essential difference can be observed. This time, the cluster of the Australian loads is divided into three subgroups corresponding to each region analyzed. Therefore, if we classify this set of time series according to the information that they share (using the clustering approach proposed in the present paper), we get that the strength of dependency is greater among the regions (for each specific year), whereas if we classify them looking for similarities in time, we get that the similarity in time is greater among the years (for each specific region). This example illustrates the importance of choosing a suitable clustering approach and dissimilarity measure depending on the classification purpose.