Optimization of wind-energy bands is at the core of this framework. We provide an expression to compute a band around any forecast and, for that concrete formula, we seek the narrowest band that satisfies a set of constraints, which limit the deviations of the actual process in accordance with its historical behavior. A traditional approach would set constraints to keep the power deviation within certain bounds. By contrast, this work aims at minimizing the expected off-band energy.
This choice is explained by the particulars of the Uruguayan installed generation plant, but it is also justified by trends in new technologies. Around 98% of the electricity consumed annually in this country comes from renewable sources (see [17]). On average, 50% comes from hydroelectric sources, while 35% comes from wind power. All of the hydroelectric dams in Uruguay have water reservoirs; two of them (Bonete and Salto Grande, see Figure 1) are particularly large. Almost 40% of the hydroelectric capacity is located after the largest lake (Bonete’s), which would take 5 months to empty at full power. In effect, the hydroelectric plant also acts as an accumulator, i.e., a kind of battery that can amply compensate for short-term fluctuations of the power coming from wind turbines. Hence, for Uruguayan short-term planning, an accurate prediction of energy bounds is more useful than a power forecast of limited pointwise quality. In addition, smart-grid capabilities are rapidly advancing towards active applications, capable of dynamically adjusting portions of the demand to fit system needs (see [18]), while battery-based electricity storage units are just around the corner (read [19] and also see the “Neoen & Tesla Motors” project in Australia). Therefore, in the near future, this work could be a useful experience for other countries.
The information required to determine an instance of our optimization problem comprises the following data sets. First, we need a historical record of wind-power forecasts. We consider a collection of deterministic records of short-term point forecasts over a horizon of a few days ahead, i.e., a family of vectors of fixed length, where T is set by the number of samples along the time horizon. Here, d is the index of each day on which the construction of a band begins, and D is the set of indices of days with historical observations. Wind-power forecasts usually span from one to three days, i.e., from 24 to 72 h, and time is discretized at a rate of one sample per hour, so T is also the limit of hours ahead available for each forecast. We assume that all forecasts share the same time horizon, and that at the instant of issue (t = 0) the current power is the only datum known for certain. As mentioned earlier, for simplicity the wind power is expressed as the PLF, which corresponds to the actual power generated divided by the total installed capacity of the wind turbines in the system at each moment. Thus, each component of the vector associated with the forecast issued on day d is the normalized point forecast of the wind power t hours ahead.
The second part of the input data set comprises the actual historical wind-power time series, grouped into a collection of per-day vectors whose elements are also assumed to be normalized. Hence, each component is the actual PLF measured t hours after the beginning of day d. For consistency, since the current state can be measured rather than forecasted, the initial value of every forecast equals the actual value at t = 0, for each day d. Observe that this collection usually contains duplicated records: for instance, the sample measured 24 h after the beginning of day d is also the initial sample of day d + 1. Despite that, we have chosen this format to simplify the expressions that link it with the forecast information. For forecasts, however, the analogous equality does not hold: the value forecasted 24 h ahead on day d differs, in general, from the actual value measured a day later.
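The data layout described above can be sketched in a few lines of Python. This is only an illustration under assumed names (`daily_windows`, `series`, `horizon`, `step` are not from the original work), and it assumes that the index t runs from 0 to the horizon. It slices one continuous hourly PLF series into per-day vectors, which makes the duplicated records visible.

```python
def daily_windows(series, horizon=72, step=24):
    """Slice an hourly PLF series into per-day vectors of horizon + 1 samples.

    Window d starts at hour d * step, so consecutive windows overlap and
    share samples. All names here are illustrative.
    """
    windows = []
    start = 0
    while start + horizon < len(series):
        windows.append(series[start:start + horizon + 1])
        start += step
    return windows


# Synthetic hourly PLF values in [0, 1] covering five days (made-up data).
series = [(h % 10) / 10 for h in range(5 * 24)]
w = daily_windows(series)

# The sample measured 24 h after the start of day 0 is the first sample of day 1.
print(len(w), w[0][24] == w[1][0])
```

Storing overlapping windows costs memory, but every day then carries its own complete vector, which is the simplification the text refers to.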
It is clear that, given any two bands that contain the real process at the same instants, the narrower band is of better quality. Wind-power generation is a process that is hard to anticipate, and violations of the computed bands are a fact we must live with. However, not every violation has the same severity in terms of its impact on the power grid. In the context of short-term energy dispatch, how much cumulative energy falls outside the band is a convenient metric to assess the confidence of the pair formed by the forecast and its computed band. In this work, we define expression (1) as a metric for the reliability of a band around a given forecast (its right-hand side corresponds precisely to the anti-reliability, which is the complement of the reliability; hence the notation on the left-hand side). In that expression, two functions determine the upper and lower limits of the band along the forecasted period, and w is the actual generation, unknown until the near future, when reality is revealed. Both functions take a forecast and an instant t as their inputs, while their outputs are the respective bounds to expect.
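As a rough illustration of the metric just described, the following Python sketch computes the energy that falls outside a band. It assumes that the anti-reliability is the off-band energy normalized by the number of samples, as suggested later in the text ("time-normalized cumulated off-band energy"); the function and argument names are hypothetical.

```python
def off_band_energy(actual, lower, upper):
    """Time-normalized energy of `actual` lying outside [lower, upper].

    Each sample contributes its excess above the upper limit or its
    shortfall below the lower limit; the total is divided by the number
    of samples. This normalization is an assumption of the sketch.
    """
    total = 0.0
    for w_t, lo_t, up_t in zip(actual, lower, upper):
        total += max(w_t - up_t, 0.0) + max(lo_t - w_t, 0.0)
    return total / len(actual)


# Made-up PLF samples: one excursion above the band and one below it.
actual = [0.30, 0.50, 0.10, 0.40]
lower = [0.20, 0.20, 0.20, 0.20]
upper = [0.40, 0.40, 0.40, 0.40]
print(off_band_energy(actual, lower, upper))
```

Samples inside the band contribute nothing, so two bands that contain the process at the same instants score equally and can only be ranked by their width.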
As mentioned, the feasible region of the optimization model limits the severity of violations of the band. Besides, to improve quality as far as possible, the model is allowed to discard up to a given number of elements of the training set: atypical, especially bad forecasts that, if included, would either deteriorate the accuracy of the result or force us to use overly broad bands. So, to complete an instance, we must set values for those two quantities. The first parameter limits the amount of energy allowed to fall outside the band along the optimization horizon. The second parameter sets a minimum fraction of regular (i.e., not atypical) forecasts to be used in the effective training set; in other terms, its complement is the maximum fraction of atypical days allowed to be discarded. It is worth mentioning that the limit on off-band energy applies only to regular forecasts.
3.1. Minimal Relative Width of Bands
This work considers bands defined by relative deviations with respect to the forecasted values, which are simple to calculate and optimize, and yet lead to accurate results. Let x be a set of coefficients associated with the time series analyzed, which delimits the width of the band. That is, for any instant t within the time horizon of the forecast issued on day d, we take the corresponding coefficient and compute the lower and upper limits of the band as relative deviations below and above the forecasted value, respectively. Hence, x comprises the first set of control variables, which modulates the relative width of the band for a given forecast.
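A minimal sketch of such a band is given below. It assumes a symmetric band whose limits are the forecast scaled by one minus and one plus the coefficient for each instant; this exact form is an assumption made for illustration, as are the names `relative_band`, `forecast` and `x`.

```python
def relative_band(forecast, x):
    """Return (lower, upper) limits of a band of relative half-width x.

    forecast: PLF point forecasts, one per instant.
    x: non-negative coefficients, one per instant (assumed symmetric band).
    """
    lower = [(1.0 - x_t) * p_t for p_t, x_t in zip(forecast, x)]
    upper = [(1.0 + x_t) * p_t for p_t, x_t in zip(forecast, x)]
    return lower, upper


# Made-up forecast: the band widens as the horizon grows.
forecast = [0.50, 0.40, 0.30]
x = [0.0, 0.2, 0.5]
lower, upper = relative_band(forecast, x)
print(lower, upper)
```

With a zero coefficient the band collapses onto the forecast, which suits the first instant, where the current power is measured rather than forecasted.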
Figure 8 sketches how these variables and the quantities derived from them are related, through a hypothetical forecast (centroid of the band, highlighted in blue), its corresponding energy band (shaded in grey), and the actual power process registered afterwards (red curve).
The objective function of this optimization combines the coefficients x with the average PLF at each time t, computed over a historical record of observations that may differ from the training set. In other words, the sequence of average PLF values corresponds to the asterisks in the rightmost panel of Figure 4, while the objective value matches the average grey area in Figure 8. Whenever forecasts are statistically reliable, the objective function corresponds to the expected absolute PLF area of the band along the period T.
Defined this way, the optimization is not greedy from instant to instant, in the sense that it may worsen the performance at some instants in order to improve the overall performance by gaining more at others. That differentiates this work from related ones (like [7]), whose intention is to track power rather than energy. In fact, this model does not need conventional hypotheses about stochastic processes, such as homogeneity or the Markov property.
The second group of control variables determines which forecasts are regular. For each day d, a variable indicates whether the forecast issued on that day should be considered regular (value 1) or atypical (value 0). Unlike the x variables, these new ones are boolean. We denote by D the set of days for which the optimization problem is implemented (i.e., the training set). The complete combinatorial optimization model is that in (2).
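The auxiliary variables account for how much the actual process (w) violates the band around the forecast, in terms of power, either at the top or at the bottom, for those days classified as regular. For instance, if day d is regular and the actual value at time t lies above the upper limit of the band, the auxiliary variable must be at least as large as that excess to satisfy the corresponding constraint in (2) for that day d at time t. Likewise, if day d is regular and the actual value lies below the lower limit, the auxiliary variable must be at least as large as the shortfall to satisfy its counterpart. For a graphical reference to both situations, please see Figure 8. The optimization process pushes down the values of the auxiliary variables, which ultimately are set to the off-band terms that make up the anti-reliability of (1). Those constraints are always satisfied when day d is atypical, simply by choosing a null auxiliary value for every t; therefore, atypical days are disregarded for violations.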
Given any day d, when it is regular (an effective day of the training set), the second equation guarantees that the time-normalized cumulative off-band energy along the forecasted period T is below the off-band energy limit. That is, in terms of the reliability, the anti-reliability of (1) cannot exceed that limit; so the limit bounds the energy that lies outside the band to a fraction of the installed power plant. As happens with the previous constraints, this equation is automatically satisfied when day d is atypical. Coming back to Figure 8 as a reference instance, by combining these constraints inside an optimization process, we force the total off-band energy (the result of adding up both yellow areas) to be under a desired threshold. Finally, the last equation forces the problem to select at least the required minimum fraction of days as regular, which, combined with the persistence hypothesis, lends plausibility to the result.
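The constraints discussed above can be illustrated with a small feasibility check. The sketch below does not solve model (2); it only verifies whether a candidate pair (coefficients, regular/atypical flags) respects the constraints as described in the text, and reports the average band area. It assumes a symmetric relative band and a per-sample normalization, and every name in it (`check_instance`, `regular`, `eps`, `rho`) is hypothetical.

```python
def check_instance(forecasts, actuals, x, regular, eps, rho):
    """Check a candidate (x, regular) against the constraints described above.

    forecasts, actuals: dict day -> list of PLF samples (same length as x).
    x: relative half-widths per instant; regular: dict day -> bool.
    eps: off-band energy limit; rho: minimum fraction of regular days.
    Returns (feasible, mean_band_area). All names are hypothetical.
    """
    days = list(forecasts)
    # At least a fraction rho of the days must be kept as regular.
    enough_regular = sum(regular[d] for d in days) >= rho * len(days)

    energy_ok = True
    for d in days:
        if not regular[d]:
            continue  # atypical days are not checked for violations
        off = 0.0
        for p_t, w_t, x_t in zip(forecasts[d], actuals[d], x):
            lower, upper = (1.0 - x_t) * p_t, (1.0 + x_t) * p_t
            off += max(w_t - upper, 0.0) + max(lower - w_t, 0.0)
        if off / len(x) > eps:
            energy_ok = False

    # Average band area per sample, over all days, for the symmetric band.
    area = sum(2.0 * x_t * p_t for d in days
               for p_t, x_t in zip(forecasts[d], x)) / (len(days) * len(x))
    return enough_regular and energy_ok, area


# Made-up data: day 1 is a very bad forecast.
forecasts = {0: [0.5, 0.5], 1: [0.4, 0.4]}
actuals = {0: [0.55, 0.45], 1: [0.9, 0.1]}
x = [0.2, 0.2]

print(check_instance(forecasts, actuals, x, {0: True, 1: True}, eps=0.035, rho=1.0))
print(check_instance(forecasts, actuals, x, {0: True, 1: False}, eps=0.035, rho=0.5))
```

In this toy instance the same narrow band is rejected when every day must be regular and accepted once the bad day may be discarded, which is the trade-off that the second parameter controls.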
3.2. Experimental Evaluation
The experimental evaluation of this work is based on formerly open data from the Uruguayan Electricity Market. From that past repository, we chose two independent forecast sources: Garrad Hassan and Meteológica. The data were pre-processed using a power assimilation methodology, which adjusts forecasts along the first 6 h in order to match the starting state. The exact process is described in [3]. The forecasts used from Garrad Hassan were those issued at 1 a.m. between 5 April 2016 and 10 March 2017. Within this period there are 302 days for which both forecast and actual data are complete. For the other provider (Meteológica), the number of complete records is 394, with dates of issue ranging from 1 January 2016 to 10 March 2017. For our own forecast (PSF44), synthesized from a series of actual power records, we used the same 730 days between 2014 and 2016 that were used in the first part of Section 2.1. The best performance was found for the pair of parameter values to which the acronym PSF44 refers.
Throughout this work, we relied on IBM(R) ILOG(R) CPLEX(R) Interactive Optimizer 12.6.3 as the optimization solver. The server was an HP ProLiant DL385 G7, with 24 AMD Opteron(tm) 6172 processor cores and 64 GB of RAM. After running model (2) over a training set comprising around 30% of Meteológica’s days, we obtain bands like those sketched in Figure 9.
The x-axis represents the number of hours ahead for each forecast, while the y-axis corresponds to the PLF. Blue curves are associated with power forecasts, while red ones are the actual values. Finally, the grey area represents the wind-power band for the chosen off-band energy limit and a minimum fraction of regular days equal to 1. Since that fraction equals 1, every day within the training set must be effectively included, so all days are treated as regular. Furthermore, in that case (2) turns out to be a pure linear programming problem, and running times are within a second. Keeping that fraction fixed at 1, it is of interest to explore how the off-band energy limit modifies the bands. Figure 10 shows the result over the same training set for a tighter limit.
Observe that the bands in Figure 10 are wider than those in Figure 9, which is expected since we are less tolerant of how much energy lies outside those bands. To balance reliability and thickness, it is of interest to compute how much area the bands cover as we change the off-band energy limit while keeping the minimum fraction of regular days at 1.
The training in all of the previous cases was performed over a set D of 120 randomly selected days out of a set of 300 days common to all providers. The common complement, i.e., the set of 180 days shared by these three forecasts and not in the training set, is used as the test set. The calibration of PSF44 was performed using the set of 430 contiguous days preceding those of the training and test sets. Experimental evaluation (see the leftmost panel of Figure 11) verifies that the average width of the bands, when trained over the entire training set of forecasts (i.e., with no days discarded), falls rapidly to 0, which is reached for both companies when the off-band energy limit is close to 0.2. Although similar, Meteológica’s bands (blue) are always better than Garrad Hassan’s (red). PSF44 (green) requires much wider bands to achieve the same degree of reliability. The middle plot shows the relative difference between the widths of the original bands (those of the training set) and the widths computed over the test set using the corresponding x vector found for each value of the limit. It is worth mentioning that widths are always similar (divergence is low), so the objective function in (2), when evaluated over the training set, is representative of what happens outside it. This holds even for relatively high values of the limit, right below 0.2, where widths tend to zero and the relative deviation is no longer meaningful.
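The split into training and test sets described above can be reproduced in spirit with the standard library. The sketch below is only illustrative: the seed, the integer day indices and the name `split_days` are assumptions, and the original selection of days is not recoverable from the text.

```python
import random


def split_days(days, n_train, seed=0):
    """Randomly pick n_train days for training; the rest form the test set."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train = set(rng.sample(sorted(days), n_train))
    test = set(days) - train
    return train, test


# 300 days common to all providers, indexed 0..299 for illustration.
train, test = split_days(range(300), 120)
print(len(train), len(test))
```

Keeping the two sets disjoint is what allows the widths measured on the test set to say something about behavior outside the training data.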
Regarding violations of the off-band energy limit when computed over the test set (they do not happen in the training set because of the energy constraint in (2)), the rightmost panel of Figure 11 shows the fraction of those violations, i.e., the fraction of samples where the off-band energy surpasses the limit. They are also low for all forecasts, and become lower as the value of the limit moves away from zero. The previous exercise experimentally justifies the persistence hypothesis on which this technique is based.
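The fraction of violations reported for the test set can be computed as in the following sketch, which assumes that the off-band energy of each test day has already been evaluated with the coefficients found in training. The name `anomalous_fraction` and the sample values are made up for illustration.

```python
def anomalous_fraction(off_band_by_day, eps):
    """Fraction of days whose off-band energy exceeds the limit eps."""
    exceeded = sum(1 for e in off_band_by_day if e > eps)
    return exceeded / len(off_band_by_day)


# Made-up off-band energies for five test days.
test_energies = [0.010, 0.040, 0.020, 0.050, 0.000]
print(anomalous_fraction(test_energies, eps=0.035))
```

On the training set this fraction is zero by construction for regular days, so a low value on the test set is the evidence that supports the persistence hypothesis.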
The goal of this work is to provide stochastic short-term optimal power dispatch schedulers with accurate wind-energy bands, in the context of the Uruguayan Electricity Market. In particular, our interest is in keeping the off-band energy below 10% of the average PLF, which is around 0.35; so we set the off-band energy limit to 0.035. In Uruguay, 35% of electricity comes from wind-power sources, thus that value of the limit corresponds to 1.23% of the average energy consumed, which is ambitious. That value is used as the reference throughout this work.
The other parameter to consider is the minimum fraction of regular days, which addresses the fact that, however accurate a family of forecasts may be, there will always be samples that degrade the overall quality of the whole. Table 2 shows how some attributes of the bands change as that fraction decreases from 1 to 0.6, while the off-band energy limit remains fixed at 0.035 (our target off-band violation).
The first group of columns corresponds to Meteológica forecasts, the second to Garrad Hassan’s, and the last to PSF44 forecasts. These metrics were computed over the test set by using the optimal x coefficients found, for each value of the minimum fraction of regular days, over the samples in the training set. Columns labeled %anomalous indicate the percentage of samples, in each case, whose off-band energy surpasses 0.035 of the total plant factor (the off-band energy limit). We decided to use different adjectives to distinguish between atypical days, i.e., samples intentionally excluded from the training set, and anomalous days, i.e., samples in the test set that by chance surpass the off-band energy limit. The two area columns respectively show the average absolute and relative (%) areas of the wind-power bands over the test set, using 72 as the full plant factor for the time horizon. Finally, the number of seconds spent by the solver to find the optimal solution for each case appears in the column labeled t(s). Observe that as the minimum fraction of regular days decreases, so does the expected width of the energy bands, because the solver is allowed to select down to that fraction of the days during training, and the optimization ends up crafting bands for the best subset containing that fraction of the original number of days. Computation times increase, because (2) becomes combinatorial once days may be discarded. Conversely, the percentage of anomalous days (i.e., those whose off-band energy falls outside the limit) increases, since the calibration performed over a partial/elite training set is no longer representative of the complement (i.e., the test set). A second goal of this work is keeping the percentage of
anomalous days below 10%, which translates into attaining the target off-band energy limit at least 90% of the time. The final goal concerns the allowed variance of wind power. Until now, we have focused on energy rather than power. Keeping the process within narrower bands is equivalent to expecting lower power variations. According to official sources, the total electricity produced by Uruguay during 2017 (to meet internal demand plus energy exports to Argentina and Brazil) was 12,600 GWh. The equivalent hourly average power is 1438 MW. The total installed wind-power plant by late 2017 was 1437 MW (the fact that these final figures match is just a coincidence). Hence, aiming at energy bands whose relative width is below 20% is equivalent to expecting average power fluctuations (either upwards or downwards from the centroid) below 10% of the installed wind-power plant, which matches the average power consumption. In summary, our targets are: an off-band energy limit of 0.035, a relative band area of at most 20%, and %anomalous ≤ 10%. Observe that no record in Table 2 fulfills all these goals simultaneously. In Section 4 we see how to deal with that issue.
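The arithmetic behind these figures, and the way a row of Table 2 could be checked against the targets, can be sketched as follows. The energy and capacity values are those quoted in the text; the helper `meets_targets`, its argument names and the sample row are hypothetical and are not taken from Table 2.

```python
HOURS_PER_YEAR = 8760  # non-leap year, as 2017 was

annual_energy_gwh = 12_600   # electricity produced in 2017 (from the text)
installed_wind_mw = 1437     # installed wind capacity by late 2017 (from the text)

# Average hourly power equivalent to the annual energy, in MW.
average_power_mw = annual_energy_gwh * 1000 / HOURS_PER_YEAR


def meets_targets(pct_area, pct_anomalous, max_area=20.0, max_anomalous=10.0):
    """True when a row satisfies both the area and the anomalous-day targets."""
    return pct_area <= max_area and pct_anomalous <= max_anomalous


print(round(average_power_mw), installed_wind_mw)
print(meets_targets(pct_area=18.0, pct_anomalous=12.0))  # made-up row
```

Because both targets must hold at once, a row with narrow bands but too many anomalous days fails, which is the situation the text reports for every record of Table 2.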