Extreme Quantiles Dynamic Line Rating Forecasts and Application on Network Operation

Abstract: This paper presents a study on a dynamic line rating (DLR) forecasting procedure, aimed at developing a new methodology able to forecast future ampacity values for rare and extreme events. This is motivated by the belief that, to apply DLR, network operators must be able to forecast its values, and that this must be based on conservative approaches able to guarantee the safe operation of the network. The proposed methodology can be summarised as follows: firstly, probabilistic forecasts of conductors' ampacity are calculated with a non-parametric model; secondly, the lower part of the distribution is replaced with a new distribution calculated with a parametric model. The paper also presents an evaluation of the proposed methodology in network operation, suggesting an application method and highlighting its advantages. The proposed forecasting methodology delivers a large improvement in the reliability of the lowest quantiles, allowing perfect reliability for the 1% quantile and a reduction of roughly 75% in overconfidence for the 0.1% quantile.


Introduction
Presently, transmission and distribution system operators (TSOs and DSOs) are confronted by a rising number of challenges, due to several important evolutions in the grid such as the increasing penetration of renewable energy, electricity market deregulation and electricity consumption growth. With the electrical grid being operated closer to its limits, peak power increasing every year, and the traditional solution of grid reinforcement being capital intensive and often too slow because of public opposition, new alternative technologies are being developed. It is in this context that the replacement of the static line rating (SLR) with the dynamic line rating (DLR) is being investigated.
DLR is a technology that exploits measurements of the conductors' thermal state to modify the thermal rating of a circuit dynamically. While the maximum conductor operating temperature is kept constant, variable environmental conditions make it possible to modify, and often increase, the real-time allowable current. This has the potential to increase allowed power flows in conditions of high wind (facilitating renewable energy integration), cold weather (facilitating the management of the winter peak load) or during the night (facilitating both wind energy integration and winter peak load management). It also has the potential to reduce allowed power flows in order to increase the operational safety of circuits. It should be remembered that the rating of an overhead line (also called ampacity or current carrying capacity) depends on the ability of the conductor to dissipate into the environment, through convective and radiative heat exchange, the heat produced by the Joule effect or absorbed from the sun. It is therefore strongly dependent on weather parameters such as wind speed and direction, air temperature and solar radiation. To facilitate the reading, a series of definitions is provided here:
- Real-time line rating (RTLR) is the real current carrying capacity of the circuit. It cannot be measured directly but can be estimated from other measurements, for example, conductor temperature or local weather conditions. The current flowing on the circuit must always be lower than its value.
- Static line rating (SLR) is a value set by the network operator (TSO or DSO) to operate the line. Its value is constant over a year or a season. It should always be lower than the RTLR, although rare, small excursions are in general tolerated.
- Dynamic line rating (DLR), like the SLR, is a value set by the network operator (TSO or DSO) to operate the line, and the same constraints apply. The difference is that it is variable and can be modified dynamically according to weather and load conditions.
As will be clarified in Section 2, the frameworks proposed for setting DLR based on RTLR probabilistic forecasts suggest that low quantiles, in general lower than 5%, should be used. There are two reasons for this. Firstly, errors in the DLR setting can result in losses far higher than the value of the extra electricity transferred, and decision-makers are traditionally very risk-averse. Secondly, DLR uprating above a certain threshold (e.g., 20% or 30% of the SLR) does not have any impact, as other components and other parameters become the new bottlenecks.
This consideration motivated the authors to develop a methodology to calculate improved RTLR forecasts for the lowest quantiles (e.g., below 5%), corresponding to the highest reliable values for DLR. Attention is also given to the resolution of the forecasts, which in this region has been brought down to 0.1%.
For this reason, this work claims a single but major contribution: in line with operational needs, a novel semi-parametric methodology for probabilistic forecasts of overhead line ampacity in the lower tail (or low quantiles) is proposed to strongly improve RTLR forecasting skill. This is considered a necessary step for the application of DLR since safety concerns push for the application of extremely strict approaches for setting this parameter. Furthermore, an economic evaluation of the forecasts value is made with electrical grid simulations, which provides an overview of the impact of the proposed forecast modifications for field operations and benefits in network operation. The results show an interrelation between the quality of the probabilistic DLR forecasts and the strategies employed to select the DLR set points, which is also a novel and relevant result for operational implementation of DLR.
The document is structured as follows: after the introduction in Section 1, the state of the art in DLR and DLR forecasts is presented in Section 2, clarifying the gaps in the current literature and highlighting the importance of this work. In Section 3 the methodology used to calculate the advanced forecasts is described, along with the methodology used for evaluating its benefits and a description of the test case. Results are described in Section 4 where they are reported in terms of sensitivity analysis, statistical evaluation metrics and benefits for network operation. Finally, the methodology and the results are discussed in light of the suitability of this approach in an operational environment and conclusions are drawn in Section 5.

State of the Art
Firstly, it must be clarified that the work presented in this paper is based on previous works by the authors, which have been combined and improved to achieve better performance. In [1], the framework for calculating day-ahead probabilistic forecasts and the framework for evaluating their impact on network operation were established. In [2,3], the improvement of extreme quantile forecasting for wind power production was explored. To help the reader, descriptions of the models developed in [1,2] are presented in Sections 3.1 and 3.2, respectively.
As mentioned above, DLR is a technology that exploits measurements of the conductors' thermal state to modify the thermal rating of a circuit dynamically. The DLR is a value set by the network operator and should always be lower than the RTLR [A], which can be calculated as shown in Equation (1). This requires the maximum allowable conductor operating temperature $T_c$ [K] to be set and the heat balance on a section of the conductor to be calculated as described in [4] or [5]. These parameters influence the convective heat exchange $P_c(T_c, T_a, W_s, W_d)$ [
Traditionally, network operators apply static seasonal ratings (SLR) considering conservative and fixed values of the environmental parameters, such as $W_s$ = 0.5 m/s, $T_a$ = 5, 15 and 25 °C for winter, spring/autumn and summer, respectively, and $S_r$ = 0 or 1000 W/m² as suggested in [5]. This approach has two main drawbacks: it reduces asset utilisation by leaving a large unexploited headroom, with the average RTLR sometimes equal to double the SLR on a single line [6], and it allows the SLR to be higher than the RTLR for a small part of the year [7].
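As an illustration of how a rating follows from the conductor heat balance, the sketch below solves a steady-state balance for the current. It assumes the common form in which Joule heating plus solar gain equals convective plus radiative cooling at the maximum conductor temperature. The function name, the argument names and all numerical values are hypothetical and are not taken from the paper; the heat terms are treated as given inputs per unit length.

```python
import math

def steady_state_ampacity(p_conv, p_rad, p_solar, r_ac):
    """Current [A] that keeps the conductor at its maximum temperature.

    Assumed steady-state balance per unit length (W/m):
        I^2 * r_ac + p_solar = p_conv + p_rad
    p_conv, p_rad : convective and radiative cooling at T_c (W/m)
    p_solar       : solar heating (W/m)
    r_ac          : conductor resistance at T_c (ohm/m)
    """
    net_cooling = p_conv + p_rad - p_solar
    if net_cooling <= 0:
        return 0.0  # cooling cannot even offset the solar gain
    return math.sqrt(net_cooling / r_ac)

# Illustrative numbers only: the same line in low and high wind.
low_wind = steady_state_ampacity(p_conv=30.0, p_rad=10.0, p_solar=8.0, r_ac=8e-5)
high_wind = steady_state_ampacity(p_conv=90.0, p_rad=10.0, p_solar=8.0, r_ac=8e-5)
```

With these made-up inputs, a threefold increase in convective cooling raises the admissible current from roughly 630 A to roughly 1070 A, which is the kind of headroom that fixed seasonal assumptions leave unused.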
The use of DLR, although associated with additional costs in measurement, control and protection equipment, has the potential to overcome these problems. A series of studies have shown the interest for this approach on system performance, distribution network and system operation.
Regarding system performance, in [8] the behaviour of a transmission network with and without DLR is studied. It is found that only a limited number of lines benefit from the application of DLR, and that DLR has value in conductor ageing estimation and in decision making related to emergencies. In [9], the economic benefits of DLR application are evaluated and the feasibility of a DLR project is assessed. With a test on the Swedish-Finnish interconnection, it is shown that net transfer capacity is at least equivalent to 140 MW, increasing to 200 MW in 99% of the cases. It is also stated that day-ahead forecasts are necessary to take full advantage of this technology. Finally, in [10] the application of DLR to system scheduling with high penetration of wind shows that network congestions are avoided.
Regarding application to the distribution network, in [11] a reliability analysis is performed on a standard distribution network where DLR is applied and which is exposed to Finnish weather, showing a reduction of curtailment in the range of 48-84% and savings in damage costs in the range between 5.6% and 7.4%. In [12], the hosting capacity of a network where DLR is applied is calculated, showing the unlocking of considerable hosting capacity, equivalent to 2.2 times that of the business-as-usual scenario.
Regarding system operation safety, [13] highlights that the application of DLR in the test network allows an increase of between 32% and 75% in the total transmission capacity while also increasing operational safety, relieving network operators of part of their risks. In [14], a methodology to incorporate DLR into network reliability analysis is developed, confirming the ability of DLR both to accommodate larger wind generation capacity and to increase network reliability. Finally, in [15] a reliability assessment of a DLR-enabled network was conducted, highlighting the importance of line selection and of measurement placement and redundancy.
Most of the studies above consider reactive control systems or policies based on the observation of RTLR. Although this has the potential to exploit a large amount of the available headroom, it has a series of problems when implemented. As mentioned above, the effective possible uprating is not far higher than the currently used SLR because of other network constraints. Furthermore, to be integrated into current network operation activities, the available headroom must be known in advance.
This drives research on RTLR forecasting, a field of large interest and under growing scrutiny by the research community. An extensive overview of this topic, covering the period until 2015, can be found in [16]. Among the most recent works on RTLR forecasting, the following can be cited: [17], where ensemble weather forecasts are used along with a Monte Carlo model to calculate day-ahead RTLR probabilistic forecasts, and [18], where conditional heteroscedastic auto-regressive models are used to provide short-term probabilistic forecasts of RTLR. In [19], machine learning is used to calculate day-ahead forecasts of RTLR.
These works suggest the use of low quantiles, in general lower than 5%, to set DLR from RTLR forecasts, but they use forecasts optimised to maximise performance over the whole probability distribution. The same is done in [1] by the authors of this paper, but the performance of the forecast tends to be lower in the tail of the distribution because of the more extreme and rare values found there.
Among the latest research in this field, two works focusing on DLR forecasts and their application are worth mentioning. In [20], affine arithmetic has been proposed for estimating confidence intervals for DLR. This is an interesting approach already proposed in [21] for thermal state estimation of overhead line conductors. Unfortunately, the presentation of the results does not allow a comparison of performance between this and other models. On the other hand, in [22] a chance-constrained methodology for applying DLR probabilistic forecasts to network operation is proposed.
Currently, no research has been done on how to provide RTLR forecasts with such levels of probability, but several studies in other fields show an interest in semi-parametric methods. The central quantiles are provided by an initial model, and the extreme quantiles are provided by modelling the tail of the probabilistic forecast with a distribution of a given shape. As an example, [23] proposes modelling the quantile forecasts of electricity prices for levels of probability lower than 5% using an exponential interpolation. Considering a tail parameter ρ, the authors use the same parameter for all of the considered forecasts. In [2], where forecasts are carried out to determine the net transfer capacity between Portugal and Spain, it is also proposed to model forecasts for quantiles lower than 5% using exponential interpolation, but the values of ρ are equal to the average power, depending on the class to which the median forecast belongs. In this paper, the authors propose to use a similar approach for RTLR forecasting, made conditional on a set of explanatory variables.

Contribution to Knowledge
In summary, with respect to the existing literature, this work focuses on calculating highly reliable DLR forecasts, improving the performance for the lowest quantiles of the distribution. In addition, it proposes a methodology for applying extreme quantile forecast improvements to DLR selection and network operation and for evaluating their benefits. This is important since conductor ampacity must be set to values with a very low probability of being overestimated, in order to keep the operating temperature below the maximum operating value (e.g., 75 °C). When this is done probabilistically, it means choosing values corresponding to the distribution's tail and to very low probability levels (<1-2%). This has two main consequences. Firstly, it is not important to consider the full probability distribution, so typical metrics, such as the continuous ranked probability score (CRPS) and the Brier score, lose their relevance and must be replaced by other scores, such as the quantile score sum (QSS) used in this paper. Secondly, values in the distribution's tail correspond to rare and extreme events and are therefore much more difficult to predict and to model than averages.
The basic methodology proposed in [2] is enhanced with a clustering approach to provide a conditional estimation of the extreme quantiles associated with DLR, and is applied to the bi-level stochastic optimization described in [1] to evaluate the benefit of highly accurate DLR probabilistic forecasts for quantiles below 5%. This result is not automatic nor known in advance: in fact, wind power is a bounded variable limited between 0 and the rated power of the wind turbine or power plant, whilst RTLR is unbounded in the region of observation. The evaluation methodology proposed in [1] is enriched both to facilitate the comparison of results and to highlight the effect of the higher granularity allowed by the more precise distribution tail. The combination of these three steps, described in Sections 3.1-3.3, respectively, allows for a performance that cannot be obtained by improving the single parts individually.

Methodology
The methodology proposed to obtain improved RTLR forecasts can be summarised in the following steps:
(1) Probabilistic forecasts for the RTLR are calculated. They are relative to the day ahead and have a time resolution of 1 h. This action is described in detail in Section 3.1.
(2) The lower part of the distribution is improved by separate modelling. This action is described in detail in Section 3.2.
(3) The parameters for the distribution's tail model are further refined by calculating them for different clusters of forecasting results. This action is described in detail in Section 3.3.
Finally, the impact of the improved forecasts on network operation is evaluated with the methodology described in Section 3.4, on the test case described in Section 3.5.

Probabilistic Forecasting Algorithm
The definition of a probabilistic forecast can be summarised as follows. A forecast $\widehat{RTLR}^{\tau}_{t+h|t}$, made at an instant t for an instant t + h, is defined to have a probability τ ∈ [0%, 100%] of being higher than the future observation $RTLR_{t+h}$, as summarised in Equation (2):

$$P\left(\widehat{RTLR}^{\tau}_{t+h|t} > RTLR_{t+h}\right) = \tau \qquad (2)$$

In this paper, the considered model is a quantile regression forest (QRF) [24]. This is a non-parametric model, selected based on the results shown in [19] and [1], where QRF outperformed other approaches. Although the forecast model used here has already been described in [1], a summary is presented to facilitate the reader's understanding.
The QRF is a machine learning ensemble method based on the generation of n decision trees, each one trained with a randomly selected subset of features and data. Each tree is trained to predict, with the highest accuracy, the value of the observed parameter Y at time t + h using the explanatory variables X known at time t. The outputs of all the trees are then concatenated and sorted, and the quantile forecasts are drawn from the sorted list. Apart from the better performance observed, another advantage of a QRF is that it is easy to configure: the model outputs converge when the number of decision trees becomes high, and a single QRF directly provides all the quantile forecasts. As no hypothesis is made on the shape of the resulting probability density function, the method is said to be non-parametric.
A visual representation of the forecast method is presented in Figure 1. It must be clarified that the QRF model has been selected after an extensive evaluation of several prediction models carried out in [1], where the proposed QRF model consistently outperformed other models based on quantile linear regression, mixture density neural networks and kernel density estimation, over metrics such as the continuous ranked probability score, quantile score, reliability and sharpness.
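The pooling step described above (concatenate the tree outputs, sort them, read off the quantiles) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the rank convention and the three hypothetical "trees" are assumptions, and practical QRF libraries differ in how the observations stored in the leaves are weighted.

```python
import math

def pooled_quantiles(tree_outputs, taus):
    """Concatenate per-tree outputs, sort them, and read off quantiles.

    tree_outputs : list of lists, the values returned by each tree
    taus         : probability levels in (0, 1]
    Convention assumed here: the ceil(tau * n)-th smallest pooled value.
    """
    pooled = sorted(v for tree in tree_outputs for v in tree)
    n = len(pooled)
    result = {}
    for tau in taus:
        # round() guards against floating-point noise before taking the ceiling
        rank = max(1, math.ceil(round(tau * n, 9)))  # 1-based rank
        result[tau] = pooled[rank - 1]
    return result

# Three hypothetical trees, each returning a few RTLR values in amperes.
trees = [
    [610.0, 655.0, 700.0],
    [590.0, 640.0, 720.0],
    [625.0, 660.0, 690.0, 705.0],
]
q = pooled_quantiles(trees, taus=[0.1, 0.5, 0.9])
```

Because every quantile is read from the same sorted list, a single fitted forest yields all probability levels at once, which is the configuration advantage mentioned in the text.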
In this case, a separate model, made of n = 1000 independent trees, is calculated for each forecast horizon h using the following information: (1) Information relative to time t, with five features: observations of the four available weather variables (wind speed, wind direction, air temperature and solar radiation) at time t and for the position of the weather station. These data are not subjected to a feature selection process thanks to the ability of the QRF model to avoid the utilisation of non-meaningful parameters.

Modelling of the Tail of the Distribution
To improve the results of the non-parametric model described above, the quantile forecasts for quantiles below 5% are not taken directly from the model of Section 3.1 but are modelled with a probability distribution function of pre-defined nature. Since this is a known function characterised by certain parameters, the method is parametric. This means that the probability distribution associated with the variable $RTLR_{t+h}$ has a parametric shape for levels of probability lower than a given threshold $\tau_{lim}$. Here, as in [19,25], this threshold is set equal to 5%. Regarding the parametric function, the exponential distribution proposed in [2] has been selected. Other solutions, such as a linear function or the use of extreme value theory, have been discarded after initial tests with insufficient results. In the tail model, $F(\widehat{RTLR}^{\tau}_{t+h|t}, \rho)$ is the parametric cumulative distribution function for the lower tail of the distribution, $\widehat{RTLR}^{\tau}_{t+h|t}$ is the RTLR predicted value of the τ quantile calculated at time t for horizon h, ρ is the function shape parameter and $\tau_{lim}$ is the quantile limit indicating where the parametric model is applied for modelling the distribution's tail.
The problem is now to estimate the best value for the parameter ρ. It is suggested in [25] that it should depend only on the value of the median forecast (i.e., $\widehat{RTLR}^{50\%}_{t+h|t}$). Such an approach is not appropriate for RTLR forecasting, since two probabilistic forecasts with the same 50% quantile forecast might have very different standard deviations. RTLR depends on several weather characteristics and, for example, different combinations of temperature and wind speed could be associated with equal median forecasts but have very different uncertainty levels.
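To make the idea of a parametric lower tail concrete, the sketch below assumes one simple exponential form: below the limit quantile, the cumulative probability decays as F(y) = τ_lim · exp((y − q_lim)/ρ), where q_lim is the τ_lim quantile supplied by the non-parametric model. This form, the function names and the numbers are assumptions chosen for illustration and are not necessarily the expression used by the authors.

```python
import math

def tail_cdf(y, q_lim, rho, tau_lim=0.05):
    """Assumed lower-tail CDF: tau_lim * exp((y - q_lim) / rho) for y <= q_lim."""
    if y > q_lim:
        raise ValueError("the tail model only applies below the tau_lim quantile")
    return tau_lim * math.exp((y - q_lim) / rho)

def tail_quantile(tau, q_lim, rho, tau_lim=0.05):
    """Inverse of tail_cdf: quantile forecast for a level tau <= tau_lim."""
    if not 0.0 < tau <= tau_lim:
        raise ValueError("tau must lie in (0, tau_lim]")
    return q_lim + rho * math.log(tau / tau_lim)

q_lim = 600.0  # hypothetical 5% quantile from the non-parametric model (A)
rho = 25.0     # hypothetical shape parameter (A)
q_1 = tail_quantile(0.01, q_lim, rho)    # 1% quantile under the assumed tail
q_01 = tail_quantile(0.001, q_lim, rho)  # 0.1% quantile under the assumed tail
```

Under this assumed form a larger ρ stretches the tail, so the 1% and 0.1% quantiles move further below the 5% quantile; a single scalar therefore controls how conservative the extreme quantiles are.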
Thus, the authors here propose to estimate ρ values depending on whether probabilistic forecasts are present in specific clusters defined with two quantile forecasts. This is done to consider both the information related to the median values and the standard deviation of the initial forecast.
It must be observed that several parametric distributions could be used to model the lower part of the RTLR distribution, but after initial tests, it was observed that in this particular problem the exponential distribution outperformed the others.

Clustering
Here forecasts are clustered according to the relative values of two forecasted quantiles (e.g., 5% and 50%) and for each cluster, an optimal value of ρ is searched.
To do this, a set of N forecasts considered to have similar properties to the operational forecasts is generated by k-fold cross-validation of the initial training set, resulting in an ensemble of N couples of quantile forecasts. With this ensemble, clusters are defined.
Regarding the generation of such clusters, in [2] it is proposed to define them as n intervals $[a_i, a_{i+1}]$, $a_{i+1} \geq a_i$. These intervals are designed with the N observed probabilistic forecasts so that the same number of median forecast values belongs to each interval, to within one unit.
Such a structure seems less suitable in our case for a space with two or more dimensions, and it is here preferred to generate a structure with unsupervised learning methods, here a k-means clustering process. The structure of the n clusters is defined so as to minimise the following expression:

$$\sum_{k=1}^{n} \sum_{j \in \text{cluster } k} \left\lVert x_j - \mu_k \right\rVert^2$$

where $x_j$ are the coordinates of the j-th observation of cluster k, and $\mu_k$ is the barycentre of cluster k.
The Lloyd algorithm [26], able to identify a series of equally sized clusters, is employed here to find the clusters and, to obtain the same clusters on every run of the process, 200 initialisations of the clustering are applied. To train the clustering, the quantile forecasts are normalised using the maximal and minimal values observed on the training set:

$$\widehat{RTLR}^{\tau}_{norm} = \frac{\widehat{RTLR}^{\tau} - \min\left(\widehat{RTLR}^{\tau}\right)}{\max\left(\widehat{RTLR}^{\tau}\right) - \min\left(\widehat{RTLR}^{\tau}\right)}$$

Considering all of the observations below the 5% quantile forecasts for each cluster, the ρ parameters are found for each cluster so as to maximise the value of the likelihood.
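The clustering and fitting steps can be sketched with the standard library only. The block normalises two quantile forecasts with a min-max scaling, runs Lloyd iterations from several random starts and keeps the lowest-cost partition, then fits one ρ per cluster. The ρ estimate assumes that the shortfall of an observation below its 5% quantile forecast is exponentially distributed with scale ρ, in which case the maximum-likelihood estimate is the sample mean of the shortfalls. That assumption, the helper names and the six toy forecasts are all hypothetical.

```python
import random

def min_max(values):
    """Scale values to [0, 1] with the minimum and maximum of the training set."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centres):
    """Index of the nearest centre for every point."""
    return [min(range(len(centres)), key=lambda c: dist2(p, centres[c]))
            for p in points]

def lloyd(points, k, n_init=200, n_iter=100, seed=0):
    """Lloyd iterations from several random starts; keeps the lowest-cost run."""
    rng = random.Random(seed)
    best_cost, best_labels = None, None
    for _ in range(n_init):
        centres = rng.sample(points, k)
        for _ in range(n_iter):
            labels = assign(points, centres)
            updated = []
            for c in range(k):
                members = [p for p, lab in zip(points, labels) if lab == c]
                # Keep the old centre if a cluster happens to be empty.
                updated.append(tuple(sum(x) / len(members) for x in zip(*members))
                               if members else centres[c])
            if updated == centres:
                break
            centres = updated
        labels = assign(points, centres)
        cost = sum(dist2(p, centres[lab]) for p, lab in zip(points, labels))
        if best_cost is None or cost < best_cost:
            best_cost, best_labels = cost, labels
    return best_labels

# Toy 5% and 50% quantile forecasts (A) forming two obvious groups.
q05 = [400.0, 410.0, 420.0, 800.0, 810.0, 820.0]
q50 = [600.0, 620.0, 610.0, 1100.0, 1120.0, 1110.0]
points = list(zip(min_max(q05), min_max(q50)))
labels = lloyd(points, k=2)

# (forecast index, shortfall of the observation below the 5% quantile forecast)
shortfalls = [(0, 10.0), (2, 20.0), (3, 40.0), (5, 60.0)]
rho = {}
for c in set(labels):
    d = [s for i, s in shortfalls if labels[i] == c]
    if d:  # assumed exponential shortfalls: the MLE of rho is their mean
        rho[c] = sum(d) / len(d)
```

Repeating the initialisation many times and keeping the partition with the lowest within-cluster sum of squares is what makes the result stable from one run to the next.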
One difficulty is the need for a sufficient number of observations below the 5% quantile forecasts to obtain statistically relevant results. For this reason, clustering has been applied to forecasts related to groups of six forecast horizons (24 to 29, 30 to 35, 36 to 41 and 42 to 47 h, respectively). With the dataset used, this yields roughly 110 observations (365 × 6 × 5% = 109.5) for the quantiles between 0% and 5% in each group. This means that if 10 clusters are searched, roughly ten observations will fall in each of them. This approach proved to be sufficient to avoid instability in the results.

Evaluation
To evaluate the benefit of applying predictive DLR in network operation, in [1], a bi-level stochastic optimization process for setting DLR forecasting in operational planning has been proposed.
Again, a summary is provided here, but the reader is invited to consult [1] for more details. The evaluation is done through a simulation of the behaviour of the network, using DLR values set on selected lines from the RTLR forecasts calculated above. Based on these values, the optimal generator output is calculated with an optimal power flow. It is assumed that these lines are equipped with DLR monitoring sensors able to alert the network operator in case of thermal rating infringement. In this event, the network operator activates reserves in opposite parts of the network to reduce the power flow on the congested lines. This results in a higher energy cost due to the de-optimisation of the generation park and an additional cost due to the activation of the reserves.
The bi-level stochastic optimisation problem is described in Equations (6)-(10), where f represents the objective function and g the constraints of the optimal power flow. Capital letters (F, G) indicate the leader problem: an optimal power flow in which the risk associated with wrong forecasts and reserve activation is included in the objective function and in which different values of the DLR are tested. Lower-case letters (f, g) indicate the follower problem: a traditional optimal power flow in which the value of the DLR is fixed, with $y \in \arg\min_{z \in Y} f(x, z)$.
In these equations, $N_g$ is the set of conventional generators; $I_g$ is a binary variable, the value 1 describing a committed generator; $\pi^{fuel}_g$ is the fuel price for generator g (€/MWh); $\pi^{fix}_g$ is the commitment price for a conventional generator g (€/h); $P_g$ is the scheduled output of generator g (MW); $\pi^{hup}_g$ and $\pi^{hdo}_g$ are the prices for holding up and down reserve for a generator g (€/MWh); $H^{up}_g$ and $H^{do}_g$ are the up and down reserve holding amounts for a generator g (MW); $N_s$ is the set of potential future realisations, each of which has a probability $\rho_s$ of occurring; $\pi^{rup}_g$ and $\pi^{rdo}_g$ are the reserve activation prices (€/MWh); and $R^{up}_{g,s}$ and $R^{do}_{g,s}$ are the activated reserves from a generator g in scenario s (MW). In this problem, the upper-level decision vector x includes the allocated reserves ($H^{up}_g$/$H^{do}_g$), the activated reserves in case of the realisation of scenario s ($R^{up}_{g,s}$/$R^{do}_{g,s}$), and the value of the forecasted RTLR, which appears in the constraints $G_i$ and $g_j$. The lower-level decision vector y comprises the planned production levels ($P_g$) and the list of activated generators ($I_g$).
The function $v_\beta$ accounts for the risk aversion of the TSO. In this work, three functions are considered: (1) a linear function, which decreases the total costs paid by the TSO, shown in Equation (11); (2) a quadratic function, by which the TSO decreases the number of situations in which reserve costs are high, shown in Equation (12); and (3) an exponential function, described in Equation (13), with an effect similar to that of the quadratic one.
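The leader/follower structure and the role of the risk function can be illustrated with a deliberately small toy problem. The sketch below is not the formulation of Equations (6)-(13): it assumes a single line feeding a load from a cheap remote generator, an expensive local generator, three RTLR scenarios, ratings expressed directly in MW, and two hypothetical risk functions (a linear one and a quadratic one of the form c + βc²). All names, prices and probabilities are invented for illustration.

```python
def follower_dispatch(dlr, load, cheap_price, local_price):
    """Follower: least-cost dispatch for a fixed line limit (quantities in MW)."""
    flow = min(load, dlr)   # cheap remote generation, limited by the line rating
    local = load - flow     # remaining demand met by expensive local generation
    return flow, flow * cheap_price + local * local_price

def leader_cost(dlr, scenarios, risk, load=100.0, cheap_price=20.0,
                local_price=50.0, reserve_price=200.0):
    """Leader: dispatch cost plus risk-weighted expected reserve activation cost."""
    flow, dispatch_cost = follower_dispatch(dlr, load, cheap_price, local_price)
    penalty = 0.0
    for prob, rtlr in scenarios:
        shortfall = max(0.0, flow - rtlr)  # flow removed if the real rating is lower
        penalty += prob * risk(shortfall * reserve_price)
    return dispatch_cost + penalty

def linear(cost):
    """Hypothetical risk-neutral weighting."""
    return cost

def quadratic(cost, beta=1e-3):
    """Hypothetical risk-averse weighting that penalises large reserve costs."""
    return cost + beta * cost ** 2

# (probability, realised RTLR) pairs and candidate DLR set points, all invented.
scenarios = [(0.05, 70.0), (0.45, 90.0), (0.50, 110.0)]
candidates = [70.0, 90.0, 110.0]
neutral = min(candidates, key=lambda d: leader_cost(d, scenarios, linear))
averse = min(candidates, key=lambda d: leader_cost(d, scenarios, quadratic))
```

In this toy case the linear weighting selects the intermediate set point, accepting a 5% chance of a reserve activation, while the quadratic weighting falls back to the lowest candidate. This mirrors the interrelation between forecast quality in the tail and the strategy used to select the DLR.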

Test Case Description
To carry out the evaluation, this study uses the test case already described in [1] in terms of weather, network topology, generation and load characteristics. Measurements from six weather stations in the United Kingdom are used to calculate the estimated RTLR along the line. These, along with numerical weather predictions from the European Centre for Medium-Range Weather Forecasts (ECMWF), are used to calculate RTLR forecasts, as described above. The considered forecasts are provided by a model that runs every day at noon and generates forecasts for horizons from 24 h to 47 h. The model is first trained using one year of data, then the training is repeated every month with a growing dataset. The forecasts employed for the clustering process are provided by a k-fold cross-validation carried out on the first year of data, with k selected such that 11 months of the remaining data are used to train the model.
The network used is the IEEE 24-bus grid represented in Figure 2, with the same characteristics as those in [27]. Here Lines 8-9 and 6-8 are considered to be equipped with DLR, and the same value of RTLR is considered for both lines. As explained in [28], it is on these lines that the first thermal overload is experienced on the network. The DLR is then limited to a maximum of 125% of the SLR since higher power flows will anyway result in other constraints on the network being breached.
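The cap on the uprating can be written as a one-line rule. The sketch below assumes that the DLR set point is simply the selected RTLR quantile forecast, limited to 125% of the SLR; the function name and the numerical values are hypothetical.

```python
def set_dlr(rtlr_quantile_forecast, slr, max_uprating=1.25):
    """DLR set point: the chosen RTLR quantile forecast, capped at max_uprating * SLR."""
    return min(rtlr_quantile_forecast, max_uprating * slr)

slr = 500.0  # hypothetical static rating (A)
derated = set_dlr(450.0, slr)   # forecast below the SLR: the rating is reduced
uprated = set_dlr(580.0, slr)   # forecast between the SLR and the cap: used as is
capped = set_dlr(700.0, slr)    # forecast above the cap: limited to 125% of the SLR
```

Because of the cap, only the part of the forecast distribution below 125% of the SLR affects the set point, which is one reason why the accuracy of the low quantiles matters more than that of the upper part of the distribution.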

Numerical Results
In this section, several aspects related to the evaluation of the proposed methodology are presented. Firstly, a sensitivity analysis of the impact of the number of clusters is shown in Section 4.1. Finally, the improvements in network operation are presented in Section 4.2 for the two strategies considered. To illustrate the impact of the process, Figure 3 compares the same example of forecasts as in [1] with forecasts generated for the same period using the proposed tail modelling process.
In Figure 3 it is possible to see, in red, the observed value of the RTLR and, in blue shades, the different prediction intervals. The lower bands correspond to the distribution's tail, between 0.02% and 2%. The lower chart, corresponding to the model presented in this paper, depicts far lower values for the RTLR at the 0.2% quantile, showing that the new methodology describes the distribution's tail better and identifies more potentially dangerous situations.
This comparison shows that the low quantile forecasts obtained with the proposed method are very different from the initial ones. The improvements are evaluated through metrics specific to probabilistic forecasts.

Clustering Sensitivity Analysis
An example of the results from the clustering process is presented in Figure 4. Here the features used for clustering are the values of the 5% and 20% quantiles. Nine clusters are identified and, for each of them, the value of the parameter ρ is reported.
The purpose of this section is to show the evolution of forecast performance according to different aspects of the clustering process, such as the selection of the features used for clustering and the number of clusters considered. These combinations are tested according to two main indicators. Regarding the evaluation of probabilistic forecasts, three properties are generally considered:
- Reliability or calibration, which measures whether the empirical probabilities asymptotically approach the nominal probabilities.
- Sharpness, which quantifies the uncertainty of the probabilistic forecasts, and can numerically correspond to computing the average interval size between two symmetric quantiles.
- Resolution, which is the ability of the model to provide different forecast intervals conditioned on the forecast conditions.
For a perfectly reliable forecast, resolution and sharpness are equivalent [29], so only reliability and sharpness are considered here.
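The two indicators retained can be computed directly from paired forecasts and observations. The sketch below is an assumption about how such indicators are commonly evaluated rather than the exact procedure behind the tables: reliability is expressed as the observed frequency of observations falling below the quantile forecast, relative to the nominal level (100% meaning a perfectly reliable quantile), and sharpness as the mean width between two quantile forecasts. Function names and data are hypothetical.

```python
def empirical_frequency(forecasts, observations):
    """Share of observations that fall below their quantile forecast."""
    hits = sum(1 for f, y in zip(forecasts, observations) if y < f)
    return hits / len(observations)

def relative_reliability(forecasts, observations, tau):
    """Observed frequency as a percentage of the nominal level tau."""
    return 100.0 * empirical_frequency(forecasts, observations) / tau

def sharpness(lower_forecasts, upper_forecasts):
    """Mean width of the interval between two quantile forecasts."""
    widths = [u - l for l, u in zip(lower_forecasts, upper_forecasts)]
    return sum(widths) / len(widths)

# Ten hypothetical 10% quantile forecasts (A) and the matching observations.
q10 = [600.0] * 10
obs = [590.0, 650.0, 700.0, 580.0, 640.0, 660.0, 720.0, 610.0, 690.0, 705.0]
rel = relative_reliability(q10, obs, tau=0.10)  # two of ten observations fall below
width = sharpness([600.0, 610.0], [700.0, 730.0])
```

A value above 100% indicates an overconfident quantile, meaning that the forecast is exceeded by the observation less often than intended and the rating is overestimated more often than the nominal level allows.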
Several pairs of quantiles are considered, and these can be split into two categories: pairs focused on the whole distribution, such as (10-90%), and pairs focused on the lower part of the distribution, such as (5-10%). Table 1 shows the evolution of the reliability of the forecasts for the 1% and 0.1% levels of probability when these different pairs are used for generating the clusters and the proposed modification is applied. Better performances are represented by values closest to zero. The values are reported for each couple of quantiles and each number of clusters tested. Rows labelled 'no' represent the performance of the forecasts obtained with the original method, without the exponential tail modelling. The improvements are significant: the relative frequency of overestimation decreases from 180% to 100% for the 1% quantile forecast and from 800% to 200% for the 0.1% quantile forecast.
Using several tail parameters instead of a single one brings improvements, especially for very low quantiles. For example, the relative reliability for the 0.1% quantile forecasts drops from 328% to 191% when four clusters are used instead of one.
It has also been tested which couple of quantile forecasts provides the best performance when used as a discriminant to define the clusters. The best performances on extreme quantile forecasts (from 0.1% to 0.5%) are obtained with the couples (5-20%), (5-10%), (5-8%) and (1-5%). Similarly to the results presented in Figure 4, the evolution of sharpness resulting from the proposed modifications is shown in Table 1. It must be noted that, although reliability improvements are observed, the sharpness appears to have degraded with the proposed process, with the 1% quantile forecasts having decreased on average by 20 A.
To evaluate the sharpness and reliability simultaneously, scoring rules may be used. Currently, no papers in the literature related to RTLR forecasts employ this kind of scoring rule to evaluate forecast models, generally considering only the reliability [17,30].
A scoring rule of this type was recently employed in the Global Energy Forecasting Competition, where probabilistic forecasts of load and renewable energy production were evaluated with the quantile scoring rule, equal to the sum of the quantile scores over all percentiles [31]. This index, QSS, computed as the sum of quantile scores, was not devised solely for evaluating the low quantile forecasts used in RTLR forecasting, since the main part of the index is associated with the quantile score values of the intermediate quantiles [32]. This is represented in Table 2. Here, the use of a sum of quantile scores for the quantiles lower than 5% is proposed. For a single forecast, this index can be described as follows:

\[
QSS_{a,b} = \sum_{c=a}^{b} QS_{c}\left(\hat{Y}^{\tau}_{t+h|t},\, Y_{t+h}\right), \qquad \tau = c
\]

where:
- $\hat{Y}^{\tau}_{t+h|t}$ and $Y_{t+h}$ are respectively the predicted value of the $\tau$th quantile at time $t+h$ and the observed value at time $t+h$
- $QSS_{a,b}$ is the quantile score sum between the $a$th and $b$th quantiles
- $QS_{c}\left(\hat{Y}^{\tau}_{t+h|t}, Y_{t+h}\right)$ is the quantile score relative to quantile $c$

Table 3 shows the relative evolution of the quantile scores obtained with the proposed methodology according to the number of clusters used, and its improvement with respect to the base case without distribution tail improvement.

Table 3. Evolution of the $QSS_{0.1\%,5\%}$, depending on the number of clusters and pairs of quantiles used.
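The index described above can be sketched in a few lines. The example assumes that the quantile score is the standard pinball loss, which the text does not state explicitly, and that the sum runs over a discrete set of probability levels between 0.1% and 5%. The levels, forecast values and function names below are illustrative only.

```python
def quantile_score(q_forecast, observed, tau):
    """Pinball loss of a single tau-quantile forecast (assumed form of QS)."""
    diff = observed - q_forecast
    # tau * diff when the observation is above the forecast,
    # (tau - 1) * diff when it is below; both branches are non-negative.
    return max(tau * diff, (tau - 1.0) * diff)

def quantile_score_sum(forecasts, observed):
    """Sum of quantile scores over the levels in `forecasts` (dict: tau -> forecast)."""
    return sum(quantile_score(q, observed, tau) for tau, q in forecasts.items())

# Hypothetical low-quantile ampacity forecasts (A) for one time step.
forecasts = {0.001: 450.0, 0.01: 460.0, 0.05: 475.0}
observed = 470.0

qss = quantile_score_sum(forecasts, observed)
```

Because an observation below the forecast is weighted by (1 - tau), an overestimated low quantile dominates the sum, which is why restricting the index to levels below 5% focuses it on the tail.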

Evaluation of Benefits on Network Operation
To evaluate the use-value of the RTLR forecasts, different strategies to set the DLR are considered and, as in [1], the following indices are used:
- The benefits of the DLR set, computed here as the total reduction of the system cost divided by the initial total system cost. For a high number of observations, a difference lower than 1% is observed between the expected and observed benefits, and it is thus considered that the observed benefits are known for each DLR forecast selection strategy applied.
- The total cost of the reserve activations linked to DLR overestimations.
- The frequency of DLR overestimations that require correction measures at a cost greater than a given threshold, here €1500. Such overestimations are referred to as incidents.
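The three indices listed above can be illustrated with a minimal sketch. It assumes that the initial total system cost is the cost obtained without DLR and that reserve activation costs are available per time step. The function names and the toy cost values are hypothetical; only the €1500 threshold comes from the text.

```python
def relative_benefit(initial_costs, dlr_costs):
    """Total system cost reduction divided by the initial total system cost."""
    initial_total = sum(initial_costs)
    return (initial_total - sum(dlr_costs)) / initial_total

def total_reserve_cost(reserve_costs):
    """Total cost of reserve activations caused by DLR overestimations."""
    return sum(reserve_costs)

def incident_frequency(reserve_costs, threshold=1500.0):
    """Share of time steps whose reserve activation cost exceeds the threshold."""
    incidents = sum(1 for cost in reserve_costs if cost > threshold)
    return incidents / len(reserve_costs)

# Toy data in euros for four time steps (illustrative values only).
initial_costs = [10000.0, 12000.0, 11000.0, 9000.0]
dlr_costs = [9900.0, 11900.0, 10950.0, 8990.0]
reserve_costs = [0.0, 200.0, 1800.0, 0.0]

benefit = relative_benefit(initial_costs, dlr_costs)
reserve_total = total_reserve_cost(reserve_costs)
incident_rate = incident_frequency(reserve_costs)
```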
Three sets of forecasts are considered:
- Forecasts generated solely with a QRF model (QRF).
- Forecasts generated with a QRF model, to which exponential tails are applied considering four clusters created with the LR quantile forecasts associated with the quantile pair (5%, 10%). This version obtained the best results in terms of reliability (QRF_EI_4c).
- Forecasts generated with a QRF model, to which exponential tails are applied considering nine clusters created with the LR quantile forecasts associated with the quantile pair (5%, 10%). This version obtained the best performance in terms of quantile score (QRF_EI_9c).
In this section benefits and costs are expressed in percentage points relative to the overall cost of energy for the system at the given time step.

Fixed Quantile Strategy
It is not possible to make a fair comparison between different forecast models based only on the use of τ-quantile DLR forecasts with a given value of τ, because changing the model simultaneously affects the reserve costs (paid by the network operator) and the system cost (paid by the users). As an example, for the test cases proposed in this study and considering the use of the RTLR forecasts with a fixed value of τ of 1%, the proposed modifications (i.e., with the exponential tails featuring nine clusters generated using the quantiles (5%, 10%)) result in a reserve cost reduction of 37%, from 0.016% to 0.010% of the total energy cost, but imply a decrease in benefits of 25%, from 0.61% to 0.46% of the total energy cost.
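The two relative changes quoted above can be checked directly from the rounded percentages given in the text; the small difference from the reported 37% is attributable to that rounding:

\[
\frac{0.016 - 0.010}{0.016} = 0.375 \approx 37\%, \qquad
\frac{0.61 - 0.46}{0.61} \approx 0.246 \approx 25\%.
\]

Both quantities move in the same direction, so a comparison at a single value of τ cannot separate a better forecast from a merely more conservative one.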
The forecast models are therefore compared over the whole set of possible values of τ, by plotting the evolution of the total reserve costs and of the frequency of incidents against the observed benefits. This is represented in Figure 5. Three cases are considered: the basic forecast obtained with QRF shown in [1] (QRF), and the same model with an improved tail modelled with an exponential function and based on four clusters (QRF_EI_4c) or nine clusters (QRF_EI_9c). Regarding the evolution of the total costs, the modifications of the forecast models do not appear to generate a significant improvement, and it could be considered that these modifications have no impact. However, when large incidents are considered (in this case, events with a reserve cost greater than €1500), the effect of the finer tail modelling is evident. As shown in the lower part of Figure 5, for a benefit level of 0.61% and compared with the results obtained with the QRF forecasts, the frequency of incidents increases by 18% and 35% when employing forecasts generated with QRF_EI_9c and QRF_EI_4c respectively.
The example shown here leads to the conclusion that reliability, often the sole criterion used in the literature to evaluate RTLR forecasts, as in [17,30], is not sufficient to evaluate RTLR forecasting models. A proper evaluation methodology should employ additional criteria.

Risk-Averse Strategy
In the previous subsection, the application of RTLR forecasts based on the use of fixed quantiles was considered. In [1], it was shown that by using stochastic risk-averse strategies, it is possible to increase the benefits associated with the use of RTLR forecasts, while ensuring a low level of costs for the TSO or a low frequency of events involving high costs. This type of strategy is used here, considering the forecast model without and with tail modifications and considering 9 clusters generated with quantiles (5%, 10%).
Three possible penalty functions $v_\beta$ are considered: linear, quadratic and exponential. Firstly, the penalty function is considered linear. The aim of using such a function is to reduce the total costs paid by the TSO. Figure 6 depicts the evolution of these costs depending on the observed benefits. Whereas with the fixed quantile strategy the tail modelling decreased the value of the RTLR forecasts, improvements are now observed. Considering a fixed level of benefits of 0.61%, the reduction of total costs goes up from 17% to 21%. In a second approach, quadratic and exponential functions are considered. These approaches aim at improving the benefits while ensuring a low frequency of costly DLR forecast overestimations.
Figure 7 shows the frequency of such overestimations depending on the forecast model used, the level of benefits and the selected forecast strategy. As shown in [1], in the case where forecasts were obtained with non-modified QRF, the quadratic penalties appear to be slightly more efficient than the exponential function. Regarding the proposed forecast modifications, significant benefits are observed with the quadratic penalties. For a level of benefits of 0.61%, the incident frequency reduction goes up from 18% to 29%, i.e., a relative improvement of 61%. This improvement is even greater when exponential penalties are used with the modified forecast model. In this configuration, the frequency of incidents is reduced by 35%, and the exponential penalties thus appear to be more efficient than the quadratic ones with such a forecast model.
The choice between quadratic and exponential penalties for an optimal strategy to reduce incident frequency thus depends on the quality of the forecast model for extreme levels of probability. Exponential penalties are more averse than quadratic ones to costly forecast overestimations. However, to provide better results in terms of forecast value, forecasts with a high level of reliability for extreme quantiles are required with such penalties, and such forecasts were not used in [1].
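To illustrate why the three penalty shapes rank costly overestimations differently, the sketch below compares two hypothetical set points with the same expected overestimation cost: one with frequent small costs and one with rare large costs. The functional forms, the weight beta and the normalisation of costs by the incident threshold are assumptions made for this example and are not the formulation used in [1].

```python
import math

def linear_penalty(cost, beta=1.0):
    """Penalty proportional to the overestimation cost."""
    return beta * cost

def quadratic_penalty(cost, beta=1.0):
    """Penalty growing with the square of the overestimation cost."""
    return beta * cost ** 2

def exponential_penalty(cost, beta=1.0):
    """Penalty growing exponentially with the cost (zero when the cost is zero)."""
    return beta * (math.exp(cost) - 1.0)

def expected_penalty(penalty, costs, probabilities):
    """Probability-weighted penalty over a discrete set of cost scenarios."""
    return sum(p * penalty(c) for c, p in zip(costs, probabilities))

# Costs normalised by the incident threshold (1.0 = threshold); toy scenarios.
frequent_small = ([0.0, 0.5], [0.60, 0.40])  # expected cost 0.2
rare_large = ([0.0, 4.0], [0.95, 0.05])      # expected cost 0.2

ratios = {}
for name, fn in [("linear", linear_penalty),
                 ("quadratic", quadratic_penalty),
                 ("exponential", exponential_penalty)]:
    small = expected_penalty(fn, *frequent_small)
    large = expected_penalty(fn, *rare_large)
    ratios[name] = large / small  # how much worse the rare large case is rated
```

With these toy numbers the linear penalty rates both set points equally, while the quadratic and exponential penalties rate the rare large case progressively worse. Such a strongly risk-averse ranking only pays off if the probabilities assigned to the extreme scenarios are themselves reliable.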

Conclusions
In conclusion, several solutions have been tested to provide improved extreme quantile LR forecasts and DLR selection for network operation. The best-performing one can be summarised as follows: RTLR is forecast with a QRF model, then the bottom of the distribution up to the 5% quantile is replaced with an exponential distribution, whose parameters are calculated according to the relative value of the 5% and 10% quantiles in nine different clusters. This must be coupled with a dynamic quantile selection for DLR setting based on an exponential penalty function.
In this case, the relative frequency of overestimation decreases from 180% to 100% for the 1% quantile forecast and from 800% to 200% for the 0.1% quantile forecast. Although reliability improvements are observed, the sharpness appears to have been degraded by the proposed process. The application of this strategy consistently provides lower overall reserve costs and a lower frequency of incidents than the use of raw QRF-based LR forecasts. In particular, large incidents are reduced by roughly 61%.
This major result emphasizes the need to consider forecasting value (and not just forecasting skill or accuracy) when selecting one forecast from a group of multiple suppliers. Moreover, it also shows that small improvements in the distribution's tails can entail large economic savings in use cases with risk-averse decision-makers.
Several perspectives are opened up by this work. Firstly, it appears that whereas the minimum level of probability for RTLR forecasts investigated in the literature is usually 1%, the use of well-calibrated extreme quantile RTLR forecasts for quantiles lower than 1% may be of significant interest for dispatch operations. Secondly, it is shown that the traditional indices for evaluating probabilistic forecasts are not adequate to evaluate their application. Two different RTLR forecasts with quantile scoring rule values varying by less than 0.01% could have a very different economic value. Therefore, other scoring indices for the evaluation of RTLR forecasts are required. Finally, a relationship was demonstrated between the quality of the probabilistic RTLR forecasts and the strategies employed to select the DLR set points, as shown here by the different ranking obtained for the quadratic and exponential penalties. An optimal strategy, based on the expected performance of RTLR forecasting models, remains to be defined in future work.

Wind speed (m/s)
$\tau$: probability selected, in [0, 1]
$\pi^{fix}_{g}$: commitment price for a conventional generator $g$ (€/h)
$\pi^{fuel}_{g}$: fuel price for generator $g$ (€/MWh)
$\pi^{hup}_{g}$, $\pi^{hdo}_{g}$: prices for holding up and down reserve for a generator $g$ (€/MWh)
$\pi^{rup}_{g}$, $\pi^{rdo}_{g}$: up and down reserve activation prices (€/MWh)
$\rho_{s}$: probability of occurrence of realisation $s$