An Improved Fuzzy C-Means Algorithm for the Implementation of Demand Side Management Measures

Load profiling refers to a procedure that leads to the formulation of daily load curves and consumer classes based on the similarity of the curve shapes. This procedure incorporates a set of unsupervised machine learning algorithms. While many crisp clustering algorithms have been proposed for grouping load curves into clusters, only one soft clustering algorithm has been utilized for the aforementioned purpose, namely the Fuzzy C-Means (FCM) algorithm. Since the benefits of soft clustering have been demonstrated in a variety of applications, the potential of introducing a novel modification of the FCM in the electricity consumer clustering process is examined. Additionally, this paper proposes a novel Demand Side Management (DSM) strategy for load management of consumers that are eligible for the implementation of Real-Time Pricing (RTP) schemes. The DSM strategy is formulated as a constrained optimization problem that can be easily solved, making it a useful tool for the decision-making framework of retailers in competitive electricity markets.


Introduction
In order to cope with the rapidly increasing energy demand in many countries, modern electric power system planning relies on two fundamental concepts, namely Supply Side Management (SSM) and Demand Side Management (DSM) [1]. In the first case, the demand increment is dealt with by power capacity expansion. The time required for the installation of centralized power plants that will cover the electricity needs is long. Moreover, the continuous utilization of fossil fuels in the power generation and transport sectors gives rise to environmental concerns. Many countries have set minimum requirements regarding generation capacity expansion planning in order to meet specific environmental protection targets. In addition, many electricity generation schemes such as Renewable Energy Sources (RES) depend on unpredictable parameters such as the weather conditions. The intermittent power generation of certain types of RES creates problems for the reliability and balance of the grid. Also, high RES penetration in the grid results in increased needs for operational reserves, investments in transmission and distribution infrastructure, and sophisticated methods for solar and wind capacity forecasting. On the other hand, DSM seeks methods to modify the demand patterns for a variety of purposes, such as lowering the risks that accompany high demand peaks, generation shortages and transmission network congestion [2].

1.
The present paper aims to fully connect the DSM targets with price-based DR. The latter is widely regarded in the literature as an efficient mechanism to alter the demand patterns of the consumers. The model developed in the paper outputs tariffs that emphasize achieving a pre-defined DSM objective. Therefore, the benefits of RTP are evident at power grid scale and are not restricted to the load serving entity.

2.
The flexibility of the consumer to alter the demand according to the tariff is expressed by the elasticity. This parameter is part of the model's operation. A novel technique is proposed to derive dynamic elasticities that more accurately capture the changes in the demand according to tariffs.

3.
A part of the model's operation is the utilization of unsupervised machine learning techniques such as soft clustering for a two-fold purpose: (a) to extract the representative load curves or load profiles of the consumers and (b) to track similarities in historical loads for the purpose of drawing the dynamic elasticity curves.
The model can be applied in intra-day energy markets and there is no restriction on the number of consumers or on the type of the involved DR program.

Since the costs of the Retailer are related to the procurement of electricity, its target is to maximize the profits through efficient pricing schemes [35,36]. The price offered to consumers can be constant or variable, i.e., different prices are applied per hour. Most studies on Retailer profit maximization focus on the optimal procurement mechanism, i.e., forward contracts, call options, pool market and self-production [37-41]. Another factor that indirectly influences profits, due to the risks of power market prices, is the prediction of these prices. In [37-41], market prices are forecasted using time series models such as ARIMA. However, more reliable forecasting models have been proposed in the literature [42,43].
The decisions of the Retailer are categorized as long- or short-term. Dynamic pricing is considered in [19,23,41], focusing on the tariffs. Apart from the costs that are related to the procurement mechanism, a crucial parameter is the DR of the consumers. RTP is considered in [19], where different price/demand functions are examined in order to evaluate the influence of the type of the aforementioned function on the profits. In [23], the test set includes 300 consumers connected to the 20 kV distribution network in Iran. The consumers are grouped into four clusters and a different RTP profile is applied to each cluster. The procurement mechanism corresponds to the pool market. No method is used for risk modeling and management. In [35], the scope is to define the optimal procurement mechanism. The mechanisms considered are spot price, forward contracts, call option and self-production. Future market prices and consumer load are considered as stochastic variables. The exposure of the Retailer to the uncertainties of future spot prices is modeled with the conditional value-at-risk parameter. The results indicate that the risk parameters affect the price offered to the consumer. More specifically, if the Retailer tends to be risky, i.e., to rely for its procurement on the spot market, the offered price is higher. In [36], the Retailer serves five different consumers. The electricity is purchased from the pool market. The authors include a wide set of scenarios in their analysis that differ in terms of price strategy (i.e., flat and TOU rates), maximum price limit, and elasticity value. The authors conclude that the optimal price strategy, i.e., the one that leads to higher profits, is TOU rates. Also, as the consumers become more elastic, the profits of the Retailer decline. The authors of [37] focus on the PJM energy market. As in [35], market prices and load are treated as stochastic variables and various scenarios are generated. These two factors influence Retailer profits and a careful simulation of them appears to be crucial in profit maximization. In [38], the scope is to address contract design both at the supply and end-user levels. The paper provides a stochastic programming methodology that allows the Retailer to make informed contractual decisions, particularly with respect to contract prices and quantities. A stochastic-based optimization model is proposed in [39]. The Retailer's decisions are distinguished into short- and medium-term based on the procurement mechanism involved, i.e., pool market and forward contract. The Retailer serves a residential, a commercial and an industrial consumer. The authors focus on the influence of the type of the procurement mechanism and the risk parameter value on the profits. The work in [39] is expanded in [40]. The consumers' reaction to the offered price is modeled with a linear price/demand function. The methodology of [23] is enriched in [41]. In this paper, the approach of the Retailer towards the risks associated with the pool market is modeled with the conditional value-at-risk method. The response of the consumers to the price offered by the Retailer is modeled by a piece-wise acceptance function. This function relates a specific price value to the number of consumers that accept it.
A model between price and responsive demand requires a mathematical formula, known as the price/demand function [44]. The price/demand function indicates how a change in price affects the load. The flexibility of the demand to the price offered by the Retailer is expressed by the price elasticity parameter [45]. In the majority of the related studies in the literature, the elasticity is considered constant within the 24 h period or receives different values per period, namely off-peak and peak hours. As defined in the profit maximization literature, the elasticity only relates load and price, namely it represents the price elasticity. No other factors that influence demand are taken into account. The values of the price elasticities are set by the analyst or taken from previous econometric studies [37,38]. Moreover, consumers of the same activity, i.e., residential, industrial and others, are considered to have the same price elasticity value. However, this approach does not always reflect the actual behavior of the consumer.
A detailed knowledge of the demand patterns influences the success of DR programs and has an influence on profits. Load profiling has been tested as an approach to derive representative or typical demand patterns [46,47]. The latter refers to the formulation of typical load curves for single consumers and groups of consumers. Based on certain criteria, the consumers are grouped together in a number of clusters. Each cluster has a representative daily load curve, which is the weighted average of the diagrams that belong to the cluster. According to this approach, the consumers are not only distinguished into macro-categories (i.e., residential, commercial, etc.), but sub-categories are also formed within the macro-classes. A load profiling problem is formulated as an unsupervised machine learning task and the objective is to optimally group load curves with similar patterns. The algorithms that have been proposed in the load profiling related literature can be distinguished into five general classes: (a) partitional algorithms, such as the K-means and others; (b) hierarchical agglomerative algorithms; (c) fuzzy algorithms, such as the FCM; (d) neural network-based algorithms, such as the Self-Organizing Map (SOM) and the Hopfield network; and (e) algorithms that do not belong to the above classes, such as the Renyi Entropy Clustering, the Competitive Leaky Algorithm (CLA) and others.
Crisp or hard clustering assigns each pattern to exactly one cluster. Soft clustering is a generalization of crisp clustering: the patterns are assigned to all clusters with partial membership. This leads to flexibility in the definition of the clusters' configurations. Soft clustering is suitable when the presence of noise or the erroneous representation of the patterns is an obstacle to the clustering outcome. Also, soft clustering is beneficial in cases where the boundaries between the generated clusters in the feature space are vague. The advantages of soft clustering have been shown in a wide diversity of research fields [67,68].
FCM has been employed in load profiling tasks either as a sole algorithm [53,55,57-61] or as a part of a comparison among many algorithms [48,50,51,62-66]. In [48], the utilized data set includes 234 non-residential consumers. A comparison between the K-means, FCM, Ward hierarchical algorithm, average distance criterion hierarchical algorithm, SOM and modified follow-the-leader algorithm was performed. The algorithm evaluation is done with four validity indicators or adequacy measures. The modified follow-the-leader algorithm wins the competition; the FCM is only superior to the SOM. In [50], the same algorithms as in [48] are used, plus the support vector clustering algorithm. The latter is more effective than the rest. In [51], three algorithms based on the Renyi entropy distance criterion are presented. The authors examine the same data as in [48,50]. In this comparison, FCM is superior to two of the three newly proposed algorithms. In [53], the data set involves 300 load curves coming from distribution feeders in Malaysia that serve domestic, commercial and small-size industrial consumers. The algorithm's potency is checked by two validity indicators and the number of output clusters is set to five. The authors of [55] use seven fuzzy validity indicators. An inherent limitation of the fuzzy validity indicators is that they lead in the majority of the cases to a small optimal number of clusters. The algorithm is applied to the daily load curves of a city in China covering a period of 1 year. The authors argue that the fuzziness index holds an important role in the algorithm's operation. The parameter is data specific and several trial-and-error executions should take place in order to calibrate it. The data used in [57] are the same as the test case of [53]. In this paper, the number of clusters is set to four. Since the data refer to aggregate loads coming from feeders, no large differences are observed. As the load level increases, the daily load becomes less flexible with a rather deficient shape. After the initial clustering, the FCM is executed again separately on the daily load curves of each cluster. An additional load profile per cluster is drawn, leading to a final number of load profiles equal to eight. In [58], the data set refers to 100 daily load curves obtained from feeders in Malaysia. The FCM is used to cluster the load data. Afterwards, a probabilistic neural network is used to classify the resulting load profiles to pre-defined consumer types. The same authors in [59] focus on the fuzziness parameter. A series of simulations took place in order to define the optimal value. According to the paper's findings, increasing the parameter tends to lead to lower clustering errors. This fact is observed in all validity indicators used in the paper. Gerbec et al. [60] apply the same concept as [58]. The FCM generates the load profiles that are used as part of the training vector of the probabilistic neural network. The other part involves the load profiles of the respective activity types. With this approach, a business activity is allocated to the most probable cluster. In [61], the FCM is applied to 300 low voltage consumers. Next, a feed-forward neural network is used to assign the consumers to the clusters. The inputs of the network are indicators related to monthly energy usage. The outputs correspond to the membership functions of the FCM, which refer to the level of certainty with which the assignment of a load curve to a cluster is made.
Various combinations with different numbers of clusters and different neural network parameters are examined. In [62], a comparison takes place between FCM, K-means and various topologies of the SOM. The authors utilize the Intra Cluster Index (IAI) and the Inter Cluster Index (IEI) validity indicators. The data cover two years and are provided by a utility in Brazil. In [63], the FCM is tested via three validity indicators. The authors conduct an experiment with different values of the fuzziness parameter. As the parameter increases, the clustering errors decrease; thus, increasing the fuzziness parameter tends to make the clustering of a load data set more robust. In [64], the FCM is compared to hierarchical clustering. No information about the type of hierarchical algorithm is given. Also, the comparison of the algorithms is based on the differences of their theoretical framework; no results are presented regarding the scores of the two algorithms on a validity indicator. The comparison of K-means, FCM and a hierarchical algorithm in [65] is based on the shapes of the load profiles that each algorithm results in. The data refer to 288 consumers of the Slovenian distribution grid. The algorithms are applied to the data set and the shapes of the load profiles are discussed. In [66], the comparison includes FCM, spectral clustering and the expectation-maximization algorithm. The data refer to the total national load of 8 European countries collected from the ENTSO-E database. The authors conclude that spectral clustering appears as the most robust algorithm.
Based on the previous literature survey, the contributions of the present paper are summarized in the following: RTP is utilized not as a part of profit maximization for electricity retailers but as a tool for the implementation of DSM targets. A novel method is proposed to derive time variant price elasticities of consumers' load. Finally, a novel version of the traditional FCM is proposed for load profiling but also for implementing RTP schemes to meet predefined DSM targets. More specifically:

•
In profit maximization problems, no attention is paid to meeting specific goals for shaping the load curves of the consumers. In the present paper, RTP is utilized not as a part of profit maximization for electricity retailers, but as a tool for the implementation of DSM targets. Therefore, price-based DR is used to achieve a load management target. A single case study of a Retailer that interacts with the wholesale market environment and serves the consumers is investigated. For each hour period on an intra-day basis, the Retailer is asked by the System Operator (SO) to deliver specific load modifications, referred to as DSM strategy goals, i.e., Peak Clipping, Valley Filling, Load Shifting, Strategic Conservation, Strategic Load Growth and Flexible Load Shape [69,70]. The Retailer implements the appropriate strategy based on the load profiles and the price elasticity of the consumers, market prices and other conditions. The strategy is interpreted as RTP schemes that are transferred to the elastic consumers. The term "elastic" characterizes the consumers that take full advantage of the selling price offered by the Retailer and modify their demand accordingly. Hence, the consumers react to the selling price and the pre-defined DSM objective is achieved. The proposed DSM strategy implementation is formed as a single linear constrained optimization task. The decision variables are the prices of the specific hour for the consumers. The pricing approach of the proposed strategy is consumer-oriented; each consumer is offered a specific price profile based on its load profile and price elasticity. This leads to more accurate pricing, not only in terms of transferring the actual generation costs mirrored in the wholesale market, but also in designing consumer-specific tariff products based on the consumer demand patterns.
•
A novel method is proposed to derive time-variant price elasticities. The term "dynamic" price elasticity is used to indicate price elasticity curves that express different values per hour. This approach not only better represents the actual behavior of the consumer, but also allows consumer-oriented price elasticity profiles to be drawn.
•
While many crisp algorithms have been proposed in the load profiling literature, the potential of soft clustering has not been adequately demonstrated, since only FCM has been examined. In the present paper, a detailed comparison takes place between two fuzzy algorithms, and their performance is checked by a set of adequacy measures that have been proposed in the load profiling related literature. The goal is to use soft clustering not only for load profiling aims, but also to design and implement RTP schemes to meet predefined DSM targets in day-ahead electricity markets.
The analysis of the present paper focuses on two High Voltage (HV) industrial consumers located in Greece. The HV industry is considered energy intensive. Currently, the consumption of HV industries in Greece accounts for 14% of the total consumption of the Greek electric sector [71]. At least theoretically, the potential of implementing DSM measures to achieve certain DSM goals is high. An ongoing discussion between the Retailers sector and the Regulatory Authority of Energy of Greece has been taking place regarding the modification of the existing tariffs offered to HV consumers [72]. This fact raises the need for a more accurate and justified pricing policy.

Methodological Approach
According to [73], in some cases industrial DSM implementation encounters difficulties due to the nature and operational characteristics of the loads. Although the per-process modeling of industrial activity offers a detailed simulation of a DR-based management program, as most authors state, the modeling becomes infeasible in the case of complex processes. In the present paper, the industrial consumer is examined as a generic block. The consumer itself is responsible for altering the consumption of the electrical or other apparatus as it wishes. It has to be clarified that in the present work the economic benefit and profitability of the Retailer are not examined. The scope is to develop and test a model that outputs the optimal RTP profiles that will fully accomplish a predefined DSM objective. Also, the contribution of each consumer to the objective is derived. The Retailer is not rewarded for delivering the DSM objective asked by the SO. The model's operation presupposes the existence of a telecommunication infrastructure that enables a two-way information flow between the Retailer and the consumers. The consumers act rationally and follow the price signals on an hourly basis. No interaction between consumers takes place. The response of the consumers to prices is not influenced by energy market conditions or other factors, such as rival Retailers' contract offers, weather conditions, economic conditions and others. The Retailer's actions do not influence market prices. The consumers follow the price profile variations offered by the Retailer with no right of negotiation. The model automatically outputs the share of each consumer's contribution to the overall DSM target. A test case with two existing industrial consumers within the Greek electric sector is examined. The data sets used for the model refer to the consumption of the consumers and their existing tariffs as offered by their current Retailer.

Proposed Model
In the following sections, the model is described. Specifically, the general concept is given in Section 3.1. Next, in Section 3.2 the mathematical description of the DR of the consumers is presented. In this study, three DR models are compared in order to assess the influence of the type of DR modelling on the RTP. In Section 3.3 the DSM objectives are presented and in Section 3.4 the dynamic elasticity extraction method is described. The mathematical background of load profiling that is described in Section 3.5 includes the pattern representation, the clustering evaluation framework and the description of the algorithms.

Concept
A conceptual diagram of the proposed model is presented in Figure 1. According to market conditions and operational constraints, the SO sends a signal to the Retailer with a specific DSM objective on a day-ahead basis. Thus, the DSM objective is known to the Retailer the day before it is asked to be manifested. The SO acts at the transmission system level and connects the wholesale and retail markets. The Retailer acts at the distribution system level and transfers the commands of the SO to the consumers. Figure 2 depicts the DSM objectives. The figure shows in a simple manner how an initial load curve is affected after the successful exercise of a set of measures that aim at satisfying the objective. The task of the Retailer is to "distribute" the DSM objective to the consumers. For instance, if the SO asks the Retailer for an overall reduction of 1 MW, the Retailer sends different RTP signals to the consumers, specifically designed in order to meet the 1 MW target overall.

In order to demonstrate and assess the proposed model, a single case of a Retailer that serves a number of consumers m = 1, 2, ..., M with total consumption P(n, M) in day n is regarded. For each day n, the daily load curve of consumer m is expressed as a D-dimensional vector $x_n^m = [x_1^m, \ldots, x_D^m]^T$. The dimension of the vector refers to the recorded load metering interval. Originally, the two consumers are charged with the same TOU rate tariffs designed by their current Retailer [71]. These tariffs are referred to as "nominal" and indicated as $p_0(h)$ at hour h = 1, 2, ..., 24. The nominal price is a product of a bilateral agreement between the Retailer and the consumer. The demand that corresponds to the nominal price $p_0(h)$ is referred to as "nominal" demand and indicated as $d_0(h)$. Based on its load magnitude and price elasticity, every consumer contributes differently to the overall DSM objective. Therefore, it is more accurate for each consumer to be charged with a different RTP scheme. Let $p_1(h)$ and $p_2(h)$ (€/kWh) be the RTPs at hour h within the day of the 1st and 2nd consumer, respectively. These prices are accepted by the consumers, who modify their loads accordingly, and finally the DSM objective is met. While price-based DR programs such as RTP are voluntary, the consumers are considered elastic, take full advantage of the hourly RTP, and respond rationally. The DR of the consumer $d(p(h))$ corresponds to price $p(h)$.

Linear Price/Demand Function
In the present study, three price/demand functions are considered for the purpose of examining the influence of the selection of the function on the DSM success. The simplest price/demand function is the linear one, $d_{lin}(p(h)) = a_{lin} + b_{lin}\,p(h)$, where $a_{lin}$ and $b_{lin}$ are the two coefficients of the linear function. Let $d_0^{lin}(h)$ be the nominal demand of the consumer using the linear demand function, and $B_0^{lin}(h)$ the nominal consumer benefit (€) that corresponds to demand $d_0^{lin}(h)$. The obtained benefit $B_c(d_{lin}(p(h)))$ that corresponds to demand $d_{lin}(p(h))$ is expressed by the following equation [44]: where $E_{lin}(h)$ is the hourly price elasticity using the linear model.

Exponential Price/Demand Function
Apart from the linear model, two nonlinear ones are considered, namely the exponential and logarithmic models, where the relationship between the two quantities is a nonlinear function. The exponential price/demand function is $d_{exp}(p(h)) = a_{exp}\,e^{b_{exp} p(h)}$, where $a_{exp}$ and $b_{exp}$ are the two coefficients of the exponential function. Let $B_c(d_{exp}(p(h)))$ and $B_0^{exp}(h)$ be the obtained benefits (€) from consuming demand $d_{exp}(p(h))$ and nominal demand $d_0^{exp}(h)$, respectively, considering the exponential model. Consequently, it is: where $E_{exp}(h)$ is the hourly price elasticity using the exponential model.

Logarithmic Price/Demand Function
The logarithmic price/demand function is $d_{log}(p(h)) = a_{log} + b_{log} \ln(p(h))$, where $a_{log}$ and $b_{log}$ are the two coefficients of the logarithmic function. Let $B_c(d_{log}(p(h)))$ and $B_0^{log}(h)$ be the obtained benefits (€) from consuming demand $d_{log}(p(h))$ and nominal demand $d_0^{log}(h)$, respectively, considering the logarithmic model: where $E_{log}(h)$ is the hourly price elasticity using the logarithmic model. The DR functions of the three models are derived using the condition of maximizing the benefit functions.
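For illustration, the three price/demand functions can be written compactly as follows. This is a minimal sketch in Python; the coefficient values used in the example are placeholders, not fitted values from the paper.

```python
import numpy as np

def demand_linear(p, a_lin, b_lin):
    """Linear price/demand function: d(p) = a_lin + b_lin * p."""
    return a_lin + b_lin * p

def demand_exponential(p, a_exp, b_exp):
    """Exponential price/demand function: d(p) = a_exp * exp(b_exp * p)."""
    return a_exp * np.exp(b_exp * p)

def demand_logarithmic(p, a_log, b_log):
    """Logarithmic price/demand function: d(p) = a_log + b_log * ln(p)."""
    return a_log + b_log * np.log(p)

# Demand (kW) of a consumer at a price of 0.08 EUR/kWh under each model,
# with illustrative placeholder coefficients (b < 0: demand falls as price rises).
p = 0.08
print(demand_linear(p, a_lin=1200.0, b_lin=-4000.0))      # 880.0 kW
print(demand_exponential(p, a_exp=1500.0, b_exp=-3.0))    # ~1180 kW
print(demand_logarithmic(p, a_log=300.0, b_log=-250.0))   # ~931 kW
```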

Strategic Conservation
Three DSM objectives are examined, namely Strategic Conservation (SC) or Energy Efficiency, Load Shifting (LS) and Valley Filling (VF). The DSM objectives are accomplished only with RTP schemes, i.e., the consumer is responsible for modifying its consumption by selecting the number and types of load equipment that will be used for meeting the objective. No energy efficient equipment installation is pre-assumed. In the SC objective, the scope is a net reduction of the nominal demand $d_0^{lin}(h)$ to $d_{lin}(p(h))$ at all hours of the day. For each hour h = 1, 2, ..., 24, the SO asks the Retailer for an R(h)% reduction of the Retailer's serving load. Then, the Retailer has to define prices p(h), h = 1, 2, ..., 24, that meet the reduction target. Let $d_0^1(h)$ and $d_0^2(h)$ be the nominal demands, $d_1(p_1(h))$ and $d_2(p_2(h))$ be the DRs, and $p_1(h)$ and $p_2(h)$ be the RTP prices that lead to $d_1(p_1(h))$ and $d_2(p_2(h))$ of the 1st and 2nd consumer, respectively. Due to market conditions and competition, upper hourly limits (u(h)%) need to be set on prices $p_1(h)$ and $p_2(h)$. The SC objective is formulated as an optimization task formed as follows: Maximize: Subject to: Each consumer contributes a different portion of the overall R(h)% reduction target.
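Since the objective and constraint equations are not reproduced above, the Python sketch below illustrates one plausible reading of the hourly SC task under the linear DR model: the Retailer maximizes the total served demand subject to the aggregate R(h)% reduction and the u(h)% price caps, which drives the reduction to exactly the requested level with the smallest possible price increase. The function name (sc_hourly_prices) and all numerical values are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def sc_hourly_prices(d0, b, p0, R, u):
    """Strategic Conservation for a single hour (illustrative sketch).

    d0 : nominal demands of the M consumers at the nominal price p0 (kW)
    b  : slopes of the linear price/demand functions (kW per EUR/kWh, b < 0)
    p0 : nominal price (EUR/kWh)
    R  : required aggregate reduction, e.g. 0.10 for 10%
    u  : upper price limit as a multiple of p0, e.g. 1.5 for 150%
    """
    d0, b = np.asarray(d0, float), np.asarray(b, float)
    a = d0 - b * p0                       # intercepts so that d(p0) = d0

    # Maximize total served demand sum(a + b*p)  <=>  minimize -sum(b*p)
    c = -b
    # Aggregate demand must not exceed (1 - R) of the nominal aggregate demand
    A_ub = np.atleast_2d(b)
    b_ub = np.array([(1.0 - R) * d0.sum() - a.sum()])
    bounds = [(p0, u * p0)] * len(d0)     # nominal price <= RTP <= u * p0

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x                          # hourly RTP of each consumer (EUR/kWh)

# Illustrative hour: two consumers, 10% reduction target, 150% price cap.
print(sc_hourly_prices(d0=[900.0, 1500.0], b=[-8000.0, -2000.0],
                       p0=0.07, R=0.10, u=1.5))
```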

Load Shifting
The LS objective aims at shifting loads from peak hours to base hours. Usually, the loads that are shifted from peak hours are equal to those that fill the base hours. Let i, i = 1, 2, ..., h − 1, be a pre-defined instance during the day that refers to a load that will be removed in order to be added to the load of another instance i' ≠ i, i' = 1, 2, ..., h − 1. An instance may refer to a single hour or a period of hours. Two base load instances i' and i'' (i'' ≠ i', i'' ≠ i), i'' = 1, 2, ..., h − 1, are considered. The LS problem is formed as follows: Maximize: and simultaneously, minimize: Subject to: where R(i') and I(i') are the load reduction and increment of instance i', and R(i'') and I(i'') are the load reduction and increment of instance i'', respectively. The following condition applies: I(i') = I(i''). At instance i, the nominal demand and DR of the 1st consumer are denoted as $d_0^1(i)$ and $d_1(p_1(i))$, respectively, and the nominal and real-time prices are denoted as $p_0(i)$ and $p_1(i)$. Also, the nominal demand and DR of the 2nd consumer are denoted as $d_0^2(i)$ and $d_2(p_2(i))$, respectively, and the nominal and real-time prices are denoted as $p_0(i)$ and $p_2(i)$. The upper price limit is represented as u(i). The corresponding quantities of instances i' and i'' are defined accordingly.

Valley Filling
The VF objective aims at filling loads at base hours. The offered prices should be lower than the nominal in order to motivate the consumers to increase their consumption. The VF problem is formed as follows: Minimize: Subject to:

Price Elasticity Extraction
The three objectives are exercised separately via a multi-step methodology, which is described in the following: Step 1. The SO sets the DSM objective and informs the Retailer the day before. Depending on the type of the objective, the quantities R(h), I(i'), I(i"), R(i') and R(i") are determined.
Step 2. The Retailer selects a price/demand function to simulate the consumer behavior towards the offered prices. Also, based on the DSM objective, the Retailer sets upper prices limits u(h), u(i), u(i') and u(i").
Step 3. The Retailer calculates the price elasticity of each consumer. The price elasticity parameter is an indication of the flexibility of the demand to price signals. In the present paper, a novel method of price elasticity estimation is proposed. Let n be the day on which the SO asks the Retailer to exercise a DSM objective. A clustering takes place on the available daily load curves of the consumer up to day n − 1, and a search for the clusters to which the days corresponding to day n (for instance, 15 May 2003) belong is conducted. Next, depending on the selected demand function, a 1st order polynomial, exponential or logarithmic model is used to fit the hourly loads of all days that belong to the selected clusters and the corresponding prices $p_0(h)$. By taking into account all the days that belong to the same cluster as the selected day, for instance 15 May 2009, the extraction of the price elasticity is based on similar days and hence, the seasonal variations and periodicities of the load are integrated. The hourly price elasticities using a linear, exponential and logarithmic model are given by the following expressions [46]: The parameters $a_{lin}$, $b_{lin}$, $a_{exp}$, $b_{exp}$, $a_{log}$ and $b_{log}$ are derived via the fitting procedure. According to Equations (23)-(25), the price elasticities vary per hour, a concept that corresponds to a more reliable modeling of the consumer's responsiveness to price signals. The hourly price elasticities of day n are obtained by averaging the price elasticities of the days that belong to the same cluster as day n.
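A sketch of Step 3 in Python is given below. It fits each demand function to the hourly (price, load) pairs of the similar days and returns the hourly point elasticities E(h) = (dd/dp)·p(h)/d(h), which for the three models reduce to b_lin·p/d, b_exp·p and b_log/d. These standard expressions are assumed to correspond to Equations (23)-(25); the function name is illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

def hourly_elasticities(prices, demands, model="linear"):
    """Fit a price/demand model to the (price, demand) pairs of the similar
    days and return the hourly point elasticities E(h) = (dd/dp) * p / d."""
    p, d = np.asarray(prices, float), np.asarray(demands, float)

    if model == "linear":                                   # d = a + b*p
        b, a = np.polyfit(p, d, 1)
        return b * p / (a + b * p)
    if model == "exponential":                              # d = a*exp(b*p)
        (a, b), _ = curve_fit(lambda x, a, b: a * np.exp(b * x),
                              p, d, p0=(d.mean(), -1.0), maxfev=10000)
        return b * p                                        # E = b*p
    if model == "logarithmic":                              # d = a + b*ln(p)
        (a, b), _ = curve_fit(lambda x, a, b: a + b * np.log(x), p, d)
        return b / (a + b * np.log(p))                      # E = b/d
    raise ValueError("unknown model")

# The dynamic elasticity curve of test day n is the hourly average of the
# elasticities computed for the days of the cluster that contains day n.
```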

Load Profiling Fundamentals
In the present paper, a novel modified form of the FCM is developed and tested. The clustering algorithm is used to calculate the price elasticity of the test day n. Clustering tracks the similarities of the daily load curves based on their shape, therefore data scaling is necessary. The term pattern refers to the individual component of every data set, i.e., the daily load curve. The data set of consumer m is denoted as $X^m = \{x_n^m, n = 1, \ldots, N\}$, where N indicates the number of patterns of the consumer. Let $x_{min}$ and $x_{max}$ be the minimum and maximum value of set $X^m$, respectively. Using the following expression, patterns $x_n^m$ expressed in physical units (i.e., kW) are normalized into the [0, 1] range:
$$y_n^m = \frac{x_n^m - x_{min}}{x_{max} - x_{min}} \quad (26)$$
Let the set of normalized vectors be represented as $Y^m = \{y_n^m, n = 1, \ldots, N\}$. Normalization is necessary since clustering tracks the similarities of pattern shapes; the patterns' magnitude significantly affects the results. The normalization formula of Equation (26) is applied to each year separately (i.e., nine yearly sets); thus, nine sets $Y^m$ are derived. A clustering algorithm provides a mapping of N → K, where K is the number of clusters and 1 ≤ K ≤ N. Each generated cluster has a centroid, which is the average of all patterns of the same cluster:
$$c_k = \frac{1}{N_k} \sum_{x_n^m \in C_k} x_n^m \quad (27)$$
where $N_k$ denotes the number of patterns of $X^m$ that belong to cluster $C_k$. The set of clusters is denoted as $C = \{c_k, k = 1, \ldots, K\}$. The clustering assessment is held through a set of validity indicators. Depending on the indicator, the superiority of an algorithm over others is demonstrated when it leads to lower or higher values of the indicator for the majority of cluster numbers. The indicators are built upon similarity metrics. The most common similarity metric is the Euclidean distance. Let $x_n^s$ and $x_n^t$ be two patterns, $x_n^s, x_n^t \in X^m$. Prior to the presentation of the indicators that were considered in this study to evaluate the improved FCM performance, the following metrics are defined:

•
The Euclidean distance between $x_n^s$ and $x_n^t$. The subset of $X^m$ that belongs to cluster $C_k$ is denoted as $S_k$. The Euclidean distance $d_{eucl}(c_k, S_k)$ between the centroid $c_k$ of the kth cluster and the subset $S_k$ is the geometric mean of the Euclidean distances between $c_k$ and each member $x_n^k$ of $S_k$.

•

The geometric mean of the inner distances between the patterns $x_n^k$ and $x_n^l$, members of the subset $S_k$.

The following validity indicators are considered [56] (a computational sketch is given after the list):

•

The ratio of Within Cluster Sum of Squares to Between Cluster Variation (WCBCR), which corresponds to the ratio of the distances of the patterns from their cluster centroids to the sum of distances between the cluster centroids.

•

The Intra Cluster Index (IAI), which corresponds to the overall sum of the distances between patterns and centroids.

•

The Scatter Index (SI), which corresponds to the ratio of the distances between the patterns and the arithmetic mean $\bar{p}$ of set $X^m$ to the distances between the centroids and the arithmetic mean.

•
The Inter Cluster Index (IEI), which corresponds to the sum of distances between the cluster centroids and the arithmetic mean:
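Since the original indicator equations are not reproduced above, the sketch below computes the four indicators from the verbal definitions given here, with the squared Euclidean distance assumed as the underlying metric; the exact scaling may therefore differ from the paper's equations.

```python
import numpy as np

def validity_indicators(X, labels, centroids):
    """WCBCR, IAI, SI and IEI for a clustering of the normalized load curves.

    X         : [N x D] array of daily load curves
    labels    : [N] array with the cluster index of each curve
    centroids : [K x D] array of cluster centroids
    """
    X, C = np.asarray(X, float), np.asarray(centroids, float)
    labels = np.asarray(labels)
    K = C.shape[0]
    p_mean = X.mean(axis=0)                              # arithmetic mean of the set

    within = sum(np.sum((X[labels == k] - C[k]) ** 2) for k in range(K))
    between = sum(np.sum((C[k] - C[l]) ** 2)
                  for k in range(K) for l in range(k + 1, K))

    return {
        "WCBCR": within / between,                       # compactness and separation
        "IAI": within,                                   # compactness only
        "SI": np.sum((X - p_mean) ** 2) / np.sum((C - p_mean) ** 2),
        "IEI": np.sum((C - p_mean) ** 2),                # separation only
    }
```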

Description of the Algorithm
It should be noted that since the data are at the core of the clustering procedure, each data sample is modeled differently. Crisp clustering is disadvantageous in cases where there is no clear data structure, in cases of presence of borderline patterns, and when the outcome partitions are not easily separable. Soft clustering is a generalization of crisp clustering and involves partial membership over the clusters; this concept is appropriate for handling the degree of fuzziness of the original data. The most widely used soft clustering algorithm is the FCM. The aim is to minimize the following objective function [67,68]: where q ∈ (1, ∞) is the fuzziness parameter and U is the partition matrix. This matrix contains the membership degrees of the patterns to the k clusters. The minimization of Equation (34) is performed iteratively. Equation (37) gives the centroids of the k clusters; in the FCM, the clusters' centroids are obtained by Equation (37). The difference with Equation (27) is the presence of the fuzziness parameter q and the elements u of the partition matrix. Equation (38) implies that the patterns are assigned to all clusters with a membership degree u; the sum of the k membership degrees of a pattern is 1. The value of the fuzziness parameter influences the results of clustering. Equation (38) gives the membership degree of pattern $y_n^m$ in cluster $C_k$. The operation of the FCM includes the following phases: Phase 1. Randomly select values in the [0, 1] range for the elements of U so that the following constraints are satisfied: where $u_{nk}$ denotes the degree to which the pattern $y_n^m$ belongs to cluster $C_k$. Phase 2. Calculate the centroids $c_k$, k = 1, ..., K, with Equation (36). Phase 3. Calculate the cost function with Equation (35). Terminate the algorithm if the value of J is smaller than a pre-defined threshold; else, go to Phase 4. Phase 4. Calculate a new matrix U with Equation (36) and go to Phase 2.
FCM is an iteration-based cost function minimization algorithm. Since the convergence of the algorithm is not always guaranteed, a series of executions should take place with different initialization conditions [56]. The convergence is accomplished when the pre-defined number of maximum iterations is met or when the improvement of the objective function J between two successive iterations is smaller than a threshold value. In the FCM, $u_{nk}$ denotes the degree of membership of the pattern $y_n^m$ in cluster $C_k$; these degrees constitute the elements of matrix U. This matrix is randomly initialized, resulting in the strong dependence of the FCM on the initialization phase. To cluster a specific data sample, the FCM should be executed many times in order to reach a satisfactory result. To overcome this drawback, this study proposes a hybrid and improved version of the FCM. The new algorithm aims at providing a solution to the random initialization problem of the conventional FCM by employing within its operation another clustering algorithm, namely the K-means. Figure 3 presents the flowchart of the improved FCM. The operational phases of the algorithm are described in the following: Phase 1. Conduct an initial clustering of $Y^m$ for a pre-defined number of clusters (i.e., k) using the K-means algorithm [75]. The initial centroids $c_k$ are obtained. Potentially, any clustering algorithm can be used in this phase to produce the initial centroids; the K-means is used since it is a fast and efficient algorithm that has been applied to many clustering problems [76,77]. Phase 2. For each pattern of the set $Y^m$, calculate the Euclidean distances $d_{kn}$ between the pattern and the centroids $c_k$. Phase 3. Divide each distance $d_{kn}$ by the sum of all distances $\sum d_{kn}$. Build the matrix U by setting its elements as: All elements $u_{kn}$ are within the [0, 1] range.
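A compact sketch of the improved FCM is given below. The K-means initialization of Phases 1-3 and the conventional FCM iterations follow the description above; since the exact initial-membership expression is not reproduced in the text, an inverse-distance weighting is used here as a stand-in, and the fuzziness parameter q = 2 is an illustrative choice.

```python
import numpy as np
from sklearn.cluster import KMeans

def improved_fcm(Y, k, q=2.0, max_iter=300, tol=1e-6, seed=0):
    """FCM with a K-means-based initialization of the partition matrix U."""
    Y = np.asarray(Y, dtype=float)

    # Phase 1: initial centroids from K-means
    C = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(Y).cluster_centers_

    # Phases 2-3: initial memberships from the pattern-centroid distances
    d = np.linalg.norm(Y[:, None, :] - C[None, :, :], axis=2) + 1e-12
    U = (1.0 / d) / (1.0 / d).sum(axis=1, keepdims=True)   # rows sum to 1

    J_old = np.inf
    for _ in range(max_iter):
        Um = U ** q
        C = (Um.T @ Y) / Um.sum(axis=0)[:, None]            # centroid update
        d = np.linalg.norm(Y[:, None, :] - C[None, :, :], axis=2) + 1e-12
        J = np.sum(Um * d ** 2)                             # FCM objective function
        # standard FCM membership update
        U = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2.0 / (q - 1.0)), axis=2)
        if abs(J_old - J) < tol:                            # convergence check
            break
        J_old = J
    return C, U
```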

Load Profiling of HV Consumers
Two (M = 2) existing consumers of the Greek electric sector are considered, i.e., a cement and a steel manufacturing industry. For each consumer, the available data set refers to a period of nine consecutive years, with 15 min recordings of the active load. A large amount of load data strengthens the load profiling conclusions. The consumption of HV industries in Greece is 14% of the total electricity consumption of the interconnected Greek network. This means that the HV industrial sector is a potential candidate for the efficient implementation of load management measures. Such measures may include managing the load patterns in order to reduce or shift the demand from periods of high to low consumption in a way that is beneficial for both the industry and the distribution SO. Thus, an effective load profiling process can aid in the design and evaluation of sophisticated load management measures. Also, load profiling aids in tracking seasonal patterns and outliers, and provides a generic overview of the demand patterns of the consumer. In this section, the improved FCM algorithm is compared with the conventional FCM using the load data of the two consumers. No prior external expert information about the number of daily load curve clusters or about the data inter-relationships and structures is available, a fact that makes the current load profiling task a purely unsupervised machine learning one. Also, no pre-processing technique such as outlier removal, smoothing or de-trending took place. The soft clustering algorithms are data driven. The number of clusters is determined by a mathematical process applied to the validity indicator shapes. Since no expert knowledge about the number of clusters is available, the algorithms should be executed for a variable number of clusters and, for each cluster number, the values of the indicators are checked. The available data for each consumer cover the period 2003-2011 and refer to hourly load data, i.e., the dimension of the patterns is D = 24. Each daily load curve is expressed as a vector with D = 24. Clustering is applied separately to the data set of each year. Since the quality of the data was acceptable, no outlier removal or missing data filling was employed. Using the maximum and minimum values per year, the normalization technique of Equation (26) is applied separately to the load data of each year. The two algorithms are applied to the yearly sets $Y^m$. Figures 4 and 5 display the comparison of the algorithms using the 2011 data set for consumer 1 (i.e., the steel manufacturing industry) and for consumer 2 (i.e., the cement manufacturing industry), respectively. The algorithms are executed for two to 30 clusters. A low number of clusters is not desirable, since it results in high clustering errors and therefore the segmentation of the original data set is not proper. On the other hand, a high number of clusters is not recommended since it entails increased complexity and imposes limitations on the load profiling applications, for example DSM, load forecasting and others. To keep the comparison fair, the algorithms' parameters are kept identical (i.e., fuzziness parameter, objective function improvement threshold and maximum number of iterations). The WCBCR, SI and IAI indicators generally have lower values as the number of clusters increases, contrary to the values of the IEI. It should be noted that a comparative analysis of clustering with many validity indicators is a common approach in the literature [56].
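As a minimal sketch, the per-year min-max scaling of Equation (26) and its inverse transformation (used later to bring the resulting centroids back to physical units) can be written as follows; the function names are illustrative.

```python
import numpy as np

def normalize_yearly(load_curves):
    """Eq. (26): map the daily load curves of one year (a [days x 24] array)
    to the [0, 1] range using that year's minimum and maximum load values."""
    x = np.asarray(load_curves, dtype=float)
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min), x_min, x_max

def denormalize(centroids, x_min, x_max):
    """Inverse transformation: bring centroids back to physical units (kW)."""
    return np.asarray(centroids) * (x_max - x_min) + x_min
```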
The selection of the clustering validity indicator is critical to the conclusions drawn by a load profiling application. The indicators evaluate the algorithms' performance by measuring the clustering error. The latter refers to the level of similarity of patterns within the same cluster and the level of dissimilarity between patterns of different clusters. The term compactness or cohesion refers to the similarities of patterns of the same cluster and between the patterns and the centroids of the clusters they belong to. The term separation refers to the dissimilarity of the centroids. In the ideal case, an algorithm should result in compact and well-separated clusters. For the purpose of delivering an in-depth comparison between the algorithms, a set of indicators is required. Indicators may measure the compactness, the separation or both of them. The indicators are built upon the Euclidean distance, which is a fundamental metric to measure the similarity between two vector patterns. The Euclidean distances between the patterns and the centroids, or between the centroids themselves of the generated clusters, are expressed by the WCBCR. The WCBCR appears as a robust indicator since it assesses both compactness and separation. The distances between the patterns and the centroids tend to decrease as the number of clusters increases. This is due to the formation of more centroids, which lie closer to specific patterns in the feature space. The IAI indicator refers to the sum of the distances between the patterns and the centroids; the IAI only measures the compactness of the clusters. As in the case of the WCBCR, it displays a decreasing tendency but with a more unstable shape. The SI displays many fluctuations, especially in the case of the FCM algorithm. Finally, the IEI is a measure of distinctiveness between the centroids and the arithmetic mean of the pattern data set. This distinctiveness is a form of assessing the separation. As the number of centroids increases, the number of summations (i.e., between the centroids and the mean) increases and hence, the IEI receives higher values. From Figures 4 and 5 it is shown that the proposed algorithm leads to better clusterings than the original version of the FCM. This is more evident using the IAI, SI and IEI indicators. The difference between the performances of the two algorithms becomes clearer after the 5th or 6th cluster. Recall that the available data cover the period between 2003 and 2011. The algorithms are executed separately for each year. In order to evaluate the algorithms for the whole period, the mean values of the adequacy measures are used. To clarify this concept, for each year an adequacy measure curve is calculated. Then the mean curve referring to the period 2003-2011 is calculated and utilized for the algorithms' comparison. The concept of using the mean values of the indicators that correspond to subsequent years is useful when dealing with large data sets.
It should be noticed that the majority of the load profiling literature presents and evaluates algorithms using sets that cover periods of a year or less [48,51,52]. It is argued that the usage of large sets supports the reliability of the assessment of a proposed algorithm or methodology. Figures 6 and 7 present the comparison of the algorithms regarding the mean values of the validity indicators. It is observable that the fluctuations in the shapes of IAI, SI and IEI have been limited. Again in this set of experiments, the improved FCM outperforms the conventional form of the algorithm. Figure 8 is a graphical representation of the shape of a general adequacy measure that shows a monotonically decreasing behavior with the number of clusters. For illustration reasons, the maximum number of clusters is k = 40. In the ideal case, the WCBCR, SI and IAI measures continually decrease as clustering is performed with a higher number of clusters. After a specific number of clusters, there is only a slight decrease of the adequacy measure, which corresponds to the "knee" of the curve. In order to derive the optimal number of clusters, i.e., the one that corresponds to the value of the measure at the curve's "knee", a mathematical process known as the "knee" point detection method is applied, which is briefly described in the following: Two lines are drawn, one from point $(x_2, y_2)$ to $(x_3, y_3)$ and one from point $(x_{39}, y_{39})$ to $(x_{40}, y_{40})$. The point of intersection (x, y) of the two tangent lines gives approximately the "knee" of the curve.
Each of the two lines is expressed as y = a + bx, where a is the y-intercept and $b = \Delta y / \Delta x$ is the slope of the line. By equating the expressions of the two lines (LINE1 and LINE2) and solving for x, and since $\Delta x_1 = \Delta x_2$, the value of x at the intersection is derived (Equation (45)). The values of the points $(x_2, y_2)$, $(x_3, y_3)$, $(x_{39}, y_{39})$ and $(x_{40}, y_{40})$ are extracted from the clustering results and, using Equation (45), the value of x is obtained.
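A sketch of the "knee" point detection in Python is given below; it intersects the line through the first two points of the adequacy-measure curve with the line through its last two points, using the closed-form intersection in place of Equation (45), which is not reproduced above.

```python
import numpy as np

def knee_point(ks, values):
    """Approximate the "knee" of an adequacy-measure curve sampled at the
    cluster numbers ks (e.g. 2..40) by intersecting its first and last chords."""
    ks, v = np.asarray(ks, float), np.asarray(values, float)
    b1 = (v[1] - v[0]) / (ks[1] - ks[0])        # slope of the first chord
    a1 = v[0] - b1 * ks[0]                      # its y-intercept
    b2 = (v[-1] - v[-2]) / (ks[-1] - ks[-2])    # slope of the last chord
    a2 = v[-1] - b2 * ks[-1]
    x = (a2 - a1) / (b1 - b2)                   # x coordinate of the intersection
    return int(round(x))

# Example: optimal_k = knee_point(range(2, 41), wcbcr_values)
```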
The output of clustering refers to the centroids expressed in the [0, 1] range of values. The inverse transformation of Equation (26) gives the centroids in physical units, i.e., the load profiles. Apart from assessing the algorithms' performance, a robust clustering validity indicator should provide information about the optimal number of clusters. Among the validity indicators, the WCBCR is the most suitable for applying the "knee" point detection method, because it presents a less volatile shape compared to the other indicators. The application of the method to the WCBCR curve as extracted by the improved FCM (Figure 4) leads to k = 6 clusters for consumer 1. Following the same approach, k = 6 clusters are also obtained for consumer 2 (Figure 5). The load profiles of 2011 are illustrated in Figure 9.
Load profiling actually provides a reduction of the magnitude of the original load data set; the initial set of 365 daily load curves is now represented by a reduced set of typical load curves or load profiles. In the present cases, the daily load curves of 2011 are optimally grouped in six clusters and represented by six profiles. According to Figure 9a, a further classification of the profiles into two types can be made. Profiles c#1, c#3 and c#6 refer to a relatively stable demand; no large fluctuations and no sudden peaks are observed. All the other profiles refer to high demand during the first morning and night hours. This is due to the lower night-time tariff charges offered by the Retailer to eligible industrial consumers. The industry's owner takes advantage of the low electricity tariff offer and shifts the larger portion of the industrial activity to that specific period. The lowest consumption is observed during 10:00-14:00 h. Usually, this period coincides with the morning peak of the Greek power system. Afterwards, the consumption is continually increasing. This tariff structure is a basic technique to avoid load congestion in morning peak hours. The load of consumer 2 is higher. In this case, two profiles display a stable form, namely c#1 and c#3. The rest of the profiles correspond to low consumption during the regular working hours. The selection of a single profile to represent the consumer is a choice that is made by the analyst. A potential selection would be the profile of the most populated cluster, which corresponds to the majority of the days of the original data set. Other selections would refer to the profile with the maximum daily energy consumed (i.e., the profile with the largest area covered by the curve), the profile that corresponds mainly to working days, the profile with the peak load and others. It should be noted that the extracted load profiles refer to the year 2011. In case the database of the consumer is updated with new data, the new daily load curves can be distributed to the existing clusters or a new clustering can take place that will include the new daily load curves within the existing load data set.

DSM Objective: SC
The proposed model is applied to a random test day, namely 8 August 2011, which was the 2nd Monday of August. In the present paper, a novel method is proposed to derive dynamic price elasticity curves, i.e., price elasticities that vary per hour. Figures 10-12 show the price elasticity curves of the test day of 2011 considering the linear, exponential and logarithmic models for both consumers, respectively. The procedure to obtain the dynamic price elasticity curves is the following: Clustering is applied to derive the price elasticity of the consumers for the test day under study, based on the similarities of the test day with other days per year. Using the improved FCM for k = 6, all the clusters that contain the 2nd Monday of August are identified; let $I_{day}$ denote the set of days of these clusters. Let $\bar{I}_{day} \subset I_{day}$ be the subset that does not contain the days that precede the test days of each year. The days of $\bar{I}_{day}$ that correspond to 2011 are gathered in a new set, namely $J_{day}$. Then, using a curve fitting process on the daily load curves of $J_{day}$ and the prices $p_0(h)$, the parameters $a_{lin}$, $b_{lin}$, $a_{exp}$, $b_{exp}$, $a_{log}$ and $b_{log}$ are calculated. The curve fitting models that are applied to estimate the above parameters are the demand functions $d_{lin}(p(h)) = a_{lin} + b_{lin} p(h)$, $d_{exp}(p(h)) = a_{exp} e^{b_{exp} p(h)}$ and $d_{log}(p(h)) = a_{log} + b_{log} \ln(p(h))$, respectively. The next step is to calculate the hourly price elasticities of each day of set $J_{day}$ using Equations (23)-(25). By averaging the daily price elasticities per hour that are extracted via the fitting process, the hourly price elasticities, i.e., the daily price elasticity curve of the test day of 2011, are derived. According to Figures 10-12, consumer 2 is less flexible. The linear model leads to more volatile curves. The logarithmic model leads to more stable elasticity curves with lower values; values close to zero refer to inelastic behavior. The exponential model leads to price elasticity curves with values that generally lie between those of the linear and logarithmic models. It can be concluded that the selection of the DR model affects the derived elasticity curve values.

The purpose of SC is to reduce the overall net consumption during a period. Usually, this is accomplished by motivating consumers to use energy efficient equipment. In the proposed model, the SC objective is materialized by an RTP program. Although price-based DR programs such as RTP are voluntary, in the present study it is assumed that the consumers respond to all price signals. In the SC objective, ten minutes prior to the hourly RTP signal, the nominal loads $d_0^1(h)$ and $d_0^2(h)$ are known to the Retailer. For the test day under study, an hourly reduction of R(h) = 10% is assumed. The upper limit of prices is set to u(h) = 150%, i.e., the offered prices derived by the optimization problem, $p_1(h)$ and $p_2(h)$, should not exceed the nominal price $p_0(h)$ by more than 50%. Price limits can be set by the Retailer or the regulatory authority, reflecting market competition and conditions. The SO has no influence on the pricing policy of the Retailer. Extremely high prices may lead to consumers not complying with the price signals. The parameters R(h) and u(h) can vary per hour. In the present test case, they are considered unchanged within the 24 h period of the test day. Thus, a total 10% reduction objective applies for each hour. Other potential SC objectives may refer to an average 10% reduction per day or to a 10% reduction per consumer. The scope is to derive the RTP profiles that will lead to the reduction target.
Prices p_1(h) and p_2(h) refer only to the electricity cost. Additional expenses arising during the operation of the Retailer can also be included. The optimization problem is solved hourly and separately for each demand function. Prices p_1(h) and p_2(h) obtained by the solution are sent to the consumers. The responses, or final demands, d_1(p_1(h)) and d_2(p_2(h)) are calculated after the termination of the DR event. Figure 13 presents the results of the SC objective employing the linear DR model. The nominal daily price profile is depicted in Figure 13a and is common for both consumers. The RTP profiles of the two consumers refer to varying prices that are higher than the nominal ones in order to bring forth the desirable SC objective. RTP#1 corresponds to consumer 1 and includes lower charges than RTP#2, which is applied to consumer 2. The price elasticity profiles play a significant role in the determination of the RTP. As reported by Figures 10-12, consumer 1 is more sensitive to prices and thus, lower electricity charges can be applied. In addition, due to its elastic behavior, at least theoretically, consumer 1 should contribute the largest part of the 10% reduction target. The two consumers do not only differ in terms of load profiles and load magnitudes, but also in terms of price elasticity profiles.
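To make the hourly pricing step concrete, the sketch below solves a simplified single-hour version of the problem for two consumers with linear demand functions, using an off-the-shelf constrained solver (SLSQP) instead of the GA employed in the paper. The objective function (minimizing the deviation of the offered prices from the nominal price) and all numerical values are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of the hourly SC pricing problem for two consumers, assuming
# linear demand functions d_k(p) = a_k + b_k * p and an illustrative objective.
# All parameter values are hypothetical.
import numpy as np
from scipy.optimize import minimize

p0 = 60.0                     # nominal price for this hour (assumed)
a = np.array([120.0, 80.0])   # fitted intercepts a_lin per consumer (assumed)
b = np.array([-0.60, -0.25])  # fitted slopes b_lin per consumer (assumed, negative)
d0 = a + b * p0               # nominal demands implied by the linear model
R, u = 0.10, 1.50             # 10% hourly reduction target, 150% price cap

def demand(p):
    return a + b * p

def objective(p):
    # illustrative choice: keep offered prices as close to the nominal price as possible
    return np.sum((p - p0) ** 2)

constraints = [{
    "type": "eq",
    # total final demand must equal (1 - R) times total nominal demand
    "fun": lambda p: np.sum(demand(p)) - (1.0 - R) * np.sum(d0),
}]
bounds = [(p0, u * p0), (p0, u * p0)]   # prices between nominal and the cap

res = minimize(objective, x0=np.array([p0, p0]), method="SLSQP",
               bounds=bounds, constraints=constraints)
p_offered = res.x
# per-consumer contribution to the reduction target (depends on the chosen objective)
shares = (d0 - demand(p_offered)) / (R * np.sum(d0))
print(p_offered, shares)
```

How the reduction is split between the two consumers depends entirely on the objective function and limits chosen by the Retailer; the allocation reported in Figure 13 results from the paper's own formulation and its GA solution, not from this simplified stand-in.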
The more flexible the demand is, the higher the deviations from the nominal demand that are expected. It can be observed from Figure 13a that both RTP profiles generally follow the pattern of the nominal profile. The latter profile aims at formulating a demand pattern for HV consumers that is poorly correlated with that of the interconnected system of Greece. The scope is to shift HV loads to the system's low consumption periods. Contrary to the nominal price profile, which is a typical TOU rate tariff, the RTP profiles vary per hour, presenting a more accurate and fair pricing policy to the HV consumers. RTP has the potential to deliver all the different DSM objectives. Also, by linking RTP with the system marginal prices of the wholesale market, the actual generation costs can be transferred to the consumers. The nominal and final demands of the two consumers are shown in Figure 13b,c. The reduction levels are in accordance with the RTP profile variations. Each consumer contributes to the reduction target by a different share that depends on the price elasticity and the offered price. It is shown that the SC objective is successfully manifested. The first morning and the late night hours correspond to higher consumption periods. Consumption progressively decreases until 08:00 h and remains low until midnight. Afterwards, the consumption increases again. The higher share for delivering the 10% of total initial load reduction target refers to consumer 1. This share varies per hour. The daily average is 79.97%. Figures 14 and 15 present the results considering the logarithmic and exponential DR models, respectively. The comparison between the various DR models is made in order to examine whether the selection of a different DR model influences the success of the SC objective at delivering the predefined level of reduction set out by the SO. To keep the comparison feasible, the same load reduction R(h) and price limit u(h) are kept for all DR models. The logarithmic DR model leads to different contributions to the reduction target by the consumers. The highest one is met during the peak hours. This is because of the high deviation of the RTP from the nominal price. Recall that, apart from the level of the offered price, price elasticity is a critical parameter for the fulfillment of the DSM objective. Again, consumer 1 accounts for the largest share in the reduction target. In the logarithmic model, the price profiles follow more strictly the pattern of the nominal price. In the exponential DR model, the price elasticity profiles display a correlation in shape with the nominal price profile. This model leads to prices that lie between those of the other models. All models lead to the fulfillment of the R(h) = 10% target.
The difference among the models lies in the price profiles and the contributions of the consumers to the target. In addition, the upper price limit u(h) = 150% is never reached. Apart from the price profile shapes, each model leads to prices that differ in magnitude.

DSM Objective: LS
LS refers to the shifting of loads from on-peak to off-peak periods. While the peak demand is decreased, there is no change in the total consumption: peak loads that are cut are added to the loads of off-peak hours. In the present test case, the net electricity consumption remains unchanged; the amount of load that is cut from on-peak hours equals the one that is added to the off-peak hours. The LS objective can be combined with an SC option; in that case, the peak load that is shifted would not be equal to the load added to the off-peak loads, since a peak load reduction phase would precede the load filling event.
The LS objective is a combination of two DR events: a load reduction by 10% at hours 04:00, 05:00 and 06:00 h and a load increment at hours 19:00, 20:00 and 21:00 h. The load that is removed from the morning hours is aggregated to the existing loads of 19:00, 20:00 and 21:00 h. While hours 19:00, 20:00 and 21:00 h refer to off-peak hours, they do not refer to the lowest daily consumption. Although the two consumers present their lowest consumption in different periods, the above hours are selected since they correspond to low consumption for both consumers. However, the proposed model can accommodate different load filling periods per consumer, as well as more than one peak shaving and load filling period per consumer. In the LS objective, the duration of the two DR events is the same, each one lasting 3 h. Figure 16 presents the results of the LS considering the linear DR model. The price profiles of the LS objective using the logarithmic demand function are displayed in Figure 17a, while the nominal and final loads of the consumers are shown in Figure 17b,c, respectively. Figure 18 presents the results of the LS objective using the exponential model. In the present test case, an hour-per-hour shifting is employed, i.e., the load from 04:00 h is aggregated to the load of 19:00 h, the load from 05:00 h is aggregated to the load of 20:00 h and the load from 06:00 h is aggregated to the load of 21:00 h. This per-hour load matching concept is held per consumer.
Also, the LS is held consumer-wise, i.e., the load of 04:00 h of consumer 1 is aggregated to the load of 19:00 h of consumer 1, etc. However, different combinations can also be applied. It is not obligatory for instances i, i' and i" to have equal duration as long as the condition I(i') + I(i") ≤ R(i') + R(i") is satisfied. The load reduction amount can be distributed in various periods of the day.
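A minimal sketch of this bookkeeping is given below. The nominal load values and the 10% shaving level are assumptions used only to illustrate that the hour-per-hour, consumer-wise shift leaves the daily consumption unchanged.

```python
# Sketch of the hour-per-hour, consumer-wise bookkeeping used in the LS test case:
# the load cut at 04:00-06:00 h is added to 19:00-21:00 h for the same consumer.
# Load values and the 10% shaving level are illustrative assumptions.
import numpy as np

shave_hours = [4, 5, 6]        # peak shaving instances i
fill_hours = [19, 20, 21]      # load filling instances, matched hour per hour
R = 0.10                       # share of load removed at each shaving hour

def shift_loads(d0):
    """d0: nominal 24-hour load of one consumer; returns the target final load."""
    d = d0.astype(float).copy()
    for i, j in zip(shave_hours, fill_hours):
        cut = R * d0[i]        # load removed at shaving hour i
        d[i] -= cut
        d[j] += cut            # the same amount is added at the matched filling hour
    # total consumption is unchanged: the cut load equals the filled load
    assert np.isclose(d.sum(), d0.sum())
    return d

# illustrative nominal profile for one consumer (assumed values)
d0_consumer1 = np.array([30.0] * 8 + [22.0] * 12 + [26.0] * 4)
print(shift_loads(d0_consumer1)[[4, 5, 6, 19, 20, 21]])
```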
In order to achieve the LS objective, prices at 04:00, 05:00 and 06:00 h should be higher than the nominal ones and prices at 19:00, 20:00 and 21:00 h should be lower, so that the consumers increase their consumption. Price limits are set to u(i) = 150%, u(i') = 90% and u(i") = 90%.
RTP profiles differ from the nominal one only during the instances of load reduction and load increment. This is also the case for the final loads. Since consumer 1 is more sensitive to prices, it is offered a lower price. The GA utilized for solving the optimization problem results in slightly different outputs, i.e., prices p_1(h) and p_2(h), if executed multiple times. This is an inherent characteristic of the GA. The initialization of the chromosome populations is done randomly and therefore, different executions of the algorithm lead to different outputs. To keep the comparison with the SC objective practical, the prices of the peak shaving event, p_1(i) and p_2(i), are kept equal to those of the corresponding hours of the SC objective, p_1(h) and p_2(h). As in the case of the SC objective, the upper price limit u(i) = 150% is never met by the solution of the optimization problem.
Also, the upper price limits during the load filling events, u(i') = 90% and u(i") = 90%, are not met. Consumer 2 is offered a higher price during the peak shaving event and a lower price during the load filling event. As in the case of the SC objective, the exponential model results in prices that lie between those of the other models.
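Since the paper reports that a GA with randomly initialized chromosome populations is used, the following sketch shows how such a solver behaves on a simplified single-hour pricing problem with linear demand functions. The encoding, fitness function, penalty weight and all numerical values are illustrative assumptions; repeated runs yield slightly different prices, as noted above.

```python
# Minimal GA sketch for one hour of the pricing problem. The random initialization
# means repeated runs return slightly different prices, echoing the paper's remark.
# Encoding, fitness and parameters below are assumptions, not the paper's formulation.
import numpy as np

rng = np.random.default_rng()          # no fixed seed: runs differ slightly

p0 = 60.0                              # nominal price (assumed)
a = np.array([120.0, 80.0])            # linear demand intercepts (assumed)
b = np.array([-0.60, -0.25])           # linear demand slopes (assumed)
d0 = a + b * p0
target = 0.90 * d0.sum()               # R(h) = 10% reduction of the total demand
lo, hi = p0, 1.50 * p0                 # price bounds [p0, u(h)*p0]

def fitness(pop):
    # chromosomes = candidate price pairs; penalize missing the reduction target
    demand = a + b * pop
    miss = np.abs(demand.sum(axis=1) - target)
    closeness = ((pop - p0) ** 2).sum(axis=1)
    return closeness + 1e3 * miss      # lower is better

def run_ga(pop_size=60, generations=200, mut_scale=2.0):
    pop = rng.uniform(lo, hi, size=(pop_size, 2))      # random initial population
    for _ in range(generations):
        f = fitness(pop)
        parents = pop[np.argsort(f)[: pop_size // 2]]  # truncation selection
        # uniform crossover between random parent pairs
        idx = rng.integers(0, len(parents), size=(pop_size, 2))
        mask = rng.random((pop_size, 2)) < 0.5
        children = np.where(mask, parents[idx[:, 0]], parents[idx[:, 1]])
        # Gaussian mutation, clipped to the price bounds
        children += rng.normal(0.0, mut_scale, children.shape)
        pop = np.clip(children, lo, hi)
        pop[0] = parents[0]                            # elitism: keep the best so far
    return pop[np.argmin(fitness(pop))]

print(run_ga())   # offered prices p_1(h), p_2(h); repeated runs vary slightly
```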

DSM Objective: VF
Figures 19-21 present the results corresponding to the linear, logarithmic and exponential models, respectively. A VF objective can be combined with other DSM objectives or exercised solely. During the VF event, consumers are encouraged to increase consumption during periods of low system demand. The goal of the VF objective is to build up off-peak loads in order to smooth out the load curve and achieve higher load factor ratios. To achieve a VF objective, the Retailer sends lower price signals to the consumers. The VF event is employed at 19:00, 20:00 and 21:00 h for both consumers; thus, the event includes three consecutive hours. All models achieve the 10% load increment target. In the linear model, the offered prices are reduced by 13.77%, 16.64% and 14.87% with respect to the nominal price at hours 19:00, 20:00 and 21:00 h for consumer 1, respectively, and by 18.19%, 22.61% and 16.73% for the same hours for consumer 2, respectively. While in the formulation of the proposed model prices can be reduced to zero, extremely low prices may lead to poor profits for the Retailer and are not good practice within the competitive retail energy market. Hence, a lower limit should also be placed.
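For a rough illustration of how a VF price signal can be obtained under the linear demand model, the following sketch back-solves the price that yields a 10% load increment at one filling hour. The fitted parameters and the nominal price are hypothetical values, not those of the test case.

```python
# Back-of-the-envelope sketch of a VF price signal under the linear demand model:
# the price needed for a 10% load increment follows from d_lin(p) = a_lin + b_lin * p.
# Parameter values are illustrative assumptions only.
a_lin, b_lin = 120.0, -0.60     # fitted linear demand parameters (assumed)
p0 = 60.0                       # nominal price at the filling hour (assumed)

d0 = a_lin + b_lin * p0         # nominal demand at the nominal price
target = 1.10 * d0              # VF objective: 10% load increment
p_offered = (target - a_lin) / b_lin
reduction = (p0 - p_offered) / p0
print(p_offered, f"{reduction:.1%}")   # offered price and its reduction vs. nominal
```

In the actual model, the prices for all consumers are obtained simultaneously from the constrained optimization problem, subject to the lower price limit discussed above.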

Conclusions
Due to the various benefits of DSM in the operation of electric power systems, many utilities and system operators focus on the application of DSM programs for lowering the risks of generation shortage and network congestion. In the present paper, the DSM is regarded as a resource in the day-ahead energy market. A novel model has been proposed for achieving predefined DSM targets by using RTP schemes. Apart from the prices that lead to the achievement of the target, the model derives the contribution of each consumer to the DSM target based on its price elasticity.
Another contribution of the paper is the development of a novel soft clustering algorithm that leads to more robust clusters compared with the most commonly used soft clustering algorithm of the machine learning literature, i.e., the FCM. The new algorithm does not increase the complexity of the load profiling problem in terms of the input parameters required prior to its execution. The new algorithm is part of the new method for extracting dynamic price elasticity curves. Time-variant price elasticities offer a more representative way to simulate the consumers' response to RTP programs. Regarding the load profiling analysis, the major findings of the article can be summarized in the following points:

•
In all cases examined, the proposed FCM algorithm leads to lower clustering errors. Therefore, a specific initialization formula for the partition matrix is preferable to the random initialization used in the conventional form of the algorithm (a minimal sketch of this initialization is given after this list). The K-means algorithm was used for the initialization due to its proven efficiency in many clustering problems. However, virtually any clustering algorithm can be used.

•
It is necessary to include a variety of assessment criteria to reach safe conclusions regarding the superiority of the algorithm. The criteria can be categorized into two types: (a) mathematical indicators that measure the Euclidean distances; and (b) parameters that are related to complexity. The term complexity may refer to a wide range of definitions. If complexity is related to the required execution time, the time elapsed to run the algorithm for a variable number of clusters expresses the suitability of an algorithm. The two algorithms compared in the paper lead to comparable execution times. The conventional form of the algorithm is slightly faster, since the proposed FCM includes an initial clustering with the K-means. If it is required to decrease the execution time of the proposed algorithm, it is recommended to use hierarchical agglomerative clustering algorithms instead of the K-means. Also, if complexity is expressed as the input parameters that need to be defined prior to the execution of the algorithm, the proposed algorithm needs two additional parameters to be set by the user: (a) the objective function improvement threshold of the K-means and (b) the number of iterations of the K-means.

•
A variety of indicators have been used for the algorithm comparison. These indicators can be found in the load profiling-related literature. A reliable indicator should provide information about the optimal number of clusters. Theoretically, the optimal number of clusters is the largest possible. In practical terms, however, the optimal number should satisfy two contradictory conditions: (a) low clustering error and (b) clusters containing a considerable number of patterns. A large number of clusters is generally not recommended, as it increases the difficulty of exploiting the clustering results. Among the indicators used in the study, the WCBCR measure is suggested. This is due to the following reasons: (a) it exhibits a decreasing tendency that in some cases becomes monotonic and therefore, it is suitable for the application of the "knee" point detection method and (b) it measures both the compactness and the separation of the generated clusters. The IEI measure is the least suitable due to its volatile shape. Both the SI and IAI are preferred if the WCBCR is not considered.

•
The optimal number of clusters for both consumers is below 10. According to the shapes of the load profiles, the demand is high during night hours, again due to the nominal tariff structures. The load profiles do not display sudden peaks.
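As referenced in the first bullet above, the core idea of the proposed initialization can be sketched as follows: run K-means, convert its hard labels into an initial partition matrix, and then continue with the standard FCM updates. The code is an illustrative reading of that idea, not the paper's exact implementation; the fuzzifier m, the slight smoothing of the initial memberships and the toy data are assumptions.

```python
# Minimal sketch: K-means-initialized partition matrix followed by standard FCM updates.
import numpy as np
from sklearn.cluster import KMeans

def fcm_with_kmeans_init(X, k=6, m=2.0, iters=100, tol=1e-6, seed=0):
    n = X.shape[0]
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
    # initial membership matrix: mostly crisp, slightly smoothed to stay fuzzy
    U = np.full((n, k), 0.05 / (k - 1))
    U[np.arange(n), labels] = 0.95
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]            # weighted centroids
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U_new = 1.0 / (dist ** (2.0 / (m - 1.0)))                 # standard FCM update
        U_new /= U_new.sum(axis=1, keepdims=True)                 # rows sum to 1
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return U, centers

# toy usage on random "daily load curves" (96 patterns of 24 hourly values)
U, centers = fcm_with_kmeans_init(np.random.rand(96, 24), k=6)
print(U.shape, centers.shape)
```

The initial K-means pass is also what accounts for the slightly longer execution time of the proposed algorithm mentioned in the second bullet above.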
Regarding the proposed method for deriving the dynamic elasticity curves, the findings of the paper are gathered in the following statements:

•
In order to derive the price elasticity curves, it is recommended to use data that cover all seasons of the year. In this way, the weekly and seasonal variations of the demand changes with respect to prices are included in the calculation.

•
The type of the DR model affects the price elasticity curves. The linear DR model results in lower values of price elasticities. It can be noticed that there are large differences between the hourly values; this is more evident in Consumer#1. Therefore, it is recommended to form consumer-specific curves instead of using pre-defined constant values.

•
To provide a robust modelling of the demand behavior, it is recommended to update the price elasticity curves periodically or in special cases of demand changes. For instance, if large demand variations are observed or expected for various reasons, a re-calculation of the hourly elasticities prior to the implementation of a DSM objective is advised.
The concluding remarks drawn from the experiments performed with the proposed model are gathered in the following points:

•
The formulation of the optimization problem is robust; in all cases examined, it reaches a feasible solution that satisfies the limits imposed by the user.

•
All the DSM objectives presented in Figure 2 can be implemented in the model. Three cases have been examined. The prices offered by the Retailer depend on the price limits, the type of the price/demand function and the type of model used to derive the price elasticity curves. The elasticity is the most significant factor that determines the level of success of the DSM objective. A careful selection of price limits keeps the Retailer competitive within the retail market.

•
The proposed model makes it feasible to achieve a DSM goal with no intervention in the apparatus of the consumers. It is up to the consumer to decide which loads will be cut off or shifted within the daily period. Therefore, the industrial consumer retains more control over its equipment and production processes.

•
The proposed model is highly flexible and can be applied to various types of consumers (e.g., buildings, residential consumer clusters and others). Also, different pricing schemes such as TOU rates can be included, as long as they result in the satisfaction of the DSM target.
The present research can be expanded into the following areas:

•
In the case of load profiling, dimensionality reduction techniques such as principal component analysis can be examined, for the purpose of lowering the clustering execution time.
The patterns can be represented with a reduced set of features and this concept, at least theoretically, should reduce the clustering duration. This would be more noticeable in cases with large amounts of data gathered by smart meters.

•
To modify the proposed model in order to simulate the Retailer's economic benefit. For instance, the demand reduction level in the SC objective can be negotiable between the System Operator and the Retailer. If the SC objective is partially fulfilled, a penalty can be applied to the Retailer.

•
Another expansion of the model should take into account cases where self-production units, such as biomass units or photovoltaics, are present on the consumer's side.
The model developed in this study is characterized by simplicity. Its execution time is a few seconds on a common computer configuration. Thus, it has the potential to be included in the Retailer's decision-making portfolio.