Energy Flexometer: Transactive Energy-Based Internet of Things Technology

Abstract: Effective energy management with active Demand Response (DR) is crucial for future smart energy systems. The increasing number of Distributed Energy Resources (DER), local microgrids and prosumers has a real influence on the present power distribution system and generates new challenges in power, energy and demand management. A relatively new paradigm in this field is transactive energy (TE), with its value- and market-based economic and technical mechanisms to control energy flows. Due to the distributed structure of present and future power systems, an Internet of Things (IoT) environment is needed to fully explore the flexibility potential of end-users and prosumers and to offer a bid to the involved actors of the smart energy system. In this paper, a new approach to connect the market-driven (bottom-up) DR program with the current demand-driven (top-down) energy management system (EMS) is presented. The authors consider a multi-agent system (MAS) to realize the approach, introduce the concept and standardize the design of a new Energy Flexometer, which is proposed as the fundamental agent in the method. Three different functional blocks have been designed and presented as an IoT platform logical interface according to the LonWorks technology. An evaluation study has been performed as well. The results presented in the paper confirm the proposed concept and design.


Introduction
With the growing implementation and use of distributed energy resources as well as renewable energy sources (RES) in power systems [1,2], the importance of effective energy management systems (EMS) with active control and monitoring functions has never been so high. Modern EMS organized with distributed control systems as well as building automation and control systems (BACS) provide tools for easy implementation of demand response (DR) and active demand side management (DSM) systems [3,4], the key mechanisms considered in effective energy and power management within the smart grid (SG) [5,6]. The SG concept has been proposed over the last few years to modernize and facilitate the operation of power systems in the presence of DER and RES, electric energy storage and local microgrids including prosumers. A relatively new concept in the effective management of energy sources and loads connected to the SG is transactive energy. Transactive energy (TE) was introduced by the GridWise Architecture Council and refers to the use of a combination of economic and control techniques to improve SG reliability and efficiency, using value as a key operational parameter [7,8]. Since control techniques are elements of TE, the development of a software library that contains advanced algorithms enhances the intelligence of agents. In particular, the algorithms consider a local electronic auction to regulate the system, in which the participants send their requested power in the form of a bid to an upstream agent (namely, the auctioneer). The auctioneer finds a balance between production and consumption by adding all bids from the agents. It then responds to the participants with a demand schedule on a day-ahead basis [27]. There are also some technological applications, such as PowerRouter by Nedap or Intelliweb by Mastervolt. These solutions perform intelligent control of solar energy at home in order to increase power injection to the grid.
They provide access to data via their data servers for real-time acquisition and control [28,29]. In addition to these works and applications, there are other projects that use MAS-based EMS with DSM to improve the quality and control of the power system, such as ForskEL in Denmark [30]. Some of them are focused on local microgrids with prosumers and their collaboration with the SG [31,32]. Others propose and analyze the possible use of BACS, building energy management systems and home automation systems to organize distributed and integrated network platforms for effective implementation of MAS in EMS and DSM applications [33,34].
Moreover, an ongoing European Union project titled "Multi-agent systems and secured coupling of Telecom and EnErgy gRIds for Next Generation smart grid services" is aimed at developing an IoT platform as a tool for the management, control and monitoring of low-voltage power grids. The project proposes technical solutions both for increasing the security of bi-directional communications and for integrating last-mile connectivity with distributed optimization technologies [35,36].
Although many studies have been carried out on demand management performance in power systems, current evidence is insufficient to derive and generalize principles. Moreover, different research communities have tried to tackle the issue of DR integration in EMS within their own expertise, at the cost of precision in the respective domain. Hence, the studies are not representative, which is taken into consideration in this paper. The paper proposes the "Energy Flexometer" as a monitoring, controlling and bi-directionally communicating node or agent that can be integrated easily in a MAS-based EMS. The authors describe the concept of their solution and propose an application interface for a universal IoT platform. Moreover, the Energy Flexometer as an agent node has been implemented in a small proof-of-concept application to test and verify the concept and the proposed BACS and IoT communication technologies. Scalability of the proposed solution is not considered in this paper; it will be a subject of future work.
The rest of this paper is organized as follows. Section 2 provides details about the design and concept of the proposed Energy Flexometer with its standardized logical interface. Section 3 presents the demand elasticity estimation approach and algorithms. In Section 4, the physical implementation of the proposed Energy Flexometer concept and technical solution is discussed. Details of a small test installation and the evaluation of results are discussed in Section 5. Finally, Section 6 gives the conclusions and outlines future work.

Concept
The concept of the Energy Flexometer lies in the implementation of TE at the low-voltage level, as shown in Figure 1. Figure 1 shows that in a TE-based EMS or BACS, the aggregator and the domotic agents are the smart nodes that distribute the decision-making task among each other. In MAS-based TE, as discussed in [22], the aggregator is the uppermost agent, whose objective is to mitigate network issues, to solve local imbalance, or both. Domotic agents, on the other hand, are coordinators that transmit the aggregated value proposition (i.e., bid) to the aggregator and receive control information from it in a real-time environment. Moreover, the domotic agent standardizes the value proposition as a key element of the FlexiblePower Application Infrastructure (FPAI) platform, proposed and developed by the Flexible Power Alliance Network. FPAI has introduced a set of rules and protocols to create interoperability between the aggregator and the domotic agents. However, no suitable effort has been made to improve the interoperability between the domotic agents and the real physical loads/generators. Therefore, the concept of the Energy Flexometer is introduced in this study to fill this research gap. It provides a standardized design of lower agents to make the architecture more interoperable. The Flexometer helps the entire architecture in three ways. Firstly, its standardized embedded system can turn dumb loads into smart appliances, hence increasing the flexible demand in an aggregator portfolio. Secondly, the standard design allows devices to be integrated into the system easily. Thirdly, learning from acquired knowledge can be used for demand dispatch and other planning purposes. Moreover, the learning capability of the Flexometer can upgrade smart appliances to be more intelligent.

Design
To implement the Energy Flexometer in an EMS, the authors propose Echelon's IzoT platform. IzoT is considered a new version of the LonWorks technology dedicated to BACS, using IP-all-the-way connections to the end devices. It provides a ready-to-develop platform (including microprocessors, application programming interfaces, a communication protocol, and management and integration software). According to the LonWorks standard, interoperability between IzoT nodes is provided by functional profiles. Based on this platform, a standardized design of the Energy Flexometer is proposed in this section. The Energy Flexometer supports three main functions: (1) an Energy Meter, (2) an Energy Logger and (3) an Elasticity Learner, as shown in Figure 2.
Functional profiles provide definitions for both network variables (NVs) and configuration properties (CPs), which are included in functional blocks as the algorithms require. Moreover, the functional blocks proposed in this paper are designed according to the Semantic Device Descriptions model presented in [37]. In this way, they are open and ready to use in the Component-based Automation Systems model introduced in [38].
All NVs and algorithms proposed in this paper are universal and could be seamlessly integrated into networks based on other international BACS standards. Thus, interoperability is one of the most important objectives of the Energy Meter functional profile. The developed profile describes the application layer interface (NVs, CPs) and defines the functional blocks proposed for this application. The NVs are essential elements of a BACS module's network interface for binding network variables from other nodes. They are defined with the prefix nvi for inputs and nvo for outputs, providing data and information in the BACS and simplifying the integration process (development and installation of distributed systems). Moreover, according to the LonWorks standard assumptions and requirements, all NVs are optimized as short data objects to minimize the load on the data communication channels. This is important in view of further implementations of the proposed concept in larger EMS. In this way, the BACS devices can be defined individually and then easily rearranged into new applications. The NVs are thus essential for interoperability between nodes. Herein, the functional blocks are designed in such a fashion that they are collectively able to express all kinds of primary process parameters and customer preferences. Bearing in mind all these technological aspects, the proposed Energy Flexometer concept is ready to be implemented in different applications, in both small and large EMS. However, scalability is not considered in this paper, as it is ensured by the LonWorks technology and other open, international BACS standards, which are well known, standardized and proven in many applications [39,40].

Energy Meter
Changes within the device and its primary process parameters are acquired by the energy meter in real time. The energy meter also captures changes that occur due to customer preferences (e.g., the device is controlled implicitly by the user or explicitly by the EMS) or naturally due to environmental change (e.g., room temperature). The energy meter then stores the current primary process parameters and customer preferences as a table with columns for all network variables and configuration properties, respectively. Table 1 shows the most important NVs in the IoT Energy Meter functional block, which provide measurements of electrical parameters, such as nvoEnergy, nvoVoltage, nvoCurrent, nvoPower or nvoFreq. These NVs are meter value outputs, i.e., the actual running values with timestamps. All these output NVs are transmitted when polled or when triggered by a Send On Delta condition; to adjust the communication settings, cpParamSendDelta and cpParamMaxSendTime are proposed (where Param could be Energy, Power, Voltage, Current or Freq). The nvoEnergy could be set to zero using nviEnergyClear; nevertheless, the total active energy value is non-resettable and provided by nvoRegValueEnergy. Control of the load or group of loads connected to the Energy Flexometer is provided by nviSwitch and nvoSwitch, i.e., they are dedicated to controlling the state of the relay actuator. Moreover, the nvoStatus variable contains data related to the internal status conditions of the energy meter.
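The Send On Delta behaviour described above can be illustrated with a minimal Python sketch. It assumes that cpParamSendDelta is the minimum change that triggers a transmission and that cpParamMaxSendTime is the longest allowed silence between two transmissions; both interpretations, as well as the class and function names, are assumptions made for illustration and are not taken from the functional profile itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SendOnDelta:
    """Decides when an output NV such as nvoPower would be propagated."""
    send_delta: float      # assumed role of cpParamSendDelta
    max_send_time: float   # assumed role of cpParamMaxSendTime, in seconds
    last_value: Optional[float] = None
    last_sent_at: Optional[float] = None

    def should_send(self, value: float, now: float) -> bool:
        if self.last_value is None:
            send = True  # nothing has been sent yet
        else:
            changed = abs(value - self.last_value) >= self.send_delta
            expired = (now - self.last_sent_at) >= self.max_send_time
            send = changed or expired
        if send:
            # Remember what was sent and when, for the next comparison.
            self.last_value = value
            self.last_sent_at = now
        return send


def transmitted(samples, send_delta, max_send_time):
    """Return the (time, value) samples that would actually be sent."""
    gate = SendOnDelta(send_delta, max_send_time)
    return [(t, v) for t, v in samples if gate.should_send(v, t)]
```

With a delta of 5 and a maximum send time of 60 s, the samples `[(0, 100.0), (1, 101.0), (2, 106.0), (3, 106.5), (70, 106.6)]` reduce to three transmissions: the first value, the jump at t = 2, and the value at t = 70 that is sent only because the timer expired.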

Energy Logger
Once the energy meter has acquired the agent state (i.e., primary process parameters and customer preferences), the energy logger separates it into events. Each event denotes an instance with respect to the state and the control action that was being performed. The pair of action and state is then logged into the logger.
To apply a learning algorithm, the state must be mapped to a Markov decision process (MDP) consisting of a data tuple (state, action, transition probability, reward). The state $x_k$ of an agent includes all possible network variables and configuration properties. The action $u_k$ captures the control action applied to an agent (e.g., turning a lamp on/off). The transition probability is a vector describing the transition of an agent from the current state to a new state for a given action. The reward $r_k$ herein is simply the marginal energy cost incurred by an agent for the given state $x_k$ and action $u_k$. The general rule in a competitive market environment is that profit is maximised at the quantity of output where marginal revenue equals marginal cost. Hence, $r_k$ in the given formulation is defined accordingly, where $r_k$ represents the vector of marginal cost expected to be incurred by the Flexometer, $\Gamma_k$ represents the market price during the $k$-th interval, and $\varepsilon_k$ is the price elasticity of demand.
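For reference, the textbook relation between marginal revenue, price and price elasticity of demand is sketched below. It is a standard microeconomic identity and not necessarily the exact form of the reward used in this formulation; the symbols $R$ (revenue) and $x$ (quantity) are introduced here for illustration only.

```latex
\begin{align}
R &= \Gamma_k \, x, \qquad
\varepsilon_k = \frac{\partial x}{\partial \Gamma_k} \cdot \frac{\Gamma_k}{x} \\
\frac{\partial R}{\partial x}
  &= \Gamma_k + x \, \frac{\partial \Gamma_k}{\partial x}
   = \Gamma_k \left( 1 + \frac{1}{\varepsilon_k} \right)
\end{align}
```

Under the rule that marginal revenue equals marginal cost, a reward of the form $r_k = \Gamma_k \left(1 + 1/\varepsilon_k\right)$ would follow, which is consistent with the three quantities named in the text.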
The IoT Energy Logger functional block includes NVs and CPs related to its functions, as shown in Table 2. Essential for the Energy Logger are the NVs providing information about energy. The first one, nvoEnergy, is a copy of the current meter value (for the last month). This network variable can also be used to display historical data stored by the unit. The desired output data can be selected with the nviTimeSelection variable. After nviTimeSelection is set, nvoEnergy is updated with the cumulative meter value as it was at the requested time. The nviTimeSelection controls, according to the time, which historical value is shown on the output side via nvoEnergy and nvoEnergyHistTime. If the time is outside the accepted range, the register output nvoEnergy is zero and the status field indicates "Illegal value request". Furthermore, a group of NVs dedicated to power demand handling is important as well. For example, nvoDemand holds the demand value, which is the average power calculated for a specified time interval. Moreover, a rolling demand function with a "sliding window" mode is supported by the IoT Energy Logger. In this case, the demand calculation is carried out over a fixed number of subintervals, providing the average power value for a specific time interval. This results in better accuracy, especially for demand peaks (nvoDemandPeak and nvoDemandPeakTime). The nvoStatus provides information about the activated algorithms for demand calculation as well as the modes for control and operation.
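The rolling demand calculation can be sketched as follows. This is a minimal illustration, assuming that the demand is the mean of the most recent subinterval averages and that no demand is reported until the window is full; the class name and the attribute names are hypothetical and only mirror the roles of nvoDemand, nvoDemandPeak and nvoDemandPeakTime.

```python
from collections import deque


class RollingDemand:
    """Sliding-window demand over a fixed number of subintervals."""

    def __init__(self, subintervals: int):
        self.window = deque(maxlen=subintervals)
        self.demand = 0.0      # counterpart of nvoDemand
        self.peak = 0.0        # counterpart of nvoDemandPeak
        self.peak_time = None  # counterpart of nvoDemandPeakTime

    def close_subinterval(self, avg_power_kw: float, timestamp) -> float:
        """Add one subinterval average and return the updated demand."""
        self.window.append(avg_power_kw)  # the oldest entry drops out when full
        if len(self.window) == self.window.maxlen:
            self.demand = sum(self.window) / len(self.window)
            if self.demand > self.peak:
                self.peak, self.peak_time = self.demand, timestamp
        return self.demand
```

Because the window advances one subinterval at a time, a short peak is captured by whichever window position contains it, rather than being split across two fixed demand intervals.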

Energy Elasticity Learner
This functional block aims to forecast the state of an agent for an expected action, taking into account the price elasticity of demand. It was found in [13] that the price elasticity of demand can be successfully used for estimating an agent's value proposition (i.e., bid). Demand elasticity is defined as the change in demand $\partial x_k^{d,a}$ at an interval $k$ due to the change in the price $\partial \Gamma_k$ during the same interval. Mathematically,

$$\varepsilon_{kk} = \frac{\Gamma_0}{x_0^{d,a}} \cdot \frac{\partial x_k^{d,a}}{\partial \Gamma_k} \qquad (2)$$

where $\varepsilon_{kk}$ is the elasticity coefficient that indicates demand flexibility, $x_k^{d,a}$ is the flexible demand during time interval $k$, $\Gamma_k$ is the price signal during the same interval yielded by the domotic agent, $x_0^{d,a}$ is the initial flexible demand and $\Gamma_0$ is the initial price signal. From an economic point of view, $\varepsilon_{kk}$ represents the self-elasticity and can be used to calculate the demand sensitivity of an appliance with respect to the price signal. However, the concept (as shown in (2)) cannot be generalized for TE because it is imprecise to use initial or reference states for elasticity calculation at the access layer in a real-time environment. In this regard, the precise calculation of the demand elasticity at the access layer can be performed by using the concept of arc elasticity, defined by Seldon in [41]. Arc elasticity calculates the percentage change relative to the mid-points, thus resulting in (a) a symmetrical change concerning price and demand; (b) relational independence; and (c) unity, provided the total revenue at both points is comparable. Following the composite bidding rules mentioned in [42], the demand elasticity of the current $k$-th interval versus another interval $k'$ ($k' \neq k$, $\forall k' \in K$) can be defined in the same way; in that case, $\varepsilon_{kk'}$ corresponds to the cross-elasticity. Hence, combining self-elasticity and cross-elasticity in a matrix results in the demand elasticity matrix. In this matrix, $\varepsilon_{p,f}$, referred to as postponing cross-elasticity, maps the past input to the future output, thus receiving all necessary information about the previous behaviours/states of the consumer/agent that may influence the expected states. In turn, $\varepsilon_{f,p}$, referred to as advancing cross-elasticity, maps the expected future input to the past output, thus receiving the prediction of the future behaviour of an agent.

Table 3 shows the NVs and CPs of the Energy Elasticity Learner functional block. In this case, the crucial network variable is nvoExpDemand, which holds the expected demand value. It provides information from the online demand learning process, taking into account additional input parameters. Moreover, the input nviDemand receives the current demand value provided by nvoDemand from the IoT Energy Logger functional block. Both NVs (nvoExpDemand and nviDemand) can be correlated and compared with the actual demand value; the result is held in nvoAbsError. This is important for the management and control of loads in the EMS, and it could be used by the Domotic Agent module described in the FPAI concept.
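The mid-point (arc) elasticity and its arrangement into a matrix of self- and cross-elasticities can be sketched as follows. The sketch uses the generic mid-point formula and assumes that two observations of each interval (for example, two consecutive days) are compared; which pair of points is actually compared, and all function names, are assumptions made for illustration.

```python
import math


def arc_elasticity(demand_a, demand_b, price_a, price_b):
    """Mid-point elasticity between observations a and b."""
    mid_demand = (demand_a + demand_b) / 2.0
    mid_price = (price_a + price_b) / 2.0
    if mid_demand == 0 or mid_price == 0 or price_a == price_b:
        return math.nan  # elasticity is undefined without a price change
    pct_demand = (demand_b - demand_a) / mid_demand
    pct_price = (price_b - price_a) / mid_price
    return pct_demand / pct_price


def elasticity_matrix(demand_a, demand_b, price_a, price_b):
    """Entry [k][j]: demand change in interval k versus price change in j.

    Diagonal entries are self-elasticities, off-diagonal entries are
    cross-elasticities.
    """
    size = len(demand_a)
    return [
        [arc_elasticity(demand_a[k], demand_b[k], price_a[j], price_b[j])
         for j in range(size)]
        for k in range(size)
    ]
```

Swapping the two observations leaves the result unchanged, which is the symmetry property mentioned above: `arc_elasticity(10, 8, 0.20, 0.30)` and `arc_elasticity(8, 10, 0.30, 0.20)` both evaluate to about -0.556.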
The nviPrice provides information about changes in the electrical energy price, taking into account external signals. For example, information about a higher energy price affects the load profile of the building/object and could initiate a load control and shifting process. The nviOccupancy represents information about the presence of persons in rooms and affects the calculated value of nvoExpDemand depending on occupancy. The input nviParam is an action for the learning process to calculate nvoExpDemand and is associated with customer preferences. Every nviParam is updated by a customer before the activation of an agent. The most common temporal preferences are the start time and stop time of the agent, which means that during a day the agent can start from the given time and must complete its task before the identified time. In this way, it can be inferred that nviParam has a direct influence on PEM. The nvoStatus provides information about the selected demand elasticity calculation and learning algorithm as well as the control and operating modes. This information, including the operating mode of the DR services, could be used by other BACS devices, the Domotic Agent and the energy provider. The CPs allow the settings to be adjusted for proper operation of the IoT Elasticity Learner.

Demand Elasticity Estimation
Although there are many variants of machine learning, this study considers the Q-learning technique for solving the problem because it is used to learn primitive behaviours of an agent. Q-learning is a machine learning technique proposed in [43] for solving a Markov decision process with fragmentary knowledge. It is rooted in the training of an animal's behaviour in an environment, which is why social cognitive theory considers the concept of Q-learning for mimicking social responses. Therefore, demand elasticity estimation uses Q-learning to emulate the consumer's demand flexibility. The Q-learning technique consists of an agent that takes actions within an environment and receives respective experiences in the form of rewards for all possible states. The technique is suitable for the evaluation of demand elasticity because (a) it can form the problem in (4) as a combinatorial optimisation problem; (b) the consumer preferences (e.g., NVs and CPs) can help in perceiving the environment; (c) the Q-value simulates the consumer behaviour easily; and (d) the continuous updating process results in better estimation.
Therefore, this section explains in detail action space, state space, a reward function and action selection for the estimation of demand elasticity.

State Space and Action Space
The state carries the necessary knowledge (i.e., the price of electricity) for making a decision. So, during each interval $k \in K$, the state $s_1^a, s_2^a, \ldots, s_K^a$ represents the day-ahead price to the elasticity agent.
However, the action ($\delta_1^a$) represents the price expectation ($\Gamma_k$) for a given state. Within the given state, an agent selects an action $\{\delta_1^a, \delta_2^a, \ldots, \delta_k^a, \ldots, \delta_K^a\} \in \delta^a$. Consequently, the agent receives an estimated reward $r_{i+1} \in R$ and a new state $s_{i+1}^a$, where $i$ is the iteration number.

The Objective Function
The objective of the proposed problem is to estimate the demand elasticity (as discussed in (4)), which can be inferred by finding a suitable sequence of actions $\{\delta_1^a, \delta_2^a, \ldots, \delta_k^a, \ldots, \delta_K^a\} \Rightarrow \{\Gamma_1, \Gamma_2, \ldots, \Gamma_k, \ldots, \Gamma_K\}$. The inference is legitimate only if $\varepsilon_{f,p}$ is maximized while the total revenue is kept positive.
For the objective function, let $r_i(s_i^a, \delta^a, s_{i+1}^a)$ be the transitional elasticity between $s_i^a$ and $s_{i+1}^a$. Note that revenue or profit maximisation does not always result in an optimal strategy if the objective is to explore the total available flexibility of demand. Consequently, the maximisation of $r_i$ explores the total flexibility and generates a bid that maintains an equilibrium such that the total marginal revenue is kept at a minimum.

Action Selection
To understand the selection of a suitable action, let $s_i$ be the current state at interval $k$. During the $i$-th iteration, an action $\delta$ is randomly selected through an $\epsilon$-greedy algorithm. Consequently, the agent receives $r_i$. However, to achieve an optimal action $\delta^{*,a}$, the agent looks into the Q-value table. The iterative process may initially select action $\delta^a$; thus, $\delta^{*,a}$ can be seen as an odd recursion at first, because it expresses the Q-value of an action in the current state in terms of the best Q-value of a successor state. However, it makes sense when one looks at how the exploration process uses it. The process stops when it reaches a goal state (i.e., $s$) and collects the reward (i.e., $r$), which becomes that final transition's Q-value. In a subsequent training episode, when the exploration process reaches the predecessor state, the method uses this equality to update the current Q-value of the predecessor state. So, if $Q(s^a, \delta_1^a), Q(s^a, \delta_2^a), \ldots, Q(s^a, \delta_m^a), \ldots, Q(s^a, \delta_M^a)$ are the Q-values of the respective actions, then $\delta^{*,a}$ is an optimal action provided $Q(s^a, \delta^{*,a}) > Q(s^a, \delta^a)$. Mathematically,

$$\delta^{*,a} = \arg\max_{\delta^a} Q(s^a, \delta^a)$$

The next time its predecessor is visited (i.e., $s_{i+1}^a$), that state's Q-value gets updated, and so on back down the line. Provided every state is visited infinitely often, this process eventually computes the optimal Q.
In the update rule, $\alpha \in (0, 1]$ represents the learning coefficient. Note that in the initial learning process, the optimal action may not be the best action. Therefore, the goodness of the algorithm depends on the balance between exploiting the previously acquired knowledge and exploring. The $\epsilon$-greedy Q-learning used herein is simple and often maintains the right balance between exploitation and exploration, as shown in Algorithm 1. Moreover, the algorithm fits well within the domain of the proposed problem, i.e., learning of demand elasticity. As a Flexometer runs the algorithm, it keeps track of the average marginal cost of an appliance. It then selects the state of the appliance with the highest current average marginal cost (i.e., $r$) with probability $(1 - \epsilon) + (\epsilon / k)$, where $\epsilon$ is a small value such as 0.10. In addition, it selects each state that does not have the highest current average marginal cost with probability $\epsilon / k$.
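A minimal tabular sketch of $\epsilon$-greedy Q-learning is given below. It is not Algorithm 1 itself: the update is the standard Q-learning rule, the discount factor `gamma` is an assumption that is not named in the text, and the reward table, the cyclic state order and all function names are hypothetical and chosen only to make the example self-contained.

```python
import random


def action_probabilities(q_row, epsilon):
    """Selection probability of each of the k actions under epsilon-greedy."""
    k = len(q_row)
    best = max(range(k), key=lambda a: q_row[a])
    probs = [epsilon / k] * k      # every action: epsilon / k
    probs[best] += 1.0 - epsilon   # greedy action: (1 - epsilon) + epsilon / k
    return probs


def select_action(q_row, epsilon, rng):
    """Explore with probability epsilon, otherwise take the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Standard Q-learning update with learning coefficient alpha."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def train(rewards, episodes, alpha=0.5, gamma=0.0, epsilon=0.10, seed=0):
    """rewards[s][a] is a hypothetical reward table; state s+1 follows s."""
    n_states, n_actions = len(rewards), len(rewards[0])
    q = [[0.0] * n_actions for _ in range(n_states)]
    rng = random.Random(seed)
    for _ in range(episodes):
        for s in range(n_states):
            a = select_action(q[s], epsilon, rng)
            s_next = (s + 1) % n_states
            q_update(q, s, a, rewards[s][a], s_next, alpha, gamma)
    return q


def best_actions(q):
    """Vector of greedy actions, one per state."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]
```

With four actions and $\epsilon = 0.10$, `action_probabilities` assigns 0.925 to the greedy action and 0.025 to each of the others, matching the probabilities stated above. In the Flexometer setting, the states would correspond to the intervals of a day and the actions to price-level indexes.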

Simulation of Demand Elasticity by using the Algorithm
To analyse the efficacy and performance of the proposed algorithm, a case is studied herein. It is assumed that day-ahead prices are available for 56 days with a granularity of 15 min (i.e., 56 days × 96 intervals = 5376 intervals in total). The action space $A(s_k) = \{0 : 100\}$ is represented as indexes to price levels. Moreover, the vector of best actions $\hat{\delta}^a = \{\delta^{*,a}(s_1), \delta^{*,a}(s_2), \ldots, \delta^{*,a}(s_k), \ldots, \delta^{*,a}(s_K)\}$ is logged after every 96 intervals. As the vector of best actions represents the indexes to day-ahead prices, the vector of expected day-ahead prices can be easily generated. Furthermore, throughout the simulation, (8) is used after every 96 intervals to update the Probability Density Function (PDF), which supports the retrieval of expected day-ahead prices. For illustration, the most recent PDF (i.e., $P_k^a$), obtained at the end of the simulation, is presented in Figure 3. It can be observed that the vector of best actions has learned the price transitions, because the line representing the day-ahead prices in the figure matches the PDF, which was updated from the vector of best actions.
To further study the effectiveness and performance of the learning process, the absolute error, i.e., the difference between the expected day-ahead price and the real price signal, is shown in Figure 4. The decreasing trend in the error provides evidence that the proposed learning technique works effectively.
On the other hand, to study the accuracy of the demand elasticity estimation, a numerical calculation for the entire duration of the simulation is performed by using (4). The numerically calculated demand elasticity is then compared with the estimated demand elasticity. In this regard, Figure 5 shows the comparison between the two elasticity values. The diminishing trend in the comparison error is in line with the results shown in Figure 4. That is, the initial expectations were less accurate, but once the agent had learned, the error decreased. However, an interesting observation in Figure 5 is that the elasticity is less predictable in the last intervals of each time window (i.e., 96 intervals). The reason is related to flexibility allocation: all the flexible loads must be utilized by the end of the day, which results in inelastic demand. Moreover, the effect is very prominent because this study assumes that the domotic agent does not have scheduling capability. For the same reason, the unexpectedly high errors seen in Figure 4 can occur even after learning. Hence, the main conclusions from the results are two-fold. Firstly, the approach learns the consumer behaviour within a number of intervals, even though not much information is provided initially. Secondly, it helps the domotic agent to build sensitive and agile demand response programs for demand scheduling in day-ahead as well as real-time operation.

Physical Implementation
The functional blocks proposed and described in Section 2 were implemented in an IzoT intelligent node as the logical interface of the Energy Flexometer. The IzoT device stack was run on Raspberry Pi 2 Model B boards, with 900 MHz quad-core ARM processors and 1 GB of memory, extended with an additional integrated power measurement circuit. This platform was chosen because of its versatility and ease of implementation. It provides an extensive environment for application development and tests.

Measurement System Design
The measurement system was developed with two CS5460 analog-to-digital converters. It is dedicated to measuring essential electrical parameters, such as real energy, RMS voltage, RMS current and instantaneous power, for single-phase 2- or 3-wire installations. In the proposed system, the CS5460 was connected to the Raspberry Pi using general-purpose input/output pins. The system structure and all the connections are presented in Figure 6. Further details about the measurement system design are presented by the authors in [15].

Finite State Machine
In the logical implementation phase, an application for the Flexometer was developed. It was implemented on the Raspberry Pi using the IzoT stack and provided: (1) a serial communication interface between the Raspberry Pi and the CS5460, (2) reading of register values from the CS5460, and (3) application code for control and learning, written in the C programming language [14,15]. Figure 7 shows the finite state machine of the Flexometer, which has 7 states (i.e., configuration parameters and control) and 7 transitions (i.e., network variables). This provides a detailed description of an agent in terms of a digital logic circuit function at a given instant in time, to which the state circuit or program has access.

Knowledge Base
The knowledge-based ontology of the Flexometer, which represents the functional relationships of all blocks within the Flexometer, is shown in Figure 8. It contains descriptions of the functional blocks and their actions, as well as a reference for calculating the state change when a control action (i.e., price signal) is given. Basic customer preferences (i.e., nviParam, such as start and stop time) are also contained in this ontology. As can be seen in Figure 8, the logger functional block is associated with customer preference, demand response and learned value proposition.


Evaluation
In this paper, a hardware-in-the-loop approach was adopted to conduct an experiment for the evaluation of the Flexometer as an element that supports the integration of MCM in EMS.

Experiment Design
As shown in Figure 9, the system for the experiment was designed as a multi-agent system, in which multiple agents were responsible for performing their respective tasks. The agents were organized in three layers. The agent in the uppermost layer was called the aggregator agent. The aggregator agent aggregated the bids received from the domotic agents and then adjusted an equilibrium price signal according to its objective. The simplest objective of the aggregator is to balance supply and demand, which was the objective considered in this study. As shown in Figure 9, the aggregator agent simply broadcast the price signal corresponding to nearly zero consumption as the equilibrium price signal.
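The aggregation step described above can be sketched in a few lines. The sketch assumes that every bid is a mapping from a shared set of price levels to power in kW, with positive values for consumption and negative values for generation; this data format, the example numbers and the function names are assumptions made for illustration.

```python
def aggregate_bids(bids):
    """Sum the power bids point-wise over a shared price grid."""
    prices = sorted(bids[0])
    return {p: sum(bid[p] for bid in bids) for p in prices}


def equilibrium_price(bids):
    """Price level whose aggregated power is closest to zero.

    Ties are resolved in favour of the lower price.
    """
    total = aggregate_bids(bids)
    return min(total, key=lambda p: (abs(total[p]), p))


# Hypothetical bids: a load, a PV system and a battery.
load = {0.1: 4.0, 0.2: 3.0, 0.3: 2.0, 0.4: 0.0}
pv = {0.1: -2.0, 0.2: -2.0, 0.3: -2.0, 0.4: -2.0}
battery = {0.1: 3.0, 0.2: 1.0, 0.3: 0.0, 0.4: -3.0}
```

For these three bids the aggregated power is 5, 2, 0 and -5 kW at the four price levels, so `equilibrium_price([load, pv, battery])` selects the price level 0.3, where supply and demand balance.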
Domotic agents, on the other hand, exist in the middle layer of the organization. Herein, a domotic agent worked as a transceiver of bids and price signals between the connected appliance agents and the aggregator. In this architectural design, appliance agents were the representatives of real physical loads (such as a battery, a PV system or other loads) to the domotic agent. Therefore, within this framework, the "Flexometer" was an appliance agent with a digital logic circuit. It thereby provided an opportunity for the standardized integration of a physical load into the transactive-based control mechanism.
As mentioned, the purpose of this demonstration was to evaluate the Flexometer. The demonstration was planned for a period of one week in the Smart Lab of AGH UST, Krakow, Poland. Moreover, to simplify the analysis of the data obtained during the demonstration, a granularity of one hour was considered. Figure 10 illustrates the lab setup for the experiment. It can be observed from Figure 10 that the aggregator agent was entirely developed in the MATLAB run-time environment. In this experiment, two domotic agents were designed: one was MATLAB-based and the other was implemented on a Raspberry Pi. The Python and C++ languages were used to implement the logic of the domotic agent on the Raspberry Pi. Moreover, each domotic agent was equipped with two appliance agents. The appliance agents connected to the MATLAB-based domotic agent were also modeled in MATLAB, as shown in Figure 10. Of the two MATLAB-based appliance agents, one was modeled as a battery with a maximum rated absolute power of 3 kW and the other as a PV system with a maximum peak power of 2.6 kW. For the PV system, to be closer to real-life conditions, local irradiation values of a week in June were considered, because during this period PV generation reaches its maximum peak of the entire year. On the other hand, two Flexometers were implemented as per the design presented in Figure 4. Of the two Flexometers, one was connected to a fixed-power dummy load of 2 kW and the other was connected to a variable-power dummy load (i.e., [min, max] = [0.5 kW, 4 kW]). The variable-power dummy load was used to generate a variable power profile for a defined time duration (i.e., up to 7 h). Figure 11 shows a box plot of the power pattern of the variable load for the period of a week. It also shows the average consumption pattern of the variable load with respect to time (red line in the figure).
Moreover, two temporal preferences (i.e., nviParam) were introduced as customer preferences, namely start time and stop time. The start time is the hour of the day from which the device can be turned on, whereas the stop time is the hour of the day by which the device must complete its defined task. For the fixed load, the mean start time and stop time were the 4th hour and the 19th hour of the day, respectively. For the variable load, the mean start time and stop time were the 2nd hour and the 20th hour of the day, respectively. Note that the day of the demonstration starts at 9:00 a.m.
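Because the demonstration day starts at 9:00 a.m., the hour indices above do not coincide with clock hours. The sketch below shows one way to convert between them and to check whether an appliance may run in a given hour. It assumes that hours are counted from 1, that the 1st hour begins at 9:00 a.m., and that the stop hour is inclusive; none of these conventions is stated in the text, and the function names are hypothetical.

```python
DAY_START_CLOCK_HOUR = 9  # the demonstration day starts at 9:00 a.m.


def demo_hour_to_clock(demo_hour: int) -> int:
    """Clock hour (0-23) at which a 1-based demonstration hour begins.

    Assumption: the 1st demonstration hour begins at 9:00 a.m.
    """
    if not 1 <= demo_hour <= 24:
        raise ValueError("demo_hour must be in 1..24")
    return (DAY_START_CLOCK_HOUR + demo_hour - 1) % 24


def may_run(demo_hour: int, start_hour: int, stop_hour: int) -> bool:
    """True if the appliance is allowed to run in the given hour.

    Assumption: both the start hour and the stop hour are inclusive.
    """
    return start_hour <= demo_hour <= stop_hour


# Mean window of the fixed load: 4th hour to 19th hour of the day.
fixed_window_clock = (demo_hour_to_clock(4), demo_hour_to_clock(19))
```

Under these assumptions, a window that ends in the 19th hour extends past midnight in clock time, which is worth keeping in mind when reading the time axes of the result figures.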

Performance Metrics
The performance metrics of the Flexometer are the training time taken by the elasticity learner functional block and the response time. When calculated numerically without nviParam for a learning task of the elasticity learner programmed on a Raspberry Pi 2, the algorithm that runs for 100 episodes should take 8.3 s on average. Table 4 shows the average training time that the elasticity learner of each Flexometer takes in both situations, i.e., with nviParam and without nviParam. It can be observed that the time taken with nviParam is considerably lower than without nviParam. This supports the view that nviParam limits the exploration across the state space during a training episode, thus allowing the agent to converge faster. Moreover, the average response time, i.e., the time required to change the status of the Flexometer, was found to be 5 ms.

Observation 1.
If an appliance agent conforms to the complex bidding rules [2], then the computation time required for the learning process will be lower than for the simple bid, because complexities in the bid limit the exploration across the state space during a training episode, thus allowing the agent to converge faster.
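The effect stated in Observation 1 can be illustrated with a deliberately simplified learner. The sketch below is not the elasticity learner of the paper: it is a one-step, epsilon-greedy value learner that picks a start hour and is rewarded with the negative price of that hour. The price profile, the window and all names are hypothetical. It only shows that a time window shrinks the set of candidates the learner has to explore.

```python
import random
from typing import Iterable, List, Set, Tuple


def learn_start_hour(
    prices: List[float],
    allowed_hours: Iterable[int],
    episodes: int = 100,
    epsilon: float = 0.2,
    seed: int = 0,
) -> Tuple[int, Set[int]]:
    """Epsilon-greedy learner over start hours (illustrative simplification).

    Returns the hour with the best learned value and the set of hours
    that were actually explored during training.
    """
    rng = random.Random(seed)
    hours = list(allowed_hours)
    q = {h: 0.0 for h in hours}       # value estimate per start hour
    visits = {h: 0 for h in hours}
    for _ in range(episodes):
        if rng.random() < epsilon:
            h = rng.choice(hours)                  # explore
        else:
            h = max(hours, key=q.__getitem__)      # exploit
        reward = -prices[h]                        # cheaper hour, higher reward
        visits[h] += 1
        q[h] += (reward - q[h]) / visits[h]        # sample-average update
    best = max(hours, key=q.__getitem__)
    explored = {h for h in hours if visits[h] > 0}
    return best, explored


# Hypothetical hourly prices (all distinct) and a hypothetical time window.
prices = [0.10 + 0.01 * ((h * 7) % 24) for h in range(24)]
window = range(3, 19)

best_free, explored_free = learn_start_hour(prices, range(24))
best_win, explored_win = learn_start_hour(prices, window)
```

With the window in place the learner can never visit an hour outside it, so `explored_win` is bounded by the window size while `explored_free` can cover the whole day. The same mechanism, applied to a larger state space, is consistent with the shorter training times reported with nviParam.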

Results
The upper graphs in Figures 12 and 13 show the bids (i.e., power versus price signal) generated by both Flexometers (variable and fixed load) during two different demonstrations, i.e., without and with nviParam, respectively. Similarly, the lower graphs in Figures 12 and 13 show when both Flexometers turn ON versus the respective price signal they received from the domotic agent for an action. From Figure 12, it is evident that the Flexometers without nviParam turn ON at more strictly fixed times and react to high values of the price signal. On the other hand, as shown in Figure 13, the Flexometers with nviParam turn ON at relatively lower values of the price signal and are more dispersed in time. This further indicates that the elasticity learner learns stricter bidding without nviParam than with nviParam.

Observation 2.
If an appliance agent conforms to the bidding rules, then it turns ON at relatively lower values of the price signal and its operation is relatively more dispersed throughout the day. Moreover, the elasticity agent learns a relatively strict demand elasticity with respect to time in the case of the simple bid rather than the complex one.

Conclusions
In this paper, a new approach to organizing an active connection between the local EMS and the market-driven DR mechanism within the TE framework has been introduced, taking into account IoT technology as a universal data communication platform. An essential element of the proposed solution is the Energy Flexometer, which is a key part of any EMS and is considered in this paper to be an agent-node of the proposed MAS. Moreover, in this research the authors developed new functional blocks for the Energy Flexometer, designed according to the LonWorks standard and implementing IoT technology. They are considered a logical interface of the Energy Flexometer and provide standard network variables and configuration properties for organizing advanced and fully integrated control and monitoring systems within both field-level and IP communication networks. This approach allows the designed and developed EMS platform with the Energy Flexometer to be used in the TE framework. The comprehensive solution described above allows the Energy Flexometer concept to be implemented in both new and existing EMS applications, without any additional requirements. Owing to the standardization of the logical interface proposed by the authors, full interoperability is provided. Moreover, scalability is ensured by using the IzoT platform with the LonWorks standard communication objects and media.
All the mechanisms, methods and solutions considered by the authors in this research have been validated. Moreover, an approach to generating an expected bid for market-based demand response using a reinforcement learning mechanism has been analyzed and validated by real-time operation of the Energy Flexometer; however, the study is limited to a small-scale application. For the sake of demonstration, two options of the Flexometer were compared (i.e., with and without nviParam). In the first option, with nviParam limiting the times of appliance operation, the energy elasticity functional block learns an optimal energy schedule and the value of the price signal is lower. In the second option, the value of the price signal increases and the learning process settles on a relatively fixed time of operation as the optimal schedule. This contribution is essential for effective implementation of active DSM in EMS and provides tools for its implementation in buildings with BACS.
Future work includes: (i) implementation of the developed solution in a pilot building EMS project with active DSM control and monitoring functions to verify the technical concept in larger applications, (ii) development of additional function blocks for the management of energy sources and storage within the DR mechanism, and (iii) application, checking and validation of the Energy Flexometer performance in prosumer microgrid EMS.