Prospect Theory with Bounded Temporal Horizon for Modeling Prosumer Behavior in the Smart Grid

Abstract: We study prosumer decision-making in the smart grid in which a prosumer must decide whether to make a sale of solar energy units generated at her home every day or hold (store) the energy units in anticipation of a future sale at a better price. Specifically, we enhance a Prospect Theory (PT)-based behavioral model by taking into account bounded temporal horizons (a time window specified in terms of the number of days) that prosumers implicitly impose on their decision-making in arriving at “hold” or “sell” decisions of energy units. The new behavioral model for prosumers assumes that in addition to the framing and probability weighting effects imposed by classical PT, humans make decisions that will affect their lives within a bounded temporal horizon regardless of how far into the future their units may be sold. Modeling the utility of the prosumer with parameters such as the offered price on a day, the available energy units on a day, and the probabilities of the forecast prices, we fit the PT-based proposed behavioral model with bounded temporal horizons to prosumer data collected over 10 weeks from 57 homeowners who generated surplus units of solar power and had the opportunity to sell those units to the local utility at the price set that day by the utility or hold the units for sale in the future. For most participants, a model with a bounded temporal horizon in the range of 1–6 days provided a much better fit to their responses than was found for the traditional EUT-based model, thus validating the need to model PT effects (framing and probability weighting) and bounded temporal horizons imposed in prosumer decision-making.


Introduction
The evolution of the smart grid has made end-user participation an essential characteristic of future energy-management processes. To address this need, several research efforts over the past decade have aimed at modeling prosumer (an end user who is both a producer and a consumer of energy) behavior in smart-grid settings and have shed light on the importance of considering prosumer behavior in smart-grid planning efforts [1][2][3]. Much of this existing literature on energy management in the smart grid used the economic framework of expected utility theory (EUT) as the basis for modeling prosumer decision-making [4][5][6][7][8][9]. EUT assumes that individuals make rational choices under conditions of uncertainty based upon objective perceptions of gains and losses. However, decades of empirical and theoretical work in economics and other domains have shown that human behavior often deviates significantly from the rational predictions of EUT when individuals make decisions under conditions of uncertainty and risk. As an alternative approach, prospect theory (PT) [10,11] has been shown to more accurately describe decision-making in finance/investment, gambling, insurance, labor supply and other scenarios in which people must make choices among two or more uncertain outcomes. PT describes decision makers' subjective perceptions of gains and losses, and their tendency to weight uncertain outcomes [3,10,11].
The main contribution of this paper is an enhancement to the PT-based behavioral model by taking into account bounded temporal horizons (in terms of the number of days) that prosumers implicitly impose on their decision-making in arriving at "hold" or "sell" decisions of energy units. The new behavioral model assumes that in addition to the framing and probability weighting effects imposed by classical PT, prosumers make decisions that will affect their lives within a bounded temporal horizon regardless of how far into the future their units may be sold. In the study, we investigated the behavior of homeowners in a realistic smart-grid scenario in which their hypothetical solar panels generated surplus electric power which they could choose to sell to the power utility. We recorded their daily sell/hold decisions over the course of a 10-week study in which energy prices varied probabilistically from day to day. Given the daily offered price and information about the probability of each possible price occurring, participants decided each day whether to sell none, some or all their surplus energy units. Our results reveal that the enhanced PT-based model with bounded temporal horizons fits the prosumer data better than models based on PT alone as well as EUT-based models with and without bounded temporal horizons.
The rest of this paper is organized as follows. Section 2 highlights some related literature. In Section 3, we briefly summarize the experiment that yielded the data used to test alternative decision models. We propose models based on classical PT and PT enhanced with bounded temporal horizons in Section 4. Section 5 presents data fitting and results, and conclusions are drawn in Section 6. Appendix A contains proofs of the theorems in the paper.

Related Work
Much of the research on how prosumers will participate (e.g., buy, sell, trade energy) in the smart grid has adopted one of two theoretical frameworks: game theory or prospect theory. Several of these investigations are summarized below, including a few that combine ideas from both frameworks.

Game-Theoretic Approach
One area in which game theory has been applied to smart-grid issues is in modeling demand-side management (DSM). In [6], the authors proposed an autonomous demand-side energy-management model that exploits the features of the smart grid. Rather than focusing exclusively on the interaction between the power company and consumers, they considered the interaction among consumers. Using a game-theoretic approach, the authors showed that the global optimal performance can be achieved at a Nash equilibrium of the formulated energy consumption scheduling game. In [7], instead of optimizing energy consumption based solely on either utility company or consumer interests, the optimization considered the interaction between the power company and the consumer. A more extensive overview of adopting game theory to model interactions in the smart grid, and addressing problems in microgrid systems, demand-side management and communications is provided in [8]. The system model in [9] uses game-theoretic methods to model demand-side management in smart grids. The model, which converges to a unique Nash equilibrium solution, defines a load scheduling method with a dynamic pricing strategy.

Prospect-Theoretic Approach
As has often been noted, expected utility theory and EUT-based game-theoretic models often fall short of accounting for how humans actually behave when making risky choices. In many investigations of decision-making under uncertainty in economics and related domains (investing, insurance, labor supply and even sports betting), PT has provided a closer approximation to actual human behavior when assessing possible gains and losses. See [12,13] for reviews of this literature, including both theoretical and empirical papers.
More recently, PT has been applied to modeling human behavior in smart-grid settings as well. The mathematical framework provided by PT hinges on a subjective value function rather than an objective utility function, and on the subjective perception of probability and uncertain outcomes. In several theoretical studies [3,14–20], researchers have shown that models based on PT often make markedly different predictions regarding prosumer behavior than do models based on traditional EUT. Specifically, they predict that smart-grid prosumers will treat gains and losses differently, being more sensitive to losses than to gains. Saad et al. [3] studied the potential of PT as a decision-making framework in the smart grid. Based on some examples from smart-grid applications, the authors show that an accurate understanding of human decision makers' behavior is essential in energy-management processes. The authors in [14] modeled a non-cooperative game based on the features of PT for energy trading between prosumers. In the game, hypothetical prosumers try to maximize their utilities by increasing the gains realized by selling their surplus energy, and decreasing the penalties that can be caused by the stochastic nature of renewable energy sources such as wind. Wang et al. [15] use the weighting effect of prospect theory to develop their model, which is defined based on a non-cooperative game between prosumers who own storage units and desire to maximize their utilities. It is shown in [16] that a power company can face a decrease in its profit if it fails to consider the prosumers' subjective perceptions of gains and losses. Prospect theory's framing effect is used for modeling microgrid operators' behavior in [17], in which the authors demonstrate the constructive effect of considering subjective behavior. Xiao et al. [18] use PT to analyze energy exchange among microgrids and show how subjective behavior can cause a decline in overall utility.
Wang et al. [19] investigated prosumers' subjective perceptions of their own and other players' actions in a demand-side management simulation, and proposed a PT-based framework for optimal participation time. Similarly, Wang et al. [20] proposed a game-theoretic model, incorporating framing concepts from PT to describe electricity customers' subjective perceptions of gains and losses in a reactive power compensation game involving coordination between the simulated customers. Again, these simulations show that customer decision-making under prospect-theoretic assumptions deviates from what classic EUT-based models predict.
It should be noted that with the exception of our recent work [21], none of the investigations of prosumer behavior in the smart grid cited above have gathered data from human participants to test the various model predictions. More empirical studies will be needed to validate these models.

Temporal Decision Horizons
In our recent work [21] and in the current paper, we examine and model behavior in a scenario where prosumers must make a sell or hold decision every day for 10 weeks. In that regard, it is also worth noting that none of the studies reviewed above have sought to model human decision-making in a smart-grid scenario over an extended period of time. Additionally, while there has been extensive work on receding horizon predictive control models in the control systems engineering literature [22], these models describe the operation of physical systems, not human decision-making. A few studies from other domains (principally economics) have observed that humans tend to make decisions within a relatively short temporal horizon regardless of how far into the future their decisions could be made [23][24][25][26][27].
In our earlier work [21], we analyzed data from 57 prosumers over 10 weeks and studied two alternative EUT-based models developed to incorporate the study parameters. The first model was based upon traditional EUT, and the second model was an extension of the first one that incorporated the notion of a finite temporal horizon or time window for making a series of daily sell/hold decisions. The major finding of this study was that the time window model provided a better fit to the data than traditional EUT. However, there was one important aspect of the data that neither model could account for. Both EUT-based models predicted that on any given day participants would always choose to sell either none or all their surplus energy units ("all-or-nothing" selling behavior). In this paper, we consider two additional models based on PT and demonstrate that these models predict prosumer decision-making more accurately.

Energy Market Simulation Study
The design of the energy grid experiment is described in detail in [21]. A brief summary is presented here.
In a 10-week study conducted in October-December 2016, we examined the behavior of 57 household decision makers in a prosumer-centric smart-grid scenario. The participants were recruited through an online research participant recruiting service (www.findparticipants.com, accessed on 1 September 2016) and met certain criteria, such as being the decision-maker in the household regarding energy utility services and being able to take actions such as "hold" or "sell" on renewable energy units locally generated at the household. The participants were told to assume that they had solar panels on their roof that generated 1 or 2 surplus "units" of energy on most days (90 units total over the 70-day study) and that they had batteries to store any surplus energy until they chose to sell it to the power company. In an effort to simulate variability introduced by factors such as weather and household demand for electricity, the number of surplus units generated on each day was sampled from a distribution in which one unit of surplus power was generated on 50% of the days, two units on 35% of the days, and no units on the remaining 15% of days. Electricity prices at which they could sell their surplus units varied stochastically over a range from $0.10 to $1.50 per unit, using a price distribution approximating that of US domestic wholesale electricity prices. The data source for modeling prices was monthly reports of wholesale energy prices published by the U.S. Energy Information Administration (e.g., www.eia.gov/electricity/monthly/update/wholesale_markets.cfm, accessed on 23 August 2016), referencing data from March and April 2016. The shape of the distribution was positively skewed, emulating real-world prices, with a mean of $0.60 and a modal price of $0.50. The probability of each price occurring on any given day of the study is shown in Table 1. We refer to this set of 15 probabilities as the single-day probabilities ($p_j$) for the 15 possible prices.
The participants' task every day was to decide whether to sell any of their surplus energy (and if so, how many units) or wait for a better price. The objective of the energy grid experiment was for participants to maximize their profits from selling their surplus units. Participants were paid $100 for completing the study. In addition, to motivate them to optimize their performance, participants also kept the profits they earned from selling their units, and were further incentivized by having the opportunity to earn a bonus of up to $100 if they were one of the top 3 earners.
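The surplus-unit distribution described above can be sketched in a few lines of Python. This is only an illustration of the stated 50%/35%/15% split; the helper names, the seed, and the use of Python's standard `random` module are assumptions and do not describe how the study schedule was actually generated.

```python
import random

# Daily surplus-unit distribution as stated in the study description.
UNIT_PROBS = {0: 0.15, 1: 0.50, 2: 0.35}

def expected_units_per_day(probs=UNIT_PROBS):
    """Mean number of surplus units generated per day."""
    return sum(units * p for units, p in probs.items())

def sample_surplus(num_days, seed=0, probs=UNIT_PROBS):
    """Draw one surplus-unit count per day (hypothetical helper)."""
    rng = random.Random(seed)
    values = list(probs.keys())
    weights = list(probs.values())
    return rng.choices(values, weights=weights, k=num_days)

schedule = sample_surplus(70)
print(expected_units_per_day())  # mean units per day under the stated split
print(sum(schedule))             # total units in one simulated 70-day schedule
```

Any single simulated schedule will differ from the expected total, so the sketch should not be read as reproducing the exact sequence participants saw.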

Modeling Approach
In this section, we develop two PT-based models of decision-making in our energy grid experiment that parallel the two EUT models developed in [21]. Both PT models consider the finite temporal horizon of the 10-week study, but the second version, denoted $\mathrm{PT_{TW}}$, introduces the concept of a participant-specific time window (bounded temporal horizon) over which each individual evaluates their daily sell/hold decisions. Our expectation, based on the extensive PT literature, was that PT-based models would provide a better fit to our study data than EUT-based models. Furthermore, we anticipated that unlike the EUT-based models, PT-based models would be able to predict some degree of the "some-but-not-all" selling behavior that our study participants exhibited.
After outlining the typical formulation of PT as developed by Kahneman & Tversky [10,11] below, we will extend the models to incorporate our additional study parameters.
The general form of the PT value function is

$$v(x) = \begin{cases} x^{\alpha}, & x \ge 0, \\ -\lambda\,(-x)^{\beta}, & x < 0, \end{cases} \tag{1}$$

where x is either a gain (if x > 0) or a loss (if x < 0) with respect to the reference point, λ is the coefficient for loss aversion, and α and β are coefficients for risk aversion and risk seeking, respectively. Both α and β are typically constrained to range from 0 to 1 with α ≤ β. The loss aversion coefficient is not bounded. Because α and β lie between 0 and 1, the value function v(x) is concave for gains and convex for losses; with α ≤ β and loss aversion (typically λ > 1), the slope is steeper for losses ("losses loom larger than gains"). PT's probability weighting effect is typically modeled using the Prelec function [28], defined for a given probability p as

$$w(p) = \exp\!\left(-(-\ln p)^{\gamma}\right), \tag{2}$$

where γ = 1 reflects objective perception, i.e., EUT behavior. When γ < 1, the subjective probability w(p) in (2) overweights low probabilities and underweights high probabilities.
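As a minimal sketch, the power-form value function and the Prelec weighting function described above can be written in Python as follows. The function names and the parameter values in the usage lines are illustrative assumptions, not estimates from the study.

```python
import math

def pt_value(x, alpha, beta, lam):
    """Framing effect: x**alpha for gains, -lam * (-x)**beta for losses."""
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** beta)

def prelec_weight(p, gamma):
    """Prelec weighting w(p) = exp(-(-ln p)**gamma), with w(0)=0 and w(1)=1."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return math.exp(-((-math.log(p)) ** gamma))

# Illustrative parameter values only (assumptions, not fitted values).
gain = pt_value(1.0, alpha=0.8, beta=0.9, lam=2.0)
loss = pt_value(-1.0, alpha=0.8, beta=0.9, lam=2.0)
print(gain, loss)                  # a unit loss weighs more than a unit gain
print(prelec_weight(0.05, 0.5))    # low probability is overweighted
print(prelec_weight(0.90, 0.5))    # high probability is underweighted
```

Setting `gamma=1.0` returns the objective probability, which corresponds to the EUT case noted in the text.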

Prosumer Behavior Model Using Prospect Theory
The total number of days in the energy grid experiment is D. Each individual day is indexed by the variable d and days are numbered consecutively in ascending order, such that on the final day, d = D. Please note that data were not collected on 2 days of the 10-week study, so D in our models is 68. On each day d, a participant typically has one or more units of energy to sell.
The price on offer for a unit of energy for any given day d is denoted by $i_d$, where $i_d$ could assume any of the 15 possible unit prices from $0.10 to $1.50 in $0.10 increments. Please note that to simplify modeling and data analysis, the unit prices were linearly transformed to values ranging from 1 to 15. The maximum possible price of 15 is denoted by J.
In our PT model, the expected gain from selling n units at price $i_d$ is determined by considering both the probability weighting and framing effects. If n units were sold on day d, then the probability that they would be sold at a lower price on a subsequent day over all possible lower prices is multiplied with the corresponding gain in prices, and both framed by the variable α and weighted by the variable γ. Similarly, the expected loss from selling n units at a price $i_d$ is determined by considering both the probability weighting and framing effects. If n units were sold on day d, then the probability that they would be sold at a higher price on a subsequent day over all possible higher prices is multiplied with the corresponding loss in prices, and both framed by the variable β and weighted by the variable γ.
Let $N_d$ be the number of units available for sale on day d and let n be the number of units sold at price $i_d$ on day d. The lowest price for which a sale of one unit on day d yields a positive expected value is called the cutoff price, denoted $i_d^{*PT}$. In other words, $i_d^{*PT}$ is the lowest price on day d for which the expected gain of selling one unit is greater than the expected loss of selling that unit. A sale is therefore made only if the price offered is at or above the cutoff price. Based on the above reasoning and notation, the expected utility from selling n units on day d is formally given by Equation (3), in which $p_j$ is the probability that price j is the unit price on offer. In (3), the first two terms describe the expected gain relative to day d + 1 and the expected loss relative to day d + 1, respectively. The last expression in (3) is the expected utility relative to all the remaining days, d + 2 to D. Each term corresponding to k in the summation ranging from k = 1 to k = D − d − 1 is computed by multiplying the probability of holding for k days with the difference between the expected gain and expected loss relative to day d + 1 + k.
The cutoff price is formally defined in Equation (4). To evaluate the cutoff price, we first consider the fact that on the last day, D, there is no chance of selling any units of energy at any price on a subsequent day. Thus, there can be no gain from holding any stored units on the final day of the study, since the probability that any price will be subsequently offered is 0. Therefore, all the available units should be sold regardless of the price offered on the last day. Equations (3) and (4) are used to set the cutoff prices by backward iteration: working backwards through the sales period from day D, all cutoff prices can be computed.
Given the model parameters α, β, λ, γ, and taking PT as the model of human decision-making, the number of units predicted to be sold is given by Equation (5), where $n^*_d(\alpha, \beta, \lambda, \gamma)$ is the optimal number of units predicted by the model.

Theorem 1.
The PT model does not always predict "all-or-nothing" selling behavior.
In other words, Theorem 1 states that the prosumer's optimal selling strategy under the PT model, $n^*_d$, is not necessarily equal to $N_d$ or 0; $n^*_d$ can also take values such that $0 < n^*_d < N_d$. Furthermore, the sell-some-but-not-all strategy is independent of the probabilities of offered prices when a sale of one unit on a day yields a positive expected value. Additionally, the sell-some-but-not-all strategy is independent of the cutoff price when a sale of one unit on a day yields a positive expected value.
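To give some intuition for how an interior optimum can arise, the following Python sketch uses a toy utility of the form A·n^α − B·n^β. This form is an assumption chosen for illustration; it is a stand-in, not the expected utility of Equation (3), and the coefficients `gain_coeff` and `loss_coeff` are hypothetical. When α < β, the framed gain grows more slowly in n than the framed loss, so the maximizing n can fall strictly between 0 and the number of available units.

```python
def toy_utility(n, gain_coeff, loss_coeff, alpha, beta):
    """Hypothetical stand-in utility: framed gain minus framed loss in n."""
    return gain_coeff * n ** alpha - loss_coeff * n ** beta

def best_units(available, gain_coeff, loss_coeff, alpha, beta):
    """Return the n in 0..available that maximizes the toy utility."""
    return max(
        range(available + 1),
        key=lambda n: toy_utility(n, gain_coeff, loss_coeff, alpha, beta),
    )

# With alpha < beta the optimum can be interior (sell some, not all).
print(best_units(4, gain_coeff=3.0, loss_coeff=1.0, alpha=0.5, beta=1.0))
# With alpha = beta = 1 the toy utility is linear in n: all-or-nothing.
print(best_units(4, gain_coeff=3.0, loss_coeff=1.0, alpha=1.0, beta=1.0))
print(best_units(4, gain_coeff=1.0, loss_coeff=3.0, alpha=1.0, beta=1.0))
```

The linear case mirrors the all-or-nothing behavior attributed to the EUT-based models, while the curved case shows the kind of partial sale the theorem allows.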

Bounded Temporal Horizon Model of Prosumer Behavior Using Prospect Theory
In the bounded temporal horizon model with prospect theory (denoted $\mathrm{PT_{TW}}$), a prosumer may compute the probability of selling at a higher price over a fixed number of days called a time window, t, until near the end of the study when the remaining days are fewer than the number of days in the prosumer's time window. If $\bar{d} = D - d$ denotes the number of remaining days, then the time period over which the expected gains and the expected losses are computed is defined as $\tau = \min(t, \bar{d})$. In the bounded temporal horizon model, $i_\tau^{*PT}$ is the lowest price at which the expected utility is positive when n = 1, that is, the lowest price for which the expected gain is greater than the expected loss from a sale over a time period of τ days. The expected utility from selling n units of energy on day d with a time window of t is defined in Equation (6), and the cutoff price for the $\mathrm{PT_{TW}}$ model with a time window of t days is formally defined in Equation (7). Equations (6) and (7) are used to set the cutoff prices by backward iteration: working backwards through the sales period, all cutoff prices can be computed.
Given the model parameters α, β, λ, γ, τ, the number of units that the $\mathrm{PT_{TW}}$ model predicts to be sold is given by Equation (8), where $n^*_{d,\tau}(\alpha, \beta, \lambda, \gamma, \tau)$ is the optimal number of units predicted by the model.
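The effective horizon τ = min(t, D − d) can be illustrated with a short Python helper. The function name is hypothetical; D = 68 is the number of modeled days stated earlier, and the window length of 5 is an arbitrary example.

```python
def effective_window(t, d, total_days):
    """Days over which gains and losses are evaluated on day d."""
    remaining = total_days - d   # number of days left after day d
    return min(t, remaining)

# A prosumer with a 5-day window, in a 68-day study.
print(effective_window(5, 10, 68))   # mid-study: the full window applies
print(effective_window(5, 66, 68))   # near the end: only 2 days remain
print(effective_window(5, 68, 68))   # last day: no future days to consider
```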

Theorem 2.
The $\mathrm{PT_{TW}}$ model does not always predict "all-or-nothing" selling behavior.
In other words, Theorem 2 states that the prosumer's optimal selling strategy under the $\mathrm{PT_{TW}}$ model, $n^*_{d,\tau}$, is not necessarily equal to $N_d$ or 0; $n^*_{d,\tau}$ can also take values such that $0 < n^*_{d,\tau} < N_d$. Furthermore, the sell-some-but-not-all strategy is independent of the probabilities of offered prices when a sale of one unit on a day yields a positive expected value. Additionally, the sell-some-but-not-all strategy is independent of the cutoff price when a sale of one unit on a day yields a positive expected value.

Modeling Prosumer Behavior Using EUT and $\mathrm{EUT_{TW}}$
To benchmark the PT models against EUT models, we briefly describe the baseline EUT models without and with a bounded temporal horizon (the latter denoted $\mathrm{EUT_{TW}}$) [21]. EUT is fundamentally a conventional game-theoretic concept that is informed by objective notions of losses and gains and by the rationality of the people who make decisions. Let $i^*_d$ denote the unit cutoff price, which is the lowest price on day d for which the gain from a sale on that day is greater than the expected gain from a hold on that day. Expected utilities are computed for sell and hold decisions, the components of which are the gains realized from selling and the potential (future) gains from not selling. Hence, the equation for computing the expected utility of selling a unit of energy on day d at price $i_d$ has two components. The first component is the gain realized by selling the unit at price $i_d$ on day d. The second component is the possible gain that may be realized by holding the unit on day d and selling on a subsequent day. The possible gain from holding is determined by the probability of selling at a higher price on a subsequent day. The probability of selling at a higher price on a subsequent day is determined by the cutoff price on each subsequent day and the probability of it being met or exceeded by the price on offer on that day. The expected utility of selling n units on day d is therefore defined in Equation (9), where $f_{h,d+1}$ is the expected gain for a hold on day d + 1 and is defined recursively in Equation (10). In the $\mathrm{EUT_{TW}}$ model, a prosumer may compute the probability of selling at a higher price over a time window. Thus, $i^*_\tau$ is the cutoff price, that is, the lowest price for which the gain from a sale is greater than the expected gain from a hold for a time window of τ days.
The expected utility of selling n units on day d with a time window of τ is defined in Equation (11), where $f_{h,\tau}$, the expected gain for a hold on day d with a time window of τ, is defined in Equation (12). The best selling strategy for both models is the number of units, n, that maximizes the expected utility. Therefore, both models predict that on any given day participants would always choose to sell either none or all their surplus energy units ("all-or-nothing" selling behavior) [21].
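The backward iteration for cutoff prices described above can be sketched in Python under a standard optimal-stopping reading of the text: the per-unit value of holding equals the expected price on future days that meet their cutoff, plus the chance of holding again times the next hold value. This recursion is an assumption and may differ in detail from the paper's equations. The three-price distribution is hypothetical (it is not Table 1), and the treatment of the time window, which reuses the hold value for a horizon of min(window, remaining days), is likewise an illustrative choice.

```python
def eut_cutoffs(probs, total_days, window=None):
    """Backward iteration for per-unit cutoff prices (sketch).

    probs[j-1] is the probability of price j, for j = 1..J.
    Returns a dict mapping day d (1..total_days) to its cutoff price.
    """
    num_prices = len(probs)
    max_h = total_days - 1
    hold_value = [0.0] * (max_h + 1)   # expected gain from holding, h days left
    cutoff_by_h = [1] * (max_h + 1)    # with 0 days left, sell at any price
    for h in range(1, max_h + 1):
        next_cutoff = cutoff_by_h[h - 1]
        # Expected gain if tomorrow's price meets tomorrow's cutoff ...
        sell_part = sum(p * j for j, p in enumerate(probs, 1) if j >= next_cutoff)
        # ... otherwise the unit is held again.
        hold_prob = sum(p for j, p in enumerate(probs, 1) if j < next_cutoff)
        hold_value[h] = sell_part + hold_prob * hold_value[h - 1]
        # Lowest price whose immediate gain exceeds the expected hold gain.
        cutoff_by_h[h] = next(
            (j for j in range(1, num_prices + 1) if j > hold_value[h]),
            num_prices,
        )
    cutoffs = {}
    for d in range(1, total_days + 1):
        remaining = total_days - d
        h = remaining if window is None else min(window, remaining)
        cutoffs[d] = cutoff_by_h[h]
    return cutoffs

toy_probs = [0.5, 0.25, 0.25]                 # hypothetical prices 1, 2, 3
print(eut_cutoffs(toy_probs, total_days=5))   # unbounded horizon
print(eut_cutoffs(toy_probs, total_days=5, window=1))
```

In this toy setting a shorter window lowers the cutoffs on early days, which is in line with the observation that the time-window model sells more often than the unbounded one.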

Data Fitting and Results
In this section, we evaluate our system models using a data fitting procedure. The expected utility of selling n units on day d is a function of that day, and the search for the optimal model parameters needs to be done for each prosumer independently. The responses of prosumer participants in our energy grid experiment are compared with the predictions of each model. To this end, the minimum average normalized deviation for each participant is computed. For each participant, $N_d$ denotes the number of units available for sale on day d, $n_d$ represents the number of energy units actually sold on day d, and $n^*_d$ is the number of units predicted to be sold by the PT model. The minimum average normalized deviation (MAND) between the PT model and real data is given by Equation (13), where the set $\mathcal{D} = \{ d \mid N_d > 0 \}$, and $|\mathcal{D}|$ is the number of days on which $N_d > 0$.
Consequently, the optimal model parameters $(\alpha^*, \beta^*, \lambda^*, \gamma^*)$ are obtained as the minimizers given in Equation (14). Figure 1 illustrates the minimum average normalized deviation with the best fit of the model parameters for the PT model obtained through an exhaustive search. Additionally shown in Figure 1 are the MAND results for the EUT model considered in our previous work [21]. Please note that unlike the PT model that evaluates only gains and losses, the EUT model considers the expected utility of both sell and hold decisions. This figure shows the minimum average normalized deviation between the number of units predicted to be sold by the models and the number of units that participants actually decided to sell in our energy grid experiment. Figure 1 captures the fact that the PT model predicts prosumer decision-making in the smart grid better than the EUT model described in [21]. The mean percentage of improvement of PT is 40.9% over EUT. The 57 study participants are ordered across the x axis, from the participant who sold units on the fewest days to the participant who sold on the most days. The figure demonstrates that temporally unbounded PT outperforms EUT relative to our human decision-making data, i.e., predictions of the PT-based model are closer to actual human behavior than predictions of the EUT model.
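The deviation measure and exhaustive search described above can be sketched in Python as follows. The sketch assumes that "normalized" means dividing the absolute difference between sold and predicted units by the units available that day, which is one reading consistent with restricting the average to days with available units. The `predict` callable, the parameter grid, and the toy data are hypothetical placeholders for a model's predictions.

```python
from itertools import product

def average_normalized_deviation(sold, predicted, available):
    """Mean of |n_d - n*_d| / N_d over days on which N_d > 0."""
    days = [d for d, n_avail in enumerate(available) if n_avail > 0]
    if not days:
        raise ValueError("no days with units available")
    total = sum(abs(sold[d] - predicted[d]) / available[d] for d in days)
    return total / len(days)

def exhaustive_fit(sold, available, predict, grid):
    """Try every parameter combination; return (best_params, best_score).

    predict(params, available) is a hypothetical callable returning the
    predicted units per day; grid maps parameter names to candidate values.
    """
    names = list(grid)
    best_params, best_score = None, float("inf")
    for combo in product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        predicted = predict(params, available)
        score = average_normalized_deviation(sold, predicted, available)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy example: a placeholder "model" that sells at most `cap` units per day.
def toy_predict(params, available):
    return [min(n, params["cap"]) for n in available]

sold = [1, 0, 1, 1]
available = [2, 0, 2, 1]
print(exhaustive_fit(sold, available, toy_predict, {"cap": [0, 1, 2]}))
```

Because the search is exhaustive, its cost grows with the product of the grid sizes, so the grid resolution for each parameter is a practical design choice.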
The MAND between the $\mathrm{PT_{TW}}$ model and real data is given by Equation (15), where the set $\mathcal{D} = \{ d \mid N_d > 0 \}$, and $|\mathcal{D}|$ is the number of days on which $N_d > 0$. The optimal model parameters $(\alpha^*, \beta^*, \lambda^*, \gamma^*, \tau^*)$ are then obtained as the minimizers given in Equation (16). The constraints for the model parameters are given in (4) and (7) for the PT and $\mathrm{PT_{TW}}$ models, respectively. The time window τ is a positive integer that takes values less than D. Figure 2 shows the minimum average normalized deviation with the best fit of model parameters for the PT and $\mathrm{PT_{TW}}$ models for each participant. Additionally shown in Figure 2 are fits of the data to the EUT and $\mathrm{EUT_{TW}}$ models considered in our previous work [21]. The figure demonstrates that adding the notion of a time window to EUT and PT can improve each model's predictions of prosumer decision-making behavior in the smart grid, with $\mathrm{PT_{TW}}$ performing best. However, it can be seen that the model's performance may decline when a participant decides to sell too often in the study (participants are ordered on the horizontal axis from those who made the fewest sales to those who made the most). The mean percentage of improvement of $\mathrm{PT_{TW}}$ is 45.29% over EUT.
It can be seen that more than half of the prosumer participants were best fit by a time window ranging from 1-6 days. Figure 4 shows the number of sale days predicted by each model, and the actual number of days on which each of the 57 participants sold during the course of the study. It can be observed that EUT predicts that rational prosumers would sell their surplus energy units on just four days: the three days when the price exceeded the cutoff price and the final day (as the probability of selling any units at any price on a subsequent day is zero). This prediction is the same for all participants. However, introducing the notion of a time window to classic EUT can improve its performance.
Although the frequency of selling in $\mathrm{EUT_{TW}}$ is greater than in EUT (and thus closer to observed behavior), the extended model, $\mathrm{EUT_{TW}}$, still predicts that either all or none of the available units are sold [21]. Figure 4 also shows that both PT and $\mathrm{PT_{TW}}$ have a better fit to the participants' selling behavior than do the EUT models. In addition to providing a closer fit to our study data in terms of predicting the number of days on which participants sold one or more units, the PT models are also able to predict the observed pattern of selling some-but-not-all units. The stacked bar charts in Figures 5 and 6 illustrate these predictions. In Figure 5 (PT model) and Figure 6 ($\mathrm{PT_{TW}}$ model), each participant is represented by three different colored bars that show the number of days on which the model predicts selling none, some, or all available units. The two figures also serve to validate Theorems 1 and 2, which state that the PT and $\mathrm{PT_{TW}}$ models, respectively, can predict selling some but not all available units of energy.

Conclusions
We studied prosumer decision-making in the smart grid in which a prosumer had to decide whether to make a sale of solar energy units generated at her home every day or hold (store) the energy units in anticipation of a future sale at a better price. Specifically, we enhanced a prospect theory (PT)-based behavioral model by taking into account bounded temporal horizons (a time window specified in terms of the number of days) that prosumers implicitly imposed on their decision-making in arriving at "hold" or "sell" decisions of energy units. The new behavioral model for prosumers assumed that in addition to the framing and probability weighting effects imposed by classical PT, humans made decisions that affected their lives within a bounded temporal horizon regardless of how far into the future their units could be sold. Modeling the utility of the prosumer with parameters such as the offered price on a day, the available energy units on a day, and the probabilities of the forecast prices, we fitted the PT-based proposed behavioral model with bounded temporal horizons to prosumer data collected over 68 days from 57 homeowners who generated surplus units of solar energy. These homeowners had the opportunity to sell those units to the local utility at the price set that day by the utility or hold the units for sale in the future. For most participants, a bounded temporal horizon in the range of 1-6 days was observed to provide a much better fit to their responses than was found for the traditional EUT-based model, thus validating the need to model PT effects (probability weighting and framing) and bounded temporal horizons imposed in prosumer decision-making.
The above findings lead us to two conclusions. First, as has been demonstrated in other contexts in many previous investigations, PT-based models appear to provide a closer description of human decision-making than do EUT-based models. And second, in our energy grid experiment, where a series of related sell/hold decisions must be made over an extended period of time, the introduction of a bounded time window parameter to the decision model can significantly improve the degree to which model predictions fit actual data. This is true both in terms of predicting which days (i.e., at which offered prices) participants choose to sell their energy units and, when they do choose to sell, in predicting whether they sell all or just some of their units. Future directions for study include testing the generalizability of the bounded temporal horizon concept to additional experiments that could take into account other risk factors. The current survey for collecting the data is intended to be a simulation of prosumers in the smart grid. Although it captures prosumer actions from a behavioral perspective, it may not perfectly reflect real-world energy policy and markets. For future work, we plan to compare the collected data in a simulated setting to real energy generation data as well as real-world factors such as intra-day energy price fluctuations and battery storage limitations. We would also like to conduct a longer (perhaps 4-5 month) study that would allow us to simulate seasonal changes in energy prices and would afford a longitudinal analysis of decision behavior as prosumers learn how to optimize their profits.