# Dynamic Pricing for Charging of EVs with Monte Carlo Tree Search

## Abstract

## 1. Introduction

#### 1.1. Motivation

#### 1.2. Problem Statement and Contributions

- A novel model of the dynamic pricing problem for EV charging, formulated as a Markov Decision Process (MDP);
- A heuristic pricing strategy based on Monte Carlo Tree Search (MCTS) that is suitable for large-scale setups;
- Optimization criteria that maximize either the revenue of the CS operators or the utilization of the available capacity;
- A parametric set of problem instances modeled on two years of real-world data from a German CS operator;
- Experimental results showing that the proposed heuristic approach performs comparably to exact methods such as Value Iteration. Unlike those exact methods, however, it can still generate results for large-scale setups, because it does not suffer from the state-space explosion problem.

## 2. Related Work

## 3. MDP Formulation of EV Dynamic Pricing Problem

- which reservations will arrive during the day,
- what will be the EV user’s responses to the offered prices.

#### 3.1. MDP Formalization

#### 3.1.1. State Space

#### 3.1.2. Action Space

#### 3.1.3. Reward Function

#### 3.1.4. Transition Function

#### 3.2. MDP Solution

## 4. Optimal and Heuristic Solutions

Algorithm 1: General MCTS structure.
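The general MCTS loop named in Algorithm 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a UCB1 tree policy and a uniformly random rollout policy, keeps one tree node per action sequence, and samples outcomes from a generative model `step`. The names `Node`, `ucb1`, `rollout`, `mcts`, and `toy_step`, as well as the toy environment with a fixed user budget of 2, are hypothetical.

```python
import math
import random


class Node:
    """Statistics for one tree node, identified by the action sequence leading to it."""

    def __init__(self):
        self.children = {}  # action -> Node
        self.visits = 0
        self.value = 0.0    # running mean of returns, starting with the reward on entry


def ucb1(parent, child, c):
    # Mean return plus an exploration bonus that shrinks as the child is visited.
    return child.value + c * math.sqrt(math.log(parent.visits) / child.visits)


def rollout(state, actions, step, is_terminal, rng):
    # Rollout policy (assumed): uniformly random actions until the episode ends.
    total = 0.0
    while not is_terminal(state):
        state, reward = step(state, rng.choice(actions), rng)
        total += reward
    return total


def mcts(root_state, actions, step, is_terminal, n_iter=2000, c=1.0, seed=0):
    rng = random.Random(seed)
    root = Node()
    for _ in range(n_iter):
        node, state, path = root, root_state, []
        # Selection and expansion (tree policy).
        while not is_terminal(state):
            untried = [a for a in actions if a not in node.children]
            if untried:
                action = rng.choice(untried)
                node.children[action] = Node()
            else:
                action = max(actions, key=lambda a: ucb1(node, node.children[a], c))
            state, reward = step(state, action, rng)
            node = node.children[action]
            path.append((node, reward))
            if untried:
                break  # stop descending at the newly added node
        # Simulation from the reached leaf.
        ret = rollout(state, actions, step, is_terminal, rng)
        # Backup: propagate the return from the leaf towards the root.
        root.visits += 1
        for visited, reward in reversed(path):
            ret += reward
            visited.visits += 1
            visited.value += (ret - visited.value) / visited.visits
    # Recommend the most visited root action.
    return max(root.children, key=lambda a: root.children[a].visits)


def toy_step(steps_left, price, rng):
    # Hypothetical environment: the user accepts any price up to a fixed budget of 2.
    reward = float(price) if price <= 2 else 0.0
    return steps_left - 1, reward


best = mcts(2, [1, 2, 3], toy_step, lambda s: s == 0)
print(best)
```

In this toy setting the search should settle on the highest price the user still accepts. A stochastic acceptance model could be plugged in through the `rng` argument of `step` without changing the loop.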

#### 4.1. Tree Policy

#### 4.2. Rollout Policy

## 5. Experiments and Results

#### 5.1. Evaluation Methodology

#### 5.2. Baseline Methods

#### 5.2.1. Flat-Rate

#### 5.2.2. Value Iteration

#### 5.2.3. Oracle

#### 5.3. Problem Instances and EV Charging Dataset

#### 5.4. Results

#### 5.4.1. Fixed-Demand Experiment

#### 5.4.2. Variable-Demand Experiment

## 6. Conclusions and Future Work

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Abbreviations

| Abbreviation | Meaning |
|---|---|
| CPP | Critical peak pricing |
| CS | Charging station |
| DSM | Demand-side management |
| EV | Electric vehicle |
| ICE | Internal combustion engine |
| ILP | Integer linear programming |
| MCTS | Monte Carlo tree search |
| MDP | Markov decision process |
| RTP | Real-time pricing |
| ToU | Time of use |
| VI | Value iteration |


**Figure 1.** Illustration of the MDP states. The blue squares represent the MDP states. At timestep $t$, the capacity of the charging station is expressed by the capacity vector $\mathit{c}_{t}$. The elements of the vector represent the available charging capacity in the corresponding timeslots (time ranges in the green square). A charging session reservation request that arrived since the previous timestep, if any, is expressed by the vector $\mathit{d}_{t}$, with ones marking the requested timeslots. Based on the three state variables $\mathit{c}_{t}, t, \mathit{d}_{t}$, the pricing policy provides an action $a$, the price for charging, which the user either accepts (the first two states at the bottom) or rejects (the state on the right). The state then transitions into the next timestep (details of the transition function are illustrated in Figure 2). An accepted charging request leads to reduced capacity values. The next charging session reservation is entered into the new state. Note that the timesteps have a much finer resolution than the charging timeslots. Gray is used to show past information regarding the charging capacity and session vectors $\mathit{c}_{t}$ and $\mathit{d}_{t}$, respectively.
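The state and the accept/reject step described in the Figure 1 caption can be sketched in code. This is a hedged illustration only: the names `State` and `transition`, the budget-threshold acceptance rule, and charging the price once per requested timeslot are assumptions made for the example, not definitions taken from the paper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class State:
    capacity: Tuple[int, ...]  # c_t: free charging capacity per timeslot
    t: int                     # current timestep
    request: Tuple[int, ...]   # d_t: 1 marks a requested timeslot, all zeros = no request


def transition(state, price, user_budget, next_request):
    """Apply one pricing decision and return (next_state, revenue)."""
    has_request = any(state.request)
    # A request can only be served if every requested timeslot has free capacity.
    feasible = all(c >= d for c, d in zip(state.capacity, state.request))
    # Assumed acceptance rule: the user accepts when the price is within budget.
    accepted = has_request and feasible and price <= user_budget
    if accepted:
        # An accepted request reduces the capacity in the requested timeslots.
        capacity = tuple(c - d for c, d in zip(state.capacity, state.request))
        revenue = price * sum(state.request)  # assumed: price is per timeslot
    else:
        capacity, revenue = state.capacity, 0.0
    # The next request (possibly empty) is entered into the new state.
    return State(capacity, state.t + 1, next_request), revenue


start = State(capacity=(3, 3, 3, 3), t=0, request=(0, 1, 1, 0))
accepted_state, accepted_revenue = transition(start, 2.0, 5.0, (1, 0, 0, 0))
rejected_state, rejected_revenue = transition(start, 6.0, 5.0, (0, 0, 0, 0))
```

In the accepted case the capacity drops only in the two requested timeslots, while a rejected offer leaves the capacity vector unchanged and only advances the timestep.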

**Figure 2.** The structure of the transition function $\tau$. Given state $s_{t}$, the probability of getting to the next state $s_{t+1}$ is given by multiplying the probabilities along the edges. States are the decision nodes (in red); chance states are in blue and contain the definition of the probability used on the edges.

**Figure 3.** Histograms show the start of charging sessions (**a**) and duration (**b**) contained in the considered dataset.

**Figure 4.** The number of requests in the experiment with constant expected duration of all charging requests, but a varying number of timeslots.

**Figure 5.** Revenue in the fixed-demand experiment when optimizing for revenue. The plot of mean values for all instances is in (**a**), while the boxplot (**b**) shows problems with 6 and 12 timeslots only. The right-hand y-axis and thin plot lines in (**a**) show the utilization of each method (that was not the optimization criterion). In this experiment, we have fixed the expected number of charging hours while varying the number of timeslots in the 24 h selling period, $K$, across the different problem instances.

**Figure 6.** Utilization in the fixed-demand experiment when optimizing for utilization. The plot of mean values for all instances is in (**a**), while the boxplot (**b**) shows problems with 6 and 12 timeslots only. We have fixed the expected number of charging hours while varying the number of timeslots in the 24 h selling period, $K$, across the different problem instances.

**Figure 7.** Revenue (**a**) and utilization (**b**) in the experiment using distributions observed in a real charging location. The right-hand y-axis and thin plot lines in (**b**) show the utilization of each method (that was not the optimization criterion). Here, the number of timeslots is fixed at 48, meaning 30 min charging timeslots, while the expected demand increases from $1/6$ of the available charging capacity (3 chargers with 48 charging timeslots each) to $7/6$.

**Figure 8.** Pricing response of MCTS to a state $s=(\mathit{c}_{0},1,\mathit{d})$ (i.e., full initial capacity and first timestep) and a request $\mathit{d}$ for an hour-long charging session starting at 16:00. The plot shows average MCTS prices divided by the maximum possible price ($\max(A)$) and the average number of price offers accepted by 100 randomly sampled user budgets.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Mrkos, J.; Basmadjian, R.
Dynamic Pricing for Charging of EVs with Monte Carlo Tree Search. *Smart Cities* **2022**, *5*, 223-240.
https://doi.org/10.3390/smartcities5010014
