Simulation-Based Electric Vehicle Sustainable Routing with Time-Dependent Stochastic Information

Abstract: We propose a routing method for electric vehicles (EVs) that finds a route with minimal expected travel time in time-dependent stochastic networks. The method first estimates whether the vehicle can reach the destination with the current battery level and, if it cannot, selects potential charging stations. The route-search problem is then formulated as a shortest path problem with time-dependent stochastic disruptions and modeled as a Markov decision process. This problem is solved by an approximate dynamic programming algorithm to improve computational efficiency in complex networks. Several simulation cases and a scenario-based example are given to demonstrate the validity of the method.


Introduction
In recent years, electric vehicles (EVs) have gained much attention because of growing concern about climate change and the sustainability of energy development. About 20% of global greenhouse gases are emitted by conventional internal combustion engine vehicles (ICEVs), which makes them one of the main contributors to fossil energy depletion and global warming. EVs produce no emissions while driving and therefore contribute more to sustainable energy development and environmental protection than ICEVs. Compared with ICEVs, EVs are also more efficient in energy conversion and less expensive to maintain [1]. Many countries have strongly promoted EVs in view of their enormous development potential. According to a Chinese government report, the production and sales volume of EVs will increase to 5 million by 2020 in China, and a supporting charging infrastructure system will be built [2]. In recent years, the United Kingdom, India, and Germany have also released a series of policies to promote the development of EVs. EVs are expected to spread rapidly in the near future.
There are still some disadvantages to driving EVs. The most prominent problems are the limited travel distance and insufficient charging stations [3]. Due to battery capacity limitations, the travel range of EVs is shorter than that of ICEVs without charging. An effective way to extend total distance is charging at charging stations on the way. However, compared to gas stations, battery charging stations are not common enough to be found everywhere. In order to eliminate driver concerns of charging station shortages, researchers have proposed various route-search methods for EVs.
Although all EVs have range limitations, different types of EVs (e.g., private electric vehicles and electric commercial vehicles) call for different routing policies [4]. An individual driver of a private electric vehicle is mainly concerned with traveling from an origin to a destination along a path that is optimal with regard to cost, time, or energy consumption. In this case, the problem falls into the more general category of shortest path problems [5]. It differs from the traditional shortest path problem in that the characteristics of EVs must be taken into account. This paper focuses on the single EV routing problem, which is called the optimal path problem in the related literature.

Optimal Path Problem
Several route-search methods have been proposed on the basis of static traffic information. In these studies, traffic information is given as a known deterministic parameter, which neglects the influence of dynamic and stochastic factors on path selection. Artmeier et al. [6] studied energy-optimal routing of EVs, considering regenerative braking. The researchers constructed a graph that allows negative edge weights to find the path with the highest remaining battery level at the destination. Jurik et al. [7] proposed an energy-optimal navigation system, but the study did not consider recharging. Kobayashi et al. [3] considered recharging and proposed a method targeting minimal time or distance. The method first identifies potential charging stations when the vehicle cannot complete the path with its remaining battery level. The Dijkstra algorithm is then used to find the route with the least cost, using the identified stations. Siddiqi et al. [8] considered only fast charging stations along the path, with the objective of minimizing the total distance traveled while respecting constraints on travel times, charging times, and charging costs. Sweda and Klabjan [9] attempted to determine the optimal route with the minimum cost. The authors allowed recharging en route to ensure that the battery level remains within a usable state of charge (SOC) window and formulated the problem as a dynamic program.
Real-time traffic information is not considered in the above studies. With the advance of information and communication technologies, it is increasingly possible to gather dynamic information from Intelligent Transportation Systems (ITS). Travelers can receive real-time information about the state of the network before they reach the nodes of the network. Rationally using the dynamic information can help to reduce congestion, energy consumption, and environmental pollution [10]. Taking real-time information flow into account, Guo et al. [11] proposed a rapid-charging navigation of EVs based on real-time power systems and traffic data. Considering the charging price mechanism, Yang et al. [12] proposed a route-search strategy for EVs with minimal cost.

Dynamic and Stochastic Shortest Path Problem
To our knowledge, EV routing with dynamic and stochastic information has not yet been researched. Below, we briefly discuss classical traffic-routing methods that consider real-time traffic information and stochastic disruptions, and then propose an EV sustainable routing method.
Traffic routing that considers real-time traffic information and stochastic disruptions is known in the literature as the dynamic and stochastic shortest path problem; it has been the subject of extensive research over the last few decades. Its advantage is that, in addition to efficiently handling dynamic events, stochastic knowledge about the revealed data is considered. There are two main research methods: adaptive routing recourse policies [13] and the Markov decision process (MDP) [14,15]. Problems are usually modeled as an MDP [16]. Kim et al. [14] first proposed the application of MDP in the domain of vehicle routing. Güner et al. [15] considered the nonstationary stochastic shortest path problem using real-time traffic information. In [17], MDP was used to model dynamic shortest path problems with time-dependent stochastic disruptions, and in 2018 a hybrid approximate dynamic programming (ADP) algorithm with a clustering approach was proposed to solve the resulting optimization problem [18].

Electric Vehicle Sustainable Routing with Time-Dependent Stochastic Information
This paper proposes a route-search method that considers stochastic traffic information and the locations of EV charging stations. The main objective of the method is to find the most reasonable charging station with the assistance of dynamic and stochastic information, thereby providing a route with minimal expected travel time. The method first estimates whether the vehicle can reach the destination directly from the departure point with the current battery charge. If the battery charge is insufficient, the method selects some charging stations as potential stations. To find the shortest path from the departure point to the first station, a dynamic shortest path problem with time-dependent stochastic disruptions is modeled as a Markov decision process (MDP). To solve this MDP model efficiently, a hybrid approximate dynamic programming (ADP) algorithm is applied. The expected travel time obtained in this way and the expected charging time are used to calculate the total expected cost of every available route; the most cost-effective route is selected as the optimum routing policy. Moreover, the method can reroute to adapt to changes in traffic cost while the plan is being executed. Offering such routes can help to ease user concerns about travel distance and to avoid congested roads as much as possible, which reduces travel cost. An example is given to demonstrate the validity of the proposed method and to evaluate its impact on expected cost.
The contributions of this paper are twofold: (i) an electric vehicle navigation-planning algorithm that considers the locations of charging stations and uses dynamic information, based on the MDP model and the ADP algorithm; and (ii) an application of the algorithm to the traffic network of Nanjing, which verifies its feasibility and effectiveness. The paper is organized as follows: Section 2 provides a method for finding potentially reasonable battery charging stations. The details of the MDP-based model and the ADP solution method are provided in Sections 3 and 4. Section 5 discusses the calculation results of the proposed method. Section 6 presents a summary.

Route-Search Method
The method consists of two parts. First, potential charging stations are selected according to the current battery level of the vehicle and the locations of nearby charging stations, which yields several available routes. Second, the optimum route is chosen from the available routes using the approach described in the following sections. The process is shown in Figure 1.
This section defines the regions in which potential charging stations are sought. Situations are classified into four cases according to the current battery level and the distance between origin and destination, and they are handled in the two following steps:
Step 1: Estimate whether the vehicle needs charging en route. No charging takes place in the two following cases: (1) the vehicle can reach the destination directly from the departure point with the current battery level, and (2) there is no feasible route with the current battery level. Otherwise, the EV needs to recharge en route.
Step 2: Select potential charging stations if the vehicle needs charging en route. The situations are classified into two cases: (3) the vehicle charges a single time en route, and (4) the vehicle needs to be charged twice or more en route.
The estimation conditions and route-search method for each case are discussed below. In this paper, the origin is at node $o$, and the destination is at node $d$. $D(o, d)$ represents the shortest path distance between nodes $o$ and $d$, which can be calculated by the Dijkstra algorithm. $b$ (kWh) represents the remaining battery level at the origin, and $B$ (kWh) represents the full battery level.

No Charging en Route
In this situation, the EVs do not charge en route, which can be mainly divided into two cases: the battery has enough power to support the car until it reaches the destination, or the car cannot find any available route.

Case 1. The vehicle can reach the destination directly from the departure point with the current battery level. In this case, the estimation condition is:
$$D(o, d) \le R_b \quad (1)$$
where $D(o, d)$ is the shortest path distance between origin $o$ and destination $d$, and $R_b$ is the reachable distance of the vehicle with the current battery level. The minimal expected travel time $T^*(o, d)$ can be calculated using the method proposed in the next section. $D_T(o, d)$ represents the travel distance of the route with the minimum expected travel time. If it satisfies
$$D_T(o, d) \le R_b \quad (2)$$
then the route with the minimum expected travel time is selected as the optimum route. Otherwise, the route with the minimum travel distance is selected.
Case 2. The vehicle is unable to reach the destination directly from the departure point with the current battery level. In this case, the estimation condition is:
$$D(o, d) > R_b \quad (3)$$
One possible case is that there are no charging stations within the reachable range of the vehicle with the current battery level. $R_b$ represents the reachable distance of the vehicle with the current battery level, where $R_b = b \times e$, $b$ (kWh) is the remaining battery level, and $e$ (km/kWh) is the electric mileage. Another case is that the vehicle cannot reach the destination even after recharging within the reachable range. This case is discussed in detail below.
The schematic diagram of the cases is shown in Figure 2.
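The Step 1 check can be illustrated with a minimal Python sketch. The function names are hypothetical, and the optional `reserve_kwh` argument is an assumption borrowed from the scenario-based example later in the paper, where part of the battery is held back for emergencies; it is not part of the estimation condition itself.

```python
def reachable_distance(battery_kwh, mileage_km_per_kwh, reserve_kwh=0.0):
    """Distance (km) the vehicle can cover: usable energy times electric mileage."""
    usable = max(battery_kwh - reserve_kwh, 0.0)
    return usable * mileage_km_per_kwh


def needs_charging(shortest_distance_km, battery_kwh, mileage_km_per_kwh,
                   reserve_kwh=0.0):
    """True when the shortest path is longer than the reachable distance."""
    limit = reachable_distance(battery_kwh, mileage_km_per_kwh, reserve_kwh)
    return shortest_distance_km > limit


# Scenario-style numbers: 5 kWh left, 2.5 kWh held in reserve, 6 km/kWh.
print(reachable_distance(5, 6, reserve_kwh=2.5))   # radius of reachable range A
print(reachable_distance(40, 6, reserve_kwh=2.5))  # radius of reachable range B
```

When `needs_charging` returns `True`, the distinction between "no feasible route" and "charging en route" still depends on whether suitable charging stations exist, which this sketch does not decide.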

Charging en Route
This situation satisfies Equation (3), which means that the vehicle cannot reach the destination unless it recharges on the way. The situation can be classified into two cases:
Case 3. The vehicle charges a single time en route. In this case, the vehicle can reach the destination with a single charge. This means that at least one charging station exists in the overlapping area defined below.
Reachable range A: the circular area centered on origin node $o$ with radius $R_b$, the distance reachable with the current battery level. Reachable range B: the circular area centered on destination node $d$ with radius $R_B$, where $R_B = B \times e$, $B$ (kWh) is the full battery level, and $e$ (km/kWh) is the electric mileage. Battery charging stations in the overlapping area of the two circles are selected as potential stations. The total expected travel time is then calculated for each available route, using the potential stations as sub-path nodes. Potential stations are denoted by $c_i$, $i = 0, \dots, n$, where $n$ is the maximum number of potential stations. The minimum expected travel time $T^*(o, c_i)$ can be calculated using the method proposed in the next section. The travel time $T(c_i, d)$ can be calculated using the Dijkstra algorithm. $R_b$ and $R_B$ here represent straight-line (Gaussian) distances, so the travel distances between nodes $o$ and $d$ via the potential stations should be recalculated to ensure that the routes are feasible. Charging time is represented by $t_{c_i}$, which is affected by the type of the charging station (e.g., normal charging station or fast charging station). The total expected travel time is calculated by Equation (4):
$$T_{total}(o, d) = T^*(o, c_i) + t_{c_i} + T(c_i, d) \quad (4)$$
The route with the minimum $T_{total}(o, d)$ is selected as the optimum route.
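As an illustration of the single-charge case, the sketch below filters stations by the overlap of the two reachable ranges and then picks the station with the smallest total of expected driving time to the station, charging time, and driving time to the destination. The data layout (dictionaries with `name`, `xy`, and `charge_min` keys) and both function names are assumptions made for this example, and the travel times are taken as precomputed inputs rather than produced by the MDP model or the Dijkstra algorithm. The road-distance feasibility check mentioned above is not included.

```python
import math


def stations_in_overlap(stations, origin, dest, radius_a, radius_b):
    """Keep stations within radius_a of the origin and radius_b of the destination."""
    return [s for s in stations
            if math.dist(s["xy"], origin) <= radius_a
            and math.dist(s["xy"], dest) <= radius_b]


def best_single_charge(candidates, time_to_station, time_to_dest):
    """Return (total_minutes, station_name) with the smallest total time, or None."""
    totals = [(time_to_station[s["name"]] + s["charge_min"] + time_to_dest[s["name"]],
               s["name"])
              for s in candidates]
    return min(totals) if totals else None


# Hypothetical stations on a plane measured in km.
stations = [
    {"name": "A", "xy": (10, 0), "charge_min": 30},
    {"name": "B", "xy": (50, 0), "charge_min": 20},
    {"name": "C", "xy": (12, 5), "charge_min": 40},
]
candidates = stations_in_overlap(stations, (0, 0), (100, 0), 15, 95)
print(best_single_charge(candidates,
                         {"A": 12, "C": 14},    # expected minutes origin -> station
                         {"A": 70, "C": 60}))   # minutes station -> destination
```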
Case 4. The vehicle charges twice or more en route. In this case, reachable ranges A and B do not overlap, and the vehicle needs to stop at two or more battery charging stations. The method of determining the potential area in which potential charging stations are located is given below.
As shown in Figure 3, the potential area consists of two circular areas and the area enclosed by the common tangents between them. One circular area is centered on the origin node $o$, and the other is centered on the destination node $d$. The radius of the circular area centered on $o$ is $R_b$, the distance reachable with the current battery level, and the radius of the circular area centered on $d$ is $R_B$, the distance reachable with a full battery, multiplied by a scalar α, α ∈ [0.1, 1]. A smaller value for α can be taken first, and if there is no available route, the radius is increased. The more charging stations are taken into account, the more available routes can be found. However, the potential area is kept limited because of the computational cost. Note that if there are not enough charging stations for the vehicle to reach the end point after the first charge, the vehicle will not reach the destination. This is the case mentioned earlier.
Then, the total expected travel time for each available route is calculated using the potential stations above as nodes of the sub-paths. If the travel distance between a charging station and $c_i$ is less than $R_B$, it is considered a neighboring station of $c_i$, which is denoted by $c_j$. The minimum expected travel time $T^*(o, c_f)$ can be calculated using the method proposed in Section 4. $T(c_i, c_j)$ and $T(c_l, d)$ can be calculated by the Dijkstra algorithm. The potential routes should be verified against their travel distances to ensure that they are feasible. The total expected travel time is calculated by Equation (5):
$$T_{total}(o, d) = T^*(o, c_f) + \sum_{(c_i, c_j)} T(c_i, c_j) + T(c_l, d) + \sum_{i} t_{c_i} \quad (5)$$
where $c_f$ is the station at which the vehicle stops first, $c_l$ is the station at which the vehicle stops last, the first sum runs over consecutive stations on the route, and $t_{c_i}$ is the charging time at station $c_i$. The charging station $c_f$ must be included in reachable range A, as defined in Case 3. The route with the minimum $T_{total}(o, d)$ is selected as the optimum route.
The schematic diagram of the cases is shown in Figure 3.
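One way to test whether a charging station lies in the potential area of Case 4 is sketched below. It relies on a geometric fact rather than on anything stated in the paper: two circles together with the region between their outer common tangents form the convex hull of the two discs, and a point belongs to that hull exactly when it lies inside some "interpolated" disc whose center and radius move linearly from one circle to the other. The function name and the ternary search are illustrative choices.

```python
import math


def in_potential_area(p, origin, dest, r_origin, r_dest, iters=100):
    """True if p lies in either circle or between their outer common tangents."""
    def gap(lam):
        # Signed distance from p to the disc interpolated at position lam in [0, 1].
        cx = origin[0] + lam * (dest[0] - origin[0])
        cy = origin[1] + lam * (dest[1] - origin[1])
        radius = r_origin + lam * (r_dest - r_origin)
        return math.hypot(p[0] - cx, p[1] - cy) - radius

    lo, hi = 0.0, 1.0
    for _ in range(iters):  # ternary search; gap is convex in lam
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if gap(m1) < gap(m2):
            hi = m2
        else:
            lo = m1
    return gap((lo + hi) / 2) <= 1e-9


# Origin circle of radius 1 at (0, 0), destination circle of radius 2 at (10, 0).
print(in_potential_area((5, 1.4), (0, 0), (10, 0), 1, 2))
print(in_potential_area((5, 1.6), (0, 0), (10, 0), 1, 2))
```

As in Case 3, the test uses straight-line distances, so road distances along any candidate route would still need to be verified.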

Shortest Path Problem with Stochastic Information
Fluctuations in demand and incidents may cause congestion in a traffic network. To reduce travel time, the traveler needs to take congestion into account by considering all traffic states that change the travel time. In this paper, disruption factors are the factors that can cause congestion, and a link in the disruption state is a link affected by disruption factors. Links that may be disrupted are called vulnerable links. The disruption rate is defined as the steady-state probability of having a disruption on a vulnerable link.
In this section, the dynamic shortest path problem with time-dependent stochastic disruptions is modeled as a discrete-time finite-horizon Markov decision process (MDP). In the traffic network considered, the travel time of an arc $(i, j)$ is related to historical traffic data and assumed to follow a discrete distribution function, given the disruption states. Information about the disruption states of all vulnerable links is updated when the next node is reached. The objective of the problem is to obtain the minimum expected travel time between the origin node and a potential charging station. The MDP-based mathematical formulation of the problem is given below, using the framework and standard notation of Powell [19].

State Variables
$S_t$ represents the system state at stage $t$ and is composed of two parts:
$$S_t = (i_t, \hat{D}_t)$$
where $i_t$ represents the current node at stage $t$, and $\hat{D}_t$ represents the disruption vector that gives the disruption level of all vulnerable links. The disruption level of a vulnerable link $r$ can take any value from the disruption level set $U_r = \{0, \dots, K_r\}$ (i.e., $\hat{D}_t(r) \in U_r$).

Decision Variables
At each node, a decision is made about which node to travel to next. The decision function that returns decision $x_t$ under the given state $S_t$ is denoted by
$$x_t = X^\pi(S_t)$$
where π denotes a policy, Π denotes the set of possible policies, and $\pi \in \Pi$.

Exogenous Information Processes
Exogenous information becoming available during interval $t$ is represented by $W_t$. Exogenous information during interval $t + 1$ is denoted by:
$$W_{t+1} = \hat{D}_{t+1} \quad (7)$$
where $\hat{D}_{t+1}$ represents the available information on the disruption level of all vulnerable links between stages $t$ and $t + 1$.

Transition Function
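The standard transition function can be stated as:
$$S_{t+1} = S^M(S_t, x_t, W_{t+1})$$
where superscript M stands for model. In this problem, the transition function can be formulated as:
$$S_{t+1} = (i_{t+1}, \hat{D}_{t+1})$$
where $i_{t+1} = x_t$ and $\hat{D}_{t+1} = W_{t+1}$. The transition matrix $P(S_{t+1} \mid S_t, x_t)$ must be given to solve the problem; it gives the probability that, if we are in state $S_t$ and take action $x_t$, we are next in state $S_{t+1}$. $p^r_{u,u'}$ denotes the unit-time transition probability between any two disruption levels $u$ and $u'$ of vulnerable link $r$. The time-dependent transition matrix $\Theta^r(t \mid i_t, x_t)$ for vulnerable link $r$ is formulated as:
$$\Theta^r(t \mid i_t, x_t) = \left[ p^r_{u,u'} \right]^{t_{i_t, x_t}}$$
where $t_{i_t, x_t}$ is the travel time from node $i_t$ to node $x_t$. Then, the transition probability is calculated as:
$$P(\hat{D}_{t+1} \mid S_t, x_t) = \prod_{r=1}^{R} \Theta^r_{u_r, u'_r}(t \mid i_t, x_t)$$
where $R$ is the number of vulnerable links, and $u_r$ and $u'_r$ are the disruption levels of link $r$ in $\hat{D}_t$ and $\hat{D}_{t+1}$. In simulation-based experiments, the unit-time transition matrix is calculated from the given disruption rate.
Consider a two-state Markov chain (state 0: no disruption, state 1: disruption) with a unit-time transition matrix for the vulnerable link $r$:
$$p^r = \begin{pmatrix} 1 - p^r_{0,1} & p^r_{0,1} \\ p^r_{1,0} & 1 - p^r_{1,0} \end{pmatrix}$$
The eigenvalues of the matrix are 1 and $1 - p^r_{0,1} - p^r_{1,0}$, so:
$$(p^r)^t = \frac{1}{p^r_{0,1} + p^r_{1,0}} \begin{pmatrix} p^r_{1,0} & p^r_{0,1} \\ p^r_{1,0} & p^r_{0,1} \end{pmatrix} + \frac{(1 - p^r_{0,1} - p^r_{1,0})^t}{p^r_{0,1} + p^r_{1,0}} \begin{pmatrix} p^r_{0,1} & -p^r_{0,1} \\ -p^r_{1,0} & p^r_{1,0} \end{pmatrix}$$
The transition probability to a disrupted state at time $t$, given that the link is disrupted at time $t = 0$, is denoted by $p^r_{1,1}(t)$:
$$p^r_{1,1}(t) = \frac{p^r_{0,1}}{p^r_{0,1} + p^r_{1,0}} + \frac{p^r_{1,0}}{p^r_{0,1} + p^r_{1,0}} (1 - p^r_{0,1} - p^r_{1,0})^t$$
It converges exponentially to $p^r_{0,1} / (p^r_{0,1} + p^r_{1,0})$, which is the disruption rate.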
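The two-state result can be checked numerically. The sketch below compares the closed-form t-step matrix with brute-force matrix multiplication and multiplies per-link probabilities to obtain the probability of a whole disruption vector. It assumes that links evolve independently and that the two unit-time probabilities do not both equal zero; the names `p01` (normal to disrupted) and `p10` (disrupted to normal) are introduced for this example only.

```python
def n_step_matrix(p01, p10, t):
    """Closed-form t-step matrix of a two-state chain (0 = normal, 1 = disrupted)."""
    total = p01 + p10
    decay = (1.0 - total) ** t
    return [[(p10 + p01 * decay) / total, (p01 - p01 * decay) / total],
            [(p10 - p10 * decay) / total, (p01 + p10 * decay) / total]]


def matrix_power(matrix, t):
    """Brute-force t-step matrix by repeated 2x2 multiplication."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(t):
        result = [[sum(result[i][k] * matrix[k][j] for k in range(2))
                   for j in range(2)] for i in range(2)]
    return result


def joint_probability(levels_now, levels_next, link_params, t):
    """Probability of moving between two disruption vectors after t time units."""
    prob = 1.0
    for now, nxt, (p01, p10) in zip(levels_now, levels_next, link_params):
        prob *= n_step_matrix(p01, p10, t)[now][nxt]
    return prob


print(n_step_matrix(0.2, 0.3, 5))
print(matrix_power([[0.8, 0.2], [0.3, 0.7]], 5))
print(n_step_matrix(0.2, 0.3, 200)[1][1])  # approaches 0.2 / (0.2 + 0.3)
```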

Objective Function
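The objective of the problem is to obtain the optimal policy with minimal expected travel time between the origin node and a potential charging station. The objective can be calculated as:
$$\min_{\pi \in \Pi} \mathbb{E} \left[ \sum_{t=0}^{T} C\big(S_t, X^\pi(S_t)\big) \right]$$
where $C(S_t, x_t)$ is the travel time incurred by taking decision $x_t$ in state $S_t$, and $T$ is the final stage.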

Approximate Dynamic Programming Algorithm
The problem can be solved by the classical backward recursion of dynamic programming. However, it may become intractable in large-scale or complex networks because of the curse of dimensionality. An ADP algorithm is therefore used to improve the computational efficiency of the method. In this paper, the value function approximation algorithm with a post-decision state variable proposed by Powell [19] is adopted. Its main elements are as follows:
At stage $t$ of iteration $n$, the node that gives the optimal value $\hat{v}^n_t$ is denoted as $x^*_t$.
If $t > 0$, update $\bar{V}_{t-1}(S^x_{t-1})$ using
$$\bar{V}^n_{t-1}(S^{x,n}_{t-1}) = (1 - \alpha_{n-1}) \bar{V}^{n-1}_{t-1}(S^{x,n}_{t-1}) + \alpha_{n-1} \hat{v}^n_t$$
The harmonic step size is calculated as $\alpha_n = a / (a + n - 1)$.
In the algorithm, the initial values are determined by solving a deterministic shortest path problem without disruptions. The approximate value of post-decision state $S^x_t$ is denoted by $\bar{V}_t(S^x_t)$, which consists of the expectation over all possible states at the next stage. The step size $\alpha_n$ is used to take a weighted average between the current approximate value and the last-iteration approximate value. The constant $a$ is chosen to ensure that the step size is less than 0.05 as the algorithm approaches convergence.
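To make the update rule concrete, the following Python sketch runs a single-pass value function approximation on a small network. It is an illustration under several assumptions, not the implementation used in the paper: each vulnerable link has two levels (normal or disrupted), links evolve independently over the travel time of the chosen arc, the network is acyclic, the destination is reachable from the origin, travel times are integers, and decisions are purely greedy. All function and parameter names are hypothetical.

```python
import heapq
import random


def times_to_dest(dest, arcs):
    """Dijkstra on reversed arcs: disruption-free time from each node to dest."""
    rev = {}
    for (i, j), t in arcs.items():
        rev.setdefault(j, []).append((i, t))
    dist, heap = {dest: 0.0}, [(0.0, dest)]
    while heap:
        d, j = heapq.heappop(heap)
        if d > dist[j]:
            continue
        for i, t in rev.get(j, []):
            if d + t < dist.get(i, float("inf")):
                dist[i] = d + t
                heapq.heappush(heap, (d + t, i))
    return dist


def disrupted_prob(was_disrupted, p01, p10, t):
    """Probability that a two-state link is disrupted t time units later."""
    if p01 + p10 == 0:
        return float(was_disrupted)
    rate = p01 / (p01 + p10)
    decay = (1.0 - p01 - p10) ** t
    return rate + ((1.0 - rate) if was_disrupted else -rate) * decay


def adp_expected_time(origin, dest, arcs, vulnerable, start, n_iter=2000, a=5,
                      factor=3, seed=0):
    """Return (estimated expected travel time, first node to visit)."""
    rng = random.Random(seed)
    links = sorted(vulnerable)            # fixed order of vulnerable links
    init = times_to_dest(dest, arcs)      # initial values: no-disruption times
    succ = {}
    for i, j in arcs:
        if j in init:
            succ.setdefault(i, []).append(j)
    V = {}                                # post-decision state -> value estimate

    def arc_time(i, j, levels):
        hit = (i, j) in vulnerable and levels[links.index((i, j))]
        return arcs[(i, j)] * factor if hit else arcs[(i, j)]

    def best(node, levels):
        return min((arc_time(node, j, levels) + V.get((j, levels), init[j]), j)
                   for j in succ[node])

    for n in range(1, n_iter + 1):
        alpha = a / (a + n - 1)           # harmonic step size
        node, levels, prev_post = origin, tuple(start), None
        for _ in range(len(init)):        # enough steps for any path in a DAG
            v_hat, choice = (0.0, None) if node == dest else best(node, levels)
            if prev_post is not None:     # smooth the previous post-decision value
                old = V.get(prev_post, init[prev_post[0]])
                V[prev_post] = (1 - alpha) * old + alpha * v_hat
            if choice is None:
                break
            t = arc_time(node, choice, levels)
            prev_post = (choice, levels)
            # Sample the disruption levels revealed on arrival at the next node.
            levels = tuple(
                int(rng.random() < disrupted_prob(levels[k], *vulnerable[link], t))
                for k, link in enumerate(links))
            node = choice
    return best(origin, tuple(start))


arcs = {(1, 2): 2, (2, 4): 2, (1, 3): 1, (3, 4): 5}
print(adp_expected_time(1, 4, arcs, {(1, 2): (0.3, 0.4)}, [1], n_iter=500))
```

Here `vulnerable` maps an arc to its unit-time probabilities of becoming disrupted and of recovering, and `start` lists the initial disruption levels in sorted arc order. A purely greedy policy may under-explore in larger networks, so an exploration step would be a natural extension.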

Simulation Experiments
This section presents several typical cases and a scenario-based example of the proposed method. The calculation results demonstrate the validity of the method in two respects: (i) finding possible routes for EVs that need charging en route, considering the locations of charging stations; and (ii) selecting the optimum route with the minimum expected time and assessing the performance of the method. Note that the routing results of the proposed method take the form of dynamic policies, in which the next node is selected on the basis of real-time disruption status information. The illustrative examples given below briefly describe the route-search method for given states. A sensitivity analysis is given at the end of this section.

Typical Cases
Case studies are given to illustrate the proposed route-search method, including the following three cases: no charging en route, charging a single time en route, and charging twice or more en route.
Consider a network with 16 nodes and four vulnerable links; the travel times (min) and disruption rates are given in the weighted graphs of the cases. The travel time with disruptions is assumed to be three times the travel time without disruptions. The full battery level is set to 30 kWh, and the electric mileage is set to 5 km/kWh. A driving speed of 90 km/h is assumed in normal conditions, and vulnerable links 1→2 and 6→7 are in disruption in the initial state.
a. Origin point: node 1; destination point: node 16.
Assumptions: the remaining battery level allows 60 km of travel. The shortest path distance between nodes 1 and 16 is 31.5 km, which means that the vehicle can reach the destination from the departure point with the current battery level. The minimum expected travel time is 23 min, and the driving distance meets Equation (2). The optimal path is 1→2→3→7→11→15→16.
A traffic network weighted graph of the case is shown in Figure 4.
A traffic network weighted graph of the case charging a single time is shown in Figure 5.
c. Origin point: node 1; destination point: node 16.
Assumptions: the charging stations are located at nodes 2, 3, and 7, and their charging power is set to 120 kW. The radius of reachable range A is 60 km, and the radius of reachable range B is 150 km. The Gaussian distance between the origin point and destination point is 240 km.
Because 60 km + 150 km is less than 240 km, reachable ranges A and B do not overlap, and the vehicle needs to charge en route twice. The potential charging stations 2 and 3 are located in reachable range A, and the potential charging station 7 is located in reachable range B.
A traffic network weighted graph of cases that charge twice is shown in Figure 6.

Scenario-based Example
Traffic information: part of the Nanjing traffic network was taken as an example. We supposed that the traveler wanted to go from home to Lukou Airport, which meant that origin node 1 was located in the top-left corner, and destination node 20 was located in the bottom-right corner.
Electric vehicle information: with reference to the EV parameters of the Nissan LEAF 2018, the full battery level was set to 40 kWh, and the electric mileage was set to 6 km/kWh. It was assumed that the remaining battery level at origin was 5 kWh, and the traveler needed to retain 2.5 kWh for emergency. Therefore, the radius of reachable range A was 15 km, and the radius of reachable range B was 225 km.
Charging station information: the overlapping area is shown in Figure 7 (the entire network is included in range B). Potential stations in the example are given in Table 1, represented by nearby node numbers. All charging stations in the network were fast-charging stations; the charging power was set to 150 kW.
Network information: the real road network and simulated traffic network are shown in Figure 7. The network had 20 nodes and six vulnerable links, and all of the vulnerable links had a high steady-state probability of disruption, between 0.4 and 0.9. The travel time for the links without any disruptions was based on 2016 surveyed traffic data. The travel time is given in minutes and rounded to a positive integer. According to the relevant literature, travel times with disruptions are 3 to 5 times higher than travel times without disruptions [20]. In this example, the travel time of a vulnerable link in a disruption state was taken to be three times the travel time in normal conditions. It was supposed that vulnerable links 1→2, 4→11, and 3→30 were in disruption in the initial state. Table 1 shows that, in this example, the electric vehicle only needed to be recharged once. The weighted graph of the illustrative network is shown in Figure 8. The graph was obtained by removing co-ordinate information from the network.
The expected travel time for each available route, calculated by the method and considering stochastic information, is given in Table 2. In the ADP approach, step size constant was set to 5, and maximal number of iterations was set to 5000. The travel time between the potential charging station and destination was calculated by the expected value, using steady probability. In Table 2, Path 1 represents the path from the origin node to the charging station, and Path 2 represents the path from the charging station to the destination. Table 2 demonstrates that a reasonable route could be found with the method proposed in this paper. Both battery charging stations located in nodes 4 and 13 could be selected as optimal charging stations because of the same total travel time.
In order to evaluate the performance of the routing method when considering stochastic information, the results of naive and robust routing policies were calculated to compare the expected cost difference. The naive and robust routing policies are offline methods based on historical information. In the example below, the naive routing policy assumed that the network had no disruptions, and the robust policy assumed that all the vulnerable links were in the disruption state. The results calculated by the naive routing method are given in Table 3, and those calculated by the robust routing method are given in Table 4.
To compare the cost of the proposed method with the cost of the offline methods, we treated the naive and robust policies as predetermined policies and computed their expected values with time-dependent stochastic information. This evaluation took all initial states into consideration. The impact of charging time was ignored in the evaluation because the charging time was the same for all policies. The routing policy calculated by our method is referred to as the stochastic routing policy. The percentage travel time difference ∆ for each routing policy is given in Table 5, where:
$$\Delta = \frac{E[\text{policy}] - E[\text{stochastic policy}]}{E[\text{stochastic policy}]} \times 100\%$$
and $E[\cdot]$ denotes the expected travel time of a policy.
Table 5 demonstrates that the stochastic policy performed better than the offline policies. A likely reason is that the naive policy ignored the disruptions and the robust policy was risk-averse, while the stochastic policy took all possible states into account by using dynamic and stochastic information. The stochastic policy could therefore provide a more reliable route based on an exhaustive evaluation of possible states. Another advantage of the stochastic policy is that the route could be updated en route using real-time information.
To clarify the method process, this section gives a small traffic network as an example. However, the increasing number of nodes and disruption levels changes neither the formulation nor the algorithm structure.

Sensitivity Analysis
To analyze the robustness of the algorithm proposed in this paper, we generated 600 test instances in total. For each instance type, we randomly generated 100 replications. In the test instances, each link had a randomly selected discrete travel time taken from a uniform distribution U[1, 10]. If there was a disruption, the travel time of the link was tripled. The test instances were constructed based on the following network properties:
Network size: the small network consisted of 16 nodes, and the large network consisted of 36 nodes. Each instance was designed such that the origin and destination were located in the top-left and bottom-right corners, respectively.
Network vulnerability: instances with 30% and 80% of vulnerable arcs were considered as low and high network vulnerability.
Disruption rate: we defined a low probability of having disruptions to be between 0 and 0.5, and a high probability to be between 0.5 and 1.
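A test instance with these properties could be generated along the following lines. The square grid layout with arcs pointing right and down is an assumption suggested by the node counts (16 and 36) and the corner-to-corner origin-destination pairs; the paper does not specify the topology. The function name and return format are also illustrative.

```python
import random


def make_grid_instance(side, vulnerable_share, rate_range, seed=0):
    """Build a side x side grid with integer travel times drawn from U[1, 10]."""
    rng = random.Random(seed)
    arcs = {}
    for r in range(side):
        for c in range(side):
            node = r * side + c + 1          # node 1 is the top-left corner
            if c + 1 < side:
                arcs[(node, node + 1)] = rng.randint(1, 10)     # arc to the right
            if r + 1 < side:
                arcs[(node, node + side)] = rng.randint(1, 10)  # arc downwards
    n_vulnerable = round(vulnerable_share * len(arcs))
    chosen = rng.sample(sorted(arcs), n_vulnerable)
    # Each vulnerable arc gets a disruption rate drawn from the given range.
    vulnerable = {arc: rng.uniform(*rate_range) for arc in chosen}
    return arcs, vulnerable


# Small network, low vulnerability, low disruption rate.
arcs, vulnerable = make_grid_instance(4, 0.3, (0.0, 0.5), seed=1)
print(len(arcs), len(vulnerable))
```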
The algorithms presented in this section were implemented in C++. All experiments were conducted on a personal computer with an Advanced Micro Devices (AMD) Ryzen 5 2600 3.2 GHz processor and 16 GB of RAM. For the test instances, the overall average computational time, measured in CPU seconds, was 0.001 s for the naive policy, 0.001 s for the robust policy, and 0.125 s for the stochastic policy.
The average values of the percentage cost difference relative to the proposed policy for different network configurations are shown in Figure 6.

Conclusions
In this paper, a routing method for electric vehicles (EVs) was proposed. The objective of the method was to find a route with minimal expected travel time by using time-dependent stochastic information. A route-search method was first presented to find available routes, and the problem was then transformed into a shortest path problem. In order to calculate the expected travel time for each available route, the problem was modeled as a Markov decision process (MDP). An approximate dynamic programming (ADP) algorithm was presented to solve the problem, which made the method computable in complex traffic networks. The results of the illustrated examples showed that the method could find a reasonable route for EVs and achieve a lower expected travel time than offline methods. The proposed method extends research on EV routing by using the dynamic and stochastic information provided by intelligent transportation systems (ITS), and it can support the sustainable driving of electric vehicles.
When selecting potential battery charging stations, it was assumed that power consumption was only related to the driving distance. However, power consumption also depends on road slope and air-conditioning equipment. Therefore, some power was retained in the example to narrow the range so that all potential charging stations would be reachable. The regions for selecting potential charging stations could be narrowed quantitatively by considering these factors in further studies. Moreover, the objective of this method was to find the route with the minimum expected time, which only took driving and charging time into account. To offer a more flexible route for users, other factors should be considered, such as waiting time at charging stations and electricity price. In addition, a hybrid ADP algorithm with a clustering approach can be adopted to further reduce the computational effort of the proposed method [18]. If the disruption statuses of the vulnerable arcs at the next stage do not change dramatically compared with those at the current stage, nodes that are close to each other can be clustered. All in all, sustainable electric vehicle routing using dynamic and stochastic information is a topic of great research value.