Machine Learning Based PEVs Load Extraction and Analysis

Abstract: The transformation of the energy sector brought about by plug-in electric vehicles (PEVs) has confronted researchers with new challenges in recent years. The foremost challenge is the uncertain behavior of PEVs, which prevents operators from determining a deterministic load profile. Load forecasting of PEVs is crucial in both the operation and the planning of energy systems. PEV load demand depends mainly on the vehicles' traveling behavior. This paper presents a model that forecasts PEVs' traveling behavior in order to extract the PEV load profile. The model is based on machine-learning techniques; namely, a generalized regression neural network (GRNN) that relates PEVs' arrival/departure times to their traveling behavior. The results show that the GRNN can relate the arrival/departure times of PEVs to the distance they travel, with a correlation coefficient (R) of 99.49% for training and 98.99% for testing. The trained and saved GRNN model can therefore forecast PEVs' trip lengths after training and testing on historical data. Finally, the results indicate the importance of implementing more accurate methods for predicting PEV behavior, given the growing role of electrical energy in vehicles in the years to come.


Introduction
Energy consumers worldwide can be divided into four sectors: transportation, industry, commerce and residence [1]. Among them, the transportation sector emits the largest share of greenhouse gases, nearly 50%, into the atmosphere, causing air pollution, climate change and global warming [2]. Transportation electrification is a feasible opportunity to curb the mass motorization of the transportation sector and consequently reduce greenhouse gas emissions. However, the advent of plug-in electric vehicles (PEVs) poses new challenges that must be addressed from the power-system point of view [3]. The most important challenge concerning PEVs is delivering power to their storage systems [4].
The PEV charging problem, caused by weak distribution infrastructure in many countries and a growing number of PEV fleets, has been an active research topic in recent years. For instance, PEV aggregators were proposed in [5] to handle coordinated PEV charging. Another solution is the battery-swapping technology proposed in [6,7]. The uncertainty of the load profile of PEVs is one of the main


Methodologies
This section presents a description of the GRNN method and PEV-battery model.

Generalized Regression Neural Network (GRNN)
GRNN is a machine-learning method and a powerful regression model. It can be defined as a type of radial-basis-function (RBF) network that trains very quickly thanks to its completely parallel structure. Even with little input data, this network can establish nonlinear relationships between the target variable and a set of independent variables [30,31]. Fast learning, a meaningful mapping between target time-series data and a set of explanatory variables, and rapid convergence to the optimal regression surface as the number of samples increases are the most important advantages of GRNN [32,33]. As Figure 1 shows, the GRNN structure consists of the input layer, the pattern layer, the summation layer and the output layer [34].
The inputs in Figure 1 include all the parameters that are effective in determining and predicting the output. In this paper, the arrival and departure times of the EVs are the inputs, and the traveling distance of each EV is forecasted as the output of the GRNN. The input layer performs no processing and just transfers the input data to the next layer. In the pattern layer, the number of neurons is determined by the training data, and the output of neuron $i$ is given by the following function [34]:

$$P_i = \exp\left[-\frac{(X - X_i)^T (X - X_i)}{2\sigma^2}\right]$$

where $X$ is the network input variable, $X_i$ is the training sample related to neuron $i$, $T$ denotes the transpose and $\sigma$ is the smoothing parameter. The larger $\sigma$ is, the smoother the estimated density.
The summation layer includes two types of summation. The first is the arithmetic (unweighted) summation of the outputs of all pattern neurons [30,34]:

$$S_{N2} = \sum_i P_i$$

The second is the weighted summation:

$$S_{N1} = \sum_i Y_i P_i$$

where $Y_i$ is the $i$th connection weight. Finally, in the output layer, the probable value of the output $y(x)$ is obtained by dividing the weighted summation by the arithmetic summation as follows [32,34]:

$$y(x) = \frac{S_{N1}}{S_{N2}} = \frac{\sum_i Y_i P_i}{\sum_i P_i}$$

In the GRNN structure, the summation neuron $S_{N1}$ calculates the sum of the weighted outputs of the pattern layer, while $S_{N2}$ computes the sum of the unweighted outputs of the pattern neurons. The output layer divides the output of the $S_{N1}$ neuron by the output of the $S_{N2}$ neuron to yield the desired estimate [35]. Figure 2 illustrates the flowchart of the GRNN.
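The layer-by-layer description above can be illustrated with a short sketch. The following Python code is not the MATLAB implementation used in this paper; it is a minimal illustration that assumes a Gaussian pattern function with a single smoothing parameter and treats the connection weights as the training targets. The function name grnn_predict and the toy data are hypothetical.

```python
import math


def grnn_predict(x, train_x, train_y, sigma):
    """Estimate y(x) as (weighted sum) / (unweighted sum) of pattern outputs."""
    weighted_sum = 0.0    # S_N1: sum of Y_i * P_i
    unweighted_sum = 0.0  # S_N2: sum of P_i
    for x_i, y_i in zip(train_x, train_y):
        # Squared Euclidean distance (X - X_i)^T (X - X_i)
        d2 = sum((a - b) ** 2 for a, b in zip(x, x_i))
        # Pattern-neuron output for training sample i
        p_i = math.exp(-d2 / (2.0 * sigma ** 2))
        weighted_sum += y_i * p_i
        unweighted_sum += p_i
    if unweighted_sum == 0.0:
        # Every kernel underflowed to zero: sigma is too small for this input.
        raise ValueError("sigma is too small for the given input")
    return weighted_sum / unweighted_sum


# Hypothetical toy data: (arrival hour, departure hour) -> distance in miles.
train_x = [(17.0, 7.0), (18.5, 8.0), (16.0, 6.5)]
train_y = [30.0, 22.0, 41.0]
print(grnn_predict((17.2, 7.1), train_x, train_y, sigma=0.5))
```

A small smoothing parameter makes the estimate follow the nearest training sample, whereas a very large one pulls it toward the mean of all training targets, which is the smoothing effect described above.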

Battery SoC Model
Each EV uses a battery as its energy-storage system. In this paper, a lithium-ion battery is selected for each EV. The state of charge (SoC) of a battery can be modeled as a nonlinear or a linear function [10]. However, most nonlinear models are empirical and are not suitable for optimal power-system studies. Therefore, for simplicity, a linear SoC model is adopted in this work, as in many other works [9,14]. The linear form of the SoC can be stated as

$$SoC_i^{t} = SoC_i^{t-1} + \frac{\eta_i^{chr} P^{chr,t}}{Cap_i} \times 100 \quad (9)$$

where $SoC_i^{t}$ (%) is the SoC of the $i$th EV at time $t$, with $t$ indexed in hourly steps, $\eta_i^{chr}$ is the charging efficiency of each EV, $Cap_i$ (kWh) is the battery capacity of each EV and $P^{chr,t}$ (kW) is the charging power at time $t$. At the beginning of the charging process, the initial SoC of the EV must be known and can be extracted from Equation (10):

$$SoC_i^{int} = \left(1 - \frac{d_i \, \rho_i^{dis}}{Cap_i}\right) \times 100 \quad (10)$$

In Equation (10), $SoC_i^{int}$ (%) is the initial SoC of each EV, $d_i$ (mile) is the traveling distance and $\rho_i^{dis}$ (kWh/mile) is the efficiency of discharging.
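As an illustration of the linear battery model, the sketch below computes an initial SoC from a traveled distance and then applies one charging step. It rests on two assumptions that should be checked against Equations (9) and (10): each PEV is taken to leave fully charged, so the initial SoC is 100% minus the energy used on the trip, and the charging update is taken over one-hour steps. The function names and the 30-mile trip are hypothetical; the battery parameters are those used later in the simulations.

```python
def initial_soc(distance_mile, capacity_kwh, rho_kwh_per_mile):
    """Initial SoC (%) on arrival, assuming the PEV departed at 100% SoC."""
    soc = (1.0 - distance_mile * rho_kwh_per_mile / capacity_kwh) * 100.0
    return max(soc, 0.0)  # a battery cannot go below empty


def charge_step(soc_percent, power_kw, capacity_kwh, eta_chr, dt_h=1.0):
    """Linear SoC update (%) after charging for dt_h hours, capped at 100%."""
    soc = soc_percent + eta_chr * power_kw * dt_h / capacity_kwh * 100.0
    return min(soc, 100.0)


# 30 kWh battery, 0.28 kWh/mile, charging efficiency 0.88, 1.5 kW charger.
soc_arrival = initial_soc(30.0, capacity_kwh=30.0, rho_kwh_per_mile=0.28)
soc_next_hour = charge_step(soc_arrival, power_kw=1.5, capacity_kwh=30.0, eta_chr=0.88)
print(soc_arrival, soc_next_hour)
```

Under these assumptions, a 30-mile trip consumes 8.4 kWh of the 30 kWh battery, and one hour at 1.5 kW returns 1.32 kWh to it after charging losses.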

Performance Evaluation of GRNN
Performance evaluation is one of the most important parts of any study, because it allows the results to be verified. The performance of the results can be evaluated and confirmed in a variety of ways [36]. In this paper, the correlation coefficient (R), mean square error (MSE), root-mean-square error (RMSE) and mean absolute error (MAE) were used as statistical performance criteria to evaluate the GRNN method. Each criterion captures a different aspect of forecast quality. The closer R is to 1 and the closer MSE, RMSE and MAE are to zero, the better the network performance. These criteria can be calculated as follows [37,38]:

$$R = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{N} (X_i - \bar{X})^2 \, \sum_{i=1}^{N} (Y_i - \bar{Y})^2}}$$

$$MSE = \frac{1}{N} \sum_{i=1}^{N} (X_i - Y_i)^2$$

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (X_i - Y_i)^2}$$

$$MAE = \frac{1}{N} \sum_{i=1}^{N} \left| X_i - Y_i \right|$$

where $X_i$ and $\bar{X}$ are the real value and the average of the real values, $Y_i$ and $\bar{Y}$ are the forecasted value and the average of the forecasted values, respectively, and $N$ is the number of samples.
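The four criteria can be computed directly from paired real and forecasted values. The sketch below uses the standard textbook definitions of the Pearson correlation coefficient, MSE, RMSE and MAE; the function name forecast_metrics and the sample values are hypothetical and are not taken from the dataset of this paper.

```python
import math


def forecast_metrics(actual, forecast):
    """Return R, MSE, RMSE and MAE for paired real and forecasted values."""
    n = len(actual)
    mean_x = sum(actual) / n
    mean_y = sum(forecast) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(actual, forecast))
    var_x = sum((x - mean_x) ** 2 for x in actual)
    var_y = sum((y - mean_y) ** 2 for y in forecast)
    # R is undefined (division by zero) if either series is constant.
    r = cov / math.sqrt(var_x * var_y)
    mse = sum((x - y) ** 2 for x, y in zip(actual, forecast)) / n
    mae = sum(abs(x - y) for x, y in zip(actual, forecast)) / n
    return {"R": r, "MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae}


# Hypothetical distances in miles: real values and forecasts.
print(forecast_metrics([25.0, 31.0, 18.0, 40.0], [24.0, 32.5, 19.0, 38.0]))
```

Because RMSE is the square root of MSE, the two always rank models identically; reporting both mainly helps when comparing against studies that give only one of them.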

Simulation Results
Machine-learning methods require a dataset. The dataset used in this paper consists of historical real-world data and is provided in the supporting materials of reference [9]. It includes three parameters for 500 real-world EVs: arrival time, departure time and traveling distance. Figure 3 depicts the arrival and departure times in the dataset. Traveling distance is correlated with arrival/departure times and can therefore be estimated from them. This estimation was made by the GRNN. The arrival/departure times of each EV were used as the input dataset, and the traveling distance of each EV was selected as the target for the GRNN. In this work, 80% of the dataset was used for training the network, and the last 20% was used for verifying its ability to forecast traveling distance. An overview of the employed algorithm is given in Figure 4. All simulations were developed in the MATLAB 2018b environment on a PC with a 2.50 GHz Intel Core i7 processor and 8 GB of RAM. Figure 5 shows the correlation coefficient between the input and output variables for the training phase. Figure 6 illustrates the performance of the network in terms of MSE in the training phase. The training results indicated good correlation, with R = 99.49% at this stage. Figures 7-9 show the R, the MSE and RMSE errors, and the error histogram in the test phase, respectively.
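The 80/20 partition described above can be sketched as follows. This is a minimal Python illustration rather than the MATLAB code used in this work; the function name holdout_split, the placeholder records, and the assumption that the records are split in their stored order without shuffling are all hypothetical.

```python
def holdout_split(samples, train_fraction=0.8):
    """Split records in stored order: first part for training, rest for testing."""
    cut = int(round(len(samples) * train_fraction))
    return samples[:cut], samples[cut:]


# Placeholder records: (arrival hour, departure hour, distance in miles).
records = [(17.0, 7.0, 30.0)] * 500
train, test = holdout_split(records)
print(len(train), len(test))
```

With 500 EVs, this yields 400 training samples and 100 test samples.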

The results presented in Figures 7-9 confirmed the validity and accuracy of the training phase for predicting new input samples. After training and validation, the network could be saved as a black box and used to predict outputs for new data.
Then, the arrival/departure times were fitted to a normal distribution, and 500 new arrival/departure data points were generated using Monte Carlo simulation. The normal distribution was employed in this work because it is commonly used in the literature to model arrival/departure times [13]. The new dataset was fed to the trained and saved network to forecast new traveling distances. This process was repeated 200 times to generate different scenarios. Figure 10 shows the arrival and departure times of the generated dataset for two of the scenarios. All PEVs were assumed to have a lithium-ion battery with a capacity of 30 kWh, a charging efficiency of 0.88 and a discharging efficiency of 0.28 kWh/mile.
PEVs are charged at rated power from their arrival at the parking spot until the charging time has elapsed. The charging time of each PEV depends on its initial SoC at arrival. The initial SoC of all PEVs for different scenarios is depicted in Figure 11. Most PEVs arrived at the parking spot with more than 50% SoC, i.e., more than 15 kWh of energy. Residential demand started to rise at the same time as the PEVs arrived. A traditional home load profile is provided in Figure 12. A comparison of Figures 11 and 12 indicates that PEVs have great potential to support peak shaving of residential homes by discharging during the on-peak hours, i.e., from t = 18:00 to t = 00:00, and then charging from 01:00 until departure. The main concern is that the time available from 01:00 until departure may not be enough to charge the PEVs to a level that meets their daily traveling-distance requirement. For a residential home, the charging level is 1.5 kW. Let us assume that each PEV starts charging immediately after arriving home at the rated power of 1.5 kW.
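The Monte Carlo step described above, in which historical arrival/departure times are fitted to a normal distribution and new samples are drawn for each scenario, can be sketched as follows. This is an illustrative Python sketch, not the MATLAB code used in this work. It assumes that arrival and departure times are sampled independently and that sampled times are wrapped to the 24-hour clock; neither detail is stated in the text. The function names and the short historical lists are hypothetical.

```python
import random
import statistics


def fit_normal(samples):
    """Fit a normal distribution by its sample mean and standard deviation."""
    return statistics.mean(samples), statistics.stdev(samples)


def generate_scenarios(arrivals, departures, n_vehicles=500, n_scenarios=200, seed=0):
    """Draw n_scenarios sets of (arrival, departure) hours for n_vehicles PEVs."""
    rng = random.Random(seed)  # fixed seed so the scenarios are reproducible
    mu_a, sd_a = fit_normal(arrivals)
    mu_d, sd_d = fit_normal(departures)
    scenarios = []
    for _ in range(n_scenarios):
        scenario = [
            (rng.gauss(mu_a, sd_a) % 24.0, rng.gauss(mu_d, sd_d) % 24.0)
            for _ in range(n_vehicles)
        ]
        scenarios.append(scenario)
    return scenarios


# Hypothetical historical times in hours of the day.
hist_arrivals = [17.0, 18.0, 16.5, 19.0, 17.5]
hist_departures = [7.0, 7.5, 8.0, 6.5, 7.25]
scenarios = generate_scenarios(hist_arrivals, hist_departures)
print(len(scenarios), len(scenarios[0]))
```

Each generated pair would then be passed to the trained network to obtain a forecasted traveling distance for that vehicle.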
Figure 13a shows the load profile of all PEVs for different scenarios at the 1.5 kW charging rate. In addition, load profiles of five different PEVs for two different scenarios are shown in Figure 14. According to Figure 14a, the demand of EVs 1, 2, 3 and 5 in the final hour of the charging period was a fraction of the rated power, because their batteries were fully charged before they left the parking spot. This means that their available time was enough to charge their batteries to 100%. However, EV 4 could not be fully charged because of the lack of available time, and it charged at rated power until it left the parking spot. Figure 13a shows that from the PEVs' arrival times (t = 15:00), the load profile rose with the increasing number of PEVs; the peak load occurred at t = 23:00, when almost all PEVs were plugged into the grid and 750 kW of power was required. Then, from t = 08:00, PEVs left the parking spots and the load profile declined. In total, nearly 6.8 MWh of energy was required to meet the PEVs' demand in one day. According to this analysis, with the low charging rate it was not feasible to utilize PEVs as a storage system that can help peak shaving of residential homes.
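The charging behavior described above (rated power from arrival, a fractional final hour once the battery is full, and incomplete charging when the vehicle leaves early) can be reproduced with a short sketch. The Python code below is illustrative only. It assumes whole-hour arrival and departure times, one-hour time steps, and that grid energy equals stored energy divided by the charging efficiency; the function names are hypothetical.

```python
def pev_hourly_load(arrival_h, departure_h, soc_init, rated_kw=1.5,
                    capacity_kwh=30.0, eta_chr=0.88):
    """Return 24 hourly grid demands (kW) for one PEV charged on arrival."""
    load = [0.0] * 24
    # Grid energy needed to reach 100% SoC, including charging losses.
    need_kwh = (100.0 - soc_init) / 100.0 * capacity_kwh / eta_chr
    hours_available = (departure_h - arrival_h) % 24
    hour = arrival_h
    for _ in range(hours_available):
        if need_kwh <= 0.0:
            break
        energy = min(rated_kw, need_kwh)  # kWh drawn in this one-hour slot
        load[hour % 24] += energy
        need_kwh -= energy
        hour += 1
    return load


def fleet_load(pevs, **charger):
    """Sum hourly loads over PEVs given as (arrival_h, departure_h, soc_init)."""
    total = [0.0] * 24
    for arrival_h, departure_h, soc_init in pevs:
        single = pev_hourly_load(arrival_h, departure_h, soc_init, **charger)
        for h, kw in enumerate(single):
            total[h] += kw
    return total


# One PEV arriving at 18:00 with 72% SoC and leaving at 08:00.
print(pev_hourly_load(18, 8, 72.0))
```

When all 500 PEVs charge simultaneously at 1.5 kW, the aggregate demand is 500 × 1.5 kW = 750 kW, which is consistent with the peak reported for Figure 13a.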
To do so, a higher charging rate is required. Figure 13b,c shows the load profiles of PEVs with higher charging levels. Table 1 describes the adopted charging levels. The higher the charging rate, the more energy was required at peak hours. With charging levels II and III, 1.5 MWh and 1.7 MWh of energy, respectively, were required at peak hours, which can cause bus-voltage drops and line-current violations. Therefore, an accurate model of the PEV load profile is necessary to schedule the system optimally and to benefit from the great potential of PEVs from the power-system and environmental points of view.
Electronics 2020, 9, x FOR PEER REVIEW
To evaluate the effectiveness of the GRNN in traveling-distance prediction, its results should be compared with those obtained from other methods. Ideally, the results would be compared on the same data, but the various studies used different data.
Table 2 compares the results of various studies on traveling-distance prediction with the model presented in this paper.

| Method | Reference | R | MSE | RMSE | MAE |
|---|---|---|---|---|---|
| … | [24] | 0.8928 | - | - | - |
| Random forest algorithm (RFA) | [24] | 0.9459 | - | - | - |
| Classification and regression trees (CART) | [24] | 0.9186 | - | - | - |
| Chi-square automatic interaction detector (CHAID) | [24] | 0.9151 | - | - | - |
| Conventional error back propagation (CEBP) | [22] | 0.8364 | - | 15.90 | 11.33 |
| Levenberg-Marquardt (LM) | [22] | 0.8683 | - | 12.56 | 9.09 |
| Rough-based CEBP (R-CEBP) | [22] | 0.8969 | - | 11.27 | 7.84 |
| Rough-based LM (R-LM) | [22] | 0.9447 | - | 8.11 | 5.87 |
| Recurrent rough network with CEBP (RR-CEBP) | [22] | 0.9099 | - | 9.04 | 6.42 |
| Recurrent rough network with LM (RR-LM) | [22] | 0.9588 | - | 8.10 | 5.48 |

The results presented in Table 2 show that the choice of method can have a strong impact on the results. The GRNN achieved the goal of the paper, forecasting traveling distance with good accuracy. Finally, it should be noted that the proposed solution can also be applied to a variety of similar real-world data.

Conclusions
Following the transformation of the energy sector and the integration of plug-in electric vehicles (PEVs) into energy systems, this work was conducted to alleviate some of the resulting challenges. In this paper, a novel method for PEV load forecasting for future energy management was presented. For this purpose, a machine-learning method called the generalized regression neural network (GRNN) was proposed. The network was trained on historical arrival/departure times of PEVs so that it could estimate the distance traveled by each EV. Additionally, Monte Carlo simulation was used to expand the data by fitting the historical data, i.e., arrival/departure times, to an appropriate distribution function. After training and testing, the network, having learned the travel pattern associated with each EV, was saved to predict the distance traveled by new EVs. The network was able to forecast new data with a correlation coefficient of R = 98.99% and errors of MSE = 1.3165, RMSE = 1.1474 and MAE = 0.8199.
This work can be extended by considering electrical-network constraints and investigating the integration of PEV aggregators to manage the PEVs' demand. In addition, smart charging of PEVs can be taken into account to model the PEVs in more detail.