The Application of Improved Random Forest Algorithm on the Prediction of Electric Vehicle Charging Load

Abstract: To cope with the increasing charging demand of electric vehicles (EVs), this paper presents a method for forecasting EV charging load based on the random forest (RF) algorithm and the load data of a single charging station. The method uses the classification and regression tree (CART) algorithm to produce a short-term forecast for the station. In addition, an algorithm is proposed for predicting the daily charging capacity of charging stations of different scales and locations. By combining the regression and classification algorithms, a large amount of historical charging data is learned effectively. The feature data are divided from different aspects to build the RF and to predict the fluctuating charging load effectively. The algorithm is put into practice by analyzing the data of charging stations in Shenzhen in terms of time and space. The form in which current data are applied in the algorithm is determined, and the accuracy of the prediction algorithm is verified to be reliable and practical. The predicted charging load can provide a reference for both power suppliers and users.


Introduction
To solve problems such as load balancing, capacity planning, and power quality caused by the large-scale integration of electric vehicles (EVs) [1], researchers have proposed many practical coordinated control schemes to guarantee the safety and reliability of the power system. For example, after analyzing the load demand in a certain area, EV chargers can be used to balance an unbalanced network without overloading the chargers [2]. It has been shown that the EV load can be turned into a tool that benefits the power system [3] by applying optimal charging schemes that arrange charging and discharging through approaches such as demand-side response [4,5]. However, these methods depend to a certain extent on load prediction.
Load forecasting is a large field of power system research. As a new form of load, the EV load has already begun to be explored. Early studies forecast the EV load mainly on the basis of the behavior of EV users or of specific areas. Using Markov chains and Monte Carlo simulation, the seasonal and holiday characteristics of EV users can be analyzed, but the load prediction still carries some uncertainty [6]. Other studies focus on the spatial and temporal distribution of the EV load. Models can be established to simulate the fluctuation of the EV load [7], so that the load can be forecast and used as a feasible load to reduce the pressure of

Random Forest Algorithm
Random forest is an ensemble learning method that combines multiple decision trees to eliminate the correlation between feature data. In addition, the computational complexity of RF is only O(n) (where n is the number of samples) when the amount of data is large, and because the trees are built independently, the algorithm can be run in parallel to speed it up.
RF reduces the correlation between decision trees by randomly selecting samples and features. First, the same amount of data is drawn at random from the original training data. Second, a subset of the features is randomly selected to build each decision tree. These two kinds of randomization keep the correlation between the decision trees small, which reduces the error that may arise when an individual decision tree over-fits and improves the accuracy of the model.

Gini Coefficient
During the generation of decision trees, the amount of information is measured as follows: the more a partition reduces the "uncertainty" of the data, the more information it obtains. Two indicators are commonly used to measure this uncertainty: information entropy and the Gini index.
For a random variable y that takes K possible values, the Gini coefficient is defined as:
$$\mathrm{Gini}(y) = \sum_{k=1}^{K} p_k (1 - p_k) = 1 - \sum_{k=1}^{K} p_k^2 \qquad (1)$$
where $p_k$ is the probability that y takes its kth value. It can be proved that Gini(y) reaches its maximum when Equation (2) is satisfied:
$$p_1 = p_2 = \cdots = p_K = \frac{1}{K} \qquad (2)$$
and that if $p_i = 1$ and $p_j = 0$ for all $j \neq i$, then Gini(y) = 0. This shows that the more irregular y (the variable being discussed) is, the larger Gini(y) is. Thus, the Gini coefficient can be used to measure uncertainty.
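The two boundary cases described above can be checked numerically. The following is a minimal sketch, assuming the class probabilities $p_k$ are estimated as relative frequencies of the labels; the function name `gini` and the toy labels are illustrative and not taken from the paper.

```python
from collections import Counter

def gini(labels):
    """Gini coefficient 1 - sum_k p_k^2, with p_k estimated as label frequencies."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

# One value has probability 1: no uncertainty.
print(gini(["a", "a", "a", "a"]))   # 0.0
# Uniform over K = 4 values: the maximum, 1 - 1/K.
print(gini(["a", "b", "c", "d"]))   # 0.75
```

For K equally likely values the result is 1 - 1/K, which grows toward 1 as K increases.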

Decision Tree
The algorithm used in this paper is the classification and regression tree (CART) algorithm, which uses the Gini gain in Equation (6) or least squares as the division criterion. CART is more sophisticated and can be used to solve both classification and regression problems [23].

Classification Decision Tree
The classification tree is generated as follows:
• Load data set D on a node;
• If all the samples in D belong to the category $c_k$, stop growing the node and mark it as $c_k$;
• If there is no optional feature, take the category with the largest number of samples in D as the category of the node;
• Otherwise, if feature $x^{(j)}$ takes $S_j$ different values $u^{(j)}_1, \cdots, u^{(j)}_{S_j}$ in the current data set, with $u^{(j)}_1 < \cdots < u^{(j)}_{S_j}$, then:
(i) If $x^{(j)}$ is discrete, $u^{(j)}_1, \cdots, u^{(j)}_{S_j}$ are selected as separation points $a_p$ successively, then: $A_{jp} = \{x^{(j)} = a_p,\ x^{(j)} \neq a_p\}$
(ii) If $x^{(j)}$ is continuous, the midpoints $\frac{u^{(j)}_1 + u^{(j)}_2}{2}, \cdots, \frac{u^{(j)}_{S_j - 1} + u^{(j)}_{S_j}}{2}$ are selected as separation points $a_p$ successively, then: $A_{jp} = \{x^{(j)} < a_p,\ x^{(j)} \geq a_p\}$
$A_{jp}$ is the resulting division of $x^{(j)}$. According to the Gini gain, the feature $x^{(j^*)}$ with the greatest information gain and the corresponding dichotomy are selected as the division criterion:
$$(j^*, p^*) = \arg\max_{j,p}\ g_{\mathrm{Gini}}\left(y, A_{jp}\right) \qquad (5)$$
• If the stop condition is satisfied, take the category with the largest number of samples in D at this time as the output category;
• Otherwise, according to all possible values of $x^{(j^*)}$ (which is $\{a_1, \ldots, a_m\}$), divide D into $\{D_1, \ldots, D_m\}$;
• Call the algorithm from the first step for each $D_j$.
By looping through the above seven steps, a decision tree that meets the specific goal is generated.
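The split selection step can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the Gini gain is taken to be the Gini coefficient of y minus the size-weighted Gini coefficient of the two parts, all features are treated as continuous, and the helper names `gini_gain` and `best_split` are hypothetical.

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_gain(x, y, threshold):
    """Assumed gain: Gini(y) minus the weighted Gini after splitting at x < threshold."""
    left = [yi for xi, yi in zip(x, y) if xi < threshold]
    right = [yi for xi, yi in zip(x, y) if xi >= threshold]
    n = len(y)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(y) - weighted

def best_split(columns, y):
    """Return (gain, feature index j*, separation point a_p*) with the largest gain."""
    best = (0.0, None, None)
    for j, x in enumerate(columns):
        values = sorted(set(x))
        # Candidate separation points: midpoints of adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            a = (lo + hi) / 2
            gain = gini_gain(x, y, a)
            if gain > best[0]:
                best = (gain, j, a)
    return best

# Toy data: feature 0 separates the two categories, feature 1 does not.
columns = [[1, 2, 3, 4], [5, 1, 5, 1]]
y = ["low", "low", "high", "high"]
print(best_split(columns, y))   # (0.5, 0, 2.5)
```

A full tree would apply `best_split` recursively to each resulting subset until a stop condition is met.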

Regression Decision Tree
The generation of a regression tree differs from that of a classification tree in the node partitioning criterion and in the selection of the output. The division criterion is the least squares method. For $x^{(j)}$, scan all its possible values and select a separation point $a_p$, so that the samples are divided by $x^{(j)}$ into two parts $R_1$ and $R_2$. Find the values $c_1$ and $c_2$ of the output y in the two parts, respectively, such that the following least-squares objective attains its minimum:
$$\min_{a_p}\left[\min_{c_1}\sum_{x_i \in R_1}\left(y_i - c_1\right)^2 + \min_{c_2}\sum_{x_i \in R_2}\left(y_i - c_2\right)^2\right]$$
Then this $a_p$ is the best separation point of $x^{(j)}$.
Similarly, the optimal partitioning feature $x^{(j^*)}$ and the corresponding separation point can be obtained by traversing j.
The output value is the average of the outputs in the corresponding range. Taking $R_1$ as an example, the output value is:
$$c_1 = \frac{1}{N_1}\sum_{x_i \in R_1} y_i$$
where $N_1$ is the number of samples in $R_1$.
After the decision tree is generated, inputting the feature values of a sample to be processed yields the corresponding output.
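The least-squares split for a single feature can be sketched as below. It assumes that candidate separation points are the midpoints between adjacent distinct feature values and that the optimal constants are the means of the two parts; the function names and the toy numbers are illustrative only.

```python
def sse(values):
    """Sum of squared deviations from the mean (the best constant output)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_regression_split(x, y):
    """Scan the separation points of one feature and minimise the two-part squared error."""
    values = sorted(set(x))
    best = None
    for lo, hi in zip(values, values[1:]):
        a = (lo + hi) / 2
        r1 = [yi for xi, yi in zip(x, y) if xi < a]
        r2 = [yi for xi, yi in zip(x, y) if xi >= a]
        loss = sse(r1) + sse(r2)
        if best is None or loss < best[0]:
            best = (loss, a, sum(r1) / len(r1), sum(r2) / len(r2))
    return best  # (loss, separation point, output c1, output c2)

x = [1, 2, 3, 10, 11, 12]
y = [5.0, 6.0, 7.0, 20.0, 21.0, 22.0]
loss, a, c1, c2 = best_regression_split(x, y)
print(a, c1, c2)   # 6.5 6.0 21.0
```

The function requires at least two distinct feature values; repeating the scan over every feature j gives the optimal partitioning feature.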

Design of Random Forest Algorithm Application
Intuitively, RF generates a decision tree for each random sample drawn from the original data set and combines the outputs of the many decision trees by voting or averaging to give the final output.
This combination of random sampling and aggregated output is called Bagging. The specific algorithm process is as follows:

• Using Bootstrap, randomly extract n training samples from the original data set;
• Perform k rounds of extraction to obtain k training sets;
• Train k decision tree models on the k training sets;
• For the problem of charging load prediction, use the average of the prediction results of the models as the final prediction result.
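The Bagging steps above can be sketched in a few lines. The base learner here is a hypothetical one-split regression stump standing in for a full CART tree, and the data, the seed, and the names `fit_stump`, `bagging_fit` and `bagging_predict` are assumptions made for illustration.

```python
import random

def fit_stump(rows):
    """Hypothetical base learner: one least-squares split on a single feature."""
    xs = sorted({x for x, _ in rows})
    mean_all = sum(y for _, y in rows) / len(rows)
    best = (float("inf"), None, mean_all, mean_all)
    for lo, hi in zip(xs, xs[1:]):
        a = (lo + hi) / 2
        left = [y for x, y in rows if x < a]
        right = [y for x, y in rows if x >= a]
        c1, c2 = sum(left) / len(left), sum(right) / len(right)
        loss = sum((y - c1) ** 2 for y in left) + sum((y - c2) ** 2 for y in right)
        if loss < best[0]:
            best = (loss, a, c1, c2)
    _, a, c1, c2 = best
    # If the sample had a single distinct x, fall back to the overall mean.
    return lambda x: c1 if a is None or x < a else c2

def bagging_fit(rows, k, seed=0):
    rng = random.Random(seed)
    models = []
    for _ in range(k):
        # Bootstrap: draw n samples with replacement from the original data.
        sample = [rng.choice(rows) for _ in rows]
        models.append(fit_stump(sample))
    return models

def bagging_predict(models, x):
    # Regression task: average the k individual predictions.
    return sum(m(x) for m in models) / len(models)

rows = [(1, 10.0), (2, 12.0), (3, 11.0), (8, 40.0), (9, 42.0), (10, 41.0)]
forest = bagging_fit(rows, k=25)
print(bagging_predict(forest, 2), bagging_predict(forest, 9))
```

Because each model is trained on its own bootstrap sample, the k training runs are independent and could be executed in parallel.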
RF can be intuitively understood through Figure 1. The daily charging amount of different charging stations is widely dispersed. Step division of the original data is therefore considered: according to the value range of the charging amount data, intervals are determined that cover the range. Small disturbances are then eliminated, and the effectiveness and accuracy of the RF prediction algorithm are improved. There are two main principles for dividing the intervals: (1) The amount of data in each interval is the same, which ensures that each step after the division occupies the same proportion of the historical data. This principle is suitable for the small charge portion in single-station prediction; (2) The length of each interval is the same. This method generates more intervals and therefore requires a large amount of data. This principle is suitable for the daily charging capacity prediction of station groups, whose data distribution is more uneven.
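The two interval division principles can be contrasted on a small skewed sample. This is a sketch under assumptions: the values in `daily_kwh` are made up, and the edge conventions (half-open intervals, last interval closed) are one reasonable choice, not something stated in the paper.

```python
def equal_width_edges(values, n_bins):
    """Principle (2): interval edges of identical length covering [min, max]."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins
    return [lo + i * step for i in range(n_bins + 1)]

def equal_frequency_edges(values, n_bins):
    """Principle (1): edges chosen so each interval holds about the same number of samples."""
    ordered = sorted(values)
    n = len(ordered)
    inner = [ordered[(i * n) // n_bins] for i in range(1, n_bins)]
    return [ordered[0]] + inner + [ordered[-1]]

def to_step(value, edges):
    """Index of the interval (step) containing value; the last interval is closed."""
    for i in range(len(edges) - 2):
        if value < edges[i + 1]:
            return i
    return len(edges) - 2

daily_kwh = [2, 3, 4, 5, 6, 8, 10, 15, 40, 100]   # made-up, skewed like charging data
print(equal_width_edges(daily_kwh, 2))       # [2.0, 51.0, 100.0]
print(equal_frequency_edges(daily_kwh, 2))   # [2, 8, 100]
```

With two equal-width intervals, nine of the ten samples fall into the first step, whereas the equal-frequency edges split them five and five.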
Energies 2018, 11, 3207 5 of 16
After bagging, the pre-processed data samples are divided into k data packets. For each data packet, a regression decision tree is constructed separately: starting from the root node, the tree is split by the CART algorithm with the aim of minimizing the Gini coefficient (the uncertainty), and splitting continues until the target or the maximum depth is reached.
The nodes that no longer split are called leaf nodes, and each leaf node is assigned an output value. This value is set differently from the classification decision tree algorithm: the output is the average of the values, before preprocessing, of the samples in that leaf node. Applying this division process to each data packet completes the learning process of the random forest model.
When making predictions, the features of the data to be predicted are input into the model. Each decision tree generates an independent prediction, and the random forest uses the average of the results of all the decision trees as the final prediction.
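The combination described above, splitting on the Gini coefficient of the stepped target while outputting the mean of the original values, can be illustrated with a single split. This is only a sketch: the equal-width stepping by floor division, the one-level depth, and the name `fit_hybrid_stump` are assumptions, and the numbers are invented.

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def fit_hybrid_stump(x, y, step_width):
    """One split chosen by Gini on stepped targets; leaves output the mean of raw targets.

    Requires at least two distinct values in x.
    """
    steps = [int(v // step_width) for v in y]   # stepped (discretised) target
    values = sorted(set(x))
    best = None
    for lo, hi in zip(values, values[1:]):
        a = (lo + hi) / 2
        left = [i for i, xi in enumerate(x) if xi < a]
        right = [i for i, xi in enumerate(x) if xi >= a]
        impurity = (len(left) * gini([steps[i] for i in left])
                    + len(right) * gini([steps[i] for i in right])) / len(x)
        if best is None or impurity < best[0]:
            best = (impurity, a, left, right)
    _, a, left, right = best
    mean = lambda idx: sum(y[i] for i in idx) / len(idx)
    # Leaf outputs use the values before stepping, not the step labels.
    return a, mean(left), mean(right)

x = [0, 1, 2, 3, 4, 5]
y = [11.0, 14.0, 18.0, 52.0, 55.0, 59.0]
print(fit_hybrid_stump(x, y, step_width=10))
```

Stepping makes 11, 14 and 18 kWh count as the same category during splitting, so small fluctuations do not influence where the split is placed, while the leaf outputs keep the resolution of the original data.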

Charging Load Prediction Algorithm of a Single Station
For the charging load prediction of a single charging station, the load is predicted with the RF regression tree of Section 2.2.2 to meet the actual forecasting demand, and the feature attributes of the corresponding model are designed accordingly. The specific input and output data are shown in Table 1. The feature attributes include the following categories:

• Date indicator (year, month and day): an accurate judgment of the influence of climatic conditions such as temperature and humidity on the behavior of EV users is difficult to make. Therefore, these attributes are directly integrated into the date indicator, and the impact of climate can be minimized with a large amount of data;
• 15-min quantity: represented by a numerical value, limited to a 15-min interval;
• Activity indicator: the importance of activities is expressed by numerical values, limited to a 15-min interval. Important activities may cause a surge in regional charging load;
• Prosperity index: the infrastructure index within the prosperity index, which fluctuates with the renovation of buildings and roads. It is an important indicator that affects the charging habits of EV users;
• Charging capacity: the amount of energy that has been delivered before the current time. The charging area and charging capacity of many EV users over a period are relatively fixed, so the accumulated daily charging load of the station should also be recorded. This volume affects the prediction of the remaining load for that day;
• Previous day's charge: like the charged amount, it strengthens the temporal character of the RF.
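As an illustration of how one input row might be assembled from these attributes, consider the sketch below. All field names, encodings and values are hypothetical; in particular, encoding the 15-min quantity as the index of the 15-min interval within the day (0 to 95) is an assumption, since the text does not specify it.

```python
from datetime import datetime

def build_features(ts, activity, prosperity, charged_so_far, previous_day_charge):
    """Assemble one input row; field names and encodings are illustrative only."""
    return {
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        # Assumed encoding: index of the 15-min interval within the day (0..95).
        "quarter_hour": ts.hour * 4 + ts.minute // 15,
        "activity": activity,
        "prosperity": prosperity,
        "charged_so_far": charged_so_far,
        "previous_day_charge": previous_day_charge,
    }

# Made-up values for a single 15-min slot.
row = build_features(datetime(2018, 6, 14, 9, 30), activity=0, prosperity=0.7,
                     charged_so_far=120.5, previous_day_charge=310.0)
print(row["quarter_hour"])   # 38
```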

Charging Load Prediction Algorithm of Station Group
Unlike normal loads, the EV charging load tends to have a group nature. Most current prediction algorithms rely on the historical charging data of a single charging station, and their accuracy can indeed meet the needs. Building on this, the charging load prediction algorithm for a charging station group covers many stations by using the stepped daily charging capacity. The separation criterion of the classification tree is combined with the output selection of the regression tree. The input characteristics are shown in Table 1. Compared with the single-station algorithm, the station group algorithm adds the following input features:
(1) Week symbol: indicates the position of the day in the week. The data contain the weekend information and can reveal the characteristic attributes of different dates;
(2) Capacity indicator: indicates the rated capacity of each charging station, obtained by summing the rated power of the charging piles at the station. The capacity indicator reflects to some extent the prosperity of the location of the charging station;
(3) Longitude: the longitude of the location of each charging station;
(4) Latitude: the latitude of the location of each charging station. Together, longitude and latitude uniquely determine each charging station and effectively quantify the regional characteristics of different charging stations.
By integrating the 12 input characteristics belonging to the charging station group in Table 1, the charging load prediction for a station group can be realized by RF. Since the station group algorithm takes into account the charging load variation of many charging stations of different sizes and regions, it can simulate the short-term load changes of the individual charging stations.
The flow chart of the entire prediction process is shown in Figure 2.
After the original data are processed, k sample sets are obtained by the bagging algorithm, and k decision trees are generated by the CART algorithm in Section 2.2 to form a random forest. The forest then takes the input features for the period to be predicted and outputs the predicted charging load.

Error Analysis
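For the results, the mean absolute percentage error (MAPE) and the root mean square error (RMSE) are used for evaluation. The error calculation formulas are shown in Equations (9) and (10), respectively.
$$\varepsilon_{\mathrm{MAPE}} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{P_N(i) - \hat{P}_N(i)}{P_N(i)}\right| \times 100\% \qquad (9)$$
$$\varepsilon_{\mathrm{RMSE}} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(P_N(i) - \hat{P}_N(i)\right)^2} \qquad (10)$$
where $P_N(i)$ and $\hat{P}_N(i)$ (i = 1, 2, 3, ..., n) are the actual and predicted values of the ith data point, respectively, and n is the length of the data used for verification. $\varepsilon_{\mathrm{MAPE}}$ is regarded as the main measure of error.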
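The two error measures can be computed as in the following sketch, which uses the standard definitions of MAPE and RMSE; the sample values are invented for illustration and the actual values are assumed to be non-zero.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    n = len(actual)
    return 100.0 / n * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))

def rmse(actual, predicted):
    """Root mean square error, in the same unit as the data."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
print(mape(actual, predicted), rmse(actual, predicted))
```

MAPE is scale-free, whereas RMSE carries the unit of the predicted quantity, so RMSE values for daily and 15-min outputs are not directly comparable.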

Analysis of Feature Importance
The importance of the input features is evaluated to verify the actual validity of the inputs. For each regression decision tree, the importance of a feature at a node is the change in the Gini coefficient before and after the node branches, and its definition can be expressed as Equation (11).
$$I_m = \mathrm{Gini}_m - \mathrm{Gini}_n - \mathrm{Gini}_p \qquad (11)$$
where n and p represent the two child nodes generated by node m. The feature importance in any decision tree i can be obtained by summing the node importances within that tree.

Case Analysis
In this section, the data of many charging stations in Shenzhen from 2016 to 2018 are analyzed, and charging load prediction for a single station and for a station group is carried out. The current situation and the effect of applying RF are analyzed.

Analysis of the Construction of Charging Facilities and Charging Data in Shenzhen
Shenzhen has jurisdiction over 10 districts, including Luohu District, Futian District and Longgang District. The area of each district and the distribution of charging stations are shown in Figure 3a. Charging stations are most densely distributed in Nanshan District, Futian District and Luohu District. Baoan District and Longgang District are the two districts with the largest number of charging stations. Nanshan, Futian, Longgang and Baoan are the most developed areas in Shenzhen, which shows that the distribution of charging stations is related to the economic strength of each district. Today, the total number of EVs in the city exceeds 80,000. According to the "2017 New Energy Vehicle Promotion and Application Financial Support Policy of Shenzhen", the government is now emphasizing the construction of EV supporting facilities. This also indicates that the analysis and regulation of this new EV load has entered the government's plan.
At present, about 6000 charging piles have been built in Shenzhen. The charging history of Shenzhen is shown in Figure 3b; the data for August are incomplete. The recent increase in charging capacity reflects the growing popularity of EVs. This trend also makes the government's charging policy and related research more important and urgent.
The data used in the simulation are the information on charging stations in Shenzhen and the charging capacity data of different periods over two years. To fully display the spatial and temporal distribution characteristics of the selected data, the data are analyzed in terms of temporal and spatial distribution.

Temporal Analysis of Charging Data
In terms of time, the charging data cover the period from the second half of 2016 to the first half of 2018, which is sufficient for the prediction algorithm. The distribution of the charging data in each month is shown as a violin diagram in Figure 4a.
The monthly charging data are drawn as violin plots according to the distribution of charging capacity. The monthly distributions are similar: each consists of a relatively concentrated group of smaller charging capacities (the wider part) in the lower part of the graph and a group of larger charging capacities (the slender part) in the upper part, so that each month's data form a needle shape. In the violin diagram, the area of each monthly figure is equal. In summer, the tip of the charge distribution is relatively thicker and the bottom narrower, which indicates that the data distribution is narrower and the average charge is higher, while the average load in autumn and winter is smaller.
From the daily point of view, the charging data can also be plotted as shown in Figure 4b, and the daily charge distribution is very similar in shape to Figure 4a. Since the day of the month by itself has no practical significance, it needs to be combined with the month and week symbols to express the meaning of time, so the weekend symbol must be appended to regularize the daily charging load changes.
In fact, Figure 4a shows only the charging data of scattered charging stations with a daily charging capacity of less than 100 kWh, because the rest of the data is distributed similarly. Using the small charging data makes the graph clearer. Figure 4b shows almost all the data and reveals the trend of the peak charging capacity; the specific trends need to be judged by the prediction algorithm.

Spatial Analysis of Charging Data
From the perspective of space, the latitude and longitude coordinates distinguish the geographic locations of the charging stations. The relationship between the distribution of charging data and the latitude and longitude coordinates is shown in Figure 5a. Since the distribution of the charging stations is discrete, Figure 5a is composed of a number of peaks on a plane. The two horizontal axes indicate the latitude and longitude of the stations, and the vertical axis and the color indicate the accumulated charging capacity of each station.
It can be seen from Figure 5a that the charging data are clearly divided into certain concentrated areas, mainly two red areas. In fact, the total charging capacity of each area over a period is relatively stable. For example, the maximum deviation of the charging capacity between two months in the dark red part of the figure is only about 20%. This is an important basis for the effectiveness of the charging station group prediction algorithm. Figure 5b shows the relationship between the data distribution and the capacity of the charging stations. The horizontal coordinates are the capacity of each station and the converted value of the charging capacity. The ordinate and the color indicate the data density of the converted value and the corresponding station. The data are generally concentrated near small charging capacities, especially for small-capacity stations, which have a large amount of charging data (red part). This is because EV charging still takes place mainly at small-capacity stations with small charging capacity. To facilitate direct observation, the data coordinates in the heat map have been quantified. Although most of the charging data are small (less than 500 kWh), large-capacity stations still have a large charging capacity (as shown on the right side of Figure 5b), and charging changes with capacity. The distribution of the data also varies significantly, so it is also necessary to use capacity as an input feature.

Prediction of Single Station
To verify the effectiveness of the prediction algorithm described in Section 3.1, the load data of a 524 kW charging station in Nanshan District, Shenzhen, were selected as a numerical example for simulation. The feature attributes of the training samples are the year, month, day, 15-min quantity, weekend symbol, holiday symbol, activity indicator, and charged amount in Table 1.
(1) Training Firstly, the daily charging capacity and the number of charging times are taken as the output, and the accuracy of RF is observed. 90% of the sample data is used as the training sample set, and the remaining 10% is used as the test sample set for the RF model. Select the sample characteristics of the year, month, day, weekend, holiday symbol, and activity indicator in Table 1. At the same time, specify 120 trees in the random forest. The depth of each tree is controlled within 80, and the average value is used as the output to obtain the load of the charging station. The test data is shown in Figure 6a. For the sake of brief observation, only some test data is shown in the figure.
The blue curve in the figure represents the prediction of RF, the green curve is the prediction of support vector regression (SVR) [24], and the orange curve is the actual value. SVR is selected as a reference to examine the actual effect of the RF prediction. RF achieves an εMAPE of 9.03% and an εRMSE of 457.21, compared with an εMAPE of 9.82% and an εRMSE of 417.23 for SVR.
Figure 6b shows the prediction of the number of charging times. The simulation shows that this prediction is also accurate: RF achieves an εMAPE of 9.67% and an εRMSE of 16.46, compared with an εMAPE of 11.37% and an εRMSE of 21.51 for SVR.
Next, the daily load data is replaced by the charging capacity in each 15-min interval: the 15-min sample feature in Table 1 is added, and the output is changed to the charging capacity every 15 min. The RF and SVR models are then trained on this data. The prediction for the test sample is shown in Figure 6c. RF achieves an εMAPE of 10.27% and an εRMSE of 5.02, compared with an εMAPE of 11.53% and an εRMSE of 7.82 for SVR.
From the perspective of the prediction in the training stage, the RF model achieves a mean absolute percentage error of about 10% for the charging prediction of a single charging station. The SVR model gives broadly similar results but is generally less accurate. Next, the actual effect of the algorithm is examined by simulating the prediction process.
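For reference, the two error measures quoted above can be computed as in the sketch below. It assumes the standard definitions of the mean absolute percentage error and the root mean square error; the function names are illustrative, and the percentage error is undefined for intervals whose actual value is zero, which would need separate handling.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

def rmse(actual, predicted):
    """Root mean square error, in the units of the predicted quantity."""
    squares = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squares) / len(squares))

# Toy values only, not data from the charging station.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
toy_mape = mape(actual, predicted)
toy_rmse = rmse(actual, predicted)
```

Because εRMSE carries the units of the output, its magnitude differs between the daily and the 15-min experiments, whereas εMAPE is comparable across them.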
(2) Prediction
After training, the model can be used for forecasting with the newly collected charging station load data of June 2018. For the charging data from 14 June to 26 June, the outputs are shown in Table 2. Since the characteristic data of the charged amount can only be acquired after the previous interval has elapsed, only the charging load of the next 15 min can be predicted in real time when the algorithm is used for charging prediction. As shown in Table 2, the prediction is very close to the actual value, with an εMAPE of 9.76% and an εRMSE of 2.27. The predicted charging capacity can of course be fed back as the charged amount to extend the prediction over a longer period, but the accumulated error will gradually become non-negligible.
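The idea of extending the horizon by feeding predictions back as the charged amount can be sketched as follows. The callable `predict_next` is a hypothetical stand-in for the trained model, and the toy model below is chosen only to make the accumulation of error visible; it is not derived from the station data.

```python
def recursive_forecast(predict_next, last_charged, steps):
    """Roll a one-step model forward by feeding each prediction back in.

    predict_next: hypothetical callable mapping (step index, previous charged
    amount) to the predicted charge of the next 15-min interval.
    """
    forecasts = []
    previous = last_charged
    for step in range(steps):
        previous = predict_next(step, previous)
        forecasts.append(previous)
    return forecasts

# Toy stand-in for a trained model: it overestimates by 5% at every step.
def toy_model(step, previous):
    return 1.05 * previous

horizon = recursive_forecast(toy_model, last_charged=20.0, steps=4)
```

If the true load stayed constant at 20, the deviation of this toy forecast would grow with every additional step, which is the accumulation effect noted above.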

Charging Load Prediction of Station Group
Having verified the charging load prediction for an ordinary single charging station, this section examines the actual effect of the charging load prediction for a charging station group.
All small-capacity (less than 150 kW) charging stations that have been working normally in Shenzhen for two years or more are included in the station group, and their historical charging data is used as a sample for simulation. The characteristic attributes of the training samples are the year, month, day, week symbol, capacity mark, longitude, latitude, activity indicator and previous day charge in Table 1.
(1) Training
From the original sample data set, 10% of the charging data is extracted as the test sample and the remaining 90% is the training sample, which is consistent with the training method for a single charging station.
To further improve the accuracy of the prediction algorithm and avoid over-fitting, the structure of the random forest is varied and the resulting prediction on the test sample is observed, thereby determining the best structure.
When the number of trees n and the tree depth m are both 40, the performance of the RF algorithm on the test sample is shown in Figure 7a; when n is 80 and m is 80, the performance is shown in Figure 7b; when n is 120 and m is 120, in Figure 7c; and when n is 140 and m is 140, in Figure 7d. Although it can be roughly observed that the prediction differs under different structures, it is difficult to determine the optimal forest structure by inspection. Therefore, n and m are traversed at intervals of 10, and the structures with the smallest values of εMAPE and εRMSE are extracted.

As the structure of the forest changes, the changes of εMAPE and εRMSE are shown in Figure 8a,b. As the depth and number of trees increase, both εMAPE and εRMSE show a fluctuating downward trend. To avoid over-fitting, the upper limit of n and m is set to 200; it is not necessary to raise the upper limit further, because the performance of the forest would not continue to improve significantly and the calculation time would be unnecessarily extended. In this range, the minimum εMAPE occurs when n is 140 and m is 160; the minimum εRMSE occurs when n is 120 and m is 180; and the minimum product of εMAPE and εRMSE occurs when n is 160 and m is 180. These three structures are used in the actual charging prediction.
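The traversal of n and m and the three selection criteria can be sketched as below. The callable `evaluate` is a hypothetical placeholder for training a forest with a given structure and scoring it on the test sample; the lower bound of 10 is an assumption, since the text gives only the step of 10 and the upper limit of 200. The toy error surface exists only to exercise the search and is not measured data.

```python
def select_structures(evaluate, low=10, high=200, step=10):
    """Traverse tree count n and depth m; return the best (n, m) per criterion.

    evaluate: hypothetical callable (n, m) -> (mape, rmse) that trains a
    forest with that structure and scores it on the test sample.
    """
    scores = {}
    for n in range(low, high + 1, step):
        for m in range(low, high + 1, step):
            scores[(n, m)] = evaluate(n, m)
    best_mape = min(scores, key=lambda k: scores[k][0])
    best_rmse = min(scores, key=lambda k: scores[k][1])
    best_product = min(scores, key=lambda k: scores[k][0] * scores[k][1])
    return best_mape, best_rmse, best_product

# Toy error surface, used only to exercise the search; not measured data.
def toy_evaluate(n, m):
    mape = 10.0 + abs(n - 140) / 100 + abs(m - 160) / 100
    rmse = 40.0 + abs(n - 120) / 10 + abs(m - 180) / 10
    return mape, rmse

structures = select_structures(toy_evaluate)
```

Keeping all three candidates, as the text does, avoids committing to a single error measure before the models are compared on new data.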
(2) Importance of input feature
According to the sample data and the forest structure with n of 140 and m of 160 from the previous part, once the forest is generated, the relative importance of all the characteristic attributes in the model can be obtained. The distribution is shown in Figure 9. The height of each column gives the importance averaged over the trees in the forest, and the black line segment above it represents the standard deviation. The features x1 to x9 represent the previous day's charge, activity indicator, week symbol, day, latitude, longitude, month, capacity indicator and year.
The most important input feature is the previous day's charge. Its importance reaches 0.3, where the importances of all features sum to 1. This is because the indicator gives the algorithm timing characteristics and contains most of the information about the charging history. The second most important factor is the activity indicator, which depicts the volatility of the charging load based on historical data, with an importance of around 0.15. Next come the time indicators and the latitude and longitude, each with an importance of around 0.1. The least important features are the capacity indicator and the year. Indeed, capacity does not determine the direction of the charging load at present, and the year does not have a significant impact on the behavior of EVs. However, these two indicators, especially the capacity indicator, still bring a certain degree of improvement to the algorithm and provide scalability as data accumulates in future years.
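The column heights and error bars of such an importance plot can be obtained as in the following sketch, which averages per-tree importances and computes their spread. The input layout, the use of the population standard deviation, and the toy numbers are assumptions for illustration; they are not the values behind Figure 9.

```python
from statistics import mean, pstdev

def summarize_importance(per_tree_importance):
    """Mean and standard deviation of each feature's importance across trees.

    per_tree_importance: one list per tree, each holding the normalized
    importance of every feature (each tree's list sums to 1).
    """
    by_feature = list(zip(*per_tree_importance))
    means = [mean(column) for column in by_feature]
    stds = [pstdev(column) for column in by_feature]
    return means, stds

# Toy values for three trees and three features; not the paper's data.
toy = [
    [0.5, 0.3, 0.2],
    [0.6, 0.2, 0.2],
    [0.4, 0.4, 0.2],
]
means, stds = summarize_importance(toy)
# Feature indices ordered from most to least important.
ranking = sorted(range(len(means)), key=lambda i: means[i], reverse=True)
```

Since every tree's importances sum to 1, the averaged importances also sum to 1, which is why a value of 0.3 for a single feature out of nine is notable.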

(3) Prediction
The prediction results obtained by the three forest structures are shown in Figure 10, where predicted value 1 is given by the forest with n of 140 and m of 160, predicted value 2 by the forest with n of 120 and m of 180, and predicted value 3 by the forest with n of 160 and m of 180. The red and purple curves are the outputs of the SVR model and the C4.5 algorithm (C4.5 uses ordinary information gain as the partition criterion, whereas CART uses the Gini index). Since the number of charging stations is relatively large, some typical prediction results are taken as an illustration.
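The difference between the two partition criteria mentioned above can be illustrated with a short sketch that scores one candidate split under both. It follows the description in the text and uses plain information gain; C4.5 is commonly described as using the gain ratio, a normalized form of it. The class labels and the split are toy values, and how the load is discretized into classes is not specified here.

```python
import math
from collections import Counter

def gini_impurity(labels):
    """Gini index of a set of class labels (criterion used by CART)."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits (basis of information gain)."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def split_gain(parent, left, right, impurity):
    """Impurity reduction achieved by splitting parent into left and right."""
    weight_left = len(left) / len(parent)
    weight_right = len(right) / len(parent)
    return impurity(parent) - weight_left * impurity(left) - weight_right * impurity(right)

# Toy split that separates the two classes perfectly.
parent = ["high", "high", "low", "low"]
left, right = ["high", "high"], ["low", "low"]
gini_gain = split_gain(parent, left, right, gini_impurity)
info_gain = split_gain(parent, left, right, entropy)
```

Both criteria reward the same perfect split here; they differ in how strongly they penalize partially mixed child nodes, which can lead to different trees on real data.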

The first 10 results are shown in Table 3. It can be inferred that the RF model of the station group is more accurate than SVR and C4.5 (a decision tree using information gain as the split criterion) [25], for it puts less emphasis on temporality and produces its output based on the different stations. SVR is not capable of station group prediction. As shown in Figure 10, as with the performance on the test sample, the three prediction curves of RF are highly accurate for single-day load prediction in the actual prediction environment.
Among them, the closest to the actual value is predicted value 2, with an εMAPE of 10.83% and an εRMSE of 39.59. Although the overall situation is good, it can be clearly seen from the figure that for some stations with small charging load the prediction may be inaccurate, owing to the large randomness of daily charging and because the data collected so far is not sufficient. The prediction can still be further improved as the charging prediction algorithm expands and new types of recorded data appear.

Conclusions
This paper proposes a method based on RF for EV charging load prediction and analysis, applies it to actual charging data and application scenarios in Shenzhen, and draws the following conclusions:
(1) The current EV industry in Shenzhen is still in a booming stage, and the charging load is dispersed in small amounts. From the analysis of a large amount of charging data, it can be observed that the charging load of EVs has both temporal and spatial distribution characteristics. In terms of time, the charging load is higher in summer than in winter, and the distribution differs on holidays. In terms of space, the distribution of the charging load resembles that of the charging station group. On this basis, the data features with the largest degree of discrimination are selected from the existing data to provide the basis for the application of random forest in charging prediction.
(2) The proposed charging prediction algorithm for a single station can effectively track the estimated charging capacity of the station every 15 min based on the actual recorded data. According to the simulation results, the prediction reaches an εMAPE of 9.76% and an εRMSE of 2.27. It can be used as a charging prediction method to provide a reference for various EV charging load control strategies.

Conflicts of Interest:
The authors declare no conflict of interest.