Abstract
This study set out to extract the charging characteristics of electric vehicles (EVs) from massive real operating data. First, an unsupervised learning method based on the self-organizing map (SOM) is developed to process the power supply side data of various charging operators. Second, a multi-dimensional evaluation index system is constructed for charging operation and vehicle-to-grid (V2G). Finally, using more than five million charging records collected over a period of two years, the charging load composition and characteristics under different charging station types, day types, and weather conditions are analyzed. The results show that bus, highway, and urban public charging loads differ in concentration and regulation flexibility; however, they all have the potential to synergize with the power grid and cooperate with renewable energy. In urban areas in particular, more than 37 GWh of photovoltaic (PV) power can be consumed by smart charging at the current penetration rate of EVs.
1. Introduction
The construction and operation of charging infrastructure are not only fundamental to the sustainable development of electric vehicles (EVs), but also the basis for vehicle-to-grid (V2G) [,]. The analysis of EV load characteristics is essential for the planning and operation of charging infrastructure and V2G, and it has therefore been studied widely in recent years.
The load characteristics of EVs are affected by various coupled factors such as energy consumption, user habits, and traffic conditions, which makes the models complex. Model-driven methods were employed first. A space-time distribution model of EV parking demand was established in [], providing a basis for simulating charging load curves with the Monte Carlo method. A travel chain model was proposed in [,] to analyze the temporal and spatial distribution of EV charging demand, considering the influence of different factors on driving power consumption. A charging model of EVs in residential areas was developed in [], combining a regional parking model with a state-of-charge distribution model. However, model-driven methods require many assumptions, so their results tend to be idealized. Data-driven methods are less restricted by hypotheses and have gradually received attention in the study of EV load characteristics. Historical traffic and weather data were analyzed in [] to obtain different traffic scenarios, and EV charging behaviors were classified using a decision tree algorithm. Travel trajectories of Di-Di cars were mined in [], and the temporal-spatial distribution of charging load for different date types and areas was effectively predicted with a single-EV charging model. A data-driven methodology was presented in [] to obtain the power requirement by observing the charging profiles of a fleet of EVs over one year. In recent years, load characteristic analysis considering power-traffic coupling [,] and bounded-rational users [,,] has become a research focus. The travel paths of EVs were simulated in [] to evaluate the impact of large-scale EVs on both traffic and power systems under the constraints of the urban road network. Based on cumulative prospect theory, the bounded rationality of user travel decisions was described in [], considering the dynamic characteristics of the transportation network. The influence of users’ bounded rationality on the load dispatching of charging stations was investigated in [].
In recent years, considerable progress has been made in the construction of Chinese charging networks, and a large amount of operating data has accumulated. To provide references for the further development of charging infrastructure and the design of intelligent charging systems, it is important to conduct a multi-dimensional analysis of the load characteristics of various EVs under different scenarios. However, research to date has tended to focus on specific scenarios and EV types. Very few studies have employed massive amounts of real charging data, which are crucial to both overall and detailed analysis of the charging load characteristics of large-scale EVs.
This study set out to explore a general method for analyzing charging load characteristics based on massive data and to provide suggestions for the construction and operation of charging infrastructure. For this purpose, this paper uses more than five million electrical energy supply records from various charging operators in a provincial administrative area, collected over a period of two years. An unsupervised learning method based on the self-organizing map (SOM) is adopted to cluster the power supply side data. A multi-dimensional evaluation index is constructed for charging operation and V2G. The charging load composition and characteristics under different charging station types, day types, and weather conditions are analyzed as well.
The remainder of this paper is organized as follows: Section 2 processes and expands the power supply side data. Section 3 introduces the SOM-based data clustering method, the multi-dimensional evaluation indicators of charging characteristics, and the comprehensive analysis scheme. Section 4 presents the analysis results under different scenarios. Section 5 summarizes the main conclusions.
2. Data Processing and Expansion
2.1. Basic Format and Data Cleaning
In this paper, a total of 1588 charging stations, including bus charging stations (BCS), highway charging stations (HCS), and urban public charging stations (UPCS), are selected as the research objects, with more than 5.8 million charging records collected over a period of two years. The basic format of the data is shown in Table 1. To protect the privacy of users and enterprises, the user and charging pile numbers and the specific location of each charging station are hidden.

Table 1.
Basic format of power supply side data.
Each record contains the charging station name, the charging start time, the charging end time, and the amount of energy delivered. Data errors may be generated when an EV is connected to a charging point and during upload, owing to unsuccessful charging, failure of measuring components, packet loss, and unknown network errors. Therefore, it is necessary to examine the original data and remove invalid and abnormal records to avoid interference and improve analysis accuracy. The data filtering rules are as follows:
- Delete records that satisfy .
- Let be the charging duration and records with too long charging duration be deleted ().
- The maximum charging output power is denoted as . Delete abnormal charging records ().
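The screening rules above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the record field names, the form of the first rule (taken here as a non-positive energy reading or a non-positive duration), and the thresholds `MAX_DURATION_H` and `MAX_POWER_KW` are hypothetical values chosen for the example.

```python
from datetime import datetime

# Hypothetical thresholds; the limits actually used in the study are not given here.
MAX_DURATION_H = 24.0   # assumed upper bound on charging duration (hours)
MAX_POWER_KW = 120.0    # assumed maximum charging output power (kW)

def is_valid(record):
    """Return True if a charging record passes all three screening rules."""
    start, end, energy = record["start"], record["end"], record["energy_kwh"]
    duration_h = (end - start).total_seconds() / 3600.0
    # Rule 1 (assumed form): no energy delivered or non-positive duration.
    if energy <= 0 or duration_h <= 0:
        return False
    # Rule 2: charging duration is too long.
    if duration_h > MAX_DURATION_H:
        return False
    # Rule 3: average power exceeds the maximum charging output power.
    if energy / duration_h > MAX_POWER_KW:
        return False
    return True

records = [
    {"name": "A bus station", "start": datetime(2019, 1, 1, 8, 0),
     "end": datetime(2019, 1, 1, 9, 30), "energy_kwh": 45.0},   # plausible record
    {"name": "B", "start": datetime(2019, 1, 1, 8, 0),
     "end": datetime(2019, 1, 1, 8, 0), "energy_kwh": 0.0},     # unsuccessful charge
    {"name": "C", "start": datetime(2019, 1, 1, 8, 0),
     "end": datetime(2019, 1, 1, 8, 6), "energy_kwh": 50.0},    # implies 500 kW
]
clean = [r for r in records if is_valid(r)]
```

In this toy input only the first record survives; the other two are rejected by the first and third rules, respectively.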
After screening, a total of 5.1 × 10^6 valid records were obtained, including 1,342,372 records from 2018 and 3,831,368 records from 2019. As the numbers show, the charging business has grown rapidly.
2.2. Data Expansion and Scene Classification
In order to perform a refined analysis of the load characteristics in different scenarios, the expansion based on the original data information mainly includes the following four aspects:
- Service type. The service object of each record can be determined from the text of the charging station name. If the name contains ‘bus’, the record is classified as a bus charging service; if it contains ‘service area’, it is classified as a highway charging service; if it contains neither keyword, it is classified as a general urban public charging service.
- Weather condition. By combining the charging start date (‘year’, ‘month’, and ‘day’) with the charging station location information in the record, such as “××province, ××city, ××district”, the weather at the time each record was generated can be obtained from web information. The weather conditions are divided into three categories: ‘sunny’, ‘cloudy’, and ‘rain or snow’.
- Day type. According to the charging start date and the holiday/weekend information from 2018 to 2019, the day type at the start time of each record can be obtained, that is, working day or non-working day.
- Temporal data conversion. With known , and can be calculated by using the time function in MATLAB. In addition, the Hour, Minute, and Second properties were extracted from , and they were converted into floating point numbers that are easy to statistically process.
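The expansion steps above can be illustrated with a short Python sketch (the paper itself uses MATLAB time functions). The function names, the English keywords, and the simplified holiday handling are assumptions made for this example; real station names and the official holiday calendar, including make-up working weekends, would need their own treatment.

```python
from datetime import datetime

def service_type(station_name):
    """Classify a record by keywords in the station name (assumed English text)."""
    name = station_name.lower()
    if "bus" in name:
        return "BCS"
    if "service area" in name:
        return "HCS"
    return "UPCS"

def time_to_float_hours(t):
    """Convert the hour, minute, and second of a timestamp into a decimal hour."""
    return t.hour + t.minute / 60.0 + t.second / 3600.0

def day_type(t, holidays=frozenset()):
    """Return 'non-working' on weekends and listed holidays, else 'working'."""
    if t.weekday() >= 5 or t.date() in holidays:
        return "non-working"
    return "working"

start = datetime(2019, 5, 4, 21, 57, 0)          # a Saturday evening
label = service_type("City Bus Depot No. 3")      # hypothetical station name
start_hour = time_to_float_hours(start)
kind = day_type(start)
```

Plain substring matching is fragile (an English name such as "Business Park" would also match "bus"), so a production rule would likely need a curated keyword list.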
After expanding the original data, 18 charging scenarios can be distinguished according to three groups of status indicators: service type, weather condition, and day type, as shown in Figure 1. Based on the SOM, the massive charging records are clustered using three characteristic quantities: charging start time, charging duration, and charging capacity. Before clustering, they are standardized according to the maximum and minimum values of each characteristic. On this basis, an in-depth analysis of the charging load structure and characteristics in each scenario is carried out.

Figure 1.
The division of different charging scenarios.
3. Clustering and Characteristic Index Calculation
3.1. SOM Clustering Algorithm
Since it is very difficult to determine the initial categories by inspecting the input variables (charging start time, charging duration, and charging capacity) given the large amount of data, an unsupervised method has to be employed. The SOM is one of the most successful methods of unsupervised clustering; it can cope with non-linear correlations and map n-dimensional input data onto a two-dimensional plane while maintaining the original topological relationships []. Since its introduction, it has achieved significant results in customer classification [] and power load curve analysis []. An SOM consists of an input layer and an output layer. The clustering results are derived from parallel distance computations between the input vector and a number of neurons. The weight vectors are adjusted independently to find the inherent characteristics of each input. The number of neurons in the input layer, n, should be smaller than the number in the output layer, m. Therefore, this study adopted an SOM to cluster charging records with unsupervised learning. The basic SOM training procedure is as follows:
- Step 1: Initialize the neighborhood and learning rate functions and set the stop conditions. The area surrounding the winner neuron, which is determined in the next step, is called the neighborhood, and the neurons within it are activated to varying degrees. The neighborhood is a function of the number of iterations and shrinks as the iterations increase. The learning rate determines the magnitude of the weight correction; to keep the training stable, it also decreases as the number of iterations increases. The training ends when the maximum number of iterations is reached.
- Step 2: Calculate the Euclidean distances between the p-th input sample and the weight vector of each output neuron. The output neuron with the smallest distance is selected as the winner neuron.
- Step 3: According to the neighborhood and learning rate functions, update the weights of the neurons in the neighborhood by moving them toward the input sample.
- Step 4: Determine whether all samples have been input: if so, set k to zero and proceed to Step 5; if not, update the neighborhood and learning rate, and return to Step 2.
- Step 5: Determine whether the iterations have completed: if so, output the training results; if not, return to Step 2.
Upon the completion of training, the clustering stage can begin. The weights remain the same after training. For each input (i.e., charging record), the SOM will automatically find the similar output neuron and assign the record to the cluster corresponding to that neuron.
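The training and clustering stages described above can be sketched in plain Python. This is a minimal illustration of a standard SOM, not the configuration used in the study: the 3 × 3 output grid, the linear decay schedules, the Gaussian neighborhood, and all parameter values are assumptions, and the six sample records are invented. The weight update follows the textbook rule in which each weight moves toward the input by the learning rate times the neighborhood factor.

```python
import math
import random

def min_max_scale(rows):
    """Scale each feature column to [0, 1] using its minimum and maximum."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - a) / (b - a) if b > a else 0.0
             for v, a, b in zip(row, lo, hi)] for row in rows]

def winner(weights, x):
    """Index of the neuron whose weight vector is closest (Euclidean) to x."""
    return min(range(len(weights)), key=lambda j: math.dist(weights[j], x))

def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, radius0=1.5, seed=0):
    """Train a small SOM and return one weight vector per output neuron."""
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in range(dim)] for _ in grid]
    for k in range(epochs):
        # Learning rate and neighborhood radius both shrink as training proceeds.
        frac = 1.0 - k / epochs
        lr = lr0 * frac
        radius = max(radius0 * frac, 0.5)
        for x in rng.sample(data, len(data)):      # present samples in random order
            win = winner(weights, x)
            for j, pos in enumerate(grid):
                d = math.dist(pos, grid[win])
                # Gaussian neighborhood: neurons near the winner move more.
                h = math.exp(-(d * d) / (2.0 * radius * radius))
                weights[j] = [w + lr * h * (xi - w)
                              for w, xi in zip(weights[j], x)]
    return weights

# Invented records: (start hour, duration in h, energy in kWh).
records = [(8.3, 1.2, 22.0), (8.5, 1.1, 21.0), (8.1, 1.3, 23.0),
           (22.0, 5.0, 140.0), (21.5, 5.5, 145.0), (22.4, 4.8, 138.0)]
scaled = min_max_scale(records)
som = train_som(scaled)
labels = [winner(som, x) for x in scaled]   # clustering stage: weights stay fixed
```

Each label is the index of an output neuron, so the short morning sessions and the long night sessions in this toy data end up assigned to different neurons.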
3.2. Characteristic Index Calculation Method
Compared with traditional load, EV charging load can be utilized as an optimization resource while considering its impact on the operation of charging stations and power grid. This paper adopts peak load ratio, daily load duration ratio, adjustment flexibility, valley filling potential, and synergy with renewable energy as the characteristic indicators of different types of charging load. The calculation methods are as follows:
- peak load ratio and daily load duration ratio .
According to , and , a total of 96 time periods in 24 h are taken at equal intervals. Each charging record is converted into a discrete power value in each time interval, as shown in Figure 2:

Figure 2.
Discrete processing of charging records.
Where is the load value generated during the interval t using constant charging power, . When is 1, the EV is in the charging connection state and can be charged. The total load curve of the m-th type of charging record . and are:
The equations use an infinitesimal quantity and the sign function sgn. The peak load is considered to be reached when the load is higher than 90% of its maximum value, and the load type is considered present when the load is not 0. The peak load ratio is the ratio of the peak load duration to the total duration of this load type. The daily load duration ratio is the share of one day during which this load type is present. The larger the peak load ratio, the more concentrated this load type is. The larger the daily load duration ratio, the more evenly this load type is distributed over one day.
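As a rough illustration of the discretization and the two ratios, the following Python sketch spreads each record over 96 fifteen-minute intervals at constant power and then counts intervals. It is written under assumptions: the record layout, the function names, and the rounding of start times and durations to whole intervals are simplifications introduced here, and the two ratios are computed as interval counts (peak intervals over active intervals, and active intervals over 96).

```python
INTERVALS = 96                 # 15-minute intervals in 24 h
STEP_H = 24.0 / INTERVALS

def record_to_load(start_h, duration_h, energy_kwh):
    """Spread one record over the intervals it covers at constant power (kW)."""
    load = [0.0] * INTERVALS
    power = energy_kwh / duration_h
    first = int(start_h / STEP_H)
    count = max(1, round(duration_h / STEP_H))
    for i in range(first, first + count):
        load[i % INTERVALS] += power       # wrap around midnight
    return load

def total_load(records):
    """Sum the discretized load of all records of one load type."""
    total = [0.0] * INTERVALS
    for rec in records:
        for i, p in enumerate(record_to_load(*rec)):
            total[i] += p
    return total

def peak_and_duration_ratio(load, peak_threshold=0.9):
    """Return (peak load ratio, daily load duration ratio) for a 96-point curve."""
    active = sum(1 for p in load if p > 0)
    if active == 0:
        return 0.0, 0.0
    peak = max(load)
    at_peak = sum(1 for p in load if p > peak_threshold * peak)
    return at_peak / active, active / INTERVALS

# Two invented records: (start hour, duration in h, energy in kWh).
curve = total_load([(8.0, 2.0, 40.0), (9.0, 1.0, 30.0)])
peak_ratio, duration_ratio = peak_and_duration_ratio(curve)
```

Here the load is 20 kW from 8:00 to 9:00 and 50 kW from 9:00 to 10:00, so half of the active intervals are at peak and the load is present for 8 of the 96 intervals.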
- adjustment flexibility
reflects the time and power adjustment capability of this type of load during the charging duration:
where and are the average charging capacity and charging duration of the m-th charging load, respectively.
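One way to read this indicator, consistent with the later remark that a flexibility of 4 means only 1/4 of the connection time is needed at rated power, is as the ratio of the average connection duration to the minimum charging time at rated power. The Python sketch below encodes that reading; the function name, the rated power of 60 kW, and the sample averages are assumptions made for illustration rather than values from the study.

```python
def adjustment_flexibility(avg_energy_kwh, avg_duration_h, rated_power_kw):
    """Ratio of connection time to the minimum time needed at rated power."""
    min_time_h = avg_energy_kwh / rated_power_kw
    return avg_duration_h / min_time_h

# Assumed example: 22 kWh delivered over a 1.5 h connection on a 60 kW charger.
flex = adjustment_flexibility(avg_energy_kwh=22.0, avg_duration_h=1.5,
                              rated_power_kw=60.0)
```

A value above 1 means the session could be shortened or shifted within the connection window without leaving the demand unmet.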
- valley filling potential
It is generally considered that 23:00 at night to 7:00 the next day is the valley period of the power grid. is recorded as the charging capacity in the valley period of the m-th charging load type:
where represents the total charging capacity of the m-th charging load type. is the result of optimization in which the charging state, power and time constraints of each EV need to be met:
where is the charging power in the interval t after orderly adjustment. is 0/1 variable and the EV is charging when it is equal to 1.
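The following Python sketch gives a simplified estimate of the valley filling potential. It is not the optimization used in the study: it assumes that each EV can be treated independently (no station-level power limit), that charging may be rescheduled freely within the recorded connection window, and that a single rated power applies to all records. Under those assumptions, the most energy one EV can take in the valley period is the smaller of its demand and the rated power times the overlap between its connection window and 23:00 to 7:00. All names and sample values are hypothetical.

```python
VALLEY_START_H = 23.0    # valley period starts at 23:00
VALLEY_LENGTH_H = 8.0    # and lasts until 07:00 the next day

def valley_overlap_h(start_h, duration_h):
    """Hours of the window [start, start + duration) that fall in the valley."""
    end_h = start_h + duration_h
    overlap = 0.0
    # Valley windows on an unwrapped time axis: [-1, 7), [23, 31), [47, 55).
    for v_start in (VALLEY_START_H - 24.0, VALLEY_START_H, VALLEY_START_H + 24.0):
        v_end = v_start + VALLEY_LENGTH_H
        overlap += max(0.0, min(end_h, v_end) - max(start_h, v_start))
    return overlap

def valley_filling_potential(records, rated_power_kw):
    """Share of total energy that could be delivered during the valley period."""
    total = sum(e for _, _, e in records)
    shiftable = sum(min(e, rated_power_kw * valley_overlap_h(s, d))
                    for s, d, e in records)
    return shiftable / total

# Invented records: (start hour, connection duration in h, energy in kWh).
records = [(22.0, 3.0, 40.0), (2.0, 2.5, 30.0), (14.0, 1.0, 20.0)]
potential = valley_filling_potential(records, rated_power_kw=60.0)
```

In this example the first two sessions can be served entirely in the valley period and the afternoon session cannot, giving a potential of 70/90.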
- synergy with renewable energy
The Kendall rank correlation coefficient can effectively measure the correlation between variables []. It is proposed to use the Kendall rank correlation coefficient to measure the synergy between the charging load and the output of renewable energy:
The coefficient is computed from the values of the typical wind or photovoltaic output curve and of the load curve at each time interval. When it is calculated from the unadjusted total load, that is, the sum of each EV’s charging power over time, it reflects the synergy between the EV load and renewable energy output under natural conditions. The coefficient lies between -1 and 1; the closer its value is to 1, the stronger the positive correlation between the two. Under the charging constraints of each EV, with the goal of minimizing the variance between the total load curve and the renewable energy output, the charging power of each EV is adjusted according to the following objective function:
The renewable energy output used in the objective is obtained by scaling the output curve to the maximum value of the charging load under natural conditions. The coefficient calculated from the optimized load reflects the optimal synergy.
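For readers who want to reproduce the correlation measure, the Kendall rank correlation can be computed directly from two curves sampled at the same time intervals. The sketch below implements the tau-b variant, which corrects for ties; whether the study used tau-a or tau-b is not stated, so this choice is an assumption, and the two eight-point curves are invented for illustration.

```python
import math

def kendall_tau_b(x, y):
    """Kendall rank correlation (tau-b, which corrects for ties)."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue              # tied in both curves: excluded everywhere
            if dx == 0:
                ties_x += 1           # tied only in x
            elif dy == 0:
                ties_y += 1           # tied only in y
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = math.sqrt((concordant + discordant + ties_x) *
                      (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else 0.0

# Invented coarse curves: PV output and a load that rises and falls with it.
pv_output = [0, 0, 2, 5, 8, 5, 2, 0]
midday_load = [1, 1, 3, 6, 9, 6, 3, 1]
tau_midday = kendall_tau_b(midday_load, pv_output)
```

Because the coefficient depends only on rank order, the load and the renewable output do not need to share units or scale.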
3.3. Load Characteristic Analysis Process
The load characteristic analysis process is shown in Figure 3. The SOM is trained by a random part of the charging records and is used to cluster all the records, then the proportion of each type of charging record in different scenarios (service type, weather condition and day type) is analyzed.

Figure 3.
Load characteristic analysis flow.
The five characteristic indicators are calculated for each type of charging record according to the method described in Section 3.2. Because self-organizing clustering makes the data within each group highly similar, 2000 groups of data are randomly selected to represent each type in order to reduce the difficulty of the optimization calculation, and they are used as input for the corresponding indicator calculations.
4. Results of Multi-Dimensional Analysis of EV Load Characteristics
We randomly extracted 20,000 charging records from 2018 and from 2019 to train a separate SOM for each year. The results are shown in Figure 4. The spatial locations and topological connections of the output-layer neurons are basically the same in both training results, indicating that although the charging business grew significantly from 2018 to 2019, the data structure did not change, which means the internal characteristics remained consistent. Therefore, this paper selected the 2019 charging records, which contain a larger amount of valid data, for further research on the load characteristics.

Figure 4.
SOM output layer topology.
4.1. Cluster Analysis
More than 3.8 million charging records in 2019 are clustered using the trained SOM and are divided into 9 load types. The results are displayed in Figure 5 and Figure 6.

Figure 5.
Proportion of each charging load type.

Figure 6.
Box plot of each type of load.
Load types 4 and 6–9 account for nearly 80% of all charging records. Their distributions of charging capacity and duration, whose average values lie between 21 and 24 kWh and between 1.1 and 1.7 h, do not differ noticeably. However, their distributions of charging start time differ greatly and are concentrated around 8:21, 21:57, 11:33, 14:57, and 18:23, respectively. Load types 2, 3, and 5 have larger charging capacities; in particular, load type 3 has an average charging capacity of 141.46 kWh. Load type 1 is concentrated around 2:00 in the morning, with an average charging duration of more than 2 h and no particularly high charging capacity.
4.2. Characteristics of Each Load Type
4.2.1. Peak Load Ratio and Daily Load Duration Ratio
The calculation results of and are shown in Figure 7. Load types 4, 6–9 of which and are distributed around 20% have relatively similar characteristics. Load types 2, 3, and 5 have a relatively large duration while peak loads account for relatively small percentage. Load types 1 and 4, 6 to 9 have relatively similar small , which means the load concentration is weaker.

Figure 7.
Box plot of each type of load.
4.2.2. Adjustment Flexibility and Valley Filling Potential
The calculation results of adjustment flexibility and valley filling potential are shown in Figure 8. Except for load types 2, 3, and 5, all load types have good adjustment flexibility, with values above 2. The adjustment flexibilities of load types 1, 6, and 9 exceed 4; that is, when charging at rated power, only 1/4 of the actual charging connection time is needed to meet the charging demand. High adjustment flexibility leaves a large adjustment space for charging time and power. Since the charging periods of load types 1 and 6, whose adjustment flexibility is large, overlap most with the grid valley period, their valley filling potentials are the largest, and 100% of the charging capacity of load type 1 can be delivered during the valley period.

Figure 8.
Adjustment flexibility and valley filling potential of different types of loads.
4.2.3. Synergy with Renewable Energy
The correlation coefficient between each load type and the wind power (WP)/PV output is calculated before and after adjustment. The results are shown in Figure 9. Before adjustment, load type 2 has good natural coordination with PV, with a correlation coefficient of 0.83, and load type 1 has good natural coordination with WP, with a correlation coefficient of 0.60. After adjustment, the synergy of every load type with PV/WP increases to some extent. The displacements of load types 1 and 4 along the horizontal axis through adjustment are significant, indicating a greater potential for improvement in synergy with PV. However, load types 3, 5, 6, 8, and 9 have significant displacements on the horizontal axis, indicating that they have greater potential for improvement in synergy with WP.

Figure 9.
Synergy between different types of loads and new energy output.
4.3. Comparison of Different Scenarios
4.3.1. Load Composition in Different Scenarios
The load compositions in different scenarios are shown in Figure 10. The load compositions of different charging service types are obviously different, while the load compositions of the same charging station type under different weather conditions and day types are basically similar and can pass the consistency test.

Figure 10.
Proportion of each type of load in different scenarios.
The similarity of load compositions of the same charging service type, under different weather conditions and day types shows the consistency of its inherent characteristics. The daily average number of charging records in different scenarios is shown in Table 2. From the difference in the number of records in different scenarios, the following conclusions can be drawn:

Table 2.
Daily average number of charging records under different scenarios.
- The number of charging events at HCS is much smaller than at BCS and UPCS, which indicates that EVs are mainly used within cities.
- BCS and UPCS generally serve more charging on working days than on non-working days, while HCS serve more charging on non-working days. The reason is that rigid travel within the city is significantly reduced on non-working days, whereas the proportion of intercity travel increases, resulting in a corresponding load increase.
- On working days, all types of charging stations have more charging records in rainy and snowy weather. On the one hand, bad weather increases the average power consumption of EVs; on the other hand, it increases users’ dependence on cars for rigid travel demand. On non-working days, bad weather reduces users’ flexible travel needs.
4.3.2. Comparison of Characteristics of BCS, HCS and UPCS
The proportions of different charging service types in each load type are shown in Figure 11. Based on the comparison with Figure 10 and the analysis of the characteristics of different load types in Section 4.1 and Section 4.2, the following conclusions can be drawn:

Figure 11.
The proportion of different charging station types in different load types.
- Load types 2, 3, and 5, which have relatively stable distributions and large average charging capacities, are mainly composed of bus charging loads. Together they account for 25.54% of the charging records and 58.69% of the charging capacity of the bus load. Load type 2 has better natural synergy with PV, and load types 3 and 5 have better regulation and synergy characteristics with WP. Load type 6 is also mainly composed of bus charging load; it accounts for 18.75% of the bus charging records and 10.42% of the bus charging capacity.
- The load composition types of HCS and UPCS are similar. The main difference is that the HCS charging loads are more concentrated on load types 7 and 8 with shorter duration and less adjustment flexibility.
4.3.3. V2G Capabilities of Different Types of Charging Stations
Through V2G, the potential of large-scale EVs in cutting peaks, filling valleys, and coordinating new energy consumption can be brought into play. According to the load structure and characteristics of different types of charging stations, the V2G capacity under current conditions can be calculated.
As shown in Figure 12, the V2G capacities of BCS and UPCS are much higher than that of HCS; BCS and UPCS can provide about 4.51 GWh/year and 7.56 GWh/year of valley filling capacity, respectively. Before the smart charging adjustment, BCS and UPCS can consume about 19.64 GWh and 17.92 GWh of PV, respectively, which can be increased by about 25% after adjustment. In contrast, smart charging has more room for promoting WP consumption, which can reach more than 7 times the original consumption after adjustment.

Figure 12.
V2G capability of different types of charging stations.
5. Conclusions
An unsupervised learning algorithm was adopted in this paper to analyze more than 5 million charging records. Multi-dimensional evaluation indicators for the charging load characteristics were proposed. The main conclusions are as follows:
- The load structure and characteristics of the same charging service type (BCS, HCS, UPCS) under different weather conditions and day types are stable and consistent, which benefits infrastructure planning and smart charging scheduling based on charging behavior analysis and load forecasting.
- The charging loads are closely coupled with users’ travel behavior. The impact of weather conditions on the amount of charging load depends on the necessity of travel: bad weather increases the charging load on working days and decreases it on non-working days.
- BCS has relatively stable total load curves and a large amount of charging loads concentrated at night, resulting in good valley filling capacity and the ability to absorb WP at night.
- The peaks of charging load in BCS and HCS are relatively concentrated, and the volatility of the load curve is large, which brings challenges to the economic and safe operation of charging stations. Compared with HCS, UPCS has longer load durations and greater adjustment flexibility.
- In different scenarios, EVs have the potential to synergize with the power grid and renewable energy sources, especially in cities. Smart charging adjustment can greatly increase the consumption of PV and WP by EVs.
Author Contributions
Conceptualization, Z.Z. and Z.C.; methodology, Z.Z. and Z.J.; software, Z.Z.; investigation, H.D. and J.T.; resources, X.H.; writing—original draft preparation, Z.Z.; writing—review and editing, S.G.; supervision, Z.C. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by State Grid Corporation Science and Technology Project (Research on Urban Power Grid Dispatching Technology for Large-scale Electric Vehicle, 5108-202118041A-0-00), Jiangsu Natural Science Youth Fund Project (BK20190710) and Jiangsu Provincial Colleges and Universities Natural Science Research General Project (19KJD470004).
Institutional Review Board Statement
Ethical review and approval were waived for this study, since it did not involve humans or animals.
Informed Consent Statement
The study did not involve humans.
Data Availability Statement
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to commercial confidentiality.
Conflicts of Interest
Xueliang Huang has received research grants from State Grid Corporation. Hongen Ding and Jiang Tian are employees of State Grid Corporation. The paper reflects the views of the scientists, and not the company.
References
- Zhang, H.; Hu, Z.; Song, Y.; Xu, Z.; Jia, L. A prediction method for electric vehicle charging load considering spatial and temporal distribution. Autom. Electr. Power Syst. 2014, 38, 13–20.
- Lopes, J.A.P.; Soares, F.J.; Almeida, P.M.R. Integration of Electric Vehicles in the Electric Power System. Proc. IEEE 2011, 99, 168–183.
- Tao, S.; Liao, K.; Xiao, X.; Wen, J.; Yang, Y.; Zhang, J. Charging demand for electric vehicle based on stochastic analysis of trip chain. IET Gener. Transm. Distrib. 2016, 10, 2689–2698.
- Chen, L.; Nie, Y.; Zhong, Q. A model for electric vehicle charging load forecasting based on trip chains. Trans. China Electrotech. Soc. 2015, 30, 216–225.
- Wen, J.; Tao, S.; Xiao, X.; Luo, C. Analysis on charging demand of EV based on stochastic simulation of trip chain. Power Syst. Technol. 2015, 39, 1477–1484.
- Guo, C.; Liu, D.; Zhu, C.; Wang, X.; Cao, X. Modeling and analysis of electric vehicle charging load in residential area. Electr. Power Autom. Equip. 2020, 40, 1–9.
- Arias, M.B.; Bae, S. Electric vehicle charging demand forecasting model based on big data technologies. Appl. Energy 2016, 183, 327–339.
- Xing, Q.; Chen, Z.; Huang, X.; Zhang, Z.; Xu, X.; Zhang, T.; Huang, X.; Wang, H. Electric vehicle charging demand forecasting model based on data-driven approach. Proc. CSEE 2020, 40, 3796–3813.
- Gajani, G.S.; Gruosso, G. Data-driven approach to model electrical vehicle charging profile for simulation of grid integration scenarios. IET Electr. Syst. Transp. 2019, 9, 168–175.
- Shao, Y.; Mu, Y.; Yu, X.; Dong, X.; Jia, H.; Wu, J.; Zeng, Y. A spatial-temporal charging load forecast and impact analysis method for distribution network using EVs-traffic-distribution model. Proc. CSEE 2017, 37, 5207–5219.
- Yang, J.; Wu, F.; Yan, J.; Lin, Y.; Zhan, X.; Chen, L.; Liao, S.; Xu, J.; Sun, Y. Charging demand analysis framework for electric vehicles considering the bounded rationality behavior of users. Int. J. Electr. Power Energy Syst. 2020, 119, 1–16.
- Yang, J.; Wu, F.; Yan, J.; Lin, Y.; Zhan, X.; Chen, L.; Liao, S.; Xu, J.; Sun, Y. Research on spatiotemporal behavior of electric vehicles considering the users’ bounded rationality. Trans. China Electrotech. Soc. 2020, 35, 1563–1574.
- Zhang, Z.; Chen, Z.; Xing, Q.; Ji, Z.; Huang, X. Comprehensive Optimal Scheduling Strategy of Multi-Element Charging Station for Bounded Rational Users. IEEE Access 2021, 9, 9442–9452.
- Kohonen, T. Self-organized formation of topologically correct feature maps. Biol. Cybern. 1982, 43, 59–69.
- Chen, B.C.; Liang, B.; Zhou, Y.B.; Lin, X.Q.; Zhao, Y. An application of SOM neural network in customer classification. Syst. Eng. Theory Pract. 2004, 3, 8–14.
- Li, Z.; Wu, J.; Wu, W.; Song, B. Power customers load profile clustering using the SOM neural network. Autom. Electr. Power Syst. 2008, 15, 66–70.
- Yinquan, H.; Heping, L.; Amp, L.P. An analysis on the charging characteristics of lithium iron phosphate batteries for electric vehicles. Automot. Eng. 2013, 5, 293–297.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).