Application of Clustering Algorithms in the Location of Electric Taxi Charging Stations

Abstract: A reasonable layout of charging stations is an important measure for improving the market penetration of electric taxis. Based on multi-type clustering algorithms, a widely applicable method for locating electric taxi charging stations is proposed. By analyzing massive gasoline taxi GPS trajectory data, the parking information and charging requirements of electric taxis are extracted, and the research area is divided into grids. The grids are then processed with multiple same-type clustering algorithms and with multiple multi-type clustering algorithms to find the locations of charging stations, and the two approaches are compared. The empirical analysis shows that the locations obtained by the multiple multi-type clustering algorithms are more reasonable than those obtained by the multiple same-type clustering algorithms, which can effectively prolong the driving distance of electric taxis and save drivers' travel time.


Introduction
With the rapid development of the global economy and the continuous depletion of fossil energy [1], the greenhouse effect is becoming more and more serious. Low-pollution, low-emission electric vehicles (EVs) are gradually attracting great attention [2]. Batteries are the main power source of EVs, but due to the limitations of current battery technology, EVs are not practical for long-distance travel, and charging remains inconvenient. This can lead to range anxiety, the mental distress or apprehension caused by a driver's fear of suddenly running out of power while driving an electric vehicle [3], which has become an important factor hindering the development of EVs. It follows that the key to solving the driving range problem of electric vehicles is to optimize the layout of charging stations according to the charging demand.
The reasonable deployment of charging infrastructure plays a positive role in extending the driving range of EVs and promoting the development of EVs. In order to solve the problems of insufficient range, inconvenient charging, and unreasonable charging infrastructure layout of EVs, a lot of studies have investigated the location problem of public charging stations for EVs from different perspectives.
In terms of the influencing factors, some researchers have found that charging demand [4], vehicle miles traveled [5,6], geographic distribution of cities [6], path deviation [7], traffic flow patterns [8], and other factors are important in influencing the location of charging stations, of which charging demand is the most fundamental factor. The current market penetration of EVs is low, and the accurate estimation of charging demand can help achieve an optimized layout of charging stations.
To ensure more accurate estimation of charging demand, some researchers have used GPS trajectory data of gasoline vehicles to simulate the trajectories of EVs for charging infrastructure location selection. For example, Pan et al. [9] used household trip survey data to simulate drivers' charging selection behavior with an EV charging decision process, so that drivers' existing travel activities are affected as little as possible. Chen et al. [10] used parking information from more than 30,000 individual trip records collected in a household trip survey in Seattle, Washington, USA, to determine the optimal number of charging stations to be allocated. Liu et al. [11] proposed an intelligent, data-driven optimization method using particle swarm optimization, based on GPS trajectory data of hybrid vehicles in Chengdu, China, to achieve intelligent siting of EV charging stations. Yang et al. [12] used the GPS trajectory information of a taxi fleet in Changsha, China, to estimate the likelihood of EV charging with a queuing model, and investigated the trade-off between installing more charging piles and providing more waiting space, as well as the effect of charging power on waiting time. Shi et al. [13] presented an improved destination selection model to simulate the electric taxi (ET) operation system and to help find the optimal ET charging station size through statistical analysis based on charging need prediction. Because ETs have emerged only recently in China, the relevant GPS trajectory data are too scarce to support a reasonable layout of charging stations. Therefore, this paper simulates the travel trajectory of electric taxis with GPS trajectory data of gasoline vehicles and predicts the charging demand of electric taxis by combining this with grid-based maps.
In terms of location methods, most studies focus on constructing charging station location models and siting charging stations based on different objective functions and constraints [14,15], while relatively few studies have applied clustering algorithms to the location of electric vehicle charging stations. Cluster analysis is a kind of unsupervised learning, and there are many kinds, such as the partition-based K-means clustering algorithm [16], hierarchical clustering algorithms (agglomerative and divisive) [17], and the density-based DBSCAN algorithm. At present, clustering algorithms are widely used in short-time traffic flow prediction [18], logistics center location selection [19], traffic flow speed prediction [20], and travel hotspot area research [21], but they are less widely applied in charging station location selection. For example, Zhang et al. [22] developed a siting model for electric taxis based on their dynamic distribution and charging demand using the K-means clustering method and the center of gravity method, and applied it to the problem of siting charging stations for electric taxis in Chengdu, China. Straka et al. [23] analyzed charging transactions in the Netherlands using clustering algorithms (K-means, DBSCAN, and agglomerative hierarchical clustering) to identify usage-related segments of charging stations, which helps to improve the planning of charging infrastructure and the development of smart charging technologies. Liu et al. [24] used existing service areas on highways as potential locations for charging infrastructure, clustered nearby service areas, and calculated the optimal location of charging stations for each cluster. Gilanifar et al. [25] proposed a clustered multi-node learning method based on Gaussian processes (CMNL-GP) to fuse and learn data from multiple charging stations simultaneously. Zhang et al. [26] proposed a density peak clustering-based optimization method for siting and sizing EV charging stations in an urban area. Sánchez et al. [27] proposed a clustering strategy based on the K-means algorithm to define potential charging station locations. The above studies each rely on a single clustering approach for the siting and sizing of charging stations, and multiple clustering algorithms have not yet been combined and applied in siting studies. This paper uses multiple and multi-type clustering algorithms to optimize the location of charging stations and to obtain the optimal charging station locations and the best combination of clustering algorithms.
At present, electric taxis have not been fully popularized in Qingdao. Therefore, the charging demand of electric taxis can only be estimated from the GPS trajectory data of gasoline taxis. The research area consists of the five main districts of Qingdao (Shinan District, Shibei District, Licang District, Chengyang District, and Laoshan District). Firstly, the map of the study area is gridded, the number of vehicles in each grid that stay longer than the time threshold is recorded as the number of dwell events, and the number of dwell events in each grid is used as the charging demand of the grid. Finally, the overall weighted Euclidean distance sums of the two location selection methods are compared, and the optimal locations and the best location selection method for charging stations are obtained. The proposed method is of theoretical and practical significance.
This paper is organized as follows: Section 2 presents the problem statement and data processing. Section 3 presents the location selection methodology. Section 4 presents the results of the charging station location. Finally, we summarize this paper and present the limitations and future research directions in Section 5.

Problem Statement
Given a set of electric vehicles that charge at least once a day at a charging station and a set of grids with a long dwell of electric vehicles, the problem is to present multiple and multi-type clustering algorithms to optimize the location of charging stations, so as to obtain the optimal charging station layout and the best clustering siting algorithm combination.

Data Processing
In this paper, taxi GPS trajectory data from 0:00 to 24:00 on 11 October 2017 (a weekday) are analyzed. Considering the economic development and geographical location characteristics of each urban area in Qingdao, the records within the five major municipal districts are extracted. After removing abnormal data and discontinuous trajectories, 828,341 GPS trajectory records for a total of 6042 taxis were obtained. The GPS position of each taxi was captured approximately once every 5 s. Data items include vehicle ID, time, longitude, latitude, speed (km/h), and passenger status (0 for empty, 1 for occupied) (see Table 1). The research area of this paper consists of the five main municipal districts of Qingdao (Figure 1). To count the dwell demand in each area exhaustively and to prevent miscounting caused by inconsistent area sizes, the research area is divided into 4759 grids with a width and height of 0.005° each (a rectangle of about 450 m × 550 m), numbered automatically as i (i = 1, 2, ..., 4759). The grid number serves only as an identifier and does not indicate the geographical distance between grids. Therefore, the geometric center of each grid is chosen to represent its geographic location, and the number of dwell events in the grid represents the charging demand of electric taxis in that grid.
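The gridding step can be illustrated with a minimal sketch. It assumes cells are aligned to a south-west origin and indexed by (column, row); the origin coordinates and the function names `grid_cell` and `cell_center` are hypothetical and are not taken from the paper. Only the 0.005° cell size comes from the text.

```python
import math

CELL_DEG = 0.005  # cell width and height in degrees, as stated in the text


def grid_cell(lon, lat, lon_min, lat_min, cell=CELL_DEG):
    """Return the (column, row) of the grid cell containing a GPS point."""
    col = math.floor((lon - lon_min) / cell)
    row = math.floor((lat - lat_min) / cell)
    return col, row


def cell_center(col, row, lon_min, lat_min, cell=CELL_DEG):
    """Return the geometric center (lon, lat) that represents a cell."""
    return (lon_min + (col + 0.5) * cell, lat_min + (row + 0.5) * cell)


# Hypothetical origin and GPS fix, chosen only for illustration.
LON_MIN, LAT_MIN = 120.0, 36.0
col, row = grid_cell(120.3812, 36.0671, LON_MIN, LAT_MIN)
center = cell_center(col, row, LON_MIN, LAT_MIN)
```

Every fix that falls in the same cell is then represented by that cell's center, which is what allows dwell events to be counted per grid.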

Charging Demand
In order to maximize the satisfaction of the charging demand of electric taxis and enhance the rationality of site selection, this paper assumes that the travel patterns of drivers will not change during the electrification of gasoline taxis. So the GPS travel trajectory data of gasoline taxis in the five main districts of Qingdao is used to simulate the travel behavior of electric taxis, and extract the vehicle parking patterns to mine their charging needs.

Taxi drivers typically have long dwelling times for meals, refueling, shift changes, or breaks, so it is reasonable to assume that electric taxis will have charging needs during these periods. To capture the charging demand of taxi drivers, this paper uses a time threshold of 20 min [12] to identify vehicle dwells. If the GPS trajectory data show that a vehicle dwells in the same grid for more than 20 min, the vehicle is considered to need charging in this grid, and a dwell event is recorded for this grid.
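The dwell rule above can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes each record has already been mapped to a grid identifier, and it treats one uninterrupted run of fixes from the same vehicle in the same grid as a candidate dwell. The record layout and the name `count_dwell_events` are hypothetical.

```python
from collections import Counter
from itertools import groupby

DWELL_THRESHOLD_S = 20 * 60  # 20-minute threshold from the text


def count_dwell_events(records, threshold_s=DWELL_THRESHOLD_S):
    """Count dwell events per grid.

    records: iterable of (vehicle_id, timestamp_s, grid_id) tuples.
    A dwell event is one uninterrupted run of fixes from the same vehicle
    in the same grid that spans more than threshold_s seconds.
    """
    events = Counter()
    ordered = sorted(records, key=lambda r: (r[0], r[1]))
    for _, track in groupby(ordered, key=lambda r: r[0]):
        # Split one vehicle's track into consecutive runs within a single grid.
        for grid_id, run in groupby(track, key=lambda r: r[2]):
            run = list(run)
            if run[-1][1] - run[0][1] > threshold_s:
                events[grid_id] += 1
    return events


# Toy data: taxi "A" stays in grid 7 for 25 min, taxi "B" for only 10 min.
sample = [("A", t, 7) for t in range(0, 1501, 300)] + [("A", 1800, 8)]
sample += [("B", 0, 7), ("B", 600, 7)]
events = count_dwell_events(sample)
```

In this toy input only taxi "A" produces a dwell event, so grid 7 ends up with a charging demand of 1.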
The research area in this paper was divided into a total of 4759 grids, and the total number of dwell events occurring in each grid was recorded. Of the 4759 grids, 900 had dwell events, which involved a total of 3202 taxis and 3401 dwell events. Figure 2 shows the frequency statistics of the number of dwell events per grid. It can be seen from Figure 2 that grids without a dwell event accounted for 81% of all grids, grids with a single dwell event accounted for 6%, and grids with 20 or more dwell events were the least common.
Figure 3 shows the number of dwell sites at which one driver dwells in a day. Most taxis have only one dwell event in a day, and a taxi stays in at most three locations for more than 20 min in a day. Figure 4 shows the spatial distribution of dwell events (a represents the number of dwell events per grid). As seen in Figure 4, the grids with dwell events are relatively dense, and the grids with many dwell events are mostly located in the center of each district. The eastern part of Laoshan District is the Laoshan Scenic Area, where vehicle dwells are relatively few and scattered, so there are fewer grids with dwell events. The eastern and western parts of Chengyang District differ considerably in development, and the western area is less developed and has fewer electric vehicles, so the grid distribution in Figure 4 is consistent with the actual situation of EV dwells in the city.
In this paper, the number of dwell events in each grid is used to represent the charging demand for electric taxis in the grid: the more dwell events occur, the higher the charging demand for EVs in the grid. It is relatively uneconomical to install charging stations in places that are not attractive to taxi drivers. Therefore, grids with no fewer than 4 dwell events (a ≥ 4) are selected as the study object; 295 of the 4759 grids satisfy this condition, and they contain 2312 taxis. This paper therefore optimizes the location of charging stations based on the 295 grids where taxis dwell.

Methodology
To achieve the optimal layout of charging stations and find the best combination of clustering siting algorithms, this paper proposes multiple and multi-type clustering algorithms, which mainly involve the K-means clustering algorithm, the K-means weighted clustering algorithm, and the hierarchical clustering algorithm.

Calculation of Euclidean Distance
The difference between the K-means clustering algorithm and the K-means weighted clustering algorithm is whether Euclidean distance or the weighted Euclidean distance is used in the clustering process.
The Euclidean distance used by the K-means clustering algorithm is given by Equation (1)

$$D(x_i, x_j) = \sqrt{\sum_{l=1}^{m} (x_{il} - x_{jl})^2} \tag{1}$$

where $x_i = (x_{i1}, x_{i2}, \ldots, x_{im})$ and $x_j = (x_{j1}, x_{j2}, \ldots, x_{jm})$ represent two data objects containing m-dimensional attributes [15]. In this paper, the position coordinates of the i-th grid can be expressed as

$$L_i = (xpos_i, ypos_i), \quad i \in N \tag{2}$$

where $L_i$ represents the position of the i-th grid; $xpos_i$ and $ypos_i$ denote the x-coordinate and y-coordinate of the i-th grid position, expressed in terms of GPS longitude and latitude, respectively. N is the set of grid numbers in this category. Suppose the coordinates of the k-th cluster center are written as Equation (3)

$$Z_k = (zx_k, zy_k), \quad k = 1, 2, \ldots, K \tag{3}$$

where $Z_k$ represents the position of the k-th cluster center; $zx_k$ and $zy_k$ represent the x-coordinate and y-coordinate of the k-th cluster center, respectively. Therefore, the Euclidean distance between the i-th grid and the k-th cluster center can be written as Equation (4)

$$D(L_i, Z_k) = \sqrt{(xpos_i - zx_k)^2 + (ypos_i - zy_k)^2}, \quad i \in N, \; k = 1, 2, \ldots, K \tag{4}$$

In the K-means clustering algorithm, the commonly used methods for determining the number of clusters K are the silhouette coefficient method and the elbow rule. The silhouette coefficient method determines the optimal K value by finding the local optimal result; the elbow rule determines the optimal K value by judging the change of the sum of squared errors (SSE) within the class. This paper uses the elbow rule to determine the number of clusters K. In the elbow rule, the sum of squared errors (SSE) of the distance between the cluster center of each class and the sample points in the class is called the degree of distortion. For a class, the lower the degree of distortion, the closer together the sample points within the class are. The larger the number of clusters, the fewer sample points each class contains and the closer the sample points are to their cluster center, so the degree of distortion decreases as the number of clusters increases.
Once the number of clusters exceeds the actual number of categories, the degree of distortion no longer changes significantly even if K continues to increase. An "elbow" therefore forms on the line graph of the degree of distortion against K, and the K value at the elbow is the selected number of clusters. The degree of distortion (SSE) can be written as Equation (5)

$$SSE = \sum_{k=1}^{K} \sum_{i \in N_k} D(L_i, Z_k)^2 \tag{5}$$

where $N_k$ denotes the set of grids assigned to the k-th class.
In addition, the K-means weighted clustering algorithm assigns a weight to each grid in the K-means clustering process and replaces the Euclidean distance with the weighted Euclidean distance. The weighted Euclidean distance between the position of the i-th grid and the center of the k-th cluster is calculated by Equation (6), where $w_i$ is the weight of the i-th grid.
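To make the degree of distortion concrete, the sketch below evaluates the Euclidean distance between a grid and a cluster center and sums the squared distances over all grids, following the verbal definition of SSE given above. The function names and the toy coordinates are illustrative assumptions.

```python
import math


def euclidean(grid, center):
    """Euclidean distance between a grid position and a cluster center."""
    return math.hypot(grid[0] - center[0], grid[1] - center[1])


def degree_of_distortion(grids, labels, centers):
    """Sum of squared distances from each grid to its assigned center (SSE)."""
    return sum(euclidean(g, centers[k]) ** 2 for g, k in zip(grids, labels))


# Toy positions forming two well-separated groups on a line.
grids = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
sse_k1 = degree_of_distortion(grids, [0, 0, 0, 0], [(6.0, 0.0)])
sse_k2 = degree_of_distortion(grids, [0, 0, 1, 1], [(1.0, 0.0), (11.0, 0.0)])
```

Moving from one cluster to two drops the distortion from 104 to 4 in this toy case, which is the kind of sharp decrease the elbow rule looks for before the curve flattens.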

Multiple Same-Type Clustering and Multiple Multi-Type Clustering Algorithms
This paper presents multiple and multi-type clustering algorithms for the siting layout of charging stations. Here, "multiple" means applying the same clustering algorithm repeatedly and improving it during the application, and "multi-type" means applying several different clustering algorithms in combination.
Method 1: The multiple same-type clustering algorithms first use K-means clustering to obtain the classification results based on the geographical location between grids; secondly, using the charging demand of each grid as the weight, K-means weighted clustering is performed on the sample points of each category. The new cluster center of each class is obtained, which is the location of the charging station, and the intra-class weighted Euclidean distance sum from the sample points of each class to the cluster center is calculated, and finally, the overall weighted Euclidean distance sum is obtained.
Method 2: The multiple multi-type clustering algorithms first use K-means clustering to obtain classification results based on the geographical location between grids; secondly, considering the charging demand of each grid, the two-step clustering method is used to select the location of charging stations. The two-step clustering method performs agglomerative hierarchical clustering on the sample points of each class, chooses a fixed relative distance to reclassify them, and then performs K-means weighted clustering on each resulting class to obtain the location of the charging stations and the intra-class weighted Euclidean distance sum of each class. Finally, the overall weighted Euclidean distance sum is obtained.
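The two-step method can be sketched in a few lines. Several choices here are assumptions rather than details given in the paper: single linkage is used for the agglomerative step, the cut is applied to raw coordinate distances, and the weighted K-means step with K = 1 is taken to be the demand-weighted mean of the grid centers. The names `agglomerate`, `weighted_center`, and `two_step` are hypothetical.

```python
import math


def agglomerate(points, cut):
    """Single-linkage agglomerative clustering, stopped at distance `cut`."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge it if within the cut.
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > cut:
            break
        clusters[i] += clusters.pop(j)
    return clusters


def weighted_center(points, weights):
    """Demand-weighted mean position (assumed form of weighted K-means, K = 1)."""
    total = sum(weights)
    x = sum(w * p[0] for p, w in zip(points, weights)) / total
    y = sum(w * p[1] for p, w in zip(points, weights)) / total
    return (x, y)


def two_step(points, weights, cut):
    """Reclassify one K-means class, then place one station per subclass."""
    stations = []
    for members in agglomerate(points, cut):
        stations.append(
            weighted_center([points[i] for i in members],
                            [weights[i] for i in members])
        )
    return stations


# Toy class: two tight pairs of grids with dwell counts as weights.
pts = [(0.0, 0.0), (0.05, 0.0), (0.5, 0.0), (0.55, 0.0)]
dwell = [4, 12, 5, 5]
stations = two_step(pts, dwell, cut=0.08)
```

With a cut of 0.08 the toy class splits into two subclasses, and each station is pulled toward the grid with more dwell events. Method 1 corresponds to calling `weighted_center` once on the whole class.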

Results
In order to obtain a reasonable layout and siting method for charging stations, the 295 screened grids are first clustered with the K-means clustering algorithm based on their geographic locations. The input data samples of the K-means clustering algorithm are shown in Table 2, which contains the grid number, the longitude and latitude of the grid location, and the number of dwell events in the grid. The choice of K is crucial for the K-means clustering algorithm. Table 3 lists the number of clusters K and the corresponding degree of distortion (SSE) obtained by the elbow rule, and Figure 5 shows the resulting elbow diagram. Reading K off the elbow diagram relies on subjective observation, so this paper also sets a limit value for the change in the degree of distortion (SSE) between consecutive K values: if the change is less than this limit, the former K value is selected. With the limit set to 0.02, K is 4. Meanwhile, the elbow diagram in Figure 5 shows that the degree of distortion (SSE) does not change significantly when K > 4, so the final number of clusters K is determined to be 4. Using the two criteria together makes the choice of K more reliable.
K-means clustering considers only the geographic locations of the 295 grids and not the charging demand of electric taxis in each grid, so the 4 cluster centers obtained are not the best locations for charging stations. Therefore, the following two location optimization methods are applied on the basis of the K-means clustering results to select the optimal locations of the charging stations.
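The limit-value rule for choosing K can be sketched as below. The SSE values are invented for illustration and are not the values of Table 3; the sketch also assumes that the "variation difference" is the absolute drop in SSE between consecutive K values, which the text does not state explicitly. The name `pick_k` is hypothetical.

```python
def pick_k(sse_by_k, limit):
    """Return the first K whose SSE drop to the next K is below `limit`."""
    ks = sorted(sse_by_k)
    for k, k_next in zip(ks, ks[1:]):
        if sse_by_k[k] - sse_by_k[k_next] < limit:
            return k  # the former K of the pair is selected
    return ks[-1]


# Hypothetical degrees of distortion for K = 1..6 (not the paper's data).
sse_by_k = {1: 1.20, 2: 0.60, 3: 0.35, 4: 0.22, 5: 0.21, 6: 0.20}
chosen_k = pick_k(sse_by_k, limit=0.02)
```

Turning the visual elbow into an explicit threshold makes the choice reproducible, at the cost of having to justify the threshold itself.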

Location Results of Multiple Same-Type Clustering Algorithms
The more dwell events a grid has, the greater the charging demand of electric taxis in that grid. For the 4 classes of grid data obtained by K-means clustering, the charging demand of electric taxis in each grid is used as the weight of the grid. K-means weighted clustering (K = 1) is then performed on the sample points of each class to obtain 4 new cluster centers and the corresponding intra-class weighted Euclidean distance sums (Table 4). The resulting overall weighted Euclidean distance sum of the four classes of grid data is 24.1. Figure 7 shows the best locations of charging stations obtained by the multiple same-type clustering algorithms. As can be seen from Figure 7, the charging stations are located in the economic and residential centers of Shibei District, Licang District, Laoshan District, and Chengyang District, respectively. However, such a small number of charging stations may cause long queues of electric taxis and reduce drivers' satisfaction with charging, and the queues may in turn cause traffic congestion around the charging stations. Therefore, the locations need to be refined further.
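For K = 1, the weighted clustering step has a closed form under one common reading of the weighted objective. The derivation below is a sketch that assumes the objective is the demand-weighted sum of squared Euclidean distances, which the text does not state explicitly; $N_k$, the set of grids in class k, is notation introduced here for illustration.

```latex
\begin{aligned}
J(Z_k) &= \sum_{i \in N_k} w_i \left[ (xpos_i - zx_k)^2 + (ypos_i - zy_k)^2 \right], \\
\frac{\partial J}{\partial zx_k} &= -2 \sum_{i \in N_k} w_i \, (xpos_i - zx_k) = 0
  \;\Rightarrow\; zx_k = \frac{\sum_{i \in N_k} w_i \, xpos_i}{\sum_{i \in N_k} w_i}, \\
\frac{\partial J}{\partial zy_k} &= -2 \sum_{i \in N_k} w_i \, (ypos_i - zy_k) = 0
  \;\Rightarrow\; zy_k = \frac{\sum_{i \in N_k} w_i \, ypos_i}{\sum_{i \in N_k} w_i}.
\end{aligned}
```

Under this assumption each station sits at the demand-weighted mean of its grid centers, so grids with more dwell events pull the station toward them.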

Location Results of Multiple Multi-Type Clustering Algorithms
Based on the results of K-means clustering, a two-step clustering method (agglomerative hierarchical clustering and K-means weighted clustering) is used to optimize the location of charging stations. Firstly, agglomerative hierarchical clustering is performed on the sample grid data of each class obtained by K-means clustering, and a relative distance of 0.08 is selected to classify the grid. Figure 8 shows the tree diagram obtained by agglomerative hierarchical clustering for each of the 4 classes of sample grid data. The black dotted line in the tree diagram represents the relative height of 0.08, which is used to divide the results of agglomerative hierarchical clustering. Figure 8b shows the agglomerative hierarchical clustering result of category 2. Most of the grids in the sample are located in Laoshan District, and the distance between grids is long, so it is divided into 2 categories. Figure 8c shows the agglomerative hierarchical clustering results of category 3. The grids in the sample are mostly located in the northern part of Licang District and Shibei District, the location between grids is close and the number of grids is large, which indicates that the charging demand of electric vehicles in this area is large. So it is divided into 4 categories. Figure 8d shows the agglomerative hierarchical clustering result of category 4. The grids in the sample are basically located in Chengyang District. Most grids are densely located, and some grids are scattered around, so it is divided into 3 categories. After performing agglomerative hierarchical clustering on grid samples, all sample grid data are divided into 12 categories.
The 12 categories of data obtained by the agglomerative hierarchical clustering method are each subjected to K-means weighted clustering (K = 1) to obtain 12 cluster centers, which are the optimal locations of charging stations. The 12 cluster centers and the corresponding intra-class weighted Euclidean distance sums are shown in Table 5, and the final overall weighted Euclidean distance sum is 16.1. Figure 9 shows the optimal layout of charging stations obtained by the two-step clustering method. It can be seen from Figure 9 that the locations of the charging stations match the grids with many dwell events, which meets the charging demand of electric vehicles. Compared with Laoshan District and Chengyang District, Shinan District, Shibei District, and Licang District have more charging stations. This is because these three districts contain dense residential areas, commercial areas, and scenic spots and have a high population density, resulting in high traffic flow, many dwell events, and a high charging demand for electric vehicles.

Results Analysis
By comparing the overall weighted Euclidean distance sums obtained by the multiple same-type clustering algorithms and the multiple multi-type clustering algorithms, it can be seen that the multiple multi-type clustering algorithms effectively reduce the overall weighted Euclidean distance sum. That is, they reduce the traveling distance from the electric vehicle to the charging station, save the travel time of the electric vehicle driver, and increase the operating time. From Figures 7 and 9, it can be found that the location and layout of the charging stations in Figure 9 are more reasonable, which can meet the charging demand of electric vehicles as much as possible and achieve the goal of optimizing the location of the charging stations. Meanwhile, the multiple multi-type algorithms proposed in this paper perform better than the multiple same-type clustering algorithms in locating charging stations, which provides a new method for future charging station siting.
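The comparison metric can be illustrated on toy data. The sketch assumes the weighted Euclidean distance of a grid is its weight multiplied by its Euclidean distance to the assigned station, which is one plausible form and not necessarily the paper's Equation (6); the function name and all numbers are hypothetical.

```python
import math


def weighted_distance_sum(grids, weights, assignment, stations):
    """Sum over grids of weight times distance to the assigned station."""
    return sum(
        w * math.dist(g, stations[s])
        for g, w, s in zip(grids, weights, assignment)
    )


# Toy grids on a line with dwell counts as weights.
grids = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
dwell = [4, 4, 6, 6]

# One station at the demand-weighted mean of all grids.
one_station = weighted_distance_sum(grids, dwell, [0, 0, 0, 0], [(6.5, 0.0)])
# Two stations, one per group, each at its group's weighted mean.
two_stations = weighted_distance_sum(
    grids, dwell, [0, 0, 1, 1], [(0.5, 0.0), (10.5, 0.0)]
)
```

The sum falls from 96 to 10 when the two groups each receive a station. The sum can only fall as stations are added, so the comparison between 24.1 and 16.1 also reflects the larger number of stations in the second method.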

Conclusions
This paper takes the gridded map of the five major municipal districts of Qingdao (Shinan District, Shibei District, Licang District, Chengyang District, and Laoshan District) as the research area. Based on the GPS trajectory data of gasoline taxis in these districts, it extracts the number of vehicles in each grid with a dwell time of more than 20 min and selects the grids with no fewer than 4 dwell events. The geometric center of each grid and its number of dwell events are assumed to represent the location of the grid and the charging demand of electric taxis, respectively. The location of charging stations is then selected using clustering methods: based on the geographic location of the grids, multiple same-type clustering algorithms and multiple multi-type clustering algorithms are applied to all grids separately. The overall intra-class weighted Euclidean distance sum obtained by the multiple same-type clustering method is 24.1, and that obtained by the multiple multi-type clustering method is 16.1. The multiple multi-type clustering algorithms thus yield a significantly smaller overall weighted Euclidean distance sum than the multiple same-type clustering algorithms, reducing the traveling time of electric taxis, and their location selection result is more reasonable. This paper provides feasible suggestions and methods for the location and optimal layout of charging stations in the five major municipal districts of Qingdao.
Currently, the market penetration rate of electric vehicles is increasing, and the reasonable layout of charging stations plays a positive role in the promotion of electric vehicles. The multiple multi-type clustering location selection method proposed in this paper provides a new solution for the optimal layout of urban charging stations. However, only the travel time of electric vehicle drivers is considered, and the trajectory data of gasoline taxis are used to simulate the trajectories of electric taxis, which introduces some error into the station locations. In future research, cost can be considered, more clustering algorithms can be integrated, and GPS trajectory data of electric vehicles can be used to make the location of charging stations more scientific and reasonable.