1. Introduction
With the global energy transition and carbon-neutrality targets, the electric car market has expanded rapidly. In 2024, global electric car sales surpassed 17 million units, raising the worldwide share of electric cars in new-car sales to over 20% [1]. China remained the largest market, with over 11 million electric cars sold in 2024, accounting for almost half of all new car sales in China that year [1]. The International Energy Agency projects that global electric car sales will exceed 20 million units in 2025, representing more than one-quarter of cars sold worldwide [2]. However, the deployment of public charging infrastructure has not kept pace with this rapid adoption. According to [2], by the end of 2024, more than 75% of the European highway network had chargers at most 50 km apart, whereas only 35% of the U.S. interstate highway system achieved the same level of coverage. In the United States, total non-home charging deployment increased from about 151,000 in mid-2023 to 204,000 in 2024 [3].
In this context, the scientific accuracy and precision of charging station location optimization decisions have a direct impact on the development of the electric vehicle industry and the effectiveness of energy system transformation. Research demonstrates that optimized siting can reduce total system costs by 30–40%, increase charging station utilization rates by 60–80% compared to suboptimal locations, and reduce transmission losses by 6.4% [4]. Charging station location optimization has become a critical technical issue for promoting the sustainable development of the electric vehicle industry.
Early studies typically formulated charging station placement using facility location models and heuristic algorithms. For example, p-median and related formulations have been used to minimize travel distance and improve accessibility, but they often rely on static or simplified demand representations and thus struggle to capture short-term spatio-temporal dynamics [5]. Hybrid approaches combining location optimization with simulation can better reflect behavioral responses but may become computationally expensive for city-scale deployment [6]. Metaheuristics such as genetic algorithms, simulated annealing hybrids, and particle-swarm variants have also been widely adopted to handle non-convex search spaces and complex constraints [7,8,9,10]. However, many of these approaches focus on a single objective (or collapse multiple objectives into a single score), which can obscure the tensions among competing planning goals.
Multi-objective evolutionary algorithms have consequently gained popularity in charging infrastructure planning. NSGA-II and related methods have been used to jointly optimize coverage, cost, and other performance metrics, and comparative studies frequently report NSGA-II as a strong baseline for approximating Pareto-optimal trade-offs [11,12,13]. Beyond NSGA-II, multi-objective particle swarm optimization and other evolutionary designs have been applied to bi- or tri-objective charging station problems that consider electricity costs, emissions, or grid-oriented performance measures [14,15]. Despite these advances, two limitations remain common. First, many multi-objective siting studies still rely on coarse demand proxies (e.g., static POIs, aggregate travel indicators, or historical averages) rather than planning-ready spatio-temporal forecasts, which can lead to hotspot misallocation under demand sparsity. Second, the objectives and feasibility terms are often parameterized by assumptions that are difficult to verify at a city scale, which can weaken interpretability and transferability for practical planning.
Recent work has begun to incorporate richer mobility evidence and data-driven prediction to better match real-world charging dynamics. Activity-based and mobility-informed approaches can better reflect where charging opportunities are structurally needed beyond observed utilization alone [16,17,18]. Complementing these mobility-informed approaches, GIS-enabled multi-criteria screening frameworks have been used to synthesize heterogeneous spatial layers (e.g., land-use constraints, walk/cycle catchments, population exposure, and proximity to grid assets) into interpretable suitability surfaces or ranked zones, providing a transparent pre-planning step for narrowing candidate areas before detailed optimization [19,20]. In parallel, spatio-temporal deep learning has emerged as a promising tool for forecasting charging demand, including graph-based models and recurrent variants that better capture temporal dependence and spatial spillovers [21,22,23]. However, these components are often studied in isolation: forecasting is evaluated primarily by generic error metrics rather than planning-relevant hotspot fidelity, and mobility representations are rarely integrated into an end-to-end decision framework that explicitly balances efficiency, equity, and feasibility through constrained multi-objective optimization. As a result, planners still lack a unified workflow that (1) produces robust demand signals under extreme sparsity, (2) complements realized charging demand with citywide mobility structure to avoid overlooking demand-sparse but strategically important areas, and (3) yields transparent, constraint-feasible trade-off portfolios rather than a single opaque “optimal” solution.
To address these gaps, this study develops a hierarchical spatio-temporal planning framework that integrates demand forecasting, mobility representation learning, and constrained multi-objective siting within a unified grid-based decision system. First, a ConvLSTM-based forecasting module with attention is trained to produce short-term grid-level demand surfaces, and performance is evaluated in a planning-relevant manner under demand sparsity by reporting hotspot-oriented and system-level indicators. Second, to complement realized charging demand, a trajectory-based representation module (a denoising autoencoder with temporal self-attention) learns citywide mobility patterns and derives accessibility-related signals that remain available even in demand-sparse areas. Third, we formulate a constrained multi-objective siting problem that jointly considers demand-oriented service effectiveness, accessibility, and equity-oriented objectives, as well as infrastructure feasibility, which is proxied by network connection frictions. We solve it using NSGA-II to obtain Pareto-optimal portfolios. Finally, TOPSIS is used to rank Pareto solutions and recommend implementable deployment plans under different budget levels, enabling transparent comparison of efficiency–equity–cost trade-offs and supporting phased planning.
The remainder of the paper is organized as follows. Section 2 introduces the study area and datasets. Section 3 describes the hierarchical modeling and optimization framework. Section 4 reports forecasting performance, mobility-layer characteristics, and multi-objective siting results, including recommended plans. Section 5 and Section 6 discuss implications, limitations, and conclusions.
2. Study Area and Data
2.1. Overview of Experimental Area
This study was conducted in Wuhan, a major megacity in central China. The city covers a total area of 8569.2 km² and has a resident population of approximately 11.21 million, providing a large and diverse urban context for examining public EV charging demand. Administratively, Wuhan comprises 13 districts (Jiang’an, Jianghan, Qiaokou, Hanyang, Wuchang, Qingshan, Hongshan, Dongxihu, Caidian, Jiangxia, Huangpi, Xinzhou, and Hannan). These are commonly described as seven central urban districts and six suburban districts, reflecting a distinct core–periphery structure in urban functions and mobility. Wuhan has experienced rapid urban development and maintains a high level of urbanization; for instance, official statistics reported an urbanization level of 80.29% in 2018. Against this backdrop, the city has witnessed fast growth in EV adoption. Public reporting indicates that by 2023, Wuhan’s EV ownership had risen to around 360,000 vehicles, implying increasing pressure on public charging infrastructure to serve both routine and intensive users. In terms of urban structure, Wuhan has historically been shaped by the “three towns” (Hankou, Hanyang, and Wuchang), separated by the Yangtze and Han rivers, and has evolved into a polycentric metropolis. Such geography and functional heterogeneity tend to produce spatially concentrated activity hotspots and corridor effects, while also creating frictions (e.g., river-crossing constraints and uneven land-use intensity) that can lead to localized service gaps. These characteristics make Wuhan a suitable empirical setting for studying the spatial concentration of public charging demand and for evaluating planning strategies under realistic metropolitan conditions.
2.2. Charging Dataset Description
This study utilizes EV charging demand data collected from the Charging Bar mobile application. Charging Bar is an online platform that helps EV users locate nearby public charging stations and provides real-time availability at the connector level. A distinct advantage of this data source is its cross-platform aggregation: the dataset covers public charging stations operated by 22 charging service providers, which mitigates single-operator platform bias and offers a more comprehensive view of citywide public charging activity.
Within Wuhan, the platform lists 1701 public charging stations and 24,763 charging connectors. To capture realistic short-term demand dynamics, we conducted high-frequency monitoring from 00:00 on 14 February 2025 to 00:00 on 20 February 2025 (see Table 1). Data acquisition was performed using a time-series crawler that periodically queried the platform’s real-time status feed. At each snapshot, every connector is labeled as either “occupied” or “idle”. Because the platform API occasionally returns incomplete station lists and duplicated entries, we performed validity filtering and retained 5,859,191 connector-status records as the cleaned snapshot dataset.
Because a single snapshot does not represent a complete charging session, we optionally derive a session-like indicator from status transitions in each connector’s time series. Specifically, for a given connector, if its status is observed as occupied at time t and idle at the subsequent snapshot t′, we record an occupied-to-idle transition, indicating that a charging session likely ended within [t, t′]. Aggregating such transitions across all connectors during the observation window yields 1,823,781 transition-based events, which we use as an auxiliary behavioral statistic rather than the primary forecasting target.
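For illustration, the occupied-to-idle transition rule described above can be sketched in Python as follows. This is a minimal sketch, not the pipeline used in this study: the record schema `(connector_id, timestamp, status)` and the function name `count_occupied_to_idle` are hypothetical, and timestamps are assumed to be sortable values.

```python
from collections import defaultdict


def count_occupied_to_idle(records):
    """Count occupied->idle transitions per connector.

    records: iterable of (connector_id, timestamp, status) tuples, where
    status is "occupied" or "idle" (hypothetical schema for illustration).
    """
    by_connector = defaultdict(list)
    for connector_id, ts, status in records:
        by_connector[connector_id].append((ts, status))

    transitions = {}
    for connector_id, series in by_connector.items():
        series.sort(key=lambda r: r[0])  # enforce chronological order
        count = 0
        # Compare each snapshot with the next one for the same connector.
        for (_, prev), (_, curr) in zip(series, series[1:]):
            if prev == "occupied" and curr == "idle":
                count += 1
        transitions[connector_id] = count
    return transitions


snapshots = [
    ("c1", 0, "idle"), ("c1", 1, "occupied"), ("c1", 2, "occupied"),
    ("c1", 3, "idle"), ("c1", 4, "occupied"), ("c1", 5, "idle"),
    ("c2", 0, "occupied"), ("c2", 1, "occupied"),
]
events = count_occupied_to_idle(snapshots)
```

In this toy input, connector `c1` ends two sessions and `c2` ends none within the window. A session that both starts and ends between two consecutive snapshots is invisible to this rule, which is one reason the transition count is treated as an auxiliary statistic.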
Finally, all stations are mapped to a regular 1 km grid system, and the connector-level snapshots are aggregated into an hourly grid-level time series. For each grid cell i and hour t, the demand variable is defined as the hourly count of active connectors (i.e., connectors in an occupied/charging state) within the cell. This connector-occupancy demand proxy serves as the primary input for both spatiotemporal forecasting and downstream location optimization. The transition-based events provide additional supporting evidence on usage intensity and charging dynamics, complementing the main connector-level demand representation.
2.3. Trajectory Dataset Description
GPS trajectory data enable the characterization of urban mobility patterns and the identification of potential public-charging hotspots. Unlike static built-environment variables, trajectory records capture dynamic activity intensity and spatial mobility flows, which serve as a strong behavioral precursor of where and when public charging demand is likely to occur. In this study, we use ride-hailing vehicle trajectories as a city-scale mobility proxy (see Table 2).
Ride-hailing trajectories are adopted for three reasons. First, ride-hailing vehicles are commercial vehicles with substantially higher daily mileage and operating hours than private cars, and thus tend to rely more heavily on public charging facilities, making them an informative proxy for intensive public-charging demand. Second, ride-hailing trips are strongly associated with significant activity centers such as commercial districts, transport hubs, and office clusters, which helps identify latent charging demand hotspots. Third, ride-hailing vehicles serve as a transport carrier and do not alter passengers’ underlying travel purposes, allowing them to represent urban residents’ macro-level mobility intentions with relatively low behavioral bias.
The GPS data are obtained from the Gaode Open Platform. Unlike experimental datasets with fixed-frequency continuous sampling, this commercial dataset uses an event-triggered, non-uniform sampling mechanism to optimize data transmission efficiency. Specifically, the recording frequency adapts to vehicle status: our statistical analysis reveals a median sampling interval of 10.0 s during active movement, ensuring high-resolution capture of mobility flows, whereas intervals extend significantly during idling or congestion to avoid redundant data. The raw records were cleaned using the following criteria:
- (1) Spatial filtering: removing records with coordinates outside the Wuhan administrative boundary;
- (2) Outlier removal: filtering implausible jump points caused by GPS drift (e.g., abnormal displacement within a short time interval that implies unrealistic speed);
- (3) Invalid record cleaning: removing duplicates, missing values, and logically inconsistent trips (e.g., end time earlier than start time);
- (4) Activity filtering: to focus on effective mobility demand, excluding records corresponding to long-duration idling, parking, and inactive periods (e.g., driver rest breaks or offline shifts).
After strictly applying these cleaning criteria, we constructed a high-quality ride-hailing trajectory database containing 29,143 vehicles operating within Wuhan from 17 to 19 October 2019. The final dataset consists of 16,791,722 valid GPS waypoints. It is important to clarify that the vehicles were not sampled on a continuous 24 h basis. Instead, the dataset captures the discontinuous nature of ride-hailing operations (typically 8–12 h shifts), recording only active service periods while naturally excluding offline non-service time. Consequently, the retained data effectively represent the spatiotemporal distribution of urban charging demand without redundancy.
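As an illustration of the outlier-removal criterion, the following Python sketch drops GPS points whose implied speed from the last retained point is implausible. It is a sketch under stated assumptions rather than the cleaning code used here: the 120 km/h threshold, the `(timestamp_s, lat, lon)` tuple layout, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two WGS84 points, in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def drop_jump_points(points, max_speed_kmh=120.0):
    """Keep a point only if the speed from the last kept point is plausible.

    points: list of (timestamp_s, lat, lon) for one vehicle, sorted by time.
    max_speed_kmh: assumed plausibility threshold (illustrative value).
    """
    kept = []
    for ts, lat, lon in points:
        if not kept:
            kept.append((ts, lat, lon))
            continue
        ts0, lat0, lon0 = kept[-1]
        dt_h = (ts - ts0) / 3600.0
        if dt_h <= 0:
            continue  # duplicate or out-of-order timestamp
        speed = haversine_km(lat0, lon0, lat, lon) / dt_h
        if speed <= max_speed_kmh:
            kept.append((ts, lat, lon))
    return kept


track = [
    (0, 30.5900, 114.3000),
    (10, 30.5905, 114.3005),
    (20, 30.7000, 114.3000),  # roughly 12 km jump within 10 s: GPS drift
    (30, 30.5915, 114.3015),
]
cleaned = drop_jump_points(track)
```

Because each point is compared with the last retained point, a drifting first record would cause later valid points to be rejected; a production filter would need an additional rule for that case.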
To quantify mobility intensity at a macro level, we map trajectory records to the same grid system used in the charging-demand analysis and discretize time at an hourly resolution. Within each grid cell and each hour, we compute grid-level summaries that reflect urban activity intensity (e.g., trajectory-point counts and/or vehicle-passage intensity). We then apply smoothing to reduce the influence of extreme values and differences in reporting frequency and obtain a stable grid-based mobility intensity layer. This mobility layer captures the spatial concentration of activity hotspots and serves as an input to downstream accessibility/activity characterization, as well as location-planning modules.
2.4. Data Preprocessing and Gridding
We discretize Wuhan into a regular 1 km × 1 km grid to construct a unified spatial analysis framework. The grid resolution is selected empirically as a practical trade-off: coarser grids can obscure localized demand hotspots and residential exposure, whereas finer grids rapidly expand the decision space and increase the computational cost of repeated coverage and distance evaluations in NSGA-II. Consistent with prior city-scale studies adopting kilometer-level gridding to balance spatial representativeness and computational feasibility [24,25], we select a 1 km grid. This produces 9023 valid cells within the Wuhan administrative boundary, providing adequate granularity for planning without incurring prohibitive computational burden.
In addition to the charging-demand and trajectory-based mobility layers, the multi-objective station location model requires auxiliary spatial datasets to represent equity, connection frictions, and feasibility constraints. All auxiliary datasets are harmonized to the Wuhan municipal boundary and aggregated to the same 1 km grid to ensure spatial consistency across layers and objectives. To operationalize equity in service provision, we use gridded population data derived from China’s Seventh National Population Census. The population surface is clipped to the Wuhan boundary and aggregated to the analysis grid via zonal summation, yielding a grid-level population indicator. This layer captures the spatial distribution of residents and supports equity-aware assessment of alternative station deployment plans [26] (see Figure 1a).
To represent power-infrastructure availability, we extract substation locations from OpenStreetMap (OSM). Substation point features are filtered to the Wuhan boundary and treated as proxy supply nodes for grid connection. For each candidate grid cell, we quantify connection friction by its proximity to the nearest substation and incorporate this indicator as a cost proxy in the optimization, discouraging solutions that systematically place new facilities far from existing power nodes. Because Euclidean distance can underestimate real-world connection frictions, distances between candidate locations and substations are computed as shortest-path distances on the drivable road network. Specifically, we construct a routable road graph (drivable links only) and calculate the shortest path length from each candidate cell centroid to its nearest substation, yielding a more realistic approximation of connection difficulty (see Figure 1b).
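The nearest-substation network distance can be obtained for all nodes in a single pass with a multi-source Dijkstra search, as sketched below. This is an illustrative sketch only: the adjacency-list graph format, the node labels, and the function name `nearest_source_distance` are assumptions, and snapping cell centroids and substations to road-graph nodes is taken as already done.

```python
import heapq


def nearest_source_distance(graph, sources):
    """Shortest-path distance from every reachable node to its nearest source.

    graph: dict mapping node -> list of (neighbor, length_km); undirected
           road links are listed in both directions (assumed format).
    sources: iterable of nodes representing substations.
    """
    dist = {s: 0.0 for s in sources}
    heap = [(0.0, s) for s in dist]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Toy road graph: the direct A-D link (5 km) is longer than the path via B and C.
road_graph = {
    "A": [("B", 1.0), ("D", 5.0)],
    "B": [("A", 1.0), ("C", 2.0)],
    "C": [("B", 2.0), ("D", 1.5)],
    "D": [("C", 1.5), ("A", 5.0)],
}
friction = nearest_source_distance(road_graph, ["A"])
```

Seeding the priority queue with every substation at distance zero avoids running one search per substation, which matters when the candidate pool and the road graph are both large.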
3. Methodology
3.1. Spatio-Temporal Charging Demand Forecasting
Public charging demand is highly sparse and exhibits strong spatio-temporal clustering. We therefore formulate short-term public-charging demand forecasting on a regular 1 km grid as a planning-oriented prediction task, where the goal is to provide a reliable demand surface for subsequent siting decisions.
We discretize the study area into a regular lattice of valid grid cells indexed by $i$. For each hour $t$, we aggregate platform status observations within each cell to obtain a grid-level demand proxy $y_{i,t}$, defined as the number of charging connectors that are actively charging during hour $t$ (utilization intensity). Separately, we derive a capacity proxy $c_i$ as the total number of connectors located in cell $i$ (supply scale).
For each time step $t$, we construct a 6-channel input tensor $X_t$ consisting of:
Demand channel: $z_{i,t} = \log(1 + y_{i,t})$, used to stabilize the heavy-tailed distribution of observed utilization.
Capacity channel: $c_i$ (broadcast over time), representing grid-level supply intensity.
Time encodings (four channels): deterministic cyclic features capturing hour-of-day and day-of-week seasonality:
$$\sin\!\left(\frac{2\pi h_t}{24}\right),\quad \cos\!\left(\frac{2\pi h_t}{24}\right),\quad \sin\!\left(\frac{2\pi d_t}{7}\right),\quad \cos\!\left(\frac{2\pi d_t}{7}\right)$$
where $h_t$ is the hour of day and $d_t$ is the day of week at time step $t$.
These time-encoding channels are broadcast to all grid cells for the corresponding time step. Given an input window length $T_{\mathrm{in}}$ (in hours), the model predicts the next-hour demand map:
$$\hat{Z}_{t+1} = f_{\theta}\left(X_{t-T_{\mathrm{in}}+1}, \dots, X_{t}\right)$$
Training samples are generated by sliding the window across the full time series. We split sequences chronologically into training, validation, and test sets. Demand and capacity channels are standardized using training-set statistics and then denormalized for evaluation on the original scale (inverse transform of $\log(1 + y)$).
We employ a multi-layer ConvLSTM backbone with attention to capture local spatio-temporal dynamics. Specifically, the network consists of three stacked ConvLSTM layers (16, 8, and 4 filters) with batch normalization and dropout (0.2) after the first two layers, followed by a Convolutional Block Attention Module (CBAM) to refine features via channel and spatial attention. A final convolution with ReLU activation produces the non-negative prediction in the transformed space.
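To address extreme sparsity, we optimize a composite loss that emphasizes active regions while maintaining global fit and spatial coherence:
$$\mathcal{L} = \mathcal{L}_{\mathrm{reg}} + \lambda\,\mathcal{L}_{\mathrm{grad}}$$
The sparse-aware regression term combines a masked MSE on active cells with an unmasked global MSE:
$$\mathcal{L}_{\mathrm{reg}} = \alpha\,\frac{\sum_{i} m_{i}\,(\hat{z}_{i} - z_{i})^{2}}{\sum_{i} m_{i}} + \beta\,\frac{1}{N}\sum_{i}(\hat{z}_{i} - z_{i})^{2}$$
where $z_i$ and $\hat{z}_i$ are the transformed target and prediction at cell $i$, $N$ is the number of valid cells, $\alpha$, $\beta$, and $\lambda$ are term weights, and the binary mask is defined as $m_i = \mathbb{1}[y_i > 0]$ (computed on the original-scale $y$, applied in the transformed space). The gradient term penalizes discrepancies in first-order spatial gradients to preserve hotspot structure and avoid spurious speckle:
$$\mathcal{L}_{\mathrm{grad}} = \frac{1}{N}\sum_{i}\left\lVert \nabla \hat{z}_{i} - \nabla z_{i} \right\rVert$$
where $\nabla$ denotes first-order finite differences along the two grid axes.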
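To make the structure of such a composite loss concrete, a NumPy sketch for a single predicted map is given below. It is one possible instantiation under stated assumptions, not the training code of this study: the term weights (`w_active`, `w_global`, `w_grad`), the use of absolute differences for the gradient term, and the definition of active cells as those with demand greater than zero are all illustrative choices.

```python
import numpy as np


def composite_loss(z_pred, z_true, y_true, w_active=1.0, w_global=1.0, w_grad=0.1):
    """Sparse-aware composite loss on one (H, W) map.

    z_pred, z_true: prediction and target in the transformed space.
    y_true: target on the original scale, used only to build the mask.
    Weights are hypothetical defaults for illustration.
    """
    mask = (y_true > 0).astype(float)           # active cells (assumed: y > 0)
    sq_err = (z_pred - z_true) ** 2
    masked_mse = (mask * sq_err).sum() / max(mask.sum(), 1.0)
    global_mse = sq_err.mean()
    # First-order finite differences along both grid axes.
    gx = np.abs(np.diff(z_pred, axis=1) - np.diff(z_true, axis=1)).mean()
    gy = np.abs(np.diff(z_pred, axis=0) - np.diff(z_true, axis=0)).mean()
    return w_active * masked_mse + w_global * global_mse + w_grad * (gx + gy)


y_true = np.array([[0.0, 0.0], [0.0, 3.0]])
z_true = np.log1p(y_true)
loss_perfect = composite_loss(z_true, z_true, y_true)
loss_all_zero = composite_loss(np.zeros_like(z_true), z_true, y_true)
```

In this toy map, an all-zero prediction is correct on three of four cells, so its global MSE is small, but the masked term charges the full squared error on the single active cell. That asymmetry is what keeps a sparse model from collapsing to zeros.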
The model is trained with Adam, early stopping (patience 12), and ReduceLROnPlateau scheduling (factor 0.5, patience 6), with mixed precision enabled for efficiency.
For benchmarking, we evaluate two standard baselines on the original scale: (i) Persistence, which predicts $\hat{y}_{i,t+1} = y_{i,t}$; and (ii) Historical Average (HistAvg), which uses the mean demand for the same (day-of-week, hour) slot computed from the training period, with hierarchical backoff (same hour; same day-of-week; global mean) if a specific slot is missing. Given the sparsity of grid demand, we report two complementary evaluation sets computed after inverse transformation: (i) active-only metrics (MAE, RMSE, WMAPE) computed on cells with $y_{i,t} > 0$, and (ii) a global WMAPE computed over all grid cells and time steps (including zeros), reflecting system-level aggregate error.
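The dual evaluation protocol can be illustrated with the Persistence baseline on a toy series. The sketch below is illustrative only: the function names and the `(time, cell)` array layout are assumptions, WMAPE is taken as the sum of absolute errors divided by the sum of observed demand, and active cells are taken as those with observed demand greater than zero.

```python
import numpy as np


def wmape(y_true, y_pred):
    """Weighted MAPE: sum of absolute errors over sum of observed demand."""
    denom = np.abs(y_true).sum()
    return np.abs(y_true - y_pred).sum() / denom if denom > 0 else float("nan")


def evaluate(y_true, y_pred):
    """Active-only metrics plus a global WMAPE that includes zero cells."""
    active = y_true > 0
    err = y_pred[active] - y_true[active]
    return {
        "mae_active": float(np.abs(err).mean()),
        "rmse_active": float(np.sqrt((err ** 2).mean())),
        "wmape_active": float(wmape(y_true[active], y_pred[active])),
        "wmape_global": float(wmape(y_true, y_pred)),
    }


# Toy demand array with shape (time, cell); values are illustrative.
y = np.array([[0, 2, 0],
              [0, 4, 1],
              [0, 3, 0]], dtype=float)
persistence_pred = y[:-1]   # prediction for hour t+1 is the value at hour t
target = y[1:]
scores = evaluate(target, persistence_pred)
```

Here the global WMAPE is larger than the active-only WMAPE because it also counts the false positive that Persistence carries into a cell that has become idle.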
3.2. Trajectory-Driven Denoising Autoencoder with Temporal Self-Attention
Public charging demand is shaped not only by realized utilization (which is supply-constrained) but also by the broader mobility structure of the city. We therefore learn a mobility-informed representation from ride-hailing trajectories and derive a unitless, rank-based accessibility index ($A_i$) that captures relative structural connectivity rather than an absolute travel-time measure.
Let $m_{i,t}$ denote the hourly ride-hailing trajectory intensity aggregated to grid cell $i$ at hour $t$ (e.g., the number of GPS waypoints or trajectory counts falling within the cell). We apply a log transform and min-max scaling to obtain $\tilde{m}_{i,t}$, where the scaling parameters are estimated from the training period and then applied consistently to all splits to avoid leakage.
Because mobility signals are dominated by persistent level differences (core vs. periphery), we construct a de-seasonalized residual series to emphasize local temporal deviations:
$$r_{i,t} = \tilde{m}_{i,t} - \mu_{i}(h_t, d_t)$$
where $\mu_{i}(h, d)$ is the mean of $\tilde{m}_{i,t}$ over the training period at the same hour-of-day $h$ and day-of-week $d$. This residualization suppresses stationary intensity levels and highlights temporal signatures linked to commuting and activity rhythms.
We then extract fixed-length windows $x_i^{(k)}$ for each grid cell $i$ with window length $T$ and stride $S$, both in hours (Table S1). To improve robustness against missing records and measurement noise, we train a denoising sequence autoencoder. Each input window is corrupted by (i) random element masking (mask ratio $p$) and (ii) additive Gaussian noise $\epsilon$ with standard deviation $\sigma$ on the scaled residual series:
$$\tilde{x}_i^{(k)} = b \odot x_i^{(k)} + \epsilon,\qquad \epsilon \sim \mathcal{N}(0, \sigma^{2} I)$$
where $b$ is a Bernoulli mask whose entries are zero with probability $p$. The model reconstructs the clean residual sequence by minimizing mean squared reconstruction error.
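The corruption step of a denoising autoencoder of this kind can be sketched as follows. This is a minimal illustration, not the study's implementation: the mask ratio of 0.2, the window shape, and the function name `corrupt` are hypothetical, and the noise standard deviation is set to zero in the example only so that the effect of the mask can be inspected in isolation.

```python
import numpy as np


def corrupt(window, mask_ratio, noise_std, rng):
    """Apply random element masking followed by additive Gaussian noise.

    Returns the corrupted window and the boolean keep-mask (True = kept).
    """
    keep = rng.random(window.shape) >= mask_ratio   # Bernoulli keep-mask
    noise = rng.normal(0.0, noise_std, size=window.shape)
    return keep * window + noise, keep


rng = np.random.default_rng(0)
clean = np.ones((4, 24))                  # 4 toy windows of 24 hourly residuals
noisy, keep = corrupt(clean, mask_ratio=0.2, noise_std=0.0, rng=rng)
```

During training, the corrupted window is the model input and the clean window is the reconstruction target, so the encoder has to infer masked hours from their temporal context.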
The encoder uses stacked LSTMs (64 and 32 units) to obtain a latent sequence representation, followed by temporal self-attention to focus on critical time steps. Given hidden states $H = [h_1, \dots, h_T]^{\top} \in \mathbb{R}^{T \times d_h}$, we compute:
$$H' = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V,\qquad Q = HW_Q,\quad K = HW_K,\quad V = HW_V$$
The attended sequence $H'$ is then pooled (global average pooling) to yield a fixed-length embedding $e \in \mathbb{R}^{d_e}$, with $d_e$ given in Table S1. The decoder (32 and 64 units) reconstructs the residual sequence from $e$, enforcing information preservation through the bottleneck.
For each grid cell $i$, we aggregate embeddings across all $n_i$ windows by temporal averaging, $\bar{e}_i = \frac{1}{n_i}\sum_{k=1}^{n_i} e_i^{(k)}$. To obtain a comparable scalar accessibility score, we project the embeddings onto a one-dimensional structural axis via principal-component projection (first PC), yielding $u_i$. Finally, we convert $u_i$ into a unitless, rank-based index:
$$A_i = \frac{\mathrm{rank}(u_i)}{N}$$
where $N$ is the number of valid cells (ties broken by average rank). By construction, $A_i$ reflects the relative ordering of structural connectivity (higher values indicate more centrally connected mobility signatures) and is therefore suitable for equity stratification without claiming absolute accessibility levels.
Model training uses Adam with a batch size of 1024 for up to 60 epochs, with windows split into training/validation sets (Table S1). The learned $A_i$ surface is exported as a planning layer for downstream optimization.
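The projection-and-ranking step can be sketched in NumPy as below. This is an illustrative sketch under stated assumptions: the index is taken as the average rank divided by the number of cells, the toy embeddings are made up, and the function names are hypothetical. The sign of a principal component is arbitrary, so in practice the axis would need to be oriented (for example, so that high-mobility cores receive high scores) before ranking.

```python
import numpy as np


def first_pc_scores(embeddings):
    """Project row-wise embeddings onto their first principal component."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]


def average_rank(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tie = values == v
        ranks[tie] = ranks[tie].mean()
    return ranks


def rank_index(scores):
    """Unitless rank-based index in (0, 1] (assumed form: rank / N)."""
    return average_rank(scores) / len(scores)


toy_embeddings = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
pc_scores = first_pc_scores(toy_embeddings)
index = rank_index(np.array([0.3, 0.1, 0.1, 0.9]))
```

The rank transform discards the spacing between scores, which is why the resulting layer supports stratification (such as a bottom-quantile group) but not statements about absolute accessibility.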
3.3. Multi-Objective Optimization and Spatial Economics-Based EV Charging Station Location Planning
We formulate EV charging station expansion as a discrete, grid-based location planning problem that integrates (i) demand potential, (ii) mobility-derived relative accessibility, (iii) population exposure, and (iv) implementation feasibility.
For each feasible buildable grid cell $i$, we construct four planning attributes: (1) a demand score $D_i$ from the ConvLSTM demand surface (mean predicted daily peak intensity); (2) the mobility-derived relative accessibility index $A_i$; (3) population $P_i$ aggregated to the grid from a population raster; and (4) an implementation feasibility proxy $F_i$ defined as the road-network shortest-path distance (km) from cell $i$ to its nearest power substation (connection friction).
We screen feasible cells to form a tractable candidate pool (maximum size 600 in the main configuration). Let $x_i \in \{0, 1\}$ indicate whether candidate cell $i$ is selected, with the deployment scale fixed to $K$ ($\sum_i x_i = K$).
Neighborhood service rule (served indicator). Because service is evaluated on a discrete grid and planners often interpret coverage in neighborhood terms, we define a binary served indicator $s_j$ for each demand cell $j$ based on the $3 \times 3$ Moore neighborhood $\mathcal{N}(j)$ centered at $j$:
$$s_j = \mathbb{1}\!\left[\sum_{i \in \mathcal{N}(j)} x_i \ge 1\right]$$
Under a 1 km grid, the farthest neighbor in $\mathcal{N}(j)$ is at distance $\sqrt{2}$ km ($\approx 1.41$ km), which is consistent with the neighborhood-scale service radius used in mapped interpretation.
Smooth distance-decayed coverage (kernel). To provide a smooth optimization signal and reflect diminishing marginal returns with distance, we define the effective service received by cell $j$ from the selected set $\mathcal{S} = \{i : x_i = 1\}$ as the best (closest) exponential influence:
$$\kappa_j = \max_{i \in \mathcal{S}} \exp\!\left(-\frac{d_{ij}}{\lambda}\right)$$
where $d_{ij}$ is the centroid-to-centroid Euclidean distance (km) between grid cells $i$ and $j$, and $\lambda$ is the decay scale (value for the main configuration in Table S2).
Objective 1: Maximize demand-weighted service effectiveness (kernel-based coverage). We maximize:
$$f_1 = \sum_{j} D_j\,\kappa_j$$
This objective prioritizes locating new sites near high-demand areas while allowing benefits to attenuate smoothly with distance and preventing double counting via the max operator.
Objective 2: Minimize equity gap (Group Parity Gap, GPG) under the served indicator. We define low-accessibility zones $\mathcal{G}_{\mathrm{low}}$ as the bottom quantile of $A_i$ (q = 0.25 in the main configuration), and compute population-weighted served shares:
$$C_{\mathcal{G}} = \frac{\sum_{j \in \mathcal{G}} P_j\,s_j}{\sum_{j \in \mathcal{G}} P_j},\qquad \mathcal{G} \in \{\mathcal{G}_{\mathrm{low}}, \mathcal{G}_{\mathrm{all}}\}$$
The equity objective is then the absolute parity gap between the low-access group $\mathcal{G}_{\mathrm{low}}$ and the overall population $\mathcal{G}_{\mathrm{all}}$:
$$f_2 = \left| C_{\mathcal{G}_{\mathrm{low}}} - C_{\mathcal{G}_{\mathrm{all}}} \right|$$
To ensure a minimum level of investment in structurally disadvantaged areas, we impose a hard quota constraint on selected sites:
$$\sum_{i \in \mathcal{G}_{\mathrm{low}}} x_i \ge K_{\mathrm{low}}$$
where $K_{\mathrm{low}}$ is set in the main configuration as reported in Table S2.
Objective 3: Minimize implementation feasibility friction. We minimize the mean substation-distance proxy of selected sites:
$$f_3 = \frac{1}{K}\sum_{i} x_i\,F_i$$
Spatial dispersion constraint. To avoid redundant clustering and reflect practical spacing, we enforce a minimum separation distance $d_{\min}$ between any two selected sites (value for the main configuration in Table S2), computed on centroid Euclidean distance:
$$d_{ij} \ge d_{\min}\qquad \forall\, i \ne j \ \text{with}\ x_i = x_j = 1$$
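A candidate plan can be scored on the three objectives with a few array operations, as sketched below. This is an illustrative evaluation routine under stated assumptions, not the optimizer used in this study: the toy coordinates and attribute values are made up, the decay scale of 1 km is a placeholder, and the Moore neighborhood is implemented as a Chebyshev distance of at most one cell, which assumes centroids lie on a regular 1 km lattice expressed in km.

```python
import numpy as np


def evaluate_plan(selected, coords, demand, population, access, friction,
                  decay_km=1.0, q=0.25):
    """Return (kernel coverage, parity gap, mean friction) for one plan.

    selected: indices of chosen cells; coords: (N, 2) centroids in km.
    decay_km and q are illustrative defaults.
    """
    sel = np.asarray(selected)
    diff = coords[:, None, :] - coords[None, sel, :]
    # Objective 1: demand-weighted best exponential influence.
    dist = np.linalg.norm(diff, axis=2)
    kernel = np.exp(-dist / decay_km).max(axis=1)
    f1 = float((demand * kernel).sum())
    # Served indicator: a selected site lies within the 3x3 Moore neighborhood.
    served = (np.abs(diff).max(axis=2) <= 1.0 + 1e-9).any(axis=1)
    # Objective 2: gap between low-access and overall population-weighted shares.
    low = access <= np.quantile(access, q)
    share_low = (population[low] * served[low]).sum() / population[low].sum()
    share_all = (population * served).sum() / population.sum()
    f2 = float(abs(share_low - share_all))
    # Objective 3: mean substation-distance proxy of the selected sites.
    f3 = float(friction[sel].mean())
    return f1, f2, f3


coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
demand = np.array([1.0, 2.0, 3.0, 4.0])
population = np.array([10.0, 10.0, 10.0, 10.0])
access = np.array([0.9, 0.5, 0.7, 0.1])
friction = np.array([2.0, 1.0, 3.0, 4.0])
f1, f2, f3 = evaluate_plan([1], coords, demand, population, access, friction)
```

In this toy row of four cells, the single site serves three cells but misses the only low-access cell, so overall served share is 0.75 while the low-access share is zero. The parity gap therefore stays large even though coverage looks acceptable.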
We solve the constrained three-objective problem using NSGA-II and run three independent trials per deployment scale to address algorithmic stochasticity.
Because the Pareto set contains multiple efficient trade-offs, we select an implementable compromise plan using TOPSIS in the normalized objective space with explicit weights (baseline: 0.45/0.35/0.20 for coverage/equity gap/cost; Table S2).
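For reference, a compact TOPSIS ranking over a set of Pareto solutions can be sketched as follows. This is a generic textbook form given for illustration: vector (Euclidean) normalization is assumed and may differ from the normalization used in this study, the three toy solutions are made up, and only the baseline weights (0.45/0.35/0.20 for coverage/equity gap/cost) come from the text.

```python
import numpy as np


def topsis(objectives, weights, benefit):
    """Closeness coefficient of each solution to the ideal point.

    objectives: (n_solutions, n_objectives) array.
    benefit: per-objective flags, True if larger values are better.
    """
    F = np.asarray(objectives, dtype=float)
    norm = np.linalg.norm(F, axis=0)
    norm[norm == 0] = 1.0                       # guard against all-zero columns
    V = F / norm * np.asarray(weights, dtype=float)
    benefit = np.asarray(benefit, dtype=bool)
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
    anti = np.where(benefit, V.min(axis=0), V.max(axis=0))
    d_pos = np.linalg.norm(V - ideal, axis=1)
    d_neg = np.linalg.norm(V - anti, axis=1)
    denom = d_pos + d_neg
    # If all solutions coincide, every score defaults to 0.5.
    return np.divide(d_neg, denom, out=np.full_like(d_neg, 0.5), where=denom > 0)


# Columns: coverage (maximize), equity gap (minimize), cost proxy (minimize).
pareto = [[10.0, 0.30, 2.0],
          [8.0, 0.05, 3.0],
          [6.0, 0.02, 6.0]]
closeness = topsis(pareto, weights=[0.45, 0.35, 0.20], benefit=[True, False, False])
best = int(np.argmax(closeness))
```

With these toy values the middle solution is preferred: it gives up some coverage relative to the first plan but removes most of the equity gap at a modest cost increase. Changing the weights changes only this selection rule, not the Pareto set itself.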
5. Discussion
This research developed a comprehensive hierarchical framework for optimizing EV charging station locations, with results demonstrating significant improvements in efficiency, cost-effectiveness, and equity. The findings have direct implications for sustainable urban development.
5.1. Principal Findings and Interpretation
This study yields three substantive insights that are relevant to charging-infrastructure planning. First, forecasting must be formulated and evaluated as a planning signal under sparsity, not as a purely predictive exercise. At the 1 km grid–hour resolution, the demand tensor is dominated by zero values (only 7.75% of grid–time observations are active), so “overall” metrics can reward models that fit zeros while remaining uninformative for operational hotspots. The dual evaluation protocol—hotspot-oriented errors on active cells plus system-level WMAPE on the citywide aggregate—therefore reflects a planning logic: siting decisions are driven by where peak stress emerges and whether the total demand trajectory is preserved.
Second, the analysis clarifies that mobility exposure and realized charging demand capture different constructs. The trajectory-derived layers provide ubiquitous citywide signals and exhibit a stable core–corridor structure, while realized charging demand is highly spatially sparse and constrained by the existing network. The moderate hotspot overlap (Jaccard = 0.39) indicates complementarity rather than redundancy: mobility represents potential exposure/structural connectivity, whereas demand reflects realized utilization shaped by supply and behavioral adaptation. This distinction explains why integrating both layers can mitigate “demand-blind” zones and supports an exploration–exploitation balance in siting.
Third, the optimization results show that equity gains are not a by-product of coverage expansion. Across deployment scales, the Pareto structure reveals a persistent tension between equity and cost: approaching near-zero equity gaps typically requires accepting higher connection frictions and/or moderate coverage reductions. Importantly, the TOPSIS-selected compromise solutions remain stable across independent NSGA-II runs, providing an implementable recommendation while keeping the underlying trade-offs transparent.
5.2. Planning Implications for Phased Deployment and Spatial Strategy
The framework provides several practical implications for planners. First, phased deployment is not merely an operational convenience but a rational response to structural trade-offs. Increasing the deployment budget (e.g., K = 15 to K = 30) expands the achievable service frontier and improves citywide coverage, yet it does not eliminate the equity–cost tension; rather, it reveals what additional coverage requires in terms of feasibility frictions (proxied here by substation-access distance) and distributional balance. This has two planning implications. On the one hand, early-stage investments should prioritize high-leverage “gap-filling” sites—locations that produce large neighborhood-level coverage gains while avoiding redundancy with already-served cores—because marginal returns are typically higher when service basins are fragmented. On the other hand, later-stage expansion can shift toward network consolidation and corridor extension, where the objective is not only to add coverage but to improve continuity of neighborhood service and to reduce persistent underserved pockets. From a management standpoint, this staged interpretation suggests a practical workflow: use smaller-K compromise solutions as “Phase-I candidates” for rapid deployment and pilot evaluation, then update the weight scenario (or the feasible candidate set) to generate Phase-II plans once utilization feedback and implementation constraints become clearer.
Second, the recommended layouts suggest a spatial logic that is more realistic than “uniform dispersion” interpretations of equity. Under the strictly new-build requirement and the minimum-separation constraint, the optimizer is discouraged from selecting already saturated cells and from co-locating with existing stations. Instead, it tends to place new sites in adjacent buildable cells around demand hubs—core-oriented but non-redundant infill—combined with a limited number of peripheral or secondary-center anchors that reduce uncovered pockets under the neighborhood service rule. This pattern carries a concrete equity message: equity-oriented planning does not necessarily mean reallocating investment away from the core; rather, it can be operationalized as (i) reducing local access gaps by targeting cells adjacent to saturated hubs, where marginal additions still improve neighborhood-level availability and reduce queuing pressure, while (ii) strategically extending service to edges and secondary belts where residents are otherwise persistently outside the service radius. In other words, the framework supports an interpretable “gap-filling + anchor” strategy that aligns with how networks are actually expanded: strengthening coverage in high-use basins without redundant co-location, while preventing peripheral exclusion via targeted expansions instead of diffuse scatter.
Finally, because TOPSIS is applied with an explicit weighting scheme (baseline weights: 0.45/0.35/0.20 for coverage/equity gap/cost; Supplementary Materials, Table S2), the framework naturally supports transparent scenario analysis. Rather than presenting a single “optimal” plan, the Pareto set provides the menu of feasible trade-offs, and TOPSIS offers a reproducible rule to select an implementable compromise under a stated policy preference. This enables planners to translate policy priorities into operational choices: for example, a reliability- or congestion-relief scenario may emphasize coverage (higher weight on coverage), a social-inclusion scenario may prioritize reducing disparity (higher weight on equity), and a grid-coordination scenario may emphasize feasibility (higher weight on cost proxy). Importantly, reweighting does not change the underlying optimization engine or the constraint environment; it changes only the decision rule for selecting a plan from the same trade-off surface, which keeps the planning process auditable and easy to communicate to stakeholders. We therefore view the weighting scheme not as a fixed “expert truth” but as a policy lever that can be stress-tested, documented, and updated as implementation feedback accumulates.
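The reweighting logic can be sketched with a compact TOPSIS routine. This is a minimal illustration, not the code used in the study: it assumes vector normalization, treats coverage as a benefit criterion and the equity gap and cost proxy as cost criteria, and the three Pareto plans listed are hypothetical values chosen only to show the mechanics. Only the baseline weights (0.45/0.35/0.20) come from the text.

```python
import math


def topsis(rows, weights, benefit):
    """Return TOPSIS closeness scores (higher is better) for each alternative.

    rows:    list of criterion tuples, one per Pareto solution
    weights: one weight per criterion
    benefit: one bool per criterion (True = larger is better)
    """
    n_crit = len(weights)
    # Vector-normalize each criterion column, then apply the weights.
    norms = [math.sqrt(sum(r[j] ** 2 for r in rows)) or 1.0 for j in range(n_crit)]
    v = [[weights[j] * r[j] / norms[j] for j in range(n_crit)] for r in rows]
    # Ideal and anti-ideal points depend on the direction of each criterion.
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti)
        total = d_pos + d_neg
        scores.append(d_neg / total if total > 0 else 0.0)
    return scores


def select_plan(rows, weights, benefit=(True, False, False)):
    """Index of the compromise plan with the highest closeness score."""
    scores = topsis(rows, weights, benefit)
    return max(range(len(rows)), key=scores.__getitem__)


if __name__ == "__main__":
    # Hypothetical Pareto plans: (coverage, equity gap, cost proxy).
    pareto = [(0.80, 0.30, 5.0), (0.70, 0.20, 3.5), (0.55, 0.15, 2.0)]
    print("baseline weights:", select_plan(pareto, (0.45, 0.35, 0.20)))
    print("coverage-only scenario:", select_plan(pareto, (1.0, 0.0, 0.0)))
```

The Pareto set passed to `select_plan` is identical across scenarios; only the weight vector changes, which mirrors the point that reweighting alters the selection rule and not the optimization itself.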
5.3. Limitations and Directions for Future Research
Several limitations define the boundary of interpretation and point to extensions:
- (1) Temporal representativeness: the demand layer is derived from a limited observation window; longer temporal coverage would better capture seasonal and event-driven variability, thereby improving the stability of hotspot definitions.
- (2) Temporal mismatch between mobility and charging datasets: mobility trajectories and charging observations are not temporally aligned. We therefore interpret relative accessibility as a unitless, rank-based index that captures persistent structural connectivity, rather than an absolute measure of travel time. Future work should incorporate temporally matched trajectory data to strengthen contemporaneous access inference.
  We also note that both city-scale ride-hailing GPS trajectories and real-world public-charging usage records are valuable yet inherently scarce datasets, and fully synchronized multi-source data at this scale are rarely available in practice. We therefore aim to make transparent and defensible use of the best-available data, while quantifying the associated uncertainty via sensitivity/robustness diagnostics wherever possible.
- (3) Engineering realism of the cost proxy: the cost term is modeled as road-network shortest-path distance to the nearest substation, capturing connection friction but not feeder capacity, transformer constraints, or permitting feasibility. Incorporating grid-capacity constraints and station-type heterogeneity would improve implementation realism and may reshape the equity–cost frontier.
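To make the cost proxy in item (3) concrete, the sketch below computes the road-network shortest-path distance from every node to its nearest substation using a multi-source Dijkstra search. It is an illustrative sketch only: the adjacency-list graph format, the helper `undirected`, and the toy network are assumptions, and the study's actual road network and routing tool are not specified here.

```python
import heapq


def undirected(edges):
    """Build an adjacency list from (u, v, length) triples (hypothetical input format)."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    return graph


def nearest_substation_distance(graph, substations):
    """Shortest network distance from each node to its closest substation.

    All substations are seeded at distance zero, so a single Dijkstra pass
    yields the distance to the nearest one for every reachable node.
    """
    dist = {node: float("inf") for node in graph}
    heap = []
    for s in substations:
        dist[s] = 0.0
        heapq.heappush(heap, (0.0, s))
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


if __name__ == "__main__":
    # Toy road network; edge lengths in km (illustrative values).
    roads = undirected([("A", "B", 2.0), ("B", "C", 3.0), ("C", "D", 1.0), ("A", "D", 10.0)])
    print(nearest_substation_distance(roads, ["A"]))
```

A capacity-aware extension, as suggested above, would need to replace this purely geometric proxy with feeder or transformer headroom data, which a distance search alone cannot represent.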
6. Conclusions
This study develops a hierarchical spatio-temporal decision framework for public EV charging station expansion that integrates demand forecasting, mobility representation learning, and constrained multi-objective siting into a unified grid-based planning pipeline. The framework is designed to support sustainability-oriented planning decisions in the face of two persistent realities of urban charging systems: extreme demand sparsity and multi-objective trade-offs among service effectiveness, equity, and infrastructure feasibility.
Three conclusions can be drawn. First, planning-relevant demand forecasting must be hotspot-aware in the presence of sparse demand tensors. Because only a small share of grid–time observations are active, overall metrics alone can overstate performance for siting purposes. Under a dual evaluation protocol that reports hotspot-oriented accuracy alongside overall metrics, the ConvLSTM with attention yields the most reliable planning signal.
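The sparsity effect behind the first conclusion can be reproduced with a toy calculation. The sketch below is illustrative only: the data are synthetic, mean absolute error is used as a stand-in metric, and hotspots are defined, as an assumption, as observations whose actual demand exceeds a threshold; the study's own hotspot definition and metrics may differ.

```python
def mae(actual, predicted):
    """Mean absolute error over paired observations."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def dual_evaluation(actual, predicted, hotspot_threshold=0.0):
    """Return (overall MAE, hotspot-only MAE).

    Hotspots are taken to be observations with actual demand above
    `hotspot_threshold` (an assumed definition for this sketch).
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a > hotspot_threshold]
    overall = mae(actual, predicted)
    if not pairs:
        return overall, float("nan")
    hot_actual, hot_pred = zip(*pairs)
    return overall, mae(hot_actual, hot_pred)


if __name__ == "__main__":
    # Synthetic sparse demand: 95 inactive grid-time observations, 5 active ones.
    actual = [0.0] * 95 + [10.0] * 5
    # A trivial model that always predicts zero demand.
    always_zero = [0.0] * 100
    overall, hotspot = dual_evaluation(actual, always_zero)
    print(f"overall MAE = {overall}, hotspot MAE = {hotspot}")
```

In this synthetic case the all-zero predictor looks acceptable on the overall metric while missing every active observation, which is why a hotspot-oriented score is reported alongside the overall one.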
Second, mobility exposure and realized charging demand should be treated as complementary rather than interchangeable. Trajectory-derived layers supply ubiquitous citywide information about structural activity and accessibility patterns that remain visible even in demand-sparse zones. The moderate hotspot overlap between mobility and demand suggests that optimizing solely on demand risks reinforcing existing utilization patterns. In contrast, mobility-informed objectives can better identify structurally connected yet underserved locations.
Third, equity improvements are not a by-product of coverage expansion, and transparent trade-off management is therefore essential. The NSGA-II Pareto portfolios show persistent equity–cost tension under realistic constraints. TOPSIS provides an interpretable mechanism to select implementable compromise solutions from the Pareto set under an explicit weighting scheme, enabling scenario-based decision support and phased deployment comparisons.
Several limitations motivate future research. Demand is reconstructed from a limited temporal window, and longer observation horizons would strengthen generalizability. Mobility trajectories and charging observations are not temporally aligned; future work should incorporate temporally matched mobility data to support stronger contemporaneous accessibility inference. Finally, the infrastructure feasibility term is represented by a proxy based on road-network distance to substations; incorporating grid-capacity constraints, land feasibility, and station-type heterogeneity would improve implementation realism and may reshape the equity–cost frontier.
Overall, the proposed framework provides a reproducible pathway to integrate spatio-temporal prediction, mobility opportunity structures, and multi-objective optimization for equitable and operationally effective charging network expansion.